Rong “Ellen” Lin
** Chelsea Ct, Daly City, CA
ac28df@r.postjobfree.com
https://www.linkedin.com/in/elin1024/
https://github.com/peanut1007
Skills
Technical: R, PSQL, Microsoft SQL, Python, Scala, cTAKES, pandas, NumPy, MongoDB, scikit-learn, Machine Learning, HL7, SPSS, GitHub, Office, PowerPoint, Excel
Strengths: Data Science, Statistics, Problem-Solving, Interpersonal Skills, Corporate Finance
Languages: Fluent in English and Mandarin
Education
M.S., Health Informatics
University of San Francisco, San Francisco, CA
Expected: Dec 2017
B.S., Finance and Economics
University of Kansas, Lawrence, KS
Aug 2012
Academia & Data Experience
Statistician/Data Scientist Jun 2017 - Present
NCIRE, San Francisco VA Medical Center, Pulmonary Lab
o Using REDCap to manage study progress and to ETL questionnaire and EHR data
o Analyzing time series data from activity monitors with ARIMA to find patterns
o Building machine learning models and selecting features from cardiopulmonary test results and symptoms to classify lung-related diseases
Data Engineering Jan 2017 – May 2017
KlaraHealth, USF Affiliate
o Performed ETL in Python using NumPy arrays and pandas
o Processed unstructured clinical text into JSON/XML with the NLP tool BioPortal Annotator and its REST API
o Applied bootstrapping and cross-validation to resample data and used grid search to tune model parameters
o Trained and tested machine learning classifiers such as SVM and Random Forests to classify diabetes, with a best outcome of 81% ROC accuracy
Data Forecasting Analyst Intern Jun 2016 – Dec 2016
MindLight Medical, USF Affiliate
o Explored longitudinal data analysis for the behavioral sciences using R
o Extracted features from EEG data labeled as “normal”, “high risk” and “diagnostic autism”
o Applied feature ranking based on differences in the average trajectories of the groups
o Analyzed and distinguished differences in the profiles’ behaviors
o Predicted trajectories and their trends using Linear Mixed Effects Regression and a Group-Based Trajectory Model
Bioinformatics Project Aug 2016 - Dec 2016
School of Nursing and Health Professions, University of San Francisco
o Appraised, selected and utilized data from GenBank, the Protein Data Bank, the Cancer Atlas and the Gene Expression Omnibus to solve a given problem
o Used BLAST and multiple sequence alignment to investigate the leptin receptor protein, including retrieving orthologous sequences from the Entrez database at NCBI, then visualized their similarity with a hierarchical tree
o Used DESeq2 to analyze RNA-seq data for differential gene expression, including an MA plot of log fold change against mean count, then used the results to determine which genes are expressed in the disease versus non-disease groups
Corporate Experience
Business Analyst Jan 2015 - May 2016
Rocket Fuel, Inc., Redwood City, CA
o Generated recurring revenue reports, distributed by advertisers & publishers
o Analyzed transactional data from Salesforce, investigated why campaigns underperformed, and prepared reports for advanced operations analysis