SUMMARY
Data analyst with academic and industry experience executing data-driven solutions to increase the efficiency, accuracy, and utility of internal data processing. Experienced in building regression models, applying predictive modeling, and analyzing data mining algorithms to deliver insights and implement action-oriented solutions to complex academic problems.
EDUCATION BACKGROUND
Georgetown University Washington DC, USA M.S. Biostatistics Aug.2019-Dec.2020 Highlighted Coursework: Epidemiology, Survival Analysis, Categorical Data Analysis, Linear Models and Multivariate Analysis, Machine Learning of Bioinformatics, Experimental Design and Clinical Trials, Data Science, Quantitative Data Analysis.
Nanjing Agricultural University Nanjing, China B.E. Food Science and Engineering Sep.2015-Jun.2019
ACADEMIC PROJECTS
Multi-omics data integration: Metabolomics and proteomics in Duchenne Muscular Dystrophy
Washington DC, USA Instructor: Professor Simina Boca Jan.2020-present
Integrated metabolomics and proteomics datasets sharing a common set of study subjects.
Applied several dimension reduction approaches (PCA, ICA, t-SNE, UMAP) to the integrative analysis of multi-omics data to gain biological insights.
Simulated data to assess how confounding from potential technical artifacts affects different statistical methods.
Used R with best practices for reproducible research, including literate programming with RStudio and Sweave; built interactive applications with R Shiny.
Wrote a research paper presenting detailed statistical analysis results from the different dimension reduction methods, and edited and revised it to rigorous academic writing standards.
Application of Least-Squares Support Vector Machines for Quantitative Evaluation of Cell Phone Processor Quality
Nanjing, China Instructor: Professor Leiqing Pan Sep.2017-Apr.2018
Determined input parameters as quality indicators by measuring cell phone processor samples with near-infrared spectroscopy.
Applied Least-Squares Support Vector Machines (LS-SVM) in MATLAB to establish a correlation model between the quality indicators and cell phone processor quality.
Trained the model on the training set, selected the model using the validation set, and evaluated the learning method by predicting 400 test-set samples.
Wrote a research paper covering the work from spectroscopy data collection through the statistical results for the quality parameters.
INTERNSHIP EXPERIENCE
Allied Millennial Partners, LLC
New York, USA
Data Analyst Internship
Jun.2020-Aug.2020
Tested the Efficient Market Hypothesis on the target stocks and target investing companies, as well as the market indices, using R and Python.
Built ARIMA, linear state space, and VARMA models to analyze closing prices and returns of the target stocks using R and Python.
Analyzed empirical findings using different statistical tests to examine autocorrelation and serial dependence (seasonality in financial markets).
Forecasted upcoming trends for the target stocks using a technical factor model and provided investment suggestions.
Wrote a detailed company financial analysis and quantitative analysis report and presented it to the executive team, receiving very positive feedback from team members.
SKILLS & INTERESTS
Communication Skills: Fluent in English (verbal and written communication)
Analytical Skills: Data Mining and Machine Learning, Time Series, Survival Analysis, Categorical Data Analysis, Linear Models and Multivariate Analysis
Programming Skills: SAS, R, SQL, Python, MATLAB
Certification: SAS Certified Base Programmer
Open-Mindedness: Adaptability, Creativity, Logical Thinking, Problem Solving, Working Independently