
SAS, R, SQL, Python, Data Mining and Machine Learning, Linear models

Location:
Arlington, VA
Posted:
January 12, 2021


Resume:

SUMMARY

Data analyst with academic and industry experience executing data-driven solutions to increase the efficiency, accuracy, and utility of internal data processing. Experienced at creating data regression models, using predictive data modeling, and analyzing data mining algorithms to deliver insights and implement action-oriented solutions to complex academic problems.

EDUCATION BACKGROUND

Georgetown University Washington DC, USA M.S. Biostatistics Aug.2019-Dec.2020 Highlighted Coursework: Epidemiology, Survival Analysis, Categorical Data Analysis, Linear Models and Multivariate Analysis, Machine Learning of Bioinformatics, Experimental Design and Clinical Trials, Data Science, Quantitative Data Analysis.

Nanjing Agricultural University Nanjing, China B.E. Food Science and Engineering Sep.2015-Jun.2019

ACADEMIC PROJECTS

Multi-omics data integration: Metabolomics and proteomics in Duchenne Muscular Dystrophy

Washington DC, USA Instructor: Professor Simina Boca Jan.2020-present

Integrated metabolomics and proteomics datasets with a common set of study subjects.

Used different dimension reduction approaches (PCA, ICA, t-SNE, UMAP) for integrative analysis of multi-omics data to gain biological insights.

Simulated data to understand confounding effects on different statistical methods due to potential technical artifacts.

Used R with best practices for reproducible research, including literate programming with RStudio and Sweave; used R Shiny to build interactive applications.

Wrote a research paper presenting detailed statistical analysis results from the different dimension reduction methods, and edited and revised the paper to rigorous academic writing standards.

Application of Least-Squares Support Vector Machines for Quantitative Evaluation of Cell Phone Processor Quality Nanjing, China

Instructor: Professor Leiqing Pan Sep. 2017-Apr. 2018

Determined input parameters as quality indicators by measuring cell phone processor samples with near-infrared spectroscopy.

Applied Least-Squares Support Vector Machines (LS-SVM) using MATLAB to establish the correlation model between the quality indicators and cell phone processor quality.

Used the training set to train the model and the validation set to select the model, then predicted 400 samples (test set) to evaluate the learning method.

Wrote a research paper covering the work from spectroscopy collection to statistical results of quality parameters.

INTERNSHIP EXPERIENCE

Allied Millennial Partners, LLC

New York, USA

Data Analyst Internship

Jun.2020-Aug.2020

Tested the Efficient Market Hypothesis on the target stocks and target investing companies, as well as the market indices, using R and Python.

Built an ARIMA model, a linear state space model, and a VARMA model to analyze closing prices and returns of target stocks using R and Python.

Analyzed empirical findings with different statistical tests to examine the properties of autocorrelation and serial dependency (seasonality in financial markets).

Forecast upcoming trends for target stocks using a technical factor model and gave investment suggestions.

Wrote a detailed company financial analysis and quantitative analysis report and presented it to the executive team; received very positive feedback from team members.

SKILLS & INTERESTS

Communication Skills: Fluent in English (verbal and written communication)

Analytical Skills: Data Mining and Machine Learning, Time series, Survival Analysis, Categorical Data Analysis, Linear models and Multivariate analysis

Programming Skills: SAS, R, SQL, Python, MATLAB

Certification: SAS Certified Base Programmer

Open-Mindedness: Adaptability, Creativity, Logical Thinking, Problem Solving, Working Independently


