Data Analyst

Location:
Washington, DC
Posted:
November 11, 2020

Resume:

Zixiao (Melody) Zhang

**** * ******** *****, *** 316, Arlington, VA 22201 ■ 202-***-**** ■ adhrau@r.postjobfree.com

SAS Certified Specialist seeking a position as a data analyst, statistician, or biostatistician, or in a related area

EDUCATION

M.S. in Biostatistics December 2020

Georgetown University, Washington, D.C.

Coursework: Clinical Trial, Machine Learning, Data Science, Categorical Data Analysis, Epidemiology, Survival Analysis

B.S. in Statistics July 2019

Hong Kong Baptist University, Zhuhai, China

Coursework: Time Series Analysis, Data Mining, Simulation, Multivariate Analysis

WORK EXPERIENCE

Research Assistant

Biostatistics Department, Georgetown University, Washington, D.C. January 2020 – May 2020

• Assisted on 6 projects, selecting covariates of interest and merging health status data collected from Georgetown University Hospital

• Checked normality of continuous variables using the Shapiro-Wilk test and produced summary statistics and univariate tests using the R package “tableone”

• Tested associations between continuous and categorical variables in R, assessing p-values from the Kruskal-Wallis test for non-normal variables, the Wilcoxon rank-sum test for binary groups, and ANOVA for normally distributed variables

• Generated 6 overall reports and interpreted the results by inserting comments in Excel

RELATED EXPERIENCE

Research on Breast Cancer in Microarray Studies (R) September 2020

• Pre-processed raw data using “Bioconductor” in R

• Developed a self-defined R function to implement the quantile normalization algorithm

• Conducted moderated t-tests and selected genes at a Benjamini-Hochberg (BH) adjusted p-value cutoff using the “limma” package in R

• Performed dimensionality reduction using Principal Component Analysis (PCA), Independent Component Analysis (ICA) and Multidimensional Scaling (MDS)

Project – Relationship between Diabetes and Vitamin C (SAS) August 2020

• Cleaned NHANES data from 2005 to 2006 by deleting observations with missing values and identified diabetes cases using PROC SQL

• Summarized the data in a baseline table reporting the mean, median, minimum and maximum using PROC CONTENTS, PROC MEANS and PROC FREQ

• Conducted normality tests, t-tests and logistic regression, and analyzed the resulting p-values, using PROC TTEST and PROC LOGISTIC

• Generated reports using procedures such as PROC SORT, PROC FREQ and PROC UNIVARIATE; concluded that serum vitamin C concentration was significantly lower in diabetics than in non-diabetics in the US population, with a more significant difference in the non-Hispanic group

Machine Learning Research Based on Mental Health Data (Python) June 2020

• Cleaned data by handling missing values and developed data processing algorithms to transform non-standard data

• Split data into a training set and a testing set using the “Scikit-learn” package

• Explored data using a correlation matrix and created charts to visualize data with Seaborn

• Developed models to predict treatment using machine learning methods including Logistic Regression, KNN and Decision Tree in the “Scikit-learn” package; tuned parameters in Python to reach 82% accuracy, comparing models by confusion matrix and ROC/AUC score

CERTIFICATIONS AND TECHNICAL SKILLS

Certifications

• SAS Certified Specialist: Advanced Programming Using SAS 9.4

• SAS Certified Specialist: Base Programming Using SAS 9.4

Technical skills

• R, SAS, Python, SPSS, Excel, RDBMS (MySQL), Tableau, AWS, Linux, LaTeX


