Data Analyst

New Haven, Connecticut, United States
February 21, 2019


Siyun He Email : Mobile : +1-203-***-****


Yale University New Haven, CT

Master of Science in Biostatistics Sep. 2017 - May 2019

Leadership: Student Ambassador of Yale School of Public Health (2018)

Membership: American Public Health Association (APHA), American Statistical Association (ASA)

Related Courses: Application Statistics Programming with SAS and R; Machine Learning; Statistical Consulting; Fundamentals of Clinical Trials; Survival Analysis; Longitudinal and Multilevel Data Analysis

University of Toronto Ontario, Canada

Honours Bachelor of Science (Hons. BSc) in Pharmacology Sep. 2013 - Jun. 2017

Honours: Innis College Alumni Association Scholarship (2015); Dean’s List (2015, 2016)

Experience

Center for Outcomes Research and Evaluation (CORE) New Haven, CT

Statistical Analyst May 2018 - Present

Conducted research on prevalence, awareness and treatment of isolated diastolic hypertension and severe hypertension using health care data from 2.6 million adults in the China PEACE Million Persons project;

Cleaned, prepared, combined, visualized, and analyzed data using R and SAS;

Performed subgroup analyses to examine how prevalence, awareness, and treatment rates varied across population subgroups;

Developed multivariable mixed models with a logit link function and township-specific random intercepts, accounting for spatial autocorrelation, to identify individual characteristics associated with awareness and treatment;

Created graphs, tables, and charts to assess the number and classes of medications used by treated participants;

Reported and discussed statistical findings in weekly meetings with physicians, research scientists, project leaders, and senior statisticians;

Participated in drafting and revising publications.

China CITIC Bank Beijing, China

Data Analyst Aug 2018 - Sep 2018

Worked with managers and statisticians to develop financial models to analyze the capital flows to public credit;

Delivered employee training on data exploration in R for non-statistical departments;

Collaborated with various teams and departments across the company.

University of Toronto Ontario, Canada

Research Assistant May 2016 - Aug 2016

Replicated and modified Mazarati's research to find a potential model for studying the biological basis of depression-like symptoms in epilepsy and for testing antidepressant drugs;

Used SigmaPlot to visualize data and to run t-tests, Mann-Whitney tests, and two-way ANOVA;

Compared the differences in seizure stage, afterdischarge threshold, and behavioural tests between control and experimental groups.


Evaluation of Breast Cancer Diagnostic Method: Built logistic regression models in R to predict breast cancer diagnostic outcomes from nuclear morphometric characteristics in the Breast Cancer Diagnostic dataset. Performed model selection using AIC, ROC curves, and AUC.

Time Series Analysis on Air Passenger Data: Applied an iterative fitting method and an ARIMA model to fit the data and predict future trends in air passenger numbers.


Programming: R, SAS, SQL, Python

Software: LaTeX, Excel, PowerPoint, Word, Prezi, SigmaPlot

Analysis: Machine Learning, Time Series, Text Mining, Generalized Linear Models, Multivariable Mixed Models

Languages: English (proficient), Chinese (native)