R, SAS, SQL, Python

New York, New York, United States
October 28, 2016

*** **** ******, *** ****, NY ***** 347-***-****

EDUCATION

Columbia University, Graduate School of Arts and Sciences, New York, NY September 2014-February 2016

M.A. in Statistics

GPA: 4.0/4.0

Relevant Coursework: Data Mining, Statistical Machine Learning, Bayesian Statistics, Advanced Data Analysis

Peking University, School of Mathematical Sciences, Beijing, China September 2010–June 2014

B.S. in Statistics

Relevant Coursework: Data Structures, Mathematical Statistics, Applied Regression Analysis, Statistical Software

ACADEMIC EXPERIENCE

Project: Forecasting Rossmann Store Sales with Store, Promotion and Competitor Data December 2015

Defined a new measure resembling accuracy rate to evaluate fitted models

Fitted a seasonal ARIMA model to the mean sales of all stores; fitted each store's difference from the mean against all features using robust regression to address non-normality; summed the two components to obtain predictions

Clustered stores by their competition conditions using a customer segmentation algorithm, then fitted a generalized linear mixed model (GLMM)

Compared the out-of-sample forecast performance of these two methods to obtain the most accurate prediction

Project: Classification of Balance Scale Data in Psychological Experiments August 2015

Compared classification performance of a baseline-category logit model, tree-based ensemble methods (bagging, boosting and random forests), a one-against-one soft-margin SVM with a Gaussian kernel, and artificial neural networks

Addressed the class imbalance in the scale data using a cascade approach as well as the SMOTE algorithm, and compared their performance

Project: Facial Recognition for the Yale Faces B Dataset March 2015-April 2015

Conducted PCA for dimension reduction and displayed "eigenfaces" to determine the number of principal components; reconstructed faces

Applied k-nearest neighbors (kNN) on reconstructed faces to identify new faces

Compared matching accuracy for different combinations of lighting conditions

Project: Longitudinal Data Analysis on the Rat Weight Data March 2014-June 2014

Explored the trend of growth and the covariance structure of error terms

Fitted a random-effects model and applied AIC/BIC to identify the structure of the error covariance matrix under restricted maximum likelihood (REML)

Reduced models via Wald tests under maximum likelihood (ML) and made inferences about the effect of diet and exercise on weight

PROFESSIONAL EXPERIENCE

Young Talent Program intern at Citibank, Beijing, China October 2013 – December 2013

Provided customers with wealth management suggestions including insurance plans, mutual funds and foreign exchange options based on their financial status and objectives

Applied L1-regularized logistic regression to predict credit card default among clients shopping online overseas, and presented the results to directors


Language skills: Fluent in Chinese

Computer skills: Proficient in R, Python, SAS, SQL, MATLAB, Java, MS Office (Word, Excel, PowerPoint)
