
Data Analyst

Cerritos, California, United States
September 12, 2018




RELOCATABLE Cerritos, CA 217-***-**** SAS Certified


University of Illinois – Urbana Champaign, Champaign, IL August 2016 – May 2018

MS in Statistics

GPA: 3.93/4.0

Beijing Forestry University, Beijing, China September 2012 – July 2016

BS in Statistics

GPA: 3.50/4.0


R, SAS, Python, SPSS, SQL, Microsoft Excel (pivot tables, VLOOKUP), statistical modeling, machine learning, Tableau


Technical Consulting & Research, Inc, Weston, CT July 2018 – Present

Statistical Data Mining Analyst

Used Python to scrape contact information from a government website, cleaned the data in R, and visualized it in Tableau.

University of Illinois – Urbana Champaign, Champaign, IL January 2018 – May 2018

Course Assistant in Statistics Programming Methods

Held weekly office hours to help 60 students with R programming, including simulation, data visualization, data cleaning, and R Shiny apps.

Mentored 5 student teams on data analytics projects and R Shiny app design.

Research Institute of College Admissions, Beijing, China May 2017 – August 2017

Data Analyst

Built machine learning models (linear regression, random forest, logistic regression, etc.) to predict academic performance and identify the factors with the greatest effect on students’ scores; achieved 90% accuracy in predicting students’ choices of major.

Created data visualizations using ggplot2 in R and Tableau, and wrote SQL queries to extract data from the database.

Improved data quality by handling missing values and reshaping data from long to wide format.

Ipsos, Beijing, China March 2016 – July 2016

Marketing Research Analyst

Designed a questionnaire to explore ways of measuring, managing, and improving customer relationships for client organizations.

Presented business recommendations to clients in detailed reports; visualized research data and built statistical models (linear regression, cluster analysis, etc.) in R and SPSS for various projects.

Cleaned and summarized accumulated data using pivot tables and VLOOKUP in Excel.


Prediction of Loan Status, Champaign, IL October 2017 – December 2017

Predicted with 87% accuracy whether or not a loan would default.

Built XGBoost, regularized logistic regression, and random forest models in R to predict loan status; evaluated each model via log loss.

Cleaned historical loan data from 2007 to 2016 (1M observations, 74 variables) and visualized it using ggplot2 in R.

Prediction of Iowa Housing Price, Champaign, IL March 2017 – May 2017

Accurately predicted final housing prices using linear regression, random forest, and gradient boosting machine models in R.

Evaluated each model via RMSE between the logarithms of the predicted and observed sale prices.

Cleaned housing data from 2006 to 2010 (3K observations, 79 variables).

Movie Review Sentiment Analysis (Applied NLP), Champaign, IL March 2017 – May 2017

Achieved 95% accuracy in classifying IMDB movie reviews as positive or negative.

Implemented sentiment analysis via logistic regression and random forest models.

Cleaned 50,000 reviews in Python by removing HTML tags and stop words, replacing punctuation and numbers with spaces, converting text to lower case, and tokenizing it into words; transformed the data into vectors using TF-IDF.
