
Data Scientist

Arlington, VA
April 22, 2020



Jichong Chai 202-***-**** 1400 S Joyce St, Arlington, VA 22202


Recent Ph.D. graduate in Statistics with 2 first-author research publications and 4+ years of extensive teaching experience. Seeking to solve analytical problems and develop models as a Data Scientist, offering expertise in Statistics and Machine Learning, 4+ years of research experience, and proficiency with technical tools for data analysis.

Education

The George Washington University Washington, DC

Ph.D., Statistics May 2019

Dissertation: Privacy Protection in Data Collection via Randomized Response Procedures

Master of Science, Statistics GPA: 3.97/4.00 May 2014

Zhejiang University Hangzhou, China

Bachelor of Engineering, Electronic Engineering GPA: 3.6/4.0 June 2011

Technical Skills

Statistics and Machine Learning: Regression, Tree-based Models, PCA and Clustering, Design of Experiments, Survey Methodology, Neural Networks, Bayesian Statistics, Time Series, Privacy-Preserving Data Mining, etc.

Technical tools: R, SQL, Python (pandas, scikit-learn, TensorFlow), SAS, Tableau

Work Experience

The George Washington University Washington, DC

Ph.D. Researcher August 2015 – Present

Defined a practical and rigorous metric for privacy protection in data collection from surveys

Developed a privacy protection method with 30% less estimation risk than the state-of-the-art RAPPOR algorithm currently used by Google, Apple, and other companies

Wrote 2 first-author papers and presented 4 talks at conferences and seminars

Teaching Fellow January 2015 – May 2019

Led recitation lectures and lab sections (R and SAS) for 8 courses (Data Mining, Statistical Computing, Applied Regression, Intro to Statistics, Inference, Design of Experiments, and Multivariate Analysis)

Tianjin Shipping Industry Fund Tianjin, China

Data Analyst Intern September 2011 – May 2012

Conducted exploratory data analysis and built linear regression models based on the Baltic Freight Index to support shipping business decisions and planning

Supported engineers in monitoring and analyzing core shipbuilding progress data used in project management

Projects

Privacy-Preserving Data Mining: Classification on perturbed Census data

Reconstructed the dataset from Census data that had been perturbed by the proposed new privacy protection method

Built an XGBoost model based on the reconstructed dataset and improved AUC from 0.59 to 0.77

Privacy-Preserving Data Mining on network data

Developed a logistic regression model for network data with perturbed edges

Proposed an efficient EM algorithm for estimating the parameters of the logistic model with enhanced accuracy

Data Competition: Fraud detection on credit card transaction data with 600,000+ transactions

Handled missing data with regression imputation and addressed class imbalance with oversampling

Developed a GBM classification model with AUC 0.83 through feature engineering and parameter tuning

Publications

2 first-author papers: Electronic Journal of Statistics (2018), AStA Advances in Statistical Analysis (2019, in review)

Awards

Minna Mirin Kullback Memorial Prize in 2018 (Best PhD Student), University Fellowship in 2013 (Top 5%)
