Jichong Chai
*******.****@*****.*** 202-***-**** 1400 S Joyce St, Arlington, VA 22202
Summary
Recent Ph.D. graduate in Statistics with 2 first-author research publications and 4+ years of teaching experience. Aspiring Data Scientist with expertise in Statistics and Machine Learning, 5+ years of research experience, and strong proficiency in technical tools for data analysis.
Education
The George Washington University Washington, DC
Ph.D., Statistics May 2019
Dissertation: Privacy Protection in Data Collection via Randomized Response Procedures
Master of Science, Statistics GPA: 3.97/4.00 May 2014
Zhejiang University Hangzhou, China
Bachelor of Engineering, Electronic Engineering GPA: 3.6/4.0 June 2011
Technical Skills
Statistics and Machine Learning: Regression, Tree-based Models, PCA and Clustering, Design of Experiments, Survey Methodology, Neural Networks, Bayesian Statistics, Time Series, Privacy Preserving Data Mining, etc.
Technical Tools: R, SQL, Python (pandas, scikit-learn, TensorFlow), SAS, Tableau
Work Experience
MOYI Inc. New York, NY
Data Scientist March 2020 – Present
Created a machine learning pipeline in Python that led to an alpha-generating stock trading strategy
Developed an ML solution for making trading decisions by predicting stock market trends and daily price changes
Built an RF model on selected technical indicators that earns a 20% higher return than a “buy and hold” strategy
The George Washington University Washington, DC
Ph.D. Researcher August 2015 – March 2020
Defined a practical and rigorous metric for privacy protection in data collection from surveys
Developed a privacy protection method with 30% less estimation risk than the state-of-the-art RAPPOR algorithm currently used by Google, Apple, and other companies
Wrote 2 first-author papers and presented 4 talks at conferences and seminars
Teaching Fellow January 2015 – May 2019
Led recitation lectures and lab sections (R and SAS) for 8 courses, including Data Mining, Statistical Computing, Applied Regression, Intro to Statistics, Inference, Design of Experiments, and Multivariate Analysis
Projects
Privacy Preserving Data Mining: Classification on perturbed Census data
Reconstructed the dataset from Census data perturbed by the proposed new privacy protection method
Built an XGBoost model on the reconstructed dataset and improved AUC from 0.59 to 0.77
Privacy Preserving Data Mining on network data
Developed a logistic regression model for network data with perturbed edges
Proposed an efficient EM algorithm for estimating the parameters of the logistic model with enhanced accuracy
Data Competition: Fraud detection on credit card transaction data with 600,000+ transactions
Handled missing data with regression imputation and addressed class imbalance with oversampling
Developed a GBM classification model with an AUC of 0.83 through feature engineering and parameter tuning
Publications
2 first-author papers: Electronic Journal of Statistics (2018), AStA Advances in Statistical Analysis (2019, in review)
Awards
Minna Mirin Kullback Memorial Prize in 2018 (Best PhD Student), University Fellowship in 2013 (Top 5%)