
Data Scientist

Location: Arlington, VA
Posted: August 07, 2020


Resume:

Jichong Chai

*******.****@*****.*** 202-***-**** 1400 S Joyce St, Arlington, VA 22202

Summary

Recent Ph.D. graduate in Statistics with 2 first-author research publications and 4+ years of extensive teaching experience. Aspiring Data Scientist with expertise in Statistics and Machine Learning, 5+ years of research experience, and strong proficiency in technical tools for data analysis.

Education

The George Washington University Washington, DC

Ph.D., Statistics May 2019

Dissertation: Privacy Protection in Data Collection via Randomized Response Procedures

Master of Science, Statistics GPA: 3.97/4.00 May 2014

Zhejiang University Hangzhou, China

Bachelor of Engineering, Electronic Engineering GPA: 3.6/4.0 June 2011

Technical Skills

Statistics and Machine Learning: Regression, Tree-based Models, PCA and Clustering, Design of Experiments, Survey Methodology, Neural Networks, Bayesian Statistics, Time Series, Privacy Preserving Data Mining, etc.

Technical tools: R, SQL, Python (pandas, scikit-learn, TensorFlow), SAS, Tableau

Work Experience

MOYI Inc. New York, NY

Data Scientist March 2020 – Present

Created a machine learning pipeline in Python that led to an alpha-generating stock trading strategy

Developed an ML solution for making trading decisions by predicting stock market trends and daily price changes

Built an RF model on selected technical indicators that returned 20% more than a “buy and hold” strategy

The George Washington University Washington, DC

Ph.D. Researcher August 2015 – March 2020

Defined a practical and rigorous metric for privacy protection in data collection from surveys

Developed a privacy protection method with 30% less estimation risk than the state-of-the-art RAPPOR algorithm currently used by Google, Apple, and other companies

Wrote 2 first-author papers and gave 4 talks at conferences and seminars

Teaching Fellow January 2015 – May 2019

Led recitation lectures and lab sections (R and SAS) for 8 courses, including Data Mining, Statistical Computing, Applied Regression, Intro to Statistics, Inference, Design of Experiments, and Multivariate Analysis

Projects

Privacy Preserving Data Mining: Classification on perturbed Census data

Reconstructed the dataset from Census data that had been perturbed by the proposed new privacy protection method

Built an XGBoost model on the reconstructed dataset, improving AUC from 0.59 to 0.77

Privacy Preserving Data Mining on network data

Developed a logistic regression model for network data with perturbed edges

Proposed an efficient EM algorithm for estimating the parameters of the logistic model with improved accuracy

Data Competition: Fraud detection on credit card transaction data with 600,000+ transactions

Handled missing data by regression imputation and the imbalanced dataset by oversampling

Developed a GBM classification model with an AUC of 0.83 through feature engineering and parameter tuning

Publications

2 first-author papers: Electronic Journal of Statistics (2018), AStA Advances in Statistical Analysis (2019, in review)

Awards

Minna Mirin Kullback Memorial Prize in 2018 (Best PhD Student), University Fellowship in 2013 (Top 5%)


