Xiaohuan Li
***W ***TH ST APT**, New York, NY ***25
Cell: 917-***-**** Email: ******@********.***
SUMMARY AND OBJECTIVE
Extensive knowledge of statistical models and machine learning algorithms; experienced with data
pre-processing, exploratory data analysis, and model construction.
Seeking a data analyst position in a friendly, fast-growing environment where I can apply quantitative
problem-solving and analytical skills to help the organization achieve its mission and goals.
EDUCATION
Columbia University, Graduate School of Arts and Sciences New York, NY
MA in Statistics, GPA: 3.6/4.0 December 2014
Huazhong University of Science and Technology, School of Economics Wuhan, China
BS in Financial Engineering, GPA: 3.8/4.0 June 2013
PROFESSIONAL EXPERIENCE
Rongzhi Investment Management Company Shanghai, China
Data Analyst Intern July-August 2014
Worked closely with various teams across the company, and provided technical assistance in support of
management and customer requests;
Improved market research efficiency by collecting data through web scraping in Python and organizing
field trips for detailed market investigation;
Implemented statistical models in R for regression and correlation analysis.
China Minsheng Banking Corp., Ltd. Hangzhou, Zhejiang, China
Financial Market Intern March-May 2013
Extracted financial index data from web sources to support daily inter-bank borrowing operations;
Communicated with clients to understand their needs and made recommendations.
PROJECT EXPERIENCE
Face Recognition Fall 2014
Pre-processed the original data by converting rectangular color images to square black-and-white ones;
Applied three algorithms to face recognition: Principal Component Analysis (PCA), Linear Discriminant
Analysis (LDA), and Rectangular-Area Feature Extraction;
Compared the results of the three methods and concluded that LDA outperformed PCA and that Rectangular-Area
Feature Extraction also performs well when enough features are used.
Prediction of Default Probability Spring 2014
Explored the application of various classification methods, models, and algorithms to predicting default
probability based on clients’ basic and credit information;
Conducted data pre-processing and exploratory data analysis in R and used GitHub for version control;
Implemented Naïve Bayes, Logistic Regression, Support Vector Machines, and Random Forest classifiers,
reaching 90 percent accuracy.
Target Population Selection Spring 2014
Predicted whether a person makes over 50k a year based on 32,561 observations and 15 variables by applying
Logistic Regression and Decision Trees;
Applied a stepwise algorithm for feature selection in Logistic Regression and used Random Forest in place of
a single Decision Tree to improve accuracy, which then reached 80 percent.
SKILLS AND CERTIFICATIONS
Skills: Microsoft Office Suite, R, SAS, MySQL, Python, Data Mining, Machine Learning, Time Series Analysis
Certifications: Certified Advanced Programmer for SAS 9; Passed Level I of the CFA Program