Chi Zhang
917-***-**** ******@********.*** *** Washington Blvd Apt1311, Jersey City, NJ 07310
EDUCATION
Columbia University New York, NY
Master of Science in Data Science Expected 02/2019
Relevant Courses: Probability and Statistics; Machine Learning; Deep Learning; Exploratory Data Analysis and Visualization
University of Glasgow-University of Electronic Science and Technology of China, Dual Bachelor’s Degrees Chengdu, China
Bachelor of Engineering (Hons) in Electronics and Electrical Engineering 09/2013~06/2017
GPA: 3.5/4.0; Rank: 10/122
PROJECTS
Understanding Congressional Bills, Capstone Project with Bloomberg
• Designed and developed a data analysis workflow to automatically draw insights from a congressional bill to assist understanding.
• Used Stanford Named Entity Recognizer (NER) and spaCy NER to extract the impacted populations and quantities mentioned in a bill.
• Used Plotly to create interactive visualizations showing how bills cluster by topic, state, etc.
Cancer Cells Detection on Gigapixel Pathology Image Using CNN
• Generated training data with a sliding window that captures a central area and its surrounding context, to address the shortage of training data.
• Performed data augmentation and downsampling to address class imbalance in the dataset.
• Built a multi-input (multi-scale image) model based on Inception V3 to detect cancer cells.
Prediction of Box Office Performance Based on Twitter Using Sentiment Analysis
• Crawled over 42,000,000 movie reviews from Twitter using a Python wrapper, collected time-series box office data from the web, and cleaned both datasets.
• Built the first Emoji Sentiment Dictionary in the NLP sentiment analysis field using a statistical method, and used the NLTK sentiment analyzer to compute each tweet’s sentiment score. Then used Linear Regression and Random Forest to predict the future box office performance of a specific movie, and K-Means to cluster movie reviews.
• Used Tableau to visualize the data and analyze the results; the score reached 80%.
Statistical Machine Intelligence & Learning Lab, Big Data Research Center (the largest Big Data research institution in China)
Research Assistant
The Metropolitan Museum Exhibitions Detection Using Transfer Learning Based on CNN
• Collected a small dataset (50 images/class) by slicing video and used transfer learning to train an exhibition detector.
• Applied dropout, data augmentation, early stopping, and fine-tuning by unfreezing the top layers of a pre-trained network.
• Used TensorBoard to visualize the training process and reached 95% accuracy.
China Mobile Customer Churn Prediction
• Cleaned the data, including outlier detection, missing value imputation, categorical feature engineering, and stratified splitting.
• Developed Logistic Regression, Random Forest and Gradient Boosting Tree models for telecommunications service vendors to predict customer churn probability based on labeled data.
• Evaluated model performance using a confusion matrix and analyzed feature importance.
PROFESSIONAL EXPERIENCE
American Credit Acceptance (ACA) Spartanburg, SC
Modeling Analyst Intern 06/2018-08/2018
• Updated the risk model by adding new variables and functions to improve performance on the car loan business.
• Improved the variable selection strategy from a voting-based to a ranking-based method using Linear Regression, Random Forest, correlations, and other variable selection techniques. Applied it to the ID Analytics credit and fraud risk dataset. Refined feature selection with stepwise GBM. Created new risk variables from the existing variables. Increased the Gini coefficient by 16%.
• Identified a problem in the current Lending Risk model and addressed it by developing four candidate Primary Borrower and Co-Borrower Indifferent GBM models. Increased the Gini coefficient by over 20%, which could help avoid millions in losses from business with ACA’s largest partner, CarMax.
ISU-Wm Schwartz & Co Chicago, IL
CRM Assistant 07/2015-08/2015
• Documented customers’ information and administered the information system.
• Extracted insights from user profiles and analyzed user behavior to target potential customers.
• Participated in and organized weekly regional commercial meetings as a company representative.
AXA China Region Insurance Company Ltd. Hong Kong, China
Data Analyst Assistant 07/2014-08/2014
• Led a group of 7 interns in data analysis of Mainland China companies and ranked 1st in the Asset Allocation Competition.
• Contributed to the Hedge Investment Fund Construction project; the portfolio ranked 1st in the group competition.
SKILLS
Technical: Python, R, SQL, MongoDB, D3, Tableau, Git, AWS, Spark, Microsoft Office, scikit-learn, pandas, NumPy, TensorFlow, Keras