
Statistical Analysis, Machine Learning, NLP, Data Visualization

New York City, NY
February 23, 2020




646-***-****

EDUCATION

Columbia University - M.S. in Business Analytics, GPA: 4.00/4.00, Aug 2019 – Present
• Coursework: Data Analytics (Python), Tools for Analytics (Bash, Git, SQL, Django), Business Analytics (R), Optimization (Gurobi), Statistics and Simulation, Analytics on the Cloud (Scala, Spark), Capital Markets.

Nankai University - B.E. in Economics, GPA: 3.86/4.00, Sept 2015 – Jun 2019
• First-class scholarship, Merit Student (top 5%); Public scholarship, Excellent Student Cadre (5% of 197).
• Coursework: Econometrics, Time Series Analysis, Macroeconomics, Accounting, Financial Economics, Database.

SKILLS

Programming Languages: Python, R, Scala, SQL
Tools: Spark, Django, TensorFlow, STATA, MATLAB, Tableau

EXPERIENCES

Dun & Bradstreet (Columbia Capstone Project), New York, United States
Data Scientist, Jan 2020 – Present
• Develop a framework to measure the digital maturity of 5,000 public companies.
• Digital Definition: Set a comprehensive definition of how to qualify and measure digital transformation.
• Data Collection: Use Python to scrape data from Yahoo Finance, Wikipedia, and companies’ social media channels.
• Prediction: Design a rating scheme for the vital business components, including social media influence, products, financial performance, and organization. Combine machine learning and NLP techniques to predict a digital transformation index that quantifies each company’s digital level within its industry and line of business.

China Securities, Beijing, China
Industry Research Intern in the Food and Beverage group, Apr 2019 – Jul 2019
• Data Extraction: Extracted financial data, price indicators, and market share information.
• Industry Analysis: Classified the development status of enterprises, products, and industries, and forecast financial performance over a 5-year horizon to write investment strategy and in-depth industry reports.
• Daily Maintenance: Calculated daily and weekly stock market data and collected information on corporate announcements and industry news to write journals and weekly reports.

KPMG China Limited, Beijing, China
Risk Consultant Intern, Aug 2018 – Oct 2018
• Monte Carlo Simulation: Modified a drug pricing and market access model with Crystal Ball. Applied a joint experiment to obtain utility values, and simulated the market share resulting from each decision to improve management flexibility.
• Market Survey: Designed 3 questionnaires and surveyed 15 doctors, 40 patients, and 5 government agencies to collect information on treatment paths, medicine cost, price affordability, therapy evaluation, and policy support.
• Results Processing: Calculated costs by weighting the three main subject groups. Combined the results with existing policies and used dummy variables to estimate the price trend after listing. Completed a 100+ page PowerPoint report.

Nankai University, Economic School, Tianjin, China
Research Assistant, Carbon Emission Peak Constraint and Economic Growth Path project, Mar 2017 – Jul 2018
• Regression: Collected 56 years of data from 7 developed countries. Built regression models in STATA to study the relationship between economic variables and carbon emissions, compared the countries’ emission status, and examined the time and economic performance required for China to reach its emission peak.
• Simulation: Simulated 45 paths combining different emission reduction speeds and plateau period lengths for China. Estimated the impact of each choice on abatement costs and environmental damage costs. Determined the optimal emission reduction speed and plateau length to minimize economic loss.

PROJECTS & COMPETITIONS

Columbia University, Fake review detection and text analysis, Nov 2019 – Dec 2019
• Data Scraping & Processing: Extracted data from the Yelp website. Used SQL to preprocess datasets.
• Machine Learning: Used Python to build 4 different classification models. Evaluated model performance metrics and applied SVM to predict reviews’ authenticity and quality.
• Text Analysis: Drew pie charts, histograms, and word clouds to summarize features. Conducted VADER weighted sentiment analysis and fitted LDA models to extract the most frequently used descriptive words.

Nankai University, Environmental pollution and urbanization, Sept 2017 – Jan 2018
• Panel Model: Built an environmental pollution index to indicate the pollution condition of 18 provinces. Based on the Kuznets curve, estimated an individual fixed-effects model that revealed an N-shaped relationship.

L’Oréal On-Campus Charity Sales, Apr 2018 – Jun 2018
• Designed promotional activities that attracted more than 10k teachers and students. Responsible for volunteer recruitment and training. Achieved CNY 220,000 in sales through the online market and received the Social Welfare Award.
