
Data Analyst Python SQL R

Location:
Philadelphia, PA
Posted:
August 15, 2020


Resume:

MENGFEI WANG

*************@*****.*** • 1-571-***-**** • linkedin.com/in/mengfeiwang-aria/

EDUCATION

MS, Business Analytics (Top 20%) DUKE UNIVERSITY, THE FUQUA SCHOOL OF BUSINESS (STEM) Jun 2019 - May 2020

BS, Applied Mathematics and Finance (Top 10%) CHINA UNIVERSITY OF POLITICAL SCIENCE AND LAW Sept 2015 - Jun 2019

SELECTED PROJECTS

Deep Learning, Natural Language Processing (Python/Tableau): News-driven Movement of Stock Daily Return Jan – Feb 2020

• Programmed a web crawler to scrape 20K+ news articles containing the keyword ‘Tesla’ from Jan 2014 to Oct 2019.

• Applied LDA topic modeling, an NLP method, to cluster text tokens into 5 groups and visualized the topics with word clouds.

• Extracted the abnormal return from the Fama-French factor model to set the binary target variable.

• Trained classification models such as KNN to predict news-driven movements of Tesla stock daily returns.

• Cross-validated and fine-tuned the SVM model via grid search in Python, reaching 59% accuracy; published on Medium.

Data Visualization, Prediction: Analyzing and Predicting Usage of Shared Bicycle Systems Around Cities in CA Oct - Dec 2019

• Visualized weather, demographic, customer, and traffic data in Python using heatmaps, choropleth maps, and boxplots.

• Performed EDA on time series data via SQL in Python and regrouped the data by categories such as weekday for easier tracking.

• Engineered features with PCA and selected variables such as humidity, rainy time, and traffic volume with Lasso.

• Fine-tuned a Random Forest model to 65% accuracy, offering suggestions on bike supply and usage. (Matplotlib, Pandas, NLTK)

Predictive Model (SQL, R): Credit Card Default Rate Next Month Aug - Oct 2019

• Collected 50k+ records via SQL and performed data cleansing and EDA, addressing issues such as missing values and imbalanced variables.

• Visualized 20+ Lasso-selected features such as demographic information, transaction history, and user behavior.

• Cross-validated and fine-tuned classification models in R to predict next month’s default rate, achieving 80% accuracy.

• Explained models and interpreted results so the team could compile a report to help banks improve their credit card business.

EXPERIENCE

FINSLATE Feb - May 2020

Portfolio Data Scientist, Team Manager

• Led a team of 5 to build portfolio asset allocation optimization models based on MPT, Risk Parity, and Black-Litterman; compiled a 50+ page white paper and delivered the results to C-level executives and data scientists in the company.

• Oversaw project scoping, teamwork, and client-facing communications; completed the project 2 weeks early.

• Extracted 50k+ market data points from Yahoo Finance and investors’ performance records from Cambridge Database via R.

• Conducted time series analysis to unsmooth the highly autocorrelated data via the Geltner method.

• Built constrained, rebalanced, and robust allocation models with adjustable input variables such as risk preference.

• Applied 8-block bootstrapping to test the models’ out-of-sample performance, achieving an OOS Sharpe ratio over 0.8.

MORGAN STANLEY CAPITAL INTERNATIONAL June - Dec 2018

Quantitative Data Analyst Intern

• Extracted and cleaned 10 years of Chinese equity market and investor behavior data from Wind via R, used to adjust quantitative models such as the pricing model; improved working efficiency, such as data extraction time, by 20%.

• Applied time series analysis, including the ADF test, and built an AR-GARCH model in EViews to predict the impact of black-swan events.

• Simulated and bootstrapped data to backtest and validate predictive models, applying various new risk metrics.

• Conducted 6 research inquiries and explained stochastic mathematics and algorithm theory; hosted an innovation workshop.

BEIJING CHANGJIANG QIANXIN XINHUI INVESTMENT MANAGEMENT Jan - Dec 2017

Investment Business Analyst

• Selected 50k+ industry data points and applied exploratory data analysis to drive insights for the triple bottom line investing process.

• Self-taught a pre-money valuation model within 2 weeks to value 4 NEEQ companies, presenting advice and risk suggestions for a $500k+ investment project.

• Created a visualization dashboard in Tableau to update and present investment seminar insights, elevating the team’s industry understanding with 30% higher idea delivery frequency.

TECHNICAL CAPABILITIES

Tools: R, SQL, Python (BeautifulSoup, PyTorch, PyCharm, TensorFlow, Matplotlib), Tableau, MATLAB, Advanced Excel, SPSS, EViews

Analysis and Modeling: Statistics, Data Visualization, Time Series Analysis, Churn Analysis, Predictive Modeling, A/B Testing

Machine Learning: Classification Models, Regression, Causal Inference, Neural Networks, Natural Language Processing (NLP), Cross-Validation


