HAO YI
Edison, NJ *****
Tel: 732-***-****
Email: ********@*******.***
Education
RUTGERS, THE STATE UNIVERSITY OF NEW JERSEY New Brunswick, NJ
Master of Science, Mathematical Finance, May 2013-2015
UNIVERSITY OF LIVERPOOL Liverpool, UK
Bachelor of Science, Mathematics with Finance, July 2013 (First Class, top 10%)
Experience
BANK OF CHINA Suzhou, China
Data Analyst Intern, Summer 2012
• Collected, summarized and analyzed client data as well as financial statements
• Responded to questions from clients; prepared briefings for senior managers
Projects
Twitter Sentiment Analysis, Data Mining
-Investigate the predictive power of public sentiment from various Twitter accounts for stock market return movements
-Request Twitter data through the Twitter API
-Clean the raw collected data and extract features in R
-Compare supervised machine learning classifiers for tweet sentiment (Multinomial Naïve Bayes, Support Vector Machine and Logistic Regression) using the Scikit-learn and Pandas packages in Python
-Run Granger causality analysis of the classified sentiment on an AR(2) time series model
Movie Recommendation Database, Business Data Management
-Create a ‘Movie Recommendation’ relational schema based on a conceptual ER diagram in MAMP MySQL on Toad
-Verify and normalize the database system to Third Normal Form (3NF)
-Write SQL queries for specific user requirements and output recommended movies
News Grouping, Text Classification (Scikit-learn package)
-Train a Naïve Bayes classifier paired with a feature vectorizer (transforming text into numeric features), evaluated by K-fold cross-validation
-Improve the results by changing how the text is parsed into tokens and removing stop words
-Select the best ‘alpha’ parameter for MultinomialNB and ‘gamma’ for SVC; tune the ‘C’ and ‘gamma’ combination using Grid Search
Rating Cereals, Regression Analysis
-Build a multiple linear regression model in R based on three basic assumptions
-Determine linearity between transformed response and features (regressors) using partial regression plots
-Select variables using backward selection, the AIC criterion and Mallows’ Cp statistic
-Test for adequacy among candidate models, including basic assumptions and multicollinearity
-Identify the best model and note any potential drawbacks
Stock Data Set Analysis, Time Series Analysis
-Use EViews for time series analysis, including stationarity analysis, seasonality analysis, univariate modeling, cointegration analysis, error correction modeling, causality analysis and the K-S normality test
Additional
• Working knowledge of MS Office Suite, Scikit-learn and Pandas packages in IPython Notebook, R, EViews
• Basic knowledge of SAP, Data Warehouse, Excel Pivot, MATLAB, C++, C# (ASP.NET & ADO.NET)