Data Analyst

Location:
New York City, NY
Posted:
January 11, 2019

Resume:

Yujie Zhu

347-***-**** ■ ******@********.*** ■ www.linkedin.com/in/yujie-zhu-a91b34136

EDUCATION

Columbia University, Graduate School of Arts and Sciences New York, NY

M.A. in Mathematics of Finance, GPA 3.9/4.0, expected Feb 2019

• Coursework: Time-Series Modeling, Programming for Quant, Machine Learning, Numerical Methods in Finance

East China Normal University, Department of Mathematics Shanghai, China

B.S. in Mathematics and Applied Mathematics, Sept 2013 – Jun 2017, GPA 91/100, Major GPA 94/100, Ranking 3/113

• Coursework: Calculus, Algebra, ODE, PDE, Real Analysis, Complex Analysis, Probability, Statistics, Stochastic

EXPERIENCE

Gresham Investment Management ($7.2bn AUM) New York, NY

Quantitative Analyst & Data Analyst Intern Jun 2018-Dec 2018

• Data cleaning/validation: Collected raw data from Bloomberg, cleaned it and committed it to the database, generated back-adjusted futures price series, and checked for outliers, missing values, and zero values

• Automation/Code optimization: Built tools to automate tracking-error analysis, sped up some modules by more than 50% using multiprocessing and vectorization, and rewrote them in an object-oriented style

• Signal Research: Calculated the carry of commodity futures, researched the relation between carry and futures returns, smoothed and standardized the carry series, and developed a trading strategy based on it

• Portfolio Allocation: Applied tree clustering to the assets to explore their hierarchical structure, then used the Hierarchical Risk Parity method to allocate asset weights based on the clustering

Highfort Investment Management Shanghai, China

Quantitative Research Intern Apr 2018-May 2018

• Time Series: Resampled intraday data by different criteria (timestamp, volume, dollar) to obtain homogeneous series for further investigation, and applied different models (GARCH, EGARCH, EWMA) to describe the volatility of the time series

• Ensemble Machine Learning: Given a feature matrix of intraday data, applied Lasso, Ridge, Extra-Trees, XGBoost, LGBM and other regression models, combined with ensemble algorithms such as bagging and stacking, to improve the R-squared of the prediction

NiuStock Financial Company Shanghai, China

Quantitative Analyst Summer Intern Jun 2016-Aug 2016

• Factor analysis: Applied Mean Decrease Impurity, Mean Decrease Accuracy and Single Feature Importance, combined with PCA, to evaluate the importance of factors, and used a random forest algorithm to select stocks

PROJECTS

Prediction of Price of Airbnb Property (Citadel Datathon Challenge: rank top 20%) New York, NY

• Extracted and merged data from about 1 GB of raw data, visualized it to gain insights, and analyzed correlations between attributes

• Performed data cleaning and feature engineering, including feature scaling, feature selection, and creation of new features

• Fit several regression models to the data and evaluated them with cross-validation; the Lasso regression model performed best, with R-squared above 0.7

• Used bagging to improve the regression model and made predictions on the test set

Analysis of Trend-Following Strategy on Bitcoin and Crude Oil Futures Market New York, NY

• Applied the variance-ratio test, push-response test and other statistical tests to analyze the trend-following properties of 5-minute price series for Bitcoin and Crude Oil futures

• Used Full Grid Search and Coordinate Descent for parameter optimization of the strategy

• Implemented and backtested the trend-following strategy on both markets, then analyzed and compared the results

SELF-MOTIVATED LEARNING IN QUANTITATIVE FINANCE

• “Advances in Financial Machine Learning”: preprocessing of financial data (e.g. event-based sampling), feature selection (e.g. null importance test), and feature engineering (e.g. fractionally differentiated features)

• “Elements of Statistical Learning”: bias/variance of a model, ensemble ML algorithms such as random forests and gradient-boosted trees, and further reading on XGBoost and LightGBM, which have recently become popular

COMPUTER SKILLS/OTHER

Programming Languages: Python, C/C++, MATLAB, R, SQL

Languages: Mandarin, English (TOEFL 110/120, GRE V161/170, Q170/170)


