
Data Analyst

New York City, NY
November 23, 2018


Resume: 646-***-**** *** W ** CYRUS SHA


COLUMBIA UNIVERSITY IN THE CITY OF NEW YORK New York, NY

Master of Arts in Statistics, Department of Statistics, GPA: 3.33/4.00 Expected 12/2018

Courses: Advanced Machine Learning, Applied Data Analysis, Business Analysis, Probability, Time Series Analysis, Statistical Computing and Intro to Data Science Models, Stochastic Process in Application


Bachelor of Science, School of Mathematics, GPA: 3.33/4.00 08/2013-06/2017

Courses: Data Mining, Mathematical Statistics, Optimization, Numerical Analysis, Mathematical Experiments and Mathematical Software, Data Structure and Algorithms


Programming: R (caret/glmnet/dplyr/ggplot2/rvest), SQL, Tableau, Python (NumPy/pandas/TensorFlow/SciPy/Matplotlib/scikit-learn), Excel, VBA

Statistics and Machine Learning: Regression: Linear/Logistic/Lasso/Ridge; Classification: Decision Tree/SVM/Bagging/Random Forest/K-Nearest Neighbors; K-means/Hierarchical Clustering; Boosting; Principal Component Analysis (PCA)/Linear Discriminant Analysis (LDA); Neural Network; A/B Test


AXA Advisors, LLC New York, NY

Data Analyst Intern 08/2018-present

Transformed and cleaned unstructured candidate data into a well-structured format using SQL and Excel; built a logistic regression model to classify and analyze incoming datasets, which colleagues widely adopted

Predicted the pricing complexity of incoming accounts using an Extreme Gradient Boosting (XGB) classifier in R to facilitate the account assignment process and improve efficiency (packages: xgboost, caret, caTools)

Designed and developed an internal processing bot with LinkedIn Messaging capability to improve HR efficiency in identifying and communicating with potential sales applicants, optimizing workflows by 60%


Peltast Partners Chicago, IL

Data Analyst Intern 06/2018-08/2018

Used SQL to pull large primary and secondary data sets; cleaned the data and validated its integrity using R

Built a stepwise Lasso regression model to predict the Philippines’ regional economic development and trends for the next 3 years; created geographical heat maps and interactive dashboards in Tableau to visualize the economic growth of regional developments in the Philippines

Collected and cleaned U.S. steel import data from the Census Bureau; built models for classification and prediction; identified and analyzed key patterns for clients

Guangdong Bond Capital Management, Ltd Guangzhou, China

Quantitative Analyst Intern 02/2016-04/2016

Transformed and loaded daily transaction data from Wind Financial Terminal into SQL Server and calculated stock return fluctuation, daily variation, turnover rate, etc. using R (packages: RODBC, WindR, dplyr, quantmod)

Reported on the practical use of RSI (Relative Strength Index) in real trading of stock index futures, established financial models, and provided recommendations to the leadership team

Transformed candlestick charts to equal-volume intervals for clearer tracking of price movements; reimplemented in R the position-holding cost function provided by Tongdaxin Financial Terminal


Stock Market Index Prediction (Introduction to the Maths of Finance, Columbia University) 04/2018-05/2018

Applied an SVM with an RBF kernel and optimal C and sigma to historical S&P 500 data, achieving a maximum accuracy of 74% in the initial model for predicting index prices

Refined prediction to achieve 93-97% accuracy and 0.37 volatility using decision tree algorithms

Constructed a portfolio based on the decision tree model, achieving a 1.197% average daily return and an 18% maximum cumulative return vs. the S&P 500's highest return of 12% during the holding period

FIFA Ballon d’Or Winner Prediction (Data Mining, SYSU) 04/2016-05/2016

Performed principal component analysis (PCA) on the collected data to reduce dimensionality; selected four principal variables for prediction and further analysis

Used K-Nearest Neighbors (KNN) to verify and adjust the above prediction; further analyzed the data by factor analysis; successfully predicted the top 5 ranking of the 2015 FIFA Ballon d’Or

Global COMAP Project (Mathematical Contest in Modeling, SYSU) 01/2016-02/2016

Led the team to a top-40th-percentile result in the global competition, earning an Honorable Mention

Built an ROI model for US non-profit organizations to evaluate firms’ investments in college students, using a multivariate linear model to filter and optimize various indicators; cleaned the data and validated its integrity
