Xiao (Shawn) Guo
614-***-**** ****@****.***
EDUCATION
Bowling Green State University Bowling Green, Ohio
M.S. in Applied Statistics and Operational Research (Comprehensive Scholarship) 08/2015-05/2017
Research Assistant: Extensively used Logistic Regression and Random Forest, and created Machine Learning models
The Ohio State University Columbus, Ohio
B.A. in Economics (Vice Director of Entrepreneurship Association; Cofounder of ohioabc forum) 08/2012-12/2014
SKILLS & CERTIFICATION
• Programming Skills & Certification: R (plyr, dplyr, ggplot2), Python (Pandas, SciPy, NumPy, Scikit-learn), Advanced Excel (VBA/Pivot Tables/VLOOKUP), MS SQL Server and Access, Tableau, Bloomberg, SAS Advanced Certification
• Core Skills: Statistical Learning, Machine Learning, Data Mining, Model Building, Supervised and Unsupervised Learning, Statistical Inference, Linear Regression, Linear and Integer Programming (Operational Research), Linear Algebra, Data Visualization and Database Design, Probability Theory, Experimental Design, Econometrics
• Languages: English (Fluent), Chinese Mandarin (Native)
WORKING EXPERIENCE
GtechFin Inc. (Python) New York City, NY
Quantitative Strategic Analyst intern 10/2017-Present
• Created market and fund research papers and presentations using sophisticated quantitative analytics; collected and extracted web data for further analysis and modeling preparation
• Contributed to proprietary production code that uses and develops highly quantitative techniques to support and augment discretionary portfolio managers' investment process
• Built financial models and performed quantitative analysis of internal trading strategies and portfolios to track market trends; refined strategic and tactical asset allocation models to satisfy specific technical indicators
• Reviewed the strategy, optimized its parameters, and concluded with a stress test based on historical data simulation
Tellon Trading Inc. Atlanta, GA
Data Analyst intern 07/2017-10/2017
• Evaluated large volumes of information extracted from multiple sources and decomposed high-level information into details for analysis
• Revised back-end business processes used to load distributors' and partners' data into production databases; conducted data analysis to create detailed reports, such as monthly reconciliations and sales summaries, for internal analytics
• Handled ad hoc requests for clients and wrote custom SQL queries to extract requested information and ensure data integrity
• Developed statistical analyses and provided clear technical reports on company performance for supervisors and clients
ACADEMIC PROJECTS
Predictive Fundraising Analysis for Donor Identification (R & Python)
• Identified donors and donation amounts using a fundraising organization dataset (145 features and 100K+ records), achieving a 40% improvement in donor identification over a random baseline
• Imputed missing values of essential features using regression trees, and combined over-sampling via the Synthetic Minority Over-sampling Technique (SMOTE) with under-sampling to address class imbalance (less than 5% positive donations)
• Built Lasso/Ridge Regression, Logistic Regression, Random Forest, and Naïve Bayes models and compared them by cross-validation; the final model achieved a 48% improvement on the test dataset
Machine Learning for Stock Price Variation Analysis (Python & SAS Miner)
• Cleaned and imputed raw data for modeling preparation, and applied Exploratory Data Analysis to guide feature transformation
• Developed multiple models, such as Random Forest, Logistic Regression, and Gradient Boosting Trees, to predict stock volatility over a period of two years
• Tuned parameters through Bayesian Optimization and improved model accuracy from 85% to 91% with Ensemble Learning
Database Design and Optimization for “Current Event Game” (MS SQL Server & Access)
• Designed a database schema containing 20+ tables, constructed an Entity-Relationship diagram with various patterns and relationships, and established the connection between the offline interface system and Discuz using SQL Server
• Transformed the ER model into a database design, created and optimized advanced SQL queries to simulate the whole process and ensure database integrity, and presented the benefits of the new database design