Ziyi (Iggy) Zhao
Los Angeles, CA, USA 959-***-**** Email: ****.****.****@*****.***
LinkedIn: www.linkedin.com/in/iggyzhao GitHub: https://github.com/IggyZhao
SUMMARY
Master's student seeking Data Analyst / Business Intelligence opportunities, with project experience in machine learning, data visualization, and business intelligence. Strong knowledge of statistics and programming in SQL and Python.
EDUCATION
University of Connecticut May 2021 (Exp.)
• MSc in Business Analytics and Project Management (Top 5 student, Spring 2020 Scholarship) GPA 4.2/4.0
• Coursework: statistics in R (TA), predictive modeling (TA), SQL, business intelligence (RA), web analytics, Hadoop
Massachusetts Institute of Technology on edX Credentials
• MicroMasters in Statistics and Data Science
• Coursework: probability, statistics, data analysis in social science, machine learning with Python
University of Nottingham June 2019
• BSc in International Business Economics (First Class Honors, ranked 4th/334) GPA 3.8/4.0
• Coursework: advanced calculus, econometrics, quantitative methods, big data analytics, database design and implementation
• Awards: Provost’s scholarship, Dean’s scholarship, governmental scholarship, excellent graduate, outstanding student
SKILLS
Programming Python, R, SQL, SAS, Stata
Analytics Techniques
• Decision Tree, Random Forest, Classical & Penalized Regression Methods (Ridge, Lasso)
• Principal Component Analysis (PCA), Clustering (K-means), K-Nearest Neighbors
• Exploratory Data Analysis, Hypothesis Testing, A/B Testing, Text Mining, Data Visualization
Software Tableau (Desktop Specialist), Oracle, Spark, JMP Pro, SPSS Modeler, Advanced Excel, AWS, Analytical Solver
EXPERIENCE
Data Scientist Intern Global AI, NY, US June 2020 – Aug 2020
• Worked in a team of five to build web applications for stock statistics and to analyze COVID-19 impacts on stocks
• Obtained historical stock indices using a Python API and generated time series and analytical graphs using Plotly
• Deployed the web application on Heroku so clients could interact with it by specifying time range, tickers, and parameters
• Assisted the team leader in researching the impacts of COVID-19 on US stocks and presented insight reports to stakeholders
Data Analyst Intern Forkaia, CA, US Jan 2020 – Feb 2020
• Researched social media popularity using Natural Language Processing in SAS and Python to optimize marketing effects
• Preprocessed text by tokenization, stop-word removal, stemming, and lemmatization; extracted features with TF-IDF
• Trained text rule builder, decision tree, and logistic regression models; achieved the best performance with the decision tree (AUC = 0.887)
• Reported actionable strategies covering number of characters (135-145), hashtag usage (3-5), and dynamic, positive content
• Identified suitable posting times for workdays and weekdays; social media followers increased by 18.2% after 2 months
PROJECTS
Online Movie Platform Consumption Analytics and Recommendation Engine Implementation (Python, Spark)
• Identified factors influencing movie conversions with Python and built a movie recommendation engine with Spark
• Processed datasets with feature encoding and standard scaling; visualized data patterns with Seaborn and Matplotlib
• Implemented Lasso and Ridge linear regression and random forest (depth = 19, R square = 0.51) to find key factors
• Built an ETL pipeline and conducted OLAP; implemented an ALS model (RMSE = 0.74) to provide customized recommendations
• Suggested displaying 5 personalized recommended movies in the upper left corner of the user interface to raise conversion
Impacts of Customer Portraits on Churn Behavior and Analytics of Customer Retention (Python)
• Developed machine learning models in Python to predict bank customer churn from labeled data
• Conducted feature engineering and optimized logistic regression, random forest, and K-Nearest Neighbors models
• Reduced overfitting through regularization with tuned parameters; evaluated model performance via 5-fold cross-validation
• Optimized random forest (precision = 0.76) and calculated feature importance to identify main factors
• Made practical recommendations on product design and follow-up service to improve customer retention
E-Commerce Customer Purchasing Intention and Marketing Creatives Analytics (Python, JMP Pro)
• Led a team of 6 to predict customers' online shopping intention and provide insights to increase conversion rate
• Followed the SEMMA procedure: sampled data using Python, explored data patterns, modified data by missing value and outlier imputation, variable transformation, data binning, dimensionality reduction by logistic regression
• Compared 6 supervised machine learning models; reported results based on the decision tree (sensitivity = 80.84%, lift = 4.8)
• Proposed actionable marketing strategies regarding page value improvement, association rules, and price discrimination