Jun (Angela) Zhou
626-***-**** ***********@*****.*** San Jose, CA
LinkedIn: https://www.linkedin.com/in/jun-zhou-958961112/
HIGHLIGHTS
● 3 months as DS intern at PayPal and 2 years of experience as Investment Analyst
● Industry work experience in big data, business intelligence, data visualization and machine learning
● Strong background in statistics and data analytics, with solid programming skills in Python and SQL
● Excellent communication skills, with demonstrated ability to present to non-technical audiences
EDUCATION
San Jose State University, CA, USA Aug 2019-Present
Master of Science in Data Analytics
University College London, UK Sep 2012-Sep 2013
Master of Science in Epidemiology
Sun Yat-sen University, China Sep 2007-Jun 2012
Bachelor of Medicine in Preventive Medicine
WORK EXPERIENCE
PayPal San Jose, CA
Data Engineer Intern (worked as Data Analyst) May 2020-Aug 2020
● Worked in the Risk Data team, conducting risk data analysis to detect fraud and prevent loss
● Analyzed and summarized risky patterns in the send-money flow and explored indicators of fraudulent transactions
● Built anti-fraud models with feature engineering and risk pattern analysis using Teradata (SQL)
● Presented the models at VP level; they were taken over by full-time staff for implementation
Eagle Holdings Beijing, China
Investment Analyst Dec 2015-Dec 2017
● Conducted data analysis on investment trends and global market environments
● Reported and visualized industry surveys and benchmarks, as well as economic and demographic trends, in Tableau
● Conducted market research (industry research, financial forecasting and enterprise valuation) to support the selection and evaluation of potential healthcare startup investments
PROJECTS
Bank Customer Churn Prediction and Analysis
● Explored the dataset by analyzing the key factors based on labeled data and checking feature correlations in Python
● Preprocessed dataset through data cleaning, categorical feature transformation, and standardization
● Built supervised machine learning models including Logistic Regression, Random Forest and K-Nearest Neighbors, and tuned hyperparameters (including regularization strength) for each model
● Used 5-fold cross-validation to evaluate classification performance and analyzed feature importance to identify the top factors affecting results
Customer Reviews Analysis and Topic Modeling
● Developed a review-text pre-processing pipeline (tokenization, stemming, stop-word removal) and extracted features with Term Frequency-Inverse Document Frequency (TF-IDF)
● Trained unsupervised machine learning models (K-means clustering, Latent Dirichlet Allocation) to cluster customer reviews
● Identified latent topics and the keywords of each topic
DoorDash Platform Analysis in MySQL
● Analyzed DoorDash work process data in MySQL to improve customer satisfaction and business performance
● Created an ER diagram of the DoorDash order procedure, then generated and adjusted mock data with Mockaroo
● Conducted data analysis of customer satisfaction, restaurant popularity, dashers’ income and platform revenue in MySQL Workbench
SKILLS
● Programming: Python (sklearn, pandas, numpy), SQL, SPSS, STATA
● Machine Learning: Classical & Penalized Regression Methods (Lasso, Ridge), Decision Tree, Random Forest, Regularization, Clustering, K Nearest Neighbors, K-means, Principal Component Analysis (PCA)
● Statistical Analysis: Hypothesis Testing, A/B Testing, Text Mining, Time Analysis
● Tools: Tableau, Hadoop, Simba Teradata, MySQL Workbench, Oracle Live SQL, Pig, Hive, Spark, ArcGIS