Customer Data

Location:
San Jose, CA
Posted:
January 25, 2021

Resume:

Jun (Angela) Zhou

626-***-**** ***********@*****.*** San Jose, CA

LinkedIn: https://www.linkedin.com/in/jun-zhou-958961112/

HIGHLIGHTS

● 3 months as a DS intern at PayPal and 2 years of experience as an Investment Analyst

● Industry work experience in big data, business intelligence, data visualization and machine learning

● Strong background in statistics and data analytics, with solid programming skills in Python and SQL

● Excellent communication skills, with demonstrated experience presenting to non-technical audiences

EDUCATION

San Jose State University, CA, USA Aug 2019-Present

Master of Science in Data Analytics

University College London, UK Sep 2012-Sep 2013

Master of Science in Epidemiology

Sun Yat-sen University, China Sep 2007-Jun 2012

Bachelor of Medicine in Preventive Medicine

WORK EXPERIENCE

PayPal San Jose, CA

Data Engineer Intern (worked as Data Analyst) May 2020-Aug 2020

● Worked on the Risk Data team, conducting risk data analysis to detect fraud and prevent loss

● Analyzed and summarized risky patterns in the send-money flow and explored indicators of fraudulent transactions

● Built anti-fraud models using feature engineering and risk pattern analysis in Teradata (SQL)

● Presented the models to VP-level leadership; full-time staff took them over for implementation

Eagle Holdings Beijing, China

Investment Analyst Dec 2015-Dec 2017

● Analyzed investment trends and the global market environment

● Reported and visualized industry surveys and benchmarks, as well as economic and demographic trends, in Tableau

● Conducted market research (industry research, financial forecasting and enterprise valuation) to support the selection and evaluation of healthcare startups as potential investments

PROJECTS

Bank Customer Churn Prediction and Analysis

● Explored the dataset by analyzing the key factors based on labeled data and checking feature correlations in Python

● Preprocessed the dataset through data cleaning, categorical feature transformation, and standardization

● Built supervised machine learning models including Logistic Regression, Random Forest and K-Nearest Neighbors, and tuned each model's hyperparameters, including regularization

● Used 5-fold cross-validation to evaluate classification performance and analyzed feature importance to identify the top factors affecting results

Customer Reviews Analysis and Topic Modeling

● Developed a review-text preprocessing pipeline (tokenization, stemming, stop-word removal) and extracted features using Term Frequency-Inverse Document Frequency (TF-IDF)

● Trained unsupervised machine learning models (K-means clustering, Latent Dirichlet Allocation) to cluster customer reviews

● Identified latent topics and the keywords of each topic

DoorDash Platform Analysis in MySQL

● Analyzed DoorDash work process data in MySQL to improve customer satisfaction and business performance

● Created an ER diagram of the DoorDash order process, then generated and adjusted mock data with Mockaroo

● Conducted data analysis of customer satisfaction, restaurant popularity, dashers’ income and platform revenue in MySQL Workbench

SKILLS

● Programming: Python (sklearn, pandas, numpy), SQL, SPSS, STATA

● Machine Learning: Classical & Penalized Regression Methods (Lasso, Ridge), Decision Tree, Random Forest, Regularization, Clustering, K Nearest Neighbors, K-means, Principal Component Analysis (PCA)

● Statistical Analysis: Hypothesis Testing, A/B Testing, Text Mining, Time Analysis

● Tools: Tableau, Hadoop, Simba Teradata, MySQL Workbench, Oracle Live SQL, Pig, Hive, Spark, ArcGIS


