YUAN (FLORENCE) GAO
San Jose, CA
**********@*****.***
TECHNICAL SKILLS
● Programming Languages and Tools: Python, SQL, Shell, Git, HTML, CSS, JavaScript
● Data Analysis and Reporting: Python (numpy, pandas, matplotlib, scikit-learn), R, Pentaho, Tableau, Excel
● Databases: SQL & NoSQL, MySQL, PostgreSQL, BigQuery, Teradata, MongoDB, Hadoop, Hive
EDUCATION
● University of Colorado Denver – Business Intelligence, M.S. 12/2017
● Southwestern University of Finance and Economics, China – Accounting, B.A. 05/2006
PROFESSIONAL EXPERIENCE
WeRide Corp (an autonomous-driving car startup), San Jose, CA Data Analyst 08/2017 – present
● Performed data analysis tasks including data manipulation, quantitative analysis, statistical modeling, data visualization, and reporting. Partnered with the engineering and product teams to deliver actionable insights.
● Built a data ETL pipeline using Postgres and Python to automatically process raw sensor data and load it into structured databases. The automated process resulted in significant time and cost savings.
● Extracted daily collected labeled data from a NoSQL database and preprocessed it in Python to ensure data quality, including handling missing values, duplicates, and inaccurate information, and detecting outliers.
● Wrote MySQL and HiveQL queries against large datasets and analyzed the results to identify problems and required actions, which helped the engineering team improve the efficiency of its algorithm development.
● Identified the metrics needed to measure the performance of our product; performed statistical analysis in Python to identify the root causes behind trends and abnormal scenarios.
● Built, modified, and maintained dashboards using Tableau, Excel, SQL, and Jupyter Notebooks, and regularly delivered ad-hoc analysis reports to unlock opportunities for growth.
PROJECT EXPERIENCE
Exploratory Data Analysis on UC Denver student portal 06/2017
● A/B test: Performed an A/B test on the UC Denver student portal. Modeled user response data for different web interface designs of the course registration system, and performed hypothesis testing and user satisfaction prediction.
● Data analysis: Applied exploratory data analysis to summarize and visualize the critical characteristics and trends of the data, using numpy, pandas, matplotlib, seaborn, and Jupyter Notebooks.
Predicting Human/Robot Bidders on an Online Auction Site 03/2019
● Analyzed the Kaggle Human/Robot online bidding data; used Python to perform data wrangling and preprocessing to improve data quality, and applied statistical analysis to visualize the important characteristics of the data using numpy, pandas, matplotlib, and pyTable.
● Built and trained a predictive model in scikit-learn using logistic regression to classify human and robot bidders on an online auction site, and improved the ROC AUC score to 90% through feature engineering and parameter tuning.