Data Analyst

Location:
Los Angeles, CA
Posted:
February 19, 2021

Resume:

SHENGWANG (ARTHUR) ZHANG

Los Angeles, CA 805-***-**** ********@***.***

EDUCATION

University of Southern California (USC) Expected Graduation: 05/2022

Master of Science in Applied Data Science

University of California, Santa Barbara (UCSB) 09/2016 – 03/2020

Bachelor of Science in Statistics and Data Science

SKILLS & COURSEWORK

Relevant Coursework: Data Mining; Machine Learning; Foundations of Data Management; Data Science at Scale; Regression Analysis; SAS Base Program; Data Science Computing

Data Analysis Skills: SQL, Python, R, Spark, Hadoop, Pandas, Tableau, Machine Learning, Database Manipulation

WORK EXPERIENCE

Quality Assurance Data Analyst Intern 09/2020 – 11/2020

Broadstreet COVID-19 Data Project Los Angeles, CA

Collaborated with the data entry team to accurately record daily COVID-19 case counts in Google Sheets

Corrected over 400 data points that showed implausible downward trends in confirmed cases to ensure the integrity of the dataset

Built linear regression models to predict future case counts and compared the predictions with real-world datasets

Data Analyst Intern 07/2019 - 09/2019

Shaanxi Help You Electronic Technology Co. Ltd Xi’an, China

Implemented Hadoop and Spark clusters on Docker Compose, reducing data computing time by 13%

Analyzed DAU, Engagement, and Elevator Running Time metrics to gain actionable insights for the ads campaign

Collaborated with the marketing team and presented data analysis conclusions, increasing the acquisition rate by 20%

Operation Specialist 04/2020 – 06/2020

Ezeeship Los Angeles, CA

Acquired 10 registered customers and increased sales by 20% in two weeks, raising company awareness

Launched 3 demos introducing new system features to existing customers, increasing the conversion rate by 10%

Developed an email campaign promoting the service with Mailmeteor, gaining 20 new customers

PROJECTS

Big Data Analysis of YouTube Videos – Data Management 09/2020 - 11/2020

Managed datasets from various databases on an AWS instance to evaluate the most popular videos under diverse attributes

Preprocessed and cleaned data on over 60,000 videos and channels with PySpark, handling missing values and standardizing variables, reducing data processing time by 20%

Aggregated datasets stored in MySQL and Firebase databases into an integrated YouTube dataset for the user interface

Binary Prediction of NBA Player’s Shot Outcome – Machine Learning 09/2020 - 11/2020

Built Logistic Regression, Decision Tree, and XGBoost binary models to predict Steph Curry’s shot outcome

Used k-fold cross-validation to compare the mean accuracy of each model, then selected the Logistic Regression model, which had the highest accuracy at 68%

Applied the final model on Spark to players in all positions to verify how broadly the model applies

Analysis of International Airline Passengers – Time Series 01/2019 - 03/2019

Developed a time series model predicting that international passenger numbers will grow about 20% over the next six seasons

Checked the normality of the final model with a histogram, a Q-Q plot, and the Shapiro-Wilk test

Performed the Ljung-Box test to check independence and drew ACF and PACF plots to assess constant variance

2016 Election Analysis – Machine Learning 10/2019 - 12/2019

Created county and state-level data visualizations to gain insights into party preferences of different states and counties

Applied Principal Component Analysis to reduce the data dimensionality and concluded that poverty and income per capita are the most crucial features contributing to the election result

Implemented the hierarchical clustering algorithm to determine the ideal number of clusters for grouping counties


