Data Analyst

Irvine, CA
March 31, 2019


Zihan Zhao

Irvine, CA ***** +1-949-***-****


University of California, Irvine Expected June 2019

Master of Science, Statistics

Coursework: Machine Learning, Linear Models, Categorical Data Analysis, Longitudinal Data Analysis, Bayesian Data Analysis, Time Series Analysis, Inference with Missing Data, Survey Sampling

University of California, Irvine June 2017

Bachelor of Arts, Major in Economics, Minor in Statistics

Coursework: Calculus, Linear Algebra, Probability and Statistics, Applied Econometrics, Mathematics of Finance, Managerial Economics, Introduction to Business and Management

WORK EXPERIENCE

Weichai Power Co., Ltd Weifang, China

Data Analyst Intern Aug. 2018 - Sept. 2018

• Cleaned and standardized feature data on faulty diesel engine parts with Tableau Prep

• Created Tableau dashboards to uncover relationships among diesel engine performance parameters and presented the insights needed for model building to the manager

• Coordinated improvements to data issues across multiple departments

• Created presentations introducing the main concepts of selected Gartner articles

PROJECTS (Selected)

Emotion Detection with CNN (Python, Teamwork) Nov. 2018 - Dec. 2018

• Preprocessed the raw images by resizing them and normalizing pixel intensity values

• Built a convolutional neural network with three sets of convolution and max-pooling layers to detect eight types of emotion, using TensorFlow in Google Colab

• Trained the final model on the training and validation sets with the Adam optimizer (learning rate 0.005, batch size 64), reaching about 78% accuracy on the test set

Time Series Analysis for PM 2.5 in Beijing (RStudio) Dec. 2018

• Visualized the data with ggplot2 in R to determine the appropriate type of time series model

• Fitted an ARIMA model to PM 2.5 concentration and forecast future points in the series

• Fitted a VAR model to explore the relationship between PM 2.5 concentration and meteorological factors

• Conducted model diagnostics to assess the validity of the models

Retest Effects Analysis on Diagnosis Groups of Alzheimer’s Disease (RStudio) June 2018

• Performed exploratory analysis of the data, including summary statistics, plots, and accompanying interpretation

• Fitted linear mixed-effects models and used likelihood ratio tests to quantify the retest effect in three neuropsychological tests for Alzheimer’s disease (longitudinal data analysis)

• Drew conclusions from hypothesis tests and wrote a formal data analysis report

Bayesian Analysis on Boston Housing Data (RStudio) Mar. 2018

• Built a Bayesian linear regression model to predict the median value of owner-occupied homes

• Performed predictor selection using DIC, BIC and LPML

• Presented posterior inferences for regression parameters and for subgroup means

• Used autocorrelation functions and the Gelman-Rubin statistic to diagnose MCMC convergence

SKILLS

Programming Languages: SQL, Python, R (dplyr, ggplot2, r2jags, rmarkdown, geepack, nlme, forecast, etc.)

Tools: Tableau, LaTeX, Microsoft Office (Word, PowerPoint, Excel)
