
Data Analyst Manager

Irvine, California, United States
April 17, 2018



Kate Zhang

Cactus Bloom, Irvine, CA *****949-***-**** • Green Card Holder

OBJECTIVE

• Seeking a full-time Data Analyst position starting after April.

EDUCATION

• University of California, Irvine, CA; M.S., Statistics; GPA: 3.89; Mar 2018

• Shanghai Normal University, Shanghai, China; B.S., Education; GPA: 3.90; July 2003

QUALIFICATION

• Proficient in relational database management and in data preparation, cleaning, and conversion.

• Excellent at data visualization and analysis: extracting meaningful insights, turning them into digestible information, and presenting results in a “storytelling” way.

• Skilled in modeling and prediction using machine learning algorithms such as K-Nearest Neighbors, K-Means, Linear Regression/Classifier, Naïve Bayes Classifier, Support Vector Machine, Decision Tree, Neural Network, Ensembles (Random Forest, AdaBoost, Gradient Boosting), and more.

PROJECTS

Association between Serum Beta-Carotene and Dose Level (46 observations, each with 15 measurements)

• Scrutinized the data, computed summary statistics, and identified outliers, typos, and errors in the dataset; corrected, imputed, or set values to NA depending on the type of error

• Visualized the data to identify trends; based on prior knowledge and reference papers, averaged months 0-3 as the baseline (all observations received placebo during that period) and added one knot at month 12 (a turning point)

• Translated the client’s problems into data-driven problems, built a statistical model, and conducted the analysis

• Presented analytical results in a useful and meaningful way to my client

Rainfall Prediction Using Machine Learning Algorithm (14 features and 400,000 observations)

• Conducted data exploration, examining the distributions of the target value and the features to identify patterns

• Trained several classifiers to gain initial insights and found that a single decision tree performed well

• Enhanced performance by using decision-tree-based ensembles (Random Forests, AdaBoost DT)

• Further improved performance through feature engineering: used feature importance to pick the top 3 features and created new features using squared, log, multiplication, and division transformations; applied Extreme Gradient Boosting to model and predict on the new data set

• Combined three ensemble classifiers using different weights based on individual performance

SKILLS


• R • Python • SQL • Tableau • Access • Excel • Word • PowerPoint • Photoshop

OTHERS

• Language: Chinese (fluent in both writing and speaking) • Real Estate Agent License
