Kate Zhang
Cactus Bloom, Irvine, CA ***** • 949-***-**** • ***********@*****.*** • Green Card Holder
OBJECTIVE
• Seeking a full-time Data Analyst position starting after April.
EDUCATION
• University of California, Irvine, CA M.S., Statistics GPA: 3.89 Mar 2018
• Shanghai Normal University, Shanghai, China B.S., Education GPA: 3.90 July 2003
QUALIFICATION
• Proficient in relational database management and in data preparation, cleaning, and conversion.
• Excellent at data visualization and analysis: extracting meaningful insights, transforming them into digestible information, and presenting results in a “story-telling” way.
• Skilled in modeling and prediction using machine learning algorithms such as K-Nearest Neighbor, K-Means, Linear Regression/Classifier, Naïve Bayes Classifier, Support Vector Machine, Decision Tree, Neural Network, Ensemble (Random Forest, AdaBoost, Gradient Boost), and more.
PROJECTS
Association between Serum Beta-Carotene and Dose Level (46 observations, each with 15 measurements)
• Scrutinized the data, computed summary statistics, and identified outliers, typos, and errors in the dataset; corrected, imputed, or set values to NA depending on the type of error
• Visualized the data and identified trends; based on prior knowledge and reference papers, averaged months 0-3 as the baseline (all observations were on placebo during that period) and added one knot at month 12 (a turning point)
• Translated the client’s problems into data-driven problems, built a statistical model, and conducted the analysis
• Presented analytical results in a useful and meaningful way to the client
Rainfall Prediction Using Machine Learning Algorithm (14 features and 400,000 observations)
• Conducted data exploration, examined the distribution of target value and the features to identify patterns
• Trained several classifiers to gain initial insights and found that a single decision tree performed well
• Enhanced performance by using ensembles based on decision tree (Random Forests, AdaBoost DT)
• Further improved performance through feature engineering: used feature importance to pick the top 3 features and created new features using squared, log, multiplication, and division transformations; applied Extreme Gradient Boosting to model and predict on the new data set
• Combined three ensemble classifiers using different weights based on individual performance
SKILLS
Computer
• R • Python • SQL • Tableau • Access • Excel • Word • PowerPoint • Photoshop
Others
• Language: Chinese (fluent in both writing and speaking) • Real Estate Agent License