YANG YU
850-***-**** *********@*****.*** **** E 3rd Ave, #2011, San Mateo, CA 94401
Objective
A highly motivated Ph.D. candidate with a solid background in statistics and mathematics and extensive experience in statistical analysis and machine learning, currently seeking a position in data science
Education
Florida State University Tallahassee, FL
Ph.D. in Applied Statistics Sep 2015 – Expected Apr 2019
Florida State University Tallahassee, FL
M.S. in Applied Statistics Sep 2013 – May 2015
Xiamen University Xiamen, China
B.S. in Applied Mathematics Sep 2009 – May 2013
Skills
Software: R, SAS, SQL, Python (NumPy, pandas), Microsoft Excel (pivot tables, VLOOKUP), Tableau
Highlights: GLM, Machine Learning
Work Experience
Florida State University Department of Statistics Tallahassee, FL
Teaching Assistant Sep 2016 – May 2018
Gave lectures on Introduction to Applied Statistics, hosted recitation classes, and graded assignments
Florida State University Statistical Consulting Center Tallahassee, FL
Consultant Sep 2015 – May 2016
Performed statistical analyses for clients, including but not limited to descriptive statistics, ANOVA, t-tests, regression modeling, experimental design, time series modeling, and variable selection
Generated and cleaned messy datasets in Excel or MySQL
Identified and resolved the inconsistencies in research projects from diverse fields
Recommended statistical techniques and interpreted the results in non-technical ways
Published the 2015 – 2016 FSU Statistical Consulting Center Annual Report
Research Experience
Florida State University Tallahassee, FL
Analysis of Daily Ozone Concentration Sep 2016 – Present
Implemented universal kriging in R to obtain predicted covariate values at locations of interest
Built a location- and covariate-dependent covariance function and optimized the model using the log-likelihood function
Increased the accuracy rate by 8.36% and sped up the process by a factor of 4 compared to the normal Bayesian method
Project Experience
Florida State University Tallahassee, FL
Breast Cancer Analysis Mar 2016 – Apr 2016
Applied Lasso and Ridge regression to a breast cancer dataset (24,481 predictors, 291 observations)
Trained a random forest with 300 trees and visualized the process in MATLAB
All methods achieved an accuracy rate above 90%, while Lasso took the least time
Machine Learning Algorithms Realization in Weka Sep 2015 – Oct 2015
Applied decision tree, random forest, logistic regression, neural network, k-NN, Naive Bayes, and SVM to the Landsat satellite data set in Weka and compared test errors vs. log training times
Auto Insurance Analysis Mar 2015 – Apr 2015
Classified the insurers in the training set into six risk levels using a decision tree
Assigned the insurers to their corresponding groups and calculated the claim probability using a neural network
Adjusted the premium with respect to the average claim probability
Validated the model on the test set; differentiated premiums between the claimed and unclaimed groups by 62%
Titanic Survival Analysis Nov 2014 – Dec 2014
Processed missing values, conducted independence tests, and developed a logistic regression model on the training set
Validated the logistic equation, generated the confusion matrix, and achieved a 97% accuracy rate