
Data Assistant

Location:
Houston, TX
Posted:
September 29, 2014


Resume:

Yao Yu

Contact

*** **** **** **

Simi Valley, CA 93065

951-***-****

*********@*****.***

LinkedIn: yaoyu1

SUMMARY

Solid statistical and machine learning knowledge in data analysis; eight years of hands-on experience in data cleaning, modeling, and mining; proficiency in programming in Java, Python, R, C++, and SQL; excellent interpersonal and communication skills.

EDUCATION

Jul 2012 Ph.D. in Applied Statistics, University of California, Riverside

Thesis: Bayesian and Non-parametric Approaches to Missing Data Analysis

GPA: 3.93/4.0

Jul 2007 Bachelor in Statistics, University of Science and Technology of China

Programming

Expert: R, SQL, Java, SAS

Intermediate: Python, C++, Matlab

Beginner: Scala

Coursework

Machine Learning, Algorithms and Data Structures, Data Science, Theoretical Statistics and Probability, Time Series, Statistical Computing, Advanced Design and Analysis of Experiments, Multivariate Analysis, Statistical Consulting and Data Analysis, Bayesian Statistics, Nonparametric Methods

EXPERIENCE

2012–Present Amgen Inc. Thousand Oaks, California

Biostatistics Manager

Highly self-motivated professional and team player, working on multiple critical tasks across drug development. Main responsibilities involve:

• Providing sound, strategic statistical input to optimize study design and meet scientific/business justifications

• Serving as subject-matter expert in data imputation and Bayesian statistical techniques, providing thought leadership and consulting expertise to other statisticians in the team

• Gathering business insights, collaborating and conducting studies with experts across multiple functions to meet aggressive timelines

• Presenting and articulating complex statistical concepts to audiences at various levels, from executive management to programmers

• Actively contributing to innovative exploratory statistical solutions:

– Investigating potential drug effect on blood pressure fluctuation based on patients' measurements taken every 15 minutes over a long period, with adjustment for other effects/interactions

– Researching the correlation between weight and drug concentration by performing a meta-analysis on pooled data from multiple data sources

– Implementing Bayesian analysis to scale the historical data, thereby downsizing the new study at the design step

– Developing a sequential design procedure to evaluate the equivalence of two drugs with integrated composite scores

2009–2012 University of California, Riverside Riverside, California

Research Assistant

Focused on two approaches to missing data analysis: a non-parametric method without any distribution assumption, and Bayesian methods

• Developed an extended Fisher discriminant (with the test threshold determined by bootstrapping) to classify missing types

• Improved the computational efficiency of the MCMC algorithm for predictive models of multilayered missing data in a large survey data set under different scenarios

• Optimized the model selection by performing model checking and goodness-of-fit tests

2009 Amgen Inc. Thousand Oaks, California

Intern

Conducted a literature review and implemented a Monte Carlo simulation procedure for data imputation and analysis

• Performed missing-data pattern recognition and developed visual tools to illustrate the impact

• Compared the performance of single value imputation, mixed-effects models, weighted estimating equations, and Bayesian approaches in a simulation study under different assumptions

2008–2009 University of California, Riverside Riverside, California

Student Consultant

• Identified covariates that contribute to the differences in cervicovaginal cytokine concentrations between pregnant and non-pregnant women using robust principal component analysis (data from Department of Plant Pathology & Microbiology, UC Riverside)

• Cleaned the data and evaluated the performance of Naive Bayes, CART, and random forest on classifying estimated value ranges of pre-owned cars (data from KBB)

2007–2012 University of California, Riverside Riverside, California

Teaching Assistant

Assisted with teaching; duties included leading discussions, clarifying related concepts, and guiding statistical analysis with software (Minitab, Excel, and SAS) in graduate-level courses.

LEADERSHIP

2013 Medical Science Biostatistics department at Amgen Inc. Thousand Oaks, California

Team Lead

Led the team to improve operational efficiency in knowledge sharing

• Organized and facilitated the weekly meeting and allocated the workload

• Performed division of labor, specified the aspects that needed improvement, brainstormed possible solutions, and analyzed their impact and feasibility

• Presented the proposal to executive management and initiated the process to optimize work efficiency

PUBLICATIONS

Jun Li, Yao Yu, A Nonparametric Test of Missing Completely at Random for Incomplete Multivariate Data, Psychometrika, 2014. doi: 10.1007/s11336-014-9410-4


