
Data Scientist

Location: Los Angeles, CA
Posted: July 24, 2014


Resume:

Huihui Duan

d*************@*****.*** | Los Angeles, CA 91406 | 608-556-3368 | LinkedIn: http://www.linkedin.com/in/huihuiduan

EDUCATION AND TRAINING

BIG DATA AND HADOOP ONLINE TRAINING   JULY, 2014 – PRESENT

JOHNS HOPKINS UNIVERSITY AT COURSERA   APR, 2014 – JULY, 2014
Specialization Certificate, Data Science (Data Scientist)

MOOC OF COURSERA/EDX/UDACITY   JAN, 2013 – PRESENT
Free Online Courses about Computer Science and Statistics

UNIVERSITY OF WISCONSIN – MADISON   MAY, 2011 – AUGUST, 2012
Master's Degree in Statistics
Master Project 1: The relationship between cognitive task performance and vascular health in young adults
• Studied the relationship between cognitive task performance and vascular health in young adults;
• Detected gender differences in demographic, vascular and cognitive measures.
Master Project 2: Nonlinear model of gray wolf growth in Wisconsin
• Analyzed how the radio-telemetry error was affected by state, pilot, month and year;
• Identified and compared causes of wolf mortality changes;
• Fitted nonlinear growth models, estimated the parameters and compared different models.

UNIVERSITY OF WISCONSIN – MADISON   SEP, 2009 – DEC, 2012
Master's Degree in Quantitative Genetics
Master Thesis: Whole Genome Prediction Within and Across Environments - An Application to Wheat Yield
• Compared the accuracy of whole genome prediction of wheat yield within and across environments;
• Solved the advanced computing problems of Bayesian regression methods in 100 replications of random sub-sampling cross-validation in a genomic selection context.

PROFESSIONAL EXPERIENCE

FARMERS INSURANCE GROUP • LOS ANGELES, CA   JULY, 2013 – PRESENT
Senior Commercial Product Analyst
• Worked with the R&D and Accuracy departments to build predictive models of loss ratio;
• Collaborated with the marketing department on segmentation and strategies using statistical models;
• Cooperated with underwriters and grouped agents and industry classes to increase product efficiency;
• Monitored models, identified trends and performed detailed analysis of business and product questions.

SPHERE INSTITUTE • SAN FRANCISCO, CA   JAN, 2013 – JULY, 2013
Data and Policy Analyst
• Built a regression model using SAS and R to adjust outpatient prospective payments for different categories of hospitals and reported policy suggestions to the Federal government;
• Designed a new algorithm that reduced the running time of a project from 4 hours to 40 minutes;
• Visualized country-wise medical data in R.

UNIVERSITY OF WISCONSIN – MADISON • MADISON, WI   AUG, 2009 – AUG, 2012
Research Assistant
• Built five Bayesian models and two non-Bayesian models in R, and compared their prediction accuracies;
• Applied a distributed system using Python to solve advanced computing problems in Bayesian regression.

PROGRAMMING TOOLS

R/RStudio      Hadoop   Python
SAS            Hive     Java
SQL            Pig      Perl
Excel/Access   HBase    MATLAB
Weka           Linux    Octave


DATA MINING, MACHINE LEARNING AND ALGORITHM PROJECTS

THE ANALYTICS EDGE PROJECT AT EDX.ORG   FEB 2014 – MAY 2014
Project Description: Learn analytical applications in statistical analysis.
Development Tool: R
• Built a linear regression on geographic and climate data to estimate the quality of wine from different areas;
• Applied logistic regression, decision tree and random forest models to data from 675 criminal cases to classify Supreme Court decisions;
• Turned tweets into knowledge using text mining analytics;
• Visualized U.S. election results, plotted network data using varying circles and colored vertices, and summarized text data using word clouds.

STATISTICAL LEARNING AT STANFORD ONLINE   JAN 2014 – APR 2014
Project Description: Course in supervised learning, with a focus on regression and classification methods.
Development Tool: R
• Learned linear and polynomial regression, logistic regression and linear discriminant analysis;
• Performed cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso);
• Studied nonlinear models, splines, and generalized additive models;
• Applied tree-based methods, random forests, boosting and support vector machines;
• Discussed unsupervised learning methods: principal components and clustering (k-means and hierarchical).

DATA MINING PROJECT AT COURSERA.ORG   SEP 2013 – OCT 2013
Project Description: Learn data mining techniques through practical examples with the free Weka software.
Development Tools: Weka
• Performed data mining using Decision Trees, Nearest Neighbor, Linear Regression, Classification by Regression, Logistic Regression and Support Vector Machines;
• Improved understanding of data mining and was introduced to new ensemble learning techniques, such as Bagging, Randomization, Boosting and Stacking.

MACHINE LEARNING PROJECT AT COURSERA.ORG   APR 2013 – JUL 2013
Project Description: Implemented and applied the most advanced machine learning algorithms.
Development Tools: Octave
• Implemented machine learning algorithms, including Linear Regression, Logistic Regression, Neural Network, Support Vector Machines, K-Means Clustering, Principal Component Regression, Anomaly Detection, and Recommender Systems;
• Enhanced the understanding and programming of machine learning algorithms for prediction, classification, clustering and association.

COMPUTATIONAL INVESTING PROJECT AT COURSERA.ORG   FEB 2013 – APR 2013
Project Description: Utilize data mining in Event Study and implement trading strategies in the US stock market.
Development Tool: Python
• Built a real trading algorithm from a stock price event study. At every event, bought 100 shares of the equity and sold them 5 trading days later; the compound annual return was 14.15%;
• Built a real trading algorithm from a Bollinger Bands event study. At every event, bought 100 shares of the equity and sold them 5 trading days later; the compound annual return was 8.76%.

DATA STRUCTURE AND ALGORITHM PROJECT AT UNIVERSITY OF WISCONSIN – MADISON   SEP 2009 – DEC 2010
Project Description: Implemented data structures and algorithms in Java.
Development Tools: Java, Linux
• Learned basic data structures and Object-Oriented Programming (OOP) and implemented complex data structures in Java;
• Implemented the Bayesian network algorithm and Hidden Markov Model in Java and applied the algorithms in several bioinformatics applications.

MATHEMATICAL MODELING PROJECT AT SICHUAN AGRICULTURAL UNIVERSITY   JUN 2007 – JUN 2009
Project Description: Mathematical modeling contests and training.
Development Tools: MATLAB
• Led a team in mathematical modeling contests and interned as a Trainer and the Vice President at the Mathematical Modeling Association;
• Created teaching materials and gave lectures on mathematical modeling and programming skills in MATLAB for new members;
• Awarded Meritorious Winner in the 2008 International Mathematical Contest in Modeling (IMCM).


