Huihui Duan
d*************@*****.*** • Los Angeles, CA 91406 • 608-556-3368 • LinkedIn: http://www.linkedin.com/in/huihuiduan
EDUCATION AND TRAINING

BIG DATA AND HADOOP ONLINE TRAINING JULY, 2014 – PRESENT

JOHNS HOPKINS UNIVERSITY AT COURSERA APR, 2014 – JULY, 2014
Specialization Certificate, Data Science (Data Scientist)

MOOCS AT COURSERA/EDX/UDACITY JAN, 2013 – PRESENT
Free Online Courses in Computer Science and Statistics

UNIVERSITY OF WISCONSIN – MADISON MAY, 2011 – AUGUST, 2012
Master's Degree in Statistics
Master Project 1: The relationship between cognitive task performance and vascular health in young adults
• Studied the relationship between cognitive task performance and vascular health in young adults;
• Detected gender differences in demographic, vascular and cognitive measures.
Master Project 2: Nonlinear model of gray wolf growth in Wisconsin
• Analyzed how the radio-telemetry error was affected by state, pilot, month and year;
• Identified and compared causes of wolf mortality changes;
• Fitted nonlinear growth models, estimated the parameters and compared different models.

UNIVERSITY OF WISCONSIN – MADISON SEP, 2009 – DEC, 2012
Master's Degree in Quantitative Genetics
Master Thesis: Whole Genome Prediction Within and Across Environments - An Application to Wheat Yield
• Compared the accuracy of whole genome prediction of wheat yield within and across environments;
• Solved the advanced computing problems of Bayesian regression methods in 100 replications of random subsampling cross-validation in a genomic selection context.

PROFESSIONAL EXPERIENCES

FARMERS INSURANCE GROUP • LOS ANGELES, CA JULY, 2013 – PRESENT
Senior Commercial Product Analyst
• Worked with the R&D and Accuracy departments to build predictive models of loss ratio;
• Collaborated with the marketing department on segmentation and strategies using statistical models;
• Cooperated with underwriters and grouped agents and industry classes to increase product efficiency;
• Monitored models, identified trends and performed detailed analyses of business and product questions.

SPHERE INSTITUTE • SAN FRANCISCO, CA JAN, 2013 – JULY, 2013
Data and Policy Analyst
• Built regression models using SAS and R to adjust outpatient prospective payments for different categories of hospitals and reported policy suggestions to the Federal government;
• Designed a new algorithm that reduced the running time of a project from 4 hours to 40 minutes;
• Visualized country-wise medical data in R.

UNIVERSITY OF WISCONSIN – MADISON • MADISON, WI AUG, 2009 – AUG, 2012
Research Assistant
• Built five Bayesian models and two non-Bayesian models in R, and compared their prediction accuracies;
• Applied a distributed system using Python to solve advanced computing problems in Bayesian regression.

PROGRAMMING TOOLS
R/RStudio       Hadoop   Python
SAS             Hive     Java
SQL             Pig      Perl
Excel/Access    HBase    MATLAB
Weka            Linux    Octave
DATA MINING, MACHINE LEARNING AND ALGORITHM PROJECTS

THE ANALYTICS EDGE PROJECT AT EDX.ORG FEB 2014 – MAY 2014
Project Description: Learn analytical applications in statistical analysis.
Development Tool: R
• Built a Linear Regression on geographic and climate data to estimate the quality of wine from different areas;
• Performed Logistic Regression, Decision Tree and Random Forest on data from 675 criminal cases to classify Supreme Court decisions;
• Turned Tweets into Knowledge using Text Mining Analytics;
• Visualized U.S. election results, plotted network data using varying circles and colored vertices, and summarized text data using word clouds.

STATISTICAL LEARNING AT STANFORD ONLINE JAN 2014 – APR 2014
Project Description: Course in supervised learning, with a focus on regression and classification methods.
Development Tool: R
• Learned linear and polynomial regression, logistic regression and linear discriminant analysis;
• Performed cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso);
• Studied nonlinear models, splines, and generalized additive models;
• Applied tree-based methods, random forests, boosting and support-vector machines;
• Discussed unsupervised learning methods: principal components and clustering (k-means and hierarchical).

DATA MINING PROJECT AT COURSERA.ORG SEP 2013 – OCT 2013
Project Description: Learn data mining techniques through practical examples with the free Weka software.
Development Tools: Weka
• Performed data mining using Decision Trees, Nearest Neighbor, Linear Regression, Classification by Regression, Logistic Regression, and Support Vector Machines;
• Improved understanding of data mining and learned new ensemble learning techniques, such as Bagging, Randomization, Boosting and Stacking.

MACHINE LEARNING PROJECT AT COURSERA.ORG APR 2013 – JUL 2013
Project Description: Implemented and applied the most advanced machine learning algorithms.
Development Tools: Octave
• Implemented machine learning algorithms, including Linear Regression, Logistic Regression, Neural Network, Support Vector Machines, K-Means Clustering, Principal Component Regression, Anomaly Detection, and Recommender Systems;
• Enhanced the understanding and programming of machine learning algorithms for prediction, classification, clustering and association.

COMPUTATIONAL INVESTING PROJECT AT COURSERA.ORG FEB 2013 – APR 2013
Project Description: Utilize data mining in Event Study and implement trading strategies in the US stock market.
Development Tool: Python
• Built a real trading algorithm from a stock price event study: at every event, buy 100 shares of the equity and sell them 5 trading days later; the compound annual return is 14.15%;
• Built a real trading algorithm from a Bollinger Bands event study: at every event, buy 100 shares of the equity and sell them 5 trading days later; the compound annual return is 8.76%.

DATA STRUCTURE AND ALGORITHM PROJECT AT UNIVERSITY OF WISCONSIN – MADISON SEP 2009 – DEC 2010
Project Description: Implemented data structures and algorithms in Java.
Development Tools: Java, Linux
• Learned basic data structures and Object-Oriented Programming (OOP) through projects and implemented complex data structures in Java;
• Implemented the Bayesian network algorithm and Hidden Markov Model in Java and applied the algorithms in several bioinformatics applications.

MATHEMATICAL MODELING PROJECT AT SICHUAN AGRICULTURAL UNIVERSITY JUN 2007 – JUN 2009
Project Description: Mathematical modeling contests and training
Development Tools: MATLAB
• Led a team in the mathematical modeling contests and interned as a Trainer and the Vice President at the Mathematical Modeling Association;
• Created teaching materials and gave lectures on mathematical modeling and programming skills in MATLAB for new members;
• Named Meritorious Winner in the 2008 International Mathematical Contest in Modeling (IMCM).