WILLIMANTIC, CT *****
YIXIN LI
www.linkedin.com/in/yixinli18 https://github.com/laraty
**************@*****.***
Education
University of Connecticut, Storrs, CT
Spring 2016
M.S. in Statistics, May 2016, GPA: 3.8/4.0.
Relevant Coursework: Time Series; Regression; Categorical Analysis; Visualization; Data Preprocessing; Optimization; Classification and Clustering; Image Data Analysis; Document Analysis.
Relevant Course Projects: Face Image Clustering; Document Embedding; Image Segmentation; Prediction of Hyperlink Between Webpages.
Nankai University, Tianjin, China
Spring 2013
B.S. in Mathematics, Minor in Statistics.
Technological Skills and Language
Original Coding: Neural Networks, Support Vector Machines, Decision Trees, Bayesian Methods, K-means, K-Nearest Neighbors, PCA, ICA, Bagging and Boosting.
Software: Python (Proficient); R (Proficient); Java (Competent); SQL (Experienced); C++ (Basic); SAS (Competent); Matlab (Competent); MS Office; LaTeX.
Certificates: SAS Certified Base Programmer; SAS Certified Advanced Programmer.
Language: Japanese (Basic).
Technology Experience
Online Project Competitions
Prediction of Return Rates for Fashion Distributor, Data Mining Cup 2016
Spring 2016
Explored and analyzed datasets; constructed the modeling methodology; created multi-level features.
Utilized Python, R and Matlab.
Property Inspection Prediction, Liberty Mutual, Kaggle Competition
Fall 2015
Predicted hazard level of insured property using five classification models (Random Forest, Gradient Forest, Extra Tree, Neural Network, KNN).
Increased the prediction accuracy from 0.30 to 0.36 using cross-validation and ensemble methods.
Utilized Python and R.
Academic Projects
Exam Scheduling Problem, Department of Mathematics, University of Connecticut
Spring 2016
Optimized the exam schedule with Simulated Annealing (SA) and Ant-Colony Optimization (ACO) algorithms. Applied the model to University exam data. Published a research report on http://datascience.uconn.edu/.
Built a SQL database of the exam schedule for looking up the schedule arrangement of each student or exam.
Utilized R, Matlab, SQL and Java.
Twitter Business User Identification
Spring 2015
Collected tweet content from the Twitter website; built a prediction model by comparing five machine learning methods; produced a word-frequency dictionary.
Identified users who tweeted for business and advertising purposes with over 90% prediction accuracy.
Utilized Python, R and Hadoop.
Industry Projects and Employment
Summer Intern, Goldenson Center, University of Connecticut
Summer 2015
Individual Financial Planning Model, Mass Mutual Financial Group
Quantified individual financial wellness by fitting a generalized linear model (GLM) in R.
Long Term Care Incidence Experience Study, Gen Re Corporation
Co-wrote an R script that enables Gen Re to transform data from its client companies into a standard format and to robustly calculate Incidence Exposure and Expected Incidence Rate.