
Machine Learning, Statistical Analysis.

Location:
Los Angeles, CA
Posted:
September 18, 2017

Resume:

Xin (Julia) Xu

**** W **th St. Los Angeles, CA *****

919-***-**** ***********@*****.*** SheliaXin

Education

Duke University, Durham, NC

MASTER OF STATISTICAL SCIENCE, GPA: 3.8 Aug. 2015 - May 2017

Nanjing University, Nanjing, China

B.SC IN DEPARTMENT OF MATHEMATICS, MAJOR IN STATISTICS, GPA: 3.7 Sept. 2011 - Jun. 2015

University of Wisconsin - Madison, Madison, WI

VISITING STUDENT IN THE DEPARTMENT OF STATISTICS, GPA: 3.9 Jan. 2014 - May 2014

Skills

Data and Analytics Tools/Languages: Python, R, SQL, Java, SparkR, Hadoop, C++, Visual Basic.

Statistical Methods: Machine Learning, Predictive Modeling, Data Mining, Bayesian Statistical Modeling.

Internship

In4mation Insights Needham, MA

DATA SCIENTIST INTERN Jun. 2016 - Aug. 2016

• Developed a latent class algorithm, based on mixed-mode finite mixture models and the EM algorithm, to segment customers.

• Built 15+ feature functions and constructed an R package for latent class clustering and regression.

• Designed and developed a Shiny app with exploratory data analysis, model recommendation, 4 market segmentation models, result comparisons, and interactive visualizations.

• Analyzed and processed 5 datasets with this market segmentation tool and demonstrated it to the data scientist team.

Essence Securities Shanghai, China

RESEARCH ANALYST INTERN AT MACHINE INDUSTRY GROUP Jan. 2015 - May 2015

• Built valuation models with Visual Basic for Applications and forecast the stock prices of 10+ new companies in China.

• Constructed pairs trading models in R and provided investment advice in daily reports.

• Researched 10+ global tech companies and wrote a 26-page report about the Internet of Things, cooperating with the IT group.

Projects

Zillow’s Home Value Prediction Aug. 2017 - Present

• Predicting the Zestimate residual error in 6 months and ranking in the top 2% (46th/2844) on the leaderboard.

• Cleaning data on over 2.9 million house properties, performing exploratory data analysis, and creating 20+ new property features.

• Constructing a 3-layer stacking model with XGBoost, LightGBM, Neural Network, Random Forest, and Regression in Python.

Scalable K-means++ Project Apr. 2016 - May 2016

• Researched K-means-related algorithms and developed the K-means, K-means++, and Scalable K-means++ algorithms in Python.

• Optimized the performance of K-means-related algorithms with Cython, multiprocessing, and PySpark.

• Analyzed and compared misclassification rate, clustering cost, and runtime performance on four large datasets (> 1 gigabyte).

HMM Stock Price Prediction Apr. 2016 - May 2016

• Built Gaussian Hidden Markov Models to capture the stock price patterns of 7 high-tech companies, using data scraped from Yahoo Finance.

• Forecast stock prices with HMM and compared performance with 2 traditional time series models (SARIMA) in Python.

• Provided investment strategies (6 opens in 20 days, win-loss ratio: 66.7%) and presented research results to a class of 50.

Reddit Text Mining Analysis Nov. 2015 - Dec. 2015

• Utilized Rhipe and Hadoop MapReduce to detect hot topic trends in 30 gigabytes of JSON files scraped from Reddit.

• Conducted text mining analysis in R on comments around Valentine’s Day.

H-1B Influential Factors Analysis Oct. 2015 - Dec. 2015

• Scraped data from the Office of Foreign Labor Certification and conducted Principal Component Analysis to detect influential factors.

• Analyzed influential factors with Frequentist Logistic regression, Lasso regression, and Bayesian Probit regression in R.

Extracurricular Activity

Organizer, Alumni Dinner of Statistical Department in 2016 Spring, Durham, NC

WON THE GRADUATE SCHOOL'S PROFESSIONAL DEVELOPMENT GRANT Dec. 2015 - Mar. 2016

Minister, Publishing Department at Department of Math at Nanjing University, Nanjing, China

LED TEAM OF 10, PROMOTED 4 EVENTS WITH 200+ ATTENDANCE; DESIGNED POSTERS WITH PHOTOSHOP Sept. 2011 - Sept. 2013


