machine learning

Location:
Chicago, IL
Posted:
March 15, 2018

Resume:

Xiao (Shawn) Chen

**** * ******* **. *******, ***16 *****@****.********.*** 573-***-****

LinkedIn: https://www.linkedin.com/in/xiao-shawn-chen-86511a69/ Portfolio: http://rpubs.com/Cx530548220

TECHNICAL SKILLS

Operating Systems: Linux, macOS, Windows

Machine learning and Deep Learning: Python, PyTorch, PySpark, scikit-learn

Statistical Packages: R, SAS, Excel

Databases: MySQL, SQL, Teradata, Google BigQuery

Cloud Platform: Google Cloud Platform

Visualization and Business Intelligence: ggplot2, R Markdown, Tableau

EDUCATION AND CERTIFICATION

University of Missouri, Columbia, Missouri M.A. Statistics Aug. 2015 - May 2017

Anhui Polytechnic University, China B.S. Electrical Engineering and Automation Sept. 2010 - July 2014

INTERNSHIP EXPERIENCE

Youzu Interactive Co. Ltd. Shanghai, China

Assistant Game Analyst Jan. 2015 - July 2015

Generated weekly and monthly reports for business users according to their requirements. Extracted and mined data from database tables using MySQL, R, and Tableau.

Adjusted event start times, durations, activity rewards, and other game settings to improve user activity, leading to a 12% increase in retention and a $2 million rise in monthly revenue.

Performed game analysis on League of Angels, Facebook’s 2015 Best Web Game, with $25 million monthly and $72 million annual revenue in the North America region.

MACHINE LEARNING AND DEEP LEARNING EXPERIENCE

Implementing a ConvNet with NumPy on Google Cloud Platform Jan. 2018 – Present

• Used PyTorch to build VGG-16 architecture for CIFAR-10 image classification with GPU support.

• Applied He initialization, ReLU, batch normalization, dropout regularization, and Adam optimization for model training. Achieved 81.3% accuracy.

Recruit Restaurant Visitor Forecasting in Kaggle.com Dec. 2017 – Feb. 2018

Created interactive data analysis in R with ggplot2 and R Markdown.

Cleaned data, merged datasets, and split existing variables for feature engineering.

Used XGBoost to build a gradient boosting tree model that achieved 0.514 RMSE and ranked in the top 15%.

Credit Card Fraud Detection Oct. 2017 – Dec. 2017

Built a multivariate Gaussian anomaly detection system with NumPy for fraud detection.

Used cross-validation to choose the epsilon threshold based on F1, recall, and precision scores, then applied this epsilon to the training dataset and achieved a 0.763 F1 score and 93% accuracy.

Prediction of Client’s Charity Behavior Sep. 2017 – Nov. 2017

Addressed the skewed data problem with the SMOTE oversampling method.

Built a random forest regression model for missing value imputation.

Used a logistic regression model with an L1 penalty for behavior classification and achieved 0.82 F1 and 0.82 AUC.

Built a linear stochastic gradient descent model to predict the donation amount and achieved 3.7 MSE.

Leaf Classification in Kaggle.com Sep. 2016 - Dec. 2016

Competed in the Kaggle.com Leaf Classification competition and ranked in the top 12%.

Used PCA to reduce data dimensions while keeping 98% of the information.

Normalized data and used L2 regularization to avoid overfitting.

Built an SVM classification model with a Gaussian kernel, using 5-fold cross-validation to choose C and gamma, and a KNN classification model with a large K to avoid overfitting.


