Data Scientist

Location:
New York, NY
Posted:
November 04, 2015

Resume:

Di Zhu

*** *** ******, ****, *******, NJ *****

201-***-****

*****@*******.***

OBJECTIVE: To obtain an internship or a full-time job as a Business Analyst or Data Scientist

EDUCATION: Stevens Institute of Technology, Hoboken, NJ

Master of Science, Business Intelligence & Analytics, Expected 05/2016

Hefei University of Technology, Hefei, China

Bachelor of Engineering, Chemical Engineering, 06/2010

EXPERIENCE: Stevens Institute of Technology, Hoboken, NJ 10/2015 – Present

Graduate School Student Assistant

Helped industry students with Hadoop Pig & Hive exercises in Stevens Institute of Technology’s Hadoop Bootcamp

Graded homework for the Optimization and Process Analytics course

Davidson Lab, Hoboken, NJ 05/2015 – 08/2015

Research Assistant, Data Analyst

Extracted and processed the U.S. Northeast Urban Ocean Observatory data set using Python

Clustered observation stations into 3 groups and built separate time-series machine learning models within each group to predict ocean storm surges in the U.S. Northeast

Improved the astronomical water level forecasting model by applying machine learning in Python with scikit-learn

China Petroleum & Chemical Corporation, China 06/2013 – 08/2013

Summer Internship, Data Analyst

Performed Extraction, Transformation, and Loading (ETL) of batch retail data via SAP

SKILLS: Certifications: SAS Base Programmer, Essential Bloomberg

Programming & Statistics: Python, R, SQL, SAS, Hadoop Pig & Hive

Languages: English (fluent), Mandarin Chinese (native)

PROJECTS: Decision Support System

Built a decision support system in Python to help MultiMagazine Inc. select marketing targets

Normalized the categorical variables and filled in the missing values

Randomly split the data set, using 70% of the observations for training and 30% for testing

Fitted Logistic Regression, CART, Support Vector Machine, and Naïve Bayes models to the training set

Evaluated the 4 algorithms using ROC curves, Precision-Recall curves, error rates, and confusion matrices, and selected the most suitable one based on the company’s Benefit/Cost marketing matrix

Developed a user-friendly environment to run this predictive model

Boston House Value Analysis

Reduced the dimensionality of the Boston housing dataset with Principal Component Analysis in R and SAS

Identified the 5 of the 14 variables that best explained house prices using AIC-based subset regression

Performed multivariate regression to examine the relationship between house prices and the selected variables

TV Reviews Analysis

Collected more than 4,000 TV set reviews from 4 different websites using a Python script

Performed a text mining process on the reviews

Developed a decision tree to select useful reviews based on rating, length, and review tags


