Data Machine

Location:
Katy, TX
Posted:
July 04, 2017

Resume:

ZHICHENG YOU

************@*****.*** 424-***-**** Houston, Texas

Summary

Data Scientist with proven experience in data processing, feature engineering, machine learning and data visualization

Advanced skills in Python, Hadoop and Spark

Five years of client-facing experience, including weekly meetings and presentations

Experience

DATA SCIENCE CONSULTANT DIGITALVELOCITY LLC

09/2016-PRESENT

Met with clients to discuss their challenges

Used SQL to retrieve data for data analysis and visualization

Performed exploratory data analysis and data visualizations using Python and related packages

Used pandas, NumPy, seaborn, SciPy, matplotlib, and scikit-learn in Python to develop various machine learning algorithms

Implemented Spark on Amazon Web Services (AWS) to facilitate the use of large training datasets

Recent Projects:

Credit Risk Identification: Used a decision tree algorithm to identify the factors that predict a higher risk of customer default. The model achieved an accuracy of 76%, which was improved to 83% using boosting.

Fraud detection model: Built decision-tree-based models (random forest, AdaBoost, and gradient boosting) to improve fraud detection accuracy

Song recommendation system: Stored the data in HDFS; implemented a song recommendation system in Spark; recommended 10 new songs for each user

SENIOR GEOPHYSICAL DATA ANALYST CGG SERVICES U.S. INC.

01/2012-09/2016

Worked on geophysical data processing/visualization projects (data in tens of TB)

Designed data-oriented analysis workflows to meet requirements of projects

Identified and removed noise from large datasets

Deployed various machine learning methods and algorithms to develop data models and make data products

Summarized and interpreted the processing results and generated client-oriented reports

Led weekly face-to-face client meetings to present the results

RESEARCH TRAINEE UNIVERSITY OF TEXAS M.D. ANDERSON CANCER

10/2008-12/2011

Designed and simulated chest digital tomosynthesis imaging (DTS) systems with C++ programs and Matlab

Simulated projections of the digital chest model in different DTS systems

Reconstructed 3D chest images from projections using machine learning algorithms

Education

PHD IN PHYSICS UNIVERSITY OF HOUSTON

06/2006-12/2011

Dissertation: “New Techniques for Chest Digital Tomosynthesis Imaging”

MS IN PHYSICS UNIVERSITY OF MINNESOTA

09/2002-06/2006

BS IN PHYSICS UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA

09/1997-07/2002

Skills

Programming: Python (5 years), Hadoop (2 years), Spark (2 years), R (1 year), SQL (1 year), Matlab (3 years), C# (1 year), C++ (3 years), C (8 years)

System: Windows, Linux

Certification

Microsoft Specialist: Programming in C#

Microsoft Certified Professional

Award

CGG 2016 Above & Beyond Award

Online Courses

Machine Learning (taught by Prof. Andrew Ng from Stanford University)

Machine Learning Foundations (taught by Prof. Hsuan-Tien Lin from National Taiwan University)

Machine Learning Techniques (taught by Prof. Hsuan-Tien Lin from National Taiwan University)

Work Authorization

Legally authorized to work in the US without sponsorship


