ZHICHENG YOU
************@*****.*** 424-***-**** Houston, Texas
Summary
Data Scientist with proven experience in data processing, feature engineering, machine learning and data visualization
Advanced skills in Python, Hadoop and Spark
5 years of client-facing experience, including weekly meetings and presentations
Experience
DATA SCIENCE CONSULTANT DIGITALVELOCITY LLC
09/2016-PRESENT
Met with clients to discuss their business challenges
Used SQL to retrieve data for data analysis and visualization
Performed exploratory data analysis and data visualizations using Python and related packages
Used pandas, numpy, seaborn, scipy, matplotlib, and scikit-learn in Python to develop various machine learning models
Implemented Spark on Amazon Web Services (AWS) to support the use of large training datasets
Recent Projects:
Credit Risk Identification: Used a decision tree algorithm to identify the factors predictive of a higher risk of customer default. The model achieved an accuracy of 76%, which boosting improved to 83%.
Fraud detection model: Built decision-tree-based models (random forest, AdaBoost, and gradient boosting) to improve fraud detection accuracy
Song recommendation system: Stored the data in HDFS; implemented a song recommendation system in Spark; recommended 10 new songs for each user
SENIOR GEOPHYSICAL DATA ANALYST CGG SERVICES U.S. INC.
01/2012-09/2016
Worked on geophysical data processing/visualization projects (data in tens of TB)
Designed data-oriented analysis workflows to meet requirements of projects
Identified and removed noise from large datasets
Deployed various machine learning methods and algorithms to develop data models and make data products
Summarized and interpreted the processing results and generated client-oriented reports
Led weekly face-to-face client meetings to present the results
RESEARCH TRAINEE UNIVERSITY OF TEXAS M.D. ANDERSON CANCER
10/2008-12/2011
Designed and simulated chest digital tomosynthesis imaging (DTS) systems with C++ programs and Matlab
Simulated projections of the digital chest model in different DTS systems
Reconstructed 3D chest images from projections using machine learning algorithms
Education
PHD IN PHYSICS UNIVERSITY OF HOUSTON
06/2006-12/2011
Dissertation: “New Techniques for Chest Digital Tomosynthesis Imaging”
MS IN PHYSICS UNIVERSITY OF MINNESOTA
09/2002-06/2006
BS IN PHYSICS UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA
09/1997-07/2002
Skills
Programming: Python (5 years), Hadoop (2 years), Spark (2 years), R (1 year), SQL (1 year), Matlab (3 years), C# (1 year), C++ (3 years), C (8 years)
System: Windows, Linux
Certification
Microsoft Specialist: Programming in C#
Microsoft Certified Professional
Award
CGG 2016 Above & Beyond Award
Online Courses
Machine Learning (taught by Prof. Andrew Ng from Stanford University)
Machine Learning Foundations (taught by Prof. Hsuan-Tien Lin from National Taiwan University)
Machine Learning Techniques (taught by Prof. Hsuan-Tien Lin from National Taiwan University)
Work Authorization
Legally authorized to work in the US without sponsorship