Data Analyst

Worcester, MA
January 02, 2020


Yuxin Chen

+1-571-***-**** Apt *, 7 Pratt Street, Worcester, MA


Passionate about data analysis, with 5 years of experience in software design and development in C++, C#, Python, and R. Proficient in mathematical and statistical modeling and in regression and classification algorithms, with a solid understanding of data analysis and data engineering.


Worcester Polytechnic Institute, Worcester, MA Jan 2018 - Dec 2019

Master’s in Data Analytics and Information Technology

Courses: Statistical Methods for Data Science; Business Intelligence; Database Application and Development; System Design and Analysis

Southwest Jiaotong University, China Sep 2013 - Jun 2017

Bachelor of Signal Automatically Control

Courses: Theory of Probability, C++, C#, Computer Networks, Signals and Systems, Automatic Control

SKILLS

Programming Languages: Python, R, C++, C#, Java, Visual Basic, HTML, Xshell6, Scala

Software Packages: Tableau, AWS, SAS, Airflow, Visio, Access, Excel

Database: MySQL, NoSQL, MSSQL, PostgreSQL, SparkSQL, AWS (RDS), Oracle, Hadoop and HDFS, Hive, Spark

PROFESSIONAL EXPERIENCE

lnc Data Scientist, Data Analyst, Data Engineer April 2019 - Aug 2019

• Used HiveSQL, SparkSQL, MySQL, and AWS (RDS) to analyze and clean 20 million records.

• Migrated data between two Hive platforms and performed anomaly detection on the Hive database.

• Built the ODS database structure for the Functional Center and wrote the data dictionary and data interface documentation.

• Wrote Python and Xshell6 scripts to automatically validate and monitor the data migration process for AWS (RDS).

• Created a web crawler in Python and HTML to collect data on 20+ companies for the internal business team.

• Built a random forest classification model to predict anomalous user behavior, achieving 93% accuracy.

• Created a Tableau dashboard to visualize and analyze company operating status, such as GAAP and user distribution.

Graduate Assistant at WPI Data Analyst Aug 2019 - Dec 2019

• Collected longitude and latitude information for 10K+ locations using a Python web crawler and the Baidu Map API.

• Merged MySQL databases in Python and cleaned the data using Python regular expressions.

• Created a web crawler in Python and HTML to collect data on 3 topics from 5 years of paper materials.

Data Research/Graduate Assistant at WPI Data Analyst Aug 2019 - Dec 2019

• Based on clinic requirements, performed data collection, data cleaning, and ETL into a MySQL database for 50K+ records.

• Prepared data in MySQL for predictive modeling, including separating the data into different groups.

• Analyzed data on clients with anomalous behavior and provided reasonable explanations.

• Used Lasso to predict which kinds of patients are more likely to visit the clinic, with 90.1% accuracy.

Southwest Jiaotong University Laboratory - ZPW2000A program Jan 2016 - July 2017

• Developed signal processing algorithms for the signal control module based on the ZPW2000A principle.

• Developed a GUI in C# and C++ with train running simulation and block control functions.

• Developed a train control and tracking software package using C# and C++.

PROJECTS

NYC Taxi marketing project Jan 2019 - Apr 2019

• Performed source data cleaning and ETL into R and Python data frames.

• Built profitability regression models using KNN, PCR, and LASSO, with 91% average cross-validation accuracy.

• Built a credit card payment classification model using LASSO to identify potential sponsors for NYC Taxi.

Twitter data mining for Twitch Jan 2019 - Feb 2019

• Collected data through the Twitter API, performed data cleaning and NLP, and loaded the data into a Python data frame.

• Starting from the raw data, conducted data analysis to find the most popular recent game topic on Twitch.

• Conducted k-means clustering analysis in Python to find the best streamer and increase the number of viewers.

Database application on Android Studio for Clevo Co. Jan 2018 - May 2018

• Established the ODS data structure based on an RDBMS and used Visio to build the ER diagram.

• Wrote data dictionaries and user stories based on business specifications and scenarios.

• Developed the application in Java for Android 4.2 and created multiple interfaces for data operations using the Android NDK.