LinkedIn: www.linkedin.com/in/yuxin-chen-20562b156
OBJECTIVE
Passionate about data analysis, with 5 years of experience in software design and development in C++, C#, Python, and R. Proficient in mathematical and statistical modeling and in regression and classification algorithms, with a solid understanding of data analysis and data engineering.
EDUCATION
Worcester Polytechnic Institute, Worcester, MA Jan 2018 - Dec 2019
Master’s in Data Analytics and Information Technology
Courses: Statistical Methods for Data Science; Business Intelligence; Database Application and Development; System Design and Analysis
Southwest Jiaotong University, China Sep 2013 - Jun 2017
Bachelor’s in Signal and Automatic Control
Courses: Probability Theory, C++, C#, Computer Networks, Signals and Systems, Automatic Control
SKILLS
Programming Languages: Python, R, C++, C#, Java, Visual Basic, HTML, Xshell6, Scala
Software Packages: Tableau, AWS, SAS, Airflow, Visio, Access, Excel
Database: MySQL, NoSQL, MSSQL, PostgreSQL, SparkSQL, AWS (RDS), Oracle, Hadoop and HDFS, Hive, Spark
PROFESSIONAL EXPERIENCE
58.com Inc. Data Scientist, Data Analyst, Data Engineer Apr 2019 - Aug 2019
• Used HiveSQL, SparkSQL, MySQL, and AWS (RDS) to analyze and clean 20 million records.
• Migrated data between two Hive platforms and performed anomaly detection on the Hive database.
• Built the ODS database structure for the Functional Center and wrote the data dictionary and data interface documentation.
• Wrote Python and Xshell6 scripts to automatically validate and monitor the data migration process for AWS (RDS).
• Created a web crawler in Python and HTML to collect data on 20+ companies for the internal business team.
• Built a random forest classification model to predict anomalous user behavior, achieving 93% accuracy.
• Created a Tableau dashboard to visualize and analyze company operating status, such as GAAP and user distribution.
Graduate Assistant at WPI Data Analyst Aug 2019 - Dec 2019
• Collected longitude and latitude information for 10K+ locations using a Python web crawler and the Baidu Map API.
• Merged MySQL databases in Python and cleaned the data using Python regular expressions.
• Created a web crawler in Python and HTML to collect data on 3 topics from 5 years of paper materials.
Data Research/Graduate Assistant at WPI Data Analyst Aug 2019 - Dec 2019
• Based on clinic requirements, performed data collection, data cleaning, and ETL into a MySQL database for 50K+ records.
• Prepared data in MySQL for predictive modeling, such as separating the data into different groups.
• Analyzed data on clients with anomalous behavior and provided reasonable explanations.
• Used Lasso to predict which kinds of patients are more willing to come to the clinic, with 90.1% accuracy.
Southwest Jiaotong University Laboratory - ZPW2000A program Jan 2016 - Jul 2017
• Developed signal processing algorithms for the signal control module based on ZPW2000A principles.
• Developed a GUI in C# and C++ that includes train running simulation and block control functions.
• Developed a train control and tracking software package using C# and C++.
PROJECTS
NYC Taxi marketing project Jan 2019 - Apr 2019
• Cleaned the source data and loaded it via ETL into R and Python data frames.
• Built profitability regression models using KNN, PCR, and LASSO, with 91% average cross-validation accuracy.
• Built a credit card payment classification model using LASSO to identify potential sponsors for NYC Taxi.
Twitter data mining for Twitch Jan 2019 - Feb 2019
• Collected data through the Twitter API, performed data cleaning and NLP, and loaded the data into a Python data frame.
• Starting from the raw data, conducted data analysis to find the most popular recent game topic on Twitch.
• Conducted k-means clustering analysis in Python to find the best streamer and increase the number of viewers.
Database application on Android Studio for Clevo Co. Jan 2018 - May 2018
• Established the ODS data structure based on an RDBMS and used Visio to build the ER diagram.
• Wrote data dictionaries and user stories based on business specifications and scenarios.
• Developed the application in Java for Android 4.2 and created multiple interfaces for data operations using the Android NDK.