Post Job Free

Resume

Sign in

Data Analyst Python

Location:
Pittsburgh, PA
Posted:
November 20, 2020

Contact this candidate

Resume:

Pittsburgh, Pennsylvania,*****

412-***-****

adh0bw@r.postjobfree.com

www.linkedin.com/in/Xin-Huo

EDUCATION

Master of Science in Business Analytics Minor in Computer Science Drexel University

Sep 2018 – Apr 2020 GPA 3.84

LeBow Alumni Merit Scholarship 2018-2019 2019-2020

Master of Business Administration Duquesne University

Aug 2014 – Aug 2017 GPA 3.45

Bachelor of Business Administration and Management Beihua University

Sep 2009– July 2013 GPA 3.20

Excellent Prize for Academic Performance 2010-2011 2011-2012

PROJECT EXPERIENCE

Forecasting Drexel’s Research Awards and Expenditures Data Summary & ML Model 2019

•Main goals: Creating a time-series model to predict Drexel University’s research awards and expenditures in five years by utilizing historical datasets.

•Responsibilities:

Performed overall data summary by utilizing Python and R Language. Used MATLAB plot package to translate the historical dataset into a graph format.

Used Pandas and Numpy package within Python to perform data profiling, data cleansing, and data transformation. Removed all duplicated records and reshaped the dataset into one integrated dataset.

Used KNN method to replace missing values with substitute values.

Created two types of time-series marching learning models (ARIMA and VAR) by utilizing Python’s Statsmodels and Scikit-learn models. Kept improving ML models to get 93% accuracy in the end.

Created a result report with Tableau. Translated all numerical results into a line chart with a timestamp.

Screening Tool for Chronic Kidney Disease Data Transformation & ML Model 2019

Main goal: Created a statistical model for screening Chronic Kidney Disease and translated the model into an easy use screening tool.

Responsibilities:

Conducted data summary with R language, evaluated the dataset quality and performed a graphical overview report.

Performed data cleaning with Python. Removed all duplicated records.

Used Pandas and Numpy package of Python to slice and reshape the dataset for building statistical model.

Performed two methods (Principal component analysis and Factor analysis) with Python package to select the key features.

Developed various statistical models (Logistic regression, Random Forest, K-Mean, and SVM) with R language and Python, and improved the performance until getting over 85% accuracy.

Translated the statistical model into a simple questionnaire that can be used easily.

Sentiment Analysis of Amazon Music, Spotify, and Pandora Data Summary& Data Visualization 2018

•Main goals: Performed a sentiment analysis of these 3 companies and predicted the future performance of these 3 companies.

•Responsibilities:

Used Python package to do data cleaning, removed the duplicated data, and replace the missing data with a substitute value. (K nearest neighbor)

Created a keywords dictionary for information interpreting.

Translated the row information into a categorical or numerical data format with Python and R packages.

Created a random forest model to classify records within the dataset.

Performed data visualization with gg-plot of R language and matlabplot of Python.

Created a graphical report with Tableau.

WORK EXPERIENCE

Drexel University Research Data Analyst July-2020

Created a graphical data review for the US health care dataset.

Translated the data format into a standard compressed data format (CSV).

Created data summary with Python (Pandas and Numpy).

Sliced and Reshaped dataset with Python, R Language, SAS for advance analysis.

Used feature selection methods (PCA and FA) to select the feature for building statistical model.

Created a logistic regression model for predicting the cost of each family income level.

Henan Mingsheng Real Estate Co., Ltd Financial Data Analyst Mar-2018 to Aug-2018

Collected daily financial data records with Microsoft Excel.

Performed data cleaning with Python. Removed the duplicated record or useless records.

Translated the dataset into a standard database format with Python.

Used SQL script to update the daily record into database.

Created new datasets with SQL script for advance analysis.

Built Statistical model based on historical data to predict future financial demand and decreased the annual financial cost by 5% of the total.

Performed monthly financial reports with Tableau.

Nanyang LingDa Construction Machinery Co., Ltd HR Data Analyst Nov-2017 to Feb-2018

Prepared daily human resource information with MS excel and translated the record into standard data format with Python for uploading.

Created and Managed databases with SQL scripts and uploaded daily HR records into databases.

Corporate with my super advisor to perform advanced data analysis with Python.

Built a screen tool base on a decision tree model for evaluating employee’s working performance.

Used Tableau to perform the evaluation report.

SOFTWARE AND SKILLS

Programming Language: Python (Pandas, Numpy, Scikit Learn), JAVA

Database Tools: SQL Server, MySQL, Snowflake

Statical Language: R Language, SAS

Cloud Platform: AWS, Google Cloud Service, Microsoft Azure

IDE: PyCharm, MS Visual Studio, R-studio

BI Tools: Tableau

Xin Huo



Contact this candidate