
Data Analytics

Dearborn, MI
October 08, 2019




**** ********** **, ********, ** *****

+1-631-***-**** LinkedIn

EDUCATION

Stony Brook University May 2019

M.A. in Economics (GPA 3.2) Stony Brook, NY

Courses: Statistical Computing, Mathematical Statistics, Applied Econometrics

Chengdu University of Technology June 2017

B.A. in Economics (GPA 3.3) Chengdu, China

Courses: Statistics, Linear Algebra, Econometrics


Programming Languages: Python (numpy, pandas, scikit-learn), R (dplyr, ggplot2, lubridate), SQL

Database: MySQL, Oracle, MS SQL Server

Tools: Tableau, QlikView, Excel (PivotTable, VLOOKUP, VBA), MATLAB

EXPERIENCE

China Bohai Bank July-August 2018

Summer Intern, Investment Analyst Beijing, China

Performed statistical analysis in R, including linear regression and survival analysis, to summarize the sales and longevity of financial products

Processed data and conducted financial analysis comparing investment products from 12 competing banks to support the development of 2 new investment products


Credit Card Approval Prediction - Machine Learning, Python

Trained an automatic credit card approval predictor using machine learning techniques

Imputed missing numeric values with the mean and missing categorical values with the most frequent value

Fitted a logistic regression model to predict whether a credit card application will be approved or not

Used grid search to improve the accuracy of the logistic regression model from 83.3% to 85.4%

Recommendation System for Books - Natural Language Processing, Python

Established a content-based book recommendation system using NLP

Applied tokenization and stemming to reduce the effect of English word variation

Built a tf-idf model to determine the importance of each token and selected the top 10 specific tokens of each book

Measured the similarity between books by computing cosine similarity scores and visualized the results as a bar chart

E-Commerce Shipping Method Normalization - SQL

Wrote and executed SQL queries to normalize 1000+ different shipping methods into 6 standardized methods based on shipping time, across 1.8 million online sales orders spanning 15 years

Calculated the most popular shipping methods and the average order value for different states in the United States, and visualized the results as heat maps in an interactive Tableau dashboard

Displayed the data on a world map that shows the concentration of international orders based on geographic locations

R Package for Pooling p-values

Developed an R package to pool p-values for multiple data frames by implementing 4 methods (Fisher, Stouffer, minP and maxP)

Checked normality and group differences in the data before pooling p-values, using statistical methods including one-way ANOVA, the two-sample t-test and the Kruskal-Wallis test
