Data Analyst Intern

Location: Easton, PA
Posted: June 28, 2022

Resume:

YUJIE HU

Bethlehem, PA ************@*****.*** 484-***-****

EDUCATION

Lehigh University Bethlehem, PA

Master of Science in Statistics, Mathematics May 2022

GPA: 3.74/4.0 Dean’s List

Relevant Coursework: Time Series, Statistical Machine Learning, Linear Models in Statistics

Ocean University of China Qingdao, China

Bachelor of Applied Mathematics Jun 2020

GPA: 3.27/4.0

SKILLS

Data Analytics: Statistics, Data Modeling, Data Automation, Data Visualization, A/B Testing

Machine Learning Models: Multiple Linear Regression, Logistic Regression, LDA, Nonparametric Regression (Local Linear, Nadaraya-Watson, Spline), PCA, KNN

Technical Skills: Python (pandas, numpy, matplotlib, statsmodels, sklearn, seaborn), R (olsrr, car, glmnet, gam, dplyr, PerformanceAnalytics, ggplot2, tidyr, splines, ISLR, MASS), SQL (where, group by, having, order by, count, max, sum), SAS (base, stat, sgraph, macro, sql, iml, iml+), SPSS, MATLAB

PROFESSIONAL EXPERIENCE

Jiangxi ISUZU Motors Co., Ltd. Jiangxi, China

Data Analyst Intern Jan 2019 – Aug 2019

●Collected inventory and sales data from multiple sources and built databases using reusable, scalable SQL and Python scripts, improving efficiency by 60%.

●Built dashboards and reports for senior management to forecast, track, and measure stock levels, sales, costs, savings, and profit.

●Participated in monthly business process improvement meetings with cross-functional teams and reduced reporting time by 45% by revamping existing Excel reports.

Nanchang Investigation Team of National Bureau of Statistics Jiangxi, China

Data Analyst Intern Jul 2018 – Sept 2018

●Researched local malls and food trucks to gather sales data.

●Transformed research data into standard formats using Python, reducing manual intervention by 70%.

●Analyzed research data and estimated correct tax amounts; provided insights and drafted a 10-page report for management.

PROJECT EXPERIENCE

Study of People’s Happiness Based on Multiple Regression Model Jan 2022 – Apr 2022

●Performed data cleaning, transformation, data quality checks, and exploratory data analysis on data covering 149 countries in Python and SQL.

●Identified the key factors that affected people's happiness and implemented a linear regression model in Python to determine the relationship between the evaluation index and key metrics.

●Tuned parameters and compared model performance using repeated cross-validation, achieving 92% accuracy.

Future Sales Prediction Oct 2021 – Dec 2021

●Preprocessed and analyzed 2M+ sales records for 200K products in R to identify key factors, performed feature engineering, and constructed correlation features.

●Discovered product categories with similar sales patterns by clustering the correlation vectors, then trained a model for each category using stepwise selection and lasso regression to refine the model's variables.

●Built a logistic regression model in R to predict the sales of 200K products for the next 3 months.

●Optimized the model using leave-one-out cross-validation (LOOCV), reaching an RMSE of 0.91 and 89.8% accuracy.


