Modeling, Data Analytics, Coding

Location:
Ann Arbor, MI
Posted:
November 20, 2023

Resume:

Heyuan Liu

+1-919-***-**** ad1bo3@r.postjobfree.com LinkedIn: www.linkedin.com/in/heyuanliu

EDUCATION

University of Michigan - Ann Arbor, MI

Master's in Applied Statistics, May 2024, GPA 3.7/4.0

Bachelor of Science in Statistics, May 2022, GPA 3.9/4.0

• Hands-on Experience: Data Modeling & Machine Learning, Data Mining & Segmentation, Database Management

• Courses: Statistical Finance, Time Series, Regression Analysis, Bayesian Inference, Survey Sampling, Experiment Design.

• Software Tools: In-depth knowledge of SQL & Python; Excel, Microsoft PowerPoint, Power BI, Tableau, R, SAS, C++

• TOEFL: 109/120

WORK EXPERIENCE

Institute for Social Research

Summer Research Assistant, Ann Arbor, MI May 2022 - Aug 2022

• Built a logistic regression model and random effects models incorporating various factors to predict whether survey participants will respond, improving prediction accuracy by 6%.

• Built a weighting model that incorporates the updated logistic regression model and various methods to account for the changing status of survey participants over a period of 50 years, handling missing values so that the sample remains unbiased.

• Implemented data visualizations in R to identify characteristics of low-income households, distribution patterns of response rates over time, the relationship between income patterns and response rates, and the relationship between other demographic characteristics and response rates, covering over 5,000 survey participants over a period of 50 years.

Hubei Bestore Electronic Commerce Co., Ltd

Business Analyst Assistant, Wuhan, China May 2021 - Aug 2021

• Analyzed data in the company's proprietary CRM tool, predicted purchasing behavior for different customer groups across distribution channels, and implemented A/B testing. As a result, precision marketing was achieved and the average GMV increased by 8.7%.

• Processed a large dataset and completed a quantitative, multi-dimensional evaluation of "big influencers" (covering sales volume, conversion rate, cost-profit ratio, brand image compatibility, and audience demographics) to select the best influencers for collaboration, helping managers make data-driven strategic decisions.

• Established an innovative advertising performance evaluation model, distinct from the traditional ROI framework, which became a new standard for advertising budget allocation.

• Manipulated data in SQL and generated quantitative reports in Excel under different measurement frameworks, such as performance during the “618 shopping festival”, and identified underperforming areas, strengths, and new growth opportunities.

• Conducted industry research by reviewing industry reports and other sources to understand industry trends, competitors, and consumers. Produced market analysis reports to provide decision support for daily work.

Amazon

Business Intelligence Manager Assistant, China, part-time. Jan 2021 - Feb 2021

• Pre-processed and modeled historical shipment data in Tableau, then built an automated tool that predicts the percentage of product shipments affected when a truck’s departure time is adjusted, with a user-friendly interface on the website.

• Worked with a large dataset in Tableau to compose a root-cause dashboard report on the negative profitability of 5 furniture brands on Amazon, providing 2 data-driven solutions to raise profitability to break-even.

PROJECT

Sepsis Prediction March 2023 - April 2023

• Project Description: Built machine learning models (decision tree, random forest, extreme gradient boosting, adaptive boosting, etc.) to accurately predict whether ICU patients will develop sepsis.

• Project Challenges: 40 variables are recorded every hour for 20,000+ ICU patients and stored in separate CSV files, which needed to be merged and converted into a pandas DataFrame; computed new variables to reflect changes in various indicators during each patient's ICU stay, reducing the number of variables from 500 to 60, which addressed overfitting and underfitting and significantly improved the model's prediction accuracy; redefined the classification criteria to reduce the false positive rate (FPR).

• Project Results: The final prediction accuracy is 92%, and the project ranked in the top 5% of the class.

Income Class Prediction Feb 2023 - March 2023

• Project Description: Built machine learning models (random forest, quadratic discriminant analysis, support vector machine, etc.) in Python to predict the income class of Michigan residents (200,000+), and created a dashboard for data visualization and user interaction.

• Project Challenges: Redefined the income class classification criteria to reduce sample bias, restricted the sample to the employed population, handled outliers, and accounted for inflation, raising prediction accuracy from 46% to 77%.

• Dashboard: https://stats-507-375403.ue.r.appspot.com/

Weight Sensor Error Prediction Dec 2022 - Mar 2023

• Project Description: Built a weight sensor error prediction model for fitness equipment that notifies gyms when error correction is needed, rather than relying on gym members to report errors.

• Project Challenges: Different types of machines have different error distributions, so each machine needs its own error model. Basic machine learning models are not suitable because there is no clear relationship between the various factors and the errors, so complex time series models are required. Defining a reasonable range of errors improved the model's soundness and prediction accuracy.

• Project Results: The predicted error rate is 3.6%, and the result was highly praised by the client.

Tiffany&Co Market Strategy Research and Analysis Oct 2022 - Dec 2022

• Project Description: Conducted a comprehensive analysis of Tiffany&Co's US market strategy, analyzing issues such as a small market share, unclear positioning, and infrequent product updates, and providing corresponding solutions.

• Marketing Solutions: Partnering with young models to attract a younger target audience; expanding sub-brands to offer products that appeal to younger consumers while maintaining the brand image; aligning prices with brand positioning; and posting promotional posters on social media platforms such as Instagram and Snapchat from Sep 2022 to May 2024 to enhance brand image and awareness.

Michigan Data Science Team

Club member Sep 2022 - Current

• Dog and cat image classification: used TensorFlow and transfer learning to train a CNN model for dog and cat breed classification, achieving 80% validation accuracy for each.

• Strength training equipment & weight estimation: pre-processed the weight estimation data (data quality check), visualized the data to explore potential patterns in the estimation error and its related factors, built models (linear regression, lasso regression, decision tree, time series, etc.) to describe the estimation error, and forecast the error to notify the gym manager when equipment needs recalibration.

Statistics Class Projects Winter 2022

• COVID-19 Recovery Rate: used R and Bayesian linear regression to model the effect of several socioeconomic factors on the COVID-19 recovery rate before COVID-19 vaccines were produced, on 50k data points.

• Stock Price Forecasting: used R, self-selected variables, and multiple machine learning methods (random forest, classification, QDA, bootstrapping, decision trees, clustering, etc.) to predict the 10-minute-forward price of a specific stock on a specific day, using per-minute prices for 3 stocks from 500k data points.

• NYC taxi industry: produced a quantitative description of the NYC taxi industry over 3 years, including the average fare paid per ride, the monthly passenger volume, and the estimated tip proportion per ride, using weighting.

• Vaccine Effectiveness Evaluation: computed confidence intervals in R for the probability that people in different age groups contract COVID-19 after vaccination.


