
Senior Data Scientist

Location:
New York City, NY
Posted:
May 20, 2025


Resume:

Lizhihan (Lizy) Yu

352-***-**** ******@*****.*** New York, NY

EDUCATION

GEMOLOGICAL INSTITUTE OF AMERICA New York, NY

Graduate Gemologist GPA 3.7/4.0 Expected 07/2025

GEORGIA INSTITUTE OF TECHNOLOGY Atlanta, GA

Master of Science in Computer Science GPA 3.8/4.0 05/2023

• Main courses: Deep Learning, Machine Learning for Trading

TEXAS A&M UNIVERSITY College Station, TX

Doctor of Philosophy in Agricultural Economics/Minor Major: Applied Statistics GPA: 3.6/4.0 08/2021

• Main courses: Microeconomics, Game Theory, Econometrics, Agricultural Finance, Data Mining, Database

Master of Science in Medical Sciences GPA: 3.6/4.0 12/2017

• A Non-antibody Therapy for STEC Infection & Medical Experiments in Drug Discovery, Advised by Warren E. Zimmer

WORK EXPERIENCE

Nielsen Company Remote

Senior Data Scientist, Global Marketing Mix Data Science Team: Seasonal Effects Modeling 11/2022 – 12/2024

• Demand Forecasting: Developed and productionized an Airflow-automated regression model to generate trend and seasonality proxies from holiday data, Google Trends, and client inputs, boosting seasonal demand forecast accuracy by 20%

• Marketing Mix Stabilization: Integrated COVID-19 and economic shock indicators into forecasting models, applied L regularization to mitigate multicollinearity, improving forecast accuracy by 16%

• Marketing Measurement Effectiveness: Built and deployed machine learning models in Airflow to quantify the impact of promotions, media, seasonality, and holidays on sales, helping optimize campaign ROI and budget allocation

• Internal Chatbot: Contributed to building a company-specific chatbot using Meta’s Llama model, delivering tailored responses via a FastAPI interface and enabling secure querying of internal company knowledge

Nationwide Mutual Insurance Company Columbus, OH

Data Scientist, Analytics Technology & HPC Team: Auto Insurance Pricing 06/2021 – 11/2022

• Auto Pricing and Policy Alignment: Designed telematics driving behavior features using real-time GPS data, analyzed large-scale datasets with AWS EC2 and EMR, delivering key insights for auto insurance pricing and supporting the passage of Ohio House Bill 283

• Sales Prediction: Designed and implemented an end-to-end machine learning pipeline using 365 demographic and historical variables to predict annuity advisor sales, providing data-driven insights to support annuity sales strategies for stakeholders

• Deployment: Deployed three production models on AWS SageMaker: cash flow forecasting (ARIMA), portfolio optimization (mean-variance), and annuity persistency prediction (random forest), reducing average model inference time by 52%

• Model Risk: Conducted annual model risk assessments for life annuity cash flow models, evaluating model design, data quality, performance, and compliance with governance standards to ensure stability

• High-Impact Communication: Trained 150 data science associates on Git, AWS SageMaker and Kubeflow

• Mentor Intern: Mentored an intern on an end-to-end telematics pricing project using PySpark and XGBoost

Data Science Intern, Enterprise Analytics Office: Auto Indication 05/2020 – 08/2020

• Auto Indication: Developed an economic rule-based pipeline for automatic dimensionality reduction on high-dimensional datasets, reducing variable selection time from 9 days to 20 minutes and saving ~$6,000 per run

APPLIED RESEARCH & PUBLICATIONS

Case Studies on Medical Laboratory Workflow Optimization 08/2018 – 05/2020

• Utilized medical and economic expertise to conduct field interviews, collecting data to map medical laboratory workflow

• Built a multi-objective Mixed-Integer Programming (MIP) model for inventory management and machine scheduling that generated actionable operational recommendations, resulting in $80K in annual profit for a previously break-even company

Grass-fed Beef Market Penetration 09/2020 – 04/2021

• Conducted market demand analysis to identify downstream needs and opportunities for organic food, applying a fixed-effects model to consumer survey panel data

• Built a predictive model to forecast categorized organic beef market share and price premium by selecting the best-performing causal-effect model, combined with robustness tests

SKILLS

• Programming: Git, Python (PySpark, TensorFlow, NumPy, Pandas, Scikit-learn), R, MySQL, Java, Linux, Stata, MATLAB

• MLOps: AWS (S3, SageMaker, Kubeflow, Athena, CloudWatch, EC2, EMR), Kafka, Tableau, Airflow

• Modelling: Machine/Deep Learning, Causal Inference, Econometrics, Time Series Analysis, Optimization


