
Data Analytics & Insightful ML Intern

Location:
St. Louis, MO
Posted:
February 03, 2026


Resume:

Zimin (Greyson) Wang

+1-314-***-**** ********@*****.*** LinkedIn

EDUCATION

Washington University in Saint Louis, McKelvey School of Engineering Saint Louis, MO
Master of Information Systems Management, Information Systems Management, GPA: 3.89/4.0 Sep 2024 - Dec 2025
Coursework: Data Management, Analytics Application, Machine Learning and Pattern Classification, Generative AI & LLM

University of California, Santa Barbara Santa Barbara, CA
Bachelor of Arts, Statistics and Data Science Sep 2018 - Jun 2022
Coursework: Data Science Principles, Probability & Statistics, Regression Analysis, Analysis, Bayes Data Analysis, Time Series

PROFESSIONAL EXPERIENCE

Innova AI Tech LLC San Francisco, CA

Data Analyst Intern May 2025 - Aug 2025

• Analyzed 1M+ e-commerce transaction records using RFM metrics and K-Means clustering in Python, segmenting customers into high-value, at-risk, and dormant groups to inform targeted marketing strategies that improved repeat purchase rate by 12%.

• Designed and analyzed segment-based A/B tests for promotional campaigns, performing statistical significance validation and lift analysis to identify high-performing offers that increased conversion rate by 18% and improved marketing ROI.

• Collaborated with marketing and product teams to translate clustering and experimentation insights into actionable campaign recommendations, enabling more precise customer targeting and contributing to a 10%+ increase in customer lifetime value.

• Developed data visualizations and summary dashboards using Python and Tableau to communicate RFM distributions, cluster behaviors, and experiment outcomes, accelerating data-driven decision-making.

Bestone Payment Co., Ltd. Shandong, China

Data Analyst Intern Jan 2025 - May 2025

• Built a scalable ETL data pipeline to process millions of historical credit card transaction records using Python and SQL, identify fraud patterns, and engineer behavioral and transaction-level features, improving model signal quality and fraud detectability.

• Developed fraud prediction models using Logistic Regression and Random Forest, applying cross-validation with ROC-AUC and precision-recall evaluation to balance fraud detection against false positive rates, achieving a 20%+ improvement in recall at fixed precision thresholds.

• Conducted threshold tuning and cost-sensitive analysis to optimize decision rules, quantifying trade-offs between fraud loss reduction and customer friction, and contributing to a 15% reduction in fraudulent transaction losses while maintaining customer approval rates.

• Designed monitoring dashboards via Power BI to track model performance, fraud rates, and false positives over time, enabling risk teams to identify model drift early and supporting faster, data-driven adjustments to fraud strategies.

PROJECTS EXPERIENCE

Analysis of COVID-19: How Socio-economic Indicators Impact Serious Pandemic Jan 2025 - May 2025

• Conducted retrospective statistical analysis to examine relationships between socio-economic indicators (GDP per capita, happiness indices) and COVID-19 outcomes, leveraging regression and correlation techniques to identify significant drivers of infection severity.

• Designed and built predictive models to forecast post-quarantine infection trends using time-series and regression-based approaches, validating predictions against WHO-reported data to demonstrate robustness and generalizability.

• Engineered features and performed data normalization to improve model stability, enabling clearer interpretation of socio-economic impacts on public health outcomes across regions.

• Communicated analytical insights through structured reports and visualizations, supporting data-driven discussions on public policy effectiveness and pandemic response strategies.

Classification of Red Wine Quality Using Machine Learning Jan 2025 - May 2025

• Constructed and evaluated multiple classification models including Support Vector Machine (SVM), Artificial Neural Network (ANN), and Random Forests using 11 physicochemical features, systematically comparing performance across accuracy and precision metrics.

• Optimized model hyperparameters through cross-validation and feature scaling, achieving a final classification accuracy of 88.4% while maintaining strong generalization on unseen data.

• Performed feature importance analysis to identify key drivers of wine quality ratings, translating technical model outputs into interpretable insights relevant for quality control decisions.

• Documented model assumptions, limitations, and performance trade-offs to ensure reproducibility and guide practical model selection decisions for quality control and analytical reporting.

TECHNICAL SKILLS

Programming & Query: SQL (PostgreSQL / MySQL / BigQuery); Python (Pandas / NumPy); Excel (Pivot Tables / VLOOKUP / XLOOKUP / Power Query)

Analytics & Data Processing: ETL / Data Processing; EDA; Data Cleaning; KPI Tracking; Trend Analysis; Cohort Analysis; Funnel Analysis; Segmentation; Root Cause Analysis; Data Quality Checks / Validation

Modeling for Analytics: Linear / Logistic Regression; Forecasting; Baseline Models; Model Interpretation for Business Insights

Experimentation & Causal Analysis: A/B Testing; Hypothesis Testing; Confidence Intervals; Metric Definition & Interpretation

Visualization & BI: Tableau / Power BI / Looker; Dashboard Design; Automated Reporting; Data Storytelling / Insight Communication

Cloud & Big Data: Cloud Data Warehouses (BigQuery / Snowflake); Large-scale Datasets; Query Optimization; Cloud-based BI Integration

Collaboration: Stakeholder Management; Requirements Gathering; Documentation (Metric Definitions / Data Dictionary)


