Yiwen Yuan
***********@*****.*** +1-872-***-**** Authorized to work in the U.S.
Summary
• Ph.D. in Statistics with over six years of hands-on experience in data science, machine learning, and analytics projects, including internships in business sectors and academic research.
• Strong exposure to data modeling and machine learning (ML), with experience in developing and deploying models using Natural Language Processing (NLP), Large Language Models (LLMs), Neural Networks, and AI/ML frameworks such as PyTorch and TensorFlow.
• Skilled in generating business insights, deploying ML/DL models in production, and designing A/B tests while addressing experimentation bias.
• Proficient in cloud-based ML platforms (AWS, Azure, GCP), version control with GitHub, and MLOps practices.
• Experienced with SQL and Exploratory Data Analysis on large datasets.
• Strong ability to collaborate with technical teams to optimize and automate model development and deployment processes.
Education
Ph.D. in Statistics - Bowling Green State University, Bowling Green, OH Aug 2020 - Aug 2024
Dissertation Topic: Lasso Method with SCAD Penalty for Estimation and Variable Selection in Sequential Models
• Enhanced predictive accuracy and reduced risk by 76% on large datasets.
• Applications in finance and healthcare, supporting forecasting and risk management.
Award: J. Robert & G. Overman Award
M.A. in Applied Statistics - Bowling Green State University, Bowling Green, OH Aug 2018 - May 2020
Experiences
Business Insights Intern (R Studio, R Shiny, SQL, Snowflake) May 2022 – Aug 2022
Welltower Inc. Toledo, OH
• Built and deployed end-to-end predictive pricing models using UK demographic data for senior housing costs.
• Collected and analyzed demographic data from diverse sources using R and SQL to extract business insights.
• Developed and refined data models to accurately reflect the organization’s data structures and business needs.
• Created interactive dashboards and reports using R Shiny for real-time decision support.
• Optimized data visualization and model selection processes, reducing the time required by 20%.
• Designed and deployed a heatmap in R Studio to visualize model prediction results and integrated it into an R Shiny dashboard, effectively communicating findings to stakeholders.
Business Insights Intern (R Studio, R Shiny, Snowflake, SQL, Python) May 2021 – Aug 2021
Welltower Inc. Toledo, OH
• Self-taught Python and developed a web scraper to automate data collection, reducing costs by 30%.
• Improved data integration efficiency by 15% by replacing manual SQL processes with R Studio workflows.
• Processed and analyzed demographic data to uncover insights supporting senior housing strategies.
• Trained models, fine-tuned parameters, and generated forecasts to align financial projections with business needs.
• Created heatmaps in R Shiny to visualize model predictions, improving stakeholder engagement.
Graduate Teaching Associate Aug 2020 – Feb 2023
Bowling Green State University Bowling Green, OH
• Independently led undergraduate Statistics courses, fostering a collaborative and engaging learning environment.
• Assessed and provided constructive feedback on students' classwork, assignments, and tests to support their academic growth.
• Designed, evaluated, and revised curricula, course materials, and teaching methods to enhance student understanding and engagement.
Projects
Image Classification with Neural Networks (Python)
• Developed and trained a neural network for image classification using TensorFlow and Keras.
• Implemented forward/backward propagation and gradient descent to optimize the model.
• Achieved 92% test accuracy by fine-tuning hyperparameters and analyzing performance with Matplotlib to detect overfitting issues.
Multivariate Statistics Design (R Studio)
• Analyzed the quality of white wine using R, leveraging multiple variables across thousands of observations.
• Applied multivariate normality tests, cluster analysis, classification, and principal component analysis.
• Selected and implemented the most suitable models and algorithms for accurate predictions.
Spam Identification Using Machine Learning (R Studio)
• Built supervised ML models in R Studio for spam detection.
• Applied discriminant analysis (LDA, QDA), tree-based methods, support vector machines (SVM), and MARS.
• Used cross-validation, bootstrapping, and random forests for model evaluation and selection.