Thanh Son Ha
****@***.*** LinkedIn 312-***-****
EDUCATION
California State Polytechnic University, Pomona Pomona, California
Master of Science in Business Analytics 2025
Business & Analytical Technologies Society Club
Member of CPP Data Science and Artificial Intelligence Club
University of Illinois at Chicago Chicago, Illinois
Bachelor of Information Decision Science – Business Analytics 2021
SKILLS
• Technical Skills: SQL, Python (pandas, NumPy, matplotlib), Tableau, Power BI, Machine Learning, Data Cleaning, Data Visualization.
• Analytics and Machine Learning: Predictive modeling, Sentiment Analysis, BERT NLP models, Classification.
EXPERIENCE
Parker Dewey - Atlas Testing Laboratories Pomona, California
Data Analyst Intern Current
• Conducted data cleaning and preprocessing using Python (pandas, NumPy) to ensure accuracy in metallurgical test data.
• Built predictive models to assess material performance and failure risks, improving quality control and reducing defects.
• Automated data extraction, transformation, and reporting using SQL and Power BI, cutting manual effort by 30%.
• Developed interactive dashboards to visualize trends in mechanical testing, enabling faster decision-making for engineers.
• Implemented statistical analyses (regression, ANOVA, hypothesis testing) to optimize testing procedures, reducing processing time by 15%.
• Collaborated with R&D and operations teams to enhance material evaluation workflows, supporting aerospace and automotive clients.
RWS Anaheim, California
Data Annotation June 2024 – January 2025
• Annotated large datasets for machine learning models, ensuring high accuracy and consistency.
• Collaborated with cross-functional teams to refine annotation guidelines and improve data quality.
• Contributed to enhancing model performance by providing precise and reliable data annotations.
Vinastar Corporation Anaheim, California
Data Entry May 2022 – December 2022
• Managed and maintained customer databases, ensuring 99% data accuracy across order records.
• Digitized hard-copy documents into a secure, organized digital system, improving retrieval efficiency.
• Identified and corrected data discrepancies, reducing errors and enhancing data integrity.
• Implemented routine data backups, strengthening data security and recovery measures.
PROJECTS
1. Case Study: Customer Retention – Telco Customer Churn Data
• Conducted Exploratory Data Analysis (EDA) to identify key factors influencing customer churn.
• Developed classification models and evaluated their performance using accuracy, precision, sensitivity, specificity, false discovery rate (FDR), false omission rate (FOR), ROC curves, and lift charts.
• Leveraged model insights to generate data-driven promotion strategies, optimizing customer retention and reducing churn.
2. Case Study: Yelp Review Analysis
• Used pre-trained BERT models to classify Yelp reviews into categories, processing 19,000+ data points for accurate text classification.
• Improved model accuracy by using train-validation data splits and tuning hyperparameters.
• Performed sentiment analysis to assess customer feedback trends, providing actionable insights for businesses.
3. Case Study: Stroke Prediction
• Analyzed and visualized the relationships between health variables to identify unhealthy habits and risk factors associated with strokes.
• Developed and evaluated models to predict the likelihood of a stroke from a range of health factors, using appropriate performance metrics to assess the models' accuracy and reliability.