Machine Learning Data Analyst

Location: Anaheim, CA
Posted: May 12, 2025

Resume:

Thanh Son Ha

****@***.*** LinkedIn 312-***-****

EDUCATION

California State Polytechnic University, Pomona Pomona, California

Master of Science in Business Analytics 2025

Business & Analytical Technologies Society Club

Member of CPP Data Science and Artificial Intelligence Club

University of Illinois at Chicago Chicago, Illinois

Bachelor of Information Decision Science – Business Analytics 2021

SKILLS

• Technical Skills: SQL, Python (pandas, NumPy, matplotlib), Tableau, Power BI, Machine Learning, Data Cleaning, Data Visualization.

• Analytics and Machine Learning: Predictive modeling, Sentiment Analysis, BERT NLP models, Classification.

EXPERIENCE

Parker Dewey - Atlas Testing Laboratories Pomona, California

Data Analyst Intern Current

• Conducted data cleaning and preprocessing using Python (pandas, NumPy) to ensure accuracy in metallurgical test data.

• Built predictive models to assess material performance and failure risks, improving quality control and reducing defects.

• Automated data extraction, transformation, and reporting using SQL and Power BI, cutting manual effort by 30%.

• Developed interactive dashboards to visualize trends in mechanical testing, enabling faster decision-making for engineers.

• Implemented statistical analyses (regression, ANOVA, hypothesis testing) to optimize testing procedures, reducing processing time by 15%.

• Collaborated with R&D and operations teams to enhance material evaluation workflows, supporting aerospace and automotive clients.

RWS Anaheim, California

Data Annotation June 2024 – January 2025

• Annotated large datasets for machine learning models, ensuring high accuracy and consistency.

• Collaborated with cross-functional teams to refine annotation guidelines and improve data quality.

• Contributed to enhancing model performance by providing precise and reliable data annotations.

Vinastar Corporation Anaheim, California

Data Entry May 2022 – December 2022

• Managed and maintained customer databases, ensuring 99% data accuracy across order records.

• Digitized hard-copy documents into a secure, organized digital system, improving retrieval efficiency.

• Identified and corrected data discrepancies, reducing errors and enhancing data integrity.

• Implemented routine data backups, strengthening data security and recovery measures.

PROJECTS

1. Case Study: Customer Retention – Telco Customer Churn Data

• Conducted Exploratory Data Analysis (EDA) to identify key factors influencing customer churn.

• Developed and evaluated classification models, assessing performance with accuracy, precision, sensitivity, specificity, false discovery rate (FDR), false omission rate (FOR), ROC curves, and lift charts.

• Leveraged model insights to generate data-driven promotion strategies, optimizing customer retention and reducing churn.

2. Case Study: Yelp Review Analysis

• Used pre-trained BERT models to classify Yelp reviews into categories, processing 19,000+ data points for accurate text classification.

• Improved model accuracy by implementing train-validation data splits, optimizing performance through hyperparameter tuning.

• Performed sentiment analysis to assess customer feedback trends, providing actionable insights for businesses.

3. Case Study: Stroke Prediction

• Analyzed and visualized the relationships between different health variables to identify unhealthy habits and risk factors associated with strokes.

• Developed and evaluated models to predict the likelihood of a stroke based on a range of health factors, using appropriate performance metrics to assess each model’s accuracy and reliability.


