
Machine Learning Data Science

Location:
Quan Tan Binh, 72100, Vietnam
Posted:
October 31, 2025


Resume:

Phạm Tấn Phước

Student ID: ********

082******* **************@*****.*** linkedin.com/in/tấn-phước-phạm-3363162b0 github.com/PhamTanPhuoc66

Summary

Detail-oriented Data Science student with a strong foundation in statistics, machine learning, and deep learning. Experienced in building end-to-end ML pipelines and computer vision systems. Passionate about applying data-driven methods to real-world problems and eager to grow as a Data Science Intern.

Skills

Programming: Python, R, SQL, C/C++

Libraries/Frameworks: scikit-learn, PyTorch, TensorFlow, Pandas, NumPy, Matplotlib, dbt, Airflow

Machine Learning: Regression, Classification, Clustering, PCA, Decision Tree, Random Forest, XGBoost, Deep Learning

Statistical Methods: Hypothesis Testing, Confidence Interval, ANOVA, Regression Analysis

Databases: SQL (PostgreSQL, MySQL), NoSQL (MongoDB, DynamoDB)

MLOps & Cloud: MLflow, Docker, AWS, KServe, Kubernetes

Education

University of Science (VNU-HCM) 2022 – Present

Bachelor of Science in Data Science

Certifications

– TOEIC: Listening & Reading 935/990; Speaking & Writing 280/400

– Coursera Specializations: Data Analysis, Machine Learning, Deep Learning

Projects

End-to-End Machine Learning & Deep Learning Data Pipeline

– Built a complete ML pipeline using Databricks, Airflow, and dbt within a Lakehouse (Bronze–Silver–Gold) architecture.

– Developed churn prediction and recommendation models using scikit-learn, Surprise, and PyTorch (collaborative filtering & neural recommenders).

– Automated experiment tracking and model deployment with MLflow + KServe (auto-scaling to zero).

– Technologies: Databricks, Airflow, dbt, MLflow, KServe, PyTorch, scikit-learn, Surprise.

– GitHub: github.com/PhamTanPhuoc66/End-to-end-olist-project

Body Performance Analysis Project

– Analyzed a 13k-record fitness dataset to study correlations between body metrics and performance levels (A–D).

– Conducted EDA, feature engineering (fitness_score, pulse_pressure), visualization, and statistical testing (t-test, ANOVA, permutation test).

– Trained and compared several classification models (Naive Bayes, LDA, Decision Tree, Random Forest, XGBoost, etc.); Random Forest performed best.

– Technologies: R, tidyverse, ggplot2, caret, randomForest, xgboost, corrplot.

Real-Time Face Recognition & Attendance (YuNet + PCA)

– Developed a real-time face recognition system using YuNet (OpenCV) for detection and PCA (Eigenfaces) + KNN for recognition.

– Implemented FastAPI WebSocket streaming with async processing, batching, and multi-client support for low-latency inference.

– Designed a modular backend and integrated YOLOv8 (FP16, CUDA) for an optional object-detection demo.

– Technologies: FastAPI, asyncio, OpenCV, PCA, KNN, YuNet, YOLOv8, WebSocket.
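
The async batching mentioned in this project can be illustrated with a minimal sketch. It is not the project's actual code: the names `batch_worker`, `submit`, and `fake_infer`, and the `max_batch` / `max_wait` values, are hypothetical, and the recognizer is replaced by a placeholder. The sketch uses only the standard-library `asyncio` module. Requests from several clients are queued, grouped into small batches, and each caller receives its own result through a future.

```python
import asyncio
import contextlib


async def batch_worker(queue, infer_batch, max_batch=4, max_wait=0.01):
    """Group queued (item, future) pairs into batches and resolve each future."""
    loop = asyncio.get_running_loop()
    while True:
        # Block until at least one request is available.
        batch = [await queue.get()]
        deadline = loop.time() + max_wait
        # Keep collecting until the batch is full or the wait budget is spent.
        while len(batch) < max_batch:
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(queue.get(), remaining))
            except asyncio.TimeoutError:
                break
        # One inference call for the whole batch (placeholder for a real model).
        results = infer_batch([item for item, _ in batch])
        for (_, future), result in zip(batch, results):
            if not future.done():
                future.set_result(result)


async def submit(queue, item):
    """Enqueue one request and wait for its individual result."""
    future = asyncio.get_running_loop().create_future()
    await queue.put((item, future))
    return await future


def fake_infer(frames):
    # Hypothetical stand-in for the recognizer: returns one label per frame.
    return [f"id-for-{frame}" for frame in frames]


async def demo(n_requests=6):
    queue = asyncio.Queue()
    batch_sizes = []

    def infer(frames):
        batch_sizes.append(len(frames))  # record how requests were grouped
        return fake_infer(frames)

    worker = asyncio.create_task(batch_worker(queue, infer))
    results = await asyncio.gather(
        *(submit(queue, f"frame{i}") for i in range(n_requests))
    )
    # Stop the worker cleanly once all requests are answered.
    worker.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await worker
    return results, batch_sizes


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a WebSocket server, each connection handler would call something like `submit` per frame, so slow clients do not block others and the model runs on batches rather than single frames.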


