Pham Nguyen Quoc Thanh
Data Engineer
Personal details
Pham Nguyen Quoc Thanh
phamquocthanh1804@gmail.com
Ho Chi Minh
Skills
Programming languages: Python, SQL
Database: PostgreSQL
Big data: Apache Airflow, Apache Hadoop, PySpark
Visualization: Tableau
Languages
English
Certificates
TOEIC 960
IBM Data Engineering by IBM on Coursera
Google Data Analytics by Google on Coursera
Machine Learning by Stanford University & DeepLearning.AI on Coursera
Profile
I am a fourth-year student at the University of Science, majoring in Data Science. I am seeking a Data Engineering internship to broaden my knowledge and learn new technologies.
Education
Bachelor Nov 2021 - Present
University of Science - VNUHCM, Ho Chi Minh
Projects
ETL pipeline with Leaguepedia API Sep 2024 - Oct 2024 Description:
Developed an ETL pipeline to extract, transform, and load data from the Leaguepedia API for comprehensive esports data analysis. Automated the ETL workflow using Apache Airflow to ensure timely data updates.
Analyzed data using PySpark to derive insights.
Technologies: Python, Airflow, PostgreSQL, PySpark, dbt-core, Docker
Link: https://github.com/quocthanh18/ETL-Leaguepedia
League of Legends Winrate Prediction Jun 2024 - Jul 2024 Description:
Used the Riot API to fetch match statistics for outcome prediction. Cleaned and preprocessed the data to ensure data integrity. Applied machine learning algorithms to predict match outcomes.
Technologies: Python, TensorFlow
Link: https://github.com/quocthanh18/LOL-V2
Document Clustering with MapReduce Mar 2024 - Mar 2024 Description:
Used Python to perform stemming, tokenization, and removal of stop words and contractions on the BBC dataset.
Calculated TF-IDF scores using the MapReduce paradigm. Implemented K-Means clustering using the MapReduce paradigm to group documents into meaningful clusters.
Technologies: Python, Java
Link: https://github.com/quocthanh18/Hadoop