SUMMARY
EDUCATION
PERSONAL PROJECT
SKILLS
CERTIFICATIONS
K3/2 Bien Hoa, Dong Nai
ad4dcq@r.postjobfree.com
LUU GIA BAO
Data Engineer Intern
Passionate and driven Data Engineering enthusiast seeking an internship opportunity to apply and further develop skills in Python, SQL, Data Warehousing, PySpark, Azure Data Factory, and Power BI. Eager to contribute to meaningful projects while gaining hands-on experience in the field.
Jan 2024 - Feb 2024
Overview: This project aims to build a dashboard for F1 racing data visualization, providing insights into various aspects of Formula 1 races.
Github: https://github.com/lgbao123/f1
Technologies: Databricks with PySpark, Azure Data Lake, Azure SQL Database, Azure Data Factory, Power BI
Responsibilities:
Data Ingestion: Data is fetched from the Ergast API (an experimental web service that provides a historical record of motor racing data for non-commercial purposes) using Databricks with PySpark.
Data Storage: Ingested data is stored in Azure Data Lake for further processing.
Data Transformation: Data is transformed as per requirements using Databricks.
Data Warehousing: Transformed data is stored in Azure SQL Database for efficient querying.
Dashboard Creation: Power BI is used to create interactive and insightful dashboards.
Pipeline Orchestration: Azure Data Factory is used to schedule the pipeline weekly and to monitor it.
Data Engineer
2019 – Now
GPA: 8/10
Awards
Python
SQL
Spark
Azure (Azure Data Factory, Azure Synapse, Data Lake)
MySQL, SQL Server
Power BI
Solid understanding of Data Structures and Algorithms
Familiarity with data warehousing concepts
TOEIC: 765
HackerRank SQL (Intermediate) Certificate
HCMC University of Technology and Education
F1 Racing Dashboard Project
lgbao123
Academic scholarship, 2021
COVID-19 Project: Implementing a Data Engineering Pipeline within Azure Data Factory
Feb 2024 - Mar 2024
Overview: The project used data from Our World in Data to track daily cases, deaths, and hospital admissions related to COVID-19 in Europe. The data comprises four CSV files: Cases and Deaths, Hospitalizations, Testing, and Country.
Github: https://github.com/lgbao123/covid19
Technologies: Azure Data Factory, Azure SQL Database, Power BI
Responsibilities:
Files are fetched from Our World in Data into a data lake storage account using the HTTP connector.
Data is processed in Azure Data Factory through data pipelines and data flows.
The transformed data is then published to Azure SQL Database, which serves as the data warehouse.
Finally, the data is retrieved from the SQL Database and visualized in Power BI.