VO MINH HIEU
Data Engineer
039*-***-*** · **********@*****.*** · github.com/hieuvm2911
OBJECTIVE
Seeking an entry-level role in Data Engineering to apply my knowledge of pipelines, ETL, and real-time data processing. In the short term, I aim to gain hands-on experience with distributed systems and cloud-based data workflows to strengthen my technical foundation. Over the long term, I aspire to become a specialized Data Engineer who can design scalable architectures and lead data infrastructure initiatives that drive business growth.
EDUCATION
Vietnam National University - Ho Chi Minh City University of Science (Oct 2020 - Mar 2025)
Bachelor of Information Technology
Major: Information Systems
GPA: 8.2
Relevant coursework: Data Structures and Algorithms, Data Visualization, Applied Data Analysis
Certification: VSTEP B2 (Listening 5.5 - Reading 6.5 - Writing 7.0 - Speaking 6.0)
SKILLS
Languages & Tools: Python, PL/SQL, Bash, Git, Docker
Data Engineering: Apache Spark, Kafka, Airflow, SSIS, ETL, Databricks
Cloud & Storage: Azure, Oracle, MySQL, MongoDB
Soft Skills: Problem Solving, Teamwork & Collaboration, Adaptability, Attention to Detail
PROJECTS
BUILDING A DATA WAREHOUSE: AIR QUALITY MONITORING // SQL, ETL, OLTP, OLAP, SSIS, SSAS (Oct-Dec 2024)
Course project - Vietnam National University - Ho Chi Minh City University of Science
Responsibilities:
- Designed a snowflake-schema data model to analyze U.S. air quality trends from 2021 to 2023.
- Developed ETL workflows using SSIS for data extraction, cleaning, transformation, and loading into the data warehouse.
- Developed OLAP cubes and MDX queries to explore AQI trends and quarterly state-level statistics.
Learning outcomes:
- Gained practical experience in data modeling, SSIS-based ETL, and OLAP cube development for analysis.
- Strengthened skills in MDX querying and data visualization to derive meaningful insights from complex datasets.
- Completed the project with a score of 9.3/10.
REAL-TIME DATA PIPELINE: MYSQL TO AZURE CLOUD // SPARK, KAFKA, AIRFLOW, AZURE (July 2025)
Personal project - GitHub repository
Responsibilities:
- Built a real-time CDC pipeline from MySQL to Azure using Debezium, Kafka, Spark Structured Streaming, and Airflow.
- Streamed binlog changes to Kafka, transformed data with Spark, and stored it in ADLS Gen2 & Azure SQL DWH.
- Orchestrated and containerized the entire pipeline with Airflow and Docker Compose.
Learning outcomes:
- Gained hands-on experience with cloud data integration (Azure Blob Storage, Azure SQL DWH).
- Strengthened orchestration skills using Airflow and operationalized the pipeline in a fully containerized environment.