VO MINH HIEU
Data Engineer
039*-***-*** · **********@*****.*** · github.com/hieuvm2911
OBJECTIVE
Seeking an entry-level role in Data Engineering to apply my knowledge of pipelines, ETL, and real-time data processing. In the short term, I aim to gain hands-on experience with distributed systems and cloud-based data workflows to strengthen my technical foundation. Over the long term, I aspire to become a specialized Data Engineer who can design scalable architectures and lead data infrastructure initiatives that drive business growth.
EDUCATION
Vietnam National University - Ho Chi Minh City University of Science (Oct 2020 - Mar 2025)
Bachelor of Information Technology
Major: Information Systems
GPA: 8.2
Relevant coursework: Data Structures and Algorithms, Data Visualization, Applied Data Analysis
Certification: VSTEP B2 (Listening 5.5 - Reading 6.5 - Writing 7.0 - Speaking 6.0)
SKILLS
Languages & Tools: Python, PL/SQL, Bash, Git, Docker
Data Engineering: Apache Spark, Kafka, Airflow, SSIS, ETL, Databricks
Cloud & Storage: Azure, Oracle, MySQL, MongoDB
Soft Skills: Problem Solving, Teamwork & Collaboration, Adaptability, Attention to Detail
PROJECTS
BUILDING A DATA WAREHOUSE: AIR QUALITY MONITORING // SQL, ETL, OLTP, OLAP, SSIS, SSAS (Oct-Dec 2024)
Course project - Vietnam National University - Ho Chi Minh City University of Science
Responsibilities:
- Designed a snowflake-schema data model to analyze U.S. air quality trends from 2021 to 2023.
- Developed ETL workflows using SSIS for data extraction, cleaning, transformation, and loading into the data warehouse.
- Developed OLAP cubes and MDX queries to explore AQI trends and quarterly state-level statistics.
Learning outcomes:
- Gained practical experience in data modeling, SSIS-based ETL, and OLAP cube development for analysis.
- Strengthened skills in MDX querying and data visualization to derive meaningful insights from complex datasets.
- Completed the project with a score of 9.3/10.
REAL-TIME DATA PIPELINE: MYSQL TO AZURE CLOUD // SPARK, KAFKA, AIRFLOW, AZURE (July 2025)
Personal project - GitHub repository
Responsibilities:
- Built a real-time CDC pipeline from MySQL to Azure using Debezium, Kafka, Spark Structured Streaming, and Airflow.
- Streamed binlog changes to Kafka, transformed data with Spark, and stored it in ADLS Gen2 & Azure SQL DWH.
- Orchestrated and containerized the entire pipeline with Airflow and Docker Compose.
Learning outcomes:
- Gained hands-on experience with cloud data integration (Azure Blob Storage, Azure SQL DWH).
- Strengthened orchestration skills using Airflow and operationalized the pipeline in a fully containerized environment.