Minh Sang Nguyen
Data Engineer
037******* ****************@*****.*** https://github.com/minhsangit1008 Ho Chi Minh City
PROFESSIONAL SUMMARY
Data Engineer with hands-on experience building data pipelines, orchestrating workflows, and designing scalable data solutions on AWS and Azure. Proficient in Python, SQL, and modern data platforms including AWS Glue, Azure Data Factory, Microsoft Fabric, and Iceberg. Passionate about leveraging cloud and big data technologies to deliver reliable, high-quality, and business-driven insights.
SKILLS
Technical Skills
Programming & Scripting: Python (Pandas, PySpark, Requests, BeautifulSoup, Selenium, Boto3), SQL, Trino/Presto
Data Processing & ETL: AWS Glue, PySpark, Data Catalog, Schema Standardization, Data Cleaning, Data Transformation, Iceberg Tables, ETL Automation, Microsoft Fabric (Data Engineering & Lakehouse workflows), Azure Data Factory
Cloud & Big Data Tools: AWS S3, AWS Athena, AWS Lambda, AWS Lake Formation, Amazon MWAA (Airflow), S3 Lifecycle Rules, AWS DataSync, Lakehouse (Delta), Fabric Notebooks (collaborative PySpark/Spark SQL), Fabric Pipelines and Dataflows
Data Analysis & Visualization: Power BI (Interactive Dashboards, KPIs, Trend Analysis), Python charting
File Formats & Integration: CSV, Excel (XLSX), JSON, Parquet
Workflow Orchestration: Airflow DAGs, Task Scheduling, Data Pipeline Monitoring, Fabric Pipelines
Version Control & Collaboration: Git/GitHub
Other Tools: Excel Advanced Functions, Web Crawling & Scraping
English Skill
Intermediate working proficiency
Soft Skills
• Effective communication and teamwork
• Logical thinking and data analysis
• Proactive learning and good time management
Critical Thinking
• Observe and listen
• Understand the key issues of the work
• Propose the optimal solution
WORK EXPERIENCE
Rackspace Technology 06/2025 - 11/2025
Data Engineer Intern
• Designed and maintained ETL pipelines using AWS Glue (PySpark) to ingest multi-format data into the Glue Data Catalog and Iceberg tables.
• Developed AWS Lambda functions to automate ingestion, file processing, and integration with S3 and Athena.
• Optimized S3 storage structure, schema design, and partitioning to improve query performance and reduce cost.
• Configured AWS Lake Formation to enforce secure and compliant data access across teams.
• Orchestrated data workflows using Amazon MWAA (Airflow) for scheduling, dependency management, and data validation.
• Improved Glue ETL job performance by enhancing transformation logic and optimizing storage layout.
• Documented pipeline workflows, architecture, and operational procedures for team use and knowledge sharing.
Freelancer 08/2025 - 09/2025
ETL Pipelines on Azure Data Factory & Microsoft Fabric - Filltrona
• Built metadata-driven ETL pipelines (source → staging → bronze → silver) using ADF.
• Developed Fabric PySpark notebooks for data cleaning and incremental loading.
• Applied SQL window functions for deduplication and data quality checks.
• Maintained lineage consistency and automated ingestion for multiple datasets.
Freelancer 06/2025 - 07/2025
Data Crawler & Analyst
• Built automated data collection pipelines using Python (Requests, BeautifulSoup, Selenium).
• Cleaned, standardized, and integrated multi-source datasets to support analytics.
• Created dashboards and KPIs in Power BI to provide insights into user behavior and business trends.
• Crawled tax data, matched records via fuzzy logic, and extracted company info from TopCV.
• Automated ~90% of the workflow; processed ~500 companies in under 2 hours.
• Optimized crawling and processing workflows, reducing total processing time by 30%.
Personal Project
Data Warehouse for Product Inventory Management
• Built a Data Warehouse for analyzing product inventory using data from AdventureWorks2012. Designed a Star Schema model, developed ETL processes using SSIS, and created Power BI dashboards for inventory status, low-stock alerts, and inventory turnover analysis.
• Tech: SQL Server, SSIS, Power BI
EDUCATION
Ton Duc Thang University 2021 - 2024
Information Technology – Software Engineering – Data Engineer Orientation
Senior student majoring in Software Engineering (high-quality program)
Cole 2024 - 2025
Data Engineer major
Learned in-depth data engineering skills and implemented real-world projects
CERTIFICATION
Colevn Big Data Certificate
Aptis ESOL 154 (IELTS 5.5)
INTERESTS
Learn new technologies in data and AI
Join technology forums, learn from the data engineering community
Read books, listen to music, and play chess