Nguyen Hoang Khang
Data Engineer Intern/Fresher
094******* adxv72@r.postjobfree.com https://github.com/khanghoang1210 Ho Chi Minh City
OBJECTIVE
Data engineer intern seeking practical experience and professional growth in a dynamic environment. Passionate about leveraging data engineering skills to develop efficient data pipelines and contribute to data-driven solutions. Eager to work with advanced technologies and collaborate with a motivated team to optimize data storage, retrieval, and analysis processes. Committed to expanding knowledge and making a positive impact on the organization's data infrastructure.
EDUCATION
University of Science - VNUHCM 2021 - Present
Mathematics and Computer Science
GPA: 7.4
WORK EXPERIENCE
MindX Technology School 9/2022 - 11/2022
Teaching Assistant
Supported teachers in teaching programming subjects and in preparing lesson plans for classes. Participated in introductory teaching sessions for students with no prior exposure to programming.
ACTIVITIES
#QuanQuanGCP Challenge Season 4 9/2022 - 10/2022
Participant
Practiced Data Analytics lab exercises on Google Cloud Platform.
CERTIFICATIONS
Python Developer for AI - VTC Academy 7/2022
PySpark Real Time Projects - Udemy 5/2023
Data Engineering - AI For Everyone 6/2023
SKILLS
Programming Languages: Python, R, JavaScript, C/C++
Frameworks/Platforms: Apache Spark, Apache Hadoop, Apache Hive, Docker, Airflow
Database Management Systems: MS SQL Server, PostgreSQL, MongoDB
Core Qualifications
ETL (Extract, Transform, Load)
Data pipeline development & management
Knowledge of Data Lake and Data Warehouse concepts
Web crawling (BeautifulSoup, Scrapy)
Knowledge of the AWS platform (Lambda, Step Functions, Glue, S3)
Knowledge of Git, Docker, and Linux
Others
Understanding of REST APIs
Understanding of how web back ends work
Basic knowledge of statistics and machine learning algorithms
PROJECTS
Prescriber Data Pipeline
( 3/2023 - 5/2023 )
Client: Personal project
Description
Used PySpark on Docker to build a data pipeline that ingests, preprocesses, transforms, and extracts data, then loads it into Hadoop via Apache Hive.
GitHub link: https://github.com/khanghoang1210/prescriber-pipeline
Position: Developer
Responsibilities
Collect data
Build data pipeline
Deploy the Spark job to YARN
Technologies
Python, Docker
Frameworks: PySpark, Hadoop, Hive
Database: PostgreSQL
Book Data Pipeline
( 6/2023 - 6/2023 )
Client: Personal project
Description
Built a data pipeline on AWS that crawls and extracts data, ingests it into a data lake, then transforms and loads it into a data warehouse.
GitHub link: https://github.com/khanghoang1210/book_pipeline
Position: Developer
Responsibilities
Crawl data
Build data pipeline
Technologies
Python: BeautifulSoup
Cloud: Lambda, Step Functions, Glue, S3
ETL pipeline with Airflow
( 02/2023 - 02/2023 )
Client: Personal project
Description
Built an ETL pipeline in Python using Airflow's BashOperator and submitted the DAG to Airflow, which runs on Docker.
GitHub link: https://github.com/khanghoang1210/ETL-with-Airflow
Position: Developer
Responsibilities
Build ETL pipeline
Submit to Airflow
Technologies
Bash command line, Python, Airflow, Docker