Nguyen Van Hoai Nam
Data Engineer
Phone: 087******* | Email: ***************@*****.*** | Ho Chi Minh City, Viet Nam | hnamiuher

Objective
A quick learner with a strong self-learning drive, I aspire to become a proficient Data Engineer. Although I am still completing my academic program and have limited experience with real-world projects, I am actively refining and expanding my skills.
Academic Background
IUH - Industrial University of Ho Chi Minh City 08/2021 – Present
Information Technology, Data Science – Senior student
• Collaborated on numerous team projects over four years, developing skills in presentation, requirements analysis and design, web design, data crawling, analytical thinking, and building dashboards with Power BI.
• Member of the Data Innovation Lab, fostering knowledge exchange among Data Science and Computer Science students.
• Achieved second prize in the university’s Eureka competition.
• Had a paper accepted at the Youth Scientist Conference at IUH.
• GPA: 3.39/4.0
Projects
Housing Price Prediction with ML and Power BI Dashboard
GitHub: Project Repository
Description: This project collects real estate data from i-batdongsan.com to analyze and predict house prices using various Machine Learning models. The processed data was visualized in Power BI to provide insights into housing prices across Vietnam.
Technologies: Selenium, BeautifulSoup, Scrapy, NumPy, Pandas, Linear Regression, XGBoost Regression, GradientBoostingRegressor
Video Game Sales Report
GitHub: Project Repository
Description: This project uses Python libraries and SQL to clean data and generate Power BI visualizations illustrating the performance of popular gaming platforms worldwide.
Technologies: NumPy, Pandas, Matplotlib, Seaborn, SQL, Power BI

Real-time Streaming Data from Binance using Kafka and Airflow
GitHub: Project Repository
Description: This project primarily uses Docker and Big Data images to stream data into a database, with Apache Kafka as the message broker and Apache Airflow for workflow orchestration. The Kafka-based data streaming is complete; the remaining components are still in development.
Technologies: Docker, Apache Kafka, Apache Spark, Apache Airflow, PostgreSQL, Cassandra
Skills
Programming
Python, C, Java, SQL
Communication
English (intermediate reading and listening); certificates to be added after May.

Soft Skills
Communication, teamwork, collaboration, critical thinking

Technical Skills
• Performing data visualization, storytelling, and diagnostic analytics using Power BI.
• Using SQL to query data warehouses and optimize queries for data reporting.
• Building predictive models using Machine Learning algorithms and statistical analysis with Python.
• Performing web scraping and data crawling using tools like Selenium, BeautifulSoup, and Scrapy to collect structured and unstructured data from websites and APIs for analysis.
• Designing and implementing ETL pipelines to extract data from multiple sources, transform it for analysis, and load it into databases or data warehouses for further processing.
• Building automated database pipelines to process large volumes of data, including integrating different data sources, scheduling data updates, and ensuring smooth data flow using tools such as Apache Airflow and SQL.
Tools and Frameworks
Selenium, BeautifulSoup, Scrapy, Power BI, Apache Airflow, PySpark, Apache Kafka, Hadoop, Git, PostgreSQL