Nguyen Van Hoai Nam
Data Engineer
Phone: 087******* | Email: ***************@*****.*** | Ho Chi Minh City, Viet Nam | hnamiuher

Objective
A quick learner with a strong self-learning drive, I aspire to become a proficient Data Engineer. Although I am still completing my academic program and have limited experience with real-world projects, I am actively refining and expanding my skills.
Academic Background
IUH - Industrial University of Ho Chi Minh City 08/2021 – Present
Information Technology, Data Science – Senior student
• Collaborated on numerous team projects over four years, developing skills in presentation, requirements analysis and design, web design, data crawling, analytical thinking, and building dashboards with Power BI.
• Member of the Data Innovation Lab, fostering knowledge exchange among Data Science and Computer Science students.
• Achieved second prize in the university’s Eureka competition.
• Had a paper accepted at the Youth Scientist Conference at IUH.
• GPA: 3.39/4.0
Projects
Housing Price Prediction with ML and Power BI Dashboard
GitHub: Project Repository
Description: This project collects real estate data from i-batdongsan.com to analyze and predict house prices using various Machine Learning models. The processed data was visualized in Power BI to provide insights into housing prices across Vietnam.
Technologies: Selenium, BeautifulSoup, Scrapy, NumPy, Pandas, Linear Regression, XGBoost Regression, GradientBoostingRegressor
Video Game Sales Report
GitHub: Project Repository
Description: This project uses Python libraries and SQL to clean data and generate Power BI visualizations illustrating the performance of popular gaming platforms worldwide.
Technologies: NumPy, Pandas, Matplotlib, Seaborn, SQL, Power BI

Real-time Streaming Data from Binance using Kafka and Airflow
GitHub: Project Repository
Description: This project primarily uses Docker and Big Data images to stream data into a database, with Apache Kafka as the message broker and Apache Airflow for workflow orchestration. The Kafka-based data streaming is complete; the remaining components are still in development.
Technologies: Docker, Apache Kafka, Apache Spark, Apache Airflow, PostgreSQL, Cassandra
Skills
Programming
Python, C, Java, SQL
Communication
English (intermediate reading and listening); certificates to be added after May.

Soft Skills
Communication, teamwork, collaboration, critical thinking

Technical Skills
• Performing data visualization, storytelling, and diagnostic analytics using Power BI.
• Using SQL to query data warehouses and optimize queries for data reporting.
• Building predictive models using Machine Learning algorithms and statistical analysis with Python.
• Performing web scraping and data crawling using tools like Selenium, BeautifulSoup, and Scrapy to collect structured and unstructured data from websites and APIs for analysis.
• Designing and implementing ETL pipelines to extract data from multiple sources, transform it for analysis, and load it into databases or data warehouses for further processing.
• Building automated database pipelines to process large volumes of data, including integrating different data sources, scheduling data updates, and ensuring smooth data flow using tools such as Apache Airflow and SQL.
Tools and Frameworks
Selenium, BeautifulSoup, Scrapy, Power BI, Apache Airflow, PySpark, Apache Kafka, Hadoop, Git, PostgreSQL