TRAN DUY ANH THUAN
DATA ENGINEERING INTERN
Phone: 039*******
Address: *** *** *****, ***** ***, Hanoi
Email: ***********@*****.***
GitHub: https://github.com/Thu2001
SUMMARY
I am a recent Software Engineering graduate with a strong interest in data, seeking a Data Engineer Fresher position. Through personal projects and previous internships, I have consolidated my knowledge of ETL, Spark, Hadoop, and related technologies, and I have solid SQL and Python programming skills. I am committed to applying this knowledge and continuing to learn while contributing to real-world data projects.
WORK EXPERIENCE
SOFTWARE DEVELOPMENT INTERN, THONG NHAT HOSPITAL
Build and operate an end-to-end ETL pipeline to extract, transform, and load data from CSV files into a PostgreSQL database.
Use Apache Spark to process, transform, and clean large datasets efficiently.
Automate and schedule daily ETL pipeline runs with Apache Airflow.
Use Python as the main programming language to process data and connect the pipeline components.
Store processed data in PostgreSQL or Microsoft SQL Server, with dbt used for normalization.
Develop kiosk software that automates patient registration and information lookup, improving operational efficiency.
Develop and integrate APIs that connect to the hospital's central database, optimizing data access and improving user and process efficiency.
Achievements: Significantly reduced waiting times for both patients and staff while improving the patient experience.
June 2023 - June 2024
CONTRACT-BASED GAME DEVELOPER, BINH MINH EDUCATIONAL COMPANY
Design and develop educational games focused on psychology and communication, achieving significant improvements in user engagement.
Develop database and API solutions for the games, supporting game features and improving data management efficiency.
Implement web data collection and use a Databricks Lakehouse to process big data in support of content creation and in-depth analysis.
July 2024 - June 2025
PERSONAL PROJECT: WEATHER DATA ETL PROJECT
Build an automated machine learning ETL pipeline to collect, process, and store weather data.
Key technologies: Airflow/Astro for workflow scheduling and management, Python for API calls and data processing, PostgreSQL or Microsoft SQL Server for storage, and Docker for application packaging.
SKILLS
HARD SKILLS
SQL, Python, Hadoop, Spark, Kafka, Airflow, JavaScript, Java, C#, Microsoft SQL Server, MySQL, PostgreSQL, ReactJS, NodeJS, Microsoft .NET, OOP, JSON/HTML/XML parsing, Pandas, Databricks, dbt, REST API, Docker, Machine Learning, ETL.
SOFT SKILLS
Logical thinking, quick adaptation to new technology, ability to work well both in teams and independently, presentation and communication skills, proactive, eager to learn, responsible at work.
CERTIFICATE
Presentation & Teamwork Skills
Critical Thinking & Time Management Skills
English: Intermediate (B1 CEFR)
EDUCATION
Ho Chi Minh City University of Technology (HUTECH)
Software Engineering
Graduated: 11/2024
PERSONAL PROJECT: HIGH-VOLUME WEB SCRAPING & DATA PIPELINE
Collect and process product information from the web using Selenium and BeautifulSoup, transform and normalize the data using PySpark and dbt, and load it into a database (PostgreSQL, SQL Server).
Automate the process with Airflow.
Significantly reduce the time spent on manual data collection and entry.
Technology: Python, Selenium, BeautifulSoup, PySpark, Apache Airflow, SQL, PostgreSQL, Docker, dbt, REST API.