Thang Bao Kien Nguyen
076-***-**** | *********@*****.*** | HCM, Vietnam
github.com/Calebb0709
Summary
I am a data and AI enthusiast with a strong aptitude for learning and a proven ability to thrive under pressure. My intellectual curiosity fuels my passion for scientific research and exploration of new technologies. I excel in collaborative environments, leveraging my teamwork skills to achieve common goals.
Education
University of Information Technology, Bachelor in Data Science
Sept. 2021 - Current (Expected May 2025)
• GPA: 3.0/4.0
• Interests: Multi-modal learning, NLP (Natural Language Processing), CV (Computer Vision), AI systems, Data Engineering, Cloud Computing.
Technical Skills
• Programming: Python (proficient), C/C++, R, SQL (prior experience).
• Frameworks: PyTorch (proficient), TensorFlow, Keras, Spark (prior experience), MongoDB, Docker, MLOps.
• Tools: VSCode, Jupyter Notebook, Terminal, Google Colab, Git, Power BI, Microsoft Azure.
Project
Real-Time Sentiment Analysis for Fast Food Brand
Team project - Role: Member - Team size: 4
• Tools and Technologies: Apache Airflow, Docker, Spark, Kafka, Flask, Python, MongoDB, Chart.js.
• Uses Apache Airflow for data pipelines, Docker for environment management, Spark for processing, and Kafka for real-time streaming. Flask provides a Python API, MongoDB stores the data, and Chart.js visualizes sentiment trends on the dashboard.
• Source code: HERE
UIT Data Science Challenge Group B - Vietnamese Fact Checking (09/2023 - 11/2023)
Competition project - Role: Member - Team size: 4
• Tools and Technologies: Natural Language Processing, Data Analysis, Data Preprocessing, PyTorch.
• Developed a rule-based sentence-splitting system for evidence retrieval, and proposed a solution combining pre-trained language models with a Support Vector Machine (SVM) kernel for verdict classification in fact checking.
• Achievement: Highest position in Group B in the fact-checking task.
• Source code: HERE
Data Science and Advanced Analytics 2023 Competition - Link Prediction for Wikipedia Articles (04/2023 - 06/2023)
Competition project - Role: Member - Team size: 4
• Tools and Technologies: Natural Language Processing, Data Analysis, Data Preprocessing, PyTorch.
• Designed a vanilla fully connected neural network and a Mutual Attention Transformer method that applies dual attention to the textual features of two nodes to address the link prediction task.
• Achievement: Near-perfect score on the public and private tests; 4th place on the leaderboard. Paper accepted by the DSAA conference.
• Source code: HERE
Experience
Research Student - University of Information Technology - Vietnam National University HCM
Multi-modal (06/2023 - Current)
• Project: Vietnamese Visual Question Answering
– Advisors: MSc. Kiet Van Nguyen and Nghia Hieu Nguyen
– Tools and Technologies: PyTorch, Natural Language Processing, Computer Vision, Deep Learning.
– Construct new datasets and develop methods to improve VQA models for the Vietnamese language.
Natural Language Processing (01/2024 - Current)
• Project: Vietnamese AI Generated Detector
– Advisor: PhD. Trong-Hop Do
– Tools and Technologies: PyTorch, Natural Language Processing, Deep Learning, Data Crawling.
– Build new datasets and develop deep learning models capable of distinguishing between human-written text and AI-generated text in the education domain.
Publications
MAT: Effective Link Prediction via Mutual Attention Transformer (11/2023)
Accepted by the Data Science and Advanced Analytics Conference (DSAA) (rank A, CORE 2021)
ViOCRVQA: Novel Benchmark Dataset and Vision Reader for VQA by Understanding Vietnamese Text in Images
Information: HERE
ViTextVQA: A Large-Scale VQA Dataset for Evaluating Vietnamese Text Comprehension in Images
Information: HERE