Data Science Team Leader

Location:

Quan Tan Binh, 72100, Vietnam

Posted:

October 16, 2025


Resume:

TRAN XUAN DIEN

034******* Mail Scholar LinkedIn GitHub

RESEARCH INTEREST

Deep Learning, NLP, LLM, and Multimodal Learning. Beyond conducting experiments and authoring research papers, I focus on leveraging AI solutions to address practical challenges in everyday life.

EDUCATION

Bachelor of Science in Data Science 2022 – present

Industrial University of Ho Chi Minh City

GPA: 3.71 / 4.00

TECHNICAL EXPERIENCE

1. Heineken APAC D&A Hub 08/2025 - Present

Data Science Intern Onsite

- Developed and optimized a generalized MCP-based multi-agent framework to automate reasoning and tool execution for diverse analytics workflows.

- Built LLM-driven chatbot systems and trained ML models for forecasting and decision support.

- Implemented unit tests and performance optimization to enhance agent reliability and scalability.

PROJECT ENGINEERING AND COMPETITIONS

1. Multimodal Sarcasm Detection for UITC2024

Team Leader - Faster-United [Code] 2024

- Ranked first on the scoreboard with an F1-score of 44.75%.

- Developed a multimodal sarcasm classification pipeline that uses the pre-trained Vintern-1B-v2 model to generate image captions. The pipeline combines the original text, the generated captions, and image features, using ViT and Jina Embedding V3, and is optimized with Cross Entropy and Focal Loss.

- Trained four models for 2-class, 3-class, and 4-class tasks, integrating predictions with a Voting Model and enhancing performance with post-processing to ensure consistent labeling for similar captions.

2. LLM-Powered Video Search

Team Leader - Faster-United [Code] 2024

- Developed an intelligent video search system combining Large Language Models (LLMs) with multimodal search methods to enhance video retrieval accuracy and user experience.

- Integrated various search methods (ASR, OCR, captioning) and employed the Combined Ranking Score (CRS) algorithm for optimal query ranking and retrieval across multiple modalities.

- Utilized advanced technologies, including FAISS for vector search, YOLO for object detection, TF-IDF for metadata encoding, and CLIP for multimodal embedding and similarity matching.

3. Information-checking Task

Team Member - IUH.AI Faster Bias [Code] 2023

- Ranked first on the scoreboard with a strict accuracy of 78.97%.

- Developed the SemViQA system, which has two main components: Semantic-based Evidence Retrieval (SER), which combines TF-IDF with Question Answering with Token Classification (QATC), and Two-step Verdict Classification (TVC), which assigns the labels SUPPORT, REFUTE, and NEI (Not Enough Information).

RESEARCH PROJECTS

1. ViAMR: Fine-tuning LLMs for Abstract Meaning Representation in Vietnamese

VLSP 2025 (Accepted) [Code]

*Dien X. Tran, *Nhon V. Trong, Kien C. Nguyen

2. ViDRILL: A Multi-Stage Retrieval Framework for Vietnamese Legal Document Search

VLSP 2025 (Accepted) [Code]

*Dien X. Tran, Tai D. Truong, Kien C. Nguyen

3. ViGSA: A Multi-Task Aspect-Based Sentiment Analysis Model with Auxiliary Embedding and Global Sentiment Integration for Vietnamese Restaurant Reviews

ESWA 2025 journal (Q1) (Under Review) [Code]

*Dien X. Tran, *Kien Cao-Van, Tinh Nguyen-Huu, Hoang-Tuan Dao-Xuan, Hung Nguyen-Viet, Khanh-Duy Cao-Phan

4. PBA-Net: A Dual-Branch Architecture with Positional Bias Attention and Multi-Scale CNN for Low-resource Language Aspect-Based Sentiment Analysis

EAAI 2025 journal (Q1) (Under Review) [Code]

*Khanh-Duy Cao-Phan, *Kien Cao-Van, Hung Nguyen-Viet, Di T. Le, Tinh Nguyen-Huu, Dien X. Tran

5. SemViQA: A Semantic Question Answering System for Vietnamese Information Fact-Checking

TAI 2025 journal (Q1) (Under Review) [Demo], [PyPI], [Code]

*Dien X. Tran, *Nam V. Nguyen, Thanh T. Tran, Anh T. Hoang, Tai V. Duong, Di T. Le, Phuc-Lu Le

6. LLM-Powered Video Search: A Comprehensive Multimedia Retrieval System

SOICT 2024 (Rank B) (Accepted) [Code]

*Dien X. Tran, *Anh T. Hoang, Tai V. Duong, Kien C. Nguyen

HONORS & AWARDS

1. VLSP 2025 Evaluation Campaign 2025

a) Top 5/58 - DRILL Track

b) Top 2 - Semantic Parsing

c) Top 6 - Numerical Reasoning QA

2. Top 11/149 - CodeMMLU Challenge 2025

3. UIT Challenge 2023 - 2024

a) First Prize - UIT Challenge 2024

b) Encouragement Prize - UIT Challenge 2023

4. AI HCM Challenge 2024 - 2025

a) - AI HCM Challenge 2025

b) Potential Prize - AI HCM Challenge 2024

5. Third Prize - Calculator-Based Math Contest 2022, Ho Chi Minh City 2022

6. Gold Medal - 11th Grade Mathematics Olympiad 2021, Ho Chi Minh City 2021

SPECIALIZED SKILLS

AI/ML: Machine Learning, Deep Learning, NLP, model optimization, GenAI, Multi-agent, LLMs.

Programming: Python (BeautifulSoup, Selenium), Django, SQL.

Tools: Docker, Airflow, Power BI, FAISS, CLIP, MLflow.

Other: Logical thinking, problem-solving, math & stats foundations.
