Machine Learning Data Scientist

Location:
Bien Hoa, Dong Nai, Vietnam
Posted:
September 25, 2025

Resume:

Ho Minh Quan

Data Scientist/ AI Engineer Intern

0344.123.089 ********@*******.*****.***.** quanho114 hominhquan0112

Objective

I am a final-year Data Science student at VNUHCM–US with a strong passion for language technologies and applied machine learning. I am currently focusing on LLM tools and scalable ML workflows to deepen my practical expertise. I am seeking an internship where I can collaborate with experienced engineers, contribute to impactful AI solutions, and grow through continuous learning and hands-on experience.

Projects

Intelligent Fake News Detection using BERT & XLNet (GitHub) 04/2025 – 05/2025

Fine-tuned BERT & XLNet to classify news as FAKE/REAL, achieving 96% test accuracy.
Built a Gradio demo for interactive testing with custom inputs.
Handled data preprocessing, tokenization, and model evaluation.
Conducted comparative evaluation of BERT vs. XLNet for performance analysis.
Technologies: Jupyter Notebook (Python), PyTorch, Hugging Face Transformers, Scikit-learn, Gradio

LLM-Powered Chatbot with Document Retrieval and Context Fusion (GitHub) 05/2025 – 06/2025

Built a Retrieval-Augmented Generation (RAG) chatbot to query uploaded PDF documents.
Designed a document pipeline for parsing, chunking, embeddings, and vector storage.
Integrated IBM Watsonx foundation models via LangChain for context-aware responses.
Developed an interactive Gradio web app for real-time Q&A.
Technologies: Python, LangChain, IBM Watsonx, PyPDFLoader, ChromaDB, Gradio

Multi-Agent RAG Chatbot for Document Question Answering (GitHub) 06/2025 – 09/2025

Developed a multi-agent RAG chatbot that coordinates retrieval, reasoning, and verification agents to deliver accurate, hallucination-free answers from user-uploaded documents.
Implemented a hybrid retrieval pipeline (BM25 + vector search) with a fact-checking mechanism to enhance relevance and reliability.
Built an interactive Gradio interface enabling real-time document-based Q&A and seamless user interaction.
Technologies: Python, LangChain, IBM Watsonx, ChromaDB, LangGraph, Gradio

Education

Bachelor of Science in Mathematics and Computer Science (Specialization: Data Science), Sept 2022 - Present

VNUHCM - University of Science

GPA: 8.01

Relevant Courses: Artificial Intelligence (10), Machine Learning (7.7), Natural Language Processing (8.1), Statistical Data Processing (8.5), Image Analysis and Processing (9.2)

Skills

Programming Languages: C/C++, Python, R, SQL

Frameworks: PyTorch, TensorFlow, Keras, scikit-learn, Hugging Face, spaCy, FAISS, Pandas, NumPy, Gradio

Tools: Docker, Git/GitHub, Ubuntu, Google Colab, Jupyter Notebook, LangChain, LangGraph

Domains of Expertise: Machine Learning, Deep Learning, Natural Language Processing (NLP), Large Language Models (LLMs), Retrieval-Augmented Generation (RAG)

Certificates

Google Advanced Data Analytics

Natural Language Processing Specialization

Machine Learning Specialization

IBM Data Science Professional Certificate

TOEIC SW 230


