Nguyen Thanh Truong
thtruong****
thtruong****
*********************@*****.***
SUMMARY
A passionate Computer Science student eager to contribute to the fields of Data Science and Artificial Intelligence, with a particular interest in Computer Vision and its applications. With my foundational knowledge and strong desire to learn, I look forward to contributing in a dynamic and professional environment.
EDUCATION
University of Information Technology - VNU-HCMC
Oct 2022 - Present
GPA: 8.17/10.0
Faculty: Computer Science
TECHNICAL SKILLS
Programming: Python, SQL, C++
ML/DL Frameworks: TensorFlow, PyTorch, Keras, OpenCV, FastAPI
Data Preprocessing: Augmentation, Labeling (CVAT), Data Cleaning, Feature Engineering
Computer Vision: YOLOv8, DETR, FaceNet, MediaPipe, CLIP
NLP & Speech: Whisper, VietOCR
MLOps & Tools: Docker, Git, GitHub, ElasticSearch
PROJECTS
Face Recognition System Using FaceNet and VGG-Face2
Technologies: Python, TensorFlow/Keras, FaceNet, VGG-Face2, SVM, OpenCV, Flask
Key Points:
Built an end-to-end face recognition system to distinguish 2 individuals using 500+ custom images.
Applied transfer learning (FaceNet/VGG-Face2) for feature extraction and SVM/cosine similarity for classification.
Optimized model performance: 98% accuracy and <0.2s inference time via quantization.
Deployed a Flask web interface for real-time testing.
Sep 2024 - Dec 2024
Ensemble Vehicle Detection (YOLOv8 + Co-DETR)
Technologies: Python, YOLOv8, YOLO11, Co-DETR, Docker
Key Points:
Team of 3: combined YOLO (real-time) and Co-DETR (precision) to detect 4 vehicle classes.
Trained on 25k+ images with pseudo-labeling and weather/lighting augmentation.
Containerized the inference API using Docker, enabling deployment on cloud platforms (AWS EC2).
Result: Top 6/100+ teams in the SoICT Hackathon 2024 - Vehicle Detection competition.
Oct 2024 - Dec 2024
Multimodal Video Retrieval
Technologies: CLIP, FAISS, DBNet, VietOCR, YOLOv8, Whisper, FastAPI, Docker
Key Points:
Multimodal Search Engine: integrated CLIP (image/text embeddings) with FAISS for efficient similarity search, enabling real-time querying on large-scale video datasets. Designed indexing pipelines to streamline cross-modal retrieval (image ↔ text).
Scene Text Recognition: built a Vietnamese text extraction system using DBNet for detection and VietOCR for recognition, handling complex backgrounds.
Multimodal ASR & Detection: deployed YOLOv8 for object detection and Whisper for speech-to-text, supporting hybrid queries (e.g., combining objects and spoken keywords).
Backend Development: developed a scalable API using FastAPI and containerized services with Docker for cloud deployment.
June 2024 - Oct 2024
Robotic Navigation System
Technologies: YOLO, OpenCV
Key Points:
Created a 5,000+ image dataset with Roboflow (diverse backgrounds/people) and applied augmentations (rotation, noise, lighting) for robustness.
Trained a gesture recognition model (MediaPipe + custom CNN) and converted it to TensorFlow.js for browser deployment.
Built a JavaScript interface to map gestures to control commands (e.g., robot movement) with <100ms latency.
Result: Top 8/80 teams in the Remotebot AI Challenge competition.
Aug 2023 - Oct 2023
Vietnamese Sentiment Analysis with PhoBERT
Technologies: PhoBERT, Transformers, PyTorch, Hugging Face
Key Points:
Fine-tuned PhoBERT-base on the UIT-VSFC dataset (Vietnamese Students' Feedback Corpus) to classify 3 sentiment classes.
Preprocessed text (tokenization, accent normalization) and handled class imbalance via weighted loss.
Achieved an F1-score of 0.88 (vs. a 0.82 baseline).
Sep 2024 - Jan 2025