Le Phu Truong
+84-937****** | ***********.****@*****.*** | LinkedIn: in/letruongzzio | GitHub: letruongzzio
EDUCATION
University of Science - VNUHCM 2022 - 2026
Bachelor of Science in Mathematics and Computer Science Ho Chi Minh City, Vietnam
• GPA: 3.76/4.0
• Academic Incentive Scholarship, Semester 4 - Top 5% of the faculty
EXPERIENCES
Research Intern January 2025 – Present
AISIA Research Lab, University of Science - VNUHCM Ho Chi Minh City, Vietnam
• Integrate deep learning architectures & semantic segmentation frameworks to detect pulmonary vascular abnormalities.
• Experiment with stacking ensembles to combine the strengths of each trained model.
• Prepare research findings for publication and validate methodologies under the supervision of faculty lecturers.
• Tools: CNN, Fourier Neural Operators (FNO), UNet (ver. 1, Attention, 2, 3), DeepLabV3+, Stacking Ensemble
AI Engineer Intern March 2025 – Present
EyeCode AI & Smart Tech Hub Ho Chi Minh City, Vietnam
• Implement NLP pipelines that use LLMs to generate detailed feedback for essay scoring.
• Engineer Modular RAG workflows and fine-tune retrieval and reranking to boost accuracy in automated scoring tasks.
• Fine-tune LLMs with LoRA and QLoRA to score IELTS Writing effectively.
• Tools: LangChain, Modular RAG, GPT-2, Phi-2, DeepSeek LLM, FAISS, Chroma, (Q)LoRA
PROJECTS
Face Image Retrieval | Computer Vision, Embedding Learning | GitHub
• Fine-tuned ResNet-50 & MobileNet-V2 to learn facial embeddings for efficient image retrieval on the CelebA dataset.
• Optimized embedding distribution with KD-Tree for fast nearest-neighbor search.
• Achieved a 2% improvement in retrieval accuracy and a 15-second reduction in inference time.
Diabetes Risk Analysis | Statistical Modeling, Hypothesis Testing, Machine Learning | Dataset
• Handled imbalanced data and engineered features by discretizing continuous variables to suit classification tasks.
• Applied A/B testing and ANOVA to explore associations between qualitative/quantitative variables.
• Developed multi-class and binary classification models using Naive Bayes, Logistic Regression, LDA and QDA.
• One-vs-rest Logistic Regression with oversampling achieved the best results: Macro-F1 = 0.6620 and Recall = 0.7478.
Mental Attention Classification | Machine Learning, Model Training | GitHub
• Applied dimensionality reduction techniques to improve classification efficiency and reduce noise.
• Used baseline classifiers (Logistic Regression, LDA, SVM, XGBoost, LightGBM) to gain initial insights.
• Trained a Multi-layer Perceptron (MLP) as an enhanced model to capture non-linear patterns.
• Achieved approximately 90% F1-score in classifying mental states.
House Price Prediction | EDA, Feature Engineering, Model Evaluation | GitHub
• Conducted EDA on real estate data scraped from batdongsan.vn; cleaned the data and handled skewed distributions.
• Engineered features via data transformation, address normalization, KMeans clustering, and geospatial encoding.
• Selected features using Variance Threshold, SelectKBest, Mutual Information, and Random Forest Importance.
• Trained models: Ridge, SVM, and Extra Trees achieved the best performance with R² = 1.0 and RMSE 0.
Vietnamese-English Translation | NLP, Seq2Seq Models | GitHub
• Preprocessed data with manual tokenization and vocabulary building.
• Trained translation models using handcrafted RNN, LSTM, GRU, and Transformer architectures.
• Enhanced performance with pre-trained Large Language Models: BERT & GPT-2, or mBART-50.
Cat-Dog Image Classification API | FastAPI, Computer Vision, Model Serving | GitHub
• Developed a production-ready API using FastAPI to classify cat vs dog images.
• Applied transfer learning to fine-tune a ResNet-18 model with partial layer freezing for efficient classification.
• Designed asynchronous inference logic, logging, and custom CORS middleware with modularized architecture.
• Implemented RESTful endpoints for serving image classification results.
SKILLS
Programming Languages: C/C++, Python, SQL, R, MATLAB
Tools & Technologies: Git, Docker, Linux, PostgreSQL, MongoDB, FastAPI
Data Science & Analytics: pandas, NumPy, Matplotlib, Seaborn, Plotly, spaCy
Machine Learning & AI: scikit-learn, OpenCV, PyTorch, TensorFlow, HuggingFace, LangChain
Technical Skills: Data Processing, Statistical Modeling, Machine Learning, Computer Vision, NLP
CERTIFICATIONS
• UIT Collegiate Programming Contest 2023
• Heading for the Future: Data Analysis (South Region)
• Supervised Machine Learning: Regression and Classification
• IELTS 6.5 (CEFR B2)