Nguyen Phuc Thanh Danh
090-***-**** VN ****.*************@*****.***.** LinkedIn DanhNguyennene (GitHub)
EDUCATION
Ho Chi Minh City University of Technology (HCMUT)
Bachelor of Computer Science (Undergraduate) GPA: 3.0/4.0 Sept 2022 - May 2026 (Expected)
PROJECTS
Chart Data Extraction (Sponsored by HCMUT) Aug 2024 - Present
• Developed an Image-to-Text deep learning model for end-to-end data extraction from chart images, converting extracted data into structured JSON files for tabular representation.
• Data Analysis & Preprocessing: Conducted Exploratory Data Analysis (EDA) on open-source datasets (ChartQA, PlotQA, DVQA), managed chart-type balancing, augmented bounding boxes, and generated synthetic data for model enhancement.
• Technology Stack: Fine-tuned the pretrained MatCha model (Math Reasoning and Chart Derendering Pretraining) using PyTorch, later transitioning to PyTorch Lightning for improved training efficiency.
• Model Optimization: Froze 40% of the vision encoder layers during training to enhance efficiency.
• Advanced Techniques: Integrated Exponential Moving Average (EMA) and Stochastic Weight Averaging (SWA) for improved training stability and performance.
• Distributed Training: Successfully implemented Distributed Data Parallel (DDP) in PyTorch before transitioning to PyTorch Lightning.
• Benchmark: Achieved an F1 score of 99% and a TED score of 78%.
• GitHub Repository: Link, Google Colab Notebook: Link, Dataset: Link, Presentation: Link
Interpolation Models July 2024 - August 2024
• Developed various interpolation models, including Least Squares Regression, Lagrange Interpolation, Chebyshev Polynomials, and Hermite Polynomials, leveraging Linear Algebra techniques.
• Technology Stack: Implemented interpolation algorithms exclusively using NumPy, applying mathematical formulations and numerical methods.
• Reference: Based on Crista Arangala - Linear Algebra with Machine Learning and Data (CRC Press, 2023).
• Google Colab Notebook: Link
Chinese MNIST - Digit Recognizer (Kaggle) June 2024 - July 2024
• Designed and implemented a deep learning model combining Convolutional Neural Networks (CNNs) with Transformer layers in PyTorch to recognize Chinese digits using the Kaggle dataset.
• Optimization Techniques: Applied batch normalization, dropout, and multiple convolutional layers to enhance model performance.
• Benchmark: Achieved 96% accuracy.
• Google Colab Notebook: Link
TECHNICAL SKILLS
• Programming Languages: Python, C++, SQL, R, JavaScript
• Frameworks: PyTorch, PyTorch Lightning, TensorFlow
• Libraries: Pandas, Matplotlib, Transformers, OpenCV, NumPy, Spark
• Tools: Docker, Git, Nvim and Vim, Linux, MySQL
• English: Proficient (IELTS 6.5)