Tran Hoang Anh
LinkedIn/TranHoangAnh ******************@*****.*** 086*******
Education
VNUHCM - University of Science Ho Chi Minh, Viet Nam
Bachelor of Science in Data Science September 2020 – October 2024
• GPA: 3.43/4.0
• IELTS Score: 5.5
Relevant Coursework
• Data Structures and Algorithms: 3.95/4.0
• Object Oriented Programming: 4.0/4.0
• Python for Data Science: 4.0/4.0
• Data Mining: 3.9/4.0
• Introduction to Machine Learning: 4.0/4.0
• Advanced Machine Learning (Deep Learning): 3.9/4.0
Publication
• Tram T. Doan, Thuan Q. Tran, Dat T. Le, Anh H. Tran, An T. Nguyen, An-Tran Hoai-Le, Tran-Tung Doan-Nguyen, Son T. Huynh, Binh T. Nguyen, “HOSSemEval-EB23: A Robust Dataset for Aspect-Based Sentiment Analysis of Hospitality Reviews”. Multimedia Tools and Applications, 2024.
Experience
AISIA Research Lab Ho Chi Minh, Viet Nam
Undergraduate Research Assistant September 2023 – July 2024
• Led a team of 10 members in labeling a new dataset with over 30,000 data points for Aspect-Based Sentiment Analysis.
• Performed data pre- and post-processing to normalize the dataset and converted the data into the format required for model input.
• Conducted Exploratory Data Analysis (EDA) using Matplotlib on labeled data to uncover patterns, insights, and anomalies.
• Benchmarked the new dataset using state-of-the-art pre-trained models such as BERT and T5.
• Collaborated on writing the research paper and its major revisions.
Projects
Data Creation Using LLM Python, LLaMA 3, Prompting May 2024
• Conducted research on existing data augmentation methods and identified gaps in these methods.
• Developed an algorithm for sampling the label space of the data.
• Created a data creation pipeline using a fine-tuned LLaMA 3 model as the core function to generate new data based on their labels.
• Benchmarked the new method using state-of-the-art T5 models.
• Achieved notable results compared to those obtained from manually crafted data.
Melanoma Skin Cancer Detection Python, AlexNet, ResNet, VGG December 2023
• Executed comprehensive data preparation for a melanoma classification project: loaded images, visualized patterns, defined dataset classes, and strategically implemented data augmentation.
• Applied transfer learning with pre-trained CNN models (AlexNet, ResNet, VGG) to improve performance, achieving a 92% F1-score.
Predict Interest-Rate (Banking) Python, XGBoost, MLP July 2023
• Enhanced data quality through meticulous cleaning, outlier handling, and feature extraction.
• Conducted Exploratory Data Analysis (EDA) using Matplotlib and Seaborn to identify insights, patterns, and anomalies within the data.
• Achieved a prediction accuracy of 78% using machine learning models including Decision Tree, XGBoost, CatBoost, and MLP.
Technical Skills
Languages: Python, C/C++, LaTeX
Developer Tools: VS Code, Google Colab, Jupyter Notebook
Technologies/Frameworks: PyTorch, Pandas, NumPy, scikit-learn