Resume

Natural Language Processing, AI, Machine Learning

Location:

An Binh, Binh Duong, Vietnam

Salary:

4.000.000 VND

Posted:

April 24, 2024


Resume:

Nguyễn Thị Thùy

GitHub: github.com/thuynguyen**** | Email: ad482p@r.postjobfree.com | Phone: 032*******

Education

Computer Science GPA: 8.6/10.0

University of Information Technology, Vietnam National University Ho Chi Minh City (VNU-HCM) 9/2021 – Present

Coursework

Online courses: LangChain with Python Bootcamp, The Complete JavaScript Course 2024: From Zero to Expert!, Version Control

Technical books: Full-Stack Flask and React, Natural Language Processing with Transformers

Awards:

• Won the title "Thanh niên tiên tiến làm theo lời Bác" ("Advanced Youth Following Uncle Ho's Teachings") and was awarded "Excellent achievements in Youth Union work and youth movements" at the school level in the 2021-2022 and 2022-2023 school years

• UIT Data Science Challenge 2023 Honorable Mention

• Placed in the top 3 in Software Mention Detection in Scholarly Publications (SOMD) - Subtask I @NSLP 2024

• Ranked first in Task 1 of the HOPE shared task at IberLEF 2024

Skills

Programming languages: Python, C++, JavaScript

Tools: Git/GitHub, VS Code, Jupyter Notebook

Framework/Library: Hugging Face, Flask, React

Soft skills: Teamwork, Presentation, Self-learning, English (TOEIC Speaking and Writing: 270/400)

Experience

Multimedia Communications Laboratory - MMLab UIT Research Assistant 8/2023 – Present

• Keep up with recent developments in Natural Language Processing (NLP) by reading scientific articles.

• Support research projects by collecting and processing data, deploying models for NLP tasks, evaluating model performance, and analyzing results.

• Preprocess data and implement, tune, and evaluate machine learning models, deep learning models, and pre-trained language models for NLP tasks such as text classification, named entity recognition, and text generation.

• Participate in shared tasks and write technical reports or scientific articles under the guidance of the instructor.

Department of Computer Science Youth Union Secretary 5/2023 – Present

• Manage 14 personnel and all annual activities in the department’s Youth Union

• Lead the Event Team of the department’s Youth Union

• Plan, organize, and run WeCode, an algorithm programming contest with more than 100 participants

• Plan, organize, and run the Trainee Program, a two-month training program for first-year students

Summary

Knowledgeable and Experienced

• Basic understanding of Flask API and React.

• Have experience reading and writing scientific reports or papers.

• Experienced in implementing machine learning algorithms, deep learning architectures, and pre-trained language models for NLP tasks such as text classification, named entity recognition, and text generation.

• Experienced in processing and standardizing data, evaluating model performance, and analyzing results for NLP tasks.

• Proactively and creatively propose innovative approaches to tackle each unique task.

• Familiar with several computer vision tasks, including image classification, object recognition, face detection, and OCR, and have worked on solving them.

Scientific publication

• Accepted: Research on Evaluating Data Augmentation Techniques for Sentiment Analysis in Vietnamese

UIT Young Scientist Conference, Oct. 2023

• Submitted: Software Mention Recognition with a Three-Stage Framework

Software Mention Detection in Scholarly Publications (SOMD) - Subtask I @NSLP 2024, Apr. 2024

• Submitted: Two-Stage Framework for Identifying and Extracting Vietnamese Comparative Opinion Quintuple Extraction

Journal of Computer Science and Cybernetics, Apr. 2024

• Under review: An Empirical Study of Prompt Engineering with Large Language Models for Hope Detection in English and Spanish

HOPE at IberLEF 2024, Apr. 2024

Projects

Application of Text Generation Models for Comparative Opinion Quintuple Extraction in Vietnamese

Project for Undergraduate Thesis and Scientific Research at university Dec. 2023 – Present

• Applied the Transformer architecture and pre-trained language models as the primary method for this task.

• Used fine-tuning of ViT, a pre-trained Transformer-based encoder-decoder model tailored for Vietnamese, as the baseline method.

• Conducted research and proposed alternative methods to enhance efficiency in comparison to the base approach, including:

Modifying the input and output formats to identify the optimal prompting format.

Fine-tuning pre-trained models using a multi-view approach.

Aggregating multiple models to generate the final answer.

An Empirical Study of Prompt Engineering with Large Language Models for Hope Detection

HOPE at IberLEF 2024 Apr. 2024

• Ranked first in Task 1 with our final prompt.

• Solved the problem with an unsupervised approach using ChatGPT 3.5.

• Conducted experiments on two aspects, prompting technique and information-providing strategy, to find the best prompt for this task:

Prompting techniques (3): one-shot, few-shot, and chain of thought.

Information-providing strategies (4): request only, problem concept, class meanings, and role definition.

• Authored a 10-page paper detailing our prompts and submitted it to the conference.

Software Mention Recognition with a Three-Stage Framework

Software Mention Detection in Scholarly Publications - Subtask I @NSLP 2024 project repository [1] Jan. 2024 – Mar. 2024

• Achieved a top-three position in the competition with our final result.

• Addressed this challenge as a named entity recognition problem by leveraging pre-trained language models such as BERT, SciBERT, and XLM-R.

• Proposed three distinct approaches to tackle this task:

Approach 1: Employed the base method of token classification.

Approach 2: Implemented a two-stage system comprising entity extraction and entity classification.

Approach 3 (best approach): Developed a three-stage system by adding to Approach 2 a binary classifier that identifies sentences containing entities.

• Authored an 8-page paper detailing our solution and submitted it to the conference.

[1] https://github.com/thuynguyen2003/NER-Three-Stage-Framework-for-Software-Mention-Recognition

Research on Evaluating Data Augmentation Techniques for Sentiment Analysis in Vietnamese

Paper at UIT Young Scientist Conference project repository [2] Oct. 2023

• Fine-tuned pre-trained language models such as PhoBERT, BERT, and XLM-R for sentiment analysis in Vietnamese.

• Evaluated traditional augmentation methods such as random swap, random delete, random insert, and back-translation for sentiment analysis in Vietnamese.

• Compared the performance of machine learning models and pre-trained language models for sentiment analysis in Vietnamese.
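As an illustration only, the three token-level augmentation methods named above (random swap, random delete, random insert) could be sketched as follows. This is not the code from the project repository: the function names, the default parameters, and the choice to insert a copy of an existing token (rather than a synonym from a dictionary) are assumptions made for this sketch, and splitting Vietnamese text on whitespace is a simplification. Back-translation is omitted because it needs a translation model.

```python
import random


def random_swap(tokens, n_swaps=1, rng=None):
    """Return a copy of tokens with n_swaps random pairs of positions swapped."""
    rng = random.Random() if rng is None else rng
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_delete(tokens, p=0.1, rng=None):
    """Drop each token independently with probability p, keeping at least one."""
    rng = random.Random() if rng is None else rng
    tokens = list(tokens)
    if not tokens:
        return tokens
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def random_insert(tokens, n_inserts=1, rng=None):
    """Insert n_inserts tokens at random positions.

    Assumption: no synonym dictionary is available, so each inserted
    token is a copy of a token already present in the sentence.
    """
    rng = random.Random() if rng is None else rng
    tokens = list(tokens)
    if not tokens:
        return tokens
    for _ in range(n_inserts):
        word = rng.choice(tokens)
        position = rng.randint(0, len(tokens))  # randint is inclusive on both ends
        tokens.insert(position, word)
    return tokens


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sentence = "sản phẩm này rất tốt".split()
    print(random_swap(sentence, rng=rng))
    print(random_delete(sentence, p=0.2, rng=rng))
    print(random_insert(sentence, rng=rng))
```

Each function returns a new list and leaves its input unchanged, so one original sentence can be reused to generate several augmented variants.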

Chatbot for Public Services

Team project in UIT Data Science Challenge 2023 project repository [3] Jan. 2024

• Identified real-world challenges and proposed effective solutions within the Chatbot for Public Services project.

• Became familiar with the essential components of a Q&A system or chatbot.

• Explored various methodologies and models for document retrieval and reranking to enhance the project’s effectiveness.

[2] https://github.com/thuynguyen2003/Data-Augmentation-Techniques-for-Sentiment-Analysis-in-Vietnamese

[3] https://github.com/thuynguyen2003/CS336.O11-IR-Chatbot-for-Public-Service
