Machine Learning, Computer Vision, NLP, Data Processing and Storage

Location:

Quan 1, 71000, Vietnam

Posted:

June 05, 2024


Resume:

Bùi Đức Nhân

AI/ML Engineer

Profile

076*******

***************@*****.***

LinkedIn: https://www.linkedin.com/in/duc-nhan-bui-a25648237/

Github: https://github.com/Narius2030

Thu Duc city, Ho Chi Minh city

Skills

Programming Languages

Python, Java, C#, C++, OOP

Data Modelling

SQL Server, Relational Database (SQL), SSAS

Machine Learning/Deep Learning

Scikit-learn, TensorFlow, PyTorch, Google Colab, Kaggle, Regression, Classification, Clustering, NLP (text generation, text classification, ...), Computer Vision (object detection)

Data Manipulation and Visualization

Numpy, Matplotlib, Pandas, Power BI, Excel

Data Pipeline and Collection

Apache Airflow, BeautifulSoup, SSIS

Big Data

Apache Hadoop, Apache Hive

Mathematics

Probability and Statistics, Linear Algebra

Objective

After graduation, I want to work full-time as an AI/ML Engineer, solving machine learning problems in areas such as Computer Vision or NLP and deploying the solutions in practical products such as web apps or IoT devices. Furthermore, I am passionate about processing data and building data pipelines and storage.

Education

HCMC University of Technology and Education 2021 - 2025 Data Engineering

GPA: 8.61

Certifications

09/2023 IELTS 6.0 Certificate of IDP Vietnam

08/2022 SQL (Intermediate) Certificate of HackerRank

03/2024 Supervised ML: Regression and Classification of Coursera

Projects

Applying Artificial Neural Networks to Build Vietnamese Text Generation Models as Part of the Generative AI Problem Team: 1

(Individual Project)

4/2024 - 5/2024

- I developed a language model for automatic text generation. I collected news articles across various genres from the VnExpress website using BeautifulSoup and preprocessed the data with NLP techniques, including extracting sentences, determining the meaning of related phrases, building a corpus, and generating input sequences using the n-gram method.

- I implemented a deep learning architecture with embedding and LSTM layers, using the TensorFlow and Keras libraries for model development and evaluation. I embedded the model in a website using the Streamlit framework.

Source: https://github.com/Narius2030/Vietnamese-Text-Generator.git
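The n-gram input-sequence step described in this project can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code from the repository: the function names (`build_vocab`, `ngram_sequences`, `pad_left`, `split_inputs_labels`) and the whitespace tokenization are assumptions made for the example. Each sentence is expanded into all of its prefixes, the prefixes are left-padded to a common length, and the last token of each prefix becomes the label the model learns to predict.

```python
def build_vocab(sentences):
    """Map each distinct token to an integer id; id 0 is reserved for padding."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.split():
            if token not in vocab:
                vocab[token] = len(vocab) + 1
    return vocab


def ngram_sequences(sentences, vocab):
    """Expand every sentence into all of its prefixes of length >= 2."""
    sequences = []
    for sentence in sentences:
        ids = [vocab[token] for token in sentence.split()]
        for end in range(2, len(ids) + 1):
            sequences.append(ids[:end])
    return sequences


def pad_left(sequences, pad_id=0):
    """Left-pad all sequences with pad_id up to the longest length."""
    max_len = max(len(seq) for seq in sequences)
    return [[pad_id] * (max_len - len(seq)) + seq for seq in sequences]


def split_inputs_labels(padded):
    """Use all tokens but the last as input and the last token as label."""
    inputs = [seq[:-1] for seq in padded]
    labels = [seq[-1] for seq in padded]
    return inputs, labels


if __name__ == "__main__":
    # Hypothetical toy corpus; a real corpus would hold scraped news sentences.
    corpus = ["hom nay troi dep", "hom nay mua"]
    vocab = build_vocab(corpus)
    padded = pad_left(ngram_sequences(corpus, vocab))
    inputs, labels = split_inputs_labels(padded)
    print(inputs, labels)
```

The resulting inputs and labels are the kind of arrays that an embedding + LSTM next-word model is usually trained on; in a Keras workflow the padding step is typically handled by a library utility rather than by hand.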

Applying Deep Learning and Machine Learning to Build a Text Classification Model and Clustering Algorithm to Automatically Classify News Genres and Search for Similar News Team: 1

(Individual Project)

4/2024 - 5/2024

- I implemented a data pipeline using Airflow to schedule the news data collection process from VnExpress, using BeautifulSoup for scraping. I applied several cleaning techniques to the dataframe and the text. I embedded the model in a website using the Streamlit framework.

English

IELTS 6.0 (9/2023)

Other

Streamlit, .NET Framework, Ubuntu Desktop/Server, Git

Activities

GDSC member 2022 - 2025

Google Developer Student Club Ho Chi Minh UTE

In this club, I usually help organize webinars as a member of the logistics department. These webinars discuss modern and trending technologies in many fields such as Web, Cloud, or Data. We also host many academic activities such as Hackathon, BeCoder, and CTF.

Strength

• Teamwork and Planning

• Time Management

• Hardworking and curious

• Creative

- In the text classification task, I first applied natural language preprocessing techniques to normalize the text, including removing punctuation, stop words, and symbols, combining meaningful Vietnamese words, reformatting text, encoding words, and creating a corpus. Then I designed neural networks using LSTM and a hybrid (CNN, LSTM) architecture to learn the features of each article.

- In the text clustering task, I implemented Word2Vec (skip-gram) to discover the relationships among words. Then I embedded the words of each article and calculated the mean vector to represent the whole article. Finally, I calculated the cosine similarity between articles so that I could cluster them into groups and search for similar articles.

Source: https://github.com/Narius2030/Vietnamese-Text-Classification-and-Clustering
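The mean-vector and cosine steps of the clustering task can be illustrated with a small sketch. It assumes word embeddings are already available as a plain dictionary (in the project they would come from a trained Word2Vec model); the function names and the toy two-dimensional vectors are hypothetical and chosen only for the example.

```python
import math


def mean_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token of this article has an embedding")
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either is a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query_tokens, articles, embeddings, top_n=3):
    """Rank articles (title -> tokens) by similarity to the query article."""
    query_vec = mean_vector(query_tokens, embeddings)
    scored = [
        (title, cosine_similarity(query_vec, mean_vector(tokens, embeddings)))
        for title, tokens in articles.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # Toy embeddings standing in for trained Word2Vec vectors (assumption).
    embeddings = {"goal": [1.0, 0.0], "match": [0.8, 0.2], "stock": [0.0, 1.0]}
    articles = {"sports": ["goal", "match"], "finance": ["stock"]}
    print(most_similar(["goal"], articles, embeddings, top_n=2))
```

Averaging word vectors discards word order, which is usually acceptable for topic-level similarity search but may blur articles that mix several topics.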

Building a Movie Recommender System Using a Content-Based Method Team: 2

2/2024 - 5/2024

- I use the TF-IDF technique (item profile) to express the importance of each category (word) as a numerical value across all movies (documents), based on each movie's content. I apply PCA and elimination of highly correlated features to reduce the dimensionality of the dataset.

- This recommender system (RS) is based on the content-based method. I use cosine similarity to measure the similarity among movies. In addition, I use a Ridge regression model to learn each user's ratings and derive the user profile, which consists of the weight matrix W and the bias b. Then I filter the top N highest-rated movies for that user.

Source: https://github.com/Narius2030/Recommendation-System.git
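As a rough illustration of the item-profile step in this project, the sketch below computes TF-IDF weights for the category words of each movie. It uses the plain textbook formula (term frequency times log(N / document frequency)), which is an assumption: the repository, or a library such as scikit-learn, may use a smoothed or normalized variant. The function name `tf_idf` and the toy movie data are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    documents: list of token lists, e.g. the category words of each movie.
    Weight = (count / document length) * log(N / number of docs with the term).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    # Toy item data: each movie is described by its category words.
    movies = [["action", "comedy"], ["action"], ["drama", "comedy"]]
    for profile in tf_idf(movies):
        print(profile)
```

A term that appears in every movie gets weight zero under this formula, so only categories that distinguish movies from one another contribute to the item profile.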
