Machine Learning Information Technology

Location:
Bien Hoa, Dong Nai, Vietnam
Posted:
February 25, 2025

Resume:

BÙI KHẮC KIÊN

Data Engineer

**/**/****

Male

091*******

*************@*****.***

Dormitory area B - Mac Dinh Chi street - Dong Hoa ward - Di An city - Binh Duong

https://github.com/BuiKien15

Skills

Data Preprocessing

Data Processing

Spark and PySpark

Python Programming

Analysis and Interpretation

Data Visualization

Machine Learning

Data Mining

CAREER GOALS

I want to take part in real-world Data Engineering projects where I can apply my knowledge of programming, databases, data visualization, and Machine Learning to solve the company's specific problems. With the knowledge gained at university, experience from projects, diligence, eagerness to learn, teamwork skills, the ability to work under pressure, and effective communication, I believe I can contribute to building a strong and effective Data Engineering team while continuing to learn and develop myself.

EDUCATION

2021 - 2025

UNIVERSITY OF INFORMATION TECHNOLOGY - NATIONAL UNIVERSITY HCMC - UIT

Majoring in Information Systems

GPA: 3.05/4.0

PROJECTS

Semester 2 - 2024

UNIVERSITY OF INFORMATION TECHNOLOGY Student

Title: Building a Retail Sales Data Warehouse

Link: https://drive.google.com/drive/folders/1HzCtz_qf_xjotTN9izRabtkIIP6GzJOT?usp=sharing

Objective: Developed a data warehouse for analyzing retail sales.

Technologies: SQL Server, SSIS, SSAS, Power BI, Python.

Highlights:

• Designed data models and implemented ETL processes.

• Created OLAP cubes for analysis and generated reports in Power BI.

Outcome: Improved data accessibility and insights for business decisions.

Semester 2 - 2024

UNIVERSITY OF INFORMATION TECHNOLOGY Student

Title: Analyze Data on Virtual Currency Prices

Link: https://github.com/canhlong1430/PTDLK

Objective: Developed predictive models for cryptocurrency prices using time series analysis.

Technologies: ARIMA, Linear Regression, RNN, GRU, LSTM, VAR, TBATS, PatchTST.

Highlights:

• Analyzed Binance Coin (BNB), Dogecoin (DOGE), and Ethereum (ETH) datasets.

• Utilized various algorithms for price prediction over 30, 60, and 90 days.

• Evaluated models using MSE, MAE, RMSE, and MAPE metrics.

Outcome: Provided insights into cryptocurrency price trends to assist investors in decision-making.

Semester 2 - 2024

UNIVERSITY OF INFORMATION TECHNOLOGY Student

Title: Predicting the Likelihood of Hotel Booking Cancellations from Hotel Activity Data

Link: https://github.com/BuiKien15/Data-Mining

Objective: Develop a machine learning model to predict hotel booking cancellations using hotel activity data.

Technologies: Python, Pandas, Scikit-learn, Seaborn, Plotly.

Highlights:

• Analyzed the Hotel Booking Demand dataset with 119,390 records and 36 attributes.

• Preprocessed data to handle missing values, outliers, and normalization.

• Implemented various classification algorithms including Decision Tree, Naive Bayes, and Logistic Regression.

• Utilized K-Fold Cross Validation to evaluate model performance.

Outcome: Provided insights into factors influencing booking cancellations, assisting hotels in improving reservation management and reducing cancellation rates.

Semester 1 - 2025

UNIVERSITY OF INFORMATION TECHNOLOGY Student

Title: Research and Implementation of Clustering Algorithms on Terrorist Attacks in the UK Using PySpark

Link: https://drive.google.com/drive/folders/13wQiJjUYWVynJFakE9SIS3WqG6afHEL?fbclid=IwY2xjawIafxJleHRuA2FlbQIxMAABHbhvE01bqehO8daUtxTH4nrNEzNZRDBvK8uFGGyX1qOsE5DpMrPoPU3QGg_aem_iTDndYZmZec7LTm7hH8PZg

Objective: Develop a machine learning model to cluster terrorist attacks in the UK to propose effective strategies for counter-terrorism.

Technologies: PySpark, K-Means.

Highlights:

• Analyzed the Global Terrorism Dataset comprising 5,235 records with 135 attributes.

• Preprocessed data to handle missing values and irrelevant columns.

• Implemented K-Means clustering to identify patterns in terrorist attacks.

• Evaluated clusters based on attack types, casualties, and targets.

Outcome: Provided insights into the characteristics of terrorist attacks, aiding in the development of targeted counter-terrorism strategies.

Semester 1 - 2025

UNIVERSITY OF INFORMATION TECHNOLOGY Student

Title: Network of Supply Chains/Supermarkets in the States of the US

Link: https://drive.google.com/drive/folders/119a5yAgZ88jeUihJXP-YM5QjiujTL7nT

Objective: Analyze the relationships between product categories and states within supermarket chains to enhance business strategies.

Technologies: Python, Gephi, Louvain Algorithm, Girvan-Newman.

Highlights:

• Analyzed a dataset of supermarket sales with 9,994 records and 13 attributes.

• Cleaned and processed data to eliminate duplicates and handle missing values.

• Converted data from DataFrame to graph format for community detection.

• Implemented Louvain and Girvan-Newman algorithms to identify clusters in sales data.

• Evaluated the networks using metrics such as PageRank, Eigenvector, Closeness, and Betweenness centralities.

Outcome: Provided insights into sales trends across different states, facilitating the development of targeted marketing strategies for various product categories.

Skills (continued)

Data Warehousing Concepts (Understanding of Data Models, Star Schema, etc.)

Data Transformation (ETL Process - Using SSIS)

Exploratory Data Analysis

Time Series Analysis

Model Evaluation and Comparison

OLAP Cube Development (Using SSAS)

MDX Querying

Deep Learning

Hobbies

● Football

OTHER SKILLS

Programming languages: C#, C++, Python, R

Query language: Oracle

Big data: Hadoop, Apache Ant

Social network analysis: Gephi, network structure analysis, centrality measures, information spread model

Soft skills: Teamwork, flexible communication, ability to communicate

CERTIFICATE

IELTS 6.0

ACTIVITY

Member of the executive committee of MMCL Association
