University of Science, VNU-HCM
NGUYỄN THỊ CẨM LAI
DATA ANALYST
SKILLS
Programming Languages: C/C++, Python, Julia, SQL
Technology/Framework: GitHub, Jupyter Notebook, Google Colab, VS Code
Well-trained: pandas, numpy, sklearn, Markdown, data visualization libraries in Python
Data visualization tool: Tableau, Python
Soft skills: problem-solving, analytical thinking, self-study, teamwork, time management
Languages: Vietnamese (Native), English
Others: Office (Word, Excel, PowerPoint), statistical
CONTACT
Ho Chi Minh City, Viet Nam
***************@*****.***
linkedin.com/in/ntclaii/

ABOUT ME
Full name: Nguyễn Thị Cẩm Lai
Date of birth: 13/03/2002
Gender: Female
Hi, call me Lai. I'm a final-year student majoring in Data Science at the University of Science, VNU-HCM, and I'm enthusiastic about the field. I'm addicted to learning and growing every day. With my enthusiasm and willingness to learn, I hope to join your company and contribute to its development.

EDUCATION
MAJOR IN DATA SCIENCE

CERTIFICATE
GRANDMASTER KAGGLE
ONLINE CERTIFICATE
Google Data Analytics (Coursera)
Data Analysis Using Python (Coursera)
Data Analysis with Python: Zero to Pandas (Jovian)
Machine Learning with Python: Zero to GBMs (Jovian)
Three-level SQL: Basic, Intermediate, Advanced (HackerRank)
ENGLISH
TOEIC 815 (2021)
INVOLVED PROJECT

ANALYZE DATA AND BUILD MACHINE LEARNING MODELS (PYTHON)

Employee Attrition and Factory
Explore data properties and conduct data preprocessing
Visualize data with a variety of chart types to show relationships between attributes
Comment on the enterprise's current human resource situation and propose solutions to reduce the employee attrition rate
Build a Logistic Regression model to predict employee resignation decisions (accuracy score: 90%)

House Prices
Use the pipeline technique to build a sequence of data preprocessing steps before feeding the data into the machine learning model
Train several types of machine learning models (Decision Tree Regressor, Linear Regression, Logistic Regression, Gradient Boosting) to find the one that gives the best prediction results
Conclusion: the Gradient Boosting model performs best, with an RMSLE score of 0.13683

Titanic - Machine Learning from Disaster
Explore the properties of the data and preprocess it before training the machine learning model
Implement and optimize a Decision Tree model to predict whether a passenger survived the Titanic (accuracy score: 96%)

Students Performance in Exams
Analyze and visualize students' test results to find the factors that influence them
Propose solutions to help improve students' learning outcomes

BUILD DASHBOARD (TABLEAU)

Superstore in US
Build a dashboard showing the business situation (sales, profit, shopping trends, ...) of a US supermarket in December 2021

SEE MORE PERSONAL PROJECTS
Kaggle: kaggle.com/nguyenthicamlai
Github: github.com/ntclai

EXPERIENCE

KAGGLE GRANDMASTER
Notebooks Grandmaster (achieved 16 gold medals, 2 silver medals), Datasets Master
One year of participation and contribution:
Building personal projects
Participating in machine learning competitions
Contributing self-collected data from websites to the community
Become Notebooks Grandmaster:
Highest rank: 43 of 290,638