LE NGUYEN NHUT TRUONG
Data Scientist
**************@*****.*** +(84-979****** HCMC, Viet Nam
https://www.linkedin.com/in/lnntruong/ https://github.com/truongcntn2017
PROGRAMMING LANGUAGES AND FRAMEWORKS
Scala: Professional working proficiency
Airflow: Professional working proficiency
Python: Professional working proficiency
Scikit-learn: Professional working proficiency
TensorFlow: Limited working proficiency
PyTorch: Limited working proficiency
R: Limited working proficiency
SQL Server: Limited working proficiency
STRENGTHS
Hard-working Responsible Creative
Communication skills
LANGUAGES
English: Professional working proficiency
Vietnamese: Native or bilingual proficiency
INTERESTS
Data Mining
Statistics
Natural language processing
Data Structures and Algorithms
Artificial Intelligence
Object-Oriented Programming
ACADEMIC ACHIEVEMENTS
First prize, Zalo AI Challenge, Computer Vision Task, Internal branch, 2021
Top 27 in challenge competitions at University of Science, 2019 and 2020
Participated in ACM-ICPC at University of Science, 2018, 2019, and 2020
Semi-finalist, Entropy Competition on Data Analysis at John Von Neumann, 2019
Top 10, Round 2, Code Tour at VNG, 2020
EXPERIENCE
Data Scientist
DataNest
November 2022 - August 2023 HCMC, Viet Nam
Improved the credit score model from 0.67 AUC to 0.71 AUC.
Built and deployed gender and age prediction models (gender: 0.92 AUC; age group: 0.83 accuracy) for 73M monthly active users of Viettel.
Improved the 0.7 AUC credit score model using device target encoding.
Null-rate analysis improved the AUC score by 1.3.
Data Scientist (Middle)
VNG Corporation, Zalo Group, R&D Team
April 2021 – March 2022 HCMC, Viet Nam
Researched embedding techniques, AutoML, user segmentation, probabilistic data structures, parallel computing, and Snorkel.
EDA and ETL on 70M monthly active users of Zalo, and modeling for income prediction with multiple classifiers. AUC-mu: 0.92.
Defined the problem and predicted income by location across all wards (10K) in Vietnam.
EDA and ETL on 25M monthly active users of BaoMoi, and modeling using keyword extraction (KeyBERT) and keyword embeddings.
Predicted user interest from keywords (using GraphSAGE, a type of GNN).
URL embedding (using character embedding and word embedding).
Literature review and problem definition. EDA and ETL on 300M monthly active anonymous users for age and gender prediction of anonymous users.
Literature review and problem definition. Clustering of anonymous users (using HDBSCAN).
Research: the Anchor method, which can explain any black-box classifier with two or more classes.
Associate Data Scientist (Junior)
VNG Corporation, ZION, ZaloPay, Risk Team
October 2020 – December 2020 HCMC, Viet Nam
Researched the CatBoost model, stacking methods, and SHAP values.
EDA and modeling on data from millions of users.
Improved precision (from 0.7 to 0.8) and recall (from 0.2 to 0.3) for a promotion abuse detector (in the processing transaction state).
Data Scientist Fresher
VNG Corporation, ZION, ZaloPay, Risk Team
July 2020 – October 2020 HCMC, Viet Nam
Module 1: Soft skills and presentation skills
Module 2: System and Network: distributed systems, networking with Spark, Airflow
Module 3: Analytics: EDA on transaction data with pandas-profiling
(Domain knowledge: e-commerce, banking, e-wallet)
Module 4: Modelling: modeling for a promotion abuse detector
(promotion transaction data, imbalanced data)
Research: SMOTE for imbalanced data
Research: Boots method for imbalanced data
EDUCATION
B.Sc. in Information Technology
Honor program
Faculty of Information Technology,
VNU-HCM University of Science,
Viet Nam
2017 – 2021 GPA: 8.1/10
M.Sc. in Information Technology
Honor program
Faculty of Information Technology,
VNU-HCM University of Science,
Viet Nam
2021 – Present
EXPERIENCE
Web Developer
Pascalia Asia
June 2019 – September 2019 HCMC, Viet Nam
Built a web application using an iterative development process, allowing users to log in securely and easily.
Built and secured a web server with a developer-friendly RESTful API.
R&D on OCR with Tesseract and spaCy (Python frameworks for computer vision and natural language processing).