Dinh Tuan Anh
Data Analyst / Machine Learning Engineer
Ho Chi Minh City, Viet Nam
*.*********@*****.***
linkedin.com/adinh26101
facebook.com/adinh26101
An enthusiastic Machine Learning Engineer with a passion for learning new technologies. My skills are engineering-oriented. My strongest programming language is Python, which I’ve used for almost every task, from crawling and processing data to machine learning, web development, and operations. I also enjoy working with data: every dataset has its own insights and value, and I like to discover them and bring them into applications. I am interested in Machine Learning and Deep Learning research, especially generative models, and I have worked with MLOps on GCP.
EXPERIENCE
Data Analyst / Vifon 4/2023 – present
Leader Tan Phu District, Ho Chi Minh City, Viet Nam
• Develop an automated planning module for the logistics department using VBA and Python, and release it to logistics staff as a Streamlit web app.
• Integrate and create dashboards with Power BI.
• Digitize operations data in depth, build a last-mile logistics operating model, define SOPs and vendor KPIs, and propose cost-reducing routes using Python.
Machine Learning Engineer / OneMount 6/2022 – 12/2022
Internship District 1, Ho Chi Minh City, Viet Nam
• Help build a chatbot using Dialogflow.
• Develop APIs to retrieve logs from Splunk for MLOps admin APIs.
• Create a CLI that can be used to call MLOps admin APIs.
• Help data scientists use MLOps platforms.
Freelancer / Hyphens 2/2021 – 6/2021
Freelancer Singapore
• Develop automation modules that use VBA to process Excel files from Outlook and integrate them with master data.
• Help sales staff create dashboards with Power BI.
EDUCATION
Bachelor of Data Science, Industrial University of Ho Chi Minh City 8/2019 – 7/2023
• Graduate thesis: 9.8/10
SKILLS
Data Crawling: Beautiful Soup, Selenium
Data Manipulation: Excel, Python, SQL
Data Visualization: Python, Power BI, D3.js
Web Programming: HTML, CSS, JS, Django
AI/ML Platform: MLflow, Kubeflow
Operations: Git, Docker, Kubernetes on GCP
Communication: English (TOEIC 645), Vietnamese (native speaker)
PROJECTS
Owner / Federated Learning for Credit Scoring 2022
Graduate Thesis
• Implement Federated Learning on the Home Credit Default Risk dataset to show that a deep learning model can be trained collaboratively without participants exposing their raw data.
• The Home Credit Default Risk dataset was published for a Kaggle competition on predicting which loan applicants will be unable to repay their loans. I performed exploratory data analysis and partitioned the data across 50 nodes to simulate 50 contributors. I then used the Federated Averaging and Federated Proximal algorithms to train a shared model that detects which customers are unable to repay their loans.
• The results show that the accuracy of the decentralized (federated) model is close to that of the centralized model.
Tech stack: PyTorch
Owner / Comment Classification 2021
Personal Project
• Extract over 100,000 comments from Shopee’s e-commerce website using a web crawler, then process the data and build a dictionary by tokenizing all of the comments.
• Classify comments with high accuracy using Logistic Regression.
• Create a website with a user-friendly interface that lets users interact with the model, and publish the web app to Docker Hub.
Tech stack: Underthesea, Django, Docker