***************@*****.***
Thu Duc, Ho Chi Minh City, Vietnam
I am a student at the HCM University of Science, majoring in Data Science. With a strong passion for working with data and models, I am eager to gain experience solving real-world problems in a company. I am a dedicated person and a collaborative teammate. I find working with data and models both challenging and rewarding, and I am always looking for new things to learn to improve my skills and gain valuable experience. I am currently seeking an internship position in the data field. I am excited to join your company and contribute my skills and knowledge.
Bachelor's Degree in Data Science, August 2021 - July 2025
Ho Chi Minh University of Science (HCMUS)
1-on-1 Data Engineer Coaching, Mar 2024 - Now
UniGap Academy
USER REVIEW DATA COLLECTION AND NLP SENTIMENT ANALYSIS PROJECT: LEVERAGING NATURAL LANGUAGE PROCESSING FOR INSIGHT
Nov 2023 - Dec 2023
Objective: Collect user reviews from Trustpilot and apply advanced NLP techniques for sentiment analysis.
-Data Collection: Developed a robust data collection system to gather user reviews from Trustpilot, ensuring a comprehensive dataset for analysis.
-NLP Analysis: Leveraged NLTK (Natural Language Toolkit) to process and analyze the collected data, extracting key insights and sentiments expressed in the reviews.
-Sentiment Categorization: Conducted sentiment analysis on user reviews to categorize sentiments into positive, negative, or neutral categories, providing valuable insights into customer sentiment.
-Application Development: Developed a user-friendly Streamlit-powered application to showcase the practical applications of NLP in sentiment analysis. This interactive
Data Analyst
TUẤN KHA TRẦN
// Contact Information
https://www.linkedin.com/in/tran-kha-0508abcd/
// Skills
Programming Languages: Proficient in Python, with expertise in libraries such as pandas, scikit-learn, TensorFlow, Selenium, and NLTK, as well as experience in R and C++.
Tools and Technologies:
-Linux: Basic proficiency in the Linux operating system for data engineering tasks.
-Git: Basic proficiency in version control using Git for collaborative software development.
-Apache Airflow: Basic proficiency in workflow
// Objective
// Education
GPA: 8.2/10
5 months of hands-on learning and real-world experience in the 1-on-1 data engineer coaching program at UniGap.
// Project (https://www.datascienceportfol.io/trantuankha0508)
British Airways Data Science Online Internship, 2023
Coursera IBM Data Analyst Certificate, 2023
Coursera Machine Learning by Andrew Ng, 2023
Automated Data Pipeline Project for Tiki
Mar 2024 - May 2024
platform allows users to explore and understand the nuances of emotion classification in user reviews.
Objective: Automate data extraction, transformation, quality assurance, and the pipeline from Tiki.vn.
-Data Crawling: Implemented web scraping techniques to extract comprehensive product data from Tiki.vn.
-Data Storage: Stored extracted data efficiently in MongoDB, optimizing retrieval with indexed "short_description" fields.
-Data Extraction and Transformation: Extracted essential data from MongoDB and processed it into CSV format for downstream analysis.
-Data Quality Assurance: Implemented Soda for rigorous data quality checks, ensuring anomaly detection and data integrity validation in CSV files.
-Data Loading into Google BigQuery: Managed seamless loading of validated data from CSV files into Google BigQuery for scalable analytics.
-Data Modeling with DBT: Executed star schema data modeling on Google BigQuery, enhancing query performance and data organization.
-Automation with Apache Airflow: Orchestrated end-to-end workflow automation using Apache Airflow's Directed Acyclic Graph (DAG). Scheduled pipeline execution, managed dependencies, and monitored task statuses for streamlined operations.
DATA SCIENCE AND AI CLUB
November 2023 - Now
Member of the club's content team
-Curate and produce engaging data science content for dissemination within the club and to external audiences.
automation and management using Apache Airflow.
-DBT: Basic proficiency in data transformation for building data warehouses.
Database Management: Basic proficiency in SQL and NoSQL.
Data Architecture: Knowledgeable about Data Warehouse and Data Lake concepts, with an understanding of their role in data management and analytics. Basic proficiency in using Google BigQuery for scalable and efficient data warehousing and data transformation.
Mathematical Skills: Strong foundation in Calculus and Statistics for data analysis and modeling.
Data Visualization: Basic proficiency in data visualization tools, including Tableau and Power BI, for creating insightful and interactive visualizations.
// Certifications
// Activities