
Machine Learning Information Systems

Location:
Ho Chi Minh City, Vietnam
Salary:
1,000,000 VND
Posted:
June 23, 2025


Resume:

TAO CHI VY

Data Engineer - Data Scientist

Contact

Phone: 032*******

Email: *****.********@*****.***

Address: Linh Dong, Thu Duc, Ho Chi Minh City

ABOUT ME

I am Tao Chi Vy, an Information Systems student specializing in Data Engineering with experience in full-stack development using Django. I have a solid foundation in ETL pipelines, data warehousing, and system optimization to enhance data processing efficiency. I design scalable data architectures and integrate machine learning models into data-driven applications. Additionally, I develop web applications and APIs that ensure smooth interaction between frontend and backend.

Education

HCMC UNIVERSITY OF TECHNOLOGY AND EDUCATION

Specialization: Information Systems

2021-2025

Skills

Data Engineering (Primary Focus)

Big Data & Streaming

Apache Hadoop, Apache Spark, Apache NiFi, Apache Kafka

WORK EXPERIENCE

Data Engineer - Data Scientist Intern

VNA Group, 3 months (9/2024 - 11/2024)

Built a centralized data warehouse covering poor households and people with meritorious service to the revolution, supporting data-driven policy development.

Improved data retrieval efficiency by 30% using PostgreSQL and MySQL over two months.

Developed a real-time ETL pipeline with Apache NiFi, boosting processing speed by 40%.

CAREER OBJECTIVE

Aspiring to advance my career in Data Engineering by focusing on designing and optimizing ETL pipelines and data workflows. I have hands-on experience with Apache NiFi, Kafka, Spark, Hadoop, and databases including PostgreSQL, MySQL, and MongoDB. Leveraging my backend development skills with Django and experience integrating machine learning models, I aim to build efficient, scalable data systems that drive business value.

Relevant coursework:

Big Data Analytics - Coursera

Samsung Innovation Campus - Big Data Course

Foundations of User Experience (UX) Design - Coursera

https://www.linkedin.com/in/taovy060103

LinkedIn

https://github.com/Cloudy009

GitHub

https://chivy-mycv.onrender.com

Personal Website

Skills

Backend & API Development

(Supporting Role)

Backend Development

Python (Django, FastAPI, Flask)

Java (Spring Framework)

IntelliJ IDEA

API Development

RESTful APIs, Dialogflow

Deployment & DevOps:

Docker, AWS, Render, Railway, MongoDB Atlas, Git, Supabase, Neon

Frontend Development

Languages & Frameworks

JavaScript, React.js, HTML, CSS, SCSS

Other Skills

Tools & Platforms

Jupyter Notebook, Dash, DBeaver, pgAdmin, MongoDB Compass

Soft Skills

Communication

Problem-solving

Critical thinking

Solution design

Teamwork

Adaptability

PROJECT

Oracle-Based Data Pipeline with Airflow, Spark & Power BI

Developed an end-to-end data pipeline using Oracle, Airflow, Spark, MinIO, and DBT. Automated ingestion from CSV/Excel into Oracle, enriched data from external APIs (stored in MinIO), performed transformations with Spark and DBT, and visualized KPIs in Power BI dashboards.

Real-Time Twitter Sentiment Analysis

Developed a real-time pipeline to analyze public sentiment from Twitter using AWS EC2, Apache NiFi, Kafka, Spark Streaming, and MongoDB.

Deployed TF-IDF + Logistic Regression model for tweet classification

Streamed and processed live tweets; stored results in MongoDB

Visualized trends with Dash & Plotly; containerized with Docker Compose

Real-Time E-commerce Analytics Dashboard with Spark, Kafka & Grafana

Built a real-time analytics system for e-commerce user behavior using Kafka, Spark Streaming, and MySQL.

Stored and visualized time-series data with InfluxDB and Grafana.

Integrated batch demographic data with streaming activity for combined analytics.

Deployed via Docker Compose with performance tuning and monitoring.

Delivered dashboards showing campaign performance, gender-based order distribution, and real-time insights.

Optimized RESTful APIs with FastAPI, cutting data retrieval time by 50%.

Deployed and managed services with Docker for efficient containerization and orchestration.

Data Engineer - Freelance Project

VNA Group, 2 months (3/2025 - 4/2025)

Designed and automated a centralized admissions data warehouse for high school applicants, using a hybrid Star–Snowflake schema and SCD Type 1.

Scheduled daily ETL jobs using Pentaho PDI.

Triggered real-time ETL updates via Pentaho Carte API, achieving data refresh within ~1 minute.

Ensured data accuracy and reliability through testing with Postman and JavaScript Fetch API.

Delivered ready-to-use data for strategic insights into high school admissions and student demand.

ETL & Data Processing

Machine Learning Integration, Data Processing, Recommendation Systems, ETL Processes, DBT, Pentaho PDI, Excel

Database & Storage

SQL Server, MySQL, PostgreSQL, MongoDB, Cassandra, Firebase, Neo4j, CouchDB, MinIO, InfluxDB, Oracle

Data Visualization:

BI Tools: Tableau, Power BI

Real-time Dashboards: Grafana

Orchestration & Automation

Power Automate

CERTIFICATE

Google Data Analysis

TOEIC 600+

Samsung Innovation Campus (SIC)

Full portfolio & source code on GitHub: github.com/Cloudy009

Build Sales Data Warehouse for Business Intelligence

Designed and implemented a data warehouse for sales reporting, order fulfillment, and inventory analysis using SQL Server and Visual Studio Community 2022.

Dockerized ETL Pipeline with Airflow & FastAPI

Created a Docker-based ETL pipeline with Airflow for a PostgreSQL data warehouse. Implemented multi-layer (Bronze–Gold) processing, integrated MinIO for storage, restored data with pgAdmin, and exposed FastAPI endpoints with REST-triggered DAGs for real-time synchronization.

Customer Segmentation & Recommendation System

Processed behavioral data and developed machine learning models for customer segmentation and product suggestion.

Sales Performance Dashboard with Power BI

This Power BI project delivers a sleek and interactive dashboard for visualizing sales performance data across regions, products, and time periods. It features dynamic slicers, auto-refresh with Power Automate, a modern interface designed in Figma, and real-time refresh timestamps.

Passport Renewal System with Oracle & Django

Built a web application for managing passport renewals using Django and Oracle. Included authentication, form validation, and data persistence features.

Food & Beverage Web App with Chatbot

Developed a Django-based F&B web application with MySQL backend and a Dialogflow-integrated chatbot. Also deployed a version using MongoDB Atlas for flexible data storage. Hosted on Render and Railway for scalable deployment.

Language

English

Japanese

Hobby

Singing

Playing guitar

Sport


