Fresher Data Engineer / AI Data Engineer Candidate

Location:
Linh Xuan, Vietnam
Posted:
May 13, 2026

Resume:

PHAN LONG DAO

Fresher Data Engineer / AI Data Engineer

Expected Graduation Date: Sep 2026

Phone: +84-889******

Address: P. Linh Xuan, Tp. Ho Chi Minh

Gender: Male

Email: *************@*****.***

Interests: Passionate about system design and building scalable data architectures integrated with AI/LLM models.

EDUCATION

HO CHI MINH CITY UNIVERSITY OF SCIENCE (HCMUS)

Bachelor of Data Science

Major in Data Science

GPA: 3.6/4.0 — IELTS: 7.5

Relevant Coursework: Big Data Processing (Spark, Hadoop), Machine Learning, Data Warehouse Systems, REST API, Explainable AI (SHAP), Data Engineering (Airflow, Kafka), AI (MCP, RAG)

WORK EXPERIENCE

MIND IOT

AI Agent Data Engineer Intern Mar 2026 – May 2026

• Integrated the Model Context Protocol (MCP), enabling LLM agents to perform CRUD operations, schema inspection, and unstructured knowledge reading on an Odoo database, reducing manual prompting by 80%.

• Adopted and fixed the MCP authentication proxy server from the open-source mcp-auth-proxy project to implement OAuth 2.0.

• Implemented RAG to index the knowledge base for use as a tool.

SAIGON AI

Data Science Intern Sep 2025 – Dec 2025

• Built an early warning system analyzing 8,000+ HP product reviews, predicting the trend for each product within 30 days.

• Developed ML models (Random Forest, XGBoost, LSTM) for rating prediction and for sentiment and issue classification.

• Performed EDA with 25+ visualizations analyzing rating distributions, geographic patterns, and temporal trends; implemented a review credibility system filtering out 15% of reviews as unverified.

• Extracted insights using BERTopic and ABSA to automate recurring problem identification.

• Deployed results through Streamlit dashboards and REST APIs.

PROJECTS

SGX DERIVATIVES DAILY DOWNLOADER 5 Aug 2025 – 10 Aug 2025

Individual Project — Technologies: Python, Airflow, Logging

• Automated SGX data ingestion (2013 – present) with auto-retry, saving the department 1 hour daily.

• Built a comprehensive logging system tracking download status, failures, and permanent errors in structured JSON format for monitoring and debugging; scheduled the task to run daily with Apache Airflow.

• Note: Project maintained under confidentiality requirements; GitHub repository not publicly available.

REAL-TIME CREDIT CARD TRANSACTION ANALYTICS SYSTEM Jun 2025 – Jul 2025

Team Leader — Technologies: Kafka, Spark Streaming, Hadoop HDFS, Power BI, Airflow, Python

• Led a team of 4 to build a streaming system processing 100k+ transactions/day with less than 3 s latency.

• Tech stack: Kafka, Spark Streaming, Hadoop HDFS, Power BI and Airflow.

• Reduced false positives by 25% through real-time feature engineering and model scoring, displayed in a BI dashboard.

• GitHub: github.com/longdaophan/Data-Engineering-project

ADDITIONAL

Technical Skills:

• Languages & Frameworks: Python (FastAPI, SQLAlchemy), SQL, JavaScript, React.

• Data Engineering: Spark, Kafka, Apache Airflow, Hadoop HDFS, Data Warehousing.

• AI Agent & ML: MCP, LangChain, LangGraph, scikit-learn.

• Tools: Redis, Git, Power BI, Streamlit.