Lam La Email : ***********@*****.***
Mobile : +84-96-432-****
Education
• Hanoi University of Science and Technology, Hanoi, VN. Bachelor of Data Science; GPA: 3.20. Aug. 2020 – Aug. 2024
• Nguyen Hue High School of Gifted Students, Hanoi, VN. Chemistry Major. Aug. 2017 – July 2020
Experience
• Sotatek, Hanoi, VN
Data Engineer Sept 2023 - Sept 2024
Data Collection: Collected oil and gas data from various sources for training classification models. Crawled Bitcoin data from popular websites to cross-check against the transformed data.
Data Transformation: Transformed raw Bitcoin data stored in ClickHouse to serve downstream tasks, including displaying users’ daily balances and transaction information. Used the data build tool (dbt) to maintain transformation scripts in a unified repository.
Data Warehousing: Set up, designed schemas for, and managed MongoDB and PostgreSQL databases. Implemented the Open Subsurface Data Universe (OSDU) data schema in similar oil and gas data systems. Designed the schema of a vector database (MilvusDB) for data retrieval in a chatbot system.
Data Processing: Designed the architecture for processing change data capture (CDC) events. Built and deployed microservices to process Kafka messages carrying changes captured from non-relational (MongoDB) and relational (PostgreSQL) databases. Built data pipelines for the vector database.
Chat Bot Application: Implemented a RAG architecture and used Azure OpenAI Services to build REST APIs for a virtual assistant system that draws on knowledge stored in MilvusDB or retrieved from the internet.
Machine Learning: Developed a data pipeline for oil and gas tasks, including crawling oil and gas data, designing the data schema, processing data with PySpark, and training machine learning models with Dataiku.
• FTECH AI, Hanoi, VN
Data Scientist Intern Sept 2022 - Jan 2023
Data Collection and Management: Monitored and maintained an ETL pipeline on an Airflow server that ran jobs for crawling and preprocessing data with PySpark.
Data Processing: Wrote SQL and NoSQL queries to create tables for downstream visualization tasks.
Chat Bot Application: Built a chatbot using LLMs provided by OpenAI and related packages (LangChain and vector stores).
• Freelancer, Hanoi, VN
Full-stack Developer Jan 2022 - Sept 2022
Data Collection: Developed a web-based scraping tool in Python (Scrapy, Selenium) that crawls and processes data from other websites in fields such as jobs and real estate.
Web Development: Developed and maintained back-end services (Django), built a user interface for monitoring crawling jobs, and deployed the web application using Docker.
Projects
• Data Scraping Tool: Designed jobs for crawling data from Singapore Exchange (SGX) using Scrapy and Selenium.
• Airport Analysis: Built a data pipeline for daily airport data collection, used Airbyte to move the data from MySQL to Amazon Redshift, and visualized it in Power BI. Provided transformation scripts in the data build tool (dbt) repository.
• Stock Price Prediction: Trained a deep learning model to predict closing stock prices and produced an interactive visualization using Gradio.
Skills
• Programming Languages: Python, SQL, JavaScript, Java
• Certification: Completed the majority of the courses provided by IBM for Data Engineer (11/13 courses); OpenAI Deep Learning Specialization; Prompt Engineering provided by DeepLearning.AI
• Language: Fluent English (IELTS 7.0)