Machine Learning Data Science

Location:

Quan 1, 71000, Vietnam

Posted:

March 28, 2025


Resume:

Tran Minh Hien

Curriculum Vitae

****************@*****.*** 039******* Hồ Chí Minh tmh1008 TMHippo

tmh1008

INTRODUCTION

As a highly motivated and passionate senior-year Data Science student, I am seeking a position in a dynamic organization where I can apply my skills and knowledge in machine learning and deep learning. I am looking for an opportunity to work with a talented team and to keep improving both my technical and professional skills.

EDUCATION

Bachelor's Degree in Data Science

University of Science - Vietnam National University GPA: 3.28/4

2021 – present

TECHNICAL SKILLS

Programming Languages

Python, R, SQL, C/C++, Dart, HTML, CSS

Hard Skills

Machine Learning, Deep Learning, Natural Language Processing (NLP), Data Preprocessing & Statistics, Data Visualization

Tools & Platforms

Git, GitHub, Jupyter Notebook, Google Colab

Framework

TensorFlow, Keras, scikit-learn, FastAPI, Hugging Face

AWARDS

TOP 15 Group A

UIT Data Science Challenge 2023

Project: APP RESOLEMATE GitHub

11/2023

Description: The application is a chatbot created using Dialogflow, aimed at collecting information about social life complaints through messaging.

•Speeds up information collection, groups complaints and classifies them by level, and forwards them quickly to the responsible handler

PROJECTS

SEMINAR-CROSS SECTIONAL MOMENTUM GitHub 09/2024 – 01/2025

•Designed, implemented, and evaluated 4 investment strategies (Long-only, Short-only, Long-Short, Equal-Weight) using both technical indicators (MACD, Return) and machine learning models (LambdaMART, Random Forest, ListNet)

•Developed ranking-based portfolio allocation combining momentum signals and Learning to Rank (LTR) algorithms

•Conducted backtesting across 3 timeframes (Train, Validation, Test) with metrics including: Sharpe Ratio, Information Ratio, Expected Return, Volatility, Max Drawdown, Hit Rate, Avg Profit/Loss

•Key findings:

• Long-Short strategies consistently outperformed the others in both returns and risk control

• MACD (technical) and LambdaMART (LTR model) delivered the most robust and reliable results

•Tools: Python, pandas, scikit-learn, Matplotlib, financial modeling

SEOUL BIKE RENTAL GitHub 05/2024 – 07/2024

•Description: This project uses R and machine learning to forecast bike demand in Seoul, optimizing bike-sharing operations and resource allocation for improved customer satisfaction and sustainability.

•The main focus is on processing and conducting statistical analysis of bike rental data, followed by developing strategies to minimize errors. A regression model is built to predict future bike rental demand, helping to forecast the number of bikes needed for rental.

DENOISE BY SVD Colab GitHub 05/2024 – 06/2024

•Description: Create a noise reduction system using Singular Value Decomposition (SVD)

•Focus on analyzing different types of noise and handling them mathematically without using libraries. The system should be deployed on Streamlit, allowing users to adjust the number of key components retained in the audio.

MOVIE RECOMMENDER GitHub 10/2023 – 12/2023

•Description: Create a movie recommendation system using a dataset of 45,000 movies released on or before July 2017, including attributes such as cast, budget, revenue, release date, language, and ratings. It also contains 26 million ratings from 270,000 users, with ratings ranging from 1 to 5, collected from the GroupLens website.

•Build a recommendation model that combines Content-Based Filtering and Collaborative Filtering, using techniques such as scaled weighted averages and popularity scores. This hybrid approach leverages the strengths of both methods to deliver more accurate and personalized recommendations.

FACIAL ATTRIBUTE PREDICTION GitHub Demo 05/2023 – 07/2023

•Description: The project involves detecting attributes from a facial image

•This project focuses on building and optimizing a CNN for facial attribute recognition. The model extracts features through convolutional layers, reduces dimensions with pooling, and classifies through fully connected layers. Optimization involves tuning hyperparameters, using data augmentation, dropout, and transfer learning to improve accuracy and generalization.

ICR - IDENTIFYING AGE-RELATED CONDITIONS GitHub 05/2023 – 07/2023

•Description: The competition involves a binary classification problem where you need to predict whether a subject has been diagnosed with one of three age-related conditions based on anonymized health characteristics.

•The main focus is on data processing and building a custom XGBoost model, which achieved a log loss of 0.18.
