Data Analyst Machine

Location:

Fort Worth, TX

Posted:

March 25, 2021

Resume:

MINH DANH NGUYEN

+84-86-513**** • +1-682-***-****

linkedin.com/in/1028 • github.com/minhnguyen10

EDUCATION

Texas Christian University, Fort Worth, TX

Bachelor of Science in Computer Science, May 2021

Overall GPA: 3.8 – Major GPA: 3.7

Programming: Python (Advanced), R (Intermediate), Java (Advanced), C (Fundamental)

Big Data: Spark in Scala, Hadoop, SQL

Tools: Git, Notebook, SQL Workbench, R Studio, Anaconda

External libraries and frameworks: Scikit-Learn, NumPy, Pandas, Seaborn, Matplotlib, TensorFlow

EXPERIENCE

Research Assistant, TCU Department of Computer Science April 2019 – Present

Machine Learning and Deep Learning Research Fort Worth, TX

• Develop and apply machine learning models to different research projects

• Work with several libraries to implement Machine Learning and Deep Learning models

Big Data Analyst Intern, Viettel Cyber Security June 2020 – Sept 2020

Big Data and Machine Learning Hanoi, VN

• Clean and process user log file data using Spark in Scala to build out solutions to support product needs

• Cluster and analyze unique proprietary datasets

PROJECTS

Truck Detection Sept 2020 – Present

Senior Design Project

• Build a truck image segmentation pipeline from scratch

• Mine and process satellite images scraped from the Mapbox API to build the dataset

• Implement a U-Net deep learning model to segment trucks from the images

Car Price Prediction Dec 2019

Data Mining and Visualization Final Project

• Pre-process the data using the NumPy, Pandas, and Seaborn libraries

• Implement Decision Tree, Random Forest, Linear, and Polynomial regression models in a Google Colab notebook

• Visualize predicted results and compare them with the actual values using Matplotlib

Beijing House Price Prediction May 2019 – Dec 2019

Machine Learning Research Project

• Pre-process the data, handle outliers and missing data, remove unnecessary features, replace features, and one-hot encode categorical attributes using NumPy and Pandas

• Implement Random Forest, Extreme Gradient Boosting, Light Gradient Boosting Machine, Hybrid Regression, and Stacked Generalization regression models with Python

• Select the best models by RMSLE on the Beijing house price dataset to provide a more efficient way to predict house prices

CERTIFICATE & RELEVANT COURSEWORK

IBM Data Science Professional Certificate, Calculus, Linear Algebra, Data Mining and Visualization, Database System, Artificial Intelligence, Data Science – Dataquest.io, Neural Networks and Deep Learning – deeplearning.ai
