Data Scientist, Data Engineer

Location: Ellicott City, MD

Salary: $55,000

Posted: January 23, 2021

Resume:

Daniel Joseph Park

443-***-**** ************@*****.*** Ellicott City, Maryland

EDUCATION

Towson University, Towson, MD

Bachelor of Science, Computer Sciences with a Concentration in Cyber Security, May 2018

Relevant Coursework:

-Data Structures and Algorithm Analysis

-Discrete Mathematics

-Elementary Linear Algebra

-Statistical Methods

University of Maryland, Baltimore County, Catonsville, MD

Master of Professional Studies, Data Science, May 2020

Relevant Coursework:

-Data Management

-Big Data Processing

-Intro to Data Analysis and Machine Learning

SOFTWARE & TECHNICAL SKILLS

Languages: Java, C, C++, SQL, Python, R

Skills: Jupyter Notebook, Apache Spark (RDDs, Spark SQL), Matplotlib, NumPy, Pandas, Seaborn, Supervised Learning (Linear Regression, Logistic Regression, Decision Trees, Random Forest, Neural Networks), Unsupervised Learning (K-means Clustering), Scikit-learn, Git

PROJECTS

Airbnb Pricing

Used machine learning models to predict Airbnb listing prices in Washington, D.C., using an existing data set from insideairbnb.com.

Cleaned the data with pandas, reducing the number of columns, and created visualizations with Matplotlib.

Built an XGBoost model and a three-layer neural network with L1, L2, and dropout regularization.

Used feature engineering to lower the MSE and raise the coefficient of determination (R²), producing more accurate models.

Predicting Bike Rentals

Applied linear regression, decision tree, and random forest models to predict the total number of bikes rented in a given hour in Washington, D.C.

Used Mean Squared Error (MSE) as the error metric to compare the predictive accuracy of each model.

Concluded that the random forest gave the lowest MSE, likely because of nonlinear relationships in the predictors and its ability to reduce overfitting relative to a single decision tree.

Predicting Car Prices

Applied the k-nearest neighbors algorithm to predict a car’s market price from its attributes.

Built both univariate and multivariate models and tuned them using k-fold cross-validation and grid search.

Used Root Mean Squared Error (RMSE) as the error metric and visualized the results using Matplotlib.

Discovered that a higher k-value resulted in a higher RMSE.
