Daniel Joseph Park
443-***-**** ************@*****.*** Ellicott City, Maryland
EDUCATION
Towson University Towson, MD
Bachelor of Science, Computer Science with a Concentration in Cyber Security May 2018
Relevant Coursework:
-Data Structures and Algorithm Analysis
-Discrete Mathematics
-Elementary Linear Algebra
-Statistical Methods
University of Maryland, Baltimore County Catonsville, MD
Master of Professional Studies, Data Science May 2020
Relevant Coursework:
-Data Management
-Big Data Processing
-Intro to Data Analysis and Machine Learning
SOFTWARE & TECHNICAL SKILLS
Languages: Java, C, C++, SQL, Python, R
Skills: Jupyter Notebook, Apache Spark (RDDs, Spark SQL), Matplotlib, NumPy, Pandas, Seaborn, Supervised Learning (Linear Regression, Logistic Regression, Decision Trees, Random Forest, Neural Networks), Unsupervised Learning (K-means Clustering), Scikit-learn, Git
PROJECTS
Airbnb Pricing
Used machine learning models to predict Airbnb prices in Washington, D.C., using an existing dataset from insideairbnb.com.
Cleaned the data with pandas, reducing the number of columns, and created visualizations with Matplotlib.
Built an XGBoost model and a three-layer neural network with L1, L2, and dropout regularization.
Used feature engineering to lower the MSE and raise the coefficient of determination (R²), producing more accurate models.
Predicting Bike Rentals
Applied linear regression, decision tree, and random forest models to predict the total number of bikes rented in a given hour in Washington, D.C.
Used Mean Squared Error (MSE) as the error metric to compare the predictive accuracy of each model.
Concluded that the random forest gave the lowest MSE because it handles nonlinear predictors and reduces overfitting.
Predicting Car Prices
Applied the k-nearest neighbors algorithm to predict a car’s market price from its attributes.
Built both univariate and multivariate models and tuned them using k-fold cross-validation and grid search.
Used Root Mean Squared Error (RMSE) as the error metric and visualized the results using Matplotlib.
Discovered that a higher k-value results in a higher RMSE.