Data Engineer
JOHN DOYLE
808-***-**** *******@*****.***
https://github.com/JohnK-Doyle
https://www.linkedin.com/in/john-kozo-doyle//
Seaside, CA 96955
Skills
Languages:
Python (Pandas, NumPy, SciPy, Matplotlib,
Scikit-Learn, Keras, TensorFlow, PySpark)
R: tidyverse tools (e.g., dplyr and ggplot2) and
other data munging tools
SQL (MS SQL Server, MySQL, Spark SQL,
PostgreSQL)
Databases:
RDBMS
Google BigQuery
Data Ingestion
Web scraping
APIs
Tools
UNIX, Linux and Linux on Windows,
Virtual Machines (AWS & Azure)
Tableau
Excel (VLookup, Conditional
Formatting, Pivot Tables)
Microsoft Power BI
Machine Learning
Logistic Regression
Linear Regression, Multiple Linear
Regression
Decision Trees, Regression Trees
Gradient Descent
Soft Skills
Collaboration, Problem Solving Skills
Research, Attention to Detail
Presentation Skills
Projects
SENTIMENT ANALYSIS ON IMDB DATA – Personal Project – Seaside, CA - April 2023
● Utilized Python to perform EDA on the dataset.
● Utilized Python and Scikit-Learn to perform sentiment analysis on 50,000 movie reviews.
● Performed data cleaning and split the reviews and their sentiment labels into training and test sets.
● Applied regression and machine learning to produce a final model.
ANOMALY DETECTION OF LARGE AMOUNTS OF DATA – Work Project – Pearl Harbor, HI - May 2021
● Applied Python and machine learning algorithms (Scikit-Learn, Keras, and TensorFlow) to a large shipping database.
● Ran the analysis as an overnight batch job.
● Sharded the results across numerous sectors of what was then a shared directory (pre-cloud for the client).
● Aggregated and visualized the data by using pandas, matplotlib and wordcloud to compile a professional report.
● Briefed results to operational leaders for decision-making.
Work Experience
DATA ANALYST – USGI – Seaside, CA October 2022 - April 2023
I used Excel, R, and Python to sort through big data and to conduct ETL, working in RStudio and JupyterLab via Anaconda on a classified system as well as on an unclassified home system. I used Microsoft SQL Server and have also used SQLite and MySQL.
I used Excel, dplyr in R, and pandas in Python for data manipulation; the ggplot2 package in RStudio to create visualizations; and Tableau, Power BI, and Excel to build dashboards. I also used Matplotlib and Seaborn in Python.
DATA ENGINEER – Calhoun International, LLC – Honolulu, HI January 2021 - September 2022
I used Python and SQL. In Python, I used Word2Vec, Doc2Vec, Scikit-Learn, Keras, and TensorFlow to write machine learning and sentiment analysis programs, including anomaly detection of outliers in a large dataset, and briefed the results to operational decision makers.
N6 DEPARTMENT HEAD – Navy Information Operations Command Hawaii – Honolulu, HI October 2017 - December 2020
Managed and led 15 Information Systems Technicians (ITs), including their training, and handled evaluations, awards, and mentoring for the 15 ITs and 12 logistics personnel, demonstrating leadership and mentorship.
Education
MASTER OF ARTS – Liberty University – Lynchburg, VA 2013 Major: Management and Leadership
BACHELOR OF SCIENCE – Joint Military Intelligence College – Washington, DC 2004 Major: Strategic Intelligence
ASSOCIATE OF SCIENCE – Hawaii Pacific University – Honolulu, HI 2000 Major: Computer Science
IBM Data Engineering Essentials, May 09, 2023 - This credential earner has demonstrated a foundational knowledge of the core concepts, ecosystem, and life cycle of data engineering. The badge earner can explain the use of data integration platforms and how they relate to data pipelines and the ETL and ELT processes. The earner can also describe the types of issues that can impact the performance of data pipelines and databases and steps for troubleshooting these issues.