Data Engineer Machine Learning

Location:
Arvada, CO
Posted:
October 18, 2024

Resume:

Jacquelyn Harland

Data Engineer

***********@*****.*** • 630-***-****

LinkedIn URL • GitHub • Portfolio • Denver, CO 80007

Innovative, results-driven, and AWS-certified Data Engineer with a track record of architecting scalable cloud data warehouse solutions and optimizing data pipelines, while leveraging AWS services to enhance performance and infrastructure efficiency. Expert in developing robust ETL processes, implementing data governance frameworks, and managing complex data engineering projects from inception to production. Proficient in using a wide array of technologies, including AWS Glue, Lambda, EMR, and Snowflake, to develop and maintain scalable data infrastructures. Skilled at writing Python scripts for data enrichment, creating REST APIs, and employing machine learning frameworks to derive actionable insights from big data. Recognized for enhancing system performance through innovative memory management techniques, optimizing code for improved QAQC processes, and effectively communicating complex technical results to both technical and non-technical stakeholders. Passionate about leveraging data engineering expertise to drive business value, ensure data quality, and support critical decision-making processes across organizations.

Areas of Expertise

• Big Data Technologies & Apache Spark

• Data Governance & Privacy Compliance

• Cloud Data Warehousing & ETL Pipelines

• AWS Services & Cloud Architecture

• Data Quality Assurance & Validation

• Data Visualization/Business Intelligence

• Application Development

• Python & SQL Programming

• Machine Learning Integration

Technical Proficiencies

Data Visualization Tools: Software: Power BI, Tableau

Cloud Data Warehousing: Platforms: Redshift, Aurora, Snowflake, MySQL Workbench, TOAD

ETL Tools: Technologies: AWS Glue (Crawlers, Data Catalog, ETL Jobs, Interactive Sessions, Triggers, Workflows), Athena, AWS CloudFormation, Lambda, EC2, SAS, AppFlow, Firehose, EMR, EventBridge, Kinesis, AWS Step Functions

Programming Languages: Languages & Frameworks: SQL, Python (including machine learning frameworks, such as scikit-learn, PyTorch), Oracle PL/SQL, PySpark, Spark SQL

Cloud Platforms/Big Data Technologies: Cloud Services: AWS (EMR, API Gateway, SAM, CLI, EC2, SageMaker, S3, KMS, IAM), Jitterbit, Salesforce; Big Data: Apache Spark, Hadoop

Data Storage Solutions: Databases: MongoDB, DynamoDB, RDS, Redshift, MySQL Workbench, Oracle, Postgres, AWS S3; Technologies: Snow family (Snowpipe, SnowSQL)

Data Engineering/Operations: Capabilities: Data Marts, Git/GitHub, Data Governance, Data Privacy; Containers: Docker, AWS ECS, AWS EKS

Infrastructure Construction: Tools: SAM CLI for CloudFormation, CDK for deploying Lambda functions, API Gateway, DynamoDB locally with reusable constructs

Professional Experience

Data Engineer Walsworth – Denver, CO (Remote) 2023 – Present

Engineer and optimize cloud data warehouse solutions to enhance performance, scalability, and infrastructure efficiency. Develop and deploy robust data pipelines to optimize data models and facilitate comprehensive data-driven integration. Manage workflows using Apache Airflow to automate and ensure efficient scheduling and execution of data tasks. Implement data privacy and governance policies to ensure encryption, anonymization, and compliance across all data handling processes. Create and maintain scalable cloud infrastructure for data lakes and labs using AWS services such as EMR, SageMaker, and S3. Write Python scripts for data enrichment and integration. Establish data validation protocols and quality metrics to ensure the accuracy, completeness, and consistency of data. Lead projects from inception to production by managing all aspects of implementation.

Key Contributions:

● Enhanced system performance by engineering scalable cloud data warehouse solutions that improved data processing capabilities.

● Boosted data accessibility and utility by leading the development and deployment of multiple high-performance data pipelines.

● Implemented a data governance framework that strengthened data privacy, compliance, and security across the organization.

● Led data validation standards that improved data quality and reliability, supporting critical business decisions and operations.

● Developed and maintained a robust data infrastructure that supported scalable data operations and enabled efficient data analysis and machine learning capabilities.

● Fulfilled data requirements and communicated complex technical results to senior leaders and non-technical audiences in collaboration with cross-functional teams.

Data Engineer EDF Renewables – Denver, CO (Remote) 2022 – 2023

Architected and deployed serverless applications on AWS using core technologies such as S3, DynamoDB, Elastic Beanstalk, Lambda, SageMaker, and API Gateway. Interacted with AWS through CLI, APIs, and SDKs to conduct data validation and analysis in production and sandbox environments. Enhanced quality assurance and control by modifying code to add clear, concise features, and implemented graphical batch displays for nighttime processes. Designed code to enable users to address data reprocessing needs prior to analyzing QAQC parameters, enhancing data integrity and usability. Optimized memory management by caching datasets and refactoring functions to add logging prefixes automatically.

Key Contributions:

● Streamlined the user experience for data reprocessing by devising and implementing coding enhancements.

● Expanded functionality and improved system integration by developing and maintaining a Flask-based REST API.

● Enhanced system performance and reduced overhead by implementing innovative memory management techniques.

● Attained AWS Certified Developer – Associate credential by demonstrating proficiency in identifying and leveraging AWS technologies to build and troubleshoot serverless applications.

● Improved deployment processes and application scalability by creating and implementing a robust serverless application using AWS Step Functions, API Gateway, Lambda, and S3.

● Optimized nighttime operations and graphical data presentations by advancing QAQC processes through code alterations that enable more precise data analysis and feature enhancements.

Data Systems Analyst Kleinfelder – Denver, CO 2021

Managed specialized data analysis requests, performing both quantitative and qualitative assessments to meet diverse client needs. Utilized advanced scripting, querying, and analytics tools including T-SQL/SSRS, Report Builder, Zerion-iForm Builder, and ETL/SSIS to optimize data utility and accuracy. Conducted data mapping and analysis to ensure data integrity and actionable insights.

Key Contributions:

● Improved operational efficiency and data presentation quality by proficiently creating and managing dashboards in Power BI.

● Enhanced data visualization and interpretation capabilities by collaborating in the development of a user-friendly software application, specifically for groundwater monitoring, utilizing R and GWSDAT for British Petroleum.

Honors Math Teacher Jeffco PS – Arvada, CO & Cherry Creek – Aurora, CO 2019 – 2021

Collaborated with educational staff to analyze and acquire new and prior-year data for presentation at board meetings, identifying opportunities to enhance data-driven decision-making. Implemented innovative data management tools and methodologies to enrich the curriculum and ensure comprehensive coverage of required topics. Supervised and supported a diverse student body, establishing clear educational goals and monitoring data to guide student recovery and success strategies. Transformed educational challenges into data science problems, making complex concepts accessible to non-technical audiences. Facilitated an interactive learning environment, leading an advanced mathematicians group driven by student initiatives.

Key Contributions:

● Achieved a 99% proficiency level, surpassing the next honors standards, by enhancing student understanding and mastery of algebra.

● Increased the efficiency, dependability, and quality of educational data management by developing new tools and strategies.

● Organized conferences with educational partners to regulate academic environments, resolve conflicts, and engage stakeholders.

● Directed and organized a Math Competition at Cherry Creek District while leading students to achieve the top three ranks.

District U-46 – Elgin, IL 2014 – 2018

Science Teacher, (2016 – 2018)

Conducted performance gap analyses to align student performance with organizational educational goals. Facilitated stakeholder conferences to discuss educational strategies and resolve discrepancies. Instructed courses in earth sciences, space, biology, chemistry, and general science, incorporating innovative data management techniques to enhance curriculum delivery.

Key Contributions:

● Enhanced student comprehension in subject matter by 80% through the strategic implementation of a revised curriculum that optimized learning outcomes.

Applied Engineering Teacher, (2014 – 2016)

Oversaw classroom resources under constrained conditions, innovating solutions to enhance educational delivery. Developed and taught a diverse curriculum that included Vehicle Engineering, Forensic Science, Prosthetics, 3D Printing, Water Purification, Sustainability, and Bridge Building, utilizing Bentley and CAD software for instruction in IDOT Bridge building.

Key Contributions:

● Delivered effective solutions to technical challenges aimed at enhancing organizational and student capabilities.

● Employed advanced problem-solving skills to adapt and integrate new technologies into the curriculum to foster a learning environment.

Education

Master of Science in Math & Science Education, Specializations: English, Science, Social Studies, Math University of St. Francis, Joliet, IL

Bachelor of Science in Dietetics, Chemistry Miami University, Oxford, OH

Awards & Certifications

Data Science Bootcamp Certification University of Denver, Denver, CO

● Acquired tech programming skills in Excel, VBA, Python, R, JavaScript, SQL, Tableau, Big Data, and Machine Learning.

STEM Certification University of Cincinnati, Cincinnati, OH

● Scored in the top 15% on the Microsoft Excel Assessment.

● SQL Rank: Scored in the top 10% in the TestDome Skills Assessment.

AWS Certified Developer – Associate

AWS Certified Data Engineering – Specialty: Scheduled for next month.

Professional Projects

Statistical Analysis: R Program: GitHub Repository

● Developed predictive models and performed regression analysis to evaluate car prototype metrics using R.

Credit Risk Analysis: GitHub Repository

● Implemented supervised machine learning models to assess credit risk using Python and various data science tools.

Movies ETL: GitHub Repository

● Designed and managed an ETL pipeline to automate data processing and integration into a PostgreSQL database.

Redshift ETL Pipeline: GitHub Repository

● Constructed a data warehousing environment using AWS Redshift, VPC, and Glue.

Pipeline with Singer: GitHub Repository

● Developed and managed a data transformation pipeline integrating Singer, PySpark, and Airflow for efficient data orchestration.


