Data Engineer

Company: ApplyBoard
Location: Kitchener, ON, Canada
Posted: April 26, 2024

Description:

The Role:

The data engineering team is an experienced group that supports our product development and the entire organization. In addition to building ETL pipelines that automate analytics and integrations between systems, the team builds and maintains the infrastructure that hosts these pipelines and integrations. The team also builds and maintains data access components and provides the tooling and analytics required for our predictive/ML models.

What you will be doing:

Build and maintain analytics with Python (pandas/PySpark)

Build and maintain ETL pipelines on AWS (EC2/Glue ETLs/Airflow)

Build and maintain infrastructure components to support our pipelines and integrations (CDK)

Set up and maintain integrations that enable data flow between different systems (AppFlow)

Actively contribute to shaping the direction of our data platform, including architecting our data warehouse, machine learning deployment infrastructure, and ETL/ELT workflows

Gather and understand data requirements by working with stakeholders across multiple teams

Work closely with Engineering, IT, and Security to build processes and standards for our data science platform and how it integrates with data sources across the company

Develop ingestion, transformation, and cleansing pipelines to prepare a variety of structured and unstructured data sources for data analytics

Maintain our data platform, including managing and improving our Redshift cluster and monitoring our data pipelines

Develop infrastructure using CDK to deploy data products to internal and external users

Provide operational support to the data science team

Be the go-to person for data-related questions company-wide

What you bring to this role:

Bachelor’s degree in Engineering, Computer Science, Mathematics, or a related technical discipline

4+ years of experience in the data engineering field

Experience in setting up and maintaining a high volume of ETL pipelines

Experience in setting up ETL orchestration

Familiarity with infrastructure as code (CDK or Terraform) is a plus

Advanced knowledge of SQL and knowledge of NoSQL (MongoDB)

Ability to communicate effectively with both highly technical and non-technical people

Strong analytical skills and an understanding of data science

Driven, passionate, and creative, with the ability to thrive in a fast-paced environment

Knowledge of data modeling and system design using UML

Experience with AWS computing (e.g., EC2, Lambda) and data storage technologies (e.g., Redshift)

Tech Stack:

PostgreSQL

Python

Pandas

PySpark (nice to have)

CDK or Terraform (nice to have)

AWS
