Arun G
Toronto, ON
Email: *********************@*****.***
Professional Summary
Azure Data Engineer with 3+ years of hands-on experience designing and implementing scalable, cloud-native data solutions using Azure Data Factory (ADF), Azure Databricks, ADLS Gen2, Azure Synapse Analytics, and Delta Lake. Strong expertise in Lakehouse architecture, the Medallion design pattern (Bronze/Silver/Gold), Spark optimization, and CI/CD implementation using Azure DevOps. Experienced in migrating legacy ETL solutions to modern Databricks-based distributed data platforms.
Core Technical Skills
Cloud & Data Platform
Azure Data Factory (ADF), Azure Databricks, Azure Data Lake Storage Gen2 (ADLS), Azure Synapse Analytics, Delta Lake, Azure Key Vault, Azure Active Directory, Azure DevOps
Programming & Processing
PySpark, Spark SQL, Python, SQL
Architecture & Modeling
Lakehouse Architecture, Medallion Architecture (Bronze/Silver/Gold), Data Modeling, Batch/Incremental Data Loading, Delta MERGE, Auto Loader, Change Data Capture (CDC), SCD Type 1 & 2, Data Validation, Data Quality Checks
DevOps & Security
Git, Azure DevOps CI/CD Pipelines, RBAC, Managed Identities, Key Vault Integration
Visualization
Power BI
Professional Experience
Data Engineer
Ontario Securities Commission (OSC)
Toronto, ON
Sep 2022 – Jul 2025
Project: Enterprise Data Platform Modernization & Databricks Migration
Led migration of enterprise data platform from Azure Synapse to Azure Databricks, improving distributed processing performance by approximately 40%.
Designed and implemented Lakehouse architecture using the Medallion pattern (Bronze/Silver/Gold) with Delta Lake.
Built and orchestrated 25+ parameterized Azure Data Factory (ADF) pipelines integrating HTTP sources, ADLS Gen2, and Databricks notebooks.
Developed a scalable ingestion framework using Databricks Auto Loader to process large CSV files (1–1.5 GB per file), enabling incremental ingestion with schema evolution.
Developed optimized PySpark transformations and Delta Lake MERGE operations for incremental upserts and SCD processing.
Designed parameterized ADF pipelines with event-based and scheduled triggers for dynamic workflow orchestration.
Integrated ADF with Databricks clusters for notebook execution and job orchestration.
Configured autoscaling clusters and optimized Spark jobs to improve performance and reduce execution time.
Implemented CI/CD pipelines using Azure DevOps for automated deployment of ADF pipelines and Databricks artifacts across environments.
Enforced security best practices using RBAC, Managed Identities, and Azure Key Vault for secrets management.
Performed data validation, reconciliation, and quality checks across all transformation layers.
Collaborated in Agile sprint cycles, participating in backlog grooming, sprint planning, and release deployments.
Reviewed and enhanced Scala-based Spark notebooks to ensure compatibility with PySpark frameworks and improve maintainability.
Environment:
Azure Data Factory (ADF), Azure Databricks, ADLS Gen2, Azure Synapse Analytics, Delta Lake, PySpark, Spark SQL, Azure DevOps CI/CD, Git, RBAC, Azure Key Vault.
Education
Bachelor of Computer Application, Bharathiar University, Chennai, India, 2015
PG Diploma in Web Design and Development, Conestoga College, Ontario, Canada, 2021