Post Job Free

Senior Data Engineer

Location: Minneapolis, MN
Posted: May 16, 2025


Resume:

Ed Arrey

• Blaine, Minnesota • 763-***-**** • ********@*****.***

• https://www.linkedin.com/in/ed-arrey-66801480/

SUMMARY

Highly accomplished Senior Data Engineer with 10+ years of experience designing, implementing, and optimizing end-to-end data infrastructures. Adept at building robust data pipelines using Azure Synapse, Databricks, ADLS Gen2, PySpark, Kusto Query Language (KQL), and Delta Lake, while leveraging Medallion Architecture for seamless bronze, silver, and gold layers. Demonstrates deep expertise in SQL, Python, and ETL/ELT frameworks, coupled with strong knowledge of data modeling, data warehousing, business intelligence, and structured streaming solutions. Skilled at driving secure, compliant data governance using Azure Purview, Azure Key Vault, Unity Catalog, and CI/CD best practices. Proficient in GitHub, Azure DevOps, ARM template deployments, and Delta Live Tables for streamlined development, deployment, and real-time analytics. Successfully migrated on-premises SQL data to Synapse and enhanced analytics with Power BI and Synapse SQL pools for high-performance reporting and insights.

CORE COMPETENCIES

Data Migration & Integration (On-Prem to Cloud, Cross-Cloud Transfers)

Pipeline Development & Automation (Azure Data Factory, Databricks, CI/CD)

Structured Streaming & Real-Time Analytics (Kafka, Event Hubs, Delta Live Tables)

Data Modeling & Warehousing (Star/Snowflake Schemas, Dimensional Modeling)

BI & Analytics (Power BI, SSRS, Tableau, Synapse Notebooks)

Advanced SQL & Python (T-SQL, PySpark, SparkSQL, Pandas)

Data Governance & Security (Azure Purview, Unity Catalog, Key Vault, RBAC)

CI/CD & DevOps (GitHub, Azure DevOps, ARM Templates)

Requirements Gathering & Stakeholder Management

Troubleshooting & Debugging

Agile/Scrum & SDLC Methodologies

TECHNICAL SKILLS

Cloud & Big Data:

Azure Data Factory, Azure Synapse Analytics, Azure Databricks, ADLS Gen2, Azure Data Explorer, AWS S3, Google BigQuery, Snowflake, Oracle Database, Azure Cosmos DB

Programming & Scripting:

Python (PySpark, Pandas), T-SQL, SparkSQL, Apache Spark, Structured Streaming

Databases & Data Warehousing:

MS SQL Server, SQL Server Data Tools, SSIS/SSRS/SSAS, Snowflake, Azure SQL Database

Data Formats & Tools:

CSV/JSON/Parquet/XML, Delta Lake (Delta file format), Delta Live Tables, dbt, Unity Catalog, Azure Purview, Azure Logic Apps

DevOps & CI/CD:

Git, GitHub, Azure DevOps, ARM Templates, Docker, Terraform

Data Visualization:

Power BI (DAX), Tableau, Databricks SQL, Synapse Notebooks

PROFESSIONAL EXPERIENCE

Sr. Azure Data Engineer

MGK Co. — Brooklyn Park, MN

(January 2020 – Present)

Data Lake Storage Architecture: Established a data lake storage layer using the Medallion Architecture (bronze, silver, gold) for efficient, scalable, and secure data processing.
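The bronze-to-silver promotion step in that layering can be sketched in plain Python; the `lake/` path layout, the `id` key rule, and the `amount` column below are illustrative stand-ins for the production Synapse/ADLS pipeline, not its actual code:

```python
from pathlib import Path
import csv
import json

# Illustrative Medallion layout: raw (bronze), cleaned (silver), curated (gold).
LAYERS = {
    "bronze": Path("lake/bronze"),
    "silver": Path("lake/silver"),
    "gold": Path("lake/gold"),
}

def promote_to_silver(table: str) -> int:
    """Clean raw bronze CSV rows into validated silver JSON-lines records.

    Returns the number of rows promoted. The validation rules (non-empty
    primary key, numeric amount) are hypothetical examples of silver-layer
    quality checks.
    """
    src = LAYERS["bronze"] / f"{table}.csv"
    dst = LAYERS["silver"] / f"{table}.jsonl"
    dst.parent.mkdir(parents=True, exist_ok=True)
    kept = 0
    with src.open(newline="") as fin, dst.open("w") as fout:
        for row in csv.DictReader(fin):
            if not row.get("id"):       # silver rule: drop rows missing a key
                continue
            row["amount"] = float(row["amount"])  # enforce a numeric type
            fout.write(json.dumps(row) + "\n")
            kept += 1
    return kept
```

The same shape scales to the real pipeline: bronze stays an immutable landing zone, and each promotion applies typed, idempotent cleanup rules.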

Synapse Data Ingestion: Optimized data loading into Azure Synapse with PolyBase for parallel ingestion, following best practices for trigger-driven data flows.

PySpark Transformations: Built robust data pipelines and transformations in Synapse notebooks, leveraging PySpark, Pandas, and Parquet for EDA and analytics.

SQL Pool Optimization: Deployed and optimized dedicated SQL pools for large-scale analytics, focusing on data distribution, partitioning, and resource consumption to handle petabyte-scale datasets.
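Hash distribution in a dedicated SQL pool spreads rows across a fixed 60 distributions; this toy sketch shows why the distribution column's cardinality and spread matter for avoiding skew (the hash function here is illustrative, not the engine's):

```python
import zlib

DISTRIBUTIONS = 60  # dedicated SQL pools always shard data into 60 distributions

def distribution_for(key) -> int:
    """Assign a row to a distribution by hashing its distribution-column value.

    A low-cardinality or heavily repeated key maps many rows to few
    distributions (data skew); a high-cardinality, evenly spread key
    balances work across all 60.
    """
    return zlib.crc32(str(key).encode()) % DISTRIBUTIONS
```

In practice the choice is made in DDL (e.g. `DISTRIBUTION = HASH(column)` vs. `ROUND_ROBIN` or `REPLICATE`); the sketch only illustrates the placement consequence of that choice.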

High-Performance BI: Enabled reporting on massive datasets through Power BI, serverless SQL pools, and materialized views, improving query performance and user experience.

Data Governance & Purview: Configured Azure Purview for Synapse SQL pools, enhancing data discovery, classification, lineage, and compliance monitoring.

On-Prem SQL Migration: Led the migration of on-premises SQL databases to Azure Synapse, designing architectural plans and implementing Synapse Pathway.

Workload Management: Provisioned Synapse dedicated SQL pools with appropriate partitioning, workload management, and auto-scaling.

Modern Data Warehouse Integration: Consolidated legacy systems into a cloud-based Azure data warehouse using Azure Data Factory, applying integration runtimes and data transformations.

ADLS Gen2 Security: Configured ADLS Gen2 with firewall rules, private endpoints, Azure Key Vault encryption, SAS tokens, and monitoring/alerts.

ADF Pipelines: Developed trigger-based pipelines to load files from data lakes to Azure SQL Database, orchestrating ETL with control flow activities.

SSIS Package Migration: Designed self-hosted and shared integration runtimes for migrating SSIS packages into ADF, reducing on-prem infrastructure dependencies.

Databricks & Key Vault: Set up Azure Databricks with Key Vault integration, mounting ADLS, processing data with PySpark notebooks, and creating Delta Lake tables.

CI/CD Deployments: Streamlined ADF pipeline deployments via ARM templates and Azure DevOps, ensuring consistent releases across environments.

Cost Optimization: Automated pausing/resuming of SQL pools from ADF, reducing operational costs without affecting SLAs.

Cross-Cloud Migrations: Migrated data from Amazon S3, Google Cloud Storage, and BigQuery into ADLS Gen2 and Synapse for unified analytics.

Logic Apps Integration: Orchestrated data processing with Azure Logic Apps, Azure Functions, and Databricks notebooks for a flexible, event-driven architecture.

Delta Live Tables: Leveraged Delta Live Tables in Databricks to ensure real-time data consistency, automated lineage, and enhanced reliability.

Azure DevOps Workflows: Implemented branching, pull requests, and gated CI/CD pipelines for continuous integration and delivery.

Azure Data Engineer II (Databricks)

MGK Co. — Golden Valley, MN

(November 2015 – January 2020)

Databricks Provisioning: Created and managed Azure Databricks workspaces using Azure CLI, ARM templates, and Azure Portal, adding users/groups for secure access.

Storage Mounting & Connectivity: Mounted ADLS Gen2 and Azure Blob to DBFS, enabling seamless reads/writes to SQL DB, Synapse SQL, Cosmos DB, CSV, JSON, and Parquet.

Spark Performance Tuning: Monitored Spark UI for query execution insights, optimizing partitions, shuffle operations, and file formats for faster performance.

Structured Streaming: Built streaming pipelines using Kafka, Event Hubs, checkpointing, and window aggregations for near real-time data ingestion and processing.
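A tumbling-window aggregation like those built on Kafka/Event Hubs can be sketched without Spark; the 10-second window size and the `(timestamp, key)` event shape are illustrative:

```python
from collections import defaultdict

WINDOW_SECONDS = 10  # illustrative tumbling-window size

def tumbling_window_counts(events):
    """Group (timestamp_seconds, key) events into fixed, non-overlapping windows.

    Mirrors what a structured-streaming groupBy(window(...), key).count()
    produces, minus watermarking, checkpointing, and incremental state.
    """
    counts = defaultdict(int)
    for ts, key in events:
        window_start = (ts // WINDOW_SECONDS) * WINDOW_SECONDS
        counts[(window_start, key)] += 1
    return dict(counts)
```

Checkpointing in the real pipeline persists this per-window state so the stream resumes exactly once after failure; the sketch recomputes from scratch.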

Databricks Security: Integrated Azure Key Vault secrets, Azure App Configuration, and Log Analytics to ensure compliance, observability, and secure data access.

Delta Lake Implementations: Leveraged concurrency, performance optimizations, versioning, and caching to build reliable Delta Lake architectures.
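Delta's versioning (time travel) can be illustrated with a toy append-only commit log; real Delta Lake records file-level actions in a transaction log rather than full snapshots, as this simplified sketch does:

```python
class VersionedTable:
    """Toy append-only versioned table illustrating Delta-style time travel.

    Each commit stores a full snapshot for clarity; Delta instead logs
    added/removed data files per version and reconstructs snapshots from them.
    """

    def __init__(self):
        self._versions = []

    def commit(self, rows) -> int:
        """Append a new table state and return its version number."""
        self._versions.append(list(rows))
        return len(self._versions) - 1

    def read(self, version=None):
        """Read the latest snapshot, or a specific historical version."""
        if not self._versions:
            return []
        v = len(self._versions) - 1 if version is None else version
        return self._versions[v]
```

The version numbers here correspond to what `VERSION AS OF` queries against a Delta table address.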

Databricks SQL & Dashboards: Developed SQL endpoints, dashboards, and user access policies for data exploration and BI, incorporating advanced query optimization.

CI/CD & Version Control: Configured GitHub version control for Databricks notebooks and leveraged Azure DevOps pipelines for automated releases.

Unity Catalog Governance: Centralized governance with Unity Catalog, managing fine-grained permissions and data-lineage tracking within Databricks.

Network Security: Applied RBAC, ACLs, and VNet integration for secure connectivity to ADLS Gen2; enforced credential passthrough for restricted data access.

Cost-Effective Clusters: Deployed Databricks Pools and spot instances, reducing compute costs while maintaining high availability.

Spark Optimization: Applied OPTIMIZE, ZORDER BY, Delta caching, Bloom filter indexes, and dynamic partition pruning to significantly improve query performance.
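Partition pruning amounts to skipping files whose partition values cannot satisfy the query predicate; a minimal sketch assuming Hive-style `column=value` path segments (real engines prune from table metadata and statistics rather than parsing paths):

```python
def prune_partitions(files, column, wanted):
    """Keep only files whose Hive-style partition segment matches the predicate.

    `files` are paths like "sales/date=2024-01-01/part-0.parquet"; `wanted`
    is the set of partition values the query filter allows. Files in other
    partitions are never read, which is the source of the speedup.
    """
    kept = []
    for path in files:
        for segment in path.split("/"):
            if segment.startswith(column + "="):
                if segment.split("=", 1)[1] in wanted:
                    kept.append(path)
                break
    return kept
```

Z-ordering complements this by co-locating related values *within* files, so min/max file statistics can skip data that partitioning alone cannot.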

Resource Management: Oversaw authentication/authorization, ARM template deployments, and Azure CLI scripting for DevOps automation.

ETL to Snowflake: Ingested, transformed, and curated data from ADLS Gen2 and Amazon S3 into Snowflake, scheduling tasks for automated data loading.

Snowflake Administration: Created databases, schemas, tables, external tables, and views in Snowflake, configuring staged loads for CSV and JSON.

Data Warehouse Engineer

Verizon Wireless — Bloomington, MN

(June 2014 – November 2015)

ETL Design & Development: Implemented ETL processes using T-SQL and SSIS, supporting business reporting needs with scalable data pipelines.

Data Quality & Profiling: Conducted data analysis and profiling to identify quality gaps, implementing validation rules for improved reliability.

Data Mart Creation: Built data marts for business analytics, incorporating replication processes via Azure SQL.

SQL Optimization: Utilized SSMS for query tuning, stored procedure enhancements, and advanced T-SQL transformations.

BI Solutions: Redesigned OLAP cubes, data models, and paginated reports, collaborating with business units to refine and prioritize requirements.

Business Intelligence Engineer / SSIS Developer

Bianalytixs — New Brighton, MN

(January 2013 – June 2014)

Database Object Creation: Developed optimized views, functions, and stored procedures to support SSRS reporting and BI workflows.

Standardized Reporting: Created SSRS templates (portrait/landscape) and improved existing T-SQL scripts for better performance and consistency.

Process Automation: Identified database inefficiencies and designed automated solutions to streamline internal workflows.

Salesforce Integration: Constructed data pipelines to integrate Salesforce with on-prem databases, performing extraction, transformation, and loading.

Data Infrastructure Design: Engineered robust frameworks that support advanced data modeling and analytics across various business domains.

EDUCATION

DCS, Doctor of Computer Science — Big Data Analytics (2022)

Colorado Technical University, Colorado Springs, CO

MSc, Computer Science — Data Science (2019)

Colorado Technical University, Colorado Springs, CO

BSc, Computer Science — Software Application Programming (2017)

Colorado Technical University, Colorado Springs, CO

AAS, .NET Programming (2015)

Hennepin Technical College, Eden Prairie, MN


