Senior Azure Data Engineering Leader

Location:

Lima, OH

Posted:

June 30, 2026


Resume:

Bharathkumar M

*****************@*****.*** • 567-***-**** • linkedin.com/in/bharath520

SUMMARY

Senior Data Engineer with 10+ years of experience delivering enterprise data platforms across insurance, financial services,

private credit, HVAC services, and banking. Career started in SQL Server / SSIS / SSRS / Power BI development at HSBC and

Conduent, grew into data engineering and BI delivery at CNA Insurance and Byte Link, and progressed into cloud-native data

engineering leadership at Antares Capital, Service Experts, and AAA Hoosier Motor Club. Equally comfortable owning the full

Azure stack end-to-end and dropping back into deep on-prem SQL Server and MSBI work when the role calls for it.

Hands-on expert across Azure Data Factory (metadata-driven ControlTable frameworks, ForEach + Lookup, dynamic stored

procedures), Azure Synapse Analytics, Azure Databricks (PySpark, Spark SQL, Delta Lake, Structured Streaming), Azure SQL

Database, ADLS Gen2, and Self-Hosted Integration Runtimes; secondary experience with AWS (S3, Glue, Redshift, RDS,

SageMaker) and Google BigQuery. Sole engineer responsible for the current AAA HMC modernization, where I provisioned

the full Azure ingestion stack from scratch (Linked Services, HMCSHIR Self-Hosted IR, Key Vault, RBAC), built a

metadata-driven framework loading PostgreSQL via SSH tunnel, on-prem SQL Server via ODBC, and BigQuery into Azure

SQL, designed a per-table self-healing watchdog, and authored 100+ pages of handoff documentation.

Strong data modeling background covering star and snowflake schemas, SCD Type 2, dimensional modeling, medallion

architecture on Delta Lake, and metadata-driven design patterns. Power BI specialist with advanced DAX (CALCULATE,

USERELATIONSHIP, semi-additive time intelligence), row-level security, and dashboards spanning financial reporting,

regulatory analytics, portfolio performance (IRR, MOIC), Customer 360, and operational KPIs. Experienced with CI/CD (Azure

DevOps, GitHub Actions), monitoring (Azure Monitor, Log Analytics, Application Insights, Splunk), and cloud security

(RBAC, Managed Identity, Key Vault), driving business value through secure, reliable, well-documented data platforms

aligned with Agile and SDLC practices.

SKILLS

Category: Skills & Tools

Cloud Platforms: Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Databricks (Unity Catalog, External Locations, Instance Pools), Azure Machine Learning Studio, Azure Data Factory, Azure SQL Database (Managed Identity), Azure SQL Managed Instance, Azure Functions, Azure Logic Apps, Azure Stream Analytics, Azure DevOps, Event Hub, Key Vault, AWS S3, AWS Glue, AWS Redshift, AWS RDS, AWS SageMaker, Google BigQuery

Data Engineering & ETL: SSIS (Lookup, Conditional Split, Derived Column, Foreach Loop, SCD Type 2), Azure Data Factory (metadata-driven ControlTable framework, ForEach + Lookup, dynamic stored procedure invocation, Mapping Data Flows), Python (Pandas, NumPy, requests), PySpark, Spark SQL, Delta Lake, Structured Streaming, Kafka, Batch & Streaming Pipelines, Watermark-based Incremental Loads, ROW_NUMBER Deduplication, Self-Healing Watchdog Frameworks, PolyBase, OPENROWSET, ODBC integration via Self-Hosted Integration Runtime, SSH-tunneled connectivity

Infrastructure & Integration: Azure Linked Services (Managed Identity for Azure SQL, Service Account for BigQuery, ODBC for on-prem SQL Server, ADLS Gen2), Self-Hosted Integration Runtime (HMCSHIR install, registration, multi-node management), AutoResolveIntegrationRuntime configuration, SSH tunnel infrastructure on Windows service accounts (NSSM service pattern), ODBC driver setup, ADF Triggers (schedule and tumbling window)

Data Modeling & Design: Star Schema, Snowflake Schema, Slowly Changing Dimensions (SCD Type 1 / Type 2), Partitioning (range, hash, list), Logical & Physical Design, Normalization (3NF), Medallion Architecture (bronze / silver / gold on Delta Lake), Control Table / Metadata-Driven Architecture

Monitoring & Logging: Azure Monitor, Diagnostic Settings, Log Analytics (Kusto Query Language), Application Insights, Action Groups + ADF Failure Alerts, Splunk (log monitoring, alerting, dashboards), AWS CloudWatch, custom audit tables with action-code taxonomies

Databases & Warehousing: Azure Synapse (dedicated + serverless SQL pools), Azure SQL Database, SQL Server (on-prem and cloud, 2016 / 2019 / 2022), PostgreSQL, AWS Redshift, Google BigQuery

Visualization & BI: Power BI (advanced DAX with CALCULATE, USERELATIONSHIP, semi-additive YTD / QTD / MTD time intelligence, KPIs, row-level security, drill-through, bookmarks), SSRS (parameterized, sub-reports, subscriptions), SSAS (Tabular & Multidimensional), Tableau, Qlik Sense, Power Apps, Power Automate

Machine Learning & AI: Azure ML Studio, Databricks MLlib, Scikit-learn, Forecasting (ARIMA, Prophet), Classification, Anomaly Detection, AWS SageMaker

CI/CD & DevOps: Azure DevOps (Pipelines, Repos, YAML build / release), GitHub Actions, Git (feature-branch and trunk-based workflows), CI/CD for ADF / Synapse / Databricks artifacts

API & Integration: REST APIs, SOAP APIs, Postman, ODBC drivers, SSH Tunnels, Self-Hosted Integration Runtime, Azure Functions, Azure Logic Apps, SAP HANA, SharePoint, third-party API integrations

Languages: Python, T-SQL, PL/SQL, PySpark, MDX (SSAS Multidimensional)

EDUCATION

Master of Science 2024

Indiana Institute of Technology – Fort Wayne, IN

Bachelor of Science 2015

Nagarjuna University, Guntur, India

CERTIFICATIONS

• Microsoft Certified: Azure Data Engineer Associate

• Career Essentials in Data Analysis by Microsoft and LinkedIn Education

• Career Essentials in Project Management by Microsoft and LinkedIn Education

EXPERIENCE

Azure Data Engineer / Lead Data Engineer

AAA Hoosier Motor Club (HMC) March 2026 – Present, Indianapolis, IN

Sole data engineer on the data modernization initiative, architecting an end-to-end ETL framework that migrates enterprise data

from AWS-hosted PostgreSQL, on-prem SQL Server (via ODBC), and Google BigQuery into Azure SQL Database. Designed

for incremental loads, schema drift tolerance, composite primary keys, and self-healing recovery from source resync events.

• Provisioned full Azure ingestion stack from scratch: configured Linked Services for Azure SQL Database

(LS_AzureSQL_AAATest_MI with Managed Identity authentication), Google BigQuery (service account authentication),

on-prem SQL Server via ODBC, and ADLS Gen2; installed and registered the HMCSHIR Self-Hosted Integration Runtime

on the on-prem server with multi-node support and configured AutoResolveIntegrationRuntime for cloud-only

activities.

• Implemented security and access controls using Azure Key Vault and RBAC: stored BigQuery service account JSON,

PostgreSQL credentials, SSH private keys, and ODBC connection secrets in Key Vault; granted the Data Factory

Managed Identity Key Vault Secrets User role for runtime secret retrieval; provisioned database-level permissions on

Azure SQL for the ADF Managed Identity and SHIR service account, removing all plaintext credentials from pipeline and

dataset definitions.

• Configured Azure Monitor end-to-end for pipeline observability: enabled Diagnostic Settings on the Data Factory to

stream Pipeline / Trigger / Activity runs and AllMetrics to a Log Analytics workspace; created an Action Group

(HMC-DataPipeline-Alerts) with email and Teams webhook recipients; built ADF_PipelineFailure_Alert with

metric-based conditions across six production pipelines and per-pipeline severity tuning for proactive failure

notification.

• Built coordinated ADF trigger orchestration on Eastern Time: TR_CPL_Warmup_Daily (12:15 AM) wakes the Azure SQL

instance and Databricks pool, TR_CPL_ResyncDetector_Daily (12:30 AM) runs the per-table watchdog, TR_CPL_Daily

(1:30 AM) runs the main load, TR_bigquery_to_azure_sql (Saturday 2:30 AM) runs the weekly BigQuery pipeline, and

Globalware runs Saturday 10:30 PM, with deliberate gaps to avoid resource contention and enable same-night anomaly

detection plus selective full reload.

• Set up the Azure Databricks workspace for the ETL framework: configured Unity Catalog External Locations

(hmcsaiteporting-bigquery, connectplus-raw) bound to a managed storage credential, provisioned the

HMC_Pipeline_pool instance pool with pre-warm logic, and authored two parameterized PySpark notebooks

(HMC_member_incremental_merge for CPL composite-PK merges and HMC_generic_incremental_merge reused across

BigQuery and future sources) with schema-tolerant unionByName history accumulation.

• Designed custom audit and operational logging schema in Azure SQL: cpl.ResyncLog with an action-code taxonomy

(BASELINE_SET / COUNT_OK / WATERMARK_RESET / SKIPPED) capturing every watchdog decision, cpl.WatchdogState

holding per-table baselines and thresholds, and cpl.SchemaChangeLog tracking auto-ALTER events for downstream

review.

• Built dynamic stored procedure framework in T-SQL with ROW_NUMBER deduplication,

INFORMATION_SCHEMA-based schema drift handling (auto-ALTER for new columns, type-drift detection), and

composite primary key support, replacing source-specific procs with reusable per-schema templates.

• Designed watermark-driven incremental load pattern in Azure Data Factory using ForEach + Lookup over a central

ControlTable, orchestrating data flows across ADLS Gen2 staging and Azure SQL targets with auto-create staging tables

and idempotent merge operations.

• Implemented PostgreSQL connectivity via SSH tunnel on the Self-Hosted Integration Runtime, diagnosing standby

replica WAL recovery conflicts (error 40001) and converting parallel ForEach to sequential to ensure stable nightly

extracts.

• Migrated on-prem SQL Server data integration off a retiring server by consolidating legacy SSIS jobs into a single ADF

pipeline using ODBC connectivity through the Self-Hosted Integration Runtime, with parquet staging and

TRUNCATE+INSERT loads.

• Built Google BigQuery to Azure SQL incremental and reference pipelines using service account authentication,

composite primary key handling, and NULL-preserving load logic to retain legitimate source rows without sacrificing

constraint integrity.

• Developed parameterized PySpark notebooks in Azure Databricks performing schema-tolerant unionByName history

merge with primary key deduplication on watermark DESC, supporting both PostgreSQL and BigQuery source patterns

through one reusable notebook.

• Designed self-healing watchdog system using T-SQL stored procedures with dynamic source row-count detection that

compares against per-table thresholds and selectively resets watermarks on anomaly events, replacing a

global-threshold design that caused unnecessary full reloads.

• Rebuilt brittle SSH tunnel infrastructure by replacing three coordinated scheduled tasks with a single

service-account-owned task surviving RDP disconnects, with structured logging and a watchdog restart mechanism.

• Developing Python automation in Azure Functions to integrate third-party API data for business KPI reporting, replacing

manual weekly script execution with scheduled deployment, Key Vault-based authentication, and ADLS / Azure SQL

persistence.

• Built coordinated trigger orchestration (warmup → watchdog → main pipeline) on Eastern Time schedules, enabling

same-night anomaly detection and selective watermark reset for only affected tables.

• Authored a full set of handoff documentation (six documents totaling 100+ pages) covering CPL pipeline architecture,

BigQuery pipeline architecture, Globalware ODBC integration, watchdog logic and threshold tuning, SSH tunnel

infrastructure (v1 to v2 incident lessons), and Databricks workspace setup; established design decisions, trade-offs,

troubleshooting playbooks, and operational runbooks for downstream maintainers.

Reduced pipeline runtime from ~2h 25m to ~45m through a Databricks pool warmup pattern. Resolved 32 GB storage

exhaustion via a TRUNCATE-first staging design. Delivered three production pipelines covering ~50 GB of data across AWS, GCP,

on-prem, and Azure source systems, with metadata-driven design enabling new tables to be onboarded via a single

ControlTable INSERT.

Azure Data Engineer / Power BI Developer

Service Experts March 2024 – February 2026, Richardson, TX

Team member on the data engineering squad building unified customer and operational data integration across Azure

Synapse Analytics, Azure Databricks, Delta Lake, and Snowflake. Delivered real-time and batch reporting solutions for

HVAC service operations supporting analytics, ML forecasting, and field operations.

• Designed scalable data integration solutions in Azure Synapse Analytics with dedicated and serverless SQL pools,

materialized views, PolyBase external tables, OPENROWSET queries, and partitioning strategies for large-scale

analytical workloads.

• Built and maintained Azure Databricks notebooks using PySpark, Spark SQL, and Python libraries (Pandas, NumPy,

Scikit-learn, MLlib) for transformation, cleansing, schema evolution, aggregation, and ML model development.

• Leveraged Delta Lake with Structured Streaming and Kafka ingestion for managing batch and streaming datasets with

ACID compliance, enabling real-time analytics via Azure Event Hub and Databricks pipelines.

• Developed metadata-driven, parameterized pipelines in Azure Data Factory orchestrating flows across Synapse,

Databricks, ADLS, Blob Storage, Snowflake, and sources including SAP HANA, SharePoint, on-prem SQL Server (via

Self-Hosted IR), and REST APIs tested via Postman.

• Designed star and snowflake schema models in Synapse implementing medallion architecture (bronze / silver / gold) on

Delta Lake; built SSAS Tabular models with hierarchies, partitions, and DAX expressions for enterprise BI reporting.

• Built Power BI dashboards sourced from Synapse, Databricks, and Snowflake with advanced DAX, KPIs, slicers,

drill-downs, and row-level security delivering churn probability, revenue prediction, and technician utilization insights.

• Built Logic Apps and Power Automate workflows to trigger pipelines on file arrival and integrate Power BI / SharePoint

alerts with email and Teams notifications, reducing manual handoffs between systems.

• Implemented CI/CD using Azure DevOps Pipelines and GitHub Actions for automated deployments of ADF, Synapse,

and Databricks components, with Azure Monitor and Application Insights for runtime telemetry and alerting.

• Integrated Azure Key Vault, Managed Identity, Azure SQL Managed Instance, and RBAC for secure secrets

management and access control across ADF, Databricks, and Synapse, removing plaintext credentials from pipeline

definitions.

• Collaborated with data science, BI, and field operations teams in Agile sprints, contributing to architecture reviews,

sprint planning, code reviews, and stakeholder demos.

Delivered unified Customer 360 data platform and operational forecasting models supporting HVAC service operations

across multiple regions. Streamlined deployments through DevOps automation and reduced manual data preparation work

via metadata-driven pipeline design.

Senior Data Engineer / Power BI Developer

Byte Link Systems September 2023 – March 2024, Katy, TX

Senior data engineer on a client services team building ETL pipelines, data integration, and reporting solutions. Focused on

core data engineering work: T-SQL development, SSIS migration, relational modeling, query tuning, and Power BI

dashboards, with light use of Azure Data Factory and Databricks where projects called for cloud destinations.

• Developed and tuned complex T-SQL stored procedures, views, UDFs, and CTEs in SQL Server, optimizing execution

plans through covering indexes, statistics maintenance, and query refactoring to resolve long-running joins and

parameter-sniffing issues.

• Designed logical and physical data models for client warehouses and reporting marts, applying normalization (3NF) for

OLTP-style data and star schema with conformed dimensions and SCD Type 2 for analytical workloads.

• Built and maintained SSIS packages for batch ETL across SQL Server, PostgreSQL, flat files, and Excel sources, using

Lookup, Conditional Split, Derived Column, and SCD transformations with package-level configurations and

parameterized connections.

• Wrote Python scripts (Pandas, NumPy, requests) for data profiling, CSV / Excel cleansing, REST API ingestion, and

lightweight transformation jobs that fed downstream SQL workflows.

• Used Azure Data Factory selectively for cloud copy workloads (Mapping Data Flows, ForEach, Lookup, Web, Wait

activities) and PySpark / Spark SQL in Azure Databricks for the occasional large transformation, but kept the bulk of

logic in T-SQL and SSIS to match the team's stack.

• Integrated REST and SOAP APIs into ETL workflows with retry logic, exponential backoff, and pagination handling for

third-party data feeds.

• Built Power BI dashboards with star-schema data models, advanced DAX measures, time intelligence (YTD, MoM, QoQ,

rolling averages), drill-throughs, and bookmarks for client business reviews.

• Developed SSRS reports (ad-hoc and parameterized) and migrated legacy Excel and Access reports into Power BI as part

of client modernization engagements.

• Collaborated with analysts and client stakeholders to gather requirements, design source-to-target mappings, and

document data lineage for audit and handoff.

Delivered multiple client-facing ETL and reporting projects with a focus on SQL performance, data quality, and maintainable

code, balancing on-prem SQL Server / SSIS work with selective cloud usage where it added real value.

Senior Data Engineer / Power BI Developer

Antares Capital February 2020 – August 2022, Chicago, IL

Senior engineer on the data engineering team at a private credit and direct lending firm, building data pipelines and Power

BI reporting solutions for financial reporting, regulatory submissions, and investment performance / portfolio management.

Worked primarily on the Azure stack (Synapse, Databricks, ADF) with AWS Redshift, Glue, and S3 as secondary sources,

with a strong BI focus on Power BI dashboards and DAX modeling for portfolio managers and finance leadership.

• Designed and developed interactive Power BI dashboards for loan portfolio analytics, fund performance, and regulatory

reporting, with advanced DAX measures (CALCULATE filter context, USERELATIONSHIP for role-playing dates,

semi-additive YTD / QTD / MTD measures), KPIs, drill-throughs, bookmarks, and row-level security for

portfolio-manager and fund-level access control.

• Built scalable data integration workflows using Azure Synapse Analytics (dedicated and serverless SQL pools), Azure

Data Factory, and Databricks with SQL, PySpark, and Spark SQL; pulled secondary sources from AWS Redshift and AWS

Glue catalogs via cross-cloud copy patterns.

• Designed star, snowflake, and hybrid schemas in Azure Synapse and AWS Redshift; created optimized stored

procedures, views, and UDFs in T-SQL with indexing, partitioning, and statistics tuning for analytics and ETL

performance on loan-level and transaction-level fact tables.

• Delivered regulatory and financial reporting datasets (loan tape, exposures, covenants, fund-level performance) with

audit-ready lineage, reconciliations against source systems of record, and DAX measures tied to finance-approved

calculation logic.

• Built portfolio-management dashboards covering IRR, MOIC, weighted-average spread, default and recovery metrics,

and concentration limits, sourced from Synapse and Databricks gold-layer tables curated for finance and investment

teams.

• Developed and deployed machine learning models in Databricks and Azure ML Studio using Python, Pandas, and MLlib

for forecasting and classification (default risk, prepayment likelihood), with secondary experimentation on AWS

SageMaker pipelines.

• Enabled real-time and batch data processing using Delta Lake, Azure Stream Analytics, Kafka, and AWS S3, ensuring

ACID compliance, schema enforcement, and data versioning on portfolio-update feeds.

• Orchestrated Databricks jobs with retry logic, autoscaling, and monitoring via Azure Monitor, AWS CloudWatch, and

Splunk dashboards for observability and SLA tracking on critical end-of-day and month-end reporting jobs.

• Collaborated with portfolio managers, finance, risk, and compliance stakeholders to gather requirements, prototype

dashboards, and deliver scalable BI / data solutions with CI/CD automation using GitHub and Azure DevOps.

Delivered a Power BI reporting layer covering portfolio performance, regulatory reporting, and fund analytics across a

private credit / direct lending portfolio, with the underlying data engineering platform on Azure Synapse + Databricks and

selective AWS integration for legacy data sources.

Data Engineer / Power BI Developer and Analyst

CNA Insurance March 2018 – January 2020, Chicago, IL

Data engineer on a BI and analytics team at a commercial property and casualty insurance carrier, building SQL Server data

models, SSIS ETL pipelines, and SSRS / Power BI / Tableau reporting for Sales, Underwriting, Claims, and Finance. Focused

on core data engineering: T-SQL development, SSIS package design, SSAS modeling, performance tuning, and dashboard

delivery.

• Designed and implemented logical and physical data models in SQL Server with normalization (3NF) and referential

integrity for transactional systems, and dimensional star / snowflake models for analytical workloads in Underwriting

and Claims.

• Built and optimized complex T-SQL stored procedures, triggers, views, functions, and indexes supporting ETL,

reconciliations, and business processes; tuned SQL performance using DMVs, execution plans, and Database Engine Tuning Advisor

to resolve bottlenecks through covering indexes, statistics maintenance, and parameter-sniffing mitigations.

• Developed and maintained SSIS ETL pipelines with SCD Type 2, Conditional Splits, Lookups, Derived Columns, and

Foreach Loop containers, integrating flat files, mainframe extracts, legacy databases, and cloud sources into the

enterprise data warehouse.

• Migrated on-prem SQL Server and MySQL databases to Amazon RDS leveraging S3 backup / restore, validating row

counts, identity reseed, and downstream SSIS package connections after cutover.

• Built SSAS Tabular and Multidimensional models with hierarchies, partitions, calculated measures (DAX and MDX), and

role-based security to support enterprise BI for Sales, Underwriting, and Finance.

• Designed and deployed SSRS reports with parameterized expressions, drill-throughs, subscriptions, and scheduled

deliveries; modernized legacy SSRS and Excel reporting by migrating into Tableau and Power BI with DAX models and

time-intelligence measures.

• Delivered BI solutions for Sales, Underwriting, Claims, and Finance teams, collaborating with stakeholders to provide

insights into claims development, policy trends, loss ratios, and financial performance.

Built and maintained a SQL Server + SSIS + SSAS reporting foundation supporting multiple business lines across the

insurance organization, modernizing legacy SSRS and Excel reporting into Power BI and Tableau dashboards used daily by

Sales, Underwriting, and Finance leadership.

Data Analyst / SQL Developer (SSIS, SSRS, SSAS)

Conduent April 2016 – February 2018, Somerset, NJ

• Wrote and tuned complex T-SQL queries (CTEs, window functions, PIVOT / UNPIVOT, recursive queries) for ad-hoc

business analysis, monthly operational reporting, and reconciliations across client datasets.

• Developed and maintained SSIS packages for batch ETL, integrating flat files, Excel, SharePoint lists, and SQL Server

sources with Lookup, Conditional Split, Derived Column, Foreach Loop, and SCD transformations; scheduled and

monitored jobs via SQL Server Agent.

• Built SSRS reports (ad-hoc, parameterized, sub-reports, drill-down) with expressions, custom code, and scheduled

subscriptions for delivery to business users.

• Designed SSAS Tabular models with hierarchies, calculated measures, KPIs, and role-based security to support

self-service analytics.

• Performed data analysis and profiling on operational datasets to identify trends, anomalies, and data quality issues,

presenting findings to business stakeholders in regular review meetings.

• Created interactive dashboards in Power BI and Tableau on top of SQL Server and SSAS sources, with slicers,

drill-throughs, and DAX measures for KPIs and time intelligence.

• Migrated legacy SSRS, QlikView, and Excel reports into Power BI and Tableau as part of internal modernization

initiatives.

• Contributed to small Azure migration projects using Azure Data Factory and Azure SQL Database to copy on-prem

datasets into the cloud, gaining initial exposure to cloud-based ETL patterns.

• Collaborated with analysts, project managers, and client stakeholders to gather reporting requirements, document

source-to-target mappings, and deliver validated datasets and dashboards.

Junior Data Analyst / MSBI Developer

HSBC April 2015 – March 2016, Bangalore, India

• Wrote SQL queries against SQL Server databases for ad-hoc business questions, learned query tuning fundamentals

(indexes, execution plans), and supported senior analysts on monthly reporting tasks.

• Assisted in designing and maintaining SSIS ETL workflows ingesting data from SQL Server, Excel, and XML sources into

reporting systems, learning common transformations (Lookup, Conditional Split, Derived Column) and SSIS package

patterns.

• Developed SSRS reports (ad-hoc and parameterized) and supported the BI team with report deployments,

subscriptions, and user troubleshooting.

• Built interactive dashboards in Power BI and Qlik Sense to visualize KPIs and business metrics, picking up DAX basics

and data modeling concepts.

• Gained hands-on experience with Power Apps and Power Automate flows on SharePoint to streamline simple approval

workflows across Microsoft 365.

• Supported governance and deployment activities by reusing templates, following team standards, and learning

enterprise rollout practices in a regulated banking environment.
