
Data Engineer (Azure)

Location:
Leander, TX
Salary:
800000
Posted:
October 15, 2025


Resume:

• Designed and maintained large-scale data processing workflows in Databricks using PySpark, handling 7+ TB of fund inflow/outflow data weekly across ETFs, mutual funds, and institutional portfolios, improving AUM reporting speed by 35%.
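The weekly fund-flow reporting described above boils down to grouping flow records by fund and week and netting inflows against outflows. A minimal stdlib sketch of that aggregation (the record layout and fund IDs are hypothetical; the production job runs the equivalent groupBy in PySpark on Databricks):

```python
from collections import defaultdict
from datetime import date

# Hypothetical flow records: (fund_id, trade_date, inflow, outflow).
FLOWS = [
    ("ETF-001", date(2025, 3, 3), 1_200_000.0, 300_000.0),
    ("ETF-001", date(2025, 3, 5), 500_000.0, 450_000.0),
    ("MF-210", date(2025, 3, 4), 2_000_000.0, 0.0),
]

def weekly_net_flows(rows):
    """Group flows by (fund, ISO year, ISO week) and net inflow vs. outflow."""
    totals = defaultdict(float)
    for fund_id, d, inflow, outflow in rows:
        iso_year, iso_week, _ = d.isocalendar()
        totals[(fund_id, iso_year, iso_week)] += inflow - outflow
    return dict(totals)
```

At multi-terabyte scale the same shape of computation is expressed as a Spark `groupBy(...).agg(...)` so it distributes across the cluster.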

• Built ETL pipelines using Azure Data Factory to move data from on-prem SQL Server and trading APIs into Azure Data Lake, transforming 5M+ fixed income trade records daily and reducing analyst data prep time by 40%.

• Optimized Snowflake-based data warehouse performance by reorganizing micro-partitions, applying clustering keys, and implementing materialized views, cutting report generation time from 90 seconds to under 30 seconds.
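The clustering and materialized-view work corresponds to DDL of the following shape, shown here as SQL strings a deployment script might execute (the table and column names are hypothetical, not the actual warehouse schema):

```python
# Hypothetical Snowflake DDL; table and column names are illustrative only.
CLUSTERING_DDL = """
ALTER TABLE trades CLUSTER BY (trade_date, desk_id);
"""

MATERIALIZED_VIEW_DDL = """
CREATE MATERIALIZED VIEW daily_trade_summary AS
SELECT trade_date, desk_id, COUNT(*) AS trade_count, SUM(notional) AS total_notional
FROM trades
GROUP BY trade_date, desk_id;
"""
```

Clustering keys co-locate rows that are filtered together so Snowflake can prune micro-partitions; the materialized view precomputes the aggregate that reports were previously deriving at query time.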

• Engineered streaming data ingestion pipelines using Apache Kafka and Amazon Kinesis, processing over 800,000 messages daily and reducing exposure update latency from 15 minutes to 90 seconds.
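The latency reduction comes from folding each message into the exposure state as it arrives rather than waiting for a 15-minute batch. A stdlib sketch of that consumer logic (the message shape is hypothetical; in production the iterable would be a Kafka or Kinesis consumer):

```python
from collections import defaultdict

def apply_exposure_updates(messages):
    """Fold a stream of (account_id, delta) messages into running exposures.

    Each message updates state immediately, which is what shrinks the
    end-to-end latency compared with periodic batch recomputation.
    """
    exposures = defaultdict(float)
    for account_id, delta in messages:
        exposures[account_id] += delta
    return dict(exposures)
```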

• Constructed a secure CDC (Change Data Capture) framework using Debezium and PostgreSQL, syncing 200,000+ investor updates per week into Snowflake marketing marts and improving lead conversion tracking accuracy by 32%.
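Debezium emits change events tagged with an operation code, and the sync layer's job is to replay them against the target in order. A simplified sketch of that apply step (the event payload here is a hypothetical stand-in for Debezium's full envelope):

```python
def apply_cdc_events(target, events):
    """Apply Debezium-style change events to a keyed target table.

    `target` is a dict keyed by primary key; each event carries an op code
    mirroring Debezium's: 'c' (create), 'u' (update), 'd' (delete).
    """
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("c", "u"):
            target[key] = event["after"]   # upsert the post-change row image
        elif op == "d":
            target.pop(key, None)          # tolerate deletes of absent keys
    return target
```

The real pipeline lands these events in Snowflake marketing marts via a `MERGE`, but the ordering and upsert/delete semantics are the same.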

• Formulated and deployed dbt transformation models for analytical layers in Snowflake, reducing SQL redundancy by 60% and enabling analysts to reuse modular queries more effectively across 6 departments without duplicating logic.
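The SQL-redundancy win comes from dbt's modularity: models reference each other through `{{ ref('model') }}` instead of hard-coded table names, so shared logic lives in one place. A toy resolver showing the mechanism (model names and schemas are hypothetical; this loosely imitates what `dbt compile` does):

```python
import re

# Hypothetical mapping from dbt model names to warehouse relations.
RELATIONS = {
    "stg_trades": "analytics.staging.stg_trades",
    "dim_funds": "analytics.marts.dim_funds",
}

def render_refs(model_sql, relations=RELATIONS):
    """Replace {{ ref('name') }} placeholders with fully qualified relations."""
    def repl(match):
        return relations[match.group(1)]
    return re.sub(r"\{\{\s*ref\('([^']+)'\)\s*\}\}", repl, model_sql)
```

Because every downstream model resolves through the same mapping, analysts across departments reuse one definition of each staging model instead of duplicating its SQL.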

• Automated infrastructure provisioning using Terraform and GitLab CI/CD, improving reproducibility and reducing sandbox setup time from 5 days to just 36 hours, which helped data science teams launch experimentation environments with minimal support.

• Developed 20+ Power BI dashboards on data from BlackRock’s DABS system, giving compliance teams visibility into user access, permissions, and activity; the dashboards shortened monthly access reviews and helped reduce unauthorized access.

• Partnered closely with enterprise data governance to roll out tokenization, column-level encryption, and RBAC for over 23 million rows of sensitive PII and transaction-level data, helping the team pass all SOC 2, GDPR, and internal audit checks without any violations.
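One common way to tokenize PII while keeping joins across tables intact is deterministic keyed hashing. A minimal sketch of that approach (HMAC-SHA256 is one standard choice, not necessarily the scheme the governance team used; the key here is a placeholder):

```python
import hmac
import hashlib

# Hypothetical secret; in practice this lives in a key vault, never in code.
TOKEN_KEY = b"example-key-do-not-use"

def tokenize(value, key=TOKEN_KEY):
    """Deterministically tokenize a PII value with HMAC-SHA256.

    The same input always yields the same token, so joins on tokenized
    columns still work, while the raw value never lands in the warehouse.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
```

Deterministic tokens trade some resistance to frequency analysis for joinability, which is why they are typically paired with the column-level encryption and RBAC controls mentioned above.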

• Migrated more than 140 legacy ETL jobs from an on-premises Oracle setup to a modernized big data architecture using Hadoop and Informatica on AWS, increasing data pipeline reliability to 99.9% and eliminating over 70% of nightly job failures caused by legacy dependency chains.


