SATHISH LAVUDIYA
DATA ENGINEER
Dallas, Texas *5038 • +1-682-***-**** • **************@*****.***
Professional Summary
Detail-oriented Data Engineer with 4 years of experience turning raw data into trusted insights that support business decisions. Skilled at designing efficient data solutions that improve accuracy, speed, and reliability. Experienced in managing large datasets, streamlining data pipelines, and ensuring compliance with industry standards. Known for delivering measurable results in fast-paced environments and collaborating closely with cross-functional teams.
Skills
Programming: Python, SQL, Scala, Java
Cloud Platforms: Microsoft Azure, AWS, Google Cloud Platform (GCP)
Big Data Frameworks: Apache Spark, Apache Kafka, Hadoop
Data Warehousing: Azure Synapse, Amazon Redshift, Snowflake, Google BigQuery
ETL Tools: Azure Data Factory, Apache NiFi, Talend, Informatica, SSIS, Databricks
Data Lake/Data Storage: Azure Data Lake Storage (Gen2), Amazon S3, Delta Lake
Workflow Orchestration: Apache Airflow, Azure Data Factory
Containerization & DevOps: Docker, Kubernetes, Terraform, Azure DevOps
Streaming Technologies: Apache Kafka, Azure Event Hubs, Spark Structured Streaming
Data Modeling & Governance: Data Modeling, Common Data Model, RBAC, Purview, Unity Catalog
Business Intelligence & Visualization: Power BI, Tableau
Security: IAM, HIPAA Compliance, Azure Key Vault
API Development: Azure API Management, RESTful APIs
Machine Learning Integration: Scikit-learn, ML pipelines
Work Experience
Data Engineer, 12/2023 to Current
PwC - Remote
Architected a full-stack Azure Lakehouse using Data Factory, ADLS Gen2, and Delta Lake to integrate multi-source healthcare data (EMR systems, claims/CPT files, NPI/ICD APIs), delivering a unified analytics layer for RCM use cases.
Orchestrated metadata-driven ingestion pipelines in Azure Data Factory, enabling incremental and full loads with parallel execution, audit logging, and archival, improving reliability and cutting load time by 30%.
Designed a Medallion Architecture (Landing → Bronze → Silver → Gold) in Databricks (Spark/Python), standardizing 10+ datasets into Parquet/Delta formats with ACID compliance, resulting in 50% faster query performance.
Implemented the Common Data Model and SCD Type 2 in the Silver layer, creating surrogate keys to unify hospital IDs and maintaining complete historical records, ensuring 100% longitudinal accuracy.
Modeled the Gold layer with one fact (transactions) and six dimensions (patients, providers, encounters, claims, ICD, CPT), enabling KPI tracking for AR > 90 days, Days in AR, and Net Collection Rate.
Enhanced governance and security by applying Azure Key Vault, AAD app registrations, RBAC, and Databricks Unity Catalog, ensuring HIPAA compliance and enterprise-wide lineage visibility.
Data Engineer, 01/2022 to 07/2023
Spsoft Global
Developed real-time streaming pipelines using Apache Kafka and Azure Event Hubs to process 5M+ daily events, reducing order tracking latency from minutes to under 5 seconds across fulfillment centers.
Migrated 20TB+ of historical data from MySQL/PostgreSQL into Azure Synapse with ADF and CDC, boosting query performance by 50% while ensuring seamless batch-to-stream integration.
Optimized 30+ ETL workflows in Databricks (Spark), reducing processing latency by 40% and increasing SLA compliance by 25%.
Established a centralized Azure Data Lake (Gen2) governed by Purview, applying 100+ RBAC policies to secure 100TB+ of structured and semi-structured data for enterprise-wide access.
Deployed event-driven pipelines using Azure Functions and Durable Functions to power real-time fraud detection and inventory updates, achieving 99.9% accuracy.
Automated infrastructure and CI/CD pipelines with Terraform and Azure DevOps, reducing manual deployment by 70% and accelerating release cycles from weekly to daily.
Delivered 50+ secure APIs via Azure API Management and created 10+ Power BI dashboards, lowering SLA response time for customer inquiries by 40% and enabling real-time executive KPI tracking.
Enhanced data quality by applying systematic cleaning, validation, and transformation procedures, reducing downstream errors.
Scripted Python automations for repetitive tasks, increasing team productivity and minimizing manual mistakes.
Data Engineer Intern, 01/2021 to 12/2021
Spsoft Global
Maintained SSIS ETL workflows to automate ingestion from diverse sources into SQL Server Data Warehouse, boosting pipeline performance and ensuring integrity.
Authored SQL queries and stored procedures for analytics and reporting, improving data retrieval efficiency by 25%.
Designed Power BI dashboards for senior management, cutting data analysis time by 40%.
Automated validation checks with Python and SQL scripts, reducing inconsistencies across datasets.
Tuned slow-running SQL queries, improving execution times by 30%.
Constructed ETL jobs for structured data loading and transformation, ensuring reliable datasets for reporting.
Integrated business logic into SSRS reports to align with user needs and enhance usability.
Created interactive dashboards using Power BI and SQL, enabling KPI tracking and improving decision-making speed by 20%.
Education
Business Analytics, 05/2025
Trine University - Detroit, Michigan