
Data Engineer Azure

Location:
Chennai, Tamil Nadu, India
Posted:
October 15, 2025

Contact this candidate

Resume:

JAYARAM AMGOTH

682-***-**** • *********@*****.*** • Open to Relocation • LinkedIn

Azure Data Engineer with 6+ years of experience designing and implementing scalable data solutions across telecom, healthcare, and academic domains. Skilled in building end-to-end data pipelines and Lakehouse architectures using Azure Databricks, Data Factory, Synapse, and ADLS. Strong background in PySpark, Python, and SQL for data processing, automation, and analytics. Experienced in integrating machine learning models, enforcing data quality and compliance, and optimizing performance in production environments. Known for collaborating with cross-functional teams, mentoring peers, and delivering reliable, high-performance data systems.

PROFESSIONAL EXPERIENCE

The University of Texas at Arlington | Data Engineer | Jan 2025 – Present

● Built a centralized data lake on Azure by collaborating with 10+ cross-functional teams, leveraging Azure Data Lake Storage, Synapse Analytics, Databricks, and Data Factory to streamline data access and analytics.

● Developed an end-to-end data pipeline to ingest patient neuro-monitoring data from hospital devices and land it in Azure Data Lake Storage (ADLS), improving data accessibility and supporting advanced analytics for clinical research.

● Designed and automated ETL workflows using ADF and PySpark to support ad hoc data requests, ensuring efficient and accurate data delivery across academic and research systems.

● Analyzed 6,000+ co-op feedback records using Python (NLP and topic modeling) in Databricks to identify industry skill trends, driving updates to 3+ academic programs.

T-Mobile | Data Engineer | May 2024 – Dec 2024

● Built ADF-to-Databricks pipelines for 10 enterprise data sources, improving overall processing speed by 20% and accelerating daily delivery into Azure Data Lake for downstream analytics.

● Implemented Medallion architecture in a Data Mesh, enabling governed self-service access and reducing time-to-insight by 30%.

● Optimized Spark jobs with partition pruning, bucketing, and executor tuning in high-volume pipelines, cutting aggregation runtime by 40% and significantly improving SLA reliability.

● Developed PyDeequ validations and anomaly detection, reducing reporting errors by 25% and avoiding $120K annually in vendor penalties and overpayments.

● Migrated the vendor reconciliation process into Snowflake and dbt, automating workflows that lowered operational costs by 16% and saved the finance team $200K annually.

● Designed marketing attribution pipelines combining CRM, campaign, and ad-platform data (Facebook Ads, Google Ads), enabling ROI tracking and optimized ad spend allocation.

Optum | Data Engineer | Mar 2021 – Jul 2023

● Built Spark Structured Streaming pipelines on Azure Databricks to process medical claims from Event Hubs in 10-minute batches, reducing processing latency from 4 hours to 10 minutes and doubling daily throughput, while ensuring HIPAA compliance for PHI.

● Integrated XGBoost fraud models into Spark ETL workflows, reducing detection lag from 24 hours to 15 minutes and enabling earlier review of suspicious claims representing $5M+ in annual exposure.

● Modeled provider and claim data in Azure Synapse using star schemas to support regulatory reporting, cutting audit preparation time by 40% and helping the organization avoid penalties of $500K annually.

● Implemented Great Expectations validations on silver layer datasets, raising data quality by 25% and reducing monthly manual investigation effort by 150 hours.

● Versioned Spark ETL and validation scripts with Git and integrated them into Azure DevOps CI/CD pipelines, reducing deployment errors by 70% and shrinking release cycles from 3 days to 4 hours.

Flipkart | Data Engineer | May 2019 – Feb 2021

● Automated incentive defect detection across 150+ warehouses, improving payroll accuracy and saving 36K+ hours annually and $550K+ in costs through end-to-end pipeline automation.

● Engineered large-scale ETL pipelines with Spark, Delta Lake, and Hive to process 1TB+/week of supply chain data, improving throughput by 35% and accelerating product availability insights.

● Partnered with warehouse operations teams to set up automated alerts and monitoring dashboards, improving SLA adherence by 20% and ensuring real-time issue detection.

● Developed APIs to integrate payroll validation results with HR systems, embedding business rules that automated failure assignments and exception handling, reducing manual intervention and errors.

SKILLS

Programming & Querying: Python, SQL

Azure Cloud Services: Databricks, Data Factory (ADF), Synapse Analytics, Data Lake Storage (ADLS), Event Hubs

Data Engineering: ETL/ELT Development, PySpark, Delta Lake, Lakehouse Architecture, Data Modeling, Workflow Orchestration

Data Quality & Governance: PyDeequ, Great Expectations, Data Security & Compliance (HIPAA)

Machine Learning Integration: Model Deployment & Scoring within Spark Workflows

DevOps & Automation: CI/CD with Azure DevOps, Git, Terraform (IaC)

Databases & Warehousing: Snowflake, Azure Synapse, Hive

Visualization: Power BI

EDUCATION

The University of Texas at Arlington - Master of Science, Business Analytics

CVR College of Engineering - Bachelor of Technology, Computer Science

CERTIFICATIONS

Microsoft Certified: Azure Data Engineer Associate

Microsoft Certified: Fabric Data Engineer Associate

Azure Solutions Architect Expert (in progress)


