Data Engineer Azure

Location:

United States

Salary:

100000

Posted:

September 19, 2025

Resume:

Sri Akhil Gupta

***************@*****.*** +1-314-***-****

www.linkedin.com/in/sri-akhil-gupta-thatikonda-255a12151 https://github.com/sthatikonda3

PROFESSIONAL SUMMARY

Results-driven professional with 4+ years of experience designing, implementing, and optimizing data pipelines and advanced analytics solutions across Azure and AWS, enhancing data processing efficiency by 40%. Proven success in big data processing, ETL processes, and data warehousing, leveraging Apache Spark, Databricks, and Kafka. Proficient in orchestrating automated workflows with Apache Airflow and Azure Data Factory. Experienced with DevOps tools such as Jenkins, Docker, and Kubernetes for streamlined deployment processes.

EXPERIENCE

Ryan Specialty Jan 2025 – Present

Azure Data Engineer Remote, United States

Designed pipelines to capture data from Kafka and Event Hubs into ADLS Gen2, securing access for 150+ users with RBAC and Key Vault.

Developed 200+ Spark and Spark SQL scripts in Databricks and Fabric to transform raw data and load curated datasets into ADLS Gen2 for Lakehouse architecture.

Built 25+ Delta Live Tables pipelines with bronze, silver, and gold layers, applying deduplication, partitioning, and SCD Type 2 logic to improve query performance by 40%.

Orchestrated 30+ workflows in Azure Data Factory using scheduled and event triggers, improving pipeline reliability by 35%.

Delivered 15+ Power BI dashboards using Synapse and Fabric datasets, reducing manual reporting effort by 60%.

Configured monitoring with Azure Monitor and Application Insights, setting up 20+ alerts and reducing incident resolution time by 35%.

Integrated Azure AI Foundry with Databricks pipelines to supply curated training datasets and deploy prediction services into production analytics.

Created feature engineering workflows in Spark and registered features in AI Foundry, enabling reuse across multiple ML projects.

Capital One June 2023 – Dec 2023

AWS Data Engineer Remote, United States

Ingested over 1 TB of transactional data daily into a raw layer in Amazon S3 and configured AWS Glue Crawlers to catalog 300+ datasets, automating schema discovery for integration with the Glue Data Catalog.

Developed 100+ Spark and Hive SQL transformation scripts in Glue and EMR clusters to cleanse, aggregate, and enrich large datasets.

Orchestrated 25+ ETL workflows by integrating Apache Airflow with AWS Glue and Redshift, enabling event-based triggers and dependency management.

Configured CloudWatch with 15+ alerts and log streams to monitor Glue jobs and EMR clusters, improving issue detection and reducing troubleshooting time by 30%.

Delivered Redshift datasets to analytics teams, powering Tableau and Power BI dashboards used by 40+ business stakeholders.

Managed version control for Spark scripts and Glue jobs in GitHub, integrating CI/CD workflows to automate deployment of ETL code to development and production environments.

Mahindra & Mahindra Ltd. Jan 2022 – Dec 2022

Data Engineer Mumbai, India

Automated ingestion of 300+ GB of data daily from UNIX servers into HDFS using Sqoop and shell scripts, creating a centralized data repository.

Designed Hive tables with partitions and indexing, reducing supply chain reporting query times by 35%.

Built streaming pipelines with Spark Structured Streaming to deliver near real-time metrics, reducing manual monitoring by 50%.

Developed Spark transformation scripts in Jupyter notebooks to cleanse and enrich SQL Server data, then stored curated datasets in HDFS.

Migrated 20+ data workflows into Azure using Logic Apps and Data Factory with event triggers and alerts, and integrated curated datasets into Azure Synapse with partition switching and materialized views, reducing query times by 45%.

Wipro (United HealthCare) Jun 2020 – Dec 2021

Data Engineer Chennai, India

Processed 50GB of healthcare datasets weekly in HDFS and created partitioned Hive tables, improving query response times for reporting teams by 15%.

Developed 10+ Spark scripts in Jupyter Notebooks to cleanse and aggregate patient and claims data, supporting accurate downstream reporting.

Used Python libraries (Pandas, NumPy) to clean and analyze 50K+ structured records, improving operational reporting accuracy by 10%.

Assisted in building 3 KPI dashboards in Power BI, enabling tracking of claims processing efficiency and service-level performance.

Managed 20+ sprint tasks in Jira, ensuring timely delivery of assigned activities within Agile sprints.

Maintained 30+ Python and PySpark scripts in GitHub repositories, providing version control and team collaboration.

Conducted data validation and quality checks on 5K–10K records per week, ensuring compliance reporting was complete and accurate.

SKILLS

Programming & Scripting: Python, SQL, Java, Shell Scripting, HTML

Big Data & Data Processing: Apache Spark, Hadoop, HDFS, Hive, Kafka, Spark Structured Streaming, Delta Lake, Delta Live Tables, Sqoop

Cloud Platforms & Services:

1. Microsoft Azure: ADLS Gen2, Azure Data Factory (ADF), Azure Synapse Analytics, Azure Databricks, Microsoft Fabric, Azure Event Hubs, Azure Monitor, Application Insights, Key Vault, RBAC Policies

2. Amazon Web Services (AWS): S3, Glue, Glue Crawlers, Redshift, EMR, Lambda, CloudWatch, Step Functions, IAM

Data Modeling & Architecture: Star Schema, Snowflake Schema, Data Marts, Data Warehousing, Lakehouse Architecture, SCD Type 2 Implementation, Partitioning, Indexing

Data Analytics & Visualization: Power BI, Tableau, Looker, Microsoft Excel, KPI Dashboards, Reporting Automation

AI & ML Enablement: Azure AI Foundry, Azure Machine Learning Integration, Feature Engineering Pipelines, ML-ready Data Preparation, Cognitive Services APIs (OCR, Text Analytics)

Version Control & Collaboration: Git, GitHub, GitHub Actions, Jira, Confluence, Agile/Scrum

EDUCATION

Saint Louis University Dec 2024

Master of Science in Information Systems

GMRIT Jun 2021

Bachelor of Technology in Information Technology

CERTIFICATIONS

Microsoft Certified: Fabric Data Engineer Associate, Microsoft, 2025. Credential: https://learn.microsoft.com/en-us/users/sriakhilguptathatikonda-0390/transcript/d5gg4bl06gyz8qp
