Data Engineer Big

Location:

Surat, Gujarat, India

Posted:

October 15, 2025

Resume:

Sowmya Kotagiri Data Engineer

***************@*****.*** +1-254-***-**** LinkedIn

Summary

Data Engineer with over 5 years of experience in engineering and maintaining data pipelines, databases, and cloud-based solutions. Skilled in SQL, Python, and big data technologies with hands-on expertise in AWS, Azure, and GCP. Strong background in healthcare and IT projects, ensuring data accuracy, security, and compliance. Proficient in ETL processes, data modeling, and reporting to support business decisions.

Skills

Programming & Scripting: Python (Pandas, NumPy, PySpark), SQL (T-SQL), Bash

Cloud Platforms: Azure (Data Factory, Databricks, Synapse), AWS (S3, Redshift, Glue, EMR, Athena), GCP (BigQuery, Dataflow, Pub/Sub), Teradata

Hadoop Ecosystem: HDFS, YARN, MapReduce, Hive, Sqoop, Spark (1.x/3.x), Pig

Big Data & Streaming Tools: Apache Spark, Apache Kafka, Apache Flink, Spark Streaming, PySpark

ETL Tools: Informatica, Talend, ETL/ELT pipeline development, Data Warehousing, Data Modeling, Apache Airflow, dbt

Databases: PostgreSQL, DB2, SQL Server, MySQL, Oracle, Cosmos DB, Snowflake, Delta Lake, Redis

Visualization Tools: Tableau, Power BI (DAX, Power Query), IBM Cognos Analytics (Framework Manager, Report Studio, Transformer)

Healthcare Data Standards & Compliance: HL7, FHIR, HIPAA compliance, Claims Data, Healthcare Data Platforms

DevOps Tools: Git/GitHub, CI/CD pipelines, Docker, Kubernetes, Terraform

Soft Skills: Stakeholder Communication, Agile/Scrum Collaboration

Experience

UnitedHealth Group, IL Jan 2023 – Current

Data Engineer

Designed and automated ETL pipelines with Informatica to integrate claims and patient encounter data (HL7, FHIR, EDI 837/835) from diverse sources, reducing data ingestion time by 35% while ensuring HIPAA compliance.

Built real-time streaming workflows using Apache Kafka to process high-volume eligibility and claim transactions, scaling throughput to 500K+ events per hour for downstream analytics.

Developed optimized data warehouse structures in AWS Redshift, leveraging partitioning and query optimization to accelerate financial and clinical reporting by 30% and cut query runtime from minutes to under 40 seconds.

Delivered interactive Power BI dashboards that provided payers and providers with actionable insights on cost forecasting, claims adjudication, and patient outcomes, increasing stakeholder adoption by 60%.

Infosys, India Aug 2021 – Sept 2022

Data Engineer

Built scalable ingestion pipelines on Apache Flink and Kafka Streams, enabling near-real-time data processing of 90K+ daily records with <5s latency for IT service monitoring.

Designed and implemented Delta Lake architecture on Databricks Lakehouse, improving query performance by 35% and ensuring ACID compliance for critical IT operational datasets.

Automated ETL workflows using dbt and Airflow, reducing manual intervention by 40% while enhancing pipeline transparency and lineage tracking.

Deployed Snowflake-based data warehouse integrating multiple ITSM sources, enabling centralized analytics and reducing reporting turnaround time from 2 days to 18 hours.

Partnered with cross-functional teams to build data quality and governance checks within pipelines, aligning with ITIL standards and minimizing data inconsistencies across IT support systems.

Hexaware Technologies, India Jul 2020 – Jul 2021

Junior Data Engineer

Assisted in building and maintaining ETL pipelines using Talend and Pentaho Data Integration (PDI) to ingest and process data from multiple operational systems.

Developed and optimized SQL queries, stored procedures, and functions to improve data retrieval speed by 25% for reporting teams.

Performed data cleansing, validation, and transformation tasks to ensure accuracy and consistency across enterprise applications.

Collaborated with senior engineers to design normalized and star schema data models, enabling efficient storage and faster reporting for BI users.

Education

University of New Haven, CT Aug 2023

Master’s in Business Analytics
