Data Engineer Solutions

Location:

Cincinnati, OH

Posted:

September 10, 2025


Resume:

Avinash Mudda

OH, US +1-513-***-**** ***************@*****.*** LinkedIn

SUMMARY

Data Engineer with 3+ years of experience designing data solutions that support regulatory reporting, operational analysis, and business decision-making. Skilled in optimizing workflows, improving data accuracy, and reducing delivery delays across diverse domains. Consistently contributed to improving system reliability and supporting cross-team initiatives. Known for delivering structured, scalable, and high-impact data solutions that align with organizational goals, reduce rework, and enhance accessibility for analysts, compliance teams, and business stakeholders.

EDUCATION

Master of Engineering in Computer Science, University of Cincinnati, OH, US Aug 2023 - May 2025

Bachelor's in Computer Science and Engineering, GITAM University, Andhra Pradesh, India Aug 2019 - Apr 2023

TECHNICAL SKILLS

Programming & Query Languages: Python (Pandas, PySpark, NumPy), SQL (Joins, Window Functions, Indexing), Bash/Shell scripting

Big Data & ETL Technologies: Apache Spark (Batch and Streaming), Apache Airflow (DAG orchestration), Kafka (Stream Processing), AWS Glue, dbt (ELT transformations)

Cloud Platforms & Services: AWS (S3, Redshift, Glue, Lambda, Athena, EMR); Azure (Data Factory, Synapse Analytics, Azure Blob Storage); GCP (BigQuery, Cloud Functions); Cloud Migration (AWS, GCP, Azure)

Data Warehousing & Storage: Snowflake, Amazon Redshift, Google BigQuery, Delta Lake, Parquet, Avro, ORC

Data Modeling & Architecture: Star/Snowflake Schema, OLAP and OLTP, SCD Types (1 & 2), CDC (Change Data Capture), Data Lakehouse Design

DevOps & Automation: Docker, Terraform, Jenkins, Git, GitHub Actions, Azure DevOps, CI/CD Pipeline Management

Monitoring & Data Quality: Great Expectations, AWS CloudWatch, Prometheus, Data Validation & Lineage Tools (Apache Atlas, DataHub)

Tools & Collaboration: Jupyter Notebooks, VSCode, DBeaver, Tableau/Power BI (data QA/validation), Jira, Confluence

Development Methodologies: Agile (Scrum/Kanban), Version Control, TDD (Test-Driven Development), Code Review

Machine Learning / AI Infrastructure: LLMs (Large Language Models), Vector Databases (Pinecone, FAISS), MLOps (MLflow, Kubeflow)

PROFESSIONAL EXPERIENCE

Elevance Health, IL, US July 2024 - Current

Data Engineer

• Designed Kafka + AWS Glue workflows to stream 2M+ daily claims, reducing data latency by 40% and improving reporting accuracy for actuarial risk scoring and compliance submissions.

• Engineered Slowly Changing Dimensions (SCD Type 2) logic using PySpark and dbt, enabling longitudinal tracking across Snowflake and Redshift and improving member history consistency in downstream analytics.

• Orchestrated 120+ Airflow DAGs across batch and streaming sources, lowering manual reruns by 70% and improving job-level SLA adherence across compliance, operations, and clinical domains.

• Deployed Great Expectations validations with CloudWatch alerts, preventing schema mismatches and reducing data quality defects in regulatory datasets by over 60%.

• Containerized PySpark pipelines with Docker and Jenkins, deploying to ECS environments and reducing ETL promotion errors from 20% to under 3% across dev, staging, and production tiers.

Mphasis, India Sep 2021 - Aug 2023

Data Engineer

• Migrated 60+ SSIS ETL pipelines to PySpark and Airflow on Azure, reducing average load duration by 45% and consolidating transformation logic for centralized management.

• Implemented Change Data Capture (CDC) pipelines using watermarking, reducing sync lag from 24 hours to 5 minutes and significantly improving data timeliness across business domains.

• Created dynamic Airflow DAG templates and reusable modules, accelerating new pipeline deployments from 10 days to under 3 days while reducing configuration issues.

• Provisioned Synapse, ADF, and Blob Storage using Terraform, ensuring infrastructure parity and reducing environment provisioning time by 80% during development and release phases.

• Tuned Synapse SQL queries for Power BI reports, decreasing dashboard load times by 50% and improving end-user response for finance and audit teams.

• Collaborated with QA to validate outputs and logic, reducing user acceptance test rejections by 52% and improving release readiness during monthly code drops.

• Maintained pipeline documentation, source-to-target mappings, and test cases in Confluence, enabling seamless handovers and accelerating onboarding for new developers by 65%.

PROJECTS

• ETLQC: Launched an ETL and data testing tool built with Python Flask and integrated APIs, processing 2TB of data per session.

• Digital License Management: Developed a web-based key management application with role-based access control (RBAC) and low-latency, high-throughput RESTful APIs, handling 30M+ key validations per minute while maintaining high reliability.

CERTIFICATION

ITIL Foundation Certification – Simplilearn

AWS Certified Data Engineer Associate (DEA-C01)
