ASHRITHA REDDY
Data Engineer
+1-260-***-**** | ****************@*****.*** | Indiana, USA
Professional Summary
Results-driven Data Engineer with 4 years of experience designing and optimizing large-scale data solutions for healthcare and financial services clients. Skilled in building ETL/ELT pipelines, implementing HIPAA-compliant architectures, and enabling real-time analytics using AWS, Azure, and modern big data technologies. Proven ability to deliver secure, scalable, and high-performing data pipelines supporting advanced analytics, fraud detection, and regulatory compliance.
Technical Skills
• Cloud Platforms: Azure (Databricks, Data Factory, Synapse, Data Lake), AWS (Glue, Redshift, S3, EMR, Lambda, Kinesis, CloudWatch)
• Big Data Ecosystem: Apache Spark, Hadoop, Kafka, Hive, Delta Lake, PySpark
• Databases & Storage: PostgreSQL, MySQL, MongoDB, Snowflake, Amazon RDS
• ETL/ELT Tools: Azure Data Factory, AWS Glue, Databricks, Informatica, SSIS, Airflow
• Programming & Scripting: Python (Pandas, NumPy, PySpark), SQL, Scala, Bash
• Data Visualization: Power BI, Tableau, AWS QuickSight
• DevOps & CI/CD: Docker, Kubernetes, Terraform, Jenkins, Git
• Other Skills: Data Quality & Governance, Data Warehousing, Streaming Data, Batch Processing
Professional Experience
Elevance Health | Data Engineer | Jan 2024 – Present | Indiana, USA
• Designed and deployed ETL pipelines in Azure Data Factory and Databricks to process healthcare claims data, reducing latency by 20%.
• Built a HIPAA-compliant data lake on Azure Data Lake Storage, integrating structured and unstructured datasets from multiple healthcare systems.
• Implemented real-time ingestion using Azure Event Hubs and Kafka, improving fraud detection response times.
• Developed PySpark transformations in Databricks for batch and streaming data processing, improving throughput of claims validation workflows.
• Created Power BI dashboards for clinical and claims data, providing executives with insights into approval rates, anomalies, and processing delays.
• Automated pipeline monitoring and alerts using Azure Monitor and Logic Apps, reducing downtime and manual intervention.
• Partnered with data science teams to create feature engineering pipelines for predictive modeling (readmission risk, patient segmentation).
• Tuned Spark clusters in Databricks by adjusting partitioning and caching strategies, cutting processing time by 15% on large batch jobs.
• Enforced RBAC and AAD authentication across Azure resources, ensuring compliance with HIPAA and security best practices.
JP Morgan Chase | Data Engineer | Jan 2021 – Dec 2022 | Hyderabad, India
• Designed and implemented scalable ETL workflows using AWS Glue and Python to process millions of daily payment transactions.
• Built a centralized data lake on Amazon S3, integrating structured and semi-structured datasets for secure financial data storage.
• Developed real-time streaming pipelines with Apache Kafka and AWS Kinesis, enabling anomaly detection and fraud monitoring.
• Optimized PySpark jobs on Amazon EMR, reducing transaction aggregation time by 30%.
• Deployed Redshift clusters for analytical queries, improving reporting efficiency for compliance and risk management teams.
• Built QuickSight dashboards to visualize fraud rates, chargebacks, and settlement timelines, cutting manual reporting effort by 35%.
• Automated infrastructure provisioning with Terraform and AWS Lambda, improving scalability and reducing operational overhead.
• Implemented data validation frameworks in Python to enforce schema rules, improving data accuracy for reconciliation reports.
• Migrated legacy SQL Server datasets into AWS RDS and Redshift, ensuring 99.9% data integrity and faster analytics.
• Coordinated with compliance teams to meet PCI DSS and SOX regulatory standards, ensuring secure handling of sensitive transaction data.
Academic Projects
Cloud-Based Data Lake for Healthcare Analytics (AWS, PySpark, Redshift, Power BI)
• Designed and implemented a cloud-native data lake on AWS S3 to store structured and unstructured healthcare datasets (HL7, claims, patient records).
• Developed ETL pipelines using PySpark and AWS Glue for cleaning, transforming, and loading data into Redshift for analytical queries.
• Built Power BI dashboards to visualize patient admission trends, disease outbreaks, and treatment costs.
• Result: Delivered a HIPAA-compliant architecture that improved data accessibility for healthcare research teams.
Education
Master of Science in Information Technology – Indiana Institute of Technology, Fort Wayne, IN
Bachelor of Technology in Computer Science – JNTUH, Hyderabad, India
Certifications
• AWS Certified Data Analytics – Specialty
• Microsoft Certified: Azure Data Engineer Associate
• Databricks Certified Associate Developer for Apache Spark