NIHARIKA MADALA Data Engineer
+1-806-***-**** **************@*****.*** Texas, USA LinkedIn
SUMMARY
Results-oriented Data Engineer with 3+ years of experience designing, building, and optimizing cloud-native, enterprise-scale data pipelines across AWS and Azure. Proven success in developing ETL/ELT frameworks, lakehouse architectures, and real-time streaming solutions supporting millions of customer and patient records. Skilled in Python, SQL, Spark, and Kafka with expertise in Snowflake, Databricks, Airflow, and Terraform. Adept at implementing data quality frameworks, governance, and compliance controls (HIPAA, GDPR, PCI-DSS) to ensure secure, accurate, and business-ready data delivery for healthcare and financial services.
TECHNICAL SKILLS
Programming & Querying: Python, SQL, Scala, Java
Cloud Platforms: AWS (Redshift, Glue, EMR, S3, Lambda, Athena), Azure (Data Factory, Synapse, Data Lake, Databricks), GCP (BigQuery)
Data Warehousing & Lakehouse: Snowflake, Databricks, Delta Lake, BigQuery
Data Pipelines & Orchestration: Apache Airflow, AWS Step Functions, Azure Data Factory, dbt
Big Data & Streaming: Apache Spark, PySpark, Kafka, Kinesis
DevOps & Automation: Docker, Kubernetes, Terraform, Jenkins, GitHub Actions, CI/CD
Data Governance & Quality: Great Expectations, Deequ, Metadata Management, Data Lineage
Visualization & BI Support: Power BI, Tableau, Looker
Methodologies & Compliance: Agile/Scrum, SDLC, HIPAA, GDPR, PCI-DSS
PROFESSIONAL EXPERIENCE
Data Engineer PNC Financial Services, USA Sept 2024 – Present
Engineered and automated ETL/ELT pipelines with AWS Glue, Lambda, and Step Functions, reducing data latency by 40% for real-time risk analytics.
Developed Snowflake lakehouse integrating structured/unstructured data from 20+ financial systems, enabling analytics for 5M+ customer records.
Implemented Kafka-based streaming pipelines for fraud detection and transaction monitoring, improving anomaly detection by 18%.
Built data validation & quality frameworks (Great Expectations) ensuring 99.9% accuracy across enterprise datasets.
Deployed containerized applications with Docker & Kubernetes, cutting infrastructure costs by 20% through scalable orchestration.
Automated infrastructure provisioning with Terraform, improving deployment speed and consistency across multi-region AWS environments.
Partnered with data scientists and risk teams to deliver ML-ready datasets, accelerating model training and deployment by 25%.
Data Engineer CVS Health, India Jun 2021 – Jun 2023
Designed and optimized Azure Data Factory pipelines to ingest healthcare claims, EMR, and pharmacy data, ensuring full HIPAA compliance.
Developed PySpark ETL jobs on Databricks to process 2M+ patient records, reducing runtime by 35%.
Modeled dimensional star schemas in Azure Synapse, supporting population health and cost analytics.
Built automated Power BI dashboards for clinicians, tracking patient adherence, readmission rates, and provider performance.
Implemented role-based access controls & encryption policies, ensuring regulatory compliance with HIPAA and CMS frameworks.
Collaborated with actuarial teams and business analysts to forecast healthcare cost trends, driving value-based care initiatives.
EDUCATION
Texas Tech University – Texas, USA Aug 2023 – May 2025 Master of Science in Computer Science
KL Deemed to be University, India Jun 2019 – May 2023 Bachelor of Technology
CERTIFICATIONS
IBM Data Analyst Professional Certificate – Excel, SQL, Python, data visualization, foundational AI
Wipro TalentNext – Java Full Stack Development
Tessolve – Embedded IoT & Industrial Automation
Google AI – Explore ML (Beginner Level)
CompTIA Security+ Certification
PROJECTS
Healthcare Data Pipeline Optimization
Led the optimization of a healthcare data pipeline, reducing processing time by 40% and improving data accuracy.
Utilized AWS Glue and Apache Spark to streamline ETL processes, enabling real-time data access for analytics.
Collaborated with data scientists to enhance data models, resulting in improved patient outcome predictions.
Financial Data Governance Implementation
Spearheaded the implementation of data governance practices for a financial services client, ensuring compliance with PCI-DSS regulations.
Developed a comprehensive data quality framework that improved data integrity by 35%.
Engaged with stakeholders to establish data stewardship roles and responsibilities.