Sneha Sameera Medepalli
Cincinnati, Ohio
+1-513-***-**** | ****************@*****.*** | LinkedIn
Profile Summary
• Experienced Cloud Data Engineer with 5+ years of expertise designing scalable, secure, and efficient data platforms across AWS and Azure
• Specialized in building robust ETL/ELT pipelines, data lakes, and warehousing solutions using tools like Azure Data Factory, AWS Glue, and Google Dataflow
• Proficient in Python, PySpark, and SQL for data processing, transformation, and automation in large-scale distributed environments
• Hands-on experience with streaming technologies including Kafka, Kinesis, and Spark Streaming for real-time ingestion and analytics
• Strong background in healthcare, retail, and telecom domains, ensuring compliance with standards like HIPAA while delivering high-impact analytics solutions
• Implemented CI/CD and monitoring frameworks using tools such as Azure DevOps, CloudWatch, and CodePipeline, enabling faster, more reliable releases
• Collaborated cross-functionally with data scientists, business stakeholders, and engineering teams to align data solutions with organizational goals
• Agile practitioner with active participation in bi-weekly sprints, sprint planning, and retrospectives to drive continuous improvement in data delivery
Education
University of Cincinnati 2023–2024
Master’s in Information Technology Cincinnati, Ohio, USA
Relevant Coursework
• Introduction to Algorithms
• Design and Analysis of Algorithms
• Database Organization
• Online Social Networks Analysis
• Data Preparation and Analysis
• Software Project Management
Certifications
Microsoft Certified: Azure Data Engineer
Experience
Cleveland Clinic Apr 2024 – Present
Sr. Data Engineer Cleveland, Ohio
• Architected and deployed scalable ETL pipelines using Azure Data Factory and Azure Synapse Analytics to process over 10M daily clinical records, enabling real-time insights for physicians and research teams
• Engineered data lake solutions using Azure Data Lake Gen2 and Delta Lake, facilitating centralized, secure storage of electronic health records (EHR) across multiple departments
• Leveraged Azure Databricks and PySpark for data wrangling and transformation of high-volume FHIR and HL7 datasets, improving downstream analytics performance by 35%
• Collaborated closely with clinical data teams, bioinformaticians, and compliance officers to align data models with HIPAA and internal governance policies using Azure Purview
• Built CI/CD pipelines using Azure DevOps to automate deployment of data pipelines, unit tests, and lineage validation, reducing manual effort by 40%
• Developed and monitored data quality checks using Azure Monitor, Log Analytics, and custom alerts, ensuring integrity in patient-critical datasets
• Integrated Power BI dashboards with Azure Synapse to provide medical leadership with live operational insights, reducing report turnaround times from days to minutes
• Actively contributed to bi-weekly Agile sprints, participating in sprint planning, backlog grooming, and daily stand-ups to ensure timely and collaborative delivery of data engineering solutions
Walgreens (via NTT DATA) Apr 2022 – Jul 2023
AWS Data Engineer Bengaluru, India
• Designed and implemented end-to-end data pipelines using AWS Glue, Lambda, and Step Functions to ingest and transform retail pharmacy and supply chain data
• Utilized Amazon S3, Athena, and Redshift Spectrum for building a cost-effective, serverless analytics platform for inventory and sales forecasting
• Processed real-time transactional data using Kinesis Data Streams and stored it in Amazon Redshift, supporting dynamic inventory adjustments across 1,000+ retail locations
• Created ETL frameworks with PySpark and AWS Glue to handle large-scale prescription and patient data while maintaining HIPAA compliance
• Implemented CI/CD pipelines using AWS CodePipeline, CodeBuild, and CloudFormation for automated deployment and monitoring of data infrastructure
• Collaborated with supply chain analysts, clinical data teams, and business stakeholders to define data requirements and validate KPIs for patient-centric retail insights
• Performed data quality checks and anomaly detection using Amazon CloudWatch, SNS, and custom alerting scripts to ensure data integrity
• Participated in Agile ceremonies including sprint planning, retrospectives, and backlog grooming to drive timely delivery of cloud-based data solutions
Humana Mar 2020 – Mar 2022
Data Engineer Chennai, India
• Developed and maintained ETL workflows using Informatica PowerCenter and SQL Server Integration Services (SSIS) to extract, transform, and load healthcare claims and enrollment data
• Built robust data pipelines to process large volumes of HIPAA-compliant patient data, improving data availability for analytics and actuarial teams
• Created stored procedures and complex T-SQL queries to automate data validation and transformation for monthly healthcare reporting
• Collaborated with business analysts, compliance teams, and data stewards to ensure accuracy, privacy, and timeliness of sensitive member data
• Worked on performance tuning of queries and ETL workflows to reduce end-to-end processing time by 25%
• Integrated data from multiple source systems including Oracle, Flat Files, and external vendor APIs into a centralized data warehouse
• Utilized Tableau to develop internal dashboards for clinical operations teams to track claim statuses, member visits, and provider efficiency
• Participated in Agile sprints, contributing to sprint reviews and retrospectives while aligning deliverables with HIPAA regulations and data security policies
Technical Skills
Data Storage: Data Lakes, Data Warehousing, SQL & NoSQL Databases, Cloud Storage (Azure Blob Storage, Azure Data Lake Storage, Azure SQL Database, Azure Cosmos DB)
Data Processing: ETL/ELT Development, Data Pipelines, Batch & Stream Processing, Data Transformation, Real-time Data Processing (Event Hubs, Stream Analytics)
Big Data Technologies: Hadoop, Spark, Distributed Computing, Large-scale Data Processing
Cloud Platforms: Azure, AWS, Google Cloud Platform (GCP)
Programming Languages: Python, SQL, Java, Scala, R
ETL/ELT Tools: Apache NiFi, Informatica, Talend, Apache Airflow, Azure Data Factory, AWS Glue, SSIS, SSAS, AWS Data Pipeline
Analytics & Reporting: Data Visualization, BI Tools (Tableau, Power BI), DAX, Descriptive & Predictive Analytics
Orchestration: Workflow Automation, Job Scheduling, Data Pipeline Orchestration
Machine Learning: Model Deployment, Feature Engineering, Integration with Data Pipelines
Networking: Data Integration, API Management, Cloud Networking, Data Migration
Data Warehousing: Dimensional Modeling (Star/Snowflake Schemas), Azure Synapse Analytics, OLAP, Data Marts