Keerthi Bhaskara Sirapu
*********************@*****.*** — +1-934-***-**** — LinkedIn
Profile Summary
• Cloud Data Engineer with close to 5 years of hands-on experience building secure, scalable, and optimized data solutions across Retail, Banking, and E-commerce domains.
• Designed and deployed scalable data lakes, ELT pipelines, and real-time streaming solutions across Azure, AWS, GCP, Databricks, and Snowflake, processing over 20 million records daily.
• Proficient in Python, SQL, PySpark, dbt, and Delta Lake to build scalable transformation pipelines, automation workflows, and distributed data processing.
• Implemented robust CI/CD and DataOps frameworks using Azure DevOps, Terraform, and Kubernetes, cutting release times by 40% and improving deployment reliability.
• Experienced in implementing data governance, lineage tracking, and compliance with industry standards such as HIPAA, SOX, GDPR, and PCI-DSS.
• Collaborated with analytics, ML, and business teams to develop AI-powered insights, personalized recommendation systems, and demand forecasting models.
Education
Stony Brook University (SUNY) Aug 2023 – May 2025
Master’s in Data Science.
Certifications
• Microsoft Certified: Azure Data Engineer Associate
• AWS Certified Data Analytics – Specialty
Professional Experience
Walgreens Apr 2024 – Present
Senior Data Engineer Chicago, IL
• Designed and optimized enterprise-grade ETL pipelines using Azure Data Factory, Databricks, and Delta Lake to process 20M+ pharmacy transactions daily, improving SLA adherence by 35%.
• Migrated critical reporting workflows from legacy SQL Server to Snowflake and Synapse, cutting query runtimes by 65% and generating $250K+ in annual savings.
• Engineered low-latency streaming pipelines with Kafka, Event Hubs, and dbt to integrate POS, loyalty, and e-commerce data, enabling fraud detection and real-time personalization with under 2-second latency.
• Implemented Azure Purview with DataOps practices to establish data governance and lineage tracking, ensuring HIPAA and PCI-DSS compliance.
• Introduced executive-level dashboards in Power BI, accelerating business decision-making by 30% across retail and pharmacy operations.
• Automated infrastructure provisioning and CI/CD using Azure DevOps and Terraform, cutting manual deployment effort by 50% and enhancing pipeline reliability.
• Collaborated with data science teams to deliver ML-ready feature pipelines for recommendation engines and demand forecasting models.
Accenture Jul 2021 – Aug 2023
Data Engineer Hyderabad, India
• Built scalable data pipelines using AWS Glue, dbt, and AWS Lambda to process 8M+ daily banking transactions, with automated lineage for regulatory compliance.
• Designed real-time fraud detection streams with Kinesis and Redshift, reducing false positives by 20% and improving fraud detection accuracy.
• Developed Snowflake data marts to support SOX and GDPR reporting, accelerating compliance workflows and audit readiness.
• Automated batch processing with Apache Airflow and Python, increasing pipeline reliability and reducing job failures by 40%.
• Created risk analytics dashboards using Tableau and SQL, improving portfolio visibility for credit and lending operations.
Flipkart Jun 2020 – Jun 2021
Junior Data Engineer Bangalore, India
• Developed and maintained ETL pipelines using Informatica and SQL, processing millions of daily transactions from orders and customer interactions.
• Optimized recommendation system pipelines with Python and PySpark, reducing refresh latency by 30% and improving personalization accuracy.
• Migrated historical order data to AWS Redshift and S3, enabling scalable and faster supply chain analytics.
• Created interactive dashboards in Power BI to analyze customer behavior, driving data-backed marketing strategies.
• Collaborated cross-functionally with product managers and analysts to ensure data integrity and enforce governance policies for e-commerce datasets.
Key Projects
• Fraud Detection Pipeline: Helped catch fraudulent banking activity in real time to reduce financial loss. Set up a high-throughput streaming pipeline using Kinesis, Redshift, and dbt, achieving fraud detection in under 3 seconds and reducing fraudulent losses by 25% through automated alerts and early intervention.
• Customer 360 Platform: Gave the business a complete view of each customer to support better personalization and targeting. Brought together data from e-commerce, POS, and loyalty systems into a Snowflake + Delta Lakehouse, combining 30M+ customer records into one central platform.
• Recommendation Engine: Improved product suggestions for users, increasing engagement and sales. Built ML pipelines using PySpark and Databricks to refresh recommendations 30% faster, boosting cross-sell conversions by 18%.
Technical Skills
Data Storage: Snowflake, Synapse, Redshift, Cosmos DB, Data Lakes, SQL/NoSQL
Data Processing: ETL/ELT, PySpark, dbt, Kafka, Kinesis, Event Hubs, Spark, Databricks, Delta Lake, Batch/Stream Processing
Cloud Platforms: Azure, AWS, GCP
Programming: Python, SQL, Scala, Java
ETL Tools: Azure Data Factory, AWS Glue, Informatica, Talend, Airflow, SSIS
Analytics/BI: Power BI, Tableau, DAX, Advanced SQL
Orchestration & DevOps: Azure DevOps, GitHub Actions, AWS CodePipeline, Terraform, Kubernetes
ML/AI Integration: Feature Engineering Pipelines, Model Deployment, Recommendation Systems
Modeling: Star/Snowflake Schemas, OLAP, Data Marts