SAI SRI MADHUVANI GODALA
Austin, TX +1-601-***-**** ************@*****.*** LinkedIn: linkedin.com/in/saisrimadhu
PROFESSIONAL SUMMARY
Cloud Data Engineer with 5+ years of experience designing, building, and optimizing large-scale data pipelines and cloud-based data warehouses. Proven expertise in ETL/ELT automation, real-time data processing, Databricks lakehouse implementations, and regulatory compliance (CPNI, GDPR), using Python and Spark to enable data-driven decision making.
SKILLS
●ETL/Tools: Airflow, DBT, Databricks, GitHub Actions (CI/CD), Docker, Git, SSIS, SSAS, SSRS, Oozie, Power BI, Tableau
●Cloud Platforms: AWS (Redshift, Batch/EC2/Fargate, RDS, S3, Athena, Lambda, Glue), GCP (BigQuery, Dataflow, Pub/Sub, Cloud Functions), Azure (Data Factory, SQL DB, Blob Storage)
●Big Data Ecosystems: Spark (PySpark, Spark SQL, Spark Streaming), Hadoop, Kafka (Confluent Cloud), Hive, Pig, HDFS, Sqoop, Delta Lake
●Databases: Redshift, Redshift Spectrum, PostgreSQL, SQL Server, BigQuery, Oracle, HBase
●Programming Languages: Python, SQL, PySpark, Shell Scripting, Java, Scala
●Data Modeling: Star/Snowflake schemas, Slowly Changing Dimensions (SCD), Partitioning, Query Optimization
●CI/CD & DevOps: Docker, Kubernetes, GitHub Actions, Jenkins, Terraform
●Operating Systems: Linux, macOS, Windows
CERTIFICATIONS
AWS Certified Solutions Architect – Associate (Issued 05/2024)
AZ-900 Azure Fundamentals
Tableau for Data Visualization – Udemy (Issued 02/2023)
PROFESSIONAL EXPERIENCE
Senior Data Engineer Humana Inc Austin, TX Feb 2024 - Present
●Led AWS Data Lakehouse initiative processing multi-terabyte healthcare claims data using Glue, S3, Redshift, and Snowflake.
●Orchestrated 15+ HIPAA-compliant data pipelines via Airflow (MWAA), reducing data latency by 40%.
●Built serverless ETL workflows with Python, PySpark, and Lambda, achieving 99.9% reliability and improving scalability by 35%.
●Developed a Python-based data validation framework, reducing data issues by 25% and enhancing audit readiness.
●Implemented near real-time Snowpipe ingestion to Snowflake, improving analytics latency by 90%.
●Integrated CloudWatch and Prometheus monitoring, cutting incident resolution time by 50%.
●Provisioned infrastructure through Terraform templates, ensuring consistent CI/CD deployment pipelines.
Data Engineer Verizon Global LLC Dallas, TX Jul 2023 - Jan 2024
●Implemented automated data masking for 5M+ telecom subscriber records in PostgreSQL/Redshift, ensuring CPNI/GDPR compliance and reducing audit findings by 40%.
●Built Airflow/Python pipelines to integrate network usage, CRM, and billing data in real time, improving customer issue resolution speed by 30%.
●Developed Spark Streaming (Databricks) anomaly detection models analyzing call patterns and usage metrics, flagging 15% more fraudulent SIM activities.
●Created PySpark MLlib models predicting customer churn with 20% higher accuracy, directly supporting targeted retention campaigns.
●Architected a Delta Lake solution on Databricks, cutting AWS costs by 60% while maintaining ACID compliance for 50+ ETL and analytics jobs.
●Automated telecom performance and compliance dashboards (Power BI) for network and SLA reporting, saving 200+ hours/year in manual data preparation.
Data Engineer DTCC Dallas, TX Mar 2022 - May 2023
●Developed low-latency Spark-SQL pipelines to process real-time NYSE/NASDAQ market data (JSON/AVRO), reducing trade analytics latency by 30% through optimized schema RDDs and partitioned Hive tables.
●Automated trade surveillance by implementing AWS Lambda functions with CloudWatch triggers to monitor 10M+ daily equity transactions for FINRA compliance violations, reducing false positives by 35%.
●Optimized Hive/Spark SQL queries for trading analytics through partition pruning and indexing, improving portfolio risk query performance from 10 minutes to under 30 seconds.
●Migrated SQL Server trade execution data to Hadoop via optimized Sqoop jobs, enabling T+1 settlement analytics while maintaining 99.9% data accuracy.
●Designed Oozie workflows to automate ETL processes for trade reconciliation, combining Shell, Pig and Hive jobs to reduce end-of-day settlement delays by 50%.
●Implemented real-time trade data ingestion pipeline using Kafka Connect to stream Sybase trade logs into analytics systems, enabling sub-second monitoring of order flows.
Research Assistant University of Southern Mississippi Hattiesburg, MS Sep 2020 - Dec 2021
●Assisted in developing and maintaining the University Library Management System using Python, Django, and PostgreSQL, improving reliability and overall user experience.
●Optimized SQL queries and database workflows for room booking and resource tracking, reducing processing time and ensuring up-to-date availability data.
●Built data validation and synchronization scripts to maintain accuracy across student and faculty modules, minimizing manual updates and data errors.
●Collaborated with faculty and students to analyze usage metrics and generate reports that supported better scheduling and resource utilization.
●Designed interactive dashboards and visual reports using Tableau and Matplotlib to monitor booking trends and system performance.
Associate Data Engineer CouponUp Pvt Ltd Hyderabad, India Mar 2019 - Aug 2020
●Designed and deployed end-to-end ETL/ELT pipelines on GCP (BigQuery, Dataflow, Pub/Sub) and AWS (S3, EC2), processing 10M+ daily records with <1 minute latency for real-time analytics.
●Built event-driven pipelines using Cloud Functions (Python) and Dataflow to auto-load streaming data from Pub/Sub/GCS into BigQuery, reducing manual data processing by 85%.
●Implemented Star/Snowflake schemas with partitioning and materialized views in BigQuery, improving query performance by 40% while reducing costs through optimized storage design.
●Established Docker-based CI/CD workflows using GitHub Actions to automate deployment of Spark jobs and Airflow DAGs, cutting deployment errors by 65%.
●Created predictive models (Python/Spark ML) for customer behavior forecasting (AUC 0.89) and built real-time fraud detection systems processing 5K events/sec with 95% accuracy.
EDUCATION
University of Southern Mississippi Hattiesburg, USA Aug 2020 – Dec 2021
Master of Science, Computer Science Awarded Merit scholarship for Graduate Program
Coursework: Advanced Algorithms, Distributed Database Systems, Software Design and Development, Machine Learning
JNTU Hyderabad Hyderabad, India Aug 2015 – May 2019
Bachelor of Technology, Computer Science Awarded Merit scholarship for Undergraduate Program
Coursework: Data Structures, Algorithms, Artificial Intelligence, Operating Systems, Database Systems, Computer Networks