SUMMARY
Data Engineer with *+ years of experience designing and automating large-scale ETL pipelines, data lakes, and real-time data workflows using AWS Glue, Spark, and Kafka. Skilled in Python, SQL, Airflow, and cloud architecture (AWS, GCP) with a strong focus on data reliability, governance, and performance optimization.
EDUCATION
Master's in Computer Science, Saint Louis University, MO, Aug 2022 – May 2024
Bachelor of Science in Computer Science, Kakatiya Institute of Technology & Science, Aug 2018 – Jun 2022
SKILLS
Programming Languages: Python, SQL, R, Scala
Databases: Snowflake, PostgreSQL, MySQL, MongoDB
Big Data Tools: Apache Spark, Kafka, Airflow, Databricks, Flink
Cloud Platforms: AWS (Glue, S3, Redshift, EC2), GCP (BigQuery, Dataflow), Azure Synapse
ETL & Orchestration: Apache NiFi, Matillion ETL, MSBI (SSIS, SSAS, SSRS)
DevOps & CI/CD: Git, Jenkins, Docker, AWS CodePipeline
Data Visualization: Power BI, Tableau, Looker
EXPERIENCE
Data Engineer Jul 2024 – Present
Bank of America, USA
• Built high-performance batch and streaming pipelines using AWS Glue, PySpark, and Kafka, improving data processing speed by 40%.
• Designed secure data lake architecture using AWS Lake Formation with encryption and IAM policies for compliance.
• Automated ETL workflows with Apache Airflow, adding failure alerts and real-time monitoring to boost reliability by 25%.
• Developed cross-cloud pipelines between AWS and GCP for scalable, low-latency analytics.
• Created CI/CD pipelines in Jenkins and CodePipeline, reducing deployment time by 50%.
Data Engineer Intern Jun 2023 – Aug 2023
Blue Cross Blue Shield (BCBS), NJ
• Created and optimized SQL-based ETL scripts to transform raw healthcare claims and provider datasets for use in risk scoring and cost prediction models.
• Participated in schema design reviews and helped standardize staging environments to follow star schema and 3NF normalization, improving query performance and data integrity.
Data Engineer Mar 2020 – Apr 2022
SpringML, India
• Optimized complex SQL workflows in Snowflake, implementing Star and Snowflake schemas to improve data modeling and reduce ETL runtime by 40% on a 15TB warehouse.
• Built and deployed scalable ML-ready pipelines using Python (Scikit-learn, SciPy) and R, extracting insights from 500M+ row datasets for advanced analytics use cases.
• Led cloud migration to AWS, architecting automated data ingestion pipelines with S3 and monitoring via CloudWatch, resulting in improved pipeline efficiency and reliability.
• Improved data transformation performance through advanced partitioning, parallel processing, and indexing strategies, enhancing scalability and data quality across workflows.
• Delivered real-time, actionable insights through Power BI and Looker dashboards, supporting KPI monitoring and contributing to an estimated $90K in cost savings.
PROJECTS
Hospital Management System (2024): Built a Django web app with secure APIs and integrated BI-ready data models for patient tracking.
Alien Invasion - Game Data Analytics (2023): Captured gameplay data with Python and analyzed player behavior using Pandas and Matplotlib.
Pranavi Gunukula
Data Engineer | Data Analyst
****************@*****.*** 314-***-**** LinkedIn Saint Louis, MO