Data Engineer with 3+ Years in Big Data Pipelines

Location:

Posted:

November 20, 2025

Resume:

SUMMARY

Data Engineer with 3+ years of experience designing and automating large-scale ETL pipelines, data lakes, and real-time data workflows using AWS Glue, Spark, and Kafka. Skilled in Python, SQL, Airflow, and cloud architecture (AWS, GCP) with a strong focus on data reliability, governance, and performance optimization.

EDUCATION

Master's in Computer Science, Saint Louis University, MO, Aug 2022 – May 2024

Bachelor of Science in Computer Science, Kakatiya Institute of Technology & Science, Aug 2018 – Jun 2022

SKILLS

Programming Languages: Python, SQL, R, Scala

Databases: Snowflake, PostgreSQL, MySQL, MongoDB

Big Data Tools: Apache Spark, Kafka, Airflow, Databricks, Flink

Cloud Platforms: AWS (Glue, S3, Redshift, EC2), GCP (BigQuery, Dataflow), Azure Synapse

ETL & Orchestration: Apache NiFi, Matillion ETL, MSBI (SSIS, SSAS, SSRS)

DevOps & CI/CD: Git, Jenkins, Docker, AWS CodePipeline

Data Visualization: Power BI, Tableau, Looker

EXPERIENCE

Data Engineer Jul 2024 – Present

Bank of America, USA

• Built high-performance batch and streaming pipelines using AWS Glue, PySpark, and Kafka, improving data processing speed by 40%.

• Designed secure data lake architecture using AWS Lake Formation with encryption and IAM policies for compliance.

• Automated ETL workflows with Apache Airflow, adding failure alerts and real-time monitoring to boost reliability by 25%.

• Developed cross-cloud pipelines between AWS and GCP for scalable, low-latency analytics.

• Created CI/CD pipelines in Jenkins and CodePipeline, reducing deployment time by 50%.

Data Engineer Intern Jun 2023 – Aug 2023

Blue Cross Blue Shield (BCBS), NJ

• Created and optimized SQL-based ETL scripts to transform raw healthcare claims and provider datasets for use in risk scoring and cost prediction models.

• Participated in schema design reviews and helped standardize staging environments to follow star schema and 3NF normalization, improving query performance and data integrity.

Data Engineer Mar 2020 – Apr 2022

SpringML, India

• Optimized complex SQL workflows in Snowflake, implementing Star and Snowflake schemas to improve data modeling and reduce ETL runtime by 40% on a 15TB warehouse.

• Built and deployed scalable ML-ready pipelines using Python (Scikit-learn, SciPy) and R, extracting insights from 500M+ row datasets for advanced analytics use cases.

• Led cloud migration to AWS, architecting automated data ingestion pipelines with S3 and monitoring via CloudWatch, resulting in improved pipeline efficiency and reliability.

• Improved data transformation performance through advanced partitioning, parallel processing, and indexing strategies, enhancing scalability and data quality across workflows.

• Delivered real-time, actionable insights through Power BI and Looker dashboards, supporting KPI monitoring and contributing to an estimated $90K in cost savings.

PROJECTS

Hospital Management System (2024): Built a Django web app with secure APIs and integrated BI-ready data models for patient tracking.

Alien Invasion - Game Data Analytics (2023): Captured gameplay data with Python and analyzed player behavior using Pandas and Matplotlib.

Pranavi Gunukula

Data Engineer Data Analyst

****************@*****.*** 314-***-**** LinkedIn Saint Louis, MO
