Bhanu Kumar Jampala
+1-612-***-**** *********@*****.***
Professional Summary
Results-driven Data Engineer with 5+ years of experience designing and implementing large-scale, cloud-based data solutions. Expert in building scalable ETL pipelines using AWS (Glue, Redshift, Lambda, EMR) and Azure (Data Factory, Synapse, Databricks). Led major data migrations and real-time streaming projects using Spark, Kafka, and dbt, improving pipeline efficiency by up to 50%. Strong foundation in data modeling, automation, and cross-functional collaboration to drive data-driven decision-making.
Technical Skills
• Cloud Platforms: AWS (EC2, S3, EMR, Lambda, Redshift, Glue, Route53, RDS, DynamoDB, SNS, SQS, IAM), Azure (Data Factory, Data Lake, Azure SQL, Cosmos DB, Synapse Analytics, Azure DevOps)
• Big Data & Streaming: Apache Spark, Hadoop, Hive, HDFS, Kafka, Spark Streaming, Flink, Airflow, NiFi, Databricks
• Programming & Scripting: Python (PySpark, Boto3, NumPy, Pandas, Flask), SQL, Shell Scripting, Scala, Java (Basic)
• ETL & Data Warehousing: AWS Glue, Azure Data Factory, Informatica IICS, dbt, Redshift, Snowflake, BigQuery, Teradata, Amazon S3 Data Lake
• Data Modeling & Governance: Star/Snowflake Schema, Kimball Methodology, ER Modeling, Data Vault, GDPR, OCC Compliance
• Monitoring & Reporting: Power BI, Tableau, Prometheus, Looker, AWS CloudWatch, Log Analytics
• DevOps & CI/CD: Docker, Kubernetes, Jenkins, Terraform, Git, GitHub, Bitbucket, Azure DevOps, Confluence
• Tools & IDEs: PyCharm, Jupyter, Anaconda, IntelliJ, Maven, Gradle
• Databases: SQL Server, Oracle, MySQL, MongoDB, Cassandra, Azure Cosmos DB
Professional Experience
Client: Bank of America Jun 2023 – Present
Role: AWS Data Engineer
Responsibilities:
• Architected and optimized high-scale data pipelines and infrastructure on AWS to support real-time analytics and reduce operational complexity in a financial services environment.
• Engineered an AWS-based data lake and warehouse using S3, Redshift, Glue, and EMR, improving processing efficiency by 40% and reducing costs by 30%.
• Developed scalable ETL pipelines in PySpark and AWS Glue to ingest structured and unstructured data from DB2, SQL Server, and SFTP into Redshift (a sketch of this pattern follows this role's bullet list).
• Designed and implemented Spark applications in Databricks for data transformations, aggregations, and analytics.
• Integrated Apache Airflow for DAG scheduling, reducing manual job handling by 60% and increasing system reliability by 35%.
• Leveraged Kafka and Spark Streaming to process real-time UI activity data from XML feeds with low latency.
• Utilized Apache Flink for stateful stream processing on large-scale datasets, enabling responsive analytics dashboards.
• Transitioned infrastructure from Redshift to Snowflake, improving query performance by 50% and reducing data warehousing costs by 30%.
• Deployed dbt and Informatica automation workflows to streamline transformations, reducing ETL latency by 30%.
• Managed data extractions from diverse sources using Python (Boto3) and orchestrated large-scale transfers into S3.
• Built dashboards with Plotly and Matplotlib to visualize Prometheus system-level metrics.
• Applied Kimball dimensional modeling techniques to build star schema-based data marts for business intelligence needs.
• Configured and maintained EMR clusters for distributed Spark/Hadoop jobs across petabyte-scale datasets.
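A minimal sketch of the Glue/PySpark ingestion pattern described above, assuming a source table registered in the Glue Data Catalog, a catalogued Redshift JDBC connection, and an S3 staging path; the database, table, connection, and bucket names are hypothetical placeholders, and the script is meant to run inside an AWS Glue job environment rather than stand as a definitive production implementation.

    import sys
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from awsglue.dynamicframe import DynamicFrame

    # Resolve the job name passed in by the Glue runtime
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    sc = SparkContext()
    glue_context = GlueContext(sc)
    spark = glue_context.spark_session
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the source table from the Glue Data Catalog (hypothetical names)
    source = glue_context.create_dynamic_frame.from_catalog(
        database="raw_zone", table_name="sqlserver_transactions"
    )

    # Light Spark transformation: drop duplicates and rows missing the key
    df = (
        source.toDF()
        .dropDuplicates(["transaction_id"])
        .filter("transaction_id IS NOT NULL")
    )

    # Write to Redshift through a catalogued JDBC connection, staging via S3
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=DynamicFrame.fromDF(df, glue_context, "transactions_clean"),
        catalog_connection="redshift-dw-connection",
        connection_options={"dbtable": "analytics.transactions", "database": "dev"},
        redshift_tmp_dir="s3://example-staging-bucket/tmp/",
    )
    job.commit()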
Client: Accenture Jun 2021 – Aug 2022
Role: Data Engineer
Responsibilities:
• Led cloud data platform migration and pipeline modernization initiatives for enterprise clients using Azure and Snowflake, delivering faster, more secure, and scalable data operations.
• Migrated on-prem SQL Server databases to Azure SQL and Snowflake, ensuring seamless schema compatibility and data validation.
• Built Azure Data Factory (ADF) pipelines for batch and incremental data loading from Azure Blob, SQL, and Cosmos DB.
• Enhanced ETL workloads using Informatica IICS, achieving 50% faster execution via partitioning and parallelism.
• Converted IICS workflows to ADF pipelines to improve scalability and pipeline maintainability.
• Created Azure Databricks notebooks using Python for data wrangling, transformation, and cleansing.
• Orchestrated real-time data movement with Kafka Streams and Azure Event Hubs for microservices consumption.
• Automated Azure tasks with Bash scripts, reducing data pipeline failures by 30% and improving throughput by 50%.
• Built integration between Azure Data Factory and Snowflake using the Snowflake Connector for Python for seamless data loading (see the connector sketch after this list).
• Applied dimensional modeling and Data Vault design to build robust, query-efficient reporting datasets.
• Used Flask APIs to expose processed data for real-time dashboard consumption and application integrations.
• Built Synapse-based analytics pipelines to support advanced data science workflows on high-volume datasets.
• Implemented security best practices including Azure Key Vault, Role-Based Access Control (RBAC), and data masking to meet GDPR compliance.
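A minimal sketch of the ADF-to-Snowflake loading step referenced above, using the Snowflake Connector for Python; the account, warehouse, database, external stage, and table names are hypothetical, and credentials are assumed to come from environment variables rather than being hard-coded.

    import os
    import snowflake.connector

    # Connection parameters read from the environment (hypothetical variable names)
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse="LOAD_WH",
        database="ANALYTICS",
        schema="STAGING",
    )
    try:
        cur = conn.cursor()
        # Load Parquet files that an ADF copy activity has landed in an external stage
        cur.execute("""
            COPY INTO STAGING.SALES_ORDERS
            FROM @ADF_BLOB_STAGE/sales_orders/
            FILE_FORMAT = (TYPE = PARQUET)
            MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
        """)
        print(cur.fetchall())  # per-file load results returned by COPY INTO
    finally:
        conn.close()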
Client: Accenture Jun 2019 – May 2021
Role: Associate Software Developer
Responsibilities:
• Developed and automated end-to-end testing and data ingestion pipelines for healthcare and insurance clients using Python, MongoDB, and AWS.
• Built automation pipelines to process 10,000+ healthcare claims daily using Docker, Python, and MongoDB.
• Created a test automation framework for querying Excel, interacting with MongoDB, and validating API responses.
• Reduced daily manual testing from 4 hours to 5 minutes by building fully automated regression suites.
• Developed APIs using Flask to serve test data and generate real-time reports for QA teams (see the API sketch after this list).
• Designed and implemented MongoDB-based data ingestion scripts for large datasets, reducing manual handling by 60%.
• Led requirement gathering and business logic analysis to create over 100 reusable test scenarios.
• Applied Python automation to streamline response schema validation and data mapping tasks.
• Conducted peer code reviews, sprint planning, and test case development in Agile environments.
• Built CI/CD pipeline integrations using Jenkins and Git to run nightly test automation jobs.
• Explored GCP fundamentals and supported proof-of-concept (POC) initiatives for future cloud migrations.
• Documented automation workflows in Confluence and maintained GitLab repositories for code versioning.
• Enhanced test coverage and reduced failure debugging time through detailed logging and exception handling.
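A minimal sketch of the Flask test-data API and MongoDB-backed reporting described above; the connection string, database, collection, and field names are hypothetical, and the endpoint layout is illustrative rather than the framework actually delivered.

    from flask import Flask, jsonify
    from pymongo import MongoClient

    app = Flask(__name__)
    # Hypothetical connection string and database/collection names
    mongo = MongoClient("mongodb://localhost:27017")
    claims = mongo["qa_testdata"]["claims"]

    @app.route("/claims/<claim_id>")
    def get_claim(claim_id):
        # Return a single claim document for test-data lookups (exclude Mongo's _id)
        doc = claims.find_one({"claim_id": claim_id}, {"_id": 0})
        if doc is None:
            return jsonify({"error": "claim not found"}), 404
        return jsonify(doc)

    @app.route("/reports/validation-summary")
    def validation_summary():
        # Real-time counts consumed by QA reporting dashboards
        return jsonify({
            "total_claims": claims.count_documents({}),
            "failed_validations": claims.count_documents({"validation_status": "failed"}),
        })

    if __name__ == "__main__":
        app.run(port=5000)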
Education
Concordia University, St. Paul
Master of Information Technology and Management