Bhanu Kumar Jampala
+1-612-***-**** *********@*****.***
Professional Summary
Results-driven Data Engineer with 5+ years of experience designing and implementing large-scale, cloud-based data solutions. Expert in building scalable ETL pipelines using AWS (Glue, Redshift, Lambda, EMR) and Azure (Data Factory, Synapse, Databricks). Led major data migrations and real-time streaming projects using Spark, Kafka, and dbt, improving pipeline efficiency by up to 50%. Strong foundation in data modeling, automation, and cross-functional collaboration to drive data-driven decision-making.
Technical Skills
• Cloud Platforms: AWS (EC2, S3, EMR, Lambda, Redshift, Glue, Route53, RDS, DynamoDB, SNS, SQS, IAM), Azure (Data Factory, Data Lake, Azure SQL, Cosmos DB, Synapse Analytics, Azure DevOps)
• Big Data & Streaming: Apache Spark, Hadoop, Hive, HDFS, Kafka, Spark Streaming, Flink, Airflow, NiFi, Databricks
• Programming & Scripting: Python (PySpark, Boto3, NumPy, Pandas, Flask), SQL, Shell Scripting, Scala, Java (Basic)
• ETL & Data Warehousing: AWS Glue, Azure Data Factory, Informatica IICS, dbt, Redshift, Snowflake, BigQuery, Teradata, Amazon S3 Data Lake
• Data Modeling & Governance: Star/Snowflake Schema, Kimball Methodology, ER Modeling, Data Vault, GDPR, OCC Compliance
• Monitoring & Reporting: Power BI, Tableau, Prometheus, Looker, AWS CloudWatch, Log Analytics
• DevOps & CI/CD: Docker, Kubernetes, Jenkins, Terraform, Git, GitHub, Bitbucket, Azure DevOps, Confluence
• Tools & IDEs: PyCharm, Jupyter, Anaconda, IntelliJ, Maven, Gradle
• Databases: SQL Server, Oracle, MySQL, MongoDB, Cassandra, Azure Cosmos DB
Professional Experience
Client: Bank of America Jun 2023 – Present
Role: AWS Data Engineer
Responsibilities:
• Architected and optimized high-scale data pipelines and infrastructure on AWS to support real-time analytics and reduce operational complexity in a financial services environment.
• Engineered an AWS-based data lake and warehouse using S3, Redshift, Glue, and EMR, improving processing efficiency by 40% and reducing costs by 30%.
• Developed scalable ETL pipelines in PySpark and AWS Glue to ingest structured and unstructured data from DB2, SQL Server, and SFTP into Redshift (a sketch of this pattern follows this role's bullet list).
• Designed and implemented Spark applications in Databricks for data transformations, aggregations, and analytics.
• Integrated Apache Airflow for DAG scheduling, reducing manual job handling by 60% and increasing system reliability by 35%.
• Leveraged Kafka and Spark Streaming to process real-time UI activity data from XML feeds with low latency.
• Utilized Apache Flink for stateful stream processing on large-scale datasets, enabling responsive analytics dashboards.
• Transitioned infrastructure from Redshift to Snowflake, improving query performance by 50% and reducing data warehousing costs by 30%.
• Deployed dbt and Informatica automation workflows to streamline transformations, reducing ETL latency by 30%.
• Managed data extractions from diverse sources using Python (Boto3) and orchestrated large-scale transfers into S3.
• Built dashboards with Plotly and Matplotlib to visualize Prometheus system-level metrics.
• Applied Kimball dimensional modeling techniques to build star schema-based data marts for business intelligence needs.
• Configured and maintained EMR clusters for distributed Spark/Hadoop jobs across petabyte-scale datasets.
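A minimal sketch of the Glue/PySpark ingestion pattern described above, assuming a source table registered in the Glue Data Catalog, a catalogued Redshift JDBC connection, and an S3 staging path; the database, table, connection, and bucket names are hypothetical placeholders, and the script is meant to run inside an AWS Glue job environment rather than stand as a definitive production implementation.

    import sys
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from awsglue.dynamicframe import DynamicFrame

    # Resolve the job name passed in by the Glue runtime
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    sc = SparkContext()
    glue_context = GlueContext(sc)
    spark = glue_context.spark_session
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the source table from the Glue Data Catalog (hypothetical names)
    source = glue_context.create_dynamic_frame.from_catalog(
        database="raw_zone", table_name="sqlserver_transactions"
    )

    # Light Spark transformation: drop duplicates and rows missing the key
    df = (
        source.toDF()
        .dropDuplicates(["transaction_id"])
        .filter("transaction_id IS NOT NULL")
    )

    # Write to Redshift through a catalogued JDBC connection, staging via S3
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=DynamicFrame.fromDF(df, glue_context, "transactions_clean"),
        catalog_connection="redshift-dw-connection",
        connection_options={"dbtable": "analytics.transactions", "database": "dev"},
        redshift_tmp_dir="s3://example-staging-bucket/tmp/",
    )
    job.commit()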
Client: Accenture Jun 2021 – Aug 2022
Role: Data Engineer
Responsibilities:
• Led cloud data platform migration and pipeline modernization initiatives for enterprise clients using Azure and Snowflake, delivering faster, more secure, and scalable data operations.
• Migrated on-prem SQL Server databases to Azure SQL and Snowflake, ensuring seamless schema compatibility and data validation.
• Built Azure Data Factory (ADF) pipelines for batch and incremental data loading from Azure Blob, SQL, and Cosmos DB.
• Enhanced ETL workloads using Informatica IICS, achieving 50% faster execution via partitioning and parallelism.
• Converted IICS workflows to ADF pipelines to improve scalability and pipeline maintainability.
• Created Azure Databricks notebooks using Python for data wrangling, transformation, and cleansing.
• Orchestrated real-time data movement with Kafka Streams and Azure Event Hubs for microservices consumption.
• Automated Azure tasks with Bash scripts, reducing data pipeline failures by 30% and improving throughput by 50%.
• Built integration between Azure Data Factory and Snowflake using the Snowflake Connector for Python for seamless data loading (see the connector sketch after this list).
• Applied dimensional modeling and Data Vault design to build robust, query-efficient reporting datasets.
• Used Flask APIs to expose processed data for real-time dashboard consumption and application integrations.
• Built Synapse-based analytics pipelines to support advanced data science workflows on high-volume datasets.
• Implemented security best practices including Azure Key Vault, Role-Based Access Control (RBAC), and data masking to meet GDPR compliance.
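A minimal sketch of the ADF-to-Snowflake loading step referenced above, using the Snowflake Connector for Python; the account, warehouse, database, external stage, and table names are hypothetical, and credentials are assumed to come from environment variables rather than being hard-coded.

    import os
    import snowflake.connector

    # Connection parameters read from the environment (hypothetical variable names)
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse="LOAD_WH",
        database="ANALYTICS",
        schema="STAGING",
    )
    try:
        cur = conn.cursor()
        # Load Parquet files that an ADF copy activity has landed in an external stage
        cur.execute("""
            COPY INTO STAGING.SALES_ORDERS
            FROM @ADF_BLOB_STAGE/sales_orders/
            FILE_FORMAT = (TYPE = PARQUET)
            MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
        """)
        print(cur.fetchall())  # per-file load results returned by COPY INTO
    finally:
        conn.close()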
Client: Accenture Jun 2019 – May 2021
Role: Associate Software Developer
Responsibilities:
• Developed and automated end-to-end testing and data ingestion pipelines for healthcare and insurance clients using Python, MongoDB, and AWS.
• Built automation pipelines to process 10,000+ healthcare claims daily using Docker, Python, and MongoDB.
• Created a test automation framework for querying Excel, interacting with MongoDB, and validating API responses.
• Reduced daily manual testing from 4 hours to 5 minutes by building fully automated regression suites.
• Developed APIs using Flask to serve test data and generate real-time reports for QA teams (see the API sketch after this list).
• Designed and implemented MongoDB-based data ingestion scripts for large datasets, reducing manual handling by 60%.
• Led requirement gathering and business logic analysis to create over 100 reusable test scenarios.
• Applied Python automation to streamline response schema validation and data mapping tasks.
• Conducted peer code reviews, sprint planning, and test case development in Agile environments.
• Built CI/CD pipeline integrations using Jenkins and Git to run nightly test automation jobs.
• Explored GCP fundamentals and supported proof-of-concept (POC) initiatives for future cloud migrations.
• Documented automation workflows in Confluence and maintained GitLab repositories for code versioning.
• Enhanced test coverage and reduced failure debugging time through detailed logging and exception handling.
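A minimal sketch of the Flask test-data API and MongoDB-backed reporting described above; the connection string, database, collection, and field names are hypothetical, and the endpoint layout is illustrative rather than the framework actually delivered.

    from flask import Flask, jsonify
    from pymongo import MongoClient

    app = Flask(__name__)
    # Hypothetical connection string and database/collection names
    mongo = MongoClient("mongodb://localhost:27017")
    claims = mongo["qa_testdata"]["claims"]

    @app.route("/claims/<claim_id>")
    def get_claim(claim_id):
        # Return a single claim document for test-data lookups (exclude Mongo's _id)
        doc = claims.find_one({"claim_id": claim_id}, {"_id": 0})
        if doc is None:
            return jsonify({"error": "claim not found"}), 404
        return jsonify(doc)

    @app.route("/reports/validation-summary")
    def validation_summary():
        # Real-time counts consumed by QA reporting dashboards
        return jsonify({
            "total_claims": claims.count_documents({}),
            "failed_validations": claims.count_documents({"validation_status": "failed"}),
        })

    if __name__ == "__main__":
        app.run(port=5000)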
Education
Concordia University, St. Paul
Master of Information Technology and Management