
Data Engineer Integration

Location:
United States
Posted:
May 28, 2025


Resume:

Name: Chaitanya Balivada

Email: *********.*.****@*****.***

Phone: +1-959-***-****

LinkedIn: https://www.linkedin.com/in/chaitanya-b-173a34312/

Role: Sr. Data Engineer

Professional Summary:

Over 10 years of extensive experience in the data engineering field, including ingestion, data lakes, data warehouses, reporting, and analytics.

Strong knowledge of and experience with data analysis, data lineage, big data pipelines, data quality, data reconciliation, data transformation rules, and data flow diagrams, including data replication, data integration, and data orchestration tools.

Solid experience in and understanding of implementing large-scale data warehousing programs and end-to-end (E2E) data integration solutions on Snowflake Cloud, AWS Redshift, Informatica Intelligent Cloud Services (IICS - CDI), and Informatica PowerCenter, integrated with multiple relational databases (MySQL, Teradata, Oracle, Sybase, SQL Server, DB2).

Knowledge of and experience with AWS services such as Redshift, Redshift Spectrum, S3, Glue, Athena, Lambda, and CloudWatch, as well as EMR engines such as Hive and Presto.

Hands-on experience in Python programming for data processing and for handling data integration between on-premises and cloud databases or data warehouses (see the sketch after this summary).

Experience with container-based deployments using Docker, including working with Docker images and Docker registries.

Hands-on experience analyzing SAS ETL and implementing data integration in Informatica using XML, web services, SAP ABAP, and SAP IDoc.

Experienced with Teradata utilities (FastLoad, MultiLoad, BTEQ scripting, FastExport, SQL Assistant) and with tuning Teradata queries using EXPLAIN plans.

Worked on dimensional data modeling with star and snowflake schemas and Slowly Changing Dimensions (SCDs).

Developed Informatica development standards, best practices, solution accelerators, and reusable components for design and delivery assurance.

Proactive on production issues, punctual in meeting deadlines, and consistently follow a First Time Right (FTR) and On-Time Delivery (OTD) approach.

Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics).

Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.

Operationalized data ingestion, data transformation, and data visualization for enterprise use.

Mentored and trained junior team members and ensured coding standards were followed across the project.

Helped the talent acquisition team hire quality engineers.

Experience with real-time streaming frameworks such as Apache Storm.

Worked on Cloudera and Hortonworks distributions.

Progressive experience in big data technologies and software programming and development, including design, integration, and maintenance.

Hands-on experience with Snowflake utilities, SnowSQL, Snowpipe, and big data modeling techniques using Python and Java.
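Illustrative example of the Python-based on-premises-to-cloud integration and Snowflake stage loading described above. This is a minimal, hypothetical sketch rather than code from any engagement: the DSN, credentials, bucket, stage, and table names are assumptions added for illustration.

import csv
import boto3
import pyodbc
import snowflake.connector

# 1) Extract from an on-premises SQL Server table to a local CSV file (DSN is hypothetical).
conn = pyodbc.connect("DSN=onprem_dw;UID=etl_user;PWD=***")
cur = conn.cursor()
cur.execute("SELECT order_id, customer_id, amount, order_ts FROM dbo.orders")
with open("orders_extract.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow([col[0] for col in cur.description])  # header row from cursor metadata
    writer.writerows(cur.fetchall())
conn.close()

# 2) Land the file in S3 (bucket and prefix are assumptions).
boto3.client("s3").upload_file("orders_extract.csv", "example-data-lake", "raw/orders/orders_extract.csv")

# 3) Bulk-load into Snowflake from an external stage that points at the bucket.
sf = snowflake.connector.connect(
    account="example_account", user="ETL_USER", password="***",
    warehouse="LOAD_WH", database="ANALYTICS", schema="RAW",
)
sf.cursor().execute("""
    COPY INTO RAW.ORDERS
    FROM @RAW.S3_ORDERS_STAGE/orders_extract.csv
    FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
""")
sf.close()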

TECHNICAL SKILLS:

Cloud applications: Snowflake, Databricks on AWS/Azure, IDMC, Salesforce

ETL / Informatica Tools: Informatica Cloud/IDMC, DBT, AWS Glue, ADF, Alteryx

Big Data / Data Lake: Cloudera, Synapse, Redshift, BigQuery, Ranger, Hive, Delta Lake, S3, ADLS Gen2, and Impala

BI Tools / Data Analytics: Business Objects, Power BI, SAC and Tableau.

Databases: Oracle 19c, Oracle ERP, Redshift, Hive, Delta Lake, MS SQL Server, AlloyDB, Azure SQL, DB2, MS Access, Sybase, and Teradata.

Programming: C, Python/PySpark, SQL, PL/SQL, Java, and web services

Methodologies: E-R modeling, star schema, snowflake schema

Scheduling Tools: Tivoli, Crontab, Airflow, Autosys and Informatica Scheduler

Other Software Engineering Tools/Technologies: Airflow, CloudWatch, ADF, Tivoli, SC3, JIRA, UNIX shell scripting, Altova, QTP 9.2, SQL*Loader, SQL*Plus, TOAD 8.0, MS Visio, Squirrel, SQL Developer, PuTTY, SuperPuTTY, Quality Center, Teradata SQL Assistant, SQL Navigator 6.7, SF Data Loader, Postman, SOAP UI.

Professional Experience

Client: Advent Health / Bloomfield, CT Oct 2022 to Present

Role: Senior Data Engineer

As a Sr. Data Engineer, built scalable real-time ETL/data pipelines for analytics and engineering using AWS, Databricks, Snowflake, DBT, and Informatica Cloud, integrating various on-premises and cloud data sources. Architected and deployed a resilient data platform using the AWS stack (EC2, S3, RDS, Lambda, Redshift, CloudFormation, and Athena), Databricks, and Airflow to manage data lake, delta lake, and data warehouse environments.

Responsibilities:

Collaborated closely with internal teams, customer support, implementation partners, architects, and business analysts to gather requirements and design end-to-end data flow systems.

Utilized tools such as Lucidchart and SQLDBM to create architectural diagrams and visual workflows for data migration and flow across systems, ensuring alignment with business stakeholders prior to implementation.

Extensive development and administration experience in batch and streaming ETL using tools such as DBT, IICS, Snowflake, AWS cloud services, Databricks, and Salesforce.

Developed end-to-end pipelines in Snowflake using DBT, ingesting data from source systems.

Migrated from legacy ETL scripts to a modern ELT architecture using DBT and Airflow.

Solid understanding and utilization of Snowflake features including SnowSQL, Snowpipe, data loading strategies, Snowflake tasks, and secure data copying from AWS via stages and access control mechanisms.

Good understanding of and hands-on experience in strategizing and implementing end-to-end data migration from an on-premises Oracle database to Snowflake, covering key steps such as migration strategy, data extraction, and validation.

Built robust PySpark pipelines on Databricks for both batch and streaming data ingestion and transformation, integrating data from on-premises sources into AWS S3, RDS, and Snowflake (see the sketch after this list).

Hands-on experience with Databricks, including workspace UI management, Delta Lake, notebooks, and pipeline development using Python and Spark SQL. Skilled in leveraging Databricks frameworks, cloud architecture, and performance optimization techniques.

Architected and developed cloud-native data solutions using AWS services including S3, Glue, Lambda, SQS / SNS, Athena, and Redshift, improving data integration and reporting processes.

Managed and maintained S3 buckets, IAM policies, and CloudWatch for improved security, monitoring, and logging of cloud resources.

Built a data quality layer using DBT tests and snapshots, and implemented a CI/CD workflow for DBT using GitHub Actions and the dbt Cloud API.

Used push-down ELT strategies to optimize performance and reduce data movement by executing transformations within Snowflake. Built workflows and data lineage using WhereScape 3D for audit and governance purposes.

Led the design and implementation of ETL pipelines using Informatica IICS, AWS Glue, and Databricks, enabling seamless data transformation from multiple on-premises and cloud sources.

Proficient in operational responsibilities including on-call support, administration, repository setup, backup strategies, and user management.

Familiar with Agile methodologies and tools including Confluence, CI/CD pipelines, Bitbucket, GitHub, and SourceTree.

Hands-on experience implementing Big Data analytics platforms using Cloudera and Hortonworks Hadoop ecosystems.

Strong knowledge of SQL, HiveQL, Python, PySpark, and SnowSQL for handling large volumes of data efficiently.
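The PySpark/Databricks ingestion work described in this list can be illustrated with a minimal streaming sketch. It assumes a Databricks notebook (where spark is predefined) and uses Auto Loader to stream JSON files from an S3 landing path into a Delta table; the paths, audit column, and table name are hypothetical.

from pyspark.sql import functions as F

raw_path = "s3://example-data-lake/raw/claims/"            # hypothetical landing zone
checkpoint = "s3://example-data-lake/_checkpoints/claims"  # hypothetical checkpoint/schema path

# Incrementally pick up new JSON files with Databricks Auto Loader.
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", checkpoint)       # schema inference and evolution tracking
    .load(raw_path)
    .withColumn("ingest_ts", F.current_timestamp())        # simple audit column
)

# Append the stream to a bronze Delta table with checkpointed, exactly-once writes.
(
    stream.writeStream.format("delta")
    .option("checkpointLocation", checkpoint)
    .outputMode("append")
    .toTable("bronze.claims_raw")                          # hypothetical Delta table
)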

Environment: Databricks, Snowflake, IDMC/IICS, DBT, Salesforce, IAM, S3, Glue, Lambda, Athena, Redshift, Delta Lake, Python, Spark SQL, PySpark, Airflow, Linux, Apache Impala, Ranger, SuperPuTTY, Oracle 11g, SQL Server, Tableau, Perl, web services, TOAD, XML, flat files, Altova, JIRA, SOAP UI, Postman

Client: Fifth Third Bank / Pittsburgh, PA Feb 2021 to Oct 2022

Role: Sr. Data Engineer / Sr. ETL Developer

Performed the role of Sr. Data Engineer, handling ETL integration with Salesforce and big data technologies on AWS. Implemented platform migration from on-premises to the cloud, applying data engineering and migration principles with the AWS tech stack and IICS.

Responsibilities:

Worked on POCs with Snowflake, Azure SQL Data Warehouse, and Google BigQuery to understand the functionality of different cloud DWaaS providers and presented the results to senior management to support decision-making.

Worked with enterprise architects on solutioning the POC and on the EDW cloud journey.

Worked with the team on decisions about how to migrate data from on-premises to the cloud and which tools could be used for ETL or ELT in the cloud.

Converted and reviewed Oracle PL/SQL code migrated to Snowflake, made performance changes, and tested.

Performed load and performance testing using JMeter to ensure Snowflake could handle the real-time load seen in the EDW, and compared on-premises vs. cloud on various parameters.

Created notebooks to load XML files into Azure SQL Data Warehouse using Azure Databricks.

Collaborated with internal Architects and Business Analysts to understand requirements and design robust data flow systems.

Installed and configured multiple Secure Agents (SA) for both ICS and IICS cloud platforms.

Implemented load balancing and high availability solutions for Informatica Cloud environments.

Created IICS Cloud mappings to extract and load data across various platforms including Data Lake, Salesforce Service Cloud, S3, Marketing Cloud, SAP, and Oracle.

Processed and transformed large volumes of structured, semi-structured, and unstructured data using ETL and Big Data (Hadoop) frameworks.

Integrated data across platforms such as Snowflake, Salesforce (SFDC), Salesforce Marketing Cloud (SFMC), AWS, and Hadoop using Informatica Cloud.

Set up GitHub integration and version control, along with configuring SAML authentication in IICS.

Proficient in using connectors for Hadoop, Hive, JDBC, Salesforce, Oracle, and S3 on the Cloud.

Automated monitoring of Informatica Cloud real-time URLs using custom Python scripts (see the sketch after this list).

Contributed to building Data Marts (DM) and Data Warehouses (DWH) for IU Health providers, ensuring compliance with HIPAA regulations (Medicare – TMG Facets, Commercial – HLTH_RULES, Medicaid – Medicaid).

Provided 24/7 production support for ETL, BI, and cloud-based processes.
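The URL-monitoring item above can be illustrated with a simple health-check script. This is a hedged, hypothetical sketch: the endpoint URLs, SMTP host, and mailbox addresses are placeholders, not details from the engagement.

import logging
import smtplib
from email.message import EmailMessage
import requests

ENDPOINTS = [
    "https://example-agent.example.com/process/rt_order_status",   # placeholder URL
    "https://example-agent.example.com/process/rt_customer_sync",  # placeholder URL
]
logging.basicConfig(level=logging.INFO)

def check(url, timeout=15):
    # Return True when the endpoint answers with HTTP 200 within the timeout.
    try:
        return requests.get(url, timeout=timeout).status_code == 200
    except requests.RequestException as exc:
        logging.error("Health check failed for %s: %s", url, exc)
        return False

def alert(url):
    # Send a simple notification email for a failing endpoint (SMTP host assumed).
    msg = EmailMessage()
    msg["Subject"] = "Informatica Cloud endpoint down: " + url
    msg["From"] = "etl-monitor@example.com"
    msg["To"] = "data-eng-oncall@example.com"
    msg.set_content("The real-time endpoint " + url + " did not return HTTP 200.")
    with smtplib.SMTP("smtp.example.com") as smtp:
        smtp.send_message(msg)

if __name__ == "__main__":
    for endpoint in ENDPOINTS:
        if not check(endpoint):
            alert(endpoint)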

Environment: Informatica PowerCenter, IDQ, IICS & ICRT, BDM, Salesforce, Cloudera, Hive, Snowflake, AWS, Apache Impala, Azure Databricks, Ranger, Python, PySpark, SC3, Tivoli, UNIX/Linux, WinSCP, SuperPuTTY, Oracle 11g, SQL Server, Tableau, Perl, web services, TOAD, XML, flat files, Altova, JIRA, SOAP UI, 834 and 837 files, Postman

Client: The Home Depot / Atlanta, GA Dec 2019 to Feb 2021

Role: ETL Developer / Tech Lead

Responsibilities:

Requirement gathering and understanding of functional specifications through constant face-to-face interaction.

Preparing data flow diagrams and data models and designing the ETL.

Setting up a CI/CD pipeline for building and releasing Oracle SQL code and running automated jobs using UFT.

Designing, validating and transforming the data from various sources.

Writing and validating complex SQL procedures and blocks to load data into the data warehouse (see the sketch after this list).

Writing SQL scripts for applying the transformation logic.

Identifying performance bottlenecks in the design and evaluating and proposing alternate solutions.

Reviewing the work to ensure all delivery-quality prerequisites are met, and coordinating with business users on user acceptance testing of the developed application.

Leading the onsite team and providing support to the offshore team.

Designing the scheduling document for jobs created in Univiewer / Dollar Universe.

Monitoring long-running critical jobs and providing solutions to reduce run time (performance tuning).

Analyzing data and providing support to business users during the UAT phase and post go-live.
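As an illustration of the SQL load procedures referenced in this list, the following minimal sketch wraps a set-based Oracle MERGE for an incremental warehouse load in Python. The table names, keys, and connection details are hypothetical assumptions, not artifacts from the project.

import cx_Oracle

MERGE_SQL = """
    MERGE INTO dw.fact_sales tgt
    USING stg.sales_delta src
       ON (tgt.sale_id = src.sale_id)
    WHEN MATCHED THEN UPDATE SET
         tgt.quantity   = src.quantity,
         tgt.net_amount = src.net_amount,
         tgt.updated_at = SYSTIMESTAMP
    WHEN NOT MATCHED THEN INSERT (sale_id, quantity, net_amount, updated_at)
         VALUES (src.sale_id, src.quantity, src.net_amount, SYSTIMESTAMP)
"""

def load_fact_sales():
    # Run the incremental MERGE and return the number of rows affected.
    with cx_Oracle.connect(user="etl_user", password="***", dsn="dwhost/ORCLPDB") as conn:
        cur = conn.cursor()
        cur.execute(MERGE_SQL)
        conn.commit()
        return cur.rowcount

if __name__ == "__main__":
    print("Rows merged:", load_fact_sales())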

Client: Softworld Technologies - India Nov 2016 to Jun 2019

Role: Snowflake Data Engineer

Responsibilities:

Served as the Snowflake Database Administrator responsible for leading data model design and database migration deployments for production releases, ensuring our database objects and corresponding metadata were successfully implemented in the platform environments (Dev, Qual, and Prod) on AWS Cloud (Snowflake).

Performed day-to-day coordination with the Database Administrator (DBA) teams for DB2, SQL Server, Oracle, and AWS Cloud to ensure database tables, columns, and their metadata were successfully implemented in the DEV, QUAL, and PROD environments in AWS Cloud (Aurora and Snowflake).

Performed ETL data translation using Informatica, converting functional requirements into source-to-target data mapping documents to support large datasets (big data) in the AWS Cloud databases Snowflake and Aurora.

Performed logical and physical data structure design and DDL generation to facilitate the implementation of database tables and columns in the DB2, SQL Server, AWS Cloud (Snowflake), and Oracle DB schema environments, using Erwin Data Modeler Model Mart Repository version 9.6 (see the DDL deployment sketch after this list).

Assisted project managers and developers in ETL solution design and development to produce reporting, dashboarding, and data analytics deliverables.

Technical team member of the T. Rowe Price Information Architect - Data Modeling Agile team, responsible for developing enterprise conceptual, logical, and physical data models and a data dictionary, supporting three business units: Retirement Plan Services (RPS), Shared Support Platforms, and Global Investment Services (GIS).
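A minimal sketch of the DDL deployment work described above, assuming generated DDL is applied to a chosen Snowflake environment (DEV/QUAL/PROD) from a script file. The environment names, account, role, and warehouse are illustrative assumptions.

import sys
import snowflake.connector

ENV_DATABASES = {"DEV": "EDW_DEV", "QUAL": "EDW_QUAL", "PROD": "EDW_PROD"}  # hypothetical names

def deploy_ddl(env, ddl_file):
    # Run each semicolon-delimited DDL statement against the target environment.
    conn = snowflake.connector.connect(
        account="example_account", user="DEPLOY_USER", password="***",
        role="SYSADMIN", warehouse="DEPLOY_WH", database=ENV_DATABASES[env],
    )
    try:
        with open(ddl_file) as fh:
            statements = [s.strip() for s in fh.read().split(";") if s.strip()]
        cur = conn.cursor()
        for stmt in statements:
            cur.execute(stmt)   # e.g. CREATE OR REPLACE TABLE ... generated from the Erwin model
    finally:
        conn.close()

if __name__ == "__main__":
    deploy_ddl(sys.argv[1].upper(), sys.argv[2])   # usage: python deploy_ddl.py qual tables.sql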

Client: Life Insurance Corporation of India - Hyderabad, India Aug 2013 to Oct 2016

Role: PL/SQL Oracle Developer

Responsibilities:

Involved in software development life cycle for the project.

Customization of front-end screens for various modules in Finacle.

Worked on LMS (liquidity management system) module in Finacle.

Created automatic sweep code along with PL/SQL procedures.

Designing and providing solutions to the client; working on creating functional documents from requirements.

Writing complex PL/SQL procedures and functions to extract data in the required format for interfaces (see the sketch after this list).

Tracking issues raised in JIRA and performing SLA management to ensure SLAs are not missed during UAT and post-production support.

Designing job scheduling in IBM Tivoli.

Performance tuning of long-running database jobs.
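To illustrate the PL/SQL interface extracts mentioned in this list, here is a hedged Python sketch that calls a hypothetical extract procedure returning a SYS_REFCURSOR and writes the rows to a pipe-delimited interface file. The package, procedure, parameters, and connection details are placeholders, not names from the Finacle project.

import csv
import cx_Oracle

def export_interface_file(as_of_date, out_path):
    # Fetch rows from a PL/SQL extract procedure and write them to a pipe-delimited file.
    with cx_Oracle.connect(user="lms_user", password="***", dsn="findb/FINPDB") as conn:
        cur = conn.cursor()
        ref_cursor = conn.cursor()                       # receives the SYS_REFCURSOR output
        cur.callproc("lms_pkg.get_sweep_extract", [as_of_date, ref_cursor])
        with open(out_path, "w", newline="") as fh:
            writer = csv.writer(fh, delimiter="|")
            writer.writerow([col[0] for col in ref_cursor.description])
            writer.writerows(ref_cursor.fetchall())

if __name__ == "__main__":
    export_interface_file("31-MAR-2016", "sweep_extract_20160331.dat")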


