
Azure Data Engineer

Location:
Edison, NJ
Posted:
August 23, 2025


Resume:

Name: Ranjit Kumar Panuganti Mobile: +1-945-***-****

Email: *******.*.*********@*****.***

Professional Summary

Highly skilled Data Warehousing Architect and Azure Data Lead with 18 years of experience in data warehousing, ETL tools, and reporting tools; a certified Azure Data Engineer with Azure cloud fundamentals. In-depth knowledge of cloud DWH platforms such as Azure Databricks, ADF, Delta Lake, and Snowflake, the ETL tools DataStage and Talend, and the reporting tools Tableau, SSRS, and Cognos. Involved in the complete Software Development Life Cycle (SDLC) of various projects, including Agile methodology, requirements gathering, system design, data modeling, migration, PoC, and maintenance. Excellent interpersonal and communication skills, with the ability to remain highly focused and self-assured in fast-paced, high-pressure environments.

18 years of experience in Data Warehousing and Business Intelligence projects across the Banking, Finance, Risk Compliance, Capital Markets, and Insurance industries, covering ETL with DataStage, Azure Data Factory, Databricks, Python (Pandas, NumPy), Snowflake, Snowpark, Tableau, Power BI, T-SQL, Big Data, SSRS, SQL Server, Oracle, Teradata, Hadoop, Autosys, and Unix, at SMBC Bank USA (NJ/NY), Wells Fargo India Solutions, and Wipro Technologies Ltd., India.

Extensive ETL experience with IBM InfoSphere/WebSphere DataStage, ADF, and Databricks, and reporting experience with Tableau, Power BI, SSRS, and Cognos.

Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data between sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse, including write-back scenarios.

Banking domain experience in Capital Markets, Finance, and Risk Compliance; designed and implemented logic for hedging (mitigating losses due to price fluctuations) and for Gross-up, Offset, Netting, and Reclass functions.

Built and managed Data Pipelines and Jobs with Databricks (Delta Live Tables, PySpark, Python, SQL) and Azure Data Factory to streamline data processing.

Experience developing Spark applications using Spark SQL/PySpark in Databricks for data extraction, transformation, and aggregation across multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.

Fine-tuned PySpark jobs by enabling Spark Adaptive Query Execution (AQE) and adjusting its built-in configuration parameters for different data volume conditions.
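A minimal PySpark sketch of the kind of AQE tuning referenced above; the specific parameter values are illustrative, not the production settings.

```python
# Illustrative sketch: enable Spark AQE and tune a few of its built-in
# parameters so shuffle partitions, skewed joins, and partition sizes are
# adjusted at runtime for varying data volumes.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("aqe-tuning-sketch")
    # Turn on Adaptive Query Execution.
    .config("spark.sql.adaptive.enabled", "true")
    # Coalesce small shuffle partitions after each stage.
    .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
    # Advisory target size per shuffle partition (value is illustrative).
    .config("spark.sql.adaptive.advisoryPartitionSizeInBytes", "64m")
    # Split oversized shuffle partitions produced by skewed join keys.
    .config("spark.sql.adaptive.skewJoin.enabled", "true")
    .getOrCreate()
)
```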

Extracted, loaded, and transformed data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics); ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed it in Azure Databricks using the Medallion architecture.

Wrote AWS Lambda functions in Python to convert, compare, and sort nested JSON files.
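A hedged sketch of the kind of Lambda handler described above, flattening a nested JSON file from S3 and returning it sorted by a key; the bucket, key, event shape, and sort field are hypothetical placeholders.

```python
import json
import boto3

s3 = boto3.client("s3")

def flatten(obj, prefix=""):
    """Recursively flatten nested dicts into dotted keys."""
    out = {}
    for k, v in obj.items():
        key = f"{prefix}.{k}" if prefix else k
        if isinstance(v, dict):
            out.update(flatten(v, key))
        else:
            out[key] = v
    return out

def lambda_handler(event, context):
    bucket = event["bucket"]   # assumed event shape
    key = event["key"]
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    records = json.loads(body)              # expected: a list of nested objects
    flat = [flatten(r) for r in records]
    flat.sort(key=lambda r: r.get("account.id", ""))  # hypothetical sort key
    return {"statusCode": 200, "body": json.dumps(flat)}
```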

Constructed AWS data pipelines using VPC, EC2, S3, Auto Scaling Groups (ASG), EBS, Snowflake, IAM, CloudFormation, Route 53, CloudWatch, CloudFront, and CloudTrail.

Designed and constructed AWS data pipelines in which AWS API Gateway receives responses from AWS Lambda; the Lambda functions retrieve data from Snowflake and DynamoDB and return it in JSON format, with AWS S3 used for staging.

Built end-to-end ETL pipelines from AWS S3 to the DynamoDB key-value store and to the Snowflake data warehouse for analytical queries on cloud data.
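A hedged sketch of the S3-to-DynamoDB leg of such a pipeline: read a CSV object from S3 and batch-write rows into a DynamoDB table. The bucket, table, and attribute names are hypothetical.

```python
import csv
import io
import boto3

s3 = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("analytics_events")  # assumed key-value table

def load_csv_to_dynamodb(bucket: str, key: str) -> None:
    """Copy rows from a CSV object in S3 into DynamoDB using batch writes."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    reader = csv.DictReader(io.StringIO(body))
    with table.batch_writer() as batch:
        for row in reader:
            # Partition key is assumed to be one of the CSV columns (e.g. "event_id");
            # remaining columns are stored as item attributes.
            batch.put_item(Item=row)
```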

Evaluated technology stacks for analytics solutions by researching strategies and tools for building end-to-end analytics, and helped design the technology roadmap for Medallion architecture, data ingestion, data lakes, data processing, and visualization.

Developed Tableau reports and dashboards at the enterprise and Line of Business levels; presented and migrated multiple legacy reporting projects to Tableau and Power BI.

Upgraded IBM DataStage from version 9.1 to 11.5 across all environments with zero issues; received the Achieving Excellence Award (AE4) for driving this conversion end to end.

Migrated legacy ETL applications to the Azure platform on the Medallion architecture, using ADF for migration and Databricks for analytics.

Implemented SCD Type 2 and 3 in DataStage and used JSON/XML configuration to read and write files.

Rewrote the DataStage v11.5 enterprise software package at Wells Fargo to fix issues as it replaced the older version.

Developed parallel jobs implementing SCD Type 2, using the Transformer stage, CDC stage, ODBC connectors, sequencers, and runtime column propagation.

Migrated around 100 MS SQL Server tables to an Azure cloud database as a PoC using Azure Data Factory (ADF).

Used Python Pandas and NumPy libraries in scripts for testing Azure Blob files and for data analytics.
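An illustrative sketch of the pandas/NumPy validation scripts mentioned above; the connection string, container, and column names are hypothetical, and the pattern is simply to download a landing file from Blob storage and reconcile it against a target extract.

```python
from io import BytesIO

import numpy as np
import pandas as pd
from azure.storage.blob import BlobServiceClient  # assumes azure-storage-blob is installed


def load_blob_csv(conn_str: str, container: str, blob_name: str) -> pd.DataFrame:
    """Download a CSV landing file from Blob storage into a DataFrame."""
    service = BlobServiceClient.from_connection_string(conn_str)
    data = service.get_blob_client(container, blob_name).download_blob().readall()
    return pd.read_csv(BytesIO(data))


def reconcile_counts(src: pd.DataFrame, tgt: pd.DataFrame, key: str) -> pd.DataFrame:
    """Compare per-key row counts between the landing file and the target extract."""
    counts = (
        src.groupby(key).size().rename("src_rows").to_frame()
        .join(tgt.groupby(key).size().rename("tgt_rows"), how="outer")
        .fillna(0)
    )
    counts["diff"] = np.abs(counts["src_rows"] - counts["tgt_rows"])
    return counts[counts["diff"] > 0]
```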

Created ADF pipelines and Data Flow jobs using Conditional Split, Exists, Join, and Derived Column transformations, along with schema modifiers, Window, and Pivot.

Created technical specification documents covering DataStage software installation, design, architecture, and migration implementation steps.

Experience fine-tuning Teradata SQL queries by reviewing execution plans, repartitioning indexes, and using bulk load/BTEQ loads of data.

Implemented bulk load/BTEQ load in DataStage Teradata connectors and in the query sections of UNIX shell scripts.

Expertise in Snowflake concepts such as setting up resource monitors, RBAC controls, scalable virtual warehouses, SQL performance tuning, zero-copy cloning, and Time Travel, and automating them.

Experience re-clustering data in Snowflake, with a good understanding of micro-partitions.

Experience migrating from on-premises database environments to Snowflake and Azure cloud environments.

Key achievement: partnered with 40+ teams to establish a business glossary and metadata repository using IBM InfoSphere and Ab Initio BRE components.

Implemented a data governance strategy for the EFT LoB by developing an enterprise-wide MDM system that provides consistent data lineage and definitions across 40+ systems.

Created a centralized MDM database with 500+ attributes across 5 systems to support various business units and business requirements.

Strong understanding of the principles of Data Warehousing using fact tables, dimension tables and star/snowflake schema modeling.

Worked extensively with Dimensional modeling, Data migration, Data cleansing, ETL Processes for data warehouses.

Led a team in developing new, and modifying existing, design approaches to automate routine tasks in line with the new ETL architecture direction; used Enterprise Edition/Parallel stages such as Data Set, Change Data Capture, and Row Generator in the ETL coding.

Led the technical design team and performed peer code reviews and analysis.

Microsoft Certified: Azure Data Engineer Associate (2024) and Azure Fundamentals (2022).

Created LoB-level metrics in Tableau and delivered multiple projects smoothly.

Excellent team player with problem-solving, conflict-handling, and troubleshooting capabilities.

Proven ability to quickly learn and apply new technologies, with creativity, innovation, and the ability to work in a fast-paced environment.

Work effectively with diverse groups of people both as a team member and individual.

Trained in and a practitioner of Agile development methodologies, using tools such as Jira and Confluence in a compliance-driven environment.

Technical Skills

DWH/ETL Technologies: Azure Databricks, ADF, IBM InfoSphere DataStage 11.7/9.3, ADLS, Python, Tableau, AWS, S3, Snowflake, Power BI, Talend 7.3, Collibra, Hadoop, Cognos, SSRS

Databases: ADLS, Oracle, Teradata, Snowflake, MS SQL Server 2018

Languages: T-SQL, SQL, Python, Unix Shell Scripts

Other Tools: GitHub, ServiceNow, SCM, Jenkins, Jira, Confluence, Agile

Scheduling/Monitoring Tools: Cloudera Manager, Autosys, Control M

Operating Systems: Windows, UNIX, AIX, Mac

Cloud Technologies: Azure (ADF), Snowflake Cloud DWH, AWS

Reporting Tools: Tableau, Power BI, Cognos

Work Experience

Client: SMBC Bank NJ/NY USA Sept 2023 – Present

Role: ETL Architect & Data Engineer

Description:

The Oracle General Ledger application under Capital Markets is moving from a legacy system to the Azure cloud. The project involves an Azure Data Lake environment, with source data extracted from legacy systems using ADF pipelines and transformed using Databricks. DataStage and Informatica were used to consolidate data and balances from different inputs, and Tableau and Denodo views were used for reporting.

Responsibilities:

Migrated DataStage jobs to the Azure cloud environment from on-premises Oracle and SQL Server database environments.

Created ADF Pipeline jobs to extract source data from different legacy systems like EBS, ELF and HORIZON (Oracle GL Application).

Created pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data between sources such as Azure SQL, Blob Storage, and Azure SQL Data Warehouse, including write-back scenarios.

Built and managed Data Pipelines and Jobs with Databricks (Delta Live Tables, PySpark, Python, SQL) and Azure Data Factory to streamline data processing.

Experience developing Spark applications using Spark SQL/PySpark in Databricks for data extraction, transformation, and aggregation across multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.

Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, Spark SQL, and U-SQL (Azure Data Lake Analytics); ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed it in Azure Databricks using the Medallion architecture.
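A minimal Databricks sketch of the Medallion flow described above, assuming a CSV landing file already ingested by ADF; the storage path, table names, and column names are hypothetical placeholders, and `spark` is the SparkSession predefined in Databricks notebooks.

```python
from pyspark.sql import functions as F

landing_path = "abfss://landing@storageacct.dfs.core.windows.net/gl/trial_balance/"

# Bronze: raw ingest with load metadata, stored as Delta.
bronze = (spark.read.option("header", "true").csv(landing_path)
          .withColumn("_ingest_ts", F.current_timestamp()))
bronze.write.format("delta").mode("append").saveAsTable("bronze.trial_balance")

# Silver: typed, de-duplicated records.
silver = (spark.table("bronze.trial_balance")
          .withColumn("balance_amt", F.col("balance_amt").cast("decimal(18,2)"))
          .dropDuplicates(["entity_id", "account_id", "period"]))
silver.write.format("delta").mode("overwrite").saveAsTable("silver.trial_balance")

# Gold: entity-level aggregates for reporting.
gold = (silver.groupBy("entity_id", "period")
        .agg(F.sum("balance_amt").alias("total_balance")))
gold.write.format("delta").mode("overwrite").saveAsTable("gold.trial_balance_summary")
```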

Wrote PySpark code for Trial Balance data classes such as Securities and Derivatives, and for functional processing such as Netting, Reclass, Gross-up, and Offset rules.
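A hypothetical PySpark sketch of a simplified netting rule, shown only to illustrate the shape of such logic: offsetting signed balances per entity/counterparty/account and routing the net result to a target GL account. The DataFrame, column, and account names are placeholders; real Gross-up/Offset/Reclass rules are driven by mapping tables.

```python
from pyspark.sql import functions as F

netted = (
    silver_balances  # assumed Silver-layer DataFrame with a signed balance_amt column
    .groupBy("entity_id", "counterparty_id", "account_id")
    .agg(F.sum("balance_amt").alias("net_balance"))
    # Reclass step: net debits and net credits go to different GL accounts.
    .withColumn(
        "target_gl_account",
        F.when(F.col("net_balance") >= 0, F.lit("ASSET_GL")).otherwise(F.lit("LIAB_GL")),
    )
)
```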

Automated data processing and improved efficiency using Databricks Jobs and Data Pipelines (DLT) with Spark, enhancing overall data processing accuracy.

Designed incremental loading strategies with Change Data Capture (CDC) and Delta Lake MERGE operations across the Bronze, Silver, and Gold zones and catalog tables.
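A minimal sketch of an incremental MERGE into a Silver Delta table, assuming a CDC feed with an operation flag; the table and column names are illustrative, and `spark` is the Databricks SparkSession.

```python
from delta.tables import DeltaTable

updates = spark.table("bronze.trial_balance_cdc")  # assumed CDC feed with an _op flag (I/U/D)

target = DeltaTable.forName(spark, "silver.trial_balance")
(
    target.alias("t")
    .merge(
        updates.alias("s"),
        "t.entity_id = s.entity_id AND t.account_id = s.account_id AND t.period = s.period",
    )
    .whenMatchedDelete(condition="s._op = 'D'")
    .whenMatchedUpdateAll(condition="s._op = 'U'")
    .whenNotMatchedInsertAll(condition="s._op IN ('I', 'U')")
    .execute()
)
```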

Deployed CI/CD pipelines using Azure DevOps with automated Databricks notebook testing and data validation.

Migrated around 100 MS SQL Server tables to an Azure cloud database as a PoC using Azure Data Factory (ADF).

Used Python Pandas and NumPy libraries in scripts for testing Azure Blob files (landing files) and for data analytics.

Queried Databricks catalog tables for analytics and file extracts; demonstrated Databricks Unity Catalog data lineage and data quality insights to the Dev, CoE, QA, and Business teams.

Fine-tuned Databricks jobs per data class in terms of cluster customization, salting techniques to avoid data skew, repartitioning, framework redesign, joins, and control tables.
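An illustrative salting sketch for a skewed join key: spread hot keys on the large side across N salt buckets and explode the small side to match. The DataFrames, the join column, and the salt factor are hypothetical.

```python
from pyspark.sql import functions as F

SALT_BUCKETS = 16  # illustrative salt factor

# large_df and small_df are assumed DataFrames that both contain "join_key".
large_salted = large_df.withColumn("salt", (F.rand() * SALT_BUCKETS).cast("int"))
small_exploded = small_df.withColumn(
    "salt", F.explode(F.array([F.lit(i) for i in range(SALT_BUCKETS)]))
)

# Joining on (join_key, salt) spreads each hot key across SALT_BUCKETS partitions.
joined = (
    large_salted.join(small_exploded, on=["join_key", "salt"], how="inner").drop("salt")
)
```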

Python coding with DataFrame joins, nested for loops, and advanced Python programming concepts.

Created ADF pipelines and Data Flow jobs using Conditional Split, Exists, Join, and Derived Column transformations, along with schema modifiers, Window, and Pivot.

Designed end-to-end data integration solutions using Azure Data Factory to extract, transform, and load data from on-premises and cloud sources.

Implemented data pipelines with data movement and transformation activities, utilizing Azure Data Factory's built-in transformations and custom activities.

Developed reconciliation jobs in DataStage that let Financial Controllers upload the latest mapping document (FINMAPP) and used it to generate the desired output files in the ADI system.

Implemented Trial Balance validation at the entity level for different source systems, with reports on various parameters in Tableau.

Helped the QA team automate complex validations using Python code.

Technologies/Tools: Azure Data Factory, DataBricks, PySpark, Python Pandas, Numpy, Datastage, Collibra, Tableau, Autosys, Oracle, SQL Server, Service Now, JIRA, Power BI.

Client: Wells Fargo India Jan 2022 – July 2023

Role: ETL Architect & Data Engineer

Description:

Enterprise Risk and Finance Technology is part of EFT Alignment in Wells Fargo which deals with Risk applications related to Internal Loss Data, Third Party, Capital Markets, Operational Risk and Finance technologies.

TRIMS and ILD are DWH applications built on different tech stacks, running on MS SQL Server and Oracle.

Responsibilities:

Led the migration of the on-premises TRIMS (Third-Party Risk) application data warehouse from DataStage to the Azure cloud environment using ADF and Databricks PySpark.

Migrated 100 MS SQL Server tables to an Azure cloud database as a PoC using ADF.

Designed end-to-end data integration solutions using Azure Data Factory & Databricks to extract, transform, and load data from on-premises to cloud environment.

Implemented data pipelines with data movement and transformation activities, utilizing Azure Data Factory's built-in transformations and custom activities in Data flow.

Implemented an ETL framework using PySpark and loaded landing files into Parquet with Medallion-architecture transformations.

Rewrote legacy ETL SQL scripts as PySpark SQL scripts for faster performance.

Managed the ingestion of landing CSV files from different data sources into ADLS Gen2 using Azure Data Factory (self-hosted integration runtime), then mounted the storage for further processing with Databricks PySpark.

Ensured secure storage and accessibility of data in Azure Data Lake Storage (Delta Lake).

Created Databricks pipelines to handle data transformations, including extraction, cleansing, normalization, and structuring of raw data, in a Delta Lake architecture spanning Bronze, Silver, and Gold enrichments.

Integrated Azure Data Factory with other Azure services, such as Azure Databricks and Azure SQL Database, to support advanced analytics and reporting needs.

Awarded AE4 for the PoC implementation that replaced all modules' file delivery through the NDM process with a single ETL job integrating NDM scripts.

Migrated the DataStage ILD application to Snowflake from an on-premises Oracle database environment.

Experience designing and building manual and auto-ingestion data pipelines using Snowpipe.
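A hedged sketch of creating an auto-ingest Snowpipe over an existing external stage via the Snowflake Python connector; the account, credentials, stage, and table names are placeholders.

```python
import snowflake.connector  # assumes snowflake-connector-python is installed

conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="***",
    warehouse="LOAD_WH", database="RISK_DB", schema="LANDING",
)
cur = conn.cursor()
# AUTO_INGEST=TRUE lets cloud storage event notifications trigger the load.
cur.execute("""
    CREATE PIPE IF NOT EXISTS ild_events_pipe
      AUTO_INGEST = TRUE
    AS
      COPY INTO ILD_EVENTS
      FROM @ild_ext_stage
      FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
""")
cur.close()
conn.close()
```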

Experience developing Snowflake stored procedures and writing SnowSQL queries to analyze and transform data.

Proficient in Snowflake concepts such as setting up resource monitors, RBAC controls, scalable virtual warehouses, SQL performance tuning, zero-copy cloning, and Time Travel, and automating them.

Experience re-clustering data in Snowflake, with a good understanding of micro-partitions.

Developed a framework for converting existing DSX jobs into Snowflake/Talend jobs.

Defined Snowflake virtual warehouse sizing for different types of workloads.

Redesigned Snowflake views to improve performance.

Rewrote multiple Excel- and database-based reporting projects in Tableau and Power BI.

Led EFTPM metrics in Tableau and created a SharePoint URL for business use.

Led the team in the ETL migration from DataStage to Snowflake.

Identified ideas, prepared PoCs, and drove end-to-end delivery.

Drove meetings with BSC, QA, and Release Management.

Project management activities: quality delivery, incident tracking, issue resolution, code walkthroughs, code reviews, and defect prevention management.

Technologies/Tools: Azure Databricks PySpark, ADF, ADLS Gen2, Snowflake, DataStage v11.7, Matillion, T-SQL, Tableau, Autosys, Oracle, SQL Server, Service Now, JIRA, Power BI.

Client: Wells Fargo India Mar 2016 - Dec 2021

Role: Team Lead and Developer

Description:

The Operational Risk Utilities (1ORU), ORIS, and SHRP applications, part of Wells Fargo Enterprise Risk Management, support analytics capability and provide the environment to house and maintain TPRM, ILD, CRAS, EIW, and SCM data for reporting and analytics for Wells Fargo stakeholders such as the EDA, TRIMS, SHRP, SORP, EIW, CRAS, and ILD teams.

Responsibilities:

Led the team in design, development, Agile sprint, testing, and deployment activities.

Led the team, both technically and managerially, through the DataStage migration from v9.1 to v11.5, including server setup, environment creation, NDM configuration, testing, and code fixes/rewrites.

Awarded AE4 for the PoC implementation that replaced all modules' file delivery through the NDM process with a single ETL job integrating NDM scripts.

Rewrote the DataStage v11.5 enterprise software package at Wells Fargo to fix issues as it replaced the older version.

Created technical specification documents covering DataStage software installation, design, architecture, and migration implementation steps.

Used DataStage shared containers for repetitive steps when creating files via stored procedures with different parameters.

Fine-tuned DataStage jobs where fixes were required: hash partitioning on keys used in Joins and Lookups, shared containers, before/after-job subroutines, sequencer Job Activities for parallel runs, and query tuning on both the ETL and DB sides.

Performance-tuned Teradata SQL queries by reviewing execution plans in Teradata SQL Assistant, repartitioning indexes, avoiding full table scans, and using CTE expressions.

Collected Teradata table statistics and dropped indexes/keys before bulk loads.

Handled PROD support and migration activities for the ORU, TRIMS, SHRP, and ORIS applications smoothly, without a single PROD issue after conversion or upgrade.

Implemented SCD Type 2 and 3 in DataStage and used JSON/XML configuration to read and write files.

Led the build-out of EFT LoB metrics from scratch in Tableau, covering all PROD support issues and change request deployments.

Created multiple interactive Tableau dashboards with multi-table joins using data blending, dual axis, blended axis, context filters, and global filters.

Created hierarchy filters to drill down/up through metrics over the managerial hierarchy at the LoB level, plus year-to-date filters on dashboards and views.

Identified ideas, prepared PoCs, and drove end-to-end delivery.

Designed technical documents, prepared test cases, and peer-reviewed code/scripts.

Developed enterprise LoB metrics for every quarter of the year.

Led STAMP forecast and metrics activities; drove meetings with BSC, QA, and Release Management.

Project management activities: quality delivery, incident tracking, issue resolution, code walkthroughs, code reviews, and defect prevention management.

Technologies/Tools: DataStage v11.5, v9.1, Tableau, T-SQL, Control M, Teradata 15 DB, SQL Server, Stored Procedures, SCM, Pac2k, NDM scripts, RSA Archer reports.

Client: Wells Fargo India Aug 2012 - Mar 2016

Role: ETL Design and Developer

Description:

The Capital Markets Operational Data Store (CMODS) is an ETL and database system that contains consolidated mortgage loan data, including pipeline loans, reverse loans, and loan commitments from rate lock to settlement, using data provided by participating loan origination systems, servicing systems, and other mortgage information systems. Loan information is refreshed daily. CMODS merges the data from multiple loan origination systems into a single unified repository for analysis and provides data for downstream business line functions. Business line customers include Secondary Markets Accounting & Controls (SMAC), the Pipeline/Warehouse Asset Valuation (PWAV) group, Servicing Portfolio Management, Asset Sales, Structured Finance, Investment Analytics, Agency Relations, and the Trade Desk.

Responsibilities:

Developed Enterprise and M&E projects for every quarter of the year.

Created parallel jobs in DataStage and used/configured Excel read, XML output file, Teradata Connector, ODBC Connector, Join, Lookup, and Transformer stages.

Created multiple complex sequencer jobs using parameters in Job Activity, User Variables Activity, shell scripts, and before/after-job subroutines.

Customized data loads in shell scripts for bulk load and BTEQ load into Teradata database tables.

Used DataStage orchadmin commands to read data from datasets and write it into text files.

Used DataStage containers for reusable job logic invoked with different parameters.

Fine-tuned DataStage parallel jobs on the ETL side via hash partitioning of join and sort keys, and on the Teradata side via table statistics collection, dropping keys/indexes before bulk loads, clustered indexes, and query tuning.

Led the team in the Ab Initio metadata capture effort for validating data in CMODS.

Provided support, performance analysis, and tuning of the CMODS application.

Prepared technical documents and led the project-level effort in design, testing, and deployment.

Guided team members in code testing, fixes, code reviews, support issues, and documentation.

Made the team and myself SMEs (Subject Matter Experts) in CMODS.

Presented new ideas, prepared PoCs, and implemented them.

Project management activities: quality delivery, incident tracking, issue resolution, code walkthroughs, code reviews, and defect prevention management.

Technologies/Tools: DataStage v9.1, Abinitio, Autosys, Teradata 15, UNIX Shell Script.

Client: Thames Water UK India Mar 2012 - Aug 2012

Role: ETL Developer

Responsibilities:

Understanding the Data Migration and business requirements.

Developed DataStage jobs for data extraction and migration from one server to another with different source data.

Read/extracted data from MS Access, Excel files, and SQL Server into SQL Server tables as the target stage.

Configured ODBC drivers in DataStage to connect to SQL Server tables.

Performed DBA tasks such as database maintenance and creating triggers and stored procedures in the SQL Server database.

Created audit tables to capture DML updates on tables, using DML triggers I wrote.

Migrated the database from Dev to SIT and UAT, and deployed it to the production server.

Created the MS SQL Server database and fine-tuned table data loads by establishing the right keys, with confirmation from the architect.

Created roles and database login IDs for team and business users to have read or write access to tables.

Technologies/Tools: Data Stage 8.5, Autosys, SQL server 2008, Cordys.

Client: Michelin India Feb 2011 – Mar 2012

Role: ETL Developer

Responsibilities:

Generated the monthly cube in the IBM Cognos reporting tool with the new monthly data provided by the business.

Configured the Cognos server and MS BI SQL Server DTS packages.

Developed DataStage sequencer and parallel jobs, and documentation for installing DataStage software across the SDLC environments Dev, SIT, UAT, and PROD.

Developed ETL jobs for new enhancements using Talend, and a few in MSBI DTS packages.

Production support and maintenance activities; performance tuning of jobs.

Fixed data issues and tested them before the next month's cycle.

Upgraded servers and delivered enhancements in the MEMS_OF and PAPERLESS apps.

Technologies/Tools: Datastage, Talend, Cognos reporting, MS SQL Server MSBI, Crystal reports.

Client: General Motors, Detroit India Dec 2008 – Jan 2011

Role: Software Engineer

Responsibilities:

Developed DataStage parallel jobs to extract data from input files, transform it, and generate target IW files.

Developed sequencers and shell scripts for existing jobs and new business requirements.

Fine-tuned SQL queries for better performance through proper key selection, indexing, and partitioning.

Wrote PL/SQL stored procedures for daily report extraction from delta table loads.

Project management activities: status tracking and defect prevention management.

Ran the regular cycles with the new data provided by the business.

Supported production data issues and identified their root causes.

Fixed data issues and tested them before the next run.

Prepared technical design documentation and test cases.

Wrote PL/SQL stored procedures for object-wise Excel reports based on requirements.

Technologies/Tools: IBM DataStage, Oracle SQL stored procedures, Autosys and UNIX shell scripts.

Certifications

Microsoft Certified: Azure Data Engineer Associate (since December 2024)

https://learn.microsoft.com/api/credentials/share/en-us/RanjithKumarPanuganti-5483/C8017E575437C172?sharingId=B3DF4D6C268FEDEC

Microsoft Certified: Azure Fundamentals (since December 2022)

https://www.credly.com/badges/fa04747a-fca0-43b2-96c2-205986268f70/public_url

Academic Profile

QUALIFICATION: Bachelor of Technology (Electronics & Communication Engineering)

BOARD/COLLEGE: V.R. Siddhartha Engineering College, Vijayawada (affiliated to Acharya Nagarjuna University)

YEAR OF COMPLETION: 2004-08

AGGREGATE: 82%

USA Profile

VISA STATUS: H1B – Nov 2027

ADDRESS: 20 BRITTON STREET, JERSEY CITY, NEW JERSEY 07306, USA

CURRENT COMPANY: Sumitomo Mitsui Banking Corporation (SMBC)

CONTACT: +1-945-***-**** | *******.*.*********@*****.***


