
ETL Developer / Data Engineer

Location: Novi, MI


AnjiReddy Anumula

+1-248-***-****

adxyqc@r.postjobfree.com

Professional Summary:

Overall 13 years of experience in the IT industry in the design, development, and support of enterprise applications using ETL, ELT, Big Data, and BI technologies.

Very good communication skills and analytical thinking; worked on the end-to-end project life cycle.

Experience in data warehousing and big data tools including Informatica PowerCenter 10.x/9.x/8.x, Informatica Data Quality, IICS, Ab Initio, Python, Hadoop, Hive, Pandas, Snowflake, Spark, and Databricks.

Good knowledge on AWS tools like EC2, S3, Athena, Lambda, CloudFormation, Glue, Redshift, EMR.

Experience in developing projects using Waterfall and Agile methodologies.

Good knowledge of the Hadoop ecosystem, data lakes, and OLAP architecture.

Good experience in Unix, Python, PySpark, Boto and basic Java.

Extensive experience in implementation of Data Cleanup procedures, transformations, Scripts, Stored Procedures, and execution of test plans for loading the data successfully into the targets.

Proficient in understanding business processes/requirements and translating them into technical requirements.

Experience in implementing business rules by creating Informatica transformations such as Expression, Aggregator, Lookup, Router, Filter, Joiner, Rank, Update Strategy, XML Parser, Stored Procedure, Sequence Generator, Application SQ, Web Services Consumer, JBOSS source, Data Quality transformations, IICS Masking, IICS Cleansing, IICS Hierarchy Builder, and IICS Hierarchy Parser, and in developing mappings.

Extensive knowledge on Snowflake data warehouse data loads, extraction, integration and virtual warehouse creation techniques.

Experience in QA testing using automation tools.

Good experience in fine tuning Hive, Spark, Snowflake queries and Informatica code.

Good exposure on Informatica 9x/10x platform.

Exposure on Data Warehousing, Data Lake and Cloud architecture.

Good knowledge on Teradata, Oracle, SQL server and Cassandra.

Experience with Git and CI/CD pipelines.

Very good knowledge in developing automation scripts using Python and UNIX.

Worked on migrating data from on-prem data lake to AWS S3 using different APIs.

Worked on integrating S3 files with AWS Athena.

Extensive experience integrating on-prem data warehouses with the AWS cloud using Glue, EMR, Athena, and Redshift.

Worked on moving data from Oracle to S3 using Python and Boto3 (a minimal sketch appears at the end of this summary).

Knowledge of visualizing data using Tableau and Power BI.
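
For illustration, a minimal sketch of the Oracle-to-S3 and Athena pattern mentioned above, assuming the boto3, pandas, and python-oracledb libraries; the bucket, database, and table names are hypothetical placeholders rather than details from the actual projects:

import boto3
import oracledb            # assumed Oracle client library (python-oracledb)
import pandas as pd

# Extract a table from Oracle into a DataFrame and land it in S3 as CSV.
conn = oracledb.connect(user="etl_user", password="***", dsn="oracle-host/ORCLPDB1")
df = pd.read_sql("SELECT * FROM accounts", conn)
df.to_csv("/tmp/accounts.csv", index=False)

s3 = boto3.client("s3")
s3.upload_file("/tmp/accounts.csv", "my-landing-bucket", "accounts/accounts.csv")

# Query the landed file through Athena (an external table is assumed to be defined over the S3 prefix).
athena = boto3.client("athena")
athena.start_query_execution(
    QueryString="SELECT COUNT(*) FROM accounts",
    QueryExecutionContext={"Database": "landing_db"},
    ResultConfiguration={"OutputLocation": "s3://my-landing-bucket/athena-results/"},
)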

Professional Experience:

Working as a Software Engineer at Reliable Software from April 2023 till date.

Worked as a Sr. Data Engineer at JP Morgan, Hyderabad, India from May 2016 to April 2023.

Worked as an ETL Developer at Deloitte Consulting, Hyderabad, India from July 2013 to April 2016.

Worked as an ETL Developer Engineer at PwC SDC, Bangalore, India from July 2012 to July 2013.

Worked as an ETL Developer at Capgemini Consulting, Bangalore, India from Oct 2010 to July 2012.

Education:

M.C.A (Computers) from Osmania University.

AWS certified Cloud practitioner and Developer Associate.

Technical Skills:

ETL Tools: Informatica PowerCenter, IDQ, IICS, Informatica Analyst, Glue

Programming Languages: Python, Java, SQL, Shell scripting and PySpark

Big Data: Hadoop, HDFS, Hive, Spark, Sqoop

Code Management Tools: Git, Jira, Confluence, Bitbucket, Jenkins

Scheduling Tools: Control-M, Cron, Apache Airflow, Autosys

RDBMS: Teradata, Oracle, SQL Server, MySQL

Operating Systems: UNIX and Linux

Cloud Technologies: AWS, Snowflake

Project #1

Project Title: Deposit data mart (Deposit application) and AMLKYC

Role: Senior Data Engineer, ETL & Big Data Architect

Client: JP Morgan & Chase

Duration: March 2018 – April 2023

Environment: Informatica, Teradata 15, Hadoop, Ab Initio, Erwin, Hive, Java, Spark, Python, IICS, PySpark, Pandas, S3, Glue, Athena, GIT, Jenkins, Hortonworks, Kafka, Snowflake

The Deposit data mart is a snapshot of deposit and current accounts, holding the daily and month-end balances of all Chase DDA accounts.

Responsibilities:

Understood the functional processes in the client's existing models.

Participated in initial architect discussions regarding the planning and implementation.

Designed complex data flow mappings using Informatica.

Extensively used Teradata loaders using Informatica.

Handled multiple performance related issues using Informatica/Teradata/Spark code.

Worked on different source types (relational, flat files, Mainframe, XML) using Informatica.

Worked on syncing Salesforce data into DWH using IICS.

Worked on various IICS concepts like parameterization, Macros.

Worked on IICS transformations like Masking, Cleanse, Hierarchy builder, Hierarchy Parser etc.

Supported existing dataflows using Informatica, IICS and Ab Initio.

Acted as offshore lead and data analyst, working on multiple releases in parallel.

Tuned the performance of existing code where jobs were running for long hours.

Developed and owned multiple data sync jobs with Oracle.

Actively participated in Spark, IICS and AWS POCs.

Converted legacy code into the Spark framework using PySpark (a PySpark sketch follows this responsibilities list).

Worked on multiple AIS RISK models using PySpark.

Worked on the PIF (Python Integration Framework) for loading data from different databases to the Hadoop platform.

Experience in QA testing using SQL and Python scripts, negative testing, and automated testing with PRANA.

Experience in modelling tables using Erwin and validating them using ModelBoost.

Worked with MSA team to get the work done using target state architecture (Spark).

Worked with different source teams to analyse the requirements for moving to new Data Center.

Created CICD pipelines using Jenkins.

Created Python scripts for automating frequent tasks.

Worked on creating and configuring Databricks clusters and using them for development and batch runs.

Created batch jobs using Control-M and Autosys; working knowledge of Apache Airflow.

Worked on different placement APIs to move data from Hortonworks to AWS.

Extensively used DPL (Chase spark framework) for ingestion and semantic operations.

Worked in Agile methodologies and acted as Scrum Master when needed.

Worked on Kafka and integrated it with Hive using AbInitio and UDS framework.

Provisioned semantic tables using AWS EMR with complex PySpark logic per business requirements.

Created Snowpipe and SQS event notifications to automate loading AWS S3 files into Snowflake (a Snowpipe sketch follows this responsibilities list).

Migrated complex master and transaction data from on-prem Hadoop into Snowflake.

Worked on provisioning the data in AWS Athena and Redshift.

Worked on AWS services such as Lambda using Python (a Lambda sketch follows this responsibilities list).
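
For illustration, a minimal PySpark sketch of the kind of legacy-to-Spark conversion described above; the database, table, and column names are hypothetical placeholders, not actual project objects:

from pyspark.sql import SparkSession, functions as F

# Rebuild a legacy join/aggregate as a Spark job that writes a daily snapshot table.
spark = SparkSession.builder.appName("deposit_snapshot").enableHiveSupport().getOrCreate()

accounts = spark.table("raw.dda_accounts")
balances = spark.table("raw.daily_balances")

snapshot = (
    accounts.join(balances, "account_id")
    .where(F.col("balance_dt") == F.current_date())
    .groupBy("account_id")
    .agg(F.sum("balance_amt").alias("eod_balance"))
)
snapshot.write.mode("overwrite").saveAsTable("mart.deposit_daily_snapshot")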
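
A sketch of automating S3-to-Snowflake loads with Snowpipe, assuming the snowflake-connector-python package and an existing external S3 stage; the account, stage, and table names are hypothetical:

import snowflake.connector

# Create an auto-ingest pipe over an external S3 stage; S3 event notifications are then
# pointed at the pipe's SQS queue (reported by SHOW PIPES as notification_channel).
conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="***",
    warehouse="LOAD_WH", database="EDW", schema="STAGING",
)
cur = conn.cursor()
cur.execute("""
    CREATE PIPE IF NOT EXISTS deposit_pipe
      AUTO_INGEST = TRUE
    AS
      COPY INTO deposit_daily
      FROM @s3_deposit_stage
      FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
""")
cur.execute("SHOW PIPES LIKE 'deposit_pipe'")
print(cur.fetchall())
conn.close()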
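
A sketch of an S3-triggered Lambda handler in Python of the sort referenced above; the tagging logic is illustrative only:

import json
import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    # Standard S3 event payload: one record per new object.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(json.dumps({"received": f"s3://{bucket}/{key}"}))
        # Tag the object as received so downstream jobs can filter on the tag.
        s3.put_object_tagging(
            Bucket=bucket,
            Key=key,
            Tagging={"TagSet": [{"Key": "status", "Value": "received"}]},
        )
    return {"statusCode": 200}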

Project #2

Project Title: Target State POC

Role: Senior ETL Developer and Data Engineer

Client: JP Morgan & Chase

Duration: November 2017 – March 2018

Environment: Python, Cassandra, Cloud (Cloud Foundry), Apache JMeter, JAVA, AWS S3, Pandas, matplotlib

The purpose of this POC is to move the document process from a batch method to an event-based architecture.

Responsibilities:

Understood the new technologies and implemented the business scenarios to achieve an event-based architecture.

Created Python code to move the data from different tables in Cassandra NoSQL database.

Used Boto packages/Pandas to move data from Oracle to AWS S3 buckets.

Created a robust data model to handle table loads from Oracle to extract files using Python.

Worked on handling memory issues in dealing with data extracts using Python.

Used Pandas DataFrames in Python to join the Cassandra tables (a pandas join sketch follows this responsibilities list).

Tested the Web Service rule engine performance using Apache JMeter tool.

Created a Python report of daily opened-ticket details for different teams across the portfolio.

Created a summary report in Python at the CCB-OPT level, using matplotlib and Excel utilities to produce graphical reports for senior management (a reporting sketch follows this responsibilities list).
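
A minimal sketch of the client-side Cassandra join in pandas mentioned above, assuming the DataStax cassandra-driver package; the keyspace, table, and column names are hypothetical:

import pandas as pd
from cassandra.cluster import Cluster

# Cassandra has no server-side joins, so both tables are pulled down and joined in pandas.
cluster = Cluster(["cassandra-host"])
session = cluster.connect("documents")

docs = pd.DataFrame(list(session.execute("SELECT doc_id, customer_id, doc_type FROM doc_master")))
events = pd.DataFrame(list(session.execute("SELECT doc_id, event_ts, status FROM doc_events")))

joined = docs.merge(events, on="doc_id", how="left")
print(joined.head())
cluster.shutdown()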
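
A sketch of the daily ticket summary report described above, using pandas and matplotlib; the input layout and output file names are hypothetical:

import pandas as pd
import matplotlib
matplotlib.use("Agg")          # render to a file; no display needed on a batch host
import matplotlib.pyplot as plt

# Assumed columns in the ticket extract: team, opened_date, ticket_id.
tickets = pd.read_csv("open_tickets.csv")
summary = tickets.groupby("team")["ticket_id"].count().sort_values(ascending=False)

ax = summary.plot(kind="bar", title="Open tickets by team")
ax.set_ylabel("Open tickets")
plt.tight_layout()
plt.savefig("daily_ticket_summary.png")
summary.to_excel("daily_ticket_summary.xlsx")   # Excel copy for management (needs openpyxl)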

Project #3

Project Title: Auto Letters

Role: ETL Developer and Integration Lead

Client: JP Morgan & Chase

Duration: July 2017 – November 2017

Environment: Informatica PC 9.1, UNIX, Oracle 10g, Control-M, PL/SQL

The purpose of the Auto Letters program is to migrate Auto application letters to a centralized letter system for composition, distribution, imaging, and archiving of all Auto letters.

Responsibilities:

Understood the business rules completely and implemented the UNIX scripts and Informatica mappings.

Created reusable functions in UNIX.

Improved the performance of file validation with UNIX scripts.

Extensively used Informatica to load data into Oracle database.

Created robust validation methods as per the business requirements.

Created reusable Unix scripts and Informatica mappings so that new source systems can be added with minimal changes.

Made proper use of parameters to minimize changes for future updates.

Prepared Unit test case documents and tested the code successfully.

Used control-m to automate the batch jobs for Unix and Informatica code.

Project #4

Project Title: IDMS

Role: ETL Developer and Data Engineer

Client: JP Morgan & Chase

Duration: March 2017 – July 2017

Environment: Informatica PC 9.1, UNIX, Oracle 10g, Control-M, PL/SQL

Integrated Dispute Management System is a single platform for different Card disputes. It enhances the customer interaction services and reduces the existing control issues.

Responsibilities:

Understood the business rules completely and implemented the UNIX scripts.

Improved the performance of file validation with UNIX scripts.

Created retry logic in the script and Informatica mapping to retry the failed records on the Web Services side.

Made proper use of parameters to minimize changes for future updates.

Prepared Unit test case documents and tested the code successfully.

Prepared complex Informatica mappings for CMOD files.

Project #5

Project Title: Click 2 Sign

Role: ETL Developer

Client: JP Morgan & Chase

Duration: June 2016 – February 2017

Environment: Informatica PC 9.1, UNIX, Control-M

JP Morgan is one of the biggest banks in the US and a major provider of financial services. This project introduces sending documents to the customer's online portal or mobile app for digital signing.

Responsibilities:

Understood the business rules completely and implemented the UNIX scripts and Informatica mappings.

Prepared the data flow diagrams based on the business logic.

Created reusable functions in UNIX.

Extensively used Informatica for complex business logic and handled different types of flat files.

Created a process mechanism in the scripts to make the flow of each run easy to trace.

Created robust validation methods as per the business requirements.

Made proper use of parameters to minimize changes for future updates.

Prepared Unit test case documents and tested the code successfully.

Used control-m to automate the batch jobs for Unix and Informatica code.

Project #6

Project Title: BCBSA Membership outbound extract

Role: Senior ETL Developer and Integration Lead

Client: WellPoint Inc. (Anthem Health, Elevance Health)

Duration: June 2014 – May 2016

Environment: Informatica PC 9.5, Teradata 14.0, UNIX, SQL Server

The client is an independent licensee of the Blue Cross and Blue Shield Association and serves its members as the Blue Cross licensee. On a monthly basis, Home Plans are to submit a full membership roster of all members who have had coverage during the past 24 months. These monthly extract files will need to reflect standard NDW formats and be sent by the last day of the month containing membership information as of the first day of that month.

Responsibilities:

Created, debugged, and performance-tuned Informatica mappings.

Created an error-handling process to capture errors and allow data loads to continue while emailing the errors to the appropriate staff.

Wrote several basic to advanced SQL scripts executed from a Unix environment, as well as performance tuned several queries.

Worked on Teradata FastExport, FastLoad connection in Informatica.

Worked as core developer for OME extract.

Project #7

Project Title: CCP2 Informatica Rewrite

Role: ETL Developer

Client: WellPoint Inc. (Anthem Health, Elevance Health)

Duration: Jul 2013 – June 2014

Environment: Informatica PC 9.1, Teradata 14.0, UNIX, SQL Server, MySQL

This project is a rewrite of the DX PC2 monthly extracts: Member, Codeset, Claim, Claim Line, Claim Provider Error, Monthly Codeset, DXCG, Provider, and Rx. These are currently coded in BTEQ and are being recoded in Informatica. This allows DX to run the extracts with performance improvements and prepares the application to be more scalable for adding ACOs in future developments. New functionality was also added as per the business requirements.

Responsibilities:

Extensively used PowerCenter to design multiple mappings based on business logic.

Understood the existing architecture designed in Teradata.

Considered performance aspects to meet the client requirements.

Created UNIX scripts for additional logic apart from the Informatica coding.

Created complex Teradata queries.

Worked on Teradata FastExport connection in Informatica.

Prepared Technical Design Document and Unit Test Plan documents.

Created Visio diagrams for architectural changes.

Worked on PII and PHI data.

Project #8

Project Title: Data Cleansing

Role: ETL Developer

Client: KBR Inc.

Duration: Jan 2013 – Jul 2013

Environment: Oracle 11g, IDQ 9.1.0, Informatica PC 9.1.0, SQL Developer

The objective of the Data Cleansing project is to ensure a complete and accurate dataset in support of the CoreERP Canada conversions and go-live. During cleansing, data is checked for consistency and accuracy against defined business requirements and data standards, and invalid data is corrected through manual or automated processes. Data cleansing enables data reconciliation; accurate data enables the high throughput levels required for successful conversion.

Responsibilities:

Used IDQ Profiling tool for HTR, PTP, ATC and PTC. Developed IDQ profiles and scorecards using salient features of the IDQ tool.

Acted as a data analyst for HTR and PTP, analyzed the data quality requirements and tested them on different conversion cycles for its compliance.

Developed a robust and reusable address cleansing routine for the client.

Developed and implemented the dashboard reporting star schema.

Worked on converting complex IDQ profiles into Informatica PowerCenter mapplets.

Project #9

Project Title: Data Conversion

Role: ETL Developer

Client: KBR Inc.

Duration: Jul 2012 – Dec 2012

Environment: Oracle 11g, Informatica Power Center 9x

This project involves migration of data from various major source systems (SAP, JDE, and others) for a client that provides engineering and construction services in the energy, petrochemicals, government services, and civil infrastructure sectors.

Responsibilities:

Extensively used PowerCenter to design multiple mappings based on business logic provided in the Design level document.

Responsible for process improvements in Informatica maps to reduce loading times and improve the performance of the schedules.

Addressed change requests from users and performed the end-to-end implementation to ensure defect-free delivery of the work products.

Worked closely with other team players to isolate, track, and troubleshoot defects.

Responsible for reviewing and testing of all objects and entities as per the standards and guidelines consistent with checklist process.

Project #10

Project Title: MDM Solution for Consumer Products Group (CPG)

Role: ETL Integration Developer

Client: Corporate Solution in collaboration with Informatica

Duration: Oct 2010 – July 2012

Environment: Oracle 11g, Informatica Power Center 9x, Power Exchange, Informatica Data Services, Informatica Data Quality, PL/SQL, Informatica Analyst, Informatica MDM 9.1-Hub, SIF, JBOSS 5.1, SAP ECC, SOAP UI, Windows Server 2008 R2, Informatica Cloud (IICS)

Description:

The CPG solution provides better data management capabilities to consumer product companies all over the world. Data quality and data governance provided by this solution are considered most important for any consumer products company dealing with huge amounts of data on a daily basis. With enterprise data growing phenomenally every day, it has become the need of the hour to manage such data effectively for better business opportunities. Our CPG solution, developed in collaboration with Informatica, gives various data management abilities to consumer product companies by enabling them to cross-sell and up-sell their products. Master data proves to be the backbone of the enterprise data warehouse, without which companies end up trying to manage incorrect data residing in many of their data warehouses, wasting time and resources. To avoid unnecessary confusion and trouble related to managing the data (customers, products, geographies, addresses, etc.), consumer product companies can rely on this solution to achieve their goals with a higher return on investment (ROI).

Prior to taking the data to MDM, extraction and transformation are performed using the ETL tool Informatica PowerCenter 9.x. Maintaining master data also involves significant SQL and PL/SQL activity, so we write procedures, functions, and triggers to maintain the data and develop sessions and workflows as part of the ETL work involved in managing master data.

Responsibilities:

Analysed the requirements.

Developed Low level and High-level Design.

Managed the Informatica Server environment and Repository; configured the complete environment with different tools (Informatica PC, PX, DQ, MDM, SAP ECC).

As part of admin related activities, installed and configured Informatica PowerCenter Repository, Data Integration Service and worked on admin tasks like adding licenses, Repository back-up, maintaining Informatica repository database, adding plugins etc.

Developed Informatica mappings and MDM mappings.

Implemented complex Informatica mappings and provided as service to Informatica MDM.

Used workflow manager to create sessions and run workflows.

Worked on Informatica Developer to create Data Quality rules and Web Service mappings.

Performed batch and real-time integration with SAP.

Implemented real-time integration using SOA & Web services such as Informatica Data Services (IDS).

Implemented business logic in Informatica and MDM mappings.

Involved in solving performance issues.

Prepared Use case scenarios.

Prepared Unit Test cases and Test results.


