Gnanaprakash Pandigunta
Senior DataStage Developer
Email:*************@*****.***
Mobile: +1-647-***-****
Professional Summary:
I am a Senior DataStage Developer with 10 years and 6 months of work experience in data warehousing tools and technologies. My main expertise is in data warehousing applications, with extensive use of IBM DataStage 9.1, 11.5 and 11.7 (InfoSphere Information Server, WebSphere), Informatica PowerCenter, Hadoop (Big Data), UNIX, DB2 and SSRS. I have worked in risk management within the banking and financial services domain, using both Agile and Waterfall methodologies.
Excellent experience in designing, developing, documenting, testing and maintaining ETL jobs and mappings as Server and Parallel jobs in DataStage to populate data warehouse and data mart tables.
Earned the “IBM Certified Solution Developer - InfoSphere DataStage v8.5” certification in 2014.
Expert in designing Parallel jobs using stages such as Join, Merge, Lookup, Remove Duplicates, Filter, Complex Flat File, Unstructured Data, Dataset, Lookup File Set, Modify and Aggregator, and real-time stages such as XML and Hierarchical.
Integrated DataStage web services with Oracle Enterprise Resource Planning (ERP) Cloud using RESTful and SOAP web service APIs for Scotiabank’s Travel and Expense team.
Have experience in designing and implementing ETL applications using Big Data HADOOP Ecosystems (Oozie, Hive, Sqoop), UNIX and Autosys.
Strong development skills in IBM Information Analyzer and InfoSphere Metadata Asset Manager (IMAM) for the design and development of DQ rules based on business standards.
Strong knowledge of coding and debugging SQL queries.
Very good hands-on experience with SQL Server Reporting Services.
Excellent knowledge in UNIX shell and Perl scripting.
Excellent analytical, logical and programming skills.
Proven track record in troubleshooting DataStage jobs and resolving production issues, including performance tuning and enhancements.
Have experience in developing Logical and Physical data models using best practices to ensure high data quality and reduced redundancy.
Good knowledge of Data Warehouse Architecture and Data Warehouse concepts.
Consistently engaged in daily and monthly production runs as the primary resource.
Excellent client interaction skills.
Good communication and presentation skills, strong work management and organizational skills; self-motivated, hardworking team player, intellectually flexible, and a quick learner who adapts quickly.
Work Experience and Clients:
Scotiabank, June 2021 – Present
Senior DataStage Developer & Lead (Contractor)
Location: Toronto, Canada
Canadian Imperial Bank of Commerce (CIBC), Nov 2020 – June 2021
Senior Informatica Developer (TCS Consultant)
Scotiabank (Various Projects), Sep 2012 – June 2020
DataStage Developer (TCS Consultant)
Technical Proficiency:
Educational Qualification:
Bachelor of Technology in Information and Communication Technology
SASTRA University, Thanjavur, India
CGPA: 8.38, Year of Passing: 2012
Professional Experiences:
Project #1:
Project: Net Cumulative Cash Flow (NCCF) Reporting
Customer: Scotiabank
Location: 44 King Street, Toronto
Role: Senior DataStage Developer & Lead
Period: Jan 2022 – Present
Project Description
NCCF reporting is an OSFI initiative; the report is submitted directly to OSFI monthly, with the operational capacity to increase the frequency to weekly or even daily. The NCCF measures an institution's surplus or deficit at a given time period, calculated as the difference between the sum of eligible cash inflows and the sum of prescribed cash outflows from the reporting date up to the time period considered. Accordingly, an institution's survival horizon corresponds to the last period before the NCCF turns negative and is expressed in weeks or months. The NCCF is to be reported in Canadian dollar (CAD) equivalent for each major foreign currency, defined as US dollar (USD), Euro (EUR) and British pound sterling (GBP), as well as other currencies determined to be eligible by OSFI, collectively referred to as "Major Foreign Currencies".
Responsibilities
Key resource involved in gathering requirement specifications from Senior Business Analysts of Group Treasury of Scotiabank.
Interacted with various stakeholders to get the requirements finalized and was part of the Sign off meeting and project start up discussions.
Designed and architected an ETL hierarchy-based technical solution for the implementation of the entire NCCF project for Scotiabank.
The solution covers everything from consumption of data feeds from the source team through generation of the report to be submitted to OSFI in April 2023.
As Lead Developer, also led a team of 3 onshore and 4 offshore resources working on this project.
Design and development of 16 time-based cash flow buckets as per OSFI’s specification.
Implementation of the buckets and the materiality calculation design based on the business mappings.
Design and development of ETL DataStage jobs to consume various products (Loans, Mortgages, Deposits, Securities and Derivatives) and load them into the respective data marts.
Used the majority of DataStage's performance-effective stages while developing the components to ensure redundancy and scalability.
Developed Perl scripts that calculate the NCCF Line and Class, which are key to calculating and generating the NCCF report.
All components are designed in line with Scotiabank's coding standards and guidelines.
Parallelism design to ensure all products are ingested simultaneously without holding system resources.
Suggested that the OLAP layer be built on the hierarchical tables for better traceability.
Quick turnaround in development, testing and debugging, which is key given the OSFI deadlines.
Planned multiple parallel executions in the pre-production environment and executed them successfully.
Scheduled the jobs in Autosys to run as per source system availability, avoiding manual intervention.
Technology stack / Solution Environment
OS – Linux
ETL – IBM DataStage 11.7
IBM DB2
Unix
Perl
Autosys
Location: Toronto, Canada
Project #2:
Project: Liquidity Risk Migration Project
Role: DataStage Developer
Customer: Scotiabank
Location: 44 King Street West, Toronto
Period: June 2021 to December 2021
Project Description
The Integrated Service Layer (ISL) plays a vital role in the ALRE project: its primary objective is to deliver the Bank's product system data to the Automated Liquidity Risk Engine (ALRE) for subsequent processing. This includes generation of cubes, processing through financial models, and reporting. The scope of data required is broad, ranging from lending, deposits and securities to off-balance-sheet and derivative products. Although delivering data appears purely technical on the surface, it requires complex transformations and calculations, which make up a significant portion of the delivery task.
Responsibilities
Identified all the DataStage, database, script and reporting components to document for migration.
Imported the code components and compiled them with the latest 11.7 build.
Identified and fixed the code components that required changes as part of the migration to 11.7.
Performed thorough unit and regression testing in the DEV, IST and PREPROD environments with multiple periods of data covering both daily and monthly batches.
Compared datasets to confirm that the old and new environments produce identical data.
Led a team of 3 offshore and 3 onshore resources.
Raised case tickets with IBM and communicated the issues faced during the code migration to 11.7.
Optimized run time by modifying the longest-running jobs.
Project #3:
Project: Global Operations Workbench (GOW)
Customer: Canadian Imperial Bank of Commerce (CIBC)
Location: 483 Bay Street, Toronto
Role: Senior ETL Developer
Period: November 2020 – June 2021
Project Description
Global Operations Workbench (GOW) is an internal, web-based bank application that automates and streamlines business processes, transactions and communications between multiple departments within CIBC. Twelve applications come under the single GOW umbrella, including Brokerage Risk Management (BRM), Corporate Actions (CA), Internal Transfer Management (ITM), Reconciliations (RECs) and a few others.
Responsibilities
Design and development of complex ETL programs using Informatica PowerCenter.
Development of mappings using Informatica transformations according to technical specifications.
Use of transformations such as Filter, Expression, Lookup, Aggregator, Update Strategy, Normalizer, Joiner, Router and Union.
Data extraction from Oracle DB to the staging area.
Implementation of Slowly Changing Dimensions (SCD) to retain the full history of data.
Creation of UNIX shell scripts to fine-tune the ETL flow of the Informatica workflows.
Effective use of Informatica parameter files to define mapping variables, workflow variables, FTP connections and database connections.
Worked effectively in an onsite-offshore model; led a team of 6 offshore.
Set up Feedhub forms from source to destination to transfer files securely and seamlessly.
Design, development and implementation of Autosys jobs to execute Feedhub and Informatica workflows.
Set up folders, groups, users and permissions, and performed repository administration using Repository Manager.
Involved in data analysis for source and target systems.
Performed unit testing and system integration testing of Informatica mappings.
Conducted daily meetings with the offshore team, weekly status review meetings and monthly code reviews on Feedhub, Autosys and Informatica.
Technology stack / Solution Environment
OS – Linux
ETL – Informatica PowerCenter (Designer, Workflow Manager, Workflow Monitor, Repository Manager)
RDBMS – Oracle
Unix
Autosys
Feedhub
Location: Toronto, Canada
Project #4:
Project: US Data Quality
Customer: Scotiabank
Location: 40 King Street West, Toronto
Role: Developer
Period: January 2020 – July 2020
Project Description
The Data Quality project is an integral part of Scotiabank's Risk Data Aggregation (RDARR) Program. Its purpose is to strengthen the bank's data management capabilities so that it can track, measure and monitor the quality of Critical Data Elements (CDEs). The application provides data quality services to multiple business domains, such as Liquidity Risk, Non-Retail Credit Risk, Anti-Money Laundering, US Finance and US Customer. My role in this project mainly dealt with United States (US) Finance and Customer data.
Responsibilities
Design and development of complex real-time DataStage jobs to read/write data in IBM DB2 tables.
Data profiling to enable analysis by the risk data stewards.
Collect and classify data quality rules with respect to the standards and guidelines outlined in the Data Quality policy.
Performing Data Quality dimension checks such as Completeness, Validity and Accuracy using IBM Information Analyzer.
Implementing Data Quality reporting, trending and variance dashboards.
Identify and implement a set of Data Quality business rules to monitor the quality of data based on Critical Data Elements (CDEs) and the attributes relevant to each stage of a CDE’s data flow.
Requirement gathering by working closely with the customer: analysed all stated requirements, prioritized them based on business needs and developed the design. Prepared Low-Level and High-Level Design documents.
Design and development of DataStage parallel and sequence jobs with complex stages to extract data from heterogeneous sources, apply transformation logic to the extracted data and load it into data warehouse databases.
Impact analysis: prepare a list of impacted jobs and scripts to be modified to meet the requirement in the EDD or the change request.
Setup of new environments: defining project-specific variables, assigning default values and configuring nodes for a project using IBM InfoSphere DataStage Administrator.
Design of parallel and sequence jobs using stages such as Join, Merge, Lookup, Remove Duplicates, Filter, Dataset, Lookup File Set, Modify, Unstructured Data, Aggregator and Transformer using IBM InfoSphere DataStage Designer.
Writing complex DB2 procedures and SQL queries to extract data from different enterprise data applications.
Design of Unix shell scripts, routines and functions that are called from DataStage jobs.
Setting up SFTP/FTP between different host and destination servers to facilitate file transfer.
Analyse complex job designs, divide them into job segments and execute them through the job sequencer for better performance and easier maintenance.
Troubleshooting DataStage jobs and addressing production issues, including performance tuning to reduce job execution time.
Schedule and monitor jobs using IBM InfoSphere DataStage Director.
Design and development of a range of SSRS reports to facilitate data access for end users.
Prepare project technical documents, External & General Design Documents, Run books and Induction Manuals.
Job task planning and organizing.
Maintain code standards, review jobs and perform unit tests before delivery.
Technology stack / Solution Environment
OS – Unix
ETL – IBM InfoSphere DataStage Information Server 11.5 Client
RDBMS – IBM DB2
Unix
Reporting – SQL Server Reporting Services
Location: Toronto, Canada
Project #5:
Project: Global Finance Technology (GFT) – Travel and Expense Management
Role: DataStage Developer
Customer: Scotiabank
Location: 44 King Street West, Toronto
Period: January 2019 – December 2019
Project Description
Scotiabank Global Finance's Travel & Expense Management (T&E) system receives a daily data extract from the EC system with employee details. This information is used to set up staff in the system so that T&E can process employee expense claims when they are submitted. The HR data also enables T&E to identify each employee's manager so that the workflow can route expense claims to the "one up" manager for approval. The T&E initiative is a strategic priority for Global Finance and has already been implemented in Canadian Banking. The scope is currently limited to synchronizing employee information from the MDM HR system to Cloud HCM after the Cloud ERP.
The PeopleSoft Application Server will supply the COFA business unit, Department and Location mapping to Transits by extracting data from the tree definition RPT_TRANSIT_AP_BU to the DataStage server. This data is needed to perform transformations.
Responsibilities
Design and development of complex real-time DataStage jobs to read/write data in Oracle HCM Cloud.
Understanding the JSON metadata of the payload to perform GET/PATCH/POST operations in HCM using DataStage real-time stages.
Creating the required payload using the built-in schema designers available in DataStage.
Performing all the required web service operations to access Oracle Cloud from the DataStage jobs.
Design of the Technical Specification Document for all the modules required for development.
Requirement gathering by working closely with the customer: analysed all stated requirements, prioritized them based on business needs and developed the design. Prepared Low-Level and High-Level Design documents.
Design and development of real-time data acquisition and transformation jobs using the IBM Web Service stages to report real-time data online to end-user systems.
Impact analysis: prepare a list of impacted jobs and scripts to be modified to meet the requirement in the EDD or the change request.
Setup of new environments: defining project-specific variables, assigning default values and configuring nodes for a project using IBM InfoSphere DataStage Administrator.
Design and development of DataStage parallel and sequence jobs with complex stages to extract data from heterogeneous sources, apply transformation logic to the extracted data and load it into data warehouse databases.
Analyse complex job designs, divide them into job segments and execute them through the job sequencer for better performance and easier maintenance.
Schedule and monitor jobs using IBM InfoSphere DataStage Director.
Design of Unix shell scripts and Oracle query scripts for efficient ETL jobs.
Setting up SFTP/FTP between different host and destination servers to facilitate file transfer.
Design and development of SSRS reports to present data to senior management.
Prepare project technical documents, External & General Design Documents, Run books and Induction Manuals.
Job task planning and organizing.
Maintain code standards, review jobs and perform unit testing before delivery.
Identify and evaluate a set of ETL process development options.
Developed queries in Oracle to integrate the logic into DataStage jobs.
Technology stack / Solution Environment
OS – Unix
ETL – IBM InfoSphere DataStage Information Server 11.5 Suite
Oracle HCM Cloud
RDBMS – Oracle
Unix
Reporting – Microsoft SQL Server Reporting Services (SSRS)
Location: Toronto, Canada
Project #6:
Project: Automated Liquidity Risk Engine (ALRE) – Liquidity
Role: DataStage Developer
Customer: Scotiabank
Location: 44 King Street West, Toronto
Period: September 2012 to December 2018
Project Description
The Integrated Service Layer (ISL) plays a vital role in the ALRE project: its primary objective is to deliver the Bank's product system data to the Automated Liquidity Risk Engine (ALRE) for subsequent processing. This includes generation of cubes, processing through financial models, and reporting. The scope of data required is broad, ranging from lending, deposits and securities to off-balance-sheet and derivative products. Although delivering data appears purely technical on the surface, it requires complex transformations and calculations, which make up a significant portion of the delivery task.
Responsibilities
Identifying and evaluating a set of ETL processes and the placeholders in the IBM data warehousing model as development options.
Design of the Technical Specification Document for all the modules required for development.
Requirement gathering by working closely with the customer: analysed all stated requirements, prioritized them based on business needs and developed the design. Prepared Low-Level and High-Level Design documents.
Impact analysis: prepare a list of impacted jobs and scripts to be modified to meet the requirement in the EDD or the change request.
Data profiling using IBM InfoSphere Information Analyzer to perform column analysis on source files for data quality assessment and monitoring.
Setup of new environments: defining project-specific variables, assigning default values and configuring nodes for a project using IBM InfoSphere DataStage Administrator.
Understanding the mapping specifications of source and target attributes, along with business-specific rules, using IBM InfoSphere FastTrack.
Design of parallel and sequence jobs using stages such as Join, Merge, Lookup, Remove Duplicates, Filter, Dataset, Lookup File Set, Modify, Unstructured Data, Aggregator and Transformer using IBM InfoSphere DataStage Designer.
Design and development of DataStage parallel and sequence jobs with complex stages to extract data from heterogeneous sources, apply transformation logic to the extracted data and load it into data warehouse databases.
Migration and implementation of ETL application flows using the Hadoop ecosystem (Hive, Oozie, Sqoop) for better performance and optimization.
Analyse complex job designs, divide them into job segments and execute them through the job sequencer for better performance and easier maintenance.
Creation of indexes and aggregate tables for the data warehouse, consolidation of reports, identification of data granularity and source-to-target mapping of data elements.
Schedule and monitor jobs using IBM InfoSphere DataStage Director.
Troubleshooting DataStage jobs and addressing production issues, including performance tuning and enhancements, drawing on specialized knowledge of ISL.
Extraction of complex data from different sources using the Unstructured Data stage and loading it into the HDFS environment.
Design and development of real-time data acquisition and transformation jobs using the IBM Web Service stages to report real-time data online to end-user systems.
Design of Unix shell scripts and DB2 query scripts for efficient ETL jobs.
Design, development and maintenance of dashboard reports using SQL Server Reporting Services (SSRS).
Analysed complex data dependencies using metadata stored in the repository, and prepared batches for the existing sessions to facilitate scheduling of multiple sessions.
Implementation of Basel III LCR (Liquidity Coverage Ratio) and NSFR (Net Stable Funding Ratio) rules in the ETL process, affecting the line number attributes of the feed generated for ALGO.
Engaged in peer reviews during design, coding and testing.
Implementation of the ISL batch using Autosys and Tidal Enterprise Scheduler, with accurate predecessor and successor jobs, to execute the application.
Prepare project technical documents, External & General Design Documents, Run books and Induction Manuals.
Job task planning and organizing.
Maintain code standards, review jobs and perform unit testing before delivery.
Responsible for applying best practices to ensure high data quality and reduced redundancy while delivering the solution on the Enterprise Data Lake (EDL) platform for ISL applications.
Identify and evaluate a set of ETL process development options.
Handled different phases of the project life cycle and critical deliverables, and coordinated between onsite and offshore teams and with various upstream and downstream teams.
Developed UNIX shell and Perl scripts to enhance the functionality of DataStage jobs and eliminate manual work.
Developed queries in DB2 to integrate the logic into DataStage jobs.
Involved in code deployment from Development through IST, QAT and Production.
Identified the root cause of defects logged in HP QC and resolved them through debugging.
Involved in monthly production runs that generate the results required by ALGO, which entail the following:
- Running a predefined set of processes to absorb data provided by different sources.
- After every successful run, transmitting the files (end results) to the user for further analysis by ALGO.
- Documenting the run time of each completed job, the job status, a description of issues faced during the run and the steps taken to troubleshoot them.
Support Activities
Supported daily and monthly production operational runs.
Provided timely status updates on the progress of the production runs.
Prepared the operational process run book with detailed steps and screenshots.
Prepared an issue log summary for each operational run.
Documented the run statistics for every operational run.
Supported DEV, IST, QAT and PRD runs.
Resolved production issues with a quick turnaround time (TAT).
Technology stack / Solution Environment
OS – Unix
ETL – IBM InfoSphere DataStage Information Server 8.1, 9.1 and 11.5 Clients
RDBMS – DB2
Reporting – Microsoft SQL Server Reporting Services
Location(s): Bangalore, India; Toronto, Canada
Hardware: Windows, AIX Unix
Technology: Data warehouse, ETL, Big Data (Hadoop)
Software Products: ETL – DataStage 9.1, 11.5 and 11.7; ETL – Informatica 9.1; Unix; Perl; SQL, PL/SQL; DB2/Oracle; Microsoft Visual Studio; REST/SOAP API (Oracle); Oozie; Sqoop; SQL Server Reporting Services; Autosys; Feedhub
Tools: Designer client; Director client; Administrator client; Manager client; Information Analyzer; Information Server; PowerCenter Designer; PowerCenter Workflow Manager; PowerCenter Workflow Monitor; PowerCenter Repository Manager; DB2 Control Center; SSRS reporting tool; HP QC; FileZilla; PuTTY; JIRA; Beyond Compare; SoapUI