Data Warehousing Quality

Location:
Jersey City, NJ
Posted:
March 20, 2024

Resume:

Shiva Kumar Talari

Sr. ETL Informatica Developer

ad4gnn@r.postjobfree.com

+1-856-***-****

Professional Summary:

9+ years of experience as a Business Intelligence developer working on various Data Warehousing (DW) applications using Informatica PowerCenter 10.x/9.x and IDMC/IICS (CDI); proficient in Unix and SQL. I have worked in diverse domains such as Energy, Healthcare, and Banking & Finance.

Extensive knowledge in analysis, design, development, implementation, and troubleshooting using Data warehousing (DW) Methodologies and Data Modeling. I am a communicative team player committed to excelling in quality and meeting project deadlines.

8+ years of experience in development and lead roles with Informatica PowerCenter, Data Integration, Data Quality, and Informatica Intelligent Cloud Services (IICS) on the Linux platform, using AWS cloud services and cloud relational databases.

Experienced in using Informatica Cloud Data Integration (CDI) components/products to design and integrate data from multiple applications.

Extensively worked on developing Informatica Mappings, Mapplets, Sessions, Workflows and Worklets for data loads from various sources such as Oracle, Flat Files (Avro, Parquet, VSAM), Snowflake, JSON, Teradata, XML, SAP BW, DB2, S3, SQL Server.

Extensively worked on data extraction of VSAM files directly from the mainframe server and through binary files using Informatica PowerExchange.

Expertise in Salesforce, including hands-on experience with the Data Loader application. This includes managing data import/export tasks, ensuring data quality, and supporting data migration projects.

Implemented advanced performance tuning techniques in Informatica, including pushdown optimization and dynamic partitioning, enhancing ETL throughput and minimizing resource consumption in data warehousing operations.

Implemented data flow optimization initiatives for a large-scale data warehousing project, reducing data load times by 40% through efficient mapping, partitioning, and performance tuning in Informatica PowerCenter.

Experience in loading data into Snowflake in the cloud from various sources.

Worked on Informatica mappings to remove duplicates and load the data into Google BigQuery.

Experience with the Snowflake cloud data warehouse and AWS S3 buckets for integrating data from multiple source systems, including loading nested JSON-formatted data into Snowflake tables.

Implemented data extraction, data cleansing, and incremental loading, enhancing reporting capabilities.

Technically proficient professional skilled in task flow design, dependencies, and scheduling leveraging AWS cloud infrastructure.

Good understanding of Star and Snowflake Schema, Dimensional Modeling, Relational Data Modeling and Slowly Changing Dimensions.

Proficient in Informatica Data Quality 10.1 (IDQ) with hands-on experience in IICS components, including Cloud Data Integration (CDI), Cloud Application Integration (CAI), monitoring, and administration. Skilled in data profiling, analysis, cleansing, address validation, fuzzy matching/merging, data conversion, and exception handling.

Involved in the technical architecture design process by gathering high-level Business Requirement Documents and Technical Specification Documents from the technical architecture team and following the prescribed conventions.

Experience in working with Data replication, Synchronization, and Dynamic mappings.

Experience across the Software Development Life Cycle, including analysis, design, development and testing, to solidify client requirements in conjunction with software developers.

Managed the migration of custom Informatica code and mappings, employing best practices to modify and optimize legacy ETL processes for enhanced performance and efficiency in the upgraded environment from version 9.6 to 10.5, identifying compatibility issues, planning resource allocation, and establishing a comprehensive rollback strategy to mitigate risks.

Executed data load reviews, analysis, and verification of ETL logic design for data warehouses and data marts built on the STAR schema methodology with dimension and fact tables.

I possess extensive experience in seamlessly integrating Power BI within the Informatica environment to deliver comprehensive reporting solutions.

Developed complex mappings using Source Qualifier, Lookup, Joiner, Aggregator, XML Parser, XML Generator, Expression, Filter, Router, Union, PL/SQL Stored Procedure, Web Services, Hierarchical stage, Transaction Control and other transformations for slowly changing dimensions (Type 1, Type 2 and CDC) to keep track of historical data.
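
For illustration of the Type 2 pattern behind these mappings (expire the current row when a tracked attribute changes, then insert a new version), a minimal self-contained sketch follows; the table and column names are hypothetical, and sqlite3 merely stands in for the actual warehouse, since the real logic lives in Informatica transformations.

```python
# Simplified, self-contained sketch of the SCD Type 2 pattern: expire the
# current dimension row when a tracked attribute changes, then insert a new
# version. Table/column names are hypothetical; sqlite3 stands in for the
# real target warehouse.
import sqlite3

EXPIRE_CHANGED = """
UPDATE dim_customer
SET end_date = DATE('now'), current_flag = 'N'
WHERE current_flag = 'Y'
  AND customer_id IN (
      SELECT s.customer_id
      FROM stg_customer s
      JOIN dim_customer d
        ON d.customer_id = s.customer_id AND d.current_flag = 'Y'
      WHERE s.city <> d.city            -- tracked attribute changed
  )
"""

INSERT_NEW_VERSIONS = """
INSERT INTO dim_customer (customer_id, city, start_date, end_date, current_flag)
SELECT s.customer_id, s.city, DATE('now'), NULL, 'Y'
FROM stg_customer s
LEFT JOIN dim_customer d
  ON d.customer_id = s.customer_id AND d.current_flag = 'Y'
WHERE d.customer_id IS NULL              -- brand-new customer
   OR s.city <> d.city                   -- changed attribute -> new version
"""

def run_demo():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE dim_customer (
            customer_id INT, city TEXT,
            start_date TEXT, end_date TEXT, current_flag TEXT);
        CREATE TABLE stg_customer (customer_id INT, city TEXT);
        INSERT INTO dim_customer VALUES (1, 'Austin', '2023-01-01', NULL, 'Y');
        INSERT INTO stg_customer VALUES (1, 'Dallas'), (2, 'Houston');
    """)
    con.execute(EXPIRE_CHANGED)        # close out the changed current row
    con.execute(INSERT_NEW_VERSIONS)   # add the new version and the new customer
    for row in con.execute(
            "SELECT * FROM dim_customer ORDER BY customer_id, start_date"):
        print(row)

if __name__ == "__main__":
    run_demo()
```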

Good knowledge on GitHub for maintaining versioning of code. Also, expertise in Teradata RDBMS using Fastload, Multiload, Tpump, Fastexport, Teradata SQL Assistant and BTEQ utilities.

Implemented a comprehensive data quality monitoring system that showcases data quality scores and statistics through Power BI dashboards. Utilized Power BI's visualization capabilities to present key data quality metrics, trends, and insights, enabling stakeholders to make informed decisions based on accurate and reliable data.

Experience in real-time Integration process that interact with REST API, web services and Interface with various applications on cloud.

Proficiently developed Python scripts to parse complex Excel files, extracting and transforming data into CSV format. This streamlined data processing workflows, enhancing data accessibility and compatibility for further analysis and reporting.
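
A minimal sketch of this kind of Excel-to-CSV parsing, assuming hypothetical file names and cleanup rules (not the actual project script; requires pandas plus an Excel engine such as openpyxl):

```python
# Hypothetical illustration of Excel-to-CSV conversion with pandas.
# Reads every sheet of a workbook and writes each one out as its own CSV.
from pathlib import Path

import pandas as pd

def excel_to_csv(xlsx_path: str, out_dir: str) -> list[Path]:
    """Convert each sheet of an Excel workbook into a separate CSV file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    # sheet_name=None loads all sheets into a dict of DataFrames
    for sheet, df in pd.read_excel(xlsx_path, sheet_name=None).items():
        # Light cleanup before export: trim column names, drop fully empty rows
        df.columns = [str(c).strip() for c in df.columns]
        df = df.dropna(how="all")
        target = out / f"{Path(xlsx_path).stem}_{sheet}.csv"
        df.to_csv(target, index=False)
        written.append(target)
    return written

if __name__ == "__main__":
    excel_to_csv("monthly_feed.xlsx", "extracts/")   # hypothetical file names
```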

Successfully implemented shell scripts for API integration to extract data from a variety of sources, utilizing authentication methods including API keys and OAuth tokens.
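
A simplified sketch of such an API extraction, with placeholder endpoint, field names and credentials; the real scripts were shell-based, so this Python version only illustrates the authentication and paging pattern:

```python
# Hypothetical sketch of API-based extraction with API-key and OAuth bearer
# authentication; the endpoint URL and response fields are illustrative only.
import csv

import requests

API_URL = "https://api.example.com/v1/records"   # placeholder endpoint

def fetch_records(token: str, api_key: str, page_size: int = 500):
    """Page through a REST endpoint and yield individual records."""
    headers = {
        "Authorization": f"Bearer {token}",   # OAuth 2.0 bearer token
        "x-api-key": api_key,                 # static API key, if required
    }
    page = 1
    while True:
        resp = requests.get(API_URL, headers=headers,
                            params={"page": page, "per_page": page_size},
                            timeout=30)
        resp.raise_for_status()
        batch = resp.json().get("data", [])
        if not batch:
            break
        yield from batch
        page += 1

def dump_to_csv(records, path: str):
    """Write the extracted records to a CSV landing file for the ETL load."""
    records = list(records)
    if not records:
        return
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=sorted(records[0]))
        writer.writeheader()
        writer.writerows(records)

if __name__ == "__main__":
    dump_to_csv(fetch_records(token="<oauth-token>", api_key="<api-key>"),
                "landing/api_extract.csv")
```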

Implemented Performance Tuning techniques at application, database, and system levels by using Indexes, Partitioning, Materialized View, External Table, Procedures, Functions and Explain Plan.

Experience creating UNIX shell scripts for Informatica pre- and post-session operations, batch scripting, and day-to-day activities such as monitoring network connections, database ping utilities, scheduler monitoring, file archiving and cleanup, file splitting, FTP transfers to various locations, list-file generation, log rotation, and metrics generation.

Experienced in the Azure Dashboards (Epics, User Stories, Tasks) & JIRA platform and Agile methodologies, adept at effectively utilizing JIRA to manage and streamline agile development processes, ensuring efficient collaboration, task tracking, and project delivery.

Technical Skills:

Domains: Banking [Finance / Liquidity], Insurance and Retail, Health care, Energy & Utilities

ETL Tools: Informatica IICS / IDMC, Informatica Data Quality 10.5, 10.2, Informatica PowerCenter 10.5,10.2,10.1/9.5.X, DataStage 11.X, 9.X

Databases: MS SQL Server, MySQL, PostgreSQL, DB2, Oracle 8i/9i/10g/11g/19c, Teradata (v2R5, 13), AWS Redshift, Google BigQuery, Snowflake

BI/ETL reporting Tools: IDQ Scorecards, Tableau, MS EXCEL, Power BI

Programming Languages: SQL, Teradata SQL, PL/SQL, Python, UNIX Korn/Bourne shell scripting

Scheduler: Control-M, Unix Cron scheduler, Autosys, Informatica scheduler

IDE /Bug Tracking Tools: ServiceNow, PG Admin 4.3, JIRA, Azure DevOps

Version Control Tools: Tortoise SVN, GIT

Data modelling Tools: ERWIN and Microsoft Visio

Certifications:

Completed Informatica Cloud Data Integration (CDI) Modernization Certification (2023).

Completed AWS Cloud Practitioner Certification (2022).

Completed IBM Infosphere DataStage Developer 9.1 certification.

Education Qualification:

Master of Technology (Information Technology) Wipro Integrated Program VIT 2014 – 2018

Bachelor of Computer Science Osmania University 2011 – 2014

Rewards & Recognition:

Recognized with the “Sprint Star Award” in 2023 for leading a team and saving 10K+ USD/year by implementing a new automation process in the project using Informatica and Unix.

Professional Experience:

Client: CorroHealth New Jersey June-2023 – Till Date

Role: Sr. ETL Developer Lead (Informatica IICS, PC & Data Quality)

Description:

The organization is a foremost provider of healthcare analytics and technology-driven solutions, tasked with positively impacting the financial performance of hospitals and health systems. I contribute by delivering integrated solutions, leveraging proven expertise, intelligent technology, and scalability. My role involves optimizing data extraction, transformation, and loading processes, ensuring seamless integration across the entire life cycle of the project.

Responsibilities:

Analysed the sources and targets, transformed and mapped the data, and loaded it into targets using Informatica.

Involved in requirement gathering and impact assessment through production deployment.

Engage in the design and build project phases, then own the user acceptance test, deployment/adoption activities, and handovers to Application Support.

Designed and built mappings to extract data from SFDC, Oracle, SAP BW and SQL Server and load it into Oracle; designed the ETL flows for generating flat-file extracts as per the business need.

Orchestrated seamless integration between Informatica PowerCenter and SAP BW, optimizing data flows and ensuring efficient extraction, transformation, and loading processes, including extraction from and real-time processing of IDOCs.

Experienced in creating data maps using COBOL copybooks for VSAM extraction.

Designed and developed Informatica PowerCenter workflows to seamlessly extract and transform data, ensuring consistency and quality for regulatory compliance and faster decision-making in clinical trials.

Developed interactive data visualizations and dashboards using Power BI to track key healthcare metrics such as patient admission rates, treatment outcomes, and staff performance indicators, enabling stakeholders to make data-driven decisions.

Experience in the Change Management process and other application support life cycles, including Incident Management, Problem Management, Documentation (Run/Handbook), Prod Release Support, Batch Job monitoring and Defect Fixes.

Developed logical/ physical data models using Erwin tool across the subject areas based on the specifications and established referential integrity of the system.

Developed complex mappings using Source Qualifier, Lookup, Joiner, Aggregator, Expression, Filter, Router, Union, Stored Procedure, Web Services and other transformations for slowly changing dimensions (Type 1, Type 2 and CDC) to keep track of historical data.

Designed and developed a data pipeline to collect data from multiple sources such as Salesforce, Snowflake, APIs, SAP HANA, DB2, flat files, JSON files and relational DBs like Teradata, Oracle and PostgreSQL, and ingest it into target systems such as relational DBs, Redshift, flat files, etc.

Effectively created Python scripts to read Excel files, extracting and converting the data into CSV format. This improved data processing operations, increasing data accessibility and compatibility for applying the business rules.

Created the masking rules to mask customers' sensitive data for testing.

Developed shipping address mappings to verify the feeds against client-dependent data.

Used Partitions in Sessions and indexes at table level to improve the performance of the database load time.

Bulk loading and unloading data into Snowflake tables using COPY command.

Integrated and automated data workloads to Snowflake Warehouse.

Created Snowpipe for continuous data load.
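
A rough sketch of the bulk COPY and Snowpipe setup described in the bullets above, run through the snowflake-connector-python driver; the stage, table and credential values are placeholders rather than the project's real objects:

```python
# Hypothetical sketch of a Snowflake bulk COPY load plus Snowpipe creation.
# Stage, table, warehouse and credential names are placeholders.
import snowflake.connector

COPY_SQL = """
COPY INTO sales_stg.orders
FROM @sales_stg.s3_orders_stage/orders/
FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1)
ON_ERROR = 'ABORT_STATEMENT'
"""

PIPE_SQL = """
CREATE PIPE IF NOT EXISTS sales_stg.orders_pipe
  AUTO_INGEST = TRUE   -- load automatically when S3 notifies Snowflake
AS
  COPY INTO sales_stg.orders
  FROM @sales_stg.s3_orders_stage/orders/
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
"""

def load_orders():
    conn = snowflake.connector.connect(
        account="xy12345.us-east-1",   # placeholder account locator
        user="ETL_SVC",
        password="********",
        warehouse="LOAD_WH",
        database="EDW",
        schema="SALES_STG",
    )
    try:
        cur = conn.cursor()
        cur.execute(COPY_SQL)   # one-off bulk load from the external stage
        cur.execute(PIPE_SQL)   # Snowpipe for continuous, event-driven loads
    finally:
        conn.close()

if __name__ == "__main__":
    load_orders()
```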

Worked on extracting data from the heterogeneous source systems like MS SQL Server, Oracle, PostgreSQL and loading into Landing Layer and then to Data Warehouse (DW).

Optimized existing PL/SQL procedures to enhance the overall performance and scalability of the Informatica ETL processes.

Developed PL/SQL Stored Procedures, Views and Triggers to implement complex business logics to extract, cleanse the data, transform and load the data into the Oracle database.

Created test cases and detailed documentation for Unit Test, System, Integration Test and UAT to check the data quality.

Moved the company from a SQL Server database structure to Salesforce Objects and responsible for ETL and data validation.

Worked closely with the end users in writing the functional specifications based on the business needs. Analyzed the source data coming from Oracle.

Created Unix shell (Korn & Bourne) scripts for automating the archival and cleanup of outdated data files and database records, optimizing storage resource utilization. Additionally, established automated processes for executing Informatica workflows.
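
The archival and cleanup automation was written in Korn/Bourne shell; purely as an illustrative stand-in, a minimal Python sketch of the same archive-then-purge pattern is shown below (directory names and the retention window are hypothetical):

```python
# Illustrative stand-in (in Python) for the Korn/Bourne shell archive-and-cleanup
# scripts described above; directory names and retention window are hypothetical.
import gzip
import shutil
import time
from pathlib import Path

LANDING_DIR = Path("/data/etl/landing")    # processed source files land here
ARCHIVE_DIR = Path("/data/etl/archive")    # compressed copies kept for audit
RETENTION_DAYS = 30                        # purge archives older than this

def archive_and_cleanup():
    ARCHIVE_DIR.mkdir(parents=True, exist_ok=True)
    cutoff = time.time() - RETENTION_DAYS * 86400

    # 1. Compress processed files out of the landing area into the archive.
    for src in LANDING_DIR.glob("*.dat"):
        dest = ARCHIVE_DIR / (src.name + ".gz")
        with src.open("rb") as fin, gzip.open(dest, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        src.unlink()

    # 2. Purge archived files that have aged past the retention window.
    for old in ARCHIVE_DIR.glob("*.gz"):
        if old.stat().st_mtime < cutoff:
            old.unlink()

if __name__ == "__main__":
    archive_and_cleanup()
```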

Informatica jobs scheduling using Unix cron entries, Informatica Scheduler and Control-M, ensuring efficient and timely execution of ETL workflows.

Created profiles and scorecards in the Analyst tool to analyse the data and present the documented results to the client.

Reviewed code developed by other developers and provided input to drive programming standards.

Environment: Informatica Intelligent Cloud Services (IICS), IDMC, Informatica PC/DQ 10.5, CDI, CDQ, Oracle 19c, QuerySurge, Salesforce, SQL, PL/SQL, DB2, Flat Files (fixed width and delimited), XML, JSON, SQL Developer, DBeaver, AWS, S3, PuTTY, UNIX/Linux, Control-M.

WIPRO USA

Project 1:

Client: Cyclone Project, British Petroleum WIPRO Texas, USA, Remote Apr-2019 - May-2023

Role: Sr. ETL Developer (Informatica PowerCenter & IICS (CDI), Data Quality)

Description:

BP p.l.c. is a British multinational oil and gas company headquartered in London, England. It is one of the oil and gas "supermajors" and one of the world's largest companies measured by revenues and profits. Our objective was to assess the quality of the data within the system and generate comprehensive reports and scores. By leveraging the Informatica tool, we executed a solution that enabled business users to confidently rely on the data they were handling, and it was successfully completed during the term of the project.

Responsibilities:

Responsible for leading a team of 6, successfully coordinating their efforts and driving collaborative problem-solving on Informatica projects.

As a lead, effectively managed task assignments, provided guidance and mentorship, and ensured the team's cohesive and efficient workflow. My leadership skills have contributed to the successful completion of projects and the achievement of team goals.

Extensive experience with software development life cycle methodologies such as Waterfall and Agile.

Implemented data warehousing (DW) loads such as history/full load and incremental/delta load using mapping parameters.

Orchestrated seamless integration between Informatica PowerCenter and SAP BW, ensuring smooth data flow and real-time access to critical business data.

Extracted flat files from an S3 storage bucket to an Oracle database using IICS Data Integration services.

Set up the Secure Agent on instances and loaded files from S3 into a warehouse using Data Integration services (IICS), as well as extracted staging data from heterogeneous databases to the cloud warehouse.

Implemented Informatica's data quality mappings to cleanse, deduplicate, and enrich the data before or after it's stored in Azure Data Lake.

Designed and developed a data pipeline to collect data from multiple sources such as Salesforce, Snowflake, APIs, SAP HANA, DB2, flat files, JSON files and relational DBs like Teradata, Oracle and PostgreSQL, and ingest it into target systems.

Successfully implemented XML data generation processes using Informatica PowerCenter's XML Generator transformation to create structured XML documents for downstream systems.

Created data pipeline and ETL process to load Data from AWS S3 to RDS and redshift.

Bulk loading and unloading data into Snowflake tables using COPY command.

Integrated and automated data workloads to Snowflake Warehouse.

Created Snowpipe for continuous data load.

Loaded and transformed large sets of structured and semi-structured data, analysed and extracted through IICS, into Redshift/S3. Developed technical design specifications to load the data into the data mart tables conforming to the business rules.

Experience in data integration to Salesforce.com CRM (SFDC) using Informatica Cloud Services (Data Replication Tasks, Data Synchronization Tasks and Mapping Configuration Tasks) as well as Informatica PowerCenter.

Involved in design and development of complex ETL mappings and stored procedures in an optimized manner. Cleansed the source data, extracted and transformed data with business rules, and built reusable components such as Mapplets, Reusable transformations and mapping tasks etc.

Developed Cloud Services tasks (Replication/Synchronization/Mapping Configuration) to load data into Salesforce (SFDC) objects.

Developed complex Informatica mappings to load the data from various sources using different transformations like Parser, Web Services, Deduplication, Data Processor, connected and unconnected Lookup, Update Strategy, Expression, Aggregator, Joiner, Filter, Normalizer, Rank and Router. Developed mapplets and worklets for reusability.

Expert in optimization and query performance tuning in oracle using various techniques involving distribution analysis, data-type analysis, configuration parameters etc.

Applied partitioning and bulk loads for loading large volume of data and implemented indexes at table level to improve the performance of data retrieval.

Used Informatica debugging techniques to debug the mappings and used session log files and bad files to trace errors occurred while loading.

Involved in performance tuning of mappings, transformations and (workflow) sessions to optimize session performance.

Successfully developed shell script / Python-based API integration to collect data from several kinds of sources, leveraging authentication techniques such as API keys and OAuth tokens.

Effectively created Python scripts to read Excel files, extracting and converting the data into CSV format. This improved data processing operations, increasing data accessibility and compatibility for applying the business rules.

Implemented Unix shell (Korn & Bourne) scripts to automate the archiving and cleanup of old data files and database records, ensuring that storage resources are used efficiently. Furthermore, implemented automations to execute the Informatica workflows; by scheduling these scripts, we can trigger ETL processes at specific times or in response to certain events, ensuring data is processed when needed.

Developed PL/SQL Stored Procedures, Views and Triggers to implement complex business logics to extract, cleanse the data, transform and load the data into the Oracle database.

Reviewed code developed by other developers and provided input to drive programming standards. Installed and configured Teradata PowerConnect for FastExport for Informatica and the Amazon Redshift cloud data integration application for faster data queries.

Expertise in using Teradata Utilities BTEQ, M-Load, F-Load, TPT and F-Export in combination with Informatica for better Load into Teradata Warehouse.

Created JDBC and ODBC connections in Amazon Redshift from the Connect Client tab of the console.

Automated the administrative tasks of migrating code from one environment to another.

Experience in the Change Management process and other application support life cycles, including Incident Management, Problem Management, Documentation (Run/Handbook), Prod Release Support, Batch Job monitoring and Defect Fixes.

Created Materialized views for summary tables for better query performance.

Implemented weekly error tracking and correction process using Informatica.

Worked on extracting data from the heterogeneous source systems like MS SQL Server, Oracle, PostgreSQL and loading into Landing Layer and then to Data Warehouse.

Created test cases and detailed documentation for Unit Test, System, Integration Test and UAT to check the data quality.

Moved the company from a SQL Server database structure to Salesforce Objects and responsible for ETL and data validation.

Worked closely with the end users in writing the functional specifications based on the business needs. Analyzed the source data coming from Oracle.

Environment: Informatica Intelligent Cloud Services (IICS), Informatica PC/DQ 10.5, Oracle 19c, Power BI, Salesforce, SQL, PL/SQL, DB2, Flat Files (fixed width and delimited), XML, JSON, SQL Developer, DBeaver, Snowflake, QuerySurge, Snowpipe, Redshift, AWS, S3, PuTTY, UNIX/Linux, Python, Autosys r11.

Project 2:

Client: British Petroleum, WIPRO Hyderabad, India Feb-2018 - Apr-2019

Role: Informatica Power Centre Developer

Description:

The concept of Informatica Shared Environment (ISE) was devised to create an architecture that allows multiple BP Refining & Marketing projects across various business units in BP North Americas to share a common Informatica environment for their ETL needs. By sharing the enhancements cost, operational costs, software/hardware, maintenance, and support, this approach significantly reduces the cost per project and eliminates the need for redundant setup efforts. The ISE project provides shared hardware resources and administrative support to all projects within BP, resulting in cost and time savings across different projects.

Responsibilities:

Led the enhancement project for existing Informatica projects, working closely with cross-functional teams to gather requirements, analyze existing workflows, and identify areas for improvement.

Implemented enhancements to optimize data integration and transformation processes, resulting in improved data quality and reduced processing time.

Utilized Informatica PowerCenter & IICS to develop and deploy efficient ETL workflows, ensuring seamless data flow and accurate data transformations.

Collaborated with business users to understand their needs and translate requirements into technical specifications for Informatica development.

Conducted thorough testing and debugging of Informatica workflows, resolving issues and ensuring the smooth functioning of data pipelines.

Applied performance tuning techniques, such as partitioning, caching, and parallel processing, to optimize Informatica workflows and achieve faster processing times.

Implemented indexes at table level to improve the performance of data retrieval.

Experience in working with UNIX shell scripts for automatically running sessions, aborting sessions and creating parameter files. Wrote a number of Unix shell scripts (Korn shell) to run various batch jobs.

Implemented scripts using the Python pandas library to manipulate data within CSV files, Excel spreadsheets, and other common formats.

Created comprehensive documentation, including technical designs, user manuals, and training materials, to facilitate knowledge transfer and ensure effective project maintenance.

Collaborated with the project team to ensure timely delivery of project milestones and adherence to project timelines.

Actively participated in project meetings, providing updates on project progress, risks, and mitigation strategies.

Environment: IICS / IDMC, CAI, CDI, Informatica 9.5.x, 10.2, Oracle 11g, SQL, DB2, Flat Files (Fixed width and Delimited), SQL Developer, PUTTY, UNIX, WINSCP

Project 3:

Client: Financial Data Warehouse (FDW), Eli Lilly, WIPRO Hyderabad, India Apr-2016 - Jan-2018

Role: Informatica Developer

Description:

Eli Lilly is a pharmaceutical company which produces and distributes pharma products across the globe. This project built a pharmaceutical Financial Data Warehouse to present an integrated, consistent, real-time view of enterprise-wide data. Data from different sources was brought into Oracle using Informatica ETL and sent for reporting using Cognos. The Data Warehouse enhances Sales and Order Management reporting for the pharmaceutical research group, delivering reports and information to sales and marketing management.

Responsibilities:

Collaborate with cross-functional teams to gather requirements, understand data sources, and design ETL workflows for integrating patient engagement data from various systems.

Develop complex data transformation logic to cleanse, enrich, and standardize data from diverse sources.

Designed Source to Target mapping using Informatica PowerCenter for their Claims, Health Care Department data.

Developed complex mappings in Informatica to load the data from various sources using different transformations like Source Qualifier, Lookup (connected and unconnected), Expression, Aggregator, Sequence Generator, Joiner, Union, Filter, Update Strategy, Rank and Router.

Created Informatica Mappings to load Member, Provider, Contract, Claims & Services data for State health plans.

Created Informatica mappings to load data for Power Account Payments, Statements, and Claim Card Transactions for Indiana (HIP members); these tables are used to generate the monthly statements sent to members in the plan.

Optimize data loading strategies, partitioning techniques, and parallel processing to achieve optimal performance and reduce processing times.

Explore opportunities for integrating real-time data streaming capabilities within the ETL framework.

Create automated alerts and dashboards to proactively identify and address any data quality issues, processing bottlenecks, or campaign failures.

Ensure the confidentiality and integrity of patient data by implementing robust data security measures. Adhere to industry-specific data protection regulations (e.g., HIPAA) and internal data governance policies to safeguard patient information throughout the ETL process.

Proactively identify opportunities for process enhancement and automation within the ETL lifecycle. Regularly evaluate new ETL tools, technologies, and best practices to streamline development, testing, and deployment processes, contributing to overall team efficiency and agility.

Informatica jobs scheduling using Unix cron entries, Informatica Scheduler and Control-M, ensuring efficient and timely execution of ETL workflows.

Coordinated with scheduling team to run Informatica jobs for loading historical data in production.

Responsible for build changes, system integration testing and UAT activities and maintained the status.

Environment: Informatica 9.5.x, 10.2, Oracle 11g, SQL, PL-SQL, DB2, Cognos, Flat Files (Fixed width and Delimited), SQL Developer, PUTTY, Control-M, UNIX / Linux, WINSCP

Project 4:

Client: HSBC Project, WIPRO Hyderabad, India Oct-2015 - Mar-2016

Role: ETL DataStage Developer

Description:

HSBC's activities are organized into different business divisions: Retail Banking (including Mortgages), Wholesale, Life, Pensions & Insurance, and Wealth & International. The project was to transform the data coming from various sources through multiple stages before being loaded into the data warehouse, and to maintain it.

Responsibilities:

Designed and developed ETL (Extract, Transform, Load) processes using IBM DataStage, ensuring efficient and accurate data integration.

Developed data mappings, transformations, and workflows in DataStage to meet business requirements and data quality standards.

Collaborated with cross-functional teams to gather data requirements and translate them into technical specifications for DataStage development.

Performed data cleansing and data validation using DataStage to ensure data accuracy and integrity. Enforced job scheduling and monitoring using DataStage Director to automate ETL processes and ensure timely data delivery.

Conducted performance tuning activities, optimizing DataStage jobs for improved processing speed and efficiency.

Created and maintained technical documentation, including data lineage, data dictionaries, and job dependencies.

Developed DataStage jobs extensively using Aggregator, Sort, Merge, and Data Set in Parallel Extender to achieve better job performance.

Made estimates and plans for developing engineering products, wrote specifications for various development work, and set up coding standards to be used throughout system development.

Participated in design activities, data loading/unloading, and tuning and optimization.

Participated in system testing and user acceptance testing (UAT) to validate the functionality and performance of DataStage jobs.

Environment: IBM DataStage 11.3, SQL Server, UNIX, Control-M

Project 5:

Client: Lloyds Banking Group, WIPRO Bangalore, India Nov -2014 - Sep-2015

Role: ETL DataStage Developer

Description:

Worked extensively on the Informatica Designer to design a robust end-to-end ETL process involving complex transformations such as Source Qualifier, Lookup, Update Strategy, Router, Aggregator, Sequence Generator, Filter, Expression, Stored Procedure, External Procedure and Transaction Control for the efficient extraction, transformation and loading of the data to staging and then to the Data Mart (Data Warehouse), applying the complex logic for computing the facts.

Responsibilities:

Involved in Design, Mapping documents, Build and unit test plan creation and execution.

Supporting System and System integration testing.

Creation of Production deliverables which includes scheduling plans and releases.

Utilized Unix shell scripts to automate file handling process and file cleansing activities.

Supporting the production implementation and warranty of the project.

Environment: DataStage (v 9.1), UNIX, SQL, TWS & Connect Direct, SQL Developer.


