
Data ETL

Location:
Alpharetta, GA
Posted:
December 12, 2020

Avinash Phone: 225-***-****

Email: adimkx@r.postjobfree.com

SUMMARY:

More than 9 years of total IT experience developing Business Intelligence solutions, including building data warehouses, data marts, and ETL processes, for clients in major industry sectors such as Telecom, Pharmacy, Finance, and Insurance

More than 6 years of ETL tool experience using IBM Information Server DataStage and QualityStage 8.x and Ascential DataStage 7.x/6.0, designing, developing, testing, and maintaining jobs using Designer, Manager, Director, Administrator, and Debugger

Experienced in troubleshooting DataStage jobs and addressing production issues such as performance tuning and fixing data issues

Excellent knowledge of analyzing data dependencies using DataStage metadata and preparing job sequences for existing jobs to facilitate the scheduling of multiple jobs

Strong understanding of DW principles using fact tables, dimension tables, star schema modeling, and the Ralph Kimball and Bill Inmon approaches

Experienced in writing system specifications, translating user requirements into technical specifications, and preparing ETL source-to-target mapping documents and testing documents

Experience in integrating various data sources with multiple relational database (RDBMS) systems: Oracle, Teradata, Sybase, SQL Server, MS Access, and DB2

Worked on integrating data from flat files, COBOL files and XML files.

Extensively worked on extracting data from SAP using ABAP Extract Stage.

Experience in writing, testing, and implementing triggers, procedures, and functions at the database and form levels using PL/SQL

Sound knowledge in UNIX shell scripting

Knowledge of full life cycle development for building a Data Warehouse

Good working knowledge of Client-Server Architecture

EDUCATION & CERTIFICATION:

Higher Degree: Master of Science, May 2011

University: Louisiana State University

Field: Systems Science

Certified in IBM WebSphere DataStage v11.5.

TECHNICAL SKILLS:

ETL Tools: SSIS (MS Visual Studio 2012/2015), IBM DataStage 9.1/8.7/8.5/7.5.

Big Data Ecosystems: Hadoop, MapReduce, HDFS, Hive, Sqoop and Talend.

Databases: Teradata, Oracle 10g/9i/8i, SQL Server 2000, DB2, Sybase, MS Access.

UNIX Tools: C Shell, K Shell, Bourne Shell, Perl.

Programming Languages: SQL, PL/SQL, MDX, Visual Basic, XML.

Data Modeling: Erwin

PROFESSIONAL EXPERIENCE:

Federal Home Loan Bank of Atlanta (Contractor) Aug 2018 – Current

Sr. ETL Architect and Developer

Responsibilities

Extensively used DataStage for extracting, transforming, and loading data from sources including Oracle, DB2, and flat files.

Collaborated with the EDW team on high-level design documents for the extract, transform, validate, and load (ETL) process: data dictionaries, metadata descriptions, file layouts, and flow diagrams.

Collaborated with the EDW team on low-level design documents for mapping the files from source to target and implementing business logic.

Generated surrogate keys for the dimension and fact tables for indexing and faster data access in the data warehouse.
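
Below is a minimal sketch of this surrogate key approach using an Oracle sequence; the sequence, table, and column names are hypothetical, not taken from the project.

    #!/bin/ksh
    # Populate a dimension with a sequence-generated surrogate key.
    # customer_sk_seq, dim_customer, and stg_customer are hypothetical names.
    sqlplus -s "$DB_USER/$DB_PASS@$DB_SID" <<'EOF'
    CREATE SEQUENCE customer_sk_seq START WITH 1 INCREMENT BY 1 CACHE 500;

    INSERT INTO dim_customer (customer_sk, customer_id, customer_name)
    SELECT customer_sk_seq.NEXTVAL, customer_id, customer_name
    FROM   stg_customer;

    COMMIT;
    EOF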

Tuned transformations and jobs for Performance Enhancement.

Extracted data from flat files, transformed it according to the requirements, and loaded it into target tables using stages such as Sequential File, Lookup, Aggregator, Transformer, Join, Remove Duplicates, Change Capture, Sort, Column Generator, Funnel, and Oracle Enterprise.

Created batches (DS job controls) and sequences to control sets of jobs.

Extensively used DataStage Change Data Capture for DB2 and Oracle files and employed change capture stage in parallel jobs.

Executed pre- and post-session commands on the source and target databases using shell scripting.
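
A minimal sketch of such a pre/post-session wrapper; the index and table names are hypothetical, and the load itself is assumed to be launched by the scheduler:

    #!/bin/ksh
    # Pre-session: make the index unusable so the bulk load runs faster.
    sqlplus -s "$DB_USER/$DB_PASS@$DB_SID" <<'EOF'
    ALTER INDEX fact_sales_ix UNUSABLE;
    EOF

    # ... the DataStage load runs here ...

    # Post-session: rebuild the index and refresh optimizer statistics.
    sqlplus -s "$DB_USER/$DB_PASS@$DB_SID" <<'EOF'
    ALTER INDEX fact_sales_ix REBUILD;
    EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'FACT_SALES');
    EOF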

Collaborated in design testing using HP Quality Center.

Extensively worked on job sequences to control the execution of job flows using various activities and triggers (conditional and unconditional) such as Job Activity, Wait For File, Email Notification, Sequencer, Exception Handler, and Execute Command.

Collaborated with BI and BO teams to determine how reports are affected by changes to the corporate data model.

Utilized Parallelism through different partition methods to optimize performance in a large database environment.

Developed DS jobs to populate the data into staging and Data Mart.

Executed jobs through sequencer for better performance and easy maintenance.

Performed unit testing on the developed jobs to ensure they meet the requirements.

Involved in creating table definitions, indexes, views, sequences, and materialized views.

Prepared documentation addressing the referential integrity relationships between the tables at the ETL level.

Environment: IBM InfoSphere DataStage 11.7, Tidal scheduling, Toad, Oracle SQL Developer, MS SQL Server database, web services to communicate between Bloomberg and the bank.

Blue Cross Blue Shield, IL-Chicago (Contractor) Sep 2014 – Aug 2018

Sr. ETL Architect and Developer

Responsibilities

Analyzed, designed, developed, implemented, and maintained parallel jobs using IBM InfoSphere DataStage.

Involved in the design of dimensional data models: star schema and snowflake schema.

Generated DB scripts from the data modeling tool and created physical tables in the database.

Worked with SCDs to populate Type I and Type II slowly changing dimension tables from several operational source files.
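
A minimal sketch of the Type II pattern (expire the changed current row, then insert the new version), written here in Oracle-flavored SQL; dim_member, stg_member, and their columns are hypothetical:

    #!/bin/ksh
    # Type II SCD load against hypothetical member tables.
    sqlplus -s "$DB_USER/$DB_PASS@$DB_SID" <<'EOF'
    UPDATE dim_member d
       SET d.current_flag = 'N', d.eff_end_dt = CURRENT_DATE
     WHERE d.current_flag = 'Y'
       AND EXISTS (SELECT 1 FROM stg_member s
                   WHERE s.member_id = d.member_id
                     AND s.plan_code <> d.plan_code);

    INSERT INTO dim_member (member_sk, member_id, plan_code,
                            eff_start_dt, eff_end_dt, current_flag)
    SELECT member_sk_seq.NEXTVAL, s.member_id, s.plan_code,
           CURRENT_DATE, DATE '9999-12-31', 'Y'
      FROM stg_member s
     WHERE NOT EXISTS (SELECT 1 FROM dim_member d
                       WHERE d.member_id = s.member_id
                         AND d.current_flag = 'Y');
    COMMIT;
    EOF

(A Type I attribute would instead simply be overwritten in place with an UPDATE, with no versioning.)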

Created routines (before/after subroutines and transform functions) used across the project.

Experienced in PX file stages that include Complex Flat File stage, Dataset stage, Lookup File Stage, Sequential file stage.

Implemented Shared container for multiple jobs and Local containers for same job as per requirements.

Performed Hadoop data ingestion through HQL scripts.

Experienced in importing and exporting data using Sqoop.
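
A minimal Sqoop import sketch; the JDBC URL, source table, and Hive target are hypothetical:

    #!/bin/ksh
    # Import one relational table into a Hive staging table.
    sqoop import \
      --connect "jdbc:oracle:thin:@//dbhost:1521/EDWP" \
      --username "$SQOOP_USER" \
      --password-file /user/etl/.sqoop.pwd \
      --table CLAIMS_STG \
      --split-by CLAIM_ID \
      --num-mappers 4 \
      --hive-import --hive-table staging.claims_stg

(sqoop export works the same way in reverse, with --export-dir pointing at the HDFS data.)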

Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture, data modeling and data mining.

Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
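
The resume does not say what language the MapReduce programs were written in; purely as an illustration, the same parse-and-deduplicate step can be sketched with Hadoop Streaming and shell commands (paths are hypothetical):

    #!/bin/ksh
    # Project three pipe-delimited fields from the raw files and remove
    # duplicate records; the sort/shuffle phase groups identical lines.
    hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
      -input  /raw/claims/ \
      -output /staging/claims_parsed \
      -mapper "cut -d'|' -f1,3,7" \
      -reducer "uniq"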

Created Hive Queries that helped market analysts spot emerging trends by comparing source data with EDW reference tables and historical metrics.
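
A minimal HiveQL sketch of such a comparison; the databases, tables, and columns are hypothetical:

    #!/bin/ksh
    # Compare current-month source volumes against historical averages
    # held in a hypothetical EDW reference table.
    hive -e "
      SELECT c.plan_code,
             COUNT(*)           AS curr_month_claims,
             MAX(h.monthly_avg) AS historical_avg
        FROM staging.claims c
        JOIN edw.claims_history h ON h.plan_code = c.plan_code
       WHERE c.claim_month = '2016-07'
       GROUP BY c.plan_code;"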

Adept at mapping source-to-target data using IBM DataStage 8.x

Experienced in developing parallel jobs using various Development/debug stages (Peek stage, Head & Tail Stage, Row generator stage, Column generator stage, Sample Stage) and processing stages (Aggregator, Change Capture, Change Apply, Filter, Sort & Merge, Funnel, Remove Duplicate Stage)

Debug, test and fix the transformation logic applied in the parallel jobs

Involved in creating UNIX shell scripts for database connectivity and executing queries in parallel job execution.

Used the DataStage Director to schedule and run jobs, test and debug their components, and monitor performance statistics.
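
Director work is done through the GUI; its command-line counterpart is the dsjob utility, sketched here with hypothetical project and job names:

    #!/bin/ksh
    # Run a job with a parameter, wait for it, then print a log summary.
    dsjob -run -jobstatus -param RUN_DATE=2016-07-31 EDW_PROJ srcLoadMembers
    status=$?
    # With -jobstatus the exit code mirrors the final job status
    # (for example, 1 = finished OK, 3 = aborted).
    if [ "$status" -gt 2 ]; then
        echo "srcLoadMembers failed with status $status" >&2
    fi
    dsjob -logsum EDW_PROJ srcLoadMembers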

Experienced in using SQL *Loader and import utility in TOAD to populate tables in the data warehouse.
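
A minimal command-line SQL*Loader sketch; the data file, control file contents, and table are hypothetical:

    #!/bin/ksh
    # Build a control file on the fly, then bulk-load a pipe-delimited file.
    cat > member.ctl <<'EOF'
    LOAD DATA
    INFILE 'member.dat'
    APPEND INTO TABLE stg_member
    FIELDS TERMINATED BY '|'
    (member_id, plan_code, member_name)
    EOF
    sqlldr userid="$DB_USER/$DB_PASS@$DB_SID" control=member.ctl log=member.log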

Prepared FastTrack Mapping Specifications, created metadata layouts in Metadata workbench and updated Business Glossary.

Successfully implemented pipeline and partitioning parallelism techniques and ensured load balancing of data.

Deployed different partitioning methods like Hash by column, Round Robin, Entire, Modulus, and Range for bulk data loading and for performance boost.

Repartitioned job flows to make the best use of the resources available to DataStage PX.

Created universes and reports in Business Objects Designer.

Created, implemented, modified, and maintained simple to complex business reports using the Business Objects reporting module.

Environment: IBM InfoSphere DataStage 8.7/11.5, flat files, ZENA, UNIX, Erwin, Teradata SQL Assistant, MS SQL Server database, XML files, Business Glossary, MS Access database.

United Health Group, Eden Prairie MN (Contractor) April 2014 – Sep 2014

Role: Sr. ETL Developer

UnitedHealth Group Inc. is a diversified managed health care company headquartered in Minnetonka, Minnesota, U.S. It is No. 14 on Fortune magazine's list of the top 500 companies in the United States.

Responsibilities:

Worked closely with business analysts and business users to understand the requirements and build the technical specifications.

Responsible for creating source-to-target (STT) mappings.

Responsible for designing, developing, and building DataStage parallel jobs using DataStage Designer.

Developed and supported the extraction, transformation, and load (ETL) process for a data warehouse from various data sources using DataStage Designer.

Designed and developed Parallel jobs to extract data, clean, transform, and to load the target tables using the DataStage Designer

Good understanding of InfoSphere Information Server architecture, InfoSphere Information Governance Catalog, and IMAM.

Worked on Governance Catalog to govern information assets through the development of a governance catalog of categories and terms.

Designed and developed job sequences to run multiple jobs.

Used DataStage Designer for importing the source and target database schemas, importing and exporting jobs/projects, creating new job categories and table definitions.

Designed and developed the routines and job sequences for the ETL jobs.

Prepared Technical Specification Documents for DataStage Jobs.

Involved in Unit testing and integration testing.

Responsible for running jobs using Job Sequencers, Job Batches.

Deployed different partitioning methods like Hash by field, Entire and Range for bulk data loading and for performance boost.

Implemented logical and physical data modeling with Star and Snowflake techniques using Erwin in Data warehouse.

Extensively used Reject Link, Job Parameters, and Stage Variables in developing jobs.

Used the DataStage Director to run, schedule, monitor, and test the application on development, and to obtain the performance statistics.

Developed packages, customized functions, and triggers based on the business logic.

Involved in Performance tuning of complex queries.

Developed Oracle PL/SQL stored procedures, functions, packages, and SQL scripts to support the functionality of various modules.
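
A minimal sketch of one such procedure, deployed through sqlplus; the table names and the archiving rule are hypothetical:

    #!/bin/ksh
    # Create a procedure that archives processed claims up to a run date.
    sqlplus -s "$DB_USER/$DB_PASS@$DB_SID" <<'EOF'
    CREATE OR REPLACE PROCEDURE archive_claims (p_run_date IN DATE) AS
    BEGIN
      INSERT INTO claims_archive
        SELECT * FROM claims WHERE processed_date <= p_run_date;
      DELETE FROM claims WHERE processed_date <= p_run_date;
      COMMIT;
    END archive_claims;
    /
    EOF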

Environment: DataStage v9.1(Designer, Director and Administrator), QualityStage, Test Director, ClearCase, AutoSys, DS Extract PACK for SAP 5.1, IBM DB2 9.1, AIX 5.3, SQL Server, Toad for DB2, Windows XP, Microsoft Visio, TWS.

Health Partners, Bloomington MN (Contractor) Oct 2013 – Mar 2014

Employer – IT KeySource, Inc

Sr. Datastage Developer

Project: TSYS to FDR conversion

HealthPartners is an integrated, nonprofit health care provider located in Bloomington, Minnesota offering care, coverage, research and education to its members, patients and the community.

Responsibilities:

Interacted with the end-user community to understand the business requirements and to identify data sources.

Analyzed the existing informational sources and methods to identify problem areas and make recommendations for improvement. This required a detailed understanding of the data sources and researching possible solutions.

Implemented dimensional model (logical and physical) in the existing architecture using Erwin.

Studied the PL/SQL code developed to relate the source and target mappings.

Helped in preparing the mapping document for source to target.

Worked with DataStage Manager to import metadata from the repository, create new job categories, and create new data elements.

Designed and developed ETL processes using DataStage designer to load data from Oracle, MS SQL, Flat Files (Fixed Width) and XML files to staging database and from staging to the target Data Warehouse database.

Used DataStage stages such as Hashed File, Sequential File, Transformer, Aggregator, Sort, Data Set, Join, Lookup, Change Capture, Funnel, Peek, and Row Generator in accomplishing the ETL coding.

Developed job sequencer with proper job dependencies, job control stages, triggers.

Used QualityStage to ensure consistency and to remove data anomalies and spelling errors from the source information before it was delivered for further processing.

Extensively used DS Director for monitoring job logs to resolve issues.

Involved in performance tuning and optimization of DataStage mappings using features like Pipeline and Partition Parallelism and data/index cache to manage very large volume of data.

Documented ETL test plans, test cases, test scripts, and validations based on design specifications for unit testing, system testing, functional testing, prepared test data for testing, error handling and analysis.

Used the Autosys job scheduler to automate the regular monthly run of the DW cycle in both production and UAT environments.
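
A minimal sketch of an Autosys definition for such a run, loaded through the jil utility; the job, machine, and calendar names are hypothetical:

    #!/bin/ksh
    # Define a monthly DW-cycle command job in Autosys.
    jil <<'EOF'
    insert_job: DW_MONTHLY_LOAD   job_type: cmd
    command: /opt/etl/bin/run_dw_cycle.ksh
    machine: etlprod01
    owner: etluser
    run_calendar: month_end_cal
    start_times: "22:00"
    alarm_if_fail: 1
    EOF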

Verified the Cognos Report by extracting data from the Staging Database using PL/SQL queries.

Wrote configuration files for performance in the production environment.
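
A minimal sketch of a parallel-engine configuration file of the kind meant here; the host name and disk paths are hypothetical:

    #!/bin/ksh
    # Write a two-node APT configuration file and point the engine at it.
    cat > /opt/etl/config/prod2node.apt <<'EOF'
    {
      node "node1" {
        fastname "etlprod01"
        pools ""
        resource disk "/data/ds/node1" {pools ""}
        resource scratchdisk "/scratch/ds/node1" {pools ""}
      }
      node "node2" {
        fastname "etlprod01"
        pools ""
        resource disk "/data/ds/node2" {pools ""}
        resource scratchdisk "/scratch/ds/node2" {pools ""}
      }
    }
    EOF
    export APT_CONFIG_FILE=/opt/etl/config/prod2node.apt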

Participated in weekly status meetings.

Environment: IBM Information Server 8.5 (Designer, Director and Administrator), QualityStage, Test Director, ClearCase, Zeke, K-Shell Scripts, Mainframe TSO ISPF, IBM DB2 9.1, AIX 5.3, WinSQL for DB2.

First Tennessee Bank, Memphis, TN (Contractor) Feb 2012 – Oct 2013

Sr. Datastage Developer

Project: TSYS to FDR conversion

Responsibilities:

•Interacted with the end-user community (FDR) to understand the business requirements and to identify data sources.

•Analyzed the existing informational sources and methods to identify problem areas and make recommendations for improvement. This required a detailed understanding of the data sources and researching possible solutions.

•Used Classic Federation Server to transfer files between the mainframe and UNIX.

•Used DataStage stages, namely the z/OS File stage (with COBOL copybooks) to extract data from the mainframe through Classic Federation Server, along with Column Export, Column Import, Sequential File, Transformer, Aggregator, Sort, Data Set, Join, Lookup, Change Capture, Funnel, Peek, and Row Generator, in accomplishing the ETL coding.

•Developed job sequencer with proper job dependencies, job control stages, triggers.

•Used the Zeke job scheduler to automate the regular monthly run of the DW cycle in both production and UAT environments.

•Reviewed reports on Mainframe TSO ISPF environment and allocated datasets on the mainframe using Classic Federation jobs.

•Created shared containers to simplify job design.

•Performed performance tuning by interpreting the performance statistics of the jobs developed.

•Documented ETL test plans, test cases, test scripts, and validations based on design specifications for unit testing, system testing, functional testing, regression testing, prepared test data for testing, error handling and analysis.

•Worked on change management system on code migrations from Dev to QA to Prod environments.

•Extensively worked on building ETL interfaces to read and write data from a DB2 database using the DB2 Enterprise Stage and DB2 API Stage.

•Involved in functional and technical meetings and responsible for creating ETL Source – to – Target maps.

•Modified existing jobs and hash files according to changing business rules.

•Loaded the historical data into the warehouse.

•Developed jobs for transforming the data, using stages like Join, Merge, Lookup, Funnel, Transformer, Pivot, and Aggregator.

•Experience with Scheduling tool Zeke for automating the ETL process.

Environment: IBM Information Server 8.5 (Designer, Director and Administrator), QualityStage, Test Director, ClearCase, AutoSys, K-Shell Scripts, SQL Server Integration Services (SSIS) 2005, SAP ECC R3, SAP BW, DS Extract PACK for SAP 5.1, IBM DB2 9.1, AIX 5.3, Embarcadero for DB2

Freddie Mac, McLean, VA (Contractor) Dec 2010 – Jan 2012

Sr. DataStage Developer

The Federal Home Loan Mortgage Corporation (FHLMC), known as Freddie Mac, is a public Government Sponsored Enterprise (GSE). The FHLMC was created in 1970 to expand the secondary market for mortgages in the US. Along with other GSEs, Freddie Mac buys mortgages on the secondary market, pools them, and sells them as mortgage-backed securities to investors on the open market. This secondary mortgage market increases the supply of money available for mortgage lending and increases the money available for new home purchases.

Responsibilities:

•Designed and developed jobs for extracting data from different data feeds into the IBM DB2 database.

•Coded many shell scripts for efficient job scheduling.

•Worked on preparing the test cases and testing ETL jobs and data validation.

•Developed parallel jobs using various Development/debug stages and processing stages (Aggregator, Change Capture, Change Apply, SAP ABAP Extract Stage, IDoc Stage, BAPI Stage, Filter, Sort & Merge, Funnel, and Remove Duplicate Stage).

•Worked on change management system on code migrations from Dev to QA to Prod environments.

•Proficient in ETL (Extract – Transform – Load) using SQL Server Integration Services 2005 (SSIS) and Informatica PowerCenter tool.

•Performed debugging on these jobs using Peek stage by outputting the data to Job Log or a stage.

•Extensively worked on building ETL interfaces to read and write data from a DB2 database using the DB2 Enterprise Stage and DB2 API Stage.

•Involved in functional and technical meetings and responsible for creating ETL Source – to – Target maps.

•Demonstrated expertise utilizing ETL tools, including SQL Server Integration Services (SSIS), ETL package design, and RDBMS systems like SQL Server.

•Modified existing jobs and hash files according to changing business rules.

•Loaded the historical data into the warehouse.

•Developed jobs for transforming the data, using stages like Join, Merge, Lookup, Funnel, Transformer, Pivot, and Aggregator.

•Experience with Scheduling tool Autosys for automating the ETL process.

•Involved in developing a Control Module for the complete process using PERL and UNIX scripting.

•Worked on documenting technical design documents and source to target (STT) documents.

•Involved in Unit Testing, Integration testing and UAT Performance Testing.

•Worked with Embarcadero to interact with DB2.

Louisiana State University, Baton Rouge LA Aug 2009 – Dec 2010

ETL Data Research Assistant

Responsibilities:

•Involved in all phases of SDLC.

•Responsible for creating detailed design and source to target mappings.

•Responsible for communicating with business users and project management to gather business requirements and translate them into ETL specifications.

•Used DataStage/QualityStage Designer to import/export jobs, table definitions, Custom Routines and Custom Transformations.

•Created Extract Transform and Load (ETL) interfaces and gateways for backend database.

•Designed mappings from sources to operational staging targets using a star schema and implemented logic for Slowly Changing Dimensions (SCD).

•Extensive hands-on experience designing and developing parallel and server jobs.

•Extensively worked on building DataStage jobs using various stages like Oracle Connector, Funnel, Transformer, Sequential File, Lookup, Join, and Peek.

•Extensively used Sort, Merge, Aggregator, Peek, DataSet and Remove Duplicates stages.

•Involved in the migration of DataStage jobs from development to QA and then to production environment.

•Created shared containers to use in multiple jobs.

•Hands-on experience upgrading DataStage from v7.5 to Information Server 8.1.1.

•Imported and exported repositories across DataStage projects using DataStage Designer.

•Extensively worked on DataStage Job Sequencer to schedule and run jobs in sequence.

Environment: IBM Information Server 8.1.1 DataStage and QualityStage (Designer, Director and Administrator), Oracle 10g, Cognos v10, PL/SQL, HP-UX 11, Toad for Oracle.


