Sr. ETL Data Warehouse Developer
8+ years of extensive experience in the IT industry in the areas of Data Warehousing, Business Intelligence, Application Development, and Business Analysis.
Experience across the complete Software Development Life Cycle (SDLC), including system analysis, design, development, implementation, testing, deployment, and support of Data Warehousing and Data Integration solutions using IBM InfoSphere DataStage, Informatica PowerCenter, and Talend Big Data Integration.
Strong business understanding of verticals like Banking, Finance, Telecommunications and Healthcare domains.
Extensive experience developing ETL programs to support data extraction, transformation, and loading using IBM InfoSphere DataStage, Informatica PowerCenter, and Talend Big Data Integration.
Practical understanding of data modeling (dimensional and relational), Star/Snowflake schemas, and logical and physical data modeling for Data Warehousing and Online Transaction Processing (OLTP) systems.
Experience in documenting High Level Design, Low level Design, Unit test plan, Unit test cases, Integration Testing, System testing and Deployment documents.
Experienced in integrating and transforming data from databases such as MySQL, MS Access, Oracle, DB2, SQL Server, Teradata, and PostgreSQL, and from formats such as flat files, COBOL files, and XML, into a data staging area.
Experience in importing and exporting data into HDFS and HIVE.
Expertise in writing complex Ad-hoc SQL queries for reporting purposes.
Extensive experience in Client Server technology area with Oracle Database, SQL Server, DB2 and PL/SQL for the back end development of Stored procedures, functions, packages, TYPE Objects, Triggers, cursors, REF cursors, parameterized cursors, Views, Materialized Views, PL/SQL collections.
Proficient in writing UNIX shell scripts to automate business processes, ETL jobs, SQL scripts, and file transfers.
Extensive knowledge of Job Information Language (JIL) and supporting CA Technologies Workload Automation AutoSys in both UNIX and Windows environments.
Knowledge of automating application builds and deployments using Jenkins, AnthillPro, and the IBM uDeploy continuous integration tool.
Exceptional analytical and problem-solving skills, working in development and production support teams and handling critical situations to meet deadlines for successful completion of tasks/projects.
Involved in migrating DataStage jobs from IBM InfoSphere Information Server DataStage 9.1 to IBM InfoSphere Information Server DataStage 11.5.
Involved in ETL Conversion project from IBM InfoSphere Datastage to Informatica PowerCenter and Talend Big Data Integration.
Involved in Database upgrade, migration and server tech refresh.
Good understanding of the Hadoop ecosystem and strong conceptual knowledge of Hadoop architecture components.
Technical Skills
IBM InfoSphere Information Server 11.5 (integrated DataStage and QualityStage), IBM DataStage 11.5/9.x/8.x (Designer, Director, Administrator, Manager), Informatica PowerCenter 9.6.1 (Designer, Workflow Manager, Workflow Monitor, and Repository Manager)
Talend Big Data Integration v6.3/ v5.6.1
Cognos Analytics 11.1, 10.2 (Report Studio, Analysis Studio, Query Studio), Framework Manager
Data Modeling: ERwin 9.x/8.x, MS Visio, PowerDesigner
Professional Experience
Wells Fargo, Charlotte, NC
Sr. ETL Data Warehouse Developer, August 2015 – Present
Wholesale Loan Services (WLS):
Wholesale Loan Services (WLS) is used to consolidate loan tracking information from multiple applications into a single repository to provide timely and accurate reports to managers in WLS & CRE loan centers.
Responsibilities:
Worked on the proof of concept to bring data to HDFS and Hive.
Involved in moving multiple files from UNIX file system to HDFS and vice versa.
Used over 40 Talend components, including tHDFS (Connection, Put, Get, Input, Output), tHive (Connection, Input, Output, Row), tMap, tJoin, tFileList, tJava, tNormalize, tLogRow, tOracle (Input, Output, SCD), tRunJob, tFlowToIterate, tLoop, tPrejob, tPostjob, tSendMail, tFileInputDelimited, tFileOutputDelimited, tParallelize, tLogCatcher, etc.
Imported RDBMS data to HDFS using the Sqoop import command.
Created multiple Hive tables, implemented partitioning, dynamic partitioning and buckets in Hive for efficient data access.
Created Pig and Hive UDFs to do data manipulation, transformations, joins and some pre-aggregations and produce summary results from Hadoop to downstream systems.
Created and used globalMap and context variables in Talend jobs to reuse and dynamically pass values at run time.
Created and used Joblets in Talend to reuse transformation logic across jobs.
Loaded data from different databases for multiple source systems into the staging area using dynamic column selection.
Loaded data from different file formats into HDFS.
Good conceptual knowledge of YARN, which is essentially a system for managing distributed applications.
Created Talend job versions and updated the master GitHub repository for code deployment in TAC.
Deployed jobs, set execution plans, and scheduled jobs using the Talend Administration Center (TAC) Job Conductor.
Reviewed, monitored, and investigated server log messages in the TAC Monitoring tab.
Worked on IBM InfoSphere Datastage tools using Designer, Director, Administrator, Information Services Director, Information Analyzer, Multi-client manager.
Involved with business analysts in requirements gathering and prepared technical design/specifications and analyzing source data for data extraction, transformation and loading using Datastage designer.
Used IBM Datastage to extract data from relational databases, flat files and then transformed based on the requirement and loaded into target tables using various stages like sequential file, Look up, Aggregator, Transformer, Join, Remove Duplicates, Change capture data, Sort, Column generators, Funnel and Oracle connector.
Created Datastage Job Sequences to control the execution of the job flow using various Activities & Triggers (Conditional and Unconditional) like Job Activity, Wait for file, Email Notification, Sequencer, Exception handler activity and Execute Command.
Programming: PL/SQL, HTML/CSS, Java, C/C++, XML, Python, R, PowerShell
Operating Systems: Windows 2012/2008 R2/2000/XP/NT, UNIX (Solaris, IBM AIX), Red Hat Linux 6
Databases: Oracle 12c/11g/10g, SQL Server 2016/2012, Teradata 16.10, MS Access, IBM DB2 10.1, PostgreSQL, MySQL, Hive
Version Control: WinCVS, Tortoise SVN, GitHub
Build & Deployment: Jenkins, Artifact, AnthillPro, uDeploy
Other Tools: MS Office, AutoSys, TPump, FastLoad, MultiLoad, TPT, JIRA, HP Quality Center, Toad, SQL*Loader, SSMS, Teradata SQL Assistant, PuTTY, WinSCP, SharePoint, MS-DOS, PowerShell, IAM
Created Stored Procedures, Functions, Packages and Triggers using PL/SQL.
Implemented restart strategy and error handling techniques to recover failed Datastage jobs.
Performed performance tuning to improve data extraction, processing, and load times.
Wrote complex SQL Queries involving multiple tables with joins.
Involved in migrating Datastage jobs from IBM InfoSphere Information server version 9.1 to IBM InfoSphere Information server 11.5.
Involved in migrating from Oracle 11g databases to Exadata Oracle 12c databases.
Used the exchange partition technique for loading huge volumes of data into target tables.
Scheduled and monitored AutoSys jobs using the AutoSys interface and command line.
Built continuous integration and automated deployment processes using Jenkins, AnthillPro, and uDeploy.
Wrote Bash shell scripts to export and import DataStage jobs using the uDeploy tool for continuous deployment to higher environments.
Performed production support with quick turnaround time for hot fixes.
Created different types of reports, such as list, crosstab, drill-through, and chart reports, using Cognos 10.1.
Manually modified the user defined SQL in Report Studio to tune and/or to write complicated reports.
Used Framework Manager for model design, creating and publishing packages.
Conducted team meetings with offshore/onsite team members to assign work and track progress on assigned tasks.
Involved in all phases of development, from requirements analysis, data profiling, data certification, high-level design, and low-level design through coding, performance tuning, and testing; supported system testing, UAT, and implementation.
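The Hive partitioning described above can be illustrated with a minimal, self-contained Python sketch of a directory-per-partition layout (Hive organizes partitions the same way on HDFS, e.g. dt=2016-01-01 subdirectories); the loan IDs, amounts, and dates here are hypothetical:

```python
import csv
import tempfile
from pathlib import Path

def write_partitioned(base: Path, rows):
    # One subdirectory per partition value, mimicking Hive's
    # dt=YYYY-MM-DD directory layout.
    for dt, loan_id, amount in rows:
        part_dir = base / f"dt={dt}"
        part_dir.mkdir(parents=True, exist_ok=True)
        with open(part_dir / "part-00000.csv", "a", newline="") as f:
            csv.writer(f).writerow([loan_id, amount])

def read_partition(base: Path, dt: str):
    # Partition pruning: only files under the matching dt=...
    # directory are opened; other partitions are never scanned.
    rows = []
    for path in sorted((base / f"dt={dt}").glob("*.csv")):
        with open(path, newline="") as f:
            rows.extend(csv.reader(f))
    return rows

base = Path(tempfile.mkdtemp())
write_partitioned(base, [
    ("2016-01-01", "L001", "2500"),
    ("2016-01-01", "L002", "1800"),
    ("2016-01-02", "L003", "3200"),
])
jan_first = read_partition(base, "2016-01-01")
```

A query filtered on the partition key touches only that partition's files, which is the efficiency the bucketed/partitioned Hive tables above provide.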
Environment: Talend Big Data Integration v6.3.2, IBM InfoSphere Datastage 11.5, Cognos Analytics 11.1, Cloudera Hadoop distribution, Pig, Hive, Impala, Sqoop, MapReduce, HDFS, Oracle 12c, SQL Server 2016, Teradata 16.10, PL/SQL, AutoSys, WinCVS, Erwin, Linux, WinSCP, GitHub, AnthillPro, Jenkins, uDeploy, JIRA
Customer Due Diligence (CDD)
Wells Fargo Bank must consistently identify high-risk customers to the OCC; Customer Due Diligence provides a holistic view of the customer across Wholesale’s products, systems, etc.
Responsibilities:
Worked on Informatica PowerCenter tools- Designer, Repository Manager, Workflow Manager, and Workflow Monitor.
Worked on Informatica Utilities Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer and Transformation Developer.
Created reusable transformations to load data from operational data source to Data Warehouse and involved in capacity planning and storage of data.
Implemented best practices as per the standards while designing technical documents and developing Informatica ETL process.
Designed and maintained the Logical/ Physical Dimensional Data models for Wholesale system and involved in Dimensional modeling (Snowflake Schema) using Erwin to design the business process.
Generated specifications for keys, constraints, Indexes and other physical model attributes for existing data extracts, data structures, and data processes.
Developed complex mappings such as Slowly Changing Dimensions Type II-Time stamping in the Mapping Designer.
Used various transformations like Source Qualifier, Stored Procedure, Expression, Connected and Unconnected lookups, Update Strategy, Filter, Joiner, Aggregator, Sorter, Router, Rank, Sequence Generator, and Transactional Control to implement complex business logic.
Used Informatica Workflow Manager to create workflows, sessions, database connections and batches to run the mappings.
Involved in ETL, reporting and data testing for Database server tech refresh.
Used parameters and variables in the mappings to pass the values between mappings and sessions.
Wrote UNIX shell Scripts & PMCMD commands to execute Informatica Workflows and transfer files.
Used JIRA ticketing system to track issues and process flow to move tasks from one activity to another.
Created business continuity plan (BCP) for application disaster recovery and participated in annual BCP exercise.
Prepared migration document to move the mappings from development to testing and then to production repositories.
Created deployment groups from the Repository Manager with all the Informatica objects to be migrated to the destination machine, and validated post-production deployment.
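The Slowly Changing Dimension Type II time stamping developed in these mappings can be sketched in Python; this is an illustrative in-memory model only, with hypothetical cust_id and risk attributes rather than the production dimension:

```python
from datetime import datetime

def apply_scd2(dim_rows, natural_key, new_attrs, now=None):
    # Type II with time stamping: expire the current version of the
    # row for this natural key, then insert a new current version.
    now = now or datetime.now()
    for row in dim_rows:
        if row["cust_id"] == natural_key and row["eff_end"] is None:
            if row["attrs"] == new_attrs:
                return dim_rows      # unchanged record: no new version
            row["eff_end"] = now     # close out the old version
    dim_rows.append({"cust_id": natural_key, "attrs": new_attrs,
                     "eff_start": now, "eff_end": None})
    return dim_rows

dim = []
apply_scd2(dim, "C100", {"risk": "low"}, now=datetime(2016, 1, 1))
apply_scd2(dim, "C100", {"risk": "high"}, now=datetime(2016, 6, 1))
current = [r for r in dim if r["eff_end"] is None]
```

History is preserved: the expired row keeps its effective date range, and exactly one row per natural key has an open end timestamp.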
Involved in production support to meet SLA targets, monitoring and troubleshooting production issues and tuning performance.
Environment: Informatica PowerCenter 9.6.1, Cognos Analytics 11.1, Oracle 12c, SQL Server 2016, Teradata 16.10, PL/SQL, AutoSys, WinCVS, Erwin, Linux, WinSCP, GitHub, JIRA
Tracfone Wireless Inc., Miami, FL
ETL Developer, July 2014 – July 2015
TracFone Wireless Inc. is a prepaid wireless provider that depends on leading wireless companies such as Cingular, Verizon, and Alltel to provide service to its customers; as a result, the various cell phone companies bill TracFone for the service provided to TracFone customers. This project dealt with the customization and implementation of an Enterprise Data Warehouse for TracFone using Oracle, SQL Server, and flat files.
Responsibilities:
Developed normalized Logical and Physical database models to design OLTP system.
Extracted data from various heterogeneous sources like Oracle, DB2, Flat Files and XML files to load into staging area.
Involved in gathering and analyzing the requirements from business for transforming data to load into reporting tables.
Involved in building the ETL architecture, source to target mapping documents to load data into Database tables.
Developed, updated and maintained DataStage jobs to load data into staging tables and then to Dimensions and Facts.
Used different types of parallel processing stages like Aggregator, Merge, Join, Lookup, Remove Duplicates, Transformer, Copy, Filter, Modify, Pivot Enterprise, Slowly Changing dimension, Surrogate Key Generator, FTP, Difference and Sorter.
Used Type 1 SCD and Type 2 SCD mappings to update slowly Changing Dimension tables.
Involved in creating tables, views and materialized views and modifying existing tables to fit in existing data model.
Designed Server and Parallel DataStage jobs to run in multiple invocations with each instance starting with different parameters to process different data sets.
Wrote a PL/SQL procedure to generate a dynamic SQL query based on the parameter and service used.
Involved in Performance Tuning of Datastage jobs and SQL queries.
Created Oracle range and list partition tables and local bitmap index for fast data retrieval or data update.
Used DataStage Director to schedule, unschedule, run, stop, and reset jobs, clear job logs, clean up job resources, and clear status files.
Implemented job sequence by specifying a job control routine to truncate and load for RCP Process.
Wrote ad-hoc SQL queries depending on the needs of the data user.
Performed export and import of DataStage components, table definitions, and routines.
Prepared technical design, tests (Unit/System), and supported user acceptance testing.
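The dynamic-query generation mentioned above (a PL/SQL procedure assembling SQL from the parameter and service used) can be sketched in Python; the table name, column whitelist, and bind-variable style are illustrative assumptions, not the production code:

```python
# Hypothetical whitelist of filter columns a caller may supply;
# only these names may enter the generated SQL text.
ALLOWED = {"service_code", "bill_month", "carrier_id"}

def build_query(table, filters):
    # Assemble a WHERE clause from whichever parameters were
    # supplied; values go into bind parameters, never concatenated.
    clauses, binds = [], []
    for col, val in sorted(filters.items()):
        if col not in ALLOWED:
            raise ValueError(f"column not allowed: {col}")
        clauses.append(f"{col} = :{len(binds) + 1}")
        binds.append(val)
    sql = f"SELECT * FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, binds

sql, binds = build_query("billing_detail",
                         {"carrier_id": 7, "bill_month": "2015-03"})
```

Whitelisting the column names and binding the values keeps dynamically built SQL safe from injection, the same discipline `DBMS_SQL`/`EXECUTE IMMEDIATE ... USING` encourages in PL/SQL.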
Prepared migration documents to move the DataStage jobs from development to testing and then to production.
Environment: IBM InfoSphere Datastage 8.7, Oracle 11g, SQL Server 2008R2, DB2 V9.7, SQL, PL/SQL, SQL*Loader, Toad, UNIX, PostgreSQL, WinCVS, Erwin, MS-DOS, Shell Scripting
State of Indiana, Indianapolis, IN
ETL Developer, September 2011 – June 2014
The Family and Social Services Administration (FSSA), a health care and social service funding agency, provides consolidated delivery of human services by State government.
Responsibilities:
Implemented Slowly Changing Dimensions Types I, II, and III in dimension tables as per the requirements.
Extensively Used IBM DataStage, Informatica PowerCenter (ETL Tool), Oracle PL/SQL, UNIX and MS Products to Extract, Cleanse, Transform and load data to target systems with right format and level of detail.
Used Erwin for dimensional modeling (Star/Snowflake schema) of the Data Warehouse, processing data into dimensions and measured facts.
Created local and shared containers to facilitate ease and reuse of jobs.
Created and used parameter sets in DataStage server and parallel jobs, and set project defaults so that parameter values can be supplied at run time from the DataStage Administrator.
Created DataStage server jobs by using various stages like Sequential file, Folder, Hashed File, Complex Flat File, InterProcess, Merge, Aggregator and Transformer to extract, massage data and load into database tables.
Created job control to set up a DataStage job, run it, wait for it to finish, and test for success.
Created message handlers to suppress messages from the log and promote informational messages to warnings during DataStage run time.
Involved in building the ETL architecture and Source to Target mapping to convert IBM InfoSphere DataStage to Informatica PowerCenter to load data into Data warehouse.
Ran the DataStage XML export through the Data Transformation Studio tool to get a report on the complexity and level of effort for manual conversion to PowerCenter.
Created and maintained source definitions, transformation rules, and target definitions using Informatica Repository Manager.
Used various transformations like Aggregator, Normalizer, Rank, Router, Filter, Expression, Sequence Generator, Update Strategy, Joiner, Lookup, Stored Procedure, and Union to develop robust mappings in the Informatica Designer.
Developed mapping parameters and variables to support SQL override.
Created mapplets to use same data transformation logic in different mappings.
Used existing ETL standards to develop these mappings to load into staging tables and then to Dimensions and Facts.
Worked on different workflow tasks such as sessions, event raise, event wait, decision, email, command, worklets, assignment, and timer, as well as workflow scheduling.
Created sessions and configured workflows to extract data from various sources, transform it, and load it into the data warehouse.
Extensively used SQL* loader to load data from flat files to the database tables in Oracle.
Used Debugger to test the mappings and fix the bugs, modified existing mappings for enhancements of new business requirements.
Wrote UNIX shell Scripts & PMCMD commands for FTP of files from remote server and execute Informatica workflows.
Involved in Performance tuning at source, target, mappings, sessions, and system levels.
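The PMCMD-driven workflow execution above is typically wrapped in a script that assembles the `pmcmd startworkflow` command line; a hedged Python sketch follows, in which the service, domain, folder, and workflow names are placeholders (the `-pv` option reads the password from an environment variable instead of the command line):

```python
def pmcmd_start(service, domain, user, folder, workflow,
                password_var="PMPASS"):
    # Build the argument list for pmcmd startworkflow; -wait blocks
    # until the workflow completes, and the workflow name is the
    # final positional argument. All connection values are placeholders.
    return ["pmcmd", "startworkflow",
            "-sv", service, "-d", domain,
            "-u", user, "-pv", password_var,
            "-f", folder, "-wait", workflow]

cmd = pmcmd_start("IS_DEV", "Domain_DEV", "etl_user",
                  "FSSA_FOLDER", "wf_stage_load")
```

A wrapper like this would hand `cmd` to `subprocess.run` and inspect the return code, which is how a shell script can chain FTP transfers and workflow runs as the bullet above describes.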
Prepared migration documents to move the mappings from development to testing and then to production repositories.
Environment: Informatica PowerCenter 8.6.1, IBM Datastage 8.5, Cognos 10.1 (Business Insight, Business Insight Advanced), Oracle 11g, SQL Server 2008R2, DB2 V9.7, SQL, PL/SQL, SQL*Loader, SQL Developer, WinCVS, Erwin, MS-DOS, Shell Scripting
IPage Info Tech Inc, Hyderabad, India
Database/Informatica Developer, June 2008 – December 2009
Worked with Informatica PowerCenter and Oracle Database techniques to improve job performance while working with bulk data sources.
Used many transformations such as Source Qualifier, Expression, Lookup, Router, Aggregator, Filter, Sequence Generator, Stored Procedure, Update Strategy, Joiner, and Rank to design the Informatica mappings.
Generated the Surrogate keys for composite attributes while loading data into data warehouse.
Involved in writing data-loading stored procedures and functions using PL/SQL while extracting data from various source systems into the staging area.
Responsible for developing, maintaining and running Informatica workflows manually for ad-hoc data load.
Created the Workflow sessions and used PMCMD commands to run the mappings in sequence.
Involved in creating tables, views, materialized views, tablespaces, indexes and key constraints.
Involved in exporting and importing database backups using the import/export utility.
Created stored procedures using PL/SQL for reconciliation reports.
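The surrogate keys generated for composite attributes in these loads can be sketched in Python; this illustrative generator maps each distinct composite natural key to one stable surrogate key (the key values shown are hypothetical):

```python
def make_key_generator(start=1):
    # Maps each distinct composite natural key (a tuple of source
    # attributes) to one surrogate key; repeated lookups of the
    # same natural key return the same surrogate key.
    cache = {}
    def next_key(*natural_key):
        if natural_key not in cache:
            cache[natural_key] = start + len(cache)
        return cache[natural_key]
    return next_key

sk = make_key_generator()
a = sk("ACME", "2009-06")    # new composite key: gets a new surrogate
b = sk("GLOBEX", "2009-06")  # different composite key: next surrogate
c = sk("ACME", "2009-06")    # repeat: same surrogate key as the first
```

In a warehouse load the cache would be the dimension table itself (natural key columns with a unique index), but the idempotent lookup-or-assign behavior is the same.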
Involved in unit testing and regression testing, and provided support during system testing and user acceptance testing.
Environment: Informatica PowerCenter 8.5, PL/SQL, TOAD, Oracle 10g, Windows 2000, UNIX, Flat files
Education and Training:
Bachelor of Technology in Electronics and Communication, JNTU, Hyderabad, India, August 2004 – April 2008
Master of Science in Computer Science, University of Virginia, Virginia, USA December 2009 – March 2011
Big Data/Hadoop Training – Online 2016
Talend Big Data Integration 2017
Oracle Database SQL Expert
SDLC and ALM process