Krishna Yelisetty
Senior ETL Developer and Consultant
Cell: +1-248-***-****
E-Mail: ***************@*****.***
Professional Summary
10+ years of IT industry experience encompassing a wide range of skills in data warehousing technologies. Worked across a broad range of industries including Banking, Manufacturing, Retail, Accounting and Finance.
Extensive consulting and delivery experience in the implementation and management of large, complex EDW and Data Mart projects.
Sound knowledge of ETL tools and Linux/UNIX; highly skilled in developing, debugging, troubleshooting, monitoring and performance tuning using IBM DataStage Designer, DataStage Administrator and DataStage Director on the Server engine, as well as Talend ETL.
Strong background in designing data warehouse management strategies such as reconciliation, Change Data Capture, dimensional models (star schema and snowflake schema), and database design.
Worked on different kinds of DataStage Stages like Complex Flat File, Dataset, Lookup Fileset, Copy, Oracle Connector, ODBC Connector, Lookup, Join, Aggregator, Sort, Merge, Filter, Transformer and Funnel.
Used different stages of DataStage Designer like Oracle OCI, Lookup, Join, Merge, Funnel, Filter, Copy, Aggregator, Sort, Column Generator, Remove Duplicates, Modify, Transformer.
Translated and maintained business logic in database designs through SQL objects such as stored procedures.
Extensively worked on Parallel Jobs using various stages like Join, Lookup, Filter, Funnel, Copy, Slowly changing dimension, Change Data Capture (CDC), Sequential File, Oracle Enterprise, DB2/UDB, Merge, Transformer.
Implemented SCD Type I/II using DataStage designs and jobs (a brief sketch of the Type II logic follows this summary).
Strong Experience in Unix scripting and PL/SQL scripting on relational databases
Strong in writing UNIX shell, Perl and awk scripts for data validation, data cleansing, etc.
Expertise in Unit testing, Integration testing, back-end testing, System testing, Regression testing and maintenance
Experience in DataStage migrations from 8.1 to 8.5, from 8.5 to 9.1.2, and onward to 11.3.1.1.
Knowledge of installing/upgrading the Information Server product suite on different tiers (server and client) and of post-install configuration.
Checked data for compliance with organizational standards and business rules using QualityStage.
Worked on the Hadoop implementation project on Hortonworks, moving/transforming data from relational databases and files to Hive tables and applying transformations using Pig, Beeline-Hive and Spark.
Worked on the Hadoop migration project from IBM BigInsights to Hortonworks and applied the HERA framework to all Hadoop-related code/jobs.
Hands-on experience with HDFS, YARN, Hive, Pig, Sqoop, Thoosa, Charon, HBase, Spark, Oozie, Python and shell scripting (Bash).
Experience in the Agile project execution model and expertise in coordinating teams across multiple locations.
Extensive work experience in software development life cycle (SDLC) including project initiation, planning & business analysis, requirement gathering, business & architecture process design, design & development, testing, deployment and maintenance of ETL solutions and following Quality Assurance processes
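Illustrative sketch (not part of the original projects): the SCD Type I/II work noted above was implemented with DataStage job designs, but as a language-neutral illustration of the Type II expire-and-insert logic, a minimal pandas version might look as follows. Column names, the tracked attributes and the high end date are assumptions for illustration only.

    import pandas as pd

    HIGH_DATE = pd.Timestamp("9999-12-31")  # assumed "open-ended" end date convention

    def apply_scd2(dim, delta, key, tracked_cols, load_date):
        """Expire changed current rows and append new versions for incoming delta rows."""
        current = dim[dim["current_flag"] == "Y"][[key] + tracked_cols]
        # delta rows that are new keys or whose tracked attributes changed
        changed = delta.merge(current, on=[key] + tracked_cols, how="left", indicator=True)
        changed = changed[changed["_merge"] == "left_only"].drop(columns="_merge")
        # expire the superseded current versions in the dimension
        expire = dim[key].isin(changed[key]) & (dim["current_flag"] == "Y")
        dim.loc[expire, ["end_date", "current_flag"]] = [load_date, "N"]
        # append the new current versions
        new_rows = changed.assign(start_date=load_date, end_date=HIGH_DATE, current_flag="Y")
        return pd.concat([dim, new_rows], ignore_index=True)

    # Hypothetical usage: dim carries start_date, end_date and current_flag housekeeping columns.
    # dim = apply_scd2(dim, delta, "customer_id", ["segment", "branch"], pd.Timestamp("2017-12-01"))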
Technical Skills:
Data Warehousing ETL Tools : DataStage 8.1/8.5/9.1/11.5, IBM InfoSphere Information Analyzer, FastTrack, IBM InfoSphere MDM 11, Information Governance Catalog, Governance Dashboard, Metadata Asset Manager, Data Click, Talend ETL, Informatica
Reporting Tools : QlikView 11, Tableau, Business Objects
Databases : Netezza, DB2 8.1/9.1/9.5, Teradata, Oracle 9.2/10.x and SQL
Data Science/Programming : R Programming Language, Statistics, Python, Regression Models, Machine Learning, Statistical Inference, Exploratory Data Analysis, Reproducible Research
Hadoop Tools : HDFS, Hive, Pig, Charon, Thoosa, Spark, Sqoop, HERA Framework, GitHub, Python
Scheduling : Control-M Scheduler, Autosys, Crontab
Scripting : UNIX Shell Scripting, R, Python, PL/SQL
Other Tools : Aginity, SQuirreL, PuTTY, MobaXterm, Toad, WinSCP, SQL Developer
Operating Systems : UNIX, Linux, AIX, Windows
Certifications:
IBM Certified Solution Developer InfoSphere DataStage
Education:
Bachelor of Technology from JNTU University, Hyderabad, India
Experience Summary:
Dec 2017 – Till Date
Location – Johns Creek, GA
Project 1: Macy's, Inc.
Role : ETL Lead and Developer
Environment : DataStage 11.3, DataStage 11.5, Python, Hive, Pig, SAS Enterprise, Netezza, DB2, Oracle, WinSCP, PuTTY, Aginity, Control-M, Talend ETL
Macy's is an American department store chain, one of two department store chains owned by Macy's, Inc., the other being Bloomingdale's. As of July 2016, the Macy's division operates 669 department store locations in the continental United States, Hawaii, Puerto Rico, and Guam, including the Herald Square flagship location in Midtown Manhattan, New York City. Worked on the data warehousing side of Real Time Offers (RTO) by effectively using the Campaign Management System and its data.
Roles and Responsibilities:
Interaction with business team to understand their requirements and the operational process flow.
Data modeling per the gathered requirements, and interaction with the DBA to implement the model using the ERWIN data modeler.
Analyzed the existing applications and minimized the impact on them.
Preparation of the estimates, time lines of the deliverables and project execution plan
Used DataStage Administrator to create Repository, User groups, Users and managed users by setting up their privileges and profiles.
Used DataStage Designer to design and develop jobs for extracting, cleansing, transforming, integrating, and loading data into different Data Marts
Performance tuning to optimize the performance of the ETL.
Designed and developed jobs using Parallel Extender for splitting bulk data into subsets and to dynamically distribute to all available nodes to achieve best Job performance
Used dsimport, dsexport, dsjob, orchadmin utilities extensively
Transformed the master data (full/delta) from heterogeneous sources and populated the dimension tables after delta processing and surrogate key generation.
Designed jobs using different parallel job stages such as Transformer, Join, Merge, Lookup, Filter, Dataset, Lookup File Set, Remove Duplicates, Change Data Capture, Switch, Modify and Aggregator.
Code reviews, preparing Implementation run sheet, System acceptance documents, Operational acceptance checklist, Support handover documents.
Performed Unit testing, Integration testing and User Acceptance testing for every code change and enhancements in development, QA, Preproduction and Production
Worked on the Control-M Scheduler to create job cycles for the projects.
Interaction and co-ordination with client and offshore team for smooth project execution.
Mentor development team members in design and development of complex ETL and BI implementations.
Perform architectural assessments of the client's Enterprise Data Warehouse (EDW).
Responsible for building and driving alignment to an Enterprise Reference Architecture.
Developed UNIX scripts using Hive and Python shell commands as per the requirements.
Maintained and administered HDFS through Hadoop shell scripting and Python.
Used Python to write scripts that move data across Hive tables using dynamic partitions (see the sketch after this list).
Developed reusable scripts to load/unload/read Hive table data dynamically.
Worked in the Hortonworks Hadoop ecosystem to write Hive scripts, including reusable scripts that process history and source/vendor data into the HDFS layer, apply transformation logic using Pig, and load the data into Hive.
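Illustrative sketch (placeholders only): the dynamic-partition Hive moves mentioned above can be driven from Python by shelling out to the beeline client. The JDBC URL, database and table names, and the partition column below are assumptions, not the project's actual objects.

    import subprocess

    BEELINE_URL = "jdbc:hive2://localhost:10000/default"  # placeholder connection string

    def run_hql(hql):
        """Run an HQL string through beeline; raise if beeline exits non-zero."""
        subprocess.run(["beeline", "-u", BEELINE_URL, "-e", hql], check=True)

    def move_partitioned(src_table, tgt_table, partition_col, columns):
        """Insert src_table rows into tgt_table, letting Hive create partitions
        dynamically from the value of partition_col."""
        hql = (
            "SET hive.exec.dynamic.partition=true; "
            "SET hive.exec.dynamic.partition.mode=nonstrict; "
            "INSERT OVERWRITE TABLE {tgt} PARTITION ({part}) "
            "SELECT {cols}, {part} FROM {src};"
        ).format(tgt=tgt_table, part=partition_col,
                 cols=", ".join(columns), src=src_table)
        run_hql(hql)

    if __name__ == "__main__":
        # hypothetical staging-to-mart move partitioned by load_date
        move_partitioned("stage.offers_raw", "mart.offers", "load_date",
                         ["offer_id", "member_id", "offer_amt"])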
Nov 2015 – Nov 2017
Location – San Antonio, TX
Project 1: United Services Automobile Association (USAA)
Role : ETL Lead and Developer
Environment : DataStage 9.1.2, DataStage 11.5, Python, Hive, Pig, Charon, Thoosa, Tableau, Netezza, DB2, WinSCP, PuTTY, Aginity, Control-M, Informatica
The United Services Automobile Association (USAA) is a Texas-based Fortune 500 diversified financial services group of companies including a Texas Department of Insurance regulated reciprocal inter-insurance exchange and subsidiaries offering banking, investing, and insurance to people and families that serve, or served, in the United States military. Worked on the below mentioned efforts during my tenure with USAA.
Credit card portfolio data mart that helps users track the performance of newly launched products in the market and perform ad-hoc analysis on the mart data.
Bank relationship analytical environment that enables product managers to analyze primary banking members, their scores, and their relationships with the different banking products.
Early month-on-book credit card and deposit portfolio vintages, which help product owners access the data and report the behavior of financial metrics such as transaction usage, balances, revolve rate, transactors, and revolvers.
The Undeniable Value Products effort provides insights into the newly launched credit card and checking products and how they are performing in the market, so that their features can be extended to other products as well.
Roles and Responsibilities:
Preparation of the estimates, time lines of the deliverables and project execution plan
Analyzed the existing applications and minimized the impact on them.
Interaction with business team to understand their requirements and the operational process flow.
Data modeling per the gathered requirements, and interaction with the DBA to implement the model using the ERWIN data modeler.
Used DataStage Designer to design and develop jobs for extracting, cleansing, transforming, integrating, and loading data into different Data Marts
Designed jobs using different parallel job stages such as Transformer, Join, Merge, Lookup, Filter, Dataset, Lookup File Set, Remove Duplicates, Change Data Capture, Switch, Modify and Aggregator.
Performance tuning to optimize the performance of the ETL.
Designed and developed jobs using Parallel Extender for splitting bulk data into subsets and to dynamically distribute to all available nodes to achieve best Job performance
Used dsimport, dsexport, dsjob and orchadmin utilities extensively (a dsjob run-wrapper sketch follows this list).
Transformed the master data (full/delta) from heterogeneous sources and populated the dimension tables after delta processing and surrogate key generation.
Used DataStage Administrator to create Repository, User groups, Users and managed users by setting up their privileges and profiles.
Code reviews, preparing Implementation run sheet, System acceptance documents, Operational acceptance checklist, Support handover documents.
Performed Unit testing, Integration testing and User Acceptance testing for every code change and enhancements in development, QA, Preproduction and Production
Interaction and co-ordination with client and offshore team for smooth project execution.
Mentor development team members in design and development of complex ETL and BI implementations.
Worked on the Control-M Scheduler to create job cycles for the projects.
Perform architectural assessments of the client's Enterprise Data Warehouse (EDW).
Responsible for building and driving alignment to an Enterprise Reference Architecture.
Worked in the Hortonworks Hadoop ecosystem to write Hive scripts, including reusable scripts that process history and source/vendor data into the HDFS layer, apply transformation logic using Pig, and transfer files using Charon under the HERA framework.
Developed UNIX scripts using Hive and Python shell commands as per the requirements.
Maintained and administered HDFS through Hadoop shell scripting and Python.
Used Python to write scripts that move data across Hive tables using dynamic partitions.
Developed reusable scripts to load/unload/read Hive table data dynamically.
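Illustrative sketch (placeholders only): the dsjob usage referenced in the list above can be wrapped in Python roughly as below. The install path, project and job names, and parameter names are placeholders, and the dsjob option set and return-code convention should be verified against the installed Information Server release.

    import subprocess

    DSJOB = "/opt/IBM/InformationServer/Server/DSEngine/bin/dsjob"  # assumed install path

    def run_job(project, job, params=None):
        """Run a job with -wait and -jobstatus so the exit code reflects the job result."""
        cmd = [DSJOB, "-run", "-wait", "-jobstatus"]
        for name, value in (params or {}).items():
            cmd += ["-param", "{0}={1}".format(name, value)]
        cmd += [project, job]
        return subprocess.run(cmd).returncode

    if __name__ == "__main__":
        # hypothetical nightly load for a portfolio mart; 1 = finished OK, 2 = finished with warnings
        rc = run_job("CCPortfolio", "jb_load_dim_account", {"pRunDate": "2017-11-01"})
        if rc not in (1, 2):
            raise SystemExit("DataStage job failed, dsjob return code %d" % rc)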
Nov 2011 – Oct 2015
Project 3: Fiat Chrysler Automobiles (FCA)
Role : ETL Lead and Developer
Environment : Datastage 8.5, Datastage 9.1.2, Datastage 11.3.1.1,
Qlikview, Winscp, Putty, SQLDBx, DB2, Tableau, ETL Talend
Chrysler, in an endeavor to build a best-in-class discovery analytics approach for extracting, transforming per business logic, loading, analyzing and reporting its business and financial information, envisioned deploying DataStage-based ETL applications.
This created an advanced, next-generation analytical layer aligned with the Business Analytics organization. TCS was part of creating and maintaining several critical BI applications to help Chrysler in this area.
This project falls under TCS's Manufacturing practice area and required the application of the international Tata organization's internally developed, proprietary development procedures, methodologies and tools. These tools, methodologies and processes are bespoke applications in themselves and are individually customized by TCS's teams for use in executing projects. TCS worked with the Chrysler Business Analytics group on DataStage-based ETL implementations.
Roles and Responsibilities:
Primary responsibilities include assisting project teams on ETL Architecture /Design / Development / Deployment.
Built Interfaces and automated with Datastage ETL tool and Unix Shell Scripting.
Designed ETL processes and develop source-to-target data mappings, integration workflows, and load processes.
Used many stages like Transformer, Sequential file, Dataset, Remove Duplicates, Sort, Join, Merge, Lookup, Funnel, Copy, Filter, ODBC, DB2, FTP, etc.
Used the Administrator client to create, delete, and configure projects.
Code deployment activities in all environments.
Worked on SAP packs for extracting data from SAP and loading data to SAP.
Responsible for developing the jobs, which involves IDOC Load and IDOC Extract.
Using the SAP DataStage Administrator, checked IDOC logs, generated IDOCs by selecting IDOC types, and cleared IDOC logs when required.
Involved in working with BAPI stage while troubleshooting the issues.
Implemented and ran DataStage jobs in a grid environment.
Designed, developed, and implemented enterprise-class data warehousing solutions.
Review of source systems and proposed data acquisition strategy.
Designed Sequencers to automate the whole process of data loading.
Provided expert solutions to users for problems in DataStage.
Translated the business processes into Datastage mappings.
Collect, analyze, and process data for business clients and senior management using multiple databases or systems, spreadsheets, and other available tools for reporting and/or decision making that have a major impact on various clients / organizations.
Facilitating problem solving and decision making in business groups by building business cases and providing insights through complex analysis using multiple databases or systems.
Collecting and analyzing process, performance, and operational data for generating reports and graphs.
Interpreting user requests to determine appropriate methods and data files for extracting data.
Developed and maintained documentation to client standards / provide feedback on standards where they can be improved.
Helped establish ETL procedures and standards for the objects, implemented performance enhancements, and migrated objects from the Development, QA, and Stage environments to Production.
Oct 2010 – Oct 2011
Project 4: Standard Bank
Role : Datastage Developer
Environment : Datastage 8.1, Oracle, Toad, UNIX, WinSCP
Standard Bank Group is one of the big four full-service South African banks providing services inside and outside the African continent. The group operates in a range of banking and related financial services. The group has a wide representation which spans 17 African countries and 16 countries outside of Africa with an emerging markets focus.
Probe is a behavioral scoring system currently used for Namibia and Nigeria, and is housed in South Africa. The business objective is to provide Account Management capabilities at customer and account levels through the use of Probe. Such capabilities include the determination of BRIs (Behavioral Risk Indicators), confidential limits, pre-scoring and risk grading through the use of decision trees, score cards and strategies.
CACS is a solution, residing at Centre, that helps users (collectors) effectively control the Arrears and Rehabilitations & Recoveries (R&R) areas. The current CACS solution has three source (accounting) systems: Finacle, HP&L and PRIME. ODS, residing in Country, will cater to CACS with Finacle and HP&L data.
Job Responsibilities:
Provided assistance in analyzing the business requirements document and producing the solution design document.
Performed process flow design and prepared the Impact Assessment document with risks and assumptions.
Prepared the detailed design document incorporating the business/transformation rules and obtained the necessary sign-offs.
Designed and developed DataStage jobs to extract, transform and load the required data, achieving end-to-end data migration and integrity using the necessary stages.
Used DataStage Designer to create jobs to extract, transform and load data into the target, and DataStage Director for running, monitoring, scheduling and validating the jobs.
Performed Import and Export of Data Stage components and table definitions using DataStage Manager.
Developed Test Cases and performed Unit Testing, System Integration Testing & UAT support.
Prepared SQL queries to validate data in both source and target systems (a validation sketch follows this list).
Developed Datastage parallel job using different development/debug and processing stages.
Created and maintained the transformation matrix and ETL templates.
Key contributor in Defect Management Process
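Illustrative sketch (placeholders only): the source-versus-target validation queries noted above can be packaged as a small Python reconciliation check. The connections, table names and key column below are placeholders; a real run would use the project's actual Oracle source and ODS target connections (e.g. via a DB-API driver such as cx_Oracle).

    def fetch_one(conn, sql):
        """Run a single-row query on a DB-API connection and return the first column."""
        cur = conn.cursor()
        cur.execute(sql)
        value = cur.fetchone()[0]
        cur.close()
        return value

    def validate(src_conn, tgt_conn, src_table, tgt_table, key_col):
        """Return simple reconciliation checks between a source and a target table."""
        checks = {
            "src_count": fetch_one(src_conn, "SELECT COUNT(*) FROM %s" % src_table),
            "tgt_count": fetch_one(tgt_conn, "SELECT COUNT(*) FROM %s" % tgt_table),
            "tgt_null_keys": fetch_one(
                tgt_conn, "SELECT COUNT(*) FROM %s WHERE %s IS NULL" % (tgt_table, key_col)),
        }
        checks["counts_match"] = checks["src_count"] == checks["tgt_count"]
        return checks

    # Hypothetical usage with DB-API connections:
    #   results = validate(ora_conn, ods_conn, "FINACLE.ACCOUNTS", "ODS.ACCOUNTS", "ACCOUNT_ID")
    #   assert results["counts_match"] and results["tgt_null_keys"] == 0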
Apr 2009 – Oct 2010
Project 5: IBM Australia - Account Receivable Information Warehouse (ARIW)
Role : Developer
Environment : Datastage 7.5, Datastage 8.5, Cognos, Winscp, Putty, UNIX
ARIW is a management information warehouse to support Accounts receivable business objectives, like Open AR, Disputes. It also has a web-based tool for AR management reporting. The tools and Techniques chosen in ARIW ensure that it is a flexible and adaptable application. ARIW offers a collection of Views, Databases and Reports on AR Data that are refreshed daily.
The user population of ARIW consists of IBM Financial and General Management and Lenovo Financial and General Management. ARIW is divided into two sections, one for IBM data and the other for Lenovo data. Access can be granted to each section individually. Users can view their reports via a web browser.
The most important sources of the A/R data in ARIW are
CARS (Common Accounts Receivable Information Systems) built up using IBM Mainframes and Db2.
CDMS (Common Debt Management System) built up using DB2 and Notes.
In addition, ARIW also pulls the data from other country specific Databases and Flat files.
Job Responsibilities:
Provide technical expertise in Data Design/Data Mapping Activities. This involved performing Source to Target Data Mapping, Business Rules, Transformation rules and ETL mapping for implementation in Datastage.
Key contributor in Data Modeling workshops for logical and conceptual data models
Key contributor in Source data Extraction and thereby Analyzing/Profiling the data and defining Data Cleansing rules.
Created Data Conversion Extraction & Migration Flow Design documents by identifying required Data for migration, defining the transformation entities and attributes, cleansing and validation rules
Key contributor in requirement gathering workshops to consolidate Data requirements & Data Formats
Active participant in the weekly operations and issues/risks meetings with internal team members and business users. Minuted the meeting outcome and generated the weekly operations report and issues/risk register for Senior management
Designed and developed Datastage jobs to extract Source Data into staging area
Contributed to all cycles of datawarehouse design and development – Requirement Gathering, Data Modelling, Data Analysis, Cleansing, Profiling, ETL Design & Development.
July 2007 – Apr 2009
Project 6: IBM Australia - AP GBS Datamart
Role : Developer
Environment : DataStage 7.5, DB2
AP GBS Datamart is a central repository containing data extracted from a variety of IBM internal source systems to fulfil a variety of business functions. It provides Partners and GBS Executives with updated information in order to facilitate tracking, monitoring and decision making. It provides clear insights into the revenue, expenses, and projections for all projects (current and future) along with the partner information.
Job Responsibilities:
Create different jobs in Datastage by using several types of stages.
Extraction of data from various sources to the target database DB2
Deliver new and complex high quality solutions to clients in response to varying business requirements
Responsible for effective communication between the team and the customer
Developed ETL jobs as per business rules using ETL design document
Converted complex job designs to different job segments and executed through job sequencer for better performance and easy maintenance.
Used DataStage mappings to load data from source to target.
Enhanced the reusability of the jobs by making and deploying shared containers and multiple instances of the jobs.
Performed Unit testing and System Integration testing by developing and documenting test cases.