Kumari.L
***********@*****.***
Summary:
7+ years of experience in analysis, design, development, testing, implementation, enhancement, and support of ETL applications, with strong experience in OLTP and OLAP environments as a Data Warehouse/Business Intelligence Consultant. Experience in Talend Open Studio (6.x/5.x) for Data Integration, Data Quality, and Big Data.
Experience working with data warehousing concepts such as OLAP, OLTP, Star Schema, Snowflake Schema, logical data modeling, physical modeling, and dimensional data modeling.
Widespread experience using Talend features such as context variables, triggers, and connectors for databases and flat files.
Hands-on experience with many of the palette components used to design jobs, and used context variables to parameterize Talend jobs.
Experienced with Talend Data Fabric ETL components, including context variables and MySQL, Oracle, and Hive database components.
Tracked daily data loads and monthly data extracts and sent them to clients for verification.
Strong experience in designing and developing Business Intelligence solutions in Data Warehousing using ETL Tools.
Excellent understanding and best practice of Data Warehousing Concepts, involved in Full Development life cycle of Data Warehousing.
Experienced in analyzing, designing and developing ETL strategies and processes, writing ETL specifications.
Involved in extracting users' data from various data sources into the Hadoop Distributed File System (HDFS).
Experience with the MapReduce and Pig programming models, and with installation and configuration of Hadoop, HBase, Hive, Pig, Sqoop, and Flume using Linux commands.
Experienced in using Talend Data Fabric tools (Talend DI, Talend MDM, Talend DQ, Talend Data Preparation, ESB, TAC)
Experienced in working with different data sources like Flat files, Spreadsheet files, log files and Databases.
Knowledge in Data Flow Diagrams, Process Models, E-R diagrams with modeling tools like ERwin & ERStudio.
Extensive experience in J2EE platform including, developing both front end & back end applications using Java, Servlets, JSP, EJB, AJAX, Spring, Struts, Hibernate, JAXB, JMS, JDBC, Web Services.
Strong understanding of data modeling (relational, dimensional, star and snowflake schemas) and data analysis, with implementation of data warehouses on Windows and UNIX.
Hands-on experience across all stages of Software Development Life Cycle (SDLC) including business requirement analysis, data mapping, build, unit testing, systems integration and user acceptance testing.
Worked in all phases of BW/BI full life cycles including Analysis, Design, Development, Testing, Deployment, Post-Production Support/Maintenance, Documentation and End-User Training.
Professional Experience:
Penske Logistics, Beachwood, OH
Feb’ 2016 – Current
Sr. Talend Developer
Responsibilities:
Worked in the Data Integration Team to perform data and application integration, with the goal of moving high-volume data effectively, efficiently, and with high performance to assist business-critical projects.
Developed custom components and multi-threaded configurations for flat-file processing by writing Java code in Talend.
Interacted with Solution Architects and Business Analysts to gather requirements and update the Solution Architect Document. Created mappings and sessions to implement technical enhancements.
Deployed and scheduled Talend jobs in the Administration Console and monitored their execution.
Created separate branches within the Talend repository for development, production, and deployment.
Excellent knowledge of the Talend Administration Console, Talend installation, and the use of context and globalMap variables in Talend.
Reviewed requirements to help build valid and appropriate DQ rules, and implemented the DQ rules using Talend DI jobs.
Created cross-platform Talend DI jobs to read data from multiple sources such as Hive, HANA, Teradata, DB2, and Oracle.
Created Talend jobs for data comparison between tables across different databases, identifying and reporting discrepancies to the respective teams.
Performed Talend administrative tasks such as upgrades, creating and managing user profiles and projects, managing access, monitoring, and setting up TAC notifications.
Analyzed Talend job statistics in AMC to improve performance and identify the scenarios causing errors. Created Generic and Repository schemas.
Performed data manipulations using various Talend components such as tMap, tJavaRow, tJava, tOracleRow, tOracleInput, tOracleOutput, tMSSQLInput, and many more.
Implemented complex business rules by creating reusable transformations and robust mappings using Talend components such as tConvertType, tSortRow, tReplace, tAggregateRow, and tUnite.
Created standard and best practices for Talend ETL components and jobs.
Extracted, transformed, and loaded data from various file formats such as .csv, .xls, .txt, and other delimited formats using Talend Open Studio.
Worked on HiveQL to retrieve data from the Hive database.
Responsible for developing a data pipeline on Amazon AWS to extract data from weblogs and store it in HDFS.
Executed Hive queries on Parquet tables stored in Hive to perform data analysis to meet the business requirements.
Troubleshoot data integration issues and bugs, analyze reasons for failure, implement optimal solutions, and revise procedures and documentation as needed.
Responsible for tuning ETL mappings, workflows, and the underlying data model to optimize load and query performance.
Configured Talend Administration Center (TAC) for scheduling and deployment; created and scheduled execution plans to build job flows.
Worked with production support in finalizing scheduling of workflows and database scripts using AutoSys.
Environment: Talend 6.2.1/6.0.1, Talend Open Studio Big Data/DQ/DI, Talend Administration Center, Oracle 11g, Teradata V14.0, Hive, HANA, PL/SQL, DB2, XML, Java, ERwin 7, UNIX shell scripting.
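The context-variable parameterization and globalMap usage described above can be sketched in plain Java. This is an illustrative analog of Talend's generated code, not code from any specific project; the host name, port, and globalMap key values are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Sketch: Talend jobs read environment-specific settings from a "context"
// (one set of values per Dev/QA/Prod) and share runtime state between
// components through a "globalMap". All names here are illustrative.
public class ContextDemo {
    public static void main(String[] args) {
        // Context variables: swapped per environment without changing the job
        Properties context = new Properties();
        context.setProperty("db_host", "dev-db.example.com"); // hypothetical host
        context.setProperty("db_port", "1521");

        // globalMap: shared key/value store written and read by components
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileList_1_CURRENT_FILE", "daily_load.csv");

        // A component builds its connection string from context variables
        String url = "jdbc:oracle:thin:@" + context.getProperty("db_host")
                + ":" + context.getProperty("db_port") + ":ORCL";
        // A downstream component picks up the current file from globalMap
        String currentFile = (String) globalMap.get("tFileList_1_CURRENT_FILE");

        System.out.println(url);
        System.out.println(currentFile);
    }
}
```

Keeping connection details in context variables is what lets the same job be promoted across the Development/Production branches mentioned above without code changes.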
AmerisourceBergen
Jul’ 2014 – Dec’ 2015
Talend Developer
Responsibilities:
Worked closely with Business Analysts to review the business specifications of the project and to gather the ETL requirements.
Created Talend jobs to copy files from one server to another, utilizing Talend FTP components. Created and managed source-to-target mapping documents for all fact and dimension tables.
Analyzed source data with Talend Data Quality to assess data quality.
Involved in writing SQL queries using joins to access data from Oracle and MySQL. Assisted in migrating the existing data center into the AWS environment.
Prepared ETL mapping documents for every mapping, and a data migration document for smooth transfer of the project from development to testing and then to production environments.
Designed and implemented ETL for data loads from heterogeneous sources to SQL Server and Oracle target databases, including facts and Slowly Changing Dimensions (SCD Type 1 and Type 2).
Utilized Big Data components like tHDFSInput, tHDFSOutput, tPigLoad, tPigFilterRow, tPigFilterColumn, tPigStoreResult, tHiveLoad, tHiveInput, tHbaseInput, tHbaseOutput, tSqoopImport and tSqoopExport.
Used the most common Talend components (tMap, tDie, tConvertType, tFlowMeter, tLogCatcher, tRowGenerator, tSetGlobalVar, tHashInput, tHashOutput, and many more).
Created many complex ETL jobs for data exchange from and to Database Server and various other systems including RDBMS, XML, CSV, and Flat file structures.
Experienced in using debug mode of Talend to debug a job to fix errors.
Responsible for developing, supporting, and maintaining ETL (Extract, Transform, Load) processes using Talend Integration Suite.
Conducted JAD sessions with business users and SMEs for better understanding of the reporting requirements.
Developed Talend jobs to populate the claims data into the data warehouse (star schema).
Used Talend Admin Console Job conductor to schedule ETL Jobs on daily, weekly, monthly and yearly basis.
Worked on various Talend components such as tMap, tFilterRow, tAggregateRow, tFileExist, tFileCopy, tFileList, tDie etc.
Worked Extensively on Talend Admin Console and Schedule Jobs in Job Conductor.
Environment: Talend Enterprise Big Data Edition 5.1, Talend Administrator Console, MS SQL Server 2012/2008, Oracle 11g, Hive, HDFS, Sqoop, TOAD, UNIX Enterprise Platform for Data integration.
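The SCD Type 2 dimension loads mentioned above can be sketched in plain Java (in Talend this would typically be built with tMap lookups and two output flows). The Customer example, field names, and dates are illustrative assumptions, not data from any project.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of SCD Type 2 logic: on an attribute change, the current
// dimension row is closed (end-dated) and a new version is inserted, so
// history is preserved. All names here are illustrative.
public class Scd2Demo {
    static class DimRow {
        String businessKey, address;
        LocalDate effectiveFrom, effectiveTo; // effectiveTo == null -> current version
        DimRow(String key, String address, LocalDate from, LocalDate to) {
            this.businessKey = key; this.address = address;
            this.effectiveFrom = from; this.effectiveTo = to;
        }
    }

    // Apply one incoming record against the dimension table
    static void apply(List<DimRow> dim, String key, String newAddress, LocalDate loadDate) {
        DimRow current = dim.stream()
                .filter(r -> r.businessKey.equals(key) && r.effectiveTo == null)
                .findFirst().orElse(null);
        if (current == null) {
            dim.add(new DimRow(key, newAddress, loadDate, null)); // brand-new member
        } else if (!current.address.equals(newAddress)) {
            current.effectiveTo = loadDate;                       // close old version
            dim.add(new DimRow(key, newAddress, loadDate, null)); // open new version
        } // unchanged -> no action (a Type 1 column would be overwritten instead)
    }

    public static void main(String[] args) {
        List<DimRow> dim = new ArrayList<>();
        apply(dim, "C001", "12 Oak St", LocalDate.of(2015, 1, 1));
        apply(dim, "C001", "98 Elm Ave", LocalDate.of(2015, 6, 1)); // address change
        System.out.println(dim.size());             // prints 2 (both versions kept)
        System.out.println(dim.get(0).effectiveTo); // prints 2015-06-01 (closed)
        System.out.println(dim.get(1).effectiveTo); // prints null (current)
    }
}
```

The Type 1 variant simply overwrites the changed attribute in place, which is why the two patterns are usually mixed column-by-column in the same dimension.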
Barclays, Wilmington, DE
Feb’ 2013 – Jun’ 2014
Talend Developer
Responsibilities:
Involved in design and development of business requirements; analyzed application requirements and provided recommended designs.
Participated actively in end user meetings and collected requirements.
Used Informatica as an ETL tool for extraction, transformation and loading (ETL) of data in the Data Warehouse.
Designed, developed, and documented several mappings to extract the data from Flat files and Relational sources.
Extensively worked on conformed dimensions for the purpose of incremental loading of the target database.
Integrated java code inside Talend studio by using components like tJavaRow, tJava, tJavaFlex and Routines.
Used ETL methodologies and best practices to create Talend ETL jobs.
Experienced in using Talend's debug mode to debug jobs and fix errors. Developed reusable transformations, mappings, and mapplets conforming to the business rules.
Developed Talend jobs to populate the claims data into the data warehouse (star schema). Created workflows, and tested mappings and workflows in development and production environments.
Used shell script to automate pre-session and post-session process.
Used the Debugger to test mappings and fix bugs, and was actively involved in performance improvements of mappings and sessions, fine-tuning all transformations.
Involved in enhancements and maintenance activities of the data warehouse including performance tuning, rewriting of stored procedures for code enhancements.
Developed new and maintained existing Informatica mappings and workflows based on specifications.
Performed Informatica code migration across development, QA, and production environments, and fixed mapping and workflow problems.
Implemented Performance tuning of existing stored procedures, functions, views & SQL Queries.
Environment: Informatica, Talend Platform for Data Management 5.6.1, UNIX scripting, Toad, Oracle 10g
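The Routines mentioned above are plain Java classes of static helpers that tJavaRow and tMap expressions can call. The sketch below shows the shape of such a routine; the helper names and logic are illustrative examples, not routines from any specific project.

```java
// Sketch of a reusable Talend "routine": a class of static helpers callable
// from component expressions. All helpers here are illustrative.
public class StringRoutines {
    // Null-safe trim, so mappings never throw NullPointerException on empty fields
    public static String safeTrim(String s) {
        return s == null ? "" : s.trim();
    }

    // Parse an amount that may arrive as "1,234.50" from a flat file
    public static double toAmount(String s) {
        String cleaned = safeTrim(s).replace(",", "");
        return cleaned.isEmpty() ? 0.0 : Double.parseDouble(cleaned);
    }

    public static void main(String[] args) {
        // In a Talend job this would appear inside a tMap expression, e.g.
        //   StringRoutines.toAmount(row1.amount)
        System.out.println(safeTrim("  ACME  ")); // prints ACME
        System.out.println(toAmount("1,234.50")); // prints 1234.5
    }
}
```

Centralizing such logic in routines keeps the cleansing rules consistent across jobs instead of duplicating expressions in every tMap.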
Apollo Hospital
Jan’ 2011 – Dec’ 2012
Informatica Developer
Responsibilities:
Interacted with the business users on regular basis to consolidate and analyze the requirements.
Identified the entities and the relationships between them to develop a logical model, which was later translated into a physical model.
Used normalization up to 3NF and denormalization for effective performance. Involved in implementing the test cases and test scripts.
Tested data and data integrity among various sources and targets; verified that all data remained synchronized after troubleshooting, and used SQL to verify and validate test cases.
Written Test Cases for ETL to compare Source and Target database systems and check all the transformation rules.
Defects identified in the testing environment were communicated to the developers using the defect-tracking tool HP Quality Center.
Performed verification, validation, and transformations on the input data.
Tested the messages published by Informatica and the data loaded into various databases.
Extracted data from databases like Oracle, SQL server and DB2 using Informatica to load it into a single repository for data analysis.
Worked on multiple data marts in Enterprise Data Warehouse (EDW).
Worked on Informatica PowerCenter Designer tools: Source Analyzer, Target Designer, Transformation Developer, Mapping Designer, and Mapplet Designer.
Worked on Informatica PowerCenter Workflow Manager tools: Task Designer, Workflow Designer, and Worklet Designer.
Designed and developed medium-to-complex Informatica PowerCenter mappings using transformations such as Source Qualifier, Aggregator, Expression, Lookup, Filter, Router, Rank, Sequence Generator, Stored Procedure, and Update Strategy.
Worked as a key project resource taking day-to-day work direction and accepting accountability for technical aspects of development.
Developed the business rules for cleansing/validating/standardization of data using Informatica Data Quality.
Designed and developed multiple reusable cleanse components.
Environment: Informatica, SQL/MS SQL Server, MS Analysis Services, Windows NT, MS Visio, XML.
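The ETL test cases above (comparing source and target systems and checking transformation rules) can be sketched in plain Java. In practice these checks ran as SQL against Oracle, SQL Server, and DB2; the row data and the upper-casing rule below are illustrative assumptions.

```java
import java.util.List;
import java.util.Map;

// Sketch of a source-vs-target ETL test: verify row counts match and that a
// transformation rule (here: name upper-cased) was applied in the target.
// All data and the rule itself are illustrative.
public class EtlCompareDemo {
    public static void main(String[] args) {
        List<Map<String, String>> source = List.of(
                Map.of("id", "1", "name", "alice"),
                Map.of("id", "2", "name", "bob"));
        List<Map<String, String>> target = List.of(
                Map.of("id", "1", "name", "ALICE"),
                Map.of("id", "2", "name", "BOB"));

        // Check 1: row counts match between source and target
        boolean countsMatch = source.size() == target.size();

        // Check 2: transformation rule applied to every row
        boolean ruleApplied = true;
        for (int i = 0; i < source.size(); i++) {
            String expected = source.get(i).get("name").toUpperCase();
            if (!expected.equals(target.get(i).get("name"))) ruleApplied = false;
        }

        System.out.println("counts match: " + countsMatch);
        System.out.println("rule applied: " + ruleApplied);
    }
}
```

Count reconciliation plus rule verification is the minimal pair of checks behind most of the test cases listed above; discrepancies found this way were the defects logged in HP Quality Center.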