
Data Developer

Location:
United States
Posted:
August 22, 2015


Resume:

Nagarajan Balakrishnan

Email: acrado@r.postjobfree.com

Mobile: +1 - 804-***-****

PROFESSIONAL EXPERIENCE

6.7 years of experience in Information Technology, managing, implementing, and supporting Data Warehousing as an ETL developer/lead.

6 months of experience building a data lake for the Claims Initiation and Updates process.

Cloudera Certified Professional in Hadoop – CCD-410 (Big Data Processing).

Certified MongoDB developer.

2 years of experience developing and managing Hadoop projects using MapReduce with Java/J2EE, Hive, Pig Latin, MongoDB, Sqoop, and Cloudera.

6 Years of experience in Ab Initio, EME, UNIX & Shell scripting, CONTROL-M, SQL SERVER, Teradata, DB2 and Oracle.

3+ years leading ETL programs, handling Ab Initio design and development for complex projects.

Expertise in various modules of the Banking and Financial Services industry, including Recoveries and Basel II (COAF, UK Cards), as well as the Insurance domain, including the claims process for Personal Accident, Marine Cargo, Property, Motor, and Fire Insurance.

Designed, developed, and led the team that built the HR Data Warehouse for Capital One.

Worked on org-level Hadoop POCs, including:

Building an HR Data Warehouse in HDFS using Hive.

XML processing in MapReduce.

File brokering capability.

Feasibility analysis for selecting an ETL tool for Big Data at Capital One.

Proficient and highly skilled in managing and developing software solutions in Waterfall and Agile SDLCs.

Conducted training sessions on Hadoop as well as advanced concepts such as PDL, vectors, and metaprogramming functions.

TECHNICAL SKILL SETS (Big Data and Data Warehouse)

Hadoop-related Big Data technologies: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, MongoDB, Sqoop, Talend with Big Data, DMExpress with Big Data, Hortonworks, Cloudera.

Java technologies: Core Java, JDBC, Java Beans.

Data warehousing: Ab Initio 2.15, 3.05, 3.1, 3.2.4.1.

Scripting: Korn Shell scripting, JavaScript, Python.

Operating systems: UNIX (Sun Solaris), HP-UX, Windows 200x, Windows 9x.

Databases: Teradata 12, DB2, Oracle 10g, SQL Server 2005.

IDE: Eclipse 3.x.

Job schedulers: Control-M, JCL.

DETAILS OF EXPERIENCE (Project Profile)

GEICO Feb 2015 – Till Date

Project: Building Data Lake for Claims Initiation and Updates Process.

Role: Lead Developer.

GEICO, a pioneer in auto insurance in the US with customers spread across the country, has large volumes of data flowing through its claims process each day. To maintain nearly 80 years of history data and to accumulate each day's transactional data, it planned to move to the Big Data world and build a data lake using Hadoop and Ab Initio technologies, for better performance and reduced cost.

Responsibilities:

Involved in requirement gathering to understand the overall claims process.

Created a high-level design approach for a data lake that embraces the existing history data and also meets the need to process transactional data.

Created a generic design architecture for unloading data from roughly 500 tables, which also strips out the noisy data generated by the source.

Coordinated with the data modellers to create Hive tables that replicate the current warehouse table structure.

Led the team in loading the history/dead data (40 years) using Sqoop.

Led the team in creating a generic process and building psets to unload the data from the Oracle GoldenGate transaction DB and load it into the Hive warehouse (source-like stage).

Coordinated with the R/SAS team to perform dead-data analysis on the loaded history data.

Created source-like, enrich, and visualization-consumption stages in the Hadoop ecosystem.

Developed MapReduce jobs in Java for both the enrich process and the visualization-consumption stage, applying the business rules, creating tables in the visualization-consumption stage, and loading the enriched data into Hive tables (a minimal sketch of such a job is shown below).
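The sketch below shows only the general shape of such a map-only enrich job. The field positions, the sample "open claim" rule, and the class names are hypothetical placeholders, not the actual GEICO business rules; the real jobs wrote to directories backing Hive tables.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ClaimEnrichJob {

    /** Map-only job: reads pipe-delimited claim records, keeps open claims and appends a derived flag. */
    public static class EnrichMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws java.io.IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length < 3) return;                   // skip malformed records
            String status = fields[2];                       // assumed: claim status in the third field
            if ("OPEN".equalsIgnoreCase(status)) {           // placeholder business rule
                context.write(NullWritable.get(), new Text(value + "|Y"));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "claim-enrich");
        job.setJarByClass(ClaimEnrichJob.class);
        job.setMapperClass(EnrichMapper.class);
        job.setNumReduceTasks(0);                            // record-at-a-time enrichment, no reduce step
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // source-like stage directory
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // directory an external Hive table can point to
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}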

Tools/Technologies:

Cloudera, JDK 1.7, Hive 1.2.1, Sqoop 1.4.6, Ab Initio 3.2.4.1, UNIX, Oracle 11g, DB2.

Capital One Nov 2013 – Feb 2015

Organisation: Cognizant Technology Solutions

Project: HR Data Warehouse

Role: Onsite Coordinator and Tech Lead.

The HR Data Warehouse is built with all employee-related details, including personal information, payroll information, the hierarchy of reporting managers from the immediate supervisor up to the CEO over an employee's entire tenure in the organisation, department-related information, and appraisal information.

Responsibilities:

Involved in data modelling of the dimension and fact tables and contributed ideas for an efficient design.

Gathered requirements and understood the business functionality.

Created the high-level design and detailed-level design.

Developed the ETL solution onsite for one of the complex dimension tables (Reports-To Hierarchy), which loops through the hierarchy from the immediate supervisor up to the CEO of the organization for the entire tenure of the employee, and completed it with 100% accuracy (see the hierarchy-walk sketch after this list).

Implemented Ab Initio plans for each dimension and fact table, which reduced the number of jobs, made job scheduling considerably simpler, and reduced implementation cost.

Worked directly with the QA team during the testing phase and promptly fixed issues and requirement changes.

Worked directly with the business users during the UAT phase and made the requested changes.

Created stories and tasks based on the scope, ensured they were delivered within the sprint as committed, and drove the offshore team to do the same.

Coordinated with the team to help them fix technical issues and performed code reviews of all offshore deliverables.

Provided a demo to the client at the end of each sprint.

Coordinated with the production team to ensure successful implementation.

Monitored jobs to ensure smooth implementation.
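The Reports-To Hierarchy dimension above was implemented in Ab Initio; the snippet below is only a minimal Java illustration, under assumed field names and a hypothetical employee-to-manager map, of how a supervisor chain can be walked from an employee up to the top of the organization, emitting one row per level.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReportsToChain {

    /** Walks from an employee up the supervisor chain until the top, returning one row per level. */
    static List<String> reportsToRows(String employeeId, Map<String, String> employeeToManager) {
        List<String> rows = new ArrayList<>();
        Set<String> visited = new HashSet<>();          // guards against bad data forming a cycle
        String current = employeeId;
        int level = 1;
        while (employeeToManager.containsKey(current) && visited.add(current)) {
            String manager = employeeToManager.get(current);
            rows.add(employeeId + "|" + level + "|" + manager);
            current = manager;                          // move one level up the hierarchy
            level++;
        }
        return rows;                                    // the CEO has no manager entry, so the loop stops there
    }

    public static void main(String[] args) {
        Map<String, String> mgr = new HashMap<>();
        mgr.put("E3", "E2");   // E3 reports to E2
        mgr.put("E2", "E1");   // E2 reports to E1 (the CEO, who has no manager row)
        System.out.println(reportsToRows("E3", mgr));   // prints [E3|1|E2, E3|2|E1]
    }
}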

Tools/Technologies:

Ab Initio, UNIX, Oracle, EME, HPSM, Control-M.

Hadoop POCs done (as an additional responsibility):

1. HR Data Warehouse in HDFS:

Worked on a POC to build the HR Data Warehouse in HDFS through Hive. HR currently stores its data in an Oracle database, and this POC was initiated to reduce the database and ETL costs. We used Talend and Hive.

Tools/Technologies:

Cloudera, Hive 1.2.1, Sqoop, MapReduce (to pull the data and load it into the source-like stage of the data lake).

2. XML Processing in MapReduce:

We receive XML files from IFM, and processing them in Ab Initio was tedious, so we planned a POC to process them in MapReduce.

Role played in the POC:

Wrote Mahout code to create an XMLInputFormat.

Imported it into the MapReduce namespace, processed the XML data according to the existing Ab Initio functionality, and successfully loaded the processed files into HDFS (a mapper sketch is shown below).
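A minimal sketch of such a mapper is below. It assumes each map call receives one complete XML record (as split out by an XmlInputFormat such as Mahout's), and the element names RecordId and Amount are hypothetical stand-ins for the actual IFM fields.

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/**
 * Receives one complete XML record per map call and flattens a couple of
 * assumed fields into a pipe-delimited line for HDFS/Hive.
 */
public class XmlRecordMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws java.io.IOException, InterruptedException {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(value.toString())));

            // assumed element names; the real feed from IFM will differ
            String id = doc.getElementsByTagName("RecordId").item(0).getTextContent();
            String amount = doc.getElementsByTagName("Amount").item(0).getTextContent();

            context.write(NullWritable.get(), new Text(id + "|" + amount));
        } catch (Exception e) {
            context.getCounter("XML", "BAD_RECORD").increment(1);   // count unparseable records instead of failing
        }
    }
}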

Tools/Technologies:

Cloudera, MapReduce, Mahout code.

3. Enterprise-wide File Brokering Integration:

Did a POC on storing and reading files from the current production servers in HDFS, and successfully proved that the current processes can run on HDFS, including functionality such as purging files based on retention policies for daily, weekly, and monthly files, and receiving files from external systems like TSYS via Enterprise Data Exchange. A sketch of a retention-based purge is shown below.
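The snippet below is a minimal sketch of the retention-based purge idea using the Hadoop FileSystem API; the flat directory layout and the single retention argument are simplifying assumptions, not the POC's actual configuration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RetentionPurge {
    public static void main(String[] args) throws Exception {
        // args[0] = HDFS directory, args[1] = retention in days (e.g. 7 for weekly feeds)
        Path dir = new Path(args[0]);
        long retentionMillis = Long.parseLong(args[1]) * 24L * 60L * 60L * 1000L;
        long cutoff = System.currentTimeMillis() - retentionMillis;

        FileSystem fs = FileSystem.get(new Configuration());
        for (FileStatus status : fs.listStatus(dir)) {
            if (status.isFile() && status.getModificationTime() < cutoff) {
                fs.delete(status.getPath(), false);   // non-recursive delete of the expired file
            }
        }
        fs.close();
    }
}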

Tools/Technologies:

Java (JDK 1.7), UNIX.

4. POC on Org-level ETL Tool Selection for Big Data:

Performed POCs replicating a complex Ab Initio graph, which covers most of the common functionality, using DMExpress and Talend, and submitted a feasibility analysis of both tools.

5. POC on Big Data Functionality Exploration in Talend:

Talend has Hadoop plugin components for Hive, Pig, Sqoop, and HBase, copying files to and from HDFS, etc. Performed extensive functionality analysis exploring Pig, Sqoop, Hive, and HBase using Talend and submitted a deck to the organisation.

6. Knowledge-sharing Sessions:

Conducted trainings on Hadoop and Big Data, helped the teams install base Apache Hadoop and associated components such as Hive, Pig, Sqoop, and HBase, and conducted working sessions on MapReduce and the above components.

Capital One Apr 2012 – Oct 2013

Organisation: Cognizant Technology Solutions

Project: Basel II (COAF), Recoveries

Role: Offshore Tech Lead.

The Recoveries module involves collecting credit card debt from customers. It receives credit card information from TSYS, considers all the factors that tend to keep customers from paying the debt, such as bankruptcy or death, and creates a lossmart for the debts that were not recovered.

Basel II is a compliance project built for regulatory purposes. COAF (Capital One Auto Finance) has a data warehouse that calculates the audit data and sends it to RWA (Risk-Weighted Assets) for audit purposes.

Responsibilities:

Understood the business requirements and led a team of 5 for the deliverables.

Created detailed design from the High Level Design received from Onsite.

Reviewed the Detailed design with Onsite and received baseline approval.

Led the team during the development phase and solved technical issues.

Reviewed unit test cases and ensured all test scenarios were covered.

Migrated the code to the QA box after development and unit testing were complete.

Supported system testing and helped the team fix issues, defects, and requirement changes.

Worked with the Production team for successful implementation and monitored jobs during implementation.

Value adds delivered:

1. Problem Statement:

Before migrating code to QA, a Code Checker validation must be performed to ensure the graph has no violations of the standards. One such validation ensures that Ab Initio components do not have default descriptions. The Code Checker script does not give the component name, which makes it stressful to find the component with the default description in a complex graph.

Resolution:

Created a script that finds the description of each Ab Initio component and, if it is a default description, reports the component name. This works as a pre-code-checker script and saves developers a great deal of time.

2. Problem Statement:

Files pushed from external systems need to undergo data cleansing before the business logic is applied. This involves bad-data and null checks. Previously, a graph had to be created each time to apply whichever DQ checks were necessary.

Resolution:

Created a generic graph that performs the DQ checks (null and bad-data checks) for the fields that need data quality validation. This greatly reduced the time spent creating separate graphs, and because it is a generic graph it is not flagged by the Code Checker during QA migration. The kind of field-level check it applies is sketched below.
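The actual solution is an Ab Initio graph; the Java snippet below is only a minimal illustration of the idea of a reusable null/bad-data check driven by a list of field positions. The delimiter and the bad-data markers shown are assumptions.

import java.util.Arrays;
import java.util.List;

public class GenericDqCheck {

    /** Returns true when the field passes the null and bad-data checks. */
    static boolean isClean(String value) {
        if (value == null) return false;
        String v = value.trim();
        if (v.isEmpty() || v.equalsIgnoreCase("NULL") || v.equals("?")) return false;  // assumed bad-data markers
        return true;
    }

    /** Applies the checks only to the configured field positions of a pipe-delimited record. */
    static boolean recordPasses(String record, List<Integer> fieldsToCheck) {
        String[] fields = record.split("\\|", -1);
        for (int idx : fieldsToCheck) {
            if (idx >= fields.length || !isClean(fields[idx])) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> check = Arrays.asList(0, 2);                       // which fields need DQ checks
        System.out.println(recordPasses("123|abc|2015-08-22", check));   // true
        System.out.println(recordPasses("NULL|abc|2015-08-22", check));  // false
    }
}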

CITI Bank Mar 2011 – Apr 2012

Organisation: Tata Consultancy Services

Project: Fraud Auth

Role: Offshore Developer

The Fraud Authorization module is developed to identify fraudulent customers based on their credit scores. Files are received from the credit bureau, and based on SCODEINT and complex logic, the fraud authorization is calculated and loaded into the table.

Responsibilities:

Supported requirement gathering and understanding of the business functionality from the TPR (Technical Project Request) document.

Developed graphs according to the business requirements.

Analyzed source system file layouts and wrote DML for extracting data from various sources such as flat files, tables, mainframe copybooks, and responder layouts.

Involved in analyzing the data transformation logic, implementing mappings, and loading data into the target database through Ab Initio graphs.

Developed UNIX shell scripts for automation processes.

Involved in fixing unit and functional test case/data defects.

Analyzed the existing application and identified improvements.

Environment: Ab Initio 3.03, Teradata 12, UNIX (Sun Solaris, Korn Shell).

CITI Bank Jan 2009 – Mar 2011

Organisation: Hewlett Packard Global India Pvt Ltd.

Project: CDW, CR&G

Role: Offshore Developer

The Card Data Warehouse module houses all credit card, debit card, and forex card information. It captures every single customer transaction, and the data warehouse helps BI generate monthly statements.

The CR&G module has all survey-related data, such as campaign information and customer satisfaction surveys. It is loaded into Teradata tables, which help business users make decisions to improve the business in those grey areas.

Responsibilities:

Developed graphs according to the TPR document.

Analyzed source system file layouts and wrote DML for extracting data from tables.

Involved in analyzing the data transformation logic, implementing mappings, and loading data into the target database through Ab Initio graphs.

Developed UNIX shell scripts for automating processes.

Involved in fixing unit and functional test case/data defects.

Analyzed the existing application and identified improvements.

Prepared result-analysis graphs to validate business test case results.

Performance tested all graphs against huge volumes of data.

Environment: Ab Initio 3.03, Teradata 12, UNIX (Sun Solaris, Korn Shell), MOSS 2010.

Other tools known: Hadoop 1.1.1, Hive, Pig, HBase, Sqoop, Flume, DMExpress (Big Data), Talend (Big Data), MongoDB; also attended training in Cognos 8.4, MicroStrategy 9, and DataStage 8.x parallel jobs.

Awards and Recognition:

Received the Synergizer 2013 award for vibrant team leadership, consecutively delivering zero-defect code, and contributing value adds as an additional responsibility.


