Experience Summary
Having ** years of experience in System Analysis, Design, Development, Testing, Implementation and Maintenance of business applications using Hadoop ecosystems, Mainframe applications and Java.
Very good understanding of Hadoop architecture, HDFS, the MapReduce 2.0 programming model, and the Hadoop 2.0 daemons: NameNode, DataNode, ResourceManager, NodeManager and Proxy Server.
Working experience with the MapReduce programming model and the Hadoop Distributed File System (HDFS).
Hands-on experience in installing, configuring and using Hadoop ecosystem components such as Hive, HBase, Pig, Sqoop, Flume, Oozie and Spark for scalability, distributed computing, streaming and high-performance computing.
Experience in managing and reviewing Hadoop log files.
Experience in importing and exporting data using Sqoop from HDFS to Relational DB Systems and vice-versa.
Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java (a minimal MapReduce sketch follows this summary).
Extending Hive and Pig core functionality by writing custom UDFs.
Experienced in migrating MapReduce programs to Spark transformations using Spark and Scala.
Very good understanding and working knowledge of Object-Oriented Programming (OOP) and multithreading.
Experience in Core Java, Web Services, JDBC, MySQL, Oracle, DB2, IMS-DB.
Expertise in creating databases, users, tables, views, stored procedures, functions, joins and indexes in Oracle DB.
Expert in building, deploying and maintaining applications.
Very good data analysis and data validation skills, with good exposure to the entire Software Development Lifecycle (SDLC), CoE-centric and Agile methodologies. Good at preparing SOWs and amendments.
Acted as Scrum Master for Product teams with a focus on guiding teams towards improving the way they work.
Offer outstanding talents in resource loading (recruiting/staffing), team building, developing project scope (budget, timelines and delivery dates), cost avoidance, continuous design improvements and customer relationships.
Involved in review meetings with Project Managers, Developers and Business Associates for projects.
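A minimal, hedged MapReduce 2.0 sketch in Java is shown below to illustrate the kind of custom MapReduce program referenced in the summary above; the class names, word-count logic and input/output paths are illustrative assumptions, not code from any of the projects that follow.

    // Minimal word-count style MapReduce 2.0 job (illustrative sketch only).
    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer tokens = new StringTokenizer(value.toString());
                while (tokens.hasMoreTokens()) {
                    word.set(tokens.nextToken());
                    context.write(word, ONE); // emit (token, 1) for every token
                }
            }
        }

        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum)); // total count per token
            }
        }

        public static void main(String[] args) throws Exception {
            // args[0] = HDFS input path, args[1] = HDFS output path (must not already exist).
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }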
TECHNICAL SKILLS:
Big Data Ecosystem: HDFS, MapReduce 2.0, YARN, Hive, Pig, Sqoop, HBase, Oozie, Flume, Spark, Scala, Hue, Python
Programming Languages: Java, PL/SQL, Bash scripting, COBOL, PL/1, JCL, UNIX/Linux shell scripts
Databases: IMS-DB, DB2, MS SQL Server, Oracle, MS Access and MySQL
Platforms: Windows 95/98/NT/2K/XP/Win7, UNIX, Linux, MVS, z/OS
IDEs and Tools: Eclipse, PuTTY, MS Visio
Mainframe Tools & Utilities: ENDEVOR, XPEDITER, FILE-AID, IMS-FILEAID, SPUFI, QMF, ABEND-AID, IBM Data Studio
Development Approach: SDLC, Agile, CoE
Operating Systems: Windows NT/9x/2000, UNIX, Linux, z/OS 1.6
Defect Tracking Tools: HP Quality Center (QC), HP Application Lifecycle Management (ALM)
Version Control: SVN, Endevor and Changeman
Other Tools: PuTTY, WinSCP, VersionOne
Professional Profile:
Aug 2016 to Present
Sep 2010 to Jul 2016
Nov 2006 to Aug 2010
May 2005 to Nov 2006
Project Experience:
Project # 1
Project Title: Enterprise Data Warehouse
Duration: Aug 2016 to Present
Client: CVS Health
Environment: Linux, HBase, Hive, Pig, Sqoop, Spark, Scala, Oozie, Kafka, PL/SQL, Windows NT, UNIX Shell Scripting and SQL Server
Domain: Pharmacy
About Client:
CVS Health is an American retail pharmacy and health care company headquartered in Woonsocket, Rhode Island. The company began in 1963 with three partners who grew the venture from a parent company, Mark Steven, Inc., that helped retailers manage their health and beauty aid product lines. The business began as a chain of health and beauty aid stores, but within several years, pharmacies were added. To facilitate growth and expansion, the company joined The Melville Corporation, which managed a string of retail businesses. Following a period of growth in the 1980s and 1990s, CVS Corporation spun off from Melville in 1996, becoming a standalone company trading on the New York Stock Exchange as NYSE: CVS.
Project Description:
EDW is a downstream application that takes data from RxClaim and stores it in the data warehouse, where the data is used by the client and other systems for reporting purposes. EDW accepts feeds from different adjudication systems such as RxClaim, Recap, QL and Pharmacare, as well as feeds from foreign claims, extracts and external claims. All the data from these adjudication systems arrives in the form of flat files. Along with each data file there is a trigger file, which arrives immediately after the data file is received. The trigger file also contains audit information, such as the total number of records in the data file, which is used for the preliminary audit of the incoming data.
Role and Contribution as a CVS Tech Lead
Led the EDW application, directly interacting with the operational users in the Client Statements to gather the functional specifications and understand them to build the technical specifications.
Involved in software architecture, detailed design, coding, testing and creation of functional specs of the application, especially for insert/message/special handling/forcing.
Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in HBase.
Responsible for building scalable, distributed data solutions using Hadoop.
Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and Pig.
Developed Spark scripts using the Scala shell as per the client requirements.
Installed and configured Hadoop and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
Developed Pig and Hive scripts for transforming/formatting the data as per business rules.
Used data ingestion tools such as Sqoop to import data from Oracle to HDFS and vice versa.
Involved in loading data from UNIX file system to HDFS.
Hands-on experience in writing Hadoop MapReduce jobs to implement the core logic using the Java API, Pig scripts and Hive queries.
Built reusable Hive UDF libraries for business requirements, enabling users to call these UDFs from Hive queries (a minimal UDF sketch follows this project's bullets).
Involved in migrating the Hive metastore from the default Derby database to MySQL.
Loaded the processed data into the Hive warehouse, enabling business analysts and operations groups to write Hive queries.
Good understanding of and exposure to Hadoop cluster administration.
Used various performance optimization techniques to make the processes run faster.
Developed a suite of unit test cases for Mapper, Reducer and Driver classes using the MRUnit testing library.
Provided support in answering concerns raised by end customers using Hive queries.
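The reusable Hive UDF bullet above is illustrated by the minimal sketch below; the class name, function name and the code-normalization logic are assumptions for illustration, not the actual EDW UDF library.

    // Minimal custom Hive UDF sketch (illustrative only).
    // Example registration in Hive:
    //   ADD JAR hive-udfs.jar;
    //   CREATE TEMPORARY FUNCTION normalize_code AS 'com.example.udf.NormalizeCode';
    package com.example.udf;

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public final class NormalizeCode extends UDF {

        // Trims whitespace and removes dashes so differently formatted codes
        // coming from the claim feeds can be joined consistently in HiveQL.
        public Text evaluate(Text raw) {
            if (raw == null) {
                return null;
            }
            String cleaned = raw.toString().trim().replace("-", "");
            return cleaned.isEmpty() ? null : new Text(cleaned);
        }
    }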
Project # 2
Project Title: Optumera
Duration: Apr 2014 to Jul 2016
Client: Best Buy
Environment: Linux, Hadoop, MapReduce 2, HBase, Hive, Pig, Sqoop, Oozie, Kafka, PL/SQL, Windows NT, UNIX Shell Scripting
Domain: Retail
About Client:
Best Buy is an American multinational consumer electronics corporation headquartered in Richfield, Minnesota, a Minneapolis suburb. It operates in the United States, Mexico, and Canada. Best Buy sells consumer electronics and a variety of related merchandise, including software, video games, music, mobile phones, digital cameras, car stereos and video cameras. In FY 2016, Best Buy operated 1,073 Best Buy and 350 Best Buy Mobile stand-alone stores in the US. Best Buy also operated: 136 Best Buy and 56 Best Buy Mobile stand-alone stores in Canada; and 18 Best Buy stores and 6 Best Buy Express stores in Mexico.
Project Description:
Optumera is a shopper-centric merchandising suite built on expertise in merchandise optimization. It helps address the need for holistically driven, shopper-centric merchandising decisions.
Backed by strong domain experience in markdown, space and assortment optimization, Optumera makes the merchandising process shopper-centric and provides faster return on investment, better performance and customer delight.
Assortment Optimization: This module helps the retailer create localized optimized assortments. It considers factors such as item performance, market performance, inventory, space, customer decision trees, choice sets, and loyalty parameters to provide recommendations. This makes the assortment shopper-centric, relevant and profitable.
Consumer Decision Tree and Demand Transfer: This helps the retailer understand customer choices better by providing meaningful insights into customer shopping patterns. Further, it enables the retailer to identify complementary and substitutable products in the category to make prudent assortment and space decisions.
Clustering: This module helps the retailer to create store groups based on various objectives like Shopper Trip-based Clustering. It enables the retailer to create category specific clusters for tailored assortments, better replenishment, profitable pricing strategies, and accurate forecasting.
Macro Space Optimization: This enables the right spacing of categories in the store. It takes into consideration category performance, market dynamics, customer preferences, store characteristics, category roles, and dynamics. It recommends execution-friendly category space reallocation to maximize returns.
Role and Contribution as Hadoop Data Integration Lead & Scrum Master
Expert in Hadoop configuration and MapReduce programming; designed and developed MapReduce jobs for the data integration of Assortment Optimization.
Responsible for building scalable, distributed data solutions using Hadoop.
Installed and configured Hadoop and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
Developed Pig and Hive scripts for transforming/formatting the data as per business rules.
Used data ingestion tools such as Sqoop to import data from Oracle to HDFS and vice versa.
Involved in loading data from UNIX file system to HDFS.
Hands-on experience in writing Hadoop MapReduce jobs to implement the core logic using the Java API, Pig scripts and Hive queries.
Built reusable Hive UDF libraries for business requirements, enabling users to call these UDFs from Hive queries.
Good understanding of and exposure to Hadoop cluster administration.
Used various performance optimization techniques to make the processes run faster.
Developed a suite of unit test cases for Mapper, Reducer and Driver classes using the MRUnit testing library (a minimal MRUnit sketch follows this project's bullets).
Provided support in answering concerns raised by end customers using Hive queries.
Experience in software product development across all phases of the Project Development Life Cycle (PDLC), with in-depth experience in UI technologies such as HTML and JavaScript.
Managed and reviewed Hadoop log files.
Gained in-depth understanding on various merchandising concepts.
Shared responsibility for administration of Hadoop, Hive and Pig.
Experienced in defining job flows using Oozie
Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
Coached team members on Agile principles and provided general guidance on the methodology.
Engaged with other Scrum Masters to increase the effectiveness of the application of Scrum in the organization.
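The MRUnit test bullet above is illustrated by the minimal sketch below, assuming the "MR Testing library" refers to Apache MRUnit; TokenizerMapper and the sample input/output pairs are assumptions for illustration, not the actual Optumera test suite.

    // Minimal MRUnit sketch for unit-testing a Mapper class (illustrative only).
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Before;
    import org.junit.Test;

    public class TokenizerMapperTest {

        private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

        @Before
        public void setUp() {
            // TokenizerMapper is a hypothetical word-count style mapper under test.
            mapDriver = MapDriver.newMapDriver(new TokenizerMapper());
        }

        @Test
        public void emitsOneCountPerToken() throws Exception {
            mapDriver.withInput(new LongWritable(0L), new Text("sku sku"))
                     .withOutput(new Text("sku"), new IntWritable(1))
                     .withOutput(new Text("sku"), new IntWritable(1))
                     .runTest();
        }
    }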
Project # 3
Project Title: Target Corporation – IMN Services
Duration: Sep 2010 to Mar 2014
Client: Target, USA
Environment: Java, J2EE, XML, AJAX, UNIX Shell Scripts, Apache Ant, Adobe Flex, JAXB, Apache Ivy, JDBC, Eclipse, SQL, SQL Server 2000, Windows NT, UNIX
Domain: Retail
Project Description:
The Item Management (IMN) system contains the core business functionality used to define and maintain each product bought and sold in Target stores. The new IMN application will replace the existing business system (GMS Item) to overcome the identified drawbacks in the GMS Item system. Item, a prime attribute in the retail industry, is set up in Target using the IMN system. IMN deals with 72 inbound systems and more than 200 outbound systems. This program covers Conversion, DB-SYNC, Application Development, Comparison, Testing, Life Cycle Testing, Like Item Testing, Program/Warranty Support and CAGEN Rewrite.
The purpose of the new Item Management (IMN) system is to provide a production set-up and maintenance solution to support planning, acquiring, producing and the sale of merchandise. The new Item Management system will be architecturally more flexible and scalable than the current item system to address current opportunities and constraints without sacrificing performance.
Role and Contribution as a Java Developer
Led the IMN application, directly interacting with the operational users in the Client Statements to gather the functional specifications and understand them to build the technical specifications.
Involved in software architecture, detailed design, coding, testing and creation of functional specs of the application, especially for insert/message/special handling/forcing.
Developed functionality using new Java 1.5 features such as generics, the enhanced for loop and enums. Developed the functionalities using Agile methodology.
Used multithreading concepts in Java to design the application to support multiple users processing the inserts/messages during the month-end.
Developed a Java exception handling framework for the whole system.
Created wrapper classes for Java collections.
Implemented the Hibernate ORM framework instead of traditional JDBC code.
Created and injected Spring services, Spring controllers and DAOs to achieve dependency injection and to wire objects of business classes (a minimal dependency-injection sketch follows this project's bullets).
Integrated the IMN application with the upstream applications through JMS, WebSphere MQ, SOAP-based web services, and XML.
Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for SQL Server database.
Tuned SQL statements, Hibernate mappings and the WebSphere application server to improve performance and consequently meet the SLAs.
Prepared builds, deployed them and coordinated with the release management team to ensure that the proper process was followed during the release.
Provided end-to-end support for the testing activities during System Testing and UAT.
Provided production support for the application and handled critical issues in a timely manner by analyzing and writing SQL queries in SQL Server.
Continuously learned Agile/Scrum techniques and shared findings with the team.
Final review of all deliverables.
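The Spring dependency-injection bullet above is illustrated by the minimal sketch below; ItemDao, ItemService and the stubbed lookup are assumptions for illustration, not the actual IMN code.

    // Minimal Spring dependency-injection sketch (illustrative only).
    import org.springframework.context.annotation.AnnotationConfigApplicationContext;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    public class DependencyInjectionSketch {

        /** Hypothetical data-access interface wired into the service layer. */
        public interface ItemDao {
            String findDescription(long itemId);
        }

        /** The service receives its DAO through the constructor, so Spring wires it. */
        public static class ItemService {
            private final ItemDao itemDao;

            public ItemService(ItemDao itemDao) {
                this.itemDao = itemDao;
            }

            public String describe(long itemId) {
                return "Item " + itemId + ": " + itemDao.findDescription(itemId);
            }
        }

        @Configuration
        public static class AppConfig {
            @Bean
            public ItemDao itemDao() {
                // Stand-in for a JDBC/Hibernate-backed DAO.
                return new ItemDao() {
                    public String findDescription(long itemId) {
                        return "sample description";
                    }
                };
            }

            @Bean
            public ItemService itemService(ItemDao itemDao) {
                return new ItemService(itemDao);
            }
        }

        public static void main(String[] args) {
            AnnotationConfigApplicationContext ctx =
                    new AnnotationConfigApplicationContext(AppConfig.class);
            System.out.println(ctx.getBean(ItemService.class).describe(42L));
            ctx.close();
        }
    }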
Project # 4
Project Title: General Motors
Duration: Nov 2006 to Aug 2010
Client: General Motors (GM), USA
Environment: IBM S/390, z/OS, Windows NT, JCL, DB2, IMS DB, COBOL, ENDEVOR, XPEDITER, FILE-AID, SPUFI, ISPF, Visio Client, CA-7, VSAM, SAM
Domain: Manufacturing and Production
About Client:
General Motors Corp. (NYSE: GM), the world's largest automaker, has been the global industry sales leader for 75 years. Founded in 1908, GM today employs about 327,000 people around the world. With global headquarters in Detroit, GM manufactures its cars and trucks in 33 countries and its vehicles are sold in 200 countries. In the first half of 2006, GM sold 4.6 million cars and trucks globally. In the first six months of 2006, GM sold 1.07 million passenger cars and light commercial vehicles in Europe, achieving a market share of 9.3 percent. GM's largest national market is the United States, followed by China, Canada, the United Kingdom and Germany.
Project Description:
The main role of ECoC (European Certificate of Conformity) is to print the certificate of registration. To support the harmonized procedure for type-approving motor vehicles and their trailers, the European Directive 70/156/EEC was adopted.
In accordance with COP (Conformity of Production), a process has been established to guarantee that every single vehicle is manufactured in full conformance with the legal requirements of the European market.
As a result of European Union legislation, new versions of passenger cars produced from model year 1996 onwards and having European Type Approval legally require a European Certificate of Conformity (ECoC) for all EU and EFTA countries. In order to meet these new requirements, the ECoC system was implemented in 1995.
Role and Contribution as a Database Analyst
Offered DB2 DBA support for the application development team.
Ensured integrity, availability and performance of DB2 database systems by providing technical support and maintenance.
Monitored database performance and recommended improvements for operational efficiency.
Assisted in capacity planning, space management and data maintenance activities for the database system.
Performed database enhancements and modifications as per requirements.
Performed database recovery and backup tasks on a daily and weekly basis.
Developed and maintained patches for database environments.
Identified and recommended database techniques to support business needs.
Maintained database security and disaster recovery procedures.
Performed troubleshooting and maintenance of multiple databases.
Resolved database issues in an accurate and timely fashion.
Monitored databases regularly to check for errors such as existing locks and failed updates.
Oversaw utilization of data and log files.
Managed database logins and permissions for users.
Project # 5
Project Title: Parker Hannifin Development
Duration: May 2005 to Nov 2006
Client: Parker Hannifin, USA
Environment: COBOL, PL/I, TELON, JCL, DB2, IMS-DB, FILE-AID, XPEDITER, ENDEVOR, SDSF, ISPF, ALCHEMIST
Domain: Manufacturing
Project Description:
Any unsolicited change to an order detail date, quantity and/or release code could create production issues if that change exceeds what is anticipated for that customer. The customer service person may not be aware of these activities until potential problems arise, which could result in over- or under-shipping due to scheduling issues. The new EDI Order Activity will post all changes that occur to the order item bucket data, whether made by EDI 830, 862 or 850 transactions or by manual order detail maintenance that updates the item dates, quantities or release codes.
MSS divisions that use the EDI Integrated 850/830/862 processes to add or update the customer’s order need to be aware of an exceptional date and/or an increase or decrease of a sales order dollar value due to a schedule quantity applied to the order. With the ‘Replace All’ function utilized, the customer service person may not see a hard copy of the transaction and may not be aware of this exceptional change for a number of days. Errant data entry may go undetected and result in issues affecting the on-time delivery of product.
Role and Contribution as an Analyst / Developer
Reviewed the requirements sent by the client/onsite coordinators.
Coded as per the Parker standards.
Prepared the Impact Analysis documents and test plans (UTP).
Involved in analysis, design, coding and testing.
Analyzed programs that abended and prepared the analysis reports.
Documentation, code reviews, unit testing and integration testing.
Participated in implementation and post-production product warranty activities.
Educational Qualification
Master of Science in Information Systems (MScIS), October 2003, from Sri Venkateswara University, Tirupati.
Bachelor of Computer Applications (B.C.A.), April 2001, from Sri Krishnadevaraya University, Anantapur, specialized in Computer Science.