
Data Project



Mail ID: aczmzu@r.postjobfree.com Mobile: +1-248-***-****

Professional Summary:

4 years of professional experience with Cognizant & ITC Infotech in software design and application development.

2.5+ years of experience in the Hadoop ecosystem, 1 year in Core Java technologies, and 2+ months in Hadoop testing.

Experience working with top clients Apple Inc. and UnitedHealth Group.

Experience in HDFS, Hive, MapReduce, Pig, YARN, Sqoop, and shell scripting.

Good knowledge of Apache Spark.

Basic knowledge of machine learning.

Knowledge of the QlikView reporting tool.

Hands-on experience with the Apache, MapR, and Hortonworks distributions.

Experience handling development, support, and short-term testing projects.

Domain expertise in the healthcare, manufacturing, and retail industries.

Worked with Agile methodologies.

Big Data certification from Know Big Data (corporate training institute).

Big Data University certifications on Hadoop, taken through Cognizant.

Good understanding of HDFS and YARN architecture.

Educational Qualifications:

Examination: Bachelor of Technology
Discipline/Specialization: Information Technology (76%)

Skill Set:

Languages: Core Java, iText
Hadoop Technologies: HDFS, Hive, MapReduce, YARN, Pig, Sqoop, Apache Spark
Version Control: CVS and SVN
Database: Oracle 11g
Operating Systems: UNIX, Windows XP, Windows 7, Mac OS
Server: Apache Tomcat 6.0 onwards
Domain: Healthcare and Manufacturing
Tools: Eclipse, WinSCP, PuTTY, Bedrock

Trainings Undergone:

Course | From | To | Conducted by | Year
Big Data | 05-Dec-2014 | 30-Dec-2014 | ITC Infotech | 2015
QlikView | 12-Dec-2016 | 23-Dec-2016 | Cognizant | 2016

Professional Experience:

Project 4 (Latest):

Project Title: Piano
Client: Apple Inc.
Duration: Jul-2016 to Jan-2017
Technology: Hive
Environment: Mac OS, Hadoop, MapR

Description:

Piano is a project with two data sources (Piano and ESL). The active data resides in these two databases and goes through a set of transformations to generate end reports. The reports cover Apple subscriptions and summary data for iPods and Apple mobile devices, and are displayed on dashboards.

Apple Inc. is an American multinational technology company headquartered in Cupertino, California, that designs, develops, and sells consumer electronics, computer software, and online services. Its hardware products include the iPhone smartphone, the iPad tablet computer, the Mac personal computer, the iPod portable media player, the Apple Watch smartwatch, and the Apple TV digital media player.

Responsibilities:

Participated in client-level communications to review application-specific requirements.

Validated statistics between the source and target data sources.

Created Hive queries to reconcile data against end-user systems (see the sketch below).

Downloaded and validated reports between the UI and the database.
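
For illustration, here is a minimal sketch of the kind of source-vs-target count reconciliation described above, written against the Hive JDBC driver. The host, credentials, and table names are hypothetical placeholders, not the project's actual configuration.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Hypothetical sketch: compare row counts between a source table and a
    // target report table over Hive JDBC.
    public class CountReconciliation {

        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://hive-host:10000/default", "user", "");
                 Statement stmt = conn.createStatement()) {
                long source = count(stmt, "piano.subscriptions");          // hypothetical source
                long target = count(stmt, "reports.subscription_summary"); // hypothetical target
                System.out.printf("source=%d target=%d match=%b%n",
                        source, target, source == target);
            }
        }

        private static long count(Statement stmt, String table) throws Exception {
            try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + table)) {
                rs.next();
                return rs.getLong(1);
            }
        }
    }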

Project 3:

Project Title: ORCHARD
Client: OPTUM, UHG
Duration: Sep-2015 to Mar-2016
Technology: Java, Hive, MapReduce, Bedrock
Environment: Linux, Hadoop 2.0, MapR

Description:

The Orchard project consists of 9 distinct data sources, among them DB2 and Oracle 11g. Due to the huge volume of data, processing on these legacy databases was extremely slow, making reporting a substantial challenge; this motivated the search for a better file management system and the adoption of the Hadoop ecosystem. Data from these sources is pulled by an ETL tool (DataStage) into an inbox, which adds .ctl and .meta files to each incoming .dat file. The data is then organised, prepared, and ingested into the Hadoop raw zone through the Bedrock data management system, which is responsible for the ingestion process. Once the data is in the raw zone (HDFS), it goes through validation steps, after which it is loaded into Hive tables. For each Hive table, a view is created with the appropriate queries.

Responsibilities:

Participated actively in implementing the solution for the IRIS and Rx Claims requirements.

Implemented validations such as gender validation, file duplication, record count, schema validation, and column validations.

Created Hive tables and loaded data into them.

Automated the process using Bedrock, spending 75% of the time as an individual contributor on the activities below:

o Used Avro MapReduce for schema evolution and its fast serialization mechanism.

o Wrote MapReduce jobs for validations.

o Created Hive external tables and worked on them using HiveQL.

o Created Hive views and provided them to business users.

o Wrote Hive queries for data analysis to meet business requirements.

o Used Hive partitioning for optimization.

o Developed UDFs in Java (see the sketch after this list).

o Involved in the documentation process before code was delivered to the end user.

o Managed jobs and created jobs with Bedrock (Oozie ingestion tool).
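
As an illustration of the UDF work mentioned above, here is a minimal sketch of a Hive UDF (classic org.apache.hadoop.hive.ql.exec.UDF API) for the kind of gender normalization the validations describe. The class name and accepted values are assumptions, not the project's actual code.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical Hive UDF: normalizes free-form gender values to M/F/U.
    public class NormalizeGender extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null; // preserve NULLs so downstream checks can flag them
            }
            String v = input.toString().trim().toUpperCase();
            if (v.startsWith("M")) return new Text("M"); // "MALE", "M", "m" ...
            if (v.startsWith("F")) return new Text("F"); // "FEMALE", "F", "f" ...
            return new Text("U"); // unrecognized value
        }
    }

Once packaged in a JAR, such a function would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries.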

Project 2:

Project Title: Health Data Analytics
Client: Sutter Health
Duration: Dec-2014 to Aug-2015
Technology: HDFS, MapReduce, Hive, Oracle, Shell Scripting
Environment: Hadoop 2.3.0, Oracle

Description:

Sutter Health is a not-for-profit health system in Northern California, headquartered in Sacramento. It includes doctors, hospitals, and other health care services in more than 100 Northern California cities and towns.

There were around 7 data sources, each consisting of around 120-130 files. Data ingestion from these sources is done in multiple steps: data is first ingested as-is from the data sources into the Hadoop layer (landing zone), where data quality checks (DQC) are performed. The DQC include record count checks, column count checks, MD5 checks, etc. After DQC is completed, the files are copied to the raw zone. Data standardization checks include basic transformations of date values, datatype checks, range checks, standardization of values, etc.

Responsibilities:

Developed shell scripts for data quality checks (record count check, column count check, MD5 check; see the sketch below).

Developed a MapReduce framework for applying business rules (datatype checks, range checks) and data standardizations.

Used shell scripts to automate reports.
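
The MD5 check above was implemented in shell scripts; as an illustration, here is an equivalent check sketched in Java. The file paths and the control-file layout are assumptions.

    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.security.MessageDigest;

    // Hypothetical sketch of the MD5 data-quality check: compute the digest of
    // a landed .dat file and compare it with the value in a control file.
    public class Md5Check {

        public static void main(String[] args) throws Exception {
            Path dataFile = Paths.get("landing/claims.dat");        // hypothetical data file
            String expected = Files.readAllLines(
                    Paths.get("landing/claims.md5")).get(0).trim(); // hypothetical control file

            MessageDigest md5 = MessageDigest.getInstance("MD5");
            try (InputStream in = Files.newInputStream(dataFile)) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    md5.update(buf, 0, n);
                }
            }

            StringBuilder actual = new StringBuilder();
            for (byte b : md5.digest()) {
                actual.append(String.format("%02x", b)); // hex-encode each byte
            }
            System.out.println(expected.equalsIgnoreCase(actual.toString())
                    ? "MD5 check passed" : "MD5 check FAILED");
        }
    }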

Project 1:

Project Title: Technical Proposal Solution
Client: Trelleborg Sealing Solutions
Duration: Jul-2013 to Nov-2014
Technology: Core Java, iText
Environment: Hadoop 2.0, Unix

Description:

Trelleborg is a world leader in engineered polymer solutions that seal, damp, and protect critical applications in demanding environments. Its innovative engineered solutions accelerate performance for customers in a sustainable way. The Technical Proposal Tool provides proposal solutions for the automobile sector.

Responsibilities:

Generated PDFs using iText according to the given templates (see the sketch below).

Implemented business logic using Core Java.

Performed coding, unit testing, and bug fixing.

Developed DAOs and services using Hibernate.
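
As a small illustration of the iText work above, here is a minimal sketch assuming the iText 5 API; the actual templates and content were project-specific, and the titles and output path below are placeholders.

    import java.io.FileOutputStream;
    import com.itextpdf.text.Document;
    import com.itextpdf.text.Paragraph;
    import com.itextpdf.text.pdf.PdfWriter;

    // Hypothetical sketch: generate a simple proposal PDF with iText 5.
    public class ProposalPdf {

        public static void main(String[] args) throws Exception {
            Document doc = new Document();
            PdfWriter.getInstance(doc, new FileOutputStream("proposal.pdf"));
            doc.open();
            doc.add(new Paragraph("Technical Proposal"));                    // placeholder title
            doc.add(new Paragraph("Client: Trelleborg Sealing Solutions")); // placeholder line
            doc.close();
        }
    }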

Personal Profile:

Current Visa: H4

DOB: 15-JUNE-1991

Address: 36661 Grand River Ave APT 103

Farmington, MI 48335

Home Phone: +1-248-***-****


