DINESH PARDESHI
***************@*****.***
Professional Summary
3.5+ years of overall experience in Hadoop technologies.
Hands-on experience with Hadoop technologies such as HDFS, MapReduce, Hive, and Sqoop.
Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
Moved terabytes of data from RDBMS to HDFS using Sqoop (a representative import command is sketched after this summary).
Hands-on experience in data modelling in Hive.
Experience in CDH environments.
Hands-on experience with Linux shell commands.
Good knowledge of Pig, Flume, Spark Core, and Oozie.
Good communication, interpersonal, and analytical skills, and a strong ability to perform as part of a team.
Willing to update my knowledge and learn new skills according to business requirements.
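For illustration, a minimal sketch of the kind of Sqoop import command referred to above; the connection string, table, and target directory are hypothetical placeholders, not actual project values.

    # Hypothetical import of one RDBMS table into an HDFS staging directory.
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales_db \
      --username etl_user -P \
      --table transactions \
      --target-dir /user/hadoop/staging/transactions \
      --num-mappers 4

Splitting the transfer across several mappers (--num-mappers) parallelizes the import, which is what makes terabyte-scale moves practical.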
Work Experience:
Hadoop Developer, Persistent Systems, Pune, from October 2013 to date.
Technical Skills:
Frameworks : Hadoop.
Big Data Technology : MapReduce, Hive, Sqoop, Flume, Pig, Spark Core & Oozie.
File System : HDFS.
RDBMS : MySQL, Oracle 11g.
Languages : Core Java.
Operating systems : Linux and Windows Family.
Linux Tools : Putty, WinSCP.
IDE : Eclipse, MyEclipse.
Educational Qualifications:
B.C.A. (Bachelor of Computer Applications) from Swami Ramanand Teerth Marathwada University, Nanded.
Professional Experience:
Project # 1
Title : Drug sales and its usage analysis
Period : Oct 2015 – Till Date
Designation: Hadoop Developer
Team Size : 9
Platform : CDH 5.X
Technology : HDFS, Hive, Sqoop, Core Java, Oracle 11g, and Oozie.
Description
The purpose of this project is to support research and development of drugs through analysis of sales and usage data. The project analyses data from a web-enabled product suite that improves communication and optimizes promotional activities for new product launches by pharmaceutical companies. The suite helps medical representatives carry out their daily activities and provides sales forces with a single, centralized location from which to access product information and evaluate competitors.
Roles & Responsibilities:
Managed data coming from relational databases into HDFS using Sqoop.
Wrote HDFS command-line (CLI) commands.
Created Hive tables and loaded them with data; the queries run internally as MapReduce jobs.
Implemented complex Hive queries.
Implemented partitioning of Hive tables (a representative sketch follows this list).
Analysed data in the Hive warehouse using Hive Query Language (HQL).
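For illustration, a minimal HiveQL sketch of the kind of partitioned-table work described above; the table, columns, and paths are hypothetical examples, not the actual project schema.

    -- Hypothetical partitioned table for monthly drug sales data.
    CREATE TABLE IF NOT EXISTS drug_sales (
      drug_id      STRING,
      product_name STRING,
      units_sold   INT,
      revenue      DOUBLE
    )
    PARTITIONED BY (sale_month STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE;

    -- Load one month of staged data into its own partition.
    LOAD DATA INPATH '/user/hadoop/staging/drug_sales/2016-01'
    INTO TABLE drug_sales PARTITION (sale_month = '2016-01');

    -- Analysis query; filtering on the partition column prunes the scan to one month.
    SELECT drug_id, SUM(revenue) AS total_revenue
    FROM drug_sales
    WHERE sale_month = '2016-01'
    GROUP BY drug_id
    ORDER BY total_revenue DESC
    LIMIT 10;

Partitioning by month means a query reads only the HDFS directories for the months it filters on, rather than scanning the whole table.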
Project # 2
Title : Rehousing of Banking Data (Data Migration)
Period : July 2014 – Aug 2015
Designation: Hadoop Developer
Team Size : 8
Platform : CDH 4.X
Technology : HDFS, MapReduce, Sqoop, Hive, Core Java.
Description
RD is a data migration project that moves data from RDBMS sources to a staging area located in HDFS. The purpose of the project is to convert all traditional (RDBMS) data to Hadoop technology and to validate the meta files coming from the RDBMS. The resulting data marts are used in the presentation layer to view the data in a reporting tool.
Roles & Responsibilities:
Responsible for data movement from the client library and relational databases to HDFS using Sqoop.
Created Sqoop jobs with incremental load and populated Hive tables (a representative job is sketched after this list).
Wrote queries in Hive Query Language.
Worked on performance issues and tuned Hive scripts (bucketing, partitioning).
Participated in developing meta files as per requirements.
Involved in validating the migrated data.
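For illustration, a minimal sketch of a saved Sqoop job performing the kind of incremental load described above; the connection string, table, and column names are hypothetical placeholders.

    # Hypothetical saved job that appends only new rows on each run.
    sqoop job --create accounts_incremental -- import \
      --connect jdbc:oracle:thin:@dbhost:1521/BANKDB \
      --username etl_user -P \
      --table ACCOUNTS \
      --target-dir /user/hadoop/staging/accounts \
      --incremental append \
      --check-column ACCOUNT_ID \
      --last-value 0

    # Each execution imports only rows whose ACCOUNT_ID exceeds the last value
    # recorded by the previous run; the saved job tracks that value automatically.
    sqoop job --exec accounts_incremental

The staged files can then be loaded into partitioned and bucketed Hive tables, which is where the tuning work mentioned above comes in.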
Project # 3
Title : Analysis of logs generated by user IP address
Period : Feb 2014 – May 2014
Designation: Hadoop Developer
Team Size : 3
Technology : Hadoop, Hive, Sqoop, MySQL
Description
The given data contains two tables: one holds IP addresses and their corresponding country codes, and the other holds the details of URL requests coming from each IP address, including timestamps. Using Hive, I extracted the top 20 IP addresses generating the most requests and mapped them to their corresponding countries using join operations. This was an internal POC to demonstrate Hadoop skills.
Roles & Responsibilities:
Worked with Sqoop to transfer the data stored in RDBMS to HDFS.
Created both managed and external tables to optimize performance.
Worked with Hive scripts (joins, partitioning); a representative join query is sketched below.
Responsible for analysing data in the Hive warehouse using Hive Query Language (HQL).
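For illustration, a minimal HiveQL sketch of the kind of join and top-N query described above; table and column names are hypothetical.

    -- Hypothetical tables: ip_country(ip_address, country_code) lookup table,
    -- url_requests(ip_address, url, request_time) request log.
    SELECT g.country_code,
           r.ip_address,
           COUNT(*) AS request_count
    FROM url_requests r
    JOIN ip_country g
      ON r.ip_address = g.ip_address
    GROUP BY g.country_code, r.ip_address
    ORDER BY request_count DESC
    LIMIT 20;

With a small IP-to-country lookup table, Hive can usually convert this into a map-side join, avoiding a full shuffle of the lookup data.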