Data Hadoop

Pune, Maharashtra, India
March 27, 2020



Highly skilled Hadoop administrator with around * years of experience in Hadoop (Cloudera, HDP, Apache Hadoop). Has extensive knowledge of, and strong abilities in, administering large data clusters in Big Data environments. Highly analytical, with excellent problem-solving skills.


Toll India Logistics Pvt Ltd, Toll Technology Centre. April 2019 till date

Configuring, deploying, maintaining and monitoring Hadoop clusters, and performing various operations while storing, processing and analyzing Big Data on AWS Cloud.

Set up a Hadoop cluster for the production environment on CDH.

Set up pipelines using StreamSets to ingest streaming data from connected devices from several vendors.

Responsible for cluster maintenance, commissioning and decommissioning DataNodes, cluster monitoring, troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop files.

Installed and managed multiple Hadoop clusters: production, staging and development.

Worked with the team to provide hardware architectural guidance, plan and estimate cluster capacity, and create roadmaps for Hadoop cluster deployment.

Overrode Hadoop default configurations for customization.

High availability and backup policies.

Involved in implementing security on Hadoop Cluster.

Experience in Hadoop administration, including setting up and configuring core and customized Hadoop clusters, and installing, configuring, maintaining and monitoring HDFS, YARN, Flume, Sqoop, Pig, Hive, Impala, Kerberos, etc.

Roles and responsibilities include YARN administration.

Used Sentry for authorization.

Expertise in performance tuning and storage capacity management.

Knowledge of Big Data Appliance.

Set up alerts for different services in the Hadoop ecosystem.

Experience integrating with CI tools and methods, including Git, Jenkins, Ansible.

Supporting the Business Intelligence team in processing data.



Worked with the Solution Architect to build StreamSets pipelines for incoming streaming data, and used Kafka to ingest data from the StreamSets pipeline.

Defined Kafka topics to split the data by vehicle type.

Created a pipeline to connect the raw Kafka data to our database.

Worked on data from 5000+ vehicle devices.

Driver and vehicle data were used to understand vehicle behavior.

Data Check.

Working on the Data Check project with the project team.

Created checks to define and detect defects in ongoing production.

Created different pipelines to validate the data from the source to the destination.

Addressed the defects and fixed them.

Completeness, Uniqueness and Validity Checks.

Defining the effort, resources, scope of work and cost on a per-hour basis was one of my responsibilities.

Creed Global Technologies Pvt Ltd, Bangalore. Nov 2014 to April 2019

Telecom Data.

Customer CDR analysis was performed on telecom data that was large in both volume and complexity.

Data analysis was used as a background application to motivate many problems.

The work was based on a cloud system, Tele Data, which combines data mining, social network analysis and statistical analysis with Big Data technologies.


CloudU from Rackspace

IBM Big Data administrator


Native place: Pune

Status: Married

Education: Bachelor of Engineering, E&TC
