
Manager Hadoop

Pune, Maharashtra, India
as per the company norms
May 04, 2019



Zaid Ahmed Shaikh



Professional Summary:

Hadoop Admin professional with 5+ years of experience and expertise in Apache Hadoop, the Cloudera Enterprise distribution of Hadoop, and the Hortonworks Data Platform ecosystem.

Expert knowledge in setting up and managing Hadoop clusters, on-premises and in the cloud.

Hands-on with core admin tasks such as enabling cluster security for HDFS, YARN, Hive, etc., setting up client configuration files, and cluster planning and setup.

Skills Profile:


Operating Systems: Linux/Unix, Windows, Ubuntu, CentOS, Virtual Machines, AWS Cloud

Hadoop Distribution: Hortonworks, Cloudera, Apache Hadoop, MapR

Hadoop Ecosystem: Cloudera Hadoop (5.x), MapReduce, Apache Hadoop, HDFS, HBase, Flume, YARN, Hive, Spark, Sentry, and Kerberos with AD

Databases: MySQL, Oracle, PostgreSQL

Scripting: Shell Script

Professional Experience:

Organization: Dreamcaredevelopers, Pune

Client: India-based

Domain: Sensor data

Project: IoT (industry-based)

Role: Sr. Hadoop Administrator

Duration: 2016 – till date

29-node cluster for an industry-based client. The engagement started with cluster planning and capacity planning; after discussing the design with the customer, I was involved in setting up the cluster as well as supporting it 24x7 to keep it up and running. Set up security for the cluster using Kerberos and Sentry. Because sensitive data was stored on the cluster, it was critical to secure both the cluster and the data so that only the right people had the right access.


Set up the Cloudera cluster in an in-house environment.

Configured Pentaho and vsftpd.

Monitored the cluster to ensure it stayed up and running.

Monitored storage utilization.

Set quotas for users.
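Setting HDFS quotas for users, as mentioned above, is typically done with `hdfs dfsadmin`; the path and limits below are illustrative, and the commands assume HDFS superuser access on a live cluster:

```shell
# Cap the number of names (files + directories) in a user's home directory
# (path and limits are illustrative).
hdfs dfsadmin -setQuota 1000000 /user/alice

# Cap raw disk consumption (counted across all replicas) at 5 TB.
hdfs dfsadmin -setSpaceQuota 5t /user/alice

# Verify the quotas currently in effect.
hdfs dfs -count -q -h /user/alice

# Clear both quotas again if needed.
hdfs dfsadmin -clrQuota /user/alice
hdfs dfsadmin -clrSpaceQuota /user/alice
```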

Organization: Dreamcaredevelopers, Pune

Client: Saudi Arabia-based

Domain: Post-paid billing

Project: Telecom domain

Role: Hadoop Administrator


Managed a 12-node cluster on AWS for a telecom client. Performed daily activities related to cluster monitoring and management. 10–12 GB of data was ingested daily and the cluster was used by multiple users, so it was very important that the cluster's health stayed good. Users were also expected to execute multiple queries, hence the focus was also on optimising the cluster for performance.


Configured and set up the Hortonworks cluster.

Monitored the cluster to ensure it stayed up and running.

Performed Linux operations such as setting quotas; identifying bad disks, commenting them out, and unmounting them; setting up crontab entries as required; compressing old logs; and rebooting hosts.

Assigned privileges to users using Ranger and ACLs.

Monitored resource utilization in the ResourceManager UI.

Performed queue operations.

Recovered corrupted data/files from the DR cluster.
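The routine log-maintenance task above (compressing old logs on a schedule) can be sketched as a short shell script; the log directory and the 7-day retention are illustrative assumptions:

```shell
#!/bin/sh
# Compress *.log files older than 7 days under $LOG_DIR.
# /var/log/hadoop and the 7-day retention are illustrative defaults;
# in practice a crontab entry would invoke this script nightly.
LOG_DIR="${LOG_DIR:-/var/log/hadoop}"
if [ -d "$LOG_DIR" ]; then
    find "$LOG_DIR" -type f -name '*.log' -mtime +7 -exec gzip -f {} \;
fi
```

A crontab line such as `0 2 * * * /opt/scripts/compress_logs.sh` (hypothetical path) would schedule it each night at 02:00.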

Responsibilities as Admin:

Setting up production grade Hadoop clusters and its components through Cloudera Manager in virtualized environments (AWS Cloud) and on-premises.

Strong understanding of core Hadoop concepts and design principles

Deploying Hadoop clusters from scratch.

Responsible for implementation and support of the Enterprise Hadoop environment.

Creating a VPC on AWS Cloud.

Knowledge on UNIX/LINUX

Installation, configuration, and administration of various Hadoop distributions like Cloudera, Hortonworks (Ambari), and Apache Hadoop.

Involved in design, capacity planning, cluster setup, performance fine-tuning, monitoring, scaling, and administration.

Cluster maintenance and monitoring setup using tools like Ganglia, Nagios, and Ambari.

Experience working with cloud-based Hadoop and SQL technologies.

Strong understanding of Linux operating system concepts: memory, CPU, storage, and networks.

Ability to derive meaningful metrics from various log files for Hadoop, Hive, Spark systems

Troubleshooting skills and an understanding of resource management and security (user level and file system level), as well as experience managing CPU, memory, and storage resources for a large Hadoop cluster with 50+ users.

Working knowledge of cluster monitoring tools like Ambari, Ganglia or Nagios

Configuring various components such as HDFS, YARN, MapReduce, Flume, Hive, Hue, Spark, HBase, Pig, Zookeeper, and Sentry.

In charge of installing, administering, and supporting Windows and Linux operating systems in an enterprise environment

Decommissioning and commissioning nodes on a running cluster, including balancing HDFS block data.

Adding and removing users from groups and granting the appropriate set of permissions to the specified group.

HDFS support and maintenance.

Configured Data Disaster recovery for cluster.

Setting up new Hadoop users.

Configuring rack awareness.

Knowledge of storage, performance tuning, and volume management for Hadoop clusters and MapReduce routines.

Experienced in designing Hadoop architecture on AWS Cloud with production-ready features like High Availability, Scalability, and Security.

Configuring security for Hadoop clusters using Kerberos.

Configuring various components such as HDFS, YARN, MapReduce, Sqoop, Flume, Hive, Zookeeper, Oozie, Sentry

Strong experience integrating Hadoop environments with Active Directory / Kerberos services and 3rd party tools

Monitor Hadoop cluster connectivity and performance.

Troubleshooting, performance tuning, and cluster monitoring; diagnosing and solving Hadoop issues and making sure they do not recur.

Manage and analyze Hadoop log files.

Monitoring of Ambari and Nagios alerts.

File system management and monitoring.

Monitoring of long-running jobs through the ResourceManager UI.

Kerberos Configuration on CDH.

Implementation of High Availability solutions.

Flume configuration to load log files into HDFS

Sqoop configuration to import/export data to/from MySQL databases.

Setting Hadoop prerequisites on Linux servers.

Experience in Commissioning and De-commissioning, Trash configuration, node balancer.

AD, LDAP, SSL installation and Configuration on CDH.

Working knowledge of Cloudera Manager. Report generation of running nodes using various benchmarking operations.

Setting up Linux users and groups, setting up Kerberos principals, and testing HDFS and MapReduce access for new users.

Backup and recovery tasks: creating snapshot policies and backup schedules, and recovering from node failure.

Adding/removing nodes to/from an existing Hadoop cluster.

Working Knowledge on Kerberos AD, Sentry.

Worked with the team to plan and deploy new Hadoop environments.

Analyze Namenode and Resource Manager Web UI.


Installation and Configuration of Sentry on CDH for authorization.

Implementing Hadoop security using ACL.

Experience configuring Trash and recovery.

Setting quotas for users.

Knowledge of UNIX Shell Scripting.
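Several of the bullets above (commissioning/decommissioning nodes, balancing HDFS block data) follow a standard sequence; the hostname, exclude-file path, and threshold below are illustrative, and the commands assume admin access on a live cluster:

```shell
# 1. Add the host to the exclude file referenced by dfs.hosts.exclude
#    (the path is illustrative and distribution-dependent).
echo "worker05.example.com" >> /etc/hadoop/conf/dfs.exclude

# 2. Ask the NameNode to re-read its include/exclude lists; the node
#    enters "Decommission In Progress" while its blocks are re-replicated.
hdfs dfsadmin -refreshNodes

# 3. Check progress until the node is reported as "Decommissioned".
hdfs dfsadmin -report

# 4. After adding or removing nodes, rebalance block placement;
#    -threshold is the allowed % deviation from average DataNode utilization.
hdfs balancer -threshold 10
```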


Education:

1. Completed B.Sc. (Information Technology) from North Maharashtra University.

2. Completed MCA (Master of Computer Applications) from Pune University.


Software Testing

Personal Details:

Date of birth: 14/12/1990

Place: Pune
