Senior Cloudera Administrator SME with Cloudera Data Platform

Company:
Cubicle Logic LLC
Location:
Alexandria, VA, 22311
Posted:
May 09, 2024

Description:

Senior Cloudera Administrator/Infrastructure SME with Cloudera Data Platform Experience (SDX and CDP – On-Prem, Public Cloud and Private Cloud)

Role & Responsibilities:

Administer Cloudera CDP on-prem and cloud infrastructure.

Leverage CDP features to build hybrid-cloud infrastructure (CDP Public Cloud).

Independently install and maintain Big Data (Cloudera) clusters in highly available, load-balanced configurations across multiple environments (Production, QA and Development), both on-prem and in the cloud (AWS).

Implement Knox and Kerberos for cluster security and integrate them with enterprise and cloud IAM.

Develop scripts to automate and streamline operations and configuration.

Manage and automate the installation process (using tools like Ansible) for CDP Manager, CDH, and the ecosystem projects. Activities include: Set up a local CDH repository; Perform OS-level configuration for Hadoop installation; Install Cloudera Manager server and agents; Install CDH using Cloudera Manager; Add a new node to an existing cluster; Add a service using Cloudera Manager.

Schedule jobs using Apache NiFi or Apache Airflow.

Analyze, recommend and implement improvements to support environment/infrastructure management initiatives. Perform the basic and advanced configuration needed to effectively administer a Hadoop cluster on-prem and in the cloud. Activities include: Configure a service using Cloudera Manager; Create an HDFS user's home directory; Configure NameNode HA; Configure ResourceManager HA; Configure a proxy for HiveServer2/Impala.

Maintain and modify the cluster to support day-to-day operations in the enterprise. Activities include: Rebalance the cluster; Set up alerting for excessive disk fill; Define and install a rack topology script; Install a new type of I/O compression library in the cluster; Revise YARN resource assignment based on user feedback; Commission/decommission a node.

Enable relevant services and configure the cluster to meet goals defined by security policies. Activities include: Configure HDFS ACLs; Install and configure Sentry; Configure Hue user authorization and authentication; Enable/configure log and query redaction; Create encrypted zones in HDFS.

Benchmark the cluster's operational metrics and test the system configuration for operation and efficiency. Activities include: Efficiently copy data within a cluster/between clusters; Create/restore a snapshot of an HDFS directory; Get/set ACLs for a file or directory structure; Benchmark the cluster (I/O, CPU, network).

Under general supervision, manage Big Data administration activities, technical documentation, system performance support, and internal customer support. May provide input into the development of systems architecture for mission-critical corporate development projects.

Research performance issues, configure the cluster according to Cloudera best practices, and optimize specifications and parameters to fine-tune the cluster and proactively avoid performance issues.

Skills:

Must be a Cloudera Certified Hadoop and/or Spark Administrator.

Hands-on experience with Cloudera installation, configuration, debugging, tuning and administration.

Must have infrastructure implementation experience both on-prem and in the cloud (AWS).

Must have knowledge of CDP Containers and integration of on-prem clusters with cloud clusters.

Strong hands-on experience implementing security (Kerberos, Sentry, TLS/SSL) and performing OS upgrades.

Must have knowledge of Cloudera SDX, Cloudera Public Cloud and Private Cloud infrastructures.

Experience administering distributed applications: Hadoop, Spark, Kafka, MapReduce, Hive, Impala.

Experience with large-scale, high-performance distributed systems like Hadoop, NoSQL or Spark.

Assist in preparing and scaling the cluster as required to execute mission-critical data processing.

Experience setting up and configuring YARN queues using the YARN Queue Manager.

Deep understanding of IP, TCP, UDP, SSL/TLS protocols.

Experience with DevOps, scripting or other automation.

Experience with performance monitoring and tuning.

Hadoop cluster maintenance, including the addition and removal of nodes.

Working knowledge of networks, Linux OS and Unix shell scripting.

Understanding of how the authorization mechanisms in Ranger/Sentry work.

Systems implementation, operation, and optimization as a Hadoop administrator.

Demonstrated ability to find the root cause of a problem in Spark/HDFS and CDP clusters, optimize inefficient execution, and resolve resource contention scenarios. Activities include: Resolve errors/warnings in Cloudera Manager; Resolve performance problems/errors in cluster operation; Determine the reason for an application failure; Configure the Fair Scheduler to resolve application delays.

Experience with Cloudera Data Science Workbench and Cloudera DataFlow products.

Experience working with the Systems Operation Department to resolve a variety of infrastructure issues.
