Rajesh Vunnam

Email: ac7y9l@r.postjobfree.com

Phone: 917-***-****

Professional Summary

6+ years of professional experience, including around 4 years as a Linux Administrator and over 4 years in Big Data analytics as a Sr. Hadoop/Big Data Administrator.

Experience in all phases of the data warehouse life cycle, including requirement analysis, design, coding, testing, and deployment.

Experience working with business analysts to identify, study, and understand requirements and translate them into ETL code during the requirement analysis phase.

Experience in architecting, designing, installing, configuring, and managing Apache Hadoop clusters on the Hortonworks and Cloudera distributions.

Good understanding of Microsoft Analytics Platform System (APS) and HDInsight.

Experience in managing the Hadoop infrastructure with Cloudera Manager.

Knowledge of Cassandra read and write paths and internal architecture.

Good understanding of Kerberos and how it interacts with Hadoop and LDAP.

Practical knowledge of the functionality of each Hadoop daemon, the interactions between them, resource utilization, and dynamic tuning to keep the cluster available and efficient.

Experience in understanding and managing Hadoop Log Files.

Experience in Massively Parallel Processing (MPP) databases such as Microsoft PDW.

Experience with Hadoop's multiple data processing engines, such as interactive SQL, real-time streaming, data science, and batch processing, handling data stored on a single platform under YARN.

Experience upgrading Hortonworks clusters with the latest release packs.

Experience in adding and removing nodes in a Hadoop cluster.

Experience in managing Hadoop clusters with IBM BigInsights and the Hortonworks Data Platform.

Experience in extracting data from RDBMS into HDFS using Sqoop.

Vast experience in maintaining different Unix operating systems such as RHEL 3/4/5, Solaris 8/9/10, and AIX 5.1/5.3.

Worked on analyzing Hadoop clusters and different big data analytics tools, including HDFS, Hive, HBase, Flume, Oozie, and Sqoop.

Experience in collecting logs from the log collector into HDFS using Flume.

Experience in setting up and managing the Oozie batch scheduler.

Good understanding of NoSQL databases such as HBase, Neo4j, and MongoDB.

Experience in analyzing data in HDFS through MapReduce, Hive, and Pig.

Experience in integrating various data sources like Oracle, DB2, Sybase, SQL Server, and MS Access, as well as non-relational sources like flat files, into the staging area.

Experience in Data Analysis, Data Cleansing (Scrubbing), Data Validation and Verification, Data Conversion, Data Migrations and Data Mining.

Excellent interpersonal, communication, documentation and presentation skills.

Technical Skills

Hadoop Ecosystem

HDFS, Yarn, Map-Reduce, Hive, Pig, Hue, Impala, Sqoop, Oozie, Flume, Zookeeper.

NoSQL Databases

HBase.

Programming Language

C, C++, Java.

Database

DB2, Oracle, MySQL, Teradata.

Scripting Languages

Shell Scripting.

Operating Systems

Linux, Solaris 8/9/10, Unix, Windows, Mac

WEB Servers

Apache Tomcat, JBoss, Apache HTTP Server, Oracle WebLogic.

Cluster Management Tools

Cloudera Manager and HDP Ambari.

Virtualization technologies

VMware vSphere, Citrix XenServer.

Education Qualification

MS in Engineering Management

California State University Northridge, Northridge CA.

Professional Experience

Kroger/8451, OH Sept 2017 – Present

Sr Hadoop Administrator

Administered the Oracle BDA Hadoop cluster and helped teams migrate from BDAs to CDH 5.13.

Designed the HDFS directory structures and security for the new Hadoop infrastructure.

Secured directories with ACLs and encryption zones.
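
As an illustration only, commands of this kind were used; the paths, group name, and key name below are hypothetical placeholders:

  # Grant a consuming group read/execute access through an HDFS ACL
  hdfs dfs -setfacl -m group:analytics:r-x /data/raw
  hdfs dfs -getfacl /data/raw

  # Create a KMS-backed key and an encryption zone on an empty directory
  hadoop key create pii_key
  hdfs crypto -createZone -keyName pii_key -path /data/secure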

Increased cluster capacity by monitoring usage and created utilization reports.

Created DR environments for both production clusters and scheduled replication jobs to copy data from the source clusters.

Worked on HDF, using NiFi for data ingestion.

Upgraded HDF from 3.1.10 to 3.2.0.3-2 and NiFi to 1.7.

Set up NiFi Registry to manage flows from one environment to another.

Designed and implemented YARN resource pools and assigned quotas based on SLAs.

Worked on a Spark standalone cluster to improve the performance of Spark applications.

Monitored and managed Big Data clusters using a variety of open source (Dr Elephant) and proprietary toolsets.

Worked with Cloudera in evaluation of Workload XM 360.

Updated certificates on servers and for Applications like GitHub and Pivotal.

Worked with monitoring tools like Unravel Data and Pepper Data.

Installed Unravel Data on all clusters.

Worked on YARN dynamic resource pools for proper resource allocation across the cluster.

Worked with static service pool configurations to maximize cluster utilization.

Responsible for various system administration tasks.

Involved in various application design and implementations.

Troubleshot various issues related to memory leaks and DIMM failures.

Worked with DistCp to copy data from cluster to cluster.
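
A minimal sketch of a cluster-to-cluster DistCp copy (NameNode hostnames and paths are placeholders); -update copies only changed files and -p preserves file attributes:

  hadoop distcp -update -p hdfs://prod-nn:8020/data/events hdfs://dr-nn:8020/data/events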

Backed up and recovered data using HDFS snapshots.
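
A typical snapshot workflow looks roughly like this (directory, snapshot, and file names are placeholders):

  # Enable snapshots on a directory, take one, and restore a file from it
  hdfs dfsadmin -allowSnapshot /data/warehouse
  hdfs dfs -createSnapshot /data/warehouse before_upgrade
  hdfs dfs -cp /data/warehouse/.snapshot/before_upgrade/part-00000 /data/warehouse/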

Experience with auditing, Oracle security, OEM, SQL Developer, and TOAD.

Participated in daily scrum meetings to support application teams POC activities.

Added, decommissioned, and rebalanced nodes for newly implemented 12-, 100-, and 72-node clusters.
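
On CDH these steps are normally driven from Cloudera Manager; at the command line the equivalent refresh and rebalance look roughly like this (edits to the include/exclude host files omitted):

  # Pick up changes to the host lists, check node state, then rebalance block placement
  hdfs dfsadmin -refreshNodes
  hdfs dfsadmin -report
  hdfs balancer -threshold 10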

Performed stress and performance testing and benchmarking for the cluster.

Debugged and resolved major issues with Cloudera Manager by working with Cloudera's infrastructure team.

Implemented cluster high availability for services like HDFS, Hive, and YARN.

Environment: CDH 5.13.1, 5.15.0; HDF 3.1.1.0, 3.2.0.3; NiFi 1.6, 1.7; YARN, Hadoop MapReduce, HDFS, ZooKeeper, Oozie, Hue, Hive, Impala.

Cigna, CT Mar 2017 – Sept 2017

Sr Hadoop Administrator

Worked on performance tuning and optimization of the cluster to achieve high performance.

Installed and configured AtScale and integrated it with the existing cluster.

Upgraded the cluster from CDH 5.7.1 to 5.10.1.

Added nodes to the cluster, increasing its capacity by 100 nodes.

Performed both major and minor upgrades to the existing Cloudera Hadoop cluster.

Applied patches and bug fixes on Hadoop Clusters.

Resolved tickets submitted by users, troubleshoot errors and documented the solutions.

Managed, monitored, troubleshot, and reviewed log files.

Set up HDFS name and space quotas.
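
A sketch of the quota commands involved (the directory and limits are placeholders):

  # Cap a project directory at 1 million names and 10 TB of raw space, then verify
  hdfs dfsadmin -setQuota 1000000 /user/project_x
  hdfs dfsadmin -setSpaceQuota 10t /user/project_x
  hdfs dfs -count -q -h /user/project_x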

Applied patches and performed version upgrades on RHEL servers.

Worked on incident management, problem management, and change management using HPSM.

Performed performance management and reporting using Cloudera Manager.

Well versed with YARN architecture and schedulers.

Managed user and group access to various Big Data resources.

Worked on Hadoop data capacity planning and node forecasting.

Optimized MapReduce jobs by using compression mechanisms.
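
As a sketch, intermediate map output compression can be switched on per job like this (the jar, driver class, and paths are hypothetical, and the -D options assume the driver uses ToolRunner):

  hadoop jar my-job.jar com.example.MyDriver \
      -Dmapreduce.map.output.compress=true \
      -Dmapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
      /input /output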

Performed commissioning and decommissioning of nodes in the cluster.

Implemented the Fair and Capacity schedulers.

Implemented automatic failover with ZooKeeper and the ZooKeeper Failover Controller.

Implemented Kerberos for authenticating all services in the Hadoop cluster.
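
A minimal sketch of the principal and keytab work behind this (hostname, realm, and keytab path are placeholders):

  # Create a service principal, export its keytab, and verify a ticket
  kadmin.local -q "addprinc -randkey hdfs/worker01.example.com@EXAMPLE.COM"
  kadmin.local -q "xst -k /etc/security/keytabs/hdfs.keytab hdfs/worker01.example.com@EXAMPLE.COM"
  kinit -kt /etc/security/keytabs/hdfs.keytab hdfs/worker01.example.com@EXAMPLE.COM
  klist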

Environment: Cloudera 5.7.1, 5.10.1; YARN, Hadoop MapReduce, HDFS, ZooKeeper, Oozie, Hue, Hive, Impala.

Caterpillar, IL June 2015 – Feb 2017

Hadoop Administrator

Successfully upgraded the Hadoop cluster from CDH 4.7 to CDH 5.0.0.

Implemented and configured a quorum-based high-availability Hadoop cluster.

Installed and configured Hadoop monitoring and administration tools: Nagios and Ganglia.

Helped in setting up Rack topology in the cluster.

Backed up data from the active cluster to a backup cluster using DistCp.

Periodically reviewed Hadoop-related logs, fixed errors, and prevented issues by analyzing warnings.

Deployed a Network File System (NFS) mount for NameNode metadata backup.

Fixed HBase tables and configured the region servers.

Removed nodes for maintenance or malfunction using decommissioning and added nodes using commissioning.

Hands-on experience working with Hadoop ecosystem components like YARN, Hadoop MapReduce, HDFS, ZooKeeper, Oozie, Hive, Sqoop, Pig, and Flume.

Experience in configuring Zookeeper to coordinate the servers in clusters to maintain the data consistency.

Experience in using Flume to stream data into HDFS from various sources.
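
Starting such an agent looks roughly like this (the agent name and configuration file are placeholders; the file itself defines the source, channel, and HDFS sink):

  flume-ng agent --conf /etc/flume-ng/conf \
      --conf-file /etc/flume-ng/conf/logs-to-hdfs.conf \
      --name a1 -Dflume.root.logger=INFO,console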

Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, and Sqoop, as well as system-specific jobs.
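
Submitting and checking a job with the Oozie CLI looks roughly like this (the server URL and properties file are placeholders):

  oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run
  oozie job -oozie http://oozie-host:11000/oozie -info <job-id>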

Used DataStax OpsCenter and nodetool utility to monitor the cluster.

Evaluated Data model by running endurance tests using JMeter and OpsCenter.

Installed the Oozie workflow engine to run multiple Hive and Pig jobs.

Implemented the Fair and Capacity schedulers.

Implemented automatic failover with ZooKeeper and the ZooKeeper Failover Controller.

Implemented Kerberos for authenticating all services in the Hadoop cluster.

Performed cluster backups using DistCp, Cloudera Manager BDR, and parallel ingestion.

Performed both major and minor upgrades to the existing cluster, as well as rollbacks to the previous version.

Designed the cluster so that only one Secondary NameNode daemon could run at any given time.

Implemented commissioning and decommissioning of DataNodes, killed unresponsive TaskTrackers, and dealt with blacklisted TaskTrackers.

Hands-on experience in Python scripting.

Performed various configurations, including networking and iptables, hostname resolution, and SSH passwordless login.

Moved data from HDFS to a MySQL database and vice versa using Sqoop.
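
A sketch of the import/export pair (connection details, tables, and paths are placeholders):

  # Import a MySQL table into HDFS, then export results back out
  sqoop import --connect jdbc:mysql://db-host/sales --username etl -P \
      --table orders --target-dir /data/staging/orders --num-mappers 4
  sqoop export --connect jdbc:mysql://db-host/sales --username etl -P \
      --table order_summary --export-dir /data/out/order_summary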

Performed an enterprise-level Hadoop upgrade on the existing cluster, applied patches, and performed version upgrades.

Troubleshot the Hortonworks Data Platform in multiple types of environments and took ownership of problem isolation, resolution, and bug reporting.

Provided a POC for the test and QA clusters using HDP 1.7 with HBase as the NoSQL database.

Built a lab Hadoop cluster on multiple virtual machines for various testing purposes.

Environment: Cloudera 4.7, 5.0.0; Nagios and Ganglia; YARN, Hadoop MapReduce, HDFS, ZooKeeper, Oozie, Hive, Sqoop, Pig, Flume; Hortonworks HDP 1.7, 2.2.

WebWiseGlobal May 2012 – Oct 2013

Hadoop Administrator

Administered and maintained the cluster using OpsCenter, DevCenter, Linux, nodetool, etc.

Responsible for cluster maintenance, commissioning and decommissioning of cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.

Designed and developed an API for rider's preferences with all CRUD capabilities.

Created Ambari views for different components in Hortonworks.

Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.

Implemented NameNode backup using NFS for high availability.

Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.

Managed and monitored the file system, and gave presentations about new ecosystem components to be implemented in the cluster.

Implemented security protocols and patches provided by Oracle and Cloudera.

Ensured data integrity and security.

Created and managed user and group access.

Created directory structures and managed access control lists.

Environment: RHEL 5.4, 5.5; Linux, Red Hat, CDH.

FMC Technologies, India June 2010 – May 2012

Senior Linux Administrator

Installed and configured Linux for the new build environment.

Installed and maintained Linux servers.

Extensive use of Red Hat Enterprise Linux 5.x.

Performed operating system backups and upgrades from RHEL 5.4 to 5.5.

Created volume groups, logical volumes, and partitions on the Linux servers and mounted the file systems.
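
A sketch of the usual LVM sequence (device names, sizes, and mount point are placeholders):

  # Carve a logical volume out of a new disk and mount it
  pvcreate /dev/sdb1
  vgcreate vg_data /dev/sdb1
  lvcreate -L 50G -n lv_app vg_data
  mkfs.ext3 /dev/vg_data/lv_app
  mount /dev/vg_data/lv_app /app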

Deep understanding of monitoring and troubleshooting mission critical Linux machines.

Improve system performance by working with the development team to analyze, identify and resolve issues quickly.

Worked on asset management for UNIX and Linux hardware, software, and equipment, and ensured that asset management policies were adhered to by the team.

Ensured data recovery by implementing system and application level backups.

Performed various configurations including networking and iptables, hostname resolution, and SSH passwordless login.
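
Passwordless SSH between nodes can be set up roughly like this (the user and host are placeholders):

  ssh-keygen -t rsa -b 2048
  ssh-copy-id admin@node01.example.com
  ssh admin@node01.example.com hostname    # should log in without a password prompt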

Managed disk file systems, server performance, user creation, file access permissions, and RAID configurations.

Automated administration tasks through scripting and job scheduling using cron.
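
A typical crontab entry of this kind (the script path is a placeholder):

  # Run the backup script nightly at 02:00 and append output to a log
  0 2 * * * /usr/local/bin/nightly_backup.sh >> /var/log/nightly_backup.log 2>&1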

Monitored system metrics and logs for any problems.

Ran crontab jobs to back up data.

Added, removed, and updated user account information, and reset passwords.

Maintained the MySQL server and managed authentication for required database users.

Created and managed logical volumes using LVM.

Installed and updated packages using YUM.

Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.

Environment: RHEL 5.4, 5.5; Linux, Red Hat, MySQL, YUM, cron.


