Senior Certified Hadoop Consultant
Sandeep K
***.*******@*****.***
PROFESSIONAL SUMMARY:
•Cloudera Certified Hadoop Administrator with 8 years of professional IT experience, including 4.5+ years in Big Data ecosystem technologies.
•Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, Pig, Hive, Sqoop, Oozie, HBase, YARN and the MapReduce programming paradigm.
•Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as HDFS, MapReduce, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, Impala and Flume.
•Well versed in installing, configuring, managing and supporting Hadoop clusters on distributions such as Apache Hadoop, Cloudera CDH, and Hortonworks HDP.
•Experience in managing and reviewing Hadoop log files.
•Experience in providing security for Hadoop Cluster with Kerberos.
•Experience in Hadoop cluster maintenance, including data and metadata backups, file system checks, commissioning and decommissioning nodes, and upgrades.
•Performed Hive tuning activities.
•Monitored and managed Linux servers (hardware profiles, resource usage, service status, etc.).
•Handled server backup and restore, server status reporting, and management of user accounts, password policies and file permissions.
•Performed tuning of Hadoop clusters and Hadoop MapReduce routines.
•Expertise in Hadoop cluster tasks such as commissioning and decommissioning nodes without affecting running jobs or data.
•Configured monitoring tools such as Grafana and Nagios for Hadoop cluster monitoring.
•Worked on data lake (ETLP) concepts on most of the big data projects.
•Good knowledge of development methodologies such as Agile, Scrum and Waterfall.
•Experience in verticals including retail, telecom, finance and insurance.
•Familiarity and experience with data warehousing and ETL tools.
•Major strengths include familiarity with multiple software systems, the ability to learn new technologies quickly and adapt to new environments, self-motivation, teamwork and focus, along with excellent interpersonal, technical and communication skills.
CERTIFICATION:
Cloudera Certified Administrator for Apache Hadoop (CCAH).
TECHNICAL SKILLS:
Big Data Technologies : HDFS, YARN, MapReduce, Pig, Hive, Sqoop, Oozie, ZooKeeper, Flume
BigData Distributions : Cloudera, Hortonworks, Apache Hadoop, MapR
Installation : Ansible, GitLab
Scripting Languages : Shell, Bash
Monitoring tools : Grafana, Ganglia, Nagios, Ambari, Jenkins, Navigator
Reporting Tools : Tableau, Jaspersoft
Programming Languages : SQL, PL/SQL, Java, Chef, Puppet
Application Servers : Apache Tomcat, WebLogic Server, WebSphere, JBoss
Databases : Oracle 9.x, 10g, 11g, MySQL Server, DB2, HBase, MongoDB, Cassandra
Networking & Protocols : TCP/IP, Telnet, HTTP, HTTPS, FTP, SNMP, LDAP, DNS.
Operating Systems : Linux, UNIX, Mac OS, Windows NT/98/2000/XP/Vista, Windows 7, Windows 8
PROFESSIONAL EXPERIENCE:
Client: Apple Inc., Sunnyvale, CA 2015 Dec - Present
Role: Sr. Hadoop Administrator
Responsibilities:
•Responsible for cluster maintenance, troubleshooting, managing data backups, and reviewing log files across multiple clusters.
•Built clusters using Ambari; experienced in rolling and express upgrades.
•Benchmarked clusters and configured YARN-level memory settings based on the results.
•Changed configurations based on user requirements to improve job performance, and tuned settings dynamically to keep the cluster available and efficient.
•Set up NameNode high availability for a major production cluster and designed automatic failover control using ZooKeeper and Quorum Journal Nodes.
•Configured queues and maintained the Capacity Scheduler in all environments to provide resources based on allocation.
•Developed benchmark scripts using TestDFSIO, TeraGen, TeraSort and WordCount.
•Configured YARN and fine-tuned YARN settings to improve performance.
•Formulated procedures for installation of Hadoop patches, updates and version upgrades.
•Implemented Kerberos Security Authentication protocol for production cluster.
•Secured the Hadoop cluster using Active Directory/LDAP and TLS/SSL.
•Monitored host resources such as CPU, RAM and HDD/mounts, as well as security logs.
•Wrote automated scripts to monitor HDFS and HBase through cron jobs.
•Experienced in project and volume setup for new projects.
•Experienced in managing and reviewing log files.
•Involved in configuring the Oozie workflow engine to run multiple Hive jobs.
•Worked with Hadoop developers and designers to troubleshoot Hive job failures and related issues.
Client: Carter’s Inc., Atlanta, GA 2014 Oct – 2015 Dec
Role: Sr. Hadoop Administrator
Responsibilities:
•Installed, configured and maintained a 70-node Hadoop cluster for application development, along with Hadoop ecosystem components such as Hive, Pig, HBase, ZooKeeper and Sqoop.
•Worked extensively with the Cloudera distribution of Hadoop (CDH 5.x).
•Extensively involved in cluster capacity planning, hardware planning, installation, and performance tuning of the Hadoop cluster.
•Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, etc.
•Installed and configured the Hue interface for UI access to Hadoop components such as Hive, Pig, Oozie, Sqoop, HBase and the file browser.
•Installed Cloudera Navigator to configure, collect and view audit event details such as timestamp, operation and user.
•Provided timely and reliable support for all production and development environments: deployment, upgrades, operations and troubleshooting.
•Refactored existing Opscode Chef automation code.
•Built and deployed a Chef Server in AWS for infrastructure automation.
•Helped tune Hive queries for better performance.
•Configured a data lake that serves as the base layer for storing and analyzing data flowing from multiple sources into the Hadoop platform.
•Supported developers by installing their custom software, upgrading Hadoop components, resolving platform issues, and helping troubleshoot long-running jobs.
•Implemented both major and minor version upgrades on the existing cluster, rolling back to the previous version when needed.
•Performed daily status checks of Oozie workflows, monitored Cloudera Manager, and checked DataNode status to ensure nodes were up and running.
•Performed Hadoop cluster tasks such as commissioning and decommissioning nodes without affecting running jobs or data.
•Used Sqoop import and export extensively.
•Scheduled production batch jobs using Control-M.
•Experienced in assessing Hadoop security requirements and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating a keytab file for each service, and managing keytabs using keytab tools.
•Configured NameNode high availability and ResourceManager high availability.
•Resolved various platform-related issues faced by users.
•Worked directly with vendors, partners and internal clients on gathering and refining technical requirements and designs in order to develop a working solution that addressed needs.
•Acted as point of contact for workflow failures and hitches.
•Worked round the clock, especially during deployments.
•Monitored and maintained the Hadoop cluster (Hadoop, HBase, ZooKeeper) using Ganglia and Nagios.
Environment: CDH 5.4.3 and 4.x, Cloudera Manager CM 5.1.1, Hadoop, HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, Chef, Red Hat/CentOS 6.5, Puppet, Control-M
Client: Barclays, New York City, NY Aug 2013 - Sept 2014
Role: Sr. Hadoop Administrator
Responsibilities:
•Responsible for upgrading Hortonworks Hadoop HDP 2.2.0 and MapReduce 2.0 with YARN in a multi-node clustered environment.
•Handled importing of data from various data sources, performed transformations using Hive, MapReduce and Spark, and loaded data into HDFS.
•Monitored multiple Hadoop cluster environments using Ganglia and Nagios.
•Monitored workload and job performance and performed capacity planning using Ambari.
•Installed, configured and benchmarked new and existing clusters in customer environments.
•Responsible for cluster uptime.
•Provided 24x7 support to business teams.
•Monitored job performance, system metrics and logs for problems.
•Worked with service vendors to resolve tickets raised by various business teams.
•Coordinated with various teams (e.g. Firewall, OS, Security, SAT) to resolve tickets.
•Coordinated with Hortonworks to fix unresolved platform issues.
•Implemented new features on the cluster per business requirements.
•Upgraded the platform and ensured service availability after each upgrade.
•Integrated external BI tools such as Tableau and Jaspersoft.
•Worked on Hive/HBase vs. RDBMS; imported data into Hive and created internal and external tables, partitions, indexes, views, queries and reports for BI data analysis.
•Created user accounts and granted users access to the Hadoop cluster.
•Implemented commissioning and decommissioning of DataNodes, killed unresponsive TaskTrackers, and dealt with blacklisted TaskTrackers.
•Wrote shell scripts to automate routine day-to-day processes.
•Resolved various platform-related issues faced by users.
Environment: Hadoop, YARN, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, HDP 2.x, Hortonworks, Ganglia, Nagios, HBase, ZooKeeper, Ambari 1.x and 2.x, Unix/Linux, Tableau, Jaspersoft
Client: Guardian Life Insurance, Raleigh, NC June 2012- Aug 2013
Role: Hadoop Admin
Responsibilities:
•Installed, configured and maintained Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper and Sqoop.
•Extensively involved in installation and configuration of the Cloudera distribution of Hadoop, CDH 3.x and CDH 4.x.
•Involved in the setup of a 50-node Hadoop cluster.
•Worked on cluster upgrades, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
•Installed and integrated Oozie with the Hadoop stack to run multiple Hive and Pig scripts.
•Involved in creating and maintaining Hive tables and loading data into the tables using Hive queries and MapReduce jobs.
•Handled data loads from different UNIX file systems into HDFS.
•Customized the SSH settings on the master node.
•Configured MapReduce properties so that local temporary storage uses large disk partitions.
•Created user accounts and granted users access to the Hadoop cluster.
•Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
•Implemented Kerberos to authenticate all services in the Hadoop cluster.
•Wrote shell scripts to automate routine day-to-day processes.
•Resolved various platform-related issues faced by users.
Environment: CDH, Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Kerberos, Shell script, UNIX
Client: British Telecom, Bangalore, INDIA Dec 2009-May 2012
Role: Linux Administrator
Responsibilities:
• Administered RHEL 5/6, including installation, testing, tuning, upgrading, patching, and troubleshooting server issues.
• Configured and automated the deployment of Linux and VMware infrastructure through the existing Kickstart infrastructure.
• Configured Linux guests in a VMware ESX environment.
• Working knowledge of server virtualization technologies such as VMware.
• Worked on Cisco UCS, virtual infrastructure on VMware, storage migration and installations.
• Installed, configured and custom-built Oracle 10g, and prepared servers for database installation, including adding kernel parameters, installing software, and setting permissions.
• Implemented multi-tier application provisioning in OpenStack cloud, integrating it with Puppet.
• Involved in integrating the vSphere hypervisor with OpenStack.
• Configured and maintained FTP, DNS, NFS and DHCP servers.
• Configured, maintained and troubleshot local development servers.
• Performed configuration of standard Linux and network protocols, such as SMTP, DHCP, DNS, LDAP, NFS, HTTP, SNMP and others.
• Wrote shell scripts for automation.
• Developed Puppet manifests to automate Hadoop installation and node configuration.
• Worked on decommissioning virtual and physical Linux hosts.
• Administered Tomcat servers serving dynamic servlet and JSP requests.
• Managed cron jobs, batch processing and job scheduling.
• Worked on disaster recovery planning for critical IT systems and services in fallback situations where a disaster overwhelms the resilience arrangements.
• Monitored system activity such as CPU, memory, disk and swap space usage to avoid performance issues.
• Tuned kernel parameters for better performance of applications such as Oracle.
• Provided 24x7 on-call production and customer support, including troubleshooting.
Environment: Linux, FTP, Shell, UNIX, VMware, NFS, TCP/IP, Puppet, Oracle, Red Hat Linux.
Client: COMSAT Systems, Hyderabad, INDIA Dec 2008-Dec 2009
Role: Linux Administrator
Responsibilities:
• Built Linux servers; upgraded and patched existing servers. Compiled, built and upgraded the Linux kernel.
• Set up a Solaris Custom JumpStart server and clients, and implemented JumpStart installations.
• Used Telnet and rlogin for interoperation between hosts.
• Conducted various systems administration work in CentOS and Red Hat Linux environments.
• Performed regular day-to-day system administration tasks, including user management, backup, network management, software management and documentation.
• Recommended system configurations for clients based on estimated requirements.
• Performed reorganization of disk partitions and file systems, hard disk additions, and memory upgrades.
• Monitored system activities, log maintenance, and disk space management.
• Encapsulated root file systems and mirrored them to ensure systems had redundant boot disks.
• Administered Apache servers and published the client's web site on our Apache server.
• Fixed system problems based on system email notifications and user complaints.
• Upgraded software, applied patches, and added new hardware on UNIX machines.
Environment: UNIX, FTP, TCP/IP, Red Hat Linux.
Education: Bachelor of Technology in Electronic Instrumentation and Control Engineering (EICE) from NBKR Institute of Science and Technology, Sri Venkateswara University, India, 2008.