https://www.linkedin.com/in/veeraiah-nalabolu-***ba8132
VEERAIAH
M: (732) 579-8468
E: ********@*********.***
PROFESSIONAL SUMMARY:
* ***** ** ******* ********** as a software developer in designing, developing, deploying, administering, and supporting large-scale distributed systems.
4+ years of experience as a Hadoop Developer and Big Data Analyst.
In-depth knowledge of Hadoop architecture and Hadoop daemons such as NameNode, Secondary NameNode, DataNode, JobTracker, and TaskTracker.
Experienced in installing and configuring Hadoop v1.0 and v2.0.
Experience with multiple Cloudera distribution versions, including CDH3, CDH4, and CDH5.
Good knowledge of the MapR and Hortonworks distributions of Hadoop.
Experience in HBase cluster configuration, deployment and troubleshooting.
Passionate about working in Big Data and analytics environments.
Determined, committed and hardworking individual with strong communication, interpersonal and organizational skills.
Expertise in using various Hadoop Ecosystem components such as MapReduce, Pig, Hive, HBase, Sqoop, Oozie, Flume and Spark for data storage and analysis.
Strong experience in core Java, J2EE, SQL and RESTful web services.
Expertise in Text Processing and Analysis using HiveQL.
Good understanding of NoSQL databases such as MongoDB and Cassandra.
Experienced in developing custom UDFs for Pig and Hive to incorporate Python/Java methods and functionality into Pig Latin and HiveQL.
Good knowledge of Amazon AWS services such as EMR and EC2, which provide fast and efficient processing of Big Data.
Experienced in collecting log data and JSON data into HDFS using Flume and processing it using Hive/Pig.
Experience in troubleshooting errors in HBase Shell/API, Pig, Hive and MapReduce.
Experienced in installing and running various Oozie workflows and automating parallel job executions.
Experience in running shell scripts using Hadoop Streaming.
Experience in managing Hadoop clusters and services using Cloudera Manager.
Highly experienced in importing and exporting data between HDFS and Relational Systems like MySQL and Teradata using Sqoop.
Assisted the deployment team in setting up Hadoop clusters and services.
Familiar with SDLC methodologies such as Agile, Waterfall, and Spiral.
Experienced in identifying areas for improving system stability and providing end-to-end high-availability architectural solutions.
PROFESSIONAL EXPERIENCE:
Client: VERIZON WIRELESS, TX January 2016 – Present
Role: Hadoop Developer/Big Data Analyst
Responsibilities:
Participated in development/implementation of Cloudera Hadoop environment.
Implemented partitioning, dynamic partitions, and buckets in Hive for efficient data access.
Worked on automating import and export jobs into HDFS and Hive using Sqoop from relational databases such as Oracle and Teradata.
Used Spark as an execution engine replacing MapReduce for certain jobs.
Used Zookeeper and Oozie Operational Services for coordinating the cluster and scheduling workflows.
Wrote Pig and Hive UDFs and used MapReduce and JUnit for unit testing (a minimal Hive UDF sketch follows this section).
Worked on loading and transforming large sets of structured, semi-structured, and unstructured data.
Involved in installing, configuring and using Hadoop Ecosystem components.
Worked on performance troubleshooting and tuning Hadoop clusters.
Managed and reviewed Hadoop log files.
Migrated data existing in Hadoop cluster into Spark and used SparkSQL and Scala to perform actions on the data.
Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
Implemented a Cassandra column-oriented NoSQL database and an associated RESTful web service that persists high-volume data for vertical teams.
Supported MapReduce programs running on the cluster and was involved in loading data from the UNIX file system into HDFS.
Experienced in configuring the HBase cluster and troubleshooting it using Cloudera Manager.
Performed distributed transactional queueing on HBase.
Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
Involved in running queries using Impala and used BI tools to run ad-hoc queries directly on Hadoop.
Environment: Hadoop, HDFS, Hive, Pig, HBase, Impala, MySQL, Java, SQL, ZooKeeper, Sqoop, Hue, Teradata, CentOS.
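Illustrative sketch (not taken from the project codebase): a minimal Hive UDF in Java using Hive's classic org.apache.hadoop.hive.ql.exec.UDF API, of the kind referenced in the UDF bullet above; the class name and masking logic are hypothetical.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: masks all but the last four digits of a phone number.
public final class MaskPhoneUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                                   // pass NULLs through unchanged
        }
        String digits = input.toString().replaceAll("[^0-9]", "");
        if (digits.length() <= 4) {
            return new Text(digits);
        }
        String masked = digits.substring(0, digits.length() - 4).replaceAll("[0-9]", "*")
                + digits.substring(digits.length() - 4);
        return new Text(masked);
    }
}

Such a UDF would typically be packaged in a jar, registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION, and then invoked from HiveQL like any built-in function.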
Client: WELLS FARGO, SFO, CA January 2014 – December 2015
Role: Hadoop Developer
Responsibilities:
Involved in loading data from the Linux file system, servers, and Java web services using Kafka producers and partitions.
Migrated complex MapReduce programs into Spark RDD transformations and actions (a minimal RDD sketch follows this section).
Implemented Kafka high-level consumers to pull data from Kafka partitions and move it into HDFS.
Worked on analyzing the Hadoop cluster and different Big Data analytic tools, including MapReduce, Hive, and Spark.
Implemented custom Kafka encoders for a custom input format to load data into Kafka partitions.
Exported result sets from Hive to MySQL using the Sqoop export tool for further processing.
Evaluated the performance of Apache Spark in analyzing genomic data.
Implemented complex Hive UDFs to execute business logic within Hive queries.
Implemented Impala for data analysis.
Prepared Linux shell scripts for automating the process.
Implemented Spark RDD transformations to map business analysis and apply actions on top of transformations.
Automated all jobs, from pulling data from different data sources such as MySQL and pushing the result datasets to HDFS, to running MapReduce, Pig, and Hive jobs, using Kettle and Oozie for workflow management.
Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
Loaded and transformed large sets of structured, semi-structured, and unstructured data with MapReduce, Hive, and Pig.
Evaluated usage of Oozie for Workflow Orchestration.
Worked with NoSQL databases like HBase, creating tables to load large sets of semi-structured data coming from various sources.
Created partitioned tables in Hive and mentored the analyst and test teams in writing Hive queries.
Involved in cluster setup, monitoring, and benchmarking of results.
Involved in Agile methodologies, daily Scrum meetings, and Sprint planning.
Environment: Hadoop, HDFS, Pig, Hive, Flume, Sqoop, Oozie, HBase, ZooKeeper, MySQL, shell scripting, Red Hat Linux, core Java 7, Eclipse.
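Illustrative sketch (not taken from the project codebase): a MapReduce-style aggregation rewritten as Spark RDD transformations and a single action, in Java with Java 8 lambda syntax for brevity; the input/output paths and record layout are hypothetical.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

// Counts records per key (first tab-separated field) from files already landed in HDFS.
public class RecordCountByKey {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("RecordCountByKey");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> lines = sc.textFile("hdfs:///data/input");            // lazy transformation
        JavaPairRDD<String, Integer> counts = lines
                .mapToPair(line -> new Tuple2<>(line.split("\t")[0], 1))      // key by first field
                .reduceByKey(Integer::sum);                                   // aggregate per key
        counts.saveAsTextFile("hdfs:///data/output");                         // action: triggers the job

        sc.stop();
    }
}

The shuffle-and-aggregate logic that a Mapper/Reducer pair expresses in two classes collapses here into two chained transformations, with the final action triggering execution.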
Client: Big Data and Analytics Technology Excellence Group, Irving, TX June 2012 – November 2013
Role: Hadoop Developer
Roles and Responsibilities:
Developed simple and complex MapReduce jobs using Hive and Pig.
Developed MapReduce Programs for data analysis and data cleaning.
Developed Pig Latin scripts for the analysis of semi-structured data.
Performed optimization on Pig scripts and Hive queries to increase efficiency and added new features to existing code.
Created Hive tables and loaded data by writing Hive UDFs.
Developed Hive queries to process the data for visualizing and reporting.
Installed and configured Apache Hadoop to test the maintenance of log files in Hadoop cluster.
Installed and configured Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
Worked on importing and writing data to HBase and reading the same using Hive.
Installed Oozie Workflow engine to run multiple Hive and Pig Jobs.
Migrated ETL processes from Oracle to Hive to test data manipulation.
Setup and benchmarked Hadoop/HBase clusters for internal use.
Extracted data from databases like SQL Server and Oracle 10g using Sqoop Connectors into HDFS for processing using Pig and Hive.
Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
Developed Java MapReduce programs for the analysis of sample log files stored in the cluster (a minimal sketch follows this section).
Conducted some unit testing for the development team within the sandbox environment.
Environment: Apache Hadoop, Cloudera Manager CDH2, CDH3, CentOS, Java, MapReduce, HBase, Eclipse Indigo, Hive, Sqoop, Oozie and SQL.
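Illustrative sketch (not taken from the project codebase): a Java MapReduce job of the log-analysis kind referenced above, written against the Hadoop 2.x MapReduce API; the assumption that the severity level is the third whitespace-separated token is hypothetical.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Counts log lines per severity level (INFO, WARN, ERROR, ...).
public class LogLevelCount {

    public static class LevelMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text level = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] tokens = value.toString().split("\\s+");
            if (tokens.length > 2) {                          // skip malformed lines
                level.set(tokens[2]);
                context.write(level, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log level count");
        job.setJarByClass(LogLevelCount.class);
        job.setMapperClass(LevelMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}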
Client: Exilant Technologies, AP, INDIA February 2009 – April 2012
Role: Oracle DBA
Involved in 24x7 Oracle Production Database Administration.
Involved in installation, configuration and support of Oracle 11g RAC.
Refresh/Cloning of Database and applications for development and testing purpose.
Involved in upgrade/migration of Oracle databases from Oracle 9i to Oracle 10g.
Performance tuning for optimized results using Explain Plan, SQL*Trace, TKPROF and Statspack.
Created scripts to query performance views in an effort to reduce parse times and tune memory structures such as the database buffer cache, shared pool, library cache and PGA for a shared server configuration.
Designed, documented, implemented, and maintained all backup and recovery procedures, including disaster recovery.
Implemented high-availability solutions with Oracle 10g RAC, standby databases (Data Guard), and replication.
Created physical standby database (for disaster recovery) from hot backup of primary database.
Resolved gaps between primary and standby databases (Gap Resolution).
Wrote monitoring/health-check scripts to alert the team of database uptime/downtime status and sizing issues, which helped ensure availability 99.98% of the time (a minimal sketch follows this section).
Created tables, indexes, and triggers to assist developers.
Managed, troubleshot, and resolved Oracle database, RAC, and application issues.
Was involved in developing a disaster recovery (DR) plan.
Knowledge of Oracle GoldenGate installation, configuration, and troubleshooting of GoldenGate issues.
Performed backup/recovery of all Oracle databases using RMAN.
Wrote health-check scripts for the databases.
Read alert logs and user trace files to diagnose problems.
Wrote UNIX shell scripts and scheduled them through crontab.
Provided solutions to problems faced by end users and programmers.
Performed OS-level activities such as space monitoring of mount points.
Wrote scripts for analyzing tables and rebuilding indexes.
Took periodic backups of the database and the software.
Automated backups.
Interfaced with Oracle Corporation for technical support.
Environment: Oracle 9i/10.2.0.2/11.1.0.7, Maestro 7.0, HP Service Desk, Remedy, SQL Server 2000/2005, OEM 10g Grid, SQL*Loader, Windows XP, IBM AIX 5.3, WebSphere, Solaris 5.9, Informatica PowerCenter 8.1x and 8.6.
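Illustrative sketch: the health checks referenced above were UNIX shell scripts, but a minimal Java/JDBC equivalent is shown here to keep all examples in one language; the connect string, credentials, and output format are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Connects to the database and reports free space per tablespace; any exception
// is treated as a failed health check (database down or unreachable).
public class DbHealthCheck {
    public static void main(String[] args) {
        String url = "jdbc:oracle:thin:@dbhost:1521:ORCL";    // placeholder connect string
        try (Connection conn = DriverManager.getConnection(url, "monitor_user", "secret");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT tablespace_name, ROUND(SUM(bytes) / 1024 / 1024) AS free_mb "
                     + "FROM dba_free_space GROUP BY tablespace_name")) {
            while (rs.next()) {
                System.out.printf("%-30s %10d MB free%n", rs.getString(1), rs.getLong(2));
            }
        } catch (Exception e) {
            System.err.println("HEALTH CHECK FAILED: " + e.getMessage());
            System.exit(1);
        }
    }
}

In practice such a check would be scheduled through crontab and wired to alerting, as described in the bullets above.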