Harsha Kolluru
Java/Hadoop Developer - ac0qfv@r.postjobfree.com 816-***-**** Ext 202
Summary
8+ years of experience in various IT technologies, including 4 years of hands-on experience in Big Data. Implementation and extensive working experience with a wide array of tools in the Big Data stack, such as HDFS, Spark, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, Kafka, Zookeeper, and HBase.
Skill Set
HDFS
MapReduce
Hive
Yarn
Pig
Sqoop
Kafka
Storm
Flume
Oozie
Zookeeper
Apache Spark
Impala
Java
Python
Scala
J2EE
SQL
Unix
Tableau
Docker
Eclipse
Spring Boot
Elasticsearch
AWS
Nifi
Linux
Windows
Applets
Swing
JDBC
JSON
JavaScript
JSP
Servlets
JSF
jQuery
JBoss
Shell Scripting
Cassandra
MVC
Struts
Spring
Hibernate
HBase
MongoDB
DynamoDB
HTML
AJAX
XML
Apache Tomcat
Technical Summary
Proficient in installing, configuring, and using Apache Hadoop ecosystem components such as MapReduce, Hive, Pig, Flume, Yarn, HBase, Sqoop, AWS, Spark, Storm, Kafka, Oozie, and Zookeeper.
Strong comprehension of Hadoop daemons and MapReduce concepts.
Used Informatica Power Center for Extraction, Transformation, and Loading (ETL) of information from numerous sources like Flat files, XML documents, and Databases.
Experienced in developing UDFs for Pig and Hive using Java.
Strong knowledge of Spark with Scala for handling large-scale streaming data processing.
Hands-on experience developing UDFs, DataFrames, and SQL queries in Spark SQL.
Highly skilled in integrating Kafka with Spark Streaming for high-speed data processing.
Worked with NoSQL databases like HBase, Cassandra, and MongoDB for information extraction and storing huge amounts of data.
Understanding of data storage and retrieval techniques, ETL, and databases, including graph stores, relational databases, and tuple stores.
Experienced in writing Storm topologies to accept events from a Kafka producer and emit them into Cassandra.
Ability to develop MapReduce programs using Java and Python.
Hands-on experience in provisioning and managing multi-tenant Cassandra clusters on public cloud environments: Amazon Web Services (AWS) EC2 and OpenStack.
Good understanding and exposure to Python programming.
Knowledge in developing a Nifi flow prototype for data ingestion in HDFS.
Exporting and importing data to and from Oracle using SQL Developer for analysis.
Good experience in using Sqoop for traditional RDBMS data pulls.
Worked with different distributions of Hadoop like Hortonworks and Cloudera.
Strong database skills in IBM DB2 and Oracle; proficient in database development, including constraints, indexes, views, stored procedures, triggers, and cursors.
Extensive experience in Shell scripting.
Extensive use of Open Source Software and Web/Application Servers like Eclipse 3.x IDE and Apache Tomcat 6.0.
Experience in designing a component using UML Design-Use Case, Class, Sequence, and Development, Component diagrams for the requirements.
Involved in report development using reporting tools like Tableau; used Excel sheets, flat files, and CSV files to generate Tableau ad hoc reports.
Broad design, development and testing experience with Talend Integration Suite and knowledge in Performance tuning of mappings.
Experience in understanding the security requirements for Hadoop and integrating it with Kerberos authentication and authorization infrastructure.
Experience in cluster monitoring tools like Ambari and Apache Hue.
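For illustration, a MapReduce program of the kind referenced above can be sketched in plain Python in the style of a Hadoop Streaming job; this is a minimal, self-contained example, not code from any of the projects listed here.

```python
from itertools import groupby

def mapper(lines):
    # Map phase: emit a (word, 1) pair for every word, as a
    # Hadoop Streaming mapper would over stdin.
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1

def reducer(pairs):
    # Reduce phase: pairs arrive sorted by key (the shuffle/sort step);
    # sum the counts for each distinct word.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

counts = dict(reducer(mapper(["Hive and Pig", "Pig and Sqoop"])))
```

In a real Hadoop Streaming job, the mapper and reducer would each be standalone scripts reading from stdin and writing tab-separated pairs to stdout; the framework performs the sort between them.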
Professional Experience
Mayo Clinic, Rochester, MN April 2016 - Present
Role: Hadoop Developer
Environment
Hadoop
Cloudera
Spark
YARN
Elasticsearch
Hive/SQL
Scala
PIG
MapReduce
Impala
HDFS
Sqoop
Talend
Kafka
Storm
ETL
Informatica
DB2
Agile
Maven
Responsibilities
Used Spark API over Cloudera Hadoop YARN to perform analytics on data.
Explored Spark to improve the performance and optimization of existing algorithms in Hadoop, using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
Worked on batch processing of data sources using Apache Spark and Elasticsearch.
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
Worked on migrating Pig scripts and MapReduce programs to the Spark DataFrames API and Spark SQL to improve performance.
Experience in pushing data from Impala to MicroStrategy.
Created scripts for importing data into HDFS/Hive using Sqoop from DB2.
Loaded data from different sources into Hive using the Talend tool.
Implemented Data Ingestion in real time processing using Kafka.
Developed data pipeline using Kafka and Storm to store data in to HDFS.
Used all major ETL transformations to load tables through Informatica mappings.
Worked on SequenceFiles, RCFiles, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
Developed Pig scripts to parse the raw data, populate staging tables and store the refined data in partitioned DB2 tables for Business analysis.
Worked on managing and reviewing Hadoop log files. Tested and reported defects following Agile methodology.
Used Apache Maven extensively while developing MapReduce program.
Coordinating with Business for UAT sign off.
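The bucketing and map-side join tuning listed above rests on one idea: rows are assigned to a fixed number of buckets by hashing the join key, so matching rows from two bucketed tables always land in same-numbered buckets and can be joined without a shuffle. A plain-Python sketch of that idea (all names are illustrative; Hive uses its own deterministic hash, Python's built-in hash is only a stand-in):

```python
def bucket_of(key, num_buckets):
    # Hive picks a bucket by hashing the clustering key.
    return hash(key) % num_buckets

def bucketize(rows, key_fn, num_buckets):
    # Split rows into num_buckets lists, like CLUSTERED BY ... INTO N BUCKETS.
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[bucket_of(key_fn(row), num_buckets)].append(row)
    return buckets

def bucket_map_join(buckets_a, buckets_b, key_a, key_b):
    # Join bucket by bucket: matching keys can only live in same-numbered
    # buckets, so no cross-bucket shuffle is ever needed.
    for side_a, side_b in zip(buckets_a, buckets_b):
        index = {}
        for row in side_b:
            index.setdefault(key_b(row), []).append(row)
        for row in side_a:
            for match in index.get(key_a(row), []):
                yield row, match

orders = [("cust1", "order-9"), ("cust2", "order-7")]
customers = [("cust1", "Alice")]
joined = list(bucket_map_join(
    bucketize(orders, lambda r: r[0], 4),
    bucketize(customers, lambda r: r[0], 4),
    lambda r: r[0], lambda r: r[0]))
```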
Neo Prism Solutions, Schaumburg, IL Jan 2015 - Mar 2016
Role: Hadoop Developer
Environment
Hadoop
Pig
Hive
MapReduce
Flume
HDFS
AWS
DynamoDB
PySpark
HBase
Spring Boot
Linux
Sqoop
Python
Oozie
Nagios
Ganglia
Docker
Responsibilities
Worked on Hadoop cluster using different big data analytic tools including Pig, Hive, and MapReduce
Collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis
Worked on debugging, performance tuning of Hive & Pig Jobs.
Worked on AWS environment for developing and deploying of custom Hadoop applications.
Extracted and stored data on DynamoDB to work on the Hadoop application.
Generated pipelines using PySpark and Hive.
Created HBase tables to store various data formats of PII data coming from different portfolios
Experience in developing Java applications using Spring Boot.
Involved in loading data from LINUX file system to HDFS
Importing and exporting data into HDFS and Hive using Sqoop
Experience working on processing unstructured data using Pig and Hive
Developed Spark scripts using Python.
Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
Assisted in monitoring the Hadoop cluster using tools like Nagios and Ganglia.
Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries and Pig Scripts
Developed Docker images, containers, and a registry.
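The PySpark pipeline work above amounts to chaining transformations over distributed collections. A toy stand-in, in plain Python so it runs anywhere, whose method names deliberately mirror the PySpark RDD API (this is an illustration, not PySpark itself):

```python
class MiniRDD:
    # A toy stand-in for a PySpark RDD: a chain of map/filter/reduceByKey.
    def __init__(self, data):
        self.data = list(data)

    def map(self, fn):
        return MiniRDD(fn(x) for x in self.data)

    def filter(self, pred):
        return MiniRDD(x for x in self.data if pred(x))

    def reduceByKey(self, fn):
        # Combine values sharing a key, like PySpark's reduceByKey.
        acc = {}
        for k, v in self.data:
            acc[k] = fn(acc[k], v) if k in acc else v
        return MiniRDD(acc.items())

    def collect(self):
        return list(self.data)

# Count page hits per user from raw log lines, PySpark-style.
logs = ["u1 /home", "u2 /cart", "u1 /cart"]
hits = (MiniRDD(logs)
        .map(lambda line: line.split())
        .map(lambda parts: (parts[0], 1))
        .reduceByKey(lambda a, b: a + b)
        .collect())
```

In real PySpark the same chain would start from `sparkContext.textFile(...)` and each transformation would run partition-by-partition across the cluster.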
SOGETI, Cincinnati, OH Feb 2013 - Nov 2014
Role: Hadoop Developer
Environment
Hadoop
MapReduce
HDFS
UNIX
Hive
Sqoop
Cassandra
ETL
Pig Script
Cloudera
Oozie
Responsibilities
Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing.
Involved in loading data from UNIX file system to HDFS.
Installed and configured Hive and wrote Hive UDFs.
Importing and exporting data into HDFS and Hive using Sqoop
Used Cassandra CQL and Java API’s to retrieve data from Cassandra table.
Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
Worked hands on with ETL process.
Handled importing of data from various data sources, performed transformations using Hive, MapReduce, and loaded data into HDFS.
Extracted the data from Teradata into HDFS using Sqoop.
Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior, such as shopping enthusiasts, travelers, and music lovers.
Exported the patterns analyzed back into Teradata using Sqoop.
Continuous monitoring and managing the Hadoop cluster through Cloudera Manager.
Installed the Oozie workflow engine to run multiple Hive jobs.
Developed Hive queries to process the data and generate data cubes for visualization.
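The cleansing and preprocessing jobs described above typically normalize fields and drop malformed records before load. A minimal plain-Python sketch of such an ETL step (the field layout here is hypothetical, not from any project listed):

```python
import csv
import io

def cleanse(raw_csv):
    # Drop malformed rows, trim whitespace, and normalize case
    # before the records are loaded downstream.
    reader = csv.reader(io.StringIO(raw_csv))
    for row in reader:
        if len(row) != 3:          # skip malformed records
            continue
        user_id, email, city = (field.strip() for field in row)
        if not user_id or "@" not in email:
            continue               # basic validity checks
        yield user_id, email.lower(), city.title()

raw = "42, A@X.COM ,boston\nbadrow\n , b@y.com, denver\n"
rows = list(cleanse(raw))
```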
State of Idaho, West State Street, ID Oct 2011 - Jan 2013
Role: Java Developer
Environment
Java, JSP2.1, EJB
J2EE, MVC2
Struts
Servlets 3.0
JDBC 4.0
Ajax
Java Script
HTML5, CSS
JBoss, EJB, JTA
JMS, MDB
SOAP
XSL/ XSLT, XML
Struts
MVC, DAO
JUnit
PL/SQL
Responsibilities
Developed, Tested and Debugged the Java, JSP and EJB components using Eclipse.
Implemented J2EE standards, MVC2 architecture using Struts Framework
Developed web components using JSP, Servlets and JDBC
Handled client-side validations using JavaScript and was involved in integrating different Struts actions in the framework.
For analysis and design of application created Use Cases, Class and Sequence Diagrams.
Implemented Servlets, JSP, and Ajax to design the user interface.
Used JSP, JavaScript, HTML5, and CSS for manipulating, validating, and customizing error messages in the user interface.
Used JBoss for EJB and JTA, for caching and clustering purpose
Used EJBs (session beans) to implement the business logic, JMS for sending updates to various other applications, and MDBs for routing priority requests.
Wrote Web Services using SOAP for sending and getting data from the external interface
Used XSL/XSLT for transforming and displaying reports; developed schemas for XML.
Developed a web-based reporting for monitoring system with HTML and Tiles using Struts framework
Used Design patterns such as Business delegate, Service locator, Model View Controller (MVC), Session, DAO.
Involved in fixing defects and unit testing with test cases using JUnit
Developed stored procedures and triggers in PL/SQL
App Labs, Hyderabad, India Jul 2008 - Aug 2011
Role: Java Developer
Environment
Servlets, JSP
HTML
JavaScript, XML
CSS
MVC
Struts
PL/SQL
JDBC
Oracle
Hibernate
JUnit
Responsibilities
Implemented server side programs by using Servlets and JSP.
Designed, developed, and validated user interfaces using HTML, JavaScript, XML, and CSS.
Implemented MVC using Struts Framework.
Handled the database access by implementing Controller Servlet.
Implemented PL/SQL stored procedures and triggers.
Used JDBC prepared statements to call from Servlets for database access.
Designed and documented stored procedures.
Widely used HTML for web based design.
Worked on database interactions layer for updating and retrieving of data from Oracle database by writing stored procedures.
Used the Spring framework for dependency injection and integration with Hibernate. Involved in writing JUnit test cases.
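JDBC prepared statements, used above for database access, bind parameters separately from the SQL text. The same idea in Python's DB-API, using sqlite3 purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
# Parameter placeholders (?) keep data out of the SQL string itself,
# the same guarantee a JDBC PreparedStatement gives.
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Harsha"))
name = conn.execute("SELECT name FROM users WHERE id = ?", (1,)).fetchone()[0]
```

Besides preventing SQL injection, parameterized statements let the database reuse the compiled query plan across calls.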
Education
Qualification: Bachelor of Technology
Department: Electronics and Communication
College: Sreenidhi Institute of Technology
Location: Ghatkesar, Hyderabad, India