Harsha Kolluru
Java/Hadoop Developer - ac0qfv@r.postjobfree.com 816-***-**** Ext 202
Summary
8+ years of experience in various IT technologies, including 4 years of hands-on experience in Big Data. Implementation and extensive working experience with a wide array of tools in the Big Data stack, such as HDFS, Spark, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, Kafka, Zookeeper, and HBase.
Skill Set
HDFS
MapReduce
Hive
Yarn
Pig
Sqoop
Kafka
Storm
Flume
Oozie
Zookeeper
Apache Spark
Impala
Java
Python
Scala
J2EE
SQL
Unix
Tableau
Docker
Eclipse
Spring Boot
Elasticsearch
AWS
Nifi
Linux
Windows
Applets
Swing
JDBC
JSON
JavaScript
JSP
Servlets
JSF
jQuery
JBoss
Shell Scripting
Cassandra
MVC
Struts
Spring
Hibernate
HBase
MongoDB
DynamoDB
HTML
AJAX
XML
Apache Tomcat
Technical Summary
Proficient in installing, configuring, and using Apache Hadoop ecosystem components such as MapReduce, Hive, Pig, Flume, Yarn, HBase, Sqoop, AWS, Spark, Storm, Kafka, Oozie, and Zookeeper.
Strong comprehension of Hadoop daemons and MapReduce concepts.
Used Informatica Power Center for Extraction, Transformation, and Loading (ETL) of information from numerous sources like Flat files, XML documents, and Databases.
Experienced in developing UDFs for Pig and Hive using Java.
Strong knowledge of Spark with Scala for handling large-scale streaming data processing.
Hands-on experience developing UDFs, DataFrames, and SQL queries in Spark SQL.
Highly skilled in integrating Kafka with Spark Streaming for high-speed data processing.
Worked with NoSQL databases like HBase, Cassandra, and MongoDB for information extraction and storing huge amounts of data.
Understanding of data storage and retrieval techniques, ETL, and databases, including graph stores, relational databases, and tuple stores.
Experienced in writing Storm topologies to accept events from a Kafka producer and emit them into Cassandra.
Ability to develop MapReduce programs using Java and Python.
Hands-on experience in provisioning and managing multi-tenant Cassandra clusters on public cloud environments: Amazon Web Services (AWS) EC2 and OpenStack.
Good understanding and exposure to Python programming.
Knowledge in developing a Nifi flow prototype for data ingestion in HDFS.
Exporting and importing data to and from Oracle using SQL Developer for analysis.
Good experience in using Sqoop for traditional RDBMS data pulls.
Worked with different distributions of Hadoop like Hortonworks and Cloudera.
Strong database skills in IBM DB2 and Oracle; proficient in database development, including constraints, indexes, views, stored procedures, triggers, and cursors.
Extensive experience in Shell scripting.
Extensive use of Open Source Software and Web/Application Servers like Eclipse 3.x IDE and Apache Tomcat 6.0.
Experience in designing a component using UML Design-Use Case, Class, Sequence, and Development, Component diagrams for the requirements.
Involved in report development using reporting tools like Tableau; used Excel sheets, flat files, and CSV files to generate Tableau ad hoc reports.
Broad design, development and testing experience with Talend Integration Suite and knowledge in Performance tuning of mappings.
Experience in understanding the security requirements for Hadoop and integrating it with Kerberos authentication and authorization infrastructure.
Experience in cluster monitoring tools like Ambari and Apache Hue.
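For illustration, a MapReduce program of the kind referenced above can be sketched in plain Python in the style of a Hadoop Streaming job; this is a minimal, self-contained example, not code from any of the projects listed here.

```python
from itertools import groupby

def mapper(lines):
    # Map phase: emit a (word, 1) pair for every word, as a
    # Hadoop Streaming mapper would over stdin.
    for line in lines:
        for word in line.strip().split():
            yield word.lower(), 1

def reducer(pairs):
    # Reduce phase: pairs arrive sorted by key (the shuffle/sort step);
    # sum the counts for each distinct word.
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)

counts = dict(reducer(mapper(["Hive and Pig", "Pig and Sqoop"])))
```

In a real Hadoop Streaming job, the mapper and reducer would each be standalone scripts reading from stdin and writing tab-separated pairs to stdout; the framework performs the sort between them.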
Professional Experience
Mayo Clinic, Rochester, MN April 2016 - Present
Role: Hadoop Developer
Environment
Hadoop
Cloudera
Spark
YARN
Elasticsearch
Hive/SQL
Scala
PIG
MapReduce
Impala
HDFS
Sqoop
Talend
Kafka
Storm
ETL
Informatica
DB2
Agile
Maven
Responsibilities
Used Spark API over Cloudera Hadoop YARN to perform analytics on data.
Explored Spark to improve the performance and optimization of existing algorithms in Hadoop, using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
Worked on batch processing of data sources using Apache Spark and Elasticsearch.
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
Worked on migrating Pig scripts and MapReduce programs to the Spark DataFrames API and Spark SQL to improve performance.
Experience in pushing data from Impala to MicroStrategy.
Created scripts for importing data into HDFS/Hive using Sqoop from DB2.
Loaded data from different sources into Hive using the Talend tool.
Implemented Data Ingestion in real time processing using Kafka.
Developed data pipeline using Kafka and Storm to store data in to HDFS.
Used all major ETL transformations to load tables through Informatica mappings.
Worked on SequenceFiles, RCFiles, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
Developed Pig scripts to parse the raw data, populate staging tables and store the refined data in partitioned DB2 tables for Business analysis.
Worked on managing and reviewing Hadoop log files. Tested and reported defects following Agile methodology.
Used Apache Maven extensively while developing MapReduce program.
Coordinating with Business for UAT sign off.
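The bucketing and map-side join tuning listed above rests on one idea: rows are assigned to a fixed number of buckets by hashing the join key, so matching rows from two bucketed tables always land in same-numbered buckets and can be joined without a shuffle. A plain-Python sketch of that idea (all names are illustrative; Hive uses its own deterministic hash, Python's built-in hash is only a stand-in):

```python
def bucket_of(key, num_buckets):
    # Hive picks a bucket by hashing the clustering key.
    return hash(key) % num_buckets

def bucketize(rows, key_fn, num_buckets):
    # Split rows into num_buckets lists, like CLUSTERED BY ... INTO N BUCKETS.
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[bucket_of(key_fn(row), num_buckets)].append(row)
    return buckets

def bucket_map_join(buckets_a, buckets_b, key_a, key_b):
    # Join bucket by bucket: matching keys can only live in same-numbered
    # buckets, so no cross-bucket shuffle is ever needed.
    for side_a, side_b in zip(buckets_a, buckets_b):
        index = {}
        for row in side_b:
            index.setdefault(key_b(row), []).append(row)
        for row in side_a:
            for match in index.get(key_a(row), []):
                yield row, match

orders = [("cust1", "order-9"), ("cust2", "order-7")]
customers = [("cust1", "Alice")]
joined = list(bucket_map_join(
    bucketize(orders, lambda r: r[0], 4),
    bucketize(customers, lambda r: r[0], 4),
    lambda r: r[0], lambda r: r[0]))
```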
Neo Prism Solutions, Schaumburg, IL Jan 2015 - Mar 2016
Role: Hadoop Developer
Environment
Hadoop
Pig
Hive
MapReduce
Flume
HDFS
AWS
DynamoDB
PySpark
HBase
Spring Boot
Linux
Sqoop
Python
Oozie
Nagios
Ganglia
Docker
Responsibilities
Worked on Hadoop cluster using different big data analytic tools including Pig, Hive, and MapReduce
Collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis
Worked on debugging, performance tuning of Hive & Pig Jobs.
Worked on AWS environment for developing and deploying of custom Hadoop applications.
Extracted and stored data on DynamoDB to work on the Hadoop application.
Generated pipelines using PySpark and Hive.
Created HBase tables to store various data formats of PII data coming from different portfolios
Experience in developing Java applications using Spring Boot.
Involved in loading data from LINUX file system to HDFS
Importing and exporting data into HDFS and Hive using Sqoop
Experience working on processing unstructured data using Pig and Hive
Developed Spark scripts using Python.
Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
Assisted in monitoring the Hadoop cluster using tools like Nagios and Ganglia.
Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries and Pig Scripts
Developed Docker images, containers, and a registry.
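The PySpark pipeline work above amounts to chaining transformations over distributed collections. A toy stand-in, in plain Python so it runs anywhere, whose method names deliberately mirror the PySpark RDD API (this is an illustration, not PySpark itself):

```python
class MiniRDD:
    # A toy stand-in for a PySpark RDD: a chain of map/filter/reduceByKey.
    def __init__(self, data):
        self.data = list(data)

    def map(self, fn):
        return MiniRDD(fn(x) for x in self.data)

    def filter(self, pred):
        return MiniRDD(x for x in self.data if pred(x))

    def reduceByKey(self, fn):
        # Combine values sharing a key, like PySpark's reduceByKey.
        acc = {}
        for k, v in self.data:
            acc[k] = fn(acc[k], v) if k in acc else v
        return MiniRDD(acc.items())

    def collect(self):
        return list(self.data)

# Count page hits per user from raw log lines, PySpark-style.
logs = ["u1 /home", "u2 /cart", "u1 /cart"]
hits = (MiniRDD(logs)
        .map(lambda line: line.split())
        .map(lambda parts: (parts[0], 1))
        .reduceByKey(lambda a, b: a + b)
        .collect())
```

In real PySpark the same chain would start from `sparkContext.textFile(...)` and each transformation would run partition-by-partition across the cluster.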
SOGETI, Cincinnati, OH Feb 2013 - Nov 2014
Role: Hadoop Developer
Environment
Hadoop
MapReduce
HDFS
UNIX
Hive
Sqoop
Cassandra
ETL
Pig Script
Cloudera
Oozie
Responsibilities
Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing.
Involved in loading data from UNIX file system to HDFS.
Installed and configured Hive and wrote Hive UDFs.
Importing and exporting data into HDFS and Hive using Sqoop
Used Cassandra CQL and Java API’s to retrieve data from Cassandra table.
Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
Worked hands on with ETL process.
Handled importing of data from various data sources, performed transformations using Hive, MapReduce, and loaded data into HDFS.
Extracted the data from Teradata into HDFS using Sqoop.
Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior, such as shopping enthusiasts, travelers, and music lovers.
Exported the patterns analyzed back into Teradata using Sqoop.
Continuous monitoring and managing the Hadoop cluster through Cloudera Manager.
Installed the Oozie workflow engine to run multiple Hive jobs.
Developed Hive queries to process the data and generate data cubes for visualization.
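The cleansing and preprocessing jobs described above typically normalize fields and drop malformed records before load. A minimal plain-Python sketch of such an ETL step (the field layout here is hypothetical, not from any project listed):

```python
import csv
import io

def cleanse(raw_csv):
    # Drop malformed rows, trim whitespace, and normalize case
    # before the records are loaded downstream.
    reader = csv.reader(io.StringIO(raw_csv))
    for row in reader:
        if len(row) != 3:          # skip malformed records
            continue
        user_id, email, city = (field.strip() for field in row)
        if not user_id or "@" not in email:
            continue               # basic validity checks
        yield user_id, email.lower(), city.title()

raw = "42, A@X.COM ,boston\nbadrow\n , b@y.com, denver\n"
rows = list(cleanse(raw))
```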
State of Idaho, West State Street, ID Oct 2011 - Jan 2013
Role: Java Developer
Environment
Java, JSP2.1, EJB
J2EE, MVC2
Struts
Servlets 3.0
JDBC 4.0
Ajax
Java Script
HTML5, CSS
JBoss, EJB, JTA
JMS, MDB
SOAP
XSL/ XSLT, XML
Struts
MVC, DAO
JUnit
PL/SQL
Responsibilities
Developed, Tested and Debugged the Java, JSP and EJB components using Eclipse.
Implemented J2EE standards, MVC2 architecture using Struts Framework
Developed web components using JSP, Servlets and JDBC
Handled client-side validations using JavaScript and was involved in integrating different Struts actions in the framework.
For analysis and design of application created Use Cases, Class and Sequence Diagrams.
Implemented Servlets, JSP, and Ajax to design the user interface.
Used JSP, JavaScript, HTML5, and CSS for manipulating, validating, and customizing error messages in the user interface.
Used JBoss for EJB and JTA, for caching and clustering purpose
Used EJBs (session beans) to implement the business logic, JMS for sending updates to various other applications, and MDBs for routing priority requests.
Wrote Web Services using SOAP for sending and getting data from the external interface
Used XSL/XSLT for transforming and displaying reports; developed schemas for XML.
Developed a web-based reporting for monitoring system with HTML and Tiles using Struts framework
Used Design patterns such as Business delegate, Service locator, Model View Controller (MVC), Session, DAO.
Involved in fixing defects and unit testing with test cases using JUnit
Developed stored procedures and triggers in PL/SQL
App Labs, Hyderabad, India Jul 2008 - Aug 2011
Role: Java Developer
Environment
Servlets, JSP
HTML
JavaScript, XML
CSS
MVC
Struts
PL/SQL
JDBC
Oracle
Hibernate
JUnit
Responsibilities
Implemented server side programs by using Servlets and JSP.
Designed, developed, and validated user interfaces using HTML, JavaScript, XML, and CSS.
Implemented MVC using Struts Framework.
Handled the database access by implementing Controller Servlet.
Implemented PL/SQL stored procedures and triggers.
Used JDBC prepared statements to call from Servlets for database access.
Designed and documented stored procedures.
Widely used HTML for web based design.
Worked on database interactions layer for updating and retrieving of data from Oracle database by writing stored procedures.
Used the Spring framework for dependency injection and integration with Hibernate. Involved in writing JUnit test cases.
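JDBC prepared statements, used above for database access, bind parameters separately from the SQL text. The same idea in Python's DB-API, using sqlite3 purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
# Parameter placeholders (?) keep data out of the SQL string itself,
# the same guarantee a JDBC PreparedStatement gives.
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Harsha"))
name = conn.execute("SELECT name FROM users WHERE id = ?", (1,)).fetchone()[0]
```

Besides preventing SQL injection, parameterized statements let the database reuse the compiled query plan across calls.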
Education
Qualification: Bachelor of Technology
Department: Electronics and Communication
College: Sreenidhi Institute of Technology
Location: Ghatkesar, Hyderabad, India