RAVI DEVALLA Email: ****.********@*****.***
HADOOP DEVELOPER Phone: 908-***-****
PROFESSIONAL SUMMARY
* ***** ** ************ ********** in IT, including Big Data and Hadoop. Well versed in installing, configuring, supporting, and managing Big Data workloads and the underlying infrastructure of Hadoop clusters.
Hands-on experience with major components of the Hadoop ecosystem, including Hadoop MapReduce, HDFS, Hive, Pig, Pentaho, HBase, ZooKeeper, Sqoop, Oozie, and Flume.
Hands-on experience with Cloudera Hadoop distributions.
Responsible for writing MapReduce programs.
Good hands-on experience with Apache Spark and Scala.
Ingested data into HDFS using Sqoop and streamed log data into HDFS using Flume.
Analyzed data efficiently by writing custom Hadoop MapReduce programs in Java and UDFs for Pig and Hive (a minimal UDF sketch follows this summary).
Collected log data from various sources and integrated it into HDFS using Flume.
Experienced in logical data modeling for, and interaction with, HBase.
Developed MapReduce jobs to automate data transfer from HBase.
Good knowledge of Hadoop cluster architecture and cluster monitoring.
Experience importing and exporting data with Sqoop between HDFS and relational database systems.
Strong understanding of NoSQL databases such as HBase, MongoDB, and Cassandra.
Good experience executing UNIX commands and writing shell scripts.
Worked across multiple environments on installation and configuration.
Experience loading data into Hive partitions and creating buckets in Hive.
Good understanding of Cassandra Query Language (CQL).
Excellent object-oriented programming skills in C++ and Java, with an in-depth understanding of data structures and algorithms.
Excellent Java development skills using J2EE, J2SE, Servlets, JUnit, JSP, and JDBC.
Experience integrating various data sources, including Java, RDBMS, shell scripting, spreadsheets, and text files.
Experience in database programming in Oracle environments using SQL, with GUI tools such as SQL*Plus and SQL Developer.
Experience deploying and configuring application servers such as WebLogic and Apache Tomcat.
Possess strong communication and Interpersonal skills. Proven success in initiating, promoting and maintaining strong interpersonal relations. Can quickly master and work on new concepts and applications with minimal supervision.
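A minimal sketch of the kind of Hive UDF mentioned above, written against the classic org.apache.hadoop.hive.ql.exec.UDF base class; the class name and normalization logic are hypothetical, for illustration only:

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: normalizes free-text codes to trimmed,
// upper-case form so they join consistently across tables.
public class NormalizeCode extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;  // Hive passes SQL NULLs through as Java nulls
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}

A UDF like this would be compiled into a JAR, then registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before use in queries.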
EDUCATION: Bachelor of Technology, JNTU, Kakinada, India
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, HBase, Oozie, ZooKeeper, YARN, Apache Spark, Kafka
Programming Languages: Java, Python, C, SQL, PL/SQL, PHP, JavaScript, Scala
Frameworks: Hibernate 2.x/3.x, Spring 2.x/3.x, Struts 1.x/2.x, JPA
Operating Systems: UNIX, Linux, Windows
Application Servers: IBM WebSphere, Apache Tomcat, WebLogic
Databases: Oracle, DB2, H2, SQL Server, MySQL, MongoDB
Java IDEs: Eclipse 3.x, IBM WebSphere Application Developer, IBM RAD 7.0
PROFESSIONAL EXPERIENCE
Fidelis Care, Rego Park, New York July 2016 - Present
Hadoop Developer
Description: Fidelis Care offers quality, affordable coverage to residents across the State of New York, including products available through Medicaid, Medicare, and the New York State of Health marketplace.
Responsibilities
Worked on implementation and maintenance of Cloudera Hadoop cluster.
Assisted in upgrading, configuring, and maintaining Hadoop ecosystem components such as Pig, Hive, and HBase.
Developed and executed custom MapReduce programs, Pig Latin scripts, and HQL queries (see the MapReduce sketch after this section).
Developed Schedulers that communicated with the Cloud based services (AWS) to retrieve the data.
Used Hadoop FS scripts for HDFS (Hadoop Distributed File System) data loading and manipulation.
Performed Hive test queries on local sample files and HDFS files.
Developed and optimized Pig and Hive UDFs (User-Defined Functions) to implement the functionality of external languages as and when required.
Implemented Frameworks using Scala and Python to automate the ingestion flow.
Extensively used Pig for data cleaning and optimization.
Developed Hive queries to analyze data and generate results.
Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
Worked on reading multiple data formats on HDFS using Scala.
Managed, reviewed and interpreted Hadoop log files.
Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL.
Worked on Solr for indexing and search optimization.
Analyzed business requirements and cross-verified them against the functionality and features of NoSQL databases such as HBase and Cassandra to determine the optimal database.
Analyzed user request patterns and implemented performance optimizations, including partitions and buckets in HiveQL.
Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
Monitored workload, job performance and node health using Cloudera Manager.
Used Flume to collect and aggregate weblog data from different sources and pushed to HDFS.
Integrated Oozie with MapReduce, Pig, Hive, and Sqoop.
Environment: Hadoop 1.x, HDFS, MapReduce, Pig 0.11, Scala, Spark, Python, AWS, Hive 0.10, Crystal Reports, Sqoop, HBase, Shell Scripting, UNIX
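A minimal sketch of the kind of custom MapReduce program mentioned above, assuming pipe-delimited input files with a status code in the third column; the job, class, and field names are hypothetical:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical job: counts records per status code in delimited HDFS files.
public class StatusCount {

    public static class StatusMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text status = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|"); // pipe-delimited (assumption)
            if (fields.length > 2) {
                status.set(fields[2]);                        // status column (assumption)
                context.write(status, ONE);
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "status-count");
        job.setJarByClass(StatusCount.class);
        job.setMapperClass(StatusMapper.class);
        job.setCombinerClass(SumReducer.class);  // safe: sum is associative
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}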
JPMC, Columbus, OH Sep 2015- June 2016
Hadoop developer
Description: JPMC is a leader in investment banking, financial services for consumers and small businesses, commercial banking, financial transaction processing, and asset management. JPMC serves millions of consumers and small businesses, and many of the world's most prominent corporate, institutional, and government clients.
Responsibilities
Implemented best practices for the full software development life cycle including coding standards, code reviews, source control management and build processes.
Effectively used Sqoop to transfer data between databases (RDBMS) and HDFS.
Designed workflows by scheduling Hive processes for log-file data streamed into HDFS using Flume.
Involved in creating Hive tables and loading and analyzing data using Hive queries.
Developed Pig Latin scripts to extract data from mainframe output files and load it into HDFS.
Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources and make it suitable for ingestion into the Hive schema for analysis.
Developed Hive and Scala queries over different data formats, such as text and CSV files.
Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive (see the Kafka/Avro sketch after this section).
Wrote Hive queries for analysis and reporting across different streams in the company.
Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, and Sqoop, as well as system-specific jobs.
Analyzed the SQL scripts and designed the solution to implement using Scala.
Used the Avro serialization technique to serialize data, applied transformations and standardizations, and loaded the results into HBase for further processing.
Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
Documented all requirements, code, and implementation methodologies for review and analysis.
Used Sqoop extensively to ingest data from various source systems into HDFS.
Wrote Hive queries for data analysis to meet business requirements.
Created Hive tables and worked on them using HiveQL.
Installed the cluster and worked on commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
Installed and Configured Hadoop cluster using Amazon Web Services (AWS) for POC purposes.
Assisted in loading large sets of data (structured, semi-structured, and unstructured).
Installed and configured Flume, Sqoop, Pig, Hive, and HBase on Hadoop clusters.
Developed Shell, Perl and Python scripts to automate and provide Control flow to Pig scripts.
Managed Hadoop clusters, including adding and removing cluster nodes for maintenance and capacity needs.
Wrote test cases in JUnit for unit testing of classes, documented the unit testing, and logged and resolved defects in the rollout phase.
Environment: Apache Hadoop, HDFS, Hive, Map Reduce, Java, Python, Flume, Scala, Spark, Cloudera, Oozie, AWS, MySQL, UNIX, Teradata, Oracle 11g, HP Quality Center and Application Lifecycle Management.
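A minimal sketch combining the Kafka producer and Avro serialization work mentioned above: an event is encoded with Avro's binary encoder and published to Kafka as a byte array. The schema, topic name, and broker address are hypothetical:

import java.io.ByteArrayOutputStream;
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Hypothetical producer: serializes events with Avro, publishes to Kafka.
public class EventProducer {
    private static final String SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"string\"},"
      + "{\"name\":\"amount\",\"type\":\"double\"}]}";

    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
            GenericRecord event = new GenericData.Record(schema);
            event.put("id", "txn-001");
            event.put("amount", 42.5);

            // Avro binary encoding into an in-memory buffer
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
            new GenericDatumWriter<GenericRecord>(schema).write(event, encoder);
            encoder.flush();

            producer.send(new ProducerRecord<>("events", "txn-001", out.toByteArray()));
        }  // close() flushes any buffered sends
    }
}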
Reyes Holdings, Rosemont, IL May 2014 -Aug 2015
Hadoop Developer
Description: Reyes Holdings, aligned with leading brewers and foodservice providers, delivers the best-known brands and the widest variety of food and beverage items to retailers around the world. I worked as a Hadoop Developer on the Data Insights team, where I analyzed big data sets on Hadoop clusters and identified customer trends that informed market targeting, region-wise brand popularity analysis, and advertising investment allocation.
Responsibilities
Worked on the proof of concept for the Apache Hadoop framework initiative.
Installed and configured Hadoop 0.22.0 (MapReduce, HDFS) and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
Imported and exported data between HDFS and Hive using Sqoop.
Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
Implemented partitioning, dynamic partitions, and buckets in Hive (illustrated in the sketch after this section).
Responsible for managing data coming from different sources.
Monitored running MapReduce programs on the cluster.
Responsible for loading data from UNIX file systems into HDFS.
Installed and configured Hive and wrote Hive UDFs.
Implemented workflows using the Apache Oozie framework to automate tasks.
Developed scripts, automated end-to-end data management, and kept data synchronized across all the clusters.
Managed IT and business stakeholders and conducted assessment interviews and solution review sessions.
Reviewed developed code and flagged issues with respect to customer data.
Used SQL queries and other tools to perform data analysis and profiling.
Mentored and trained the engineering team in the use of the Hadoop platform, analytical software, and development technologies.
Environment: Apache Hadoop, Java (JDK 1.6), DataStax, flat files, Oracle 11g/10g, MySQL, Toad 9.6, Windows NT, UNIX, Sqoop, Hive, Oozie.
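A minimal sketch of the Hive partitioning and bucketing described above, issued through the HiveServer2 JDBC driver; the table and column names are hypothetical, for illustration only:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Hypothetical sketch: creates a partitioned, bucketed Hive table and loads it
// with dynamic partitioning over HiveServer2 JDBC.
public class HivePartitionDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://localhost:10000/default", "", "");
             Statement stmt = conn.createStatement()) {

            stmt.execute("CREATE TABLE IF NOT EXISTS sales_part ("
                       + " item_id STRING, amount DOUBLE)"
                       + " PARTITIONED BY (sale_date STRING)"
                       + " CLUSTERED BY (item_id) INTO 16 BUCKETS");

            // Dynamic partitioning: partition values come from the query itself
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
            stmt.execute("INSERT INTO TABLE sales_part PARTITION (sale_date)"
                       + " SELECT item_id, amount, sale_date FROM sales_staging");
        }
    }
}

Partitioning prunes whole directories at query time, while bucketing splits each partition into a fixed number of files to speed up joins and sampling.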
Caesars Entertainment, Reno, NV July 2013 -April 2014
Hadoop Developer
Description: Caesars Entertainment Corporation is the world's largest casino entertainment company, focused on building loyalty and value with its customers through a unique combination of great service, excellent products, unsurpassed distribution, operational excellence and technology leadership.
Responsibilities:
Responsible for building scalable distributed data solutions using Hadoop.
Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
Analyzed data using Hadoop components Hive and Pig.
Worked hands on with ETL process.
Responsible for running Hadoop streaming jobs to process terabytes of XML data.
Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
Responsible for creating Hive tables, loading data, and writing Hive queries.
Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
Extracted data from Teradata into HDFS using Sqoop.
Exported the analyzed patterns back to Teradata using Sqoop.
Installed the Oozie workflow engine to run multiple Hive and Pig jobs, which run independently based on time and data availability.
Environment: Hadoop Cluster, HDFS, Hive, Pig, Sqoop, Hadoop MapReduce, HBase, Shell Scripting
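HBase features in the stack above; a minimal client sketch using the 0.9x-era HBase API, writing one cell under a composite row key. The table, column family, and qualifier names are hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical sketch: writes one cell into an HBase table.
public class HBaseWriteDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();  // reads hbase-site.xml
        HTable table = new HTable(conf, "customer_events");
        try {
            Put put = new Put(Bytes.toBytes("cust-001#20140115"));  // composite row key
            put.add(Bytes.toBytes("d"),       // column family
                    Bytes.toBytes("amount"),  // qualifier
                    Bytes.toBytes("99.95"));  // value
            table.put(put);
        } finally {
            table.close();
        }
    }
}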
NorthShore University Health System, IL July 2012 - Feb 2013
Java Developer
Description: NorthShore University Health System (NorthShore) is an integrated healthcare delivery system serving patients throughout the Chicago metropolitan area.
Responsibilities
Involved in development of business domain concepts into Use Cases, Sequence Diagrams, Class Diagrams, Component Diagrams and Implementation Diagrams.
Implemented various J2EE Design Patterns such as Model-View-Controller, Data Access Object, Business Delegate and Transfer Object.
Responsible for analysis and design of the application based on MVC Architecture, using open source Struts Framework.
Involved in configuring Struts, Tiles and developing the configuration files.
Developed Struts Action classes and Validation classes using Struts controller component and Struts validation framework.
Developed and deployed UI-layer logic using JSP, XML, JavaScript, and HTML/DHTML.
Used the Spring Framework and integrated it with Struts.
Involved in configuring web.xml and struts-config.xml according to the Struts framework.
Designed a lightweight model for the product using the Inversion of Control principle and implemented it successfully using the Spring IoC container.
Used the transaction interceptor provided by Spring for declarative transaction management.
Managed dependencies between classes with Spring's dependency injection to promote loose coupling.
Provided connections using JDBC to the database and developed SQL queries to manipulate the data.
Wrote stored procedures and used Java APIs to call them (see the JDBC sketch after this section).
Developed various test cases, including unit tests, mock tests, and integration tests, using JUnit.
Experienced in writing stored procedures, functions, and packages.
Environment: Java, J2EE, Struts MVC, Tiles, JDBC, JSP, JavaScript, HTML, Spring IoC, Spring AOP, JAX-WS, Ant, WebSphere Application Server, Oracle, JUnit, Log4j, Eclipse
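A minimal sketch of calling a stored procedure through JDBC, as mentioned above; the connection URL, procedure name, and parameters are hypothetical:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

// Hypothetical sketch: calls an Oracle stored procedure over JDBC and reads
// an OUT parameter.
public class ProcedureCallDemo {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "password");
             CallableStatement call =
                 conn.prepareCall("{call get_patient_count(?, ?)}")) {

            call.setString(1, "CARDIOLOGY");             // IN: department code
            call.registerOutParameter(2, Types.INTEGER); // OUT: patient count
            call.execute();

            System.out.println("count = " + call.getInt(2));
        }
    }
}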
Value Labs, Hyderabad, India Jan 2010 – June 2012
Java Developer
Project Name: WORx (Healthcare Management Software)
Description: WORx is a comprehensive healthcare pharmacy workflow management software used by pharmacies to streamline clinical operations. It provides an array of services, including electronic prescription, drug-drug interaction checking, medication dispensing, patient medication history, insurance services, and inventory management. This project comprised rewriting existing legacy applications and adding a range of new features.
Responsibilities
Reverse engineered legacy systems, analyzed and documented various existing workflows.
Collaborated with business teams to consolidate various new requirements.
Assisted in creating design documents such as class diagrams and sequence diagrams; used Confluence for design documentation.
Helped design a modular application based on a microservices architecture.
Implemented various backend modules that collaborate with each other via RESTful web services.
Designed RESTful URLs for various modules and implemented the corresponding endpoints using Spring MVC.
Defined XSDs for various payloads and created JAXB objects from XSDs.
Defined DAO interfaces and added documentation to define contracts.
Following the DAO design pattern, coded DAO implementations against the DAO contract interfaces.
Following a test-first methodology, wrote unit test cases for the DAOs using an in-memory database (H2).
Implemented JMS sender and receiver components to achieve asynchronous, high-throughput communication between submodules (see the JMS sketch after this section).
As part of legacy system maintenance, fixed bugs, made enhancements, added new service APIs, and added new features to the UI.
Deployed builds on Apache Tomcat.
Used Eclipse as the IDE, Maven for build management, JIRA for issue tracking, Confluence for documentation, Git for version control, ARC (Advanced REST Client) for endpoint testing, Crucible for code reviews, and SQL Developer as the database client.
Environment: Core Java, Spring MVC, Spring Security, Spring JMS, Spring JDBC template, XML, Log4j, Apache Tomcat, ActiveMQ, HTML, CSS, Bootstrap, JavaScript, JIRA, Confluence.
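A minimal sketch of the JMS sender/receiver pattern described above, using Spring's JmsTemplate with ActiveMQ; the queue name, broker URL, and payload handling are hypothetical, and the listener would be wired to the queue via a Spring message listener container:

import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;
import org.apache.activemq.ActiveMQConnectionFactory;
import org.springframework.jms.core.JmsTemplate;

// Hypothetical JMS sender/receiver pair: JmsTemplate publishes to an ActiveMQ
// queue; the listener consumes asynchronously.
public class PrescriptionMessaging {

    private final JmsTemplate jmsTemplate;

    public PrescriptionMessaging(ConnectionFactory connectionFactory) {
        this.jmsTemplate = new JmsTemplate(connectionFactory);
    }

    // Sender: fire-and-forget, so the caller is not blocked on downstream work
    public void sendRefillRequest(String payload) {
        jmsTemplate.convertAndSend("refill.requests", payload);
    }

    // Receiver: would be attached to the same queue via a Spring
    // DefaultMessageListenerContainer in the application context
    public static class RefillListener implements MessageListener {
        @Override
        public void onMessage(Message message) {
            try {
                String body = ((TextMessage) message).getText();
                System.out.println("received refill request: " + body);  // process here
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }

    public static void main(String[] args) {
        PrescriptionMessaging messaging = new PrescriptionMessaging(
            new ActiveMQConnectionFactory("tcp://localhost:61616"));
        messaging.sendRefillRequest("rx-12345");
    }
}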