Hadoop Developer

Location:
Ann Arbor, MI
Posted:
March 29, 2018

Resume:

Dinesh M

E-Mail: ac4y1z@r.postjobfree.com

Mobile: 630-***-****

Professional Summary

Over 7 years of experience in Information Technology involving Analysis, Design, Testing, Implementation and Training. Excellent skills in state-of-the-art client-server computing, desktop applications and website development.

Over 4 years of work experience in Big Data analytics with hands-on experience writing MapReduce jobs on the Hadoop ecosystem, including Hive and Pig.

Good working experience with Hadoop architecture, HDFS, MapReduce and other components in the Cloudera Hadoop ecosystem. Experience with various Hadoop distributions (Cloudera, Hortonworks) and cloud platforms (AWS, Microsoft Azure).

Good working experience with Hadoop architecture and its various components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.

Hands-on experience in installing, configuring, and using Hadoop ecosystem components like Hadoop, MapReduce, HDFS, Hive, Sqoop, Pig, Zookeeper and Flume.

Used Apache Kafka for tracking data ingestion into the Hadoop cluster. Implemented custom Kafka encoders for custom input formats to load data into Kafka partitions. Streamed data in real time using Spark with Kafka for faster processing.
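
Illustrative sketch only (not from the actual project): a minimal Kafka producer in Java with a custom value serializer of the kind described above. The Event type, broker address, and topic name are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.Serializer;

// Hypothetical event type; a real project would use its own domain objects.
class Event {
    final String id;
    final String payload;
    Event(String id, String payload) { this.id = id; this.payload = payload; }
}

// Custom encoder/serializer: turns an Event into bytes before it is written to a partition.
class EventSerializer implements Serializer<Event> {
    public void configure(Map<String, ?> configs, boolean isKey) { }
    public byte[] serialize(String topic, Event event) {
        return (event.id + "|" + event.payload).getBytes(StandardCharsets.UTF_8);
    }
    public void close() { }
}

public class IngestProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // assumed broker address
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", EventSerializer.class.getName());

        try (KafkaProducer<String, Event> producer = new KafkaProducer<>(props)) {
            Event event = new Event("42", "raw input record");
            // The record key drives partition assignment.
            producer.send(new ProducerRecord<>("ingest-events", event.id, event));
        }
    }
}
```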

Developed User Defined Functions (UDFs) for Apache Pig and Hive in Python and Java.
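
For context, a Hive UDF of this kind can be as small as a class with an evaluate method; the minimal sketch below is illustrative (the function's cleanup rule is an assumption, not the actual project logic).

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Illustrative Hive UDF: normalizes a string column by stripping non-digit characters.
public class NormalizeDigits extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().replaceAll("[^0-9]", ""));
    }
}
```

Once packaged into a JAR, such a UDF would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL queries.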

Familiar with Spark, Kafka, Storm, Talend, and Elasticsearch.

Knowledge of NoSQL databases such as HBase, MongoDB & Cassandra.

Experience in developing Pig Latin and HiveQL scripts for data analysis and ETL, and in extending default functionality by writing User Defined Functions (UDFs) for data-specific processing.

Experience in job scheduling and monitoring through Oozie and Zookeeper.

Knowledge in Data warehousing and using ETL tools like Informatica and Pentaho.

Experience in migrating data to and from RDBMS and unstructured sources into HDFS using Sqoop & Flume.

Wrote business logic in C# code-behind files to read data from database stored procedures.

Extremely good in Struts, Spring Framework, Hibernate.

Strong technical background in C#.NET, Windows Azure (cloud), Windows Services, Entity Framework, LINQ and SQL Server.

In-depth understanding of Spark architecture including Spark Core, Spark SQL, DataFrames, Spark Streaming and Spark MLlib.
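
A minimal Spark SQL sketch in Java (Spark 2.x API) showing the DataFrame side of this; the input path, column names and filter are assumptions for illustration.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;

public class EventSummary {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("EventSummary")
                .getOrCreate();

        // Read semi-structured JSON into a DataFrame and run a simple aggregation.
        Dataset<Row> events = spark.read().json("hdfs:///data/events/*.json");
        events.filter(col("status").equalTo("ACTIVE"))
              .groupBy(col("region"))
              .count()
              .show();

        spark.stop();
    }
}
```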

Expertise in writing Spark RDD transformations, actions, DataFrames and case classes for the required input data, and in performing data transformations using Spark Core.

Expertise in developing Real-Time Streaming Solutions using Spark Streaming.

Proficient in big data ingestion and streaming tools like Flume, Sqoop, Spark, Kafka and Storm.

Experience in working with Cloudera, Hortonworks, and Microsoft Azure HDInsight distributions.

Proficient in data structures and design patterns in C++; solid experience in building multi-threaded applications in C++ and Python.

Hands-on experience migrating complex MapReduce programs into Apache Spark RDD transformations.
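
To illustrate that kind of migration, here is the classic word count expressed as Spark RDD transformations in Java (Spark 2.x API; paths are placeholders). The flatMap/mapToPair/reduceByKey chain replaces the Mapper/Reducer pair of the original MapReduce version.

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class WordCountRdd {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("WordCountRdd");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> lines = sc.textFile("hdfs:///input/logs");        // placeholder path

        // Transformations are lazy; saveAsTextFile triggers the job.
        JavaPairRDD<String, Integer> counts = lines
                .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                .mapToPair(word -> new Tuple2<>(word, 1))
                .reduceByKey(Integer::sum);

        counts.saveAsTextFile("hdfs:///output/wordcounts");               // placeholder path
        sc.stop();
    }
}
```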

Experience in implementing OLAP multi-dimensional cube functionality using Azure SQL Data Warehouse

Used various AJAX/JavaScript framework tools like JavaScript, jQuery and JSON.

Good Understanding of Design Patterns like MVC, Singleton, Session Facade, DAO, Factory.

Strong experience in software development using Java/J2EE technologies.

Expertise in back-end/server side java technologies such as: Web services, Java persistence API (JPA), Java Messaging Service (JMS), Java Database Connectivity (JDBC), Java Naming and Directory Interface (JNDI).

Expertise in J2EE and MVC architecture/implementation, Web Services, SOA, Analysis, Design, Object modeling, Data modeling, Integration, Validation, Implementation and Deployment.

Well experienced in building servers like DHCP, PXE with kick-start, DNS and NFS and used them in building infrastructure in a Linux Environment. Automated build, testing and integration with Ant, Maven and JUnit.

Rich experience in Agile methodologies such as Extreme Programming (XP), Scrum, the waterfall model and Test-Driven Development (TDD).

Expert level skills in designing and implementing web server solutions and deploying Java application servers like Tomcat, JBoss, WebSphere and WebLogic on Windows & UNIX platforms.

Knowledge of Spark APIs to cleanse, explore, aggregate, transform, and store data.

Experience with RDBMS and writing SQL and PL/SQL scripts used in stored procedures.

Strengths include being a good team player, excellent communication, interpersonal and analytical skills, flexibility to work with new technologies, and the ability to work effectively in a fast-paced, high-volume, deadline-driven environment.

Education:

Bachelor of Technology in Information Technology.

References:

References provided upon request.

Technical Skills:

Languages

Java, PL/SQL, C, C++, C#, Unix Shell Scripting and Scala.

Hadoop/Big Data

Apache Hadoop, YARN, HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Oozie, Flume, Cassandra, Zookeeper, Spark, Spark SQL, Microsoft Azure

Java Technologies

Java EE, Servlets, JSP, JUNIT, AJAX

NoSQL Databases

HBase, Cassandra and MongoDB

Web Technologies

JavaScript, HTML, XML, CSS3.

Frameworks

MVC, Hibernate, Spring Framework

IDE'S

Eclipse (Galileo, Helios, Indigo, Mars), MyEclipse, NetBeans

Web Services

SOAP, WSDL, UDDI, Apache Axis, REST

Web and Application Servers

WebLogic, JBoss, WebSphere, Apache Tomcat

Databases

MySQL, Oracle, MS SQL Server

Build Automation

Ant, Maven

RDBMS

Oracle, DB2, MySQL, SQL Server

Operating Systems

Linux (Red Hat, Ubuntu, Fedora), Mac OS X, Windows

Professional Experience:

Volkswagen, Ann Arbor, MI

Sr. Hadoop/Spark Developer March 2016 – Present

Responsibilities:

Involved in cluster setup, monitoring and administration tasks like commissioning and decommissioning nodes.

Worked on Hadoop, MapReduce and YARN/MRv2; developed multiple MapReduce jobs in Java for structured, semi-structured and unstructured data.

Developed MapReduce programs in Java for parsing the raw data and populating staging Tables.
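
A trimmed-down sketch of such a parsing mapper is shown below; the delimiter, field positions and output layout are assumptions for illustration, not the actual staging-table format.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Parses delimited raw records and emits (key, cleaned value) pairs for the staging tables.
public class RawRecordMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        String[] fields = line.toString().split("\\|", -1);   // assumed pipe-delimited input
        if (fields.length < 3) {
            return;                                            // skip malformed records
        }
        context.write(new Text(fields[0]), new Text(fields[1] + "\t" + fields[2]));
    }
}
```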

Created Hive queries to compare the raw data with EDW reference tables and perform aggregates.

Experienced in developing custom input formats and data types to parse and process unstructured and semi structured input data and mapped them into key value pairs to implement business logic in Map-Reduce.

Experience in implementing custom serializers, interceptors, sources and sinks as per requirements in Flume to ingest data from multiple sources.

Involved in developing Hive DDLs to create, alter and drop Hive tables; also worked with Storm and Kafka.

Experience in setting up fan-out workflows in Flume to design a V-shaped architecture that takes data from many sources and ingests it into a single sink.

Used Spark Streaming APIs to perform transformations and actions on the fly to build a common learner data model that gets data from Kafka in near real time and persists it to Cassandra.
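
A compressed sketch of such a pipeline in Java, assuming the spark-streaming-kafka-0-10 integration and the DataStax spark-cassandra-connector; the broker address, topic, keyspace/table names, and the LearnerEvent bean are placeholders, not details of the actual project.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

import com.datastax.spark.connector.japi.CassandraJavaUtil;

// Hypothetical bean whose getters are assumed to line up with the Cassandra table columns.
class LearnerEvent implements java.io.Serializable {
    private String userId;
    private String action;
    public static LearnerEvent fromJson(String json) { /* parse JSON into fields */ return new LearnerEvent(); }
    public String getUserId() { return userId; }
    public String getAction() { return action; }
}

public class LearnerStream {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("LearnerStream");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");       // assumed broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "learner-model");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("learner-events"), kafkaParams));

        // Parse each Kafka message into a bean and persist each micro-batch to Cassandra.
        stream.map(ConsumerRecord::value)
              .map(LearnerEvent::fromJson)
              .foreachRDD(rdd -> CassandraJavaUtil.javaFunctions(rdd)
                      .writerBuilder("learner", "events",
                              CassandraJavaUtil.mapToRow(LearnerEvent.class))
                      .saveToCassandra());

        jssc.start();
        jssc.awaitTermination();
    }
}
```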

Consumed JSON messages using Kafka and processed the JSON file using Spark Streaming to capture UI updates

Developed Spark code in Scala in the IntelliJ IDE using SBT.

Performance tuning of Sqoop, Hive and Spark jobs.

Worked with .NET and C# to create dashboards according to client requirements.

Experienced in writing live real-time processing and core jobs using Spark Streaming with Kafka as a data pipeline system.

Implemented OLAP multi-dimensional cube functionality using Azure SQL Data Warehouse.

Wrote Azure PowerShell scripts to copy or move data from the local file system to HDFS/Blob storage.

Importing and exporting data into HDFS and Hive using Sqoop.

Experienced in analyzing data with Hive and Pig.

Used Kafka Streams to Configure Spark streaming to get information and then store it in HDFS.

Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.

Integrated bulk data into the Cassandra file system using MapReduce programs.

Expertise in designing and data modeling for the Cassandra NoSQL database.

Experienced in managing and reviewing Hadoop log files.

Involved in the data migration process using Azure by integrating with a GitHub repository and Jenkins.

Experienced in implementing High Availability using QJM and NFS to avoid single point of failure

Experienced in writing live Real-time Processing using Spark Streaming with Kafka.

Developed custom mappers in Python scripts, and Hive UDFs and UDAFs, based on the given requirements.

Connected to an NFSv3 storage server supporting the AUTH_NONE or AUTH_SYS authentication methods.

Used HiveQL to analyze the partitioned and bucketed data and compute various metrics for reporting.

Experienced in querying data using SparkSQL on top of Spark engine.

Experience in managing and monitoring Hadoop cluster using Cloudera Manager.

Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.

Implemented an analytical platform that used Hive functions and different kinds of join operations such as map joins and bucketed map joins.

Unit tested a sample of raw data and improved performance and turned over to production.

Environment: CDH, Java (JDK 1.7), Hadoop, Azure, MapReduce, HDFS, Hive, Sqoop, Flume, NFS, Cassandra, Pig, Oozie, Kerberos, Scala, Spark SQL, Spark Streaming, Kafka, Linux, AWS, Shell Scripting, MySQL, Oracle 11g, SQL*Plus, C++, C#

T-Mobile, Seattle, WA

Big Data/Hadoop Developer January 2014 - February 2016

Responsibilities:

Worked with business partners to gather business requirements.

Installed the NameNode, Secondary NameNode, YARN components (ResourceManager, NodeManager, ApplicationMaster) and DataNodes using Cloudera.

Worked extensively on creating MapReduce jobs to power data for search and aggregation.

Worked extensively with Sqoop for importing metadata from Oracle.

Extensively used Pig for data cleansing.

Worked with the Teradata analysis team using Big Data technologies to gather business requirements.

Worked as a Hadoop developer to analyze large amounts of data to analyze regulatory reports by creating MapReduce jobs in Java.

Installed and configured a multi-node, fully distributed Hadoop cluster with a large number of nodes.

Provided Hadoop, OS, Hardware optimizations.

Set up the machines with network controls, static IPs, disabled firewalls and swap memory.

Analyzed the existing Hadoop cluster to understand performance bottlenecks and provided performance tuning accordingly.

Performed configuration and troubleshooting of services like NFS, NIS, NIS+, DHCP, FTP, LDAP, Apache Web servers.

Experience in implementing applications on Spark frameworks using Scala.

Developed Spark code in Scala in the IntelliJ IDE using SBT.

Regular Commissioning and Decommissioning of nodes depending upon the amount of data.

Installed and configured Hadoop components HDFS, Hive and HBase.

Communicating with the development teams and attending daily meetings.

Addressing and Troubleshooting issues on a daily basis.

Working with data delivery teams to setup new Hadoop users. This job includes setting up Linux users, setting up Kerberos principals and testing HDFS, Hive.

Cluster maintenance as well as creation and removal of nodes.

Monitor Hadoop cluster connectivity and security.

Dumped the data from one cluster to another using DistCp, and automated the dumping procedure using shell scripts.

Designed the shell script for backing up of important metadata and rotating the logs on a monthly basis.

Involved in implementing a dashboard using .NET and C#.

Developed highly efficient algorithms in C++ through both pair-programming and independent work

Testing, evaluation and troubleshooting of different NoSQL database systems and cluster configurations to ensure high-availability in various crash scenarios.

Implemented commissioning and decommissioning of data nodes, killing the unresponsive task tracker and dealing with blacklisted task trackers.

Implemented the open-source monitoring tool Ganglia for monitoring the various services across the cluster.

Dumped the data from HDFS to a MySQL database and vice versa using Sqoop.

Provided the necessary support to the ETL team when required.

Integrated Nagios in the Hadoop cluster for alerts.

Performed both major and minor upgrades to the existing cluster, as well as rollbacks to the previous version.

Environment: Linux, HDFS, MapReduce, Hive, Pig, Azure, KDC, Nagios, Ganglia, Oozie, Sqoop, Cloudera Manager.

Vanguard Group, Malvern, PA

Java/Hadoop Developer January 2012–December 2013

Responsibilities:

Involved in all phases of Agile Scrum Process like Stand up, Retrospective, Sprint Planning meetings.

Designed and developed Enterprise Eligibility business objects and domain objects with Object Relational Mapping framework such as Hibernate.

Developed JSP pages for presentation layer (UI) using Struts with client side validations using Struts Validator framework/ JavaScript.

Used JMS in the project for sending and receiving the messages on the queue.
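
As an illustration of that JMS usage, a minimal queue sender sketch (JMS 1.1 API); the JNDI names and message payload are assumptions, not the project's actual configuration.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

// Looks up the connection factory and queue via JNDI and sends one text message.
public class EligibilityQueueSender {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory"); // assumed JNDI name
        Queue queue = (Queue) ctx.lookup("jms/EligibilityQueue");                            // assumed JNDI name

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            TextMessage message = session.createTextMessage("{\"memberId\":42,\"status\":\"ELIGIBLE\"}");
            producer.send(message);
        } finally {
            connection.close();
        }
    }
}
```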

Developed the UI panels using JSF, XHTML, CSS, Dojo and jQuery.

Used AJAX and JavaScript for validations and integrating business server-side components on the client side within the browser.

Established coding standards for Java, JEE, ExtJS, etc.

Wrote JavaScript functions to get dynamic data and perform client-side validation.

Created Oracle database tables, stored procedures, sequences, triggers, views

Developed the CRUD API for the POSEngine using RESTful web services.
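
Sketch of what one resource of such a CRUD API might look like with JAX-RS (as hosted by a framework like Apache CXF); the /orders path, the Order bean, and the in-memory store are illustrative assumptions rather than the actual POSEngine design.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import javax.ws.rs.Consumes;
import javax.ws.rs.DELETE;
import javax.ws.rs.GET;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/orders")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
public class OrderResource {

    // Hypothetical order bean; a real service would map this to a database entity.
    public static class Order {
        public long id;
        public String description;
    }

    // In-memory stand-in for the persistence layer, purely for illustration.
    private static final Map<Long, Order> STORE = new ConcurrentHashMap<>();

    @GET
    @Path("/{id}")
    public Response read(@PathParam("id") long id) {
        Order order = STORE.get(id);
        return order == null
                ? Response.status(Response.Status.NOT_FOUND).build()
                : Response.ok(order).build();
    }

    @POST
    public Response create(Order order) {
        STORE.put(order.id, order);
        return Response.status(Response.Status.CREATED).entity(order).build();
    }

    @DELETE
    @Path("/{id}")
    public Response remove(@PathParam("id") long id) {
        STORE.remove(id);
        return Response.noContent().build();
    }
}
```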

Involved in the development of SQL, PL/SQL Packages, Stored Procedures

Implemented the connectivity to the database server using JDBC.

Consumed Web Services using Apache CXF framework for getting remote information

Developed REST-architecture-based web services to facilitate communication between clients and servers.

Experienced in loading and transforming large sets of structured, semi-structured and unstructured data from HBase through Sqoop, placing it in HDFS for further processing.

Used C# to design a dashboard for the client to view the reports and other data.

Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.

Managing and scheduling Jobs on a Hadoop cluster using Oozie.

Involved in creating Hive tables, loading data and running Hive queries on that data.

Created web services using the WTP tool plugin for the Eclipse IDE, deployed as a separate application using Maven scripts.

Performed general SharePoint IDE/ClearCase/ClearQuest administration.

Wrote unit tests for various component layers with the JUnit framework.

Manage multiple, high profile cross-functional AGILE program teams across various business units.

Identified requirements and performed the design and development of use cases using UML.

Responsible for developing GUI / user interfaces using JSP, CSS & DHTML

Designed and developed the web tier using HTML, JSPs, Servlets, Struts and the Tiles framework.

Used Eclipse as the IDE; configured and deployed the application onto the WebLogic application server using Maven build scripts to automate the build and deployment process.

Implemented a prototype to integrate PDF documents into a web application using iText PDF library

Designed and developed client and server components of an administrative console for a business process engine framework using Java, Google Web Toolkit and Spring technologies.

Designer and Architect of SOA Governance (Oracle enterprise repository) and Wiki plug-in development for O2 UK Repository search engine and SOA Shop for Services.

Environment: Java, J2EE, Spring, Hibernate, Struts, jQuery, AJAX, Sencha ExtJS, JavaScript, Oracle, CRUD, PL/SQL, JDBC, Apache CXF, REST, Eclipse, WebLogic, ClearCase, JUnit, Agile, UML, JSP, JSTL (JavaServer Pages Standard Tag Library), JMS, Servlet, Maven, iText, Google Web Toolkit (GWT), Jasper Reports, ILOG, Web 2.0, SOA.

Comerica Bank, San Jose, CA

Java Developer June 2011 - December 2011

Responsibilities:

Responsible for requirement gathering and analysis through interaction with end users.

Involved in designing use-case diagrams, class diagrams and interaction diagrams using the UML model with Rational Rose.

Designed and developed the application using various design patterns, such as session facade, business delegate and service locator.

Worked on Maven build tool. Involved in developing JSP pages using Struts custom tags, JQuery and Tiles Framework.

Used JavaScript to perform client side validations and Struts-Validator Framework for server-side validation. Developed Web applications with Rich Internet applications using Java applets, Silverlight, JavaFX.

Involved in creating database SQL and PL/SQL queries and stored procedures. Implemented Singleton classes for property loading and static data from the DB. Debugged and developed applications using Rational Application Developer (RAD).

Developed a web service to communicate with the database using SOAP. Developed DAOs (data access objects) using Spring Framework 3 and deployed the components into WebSphere Application Server 7. Actively involved in backend tuning of SQL queries/DB scripts. Worked on writing commands using UNIX shell scripting.
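
A small Spring 3-style DAO sketch of the sort described above, using JdbcTemplate; the ACCOUNTS table, its columns, and the Account bean are assumptions for illustration, and a real DAO would be wired to the WebSphere-managed data source.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

import javax.sql.DataSource;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;
import org.springframework.stereotype.Repository;

@Repository
public class AccountDao {

    private final JdbcTemplate jdbc;

    @Autowired
    public AccountDao(DataSource dataSource) {
        this.jdbc = new JdbcTemplate(dataSource);
    }

    // Hypothetical query against an assumed ACCOUNTS table.
    public List<Account> findByOwner(String owner) {
        return jdbc.query(
                "SELECT ID, OWNER, BALANCE FROM ACCOUNTS WHERE OWNER = ?",
                new RowMapper<Account>() {
                    public Account mapRow(ResultSet rs, int rowNum) throws SQLException {
                        Account a = new Account();
                        a.id = rs.getLong("ID");
                        a.owner = rs.getString("OWNER");
                        a.balance = rs.getDouble("BALANCE");
                        return a;
                    }
                },
                owner);
    }

    // Minimal bean used only for this illustration.
    public static class Account {
        public long id;
        public String owner;
        public double balance;
    }
}
```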

Involved in developing other subsystems' server-side components. Provided production support using IBM ClearQuest for fixing bugs.

Generated Java wrappers for web services using Apache AXIS.

Environment: JBoss, XML SOAP, RESTful, Java EE 6, IBM WebSphere Application Server 7, Apache-Struts 2.0, EJB 3, Spring 3.2, JSP 2.0, Web Services, C#, JQuery 1.7, Servlet 3.0, Struts-Validator, Struts-Tiles, Tag Libraries, ANT 1.5, JDBC, JMS, Service Bus.

Client: Aris Global - Mysore, Karnataka

Role: Java Developer January 2010 - March 2011

Responsibilities:

Developed all the UI using JSP and Spring MVC with client-side validations using JavaScript.
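
For illustration, a Spring MVC controller of the kind used behind such JSP screens might look like the following minimal sketch; the URL, view name, and model attributes are assumptions rather than the project's actual pages.

```java
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;

@Controller
public class CaseSearchController {

    @RequestMapping(value = "/cases/search", method = RequestMethod.GET)
    public String search(@RequestParam(value = "q", required = false) String query, Model model) {
        // A real controller would call a service/DAO here; the empty list keeps the sketch self-contained.
        model.addAttribute("query", query);
        model.addAttribute("results", java.util.Collections.emptyList());
        return "caseSearch";   // resolved to a JSP view, e.g. /WEB-INF/jsp/caseSearch.jsp
    }
}
```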

Developed the DAO layer using Hibernate.

Designed class and sequence diagrams for Enhancements.

Developed the user interface presentation screens using HTML, XML, CSS, JQuery.

Experience in working with Spring MVC using AOP, DI/IOC.

Coordinated with the QA leads on development of test plans, test cases, and unit test code.

Involved in testing and deployment of the application on Apache Tomcat Application Server during integration and QA testing phase.

Developed Software using C#, Java, .NET, AJAX, CSS, SQL Server 2008 and met business needs pertaining to the client requirements and the accredited standards.

Involved in building JUnit test cases for various modules.

Maintained the existing code base developed in the Spring and Hibernate frameworks by incorporating new features and doing bug fixes.

Involved in Application Server Configuration and in Production issues resolution.

Wrote SQL queries and Stored Procedures for interacting with the Oracle database.

Documented common problems prior to go-live while actively involved in a production support role.

Environment: J2EE/J2SE, Java 1.5, JSP, Ajax4JSF, JSF 1.2, Apache MyFaces, RichFaces 3.3, Spring Framework 3, Hibernate, JMS, CSS3, Apache CXF, XML, HTML, Log4j, C#, .NET, WebSphere 6, RAD 7, Oracle, SunOS (UNIX), Shell Script, Subversion, Maven.


