Hadoop Developer

Location:
Libertyville, IL
Posted:
September 01, 2017

Resume:

Nikesh Dara

Hadoop Developer

ac1361@r.postjobfree.com

872-***-****

Experience Summary:

9+ years of professional IT experience, including 5+ years of Hadoop experience, capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.

Well experienced with Hadoop ecosystem components such as MapReduce, Cloudera, Hortonworks, Mahout, HBase, Oozie, Hive, Sqoop, Pig, and Flume.

Experience in using Automation tools like Chef for installing, configuring and maintaining Hadoop clusters.

Led innovation by exploring, investigating, recommending, benchmarking, and implementing data-centric technologies for the platform.

Technical leadership role responsible for developing and maintaining the data warehouse and Big Data roadmap, ensuring the data architecture aligns with the business-centric roadmap and analytics capabilities.

Experienced in Hadoop Architect and Technical Lead roles, providing design solutions and Hadoop architectural direction.

Strong knowledge of Hadoop cluster connectivity and security.

Demonstrated ability to clearly understand users' data needs and identify how best to meet those needs with the application, database, and reporting resources available.

Strong understanding of data modeling in data warehouse environments, including star and snowflake schemas.

Extended Hive and Pig core functionality by writing custom UDFs.

Strong understanding of relational database structure, enabling complex SQL statements that combine multiple joins and inline views.

Proficient in writing SQL in Microsoft SQL Server and Oracle environments.

Create, manipulate and interpret reports with specific program data.

Determine client and practice area needs and customize reporting systems to meet those needs.

Hands-on experience with HDFS, MapReduce, Pig, Hive, AWS, ZooKeeper, Oozie, Hue, Sqoop, Spark, Impala, and Accumulo.

Good experience with general data analytics on distributed computing clusters such as Hadoop using Apache Spark, Impala, and Scala.

Worked on different RDBMS databases like Oracle, SQL Server, and MySQL.

Hands-on experience configuring and working with Flume to load data from multiple sources directly into HDFS, and transferring large datasets between Hadoop and RDBMS by implementing Sqoop.

Good experience of NoSQL databases like MongoDB, Cassandra, and HBase.

Hands on experience developing applications on HBase and expertise with SQL, PL/SQL database concepts.

Excellent understanding and knowledge of ETL tools like Informatica.

Experience using Apache Avro to provide both a serialization format for persistent data and a wire format for communication between Hadoop nodes.
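
For illustration, a minimal sketch of that Avro usage in Java; the record schema, field names, and output file below are hypothetical, not taken from any specific project.

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

import java.io.File;
import java.io.IOException;

public class AvroWriteExample {
    public static void main(String[] args) throws IOException {
        // Hypothetical record schema with two fields.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Account\",\"fields\":["
          + "{\"name\":\"id\",\"type\":\"long\"},"
          + "{\"name\":\"name\",\"type\":\"string\"}]}");

        GenericRecord record = new GenericData.Record(schema);
        record.put("id", 42L);
        record.put("name", "sample");

        // Serialize the record into an Avro container file on local disk.
        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
            writer.create(schema, new File("account.avro"));
            writer.append(record);
        }
    }
}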

Extensive experience in Unix Shell Scripting.

Expertise in scheduling and monitoring Hadoop workflows using Oozie and ZooKeeper.

Good Knowledge in developing MapReduce programs using Apache Crunch.

Strong experience as a Java Developer in web/intranet and client/server technologies using Java and J2EE technologies, including the Struts framework, MVC design patterns, JSP, Servlets, EJB, JDBC, JSTL, Spark, XML/XSLT, JavaScript, AJAX, JMS, JNDI, RDBMS, SOAP, Hibernate, and custom tag libraries.

Supported technical team members for automation, installation and configuration tasks.

An excellent team player and self-starter with good communication and interpersonal skills and proven abilities to finish tasks before target deadlines.

Technical Skills:

Big Data: Apache Hadoop, Cloudera, Hive, HBase, Sqoop, Flume, Spark, Pig, HDFS, MapReduce, Oozie, Scala, Impala, Cassandra, ZooKeeper, Apache Kafka, Accumulo, and Apache Storm.

Databases: Oracle, MySQL, MS SQL Server, and MS Access.

Database Tools/Languages: T-SQL, PL/SQL, SSIS, SSRS.

Programming Languages: C/C++, Java, Python.

Java Technologies: Java, J2EE, JDBC, JSP, Java Servlets, JMS, JUnit, Log4j.

IDE/Development Tools: Eclipse, NetBeans, MyEclipse, SoapUI, Ant.

Operating Systems: Windows, Mac, Unix, Linux.

Frameworks: Struts, Hibernate, Spring.

PROJECTS:

Client: Volkswagen Credit Inc., Libertyville, IL Jan 2015 - Present

Role: Lead Hadoop Developer

VW Credit, Inc. (VCI), a wholly owned subsidiary of Volkswagen Group of America, Inc., was founded in 1981 as the financial service arm of Volkswagen Group of America, Inc. VCI, a captive finance company, services Volkswagen and Audi retail customers and dealers as Volkswagen Credit, and Audi Financial Services. The company provides competitive financial products and services to dealers and their customers throughout the United States including retail leasing, retail financing, and balloon financing, along with wholesale financing for new and used vehicles. We maintain a Remarketing Department for disposing of end-of-term lease/balloon contract vehicles, and all used company vehicles.

Description:

VW Credit, Inc. (VCI), a wholly owned subsidiary of Volkswagen Group of America, Inc., is the financial services arm of Volkswagen Group of America. The project deals with developing a historical database using the Hadoop ecosystem to maintain the last 10 years of data spread across branches in the US. The main aim of the project is to centralize the source of data for audit/legal report generation using the historical database; these reports were otherwise generated from multiple sources.

Responsibilities:

Worked on a Hadoop cluster that ranged from 4-8 nodes during the pre-production stage and was sometimes extended up to 24 nodes during production.

Built APIs that will allow customer service representatives to access the data and answer queries.

Designed changes to transform current Hadoop jobs to HBase.
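As a minimal sketch of what writing job output into HBase can look like (the table name, column family, and columns here are hypothetical, not the project's actual schema):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ContractWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("contracts"))) {
            // Row key and cells are illustrative only.
            Put put = new Put(Bytes.toBytes("contract#1001"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("ACTIVE"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("branch"), Bytes.toBytes("CHI"));
            table.put(put);
        }
    }
}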

Handled fixing of defects efficiently and worked with the QA and BA team for clarifications.

Responsible for cluster maintenance, monitoring, commissioning and decommissioning data nodes, troubleshooting, and managing and reviewing data backups and log files.

Extended the functionality of Hive and Pig with custom UDFs and UDAFs.
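
For example, a minimal Hive UDF sketch in Java; the class name, function name, and transformation are hypothetical:

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Trims and upper-cases a string column.
// Registered in Hive with, e.g.:
//   CREATE TEMPORARY FUNCTION clean_upper AS 'CleanUpperUDF';
public final class CleanUpperUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}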

Developed Spark applications using Scala.

The new Business Data Warehouse (BDW) improved query/report performance, reduced the time needed to develop reports, and established a self-service reporting model in Cognos for business users.

Implemented Bucketing and Partitioning using Hive to assist the users with data analysis.

Used Oozie scripts for deployment of the application and Perforce as the secure versioning software.

Implemented partitioning, dynamic partitions, and buckets in Hive.

Extracted large volumes of data feed from different data sources, performed transformations and loaded the data into various Targets.

Develop database management systems for easy access, storage and retrieval of data.

Perform DB activities such as indexing, performance tuning and backup and restore.

Used Sqoop to import data from RDBMS to the Hadoop Distributed File System (HDFS) and later analyzed the imported data using Hadoop components.

Expertise in writing Hadoop jobs for analyzing data using HiveQL (queries), Pig Latin (data flow language), and custom MapReduce programs in Java.
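
A minimal sketch of such a custom MapReduce program, assuming comma-delimited input with a branch code in the second column; all names and the input layout are illustrative only:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Counts records per branch code found in delimited input lines.
public class BranchCount {

    public static class BranchMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text branch = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length > 1) {
                branch.set(fields[1]);          // assume branch code is the second column
                context.write(branch, ONE);
            }
        }
    }

    public static class BranchReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
}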

Applied various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.

Expert in creating PIG and Hive UDFs using Java in order to analyze the data efficiently.

Responsible for loading data from the BDW Oracle database and Teradata into HDFS using Sqoop.

Implemented AJAX, JSON, and JavaScript to create interactive web screens.

Wrote data ingestion systems to pull data from traditional RDBMS platforms such as Oracle and Teradata and store it in NoSQL databases such as MongoDB.
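
A minimal sketch of one such ingestion step, assuming the Oracle JDBC driver and the MongoDB Java driver (3.x) are on the classpath; the connection strings, table, and field names are illustrative only:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

// Copies customer rows from an Oracle table into a MongoDB collection.
public class OracleToMongoLoader {
    public static void main(String[] args) throws Exception {
        try (Connection oracle = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "password");
             MongoClient mongo = new MongoClient("mongohost", 27017);
             Statement stmt = oracle.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT customer_id, name FROM customers")) {

            MongoCollection<Document> customers =
                mongo.getDatabase("staging").getCollection("customers");

            while (rs.next()) {
                customers.insertOne(new Document("customerId", rs.getLong(1))
                                        .append("name", rs.getString(2)));
            }
        }
    }
}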

Involved in creating Hive tables and applying HiveQL queries on those tables, which automatically invokes and runs MapReduce jobs.

Supported applications running on Linux machines.

Developed data-formatted web applications and deployed scripts using HTML5, XHTML, CSS, and client-side scripting with JavaScript.

Involved in loading and transforming large sets of structured, semi-structured, and unstructured data and analyzing them by running Hive queries and Pig scripts.

Participated in requirement gathering from the experts and business partners and converting the requirements into technical specifications.

Used Zookeeper to manage coordination among the clusters.

Experienced in analyzing the Cassandra database and comparing it with other open-source NoSQL databases to find which one better suits the current requirements.

Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.

Installed the Oozie workflow engine to run multiple Hive and Pig jobs that run independently based on time and data availability.

Assisted application teams in installing Hadoop updates, operating system patches, and version upgrades when required.

Assisted in Cluster maintenance, Cluster Monitoring and Troubleshooting, Manage and review data backups and log files.

Environment: Apache Hadoop 2.0.0, Pig 0.11, Hive 0.10, Sqoop 1.4.3, Flume, MapReduce, JSP, Struts 2.0, NoSQL, HDFS, Teradata, Linux, Oozie, Cassandra, Hue, HCatalog, Java, IBM Cognos, Oracle 11g/10g, Microsoft SQL Server, Microsoft SSIS, DB2 LUW, TOAD for DB2, IBM Data Studio, AIX 6.1, UNIX scripting.

Client: Med Plus, Seattle, WA May 2012 – Oct 2014

Role: Hadoop Developer

Description: Med Plus has concentrated on creating and supporting clinical and administrative IT applications, providing solutions that serve physician practices, hospitals, integrated delivery networks (IDNs), and health information exchanges (HIEs).

Responsibilities:

Experienced in adding/installing new components and removing them through Ambari.

Architecture design and implementation of deployment, configuration management, backup, and disaster recovery systems and procedures.

Responsible for importing log files from various sources into HDFS using Flume.

Handled Big Data utilizing a Hadoop cluster consisting of 40 nodes.

Performed complex HiveQL queries on Hive tables.

Implemented partitioning, dynamic partitions, and buckets in Hive.

Created technical designs, data models, and data migration strategies; created dimensional data models and data marts.

Designed, built, and maintained logical and physical databases, dimensional data models, ETL layer designs, and data integration strategies.

Created final tables in Parquet format.

Developed PIG scripts for source data validation and transformation.

Developed Shell, Perl and Python scripts to automate and provide Control flow to Pig scripts.

Developed a NoSQL database using CRUD operations, indexing, replication, and sharding in MongoDB.

Experience using Talend administration console to promote and schedule jobs.

Extracted and updated data in MongoDB using the mongoimport and mongoexport command-line utilities.

Involved in unit testing of MapReduce jobs using MRUnit.
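
A minimal MRUnit sketch for such a test, reusing the hypothetical BranchMapper from the earlier MapReduce sketch; the input line and expected output are illustrative only:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

public class BranchMapperTest {
    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

    @Before
    public void setUp() {
        // Wrap the hypothetical mapper in an MRUnit driver.
        mapDriver = MapDriver.newMapDriver(new BranchCount.BranchMapper());
    }

    @Test
    public void emitsOneCountPerBranchCode() throws Exception {
        mapDriver.withInput(new LongWritable(0), new Text("1001,CHI,2014-05-01"))
                 .withOutput(new Text("CHI"), new IntWritable(1))
                 .runTest();
    }
}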

Utilized Hive and Pig to create BI reports.

Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.

Worked with Informatica MDM in creating single view of the data.

Environment: Cloudera, Hadoop, HDFS, Pig, Hive, MapReduce, Java (JDK 1.7), Flume, Informatica, Oozie, Linux/Unix shell scripting, Avro, MongoDB, Python, Perl, Git, Maven, SAP BW, Cognos, Jenkins.

Client: Call & Block - Hyderabad June 2008 – Feb 2012

Role: Java Developer

Description: Call and Block is a growing software solutions organization that delivers strong performance and solutions to its customers by using new and disruptive technologies on both the frontend and backend. They support clients by providing notifications with the caller's information when customers receive a call.

Responsibilities:

Involved in analyzing and gathering requirements and user specifications from business analysts.

Involved in creating use case, class, sequence, package dependency diagrams using UML.

Involved in Database Design by creating Data Flow Diagram (Process Model) and ER Diagram (Data Model).

Used JavaScript for certain form validations, submissions and other client side operations.

Created Stateless Session Beans to communicate with the client. Created Connection Pools and Data Sources.
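
As a sketch of the stateless session bean plus container-managed data source pattern described above (written in a modern Java style for brevity; the JNDI name, table, and columns are hypothetical):

import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Looks up a caller name through a container-managed connection pool.
@Stateless
public class CallerLookupBean {

    @Resource(mappedName = "jdbc/CallBlockDS")   // illustrative JNDI name
    private DataSource dataSource;

    public String findCallerName(String phoneNumber) {
        String sql = "SELECT caller_name FROM callers WHERE phone_number = ?";
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, phoneNumber);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        } catch (SQLException e) {
            throw new RuntimeException("Caller lookup failed", e);
        }
    }
}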

Implemented and supported the project through development, Unit testing phase into production environment.

Designed the database and coded SQL, PL/SQL, triggers, and views using IBM DB2.

Deployed server-side common utilities for the application and front-end dynamic web pages using Servlets, JSP, custom tag libraries, JavaScript, HTML/DHTML, and CSS.

Environment: Java 5.0, J2EE, JSP, HTML/DHTML, CSS, JavaScript, DB2, Windows XP, Struts Framework, Eclipse IDE, WebLogic Server, SQL, PL/SQL.


