Data Project

Location:
Idlewild, MI, 49642
Salary:
$60/Hr
Posted:
June 28, 2017

Resume:

Allada

Email: ac01ze@r.postjobfree.com

Sr. Hadoop Consultant

Phone: 469-***-****

PROFESSIONAL SUMMARY:

7+ years of professional IT experience in the analysis, design, development, deployment and maintenance of critical software and big data applications.

Over 4 years of experience on Big Data platforms as both a developer and an administrator.

Hands-on experience developing and deploying enterprise applications using major Hadoop ecosystem components such as MapReduce, YARN, Hive, Pig, HBase, Flume, Sqoop, Spark Streaming, Spark SQL, Storm, Kafka, Oozie, NiFi and Cassandra.

Hands-on experience using the MapReduce programming model for batch processing of data stored in HDFS.

Installed and configured multiple Hadoop clusters of different sizes with ecosystem components such as Pig, Hive, Sqoop, Flume, HBase, Oozie and Zookeeper.

Worked on all major Hadoop distributions: Cloudera (CDH4, CDH5) and Hortonworks (HDP 2.2, 2.4).

Responsible for designing and building a data lake using Hadoop and its ecosystem components.

Handled data movement, transformation, analysis and visualization across the lake by integrating it with various tools.

Defined extract-transform-load (ETL) and extract-load-transform (ELT) processes for the data lake.

Worked extensively on Spark and its components, including Spark Core, Spark SQL and Spark Streaming.

Defined real-time data streaming solutions across the cluster using Spark Streaming, Apache Storm, Kafka, NiFi and Flume.

Expertise in planning, installing and configuring Hadoop clusters based on business needs.

Transformed and aggregated data for analysis by implementing workflow management for Sqoop, Hive and Pig scripts.

Experience working with file formats such as Avro, Parquet, ORC and SequenceFile, and with compression codecs such as Gzip, LZO and Snappy in Hadoop.

Experience retrieving data from databases such as MySQL, Teradata, Informix, DB2 and Oracle into HDFS using Sqoop, and ingesting it into HBase and Cassandra.

Experience writing Oozie workflows and job controllers for job automation.

Integrated Oozie with Hue and scheduled workflows for multiple Hive, Pig and Spark jobs.

In-depth knowledge of Scala and experience building Spark applications with it.

Good experience with Tableau and Spotfire, including enabling JDBC/ODBC connectivity from those tools to Hive tables.

Working knowledge of Scrum, Agile and Waterfall methodologies.

Expertise in web technologies such as HTML, CSS, PHP and XML.

Worked with tools and IDEs such as Eclipse, IBM Rational, Apache Ant, MS Office, PL/SQL Developer and SQL*Plus.

Highly motivated, able to work independently or as an integral part of a team, and committed to the highest levels of professionalism.

TECHNICAL SKILLS:

Big Data / Hadoop: HDFS, MapReduce, HBase, Kafka, Pig, Hive, Sqoop, Impala, Flume
Real-time / Stream Processing: Apache Storm, Apache Spark
Operating Systems: Windows, Unix, Linux
Programming Languages: Java, Scala, SQL
Databases: Oracle 9i/10g, SQL Server, MS Access
Web Technologies: HTML, XML, JavaScript
IDEs / Development Tools: Eclipse, NetBeans
Methodologies: Agile, Scrum, Waterfall

PROFESSIONAL EXPERIENCE:

Anthem BCBS, Indianapolis, IN June 2016 – Present

Sr. Hadoop Consultant

Project Description: The project involves developing a next-generation data analysis application built on the latest Hadoop and front-end technologies. Customer information is fed from various subsystems to create a complete and actionable customer profile.

Responsibilities:

Installed, configured and tested Hadoop ecosystem components such as MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Oozie, Hue and HBase.

Imported data from various sources into HDFS and Hive using Sqoop.

Wrote custom UDFs in Java to extend Hive and Pig Latin functionality.

Created partitions and buckets in Hive for both managed and external tables to optimize performance.
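
For illustration, a minimal sketch of the partitioned-table side of this work, issued through a Spark session with Hive support; the table name, columns and location are placeholders, not from the actual project, and the bucketing clause is shown only as a comment because bucketed Hive-serde DDL is normally issued in the Hive shell itself.

import org.apache.spark.sql.SparkSession

object HiveTableSetup {
  def main(args: Array[String]): Unit = {
    // Spark session wired to the Hive metastore.
    val spark = SparkSession.builder()
      .appName("hive-table-setup")
      .enableHiveSupport()
      .getOrCreate()

    // External table partitioned by load date (illustrative schema).
    // When issued directly in the Hive shell, a bucketing clause such as
    //   CLUSTERED BY (member_id) INTO 32 BUCKETS
    // can be added between PARTITIONED BY and STORED AS.
    spark.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS customer_events (
        member_id  STRING,
        event_type STRING,
        amount     DOUBLE
      )
      PARTITIONED BY (load_date STRING)
      STORED AS ORC
      LOCATION '/data/lake/customer_events'
    """)

    spark.stop()
  }
}

Partitioning by load date lets the engine prune whole directories at query time, while bucketing on a join key such as member_id speeds up joins and sampling on that key.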

Involved in creating data models for customer data using Cassandra Query Language (CQL).

Worked with Apache NiFi as an ETL tool for both batch and real-time processing.

Configured Tez as the execution engine for Hive queries to improve performance.

Developed a data pipeline using Kafka and Spark to store data in HDFS and performed real-time analytics on the incoming data.

Hands-on experience with Spark and Spark Streaming, creating RDDs and applying transformations and actions to them.

Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala.
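
A minimal sketch of such a Kafka-to-HDFS stream using the spark-streaming-kafka-0-10 integration; the broker address, consumer group, topic name and output path are placeholders, not details of the actual pipeline.

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-to-hdfs")
    val ssc  = new StreamingContext(conf, Seconds(30)) // 30-second micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",          // placeholder broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "hdfs-ingest",
      "auto.offset.reset"  -> "latest"
    )

    // Subscribe to a placeholder topic of incoming customer events.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("customer-events"), kafkaParams)
    )

    // Persist each non-empty micro-batch to HDFS as plain text.
    stream.map(_.value).foreachRDD { rdd =>
      if (!rdd.isEmpty()) rdd.saveAsTextFile(s"/data/landing/events/batch-${System.currentTimeMillis}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}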

Developed Spark programs for batch and real-time processing.

Wrote Hive queries for data analysis to meet the business requirements.

Applied in-depth knowledge of Scala to build Spark applications.

Configured Flume to stream data into HDFS and Hive using HDFS and Hive sinks.

Collected and aggregated large amounts of log data using Apache Flume and staged it in HDFS for further analysis.

Continuously monitored and managed the Hadoop cluster.

Environment: Hortonworks, Hadoop, MapReduce, HDFS, Ambari, Hive, Pig, Sqoop, Apache Kafka, Oozie, SQL, Flume, Spark, Cassandra, Informatica, Java, GitHub.

ATT, Dallas, TX Mar 2015 to May 2016

Sr. Hadoop Developer

Project Description: This project dealt with improving sale and transaction data by building a robust, unified data source for data exploration and easy retrieval of critical information. The platform splits transactions by type and generates files for the respective providers, and is built on the Hadoop ecosystem with HDFS/HBase as the primary data store.

Responsibilities:

Analysed the Hadoop stack and various big data analytics tools, including Pig, Hive, HBase and Sqoop.

Designed and implemented a semi-structured data analytics platform leveraging Hadoop.

Worked on performance analysis and improvement of Hive and Pig scripts at the MapReduce job-tuning level.

Involved in optimization of Hive queries.

Developed a framework to load and transform large sets of unstructured data from a UNIX system into Hive tables.
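
As a sketch of that loading pattern, raw log lines staged from a UNIX host can be parsed and persisted to a Hive table from Spark; the path, delimiter, column layout and table name below are hypothetical.

import org.apache.spark.sql.SparkSession

object LogsToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("logs-to-hive")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Read raw, pipe-delimited log lines, keep well-formed records,
    // and project them into named columns.
    val rows = spark.sparkContext.textFile("/data/landing/app_logs/*.log")
      .map(_.split('|'))
      .filter(_.length >= 3)
      .map(f => (f(0).trim, f(1).trim, f(2).trim))

    rows.toDF("event_time", "level", "message")
      .write.mode("overwrite")
      .saveAsTable("staging.app_logs") // hypothetical Hive database.table

    spark.stop()
  }
}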

Involved in data ingestion into HDFS from various data sources.

Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.

Extensively used Apache Sqoop for efficiently transferring bulk data between Apache Hadoop and relational databases.

Automated Sqoop, Hive and Pig jobs using Oozie scheduling.

Extensive knowledge of NoSQL databases such as HBase.

Wrote and used user-defined functions in Hive, Pig and MapReduce.

Helped the business team by installing and configuring Hadoop ecosystem components alongside the Hadoop administrator.

Developed multiple Kafka producers and consumers from scratch per business requirements.
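
A minimal sketch of a producer of that kind, using the standard Kafka client API from Scala; the broker address, topic name, key and payload are placeholders.

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object TransactionProducer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092") // placeholder broker
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("acks", "all") // wait for full acknowledgement before a send is considered complete

    val producer = new KafkaProducer[String, String](props)
    try {
      // Topic name and message body are illustrative only.
      producer.send(new ProducerRecord[String, String]("transactions", "txn-001", """{"amount": 42.50}"""))
    } finally {
      producer.close()
    }
  }
}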

Worked on loading log data into HDFS through Flume.

Created and maintained technical documentation for executing Hive queries and Pig Scripts.

Worked on debugging and performance tuning of Hive & Pig jobs.

Used Oozie to schedule various jobs on Hadoop cluster.

Used Hive to analyse the partitioned and bucketed data.

Worked on establishing connectivity from Tableau and Spotfire to Hive.

Environment: Hadoop, HDFS, MapReduce, MongoDB, Java, VMware, Hive, Eclipse, Pig, HBase, Sqoop, Flume, Linux, UNIX.

Adidas, Indianapolis, IN Nov 2013 to Mar 2015

Hadoop Developer.

Project Description: This project deals with centralizing and processing data from various facilities, then formatting, normalizing and standardizing that data so it can be used to create and improve community inventory records.

Responsibilities:

Analysed data using the Hadoop components Hive and Pig.

Developed on the Cloudera distribution.

Worked hands-on with ETL processes.

Developed Hadoop Streaming jobs to ingest large amounts of data.

Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts.

Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.

Involved in loading data from the UNIX file system to HDFS; experience working with Hadoop clusters using Cloudera distributions.

Imported and exported data between HDFS and relational database systems using Sqoop.

Involved in POCs comparing the performance of Spark SQL with Hive.
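
A sketch of how such a comparison can be driven from the Spark side, running the same aggregation against a Hive table that would otherwise be issued in the Hive CLI; the database, table and query are hypothetical, and the current SparkSession API is used here for brevity.

import org.apache.spark.sql.SparkSession

object SparkSqlVsHivePoc {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sparksql-vs-hive-poc")
      .enableHiveSupport()
      .getOrCreate()

    // Time the same aggregation that was benchmarked in Hive.
    val start = System.nanoTime()
    val result = spark.sql("""
      SELECT store_id, SUM(amount) AS total_sales
      FROM sales.transactions
      WHERE sale_date >= '2014-01-01'
      GROUP BY store_id
    """).collect()
    val elapsedMs = (System.nanoTime() - start) / 1e6

    println(s"Spark SQL returned ${result.length} rows in $elapsedMs ms")
    spark.stop()
  }
}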

Developed Spark programs for batch and real-time processing.

Imported data from Teradata using Sqoop with the Teradata connector.

Created sub-queries for filtering and faster execution, and joined multiple tables to fetch the required data.

Worked in an AWS environment for development and deployment of custom Hadoop applications.

Installed and set up HBase and Impala.

Used Apache Impala to read, write and query Hadoop data in HDFS and HBase.

Implemented partitioning, dynamic partitions and buckets in Hive.
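
A sketch of a dynamic-partition load of the kind this refers to, issued through Spark SQL against Hive tables; the database, table and column names are illustrative and the staging table is assumed to already exist.

import org.apache.spark.sql.SparkSession

object DynamicPartitionLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dynamic-partition-load")
      .enableHiveSupport()
      .getOrCreate()

    // Enable Hive-style dynamic partitioning for this session.
    spark.sql("SET hive.exec.dynamic.partition = true")
    spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

    // The partition value (region) is taken from the data itself,
    // so one statement populates every region partition.
    spark.sql("""
      INSERT OVERWRITE TABLE inventory.items PARTITION (region)
      SELECT item_id, quantity, region
      FROM staging.items_raw
    """)

    spark.stop()
  }
}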

Supported MapReduce programs running on the cluster.

Developed ETL test scripts based on technical specifications/data design documents and source-to-target mappings.

Worked on Talend ETL in single- and multi-server environments.

Environment: Cloudera, HDFS, Pig, Hive, MapReduce, Python, Sqoop, Storm, Kafka, Linux, HBase, Impala, Java, SQL.

XO Communications, Dallas, TX Oct 2012 to Oct 2013

Java/Hadoop Developer

Project Description: The project designed and developed service interfaces for retrieving reference data through a common user interface for agents, the call centre and online customers. A central repository was built to maintain all data in one place.

Responsibilities:

Design and development of Java classes using Object Oriented Methodology.

Worked on the system using Java, JSP and Servlets.

Developed Java classes and methods for handling data from the database.

Experience in sequence data pre-processing, extraction, model fitting and validation using ML pipelines.

Used Talend Open Studio to load files into Hive tables and performed ETL aggregations in Hive.

Used Sqoop to import data from SQL Server into the Hadoop ecosystem.

Integrated Cassandra with Talend and automated jobs.

Scheduled jobs and monitored console outputs through Jenkins.

Worked in an Agile environment, using Jira to maintain story points.

Worked on the implementation of a toolkit that abstracted Solr and Elasticsearch.

Performed maintenance and troubleshooting on the Cassandra cluster.

Installed and configured Hive and wrote Hive UDFs in Java and Python.

Attended and conducted user meetings for requirement analysis and project reporting.

Performed testing and bug fixing and provided production support.

Ally Financial, Charlotte, NC May 2011 to Sep 2012

Java Developer

Project Description: An intuitive web-based auto insurance application that allows customers to access their accounts online was developed and maintained. The application enables employees to access customer information to process and authorize customers' claims, and generates free quotes for customers depending on the type of insurance package they select. It also facilitates customers paying their monthly premiums.

Responsibilities:

Worked with business analysts and product owners to analyse and understand the requirements and provide estimates.

Implemented J2EE design patterns such as Singleton, DAO, DTO and MVC.

Developed this web application to store all system information in a central location using Spring MVC, JSP, Servlets and HTML.

Used the Spring AOP module to handle transaction management services for objects in the Spring-based application.

Implemented Spring DI and Spring Transactions in business layer.

Developed data access components using JDBC, DAOs, and Beans for data manipulation.

Designed and developed database objects like Tables, Views, Stored Procedures, User Functions using PL/SQL, SQL Developer and used them in WEB components.

Used iBATIS for dynamically building SQL queries based on parameters.

Developed JavaScript and jQuery functions for all client-side validations.

Developed JUnit test cases for unit testing and used Maven as the build and configuration tool.

Used shell scripting to create jobs that run on a daily basis.

Debugged the application using Firebug and traversed through the nodes of the tree using DOM functions.

Monitored the error logs using Log4j and fixed the problems.

Used the Eclipse IDE and deployed the application on WebLogic Server.

Responsible for configuring and deploying builds on WebSphere Application Server.

Environment: Java, J2EE, JavaScript, XML, JDBC, Spring Framework, Hibernate, RESTful web services, WebLogic Server, Log4j, JUnit, ANT, SoapUI, Oracle 11g.

Wilshire Technologies, Hyderabad, India Jan 2010 to April 2011

Java Developer

Project Description: A resource tracking application for tracking the details of all employees was developed and maintained. The application provides users an interface to view, edit and update their details. It also allows manager-level users to book an employee for a particular project or task, with an option to view growth and revenue statistics.

Responsibilities:

Collecting and understanding the User requirements and Functional specifications.

Developed the GUI using HTML, CSS, JSP and JavaScript.

Created components to isolate business logic.

Deployed the application in a J2EE architecture.

Implemented the Session Facade pattern using Session and Entity Beans.

Developed message-driven beans to listen to JMS destinations.

Developed the Web Interface using Servlets, Java Server Pages, HTML and CSS.

Used WebLogic to deploy the application in local and development environments.

Extensively used JDBC PreparedStatements to embed SQL queries in the Java code.

Developed DAOs (Data Access Objects) using Spring Framework 3.

Developed rich internet web applications using Java applets and Silverlight.

Used JavaScript for client-side validation and the Struts Validator framework for server-side validation.

Provided on-call support based on the priority of the issues.

Environment: Java, JSP, SQL, MS-Access, JavaScript, HTML.

EDUCATION: Bachelor of Technology.


