Anil K.
aczlnn@r.postjobfree.com
PROFESSIONAL SUMMARY
Over 8 years of extensive IT experience, including over 4 years of experience with the Hadoop ecosystem and Java
technologies such as HDFS, MapReduce, Apache Pig, Hive, HBase, Sqoop, Flume, YARN, and ZooKeeper.
Highly proficient in, and with an in-depth understanding of, Hadoop architecture and its components such as HDFS, Job
Tracker, Task Tracker, Name Node, Data Node, and MapReduce concepts; experienced in writing
MapReduce programs with Apache Hadoop to analyze large data sets efficiently.
Experience in importing and exporting terabytes of data between HDFS and relational database systems using
Sqoop.
Expert in ingesting data into the Big Data ecosystem.
Proficient in Spark with Scala for loading data from local file systems, HDFS, Amazon S3, and relational and
NoSQL databases using Spark SQL, importing data into RDDs, and ingesting data from a range of sources using
Spark Streaming.
Hands-on experience working with Hadoop ecosystem tools including Hive, Pig, HBase, Oozie, Impala, Spark, Drill,
and Hue.
Working knowledge of ETL and BI tools such as Informatica and Teradata.
Extensive knowledge of the development, analysis, and design of ETL methodologies across all phases of the data
warehousing life cycle.
Built application platforms in the cloud by leveraging Amazon Web Services, open-source
technologies, and CI/CD engineering best practices.
Expertise in relational databases such as Oracle, MySQL, SQL Server, and IBM DB2 for managing tables, views,
indexes, sequences, stored procedures, functions, triggers, and packages, and in NoSQL databases such as
MongoDB and Cassandra for managing document-oriented data, cluster administration, and CRUD operations on data.
Thorough knowledge of core Java concepts such as OOP, Java Swing, JDBC, JMS, multithreading, and JUnit, and of
advanced Java and web technologies such as JSP, Servlets, Struts, HTML, XML, CSS, Hibernate, AJAX, SVN,
JavaBeans, and Spring.
Proficient in developing web-based applications and client-server distributed applications in Java/J2EE
technologies using object-oriented methodology.
Worked with cloud services such as Amazon Web Services (AWS) and Google Cloud.
Excellent knowledge of data warehousing concepts.
Well versed in software development methodologies such as Waterfall, Agile (Scrum), Test-Driven Development,
and Service-Oriented Architecture.
Experience in using code repository tools such as TortoiseSVN, GitHub, and Visual SourceSafe.
Strong communication and analytical skills and a demonstrated ability to handle multiple tasks as well as work
independently or in a team.
TECHNICAL SKILLS
Hadoop/Big Data Technologies Hive, HBase, Sqoop, Pig, MapReduce, YARN, Flume, Oozie, ZooKeeper
J2SE/J2EE Technologies Java, J2EE, JDBC, JSP, Servlets, Spring, Java Beans
Web Services SOAP, RESTful
IDE Tools Eclipse, NetBeans, RSA, RAD, Oracle WebLogic Workshop
Cloud Technologies AWS EC2, S3
Databases Oracle, SQL Server, MySQL, MS SQL, IBM DB2, MongoDB, Cassandra
Web/Application Servers Apache Tomcat, IBM WebSphere, WebLogic Application Server, JBoss
Programming/Scripting Languages C, Java, Unix Shell/Bash Scripting, Python
Platforms Windows, Linux and Unix
Version Control Tortoise SVN, GIT and Visual Source Safe
Methodologies Agile/ Scrum, Waterfall
PROFESSIONAL EXPERIENCE
Client: Johnson Controls, Milwaukee, WI    May 15 - Till Date
Role: Hadoop Developer
Description:
Johnson Controls is one of the top Fortune 500 companies. When I joined Johnson Controls, the team was developing a
new ingestion framework to replace the existing framework with advanced features. In this project, I worked on
various stages of the framework development using Hadoop tools.
Responsibilities:
Involved in HBase setup and in storing data into HBase, which is used for analysis.
Developed analytical components using Scala, Spark, Apache Mesos, and Spark Streaming.
Involved in loading data from the Linux file system into HDFS.
Developed Spark scripts by using Scala Shell commands as per the requirement.
Involved in converting MapReduce programs into Spark transformations using Spark RDDs in Scala.
Worked hands-on with the ETL process; responsible for running Hadoop streaming jobs to process terabytes of XML data.
Wrote and implemented Apache Pig scripts to load data from and store data into Hive.
Implemented Spark Streaming applications in Java.
Combined visualizations into Interactive Tableau Dashboards and published them to the web portal.
Performed data analysis using Spark with Scala.
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Scala and Python.
Analyzed the SQL scripts and designed the solution to implement using PySpark.
Worked on reading multiple data formats on HDFS using Scala.
Extracted and updated data in MongoDB using the MongoDB import and export command-line utilities.
Implemented real time system with Kafka, Storm and Zookeeper.
Developed Hive scripts equivalent to Teradata queries and performed performance tuning using Hive.
Used Spark Streaming to receive real-time data from Kafka and store the streamed data in HDFS using Scala, and
in databases such as HBase and MongoDB.
Communicated with the client and participated in requirements gathering with business users.
Coordinated with offshore and onsite teams to understand the requirements and prepared high-level and low-level
design documents from the requirements specification.
Environment: Hive, Pig, HBase, MapReduce, Flume, Spark, Spark SQL, Spark Streaming, PySpark, Scala, Python,
Kafka, Storm, ZooKeeper, Shell/Bash Scripting, NoSQL database MongoDB, Oozie, YARN, JIRA, JDBC
Client: United Health Group, Minnetonka, MN    Jan 14 - April 15
Role: Hadoop Developer
Description:
United Health Group is well known for providing insurance and health services in the United States. In this project, we
developed a framework called DataFabric, which aims to auto-ingest data from different data sources into the
company's own data lake for analysis of all data. The framework is built to manage and maintain incremental data
that arrives along with historical data, and it uses environments such as Hive, HBase, Talend, and Hadoop. The
DataFabric framework is still under development and must overcome many challenges to become a more efficient framework.
Responsibilities:
Responsible for managing data coming from different sources.
Handled storage and processing through Hue, covering all Hadoop ecosystem components.
Involved in creating Hive tables, and loading and analyzing data using Hive queries.
Developed simple to complex MapReduce jobs using Hive and Pig.
Involved in system design and development in Core Java using Collections, Multithreading.
Gained experience in CI/CD with Jenkins.
Developed Pig Latin scripts to extract data from the output files and load it into HDFS.
Developed custom UDFs and implemented Pig scripts.
Implemented UDFs in Java for Hive to process data in ways that cannot be handled by Hive's built-in functions.
Developed simple to complex Unix shell/Bash scripts during the framework development process.
Worked on implementing Flume to import streaming log data and aggregate the data into HDFS.
Involved in writing Flume and Hive scripts to extract, transform, and load the data into the database.
Used Oozie to orchestrate the MapReduce jobs and worked with HCatalog to open up access to Hive's Metastore.
Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with
components on HDFS and Hive.
Created contexts to pass values throughout the process between parent and child jobs.
Experienced in Talend Data Integration, Talend Platform Setup on Windows and Unix systems.
Worked on POCs to integrate Spark with other tools.
Created Oozie Jobs for workflow of Spark, Sqoop and Shell scripts.
Created a Spark application to load data into a dynamic-partition-enabled Hive table.
Worked on stateful transformations in the Spark application.
Used Mahout MapReduce to parallelize a single iteration.
Installed, managed, and monitored the Hadoop cluster using Cloudera Manager.
Wrote automation scripts to monitor HDFS and HBase through cron jobs.
Translated business processes into data mappings for building the data warehouse.
Created Parquet Hive tables with Complex Data Types corresponding to the Avro Schema.
Planned releases with the team using JIRA and Confluence.
Processed data with Hive and Teradata, and developed web applications using Java and Oracle SQL.
Environment: HDFS, Sqoop, HiveQL, HBase, Pig, Flume, YARN, Oozie, Kafka, ZooKeeper, Apache Storm, Core Java,
Jenkins, Teradata, VPC, Apache Parquet, ETL, Git, UNIX/Linux Shell Scripting, NoSQL, JIRA
Client: Yale University, New Haven, CT    June 13 - Dec 13
Role: Hadoop Developer
Description:
Yale University is an American private Ivy League research university in New Haven, Connecticut. Founded in 1701 in
Saybrook Colony as the Collegiate School, the university is the third-oldest institution of higher education in the United
States.
Responsibilities:
Analyzed the requirements to set up a cluster.
Installed and configured Hadoop, MapReduce, and HDFS (Hadoop Distributed File System), and developed multiple
MapReduce jobs in Java.
Worked with the infrastructure and admin teams in designing, modeling, sizing, and configuring a 15-node Hadoop
cluster.
Developed MapReduce programs in Java for parsing the raw data and populating staging tables.
Developed Unix/Linux Shell Scripts and PL/SQL procedures.
Extracted the data from MySQL into HDFS using Sqoop.
Created Hive queries to compare the raw data with Enterprise Data Warehouse (EDW) reference tables and to
perform aggregations.
Imported and exported data into HDFS and Hive using Sqoop.
Wrote Pig scripts to process the data.
Developed and designed Hadoop, Spark and Java components.
Developed PIG Latin scripts to extract the data from the web server output files to load into HDFS.
Involved in HBase setup and in storing data into HBase, which is used for further analysis.
Built an Enterprise Data Warehouse (EDW) on an Amazon Redshift database via Hadoop EMR (Pig).
Established release management processes using tools such as Jenkins, Ant, Maven, and Chef.
Configured and managed Amazon CloudSearch through the Amazon Web Services (AWS) Management Console.
Installed Kafka on the Hadoop cluster and developed producer and consumer code in Java to establish a connection
from the Twitter source to HDFS, filtering on popular hashtags.
Implemented JavaScript execution in Core Java.
Analyzed the possibility of reusing existing BTEQ code and recommended compute/storage loads of Teradata to be
offloaded to Hadoop/Hive for business benefits.
Developed clustering, classification, and recommender systems around Elasticsearch and Mahout, and compared and
contrasted them with Spark MLlib.
Migrated corporate Linux servers from physical servers to Amazon Web services (AWS) virtual servers.
Evaluated, recommended, maintained, and administered the issue-tracking tool Bugzilla and managed issues with JIRA.
Loaded the data using the Hive Parquet SerDe for the Avro data type.
Used SQL queries to retrieve data from Enterprise Data Warehouse (EDW).
Involved in dimensional modeling and tuning on AWS Redshift.
Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and
Sqoop.
Unit tested a sample of raw data, improved performance, and turned the work over to production.
Environment: HDFS, MapReduce, Core Java, UNIX/Linux Shell Scripting, PL/SQL, Pig, Hive, HBase, Sqoop, Flume,
Oozie, ZooKeeper, HiveQL, Kafka, NoSQL, Spark, Amazon EMR, Amazon Redshift, Mahout, Amazon Web Services
(AWS), Apache Parquet, ETL, Teradata, JIRA, Jenkins, Git, Java/J2EE
Century National Insurance, India    Dec 09 - Jan 12
Java Developer
Description:
The New York Motor Vehicle Commission (MVC) is the government agency responsible for titling, registering, and
providing plates for vehicles and for licensing drivers in the U.S. state of New York. It also provides online support for
renewing titles, registrations, licenses, etc.
Responsibilities:
Involved in the full Software Development Life Cycle (SDLC) of the tracking system, including requirements
gathering, conceptual design, analysis, detailed design, development, system testing, and user acceptance.
Worked in Agile Scrum methodology
Involved in writing exception and validation classes using core java
Designed and implemented the user interface using JSP, XSL, DHTML, Servlets, JavaScript, HTML, CSS and AJAX
Developed framework using Java, MySQL and web server technologies
Validated the XML documents with XSD validation and transformed to XHTML using XSLT
Implemented cross-cutting concerns as aspects at the service layer using Spring AOP and for DAO objects using
Spring ORM.
Used Spring beans to control the flow between the UI and Hibernate.
Implemented SOA architecture with web services using SOAP, WSDL, UDDI, and XML with the CXF framework and Apache Commons.
Worked on the database interaction layer for insertion, update, and retrieval operations using queries and stored
procedures.
Wrote stored procedures and complicated queries for IBM DB2.
Used Eclipse IDE for development and JBoss Application Server for deploying the web application
Used Apache Camel to create routes for web services.
Used JReport for the generation of reports of the application
Used WebLogic as the application server and Log4j for application logging and debugging.
Used CVS as the version control tool and ANT as the project build tool.
Environment: Java, HTML, CSS, JSTL, JavaScript, Servlets, JSP, Hibernate, Struts, Web Services, Eclipse, JBoss,
JMS, JReport, Scrum, MySQL, IBM DB2, SOAP, WSDL, UDDI, AJAX, XML, XSD, XSLT, Oracle, Linux,
Log4J, JUnit, ANT, CVS
InfoTech, India    Aug 08 - Nov 09
Java Developer
Description:
The objective of Item Management is to set up, maintain, and share item information in a flexible system that easily
supports Unilever's growth, increases speed to market, and improves data accuracy while reducing user workload. Guided
Setup is one module of the project, which functions as a wizard to complete the configuration and creation of various types
of items, such as single items, multiple items, and assortment items.
Responsibilities:
Analysis, design and development of application based on J2EE and design patterns
Involved in all phases of SDLC (Software Development Life Cycle)
Developed user interface using JSP, HTML, CSS and JavaScript
Involved in developing functional model, object model and dynamic model using UML
Developed the Java classes used in JSPs and Servlets
Implemented asynchronous functionalities like e-mail notification using JMS
Implemented Multithreading to achieve consistent concurrency in the application
Used the Struts framework for managing the navigation and page flow
Created SQL queries and used PL/SQL stored procedures
Used JDBC for database transactions
Developed stored procedures in Oracle
Involved in developing the helper classes for better data exchange between the MVC layers
Used Test Driven Development approach and wrote many unit and integration test cases
Used Eclipse as IDE tool to develop the application and JIRA for bug and issue tracking
Worked on integration testing using JUnit and XML for building the data structures required for the web
service
Used ANT tool for building and packaging the application
Code repository management using SVN
Environment: Core Java, Struts, Servlets, HTML, CSS, JSP, XML, JavaScript, Waterfall, Eclipse IDE, Oracle, SQL,
JDBC, JBoss, JUnit, ANT, SVN, Apache Tomcat Server