Kushal Kumar
************@*****.*** 201-***-****
Summary:
Over 6 years in IT, with 4 years of experience in Big Data technologies such as Spark and the Hortonworks and Cloudera Hadoop distributions.
Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
Extended Hive and Pig core functionality by writing custom UDFs in Python.
Architected, designed, and maintained high-performing ETL/ELT processes.
Tuned and monitored Hadoop jobs and clusters in a production environment.
Managed and reviewed Hadoop log files.
Participated in an Agile SDLC to deliver new cloud platform services and components.
Developed and maintained web applications deployed on the Apache Tomcat web server.
Exceptional ability to learn new technologies and deliver results on short deadlines.
Experience with UNIX commands and deploying applications on servers.
Experience writing custom SQL queries and building dashboards in Tableau.
Certifications:
CCA Spark and Hadoop Developer (CCA-175)
Technical Skills:
Hadoop Ecosystem: Hadoop 2.2, HDFS, MapReduce, Pig 0.8, Hive 0.13, Sqoop 1.4.4, Spark 1.3, ZooKeeper 3.4.5, YARN, Scala, Impala, Kafka, Tez, Tableau, NoSQL (HBase, Cassandra)
Hadoop Management & Security: Hortonworks Ambari, Cloudera Manager
Web Technologies: DHTML, HTML, XHTML, XML, XSL (XSLT, XPath), XSD, CSS, JavaScript
Server-Side Scripting: UNIX Shell Scripting, Python Scripting
Databases: Oracle 10g, Microsoft SQL Server, MySQL, DB2, Optima, Teradata SQL, RDBMS
Web Servers: Apache Tomcat 5.x, BEA WebLogic 8.x, IBM WebSphere 6.0/5.1.1
IDEs: WSAD 5.0, IRAD 6.0, Eclipse 3.5, Dreamweaver 13.2.1
OS/Platforms: Mac OS X 10.9.5, Windows 2008/Vista/2003/XP/2000/NT, Linux (all major distributions), UNIX
Methodologies: Agile, UML, Design Patterns, SDLC
Education:
MS-IST, Wilmington University, Wilmington, DE
Bachelor of Engineering in Mechanical Engineering, Manipal University, India
Professional Experience:
Staples Development Lab, Seattle, WA August 2015 – Present
Sr. Hadoop Developer
Description: Staples EDS worked with a third party on a project whose primary objective was to create a personalized experience for the customer, from the point at which the customer received an email all the way through checkout. Different data sources gathered data at varying levels of customer interaction. The work involved collecting and aggregating the data in HDFS and building Hive tables in order to calculate the effectiveness of various segments using KPIs (e.g., email performance, dollar value of sales per segment).
Responsibilities:
Worked on the Hortonworks HDP 2.2 distribution of Hadoop.
Worked with Teradata Studio, MS SQL Server, and DB2 to identify the tables and views required for export into HDFS.
Responsible for moving data from Teradata, MS SQL Server, and DB2 into HDFS on the development cluster for validation and cleansing.
Responsible for cleansing and validation at the HDFS, Teradata, and Hive table levels.
Wrote Sqoop statements for one-time imports and scripts for incremental imports into HDFS from Teradata, SQL Server, and DB2 (a wrapper sketch follows this list).
Cleansed and validated data in HDFS and exported it to Teradata by writing Sqoop export statements.
Worked extensively with SSH and SFTP to move data into HDFS from a third-party server.
Responsible for moving data from the Linux file system into HDFS.
Monitored and troubleshot the Kafka-Storm-HDFS pipeline for real-time data ingestion into the data lake in HDFS.
Performed ETL of large datasets in Spark on HDFS using PySpark (see the Spark SQL sketch after this list).
Worked with Spark SQL and created RDDs using PySpark.
Working knowledge of the DataFrame API in Spark.
Developed Hive tables using different SerDes, storage formats, and compression techniques (see the HiveQL sketch after this list).
Wrote HiveQL queries that integrate different tables into views to produce the required result sets.
Extensive experience tuning Hive queries with in-memory (map-side) joins for faster execution and appropriate resource allocation.
Applied the right join logic recursively to generate a high-level overview of tables for Tableau dashboards.
Worked extensively with Tableau to produce dashboards.
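Below is a minimal sketch of the kind of incremental Sqoop import described above, wrapped in a small Python script. The JDBC connection string, credentials file, source table, check column, and HDFS target directory are hypothetical placeholders, not the actual project values.

#!/usr/bin/env python
# Hypothetical wrapper around the Sqoop CLI for an incremental append import
# from SQL Server into HDFS; connection details, table, and paths are placeholders.
import subprocess

def incremental_import(last_value):
    cmd = [
        "sqoop", "import",
        "--connect", "jdbc:sqlserver://dbhost:1433;databaseName=sales",  # placeholder JDBC URL
        "--username", "etl_user",
        "--password-file", "/user/etl/.sqoop_pwd",      # password kept in HDFS, not on the command line
        "--table", "EMAIL_EVENTS",                      # placeholder source table
        "--target-dir", "/data/raw/email_events",       # HDFS landing directory
        "--incremental", "append",
        "--check-column", "EVENT_ID",                   # monotonically increasing key
        "--last-value", str(last_value),
        "--num-mappers", "4",
    ]
    subprocess.check_call(cmd)

if __name__ == "__main__":
    incremental_import(last_value=0)  # first run; later runs pass the stored watermark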
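Next, a minimal PySpark sketch (Spark 1.3-era API) of the segment-level KPI aggregation described above: raw interaction records already landed in HDFS are parsed into an RDD, exposed to Spark SQL as a DataFrame, and aggregated per segment. The input path, field layout, and KPI definition are hypothetical.

# Minimal PySpark sketch: build an RDD from raw interaction records in HDFS,
# expose it to Spark SQL, and compute per-segment KPIs. Paths, field layout,
# and the KPI definition are placeholders.
from pyspark import SparkContext
from pyspark.sql import SQLContext, Row

sc = SparkContext(appName="email-kpi-sketch")
sqlContext = SQLContext(sc)

# Tab-delimited records: segment, event type, dollar amount.
events = (sc.textFile("hdfs:///data/raw/email_events")
            .map(lambda line: line.split("\t"))
            .filter(lambda fields: len(fields) == 3)
            .map(lambda fields: Row(segment=fields[0], event=fields[1], amount=float(fields[2]))))

sqlContext.createDataFrame(events).registerTempTable("email_events")

# Dollar value of sales and interaction counts per segment.
kpis = sqlContext.sql("""
    SELECT segment,
           SUM(CASE WHEN event = 'purchase' THEN amount ELSE 0 END) AS sales_dollars,
           COUNT(*) AS interactions
    FROM email_events
    GROUP BY segment
""")
kpis.saveAsParquetFile("hdfs:///data/curated/email_kpis")  # Spark 1.3-era save API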
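Finally, a sketch of the Hive side of the work described above, issued through PySpark's HiveContext so the example stays in Python: an external ORC table with Snappy compression, and a MAPJOIN-hinted aggregation of the kind referred to in the memory-join bullet. Table names, columns, and locations are hypothetical, and a small segment_dim dimension table is assumed to already exist in the metastore.

# Hypothetical HiveQL issued through HiveContext: external ORC table plus a
# map-join-hinted aggregation. Names, columns, and locations are placeholders.
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="hive-kpi-tables-sketch")
hive = HiveContext(sc)

hive.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS email_events_orc (
        segment STRING,
        event   STRING,
        amount  DOUBLE
    )
    STORED AS ORC
    LOCATION '/data/curated/email_events_orc'
    TBLPROPERTIES ('orc.compress' = 'SNAPPY')
""")

# Assumes a small, pre-existing segment_dim table; the MAPJOIN hint asks Hive
# to hold it in memory and broadcast it to the mappers.
segment_kpis = hive.sql("""
    SELECT /*+ MAPJOIN(s) */
           s.segment_name,
           SUM(e.amount) AS sales_dollars
    FROM email_events_orc e
    JOIN segment_dim s ON e.segment = s.segment_id
    GROUP BY s.segment_name
""")
segment_kpis.registerTempTable("segment_kpis")  # read downstream by views/dashboards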
Environment: Hadoop, MapReduce, Spark, HDFS, Hive, Oozie, Java (JDK 1.6), Eclipse, Kafka, HBase (NoSQL), Sqoop, Pig.
GE Capital, Stamford, CT Oct 2014 – Jul 2015
Sr. Hadoop Developer/Java
Description: GE Capital's internet-based consumer credit application (eTail) processing and client onboarding IT processes. Consumer-facing improvements include a streamlined application form with pre-fill and back-fill capabilities and a new mobile-enabled process for consumers. GE relies on data provided by clients and third parties to optimize application processing for consumers and clients. A new Consumer eQuickscreen function enables GE to interface directly with consumers to offer pre-approved credit. The application internally deals with different source systems (FDR, Surveyor) to process consumer information, decides whether an application is approved, rejected, or pending, and arranges the loan amount according to credit history for approved loans.
Responsibilities:
Responsible for gathering data from multiple sources such as Teradata, Oracle, and SQL Server.
Responsible for validating and cleansing the data.
Found the right join logic and created valuable data sets for further data analysis; designed the architecture and developed the application to ingest and process high-volume mainframe data into the Hadoop infrastructure using Hadoop MapReduce.
Designed and developed a customized business-rule framework to implement business logic using Hive and Pig UDFs written in Python (see the sketch after this list).
Experienced in working with various data sources such as Teradata and Oracle; successfully loaded files from Teradata into HDFS and from HDFS into Hive.
Experienced in using ZooKeeper and Oozie operational services for coordinating the cluster and scheduling workflows.
Experienced in working with Amazon Elastic MapReduce (EMR).
Analyzed XML and log files.
Supported MapReduce programs running on the cluster and was involved in loading data from the UNIX file system into HDFS.
Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
Evaluated the suitability of Hadoop and its ecosystem for the above project, implementing and validating various proof-of-concept (POC) applications to adopt them as part of the Big Data Hadoop initiative.
Maintained system integrity of all Hadoop-related sub-components.
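Below is a minimal sketch of a Python business-rule script of the kind the framework above would register, written for use with Hive's TRANSFORM clause: Hive streams each input row to the script as tab-separated fields on stdin and reads tab-separated result rows back from stdout. The field layout, score threshold, and decision rule are hypothetical placeholders.

#!/usr/bin/env python
# Hypothetical business-rule script for Hive's TRANSFORM clause. Field layout
# and the approve/pending/reject thresholds are placeholders.
import sys

APPROVAL_SCORE_CUTOFF = 660  # hypothetical credit-score threshold

for line in sys.stdin:
    app_id, credit_score, requested_amount = line.rstrip("\n").split("\t")
    score = int(credit_score)
    if score >= APPROVAL_SCORE_CUTOFF:
        decision = "APPROVED"
    elif score >= APPROVAL_SCORE_CUTOFF - 40:
        decision = "PENDING"   # borderline applications go to manual review
    else:
        decision = "REJECTED"
    print("\t".join([app_id, decision, requested_amount]))

From Hive, a script like this would be invoked with something along the lines of: SELECT TRANSFORM(app_id, credit_score, requested_amount) USING 'python decision_rule.py' AS (app_id, decision, requested_amount) FROM applications;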
Environment: Apache Hadoop, MapReduce, Pig, Sqoop, Hive, Impala, Oozie, HBase.
Xerox Healthcare, Brooklyn, NY Oct 2013 – Sep 2014
Hadoop Developer/Java
Description: The assignment comprised integrating the Health Enterprise portal with the Fame legacy system, and implementing single sign-on (SSO), security, and member eligibility information.
Responsibilities:
Experience developing solutions to analyze large data sets efficiently.
Developed a MapReduce application to derive useful metrics from the data; tested thoroughly in local and distributed modes, found and fixed bugs, and ensured issue-free delivery to production.
Expert-level understanding of MapReduce internals, including shuffling, partitioning, and the performance bottlenecks of a MapReduce program.
Created Hive external and managed tables and designed data models in Hive.
Implemented business logic using Pig scripts.
Found the right join logic and created valuable data sets for further data analysis.
Worked extensively with Pig and Hive.
Responsible for developing custom UDFs in Pig and Hive.
Developed multiple MapReduce jobs in Java for data cleaning and processing (a streaming-style sketch of this kind of cleaning job follows this list).
Responsible for building scalable distributed data solutions using Hadoop.
Worked hands-on with the ETL process.
Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded the data into HDFS.
Extracted the data from Teradata into HDFS using Sqoop.
Exported the patterns analyzed back into Teradata using Sqoop.
Developed Hive queries to process the data and generate data cubes for visualization.
Used Oracle as the database to store the data and gained exposure to various database objects such as tables, stored procedures, functions, and triggers using SQL and PL/SQL.
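The cleaning jobs above were written in Java MapReduce; purely for illustration, and to keep all sketches in one language, the following is an equivalent Hadoop Streaming-style mapper in Python showing the same kind of record cleaning. The pipe-delimited member-eligibility layout is a hypothetical placeholder; a mapper like this would be submitted with the hadoop-streaming jar via its -mapper option.

#!/usr/bin/env python
# Hadoop Streaming-style mapper sketch for record cleaning (the production jobs
# were Java MapReduce; this version only illustrates the logic). The field
# layout below is a hypothetical placeholder.
import sys

EXPECTED_FIELDS = 4  # member_id | plan_code | eligibility_start | eligibility_end

for line in sys.stdin:
    fields = [f.strip() for f in line.rstrip("\n").split("|")]
    # Drop malformed rows and rows with a missing member id.
    if len(fields) != EXPECTED_FIELDS or not fields[0]:
        continue
    member_id, plan_code, elig_start, elig_end = fields
    # Normalize plan codes and emit key<TAB>value for the reducer.
    print("{0}\t{1},{2},{3}".format(member_id, plan_code.upper(), elig_start, elig_end))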
Environment: Java, J2EE, JavaScript, Struts, Spring, Hibernate, SQL/PLSQL, Web Services, Unix, Linux, Hadoop, MapReduce, HDFS, Hive, Oozie, Java (JDK 1.6), Eclipse, Cloudera, HBase (NoSQL), Sqoop, Pig.
IBS InfoWeb Private Limited, India Mar 2010 – Dec 2012
Application Developer
Description: Worked on a comprehensive reverse-bidding pharmaceuticals portal that allows the consumer to make the final decision on the price at which they intend to buy the medicine. The portal provides a common place for consumers, prescribers, providers, and pharmacists, enabling easy commerce.
Responsibilities:
Developed the user interface with HTML, JavaScript, JSP and Tag Libraries using Struts
Designed and developed the application using various design patterns, such as session façade, business delegate and service locator
Developed authentication and authorization prototype using Axis-wsse (used as SOAP/WSS4J)
Developed custom logging that captures application-specific details about ERAGUI
Configured Internationalization using resource bundles on JSP pages
Developed stateless session beans that provide a client's view of the application's business logic
Developed functional and unit tests following Test-Driven Development across modules using JUnit; solved several key issues by improving code and business processes, and integrated with the Ant build tool
Developed middleware support for data-flow distribution in web services composition
Implemented the Java Collections framework and an exception-handling framework in middle-tier modules
Configured open-source tools such as Log4j, Commons BeanUtils, and Commons Digester in the application
Used Oracle as the database to store the data and gained exposure to various database objects such as tables, stored procedures, functions, and triggers using SQL and PL/SQL.
Environment: Java, J2EE, JavaScript, Struts, Spring, Hibernate, SQL/PLSQL, Web Services, Unix.