Data Web Services

Location:
Columbia, MD
Posted:
November 21, 2017


Siva

E-Mail: ac3fgq@r.postjobfree.com

Phone: +1-646-***-****

Sr. Hadoop Developer

PROFESSIONAL SUMMARY:

7+ years of professional experience in the IT industry, with 4 years of experience in analysis, architectural design, prototyping, development, integration and testing of applications using Java/J2EE technologies and 3+ years of experience in Big Data analytics as a Hadoop and Spark developer.

Ability to design, develop, deploy and support solutions using Agile Scrum methodology that leverage the client's platform, with a good understanding of the phases of the Software Development Life Cycle (SDLC).

Experienced in building highly scalable Big Data solutions on Hadoop using multiple distributions (Cloudera, Hortonworks), ingestion tools such as Flume, and NoSQL platforms (HBase, Cassandra, Couchbase and MongoDB).

Strong knowledge and experience in the Hadoop and Big Data ecosystem, including MapReduce, HDFS, Hive, Pig, Spark, Cloudera Navigator, Mahout, HBase, Zookeeper, Kafka, Storm, Sqoop, Flume, Oozie and Impala.

Experience with Spark Core, Spark Streaming, HiveContext, Spark SQL and MLlib for analyzing streaming data.

Experience developing Pig Latin and HiveQL scripts for data analysis and ETL, and extending the default functionality by writing User Defined Functions (UDFs) for data-specific processing.

Experience with the Oozie workflow engine running workflow jobs with Hadoop MapReduce and Pig actions. Knowledge of Hadoop HDFS admin shell commands. Proficient in big data ingestion tools such as Flume, Kafka, Spark Streaming and Sqoop for streaming and batch data ingestion.

Experience in designing and developing enterprise applications using Java/J2EE technologies alongside Hadoop MapReduce, Hive, Pig, Sqoop, Flume, Oozie and Spark; J2EE tools and technologies such as JDBC, Spring, Struts, MVC, Hibernate, XML, JBoss and Apache Tomcat; and IDEs such as Eclipse 3.0, MyEclipse and RAD.

Expertise in optimizing traffic across the network using Combiners, joining multiple-schema datasets using joins, and organizing data using Partitioners and Buckets.

Experience converting between file formats such as Avro, ORC, RC, CSV, JSON, Sequence and Parquet, and compressing and decompressing data for analysis.

Involved in developing complex ETL transformations and performance tuning. Experience developing, supporting and maintaining ETL (Extract, Transform and Load) processes and reporting tables using Pentaho.

Experience in Data Extraction, Transformation and Loading (ETL) processes using SSIS, DTS Import/Export, Bulk Insert, BCP and DTS packages.

Involved in creating dashboards and reports in Tableau and created report schedules on Tableau server.

Implemented performance tuning for Tableau dashboards and reports built on large data sources. Developed BI data mapping and BI object (dashboards, reports, analyses, visualizations, metrics, security) design documents using Power BI.

Technical skills encompass Java, J2EE (JDBC, Servlets, custom tags, EJB, JMS, JNDI), jQuery, Struts, Web Services (SOAP, RESTful), Spring and Hibernate frameworks, ORM, XML, HTML 5.0, DHTMLX, UML, JSON, JSTL, Apache Log4j, ANT, Maven, shell scripting and JavaScript.

Experience in developing applications using Spring Framework 3.2.2; worked on different Spring modules such as the core container module, application context module, aspect-oriented programming (AOP) module, JDBC module, ORM module and web module.

Strong experience in the design and development of several web-based applications using open-source frameworks such as Struts and Spring.

Expertise in designing and developing enterprise applications for the J2EE platform using MVC, JSP, Servlets, JDBC, Web Services and Hibernate, and designing web applications using HTML5, CSS3, AngularJS and Bootstrap.

Worked with relational database systems such as MySQL and Oracle (PL/SQL) and NoSQL database systems such as HBase, Cassandra and MongoDB.

Hands-on experience with the NoSQL database HBase, which scales linearly to handle huge datasets with billions of rows and millions of columns and provides random read/write access to large datasets.

Proficiency in UNIX/Linux fundamentals, UNIX scripting and administration; experience with Ubuntu and CentOS.

Expertise in using Apache Tomcat, JBoss, WebLogic and WebSphere.

Knowledge of machine learning and related tools such as Octave, Python and R libraries.

Experience with Scala, including Scala collections and singleton, anonymous and companion objects. Used pandas in Python for data analysis and file processing (pickle, JSON, CSV, HDF) along with libraries such as Matplotlib, NumPy, SciPy, Scrapy and IPython.
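
For illustration, a minimal Scala sketch of a companion object used as a factory alongside a standard collections transformation (the Reading class, field names and sample data are hypothetical):

// A case class with an explicitly defined companion object acting as a factory.
case class Reading(sensor: String, value: Double)

object Reading {
  // Companion-object factory: parse one CSV line into a Reading.
  def fromCsv(line: String): Reading = {
    val Array(sensor, value) = line.split(",")
    Reading(sensor, value.toDouble)
  }
}

// Singleton object serving as the program entry point.
object CollectionsDemo {
  def main(args: Array[String]): Unit = {
    val lines = List("s1,10.5", "s2,7.0", "s1,3.5")
    // Scala collections: map, groupBy and per-group aggregation.
    val totals = lines.map(Reading.fromCsv)
      .groupBy(_.sensor)
      .map { case (s, rs) => s -> rs.map(_.value).sum }
    totals.foreach(println)
  }
}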

TECHNICAL SKILLS:

BigData Technologies

HDFS, MapReduce, Pig, Hive, HBase, Sqoop, Flume, Oozie, Hadoop Streaming, Zookeeper, AWS, Kafka, Impala, Apache Spark, Apache Storm, YARN and Mahout

Hadoop Distributions

Cloudera (CDH4/CDH5), HortonWorks

Languages

Java, Scala, Python, C, C++, JavaScript, SQL

IDE Tools

Eclipse, Netbeans, IntelliJ IDEA, Microsoft visual studio

Framework

Hibernate, Spring, Struts, Junit

Web Technologies

HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, XHTML, JSF, AngularJS

Web Services

SOAP,REST, WSDL, JAXB, and JAXP

Operating Systems

Windows (XP,7,8), UNIX, LINUX, Ubuntu, CentOS

Application Servers

JBoss, Tomcat, WebLogic, WebSphere, GlassFish

Tools

Adobe, Sql Developer, Flume, Sqoop and Storm

J2EE Technologies

JSP, Java Bean, Servlets, JPA1.0, EJB3.0, JDBC

Databases

Oracle, MySQL, DB2, Derby, PostgreSQL, NoSQL databases (HBase, Cassandra)

PROFESSIONAL EXPERIENCE:

State of New York, Albany, NY Feb 2016 - Present

Hadoop Developer

Project description: The New York Department of Health ensures that high-quality, appropriate health services are available to all New York State residents at a reasonable cost. Department functions and responsibilities include promoting and supervising public health activities throughout New York State. The tables range from roughly 3 million to 2 billion records; to consolidate data at this scale we implemented Spark 1.6 on the Cloudera distribution, achieved the required end result and sent the required data back downstream for reporting.

Responsibilities:

Data ingestion into the Indie-Data Lake using an open-source Hadoop distribution to process structured, semi-structured and unstructured datasets, with open-source Apache tools such as Flume and Sqoop landing data in the Hive environment (using the IBM BigInsights 4.1 platform).

Developed Spark code using Scala and Spark SQL for faster testing and data processing.
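
A minimal sketch of this kind of Spark SQL work in Scala, using the Spark 1.6-era HiveContext API; the claims table and its column names are hypothetical:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object ClaimsAggregation {
  def main(args: Array[String]): Unit = {
    val sc   = new SparkContext(new SparkConf().setAppName("claims-aggregation"))
    val hive = new HiveContext(sc)

    // Spark SQL over a Hive table, followed by an action to materialize the result.
    val byCounty = hive.sql(
      "SELECT county, COUNT(*) AS claim_count FROM claims GROUP BY county")
    byCounty.show(20)

    sc.stop()
  }
}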

Experience with batch processing of data sources using Apache Spark.

Developed predictive analytics using Apache Spark Scala APIs.

Developed MapReduce jobs in Java API to parse the raw data and store the refined data.

Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
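
As an illustration of the producer side, a minimal Scala sketch using the Kafka clients API; the broker address, topic name and record contents are hypothetical:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object ClickstreamProducer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    // Each record carries a key (e.g. a user id) and a value (the raw event line).
    producer.send(new ProducerRecord[String, String]("clickstream", "user-42", "page=/home ts=1510000000"))
    producer.close()
  }
}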

Imported millions of rows of structured data from relational databases using Sqoop, processed them with Spark and stored the data in HDFS in Parquet format.
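
A minimal Scala sketch of the Spark half of that flow, assuming Sqoop has landed comma-delimited files in an HDFS staging directory (paths, delimiter, record layout and column names are hypothetical):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical record layout for the Sqoop-landed rows.
case class Member(id: Long, county: String, plan: String)

object LandToParquet {
  def main(args: Array[String]): Unit = {
    val sc  = new SparkContext(new SparkConf().setAppName("sqoop-to-parquet"))
    val sql = new SQLContext(sc)
    import sql.implicits._

    // Read the delimited text files Sqoop wrote to the staging area.
    val members = sc.textFile("hdfs:///staging/members/*")
      .map(_.split(","))
      .map(f => Member(f(0).toLong, f(1), f(2)))
      .toDF()

    // Persist to HDFS as Parquet for downstream Spark/Hive processing.
    members.write.mode("overwrite").parquet("hdfs:///warehouse/members_parquet")
    sc.stop()
  }
}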

Implemented Spark DataFrame transformations and actions to migrate MapReduce algorithms.

Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
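
A small sketch of what such a migration looks like: the same (key, value) aggregation expressed first as a pair-RDD reduce (MapReduce style) and then with the DataFrame API; column names and sample data are hypothetical:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.sum

object RddToDataFrame {
  def main(args: Array[String]): Unit = {
    val sc  = new SparkContext(new SparkConf().setAppName("rdd-to-dataframe"))
    val sql = new SQLContext(sc)
    import sql.implicits._

    val events = sc.parallelize(Seq(("p1", 10.0), ("p2", 4.5), ("p1", 2.5)))

    // MapReduce-style: pair RDD with an explicit reduce phase.
    val totalsRdd = events.reduceByKey(_ + _)

    // Same aggregation with the DataFrame API, letting Catalyst plan the shuffle.
    val totalsDf = events.toDF("provider", "amount")
      .groupBy("provider").agg(sum("amount").as("total_amount"))

    totalsRdd.collect().foreach(println)
    totalsDf.show()
    sc.stop()
  }
}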

Used the DataFrame API in Java for working with distributed collections of data organized into named columns. Developed solutions to pre-process large sets of structured data in different file formats (text files, Avro data files, Sequence files, XML and JSON files, ORC and Parquet).

Automated and scheduled Sqoop jobs using Unix shell scripts.

Worked on Database designing, Stored Procedures, and PL/SQL.

Involved in identifying job dependencies to design workflow for Oozie & YARN resource management.

Responsible for managing existing data extraction jobs while also playing a vital role in building new data pipelines from various structured and unstructured sources into Hadoop.

Worked on a product team using Agile Scrum methodology to design, develop, deploy and support solutions that leverage the client's big data platform.

Integrated Apache Storm with Kafka to perform web analytics; loaded clickstream data from Kafka into HDFS, HBase and Hive via Storm.

Designed and coded from specifications; analyzed, evaluated, tested, debugged and implemented complex software applications.

Developed Sqoop Scripts to extract data from DB2 EDW source databases into HDFS.

Tuned Hive and Pig to improve performance and solved performance issues in both types of scripts, with an understanding of joins, grouping and aggregation and how they translate to MapReduce jobs.

Created partitions and buckets based on State for further processing using bucket-based Hive joins.
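
An illustrative Hive DDL for that layout, issued here through HiveContext in Scala; the database, table, column names and bucket count are hypothetical:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object CreateBucketedTable {
  def main(args: Array[String]): Unit = {
    val sc   = new SparkContext(new SparkConf().setAppName("bucketed-ddl"))
    val hive = new HiveContext(sc)

    // Partition by state and bucket by member_id, so that equi-joins on member_id
    // between similarly bucketed tables can use bucket-based joins.
    hive.sql(
      """CREATE TABLE IF NOT EXISTS health.members_bucketed (
        |  member_id BIGINT,
        |  county    STRING
        |)
        |PARTITIONED BY (state STRING)
        |CLUSTERED BY (member_id) INTO 32 BUCKETS
        |STORED AS ORC""".stripMargin)

    sc.stop()
  }
}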

Implemented Cloudera Manager on the existing cluster.

Worked extensively with the Cloudera Distribution of Hadoop (CDH 4.x and 5.x).

Designed and developed various SSIS (ETL) packages to extract and transform data, and was involved in scheduling SSIS packages.

Created ETL packages with different data sources (SQL Server, Flat Files, Excel source files, XML files etc.) and then loaded the data into destination tables by performing complex transformations using SSIS/DTS packages.

Troubleshooting experience debugging and fixing incorrect or missing data in both Oracle Database and MongoDB.

Environment: HDFS, MapReduce, Java API, JSP, JavaBeans, Pig, Hive, Sqoop, Flume, Oozie, HBase, Kafka, Impala, Spark Streaming, Storm, YARN, Eclipse, Spring, PL/SQL, Unix shell scripting, Cloudera.

Covanta Energy, Camden, NJ Apr 2014 - Dec 2015

Hadoop Developer

Project description: Big data migration and analysis project using Hadoop ecosystem components to migrate from a traditional data warehousing and BI system. Involved extracting data from different servers and loading it into the Hadoop cluster to generate reports for analysis. The initial areas of data migration were MDM, Planning, EDW and Analytics.

Responsibilities:

Importing and exporting data into HDFS and Hive using Sqoop and Kafka.

We received an average of 60 GB of data daily; overall, the project's data warehouse held about 4 PB of data, processed on a 110-node cluster.

Developed different components of the system, such as Hadoop processes involving MapReduce and Hive.

Developed an interface for validating incoming data in HDFS before kicking off the Hadoop process.

Wrote optimized Hive queries using techniques such as user-defined functions and customized Hadoop shuffle and sort parameters.

Along with the infrastructure team, involved in designing and developing a Kafka- and Storm-based data pipeline; this pipeline also uses Amazon Web Services EMR, S3 and RDS.

Worked on tuning Hive and Pig to improve performance and solve performance-related issues in Hive and Pig scripts, with a good understanding of joins, grouping and aggregation and how they translate to MapReduce jobs.

Developed MapReduce programs for different file types using Combiners along with UDFs and UDAFs.

Experience working on a multi-node cluster with tools that offer several commands for reporting HBase usage.

Experience creating, dropping and altering tables at runtime without blocking updates and queries, using HBase and Hive.

Used HCatalog to access Hive table metadata from MapReduce and Pig code.

Experience pre-processing logs and semi-structured content stored on HDFS using Pig.

Experience with structured data imports and exports into the Hive warehouse, enabling business analysts to write Hive queries.

Experience in managing and reviewing Hadoop log files.

Experience with UNIX shell scripts for business processes and for loading data from different interfaces into HDFS.

Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract data from web logs and store it in HDFS. Designed and implemented various metrics that can statistically signify the success of an experiment.

Involved in developing shell scripts to orchestrate the execution of all other scripts (Pig, Hive and MapReduce) and to move data files within and outside of HDFS.

Responsible for processing ingested raw data using MapReduce, Apache Pig and Hive.

Developed Pig scripts for change data capture and delta record processing between newly arrived data and data already existing in HDFS.

Involved in pivoting HDFS data from rows to columns and columns to rows.
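
The project itself used Hive/Pig-era tooling for this; purely to illustrate the rows-to-columns idea, here is a pivot sketched with the Spark DataFrame API in Scala (column names and sample data are hypothetical):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.sum

object PivotSketch {
  def main(args: Array[String]): Unit = {
    val sc  = new SparkContext(new SparkConf().setAppName("pivot-sketch"))
    val sql = new SQLContext(sc)
    import sql.implicits._

    // Long "row" form: one row per (plant, month, reading).
    val readings = sc.parallelize(Seq(
      ("camden", "jan", 120.0), ("camden", "feb", 95.0), ("newark", "jan", 80.0)
    )).toDF("plant", "month", "reading")

    // Rows -> columns: one column per distinct month value.
    val wide = readings.groupBy("plant").pivot("month").agg(sum("reading"))
    wide.show()
    sc.stop()
  }
}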

Involved in creating Hive and Pig tables, loading data, and writing Hive queries and Pig scripts.

Attended a couple of workshops on Spark, RDDs and Spark Streaming.

Hands-on experience with Eclipse, PuTTY, WinSCP, VNC Viewer, etc.

Environment: Linux 6.7, CDH 5.5.2, MapReduce, Hive 1.1, Pig, HBase, YARN, Oozie, shell scripting, AWS, Sqoop 1.4.3, Eclipse, Java 1.8.

Medtronic, Minneapolis, MN Jan 2013 - Mar 2014

Java/J2EE Developer

Project description: Medtronic is a medical device company headquartered in Dublin, Ireland, with operational headquarters in Minnesota. It is the world's largest standalone medical technology development company, with a market capitalization of nearly $100 billion; it operates in more than 140 countries and holds more than 53,000 patents.

Responsibilities:

Involved in all the Web module UI design and development using HTML, CSS, jQuery, JavaScript, Ajax.

Designed and modified User Interfaces using JSP, JavaScript, CSS and jQuery.

Developed UI screens using Bootstrap, CSS and jQuery.

Developed user interfaces using JSP, JSF frame work with AJAX, Java Script, HTML, DHTML, and CSS.

Implemented Spring AOP for admin services.

Involved in a multi-tiered J2EE design utilizing MVC architecture (Struts Framework), Hibernate and EJB, deployed on WebSphere Application Server and connecting to an Oracle database.

Developed software in Java/J2EE, XML, Oracle, EJB, Struts and enterprise architecture.

Developed and Implemented Web Services and used Spring Framework.

Implemented the caching mechanism in Hibernate to load data from Oracle database.

Implemented application level persistence using Hibernate and Spring.

Implemented Persistence layer using Hibernate to interact with the Oracle database used Hibernate Framework for object relational mapping and persistence.

Developed Servlets and JSPs based on MVC pattern using Spring Framework.

Maintained the business standards in EJB and deployed them on to WebLogic Application Server

Developed Rest architecture based web services to facilitate communication between client and servers.

Developed AJAX scripting to interact with server-side JSP.

Used the Eclipse as IDE, configured and deployed the application onto WebLogic application server using Maven.

Created applications, connection pools, deployment of JSPs, Servlets, and EJBs in Weblogic.

Created SQL queries, PL/SQL Stored Procedures, Functions for the Database layer by studying the required business objects and validating them with Stored Procedures using DB2. Also used JPA with Hibernate provider.

Implemented an FTP utility program for recursively copying the contents of an entire directory (up to two levels) from a remote location, using socket programming.

Wrote test cases using JUnit testing framework and configured applications on Weblogic Server.

Environment: Java, J2EE, Spring, Hibernate, Struts, JSF, EJB, MySQL, Oracle, SQL Server, DB2, PL/SQL, JavaScript, jQuery, Servlets, JSP, HTML, CSS, Agile methodology, Eclipse, WebLogic Application Server, UNIX, XML, JUnit, SOAP, RESTful web services, JDBC.

Johnson & Johnson, Hyderabad, India Sep 2011 - Oct 2012

Java/J2EE Developer

Project description: Johnson & Johnson Health and Wellness Solutions is grounded in the science of behavior change, taking a science-based, scalable approach to sustainable behavior change guided by human understanding and demonstrated through proven outcomes. We engage individuals, connect to what they value most and improve health outcomes, ideally through early intervention.

Responsibilities:

Developing front-end screens using JSP, HTML and CSS.

Developing server side code using Struts and Servlets.

Developing core java classes for exceptions, utility classes, business delegate, and test cases.

Developing SQL queries using MySQL and established connectivity.

Working with Eclipse using Maven plugin for Eclipse IDE.

Designing the user interface of the application using HTML5, CSS3, JSP, and JavaScript.

Tested the application functionality with JUnit Test Cases.

Developing all the User Interfaces using JSP framework and Client Side validations using JavaScript.

Writing Client Side validations using JavaScript.

Extensively used JQuery for developing interactive web pages.

Developed the user interface presentation screens using HTML, XML, and CSS.

Developed the Shell scripts to trigger the Java Batch job, Sending summary email for the batch job status.

Coordinated with the QA lead on development of the test plan, test cases, test code and actual testing; responsible for defect allocation and ensuring that defects were resolved.

Application was developed in Eclipse IDE and was deployed on Tomcat server.

Involved in Agile scrum methodology.

Supported for bug fixes and functionality change.

Environment: Java/J2EE, Oracle 10g, SQL, PL/SQL, JSP, Hibernate, WebLogic 8.0, HTML, AJAX, JavaScript, JDBC, XML, UML, JUnit, Eclipse.

Genesis Software Inc, Hyderabad, India Apr 2010 - Sep 2011

Java Associate

Project description: Genesis Systems Incorporated is the number one provider of Vital Records registration, issuance and analysis software. We have specialized in Vital Records software since 1987 and were the developers of the original Electronic Birth Certificate (EBC) software.

Responsibilities:

Actively participated in all phases of the Software Development Life Cycle (SDLC).

Worked extensively on core Java (collections, generics, and interfaces for passing data from the GUI layer to the business layer).

Developed the web interface for user modules using JSP, HTML, XML, CSS, JavaScript and AJAX.

Developed using J2EE design patterns like Command Pattern, Session Facade, Business Delegate, Service Locator, Data Access Object and value object patterns.

Used JUnit test cases to test the application and performed random checks to analyze the portability, reliability and flexibility of the project.

Analyzed, designed and implemented Online Enrollment Web Application using Struts, JSTL, Hibernate, UML, Design Patterns and Log4J.

Used advanced jQuery, AJAX, JavaScript, CSS and pure-CSS layouts, and accessed the Oracle database using JDBC.

Involved in writing application level code to interact with APIs, Web Services using AJAX, JSON and XML.

Created Servlets and Java Server Pages, which route submittals to the appropriate Enterprise JavaBean (EJB).

Followed Scrum and iterative Agile development methodologies for the web application.

Responsible for the performance of PL/SQL procedures and SQL queries.

Involved in deploying components on the WebLogic application server.

Deployed applications on Linux client machines.

Environment: Java EE 5, JSP 2.0, JavaBeans, EJB 3.0, JDBC, application server, Eclipse, Java API, J2SDK 1.4.2, JDK 1.5, JMS, message queues, web services, UML, XML, HTML, XHTML, JavaScript, Log4j, CVS, JUnit, Windows and SunOS 2.7/2.8.


