
Data Developer

Location: United States
Posted: July 14, 2016


SRI RAM

Mobile: 908-***-****

Email: acvo55@r.postjobfree.com

PROFESSIONAL SUMMARY:

Over 7 years of professional IT experience, including 4 years of Big Data ecosystem experience in ingestion, querying, processing, and analysis of big data.

Experience in using Hadoop ecosystem components such as MapReduce, HDFS, HBase, ZooKeeper, Hive, Sqoop, Pig, Flume, Kafka, and Cloudera.

Knowledge of NoSQL databases such as HBase and Cassandra.

Experience includes requirements gathering, design, development, integration, documentation, testing, and build.

Experience in working with MapReduce programs, Pig scripts, and Hive commands.

Extensively worked on developing and optimizing MapReduce programs, Pig scripts, and Hive queries to create structured data for data mining.

Solid knowledge of Hadoop architecture and daemons such as NameNode, DataNode, JobTracker, and TaskTracker.

Good knowledge of ZooKeeper for coordinating clusters.

Experience in database design, data analysis, SQL programming, PL/SQL stored procedures, and triggers in Oracle and SQL Server.

Experience in extending Hive and Pig core functionality with custom user-defined functions (UDFs).

Experience in writing custom classes, functions, and procedures, as well as problem management, library controls, and reusable components.

Working knowledge of Oozie, a workflow scheduler system for managing jobs that run on Pig, Hive, and Sqoop.

Followed test-driven development under Agile and Scrum methodology to produce high-quality software.

Experienced in integrating Java-based web applications in a UNIX environment.

Developed applications using JAVA, JSP, Servlets, JDBC, JavaScript, XML and HTML.

Strong analytical skills with the ability to quickly understand clients' business needs. Involved in meetings to gather information and requirements from clients.

Research-oriented, motivated, proactive, self-starter with strong technical, analytical and interpersonal skills.

Bachelor of Engineering (IT), Osmania University, Hyderabad, India

TECHNICAL SKILLS

Hadoop Ecosystem

HDFS, MapReduce, Hive, Impala, Pig, Sqoop, Flume, Oozie, ZooKeeper, Ambari, Hue, Spark, Storm, Kafka, Cloudera CDH 3/4/5

Web Technologies

HTML, XML, CSS, JavaScript, AngularJS, jQuery

NoSQL Databases

HBase, Cassandra, MongoDB

Databases

Oracle 8i/9i/10g, MySQL

Languages

Java, SQL, PL/SQL, Python, Shell Scripting

Operating Systems

UNIX (OS X, Solaris), Windows, Linux (CentOS, Fedora, Red Hat)

IDE Tools

Eclipse, NetBeans

Application Server

Apache Tomcat, WebLogic, GlassFish

PROFESSIONAL EXPERIENCE

Client: Cerner HealthCare, PA Sep 2014 - Present

Role: Hadoop Developer

Description: Our team is responsible for big data processing: taking healthcare data from a variety of disparate systems and cleansing, normalizing, and standardizing that data so it can be leveraged by cloud-based applications to improve the health of populations as a whole.

The team also supports Children's Healthcare with bedside sensors that continuously track patient vital signs such as blood pressure, heartbeat, and respiratory rate. These sensors produce large chunks of data which, using legacy systems, could not be stored for more than 3 days for analysis. The main motive of Children's Healthcare of Cerner was to store and analyze these vital signs and, if there is any change in pattern, generate an alert to a team of doctors and assistants.

Responsibilities:

Responsible for building scalable distributed data solutions using Hadoop.

Designed the projects using MVC architecture, providing multiple views of the same model and thereby giving efficient modularity and scalability.

The project ingests the data generated by sensors monitoring the patients' body activity; the data is collected into HDFS from online aggregators via Kafka.

Kafka consumers get the data from the patients' different learning systems.

Spark Streaming collects this data from Kafka in near real time, performs the necessary transformations and aggregations on the fly to build the common learner data model, and persists the data in a NoSQL store (HBase), as sketched below.
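
A minimal Java sketch of such a pipeline, assuming the Spark 1.x streaming API with the Kafka 0.8 direct stream and the HBase 1.x client; the broker, topic, table name, column family, and row-key scheme below are illustrative placeholders, not the production design:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;
    import java.util.Set;

    import kafka.serializer.StringDecoder;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.function.VoidFunction;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaPairInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;
    import scala.Tuple2;

    public class VitalsStreamJob {
        public static void main(String[] args) throws Exception {
            JavaStreamingContext jssc = new JavaStreamingContext(
                    new SparkConf().setAppName("VitalsStream"), Durations.seconds(10));

            Map<String, String> kafkaParams = new HashMap<>();
            kafkaParams.put("metadata.broker.list", "broker1:9092");      // placeholder broker
            Set<String> topics = Collections.singleton("patient-vitals"); // placeholder topic

            // Receiver-less direct stream of raw sensor readings from Kafka.
            JavaPairInputDStream<String, String> readings = KafkaUtils.createDirectStream(
                    jssc, String.class, String.class,
                    StringDecoder.class, StringDecoder.class, kafkaParams, topics);

            // Persist each micro-batch into HBase, one row per reading; the
            // pattern-change alerting described above would hook in here as a
            // further transformation on the stream.
            readings.foreachRDD((VoidFunction<JavaPairRDD<String, String>>) rdd ->
                rdd.foreachPartition((Iterator<Tuple2<String, String>> part) -> {
                    try (Connection conn =
                             ConnectionFactory.createConnection(HBaseConfiguration.create());
                         Table table = conn.getTable(TableName.valueOf("vitals"))) {
                        while (part.hasNext()) {
                            Tuple2<String, String> rec = part.next();
                            Put put = new Put(Bytes.toBytes(rec._1()));   // row key = device id
                            put.addColumn(Bytes.toBytes("v"), Bytes.toBytes("raw"),
                                          Bytes.toBytes(rec._2()));
                            table.put(put);
                        }
                    }
                }));

            jssc.start();
            jssc.awaitTermination();
        }
    }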

Used Hadoop's Pig, Hive, and MapReduce to analyze health insurance data, extracting data sets for meaningful information such as medicines, diseases, symptoms, opinions, and geographic region details.

Developed a workflow in Oozie to orchestrate a series of Pig scripts that cleanse data, such as removing personal information or merging many small files into a handful of very large, compressed files, using Pig pipelines in the data preparation stage (see the sketch below).
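
As a sketch of how such a workflow can be launched and monitored from Java, assuming the standard Oozie client API; the Oozie URL, HDFS paths, and property names are placeholders, and the workflow.xml defining the chained Pig actions is deployed separately on HDFS:

    import java.util.Properties;
    import org.apache.oozie.client.OozieClient;
    import org.apache.oozie.client.WorkflowJob;

    public class SubmitCleanseWorkflow {
        public static void main(String[] args) throws Exception {
            OozieClient oozie = new OozieClient("http://oozie-host:11000/oozie"); // placeholder URL

            // Job properties referenced by the workflow.xml of the Pig pipeline.
            Properties conf = oozie.createConfiguration();
            conf.setProperty(OozieClient.APP_PATH, "hdfs://namenode:8020/apps/cleanse-wf");
            conf.setProperty("nameNode", "hdfs://namenode:8020");
            conf.setProperty("jobTracker", "jobtracker:8021");

            String jobId = oozie.run(conf);

            // Poll until the chained Pig actions finish.
            while (oozie.getJobInfo(jobId).getStatus() == WorkflowJob.Status.RUNNING) {
                Thread.sleep(10_000);
            }
            System.out.println("Workflow " + jobId + " finished as "
                    + oozie.getJobInfo(jobId).getStatus());
        }
    }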

Used Pig in three distinct workloads: pipelines, iterative processing, and research.

Used Pig UDFs in Python and Java, and used sampling of large data sets; a minimal UDF sketch follows.
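
A minimal Java Pig UDF in the spirit of the ones described; the normalization logic is illustrative. Once packaged into a jar, it is registered in the Pig script with REGISTER and called like a built-in function:

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Illustrative UDF: normalizes a free-text field (e.g. a symptom name)
    // so that downstream joins and groupings match consistently.
    public class NormalizeField extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null; // propagate nulls rather than failing the task
            }
            return input.get(0).toString().trim().toLowerCase();
        }
    }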

Involved in moving all log files generated from various sources to HDFS for further processing through Flume, and processed the files using PiggyBank UDFs.

Extensively used Pig to communicate with Hive via HCatalog and with HBase via storage handlers.

Created Hive tables to store the processed results in a tabular format.

Good experience in Pig Latin scripting and Sqoop scripting.

Involved in transforming data from legacy tables to HDFS and HBase tables using Sqoop.

Implemented exception tracking logic using Pig scripts.

Implemented test scripts to support test driven development and continuous integration.

Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.

Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.

Good understanding of ETL tools and how they can be applied in a Big Data environment.

Environment: Hadoop, MapReduce, Spark, Shark, Kafka, HDFS, Hive, Pig, Oozie, Core Java, Python, Eclipse, HBase, Flume, Cloudera, Oracle 10g, UNIX Shell Scripting.

Client: T-Mobile, NC May 2013 - Aug 2014

Role: Hadoop Developer

Description: T-Mobile is a United States-based holding company that operates telephone and wireless mobile services. The project was to increase sales and customer satisfaction based on the data collected from all the customers. As a Hadoop developer, I worked with the team to extract data from traditional databases into HDFS using Sqoop and to process large data sets using MapReduce and Pig Latin scripts. Using this analysis, we provided statistics, such as purchase history for each kind of mobile phone plan, so that the marketing teams could target the strong points to increase the company's sales and improve the products where customer satisfaction and sales were low.

Responsibilities:

Exported data using Sqoop from HDFS to Teradata on a regular basis, as sketched below.
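
A sketch of such an export driven through Sqoop's Java entry point, assuming the Teradata JDBC driver is on the classpath; every connection detail, table name, and path below is a placeholder, and the same arguments work as a sqoop export command line:

    import org.apache.sqoop.Sqoop;

    public class HdfsToTeradataExport {
        public static void main(String[] args) {
            String[] exportArgs = {
                "export",
                "--connect", "jdbc:teradata://td-host/DATABASE=sales",   // placeholder DSN
                "--driver", "com.teradata.jdbc.TeraDriver",
                "--username", "etl_user",
                "--password-file", "/user/etl/.td_password",
                "--table", "CUST_USAGE",                                 // placeholder target table
                "--export-dir", "/warehouse/usage/daily",                // HDFS source directory
                "--input-fields-terminated-by", "\t"
            };
            // Sqoop.runTool parses the arguments and launches the export MapReduce job.
            System.exit(Sqoop.runTool(exportArgs));
        }
    }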

Converted MapReduce and Pig jobs into Spark jobs.

Transformed the Hive jobs into Spark SQL (see the sketch below).
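
A minimal Java sketch of this migration pattern, assuming Spark 1.x: the HiveContext reads the existing Hive metastore, so HiveQL written for MapReduce runs unchanged on Spark executors. The table and query are illustrative:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.hive.HiveContext;

    public class HiveOnSpark {
        public static void main(String[] args) {
            JavaSparkContext sc = new JavaSparkContext(
                    new SparkConf().setAppName("HiveOnSpark"));
            HiveContext hive = new HiveContext(sc.sc());

            // The same HiveQL that previously ran as a MapReduce job.
            DataFrame planCounts = hive.sql(
                "SELECT plan_type, COUNT(*) AS subscribers "
              + "FROM customer_plans GROUP BY plan_type");   // placeholder table/query
            planCounts.show();

            sc.stop();
        }
    }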

Responsible for managing data coming from different sources.

Implemented a POC using Spark and Cassandra.

Developed scripts and batch jobs to schedule various Hadoop programs.

Wrote Hive queries for data analysis to meet the business requirements.

Created Hive tables and worked on them using HiveQL.

Involved in managing and reviewing Hadoop log files.

Experienced in defining job flows.

Involved in creating the workflow to run multiple Hive and Pig jobs, which run independently based on time and data availability.

Developed Pig Latin scripts for the analysis of semi-structured data and was involved in debugging Pig scripts.

Gained good experience with the NoSQL database Cassandra.

Worked closely with data warehouse architect and business intelligence analyst to develop solutions.

Involved in designing and developing the Hive data model, loading it with data, and writing Java UDFs for Hive.

Developed Pig Latin scripts and used Pig as an ETL tool for transformations, event joins, and filters.

Designed and Developed Sqoop scripts to extract data from a relational database into Hadoop.

Responsible for performing peer code reviews.

Involved in creating Hive tables, loading with data and writing hive queries which will run internally in map reduce way.

Environment: Cloudera Hadoop (CDH 4.4), MapReduce, HDFS, Hive, Pig, Cassandra, Linux, XML, MySQL, MySQL Workbench, Java 6, Eclipse, PL/SQL, Spark, SQL connector, Subversion.

Client: MasterCard, GA Sep 2012 - Mar 2013

Role: Hadoop Developer

Description: As part of MasterCard, we used Sqoop to import the data from SQL DB. A Pig script generates the confirmation file by filtering the necessary data. E-statement data is generated and stored in HDFS. Later, Hive generates the other files (card details, account details, and legal details), including account numbers. Final consolidation is done using a Hive join on all generated tables for the final result.

MasterCard Financial provides clients with access to premier financial products and services. MasterCard's customer and accounting information is stored in Oracle and mainframes. Due to high maintenance cost, and because the volume of data is huge and growing, MasterCard is moving the data to HDFS as part of an RDBMS-to-Hadoop migration. Hadoop helps implement business rules, predictive analytics on risk or fraud, compliance, and marketing.

Responsibilities:

Part of the team developing and writing Pig scripts.

Loaded the data from the RDBMS server to Hive using Sqoop.

Created Hive tables to store the processed results in a tabular format.

Developed Sqoop scripts to enable the interaction between Hive and the MySQL database.

Developed Java Mapper and Reducer programs for complex business requirements.

Developed Java custom record readers, partitioners, and serialization techniques; a condensed sketch follows.
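
A condensed Java sketch of the mapper/reducer/partitioner combination described; the CSV layout, key choice, and routing rule are illustrative assumptions:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Partitioner;
    import org.apache.hadoop.mapreduce.Reducer;

    // Illustrative job pieces: count transactions per account.
    public class TxnCount {

        public static class TxnMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text outKey = new Text();

            @Override
            protected void map(LongWritable offset, Text line, Context ctx)
                    throws IOException, InterruptedException {
                String[] fields = line.toString().split(",");   // assumed CSV layout
                outKey.set(fields[0]);                          // account number in column 0
                ctx.write(outKey, ONE);
            }
        }

        // Custom partitioner: route keys by the leading digit of the account
        // number so related accounts land on the same reducer.
        public static class PrefixPartitioner extends Partitioner<Text, IntWritable> {
            @Override
            public int getPartition(Text key, IntWritable value, int numPartitions) {
                return key.toString().charAt(0) % numPartitions; // char is non-negative
            }
        }

        public static class TxnReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> counts, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable c : counts) sum += c.get();
                ctx.write(key, new IntWritable(sum));
            }
        }
    }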

Used different data formats (Text format and ORC format) while loading the data into HDFS.

Created Managed tables and External tables in Hive and loaded data from HDFS.

Performed complex HiveQL queries on Hive tables.

Optimized Hive tables using partitioning and bucketing to provide better performance with HiveQL queries.

Created partitioned tables and loaded data using both the static and dynamic partition methods.

Created custom user-defined functions (UDFs) in Hive, as sketched below.
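
A minimal Java Hive UDF of that kind, using the classic UDF base class; the masking rule is illustrative. Once built into a jar, it is wired up in Hive with ADD JAR and CREATE TEMPORARY FUNCTION:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Illustrative UDF: masks a card/account number down to its last four digits.
    public final class MaskAccount extends UDF {
        public Text evaluate(Text account) {
            if (account == null) {
                return null;
            }
            String s = account.toString();
            if (s.length() <= 4) {
                return account;
            }
            return new Text("****" + s.substring(s.length() - 4));
        }
    }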

Performed Sqoop imports from Oracle to load the data into HDFS and directly into Hive tables.

Developed Pig Scripts to store unstructured data in HDFS.

Scheduled MapReduce jobs in the production environment using the Oozie scheduler.

Used Hadoop logs to debug the scripts.

Environment: Hadoop CDH 4.1.1, Pig, Avro, Oozie 3.2.0, Sqoop, Hive, Java 1.6, Eclipse, Teradata, HBase.

Client: HDFC Bank, India Jan 2011 - Aug 2012

Role: Java Developer

Responsibilities:

Involved in Requirement analysis and design phase of Software Development Life cycle (SDLC).

Designed, developed, and modified front-end UI using HTML, CSS, JavaScript, and jQuery.

Involved in designing the front-end screens and prepared low-level design documents for the project.

Wrote complex SQL and PL/SQL queries for stored procedures.

Generated unit test cases with the help of internal tools.

Used JavaScript, jQuery and JGrid for development.

Used HTML and CSS for the enriched front end.

Developed client applications to consume SOAP-based web services.

Designed the projects using MVC architecture, providing multiple views and thereby efficient modularity and scalability.

Performed business validations at the back-end using Java modules and at the front-end using JavaScript.

Performed building and deployment of EAR, WAR, and JAR files on test and stage systems in WebLogic Application Server.

Used the Singleton, DAO, DTO, Session Facade, and MVC design patterns.

Involved in resolving Production Issues, Analysis, Troubleshooting and Problem Resolution.

Involved in development and deployment of application on Linux environment.

Involved in defect Tracking, Analysis and Resolution of bugs in system testing.

Involved in Designing and creating database tables.

Prepared project metrics for time, cost, schedule, and customer satisfaction (the health of the project).

Environment: Java, J2EE, XML, Spring, Struts, Hibernate, Design Patterns, Maven, Eclipse, Toad, Apache Tomcat, and Oracle.

Client: Avantel, India Aug 2009 - Dec 2010

Role: Java/J2EE Developer

Responsibilities:

Worked with Java, J2EE, Struts, web services, and Hibernate in a fast-paced development environment.

Followed agile methodology, interacted directly with the client on features, implemented optimal solutions, and tailored the application to customer needs.

Involved in design and implementation of web tier using Servlets and JSP.

Used Apache POI for reading Excel files, as sketched below.
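
A minimal sketch of that usage; the file name and sheet layout are placeholders, and WorkbookFactory lets the same code open both .xls and .xlsx files:

    import java.io.FileInputStream;
    import org.apache.poi.ss.usermodel.Cell;
    import org.apache.poi.ss.usermodel.Row;
    import org.apache.poi.ss.usermodel.Sheet;
    import org.apache.poi.ss.usermodel.Workbook;
    import org.apache.poi.ss.usermodel.WorkbookFactory;

    public class ExcelReader {
        public static void main(String[] args) throws Exception {
            try (FileInputStream in = new FileInputStream("trades.xls")) { // placeholder file
                Workbook workbook = WorkbookFactory.create(in);
                Sheet sheet = workbook.getSheetAt(0);
                // Walk every populated row and cell on the first sheet.
                for (Row row : sheet) {
                    for (Cell cell : row) {
                        System.out.print(cell.toString() + "\t");
                    }
                    System.out.println();
                }
            }
        }
    }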

Developed the user interface using JSP and JavaScript to view all online trading transactions.

Designed and developed Data Access Objects (DAO) to access the database.

Used the DAO Factory and Value Object design patterns to organize and integrate the Java objects.

Coded Java Server Pages for the dynamic front-end content that uses Servlets and EJBs.

Coded HTML pages using CSS for static content generation with JavaScript for validations.

Used the JDBC API to connect to the database and carry out database operations, as sketched below.
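
A minimal sketch of that pattern; the JDBC URL, credentials, and schema are placeholders:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class AccountDao {
        // Illustrative lookup: fetch one account balance with a parameterized query.
        public double balanceFor(String accountId) throws SQLException {
            String url = "jdbc:oracle:thin:@dbhost:1521:ORCL"; // placeholder URL
            try (Connection conn = DriverManager.getConnection(url, "app_user", "secret");
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT balance FROM accounts WHERE account_id = ?")) {
                ps.setString(1, accountId);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getDouble("balance") : 0.0;
                }
            }
        }
    }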

Used JSP and JSTL Tag Libraries for developing User Interface components.

Performed code reviews.

Performed unit testing, system testing and integration testing.

Involved in building and deployment of application in Linux environment.

Environment: Java, J2EE, JDBC, Struts, SQL, Hibernate, Eclipse, Apache POI, CSS.


