
Data Developer

Location:
Wayne, PA
Posted:
June 20, 2016


Tejaswi

Email: acvcci@r.postjobfree.com

Phone: 201-***-****

Professional Summary

●Around 5 years of professional IT experience, with hands-on experience in the development of Big Data technologies and data analytics.

●Experienced as a Hadoop Developer with good knowledge of MapReduce, YARN, HBase, Cassandra, Pig, Hive, and Sqoop.

●Extensive work experience in Object-Oriented Analysis and Design and Java/J2EE technologies, including HTML5, XHTML, DHTML, JavaScript, JSTL, CSS, AJAX, and Oracle, for developing server-side applications and user interfaces.

●Experience with distributed systems, large-scale non-relational data stores, NoSQL map-reduce systems, data modeling, database performance tuning, and multi-terabyte data warehouses.

●Excellent understanding and knowledge of NoSQL databases like HBase and Cassandra.

●Excellent understanding of Hadoop architecture, the Hadoop Distributed File System, and their APIs.

●Good exposure to the Apache Hadoop MapReduce programming architecture and APIs.

●Experienced in running MapReduce and Spark jobs over YARN.

●Experienced in writing custom MapReduce I/O formats and key-value formats.

●Hands-on experience in installing, configuring, and maintaining Hadoop clusters.

●Expert in working with the Hive data warehouse tool: creating tables, distributing data by implementing partitioning and bucketing, and writing and optimizing HiveQL queries (an illustrative sketch follows this summary).

●Familiar with writing MapReduce jobs for processing data over a Cassandra cluster.

●Experienced in writing MapReduce jobs over HBase, including custom Filters and Coprocessors.

●Hands-on experience in import/export of data using the Hadoop data management tool Sqoop.

●Used Hive and Pig for performing data analysis.

●Familiar with MongoDB concepts and its architecture.

●Experienced with moving data from Teradata to HDFS using Teradata connectors.

●Good experience in all the phases of Software Development Life Cycle (Analysis of requirements, Design, Development, Verification and Validation, Deployment).

●Hands-on experience with "productionizing" Hadoop applications (e.g., administration, configuration management, monitoring, debugging, and performance tuning).

●Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.

●Experience working with Java/J2EE, JDBC, ODBC, JSP, Java Beans, and Servlets.

●Experience with AJAX, REST, and JSON.

●Experience using IDEs like Eclipse and DBMSs like Oracle and MySQL.

●Evaluate and propose new tools and technologies to meet the needs of the organization.

●Good knowledge in Unified Modeling Language (UML), Object Oriented Analysis and Design and Agile Methodologies.

●An excellent team player and self-starter with effective communication skills.
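
The Hive partitioning and bucketing work summarized above could be exercised from Java roughly as follows. This is a minimal sketch, not code from any of the engagements below: the table name, columns, bucket count, and HiveServer2 URL (jdbc:hive2://localhost:10000/default) are illustrative assumptions, and it relies only on the standard Hive JDBC driver.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Minimal sketch: creating and loading a partitioned, bucketed Hive table over JDBC.
// Table/column names and the HiveServer2 URL are illustrative assumptions.
public class HivePartitionSketch {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "hive", "");
             Statement stmt = conn.createStatement()) {

            // Partition by load date; bucket by account id for sampling and bucketed joins.
            stmt.execute(
                "CREATE TABLE IF NOT EXISTS claims (" +
                " account_id STRING, claim_amount DOUBLE)" +
                " PARTITIONED BY (load_date STRING)" +
                " CLUSTERED BY (account_id) INTO 32 BUCKETS" +
                " STORED AS ORC");

            // Dynamic-partition insert from a staging table (assumed to exist).
            stmt.execute("SET hive.exec.dynamic.partition=true");
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
            stmt.execute(
                "INSERT INTO TABLE claims PARTITION (load_date)" +
                " SELECT account_id, claim_amount, load_date FROM claims_stage");
        }
    }
}

Partitioning by load date keeps scans to the relevant date slices, while bucketing by account id supports sampling and bucketed joins.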

Technology:

Hadoop/Big Data/NoSQL Technologies

HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Avro, Hadoop Streaming, YARN, Zookeeper, HBase

Programming Languages

Java, Python, C, SQL, PL/SQL, Shell Script

IDE Tools

Eclipse, Rational Team Concert, NetBeans

Framework

Hibernate, Spring, Struts, JMS, EJB, JUnit, MRUnit, JAXB

Web Technologies

HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, XHTML, REST Web Services

Application Servers

JBoss, Tomcat, WebLogic, WebSphere

Databases

Oracle 11g/10g/9i, MySQL, DB2, Derby, MS-SQL Server

Operating Systems

Linux, UNIX, Windows

Build Tools

Jenkins, Maven, ANT

Reporting/BI Tools

Jasper Reports, iReport, Tableau, QlikView

Education:

Jawaharlal Nehru Technological University-India

Bachelor of Engineering.

Professional Experience:

Cummins, Columbus, IN Jan 2015 – Present

Hadoop Developer

Cummins is one of the leading automobile engine manufacturers, based in Indiana. This project deals with the interface between the RC portal and the MapR database. The work covered requirement analysis, design, and development; data design and mapping, extraction, and validation; implementation of complex business requirements; development of Big Data/Hadoop components using complex data manipulation and windowing analysis in Hive; and implementation of job workflows and scheduling for end-to-end application processing.

Roles & Responsibilities:

●Implemented Hadoop framework to capture user navigation across the application to validate the user interface and provide analytic feedback/result to the UI team.

●Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.

●Performed analysis on unused user navigation data by loading it into HDFS and writing MapReduce jobs. The analysis provided inputs to the new APM front-end developers and the Lucent team.

●Worked with Cassandra for non-relational data storage and retrieval on enterprise use cases.

●Wrote MapReduce jobs using the Java API and Pig Latin (see the illustrative sketch after this list).

●Loaded the data from Teradata to HDFS using Teradata Hadoop connectors.

●Used Flume to collect, aggregate and store the web log data onto HDFS.

●Wrote Pig scripts to run ETL jobs on the data in HDFS.

●Used Hive to do analysis on the data and identify different correlations.

●Wrote ad-hoc HiveQL queries to process data and generate reports.

●Involved in HDFS maintenance and in administering it through the Hadoop Java API.

●Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.

●Worked on HBase; configured a MySQL database to store the Hive metadata.

●Imported data using Sqoop to load data from MySQL to HDFS on a regular basis.

●Wrote Hive queries for data analysis to meet the business requirements.

●Automated all the jobs for pulling data from the FTP server and loading it into Hive tables, using Oozie workflows.

●Involved in creating Hive tables and working on them using HiveQL.

●Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them.

●Maintained and monitored the Hadoop clusters.

●Utilized Agile Scrum Methodology to help manage and organize a team of 4 developers with regular code review sessions.

●Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
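
A MapReduce job of the kind mentioned above (the bullet on MapReduce jobs using the Java API) might be structured along these lines. It is a generic sketch assuming comma-separated navigation logs of the form userId,page,timestamp; the field layout, class names, and paths are assumptions, not details of the actual application.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Minimal sketch: count page hits per user from comma-separated navigation logs.
// The log layout (userId,page,timestamp) is an assumption for illustration.
public class NavigationCount {

    public static class NavMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text userId = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length >= 2) {               // skip malformed records
                userId.set(fields[0]);
                context.write(userId, ONE);
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int total = 0;
            for (IntWritable v : values) {
                total += v.get();
            }
            context.write(key, new IntWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "navigation-count");
        job.setJarByClass(NavigationCount.class);
        job.setMapperClass(NavMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}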

Environment: Hadoop, MapReduce, HDFS, Flume, Pig, Hive, Spark, Scala, YARN, HBase, Sqoop, ZooKeeper, Cloudera, Oozie, Cassandra, NoSQL, ETL, MySQL, Agile, Windows, UNIX Shell Scripting, Teradata.

State Street Bank, Princeton, NJ Nov 2013 – Dec 2014

Hadoop Developer

The company offers traditional banking services to customers through its bank branches located throughout the greater New Jersey metropolitan area. Hadoop technologies are used for large-scale data processing that cannot be done with traditional databases. Tools capable of processing billions of records were used in this project to calculate promotional prices by applying various business transformations.

Roles & Responsibilities:

●Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats

●Developed MapReduce programs that filter bad and unnecessary claim records and find unique records based on account type

●Processed semi-structured and unstructured data using MapReduce programs

●Implemented daily cron jobs that automate the parallel tasks of loading data into HDFS and pre-processing it with Pig, using Oozie coordinator jobs

●Implemented custom data types, InputFormat, RecordReader, OutputFormat, and RecordWriter classes for MapReduce computations

●Worked on CDH4 cluster on CentOS.

●Successfully migrated a legacy application to a Big Data application using Hive/Pig/HBase at the production level

●Transformed date-related data into an application-compatible format by developing Apache Pig UDFs (see the UDF sketch after this list)

●Developed a MapReduce pipeline for feature extraction and tested the modules using MRUnit (an MRUnit sketch appears at the end of this section)

●Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms

●Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs

●Responsible for performing extensive data validation using Hive

●Implemented Partitioning, Dynamic Partitions and Bucketing in Hive for efficient data access

●Worked on different types of tables, such as External Tables and Managed Tables

●Used Oozie workflow engine to run multiple Hive and Pig jobs

●Involved in installing and configuring Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.

●Involved in designing and developing non-trivial ETL processes within Hadoop using tools like Pig, Sqoop, Flume, and Oozie

●Used DML statements to perform different operations on Hive Tables

●Developed Hive queries for creating foundation tables from stage data

●Used Pig as an ETL tool to perform transformations, event joins, filtering, and some pre-aggregations

●Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior

●Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources

●Worked with Sqoop to export analyzed data from the HDFS environment into an RDBMS for report generation and visualization purposes

●Queried and analyzed data from DataStax Cassandra for quick searching, sorting, and grouping

●Developed Mapping document for reporting tools
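
A Pig UDF of the kind referenced above (the date-transformation bullet) might look roughly like this. It is a hypothetical sketch: the class name and the input/output date formats are assumptions for illustration.

import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Minimal sketch of a Pig EvalFunc that normalizes a date string
// from MM/dd/yyyy to yyyy-MM-dd; both formats are illustrative assumptions.
public class NormalizeDate extends EvalFunc<String> {
    private final SimpleDateFormat in = new SimpleDateFormat("MM/dd/yyyy");
    private final SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd");

    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;                      // let Pig treat empty input as null
        }
        try {
            return out.format(in.parse((String) input.get(0)));
        } catch (ParseException e) {
            return null;                      // unparseable dates become null
        }
    }
}

In a Pig script this would be used along the lines of REGISTER myudfs.jar; followed by GENERATE NormalizeDate(claim_date); the jar name and field name are placeholders.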

Environment: Apache Hadoop, HDFS, MapReduce, Java (JDK 1.6), MySQL, DbVisualizer, Linux, Sqoop, Apache Hive, Apache Pig
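
The MRUnit testing mentioned in this section could be set up roughly as follows. The mapper under test is a tiny hypothetical stand-in (it emits the account type from a CSV claim record), not the project's actual feature-extraction pipeline.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Test;

// Minimal MRUnit sketch: the mapper below is a hypothetical stand-in that emits
// (accountType, 1) from a CSV claim record of the form id,accountType,amount.
public class ClaimTypeMapperTest {

    public static class ClaimTypeMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length == 3) {
                context.write(new Text(fields[1]), new IntWritable(1));
            }
        }
    }

    @Test
    public void emitsAccountTypeForValidRecord() throws Exception {
        MapDriver<LongWritable, Text, Text, IntWritable> driver =
                MapDriver.newMapDriver(new ClaimTypeMapper());
        driver.withInput(new LongWritable(1L), new Text("c1001,SAVINGS,250.00"))
              .withOutput(new Text("SAVINGS"), new IntWritable(1))
              .runTest();
    }
}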

Merck & Co, Boston, MA (offshore) June 2011 – Sept 2013

Java Developer

Merck & Co is an independent practice association (IPA) serving some 300,000 health plan members in northern California. The company contracts with managed care organizations throughout the region -- including HMOs belonging to Aetna, CIGNA, and Health Net -- to provide care to health plan members through its provider affiliates. Its network includes about 3,700 primary care and specialty physicians, 36 hospitals, and 15 urgent care centers. The company also provides administrative services for doctors and patients. PriMed, a management services organization, created Hill Physicians Medical Group in 1984 and still runs the company.

Responsibilities:

As this project involved a very large volume of data, we prototyped a Hadoop cluster to handle the data in a distributed way.

●Installed and configured Hadoop clusters for the Dev, QA, and Production environments

●Installed and configured the Hadoop NameNode HA service using ZooKeeper.

●Installed and configured Hadoop security and access controls using Kerberos and Active Directory

●Imported data from a SQL Server database to HDFS using Sqoop.

●Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.

●Used Eclipse to develop J2EE components. Components involved a lightweight JSP front end; CSS and scripts were also part of the front-end development

●Designed J2EE project with Front Controller pattern

●Designed CSS and tag libraries for Front end

●Made extensive use of JavaScript and AJAX to control all user functions (see the servlet sketch after this list).

●Developed front-end screens using jQuery, JavaScript, Java, and CSS

●Attended business and requirements meetings

●Used ANT to create build scripts for deployment and to run the JUnit test cases

●Used VSS extensively for code check-in, check-out, and versioning, and maintained production, test, and development views appropriately.

●Understood the sources of data and organized them in a structured table setup

●Delivered daily reports and data sheets to clients for their business meetings.

●Performed code review, unit testing, and local integration testing.

●Integrated application modules and components and deployed them on the target platform.

●Involved in the requirement study and preparation of the detailed software requirement specification.

●Involved in low-level and high-level design and in the preparation of HLD and LLD documents using Visio

●Provided testing support during integration and production
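
The AJAX/JSON front-end work referenced above could be served by a servlet along these lines. This is a minimal sketch: the servlet name, the memberId parameter, and the hard-coded status value are illustrative assumptions, and the JSON is assembled by hand to avoid assuming any particular library.

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Minimal sketch of a servlet answering an AJAX request with JSON.
// The URL mapping (web.xml) and the "memberId" parameter are illustrative assumptions.
public class MemberStatusServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String memberId = request.getParameter("memberId");

        response.setContentType("application/json");
        response.setCharacterEncoding("UTF-8");

        // In the real application this would come from a DAO/JDBC lookup.
        String json = "{\"memberId\":\"" + (memberId == null ? "" : memberId)
                + "\",\"status\":\"ACTIVE\"}";
        response.getWriter().write(json);
    }
}

On the page, a jQuery call such as $.getJSON("memberStatus", {memberId: id}, render) would consume this response; the URL mapping is assumed to be declared in web.xml.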

Environment: Hadoop, Hive, Sqoop, ZooKeeper, MapReduce, WebSphere, DB2, IBM RAD, JDK 1.6, JSP, J2EE, HTML, JavaScript, CSS, Servlets, Struts, JDBC, Oracle, SQL, Log4j, JUnit, VSS, Ant, Shell script, Visio.


