Deepa Reddy
Spark/Hadoop Developer
Contact: 214-***-****
Email: **********@*****.***
Professional Summary
8+ years of professional IT experience, including 3+ years in Big Data, Hadoop development, and ecosystem analytics, along with the design and development of Java-based enterprise applications.
3+ years of experience with Spark Core and its libraries using Java.
Experience in both MapReduce MRv1 and MRv2 (YARN).
Experience working with Pig, Hive, Sqoop, Flume, Oozie, ZooKeeper, Spark, HBase, and Kafka.
Experience with Cloudera CDH3, CDH4, and CDH5 distributions.
Expert in creating Pig and Hive UDFs in Java to analyze data efficiently.
Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
Leveraged AWS cloud infrastructure for storage (EBS, EFS) and data warehousing (Redshift, RDS).
Experience with core distributed computing and data mining libraries in Apache Spark.
Successfully loaded files into Hive and HDFS from Oracle and SQL Server using Sqoop.
Developed MapReduce jobs to automate the transfer of data from HBase.
Good working knowledge on Eclipse IDE for developing and debugging Java applications
Good experience with SQL Server databases, including stored procedures, constraints, and triggers.
Experience in designing and developing POCs deployed on YARN clusters, comparing the performance of Spark with Hive and SQL/Oracle.
Worked with Oracle and Teradata for data import/export operations.
Good knowledge of Kafka and messaging systems.
Experienced in creating and analyzing Software Requirement Specifications (SRS) and Functional Specification Documents (FSD); strong knowledge of the Software Development Life Cycle (SDLC).
Extensive knowledge in creating PL/SQL stored procedures, packages, functions, and cursors against Oracle, as well as NoSQL databases such as MongoDB and Cassandra.
Worked on Windows and Unix/Linux platforms with technologies including Big Data, Java, XML, HTML, SQL, PL/SQL, Python, shell scripting, and Business Intelligence.
Strong knowledge of Flume, Spark, and Kafka.
Experience in Scrum, Agile and Waterfall models.
Experience using S3 buckets on AWS to store CloudFormation templates.
Good communication skills; committed, result-oriented, and hard-working, with a drive to learn new technologies.
EDUCATION and CERTIFICATIONS:
Bachelor of Computer Science and Engineering - India
Oracle Certified Developer in Java
TECHNICAL SKILLS:
Hadoop Framework
HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, Zookeeper, Flume and HBase, Spark, Kafka, Storm, Impala
Databases
Microsoft SQL Server, MySQL, Oracle, Teradata, NoSQL (HBase, MongoDB, Cassandra).
Languages
C, C++, Java, Scala, SQL, T-SQL, Pig Latin, Visual Basic, Python.
Web Technologies
JSP, Servlets, JavaBeans, JDBC, JSF, XML, Node.js, JavaScript
Operating Systems
Windows 10, Windows XP/2000/NT/9x, UNIX, Linux
Front-End
HTML/HTML5, CSS3, JavaScript/jQuery
Development Tools
Microsoft SQL Studio, Toad, Eclipse, NetBeans, MySQL Workbench
Reporting Tool
SSRS, Succeed
IDE Tools:
Eclipse, NetBeans
Development Methodologies
Agile/Scrum, Waterfall, UML, Design Patterns, SDLC
Other skills
Business Intelligence, Tableau, Internet of Things, Git, SVN, AWS.
DIRECTV, El Segundo, CA Oct 2015 – Present
ROLE: Spark/Hadoop Developer
RESPONSIBILITIES:
Designing and creating stories for the development and testing of the application.
Configuring and performance tuning the Sqoop jobs for importing the raw data from the data warehouse.
Developing Hive queries using partitioning, bucketing, and windowing functions.
Designed and developed entire pipeline from data ingestion to reporting tables.
Designed and developed Pig Latin scripts and Pig command-line transformations for data joins and custom processing of MapReduce outputs.
Creating HBase tables for random reads/writes by the MapReduce programs.
Developed Kafka-Storm integration and built various topologies to process data from point products.
Creating data model, schemas and stored procedures for reporting database.
Designing and creating Oozie workflows to schedule and manage Hadoop, Pig, and Sqoop jobs.
Knowledgeable in Spark and Scala, mainly through framework exploration for the transition from Hadoop/MapReduce to Spark.
Worked extensively on MongoDB and its performance for storing the application's metadata.
Worked extensively on the Spark Core and Spark SQL modules.
Implemented a custom workflow scheduler service to manage multiple independent workflows, and implemented a web application that uses the Oozie REST API to schedule jobs.
Used Sqoop to migrate data between HDFS and MySQL or Oracle, and deployed Hive-HBase integration to perform OLAP operations on HBase data.
Assisted SQL Server Database Developers in code review and optimizing SQL queries.
Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN (a brief sketch follows this section).
Experience in database design, entity relationships, database analysis, SQL programming, PL/SQL stored procedures, packages, and triggers in Oracle and SQL Server on Windows and Linux.
Actively involved in developing a front-end Spring web application for consumers to create custom profiles for data processing.
Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
Installed and configured Pig for ETL jobs; designed a high-level ETL architecture for overall data transfer from OLTP to OLAP.
Implemented a variety of AWS computing and networking services to meet application needs.
Expertise in AWS data migration between different database platforms, such as SQL Server to Amazon Aurora, using RDS tools.
Actively involved in deploying and testing the application in different environments.
Configuring the Hadoop environment: Kerberos authentication, DataNodes, NameNodes, MapReduce, Hive, Pig, Sqoop, and the Oozie workflow engine.
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Oozie, Cloudera CDH 4.5, Kerberos security, SQL, ETL, Linux, Java, J2EE, Eclipse Kepler IDE, Web services, DB2, Spark, MongoDB, AWS.
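The Spark work noted above can be illustrated with a minimal sketch against the Spark 2.x Java API; the application name, HDFS paths, view name, and column names below are hypothetical and not taken from the project.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ViewershipAggregation {
    public static void main(String[] args) {
        // Runs on YARN when launched with: spark-submit --master yarn ...
        SparkSession spark = SparkSession.builder()
                .appName("ViewershipAggregation")
                .enableHiveSupport()
                .getOrCreate();

        // Hypothetical raw data landed on HDFS (e.g., by Sqoop imports)
        Dataset<Row> events = spark.read()
                .option("header", "true")
                .csv("hdfs:///data/raw/viewership/");

        // Register an in-memory view and aggregate it with Spark SQL
        events.createOrReplaceTempView("viewership_events");
        Dataset<Row> dailyCounts = spark.sql(
                "SELECT event_date, channel_id, COUNT(*) AS views "
              + "FROM viewership_events GROUP BY event_date, channel_id");

        // Write the reporting table back to HDFS as Parquet
        dailyCounts.write().mode("overwrite")
                .parquet("hdfs:///data/reporting/daily_views/");

        spark.stop();
    }
}
```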
CLIENT: JP Morgan Chase. Mar 2014 – Aug 2015
Dallas, Texas
ROLE: Hadoop Developer
Responsibilities:
Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
Loaded home mortgage data from the existing DWH tables (SQL Server) to HDFS using Sqoop.
Created Hive Tables, loaded retail transactional data from Teradata using Sqoop.
Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig.
Developed UDFs in Pig and Hive using Java (a brief sketch follows this section).
Wrote Hive Queries to have a consolidated view of the mortgage and retail data.
Created multiple Hive tables, implemented Partitioning, Dynamic Partitioning and Buckets in Hive for efficient data access.
Wrote Pig Latin scripts to process the data, and wrote UDFs in Java and Python for both Hive and Pig.
Created Spark Scala applications to transform data using Spark RDDs, and used Spark SQL for SQL queries and to connect to the Hive metastore.
Created Hive external tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs; used custom SerDes based on the structure of the input files so that Hive knows how to load them into Hive tables.
Exported analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
Extracted data from a NoSQL database (MongoDB) and processed it in Spark using the MongoDB Spark connector.
Generated final reporting data in Tableau for testing by connecting to the corresponding Hive tables using the Hive ODBC connector.
Experience using Tableau for data integration, OLAP analysis, and ETL processes.
Migrated an existing on-premises application to AWS.
Experience enabling and administering AWS cloud infrastructure for various applications, including analytics, enterprise applications, and mobile services.
Completed a POC exploring Spark to improve performance, optimize existing Pig scripts in Hadoop, and examine Hive query performance.
Environment: Hadoop, HDFS, MapReduce, Sqoop, Hive, Flume, Oozie, ZooKeeper, Cloudera distribution, MySQL, ETL, Eclipse, Spark, Python, MongoDB, Kafka.
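As a minimal sketch of the Java-based Hive UDF work mentioned above, the class below masks account numbers; the class name, behavior, and registration statements are hypothetical examples rather than the project's actual UDFs.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Hypothetical Hive UDF that masks all but the last four characters of an
 * account number. It could be registered in Hive with:
 *   ADD JAR mask-udf.jar;
 *   CREATE TEMPORARY FUNCTION mask_account AS 'MaskAccountUDF';
 */
public class MaskAccountUDF extends UDF {
    public Text evaluate(Text account) {
        if (account == null) {
            return null;
        }
        String value = account.toString();
        if (value.length() <= 4) {
            return new Text(value);
        }
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < value.length() - 4; i++) {
            masked.append('*');                              // hide leading characters
        }
        masked.append(value.substring(value.length() - 4));  // keep last four
        return new Text(masked.toString());
    }
}
```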
Client: GEICO May 2013 – Mar 2014
Richardson, Dallas
Role: Java Developer
Responsibilities:
Created the UI tool using Java, XML, DHTML, and JavaScript.
Wrote stored procedures using PL/SQL for data retrieval from different tables.
Worked extensively on bug fixes on the server side and made cosmetic changes on the UI side.
Part of performance tuning team and implemented caching mechanism and other changes.
Recreated the system architecture diagram and created numerous new class and sequence diagrams.
Designed and developed the UI using HTML, JSP, and Struts, where users can view all items listed for auction.
Developed authentication and authorization modules so that only authorized persons can access inventory-related operations.
Developed controller servlets and Action and Form objects for interacting with the Oracle database and retrieving dynamic data.
Responsible for coding SQL statements and stored procedures for back-end communication using JDBC (a brief sketch follows this section).
Experience in technologies such as Web Services (REST/SOAP) using Spring, Hibernate, XML, XSD, JSON.
Used SoapUI for testing SOA and web services (SOAP, RESTful, WSDL).
Developed the login screen so that only authenticated and authorized administrators can access the application.
Developed various features, such as transaction history and product search, that enable users to work with the system efficiently.
Involved in preparing project documentation to help users understand the system.
Environment: JDK 1.2, JavaScript, HTML, DHTML, XML, Struts, JSP, Servlets, JNDI, J2EE, Tomcat, Rational Rose, Oracle.
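A minimal sketch of the JDBC-to-stored-procedure pattern referenced above, written against a modern JDK for brevity; the connection URL, credentials, procedure name, and parameters are hypothetical placeholders for the auction application's actual back-end calls.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class AuctionBidDao {
    // Hypothetical Oracle connection details
    private static final String URL = "jdbc:oracle:thin:@//dbhost:1521/AUCTIONDB";

    /** Calls a hypothetical stored procedure that records a bid and returns its new id. */
    public long recordBid(long itemId, long userId, double amount) throws Exception {
        try (Connection conn = DriverManager.getConnection(URL, "app_user", "app_pwd");
             CallableStatement stmt = conn.prepareCall("{call RECORD_BID(?, ?, ?, ?)}")) {
            stmt.setLong(1, itemId);
            stmt.setLong(2, userId);
            stmt.setDouble(3, amount);
            stmt.registerOutParameter(4, Types.NUMERIC); // OUT: generated bid id
            stmt.execute();
            return stmt.getLong(4);
        }
    }
}
```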
Client: Premier Healthcare Alliance Nov 2012 – April 2013
Charlotte, NC
Role: Java Developer
Responsibilities:
Analyzed requirements and created detailed Technical Design Document.
Involved in all the phase of the Software Development Life Cycle.
Review, Revise and approve functional requirements.
Created UML Diagrams using Microsoft Visio.
Developed screens for capturing information using JSP, Struts tag libraries, JavaScript, HTML, JAXB, XPath, and XQuery.
Designed the application work-flow using Struts and authored struts configuration files like validator.xml, struts-config.xml and validation-rules.xml files.
Developed web services.
Used the Struts Validator framework for all front-end form validations.
Used Java RMI to write distributed objects and wrote shell scripts for building and deploying the applications.
Implemented JMS messaging interface with MQ Series.
Used Hibernate for object-relational mapping and data persistence (a brief sketch follows this section).
Developed the Database interaction classes using JDBC.
Created JUnit test cases and ANT scripts for build automation.
Environment: Java, J2EE 1.4, HTML, XML, JDBC, JMS, Servlets, JSP 1.2, Struts 1.2, Hibernate, Web services, Eclipse 3.3, WebSphere 7, Oracle 9i, ANT, Microsoft Visio.
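A minimal sketch of the Hibernate persistence pattern noted above, using the classic Configuration/SessionFactory API; the entity, its fields, and the configuration file name are hypothetical and stand in for the project's actual mappings.

```java
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.cfg.Configuration;

/** Hypothetical mapped entity (mapping assumed to be defined in Claim.hbm.xml). */
class Claim {
    private Long id;
    private String memberId;
    private double amount;
    // getters and setters omitted for brevity
}

/** DAO showing the session/transaction pattern used with Hibernate. */
public class ClaimDao {
    private static final SessionFactory SESSION_FACTORY =
            new Configuration().configure("hibernate.cfg.xml").buildSessionFactory();

    /** Persists a Claim and returns its generated identifier. */
    public Long save(Claim claim) {
        Session session = SESSION_FACTORY.openSession();
        Transaction tx = session.beginTransaction();
        try {
            Long id = (Long) session.save(claim); // assumes a Long identifier
            tx.commit();
            return id;
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }
}
```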
Client: MetLife Insurance Jan 2010 – Aug 2012
Hyderabad, India
Role: Java Developer
Responsibilities:
•Performed analysis of the client requirements based on the detailed design documents that were developed.
•Developed Use Cases, Class Diagrams, Sequence Diagrams and Data Models.
•Implemented this module on an existing framework with the help of my lead.
•Developed the data access classes using JDBC and SQL queries.
•Worked on developing various HTML files and integrating them with CSS.
•Developed UML diagrams for the project and mentored team members on their implementation.
•Developed servlets for the complete project.
•Designed and documented REST/HTTP APIs, including JSON data formats (a brief sketch follows this section).
•Involved in writing Java Server Pages (JSP) for the complete project.
•Followed Agile methodology (Scrum) during development of the project and oversaw software development in sprints by attending daily stand-ups.
•Designed and modified User Interfaces using JSP, JavaScript, and CSS.
•Worked extensively with servlets and Spring-based applications in developing J2EE components.
Environment: Java, J2EE, JDBC, Servlets, JSON, Eclipse IDE, Oracle 11g, Apache Tomcat, Spring, HTML, CSS, JSP.
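A minimal sketch of a servlet-backed REST/JSON endpoint of the kind noted above; the servlet name, URL mapping, request parameter, and JSON fields are hypothetical, and the JSON is hand-built for brevity rather than produced by the project's actual serialization code.

```java
import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

/** Hypothetical read-only endpoint, e.g. mapped in web.xml to GET /policy?id=... */
public class PolicySummaryServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String policyId = req.getParameter("id");
        if (policyId == null) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "missing id");
            return;
        }

        resp.setContentType("application/json");
        resp.setCharacterEncoding("UTF-8");

        // Hand-built JSON for illustration; a real service would query the DAO layer
        PrintWriter out = resp.getWriter();
        out.print("{\"policyId\":\"" + policyId + "\",\"status\":\"ACTIVE\"}");
        out.flush();
    }
}
```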
Client: EMCO Ltd. Aug 2008 – Nov 2009
Hyderabad, India
Role: Java Developer
Responsibilities:
Developed the application under JEE architecture; designed dynamic and browser-compatible user interfaces using JSP, custom tags, HTML, CSS, and JavaScript.
Deployed and maintained the JSP and servlet components on Tomcat.
Prepared the Developer's Guide and Deployment Guide.
Involved in storing details of all employees and retrieving them from the Oracle database when required by the administrator for the employee detail module.
Developed the weekly schedule feature for employees, letting them plan their weekly activities interactively.
Developed and utilized J2EE services and JMS components for messaging communication in WebLogic (a brief sketch follows this section).
Environment: Java, JDK 1.4, JavaScript, HTML, Servlets, Eclipse, JSP, Apache Tomcat, Oracle.
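A minimal sketch of the JMS messaging pattern noted above: a sender that looks up a connection factory and queue through JNDI and publishes a text message; the class name and JNDI names are hypothetical and not the project's actual resources.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

/** Hypothetical JMS sender for schedule notifications. */
public class ScheduleNotifier {
    public void send(String payload) throws Exception {
        // JNDI names below are placeholders for the server-configured resources
        InitialContext ctx = new InitialContext();
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ScheduleConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/ScheduleQueue");

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            TextMessage message = session.createTextMessage(payload);
            producer.send(message);
        } finally {
            connection.close();
        }
    }
}
```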