Over 8 years of IT experience in software analysis, design, development, testing and implementation of Big Data, Hadoop, NoSQL and Java/J2EE technologies.
Experience in working with Developer Toolkits like Force.com IDE, Force.com Ant Migration Tool, Eclipse IDE, Mavens.
A very good understanding of job workflow scheduling and monitoring tools like Oozie and Control-M.
Experience in Front-end Technologies like HTML, CSS, HTML5, CSS3, and AJAX.
Experience in the Software Development Life Cycle (SDLC) phases which include Analysis, Design, Implementation, Testing and Maintenance.
Strong technical, administration and mentoring knowledge in Linux and Big data/Hadoop technologies.
Good working experience in using Spark SQL to manipulate Data Frames in Python.
Good knowledge in NoSQL databases including Cassandra and MongoDB.
Very good experience in developing and deploying applications using WebLogic, Apache Tomcat, and JBoss.
Extensive knowledge of Teradata utilities (BTEQ, FastLoad, FastExport, MultiLoad Update/Insert/Delete/Upsert).
Good experience in Hive partitioning and bucketing, performing different types of joins on Hive tables, and implementing Hive SerDes such as JSON and Avro.
Experience with build tools such as Ant and Maven and continuous integration tools like Jenkins.
Working experience in Development, Production and QA Environments.
Experience in NoSQL Column-Oriented Databases like HBase and its Integration with Hadoop cluster.
Proficient in Java, Collections, J2EE, Servlets, JSP, Spring, Hibernate, JDBC/ODBC.
Knowledge of implementing Big Data solutions on Amazon Elastic MapReduce (Amazon EMR) for processing and managing the Hadoop framework on dynamically scalable Amazon EC2.
In depth knowledge of Hadoop Architecture and various components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node.
Experience with Amazon Web Services, AWS command line interface, and AWS data pipeline.
Experience in writing SQL, PL/SQL queries, Stored Procedures for accessing and managing databases such as Oracle, SQL MySQL, and IBM DB2.
Experience executing faster MapReduce functions using Spark RDDs for parallel processing or for referencing a dataset in HDFS, HBase and other data sources.
Good experience working with analysis tools like Tableau for regression analysis, pie charts, and bar graphs.
Expertise in developing Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
Experience with the Apache Spark ecosystem using Spark SQL, DataFrames, and RDDs, and knowledge of Spark MLlib.
Hands on experience with Big Data Ecosystems including Hadoop, MapReduce, Pig, Hive, Impala, Sqoop, Flume, Oozie, MongoDB, Zookeeper, Kafka, Maven, Spark, Scala, HBase, Cassandra.
Experience in installation, configuration and deployment of Big Data solutions.
Extensive command of Hive and Pig core functionality, writing Pig Latin UDFs in Java and using various UDFs from Piggybank and other sources.
Hands on experience with NoSQL Databases like HBase, Cassandra and relational databases like Oracle and MySQL.
Defined real-time data streaming solutions across the cluster using Spark Streaming, Apache Storm, Kafka, NiFi and Flume.
Expertise in J2EE technologies like JSP, Servlets, EJBs, JDBC, JNDI and AJAX.
Expertise in applying Java Messaging Service (JMS) for reliable information exchange across Java applications.
Good experience with Apache Hadoop along with the enterprise distributions from Cloudera and Hortonworks.
Good knowledge of the MapR distribution and Amazon EMR.
Hadoop/Big Data Technologies: Hadoop 3.0, HDFS, MapReduce, HBase 1.4, Apache Pig, Hive 2.3, Sqoop 1.4, Apache Impala 2.1, Oozie 4.3, Yarn, Apache Flume 1.8, Kafka 1.1, Zookeeper
Cloud Platform: Amazon AWS, EC2, EC3, MS Azure, Azure SQL Database, Azure SQL Data Warehouse, Azure Analysis Services, HDInsight, Azure Data Lake, Data Factory
Hadoop Distributions: Cloudera, Hortonworks, MapR
Programming Language: Java, Scala, Python 3.6, SQL, PL/SQL, Shell Scripting, Storm 1.0, JSP, Servlets
Frameworks: Spring 5.0.5, Hibernate 5.2, Struts 1.3, JSF, EJB, JMS
Databases: Oracle 12c/11g, SQL
Operating Systems: Linux, Unix, Windows 10/8/7
IDE and Tools: Eclipse 4.7, NetBeans 8.2, IntelliJ, Maven
NoSQL Databases: HBase 1.4, Cassandra 3.11, MongoDB, Accumulo
Web/Application Server: Apache Tomcat 9.0.7, JBoss, WebLogic, WebSphere
SDLC Methodologies: Agile, Waterfall
Version Control: Git, SVN, CVS
World Bank Group – Washington, DC Aug 17 - Present
Sr. Big Data Developer
As a Sr. Big Data/Hadoop Developer, worked on the Hadoop ecosystem including Hive, MongoDB, Zookeeper, and Spark Streaming with the MapR distribution.
Developed Big Data solutions focused on pattern matching and predictive modeling
Involved in Agile methodologies, daily scrum meetings, sprint planning.
Performed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
Worked on MongoDB and HBase, NoSQL databases that differ from classic relational databases.
Involved in converting HiveQL into Spark transformations using Spark RDD and through Scala programming.
Integrated Kafka with Spark Streaming for high throughput and reliability.
Worked on Apache Flume for collecting and aggregating large amounts of log data and stored it on HDFS for further analysis.
Tuned Hive and Pig scripts to improve performance and resolved performance issues in both.
Built Hadoop solutions for big data problems using MR1 and MR2 in YARN.
Developed NiFi flows dealing with various kinds of data formats such as XML, JSON, Avro.
Developed and designed data integration and migration solutions in Azure.
Worked on a proof of concept using Spark with Scala and Kafka.
Worked on visualizing the aggregated datasets in Tableau.
Worked on importing data from HDFS to MySQL database and vice-versa using Sqoop.
Implemented MapReduce jobs in Hive by querying the available data.
Configured the Hive metastore with MySQL, which stores the metadata for Hive tables.
Performed performance tuning of Hive queries and MapReduce programs for different applications.
Proactively involved in ongoing maintenance, support and improvements in Hadoop cluster.
Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
Handled importing of data from various data sources, performed transformations using Hive and Pig, and loaded data into HDFS.
Involved in identifying job dependencies to design workflow for Oozie & YARN resource management.
Designed solution for various system components using Microsoft Azure.
Worked on data ingestion using Sqoop from HDFS to Relational Database Systems and vice-versa. Maintaining and troubleshooting
Explored Spark to improve the performance and optimization of the existing algorithms in Hadoop using Spark context, Spark SQL, DataFrames, and pair RDDs.
Created Hive Tables, loaded claims data from Oracle using Sqoop and loaded the processed data into target database.
Involved in PL/SQL query optimization to reduce the overall run time of stored procedures.
Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
Used Cloudera Manager for installation and management of Hadoop Cluster.
Developed data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
Collaborated with business users/product owners/developers to contribute to the analysis of functional requirements.
Primarily involved in the data migration process using Azure by integrating with a GitHub repository and Jenkins.
Upgraded the Hadoop Cluster from CDH3 to CDH4, setting up High Availability Cluster and integrating Hive with existing applications.
Designed and developed a flattened view (merged and flattened dataset) by de-normalizing several datasets in Hive/HDFS; the view consists of key attributes consumed by the business and other downstream systems.
Supported NoSQL in enterprise production and loaded data into HBase using Impala and Sqoop.
Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
Environment: Agile, Hadoop 3.0, Pig 0.17, HBase, Sqoop, Azure, Hive 2.3, HDFS, NoSQL, Impala, YARN, PL/SQL, NiFi, XML, JSON, Avro, Spark, Kafka, Tableau, MySQL, Apache Flume 2.3
Anthem – Atlanta, GA Nov 15 – Jul 17
Sr. Big Data/Hadoop Developer
Involved in all phases of Software Development Life Cycle (SDLC) using Agile.
Implemented Cassandra and managed the other processing tools running on YARN.
Implemented Kafka High level consumers to get data from Kafka partitions and move into HDFS.
Worked with Kafka streaming tool to load the data into HDFS and exported it into MongoDB database.
Responsible for importing log files from various sources into HDFS using Flume.
Developed mappings to extract data from Oracle, Teradata, Flat files, XML files, Excel and load.
Analyzed the Hadoop cluster and different Big Data tools including HBase and Cassandra.
Created RDDs and applied data filters in Spark, and created Cassandra tables and Hive tables for user access.
Implemented MapReduce programs to handle semi/unstructured data like XML, JSON, Avro data files and sequence files for log files.
Used Elasticsearch as a distributed RESTful web service with MVC for parsing and processing XML data.
Created Partitions, Buckets based on State to further process using Bucket based Hive joins.
Collected the log data from web servers and integrated into HDFS using Flume.
Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries and Pig Scripts
Worked with cloud services like Amazon Web Services (AWS) and involved in ETL, Data Integration and Migration
Constructed System components and developed server side part using Java, EJB, and Spring Framework.
Involved in designing the data model for the system.
Used J2EE design patterns like DAO, MODEL, Service Locator, MVC and Business Delegate.
Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
Involved in converting Hive/SQL queries into Spark transformations using Spark RDD, Scala and Python.
Implemented proofs of concept (PoCs) using Kafka, Storm, and HBase for processing streaming data.
Worked on analyzing Hadoop cluster and different big data analytic tools including MapReduce, Hive and Spark.
Wrote multiple MapReduce programs in Java for data extraction, transformation, and aggregation from multiple file formats.
Installed and Configured Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing.
Wrote complex Hive queries and UDFs in Java and Python.
Responsible for data extraction and data ingestion from different data sources into the Hadoop Data Lake by creating ETL pipelines using Pig and Hive.
Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters; converted MapReduce applications to Spark.
Responsible for building and configuring distributed data solution using MapR distribution of Hadoop.
Involved in complete Big Data flow of the application data ingestion from upstream to HDFS, processing the data in HDFS and analyzing the data.
Involved in Agile methodologies, daily scrum meetings, sprint planning.
Migrated various Hive UDFs and queries into Spark SQL for faster requests.
Configured Spark Streaming to receive real-time data from Apache Kafka and store the stream data in HDFS using Scala.
Installed and configured Hive, HDFS and NiFi, and implemented a CDH cluster. Assisted with performance tuning and monitoring.
Worked with the NoSQL database HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
Installed Oozie workflow engine to run multiple MapReduce, Hive HQL and Pig jobs.
Loaded large amounts of data into HDFS using Apache Kafka.
Implemented best income logic using Pig scripts and UDFs.
Environment: Hadoop 3.0, Pig 0.17, Hive 2.3, HBase, Sqoop, MapR, HDFS, Agile, Spark SQL, Apache Kafka 2.0.0, Scala, Spark, NiFi, NoSQL, Cassandra 3.11, Java, MongoDB, Flume, XML, JSON, Elasticsearch, AWS
Metlife - Cary, NC Jan 14 – Oct 15
Sr. J2EE/Hadoop Developer
Created the automated build and deployment process for application, re-engineering setup for better user experience, and leading up to building a continuous integration system.
Involved in the Complete Software development life cycle (SDLC) to develop the application.
Involved in Daily Scrum (Agile) meetings, Sprint planning and estimation of the tasks for the user stories, participated in retrospective and presenting Demo at end of the sprint.
Involved in loading data from the Linux file system to HDFS.
Implemented new Apache Camel routes and extended existing Camel routes that provide end-to-end communications between the web services and other enterprise back end services
Developed code using Core Java to implement technical enhancement following Java Standards.
Worked with Swing and RCP using Oracle ADF to develop a search application which is a migration project.
Implemented Hibernate utility classes, session factory methods, and different annotations to work with back-end database tables.
Implemented Ajax calls using JSF-Ajax integration and implemented cross-domain calls using JQuery Ajax methods.
Implemented object-relational mapping in the persistence layer using the Hibernate framework in conjunction with Spring functionality.
Used JPA (Java Persistence API) with Hibernate as Persistence provider for Object Relational mapping.
Used JDBC and Hibernate for persisting data to different relational databases.
Implemented application-level persistence using Hibernate and Spring.
Experience in reviewing Hadoop log files to detect failures.
Exported the analyzed data to the relational databases using Sqoop 2.3.x for visualization and to generate reports for the BI team.
Developed continuous flow of data into HDFS from social feeds using Apache Storm Spouts and Bolts.
Performed real-time streaming of data using Spark with Kafka.
Imported and exported data into HDFS and Hive 2.0 using Sqoop.
Used XML and JSON for transferring/retrieving data between different Applications.
Also wrote some complex PL/SQL queries using Joins, Stored Procedures, Functions, Triggers, Cursors, and Indexes in Data Access Layer.
Implemented a RESTful web services architecture for client-server interaction and the respective POJOs for its implementation.
Interacted with Business Analysts to understand the requirements and the impact on the business.
Developed and implemented a Swing, Spring and J2EE based MVC (Model-View-Controller) framework for the application.
Worked with NoSQL databases like HBase to create tables and store the data.
Collected and aggregated large amounts of log data using Apache Flume and staged data in HDFS for further analysis.
Collaborated with Business users for requirement gathering for building Tableau reports per business needs.
Worked on deploying Hadoop 2.7.2 cluster with multiple nodes and different big data analytic tools including Pig 0.16, HBase 0.98.23 database and Sqoop HDP2.3.x.
Used WebLogic for application deployment and Log4j for logging/debugging.
Used CVS for version control and Ant as the project build tool.
Environment: Hadoop, Hive, Pig, HBase, Sqoop, Spark, MapReduce, Scala, Oozie, Teradata, SQL, NoSQL, HDFS, Kafka, Flume, Cassandra, Java, XML
FreddieMac - McLean, VA Nov 12 – Dec 13
Sr. Java/J2EE Developer
Applied an iterative methodology for the development of the application. Implemented J2EE design patterns like DAO, Singleton, and Factory.
Developed the Java/J2EE based multi-threaded application, which is built on top of the Struts framework.
Created POJOs in the business layer.
Developed Ant Scripts to build and deploy EAR files on to Tomcat Server.
Analyzed EJB performance in terms of scalability through various load and stress tests using the Bean-test tool.
Extensively used Eclipse as the IDE for writing code.
Implemented PL/SQL queries and used MySQL stored procedures, and built-in functions to retrieve and update data from the databases
Involved in preparing Ant build scripts (XML based), deployments, and integration and configuration management of the entire application modules.
Designed the database, created tables, and wrote complex SQL queries and stored procedures as per the requirements.
Used Spring/MVC framework to enable the interactions between JSP/View layer and implemented different design patterns with J2EE and XML technology.
Managed connectivity using JDBC for querying/inserting & data management including triggers and stored procedures.
Implemented the J2EE design patterns Data Access Object (DAO), Session Façade and Business Delegate.
Developed RESTful web services using Spring IoC to give users a way to run the job and generate a daily status report.
Implemented application using MVC architecture integrating Hibernate and Spring frameworks.
Created and modified stored procedures, functions, triggers, and complex SQL commands for the application using PL/SQL.
Created tables, triggers, stored procedures, SQL queries, joins, integrity constraints and views for multiple databases, including Oracle, using the Toad tool.
Configured local Maven repositories and multi-component projects and scheduled projects in Jenkins for continuous integration.
Implemented Hibernate in the data access object layer to access and update information in the Oracle Database.
Used Entity Beans for storing data in the database.
Developed Session Beans as the clients of Entity Beans to maintain the Client state.
Used various Core Java concepts such as Multithreading, Exception Handling, Collection APIs to implement various features and enhancements.
Involved in developing Java APIs that communicate with the JavaBeans.
Used ANT scripts to fetch, build, and deploy application to development environment.
Wrote test cases in JUnit for unit testing of classes.
Mindtree - Chennai, India Jul 10 – Oct 12
Developed various UML diagrams like use cases, class diagrams, interaction diagrams (sequence and collaboration) and activity diagrams
Involved in requirements gathering and performed object oriented analysis, design and implementation.
Provided utility classes for the application using Core Java and extensively used Collection package.
Extensively worked on an n-tier architecture system with application system development using Java, JDBC, Servlets, JSP, Web Services, WSDL, SOAP, Spring, Hibernate, XML, SAX, and DOM.
Extensively used Eclipse IDE for developing, debugging, integrating and deploying the application.
Used Spring Framework for MVC for writing Controller, Validations and View.
Developed UI using HTML, CSS, Bootstrap, JQuery, and JSP for interactive cross browser functionality and complex user interface.
Wrote Hibernate classes and DAOs to retrieve and store data, and configured Hibernate files.
Developed Service layer interfaces by applying business rules to interact with DAO layer for transactions.
Developed user interface using JSP, JSP Tag libraries and Struts Tag Libraries to simplify the complexities of the application.
Implemented business logic using POJOs and used WebSphere to deploy the applications.
Used the build tool Maven to build JAR and WAR files, and Ant for bundling all source files and web content into WAR files.
Used Core Spring for Dependency Injection of various component layers.
Used SOA REST (JAX-RS) web services to provide/consume the Web services from/to down-stream systems.
Developed web-based reporting for a credit monitoring system with HTML, CSS, XHTML, JSTL, and custom tags using Spring.
Used various Core Java concepts such as Multithreading, Exception Handling, Collection APIs to implement various features and enhancements.
Wrote and debugged the Maven Scripts for building the entire web application.
Designed and developed Ajax calls to populate screen parts on demand.
Used Maven to build, run and create Aerial-related JARs and WAR files among other uses.
Used JUnit for unit testing of the system and Log4J for logging.