Data Software Developer

Location:
Lowell, MA
Salary:
90000
Posted:
August 17, 2017

Resume:

SUMMARY

*+ years of experience in the IT industry, involved in developing, implementing, testing, and maintaining various web-based applications using J2EE technologies and Big Data ecosystems in Linux environments.

3 years of Hadoop experience in the design, development, and deployment of Big Data applications involving MapReduce, HDFS, Hive, HBase, Pig, Oozie, Sqoop, and Flume on CDH 4 and 5 distributions.

Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.

Excellent knowledge of the insurance domain and financial services, including retail banking and compliance/fraud detection.

Experience in performing data enrichment, cleansing, analytics, and aggregation using Hive, Pig, and the Pentaho ETL tool.

Strong in developing MapReduce applications, configuring the development environment, and tuning jobs.

Familiar with data formats such as JSON, Avro, Parquet, RC, and ORC, and compression codecs such as Snappy and bzip2.

Skilled in analyzing data using HiveQL and Pig Latin, and in extending Hive and Pig core functionality with custom UDFs.
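
As a minimal sketch of the kind of custom Hive UDF referred to above (the class name, column semantics, and normalization rule are illustrative, not taken from the resume):

```java
// Minimal sketch of a custom Hive UDF (illustrative; names are hypothetical).
// Built against the Hive exec library (org.apache.hadoop.hive.ql.exec.UDF).
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class NormalizePostalCode extends UDF {
    // Hive calls evaluate() once per row; returning null for null input is the convention.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        // Strip whitespace and upper-case the value, e.g. " l4w 5k4 " -> "L4W5K4".
        String cleaned = input.toString().replaceAll("\\s+", "").toUpperCase();
        return new Text(cleaned);
    }
}
```

Once packaged into a JAR, a UDF like this would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HQL.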

Experience with performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins when writing MapReduce jobs.
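
A minimal sketch of a distributed-cache map-side join of the kind mentioned above; the lookup file name, field layout, and join key are assumptions for illustration only:

```java
// Map-side join sketch: the small lookup dataset is shipped via the distributed
// cache and loaded into memory in setup(), so no reduce phase is needed for the join.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MapSideJoinMapper extends Mapper<LongWritable, Text, Text, Text> {
    private final Map<String, String> lookup = new HashMap<>();

    @Override
    protected void setup(Context context) throws IOException {
        // Assumes the driver registered the small file with
        // job.addCacheFile(new URI("/data/lookup.csv#lookup.csv")), which places
        // a local symlink named "lookup.csv" in the task working directory.
        try (BufferedReader reader = new BufferedReader(new FileReader("lookup.csv"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",", 2);
                if (parts.length == 2) {
                    lookup.put(parts[0], parts[1]);
                }
            }
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        // Join on the first field; emit the enriched record only when a match exists.
        String match = lookup.get(fields[0]);
        if (match != null) {
            context.write(new Text(fields[0]), new Text(value + "," + match));
        }
    }
}
```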

Good knowledge of scripting languages such as Linux/Unix shell scripting and Python.

Experience in importing streaming data into HDFS using Flume sources and transforming the data using Flume interceptors.
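
A minimal sketch of a custom Flume interceptor of the sort described above; the class name and header key are illustrative assumptions:

```java
// Custom Flume interceptor sketch: stamps each event with an ingest-time header
// so a downstream HDFS sink can bucket files by time (e.g. via %{ingest_ts} in its path).
import java.util.ArrayList;
import java.util.List;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

public class IngestTimestampInterceptor implements Interceptor {

    @Override
    public void initialize() { /* no state to set up */ }

    @Override
    public Event intercept(Event event) {
        // Add a header carrying the ingest timestamp in milliseconds.
        event.getHeaders().put("ingest_ts", String.valueOf(System.currentTimeMillis()));
        return event;
    }

    @Override
    public List<Event> intercept(List<Event> events) {
        List<Event> out = new ArrayList<>(events.size());
        for (Event e : events) {
            out.add(intercept(e));
        }
        return out;
    }

    @Override
    public void close() { /* nothing to release */ }

    // Flume instantiates interceptors through a Builder named in the agent configuration.
    public static class Builder implements Interceptor.Builder {
        @Override
        public Interceptor build() {
            return new IngestTimestampInterceptor();
        }

        @Override
        public void configure(Context context) { /* no properties needed */ }
    }
}
```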

In-depth knowledge of and experience in software design methodologies, design patterns, and object-oriented analysis and design (OOA/OOD).

Experience using SVN, CVS, and Git for source code management.

Experience with Core Java, with a strong understanding and working knowledge of object-oriented concepts, collections, multi-threading, and exception handling.

Expertise in Java/J2EE technologies such as Core Java, Spring, Hibernate, JDBC, JSON, HTML, Servlets, JSP, and JavaBeans, and in RDBMSs (Oracle, MySQL, DB2).

SKILLS

Hadoop Core Services: HDFS, MapReduce, YARN

Hadoop Distributions: Cloudera, Hortonworks

NoSQL Databases: HBase

Hadoop Data Services: Hive, Pig, Sqoop, Flume

Hadoop Operational Services: Zookeeper, Oozie

Monitoring Tools: Cloudera Manager, Ambari

Languages: C, C++, SQL, PL/SQL, Pig Latin, HiveQL, Unix Shell Scripting

Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, XML, REST

Frameworks: MVC, Hibernate, Spring, Struts

Application Servers: WebLogic, WebSphere, JBoss, Tomcat

Databases: Oracle 9i-11g, MySQL, DB2

Operating Systems: Ubuntu (Linux), Win 2000/XP

Build Tools: Maven, ANT

Development Tools: Microsoft SQL Studio, Toad, Eclipse, NetBeans

ETL Tools: Pentaho

Development Methodologies: Agile/Scrum, Waterfall

WORK EXPERIENCE

Hadoop Developer/ ETL

Economical Insurance, Toronto July 2015 – May 2017

Economical (Sonnet) is one of Canada’s leading property and casualty (P&C) insurance companies, serving customers through a national independent broker force and through its digital channel (Sonnet). The “PAS” project follows a data object hierarchy, making it highly portable to today's technologies and providing flexibility when implementing the data model on the server side.

Involved in design, development, integration, deployment, production support, and other technical aspects of the project.

Implemented several POCs to validate and fit Hadoop ecosystem tools on CDH distributions, along with third-party tools and APIs (e.g., importing data from Salesforce and Google Analytics into HDFS).

Hands-on experience with data warehouse star schema and snowflake modeling, fact and dimension tables, and physical and logical data modeling.

Designed and implemented a semi-structured data analytics platform on Hadoop, extracting data from a legacy system (claim center).

Proficient in data modeling with Hive partitioning, bucketing, and other optimization techniques to design warehousing infrastructure on top of data in HDFS.

Hands-on experience with the whole ETL (Extract, Transform & Load) process in Pentaho.

Implemented various Pentaho Data Integration steps to extract data from third-party vendors and APIs, cleanse it, and load it per business needs.

Skilled in high-level design of ETL DTS packages for integrating data from heterogeneous sources (Excel, CSV, Oracle, MySQL, flat files, text-format data).

Built several wrapper shell scripts to drive Oozie workflows, Pentaho ETL batch jobs, and Jenkins deployment scripts.

Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig.

Used the Oozie and Control-M workflow engines for managing and scheduling Hadoop jobs.

Involved in transforming data from mainframe tables into HDFS and HBase tables using Sqoop and Pentaho Kettle, respectively.

Used Sqoop to load data from MySQL into HDFS on a regular basis.

Involved in creating Hadoop streaming jobs.

Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH4 distribution.

Environment: Hadoop, CDH 5.4/5.5, Pentaho 6.0, MapReduce, Hive, Impala, HDFS, Sqoop, Oozie, Flume, HBase, Red Hat Enterprise Linux 5, Java (JDK 1.7/1.8), Jenkins, GitHub, Eclipse, Oracle 11g/12c, Salesforce, SoapUI, IBM WebSphere MQ.

Java-Hadoop Developer

BMO Financial Group, Toronto

AML OSFI/OFAC compliance pattern detection May 2014 – July 2015

BMO P&C is building a modernized AML solution to detect and prevent money laundering using Big Data technologies, which make it possible to “see” patterns in data beyond the capabilities of traditional database systems. The idea is to analyze the last year of real-time transaction data, including retail, web, and mobile transactions, to understand patterns of layering and integration and to raise suspicious activity reports from legacy mainframe systems such as CBDS and Mech.

Developed simple and complex MapReduce programs in Java for data analysis of different data formats, filtering bad and unnecessary records and finding unique records based on various criteria.

Developed a secondary sort implementation to receive pre-sorted values on the reduce side and improve MapReduce performance.
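
A minimal sketch of the secondary-sort pattern referenced above, using a composite key and a grouping comparator; the field names (accountId, timestamp) are illustrative assumptions:

```java
// Secondary sort sketch: the composite key sorts by (accountId, timestamp), while the
// grouping comparator groups reduce calls by accountId only, so each reduce() call
// sees its values already ordered by timestamp.
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

public class AccountTimeKey implements WritableComparable<AccountTimeKey> {
    private String accountId;
    private long timestamp;

    public AccountTimeKey() { }                      // required no-arg constructor

    public AccountTimeKey(String accountId, long timestamp) {
        this.accountId = accountId;
        this.timestamp = timestamp;
    }

    public String getAccountId() { return accountId; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(accountId);
        out.writeLong(timestamp);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        accountId = in.readUTF();
        timestamp = in.readLong();
    }

    @Override
    public int compareTo(AccountTimeKey other) {     // full sort: account, then time
        int cmp = accountId.compareTo(other.accountId);
        return cmp != 0 ? cmp : Long.compare(timestamp, other.timestamp);
    }

    // Grouping comparator: reduce() is invoked once per accountId, ignoring the timestamp.
    public static class GroupComparator extends WritableComparator {
        public GroupComparator() { super(AccountTimeKey.class, true); }

        @Override
        @SuppressWarnings("rawtypes")
        public int compare(WritableComparable a, WritableComparable b) {
            return ((AccountTimeKey) a).getAccountId()
                    .compareTo(((AccountTimeKey) b).getAccountId());
        }
    }
}
```

In the driver this would typically be wired up with job.setGroupingComparatorClass(AccountTimeKey.GroupComparator.class), plus a partitioner that hashes only accountId so all records for an account land on the same reducer.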

Implemented custom data types, InputFormats, RecordReaders, OutputFormats, and RecordWriters for MapReduce computations to handle custom business requirements.

Implemented MapReduce programs to classify data into different categories based on record type.

Worked with sequence files, RC files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.

Implemented daily cron jobs that automate the parallel loading of data into HDFS and pre-processing with Pig, using Oozie coordinator jobs.

Responsible for performing extensive data validation using Hive.

Worked with Sqoop import and export functionality to handle large dataset transfers between Oracle databases and HDFS.

Worked on tuning Hive and Pig scripts to improve performance.

Involved in submitting and tracking MapReduce jobs using the JobTracker.

Involved in creating Oozie workflow and coordinator jobs to kick off jobs based on time and data availability.

Used Pig as an ETL tool to do transformations, event joins, filtering, and some pre-aggregations.

Implemented business logic by writing Pig UDFs and Hive generic UDFs in Java, and used various UDFs from Piggybank and other sources.
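
A minimal sketch of a Pig UDF of the kind described above; the class name and masking rule are illustrative assumptions:

```java
// Pig EvalFunc UDF sketch: masks all but the last four characters of an account number
// before downstream aggregation.
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class MaskAccountNumber extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        String account = input.get(0).toString();
        if (account.length() <= 4) {
            return account;
        }
        // Replace everything except the last four characters with '*'.
        String tail = account.substring(account.length() - 4);
        return account.substring(0, account.length() - 4).replaceAll(".", "*") + tail;
    }
}
```

In a Pig script such a UDF would be registered with REGISTER and then called like any built-in function in a FOREACH ... GENERATE statement.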

Environment: Hadoop, HDFS, Hortonworks, MapReduce, Hive, Sqoop, MySQL, Java, REST APIs, Maven, MRUnit, JUnit.

Software Developer – Big Data

Yottalore, Toronto SEP 2013 – APR 2014

As part of a team, delivered a system to analyze and manage market risk for the wealth management division of a large Canadian financial institution.

Worked on loading and transforming large sets of structured, semi-structured, and unstructured data.

Worked with different file formats such as sequence files, XML files, and map files using MapReduce programs.

Imported and exported data between relational data sources such as RDBMSs and Teradata and HDFS using Sqoop.

Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.

Involved in writing API code to read HBase tables, cleanse the data, and write it to another HBase table.
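
A minimal sketch of that read-cleanse-write pattern against the HBase client API; the table, column family, and qualifier names are illustrative assumptions:

```java
// Read-cleanse-write sketch using the HBase 1.x client API (ConnectionFactory / Table).
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseCleanseCopy {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table source = connection.getTable(TableName.valueOf("raw_records"));
             Table target = connection.getTable(TableName.valueOf("clean_records"))) {

            ResultScanner scanner = source.getScanner(new Scan());
            for (Result row : scanner) {
                byte[] amount = row.getValue(Bytes.toBytes("d"), Bytes.toBytes("amount"));
                if (amount == null) {
                    continue;                        // cleanse: skip rows missing the amount column
                }
                String trimmed = Bytes.toString(amount).trim();
                Put put = new Put(row.getRow());
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("amount"), Bytes.toBytes(trimmed));
                target.put(put);
            }
            scanner.close();
        }
    }
}
```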

Migrated ETL jobs to Pig scripts to do transformations, event joins, and some pre-aggregations before storing the data in HDFS.

Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit.

Experienced in managing and reviewing the Hadoop log files.

Configured build scripts for multi-module projects with Maven.

Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.

Environment: Hadoop, CDH4, MapReduce, HDFS, Pig, Hive, Oozie, Java, Linux, Maven, Oracle 11g/10g, SVN.

Java Software Developer – Predictive Modelling for Lost Sales for Open Shelf

Logistics Alliance™ (Shoppers Drug Mart chain supply company) JAN 2013 – AUG 2013

Developed a fully dynamic mail notification system for Shoppers Drug Mart using Java EE, Java SE, Java APIs (Apache POI, JavaMail, etc.), and DB2 as the backend.

Designed and developed database objects such as schemas and tables in DB2, created SQL statements based on requirements, and completed a detail-oriented module.

Imported data into DB2 tables and maintained the database after scanning delivery files in various formats (spreadsheet, flat file) using JDBC connectivity to DB2.
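
A minimal sketch of loading parsed delivery rows into DB2 over JDBC with a batched PreparedStatement; the JDBC URL, credentials, and table layout are illustrative assumptions:

```java
// Batched JDBC insert sketch for DB2 (illustrative host, schema, and columns).
import java.sql.Connection;
import java.sql.Date;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class DeliveryLoader {
    public static void load(List<String[]> rows) throws SQLException {
        String url = "jdbc:db2://dbhost:50000/DELIVERY";   // hypothetical host/database
        try (Connection conn = DriverManager.getConnection(url, "appuser", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO DELIVERY_FILE (STORE_ID, SKU, QTY, DELIVERY_DATE) VALUES (?, ?, ?, ?)")) {
            conn.setAutoCommit(false);
            for (String[] row : rows) {               // each row parsed from a spreadsheet or flat file
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.setInt(3, Integer.parseInt(row[2]));
                ps.setDate(4, Date.valueOf(row[3]));  // expects ISO yyyy-MM-dd
                ps.addBatch();
            }
            ps.executeBatch();                        // send all inserts in one round trip
            conn.commit();
        }
    }
}
```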

Developed an HTML notification email template, verified the calculated delivery date against the database, and forwarded the email to the selected store.
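
A minimal sketch of sending such an HTML notification email with JavaMail; the SMTP relay and addresses are illustrative assumptions:

```java
// JavaMail sketch: sends an HTML-bodied notification to a store's email address.
import java.util.Properties;

import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.Session;
import javax.mail.Transport;
import javax.mail.internet.InternetAddress;
import javax.mail.internet.MimeMessage;

public class StoreNotifier {
    public static void notifyStore(String storeEmail, String htmlBody) throws MessagingException {
        Properties props = new Properties();
        props.put("mail.smtp.host", "smtp.example.com");   // hypothetical relay
        Session session = Session.getInstance(props);

        MimeMessage message = new MimeMessage(session);
        message.setFrom(new InternetAddress("noreply@example.com"));
        message.setRecipient(Message.RecipientType.TO, new InternetAddress(storeEmail));
        message.setSubject("Delivery schedule update");
        message.setContent(htmlBody, "text/html; charset=utf-8");  // HTML template body
        Transport.send(message);
    }
}
```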

Ultimately, the main aim of this application was to require no user interaction: it downloads raw data flat files from a mail account; imports, verifies, and restructures the data after scanning each file; calculates the delivery date against the database; and sends an email with the relevant information to the selected store.

Software Developer (Application Development)

ITECH (A.D.PATEL INSTITUTE OF TECHNOLOGY) OCT 2009 – DEC 2011

Technologies used include: Java, PHP, Hibernate, Spring, MySQL, Oracle, JavaScript.

Involved in projects of various sizes, such as “CGN Horizon Dashboard”, “Admin Console”, and “Pick One”, developing back-end servers, ETL services for databases, data monitoring, and log file maintenance.

“CGN Horizon Dashboard” gives a high-level summary of the CGN appliance network, including server location, Ethernet status, traffic load, and error tracking. It is also responsible for generating PDF reports.

“Admin Console” mainly handles website traffic, website configuration changes, log reports, and graphical views of the Tomcat and database servers. It is generally used for monitoring servers and databases.

“Pick One” is an application developed for running basic surveys and collecting suggestions from a group of people. On the technical side, configured the back-end server and RESTful web services.

EDUCATION

Computer Engineering Technician (Major - Software development)

Sheridan College, Brampton, ON

Bachelor's in Information Technology (Major - Software development)

S.P University (A.D.I.T Institute of Technology)


