
Data Developer

Location:
Montreal, QC, Canada
Posted:
March 02, 2019


Resume:

Ekta Kumari

+1-514-***-****

ac8noe@r.postjobfree.com

https://www.linkedin.com/in/ekta-kumari-b85589a4/

Holder of an Open Work Permit for Canada

Career Objective

To keep up with the cutting edge of technology while making a significant contribution to the success of the firm.

Summary of Experience and Qualifications

Total of 7.1 years of experience in data warehousing, including around 2.5 years building Big Data applications with frameworks such as Hadoop, Hive, Sqoop and Spark and cloud technologies such as AWS Redshift and S3, and around 4.5 years with ETL tools such as Informatica PowerCenter, SSIS and DataStage.

Extensive hands-on experience with Spark SQL, DataFrames, Spark Streaming, Scala, Hive, ETL tools and performance tuning of Spark applications.

Experience in all stages of the project life cycle: requirements gathering, architecture design and documentation, development and testing, performance optimization and production support.

Technical Skill

●Big Data Technologies : HDFS, YARN, Hive, Sqoop, Spark, Spark Streaming

●ETL Tools : Informatica PowerCenter 9.6.1, SSIS, DataStage

●NoSQL Databases : HBase, Cassandra

●Databases : Teradata, Oracle 11g, SQL Server 2008

●Data Cleansing Tool : Informatica Data Quality (IDQ)

●Programming Languages : C, Scala, SQL, HiveQL, Unix Shell Scripting

●Workflow Manager/Scheduler : Stonebranch/Opswise

●Query Languages : SQL, HQL

●Messaging System : Kafka

●Version Control : Git, SVN

●Bug Tracking : Jira, Quality Control

●Scripting : Unix Shell

●Reporting Tools : Zeppelin, SSRS

●Cloud Technologies : AWS Redshift, S3, EMR, EC2

●IDE : IntelliJ IDEA

Organizational Experience Summary

●Mar 2018 – Jan 2019 : Groupon India

●Jan 2015 – Mar 2018 : Deloitte Consulting

●Dec 2011 – Jan 2015 : Cognizant Technology Solutions

●Domains : Health and Life Insurance, Telecom, Banking, E-commerce

●Roles : Spark/Big Data Developer, ETL Developer

Groupon (Project Details Mar 2018 – Jan 2019)

Project 1:

Project Name: Live Customer orders and deals data analysis

Technology Used: Spark, Kafka, Spark Streaming, Cassandra, Apache Zeppelin

Project:

After loading batch data for customers and their related orders and ongoing deals in the previous project, the need was to perform a similar operation on live data and load it into Cassandra, so that the BI and data science teams could run their analysis and stakeholders/business could be notified of trends in the incoming data.

Key Responsibilities:

●Built an initial POC to process live customer and order data arriving from a Kafka cluster: ingested the data with Spark Streaming, applied the required logic/transformations, fed the results into Cassandra and used Apache Zeppelin for further analysis and report generation (a minimal sketch of this pipeline appears after this list). To understand Kafka better, was also involved in fixing Kafka server failures.

●Performed source data analysis and data profiling once the data had been moved to Cassandra.
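As an illustration of this pipeline, below is a minimal Spark Streaming sketch of a Kafka-to-Cassandra flow in Scala. It only shows the shape of the solution: the broker address, topic, keyspace, table, column names and the CSV-style message format are assumptions for the example, not the actual project values.

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import com.datastax.spark.connector._

object LiveOrdersStream {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("live-orders-poc")
      .set("spark.cassandra.connection.host", "cassandra-host") // placeholder host
    val ssc = new StreamingContext(conf, Seconds(30))            // 30s micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker1:9092",                     // placeholder broker
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "live-orders-poc",
      "auto.offset.reset" -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("orders"), kafkaParams))

    // Parse each message (assumed CSV: customer_id,order_id,amount),
    // apply the transformation step and persist the result to Cassandra.
    stream.map(_.value.split(","))
      .filter(_.length == 3)
      .map(f => (f(0), f(1), f(2).toDouble))
      .foreachRDD { rdd =>
        rdd.saveToCassandra("analytics", "live_orders",
          SomeColumns("customer_id", "order_id", "amount"))
      }

    ssc.start()
    ssc.awaitTermination()
  }
}

In practice the parsing and transformation logic mirrors the business rules described above, the batch interval is tuned against the observed Kafka throughput, and the Cassandra table is then queried from Apache Zeppelin for analysis and reporting.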

Project 2:

Industry: e-commerce

Role: Spark Developer

Project Name: Groupon’s Customers and orders Migration

Technology Used: Hive, Informatica PowerCenter, Spark, SQL Server 2008

Project: Informatica took a huge amount of time to process billions of records, so the pipeline was redesigned: all source data was placed in S3, ingested from S3 and loaded into Hive tables using Spark as the data processing engine, with little or no change to the business logic, and with data types and operations kept identical to the original Informatica mappings. The work also covered performance optimization and scheduling/automation of the jobs using cron.
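As a rough illustration of this migration pattern, the sketch below reads raw extracts from S3 and persists them to a Hive table with Spark. The bucket, paths, schema handling and table names are hypothetical placeholders; the real jobs declared the Informatica-compatible data types explicitly instead of inferring them.

import org.apache.spark.sql.SparkSession

object S3ToHiveLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("s3-to-hive-migration")
      .enableHiveSupport()                      // required to write managed Hive tables
      .getOrCreate()

    // Read the raw source extracts landed in S3 (placeholder bucket/path).
    val orders = spark.read
      .option("header", "true")
      .option("inferSchema", "true")            // sketch only; real jobs use explicit schemas
      .csv("s3a://example-bucket/exports/orders/")

    // Keep the data logically unchanged and persist it to a partitioned Hive table
    // (the staging database and order_date column are assumed to exist).
    orders.write
      .mode("overwrite")
      .partitionBy("order_date")
      .saveAsTable("staging.orders")

    spark.stop()
  }
}

A job like this is then performance-tuned via the Spark web UI (partition counts, shuffle settings) and scheduled through cron, as noted above.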

Key Responsibilities:

●Based on the mapping document, imported data from the Informatica sources and stored it in Hive according to the business logic.

●Query performance optimization and data updates using Spark.

●Performance analysis using the Spark web UI.

●Wrote alternative code to optimize Spark performance.

●Validated and tested the code.

●Walked the business through the modified code.

●Involved in building the technical design document and peer review of the code.

Deloitte Consulting (Project Details Jan 2015 – Mar 2018)

Project 1:

Role: Big data Developer

Project Name: Pampers Data Warehouse (Apache Spark RDD and DataFrames)

Technology Used: Spark SQL, DataFrames, Amazon EMR, Redshift, S3, Oracle

Project:

The Pampers data warehouse is being created by Kimberly Clark to enable its data science team to work on multiple loyalty-management use cases. Data is pulled from multiple diverse data sources (including their own Smart Button loyalty platform and P&G’s Customer Portal Reward database held in Amazon S3) and co-located in an Amazon Redshift database.

Key Responsibilities:

●Designed the complete data pipeline to pull batch data from SQL Server, an Amazon S3 bucket and an internal SFTP server and load it into a single AWS Redshift staging table as raw data for further analysis; designed Spark jobs to achieve this (a minimal sketch follows this responsibilities list).

●Worked from the requirement document, mapping document and ad hoc requests to load the different dimension and fact tables as part of the target table load; created Spark jobs to process this data, which were scheduled to run via AWS Lambda.

●Modularized the code.

●Performed unit testing.

●Prepared the technical design document.

●Provided production support.
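As a minimal sketch of the staging-table load described in the first bullet above, the snippet below pulls one batch feed over JDBC and lands it in Redshift, assuming the spark-redshift connector for the write; the hosts, credentials, bucket and table names are hypothetical placeholders.

import org.apache.spark.sql.SparkSession

object RedshiftStagingLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("redshift-staging-load").getOrCreate()

    // Batch extract from SQL Server over JDBC (placeholder host/database/table).
    val members = spark.read.format("jdbc")
      .option("url", "jdbc:sqlserver://sql-host:1433;databaseName=loyalty")
      .option("dbtable", "dbo.members")
      .option("user", "etl_user")
      .option("password", sys.env("SQL_PASSWORD"))
      .load()

    // Land the rows unchanged in a single Redshift staging table; the
    // spark-redshift connector stages the data through S3 before a COPY.
    members.write
      .format("com.databricks.spark.redshift")
      .option("url", s"jdbc:redshift://redshift-host:5439/dw?user=etl&password=${sys.env("RS_PASSWORD")}")
      .option("dbtable", "staging.raw_loyalty")
      .option("tempdir", "s3a://example-bucket/tmp/redshift/")
      .option("forward_spark_s3_credentials", "true")
      .mode("append")
      .save()

    // The S3 and SFTP feeds follow the same pattern: read them with spark.read
    // from their landing locations and append to the same staging table.
    spark.stop()
  }
}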

Project 2:

Role: ETL Developer

Project Name: TUFTS Health Plan (TUFTS)

Technology Used: Informatica PowerCenter, Informatica IDQ, SQL Server 2008

Client:

Tufts Health Plan is nationally recognized for its commitment to providing innovative, high-quality health care coverage. Tufts Health Plan is one of the few health plans in Massachusetts to participate in the commercial, Medicare and Medicaid/subsidized markets, offering coverage across the life span regardless of age or circumstance.

Project:

The main purpose of this project is to provide provider information in flat-file format to the Health Rule Payer (HR Payer) system by reading data from various sources such as the Express, Cactus and THP systems.

Key Responsibilities:

●Designing an effective ETL architecture that matches the requirements and provides the required outputs.

●Developing and troubleshooting mappings and workflows for ETL jobs based on the low-level design documents and according to the client’s standards.

●Working with the client’s onshore manager to understand business priorities and determine release dates for the different phases/backlog items.

●Developed comprehensive data loading strategies using detailed Visio diagrams, communicating with various end-user teams and understanding their data needs.

●Involved right from the beginning in analysing the existing system and functional requirements and in reviewing all deliverables.

●Performed code migrations across environments (SIT, UAT and Production), ensuring successful and timely delivery of code to production users.

Project 3:

Industry: Health Care and Life Sciences

Role: ETL Developer

Duration: Feb 2015-Oct 2016

Project Name: PIX Provider Extracts

Technology Used: Informatica PowerCenter, Teradata and its utilities

Client:

Anthem is one of the largest US health benefits companies, with nearly 34 million members in its affiliated health plans and approximately 65 million individuals served through its subsidiaries. It is licensed in 14 states and provides plans under the Unicare name in other parts of the country.

Project:

DX Comprehensive Care deals with extracting Claims, Provider, ETG, DXCG and Membership attributes out of EDWard on a periodic basis. In addition, two daily CIRS extracts are generated from the input files provided by the CIRS system. The extracts are produced on a daily/monthly/quarterly basis for the ACO-attributed members from EDWard. Business-defined transformation logic is applied to the Member, Claim, Provider, ETG, DXCG and CIRS data, which is loaded into work tables before being exported as files. The requirement for Build 1 is the mapping logic to source data elements from a combination of the Host or Home instances of a claim, based on overall data quality as identified in the profiling exercise. Once the extracts are generated for all subject areas, the files are FTP’d to the respective ACO folders for ACO processing.

Key Responsibilities:

●Responsible for the complete Design, Development, validation and testing of interfaces by ensuring specific standards and consistency.

●Preparation of Low Level design documents using Microsoft Visio based on the business logic and requirement.

●Dealt with high volumes of data and hence leveraged Teradata utilities for performance enhancement.

●Migration of ETL jobs into testing and pre-production repository environments.

●Code walk-throughs with clients before migration to production.

Cognizant (Project Details Dec 2011 – Jan 2015)

Industry: Banking and Financial Services

Role: ETL Developer

Duration: Dec 2011 – Jan 2015

Project Name: Payment Protection Insurance

Technology Used: Informatica PowerCenter, Oracle

Summary of Project:

The project provides reports to HMRC (HM Revenue and Customs) within 7 days of the end of each quarter. The reports are used to analyse customers who fall into the exception and matched categories for PPI-related payments in the UKRBB cluster; after analysis, all customers falling under exception are reported to PPI Finance and corrected in a timely manner. For this purpose, a three-layer architecture is used, built on ETL processes in Informatica PowerCenter 9.x: data is gathered from two different source systems, from a relational database and sometimes from flat files, transformed according to the business logic and delivered to the client in the form of reports.

Responsibilities:

●Responsible for analyzing, designing and developing ETL strategies and processes.

●Worked in coordination with other application-system developers and the team lead to understand the requirements and fetch the data correctly.

●Involved in developing Mappings and reusable transformations using Informatica Designer.

●Source system data from the distributed environment was extracted, transformed and loaded into the Data warehouse database using Informatica.

●Extracted Data from flat files and various relational databases to Oracle Data Warehouse Database.

●Created the unit test cases for mappings developed and verified the data.

●Conducted the peer review for mappings developed and tested.

●Used most of the core transformations, such as Source Qualifier, Expression, Joiner, Aggregator, Lookup, Filter, Rank and Update Strategy.

●Developed and modified mappings according to the business logic.

●Used Informatica Workflow Manager to create, schedule and monitor sessions and to send pre- and post-session emails communicating the success or failure of session execution.

Academic Qualification

●B.E (CSE) from North Maharashtra University with 66.15% (Distinction) in 2011.

●H.S.C from Patna Muslim High School (C.B.S.E) with 72.20% in 2007.

●S.S.C from Infant Jesus School (C.B.S.E) with 82% in 2005.

Personal Details

●Date of Birth : 5th Jan 1990

●Marital Status : Married

●Languages Known : English, Hindi, French (learning)

Signature

Place: Montreal, Quebec, Canada

Ekta Kumari

Date: 25 Feb 2019


