Tanmoy
Contact: +971-***-***-***
Email: ************@*****.***
Nationality: Indian
Country of Residence: Dubai, UAE
PROFILE
Currently working as a Software Developer with additional hands-on responsibility as a Data Engineer. 10+ years of experience as a developer and architect, including 5 years as a Big Data programmer and 3+ years as a Scala developer.
Experience as a Solution Architect, Developer and DevOps engineer on business analytics, data analytics, big data and business intelligence platforms for domains/verticals such as healthcare, telecom, banking, retail and e-commerce. Architected scalable systems to process large structured and quasi-structured datasets (approximately 1 TB of data daily). Coordinated with directors and VPs to define new KPIs, and collaborated with Sales and Marketing teams to explain the findings of the analytics projects.
Currently working with Scala, Python, JavaScript and Kafka to build the backend (BackOffice portal) and data platform for a financial company using a hybrid (in-house and AWS) approach.
Employment Visa: Dubai Work Visa
CORE COMPETENCIES
Statistical and Data Modelling – R, Python
Visual Analysis – Tableau
Real-time/Big Data Analytical Solutions – Kafka, Talend, Apache NiFi, Sqoop, Flume, Spark
Hadoop Development – Hive, HDFS, Splunk, Pig, ELK
Programming Languages – Core Java, Python, Scala, JavaScript (Node.js, Express.js, HTML5)
Databases and NoSQL – Teradata, Greenplum, HBase, Elasticsearch, MySQL NDB
Data Integration/Analytics – Talend, Python, HVR, custom Scala pipelines
Data Mining – Text Analytics, Clustering, Classification
Frameworks – Play, MapReduce, Spark, Akka, Anorm
EXPERIENCE
Software Developer
Company – Payments Startup – Dubai, UAE <Aug-2018 to Present>
Backend, Reporting and Data Platform
Role: Scala Backend Developer/ Data Engineer
Project Overview: Reporting directly to the Managing Director, I work with the Infrastructure, Erlang and Android/iOS teams to build a real-time web portal that is continuously updated from mobile transactions and payment-gateway systems. It gives an aggregated view of a customer across channels: every interaction rolls up into a single 'One View' of the customer's activity with our products. The average latency is about 10 seconds and is fully configurable across the system.
Responsibilities:
Develop Scala microservices using Play, Anorm and Akka (see the sketch after this list)
Software development using Functional Programming best practices
Develop Python utilities for backend data handling and serve them via Node.js
Develop backend services using Scala, Node.js, Python, Express and MongoDB
Code review and version control using Bitbucket
Real Time Data Platform development using Docker, Kafka, Spark, Greenplum and Hadoop
Kafka Connect – MySQL, HDFS and Postgres connectors
Backend using Couchbase, HBase, Hive and MySQL-NDB
Develop data pipelines using Scala, Python, Spark, GitHub and Docker
DevOps using Jenkins, Bitbucket, Scripting, Ansible, Rancher
Cloud deployment of Big Data Stack – AWS, Rackspace, OpenShift
Comprehensive unit-tests and integration tests
Data Architecture Design and Third-Party Vendor Management
Sprint Planning and Backlog grooming
Proposed Fraud detection on Apache Spark streams
Proposed ELK for Log Analytics across apps
Championed design decisions within the team that later paid off well in production
Working knowledge of Apache Beam, Flink, Flume, Kubernetes and secure real-time streaming
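Below is a minimal sketch of the kind of Play/Anorm endpoint referenced in the first bullet above; the controller, table and column names are illustrative placeholders, not the production schema.

    import javax.inject.Inject
    import anorm._
    import play.api.db.Database
    import play.api.libs.json.Json
    import play.api.mvc.{AbstractController, ControllerComponents}

    // Hypothetical read endpoint: returns a customer's latest transactions as JSON.
    class TransactionController @Inject()(db: Database, cc: ControllerComponents)
        extends AbstractController(cc) {

      def latest(customerId: Long) = Action {
        val rows = db.withConnection { implicit conn =>
          SQL"""SELECT channel, amount
                FROM transactions
                WHERE customer_id = $customerId
                ORDER BY created_at DESC
                LIMIT 20"""
            .as((SqlParser.str("channel") ~ SqlParser.get[BigDecimal]("amount"))
              .map { case channel ~ amount =>
                Json.obj("channel" -> channel, "amount" -> amount)
              }.*)
        }
        Ok(Json.toJson(rows))
      }
    }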
Data Lake Engineer/Principal Consultant - Big Data
Company – Manulife – Hong Kong <Oct-2017 to Aug-2018>
Data Lake Platform Developer
Role: Data Lake Engineer / Java Developer
Project Overview: Developed a data platform holding data for the entire South-East Asia region (spanning 12 countries), working with country-specific teams; I was responsible for the Japan, Philippines, Cambodia and Hong Kong businesses. The platform gave management the real power of data and feedback on how different products are used across countries. I integrated Apache Ranger and Atlas for metadata management and data governance, and helped streamline the data-governance process from managing metadata in Excel to using the Hortonworks portal.
Responsibilities:
Designed and implemented the CDC ingestion framework using Apache NiFi
Created analytical functions (UDFs) in Apache Hive using Core Java (see the sketch after this list)
Built a metadata-management application using Apache Ranger, Knox and Atlas
Designed and onboarded new sources into the Data Lake
Log monitoring using Splunk and Elasticsearch
Worked with the Hortonworks HDP and HDF platforms
Migrated DWH and analytics platforms to Big Data
Built ETL pipelines using Jython and Groovy
Managed 4 team members
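As an illustration of the analytical-function work above, here is a minimal Hive UDF. The production functions were written in Core Java; this sketch uses Scala (which compiles to the same JVM bytecode Hive loads), and the function's purpose and names are hypothetical.

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Hypothetical UDF: masks all but the last four characters of a policy number.
    // Registered in Hive with:
    //   ADD JAR mask-udf.jar;
    //   CREATE TEMPORARY FUNCTION mask_policy AS 'MaskPolicyNumber';
    class MaskPolicyNumber extends UDF {
      def evaluate(input: Text): Text =
        if (input == null) null
        else {
          val s = input.toString
          new Text(("*" * math.max(0, s.length - 4)) + s.takeRight(4))
        }
    }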
Senior Data Engineer/Program Manager - Big Data and Analytics
Company – GE - India <July-2016 to Oct-2017>
Build the Data Lake for GE
Role: Data Integration Architect and Data Analytics.
Project Overview: Joined as a Program Manager to manage the transformation of GE systems. Worked with GE Financials, Transportation, Aviation and Healthcare to build a central data repository. Managed third-party relationships, integrations and contracts (Rapid Ratings, HVR, AWS, OpenSCG).
Also took up a more hands-on role, delivering analytics for Fleet Management at GE Transportation.
Responsibilities:
Led a team of 30 members (both contractors and FTEs)
Designed and architected the Big Data ingestion framework
Transformed GE legacy systems to an open-source platform
Designed and onboarded new sources into the Data Lake
Customized the Big Data framework using Core Java and Python
Designed data systems to handle huge data volumes and run in real time
Migrated terabytes of data from Oracle and ERP systems to MPP, cloud and Hadoop platforms (see the sketch after this list)
Conducted a feasibility study for moving prediction models to the AWS cloud
Ensemble modelling and classification for GE Transportation
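A minimal sketch of the Oracle-to-Hadoop migration pattern referenced above, assuming a Spark batch job reading over JDBC; the connection string, table, bounds and paths are placeholders.

    import org.apache.spark.sql.SparkSession

    object OracleToHdfs {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("oracle-to-hdfs").getOrCreate()

        // Parallel JDBC read, split on a numeric key so 32 tasks pull concurrently.
        val df = spark.read.format("jdbc")
          .option("url", "jdbc:oracle:thin:@//db-host:1521/ORCL") // placeholder
          .option("dbtable", "FLEET.WORK_ORDERS")                 // placeholder
          .option("user", sys.env("DB_USER"))
          .option("password", sys.env("DB_PASS"))
          .option("partitionColumn", "WORK_ORDER_ID")
          .option("lowerBound", "1")
          .option("upperBound", "100000000")
          .option("numPartitions", "32")
          .load()

        // Land the table in the lake as Parquet for downstream Hive/Spark use.
        df.write.mode("overwrite").parquet("hdfs:///lake/raw/fleet/work_orders")

        spark.stop()
      }
    }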
Product Engineering of Big Data Lake
Company – EMC <May-2015 to July-2016>
Built an Open Data Platform for a smart-city implementation.
Role: Data Integration Architect and Data Analytics.
Project Description: Worked on the Singapore tourism portal to streamline tourist bookings and the services offered, based on flight and tourism data; this was part of the Singapore smart-city project, alongside multiple POCs for the government's smart-city initiative.
Smart-grid project for the Govt. of Kerala
CDR analytics – using CDRs to propose telephone plans for visiting tourists
Responsibilities:
Extracted data from smart sensors, telecom towers and smart meters
Ingested CDR data in real time
Designed and architected the solution/framework
Designed data systems to handle huge data volumes and run in real time
Migrated terabytes of data from Greenplum into HDFS
Consulted with downstream systems to build a complete QDD model on NoSQL
Backend development in Core Java and Python
Stress-tested the newly developed system
Integrated Apache NiFi, Sqoop, HDFS and Hive with Teradata and MySQL
Hive performance tuning
Used Apache Spark for smart-sensor data processing
Built a scoring/clustering model for tourist attractions and local businesses (see the sketch after this list)
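A sketch of the scoring/clustering approach from the last bullet, assuming Spark MLlib's k-means over per-attraction aggregates; the feature columns and paths are hypothetical.

    import org.apache.spark.ml.clustering.KMeans
    import org.apache.spark.ml.feature.VectorAssembler
    import org.apache.spark.sql.SparkSession

    object AttractionClustering {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("attraction-clustering").getOrCreate()

        // Hypothetical per-attraction aggregates derived from sensor and CDR data.
        val stats = spark.read.parquet("hdfs:///lake/curated/attraction_stats")

        val features = new VectorAssembler()
          .setInputCols(Array("daily_visitors", "avg_dwell_minutes", "repeat_ratio"))
          .setOutputCol("features")
          .transform(stats)

        // Cluster attractions into segments; segment membership feeds the scoring model.
        val model = new KMeans().setK(5).setSeed(42L).fit(features)
        model.transform(features)
          .select("attraction_id", "prediction")
          .write.mode("overwrite")
          .parquet("hdfs:///lake/curated/attraction_segments")

        spark.stop()
      }
    }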
Worker Passion Analysis for Big-4 company
Company – Deloitte <Nov-2013 to May-2015>
A feedback and Twitter-listening tool to predict the Employee Passion Index.
Role: Data scientist, Solution Architect, and Developer.
Responsibilities:
Designed and architected the solution/framework and the scalable system
Designed data systems to handle huge data volumes and run in real time
Developed data-analytics programs – Twitter data analysis and feature engineering (see the sketch after this list)
Built the scoring model
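A sketch of the kind of text-scoring pipeline this project involved, assuming Spark ML over a labelled feedback sample; the input paths and the "text"/"label" columns are illustrative.

    import org.apache.spark.ml.Pipeline
    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
    import org.apache.spark.sql.SparkSession

    object PassionScoring {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("passion-scoring").getOrCreate()

        // Hypothetical training set: feedback/tweet text with a 0/1 sentiment label.
        val labelled = spark.read.parquet("hdfs:///data/feedback_labelled")

        val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
        val tf = new HashingTF().setInputCol("words").setOutputCol("features")
        val lr = new LogisticRegression().setMaxIter(20) // expects "label"/"features"

        // Tokenise -> hashed term frequencies -> logistic-regression score.
        val model = new Pipeline().setStages(Array(tokenizer, tf, lr)).fit(labelled)

        model.transform(spark.read.parquet("hdfs:///data/feedback_new"))
          .select("text", "probability")
          .show(10, truncate = false)

        spark.stop()
      }
    }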
Health Solution Implementation for Insurance Company
Company – Deloitte <Nov-2013 to May-2015>
A predictive and analytical solution for health care and insurance products
Role: Data Scientist, Solution Architect, Developer.
Responsibilities:
Designed and architected the solution/framework and the visual analysis framework
Coded and built the analytical models
Led the team, which built the data model and the reports
Coordinated with the client to gather requirements and iteratively review the results
Reporting Framework for C-Level executives in Big-4 company
Company – Deloitte <Nov-2013 to May-2015>
A scalable solution for business-unit-level reporting of resource management.
Role: Architect, Developer.
Responsibilities:
Complete development of the backend framework for the Tableau reports
Architected and developed Tableau reports, published for executive viewing
Developed an NPS system to improve sales for a retail company
Company – Sears <Nov-2012 to Nov-2013>
Development of an automated feedback system to collect and measure Net Promoter Score.
Role: Architect, Developer.
Designed a framework to process data from the EDW and third-party tools. Scaled the analytics platform to handle 11 billion rows in a single batch. Automated the generation of training datasets using Python and trained the models in R.
Responsibilities:
Worked on APIs and feedback-data munging to understand customer sentiment
Developed reports based on customer feedback and sentiment
Predicted churn rate and attrition based on reviews
A/B testing
Implemented solutions using HDFS as the source and Pig and Hive for data analysis
Automated the source-data ingestion using Python and API calls (see the sketch after this list)
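The ingestion automation above was written in Python; for language consistency with the other sketches, here is the same pull-and-stage pattern in Scala. The endpoint, paging scheme and staging path are placeholders.

    import java.net.URI
    import java.net.http.{HttpClient, HttpRequest, HttpResponse}
    import java.nio.file.{Files, Paths, StandardOpenOption}

    // Pulls one page of feedback records from a placeholder REST endpoint and
    // appends the raw JSON to a staging file that a batch job later loads into HDFS.
    object FeedbackIngest {
      def main(args: Array[String]): Unit = {
        val client = HttpClient.newHttpClient()
        val request = HttpRequest.newBuilder()
          .uri(URI.create("https://api.example.com/v1/feedback?page=1")) // placeholder
          .header("Accept", "application/json")
          .build()

        val response = client.send(request, HttpResponse.BodyHandlers.ofString())
        if (response.statusCode() == 200)
          Files.write(
            Paths.get("/staging/feedback/page-1.json"),
            response.body().getBytes("UTF-8"),
            StandardOpenOption.CREATE, StandardOpenOption.APPEND)
        else
          sys.error(s"Feedback API returned ${response.statusCode()}")
      }
    }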
Developed Risk Management Module for a Leading Bank
Company – Mahindra Satyam <Oct-2011 to Nov-2012>
Development of an Anti-Money-Laundering and Quantitative Risk Management system using Basel norms for the CASA division
Role: Designer, Developer.
Responsibilities:
Worked on automation of data ingestion and cleansing
Developed MIS reports for directors
Performance-tuned the system to deliver near-real-time (NRT) alerts
BI Programmer for a Leading Telecom company
Company – Accenture <June-2009 to Sep-2011>
Development of data marts, dimensions and facts using Java, Python, SQL and shell scripting
Role: Designer, Developer.
Responsibilities:
Worked on Data loading, transformation, aggregation and reporting
Python programming to create the ETL flow
Coached new team members
TECHNICAL SKILLS SUMMARY
Languages & Development Tools – Python, R, SQL, HTML, Java, Scala, JavaScript
Visual Analytics – Tableau
Database Management Systems – MySQL, Greenplum, Pivotal HD, Teradata, Cassandra, Pig, Hive, PostgreSQL, MongoDB, HBase
Big Data Frameworks – AWS, Hadoop and related technologies, Tableau, D3.js
Streaming – Apache Spark, Kafka, NiFi
DevOps – Docker, Ansible, Rancher, VirtualBox
Frameworks – Akka, Play, MapReduce, Spark
EDUCATION
BITS Pilani, India – PG Certificate in Big Data Engineering <Sep-2018 to Feb-2020>
Asansol Engineering College, WBUT – B.Tech (Computer Engineering) <June-2004 to Aug-2008>
ACADEMIC PROJECTS
AML system for a leading bank, based on customer profiles and usage statistics; used ensemble modelling to improve prediction accuracy.
Credit Card Fraud Detection – Capstone Project at BITS-Pilani to identify fraudulent credit card transactions in real time.
Technologies used: Apache Kafka, Spark, HBase, Hive, AWS RDS, Sqoop, Java
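A minimal sketch of the streaming leg of this capstone, assuming Spark Structured Streaming reading card transactions from Kafka; the broker, topic and flagging rule are illustrative (the project scored transactions against per-card profiles in HBase).

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._
    import org.apache.spark.sql.types._

    object FraudStream {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("fraud-stream").getOrCreate()
        import spark.implicits._

        val schema = new StructType()
          .add("card_id", StringType)
          .add("amount", DoubleType)
          .add("merchant", StringType)

        // Read raw transactions from Kafka and parse the JSON payload.
        val txns = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092") // placeholder
          .option("subscribe", "transactions")              // placeholder topic
          .load()
          .select(from_json($"value".cast("string"), schema).as("t"))
          .select("t.*")

        // Toy rule for illustration: flag unusually large transactions.
        txns.filter($"amount" > 10000.0)
          .writeStream
          .format("console")
          .outputMode("append")
          .start()
          .awaitTermination()
      }
    }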
Saavn Machine Learning – Saavn is a leading music streaming company in India. Applied Big Data and Spark Streaming to build a song recommendation engine.
Technologies used: Spark Machine Learning, Spark Streaming, Kafka-Spark, HBase, Hive, Java
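A sketch of the collaborative-filtering core such a recommendation engine typically uses, assuming Spark ML's ALS over user-song play counts; the dataset path and column names are illustrative.

    import org.apache.spark.ml.recommendation.ALS
    import org.apache.spark.sql.SparkSession

    object SongRecommender {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("song-recommender").getOrCreate()

        // Hypothetical implicit-feedback dataset: (user_id, song_id, play_count).
        val plays = spark.read.parquet("hdfs:///data/saavn/plays")

        val als = new ALS()
          .setUserCol("user_id")
          .setItemCol("song_id")
          .setRatingCol("play_count")
          .setImplicitPrefs(true) // play counts are implicit, not explicit ratings
          .setRank(10)
          .setMaxIter(10)

        // Train and emit the top 10 song recommendations per user.
        als.fit(plays).recommendForAllUsers(10)
          .write.mode("overwrite")
          .parquet("hdfs:///data/saavn/recommendations")

        spark.stop()
      }
    }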