Tanmoy
Contact: +971-***-***-***
Email: ************@*****.***
Nationality: Indian
Country of Residence: Dubai, UAE
PROFILE
Currently working as a Software Developer with additional hands-on responsibility as a Data Engineer. 10+ years of experience as a developer and architect, including 5 years as a Big Data programmer and 3+ years as a Scala developer.
Experience as a Solution Architect, Developer and DevOps engineer on business analytics, data analytics, big data and business intelligence platforms for domains/verticals such as healthcare, telecom, banking, retail and e-commerce. Architected scalable systems to process large structured and quasi-structured datasets (approximately 1 TB of data daily). Coordinated with directors and VPs to define new KPIs, and collaborated with Sales and Marketing teams to explain the findings of the analytics projects.
Currently working with Scala, Python, JavaScript and Kafka to build the backend (BackOffice portal) and data platform for a financial company using a hybrid (in-house and AWS) approach.
Employment Visa: Dubai Work Visa
CORE COMPETENCIES
Statistical and Data Modelling – R, Python
Visual Analysis – Tableau
Real-time/Big Data Analytical Solutions – Kafka, Talend, Apache NiFi, Sqoop, Flume, Spark
Hadoop Development – Hive, HDFS, Splunk, Pig, ELK
Programming Languages – Core Java, Python, Scala, JavaScript (Node.js, Express.js, HTML5)
Databases and NoSQL – Teradata, Greenplum, HBase, Elasticsearch, MySQL NDB
Data Integration/Analytics – Talend, Python, HVR, custom Scala pipelines
Data Mining – Text Analytics, Clustering, Classification
Frameworks – Play, MapReduce, Spark, Akka, Anorm
EXPERIENCE
Software Developer
Company – Payments Startup – Dubai, UAE <Aug-2018 to Present>
Backend, Reporting and Data Platform
Role: Scala Backend Developer/ Data Engineer
Project Overview: Reporting directly to the Managing Director, I work with the Infrastructure, Erlang and Android/iOS teams to build a real-time web portal that is continuously updated from mobile transactions and payment-gateway systems. It gives an aggregated view of a customer across channels: every interaction rolls up into a single 'One View' of the customer's activity with our products. The average latency is about 10 seconds and is fully configurable across the system.
Responsibilities:
Develop Scala microservices using Play, Anorm and Akka (see the sketch after this list)
Software development using Functional Programming best practices
Develop Python utilities for backend data handling and serve them via Node.js
Develop backend services using Scala, Node.js, Python, Express and MongoDB
Code review and version control using Bitbucket
Real Time Data Platform development using Docker, Kafka, Spark, Greenplum and Hadoop
Kafka Connect – MySQL, HDFS and Postgres connectors
Backend using Couchbase, HBase, Hive and MySQL-NDB
Develop data pipelines using Scala, Python, Spark, GitHub and Docker
DevOps using Jenkins, Bitbucket, Scripting, Ansible, Rancher
Cloud deployment of Big Data Stack – AWS, Rackspace, OpenShift
Comprehensive unit-tests and integration tests
Data Architecture Design and Third-Party Vendor Management
Sprint Planning and Backlog grooming
Proposed Fraud detection on Apache Spark streams
Proposed ELK for Log Analytics across apps
Championed design decisions within the team that later paid off well in production
Working knowledge of Apache Beam, Flink, Flume, Kubernetes and secure real-time streaming
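Below is a minimal sketch of the kind of Play/Anorm endpoint referenced in the first bullet above; the controller, table and column names are illustrative placeholders, not the production schema.

    import javax.inject.Inject
    import anorm._
    import play.api.db.Database
    import play.api.libs.json.Json
    import play.api.mvc.{AbstractController, ControllerComponents}

    // Hypothetical read endpoint: returns a customer's latest transactions as JSON.
    class TransactionController @Inject()(db: Database, cc: ControllerComponents)
        extends AbstractController(cc) {

      def latest(customerId: Long) = Action {
        val rows = db.withConnection { implicit conn =>
          SQL"""SELECT channel, amount
                FROM transactions
                WHERE customer_id = $customerId
                ORDER BY created_at DESC
                LIMIT 20"""
            .as((SqlParser.str("channel") ~ SqlParser.get[BigDecimal]("amount"))
              .map { case channel ~ amount =>
                Json.obj("channel" -> channel, "amount" -> amount)
              }.*)
        }
        Ok(Json.toJson(rows))
      }
    }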
Data Lake Engineer/Principal Consultant - Big Data
Company – Manulife – Hong Kong <Oct-2017 to Aug-2018>
Data Lake Platform Developer
Role: Data Lake Engineer / Java Developer
Project Overview: Developed a data platform holding data for the entire South-East Asia region (spanning 12 countries), working with country-specific teams; I was responsible for the Japan, Philippines, Cambodia and Hong Kong businesses. The platform gave management the real power of data and feedback on how different products are used across countries. I integrated Apache Ranger and Atlas for metadata management and data governance, and helped streamline the data-governance process from managing metadata in Excel to using the Hortonworks portal.
Responsibilities:
Designed and implemented the CDC ingestion framework using Apache NiFi
Created analytical functions (UDFs) in Apache Hive using Core Java (see the sketch after this list)
Built a metadata-management application using Apache Ranger, Knox and Atlas
Designed and onboarded new sources into the Data Lake
Log monitoring using Splunk and Elasticsearch
Worked with the Hortonworks HDP and HDF platforms
Migrated DWH and analytics platforms to Big Data
Built ETL pipelines using Jython and Groovy
Managed 4 team members
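As an illustration of the analytical-function work above, here is a minimal Hive UDF. The production functions were written in Core Java; this sketch uses Scala (which compiles to the same JVM bytecode Hive loads), and the function's purpose and names are hypothetical.

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Hypothetical UDF: masks all but the last four characters of a policy number.
    // Registered in Hive with:
    //   ADD JAR mask-udf.jar;
    //   CREATE TEMPORARY FUNCTION mask_policy AS 'MaskPolicyNumber';
    class MaskPolicyNumber extends UDF {
      def evaluate(input: Text): Text =
        if (input == null) null
        else {
          val s = input.toString
          new Text(("*" * math.max(0, s.length - 4)) + s.takeRight(4))
        }
    }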
Senior Data Engineer/Program Manager - Big Data and Analytics
Company – GE - India <July-2016 to Oct-2017>
Build the Data Lake for GE
Role: Data Integration Architect and Data Analytics.
Project Overview: Joined as a Program Manager to manage the transformation of GE systems. Worked with GE Financials, Transportation, Aviation and Healthcare to build a central data repository. Managed third-party relationships, integrations and contracts (Rapid Ratings, HVR, AWS, OpenSCG).
Also took up a more hands-on role, delivering analytics for Fleet Management at GE Transportation.
Responsibilities:
Led a team of 30 members (both contractors and FTEs)
Designed and architected the Big Data ingestion framework
Transformed GE legacy systems to an open-source platform
Designed and onboarded new sources into the Data Lake
Customized the Big Data framework using Core Java and Python
Designed data systems to handle huge data volumes and run in real time
Migrated terabytes of data from Oracle and ERP systems to MPP, cloud and Hadoop platforms (see the sketch after this list)
Conducted a feasibility study for moving prediction models to the AWS cloud
Ensemble modelling and classification for GE Transportation
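A minimal sketch of the Oracle-to-Hadoop migration pattern referenced above, assuming a Spark batch job reading over JDBC; the connection string, table, bounds and paths are placeholders.

    import org.apache.spark.sql.SparkSession

    object OracleToHdfs {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("oracle-to-hdfs").getOrCreate()

        // Parallel JDBC read, split on a numeric key so 32 tasks pull concurrently.
        val df = spark.read.format("jdbc")
          .option("url", "jdbc:oracle:thin:@//db-host:1521/ORCL") // placeholder
          .option("dbtable", "FLEET.WORK_ORDERS")                 // placeholder
          .option("user", sys.env("DB_USER"))
          .option("password", sys.env("DB_PASS"))
          .option("partitionColumn", "WORK_ORDER_ID")
          .option("lowerBound", "1")
          .option("upperBound", "100000000")
          .option("numPartitions", "32")
          .load()

        // Land the table in the lake as Parquet for downstream Hive/Spark use.
        df.write.mode("overwrite").parquet("hdfs:///lake/raw/fleet/work_orders")

        spark.stop()
      }
    }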
Product Engineering of Big Data Lake
Company – EMC <May-2015 to July-2016>
Built an Open Data Platform for a smart-city implementation.
Role: Data Integration Architect and Data Analytics.
Project Description: Worked on the Singapore tourism portal to streamline tourist bookings and the services offered, based on flight and tourism data; this was part of the Singapore smart-city project, alongside multiple POCs for the government's smart-city initiative.
Smart-grid project for the Govt. of Kerala
CDR analytics – using CDRs to propose telephone plans for visiting tourists
Responsibilities:
Extracted data from smart sensors, telecom towers and smart meters
Ingested CDR data in real time
Designed and architected the solution/framework
Designed data systems to handle huge data volumes and run in real time
Migrated terabytes of data from Greenplum into HDFS
Consulted with downstream systems to build a complete QDD model on NoSQL
Backend development in Core Java and Python
Stress-tested the newly developed system
Integrated Apache NiFi, Sqoop, HDFS and Hive with Teradata and MySQL
Hive performance tuning
Used Apache Spark for smart-sensor data processing
Built a scoring/clustering model for tourist attractions and local businesses (see the sketch after this list)
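A sketch of the scoring/clustering approach from the last bullet, assuming Spark MLlib's k-means over per-attraction aggregates; the feature columns and paths are hypothetical.

    import org.apache.spark.ml.clustering.KMeans
    import org.apache.spark.ml.feature.VectorAssembler
    import org.apache.spark.sql.SparkSession

    object AttractionClustering {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("attraction-clustering").getOrCreate()

        // Hypothetical per-attraction aggregates derived from sensor and CDR data.
        val stats = spark.read.parquet("hdfs:///lake/curated/attraction_stats")

        val features = new VectorAssembler()
          .setInputCols(Array("daily_visitors", "avg_dwell_minutes", "repeat_ratio"))
          .setOutputCol("features")
          .transform(stats)

        // Cluster attractions into segments; segment membership feeds the scoring model.
        val model = new KMeans().setK(5).setSeed(42L).fit(features)
        model.transform(features)
          .select("attraction_id", "prediction")
          .write.mode("overwrite")
          .parquet("hdfs:///lake/curated/attraction_segments")

        spark.stop()
      }
    }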
Worker Passion Analysis for Big-4 company
Company – Deloitte <Nov-2013 to May-2015>
A feedback and Twitter-listening tool to predict the Employee Passion Index.
Role: Data scientist, Solution Architect, and Developer.
Responsibilities:
Designed and architected the solution/framework and the scalable system
Designed data systems to handle huge data volumes and run in real time
Developed data-analytics programs – Twitter data analysis and feature engineering (see the sketch after this list)
Built the scoring model
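A sketch of the kind of text-scoring pipeline this project involved, assuming Spark ML over a labelled feedback sample; the input paths and the "text"/"label" columns are illustrative.

    import org.apache.spark.ml.Pipeline
    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
    import org.apache.spark.sql.SparkSession

    object PassionScoring {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("passion-scoring").getOrCreate()

        // Hypothetical training set: feedback/tweet text with a 0/1 sentiment label.
        val labelled = spark.read.parquet("hdfs:///data/feedback_labelled")

        val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
        val tf = new HashingTF().setInputCol("words").setOutputCol("features")
        val lr = new LogisticRegression().setMaxIter(20) // expects "label"/"features"

        // Tokenise -> hashed term frequencies -> logistic-regression score.
        val model = new Pipeline().setStages(Array(tokenizer, tf, lr)).fit(labelled)

        model.transform(spark.read.parquet("hdfs:///data/feedback_new"))
          .select("text", "probability")
          .show(10, truncate = false)

        spark.stop()
      }
    }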
Health Solution Implementation for Insurance Company
Company – Deloitte <Nov-2013 to May-2015>
A predictive and analytical solution for health care and insurance products
Role: Data Scientist, Solution Architect, Developer.
Responsibilities:
Designed and architected the solution/framework and the visual analysis framework
Coded and built the analytical models
Led the team, which built the data model and the reports
Coordinated with the client to gather requirements and iteratively review the results
Reporting Framework for C-Level executives in Big-4 company
Company – Deloitte <Nov-2013 to May-2015>
A scalable solution for business-unit-level reporting of resource management.
Role: Architect, Developer.
Responsibilities:
Complete development of the backend framework for the Tableau reports
Architected and developed Tableau reports, published for executive viewing
Developed an NPS system to improve sales for a retail company
Company – Sears <Nov-2012 to Nov-2013>
Development of an automated feedback system to collect and measure Net Promoter Score.
Role: Architect, Developer.
Designed a framework to process data from the EDW and third-party tools. Scaled the analytics platform to handle 11 billion rows in a single batch. Automated the generation of training datasets using Python and trained the models in R.
Responsibilities:
Worked on APIs and feedback-data munging to understand customer sentiment
Developed reports based on customer feedback and sentiment
Predicted churn rate and attrition based on reviews
A/B testing
Implemented solutions using HDFS as the source and Pig and Hive for data analysis
Automated the source-data ingestion using Python and API calls (see the sketch after this list)
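The ingestion automation above was written in Python; for language consistency with the other sketches, here is the same pull-and-stage pattern in Scala. The endpoint, paging scheme and staging path are placeholders.

    import java.net.URI
    import java.net.http.{HttpClient, HttpRequest, HttpResponse}
    import java.nio.file.{Files, Paths, StandardOpenOption}

    // Pulls one page of feedback records from a placeholder REST endpoint and
    // appends the raw JSON to a staging file that a batch job later loads into HDFS.
    object FeedbackIngest {
      def main(args: Array[String]): Unit = {
        val client = HttpClient.newHttpClient()
        val request = HttpRequest.newBuilder()
          .uri(URI.create("https://api.example.com/v1/feedback?page=1")) // placeholder
          .header("Accept", "application/json")
          .build()

        val response = client.send(request, HttpResponse.BodyHandlers.ofString())
        if (response.statusCode() == 200)
          Files.write(
            Paths.get("/staging/feedback/page-1.json"),
            response.body().getBytes("UTF-8"),
            StandardOpenOption.CREATE, StandardOpenOption.APPEND)
        else
          sys.error(s"Feedback API returned ${response.statusCode()}")
      }
    }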
Developed Risk Management Module for a Leading Bank
Company – Mahindra Satyam <Oct-2011 to Nov-2012>
Development of an Anti-Money-Laundering and Quantitative Risk Management system using Basel norms for the CASA division
Role: Designer, Developer.
Responsibilities:
Worked on automation of data ingestion and cleansing
Developed MIS reports for directors
Performance-tuned the system to deliver near-real-time (NRT) alerts
BI Programmer for a Leading Telecom company
Company – Accenture <June-2009 to Sep-2011>
Development of data marts, dimensions and facts using Java, Python, SQL and shell scripting
Role: Designer, Developer.
Responsibilities:
Worked on Data loading, transformation, aggregation and reporting
Python programming to create the ETL flow
Coached new team members
TECHNICAL SKILLS SUMMARY
Languages & Development Tools – Python, R, SQL, HTML, Java, Scala, JavaScript
Visual Analytics – Tableau
Database Management Systems – MySQL, Greenplum, Pivotal HD, Teradata, Cassandra, Pig, Hive, PostgreSQL, MongoDB, HBase
Big Data Frameworks – AWS, Hadoop and related technologies, Tableau, D3.js
Streaming – Apache Spark, Kafka, NiFi
DevOps – Docker, Ansible, Rancher, VirtualBox
Frameworks – Akka, Play, MapReduce, Spark
EDUCATION
BITS Pilani, India – PG Certificate in Big Data Engineering <Sep-2018 to Feb-2020>
Asansol Engineering College, WBUT – B.Tech (Computer Engineering) <June-2004 to Aug-2008>
ACADEMIC PROJECTS
AML system for a leading bank, based on customer profiles and usage statistics; used ensemble modelling to improve prediction accuracy.
Credit Card Fraud Detection – Capstone Project at BITS-Pilani to identify fraudulent credit card transactions in real time.
Technologies used: Apache Kafka, Spark, HBase, Hive, AWS RDS, Sqoop, Java
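A minimal sketch of the streaming leg of this capstone, assuming Spark Structured Streaming reading card transactions from Kafka; the broker, topic and flagging rule are illustrative (the project scored transactions against per-card profiles in HBase).

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._
    import org.apache.spark.sql.types._

    object FraudStream {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("fraud-stream").getOrCreate()
        import spark.implicits._

        val schema = new StructType()
          .add("card_id", StringType)
          .add("amount", DoubleType)
          .add("merchant", StringType)

        // Read raw transactions from Kafka and parse the JSON payload.
        val txns = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092") // placeholder
          .option("subscribe", "transactions")              // placeholder topic
          .load()
          .select(from_json($"value".cast("string"), schema).as("t"))
          .select("t.*")

        // Toy rule for illustration: flag unusually large transactions.
        txns.filter($"amount" > 10000.0)
          .writeStream
          .format("console")
          .outputMode("append")
          .start()
          .awaitTermination()
      }
    }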
Saavn Machine Learning – Saavn is a leading music streaming company in India. Applied Big Data and Spark Streaming to build a song recommendation engine.
Technologies used: Spark Machine Learning, Spark Streaming, Kafka-Spark, HBase, Hive, Java
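A sketch of the collaborative-filtering core such a recommendation engine typically uses, assuming Spark ML's ALS over user-song play counts; the dataset path and column names are illustrative.

    import org.apache.spark.ml.recommendation.ALS
    import org.apache.spark.sql.SparkSession

    object SongRecommender {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("song-recommender").getOrCreate()

        // Hypothetical implicit-feedback dataset: (user_id, song_id, play_count).
        val plays = spark.read.parquet("hdfs:///data/saavn/plays")

        val als = new ALS()
          .setUserCol("user_id")
          .setItemCol("song_id")
          .setRatingCol("play_count")
          .setImplicitPrefs(true) // play counts are implicit, not explicit ratings
          .setRank(10)
          .setMaxIter(10)

        // Train and emit the top 10 song recommendations per user.
        als.fit(plays).recommendForAllUsers(10)
          .write.mode("overwrite")
          .parquet("hdfs:///data/saavn/recommendations")

        spark.stop()
      }
    }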