
Cloud Solution Architect - Big Data

Toronto, ON, Canada
October 09, 2018




SUBODH KUMAR
5 Greentree Court, Toronto, ON, Canada, M6M 2A6 +1-416-***-****

PROFESSIONAL PROFILE

Experienced leader, thinker, collaborator, and contributor

Years of experience analyzing, conceptualizing, designing, developing and delivering highly reliable and scalable software solutions

13 years of effective teamwork, coordination and communication experience across geographies in multicultural environments

8 years of experience leading teams and projects with Scrum, providing thought leadership and supporting senior management

Years of experience hiring, mentoring and coaching engineers in different tools and technologies

7 years of hands-on experience with Big Data and 2 years with Cloud (AWS) as an AWS Certified Solutions Architect

4 patents filed in the USA and Europe, and multiple in-house publications

KNOWLEDGE/SKILLS

Big Data: Hadoop (HDFS, YARN, MapReduce), Spark, Hive, Sqoop, HBase, Cassandra, Flume, Kafka, Storm, ElasticSearch, Tableau

Cloud (AWS): VPC, IAM, Cognito, EC2, S3, EMR, Kinesis, Lambda, DynamoDB, Redshift, Elasticsearch, QuickSight, CloudWatch

Programming Languages: Java, Scala, C++, C#, Python, Bash, HiveQL, SQL

Web: HTML5, JavaScript, CSS, XML, XSLT, GWT

Platforms: Hortonworks, Cloudera, AWS, Linux, Windows, Android

Others: DevOps using AWS services, Git, ClearCase, Rally, Jira, Scrum, XP, TDD, pair programming

Management: Scrum Master; sprint planning; requirements gathering; people management; project planning, scoping and delivery; stakeholder management; expectation management; conflict management; quality assurance; release management.

EDUCATION AND CERTIFICATIONS

AWS Certified Solutions Architect April 2018-April 2020

EGMP (Executive General Management Program), Indian Institute of Management, Bangalore, July 2013 - May 2014

B.Tech (Bachelor of Technology), Indian Institute of Technology, Guwahati, August 1999 - April 2003

PROFESSIONAL EXPERIENCE


Worked with a team on DevOps and CI/CD using the AWS stack for a critical medical product for a major US client.

Worked with a team to migrate a 2 PB (petabyte) on-premise Hadoop cluster (running various big data solutions and services) to AWS. Multiple AWS infrastructure, database and software services are being used in this project. Technology: HDFS, Spark, YARN, Hive, Sqoop, Oozie, Kafka, EC2, S3, EMR, Kinesis, DynamoDB, Lambda. Team size: 6

SENIOR TECHNICAL ARCHITECT, DATAMETICA, JULY 2016 - APRIL 2017

Primary responsibilities included, but were not limited to: providing technical direction in building large-scale software components, overseeing the agile software development process, working with various stakeholders on SOWs/requirements/design/integration of software components, addressing customer issues/escalations, providing production support and meeting aggressive deadlines.

Hired engineers, team leads and architects. Trained new hires and engineers on Big Data technologies as per business needs.

Created technical requirements from business contracts/SOWs (Statements of Work) through client interactions and workshops. Further did high-level design, guided the team, and oversaw detailed design, development and testing.

Provided consultancy services to multiple clients to solve their big data related problems.

Conceptualized, designed, guided the team, and oversaw low-level design, development and testing of a tool (for a retail website) that shows a product's view, purchase and in-cart counts in real time as well as within a specified time window.
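The windowed-counting idea behind such a tool can be sketched as follows. This is a minimal illustrative sketch only, not the actual implementation: the event names, tuple layout and 60-second window size are assumptions, and a production version would consume a stream (e.g. Kafka) rather than a Python list.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 60  # hypothetical tumbling-window size

def window_start(ts: int) -> int:
    """Align a Unix timestamp to the start of its tumbling window."""
    return ts - (ts % WINDOW_SECONDS)

def count_events(events):
    """events: iterable of (timestamp, product_id, event_type) tuples.

    Returns {window_start: Counter({(product_id, event_type): count})},
    i.e. per-product, per-event counts within each time window.
    """
    windows = defaultdict(Counter)
    for ts, product, event_type in events:
        windows[window_start(ts)][(product, event_type)] += 1
    return windows

# Hypothetical view/bought/cart events for two products.
events = [
    (100, "p1", "view"), (110, "p1", "view"), (115, "p1", "cart"),
    (125, "p2", "bought"), (170, "p1", "view"),
]
counts = count_events(events)
# Window [60, 120): p1 has 2 views and 1 cart; window [120, 180): p2 bought 1.
```

Real-time totals then fall out of summing a product's counters across the windows of interest.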

Designed, guided the team, and oversaw detailed design, development and testing of a Teradata-to-Hadoop warehouse migration.

Conceptualized, designed and oversaw development of multiple Big Data utility tools: 1) to calculate a data-quality score and create a quality map/report for Hive tables; 2) to unit-test Hive, Pig and shell scripts; 3) to create Hive tables from schema-less JSON files; 4) a data-ingestion tool to reduce production ingestion workload.
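The data-quality scoring in tool 1) can be illustrated with a simple completeness metric. This sketch is an assumption about one plausible scoring rule (fraction of non-null cells per column); the actual tool operated on Hive tables and its scoring rules are not described in this resume.

```python
def quality_score(rows, columns):
    """Compute a per-column completeness score and an overall score.

    rows: list of dicts (one per record); columns: column names to check.
    Each per-column score is the fraction of non-null values; the overall
    score is the mean across columns.
    """
    if not rows:
        return {c: 0.0 for c in columns}, 0.0
    per_column = {}
    for col in columns:
        non_null = sum(1 for r in rows if r.get(col) is not None)
        per_column[col] = non_null / len(rows)
    overall = sum(per_column.values()) / len(columns)
    return per_column, overall

# Hypothetical sample of a table with a few missing values.
rows = [
    {"id": 1, "name": "a", "price": 10.0},
    {"id": 2, "name": None, "price": 20.0},
    {"id": 3, "name": "c", "price": None},
    {"id": 4, "name": "d", "price": 40.0},
]
scores, overall = quality_score(rows, ["id", "name", "price"])
# id is fully populated; name and price each have one null out of four rows.
```

A quality map/report is then just these per-column scores rendered per table.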

Oversaw tailoring of Airbnb's ReAir (which syncs Hive tables between two Hadoop clusters) to work with HDP 2.1, Hadoop 2.3 and Java 7 (lower versions of HDP, Hadoop and Java than ReAir's prerequisites) and with AWS EC2 and S3. Technology: Hadoop, Hive, Sqoop, HBase, Spark, Flume, Kafka, EC2, S3. Team size: 5 to 15

SOFTWARE ARCHITECT, PHILIPS, OCT 2010 - JULY 2016

Participated in department-level activities such as department roadmap discussions, project feasibility studies, and assessments of ongoing projects, especially those related to Big Data. Conducted many training sessions and presentations on Big Data technologies.

Conducted Scrum ceremonies and served as Scrum Master for the agile software development process that all projects in the company followed.

Oversaw use of TDD and pair programming to achieve high code quality and to reduce overall development time.

Hired engineers and team leads. Trained new hires and existing engineers on Big Data technologies as per business needs.

Led the team, managed stakeholder communications, did high-level design, and oversaw detailed design, development and testing of an EMR (Electronic Medical Record) system built on Big Data technologies. This EMR also migrates data from an older Oracle-based EMR.

Led the team, managed stakeholder communications, and designed and developed a tool using correlation, hypothesis-testing and pattern-matching algorithms on cardiac patient data to construct data for a prediction model. The tool stored the constructed data in ElasticSearch for clinical use and analytics, helping clinicians make faster decisions and reduce door-to-balloon time.

Designed and developed an ETL (Extract, Transform and Load) solution that extracts logs from various medical devices (MR machines, ultrasounds, etc.), transforms them, and loads them into ElasticSearch for operations-optimization analytics.
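The extract/transform/load flow of such a pipeline can be sketched as below. Everything here is hypothetical: the log format, field names and derived `device_type` are illustrative assumptions, and an in-memory list stands in for the ElasticSearch index (a real load step would use an ElasticSearch client instead).

```python
# Illustrative ETL sketch: extract fields from raw device-log lines,
# transform them into documents, and "load" them into an in-memory sink
# standing in for an ElasticSearch index.

RAW_LOGS = [
    "2016-03-01T10:00:00|MR-01|WARN|coil temperature high",
    "2016-03-01T10:05:00|US-07|INFO|probe connected",
]

def extract(lines):
    """Parse pipe-delimited log lines into documents."""
    for line in lines:
        ts, device, level, message = line.split("|", 3)
        yield {"timestamp": ts, "device": device, "level": level, "message": message}

def transform(doc):
    """Enrich a document, e.g. derive a device type: "MR-01" -> "MR"."""
    doc["device_type"] = doc["device"].split("-")[0]
    return doc

sink = []  # stand-in for the ElasticSearch index
for doc in extract(RAW_LOGS):
    sink.append(transform(doc))
```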

Developed a tool for Next-Generation Sequencing of genome data and built an automated pipeline for analysis of cancer datasets.

Created a tool for users to work with EC2 without knowing AWS in detail: to the user it was just a Linux terminal, but it ran on EC2.

Designed a simple web-based EMR (both UI and backend) for primary care in the Brazilian market and guided web developers in building it.

Guided Android developers to create an Android-based solution for a Brazilian health-insurance company (Unimed) for claim approvals and for providing different treatment plans to beneficiaries (patients). Technology: HDFS, MapReduce, YARN, Sqoop, Kafka, Storm, HBase, Spark, Spark SQL, Spark MLlib, Flume, ElasticSearch, AWS, EC2, S3, CloudFormation, IAM, CloudWatch, Scala, Java, Python, Bash, HTML5, XML, XSLT, GWT, JavaScript, CSS, MySQL, Windows, Linux, Android. Team size: 1 to 5


Led a software team, participated in some project-management activities, and managed communication with the Danish counterpart, in addition to doing design and development of new features and fixing bugs for pump-design software called Rap1D.

Introduced agile to my team in the form of daily Scrum and monthly sprints, and later helped other teams adapt to agile processes.

Worked on a research project, configuring and modifying open-source software (OpenFOAM) to meet pump designers' CFD simulation needs and to save money by reducing use of costly, per-user-licensed software (CFX). Technology: C++, VC++, Perl, Python, Bash, Data Structures, OpenFOAM, Windows, Linux. Team size: 1 to 5

MEMBER OF TECHNICAL STAFF, CADENCE, JULY 2006 - NOV 2007

Developed PSL-assertion support for VHDL and Verilog on emulation platforms (XTREME and PALLADIUM), where front-end chip designs are verified by simulating the chips on hardware. Technology: C, C++, Lex, Yacc, Linux, ClearCase, Design Patterns, Data Structures, Algorithms. Team size: 6

SOFTWARE ENGINEER, GEOMETRIC, DEC 2004 - MAR 2006

Developed a module of Feature Recognition software to extract complex 3D surfaces from a mesh and flatten them into a 2D plane without losing crucial data such as length and width. Feature Recognition was highly complex, graphics-oriented software that recognizes all features and dimensions of CAD models.

Contributed to customizing the Distributed Object-based Modeling Environment (DOME) to bring it to market with better customer support by removing all open-source libraries and components used in DOME. Technology: C++, Visual Studio, MFC, Data Structures, Design Patterns. Team size: 1 to 2

ENGINEER, BHEL, AUG 2003 - DEC 2004

Developed BME (Boiler Material Estimation) software for precise and fast estimation of material used in boiler manufacturing. Use of BME reduced material-estimation time from 5-6 days to a couple of hours and increased estimation accuracy by 15-20%. Technology: C++, Visual Studio, MFC, Data Structures, Algorithms. Team size: 2

ACHIEVEMENTS


WO/2013/128371 - Compact next generation sequencing dataset and efficient sequence processing using same

WO/2014/049470 - System and method for processing variant call data

WO/2014/024142 - Population classification of genetic data set using tree-based spatial data structure


WO/2013/084133 - Robust variant identification and validation
