Engineer Aws

Location:

Pune, Maharashtra, India

Posted:

August 07, 2020

Resume:

SOUBHIK CHAKRABORTY

Computer Programmer & Technologist

********.*@*****.***

+91-880-***-****

G-302, Rohan Nilay-1, Aundh, Pune, India

github.com/soubhik-c linkedin.com/in/shoubhik-chakraborty-461aa424

CAREER PHILOSOPHY

“What we have to learn to do, we learn by doing.”

- Aristotle

STRENGTHS

Hard-working (16/24) Fast Learner

Customer Interaction Analytical Skills

TECHNICAL SKILLS

Concurrent Programming

Data Structures

ML/DL/OR

Cloud Computing

distributed systems bigdata nosql

scala core java c++ python

spark elasticsearch syslogng

kafka docker swarm k8s helm

javacc antlr lex/yacc

aws azure packet

linux windows macintosh solaris

RECOGNITIONS

VMware Individual Excellence Award

For outstanding contributions in the India Center.

VMware Hackday Competition

Placed 2nd with a graphical navigation of a distributed query plan using d3.js.

Opus

Appreciated for outstanding contribution in product implementation on behalf of Janney Montgomery Scott, Wall Street, NY.

EDUCATION

Diploma in Advanced Computing

CDAC - Chennai

Aug 1999 – Jan 2000

B.Sc. in Mathematics

Government Autonomous Science College

Mar 1994 – Mar 1997

EXPERIENCE

18+ years of programming expertise

BigData Consulting

Freelancing

April 2018 – Present Pune

Scaling the EFK stack with syslog-ng/rsyslog to ingest 500k to 1000k msgs/sec on AWS using Docker Swarm or Kubernetes, handling cloud bursts, zero message loss, parallel streaming to data lakes, DR & WAN replication.

Analysis & tuning of ScyllaDB for 100K writes/sec & 3-4K scans/sec of 1-2TB datasets, used in an ETL pipeline’s write-ahead logging.

Staff Engineer

VMware (rejoin)

Jun 2017 – Feb 2018 Pune

Implementation of Distributed Network Encryption (NSX security).

Staff Engineer

SnappyData Technologies Pvt. Ltd.

Mar 2016 – Jun 2017 Pune

One of the first few lead engineers developing an HTAP database.

Thought leadership in Approximate Query Processing.

Implemented several Spark optimizations in Spark 1.6/2.0, such as adding a CBO, extending whole-stage code generation, and min-max column statistics.

Staff Engineer

VMware India Pvt. Ltd.

Apr 2013 – Feb 2016 Pune

Design & implementation of global index (total ordering) & distributed joins.

Off-heap slab allocator for the database using Java Unsafe APIs.

Sr. Member of Technical Staff

VMware India Pvt. Ltd.

May 2010 – Mar 2013 Pune

Implemented early-conflict, lock-free distributed transactions in the database.

Developed eventual-consistency-based active-passive WAN replication.

MY WORK DAY

Productivity Improvement

Architecture & Design

Play or Hobbies

Reading technical papers & study academic research

Resolving issues & requirement analysis of Customers

Sleeping & family time

Learning on the go

Administration

Coding

Mentoring

Reviewing

Bug analysis

Gathering code insights & new ideas

Created in LaTeX

PUBLICATIONS

SnappyData: Streaming, Transactions, and Interactive Analytics in a Single Cluster

SIGMOD 2016, San Francisco, CA, ACM

full paper: snappydata.io/snappy-industrial

SnappyData: A Unified Cluster for Streaming, Transactions, and Interactive Analytics

CIDR 2017

PATENTS

Participant ———

GII (Get Initial Image): non-blocking replication of data to a node that is coming up while operations continue on the source copy.

Efficient write-through and write-behind to Hadoop.

Co-Inventor ———

Applying a DDL statement atomically in a cluster.

Criteria-based data eviction from in-memory cache.

Others that couldn’t be filed ———

Early conflict detection in a distributed transaction.

Query optimization by data co-location.

RECENT BIGDATA/CLOUD CONSULTATION

Overview ———

Apr’ 2018 - present

Experience with big data management up to 100TB & high performance computing on clusters of up to 10 nodes.

Proficient in cloud deployments on AWS (awscli, root account administration), Azure & Packet clouds.

Intermediate level in provisioning and automation with Ansible, Vagrant, Packer.

Advanced user of containerization using Docker, Docker Swarm, Kubernetes (Calico, Flannel etc.; network optimization with ENA & SR-IOV/DPDK support; block device volume optimization), DaemonSets, StatefulSets & Helm charts.

Open-source developer and contributor to GemFire, Apache Spark (CBO), SnappyData & syslog-ng.

DCEngines ———

Apr’ 2018 (3 months)

Analysis & tuning of ScyllaDB for 100K writes/sec & 3-4K scans/sec of 1-2TB datasets, used in an ETL pipeline’s write-ahead logging, deployed on Packet cloud.

Next-gen HTAP database product architecture & design.

Big data infrastructure setup on Packet cloud: firewalling, virtual machine performance tuning, and aligning networking configuration with the underlying Hyper-V.

Nevis Networks ———

Sep’2018 (7 months)

Scaling Elasticsearch, Kibana & syslog-ng/rsyslog/fluentd to ingest 500k-1000k msgs/sec with no jitter, as part of the OLS (Open Log Stack) centralized logging infrastructure for network device monitoring and troubleshooting, deployable on bare metal, Amazon, Azure, private & hybrid clouds.

Optimized syslog-ng’s Elasticsearch Java plugin for high throughput. Refer to pull request 2728 for more details.

Performance-tuned kernel settings in AWS VMs for UDP log collection in syslog-ng/rsyslog with zero packet loss.

Devised a custom Kibana dashboard giving a unified view of the system by gathering the log collectors’ (syslog-ng / rsyslog / fluentd) statistics and Elasticsearch index statistics. This provided self-monitoring of the product at various levels; also documented quick bottleneck-identification guidelines and deterministic troubleshooting steps for each deployment environment.

Guided AWS support in identifying underlying hypervisor tuning needs, based on VM-level bottlenecks observed in networking & a mainline kernel upgrade on CentOS.

Identified various issues in Docker Swarm overlay networking affecting scalability and performance of the virtual machines and Docker containers.

Engaged with the syslog-ng team to help evaluate native HTTP destination plugins and identify UDP socket source weaknesses.

Learned Docker & Docker Swarm alongside the above-mentioned log ingestors while delivering the desired performance results.

Lumina Networks ———

May’18 (till now)

Enhancing deployment of the same OLS (Open Log Stack) solution on kubernetes cluster using helm charts at a very large scale (1000s of network devices & network service agents and 10s of ES nodes) providing centralized log analytics.

Targeting OLS to support ad hoc and analytical-class, long-running, scan-oriented querying of up to 100TB of data under various data retention policies in Elasticsearch without affecting the ’zero jitter ingestion rate’.

Tuning of calico kubernetes overlay networking on aws using sriov and dpdk support from xen/kvm hypervisors deployed on dedicated h/w.

Evaluating Prometheus for realtime monitoring of the deployment infrastructure, complementing OLS centralized logging & its self-monitoring, seamlessly integrated with other Lumina products.

Working on provisioning of 1000s of nodes & 10s of ES data nodes using packer & ansible generated images, vagrant & ansible based baremetal and aws/azure cloud deployments.

Providing technical assistance to define enterprise-wide processes for configuration management, CI/CD of various Lumina products, Jenkins integration, artifact repository management, product rollout & production deployment.

Instrumental in setting up in-house Artifactory, Git and Helm repositories catering to large-scale Lumina customers like AT&T, T-Mobile and Verizon as part of the leap-infra team.

WORK HISTORY

Staff Engineer

SnappyData Technologies

Mar 2016 – Jun 2017 Pune

scala microsoft sql server aws linux

Features:

Integrated Spark streaming & provided parallel ingestion over microsoft CDC listed in the blog here.

Performance Enhancements:

Implemented a cost-based optimizer in Spark that prefers colocated joins between tables & indexes and an optimal join order.

Contributed to optimizing WholeStageCodeGen to emit a vectorized scan/filter/project plan operating on 60GB+ columnar table data.

Replaced Spark’s hash aggregation and hash join operators with versions that use vectorization and column encodings, delaying unsafe row creation and thus alleviating GC pressure.

Participated in min-max scan optimization for column batches.

Benchmarked and improved performance of a TPC-C & TPC-H mixed workload by 3X at 100GB scale.

Generated execution profiles of YCSB comparing MemSQL, Cassandra & SnappyData on 1 to 100 GB data sizes.

Staff Engineer

VMware India Pvt. Ltd.

May 2010 – Feb 2016 Pune

SnappyData (in stealth mode) ———

snappydata.io

4 months

scala linux

Product vision and idea formation centering around MPP databases.

Evaluated multiple open source products suitable to the purpose.

Finished proof of concept implementation in 2 months while learning functional programming & apache spark.

Extended spark with the combinatorial parser making it suitable for realtime query analysis.

Implemented basic unification of the two products via DataSource APIs exposed in Spark.

Enriched DataFrame/RDDs with stratified sampling over Spark streaming.

Compared Spark’s built-in MLlib with our approximation.

Feasibility study of supporting MLlib APIs through a SQL interface applied on the unified storage in GemFireXD.

GemFireXD ———

Re-positioned as clustered database – gemfirexd.docs.pivotal.io

5 years & 5 months

scala core java c++ linux

Features:

Participated in thrift based ODBC driver implementation.

Technical guide for GemFireXD monitoring system – Pulse.

Designed & implemented "explain", a.k.a. query execution tracing & query-plan visualization on the command line for troubleshooting perf issues. Optionally exposed as javaagent instrumentation for more in-depth profiling.

Implemented session monitoring for administrators showing client details of long-running, resource-intensive queries.

Query cancellation through JDBC API & built-in procedures.

User Defined Functions and cluster-wide dynamic jar installation & loading/unloading.

Parallel WAN replication of colocated tables.

Memory Optimizations:

ResultSet streaming from remote nodes for queries returning large number of rows.

Fine tuned memory footprint of GemFire PartitionedRegion to reduce per entry overhead from 1024 bytes to 128 bytes.

Implemented CompactHashMap for extremely specialized global index adaptive caching on local node’s PartitionResolver with 4-8 bytes of per entry overhead.

Performance Enhancements:

Avoided System.nanoTime 2.x kernel bug by implementing low level performance counter using jna/jni for query tracing.

Enriched DRDA protocol for single hop primary key based point queries.

Wrote a very lightweight parser for query pattern matching of statements to capitalize on query-plan caching.

Accommodated this part-constant-part-variable token in the Cost Based Optimizer for an optimized plan.

Introduced Rule Based Optimization to reduce compile time of nested views from 20 mins to 5-10 seconds.

Introduced local index persistence that expedited cold restarts of the cluster from a few hours to 10 minutes per index.

Modified ConcurrentSkipList to take in partially sorted batches of rows while repopulating local indexes from disk.

Introduced EntrySet iterator to PartitionedRegion that utilized memory optimized key/value storage.

Modified entrySet to use DiskSavyIterator, enabling sequential disk reads and reducing kernel pressure.

Used DirectByteBuffers & Java NIO for zero-copy network IO.

Helped implement a selector model on Java sockets, decoupling 1-1 reader threads.

GC Pause Reduction:

Came up with byte-to-byte comparison for massive scans in order to reduce garbage bursts avoiding long gc pauses.

Optimized low level row de-serialization functions to be more jit friendly avoiding frequent decompilations.

Plugged in our own finalizer instead of Java’s interface that invariably takes 3 GC cycles to finally clean up dependent objects.

Modified ConcurrentHashMap with a double dispatch mechanism to avoid unnecessary object creation for APIs like putIfAbsent.

Modified ConcurrentSkipList internal Node classes to reduce object overhead in local indexes.

Altered BigDecimal/BigInteger/DateTime/String classes via reflection to mutate and reuse shell objects while scanning large datasets.

Senior Technical Lead

GemStone Systems

Feb 2007 – May 2010 Pune

GemFireXD ———

SQL on top of distributed cache – gemfirexd.docs.pivotal.io

2 years & 6 months

core java linux

Modifying Apache Derby:

Started with a simple idea of providing SQL interface over GemFire distributed caching api to reduce usage complexity and learning curve of the application developers.

Modified the SQL parser to add new DDL extensions related to additional GemFire capabilities of Region.

Applied DDL atomically across the grid with higher read priority over metadata writes & a back-off capability in case of partially applied DDL.

Added new distributed query plan to AST & query-plan shipping for servicing select queries on the cluster.

Enriched the optimizer with colocated join order, global hash index vs. local sorted index selection, and distribution cost considerations.

Implemented delta update propagation based on table metadata instead of full row transmission.

Support for an efficient JDBC batch insert API.

Enhanced code generation for n-way merge of "order by" & "group by" queries, optimized top-N queries, and avoided double de-serialization of rows when applying projections.

For the performance benefit of point queries, came up with the concept of a global index (auxiliary distributed hashmap).

Modified core mechanics of GemFire PartitionedRegion for consistency guarantees between primary table and global indexes without transactions in place.

Consistency guarantees of After Triggers in DMLs without transactions.

GemFire ———

pivotal.io/pivotal-gemfire, open sourced as Apache Geode

9 months

c++ java linux windows solaris

Client/Server Security feature:

Introduced AUTH layer in jgroups membership management protocol used for both peer-to-peer and wan gateway interfaces.

Implemented mutual authentication between GemFire clients and server.

Client side querying:

Extended server side OQL processor to c++ client for sub-millisecond response time.

Senior Technical Consultant

Opus Software Solutions

Oct 2000 – Jan 2007 Chennai

m27 ———

a framework that helped focusing on “design by contract” methodologies

c++ microsoft windows

Developed a custom C++ compiler that delegates to the Microsoft Visual C++ compiler after 10 transformation stages. This enabled new keywords in application code to treat classes as business objects, model the system and incorporate change requirements easily.

The compiler generated low-level C++ API calls for object-relational mapping, auto-remoting of methods (RMI), publish-subscribe for notifications, automatic object caching & lifetime management.

mTalk - It took care of data-dependent routing, connection load balancing, message persistence and publish-subscribe asynchronous notifications.

Trading Systems ———

Realtime Order & Risk Management system

vc++ microsoft windows microsoft sql server

Developed online transaction processing trading applications interfacing with various stock exchanges, viz. NSE, BSE, NYSE, AMEX, HKSE. Some applications involved standard protocols like Financial Information eXchange (FIX) and the widely used NYSE CMS.

Application development involved Business Process Modeling Language, Unified Modeling Language, hierarchical state models and finite state models (state machines).

Deployed TradeNow, DirectXchange and VelocityXchange systems that had an order routing engine, realtime risk management, post-trade management and back office interfaces.

Programmer Analyst

Netlink Technologies

Mar 2000 – Oct 2000 Chennai

visual basic oracle asp vbscript

IZone - Satyam ISP browsing terminal

Developed the server side of the product, offering session management of internet users, presenting their internet usage time, providing re-charge options, billing & MIS reporting, user rights management etc. as key features.

Programmer Analyst

Risha Software Solutions

Aug 1997 – Sep 1999 Jabalpur

foxpro (for dos & unix)

Inventory Management and Resource Scheduling

Proofing management system, a defence project for the Directorate General of Quality Assurance (DGQA), deployed over a 5 km radius intranet using Novell NetWare and SCO OpenServer.

The system provided an optimal schedule for proofing thousands of ammunition items (a quality check procedure that involves firing them) based on 42 factors.

In other words, it is an Operations Research problem with 42 mutually dependent parameters, each derived from multiple other factors and input parameters.

Provided an inventory adjustment module implementing rollback feature in FoxPro.
