
Big Data Lead

Location:
Chennai, TN, India
Posted:
September 20, 2017



KAUSHIK CHAKRABORTY

T +91-996*******

ac2d2x@r.postjobfree.com


Objective: Experienced software engineer and quantitative developer (~18 years) with demonstrated technical leadership, project-management and business-analysis capabilities, and a strong foothold in multi-domain analytics (signal processing, cryptography, graph theory, computational finance), seeking a challenging business-analysis and/or architecture/project-management opportunity in product build-up and management. At this time, I am keen to explore opportunities in the following area(s):

Principal-architect/delivery-lead roles in system-software/licensed-product LOBs (with substantial analysis/research focus) in HPC apps, Big Data, middleware, machine-learning/low-latency/quantitative domains (preferably in green-field projects/research streams, or in re-engineering positions on complex, mature product lines). Specifically seeking opportunities in cloud/big-data use cases for the BFSI/Telco (BSS/OSS) domain and Bots/RPA initiatives.

EDUCATION

2002 – 2003 National University of Singapore

M.S in High Performance Computing (HPC)

CGPA 8.7/10.0 (1st class Honors)

1999 – 2001 Indian Institute of Science (IISc), Bangalore, India

M.Tech in Computer Science & Engineering

CGPA 6.4/8.0 (1st class Hons.)

1994 – 1998 Jadavpur University, Kolkata, India

Bachelor of Electrical Engineering, Specialization: Signal Processing & Control Systems

78% (1st class Hons.)

EXPERIENCE

SUMMARY

Strong development experience using C/C++, Java and .NET in multi-threaded, distributed and component-based architectures in Unix (Solaris, Linux) and Windows environments.

Proven ability to deliver complex, low-latency, mission-critical systems in the front-office space with full-SDLC exposure. Able to drive road-map initiatives involving development teams and diverse business and infrastructure stakeholders.

Capability to quickly learn advanced theory, gain subject-matter expertise (often to the level of IEEE publications and journals of the relevant discipline) and translate it into contributions to large-scale, state-of-the-art software solutions (with demonstrated experience in info-sec and networking, middleware, image processing, mining and quantitative finance).

Strong academic and professional background in linear algebra, applied mathematics, optimization and dynamic programming, computational techniques for large-scale problems, and differential equations and numerical solvers applicable to derivative pricing, signal processing, combinatorics and data science (using Matlab/Excel-VBA/NAG/CPLEX/R/SAS/C).

Excellent theoretical and hands-on skills in data mining and its applications in high-frequency systems; frameworks like RainForest for fast decision-tree building and clustering; techniques for fast outlier detection in large historical data sets with statistical tests, depth, deviation, KNN-distance and density approaches, K-D and CF trees (BIRCH) and kernel-based techniques. Experienced in applying these in prototypes for tick-data analytics and network-intrusion detection, along with their visualization aspects. Up to date with emerging techniques such as ADMM and deep learning.

Strong multi-threaded development in cross-platform and cross-language applications; delivered apps that utilize dynamic priority reordering and complex locking and barrier-synchronization primitives using Unix Pthreads, Solaris libthread and Java threads. Expert GPU programmer.

Strong hands-on expertise in MPI, PVM and MPICH internals, and hands-on knowledge of MapReduce, Hadoop and Azure platforms. Demonstrable competence across Big-data stacks (Hadoop, HDFS, YARN, Hive, Pig, HBase, MongoDB, Cassandra, ZooKeeper, Flume, Impala, Kafka, EC2/EMR, ElasticSearch, Kinesis, Spark, Shark, Storm, Accumulo, Sqoop, GraphX, Mahout, MLlib, HDInsight). Competent with ML packages such as NLTK, Weka, H2O and DeepLearning4J.

Competent in distributed computing paradigms and MOMs like TIBCO/Solace, distributed caching solutions like Coherence (expert level) and Hazelcast/Gemfire, grid solutions like DataSynapse/GridGain, and open-source cloud technologies (in-depth on the AWS stack and OpenStack); experienced in diverse system integrations and architectural overhauls. Knowledgeable on MPP platforms such as Teradata, Greenplum and Vertica.

Competent in project-management activities in the above contexts; experienced in leading and managing large distributed teams (~50+), with exposure to both BAU and green-field projects. Have delivered projects under demanding time and budget constraints with judicious management of scope creep and related challenges. Experienced in pitching, negotiating and engaging with senior stakeholders to build divisional capacity and secure funding mandates.

PROFESSIONAL EXPERIENCE

Aug 16 –

Current

PAYPAL (Senior Manager – Corp. Finance/Big Data) Working as senior delivery manager on an FP&A (Financial Planning & Analysis) Big-data initiative. Responsible for quickly ramping up an offshore unit of 30+ professionals.

Recent accomplishments include:

> Decommissioned legacy post-transaction funding-cost and loss-calculation engines by migrating them from Teradata/HANA to Hive and Spark-SQL (HortonWorks distribution).

> Developed and implemented integration strategies for PayPal-wide initiatives like Stored-Value/Tokenization/Braintree on-boarding data feeds into the middle-office data lake (Kafka, Flink). Strategized and implemented data-lake organization/layering and the consumption pipeline.

> Handled semi-structured and unstructured data as part of ingestion streams processing.

> Delivered artefacts for WATCH, an analytics component tuned for business-health monitoring.

> Implementation of Spark workflows for FP&A (financial planning and analysis) models and low-latency, high-volume regulatory reporting, plus tuning/optimization for the same (Anaconda on Horton).

> Hadoop capacity planning, diagnostics and optimization for the suites of MR/Spark jobs under pursuit (Dr. Elephant).

> Delivery-model, practices and discipline establishment/DevOps endorsement (Rally based). Responsible for planning and tracking ~1800 man-hours of effort in Rally (bi-weekly sprints). Grew and managed six delivery teams (~8 members each) dedicated to middle- and back-office functions and, technically, to Big-data artefacts, ingestion, governance and hybrid-cloud integration practices.

> Stakeholder engagement and senior-leadership oversight concomitant with bootstrapping a delivery unit from the ground up.

Mar 15 – Jun 16

AMERICAN EXPRESS (Lead - Big Data/Machine Learning) Worked as architect/manager on a BIG-DATA roadmap re-engineering initiative. Specific assignments included batch and real-time application migration to native Hadoop/SPARK ecosystems. Led a team of ~25 offshore and 6-10 onshore resources in this capacity.

Key accomplishments:

> Mar - May '15: Fraud detection stack (Stream processing, data import from mainframe using FLUME, processing layer/enrichment and data mining using SPARK, ML-Lib)

> May -Jul '15: Collaborative filtering based recommendation engine for a web-portal (REDIS + STORM)

> Aug ’15 – Jun ’16: Revamp of a legacy credit-risk platform in favour of Hadoop + SPARK; scope of work included:

>> Leading a team of 20 developers (split between the US and India), achieving the following milestones:

>>>Enterprise legacy DW/mining platform migration to Big-Data-storage (SQOOP, HIVE, HBASE)

>>> Porting legacy C++ risk libraries to HADOOP and SPARK, with extensive performance tuning for latency and scale. Refactoring old workflows for BASEL and implementing new retail models (Markov-chain/Vintage). Delivered data-integration POCs using DataMeer and Cascading.

>>> Design and implementation of the Java server side and UI for machine-learning-as-a-service (REST, JSON, Angular). Ported logistic, KNN, SVM, EM and ANN-based calibration models into the framework. Designed and implemented API contracts between RESTful services for ML workflow management with an H2O-style UI. Incorporated champion/challenger evaluation as part of the E2E modelling workflow.

>>> Development of Oozie workflows for running risk batches; as many of the consumer systems supported non-XML protocols, developed Oozie-specific adaptors for job submission (provisioning support for Java/Hive/Pig/Revolution-R). Implemented faster indexing/metadata look-up via ElasticSearch.

>>> Contributed to major artefacts of a complex in-house data-warehouse solution on top of Hadoop (Hive, HDFS, compression, metadata search/indexing and a governance/control/lineage layer). The initiative had design goals similar to Apache Falcon but was incubated entirely in-house. Built accelerators for faster and smarter data ingestion.

>> Ported legacy spend/look-alike and prospect models from SAS onto SPARK.

>> Charting-based dashboard integration (Zeppelin).

>> Rally-based sprint planning and execution. KPI setup and tracking.

>> Budgetary/stakeholder engagement in terms of roadmap and charter finalization.

>> Articulating delivery metrics and endorsing quality parameters for platform shipments.

Sep 14 – Jan 15

AWS consultant (freelance/consultancy basis). Telecom/NW clients. [RFPs to deployment]

Jan 14 – Aug 14

AMAZON (Sr. SDE, contract)

[BIG-DATA application development for GRCS (Global retail catalogue system) (C++, Boost, STL, Java 1.8, NOSQL, Cloud/AWS technologies (Dynamo-DB, S3, SQS, SNS, SWF, EMR, STORM, KAFKA, SPARK-MLLIB), GIT]

Contributed to retail-catalogue platform design and development with the aim of replacing legacy components in favour of Amazon AWS and EMR platforms.

Developed distributed server-side components responsible for large-scale feed processing, ingestion, normalization and real-time feed analytics. Utilized Java 1.8 NIO, M2M, async and stream/bulk-parallel collections features for the implementation of the front-end feed processor.

Contributed to initial prototypes for online connectivity with standards-based catalogue networks like GDSN and PRICAT/EDI; introduced messaging-based feedback loops in the pipeline.

Developed machine-learning prototypes in R and incrementally ported them to the EML back-end. Worked on recommendation systems based on hybrid (content- and user-based) filtering algorithms.

Significant application modelling to leverage cloud patterns such as EMR (Elastic MapReduce, hosted Hadoop platform) and EML (elastic machine-learning services) for real-time, massive-scale machine learning on feed analytics (SVM, parallel frequent-pattern growth, Random Forest, mixture modelling); S3 for cloud storage; and SQS (simple queuing system), SNS (simple notification system) and SWF (distributed workflow system) for communication and parallel workflow handling.

Setup of a STORM cluster and development of the ingestion topology. Development/re-architecture of a trending and expert-recommender system (SPARK on EC2 + MLLIB).

End-to-end SDLC and team management (requirements to delivery and stakeholder liaison).

Mar 13 – Jan 14

STANDARD CHARTERED (Sr. Developer, Vice President)

[Commodities market-data distribution-platform development (VC++, Boost, STL, VBA, Java 1.7, Thrift, Reuters RFA 7.4, SUBVERSION)]

Led a team of ~40 developers engaged in RTR (real-time rates) platform design for market-data capture and dissemination, targeted at the SCB Commodities business.

Managed the design and development of server and client stacks (Java 1.7, Thrift, web services, RMDS, XLA add-in in C++ using STL & Boost).

Design and development of middleware services providing gateway access to RMDS (Reuters Market Data System) with packet-chunking, aggregation and analytics functionality.

Adoption of major design patterns, code-synthesis and automated testing paradigms in design.

Team management for ~18 direct reports (regional).

End to end SDLC management (project/team management and stakeholder liaison).

[Commodities distributed-valuation service implementation (VC++, Boost, STL, VBA, SOLACE)]

Leading a team of ~25 developers, engaged in valuation-library enhancement (model implementation, empirical testing) for a wide range of vanilla and exotic products, namely forwards, Asians, barriers, quantos, swaptions and swings, utilizing techniques such as mean reversion, convenience yield, seasonality, jump-diffusion, jump-reversion, Monte Carlo and reduced-form models. The enterprise pricing/risk service is scalable and distributed in design, with hooks to market-data systems and various other enterprise components over the SOLACE platform.

Jul 10 – Mar 13

BARCLAYS CAPITAL (Sr. Developer, Asst Vice President)

[Low-latency front-office system development (VC++, Boost, STL, VBA, NAG, Java 1.7, C#, LINQ, TPL, MS-SQL Server, Solace, Reuters/Bloomberg, Coherence, kdb, Code-synthesis, Purify, Perforce)]

Development of an IRD-analytics platform, a global application responsible for generating golden copies of yield curves, plus pricing, order-execution, risk and P&L-explain modules for vanillas (bonds, swaps, FRAs) and exotics (swaptions, caps/floors) derived from those.

Upgraded and made fundamental changes to cater for increasingly sophisticated formulations, namely OIS/CSA/CVA-based discounting, the multiple-curve framework (‘new curves’) and, most recently, cheapest-to-deliver (CTD) curve paradigms (Rates/FX), with phased incremental enrichment across all the stacks (VBA, QA library, C++/C# stack).

Significant RAD prototyping in Excel-VBA for modelling and calibrating yield curves (nonlinear solvers) and for pricing/risk of flow and exotic structures. The work mandated interaction with traders for gathering specifications, coding models and supporting them, as well as implementing trade-specific calculators in C++ (graph-theoretic formulation of complex calculation workflows using BGL, the Boost Graph Library, and customizations therein). Was in charge of the optimizer and graph-kernel libraries; researched and implemented various distributed graph-cut and knot-detection algorithms. Implemented numerical optimization algorithms for yield-curve models.

Implemented advanced models with emphasis on factor analysis, parametric estimation, calibration and back-testing. Specifically worked on econometric and statistical models such as maximum-likelihood (MLE) based market-depth calculation (C

Implementation of pattern-recognition (FP-Growth) based historical market-data analysis and simulation engines; developed heteroskedasticity-analysis functions with machine-learning models.

Worked as a core member on kdb tick-data system interfacing and Q-language based module development for time-series analytics (AR, ARMA, GARCH) and predictive modelling (Kalman). Implemented clustering-based mining algorithms for pattern discovery.

Extensive GPU-cluster-based risk simulations (Monte Carlo/VaR).

Did significant interface development with RFA 7.1, Bloomberg OpenAPI, Solace (4.6) middleware and Oracle Coherence. Managed vendor-liaison activities w.r.t. 3rd-party libraries.

Researched and implemented FX-quotes sentiment-analysis software (Hadoop + OpenNLP).

Led a regional team of 20+ members with project and line-management responsibilities.

Oct 08 – Jun 10

Royal Bank of Scotland (RBS) (Sr. Engineer)

[Valuation & margining system development for exotic swaps (Java 1.5, Spring, Hibernate, Sybase, JNI, JDBC, JMS, Test-Director, Numerix, C

Built an exotics valuation platform supporting complex non-standard-flow products and hybrids (Ratchet, Accreting, Commodity-linked IRS, Curve/roll-lock swaps, Dual-currency swaps, Bermudan quanto inverse-floaters, Polynomial swaps, Trigger swaps, Correlation and callable swaps, Equity-KO, FX-TARN products etc.), utilizing the Numerix core SDK.

Design and development of the Java server side (and integration with coupled applications via web services and REST APIs); contributed significantly to various proprietary valuation models and logic implementations. Virtualization of the server side (using Citrix).

Regional leadership of a ~15-member unit.

May 08 – Oct 08

STANDARD CHARTERED (Risk Systems Analyst, Contractual)

[Credit-risk system development (Java 1.5, C++, Murex, ALGO, MQ, XML-Spy, Solaris)]

Enhanced and customized a credit-risk regression pack (Excel-VBA) with routines for calculating measures such as transition probabilities and transition matrices (cohort and hazard-rate approaches), default correlations (Merton’s asset-value approach), PD (probability of default) estimation and CDS pricing.

Development of the Java-based trade-feed processor for exotics deals (FX digitals/window barriers/IRS/auto-caps/FRAs) fed from Murex (via MXML), which enriches and delegates (over MQ) to the ALGO risk-limit suite for measures calculation (PFE/LGD etc.).

Managed a team of ~5 resources.

Aug 07 – May 08 LEHMAN BROTHERS (Sr. Developer)

[Fixed-Income Exotics application development (C++ (STL), Java 5, Spring, C# (Winforms), Boost, ACE, Tibco RV, EMS 4.2, CORBA (IONA))]

Development of a pricing/risk platform for exotics like inverse floaters, snowballs, power-reverse dual-currency notes, index, boosted and range accruals, Target Redemption Notes & Structures (TARS/TARN) and multi-no-touch barriers spanning the IR, FX and Commodities desks.

Building and augmenting models, namely HJM, Salitree, BGM, Black and jump-diffusion, for exotic products.

Implementing FX stochastic-volatility (Heston, Stein, Longstaff, Scott, SABR, JSLV), local-volatility (Dupire) and mixed-model vol-surface construction and calibration modules.

Pricing of barrier options under volatility-smile utilizing barrier-bending techniques.

Pricing basket options (moment-matching techniques and using Monte-Carlo).

Commodity Asian-option (geometric/arithmetic average-rate) valuation using analytical and moment-matching approximation methods. As an extension of this work, developed a C++ library for numerical inverse Laplace-transform solvers with Matlab interfacing via MEX.

Commodity swing-option pricing using a forest-of-recombinant-trees method and dynamic programming (max-min formulation based solvers). As part of back-testing, used the Matlab Optimization & PDE toolboxes and developed models for the optimal exercise schedule of commodity swing options (DP formulation for optimal exercise).

Oct 04 – Feb 07

IFLEX SOLUTIONS (currently Oracle Financial Services) (Associate Consultant)

[Securities-processing application development (C/C++, SWING, Sybase T-SQL, SQL-Adv., bcp, isql, WebSphere MQ, Sybase, Perl, Python, SWIFT, Test-Director, Solaris, libthread)]

Managing ~20+ developers, designed, developed and deployed a large multi-market, multi-instrument trade-settlement application for UBS (supporting funds, equities, options, debts, structured products) and a corporate-action engine (rights/CD/coupons and redemptions).

Jul 03 – May 04

LEICA GEOSYSTEMS (R&D Engineer, Contract)

Machine-vision algorithm development and implementation on an ARM-based imaging ASIC (on top of the ATI Nucleus RTOS API) for the Leica DNA01 and DNA03 series of surveying devices (design, implementation, deployment). Implemented FFT, thresholding, edge-detection (Prewitt/Sobel) and calibration algorithms for inferring accurate spatial traits from the captured images (used on surveying sites for geometrical measurements).

Apr 01 – June 02 MindTree Consulting [Associate Consultant]

> Developed an in-house analytics prototype to detect website attacks (SQL injection, XSS or cross-site scripting) using an ID3-based data-mining/anomaly-detection approach; built the experimental setup to train the model with test data and tune the classifiers. Integrated and piloted the prototype with the Rainbow iKey crypto-token authentication solution.

> Developed a complex warehouse-logistics automation engine for Volvo Automobiles.

Jul 98 – Aug 99

Cognizant Technology Solutions [Programmer Analyst]

Developed a payment-processing solution for First Data Corp. using JCL-CICS (business modules) and developed the UI for POS terminals (VC++/MFC) in a team of 10+ members.

SKILLS & QUALIFICATIONS

Finance: Quant analytics, exotics valuation, and data-mining techniques.

Computing: C++, Java (Core 1.6/1.7/1.8, JMS, JNDI), SAS, R, Unix internals, shell/Perl, MS (C# 3.5/4.0, .NET, Foundations, WCF, LINQ); Middleware: MQ Series, TIBCO (RV, EMS), Solace (4.6, 5.1), Reuters RFA, Bloomberg OpenAPI; Threading: POSIX, Intel TBB; DB: MS-SQL Server, Sybase; Design patterns (across all flavours); Hadoop and Spark (Horton and MapR distros).

INTEREST

Currently seeking a challenging leadership position in analytics-heavy Cloud/Big-Data applications in the finance/telecom/ecommerce/BI space as a Director/Practice Lead.


