Big Data Developer (Scala, Python, Kafka, Databricks)

Location:

Austin, TX

Posted:

June 26, 2024

Resume:

Chandrasekar Sundaramoorthy

+1-732-***-****

********.******@*****.***

Developer & Architect - Java, Scala, Python, Kafka, Databricks, Spark, Snowflake, DBT

18+ years of overall experience

PROFESSIONAL SUMMARY:

• 11+ years of experience in Java, J2EE, JavaScript, Spring, Struts, Hibernate, Oracle, MySQL, DB2

• 7+ years of experience in Scala, Kafka, Kafka Streams (KStream, KTable, KSQL), Kafka Connect, ksqlDB, MirrorMaker 2, Akka, Akka Streams

• 5+ years of experience in Hadoop, Map/Reduce, Hive, HBase, Pig, Sqoop, Oozie, Flume, Avro, YARN, Spark (Scala), Spark Streaming

• 2+ years of experience in Snowflake, DBT

• 2+ years of experience in GCP (Dataflow, BigQuery, Pub/Sub)

• 5+ years of experience in Python 3.9+, FastAPI

• 2+ years of experience in Kafka Administration with Prometheus, Grafana

• 5+ years of experience in creating microservices using Spring Boot

• Knowledge of Golang, Rust, TypeScript

• 6+ years of experience in Docker, Kubernetes, OpenShift, Terraform, CI/CD pipelines using GitLab, Jenkins and GitHub Actions

• 2+ years of experience in Apache NiFi, Apache Camel, Airflow

• 5+ years of experience in AWS - S3, EC2, EMR, EKS, Glue, Step Functions, Athena, Lambda, EBS

• 2+ years of experience in GCP - Dataflow, Apache Beam, Scio

• 2+ years of experience in Cassandra, MongoDB

• 4+ years of experience in Git.

• Experience with Agile Methodology.

Current Hobbies:

Playing with Micro:bit using Rust

Learning more about Rust microservices

CERTIFICATIONS:

Oracle Certified Associate, Java SE 7 Programmer.

EDUCATION:

MCA First class from Pondicherry University.

BSc (CS) First class from Periyar University, Salem.

PROFESSIONAL EXPERIENCE:

TEID Client Profile

Nov 2023 to date

Client: Thrivent

Designation: Senior Scala Engineer

• Building Scala microservices using Spring Boot

• Designing and coding batch ETL using Spark (Databricks) with Scala and PySpark

• Streaming data using Kafka

• Data warehousing using Snowflake and DBT (SnowSQL, Snowpipe, Tasks, Streams, Time Travel, Zero-Copy Cloning, Optimizer, Metadata Manager, data sharing and stored procedures)

• Snowflake connector (Snowpipe Streaming) to feed Kafka

• Managing Kafka Connect and building pipelines on Kafka Streams (KStream, KTable), including CDC (Change Data Capture)

• Built CI/CD pipeline using GitLab, Docker and Kubernetes

• AWS stack: S3, EC2, EMR, EKS, Glue, Athena, Lambda; GitHub Actions

Format Inquiry

Dec 2022 to Nov 2023

Client: Experian

Designation: Senior Engineer

Building microservices using Scala, Spring Boot

Designing and coding batch ETL using AWS Glue/Spark with Python

Streaming data using Kafka

Coded a validation tool in Python to compare the output of AWS Glue/Spark jobs against the mainframe’s output

Data warehousing using Snowflake and DBT (SnowSQL, Snowpipe, Tasks, Streams, Time Travel, Zero-Copy Cloning, Optimizer, Metadata Manager, data sharing and stored procedures)

Managing Kafka Connect and building pipelines on Kafka Streams (KStream, KTable), including CDC (Change Data Capture)

Designed and developed AWS Step Functions to automate the overall flow

Built CI/CD pipeline using GitLab, Docker and Kubernetes

AWS stack: S3, EC2, EMR, EKS, Glue, Step Functions, Athena, Lambda; GitHub Actions

Apache Beam Jobs using Scio

Jan 2021 to Dec 2022

Client: Twitter (Google)

Designation: Principal Engineer

Designed and implemented ETL as Apache Beam jobs using Scio (a Scala API for Apache Beam)

Migrated C++ project to Python

Deployed the Beam jobs on GCP (Dataflow) with Google BigQuery

Tested the jobs using Parquet files stored in Google Cloud Storage

Migrated existing jobs from Scalding to Scio (Apache Beam)

Built CI/CD pipeline using GitLab, Docker and Kubernetes

Automating Terraform using Golang

Diff Correlation & Automating SDLC process for faster delivery, Orpine Inc

Jan 2019 to Dec 2020

Client: Morgan Stanley

Designation: Sr. Scala Engineer

Desk strategists run regressions across various books to validate pricing using proprietary tools, which produces differences between the BAS version and the current (REG) version. These differences are investigated further to find correlations between the tickets.

Automating this diff correlation lets desk strategists focus on the root cause and fix it faster. Apart from diff correlation, some of the SDLC processes for the pricing tool are manual and perform poorly on Treadmill (similar to Kubernetes), so delivery to PROD takes ~2 weeks. The goal is to automate the release process, shortening feature release time from ~2 weeks to less than 2 days with the aim of reaching continuous deployment, which is more productive for desk strategists.

Spread the load across multiple Akka actors using routers with the related supervision and mailboxes.

Implemented Kafka stream processing with Akka Streams (Reactive Streams), including offset management and backpressure.

Built the web app with Akka HTTP, routing requests to the relevant actors.

Roles and Responsibilities:

Create applications using Python and Scala with Optimus (a proprietary extension of Scala)

Automate the SDLC process reactively using Kafka and Akka Typed

Work with the bitemporal store through DAL (a proprietary storage architecture; applications never access the DB directly) and MongoDB

Analyse the Pricing results using Scala & Python Apps

Python scripting for automation

Build CI/CD pipeline using Jenkins

Data Processing using Kafka Streams

Writing business logic with Kafka consumers & producers using Python

Kafka Administration - Add/remove nodes, partition rebalance, scaling

Kafka monitoring and alerting setup on brokers, topics and consumers using Burrow

Kafka components setup – Kafka Connect and Schema Registry

Kafka Streams data pipeline & ksqlDB

Kafka Security – SSL and SASL.

Kafka offset reset tool

Built CI/CD pipeline using Jenkins and deployed using Docker, Kubernetes

Spent at least an hour every day on code reviews for team members

Following agile methodology and involved in backlog refinement, sprint planning and retrospectives.

Technologies: Java, Scala, Python, Kafka, Spark, Spark Streaming, Akka Typed, Hadoop, Hive, DevOps, Kubernetes, Jenkins, Git; AWS stack: S3, EC2, EMR, EKS, Glue, Step Functions, Athena, Lambda

Trade Journaling, Orpine Inc

May 2017 to Dec 2018

Client: Point72 Asset Management

Designation: Platform Engineer(Scala Lead/Architect)

Trade Journaling receives Portfolio Managers’ trade fills, processes and transforms the data into trades, positions, trading areas and trading books, and prepares profit & loss reports for each perspective. This project also automated the reconciliation of received fills against the Prime Broker report.

Roles and Responsibilities:

Writing Spark (Python and Scala) streaming jobs for processing real-time Kafka trade fill events

Analyzing data using Hive Query and writing UDF functions

Storing & Retrieving the trade fill data after processing stages such as positions, profit and loss, PM perspective

Provided a control configuration service to manage environment variables and other OpenShift parameters per app or project, which automatically updates the pod on deployment.

Implemented microservices using Spring Boot, Akka Typed, Akka Streams and Scala, with Kafka as the messaging layer

Built and deployed microservices using Docker, Kubernetes, OpenShift

Built CI/CD pipeline using Jenkins

Provisioned & managed software using Ansible configurations

Kafka Administration - Add/remove nodes, partition rebalance, scaling

Kafka monitoring and alerting setup on brokers, topics and consumers using Burrow

Kafka components setup – Kafka Connect and Schema Registry

Kafka Streams data pipeline & ksqlDB

Kafka Security – SSL and SASL.

Writing business logic with Kafka consumers & producers using Scala/Java

Design and code with Cassandra

Created custom NiFi processors and built the flows accordingly, using Scala with Kafka as the messaging layer

Built a POC on Camunda workflow for error management across multiple projects, using Scala with Kafka as the messaging layer

Created a tool for resetting Kafka offsets for consumer groups based on a predicate

Performing Code Review

Managing a team of 6 members

Using agile methodology and involved in backlog refinement, sprint planning and retrospectives.

Technologies: Java, Scala, Spring Boot, Spark, Spark Streaming, Hive, HBase, Kafka, Akka Typed, Akka Streams, Camunda, Redis, DevOps, Docker, Kubernetes, OpenShift, Jenkins, Git; AWS stack: S3, EC2, EMR, EKS, Glue, Step Functions, Athena, Lambda

Centralized Data Store, Mphasis

Sep 2016 to Apr 2017

Client: JP Morgan Chase

Designation: Project Lead

A platform that loads data from multiple sources into HBase, with the workflow managed by NiFi after fetching the values from Kafka.

Roles and Responsibilities:

Lead Developer

Involved in backlog refinement, sprint planning and retrospectives.

Implemented Akka actors for better concurrency and state management

Built and deployed microservices using Docker, Kubernetes

Implemented microservices using Spring Boot, Akka, Akka Streams and Scala, with Kafka as the messaging layer

Creating custom NiFi processors using Scala and building the flows accordingly

Hadoop/Hive Administration

Setting up and managing Kafka for stream processing

Broker and topic configuration and creation

Partition rebalancing and replica leader election

Securing with Kerberos authentication

Configuring Producer and Consumer based on the requirement

Unit testing through VerifiableConsumer and VerifiableProducer to test configurations

Spark (Scala) for analyzing and transforming data

Designing HBase schema

Performing Code Review

Technologies: Scala, Akka, Java, NiFi, Hadoop, Spark (Scala), Hadoop/Hive Administration, Map/Reduce, YARN, Kafka, HBase, ZooKeeper, Sqoop, Avro, Hortonworks, Linux, Git

Rate Management (Viewership), Experis Inc

Nov 2015 to Sep 2016

Client: Sabre Inc

Designation: Sr. Programmer Analyst

A platform between suppliers and agencies for assigning the rates they have accepted under contract. Currently this process is driven by TPF commands that interact with TPF systems. This project eliminates the use of TPF by providing an API that exposes services invoked by the UI and manages the rates between suppliers and agencies.

Roles and Responsibilities:

Lead Developer and Kafka administration

Involved in Backlog refinement, Sprint planning, Retrospection.

Creating Camel Routes and building the flow accordingly

Setting up and managing Kafka for stream processing

Hadoop/Hive Administration

Spark (Scala) for analyzing and transforming data

Communicating with BAs for requirement clarification; analyzing data

Performed Code Review

Technologies: Java, J2EE, Spring, Hadoop/Hive Administration, Apache Camel, JAXB, SOAP, Design Patterns, Web Services, S2 Container, Unix, SVN, Crucible, Jenkins, Hadoop, Map/Reduce, Spark, Scala, Python, YARN, Kafka, Hive, Pig, HBase, ZooKeeper, Sqoop, Oozie, Flume, Avro

Vehicle Data Analysis, Ford IT

Nov 2014 to Sep 2015

Client: Ford IT

Designation: Sr. Consultant

Worked closely with teams across the portfolio to identify and solve business challenges using large structured, semi-structured and unstructured data in a distributed processing environment. Developed a reporting and analytics strategy over the last 15 years of vehicle-based customer purchase data, which correctly predicted the resurgence of a particular vehicle, giving us a jump on the competition.

Working in the automobile purchasing domain.

Led design and development against project requirements as Technical Lead for different projects on platforms such as web services (SOAP), DB links and J2EE.

Evaluating the tools and utilities for the project and participating in scalability assessment.

Project planning, task assignments, monitoring and tracking.

Managed and reviewed Hadoop log files.

Led programmers and helped them understand the coding and functionality of each module in detail, from both functional and technical perspectives.

Designed and developed J2EE applications, applying design patterns during development. Tested and optimized programs for performance.

Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.

Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.

Enabled speedy reviews and first-mover advantages by using Oozie to automate loading data into the Hadoop Distributed File System via Sqoop and pre-processing it with Hive.

Hadoop Administration

Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.

Tested raw data and executed performance scripts.

Roles and Responsibilities:

Communicating with BAs for requirement clarification; analyzing data

Performed Code Review

Working with Hive

Programming with Map/Reduce

Technologies: Java, Camel, Hadoop, Hadoop Administration, Map/Reduce, Hive, Pig, HBase, ZooKeeper, Sqoop, Oozie, Flume

Global Purchasing Program Management, Ford IT

October 2010 to Nov 2014

Client: Ford IT

Designation: Sr. Lead Developer / Designer / Team Lead

The goal of the GPPM application is to establish a common Global Purchasing Program Management business process and reporting tool to ensure program launches are supported efficiently and with quality. This includes the automation of manual tasks performed by Global Purchasing Program Management for the following five work streams:

Production Purchase Orders

Production Tool Orders

Prototype Purchase Orders

Capacity Studies

Uncommitted Funds

Modules involved are:

Customer Report

Data Summary Report

Region - Web Focus

PMT - Web Focus

Workstream - Web Focus

Program - Web Focus

Admin

Roles and Responsibilities:

Designer & Developer, Hands on Java, J2EE Developer

Involved in backlog refinement, sprint planning and retrospectives.

Communicating with BAs for requirement clarification

Providing high level design to developers including the design pattern and UML

Leading the team of 8 members

Mentoring the team for their low level design

Performed Code Review

Technologies: Java 1.5, J2EE, Struts 1.1, TopLink, Web Services, SQL Server 2008

Environment: IBM RSA, AccuRev, WebSphere 8

Morcom ID Management, Virtusa

Oct 2009 to Oct 2010

Client: JP Morgan Chase

Designation: Sr. Developer / Designer / Team Lead

This system handles KOPS user ID creation and the associated capabilities for broker-dealers. KOPS ID creation goes through various manual approval steps based on the capability selected, and the ID is finally persisted into an external system (KOPS/COSMOS).

Development methodology – Agile

Modules involved: Inbox, Request Status, User Search & Reports, Template, Admin

Roles and Responsibilities:

Hands on Java, J2EE Developer

Communicating with BAs for requirement clarification

Providing high level design to developers including the design pattern

Leading a team of 6 members.

Mentoring the team for their low level design

Performed Code Review

Deployment in DEV and QA using Unix

Preparing release notes for QA and PROD deliverables

Technologies: Java 1.5, J2EE, Spring 2, Web Services, Oracle 10g, Oracle BPM, Spring MVC

Environment: WebLogic 10.3, SVN

Online Work Request, Compunet Connections

March 2008 to Oct 2009

Client: Photon Infotech

This system was developed for online work requests, which are sent to super users, managers and special users, who assign the project to technicians based on their department. Request processing is driven by status, and the statuses are managed by the managers on the project number assigned to the specific department.

Main features and modules of this project include:

Make new request, Assign Project, Outside Agency

Refine Search

Manage SQL Data

Roles and Responsibilities:

Hands on Java, J2EE Developer

Developed the Refine search and Manage SQL Data modules.

Integrated Hibernate ORM with the Spring Framework.

Implemented business logic using Spring

Performed Unit Testing of the developed components.

Technologies: JSP, Servlets, Spring 2.0, Hibernate 2.0, Oracle 10g, Web Services

Environment: WebLogic 8.1, SVN, Eclipse

Health Record Tracking System, Compunet Connections

April 2007 to March 2008

Client: Photon Infotech

This project aimed to generate health record ratios on a quarterly basis and retain the generated ratios for a certain period. It was split into 5 modules: user maintenance, file upload for calculation, calculation, maintenance of generated ratios, and report generation.

Roles and Responsibilities:

Hands on Java J2EE developer

Customized the data access module using the DAO pattern for all database transactions.

Coded business logic with session beans and Hibernate for the user maintenance and calculation modules.

Technologies: JSP, Servlets, EJB 2.0, Spring 1.2.6, Hibernate, Oracle 9i, Web Services, WebLogic 8.1, SVN, Eclipse

E-Accounting System, Compunet Connections

June 2006 to March 2007

The E-Accounting System maintains all account details of a telecom organization and provides a user-friendly way to manage them. It consists of an operation cashbook and bankbook, a collection cashbook and bankbook, unpaid cases, trial balance details and monthly reports. The operation cashbook contains all cash payments made by the company, including salary payments and advances, whereas the operation bankbook consists of payments made by draft. The collection cashbook holds details of receipts, including all deposits paid in cash, whereas the collection bankbook holds details of receipts by draft. Contract work deposits and monthly rental bills are also added to the collection bankbook.

Roles and Responsibilities:

Hands on Java, J2EE Developer

Design, code, implementation and maintenance of scalable infrastructure for Web.

Developed Servlet programs to transfer control to other pages and programs depending on the business logic.

Creating result pages using JSP.

Processing the client form using Servlet.

Technologies: JSP, Servlets, Struts 1.3, EJB 2.0, SQL Server 2000, Hibernate

Environment: WebLogic 8.1, SVN, Eclipse
