Sankarapandian Chandrasekaran
Contact# +1-224-***-****
Email: *************@*****.***
SUMMARY:
Experienced IT professional with over 20 years in software development across BFS, Information Media & Entertainment, and Healthcare domains
Strong hands-on experience implementing big data solutions using a technology stack that includes Hadoop, Databricks, MapReduce, Hive, HDFS, Spark, Sqoop, Flume, Nifi and Oozie.
Experience with multiple Big Data distributions, including Cloudera 5.x and HDP 3.0.
Experience in Data modelling using Erwin
Experience in implementing Medallion Architecture
Implemented CI/CD pipelines using Git, Jenkins and Unix shell scripting
Extensive experience in metadata-driven data frameworks
Extensive experience in object-oriented programming
Extensive experience in Unix shell scripting
Experience in Crontab and Maestro scheduling
Proficiency in application design
Proficiency in Waterfall and Agile methodologies
Sensitive data remediation and data insertion using Snowflake
Technical Skills:
Big Data Technologies: Hadoop, MapReduce, Pig, Hive, HDFS, Spark, YARN, Zookeeper, Sqoop, Flume, Oozie, Nifi, HBase, Cassandra, AWS Services (EMR, EC2, S3, RDS, Glue, Redshift), Snowflake, Tableau, Databricks
Programming Languages: C++, Scala, Java, Python, Unix shell scripting
Databases: Oracle, PostgreSQL
Tools: GIT, Clear Case, Clear Quest, Clarity, Fusion, ION, GTest/GMOCK, SMARTS (Time Series Database), Stonebranch scheduler
Methodologies: Waterfall, Agile
Certifications:
Databricks Certified Data Engineer Associate
Databricks Certified Fundamentals of the Databricks Lakehouse Platform Accreditation
Databricks Delivery Specialization: Unity Catalog Upgrade
Scrum Alliance Certified Scrum Master
Project Management Professional
IIBF Certification in Banking
NSE Certification in Financial Markets (Securities)
NSE Certification in Financial Markets (Derivatives)
Brainbench Certification in Advanced C++
Organizations worked:-
HCL Technologies Ltd – March 2005 to June 2010
Cognizant Technology Solutions – June 2010 to Sep 2021
Perficient Inc – Sep 2021 to Present
Projects Handled in Perficient Inc
Project Title: Data4u application (Oct 2024 – Present)
Client: Johnson & Johnson
Role: Data Architect
Functional Area: Clinical Data
Technologies Used: AWS, PostgreSQL, REST API, Databricks, Spark (Python), Spark (Scala)
Objective: Web portal development
Description:-
The objective of the project is to develop a web portal that enables data users to ingest, query, transform and view clinical data. Clinical data is processed through the Medallion architecture, and clinical data reports are generated from the Gold layer.
Responsibilities:-
Designing Databricks applications that read UI parameters from a PostgreSQL table to drive data transformations such as Filter, Join, Transpose and Union (see the sketch below)
Data modelling using Erwin Data modeler
Determining load type of tables
Performance tuning
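A minimal sketch of the metadata-driven transformation approach referenced above, assuming a hypothetical transform_config table in PostgreSQL and illustrative table and column names:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("Data4uTransform").getOrCreate()

// Read UI-supplied transformation parameters from a PostgreSQL config table (all names are hypothetical)
val params = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://<host>:5432/<db>")
  .option("dbtable", "public.transform_config")
  .option("user", "<user>")
  .option("password", "<password>")
  .load()

// Apply each configured transformation to a Silver-layer table and write the result to the Gold layer
val silver = spark.table("silver.clinical_events")
params.collect().foreach { row =>
  val target = row.getAs[String]("target_table")
  row.getAs[String]("transform_type") match {
    case "FILTER" => silver.filter(row.getAs[String]("filter_expr"))
                           .write.mode("overwrite").saveAsTable(target)
    case "UNION"  => silver.unionByName(spark.table(row.getAs[String]("other_table")))
                           .write.mode("overwrite").saveAsTable(target)
    case _        => // Join, Transpose, etc. would be handled similarly
  }
}

Because each transformation rule lives in the config table, new UI-driven transformations can be added without changing application code.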
Project Title: UC Migration (June 2024 – Oct 2024)
Client: Midtown Athletic Club
Role: Data Architect
Functional Area: Member Data Integration
Technologies Used: Azure ADF, Databricks, Spark (Python), Spark (Scala)
Objective: Unity Catalog migration
Description:-
The objective of the project is to migrate the catalog from hive_metastore to Unity Catalog for better data governance, enabling centralized access across workspaces, data lineage, enhanced security, and audits to meet security compliance requirements.
Responsibilities:-
Migrating all schemas and tables under the new Unity Catalog (see the sketch below)
Replacing ingestion paths with Unity Catalog volume paths
Regression testing the notebook jobs to confirm they write to tables in Unity Catalog
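A minimal sketch of one common Unity Catalog migration pattern on Databricks (catalog, schema, table and path names are illustrative), cloning a Delta table out of hive_metastore and repointing ingestion to a volume path:

// spark is the SparkSession provided by the Databricks notebook/job context

// Clone an existing hive_metastore Delta table into the Unity Catalog namespace
spark.sql("""
  CREATE TABLE IF NOT EXISTS main.member_data.members
  DEEP CLONE hive_metastore.member_data.members
""")

// Ingestion previously read from a mounted path; after migration it reads from a Unity Catalog volume
val raw = spark.read
  .option("header", "true")
  .csv("/Volumes/main/member_data/landing/members/")
raw.write.mode("append").saveAsTable("main.member_data.members_raw")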
Project Title: Blue Cross Coordinated Care (June 2022 – Sep 2024)
Client: Blue Cross Blue Shield - Michigan
Role: Data Architect
Functional Area: Member Data Integration
Technologies Used: AWS (S3, EC2, RDS, Redshift), Spark (Scala), Databricks, PostgreSQL, Unix shell scripting, Stonebranch scheduler, MuleSoft API
Objective: Integrate member data from Data Lake tables in Databricks into an Operational Data Hub in PostgreSQL for web portal development
Description:-
The objective of the project is to migrate data from Data Lake tables on the Databricks platform to an Operational Data Hub (ODH) in PostgreSQL for the AM360 (Advocacy Member 360) web portal. ETL applications integrate member data such as member demographics, primary care physician, clinical information and member gaps (vision, dental, HEDIS, etc.) from various sources and write it into ODH PostgreSQL tables. Member data is pulled and displayed in the AM360 web portal through MuleSoft APIs.
Responsibilities:-
End-to-end design, i.e. Data -> API -> UI
Determining the natural and foreign keys
Providing Logical Design Model using Erwin
Determining the encryption algorithms
Determining load type of the incremental data
Performance tuning of DB tables and ETL applications
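A minimal sketch of the Databricks-to-ODH load described above, with hypothetical table names and connection details:

// spark is the SparkSession provided by the Databricks notebook/job context

// Read curated member demographics from the Data Lake (Delta) table
val demographics = spark.table("datalake.member_demographics")
  .select("member_id", "first_name", "last_name", "dob", "pcp_id")

// Write the result into the Operational Data Hub PostgreSQL table via JDBC
demographics.write
  .format("jdbc")
  .option("url", "jdbc:postgresql://<odh-host>:5432/odh")
  .option("dbtable", "odh.member_demographics")
  .option("user", "<user>")
  .option("password", "<password>")
  .mode("append")
  .save()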
Project Title: Legacy Modernization – Nasco Feeds (Sep 2021 – June 2022)
Client: Blue Cross Blue Shield - Michigan
Role: Data Architect
Functional Area: Medical Claims
Technologies Used: AWS (S3, EC2, RDS, Redshift), Spark (Scala), Databricks, PostgreSQL, Unix shell scripting, Stonebranch scheduler
Objective: Data standardization of medical claims data in the Operational Data Hub and migration of the data into the Analytical Data Hub
Description: -
The objective of the project is to standardize medical claims data from the Nasco data source in the EDW and the Rich Claims Extract (mainframe) feeds. Data standardization happens through a Spark (Scala) application in the Databricks environment; the resultant dataset is stored in PostgreSQL in the Operational Data Hub (ODH) layer using AWS RDS, and the medical claims data is then migrated into AWS Redshift in the Analytical Data Hub (ADH) layer. Application scheduling is handled through the Stonebranch scheduler.
Responsibilities: -
End-to-end design
Determining the natural and foreign keys
Providing LDM using Erwin
Determine the encryption algorithms
Determine load type of the incremental data
Performance tuning of PostgreSQL tables and ETL applications
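A minimal sketch of the claims standardization and ADH load described above, assuming the Databricks Redshift connector (option names vary by runtime); table and column names are hypothetical:

import org.apache.spark.sql.functions.{col, to_date, upper}

// spark is the SparkSession provided by the Databricks notebook/job context

// Standardize Nasco medical claims fields (column names are hypothetical)
val claims = spark.table("edw.nasco_claims")
  .withColumn("claim_status", upper(col("claim_status")))
  .withColumn("service_date", to_date(col("service_date"), "yyyyMMdd"))
  .filter(col("load_date") >= "2022-01-01")   // illustrative incremental predicate

// Migrate the standardized claims into the Analytical Data Hub (Redshift)
claims.write
  .format("redshift")
  .option("url", "jdbc:redshift://<adh-host>:5439/adh?user=<user>&password=<password>")
  .option("dbtable", "adh.medical_claims")
  .option("tempdir", "s3a://<bucket>/redshift-temp/")
  .option("forward_spark_s3_credentials", "true")
  .mode("append")
  .save()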
Projects Handled in Cognizant Technology Solutions
Project Title: Internal Services - Auto Finance (Mar 2020 – Sep 2021)
Client: Capital One
Role: Data Lead
Functional Area: Auto Loans
Technologies Used: AWS (EMR, S3, EC2), Snowflake, Tableau, Unix shell scripting
Objective: Facilitating data services and reports for other experiences in Auto Finance such as Vehicle Resolution, Set Up My Loan and the Catch Up team
Description:-
The objective of the project is to provide data services to other experiences, including sensitive data remediation, data migration, creation of new tables in the D2A and D2B layers, and Tableau report creation and modification.
Responsibilities:-
Analyzing and remediating sensitive data using Snowflake
Classifying the data
Creating views as per data classification (see the sketch below)
Creating new tables in the Snowflake D1, D2A and D2B layers
Creating and modifying Tableau reports for other experiences in Auto Finance
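A minimal, hypothetical sketch of publishing a classification-based masked view in Snowflake from a Scala job over JDBC (assumes the Snowflake JDBC driver on the classpath; database, table and column names are invented for illustration):

import java.sql.DriverManager
import java.util.Properties

// Connect to Snowflake over JDBC (account and credentials are placeholders)
val props = new Properties()
props.put("user", "<user>")
props.put("password", "<password>")
props.put("warehouse", "<warehouse>")
props.put("db", "AUTO_FINANCE")
props.put("schema", "D2A")
val conn = DriverManager.getConnection("jdbc:snowflake://<account>.snowflakecomputing.com/", props)

// Publish a view that masks sensitive columns according to their data classification
val stmt = conn.createStatement()
stmt.execute(
  """CREATE OR REPLACE VIEW D2A.LOAN_ACCOUNTS_MASKED AS
    |SELECT LOAN_ID,
    |       SHA2(SSN)         AS SSN_HASH,   -- sensitive field remediated
    |       LEFT(ZIP_CODE, 3) AS ZIP3,       -- coarsened per classification level
    |       LOAN_BALANCE
    |FROM   D2A.LOAN_ACCOUNTS""".stripMargin)
conn.close()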
Project Title: Locus 2.0 (Aug 2019 – Mar 2020)
Client: Verizon
Role: Engineering Manager
Functional Area: Communication
Technologies Used: AWS EMR, S3, Hadoop, Spark (Scala), Postgres, Unix shell scripting
Objective: Standardization of addresses
Description:-
The objective of the project is to standardize the addresses in data collected from various sources. Data is ingested as CSV files into AWS S3, and a Spark application loads it into Postgres tables after a standardization process in the AWS EMR environment.
Responsibilities:-
Choosing technologies and deliverables
Logical Data modelling using Erwin
Developing and delivering the Spark applications
Orchestrating the execution of ETL applications using Shell script
Implementing CI/CD pipelines
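A minimal sketch of the address standardization flow described above, with illustrative bucket names, column names and normalization rules:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, regexp_replace, trim, upper}

val spark = SparkSession.builder.appName("AddressStandardization").getOrCreate()

// Ingest raw address files landed in S3 as CSV
val raw = spark.read.option("header", "true").csv("s3a://<bucket>/locus/addresses/")

// Standardize addresses: trim, upper-case and normalize common street suffixes (rules are illustrative)
val standardized = raw
  .withColumn("street", upper(trim(col("street"))))
  .withColumn("street", regexp_replace(col("street"), "\\bST\\b", "STREET"))
  .withColumn("zip", regexp_replace(col("zip"), "[^0-9]", ""))

// Load the standardized addresses into Postgres
standardized.write
  .format("jdbc")
  .option("url", "jdbc:postgresql://<host>:5432/locus")
  .option("dbtable", "public.standardized_addresses")
  .option("user", "<user>")
  .option("password", "<password>")
  .mode("overwrite")
  .save()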
Project Title: Data Harmonization (May 2019 – Jul 2019)
Client: ACI Universal Payments
Role: Data Lead
Functional Area: Banking & Financial Services (Cards & Payments)
Technologies Used: Hadoop, Spark (Scala), Hive, Unix shell scripting
Objective: Migrating streaming data into the Hadoop Data Lake
Description:-
The objective of the project is to migrate streaming data into the Hadoop Data Lake. Streaming data is ingested into the Hadoop environment using Kafka in Avro format. A Spark Structured Streaming application ingests this data into a Hive table, and a Spark SQL application transforms the data and saves it into three different BI tables.
Responsibilities:-
Performance tuning of the ETL application.
Developing a Spark Structured Streaming application to convert Avro data into JSON format (see the sketch below)
Reducing the execution time of the ETL applications
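A minimal sketch of the Kafka-to-Hive streaming ingestion described above, assuming Spark 3 with the spark-avro package; the topic, schema and table names are invented:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.avro.functions.from_avro
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder.appName("PaymentsStream").enableHiveSupport().getOrCreate()

// Avro writer schema for the incoming payment events (hypothetical)
val avroSchema =
  """{"type":"record","name":"Payment","fields":[
    |{"name":"txn_id","type":"string"},
    |{"name":"amount","type":"double"},
    |{"name":"event_ts","type":"string"}]}""".stripMargin

// Read Avro-encoded events from Kafka and decode them
val payments = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "<broker>:9092")
  .option("subscribe", "payments")
  .load()
  .select(from_avro(col("value"), avroSchema).as("payment"))
  .select("payment.*")

// Append each micro-batch into the landing Hive table
val writeBatch: (DataFrame, Long) => Unit =
  (batch, _) => batch.write.mode("append").saveAsTable("landing.payments")

payments.writeStream
  .option("checkpointLocation", "/tmp/checkpoints/payments")
  .foreachBatch(writeBatch)
  .start()
  .awaitTermination()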
Project Title: Data Harmonization (May 2018 – Apr 2019)
Client: Vantiv, now Worldpay
Role: Data Lead & Engineering Manager
Functional Area: Banking & Financial Services (Cards & Payments)
Technologies Used: Hadoop, Spark, Hive, Oozie, DataStage, Unix shell scripting, TWS scheduling
Objective: Migrate relational databases to the Hadoop Data Lake and harmonize the data
Description:-
The objective of the project is to migrate Oracle and DB2 databases to the Hadoop Data Lake and to harmonize the data using Hive and Spark. IBM DataStage is used to ingest the data into the Hadoop Data Lake, and Hive and Spark are used extensively to harmonize it. The harmonized data is used by business teams and the data science team.
Responsibilities:-
Choosing technologies and deliverables
Implemented Spark applications to harmonize the data in Hadoop Data Lake.
Implemented Hive scripts to harmonize the data in Hadoop Data Lake.
Implemented Oozie workflows to execute the Hive and Spark Jobs.
Developed Shell Scripts to automate the jobs
Scheduled the TWS Jobs through Oozie workflows and Shell scripting.
Involved in PI planning events and helped the team define PI objectives
Effectively followed Agile Scrum methodologies and processes
Used Rally for tracking sprint user stories
Project Title: Dataflow Framework (Aug 2017 – Apr 2018)
Client: Discover Financial Services
Role: Technical Lead
Functional Area: Banking & Financial Services
Technologies Used: Nifi, Spark (Scala), Python, Unix shell scripting, Proteus tool, AWS (S3 & EC2), Maestro
Objective: Develop a dataflow framework to automate the dataflow
Description:-
The objective of the project is to develop a dataflow framework to automate the dataflow of card applications at Discover. Apache Nifi is used extensively to automate the dataflow from the Teradata database to the SOR, SOR-Usable and SOT-Source layers in HDFS. The SOT-Source tables are eventually stored in AWS using services such as S3, EC2, EMR and Lambda.
Responsibilities:-
Implemented Nifi processors to create dataflows for SOR, SOR-Usable and SOT-Source
Developed Spark applications using Scala to process Avro tables (see the sketch below)
Developed Python scripts to parse XML content
Developed Custom Nifi Processor to launch Spark Jobs.
Implemented fraud investigation flows using NIFI and Proteus.
Scheduled the jobs through Maestro
Implemented Nifi flows to process real-time data
Developed shell scripts to create FTP scripts and triggers for Nifi flows
Involved in PI planning events and helped define PI objectives
Effectively followed Agile Scrum methodologies and processes
Developed data flows in Nifi to ingest real-time data into HDFS
Used Jenkins for building and deploying the JARs
Used Rally for tracking sprint user stories
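A minimal sketch of the kind of Spark (Scala) job used to process Avro tables landed by the Nifi flows (requires the spark-avro package; paths and column names are hypothetical):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_date}

val spark = SparkSession.builder.appName("SorUsable").getOrCreate()

// Read an Avro-formatted SOR dataset landed by the Nifi flow
val sor = spark.read.format("avro").load("hdfs:///data/sor/card_applications/")

// Derive the SOR-Usable layer: keep active records and normalize a few fields
val sorUsable = sor
  .filter(col("record_status") === "ACTIVE")
  .withColumn("application_date", to_date(col("application_date"), "yyyy-MM-dd"))

sorUsable.write.mode("overwrite").parquet("hdfs:///data/sor_usable/card_applications/")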
Project Title: Global Trade Incentives (June 2016 – Aug 2017)
Client: Dun & Bradstreet
Role: Technical Lead & Senior Developer
Functional Area: Information Media & Entertainment
Technologies Used: Spark (Scala), Hadoop, Pig, Hive, HBase, Sqoop, Oozie, AWS EC2
Objective: To analyze and display supplier data in a user interface
Description:-
Global Trade Incentives is a D&B initiative to provide additional supplier information through a web application. Data from the Oracle warehouse is imported using Sqoop into HDFS on an AWS EC2 cluster. Data in HDFS is processed by Pig and stored in HBase. The HBase tables are accessed through Hive external tables, which are joined and processed into denormalized tables using Spark (Scala) applications. The denormalized table data is retrieved by a web application using REST services.
Responsibilities:-
Developed Sqoop jobs to import data from Oracle to HDFS
Developed Pig scripts to store data into HBase for incremental loads
Created Hive external tables on top of HBase tables to enable the Spark applications to access HBase
Did a POC in Spark to convince the client to use Spark SQL instead of Hive queries
Created Spark jobs for HBase bulk upload, which drastically reduced the data upload time into HBase
Developed Spark applications to process data on Hive tables to replace slow running Hive queries
Extensively used Spark SQL to perform join, filter and pivot operations to translate the queries (see the sketch below)
Improved the performance of the Spark applications through efficient tuning
Repartitioned the HBase tables to improve the performance of Spark applications
Considerably reduced the turnaround time of slow-running Spark applications
Used JUnit for unit testing the Spark applications
Used the Cucumber test plugin to perform integration testing
Effectively followed Agile methodologies and processes
Used Jira for tracking the sprint user stories
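A minimal sketch of the Spark SQL denormalization described above; the Hive table and column names are invented, and the Hive external tables are assumed to be backed by HBase:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, sum}

val spark = SparkSession.builder.appName("SupplierDenorm").enableHiveSupport().getOrCreate()

// Hive external tables backed by HBase (names are illustrative)
val suppliers = spark.table("gti.suppliers")
val shipments = spark.table("gti.shipments")

// Join and aggregate into a single denormalized serving table consumed by the REST layer
val denormalized = suppliers
  .join(shipments, Seq("duns_number"), "left")
  .groupBy("duns_number", "supplier_name", "country")
  .agg(sum("shipment_value").as("total_shipment_value"),
       count("shipment_id").as("shipment_count"))

denormalized.write.mode("overwrite").saveAsTable("gti.supplier_denormalized")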
Project Title: Stars Rewrite (Jun 2014 – May 2016)
Client: Aetna Inc.
Role: Senior Developer
Functional Area: Health Measures
Technologies Used: C++, Unix shell scripting, Java MapReduce, Hive, Oozie, HDFS
Objective: Calculate health care measures and provide Star ratings
Description:-
The objective of this project is to develop the STARS application in C++, which calculates health care measures such as OMW (Osteoporosis Management for Women) and AAP (Adults' Access to Preventive/Ambulatory Services). Lookup files are provided to the application and changed based on requirements. Output files of the STARS application are copied into the HDFS environment and processed using Java MapReduce and Hive to calculate the health care measures, from which a Star rating is derived.
Responsibilities:-
Developing the measures as C++ applications
Developing MapReduce applications to calculate health care measures
Developing Hive tables to calculate health care measures
Generating and Submitting periodic metrics for the Stars project
Coordinating the team
Project Title: Analytics & Reporting (Liquidity Risk) (Oct 2012 – May 2014)
Client: Royal Bank of Canada
Role: Business Analyst
Functional Area: Liquidity Risk in Investment Banking
Tool Used: ION
Objective: Generate regulatory reports
Description:-
The objective of the project is to generate regulatory reports such as Basel reporting (LCR and NSFR), FSA reporting and EBA reporting. The project also accommodates regulatory norm changes in the reports. These changes impact other areas such as CFLA (Cost of Funding and Liquidity Attribution) and CT (Corporate Treasury) reports, which the A&R team verifies and validates.
Responsibilities:-
Validating the Regulatory Reports in ION tool.
Validating the CFLA data.
Generating CT Reports for the regulatory requirements
Project Title: Vendor Isolation (Application Development) (Apr 2012 – Sep 2012)
Client: PayPal
Role: Senior Developer
Environment: FinSys Framework
Languages & Tools: C++, GIT, Bullseye Coverage, GTest/GMock
Objective: Developing credit card applications
Description:-
The objective of the project is to develop the applications required to make credit card payments through PayPal. The application receives transaction information through value objects, which are parsed and processed with the appropriate vendors. Depending on the Accept/Reject response from the vendors, PayPal sends its own Accept/Reject response.
Responsibilities:-
Designing the application within the existing framework
Developing the C++ application in the FinSys framework
Extensively used GMock/GTest to perform test-driven development
Project Title: Pricing Plus (Application Development) (Jan 2012 – Apr 2012)
Client: PayPal
Role: Senior Developer
Environment: Unix
Languages & Tools: C++, GIT, Bullseye Coverage, GTest/GMock
Objective: Changing the pricing model of PayPal
Description:-
The objective of the project is to change the pricing model of PayPal to enable it to generate greater revenue. The Pricing APIs are crucial APIs called by various internal teams such as the front end, Risk team and Money team to calculate cross-border fees and foreign exchange fees. All the Pricing APIs were modified to impose fees based on region instead of country.
Responsibilities:-
Designing the application within the existing framework
Developing the C++ application in the FinSys framework
Extensively used GMock/GTest to perform test-driven development
Project Title: Pay After Purchase (Application Development) (Feb 2011 – Dec 2011)
Client: PayPal
Role: Senior Developer
Environment: Unix
Languages & Tools: C++, GIT, Bullseye Coverage, GTest/GMock
Objective: Enable the user to choose a funding source and pay after purchase
Description:-
The objective of the project is to generate revenue in the offline market. The project enables the user to choose a funding source after making a purchase, within a predefined period of 3 to 7 days. This flexibility in choosing the funding source enables PayPal to generate significant revenue in the offline market.
Responsibilities:-
Designing the application within the existing framework
Developing the C++ application in the FinSys framework
Extensively used GMock/GTest to perform test-driven development
Project Title: Admin Correction Tool (Application Development) (Jun 2010 – Feb 2011)
Client: PayPal
Role: Senior Developer
Environment: Unix
Languages & Tools: C++, GIT, Bullseye Coverage, GTest/GMock
Objective: Enable the PayPal admin to move funds
Description:-
The objective of the project is to enable the PayPal admin to move funds between PayPal accounts and external funding sources such as credit cards, banks and PayPal Credit. It provides the flexibility to choose credit cards, debit cards, bank accounts and PayPal Credit.
Responsibilities:-
Designing the application within the existing framework
Developing the C++ application in the FinSys framework
Extensively used GMock/GTest to perform test-driven development
Projects Handled in HCL Technologies (Mar 2005 to Jun 2010)
Project Title: SEBI – Integrated Market Surveillance System
Client: Securities and Exchange Board of India (Capital Market Regulator)
Role: Developer
Environment: Solaris
Third Party Product: SMARTS, Australia (Time Series Database)
Languages: C, C++, Shell scripting
Markets: National Stock Exchange of India, Bombay Stock Exchange
Objective: To find risks associated with securities and derivatives, bring transparency to the Indian stock exchanges, and control manipulation taking place in the exchanges
Description:-
This project is an interface application between the stock exchanges of India and the third-party product SMARTS (Australia), which builds market surveillance products for various countries. The role of the application is to get data from stock exchanges such as the National Stock Exchange of India and the Bombay Stock Exchange on a daily basis and convert it to the .FAV format that SMARTS accepts. The objective of this project is to bring transparency to the Indian stock exchanges and control all types of manipulation taking place in them. The SMARTS product defines various types of alerts and raises them when the data shows manipulation in terms of stock quantities or sudden changes in stock prices.
Responsibilities:-
Developing a C++ application to convert the exchange data to the SMARTS format
Developing Unix shell scripts to automate the execution
Crontab scheduling to execute the application on daily basis
Unit testing the C++ applications
Project Title: SEBI – Integrated Market Surveillance System - Depositories
Client: Securities and Exchange Board of India (Capital Market Regulator)
Role: Developer
Environment: Solaris
Third Party Product: SMARTS, Australia (Time Series Database)
Languages: C, C++, Shell scripting
Markets: Central Depository Services Ltd, National Securities Depository Ltd
Objective: To find risks associated with off-market and inter-depository transactions and to bring transparency to the Indian stock exchanges by including depositories along with the exchanges
Description:
The role of the application is to get off-market and inter-depository transaction data from depositories such as CDSL and NSDL on a daily basis and convert it to the .FAV format that SMARTS accepts. The objective of this project is to include depository transactions along with the Indian stock exchange data to control all types of manipulation. The SMARTS product defines various types of alerts to identify manipulated off-market and inter-depository transactions and raises them when a trade shows manipulation in terms of stock quantities or sudden changes in stock prices.
Responsibilities:-
Developing a C++ application to convert the depository data to the SMARTS format
Developing Unix shell scripts to automate the execution
Crontab scheduling to execute the application on daily basis
Unit testing the C++ applications
EDUCATION
Bachelor of Engineering, Information Technology (2000-2004): Bharathiyar University, 75.70%
Class XII: Thiagarajar Model High School, 85.08%
Class X: Thiagarajar Model High School, 84.80%