Logaprabhu Jayakumar
Contact: 480-***-****, ***********@*****.***, US Citizen
Summary
Seasoned Data Engineering Lead passionate about architecting and implementing secure, scalable data platforms. Designs and builds end-to-end real-time and batch data pipelines across the organization and provides core database solutions for application teams. Expertise spans key business domains including financial data, customer management, marketing analytics, and fleet management, with a continuous focus on efficient data handling and quality assurance.
Skills
Certifications : AWS Certified Developer Associate, MapR Certified Spark Developer
Data : Migration, Cleansing, Loss Prevention, Optimization, Quality, and Reporting
Languages : Python, Scala, Java 11 (Spring, Spring Boot)
Big Data : Hive, HBase, Apache Spark, Elasticsearch, Solr
Cloud Technology : AWS (Glue, RDS, DynamoDB, S3), Azure Data Factory
ETL : StreamSets, Confluent Kafka, Apache Kafka, ksqlDB, Teradata
Database : MySQL, Redshift, Redis, PostgreSQL, MSSQL, MongoDB, Oracle, Sybase, PL/SQL
Other : Git, CI/CD, Snowflake, REST API, dbt, GCP, Harness, Ansible, Bamboo, Cucumber, Kibana, SAS, Informatica, MS Office Suite, Tableau, SSRS, SSIS, Agile, Scrum, JIRA, GenAI
Education
●Bachelor of Engineering
Experience
Sr. Data Engineer – Karsun Solutions LLC USA
07/2024 – 10/2025
●Designed and implemented a star schema data warehouse in AWS Redshift consisting of 20 dimension tables, a fact table, 5 federated schemas, 4 staging tables, and 3 materialized views to report overhead costs and profit/loss for around 1.7 million government-owned vehicles.
●Re-engineered 15 StreamSets ETL data pipelines into AWS Glue and Python-based ETL pipelines, eliminating StreamSets license costs across the Fleet Management program.
●Coordinated with business stakeholders and data architects to build a payment database system in MySQL on AWS from the legacy Payment (PPMS) system for the Fleet Management program.
●Developed and implemented deduplication tools to remove duplicate delivery point records, improving TypeScript React UI performance by up to 95%.
●Designed and developed 18 StreamSets ETL data pipelines to migrate on-premises mainframe application data to AWS GovCloud RDS.
●Implemented data quality and integrity rules in multiple database objects across 8 schemas in collaboration with various stakeholders.
●Leveraged advanced SQL analysis and optimized stored procedure development to support quality assurance by generating targeted test data and to ensure data integrity by quickly implementing fixes for production defects.
Staff Engineer – Early Warning Services USA
03/2021 – 11/2023
●Led a team of 10 engineers to build the Enterprise Data Analytical Platform (EDAP) using Confluent Kafka and Elasticsearch in the Cloudera environment for fraud prevention and reporting applications.
●Architected and built a data lake using AWS Glue Python ETL to replace StreamSets ETL, reducing data loss by up to 95% through record count balancing.
●Led the development of data cleansing tools in PySpark to remove sensitive PII data for GDPR compliance, in association with the SLOD risk team.
●Designed and implemented a data quality framework to report data disparities in files received from 2,500+ systems of record (SORs), including banking and financial institutions such as Chase and Bank of America; handled various source file formats including Avro, JSON, and Parquet.
●Enhanced real-time and batch data pipelines to store Bank and Zelle payment data in NoSQL (HBase) and Hive databases through Apache Kafka and Scala Spark applications.
●Designed and developed an end-to-end automated testing framework in Cucumber to test streaming and batch ETL supporting fraud prevention, AML, and risk management applications.
Senior Consultant – Atos Syntel Inc USA
10/2018 – 02/2021
●Designed and created a pre-production big data environment (Palladium at American Express) to minimize development and testing time for launching new Merchant Offers features, reducing feature release time by up to 50%.
●Designed and implemented an enhanced Partner Recognition Attribution System in PySpark and UNIX, enabling accurate recognition of partners such as Google, Amazon, and affiliates contributing to Amex product campaigns.
●Harvested insights from customer credit card data using predictive analytics in Hive, PySpark, and UNIX to identify prospective customers.
●Developed an application to consume the merchant offer score API from the GBM machine learning recommendation system and implemented a feature that drives personalized offer promotions in the customer UI.
●Analyzed, aggregated, and segmented marketing data and developed Tableau dashboard reports for marketing and campaign management applications.
Technology Lead – Infosys USA
02/2018 – 10/2018
●Increased operational efficiency by analyzing potential threats to business continuity and proposing effective workarounds, minimizing disruptions and ensuring uninterrupted operations.
●Led coordination between business, technical, and development teams located across multiple geographies, fostering effective communication and knowledge sharing for successful Logistics Management project delivery.
Delivery Project Lead – Mphasis India
08/2015 – 04/2016
●Designed and developed REST API-based data mismatch reporting applications for insurance systems, enabling automated identification and reporting of data inconsistencies.
●Strategized and implemented automation of dynamic SQL*Loader control file generation and the post-production reconciliation process.
Associate Lead Engineer – CGI India
11/2008 – 02/2015
●Spearheaded the development of global credit risk data applications (BNP Paribas) in Sybase, leading a cross-functional team of 10 engineers across India and France.