
Kafka Platform Engineer with 4+ years of experience

Location:
Budd Lake, NJ, 07828
Posted:
January 13, 2026


Resume:

NITHEESH MUNDRU +1-801-***-**** *********@*****.***

Professional Summary:

Certified and results-driven Kafka Platform Engineer with over 4 years of experience designing, securing, and scaling real-time streaming platforms using Apache Kafka, Confluent Cloud, and IBM MQ across cloud-native, hybrid, and distributed environments.

Specialized in platform engineering and real-time data streaming with Apache Kafka, Kafka Connect, Kafka Streams, ksqlDB, and Confluent Cloud, powering use cases like fraud detection, event sourcing, log analytics, order tracking, and system observability.

Designed and implemented highly available Kafka clusters with multi-AZ resilience, disaster recovery (DR), schema evolution, and multi-tenancy for 50+ microservices, enabling decoupled, event-driven architectures across on-prem and cloud environments (AWS, Azure).

Built and managed Kafka Connect clusters with high-throughput source/sink connectors (HTTP, REST, JDBC, S3, BigQuery, Snowflake), supporting seamless integration across data platforms and business domains.

Automated infrastructure provisioning, cluster configuration, ACL management, and monitoring setup using Terraform, Ansible, and Helm — reducing manual operations by 70% and enabling consistent GitOps-style deployments.
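As an illustration of the GitOps-style provisioning described above, topic creation can be declared in Terraform. This is a hedged sketch using the community Mongey/kafka provider; the broker address, topic name, and settings are placeholders, not taken from any actual deployment.

```hcl
# Illustrative only: community Mongey/kafka provider; all names are placeholders.
terraform {
  required_providers {
    kafka = {
      source = "Mongey/kafka"
    }
  }
}

provider "kafka" {
  bootstrap_servers = ["broker-1.internal:9092"]
}

resource "kafka_topic" "orders" {
  name               = "orders.v1"
  partitions         = 12
  replication_factor = 3

  config = {
    "retention.ms"        = "604800000" # 7 days
    "cleanup.policy"      = "delete"
    "min.insync.replicas" = "2"
  }
}
```

Committing files like this to Git and applying them in CI is what makes topic configuration reviewable and repeatable instead of a manual broker operation.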

Enforced end-to-end Kafka security using SSL/TLS, SASL (SCRAM, OAUTHBEARER), Kerberos, and fine-grained ACLs, meeting compliance for industries like banking and healthcare.
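A typical client-side configuration for the SASL/SCRAM-over-TLS setup mentioned above looks like the sketch below. The hostnames, credentials, and keystore paths are placeholders for illustration only.

```properties
# Hypothetical Kafka client security settings; user, password, and paths are placeholders.
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="app-user" \
  password="app-secret";
ssl.truststore.location=/etc/kafka/certs/truststore.jks
ssl.truststore.password=changeit
```

Fine-grained access is then layered on top with ACLs, e.g. granting a principal read access to a single topic and consumer group rather than cluster-wide rights.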

Deployed centralized monitoring, logging, and alerting using Prometheus, Grafana, OpenTelemetry, and Splunk, enabling real-time visibility into broker health, consumer lag, throughput, and replication delays.
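Consumer-lag alerting of the kind described above is commonly expressed as a Prometheus rule. The sketch below assumes lag is exposed by an exporter such as danielqsj/kafka_exporter; the metric name, threshold, and labels are assumptions to adapt to the actual setup.

```yaml
groups:
  - name: kafka-consumer-lag
    rules:
      - alert: ConsumerGroupLagHigh
        # Metric name assumes kafka_exporter; adjust for your exporter of choice.
        expr: sum by (consumergroup, topic) (kafka_consumergroup_lag) > 10000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Consumer group {{ $labels.consumergroup }} is lagging on {{ $labels.topic }}"
```

Routing such alerts through Alertmanager is what turns lag dashboards into actionable pages rather than passive charts.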

Led Kafka platform upgrades, partition reassignments, offset resets, and cluster tuning with zero downtime, supporting 99.99% SLA pipelines processing millions of events daily.

Implemented observability and self-healing strategies including log-based anomaly detection, auto-recovery scripts, and dynamic consumer throttling.

Delivered Kafka Platform-as-a-Service (PaaS) onboarding for 30+ development teams with templated configs, runbooks, Terraform modules, and schema registry integrations.

Enabled schema governance and compatibility enforcement using Confluent Schema Registry with Avro and Protobuf, supporting backward/forward compatibility and secure multi-tenant controls.
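The backward-compatible evolution mentioned above typically means new fields get default values, so a consumer on the new schema can still read records written with the old one. A hypothetical Avro example (record and field names are illustrative, not from any real project), where "channel" is the newly added field:

```json
{
  "type": "record",
  "name": "OrderEvent",
  "namespace": "com.example.orders",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "amount", "type": "double"},
    {"name": "channel", "type": "string", "default": "web"}
  ]
}
```

With Schema Registry compatibility set to BACKWARD, registering a new version that adds a field without a default would be rejected, which is the enforcement mechanism referred to above.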

Collaborated with App Dev, SRE, CloudOps, and Security teams to define platform guardrails, automate secrets/certificates (Okta, Vault), and document scalable Kafka patterns for organization-wide adoption.

Licenses & Certificates:

AWS Certified Solutions Architect – Associate

SnowPro Core Certified by Snowflake

Tableau Desktop Specialist

Lean Six Sigma Green Belt

Technical Skills:

Streaming & Messaging Platforms

Apache Kafka, Confluent Cloud, Kafka Connect, Kafka Streams, ksqlDB, IBM MQ, IBM Integration Bus, JMS, ActiveMQ, WebSphere MQ, AWS MSK, Self-Hosted Kafka

Cloud & Infrastructure

AWS (EC2, S3, IAM, MSK, CloudWatch, Lambda, Secrets Manager), Azure (VMs, Event Hubs, Storage, AKS), Terraform, Ansible, Helm, Kubernetes, Strimzi, Okta, Vault, Load Balancers

Infrastructure Automation & DevOps

Terraform, Ansible, Shell Scripting, Git, CI/CD, Jenkins, GitLab CI, GitHub Actions, Docker, Kubernetes

Monitoring & Observability

Prometheus, Grafana, Splunk, OpenTelemetry, CloudWatch, Alertmanager, Custom Dashboards, Consumer Lag Monitoring

Security & Governance

SSL/TLS, SASL (SCRAM, OAUTHBEARER), Kerberos, Kafka ACLs, Confluent RBAC, Schema Registry, Avro, Protobuf, PII Masking, HashiCorp Vault, AWS Secrets Manager

Data Integration & Connectors

JDBC, REST, HTTP, S3 Sink/Source, Snowflake, BigQuery, HDFS, Kafka Connect Framework, Custom Connectors

Streaming Use Cases & Architectures

Event-Driven Architecture, Multi-Tenant Kafka Clusters, Disaster Recovery, Multi-AZ Clustering, Offset Management, Log Aggregation, Real-Time Ingestion

Middleware & Protocols

WebSphere MQ, IBM MQ, ActiveMQ, TCP/IP, HTTP, HTTPS, FTP, SMTP

Programming & Scripting

Python, Java, C, C++, PL/SQL, J2EE, Shell, Bash

Data Platforms & Integration Targets

Snowflake, HDFS, Spark, Hadoop, Elasticsearch, RDBMS (Oracle, DB2, Sybase, SQL Server)

Operating Systems

Linux, UNIX, AIX, Sun Solaris, Windows

Professional Experience:

Kafka Infrastructure Engineer

Kroger - Remote June 2024 to Present

Project Title: Enterprise Kafka Platform Modernization for Real-Time Applications

Project Description: Designed and delivered a secure, scalable, and cloud-ready Kafka platform to support real-time data pipelines across retail operations, analytics, and observability systems. Built HA clusters across AWS, Azure, and on-prem, automated deployment using Terraform and Ansible, and enabled 50+ microservices with secure Kafka onboarding, schema enforcement, and high-throughput connector pipelines.

Responsibilities:

Designed and deployed Confluent Kafka clusters with built-in high availability (HA) and disaster recovery (DR), enabling 24x7 reliability for 50+ real-time use cases.

Implemented end-to-end Kafka security using SSL/TLS encryption, SASL/LDAP authentication, and ZooKeeper-based ACLs, meeting enterprise compliance standards.

Built and managed Kafka Connect clusters and deployed high-throughput source/sink connectors for integration with JDBC, S3, and internal APIs.

Managed schema evolution and compatibility enforcement using Confluent Schema Registry, supporting multi-tenant backward/forward compatibility.

Automated Kafka provisioning and configuration using Ansible and Terraform, enabling GitOps-style deployments across AWS, Azure, and on-prem environments.

Integrated Prometheus and Grafana for real-time observability of consumer lag, broker health, and throughput, reducing incident detection time by 60%.

Tuned topic configurations, replication strategies, and retention policies, improving cluster stability and reducing disk usage by 35%.

Supported Kafka platform upgrades, offset resets, and daily operations, maintaining 99.99% uptime across production environments.

Collaborated with development teams to implement secure Kafka onboarding for 20+ microservices using templated configs, ACL policies, and schema registration.

Conducted Kafka performance benchmarks, POCs, and platform hardening assessments to prepare for multi-cloud scale-out.

Middleware Engineer

Galderma, New York, NY Jan 2023 to May 2024

Project Title: Kafka & MQ Platform Enablement for Real-Time Enterprise Integration

Project Description: Led the implementation of Kafka-based messaging and IBM MQ modernization to support scalable, event-driven integrations across enterprise systems including analytics, EHR, inventory, and operations. Delivered a secure, monitored, and production-hardened streaming backbone across on-prem and cloud environments with automation, DR support, and platform observability.

Responsibilities:

Designed and deployed Apache Kafka clusters for log aggregation, event streaming, and asynchronous messaging, enabling high-throughput integration across business systems.

Built reusable Kafka onboarding templates, connector config guides, and runbooks, supporting secure integration for 15+ application teams across dev and prod.

Managed schema evolution and compatibility using Avro with internal Schema Registry, ensuring deserialization integrity across consumers and reducing data contract issues.

Implemented DR policies, backup strategies, and replication tuning to ensure platform fault tolerance, resilience, and data durability.

Automated Kafka and MQ provisioning using Shell, Ansible, and MQSC scripting, enabling consistent, repeatable infrastructure rollouts.

Integrated Prometheus and Grafana for broker health monitoring, lag detection, and throughput visualization; established alert rules and runbooks for faster triage.

Tuned Kafka broker memory, consumer lag thresholds, and topic partitions to reduce processing latency and increase parallelism in message pipelines.

Installed and supported IBM WebSphere MQ clusters, managing Listeners, Channels, Command Servers, and Dead Letter Queues in multi-host environments.

Performed MQ version upgrades (6.x to 7.x to 7.5+), patched critical vulnerabilities, and optimized cluster config for scalability and uptime.

Integrated Kafka with Spark, Elasticsearch, and Hadoop, enabling downstream real-time analytics and search indexing.

Secured Kafka and MQ using SSL/TLS, Kerberos, and ACLs, enforcing encrypted, authenticated access across microservices and external partners.

Conducted root cause investigations, wrote RCA documentation, and led post-incident retrospectives to improve platform reliability and team response.

Kafka Engineer

Metro Bank, Bangalore, India Sep 2021 to Dec 2022

Project Title: Enterprise Kafka Platform Implementation & Streaming Integration

Project Description: Served as the lead Kafka Engineer responsible for designing and operationalizing a multi-tenant, secure Kafka platform to support real-time event processing across core banking, payment systems, and fraud detection pipelines. Delivered production-grade Kafka clusters, enabled developer adoption, and integrated the platform with streaming apps, observability, and DevOps workflows.

Responsibilities:

Architected and deployed enterprise-grade Kafka clusters across multiple environments with high availability (HA) and replication strategies, supporting critical real-time workloads.

Configured and managed Kafka core components including Brokers, ZooKeeper, Kafka Streams, Confluent Control Center, and ksqlDB, enabling high-performance event streaming.

Enforced Kafka security policies using SSL, SASL, and ACLs, ensuring encrypted data flow and controlled access for all producers and consumers.

Built and maintained Kafka Connect Framework, deploying JMS, HTTP, and custom connectors, streamlining integration with legacy and modern systems.

Delivered Kafka platform onboarding for 25+ app teams with reusable templates, access provisioning, and schema validation guides.

Tuned topic partitioning, replication factors, and consumer group lag to achieve low-latency, fault-tolerant streaming for millions of messages per day.
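The consumer-lag tuning described above ultimately tracks one quantity per partition: the broker's log-end offset minus the group's committed offset. A minimal sketch of that arithmetic (function names are illustrative, not from any specific client library):

```python
def partition_lag(end_offsets, committed_offsets):
    """Compute per-partition consumer lag.

    end_offsets: {partition: log-end offset} reported by the broker.
    committed_offsets: {partition: last committed offset} for the group;
    a partition with no commit yet is treated as fully lagged.
    """
    lag = {}
    for partition, end in end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        # Clamp at zero in case a stale commit briefly exceeds the end offset.
        lag[partition] = max(end - committed, 0)
    return lag


def total_lag(end_offsets, committed_offsets):
    """Aggregate lag across all partitions of a topic for a consumer group."""
    return sum(partition_lag(end_offsets, committed_offsets).values())
```

In practice these offsets come from the Kafka admin/consumer APIs or an exporter; alerting on the aggregate while adding partitions to increase parallelism is the usual tuning loop.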

Led deployment and monitoring of Prometheus, Grafana, and Splunk, enabling real-time observability and proactive alerting.

Designed and deployed real-time data ingestion pipelines using Kafka + Spring Boot, powering microservice-based, event-driven banking applications.

Supported CI/CD pipelines and automated Kafka rollout processes using GitLab CI and Jenkins, improving deployment consistency and reducing manual errors.

Conducted performance benchmarking, stress testing, and DR validation to ensure platform resilience under varying message loads.

Defined Kafka best practices, published runbooks, and facilitated knowledge sharing to improve engineering maturity and operational confidence.

Collaborated cross-functionally with app developers, DevOps, and infrastructure teams to scale Kafka platform usage across the organization.

Education:

Master of Science in Computer Science at Concordia University Wisconsin, WI.


