SRE/DevOps Java - Santa Monica, CA and McLean, VA

Company:
KNS IT Group
Location:
McLean, VA
Posted:
January 16, 2026

Description:

Senior Java SRE

Experience Required: 14+ years

Assignment Duration: 12+ Months

Engagement Type: Contract

Work Location: Santa Monica, CA - Onsite (Hybrid/Initial Remote options depending on end-client)

Consultants must complete a Java-based coding round.

Key Responsibilities:

• Architect globally distributed, multi-region GCP platforms with 99.99%+ availability targets.

• Define and operationalize SLIs, SLOs, error budgets, and reliability governance models.

• Lead incident command, RCA, and long-term reliability remediation for large-scale systems.

• Engineer and tune Java-based microservices (JVM internals, GC strategies, memory profiling).

• Design and operate GKE (Google Kubernetes Engine) at scale, including multi-cluster and fleet management.

• Implement GCP-native architectures using GKE, Compute Engine, Cloud Load Balancing, Cloud Spanner, Bigtable, Cloud SQL, Pub/Sub, Cloud Storage, IAM, and VPC Service Controls.

• Build secure and repeatable infrastructure using Terraform and policy-as-code.

• Design advanced service mesh and traffic management using Istio / Anthos Service Mesh.

• Implement advanced Kubernetes storage for stateful workloads using Portworx.

• Support event-driven architectures using Kafka, Kafka Streams, ksqlDB, and Spark Streaming.

• Integrate GCP-native streaming solutions such as Pub/Sub.

• Optimize systems for low-latency, high-throughput workloads.

• Implement advanced observability using Prometheus, Datadog, Splunk, Kiali.

• Leverage eBPF for kernel-level tracing, networking diagnostics, and performance tuning.

• Manage advanced ingress, load balancing, and traffic shaping using Nginx Controller and Seesaw.

• Architect high-scale CI/CD pipelines using GitLab CI/CD, Jenkins, and GCP-native tooling.

• Build internal developer platforms (PaaS) to standardize deployments and reduce toil.

• Automate operations using Python, Go, Bash, and custom reliability tooling.

• Provide 24/7 production support across U.S. time zones.

• Participate in on-call rotations, weekend releases, and incident war rooms.

• Continuously improve monitoring, alerting, and incident response maturity.

Required Technical Expertise:

• Java (Advanced JVM internals, GC, performance tuning)

• GCP Cloud (Professional-level depth)

• GKE/Kubernetes (CKA/CKS depth)

• Docker, Terraform

• CI/CD: GitLab CI/CD, Jenkins

• Streaming: Kafka, Kafka Streams, ksqlDB, Spark

• Service Mesh: Istio, Anthos Service Mesh

• Monitoring & Logging: Prometheus, Datadog, Splunk, Kiali

• OS & Scripting: Linux/Unix, Bash

• Programming: Python or Go

• Virtualization: VMware

• Networking & Performance: eBPF, Nginx Controller, Seesaw

• Multi-cluster Kubernetes governance

• Internal platform engineering (PaaS)

• High-traffic SaaS or consumer-scale platforms

• Real-time streaming & event-driven architectures

• Deep observability and kernel-level tracing

• GKE fleet & Anthos multi-cluster architectures

• JVM performance engineering at hyperscale

• Service mesh traffic shaping & zero-downtime releases

Certifications Required:

• Google Professional Cloud Architect or Professional Cloud DevOps Engineer

• Certified Kubernetes Administrator (CKA) or Certified Kubernetes Security Specialist (CKS)
