Kubernetes Platform Engineer

Company:
Bay Systems Consulting Inc.
Location:
Berkeley, CA, 94720
Posted:
April 25, 2026

Description:

Job Description

We are seeking a Kubernetes Platform Engineer to join the Platform Engineering team as a hands-on individual contributor. This role focuses on day-to-day operations and administration of Kubernetes clusters, primarily on-premises (K3s/RKE2) with additional support for cloud environments on Google Cloud Platform (GCP) and Amazon Web Services (AWS). You will manage cluster lifecycle operations, implement and maintain Cilium-based networking, troubleshoot complex platform issues, and enable development teams to successfully deploy and operate their workloads. This position balances infrastructure operations with developer enablement, requiring both deep technical expertise and strong collaboration skills.

The Team

The Platform Engineering team is a small team within ESnet's Systems and Software department that is dedicated to streamlining the software development lifecycle by establishing standardized processes for building, configuring, and deploying applications. The team supports the engineering, implementation, and maintenance of ESnet's platform systems and services including GitLab, Ansible, and Kubernetes environments, with responsibility for both on-premises and cloud-based services deployed across Google Cloud Platform (GCP) and Amazon Web Services (AWS).

Major Responsibilities

Cluster Operations & Administration

Manage the full lifecycle of Kubernetes clusters (on-premises K3s/RKE2, GKE, and EKS), including upgrades, security patching, scaling, and capacity planning

Troubleshoot cluster-level issues including control plane problems, node failures, and resource constraints

Implement and maintain cluster security hardening based on CIS benchmarks and organizational security policies

Manage etcd cluster health, backup procedures, and disaster recovery capabilities

Monitor cluster performance and optimize resource utilization across multi-tenant workloads

Coordinate with datacenter operations team for physical infrastructure changes and maintenance windows

Networking & Cilium CNI

Implement, configure, and maintain Cilium CNI across on-premises and cloud Kubernetes environments

Design and enforce network policies to achieve secure multi-tenant isolation

Troubleshoot complex pod networking issues including DNS resolution, service discovery, and connectivity problems

Configure and maintain BGP peering with physical network infrastructure for on-premises integration

Work with network engineering team on firewall rules, VLANs, IPv6 networking, and network architecture

Internal Developer Platform & Enablement

Contribute to building a next-generation internal developer platform inspired by tools like Backstage, focused on increasing development efficiency and security

Work with the security team to define secure image baselines and automate the patching pipeline for container images

Assist development teams with deploying, configuring, and troubleshooting Kubernetes workloads

Review application deployment manifests and provide guidance on best practices and optimization

Develop and maintain platform documentation, runbooks, and self-service guides

Engage with development teams to understand platform needs and tailor the cluster experience to meet evolving requirements

Required Qualifications

Typically requires a minimum of 8 years of related experience with a Bachelor’s degree; or 6 years and a Master’s degree; or equivalent experience.

Demonstrated experience administering Kubernetes on on-premises infrastructure (K3s, RKE2, or similar bare-metal distributions)

Experience with cloud-managed Kubernetes (GKE and/or EKS)

Strong understanding of Linux networking fundamentals: iptables/nftables, routing tables, DNS, TCP/IP stack, network troubleshooting

Experience with GitOps methodologies and tools such as ArgoCD or Flux

Proficiency in scripting and automation: Bash, Python, Go

Production experience with Cilium CNI or an equivalent CNI

Ability to work collaboratively in a team environment and communicate technical concepts clearly

Understanding of Kubernetes security best practices including Pod Security Standards, RBAC, and secrets management

GCP (Google Cloud Platform) and/or AWS (Amazon Web Services) cloud platform experience

Preferred Qualifications

Go programming experience for operator maintenance and platform tooling development

CKA (Certified Kubernetes Administrator) or CKS (Certified Kubernetes Security Specialist) certification

Background in BGP routing protocols and network engineering concepts

IPv6 networking experience

Infrastructure as Code experience with Terraform or Ansible

Experience with internal developer platform (IDP) tools such as Backstage or similar

Experience with service mesh technologies (Istio, Linkerd)

Excellent understanding of code review and familiarity with GitHub and GitLab workflows

Full-time
