Area of work:
We are seeking a highly motivated and skilled DevOps Engineer with a strong focus on Observability to join our growing team. The ideal candidate will have hands-on experience with OpenSearch, Prometheus, Grafana, and OpenTelemetry systems, along with a proven background in cloud infrastructure, particularly Google Cloud Platform (GCP) and Google Kubernetes Engine (GKE). You will play a key role in designing, implementing, and maintaining observability solutions that ensure the reliability, performance, and scalability of our systems. This role requires a proactive mindset, excellent interpersonal skills, and the confidence to drive initiatives independently.
Your responsibilities:
Design, deploy, and maintain observability stacks using OpenSearch, Prometheus, Grafana, and OpenTelemetry tools
Manage installation, upgrades, and maintenance of observability platforms
Develop and maintain dashboards, alerts, and metrics to monitor system health and performance
Collaborate with development and operations teams to define and implement best practices for monitoring and logging
Ensure high availability and scalability of observability infrastructure in GCP/GKE environments
Automate observability workflows using Infrastructure as Code (IaC) and CI/CD pipelines
Troubleshoot and resolve issues related to monitoring, logging, and alerting systems
Drive continuous improvement in observability practices and tooling
Your profile:
3–5 years of hands-on experience in a DevOps or SRE role with a strong focus on observability
Proven expertise in OpenSearch, Prometheus, Grafana, and OpenTelemetry
Solid experience with GCP and GKE
Proficiency in scripting languages (e.g., Bash, Python) and in Infrastructure as Code and configuration management tools (e.g., Terraform, Ansible)
Strong understanding of containerization and orchestration (Docker, Kubernetes)
Experience with CI/CD tools and practices
Excellent problem-solving skills and attention to detail
Strong interpersonal and communication skills
Self-motivated, proactive, and confident in driving initiatives independently
Nice to have:
Experience with other cloud platforms (AWS, Azure)
Familiarity with service meshes (e.g., Istio) and distributed tracing tools (e.g., Jaeger)
Knowledge of security best practices in observability
You can look forward to our benefits package:
Hybrid work and flexible working hours
Work from abroad - 12 days of remote work from EU countries per year
Group Share Plan - discount on company shares
Pension fund contribution - 3% of your gross salary (5% after 5 years with us)
Health & Wellbeing - fully covered Multisport card, life & accident insurance, sick days, and 100% salary compensation during sick leave (up to 56 days)
25 vacation days
Mobility - fully covered public transport in Prague & free parking
Flexible Benefit Account (Pluxee) - 1200 per month
Personal Development - annual budget of €690
... and way more!