Engineer

Company:
Robeco
Location:
Rotterdam, South Holland, The Netherlands
Posted:
April 24, 2025

Description:

Department

At Robeco, we work with Job Profiles and Roles. This gives flexibility in work responsibilities, creates opportunities to develop knowledge, experience, and skills, and provides clarity on possible career paths along with greater adaptability and collaboration. Changes in Roles are decided collaboratively and in consultation, to balance employee and company interests.

Position & Requirements

The Site Reliability Engineer (SRE) plays a critical role in ensuring the reliability, availability, and performance of IT services within our financial services environment. Operating at the intersection of development and operations, the SRE uses observability tools such as Dynatrace, ServiceNow, and Azure DevOps, along with other supporting applications and platforms, to proactively monitor and automate our hybrid Azure Cloud and Datacenter environment and to optimize system performance, application and infrastructure resilience, and incident response.

The ideal candidate is passionate about automation, observability, and continuous improvement in an organization transitioning to DevSecOps.

This role includes on-call duties outside of regular business hours, following industry best practices to ensure a balanced work-life integration.

Your success in this role will hinge on your ability to communicate effectively, educate development teams, and ensure that SRE recommendations and best practices are consistently followed across the organization.

Key Responsibilities:

Observability and Monitoring:

Implement, maintain, and optimize observability tools, including Dynatrace, ServiceNow, and Azure monitoring solutions.

Develop automated alerting, dashboarding, and reporting solutions tailored to diverse stakeholders.

Align observability efforts with business capability monitoring strategies.

Integrate AI-driven insights for predictive monitoring and proactive issue detection.

Define, measure, and monitor Service Level Indicators (SLIs) to ensure service reliability and performance.

Establish and track Service Level Objectives (SLOs) to align reliability goals with business expectations.

Incident Management and Response:

Utilize AI-powered monitoring tools to detect, diagnose, and resolve incidents proactively.

Automate incident creation, escalation, and resolution workflows.

Lead incident post-mortems, ensuring thorough root cause analysis and implementation of preventive actions.

This position includes on-call responsibilities outside of regular work hours, adhering to best practices for managing on-call duties.

The SRE will collaborate within an on-call framework alongside the SMPO team, which coordinates incident resolution, and, where necessary, the IT Delivery teams. Robeco is introducing these roles for the first time to improve operational efficiency and incident management throughout the organization.

Automation and Reliability Engineering:

Develop automation scripts and frameworks to enhance system reliability, scalability, and resilience.

Implement self-healing capabilities and continuous deployment pipelines.

Collaborate with development teams to embed reliability best practices into CI/CD workflows.

Performance Optimization and Cost Management:

Continuously analyze system performance and suggest optimizations for cloud cost savings.

Utilize observability data to guide infrastructure scaling decisions and ensure optimal resource utilization.

Monitor sustainability KPIs and align observability insights with ESG goals.

Collaboration and Knowledge Sharing:

Work closely with cross-functional teams (DevOps, Security, Compliance) to align observability practices with business objectives.

Provide training and mentorship on observability best practices to engineers and operational teams.

Document processes, workflows, and guidelines for broader organizational adoption.

Core Competencies

Technical Skills:

Expertise in observability tools: Dynatrace, ServiceNow, QlikSense.

Experience with scripting and automation (Python, PowerShell, Azure DevOps).

Deep understanding of the Azure cloud environment and hybrid infrastructure.

Strong knowledge of observability concepts (metrics, logs, traces) for performance and reliability tuning.

Problem-Solving Abilities:

Strong analytical skills with the ability to troubleshoot complex distributed systems.

Ability to correlate multiple data sources to identify and resolve performance bottlenecks.

Collaboration and Communication:

Ability to translate technical concepts for non-technical audiences.

Experience working within Agile/Scrum teams.

Security and Compliance Awareness:

Familiarity with IT governance frameworks, security compliance (ISO, GDPR), and audit requirements.

Soft Skills

Proactive and self-driven with a continuous improvement mindset.

Strong organizational and time management skills.

A secure base for the team, balancing technical depth with a collaborative and mentoring approach.

Experience in communicating and collaborating with both operational and development teams.

A strong communicator with the ability to persuade and influence stakeholders at all levels.

Qualifications and Experience

Bachelor’s degree in computer science, IT, or a related field.

3-5 years of experience in a similar role, preferably within financial services or large enterprise environments.

Certifications in Dynatrace and Azure are a plus.

All applications will be treated with the utmost confidentiality. An assessment and integrity test may be used in the selection procedure.

Robeco Recruiting Team

R1893
