Site Reliability Engineer

Company: CLS Group
Location: Iselin, NJ, 08830
Posted: May 12, 2024

Description:

Job Purpose

The role is primarily responsible for ensuring that SRE methodologies are applied to the cloud-hosted environment. In addition, the role will act as a central point of expertise for SRE automation across the Platform Operations team.

Essential Job Functions

Responsible for implementing SRE methodologies within the cloud-hosted environment, including the automation of toil and the definition and implementation of SLOs and SLAs.

Establish SRE as a practice within the Cloud team and, working closely with Infrastructure Engineering, enhance observability and telemetry so that the cloud-hosted services have appropriate service metrics and monitoring.

Build out GitOps practices for use in the cloud-hosted environment using tooling such as Terraform and Ansible. Act as a liaison between Engineering and Cloud Operations to fully embed Infrastructure as Code for all new cloud-hosted deployments.

Provide escalation support for cloud and automation-related issues, ensuring that production stability is always the primary requirement.

Ensure risks and stability issues in the cloud-hosted environment are understood and, where possible, addressed through SRE best practices as part of any incident postmortem.

Minimum Education Required

Bachelor’s degree or equivalent.

Industry-standard IT certification desired (e.g. AWS / Microsoft / VMware / Red Hat Linux).

Minimum Job-Related Experience Required

Must have strong technical operational support experience within an infrastructure services team, preferably supporting either on-premises compute or cloud-hosted environments.

Strong understanding of automation technologies, ideally including Terraform and Ansible, and the ability to use these tools to drive Infrastructure as Code through GitOps methodologies.

Knowledge of at least one scripting language, preferably Python or PowerShell.

Minimum of 2 years' experience applying SRE methodologies within a support team, and an understanding of the associated service level metrics.

Experience of APM tools (e.g. Grafana / Datadog / Dynatrace).

Experience of working in a regulated financial services / banking organization.

Special Skills/Knowledge

Able to understand and use at least one cloud-hosted service, including either AWS or Azure.

Possesses a strong service-orientated mindset and can consistently deliver a high level of service to the business.

Able to make and influence decisions with peers, stakeholders and management.

Able to communicate effectively with both business and technical staff at all levels. This includes communicating complex technical issues to different levels of management.

Able to work proactively, own complex deliveries and provide regular updates to management and stakeholders.

Expected full-time salary range between $140,000 and $170,000 + variable compensation + 401(k) match + benefits.

*Note: Disclosure of the expected salary range for this role, as required by the NY Pay Transparency Law.
