Senior Site Reliability Engineer

Company: Algo Capital Group
Location: Manhattan, NY, 10261
Posted: May 15, 2025

Description:

Our client, a top-tier high-frequency trading firm based in NYC with a strong engineering culture and ML infrastructure, is looking to hire a Senior Site Reliability Engineer for their infrastructure team. This team is responsible for developing and maintaining the corporate productivity stack for the entire firm, both on-prem and in the cloud. You will ensure the availability and reliability of systems within this stack and grow the engineering practice in alignment with the firm's larger engineering organization.

This role requires a deep Linux operating system and application administration skill set, proficiency in Python, and solid experience with configuration management/IaC. Successful candidates should also have exceptional organizational, communication, and project management skills, as well as the ability to troubleshoot complex technical issues.

Responsibilities

Manage on-premise containerized web services

Automate and troubleshoot a broad range of technical infrastructure

Design and operate secure, reliable systems

Develop and implement monitoring solutions to ensure high system uptime and reliability; utilize tools to detect and resolve issues proactively

Document system architecture, processes, and best practices

Break down complexity, iterate, and communicate progress to a wide variety of leads and stakeholders

Assist with the administration of DHCP and DNS for both on-premise and external systems and applications

Qualifications

Years of experience in site reliability engineering or related disciplines

Strong proficiency with Python

Experience managing and monitoring containerized infrastructure

Experience working with CI/CD tools such as Jenkins, GitHub Actions, or ArgoCD

Expert experience with IaC and configuration management tools such as Terraform, SaltStack, Chef, Puppet, or Ansible

Experience building and operating systems on cloud platforms (e.g. AWS, Azure, GCP)

OpenLDAP or other directory services management expertise

Atlassian Data Center administration experience (on-prem)

Web development experience

This position offers a top-of-the-market compensation package with excellent benefits and career growth opportunities.
