Site Reliability Engineer / DevOps

Company:
Global Technology Associates
Location:
Manhattan, NY, 10261
Posted:
June 21, 2025

Description:

Looking for a Site Reliability Engineer / DevOps

Salary Range: $50-$60/hr

What you will be doing as Site Reliability Engineer / DevOps

Site reliability / operations lead for live, large-scale services

Work with engineering teams, product/business, service providers and third-party vendors across multiple locations

Ensure that project milestones are met in terms of features, quality and time for the E2E system

Lead the site reliability team and keep the overall live Production system highly available at all times while expanding its functionality

Debug Production issues raised by customers and customer support teams, isolate the cause and work towards a fix

Actively participate in the expansion and enhancement of the functionality of services in Kubernetes, with best practices for architecture and site reliability engineering regarding availability, reliability, scalability and security in mind

Own all updates to the live production environments, maximizing availability while delivering the updates requested by the project roadmap on time

Research, prototype and develop solutions for various cutting-edge issues, scalability and security problems, etc.

Exercise judgment in selecting methods, techniques and evaluation criteria for obtaining results

Conduct integration, integration tests and performance tests of the E2E system including external dependencies, improve the system to meet performance and reliability requirements

