Looking for a Site Reliability Engineer / DevOps
Salary Range: $50-$60/hr
What you will be doing as Site Reliability Engineer / DevOps
Site reliability / operations lead for live, large-scale services
Work with engineering teams, product/business, service providers and third-party vendors across multiple locations
Ensure that project milestones are met in terms of features, quality and time for the E2E system
Lead the site reliability team and keep the overall live Production system highly available at all times while expanding its functionality
Debug Production issues raised by customers and customer support teams, isolate the cause and work towards a fix
Actively participate in expanding and enhancing the functionality of services in Kubernetes, applying architecture and site reliability engineering best practices for availability, reliability, scalability and security
Own all updates to the live production environments, maximizing availability while delivering updates on the schedule set by the project roadmap
Research, prototype and develop solutions for cutting-edge scalability, security and other problems
Exercise judgment in selecting methods, techniques and evaluation criteria for obtaining results
Conduct integration, integration tests and performance tests of the E2E system, including external dependencies, and improve the system to meet performance and reliability requirements
What you will bring to the table as a Site Reliability Engineer / DevOps
Site reliability / operations lead for live, large-scale services
Work with engineering teams, product/business, service providers and third-party vendors across multiple locations
Ensure that project milestones are met in terms of features, quality and time for the E2E system
Lead the site reliability team and keep the overall live Production system highly available at all times while expanding its functionality
Debug Production issues raised by customers and customer support teams, isolate the cause and work towards a fix
Actively participate in expanding and enhancing the functionality of services in Kubernetes, applying architecture and site reliability engineering best practices for availability, reliability, scalability and security
Own all updates to the live production environments, maximizing availability while delivering updates on the schedule set by the project roadmap
Research, prototype and develop solutions for cutting-edge scalability, security and other problems
Exercise judgment in selecting methods, techniques and evaluation criteria for obtaining results
Conduct integration, integration tests and performance tests of the E2E system, including external dependencies, and improve the system to meet performance and reliability requirements
#JobsAtKellyTelecom