Senior Manager of Site Reliability Engineering - Data Protection and

Company:
JPMorgan Chase & Co.
Location:
Houston, TX, 77020
Posted:
September 30, 2025

Description:

JobID: 210644287
Category: Software Engineering
JobSchedule: Full time
Posted Date: 2025-07-11T17:40:52+00:00

Guide and shape the future of technology at a globally recognized firm, driven by pride in ownership.

As a Senior Manager of Site Reliability Engineering at JPMorgan Chase within the Infrastructure Platforms-Data Protection and Recovery organization, you are the non-functional requirement owner and champion for the applications and infrastructure operations in your remit.

You are a key influencer in your team's strategic planning, driving continual improvement in customer experience, resiliency, security, scalability, monitoring, instrumentation, and automation of the software in your area.

Job responsibilities
* Demonstrates expertise in site reliability principles and an understanding of the fine balance between features, efficiency, and stability
* Effectively negotiates with peers and executive partners to ensure optimal outcomes for all
* Drives the adoption of site reliability practices throughout the organization
* Ensures your teams follow site reliability best practices and can demonstrate this empirically through stability and reliability metrics
* Drives a culture of continual improvement and solicits real-time feedback to improve the customer's experience and product line services
* Ensures your team collaborates with other teams within your group's specialization and avoids duplication of work where possible
* Follows blameless, data-driven, post-mortem strategies and conducts regular team debriefs to enable learning from both successes and mistakes
* Provides personalized coaching for entry to mid-level team members
* Ensures your team documents and shares their knowledge and innovations via internal forums, communities of practice, guilds, and conferences

Required qualifications, capabilities, and skills
* Formal training or certification on infrastructure engineering concepts and 5+ years applied experience.

In addition, 2+ years of experience leading technologists to manage and solve complex technical items within your domain of expertise.

* 7+ years of experience in Infrastructure Operations, driving site reliability and performance engineering
* Advanced proficiency in site reliability culture and principles, and the ability to demonstrate how to implement site reliability across platform teams while avoiding common pitfalls
* Experience leading technologists to manage and solve complex technological issues at a firmwide level
* Ability to influence the team's culture by championing innovation and change for success
* Experience hiring, developing, and recognizing talent
* Proficiency in at least one programming language (e.g., Python)
* Demonstrated proficiency in technical processes
* Proficiency in continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform)
* Experience with containers and container orchestration (e.g., ECS, Kubernetes, Docker)
* Experience troubleshooting common compute, storage, and networking technologies and hardware issues

Preferred qualifications, capabilities, and skills
* Demonstrated data fluency
* 3+ years of experience with enterprise data protection products such as Cohesity or Commvault
