Senior Technology Site Reliability Engineering Manager
Cooley is seeking a Senior Site Reliability Engineering Manager to join the Infrastructure & Development Operations team.
Position summary: The Senior Technology Site Reliability Engineering ("SRE") Manager is responsible for leading a team of SREs to ensure the reliability, scalability, and performance of the firm's infrastructure and services. This role works with the DevOps, infrastructure, and development teams, applying engineering principles to operations in order to create scalable and resilient systems. In addition to being technically advanced, the SRE Manager will have a high degree of emotional intelligence and the ability to work as part of a team towards complex and layered objectives. Specific duties and responsibilities include, but are not limited to, the following:
Position responsibilities:
Define and execute the SRE strategic roadmap aligned with business goals, providing experienced leadership in developing solutions for highly scalable, highly available, hybrid cloud (IaaS, PaaS, SaaS) infrastructure patterns and platform integrations across physical colocations and hyperscalers (AWS and Azure)
Build and mentor a high-performing SRE team, fostering a culture of trust, collaboration, and continuous improvement
Partner with cross-functional leaders in infrastructure, DevOps, and application development to scale reliability practices across the enterprise
Oversee incident response, root cause analysis, and postmortems with a focus on accountability and learning
Establish and enforce service-level objectives (SLOs), service-level indicators (SLIs), and service-level agreements (SLAs)
Drive proactive monitoring, alerting, observability, and capacity planning
Lead automation initiatives for deployment, scaling, failover, and recovery
Promote observability practices using tools like Prometheus, Grafana, Datadog, or Splunk
Collaborate with development teams to build self-healing, fault-tolerant systems
Champion reliability-first thinking across engineering and operations
Encourage blameless postmortems and a learning-oriented incident culture
Ensure compliance with security, risk, and regulatory requirements
Serve as direct supervisor and mentor to direct reports
Provide day-to-day supervision of direct reports, ensure compliance with assigned work hours, and monitor for compliance with all firm and department policies. Manage staffing coverage, and review and process time logs and time-off requests
Support business professional development and continued educational opportunities
In collaboration with immediate supervisor and CN HR, participate in hiring, performance appraisals, counseling, termination and other employee lifecycle events
All other duties as assigned or required
Skills and experience:
Required:
After orientation at Cooley LLP, exhibit proficiency in the Microsoft Office suite, iManage and other firm applications
Ability to work extended and/or weekend hours, as required
Ability to travel, as required
7+ years' directly applicable experience (e.g., DevOps or Site Reliability Engineering) with 2+ years of exempt/management experience in relevant roles
Experience managing cross-functional projects and SRE planning and programming
Proficiency in Terraform and programming languages such as Python, Go, or Java
Deep expertise in cloud platforms, particularly AWS, and container orchestration
Strong background in distributed systems, performance tuning, and automation
Hands-on experience with configuration management tools such as Puppet, Chef, or Salt
Preferred:
Bachelor's Degree in Computer Science, Information Technology, Engineering, or associated discipline
Experience working with advanced ETL data workflows including technologies such as AWS EMR, Azure Synapse, Azure Data Factory, or Apache Hive/Spark/Airflow
Experience with IaC deployment of AKS/EKS/GKE architecture
Experience with enterprise Data Lake environments using technologies such as Databricks or Snowflake
Competencies:
Expert analytical/quantitative, problem-solving, and deductive reasoning skills, with demonstrated experience performing advanced troubleshooting and root cause analysis of complex technical issues
Excellent organizational, planning, and time management skills and ability to work either independently or in a team environment to manage competing priorities and meet deadlines
Advanced verbal and written communication skills with the ability to present findings, conclusions, alternatives, and information clearly and concisely
Experience working with all levels of business professionals, management, stakeholders, and vendors with demonstrated ability to build effective relationships through trust and diplomacy
Cooley offers a competitive compensation and excellent benefits package and is committed to fair and equitable employment practices.
EOE.
The expected annual pay range for this position with a full-time schedule is $165,000 - $235,000. Please note that the final offer amount will depend on geographic location, applicable experience and skillset of the candidate.
We offer a full range of elective benefits including medical, health savings account (with applicable medical plan), dental, vision, health and/or dependent care flexible spending accounts, pre-tax commuter benefits, life insurance, AD&D, long-term care coverage, backup care for children and/or adults and other parental support benefits. In addition to elective benefit options, benefited employees receive firm-paid life insurance, AD&D, LTD, short-term medical benefits as well as 21 days of Paid Time Off ("PTO") and 10 paid holidays each year. We provide generous parental leave and fertility benefits. New employees will attend a detailed benefit orientation to learn more about our many benefits and resources.