Lead DevOps Engineer - Shape the Future of Our Platform
Chicago - On-site
What You'll Do:
Architect and Scale: Design and implement a forward-thinking infrastructure strategy that seamlessly scales to meet the evolving demands of our web services.
Cloud Mastery: Architect, manage, and maintain our cloud infrastructure (primarily AWS, including services like Lambda, EC2, EKS, DynamoDB, and Aurora) across all environments, with a laser focus on reliability, availability, and scalability.
Empower Engineering: Collaborate closely with software engineering teams to understand their needs, enabling rapid iteration, efficient testing, and seamless deployments.
Automate Everything: Develop and refine automation tools and workflows to streamline deployment and operational processes, boosting efficiency and reducing manual effort.
Champion Reliability (SRE): Implement and evangelize Site Reliability Engineering (SRE) principles, defining and tracking SLAs, SLOs, and SLIs to ensure optimal service performance and reliability.
Drive Observability: Define and maintain comprehensive observability standards through robust monitoring, alerting, logging, and distributed tracing systems.
Build Developer-Friendly Tools: Create intuitive internal platforms and tooling that give developers more automation, higher productivity, and greater confidence in their deployments.
Enable Innovation: Ensure our infrastructure and tools can adapt to and support new technologies and features as we grow.
Secure and Compliant: Develop and implement robust security and IT standards to effectively manage risks and ensure adherence to company policies.
Lead and Resolve: Take ownership of incident response efforts, driving thorough blameless postmortems and implementing systemic fixes to prevent recurrence.
Translate Vision into Reality: Turn business requirements into scalable, resilient, and secure technical solutions, while maintaining robust high availability and disaster recovery capabilities.
Build and Mentor: Lead and grow a team of talented DevOps professionals, scaling the organization's reliability operations in line with our expanding needs.
What You'll Bring:
Minimum of 5 years of hands-on experience in infrastructure roles, with significant expertise in AWS cloud environments (Lambda, EC2, EKS, DynamoDB, Aurora, etc.).
Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.
Proven experience managing high-scale, high-throughput environments built on distributed microservices architectures.
Deep proficiency in Infrastructure as Code (IaC) using tools like Terraform.
Expertise in networking, virtualization, and container orchestration technologies, particularly Docker and Kubernetes.
Solid understanding of CI/CD tools and modern observability solutions.
Exceptional problem-solving, strategic thinking, and communication skills, coupled with a strong passion for system design and operational excellence.