Join Team CARFAX as a Senior Site Reliability Engineer!
Isn't it time you bragged about where you work? At CARFAX, we do, every day. We pride ourselves on being mission-focused and on helping to grow a brand built on accuracy and integrity. We care deeply about our products and our customers. We’re more than just a company: We help millions of consumers make more informed decisions every day. We know that our teammates are our most valuable asset, and we value a balanced life while tackling challenging projects in a fast-paced environment. One last thing: Our four-day week continues in Summer 2025!
This role has an expectation of 3 days in the Columbia, MO office per week, subject to change based on future business needs.
What you'll be doing:
Support DevOps at CARFAX as an engineer in our observability practice.
Maintain the observability tool stack used by teams throughout CARFAX.
Work in a dynamic, agile team environment, helping keep CARFAX’s applications up and running.
Collaborate with engineering teams to design and build monitoring solutions.
Respond to major incidents. Help teams troubleshoot their products and restore service.
Collaborate closely with DevOps and engineering teams to implement observability best practices.
Reduce toil by creating observability automation that can be reused across our teams.
Continuously analyze and evaluate our systems, products, and processes for potential improvements.
What we're looking for:
Five or more years of experience with observability solutions.
Experience with the following:
Maintaining cloud infrastructure via IaC (Terraform preferred).
AWS EKS and monitoring solutions for K8s.
Prometheus and Grafana to collect and visualize metrics.
Platforms such as New Relic, Datadog, or Splunk to collect metric and event data.
Log management: experience operating and managing a large-scale ELK stack.
Monitoring and alerting: experience analyzing applications and infrastructure and determining the right type of monitoring and alerting for each.
Experience with our tech stack: AWS (EKS), Prometheus / Grafana, Terraform / Consul / Vault, NodeJS / GoLang, Java.
Strong believer in reducing toil for yourself and teammates.
Ability to troubleshoot complex systems and help resolve major incidents.
Strong communication skills for documenting best practices and guiding their implementation.
What’s in it for you:
Competitive compensation, benefits, and generous time-off policies
4-day summer work weeks and a winter holiday break
401(k)/DCPP matching
Annual bonus program
Casual, dog-friendly, and innovative office spaces
Don’t just take our word for it:
10X Virginia Business Best Places to Work
9X Washingtonian Great Places to Work
9X Washington Post Top Workplace
St. Louis Post-Dispatch Best Places to Work