The Key to Reliability
Behind every reliable software product is an engineering best practice: service level objectives (SLOs). Built on individual measurements called service level indicators (SLIs), SLOs give you the visibility to ensure your products and services meet your customers’ expectations. SLOs give technical users the clarity to plan a roadmap, and they give business stakeholders the insight to budget their reliability spend strategically. The result: happier customers and more efficient engineering, all with less damage to your bottom line.
An SLO is essentially a target for how consistently a given system should behave over time. Tracking SLO adherence depends on SLIs, which measure whether the SLO is being met. When a system or service misses its SLO, the consequences can include performance degradation, outages, and direct impact on customers. An SLO weighs the possible negative repercussions of unavailability to set a reasonable uptime goal for the service, and the downtime that the SLO allows is the error budget.
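To make the relationship between SLI, SLO, and error budget concrete, here is a minimal Python sketch. It assumes a request-based SLI (good requests divided by total requests) and a 30-day window; the function names and sample figures are illustrative only and do not come from any particular tool.

```python
def sli(good_events: int, total_events: int) -> float:
    """Fraction of events that met the reliability criterion (assumed request-based SLI)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Downtime, in minutes, that the SLO tolerates over the window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(good_events: int, total_events: int, slo_target: float) -> float:
    """Share of the error budget still unspent; negative once the SLO is missed."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Hypothetical service: 99.9% target, one million requests, 500 failures.
    print(f"SLI: {sli(999_500, 1_000_000):.4%}")
    print(f"Error budget: {error_budget_minutes(0.999):.1f} minutes per 30 days")
    print(f"Budget remaining: {budget_remaining(999_500, 1_000_000, 0.999):.0%}")
```

In this hypothetical case, a 99.9% target over 30 days leaves roughly 43 minutes of allowable downtime, and 500 failed requests out of a million use about half of the budget.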
SLOs To The Rescue
As an SLO approaches (but never reaches) 100%, the cost of reliability increases exponentially.
Your software releases slow down as you build expensive infrastructure far beyond user expectations.
Not all downtime is equal, so fine-tuning SLOs that correlate with business KPIs will give you a clearer picture of where to spend your time and energy – delivering new features or paying down technical debt. Not just nines for nines’ sake; find the right nines for the job.
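As a rough illustration of how quickly the margin shrinks with each extra nine, the sketch below prints the downtime that a 30-day window allows at several availability targets. The window length and the range of targets are assumed example values, and the helper name is hypothetical.

```python
def allowed_downtime_minutes(nines: int, window_days: int = 30) -> float:
    """Minutes of downtime allowed in the window for an availability of N nines."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailability = 10.0 ** -nines  # e.g. 3 nines -> 0.001
    return unavailability * window_days * 24 * 60


# Compare two through five nines over an assumed 30-day window.
for n in range(2, 6):
    target_percent = 100 * (1 - 10.0 ** -n)
    minutes = allowed_downtime_minutes(n)
    print(f"{target_percent:.3f}% -> {minutes:.2f} minutes per 30 days")
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why the right target depends on what users of a given service actually need.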
GitOps Ready SLOCtl and SLO Yaml
Setting Up A Prometheus SLO
Creating a Service Level Objective
For Site Reliability Engineers
Site Reliability Engineering is one of the hottest fields as companies look to build reliable systems and strengthen their online presence. As companies rush to adopt Site Reliability Engineering principles, Service Level Objectives (SLOs) are the most important place to begin. SLOs combine cultural philosophies, practices, and tools.
The SRE community needs a place to gather and focus on SLOs in depth. This virtual conference will cover topics at all levels, from an introduction to SLOs to their practical application. This conference is a community event made and led by Site Reliability Engineers and influencers who care about reliability and about becoming more customer-centric by adopting, measuring, and optimizing SLOs.
Stephen Elliott, Group Vice President - IDC
Alex Hidalgo on Reliability Reporting: Painting the Big Picture for SLOs
Chock-full of example scenarios you’ve probably experienced yourself, Chapter 17 of Implementing Service Level Objectives by our own Alex Hidalgo covers the ins and outs of SLO-based reliability reporting. Hidalgo walks you through how to make the best use of this approach, why other methods fall short, and how using SLOs can really make a difference for the people in your organization.