Author: Kit Merker
Imagine this: you’re the CFO of a large company, and your job is to ensure the business’s financial health and long-term viability. You know it takes significant technology investments to stay competitive, so you’ve greenlit a massive budget (nine figures plus) to help modernize your business’s IT and customer digital experiences. You thought this was a smart move until you realized you would never be able to measure the results or get that money back!
Understanding return on investment (ROI) is part of basically every decision you make or approve. You can measure sales efficiency, cost of goods sold, and EBITDA. But measuring the return on the fundamental technology that supports your business seems impossible.
One of the key things technology is supposed to do is keep your systems available to support customers. Customers expect it to “just work,” or they might leave you for a more reliable competitor. However, you have yet to experience genuinely “perfect systems,” and the bill for keeping systems up and running just keeps growing, with no matching increase in customer happiness or satisfaction.
Is there a better way? The modern approach to measuring and managing this trade-off in resource investment and risk management is called service level objectives (SLOs). You’re undoubtedly familiar with SLAs (Service Level Agreements); how are SLOs different?
SLOs vs SLAs
An SLO is a reliability goal, expressed as the percentage of moments or events that are “good” relative to the total over a period of time. Unlike SLAs, these goals represent the ideal balance between being reliable enough to make customers happy and imperfect enough to stay profitable.
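To make that definition concrete, here is a minimal Python sketch of the arithmetic, assuming you already have counts of good and total events for the period. The names `evaluate_slo` and `SloReport`, and the sample numbers, are illustrative assumptions rather than part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    achieved: float       # fraction of events that were good (0.0 to 1.0)
    target: float         # the SLO itself, e.g. 0.999 for 99.9%
    budget_events: float  # number of bad events the target tolerates
    budget_used: float    # share of that allowance already spent (1.0 = all of it)
    met: bool             # True when achieved reliability is at or above target


def evaluate_slo(good_events: int, total_events: int, target: float) -> SloReport:
    """Compare the measured good/total ratio against an SLO target."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0.0 < target < 1.0:
        # A 100% target leaves no allowance for failure at all.
        raise ValueError("target must be strictly between 0 and 1")

    bad_events = total_events - good_events
    achieved = good_events / total_events
    budget_events = (1.0 - target) * total_events
    budget_used = bad_events / budget_events
    return SloReport(achieved, target, budget_events, budget_used, achieved >= target)


# Hypothetical month: 1,000,000 checkout attempts, 1,500 of them failed.
report = evaluate_slo(good_events=998_500, total_events=1_000_000, target=0.999)
print(f"achieved {report.achieved:.2%} against a target of {report.target:.1%}")
print(f"allowance used: {report.budget_used:.0%}, SLO met: {report.met}")
```

In this made-up month the 99.9% target tolerates roughly 1,000 failed attempts and 1,500 occurred, so the goal was missed and the allowance was overspent by about half.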
Finding the sweet spot between high performance and perfection is the precise function of an SLO: measuring your data to find that happy place between failure and success. Of course, with any amount of complexity comes risk. And without a certain amount of risk, there’s no room for reward. By striving for 100% uptime (or perfection), you kill the opportunity to get outsized margins. Think about this analogy: if you back a low-risk investment, you’re going to get a low return. To get a higher return on investment, you need to accept some risk. Technology investments work the same way, and SLOs model the risk/reward function of complex software systems.
Setting a baseline for expectations, creating clear penalties for underperformance, and clarifying dispute resolution are the fundamentals of a Service Level Agreement (SLA). Businesses have been using SLAs for decades as a kind of insurance policy for providers and customers alike. In reality, all they do is keep your product from finding the “edge of excellence” for your customer service. SLAs are content to live in a place of mediocrity. That’s not a place you want to send your business. But if you move beyond SLAs to SLOs, you’ll be investing money in delivering excellent service at a reasonably low cost.
By setting SLOs, you can focus on what matters to your bottom line and your investment limit. But why would you, a CFO, care about something you view as “code”? Throwing a bunch of math on a page and giving that to you won’t do it. However, if we present this data to you in an easy-to-digest format, you’ll immediately start seeing financial opportunities. Here’s an example:
Figure: U.S. Shoe Shoppers Purchasing on Black Friday: Sales Risk Analysis
You can see the economic impact of spending $100 versus $10,000 to operate at 99% versus 99.99%: an easy financial decision brought to you by SLOs.
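The comparison can be sketched as simple arithmetic. The Python below is a rough illustration, not the analysis behind the figure: it pairs the $100 spend with 99% and the $10,000 spend with 99.99% (an inferred pairing), assumes lost sales scale directly with unavailability, and uses made-up revenue figures. `Tier`, `total_cost`, and `break_even_revenue` are hypothetical names.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    availability: float    # fraction of sales attempts that succeed, e.g. 0.99
    operating_cost: float  # spend needed to run at that level, in dollars


def total_cost(tier: Tier, revenue_at_stake: float) -> float:
    """Operating cost plus the sales expected to be lost to unreliability."""
    expected_lost_sales = revenue_at_stake * (1.0 - tier.availability)
    return tier.operating_cost + expected_lost_sales


def break_even_revenue(cheaper: Tier, pricier: Tier) -> float:
    """Revenue at stake above which the pricier, more reliable tier pays off."""
    extra_cost = pricier.operating_cost - cheaper.operating_cost
    extra_availability = pricier.availability - cheaper.availability
    if extra_availability <= 0:
        raise ValueError("the pricier tier must be more available")
    return extra_cost / extra_availability


basic = Tier(availability=0.99, operating_cost=100.0)
premium = Tier(availability=0.9999, operating_cost=10_000.0)

# Hypothetical revenue figures, chosen only to show both sides of the trade-off.
for revenue in (500_000.0, 5_000_000.0):
    print(
        f"revenue at stake ${revenue:,.0f}: "
        f"99% costs ${total_cost(basic, revenue):,.0f} in total, "
        f"99.99% costs ${total_cost(premium, revenue):,.0f}"
    )
print(f"break-even revenue: ${break_even_revenue(basic, premium):,.0f}")
```

Under these assumptions the cheaper tier wins when about $500,000 is at stake, the more reliable tier wins at $5 million, and the two cross over near $1 million. Real numbers would come from your own sales and cost data.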
Where can I save money?
You’re most likely already aware of issues with your product. You can use this knowledge to create SLOs that pinpoint what those issues are and how much they matter. For example, in a marketing context, if the web-user-leads or leads-to-conversion metrics look good, you can focus your investments elsewhere because you know you’re building on a stable foundation.
Finding the “edge of excellence” should always be your goal when creating SLOs. In other words, an SLO will define the acceptable level of reliability for your site while maintaining a reasonable efficiency in the underlying costs to support it. You need to set your expectations (and your SLOs) at a high-but-not-perfect threshold. If you put them lower than this, you could lose customers, and if you set them higher, you’re probably spending too much money.
Let’s look at this scenario: a user’s free trial experience. If we create SLOs within this journey (arriving on your page after googling, clicking the trial button, leaving your site) and put them in terms of business impact, we get a clear picture of the bottom line when a step fails: goodbye, customer! From there, we can see which metrics most accurately track the user experience. By determining the length of the SLO (the time window it covers), we can start adjusting to meet your reliability goals consistently: hello, customer, and hello, profit.
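If the “length” of an SLO is read as the time window it is measured over, the window determines how much unreliability you can absorb before the goal is missed. The short Python sketch below assumes a time-based availability target; `allowed_downtime` and the sample windows are hypothetical and for illustration only.

```python
from datetime import timedelta


def allowed_downtime(target: float, window: timedelta) -> timedelta:
    """Downtime a time-based availability target tolerates over a window."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be greater than 0 and at most 1")
    # Whatever share of the window is not promised as "good" is the allowance.
    return window * (1.0 - target)


# Hypothetical targets for the trial-signup step, measured over two windows.
for target in (0.99, 0.999, 0.9999):
    for days in (7, 30):
        downtime = allowed_downtime(target, timedelta(days=days))
        minutes = downtime.total_seconds() / 60
        print(f"{target:.2%} over {days} days allows about {minutes:.1f} minutes of downtime")
```

A 99.9% target over 30 days, for example, leaves about 43 minutes in which the trial button can be broken. A 100% target leaves none, which is why chasing perfection gets expensive.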
The bottom line
As a CFO, your number one job is to keep an eye on the money, and to do this, you need SLOs. Monitoring tools are great, but they’re only a stepping stone in the climb towards achievable perfection – or near perfection. Throw out the SLA, adopt SLOs and watch as your costs stay down while customer happiness goes up.