SLAs in a Cloud World: Getting What You Pay For

No service has 100% uptime. Systems need to be maintained, updates take time, and there will be periods when something, somewhere, has gone wrong. We know this, and we think we accept it, but do we?

Many services are expected to come with some form of Service Level Agreement (SLA), and defining one for the services you provide or consume is a challenge for every organization, no matter how large. Google offers an SLA on its Cloud Compute platform. It’s complicated, but you can get refunds on your monthly spend when downtime exceeds roughly 4 minutes in a month, which corresponds to an uptime of 99.99%.
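To see where the "about 4 minutes" figure comes from, here is a minimal sketch of the arithmetic. It assumes a 30-day month by default; the function name and parameters are ours for illustration and are not taken from any SLA document.

```python
def allowed_downtime_minutes(uptime_percent: float, days_in_month: int = 30) -> float:
    """Return the downtime budget, in minutes, implied by an uptime target.

    Assumption: the month length is passed in as whole days (30 by default).
    """
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


# A 99.99% target over a 30-day month leaves a budget of a little over 4 minutes.
print(round(allowed_downtime_minutes(99.99), 2))
```

A real SLA may define the measurement period and what counts as downtime differently, so treat this only as a back-of-the-envelope check.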

That seems fair, right?

Well, the fine print goes on to put the onus on the user. To get your credit, you have to notify Google and provide logs showing that there was a problem and how long it lasted.

It also clearly states that it applies to the backend instances. So what if the problem is elsewhere in the delivery chain? I only raise this because we’re into Day 2 of a problem with the Google Load Balancer, which is affecting about 1.5% of all the calls we’re making through the Google network.

But it’s more complicated than that. The problem is affecting something like 10% of the calls made from locations in India and Singapore, while everything works just fine pretty much everywhere else.

So, how do you measure what matters? And, when there’s actual money on the line, how do you prove what you need to know?
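One possible starting point is to break the failure rate down by calling location, since a small overall number can hide a much larger regional one. The sketch below is only an illustration: the `failure_rates` helper, the record format, and the sample numbers are all hypothetical and are not real measurements or the APImetrics API.

```python
from collections import defaultdict


def failure_rates(calls):
    """Compute overall and per-region failure rates.

    `calls` is an iterable of (region, succeeded) pairs, a hypothetical
    record format chosen for this example.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for region, succeeded in calls:
        totals[region] += 1
        if not succeeded:
            failures[region] += 1

    total_calls = sum(totals.values())
    overall = sum(failures.values()) / total_calls if total_calls else 0.0
    per_region = {region: failures[region] / totals[region] for region in totals}
    return overall, per_region


# Made-up sample data: failures concentrated in two regions only.
sample = (
    [("india", False)] * 10 + [("india", True)] * 90
    + [("singapore", False)] * 10 + [("singapore", True)] * 90
    + [("us", True)] * 600
    + [("europe", True)] * 600
)

overall, per_region = failure_rates(sample)
print(f"overall: {overall:.2%}")
for region, rate in sorted(per_region.items()):
    print(f"{region}: {rate:.2%}")
```

With data shaped like this, the overall rate looks small while two regions are failing one call in ten, which is the kind of evidence you would want in hand before filing an SLA claim.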

Photo courtesy of Du Truong

Contact APImetrics today about how our active monitoring solutions can help you fulfill your SLAs.
