Defining the key performance indicators (KPIs) for APIs being used is a critical part of understanding not just how they work but how well they can work and the impact they have on your services, users or partners. But what do you look at and track when measuring API KPIs?
Some API KPIs are ‘soft’ – your documentation, the formal definitions of how the API works, use metrics and adoption, and others are ‘hard’ – focusing on the underlying functionality of the service being measured.
This article will focus on the ‘hard’ API KPIs, and looking beyond the obvious ones that existing tools tools and solutions capture and at the critical numbers you need to focus on to understand how well an API works and the impact that it is having on the services you deliver, integrate to, or provide.
We will be focusing on measurable metrics for API KPIs:
- Call Latency – but not just total time, you should understand the networking impact as well as the various server components – some Open Banking regulations like those from Open Banking UK require a specific Time to First Byte received to be reported
- Total pass and error rates – the measured success rates in terms of HTTP or other codes
- Effective pass rates – the pass rate when you take performance outliers or ‘passing errors’ (like a HTTP 200 OK! hiding an error) into account
- The Cloud API Service Consistency score which takes into account the consistency of service
Multiple ways to measure APIs and the impact they may have on your services, customers and critical interdependence. But outside of what you can measure with your current products.
Measured Pass Rate
The pass rate is the actual measured success rate for an API call from a specific location.
But APIs can pass at different rates from different locations. So don’t just assume that HTTP-200 means ‘All OK’. Just because your API gateway or APM stack logs show 200 codes doesn’t mean that everything is working well – or even at all.
Image you’re in a restaurant where they only measure whether or not you get a drink but don’t actually care if you get the drink you ordered or whether or not it has ice in it? That can be the difference between being told an API call passed and having a dispute with a supplier, customer or regulator over what you actually delivered.
Effective Pass Rate
It’s entirely possible to have two APIs doing the same thing, but with wildly different effective pass rates.
Even with a 100% pass rate, there may be events and performance issues that cause timeouts and other problems. You need to take into account the performance, including latencies and items that may affect end users.
So thinking back to our restaurant, they might have a time goal where they still don’t care if there’s even a drink in the glass as long as the glass is delivered on time, or, where they check the average time, so while your drink took over an hour, on average,most drinks were served in under 5 minutes.
Location Impact on KPIs
Latency is complex for APIs. Using a common ‘ping’ tool won’t tell you what you need to know. API calls include multiple steps:
- Connect Time
- DNS Look Up
- Server Side Processing Time
- Internet Travel Time
- Total Call Time
And each of these vary by geography and even cloud service provider. What works between AWS servers might not work so well if your partner is using Azure or Google. Understanding these issues can save pain and complaints and brand impact later.
A slow API might not be a problem. But an API that is slow only sometimes might well be. Systems measuring average performance can miss significant performance issues lasting many hours.
So it’s important not just to think about the averages here, but also the percentiles, that is the calls that fall into a certain group – like the worst 5% or 1% of calls.
That might not sound like a lot but consider an example. You serve 1,000,000 API calls a day and have an average latency of 500ms, that’s good? Right? But consider this, your 95th percentile, that’s the slowest 5% take over 2 seconds, and your 99th percentile is over 10 seconds. That’s 10,000 calls a day taking over 10 seconds… is that any good?
Learn more about API KPIs
Why not see how APImetrics measures performance and SLAS or check out some of our other articles.