Combine multiple metrics into a single Stat in Grafana

Recently I tried to visualize the current state of a circuit breaker in Grafana as a Stat. The monitored system is based on Spring Boot and resilience4j, and gathers its metrics using Micrometer and InfluxDB.

When I tried to visualize all states in a single Stat, I at first only got one Stat per circuit breaker state. Combining them all into a single Stat wasn’t as easy as I expected.

The problem

Sadly, resilience4j doesn’t combine the states into one single metric. Instead, it emits one metric that is tagged with the state, so there is a separate series per state (see circuitbreaker-metrics). For every state/tag there will be a value of 0 or 1, meaning that the state is not active (= 0) or active (= 1) respectively.

Naively using the “group by” function with the tag “state” results in multiple Stats, each showing only the value of a single state:

Using the group by function leads to a Stat per tag. Not a desired outcome in this case.

Not my desired outcome.

A solution

So how can we combine these states into a single Stat, ideally showing the state as text (rather than a numeric value)?

Like this:

  1. Create a Query per state (and multiply the value by a factor to spread out the result set)
  2. Reduce the queries using the Transformation tab (by simply summing every value up)
  3. (Optionally) Add a Value Mapping and Threshold for every possible value in the result set

So let’s get a little bit into the details.

Create a Query per state

First we have to create a query per state (“closed”, “half_open”, “open”, “forced_open” and “disabled”) as shown in the image below.

A query for every circuit breaker state.

Note:

  • Every query’s value is multiplied by a different factor (which will be explained in the next step).
  • The “application_name” is a custom tag to distinguish different applications inside the InfluxDB—and can be ignored here.
  • The “name” tag corresponds to the resilience4j circuit breaker’s name given inside my application.
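Since the screenshot can’t be copied, here is a rough sketch of what the five queries might look like as InfluxQL text, generated with a few lines of Python. The measurement name, the “value” field, the use of last(), and the example tag values “backendA” and “my-service” are assumptions for illustration and not taken from my setup; check the names in your own database.

```python
# Factor per circuit breaker state, as used in this post.
STATE_FACTORS = {
    "closed": 1,
    "half_open": 2,
    "open": 4,
    "forced_open": 8,
    "disabled": 16,
}

# Assumption: the gauge is stored in this measurement with a "value" field.
MEASUREMENT = "resilience4j_circuitbreaker_state"


def build_query(state: str, factor: int, breaker: str, app: str) -> str:
    """Return one InfluxQL query for a single state, scaled by its factor."""
    return (
        f'SELECT last("value") * {factor} FROM "{MEASUREMENT}" '
        f"WHERE \"state\" = '{state}' AND \"name\" = '{breaker}' "
        f"AND \"application_name\" = '{app}' AND $timeFilter"
    )


# "backendA" and "my-service" are hypothetical tag values.
queries = [
    build_query(state, factor, breaker="backendA", app="my-service")
    for state, factor in STATE_FACTORS.items()
]

for query in queries:
    print(query)
```

In Grafana itself you would click these together in the query editor (or paste them in raw query mode); the script only makes the pattern of one query per state, each with its own factor, explicit.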

Reduce the queries using the Transformation tab

After creating the queries, we reduce all of them to a single new value by simply summing up the results of the individual queries. We name this new value “state” (as if we didn’t have enough ambiguity already :-). The “Time” field has to be excluded from the sum.

A “Reduce row” transformation combines every query into a single, new value.
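To make the reduction step more tangible, here is a minimal Python sketch of what summing one row does. The row contents and the timestamp are made-up example data, and reduce_row is a hypothetical helper, not a Grafana function.

```python
# Hypothetical result of the five queries at one point in time:
# each query returns its 0/1 gauge already multiplied by its factor.
row = {
    "Time": 1700000000000,
    "closed": 0,
    "half_open": 0,
    "open": 4,
    "forced_open": 0,
    "disabled": 0,
}


def reduce_row(row: dict, exclude=("Time",)) -> int:
    """Sum all fields of one row except the excluded ones."""
    return sum(value for field, value in row.items() if field not in exclude)


state = reduce_row(row)
print(state)  # 4
```

The sketch also hints at why “Time” has to be excluded: if the timestamp were part of the sum, the result would no longer match any of the expected state values.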

This is where multiplying every query by a factor comes into play.

If the circuit breaker is in a certain state, the corresponding metric changes to 1 (while all others remain 0). Multiplying each query by its own factor therefore gives a one-to-one mapping between the value “state” (obtained by summing the queries) and the circuit breaker’s state:

  • 1 = closed
  • 2 = half_open
  • 4 = open
  • 8 = forced_open
  • 16 = disabled

Noteworthy:

  • If “state” has a value of 0, no state was reported for the circuit breaker at all, which has to be a measurement failure.
  • If (for whatever reason) more than one circuit breaker state is measured with “1”, the sum of all states will result in a “state” which isn’t covered in our list above. E.g. if “closed” and “open” are both measured as active, then the sum will be 1 + 4 = 5, and 5 is obviously not a covered condition (and will be shown in Grafana’s visualization accordingly).
  • This is also the reason we chose powers of 2 for the factors. If we had chosen factors like [closed=1, half_open=2, open=3, forced_open=4, disabled=5], we wouldn’t be able to recognize an erroneous situation like this: “closed” and “half_open” both being active would sum up to 3, which looks exactly like “open”.
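The difference between the two factor schemes can be checked with a small Python sketch. The helper names encode, decode and active_states are made up for this illustration.

```python
POWER_OF_TWO = {"closed": 1, "half_open": 2, "open": 4, "forced_open": 8, "disabled": 16}
CONSECUTIVE = {"closed": 1, "half_open": 2, "open": 3, "forced_open": 4, "disabled": 5}


def encode(active, factors: dict) -> int:
    """Sum the factors of all states that currently report 1."""
    return sum(factors[state] for state in active)


def decode(total: int, factors: dict) -> str:
    """Map a summed value back to a state name, or flag it as invalid."""
    for state, factor in factors.items():
        if total == factor:
            return state
    return "invalid"


def active_states(total: int) -> list:
    """With power-of-two factors, each bit tells which state was active."""
    return [state for state, factor in POWER_OF_TWO.items() if total & factor]


# Faulty measurement: "closed" and "half_open" both report 1.
faulty = ["closed", "half_open"]

print(decode(encode(faulty, POWER_OF_TWO), POWER_OF_TWO))  # invalid (1 + 2 = 3)
print(decode(encode(faulty, CONSECUTIVE), CONSECUTIVE))    # open (1 + 2 = 3), misleading
print(active_states(5))                                    # ['closed', 'open']
```

With powers of 2, every combination of active states produces a distinct sum, so an invalid value can even be traced back to the states that caused it.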

Finally, an additional step is needed to reduce the states to a single Stat: select the transformation outcome “state” at “Value options / Fields” in the “Panel display options”.

The (new) value “state” selected in Grafana’s “Panel display options”.

Add a Value Mapping and Threshold for every possible value in the result set

By now we’ve reached our goal of having a single Stat that combines every circuit breaker state.

Let’s tweak the appearance by adding a value mapping for every possible value. This results in a nicer user experience because we now show the state as readable text instead of only a numeric value. Additionally, we set a Threshold to color every state other than “closed” in a heartwarming red.

A value mapping for every state, for a nicer user experience.
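As a rough model of what the value mappings and the threshold do, consider the following Python sketch. The text for 0, the concrete threshold steps and the colors are assumptions for illustration; the real settings are made in Grafana’s panel options, not in code.

```python
import math

# Display text per summed value; "no state" for 0 is a hypothetical label.
VALUE_MAPPINGS = {
    0: "no state",
    1: "closed",
    2: "half_open",
    4: "open",
    8: "forced_open",
    16: "disabled",
}

# Threshold steps as (lower bound, color): the last step whose lower bound
# is not above the value wins. Assumed setup: only "closed" (1) is green.
THRESHOLDS = [(-math.inf, "red"), (1, "green"), (2, "red")]


def display_text(value: int) -> str:
    """Return the mapped text, or the raw number if no mapping exists."""
    return VALUE_MAPPINGS.get(value, str(value))


def color_for(value: int) -> str:
    """Pick the color of the highest threshold step the value reaches."""
    color = THRESHOLDS[0][1]
    for lower_bound, step_color in THRESHOLDS:
        if value >= lower_bound:
            color = step_color
    return color


for value in (1, 4, 5):
    print(value, display_text(value), color_for(value))
```

An unmapped value such as 5 keeps its numeric form and still turns red, which makes an inconsistent measurement easy to spot on the dashboard.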

This will lead to the desired outcome if the Circuit Breaker is closed:

A closed Circuit Breaker.

And if the Circuit Breaker is open (all other states are displayed accordingly):

An open Circuit Breaker.

Rolf Engelhard

 
