Why is this needed:

When computing multiple alert rules in Grafana, the current logic evaluates all rules before determining whether to fire an alert based on their "and/or" relationships. For example, given four rules with the relationship:

- rule1: result `true`, operator `N/A` (the first rule doesn't need an operator)
- rule2: result `true`, operator `and`
- rule3: result `false`, operator `or`
- rule4: result `false`, operator `and`

The algorithm in `pkg/expr/classic/classic.go` then combines these results one operator at a time, from left to right.
This computation yields a result of `false` because:

- `true AND true` equals `true`
- `true OR false` equals `true`
- `true AND false` equals `false`
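
The steps above can be reproduced with a minimal Go sketch. This is not the code from `pkg/expr/classic/classic.go`; the `condition` type and the `evalLeftToRight` function are hypothetical names chosen for illustration, and the sketch only assumes that each result is folded into a running value in order.

```go
package main

import "fmt"

// condition is a hypothetical stand-in for one rule: its boolean result
// plus the operator that joins it to the running result.
type condition struct {
	result   bool
	operator string // "and" or "or"; ignored for the first condition
}

// evalLeftToRight folds the conditions strictly in order, with no
// operator precedence (illustrative only, not Grafana's implementation).
func evalLeftToRight(conds []condition) bool {
	if len(conds) == 0 {
		return false
	}
	acc := conds[0].result
	for _, c := range conds[1:] {
		if c.operator == "or" {
			acc = acc || c.result
		} else {
			acc = acc && c.result
		}
	}
	return acc
}

func main() {
	rules := []condition{
		{true, ""},     // rule1
		{true, "and"},  // rule2
		{false, "or"},  // rule3
		{false, "and"}, // rule4
	}
	fmt.Println(evalLeftToRight(rules)) // false
}
```

The final `and` is applied to the whole accumulated value rather than only to rule3, which is why the result is `false`.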
I expected the computation to follow the usual logical order, where AND binds more tightly than OR: `true && true || false && false`, resulting in `true`.

What would you like to be added:

To enhance the logic flow, we can implement a check that returns `true` if the accumulated result is already `true` when an `or` operation is encountered.

Introducing a new operator, perhaps named `lowerPrecedenceOr`, could facilitate this enhancement without causing breaking changes.

Who is this feature for?
This feature is essential for users implementing the Multiple Burn Rate Alerts approach from Google's The Site Reliability Workbook.

In production, alerts were missed in the following scenario, which highlights the need for this improvement:

Alert structure: `{short_term} OR {mid_term} OR {long_term}`

Example scenario:

- The `{short_term}` rule triggered as `{5_minutes} AND {1_hour}`, resulting in `true AND true`.
- The `{mid_term}` rule failed to trigger because of `{30_minutes} AND {6_hours}`, which evaluates to `true AND false`.
- The `{long_term}` rule also failed to trigger because of `{6_hours} AND {72_hours}`, resulting in `false AND false`.

The overall result was `false` and the alert did not fire. After the improvement, the alert should fire.

Improving this functionality is crucial for ensuring timely and accurate alerting in complex operational environments.
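
As a rough illustration of the requested behavior, the sketch below treats `or` as a lower-precedence operator and applies it to the burn-rate scenario described above. It is only a sketch under stated assumptions: `condition` and `evalWithPrecedence` are hypothetical names, not Grafana APIs, and the operator is modeled as a plain string rather than the proposed `lowerPrecedenceOr` value.

```go
package main

import "fmt"

// condition is a hypothetical stand-in for one rule: its boolean result
// plus the operator that joins it to the previous conditions.
type condition struct {
	result   bool
	operator string // "and" or "or"; ignored for the first condition
}

// evalWithPrecedence treats "or" as lower precedence than "and":
// conditions joined by "and" form a group, and the overall result is
// true as soon as one completed group is true (illustrative only).
func evalWithPrecedence(conds []condition) bool {
	if len(conds) == 0 {
		return false
	}
	group := conds[0].result
	for _, c := range conds[1:] {
		if c.operator == "or" {
			if group {
				return true // a true group before an "or" decides the result
			}
			group = c.result // start a new AND group
		} else {
			group = group && c.result
		}
	}
	return group
}

func main() {
	// {short_term} OR {mid_term} OR {long_term} from the scenario above.
	burnRate := []condition{
		{true, ""},     // {5_minutes}
		{true, "and"},  // {1_hour}
		{true, "or"},   // {30_minutes}
		{false, "and"}, // {6_hours}
		{false, "or"},  // {6_hours}
		{false, "and"}, // {72_hours}
	}
	fmt.Println(evalWithPrecedence(burnRate)) // true
}
```

Keeping the existing `or` semantics untouched and adding this behavior under a separate operator is what would let existing rules continue to evaluate as they do today.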