Setting up Checks

To set up a check rule, follow these steps:

Define the query: Build a query to count events, assess metrics, and group by one or more dimensions. When in doubt, switch to raw PromQL to query all collected telemetry.
Set alert conditions: Specify degraded and critical thresholds, select evaluation time frames, and configure additional enablement conditions.
Configure notifications and automations: Customize a notification summary and detailed description using variables. Decide on notification delivery methods for your teams (e.g., email, Slack, or webhook).

Use the query builder to define what you want to check for based on trace, log, or metric data, or switch to raw PromQL to access all telemetry in a unified way and unlock the most advanced use cases. The query definition of the check rule does more than specify the metrics you are alerted on: it also shapes your product experience and determines how often the system notifies you.

Note: The check rule triggers on each distinct time series.

For example:

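A query that sums requests for the frontend service as a whole triggers the rule once for the frontend service.

A query that also groups by operation triggers the rule separately for each operation of the frontend service (note the operation name added to the sum aggregation).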
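The difference between the two cases can be illustrated with a small Python sketch. It does not use Dash0's query engine; the span records, field names, and the `count_by` helper are hypothetical and only show how adding a grouping dimension turns one time series into several, each of which the rule evaluates on its own.

```python
from collections import defaultdict

# Hypothetical span records; the field names are illustrative, not Dash0's schema.
spans = [
    {"service": "frontend", "operation": "GET /cart"},
    {"service": "frontend", "operation": "GET /cart"},
    {"service": "frontend", "operation": "POST /checkout"},
    {"service": "frontend", "operation": "GET /products"},
]


def count_by(records, group_keys):
    """Count records per distinct combination of group_keys.

    Each distinct key combination stands for one time series.
    """
    series = defaultdict(int)
    for record in records:
        key = tuple(record[k] for k in group_keys)
        series[key] += 1
    return dict(series)


by_service = count_by(spans, ["service"])
by_operation = count_by(spans, ["service", "operation"])

# Grouping by service alone yields a single series for "frontend".
print(len(by_service))
# Adding the operation name yields one series per operation.
print(len(by_operation))
```

Every extra grouping dimension multiplies the number of series, and therefore the number of separate notifications the rule can produce.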


Query Templates

Dash0’s templates build on the popular "Awesome Prometheus alerts" collection to offer ready-to-use checks crafted by the community and trusted by many. With these templates, you can:

Skip the hassle of building monitors from the ground up.
Achieve robust, end-to-end monitoring across all your critical integrations effortlessly.

Browse Templates

While you explore the Prometheus alerts, Dash0 indicates whether the respective metrics are available in Dash0.

If they are not collected yet, Dash0 points you to the matching integration so you can start gathering the necessary metrics for that technology.

Thresholds

Dash0 supports two severities: degraded and critical. A check appears, resolves, and increases or decreases in severity according to the configured grace periods.
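As a rough illustration of how two thresholds map a value to a severity, here is a minimal Python sketch. It assumes that higher values are worse, that a value must strictly exceed a threshold, and that the critical threshold is at least as high as the degraded one. The `classify` function and the "healthy" label are hypothetical names for this example, not Dash0 terminology, and grace periods are ignored here.

```python
def classify(value, degraded_threshold, critical_threshold):
    """Return the severity for one evaluated value.

    Assumption: higher is worse, and
    critical_threshold >= degraded_threshold.
    """
    if value > critical_threshold:
        return "critical"
    if value > degraded_threshold:
        return "degraded"
    return "healthy"  # hypothetical label for "no threshold exceeded"


# Example: error percentage with thresholds of 5 (degraded) and 20 (critical).
for error_pct in (2, 8, 35):
    print(error_pct, classify(error_pct, 5, 20))
```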

Enablement Conditions

Enablement conditions let you restrict when the check rule is evaluated. As with the check query, you can use any existing telemetry for the conditions. For example, you can use enablement conditions to design check rules that only trigger when your system has relevant activity.
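The idea can be sketched in Python as a guard in front of the threshold comparison. This is an assumption-laden illustration, not Dash0's implementation: the `evaluate` function, its parameters, and the choice of request count as the activity signal are all hypothetical.

```python
def evaluate(check_value, threshold, activity_value, min_activity):
    """Evaluate a check only when the enablement condition holds.

    Returns None when the rule is skipped, otherwise True or False
    depending on whether the threshold is exceeded.
    """
    if activity_value < min_activity:
        # Enablement condition not met: the rule is not evaluated at all.
        return None
    return check_value > threshold


# An error rate of 50% on 2 requests is skipped; on 500 requests it triggers.
print(evaluate(check_value=0.5, threshold=0.1, activity_value=2, min_activity=100))
print(evaluate(check_value=0.5, threshold=0.1, activity_value=500, min_activity=100))
```

Guarding on activity in this way avoids alerts driven by a handful of requests during quiet periods.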

Evaluation Frequency

The evaluation frequency determines how often Dash0 runs the check query. Typically, this frequency is set to 1 minute, meaning that each minute, the check assesses the chosen time series and compares the aggregated result to the defined thresholds.

Supported frequencies are: 1 minute, 5 minutes, 10 minutes.

Grace Periods

Two grace periods can be configured:

The first determines how long a query must continuously exceed the defined threshold before the check triggers.

The second sets how long a check remains in a degraded or critical state after the query no longer meets the threshold condition.

Supported grace periods are: 0s, 10s, 30s, 1 minute, 2 minutes, 5 minutes, 10 minutes.
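The interplay of the two grace periods can be sketched as a small state machine in Python. This is a simplified model under stated assumptions: a single threshold instead of two severities, one evaluated value per evaluation interval, and elapsed time counted from the first evaluation that crosses (or stops crossing) the threshold. The `simulate` function and its parameter names are hypothetical and do not describe Dash0's internal behavior.

```python
def simulate(values, threshold, interval_s, trigger_grace_s, resolve_grace_s):
    """Return whether the check is firing after each evaluation.

    values: one evaluated query result per evaluation interval.
    """
    firing = False
    exceeding_for = None  # seconds the value has continuously exceeded the threshold
    clear_for = None      # seconds the value has continuously stayed at or below it
    states = []
    for value in values:
        if value > threshold:
            clear_for = None
            exceeding_for = 0 if exceeding_for is None else exceeding_for + interval_s
            if not firing and exceeding_for >= trigger_grace_s:
                firing = True
        else:
            exceeding_for = None
            clear_for = 0 if clear_for is None else clear_for + interval_s
            if firing and clear_for >= resolve_grace_s:
                firing = False
        states.append(firing)
    return states


# Evaluated every 60s; trigger after 2 minutes, resolve after 1 minute.
states = simulate([1, 9, 9, 9, 1, 1, 1], threshold=5, interval_s=60,
                  trigger_grace_s=120, resolve_grace_s=60)
print(states)
```

In this model a grace period of 0s makes the check trigger or resolve on the first evaluation that crosses the threshold, while longer grace periods suppress short spikes and reduce flapping.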

Configure Checks