Notifications

Notifications are essential for keeping your team informed and aiding troubleshooting. Be sure to add them in the "Notifications" section when setting up your checks.

The following notification channels are currently available: Email, Slack, Webhook, incident.io (Beta), BetterStack (Beta), and Alertmanager (Beta).

Notification Details

  • Summary: A brief, readable summary of the issue, like "High CPU usage detected on server-1."
  • Description: Detailed information about the issue, such as "CPU usage has exceeded 90% for more than 5 minutes on instance server-1."

Configure Notification Channels

When setting up checks, you can configure notification channels to receive alerts if a check fails. If you already have notification channels, select one from the list. Alternatively, create a new notification channel tailored to your needs.

Email

  • Details Required:
    • Name: Specify the name for this notification channel.
    • Email Address: Provide the email address where notifications should be sent.
  • Purpose: Receive email notifications whenever a check fails.

Webhook

  • Details Required:
    • Name: Specify the name for this webhook notification.
    • URL: Provide the webhook URL where notifications will be sent.
    • Additional HTTP Headers (Optional): Add headers to authenticate the request or include extra information for downstream integration.
  • Purpose: Use this channel to send notifications to a generic webhook, allowing integration with various systems.
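
To illustrate the optional HTTP headers, the receiving service could check a shared secret before accepting a notification. The following is a minimal sketch, assuming the channel was configured with an additional header of the form "Authorization: Bearer <token>". The token value and the function name are hypothetical and not part of the product.

```python
import hmac

# Hypothetical shared secret, assumed to be configured on the webhook channel
# as the additional HTTP header "Authorization: Bearer <token>".
EXPECTED_TOKEN = "example-secret-token"


def is_authorized(headers: dict[str, str]) -> bool:
    """Return True if the request headers carry the expected bearer token."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    scheme, _, token = normalized.get("authorization", "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking the token through timing differences.
    return hmac.compare_digest(
        token.strip().encode("utf-8"), EXPECTED_TOKEN.encode("utf-8")
    )
```

A receiver would call is_authorized with the incoming request headers and reject the request (for example with HTTP 401) when it returns False.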

Example Payload

json
{
  "type": "alert.ongoing",
  "data": {
    "issue": {
      "id": "a53b18cd-2f45-4896-90f5-2b6c3e9b0479",
      "issueIdentifier": "742195867438912365",
      "dataset": "production",
      "start": "2024-10-31T11:23:05.123456789Z",
      "summary": "High CPU Usage Detected",
      "description": "CPU usage on server cluster 'prod-cluster-01' has exceeded the threshold, indicating potential resource bottleneck.",
      "labels": [
        {
          "key": "environment",
          "value": {
            "stringValue": "production"
          }
        },
        {
          "key": "region",
          "value": {
            "stringValue": "us-west-2"
          }
        }
      ],
      "annotations": [
        {
          "key": "impacted_service",
          "value": {
            "stringValue": "web-server"
          }
        }
      ],
      "checkrules": [
        {
          "id": "d9e4a500-1cfc-4fa7-a84c-f7b2b2d97c42",
          "version": 5,
          "name": "High CPU Usage Check",
          "expression": "avg({host_cpu_usage > $__threshold})",
          "thresholds": {
            "degraded": 0.75,
            "failed": 0.9
          },
          "interval": "5m0s",
          "for": "2m0s",
          "keepFiringFor": "10m0s",
          "summary": "Alert on high CPU usage across production servers",
          "description": "Triggered when average CPU usage exceeds specified thresholds over 5 minutes.",
          "labels": {
            "severity": "critical"
          },
          "annotations": {
            "team": "SRE"
          },
          "modes": [
            "performance",
            "threshold",
            "alert"
          ],
          "url": "<LINK_TO_CHECK_RULE>"
        }
      ],
      "url": "<LINK_TO_FAILED_CHECK>"
    }
  }
}
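
A receiving service will usually want only a few fields from this payload. The following is a minimal sketch of such parsing, assuming incoming notifications follow the shape of the example above. The function name is hypothetical, and value types other than "stringValue" are not covered because the example does not show them.

```python
import json


def summarize_notification(body: str) -> dict:
    """Extract a few fields from a webhook notification body."""
    payload = json.loads(body)
    issue = payload["data"]["issue"]
    # Issue labels arrive as a list of
    # {"key": ..., "value": {"stringValue": ...}} objects;
    # flatten them into a plain dict for easier lookups.
    labels = {
        item["key"]: item["value"].get("stringValue")
        for item in issue.get("labels", [])
    }
    return {
        "type": payload["type"],
        "summary": issue["summary"],
        "dataset": issue["dataset"],
        "labels": labels,
        "check_rules": [rule["name"] for rule in issue.get("checkrules", [])],
    }
```

Applied to the example payload, the result would contain the type "alert.ongoing", the summary, the dataset, a flat label dict with "environment" and "region", and the check rule name.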

Slack

  • Details Required:
    • Name: Specify the name for this Slack notification.
    • Webhook URL: Provide the Slack-specific webhook URL.
    • Slack Channel: Indicate the Slack channel where notifications should be sent.
  • Purpose: Receive notifications directly within a Slack channel for convenient, real-time alerts.
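
Before entering the webhook URL, it can help to confirm that it accepts messages. The following is a minimal sketch that builds a test request for a Slack incoming webhook, which accepts a JSON body with a "text" field. The URL shown is a placeholder and the function name is hypothetical. This is not the message format the channel itself sends.

```python
import json
import urllib.request


def build_slack_test_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; replace it with the webhook URL generated in Slack.
request = build_slack_test_request(
    "https://hooks.slack.com/services/T000/B000/XXXX",
    "Test message for the notification channel",
)
# To actually send the message:
# urllib.request.urlopen(request, timeout=10)
```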

incident.io (Beta)

  • Details Required:
  • Purpose: Receive notifications in incident.io for incident management.

Example Payload

json
{
  "title": "CRITICAL — Log errors increased for my-service-name",
  "deduplication_key": "issue-3383dc28-c8a2-42d6-91b5-8efe7abfacbd",
  "status": "firing",
  "description": "Log errors increased for my-service-name",
  "source_url": "https://app.dash0.com/alerting/failed-checks?org={your-org-name}&s={failed_check_id}",
  "metadata": {
    "dash0.resource.name": "my-service-name",
    "dash0.resource.type": "service",
    "priority": "p3",
    "service.name": "my-service-name",
    "team": "backend"
  }
}
info

For incident.io, labels are added directly under "metadata", for example "team": "backend".

BetterStack (Beta)

Receive alerts in BetterStack via its Prometheus integration.

  • Configuration Steps:
    1. In your Uptime Dashboard on BetterStack
      1. Click on Integrations in the left panel
      2. Click on the Importing data tab
      3. Scroll down to the Infrastructure monitoring section
      4. Click on the Add button of the Prometheus integration
    2. In your Dash0 organization
      1. Click on your organization logo and open the organization Settings via the gear icon
      2. Click on the Notification Channels tab
      3. Click on the New notification channel button and select BetterStack
      4. Enter the following required details
        1. URL: Provide the destination URL for alert events
          1. Endpoint: https://uptime.betterstack.com/api/v1/prometheus/webhook/{your_id}
          2. Replace {your_id} with your ID, or copy the entire URL directly from BetterStack
      5. Click on the Send test notification button to verify your configuration's correctness
      6. Once your configuration is complete, click on Save
      7. You are now ready to use your new BetterStack Notification Channel as part of an existing or new Notification Rule

info

This integration uses the Prometheus Alertmanager Webhook API.

Alertmanager (Beta)

Receive alerts via an external Prometheus Alertmanager.

  • Configuration Steps:
    1. Click on your organization logo and open the organization Settings via the gear icon
    2. Click on the Notification Channels tab
    3. Click on the New notification channel button and select Alertmanager
    4. Enter the following required details
      1. URL: Provide the URL of your Alertmanager's alerts endpoint (/api/v2/alerts)
    5. Click on the Send test notification button to verify your configuration's correctness
    6. Once your configuration is complete, click on Save
    7. You are now ready to use your new Alertmanager Notification Channel as part of an existing or new Notification Rule
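
To check independently that the endpoint is reachable, you could post a test alert to it yourself. The following is a minimal sketch of the request body that Alertmanager's v2 alerts endpoint accepts: a JSON array of alerts with labels, annotations, and an RFC 3339 start time. The helper names, the base URL, and the label values are hypothetical. The source does not specify the exact body that Dash0 sends.

```python
import json
from datetime import datetime, timezone


def alerts_endpoint(base_url: str) -> str:
    """Append the v2 alerts path to an Alertmanager base URL."""
    return base_url.rstrip("/") + "/api/v2/alerts"


def build_test_alerts(summary: str, labels: dict[str, str],
                      starts_at: datetime) -> str:
    """Serialize a single test alert as a JSON array for POST /api/v2/alerts."""
    alert = {
        "labels": dict(labels),
        "annotations": {"summary": summary},
        # isoformat() on a timezone-aware datetime yields an RFC 3339 timestamp.
        "startsAt": starts_at.isoformat(),
    }
    return json.dumps([alert])


# Hypothetical base URL and label values, for illustration only.
url = alerts_endpoint("http://alertmanager.example.internal:9093/")
body = build_test_alerts(
    "Test alert",
    {"alertname": "ConnectivityTest", "severity": "info"},
    datetime(2024, 10, 31, 11, 23, 5, tzinfo=timezone.utc),
)
```

The resulting body can then be sent with any HTTP client as a POST request with the Content-Type header set to application/json.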

Labels & Annotations

Labels

  • Purpose: Labels are key-value pairs that categorize and provide metadata for the alert. They define important aspects like the source of the alert, its severity, and contextual information that can help in filtering, routing, and silencing alerts when using Alertmanager.
  • Examples: Common labels might include:
    • severity: Defines the urgency level of the alert, such as critical, warning, or info.
    • alertname: A unique name for the alert rule, like HighCPUUsage or MemoryLeak.
    • instance: Identifies the instance where the alert originated, like server-1 or node-xyz.
    • service: Indicates the job or service name associated with the alert, like web-service or database-service.
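
To make label-based routing concrete, the following is a simplified sketch of selecting a receiver by exact label match. It only illustrates the idea: Alertmanager's own routing also supports regular-expression and negative matchers and nested routes. The function names, routes, and receiver names are hypothetical.

```python
def matches(labels: dict[str, str], matchers: dict[str, str]) -> bool:
    """Return True if every matcher equals the corresponding alert label."""
    return all(labels.get(key) == value for key, value in matchers.items())


def pick_receiver(labels: dict[str, str],
                  routes: list[tuple[dict[str, str], str]],
                  default: str = "default") -> str:
    """Return the receiver of the first route whose matchers all match."""
    for matchers, receiver in routes:
        if matches(labels, matchers):
            return receiver
    return default


# Hypothetical routes, evaluated in order.
routes = [
    ({"severity": "critical", "service": "database-service"}, "dba-pager"),
    ({"severity": "critical"}, "sre-pager"),
    ({"severity": "warning"}, "team-chat"),
]
```

With these routes, an alert labeled severity "critical" and service "web-service" would go to "sre-pager", while an alert with severity "info" would fall through to the default receiver.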

Annotations

  • Purpose: Annotations provide descriptive, human-readable information about the alert. They contain details that aid in understanding and troubleshooting the alert, often presented to the user in notification messages.
  • Examples: Common annotations might include:
    • message: Detailed information about the issue, such as "CPU usage has exceeded 90% for more than 5 minutes on instance server-1."
    • runbook_url: A link to documentation or a playbook on how to respond to the alert.
  • Note: The configurable Summary and Description fields are effectively annotations.

Last updated: January 9, 2025