Enhancing GitHub Actions Observability with OpenTelemetry Tracing

In continuous integration and delivery (CI/CD), understanding the performance and behavior of your workflows is essential. GitHub Actions has become a popular choice for automating software workflows, but monitoring and troubleshooting these pipelines can be challenging. Enter OpenTelemetry tracing, a technology that can provide deep insight into your GitHub Actions workflows.

Why Use OpenTelemetry Tracing for GitHub Actions?

OpenTelemetry tracing offers several benefits when applied to GitHub Actions:

  1. End-to-end visibility: Trace the entire lifecycle of your workflows, from trigger to completion.
  2. Performance optimization: Identify bottlenecks and slow-running steps in your pipelines.
  3. Error detection: Quickly pinpoint where and why failures occur in your workflows.
  4. Dependency analysis: Understand how different jobs and steps interact within your workflows.

Implementing OpenTelemetry Tracing in GitHub Actions

Implementing OpenTelemetry tracing for your GitHub Actions workflows is surprisingly simple. You can achieve this with a single workflow file that utilizes the corentinmusard/otel-cicd-action action.

To set it up, create a new workflow file in your repository’s GitHub Actions workflow directory .github/workflows/ with the following content:

yaml
.github/workflows/otel-traces.yaml
name: Export OpenTelemetry Trace for CI
on:
  workflow_run:
    workflows:
      - CI-CD
    types: [completed]
jobs:
  otel-export-trace:
    name: OpenTelemetry Export Trace
    runs-on: ubuntu-latest
    steps:
      - name: Export Workflow Trace
        uses: corentinmusard/otel-cicd-action@v1
        with:
          otlpEndpoint: ${{ secrets.DASH0_OTLP_ENDPOINT }}
          otlpHeaders: ${{ secrets.DASH0_OTLP_HEADERS }}
          githubToken: ${{ secrets.GITHUB_TOKEN }}
          runId: ${{ github.event.workflow_run.id }}

The workflow expects two repository secrets that describe where and how to export the workflow telemetry (example values shown):

  • DASH0_OTLP_ENDPOINT: grpc://ingress.eu-west-1.aws.dash0.com:4317
  • DASH0_OTLP_HEADERS: Authorization: Bearer auth_XXXXXXXXXXXXXXXXXXXXXXXX

Understanding the Configuration

Let's break down the key components of this workflow:

Trigger

yaml
on:
  workflow_run:
    workflows:
      - CI-CD
    types: [completed]

This workflow is triggered when the specified workflows (here, CI-CD) complete their execution, so tracing data is collected only after they have finished.
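If more than one workflow should be traced, the same trigger can list several names. The following is a minimal sketch: `Release` is a hypothetical workflow name used only for illustration. Each entry has to match the `name:` of the workflow being observed, and GitHub only fires `workflow_run` for workflow files that are present on the default branch.

```yaml
on:
  workflow_run:
    workflows:
      - CI-CD
      - Release   # hypothetical second workflow, not part of the original setup
    types: [completed]
```

Because `runId` is taken from `github.event.workflow_run.id`, the export step needs no changes: each completed run of any listed workflow produces its own trace.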

Job Configuration

yaml
jobs:
  otel-export-trace:
    name: OpenTelemetry Export Trace
    runs-on: ubuntu-latest

A single job named "OpenTelemetry Export Trace" is defined, running on the latest Ubuntu runner.

Trace Export Step

yaml
steps:
  - name: Export Workflow Trace
    uses: corentinmusard/otel-cicd-action@v1
    with:
      otlpEndpoint: ${{ secrets.DASH0_OTLP_ENDPOINT }}
      otlpHeaders: ${{ secrets.DASH0_OTLP_HEADERS }}
      githubToken: ${{ secrets.GITHUB_TOKEN }}
      runId: ${{ github.event.workflow_run.id }}

This step uses the corentinmusard/otel-cicd-action to export workflow telemetry in the form of an OpenTelemetry trace. The action requires several inputs:

  • otlpEndpoint: The OpenTelemetry Protocol (OTLP) endpoint where the trace data will be sent.
  • otlpHeaders: Headers required for authentication with the OTLP endpoint.
  • githubToken: A GitHub token with appropriate permissions to access workflow data.
  • runId: The ID of the workflow run, used to identify which execution to trace.
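The "appropriate permissions" for the token can be declared explicitly on the job. The sketch below rests on an assumption: `actions: read` is the GitHub scope for reading workflow run data, but the exact set of scopes this action needs is not stated here, so check the action's README before relying on it.

```yaml
jobs:
  otel-export-trace:
    name: OpenTelemetry Export Trace
    runs-on: ubuntu-latest
    # Assumption: read access to Actions data is sufficient for the export;
    # the action may require additional read scopes.
    permissions:
      actions: read
    steps:
      - name: Export Workflow Trace
        uses: corentinmusard/otel-cicd-action@v1
        with:
          otlpEndpoint: ${{ secrets.DASH0_OTLP_ENDPOINT }}
          otlpHeaders: ${{ secrets.DASH0_OTLP_HEADERS }}
          githubToken: ${{ secrets.GITHUB_TOKEN }}
          runId: ${{ github.event.workflow_run.id }}
```

Note that once a `permissions` block is present, every scope not listed in it is set to none, so any further scope the action needs has to be added explicitly.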

Benefits of This Approach

  • Simplicity: With just one workflow file, you can start collecting tracing data for your GitHub Actions.
  • Flexibility: The action can be easily configured to work with different OTLP endpoints and authentication methods.
  • Non-intrusive: This tracing method doesn't require modifications to your existing workflows.
  • Comprehensive: It captures data for entire workflow runs, providing a complete picture of your CI/CD process.

By leveraging OpenTelemetry tracing in your GitHub Actions, you're taking a significant step towards more observable, efficient, and reliable continuous integration and delivery processes.

Using Dash0 for CI/CD OpenTelemetry data

Here are some screenshots of what GitHub Actions traces look like inside Dash0. You can find all GitHub Actions traces in the Tracing view. You can search either by service.name = CI-CD, which matches your GitHub Actions workflow name, or by service.namespace = <GITHUB REPOSITORY>.

You can then slice and dice through your data using Dash0’s product capabilities.

Dash0 Tracing view showing spans from GitHub actions. Red spans are failed GitHub action steps. This view shows start time, duration, github conclusion and github author name.

The Tracing view shows which build steps take the most time and where engineering effort to reduce CI build times would pay off most.

Shows details for a GitHub Actions trace with all its child spans, including each step's name and duration.

While writing this blog post, I discovered that the “Test Helm Charts” step was taking 2m 19s in total, which seemed too long. In the screenshot below you can see that most of the time was actually spent on the “Checkout” step: it was checking out the complete repository, including all branches and tags, which was not necessary.

Shows details for a GitHub action “Checkout” step that took almost 2min

Filtering by “dash0.span.name = Checkout” quickly revealed all places that might be misconfigured. Sometimes we need to check out all branches and tags, but for certain build steps that is not necessary.

Shows Tracing heat map with spans highlighted with duration around 2min
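For jobs that do not need the full history, the checkout can be kept shallow. The sketch below assumes the slow jobs use `actions/checkout` with `fetch-depth: 0`, the setting that fetches all history for all branches and tags. The original job configuration is not shown here, so treat this as an illustration rather than the exact fix that was applied.

```yaml
steps:
  # Shallow checkout: fetch only the commit that triggered the run.
  # fetch-depth: 1 is also the default of actions/checkout, so simply
  # removing an explicit "fetch-depth: 0" has the same effect.
  - name: Checkout
    uses: actions/checkout@v4
    with:
      fetch-depth: 1
```

Jobs that genuinely need tags or full history, for example to compute a version number from tags, would keep `fetch-depth: 0`.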

You can also create custom dashboards based on the GitHub action spans. These dashboards can help you identify where most of the time is spent.

Shows a dashboard that is based on GitHub action span metrics.

For your convenience, here are the PromQL queries used:

Top 20 - Average Span duration in minutes:

PromQL
topk(20,
  histogram_avg(
    sum by(service_namespace, service_name, otel_span_name) (
      rate({
        otel_metric_name="dash0.spans.duration",
        service_namespace="dash0hq/dash0"
      }[$__interval])
    )
  ) / 60
)

GitHub Action status:

PromQL
sum by (github_conclusion) (
  increase({
    otel_metric_name = "dash0.spans",
    service_namespace = "dash0hq/dash0"
  }[10m])
)

You can also easily build a dashboard that shows successful deployments to development and production environments, like the one below.

Shows a dashboard with deployment metrics derived from GitHub action spans. We see deployment numbers for development and production.

Summary

By implementing OpenTelemetry tracing in your GitHub Actions workflows, you can gain valuable insights into your CI/CD processes, leading to more efficient and reliable pipelines. This enhanced observability allows you to optimize performance, quickly identify and resolve issues, and better understand the interactions within your workflows. As the complexity of software development continues to grow, tools like OpenTelemetry tracing become increasingly crucial for maintaining agile and effective CI/CD practices. Embrace this powerful technology to take your GitHub Actions workflows to the next level of observability and performance.