
Exploring the OpenTelemetry Go Automatic Instrumentation powered by eBPF: A Deep Dive

This post explores OpenTelemetry Go Automatic Instrumentation with eBPF, offering zero-code tracing for Go applications. It highlights its benefits, challenges with Go's compiled nature, and security considerations.

Observability in Go applications, and especially distributed tracing, has long been a challenge due to the language’s compiled nature.

The OpenTelemetry Go Automatic Instrumentation project, powered by eBPF, aims to improve the situation by bringing automatic instrumentation to Go, similar to what is already available for Java, Python, JavaScript, and other runtime-based languages.

This post shares my experience with the OpenTelemetry Go Automatic Instrumentation, explores how eBPF enables it, evaluates security implications, and reviews alternative solutions.

Why Go is Different

Go’s direct compilation to machine code makes traditional auto-instrumentation approaches difficult. Unlike interpreted languages such as Python or JavaScript, Go lacks the dynamic features that enable straightforward runtime auto-instrumentation. Developers therefore have to integrate the OpenTelemetry libraries into their code by hand, which is tedious because the trace context must be propagated manually through context.Context objects across the entire application.

Understanding Go Auto-Instrumentation

The OpenTelemetry Go Auto-Instrumentation project introduces a way to capture tracing data without modifying application code. Instead of requiring developers to embed telemetry manually, it dynamically traces function calls at runtime.

Some of its key features include:

  • No Code Changes Required
    Works without modifying or recompiling Go applications.
  • Supports Key Libraries
    HTTP (net/http), gRPC, database/sql, and kafka-go are instrumented automatically.
  • Minimal performance impact
  • Environment-Based Configuration
    Uses OpenTelemetry standard environment variables.

For teams adopting OpenTelemetry, this simplifies observability without requiring developers to instrument their code manually.

eBPF: The Engine Behind Go Auto-Instrumentation

eBPF (Extended Berkeley Packet Filter) is a technology embedded in modern Linux kernels that allows sandboxed programs to run within the kernel. Initially developed for network monitoring, eBPF has expanded into areas such as security and observability, especially profiling, and now distributed tracing.

Thanks to eBPF, the OpenTelemetry Go Automatic Instrumentation can trace function calls dynamically and extract telemetry. Spans are created without requiring developers to change their code. However, in its current state the instrumentation still relies on manual trace-context propagation through Go context.Context objects throughout the application.

Getting Started with Go Auto-Instrumentation

I tested the project to explore its capabilities and provide guidance for those considering its use. Since the project is still in beta and marked as work in progress, caution is advised before using it in production.

To demonstrate its setup, I’ve prepared a simple “todo” application that uses PostgreSQL as storage. You can follow along with the demo here. We’ll deploy this application in a local Kubernetes cluster using kind and configure OpenTelemetry auto-instrumentation. The Go Automatic Instrumentation project also includes a few examples you can play around with here.

During the experiment I ran into the requirement for manual trace-context propagation: to ensure that spans are correctly linked to the trace, the context.Context object needs to be passed along between functions. For example, in my “todo” application, I initially used the Query method for database calls, which does not accept a context.Context. As a result, the database span did not point at the HTTP span as its parent, and the spans showed up as separate traces rather than as nested spans within a single trace. After I switched to QueryContext and passed the context.Context from the request handler, the spans were correctly linked.

As a side note: this type of issue, where spans do not correctly point to their parents, is known in distributed tracing circles as “broken trace context”. It is notoriously hard to debug and even to spot, because people usually take the trace structure of code they are not familiar with at face value. Troubleshooting why one trace is broken up into multiple traces can be very time consuming.

Nevertheless, passing context objects along at least aligns with Go's best practices, as it enables cancellation and timeout management throughout the call chain.
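
The difference between Query and QueryContext can be reproduced in isolation. The sketch below is not how the eBPF agent works internally. It uses a toy database/sql driver (probeDriver) and a made-up context key (traceKey), both assumptions for illustration, to show that a value attached to the request context only reaches the database layer when QueryContext is used.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"io"
)

// traceKey is a hypothetical stand-in for the trace context that real
// instrumentation associates with a context.Context.
type traceKey struct{}

// probeDriver is a toy driver (an assumption for this sketch, not a real
// database). It records the trace ID visible to each query it receives.
type probeDriver struct{ seen []string }

func (d *probeDriver) Open(string) (driver.Conn, error)             { return &probeConn{d}, nil }
func (d *probeDriver) Connect(context.Context) (driver.Conn, error) { return &probeConn{d}, nil }
func (d *probeDriver) Driver() driver.Driver                        { return d }

type probeConn struct{ d *probeDriver }

func (c *probeConn) Prepare(string) (driver.Stmt, error) { return nil, errors.New("not supported") }
func (c *probeConn) Close() error                        { return nil }
func (c *probeConn) Begin() (driver.Tx, error)           { return nil, errors.New("not supported") }

// QueryContext receives whatever context database/sql was handed by the caller.
func (c *probeConn) QueryContext(ctx context.Context, _ string, _ []driver.NamedValue) (driver.Rows, error) {
	id, _ := ctx.Value(traceKey{}).(string)
	c.d.seen = append(c.d.seen, id)
	return emptyRows{}, nil
}

// emptyRows is a result set with one column and no rows.
type emptyRows struct{}

func (emptyRows) Columns() []string         { return []string{"id"} }
func (emptyRows) Close() error              { return nil }
func (emptyRows) Next([]driver.Value) error { return io.EOF }

// runQueries issues the same statement without and with the request context
// and returns the trace IDs the driver observed, in call order.
func runQueries() ([]string, error) {
	drv := &probeDriver{}
	db := sql.OpenDB(drv)
	defer db.Close()

	// ctx stands in for r.Context() inside an HTTP handler.
	ctx := context.WithValue(context.Background(), traceKey{}, "trace-1")

	// Query falls back to context.Background(), so the link to ctx is lost.
	rows, err := db.Query("SELECT id FROM todos")
	if err != nil {
		return nil, err
	}
	rows.Close()

	// QueryContext hands the caller's context down to the driver.
	rows, err = db.QueryContext(ctx, "SELECT id FROM todos")
	if err != nil {
		return nil, err
	}
	rows.Close()

	return drv.seen, nil
}

func main() {
	seen, err := runQueries()
	if err != nil {
		panic(err)
	}
	fmt.Printf("Query saw trace ID %q, QueryContext saw %q\n", seen[0], seen[1])
}
```

The first call reaches the driver with an empty trace ID, which corresponds to the orphaned database span described above. The second call carries the ID through.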

Setting up the environment

Create a Kind cluster:

sh
kind create cluster --name=go-otel-ebpf

Deploy PostgreSQL using Helm. (This configures a database named todo and sets the password to password.)

sh
helm install pg \
--set postgresqlPassword=password,postgresqlDatabase=todo \
oci://registry-1.docker.io/bitnamicharts/postgresql

Next, build and load the application into the cluster:

sh
docker build -t todo:v1 .
# The cluster was created with a custom name, so it has to be named here too
kind load docker-image todo:v1 --name go-otel-ebpf

Now we can run the application in Kubernetes. But before we do that, let’s have a look at how to enable auto-instrumentation in a Kubernetes environment:

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: todo
  ...
spec:
  ...
  template:
    ...
    spec:
      # 1. Share the Pod's process namespace
      shareProcessNamespace: true
      containers:
        - image: todo:v1
          ...
        # 2. Add the autoinstrumentation-go sidecar
        - name: autoinstrumentation-go
          image: otel/autoinstrumentation-go
          imagePullPolicy: IfNotPresent
          # 3. Configure the environment variables
          env:
            - name: OTEL_GO_AUTO_TARGET_EXE
              value: <location of the target binary>
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "https://<endpoint>"
            - name: OTEL_SERVICE_NAME
              value: "<name of the service>"
          # 4. Ensure that the container runs with elevated privileges
          securityContext:
            runAsUser: 0
            privileged: true

As the example above shows, you need to do the following:

  1. Ensure that containers in the Pod share the process namespace (shareProcessNamespace: true)
  2. Add the OpenTelemetry Go Auto-Instrumentation container as a sidecar. This sidecar runs the eBPF-based instrumentation agent alongside your application container.
  3. Configure environment variables: Define the target executable and telemetry settings.
  4. Ensure the container runs with elevated privileges: eBPF requires elevated permissions to function correctly.
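
Sharing the process namespace matters because the sidecar has to find the application process from inside its own container. One plausible way to do that, sketched here as an assumption rather than as the agent's actual implementation, is to scan /proc for a process whose executable path matches the value of OTEL_GO_AUTO_TARGET_EXE. The helper names findPIDsByExe and findSelf are hypothetical, and the sketch is Linux-only.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// findPIDsByExe scans procRoot (normally "/proc") and returns the PIDs whose
// executable path equals target. Entries that cannot be read are skipped.
func findPIDsByExe(procRoot, target string) ([]int, error) {
	entries, err := os.ReadDir(procRoot)
	if err != nil {
		return nil, err
	}
	var pids []int
	for _, e := range entries {
		pid, err := strconv.Atoi(e.Name())
		if err != nil {
			continue // not a process directory
		}
		exe, err := os.Readlink(filepath.Join(procRoot, e.Name(), "exe"))
		if err != nil {
			continue // process exited or access was denied
		}
		if exe == target {
			pids = append(pids, pid)
		}
	}
	return pids, nil
}

// findSelf reports whether the current process can be located by its own
// executable path, as a quick check that the scan works.
func findSelf() (bool, error) {
	exe, err := os.Executable()
	if err != nil {
		return false, err
	}
	pids, err := findPIDsByExe("/proc", exe)
	if err != nil {
		return false, err
	}
	for _, pid := range pids {
		if pid == os.Getpid() {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	found, err := findSelf()
	if err != nil {
		panic(err)
	}
	fmt.Println("current process visible via /proc:", found)
}
```

Without shareProcessNamespace: true, a scan like this inside the sidecar would only list the sidecar's own processes, so a target such as /todo would never show up.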

Before running the application, we need to configure where the telemetry data will be sent. In this example, I’ll be using Dash0, which you can sign up for here. Alternatively, you can use the OpenTelemetry Collector or other observability tools to view traces.

Once your account is set up, retrieve your token and add the following environment variables to the configuration of the otel/autoinstrumentation-go sidecar:

yaml
- name: autoinstrumentation-go
  env:
    - name: OTEL_LOG_LEVEL
      value: debug
    - name: OTEL_GO_AUTO_TARGET_EXE
      value: /todo
    - name: OTEL_EXPORTER_OTLP_ENDPOINT
      value: "https://<ENDPOINT>:4317"
    - name: OTEL_EXPORTER_OTLP_HEADERS
      value: "Authorization=Bearer <TOKEN>"
    - name: OTEL_EXPORTER_OTLP_PROTOCOL
      value: "grpc"
    - name: OTEL_SERVICE_NAME
      value: "todo-service"
    - name: OTEL_GO_AUTO_INCLUDE_DB_STATEMENT
      value: 'true'
    - name: OTEL_GO_AUTO_PARSE_DB_STATEMENT
      value: 'true'

You can omit OTEL_LOG_LEVEL unless you need to enable debug logging to troubleshoot issues.

The last two environment variables, OTEL_GO_AUTO_INCLUDE_DB_STATEMENT and OTEL_GO_AUTO_PARSE_DB_STATEMENT, instruct the sidecar to include and parse database statements. Since this example uses a database, enabling them makes the database spans carry the executed query instead of showing only a generic database operation.

Now, let's deploy our application:

sh
kubectl apply -f manifests/

Next, we’ll port-forward the service to test the endpoint locally and check if spans are being produced:

sh
kubectl port-forward svc/todo 3000:3000
# In a new terminal
curl localhost:3000/todos/all

Finally, go to dash0.com and click on the Tracing menu item. You should see the traces produced by the todo service.

From there, you can drill into a specific trace to inspect its individual spans.

Our simple todo service is now automatically instrumented, and spans are correctly linked within a trace. As demonstrated, getting started with OpenTelemetry auto-instrumentation in a Go service requires minimal effort. However, this convenience comes with certain trade-offs, particularly around security and control, which we will examine next.

On adding more data

From my experiments, there are a few observations worth noting. The project is actively making progress toward a GA release, and adding custom spans to your code, which the project refers to as hybrid instrumentation, is relatively simple. To enable global instrumentation, you need to set the environment variable OTEL_GO_AUTO_GLOBAL=true. Once enabled, you can acquire a tracer and create spans as shown below:

go
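// Needs "go.opentelemetry.io/otel" and "go.opentelemetry.io/otel/codes".
tracer := otel.Tracer("todo")
// Start a child of whatever span the request context carries.
ctx, span := tracer.Start(r.Context(), "AllTodos")
todos, err := getAllTodos(ctx)
if err != nil {
	// Mark the span as failed and record the error before ending it.
	span.SetStatus(codes.Error, "Failed to fetch todos")
	span.RecordError(err)
	span.End()
	http.Error(w, "Failed to fetch todos", http.StatusInternalServerError)
	return
}
span.End()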

One limitation I encountered is the inability to add custom attributes to spans generated by auto-instrumentation. However, it seems this functionality is currently being developed. Once implemented, it will allow developers to retrieve the current span from the context and dynamically add attributes.

If additional resource attributes need to be configured, this is already possible with the standard OpenTelemetry environment variable OTEL_RESOURCE_ATTRIBUTES, for example OTEL_RESOURCE_ATTRIBUTES="service.namespace=dash0".

Security Considerations of Running Go Instrumentation with eBPF

While OpenTelemetry Go Auto-Instrumentation provides a zero-code solution for tracing, and is in principle pretty cool and helpful to ease adoption, its current implementation raises security concerns.

One of the biggest risks is that the sidecar must run as root (runAsUser: 0) and requires privileged: true permissions. This effectively grants the instrumentation container full access to the host system. If an attacker compromises this container, they could gain complete control over the node and execute arbitrary commands.

Additionally, the instrumentation setup requires sharing the process namespace (shareProcessNamespace: true), allowing containers in the same pod to interact with each other’s processes. This introduces additional potential attack vectors: if your application has a vulnerability, an attacker exploiting it can interact with the process running as root (i.e., “lateral movement”).

From a Kubernetes security perspective, these requirements conflict with Pod Security Standards, which recommend avoiding privileged containers, running workloads as non-root users, and preventing host namespace sharing. Deploying this auto-instrumentation cluster-wide effectively grants these elevated permissions to all Go deployments, many of which may be exposed to the internet as APIs, increasing the attack surface significantly.

Alternatives to OpenTelemetry Go Auto-Instrumentation

The OpenTelemetry Go Auto-Instrumentation is not the only eBPF-based approach to distributed tracing for Go applications. Odigos (with which the OpenTelemetry Go Automatic Instrumentation shares multiple contributors) and Grafana Beyla (which is also in discussions to be donated to the OpenTelemetry project) offer alternative approaches with distinct advantages and limitations.

Odigos is an open-source observability control plane that automates OpenTelemetry instrumentation using eBPF. Full Go auto-instrumentation requires an enterprise license, at least to configure instrumentation rules. In my tests, database queries were not fully visible and appeared as generic DB operations rather than detailed statements, which limited its usefulness for deeper database observability. I may have missed a setting, since the OpenTelemetry Go Automatic Instrumentation has a comparable option.

Grafana Beyla focuses on eBPF-based auto-instrumentation for tracing and metrics. It supports multiple protocols such as HTTP, gRPC, SQL, and Redis and is designed to work across different environments without requiring application modifications. Unlike OpenTelemetry Go Automatic Instrumentation, Beyla does not need root privileges or privileged mode, making it a more security-conscious option. However, I found configuring Beyla for Go auto-instrumentation challenging. While it has potential, it is not yet a seamless experience.

OpenTelemetry Go Auto-Instrumentation remains the most direct approach for OpenTelemetry-native environments; however, it requires privileged access and root permissions, which raises security concerns. Odigos seems interesting for those prioritizing ease of use but comes with enterprise licensing restrictions. Beyla stands out with its security model and broad protocol support, though it still has usability challenges, particularly for Go applications. As the OpenTelemetry ecosystem evolves, I am excited to see whether Beyla’s integration could create a more seamless eBPF-based instrumentation experience without compromising on the security posture.

Other Alternatives: Go Compile-Time Instrumentation

While eBPF-based auto-instrumentation represents a significant breakthrough, compile-time instrumentation offers another promising approach. Instead of injecting observability dynamically at runtime, this method modifies Go source code before compilation, embedding telemetry hooks at build time.
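
To make the idea of rewriting source before compilation concrete, here is a minimal sketch using the go/ast packages from Go's standard library. It inserts a call to a hypothetical traceEnter hook at the top of every function declaration. This only illustrates the general technique and is not how the Alibaba project or Orchestrion are implemented. The names instrument and traceEnter are assumptions made for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
	"strconv"
)

// instrument parses Go source, prepends a call to the hypothetical
// traceEnter(name) hook to every function body, and returns the new source.
func instrument(src string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}

	ast.Inspect(file, func(n ast.Node) bool {
		fn, ok := n.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			return true // not a function declaration with a body
		}
		// Build the statement: traceEnter("<function name>")
		hook := &ast.ExprStmt{X: &ast.CallExpr{
			Fun: ast.NewIdent("traceEnter"),
			Args: []ast.Expr{
				&ast.BasicLit{Kind: token.STRING, Value: strconv.Quote(fn.Name.Name)},
			},
		}}
		fn.Body.List = append([]ast.Stmt{hook}, fn.Body.List...)
		return true
	})

	// Render the modified syntax tree back to source code.
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	src := "package demo\n\nfunc getAllTodos() []string {\n\treturn nil\n}\n"
	out, err := instrument(src)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

A real tool would also have to supply the hook implementation, handle imports, and plug into the build, for example by wrapping the compiler invocation. The sketch leaves all of that out.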

A new industry collaboration is emerging in this space. Alibaba, Datadog, and Quesma have teamed up to create the OpenTelemetry Go Compile-Time Instrumentation SIG (Special Interest Group). Their goal is to merge Alibaba’s opentelemetry-go-auto-instrumentation project and Datadog’s Orchestrion into a unified, vendor-neutral solution for Go applications.

This initiative has major implications. Since telemetry is embedded at compile time, it reduces runtime overhead, making it a more efficient solution for high-performance applications. Unlike eBPF, which requires specific Linux kernel support, compile-time instrumentation works across a broader range of environments, ensuring better compatibility. The formation of this SIG is also significant from a community perspective, as it marks the first time an OpenTelemetry SIG has been led by contributors from the APAC region, highlighting growing contributions from Alibaba and other companies.

The proposal for this SIG can be found here.

Final Thoughts

This was a very interesting investigation. There are multiple eBPF-based options out there, although none currently seems ready for prime time. However, they appear to be converging through donations to the OpenTelemetry project, so there’s hope the community will get the best of all the options.

Also, adding the instrumentation to your Go applications automatically at compilation time might be an option very much worth considering when the project matures. It offers a promising path to simplifying observability without compromising on performance or introducing significant security risks. Moreover, as the OpenTelemetry community grows and collaborates across various regions and companies, we can expect continued innovation and maturation in this space.

In the meantime, organizations should carefully evaluate the trade-offs of using runtime eBPF-based solutions.