The Journey to the OTel Collector

In a previous role, I worked on establishing an observability stack using OpenTelemetry (OTel), which mostly involved setting up instances of the OTel Collector across a distributed network. The main goal was to decouple data collection from data export, so that telemetry from various services could be gathered and exported more reliably.

In the simplest setup, logs and metrics can be sent directly to Grafana Cloud via an OTLP (OpenTelemetry Protocol) exporter. An OTLP exporter is the component that sends telemetry data from an SDK or an OTel Collector to a backend like Grafana Cloud. Exporters generally handle authentication and batching, and can be configured for HTTP or gRPC transport. Essentially, they act as the bridge between your application’s telemetry and your monitoring platform.

Assume the env vars OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS point to a Grafana Cloud instance, and that the following code starts the OTel Node SDK:

import { NodeSDK } from "@opentelemetry/sdk-node";
import { RuntimeNodeInstrumentation } from "@opentelemetry/instrumentation-runtime-node";

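// Configure the SDK with a service name and Node.js runtime metrics.
const sdk = new NodeSDK({
  serviceName: "service",
  instrumentations: [
    new RuntimeNodeInstrumentation({
      monitoringPrecision: 10,
    }),
  ],
});

// Shut the SDK down cleanly so pending telemetry is flushed before exit.
process.on("beforeExit", async () => {
  await sdk.shutdown();
});

sdk.start();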

You can then load the instrumentation code alongside the app code in a Docker image and expect the data to be picked up by Grafana Cloud:

FROM node:20-alpine

WORKDIR /app
COPY . .

ENV OTEL_EXPORTER_OTLP_ENDPOINT="GRAFANA_ENDPOINT_HERE" \
    OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic BASIC_HEADER_HERE" \
    OTEL_RESOURCE_ATTRIBUTES="deployment.environment.name=prod"

EXPOSE 9229

ENTRYPOINT ["/bin/sh", "-c"]
CMD ["cd /app && node --inspect=0.0.0.0 --experimental-loader=@opentelemetry/instrumentation/hook.mjs --import ./instrumentation.js app.js"]
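The BASIC_HEADER_HERE placeholder in the Dockerfile stands for a base64-encoded credential. As a rough sketch, assuming the credential is an instance ID and an access token joined by a colon (the usual HTTP Basic scheme), the header value could be built like this. The helper name and the sample values are hypothetical, not part of any OTel package:

```javascript
// Hypothetical helper: builds a value suitable for OTEL_EXPORTER_OTLP_HEADERS.
// Assumes the backend expects HTTP Basic auth with "instanceId:token".
function buildOtlpAuthHeader(instanceId, token) {
  // Buffer is a Node.js global, so no import is needed.
  const credentials = Buffer.from(`${instanceId}:${token}`, "utf8").toString("base64");
  return `Authorization=Basic ${credentials}`;
}

// Placeholder values for illustration only; real ones come from your Grafana Cloud stack.
console.log(buildOtlpAuthHeader("123456", "example-token"));
```

In practice you would likely inject the resulting value at container start (for example with docker run -e) rather than baking it into the image.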

That said, it’s not considered best practice to have your app code handle telemetry collection and export on its own. In the OTel ecosystem, the common pattern is to hand that responsibility to an OTel Collector. The collector is configured through a declarative YAML file, which it reads and applies when it starts. The config file looks something like this:

receivers:
  otlp:
    protocols:
      http:
        endpoint: localhost:4318

processors:
  batch/grafana_cloud:
    timeout: 5s
    send_batch_size: 50000
 
exporters:
  otlphttp/grafana_cloud:
    endpoint: "${env:OTEL_EXPORTER_OTLP_ENDPOINT}"
    headers: { "Authorization": "Basic ${env:OTEL_EXPORTER_BASIC_HEADER}" }


service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch/grafana_cloud]
      exporters: 
        - otlphttp/grafana_cloud

  telemetry:
    logs:
      level: info
      encoding: 'json'
      processors:
        - batch:
            exporter:
              otlp:
                protocol: http/protobuf
                endpoint: "${env:OTEL_EXPORTER_OTLP_ENDPOINT}/v1/logs"
                headers:
                  - name: Authorization
                    value: "Basic ${env:OTEL_EXPORTER_BASIC_HEADER}"

This configuration sets up an OTel Collector that accepts logs via OTLP over HTTP, batches them for efficiency, and exports them to Grafana Cloud with an authenticated request. The service.pipelines.logs section defines a standard application telemetry pipeline, which ships logs generated by your applications to Grafana Cloud. The service.telemetry.logs section, by contrast, configures how the OTel Collector emits and exports its own internal logs. This separation lets the collector handle its own logging independently of app logs, without introducing circular dependencies in user-defined pipelines.
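One detail worth checking in a config like this: every name listed in a pipeline has to match a key defined under receivers, processors, or exporters exactly, so batch/grafana_cloud and batch/grafana-cloud are two different components. Below is a minimal sketch of that check, assuming the YAML has already been parsed into a plain object. The function name is hypothetical, and it ignores connectors for simplicity:

```javascript
// Hypothetical helper: lists pipeline references that have no matching
// top-level component definition. Connectors are not considered here.
function findUndefinedPipelineComponents(config) {
  const problems = [];
  const pipelines = config.service?.pipelines ?? {};
  for (const [pipelineName, pipeline] of Object.entries(pipelines)) {
    for (const kind of ["receivers", "processors", "exporters"]) {
      const defined = new Set(Object.keys(config[kind] ?? {}));
      for (const id of pipeline[kind] ?? []) {
        if (!defined.has(id)) {
          problems.push(`${pipelineName}.${kind}: ${id}`);
        }
      }
    }
  }
  return problems;
}

// A trimmed-down version of the config above, with a deliberate mismatch.
const config = {
  receivers: { otlp: {} },
  processors: { "batch/grafana_cloud": {} },
  exporters: { "otlphttp/grafana_cloud": {} },
  service: {
    pipelines: {
      logs: {
        receivers: ["otlp"],
        processors: ["batch/grafana-cloud"], // hyphen instead of underscore
        exporters: ["otlphttp/grafana_cloud"],
      },
    },
  },
};

console.log(findUndefinedPipelineComponents(config));
// → [ 'logs.processors: batch/grafana-cloud' ]
```

The collector performs its own validation when it loads the config, so a check like this is mainly useful as a fast unit test in CI.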

The OTel Collector can then be run as a Docker container like so:

# Use the OpenTelemetry Collector Contrib image
FROM otel/opentelemetry-collector-contrib:0.136.0

# Create a directory for the config
WORKDIR /etc/otel

COPY config/otel-agent-config.yaml /etc/otel/otel-agent-config.yaml

ENV OTEL_EXPORTER_OTLP_ENDPOINT="ENDPOINT" \
    OTEL_EXPORTER_BASIC_HEADER="HEADER"

EXPOSE 4318

ENTRYPOINT ["/otelcontribcol"]
CMD ["--config=/etc/otel/otel-agent-config.yaml"]

With this setup, you can point the application’s OTEL_EXPORTER_OTLP_ENDPOINT env var at the OTel Collector container instead of sending telemetry directly to Grafana Cloud. By running both the application container and the collector container on the host network (using --network host), the app can reach the collector at localhost:4318. Note: receivers.otlp.protocols.http.endpoint is set to localhost rather than 0.0.0.0. With host networking, both containers share the host’s network stack, so binding to localhost is enough for the app to connect.
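To confirm the collector is reachable before rewiring the app, you can post a single log record to the OTLP/HTTP receiver by hand. The sketch below assumes the receiver is listening on localhost:4318 as configured above and accepts OTLP JSON on /v1/logs. The function names, the scope name, and the message are hypothetical:

```javascript
// Build a minimal OTLP/JSON logs payload: one resource, one scope, one record.
function buildOtlpLogPayload(serviceName, message, nowMs = Date.now()) {
  return {
    resourceLogs: [
      {
        resource: {
          attributes: [
            { key: "service.name", value: { stringValue: serviceName } },
          ],
        },
        scopeLogs: [
          {
            scope: { name: "smoke-test" },
            logRecords: [
              {
                // OTLP/JSON encodes 64-bit integers as decimal strings.
                timeUnixNano: (BigInt(nowMs) * 1000000n).toString(),
                severityNumber: 9, // INFO in the OTel log data model
                severityText: "INFO",
                body: { stringValue: message },
              },
            ],
          },
        ],
      },
    ],
  };
}

// Post the payload to the collector and resolve with the HTTP status code.
// Uses the global fetch available in Node.js 18 and later.
async function sendTestLog(baseUrl = "http://localhost:4318") {
  const res = await fetch(`${baseUrl}/v1/logs`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildOtlpLogPayload("service", "collector smoke test")),
  });
  return res.status;
}
```

Calling sendTestLog() on the host should resolve with a 2xx status if the receiver accepted the record. A connection error usually means the collector is not running or is bound to a different interface.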

Centralizing telemetry collection in the OTel Collector gives us a more flexible, scalable observability architecture. The application can focus on application logic, while the OTel Collector provides fine-grained, declarative control over logs and metrics that is easier to test and more reliable.