OpenTelemetry Support

This document details Sentry's work in integrating and supporting OpenTelemetry, the open standard for metrics, traces and logs. In particular, it focuses on the integration between Sentry's performance monitoring product and OpenTelemetry's tracing spec.

Background

When Sentry performance monitoring was initially introduced, OpenTelemetry was in early stages. This lead to us adopt a slightly different model from OpenTelemetry, notably we have this concept of transactions that OpenTelemetry does not have. We've described this, and some more historical background, in our performance monitoring research document.

Approach

To transform OpenTelemetry SDK data to Sentry data, we rely on the OpenTelemetry SpanProcessor and OpenTelemetry Propagator API.

We want users to use Sentry for sampling functionality, so users should use OpenTelemetry's default sampler and sample through Sentry's trace_sampler and traces_sample_rate options.

To be an classified as fully supporting OpenTelemetry the OpenTelemetry instrumentation must support the following features.

  1. Linking of errors and transactions via trace context.
  2. Propagation of sentry-trace and baggage headers (this is done via the SentryPropagator)
  3. Span/Trace ID's must match between Sentry and OpenTelemetry.
  4. Spans that represent requests to Sentry must not be recorded in Sentry

To enable OpenTelemetry support for your SDK, go through the following steps.

Step 1: Implement the SentrySpanProcessor on your SDK

Below we have written a custom SpanProcessor that will transform the OpenTelemetry spans to Sentry spans.

This relies on a couple key assumptions.

  1. Hub/Scope propogation works properly in your platform. This means that the hub used in onStart is the same hub used in onEnd and it means that hubs fork properly in async contexts.
  2. The SDK is initialized before the OpenTelemetry SDK is initialized.
  3. There will only be a single transaction occuring at the same time. This is a limitation of the current SDK design.
Copied
import { SpanProcessor } from '@opentelemetry/sdk-trace-base';
import { Span as OpenTelemetrySpan, trace } from '@opentelemetry/api';
import { Span as SentrySpan } from '@sentry/tracing';

const MAP = Map<SentrySpan['spanId'], SentrySpan> = new Map<SentrySpan['spanId'], SentrySpan>();

class SentrySpanProcessor implements SpanProcessor {
  constructor() {
    Sentry.addGlobalEventProcessor(event => {
      const otelSpan = trace.getActiveSpan();
      if (!otelSpan) {
        return event;
      }

      const otelSpanId = otelSpan.spanContext().spanId;
      const sentrySpan = MAP.get(otelSpanId);

      if (!sentrySpan) {
        return event;
      }

      // If event has already set `trace` context, use that one.
      // This happens in the case of transaction events.
      event.contexts = { trace: sentrySpan.getTraceContext(), ...event.contexts };
      return event;
    });
  }

  onStart(otelSpan: OpenTelemetrySpan): void {
    const hub = Sentry.getCurrentHub();
    if (!hub) return;

    const otelSpanId = otelSpan.spanContext().spanId;
    const otelParentSpanId = otelSpan.parentSpanId;

    // Otel supports having multiple non-nested spans at the same time
    // so we cannot use hub.getSpan(), as we cannot rely on this being on the current span
    const sentryParentSpan = otelParentSpanId && MAP.get(otelParentSpanId);

    if (sentryParentSpan) {
      const sentryChildSpan = sentryParentSpan.startChild({
        description: otelSpan.name,
        instrumenter: 'otel',
        startTimestamp: convertOtelTimeToSeconds(otelSpan.startTime),
        spanId: otelSpanId,
      });

      MAP.set(otelSpanId, sentryChildSpan);
    } else {
      const traceCtx = getTraceData(otelSpan);
      const transaction = hub.startTransaction({
        name: otelSpan.name,
        ...traceCtx,
        instrumenter: 'otel',
        startTimestamp: convertOtelTimeToSeconds(otelSpan.startTime),
        spanId: otelSpanId,
      });

      MAP.set(otelSpanId, transaction);
    }
  }

  onEnd(otelSpan: OpenTelemetrySpan): void {
    const otelSpanId = otelSpan.spanContext().spanId;
    const sentrySpan = MAP.get(otelSpanId);

     // Func definition is explained in Step 2
    if isSentryRequest(otelSpanId) {
      // This is a request to Sentry, so we don't want to finish the transaction/span (so it isn't sent to Sentry)
      MAP.delete(otelSpanId);
      return;
    }

    if (sentrySpan instanceof Transaction) {
      updateTransactionWithOtelData(sentrySpan, otelSpan);
      finishTransactionWithContextFromOtelData(sentrySpan, otelSpan);
    } else {
      updateSpanWithOtelData(sentrySpan, otelSpan);
      sentrySpan.finish(convertOtelTimeToSeconds(otelSpan.endTime));
    }

    MAP.delete(otelSpanId);
  }
}

Users are required to add this SentrySpanProcessor to their OpenTelemetry SDK initialization logic to make this work, like so. Individual SDK implementations might be a little different.

Copied
import { NodeSDK } from "@opentelemetry/sdk-node";
import { Resource } from "@opentelemetry/resources";
import * as Sentry from "@sentry/node";
import { SentrySpanProcessor } from "@sentry/opentelemetry-node";

Sentry.init({
  /// ...
});

const sdk = new NodeSDK({
  resource: new Resource({
    "service.name": "my-service",
    "service.version": "1.0.0",
  }),
  spanProcessor: new SentrySpanProcessor(),
});

Step 2: Implement the SentryPropagator on your SDK Add SentryPropagator

SentryPropagator is used to inject/extract sentry-trace and baggage headers to make trace propogation and dynamic sampling work correctly.

Copied
import { Context, TextMapPropagator } from "@opentelemetry/api";
import { SpanContext } from "@opentelemetry/api";

export class SentryPropagator implements TextMapPropagator {
  inject(context: Context, carrier: unknown, setter: TextMapSetter): void {
    const spanContext = getSpanContext(context);
    if (isSentryRequest(spanContext)) {
      return;
    }

    const sentryTrace = generateSentryTrace(spanContext);
    setSentryTraceHeader(carrier, sentryTrace, setter);

    const baggage = generateBaggageFromContext(context);
    setBaggageHeader(carrier, baggage, setter);
  }

  extract(context: Context, carrier: unknown, getter: TextMapGetter): Context {
    const sentryTraceParent = extractSentryTraceFromCarrier(carrier, getter);
    setSentryTraceInfoOnSpanContext(context, sentryTraceParent);

    const dynamicSamplingContext = extractBaggageFromCarrier(carrier, getter);
    setDynamicSamplingContextOnSpanContext(context, dynamicSamplingContext);

    return context;
  }
}

Step 3: Define isSentryRequest

We want to make sure that we don't create Sentry Spans for requests to Sentry.

Copied
import { Span as OtelSpan } from "@opentelemetry/sdk-trace-base";
import { SemanticAttributes } from "@opentelemetry/semantic-conventions";
import { getCurrentHub } from "@sentry/core";

export function isSentryRequestSpan(otelSpan: OtelSpan): boolean {
  const { attributes } = otelSpan;

  const httpUrl = attributes[SemanticAttributes.HTTP_URL];

  if (!httpUrl) {
    return false;
  }

  return isSentryRequestUrl(httpUrl.toString());
}

function isSentryRequestUrl(url: string): boolean {
  const dsn = getCurrentHub()
    .getClient()
    ?.getDsn();
  return dsn ? url.includes(dsn.host) : false;
}

Note: In some environments (e.g. JavaScript), you do not have access to attributes in the SpanProcessor onStart hook. In that case, we will always create transactions/spans, but never finish them in the case they are identified as Sentry requests in onEnd.

Step 3: Define getTraceData

Copied
function getTraceData(otelSpan: OtelSpan): Partial<TransactionContext> {
  const spanContext = otelSpan.spanContext();
  const traceId = spanContext.traceId;
  const spanId = spanContext.spanId;

  const parentSpanId = otelSpan.parentSpanId;
  return { spanId, traceId, parentSpanId };
}

If the user is using the SentryPropagator, the sentry-trace information should be getting attached to incoming spans automatically. This should mean that we don't need to explicitly grab the header. The SentryPropagator should also be putting the Dynamic Sampling Context from the baggage header onto the OpenTelemetry Context, which then can be used in getTraceData to inject the DSC in during transaction creation.

Step 4: Define updateSpanWithOtelData

We want to update the Sentry Span & Transaction with the OpenTelemetry data. This includes the span's operation, description, and resource attributes.

The Sentry span description should come from the OpenTelemetry Span name. The Sentry span operation should come from a combinations of the OpenTelemetry Span name, kind, and attributes.

To make things simple, only set ops for db and http spans. Don't do any other extended logic, this will be done in Relay in the future.

Copied
function updateSpanWithOtelData(
  sentrySpan: SentrySpan,
  otelSpan: OtelSpan
): void {
  const { attributes, kind } = otelSpan;

  sentrySpan.setStatus(mapOtelStatus(otelSpan));
  sentrySpan.setData("otel.kind", kind.valueOf());

  Object.keys(attributes).forEach(prop => {
    const value = attributes[prop];
    sentrySpan.setData(prop, value);
  });

  const { op, description } = parseSpanDescription(otelSpan);
  sentrySpan.op = op;
  sentrySpan.description = description;
}

function updateTransactionWithOtelData(
  transaction: Transaction,
  otelSpan: OtelSpan
): void {
  transaction.setStatus(mapOtelStatus(otelSpan));

  const { op, description } = parseSpanDescription(otelSpan);
  transaction.op = op;
  transaction.name = description;
}

A reference implementation to parse the OTEL span descriptions can be found here: parse-otel-span-description.ts

Step 5: Add OpenTelemetry Context

All SDKs are required to add event context with the key otel to events generated by the SpanProcessor. This is detailed in our spec for the Context. The OpenTelemetryContext should be generated from information from the OpenTelemetry Span.

We recommend doing this by adding a transaction.setContext() method to your SDK.

Copied
function finishTransactionWithContextFromOtelData(
  transaction: Transaction,
  otelSpan: OtelSpan
): void {
  transaction.setContext("otel", {
    attributes: otelSpan.attributes,
    resource: otelSpan.resource.attributes,
  });

  transaction.finish(convertOtelTimeToSeconds(otelSpan.endTime));
}

Example:

Copied
{
  "contexts": {
    "otel": {
      "attributes": {
        "http.method": "GET",
        "http.url": "https://example.com",
        "http.status_code": 200
      },
      "resource": {
        "service.name": "my-service",
        "service.version": "1.0.0",
        "telemetry.sdk.name": "opentelemetry-node",
        "telemetry.sdk.version": "4.0.0"
      }
      // ...
    }
  }
}

Step 6: Add instrumenter option to Sentry SDK

We want to avoid double instrumenting the same library. To do this, we want to add an option to the Sentry SDK that disables Sentry performance instrumentation (automatic and manual) when OpenTelemetry is enabled. For now users will have to manually add this option to their SDK initialization code, but in the future we can do this automatically (by detecting if OpenTelemetry exists).

Copied
Sentry.init({
  // ...
  instrumenter: "otel",
});

You have two options for this:

  1. Add an early return to Sentry.startTransaction() and Sentry.trace() (a helper method) if the instrumenter option is set to otel. This may require SDK changes, as Sentry.trace() is not part of the unified API.
Copied
class Hub implements HubInterface {
  // ...
  startTransaction(
    this: Hub,
    context: TransactionContext
  ): Transaction | undefined {
    // ...
    if (this.instrumenter !== context.instrumenter) {
      return;
    }
    // ...
  }

  // Note this is not part of the unified API.
  // It is being proposed as part of:
  // https://github.com/getsentry/rfcs/issues/28
  // see: https://github.com/getsentry/develop/issues/720
  trace<T>(this: Hub, context: SpanContext, callback: (span?: Span) => T): T {
    // ...
    if (this.instrumenter !== context.instrumenter) {
      return callback();
    }
    // ...
  }
}
  1. Add if conditionals around all callsites where you are using Sentry.startTransaction and span.startChild in the Sentry SDK code. This is a little more work, but it doesn't require any SDK changes to the unified API.
Copied
if (instrumenter === "otel") {
  span.startChild({
    // ...
  });
}

You are also free to do both, or an alternative method, the only requirement is that we don't double instrument.

Span Protocol

Below describe the transformations between an OpenTelemetry span and a Sentry Span. Related: the interface for a Sentry Span, the Relay spec for a Sentry Span and the spec for an OpenTelemetry span.

This is based on a mapping done as part of work on the OpenTelemetry Sentry Exporter.

OpenTelemetry SpanSentry SpanNotes
trace_idtrace_id
span_idspan_id
parent_span_idparent_span_idIf a span does not have a parent span ID, it is a root span. For a root span:
  • If there is an active Sentry transaction, add it to the transaction
  • If there is no active Sentry transaction, construct a new transaction from that span
  • namedescription
    name,attributes,kindop
    attributes,kinddata
    attributes,statusstatusSee Span Status for more details
    start_time_unix_nanostart_timestamp
    end_time_unix_nanotimestamp
    eventSee Span Events for more details

    Currently there is no spec for how Span.link in OpenTelemetry should appear in Sentry.

    Span Status

    In OpenTelemetry, Span Status is an enum of 3 values, while Sentry's Span Status is an enum of 17 values that map to the GRPC status codes. Each of the Sentry Span Status codes also map to HTTP codes. Sentry adopted it's Span Status spec from OpenTelemetry, who used the GRPC status code spec, but later on changed to the current spec it uses today.

    To map from OpenTelemetry Span Status to, you need to rely on both OpenTelemetry Span Status and Span attributes. This approach was adapted from a PR by GH user @anguisa to the OpenTelemetry Sentry Exporter.

    Copied
    // OpenTelemetry span status can be Unset, Ok, Error. HTTP and Grpc codes contained in tags can make it more detailed.
    
    import { Span as OtelSpan } from "@opentelemetry/sdk-trace-base";
    import { SemanticAttributes } from "@opentelemetry/semantic-conventions";
    import { SpanStatusType as SentryStatus } from "@sentry/tracing";
    
    // canonicalCodesHTTPMap maps some HTTP codes to Sentry's span statuses. See possible mapping in https://develop.sentry.dev/sdk/event-payloads/span/
    const canonicalCodesHTTPMap: Record<string, SentryStatus> = {
      "400": "failed_precondition",
      "401": "unauthenticated",
      "403": "permission_denied",
      "404": "not_found",
      "409": "aborted",
      "429": "resource_exhausted",
      "499": "cancelled",
      "500": "internal_error",
      "501": "unimplemented",
      "503": "unavailable",
      "504": "deadline_exceeded",
    } as const;
    
    // canonicalCodesGrpcMap maps some GRPC codes to Sentry's span statuses. See description in grpc documentation.
    const canonicalCodesGrpcMap: Record<string, SentryStatus> = {
      "1": "cancelled",
      "2": "unknown_error",
      "3": "invalid_argument",
      "4": "deadline_exceeded",
      "5": "not_found",
      "6": "already_exists",
      "7": "permission_denied",
      "8": "resource_exhausted",
      "9": "failed_precondition",
      "10": "aborted",
      "11": "out_of_range",
      "12": "unimplemented",
      "13": "internal_error",
      "14": "unavailable",
      "15": "data_loss",
      "16": "unauthenticated",
    } as const;
    
    export function mapOtelStatus(otelSpan: OtelSpan): SentryStatus {
      const { status, attributes } = otelSpan;
    
      const statusCode = status.code;
    
      if (statusCode < 0 || statusCode > 2) {
        return "unknown_error";
      }
    
      if (statusCode === 0 || statusCode === 1) {
        return "ok";
      }
    
      const httpCode = attributes[SemanticAttributes.HTTP_STATUS_CODE];
      const grpcCode = attributes[SemanticAttributes.RPC_GRPC_STATUS_CODE];
    
      if (typeof httpCode === "string") {
        const sentryStatus = canonicalCodesHTTPMap[httpCode];
        if (sentryStatus) {
          return sentryStatus;
        }
      }
    
      if (typeof grpcCode === "string") {
        const sentryStatus = canonicalCodesGrpcMap[grpcCode];
        if (sentryStatus) {
          return sentryStatus;
        }
      }
    
      return "unknown_error";
    }

    Span Events

    OpenTelemetry, has the concept of Span Events. As per the spec:

    An event is a human-readable message on a span that represents “something happening” during it’s lifetime

    In Sentry, we have two options for how to treat span events. First, we can add them as breadcrumbs to the transaction the span belongs to. Second, we can create an artificial "point-in-time" span (a span with 0 duration), and add it to the span tree. TODO on what approach we take here.

    In the special case that the span event is an exception span, where the name of the span event is exception, we also have the possibility of generating a Sentry error from an exception. In this case, we can create this exception based on the attributes of an event, which include the error message and stacktrace. This exception can also inherit all other attributes of the span event + span as tags on the event.

    In the OpenTelemetry Sentry exporter, we've used this strategy to generate Sentry errors.

    Transaction Protocol

    There is no concept of a transaction within OpenTelemetry, so we rely on promoting spans to become transactions. The span description becomes the transaction name, and the span op becomes the transaction op. Therefore, OpenTelemetry spans must be mapped to Sentry spans before they can be promoted to become a transaction.

    We should not set tags on transactions. Instead, we should populate the attributes field on the otel context, see below

    OpenTelemetry Context

    Aside from information from Spans and Transactions, OpenTelemetry has meta-level information about the SDK, resource, and service that generated spans. To track this information, we generate a new OpenTelemetry Event Context.

    The existence of this context on an event (transaction or error) is how the Sentry backend will know that the incoming event is an OpenTelemetry event.

    It uses the following schema:

    Copied
    type Primitive = number | string | boolean | bigint;
    
    interface Attributes<T extends Primitive> {
      [key: string]: T | Array<T> | Record<string, T>;
    }
    
    interface OpenTelemetryContext {
      type?: "otel";
    
      // https://github.com/open-telemetry/opentelemetry-proto/blob/724e427879e3d2bae2edc0218fff06e37b9eb46e/opentelemetry/proto/trace/v1/trace.proto#L174-L186
      attributes?: Attributes;
    
      // https://github.com/open-telemetry/opentelemetry-proto/blob/724e427879e3d2bae2edc0218fff06e37b9eb46e/opentelemetry/proto/resource/v1/resource.proto
      // Resource information.
      resource?: Attributes;
    }

    The reason sdk and service are split are so they can be indexed as top level fields in the future for easier usage within Sentry.

    SDK Spec

    You can edit this page on GitHub.