Datadog Tracing

CritterWatch can query Datadog's APM API for OpenTelemetry spans when an operator opens a "View Trace" link on a DLQ entry, a saga instance, or a message-id page. Datadog is one of several supported trace backends — see Settings → Trace Providers for the full list.

What CritterWatch needs from Datadog

  • A Datadog API Key and a Datadog Application Key (two distinct credentials)
  • The site your account is hosted on — us1 / us3 / us5 / eu1 / ap1
  • Your Wolverine services already shipping OTel spans to Datadog (CritterWatch does not enroll services into APM — it only queries traces that are already there)

Where to get the keys: in your Datadog account, open Organization Settings → API Keys for the API key and Organization Settings → Application Keys for the application key. The application key needs only the APM Read scope; CritterWatch never writes to Datadog.

How to wire it up

In your CritterWatch host startup (typically CritterWatchBff/Program.cs):

```csharp
services.AddCritterWatchTraceProvider<DataDogTraceProvider, DataDogTraceProviderOptions>(
    "datadog",
    opts =>
    {
        opts.ApiKey = builder.Configuration["Datadog:ApiKey"]!;
        opts.AppKey = builder.Configuration["Datadog:AppKey"]!;
        opts.Site   = builder.Configuration["Datadog:Site"] ?? "us1";
    });
```

Use the same secret store you already use for the rest of your service configuration — environment variables, Azure Key Vault, AWS Secrets Manager, Kubernetes sealed secrets, etc. CritterWatch never persists the API or app keys; they live in IConfiguration and only the active HTTP client knows them.
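The wiring snippet uses the null-forgiving `!` operator, so a missing key passes through as null and only surfaces later as a failed Datadog call. If the keys arrive as environment variables, a fail-fast check at startup might look like the sketch below. `RequireEnv` is a hypothetical helper, not part of CritterWatch; the variable names assume ASP.NET Core's standard mapping, where `Datadog__ApiKey` in the environment becomes the configuration key `Datadog:ApiKey`.

```csharp
using System;

// Hypothetical helper, not part of CritterWatch: read a required setting from
// the environment and fail fast with a clear message when it is absent.
string RequireEnv(string name)
{
    var value = Environment.GetEnvironmentVariable(name);
    if (string.IsNullOrWhiteSpace(value))
    {
        throw new InvalidOperationException($"Missing required setting '{name}'.");
    }

    return value;
}
```

Calling `RequireEnv("Datadog__ApiKey")` and `RequireEnv("Datadog__AppKey")` before registering the provider turns a misconfigured deployment into an immediate startup error rather than an empty trace view.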

The Site value maps to the API base URL:

| Setting | API base URL |
| --- | --- |
| us1 (default) | https://api.datadoghq.com |
| us3 | https://api.us3.datadoghq.com |
| us5 | https://api.us5.datadoghq.com |
| eu1 | https://api.datadoghq.eu |
| ap1 | https://api.ap1.datadoghq.com |
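The mapping can be pictured as a simple lookup. The sketch below is illustrative only: `ResolveBaseUrl` is a hypothetical name, and CritterWatch's actual resolution code is not shown in these docs. It assumes an unset value falls back to us1, matching the default in the wiring snippet, and that an unknown value should be rejected rather than guessed.

```csharp
using System;
using System.Collections.Generic;

// Site values and base URLs copied from the table above.
var baseUrls = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
    ["us1"] = "https://api.datadoghq.com",
    ["us3"] = "https://api.us3.datadoghq.com",
    ["us5"] = "https://api.us5.datadoghq.com",
    ["eu1"] = "https://api.datadoghq.eu",
    ["ap1"] = "https://api.ap1.datadoghq.com",
};

// Hypothetical helper: resolve a Site setting to its API base URL.
string ResolveBaseUrl(string? site)
{
    var key = string.IsNullOrWhiteSpace(site) ? "us1" : site.Trim();
    return baseUrls.TryGetValue(key, out var url)
        ? url
        : throw new ArgumentException($"Unknown Datadog site '{site}'.", nameof(site));
}

Console.WriteLine(ResolveBaseUrl("eu1")); // prints https://api.datadoghq.eu
```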

Per-service routing

By default a registered provider is used for every monitored service. To route a specific service somewhere else (e.g. dev services use Jaeger, prod uses Datadog), the monitored service can declare its preferred provider in its Wolverine.CritterWatch wire-up:

```csharp
services.AddCritterWatchMonitoring(opts =>
{
    opts.TraceProvider("datadog");
});
```

That string is the same name you registered in the CritterWatch host. The pushed value becomes the active binding unless the operator overrides it in Settings — see Settings → Service Bindings for the override flow.
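For the dev-uses-Jaeger, prod-uses-Datadog case mentioned above, one option is to derive the provider name from the hosting environment. This is a sketch under assumptions: `ProviderFor` is a hypothetical helper, and "jaeger" stands for whatever name a Jaeger provider was registered under in the CritterWatch host.

```csharp
using System;

// Hypothetical helper: choose a registered provider name from the environment name.
// "jaeger" is an assumed registration name; "datadog" matches the registration
// shown in "How to wire it up".
string ProviderFor(string? environmentName) =>
    string.Equals(environmentName, "Production", StringComparison.OrdinalIgnoreCase)
        ? "datadog"
        : "jaeger";

Console.WriteLine(ProviderFor("Development")); // prints jaeger
```

In the monitored service this could be passed as `opts.TraceProvider(ProviderFor(builder.Environment.EnvironmentName))`, so the same build binds to a different backend per environment.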

What CritterWatch queries Datadog for

When an operator clicks a "View Trace" link, CritterWatch sends two requests to Datadog's Spans Search API:

  1. Search for matching spans: POST /api/v2/spans/events/search with a query like service:order-service @messaging.message_id:<uuid>. Returns trace summaries newest-first.
  2. Fetch the full trace: the same endpoint, this time filtered by @trace_id:<traceId>. Returns every span in the trace, which CritterWatch renders as a hierarchy.
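To make the two calls concrete, here is a rough sketch of how such a request could be assembled with `HttpClient` types. It is not CritterWatch's implementation. The body follows the `search_request` shape in Datadog's public Spans Search reference; the one-hour window and the page limit of 25 are assumptions chosen for illustration, and `BuildSpanSearch` is a hypothetical name.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;

// Hypothetical helper: build (but do not send) a Spans Search request.
HttpRequestMessage BuildSpanSearch(string baseUrl, string apiKey, string appKey, string query)
{
    var body = new
    {
        data = new
        {
            type = "search_request",
            attributes = new
            {
                // Assumed time window; widen it to look further back.
                filter = new { query, from = "now-1h", to = "now" },
                sort = "-timestamp",        // newest first
                page = new { limit = 25 },  // assumed page size
            },
        },
    };

    var request = new HttpRequestMessage(HttpMethod.Post, $"{baseUrl}/api/v2/spans/events/search")
    {
        Content = new StringContent(JsonSerializer.Serialize(body), Encoding.UTF8, "application/json"),
    };

    // Both credentials travel as headers; the app key authorizes the APM read.
    request.Headers.Add("DD-API-KEY", apiKey);
    request.Headers.Add("DD-APPLICATION-KEY", appKey);
    return request;
}
```

The second step would reuse the same builder with a query of `@trace_id:<traceId>`, and either request can be sent with `HttpClient.SendAsync`.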

The query attribute names match what Wolverine already emits:

| Lookup | Tag |
| --- | --- |
| Message id | messaging.message_id |
| Message type | messaging.message_type |
| Stream id | wolverine.stream.id |
| Saga id | wolverine.saga.id |

If your Datadog account is missing one of these tags on its spans, the corresponding lookup will return empty. Verify by searching your Datadog APM console directly with the same tag filter — if it returns nothing there, your services aren't emitting that attribute yet, and the fix is at the Wolverine / OTel exporter side, not in CritterWatch.
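When checking in the APM console, it helps to use exactly the filter string CritterWatch would send. The sketch below composes one from the tag table, following the `service:... @tag:value` form shown earlier. The lookup keys ("message-id" and so on) and the `BuildQuery` name are hypothetical labels for this example, and values containing spaces or special characters would need quoting that this sketch does not attempt.

```csharp
using System;
using System.Collections.Generic;

// Tag names copied from the table above; the dictionary keys are made up for this sketch.
var tags = new Dictionary<string, string>
{
    ["message-id"] = "messaging.message_id",
    ["message-type"] = "messaging.message_type",
    ["stream-id"] = "wolverine.stream.id",
    ["saga-id"] = "wolverine.saga.id",
};

// Hypothetical helper: compose a span filter for one service and one lookup.
string BuildQuery(string service, string lookup, string value) =>
    $"service:{service} @{tags[lookup]}:{value}";

Console.WriteLine(BuildQuery("order-service", "saga-id", "42"));
// prints service:order-service @wolverine.saga.id:42
```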

Health probe + diagnostics

The Settings page shows a Last Healthy timestamp on each provider row. Click Test to probe — CritterWatch issues GET /api/v1/validate to Datadog with your API key. A green check confirms the API key is recognized; the app key is exercised separately the first time you actually fetch a trace.
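To reproduce the probe outside CritterWatch, for example while debugging a wrong Site value, an equivalent request can be built by hand. This is a sketch rather than the product's code: `BuildValidateRequest` is a hypothetical name, and it assumes the key is passed in the `DD-API-KEY` header as described in Datadog's API documentation.

```csharp
using System;
using System.Net.Http;

// Hypothetical helper: build (but do not send) the key validation request.
HttpRequestMessage BuildValidateRequest(string baseUrl, string apiKey)
{
    var request = new HttpRequestMessage(HttpMethod.Get, $"{baseUrl}/api/v1/validate");
    request.Headers.Add("DD-API-KEY", apiKey); // only the API key is checked here
    return request;
}
```

Sending it with `HttpClient.SendAsync` and checking `IsSuccessStatusCode` gives a rough equivalent of the green check. No app key is involved, which is why a passing probe says nothing about APM Read access.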

Common failure modes:

| What you see | Likely cause |
| --- | --- |
| Last Healthy never populates after a Test | API key wrong, or wrong Site value (the key is valid but you're hitting the wrong region) |
| Healthy probe passes but every trace query returns empty | App key is missing or lacks APM Read scope |
| Test returns HTTP 429 | Datadog rate limiting; back off and retry. Cap is per-org, not per-key |
| Test returns HTTP 5xx | Datadog incident; check status.datadoghq.com |

Operator-side caveats

  • Retention window: queries that go further back than your Datadog APM retention return empty results. CritterWatch shows "no traces found" rather than an error; check your Datadog org's retention if a trace you expected to see isn't there.
  • Sampling: if your services use head-based or tail-based sampling, some traces won't be in Datadog at all. CritterWatch can't tell the difference between "trace was sampled out" and "trace never existed".
  • Multi-region: one CritterWatch installation can register multiple DataDogTraceProvider instances pointing at different sites — e.g. "datadog-us" and "datadog-eu" — and bind individual services to whichever fits.
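A multi-region setup might be registered as in the fragment below. It is a configuration sketch, not a standalone program: it relies on the CritterWatch types from "How to wire it up" and assumes `AddCritterWatchTraceProvider` can be called once per name, as the Multi-region bullet implies. The `Datadog:Us:*` and `Datadog:Eu:*` configuration keys are hypothetical.

```csharp
// Sketch only: the configuration key names below are assumptions.
services.AddCritterWatchTraceProvider<DataDogTraceProvider, DataDogTraceProviderOptions>(
    "datadog-us",
    opts =>
    {
        opts.ApiKey = builder.Configuration["Datadog:Us:ApiKey"]!;
        opts.AppKey = builder.Configuration["Datadog:Us:AppKey"]!;
        opts.Site   = "us1";
    });

services.AddCritterWatchTraceProvider<DataDogTraceProvider, DataDogTraceProviderOptions>(
    "datadog-eu",
    opts =>
    {
        opts.ApiKey = builder.Configuration["Datadog:Eu:ApiKey"]!;
        opts.AppKey = builder.Configuration["Datadog:Eu:AppKey"]!;
        opts.Site   = "eu1";
    });
```

Each monitored service would then name its region's provider, for example `opts.TraceProvider("datadog-eu")`, following the pattern in "Per-service routing".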

Released under the MIT License.