
What Gets Reported

This page describes what Wolverine.CritterWatch publishes from your service to the console — useful when telemetry isn't flowing as expected, or when you want to know what kind of data is leaving the process.

Publish cadence

| Surface | Cadence | Notes |
| --- | --- | --- |
| Telemetry batch | Every 1 second | Always publishes, even if nothing changed (heartbeat). |
| Heartbeat ping | Every 30 seconds | Drives the per-node liveness dot in the UI. |
| Agent health probe | Every 60 seconds | Active probe; catches silent agent failures. |
| Broker health probe | ~60 seconds | One probe per configured transport. |
| Capability snapshot | On startup, then on Wolverine reinit | Topology: handlers, endpoints, stores, tenancy. |
| Source code (handler / HTTP chain) | On demand | Returned when an operator opens the corresponding detail page. |

The 1-second batching interval is the dominant source of latency; see Architecture → Message Flow for the full picture.

What's in a telemetry batch

Each batch carries:

  • Service identity — service name, label, Wolverine version.
  • Endpoint snapshot — every listener and sender with its current status (Accepting / Stopped / TooBusy / Latched / Paused / Draining), transport type, mode.
  • Subscription / handler catalog — every message type the service handles or publishes, with handler bindings and routing.
  • Recent changes — node added/removed, agent started/stopped, leadership change, circuit breaker tripped/reset, back pressure triggered/lifted, exceptions, since the last batch.
  • Agent health snapshot — Healthy / Degraded / Offline for each registered agent.
  • Persistence counts — inbox, outbox, scheduled, handled, and dead-letter counts per durability store. Per-tenant for multi-tenant services.
  • Shard states — current sequence and high-water mark for each projection shard.
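As a reading aid, the fields listed above can be modeled as a plain data structure. This is a minimal sketch in Python, not the library's wire format: every class and field name below is hypothetical, and only the status values and count categories come from the list above. The `lag` property is an assumed convenience showing how sequence and high-water mark relate.

```python
from dataclasses import dataclass, field
from enum import Enum


class EndpointStatus(Enum):
    # Status values as listed in this page.
    ACCEPTING = "Accepting"
    STOPPED = "Stopped"
    TOO_BUSY = "TooBusy"
    LATCHED = "Latched"
    PAUSED = "Paused"
    DRAINING = "Draining"


class AgentHealth(Enum):
    HEALTHY = "Healthy"
    DEGRADED = "Degraded"
    OFFLINE = "Offline"


@dataclass
class EndpointState:
    uri: str  # hypothetical identifier for the listener or sender
    transport: str
    mode: str
    status: EndpointStatus


@dataclass
class PersistenceCounts:
    inbox: int = 0
    outbox: int = 0
    scheduled: int = 0
    handled: int = 0
    dead_letter: int = 0


@dataclass
class ShardState:
    sequence: int
    high_water_mark: int

    @property
    def lag(self) -> int:
        # Assumed derived value: how far the shard trails the high-water mark.
        return self.high_water_mark - self.sequence


@dataclass
class TelemetryBatch:
    service_name: str
    label: str
    wolverine_version: str
    endpoints: list[EndpointState] = field(default_factory=list)
    message_types: list[str] = field(default_factory=list)
    recent_changes: list[str] = field(default_factory=list)
    agent_health: dict[str, AgentHealth] = field(default_factory=dict)
    persistence: dict[str, PersistenceCounts] = field(default_factory=dict)
    shards: dict[str, ShardState] = field(default_factory=dict)

    def is_heartbeat_only(self) -> bool:
        # A batch with no recent changes still publishes and acts as a heartbeat.
        return not self.recent_changes
```

Note what the model leaves out: there is no field for message bodies or credentials, which matches the exclusions described in the next section.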

What's not in a telemetry batch

  • Message bodies. Bodies are only sent on demand when an operator opens a specific dead-letter or scheduled message for inspection.
  • Database connection strings. Database URIs (host + database name) are reported for identification; credentials are not.
  • Application data. Your domain events, aggregates, and read models stay in your service's database.
  • Application logs / traces. CritterWatch isn't an APM. For traces, configure an OpenTelemetry trace provider in Settings → Trace Providers.
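To illustrate the connection-string rule above, here is a minimal sketch of reducing a `key=value;` style connection string to host and database name only. It is an illustration of the idea, not the library's implementation: the function name and the set of recognized keys are assumptions, and the parser does not handle quoted values that contain semicolons.

```python
def database_identity(connection_string: str) -> dict[str, str]:
    """Keep only the host and database name; drop everything else."""
    host_keys = {"host", "server", "data source"}      # assumed key aliases
    database_keys = {"database", "initial catalog"}    # assumed key aliases

    identity: dict[str, str] = {}
    for part in connection_string.split(";"):
        key, separator, value = part.partition("=")
        if not separator:
            continue  # skip empty or malformed segments
        key = key.strip().lower()
        if key in host_keys:
            identity["host"] = value.strip()
        elif key in database_keys:
            identity["database"] = value.strip()
        # Username, password, port, and any other key are never copied.
    return identity
```

The design choice worth noting is that this is an allow-list: unknown keys are dropped by default, so a new credential-bearing key cannot leak just because nobody added it to a block-list.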

When telemetry stops flowing

If a service goes silent in the UI (the heartbeat dot turns red), the most likely causes, in order, are:

  1. Process is gone. Crashed, killed, or shut down without graceful shutdown. The next telemetry batch never publishes. Check the host's logs.
  2. Broker is unreachable from the service. Telemetry never reaches the transport. Check broker connectivity from the service's network namespace.
  3. Console is unreachable and telemetry is queueing. The transport buffers messages, so the dashboard shows the service as silent until the console reconnects and drains the backlog.
  4. Service hung but process alive. A deadlock or runaway GC pause stops the publish timer. The 30-second heartbeat is the leading indicator — use the Per-node detail page to confirm.

The heartbeat dot turns amber at 60s and red at 150s, which gives you about 2.5 minutes to spot a stuck node before the UI calls it dead.
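The thresholds can be read as a simple classification on the time since the last heartbeat. The sketch below assumes the clock is measured from the last heartbeat the console received, that the healthy color is green, and that the boundaries are inclusive; none of these details are confirmed by this page, and the function name is hypothetical.

```python
AMBER_AFTER_SECONDS = 60   # from this page
RED_AFTER_SECONDS = 150    # from this page


def liveness_dot(seconds_since_last_heartbeat: float) -> str:
    """Classify a node by how long ago its last heartbeat arrived."""
    if seconds_since_last_heartbeat >= RED_AFTER_SECONDS:
        return "red"
    if seconds_since_last_heartbeat >= AMBER_AFTER_SECONDS:
        return "amber"
    return "green"  # assumed name for the healthy state
```

Under those assumptions, with a 30-second heartbeat ping, amber corresponds to two missed pings and red to five.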

Capability snapshot

On startup, the service advertises its full topology to the console:

  • Every registered message type and its handler binding.
  • Every messaging endpoint with its configuration (mode, buffering limits, circuit-breaker settings).
  • Every Wolverine durability store (inbox/outbox database).
  • Every Marten event store and document store the service uses.
  • Multi-tenancy mode and the current tenant list (for dynamic tenancy).
  • The Wolverine assembly version.

The snapshot replaces the prior shape wholesale on each rollout. So if you redeploy with a model change, the new shape appears the moment the new version checks in — there's no merge or migration logic to worry about.

The snapshot is also re-issued whenever the Wolverine runtime is reinitialized (e.g., after a hot reload during development).
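The wholesale-replacement behavior can be pictured as a plain overwrite keyed by service. This is a toy model of the console side with hypothetical names (`CapabilityRegistry`, `check_in`, `topology`), intended only to show that nothing from the previous snapshot survives a new check-in.

```python
class CapabilityRegistry:
    """Toy console-side store: one snapshot per service, no merging."""

    def __init__(self) -> None:
        self._snapshots: dict[str, dict] = {}

    def check_in(self, service_name: str, snapshot: dict) -> None:
        # Overwrite rather than update: keys missing from the new snapshot
        # disappear along with the old one.
        self._snapshots[service_name] = dict(snapshot)

    def topology(self, service_name: str) -> dict:
        return self._snapshots.get(service_name, {})


registry = CapabilityRegistry()
registry.check_in("orders", {"handlers": ["PlaceOrder", "CancelOrder"], "stores": ["main"]})
registry.check_in("orders", {"handlers": ["PlaceOrder", "RefundOrder"]})
```

After the second check-in, `registry.topology("orders")` contains only the new handler list; the removed `CancelOrder` handler and the `stores` key are gone without any cleanup step.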

Graceful shutdown

On IAsyncDisposable.DisposeAsync(), the observer publishes one final telemetry batch tagged as a shutdown, cancels the periodic timers, and waits for in-flight publishes to complete. This produces a clean "service stopped" timeline entry rather than a heartbeat-timeout-induced "service silent" entry.

If your service is killed without graceful shutdown (SIGKILL, OOM kill, container forced termination), the final batch is lost — the service goes silent and the heartbeat dot transitions to red after the threshold.
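The shutdown sequence described above (final batch, cancel timers, wait for in-flight publishes) can be modeled with a small asyncio sketch. This is a Python stand-in for the ordering only, not the .NET implementation: the `Observer` class, its method names, and the batch dictionaries are all hypothetical.

```python
import asyncio


class Observer:
    """Toy model of a periodic publisher with a graceful shutdown path."""

    def __init__(self, interval: float = 1.0) -> None:
        self.interval = interval
        self.published = []      # batches that reached the "transport"
        self._timer = None       # the periodic publish task
        self._in_flight = set()  # publishes started but not yet finished

    async def _publish(self, batch: dict) -> None:
        await asyncio.sleep(0)   # stands in for the transport send
        self.published.append(batch)

    def _send(self, batch: dict) -> None:
        task = asyncio.create_task(self._publish(batch))
        self._in_flight.add(task)
        task.add_done_callback(self._in_flight.discard)

    async def _tick(self) -> None:
        while True:
            self._send({"kind": "telemetry"})
            await asyncio.sleep(self.interval)

    def start(self) -> None:
        self._timer = asyncio.create_task(self._tick())

    async def dispose(self) -> None:
        # 1. Publish one final batch tagged as a shutdown.
        self._send({"kind": "shutdown"})
        # 2. Cancel the periodic timer so no further batches are started.
        if self._timer is not None:
            self._timer.cancel()
            await asyncio.gather(self._timer, return_exceptions=True)
        # 3. Wait for publishes that are still in flight.
        if self._in_flight:
            await asyncio.gather(*self._in_flight)


async def demo() -> Observer:
    observer = Observer(interval=0.01)
    observer.start()
    await asyncio.sleep(0.03)
    await observer.dispose()
    return observer
```

Running `asyncio.run(demo())` leaves the shutdown batch as the last published entry. A hard kill corresponds to `dispose()` never being called, in which case no such entry exists and only the heartbeat timeout reveals the stop.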

Custom transports

The default integration uses RabbitMQ. The library works with any Wolverine-supported transport, but RabbitMQ is recommended for production:

  • Reliable message delivery — telemetry survives a console outage.
  • Decouples services from console availability.
  • Standard tooling for observing the queue depth on the console side.

The in-memory transport is fine for development and tests but loses messages if either process restarts. The SQL Server transport works but has higher persistence overhead than RabbitMQ at this telemetry volume.

Released under the MIT License.