What Gets Reported
This page describes what Wolverine.CritterWatch publishes from your service to the console — useful when telemetry isn't flowing as expected, or when you want to know what kind of data is leaving the process.
Publish cadence
| Surface | Cadence | Notes |
|---|---|---|
| Telemetry batch | Every 1 second | Always publishes, even if nothing changed, so it doubles as a keep-alive. |
| Heartbeat ping | Every 30 seconds | Drives the per-node liveness dot in the UI. |
| Agent health probe | Every 60 seconds | Active probe; catches silent agent failures. |
| Broker health probe | Every ~60 seconds | One probe per configured transport. |
| Capability snapshot | On startup, then on Wolverine reinit | Topology — handlers, endpoints, stores, tenancy. |
| Source code (handler / HTTP chain) | On demand | Returned when an operator opens the corresponding detail page. |
The 1-second batching is the dominant latency — see Architecture → Message Flow for the full picture.
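To get a feel for how much periodic activity these cadences add up to, here is a minimal Python sketch that totals the periodic rows of the table per node per minute. The dictionary keys and the function name are illustrative only and are not identifiers from Wolverine.CritterWatch; the sketch also assumes the approximate broker probe interval is exactly 60 seconds.

```python
# Cadences taken from the table above, in seconds.
# Key names are hypothetical labels, not library identifiers.
CADENCE_SECONDS = {
    "telemetry_batch": 1,
    "heartbeat_ping": 30,
    "agent_health_probe": 60,
    "broker_health_probe": 60,  # assumption: "~60 seconds" treated as 60
}


def periodic_events_per_minute(transport_count: int = 1) -> float:
    """Estimate periodic publishes and probes per node per minute."""
    total = 0.0
    for surface, seconds in CADENCE_SECONDS.items():
        per_minute = 60 / seconds
        if surface == "broker_health_probe":
            # The table states one broker probe per configured transport.
            per_minute *= transport_count
        total += per_minute
    return total


print(periodic_events_per_minute(transport_count=1))
```

The 1-second telemetry batch accounts for almost all of the total, which is consistent with it being the dominant factor for latency as well.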
What's in a telemetry batch
Each batch carries:
- Service identity — service name, label, Wolverine version.
- Endpoint snapshot — every listener and sender with its current status (Accepting / Stopped / TooBusy / Latched / Paused / Draining), transport type, mode.
- Subscription / handler catalog — every message type the service handles or publishes, with handler bindings and routing.
- Recent changes — node added/removed, agent started/stopped, leadership change, circuit breaker tripped/reset, back pressure triggered/lifted, exceptions, since the last batch.
- Agent health snapshot — Healthy / Degraded / Offline for each registered agent.
- Persistence counts — inbox, outbox, scheduled, handled, and dead-letter counts per durability store. Per-tenant for multi-tenant services.
- Shard states — current sequence and high-water mark for each projection shard.
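As a rough mental model, the list above maps onto a record like the following Python sketch. Every class and field name here is hypothetical and chosen for illustration; the real wire format is not documented on this page. Only the status and health values come from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class EndpointStatus(Enum):
    ACCEPTING = "Accepting"
    STOPPED = "Stopped"
    TOO_BUSY = "TooBusy"
    LATCHED = "Latched"
    PAUSED = "Paused"
    DRAINING = "Draining"


class AgentHealth(Enum):
    HEALTHY = "Healthy"
    DEGRADED = "Degraded"
    OFFLINE = "Offline"


@dataclass
class PersistenceCounts:
    inbox: int = 0
    outbox: int = 0
    scheduled: int = 0
    handled: int = 0
    dead_letter: int = 0


@dataclass
class TelemetryBatch:
    # Service identity
    service_name: str
    label: str
    wolverine_version: str
    # Endpoint name -> current status
    endpoints: dict[str, EndpointStatus] = field(default_factory=dict)
    # Message types the service handles or publishes
    message_types: list[str] = field(default_factory=list)
    # Changes observed since the previous batch
    recent_changes: list[str] = field(default_factory=list)
    # Agent name -> health
    agent_health: dict[str, AgentHealth] = field(default_factory=dict)
    # Durability store name -> counts
    persistence: dict[str, PersistenceCounts] = field(default_factory=dict)
    # Shard name -> (current sequence, high-water mark)
    shard_states: dict[str, tuple[int, int]] = field(default_factory=dict)

    def has_changes(self) -> bool:
        """True when something changed since the last batch."""
        return bool(self.recent_changes)


batch = TelemetryBatch("orders", "Orders API", "0.0.0-example")
print(batch.has_changes())
```

A batch with an empty `recent_changes` list is still published, which matches the "always publishes" note in the cadence table.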
What's not in a telemetry batch
- Message bodies. Bodies are only sent on demand when an operator opens a specific dead-letter or scheduled message for inspection.
- Database connection strings. Database URIs (host + database name) are reported for identification; credentials are not.
- Application data. Your domain events, aggregates, and read models stay in your service's database.
- Application logs / traces. CritterWatch isn't an APM. For traces, configure an OpenTelemetry trace provider in Settings → Trace Providers.
When telemetry stops flowing
If a service goes silent in the UI (the heartbeat dot turns red), the most likely causes, in order, are:
- Process is gone. Crashed, killed, or shut down without graceful shutdown. The next telemetry batch never publishes. Check the host's logs.
- Broker is unreachable from the service. Telemetry never reaches the transport. Check broker connectivity from the service's network namespace.
- Console is unreachable and telemetry is queueing. The transport buffers messages. The dashboard will show the service as silent until the console reconnects and drains the backlog.
- Service is hung but the process is alive. A deadlock or runaway GC pause stops the publish timer. The 30-second heartbeat is the leading indicator; use the per-node detail page to confirm.
The heartbeat dot turns amber at 60 seconds and red at 150 seconds, which gives you about 2.5 minutes to spot a stuck node before the UI calls it dead.
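The threshold logic can be pictured with a small Python sketch. It assumes the 60-second and 150-second marks are measured from the last heartbeat received, that each boundary is inclusive, and that the healthy colour is green; none of these details are stated on this page, and the function name is hypothetical.

```python
AMBER_AFTER_SECONDS = 60   # from the text above
RED_AFTER_SECONDS = 150    # from the text above


def heartbeat_dot(seconds_since_last_heartbeat: float) -> str:
    """Map heartbeat age to a dot colour (illustrative model only)."""
    if seconds_since_last_heartbeat >= RED_AFTER_SECONDS:
        return "red"
    if seconds_since_last_heartbeat >= AMBER_AFTER_SECONDS:
        return "amber"
    return "green"  # assumption: the healthy colour is not named here


for age in (10, 75, 200):
    print(age, heartbeat_dot(age))
```

With a 30-second heartbeat cadence, an amber dot under this model means at least one expected ping has been missed outright, so it is worth investigating before the dot turns red.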
Capability snapshot
On startup, the service advertises its full topology to the console:
- Every registered message type and its handler binding.
- Every messaging endpoint with its configuration (mode, buffering limits, circuit-breaker settings).
- Every Wolverine durability store (inbox/outbox database).
- Every Marten event store and document store the service uses.
- Multi-tenancy mode and the current tenant list (for dynamic tenancy).
- The Wolverine assembly version.
The snapshot replaces the prior shape wholesale on each rollout. So if you redeploy with a model change, the new shape appears the moment the new version checks in — there's no merge or migration logic to worry about.
The snapshot is also re-issued whenever the Wolverine runtime is reinitialized (e.g., after a hot reload during development).
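The replace-wholesale behaviour can be illustrated with a toy console-side store in Python. `CapabilityStore` and its methods are hypothetical names for this sketch and say nothing about how the console is actually implemented; the point is only that a new snapshot overwrites the old one rather than being merged into it.

```python
class CapabilityStore:
    """Toy model: keeps exactly one snapshot per service."""

    def __init__(self):
        self._snapshots = {}

    def check_in(self, service_name, snapshot):
        # Overwrite, never merge: anything missing from the new
        # snapshot disappears from the stored topology.
        self._snapshots[service_name] = dict(snapshot)

    def topology(self, service_name):
        return self._snapshots[service_name]


store = CapabilityStore()
store.check_in("orders", {"handlers": ["PlaceOrder", "CancelOrder"]})
# A redeploy drops CancelOrder and adds RefundOrder.
store.check_in("orders", {"handlers": ["PlaceOrder", "RefundOrder"]})
print(store.topology("orders"))
```

After the second check-in, `CancelOrder` is gone without any explicit removal step, which is why a model change needs no migration on the console side.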
Graceful shutdown
On IAsyncDisposable.DisposeAsync(), the observer publishes one final telemetry batch tagged as a shutdown, cancels the periodic timers, and waits for in-flight publishes to complete. This produces a clean "service stopped" timeline entry rather than a heartbeat-timeout-induced "service silent" entry.
If your service is killed without graceful shutdown (SIGKILL, OOM kill, container forced termination), the final batch is lost — the service goes silent and the heartbeat dot transitions to red after the threshold.
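The shutdown sequence (final tagged batch, cancel timers, wait for in-flight publishes) can be sketched with Python's `asyncio`. This is an analogue for illustration, not the library's C# implementation: the `Observer` class, its `aclose` method, and the string tags are all hypothetical, and the transport send is replaced by a no-op await.

```python
import asyncio


class Observer:
    """Toy periodic publisher with a tagged final batch on shutdown."""

    def __init__(self, interval=1.0):
        self.interval = interval
        self.published = []       # tags of batches that were "sent"
        self._timer = None
        self._in_flight = set()

    async def _publish(self, tag):
        await asyncio.sleep(0)    # stand-in for the transport send
        self.published.append(tag)

    def _start_publish(self, tag):
        task = asyncio.create_task(self._publish(tag))
        self._in_flight.add(task)
        task.add_done_callback(self._in_flight.discard)

    async def _run_timer(self):
        while True:
            self._start_publish("telemetry")
            await asyncio.sleep(self.interval)

    def start(self):
        self._timer = asyncio.create_task(self._run_timer())

    async def aclose(self):
        # 1. Queue one final batch tagged as a shutdown.
        self._start_publish("shutdown")
        # 2. Cancel the periodic timer so no further batches start.
        if self._timer is not None:
            self._timer.cancel()
            try:
                await self._timer
            except asyncio.CancelledError:
                pass
        # 3. Wait for every in-flight publish to complete.
        if self._in_flight:
            await asyncio.gather(*self._in_flight)


async def demo():
    observer = Observer(interval=0.01)
    observer.start()
    await asyncio.sleep(0.05)
    await observer.aclose()
    return observer.published


published = asyncio.run(demo())
print(published[-1])
```

If the process is killed before `aclose` runs, step 1 never happens, which corresponds to the "service silent" case described above.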
Custom transports
The default integration uses RabbitMQ. The library works with any Wolverine-supported transport, but RabbitMQ is recommended for production for these reasons:
- Reliable message delivery — telemetry survives a console outage.
- Decouples services from console availability.
- Standard tooling for observing the queue depth on the console side.
The in-memory transport is fine for development and tests but loses messages if either process restarts. The SQL Server transport works but has higher persistence overhead than RabbitMQ at this telemetry volume.
