Introduction

CritterWatch is a purpose-built production monitoring and management console for distributed systems built on the Critter Stack — Wolverine, Marten, and Polecat. It delivers real-time visibility into service health, message flow, dead letter queues, event store projections, and alerting, all from a single unified interface.

The Problem

Modern distributed systems built on messaging, event sourcing, and CQRS patterns introduce operational challenges that general-purpose monitoring tools were never designed to address.

Event-driven architectures have unique blind spots. Event-sourced systems rely on projections and subscriptions to derive read models from append-only event stores. When a projection falls behind the high water mark or stalls entirely, the application continues to accept writes while read models silently go stale. Generic APM tools have no concept of projection lag, subscription health, or event stream continuity.
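
To make projection lag and stall detection concrete, here is a minimal sketch. It assumes a monitor that periodically samples the event store's high water mark and a projection's last processed sequence. The `ProgressSample` shape, the function names, and the 5-second threshold are hypothetical illustrations, not part of the CritterWatch or Marten APIs.

```python
from dataclasses import dataclass


@dataclass
class ProgressSample:
    """One observation of a projection's progress (hypothetical shape)."""
    timestamp: float           # seconds on any monotonic clock
    high_water_mark: int       # highest event sequence known to be committed
    projection_sequence: int   # last event sequence the projection processed


def lag(sample: ProgressSample) -> int:
    """Number of events the projection still has to process."""
    return max(0, sample.high_water_mark - sample.projection_sequence)


def is_stalled(previous: ProgressSample, current: ProgressSample,
               min_interval: float = 5.0) -> bool:
    """Treat a projection as stalled when it is behind, has made no
    progress between two samples, and enough time has elapsed."""
    elapsed = current.timestamp - previous.timestamp
    made_progress = current.projection_sequence > previous.projection_sequence
    return lag(current) > 0 and not made_progress and elapsed >= min_interval


# Writes keep arriving (the high water mark grows) while the projection
# stays at sequence 940, so the read model is silently going stale.
earlier = ProgressSample(timestamp=0.0, high_water_mark=1000, projection_sequence=940)
later = ProgressSample(timestamp=10.0, high_water_mark=1200, projection_sequence=940)
print(lag(later), is_stalled(earlier, later))
```

Comparing two samples rather than one separates a projection that is merely behind but catching up from one that has stopped moving altogether.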

Dead letter queues are a ticking clock. Every message-driven system produces failed messages. Whether caused by transient faults, schema mismatches, or application bugs, dead-lettered messages represent lost business transactions. Without centralized visibility, teams discover DLQ buildup through customer complaints rather than dashboards. Replaying or discarding those messages typically requires direct database access or custom scripts.

Circuit breakers and back pressure need real-time awareness. Wolverine provides sophisticated error handling — circuit breakers that pause listeners after repeated failures, back pressure that throttles producers when consumers fall behind. These mechanisms protect systems from cascading failure, but operators need to know when they activate, why, and whether the underlying condition has resolved.
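
The sketch below illustrates the general idea of a circuit breaker that pauses a listener after repeated failures. It is a deliberately simplified version that counts consecutive failures. It is not Wolverine's implementation, and the class name, thresholds, and method names are assumptions made for illustration.

```python
from typing import Optional


class ListenerCircuitBreaker:
    """Pauses a listener after too many consecutive failures (illustrative only)."""

    def __init__(self, failure_threshold: int = 3, pause_seconds: float = 30.0):
        self.failure_threshold = failure_threshold
        self.pause_seconds = pause_seconds
        self.consecutive_failures = 0
        self.paused_until: Optional[float] = None  # None means the listener is running

    def record_success(self) -> None:
        # Any success resets the failure streak.
        self.consecutive_failures = 0

    def record_failure(self, now: float) -> None:
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            # Trip the breaker: stop consuming until the pause elapses.
            self.paused_until = now + self.pause_seconds

    def is_paused(self, now: float) -> bool:
        if self.paused_until is not None and now >= self.paused_until:
            # Pause window is over: resume listening with a clean slate.
            self.paused_until = None
            self.consecutive_failures = 0
        return self.paused_until is not None


breaker = ListenerCircuitBreaker(failure_threshold=3, pause_seconds=30.0)
for t in (1.0, 2.0, 3.0):
    breaker.record_failure(now=t)

paused_soon_after = breaker.is_paused(now=4.0)    # inside the pause window
paused_much_later = breaker.is_paused(now=33.0)   # pause window has elapsed
print(paused_soon_after, paused_much_later)
```

The state an operator cares about is visible here: whether the breaker is tripped, when it tripped, and when the listener is expected to resume.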

Multi-tenant systems multiply complexity. Organizations running multi-tenant Wolverine applications need per-tenant visibility into message processing, projection health, and failure rates. A problem in one tenant's event stream should not require sifting through logs from all tenants to diagnose.
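
As a rough illustration of per-tenant visibility, the following sketch groups message outcomes by tenant and computes a failure rate for each one. The input shape (pairs of tenant identifier and success flag), the tenant names, and the function name are hypothetical and only show the aggregation idea.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def failure_rates_by_tenant(outcomes: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute failed / total per tenant from (tenant_id, succeeded) pairs."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for tenant_id, succeeded in outcomes:
        totals[tenant_id] += 1
        if not succeeded:
            failures[tenant_id] += 1
    return {tenant: failures[tenant] / totals[tenant] for tenant in totals}


# Made-up sample data: one tenant is failing half its messages,
# the other is healthy.
sample_outcomes = [
    ("acme", True), ("acme", False),
    ("globex", True), ("globex", True),
    ("acme", False), ("acme", True),
]
rates = failure_rates_by_tenant(sample_outcomes)
print(rates)
```

With rates broken out this way, a failing tenant stands out immediately instead of being averaged into a system-wide number.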

Message routing topology is invisible at runtime. Understanding which handlers process which messages — and how they perform — requires tracing through source code rather than observing the live system.

The Solution

CritterWatch connects directly to your Wolverine-based services through a lightweight observer library. It collects telemetry in real time, stores operational history using Marten event sourcing, and provides a rich web interface for both observing and controlling your distributed system.

Adding CritterWatch to an existing Wolverine application requires a single NuGet package and two lines of configuration. There are no external dependencies beyond the PostgreSQL database your Marten-based services already use.

Key Capabilities

| Capability | Description |
| --- | --- |
| Service Dashboard | Live health indicators, node/agent status, per-service metrics |
| DLQ Management | Query, filter, replay, edit-and-replay, batch operations |
| Projection Monitoring | Lag tracking, stall detection, rebuild/rewind controls |
| Alerting | Configurable thresholds, full event-sourced lifecycle, auto-resolve |
| Endpoint Management | Pause/restart listeners, circuit breaker visibility, buffer limits |
| Scheduled Messages | View, reschedule, cancel, or edit scheduled messages before delivery |
| Durability Monitor | Inbox/outbox sparklines, persistence queue depth |
| Multi-Tenancy | Dynamic tenant management, per-tenant DLQ and projections |
| Chaos Monkey | Controlled fault injection for resilience testing |
| Message Topology | Visual message routing graph, handler chain visualization |
| Activity Timeline | Real-time audit log of all system events and operator actions |

Architecture Overview

CritterWatch uses a hub-and-spoke architecture. Your monitored services run Wolverine.CritterWatch, a lightweight observer that publishes telemetry via RabbitMQ (or any Wolverine transport) at 1-second intervals. The CritterWatch server receives these updates, projects them into a Marten event store, and relays live updates to the browser over SignalR.

Released under the MIT License.