Transactional Outbox

Process managers in Nagare emit two kinds of side effects from a single command: events to persist on their own journal, and commands to dispatch to other aggregates. The outbox makes those two arrive together — either both committed to the database, or neither — and decouples delivery from the producing transaction.

Why

Without an outbox, a process manager that "persists events then awaits dispatch" is exposed to a silent message-loss window:

Events are committed to the journal.
The application crashes (or the grain deactivates, or the network fails) before the dispatch foreach completes.
The events are durable. The dispatches are gone.

The receiver never hears about them and the producer can't tell what was lost. This is the dual-write problem; the standard fix is to record the intent to dispatch in the same database transaction as the events, then deliver from that record asynchronously.

How it works

                        ┌──────────────────────────────┐
       Persist tx ──────▶  events table                │
       (Serializable)     nagare_outbox (rows pending) │
                        └──────────────────────────────┘
                                    │ commit
                                    ▼
                        ┌──────────────────────────────┐
       OutboxRunner ────▶  read pending → dispatch     │
       (hosted)            mark dispatched / dead /    │
                          retry with backoff           │
                        └──────────────────────────────┘
                                    │
                                    ▼
                        ┌──────────────────────────────┐
       Receiver ────────▶  IEventMetadata.DispatchId   │
                          dedupe via aggregate LRU     │
                        └──────────────────────────────┘

RelationalEventStore.Append opens one transaction, inserts events, inserts outbox rows, commits. Atomic by construction.
OutboxRunner polls (and reacts to push wake-up) for state='pending' rows whose next_attempt_at is in the past. For each, it deserialises the command + metadata and forwards via IDispatchSink.
The receiving aggregate sees IEventMetadata.DispatchId and checks its bounded LRU. If already applied, returns Replies.Accepted() without re-running the handler. This makes at-least-once delivery effectively exactly-once.
Outbox rows are stable; per-attempt history goes into nagare_outbox_attempts (append-only, FK cascade on delete).

Cross-database support

Capability	Postgres	MySQL 8	SQL Server	SQLite
Identity	`BIGSERIAL`	`BIGINT AUTO_INCREMENT`	`BIGINT IDENTITY`	`INTEGER PRIMARY KEY AUTOINCREMENT`
Idempotent insert on `dispatch_id`	`ON CONFLICT DO NOTHING`	`INSERT IGNORE`	`WHERE NOT EXISTS`	`INSERT OR IGNORE`
Pending index	partial `WHERE state='pending'`	composite	filtered `WHERE state='pending'`	partial `WHERE state='pending'`
Push wake-up	`LISTEN/NOTIFY`	poll	poll	poll
Leader election	`pg_try_advisory_lock`	`GET_LOCK`	`sp_getapplock @LockOwner='Session'`	not needed (single-process)

The schema is otherwise identical across backends. Switching between them is a configuration change.

Wiring

csharp

services.AddNagare();

// Backend storage (pick one)
services.AddPostgresEventStore<MyEvent>("MyEvents");
services.AddPostgresOutbox(connectionString);

// Or:
// services.AddSqliteEventStore<MyEvent>("MyEvents");
// services.AddSqliteOutbox();
//
// services.AddMySqlEventStore<MyEvent>("MyEvents");
// services.AddMySqlOutbox(connectionString);
//
// services.AddSqlServerEventStore<MyEvent>("MyEvents");
// services.AddSqlServerOutbox(connectionString);

// Runner + dispatch sink
services.AddOutboxRunner(opts =>
{
    opts.MaxAttempts = 10;
    opts.BatchSize = 32;
    opts.IdleDelay = TimeSpan.FromSeconds(30);
    opts.DispatchedRetention = TimeSpan.FromDays(30);
});
services.AddProcessOutboxSink(); // wraps the existing ICommandDispatcher

// Optional: health checks tagged "ready"
services.AddOutboxHealthChecks();

Producer-side: dispatch ids are deterministic

Process managers don't pick dispatch ids; the framework computes them as a UUIDv5 over (sourceProcess, sourceVersion, dispatchIndex). Combined with the UNIQUE constraint on dispatch_id, this means a producer replay (e.g. grain reactivation after a crash) re-attempts the insert idempotently — same id, conflict, no double-dispatch.

The dispatch id is also embedded in the command's IEventMetadata.DispatchId so the receiver can dedupe.

Receiver-side: aggregate dedupe

Aggregates carry a bounded FIFO set (default 1024 entries) of recently-applied dispatch ids in their in-memory state. The set is rehydrated on Initialize from the metadata of replayed envelopes. When a command arrives with a DispatchId already in the set, the aggregate short-circuits to Replies.Accepted() and persists nothing.

Capacity must comfortably exceed the outbox dispatch retention so a retried row is never older than the window. Tune via:

csharp

public class MyAggregate : Aggregate<MyCommand, MyEvent, MyState>
{
    protected override int DispatchDedupeCapacity => 4096;
}

Behaviour change to be aware of

ProcessGrain.Ask no longer awaits downstream aggregate calls. Once events + outbox rows are durably committed, Ask returns. Delivery happens asynchronously on the runner.

Anything that depended on synchronous downstream side effects from a process command should switch to:

An eventually-consistent assertion (read-your-own-writes via the projection), or
A subscription on the target aggregate's events.

RelationalEventStore.Append throws if you pass dispatches without an IOutboxStore registered — fail loud rather than silently lose messages.

Ordering guarantees

The runner is single-leader by design. Within one process manager, dispatches arrive in the order they were produced. Across process managers there is no global ordering guarantee — concurrent producers commit independently and the drainer picks rows by id, which is a fine-grained interleaving of every committer.

Do not depend on cross-process order. If a consumer needs it, redesign the events to be self-describing (carry the version / state needed) so out-of-order delivery is tolerable. See anti-patterns.md.

Operations

Concern	Where to look
Drainer is stalled	`OutboxLagHealthCheck` reports `Degraded`/`Unhealthy` based on age of oldest pending row
Dead-letters accumulating	`OutboxDeadLetterHealthCheck` reports based on `state='dead'` count
Trace a specific dispatch	`nagare.outbox.dispatch` Activity, tagged with `nagare.outbox.dispatch_id`
Throughput / outcome breakdown	`nagare.outbox.attempts` counter, tagged by outcome (`success`/`rejected`/`transient`/`poison`)
Per-attempt latency	`nagare.outbox.dispatch.duration_ms` histogram
Storage growth	retention sweep deletes `state='dispatched'` rows older than `DispatchedRetention` (default 30d); see `nagare.outbox.retention.deleted` counter

Failure modes

Scenario	What happens	What you should do
Persist tx fails	Events + outbox rows roll back together. No dispatch sent.	Producer retries the command; framework handles.
Drainer crashes mid-tick	Row stays `pending`; next leader (or restarted runner) re-claims and re-dispatches. Receiver dedupes.	Nothing — designed for this.
Receiver throws transient	Row stays `pending` with `attempts++` and `next_attempt_at` advanced by backoff.	Investigate target service health.
Receiver returns rejection	Row marked `dead`.	Triage via `nagare_outbox` + `nagare_outbox_attempts` history.
Retry budget exhausted	Row marked `dead` with `outcome='poison'`.	Same as above; triage and either re-enable or accept the loss.
Type can't be loaded by drainer	Row marked `dead` immediately (`outcome='poison'`).	Drainer needs the same assemblies as the producer. Common in microservice splits.
Connection drops	Leader-election lock auto-released; another node picks up next tick.	Nothing — designed for this.

Limits and explicit non-goals

One leader at a time. Multi-drainer mode (SELECT … FOR UPDATE SKIP LOCKED claim per source_process) is a future option for higher throughput; defer until single-leader is the bottleneck.
No cross-process ordering. As above.
No exactly-once on the wire. At-least-once delivery + receiver dedupe gives exactly-once effects, which is what callers actually need.
Per-backend leader-election connections need session-pooling-friendly drivers. PgBouncer in transaction-pooling mode breaks session-level advisory locks; use session pooling or talk to Postgres directly for the leader connection.

Transactional Outbox ​

Why ​

How it works ​

Cross-database support ​

Wiring ​

Producer-side: dispatch ids are deterministic ​

Receiver-side: aggregate dedupe ​

Behaviour change to be aware of ​

Ordering guarantees ​

Operations ​

Failure modes ​

Limits and explicit non-goals ​