ADR-0009: Concurrent Event Processing Strategy
| Field | Value |
|---|---|
| Status | Accepted |
| Date | 2026-03-24 |
| Author | Subhash Bhushan |
Context
A production incident in a HubSpot integration exposed a class of concurrency bugs in event-driven systems. Two webhooks -- Client Creation (CC) and Client-Contact Association Change (CCA) -- fired in rapid succession. Two workers picked up these events concurrently. The CC handler began creating a Client aggregate. Simultaneously, the CCA handler looked for the Client, didn't find it yet (CC hadn't committed), and created its own copy. Result: duplicate Client records.
This triggered a thorough analysis of how Protean handles concurrent event processing. The analysis identified four problem classes:
- Concurrent entity creation -- multiple events trigger "find or create" logic for the same aggregate
- Concurrent aggregate mutation -- multiple events modify the same aggregate simultaneously
- Causal dependency violation -- event B depends on side effects of event A, but B executes before A's effects are visible
- Duplicate processing -- the same event is delivered twice due to at-least-once delivery semantics
An industry survey of Axon (Java), Eventuous (.NET), Marten (.NET), and Wolverine (.NET) revealed three common characteristics in frameworks that handle concurrency well: they make the aggregate the serialization boundary, provide OCC as a safety net, and surface the concurrency model as an explicit developer decision.
The question was whether Protean needed new infrastructure (partitioned streams, handler-level idempotency stores, repository upsert primitives, new aggregate Meta options) or whether the existing building blocks already address these problem classes.
Decision
Protean already provides the building blocks for safe concurrent event processing. The framework will not add new concurrency infrastructure in the near term. Instead, it will fix the atomic race condition in existing OCC, add a lint rule for a common structural smell, and publish a connective pattern document.
Specifically:
1. OCC is always-on, not opt-in
Every aggregate already has a _version field (auto-managed,
starts at -1). The DAO's _validate_and_update_version() runs on every
save and raises ExpectedVersionError on mismatch. The @handle
decorator retries with exponential backoff (3 retries, 50ms-1s,
configurable via [server.version_retry]).
We will not add a concurrency_strategy = "optimistic" Meta option.
OCC is a correctness mechanism, not a configuration choice. Adding an
opt-in would imply that running without version checking is a valid
option -- it is not.
The existing atomic race condition (SELECT-compare-UPDATE instead of conditional UPDATE) will be fixed in 5.1 #793. This is a bug fix in existing infrastructure, not new infrastructure.
2. No find_or_create repository primitive
find_or_create pushes creation logic to the database level, bypassing
aggregate constructors, invariant checking, and event raising. This
conflicts with Protean's design philosophy: "Validation lives in the
domain layer (invariants), not the database layer."
The DDD-correct solutions for creation races are:
- Process Manager: coordinates the flow when events have causal dependencies (CC must complete before CCA can act)
- Combine into a single handler: if both events target the same aggregate, a single handler serializes processing naturally
- Unique constraints + retry: the database prevents duplicates
via unique constraints;
ExpectedVersionErrorretry handles the transient conflict
3. No framework-level event handler idempotency
We considered adding idempotent_by on event handlers with a
framework-managed idempotency store, but decided against it because:
- Idempotency is inherently a business-logic concern -- the "right" deduplication strategy depends on the operation (set-based, dedup-tracked, upsert)
- Protean already has comprehensive pattern documentation for idempotent event handlers (600+ lines covering three strategies)
- Command idempotency via
IdempotencyStoreis already built for the submission boundary where framework-level dedup makes sense
4. sequential_by deferred to R4
Within-handler sequential processing by partition key (sequential_by)
is a legitimate feature, but the simpler solutions (PM, combine handlers,
OCC retry) handle the vast majority of cases. The infrastructure changes
required (partitioned streams, modified event publishing path, consumer
group management changes) belong in the 5.2 Subscription Profiles &
Tuning epic (R4), not the near-term roadmap.
5. Lint rule for cross-cluster event handling
A new HANDLER_CREATES_FOREIGN_AGGREGATE lint rule (WARNING level) will
detect event handlers that handle events from a different cluster than
the one they belong to (part_of). This is a structural smell that often
indicates a Process Manager would be more appropriate. The rule is
IR-based (no AST analysis needed) and fits into 3.5.4 (handler
completeness rules).
6. Connective pattern documentation
A new pattern page "Designing for Concurrent Event Processing" will tie together the existing pattern docs (OCC, idempotent handlers, process managers, error classification) with the four problem classes and a worked example of the HubSpot scenario refactored three ways.
Consequences
Positive
-
No new infrastructure to maintain. The framework stays lean. Concurrency safety comes from composing existing primitives (OCC, PM, idempotent patterns), not from new mechanisms.
-
DDD philosophy preserved. Creation logic stays in the domain layer. Coordination logic stays in Process Managers. The framework guides developers toward correct patterns rather than providing database-level escape hatches.
-
Existing OCC gets stronger. The atomic fix in 5.1 #793 closes the last real gap in version checking.
-
Lint rule catches the structural smell early. Cross-cluster event handling without a PM is flagged before it causes production issues.
Negative
-
Developers must understand the patterns. There is no "just add
sequential_byand it works" shortcut. The pattern doc must be excellent to bridge this gap. -
sequential_byis deferred. For high-throughput systems with many handlers on the same stream, the only current option for sequential processing is combining handlers or using a PM. Partition-key-based routing comes in R4. -
No database-level upsert. Developers who want
find_or_createmust implement it in their adapter layer outside the framework, or use the recommended DDD patterns instead.
Alternatives Considered
Full infrastructure suite (6 phases)
We evaluated a comprehensive approach: concurrency_strategy Meta
option, find_or_create repository primitive, partitioned streams
(sequential_by), idempotent_by on event handlers, 6 lint rules,
and scaffolding CLI. This was declined because most of the mechanisms
either already exist (OCC, command idempotency), conflict with DDD
philosophy (find_or_create), or are premature for the current roadmap
stage (partitioned streams).
Framework-managed event deduplication
An idempotency store for events (similar to the existing command
IdempotencyStore) was considered. We decided against it because event
handler idempotency is a business-logic concern -- the right dedup
strategy varies by operation type, and the existing pattern docs already
provide comprehensive guidance.
CONCURRENT_ENTITY_HANDLERS lint rule
A lint rule matching handlers by shared field names (e.g., both handlers
have order_id) was considered. We opted for the more precise
HANDLER_CREATES_FOREIGN_AGGREGATE rule instead, which detects
cross-cluster event handling via IR topology and has a significantly
lower false-positive rate.