Logging

This page explains why Protean's logging works the way it does. For task-oriented instructions see the Logging guide; for a factual enumeration of config keys, framework loggers, and event schemas see the Logging reference.

Why structured logs?

The 12-factor treatment of logs as an event stream has been conventional wisdom for more than a decade, but most codebases still produce human-formatted strings and call it a day:

# The old way — looks readable, queries like grep
logger.info(f"Refunded ${amount} for order {order_id} because {reason}")

When a customer reports a missing refund, you open Loki or Elasticsearch and search. You don't know whether people wrote "Refunded" or "Refund issued" or "Processing refund"; you don't know whether the amount comes before or after the order id; you don't know whether the reason is free text or a code. Every query is an archaeology dig.

Structured logs invert that:

# The Protean way — event name is stable, fields are queryable
logger.info("payment_refunded", order_id="ord-123", amount=19.99, reason="customer_request")

event=payment_refunded AND reason=customer_request is a viable query. avg(amount) by reason is a viable aggregation. count by hour draws a time series. Strings are opaque; fields are data.

Protean commits to this pattern throughout — every hot-path event emitted by the framework uses a stable name and structured fields, and the get_logger() API encourages application code to do the same.

The wide event pattern

Structured logs alone still leave an observability gap: when something goes wrong inside a handler, you typically have a dozen scattered log lines — "started processing", "loaded aggregate", "validated input", "raised event", "saved aggregate", "committed UoW" — and you have to reconstruct the full story by filtering on correlation_id.

The wide event pattern replaces that stream of thin events with one rich event per unit of work. The term itself was popularised by Jamie Brandon's Logging Sucks, So Use Wide Events, and the same pattern was independently codified by Stripe engineering as the canonical log line:

We emit one log line for every request our application handles. The log line contains every field we might want to query on: request id, user id, response time, status code, billing plan, A/B test cohorts, and so on.

That one log line answers:

"Which requests failed in the last 10 minutes?"
"What's the p99 latency for premium users on the checkout endpoint?"
"Which feature-flag cohort shows the highest error rate?"

…without joining log lines or re-running the query over a dozen thin events.

Designing fields for queryability

The practical insight: choose log fields by asking "what questions will operators ask at 3 AM?" not "what is my code doing right now." That reframes debugging from archaeology — grep for strings until a pattern emerges — to analytics — run an aggregation on a queryable schema.

Examples of queries wide events enable when the right fields are bound:

"Show all payment failures for premium users in the last hour grouped by error code."
"What's the p99 handler duration for PlaceOrder commands where coupon_applied is not null?"
"Which aggregates have the highest events_raised_count? (potential aggregate design smell.)"
"Which correlation chains span more than five handlers? (long-running flows worth inspecting.)"

If your wide event can answer those questions, you've picked the right fields. If it can't, you're still doing archaeology.

The stdlib → structlog pipeline bridge

Protean uses structlog for its processor chain but keeps stdlib logging.getLogger(__name__) in framework internals. The two worlds meet in a single structlog.stdlib.ProcessorFormatter installed on every stdlib handler:

flowchart LR
    A["Stdlib<br/><code>logging.getLogger(__name__)</code>"] -->|LogRecord| B[ProcessorFormatter]
    C["Structlog<br/><code>get_logger(__name__)</code>"] -->|event dict| D[structlog pipeline]
    B -->|foreign_pre_chain| E[shared processor chain]
    D --> E
    E --> F{PROTEAN_ENV}
    F -->|production / staging| G[JSONRenderer]
    F -->|development / test| H[ConsoleRenderer]

The shared processor chain applies correlation injection, OpenTelemetry trace context (when telemetry is enabled), caller-supplied processors, tail sampling, and redaction — in that order — before the renderer produces the final output. This means a stdlib call like logger.error("db.query_failed", extra={"stmt": stmt}) and a structlog call like get_logger(__name__).error("db.query_failed", stmt=stmt) emit identical JSON in production.

See ADR-0010 §1 for why this bridge is preferred over mass-migrating every internal module to structlog.

How Protean builds wide events automatically

No other Python framework can auto-populate domain fields like aggregate, events_raised, or uow_outcome because no other framework models those concepts at the framework layer. Protean builds the wide event from two sources:

A framework layer, always present with no code required — handler identity, duration, aggregate, events raised, repository operations, UoW outcome, correlation, and (when telemetry is enabled) trace context. See the reference for the exhaustive field list.
An application layer, whatever the handler binds via bind_event_context() — user tier, order total, feature flags, tenant plan.

The two layers merge into a single event emitted when the handler returns. The framework keeps ownership of its fields: application keys that collide with a framework-reserved name are dropped before emission, so the meaning of handler or uow_outcome never depends on what the caller happened to bind.

Query-oriented field design

When choosing what to bind_event_context(), favour fields that:

Are queryable dimensions, not free text. Prefer user_tier="premium" to note="premium user doing checkout".
Are stable. If a value changes shape every deployment, it's a poor aggregation key.
Are low enough cardinality to group by, high enough to be useful. tenant_id is great; uuid_of_every_request is not (unless your backend handles it — see below).
Name the business concept the operator will ask about. coupon_applied=True beats flags.coupon=1 because the former reads cleanly in a filter.

Leave out fields that only matter inside the handler itself — those belong in DEBUG-level thin events, not the wide event.

High-cardinality fields and backend choice

Wide events produce data that legacy log backends struggle with. A single event carries a request id, a user id, an aggregate id, a correlation id, possibly a tenant id — every one of them high-cardinality by design, because they're exactly what an operator filters by.

Syslog-era stores built for low-cardinality tagging don't index these well; queries over them fall back to full scans. Modern columnar log stores — Loki with structured metadata, ClickHouse, BigQuery, CloudWatch Logs Insights, Datadog Log Analytics — handle high cardinality natively.

Protean does not recommend a specific vendor. The heuristic is: if your logging aggregator makes filter user_id=X or top by aggregate_id queries slow, the bottleneck is the backend, not the event shape. Fix that first before trimming fields out of the wide event.

How Protean's logging differs from Django and FastAPI

Feature	Django	FastAPI (ASGI + uvicorn)	Protean
Structured logs by default	No (stdlib messages)	No (uvicorn access log is plain-text)	Yes (`structlog` pipeline on by default)
Wide event per request	No	No	Yes (`protean.access`)
Domain context in logs	Manual	Manual	Auto (aggregate, events, UoW outcome, correlation)
Correlation ID in logs	Manual	Manual	Auto
OpenTelemetry trace context in logs	Manual	Manual	Auto when `telemetry.enabled = true`
`dictConfig` support	Yes	Yes	Yes (via `--log-config` or `dict_config=`)
Redaction	Manual	Manual	Built-in processor with default keys
Multi-worker hygiene	N/A (single process)	Manual	Built-in `QueueHandler` / `QueueListener` in `--workers N` mode

The gap Protean closes is the domain-aware one: aggregate, events raised, UoW outcome. No other Python framework can populate those automatically because no other framework models them. The commodity features (structured logs, dictConfig, correlation filter) are also on by default — where Django and FastAPI leave them to the operator.

The four sources of context

Every resolved setting comes from one of four places. The precedence, from highest to lowest:

Explicit keyword arguments to Domain.configure_logging() or configure_logging(). Callers always win.
Environment variables (PROTEAN_LOG_LEVEL, PROTEAN_ENV). Useful for container orchestration.
domain.toml [logging] section. The declarative surface operators tune during deployment.
Environment-based defaults. PROTEAN_ENV=production picks INFO + JSON; development picks DEBUG + console; test picks WARNING + console.

This precedence matches every other resolvable setting in Protean (server, telemetry, databases) so operators don't need to memorize a per-subsystem rule. It also means the right answer to "what will my deployment see?" is always: check the kwargs, then the env vars, then domain.toml, then the environment.

How correlation propagates to logs

The ProteanCorrelationFilter (stdlib) and protean_correlation_processor (structlog) read g.message_in_context.metadata.domain.correlation_id first, then fall back to g.correlation_id. The fallback is the documented extension point for HTTP middleware, CLI commands, and background jobs that tag context before any domain message exists.

Both are safe no-ops outside a domain context: they return "" for both fields so formatters referencing %(correlation_id)s never raise KeyError. That no-op semantics is the reason Protean can attach the filter to the root logger unconditionally — it costs one attribute set per record and changes nothing about the output shape when context is absent.

For the full correlation story across HTTP, events, subscribers, and OTel spans, see Correlation and Causation IDs.

How trace context reaches logs

OpenTelemetry trace context (trace_id, span_id, trace_flags) is injected by OTelTraceContextFilter and protean_otel_processor. Both are installed only when telemetry.enabled = true — when telemetry is disabled, the structlog chain has one fewer processor and the root logger has one fewer filter, so the hot path pays zero cost.

The processor lazily resolves OpenTelemetry symbols on first access and caches the result. If opentelemetry is not installed (the telemetry extra is optional), the processor emits empty values and never retries the import.

This design enables the log ↔ trace jump in every APM tool: click a trace in Jaeger or Datadog, copy its trace_id, paste it into Loki or Elasticsearch, see every log record from the same request. Without the trace_id field on log records, that jump requires custom middleware.

See ADR-0007 and ADR-0008 for the OTel architecture this layer builds on.

Why redaction is processor-based

Call-site redaction — "be careful what you log!" — never survives contact with the real world. Someone, somewhere, forgets. A stacktrace inlines a request body. A new feature logs more than its predecessor. A well-meaning debug line ships to production.

The only pattern that survives is redaction as a pipeline stage: any mention of a sensitive key name in any log event is masked before rendering, regardless of where in the pipeline the value was introduced. Protean's redaction processor runs last in the structlog pipeline — after every caller-supplied processor — so operator-supplied processors cannot smuggle sensitive values past it, whether intentionally or by accident.

The redact list is unioned with the built-in defaults, not replaced. Operators can only widen the list — never narrow it. You cannot turn off password masking by supplying your own list.

Redaction is a hygiene filter, not a security boundary: don't store secrets where their redaction is the last line of defence. Authentication tokens belong in a vault, not a best-effort log filter.

Multi-worker hygiene

stdout writes from separate processes are not atomic beyond PIPE_BUF bytes — typically 4096 on Linux, 512 on BSD. A structured JSON log record easily exceeds either. Without coordination, concurrent worker output interleaves mid-record, and the resulting lines are unparseable.

protean server --workers N (with N > 1) solves this the way Python's stdlib recommends: each worker installs a QueueHandler as its sole root handler; the supervisor runs a QueueListener that drains the queue and forwards records to the supervisor's real handlers. Records cross the process boundary through a multiprocessing.Queue, which preserves record integrity.

The listener is stopped in a finally block on shutdown so buffered records are flushed before the supervisor exits. Single-worker mode is unchanged — no queue overhead.

Two layers of wide events

Protean emits wide events at two layers, deliberately kept as separate loggers so operators can route and sample them independently:

Logger	Answers	Emitted from
`protean.access`	"What did my handler do?" — aggregate, events raised, repo operations, UoW outcome.	Every command / event / query / projector handler dispatch.
`protean.access.http`	"What happened at the HTTP boundary?" — method, path, status, duration, commands dispatched.	FastAPI requests routed through `DomainContextMiddleware`.

A single HTTP request that dispatches three commands produces one access.http_completed event and three access.handler_completed events. All four share the same correlation_id, so operators can pull the full thread with a single log-aggregator query:

{logger=~"protean.access.*"} | json | correlation_id="req-abc-123"

Why the split

A unified protean.access logger covering both layers was considered and rejected. Operational concerns differ:

HTTP 4xx rate limits and domain-level handler failures are different problems with different alerting rules. Routing them under one logger name would force operators to re-split them by regex at ingestion time.
HTTP requests are often higher-cardinality than domain operations (scanners, bots, rate-limit probes). Sampling aggressively at the HTTP layer while keeping every domain handler makes more sense as two independent dials.

Because protean.access.http is nested under protean.access in the Python logger hierarchy, filters attached to the parent (like the tail sampling filter) naturally apply to both — the split doesn't cost operators a second place to configure sampling.

See the HTTP wide events guide for the full field schema of access.http_completed, and the logging reference for a factual lookup of every HTTP-layer field.

Enrichment flows across both layers

bind_event_context() called from a FastAPI endpoint reaches both the HTTP wide event and the domain wide event. Internally the API does two things: it binds via structlog.contextvars (picked up by the domain-layer emission) and mirrors the kwargs onto g._http_wide_event_extras (picked up by the middleware when the request finishes). The mirror is necessary because Starlette runs endpoint functions in a child asyncio task whose contextvars copy is not observable from the middleware's parent task.

Application code does not need to know this — one call, both layers. It is documented here so operators debugging field drift between the two layers know where to look.

Tail sampling: keep what matters at scale

Wide events are rich and expensive. At volume, storing one per message can outgrow your log backend's budget. But the naive fix — head sampling (decide at request entry whether to log it) — loses exactly the events you most want at exactly the wrong moment: a 2 % head sample drops 98 % of your production errors.

Protean uses tail sampling: the keep/drop decision happens after the unit of work completes, when status, duration_ms, and message_type are known. Operators specify rules:

Always keep errors — status="failed" or level >= ERROR.
Always keep slow requests — status="slow".
Always keep critical message types — glob match on message_type, for when a whole family of operations (Payment*, Auth*) must be kept regardless of outcome.
Random sample the rest at default_rate.

Rules run in order, first match wins. Every surviving event carries three metadata fields — sampling_decision, sampling_rule, sampling_rate — so downstream aggregators can compute accurate throughput (actual_count = sampled_count / sampling_rate). Without those fields, sampled logs silently lie about volume.

The safety defaults (always_keep_errors=true, always_keep_slow=true) are non-negotiable in the sense that a careless default_rate=0.001 combined with unset safety flags would drop the entire debugging signal. Operators can turn them off, but the defaults refuse to punish absentmindedness.

See the logging guide for enabling it, and the reference for the config schema.

What Protean deliberately does not do

Some capabilities that adjacent frameworks ship with are out of scope for Protean's logging:

Email-on-error handlers. Django's AdminEmailHandler is historical baggage; paging belongs in an alerting layer (PagerDuty, Opsgenie, Alertmanager) reading from the log backend, not a handler buried in the logging library.
An Observatory log viewer. Logs belong in log aggregators. The Observatory's value is live traces and event timelines — fighting Loki, Elasticsearch, and Datadog on their home turf is a losing strategy.
PII heuristics. Automatic PII detection is a research problem; the current state of the art is too flaky to enable by default without false positives that mask real signal. Operators explicitly list the keys they care about via [logging].redact.
Audit log persistence. Domain events already capture the append-only history of what happened. The logging layer is for debugging and observability, not business audit trails.

Keeping these out of scope is deliberate. A smaller, coherent logging story serves operators better than a large one that duplicates downstream tools.