ADR-001 — Bitemporal + vintage lineage everywhere
Status: Accepted (POC). Date: 2026-05-28.
Context
The substrate spans asset classes whose authoritative records are corrected after the fact: well production restated when EIA revises, treaty claims revised as losses develop, SFR appraisals re-marked, crypto exchange settlements amended. Any system that overwrites in place silently loses history. Two distinct temporal axes are actually in play:
- Valid time — when a fact was true in the world. "We held 1000 mcf on 2024-06-15."
- Transaction time — when our system knew that. "We recorded that holding at 09:00 UTC on 2024-06-16; on 2024-06-30 we corrected the qty to 1010."
Queries like "What did we think we owned on June 15, given what we knew on June 30?" require both axes — single-temporal modeling (just an as_of_date) can answer one form ("what we owned") but loses the other ("what we knew"). For a substrate whose JD calls out Shadow IBOR reconciliation, both are load-bearing: ops must reproduce the state of the database at any historical point to defend an investigation, an audit, or a restatement.
Derived series have an additional dimension: which vintage of upstream data and which version of the derivation formula produced this row. A vintage that's later re-derived must coexist with prior vintages, not replace them.
Decision
Full bitemporal tracking (valid_from, valid_to, txn_from, txn_to) on positions, trades, cash transactions, reconciliation runs, entity relationships. Valid-time + vintage on series_obs. Append-only per-class detail tables (position_well, position_treaty, position_sfr, position_crypto_deriv) — a new detail row per new position_id.
Corrections are issued as new rows, never UPDATEs. The original row gets txn_to set to the correction's txn_from. Currently-valid current-knowledge state is WHERE valid_to IS NULL AND txn_to IS NULL.
The pre-Cambrian timestamp for "always known" data is 2020-01-01T00:00:00Z (BT_FROM in synth modules); the marker for "current correction in progress" is NULL on both valid_to and txn_to.
Consequences
Positive:
- Demo Query 3 (txn-time as-of) is a one-line WHERE filter; no archive table, no journal-log replay.
- Reconciliation break carry-forward uses a
logical_break_keyderived from bitemporal fields; second-run showsn_breaks_new=0andn_breaks_carried=4without any application-level dedup. - Restatement and audit defense are first-class — every change is traceable to a known timestamp and source_ref.
Negative:
- Every position query needs
WHERE valid_to IS NULL AND txn_to IS NULL— easy to forget. Codex round 2 caught two places where I had onlyvalid_to IS NULL, which silently treated superseded rows as current. Mitigation: avw_position_currentview (Phase 2) so callers can't omit the filter. - Storage is higher (every correction is +1 row). Acceptable at POC scale (350 position rows); needs a TTL/archival policy at production scale.
- Query plans on bitemporal tables benefit from compound indices on
(instrument_id, holding_entity_id, valid_to, txn_to). DuckDB handles the POC volume without them; production deployment would add them.
Alternatives Considered
- Single-temporal (
as_of_dateonly). Loses txn-time entirely. Restatement workflow degenerates to either overwriting (loses history) or maintaining a separate audit journal table (doubles the surface area). - Event sourcing with a journal-log replay model. Pure event store as system of record, materialized views over the log. Stronger audit story but adds a replay-engine dependency, complicates ad-hoc SQL, and the substrate-first thesis ("anyone can query the canonical layer in plain SQL") suffers. Reasonable for a production rebuild; overkill for a POC and harder to demo.
- Bitemporal on positions only, single-temporal elsewhere. The asymmetry leaks: reconciliation breaks reference positions, so the break record needs to know which vintage of position it corresponds to. Either every neighbor table gets bitemporal too, or break records lie. Halfway bitemporal is worse than either extreme.