# Runtime Flows And Invariants
## History-First Mutation Path
All state-changing mutations for assets/components must go through the history domain layer.
Sources covered:
- user registry edits
- hardware JSON ingest
- manual CSV ingest
History apply transaction invariants:
1. lock current projection row (`parts` or `machines`)
2. load latest snapshot (or bootstrap from projection)
3. apply validated patch/domain command
4. semantic dedupe (`after_hash == before_hash` => no-op, no event/snapshot/timeline)
5. write history event + snapshot
6. update projections
7. write timeline projection rows
Direct projection writes are allowed only inside history recompute/rebuild flows.
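
The apply steps above can be sketched in memory as follows. This is a minimal illustration, not the project's implementation: `Snapshot`, `Store`, `hashSnapshot`, and `Apply` are hypothetical names, row locking (step 1) and the surrounding DB transaction are omitted, and SHA-256 over JSON is only one possible way to compute `before_hash` and `after_hash`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// Snapshot is a hypothetical entity state; the real snapshot schema is not shown here.
type Snapshot map[string]string

// Store is an in-memory stand-in for the history tables and projections.
type Store struct {
	Latest     Snapshot // latest snapshot (step 2)
	Events     []string // history events (step 5)
	Projection Snapshot // `parts` or `machines` row (step 6)
	Timeline   []string // timeline projection rows (step 7)
}

// hashSnapshot returns a stable hash; encoding/json sorts map keys when marshaling.
func hashSnapshot(s Snapshot) string {
	b, _ := json.Marshal(s)
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// Apply runs steps 2-7 and returns false for a semantic no-op.
func (st *Store) Apply(eventType string, patch Snapshot) bool {
	src := st.Latest
	if src == nil {
		src = st.Projection // bootstrap from projection
	}
	before := Snapshot{}
	for k, v := range src {
		before[k] = v
	}
	after := Snapshot{}
	for k, v := range before {
		after[k] = v
	}
	for k, v := range patch {
		after[k] = v // step 3: apply the (already validated) patch
	}
	if hashSnapshot(after) == hashSnapshot(before) {
		return false // step 4: no event, no snapshot, no timeline row
	}
	st.Events = append(st.Events, eventType)     // step 5
	st.Latest = after                            // step 5
	st.Projection = after                        // step 6
	st.Timeline = append(st.Timeline, eventType) // step 7
	return true
}
```

Applying the same patch twice through `Apply` writes one event and one timeline row; the second call returns `false` and leaves the store unchanged.
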
## Event Time Source Priority
Prefer the component's own event time over ingest time whenever the payload provides one.
- `eventFallbackTime(actual, ingestedAt, collectedAt)`:
  1. `actual`
  2. `ingestedAt`
  3. `collectedAt`
- `collectedFallbackTime(collectedAt, ingestedAt)`:
  1. `collectedAt`
  2. `ingestedAt`
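
The two priority lists above could be implemented along these lines. The signatures are an assumption: the document only shows the function names and argument order, so the `*time.Time` parameters, the zero-time result when nothing is set, and the helpers `firstSet` and `at` are hypothetical.

```go
package main

import "time"

// firstSet returns the first candidate that is non-nil and non-zero,
// or the zero time when none is set.
func firstSet(candidates ...*time.Time) time.Time {
	for _, c := range candidates {
		if c != nil && !c.IsZero() {
			return *c
		}
	}
	return time.Time{}
}

// eventFallbackTime prefers actual, then ingestedAt, then collectedAt.
func eventFallbackTime(actual, ingestedAt, collectedAt *time.Time) time.Time {
	return firstSet(actual, ingestedAt, collectedAt)
}

// collectedFallbackTime prefers collectedAt, then ingestedAt.
func collectedFallbackTime(collectedAt, ingestedAt *time.Time) time.Time {
	return firstSet(collectedAt, ingestedAt)
}

// at is an example-only helper that builds a pointer timestamp on a fixed day.
func at(hour int) *time.Time {
	t := time.Date(2024, time.January, 1, hour, 0, 0, 0, time.UTC)
	return &t
}
```

For instance, `eventFallbackTime(nil, at(9), at(8))` yields the 09:00 ingest time, while `collectedFallbackTime(at(8), at(9))` yields the 08:00 collection time.
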
## Status Event Time Parsing Order
`parseComponentStatusEventTime` resolves time in this order:
1. `status_changed_at`
2. Latest matching `status_history` item for current status
3. Latest parseable `status_history` item
4. `status_checked_at`
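
A sketch of this resolution order is shown below. The payload structs, the RFC 3339 time format, and the reading of "latest" as "newest by timestamp" are assumptions made for illustration; only the function name `parseComponentStatusEventTime` and the field names come from this document.

```go
package main

import "time"

// StatusHistoryItem and ComponentStatus are hypothetical payload shapes.
type StatusHistoryItem struct {
	Status    string
	ChangedAt string
}

type ComponentStatus struct {
	Status          string
	StatusChangedAt string
	StatusCheckedAt string
	StatusHistory   []StatusHistoryItem
}

// parse accepts RFC 3339 only; the real parser may accept more formats.
func parse(s string) (time.Time, bool) {
	t, err := time.Parse(time.RFC3339, s)
	return t, err == nil
}

// latest returns the newest parseable history time. When matchStatus is
// non-empty, only items with that status are considered.
func latest(items []StatusHistoryItem, matchStatus string) (time.Time, bool) {
	var best time.Time
	found := false
	for _, it := range items {
		if matchStatus != "" && it.Status != matchStatus {
			continue
		}
		if t, ok := parse(it.ChangedAt); ok && (!found || t.After(best)) {
			best, found = t, true
		}
	}
	return best, found
}

// parseComponentStatusEventTime walks the four sources in priority order.
func parseComponentStatusEventTime(c ComponentStatus) (time.Time, bool) {
	if t, ok := parse(c.StatusChangedAt); ok {
		return t, true // 1. status_changed_at
	}
	if c.Status != "" {
		if t, ok := latest(c.StatusHistory, c.Status); ok {
			return t, true // 2. latest history item matching current status
		}
	}
	if t, ok := latest(c.StatusHistory, ""); ok {
		return t, true // 3. latest parseable history item
	}
	return parse(c.StatusCheckedAt) // 4. status_checked_at
}
```
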
## Failure Event Rules
For critical components:
- Timeline event type: `COMPONENT_FAILED`
- `failure_events.failure_time` uses resolved failure time (not raw ingest time)
- `failure_events.external_id` includes the same failure timestamp
- `failure_events` is a projection and may be rebuilt from component history (`COMPONENT_STATUS_SET`)
## First Seen Rules
`parts.first_seen_at` must be the earliest known ingest-derived component time.
Candidate sources:
1. Parseable `status_history[].changed_at`
2. `status_changed_at`
3. `status_checked_at`
4. `eventFallbackTime(nil, ingestedAt, collectedAt)`
Persistence rule:
- Keep the minimum value over time.
- If the incoming value is earlier than the stored value, overwrite the stored value with it.
- In history mode this is persisted as a component metadata correction (`COMPONENT_FIRST_SEEN_CORRECTED`), then projected into `parts.first_seen_at`.
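
The minimum-over-time rule can be sketched as below. The function names `earliest`, `mergeFirstSeen`, and `day` are hypothetical, and treating a zero `time.Time` as "no value" is an assumption of this sketch.

```go
package main

import "time"

// earliest returns the minimum non-zero time among the candidates.
func earliest(candidates ...time.Time) (time.Time, bool) {
	var best time.Time
	found := false
	for _, c := range candidates {
		if c.IsZero() {
			continue
		}
		if !found || c.Before(best) {
			best, found = c, true
		}
	}
	return best, found
}

// mergeFirstSeen returns the value to keep in parts.first_seen_at and
// whether the stored value has to be overwritten.
func mergeFirstSeen(stored time.Time, incoming ...time.Time) (time.Time, bool) {
	in, ok := earliest(incoming...)
	if !ok {
		return stored, false
	}
	if stored.IsZero() || in.Before(stored) {
		return in, true
	}
	return stored, false
}

// day is an example-only helper for building timestamps.
func day(d int) time.Time {
	return time.Date(2024, time.January, d, 0, 0, 0, 0, time.UTC)
}
```

With a stored value of `day(10)` and candidates `day(12)`, `day(7)`, `day(9)`, the result is `day(7)` with the overwrite flag set; a later ingest carrying only newer candidates leaves the stored value untouched.
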
## Duplicate Component Serial Rules (CSV + JSON Ingest)
If serial numbers are not unique within the same `p/n` (`model`) inside one ingest payload:
- First occurrence keeps original `vendor_serial`.
- Each subsequent duplicate occurrence is assigned a service serial placeholder:
  - Format: `NO_SN-XXXXXXXX` (8-digit zero-padded global counter).
- If `vendor_serial` is empty, a service serial placeholder is assigned as well.
- Counter is global for the whole application and stored in `id_sequences` under `entity_type = 'no_sn_placeholder'`.
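
The placeholder assignment might look like the following sketch. `Row`, `Counter`, and `assignSerials` are hypothetical names; the in-memory `Counter` stands in for the `id_sequences` row, and starting the counter at 1 is an assumption.

```go
package main

import "fmt"

// Row is a hypothetical ingest row; Model corresponds to the `p/n` (`model`) value.
type Row struct {
	Model        string
	VendorSerial string
}

// Counter stands in for the id_sequences row with entity_type = 'no_sn_placeholder'.
type Counter struct{ next int }

// Placeholder returns the next NO_SN-XXXXXXXX value (8 digits, zero-padded).
func (c *Counter) Placeholder() string {
	c.next++
	return fmt.Sprintf("NO_SN-%08d", c.next)
}

// assignSerials rewrites empty and repeated serials within one payload, in place.
func assignSerials(rows []Row, c *Counter) {
	seen := map[[2]string]bool{} // key: (model, vendor_serial)
	for i := range rows {
		key := [2]string{rows[i].Model, rows[i].VendorSerial}
		if rows[i].VendorSerial == "" || seen[key] {
			rows[i].VendorSerial = c.Placeholder()
			continue
		}
		seen[key] = true // first occurrence keeps its original serial
	}
}
```

Given rows `(M1, A)`, `(M1, A)`, `(M2, A)`, `(M1, "")`, the first and third keep serial `A` (different models do not collide), while the second and fourth receive `NO_SN-00000001` and `NO_SN-00000002`.
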
## Component Health Computation In UI
Component health is derived only from the latest status event among:
- `COMPONENT_FAILED`
- `COMPONENT_WARNING`
- `COMPONENT_OK`
Non-status timeline events (`INSTALLED`, `REMOVED`, `FIRMWARE_CHANGED`, `FIRMWARE_INSTALLED`, etc.) must not change health status.
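
The rule above amounts to scanning the timeline from newest to oldest and stopping at the first status event. The sketch below is written in Go for consistency with the rest of this document, although the UI may use another language; `TimelineEvent`, `healthFrom`, and the returned labels (`failed`, `warning`, `ok`, `unknown`) are hypothetical.

```go
package main

// TimelineEvent is a hypothetical row; events are ordered oldest to newest here.
type TimelineEvent struct {
	Type string
}

// healthFrom derives health from the latest status event and ignores
// every non-status event such as INSTALLED or FIRMWARE_CHANGED.
func healthFrom(events []TimelineEvent) string {
	for i := len(events) - 1; i >= 0; i-- {
		switch events[i].Type {
		case "COMPONENT_FAILED":
			return "failed"
		case "COMPONENT_WARNING":
			return "warning"
		case "COMPONENT_OK":
			return "ok"
		}
	}
	return "unknown" // no status event seen yet
}
```

A component whose history is `COMPONENT_OK`, `COMPONENT_FAILED`, `INSTALLED` stays `failed`: the later `INSTALLED` event does not reset its health.
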
## Firmware Timeline Rules
For component firmware observations:
- First observed version -> `FIRMWARE_INSTALLED` (asset + component timeline pair)
- Later version change -> `FIRMWARE_CHANGED` (asset + component timeline pair)
- In history mode, firmware observations are persisted as:
  - component: `COMPONENT_FIRMWARE_SET`
  - asset device firmware: `ASSET_FIRMWARE_DEVICE_SET`
Storage details:
- `FIRMWARE_INSTALLED` stores a transition string in `timeline_events.firmware_version`: `- -> <installed_version>`
- `FIRMWARE_CHANGED` stores the newly installed firmware value
Detection details:
- Previous observation lookup: `ORDER BY observed_at DESC, id DESC LIMIT 1 OFFSET 1`
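
The classification of a firmware observation can be sketched as a pure function of the previous and current versions. `firmwareEvent` is a hypothetical name, the empty string stands for "no previous observation", and emitting nothing for an unchanged version is an assumption that this section does not state explicitly.

```go
package main

// firmwareEvent returns the timeline event type and the value stored in
// timeline_events.firmware_version. An empty event type means no event.
func firmwareEvent(previous, observed string) (eventType, stored string) {
	switch {
	case previous == "":
		// First observed version: store the transition string.
		return "FIRMWARE_INSTALLED", "- -> " + observed
	case previous != observed:
		// Later version change: store the new value.
		return "FIRMWARE_CHANGED", observed
	default:
		return "", ""
	}
}
```

In the real flow the `previous` argument would come from the lookup query above, which skips the newest row (the observation just written) and returns the one before it.
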
## Install / Remove Flow (Cross-Entity)
Install/remove operations are applied as cross-entity history commands in one transaction.
Invariants:
- component and asset history events are both written
- both events share one `correlation_id`
- `installations` is updated as a projection
- asset/component timeline rows are emitted as paired projection events
## Log Collected Flow
- Hardware log collection is represented in history as `ASSET_LOG_COLLECTED`.
- UI/API timeline renders it as `LOG_COLLECTED`.
## Delete / Rollback / Hard Restore Flows
Mutating history operations are asynchronous and DB-backed (`history_recompute_jobs`).
- Delete event:
  - soft-delete logical history event (`is_deleted`)
  - mark linked timeline rows deleted
  - enqueue recompute job
- Rollback (compensating):
  - async job creates a new rollback history event (`*_ROLLBACK_APPLIED`)
  - original history remains intact
- Hard restore (admin):
  - async job physically deletes future history rows after target snapshot/version
  - operation is recorded in `history_admin_audit`
## Recompute Scope And Propagation
- Component recompute rebuilds component projections (`parts`, `timeline_events`, `installations`, `failure_events`).
- Asset recompute rebuilds asset projections (`machines`, `machine_firmware_states`, `timeline_events`) and then triggers recompute for linked components to restore cross-entity projection consistency.
- Asset delete/hard-restore propagates to correlated component history events via `correlation_id`.
## Timeline API Grouping
- History timeline endpoints group events by day by default.
- Default timezone for grouping is `UTC`; callers may override with `tz`.
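
Day grouping with a timezone override could be sketched as follows. `Event`, `groupByDay`, `utc`, and `offsetZone` are hypothetical; a real handler would likely resolve the `tz` parameter with `time.LoadLocation`, while this sketch uses fixed offsets so it does not depend on the host's timezone database.

```go
package main

import "time"

// Event is a hypothetical timeline row.
type Event struct {
	Type string
	At   time.Time
}

// groupByDay buckets events by calendar day in loc.
// A nil loc means UTC, the documented default.
func groupByDay(events []Event, loc *time.Location) map[string][]Event {
	if loc == nil {
		loc = time.UTC
	}
	groups := map[string][]Event{}
	for _, e := range events {
		day := e.At.In(loc).Format("2006-01-02")
		groups[day] = append(groups[day], e)
	}
	return groups
}

// utc is an example-only helper for building January 2024 timestamps.
func utc(day, hour int) time.Time {
	return time.Date(2024, time.January, day, hour, 0, 0, 0, time.UTC)
}

// offsetZone is an example-only helper returning a fixed UTC offset.
func offsetZone(hours int) *time.Location {
	return time.FixedZone("", hours*3600)
}
```

Two events at 23:00 UTC on January 1 and 01:00 UTC on January 2 fall into separate groups by default, but into the same January 2 group when grouped with a +03:00 offset.
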
## Timeline Color Semantics
- `REMOVED` -> yellow
- `COMPONENT_FAILED` -> red
- `COMPONENT_WARNING` and related warning semantics follow `timelineEventClass`
## Regression Guardrails
Do not reintroduce these regressions:
- Using ingest timestamp when payload provides better event/failure timestamp
- Letting `INSTALLED` mark failed components as healthy
- Missing `Previous Components` section on asset page
- Missing installation history on component page
- Missing firmware information on component page timeline
- Writing ingest state transitions directly to projections/timeline while bypassing history apply
- Creating duplicate history events for semantic no-op updates