Files
bee/bible-local/architecture/data-model.md
Michael Chus 7d2e904d14 Bring codebase into compliance with bible contracts (A–E)
A (hardware-ingest-json v2.8-2.9): remove sensor location fields from schema
and collector; tag HardwareMemory.Location as json:"-"; add PlatformConfig to
HardwareSnapshot.

B (no-hardcoded-vendors): consolidate PCI vendor IDs into collector/pci_vendors.go;
replace all vendor-name string checks in isGPUDevice, isNVIDIADevice, isMellanoxDevice,
isAMDGPUDevice, matchesGPUVendor (sat_overlay), and validateIsVendorGPU (page_validate)
with numeric vendor_id comparisons.

C (module-structure): split app/app.go (1413 lines) into app.go + app_format.go,
app_network.go, app_services.go, app_packs.go, app_install.go — no logic changes.

D (go-code-style): wrap bare return err in interfaceAdminState and
interfaceIPv4Addrs (platform/network.go) with fmt.Errorf context including
the interface name.

E (go-project-bible): add bible-local/architecture/data-model.md and
bible-local/architecture/api-surface.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-13 14:32:08 +03:00

138 lines
5.6 KiB
Markdown

# Data Model
The canonical output of `bee audit` is a `HardwareIngestRequest` JSON document accepted
by the Reanimator `/api/ingest/hardware` endpoint. The ingest endpoint uses a strict
decoder — unknown fields cause `400 Bad Request`.
Source of truth: `audit/internal/schema/hardware.go`
---
## Top-level: HardwareIngestRequest
```
HardwareIngestRequest
├── collected_at string RFC3339 UTC timestamp of collection
├── hardware HardwareSnapshot
├── runtime RuntimeHealth? from bee-runtime-preflight service
├── filename string?
├── source_type string?
├── protocol string?
└── target_host string?
```
`collected_at` is the primary sort key used by Reanimator to deduplicate ingests.
---
## HardwareSnapshot
All component arrays are `omitempty` — absent when the collector finds nothing.
| JSON key | Go type | Source |
|-------------------|----------------------------|------------------------------|
| `board` | HardwareBoard | dmidecode type 1/2 |
| `firmware` | []HardwareFirmwareRecord | dmidecode type 0/13 |
| `cpus` | []HardwareCPU | dmidecode type 4 |
| `memory` | []HardwareMemory | dmidecode type 17 |
| `storage` | []HardwareStorage | lsblk + nvme-cli + smartctl |
| `pcie_devices` | []HardwarePCIeDevice | lspci |
| `power_supplies` | []HardwarePowerSupply | ipmitool fru + sdr |
| `sensors` | *HardwareSensors | sensors -j |
| `event_logs` | []HardwareEventLog | ipmitool sel + journald |
| `platform_config` | *json.RawMessage | reserved, nil until used |
| `vroc_license` | *string | vroc-cli |
---
## Identity keys
Reanimator uses these fields to match components across successive audits:
| Component | Identity key |
|----------------|------------------------------------------------|
| Board | `board.serial_number` (required, never empty) |
| CPU | `serial_number` if present; else generated key |
| Memory DIMM | `serial_number` — absent DIMMs have `present: false` |
| Storage | `serial_number` if present; else `linux_device` from Telemetry |
| PCIe device | `bdf` (Bus:Device.Function address) |
| PSU | `slot` |
Components without a stable identity are still emitted but may not be matched across runs.
---
## HardwareComponentStatus (embedded in all components)
```go
type HardwareComponentStatus struct {
Status *string `json:"status,omitempty"` // OK | Warning | Critical | Unknown
ErrorDescription *string `json:"error_description,omitempty"`
}
```
Status is set by collectors and overwritten at render time by `ApplySATOverlay`
(latest SAT run results are always merged on top before display).
---
## HardwarePCIeDevice
The most enriched component type. Key fields:
| JSON key | Meaning |
|----------------------|------------------------------------------------|
| `bdf` | PCI address (identity key), e.g. `0000:4b:00.0` |
| `vendor_id` | Numeric PCI vendor ID (hex). Use this for classification — not `manufacturer`. |
| `device_id` | Numeric PCI device ID (hex) |
| `device_class` | Human-readable class, e.g. `VideoController` |
| `manufacturer` | String label from lspci — for display only |
| `model` | From nvidia-smi / rocm-smi — display name |
| `link_speed` | Current PCIe link speed, e.g. `Gen4` |
| `max_link_speed` | Max negotiated speed |
| `link_width` | Current lane count |
| `max_link_width` | Max lane count |
| `temperature_c` | From nvidia-smi / rocm-smi |
| `power_w` | Current power draw |
| `ecc_uncorrected_total` | Cumulative ECC uncorrected errors (NVIDIA) |
| `ecc_corrected_total` | Cumulative ECC corrected errors (NVIDIA) |
| `hw_slowdown` | HW throttle active (NVIDIA) |
| `telemetry` | Free-form map for vendor-specific extras |
**Classification rule**: use `vendor_id` (numeric PCI ID), never `manufacturer` string.
| Vendor | vendor_id |
|-----------|-----------|
| NVIDIA | `0x10de` |
| AMD | `0x1002` |
| Mellanox | `0x15b3` |
| Aspeed | `0x1a03` |
| Intel | `0x8086` |
Constants live in `audit/internal/collector/pci_vendors.go`.
---
## HardwareMemory
`location` field exists in the Go struct with `json:"-"` — it is intentionally excluded
from JSON output because the Reanimator schema does not include it. It is used internally
for DIMM telemetry matching only (`collector/memory_telemetry.go`).
---
## HardwareSensors
Sensor structs (`HardwareFanSensor`, `HardwareTemperatureSensor`,
`HardwarePowerSensor`, `HardwareOtherSensor`) do **not** have a `location` field.
Location was removed in contract v2.8. The Go types mirror the schema exactly.
---
## JSON naming convention
All JSON keys are `snake_case`. Go field names are `CamelCase`. The mapping is
maintained by struct tags in `audit/internal/schema/hardware.go`.
All pointer fields use `omitempty` — absent means not collected (not zero).