Update docs and add release artifacts
@@ -22,6 +22,9 @@ Key top-level fields:
| `sensors` | `[]SensorReading` | Sensor readings |
| `raw_payloads` | `map[string]any` | Raw vendor data (e.g. `redfish_tree`) |

`raw_payloads` is the durable source for offline re-analysis (especially for Redfish).
Normalized fields should be treated as derivable output from raw source data.

### Hardware sub-structure

```
@@ -31,6 +34,7 @@ HardwareConfig
├── cpus []CPU
├── memory []MemoryDIMM
├── storage []Storage
├── volumes []StorageVolume — logical RAID/VROC volumes
├── pcie_devices []PCIeDevice
├── gpus []GPU
├── network_adapters []NetworkAdapter
@@ -86,3 +90,15 @@ Carried by both `/api/status` and `/api/config`:

Valid `source_type` values: `archive`, `api`
Valid `protocol` values: `redfish`, `ipmi` (empty is allowed for archive uploads)

---

## Raw Export Package (reopenable artifact)

`Export Raw Data` does not merely dump `AnalysisResult`; it emits a reopenable raw package
(JSON or ZIP bundle) that carries the source data required for re-analysis.

Design rules:
- raw source is authoritative (`redfish_tree` or original file bytes)
- imports must re-analyze from raw source
- parsed field snapshots included in bundles are diagnostic artifacts, not the source of truth

@@ -46,15 +46,53 @@ Dynamic — does not assume fixed paths. Discovers:
The full Redfish response tree is stored in `result.RawPayloads["redfish_tree"]`.
This allows future offline re-analysis without re-collecting from a live BMC.

### Unified Redfish analysis pipeline (live == replay)

LOGPile uses a **single Redfish analyzer path**:

1. The live collector crawls the Redfish API and builds `raw_payloads.redfish_tree`
2. The parsed result is produced by replaying that tree through the same analyzer used by raw import

This guarantees that live collection and `Export Raw Data` re-open/re-analyze produce the same
normalized output for the same `redfish_tree`.
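
The two steps above can be sketched as follows. This is a minimal illustration, not LOGPile's code: `RedfishTree`, `Analyze`, and `CollectLive` are hypothetical names, and the tree shape (path mapped to a decoded document) is an assumption. The point is only that the analyzer depends on nothing but the tree, so the live and replay paths cannot diverge.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// RedfishTree maps a Redfish path to its decoded JSON document.
// Assumed shape for illustration; the real redfish_tree layout may differ.
type RedfishTree map[string]map[string]any

// Analyze is the single analyzer path: a pure function of the tree.
// As a stand-in for real parsing it only lists ComputerSystem paths.
func Analyze(tree RedfishTree) []string {
	var systems []string
	for path, doc := range tree {
		if t, _ := doc["@odata.type"].(string); strings.HasSuffix(t, ".ComputerSystem") {
			systems = append(systems, path)
		}
	}
	sort.Strings(systems) // map iteration order is random; sort for determinism
	return systems
}

// CollectLive stands in for the live crawler. It only builds the tree.
func CollectLive() RedfishTree {
	return RedfishTree{
		"/redfish/v1/Systems/1": {"@odata.type": "#ComputerSystem.v1_20_0.ComputerSystem"},
		"/redfish/v1/Chassis/1": {"@odata.type": "#Chassis.v1_23_0.Chassis"},
	}
}

func main() {
	tree := CollectLive()
	live := Analyze(tree)   // live path: crawl first, then replay
	replay := Analyze(tree) // import path: replay the stored tree
	fmt.Println(live, replay)
}
```

Because `Analyze` never touches the network, any fallback that needs extra documents has to be satisfied by the crawler putting them into the tree first.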

### Snapshot crawler behavior (important)

The Redfish snapshot crawler is intentionally:
- **bounded** (`LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`)
- **prioritized** (PCIe, Fabrics, FirmwareInventory, Storage, PowerSubsystem, ThermalSubsystem)
- **tolerant** (skips noisy expected failures, strips `#fragment` from `@odata.id`)

Design notes:
- Queue capacity is sized to the snapshot cap to avoid worker deadlocks on large trees.
- UI progress is coarse and human-readable; detailed per-request diagnostics are available via debug logs.
- `LOGPILE_REDFISH_DEBUG=1` and `LOGPILE_REDFISH_SNAPSHOT_DEBUG=1` enable console diagnostics.
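
A rough sketch of the bounded, prioritized, tolerant behavior described above. It is deliberately single-threaded, so it does not model the worker queue or the deadlock concern; `crawl`, `normalizePath`, `fetchFunc`, and `mapFetcher` are hypothetical names, and the real crawler may order and cap work differently.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// normalizePath strips a "#fragment" suffix from an @odata.id value.
func normalizePath(id string) string {
	if i := strings.IndexByte(id, '#'); i >= 0 {
		return id[:i]
	}
	return id
}

// fetchFunc returns the @odata.id links found in the document at path.
type fetchFunc func(path string) ([]string, error)

// mapFetcher serves documents from an in-memory map, for demonstration.
func mapFetcher(tree map[string][]string) fetchFunc {
	return func(path string) ([]string, error) {
		links, ok := tree[path]
		if !ok {
			return nil, errors.New("not found: " + path)
		}
		return links, nil
	}
}

// crawl visits the seeds first (priority), then discovered links, and stops
// once maxDocs documents have been stored. Fetch errors are skipped.
func crawl(seeds []string, maxDocs int, fetch fetchFunc) []string {
	seen := map[string]bool{}
	queue := append([]string(nil), seeds...)
	var visited []string
	for len(queue) > 0 && len(visited) < maxDocs {
		path := normalizePath(queue[0])
		queue = queue[1:]
		if seen[path] {
			continue
		}
		seen[path] = true
		links, err := fetch(path)
		if err != nil {
			continue // tolerant: one failed document does not abort the crawl
		}
		visited = append(visited, path)
		queue = append(queue, links...)
	}
	return visited
}

func main() {
	fetch := mapFetcher(map[string][]string{
		"/redfish/v1":                         {"/redfish/v1/Chassis/1/PCIeDevices", "/redfish/v1/Missing"},
		"/redfish/v1/Chassis/1/PCIeDevices":   {"/redfish/v1/Chassis/1/PCIeDevices/0#/Oem"},
		"/redfish/v1/Chassis/1/PCIeDevices/0": nil,
	})
	seeds := []string{"/redfish/v1/Chassis/1/PCIeDevices", "/redfish/v1"}
	fmt.Println(crawl(seeds, 10, fetch))
}
```

Listing the high-value paths as seeds is what makes the cap safe: they are fetched before anything discovered later can consume the document budget.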

### Parsing guidelines

When adding Redfish mappings, follow these principles:
- Support alternate collection paths (resources may appear at different odata URLs).
- Follow `@odata.id` references and handle embedded `Members` arrays.
- Prefer **raw-tree replay compatibility**: if the live collector adds a fallback/probe, the replay analyzer must mirror it.
- Deduplicate by serial / BDF / slot+model (in that priority order).
- Prefer tolerant/fallback parsing — missing fields should be silently skipped,
  not cause the whole collection to fail.
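
One way to read the deduplication rule is as a key function that falls back through the identifiers in priority order. The sketch below assumes a reduced `Device` struct and hypothetical helper names; it does not attempt to merge a record that has a serial with another record of the same device that only has a BDF.

```go
package main

import "fmt"

// Device holds only the fields used for deduplication (hypothetical subset).
type Device struct {
	Serial, BDF, Slot, Model string
}

// dedupKey picks the strongest available identity: serial, then BDF,
// then slot+model. It returns "" when no identity is available.
func dedupKey(d Device) string {
	switch {
	case d.Serial != "":
		return "serial:" + d.Serial
	case d.BDF != "":
		return "bdf:" + d.BDF
	case d.Slot != "" && d.Model != "":
		return "slot:" + d.Slot + "|" + d.Model
	}
	return ""
}

// dedupe keeps the first device seen for each key. Devices without any
// identity are kept as-is, since there is nothing to compare them on.
func dedupe(in []Device) []Device {
	seen := map[string]bool{}
	var out []Device
	for _, d := range in {
		k := dedupKey(d)
		if k != "" {
			if seen[k] {
				continue
			}
			seen[k] = true
		}
		out = append(out, d)
	}
	return out
}

func main() {
	devices := []Device{
		{Serial: "SN1", BDF: "0000:17:00.0"},
		{Serial: "SN1"}, // same serial reached through another path
		{BDF: "0000:31:00.0"},
		{Slot: "7", Model: "X710"},
	}
	fmt.Println(len(dedupe(devices)))
}
```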

### Vendor-specific storage fallbacks (Supermicro and similar)

When standard `Storage/.../Drives` collections are empty, collector/replay may recover drives via:
- `Storage.Links.Enclosures[*] -> .../Drives`
- direct probing of a finite set of `Disk.Bay` candidates (`Disk.Bay.0`, `Disk.Bay0`, `.../0`)

This is required for some BMCs that publish drive inventory in vendor-specific paths while leaving
standard collections empty.
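
The "finite candidates" idea can be illustrated by generating the three naming variants listed above for a fixed number of bays. The function name, the `maxBays` parameter, and the example base path are assumptions; how LOGPile actually chooses the bay count and base path is not shown in this document.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// diskBayCandidates returns the probe paths for bays 0..maxBays-1 under a
// drives collection path, in the three naming variants named in the docs.
func diskBayCandidates(drivesPath string, maxBays int) []string {
	base := strings.TrimRight(drivesPath, "/")
	out := make([]string, 0, maxBays*3)
	for i := 0; i < maxBays; i++ {
		n := strconv.Itoa(i)
		out = append(out,
			base+"/Disk.Bay."+n, // Disk.Bay.0
			base+"/Disk.Bay"+n,  // Disk.Bay0
			base+"/"+n,          // .../0
		)
	}
	return out
}

func main() {
	// The base path here is an example only.
	for _, p := range diskBayCandidates("/redfish/v1/Chassis/1/Drives/", 2) {
		fmt.Println(p)
	}
}
```

Since the candidate list is a pure function of the base path and the bay count, the replay analyzer can regenerate exactly the same list and look each path up in the stored tree.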

### PSU source preference (newer Redfish)

PSU inventory source order:
1. `Chassis/*/PowerSubsystem/PowerSupplies` (preferred on X14+/newer Redfish)
2. `Chassis/*/Power` (legacy fallback)
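
The source order can be expressed as a simple preference with fallback. `PSU` and `pickPSUs` are hypothetical names for this sketch, and treating an empty `PowerSubsystem` result as "fall back" is an assumption about the selection rule.

```go
package main

import "fmt"

// PSU is a reduced stand-in for the real power supply record.
type PSU struct {
	Name string
}

// pickPSUs prefers PSUs parsed from PowerSubsystem/PowerSupplies and falls
// back to the legacy Power resource only when the preferred source is empty.
// The second return value names the source that was used.
func pickPSUs(powerSubsystem, legacyPower []PSU) ([]PSU, string) {
	if len(powerSubsystem) > 0 {
		return powerSubsystem, "PowerSubsystem/PowerSupplies"
	}
	return legacyPower, "Power"
}

func main() {
	legacy := []PSU{{Name: "PSU1"}}
	psus, source := pickPSUs(nil, legacy) // older BMC: only legacy data present
	fmt.Println(len(psus), source)
}
```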

### Progress reporting

The collector emits progress log entries at each stage (connecting, enumerating systems,
@@ -5,11 +5,42 @@
| Endpoint | Format | Filename pattern |
|----------|--------|-----------------|
| `GET /api/export/csv` | CSV — serial numbers | `YYYY-MM-DD (MODEL) - SN.csv` |
| `GET /api/export/json` | Full `AnalysisResult` JSON (incl. `raw_payloads`) | `YYYY-MM-DD (MODEL) - SN.json` |
| `GET /api/export/json` | **Raw export package** (JSON or ZIP bundle) for reopen/re-analysis | `YYYY-MM-DD (MODEL) - SN.(json|zip)` |
| `GET /api/export/reanimator` | Reanimator hardware JSON | `YYYY-MM-DD (MODEL) - SN.json` |

---

## Raw Export (`Export Raw Data`)

### Purpose

Preserve enough source data to reproduce parsing later after parser fixes, without requiring
another live collection from the target system.

### Format

`/api/export/json` returns a **raw export package**:
- JSON package (machine-readable), or
- ZIP bundle containing:
  - `raw_export.json` — machine-readable package
  - `collect.log` — human-readable collection + parsing summary
  - `parser_fields.json` — structured parsed field snapshot for diffs between parser versions
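
As an illustration of the bundle layout, the sketch below writes the three members into an in-memory ZIP with Go's standard `archive/zip` package and reads their names back. `buildBundle` and `listBundle` are hypothetical helpers, and the member contents are placeholders; only the three file names come from the list above.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
)

// buildBundle writes the three bundle members into an in-memory ZIP.
func buildBundle(rawExport, collectLog, parserFields []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	members := []struct {
		name string
		data []byte
	}{
		{"raw_export.json", rawExport},
		{"collect.log", collectLog},
		{"parser_fields.json", parserFields},
	}
	for _, m := range members {
		w, err := zw.Create(m.name)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write(m.data); err != nil {
			return nil, err
		}
	}
	// Close flushes the central directory; the archive is invalid without it.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// listBundle returns the member names of a ZIP held in memory.
func listBundle(data []byte) ([]string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	var names []string
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	bundle, err := buildBundle([]byte(`{}`), []byte("collection ok\n"), []byte(`{}`))
	if err != nil {
		panic(err)
	}
	names, err := listBundle(bundle)
	fmt.Println(names, err)
}
```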

### Import / reopen behavior

When a raw export package is uploaded back into LOGPile:
- the app **re-analyzes from raw source**
- it does **not** trust embedded parsed output as source of truth

For Redfish, this means replay from `raw_payloads.redfish_tree`.

### Design rule

Raw export is a **re-analysis artifact**, not a final report dump. Keep it self-contained and
forward-compatible where possible (versioned package format, additive fields only).

---

## Reanimator Export

### Purpose

@@ -111,4 +111,94 @@ Top-level `README.md` and `CLAUDE.md` must remain minimal pointers/instructions.

---

## ADL-009 — Redfish analysis is performed from raw snapshot replay (unified tunnel)

**Date:** 2026-02-24
**Context:** Live Redfish collection and raw export re-analysis used different parsing paths,
which caused drift and made bug fixes difficult to validate consistently.
**Decision:** Redfish live collection must produce a `raw_payloads.redfish_tree` snapshot first,
then run the same replay analyzer used for imported raw exports.
**Consequences:**
- The same `redfish_tree` input produces the same parsed result in live and offline modes.
- Debugging parser issues can be done against exported raw bundles without live BMC access.
- Snapshot completeness becomes critical; collector seeds/limits are part of analyzer correctness.

---

## ADL-010 — Raw export is a self-contained re-analysis package (not a final result dump)

**Date:** 2026-02-24
**Context:** Exporting only normalized `AnalysisResult` loses raw source fidelity and prevents
future parser improvements from being applied to already collected data.
**Decision:** `Export Raw Data` produces a self-contained raw package (JSON or ZIP bundle)
that the application can reopen and re-analyze. Parsed data in the package is optional and not
the source of truth on import.
**Consequences:**
- Re-opening an export always re-runs analysis from raw source (`redfish_tree` or uploaded file bytes).
- Raw bundles include collection context and diagnostics for debugging (`collect.log`, `parser_fields.json`).
- Endpoint compatibility is preserved (`/api/export/json`) while the actual payload format may be a bundle.

---

## ADL-011 — Redfish snapshot crawler is bounded, prioritized, and failure-tolerant

**Date:** 2026-02-24
**Context:** Full Redfish trees on modern GPU systems are large, noisy, and contain many
vendor-specific or non-fetchable links. Unbounded crawling and naive queue design caused hangs
and incomplete snapshots.
**Decision:** Use a bounded snapshot crawler with:
- explicit document cap (`LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`)
- priority seed paths (PCIe/Fabrics/Firmware/Storage/PowerSubsystem/ThermalSubsystem)
- normalized `@odata.id` paths (strip `#fragment`)
- noisy expected error filtering (404/405/410/501 hidden from UI)
- queue capacity sized to crawl cap to avoid producer/consumer deadlock
**Consequences:**
- Snapshot collection remains stable on large BMC trees.
- Most high-value inventory paths are reached before the cap.
- UI progress remains useful while debug logs retain low-level fetch failures.
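
The error-filtering item in the decision can be sketched as a small status classifier. `isExpectedNoise` is a hypothetical helper name; the four status codes are the ones listed above, written with the constants from Go's `net/http` package.

```go
package main

import (
	"fmt"
	"net/http"
)

// isExpectedNoise reports whether an HTTP status is an expected crawl failure
// that should stay out of the UI while still being available to debug logs.
func isExpectedNoise(status int) bool {
	switch status {
	case http.StatusNotFound, // 404
		http.StatusMethodNotAllowed, // 405
		http.StatusGone,             // 410
		http.StatusNotImplemented:   // 501
		return true
	}
	return false
}

func main() {
	for _, code := range []int{404, 405, 410, 501, 401, 500} {
		fmt.Println(code, isExpectedNoise(code))
	}
}
```

Statuses outside this set, such as authentication failures or server errors, are not classified as noise and would still be surfaced.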

---

## ADL-012 — Vendor-specific storage inventory probing is allowed as fallback

**Date:** 2026-02-24
**Context:** Some Supermicro BMCs expose empty standard `Storage/.../Drives` collections while
real disk inventory exists under vendor-specific `Disk.Bay` endpoints and enclosure links.
**Decision:** When standard drive collections are empty, collector/replay may probe vendor-style
`.../Drives/Disk.Bay.*` endpoints and follow `Storage.Links.Enclosures[*]` to recover physical drives.
**Consequences:**
- Higher storage inventory coverage on Supermicro HBA/HA-RAID/MRVL/NVMe backplane implementations.
- Replay must mirror the same probing behavior to preserve deterministic results.
- Probing remains bounded (finite candidate set) to avoid runaway requests.

---

## ADL-013 — PowerSubsystem is preferred over legacy Power on newer Redfish implementations

**Date:** 2026-02-24
**Context:** X14+/newer Redfish implementations increasingly expose authoritative PSU data in
`PowerSubsystem/PowerSupplies`, while legacy `/Power` may be incomplete or schema-shifted.
**Decision:** Prefer `Chassis/*/PowerSubsystem/PowerSupplies` as the primary PSU source and use
legacy `Chassis/*/Power` as fallback.
**Consequences:**
- Better compatibility with newer BMC firmware generations.
- Legacy systems remain supported without special-case collector selection.
- Snapshot priority seeds must include `PowerSubsystem` resources.

---

## ADL-014 — Threshold logic lives on the server; UI reflects status only

**Date:** 2026-02-24
**Context:** Duplicating threshold math in frontend and backend creates drift and inconsistent
highlighting (e.g. PSU mains voltage range checks).
**Decision:** Business threshold evaluation (e.g. PSU voltage nominal range) must be computed on
the server; frontend only renders status/flags returned by the API.
**Consequences:**
- Single source of truth for threshold policies.
- UI can evolve visually without re-implementing domain logic.
- API payloads may carry richer status semantics over time.
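
A minimal sketch of what this decision implies for an API payload: the server evaluates the range and ships a ready-made status next to the reading, so the frontend only maps a status to a style. All names here (`Status`, `VoltageRange`, `PSUView`, the JSON field names) and the 200 to 240 V limits are illustrative assumptions, not LOGPile's actual policy or schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Status is the only thing the frontend needs in order to highlight a value.
type Status string

const (
	StatusOK      Status = "ok"
	StatusWarning Status = "warning"
)

// VoltageRange is a hypothetical policy type; the limits used in main are
// illustrative numbers only.
type VoltageRange struct{ Min, Max float64 }

// evalVoltage is the single place where the range check is implemented.
func evalVoltage(v float64, r VoltageRange) Status {
	if v < r.Min || v > r.Max {
		return StatusWarning
	}
	return StatusOK
}

// PSUView is what the API would return: the reading plus its status.
type PSUView struct {
	InputVoltage       float64 `json:"input_voltage"`
	InputVoltageStatus Status  `json:"input_voltage_status"`
}

// newPSUView attaches the server-computed status to a raw reading.
func newPSUView(v float64, r VoltageRange) PSUView {
	return PSUView{InputVoltage: v, InputVoltageStatus: evalVoltage(v, r)}
}

func main() {
	mains := VoltageRange{Min: 200, Max: 240}
	out, err := json.Marshal(newPSUView(187.5, mains))
	fmt.Println(string(out), err)
}
```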

---

<!-- Add new decisions below this line using the format above -->