Update docs and add release artifacts

Mikhail Chusavitin
2026-02-25 12:17:17 +03:00
parent a4a1a19a94
commit 693b7346ab
9 changed files with 241 additions and 1 deletions


@@ -22,6 +22,9 @@ Key top-level fields:
| `sensors` | `[]SensorReading` | Sensor readings |
| `raw_payloads` | `map[string]any` | Raw vendor data (e.g. `redfish_tree`) |
`raw_payloads` is the durable source for offline re-analysis (especially for Redfish).
Normalized fields should be treated as derivable output from raw source data.
### Hardware sub-structure
```
@@ -31,6 +34,7 @@ HardwareConfig
├── cpus []CPU
├── memory []MemoryDIMM
├── storage []Storage
├── volumes []StorageVolume — logical RAID/VROC volumes
├── pcie_devices []PCIeDevice
├── gpus []GPU
├── network_adapters []NetworkAdapter
@@ -86,3 +90,15 @@ Carried by both `/api/status` and `/api/config`:
Valid `source_type` values: `archive`, `api`
Valid `protocol` values: `redfish`, `ipmi` (empty is allowed for archive uploads)
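A minimal validation sketch for the values listed above. The function name and error texts are hypothetical, and rejecting an empty `protocol` for `api` sources is an assumption inferred from the "empty is allowed for archive uploads" note rather than confirmed behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// validateSource checks the documented source_type and protocol values.
// Hypothetical helper: not taken from the LOGPile code base.
func validateSource(sourceType, protocol string) error {
	switch sourceType {
	case "archive", "api":
	default:
		return fmt.Errorf("invalid source_type %q", sourceType)
	}
	switch protocol {
	case "redfish", "ipmi":
		return nil
	case "":
		// Assumption: empty protocol is only acceptable for archive uploads.
		if sourceType == "archive" {
			return nil
		}
		return errors.New("protocol is required unless source_type is archive")
	default:
		return fmt.Errorf("invalid protocol %q", protocol)
	}
}

func main() {
	fmt.Println(validateSource("archive", ""))
	fmt.Println(validateSource("api", "snmp"))
}
```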
---
## Raw Export Package (reopenable artifact)
`Export Raw Data` does not merely dump `AnalysisResult`; it emits a reopenable raw package
(JSON or ZIP bundle) that carries source data required for re-analysis.
Design rules:
- raw source is authoritative (`redfish_tree` or original file bytes)
- imports must re-analyze from raw source
- parsed field snapshots included in bundles are diagnostic artifacts, not the source of truth

View File

@@ -46,15 +46,53 @@ Dynamic — does not assume fixed paths. Discovers:
Full Redfish response tree is stored in `result.RawPayloads["redfish_tree"]`.
This allows future offline re-analysis without re-collecting from a live BMC.
### Unified Redfish analysis pipeline (live == replay)
LOGPile uses a **single Redfish analyzer path**:
1. Live collector crawls the Redfish API and builds `raw_payloads.redfish_tree`
2. Parsed result is produced by replaying that tree through the same analyzer used by raw import
This guarantees that live collection and `Export Raw Data` re-open/re-analyze produce the same
normalized output for the same `redfish_tree`.
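The single-path idea can be sketched roughly as follows. All type and function names here are hypothetical, the `Tree` shape is an assumed stand-in for `raw_payloads.redfish_tree`, and the fixed system path is used only to keep the illustration short (the real collector discovers paths dynamically).

```go
package main

import "fmt"

// Tree is an assumed stand-in for raw_payloads.redfish_tree:
// Redfish path -> decoded JSON document.
type Tree map[string]map[string]any

// Result is a heavily reduced, hypothetical normalized result.
type Result struct {
	Model string
}

// analyzeTree is the one analyzer shared by live collection and raw import.
func analyzeTree(t Tree) Result {
	var r Result
	if sys, ok := t["/redfish/v1/Systems/1"]; ok {
		r.Model, _ = sys["Model"].(string)
	}
	return r
}

// collectLive builds the tree first, then replays it through analyzeTree.
func collectLive(fetch func(path string) (map[string]any, bool), paths []string) (Tree, Result) {
	tree := Tree{}
	for _, p := range paths {
		if doc, ok := fetch(p); ok {
			tree[p] = doc
		}
	}
	return tree, analyzeTree(tree)
}

// reopenRaw re-analyzes an exported tree with the same analyzer.
func reopenRaw(tree Tree) Result { return analyzeTree(tree) }

func main() {
	fake := func(path string) (map[string]any, bool) {
		if path == "/redfish/v1/Systems/1" {
			return map[string]any{"Model": "X"}, true
		}
		return nil, false
	}
	tree, live := collectLive(fake, []string{"/redfish/v1/Systems/1"})
	fmt.Println(live == reopenRaw(tree))
}
```

Because the live path never parses responses directly, any analyzer fix applies equally to new collections and to previously exported trees.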
### Snapshot crawler behavior (important)
The Redfish snapshot crawler is intentionally:
- **bounded** (`LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`)
- **prioritized** (PCIe, Fabrics, FirmwareInventory, Storage, PowerSubsystem, ThermalSubsystem)
- **tolerant** (skips noisy expected failures, strips `#fragment` from `@odata.id`)
Design notes:
- Queue capacity is sized to the snapshot cap so that workers cannot deadlock on large trees.
- UI progress is coarse and human-readable; detailed per-request diagnostics are available via debug logs.
- `LOGPILE_REDFISH_DEBUG=1` and `LOGPILE_REDFISH_SNAPSHOT_DEBUG=1` enable console diagnostics.
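A simplified, single-goroutine sketch of the bounded/prioritized/tolerant behavior described above. It is not the actual crawler: the function names and the `fetch` callback are hypothetical, and in this sketch failed fetches still count toward the cap.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePath strips a "#fragment" suffix from an @odata.id value.
func normalizePath(id string) string {
	if i := strings.IndexByte(id, '#'); i >= 0 {
		return id[:i]
	}
	return id
}

// crawl visits at most maxDocs distinct paths, priority seeds first.
// fetch returns the links found in a document; ok=false marks a failed fetch.
func crawl(seeds []string, maxDocs int, fetch func(path string) (links []string, ok bool)) []string {
	queue := make([]string, 0, maxDocs) // queue never grows past the cap
	seen := map[string]bool{}
	enqueue := func(id string) {
		p := normalizePath(id)
		if p == "" || seen[p] || len(seen) >= maxDocs {
			return
		}
		seen[p] = true
		queue = append(queue, p)
	}
	for _, s := range seeds {
		enqueue(s)
	}
	var visited []string
	for i := 0; i < len(queue); i++ {
		links, ok := fetch(queue[i])
		if !ok {
			continue // tolerant: skip documents that cannot be fetched
		}
		visited = append(visited, queue[i])
		for _, l := range links {
			enqueue(l)
		}
	}
	return visited
}

func main() {
	fetch := func(p string) ([]string, bool) {
		return []string{"/redfish/v1/Chassis/1#/Oem", "/redfish/v1/Chassis/1/Power"}, true
	}
	fmt.Println(crawl([]string{"/redfish/v1/Chassis/1"}, 2, fetch))
}
```

In a concurrent variant the same bound would apply to a buffered channel: with capacity equal to the document cap, a worker enqueueing newly discovered links cannot block on a full queue.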
### Parsing guidelines
When adding Redfish mappings, follow these principles:
- Support alternate collection paths (resources may appear at different odata URLs).
- Follow `@odata.id` references and handle embedded `Members` arrays.
- Prefer **raw-tree replay compatibility**: if live collector adds a fallback/probe, replay analyzer must mirror it.
- Deduplicate by serial / BDF / slot+model (in that priority order).
- Prefer tolerant/fallback parsing — missing fields should be silently skipped,
not cause the whole collection to fail.
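The deduplication priority above could look roughly like this. The `Device` type and its field names are hypothetical; the sketch only compares devices that resolve to the same kind of key, so it would not merge a record that has a serial with a duplicate that only has a BDF.

```go
package main

import "fmt"

// Device holds hypothetical identity fields for one inventory item.
type Device struct {
	Serial, BDF, Slot, Model string
}

// dedupKey returns the identity key in priority order: serial, BDF, slot+model.
func dedupKey(d Device) string {
	switch {
	case d.Serial != "":
		return "sn:" + d.Serial
	case d.BDF != "":
		return "bdf:" + d.BDF
	case d.Slot != "" || d.Model != "":
		return "slot:" + d.Slot + "|" + d.Model
	}
	return "" // no usable identity
}

// dedupe keeps the first device per key; devices without a key are all kept.
func dedupe(in []Device) []Device {
	seen := map[string]bool{}
	var out []Device
	for _, d := range in {
		k := dedupKey(d)
		if k != "" {
			if seen[k] {
				continue
			}
			seen[k] = true
		}
		out = append(out, d)
	}
	return out
}

func main() {
	devs := []Device{
		{Serial: "S1", Slot: "1"},
		{Serial: "S1", Slot: "2"},
		{BDF: "0000:3b:00.0"},
	}
	fmt.Println(len(dedupe(devs)))
}
```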
### Vendor-specific storage fallbacks (Supermicro and similar)
When standard `Storage/.../Drives` collections are empty, collector/replay may recover drives via:
- `Storage.Links.Enclosures[*] -> .../Drives`
- direct probing of a finite set of `Disk.Bay` candidates (`Disk.Bay.0`, `Disk.Bay0`, `.../0`)
This is required for some BMCs that publish drive inventory in vendor-specific paths while leaving
standard collections empty.
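A sketch of how a finite candidate set could be generated. The function name, the `bays` limit, and the choice to append candidates to a caller-supplied drives collection path are assumptions; the three naming variants follow the examples listed above.

```go
package main

import (
	"fmt"
	"strconv"
)

// diskBayCandidates lists probe paths under a drives collection path for
// bay indexes 0..bays-1. Hypothetical helper; the bay count is an assumed limit.
func diskBayCandidates(drivesPath string, bays int) []string {
	if bays <= 0 {
		return nil
	}
	out := make([]string, 0, bays*3)
	for i := 0; i < bays; i++ {
		n := strconv.Itoa(i)
		out = append(out,
			drivesPath+"/Disk.Bay."+n, // "Disk.Bay.0" style
			drivesPath+"/Disk.Bay"+n,  // "Disk.Bay0" style
			drivesPath+"/"+n,          // bare index style
		)
	}
	return out
}

func main() {
	// "/example/Drives" is a placeholder, not a real Redfish path.
	for _, p := range diskBayCandidates("/example/Drives", 2) {
		fmt.Println(p)
	}
}
```

Keeping the generator pure (no I/O) makes it easy for the replay analyzer to probe the same paths against the stored tree that the live collector probed against the BMC.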
### PSU source preference (newer Redfish)
PSU inventory source order:
1. `Chassis/*/PowerSubsystem/PowerSupplies` (preferred on X14+/newer Redfish)
2. `Chassis/*/Power` (legacy fallback)
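The source order can be expressed as an ordered candidate list. This is a sketch only: the path-keyed tree shape and the function name are assumptions, and presence of the path is used as the sole selection criterion.

```go
package main

import "fmt"

// psuSourcePath picks the PSU inventory path for one chassis from a snapshot
// keyed by Redfish path. Hypothetical helper, not the project's API.
func psuSourcePath(tree map[string]any, chassisPath string) (string, bool) {
	candidates := []string{
		chassisPath + "/PowerSubsystem/PowerSupplies", // preferred
		chassisPath + "/Power",                        // legacy fallback
	}
	for _, p := range candidates {
		if _, ok := tree[p]; ok {
			return p, true
		}
	}
	return "", false
}

func main() {
	tree := map[string]any{"/redfish/v1/Chassis/1/Power": struct{}{}}
	path, ok := psuSourcePath(tree, "/redfish/v1/Chassis/1")
	fmt.Println(path, ok)
}
```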
### Progress reporting
The collector emits progress log entries at each stage (connecting, enumerating systems,

View File

@@ -5,11 +5,42 @@
| Endpoint | Format | Filename pattern |
|----------|--------|-----------------|
| `GET /api/export/csv` | CSV — serial numbers | `YYYY-MM-DD (MODEL) - SN.csv` |
| `GET /api/export/json` | Full `AnalysisResult` JSON (incl. `raw_payloads`) | `YYYY-MM-DD (MODEL) - SN.json` |
| `GET /api/export/json` | **Raw export package** (JSON or ZIP bundle) for reopen/re-analysis | `YYYY-MM-DD (MODEL) - SN.(json\|zip)` |
| `GET /api/export/reanimator` | Reanimator hardware JSON | `YYYY-MM-DD (MODEL) - SN.json` |
---
## Raw Export (`Export Raw Data`)
### Purpose
Preserve enough source data to re-run parsing after future parser fixes, without requiring
another live collection from the target system.
### Format
`/api/export/json` returns a **raw export package**:
- JSON package (machine-readable), or
- ZIP bundle containing:
  - `raw_export.json` — machine-readable package
  - `collect.log` — human-readable collection + parsing summary
  - `parser_fields.json` — structured parsed field snapshot for diffs between parser versions
### Import / reopen behavior
When a raw export package is uploaded back into LOGPile:
- the app **re-analyzes from raw source**
- it does **not** trust embedded parsed output as source of truth
For Redfish, this means replay from `raw_payloads.redfish_tree`.
### Design rule
Raw export is a **re-analysis artifact**, not a final report dump. Keep it self-contained and
forward-compatible where possible (versioned package format, additive fields only).
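A sketch of what import-side handling might look like under these rules. The `RawPackage` shape and its JSON field names (`format_version`, `parsed`) are assumptions made for illustration; only `raw_payloads` and `redfish_tree` come from the text above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// RawPackage is a hypothetical shape for raw_export.json.
type RawPackage struct {
	FormatVersion int                        `json:"format_version"`
	RawPayloads   map[string]json.RawMessage `json:"raw_payloads"`
	Parsed        json.RawMessage            `json:"parsed,omitempty"` // diagnostic only
}

// rawSourceForReanalysis returns the raw Redfish tree and deliberately
// ignores any embedded parsed output.
func rawSourceForReanalysis(data []byte) (json.RawMessage, error) {
	var p RawPackage
	if err := json.Unmarshal(data, &p); err != nil {
		return nil, err
	}
	tree, ok := p.RawPayloads["redfish_tree"]
	if !ok {
		return nil, errors.New("package has no redfish_tree raw payload")
	}
	return tree, nil
}

func main() {
	data := []byte(`{"format_version":1,"raw_payloads":{"redfish_tree":{"a":1}},"future_field":true}`)
	tree, err := rawSourceForReanalysis(data)
	fmt.Println(len(tree) > 0, err)
}
```

`encoding/json` ignores unknown object keys by default, which fits the "additive fields only" rule: an older reader can still open a package written by a newer version.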
---
## Reanimator Export
### Purpose


@@ -111,4 +111,94 @@ Top-level `README.md` and `CLAUDE.md` must remain minimal pointers/instructions.
---
## ADL-009 — Redfish analysis is performed from raw snapshot replay (unified pipeline)
**Date:** 2026-02-24
**Context:** Live Redfish collection and raw export re-analysis used different parsing paths,
which caused drift and made bug fixes difficult to validate consistently.
**Decision:** Redfish live collection must produce a `raw_payloads.redfish_tree` snapshot first,
then run the same replay analyzer used for imported raw exports.
**Consequences:**
- Same `redfish_tree` input produces the same parsed result in live and offline modes.
- Debugging parser issues can be done against exported raw bundles without live BMC access.
- Snapshot completeness becomes critical; collector seeds/limits are part of analyzer correctness.
---
## ADL-010 — Raw export is a self-contained re-analysis package (not a final result dump)
**Date:** 2026-02-24
**Context:** Exporting only normalized `AnalysisResult` loses raw source fidelity and prevents
future parser improvements from being applied to already collected data.
**Decision:** `Export Raw Data` produces a self-contained raw package (JSON or ZIP bundle)
that the application can reopen and re-analyze. Parsed data in the package is optional and not
the source of truth on import.
**Consequences:**
- Re-opening an export always re-runs analysis from raw source (`redfish_tree` or uploaded file bytes).
- Raw bundles include collection context and diagnostics for debugging (`collect.log`, `parser_fields.json`).
- Endpoint compatibility is preserved (`/api/export/json`) while actual payload format may be a bundle.
---
## ADL-011 — Redfish snapshot crawler is bounded, prioritized, and failure-tolerant
**Date:** 2026-02-24
**Context:** Full Redfish trees on modern GPU systems are large, noisy, and contain many
vendor-specific or non-fetchable links. Unbounded crawling and naive queue design caused hangs
and incomplete snapshots.
**Decision:** Use a bounded snapshot crawler with:
- explicit document cap (`LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`)
- priority seed paths (PCIe/Fabrics/Firmware/Storage/PowerSubsystem/ThermalSubsystem)
- normalized `@odata.id` paths (strip `#fragment`)
- filtering of noisy but expected errors (404/405/410/501 hidden from UI)
- queue capacity sized to crawl cap to avoid producer/consumer deadlock
**Consequences:**
- Snapshot collection remains stable on large BMC trees.
- Most high-value inventory paths are reached before the cap.
- UI progress remains useful while debug logs retain low-level fetch failures.
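The error filter from the decision above might be as small as the following sketch. The function name is hypothetical; the status codes are the ones listed in the decision.

```go
package main

import (
	"fmt"
	"net/http"
)

// isExpectedNoise reports whether an HTTP status is one of the expected,
// noisy crawl failures that should stay out of the UI (debug logs only).
func isExpectedNoise(status int) bool {
	switch status {
	case http.StatusNotFound, // 404
		http.StatusMethodNotAllowed, // 405
		http.StatusGone,             // 410
		http.StatusNotImplemented:   // 501
		return true
	}
	return false
}

func main() {
	for _, s := range []int{404, 500, 501} {
		fmt.Println(s, isExpectedNoise(s))
	}
}
```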
---
## ADL-012 — Vendor-specific storage inventory probing is allowed as fallback
**Date:** 2026-02-24
**Context:** Some Supermicro BMCs expose empty standard `Storage/.../Drives` collections while
real disk inventory exists under vendor-specific `Disk.Bay` endpoints and enclosure links.
**Decision:** When standard drive collections are empty, collector/replay may probe vendor-style
`.../Drives/Disk.Bay.*` endpoints and follow `Storage.Links.Enclosures[*]` to recover physical drives.
**Consequences:**
- Higher storage inventory coverage on Supermicro HBA/HA-RAID/MRVL/NVMe backplane implementations.
- Replay must mirror the same probing behavior to preserve deterministic results.
- Probing remains bounded (finite candidate set) to avoid runaway requests.
---
## ADL-013 — PowerSubsystem is preferred over legacy Power on newer Redfish implementations
**Date:** 2026-02-24
**Context:** X14+/newer Redfish implementations increasingly expose authoritative PSU data in
`PowerSubsystem/PowerSupplies`, while legacy `/Power` may be incomplete or schema-shifted.
**Decision:** Prefer `Chassis/*/PowerSubsystem/PowerSupplies` as the primary PSU source and use
legacy `Chassis/*/Power` as fallback.
**Consequences:**
- Better compatibility with newer BMC firmware generations.
- Legacy systems remain supported without special-case collector selection.
- Snapshot priority seeds must include `PowerSubsystem` resources.
---
## ADL-014 — Threshold logic lives on the server; UI reflects status only
**Date:** 2026-02-24
**Context:** Duplicating threshold math in frontend and backend creates drift and inconsistent
highlighting (e.g. PSU mains voltage range checks).
**Decision:** Business threshold evaluation (e.g. PSU voltage nominal range) must be computed on
the server; frontend only renders status/flags returned by the API.
**Consequences:**
- Single source of truth for threshold policies.
- UI can evolve visually without re-implementing domain logic.
- API payloads may carry richer status semantics over time.
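A sketch of server-side evaluation, so that the frontend only renders the returned status. The type names, status strings, and the 200 to 240 V range in `main` are illustrative assumptions, not the project's actual policy.

```go
package main

import "fmt"

// Status is a hypothetical status value returned to the UI.
type Status string

const (
	StatusOK      Status = "ok"
	StatusWarning Status = "warning"
)

// evaluatePSUVoltage classifies an input voltage against a nominal range.
// The range is supplied by server-side policy; the UI never sees the bounds.
func evaluatePSUVoltage(volts, min, max float64) Status {
	if volts < min || volts > max {
		return StatusWarning
	}
	return StatusOK
}

func main() {
	// 200..240 V is an example range only.
	fmt.Println(evaluatePSUVoltage(230, 200, 240))
	fmt.Println(evaluatePSUVoltage(180, 200, 240))
}
```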
---
<!-- Add new decisions below this line using the format above -->