docs: refresh project documentation

2026-03-15 16:35:16 +03:00
parent 47bb0ee939
commit 0acdc2b202
14 changed files with 508 additions and 1224 deletions
--- a/bible-local/01-overview.md
+++ b/bible-local/01-overview.md
@@ -1,35 +1,43 @@
 # 01 — Overview

-## What is LOGPile?
+## Purpose

-LOGPile is a standalone Go application for BMC (Baseboard Management Controller)
-diagnostics analysis with an embedded web UI.
-It runs as a single binary with no external file dependencies.
+LOGPile is a standalone Go application for BMC diagnostics analysis with an embedded web UI.
+It runs as a single binary and normalizes hardware data from archives or live Redfish collection.

 ## Operating modes

-| Mode | Entry point | Description |
-|------|-------------|-------------|
-| **Offline / archive** | `POST /api/upload` | Upload a vendor diagnostic archive or a JSON snapshot; parse and display in UI |
-| **Live / Redfish** | `POST /api/collect` | Connect to a live BMC via Redfish API, collect hardware inventory, display and export |
+| Mode | Entry point | Outcome |
+|------|-------------|---------|
+| Archive upload | `POST /api/upload` | Parse a supported archive, raw export bundle, or JSON snapshot into `AnalysisResult` |
+| Live collection | `POST /api/collect` | Collect from a live BMC via Redfish and store the result in memory |
+| Batch convert | `POST /api/convert` | Convert multiple supported input files into Reanimator JSON in a ZIP artifact |

-Both modes produce the same in-memory `AnalysisResult` structure and expose it
-through the same API and UI.
+All modes converge on the same normalized hardware model and exporter pipeline.

-## Key capabilities
+## In scope

- Single self-contained binary with embedded HTML/JS/CSS (no static file serving required).
- Vendor archive parsing: Inspur/Kaytus, Dell TSR, NVIDIA HGX Field Diagnostics,
-  NVIDIA Bug Report, Unraid, XigmaNAS, Generic text fallback.
- Live Redfish collection with async progress tracking.
- Normalized hardware inventory: CPU / RAM / Storage / GPU / PSU / NIC / PCIe / Firmware.
- Raw `redfish_tree` snapshot stored in `RawPayloads` for future offline re-analysis.
- Re-upload of a JSON snapshot for offline work (`/api/upload` accepts `AnalysisResult` JSON).
- Export in CSV, JSON (full `AnalysisResult`), and Reanimator format.
- PCI device model resolution via embedded `pci.ids` (no hardcoded model strings).
+- Single-binary desktop/server utility with embedded UI
+- Vendor archive parsing and live Redfish collection
+- Canonical hardware inventory across UI and exports
+- Reopenable raw export bundles for future re-analysis
+- Reanimator export and batch conversion workflows
+- Embedded `pci.ids` lookup for vendor/device name enrichment

-## Non-goals (current scope)
+## Current vendor coverage

- No persistent storage — all state is in-memory per process lifetime.
- IPMI collector is a mock scaffold only; real IPMI support is not implemented.
- No authentication layer on the HTTP server.
+- Dell TSR
+- H3C SDS G5/G6
+- Inspur / Kaytus
+- NVIDIA HGX Field Diagnostics
+- NVIDIA Bug Report
+- Unraid
+- XigmaNAS
+- Generic fallback parser
+
+## Non-goals
+
+- Persistent storage or multi-user state
+- Production IPMI collection
+- Authentication/authorization on the built-in HTTP server
+- Long-term server-side job history; jobs live only in memory for the lifetime of the process
--- a/bible-local/02-architecture.md
+++ b/bible-local/02-architecture.md
@@ -2,114 +2,85 @@

 ## Runtime stack

-| Layer | Technology |
-|-------|------------|
+| Layer | Implementation |
+|-------|----------------|
 | Language | Go 1.22+ |
-| HTTP | `net/http`, `http.ServeMux` |
-| UI | Embedded via `//go:embed` in `web/embed.go` (templates + static assets) |
-| State | In-memory only — no database |
-| Build | `CGO_ENABLED=0`, single static binary |
+| HTTP | `net/http` + `http.ServeMux` |
+| UI | Embedded templates and static assets via `go:embed` |
+| State | In-memory only |
+| Build | `CGO_ENABLED=0`, single binary |

-Default port: **8082**
+Default port: `8082`

-## Directory structure
+## Code map

-```
-cmd/logpile/main.go          # Binary entry point, CLI flag parsing
-internal/
-  collector/                 # Live data collectors
-    registry.go              # Collector registration
-    redfish.go               # Redfish connector (real implementation)
-    ipmi_mock.go             # IPMI mock connector (scaffold)
-    types.go                 # Connector request/progress contracts
-  parser/                    # Archive parsers
-    parser.go                # BMCParser (dispatcher) + parse orchestration
-    archive.go               # Archive extraction helpers
-    registry.go              # Parser registry + detect/selection
-    interface.go             # VendorParser interface
-    vendors/                 # Vendor-specific parser modules
-      vendors.go             # Import-side-effect registrations
-      dell/
-      inspur/
-      nvidia/
-      nvidia_bug_report/
-      unraid/
-      xigmanas/
-      generic/
-      pciids/                # PCI IDs lookup (embedded pci.ids)
-  server/                    # HTTP layer
-    server.go                # Server struct, route registration
-    handlers.go              # All HTTP handler functions
-  exporter/                  # Export formatters
-    exporter.go              # CSV + JSON exporters
-    reanimator_models.go
-    reanimator_converter.go
-  models/                    # Shared data contracts
-web/
-  embed.go                   # go:embed directive
-  templates/                 # HTML templates
-  static/                    # JS / CSS
-    js/app.js                # Frontend — API contract consumer
+```text
+cmd/logpile/main.go          entrypoint and CLI flags
+internal/server/             HTTP handlers, jobs, upload/export flows
+internal/collector/          live collection and Redfish replay
+internal/analyzer/           shared analysis helpers
+internal/parser/             archive extraction and parser dispatch
+internal/exporter/           CSV and Reanimator conversion
+internal/models/             stable data contracts
+web/                         embedded UI assets
 ```

-## In-memory state
+## Server state

-The `Server` struct in `internal/server/server.go` holds:
+`internal/server.Server` stores:

-| Field | Type | Description |
-|-------|------|-------------|
-| `result` | `*models.AnalysisResult` | Current parsed/collected dataset |
-| `detectedVendor` | `string` | Vendor identifier from last parse |
-| `jobManager` | `*JobManager` | Tracks live collect job status/logs |
-| `collectors` | `*collector.Registry` | Registered live collection connectors |
+| Field | Purpose |
+|------|---------|
+| `result` | Current `AnalysisResult` shown in UI and used by exports |
+| `detectedVendor` | Parser/collector identity for the current dataset |
+| `rawExport` | Reopenable raw-export package associated with current result |
+| `jobManager` | Shared async job state for collect and convert flows |
+| `collectors` | Registered live collectors (`redfish`, `ipmi`) |
+| `convertOutput` | Temporary ZIP artifacts for batch convert downloads |

-State is replaced atomically on successful upload or collect.
-On a failed/canceled collect, the previous `result` is preserved unchanged.
+State is replaced only on successful upload or successful live collection.
+Failed or canceled jobs do not overwrite the previous dataset.
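
The replace-on-success rule above can be sketched in Go as follows. This is a minimal illustration, not the project's code: `Server` is reduced to a single field, and `finishJob`, `current`, and `errCanceled` are hypothetical names introduced here.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// AnalysisResult is a stand-in for models.AnalysisResult; only Filename is modeled.
type AnalysisResult struct {
	Filename string
}

// Server is a hypothetical reduction of internal/server.Server to the current dataset.
type Server struct {
	mu     sync.RWMutex
	result *AnalysisResult
}

// errCanceled is an illustrative error value for a canceled job.
var errCanceled = errors.New("collect canceled")

// finishJob swaps in the new dataset only when the job succeeded.
// It reports whether the state was replaced.
func (s *Server) finishJob(next *AnalysisResult, err error) bool {
	if err != nil || next == nil {
		return false // failed or canceled: keep the previous dataset
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.result = next
	return true
}

// current returns the dataset that the UI and exporters would read.
func (s *Server) current() *AnalysisResult {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.result
}

func main() {
	s := &Server{}
	s.finishJob(&AnalysisResult{Filename: "first.tar.gz"}, nil)
	s.finishJob(&AnalysisResult{Filename: "second"}, errCanceled)
	fmt.Println(s.current().Filename)
}
```

Guarding the swap with a mutex is an assumption about how concurrent readers are protected; the docs only state that failed or canceled jobs leave the previous dataset in place.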

-## Upload flow (`POST /api/upload`)
+## Main flows

-```
-multipart form field: "archive"
-  │
-  ├─ file looks like JSON?
-  │     └─ parse as models.AnalysisResult snapshot → store in Server.result
-  │
-  └─ otherwise
-        └─ parser.NewBMCParser().ParseFromReader(...)
-              │
-              ├─ try all registered vendor parsers (highest confidence wins)
-              └─ result → store in Server.result
-```
+### Upload

-## Live collect flow (`POST /api/collect`)
+1. `POST /api/upload` receives multipart field `archive`
+2. JSON inputs are checked to see whether they are a raw-export package or an `AnalysisResult` snapshot
+3. Non-JSON inputs go through `parser.BMCParser`
+4. Archive metadata is normalized onto `AnalysisResult`
+5. Result becomes the current in-memory dataset

-```
-validate request (host / protocol / port / username / auth_type / tls_mode)
-  │
-  └─ launch async job
-        │
-        ├─ progress callback → job log (queryable via GET /api/collect/{id})
-        │
-        ├─ success:
-        │     set source metadata (source_type=api, protocol, host, date)
-        │     store result in Server.result
-        │
-        └─ failure / cancel:
-              previous Server.result unchanged
-```
+### Live collect

-Job lifecycle states: `queued → running → success | failed | canceled`
+1. `POST /api/collect` validates request fields
+2. Server creates an async job and returns `202 Accepted`
+3. Selected collector gathers raw data
+4. For Redfish, collector saves `raw_payloads.redfish_tree`
+5. The result is normalized, source metadata is applied, and state is replaced on success
+
+### Batch convert
+
+1. `POST /api/convert` accepts multiple files
+2. Each supported file is analyzed independently
+3. Successful results are converted to Reanimator JSON
+4. Outputs are packaged into a temporary ZIP artifact
+5. Client polls job status and downloads the artifact when ready
+
+## Redfish design rule
+
+Live Redfish collection and offline Redfish re-analysis must use the same replay path.
+The collector first captures `raw_payloads.redfish_tree`, then the replay logic builds the normalized result.
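
A small Go sketch of this design rule, under stated assumptions: `Tree`, `Result`, `replay`, `collectLive`, and `fakeFetch` are hypothetical stand-ins, and a "document" is reduced to a flat string map. The point is only the shape: the live path captures raw documents and then calls the same `replay` function that an offline import would call.

```go
package main

import (
	"fmt"
	"sort"
)

// Tree is a stand-in for raw_payloads.redfish_tree: resource path -> raw document.
type Tree map[string]map[string]string

// Result is a tiny stand-in for the normalized AnalysisResult.
type Result struct {
	Models []string
}

// replay is the single analysis path; it only ever sees a captured tree.
func replay(tree Tree) Result {
	var r Result
	for _, doc := range tree {
		if m := doc["Model"]; m != "" {
			r.Models = append(r.Models, m)
		}
	}
	sort.Strings(r.Models) // deterministic output regardless of map iteration order
	return r
}

// collectLive captures raw documents first, then delegates to replay.
func collectLive(fetch func(path string) map[string]string, paths []string) (Tree, Result) {
	tree := Tree{}
	for _, p := range paths {
		tree[p] = fetch(p)
	}
	return tree, replay(tree)
}

// equal compares two results element by element.
func equal(a, b Result) bool {
	if len(a.Models) != len(b.Models) {
		return false
	}
	for i := range a.Models {
		if a.Models[i] != b.Models[i] {
			return false
		}
	}
	return true
}

// samplePaths and fakeFetch simulate a BMC for the example.
var samplePaths = []string{"/redfish/v1/Systems/1", "/redfish/v1/Chassis/1"}

func fakeFetch(path string) map[string]string {
	return map[string]string{"@odata.id": path, "Model": "model-at-" + path}
}

func main() {
	tree, live := collectLive(fakeFetch, samplePaths)
	offline := replay(tree) // what a raw-export re-import would compute
	fmt.Println(equal(live, offline))
}
```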

 ## PCI IDs lookup

-Load/override order (`LOGPILE_PCI_IDS_PATH` has highest priority because it is loaded last):
+Load order:

-1. Embedded `internal/parser/vendors/pciids/pci.ids` (base dataset compiled into binary)
+1. Embedded `internal/parser/vendors/pciids/pci.ids`
 2. `./pci.ids`
 3. `/usr/share/hwdata/pci.ids`
 4. `/usr/share/misc/pci.ids`
 5. `/opt/homebrew/share/pciids/pci.ids`
-6. Paths from `LOGPILE_PCI_IDS_PATH` (colon-separated on Unix; later loaded, override same IDs)
+6. Extra paths from `LOGPILE_PCI_IDS_PATH`

-This means unknown GPU/NIC model strings can be updated by refreshing `pci.ids`
-without any code change.
+Later sources override earlier ones for the same IDs.
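
The override behavior can be illustrated with a short Go sketch. It assumes each `pci.ids` source has already been parsed into a map keyed by `vendor:device`; `Source` and `mergeSources` are hypothetical names, and the IDs and names are placeholders rather than real `pci.ids` entries.

```go
package main

import "fmt"

// Source is one parsed pci.ids dataset, reduced to "vendor:device" -> name.
type Source struct {
	Name    string
	Entries map[string]string
}

// mergeSources applies sources in load order; a later source overrides
// earlier ones for the same ID and leaves other IDs untouched.
func mergeSources(sources []Source) map[string]string {
	merged := make(map[string]string)
	for _, src := range sources {
		for id, name := range src.Entries {
			merged[id] = name
		}
	}
	return merged
}

// sampleSources lists three of the documented locations, in load order.
var sampleSources = []Source{
	{Name: "embedded", Entries: map[string]string{"abcd:0001": "old name", "abcd:0002": "embedded only"}},
	{Name: "./pci.ids", Entries: map[string]string{"abcd:0001": "local name"}},
	{Name: "LOGPILE_PCI_IDS_PATH", Entries: map[string]string{"abcd:0001": "refreshed name"}},
}

func main() {
	merged := mergeSources(sampleSources)
	fmt.Println(merged["abcd:0001"])
	fmt.Println(merged["abcd:0002"])
}
```

Because the environment-variable paths are loaded last, they take precedence for any ID they define.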
--- a/bible-local/03-api.md
+++ b/bible-local/03-api.md
@@ -2,38 +2,37 @@

 ## Conventions

- All endpoints under `/api/`.
- Request bodies: `application/json` or `multipart/form-data` where noted.
- Responses: `application/json` unless file download.
- Export filenames follow pattern: `YYYY-MM-DD (SERVER MODEL) - SERVER SN.<ext>`
+- All endpoints are under `/api/`
+- JSON responses are used unless the endpoint downloads a file
+- Async jobs share the same status model: `queued`, `running`, `success`, `failed`, `canceled`
+- Export filenames use `YYYY-MM-DD (MODEL) - SERIAL.<ext>` when board metadata exists
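
The export filename convention could be implemented roughly as below. This is a sketch: `exportFilename` and `sampleDate` are hypothetical, and the date-only fallback used when board metadata is missing is an assumption, since the docs do not describe that case.

```go
package main

import (
	"fmt"
	"time"
)

// exportFilename builds "YYYY-MM-DD (MODEL) - SERIAL.<ext>".
// Assumption: without board metadata, fall back to a date-only name.
func exportFilename(t time.Time, model, serial, ext string) string {
	date := t.Format("2006-01-02")
	if model == "" || serial == "" {
		return date + "." + ext
	}
	return fmt.Sprintf("%s (%s) - %s.%s", date, model, serial, ext)
}

// sampleDate is a fixed timestamp so the example output is stable.
var sampleDate = time.Date(2026, time.March, 15, 16, 35, 16, 0, time.UTC)

func main() {
	fmt.Println(exportFilename(sampleDate, "R750", "ABC123", "csv"))
	fmt.Println(exportFilename(sampleDate, "", "", "json"))
}
```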

---
-
-## Upload & Data Input
+## Input endpoints

 ### `POST /api/upload`

-Upload a vendor diagnostic archive or a JSON snapshot.
-
-**Request:** `multipart/form-data`, field name `archive`.
-Server-side multipart limit: **100 MiB**.
+Uploads one file in multipart field `archive`.

 Accepted inputs:
- `.tar`, `.tar.gz`, `.tgz` — vendor diagnostic archives
- `.txt` — plain text files
- JSON file containing a serialized `AnalysisResult` — re-loaded as-is
+- supported archive/log formats from the parser registry
+- `.json` `AnalysisResult` snapshots
+- raw-export JSON packages
+- raw-export ZIP bundles

-**Response:** `200 OK` with parsed result summary, or `4xx`/`5xx` on error.
+Result:
+- parses or replays the input
+- stores the result as current in-memory state
+- returns parsed summary JSON

---
-
-## Live Collection
+Related helper:
+- `GET /api/file-types` returns `archive_extensions`, `upload_extensions`, and `convert_extensions`

 ### `POST /api/collect`

-Start a live collection job (`redfish` or `ipmi`).
+Starts a live collection job.
+
+Request body:

-**Request body:**
 ```json
 {
  "host": "bmc01.example.local",
@@ -47,138 +46,125 @@ Start a live collection job (`redfish` or `ipmi`).
 ```

 Supported values:
- `protocol`: `redfish` | `ipmi`
- `auth_type`: `password` | `token`
- `tls_mode`: `strict` | `insecure`
+- `protocol`: `redfish` or `ipmi`
+- `auth_type`: `password` or `token`
+- `tls_mode`: `strict` or `insecure`

-**Response:** `202 Accepted`
-```json
-{
-  "job_id": "job_a1b2c3d4e5f6",
-  "status": "queued",
-  "message": "Collection job accepted",
-  "created_at": "2026-02-23T12:00:00Z"
-}
-```
-
-Validation behavior:
- `400 Bad Request` for invalid JSON
- `422 Unprocessable Entity` for semantic validation errors (missing/invalid fields)
+Responses:
+- `202` on accepted job creation
+- `400` on malformed JSON
+- `422` on validation errors
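
The split between `400` and `422` can be sketched as a pure function. The JSON field names and allowed values come from this document; `collectRequest`, `oneOf`, and `statusFor` are hypothetical, and the real handler may validate more fields (port, username, credentials).

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// collectRequest mirrors only a subset of the documented request fields.
type collectRequest struct {
	Host     string `json:"host"`
	Protocol string `json:"protocol"`
	AuthType string `json:"auth_type"`
	TLSMode  string `json:"tls_mode"`
}

// oneOf reports whether v equals one of the allowed values.
func oneOf(v string, allowed ...string) bool {
	for _, a := range allowed {
		if v == a {
			return true
		}
	}
	return false
}

// statusFor maps a raw request body to the documented status code.
func statusFor(body string) int {
	var req collectRequest
	if err := json.Unmarshal([]byte(body), &req); err != nil {
		return http.StatusBadRequest // malformed JSON
	}
	valid := req.Host != "" &&
		oneOf(req.Protocol, "redfish", "ipmi") &&
		oneOf(req.AuthType, "password", "token") &&
		oneOf(req.TLSMode, "strict", "insecure")
	if !valid {
		return http.StatusUnprocessableEntity // well-formed but semantically invalid
	}
	return http.StatusAccepted
}

func main() {
	fmt.Println(statusFor(`{"host":`))
	fmt.Println(statusFor(`{"host":"bmc01.example.local","protocol":"snmp","auth_type":"password","tls_mode":"strict"}`))
	fmt.Println(statusFor(`{"host":"bmc01.example.local","protocol":"redfish","auth_type":"password","tls_mode":"strict"}`))
}
```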

 ### `GET /api/collect/{id}`

-Poll job status and progress log.
-
-**Response:**
-```json
-{
-  "job_id": "job_a1b2c3d4e5f6",
-  "status": "running",
-  "progress": 55,
-  "logs": ["..."],
-  "created_at": "2026-02-23T12:00:00Z",
-  "updated_at": "2026-02-23T12:00:10Z"
-}
-```
-
-Status values: `queued` | `running` | `success` | `failed` | `canceled`
+Returns async collection job status, progress, timestamps, and accumulated logs.

 ### `POST /api/collect/{id}/cancel`

-Cancel a running job.
+Requests cancellation for a running collection job.

---
+### `POST /api/convert`

-## Data Queries
+Starts a batch conversion job that accepts multiple files under `files[]` or `files`.
+Each supported file is parsed independently and converted to Reanimator JSON.
+
+Response fields:
+- `job_id`
+- `status`
+- `accepted`
+- `skipped`
+- `total_files`
+
+### `GET /api/convert/{id}`
+
+Returns batch convert job status using the same async job envelope as collection.
+
+### `GET /api/convert/{id}/download`
+
+Downloads the ZIP artifact produced by a successful convert job.
+
+## Read endpoints

 ### `GET /api/status`

 Returns source metadata for the current dataset.
+If nothing is loaded, response is `{ "loaded": false }`.

-```json
-{
-  "loaded": true,
-  "filename": "redfish://bmc01.example.local",
-  "vendor": "redfish",
-  "source_type": "api",
-  "protocol": "redfish",
-  "target_host": "bmc01.example.local",
-  "collected_at": "2026-02-10T15:30:00Z",
-  "stats": { "events": 0, "sensors": 0, "fru": 0 }
-}
-```
-
-`source_type`: `archive` | `api`
-
-When no dataset is loaded, response is `{ "loaded": false }`.
+Typical fields:
+- `loaded`
+- `filename`
+- `vendor`
+- `source_type`
+- `protocol`
+- `target_host`
+- `source_timezone`
+- `collected_at`
+- `stats`

 ### `GET /api/config`

-Returns source metadata plus:
+Returns the main UI configuration payload, including:
+- source metadata
 - `hardware.board`
 - `hardware.firmware`
 - canonical `hardware.devices`
- computed `specification` summary lines
+- computed specification lines

 ### `GET /api/events`

-Returns parsed diagnostic events.
+Returns events sorted newest first.

 ### `GET /api/sensors`

-Returns sensor readings (temperatures, voltages, fan speeds).
+Returns parsed sensors plus synthesized PSU voltage sensors when telemetry is available.

 ### `GET /api/serials`

-Returns serial numbers built from canonical `hardware.devices`.
+Returns serial-oriented inventory built from canonical devices.

 ### `GET /api/firmware`

-Returns firmware versions built from canonical `hardware.devices`.
+Returns firmware-oriented inventory built from canonical devices.
+
+### `GET /api/parse-errors`
+
+Returns normalized parse and collection issues combined from:
+- Redfish fetch errors in `raw_payloads`
+- raw-export collect logs
+- derived partial-inventory warnings

 ### `GET /api/parsers`

-Returns list of registered vendor parsers with their identifiers.
+Returns registered parser metadata.

---
+### `GET /api/file-types`

-## Export
+Returns supported file extensions for upload and batch convert.
+
+## Export endpoints

 ### `GET /api/export/csv`

-Download serial numbers as CSV.
+Downloads serial-number CSV.

 ### `GET /api/export/json`

-Download full `AnalysisResult` as JSON (includes `raw_payloads`).
+Downloads a raw-export artifact for reopen and re-analysis.
+Current implementation emits a ZIP bundle containing:
+- `raw_export.json`
+- `collect.log`
+- `parser_fields.json`

 ### `GET /api/export/reanimator`

-Download hardware data in Reanimator format for asset tracking integration.
-See [`07-exporters.md`](07-exporters.md) for full format spec.
+Downloads Reanimator JSON built from the current normalized result.

---
-
-## Management
+## Management endpoints

 ### `DELETE /api/clear`

-Clear current in-memory dataset.
+Clears the current in-memory dataset, raw export state, and temporary convert artifacts.

 ### `POST /api/shutdown`

-Gracefully shut down the server process.
-This endpoint terminates the current process after responding.
-
---
-
-## Source metadata fields
-
-Fields present in `/api/status` and `/api/config`:
-
-| Field | Values |
-|-------|--------|
-| `source_type` | `archive` \| `api` |
-| `protocol` | `redfish` \| `ipmi` (may be empty for archive uploads) |
-| `target_host` | IP or hostname |
-| `collected_at` | RFC3339 timestamp |
+Gracefully shuts down the process after responding.
--- a/bible-local/04-data-models.md
+++ b/bible-local/04-data-models.md
@@ -1,104 +1,87 @@
 # 04 — Data Models

-## AnalysisResult
+## Core contract: `AnalysisResult`

-`internal/models/` — the central data contract shared by parsers, collectors, exporters, and the HTTP layer.
+`internal/models/models.go` defines the shared result passed between parsers, collectors, server handlers, and exporters.

-**Stability rule:** Never break the JSON shape of `AnalysisResult`.
-Backward-compatible additions are allowed; removals or renames are not.
+Stability rule:
+- do not rename or remove JSON fields from `AnalysisResult`
+- additive fields are allowed
+- UI and exporter compatibility depends on this shape remaining stable

-Key top-level fields:
+Key fields:

-| Field | Type | Description |
-|-------|------|-------------|
-| `filename` | `string` | Uploaded filename or generated live source identifier |
-| `source_type` | `string` | `archive` or `api` |
-| `protocol` | `string` | `redfish`, `ipmi`, or empty for archive uploads |
-| `target_host` | `string` | BMC host for live collection |
-| `collected_at` | `time.Time` | Upload/collection timestamp |
-| `hardware` | `*HardwareConfig` | All normalized hardware inventory |
-| `events` | `[]Event` | Diagnostic events from parsers |
-| `fru` | `[]FRUInfo` | FRU/SDR-derived inventory details |
-| `sensors` | `[]SensorReading` | Sensor readings |
-| `raw_payloads` | `map[string]any` | Raw vendor data (e.g. `redfish_tree`) |
+| Field | Meaning |
+|------|---------|
+| `filename` | Original upload name or synthesized live source name |
+| `source_type` | `archive` or `api` |
+| `protocol` | `redfish`, `ipmi`, or empty for archive uploads |
+| `target_host` | Hostname or IP for live collection |
+| `source_timezone` | Source timezone/offset if known |
+| `collected_at` | Canonical collection/upload time |
+| `raw_payloads` | Raw source data used for replay or diagnostics |
+| `events` | Parsed event timeline |
+| `fru` | FRU-derived inventory details |
+| `sensors` | Sensor readings |
+| `hardware` | Normalized hardware inventory |

-`raw_payloads` is the durable source for offline re-analysis (especially for Redfish).
-Normalized fields should be treated as derivable output from raw source data.
+## `HardwareConfig`

-### Hardware sub-structure
+Main sections:

-```
-HardwareConfig
-  ├── board        BoardInfo       — server/motherboard identity
-  ├── devices      []HardwareDevice — CANONICAL INVENTORY (see below)
-  ├── cpus         []CPU
-  ├── memory       []MemoryDIMM
-  ├── storage      []Storage
-  ├── volumes      []StorageVolume  — logical RAID/VROC volumes
-  ├── pcie_devices []PCIeDevice
-  ├── gpus         []GPU
-  ├── network_adapters []NetworkAdapter
-  ├── network_cards    []NIC           (legacy/alternate source field)
-  ├── power_supplies   []PSU
-  └── firmware     []FirmwareInfo
+```text
+hardware.board
+hardware.devices
+hardware.cpus
+hardware.memory
+hardware.storage
+hardware.volumes
+hardware.pcie_devices
+hardware.gpus
+hardware.network_adapters
+hardware.network_cards
+hardware.power_supplies
+hardware.firmware
 ```

---
+`network_cards` is legacy/alternate source data.
+`hardware.devices` is the canonical cross-section inventory.

-## Canonical Device Repository (`hardware.devices`)
+## Canonical inventory: `hardware.devices`

-`hardware.devices` is the **single source of truth** for hardware inventory.
+`hardware.devices` is the single source of truth for device-oriented UI and Reanimator export.

-### Rules — must not be violated
+Required rules:

-1. All UI tabs displaying hardware components **must read from `hardware.devices`**.
-2. The Device Inventory tab shows kinds: `pcie`, `storage`, `gpu`, `network`.
-3. The Reanimator exporter **must use the same `hardware.devices`** as the UI.
-4. Any discrepancy between UI data and Reanimator export data is a **bug**.
-5. New hardware attributes must be added to the canonical device schema **first**,
-   then mapped to Reanimator/UI — never the other way around.
-6. The exporter should group/filter canonical records by section, not rebuild data
-   from multiple sources.
+1. UI hardware views must read from `hardware.devices`
+2. Reanimator conversion must derive device sections from `hardware.devices`
+3. UI/export mismatches are bugs, not accepted divergence
+4. New shared device fields belong in `HardwareDevice` first

-### Deduplication logic (applied once by repository builder)
+Deduplication priority:

-| Priority | Key used |
-|----------|----------|
-| 1 | `serial_number` — usable (not empty, not `N/A`, `NA`, `NONE`, `NULL`, `UNKNOWN`, `-`) |
-| 2 | `bdf` — PCI Bus:Device.Function address |
-| 3 | No merge — records remain distinct if both serial and bdf are absent |
+| Priority | Key |
+|----------|-----|
+| 1 | usable `serial_number` |
+| 2 | `bdf` |
+| 3 | keep records separate |
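
The priority table can be sketched in Go as follows. The placeholder serial values are taken from the previous revision of this table; matching them case-insensitively is an assumption, and `Device`, `usableSerial`, `dedupKey`, and `dedupe` are hypothetical names. For brevity the sketch drops later duplicates rather than merging their fields.

```go
package main

import (
	"fmt"
	"strings"
)

// Device is a reduced stand-in for a canonical hardware.devices record.
type Device struct {
	Serial string
	BDF    string
	Model  string
}

// usableSerial rejects empty and placeholder serial numbers.
func usableSerial(s string) bool {
	switch strings.ToUpper(strings.TrimSpace(s)) {
	case "", "N/A", "NA", "NONE", "NULL", "UNKNOWN", "-":
		return false
	}
	return true
}

// dedupKey returns the merge key: usable serial first, then BDF.
// ok is false when neither is available, so the record stays separate.
func dedupKey(d Device) (key string, ok bool) {
	if usableSerial(d.Serial) {
		return "sn:" + strings.TrimSpace(d.Serial), true
	}
	if d.BDF != "" {
		return "bdf:" + d.BDF, true
	}
	return "", false
}

// dedupe keeps the first record per key and every record without a key.
func dedupe(devs []Device) []Device {
	seen := make(map[string]bool)
	var out []Device
	for _, d := range devs {
		if key, ok := dedupKey(d); ok {
			if seen[key] {
				continue
			}
			seen[key] = true
		}
		out = append(out, d)
	}
	return out
}

// sampleDevices covers all three priorities.
var sampleDevices = []Device{
	{Serial: "SN1", Model: "nic"},
	{Serial: "SN1", BDF: "0000:3b:00.0", Model: "nic"},
	{Serial: "N/A", BDF: "0000:5e:00.0", Model: "gpu"},
	{Serial: "", BDF: "0000:5e:00.0", Model: "gpu"},
	{Model: "riser"},
	{Model: "riser"},
}

func main() {
	fmt.Println(len(dedupe(sampleDevices)))
}
```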

-### Device schema alignment
+## Raw payloads

-Keep `hardware.devices` schema as close as possible to Reanimator JSON field names.
-This minimizes translation logic in the exporter and prevents drift.
+`raw_payloads` is authoritative for replayable sources.

---
+Current important payloads:
+- `redfish_tree`
+- `redfish_fetch_errors`
+- `source_timezone`

-## Source metadata fields (stored directly on `AnalysisResult`)
+Normalized hardware fields are derived output, not the long-term source of truth.

-Carried by both `/api/status` and `/api/config`:
+## Raw export package

-```json
-{
-  "source_type": "api",
-  "protocol": "redfish",
-  "target_host": "10.0.0.1",
-  "collected_at": "2026-02-10T15:30:00Z"
-}
-```
-
-Valid `source_type` values: `archive`, `api`
-Valid `protocol` values: `redfish`, `ipmi` (empty is allowed for archive uploads)
-
---
-
-## Raw Export Package (reopenable artifact)
-
-`Export Raw Data` does not merely dump `AnalysisResult`; it emits a reopenable raw package
-(JSON or ZIP bundle) that carries source data required for re-analysis.
+`/api/export/json` produces a reopenable raw-export artifact.

 Design rules:
- raw source is authoritative (`redfish_tree` or original file bytes)
- imports must re-analyze from raw source
- parsed field snapshots included in bundles are diagnostic artifacts, not the source of truth
+- raw source stays authoritative
+- uploads of raw-export artifacts must re-analyze from raw source
+- parsed snapshots inside the bundle are diagnostic only
--- a/bible-local/05-collectors.md
+++ b/bible-local/05-collectors.md
@@ -3,107 +3,69 @@
 Collectors live in `internal/collector/`.

 Core files:
- `internal/collector/registry.go` — connector registry (`redfish`, `ipmi`)
- `internal/collector/redfish.go` — real Redfish connector
- `internal/collector/ipmi_mock.go` — IPMI mock connector scaffold
- `internal/collector/types.go` — request/progress contracts
+- `registry.go` for protocol registration
+- `redfish.go` for live collection
+- `redfish_replay.go` for replay from raw payloads
+- `ipmi_mock.go` for the placeholder IPMI implementation
+- `types.go` for request/progress contracts

---
+## Redfish collector

-## Redfish Collector (`redfish`)
+Status: active production path.

-**Status:** Production-ready.
+Request fields passed from the server:
+- `host`
+- `port`
+- `username`
+- `auth_type`
+- credential field (`password` or token)
+- `tls_mode`

-### Request contract (from server)
+### Core rule

-Passed through from `/api/collect` after validation:
- `host`, `port`, `username`
- `auth_type=password|token` (+ matching credential field)
- `tls_mode=strict|insecure`
+Live collection and replay must stay behaviorally aligned.
+If the collector adds a fallback, probe, or normalization rule, replay must mirror it.

-### Discovery
+### Discovery model

-Dynamic — does not assume fixed paths. Discovers:
- `Systems` collection → per-system resources
- `Chassis` collection → enclosure/board data
- `Managers` collection → BMC/firmware info
+The collector does not rely on one fixed vendor tree.
+It discovers and follows Redfish resources dynamically from root collections such as:
+- `Systems`
+- `Chassis`
+- `Managers`

-### Collected data
+### Stored raw data

-| Category | Notes |
-|----------|-------|
-| CPU | Model, cores, threads, socket, status |
-| Memory | DIMM slot, size, type, speed, serial, manufacturer |
-| Storage | Slot, type, model, serial, firmware, interface, status |
-| GPU | Detected via PCIe class + NVIDIA vendor ID |
-| PSU | Model, serial, wattage, firmware, telemetry (input/output power, voltage) |
-| NIC | Model, serial, port count, BDF |
-| PCIe | Slot, vendor_id, device_id, BDF, link width/speed |
-| Firmware | BIOS, BMC versions |
+Important raw payloads:
+- `raw_payloads.redfish_tree`
+- `raw_payloads.redfish_fetch_errors`
+- `raw_payloads.source_timezone` when available

-### Raw snapshot
+### Snapshot crawler rules

-Full Redfish response tree is stored in `result.RawPayloads["redfish_tree"]`.
-This allows future offline re-analysis without re-collecting from a live BMC.
+- bounded by `LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`
+- prioritized toward high-value inventory paths
+- tolerant of expected vendor-specific failures
+- normalizes `@odata.id` values before queueing
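
Two of these rules, the document cap and `@odata.id` normalization, can be sketched as a bounded breadth-first crawl. The cap is passed as a parameter here rather than read from `LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`, normalization is limited to stripping a `#fragment` (as the previous revision of this section described), and prioritization and error tolerance are left out. `normalizeID`, `crawl`, and the sample link graph are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeID strips a "#fragment" so the same document is not queued twice.
func normalizeID(id string) string {
	if i := strings.IndexByte(id, '#'); i >= 0 {
		id = id[:i]
	}
	return id
}

// crawl visits at most maxDocs documents, breadth-first from root.
// links returns the @odata.id references found in one document.
func crawl(root string, maxDocs int, links func(path string) []string) []string {
	start := normalizeID(root)
	seen := map[string]bool{start: true}
	queue := []string{start}
	var visited []string
	for len(queue) > 0 && len(visited) < maxDocs {
		path := queue[0]
		queue = queue[1:]
		visited = append(visited, path)
		for _, next := range links(path) {
			next = normalizeID(next)
			if next == "" || seen[next] {
				continue
			}
			seen[next] = true
			queue = append(queue, next)
		}
	}
	return visited
}

// sampleGraph simulates the references a small BMC might expose.
var sampleGraph = map[string][]string{
	"/redfish/v1":         {"/redfish/v1/Systems", "/redfish/v1/Chassis", "/redfish/v1/Systems#/Members"},
	"/redfish/v1/Systems": {"/redfish/v1/Systems/1"},
	"/redfish/v1/Chassis": {"/redfish/v1/Chassis/1"},
}

func sampleLinks(path string) []string { return sampleGraph[path] }

func main() {
	fmt.Println(len(crawl("/redfish/v1", 100, sampleLinks)))
	fmt.Println(len(crawl("/redfish/v1", 3, sampleLinks)))
}
```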

-### Unified Redfish analysis pipeline (live == replay)
+### Redfish implementation guidance

-LOGPile uses a **single Redfish analyzer path**:
+When changing collection logic:

-1. Live collector crawls the Redfish API and builds `raw_payloads.redfish_tree`
-2. Parsed result is produced by replaying that tree through the same analyzer used by raw import
+1. Prefer alternate-path support over vendor hardcoding
+2. Keep expensive probing bounded
+3. Deduplicate by serial, then BDF, then location/model fallbacks
+4. Preserve replay determinism from saved raw payloads
+5. Add tests for both the motivating topology and a negative case

-This guarantees that live collection and `Export Raw Data` re-open/re-analyze produce the same
-normalized output for the same `redfish_tree`.
+### Known vendor fallbacks

-### Snapshot crawler behavior (important)
+- empty standard drive collections may trigger bounded `Disk.Bay` probing
+- `Storage.Links.Enclosures[*]` may be followed to recover physical drives
+- `PowerSubsystem/PowerSupplies` is preferred over legacy `Power` when available

-The Redfish snapshot crawler is intentionally:
- **bounded** (`LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`)
- **prioritized** (PCIe, Fabrics, FirmwareInventory, Storage, PowerSubsystem, ThermalSubsystem)
- **tolerant** (skips noisy expected failures, strips `#fragment` from `@odata.id`)
+## IPMI collector

-Design notes:
- Queue capacity is sized to snapshot cap to avoid worker deadlocks on large trees.
- UI progress is coarse and human-readable; detailed per-request diagnostics are available via debug logs.
- `LOGPILE_REDFISH_DEBUG=1` and `LOGPILE_REDFISH_SNAPSHOT_DEBUG=1` enable console diagnostics.
+Status: mock scaffold only.

-### Parsing guidelines
-
-When adding Redfish mappings, follow these principles:
- Support alternate collection paths (resources may appear at different odata URLs).
- Follow `@odata.id` references and handle embedded `Members` arrays.
- Prefer **raw-tree replay compatibility**: if live collector adds a fallback/probe, replay analyzer must mirror it.
- Deduplicate by serial / BDF / slot+model (in that priority order).
- Prefer tolerant/fallback parsing — missing fields should be silently skipped,
-  not cause the whole collection to fail.
-
-### Vendor-specific storage fallbacks (Supermicro and similar)
-
-When standard `Storage/.../Drives` collections are empty, collector/replay may recover drives via:
- `Storage.Links.Enclosures[*] -> .../Drives`
- direct probing of finite `Disk.Bay` candidates (`Disk.Bay.0`, `Disk.Bay0`, `.../0`)
-
-This is required for some BMCs that publish drive inventory in vendor-specific paths while leaving
-standard collections empty.
-
-### PSU source preference (newer Redfish)
-
-PSU inventory source order:
-1. `Chassis/*/PowerSubsystem/PowerSupplies` (preferred on X14+/newer Redfish)
-2. `Chassis/*/Power` (legacy fallback)
-
-### Progress reporting
-
-The collector emits progress log entries at each stage (connecting, enumerating systems,
-collecting CPUs, etc.) so the UI can display meaningful status.
-Current progress message strings are user-facing and may be localized.
-
---
-
-## IPMI Collector (`ipmi`)
-
-**Status:** Mock scaffold only — not implemented.
-
-Registered in the collector registry but returns placeholder data.
-Real IPMI support is a future work item.
+It remains registered for protocol completeness, but it is not a real collection path.
--- a/bible-local/06-parsers.md
+++ b/bible-local/06-parsers.md
@@ -2,261 +2,69 @@

 ## Framework

-### Registration
+Parsers live in `internal/parser/` and vendor implementations live in `internal/parser/vendors/`.

-Each vendor parser registers itself via Go's `init()` side-effect import pattern.
+Core behavior:
+- registration uses `init()` side effects
+- all registered parsers run `Detect()`
+- the highest-confidence parser wins
+- generic fallback stays last and low-confidence

-All registrations are collected in `internal/parser/vendors/vendors.go`:
-```go
-import (
-    _ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
-    _ "git.mchus.pro/mchus/logpile/internal/parser/vendors/dell"
-    // etc.
-)
-```
-
-### VendorParser interface
+`VendorParser` contract:

 ```go
 type VendorParser interface {
-    Name() string                                // human-readable name
-    Vendor() string                              // vendor identifier string
-    Version() string                             // parser version (increment on logic changes)
-    Detect(files []ExtractedFile) int            // confidence 0–100
+    Name() string
+    Vendor() string
+    Version() string
+    Detect(files []ExtractedFile) int
    Parse(files []ExtractedFile) (*models.AnalysisResult, error)
 }
 ```
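
Highest-confidence selection over this contract might look like the sketch below. To stay self-contained it uses a reduced `detector` interface (only `Name` and `Detect`) and a hypothetical `markerParser`; the marker strings and scores are illustrative, and breaking ties in favor of the first registered parser is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// ExtractedFile is reduced to the path; the real type carries more.
type ExtractedFile struct {
	Path string
}

// detector is the subset of VendorParser needed for selection.
type detector interface {
	Name() string
	Detect(files []ExtractedFile) int
}

// markerParser reports a fixed confidence when any path contains its marker.
type markerParser struct {
	name   string
	marker string
	score  int
}

func (p markerParser) Name() string { return p.name }

func (p markerParser) Detect(files []ExtractedFile) int {
	for _, f := range files {
		if strings.Contains(f.Path, p.marker) {
			return p.score
		}
	}
	return 0
}

// selectParser runs every Detect and returns the name of the top scorer,
// or "" when no parser reports a confidence above zero.
func selectParser(parsers []detector, files []ExtractedFile) string {
	best, bestScore := "", 0
	for _, p := range parsers {
		if score := p.Detect(files); score > bestScore {
			best, bestScore = p.Name(), score
		}
	}
	return best
}

// sampleParsers: an empty marker matches any file, modeling the
// low-confidence generic fallback.
var sampleParsers = []detector{
	markerParser{name: "dell", marker: "tsr", score: 90},
	markerParser{name: "generic", marker: "", score: 5},
}

func main() {
	fmt.Println(selectParser(sampleParsers, []ExtractedFile{{Path: "tsr/metadata.json"}}))
	fmt.Println(selectParser(sampleParsers, []ExtractedFile{{Path: "notes.txt"}}))
}
```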

-### Selection logic
+## Adding a parser

-All registered parsers run `Detect()` against the uploaded archive's file list.
-The parser with the **highest confidence score** is selected.
-Multiple parsers may return >0; only the top scorer is used.
+1. Create `internal/parser/vendors/<vendor>/`
+2. Start from `internal/parser/vendors/template/parser.go.template`
+3. Implement `Detect()` and `Parse()`
+4. Add a blank import in `internal/parser/vendors/vendors.go`
+5. Add at least one positive and one negative detection test

-### Adding a new vendor parser
+## Data quality rules

-1. `mkdir -p internal/parser/vendors/VENDORNAME`
-2. Copy `internal/parser/vendors/template/parser.go.template` as starting point.
-3. Implement `Detect()` and `Parse()`.
-4. Add blank import to `vendors/vendors.go`.
+### System firmware only in `hardware.firmware`

-`Detect()` tips:
- Look for unique filenames or directory names.
- Check file content for vendor-specific markers.
- Return 70+ only when confident; return 0 if clearly not a match.
+`hardware.firmware` must contain system-level firmware only.
+Device-bound firmware belongs on the device record and must not be duplicated at the top level.

-### Parser versioning
+### Strip embedded MAC addresses from model names

-Each parser file contains a `parserVersion` constant.
-Increment the version whenever parsing logic changes — this helps trace which
-version produced a given result.
+If a source embeds ` - XX:XX:XX:XX:XX:XX` in a model/name field, remove that suffix before storing it.

---
+### Use `pci.ids` for empty or generic PCI model names

-## Parser data quality rules
+When `vendor_id` and `device_id` are known but the model name is missing or generic, resolve the name via `internal/parser/vendors/pciids`.

-### FirmwareInfo — system-level only
+## Active vendor coverage

-`Hardware.Firmware` must contain **only system-level firmware**: BIOS, BMC/iDRAC,
-Lifecycle Controller, CPLD, storage controllers, BOSS adapters.
+| Vendor ID | Input family | Notes |
+|-----------|--------------|-------|
+| `dell` | TSR ZIP archives | Broad hardware, firmware, sensors, lifecycle events |
+| `h3c_g5` | H3C SDS G5 bundles | INI/XML/CSV-driven hardware and event parsing |
+| `h3c_g6` | H3C SDS G6 bundles | Similar flow with G6-specific files |
+| `inspur` | onekeylog archives | FRU/SDR plus optional Redis enrichment |
+| `nvidia` | HGX Field Diagnostics | GPU- and fabric-heavy diagnostic input |
+| `nvidia_bug_report` | `nvidia-bug-report-*.log.gz` | dmidecode, lspci, NVIDIA driver sections |
+| `unraid` | Unraid diagnostics/log bundles | Server and storage-focused parsing |
+| `xigmanas` | XigmaNAS plain logs | FreeBSD/NAS-oriented inventory |
+| `generic` | fallback | Low-confidence text fallback when nothing else matches |

-**Device-bound firmware** (NIC, GPU, PSU, disk, backplane) **must NOT be added to
-`Hardware.Firmware`**. It belongs to the device's own `Firmware` field and is already
-present there. Duplicating it in `Hardware.Firmware` causes double entries in Reanimator.
+## Practical guidance

-The Reanimator exporter filters by `FirmwareInfo.DeviceName` prefix and by
-`FirmwareInfo.Description` (FQDD prefix). Parsers must cooperate:
-
- Store the device's FQDD (or equivalent slot identifier) in `FirmwareInfo.Description`
-  for all firmware entries that come from a per-device inventory source (e.g. Dell
-  `DCIM_SoftwareIdentity`).
- FQDD prefixes that are device-bound: `NIC.`, `PSU.`, `Disk.`, `RAID.Backplane.`, `GPU.`
-
-### NIC/device model names — strip embedded MAC addresses
-
-Some vendors (confirmed: Dell TSR) embed the MAC address in the device model name field,
-e.g. `ProductName = "NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"`.
-
-**Rule:** Strip any ` - XX:XX:XX:XX:XX:XX` suffix from model/name strings before storing
-them in `FirmwareInfo.DeviceName`, `NetworkAdapter.Model`, or any other model field.
-
-Use `nicMACInModelRE` (defined in the Dell parser) or an equivalent regex:
-```
-\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$
-```
-
-This applies to **all** string fields used as device names or model identifiers.
-
-### PCI device name enrichment via pci.ids
-
-If a PCIe device, GPU, NIC, or any hardware component has a `vendor_id` + `device_id`
-but its model/name field is **empty or generic** (e.g. blank, equals the description,
-or is just a raw hex ID), the parser **must** attempt to resolve the human-readable
-model name from the embedded `pci.ids` database before storing the result.
-
-**Rule:** When `Model` (or equivalent name field) is empty and both `VendorID` and
-`DeviceID` are non-zero, call the pciids lookup and use the result as the model name.
-
-```go
-// Example pattern — use in any parser that handles PCIe/GPU/NIC devices:
-if strings.TrimSpace(device.Model) == "" && device.VendorID != 0 && device.DeviceID != 0 {
-    if name := pciids.Lookup(device.VendorID, device.DeviceID); name != "" {
-        device.Model = name
-    }
-}
-```
-
-This rule applies to all vendor parsers. The pciids package is available at
-`internal/parser/vendors/pciids`. See ADL-005 for the rationale.
-
-**Do not hardcode model name strings.** If a device is unknown today, it will be
-resolved automatically once `pci.ids` is updated.
-
---
-
-## Vendor parsers
-
-### Inspur / Kaytus (`inspur`)
-
-**Status:** Ready. Tested on KR4268X2 (onekeylog format).
-
-**Archive format:** `.tar.gz` onekeylog
-
-**Primary source files:**
-
-| File | Content |
-|------|---------|
-| `asset.json` | Base hardware inventory |
-| `component.log` | Component list |
-| `devicefrusdr.log` | FRU and SDR data |
-| `onekeylog/runningdata/redis-dump.rdb` | Runtime enrichment (optional) |
-
-**Redis RDB enrichment** (applied conservatively — fills missing fields only):
- GPU: `serial_number`, `firmware` (VBIOS/FW), runtime telemetry
- NIC: firmware, serial, part number (when text logs leave fields empty)
-
-**Module structure:**
-```
-inspur/
-  parser.go   — main parser + registration
-  sdr.go      — sensor/SDR parsing
-  fru.go      — FRU serial parsing
-  asset.go    — asset.json parsing
-  syslog.go   — syslog parsing
-```
-
---
-
-### Dell TSR (`dell`)
-
-**Status:** Ready (v3.0). Tested on nested TSR archives with embedded `*.pl.zip`.
-
-**Archive format:** `.zip` (outer archive + nested `*.pl.zip`)
-
-**Primary source files:**
- `tsr/metadata.json`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_CIM_Sensor.xml`
- `tsr/hardware/sysinfo/lcfiles/curr_lclog.xml`
-
-**Extracted data:**
- Board/system identity and BIOS/iDRAC firmware
- CPU, memory, physical disks, virtual disks, PSU, NIC, PCIe
- GPU inventory (`DCIM_VideoView`) + GPU sensor enrichment (`DCIM_GPUSensor`)
- Controller/backplane inventory (`DCIM_ControllerView`, `DCIM_EnclosureView`)
- Sensor readings (temperature/voltage/current/power/fan/utilization)
- Lifecycle events (`curr_lclog.xml`)
-
---
-
-### NVIDIA HGX Field Diagnostics (`nvidia`)
-
-**Status:** Ready (v1.1.0). Works with any server vendor.
-
-**Archive format:** `.tar` / `.tar.gz`
-
-**Confidence scoring:**
-
-| File | Score |
-|------|-------|
-| `unified_summary.json` with "HGX Field Diag" marker | +40 |
-| `summary.json` | +20 |
-| `summary.csv` | +15 |
-| `gpu_fieldiag/` directory | +15 |
-
-**Source files:**
-
-| File | Content |
-|------|---------|
-| `output.log` | dmidecode — server manufacturer, model, serial number |
-| `unified_summary.json` | GPU details, NVSwitch devices, PCI addresses |
-| `summary.json` | Diagnostic test results and error codes |
-| `summary.csv` | Alternative test results format |
-
-**Extracted data:**
- GPUs: slot, model, manufacturer, firmware (VBIOS), BDF
- NVSwitch devices: slot, device_class, vendor_id, device_id, BDF, link speed/width
- Events: diagnostic test failures (connectivity, gpumem, gpustress, pcie, nvlink, nvswitch, power)
-
-**Severity mapping:**
- `info` — tests passed
- `warning` — e.g. "Row remapping failed"
- `critical` — error codes 300+
-
-**Known limitations:**
- Detailed logs in `gpu_fieldiag/*.log` are not parsed.
- No CPU, memory, or storage extraction (not present in field diag archives).
-
---
-
-### NVIDIA Bug Report (`nvidia_bug_report`)
-
-**Status:** Ready (v1.0.0).
-
-**File format:** `nvidia-bug-report-*.log.gz` (gzip-compressed text)
-
-**Confidence:** 85 (high priority for matching filename pattern)
-
-**Source sections parsed:**
-
-| dmidecode section | Extracts |
-|-------------------|---------|
-| System Information | server serial, UUID, manufacturer, product name |
-| Processor Information | CPU model, serial, core/thread count, frequency |
-| Memory Device | DIMM slot, size, type, manufacturer, serial, part number, speed |
-| System Power Supply | PSU location, manufacturer, model, serial, wattage, firmware, status |
-
-| Other source | Extracts |
-|--------------|---------|
-| `lspci -vvv` (Ethernet/Network/IB) | NIC model (from VPD), BDF, slot, P/N, S/N, port count, port type |
-| `/proc/driver/nvidia/gpus/*/information` | GPU model, BDF, UUID, VBIOS version, IRQ |
-| NVRM version line | NVIDIA driver version |
-
-**Known limitations:**
- Driver error/warning log lines not yet extracted.
- GPU temperature/utilization metrics require additional parsing sections.
-
---
-
-### XigmaNAS (`xigmanas`)
-
-**Status:** Ready.
-
-**Archive format:** Plain log files (FreeBSD-based NAS system)
-
-**Detection:** Files named `xigmanas`, `system`, or `dmesg`; content containing "XigmaNAS" or "FreeBSD"; SMART data presence.
-
-**Extracted data:**
- System: firmware version, uptime, CPU model, memory configuration, hardware platform
- Storage: disk models, serial numbers, capacity, health, SMART temperatures
- Populates: `Hardware.Firmware`, `Hardware.CPUs`, `Hardware.Memory`, `Hardware.Storage`, `Sensors`
-
---
-
-### Unraid (`unraid`)
-
-**Status:** Ready (v1.0.0).
+- Be conservative with high detect scores
+- Prefer filling missing fields over overwriting stronger source data
+- Keep parser version constants current when behavior changes
+- Any new vendor-specific filtering or dedup logic must ship with tests for that vendor format

 **Archive format:** Unraid diagnostics archive contents (text-heavy diagnostics directories).
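
Two of the parser rules in this file, "the highest-confidence parser wins" and "strip embedded MAC addresses from model names", can be sketched as below. This is a minimal illustration, not the code in `internal/parser/`: the `ExtractedFile` fields, the `detector` interface, `pickParser`, `stripMACSuffix`, and `fixedScore` are hypothetical names, and keeping the first parser on a tie is an assumption. The regular expression is the one the previous revision of this page documented.

```go
package main

import (
	"fmt"
	"regexp"
)

// ExtractedFile is a stand-in for the real type; its fields are assumed.
type ExtractedFile struct {
	Path    string
	Content []byte
}

// detector is the subset of the VendorParser contract needed for selection.
type detector interface {
	Name() string
	Detect(files []ExtractedFile) int
}

// pickParser runs Detect on every candidate and returns the highest scorer.
// A score of 0 counts as "no match"; nil is returned when nothing matches.
// On a tie the earlier candidate is kept (an assumption of this sketch).
func pickParser(candidates []detector, files []ExtractedFile) detector {
	var best detector
	bestScore := 0
	for _, c := range candidates {
		if score := c.Detect(files); score > bestScore {
			best, bestScore = c, score
		}
	}
	return best
}

// macSuffixRE matches a trailing " - XX:XX:XX:XX:XX:XX" MAC address suffix.
var macSuffixRE = regexp.MustCompile(`\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$`)

// stripMACSuffix removes an embedded MAC suffix from a model or name string.
func stripMACSuffix(model string) string {
	return macSuffixRE.ReplaceAllString(model, "")
}

// fixedScore is a hypothetical parser that always reports the same confidence.
type fixedScore struct {
	name  string
	score int
}

func (f fixedScore) Name() string               { return f.name }
func (f fixedScore) Detect([]ExtractedFile) int { return f.score }

func main() {
	parsers := []detector{
		fixedScore{"generic", 10},
		fixedScore{"dell", 80},
		fixedScore{"inspur", 0},
	}
	fmt.Println(pickParser(parsers, nil).Name())
	fmt.Println(stripMACSuffix("NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"))
}
```

Because the generic fallback reports a low score, it is selected only when no vendor parser scores higher, which matches the "stays last and low-confidence" rule.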

--- a/bible-local/07-exporters.md
+++ b/bible-local/07-exporters.md
@@ -1,366 +1,63 @@
-# 07 — Exporters & Reanimator Integration
+# 07 — Exporters

-## Export endpoints summary
+## Export surfaces

-| Endpoint | Format | Filename pattern |
-|----------|--------|-----------------|
-| `GET /api/export/csv` | CSV — serial numbers | `YYYY-MM-DD (MODEL) - SN.csv` |
-| `GET /api/export/json` | **Raw export package** (JSON or ZIP bundle) for reopen/re-analysis | `YYYY-MM-DD (MODEL) - SN.(json|zip)` |
-| `GET /api/export/reanimator` | Reanimator hardware JSON | `YYYY-MM-DD (MODEL) - SN.json` |
+| Endpoint | Output | Purpose |
+|----------|--------|---------|
+| `GET /api/export/csv` | CSV | Serial-number export |
+| `GET /api/export/json` | raw-export ZIP bundle | Reopen and re-analyze later |
+| `GET /api/export/reanimator` | JSON | Reanimator hardware payload |
+| `POST /api/convert` | async ZIP artifact | Batch archive-to-Reanimator conversion |

---
+## Raw export

-## Raw Export (`Export Raw Data`)
+Raw export is not a final report dump.
+It is a replayable artifact that preserves enough source data for future parser improvements.

-### Purpose
+Current bundle contents:
+- `raw_export.json`
+- `collect.log`
+- `parser_fields.json`

-Preserve enough source data to reproduce parsing later after parser fixes, without requiring
-another live collection from the target system.
+Design rules:
+- raw source is authoritative
+- uploads of raw export must replay from raw source
+- parsed snapshots inside the bundle are diagnostic only

-### Format
+## Reanimator export

-`/api/export/json` returns a **raw export package**:
- JSON package (machine-readable), or
- ZIP bundle containing:
-  - `raw_export.json` — machine-readable package
-  - `collect.log` — human-readable collection + parsing summary
-  - `parser_fields.json` — structured parsed field snapshot for diffs between parser versions
+Implementation files:
+- `internal/exporter/reanimator_models.go`
+- `internal/exporter/reanimator_converter.go`
+- `internal/server/handlers.go`

-### Import / reopen behavior
+Conversion rules:
+- canonical source is `hardware.devices`
+- timestamps are RFC3339
+- status is normalized to Reanimator-friendly values
+- missing PCIe serials may be generated from board serial + slot
+- `NULL`-style board manufacturer/product values are treated as absent

-When a raw export package is uploaded back into LOGPile:
- the app **re-analyzes from raw source**
- it does **not** trust embedded parsed output as source of truth
+## Inclusion rules

-For Redfish, this means replay from `raw_payloads.redfish_tree`.
+Included:
+- empty memory slots (`present=false`) for topology visibility
+- PCIe-class devices even when serial must be synthesized

-### Design rule
+Excluded:
+- storage without `serial_number`
+- power supplies without `serial_number`
+- non-present network adapters
+- device-bound firmware entries in the top-level firmware list (they stay on the device record)

-Raw export is a **re-analysis artifact**, not a final report dump. Keep it self-contained and
-forward-compatible where possible (versioned package format, additive fields only).
+## Batch convert

---
+`POST /api/convert` accepts multiple supported files and produces a ZIP with:
+- one `*.reanimator.json` file per successful input
+- `convert-summary.txt`

-## Reanimator Export
-
-### Purpose
-
-Exports hardware inventory data in the format expected by the Reanimator asset tracking
-system. Enables one-click push from LOGPile to an external asset management platform.
-
-### Implementation files
-
-| File | Role |
-|------|------|
-| `internal/exporter/reanimator_models.go` | Go structs for Reanimator JSON |
-| `internal/exporter/reanimator_converter.go` | `ConvertToReanimator()` and helpers |
-| `internal/server/handlers.go` | `handleExportReanimator()` HTTP handler |
-
-### Conversion rules
-
- Source: canonical `hardware.devices` repository (see [`04-data-models.md`](04-data-models.md))
- CPU manufacturer inferred from model string (Intel / AMD / ARM / Ampere)
- PCIe serial number generated when absent: `{board_serial}-PCIE-{slot}`
- Status values normalized to: `OK`, `Warning`, `Critical`, `Unknown` (`Empty` only for memory slots)
- Timestamps in RFC3339 format
- `target_host` derived from `filename` field (`redfish://…`, `ipmi://…`) if not in source; omitted if undeterminable
- `board.manufacturer` and `board.product_name` values of `"NULL"` treated as absent
-
-### LOGPile → Reanimator field mapping
-
-| LOGPile type | Reanimator section | Notes |
-|---|---|---|
-| `BoardInfo` | `board` | Direct mapping |
-| `CPU` | `cpus` | + manufacturer (inferred) |
-| `MemoryDIMM` | `memory` | Direct; empty slots included (`present=false`) |
-| `Storage` | `storage` | Excluded if no `serial_number` |
-| `PCIeDevice` | `pcie_devices` | Serial generated if missing |
-| `GPU` | `pcie_devices` | `device_class=DisplayController` |
-| `NetworkAdapter` | `pcie_devices` | `device_class=NetworkController` |
-| `PSU` | `power_supplies` | Excluded if no serial or `present=false` |
-| `FirmwareInfo` | `firmware` | Direct mapping |
-
-### Inclusion / exclusion rules
-
-**Included:**
- Memory slots with `present=false` (as Empty slots)
- PCIe devices without serial number (serial is generated)
-
-**Excluded:**
- Storage without `serial_number`
- PSU without `serial_number` or with `present=false`
- NetworkAdapters with `present=false`
-
---
-
-## Reanimator Integration Guide
-
-This section documents the Reanimator receiver-side JSON format (what the Reanimator
-system expects when it ingests a LOGPile export).
-
-> **Important:** The Reanimator endpoint uses a strict JSON decoder (`DisallowUnknownFields`).
-> Any unknown field — including nested ones — causes `400 Bad Request`.
-> Use only `snake_case` keys listed here.
-
-### Top-level structure
-
-```json
-{
-  "filename": "redfish://10.10.10.103",
-  "source_type": "api",
-  "protocol": "redfish",
-  "target_host": "10.10.10.103",
-  "collected_at": "2026-02-10T15:30:00Z",
-  "hardware": {
-    "board": {...},
-    "firmware": [...],
-    "cpus": [...],
-    "memory": [...],
-    "storage": [...],
-    "pcie_devices": [...],
-    "power_supplies": [...]
-  }
-}
-```
-
-**Required:** `collected_at`, `hardware.board.serial_number`
-**Optional:** `target_host`, `source_type`, `protocol`, `filename`
-
-`source_type` values: `api`, `logfile`, `manual`
-`protocol` values: `redfish`, `ipmi`, `snmp`, `ssh`
-
-### Component status fields (all component sections)
-
-Each component may carry:
-
-| Field | Type | Description |
-|-------|------|-------------|
-| `status` | string | `OK`, `Warning`, `Critical`, `Unknown`, `Empty` |
-| `status_checked_at` | RFC3339 | When status was last verified |
-| `status_changed_at` | RFC3339 | When status last changed |
-| `status_at_collection` | object | `{ "status": "...", "at": "..." }` — snapshot-time status |
-| `status_history` | array | `[{ "status": "...", "changed_at": "...", "details": "..." }]` |
-| `error_description` | string | Human-readable error for Warning/Critical |
-
-### Board
-
-```json
-{
-  "board": {
-    "manufacturer": "Supermicro",
-    "product_name": "X12DPG-QT6",
-    "serial_number": "21D634101",
-    "part_number": "X12DPG-QT6-REV1.01",
-    "uuid": "d7ef2fe5-2fd0-11f0-910a-346f11040868"
-  }
-}
-```
-
-`serial_number` required. `manufacturer` / `product_name` of `"NULL"` treated as absent.
-
-### CPUs
-
-```json
-{
-  "socket": 0,
-  "model": "INTEL(R) XEON(R) GOLD 6530",
-  "cores": 32,
-  "threads": 64,
-  "frequency_mhz": 2100,
-  "max_frequency_mhz": 4000,
-  "manufacturer": "Intel",
-  "status": "OK"
-}
-```
-
-`socket` (int) and `model` required. Serial generated: `{board_serial}-CPU-{socket}`.
-
-LOT format: `CPU_{VENDOR}_{MODEL_NORMALIZED}` → e.g. `CPU_INTEL_XEON_GOLD_6530`
-
-### Memory
-
-```json
-{
-  "slot": "CPU0_C0D0",
-  "location": "CPU0_C0D0",
-  "present": true,
-  "size_mb": 32768,
-  "type": "DDR5",
-  "max_speed_mhz": 4800,
-  "current_speed_mhz": 4800,
-  "manufacturer": "Hynix",
-  "serial_number": "80AD032419E17CEEC1",
-  "part_number": "HMCG88AGBRA191N",
-  "status": "OK"
-}
-```
-
-`slot` and `present` required. `serial_number` required when `present=true`.
-Empty slots (`present=false`, `status="Empty"`) are included but no component created.
-
-LOT format: `DIMM_{TYPE}_{SIZE_GB}GB` → e.g. `DIMM_DDR5_32GB`
-
-### Storage
-
-```json
-{
-  "slot": "OB01",
-  "type": "NVMe",
-  "model": "INTEL SSDPF2KX076T1",
-  "size_gb": 7680,
-  "serial_number": "BTAX41900GF87P6DGN",
-  "manufacturer": "Intel",
-  "firmware": "9CV10510",
-  "interface": "NVMe",
-  "present": true,
-  "status": "OK"
-}
-```
-
-`slot`, `model`, `serial_number`, `present` required.
-
-LOT format: `{TYPE}_{INTERFACE}_{SIZE_TB}TB` → e.g. `SSD_NVME_07.68TB`
-
-### Power Supplies
-
-```json
-{
-  "slot": "0",
-  "present": true,
-  "model": "GW-CRPS3000LW",
-  "vendor": "Great Wall",
-  "wattage_w": 3000,
-  "serial_number": "2P06C102610",
-  "part_number": "V0310C9000000000",
-  "firmware": "00.03.05",
-  "status": "OK",
-  "input_power_w": 137,
-  "output_power_w": 104,
-  "input_voltage": 215.25
-}
-```
-
-`slot`, `present` required. `serial_number` required when `present=true`.
-Telemetry fields (`input_power_w`, `output_power_w`, `input_voltage`) stored in observation only.
-
-LOT format: `PSU_{WATTAGE}W_{VENDOR_NORMALIZED}` → e.g. `PSU_3000W_GREAT_WALL`
-
-### PCIe Devices
-
-```json
-{
-  "slot": "PCIeCard1",
-  "vendor_id": 32902,
-  "device_id": 2912,
-  "bdf": "0000:18:00.0",
-  "device_class": "MassStorageController",
-  "manufacturer": "Intel",
-  "model": "RAID Controller RSP3DD080F",
-  "link_width": 8,
-  "link_speed": "Gen3",
-  "max_link_width": 8,
-  "max_link_speed": "Gen3",
-  "serial_number": "RAID-001-12345",
-  "firmware": "50.9.1-4296",
-  "status": "OK"
-}
-```
-
-`slot` required. Serial generated if absent: `{board_serial}-PCIE-{slot}`.
-
-`device_class` values: `NetworkController`, `MassStorageController`, `DisplayController`, etc.
-
-LOT format: `PCIE_{DEVICE_CLASS}_{MODEL_NORMALIZED}` → e.g. `PCIE_NETWORK_CONNECTX5`
-
-### Firmware
-
-```json
-[
-  { "device_name": "BIOS", "version": "06.08.05" },
-  { "device_name": "BMC",  "version": "5.17.00" }
-]
-```
-
-Both fields required. Changes trigger `FIRMWARE_CHANGED` timeline events.
-
---
-
-### Import process (Reanimator side)
-
-1. Validate `collected_at` (RFC3339) and `hardware.board.serial_number`.
-2. Find or create Asset by `board.serial_number` → `vendor_serial`.
-3. For each component: filter `present=false`, auto-determine LOT, find or create Component,
-   create Observation, update Installations.
-4. Detect removed components (present in previous snapshot, absent in current) → close Installation.
-5. Generate timeline events: `LOG_COLLECTED`, `INSTALLED`, `REMOVED`, `FIRMWARE_CHANGED`.
-
-**Idempotency:** Repeated import of the same snapshot (same content hash) returns `200 OK`
-with `"duplicate": true` and does not create duplicate records.
-
-### Reanimator API endpoint
-
-```http
-POST /ingest/hardware
-Content-Type: application/json
-```
-
-**Success (201):**
-```json
-{
-  "status": "success",
-  "bundle_id": "lb_01J...",
-  "asset_id": "mach_01J...",
-  "collected_at": "2026-02-10T15:30:00Z",
-  "duplicate": false,
-  "summary": {
-    "parts_observed": 15,
-    "parts_created": 2,
-    "installations_created": 2,
-    "timeline_events_created": 9
-  }
-}
-```
-
-**Duplicate (200):**
-```json
-{ "status": "success", "duplicate": true, "message": "LogBundle with this content hash already exists" }
-```
-
-**Error (400):**
-```json
-{ "status": "error", "error": "validation_failed", "details": { "field": "...", "message": "..." } }
-```
-
-Common `400` causes:
- Unknown JSON field (strict decoder)
- Wrong key name (e.g. `targetHost` instead of `target_host`)
- Invalid `collected_at` format (must be RFC3339)
- Empty `hardware.board.serial_number`
-
-### LOT normalization rules
-
-1. Remove special chars `( ) - ® ™`; replace spaces with `_`
-2. Uppercase all
-3. Collapse multiple underscores to one
-4. Strip common prefixes like `MODEL:`, `PN:`
-
-### Status values
-
-| Value | Meaning | Action |
-|-------|---------|--------|
-| `OK` | Normal | — |
-| `Warning` | Degraded | Create `COMPONENT_WARNING` event (optional) |
-| `Critical` | Failed | Auto-create `failure_event`, create `COMPONENT_FAILED` event |
-| `Unknown` | Not determinable | Treat as working |
-| `Empty` | Slot unpopulated | No component created (memory/PCIe only) |
-
-### Missing field handling
-
-| Field | Fallback |
-|-------|---------|
-| CPU serial | Generated: `{board_serial}-CPU-{socket}` |
-| PCIe serial | Generated: `{board_serial}-PCIE-{slot}` |
-| Other serial | Component skipped if absent |
-| manufacturer (PCIe) | Looked up from `vendor_id` (8086→Intel, 10de→NVIDIA, 15b3→Mellanox…) |
-| status | Treated as `Unknown` |
-| firmware | No `FIRMWARE_CHANGED` event |
+Behavior:
+- unsupported filenames are skipped
+- each file is parsed independently
+- one bad file must not fail the whole batch if at least one conversion succeeds
+- result artifact is temporary and deleted after download
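
The Reanimator conversion and inclusion rules in this file can be illustrated with a few small helpers. This is a sketch under stated assumptions, not the contents of `internal/exporter/reanimator_converter.go`: the function names are hypothetical, the `{board_serial}-PCIE-{slot}` format is taken from the previous revision of this page, and both the case-insensitive `NULL` match and the empty result for an unknown board serial are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// absentIfNull returns "" for empty or NULL-style placeholder values.
// Case-insensitive matching is an assumption of this sketch.
func absentIfNull(v string) string {
	v = strings.TrimSpace(v)
	if strings.EqualFold(v, "NULL") {
		return ""
	}
	return v
}

// pcieSerial keeps a real serial when present and otherwise synthesizes one
// from board serial and slot. Returning "" when either input is unknown is
// an assumption of this sketch.
func pcieSerial(boardSerial, slot, serial string) string {
	if s := strings.TrimSpace(serial); s != "" {
		return s
	}
	if boardSerial == "" || slot == "" {
		return ""
	}
	return fmt.Sprintf("%s-PCIE-%s", boardSerial, slot)
}

// exportable mirrors the exclusion rule for storage and power supplies:
// records without a serial number are skipped.
func exportable(serial string) bool {
	return strings.TrimSpace(serial) != ""
}

func main() {
	fmt.Println(pcieSerial("21D634101", "PCIeCard1", ""))
	fmt.Printf("%q\n", absentIfNull("NULL"))
	fmt.Println(exportable(""))
}
```

Synthesizing a serial only for PCIe-class devices keeps them visible in the export, while storage and power supplies without a serial are dropped rather than given an invented identity.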
--- a/bible-local/08-build-release.md
+++ b/bible-local/08-build-release.md
@@ -4,86 +4,74 @@

 Defined in `cmd/logpile/main.go`:

-| Flag | Default | Description |
-|------|---------|-------------|
+| Flag | Default | Purpose |
+|------|---------|---------|
 | `--port` | `8082` | HTTP server port |
-| `--file` | — | Reserved for archive preload (not active) |
-| `--version` | — | Print version and exit |
-| `--no-browser` | — | Do not open browser on start |
-| `--hold-on-crash` | `true` on Windows | Keep console open on fatal crash for debugging |
+| `--file` | empty | Preload archive file |
+| `--version` | `false` | Print version and exit |
+| `--no-browser` | `false` | Do not auto-open browser |
+| `--hold-on-crash` | `true` on Windows | Keep console open after fatal crash |

-## Build
+## Common commands

 ```bash
-# Local binary (current OS/arch)
 make build
-# Output: bin/logpile
-
-# Cross-platform binaries
 make build-all
-# Output:
-#   bin/logpile-linux-amd64
-#   bin/logpile-linux-arm64
-#   bin/logpile-darwin-amd64
-#   bin/logpile-darwin-arm64
-#   bin/logpile-windows-amd64.exe
-```
-
-Both `make build` and `make build-all` run `scripts/update-pci-ids.sh --best-effort`
-before compilation to sync `pci.ids` from the submodule.
-
-To skip PCI IDs update:
-```bash
-SKIP_PCI_IDS_UPDATE=1 make build
-```
-
-Build flags: `CGO_ENABLED=0` — fully static binary, no C runtime dependency.
-
-## PCI IDs submodule
-
-Source: `third_party/pciids` (git submodule → `github.com/pciutils/pciids`)
-Local copy embedded at build time: `internal/parser/vendors/pciids/pci.ids`
-
-```bash
-# Manual update
+make test
+make fmt
 make update-pci-ids
+```

-# Init submodule after fresh clone
+Notes:
+- `make build` outputs `bin/logpile`
+- `make build-all` builds the supported cross-platform binaries
+- `make build` and `make build-all` run `scripts/update-pci-ids.sh --best-effort` unless `SKIP_PCI_IDS_UPDATE=1`
+
+## PCI IDs
+
+Source submodule: `third_party/pciids`
+Embedded copy: `internal/parser/vendors/pciids/pci.ids`
+
+Typical setup after clone:
+
+```bash
 git submodule update --init third_party/pciids
 ```

-## Release process
+## Release script
+
+Run:

 ```bash
-scripts/release.sh
+./scripts/release.sh
 ```

-What it does:
+Current behavior:
+
 1. Reads version from `git describe --tags`
-2. Validates clean working tree (override: `ALLOW_DIRTY=1`)
-3. Sets stable `GOPATH` / `GOCACHE` / `GOTOOLCHAIN` env
-4. Creates `releases/{VERSION}/` directory
-5. Generates `RELEASE_NOTES.md` template if not present
-6. Builds `darwin-arm64` and `windows-amd64` binaries
-7. Packages all binaries found in `bin/` as `.tar.gz` / `.zip`
+2. Refuses a dirty tree unless `ALLOW_DIRTY=1`
+3. Sets stable Go cache/toolchain environment
+4. Creates `releases/{VERSION}/`
+5. Creates a release-notes template if missing
+6. Builds `darwin-arm64` and `windows-amd64`
+7. Packages any already-present binaries from `bin/`
 8. Generates `SHA256SUMS.txt`
-9. Prints next steps (tag, push, create release manually)

-Release notes template is created in `releases/{VERSION}/RELEASE_NOTES.md`.
+Important limitation:
+- `scripts/release.sh` does not run `make build-all` for you
+- if you want Linux or additional macOS archives in the release directory, build them before running the script

-## Running
+## Run locally

 ```bash
 ./bin/logpile
 ./bin/logpile --port 9090
 ./bin/logpile --no-browser
 ./bin/logpile --version
-./bin/logpile --hold-on-crash   # keep console open on crash (default on Windows)
 ```

 ## macOS Gatekeeper

-After downloading a binary, remove the quarantine attribute:
 ```bash
 xattr -d com.apple.quarantine /path/to/logpile-darwin-arm64
 ```
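
The flag table in `08-build-release.md` maps naturally onto Go's standard `flag` package. The sketch below shows one way those defaults could be declared; it is not a copy of `cmd/logpile/main.go`, and the `options` struct and `parseOptions` function are hypothetical names.

```go
package main

import (
	"flag"
	"fmt"
	"runtime"
)

// options holds the parsed command-line settings (hypothetical struct).
type options struct {
	Port        int
	File        string
	Version     bool
	NoBrowser   bool
	HoldOnCrash bool
}

// parseOptions declares the documented flags with their documented defaults.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("logpile", flag.ContinueOnError)
	fs.IntVar(&o.Port, "port", 8082, "HTTP server port")
	fs.StringVar(&o.File, "file", "", "preload archive file")
	fs.BoolVar(&o.Version, "version", false, "print version and exit")
	fs.BoolVar(&o.NoBrowser, "no-browser", false, "do not auto-open browser")
	// The default is true only on Windows, as stated in the flag table.
	fs.BoolVar(&o.HoldOnCrash, "hold-on-crash", runtime.GOOS == "windows",
		"keep console open after fatal crash")
	err := fs.Parse(args)
	return o, err
}

func main() {
	o, err := parseOptions([]string{"--port", "9090", "--no-browser"})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Printf("%+v\n", o)
}
```

The `flag` package accepts both `-port` and `--port`, so the double-dash spelling used in the "Run locally" commands works without extra handling.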
--- a/bible-local/09-testing.md
+++ b/bible-local/09-testing.md
@@ -1,134 +1,54 @@
 # 09 — Testing

-## Required before merge
+## Baseline
+
+Required before merge:

 ```bash
 go test ./...
 ```

-All tests must pass before any change is merged.
+## Test locations

-## Where to add tests
-
-| Change area | Test location |
-|-------------|---------------|
-| Collectors | `internal/collector/*_test.go` |
-| HTTP handlers | `internal/server/*_test.go` |
+| Area | Location |
+|------|----------|
+| Collectors and replay | `internal/collector/*_test.go` |
+| HTTP handlers and jobs | `internal/server/*_test.go` |
 | Exporters | `internal/exporter/*_test.go` |
-| Parsers | `internal/parser/vendors/<vendor>/*_test.go` |
+| Vendor parsers | `internal/parser/vendors/<vendor>/*_test.go` |

-## Exporter tests
+## General rules

-The Reanimator exporter has comprehensive coverage:
+- Prefer table-driven tests
+- No network access in unit tests
+- Cover happy path and realistic failure/partial-data cases
+- New vendor parsers need both detection and parse coverage

-| Test file | Coverage |
-|-----------|----------|
-| `reanimator_converter_test.go` | Unit tests per conversion function |
-| `reanimator_integration_test.go` | Full export with realistic `AnalysisResult` |
+## Mandatory coverage for dedup/filter/classify logic
+
+Any new deduplication, filtering, or classification function must have:
+
+1. A true-positive case
+2. A true-negative case
+3. A regression case for the vendor or topology that motivated the change
+
+This is mandatory for inventory logic, firmware filtering, and similar code paths where silent data drift is likely.
+
+## Mandatory coverage for expensive path selection
+
+Any function that decides whether to crawl or probe an expensive path must have:
+
+1. A positive selection case
+2. A negative exclusion case
+3. A topology-level count/integration case
+
+The goal is to catch runaway I/O regressions before they ship.
+
+## Useful focused commands

-Run exporter tests only:
 ```bash
 go test ./internal/exporter/...
-go test ./internal/exporter/... -v -run Reanimator
-go test ./internal/exporter/... -cover
-```
-
-## Guidelines
-
- Prefer table-driven tests for parsing logic (multiple input variants).
- Do not rely on network access in unit tests.
- Test both the happy path and edge cases (missing fields, empty collections).
- When adding a new vendor parser, include at minimum:
-  - `Detect()` test with a positive and a negative sample file list.
-  - `Parse()` test with a minimal but representative archive.
-
-## Dedup and filtering functions — mandatory coverage
-
-Any function that deduplicates, filters, or classifies hardware inventory items
-**must** have tests covering all three axes before the code is considered done:
-
-| Axis | What to test | Why |
-|------|-------------|-----|
-| **True positive** | Items that ARE duplicates are collapsed to one | Proves the function works |
-| **True negative** | Items that are NOT duplicates are kept separate | Proves the function doesn't over-collapse |
-| **Counter-case** | The scenario that motivated the original code still works after changes | Prevents regression from future fixes |
-
-### Worked example — GPU dedup regression (2026-03-11)
-
-`collectGPUsFromProcessors` was added for MSI (chassis Id matches processor Id).
-No tests → when Supermicro HGX arrived (chassis Id = "HGX_GPU_SXM_1", processor Id = "GPU_SXM_1"),
-the chassis lookup silently returned nothing, serial stayed empty, UUID was new → 8 duplicate GPUs.
-
-Simultaneously, fixing `gpuDocDedupKey` to use `slot|model` before path collapsed two distinct
-GraphicsControllers GPUs with the same model into one — breaking an existing test that had no
-counter-case for the path-fallback scenario.
-
-**Required test matrix for any dedup function:**
-
-```
-TestXxx_CollapsesDuplicates   — same item via two sources → 1 result
-TestXxx_KeepsDistinct          — two different items with same model → 2 results
-TestXxx_<VendorThatMotivated>  — the specific vendor/setup that triggered the code
-```
-
-### Worked example — firmware filter regression (2026-03-12)
-
-`collectFirmwareInventory` was added in `6c19a58` without coverage for Supermicro naming.
-`isDeviceBoundFirmwareName` had patterns for Dell-style names (`"GPU SomeDevice"`, `"NIC OnboardLAN"`)
-but Supermicro Redfish uses `"GPU1 System Slot0"` and `"NIC1 System Slot0 ..."` — digit follows
-immediately after the type prefix. 29 device-bound entries leaked into `hardware.firmware`.
-
-`9c5512d` attempted to fix this with HGX ID patterns (`_fw_gpu_`, etc.) in the wrong field:
-the filter checked `DeviceName` but `collectFirmwareInventory` populates it from `Name` first
-(`"Software Inventory"` for all HGX per-component slots), not from the `Id` field that contains
-the firmware ID like `"HGX_FW_GPU_SXM_1"`. The patterns were effectively dead code from day one.
-
-**Required test matrix for any filter function:**
-
-```
-TestXxx_FiltersDeviceBound_Dell         — Dell-style names that motivated the original code
-TestXxx_FiltersDeviceBound_Supermicro   — Supermicro names with digit suffix (GPU1/NIC1)
-TestXxx_KeepsSystemLevel                — BIOS, BMC, CPLD names must NOT be filtered
-```
-
-### Practical rule
-
-When you write a new filter/dedup/classify function, ask:
-1. Does my test cover the vendor that motivated this code?
-2. Does my test cover a *different* vendor or naming convention where the function must NOT fire?
-3. If I change the dedup key logic, do existing tests still exercise the old correct behavior?
-4. When the filter checks a field on a model struct, does my test verify that the field is
-   actually populated by the collector? (Dead-code filter pattern: `9c5512d` `_fw_gpu_` check.)
-
-If any answer is "no" — add the missing test before committing.
-
-## Collector candidate-selection functions — mandatory coverage
-
-Any function that selects paths for an expensive operation (probing, crawling, plan-B retry)
-**must** have tests covering:
-
-| Axis | What to test | Why |
-|------|-------------|-----|
-| **Positive** | Paths that should be selected ARE selected | Proves the feature works |
-| **Negative** | Paths that should be excluded ARE excluded | Prevents runaway I/O |
-| **Topology integration** | Given a realistic `out` map, the count of selected paths matches expectations | Catches implicit coupling between the selector and the surrounding data shape |
-
-### Worked example — NVMe post-probe regression (2026-03-12)
-
-`shouldAdaptiveNVMeProbe` was added in `2fa4a12` for Supermicro NVMe backplanes that return
-`Members: []` but serve disks at `Disk.Bay.N` paths. No topology-level test was added.
-
-When SYS-A21GE-NBRT (HGX B200) arrived, its 35 sub-chassis (GPU, NVSwitch, PCIeRetimer,
-ERoT, IRoT, BMC, FPGA) all have `ChassisType=Module/Component/Zone` and empty `/Drives` →
-all 35 passed the filter → 35 × 384 = 13 440 HTTP requests → 22 min extra per collection.
-
-A topology integration test (`TestNVMePostProbeSkipsNonStorageChassis`) would have caught
-this at commit time: given GPU chassis + backplane, exactly 1 candidate must be selected.
-
-**Required test matrix for any path-selection function:**
-
-```
-TestXxx_SelectsTargetPath        — the path that motivated the code IS selected
-TestXxx_SkipsIrrelevantPath      — a path that must never be selected IS skipped
-TestXxx_TopologyCount            — given a realistic multi-chassis map, selected count = N
+go test ./internal/collector/...
+go test ./internal/server/...
+go test ./internal/parser/vendors/...
 ```
--- a/bible-local/README.md
+++ b/bible-local/README.md
@@ -1,59 +1,41 @@
 # LOGPile Bible

-> **Documentation language:** English only. All maintained project documentation must be written in English.
->
-> **Architectural decisions:** Every significant architectural decision **must** be recorded in
-> [`10-decisions.md`](10-decisions.md) before or alongside the code change.
->
-> **Single source of truth:** Architecture and technical design documentation belongs in `docs/bible/`.
-> Keep `README.md` and `CLAUDE.md` minimal to avoid duplicate documentation.
+`bible-local/` is the project-specific source of truth for LOGPile.
+Keep top-level docs minimal and put maintained architecture/API contracts here.

-This directory is the single source of truth for LOGPile's architecture, design, and integration contracts.
-It is structured so that both humans and AI assistants can navigate it quickly.
+## Rules

----
+- Documentation language: English only
+- Update relevant bible files in the same change as the code
+- Record significant architectural decisions in [`10-decisions.md`](10-decisions.md)
+- Do not duplicate shared rules from `bible/`

-## Reading Map (Hierarchical)
+## Read order

-### 1. Foundations (read first)
+| File | Purpose |
+|------|---------|
+| [01-overview.md](01-overview.md) | Product scope, modes, non-goals |
+| [02-architecture.md](02-architecture.md) | Runtime structure, state, main flows |
+| [04-data-models.md](04-data-models.md) | Stable data contracts and canonical inventory |
+| [03-api.md](03-api.md) | HTTP endpoints and response contracts |
+| [05-collectors.md](05-collectors.md) | Live collection behavior |
+| [06-parsers.md](06-parsers.md) | Archive parser framework and vendor coverage |
+| [07-exporters.md](07-exporters.md) | Raw export, Reanimator export, batch convert |
+| [08-build-release.md](08-build-release.md) | Build and release workflow |
+| [09-testing.md](09-testing.md) | Test expectations and regression rules |
+| [10-decisions.md](10-decisions.md) | Architectural Decision Log |

-| File | What it covers |
-|------|----------------|
-| [01-overview.md](01-overview.md) | Product purpose, operating modes, scope |
-| [02-architecture.md](02-architecture.md) | Runtime structure, control flow, in-memory state |
-| [04-data-models.md](04-data-models.md) | Core contracts (`AnalysisResult`, canonical `hardware.devices`) |
+## Fast orientation

-### 2. Runtime Interfaces
-
-| File | What it covers |
-|------|----------------|
-| [03-api.md](03-api.md) | HTTP API contracts and endpoint behavior |
-| [05-collectors.md](05-collectors.md) | Live collection connectors (Redfish, IPMI mock) |
-| [06-parsers.md](06-parsers.md) | Archive parser framework and vendor parsers |
-| [07-exporters.md](07-exporters.md) | CSV / JSON / Reanimator exports and integration mapping |
-
-### 3. Delivery & Quality
-
-| File | What it covers |
-|------|----------------|
-| [08-build-release.md](08-build-release.md) | Build, packaging, release workflow |
-| [09-testing.md](09-testing.md) | Testing expectations and verification guidance |
-
-### 4. Governance (always current)
-
-| File | What it covers |
-|------|----------------|
-| [10-decisions.md](10-decisions.md) | Architectural Decision Log (ADL) |
-
----
-
-## Quick orientation for AI assistants
-
- Read order for most changes: `01` → `02` → `04` → relevant interface doc(s) → `10`
 - Entry point: `cmd/logpile/main.go`
-- HTTP server: `internal/server/` — handlers in `handlers.go`, routes in `server.go`
-- Data contracts: `internal/models/` — never break `AnalysisResult` JSON shape
-- Frontend contract: `web/static/js/app.js` — keep API responses stable
-- Canonical inventory: `hardware.devices` in `AnalysisResult` — source of truth for UI and exports
-- Parser registry: `internal/parser/vendors/` — `init()` auto-registration pattern
-- Collector registry: `internal/collector/registry.go`
+- HTTP layer: `internal/server/`
+- Core contracts: `internal/models/models.go`
+- Live collection: `internal/collector/`
+- Archive parsing: `internal/parser/`
+- Export conversion: `internal/exporter/`
+- Frontend consumer: `web/static/js/app.js`
+
+## Maintenance rule
+
+If a document becomes stale, either fix it immediately or delete it.
+Stale docs are worse than missing docs.