- Dell NICView: strip " - XX:XX:XX:XX:XX:XX" suffix from ProductName (Dell TSR embeds MAC in this field for every NIC port)
- Dell SoftwareIdentity: same strip applied to ElementName; store FQDD in FirmwareInfo.Description so exporter can filter device-bound entries
- Exporter: add isDeviceBoundFirmwareFQDD() to filter firmware entries whose Description matches NIC./PSU./Disk./RAID.Backplane./GPU. FQDD prefixes (prevents device firmware from appearing in hardware.firmware)
- Exporter: extend isDeviceBoundFirmwareName() to filter HGX GPU/NVSwitch firmware inventory IDs (_fw_gpu_, _fw_nvswitch_, _inforom_gpu_)
- Inspur: remove HDD firmware from Hardware.Firmware (already present in Storage.Firmware; duplicating it violates ADL-016)
- bible-local/06-parsers.md: document firmware and MAC stripping rules
- bible-local/10-decisions.md: add ADL-016 (device-bound firmware) and ADL-017 (vendor-embedded MAC in model name fields)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
10 — Architectural Decision Log (ADL)
Rule: Every significant architectural decision must be recorded here before or alongside the code change. This applies to humans and AI assistants alike.
Format: date · title · context · decision · consequences
ADL-001 — In-memory only state (no database)
Date: project start
Context: LOGPile is designed as a standalone diagnostic tool, not a persistent service.
Decision: All parsed/collected data lives in Server.result (in-memory). No database, no files written.
Consequences:
- Data is lost on process restart — intentional.
- Simple deployment: single binary, no setup required.
- JSON export is the persistence mechanism for users who want to save results.
ADL-002 — Vendor parser auto-registration via init()
Date: project start
Context: Need an extensible parser registry without a central factory function.
Decision: Each vendor parser registers itself in its package's init() function.
vendors/vendors.go holds blank imports to trigger registration.
Consequences:
- Adding a new parser requires only: implement interface + add one blank import.
- No central list to maintain (other than the import file).
- go test ./... will include new parsers automatically.
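The registration pattern can be sketched as follows, collapsed into a single file so it runs standalone. The names Parser, Register, RegisteredNames, and dellParser are hypothetical and are not taken from the codebase; in the real layout each vendor type would live in its own package and vendors/vendors.go would blank-import it.

```go
package main

import (
	"fmt"
	"sort"
)

// Parser is a hypothetical minimal parser interface for this sketch.
type Parser interface {
	Name() string
}

// registry holds every parser that registered itself during package init.
var registry = map[string]Parser{}

// Register adds a parser to the registry; vendor packages call it from init().
func Register(p Parser) {
	registry[p.Name()] = p
}

// RegisteredNames returns the registered parser names in sorted order.
func RegisteredNames() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// dellParser stands in for a vendor package (hypothetical).
type dellParser struct{}

func (dellParser) Name() string { return "dell" }

// init runs before main, so the parser registers itself without a central factory.
func init() {
	Register(dellParser{})
}

func main() {
	fmt.Println(RegisteredNames())
}
```

Because registration happens in init(), the only central edit needed for a new vendor is the blank import line.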
ADL-003 — Highest-confidence parser wins
Date: project start
Context: Multiple parsers may partially match an archive (e.g. generic + specific vendor).
Decision: Run all parsers' Detect(), select the one returning the highest score (0–100).
Consequences:
- Generic fallback (score 15) only activates when no vendor parser scores higher.
- Parsers must be conservative with high scores (70+) to avoid false positives.
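A minimal sketch of the selection step, assuming a hypothetical Detector interface and a fixedScore test double. Tie-breaking (the first parser with the top score wins) and returning nil when nothing scores above zero are assumptions of this sketch, not documented behavior.

```go
package main

import "fmt"

// Detector is a hypothetical view of a parser: a name plus a confidence score.
type Detector interface {
	Name() string
	Detect(files []string) int // confidence from 0 to 100
}

// fixedScore is a stand-in parser that always reports the same score.
type fixedScore struct {
	name  string
	score int
}

func (f fixedScore) Name() string            { return f.name }
func (f fixedScore) Detect(_ []string) int   { return f.score }

// SelectParser runs Detect on every candidate and returns the highest scorer.
// It returns nil when no candidate scores above zero.
func SelectParser(candidates []Detector, files []string) Detector {
	var best Detector
	bestScore := 0
	for _, c := range candidates {
		if s := c.Detect(files); s > bestScore {
			best, bestScore = c, s
		}
	}
	return best
}

func main() {
	candidates := []Detector{
		fixedScore{"generic", 15}, // fallback score from the ADL
		fixedScore{"dell", 80},    // hypothetical vendor score
	}
	if p := SelectParser(candidates, nil); p != nil {
		fmt.Println(p.Name())
	}
}
```

With only the generic candidate present, the same function returns the fallback, which matches the first consequence above.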
ADL-004 — Canonical hardware.devices as single source of truth
Date: v1.5.0
Context: UI tabs and Reanimator exporter were reading from different sub-fields of
AnalysisResult, causing potential drift.
Decision: Introduce hardware.devices as the canonical inventory repository.
All UI tabs and all exporters must read exclusively from this repository.
Consequences:
- Any UI vs Reanimator discrepancy is classified as a bug, not a "known difference".
- Deduplication logic runs once in the repository builder (serial → bdf → distinct).
- New hardware attributes must be added to canonical schema first, then mapped to consumers.
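One possible reading of the "serial → bdf → distinct" order is sketched below: a record is keyed by serial number when it has one, otherwise by PCI BDF, and otherwise kept as a distinct entry. The Device struct and Dedupe function are hypothetical; the real repository builder may merge fields rather than drop later duplicates.

```go
package main

import "fmt"

// Device is a hypothetical, trimmed-down canonical inventory record.
type Device struct {
	Serial string
	BDF    string
	Model  string
}

// Dedupe keeps the first record per identity: serial first, then BDF.
// Records with neither identifier cannot be matched and are kept as distinct.
func Dedupe(in []Device) []Device {
	seen := map[string]bool{}
	out := make([]Device, 0, len(in))
	for _, d := range in {
		var key string
		switch {
		case d.Serial != "":
			key = "serial:" + d.Serial
		case d.BDF != "":
			key = "bdf:" + d.BDF
		default:
			out = append(out, d)
			continue
		}
		if seen[key] {
			continue
		}
		seen[key] = true
		out = append(out, d)
	}
	return out
}

func main() {
	devices := []Device{
		{Serial: "S1", BDF: "0000:17:00.0"},
		{Serial: "S1", BDF: "0000:17:00.0"}, // same serial: dropped
		{BDF: "0000:65:00.0"},
		{BDF: "0000:65:00.0"}, // same BDF: dropped
		{Model: "riser"},
		{Model: "riser"}, // no identity: both kept
	}
	fmt.Println(len(Dedupe(devices)))
}
```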
ADL-005 — No hardcoded PCI model strings; use pci.ids
Date: v1.5.0
Context: NVIDIA and other vendors release new GPU models frequently; hardcoded maps
required code changes for each new model ID.
Decision: Use the pciutils/pciids database (git submodule, embedded at build time).
PCI vendor/device ID → human-readable model name via lookup.
Consequences:
- New GPU models can be supported by updating pci.ids without code changes.
- make build auto-syncs pci.ids from the submodule before compilation.
- External override via the LOGPILE_PCI_IDS_PATH env var.
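The lookup can be sketched as a small parser over the pci.ids text layout (vendor lines at column 0, device lines indented by one tab, subsystem lines by two). ParsePCIIDs and sampleIDs are hypothetical names, and the sample is only an illustrative excerpt; loading the text from the embedded file or from LOGPILE_PCI_IDS_PATH is left out of this sketch.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sampleIDs is a tiny illustrative excerpt in pci.ids layout.
const sampleIDs = "# illustrative excerpt\n" +
	"10de  NVIDIA Corporation\n" +
	"\t2330  GH100 [H100 SXM5 80GB]\n"

// ParsePCIIDs builds a "vendor:device" -> model name map from pci.ids text.
func ParsePCIIDs(text string) map[string]string {
	names := map[string]string{}
	vendor := ""
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case line == "" || strings.HasPrefix(line, "#"):
			continue // blank line or comment
		case strings.HasPrefix(line, "C "):
			vendor = "" // device-class section; its entries are ignored
		case strings.HasPrefix(line, "\t\t"):
			continue // subsystem entries are skipped in this sketch
		case strings.HasPrefix(line, "\t"):
			id, name, ok := strings.Cut(strings.TrimPrefix(line, "\t"), "  ")
			if ok && vendor != "" {
				names[vendor+":"+strings.ToLower(id)] = name
			}
		default:
			// Vendor line: "vvvv  Vendor Name".
			if id, _, ok := strings.Cut(line, "  "); ok {
				vendor = strings.ToLower(id)
			} else {
				vendor = ""
			}
		}
	}
	return names
}

func main() {
	names := ParsePCIIDs(sampleIDs)
	fmt.Println(names["10de:2330"])
}
```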
ADL-006 — Reanimator export uses canonical hardware.devices (not raw sub-fields)
Date: v1.5.0
Context: Early Reanimator exporter read from Hardware.GPUs, Hardware.NICs, etc.
directly, diverging from UI data.
Decision: Reanimator exporter must use hardware.devices — the same source as the UI.
Exporter groups/filters canonical records by section; does not rebuild from sub-fields.
Consequences:
- Guarantees UI and export consistency.
- Exporter code is simpler — mainly a filter+map, not a data reconstruction.
ADL-007 — Documentation language is English
Date: 2026-02-20
Context: Codebase documentation was mixed Russian/English, reducing clarity for
international contributors and AI assistants.
Decision: All maintained project documentation (docs/bible/, README.md,
CLAUDE.md, and new technical docs) must be written in English.
Consequences:
- Bible is authoritative in English.
- AI assistants get consistent, unambiguous context.
ADL-008 — Bible is the single source of truth for architecture docs
Date: 2026-02-23
Context: Architecture information was duplicated across README.md, CLAUDE.md,
and the Bible, creating drift risk and stale guidance for humans and AI agents.
Decision: Keep architecture and technical design documentation only in docs/bible/.
Top-level README.md and CLAUDE.md must remain minimal pointers/instructions.
Consequences:
- Reduces documentation drift and duplicate updates.
- AI assistants are directed to one authoritative source before making changes.
- Documentation updates that affect architecture must include Bible changes (and ADL entries when significant).
ADL-009 — Redfish analysis is performed from raw snapshot replay (unified tunnel)
Date: 2026-02-24
Context: Live Redfish collection and raw export re-analysis used different parsing paths,
which caused drift and made bug fixes difficult to validate consistently.
Decision: Redfish live collection must produce a raw_payloads.redfish_tree snapshot first,
then run the same replay analyzer used for imported raw exports.
Consequences:
- Same redfish_tree input produces the same parsed result in live and offline modes.
- Debugging parser issues can be done against exported raw bundles without live BMC access.
- Snapshot completeness becomes critical; collector seeds/limits are part of analyzer correctness.
ADL-010 — Raw export is a self-contained re-analysis package (not a final result dump)
Date: 2026-02-24
Context: Exporting only normalized AnalysisResult loses raw source fidelity and prevents
future parser improvements from being applied to already collected data.
Decision: Export Raw Data produces a self-contained raw package (JSON or ZIP bundle)
that the application can reopen and re-analyze. Parsed data in the package is optional and not
the source of truth on import.
Consequences:
- Re-opening an export always re-runs analysis from raw source (redfish_tree or uploaded file bytes).
- Raw bundles include collection context and diagnostics for debugging (collect.log, parser_fields.json).
- Endpoint compatibility is preserved (/api/export/json) while actual payload format may be a bundle.
ADL-011 — Redfish snapshot crawler is bounded, prioritized, and failure-tolerant
Date: 2026-02-24
Context: Full Redfish trees on modern GPU systems are large, noisy, and contain many
vendor-specific or non-fetchable links. Unbounded crawling and naive queue design caused
hangs and incomplete snapshots.
Decision: Use a bounded snapshot crawler with:
- explicit document cap (LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS)
- priority seed paths (PCIe/Fabrics/Firmware/Storage/PowerSubsystem/ThermalSubsystem)
- normalized @odata.id paths (strip #fragment)
- noisy expected error filtering (404/405/410/501 hidden from UI)
- queue capacity sized to crawl cap to avoid producer/consumer deadlock
Consequences:
- Snapshot collection remains stable on large BMC trees.
- Most high-value inventory paths are reached before the cap.
- UI progress remains useful while debug logs retain low-level fetch failures.
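A simplified, single-goroutine sketch of three of these rules: the document cap, fragment stripping, and a queue whose capacity equals the cap. NormalizePath, Crawl, and the fetch callback are hypothetical; the real collector issues HTTP requests, filters expected errors, and orders seeds by priority, none of which is modelled here.

```go
package main

import (
	"fmt"
	"strings"
)

// NormalizePath strips the #fragment from an @odata.id value.
func NormalizePath(id string) string {
	if i := strings.IndexByte(id, '#'); i >= 0 {
		return id[:i]
	}
	return id
}

// Crawl visits at most maxDocs documents, starting from the seeds in order.
// fetch returns the links found in one document (hypothetical callback).
func Crawl(seeds []string, fetch func(path string) []string, maxDocs int) []string {
	if maxDocs <= 0 {
		return nil
	}
	// At most maxDocs paths are ever enqueued, so sends never block.
	queue := make(chan string, maxDocs)
	seen := map[string]bool{}
	enqueue := func(p string) {
		p = NormalizePath(p)
		if p == "" || seen[p] || len(seen) >= maxDocs {
			return
		}
		seen[p] = true
		queue <- p
	}
	for _, s := range seeds {
		enqueue(s)
	}
	var visited []string
	for len(queue) > 0 {
		p := <-queue
		visited = append(visited, p)
		for _, link := range fetch(p) {
			enqueue(link)
		}
	}
	return visited
}

func main() {
	tree := map[string][]string{
		"/redfish/v1":         {"/redfish/v1/Chassis", "/redfish/v1/Systems#frag", "/redfish/v1/Systems"},
		"/redfish/v1/Chassis": {"/redfish/v1"},
	}
	fetch := func(p string) []string { return tree[p] }
	fmt.Println(Crawl([]string{"/redfish/v1"}, fetch, 2))
}
```

Sizing the channel to the cap is what removes the deadlock risk: a producer can only block on a full queue, and the queue can never hold more than the number of documents the crawl is allowed to visit.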
ADL-012 — Vendor-specific storage inventory probing is allowed as fallback
Date: 2026-02-24
Context: Some Supermicro BMCs expose empty standard Storage/.../Drives collections while
real disk inventory exists under vendor-specific Disk.Bay endpoints and enclosure links.
Decision: When standard drive collections are empty, collector/replay may probe vendor-style
.../Drives/Disk.Bay.* endpoints and follow Storage.Links.Enclosures[*] to recover physical drives.
Consequences:
- Higher storage inventory coverage on Supermicro HBA/HA-RAID/MRVL/NVMe backplane implementations.
- Replay must mirror the same probing behavior to preserve deterministic results.
- Probing remains bounded (finite candidate set) to avoid runaway requests.
ADL-013 — PowerSubsystem is preferred over legacy Power on newer Redfish implementations
Date: 2026-02-24
Context: X14+/newer Redfish implementations increasingly expose authoritative PSU data in
PowerSubsystem/PowerSupplies, while legacy /Power may be incomplete or schema-shifted.
Decision: Prefer Chassis/*/PowerSubsystem/PowerSupplies as the primary PSU source and use
legacy Chassis/*/Power as fallback.
Consequences:
- Better compatibility with newer BMC firmware generations.
- Legacy systems remain supported without special-case collector selection.
- Snapshot priority seeds must include PowerSubsystem resources.
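The preference order can be sketched as a lookup against an already collected snapshot. The PSU and Snapshot types and the SelectPSUs function are hypothetical, and treating an empty PowerSupplies collection as "fall back to legacy" is an assumption of this sketch.

```go
package main

import "fmt"

// PSU is a hypothetical, trimmed-down power supply record.
type PSU struct {
	Name string
}

// Snapshot maps a Redfish path to the PSUs parsed from that document
// (hypothetical shape, used only for this sketch).
type Snapshot map[string][]PSU

// SelectPSUs prefers PowerSubsystem/PowerSupplies and falls back to legacy Power.
// It returns the path that was used together with the PSUs found there.
func SelectPSUs(snap Snapshot, chassis string) (source string, psus []PSU) {
	primary := chassis + "/PowerSubsystem/PowerSupplies"
	if found := snap[primary]; len(found) > 0 {
		return primary, found
	}
	legacy := chassis + "/Power"
	return legacy, snap[legacy]
}

func main() {
	snap := Snapshot{
		"/redfish/v1/Chassis/1/PowerSubsystem/PowerSupplies": {{Name: "PSU1"}, {Name: "PSU2"}},
		"/redfish/v1/Chassis/1/Power":                        {{Name: "PSU1"}},
	}
	source, psus := SelectPSUs(snap, "/redfish/v1/Chassis/1")
	fmt.Println(source, len(psus))
}
```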
ADL-014 — Threshold logic lives on the server; UI reflects status only
Date: 2026-02-24
Context: Duplicating threshold math in frontend and backend creates drift and inconsistent
highlighting (e.g. PSU mains voltage range checks).
Decision: Business threshold evaluation (e.g. PSU voltage nominal range) must be computed on
the server; frontend only renders status/flags returned by the API.
Consequences:
- Single source of truth for threshold policies.
- UI can evolve visually without re-implementing domain logic.
- API payloads may carry richer status semantics over time.
ADL-015 — Supermicro crashdump archive parser removed from active registry
Date: 2026-03-01
Context: The Supermicro crashdump parser (SMC Crash Dump Parser) produced low-value
results for current workflows and was explicitly rejected as a supported archive path.
Decision: Remove supermicro vendor parser from active registration and project source.
Do not include it in /api/parsers output or parser documentation matrix.
Consequences:
- Supermicro crashdump archives (CDump.txt format) are no longer parsed by a dedicated vendor parser.
- Such archives fall back to other matching parsers (typically generic) unless a new replacement parser is added.
- Reintroduction requires a new parser package and an explicit registry import in vendors/vendors.go.
ADL-016 — Device-bound firmware must not appear in hardware.firmware
Date: 2026-03-01
Context: Dell TSR DCIM_SoftwareIdentity lists firmware for every component (NICs,
PSUs, disks, backplanes) in addition to system-level firmware. Naively importing all entries
into Hardware.Firmware caused device firmware to appear twice in Reanimator: once in the
device's own record and again in the top-level firmware list.
Decision:
- Hardware.Firmware contains only system-level firmware (BIOS, BMC/iDRAC, CPLD, Lifecycle Controller, storage controllers, BOSS).
- Device-bound entries (NIC, PSU, Disk, Backplane, GPU) must not be added to Hardware.Firmware.
- Parsers must store the FQDD (or equivalent slot identifier) in FirmwareInfo.Description so the Reanimator exporter can filter by FQDD prefix.
- The exporter's isDeviceBoundFirmwareFQDD() function performs this filter.
Consequences:
- Any new parser that ingests a per-device firmware inventory must follow the same rule.
- Device firmware is accessible only via the device's own record, not the firmware list.
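A minimal sketch of the prefix filter, not the exporter's actual code. The prefix list (NIC., PSU., Disk., RAID.Backplane., GPU.) follows the FQDD prefixes named for this filter in the accompanying change description; case-sensitive matching, the SystemFirmware helper, the FirmwareInfo fields other than Description, and the sample FQDD strings are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// FirmwareInfo is a trimmed-down record; only Description is named in the ADL.
type FirmwareInfo struct {
	Name        string
	Version     string
	Description string // FQDD or equivalent slot identifier
}

// deviceBoundFQDDPrefixes lists FQDD prefixes treated as device-bound.
var deviceBoundFQDDPrefixes = []string{"NIC.", "PSU.", "Disk.", "RAID.Backplane.", "GPU."}

// isDeviceBoundFirmwareFQDD reports whether the FQDD belongs to a device
// that carries its own firmware record.
func isDeviceBoundFirmwareFQDD(fqdd string) bool {
	for _, prefix := range deviceBoundFQDDPrefixes {
		if strings.HasPrefix(fqdd, prefix) {
			return true
		}
	}
	return false
}

// SystemFirmware returns only the entries that are not device-bound.
func SystemFirmware(all []FirmwareInfo) []FirmwareInfo {
	out := make([]FirmwareInfo, 0, len(all))
	for _, fw := range all {
		if !isDeviceBoundFirmwareFQDD(fw.Description) {
			out = append(out, fw)
		}
	}
	return out
}

func main() {
	// Illustrative FQDD values, not taken from a real TSR.
	all := []FirmwareInfo{
		{Name: "BIOS", Description: "BIOS.Setup.1-1"},
		{Name: "NIC firmware", Description: "NIC.Integrated.1-1-1"},
		{Name: "PSU firmware", Description: "PSU.Slot.1"},
	}
	for _, fw := range SystemFirmware(all) {
		fmt.Println(fw.Name)
	}
}
```

Note that a storage controller FQDD such as RAID.Integrated.1-1 would pass through, because only the RAID.Backplane. prefix is filtered; this is consistent with storage controllers being listed as system-level firmware.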
ADL-017 — Vendor-embedded MAC addresses must be stripped from model name fields
Date: 2026-03-01
Context: Dell TSR embeds MAC addresses directly in ProductName and ElementName
fields (e.g. "NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08").
This caused model names to contain MAC addresses in NIC model, NIC firmware device name,
and potentially other fields.
Decision: Strip any " - XX:XX:XX:XX:XX:XX" suffix from all model/name string fields
at parse time before storing in any model struct. Use the regex
\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$.
Consequences:
- Model names are clean and consistent across all devices.
- All parsers must apply this stripping to any field used as a device name or model.
- Confirmed affected fields in Dell: DCIM_NICView.ProductName, DCIM_SoftwareIdentity.ElementName.
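The stripping rule can be sketched with the regex from the decision and Go's standard regexp package. StripMACSuffix is a hypothetical helper name; the parsers may structure this differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// macSuffix matches a trailing " - XX:XX:XX:XX:XX:XX" MAC address suffix.
var macSuffix = regexp.MustCompile(`\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$`)

// StripMACSuffix removes a vendor-embedded MAC suffix from a model or name field.
// Strings without such a suffix are returned unchanged.
func StripMACSuffix(name string) string {
	return macSuffix.ReplaceAllString(name, "")
}

func main() {
	raw := "NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"
	fmt.Println(StripMACSuffix(raw))
}
```

Because the pattern is anchored at the end of the string, hyphens inside the model name (as in ConnectX-6) are left alone. A value with trailing whitespace after the MAC would not match, so trimming the field first may be worth considering.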