Root cause analysis for device-bound firmware leaking into hardware.firmware
on Supermicro Redfish (SYS-A21GE-NBRT HGX B200):
- collectFirmwareInventory (6c19a58) had no coverage for Supermicro naming.
isDeviceBoundFirmwareName checked "gpu " / "nic " (space-terminated) while
Supermicro uses "GPU1 System Slot0" / "NIC1 System Slot0 ..." (digit suffix).
- 9c5512d added _fw_gpu_ / _fw_nvswitch_ / _inforom_gpu_ patterns to fix HGX,
but matched them against DeviceName, which holds "Software Inventory" (from
Redfish Name) rather than the firmware Id, so the patterns never matched.
09-testing.md: add firmware filter worked example and rule #4 — verify the
filter checks the field that the collector actually populates.
10-decisions.md: ADL-019 — isDeviceBoundFirmwareName must be extended per
vendor with a test case per vendor format before shipping.
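A minimal sketch of the corrected check, for illustration only. The function
names isDeviceBound and hasDevicePrefix, the two-argument signature, and the
sample firmware Id are assumptions, not the project's actual code. It shows the
two points from the analysis above: the name prefix must accept a digit as well
as a space, and the HGX patterns must be matched against the firmware Id.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// hasDevicePrefix reports whether name starts with prefix followed by a
// space ("GPU SomeDevice") or a digit ("GPU1 System Slot0").
func hasDevicePrefix(name, prefix string) bool {
	lower := strings.ToLower(strings.TrimSpace(name))
	if !strings.HasPrefix(lower, prefix) {
		return false
	}
	rest := lower[len(prefix):]
	if rest == "" {
		return false
	}
	r := rune(rest[0])
	return r == ' ' || unicode.IsDigit(r)
}

// isDeviceBound is a hypothetical stand-in for isDeviceBoundFirmwareName.
// It receives the firmware Id and the display name separately so that the
// HGX ID patterns are never compared against the Redfish Name field.
func isDeviceBound(id, name string) bool {
	lowerID := strings.ToLower(id)
	for _, p := range []string{"_fw_gpu_", "_fw_nvswitch_", "_inforom_gpu_"} {
		if strings.Contains(lowerID, p) {
			return true
		}
	}
	for _, p := range []string{"gpu", "nic"} {
		if hasDevicePrefix(name, p) {
			return true
		}
	}
	return false
}

func main() {
	// One case per vendor format, as ADL-019 asks for.
	fmt.Println(isDeviceBound("", "GPU1 System Slot0"))                   // Supermicro name
	fmt.Println(isDeviceBound("HGX_FW_GPU_SXM_1", "Software Inventory")) // illustrative HGX Id
	fmt.Println(isDeviceBound("BIOS", "BIOS"))                            // system-level firmware
}
```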
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On Supermicro HGX systems (SYS-A21GE-NBRT) ~35 sub-chassis (GPU, NVSwitch,
PCIeRetimer, ERoT/IRoT, BMC, FPGA) all carry ChassisType=Module/Component/Zone
and expose empty /Drives collections. shouldAdaptiveNVMeProbe returned true for
all of them, triggering 35 × 384 = 13 440 HTTP requests, or ~22 min wasted per
collection (more than half of the 35 min total collection time).
Fix: chassisTypeCanHaveNVMe returns false for Module, Component, Zone. The
candidate selection loop in collectRawRedfishTree now checks the parent chassis
doc before adding a /Drives path to the probe list. Enclosure (NVMe backplane),
RackMount, and unknown types are unaffected.
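A rough sketch of the type check and the candidate filter described above. The
chassis struct, selectDriveProbeCandidates, and the example paths are
hypothetical; only the name chassisTypeCanHaveNVMe and the excluded types come
from this commit. Case-insensitive matching is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// chassisTypeCanHaveNVMe returns false only for the sub-chassis types that
// never hold drives; everything else, including unknown or empty types,
// stays eligible for the probe.
func chassisTypeCanHaveNVMe(chassisType string) bool {
	switch strings.ToLower(strings.TrimSpace(chassisType)) {
	case "module", "component", "zone":
		return false
	default:
		return true
	}
}

// chassis is a hypothetical, trimmed-down view of a parent chassis doc.
type chassis struct {
	Path        string
	ChassisType string
}

// selectDriveProbeCandidates keeps the /Drives path of every chassis whose
// type can hold NVMe drives.
func selectDriveProbeCandidates(all []chassis) []string {
	var out []string
	for _, c := range all {
		if !chassisTypeCanHaveNVMe(c.ChassisType) {
			continue
		}
		out = append(out, c.Path+"/Drives")
	}
	return out
}

func main() {
	// Hypothetical topology: one GPU sub-chassis and one NVMe backplane.
	topology := []chassis{
		{Path: "/redfish/v1/Chassis/GPU_1", ChassisType: "Module"},
		{Path: "/redfish/v1/Chassis/Backplane_1", ChassisType: "Enclosure"},
	}
	fmt.Println(selectDriveProbeCandidates(topology))

	// Request volume before the fix, from the figures in this message.
	fmt.Println(35 * 384)
}
```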
Tests:
- TestChassisTypeCanHaveNVMe: table-driven, covers excluded and storage-capable types
- TestNVMePostProbeSkipsNonStorageChassis: topology integration, GPU chassis +
backplane with empty /Drives → exactly 1 candidate selected (backplane only)
Docs:
- ADL-018 in bible-local/10-decisions.md
- Candidate-selection test matrix in bible-local/09-testing.md
- SYS-A21GE-NBRT baseline row in docs/test_server_collection_memory.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Dell NICView: strip " - XX:XX:XX:XX:XX:XX" suffix from ProductName
(Dell TSR embeds MAC in this field for every NIC port)
- Dell SoftwareIdentity: same strip applied to ElementName; store FQDD
in FirmwareInfo.Description so exporter can filter device-bound entries
- Exporter: add isDeviceBoundFirmwareFQDD() to filter firmware entries
whose Description matches NIC./PSU./Disk./RAID.Backplane./GPU. FQDD
prefixes (prevents device firmware from appearing in hardware.firmware)
- Exporter: extend isDeviceBoundFirmwareName() to filter HGX GPU/NVSwitch
firmware inventory IDs (_fw_gpu_, _fw_nvswitch_, _inforom_gpu_)
- Inspur: remove HDD firmware from Hardware.Firmware, which is already present
in Storage.Firmware; duplicating it violates ADL-016
- bible-local/06-parsers.md: document firmware and MAC stripping rules
- bible-local/10-decisions.md: add ADL-016 (device-bound firmware) and
ADL-017 (vendor-embedded MAC in model name fields)
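The two Dell rules above could look roughly like this. stripMACSuffix, the
regular expression, the case-sensitive prefix match, and the sample strings are
assumptions made for illustration; the real parser and exporter may differ.
Only the prefix list and the " - XX:XX:XX:XX:XX:XX" suffix shape are taken from
this message.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// macSuffixRe matches a trailing " - XX:XX:XX:XX:XX:XX" where each XX is a
// pair of hex digits.
var macSuffixRe = regexp.MustCompile(`\s+-\s+(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\s*$`)

// stripMACSuffix removes a vendor-embedded MAC address from the end of a
// model or element name and leaves other strings unchanged.
func stripMACSuffix(s string) string {
	return macSuffixRe.ReplaceAllString(s, "")
}

// deviceBoundFQDDPrefixes lists the FQDD prefixes named in this commit.
var deviceBoundFQDDPrefixes = []string{"NIC.", "PSU.", "Disk.", "RAID.Backplane.", "GPU."}

// isDeviceBoundFirmwareFQDD reports whether an FQDD (stored in
// FirmwareInfo.Description) belongs to a device rather than the system.
func isDeviceBoundFirmwareFQDD(fqdd string) bool {
	for _, p := range deviceBoundFQDDPrefixes {
		if strings.HasPrefix(fqdd, p) {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative inputs, not taken from a real TSR.
	fmt.Println(stripMACSuffix("Example 25GbE Adapter - 00:11:22:33:44:55"))
	fmt.Println(isDeviceBoundFirmwareFQDD("NIC.Integrated.1-1-1"))
	fmt.Println(isDeviceBoundFirmwareFQDD("BIOS.Setup.1-1"))
}
```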
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add bible.git as submodule at bible/
- Move docs/bible/ → bible-local/ (project-specific architecture)
- Update CLAUDE.md to reference both bible/ and bible-local/
- Add AGENTS.md for Codex with same structure
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>