Three bugs, all related to GPU dedup in the Redfish replay pipeline:
1. collectGPUsFromProcessors (redfish_replay.go): GPU-type Processor entries
(Systems/HGX_Baseboard_0/Processors/GPU_SXM_N) were not deduplicated against
existing PCIeDevice GPUs on Supermicro HGX. The chassis-ID lookup keyed on
processor Id ("GPU_SXM_1"), but the chassis is named "HGX_GPU_SXM_1", so the
lookup returned nothing, the serial stayed empty, and the UUID was unseen,
which produced 8 duplicate GPU rows.
Fix: read SerialNumber directly from the Processor doc first; chassis lookup
is now a fallback override (as it was designed for MSI).
2. looksLikeGPU (redfish.go): NVSwitch PCIe devices (Model="NVSwitch",
Manufacturer="NVIDIA") were classified as GPUs because "nvidia" matched the
GPU hint list. Fix: early return false when Model contains "nvswitch".
3. gpuDocDedupKey (redfish.go): commit 9df29b1 changed the dedup key to prefer
slot|model before path, which collapsed two distinct GPUs with identical model
names in GraphicsControllers into one entry. Fix: only serial and BDF are used
as cross-path stable dedup keys; fall back to Redfish path when neither is
present. This also restores TestReplayCollectGPUs_DedupUsesRedfishPathBeforeHeuristics,
which had been broken on main since 9df29b1.
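The key order from item 3 can be sketched roughly as follows. This is an illustration, not the actual gpuDocDedupKey source: the gpuDoc type, the "BDF" field name, and the case/whitespace normalization are assumptions made for the example. The point is that slot and model are never consulted, so two GPUs with identical model names stay distinct unless they share a serial or BDF.

```go
package main

import "strings"

// gpuDoc is a hypothetical, simplified stand-in for a decoded Redfish GPU document.
type gpuDoc map[string]string

// dedupKey prefers identifiers that are stable across Redfish paths
// (serial, then BDF) and falls back to the document path otherwise.
// Slot and model are deliberately ignored.
func dedupKey(doc gpuDoc) string {
	if serial := strings.TrimSpace(doc["SerialNumber"]); serial != "" {
		return "serial:" + strings.ToUpper(serial) // normalization is an assumption
	}
	if bdf := strings.TrimSpace(doc["BDF"]); bdf != "" {
		return "bdf:" + strings.ToLower(bdf)
	}
	return "path:" + doc["@odata.id"]
}

// dedupGPUs keeps the first document seen for each key.
func dedupGPUs(docs []gpuDoc) []gpuDoc {
	seen := make(map[string]bool, len(docs))
	out := make([]gpuDoc, 0, len(docs))
	for _, d := range docs {
		k := dedupKey(d)
		if seen[k] {
			continue
		}
		seen[k] = true
		out = append(out, d)
	}
	return out
}
```

With this ordering, the same serial reported under two chassis trees collapses to one entry, while two serial-less GraphicsControllers entries with the same model survive as two.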
Added tests:
- TestCollectGPUsFromProcessors_SupermicroHGX: Processor GPU dedup when
chassis-ID naming convention does not match processor Id
- TestReplayCollectGPUs_DedupCrossChassisSerial: same GPU via two Chassis
PCIeDevice trees with matching serials → collapsed to one
- TestLooksLikeGPU_NVSwitchExcluded: NVSwitch is not a GPU
Added rule to bible-local/09-testing.md: dedup/filter/classify functions must
cover true-positive, true-negative, and the vendor counter-case axes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add /redfish/v1 to redfishCriticalEndpoints so plan-B retries the service
root if it failed during the main crawl. Also downgrade the missing-root
error in ReplayRedfishFromRawPayloads from fatal to a warning so analysis
can complete with defaults when the root doc was not recovered.
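The two behaviors can be illustrated with a small sketch. Everything except the /redfish/v1 path is hypothetical: the function names, the payload shape, and the "unknown" default are assumptions, and RedfishVersion is used only as an example of a value normally read from the service root.

```go
package main

const serviceRoot = "/redfish/v1"

// endpointsToRetry returns the critical endpoints that were not fetched
// during the main crawl, i.e. the candidates for a plan-B retry.
func endpointsToRetry(critical []string, fetched map[string]bool) []string {
	var retry []string
	for _, ep := range critical {
		if !fetched[ep] {
			retry = append(retry, ep)
		}
	}
	return retry
}

// replayResult is a hypothetical, reduced analysis result.
type replayResult struct {
	RedfishVersion string
	Warnings       []string
}

// replayFromRaw records a warning and keeps defaults when the service root
// is missing, instead of aborting the whole analysis.
func replayFromRaw(payloads map[string]map[string]string) replayResult {
	res := replayResult{RedfishVersion: "unknown"}
	root, ok := payloads[serviceRoot]
	if !ok {
		res.Warnings = append(res.Warnings,
			"service root "+serviceRoot+" missing; continuing with defaults")
		return res
	}
	if v := root["RedfishVersion"]; v != "" {
		res.RedfishVersion = v
	}
	return res
}
```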
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Supermicro HGX exposes each GPU under both Chassis/1/PCIeDevices and a
dedicated Chassis/HGX_GPU_SXM_N/PCIeDevices. gpuDocDedupKey was keying
by @odata.id path, so identical GPUs with the same serial were not
deduplicated across sources. Now stable identifiers (serial → BDF →
slot+model) take priority over path.
Also includes Inspur parser improvements: NVMe model/serial enrichment
from devicefrusdr.log and audit.log, RAID drive slot normalization to
BP notation, PSU slot normalization, BMC/CPLD/VR firmware from RESTful
version info section, and parser version bump to 1.8.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On folder selection, filter out duplicate files before conversion:
- First pass: same basename → skip (same filename in different subdirs)
- Second pass: same SHA-256 hash → skip (identical content, different path)
Duplicates are excluded from the convert queue and shown as a warning
in the summary with reason (same name / same content).
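The two passes can be sketched as below. The actual change most likely lives in the web UI; this Go version only illustrates the logic, and the inputFile/skippedFile types and the function name are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"path"
)

// inputFile is a hypothetical in-memory view of a selected file.
type inputFile struct {
	Path    string // slash-separated relative path
	Content []byte
}

// skippedFile records why a file was excluded from the convert queue.
type skippedFile struct {
	Path   string
	Reason string
}

// filterDuplicates drops files whose basename was already seen (pass 1),
// then files whose SHA-256 digest was already seen (pass 2).
func filterDuplicates(files []inputFile) (keep []inputFile, dups []skippedFile) {
	seenName := make(map[string]bool)
	var pass1 []inputFile
	for _, f := range files {
		base := path.Base(f.Path)
		if seenName[base] {
			dups = append(dups, skippedFile{f.Path, "same name"})
			continue
		}
		seenName[base] = true
		pass1 = append(pass1, f)
	}

	seenHash := make(map[string]bool)
	for _, f := range pass1 {
		sum := sha256.Sum256(f.Content)
		digest := hex.EncodeToString(sum[:])
		if seenHash[digest] {
			dups = append(dups, skippedFile{f.Path, "same content"})
			continue
		}
		seenHash[digest] = true
		keep = append(keep, f)
	}
	return keep, dups
}
```

Running the name pass first means the hash is computed only for files that survive it, which matters when the selected folder is large.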
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Skip FQDD prefixes that are internal AMD EPYC fabric or devices
already captured with richer data from other DCIM views:
- HostBridge/P2PBridge/ISABridge/SMBus.Embedded: AMD internal bus
- AHCI.Embedded: AMD FCH SATA (chipset, not a slot)
- Video.Embedded: BMC Matrox G200eW3, not user-visible
- NIC.Embedded: duplicates DCIM_NICView entries (no model/MAC in PCIe view)
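A prefix filter of this kind might look like the sketch below. It assumes the slash-separated names in the first bullet each carry the ".Embedded" suffix; the function name and the sample FQDD values in the usage are illustrative only.

```go
package main

import "strings"

// skippedFQDDPrefixes lists the prefixes named in this commit message.
// Reading "HostBridge/P2PBridge/ISABridge/SMBus.Embedded" as four
// ".Embedded" prefixes is an assumption.
var skippedFQDDPrefixes = []string{
	"HostBridge.Embedded",
	"P2PBridge.Embedded",
	"ISABridge.Embedded",
	"SMBus.Embedded",
	"AHCI.Embedded",
	"Video.Embedded",
	"NIC.Embedded",
}

// shouldSkipFQDD reports whether a PCIe view entry should be dropped
// because its FQDD starts with one of the skipped prefixes.
func shouldSkipFQDD(fqdd string) bool {
	fqdd = strings.TrimSpace(fqdd)
	for _, p := range skippedFQDDPrefixes {
		if strings.HasPrefix(fqdd, p) {
			return true
		}
	}
	return false
}
```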
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- isUnidentifiablePCIeDevice: skip PCIe entries with generic class
(SingleFunction/MultiFunction) and no model/serial/VendorID — eliminates
PCH bridges, root ports and other bus infrastructure that MSI BMC
enumerates exhaustively (59→9 entries on CG480-S5063)
- collectPCIeDevices: skip entries where looksLikeGPU — prevents GPU
devices from appearing in both hw.GPUs and hw.PCIeDevices (fixed
Inspur H100 duplicate)
- dedupeCanonicalDevices: secondary model+manufacturer match for noKey
items (no serial, no BDF) — merges NetworkAdapter entries into
matching PCIe device entries; isGenericDeviceClass helper for
DeviceClass identity check (fixed Inspur ENFI1100-T4 duplicate)
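The first rule can be sketched as follows. The pcieEntry struct and its field names are assumptions chosen for the example; only the two generic class names and the "no model/serial/VendorID" condition come from the message above.

```go
package main

import "strings"

// pcieEntry is a hypothetical, reduced view of a collected PCIe device.
type pcieEntry struct {
	DeviceClass string
	Model       string
	Serial      string
	VendorID    string
}

// isGenericDeviceClassSketch reports whether the class carries no identity
// beyond "this is a PCIe function".
func isGenericDeviceClassSketch(class string) bool {
	switch strings.ToLower(strings.TrimSpace(class)) {
	case "singlefunction", "multifunction":
		return true
	}
	return false
}

// isUnidentifiableSketch is true for entries that have a generic class and
// none of the fields that could identify the device.
func isUnidentifiableSketch(e pcieEntry) bool {
	return isGenericDeviceClassSketch(e.DeviceClass) &&
		strings.TrimSpace(e.Model) == "" &&
		strings.TrimSpace(e.Serial) == "" &&
		strings.TrimSpace(e.VendorID) == ""
}
```

An entry with a generic class but a populated VendorID is kept, since it can still be enriched later.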
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Parser / archive:
- Add .sds extension as tar-format alias (archive.go)
- Add tests for multipart upload size limits (multipart_limits_test.go)
- Remove Supermicro crashdump parser (ADL-015)
Dell parser:
- Remove GPU duplicates from PCIeDevices (DCIM_VideoView vs DCIM_PCIDeviceView
both list the same GPU; VideoView record is authoritative)
Server:
- Add LOGPILE_CONVERT_MAX_MB env var for independent convert batch size limit
- Improve "file too large" error message with current limit value
Web:
- Add CONVERT_MAX_FILES_PER_BATCH = 1000 cap
- Minor UI copy and CSS fixes
Bible:
- bible-local/06-parsers.md: add pci.ids enrichment rule (enrich model from
pci.ids when name is empty but vendor_id+device_id are present)
- Sync bible submodule and local overview/architecture docs
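The pci.ids enrichment rule added to bible-local/06-parsers.md can be illustrated with a short sketch. The types, the function name, the ID normalization, and the placeholder table entry are all hypothetical; a real lookup would be backed by the pci.ids database rather than an inline map.

```go
package main

import "strings"

// pciKey identifies a device by normalized vendor and device ID.
type pciKey struct{ VendorID, DeviceID string }

// pciDevice is a hypothetical, reduced parsed device record.
type pciDevice struct {
	Name     string
	VendorID string
	DeviceID string
}

// normalizeID lowercases an ID and drops an optional "0x" prefix
// (normalization is an assumption of this sketch).
func normalizeID(id string) string {
	id = strings.ToLower(strings.TrimSpace(id))
	return strings.TrimPrefix(id, "0x")
}

// enrichModel fills Name from the table only when Name is empty and both
// vendor_id and device_id are present; otherwise the record is unchanged.
func enrichModel(d pciDevice, table map[pciKey]string) pciDevice {
	if strings.TrimSpace(d.Name) != "" {
		return d
	}
	vendor, device := normalizeID(d.VendorID), normalizeID(d.DeviceID)
	if vendor == "" || device == "" {
		return d
	}
	if name, ok := table[pciKey{vendor, device}]; ok {
		d.Name = name
	}
	return d
}
```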
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Dell NICView: strip " - XX:XX:XX:XX:XX:XX" suffix from ProductName
(Dell TSR embeds MAC in this field for every NIC port)
- Dell SoftwareIdentity: same strip applied to ElementName; store FQDD
in FirmwareInfo.Description so exporter can filter device-bound entries
- Exporter: add isDeviceBoundFirmwareFQDD() to filter firmware entries
whose Description matches NIC./PSU./Disk./RAID.Backplane./GPU. FQDD
prefixes (prevents device firmware from appearing in hardware.firmware)
- Exporter: extend isDeviceBoundFirmwareName() to filter HGX GPU/NVSwitch
firmware inventory IDs (_fw_gpu_, _fw_nvswitch_, _inforom_gpu_)
- Inspur: remove HDD firmware from Hardware.Firmware; it is already present
in Storage.Firmware, and duplicating it violates ADL-016
- bible-local/06-parsers.md: document firmware and MAC stripping rules
- bible-local/10-decisions.md: add ADL-016 (device-bound firmware) and
ADL-017 (vendor-embedded MAC in model name fields)
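Two of the rules above lend themselves to a short sketch: stripping the vendor-embedded MAC suffix and filtering firmware entries by FQDD prefix. The function names and the exact regular expression are assumptions; the suffix shape and the prefix list are taken from the bullets above.

```go
package main

import (
	"regexp"
	"strings"
)

// macSuffix matches a trailing " - XX:XX:XX:XX:XX:XX" (hex octets).
var macSuffix = regexp.MustCompile(`\s+-\s+(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\s*$`)

// stripMACSuffix removes the embedded MAC from a product or element name.
// Names without such a suffix are returned unchanged apart from trimming.
func stripMACSuffix(name string) string {
	return strings.TrimSpace(macSuffix.ReplaceAllString(name, ""))
}

// deviceBoundFQDDPrefixes are the prefixes listed in this commit message.
var deviceBoundFQDDPrefixes = []string{
	"NIC.", "PSU.", "Disk.", "RAID.Backplane.", "GPU.",
}

// isDeviceBoundFQDDSketch reports whether a firmware entry's description
// (assumed to hold the FQDD) points at a device rather than the platform.
func isDeviceBoundFQDDSketch(description string) bool {
	for _, p := range deviceBoundFQDDPrefixes {
		if strings.HasPrefix(description, p) {
			return true
		}
	}
	return false
}
```

Anchoring the pattern at the end of the string keeps a hyphen that appears earlier in a model name intact.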
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add bible.git as submodule at bible/
- Move docs/bible/ → bible-local/ (project-specific architecture)
- Update CLAUDE.md to reference both bible/ and bible-local/
- Add AGENTS.md for Codex with same structure
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>