Files
logpile/bible-local/09-testing.md
Mikhail Chusavitin a9f58b3cf4 redfish: fix GPU duplication on Supermicro HGX, exclude NVSwitch, restore path dedup
Three bugs, all related to GPU dedup in the Redfish replay pipeline:

1. collectGPUsFromProcessors (redfish_replay.go): GPU-type Processor entries
   (Systems/HGX_Baseboard_0/Processors/GPU_SXM_N) were not deduplicated against
   existing PCIeDevice GPUs on Supermicro HGX. The chassis-ID lookup keyed on
   processor Id ("GPU_SXM_1") but the chassis is named "HGX_GPU_SXM_1" — lookup
   returned nothing, serial stayed empty, UUID was unseen → 8 duplicate GPU rows.
   Fix: read SerialNumber directly from the Processor doc first; chassis lookup
   is now a fallback override (as it was designed for MSI).

2. looksLikeGPU (redfish.go): NVSwitch PCIe devices (Model="NVSwitch",
   Manufacturer="NVIDIA") were classified as GPUs because "nvidia" matched the
   GPU hint list. Fix: early return false when Model contains "nvswitch".

3. gpuDocDedupKey (redfish.go): commit 9df29b1 changed the dedup key to prefer
   slot|model before path, which collapsed two distinct GPUs with identical model
   names in GraphicsControllers into one entry. Fix: only serial and BDF are used
   as cross-path stable dedup keys; fall back to Redfish path when neither is
   present. This also restores TestReplayCollectGPUs_DedupUsesRedfishPathBeforeHeuristics
   which had been broken on main since 9df29b1.

Added tests:
- TestCollectGPUsFromProcessors_SupermicroHGX: Processor GPU dedup when
  chassis-ID naming convention does not match processor Id
- TestReplayCollectGPUs_DedupCrossChassisSerial: same GPU via two Chassis
  PCIeDevice trees with matching serials → collapsed to one
- TestLooksLikeGPU_NVSwitchExcluded: NVSwitch is not a GPU

Added rule to bible-local/09-testing.md: dedup/filter/classify functions must
cover true-positive, true-negative, and the vendor counter-case axes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 15:09:27 +03:00

3.1 KiB

09 — Testing

Required before merge

go test ./...

All tests must pass before any change is merged.

Where to add tests

Change area Test location
Collectors internal/collector/*_test.go
HTTP handlers internal/server/*_test.go
Exporters internal/exporter/*_test.go
Parsers internal/parser/vendors/<vendor>/*_test.go

Exporter tests

The Reanimator exporter has comprehensive coverage:

Test file Coverage
reanimator_converter_test.go Unit tests per conversion function
reanimator_integration_test.go Full export with realistic AnalysisResult

Run exporter tests only:

go test ./internal/exporter/...
go test ./internal/exporter/... -v -run Reanimator
go test ./internal/exporter/... -cover

Guidelines

  • Prefer table-driven tests for parsing logic (multiple input variants).
  • Do not rely on network access in unit tests.
  • Test both the happy path and edge cases (missing fields, empty collections).
  • When adding a new vendor parser, include at minimum:
    • Detect() test with a positive and a negative sample file list.
    • Parse() test with a minimal but representative archive.

Dedup and filtering functions — mandatory coverage

Any function that deduplicates, filters, or classifies hardware inventory items must have tests covering all three axes before the code is considered done:

Axis What to test Why
True positive Items that ARE duplicates are collapsed to one Proves the function works
True negative Items that are NOT duplicates are kept separate Proves the function doesn't over-collapse
Counter-case The scenario that motivated the original code still works after changes Prevents regression from future fixes

Worked example — GPU dedup regression (2026-03-11)

collectGPUsFromProcessors was added for MSI (chassis Id matches processor Id). No tests → when Supermicro HGX arrived (chassis Id = "HGX_GPU_SXM_1", processor Id = "GPU_SXM_1"), the chassis lookup silently returned nothing, serial stayed empty, UUID was new → 8 duplicate GPUs.

Simultaneously, fixing gpuDocDedupKey to use slot|model before path collapsed two distinct GraphicsControllers GPUs with the same model into one — breaking an existing test that had no counter-case for the path-fallback scenario.

Required test matrix for any dedup function:

TestXxx_CollapsesDuplicates   — same item via two sources → 1 result
TestXxx_KeepsDistinct          — two different items with same model → 2 results
TestXxx_<VendorThatMotivated>  — the specific vendor/setup that triggered the code

Practical rule

When you write a new filter/dedup/classify function, ask:

  1. Does my test cover the vendor that motivated this code?
  2. Does my test cover a different vendor or naming convention where the function must NOT fire?
  3. If I change the dedup key logic, do existing tests still exercise the old correct behavior?

If any answer is "no" — add the missing test before committing.