Mikhail Chusavitin a9f58b3cf4 redfish: fix GPU duplication on Supermicro HGX, exclude NVSwitch, restore path dedup
Three bugs, all related to GPU dedup in the Redfish replay pipeline:

1. collectGPUsFromProcessors (redfish_replay.go): GPU-type Processor entries
   (Systems/HGX_Baseboard_0/Processors/GPU_SXM_N) were not deduplicated against
   existing PCIeDevice GPUs on Supermicro HGX. The chassis-ID lookup keyed on
   processor Id ("GPU_SXM_1") but the chassis is named "HGX_GPU_SXM_1" — lookup
   returned nothing, serial stayed empty, UUID was unseen → 8 duplicate GPU rows.
   Fix: read SerialNumber directly from the Processor doc first; chassis lookup
   is now a fallback override (as it was designed for MSI).

2. looksLikeGPU (redfish.go): NVSwitch PCIe devices (Model="NVSwitch",
   Manufacturer="NVIDIA") were classified as GPUs because "nvidia" matched the
   GPU hint list. Fix: early return false when Model contains "nvswitch".

3. gpuDocDedupKey (redfish.go): commit 9df29b1 changed the dedup key to prefer
   slot|model before path, which collapsed two distinct GPUs with identical model
   names in GraphicsControllers into one entry. Fix: only serial and BDF are used
   as cross-path stable dedup keys; fall back to Redfish path when neither is
   present. This also restores TestReplayCollectGPUs_DedupUsesRedfishPathBeforeHeuristics
   which had been broken on main since 9df29b1.
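The key ordering described in fix 3 can be sketched as follows. This is a minimal illustration, not the code in redfish.go: the gpuDoc type, its fields, the key prefixes, the case normalization, and the example Redfish paths are all assumptions made for the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// gpuDoc is a hypothetical, trimmed-down view of a Redfish GPU document.
type gpuDoc struct {
	Path   string // Redfish path (@odata.id) of the document
	Serial string
	BDF    string // PCI bus:device.function, if known
	Slot   string
	Model  string
}

// dedupKey prefers identifiers that are stable across Redfish paths
// (serial, then BDF) and falls back to the path itself. Slot and model
// are deliberately ignored: two distinct GPUs can share both.
func dedupKey(d gpuDoc) string {
	if s := strings.TrimSpace(d.Serial); s != "" {
		return "serial:" + strings.ToUpper(s) // case folding is an assumption
	}
	if b := strings.TrimSpace(d.BDF); b != "" {
		return "bdf:" + strings.ToLower(b)
	}
	return "path:" + d.Path
}

// dedupe keeps the first document seen for each key, preserving order.
func dedupe(docs []gpuDoc) []gpuDoc {
	seen := make(map[string]bool)
	var out []gpuDoc
	for _, d := range docs {
		k := dedupKey(d)
		if seen[k] {
			continue
		}
		seen[k] = true
		out = append(out, d)
	}
	return out
}

func main() {
	docs := []gpuDoc{
		// Same model, no serial or BDF: must stay distinct (path fallback).
		{Path: "/redfish/v1/Systems/1/GraphicsControllers/1", Model: "H100"},
		{Path: "/redfish/v1/Systems/1/GraphicsControllers/2", Model: "H100"},
		// Same serial seen via two chassis trees: must collapse to one.
		{Path: "/redfish/v1/Chassis/A/PCIeDevices/GPU1", Serial: "SN1"},
		{Path: "/redfish/v1/Chassis/B/PCIeDevices/GPU1", Serial: "sn1"},
	}
	for _, d := range dedupe(docs) {
		fmt.Println(dedupKey(d))
	}
}
```

With a slot|model key, the first two entries would have shared one key; keying on the path keeps them apart while the serial still merges the last two.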

Added tests:
- TestCollectGPUsFromProcessors_SupermicroHGX: Processor GPU dedup when
  chassis-ID naming convention does not match processor Id
- TestReplayCollectGPUs_DedupCrossChassisSerial: same GPU via two Chassis
  PCIeDevice trees with matching serials → collapsed to one
- TestLooksLikeGPU_NVSwitchExcluded: NVSwitch is not a GPU
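The serial resolution order from fix 1, which TestCollectGPUsFromProcessors_SupermicroHGX exercises, might look roughly like this. It is a sketch under one reading of "fallback override": the Processor document's own SerialNumber is taken first, and a chassis lookup keyed on the processor Id replaces it only when that lookup returns a non-empty value. The function name, the map-based inputs, and the serial values are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveGPUSerial starts from the Processor document's SerialNumber and
// lets a chassis lookup keyed on the processor Id override it only when
// that lookup yields a non-empty serial.
func resolveGPUSerial(processor map[string]any, chassisSerialByID map[string]string) string {
	serial, _ := processor["SerialNumber"].(string)
	serial = strings.TrimSpace(serial)

	id, _ := processor["Id"].(string)
	if s := strings.TrimSpace(chassisSerialByID[id]); s != "" {
		serial = s // chassis value wins when the Id-keyed lookup hits
	}
	return serial
}

func main() {
	// Supermicro HGX style: the chassis is named "HGX_GPU_SXM_1", so a lookup
	// by processor Id "GPU_SXM_1" misses and the Processor serial is kept.
	processor := map[string]any{"Id": "GPU_SXM_1", "SerialNumber": "PROC-SN-1"}
	chassis := map[string]string{"HGX_GPU_SXM_1": "CHASSIS-SN-1"}
	fmt.Println(resolveGPUSerial(processor, chassis))

	// Layout where chassis Id equals processor Id: the lookup hits and overrides.
	chassis = map[string]string{"GPU_SXM_1": "CHASSIS-SN-1"}
	fmt.Println(resolveGPUSerial(processor, chassis))
}
```

Because the serial is no longer empty when the chassis lookup misses, the Processor entry can be matched against a PCIeDevice GPU that reports the same serial.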

Added rule to bible-local/09-testing.md: dedup/filter/classify functions must
cover true-positive, true-negative, and the vendor counter-case axes.
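As an illustration of the three axes named in that rule, the table below pairs a simplified classifier with one case per axis. The classifier is a stand-in written for this sketch, not the looksLikeGPU in redfish.go; the hint list beyond "nvidia" and the example device names are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeGPUSketch is a hypothetical, simplified classifier: it rejects
// NVSwitch models early, then matches a small hint list.
func looksLikeGPUSketch(model, manufacturer string) bool {
	m := strings.ToLower(model)
	if strings.Contains(m, "nvswitch") {
		return false // vendor counter-case: an NVIDIA device that is not a GPU
	}
	haystack := m + " " + strings.ToLower(manufacturer)
	for _, hint := range []string{"nvidia", "gpu", "graphics"} {
		if strings.Contains(haystack, hint) {
			return true
		}
	}
	return false
}

type classifyCase struct {
	name         string
	model        string
	manufacturer string
	want         bool
}

// classifyCases holds one case per axis required by the testing rule.
func classifyCases() []classifyCase {
	return []classifyCase{
		{"true-positive", "H100 SXM5", "NVIDIA", true},
		{"true-negative", "ConnectX-7", "Mellanox", false},
		{"vendor counter-case", "NVSwitch", "NVIDIA", false},
	}
}

func main() {
	for _, c := range classifyCases() {
		got := looksLikeGPUSketch(c.model, c.manufacturer)
		fmt.Printf("%-20s got=%v want=%v\n", c.name, got, c.want)
	}
}
```

In a real test file the same table would typically drive t.Run subtests; the counter-case row is the one that would have caught the NVSwitch misclassification.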

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 15:09:27 +03:00

LOGPile Bible

Documentation language: English only. All maintained project documentation must be written in English.

Architectural decisions: Every significant architectural decision must be recorded in 10-decisions.md before or alongside the code change.

Single source of truth: Architecture and technical design documentation belongs in docs/bible/. Keep README.md and CLAUDE.md minimal to avoid duplicate documentation.

This directory is the single source of truth for LOGPile's architecture, design, and integration contracts. It is structured so that both humans and AI assistants can navigate it quickly.


Reading Map (Hierarchical)

1. Foundations (read first)

| File | What it covers |
|------|----------------|
| 01-overview.md | Product purpose, operating modes, scope |
| 02-architecture.md | Runtime structure, control flow, in-memory state |
| 04-data-models.md | Core contracts (AnalysisResult, canonical hardware.devices) |

2. Runtime Interfaces

| File | What it covers |
|------|----------------|
| 03-api.md | HTTP API contracts and endpoint behavior |
| 05-collectors.md | Live collection connectors (Redfish, IPMI mock) |
| 06-parsers.md | Archive parser framework and vendor parsers |
| 07-exporters.md | CSV / JSON / Reanimator exports and integration mapping |

3. Delivery & Quality

| File | What it covers |
|------|----------------|
| 08-build-release.md | Build, packaging, release workflow |
| 09-testing.md | Testing expectations and verification guidance |

4. Governance (always current)

| File | What it covers |
|------|----------------|
| 10-decisions.md | Architectural Decision Log (ADL) |

Quick orientation for AI assistants

  • Read order for most changes: 01 → 02 → 04 → relevant interface doc(s) → 10
  • Entry point: cmd/logpile/main.go
  • HTTP server: internal/server/ — handlers in handlers.go, routes in server.go
  • Data contracts: internal/models/ — never break AnalysisResult JSON shape
  • Frontend contract: web/static/js/app.js — keep API responses stable
  • Canonical inventory: hardware.devices in AnalysisResult — source of truth for UI and exports
  • Parser registry: internal/parser/vendors/ (init() auto-registration pattern)
  • Collector registry: internal/collector/registry.go
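
For readers unfamiliar with the init() auto-registration pattern mentioned for the parser registry, here is a minimal single-file sketch. The Parser interface, the Register and Registered functions, and the example vendor parser are hypothetical names chosen for illustration; the actual LOGPile interfaces may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Parser is a hypothetical minimal parser interface.
type Parser interface {
	Name() string
	Detect(filename string) bool
}

// registry is initialized before any init() function runs, so parsers
// can safely register themselves from init().
var registry = map[string]Parser{}

// Register adds a parser and panics on duplicate names, so a naming
// collision surfaces at startup rather than at parse time.
func Register(p Parser) {
	if _, dup := registry[p.Name()]; dup {
		panic("parser registered twice: " + p.Name())
	}
	registry[p.Name()] = p
}

// Registered returns the registered parser names in sorted order.
func Registered() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// exampleVendorParser stands in for a vendor package's parser.
type exampleVendorParser struct{}

func (exampleVendorParser) Name() string { return "examplevendor" }

func (exampleVendorParser) Detect(filename string) bool {
	return strings.HasSuffix(filename, ".tar.gz")
}

// In a multi-package layout this init() would live in the vendor package,
// which the application pulls in with a blank import.
func init() {
	Register(exampleVendorParser{})
}

func main() {
	fmt.Println(Registered())
}
```

The appeal of the pattern is that adding a vendor requires no edits to a central list: importing the vendor package is enough for its parser to appear in the registry.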