logpile/bible-local/05-collectors.md
Mikhail Chusavitin d650a6ba1c refactor: unified ingest pipeline + modular Redfish profile framework
Implement the full architectural plan: unified ingest.Service entry point
for archive and Redfish payloads, modular redfishprofile package with
composable profiles (generic, ami-family, msi, supermicro, dell,
hgx-topology), score-based profile matching with fallback expansion mode,
and profile-driven acquisition/analysis plans.

Vendor-specific logic moved out of common executors and into profile hooks.
GPU chassis lookup strategies and known storage recovery collections
(IntelVROC/HA-RAID/MRVL) now live in ResolvedAnalysisPlan, populated by
profiles at analysis time. Replay helpers read from the plan; no hardcoded
path lists remain in generic code.

Also splits redfish_replay.go into domain modules (gpu, storage, inventory,
fru, profiles) and adds full fixture/matcher/directive test coverage
including Dell, AMI, unknown-vendor fallback, and deterministic ordering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 08:48:58 +03:00


05 — Collectors

Collectors live in internal/collector/.

Core files:

  • registry.go for protocol registration
  • redfish.go for live collection
  • redfish_replay.go for replay from raw payloads
  • redfish_replay_gpu.go for profile-driven GPU replay collectors and GPU fallback helpers
  • redfish_replay_storage.go for profile-driven storage replay collectors and storage recovery helpers
  • redfish_replay_inventory.go for replay inventory collectors (PCIe, NIC, BMC MAC, NIC enrichment)
  • redfish_replay_fru.go for board fallback helpers and Assembly/FRU replay extraction
  • redfish_replay_profiles.go for profile-driven replay helpers and vendor-aware recovery helpers
  • redfishprofile/ for Redfish profile matching and acquisition/analysis hooks
  • ipmi_mock.go for the placeholder IPMI implementation
  • types.go for request/progress contracts
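
A minimal sketch of the registration pattern behind registry.go, assuming a protocol-keyed map; the names `Register`, `Lookup`, and `Protocols` are illustrative, not the real API:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Collector is a minimal stand-in for the contract in types.go.
type Collector interface {
	Protocol() string
}

type redfishCollector struct{}

func (redfishCollector) Protocol() string { return "redfish" }

type ipmiMockCollector struct{}

func (ipmiMockCollector) Protocol() string { return "ipmi" }

var (
	mu       sync.Mutex
	registry = map[string]Collector{}
)

// Register maps a protocol name to its collector implementation.
func Register(c Collector) {
	mu.Lock()
	defer mu.Unlock()
	registry[c.Protocol()] = c
}

// Lookup returns the collector for a protocol, if one is registered.
func Lookup(protocol string) (Collector, bool) {
	mu.Lock()
	defer mu.Unlock()
	c, ok := registry[protocol]
	return c, ok
}

// Protocols lists registered protocols in deterministic order.
func Protocols() []string {
	mu.Lock()
	defer mu.Unlock()
	out := make([]string, 0, len(registry))
	for p := range registry {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

func main() {
	Register(redfishCollector{})
	Register(ipmiMockCollector{})
	fmt.Println(Protocols())
}
```

This is why the IPMI mock can stay registered for protocol completeness without being a real collection path: the server only ever resolves collectors through the lookup.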

Redfish collector

Status: active production path.

Request fields passed from the server:

  • host
  • port
  • username
  • auth_type
  • credential field (password or token)
  • tls_mode
  • optional power_on_if_host_off

Core rule

Live collection and replay must stay behaviorally aligned. If the collector adds a fallback, probe, or normalization rule, replay must mirror it.

Preflight and host power

  • Probe() may be used before collection to verify API connectivity and current host PowerState
  • if the host is off and the user chose power-on, the collector may issue ComputerSystem.Reset with ResetType=On
  • power-on attempts are bounded and logged
  • after a successful power-on, the collector waits an extra stabilization window, then checks PowerState again and only starts collection if the host is still on
  • if the collector powered on the host itself for collection, it must attempt to power it back off after collection completes
  • if the host was already on before collection, the collector must not power it off afterward
  • if power-on fails, collection still continues against the powered-off host
  • all power-control decisions and attempts must be visible in the collection log so they are preserved in raw-export bundles
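
The power rules above can be sketched as one bounded preflight function. This is a simplified model, not the real implementation: the `bmc` interface, `maxPowerOnAttempts`, and the skipped stabilization wait are all assumptions:

```go
package main

import "fmt"

// bmc abstracts the two Redfish calls the preflight needs; illustrative only.
type bmc interface {
	PowerState() string           // current host power state, "On" or "Off"
	Reset(resetType string) error // ComputerSystem.Reset
}

const maxPowerOnAttempts = 2 // power-on attempts are bounded

// preflight applies the rules above: remember the original state, power on
// only if the user opted in, and report whether the collector owns a
// power-off after collection completes.
func preflight(b bmc, powerOnIfOff bool, log func(string)) (collectorPoweredOn bool) {
	if b.PowerState() == "On" {
		log("host already on; collector will not power it off afterward")
		return false
	}
	if !powerOnIfOff {
		log("host off and power-on not requested; collecting against powered-off host")
		return false
	}
	for attempt := 1; attempt <= maxPowerOnAttempts; attempt++ {
		log(fmt.Sprintf("power-on attempt %d", attempt))
		if err := b.Reset("On"); err != nil {
			continue
		}
		// The real collector waits a stabilization window here, then
		// re-checks PowerState before starting collection.
		if b.PowerState() == "On" {
			return true // collector must attempt power-off after collection
		}
	}
	log("power-on failed; collection continues against powered-off host")
	return false
}

type fakeBMC struct{ state string }

func (f *fakeBMC) PowerState() string { return f.state }
func (f *fakeBMC) Reset(t string) error {
	if t == "On" {
		f.state = "On"
	}
	return nil
}

func main() {
	b := &fakeBMC{state: "Off"}
	owned := preflight(b, true, func(s string) { fmt.Println("log:", s) })
	fmt.Println("collector owns power-off:", owned)
}
```

Note that every branch logs its decision, which is what keeps power-control history visible in raw-export bundles.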

Discovery model

The collector does not rely on one fixed vendor tree. It discovers and follows Redfish resources dynamically from root collections such as:

  • Systems
  • Chassis
  • Managers

After minimal discovery the collector builds MatchSignals and selects a Redfish profile mode:

  • matched when one or more profiles score with high confidence
  • fallback when vendor/platform confidence is low; in this mode the collector aggregates safe additive profile probes to maximize snapshot completeness
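
The mode selection can be sketched as score-then-threshold logic. The threshold value and the idea of "activating every profile's safe probes" in fallback mode are modeled loosely here; the real scoring lives in redfishprofile/:

```go
package main

import "fmt"

// MatchSignals is a minimal stand-in for the discovery-derived signals.
type MatchSignals struct {
	Manufacturer string
	Model        string
}

type profile struct {
	name  string
	score func(MatchSignals) int
}

const matchThreshold = 50 // illustrative confidence cutoff

// selectMode activates profiles scoring above the threshold; when none
// do, it falls back to aggregating every profile's safe additive probes.
func selectMode(sig MatchSignals, profiles []profile) (mode string, active []string) {
	for _, p := range profiles {
		if p.score(sig) >= matchThreshold {
			active = append(active, p.name)
		}
	}
	if len(active) > 0 {
		return "matched", active
	}
	for _, p := range profiles {
		active = append(active, p.name) // fallback: every profile contributes
	}
	return "fallback", active
}

func main() {
	profiles := []profile{
		{"generic", func(MatchSignals) int { return 10 }},
		{"supermicro", func(s MatchSignals) int {
			if s.Manufacturer == "Supermicro" {
				return 90
			}
			return 0
		}},
	}
	mode, active := selectMode(MatchSignals{Manufacturer: "Supermicro"}, profiles)
	fmt.Println(mode, active)
}
```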

Profile modules may contribute:

  • primary acquisition seeds
  • bounded PlanBPaths for secondary recovery
  • critical paths
  • acquisition notes/diagnostics
  • tuning hints such as snapshot document cap, prefetch behavior, and expensive post-probe toggles
  • post-probe policy for numeric collection recovery, direct NVMe Disk.Bay recovery, and sensor post-probe enablement
  • recovery policy for critical collection member retry, slow numeric plan-B probing, and profile-specific plan-B activation
  • scoped path policy for discovered Systems/*, Chassis/*, and Managers/* branches when a profile needs extra seeds/critical targets beyond the vendor-neutral core set
  • prefetch policy for which critical paths are eligible for adaptive prefetch and which path shapes are explicitly excluded
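
A rough sketch of how those contributions might compose into one resolved plan; the `ProfilePlan` field names and the `merge` helper are assumptions, not the real redfishprofile/ types:

```go
package main

import "fmt"

// ProfilePlan is an illustrative shape for what a profile module may
// contribute; the real struct carries more policy fields.
type ProfilePlan struct {
	Seeds                 []string // primary acquisition seeds
	PlanBPaths            []string // bounded secondary recovery
	CriticalPaths         []string
	Notes                 []string // acquisition notes/diagnostics
	MaxSnapshotDocs       int      // tuning hint: snapshot document cap
	EnableSensorPostProbe bool     // post-probe policy toggle
}

// dedupe keeps the first occurrence of each path, preserving order.
func dedupe(paths []string) []string {
	seen := map[string]bool{}
	out := paths[:0:0]
	for _, p := range paths {
		if !seen[p] {
			seen[p] = true
			out = append(out, p)
		}
	}
	return out
}

// merge folds several profile contributions into one resolved plan:
// paths are unioned and deduplicated, tuning hints take the most
// permissive value.
func merge(plans ...ProfilePlan) ProfilePlan {
	var out ProfilePlan
	for _, p := range plans {
		out.Seeds = append(out.Seeds, p.Seeds...)
		out.PlanBPaths = append(out.PlanBPaths, p.PlanBPaths...)
		out.CriticalPaths = append(out.CriticalPaths, p.CriticalPaths...)
		out.Notes = append(out.Notes, p.Notes...)
		if p.MaxSnapshotDocs > out.MaxSnapshotDocs {
			out.MaxSnapshotDocs = p.MaxSnapshotDocs
		}
		out.EnableSensorPostProbe = out.EnableSensorPostProbe || p.EnableSensorPostProbe
	}
	out.Seeds = dedupe(out.Seeds)
	out.PlanBPaths = dedupe(out.PlanBPaths)
	out.CriticalPaths = dedupe(out.CriticalPaths)
	return out
}

func main() {
	m := merge(
		ProfilePlan{Seeds: []string{"/redfish/v1/Systems"}},
		ProfilePlan{Seeds: []string{"/redfish/v1/Systems", "/redfish/v1/Chassis"}},
	)
	fmt.Println(m.Seeds)
}
```

Deterministic, order-preserving merging is what makes fallback mode (multiple active profiles) safe to aggregate.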

Model- or topology-specific CriticalPaths and profile PlanBPaths must live in the profile module that owns the behavior. The collector core may execute those paths, but it should not hardcode vendor-specific recovery targets. The same split between core execution and profile ownership applies across the other policy areas:

  • expensive post-probes: the core may execute bounded post-probe loops, but profiles own whether those loops are enabled for a given platform shape
  • critical recovery passes: the core may run bounded plan-B loops, but profiles own whether member retry, slow numeric recovery, and profile-specific plan-B passes are enabled
  • discovered-path branches: when a profile needs extra branches such as storage controller subtrees, it must provide them as scoped suffix policy rather than hardcoding platform-shaped suffixes into the core baseline seed list
  • prefetch shaping: the core may execute adaptive prefetch, but profiles own the include/exclude rules for which critical paths participate
  • critical inventory shaping: the core keeps only a minimal vendor-neutral critical baseline, while profiles own additional system/chassis/manager critical suffixes and top-level critical targets

Resolved live acquisition plans should be built inside redfishprofile/, not by hand in redfish.go. The collector core should receive discovered resources plus the selected profile plan and then execute the resolved seed/critical paths. When profile behavior depends on what discovery actually returned, use a post-discovery refinement hook in redfishprofile/ instead of hardcoding guessed absolute paths in the static plan. MSI GPU chassis refinement is the reference example.
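
The post-discovery refinement idea can be sketched as a hook that receives what discovery actually returned and adjusts the plan. The `RefineHook` shape and the `GPU_` chassis-ID prefix are assumptions made for illustration; they are not the real MSI refinement code:

```go
package main

import (
	"fmt"
	"strings"
)

// Discovered is a minimal stand-in for what discovery actually returned.
type Discovered struct {
	ChassisIDs []string
}

// RefineHook lets a profile adjust its critical paths after discovery
// instead of guessing absolute paths in the static plan.
type RefineHook func(d Discovered, criticalPaths []string) []string

// msiGPURefine sketches the idea behind the MSI GPU chassis refinement:
// only add GPU chassis critical paths for chassis IDs discovery reported.
var msiGPURefine RefineHook = func(d Discovered, critical []string) []string {
	for _, id := range d.ChassisIDs {
		if strings.HasPrefix(id, "GPU_") { // hypothetical ID shape
			critical = append(critical, "/redfish/v1/Chassis/"+id)
		}
	}
	return critical
}

func main() {
	d := Discovered{ChassisIDs: []string{"Baseboard", "GPU_0", "GPU_1"}}
	fmt.Println(msiGPURefine(d, nil))
}
```

The payoff is that the static plan never needs to guess how many GPU chassis a given box exposes; the hook derives the targets from the discovered tree.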

Live Redfish collection must expose profile-match diagnostics:

  • collector logs must include the selected modules and score for every known module
  • job status responses must carry structured active_modules and module_scores
  • the collect page should render active modules as chips from structured status data, not by parsing log lines

On replay, profile-derived analysis directives may enable vendor-specific inventory linking helpers such as processor-GPU fallback, chassis-ID alias resolution, and bounded storage recovery. Replay should now resolve a structured analysis plan inside redfishprofile/, analogous to the live acquisition plan. The replay core may execute collectors against the resolved directives, but snapshot-aware vendor decisions should live in profile analysis hooks, not in redfish_replay.go. GPU and storage replay executors should consume the resolved analysis plan directly, not a raw AnalysisDirectives struct, so the boundary between planning and execution stays explicit.
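
The planning/execution boundary can be sketched like this; the `ResolvedAnalysisPlan` fields are assumed stand-ins for the real struct in redfishprofile/:

```go
package main

import "fmt"

// ResolvedAnalysisPlan is an illustrative shape for the replay-side plan,
// populated by profile analysis hooks at analysis time.
type ResolvedAnalysisPlan struct {
	EnableProcessorGPUFallback bool
	ChassisIDAliases           map[string]string
	StorageRecoveryCollections []string // e.g. known recovery collection paths
}

// resolveChassisID shows the boundary: the replay executor consumes the
// resolved plan directly and never re-derives vendor decisions from raw
// directives inside redfish_replay.go.
func resolveChassisID(plan ResolvedAnalysisPlan, chassisID string) string {
	if alias, ok := plan.ChassisIDAliases[chassisID]; ok {
		return alias
	}
	return chassisID
}

func main() {
	plan := ResolvedAnalysisPlan{
		ChassisIDAliases: map[string]string{"GPU0": "ProcessorModule_0"},
	}
	fmt.Println(resolveChassisID(plan, "GPU0"))
	fmt.Println(resolveChassisID(plan, "Baseboard"))
}
```

Executors stay vendor-blind: if no profile hook populated an alias, the ID passes through unchanged.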

Profile matching and acquisition tuning must be regression-tested against repo-owned compact fixtures under internal/collector/redfishprofile/testdata/, derived from representative raw-export snapshots, for at least MSI and Supermicro shapes. When multiple raw-export snapshots exist for the same platform, profile selection must remain stable across those sibling fixtures unless the topology actually changes. Analysis-plan metadata should be stored in replay raw payloads so vendor hook activation is debuggable offline.

Stored raw data

Important raw payloads:

  • raw_payloads.redfish_tree
  • raw_payloads.redfish_fetch_errors
  • raw_payloads.redfish_profiles
  • raw_payloads.source_timezone when available

Snapshot crawler rules

  • bounded by LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS
  • prioritized toward high-value inventory paths
  • tolerant of expected vendor-specific failures
  • normalizes @odata.id values before queueing
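
The crawler rules can be sketched as a bounded, deduplicating queue with ID normalization up front. The specific normalization steps shown (fragment stripping, trailing-slash trimming) are assumptions about what "normalizes @odata.id values" covers:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeODataID applies cleanup before queueing so the same resource
// never enters the queue under two spellings.
func normalizeODataID(id string) string {
	if i := strings.IndexByte(id, '#'); i >= 0 {
		id = id[:i] // drop JSON-pointer fragments
	}
	return strings.TrimRight(id, "/")
}

// crawler is a bounded breadth-first walk over @odata.id links.
type crawler struct {
	maxDocs int // bound from LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS
	seen    map[string]bool
	queue   []string
}

// enqueue normalizes, deduplicates, and enforces the document cap.
func (c *crawler) enqueue(id string) {
	id = normalizeODataID(id)
	if id == "" || c.seen[id] || len(c.seen) >= c.maxDocs {
		return
	}
	c.seen[id] = true
	c.queue = append(c.queue, id)
}

func main() {
	c := &crawler{maxDocs: 2, seen: map[string]bool{}}
	c.enqueue("/redfish/v1/Systems/1/#Bios")
	c.enqueue("/redfish/v1/Systems/1") // duplicate after normalization
	c.enqueue("/redfish/v1/Chassis/1/")
	c.enqueue("/redfish/v1/Managers/1") // over the cap, dropped
	fmt.Println(c.queue)
}
```

Prioritization toward high-value inventory paths would slot in as an ordering policy over this queue rather than a change to the dedup/cap logic.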

Redfish implementation guidance

When changing collection logic:

  1. Prefer profile modules over ad-hoc vendor branches in the collector core
  2. Keep expensive probing bounded
  3. Deduplicate by serial, then BDF, then location/model fallbacks
  4. Preserve replay determinism from saved raw payloads
  5. Add tests for both the motivating topology and a negative case
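
Rule 3's dedup ordering can be sketched as a key function with explicit precedence; the `device` struct here is a minimal illustrative record, not the real inventory type:

```go
package main

import "fmt"

// device is a minimal inventory record for the dedup sketch.
type device struct {
	Serial, BDF, Location, Model string
}

// key implements the precedence above: prefer serial, then PCI BDF, then
// a location/model composite so devices with no identifiers still merge.
func key(d device) string {
	switch {
	case d.Serial != "":
		return "serial:" + d.Serial
	case d.BDF != "":
		return "bdf:" + d.BDF
	default:
		return "loc:" + d.Location + "/" + d.Model
	}
}

// dedupe keeps the first record seen for each key, preserving order,
// which also supports rule 4's replay determinism.
func dedupe(devices []device) []device {
	seen := map[string]bool{}
	out := devices[:0:0]
	for _, d := range devices {
		k := key(d)
		if !seen[k] {
			seen[k] = true
			out = append(out, d)
		}
	}
	return out
}

func main() {
	in := []device{
		{Serial: "S1", BDF: "0000:3b:00.0"},
		{Serial: "S1"},                     // same serial from another source
		{BDF: "0000:af:00.0"},              // no serial, dedup by BDF
		{Location: "Slot2", Model: "X710"}, // neither, fall back
	}
	fmt.Println(len(dedupe(in)))
}
```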

Known vendor fallbacks

  • empty standard drive collections may trigger bounded Disk.Bay probing
  • Storage.Links.Enclosures[*] may be followed to recover physical drives
  • PowerSubsystem/PowerSupplies is preferred over legacy Power when available

IPMI collector

Status: mock scaffold only.

It remains registered for protocol completeness, but it is not a real collection path.