# 05 — Collectors

Collectors live in `internal/collector/`. Core files:

- `registry.go` for protocol registration
- `redfish.go` for live collection
- `redfish_replay.go` for replay from raw payloads
- `redfish_replay_gpu.go` for profile-driven GPU replay collectors and GPU fallback helpers
- `redfish_replay_storage.go` for profile-driven storage replay collectors and storage recovery helpers
- `redfish_replay_inventory.go` for replay inventory collectors (PCIe, NIC, BMC MAC, NIC enrichment)
- `redfish_replay_fru.go` for board fallback helpers and Assembly/FRU replay extraction
- `redfish_replay_profiles.go` for profile-driven replay helpers and vendor-aware recovery helpers
- `redfishprofile/` for Redfish profile matching and acquisition/analysis hooks
- `ipmi_mock.go` for the placeholder IPMI implementation
- `types.go` for request/progress contracts

## Redfish collector

Status: active production path.

Request fields passed from the server:

- `host`
- `port`
- `username`
- `auth_type`
- credential field (`password` or token)
- `tls_mode`
- optional `power_on_if_host_off`

### Core rule

Live collection and replay must stay behaviorally aligned: if the collector adds a fallback, probe, or normalization rule, replay must mirror it.
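The registration split above can be sketched as a small registry pattern. This is a minimal illustration only — the `Collector` interface, `Register`, and `Lookup` names here are hypothetical stand-ins, not the actual contracts in `registry.go` or `types.go`:

```go
package main

import "fmt"

// Collector is a hypothetical stand-in for the protocol collector
// contract; the real request/progress contracts live in types.go.
type Collector interface {
	Protocol() string
}

// registry maps protocol names to constructors, mirroring the role
// registry.go plays for the redfish and mock ipmi paths.
var registry = map[string]func() Collector{}

// Register adds a collector constructor under its protocol name.
func Register(name string, ctor func() Collector) { registry[name] = ctor }

// Lookup returns the constructor for a protocol, if one is registered.
func Lookup(name string) (func() Collector, bool) {
	ctor, ok := registry[name]
	return ctor, ok
}

type redfishCollector struct{}

func (redfishCollector) Protocol() string { return "redfish" }

func init() {
	// Each protocol file registers itself once at package init.
	Register("redfish", func() Collector { return redfishCollector{} })
}

func main() {
	if ctor, ok := Lookup("redfish"); ok {
		fmt.Println(ctor().Protocol())
	}
}
```

The point of the pattern is that the server can dispatch by protocol name without importing vendor-specific collection logic directly.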
### Preflight and host power

- `Probe()` may be used before collection to verify API connectivity and the current host `PowerState`
- if the host is off and the user chose power-on, the collector may issue `ComputerSystem.Reset` with `ResetType=On`
- power-on attempts are bounded and logged
- after a successful power-on, the collector waits an extra stabilization window, then checks `PowerState` again and only starts collection if the host is still on
- if the collector powered on the host itself for collection, it must attempt to power it back off after collection completes
- if the host was already on before collection, the collector must not power it off afterward
- if power-on fails, collection still continues against the powered-off host
- all power-control decisions and attempts must be visible in the collection log so they are preserved in raw-export bundles

### Discovery model

The collector does not rely on one fixed vendor tree. It discovers and follows Redfish resources dynamically from root collections such as:

- `Systems`
- `Chassis`
- `Managers`

After minimal discovery the collector builds `MatchSignals` and selects a Redfish profile mode:

- `matched` when one or more profiles score with high confidence
- `fallback` when vendor/platform confidence is low; in this mode the collector aggregates safe additive profile probes to maximize snapshot completeness

Profile modules may contribute:

- primary acquisition seeds
- bounded `PlanBPaths` for secondary recovery
- critical paths
- acquisition notes/diagnostics
- tuning hints such as the snapshot document cap, prefetch behavior, and expensive post-probe toggles
- post-probe policy for numeric collection recovery, direct NVMe `Disk.Bay` recovery, and sensor post-probe enablement
- recovery policy for critical collection member retry, slow numeric plan-B probing, and profile-specific plan-B activation
- scoped path policy for discovered `Systems/*`, `Chassis/*`, and `Managers/*` branches when a profile needs extra seeds/critical targets beyond the vendor-neutral core set
- prefetch policy for which critical paths are eligible for adaptive prefetch and which path shapes are explicitly excluded

Model- or topology-specific `CriticalPaths` and profile `PlanBPaths` must live in the profile module that owns the behavior. The collector core may execute those paths, but it should not hardcode vendor-specific recovery targets. The same ownership split applies across the pipeline:

- expensive post-probes: the collector core may execute bounded post-probe loops, but profiles own whether those loops are enabled for a given platform shape
- critical recovery passes: the collector core may run bounded plan-B loops, but profiles own whether member retry, slow numeric recovery, and profile-specific plan-B passes are enabled
- discovered-path branches: when a profile needs extra branches such as storage controller subtrees, it must provide them as scoped suffix policy rather than hardcoding platform-shaped suffixes into the collector core's baseline seed list
- prefetch shaping: the collector core may execute adaptive prefetch, but profiles own the include/exclude rules for which critical paths participate
- critical inventory shaping: the collector core should keep only a minimal vendor-neutral critical baseline, while profiles own additional system/chassis/manager critical suffixes and top-level critical targets

Resolved live acquisition plans should be built inside `redfishprofile/`, not by hand in `redfish.go`. The collector core should receive the discovered resources plus the selected profile plan, then execute the resolved seed/critical paths. When profile behavior depends on what discovery actually returned, use a post-discovery refinement hook in `redfishprofile/` instead of hardcoding guessed absolute paths in the static plan. MSI GPU chassis refinement is the reference example.
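The preflight power rules above can be sketched as a bounded flow. This is an illustrative sketch only — the `bmc` interface, `preflightPower`, and `fakeBMC` names are hypothetical, and the stabilization wait between attempts is elided:

```go
package main

import (
	"errors"
	"fmt"
)

// bmc is a hypothetical slice of the Redfish client surface used
// during preflight; the real collector talks to ComputerSystem
// resources via Probe() and ComputerSystem.Reset.
type bmc interface {
	PowerState() (string, error)
	Reset(resetType string) error
}

// preflightPower mirrors the documented rules: bounded power-on
// attempts, a post-power-on state re-check, and a record of whether
// we powered the host on ourselves so it can be powered back off
// after collection. If power-on fails, collection still proceeds
// against the off host, so the caller treats the error as non-fatal.
func preflightPower(b bmc, powerOnIfOff bool, maxAttempts int) (poweredOnByUs bool, err error) {
	state, err := b.PowerState()
	if err != nil {
		return false, err
	}
	if state == "On" || !powerOnIfOff {
		// Host already on, or user declined power-on: collect as-is,
		// and never power the host off afterward.
		return false, nil
	}
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err := b.Reset("On"); err != nil {
			continue // bounded retry; each attempt would be logged
		}
		// Re-check PowerState after a stabilization window (elided here).
		if state, err = b.PowerState(); err == nil && state == "On" {
			return true, nil
		}
	}
	return false, errors.New("power-on attempts exhausted")
}

// fakeBMC is a toy double for demonstration.
type fakeBMC struct{ state string }

func (f *fakeBMC) PowerState() (string, error) { return f.state, nil }
func (f *fakeBMC) Reset(t string) error {
	if t == "On" {
		f.state = "On"
	}
	return nil
}

func main() {
	b := &fakeBMC{state: "Off"}
	poweredOnByUs, err := preflightPower(b, true, 3)
	fmt.Println(poweredOnByUs, err) // poweredOnByUs is true: we must power off afterward
}
```

The `poweredOnByUs` result is the piece that enforces the asymmetry in the rules: power off only what you powered on.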
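The `matched`/`fallback` decision can be illustrated with a deliberately simplified scorer. The real scoring in `redfishprofile/` is richer than a single threshold; `selectProfileMode` and its parameters are hypothetical names for this sketch:

```go
package main

import "fmt"

// selectProfileMode is a simplified stand-in for the profile-mode
// decision: "matched" when at least one module scores with high
// confidence, "fallback" otherwise. In fallback mode the collector
// aggregates safe additive probes instead of trusting one profile.
func selectProfileMode(moduleScores map[string]float64, threshold float64) string {
	for _, score := range moduleScores {
		if score >= threshold {
			return "matched"
		}
	}
	return "fallback"
}

func main() {
	// A confident vendor match selects that profile's plan.
	fmt.Println(selectProfileMode(map[string]float64{"msi": 0.92, "supermicro": 0.15}, 0.8))
	// Low confidence everywhere falls back to additive probing.
	fmt.Println(selectProfileMode(map[string]float64{"generic": 0.30}, 0.8))
}
```

Keeping per-module scores around (rather than just the winner) is what lets the diagnostics requirements below report a score for every known module.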
Live Redfish collection must expose profile-match diagnostics:

- collector logs must include the selected modules and the score for every known module
- job status responses must carry structured `active_modules` and `module_scores`
- the collect page should render active modules as chips from structured status data, not by parsing log lines

On replay, profile-derived analysis directives may enable vendor-specific inventory linking helpers such as processor-GPU fallback, chassis-ID alias resolution, and bounded storage recovery. Replay should resolve a structured analysis plan inside `redfishprofile/`, analogous to the live acquisition plan. The replay core may execute collectors against the resolved directives, but snapshot-aware vendor decisions should live in profile analysis hooks, not in `redfish_replay.go`. GPU and storage replay executors should consume the resolved analysis plan directly, not a raw `AnalysisDirectives` struct, so the boundary between planning and execution stays explicit.

Profile matching and acquisition tuning must be regression-tested against repo-owned compact fixtures under `internal/collector/redfishprofile/testdata/`, derived from representative raw-export snapshots, for at least the MSI and Supermicro shapes. When multiple raw-export snapshots exist for the same platform, profile selection must remain stable across those sibling fixtures unless the topology actually changes. Analysis-plan metadata should be stored in replay raw payloads so vendor hook activation is debuggable offline.
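The planning/execution boundary described above can be sketched as follows. The `AnalysisPlan` struct and its fields are hypothetical illustrations of the idea, not the repo's actual types:

```go
package main

import "fmt"

// AnalysisPlan is a hypothetical resolved replay plan, analogous to
// the structured plan redfishprofile/ is supposed to produce. The
// profile analysis hooks decide these flags from the snapshot; the
// replay executors only read them.
type AnalysisPlan struct {
	EnableGPUFallback    bool              // processor-GPU linking fallback
	EnableStorageRecover bool              // bounded storage recovery
	ChassisIDAliases     map[string]string // chassis-ID alias resolution
}

// replayGPUs shows the intended boundary: the executor consumes the
// resolved plan directly, never a raw vendor-directives struct, so
// snapshot-aware vendor decisions stay out of the replay core.
func replayGPUs(plan AnalysisPlan) string {
	if plan.EnableGPUFallback {
		return "gpu: processor-GPU fallback active"
	}
	return "gpu: standard linking only"
}

func main() {
	plan := AnalysisPlan{EnableGPUFallback: true}
	fmt.Println(replayGPUs(plan))
}
```

Because the plan is plain data, it can also be stored in replay raw payloads, which is what makes vendor hook activation debuggable offline.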
### Stored raw data

Important raw payloads:

- `raw_payloads.redfish_tree`
- `raw_payloads.redfish_fetch_errors`
- `raw_payloads.redfish_profiles`
- `raw_payloads.source_timezone` when available

### Snapshot crawler rules

- bounded by `LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`
- prioritized toward high-value inventory paths
- tolerant of expected vendor-specific failures
- normalizes `@odata.id` values before queueing

### Redfish implementation guidance

When changing collection logic:

1. Prefer profile modules over ad-hoc vendor branches in the collector core
2. Keep expensive probing bounded
3. Deduplicate by serial, then BDF, then location/model fallbacks
4. Preserve replay determinism from saved raw payloads
5. Add tests for both the motivating topology and a negative case

### Known vendor fallbacks

- empty standard drive collections may trigger bounded `Disk.Bay` probing
- `Storage.Links.Enclosures[*]` may be followed to recover physical drives
- `PowerSubsystem/PowerSupplies` is preferred over legacy `Power` when available

## IPMI collector

Status: mock scaffold only. It remains registered for protocol completeness, but it is not a real collection path.
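The deduplication order in the implementation guidance (serial, then BDF, then location/model) can be sketched as a key-priority function. The `device`, `dedupKey`, and `dedupe` names below are illustrative, not the repo's actual helpers:

```go
package main

import "fmt"

// device carries the identifiers used for deduplication, in the
// documented priority order: serial, then PCI BDF, then a
// location/model fallback. Field names are illustrative.
type device struct {
	Serial, BDF, Location, Model string
}

// dedupKey picks the strongest available identity for a device,
// falling through exactly in the documented order.
func dedupKey(d device) string {
	switch {
	case d.Serial != "":
		return "serial:" + d.Serial
	case d.BDF != "":
		return "bdf:" + d.BDF
	default:
		return "loc:" + d.Location + "/" + d.Model
	}
}

// dedupe keeps the first device seen for each key, preserving input
// order so replay stays deterministic for a given raw payload.
func dedupe(devs []device) []device {
	seen := map[string]bool{}
	var out []device
	for _, d := range devs {
		key := dedupKey(d)
		if seen[key] {
			continue
		}
		seen[key] = true
		out = append(out, d)
	}
	return out
}

func main() {
	devs := []device{
		{Serial: "S1"},
		{Serial: "S1", BDF: "0000:3b:00.0"}, // duplicate: same serial wins first
		{BDF: "0000:af:00.0"},               // no serial: falls back to BDF
		{Location: "Bay 4", Model: "NVMe X"}, // no serial or BDF: location/model
	}
	fmt.Println(len(dedupe(devs))) // 3
}
```

Ordering the keys this way means a stable hardware identity always beats a positional one, which keeps snapshots of the same machine comparable across reboots and slot changes.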