# 05 — Collectors

Collectors live in `internal/collector/`.

Core files:
- `registry.go` for protocol registration
- `redfish.go` for live collection
- `redfish_replay.go` for replay from raw payloads
- `redfish_replay_gpu.go` for profile-driven GPU replay collectors and GPU fallback helpers
- `redfish_replay_storage.go` for profile-driven storage replay collectors and storage recovery helpers
- `redfish_replay_inventory.go` for replay inventory collectors (PCIe, NIC, BMC MAC, NIC enrichment)
- `redfish_replay_fru.go` for board fallback helpers and Assembly/FRU replay extraction
- `redfish_replay_profiles.go` for profile-driven replay helpers and vendor-aware recovery helpers
- `redfishprofile/` for Redfish profile matching and acquisition/analysis hooks
- `ipmi_mock.go` for the placeholder IPMI implementation
- `types.go` for request/progress contracts

## Redfish collector

Status: active production path.

Request fields passed from the server:
- `host`
- `port`
- `username`
- `auth_type`
- credential field (`password` or token)
- `tls_mode`
- optional `power_on_if_host_off`

### Core rule

Live collection and replay must stay behaviorally aligned.
If the collector adds a fallback, probe, or normalization rule, replay must mirror it.

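
One way to keep the two paths aligned is to route both through the same helpers, so a new normalization rule cannot reach live collection without also reaching replay. The sketch below is illustrative only: `normalizeODataID`, `fetchFunc`, and `collectMemberPaths` are hypothetical names, not the identifiers used in `redfish.go` or `redfish_replay.go`, and the normalization rules shown are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeODataID is a hypothetical shared rule: trim whitespace, drop any
// URI fragment, and strip trailing slashes so equivalent paths compare equal.
func normalizeODataID(id string) string {
	id = strings.TrimSpace(id)
	if i := strings.IndexByte(id, '#'); i >= 0 {
		id = id[:i]
	}
	trimmed := strings.TrimRight(id, "/")
	if trimmed == "" && id != "" {
		return "/" // the input consisted only of slashes
	}
	return trimmed
}

// fetchFunc hides where a document comes from: an HTTP client in live mode,
// a saved tree in replay mode. (Assumed abstraction, not the real contract.)
type fetchFunc func(path string) (map[string]any, bool)

// collectMemberPaths contains the shared logic. Both modes call it, so any
// rule added here applies to live collection and replay alike.
func collectMemberPaths(fetch fetchFunc, collection string) []string {
	doc, ok := fetch(normalizeODataID(collection))
	if !ok {
		return nil
	}
	members, _ := doc["Members"].([]any)
	var out []string
	for _, m := range members {
		ref, _ := m.(map[string]any)
		if id, _ := ref["@odata.id"].(string); id != "" {
			out = append(out, normalizeODataID(id))
		}
	}
	return out
}

// replayFetch builds a fetchFunc backed by a saved tree keyed by path.
func replayFetch(tree map[string]map[string]any) fetchFunc {
	return func(path string) (map[string]any, bool) {
		doc, ok := tree[path]
		return doc, ok
	}
}

func main() {
	tree := map[string]map[string]any{
		"/redfish/v1/Systems": {
			"Members": []any{
				map[string]any{"@odata.id": "/redfish/v1/Systems/1/"},
			},
		},
	}
	fmt.Println(collectMemberPaths(replayFetch(tree), "/redfish/v1/Systems/"))
}
```

A live `fetchFunc` would wrap the HTTP client instead of a map; the member-walking code stays identical.
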
### Preflight and host power

- `Probe()` may be used before collection to verify API connectivity and current host `PowerState`
- if the host is off and the user chose power-on, the collector may issue `ComputerSystem.Reset` with `ResetType=On`
- power-on attempts are bounded and logged
- after a successful power-on, the collector waits an extra stabilization window, then checks `PowerState` again and only starts collection if the host is still on
- if the collector powered on the host itself for collection, it must attempt to power it back off after collection completes
- if the host was already on before collection, the collector must not power it off afterward
- if power-on fails, collection still continues against the powered-off host
- all power-control decisions and attempts must be visible in the collection log so they are preserved in raw-export bundles

### Discovery model

The collector does not rely on one fixed vendor tree.
It discovers and follows Redfish resources dynamically from root collections such as:
- `Systems`
- `Chassis`
- `Managers`

After minimal discovery the collector builds `MatchSignals` and selects a Redfish profile mode:
- `matched` when one or more profiles score with high confidence
- `fallback` when vendor/platform confidence is low; in this mode the collector aggregates safe additive profile probes to maximize snapshot completeness

Profile modules may contribute:
- primary acquisition seeds
- bounded `PlanBPaths` for secondary recovery
- critical paths
- acquisition notes/diagnostics
- tuning hints such as snapshot document cap, prefetch behavior, and expensive post-probe toggles
- post-probe policy for numeric collection recovery, direct NVMe `Disk.Bay` recovery, and sensor post-probe enablement
- recovery policy for critical collection member retry, slow numeric plan-B probing, and profile-specific plan-B activation
- scoped path policy for discovered `Systems/*`, `Chassis/*`, and `Managers/*` branches when a profile needs extra seeds/critical targets beyond the vendor-neutral core set
- prefetch policy for which critical paths are eligible for adaptive prefetch and which path shapes are explicitly excluded

Model- or topology-specific `CriticalPaths` and profile `PlanBPaths` must live in the profile module that owns the behavior.
The collector core may execute those paths, but it should not hardcode vendor-specific recovery targets.

The same rule applies to expensive post-probe decisions: the collector core may execute bounded post-probe loops, but profiles own whether those loops are enabled for a given platform shape.

The same rule applies to critical recovery passes: the collector core may run bounded plan-B loops, but profiles own whether member retry, slow numeric recovery, and profile-specific plan-B passes are enabled.

When a profile needs extra discovered-path branches such as storage controller subtrees, it must provide them as scoped suffix policy rather than by hardcoding platform-shaped suffixes into the collector core baseline seed list.

The same applies to prefetch shaping: the collector core may execute adaptive prefetch, but profiles own the include/exclude rules for which critical paths should participate.

The same applies to critical inventory shaping: the collector core should keep only a minimal vendor-neutral critical baseline, while profiles own additional system/chassis/manager critical suffixes and top-level critical targets.

Resolved live acquisition plans should be built inside `redfishprofile/`, not by hand in `redfish.go`.
The collector core should receive discovered resources plus the selected profile plan and then execute the resolved seed/critical paths.

When profile behavior depends on what discovery actually returned, use a post-discovery refinement hook in `redfishprofile/` instead of hardcoding guessed absolute paths in the static plan.
MSI GPU chassis refinement is the reference example.

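
The scoped suffix policy and post-discovery refinement described above can be sketched as a pure function that expands profile-owned suffixes against whatever discovery actually returned. This is a minimal sketch under stated assumptions: `Plan`, `ScopedSuffixes`, `Discovered`, and `refinePlan` are hypothetical names, and the chassis IDs and suffixes are invented for illustration. The real types in `redfishprofile/` may look quite different.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Plan is a hypothetical resolved acquisition plan handed to the collector core.
type Plan struct {
	SeedPaths     []string
	CriticalPaths []string
}

// ScopedSuffixes is a hypothetical profile-owned policy: suffixes are relative
// to each discovered chassis branch, never absolute guessed paths.
type ScopedSuffixes struct {
	ChassisSeeds    []string
	ChassisCritical []string
}

// Discovered holds the member paths returned by minimal discovery.
type Discovered struct {
	Chassis []string
}

// refinePlan expands the suffix policy against the discovered branches and
// returns a new plan; the base plan is left unmodified.
func refinePlan(base Plan, d Discovered, policy ScopedSuffixes) Plan {
	out := Plan{
		SeedPaths:     append([]string(nil), base.SeedPaths...),
		CriticalPaths: append([]string(nil), base.CriticalPaths...),
	}
	for _, ch := range d.Chassis {
		root := strings.TrimRight(ch, "/")
		for _, s := range policy.ChassisSeeds {
			out.SeedPaths = append(out.SeedPaths, root+"/"+strings.TrimLeft(s, "/"))
		}
		for _, s := range policy.ChassisCritical {
			out.CriticalPaths = append(out.CriticalPaths, root+"/"+strings.TrimLeft(s, "/"))
		}
	}
	out.SeedPaths = dedupeSorted(out.SeedPaths)
	out.CriticalPaths = dedupeSorted(out.CriticalPaths)
	return out
}

// dedupeSorted removes duplicates and sorts, so the resolved plan is
// deterministic regardless of discovery order.
func dedupeSorted(in []string) []string {
	seen := make(map[string]struct{}, len(in))
	out := make([]string, 0, len(in))
	for _, p := range in {
		if _, ok := seen[p]; ok {
			continue
		}
		seen[p] = struct{}{}
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

func main() {
	base := Plan{SeedPaths: []string{"/redfish/v1/Systems"}}
	found := Discovered{Chassis: []string{"/redfish/v1/Chassis/GPU1", "/redfish/v1/Chassis/GPU2/"}}
	policy := ScopedSuffixes{ChassisSeeds: []string{"Sensors"}, ChassisCritical: []string{"/PCIeDevices"}}

	plan := refinePlan(base, found, policy)
	fmt.Println(plan.SeedPaths)
	fmt.Println(plan.CriticalPaths)
}
```

Because the hook only sees discovered paths plus a suffix policy, the collector core never needs to know which platform shape produced them.
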
Live Redfish collection must expose profile-match diagnostics:
- collector logs must include the selected modules and score for every known module
- job status responses must carry structured `active_modules` and `module_scores`
- the collect page should render active modules as chips from structured status data, not by parsing log lines

Profile matching may use stable platform grammar signals in addition to vendor strings:
- discovered member/resource naming from lightweight discovery collections
- firmware inventory member IDs
- OEM action names and linked target paths embedded in discovery documents
- replay-only snapshot hints such as OEM assembly/type markers when they are present in `raw_payloads.redfish_tree`

On replay, profile-derived analysis directives may enable vendor-specific inventory linking helpers such as processor-GPU fallback, chassis-ID alias resolution, and bounded storage recovery.

Replay should now resolve a structured analysis plan inside `redfishprofile/`, analogous to the live acquisition plan.
The replay core may execute collectors against the resolved directives, but snapshot-aware vendor decisions should live in profile analysis hooks, not in `redfish_replay.go`.

GPU and storage replay executors should consume the resolved analysis plan directly, not a raw `AnalysisDirectives` struct, so the boundary between planning and execution stays explicit.

Profile matching and acquisition tuning must be regression-tested against repo-owned compact fixtures under `internal/collector/redfishprofile/testdata/`, derived from representative raw-export snapshots, for at least MSI and Supermicro shapes.

When multiple raw-export snapshots exist for the same platform, profile selection must remain stable across those sibling fixtures unless the topology actually changes.

Analysis-plan metadata should be stored in replay raw payloads so vendor hook activation is debuggable offline.

### Stored raw data

Important raw payloads:
- `raw_payloads.redfish_tree`
- `raw_payloads.redfish_fetch_errors`
- `raw_payloads.redfish_profiles`
- `raw_payloads.source_timezone` when available

### Snapshot crawler rules

- bounded by `LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`
- prioritized toward high-value inventory paths
- tolerant of expected vendor-specific failures
- normalizes `@odata.id` values before queueing

### Redfish implementation guidance

When changing collection logic:

1. Prefer profile modules over ad-hoc vendor branches in the collector core
2. Keep expensive probing bounded
3. Deduplicate by serial, then BDF, then location/model fallbacks
4. Preserve replay determinism from saved raw payloads
5. Add tests for both the motivating topology and a negative case

### Known vendor fallbacks

- empty standard drive collections may trigger bounded `Disk.Bay` probing
- `Storage.Links.Enclosures[*]` may be followed to recover physical drives
- `PowerSubsystem/PowerSupplies` is preferred over legacy `Power` when available

## IPMI collector

Status: mock scaffold only.

It remains registered for protocol completeness, but it is not a real collection path.