Implement the full architectural plan: unified ingest.Service entry point for archive and Redfish payloads, modular redfishprofile package with composable profiles (generic, ami-family, msi, supermicro, dell, hgx-topology), score-based profile matching with fallback expansion mode, and profile-driven acquisition/analysis plans. Vendor-specific logic moved out of common executors and into profile hooks. GPU chassis lookup strategies and known storage recovery collections (IntelVROC/HA-RAID/MRVL) now live in ResolvedAnalysisPlan, populated by profiles at analysis time. Replay helpers read from the plan; no hardcoded path lists remain in generic code. Also splits redfish_replay.go into domain modules (gpu, storage, inventory, fru, profiles) and adds full fixture/matcher/directive test coverage including Dell, AMI, unknown-vendor fallback, and deterministic ordering. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
7.7 KiB
05 — Collectors
Collectors live in internal/collector/.
Core files:
registry.gofor protocol registrationredfish.gofor live collectionredfish_replay.gofor replay from raw payloadsredfish_replay_gpu.gofor profile-driven GPU replay collectors and GPU fallback helpersredfish_replay_storage.gofor profile-driven storage replay collectors and storage recovery helpersredfish_replay_inventory.gofor replay inventory collectors (PCIe, NIC, BMC MAC, NIC enrichment)redfish_replay_fru.gofor board fallback helpers and Assembly/FRU replay extractionredfish_replay_profiles.gofor profile-driven replay helpers and vendor-aware recovery helpersredfishprofile/for Redfish profile matching and acquisition/analysis hooksipmi_mock.gofor the placeholder IPMI implementationtypes.gofor request/progress contracts
Redfish collector
Status: active production path.
Request fields passed from the server:
hostportusernameauth_type- credential field (
passwordor token) tls_mode- optional
power_on_if_host_off
Core rule
Live collection and replay must stay behaviorally aligned. If the collector adds a fallback, probe, or normalization rule, replay must mirror it.
Preflight and host power
Probe()may be used before collection to verify API connectivity and current hostPowerState- if the host is off and the user chose power-on, the collector may issue
ComputerSystem.ResetwithResetType=On - power-on attempts are bounded and logged
- after a successful power-on, the collector waits an extra stabilization window, then checks
PowerStateagain and only starts collection if the host is still on - if the collector powered on the host itself for collection, it must attempt to power it back off after collection completes
- if the host was already on before collection, the collector must not power it off afterward
- if power-on fails, collection still continues against the powered-off host
- all power-control decisions and attempts must be visible in the collection log so they are preserved in raw-export bundles
Discovery model
The collector does not rely on one fixed vendor tree. It discovers and follows Redfish resources dynamically from root collections such as:
SystemsChassisManagers
After minimal discovery the collector builds MatchSignals and selects a Redfish profile mode:
matchedwhen one or more profiles score with high confidencefallbackwhen vendor/platform confidence is low; in this mode the collector aggregates safe additive profile probes to maximize snapshot completeness
Profile modules may contribute:
- primary acquisition seeds
- bounded
PlanBPathsfor secondary recovery - critical paths
- acquisition notes/diagnostics
- tuning hints such as snapshot document cap, prefetch behavior, and expensive post-probe toggles
- post-probe policy for numeric collection recovery, direct NVMe
Disk.Bayrecovery, and sensor post-probe enablement - recovery policy for critical collection member retry, slow numeric plan-B probing, and profile-specific plan-B activation
- scoped path policy for discovered
Systems/*,Chassis/*, andManagers/*branches when a profile needs extra seeds/critical targets beyond the vendor-neutral core set - prefetch policy for which critical paths are eligible for adaptive prefetch and which path shapes are explicitly excluded
Model- or topology-specific CriticalPaths and profile PlanBPaths must live in the profile
module that owns the behavior. The collector core may execute those paths, but it should not
hardcode vendor-specific recovery targets.
The same rule applies to expensive post-probe decisions: the collector core may execute bounded
post-probe loops, but profiles own whether those loops are enabled for a given platform shape.
The same rule applies to critical recovery passes: the collector core may run bounded plan-B
loops, but profiles own whether member retry, slow numeric recovery, and profile-specific plan-B
passes are enabled.
When a profile needs extra discovered-path branches such as storage controller subtrees, it must
provide them as scoped suffix policy rather than by hardcoding platform-shaped suffixes into the
collector core baseline seed list.
The same applies to prefetch shaping: the collector core may execute adaptive prefetch, but
profiles own the include/exclude rules for which critical paths should participate.
The same applies to critical inventory shaping: the collector core should keep only a minimal
vendor-neutral critical baseline, while profiles own additional system/chassis/manager critical
suffixes and top-level critical targets.
Resolved live acquisition plans should be built inside redfishprofile/, not by hand in
redfish.go. The collector core should receive discovered resources plus the selected profile
plan and then execute the resolved seed/critical paths.
When profile behavior depends on what discovery actually returned, use a post-discovery
refinement hook in redfishprofile/ instead of hardcoding guessed absolute paths in the static
plan. MSI GPU chassis refinement is the reference example.
Live Redfish collection must expose profile-match diagnostics:
- collector logs must include the selected modules and score for every known module
- job status responses must carry structured
active_modulesandmodule_scores - the collect page should render active modules as chips from structured status data, not by parsing log lines
On replay, profile-derived analysis directives may enable vendor-specific inventory linking
helpers such as processor-GPU fallback, chassis-ID alias resolution, and bounded storage recovery.
Replay should now resolve a structured analysis plan inside redfishprofile/, analogous to the
live acquisition plan. The replay core may execute collectors against the resolved directives, but
snapshot-aware vendor decisions should live in profile analysis hooks, not in redfish_replay.go.
GPU and storage replay executors should consume the resolved analysis plan directly, not a raw
AnalysisDirectives struct, so the boundary between planning and execution stays explicit.
Profile matching and acquisition tuning must be regression-tested against repo-owned compact
fixtures under internal/collector/redfishprofile/testdata/, derived from representative
raw-export snapshots, for at least MSI and Supermicro shapes.
When multiple raw-export snapshots exist for the same platform, profile selection must remain
stable across those sibling fixtures unless the topology actually changes.
Analysis-plan metadata should be stored in replay raw payloads so vendor hook activation is
debuggable offline.
Stored raw data
Important raw payloads:
raw_payloads.redfish_treeraw_payloads.redfish_fetch_errorsraw_payloads.redfish_profilesraw_payloads.source_timezonewhen available
Snapshot crawler rules
- bounded by
LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS - prioritized toward high-value inventory paths
- tolerant of expected vendor-specific failures
- normalizes
@odata.idvalues before queueing
Redfish implementation guidance
When changing collection logic:
- Prefer profile modules over ad-hoc vendor branches in the collector core
- Keep expensive probing bounded
- Deduplicate by serial, then BDF, then location/model fallbacks
- Preserve replay determinism from saved raw payloads
- Add tests for both the motivating topology and a negative case
Known vendor fallbacks
- empty standard drive collections may trigger bounded
Disk.Bayprobing Storage.Links.Enclosures[*]may be followed to recover physical drivesPowerSubsystem/PowerSuppliesis preferred over legacyPowerwhen available
IPMI collector
Status: mock scaffold only.
It remains registered for protocol completeness, but it is not a real collection path.