refactor: unified ingest pipeline + modular Redfish profile framework

Implement the full architectural plan: unified ingest.Service entry point for archive and Redfish payloads, modular redfishprofile package with composable profiles (generic, ami-family, msi, supermicro, dell, hgx-topology), score-based profile matching with fallback expansion mode, and profile-driven acquisition/analysis plans.

Vendor-specific logic moved out of common executors and into profile hooks. GPU chassis lookup strategies and known storage recovery collections (IntelVROC/HA-RAID/MRVL) now live in ResolvedAnalysisPlan, populated by profiles at analysis time. Replay helpers read from the plan; no hardcoded path lists remain in generic code.

Also splits redfish_replay.go into domain modules (gpu, storage, inventory, fru, profiles) and adds full fixture/matcher/directive test coverage including Dell, AMI, unknown-vendor fallback, and deterministic ordering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@@ -20,6 +20,7 @@ LOGPile remains responsible for upload, collection, parsing, normalization, and
 ```text
 cmd/logpile/main.go       entrypoint and CLI flags
 internal/server/          HTTP handlers, jobs, upload/export flows
+internal/ingest/          source-family orchestration for upload and raw replay
 internal/collector/       live collection and Redfish replay
 internal/analyzer/        shared analysis helpers
 internal/parser/          archive extraction and parser dispatch
@@ -50,18 +51,21 @@ Failed or canceled jobs do not overwrite the previous dataset.
 ### Upload

 1. `POST /api/upload` receives multipart field `archive`
-2. JSON inputs are checked for raw-export package or `AnalysisResult` snapshot
-3. Non-JSON inputs go through `parser.BMCParser`
-4. Archive metadata is normalized onto `AnalysisResult`
-5. Result becomes the current in-memory dataset
+2. `internal/ingest.Service` resolves the source family
+3. JSON inputs are checked for raw-export package or `AnalysisResult` snapshot
+4. Non-JSON archives go through the archive parser family
+5. Archive metadata is normalized onto `AnalysisResult`
+6. Result becomes the current in-memory dataset

 ### Live collect

 1. `POST /api/collect` validates request fields
 2. Server creates an async job and returns `202 Accepted`
 3. Selected collector gathers raw data
-4. For Redfish, collector saves `raw_payloads.redfish_tree`
-5. Result is normalized, source metadata applied, and state replaced on success
+4. For Redfish, collector runs minimal discovery, matches Redfish profiles, and builds an acquisition plan
+5. Collector applies profile tuning hints (for example crawl breadth, prefetch, bounded plan-B passes)
+6. Collector saves `raw_payloads.redfish_tree` plus acquisition diagnostics
+7. Result is normalized, source metadata applied, and state replaced on success

 ### Batch convert

@@ -76,6 +80,10 @@ Failed or canceled jobs do not overwrite the previous dataset.
 Live Redfish collection and offline Redfish re-analysis must use the same replay path.
 The collector first captures `raw_payloads.redfish_tree`, then the replay logic builds the normalized result.

+Redfish is being split into two coordinated phases:
+- acquisition: profile-driven snapshot collection strategy
+- analysis: replay over the saved snapshot with the same profile framework
+
 ## PCI IDs lookup

 Lookup order:
@@ -6,6 +6,12 @@ Core files:
 - `registry.go` for protocol registration
 - `redfish.go` for live collection
 - `redfish_replay.go` for replay from raw payloads
+- `redfish_replay_gpu.go` for profile-driven GPU replay collectors and GPU fallback helpers
+- `redfish_replay_storage.go` for profile-driven storage replay collectors and storage recovery helpers
+- `redfish_replay_inventory.go` for replay inventory collectors (PCIe, NIC, BMC MAC, NIC enrichment)
+- `redfish_replay_fru.go` for board fallback helpers and Assembly/FRU replay extraction
+- `redfish_replay_profiles.go` for profile-driven replay helpers and vendor-aware recovery helpers
+- `redfishprofile/` for Redfish profile matching and acquisition/analysis hooks
 - `ipmi_mock.go` for the placeholder IPMI implementation
 - `types.go` for request/progress contracts

@@ -50,11 +56,72 @@ It discovers and follows Redfish resources dynamically from root collections suc
 - `Chassis`
 - `Managers`

+After minimal discovery the collector builds `MatchSignals` and selects a Redfish profile mode:
+- `matched` when one or more profiles score with high confidence
+- `fallback` when vendor/platform confidence is low; in this mode the collector aggregates safe additive profile probes to maximize snapshot completeness
+
+Profile modules may contribute:
+- primary acquisition seeds
+- bounded `PlanBPaths` for secondary recovery
+- critical paths
+- acquisition notes/diagnostics
+- tuning hints such as snapshot document cap, prefetch behavior, and expensive post-probe toggles
+- post-probe policy for numeric collection recovery, direct NVMe `Disk.Bay` recovery, and sensor post-probe enablement
+- recovery policy for critical collection member retry, slow numeric plan-B probing, and profile-specific plan-B activation
+- scoped path policy for discovered `Systems/*`, `Chassis/*`, and `Managers/*` branches when a profile needs extra seeds/critical targets beyond the vendor-neutral core set
+- prefetch policy for which critical paths are eligible for adaptive prefetch and which path shapes are explicitly excluded
+
+Model- or topology-specific `CriticalPaths` and profile `PlanBPaths` must live in the profile
+module that owns the behavior. The collector core may execute those paths, but it should not
+hardcode vendor-specific recovery targets.
+The same rule applies to expensive post-probe decisions: the collector core may execute bounded
+post-probe loops, but profiles own whether those loops are enabled for a given platform shape.
+The same rule applies to critical recovery passes: the collector core may run bounded plan-B
+loops, but profiles own whether member retry, slow numeric recovery, and profile-specific plan-B
+passes are enabled.
+When a profile needs extra discovered-path branches such as storage controller subtrees, it must
+provide them as scoped suffix policy rather than by hardcoding platform-shaped suffixes into the
+collector core baseline seed list.
+The same applies to prefetch shaping: the collector core may execute adaptive prefetch, but
+profiles own the include/exclude rules for which critical paths should participate.
+The same applies to critical inventory shaping: the collector core should keep only a minimal
+vendor-neutral critical baseline, while profiles own additional system/chassis/manager critical
+suffixes and top-level critical targets.
+Resolved live acquisition plans should be built inside `redfishprofile/`, not by hand in
+`redfish.go`. The collector core should receive discovered resources plus the selected profile
+plan and then execute the resolved seed/critical paths.
+When profile behavior depends on what discovery actually returned, use a post-discovery
+refinement hook in `redfishprofile/` instead of hardcoding guessed absolute paths in the static
+plan. MSI GPU chassis refinement is the reference example.
+
+Live Redfish collection must expose profile-match diagnostics:
+- collector logs must include the selected modules and score for every known module
+- job status responses must carry structured `active_modules` and `module_scores`
+- the collect page should render active modules as chips from structured status data, not by
+  parsing log lines
+
+On replay, profile-derived analysis directives may enable vendor-specific inventory linking
+helpers such as processor-GPU fallback, chassis-ID alias resolution, and bounded storage recovery.
+Replay should now resolve a structured analysis plan inside `redfishprofile/`, analogous to the
+live acquisition plan. The replay core may execute collectors against the resolved directives, but
+snapshot-aware vendor decisions should live in profile analysis hooks, not in `redfish_replay.go`.
+GPU and storage replay executors should consume the resolved analysis plan directly, not a raw
+`AnalysisDirectives` struct, so the boundary between planning and execution stays explicit.
+
+Profile matching and acquisition tuning must be regression-tested against repo-owned compact
+fixtures under `internal/collector/redfishprofile/testdata/`, derived from representative
+raw-export snapshots, for at least MSI and Supermicro shapes.
+When multiple raw-export snapshots exist for the same platform, profile selection must remain
+stable across those sibling fixtures unless the topology actually changes.
+Analysis-plan metadata should be stored in replay raw payloads so vendor hook activation is
+debuggable offline.
+
 ### Stored raw data

 Important raw payloads:
 - `raw_payloads.redfish_tree`
 - `raw_payloads.redfish_fetch_errors`
+- `raw_payloads.redfish_profiles`
 - `raw_payloads.source_timezone` when available

 ### Snapshot crawler rules
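The `matched`/`fallback` selection described in the hunk above could be sketched roughly as follows. This is an illustration under stated assumptions, not the `redfishprofile` implementation: `candidate`, `selectMode`, the `threshold` parameter, and the `safeAdditive` flag are hypothetical names, and the real scoring inputs (`MatchSignals`) are not modeled.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical scored profile; the real module types are not shown in this diff.
type candidate struct {
	name         string
	score        int
	safeAdditive bool // assumed flag: probes are safe to aggregate in fallback mode
}

// selectMode returns "matched" with every profile scoring at or above threshold,
// or "fallback" with all safe additive profiles when none qualifies.
// Names are sorted so the selection is deterministic across runs.
func selectMode(cands []candidate, threshold int) (string, []string) {
	var matched, fallback []string
	for _, c := range cands {
		if c.score >= threshold {
			matched = append(matched, c.name)
		}
		if c.safeAdditive {
			fallback = append(fallback, c.name)
		}
	}
	if len(matched) > 0 {
		sort.Strings(matched)
		return "matched", matched
	}
	sort.Strings(fallback)
	return "fallback", fallback
}

func main() {
	mode, names := selectMode([]candidate{
		{name: "msi", score: 20, safeAdditive: true},
		{name: "generic", score: 10, safeAdditive: true},
	}, 60)
	fmt.Println(mode, names)
}
```

Sorting the selected names is one simple way to keep profile selection stable across sibling fixtures, which the text above requires.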
@@ -68,7 +135,7 @@ Important raw payloads:

 When changing collection logic:

-1. Prefer alternate-path support over vendor hardcoding
+1. Prefer profile modules over ad-hoc vendor branches in the collector core
 2. Keep expensive probing bounded
 3. Deduplicate by serial, then BDF, then location/model fallbacks
 4. Preserve replay determinism from saved raw payloads
@@ -274,6 +274,188 @@ for `Enclosure`, `RackMount`, and any unrecognised type (fail-safe).
 both the excluded types and the storage-capable types (see `TestChassisTypeCanHaveNVMe`
 and `TestNVMePostProbeSkipsNonStorageChassis`).

+## ADL-019 — Redfish post-probe recovery is profile-owned acquisition policy
+
+**Date:** 2026-03-18
+**Context:** Numeric collection post-probe and direct NVMe `Disk.Bay` recovery were still
+controlled by collector-core heuristics, which kept platform-specific acquisition behavior in
+`redfish.go` and made vendor/topology refactoring incomplete.
+**Decision:** Move expensive Redfish post-probe enablement into profile-owned acquisition policy.
+The collector core may execute bounded post-probe loops, but profiles must explicitly enable:
+- numeric collection post-probe
+- direct NVMe `Disk.Bay` recovery
+- sensor collection post-probe
+**Consequences:**
+- Generic collector flow no longer implicitly turns on storage/NVMe recovery for every platform.
+- Supermicro-specific direct NVMe recovery and generic numeric collection recovery are now
+  regression-tested through profile fixtures.
+- Future platform storage/post-probe behavior must be added through profile tuning, not new
+  vendor-shaped `if` branches in collector core.
+
+## ADL-020 — Redfish critical plan-B activation is profile-owned recovery policy
+
+**Date:** 2026-03-18
+**Context:** `critical plan-B` and `profile plan-B` were still effectively always-on collector
+behavior once paths were present, including critical collection member retry and slow numeric
+child probing. That kept acquisition recovery semantics in `redfish.go` instead of the profile
+layer.
+**Decision:** Move plan-B activation into profile-owned recovery policy. Profiles must explicitly
+enable:
+- critical collection member retry
+- slow numeric probing during critical plan-B
+- profile-specific plan-B pass
+**Consequences:**
+- Recovery behavior is now observable in raw Redfish diagnostics alongside other tuning.
+- Generic/fallback recovery remains available through profile policy instead of implicit collector
+  defaults.
+- Future platform-specific plan-B behavior must be introduced through profile tuning and tests,
+  not through new unconditional collector branches.
+
+## ADL-021 — Extra discovered-path storage seeds must be profile-scoped, not core-baseline
+
+**Date:** 2026-03-18
+**Context:** The collector core baseline seed list still contained storage-specific discovered-path
+suffixes such as `SimpleStorage` and `Storage/IntelVROC/*`. These are useful on some platforms,
+but they are acquisition extensions layered on top of discovered `Systems/*` resources, not part
+of the minimal vendor-neutral Redfish baseline.
+**Decision:** Move such discovered-path expansions into profile-owned scoped path policy. The
+collector core keeps the vendor-neutral baseline; profiles may add extra system/chassis/manager
+suffixes that are expanded over discovered members during acquisition planning.
+**Consequences:**
+- Platform-shaped storage discovery no longer lives in `redfish.go` baseline seed construction.
+- Extra discovered-path branches are visible in plan diagnostics and fixture regression tests.
+- Future model/vendor storage path expansions must be added through scoped profile policy instead
+  of editing the shared baseline seed list.
+
+## ADL-022 — Adaptive prefetch eligibility is profile-owned policy
+
+**Date:** 2026-03-18
+**Context:** The adaptive prefetch executor was still driven by hardcoded include/exclude path
+rules in `redfish.go`. That made GPU/storage/network prefetch shaping part of collector-core
+knowledge rather than profile-owned acquisition policy.
+**Decision:** Move prefetch eligibility rules into profile tuning. The collector core still runs
+adaptive prefetch, but profiles provide:
+- `IncludeSuffixes` for critical paths eligible for prefetch
+- `ExcludeContains` for path shapes that must never be prefetched
+**Consequences:**
+- Prefetch behavior is now visible in raw Redfish diagnostics and test fixtures.
+- Platform- or topology-specific prefetch shaping no longer requires editing collector-core
+  string lists.
+- Future prefetch tuning must be introduced through profiles and regression tests.
+
+## ADL-023 — Core critical baseline is roots-only; critical shaping is profile-owned
+
+**Date:** 2026-03-18
+**Context:** `redfishCriticalEndpoints(...)` still encoded a broad set of system/chassis/manager
+critical branches directly in collector core. This mixed minimal crawl invariants with profile-
+specific acquisition shaping.
+**Decision:** Reduce collector-core critical baseline to vendor-neutral roots only:
+- `/redfish/v1`
+- discovered `Systems/*`
+- discovered `Chassis/*`
+- discovered `Managers/*`
+
+Profiles now own additional critical shaping through:
+- scoped critical suffix policy for discovered resources
+- explicit top-level `CriticalPaths`
+**Consequences:**
+- Critical inventory breadth is now explained by the acquisition plan, not hidden in collector
+  helper defaults.
+- Generic profile still provides the previous broad critical coverage, so behavior stays stable.
+- Future critical-path tuning must be implemented in profiles and regression-tested there.
+
+## ADL-024 — Live Redfish execution plans are resolved inside redfishprofile
+
+**Date:** 2026-03-18
+**Context:** Even after moving seeds, scoped paths, critical shaping, recovery, and prefetch
+policy into profiles, `redfish.go` still manually merged discovered resources with those policy
+fragments. That left acquisition-plan resolution logic in collector core.
+**Decision:** Introduce `redfishprofile.ResolveAcquisitionPlan(...)` as the boundary between
+profile planning and collector execution. `redfishprofile` now resolves:
+- baseline seeds
+- baseline critical roots
+- scoped path expansions
+- explicit profile seed/critical/plan-B paths
+
+The collector core consumes the resolved plan and executes it.
+**Consequences:**
+- Acquisition planning logic is now testable in `redfishprofile` without going through the live
+  collector.
+- `redfish.go` no longer owns path-resolution helpers for seeds/critical planning.
+- This creates a clean next step toward true per-profile acquisition hooks beyond static policy
+  fragments.
+
+## ADL-025 — Post-discovery acquisition refinement belongs to profile hooks
+
+**Date:** 2026-03-18
+**Context:** Some acquisition behavior depends not only on vendor/model hints, but on what the
+lightweight Redfish discovery actually returned. Static absolute path lists in profile plans are
+too rigid for such cases and reintroduce guessed platform knowledge.
+**Decision:** Add a post-discovery acquisition refinement hook to Redfish profiles. Profiles may
+mutate the resolved execution plan after discovered `Systems/*`, `Chassis/*`, and `Managers/*`
+are known.
+
+First concrete use:
+- MSI now derives GPU chassis seeds and `.../Sensors` critical/plan-B paths from discovered
+  `Chassis/GPU*` resources instead of hardcoded `GPU1..GPU4` absolute paths in the static plan.
+Additional use:
+- Supermicro now derives `UpdateService/Oem/Supermicro/FirmwareInventory` critical/plan-B paths
+  from resource hints instead of carrying that absolute path in the static plan.
+Additional use:
+- Dell now derives `Managers/iDRAC.Embedded.*` acquisition paths from discovered manager
+  resources instead of carrying `Managers/iDRAC.Embedded.1` as a static absolute path.
+**Consequences:**
+- Profile modules can react to actual discovery results without pushing conditional logic back
+  into `redfish.go`.
+- Diagnostics still show the final refined plan because the collector stores the refined plan,
+  not only the pre-refinement template.
+- Future vendor-specific discovery-dependent acquisition behavior should be implemented through
+  this hook rather than new collector-core branches.
+
+## ADL-026 — Replay analysis uses a resolved profile plan, not ad-hoc directives only
+
+**Date:** 2026-03-18
+**Context:** Replay still relied on a flat `AnalysisDirectives` struct assembled centrally,
+while vendor-specific conditions often depended on the actual snapshot shape. That made analysis
+behavior harder to explain and kept too much vendor logic in generic replay collectors.
+**Decision:** Introduce `redfishprofile.ResolveAnalysisPlan(...)` for replay. The resolved
+analysis plan contains:
+- active match result
+- resolved analysis directives
+- analysis notes explaining snapshot-aware hook activation
+
+Profiles may refine this plan using the snapshot and discovered resources before replay collectors
+run.
+
+First concrete uses:
+- MSI enables processor-GPU fallback and MSI chassis lookup only when the snapshot actually
+  contains GPU processors and `Chassis/GPU*`
+- HGX enables processor-GPU alias fallback from actual HGX/GPU_SXM topology signals in the snapshot
+- Supermicro enables NVMe backplane and known-controller recovery from actual snapshot paths
+**Consequences:**
+- Replay behavior is now closer to the acquisition architecture: a resolved profile plan feeds the
+  executor.
+- `redfish_analysis_plan` is stored in raw payload metadata for offline debugging.
+- Future analysis-side vendor logic should move into profile refinement hooks instead of growing the
+  central directive builder.
+
+## ADL-027 — Replay GPU/storage executors consume resolved analysis plans
+
+**Date:** 2026-03-18
+**Context:** Even after introducing `ResolveAnalysisPlan(...)`, replay GPU/storage collectors still
+accepted a raw `AnalysisDirectives` struct. That preserved an implicit shortcut from the old design
+and weakened the plan/executor boundary.
+**Decision:** Replay GPU/storage executors now accept `redfishprofile.ResolvedAnalysisPlan`
+directly. The executor reads resolved directives from the plan instead of being passed a standalone
+directive bundle.
+**Consequences:**
+- GPU and storage replay execution now follows the same architectural pattern as acquisition:
+  resolve plan first, execute second.
+- Future profile-owned execution helpers can use plan notes or additional resolved fields without
+  changing the executor API again.
+- Remaining replay areas should migrate the same way instead of continuing to accept raw directive
+  structs.
+
 ## ADL-019 — isDeviceBoundFirmwareName must cover vendor-specific naming patterns per vendor

 **Date:** 2026-03-12
@@ -604,3 +786,39 @@ presentation drift and duplicated UI logic.
 - The host UI becomes a service shell around the viewer instead of maintaining its own
   field-by-field tabs.
 - `internal/chart` must be updated explicitly as a git submodule when the viewer changes.
+
+---
+
+## ADL-031 — Redfish uses profile-driven acquisition and unified ingest entrypoints
+
+**Date:** 2026-03-17
+**Context:**
+Redfish collection had accumulated platform-specific probing in the shared collector path, while
+upload and raw-export replay still entered analysis through direct handler branches. This made
+vendor/model tuning harder to contain and increased regression risk when one topology needed a
+special acquisition strategy.
+
+**Decision:**
+- Introduce `internal/ingest.Service` as the internal source-family entrypoint for archive parsing
+  and Redfish raw replay.
+- Introduce `internal/collector/redfishprofile/` for Redfish profile matching and modular hooks.
+- Split Redfish behavior into coordinated phases:
+  - acquisition planning during live collection
+  - analysis hooks during snapshot replay
+- Use score-based profile matching. If confidence is low, enter fallback acquisition mode and
+  aggregate only safe additive profile probes.
+- Allow profile modules to provide bounded acquisition tuning hints such as crawl cap, prefetch
+  behavior, and expensive post-probe toggles.
+- Allow profile modules to own model-specific `CriticalPaths` and bounded `PlanBPaths` so vendor
+  recovery targets stop leaking into the collector core.
+- Expose Redfish profile matching as structured diagnostics during live collection: logs must
+  contain all module scores, and collect job status must expose active modules for the UI.
+
+**Consequences:**
+- Server handlers stop owning parser-vs-replay branching details directly.
+- Vendor/model-specific Redfish logic gets an explicit module boundary.
+- Unknown-vendor Redfish collection becomes slower but more complete by design.
+- Tactical Redfish fixes should move into profile modules instead of widening generic replay logic.
+- Repo-owned compact fixtures under `internal/collector/redfishprofile/testdata/`, derived from
+  representative raw-export snapshots, are used to lock profile matching and acquisition tuning
+  for known MSI and Supermicro-family shapes.
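The prefetch tuning mentioned in ADL-031, and specified in ADL-022 as `IncludeSuffixes` plus `ExcludeContains`, can be pictured with a small eligibility check. Only those two field names come from the decision log; the `prefetchPolicy` type, the `eligible` method, the exclude-wins precedence, and the example paths are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// prefetchPolicy is a hypothetical container for the two rule lists named in ADL-022.
type prefetchPolicy struct {
	IncludeSuffixes []string // critical paths ending in one of these are eligible
	ExcludeContains []string // paths containing one of these are never prefetched
}

// eligible reports whether a critical path may participate in adaptive prefetch.
// Exclusions are checked first, so an excluded shape wins over a matching suffix
// (assumed precedence).
func (p prefetchPolicy) eligible(path string) bool {
	for _, frag := range p.ExcludeContains {
		if frag != "" && strings.Contains(path, frag) {
			return false
		}
	}
	for _, suffix := range p.IncludeSuffixes {
		if suffix != "" && strings.HasSuffix(path, suffix) {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative rule values, not taken from any real profile.
	policy := prefetchPolicy{
		IncludeSuffixes: []string{"/Storage", "/Processors"},
		ExcludeContains: []string{"/LogServices/"},
	}
	fmt.Println(policy.eligible("/redfish/v1/Systems/1/Storage"))
	fmt.Println(policy.eligible("/redfish/v1/Systems/1/LogServices/Log1/Storage"))
}
```

With a shape like this the collector core only calls the check, while each profile supplies the lists, which matches the ownership split the ADLs describe.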
File diff suppressed because it is too large
@@ -8,6 +8,7 @@ import (
 	"strings"
 	"time"

+	"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
 	"git.mchus.pro/mchus/logpile/internal/models"
 )

@@ -30,7 +31,8 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
 	if emit != nil {
 		emit(Progress{Status: "running", Progress: 10, Message: "Redfish snapshot: replay service root..."})
 	}
-	if _, err := r.getJSON("/redfish/v1"); err != nil {
+	serviceRootDoc, err := r.getJSON("/redfish/v1")
+	if err != nil {
 		log.Printf("redfish replay: service root /redfish/v1 missing from snapshot, continuing with defaults: %v", err)
 	}

@@ -49,6 +51,7 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
 		return nil, fmt.Errorf("system info: %w", err)
 	}
 	chassisDoc, _ := r.getJSON(primaryChassis)
+	managerDoc, _ := r.getJSON(primaryManager)
 	biosDoc, _ := r.getJSON(joinPath(primarySystem, "/Bios"))
 	secureBootDoc, _ := r.getJSON(joinPath(primarySystem, "/SecureBoot"))
 	systemFRUDoc, _ := r.getJSON(joinPath(primarySystem, "/Oem/Public/FRU"))
@@ -58,22 +61,32 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
 		fruDoc = chassisFRUDoc
 	}
 	boardFallbackDocs := r.collectBoardFallbackDocs(systemPaths, chassisPaths)
-
+	resourceHints := append(append([]string{}, systemPaths...), append(chassisPaths, managerPaths...)...)
+	profileSignals := redfishprofile.CollectSignals(serviceRootDoc, systemDoc, chassisDoc, managerDoc, resourceHints)
+	profileMatch := redfishprofile.MatchProfiles(profileSignals)
+	analysisPlan := redfishprofile.ResolveAnalysisPlan(profileMatch, tree, redfishprofile.DiscoveredResources{
+		SystemPaths: systemPaths,
+		ChassisPaths: chassisPaths,
+		ManagerPaths: managerPaths,
+	}, profileSignals)
 	if emit != nil {
 		emit(Progress{Status: "running", Progress: 55, Message: "Redfish snapshot: replay CPU/RAM/Storage..."})
 	}
 	processors := r.collectProcessors(primarySystem)
 	memory := r.collectMemory(primarySystem)
-	storageDevices := r.collectStorage(primarySystem)
-	storageVolumes := r.collectStorageVolumes(primarySystem)
+	storageDevices := r.collectStorage(primarySystem, analysisPlan)
+	storageVolumes := r.collectStorageVolumes(primarySystem, analysisPlan)

 	if emit != nil {
 		emit(Progress{Status: "running", Progress: 80, Message: "Redfish snapshot: replay network/BMC..."})
 	}
 	psus := r.collectPSUs(chassisPaths)
 	pcieDevices := r.collectPCIeDevices(systemPaths, chassisPaths)
-	gpus := r.collectGPUs(systemPaths, chassisPaths)
-	gpus = r.collectGPUsFromProcessors(systemPaths, chassisPaths, gpus)
+	boardInfo := parseBoardInfoWithFallback(systemDoc, chassisDoc, fruDoc)
+	applyBoardInfoFallbackFromDocs(&boardInfo, boardFallbackDocs)
+
+	gpus := r.collectGPUs(systemPaths, chassisPaths, analysisPlan)
+	gpus = r.collectGPUsFromProcessors(systemPaths, chassisPaths, gpus, analysisPlan)
 	nics := r.collectNICs(chassisPaths)
 	r.enrichNICsFromNetworkInterfaces(&nics, systemPaths)
 	thresholdSensors := r.collectThresholdSensors(chassisPaths)
@@ -82,12 +95,9 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
 	discreteEvents := r.collectDiscreteSensorEvents(chassisPaths)
 	healthEvents := r.collectHealthSummaryEvents(chassisPaths)
 	driveFetchWarningEvents := buildDriveFetchWarningEvents(rawPayloads)
-	managerDoc, _ := r.getJSON(primaryManager)
 	networkProtocolDoc, _ := r.getJSON(joinPath(primaryManager, "/NetworkProtocol"))
 	firmware := parseFirmware(systemDoc, biosDoc, managerDoc, secureBootDoc, networkProtocolDoc)
 	firmware = dedupeFirmwareInfo(append(firmware, r.collectFirmwareInventory()...))
-	boardInfo := parseBoardInfoWithFallback(systemDoc, chassisDoc, fruDoc)
-	applyBoardInfoFallbackFromDocs(&boardInfo, boardFallbackDocs)
 	boardInfo.BMCMACAddress = r.collectBMCMAC(managerPaths)
 	assemblyFRU := r.collectAssemblyFRU(chassisPaths)
 	collectedAt, sourceTimezone := inferRedfishCollectionTime(managerDoc, rawPayloads)
@@ -112,10 +122,36 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
 			Firmware: firmware,
 		},
 	}
+	match := profileMatch
+	for _, profile := range match.Profiles {
+		profile.PostAnalyze(result, tree, profileSignals)
+	}
+	if result.RawPayloads == nil {
+		result.RawPayloads = map[string]any{}
+	}
+	appliedProfiles := make([]string, 0, len(match.Profiles))
+	for _, profile := range match.Profiles {
+		appliedProfiles = append(appliedProfiles, profile.Name())
+	}
+	result.RawPayloads["redfish_analysis_profiles"] = map[string]any{
+		"mode": match.Mode,
+		"profiles": appliedProfiles,
+	}
+	result.RawPayloads["redfish_analysis_plan"] = map[string]any{
+		"mode": analysisPlan.Match.Mode,
+		"profiles": appliedProfiles,
+		"notes": analysisPlan.Notes,
+		"directives": map[string]any{
+			"processor_gpu_fallback": analysisPlan.Directives.EnableProcessorGPUFallback,
+			"supermicro_nvme_backplane": analysisPlan.Directives.EnableSupermicroNVMeBackplane,
+			"processor_gpu_chassis_alias": analysisPlan.Directives.EnableProcessorGPUChassisAlias,
+			"generic_graphics_controller_dedup": analysisPlan.Directives.EnableGenericGraphicsControllerDedup,
+			"msi_processor_gpu_chassis_lookup": analysisPlan.Directives.EnableMSIProcessorGPUChassisLookup,
+			"storage_enclosure_recovery": analysisPlan.Directives.EnableStorageEnclosureRecovery,
+			"known_storage_controller_recovery": analysisPlan.Directives.EnableKnownStorageControllerRecovery,
+		},
+	}
 	if strings.TrimSpace(sourceTimezone) != "" {
-		if result.RawPayloads == nil {
-			result.RawPayloads = map[string]any{}
-		}
 		result.RawPayloads["source_timezone"] = sourceTimezone
 	}
 	appendMissingServerModelWarning(result, systemDoc, joinPath(primarySystem, "/Oem/Public/FRU"), joinPath(primaryChassis, "/Oem/Public/FRU"))
@@ -667,57 +703,6 @@ func (r redfishSnapshotReader) collectHealthSummaryEvents(chassisPaths []string)
 	return out
 }

-func (r redfishSnapshotReader) enrichNICsFromNetworkInterfaces(nics *[]models.NetworkAdapter, systemPaths []string) {
-	if nics == nil {
-		return
-	}
-	bySlot := make(map[string]int, len(*nics))
-	for i, nic := range *nics {
-		bySlot[strings.ToLower(strings.TrimSpace(nic.Slot))] = i
-	}
-
-	for _, systemPath := range systemPaths {
-		ifaces, err := r.getCollectionMembers(joinPath(systemPath, "/NetworkInterfaces"))
-		if err != nil || len(ifaces) == 0 {
-			continue
-		}
-		for _, iface := range ifaces {
-			slot := firstNonEmpty(asString(iface["Id"]), asString(iface["Name"]))
-			if strings.TrimSpace(slot) == "" {
-				continue
-			}
-			idx, ok := bySlot[strings.ToLower(strings.TrimSpace(slot))]
-			if !ok {
-				*nics = append(*nics, models.NetworkAdapter{
-					Slot: slot,
-					Present: true,
-					Model: firstNonEmpty(asString(iface["Model"]), asString(iface["Name"])),
-					Status: mapStatus(iface["Status"]),
-				})
-				idx = len(*nics) - 1
-				bySlot[strings.ToLower(strings.TrimSpace(slot))] = idx
-			}
-
-			portsPath := redfishLinkedPath(iface, "NetworkPorts")
-			if portsPath == "" {
-				continue
-			}
-			portDocs, err := r.getCollectionMembers(portsPath)
-			if err != nil || len(portDocs) == 0 {
-				continue
-			}
-			macs := append([]string{}, (*nics)[idx].MACAddresses...)
-			for _, p := range portDocs {
-				macs = append(macs, collectNetworkPortMACs(p)...)
-			}
-			(*nics)[idx].MACAddresses = dedupeStrings(macs)
-			if sanitizeNetworkPortCount((*nics)[idx].PortCount) == 0 {
-				(*nics)[idx].PortCount = len(portDocs)
-			}
-		}
-	}
-}
-
 func collectNetworkPortMACs(doc map[string]interface{}) []string {
 	if len(doc) == 0 {
 		return nil
@@ -756,79 +741,6 @@ func dedupeStrings(items []string) []string {
	return out
}

func (r redfishSnapshotReader) collectBoardFallbackDocs(systemPaths, chassisPaths []string) []map[string]interface{} {
	out := make([]map[string]interface{}, 0)
	for _, chassisPath := range chassisPaths {
		for _, suffix := range []string{"/Boards", "/Backplanes"} {
			path := joinPath(chassisPath, suffix)
			if docs, err := r.getCollectionMembers(path); err == nil && len(docs) > 0 {
				out = append(out, docs...)
				continue
			}
			if doc, err := r.getJSON(path); err == nil && len(doc) > 0 {
				out = append(out, doc)
			}
		}
	}
	for _, path := range append(append([]string{}, systemPaths...), chassisPaths...) {
		for _, suffix := range []string{"/Oem/Public", "/Oem/Public/ThermalConfig", "/ThermalConfig"} {
			docPath := joinPath(path, suffix)
			if doc, err := r.getJSON(docPath); err == nil && len(doc) > 0 {
				out = append(out, doc)
			}
		}
	}
	return out
}

func applyBoardInfoFallbackFromDocs(board *models.BoardInfo, docs []map[string]interface{}) {
	if board == nil || len(docs) == 0 {
		return
	}
	for _, doc := range docs {
		candidate := parseBoardInfoFromFRUDoc(doc)
		if !isLikelyServerProductName(candidate.ProductName) {
			continue
		}
		if board.Manufacturer == "" {
			board.Manufacturer = candidate.Manufacturer
		}
		if board.ProductName == "" {
			board.ProductName = candidate.ProductName
		}
		if board.SerialNumber == "" {
			board.SerialNumber = candidate.SerialNumber
		}
		if board.PartNumber == "" {
			board.PartNumber = candidate.PartNumber
		}
		if board.Manufacturer != "" && board.ProductName != "" && board.SerialNumber != "" && board.PartNumber != "" {
			return
		}
	}
}

func isLikelyServerProductName(v string) bool {
	v = strings.TrimSpace(v)
	if v == "" {
		return false
	}
	n := strings.ToUpper(v)
	if strings.Contains(n, "NULL") {
		return false
	}
	componentTokens := []string{
		"DIMM", "DDR", "NVME", "SSD", "HDD", "GPU", "NIC", "RAID",
		"PSU", "FAN", "BACKPLANE", "FRU",
	}
	for _, token := range componentTokens {
		if strings.Contains(n, strings.ToUpper(token)) {
			return false
		}
	}
	return true
}

type redfishSnapshotReader struct {
	tree map[string]interface{}
}
@@ -1063,222 +975,6 @@ func (r redfishSnapshotReader) collectMemory(systemPath string) []models.MemoryD
	return out
}

func (r redfishSnapshotReader) collectStorage(systemPath string) []models.Storage {
	var out []models.Storage
	storageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/Storage"))
	for _, member := range storageMembers {
		if driveCollection, ok := member["Drives"].(map[string]interface{}); ok {
			if driveCollectionPath := asString(driveCollection["@odata.id"]); driveCollectionPath != "" {
				driveDocs, err := r.getCollectionMembers(driveCollectionPath)
				if err == nil {
					for _, driveDoc := range driveDocs {
						if !isVirtualStorageDrive(driveDoc) {
							supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
							out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
						}
					}
					if len(driveDocs) == 0 {
						for _, driveDoc := range r.probeDirectDiskBayChildren(driveCollectionPath) {
							supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
							out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
						}
					}
				}
				continue
			}
		}
		if drives, ok := member["Drives"].([]interface{}); ok {
			for _, driveAny := range drives {
				driveRef, ok := driveAny.(map[string]interface{})
				if !ok {
					continue
				}
				odata := asString(driveRef["@odata.id"])
				if odata == "" {
					continue
				}
				driveDoc, err := r.getJSON(odata)
				if err != nil {
					continue
				}
				if !isVirtualStorageDrive(driveDoc) {
					supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
					out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
				}
			}
			continue
		}
		if looksLikeDrive(member) {
			if isVirtualStorageDrive(member) {
				continue
			}
			supplementalDocs := r.getLinkedSupplementalDocs(member, "DriveMetrics", "EnvironmentMetrics", "Metrics")
			out = append(out, parseDriveWithSupplementalDocs(member, supplementalDocs...))
		}

		for _, enclosurePath := range redfishLinkRefs(member, "Links", "Enclosures") {
			driveDocs, err := r.getCollectionMembers(joinPath(enclosurePath, "/Drives"))
			if err == nil {
				for _, driveDoc := range driveDocs {
					if looksLikeDrive(driveDoc) && !isVirtualStorageDrive(driveDoc) {
						supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
						out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
					}
				}
				if len(driveDocs) == 0 {
					for _, driveDoc := range r.probeDirectDiskBayChildren(joinPath(enclosurePath, "/Drives")) {
						if isVirtualStorageDrive(driveDoc) {
							continue
						}
						out = append(out, parseDrive(driveDoc))
					}
				}
			}
		}
	}

	for _, driveDoc := range r.collectKnownStorageMembers(systemPath, []string{
		"/Storage/IntelVROC/Drives",
		"/Storage/IntelVROC/Controllers/1/Drives",
	}) {
		if looksLikeDrive(driveDoc) && !isVirtualStorageDrive(driveDoc) {
			supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
			out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
		}
	}

	simpleStorageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/SimpleStorage"))
	for _, member := range simpleStorageMembers {
		devices, ok := member["Devices"].([]interface{})
		if !ok {
			continue
		}
		for _, devAny := range devices {
			devDoc, ok := devAny.(map[string]interface{})
			if !ok || !looksLikeDrive(devDoc) || isVirtualStorageDrive(devDoc) {
				continue
			}
			out = append(out, parseDrive(devDoc))
		}
	}

	chassisPaths := r.discoverMemberPaths("/redfish/v1/Chassis", "/redfish/v1/Chassis/1")
	for _, chassisPath := range chassisPaths {
		driveDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/Drives"))
		if err != nil {
			continue
		}
		for _, driveDoc := range driveDocs {
			if !looksLikeDrive(driveDoc) || isVirtualStorageDrive(driveDoc) {
				continue
			}
			out = append(out, parseDrive(driveDoc))
		}
	}
	for _, chassisPath := range chassisPaths {
		if !isSupermicroNVMeBackplanePath(chassisPath) {
			continue
		}
		for _, driveDoc := range r.probeSupermicroNVMeDiskBays(chassisPath) {
			if !looksLikeDrive(driveDoc) || isVirtualStorageDrive(driveDoc) {
				continue
			}
			out = append(out, parseDrive(driveDoc))
		}
	}
	return dedupeStorage(out)
}

func (r redfishSnapshotReader) collectStorageVolumes(systemPath string) []models.StorageVolume {
	var out []models.StorageVolume
	storageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/Storage"))
	for _, member := range storageMembers {
		controller := firstNonEmpty(asString(member["Id"]), asString(member["Name"]))
		volumeCollectionPath := redfishLinkedPath(member, "Volumes")
		if volumeCollectionPath == "" {
			continue
		}
		volumeDocs, err := r.getCollectionMembers(volumeCollectionPath)
		if err != nil {
			continue
		}
		for _, volDoc := range volumeDocs {
			if looksLikeVolume(volDoc) {
				out = append(out, parseStorageVolume(volDoc, controller))
			}
		}
	}
	for _, volDoc := range r.collectKnownStorageMembers(systemPath, []string{
		"/Storage/IntelVROC/Volumes",
		"/Storage/HA-RAID/Volumes",
		"/Storage/MRVL.HA-RAID/Volumes",
	}) {
		if looksLikeVolume(volDoc) {
			out = append(out, parseStorageVolume(volDoc, storageControllerFromPath(asString(volDoc["@odata.id"]))))
		}
	}
	return dedupeStorageVolumes(out)
}

func (r redfishSnapshotReader) collectKnownStorageMembers(systemPath string, relativeCollections []string) []map[string]interface{} {
	var out []map[string]interface{}
	for _, rel := range relativeCollections {
		docs, err := r.getCollectionMembers(joinPath(systemPath, rel))
		if err != nil || len(docs) == 0 {
			continue
		}
		out = append(out, docs...)
	}
	return out
}

func (r redfishSnapshotReader) probeSupermicroNVMeDiskBays(backplanePath string) []map[string]interface{} {
	return r.probeDirectDiskBayChildren(joinPath(backplanePath, "/Drives"))
}

func (r redfishSnapshotReader) probeDirectDiskBayChildren(drivesCollectionPath string) []map[string]interface{} {
	var out []map[string]interface{}
	for _, path := range directDiskBayCandidates(drivesCollectionPath) {
		doc, err := r.getJSON(path)
		if err != nil || !looksLikeDrive(doc) {
			continue
		}
		out = append(out, doc)
	}
	return out
}

func (r redfishSnapshotReader) collectNICs(chassisPaths []string) []models.NetworkAdapter {
	var nics []models.NetworkAdapter
	for _, chassisPath := range chassisPaths {
		adapterDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/NetworkAdapters"))
		if err != nil {
			continue
		}
		for _, doc := range adapterDocs {
			nic := parseNIC(doc)
			for _, pciePath := range networkAdapterPCIeDevicePaths(doc) {
				pcieDoc, err := r.getJSON(pciePath)
				if err != nil {
					continue
				}
				functionDocs := r.getLinkedPCIeFunctions(pcieDoc)
				supplementalDocs := r.getLinkedSupplementalDocs(pcieDoc, "EnvironmentMetrics", "Metrics")
				for _, fn := range functionDocs {
					supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
				}
				enrichNICFromPCIe(&nic, pcieDoc, functionDocs, supplementalDocs)
			}
			// Collect MACs from NetworkDeviceFunctions when not found via PCIe path.
			if len(nic.MACAddresses) == 0 {
				r.enrichNICMACsFromNetworkDeviceFunctions(&nic, doc)
			}
			nics = append(nics, nic)
		}
	}
	return dedupeNetworkAdapters(nics)
}

func (r redfishSnapshotReader) collectPSUs(chassisPaths []string) []models.PSU {
	var out []models.PSU
	seen := make(map[string]int)
@@ -1307,363 +1003,9 @@ func (r redfishSnapshotReader) collectPSUs(chassisPaths []string) []models.PSU {
	return out
}

func (r redfishSnapshotReader) collectGPUs(systemPaths, chassisPaths []string) []models.GPU {
	collections := make([]string, 0, len(systemPaths)*3+len(chassisPaths)*2)
	for _, systemPath := range systemPaths {
		collections = append(collections, joinPath(systemPath, "/PCIeDevices"))
		collections = append(collections, joinPath(systemPath, "/Accelerators"))
		collections = append(collections, joinPath(systemPath, "/GraphicsControllers"))
	}
	for _, chassisPath := range chassisPaths {
		collections = append(collections, joinPath(chassisPath, "/PCIeDevices"))
		collections = append(collections, joinPath(chassisPath, "/Accelerators"))
	}
	var out []models.GPU
	seen := make(map[string]struct{})
	idx := 1
	for _, collectionPath := range collections {
		memberDocs, err := r.getCollectionMembers(collectionPath)
		if err != nil || len(memberDocs) == 0 {
			continue
		}
		for _, doc := range memberDocs {
			functionDocs := r.getLinkedPCIeFunctions(doc)
			if !looksLikeGPU(doc, functionDocs) {
				continue
			}
			supplementalDocs := r.getLinkedSupplementalDocs(doc, "EnvironmentMetrics", "Metrics")
			for _, fn := range functionDocs {
				supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
			}
			gpu := parseGPUWithSupplementalDocs(doc, functionDocs, supplementalDocs, idx)
			idx++
			if shouldSkipGenericGPUDuplicate(out, gpu) {
				continue
			}
			key := gpuDocDedupKey(doc, gpu)
			if key == "" {
				continue
			}
			if _, ok := seen[key]; ok {
				continue
			}
			seen[key] = struct{}{}
			out = append(out, gpu)
		}
	}
	return dropModelOnlyGPUPlaceholders(out)
}

func (r redfishSnapshotReader) collectPCIeDevices(systemPaths, chassisPaths []string) []models.PCIeDevice {
	collections := make([]string, 0, len(systemPaths)+len(chassisPaths))
	for _, systemPath := range systemPaths {
		collections = append(collections, joinPath(systemPath, "/PCIeDevices"))
	}
	for _, chassisPath := range chassisPaths {
		collections = append(collections, joinPath(chassisPath, "/PCIeDevices"))
	}
	var out []models.PCIeDevice
	for _, collectionPath := range collections {
		memberDocs, err := r.getCollectionMembers(collectionPath)
		if err != nil || len(memberDocs) == 0 {
			continue
		}
		for _, doc := range memberDocs {
			functionDocs := r.getLinkedPCIeFunctions(doc)
			if looksLikeGPU(doc, functionDocs) {
				continue
			}
			supplementalDocs := r.getLinkedSupplementalDocs(doc, "EnvironmentMetrics", "Metrics")
			supplementalDocs = append(supplementalDocs, r.getChassisScopedPCIeSupplementalDocs(doc)...)
			for _, fn := range functionDocs {
				supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
			}
			dev := parsePCIeDeviceWithSupplementalDocs(doc, functionDocs, supplementalDocs)
			if isUnidentifiablePCIeDevice(dev) {
				continue
			}
			out = append(out, dev)
		}
	}
	for _, systemPath := range systemPaths {
		functionDocs, err := r.getCollectionMembers(joinPath(systemPath, "/PCIeFunctions"))
		if err != nil || len(functionDocs) == 0 {
			continue
		}
		for idx, fn := range functionDocs {
			supplementalDocs := r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")
			dev := parsePCIeFunctionWithSupplementalDocs(fn, supplementalDocs, idx+1)
			out = append(out, dev)
		}
	}
	return dedupePCIeDevices(out)
}

func (r redfishSnapshotReader) getChassisScopedPCIeSupplementalDocs(doc map[string]interface{}) []map[string]interface{} {
	if !looksLikeNVSwitchPCIeDoc(doc) {
		return nil
	}
	docPath := normalizeRedfishPath(asString(doc["@odata.id"]))
	chassisPath := chassisPathForPCIeDoc(docPath)
	if chassisPath == "" {
		return nil
	}
	out := make([]map[string]interface{}, 0, 4)
	for _, path := range []string{
		joinPath(chassisPath, "/EnvironmentMetrics"),
		joinPath(chassisPath, "/ThermalSubsystem/ThermalMetrics"),
	} {
		supplementalDoc, err := r.getJSON(path)
		if err != nil || len(supplementalDoc) == 0 {
			continue
		}
		out = append(out, supplementalDoc)
	}
	return out
}

func stringsTrimTrailingSlash(s string) string {
	for len(s) > 1 && s[len(s)-1] == '/' {
		s = s[:len(s)-1]
	}
	return s
}

// collectBMCMAC returns the MAC address of the first active BMC management
// interface found in Managers/*/EthernetInterfaces. Returns empty string if
// no MAC is available.
func (r redfishSnapshotReader) collectBMCMAC(managerPaths []string) string {
	for _, managerPath := range managerPaths {
		members, err := r.getCollectionMembers(joinPath(managerPath, "/EthernetInterfaces"))
		if err != nil || len(members) == 0 {
			continue
		}
		for _, doc := range members {
			mac := strings.TrimSpace(firstNonEmpty(
				asString(doc["PermanentMACAddress"]),
				asString(doc["MACAddress"]),
			))
			if mac == "" || strings.EqualFold(mac, "00:00:00:00:00:00") {
				continue
			}
			return strings.ToUpper(mac)
		}
	}
	return ""
}

// collectAssemblyFRU reads Chassis/*/Assembly documents and returns FRU entries
// for subcomponents (backplanes, PSUs, DIMMs, etc.) that carry meaningful
// serial or part numbers. Entries already present in dedicated collections
// (PSUs, DIMMs) are included here as well so that all FRU data is available
// in one place; deduplication by serial is performed.
func (r redfishSnapshotReader) collectAssemblyFRU(chassisPaths []string) []models.FRUInfo {
	seen := make(map[string]struct{})
	var out []models.FRUInfo

	add := func(fru models.FRUInfo) {
		key := strings.ToUpper(strings.TrimSpace(fru.SerialNumber))
		if key == "" {
			key = strings.ToUpper(strings.TrimSpace(fru.Description + "|" + fru.PartNumber))
		}
		if key == "" || key == "|" {
			return
		}
		if _, ok := seen[key]; ok {
			return
		}
		seen[key] = struct{}{}
		out = append(out, fru)
	}

	for _, chassisPath := range chassisPaths {
		doc, err := r.getJSON(joinPath(chassisPath, "/Assembly"))
		if err != nil || len(doc) == 0 {
			continue
		}
		assemblies, _ := doc["Assemblies"].([]interface{})
		for _, aAny := range assemblies {
			a, ok := aAny.(map[string]interface{})
			if !ok {
				continue
			}
			name := strings.TrimSpace(firstNonEmpty(asString(a["Name"]), asString(a["Description"])))
			model := strings.TrimSpace(asString(a["Model"]))
			partNumber := strings.TrimSpace(asString(a["PartNumber"]))
			serial := extractAssemblySerial(a)

			if serial == "" && partNumber == "" {
				continue
			}
			add(models.FRUInfo{
				Description:  name,
				ProductName:  model,
				SerialNumber: serial,
				PartNumber:   partNumber,
			})
		}
	}
	return out
}

// extractAssemblySerial tries to find a serial number in an Assembly entry.
// Standard Redfish Assembly has no top-level SerialNumber; vendors put it in Oem.
func extractAssemblySerial(a map[string]interface{}) string {
	// Some implementations expose it at top level.
	if s := strings.TrimSpace(asString(a["SerialNumber"])); s != "" {
		return s
	}
	// Dig into Oem for vendor-specific structures (e.g. Huawei COMMONb).
	oem, _ := a["Oem"].(map[string]interface{})
	for _, v := range oem {
		subtree, ok := v.(map[string]interface{})
		if !ok {
			continue
		}
		for _, v2 := range subtree {
			node, ok := v2.(map[string]interface{})
			if !ok {
				continue
			}
			if s := strings.TrimSpace(asString(node["SerialNumber"])); s != "" {
				return s
			}
		}
	}
	return ""
}

// enrichNICMACsFromNetworkDeviceFunctions reads the NetworkDeviceFunctions
// collection linked from a NetworkAdapter document and populates the NIC's
// MACAddresses from each function's Ethernet.PermanentMACAddress / MACAddress.
// Called when PCIe-path enrichment does not produce any MACs.
func (r redfishSnapshotReader) enrichNICMACsFromNetworkDeviceFunctions(nic *models.NetworkAdapter, adapterDoc map[string]interface{}) {
	ndfCol, ok := adapterDoc["NetworkDeviceFunctions"].(map[string]interface{})
	if !ok {
		return
	}
	colPath := asString(ndfCol["@odata.id"])
	if colPath == "" {
		return
	}
	funcDocs, err := r.getCollectionMembers(colPath)
	if err != nil || len(funcDocs) == 0 {
		return
	}
	for _, fn := range funcDocs {
		eth, _ := fn["Ethernet"].(map[string]interface{})
		if eth == nil {
			continue
		}
		mac := strings.TrimSpace(firstNonEmpty(
			asString(eth["PermanentMACAddress"]),
			asString(eth["MACAddress"]),
		))
		if mac == "" {
			continue
		}
		nic.MACAddresses = dedupeStrings(append(nic.MACAddresses, strings.ToUpper(mac)))
	}
	if len(funcDocs) > 0 && nic.PortCount == 0 {
		nic.PortCount = sanitizeNetworkPortCount(len(funcDocs))
	}
}

// collectGPUsFromProcessors finds GPUs that some BMCs (e.g. MSI) expose as
// Processor entries with ProcessorType=GPU rather than as PCIe devices.
// It supplements the existing gpus slice (already found via PCIe path),
// skipping entries already present by UUID or SerialNumber.
// Serial numbers are looked up from Chassis members named after each GPU Id.
func (r redfishSnapshotReader) collectGPUsFromProcessors(systemPaths, chassisPaths []string, existing []models.GPU) []models.GPU {
	// Build a lookup: chassis member ID → chassis doc (for serial numbers).
	chassisByID := make(map[string]map[string]interface{})
	for _, cp := range chassisPaths {
		doc, err := r.getJSON(cp)
		if err != nil || len(doc) == 0 {
			continue
		}
		id := strings.TrimSpace(asString(doc["Id"]))
		if id != "" {
			chassisByID[strings.ToUpper(id)] = doc
		}
	}

	// Build dedup sets from existing GPUs.
	seenUUID := make(map[string]struct{})
	seenSerial := make(map[string]struct{})
	for _, g := range existing {
		if u := strings.ToUpper(strings.TrimSpace(g.UUID)); u != "" {
			seenUUID[u] = struct{}{}
		}
		if s := strings.ToUpper(strings.TrimSpace(g.SerialNumber)); s != "" {
			seenSerial[s] = struct{}{}
		}
	}

	out := append([]models.GPU{}, existing...)
	idx := len(existing) + 1

	for _, systemPath := range systemPaths {
		procDocs, err := r.getCollectionMembers(joinPath(systemPath, "/Processors"))
		if err != nil {
			continue
		}
		for _, doc := range procDocs {
			if !strings.EqualFold(strings.TrimSpace(asString(doc["ProcessorType"])), "GPU") {
				continue
			}

			// Resolve serial: prefer the processor doc itself (e.g. Supermicro
			// HGX_Baseboard_0/Processors/GPU_SXM_N carries SerialNumber directly),
			// then fall back to a matching chassis doc keyed by processor Id
			// (e.g. MSI: Chassis/GPU_SXM_1/SerialNumber).
			gpuID := strings.TrimSpace(asString(doc["Id"]))
			serial := findFirstNormalizedStringByKeys(doc, "SerialNumber")
			if chassisDoc, ok := chassisByID[strings.ToUpper(gpuID)]; ok {
				if cs := strings.TrimSpace(asString(chassisDoc["SerialNumber"])); cs != "" {
					serial = cs
				}
			}

			uuid := strings.TrimSpace(asString(doc["UUID"]))
			uuidKey := strings.ToUpper(uuid)
			serialKey := strings.ToUpper(serial)

			if uuidKey != "" {
				if _, dup := seenUUID[uuidKey]; dup {
					continue
				}
				seenUUID[uuidKey] = struct{}{}
			}
			if serialKey != "" {
				if _, dup := seenSerial[serialKey]; dup {
					continue
				}
				seenSerial[serialKey] = struct{}{}
			}

			slotLabel := firstNonEmpty(
				redfishLocationLabel(doc["Location"]),
				redfishLocationLabel(doc["PhysicalLocation"]),
			)
			if slotLabel == "" && gpuID != "" {
				slotLabel = gpuID
			}
			if slotLabel == "" {
				slotLabel = fmt.Sprintf("GPU%d", idx)
			}

			out = append(out, models.GPU{
				Slot:         slotLabel,
				Model:        firstNonEmpty(asString(doc["Model"]), asString(doc["Name"])),
				Manufacturer: asString(doc["Manufacturer"]),
				PartNumber:   asString(doc["PartNumber"]),
				SerialNumber: serial,
				UUID:         uuid,
				Status:       mapStatus(doc["Status"]),
			})
			idx++
		}
	}
	return out
}
|
||||
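The comment on `collectGPUsFromProcessors` describes skipping processor-derived GPUs that are already known by UUID or serial number. The following is a minimal standalone sketch of that rule, not the project's code: the `gpu` struct, `normKey`, and `mergeGPUs` are hypothetical names introduced only for illustration, standing in for `models.GPU` and the inline logic above.

```go
package main

import (
	"fmt"
	"strings"
)

// gpu is a hypothetical stand-in for models.GPU, reduced to the dedup fields.
type gpu struct {
	Slot, UUID, SerialNumber string
}

// normKey upper-cases and trims an identifier so that comparisons ignore
// case and surrounding whitespace.
func normKey(s string) string { return strings.ToUpper(strings.TrimSpace(s)) }

// mergeGPUs appends candidates to existing, skipping any candidate whose
// UUID or serial number has already been seen.
func mergeGPUs(existing, candidates []gpu) []gpu {
	seenUUID := make(map[string]struct{})
	seenSerial := make(map[string]struct{})
	for _, g := range existing {
		if k := normKey(g.UUID); k != "" {
			seenUUID[k] = struct{}{}
		}
		if k := normKey(g.SerialNumber); k != "" {
			seenSerial[k] = struct{}{}
		}
	}
	out := append([]gpu{}, existing...)
	for _, c := range candidates {
		if k := normKey(c.UUID); k != "" {
			if _, dup := seenUUID[k]; dup {
				continue
			}
			seenUUID[k] = struct{}{}
		}
		if k := normKey(c.SerialNumber); k != "" {
			if _, dup := seenSerial[k]; dup {
				continue
			}
			seenSerial[k] = struct{}{}
		}
		out = append(out, c)
	}
	return out
}

func main() {
	existing := []gpu{{Slot: "GPU1", SerialNumber: "abc"}}
	candidates := []gpu{
		{Slot: "P1", SerialNumber: " ABC "}, // same serial after normalization
		{Slot: "P2", UUID: "u-1"},
		{Slot: "P3", UUID: "U-1"}, // same UUID after normalization
	}
	for _, g := range mergeGPUs(existing, candidates) {
		fmt.Println(g.Slot)
	}
}
```

As in the function above, a candidate with neither a UUID nor a serial number is always kept, since there is nothing to compare it against.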
159
internal/collector/redfish_replay_fru.go
Normal file
@@ -0,0 +1,159 @@
package collector

import (
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func (r redfishSnapshotReader) collectBoardFallbackDocs(systemPaths, chassisPaths []string) []map[string]interface{} {
	out := make([]map[string]interface{}, 0)
	for _, chassisPath := range chassisPaths {
		for _, suffix := range []string{"/Boards", "/Backplanes"} {
			path := joinPath(chassisPath, suffix)
			if docs, err := r.getCollectionMembers(path); err == nil && len(docs) > 0 {
				out = append(out, docs...)
				continue
			}
			if doc, err := r.getJSON(path); err == nil && len(doc) > 0 {
				out = append(out, doc)
			}
		}
	}
	for _, path := range append(append([]string{}, systemPaths...), chassisPaths...) {
		for _, suffix := range []string{"/Oem/Public", "/Oem/Public/ThermalConfig", "/ThermalConfig"} {
			docPath := joinPath(path, suffix)
			if doc, err := r.getJSON(docPath); err == nil && len(doc) > 0 {
				out = append(out, doc)
			}
		}
	}
	return out
}

func applyBoardInfoFallbackFromDocs(board *models.BoardInfo, docs []map[string]interface{}) {
	if board == nil || len(docs) == 0 {
		return
	}
	for _, doc := range docs {
		candidate := parseBoardInfoFromFRUDoc(doc)
		if !isLikelyServerProductName(candidate.ProductName) {
			continue
		}
		if board.Manufacturer == "" {
			board.Manufacturer = candidate.Manufacturer
		}
		if board.ProductName == "" {
			board.ProductName = candidate.ProductName
		}
		if board.SerialNumber == "" {
			board.SerialNumber = candidate.SerialNumber
		}
		if board.PartNumber == "" {
			board.PartNumber = candidate.PartNumber
		}
		if board.Manufacturer != "" && board.ProductName != "" && board.SerialNumber != "" && board.PartNumber != "" {
			return
		}
	}
}

func isLikelyServerProductName(v string) bool {
	v = strings.TrimSpace(v)
	if v == "" {
		return false
	}
	n := strings.ToUpper(v)
	if strings.Contains(n, "NULL") {
		return false
	}
	componentTokens := []string{
		"DIMM", "DDR", "NVME", "SSD", "HDD", "GPU", "NIC", "RAID",
		"PSU", "FAN", "BACKPLANE", "FRU",
	}
	for _, token := range componentTokens {
		if strings.Contains(n, strings.ToUpper(token)) {
			return false
		}
	}
	return true
}

// collectAssemblyFRU reads Chassis/*/Assembly documents and returns FRU entries
// for subcomponents (backplanes, PSUs, DIMMs, etc.) that carry meaningful
// serial or part numbers. Entries already present in dedicated collections
// (PSUs, DIMMs) are included here as well so that all FRU data is available
// in one place; deduplication by serial is performed.
func (r redfishSnapshotReader) collectAssemblyFRU(chassisPaths []string) []models.FRUInfo {
	seen := make(map[string]struct{})
	var out []models.FRUInfo

	add := func(fru models.FRUInfo) {
		key := strings.ToUpper(strings.TrimSpace(fru.SerialNumber))
		if key == "" {
			key = strings.ToUpper(strings.TrimSpace(fru.Description + "|" + fru.PartNumber))
		}
		if key == "" || key == "|" {
			return
		}
		if _, ok := seen[key]; ok {
			return
		}
		seen[key] = struct{}{}
		out = append(out, fru)
	}

	for _, chassisPath := range chassisPaths {
		doc, err := r.getJSON(joinPath(chassisPath, "/Assembly"))
		if err != nil || len(doc) == 0 {
			continue
		}
		assemblies, _ := doc["Assemblies"].([]interface{})
		for _, aAny := range assemblies {
			a, ok := aAny.(map[string]interface{})
			if !ok {
				continue
			}
			name := strings.TrimSpace(firstNonEmpty(asString(a["Name"]), asString(a["Description"])))
			model := strings.TrimSpace(asString(a["Model"]))
			partNumber := strings.TrimSpace(asString(a["PartNumber"]))
			serial := extractAssemblySerial(a)

			if serial == "" && partNumber == "" {
				continue
			}
			add(models.FRUInfo{
				Description:  name,
				ProductName:  model,
				SerialNumber: serial,
				PartNumber:   partNumber,
			})
		}
	}
	return out
}

// extractAssemblySerial tries to find a serial number in an Assembly entry.
// Standard Redfish Assembly has no top-level SerialNumber; vendors put it in Oem.
func extractAssemblySerial(a map[string]interface{}) string {
	if s := strings.TrimSpace(asString(a["SerialNumber"])); s != "" {
		return s
	}
	oem, _ := a["Oem"].(map[string]interface{})
	for _, v := range oem {
		subtree, ok := v.(map[string]interface{})
		if !ok {
			continue
		}
		for _, v2 := range subtree {
			node, ok := v2.(map[string]interface{})
			if !ok {
				continue
			}
			if s := strings.TrimSpace(asString(node["SerialNumber"])); s != "" {
				return s
			}
		}
	}
	return ""
}
|
||||
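To make the lookup order in `extractAssemblySerial` concrete, here is a small self-contained sketch that runs the same two-level `Oem` walk against a sample document. It is an illustration under assumptions, not the project's code: `str` is a hypothetical stand-in for the package's `asString` helper, `assemblySerial` mirrors the function above, and the `VendorX`/`BoardInfo` payload layout is invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// str returns v as a trimmed string, or "" when v is not a string.
// Hypothetical stand-in for the asString helper used in the collector package.
func str(v interface{}) string {
	s, _ := v.(string)
	return strings.TrimSpace(s)
}

// assemblySerial prefers a top-level SerialNumber and otherwise looks exactly
// two levels below "Oem" (Oem.<vendor>.<node>.SerialNumber).
func assemblySerial(a map[string]interface{}) string {
	if s := str(a["SerialNumber"]); s != "" {
		return s
	}
	oem, _ := a["Oem"].(map[string]interface{})
	for _, v := range oem {
		subtree, ok := v.(map[string]interface{})
		if !ok {
			continue
		}
		for _, v2 := range subtree {
			node, ok := v2.(map[string]interface{})
			if !ok {
				continue
			}
			if s := str(node["SerialNumber"]); s != "" {
				return s
			}
		}
	}
	return ""
}

func main() {
	// Invented sample payload; real vendor Oem layouts differ.
	raw := `{"Name":"Backplane","Oem":{"VendorX":{"BoardInfo":{"SerialNumber":" SN123 "}}}}`
	var a map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &a); err != nil {
		panic(err)
	}
	fmt.Println(assemblySerial(a))
}
```

Because Go map iteration order is unspecified, an entry with several `Oem` nodes that each carry a different `SerialNumber` could yield any one of them; the sketch, like the function it mirrors, only gives a stable answer when at most one candidate exists.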
151
internal/collector/redfish_replay_gpu.go
Normal file
@@ -0,0 +1,151 @@
package collector

import (
	"fmt"
	"strings"

	"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
	"git.mchus.pro/mchus/logpile/internal/models"
)

func (r redfishSnapshotReader) collectGPUs(systemPaths, chassisPaths []string, plan redfishprofile.ResolvedAnalysisPlan) []models.GPU {
	collections := make([]string, 0, len(systemPaths)*3+len(chassisPaths)*2)
	for _, systemPath := range systemPaths {
		collections = append(collections, joinPath(systemPath, "/PCIeDevices"))
		collections = append(collections, joinPath(systemPath, "/Accelerators"))
		collections = append(collections, joinPath(systemPath, "/GraphicsControllers"))
	}
	for _, chassisPath := range chassisPaths {
		collections = append(collections, joinPath(chassisPath, "/PCIeDevices"))
		collections = append(collections, joinPath(chassisPath, "/Accelerators"))
	}
	var out []models.GPU
	seen := make(map[string]struct{})
	idx := 1
	for _, collectionPath := range collections {
		memberDocs, err := r.getCollectionMembers(collectionPath)
		if err != nil || len(memberDocs) == 0 {
			continue
		}
		for _, doc := range memberDocs {
			functionDocs := r.getLinkedPCIeFunctions(doc)
			if !looksLikeGPU(doc, functionDocs) {
				continue
			}
			supplementalDocs := r.getLinkedSupplementalDocs(doc, "EnvironmentMetrics", "Metrics")
			for _, fn := range functionDocs {
				supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
			}
			gpu := parseGPUWithSupplementalDocs(doc, functionDocs, supplementalDocs, idx)
			idx++
			if plan.Directives.EnableGenericGraphicsControllerDedup && shouldSkipGenericGPUDuplicate(out, gpu) {
				continue
			}
			key := gpuDocDedupKey(doc, gpu)
			if key == "" {
				continue
			}
			if _, ok := seen[key]; ok {
				continue
			}
			seen[key] = struct{}{}
			out = append(out, gpu)
		}
	}
	if plan.Directives.EnableGenericGraphicsControllerDedup {
		return dropModelOnlyGPUPlaceholders(out)
	}
	return out
}

// collectGPUsFromProcessors finds GPUs that some BMCs (e.g. MSI) expose as
|
||||
// Processor entries with ProcessorType=GPU rather than as PCIe devices.
|
||||
// It supplements the existing gpus slice (already found via PCIe path),
|
||||
// skipping entries already present by UUID or SerialNumber.
|
||||
// Serial numbers are looked up from Chassis members named after each GPU Id.
|
||||
func (r redfishSnapshotReader) collectGPUsFromProcessors(systemPaths, chassisPaths []string, existing []models.GPU, plan redfishprofile.ResolvedAnalysisPlan) []models.GPU {
|
||||
if !plan.Directives.EnableProcessorGPUFallback {
|
||||
return append([]models.GPU{}, existing...)
|
||||
}
|
||||
chassisByID := make(map[string]map[string]interface{})
|
||||
for _, cp := range chassisPaths {
|
||||
doc, err := r.getJSON(cp)
|
||||
if err != nil || len(doc) == 0 {
|
||||
continue
|
||||
}
|
||||
id := strings.TrimSpace(asString(doc["Id"]))
|
||||
if id != "" {
|
||||
chassisByID[strings.ToUpper(id)] = doc
|
||||
}
|
||||
}
|
||||
|
||||
seenUUID := make(map[string]struct{})
|
||||
seenSerial := make(map[string]struct{})
|
||||
for _, g := range existing {
|
||||
if u := strings.ToUpper(strings.TrimSpace(g.UUID)); u != "" {
|
||||
seenUUID[u] = struct{}{}
|
||||
}
|
||||
if s := strings.ToUpper(strings.TrimSpace(g.SerialNumber)); s != "" {
|
||||
seenSerial[s] = struct{}{}
|
||||
}
|
||||
}
|
||||
|
||||
out := append([]models.GPU{}, existing...)
|
||||
idx := len(existing) + 1
|
||||
for _, systemPath := range systemPaths {
|
||||
procDocs, err := r.getCollectionMembers(joinPath(systemPath, "/Processors"))
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
for _, doc := range procDocs {
|
||||
if !strings.EqualFold(strings.TrimSpace(asString(doc["ProcessorType"])), "GPU") {
|
||||
continue
|
||||
}
|
||||
|
||||
gpuID := strings.TrimSpace(asString(doc["Id"]))
|
||||
serial := findFirstNormalizedStringByKeys(doc, "SerialNumber")
|
||||
if serial == "" {
|
||||
serial = resolveProcessorGPUChassisSerial(chassisByID, gpuID, plan)
|
||||
}
|
||||
|
||||
uuid := strings.TrimSpace(asString(doc["UUID"]))
|
||||
uuidKey := strings.ToUpper(uuid)
|
||||
serialKey := strings.ToUpper(serial)
|
||||
|
||||
if uuidKey != "" {
|
||||
if _, dup := seenUUID[uuidKey]; dup {
|
||||
continue
|
||||
}
|
||||
seenUUID[uuidKey] = struct{}{}
|
||||
}
|
||||
if serialKey != "" {
|
||||
if _, dup := seenSerial[serialKey]; dup {
|
||||
continue
|
||||
}
|
||||
seenSerial[serialKey] = struct{}{}
|
||||
}
|
||||
|
||||
slotLabel := firstNonEmpty(
|
||||
redfishLocationLabel(doc["Location"]),
|
||||
redfishLocationLabel(doc["PhysicalLocation"]),
|
||||
)
|
||||
if slotLabel == "" && gpuID != "" {
|
||||
slotLabel = gpuID
|
||||
}
|
||||
if slotLabel == "" {
|
||||
slotLabel = fmt.Sprintf("GPU%d", idx)
|
||||
}
|
||||
out = append(out, models.GPU{
|
||||
Slot: slotLabel,
|
||||
Model: firstNonEmpty(asString(doc["Model"]), asString(doc["Name"])),
|
||||
Manufacturer: asString(doc["Manufacturer"]),
|
||||
PartNumber: asString(doc["PartNumber"]),
|
||||
SerialNumber: serial,
|
||||
UUID: uuid,
|
||||
Status: mapStatus(doc["Status"]),
|
||||
})
|
||||
idx++
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
215
internal/collector/redfish_replay_inventory.go
Normal file
@@ -0,0 +1,215 @@
|
||||
package collector
|
||||
|
||||
import (
|
||||
"strings"
|
||||
|
||||
"git.mchus.pro/mchus/logpile/internal/models"
|
||||
)
|
||||
|
||||
func (r redfishSnapshotReader) enrichNICsFromNetworkInterfaces(nics *[]models.NetworkAdapter, systemPaths []string) {
|
||||
if nics == nil {
|
||||
return
|
||||
}
|
||||
bySlot := make(map[string]int, len(*nics))
|
||||
for i, nic := range *nics {
|
||||
bySlot[strings.ToLower(strings.TrimSpace(nic.Slot))] = i
|
||||
}
|
||||
|
||||
for _, systemPath := range systemPaths {
|
||||
ifaces, err := r.getCollectionMembers(joinPath(systemPath, "/NetworkInterfaces"))
|
||||
if err != nil || len(ifaces) == 0 {
|
||||
continue
|
||||
}
|
||||
for _, iface := range ifaces {
|
||||
slot := firstNonEmpty(asString(iface["Id"]), asString(iface["Name"]))
|
||||
if strings.TrimSpace(slot) == "" {
|
||||
continue
|
||||
}
|
||||
idx, ok := bySlot[strings.ToLower(strings.TrimSpace(slot))]
|
||||
if !ok {
|
||||
*nics = append(*nics, models.NetworkAdapter{
|
||||
Slot: slot,
|
||||
Present: true,
|
||||
Model: firstNonEmpty(asString(iface["Model"]), asString(iface["Name"])),
|
||||
Status: mapStatus(iface["Status"]),
|
||||
})
|
||||
idx = len(*nics) - 1
|
||||
bySlot[strings.ToLower(strings.TrimSpace(slot))] = idx
|
||||
}
|
||||
|
||||
portsPath := redfishLinkedPath(iface, "NetworkPorts")
|
||||
if portsPath == "" {
|
||||
continue
|
||||
}
|
||||
portDocs, err := r.getCollectionMembers(portsPath)
|
||||
if err != nil || len(portDocs) == 0 {
|
||||
continue
|
||||
}
|
||||
macs := append([]string{}, (*nics)[idx].MACAddresses...)
|
||||
for _, p := range portDocs {
|
||||
macs = append(macs, collectNetworkPortMACs(p)...)
|
||||
}
|
||||
(*nics)[idx].MACAddresses = dedupeStrings(macs)
|
||||
if sanitizeNetworkPortCount((*nics)[idx].PortCount) == 0 {
|
||||
(*nics)[idx].PortCount = len(portDocs)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) collectNICs(chassisPaths []string) []models.NetworkAdapter {
|
||||
var nics []models.NetworkAdapter
|
||||
for _, chassisPath := range chassisPaths {
|
||||
adapterDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/NetworkAdapters"))
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
for _, doc := range adapterDocs {
|
||||
nic := parseNIC(doc)
|
||||
for _, pciePath := range networkAdapterPCIeDevicePaths(doc) {
|
||||
pcieDoc, err := r.getJSON(pciePath)
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
functionDocs := r.getLinkedPCIeFunctions(pcieDoc)
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(pcieDoc, "EnvironmentMetrics", "Metrics")
|
||||
for _, fn := range functionDocs {
|
||||
supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
|
||||
}
|
||||
enrichNICFromPCIe(&nic, pcieDoc, functionDocs, supplementalDocs)
|
||||
}
|
||||
if len(nic.MACAddresses) == 0 {
|
||||
r.enrichNICMACsFromNetworkDeviceFunctions(&nic, doc)
|
||||
}
|
||||
nics = append(nics, nic)
|
||||
}
|
||||
}
|
||||
return dedupeNetworkAdapters(nics)
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) collectPCIeDevices(systemPaths, chassisPaths []string) []models.PCIeDevice {
|
||||
collections := make([]string, 0, len(systemPaths)+len(chassisPaths))
|
||||
for _, systemPath := range systemPaths {
|
||||
collections = append(collections, joinPath(systemPath, "/PCIeDevices"))
|
||||
}
|
||||
for _, chassisPath := range chassisPaths {
|
||||
collections = append(collections, joinPath(chassisPath, "/PCIeDevices"))
|
||||
}
|
||||
var out []models.PCIeDevice
|
||||
for _, collectionPath := range collections {
|
||||
memberDocs, err := r.getCollectionMembers(collectionPath)
|
||||
if err != nil || len(memberDocs) == 0 {
|
||||
continue
|
||||
}
|
||||
for _, doc := range memberDocs {
|
||||
functionDocs := r.getLinkedPCIeFunctions(doc)
|
||||
if looksLikeGPU(doc, functionDocs) {
|
||||
continue
|
||||
}
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(doc, "EnvironmentMetrics", "Metrics")
|
||||
supplementalDocs = append(supplementalDocs, r.getChassisScopedPCIeSupplementalDocs(doc)...)
|
||||
for _, fn := range functionDocs {
|
||||
supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
|
||||
}
|
||||
dev := parsePCIeDeviceWithSupplementalDocs(doc, functionDocs, supplementalDocs)
|
||||
if isUnidentifiablePCIeDevice(dev) {
|
||||
continue
|
||||
}
|
||||
out = append(out, dev)
|
||||
}
|
||||
}
|
||||
for _, systemPath := range systemPaths {
|
||||
functionDocs, err := r.getCollectionMembers(joinPath(systemPath, "/PCIeFunctions"))
|
||||
if err != nil || len(functionDocs) == 0 {
|
||||
continue
|
||||
}
|
||||
for idx, fn := range functionDocs {
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")
|
||||
dev := parsePCIeFunctionWithSupplementalDocs(fn, supplementalDocs, idx+1)
|
||||
out = append(out, dev)
|
||||
}
|
||||
}
|
||||
return dedupePCIeDevices(out)
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) getChassisScopedPCIeSupplementalDocs(doc map[string]interface{}) []map[string]interface{} {
|
||||
if !looksLikeNVSwitchPCIeDoc(doc) {
|
||||
return nil
|
||||
}
|
||||
docPath := normalizeRedfishPath(asString(doc["@odata.id"]))
|
||||
chassisPath := chassisPathForPCIeDoc(docPath)
|
||||
if chassisPath == "" {
|
||||
return nil
|
||||
}
|
||||
out := make([]map[string]interface{}, 0, 4)
|
||||
for _, path := range []string{
|
||||
joinPath(chassisPath, "/EnvironmentMetrics"),
|
||||
joinPath(chassisPath, "/ThermalSubsystem/ThermalMetrics"),
|
||||
} {
|
||||
supplementalDoc, err := r.getJSON(path)
|
||||
if err != nil || len(supplementalDoc) == 0 {
|
||||
continue
|
||||
}
|
||||
out = append(out, supplementalDoc)
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// collectBMCMAC returns the MAC address of the first active BMC management
|
||||
// interface found in Managers/*/EthernetInterfaces. Returns empty string if
|
||||
// no MAC is available.
|
||||
func (r redfishSnapshotReader) collectBMCMAC(managerPaths []string) string {
|
||||
for _, managerPath := range managerPaths {
|
||||
members, err := r.getCollectionMembers(joinPath(managerPath, "/EthernetInterfaces"))
|
||||
if err != nil || len(members) == 0 {
|
||||
continue
|
||||
}
|
||||
for _, doc := range members {
|
||||
mac := strings.TrimSpace(firstNonEmpty(
|
||||
asString(doc["PermanentMACAddress"]),
|
||||
asString(doc["MACAddress"]),
|
||||
))
|
||||
if mac == "" || strings.EqualFold(mac, "00:00:00:00:00:00") {
|
||||
continue
|
||||
}
|
||||
return strings.ToUpper(mac)
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
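The selection rule in `collectBMCMAC` can be tried against already-loaded documents with a sketch like the one below. `asString` and `firstNonEmpty` are hypothetical stand-ins for the package helpers (their real behaviour is not shown in this diff), and the interface documents are invented sample data.

```go
package main

import (
	"fmt"
	"strings"
)

// asString is a hypothetical stand-in for the collector package helper.
func asString(v interface{}) string {
	s, _ := v.(string)
	return s
}

// firstNonEmpty is assumed to return the first value that is not blank.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if strings.TrimSpace(v) != "" {
			return v
		}
	}
	return ""
}

// pickBMCMAC applies the collectBMCMAC rule to in-memory documents:
// prefer PermanentMACAddress, fall back to MACAddress, skip empty and
// all-zero addresses, and return the first hit in upper case.
func pickBMCMAC(ifaces []map[string]interface{}) string {
	for _, doc := range ifaces {
		mac := strings.TrimSpace(firstNonEmpty(
			asString(doc["PermanentMACAddress"]),
			asString(doc["MACAddress"]),
		))
		if mac == "" || strings.EqualFold(mac, "00:00:00:00:00:00") {
			continue
		}
		return strings.ToUpper(mac)
	}
	return ""
}

func main() {
	ifaces := []map[string]interface{}{
		{"Id": "usb0", "MACAddress": "00:00:00:00:00:00"},
		{"Id": "eth0", "PermanentMACAddress": "", "MACAddress": "3c:ec:ef:12:34:56"},
	}
	fmt.Println(pickBMCMAC(ifaces)) // prints "3C:EC:EF:12:34:56"
}
```

As written, the rule does not inspect `Status` or `LinkStatus`; "active" in the doc comment effectively means "reports a non-zero MAC".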
|
||||
|
||||
// enrichNICMACsFromNetworkDeviceFunctions reads the NetworkDeviceFunctions
|
||||
// collection linked from a NetworkAdapter document and populates the NIC's
|
||||
// MACAddresses from each function's Ethernet.PermanentMACAddress / MACAddress.
|
||||
// Called when PCIe-path enrichment does not produce any MACs.
|
||||
func (r redfishSnapshotReader) enrichNICMACsFromNetworkDeviceFunctions(nic *models.NetworkAdapter, adapterDoc map[string]interface{}) {
|
||||
ndfCol, ok := adapterDoc["NetworkDeviceFunctions"].(map[string]interface{})
|
||||
if !ok {
|
||||
return
|
||||
}
|
||||
colPath := asString(ndfCol["@odata.id"])
|
||||
if colPath == "" {
|
||||
return
|
||||
}
|
||||
funcDocs, err := r.getCollectionMembers(colPath)
|
||||
if err != nil || len(funcDocs) == 0 {
|
||||
return
|
||||
}
|
||||
for _, fn := range funcDocs {
|
||||
eth, _ := fn["Ethernet"].(map[string]interface{})
|
||||
if eth == nil {
|
||||
continue
|
||||
}
|
||||
mac := strings.TrimSpace(firstNonEmpty(
|
||||
asString(eth["PermanentMACAddress"]),
|
||||
asString(eth["MACAddress"]),
|
||||
))
|
||||
if mac == "" {
|
||||
continue
|
||||
}
|
||||
nic.MACAddresses = dedupeStrings(append(nic.MACAddresses, strings.ToUpper(mac)))
|
||||
}
|
||||
if len(funcDocs) > 0 && nic.PortCount == 0 {
|
||||
nic.PortCount = sanitizeNetworkPortCount(len(funcDocs))
|
||||
}
|
||||
}
|
||||
91
internal/collector/redfish_replay_profiles.go
Normal file
@@ -0,0 +1,91 @@
|
||||
package collector
|
||||
|
||||
import (
|
||||
"strings"
|
||||
|
||||
"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
|
||||
)
|
||||
|
||||
func (r redfishSnapshotReader) collectKnownStorageMembers(systemPath string, relativeCollections []string) []map[string]interface{} {
|
||||
var out []map[string]interface{}
|
||||
for _, rel := range relativeCollections {
|
||||
docs, err := r.getCollectionMembers(joinPath(systemPath, rel))
|
||||
if err != nil || len(docs) == 0 {
|
||||
continue
|
||||
}
|
||||
out = append(out, docs...)
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) probeSupermicroNVMeDiskBays(backplanePath string) []map[string]interface{} {
|
||||
return r.probeDirectDiskBayChildren(joinPath(backplanePath, "/Drives"))
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) probeDirectDiskBayChildren(drivesCollectionPath string) []map[string]interface{} {
|
||||
var out []map[string]interface{}
|
||||
for _, path := range directDiskBayCandidates(drivesCollectionPath) {
|
||||
doc, err := r.getJSON(path)
|
||||
if err != nil || !looksLikeDrive(doc) {
|
||||
continue
|
||||
}
|
||||
out = append(out, doc)
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func resolveProcessorGPUChassisSerial(chassisByID map[string]map[string]interface{}, gpuID string, plan redfishprofile.ResolvedAnalysisPlan) string {
|
||||
for _, candidateID := range processorGPUChassisCandidateIDs(gpuID, plan) {
|
||||
if chassisDoc, ok := chassisByID[strings.ToUpper(candidateID)]; ok {
|
||||
if serial := strings.TrimSpace(asString(chassisDoc["SerialNumber"])); serial != "" {
|
||||
return serial
|
||||
}
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func processorGPUChassisCandidateIDs(gpuID string, plan redfishprofile.ResolvedAnalysisPlan) []string {
|
||||
gpuID = strings.TrimSpace(gpuID)
|
||||
if gpuID == "" {
|
||||
return nil
|
||||
}
|
||||
candidates := []string{gpuID}
|
||||
for _, mode := range plan.ProcessorGPUChassisLookupModes {
|
||||
switch strings.ToLower(strings.TrimSpace(mode)) {
|
||||
case "msi-index":
|
||||
candidates = append(candidates, msiProcessorGPUChassisCandidateIDs(gpuID)...)
|
||||
case "hgx-alias":
|
||||
if strings.HasPrefix(strings.ToUpper(gpuID), "GPU_") {
|
||||
candidates = append(candidates, "HGX_"+gpuID)
|
||||
}
|
||||
}
|
||||
}
|
||||
return dedupeStrings(candidates)
|
||||
}
|
||||
|
||||
func msiProcessorGPUChassisCandidateIDs(gpuID string) []string {
|
||||
gpuID = strings.TrimSpace(strings.ToUpper(gpuID))
|
||||
if gpuID == "" {
|
||||
return nil
|
||||
}
|
||||
var out []string
|
||||
switch {
|
||||
case strings.HasPrefix(gpuID, "GPU_SXM_"):
|
||||
index := strings.TrimPrefix(gpuID, "GPU_SXM_")
|
||||
if index != "" {
|
||||
out = append(out, "GPU"+index, "GPU_"+index)
|
||||
}
|
||||
case strings.HasPrefix(gpuID, "GPU_"):
|
||||
index := strings.TrimPrefix(gpuID, "GPU_")
|
||||
if index != "" {
|
||||
out = append(out, "GPU"+index, "GPU_SXM_"+index)
|
||||
}
|
||||
case strings.HasPrefix(gpuID, "GPU"):
|
||||
index := strings.TrimPrefix(gpuID, "GPU")
|
||||
if index != "" {
|
||||
out = append(out, "GPU_"+index, "GPU_SXM_"+index)
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
164
internal/collector/redfish_replay_storage.go
Normal file
@@ -0,0 +1,164 @@
|
||||
package collector
|
||||
|
||||
import (
|
||||
"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
|
||||
"git.mchus.pro/mchus/logpile/internal/models"
|
||||
)
|
||||
|
||||
func (r redfishSnapshotReader) collectStorage(systemPath string, plan redfishprofile.ResolvedAnalysisPlan) []models.Storage {
|
||||
var out []models.Storage
|
||||
storageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/Storage"))
|
||||
for _, member := range storageMembers {
|
||||
if driveCollection, ok := member["Drives"].(map[string]interface{}); ok {
|
||||
if driveCollectionPath := asString(driveCollection["@odata.id"]); driveCollectionPath != "" {
|
||||
driveDocs, err := r.getCollectionMembers(driveCollectionPath)
|
||||
if err == nil {
|
||||
for _, driveDoc := range driveDocs {
|
||||
if !isVirtualStorageDrive(driveDoc) {
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
|
||||
out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
|
||||
}
|
||||
}
|
||||
if len(driveDocs) == 0 {
|
||||
for _, driveDoc := range r.probeDirectDiskBayChildren(driveCollectionPath) {
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
|
||||
out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
|
||||
}
|
||||
}
|
||||
}
|
||||
continue
|
||||
}
|
||||
}
|
||||
if drives, ok := member["Drives"].([]interface{}); ok {
|
||||
for _, driveAny := range drives {
|
||||
driveRef, ok := driveAny.(map[string]interface{})
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
odata := asString(driveRef["@odata.id"])
|
||||
if odata == "" {
|
||||
continue
|
||||
}
|
||||
driveDoc, err := r.getJSON(odata)
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
if !isVirtualStorageDrive(driveDoc) {
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
|
||||
out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
|
||||
}
|
||||
}
|
||||
continue
|
||||
}
|
||||
if looksLikeDrive(member) {
|
||||
if isVirtualStorageDrive(member) {
|
||||
continue
|
||||
}
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(member, "DriveMetrics", "EnvironmentMetrics", "Metrics")
|
||||
out = append(out, parseDriveWithSupplementalDocs(member, supplementalDocs...))
|
||||
}
|
||||
|
||||
if plan.Directives.EnableStorageEnclosureRecovery {
|
||||
for _, enclosurePath := range redfishLinkRefs(member, "Links", "Enclosures") {
|
||||
driveDocs, err := r.getCollectionMembers(joinPath(enclosurePath, "/Drives"))
|
||||
if err == nil {
|
||||
for _, driveDoc := range driveDocs {
|
||||
if looksLikeDrive(driveDoc) && !isVirtualStorageDrive(driveDoc) {
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
|
||||
out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
|
||||
}
|
||||
}
|
||||
if len(driveDocs) == 0 {
|
||||
for _, driveDoc := range r.probeDirectDiskBayChildren(joinPath(enclosurePath, "/Drives")) {
|
||||
if isVirtualStorageDrive(driveDoc) {
|
||||
continue
|
||||
}
|
||||
out = append(out, parseDrive(driveDoc))
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if len(plan.KnownStorageDriveCollections) > 0 {
|
||||
for _, driveDoc := range r.collectKnownStorageMembers(systemPath, plan.KnownStorageDriveCollections) {
|
||||
if looksLikeDrive(driveDoc) && !isVirtualStorageDrive(driveDoc) {
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
|
||||
out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
simpleStorageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/SimpleStorage"))
|
||||
for _, member := range simpleStorageMembers {
|
||||
devices, ok := member["Devices"].([]interface{})
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
for _, devAny := range devices {
|
||||
devDoc, ok := devAny.(map[string]interface{})
|
||||
if !ok || !looksLikeDrive(devDoc) || isVirtualStorageDrive(devDoc) {
|
||||
continue
|
||||
}
|
||||
out = append(out, parseDrive(devDoc))
|
||||
}
|
||||
}
|
||||
|
||||
chassisPaths := r.discoverMemberPaths("/redfish/v1/Chassis", "/redfish/v1/Chassis/1")
|
||||
for _, chassisPath := range chassisPaths {
|
||||
driveDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/Drives"))
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
for _, driveDoc := range driveDocs {
|
||||
if !looksLikeDrive(driveDoc) || isVirtualStorageDrive(driveDoc) {
|
||||
continue
|
||||
}
|
||||
out = append(out, parseDrive(driveDoc))
|
||||
}
|
||||
}
|
||||
if plan.Directives.EnableSupermicroNVMeBackplane {
|
||||
for _, chassisPath := range chassisPaths {
|
||||
if !isSupermicroNVMeBackplanePath(chassisPath) {
|
||||
continue
|
||||
}
|
||||
for _, driveDoc := range r.probeSupermicroNVMeDiskBays(chassisPath) {
|
||||
if !looksLikeDrive(driveDoc) || isVirtualStorageDrive(driveDoc) {
|
||||
continue
|
||||
}
|
||||
out = append(out, parseDrive(driveDoc))
|
||||
}
|
||||
}
|
||||
}
|
||||
return dedupeStorage(out)
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) collectStorageVolumes(systemPath string, plan redfishprofile.ResolvedAnalysisPlan) []models.StorageVolume {
|
||||
var out []models.StorageVolume
|
||||
storageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/Storage"))
|
||||
for _, member := range storageMembers {
|
||||
controller := firstNonEmpty(asString(member["Id"]), asString(member["Name"]))
|
||||
volumeCollectionPath := redfishLinkedPath(member, "Volumes")
|
||||
if volumeCollectionPath == "" {
|
||||
continue
|
||||
}
|
||||
volumeDocs, err := r.getCollectionMembers(volumeCollectionPath)
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
for _, volDoc := range volumeDocs {
|
||||
if looksLikeVolume(volDoc) {
|
||||
out = append(out, parseStorageVolume(volDoc, controller))
|
||||
}
|
||||
}
|
||||
}
|
||||
if len(plan.KnownStorageVolumeCollections) > 0 {
|
||||
for _, volDoc := range r.collectKnownStorageMembers(systemPath, plan.KnownStorageVolumeCollections) {
|
||||
if looksLikeVolume(volDoc) {
|
||||
out = append(out, parseStorageVolume(volDoc, storageControllerFromPath(asString(volDoc["@odata.id"]))))
|
||||
}
|
||||
}
|
||||
}
|
||||
return dedupeStorageVolumes(out)
|
||||
}
|
||||
@@ -11,9 +11,14 @@ import (
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
|
||||
"git.mchus.pro/mchus/logpile/internal/models"
|
||||
)
|
||||
|
||||
func testAnalysisPlan(d redfishprofile.AnalysisDirectives) redfishprofile.ResolvedAnalysisPlan {
|
||||
return redfishprofile.ResolvedAnalysisPlan{Directives: d}
|
||||
}
|
||||
|
||||
func TestRedfishConnectorCollect(t *testing.T) {
|
||||
mux := http.NewServeMux()
|
||||
register := func(path string, payload interface{}) {
|
||||
@@ -1422,6 +1427,12 @@ func TestRecoverCriticalRedfishDocsPlanB_RetriesMembersFromExistingCollection(t
|
||||
[]string{"/redfish/v1/Chassis/1/Drives"},
|
||||
rawTree,
|
||||
fetchErrs,
|
||||
redfishprofile.AcquisitionTuning{
|
||||
RecoveryPolicy: redfishprofile.AcquisitionRecoveryPolicy{
|
||||
EnableCriticalCollectionMemberRetry: true,
|
||||
EnableCriticalSlowProbe: true,
|
||||
},
|
||||
},
|
||||
nil,
|
||||
)
|
||||
if recovered == 0 {
|
||||
@@ -1474,7 +1485,12 @@ func TestRecoverCriticalRedfishDocsPlanB_RetriesMembersFromSystemMemoryCollectio
|
||||
dimmPath: `Get "https://example/redfish/v1/Systems/1/Memory/CPU1_C1D1": context deadline exceeded (Client.Timeout exceeded while awaiting headers)`,
|
||||
}
|
||||
|
||||
-criticalPaths := redfishCriticalEndpoints([]string{systemPath}, nil, nil)
+plan := redfishprofile.BuildAcquisitionPlan(redfishprofile.MatchSignals{})
+match := redfishprofile.MatchProfiles(redfishprofile.MatchSignals{})
+resolved := redfishprofile.ResolveAcquisitionPlan(match, plan, redfishprofile.DiscoveredResources{
+SystemPaths: []string{systemPath},
+}, redfishprofile.MatchSignals{})
+criticalPaths := resolved.CriticalPaths
|
||||
hasMemoryPath := false
|
||||
for _, p := range criticalPaths {
|
||||
if p == memoryPath {
|
||||
@@ -1495,6 +1511,12 @@ func TestRecoverCriticalRedfishDocsPlanB_RetriesMembersFromSystemMemoryCollectio
|
||||
criticalPaths,
|
||||
rawTree,
|
||||
fetchErrs,
|
||||
redfishprofile.AcquisitionTuning{
|
||||
RecoveryPolicy: redfishprofile.AcquisitionRecoveryPolicy{
|
||||
EnableCriticalCollectionMemberRetry: true,
|
||||
EnableCriticalSlowProbe: true,
|
||||
},
|
||||
},
|
||||
nil,
|
||||
)
|
||||
if recovered == 0 {
|
||||
@@ -1508,6 +1530,50 @@ func TestRecoverCriticalRedfishDocsPlanB_RetriesMembersFromSystemMemoryCollectio
|
||||
}
|
||||
}
|
||||
|
||||
func TestRecoverCriticalRedfishDocsPlanB_SkipsMemberRetryWithoutRecoveryPolicy(t *testing.T) {
|
||||
t.Setenv("LOGPILE_REDFISH_CRITICAL_COOLDOWN", "0s")
|
||||
t.Setenv("LOGPILE_REDFISH_CRITICAL_SLOW_GAP", "0s")
|
||||
t.Setenv("LOGPILE_REDFISH_CRITICAL_PLANB_RETRIES", "1")
|
||||
t.Setenv("LOGPILE_REDFISH_CRITICAL_RETRIES", "1")
|
||||
t.Setenv("LOGPILE_REDFISH_CRITICAL_BACKOFF", "0s")
|
||||
|
||||
const memoryPath = "/redfish/v1/Systems/1/Memory"
|
||||
const dimmPath = "/redfish/v1/Systems/1/Memory/CPU1_C1D1"
|
||||
|
||||
rawTree := map[string]interface{}{
|
||||
memoryPath: map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": dimmPath},
|
||||
},
|
||||
},
|
||||
}
|
||||
fetchErrs := map[string]string{
|
||||
dimmPath: `Get "https://example/redfish/v1/Systems/1/Memory/CPU1_C1D1": context deadline exceeded (Client.Timeout exceeded while awaiting headers)`,
|
||||
}
|
||||
|
||||
c := NewRedfishConnector()
|
||||
recovered := c.recoverCriticalRedfishDocsPlanB(
|
||||
context.Background(),
|
||||
http.DefaultClient,
|
||||
Request{},
|
||||
"https://example",
|
||||
[]string{memoryPath},
|
||||
rawTree,
|
||||
fetchErrs,
|
||||
redfishprofile.AcquisitionTuning{},
|
||||
nil,
|
||||
)
|
||||
if recovered != 0 {
|
||||
t.Fatalf("expected no recovery without recovery policy, got %d", recovered)
|
||||
}
|
||||
if _, ok := rawTree[dimmPath]; ok {
|
||||
t.Fatalf("did not expect recovered DIMM doc for %s", dimmPath)
|
||||
}
|
||||
if _, ok := fetchErrs[dimmPath]; !ok {
|
||||
t.Fatalf("expected DIMM fetch error for %s to remain", dimmPath)
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayCollectStorage_ProbesSupermicroNVMeDiskBayWhenCollectionEmpty(t *testing.T) {
|
||||
r := redfishSnapshotReader{tree: map[string]interface{}{
|
||||
"/redfish/v1/Systems": map[string]interface{}{
|
||||
@@ -1551,7 +1617,7 @@ func TestReplayCollectStorage_ProbesSupermicroNVMeDiskBayWhenCollectionEmpty(t *
|
||||
},
|
||||
}}
|
||||
|
||||
-got := r.collectStorage("/redfish/v1/Systems/1")
+got := r.collectStorage("/redfish/v1/Systems/1", testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableSupermicroNVMeBackplane: true}))
|
||||
if len(got) != 1 {
|
||||
t.Fatalf("expected one drive from direct Disk.Bay probe, got %d", len(got))
|
||||
}
|
||||
@@ -1563,6 +1629,70 @@ func TestReplayCollectStorage_ProbesSupermicroNVMeDiskBayWhenCollectionEmpty(t *
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayCollectStorage_SkipsEnclosureRecoveryWhenDirectiveDisabled(t *testing.T) {
|
||||
r := redfishSnapshotReader{tree: map[string]interface{}{
|
||||
"/redfish/v1/Systems/1/Storage": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Systems/1/Storage/1"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Systems/1/Storage/1": map[string]interface{}{
|
||||
"Links": map[string]interface{}{
|
||||
"Enclosures": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Enclosures/1"},
|
||||
},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Enclosures/1/Drives": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Enclosures/1/Drives/Drive1"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Enclosures/1/Drives/Drive1": map[string]interface{}{
|
||||
"Id": "Drive1",
|
||||
"Name": "Drive1",
|
||||
"Model": "INTEL SSD",
|
||||
"SerialNumber": "ENCLOSURE-DRIVE-001",
|
||||
"Protocol": "SATA",
|
||||
"MediaType": "SSD",
|
||||
},
|
||||
}}
|
||||
|
||||
got := r.collectStorage("/redfish/v1/Systems/1", testAnalysisPlan(redfishprofile.AnalysisDirectives{}))
|
||||
if len(got) != 0 {
|
||||
t.Fatalf("expected no enclosure recovery when directive is off, got %d", len(got))
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayCollectStorage_UsesKnownControllerRecoveryWhenEnabled(t *testing.T) {
|
||||
r := redfishSnapshotReader{tree: map[string]interface{}{
|
||||
"/redfish/v1/Systems/1/Storage/IntelVROC/Drives": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Systems/1/Storage/IntelVROC/Drives/1"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Systems/1/Storage/IntelVROC/Drives/1": map[string]interface{}{
|
||||
"Id": "1",
|
||||
"Name": "Drive1",
|
||||
"Model": "VROC SSD",
|
||||
"SerialNumber": "VROC-001",
|
||||
"Protocol": "NVMe",
|
||||
"MediaType": "SSD",
|
||||
},
|
||||
}}
|
||||
|
||||
got := r.collectStorage("/redfish/v1/Systems/1", redfishprofile.ResolvedAnalysisPlan{
|
||||
Directives: redfishprofile.AnalysisDirectives{EnableKnownStorageControllerRecovery: true},
|
||||
KnownStorageDriveCollections: []string{"/Storage/IntelVROC/Drives"},
|
||||
})
|
||||
if len(got) != 1 {
|
||||
t.Fatalf("expected one drive from known controller recovery, got %d", len(got))
|
||||
}
|
||||
if got[0].SerialNumber != "VROC-001" {
|
||||
t.Fatalf("unexpected serial %q", got[0].SerialNumber)
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayCollectGPUs_DoesNotCollapseOnPlaceholderSerialAndSkipsNIC(t *testing.T) {
|
||||
r := redfishSnapshotReader{tree: map[string]interface{}{
|
||||
"/redfish/v1/Chassis/1/PCIeDevices": map[string]interface{}{
|
||||
@@ -1610,7 +1740,7 @@ func TestReplayCollectGPUs_DoesNotCollapseOnPlaceholderSerialAndSkipsNIC(t *test
|
||||
},
|
||||
}}
|
||||
|
||||
-got := r.collectGPUs(nil, []string{"/redfish/v1/Chassis/1"})
+got := r.collectGPUs(nil, []string{"/redfish/v1/Chassis/1"}, testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableGenericGraphicsControllerDedup: true}))
|
||||
if len(got) != 2 {
|
||||
t.Fatalf("expected 2 GPUs (two H200 cards), got %d", len(got))
|
||||
}
|
||||
@@ -1681,7 +1811,7 @@ func TestReplayCollectGPUs_FromGraphicsControllers(t *testing.T) {
|
||||
},
|
||||
}}
|
||||
|
||||
-got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, nil)
+got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, nil, testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableGenericGraphicsControllerDedup: true}))
|
||||
if len(got) != 2 {
|
||||
t.Fatalf("expected 2 GPUs from GraphicsControllers, got %d", len(got))
|
||||
}
|
||||
@@ -1714,7 +1844,7 @@ func TestReplayCollectGPUs_DedupUsesRedfishPathBeforeHeuristics(t *testing.T) {
|
||||
},
|
||||
}}
|
||||
|
||||
-got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, nil)
+got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, nil, testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableGenericGraphicsControllerDedup: true}))
|
||||
if len(got) != 2 {
|
||||
t.Fatalf("expected both GPUs to be kept by unique redfish path, got %d", len(got))
|
||||
}
|
||||
@@ -1834,6 +1964,83 @@ func TestReplayRedfishFromRawPayloads_AddsMissingServerModelWarning(t *testing.T
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayRedfishFromRawPayloads_StoresAnalysisProfilesMetadata(t *testing.T) {
|
||||
raw := map[string]any{
|
||||
"redfish_tree": map[string]interface{}{
|
||||
"/redfish/v1": map[string]interface{}{
|
||||
"Vendor": "AMI",
|
||||
"Product": "AMI Redfish",
|
||||
"Systems": map[string]interface{}{"@odata.id": "/redfish/v1/Systems"},
|
||||
"Chassis": map[string]interface{}{"@odata.id": "/redfish/v1/Chassis"},
|
||||
"Managers": map[string]interface{}{"@odata.id": "/redfish/v1/Managers"},
|
||||
},
|
||||
"/redfish/v1/Systems": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Systems/1"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Systems/1": map[string]interface{}{
|
||||
"Manufacturer": "Micro-Star International Co., Ltd.",
|
||||
"Model": "CG290",
|
||||
},
|
||||
"/redfish/v1/Chassis": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Chassis/1": map[string]interface{}{
|
||||
"Manufacturer": "Micro-Star International Co., Ltd.",
|
||||
"Model": "CG290",
|
||||
},
|
||||
"/redfish/v1/Managers": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Managers/1"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Managers/1": map[string]interface{}{
|
||||
"Id": "1",
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
got, err := ReplayRedfishFromRawPayloads(raw, nil)
|
||||
if err != nil {
|
||||
t.Fatalf("replay failed: %v", err)
|
||||
}
|
||||
meta, ok := got.RawPayloads["redfish_analysis_profiles"].(map[string]any)
|
||||
if !ok {
|
||||
t.Fatalf("expected redfish_analysis_profiles metadata")
|
||||
}
|
||||
if meta["mode"] != redfishprofile.ModeMatched {
|
||||
t.Fatalf("expected matched mode, got %#v", meta["mode"])
|
||||
}
|
||||
profiles, ok := meta["profiles"].([]string)
|
||||
if !ok {
|
||||
t.Fatalf("expected []string profiles, got %T", meta["profiles"])
|
||||
}
|
||||
foundMSI := false
|
||||
for _, profile := range profiles {
|
||||
if profile == "msi" {
|
||||
foundMSI = true
|
||||
break
|
||||
}
|
||||
}
|
||||
if !foundMSI {
|
||||
t.Fatalf("expected msi in applied profiles, got %v", profiles)
|
||||
}
|
||||
planMeta, ok := got.RawPayloads["redfish_analysis_plan"].(map[string]any)
|
||||
if !ok {
|
||||
t.Fatalf("expected redfish_analysis_plan metadata")
|
||||
}
|
||||
directives, ok := planMeta["directives"].(map[string]any)
|
||||
if !ok {
|
||||
t.Fatalf("expected directives map in redfish_analysis_plan")
|
||||
}
|
||||
if directives["generic_graphics_controller_dedup"] != true {
|
||||
t.Fatalf("expected generic_graphics_controller_dedup directive, got %#v", directives["generic_graphics_controller_dedup"])
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayRedfishFromRawPayloads_AddsDriveFetchWarning(t *testing.T) {
|
||||
raw := map[string]any{
|
||||
"redfish_tree": map[string]interface{}{
|
||||
@@ -1934,7 +2141,7 @@ func TestReplayCollectGPUs_SkipsModelOnlyDuplicateFromGraphicsControllers(t *tes
|
||||
},
|
||||
}}
|
||||
|
||||
got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, nil)
|
||||
got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, nil, testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableGenericGraphicsControllerDedup: true}))
|
||||
if len(got) != 2 {
|
||||
t.Fatalf("expected 2 GPUs without generic duplicate, got %d", len(got))
|
||||
}
|
||||
@@ -1945,6 +2152,48 @@ func TestReplayCollectGPUs_SkipsModelOnlyDuplicateFromGraphicsControllers(t *tes
|
||||
}
|
||||
}
|
||||
|
||||
func TestReplayCollectGPUs_KeepsModelOnlyGraphicsDuplicateWhenDirectiveDisabled(t *testing.T) {
|
||||
r := redfishSnapshotReader{tree: map[string]interface{}{
|
||||
"/redfish/v1/Chassis/1/PCIeDevices": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/4"},
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/9"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Chassis/1/PCIeDevices/4": map[string]interface{}{
|
||||
"Id": "4",
|
||||
"Name": "PCIeCard4",
|
||||
"Model": "H200-SXM5-141G",
|
||||
"Manufacturer": "NVIDIA",
|
||||
"SerialNumber": "1654225094493",
|
||||
},
|
||||
"/redfish/v1/Chassis/1/PCIeDevices/9": map[string]interface{}{
|
||||
"Id": "9",
|
||||
"Name": "PCIeCard9",
|
||||
"Model": "H200-SXM5-141G",
|
||||
"Manufacturer": "NVIDIA",
|
||||
"SerialNumber": "1654425002635",
|
||||
},
|
||||
"/redfish/v1/Systems/1/GraphicsControllers": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Systems/1/GraphicsControllers/GPU0"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Systems/1/GraphicsControllers/GPU0": map[string]interface{}{
|
||||
"Id": "GPU0",
|
||||
"Name": "H200-SXM5-141G",
|
||||
"Model": "H200-SXM5-141G",
|
||||
"Manufacturer": "NVIDIA",
|
||||
"SerialNumber": "N/A",
|
||||
},
|
||||
}}
|
||||
|
||||
got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, []string{"/redfish/v1/Chassis/1"}, testAnalysisPlan(redfishprofile.AnalysisDirectives{}))
|
||||
if len(got) != 3 {
|
||||
t.Fatalf("expected model-only graphics duplicate to remain when directive is off, got %d", len(got))
|
||||
}
|
||||
}
|
||||
|
||||
func TestApplyBoardInfoFallbackFromDocs_SkipsComponentProductNames(t *testing.T) {
|
||||
board := models.BoardInfo{
|
||||
SerialNumber: "23E100051",
|
||||
@@ -2129,7 +2378,7 @@ func TestReplayCollectGPUs_DropsModelOnlyPlaceholderWhenConcreteDiscoveredLater(
|
||||
},
|
||||
}}
|
||||
|
||||
got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, []string{"/redfish/v1/Chassis/1"})
|
||||
got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, []string{"/redfish/v1/Chassis/1"}, testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableGenericGraphicsControllerDedup: true}))
|
||||
if len(got) != 1 {
|
||||
t.Fatalf("expected generic graphics placeholder to be dropped, got %d GPUs", len(got))
|
||||
}
|
||||
@@ -2169,7 +2418,7 @@ func TestReplayCollectGPUs_MergesGraphicsSerialIntoConcretePCIeGPU(t *testing.T)
|
||||
},
|
||||
}}
|
||||
|
||||
got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, []string{"/redfish/v1/Chassis/1"})
|
||||
got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, []string{"/redfish/v1/Chassis/1"}, testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableGenericGraphicsControllerDedup: true}))
|
||||
if len(got) != 1 {
|
||||
t.Fatalf("expected merged single GPU row, got %d", len(got))
|
||||
}
|
||||
@@ -2227,7 +2476,7 @@ func TestReplayCollectGPUs_MergesAmbiguousSameModelByOrder(t *testing.T) {
|
||||
}
|
||||
|
||||
r := redfishSnapshotReader{tree: tree}
|
||||
got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, []string{"/redfish/v1/Chassis/1"})
|
||||
got := r.collectGPUs([]string{"/redfish/v1/Systems/1"}, []string{"/redfish/v1/Chassis/1"}, testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableGenericGraphicsControllerDedup: true}))
|
||||
if len(got) != len(pcieIDs) {
|
||||
t.Fatalf("expected %d merged GPUs, got %d", len(pcieIDs), len(got))
|
||||
}
|
||||
@@ -2358,8 +2607,8 @@ func TestCollectGPUsFromProcessors_SupermicroHGX(t *testing.T) {
|
||||
}
|
||||
systemPaths := []string{"/redfish/v1/Systems/HGX_Baseboard_0"}
|
||||
|
||||
gpus := r.collectGPUs(systemPaths, chassisPaths)
|
||||
gpus = r.collectGPUsFromProcessors(systemPaths, chassisPaths, gpus)
|
||||
gpus := r.collectGPUs(systemPaths, chassisPaths, testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableGenericGraphicsControllerDedup: true}))
|
||||
gpus = r.collectGPUsFromProcessors(systemPaths, chassisPaths, gpus, testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableProcessorGPUFallback: true}))
|
||||
|
||||
if len(gpus) != 2 {
|
||||
var slots []string
|
||||
@@ -2370,6 +2619,110 @@ func TestCollectGPUsFromProcessors_SupermicroHGX(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestCollectGPUsFromProcessors_SupermicroHGXUsesChassisAliasSerial(t *testing.T) {
|
||||
tree := map[string]interface{}{
|
||||
"/redfish/v1/Chassis/1/PCIeDevices": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/GPU1"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Chassis/1/PCIeDevices/GPU1": map[string]interface{}{
|
||||
"Id": "GPU1",
|
||||
"Name": "GPU1",
|
||||
"Model": "NVIDIA H200",
|
||||
"Manufacturer": "NVIDIA",
|
||||
"SerialNumber": "SN-ALIAS-001",
|
||||
"PCIeFunctions": map[string]interface{}{
|
||||
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/GPU1/PCIeFunctions",
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Chassis/1/PCIeDevices/GPU1/PCIeFunctions": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/GPU1/PCIeFunctions/1"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Chassis/1/PCIeDevices/GPU1/PCIeFunctions/1": map[string]interface{}{
|
||||
"FunctionId": "1",
|
||||
"ClassCode": "0x030200",
|
||||
},
|
||||
"/redfish/v1/Chassis/HGX_GPU_SXM_1": map[string]interface{}{
|
||||
"Id": "HGX_GPU_SXM_1",
|
||||
"SerialNumber": "SN-ALIAS-001",
|
||||
},
|
||||
"/redfish/v1/Systems/HGX_Baseboard_0/Processors": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_SXM_1"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_SXM_1": map[string]interface{}{
|
||||
"Id": "GPU_SXM_1",
|
||||
"Name": "Processor",
|
||||
"ProcessorType": "GPU",
|
||||
"Model": "NVIDIA H200",
|
||||
"Manufacturer": "NVIDIA",
|
||||
},
|
||||
}
|
||||
|
||||
r := redfishSnapshotReader{tree: tree}
|
||||
chassisPaths := []string{
|
||||
"/redfish/v1/Chassis/1",
|
||||
"/redfish/v1/Chassis/HGX_GPU_SXM_1",
|
||||
}
|
||||
systemPaths := []string{"/redfish/v1/Systems/HGX_Baseboard_0"}
|
||||
|
||||
gpus := r.collectGPUs(systemPaths, chassisPaths, testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableGenericGraphicsControllerDedup: true}))
|
||||
gpus = r.collectGPUsFromProcessors(systemPaths, chassisPaths, gpus, redfishprofile.ResolvedAnalysisPlan{
|
||||
Directives: redfishprofile.AnalysisDirectives{EnableProcessorGPUFallback: true, EnableProcessorGPUChassisAlias: true},
|
||||
ProcessorGPUChassisLookupModes: []string{"hgx-alias"},
|
||||
})
|
||||
|
||||
if len(gpus) != 1 {
|
||||
t.Fatalf("expected alias serial dedupe to keep 1 gpu, got %d", len(gpus))
|
||||
}
|
||||
if gpus[0].SerialNumber != "SN-ALIAS-001" {
|
||||
t.Fatalf("expected serial from aliased chassis, got %q", gpus[0].SerialNumber)
|
||||
}
|
||||
}
|
||||
|
||||
func TestCollectGPUsFromProcessors_MSIUsesIndexedChassisLookup(t *testing.T) {
|
||||
tree := map[string]interface{}{
|
||||
"/redfish/v1/Chassis/GPU1": map[string]interface{}{
|
||||
"Id": "GPU1",
|
||||
"SerialNumber": "MSI-SN-001",
|
||||
},
|
||||
"/redfish/v1/Systems/1/Processors": map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Systems/1/Processors/GPU_SXM_1"},
|
||||
},
|
||||
},
|
||||
"/redfish/v1/Systems/1/Processors/GPU_SXM_1": map[string]interface{}{
|
||||
"Id": "GPU_SXM_1",
|
||||
"Name": "Processor",
|
||||
"ProcessorType": "GPU",
|
||||
"Model": "NVIDIA RTX PRO 6000 Blackwell",
|
||||
"Manufacturer": "NVIDIA",
|
||||
},
|
||||
}
|
||||
|
||||
r := redfishSnapshotReader{tree: tree}
|
||||
gpus := r.collectGPUsFromProcessors(
|
||||
[]string{"/redfish/v1/Systems/1"},
|
||||
[]string{"/redfish/v1/Chassis/GPU1"},
|
||||
nil,
|
||||
redfishprofile.ResolvedAnalysisPlan{
|
||||
Directives: redfishprofile.AnalysisDirectives{EnableProcessorGPUFallback: true, EnableMSIProcessorGPUChassisLookup: true},
|
||||
ProcessorGPUChassisLookupModes: []string{"msi-index"},
|
||||
},
|
||||
)
|
||||
|
||||
if len(gpus) != 1 {
|
||||
t.Fatalf("expected one gpu, got %d", len(gpus))
|
||||
}
|
||||
if gpus[0].SerialNumber != "MSI-SN-001" {
|
||||
t.Fatalf("expected serial from MSI indexed chassis lookup, got %q", gpus[0].SerialNumber)
|
||||
}
|
||||
}
|
||||
|
||||
// TestReplayCollectGPUs_DedupCrossChassisSerial verifies that the same GPU
|
||||
// appearing under two Chassis PCIeDevice trees (e.g. Chassis/1/PCIeDevices/GPU1
|
||||
// and Chassis/HGX_GPU_SXM_1/PCIeDevices/GPU_SXM_1) is deduplicated to one entry
|
||||
@@ -2428,7 +2781,7 @@ func TestReplayCollectGPUs_DedupCrossChassisSerial(t *testing.T) {
|
||||
got := r.collectGPUs(nil, []string{
|
||||
"/redfish/v1/Chassis/1",
|
||||
"/redfish/v1/Chassis/HGX_GPU_SXM_1",
|
||||
})
|
||||
}, testAnalysisPlan(redfishprofile.AnalysisDirectives{EnableGenericGraphicsControllerDedup: true}))
|
||||
if len(got) != 1 {
|
||||
var slots []string
|
||||
for _, g := range got {
|
||||
@@ -2565,32 +2918,42 @@ func TestRedfishSnapshotBranchKey(t *testing.T) {
|
||||
}
|
||||
|
||||
func TestShouldPostProbeCollectionPath(t *testing.T) {
|
||||
if shouldPostProbeCollectionPath("/redfish/v1/Chassis/1/Sensors") {
|
||||
var tuning redfishprofile.AcquisitionTuning
|
||||
if shouldPostProbeCollectionPath("/redfish/v1/Chassis/1/Sensors", tuning) {
|
||||
t.Fatalf("expected sensors collection to be skipped by default")
|
||||
}
|
||||
if shouldPostProbeCollectionPath("/redfish/v1/Systems/1/Storage/RAID/Drives", tuning) {
|
||||
t.Fatalf("expected drives collection to be skipped without profile policy")
|
||||
}
|
||||
tuning.PostProbePolicy.EnableNumericCollectionProbe = true
|
||||
t.Setenv("LOGPILE_REDFISH_SENSOR_POSTPROBE", "1")
|
||||
if !shouldPostProbeCollectionPath("/redfish/v1/Chassis/1/Sensors") {
|
||||
if !shouldPostProbeCollectionPath("/redfish/v1/Chassis/1/Sensors", tuning) {
|
||||
t.Fatalf("expected sensors collection to be post-probed when enabled")
|
||||
}
|
||||
if !shouldPostProbeCollectionPath("/redfish/v1/Systems/1/Storage/RAID/Drives") {
|
||||
if !shouldPostProbeCollectionPath("/redfish/v1/Systems/1/Storage/RAID/Drives", tuning) {
|
||||
t.Fatalf("expected drives collection to be post-probed")
|
||||
}
|
||||
if shouldPostProbeCollectionPath("/redfish/v1/Chassis/1/Boards/BOARD1") {
|
||||
if shouldPostProbeCollectionPath("/redfish/v1/Chassis/1/Boards/BOARD1", tuning) {
|
||||
t.Fatalf("expected board member resource to be skipped from post-probe")
|
||||
}
|
||||
if shouldPostProbeCollectionPath("/redfish/v1/Chassis/1/Assembly/Oem/COMMONb/COMMONbAssembly/1") {
|
||||
if shouldPostProbeCollectionPath("/redfish/v1/Chassis/1/Assembly/Oem/COMMONb/COMMONbAssembly/1", tuning) {
|
||||
t.Fatalf("expected assembly member resource to be skipped from post-probe")
|
||||
}
|
||||
}
|
||||
|
||||
func TestShouldAdaptivePostProbeCollectionPath(t *testing.T) {
|
||||
tuning := redfishprofile.AcquisitionTuning{
|
||||
PostProbePolicy: redfishprofile.AcquisitionPostProbePolicy{
|
||||
EnableNumericCollectionProbe: true,
|
||||
},
|
||||
}
|
||||
withExplicitNamedMembers := map[string]interface{}{
|
||||
"Members": []interface{}{
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Systems/1/EthernetInterfaces/NIC-0-0"},
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Systems/1/EthernetInterfaces/NIC-0-1"},
|
||||
},
|
||||
}
|
||||
if shouldAdaptivePostProbeCollectionPath("/redfish/v1/Systems/1/EthernetInterfaces", withExplicitNamedMembers) {
|
||||
if shouldAdaptivePostProbeCollectionPath("/redfish/v1/Systems/1/EthernetInterfaces", withExplicitNamedMembers, tuning) {
|
||||
t.Fatalf("expected explicit non-numeric members to skip adaptive post-probe")
|
||||
}
|
||||
|
||||
@@ -2600,14 +2963,18 @@ func TestShouldAdaptivePostProbeCollectionPath(t *testing.T) {
|
||||
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/2"},
|
||||
},
|
||||
}
|
||||
if !shouldAdaptivePostProbeCollectionPath("/redfish/v1/Chassis/1/PCIeDevices", withNumericMembers) {
|
||||
if !shouldAdaptivePostProbeCollectionPath("/redfish/v1/Chassis/1/PCIeDevices", withNumericMembers, tuning) {
|
||||
t.Fatalf("expected numeric members to allow adaptive post-probe")
|
||||
}
|
||||
|
||||
withoutMembers := map[string]interface{}{"Name": "Drives"}
|
||||
if !shouldAdaptivePostProbeCollectionPath("/redfish/v1/Chassis/1/Drives", withoutMembers) {
|
||||
if !shouldAdaptivePostProbeCollectionPath("/redfish/v1/Chassis/1/Drives", withoutMembers, tuning) {
|
||||
t.Fatalf("expected missing members to allow adaptive post-probe")
|
||||
}
|
||||
|
||||
if shouldAdaptivePostProbeCollectionPath("/redfish/v1/Chassis/1/Drives", withoutMembers, redfishprofile.AcquisitionTuning{}) {
|
||||
t.Fatalf("expected post-probe to stay disabled without profile policy")
|
||||
}
|
||||
}
|
||||
|
||||
func TestShouldAdaptiveNVMeProbe(t *testing.T) {
|
||||
@@ -2627,6 +2994,15 @@ func TestShouldAdaptiveNVMeProbe(t *testing.T) {
|
||||
}
|
||||
|
||||
func TestRedfishAdaptivePrefetchTargets(t *testing.T) {
|
||||
tuning := redfishprofile.AcquisitionTuning{
|
||||
PrefetchPolicy: redfishprofile.AcquisitionPrefetchPolicy{
|
||||
IncludeSuffixes: []string{
|
||||
"/Memory",
|
||||
"/Processors",
|
||||
"/Storage",
|
||||
},
|
||||
},
|
||||
}
|
||||
candidates := []string{
|
||||
"/redfish/v1/Systems/1/Memory",
|
||||
"/redfish/v1/Systems/1/Processors",
|
||||
@@ -2651,7 +3027,7 @@ func TestRedfishAdaptivePrefetchTargets(t *testing.T) {
|
||||
"/redfish/v1/Systems/1/Storage/Volumes": "status 404 from /redfish/v1/Systems/1/Storage/Volumes: not found",
|
||||
}
|
||||
|
||||
got := redfishAdaptivePrefetchTargets(candidates, rawTree, fetchErrs)
|
||||
got := redfishAdaptivePrefetchTargets(redfishPrefetchTargets(candidates, tuning), rawTree, fetchErrs)
|
||||
joined := strings.Join(got, "\n")
|
||||
for _, wanted := range []string{
|
||||
"/redfish/v1/Systems/1/Memory",
|
||||
@@ -2666,12 +3042,16 @@ func TestRedfishAdaptivePrefetchTargets(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestRedfishSnapshotPrioritySeeds_DefaultSkipsNoisyBranches(t *testing.T) {
|
||||
seeds := redfishSnapshotPrioritySeeds(
|
||||
[]string{"/redfish/v1/Systems/1"},
|
||||
[]string{"/redfish/v1/Chassis/1"},
|
||||
[]string{"/redfish/v1/Managers/1"},
|
||||
)
|
||||
func TestResolveAcquisitionPlan_DefaultSkipsNoisyBranches(t *testing.T) {
|
||||
signals := redfishprofile.MatchSignals{}
|
||||
match := redfishprofile.MatchProfiles(signals)
|
||||
plan := redfishprofile.BuildAcquisitionPlan(signals)
|
||||
resolved := redfishprofile.ResolveAcquisitionPlan(match, plan, redfishprofile.DiscoveredResources{
|
||||
SystemPaths: []string{"/redfish/v1/Systems/1"},
|
||||
ChassisPaths: []string{"/redfish/v1/Chassis/1"},
|
||||
ManagerPaths: []string{"/redfish/v1/Managers/1"},
|
||||
}, signals)
|
||||
seeds := resolved.SeedPaths
|
||||
joined := strings.Join(seeds, "\n")
|
||||
for _, noisy := range []string{
|
||||
"/redfish/v1/Fabrics",
|
||||
@@ -2697,7 +3077,43 @@ func TestRedfishSnapshotPrioritySeeds_DefaultSkipsNoisyBranches(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
func TestShouldPrefetchCriticalPath_UsesPrefetchPolicy(t *testing.T) {
|
||||
tuning := redfishprofile.AcquisitionTuning{
|
||||
PrefetchPolicy: redfishprofile.AcquisitionPrefetchPolicy{
|
||||
IncludeSuffixes: []string{"/Storage", "/Oem/Public"},
|
||||
ExcludeContains: []string{"/Assembly"},
|
||||
},
|
||||
}
|
||||
if !shouldPrefetchCriticalPath("/redfish/v1/Systems/1/Storage", tuning) {
|
||||
t.Fatal("expected storage path to be prefetched when included by policy")
|
||||
}
|
||||
if !shouldPrefetchCriticalPath("/redfish/v1/Systems/1/Oem/Public", tuning) {
|
||||
t.Fatal("expected OEM public path to be prefetched when included by policy")
|
||||
}
|
||||
if shouldPrefetchCriticalPath("/redfish/v1/Chassis/1/Assembly", tuning) {
|
||||
t.Fatal("expected excluded path to skip prefetch")
|
||||
}
|
||||
if shouldPrefetchCriticalPath("/redfish/v1/Chassis/1/Power", redfishprofile.AcquisitionTuning{}) {
|
||||
t.Fatal("expected empty prefetch policy to disable suffix-based prefetch")
|
||||
}
|
||||
}
|
||||
|
||||
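Review note: the implementation of `shouldPrefetchCriticalPath` is not part of this hunk, so the following is only a minimal sketch of a predicate that would satisfy the four assertions in `TestShouldPrefetchCriticalPath_UsesPrefetchPolicy`. The type `prefetchPolicy` and the function `shouldPrefetch` are hypothetical stand-ins, not the real `redfishprofile.AcquisitionPrefetchPolicy` or the collector's function. The sketch assumes exclusions are checked before inclusions and that an empty policy matches nothing.

```go
package main

import (
	"fmt"
	"strings"
)

// prefetchPolicy is a hypothetical stand-in for the prefetch policy carried
// by the acquisition tuning; only the two fields used by the test are modeled.
type prefetchPolicy struct {
	IncludeSuffixes []string
	ExcludeContains []string
}

// shouldPrefetch reports whether path ends with one of the include suffixes
// and contains none of the exclude substrings. Exclusions win (assumption).
func shouldPrefetch(path string, policy prefetchPolicy) bool {
	path = strings.TrimRight(strings.TrimSpace(path), "/")
	for _, needle := range policy.ExcludeContains {
		if needle != "" && strings.Contains(path, needle) {
			return false
		}
	}
	for _, suffix := range policy.IncludeSuffixes {
		if suffix != "" && strings.HasSuffix(path, suffix) {
			return true
		}
	}
	// No include suffix matched: an empty policy therefore disables prefetch.
	return false
}

func main() {
	policy := prefetchPolicy{
		IncludeSuffixes: []string{"/Storage", "/Oem/Public"},
		ExcludeContains: []string{"/Assembly"},
	}
	for _, p := range []string{
		"/redfish/v1/Systems/1/Storage",
		"/redfish/v1/Systems/1/Oem/Public",
		"/redfish/v1/Chassis/1/Assembly",
	} {
		fmt.Println(p, shouldPrefetch(p, policy))
	}
	fmt.Println("/redfish/v1/Chassis/1/Power", shouldPrefetch("/redfish/v1/Chassis/1/Power", prefetchPolicy{}))
}
```

Keeping the predicate purely data-driven is what lets profiles own the suffix lists, which matches the commit's stated goal of having no hardcoded path lists in generic code.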
func TestRedfishPrefetchTargets_FilterNoisyBranches(t *testing.T) {
|
||||
tuning := redfishprofile.AcquisitionTuning{
|
||||
PrefetchPolicy: redfishprofile.AcquisitionPrefetchPolicy{
|
||||
IncludeSuffixes: []string{
|
||||
"/Memory",
|
||||
"/Oem/Public/FRU",
|
||||
"/Drives",
|
||||
"/NetworkProtocol",
|
||||
},
|
||||
ExcludeContains: []string{
|
||||
"/Backplanes",
|
||||
"/Sensors",
|
||||
"/LogServices",
|
||||
},
|
||||
},
|
||||
}
|
||||
critical := []string{
|
||||
"/redfish/v1/Systems/1",
|
||||
"/redfish/v1/Systems/1/Memory",
|
||||
@@ -2708,7 +3124,7 @@ func TestRedfishPrefetchTargets_FilterNoisyBranches(t *testing.T) {
|
||||
"/redfish/v1/Managers/1/LogServices",
|
||||
"/redfish/v1/Managers/1/NetworkProtocol",
|
||||
}
|
||||
got := redfishPrefetchTargets(critical)
|
||||
got := redfishPrefetchTargets(critical, tuning)
|
||||
joined := strings.Join(got, "\n")
|
||||
for _, wanted := range []string{
|
||||
"/redfish/v1/Systems/1",
|
||||
|
||||
163
internal/collector/redfishprofile/acquisition.go
Normal file
@@ -0,0 +1,163 @@
package redfishprofile

import "strings"

func ResolveAcquisitionPlan(match MatchResult, plan AcquisitionPlan, discovered DiscoveredResources, signals MatchSignals) ResolvedAcquisitionPlan {
	seedGroups := [][]string{
		baselineSeedPaths(discovered),
		expandScopedSuffixes(discovered.SystemPaths, plan.ScopedPaths.SystemSeedSuffixes),
		expandScopedSuffixes(discovered.ChassisPaths, plan.ScopedPaths.ChassisSeedSuffixes),
		expandScopedSuffixes(discovered.ManagerPaths, plan.ScopedPaths.ManagerSeedSuffixes),
		plan.SeedPaths,
	}
	if plan.Mode == ModeFallback {
		seedGroups = append(seedGroups, plan.PlanBPaths)
	}

	criticalGroups := [][]string{
		baselineCriticalPaths(discovered),
		expandScopedSuffixes(discovered.SystemPaths, plan.ScopedPaths.SystemCriticalSuffixes),
		expandScopedSuffixes(discovered.ChassisPaths, plan.ScopedPaths.ChassisCriticalSuffixes),
		expandScopedSuffixes(discovered.ManagerPaths, plan.ScopedPaths.ManagerCriticalSuffixes),
		plan.CriticalPaths,
	}

	resolved := ResolvedAcquisitionPlan{
		Plan:          plan,
		SeedPaths:     mergeResolvedPaths(seedGroups...),
		CriticalPaths: mergeResolvedPaths(criticalGroups...),
	}
	for _, profile := range match.Profiles {
		profile.RefineAcquisitionPlan(&resolved, discovered, signals)
	}
	resolved.SeedPaths = mergeResolvedPaths(resolved.SeedPaths)
	resolved.CriticalPaths = mergeResolvedPaths(resolved.CriticalPaths, resolved.Plan.CriticalPaths)
	resolved.Plan.SeedPaths = mergeResolvedPaths(resolved.Plan.SeedPaths)
	resolved.Plan.CriticalPaths = mergeResolvedPaths(resolved.Plan.CriticalPaths)
	resolved.Plan.PlanBPaths = mergeResolvedPaths(resolved.Plan.PlanBPaths)
	return resolved
}

func baselineSeedPaths(discovered DiscoveredResources) []string {
	var out []string
	add := func(p string) {
		if p = normalizePath(p); p != "" {
			out = append(out, p)
		}
	}

	add("/redfish/v1/UpdateService")
	add("/redfish/v1/UpdateService/FirmwareInventory")

	for _, p := range discovered.SystemPaths {
		add(p)
		add(joinPath(p, "/Bios"))
		add(joinPath(p, "/SecureBoot"))
		add(joinPath(p, "/Oem/Public"))
		add(joinPath(p, "/Oem/Public/FRU"))
		add(joinPath(p, "/Processors"))
		add(joinPath(p, "/Memory"))
		add(joinPath(p, "/EthernetInterfaces"))
		add(joinPath(p, "/NetworkInterfaces"))
		add(joinPath(p, "/PCIeDevices"))
		add(joinPath(p, "/PCIeFunctions"))
		add(joinPath(p, "/Accelerators"))
		add(joinPath(p, "/GraphicsControllers"))
		add(joinPath(p, "/Storage"))
	}
	for _, p := range discovered.ChassisPaths {
		add(p)
		add(joinPath(p, "/Oem/Public"))
		add(joinPath(p, "/Oem/Public/FRU"))
		add(joinPath(p, "/PCIeDevices"))
		add(joinPath(p, "/PCIeSlots"))
		add(joinPath(p, "/NetworkAdapters"))
		add(joinPath(p, "/Drives"))
		add(joinPath(p, "/Power"))
	}
	for _, p := range discovered.ManagerPaths {
		add(p)
		add(joinPath(p, "/EthernetInterfaces"))
		add(joinPath(p, "/NetworkProtocol"))
	}
	return mergeResolvedPaths(out)
}

func baselineCriticalPaths(discovered DiscoveredResources) []string {
	var out []string
	for _, group := range [][]string{
		{"/redfish/v1"},
		discovered.SystemPaths,
		discovered.ChassisPaths,
		discovered.ManagerPaths,
	} {
		out = append(out, group...)
	}
	return mergeResolvedPaths(out)
}

func expandScopedSuffixes(basePaths, suffixes []string) []string {
	if len(basePaths) == 0 || len(suffixes) == 0 {
		return nil
	}
	out := make([]string, 0, len(basePaths)*len(suffixes))
	for _, basePath := range basePaths {
		basePath = normalizePath(basePath)
		if basePath == "" {
			continue
		}
		for _, suffix := range suffixes {
			suffix = strings.TrimSpace(suffix)
			if suffix == "" {
				continue
			}
			out = append(out, joinPath(basePath, suffix))
		}
	}
	return mergeResolvedPaths(out)
}

func mergeResolvedPaths(groups ...[]string) []string {
	seen := make(map[string]struct{})
	out := make([]string, 0)
	for _, group := range groups {
		for _, path := range group {
			path = normalizePath(path)
			if path == "" {
				continue
			}
			if _, ok := seen[path]; ok {
				continue
			}
			seen[path] = struct{}{}
			out = append(out, path)
		}
	}
	return out
}

func normalizePath(path string) string {
	path = strings.TrimSpace(path)
	if path == "" {
		return ""
	}
	if !strings.HasPrefix(path, "/") {
		path = "/" + path
	}
	return strings.TrimRight(path, "/")
}

func joinPath(base, rel string) string {
	base = normalizePath(base)
	rel = strings.TrimSpace(rel)
	if base == "" {
		return normalizePath(rel)
	}
	if rel == "" {
		return base
	}
	if !strings.HasPrefix(rel, "/") {
		rel = "/" + rel
	}
	return normalizePath(base + rel)
}
|
||||
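Review note: the path helpers in `acquisition.go` above carry the deterministic-ordering guarantee mentioned in the commit message, so a small standalone sketch may help when reading the tests. `normalizePath` below is copied from the diff; `mergePaths` is a condensed stand-in for `mergeResolvedPaths` (the name is hypothetical). It shows that differently spelled variants of the same path collapse into one entry and that first-seen order is preserved.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePath mirrors the helper in the diff: trim whitespace, force a
// leading slash, and drop trailing slashes.
func normalizePath(path string) string {
	path = strings.TrimSpace(path)
	if path == "" {
		return ""
	}
	if !strings.HasPrefix(path, "/") {
		path = "/" + path
	}
	return strings.TrimRight(path, "/")
}

// mergePaths is a condensed stand-in for mergeResolvedPaths: it flattens the
// groups, normalizes every entry, and keeps only the first occurrence.
func mergePaths(groups ...[]string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, group := range groups {
		for _, p := range group {
			p = normalizePath(p)
			if p == "" {
				continue
			}
			if _, ok := seen[p]; ok {
				continue
			}
			seen[p] = struct{}{}
			out = append(out, p)
		}
	}
	return out
}

func main() {
	baseline := []string{"/redfish/v1/Systems/1", "/redfish/v1/Systems/1/Memory/"}
	fromProfile := []string{"redfish/v1/Systems/1/Memory", " /redfish/v1/Chassis/GPU1 "}
	for _, p := range mergePaths(baseline, fromProfile) {
		fmt.Println(p)
	}
	// Prints:
	// /redfish/v1/Systems/1
	// /redfish/v1/Systems/1/Memory
	// /redfish/v1/Chassis/GPU1
}
```

One consequence worth knowing: a bare `"/"` normalizes to the empty string and is therefore dropped by the merge.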
100
internal/collector/redfishprofile/analysis.go
Normal file
@@ -0,0 +1,100 @@
package redfishprofile

import "strings"

func ResolveAnalysisPlan(match MatchResult, snapshot map[string]interface{}, discovered DiscoveredResources, signals MatchSignals) ResolvedAnalysisPlan {
	plan := ResolvedAnalysisPlan{
		Match:      match,
		Directives: AnalysisDirectives{},
	}
	if match.Mode == ModeFallback {
		plan.Directives.EnableProcessorGPUFallback = true
		plan.Directives.EnableSupermicroNVMeBackplane = true
		plan.Directives.EnableProcessorGPUChassisAlias = true
		plan.Directives.EnableGenericGraphicsControllerDedup = true
		plan.Directives.EnableStorageEnclosureRecovery = true
		plan.Directives.EnableKnownStorageControllerRecovery = true
		addAnalysisLookupMode(&plan, "msi-index")
		addAnalysisLookupMode(&plan, "hgx-alias")
		addAnalysisStorageDriveCollections(&plan,
			"/Storage/IntelVROC/Drives",
			"/Storage/IntelVROC/Controllers/1/Drives",
		)
		addAnalysisStorageVolumeCollections(&plan,
			"/Storage/IntelVROC/Volumes",
			"/Storage/HA-RAID/Volumes",
			"/Storage/MRVL.HA-RAID/Volumes",
		)
		addAnalysisNote(&plan, "fallback analysis enables broad recovery directives")
	}
	for _, profile := range match.Profiles {
		profile.ApplyAnalysisDirectives(&plan.Directives, signals)
	}
	for _, profile := range match.Profiles {
		profile.RefineAnalysisPlan(&plan, snapshot, discovered, signals)
	}
	return plan
}

func snapshotHasPathPrefix(snapshot map[string]interface{}, prefix string) bool {
	prefix = normalizePath(prefix)
	if prefix == "" {
		return false
	}
	for path := range snapshot {
		if strings.HasPrefix(normalizePath(path), prefix) {
			return true
		}
	}
	return false
}

func snapshotHasPathContaining(snapshot map[string]interface{}, sub string) bool {
	sub = strings.ToLower(strings.TrimSpace(sub))
	if sub == "" {
		return false
	}
	for path := range snapshot {
		if strings.Contains(strings.ToLower(path), sub) {
			return true
		}
	}
	return false
}

func snapshotHasGPUProcessor(snapshot map[string]interface{}, systemPaths []string) bool {
	for _, systemPath := range systemPaths {
		prefix := normalizePath(joinPath(systemPath, "/Processors")) + "/"
		for path, docAny := range snapshot {
			if !strings.HasPrefix(normalizePath(path), prefix) {
				continue
			}
			doc, ok := docAny.(map[string]interface{})
			if !ok {
				continue
			}
			if strings.EqualFold(strings.TrimSpace(asString(doc["ProcessorType"])), "GPU") {
				return true
			}
		}
	}
	return false
}

func snapshotHasStorageControllerHint(snapshot map[string]interface{}, needles ...string) bool {
	for _, needle := range needles {
		if snapshotHasPathContaining(snapshot, needle) {
			return true
		}
	}
	return false
}

func asString(v interface{}) string {
	switch x := v.(type) {
	case string:
		return x
	default:
		return ""
	}
}
|
||||
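Review note: the snapshot probes in `analysis.go` are what profiles would use inside `RefineAnalysisPlan` to decide whether a directive applies to a given dump. The sketch below is a simplified, standalone stand-in for `snapshotHasGPUProcessor` (the name `hasGPUProcessor` and the inlined path handling are assumptions made for illustration; the real helper goes through `normalizePath` and `joinPath`). It shows that only documents under `<system>/Processors/` are considered and that the `ProcessorType` comparison ignores case and surrounding whitespace.

```go
package main

import (
	"fmt"
	"strings"
)

// hasGPUProcessor reports whether any document stored under
// <systemPath>/Processors/ declares ProcessorType "GPU".
// Simplified stand-in for snapshotHasGPUProcessor from the diff.
func hasGPUProcessor(snapshot map[string]interface{}, systemPaths []string) bool {
	for _, systemPath := range systemPaths {
		prefix := strings.TrimRight(strings.TrimSpace(systemPath), "/") + "/Processors/"
		for path, docAny := range snapshot {
			if !strings.HasPrefix(path, prefix) {
				continue
			}
			doc, ok := docAny.(map[string]interface{})
			if !ok {
				continue // non-object payloads are ignored
			}
			procType, _ := doc["ProcessorType"].(string)
			if strings.EqualFold(strings.TrimSpace(procType), "GPU") {
				return true
			}
		}
	}
	return false
}

func main() {
	snapshot := map[string]interface{}{
		"/redfish/v1/Systems/1/Processors/CPU0": map[string]interface{}{
			"ProcessorType": "CPU",
		},
		"/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_SXM_1": map[string]interface{}{
			"ProcessorType": "gpu ",
		},
	}
	fmt.Println(hasGPUProcessor(snapshot, []string{"/redfish/v1/Systems/1"}))               // false
	fmt.Println(hasGPUProcessor(snapshot, []string{"/redfish/v1/Systems/HGX_Baseboard_0"})) // true
}
```

Because the probe iterates over a Go map, it is only suitable for yes/no checks like this one; anything that must be order-stable should go through a sorted or merged path list instead.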
405
internal/collector/redfishprofile/fixture_test.go
Normal file
@@ -0,0 +1,405 @@
package redfishprofile

import (
	"encoding/json"
	"os"
	"path/filepath"
	"strings"
	"testing"
)

func TestBuildAcquisitionPlan_Fixture_MSI_CG480(t *testing.T) {
	signals := loadProfileFixtureSignals(t, "msi-cg480.json")
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, discoveredResourcesFromSignals(signals), signals)

	if match.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", match.Mode)
	}
	assertProfileSelected(t, match, "msi")
	assertProfileSelected(t, match, "ami-family")
	assertProfileNotSelected(t, match, "hgx-topology")

	if plan.Tuning.PrefetchWorkers < 6 {
		t.Fatalf("expected msi prefetch worker tuning, got %d", plan.Tuning.PrefetchWorkers)
	}
	if !containsString(resolved.SeedPaths, "/redfish/v1/Chassis/GPU1") {
		t.Fatalf("expected MSI chassis GPU seed path")
	}
	if !containsString(resolved.CriticalPaths, "/redfish/v1/Chassis/GPU1/Sensors") {
		t.Fatal("expected MSI GPU sensor critical path")
	}
	if !containsString(resolved.Plan.PlanBPaths, "/redfish/v1/Chassis/GPU1/Sensors") {
		t.Fatal("expected MSI GPU sensor plan-b path")
	}
	if plan.Tuning.ETABaseline.SnapshotSeconds <= 0 {
		t.Fatal("expected MSI snapshot eta baseline")
	}
	if !plan.Tuning.PostProbePolicy.EnableNumericCollectionProbe {
		t.Fatal("expected MSI fixture to inherit generic numeric post-probe policy")
	}
	if !containsString(plan.ScopedPaths.SystemSeedSuffixes, "/SimpleStorage") {
		t.Fatal("expected MSI fixture to inherit generic SimpleStorage scoped seed suffix")
	}
	if !containsString(plan.ScopedPaths.SystemCriticalSuffixes, "/Memory") {
		t.Fatal("expected MSI fixture to inherit generic system critical suffixes")
	}
	if !containsString(plan.Tuning.PrefetchPolicy.IncludeSuffixes, "/Storage") {
		t.Fatal("expected MSI fixture to inherit generic storage prefetch policy")
	}
	if !containsString(plan.CriticalPaths, "/redfish/v1/UpdateService") {
		t.Fatal("expected MSI fixture to inherit generic top-level critical path")
	}
	if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
		t.Fatal("expected MSI fixture to enable profile plan-b")
	}
}

func TestBuildAcquisitionPlan_Fixture_MSI_CG480_CopyMatchesSameProfiles(t *testing.T) {
	originalSignals := loadProfileFixtureSignals(t, "msi-cg480.json")
	copySignals := loadProfileFixtureSignals(t, "msi-cg480-copy.json")
	originalMatch := MatchProfiles(originalSignals)
	copyMatch := MatchProfiles(copySignals)
	originalPlan := BuildAcquisitionPlan(originalSignals)
	copyPlan := BuildAcquisitionPlan(copySignals)
	originalResolved := ResolveAcquisitionPlan(originalMatch, originalPlan, discoveredResourcesFromSignals(originalSignals), originalSignals)
	copyResolved := ResolveAcquisitionPlan(copyMatch, copyPlan, discoveredResourcesFromSignals(copySignals), copySignals)

	assertSameProfileNames(t, originalMatch, copyMatch)
	if originalPlan.Tuning.PrefetchWorkers != copyPlan.Tuning.PrefetchWorkers {
		t.Fatalf("expected same MSI prefetch worker tuning, got %d vs %d", originalPlan.Tuning.PrefetchWorkers, copyPlan.Tuning.PrefetchWorkers)
	}
	if containsString(originalResolved.SeedPaths, "/redfish/v1/Chassis/GPU1") != containsString(copyResolved.SeedPaths, "/redfish/v1/Chassis/GPU1") {
		t.Fatal("expected same MSI GPU chassis seed presence in both fixtures")
	}
}

func TestBuildAcquisitionPlan_Fixture_MSI_CG290(t *testing.T) {
	signals := loadProfileFixtureSignals(t, "msi-cg290.json")
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, discoveredResourcesFromSignals(signals), signals)

	if match.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", match.Mode)
	}
	assertProfileSelected(t, match, "msi")
	assertProfileSelected(t, match, "ami-family")
	assertProfileNotSelected(t, match, "hgx-topology")

	if plan.Tuning.PrefetchWorkers < 6 {
		t.Fatalf("expected MSI prefetch worker tuning, got %d", plan.Tuning.PrefetchWorkers)
	}
	if !containsString(resolved.SeedPaths, "/redfish/v1/Chassis/GPU1") {
		t.Fatal("expected MSI chassis GPU seed path")
	}
}

func TestBuildAcquisitionPlan_Fixture_Supermicro_HGX(t *testing.T) {
	signals := loadProfileFixtureSignals(t, "supermicro-hgx.json")
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	discovered := discoveredResourcesFromSignals(signals)
	discovered.SystemPaths = dedupeSorted(append(discovered.SystemPaths, "/redfish/v1/Systems/HGX_Baseboard_0"))
	resolved := ResolveAcquisitionPlan(match, plan, discovered, signals)

	if match.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", match.Mode)
	}
	assertProfileSelected(t, match, "supermicro")
	assertProfileSelected(t, match, "hgx-topology")
	assertProfileNotSelected(t, match, "msi")

	if plan.Tuning.SnapshotMaxDocuments < 180000 {
		t.Fatalf("expected widened HGX snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
	}
	if plan.Tuning.NVMePostProbeEnabled == nil || *plan.Tuning.NVMePostProbeEnabled {
		t.Fatal("expected HGX fixture to disable NVMe post-probe")
	}
	if !containsString(resolved.SeedPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
		t.Fatal("expected HGX baseboard processors seed path")
	}
	if !containsString(resolved.CriticalPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
		t.Fatal("expected HGX baseboard processors critical path")
	}
	if !containsString(resolved.Plan.PlanBPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
		t.Fatal("expected HGX baseboard processors plan-b path")
	}
	if plan.Tuning.ETABaseline.SnapshotSeconds < 300 {
		t.Fatalf("expected HGX snapshot eta baseline, got %d", plan.Tuning.ETABaseline.SnapshotSeconds)
	}
	if !plan.Tuning.PostProbePolicy.EnableDirectNVMEDiskBayProbe {
		t.Fatal("expected HGX fixture to retain Supermicro direct NVMe disk bay probe policy")
	}
	if !containsString(plan.ScopedPaths.SystemCriticalSuffixes, "/Storage/IntelVROC/Drives") {
		t.Fatal("expected HGX fixture to inherit generic IntelVROC scoped critical suffix")
	}
	if !containsString(plan.ScopedPaths.ChassisCriticalSuffixes, "/Assembly") {
		t.Fatal("expected HGX fixture to inherit generic chassis critical suffixes")
	}
	if !containsString(plan.Tuning.PrefetchPolicy.ExcludeContains, "/Assembly") {
		t.Fatal("expected HGX fixture to inherit generic assembly prefetch exclusion")
	}
	if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
		t.Fatal("expected HGX fixture to enable profile plan-b")
	}
}

func TestBuildAcquisitionPlan_Fixture_Supermicro_OAM_NoHGX(t *testing.T) {
	signals := loadProfileFixtureSignals(t, "supermicro-oam-amd.json")
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, discoveredResourcesFromSignals(signals), signals)

	if match.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", match.Mode)
	}
	assertProfileSelected(t, match, "supermicro")
	assertProfileNotSelected(t, match, "hgx-topology")
	assertProfileNotSelected(t, match, "msi")

	if containsString(resolved.SeedPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
		t.Fatal("did not expect HGX baseboard processors seed path for OAM fixture")
	}
	if containsString(resolved.CriticalPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
		t.Fatal("did not expect HGX baseboard processors critical path for OAM fixture")
	}
	if !containsString(resolved.CriticalPaths, "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
		t.Fatal("expected Supermicro firmware critical path")
	}
	if !containsString(resolved.Plan.PlanBPaths, "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
		t.Fatal("expected Supermicro firmware plan-b path")
	}
	if plan.Tuning.SnapshotMaxDocuments != 150000 {
		t.Fatalf("expected generic supermicro snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
	}
	if plan.Tuning.NVMePostProbeEnabled != nil {
		t.Fatal("did not expect HGX NVMe tuning for OAM fixture")
	}
	if plan.Tuning.ETABaseline.SnapshotSeconds < 180 {
		t.Fatalf("expected Supermicro snapshot eta baseline, got %d", plan.Tuning.ETABaseline.SnapshotSeconds)
	}
	if !plan.Tuning.PostProbePolicy.EnableDirectNVMEDiskBayProbe {
		t.Fatal("expected Supermicro OAM fixture to use direct NVMe disk bay probe policy")
	}
	if !plan.Tuning.PostProbePolicy.EnableNumericCollectionProbe {
		t.Fatal("expected Supermicro OAM fixture to inherit generic numeric post-probe policy")
	}
	if !containsString(plan.ScopedPaths.SystemSeedSuffixes, "/Storage/IntelVROC") {
		t.Fatal("expected Supermicro OAM fixture to inherit generic IntelVROC scoped seed suffix")
	}
	if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
		t.Fatal("expected Supermicro OAM fixture to enable profile plan-b")
	}
}

func TestBuildAcquisitionPlan_Fixture_Dell_R750(t *testing.T) {
	signals := loadProfileFixtureSignals(t, "dell-r750.json")
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
		SystemPaths:  []string{"/redfish/v1/Systems/System.Embedded.1"},
		ChassisPaths: []string{"/redfish/v1/Chassis/System.Embedded.1"},
		ManagerPaths: []string{"/redfish/v1/Managers/1", "/redfish/v1/Managers/iDRAC.Embedded.1"},
	}, signals)

	if match.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", match.Mode)
	}
	assertProfileSelected(t, match, "dell")
	assertProfileNotSelected(t, match, "supermicro")
	assertProfileNotSelected(t, match, "hgx-topology")
	assertProfileNotSelected(t, match, "msi")

	if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
		t.Fatal("expected dell fixture to enable profile plan-b")
	}
	if !containsString(resolved.SeedPaths, "/redfish/v1/Managers/iDRAC.Embedded.1") {
		t.Fatal("expected Dell refinement to add iDRAC manager seed path")
	}
	if !containsString(resolved.CriticalPaths, "/redfish/v1/Managers/iDRAC.Embedded.1") {
		t.Fatal("expected Dell refinement to add iDRAC manager critical path")
	}
	directives := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, signals).Directives
	if !directives.EnableGenericGraphicsControllerDedup {
		t.Fatal("expected dell fixture to enable graphics controller dedup")
	}
}

func TestBuildAcquisitionPlan_Fixture_AMI_Generic(t *testing.T) {
	signals := loadProfileFixtureSignals(t, "ami-generic.json")
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)

	if match.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", match.Mode)
	}
	assertProfileSelected(t, match, "ami-family")
	assertProfileNotSelected(t, match, "msi")
	assertProfileNotSelected(t, match, "supermicro")
	assertProfileNotSelected(t, match, "dell")
	assertProfileNotSelected(t, match, "hgx-topology")

	if plan.Tuning.PrefetchEnabled == nil || !*plan.Tuning.PrefetchEnabled {
		t.Fatal("expected ami-family fixture to force prefetch enabled")
	}
	if !containsString(plan.SeedPaths, "/redfish/v1/Oem/Ami") {
		t.Fatal("expected ami-family fixture seed path /redfish/v1/Oem/Ami")
	}
	if !containsString(plan.SeedPaths, "/redfish/v1/Oem/Ami/InventoryData/Status") {
		t.Fatal("expected ami-family fixture seed path /redfish/v1/Oem/Ami/InventoryData/Status")
	}
	if !containsString(plan.CriticalPaths, "/redfish/v1/UpdateService") {
		t.Fatal("expected ami-family fixture to inherit generic critical path")
	}

	directives := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, signals).Directives
	if !directives.EnableGenericGraphicsControllerDedup {
		t.Fatal("expected ami-family fixture to enable graphics controller dedup")
	}
}

func TestBuildAcquisitionPlan_Fixture_UnknownVendor(t *testing.T) {
	signals := loadProfileFixtureSignals(t, "unknown-vendor.json")
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
		SystemPaths:  []string{"/redfish/v1/Systems/1"},
		ChassisPaths: []string{"/redfish/v1/Chassis/1"},
		ManagerPaths: []string{"/redfish/v1/Managers/1"},
	}, signals)

	if match.Mode != ModeFallback {
		t.Fatalf("expected fallback mode for unknown vendor, got %q", match.Mode)
	}
	if len(match.Profiles) == 0 {
		t.Fatal("expected fallback to aggregate profiles")
	}
	for _, profile := range match.Profiles {
		if !profile.SafeForFallback() {
			t.Fatalf("fallback mode included non-safe profile %q", profile.Name())
		}
	}

	if plan.Tuning.SnapshotMaxDocuments < 180000 {
		t.Fatalf("expected fallback to widen snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
	}
	if plan.Tuning.PrefetchEnabled == nil || !*plan.Tuning.PrefetchEnabled {
		t.Fatal("expected fallback fixture to force prefetch enabled")
	}
	if !containsString(resolved.CriticalPaths, "/redfish/v1/Systems/1") {
		t.Fatal("expected fallback resolved critical paths to include discovered system")
	}

	analysisPlan := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, signals)
	if !analysisPlan.Directives.EnableProcessorGPUFallback {
		t.Fatal("expected fallback fixture to enable processor GPU fallback")
	}
	if !analysisPlan.Directives.EnableStorageEnclosureRecovery {
		t.Fatal("expected fallback fixture to enable storage enclosure recovery")
	}
	if !analysisPlan.Directives.EnableGenericGraphicsControllerDedup {
		t.Fatal("expected fallback fixture to enable graphics controller dedup")
	}
}

func loadProfileFixtureSignals(t *testing.T, fixtureName string) MatchSignals {
	t.Helper()
	path := filepath.Join("testdata", fixtureName)
	data, err := os.ReadFile(path)
	if err != nil {
		t.Fatalf("read fixture %s: %v", path, err)
	}
	var signals MatchSignals
	if err := json.Unmarshal(data, &signals); err != nil {
		t.Fatalf("decode fixture %s: %v", path, err)
	}
	return normalizeSignals(signals)
}

func assertProfileSelected(t *testing.T, match MatchResult, want string) {
	t.Helper()
	for _, profile := range match.Profiles {
		if profile.Name() == want {
			return
		}
	}
	t.Fatalf("expected profile %q in %v", want, profileNames(match))
}

func assertProfileNotSelected(t *testing.T, match MatchResult, want string) {
	t.Helper()
	for _, profile := range match.Profiles {
		if profile.Name() == want {
			t.Fatalf("did not expect profile %q in %v", want, profileNames(match))
		}
	}
}

func profileNames(match MatchResult) []string {
	out := make([]string, 0, len(match.Profiles))
	for _, profile := range match.Profiles {
		out = append(out, profile.Name())
	}
	return out
}

func assertSameProfileNames(t *testing.T, left, right MatchResult) {
	t.Helper()
	leftNames := profileNames(left)
	rightNames := profileNames(right)
	if len(leftNames) != len(rightNames) {
		t.Fatalf("profile stack size differs: %v vs %v", leftNames, rightNames)
	}
	for i := range leftNames {
		if leftNames[i] != rightNames[i] {
			t.Fatalf("profile stack differs: %v vs %v", leftNames, rightNames)
		}
	}
}

func containsString(items []string, want string) bool {
	for _, item := range items {
		if item == want {
			return true
		}
	}
	return false
}

func discoveredResourcesFromSignals(signals MatchSignals) DiscoveredResources {
	var discovered DiscoveredResources
	for _, hint := range signals.ResourceHints {
		memberPath := discoveredMemberPath(hint)
		switch {
		case strings.HasPrefix(memberPath, "/redfish/v1/Systems/"):
			discovered.SystemPaths = append(discovered.SystemPaths, memberPath)
		case strings.HasPrefix(memberPath, "/redfish/v1/Chassis/"):
			discovered.ChassisPaths = append(discovered.ChassisPaths, memberPath)
		case strings.HasPrefix(memberPath, "/redfish/v1/Managers/"):
			discovered.ManagerPaths = append(discovered.ManagerPaths, memberPath)
		}
	}
	discovered.SystemPaths = dedupeSorted(discovered.SystemPaths)
	discovered.ChassisPaths = dedupeSorted(discovered.ChassisPaths)
	discovered.ManagerPaths = dedupeSorted(discovered.ManagerPaths)
	return discovered
}

// discoveredMemberPath trims a resource hint down to its collection member,
// e.g. /redfish/v1/Chassis/GPU4/Sensors becomes /redfish/v1/Chassis/GPU4.
// It returns "" for anything outside Systems, Chassis, or Managers.
func discoveredMemberPath(path string) string {
	path = strings.TrimSpace(path)
	if path == "" {
		return ""
	}
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) < 4 || parts[0] != "redfish" || parts[1] != "v1" {
		return ""
	}
	switch parts[2] {
	case "Systems", "Chassis", "Managers":
		return "/" + strings.Join(parts[:4], "/")
	default:
		return ""
	}
}
122
internal/collector/redfishprofile/matcher.go
Normal file
@@ -0,0 +1,122 @@
package redfishprofile

import (
	"sort"

	"git.mchus.pro/mchus/logpile/internal/models"
)

const (
	ModeMatched  = "matched"
	ModeFallback = "fallback"
)

// MatchProfiles scores every builtin profile against signals. If the best
// score is at least 60, it returns all positively scoring profiles in matched
// mode; otherwise it returns every fallback-safe profile in fallback mode.
// Scores lists all builtins, with Active marking the selected ones.
func MatchProfiles(signals MatchSignals) MatchResult {
	type scored struct {
		profile Profile
		score   int
	}
	builtins := BuiltinProfiles()
	candidates := make([]scored, 0, len(builtins))
	allScores := make([]ProfileScore, 0, len(builtins))
	for _, profile := range builtins {
		score := profile.Match(signals)
		allScores = append(allScores, ProfileScore{
			Name:     profile.Name(),
			Score:    score,
			Priority: profile.Priority(),
		})
		if score <= 0 {
			continue
		}
		candidates = append(candidates, scored{profile: profile, score: score})
	}
	// Order scores by score (descending), then priority, then name.
	sort.Slice(allScores, func(i, j int) bool {
		if allScores[i].Score == allScores[j].Score {
			if allScores[i].Priority == allScores[j].Priority {
				return allScores[i].Name < allScores[j].Name
			}
			return allScores[i].Priority < allScores[j].Priority
		}
		return allScores[i].Score > allScores[j].Score
	})
	sort.Slice(candidates, func(i, j int) bool {
		if candidates[i].score == candidates[j].score {
			return candidates[i].profile.Priority() < candidates[j].profile.Priority()
		}
		return candidates[i].score > candidates[j].score
	})
	// Nothing scored, or the best score is below the match threshold:
	// fall back to every profile that is safe for fallback.
	if len(candidates) == 0 || candidates[0].score < 60 {
		profiles := make([]Profile, 0, len(builtins))
		active := make(map[string]struct{}, len(builtins))
		for _, profile := range builtins {
			if profile.SafeForFallback() {
				profiles = append(profiles, profile)
				active[profile.Name()] = struct{}{}
			}
		}
		sortProfiles(profiles)
		for i := range allScores {
			_, ok := active[allScores[i].Name]
			allScores[i].Active = ok
		}
		return MatchResult{Mode: ModeFallback, Profiles: profiles, Scores: allScores}
	}
	profiles := make([]Profile, 0, len(candidates))
	seen := make(map[string]struct{}, len(candidates))
	for _, candidate := range candidates {
		name := candidate.profile.Name()
		if _, ok := seen[name]; ok {
			continue
		}
		seen[name] = struct{}{}
		profiles = append(profiles, candidate.profile)
	}
	sortProfiles(profiles)
	for i := range allScores {
		_, ok := seen[allScores[i].Name]
		allScores[i].Active = ok
	}
	return MatchResult{Mode: ModeMatched, Profiles: profiles, Scores: allScores}
}

// BuildAcquisitionPlan matches profiles against signals and lets each selected
// profile extend the plan. In fallback mode it also applies broad-crawl tuning
// (snapshot document cap, prefetch) and records an explanatory note.
func BuildAcquisitionPlan(signals MatchSignals) AcquisitionPlan {
	match := MatchProfiles(signals)
	plan := AcquisitionPlan{Mode: match.Mode}
	for _, profile := range match.Profiles {
		plan.Profiles = append(plan.Profiles, profile.Name())
		profile.ExtendAcquisitionPlan(&plan, signals)
	}
	plan.Profiles = dedupeSorted(plan.Profiles)
	plan.SeedPaths = dedupeSorted(plan.SeedPaths)
	plan.CriticalPaths = dedupeSorted(plan.CriticalPaths)
	plan.PlanBPaths = dedupeSorted(plan.PlanBPaths)
	plan.Notes = dedupeSorted(plan.Notes)
	if plan.Mode == ModeFallback {
		ensureSnapshotMaxDocuments(&plan, 180000)
		ensurePrefetchEnabled(&plan, true)
		addPlanNote(&plan, "fallback acquisition expands safe profile probes")
	}
	return plan
}

// ApplyAnalysisProfiles runs PostAnalyze for every profile selected by
// MatchProfiles and returns the match result.
func ApplyAnalysisProfiles(result *models.AnalysisResult, snapshot map[string]interface{}, signals MatchSignals) MatchResult {
	match := MatchProfiles(signals)
	for _, profile := range match.Profiles {
		profile.PostAnalyze(result, snapshot, signals)
	}
	return match
}

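// BuildAnalysisDirectives resolves directives from match alone, passing no
// snapshot, discovered resources, or signals to ResolveAnalysisPlan.
func BuildAnalysisDirectives(match MatchResult) AnalysisDirectives {
	return ResolveAnalysisPlan(match, nil, DiscoveredResources{}, MatchSignals{}).Directives
}
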
// sortProfiles orders profiles by ascending priority, then by name, so the
// resulting profile stack is deterministic.
func sortProfiles(profiles []Profile) {
	sort.Slice(profiles, func(i, j int) bool {
		if profiles[i].Priority() == profiles[j].Priority() {
			return profiles[i].Name() < profiles[j].Name()
		}
		return profiles[i].Priority() < profiles[j].Priority()
	})
}
390
internal/collector/redfishprofile/matcher_test.go
Normal file
@@ -0,0 +1,390 @@
package redfishprofile

import (
	"strings"
	"testing"
)

func TestMatchProfiles_UnknownVendorFallsBackToAggregateProfiles(t *testing.T) {
	match := MatchProfiles(MatchSignals{
		ServiceRootProduct: "Redfish Server",
	})
	if match.Mode != ModeFallback {
		t.Fatalf("expected fallback mode, got %q", match.Mode)
	}
	if len(match.Profiles) < 2 {
		t.Fatalf("expected aggregated fallback profiles, got %d", len(match.Profiles))
	}
}

func TestMatchProfiles_MSISelectsMatchedMode(t *testing.T) {
	match := MatchProfiles(MatchSignals{
		SystemManufacturer: "Micro-Star International Co., Ltd.",
		ResourceHints:      []string{"/redfish/v1/Chassis/GPU1"},
	})
	if match.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", match.Mode)
	}
	found := false
	for _, profile := range match.Profiles {
		if profile.Name() == "msi" {
			found = true
			break
		}
	}
	if !found {
		t.Fatal("expected msi profile to be selected")
	}
}

func TestBuildAcquisitionPlan_FallbackIncludesProfileNotes(t *testing.T) {
	plan := BuildAcquisitionPlan(MatchSignals{
		ServiceRootVendor: "AMI",
	})
	if len(plan.Profiles) == 0 {
		t.Fatal("expected acquisition plan profiles")
	}
	if len(plan.Notes) == 0 {
		t.Fatal("expected acquisition plan notes")
	}
}

func TestBuildAcquisitionPlan_FallbackAddsBroadCrawlTuning(t *testing.T) {
	plan := BuildAcquisitionPlan(MatchSignals{
		ServiceRootProduct: "Unknown Redfish",
	})
	if plan.Mode != ModeFallback {
		t.Fatalf("expected fallback mode, got %q", plan.Mode)
	}
	if plan.Tuning.SnapshotMaxDocuments < 180000 {
		t.Fatalf("expected widened snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
	}
	if plan.Tuning.PrefetchEnabled == nil || !*plan.Tuning.PrefetchEnabled {
		t.Fatal("expected fallback to force prefetch enabled")
	}
	if !plan.Tuning.RecoveryPolicy.EnableCriticalCollectionMemberRetry {
		t.Fatal("expected fallback to inherit critical member retry recovery")
	}
	if !plan.Tuning.RecoveryPolicy.EnableCriticalSlowProbe {
		t.Fatal("expected fallback to inherit critical slow probe recovery")
	}
}

func TestBuildAcquisitionPlan_HGXDisablesNVMePostProbe(t *testing.T) {
	plan := BuildAcquisitionPlan(MatchSignals{
		SystemModel:   "HGX B200",
		ResourceHints: []string{"/redfish/v1/Systems/HGX_Baseboard_0"},
	})
	if plan.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", plan.Mode)
	}
	if plan.Tuning.NVMePostProbeEnabled == nil || *plan.Tuning.NVMePostProbeEnabled {
		t.Fatal("expected hgx profile to disable NVMe post-probe")
	}
}

func TestResolveAcquisitionPlan_ExpandsScopedPaths(t *testing.T) {
	signals := MatchSignals{}
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
		SystemPaths: []string{"/redfish/v1/Systems/1", "/redfish/v1/Systems/2"},
	}, signals)
	joined := joinResolvedPaths(resolved.SeedPaths)
	for _, wanted := range []string{
		"/redfish/v1/Systems/1/SimpleStorage",
		"/redfish/v1/Systems/1/Storage/IntelVROC",
		"/redfish/v1/Systems/2/SimpleStorage",
		"/redfish/v1/Systems/2/Storage/IntelVROC",
	} {
		if !containsJoinedPath(joined, wanted) {
			t.Fatalf("expected resolved seed path %q", wanted)
		}
	}
}

func TestResolveAcquisitionPlan_CriticalBaselineIsShapedByProfiles(t *testing.T) {
	signals := MatchSignals{}
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
		SystemPaths:  []string{"/redfish/v1/Systems/1"},
		ChassisPaths: []string{"/redfish/v1/Chassis/1"},
		ManagerPaths: []string{"/redfish/v1/Managers/1"},
	}, signals)
	joined := joinResolvedPaths(resolved.CriticalPaths)
	for _, wanted := range []string{
		"/redfish/v1",
		"/redfish/v1/Systems/1",
		"/redfish/v1/Systems/1/Memory",
		"/redfish/v1/Chassis/1/Assembly",
		"/redfish/v1/Managers/1/NetworkProtocol",
		"/redfish/v1/UpdateService",
	} {
		if !containsJoinedPath(joined, wanted) {
			t.Fatalf("expected resolved critical path %q", wanted)
		}
	}
}

func TestResolveAcquisitionPlan_FallbackAppendsPlanBToSeeds(t *testing.T) {
	signals := MatchSignals{ServiceRootProduct: "Unknown Redfish"}
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	if plan.Mode != ModeFallback {
		t.Fatalf("expected fallback mode, got %q", plan.Mode)
	}
	plan.PlanBPaths = append(plan.PlanBPaths, "/redfish/v1/Systems/1/Oem/TestPlanB")
	resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
		SystemPaths: []string{"/redfish/v1/Systems/1"},
	}, signals)
	if !containsJoinedPath(joinResolvedPaths(resolved.SeedPaths), "/redfish/v1/Systems/1/Oem/TestPlanB") {
		t.Fatal("expected fallback resolved seeds to include plan-b path")
	}
}

func TestResolveAcquisitionPlan_MSIRefinesDiscoveredGPUChassis(t *testing.T) {
	signals := MatchSignals{
		SystemManufacturer: "Micro-Star International Co., Ltd.",
		ResourceHints:      []string{"/redfish/v1/Chassis/GPU1", "/redfish/v1/Chassis/GPU4/Sensors"},
	}
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
		ChassisPaths: []string{"/redfish/v1/Chassis/1", "/redfish/v1/Chassis/GPU1", "/redfish/v1/Chassis/GPU4"},
	}, signals)
	joinedSeeds := joinResolvedPaths(resolved.SeedPaths)
	joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
	if !containsJoinedPath(joinedSeeds, "/redfish/v1/Chassis/GPU1") || !containsJoinedPath(joinedSeeds, "/redfish/v1/Chassis/GPU4") {
		t.Fatal("expected MSI refinement to add discovered GPU chassis seed paths")
	}
	if containsJoinedPath(joinedSeeds, "/redfish/v1/Chassis/GPU2") {
		t.Fatal("did not expect undiscovered MSI GPU chassis in resolved seeds")
	}
	if !containsJoinedPath(joinedCritical, "/redfish/v1/Chassis/GPU1/Sensors") || !containsJoinedPath(joinedCritical, "/redfish/v1/Chassis/GPU4/Sensors") {
		t.Fatal("expected MSI refinement to add discovered GPU sensor critical paths")
	}
	if containsJoinedPath(joinedCritical, "/redfish/v1/Chassis/GPU3/Sensors") {
		t.Fatal("did not expect undiscovered MSI GPU sensor critical path")
	}
}

func TestResolveAcquisitionPlan_HGXRefinesDiscoveredBaseboardSystems(t *testing.T) {
	signals := MatchSignals{
		SystemManufacturer: "Supermicro",
		SystemModel:        "SYS-821GE-TNHR",
		ChassisModel:       "HGX B200",
		ResourceHints: []string{
			"/redfish/v1/Systems/HGX_Baseboard_0",
			"/redfish/v1/Systems/HGX_Baseboard_0/Processors",
			"/redfish/v1/Systems/1",
		},
	}
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
		SystemPaths: []string{"/redfish/v1/Systems/1", "/redfish/v1/Systems/HGX_Baseboard_0"},
	}, signals)
	joinedSeeds := joinResolvedPaths(resolved.SeedPaths)
	joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
	if !containsJoinedPath(joinedSeeds, "/redfish/v1/Systems/HGX_Baseboard_0") || !containsJoinedPath(joinedSeeds, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
		t.Fatal("expected HGX refinement to add discovered baseboard system paths")
	}
	if !containsJoinedPath(joinedCritical, "/redfish/v1/Systems/HGX_Baseboard_0") || !containsJoinedPath(joinedCritical, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
		t.Fatal("expected HGX refinement to add discovered baseboard critical paths")
	}
	if containsJoinedPath(joinedSeeds, "/redfish/v1/Systems/HGX_Baseboard_1") {
		t.Fatal("did not expect undiscovered HGX baseboard system path")
	}
}

func TestResolveAcquisitionPlan_SupermicroRefinesFirmwareInventoryFromHint(t *testing.T) {
	signals := MatchSignals{
		SystemManufacturer: "Supermicro",
		ResourceHints: []string{
			"/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory",
			"/redfish/v1/Managers/1/Oem/Supermicro/FanMode",
		},
	}
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
		ManagerPaths: []string{"/redfish/v1/Managers/1"},
	}, signals)
	joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
	if !containsJoinedPath(joinedCritical, "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
		t.Fatal("expected Supermicro refinement to add firmware inventory critical path")
	}
	if !containsJoinedPath(joinResolvedPaths(resolved.Plan.PlanBPaths), "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
		t.Fatal("expected Supermicro refinement to add firmware inventory plan-b path")
	}
}

func TestResolveAcquisitionPlan_DellRefinesDiscoveredIDRACManager(t *testing.T) {
	signals := MatchSignals{
		SystemManufacturer: "Dell Inc.",
		ServiceRootProduct: "iDRAC Redfish Service",
	}
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
		ManagerPaths: []string{"/redfish/v1/Managers/1", "/redfish/v1/Managers/iDRAC.Embedded.1"},
	}, signals)
	joinedSeeds := joinResolvedPaths(resolved.SeedPaths)
	joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
	if !containsJoinedPath(joinedSeeds, "/redfish/v1/Managers/iDRAC.Embedded.1") {
		t.Fatal("expected Dell refinement to add discovered iDRAC manager seed path")
	}
	if !containsJoinedPath(joinedCritical, "/redfish/v1/Managers/iDRAC.Embedded.1") {
		t.Fatal("expected Dell refinement to add discovered iDRAC manager critical path")
	}
}

func TestBuildAnalysisDirectives_SupermicroEnablesVendorStorageFallbacks(t *testing.T) {
|
||||
signals := MatchSignals{
|
||||
SystemManufacturer: "Supermicro",
|
||||
SystemModel: "SYS-821GE",
|
||||
}
|
||||
match := MatchProfiles(signals)
|
||||
plan := ResolveAnalysisPlan(match, map[string]interface{}{
|
||||
"/redfish/v1/Chassis/NVMeSSD.1.StorageBackplane/Drives": map[string]interface{}{},
|
||||
}, DiscoveredResources{}, signals)
|
||||
directives := plan.Directives
|
||||
if !directives.EnableSupermicroNVMeBackplane {
|
||||
t.Fatal("expected supermicro nvme backplane fallback")
|
||||
}
|
||||
}
|
||||
|
||||
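// joinResolvedPaths joins paths with newlines and wraps the result in a
// leading and trailing newline, so containsJoinedPath can match whole entries.
func joinResolvedPaths(paths []string) string {
	return "\n" + strings.Join(paths, "\n") + "\n"
}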
|
||||
|
||||
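// containsJoinedPath reports whether want appears as a complete,
// newline-delimited entry in joined (never as a prefix of a longer path).
func containsJoinedPath(joined, want string) bool {
	return strings.Contains(joined, "\n"+want+"\n")
}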
|
||||
|
||||
func TestBuildAnalysisDirectives_HGXEnablesGPUFallbacks(t *testing.T) {
|
||||
signals := MatchSignals{
|
||||
SystemManufacturer: "Supermicro",
|
||||
SystemModel: "SYS-821GE-TNHR",
|
||||
ChassisModel: "HGX B200",
|
||||
ResourceHints: []string{"/redfish/v1/Systems/HGX_Baseboard_0", "/redfish/v1/Chassis/HGX_Chassis_0/PCIeDevices/GPU_SXM_1"},
|
||||
}
|
||||
match := MatchProfiles(signals)
|
||||
plan := ResolveAnalysisPlan(match, map[string]interface{}{
|
||||
"/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_SXM_1": map[string]interface{}{"ProcessorType": "GPU"},
|
||||
"/redfish/v1/Chassis/HGX_Chassis_0/PCIeDevices/GPU_SXM_1": map[string]interface{}{},
|
||||
}, DiscoveredResources{
|
||||
SystemPaths: []string{"/redfish/v1/Systems/HGX_Baseboard_0"},
|
||||
}, signals)
|
||||
directives := plan.Directives
|
||||
if !directives.EnableProcessorGPUFallback {
|
||||
t.Fatal("expected processor GPU fallback for hgx profile")
|
||||
}
|
||||
if !directives.EnableProcessorGPUChassisAlias {
|
||||
t.Fatal("expected processor GPU chassis alias resolution for hgx profile")
|
||||
}
|
||||
if !directives.EnableGenericGraphicsControllerDedup {
|
||||
t.Fatal("expected graphics-controller dedup for hgx profile")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildAnalysisDirectives_MSIEnablesMSIChassisLookup(t *testing.T) {
|
||||
signals := MatchSignals{
|
||||
SystemManufacturer: "Micro-Star International Co., Ltd.",
|
||||
}
|
||||
match := MatchProfiles(signals)
|
||||
plan := ResolveAnalysisPlan(match, map[string]interface{}{
|
||||
"/redfish/v1/Systems/1/Processors/GPU1": map[string]interface{}{"ProcessorType": "GPU"},
|
||||
"/redfish/v1/Chassis/GPU1": map[string]interface{}{},
|
||||
}, DiscoveredResources{
|
||||
SystemPaths: []string{"/redfish/v1/Systems/1"},
|
||||
ChassisPaths: []string{"/redfish/v1/Chassis/GPU1"},
|
||||
}, signals)
|
||||
directives := plan.Directives
|
||||
if !directives.EnableMSIProcessorGPUChassisLookup {
|
||||
t.Fatal("expected MSI processor GPU chassis lookup")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildAnalysisDirectives_SupermicroEnablesStorageRecovery(t *testing.T) {
|
||||
signals := MatchSignals{
|
||||
SystemManufacturer: "Supermicro",
|
||||
}
|
||||
match := MatchProfiles(signals)
|
||||
plan := ResolveAnalysisPlan(match, map[string]interface{}{
|
||||
"/redfish/v1/Chassis/1/Drives": map[string]interface{}{},
|
||||
"/redfish/v1/Systems/1/Storage/IntelVROC": map[string]interface{}{},
|
||||
"/redfish/v1/Systems/1/Storage/IntelVROC/Drives": map[string]interface{}{},
|
||||
}, DiscoveredResources{}, signals)
|
||||
directives := plan.Directives
|
||||
if !directives.EnableStorageEnclosureRecovery {
|
||||
t.Fatal("expected storage enclosure recovery for supermicro")
|
||||
}
|
||||
if !directives.EnableKnownStorageControllerRecovery {
|
||||
t.Fatal("expected known storage controller recovery for supermicro")
|
||||
}
|
||||
}
|
||||
|
||||
func TestMatchProfiles_OrderingIsDeterministic(t *testing.T) {
|
||||
signals := MatchSignals{
|
||||
SystemManufacturer: "Micro-Star International Co., Ltd.",
|
||||
ResourceHints: []string{"/redfish/v1/Chassis/GPU1"},
|
||||
}
|
||||
first := MatchProfiles(signals)
|
||||
second := MatchProfiles(signals)
|
||||
if len(first.Profiles) != len(second.Profiles) {
|
||||
t.Fatalf("profile stack size differs across calls: %d vs %d", len(first.Profiles), len(second.Profiles))
|
||||
}
|
||||
for i := range first.Profiles {
|
||||
if first.Profiles[i].Name() != second.Profiles[i].Name() {
|
||||
t.Fatalf("profile ordering differs at index %d: %q vs %q", i, first.Profiles[i].Name(), second.Profiles[i].Name())
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestMatchProfiles_FallbackOrderingIsDeterministic(t *testing.T) {
|
||||
signals := MatchSignals{ServiceRootProduct: "Unknown Redfish"}
|
||||
first := MatchProfiles(signals)
|
||||
second := MatchProfiles(signals)
|
||||
if first.Mode != ModeFallback || second.Mode != ModeFallback {
|
||||
t.Fatalf("expected fallback mode in both calls, got %q and %q", first.Mode, second.Mode)
|
||||
}
|
||||
if len(first.Profiles) != len(second.Profiles) {
|
||||
t.Fatalf("fallback profile stack size differs: %d vs %d", len(first.Profiles), len(second.Profiles))
|
||||
}
|
||||
for i := range first.Profiles {
|
||||
if first.Profiles[i].Name() != second.Profiles[i].Name() {
|
||||
t.Fatalf("fallback profile ordering differs at index %d: %q vs %q", i, first.Profiles[i].Name(), second.Profiles[i].Name())
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestMatchProfiles_FallbackOnlySelectsSafeProfiles(t *testing.T) {
|
||||
match := MatchProfiles(MatchSignals{ServiceRootProduct: "Unknown Generic Redfish Server"})
|
||||
if match.Mode != ModeFallback {
|
||||
t.Fatalf("expected fallback mode, got %q", match.Mode)
|
||||
}
|
||||
for _, profile := range match.Profiles {
|
||||
if !profile.SafeForFallback() {
|
||||
t.Fatalf("fallback mode included non-safe profile %q", profile.Name())
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildAnalysisDirectives_GenericMatchedKeepsFallbacksDisabled(t *testing.T) {
|
||||
match := MatchResult{
|
||||
Mode: ModeMatched,
|
||||
Profiles: []Profile{genericProfile()},
|
||||
}
|
||||
directives := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, MatchSignals{}).Directives
|
||||
if directives.EnableProcessorGPUFallback {
|
||||
t.Fatal("did not expect processor GPU fallback for generic matched profile")
|
||||
}
|
||||
if directives.EnableSupermicroNVMeBackplane {
|
||||
t.Fatal("did not expect supermicro nvme fallback for generic matched profile")
|
||||
}
|
||||
if directives.EnableGenericGraphicsControllerDedup {
|
||||
t.Fatal("did not expect generic graphics-controller dedup for generic matched profile")
|
||||
}
|
||||
}
|
||||
33
internal/collector/redfishprofile/profile_ami.go
Normal file
@@ -0,0 +1,33 @@
|
||||
package redfishprofile
|
||||
|
||||
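// amiProfile scores 70 when the service-root vendor or product mentions AMI
// and a further 30 when any OEM namespace does. It seeds the AMI OEM paths
// and enables prefetch.
func amiProfile() Profile {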
|
||||
return staticProfile{
|
||||
name: "ami-family",
|
||||
priority: 10,
|
||||
safeForFallback: true,
|
||||
matchFn: func(s MatchSignals) int {
|
||||
score := 0
|
||||
if containsFold(s.ServiceRootVendor, "ami") || containsFold(s.ServiceRootProduct, "ami") {
|
||||
score += 70
|
||||
}
|
||||
for _, ns := range s.OEMNamespaces {
|
||||
if containsFold(ns, "ami") {
|
||||
score += 30
|
||||
break
|
||||
}
|
||||
}
|
||||
return min(score, 100)
|
||||
},
|
||||
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
|
||||
addPlanPaths(&plan.SeedPaths,
|
||||
"/redfish/v1/Oem/Ami",
|
||||
"/redfish/v1/Oem/Ami/InventoryData/Status",
|
||||
)
|
||||
ensurePrefetchEnabled(plan, true)
|
||||
addPlanNote(plan, "ami-family acquisition extensions enabled")
|
||||
},
|
||||
applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
|
||||
d.EnableGenericGraphicsControllerDedup = true
|
||||
},
|
||||
}
|
||||
}
|
||||
45
internal/collector/redfishprofile/profile_dell.go
Normal file
@@ -0,0 +1,45 @@
|
||||
package redfishprofile
|
||||
|
||||
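// dellProfile scores 80 for a Dell system or chassis manufacturer, 30 for a
// Dell OEM namespace, and 30 for an iDRAC service-root product, capped at 100.
// Its refinement promotes discovered iDRAC manager paths to seed and critical paths.
func dellProfile() Profile {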
|
||||
return staticProfile{
|
||||
name: "dell",
|
||||
priority: 20,
|
||||
safeForFallback: true,
|
||||
matchFn: func(s MatchSignals) int {
|
||||
score := 0
|
||||
if containsFold(s.SystemManufacturer, "dell") || containsFold(s.ChassisManufacturer, "dell") {
|
||||
score += 80
|
||||
}
|
||||
for _, ns := range s.OEMNamespaces {
|
||||
if containsFold(ns, "dell") {
|
||||
score += 30
|
||||
break
|
||||
}
|
||||
}
|
||||
if containsFold(s.ServiceRootProduct, "idrac") {
|
||||
score += 30
|
||||
}
|
||||
return min(score, 100)
|
||||
},
|
||||
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
|
||||
ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
|
||||
EnableProfilePlanB: true,
|
||||
})
|
||||
addPlanNote(plan, "dell iDRAC acquisition extensions enabled")
|
||||
},
|
||||
refineAcquisition: func(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, _ MatchSignals) {
|
||||
for _, managerPath := range discovered.ManagerPaths {
|
||||
if !containsFold(managerPath, "idrac") {
|
||||
continue
|
||||
}
|
||||
addPlanPaths(&resolved.SeedPaths, managerPath)
|
||||
addPlanPaths(&resolved.Plan.SeedPaths, managerPath)
|
||||
addPlanPaths(&resolved.CriticalPaths, managerPath)
|
||||
addPlanPaths(&resolved.Plan.CriticalPaths, managerPath)
|
||||
}
|
||||
},
|
||||
applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
|
||||
d.EnableGenericGraphicsControllerDedup = true
|
||||
},
|
||||
}
|
||||
}
|
||||
117
internal/collector/redfishprofile/profile_generic.go
Normal file
@@ -0,0 +1,117 @@
|
||||
package redfishprofile
|
||||
|
||||
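// genericProfile always matches with a constant low score of 10 and supplies
// the baseline prefetch, scoped-path, tuning, recovery, and rate policies.
func genericProfile() Profile {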
|
||||
return staticProfile{
|
||||
name: "generic",
|
||||
priority: 100,
|
||||
safeForFallback: true,
|
||||
matchFn: func(MatchSignals) int { return 10 },
|
||||
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
|
||||
ensurePrefetchPolicy(plan, AcquisitionPrefetchPolicy{
|
||||
IncludeSuffixes: []string{
|
||||
"/Bios",
|
||||
"/SecureBoot",
|
||||
"/Processors",
|
||||
"/Memory",
|
||||
"/Storage",
|
||||
"/SimpleStorage",
|
||||
"/PCIeDevices",
|
||||
"/PCIeFunctions",
|
||||
"/Accelerators",
|
||||
"/GraphicsControllers",
|
||||
"/EthernetInterfaces",
|
||||
"/NetworkInterfaces",
|
||||
"/NetworkAdapters",
|
||||
"/Drives",
|
||||
"/Power",
|
||||
"/PowerSubsystem/PowerSupplies",
|
||||
"/NetworkProtocol",
|
||||
"/UpdateService",
|
||||
"/UpdateService/FirmwareInventory",
|
||||
},
|
||||
ExcludeContains: []string{
|
||||
"/Fabrics",
|
||||
"/Backplanes",
|
||||
"/Boards",
|
||||
"/Assembly",
|
||||
"/Sensors",
|
||||
"/ThresholdSensors",
|
||||
"/DiscreteSensors",
|
||||
"/ThermalConfig",
|
||||
"/ThermalSubsystem",
|
||||
"/EnvironmentMetrics",
|
||||
"/Certificates",
|
||||
"/LogServices",
|
||||
},
|
||||
})
|
||||
ensureScopedPathPolicy(plan, AcquisitionScopedPathPolicy{
|
||||
SystemCriticalSuffixes: []string{
|
||||
"/Bios",
|
||||
"/SecureBoot",
|
||||
"/Oem/Public",
|
||||
"/Oem/Public/FRU",
|
||||
"/Processors",
|
||||
"/Memory",
|
||||
"/Storage",
|
||||
"/PCIeDevices",
|
||||
"/PCIeFunctions",
|
||||
"/Accelerators",
|
||||
"/GraphicsControllers",
|
||||
"/EthernetInterfaces",
|
||||
"/NetworkInterfaces",
|
||||
"/SimpleStorage",
|
||||
"/Storage/IntelVROC",
|
||||
"/Storage/IntelVROC/Drives",
|
||||
"/Storage/IntelVROC/Volumes",
|
||||
},
|
||||
ChassisCriticalSuffixes: []string{
|
||||
"/Oem/Public",
|
||||
"/Oem/Public/FRU",
|
||||
"/Power",
|
||||
"/NetworkAdapters",
|
||||
"/PCIeDevices",
|
||||
"/Accelerators",
|
||||
"/Drives",
|
||||
"/Assembly",
|
||||
},
|
||||
ManagerCriticalSuffixes: []string{
|
||||
"/NetworkProtocol",
|
||||
},
|
||||
SystemSeedSuffixes: []string{
|
||||
"/SimpleStorage",
|
||||
"/Storage/IntelVROC",
|
||||
"/Storage/IntelVROC/Drives",
|
||||
"/Storage/IntelVROC/Volumes",
|
||||
},
|
||||
})
|
||||
addPlanPaths(&plan.CriticalPaths,
|
||||
"/redfish/v1/UpdateService",
|
||||
"/redfish/v1/UpdateService/FirmwareInventory",
|
||||
)
|
||||
ensureSnapshotMaxDocuments(plan, 100000)
|
||||
ensureSnapshotWorkers(plan, 6)
|
||||
ensurePrefetchWorkers(plan, 4)
|
||||
ensureETABaseline(plan, AcquisitionETABaseline{
|
||||
DiscoverySeconds: 8,
|
||||
SnapshotSeconds: 90,
|
||||
PrefetchSeconds: 20,
|
||||
CriticalPlanBSeconds: 20,
|
||||
ProfilePlanBSeconds: 15,
|
||||
})
|
||||
ensurePostProbePolicy(plan, AcquisitionPostProbePolicy{
|
||||
EnableNumericCollectionProbe: true,
|
||||
})
|
||||
ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
|
||||
EnableCriticalCollectionMemberRetry: true,
|
||||
EnableCriticalSlowProbe: true,
|
||||
})
|
||||
ensureRatePolicy(plan, AcquisitionRatePolicy{
|
||||
TargetP95LatencyMS: 900,
|
||||
ThrottleP95LatencyMS: 1800,
|
||||
MinSnapshotWorkers: 2,
|
||||
MinPrefetchWorkers: 1,
|
||||
DisablePrefetchOnErrors: true,
|
||||
})
|
||||
},
|
||||
}
|
||||
}
|
||||
85
internal/collector/redfishprofile/profile_hgx.go
Normal file
@@ -0,0 +1,85 @@
|
||||
package redfishprofile
|
||||
|
||||
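// hgxProfile scores 70 when the system or chassis model mentions HGX and 20
// when any resource hint contains "hgx_" or "gpu_sxm". It is a topology
// profile and can match alongside a vendor profile.
func hgxProfile() Profile {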
|
||||
return staticProfile{
|
||||
name: "hgx-topology",
|
||||
priority: 30,
|
||||
safeForFallback: true,
|
||||
matchFn: func(s MatchSignals) int {
|
||||
score := 0
|
||||
if containsFold(s.SystemModel, "hgx") || containsFold(s.ChassisModel, "hgx") {
|
||||
score += 70
|
||||
}
|
||||
for _, hint := range s.ResourceHints {
|
||||
if containsFold(hint, "hgx_") || containsFold(hint, "gpu_sxm") {
|
||||
score += 20
|
||||
break
|
||||
}
|
||||
}
|
||||
return min(score, 100)
|
||||
},
|
||||
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
|
||||
ensureSnapshotMaxDocuments(plan, 180000)
|
||||
ensureSnapshotWorkers(plan, 4)
|
||||
ensurePrefetchWorkers(plan, 4)
|
||||
ensureNVMePostProbeEnabled(plan, false)
|
||||
ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
|
||||
EnableProfilePlanB: true,
|
||||
})
|
||||
ensureETABaseline(plan, AcquisitionETABaseline{
|
||||
DiscoverySeconds: 20,
|
||||
SnapshotSeconds: 300,
|
||||
PrefetchSeconds: 50,
|
||||
CriticalPlanBSeconds: 90,
|
||||
ProfilePlanBSeconds: 40,
|
||||
})
|
||||
ensureRatePolicy(plan, AcquisitionRatePolicy{
|
||||
TargetP95LatencyMS: 1500,
|
||||
ThrottleP95LatencyMS: 3000,
|
||||
MinSnapshotWorkers: 1,
|
||||
MinPrefetchWorkers: 1,
|
||||
DisablePrefetchOnErrors: true,
|
||||
})
|
||||
addPlanNote(plan, "hgx topology acquisition extensions enabled")
|
||||
},
|
||||
refineAcquisition: func(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, _ MatchSignals) {
|
||||
for _, systemPath := range discovered.SystemPaths {
|
||||
if !containsFold(systemPath, "hgx_baseboard_") {
|
||||
continue
|
||||
}
|
||||
addPlanPaths(&resolved.SeedPaths, systemPath, joinPath(systemPath, "/Processors"))
|
||||
addPlanPaths(&resolved.Plan.SeedPaths, systemPath, joinPath(systemPath, "/Processors"))
|
||||
addPlanPaths(&resolved.CriticalPaths, systemPath, joinPath(systemPath, "/Processors"))
|
||||
addPlanPaths(&resolved.Plan.CriticalPaths, systemPath, joinPath(systemPath, "/Processors"))
|
||||
addPlanPaths(&resolved.Plan.PlanBPaths, systemPath, joinPath(systemPath, "/Processors"))
|
||||
}
|
||||
},
|
||||
applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
|
||||
d.EnableGenericGraphicsControllerDedup = true
|
||||
d.EnableStorageEnclosureRecovery = true
|
||||
},
|
||||
refineAnalysis: func(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, _ MatchSignals) {
|
||||
if snapshotHasGPUProcessor(snapshot, discovered.SystemPaths) && (snapshotHasPathContaining(snapshot, "gpu_sxm") || snapshotHasPathContaining(snapshot, "hgx_")) {
|
||||
plan.Directives.EnableProcessorGPUFallback = true
|
||||
plan.Directives.EnableProcessorGPUChassisAlias = true
|
||||
addAnalysisLookupMode(plan, "hgx-alias")
|
||||
addAnalysisNote(plan, "hgx analysis enables processor-gpu alias fallback from snapshot topology")
|
||||
}
|
||||
if snapshotHasStorageControllerHint(snapshot, "/storage/intelvroc", "/storage/ha-raid", "/storage/mrvl.ha-raid") {
|
||||
plan.Directives.EnableKnownStorageControllerRecovery = true
|
||||
addAnalysisStorageDriveCollections(plan,
|
||||
"/Storage/IntelVROC/Drives",
|
||||
"/Storage/IntelVROC/Controllers/1/Drives",
|
||||
)
|
||||
addAnalysisStorageVolumeCollections(plan,
|
||||
"/Storage/IntelVROC/Volumes",
|
||||
"/Storage/HA-RAID/Volumes",
|
||||
"/Storage/MRVL.HA-RAID/Volumes",
|
||||
)
|
||||
}
|
||||
if snapshotHasPathContaining(snapshot, "/chassis/nvmessd.") && snapshotHasPathContaining(snapshot, ".storagebackplane") {
|
||||
plan.Directives.EnableSupermicroNVMeBackplane = true
|
||||
}
|
||||
},
|
||||
}
|
||||
}
|
||||
72
internal/collector/redfishprofile/profile_msi.go
Normal file
@@ -0,0 +1,72 @@
|
||||
package redfishprofile
|
||||
|
||||
import "strings"
|
||||
|
||||
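// msiProfile scores 80 for a "Micro-Star" manufacturer, 40 for "MSI", and 10
// when a resource hint starts with /redfish/v1/Chassis/GPU, capped at 100.
func msiProfile() Profile {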
|
||||
return staticProfile{
|
||||
name: "msi",
|
||||
priority: 20,
|
||||
safeForFallback: true,
|
||||
matchFn: func(s MatchSignals) int {
|
||||
score := 0
|
||||
if containsFold(s.SystemManufacturer, "micro-star") || containsFold(s.ChassisManufacturer, "micro-star") {
|
||||
score += 80
|
||||
}
|
||||
if containsFold(s.SystemManufacturer, "msi") || containsFold(s.ChassisManufacturer, "msi") {
|
||||
score += 40
|
||||
}
|
||||
for _, hint := range s.ResourceHints {
|
||||
if strings.HasPrefix(hint, "/redfish/v1/Chassis/GPU") {
|
||||
score += 10
|
||||
break
|
||||
}
|
||||
}
|
||||
return min(score, 100)
|
||||
},
|
||||
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
|
||||
ensureSnapshotWorkers(plan, 6)
|
||||
ensurePrefetchWorkers(plan, 8)
|
||||
ensureETABaseline(plan, AcquisitionETABaseline{
|
||||
DiscoverySeconds: 12,
|
||||
SnapshotSeconds: 120,
|
||||
PrefetchSeconds: 25,
|
||||
CriticalPlanBSeconds: 35,
|
||||
ProfilePlanBSeconds: 25,
|
||||
})
|
||||
ensureRatePolicy(plan, AcquisitionRatePolicy{
|
||||
TargetP95LatencyMS: 1000,
|
||||
ThrottleP95LatencyMS: 2200,
|
||||
MinSnapshotWorkers: 2,
|
||||
MinPrefetchWorkers: 2,
|
||||
DisablePrefetchOnErrors: true,
|
||||
})
|
||||
ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
|
||||
EnableProfilePlanB: true,
|
||||
})
|
||||
addPlanNote(plan, "msi gpu chassis probes enabled")
|
||||
},
|
||||
refineAcquisition: func(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, _ MatchSignals) {
|
||||
for _, chassisPath := range discovered.ChassisPaths {
|
||||
if !strings.HasPrefix(chassisPath, "/redfish/v1/Chassis/GPU") {
|
||||
continue
|
||||
}
|
||||
addPlanPaths(&resolved.SeedPaths, chassisPath)
|
||||
addPlanPaths(&resolved.Plan.SeedPaths, chassisPath)
|
||||
addPlanPaths(&resolved.CriticalPaths, joinPath(chassisPath, "/Sensors"))
|
||||
addPlanPaths(&resolved.Plan.CriticalPaths, joinPath(chassisPath, "/Sensors"))
|
||||
addPlanPaths(&resolved.Plan.PlanBPaths, joinPath(chassisPath, "/Sensors"))
|
||||
}
|
||||
},
|
||||
applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
|
||||
d.EnableGenericGraphicsControllerDedup = true
|
||||
},
|
||||
refineAnalysis: func(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, _ MatchSignals) {
|
||||
if snapshotHasGPUProcessor(snapshot, discovered.SystemPaths) && snapshotHasPathPrefix(snapshot, "/redfish/v1/Chassis/GPU") {
|
||||
plan.Directives.EnableProcessorGPUFallback = true
|
||||
plan.Directives.EnableMSIProcessorGPUChassisLookup = true
|
||||
addAnalysisLookupMode(plan, "msi-index")
|
||||
addAnalysisNote(plan, "msi analysis enables processor-gpu fallback from discovered GPU chassis")
|
||||
}
|
||||
},
|
||||
}
|
||||
}
|
||||
81
internal/collector/redfishprofile/profile_supermicro.go
Normal file
@@ -0,0 +1,81 @@
|
||||
package redfishprofile
|
||||
|
||||
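// supermicroProfile scores 80 for a Supermicro system or chassis manufacturer
// and 20 when a resource hint contains "hgx_baseboard" or "hgx_gpu_sxm".
func supermicroProfile() Profile {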
|
||||
return staticProfile{
|
||||
name: "supermicro",
|
||||
priority: 20,
|
||||
safeForFallback: true,
|
||||
matchFn: func(s MatchSignals) int {
|
||||
score := 0
|
||||
if containsFold(s.SystemManufacturer, "supermicro") || containsFold(s.ChassisManufacturer, "supermicro") {
|
||||
score += 80
|
||||
}
|
||||
for _, hint := range s.ResourceHints {
|
||||
if containsFold(hint, "hgx_baseboard") || containsFold(hint, "hgx_gpu_sxm") {
|
||||
score += 20
|
||||
break
|
||||
}
|
||||
}
|
||||
return min(score, 100)
|
||||
},
|
||||
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
|
||||
ensureSnapshotMaxDocuments(plan, 150000)
|
||||
ensureSnapshotWorkers(plan, 6)
|
||||
ensurePrefetchWorkers(plan, 4)
|
||||
ensureETABaseline(plan, AcquisitionETABaseline{
|
||||
DiscoverySeconds: 15,
|
||||
SnapshotSeconds: 180,
|
||||
PrefetchSeconds: 35,
|
||||
CriticalPlanBSeconds: 45,
|
||||
ProfilePlanBSeconds: 30,
|
||||
})
|
||||
ensurePostProbePolicy(plan, AcquisitionPostProbePolicy{
|
||||
EnableDirectNVMEDiskBayProbe: true,
|
||||
})
|
||||
ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
|
||||
EnableProfilePlanB: true,
|
||||
})
|
||||
ensureRatePolicy(plan, AcquisitionRatePolicy{
|
||||
TargetP95LatencyMS: 1200,
|
||||
ThrottleP95LatencyMS: 2400,
|
||||
MinSnapshotWorkers: 2,
|
||||
MinPrefetchWorkers: 1,
|
||||
DisablePrefetchOnErrors: true,
|
||||
})
|
||||
addPlanNote(plan, "supermicro acquisition extensions enabled")
|
||||
},
|
||||
refineAcquisition: func(resolved *ResolvedAcquisitionPlan, _ DiscoveredResources, signals MatchSignals) {
|
||||
for _, hint := range signals.ResourceHints {
|
||||
if normalizePath(hint) != "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory" {
|
||||
continue
|
||||
}
|
||||
addPlanPaths(&resolved.CriticalPaths, hint)
|
||||
addPlanPaths(&resolved.Plan.CriticalPaths, hint)
|
||||
addPlanPaths(&resolved.Plan.PlanBPaths, hint)
|
||||
break
|
||||
}
|
||||
},
|
||||
applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
|
||||
d.EnableStorageEnclosureRecovery = true
|
||||
},
|
||||
refineAnalysis: func(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, _ DiscoveredResources, _ MatchSignals) {
|
||||
if snapshotHasPathContaining(snapshot, "/chassis/nvmessd.") && snapshotHasPathContaining(snapshot, ".storagebackplane") {
|
||||
plan.Directives.EnableSupermicroNVMeBackplane = true
|
||||
addAnalysisNote(plan, "supermicro analysis enables NVMe backplane recovery from snapshot paths")
|
||||
}
|
||||
if snapshotHasStorageControllerHint(snapshot, "/storage/intelvroc", "/storage/ha-raid", "/storage/mrvl.ha-raid") {
|
||||
plan.Directives.EnableKnownStorageControllerRecovery = true
|
||||
addAnalysisStorageDriveCollections(plan,
|
||||
"/Storage/IntelVROC/Drives",
|
||||
"/Storage/IntelVROC/Controllers/1/Drives",
|
||||
)
|
||||
addAnalysisStorageVolumeCollections(plan,
|
||||
"/Storage/IntelVROC/Volumes",
|
||||
"/Storage/HA-RAID/Volumes",
|
||||
"/Storage/MRVL.HA-RAID/Volumes",
|
||||
)
|
||||
addAnalysisNote(plan, "supermicro analysis enables known storage-controller recovery from snapshot paths")
|
||||
}
|
||||
},
|
||||
}
|
||||
}
|
||||
228
internal/collector/redfishprofile/profiles_common.go
Normal file
@@ -0,0 +1,228 @@
|
||||
package redfishprofile
|
||||
|
||||
import (
|
||||
"strings"
|
||||
|
||||
"git.mchus.pro/mchus/logpile/internal/models"
|
||||
)
|
||||
|
||||
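// staticProfile implements Profile from a set of function hooks. Every hook
// except matchFn may be left nil, in which case the corresponding method is a no-op.
type staticProfile struct {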
|
||||
name string
|
||||
priority int
|
||||
safeForFallback bool
|
||||
matchFn func(MatchSignals) int
|
||||
extendAcquisition func(*AcquisitionPlan, MatchSignals)
|
||||
refineAcquisition func(*ResolvedAcquisitionPlan, DiscoveredResources, MatchSignals)
|
||||
applyAnalysisDirectives func(*AnalysisDirectives, MatchSignals)
|
||||
refineAnalysis func(*ResolvedAnalysisPlan, map[string]interface{}, DiscoveredResources, MatchSignals)
|
||||
postAnalyze func(*models.AnalysisResult, map[string]interface{}, MatchSignals)
|
||||
}
|
||||
|
||||
func (p staticProfile) Name() string { return p.name }
|
||||
func (p staticProfile) Priority() int { return p.priority }
|
||||
func (p staticProfile) Match(signals MatchSignals) int {
	if p.matchFn == nil {
		return 0 // a profile without a matcher never matches
	}
	return p.matchFn(normalizeSignals(signals))
}
|
||||
func (p staticProfile) SafeForFallback() bool { return p.safeForFallback }
|
||||
func (p staticProfile) ExtendAcquisitionPlan(plan *AcquisitionPlan, signals MatchSignals) {
|
||||
if p.extendAcquisition != nil {
|
||||
p.extendAcquisition(plan, normalizeSignals(signals))
|
||||
}
|
||||
}
|
||||
func (p staticProfile) RefineAcquisitionPlan(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, signals MatchSignals) {
|
||||
if p.refineAcquisition != nil {
|
||||
p.refineAcquisition(resolved, discovered, normalizeSignals(signals))
|
||||
}
|
||||
}
|
||||
func (p staticProfile) ApplyAnalysisDirectives(directives *AnalysisDirectives, signals MatchSignals) {
|
||||
if p.applyAnalysisDirectives != nil {
|
||||
p.applyAnalysisDirectives(directives, normalizeSignals(signals))
|
||||
}
|
||||
}
|
||||
func (p staticProfile) RefineAnalysisPlan(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, signals MatchSignals) {
|
||||
if p.refineAnalysis != nil {
|
||||
p.refineAnalysis(plan, snapshot, discovered, normalizeSignals(signals))
|
||||
}
|
||||
}
|
||||
func (p staticProfile) PostAnalyze(result *models.AnalysisResult, snapshot map[string]interface{}, signals MatchSignals) {
|
||||
if p.postAnalyze != nil {
|
||||
p.postAnalyze(result, snapshot, normalizeSignals(signals))
|
||||
}
|
||||
}
|
||||
|
||||
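// BuiltinProfiles returns a new slice of the built-in profiles in registration
// order: generic, ami-family, msi, supermicro, dell, hgx-topology.
func BuiltinProfiles() []Profile {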
|
||||
return []Profile{
|
||||
genericProfile(),
|
||||
amiProfile(),
|
||||
msiProfile(),
|
||||
supermicroProfile(),
|
||||
dellProfile(),
|
||||
hgxProfile(),
|
||||
}
|
||||
}
|
||||
|
||||
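// containsFold reports whether v contains sub, ignoring case and any
// leading or trailing whitespace on either argument.
func containsFold(v, sub string) bool {
	return strings.Contains(strings.ToLower(strings.TrimSpace(v)), strings.ToLower(strings.TrimSpace(sub)))
}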
|
||||
|
||||
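// addPlanPaths appends paths to *dst and then passes the result through
// dedupeSorted, so the original insertion order is not preserved.
func addPlanPaths(dst *[]string, paths ...string) {
	*dst = append(*dst, paths...)
	*dst = dedupeSorted(*dst)
}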
|
||||
|
||||
func addPlanNote(plan *AcquisitionPlan, note string) {
|
||||
if strings.TrimSpace(note) == "" {
|
||||
return
|
||||
}
|
||||
plan.Notes = append(plan.Notes, note)
|
||||
plan.Notes = dedupeSorted(plan.Notes)
|
||||
}
|
||||
|
||||
func addAnalysisNote(plan *ResolvedAnalysisPlan, note string) {
|
||||
if plan == nil || strings.TrimSpace(note) == "" {
|
||||
return
|
||||
}
|
||||
plan.Notes = append(plan.Notes, note)
|
||||
plan.Notes = dedupeSorted(plan.Notes)
|
||||
}
|
||||
|
||||
func addAnalysisLookupMode(plan *ResolvedAnalysisPlan, mode string) {
|
||||
if plan == nil || strings.TrimSpace(mode) == "" {
|
||||
return
|
||||
}
|
||||
plan.ProcessorGPUChassisLookupModes = dedupeSorted(append(plan.ProcessorGPUChassisLookupModes, mode))
|
||||
}
|
||||
|
||||
func addAnalysisStorageDriveCollections(plan *ResolvedAnalysisPlan, rels ...string) {
|
||||
if plan == nil {
|
||||
return
|
||||
}
|
||||
plan.KnownStorageDriveCollections = dedupeSorted(append(plan.KnownStorageDriveCollections, rels...))
|
||||
}
|
||||
|
||||
func addAnalysisStorageVolumeCollections(plan *ResolvedAnalysisPlan, rels ...string) {
|
||||
if plan == nil {
|
||||
return
|
||||
}
|
||||
plan.KnownStorageVolumeCollections = dedupeSorted(append(plan.KnownStorageVolumeCollections, rels...))
|
||||
}
|
||||
|
||||
// ensureSnapshotMaxDocuments raises the snapshot document limit to at least n.
// It never lowers an existing limit and ignores non-positive n.
func ensureSnapshotMaxDocuments(plan *AcquisitionPlan, n int) {
	if n <= 0 {
		return
	}
	if plan.Tuning.SnapshotMaxDocuments < n {
		plan.Tuning.SnapshotMaxDocuments = n
	}
}
|
||||
|
||||
func ensureSnapshotWorkers(plan *AcquisitionPlan, n int) {
|
||||
if n <= 0 {
|
||||
return
|
||||
}
|
||||
if plan.Tuning.SnapshotWorkers < n {
|
||||
plan.Tuning.SnapshotWorkers = n
|
||||
}
|
||||
}
|
||||
|
||||
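// ensurePrefetchEnabled sets the prefetch toggle explicitly, allocating the
// pointer if needed. Unlike the numeric ensure* helpers it overwrites any
// previous value, so the last profile applied wins.
func ensurePrefetchEnabled(plan *AcquisitionPlan, enabled bool) {
	if plan.Tuning.PrefetchEnabled == nil {
		plan.Tuning.PrefetchEnabled = new(bool)
	}
	*plan.Tuning.PrefetchEnabled = enabled
}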
|
||||
|
||||
func ensurePrefetchWorkers(plan *AcquisitionPlan, n int) {
|
||||
if n <= 0 {
|
||||
return
|
||||
}
|
||||
if plan.Tuning.PrefetchWorkers < n {
|
||||
plan.Tuning.PrefetchWorkers = n
|
||||
}
|
||||
}
|
||||
|
||||
func ensureNVMePostProbeEnabled(plan *AcquisitionPlan, enabled bool) {
|
||||
if plan.Tuning.NVMePostProbeEnabled == nil {
|
||||
plan.Tuning.NVMePostProbeEnabled = new(bool)
|
||||
}
|
||||
*plan.Tuning.NVMePostProbeEnabled = enabled
|
||||
}
|
||||
|
||||
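// ensureRatePolicy merges policy into the plan by keeping the larger value of
// each numeric field. DisablePrefetchOnErrors can be switched on but never off.
func ensureRatePolicy(plan *AcquisitionPlan, policy AcquisitionRatePolicy) {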
|
||||
if policy.TargetP95LatencyMS > plan.Tuning.RatePolicy.TargetP95LatencyMS {
|
||||
plan.Tuning.RatePolicy.TargetP95LatencyMS = policy.TargetP95LatencyMS
|
||||
}
|
||||
if policy.ThrottleP95LatencyMS > plan.Tuning.RatePolicy.ThrottleP95LatencyMS {
|
||||
plan.Tuning.RatePolicy.ThrottleP95LatencyMS = policy.ThrottleP95LatencyMS
|
||||
}
|
||||
if policy.MinSnapshotWorkers > plan.Tuning.RatePolicy.MinSnapshotWorkers {
|
||||
plan.Tuning.RatePolicy.MinSnapshotWorkers = policy.MinSnapshotWorkers
|
||||
}
|
||||
if policy.MinPrefetchWorkers > plan.Tuning.RatePolicy.MinPrefetchWorkers {
|
||||
plan.Tuning.RatePolicy.MinPrefetchWorkers = policy.MinPrefetchWorkers
|
||||
}
|
||||
if policy.DisablePrefetchOnErrors {
|
||||
plan.Tuning.RatePolicy.DisablePrefetchOnErrors = true
|
||||
}
|
||||
}
|
||||
|
||||
func ensureETABaseline(plan *AcquisitionPlan, baseline AcquisitionETABaseline) {
|
||||
if baseline.DiscoverySeconds > plan.Tuning.ETABaseline.DiscoverySeconds {
|
||||
plan.Tuning.ETABaseline.DiscoverySeconds = baseline.DiscoverySeconds
|
||||
}
|
||||
+	if baseline.SnapshotSeconds > plan.Tuning.ETABaseline.SnapshotSeconds {
+		plan.Tuning.ETABaseline.SnapshotSeconds = baseline.SnapshotSeconds
+	}
+	if baseline.PrefetchSeconds > plan.Tuning.ETABaseline.PrefetchSeconds {
+		plan.Tuning.ETABaseline.PrefetchSeconds = baseline.PrefetchSeconds
+	}
+	if baseline.CriticalPlanBSeconds > plan.Tuning.ETABaseline.CriticalPlanBSeconds {
+		plan.Tuning.ETABaseline.CriticalPlanBSeconds = baseline.CriticalPlanBSeconds
+	}
+	if baseline.ProfilePlanBSeconds > plan.Tuning.ETABaseline.ProfilePlanBSeconds {
+		plan.Tuning.ETABaseline.ProfilePlanBSeconds = baseline.ProfilePlanBSeconds
+	}
+}
+
+func ensurePostProbePolicy(plan *AcquisitionPlan, policy AcquisitionPostProbePolicy) {
+	if policy.EnableDirectNVMEDiskBayProbe {
+		plan.Tuning.PostProbePolicy.EnableDirectNVMEDiskBayProbe = true
+	}
+	if policy.EnableNumericCollectionProbe {
+		plan.Tuning.PostProbePolicy.EnableNumericCollectionProbe = true
+	}
+	if policy.EnableSensorCollectionProbe {
+		plan.Tuning.PostProbePolicy.EnableSensorCollectionProbe = true
+	}
+}
+
+func ensureRecoveryPolicy(plan *AcquisitionPlan, policy AcquisitionRecoveryPolicy) {
+	if policy.EnableCriticalCollectionMemberRetry {
+		plan.Tuning.RecoveryPolicy.EnableCriticalCollectionMemberRetry = true
+	}
+	if policy.EnableCriticalSlowProbe {
+		plan.Tuning.RecoveryPolicy.EnableCriticalSlowProbe = true
+	}
+	if policy.EnableProfilePlanB {
+		plan.Tuning.RecoveryPolicy.EnableProfilePlanB = true
+	}
+}
+
+func ensureScopedPathPolicy(plan *AcquisitionPlan, policy AcquisitionScopedPathPolicy) {
+	addPlanPaths(&plan.ScopedPaths.SystemSeedSuffixes, policy.SystemSeedSuffixes...)
+	addPlanPaths(&plan.ScopedPaths.SystemCriticalSuffixes, policy.SystemCriticalSuffixes...)
+	addPlanPaths(&plan.ScopedPaths.ChassisSeedSuffixes, policy.ChassisSeedSuffixes...)
+	addPlanPaths(&plan.ScopedPaths.ChassisCriticalSuffixes, policy.ChassisCriticalSuffixes...)
+	addPlanPaths(&plan.ScopedPaths.ManagerSeedSuffixes, policy.ManagerSeedSuffixes...)
+	addPlanPaths(&plan.ScopedPaths.ManagerCriticalSuffixes, policy.ManagerCriticalSuffixes...)
+}
+
+func ensurePrefetchPolicy(plan *AcquisitionPlan, policy AcquisitionPrefetchPolicy) {
+	addPlanPaths(&plan.Tuning.PrefetchPolicy.IncludeSuffixes, policy.IncludeSuffixes...)
+	addPlanPaths(&plan.Tuning.PrefetchPolicy.ExcludeContains, policy.ExcludeContains...)
+}
+
+func min(a, b int) int {
+	if a < b {
+		return a
+	}
+	return b
+}
98
internal/collector/redfishprofile/signals.go
Normal file
@@ -0,0 +1,98 @@
+package redfishprofile
+
+import "strings"
+
+func CollectSignals(serviceRootDoc, systemDoc, chassisDoc, managerDoc map[string]interface{}, resourceHints []string) MatchSignals {
+	signals := MatchSignals{
+		ServiceRootVendor:   lookupString(serviceRootDoc, "Vendor"),
+		ServiceRootProduct:  lookupString(serviceRootDoc, "Product"),
+		SystemManufacturer:  lookupString(systemDoc, "Manufacturer"),
+		SystemModel:         lookupString(systemDoc, "Model"),
+		SystemSKU:           lookupString(systemDoc, "SKU"),
+		ChassisManufacturer: lookupString(chassisDoc, "Manufacturer"),
+		ChassisModel:        lookupString(chassisDoc, "Model"),
+		ManagerManufacturer: lookupString(managerDoc, "Manufacturer"),
+		ResourceHints:       resourceHints,
+	}
+	signals.OEMNamespaces = dedupeSorted(append(
+		oemNamespaces(serviceRootDoc),
+		append(oemNamespaces(systemDoc), append(oemNamespaces(chassisDoc), oemNamespaces(managerDoc)...)...)...,
+	))
+	return normalizeSignals(signals)
+}
+
+func CollectSignalsFromTree(tree map[string]interface{}) MatchSignals {
+	getDoc := func(path string) map[string]interface{} {
+		if v, ok := tree[path]; ok {
+			if doc, ok := v.(map[string]interface{}); ok {
+				return doc
+			}
+		}
+		return nil
+	}
+
+	memberPath := func(collectionPath, fallbackPath string) string {
+		collection := getDoc(collectionPath)
+		if len(collection) != 0 {
+			if members, ok := collection["Members"].([]interface{}); ok && len(members) > 0 {
+				if ref, ok := members[0].(map[string]interface{}); ok {
+					if path := lookupString(ref, "@odata.id"); path != "" {
+						return path
+					}
+				}
+			}
+		}
+		return fallbackPath
+	}
+
+	systemPath := memberPath("/redfish/v1/Systems", "/redfish/v1/Systems/1")
+	chassisPath := memberPath("/redfish/v1/Chassis", "/redfish/v1/Chassis/1")
+	managerPath := memberPath("/redfish/v1/Managers", "/redfish/v1/Managers/1")
+
+	resourceHints := make([]string, 0, len(tree))
+	for path := range tree {
+		path = strings.TrimSpace(path)
+		if path == "" {
+			continue
+		}
+		resourceHints = append(resourceHints, path)
+	}
+
+	return CollectSignals(
+		getDoc("/redfish/v1"),
+		getDoc(systemPath),
+		getDoc(chassisPath),
+		getDoc(managerPath),
+		resourceHints,
+	)
+}
+
+func lookupString(doc map[string]interface{}, key string) string {
+	if len(doc) == 0 {
+		return ""
+	}
+	value, _ := doc[key]
+	if s, ok := value.(string); ok {
+		return strings.TrimSpace(s)
+	}
+	return ""
+}
+
+func oemNamespaces(doc map[string]interface{}) []string {
+	if len(doc) == 0 {
+		return nil
+	}
+	oem, ok := doc["Oem"].(map[string]interface{})
+	if !ok {
+		return nil
+	}
+	out := make([]string, 0, len(oem))
+	for key := range oem {
+		key = strings.TrimSpace(key)
+		if key == "" {
+			continue
+		}
+		out = append(out, key)
+	}
+	return out
+}
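Review note: the member-selection step in `CollectSignalsFromTree` can be sketched in isolation. The sketch below assumes the same tree shape (path to decoded JSON document); `firstMemberPath` and the sample tree are hypothetical names used only for illustration and are not part of this commit. It shows that the first `Members` entry wins and that the conventional `/1` path is used only when the collection is missing, empty, or malformed.

```go
package main

import (
	"fmt"
	"strings"
)

// firstMemberPath is a hypothetical stand-alone version of the memberPath
// closure: it returns the @odata.id of the first collection member, or
// fallback when the collection is absent, empty, or malformed.
func firstMemberPath(tree map[string]interface{}, collectionPath, fallback string) string {
	// Failed type assertions yield nil maps/slices, which are safe to read.
	collection, _ := tree[collectionPath].(map[string]interface{})
	members, _ := collection["Members"].([]interface{})
	if len(members) == 0 {
		return fallback
	}
	ref, _ := members[0].(map[string]interface{})
	id, _ := ref["@odata.id"].(string)
	if id = strings.TrimSpace(id); id != "" {
		return id
	}
	return fallback
}

func main() {
	tree := map[string]interface{}{
		"/redfish/v1/Systems": map[string]interface{}{
			"Members": []interface{}{
				map[string]interface{}{"@odata.id": "/redfish/v1/Systems/Self"},
			},
		},
	}
	// Collection present: the first member is used.
	fmt.Println(firstMemberPath(tree, "/redfish/v1/Systems", "/redfish/v1/Systems/1"))
	// Collection absent: the fallback path is used.
	fmt.Println(firstMemberPath(tree, "/redfish/v1/Chassis", "/redfish/v1/Chassis/1"))
}
```

One consequence worth keeping in mind when reading the fixtures: on multi-chassis systems only the first chassis member contributes `ChassisManufacturer` and `ChassisModel`, so the remaining chassis are visible to matchers only through `ResourceHints`.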
17
internal/collector/redfishprofile/testdata/ami-generic.json
vendored
Normal file
@@ -0,0 +1,17 @@
+{
+  "ServiceRootVendor": "AMI",
+  "ServiceRootProduct": "AMI Redfish Server",
+  "SystemManufacturer": "Gigabyte",
+  "SystemModel": "G292-Z42",
+  "SystemSKU": "",
+  "ChassisManufacturer": "",
+  "ChassisModel": "",
+  "ManagerManufacturer": "",
+  "OEMNamespaces": ["Ami"],
+  "ResourceHints": [
+    "/redfish/v1/Chassis/Self",
+    "/redfish/v1/Managers/Self",
+    "/redfish/v1/Oem/Ami",
+    "/redfish/v1/Systems/Self"
+  ]
+}
18
internal/collector/redfishprofile/testdata/dell-r750.json
vendored
Normal file
@@ -0,0 +1,18 @@
+{
+  "ServiceRootVendor": "",
+  "ServiceRootProduct": "iDRAC Redfish Service",
+  "SystemManufacturer": "Dell Inc.",
+  "SystemModel": "PowerEdge R750",
+  "SystemSKU": "0A42H9",
+  "ChassisManufacturer": "Dell Inc.",
+  "ChassisModel": "PowerEdge R750",
+  "ManagerManufacturer": "Dell Inc.",
+  "OEMNamespaces": ["Dell"],
+  "ResourceHints": [
+    "/redfish/v1/Chassis/System.Embedded.1",
+    "/redfish/v1/Managers/iDRAC.Embedded.1",
+    "/redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell",
+    "/redfish/v1/Systems/System.Embedded.1",
+    "/redfish/v1/Systems/System.Embedded.1/Storage"
+  ]
+}
33
internal/collector/redfishprofile/testdata/msi-cg290.json
vendored
Normal file
@@ -0,0 +1,33 @@
+{
+  "ServiceRootVendor": "AMI",
+  "ServiceRootProduct": "AMI Redfish Server",
+  "SystemManufacturer": "Micro-Star International Co., Ltd.",
+  "SystemModel": "CG290-S3063",
+  "SystemSKU": "S3063G290RAU4",
+  "ChassisManufacturer": "NVIDIA",
+  "ChassisModel": "",
+  "ManagerManufacturer": "",
+  "OEMNamespaces": ["Ami"],
+  "ResourceHints": [
+    "/redfish/v1/Chassis/GPU1",
+    "/redfish/v1/Chassis/GPU1/NetworkAdapters",
+    "/redfish/v1/Chassis/GPU1/Sensors",
+    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Power",
+    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_TLimit",
+    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Temperature",
+    "/redfish/v1/Chassis/GPU2",
+    "/redfish/v1/Chassis/GPU2/NetworkAdapters",
+    "/redfish/v1/Chassis/GPU2/Sensors",
+    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Power",
+    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_TLimit",
+    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Temperature",
+    "/redfish/v1/Chassis/GPU3",
+    "/redfish/v1/Chassis/GPU3/NetworkAdapters",
+    "/redfish/v1/Chassis/GPU3/Sensors",
+    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Power",
+    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_TLimit",
+    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Temperature",
+    "/redfish/v1/Chassis/GPU4",
+    "/redfish/v1/Chassis/GPU4/NetworkAdapters"
+  ]
+}
33
internal/collector/redfishprofile/testdata/msi-cg480-copy.json
vendored
Normal file
@@ -0,0 +1,33 @@
+{
+  "ServiceRootVendor": "AMI",
+  "ServiceRootProduct": "AMI Redfish Server",
+  "SystemManufacturer": "Micro-Star International Co., Ltd.",
+  "SystemModel": "CG480-S5063",
+  "SystemSKU": "5063G480RAE20",
+  "ChassisManufacturer": "NVIDIA",
+  "ChassisModel": "",
+  "ManagerManufacturer": "",
+  "OEMNamespaces": ["Ami"],
+  "ResourceHints": [
+    "/redfish/v1/Chassis/GPU1",
+    "/redfish/v1/Chassis/GPU1/NetworkAdapters",
+    "/redfish/v1/Chassis/GPU1/Sensors",
+    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Power",
+    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_TLimit",
+    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Temperature",
+    "/redfish/v1/Chassis/GPU2",
+    "/redfish/v1/Chassis/GPU2/NetworkAdapters",
+    "/redfish/v1/Chassis/GPU2/Sensors",
+    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Power",
+    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_TLimit",
+    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Temperature",
+    "/redfish/v1/Chassis/GPU3",
+    "/redfish/v1/Chassis/GPU3/NetworkAdapters",
+    "/redfish/v1/Chassis/GPU3/Sensors",
+    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Power",
+    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_TLimit",
+    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Temperature",
+    "/redfish/v1/Chassis/GPU4",
+    "/redfish/v1/Chassis/GPU4/NetworkAdapters"
+  ]
+}
33
internal/collector/redfishprofile/testdata/msi-cg480.json
vendored
Normal file
@@ -0,0 +1,33 @@
+{
+  "ServiceRootVendor": "AMI",
+  "ServiceRootProduct": "AMI Redfish Server",
+  "SystemManufacturer": "Micro-Star International Co., Ltd.",
+  "SystemModel": "CG480-S5063",
+  "SystemSKU": "5063G480RAE20",
+  "ChassisManufacturer": "NVIDIA",
+  "ChassisModel": "",
+  "ManagerManufacturer": "",
+  "OEMNamespaces": ["Ami"],
+  "ResourceHints": [
+    "/redfish/v1/Chassis/GPU1",
+    "/redfish/v1/Chassis/GPU1/NetworkAdapters",
+    "/redfish/v1/Chassis/GPU1/Sensors",
+    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Power",
+    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_TLimit",
+    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Temperature",
+    "/redfish/v1/Chassis/GPU2",
+    "/redfish/v1/Chassis/GPU2/NetworkAdapters",
+    "/redfish/v1/Chassis/GPU2/Sensors",
+    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Power",
+    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_TLimit",
+    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Temperature",
+    "/redfish/v1/Chassis/GPU3",
+    "/redfish/v1/Chassis/GPU3/NetworkAdapters",
+    "/redfish/v1/Chassis/GPU3/Sensors",
+    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Power",
+    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_TLimit",
+    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Temperature",
+    "/redfish/v1/Chassis/GPU4",
+    "/redfish/v1/Chassis/GPU4/NetworkAdapters"
+  ]
+}
33
internal/collector/redfishprofile/testdata/supermicro-hgx.json
vendored
Normal file
@@ -0,0 +1,33 @@
+{
+  "ServiceRootVendor": "Supermicro",
+  "ServiceRootProduct": "",
+  "SystemManufacturer": "Supermicro",
+  "SystemModel": "SYS-821GE-TNHR",
+  "SystemSKU": "0x1D1415D9",
+  "ChassisManufacturer": "Supermicro",
+  "ChassisModel": "X13DEG-OAD",
+  "ManagerManufacturer": "",
+  "OEMNamespaces": ["Supermicro"],
+  "ResourceHints": [
+    "/redfish/v1/Chassis/HGX_BMC_0",
+    "/redfish/v1/Chassis/HGX_BMC_0/Assembly",
+    "/redfish/v1/Chassis/HGX_BMC_0/Controls",
+    "/redfish/v1/Chassis/HGX_BMC_0/Drives",
+    "/redfish/v1/Chassis/HGX_BMC_0/EnvironmentMetrics",
+    "/redfish/v1/Chassis/HGX_BMC_0/LogServices",
+    "/redfish/v1/Chassis/HGX_BMC_0/PCIeDevices",
+    "/redfish/v1/Chassis/HGX_BMC_0/PCIeSlots",
+    "/redfish/v1/Chassis/HGX_BMC_0/PowerSubsystem",
+    "/redfish/v1/Chassis/HGX_BMC_0/PowerSubsystem/PowerSupplies",
+    "/redfish/v1/Chassis/HGX_BMC_0/Sensors",
+    "/redfish/v1/Chassis/HGX_BMC_0/Sensors/HGX_BMC_0_Temp_0",
+    "/redfish/v1/Chassis/HGX_BMC_0/ThermalSubsystem",
+    "/redfish/v1/Chassis/HGX_BMC_0/ThermalSubsystem/ThermalMetrics",
+    "/redfish/v1/Chassis/HGX_Chassis_0",
+    "/redfish/v1/Chassis/HGX_Chassis_0/Assembly",
+    "/redfish/v1/Chassis/HGX_Chassis_0/Controls",
+    "/redfish/v1/Chassis/HGX_Chassis_0/Controls/TotalGPU_Power_0",
+    "/redfish/v1/Chassis/HGX_Chassis_0/Drives",
+    "/redfish/v1/Chassis/HGX_Chassis_0/EnvironmentMetrics"
+  ]
+}
51
internal/collector/redfishprofile/testdata/supermicro-oam-amd.json
vendored
Normal file
@@ -0,0 +1,51 @@
+{
+  "ServiceRootVendor": "",
+  "ServiceRootProduct": "H12DGQ-NT6",
+  "SystemManufacturer": "Supermicro",
+  "SystemModel": "AS -4124GQ-TNMI",
+  "SystemSKU": "091715D9",
+  "ChassisManufacturer": "Supermicro",
+  "ChassisModel": "H12DGQ-NT6",
+  "ManagerManufacturer": "",
+  "OEMNamespaces": [
+    "Supermicro"
+  ],
+  "ResourceHints": [
+    "/redfish/v1/Chassis/1/PCIeDevices",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU1",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU1/PCIeFunctions",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU1/PCIeFunctions/1",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU2",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU2/PCIeFunctions",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU2/PCIeFunctions/1",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU3",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU3/PCIeFunctions",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU3/PCIeFunctions/1",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU4",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU4/PCIeFunctions",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU4/PCIeFunctions/1",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU5",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU5/PCIeFunctions",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU5/PCIeFunctions/1",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU6",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU6/PCIeFunctions",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU6/PCIeFunctions/1",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU7",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU7/PCIeFunctions",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU7/PCIeFunctions/1",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU8",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU8/PCIeFunctions",
+    "/redfish/v1/Chassis/1/PCIeDevices/GPU8/PCIeFunctions/1",
+    "/redfish/v1/Managers/1/Oem/Supermicro/FanMode",
+    "/redfish/v1/Oem/Supermicro/DumpService",
+    "/redfish/v1/UpdateService/FirmwareInventory/GPU1",
+    "/redfish/v1/UpdateService/FirmwareInventory/GPU2",
+    "/redfish/v1/UpdateService/FirmwareInventory/GPU3",
+    "/redfish/v1/UpdateService/FirmwareInventory/GPU4",
+    "/redfish/v1/UpdateService/FirmwareInventory/GPU5",
+    "/redfish/v1/UpdateService/FirmwareInventory/GPU6",
+    "/redfish/v1/UpdateService/FirmwareInventory/GPU7",
+    "/redfish/v1/UpdateService/FirmwareInventory/GPU8",
+    "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory"
+  ]
+}
16
internal/collector/redfishprofile/testdata/unknown-vendor.json
vendored
Normal file
@@ -0,0 +1,16 @@
+{
+  "ServiceRootVendor": "",
+  "ServiceRootProduct": "Redfish Service",
+  "SystemManufacturer": "",
+  "SystemModel": "",
+  "SystemSKU": "",
+  "ChassisManufacturer": "",
+  "ChassisModel": "",
+  "ManagerManufacturer": "",
+  "OEMNamespaces": [],
+  "ResourceHints": [
+    "/redfish/v1/Chassis/1",
+    "/redfish/v1/Managers/1",
+    "/redfish/v1/Systems/1"
+  ]
+}
167
internal/collector/redfishprofile/types.go
Normal file
@@ -0,0 +1,167 @@
+package redfishprofile
+
+import (
+	"sort"
+
+	"git.mchus.pro/mchus/logpile/internal/models"
+)
+
+type MatchSignals struct {
+	ServiceRootVendor   string
+	ServiceRootProduct  string
+	SystemManufacturer  string
+	SystemModel         string
+	SystemSKU           string
+	ChassisManufacturer string
+	ChassisModel        string
+	ManagerManufacturer string
+	OEMNamespaces       []string
+	ResourceHints       []string
+}
+
+type AcquisitionPlan struct {
+	Mode          string
+	Profiles      []string
+	SeedPaths     []string
+	CriticalPaths []string
+	PlanBPaths    []string
+	Notes         []string
+	ScopedPaths   AcquisitionScopedPathPolicy
+	Tuning        AcquisitionTuning
+}
+
+type DiscoveredResources struct {
+	SystemPaths  []string
+	ChassisPaths []string
+	ManagerPaths []string
+}
+
+type ResolvedAcquisitionPlan struct {
+	Plan          AcquisitionPlan
+	SeedPaths     []string
+	CriticalPaths []string
+}
+
+type AcquisitionScopedPathPolicy struct {
+	SystemSeedSuffixes      []string
+	SystemCriticalSuffixes  []string
+	ChassisSeedSuffixes     []string
+	ChassisCriticalSuffixes []string
+	ManagerSeedSuffixes     []string
+	ManagerCriticalSuffixes []string
+}
+
+type AcquisitionTuning struct {
+	SnapshotMaxDocuments int
+	SnapshotWorkers      int
+	PrefetchEnabled      *bool
+	PrefetchWorkers      int
+	NVMePostProbeEnabled *bool
+	RatePolicy           AcquisitionRatePolicy
+	ETABaseline          AcquisitionETABaseline
+	PostProbePolicy      AcquisitionPostProbePolicy
+	RecoveryPolicy       AcquisitionRecoveryPolicy
+	PrefetchPolicy       AcquisitionPrefetchPolicy
+}
+
+type AcquisitionRatePolicy struct {
+	TargetP95LatencyMS      int
+	ThrottleP95LatencyMS    int
+	MinSnapshotWorkers      int
+	MinPrefetchWorkers      int
+	DisablePrefetchOnErrors bool
+}
+
+type AcquisitionETABaseline struct {
+	DiscoverySeconds     int
+	SnapshotSeconds      int
+	PrefetchSeconds      int
+	CriticalPlanBSeconds int
+	ProfilePlanBSeconds  int
+}
+
+type AcquisitionPostProbePolicy struct {
+	EnableDirectNVMEDiskBayProbe bool
+	EnableNumericCollectionProbe bool
+	EnableSensorCollectionProbe  bool
+}
+
+type AcquisitionRecoveryPolicy struct {
+	EnableCriticalCollectionMemberRetry bool
+	EnableCriticalSlowProbe             bool
+	EnableProfilePlanB                  bool
+}
+
+type AcquisitionPrefetchPolicy struct {
+	IncludeSuffixes []string
+	ExcludeContains []string
+}
+
+type AnalysisDirectives struct {
+	EnableProcessorGPUFallback           bool
+	EnableSupermicroNVMeBackplane        bool
+	EnableProcessorGPUChassisAlias       bool
+	EnableGenericGraphicsControllerDedup bool
+	EnableMSIProcessorGPUChassisLookup   bool
+	EnableStorageEnclosureRecovery       bool
+	EnableKnownStorageControllerRecovery bool
+}
+
+type ResolvedAnalysisPlan struct {
+	Match                          MatchResult
+	Directives                     AnalysisDirectives
+	Notes                          []string
+	ProcessorGPUChassisLookupModes []string
+	KnownStorageDriveCollections   []string
+	KnownStorageVolumeCollections  []string
+}
+
+type Profile interface {
+	Name() string
+	Priority() int
+	Match(signals MatchSignals) int
+	SafeForFallback() bool
+	ExtendAcquisitionPlan(plan *AcquisitionPlan, signals MatchSignals)
+	RefineAcquisitionPlan(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, signals MatchSignals)
+	ApplyAnalysisDirectives(directives *AnalysisDirectives, signals MatchSignals)
+	RefineAnalysisPlan(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, signals MatchSignals)
+	PostAnalyze(result *models.AnalysisResult, snapshot map[string]interface{}, signals MatchSignals)
+}
+
+type MatchResult struct {
+	Mode     string
+	Profiles []Profile
+	Scores   []ProfileScore
+}
+
+type ProfileScore struct {
+	Name     string
+	Score    int
+	Active   bool
+	Priority int
+}
+
+func normalizeSignals(signals MatchSignals) MatchSignals {
+	signals.OEMNamespaces = dedupeSorted(signals.OEMNamespaces)
+	signals.ResourceHints = dedupeSorted(signals.ResourceHints)
+	return signals
+}
+
+func dedupeSorted(items []string) []string {
+	if len(items) == 0 {
+		return nil
+	}
+	set := make(map[string]struct{}, len(items))
+	for _, item := range items {
+		if item == "" {
+			continue
+		}
+		set[item] = struct{}{}
+	}
+	out := make([]string, 0, len(set))
+	for item := range set {
+		out = append(out, item)
+	}
+	sort.Strings(out)
+	return out
+}
@@ -19,13 +19,47 @@ type Request struct {
 }
 
 type Progress struct {
-	Status   string
-	Progress int
-	Message  string
+	Status        string
+	Progress      int
+	Message       string
+	CurrentPhase  string
+	ETASeconds    int
+	ActiveModules []ModuleActivation
+	ModuleScores  []ModuleScore
+	DebugInfo     *CollectDebugInfo
 }
 
 type ProgressFn func(Progress)
 
+type ModuleActivation struct {
+	Name  string
+	Score int
+}
+
+type ModuleScore struct {
+	Name     string
+	Score    int
+	Active   bool
+	Priority int
+}
+
+type CollectDebugInfo struct {
+	AdaptiveThrottled bool
+	SnapshotWorkers   int
+	PrefetchWorkers   int
+	PrefetchEnabled   *bool
+	PhaseTelemetry    []PhaseTelemetry
+}
+
+type PhaseTelemetry struct {
+	Phase     string
+	Requests  int
+	Errors    int
+	ErrorRate float64
+	AvgMS     int64
+	P95MS     int64
+}
+
 type ProbeResult struct {
 	Reachable bool
 	Protocol  string
63
internal/ingest/service.go
Normal file
@@ -0,0 +1,63 @@
+package ingest
+
+import (
+	"bytes"
+	"fmt"
+	"strings"
+
+	"git.mchus.pro/mchus/logpile/internal/collector"
+	"git.mchus.pro/mchus/logpile/internal/models"
+	"git.mchus.pro/mchus/logpile/internal/parser"
+)
+
+type Service struct{}
+
+type RedfishSourceMetadata struct {
+	TargetHost     string
+	SourceTimezone string
+	Filename       string
+}
+
+func NewService() *Service {
+	return &Service{}
+}
+
+func (s *Service) AnalyzeArchivePayload(filename string, payload []byte) (*models.AnalysisResult, string, error) {
+	p := parser.NewBMCParser()
+	if err := p.ParseFromReader(bytes.NewReader(payload), filename); err != nil {
+		return nil, "", err
+	}
+	return p.Result(), p.DetectedVendor(), nil
+}
+
+func (s *Service) AnalyzeRedfishRawPayloads(rawPayloads map[string]any, meta RedfishSourceMetadata) (*models.AnalysisResult, string, error) {
+	result, err := collector.ReplayRedfishFromRawPayloads(rawPayloads, nil)
+	if err != nil {
+		return nil, "", err
+	}
+	if result == nil {
+		return nil, "", fmt.Errorf("redfish replay returned nil result")
+	}
+	if strings.TrimSpace(result.Protocol) == "" {
+		result.Protocol = "redfish"
+	}
+	if strings.TrimSpace(result.SourceType) == "" {
+		result.SourceType = models.SourceTypeAPI
+	}
+	if strings.TrimSpace(result.TargetHost) == "" {
+		result.TargetHost = strings.TrimSpace(meta.TargetHost)
+	}
+	if strings.TrimSpace(result.SourceTimezone) == "" {
+		result.SourceTimezone = strings.TrimSpace(meta.SourceTimezone)
+	}
+	if strings.TrimSpace(result.Filename) == "" {
+		if strings.TrimSpace(meta.Filename) != "" {
+			result.Filename = strings.TrimSpace(meta.Filename)
+		} else if target := strings.TrimSpace(result.TargetHost); target != "" {
+			result.Filename = "redfish://" + target
+		} else {
+			result.Filename = "redfish://snapshot"
+		}
+	}
+	return result, "redfish", nil
+}
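Review note: the filename fallback in `AnalyzeRedfishRawPayloads` has three tiers, and it differs from the server code it replaces by preferring the package's own `Filename` metadata before synthesizing a `redfish://` name. A minimal sketch of that ordering follows; `resolveFilename` is a hypothetical helper written for illustration and does not exist in the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveFilename is a hypothetical extraction of the fallback chain:
// 1. keep an existing non-blank filename untouched,
// 2. otherwise use the trimmed metadata filename,
// 3. otherwise synthesize redfish://<target>,
// 4. otherwise fall back to redfish://snapshot.
func resolveFilename(current, metaFilename, targetHost string) string {
	if strings.TrimSpace(current) != "" {
		return current
	}
	if name := strings.TrimSpace(metaFilename); name != "" {
		return name
	}
	if target := strings.TrimSpace(targetHost); target != "" {
		return "redfish://" + target
	}
	return "redfish://snapshot"
}

func main() {
	fmt.Println(resolveFilename("", "export.json", "10.0.0.5")) // export.json
	fmt.Println(resolveFilename("", " ", "10.0.0.5"))           // redfish://10.0.0.5
	fmt.Println(resolveFilename("", "", ""))                    // redfish://snapshot
}
```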
@@ -91,6 +91,21 @@ func TestCollectLifecycleToTerminal(t *testing.T) {
 	if len(status.Logs) < 4 {
 		t.Fatalf("expected detailed logs, got %v", status.Logs)
 	}
+	if len(status.ActiveModules) == 0 {
+		t.Fatal("expected active modules in collect status")
+	}
+	if status.ActiveModules[0].Name == "" {
+		t.Fatal("expected active module name")
+	}
+	if len(status.ModuleScores) == 0 {
+		t.Fatal("expected module scores in collect status")
+	}
+	if status.DebugInfo == nil {
+		t.Fatal("expected debug info in collect status")
+	}
+	if len(status.DebugInfo.PhaseTelemetry) == 0 {
+		t.Fatal("expected phase telemetry in collect debug info")
+	}
 }
 
 func TestCollectCancel(t *testing.T) {
@@ -33,6 +33,28 @@ func (c *mockConnector) Probe(ctx context.Context, req collector.Request) (*coll
 
 func (c *mockConnector) Collect(ctx context.Context, req collector.Request, emit collector.ProgressFn) (*models.AnalysisResult, error) {
 	steps := []collector.Progress{
+		{
+			Status:   CollectStatusRunning,
+			Progress: 10,
+			Message:  "Подбор модулей Redfish...",
+			ActiveModules: []collector.ModuleActivation{
+				{Name: "supermicro", Score: 80},
+				{Name: "generic", Score: 10},
+			},
+			ModuleScores: []collector.ModuleScore{
+				{Name: "supermicro", Score: 80, Active: true, Priority: 20},
+				{Name: "generic", Score: 10, Active: true, Priority: 100},
+				{Name: "hgx-topology", Score: 0, Active: false, Priority: 30},
+			},
+			DebugInfo: &collector.CollectDebugInfo{
+				AdaptiveThrottled: false,
+				SnapshotWorkers:   6,
+				PrefetchWorkers:   4,
+				PhaseTelemetry: []collector.PhaseTelemetry{
+					{Phase: "discovery", Requests: 6, Errors: 0, ErrorRate: 0, AvgMS: 120, P95MS: 180},
+				},
+			},
+		},
 		{Status: CollectStatusRunning, Progress: 20, Message: "Подключение..."},
 		{Status: CollectStatusRunning, Progress: 50, Message: "Сбор инвентаря..."},
 		{Status: CollectStatusRunning, Progress: 80, Message: "Нормализация..."},
@@ -39,13 +39,18 @@ type CollectJobResponse struct {
 }
 
 type CollectJobStatusResponse struct {
-	JobID     string    `json:"job_id"`
-	Status    string    `json:"status"`
-	Progress  *int      `json:"progress,omitempty"`
-	Logs      []string  `json:"logs,omitempty"`
-	Error     string    `json:"error,omitempty"`
-	CreatedAt time.Time `json:"created_at,omitempty"`
-	UpdatedAt time.Time `json:"updated_at"`
+	JobID         string                `json:"job_id"`
+	Status        string                `json:"status"`
+	Progress      *int                  `json:"progress,omitempty"`
+	CurrentPhase  string                `json:"current_phase,omitempty"`
+	ETASeconds    *int                  `json:"eta_seconds,omitempty"`
+	Logs          []string              `json:"logs,omitempty"`
+	Error         string                `json:"error,omitempty"`
+	ActiveModules []CollectModuleStatus `json:"active_modules,omitempty"`
+	ModuleScores  []CollectModuleStatus `json:"module_scores,omitempty"`
+	DebugInfo     *CollectDebugInfo     `json:"debug_info,omitempty"`
+	CreatedAt     time.Time             `json:"created_at,omitempty"`
+	UpdatedAt     time.Time             `json:"updated_at"`
 }
 
 type CollectRequestMeta struct {
@@ -58,27 +63,64 @@ type CollectRequestMeta struct {
 }
 
 type Job struct {
-	ID          string
-	Status      string
-	Progress    int
-	Logs        []string
-	Error       string
-	CreatedAt   time.Time
-	UpdatedAt   time.Time
-	RequestMeta CollectRequestMeta
-	cancel      func()
+	ID            string
+	Status        string
+	Progress      int
+	CurrentPhase  string
+	ETASeconds    int
+	Logs          []string
+	Error         string
+	ActiveModules []CollectModuleStatus
+	ModuleScores  []CollectModuleStatus
+	DebugInfo     *CollectDebugInfo
+	CreatedAt     time.Time
+	UpdatedAt     time.Time
+	RequestMeta   CollectRequestMeta
+	cancel        func()
 }
 
+type CollectModuleStatus struct {
+	Name     string `json:"name"`
+	Score    int    `json:"score"`
+	Active   bool   `json:"active,omitempty"`
+	Priority int    `json:"priority,omitempty"`
+}
+
+type CollectDebugInfo struct {
+	AdaptiveThrottled bool                    `json:"adaptive_throttled"`
+	SnapshotWorkers   int                     `json:"snapshot_workers,omitempty"`
+	PrefetchWorkers   int                     `json:"prefetch_workers,omitempty"`
+	PrefetchEnabled   *bool                   `json:"prefetch_enabled,omitempty"`
+	PhaseTelemetry    []CollectPhaseTelemetry `json:"phase_telemetry,omitempty"`
+}
+
+type CollectPhaseTelemetry struct {
+	Phase     string  `json:"phase"`
+	Requests  int     `json:"requests,omitempty"`
+	Errors    int     `json:"errors,omitempty"`
+	ErrorRate float64 `json:"error_rate,omitempty"`
+	AvgMS     int64   `json:"avg_ms,omitempty"`
+	P95MS     int64   `json:"p95_ms,omitempty"`
+}
+
 func (j *Job) toStatusResponse() CollectJobStatusResponse {
 	progress := j.Progress
 	resp := CollectJobStatusResponse{
-		JobID:     j.ID,
-		Status:    j.Status,
-		Progress:  &progress,
-		Logs:      append([]string(nil), j.Logs...),
-		Error:     j.Error,
-		CreatedAt: j.CreatedAt,
-		UpdatedAt: j.UpdatedAt,
+		JobID:         j.ID,
+		Status:        j.Status,
+		Progress:      &progress,
+		CurrentPhase:  j.CurrentPhase,
+		Logs:          append([]string(nil), j.Logs...),
+		Error:         j.Error,
+		ActiveModules: append([]CollectModuleStatus(nil), j.ActiveModules...),
+		ModuleScores:  append([]CollectModuleStatus(nil), j.ModuleScores...),
+		DebugInfo:     cloneCollectDebugInfo(j.DebugInfo),
+		CreatedAt:     j.CreatedAt,
+		UpdatedAt:     j.UpdatedAt,
 	}
+	if j.ETASeconds > 0 {
+		eta := j.ETASeconds
+		resp.ETASeconds = &eta
+	}
 	return resp
 }
@@ -91,3 +133,16 @@ func (j *Job) toJobResponse(message string) CollectJobResponse {
 		CreatedAt: j.CreatedAt,
 	}
 }
+
+func cloneCollectDebugInfo(in *CollectDebugInfo) *CollectDebugInfo {
+	if in == nil {
+		return nil
+	}
+	out := *in
+	out.PhaseTelemetry = append([]CollectPhaseTelemetry(nil), in.PhaseTelemetry...)
+	if in.PrefetchEnabled != nil {
+		value := *in.PrefetchEnabled
+		out.PrefetchEnabled = &value
+	}
+	return &out
+}
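Review note: `cloneCollectDebugInfo` copies both the slice and the `*bool`, which matters because a plain struct copy would still share them with the live job. The sketch below reproduces the pattern on a reduced, hypothetical `debugInfo` type (not the commit's struct) to show that later mutations of the original do not leak into a status snapshot.

```go
package main

import "fmt"

// debugInfo is a reduced, hypothetical stand-in for CollectDebugInfo with
// one pointer field and one slice field.
type debugInfo struct {
	PrefetchEnabled *bool
	Phases          []string
}

// cloneDebugInfo follows the same steps as cloneCollectDebugInfo: shallow
// copy the struct, then replace the slice and pointer with private copies.
func cloneDebugInfo(in *debugInfo) *debugInfo {
	if in == nil {
		return nil
	}
	out := *in
	out.Phases = append([]string(nil), in.Phases...)
	if in.PrefetchEnabled != nil {
		value := *in.PrefetchEnabled
		out.PrefetchEnabled = &value
	}
	return &out
}

func main() {
	enabled := true
	live := &debugInfo{PrefetchEnabled: &enabled, Phases: []string{"discovery"}}
	snapshot := cloneDebugInfo(live)

	// Mutate the live value after the snapshot was taken.
	*live.PrefetchEnabled = false
	live.Phases[0] = "snapshot"

	fmt.Println(*snapshot.PrefetchEnabled, snapshot.Phases) // true [discovery]
}
```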
@@ -21,6 +21,7 @@ import (
 
 	"git.mchus.pro/mchus/logpile/internal/collector"
 	"git.mchus.pro/mchus/logpile/internal/exporter"
+	"git.mchus.pro/mchus/logpile/internal/ingest"
 	"git.mchus.pro/mchus/logpile/internal/models"
 	"git.mchus.pro/mchus/logpile/internal/parser"
 	chartviewer "reanimator/chart/viewer"
@@ -219,13 +220,12 @@ func (s *Server) analyzeUploadedFile(filename, mimeType string, payload []byte)
 		return nil, "", nil, fmt.Errorf("unsupported archive format: %s", strings.ToLower(filepath.Ext(filename)))
 	}
 
-	p := parser.NewBMCParser()
-	if err := p.ParseFromReader(bytes.NewReader(payload), filename); err != nil {
+	result, vendor, err := s.ingestService().AnalyzeArchivePayload(filename, payload)
+	if err != nil {
 		return nil, "", nil, err
 	}
-	result := p.Result()
 	applyArchiveSourceMetadata(result)
-	return result, p.DetectedVendor(), newRawExportFromUploadedFile(filename, mimeType, payload, result), nil
+	return result, vendor, newRawExportFromUploadedFile(filename, mimeType, payload, result), nil
 }
 
 func uploadMultipartMaxBytes() int64 {
@@ -297,33 +297,18 @@ func (s *Server) reanalyzeRawExportPackage(pkg *RawExportPackage) (*models.Analy
 		if !strings.EqualFold(strings.TrimSpace(pkg.Source.Protocol), "redfish") {
 			return nil, "", fmt.Errorf("unsupported live protocol: %s", pkg.Source.Protocol)
 		}
-		result, err := collector.ReplayRedfishFromRawPayloads(pkg.Source.RawPayloads, nil)
+		result, vendor, err := s.ingestService().AnalyzeRedfishRawPayloads(pkg.Source.RawPayloads, ingest.RedfishSourceMetadata{
+			TargetHost:     pkg.Source.TargetHost,
+			SourceTimezone: pkg.Source.SourceTimezone,
+			Filename:       pkg.Source.Filename,
+		})
 		if err != nil {
 			return nil, "", err
 		}
 		if result != nil {
-			if strings.TrimSpace(result.Protocol) == "" {
-				result.Protocol = "redfish"
-			}
-			if strings.TrimSpace(result.SourceType) == "" {
-				result.SourceType = models.SourceTypeAPI
-			}
-			if strings.TrimSpace(result.TargetHost) == "" {
-				result.TargetHost = strings.TrimSpace(pkg.Source.TargetHost)
-			}
-			if strings.TrimSpace(result.SourceTimezone) == "" {
-				result.SourceTimezone = strings.TrimSpace(pkg.Source.SourceTimezone)
-			}
 			result.CollectedAt = inferRawExportCollectedAt(result, pkg)
-			if strings.TrimSpace(result.Filename) == "" {
-				target := result.TargetHost
-				if target == "" {
-					target = "snapshot"
-				}
-				result.Filename = "redfish://" + target
-			}
 		}
-		return result, "redfish", nil
+		return result, vendor, nil
 	default:
 		return nil, "", fmt.Errorf("unsupported raw export source kind: %s", pkg.Source.Kind)
 	}
@@ -342,13 +327,12 @@ func (s *Server) parseUploadedPayload(filename string, payload []byte) (*models.
|
||||
return snapshotResult, vendor, nil
|
||||
}
|
||||
|
||||
p := parser.NewBMCParser()
|
||||
if err := p.ParseFromReader(bytes.NewReader(payload), filename); err != nil {
|
||||
result, vendor, err := s.ingestService().AnalyzeArchivePayload(filename, payload)
|
||||
if err != nil {
|
||||
return nil, "", err
|
||||
}
|
||||
result := p.Result()
|
||||
applyArchiveSourceMetadata(result)
|
||||
return result, p.DetectedVendor(), nil
|
||||
return result, vendor, nil
|
||||
}
|
||||
|
||||
func (s *Server) handleGetParsers(w http.ResponseWriter, r *http.Request) {
|
||||
@@ -1706,6 +1690,51 @@ func (s *Server) startCollectionJob(jobID string, req CollectRequest) {
|
||||
status = CollectStatusRunning
|
||||
}
|
||||
s.jobManager.UpdateJobStatus(jobID, status, update.Progress, "")
|
||||
if update.CurrentPhase != "" || update.ETASeconds > 0 {
|
||||
s.jobManager.UpdateJobETA(jobID, update.CurrentPhase, update.ETASeconds)
|
||||
}
|
||||
if update.DebugInfo != nil {
|
||||
debugInfo := &CollectDebugInfo{
|
||||
AdaptiveThrottled: update.DebugInfo.AdaptiveThrottled,
|
||||
SnapshotWorkers: update.DebugInfo.SnapshotWorkers,
|
||||
PrefetchWorkers: update.DebugInfo.PrefetchWorkers,
|
||||
PrefetchEnabled: update.DebugInfo.PrefetchEnabled,
|
||||
}
|
||||
if len(update.DebugInfo.PhaseTelemetry) > 0 {
|
||||
debugInfo.PhaseTelemetry = make([]CollectPhaseTelemetry, 0, len(update.DebugInfo.PhaseTelemetry))
|
||||
for _, item := range update.DebugInfo.PhaseTelemetry {
|
||||
debugInfo.PhaseTelemetry = append(debugInfo.PhaseTelemetry, CollectPhaseTelemetry{
|
||||
Phase: item.Phase,
|
||||
Requests: item.Requests,
|
||||
Errors: item.Errors,
|
||||
ErrorRate: item.ErrorRate,
|
||||
AvgMS: item.AvgMS,
|
||||
P95MS: item.P95MS,
|
||||
})
|
||||
}
|
||||
}
|
||||
s.jobManager.UpdateJobDebugInfo(jobID, debugInfo)
|
||||
}
|
||||
if len(update.ActiveModules) > 0 || len(update.ModuleScores) > 0 {
|
||||
activeModules := make([]CollectModuleStatus, 0, len(update.ActiveModules))
|
||||
for _, module := range update.ActiveModules {
|
||||
activeModules = append(activeModules, CollectModuleStatus{
|
||||
Name: module.Name,
|
||||
Score: module.Score,
|
||||
Active: true,
|
||||
})
|
||||
}
|
||||
moduleScores := make([]CollectModuleStatus, 0, len(update.ModuleScores))
|
||||
for _, module := range update.ModuleScores {
|
||||
moduleScores = append(moduleScores, CollectModuleStatus{
|
||||
Name: module.Name,
|
||||
Score: module.Score,
|
||||
Active: module.Active,
|
||||
Priority: module.Priority,
|
||||
})
|
||||
}
|
||||
s.jobManager.UpdateJobModules(jobID, activeModules, moduleScores)
|
||||
}
|
||||
if update.Message != "" {
|
||||
s.jobManager.AppendJobLog(jobID, update.Message)
|
||||
}
|
||||
|
||||
@@ -128,6 +128,53 @@ func (m *JobManager) AppendJobLog(id, message string) (*Job, bool) {
|
||||
return cloned, true
|
||||
}
|
||||
|
||||
func (m *JobManager) UpdateJobModules(id string, activeModules, moduleScores []CollectModuleStatus) (*Job, bool) {
|
||||
m.mu.Lock()
|
||||
job, ok := m.jobs[id]
|
||||
if !ok || job == nil {
|
||||
m.mu.Unlock()
|
||||
return nil, false
|
||||
}
|
||||
job.ActiveModules = append([]CollectModuleStatus(nil), activeModules...)
|
||||
job.ModuleScores = append([]CollectModuleStatus(nil), moduleScores...)
|
||||
job.UpdatedAt = time.Now().UTC()
|
||||
|
||||
cloned := cloneJob(job)
|
||||
m.mu.Unlock()
|
||||
return cloned, true
|
||||
}
|
||||
|
||||
func (m *JobManager) UpdateJobETA(id, phase string, etaSeconds int) (*Job, bool) {
|
||||
m.mu.Lock()
|
||||
job, ok := m.jobs[id]
|
||||
if !ok || job == nil {
|
||||
m.mu.Unlock()
|
||||
return nil, false
|
||||
}
|
||||
job.CurrentPhase = phase
|
||||
job.ETASeconds = etaSeconds
|
||||
job.UpdatedAt = time.Now().UTC()
|
||||
|
||||
cloned := cloneJob(job)
|
||||
m.mu.Unlock()
|
||||
return cloned, true
|
||||
}
|
||||
|
||||
func (m *JobManager) UpdateJobDebugInfo(id string, info *CollectDebugInfo) (*Job, bool) {
|
||||
m.mu.Lock()
|
||||
job, ok := m.jobs[id]
|
||||
if !ok || job == nil {
|
||||
m.mu.Unlock()
|
||||
return nil, false
|
||||
}
|
||||
job.DebugInfo = cloneCollectDebugInfo(info)
|
||||
job.UpdatedAt = time.Now().UTC()
|
||||
|
||||
cloned := cloneJob(job)
|
||||
m.mu.Unlock()
|
||||
return cloned, true
|
||||
}
|
||||
|
||||
func (m *JobManager) AttachJobCancel(id string, cancelFn context.CancelFunc) bool {
|
||||
m.mu.Lock()
|
||||
defer m.mu.Unlock()
|
||||
@@ -176,6 +223,11 @@ func cloneJob(job *Job) *Job {
|
||||
}
|
||||
cloned := *job
|
||||
cloned.Logs = append([]string(nil), job.Logs...)
|
||||
cloned.ActiveModules = append([]CollectModuleStatus(nil), job.ActiveModules...)
|
||||
cloned.ModuleScores = append([]CollectModuleStatus(nil), job.ModuleScores...)
|
||||
cloned.DebugInfo = cloneCollectDebugInfo(job.DebugInfo)
|
||||
cloned.CurrentPhase = job.CurrentPhase
|
||||
cloned.ETASeconds = job.ETASeconds
|
||||
cloned.cancel = nil
|
||||
return &cloned
|
||||
}
|
||||
|
||||
72
internal/server/manual_input_inspect_test.go
Normal file
@@ -0,0 +1,72 @@
|
||||
package server
|
||||
|
||||
import (
|
||||
"os"
|
||||
"strings"
|
||||
"testing"
|
||||
|
||||
"git.mchus.pro/mchus/logpile/internal/models"
|
||||
)
|
||||
|
||||
// TestManualInspectInput is a persistent local debugging harness for checking
|
||||
// how the current server code analyzes a real input file. It is skipped unless
|
||||
// LOGPILE_MANUAL_INPUT points to a file on disk.
|
||||
//
|
||||
// Usage:
|
||||
//
|
||||
// LOGPILE_MANUAL_INPUT=/abs/path/to/file.zip go test ./internal/server -run TestManualInspectInput -v
|
||||
func TestManualInspectInput(t *testing.T) {
|
||||
path := strings.TrimSpace(os.Getenv("LOGPILE_MANUAL_INPUT"))
|
||||
if path == "" {
|
||||
t.Skip("set LOGPILE_MANUAL_INPUT to inspect a real input file")
|
||||
}
|
||||
|
||||
payload, err := os.ReadFile(path)
|
||||
if err != nil {
|
||||
t.Fatalf("read input: %v", err)
|
||||
}
|
||||
|
||||
s := &Server{}
|
||||
filename := path
|
||||
|
||||
if rawPkg, ok, err := parseRawExportBundle(payload); err != nil {
|
||||
t.Fatalf("parseRawExportBundle: %v", err)
|
||||
} else if ok {
|
||||
result, vendor, err := s.reanalyzeRawExportPackage(rawPkg)
|
||||
if err != nil {
|
||||
t.Fatalf("reanalyzeRawExportPackage: %v", err)
|
||||
}
|
||||
logManualAnalysisResult(t, "raw_export_bundle", vendor, result)
|
||||
return
|
||||
}
|
||||
|
||||
result, vendor, err := s.parseUploadedPayload(filename, payload)
|
||||
if err != nil {
|
||||
t.Fatalf("parseUploadedPayload: %v", err)
|
||||
}
|
||||
logManualAnalysisResult(t, "uploaded_payload", vendor, result)
|
||||
}
|
||||
|
||||
func logManualAnalysisResult(t *testing.T, mode, vendor string, result *models.AnalysisResult) {
|
||||
t.Helper()
|
||||
if result == nil || result.Hardware == nil {
|
||||
t.Fatalf("missing hardware result")
|
||||
}
|
||||
|
||||
t.Logf("mode=%s vendor=%s source_type=%s protocol=%s target=%s", mode, vendor, result.SourceType, result.Protocol, result.TargetHost)
|
||||
t.Logf("counts: gpus=%d pcie=%d cpus=%d memory=%d storage=%d nics=%d psus=%d",
|
||||
len(result.Hardware.GPUs),
|
||||
len(result.Hardware.PCIeDevices),
|
||||
len(result.Hardware.CPUs),
|
||||
len(result.Hardware.Memory),
|
||||
len(result.Hardware.Storage),
|
||||
len(result.Hardware.NetworkAdapters),
|
||||
len(result.Hardware.PowerSupply),
|
||||
)
|
||||
for i, g := range result.Hardware.GPUs {
|
||||
t.Logf("gpu[%d]: slot=%s model=%s bdf=%s serial=%s status=%s", i, g.Slot, g.Model, g.BDF, g.SerialNumber, g.Status)
|
||||
}
|
||||
for i, p := range result.Hardware.PCIeDevices {
|
||||
t.Logf("pcie[%d]: slot=%s class=%s model=%s bdf=%s serial=%s vendor=%s", i, p.Slot, p.DeviceClass, p.PartNumber, p.BDF, p.SerialNumber, p.Manufacturer)
|
||||
}
|
||||
}
|
||||
@@ -10,6 +10,7 @@ import (
|
||||
"time"
|
||||
|
||||
"git.mchus.pro/mchus/logpile/internal/collector"
|
||||
"git.mchus.pro/mchus/logpile/internal/ingest"
|
||||
"git.mchus.pro/mchus/logpile/internal/models"
|
||||
chartviewer "reanimator/chart/viewer"
|
||||
)
|
||||
@@ -38,6 +39,7 @@ type Server struct {
|
||||
|
||||
jobManager *JobManager
|
||||
collectors *collector.Registry
|
||||
ingest *ingest.Service
|
||||
}
|
||||
|
||||
type ConvertArtifact struct {
|
||||
@@ -51,6 +53,7 @@ func New(cfg Config) *Server {
|
||||
mux: http.NewServeMux(),
|
||||
jobManager: NewJobManager(),
|
||||
collectors: collector.NewDefaultRegistry(),
|
||||
ingest: ingest.NewService(),
|
||||
convertJobs: make(map[string]struct{}),
|
||||
convertOutput: make(map[string]ConvertArtifact),
|
||||
}
|
||||
@@ -160,6 +163,17 @@ func (s *Server) ClientVersionString() string {
|
||||
return fmt.Sprintf("LOGPile %s (commit: %s)", v, c)
|
||||
}
|
||||
|
||||
func (s *Server) ingestService() *ingest.Service {
|
||||
if s != nil && s.ingest != nil {
|
||||
return s.ingest
|
||||
}
|
||||
svc := ingest.NewService()
|
||||
if s != nil {
|
||||
s.ingest = svc
|
||||
}
|
||||
return svc
|
||||
}
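The `ingestService()` accessor above is what lets a zero-value `&Server{}` (as used in `TestManualInspectInput`) still reach an ingest service. A minimal sketch of that nil-tolerant lazy accessor, assuming hypothetical stand-in types `service` and `server` rather than the real `ingest.Service` and `Server`:

```go
package main

// service is a hypothetical stand-in for ingest.Service.
type service struct {
	name string
}

func newService() *service { return &service{name: "default"} }

// server is a hypothetical stand-in for the HTTP server type.
type server struct {
	ingest *service
}

// ingestService returns the configured service, or creates one on first
// use and caches it. It also tolerates a nil receiver, in which case a
// fresh, uncached service is returned.
func (s *server) ingestService() *service {
	if s != nil && s.ingest != nil {
		return s.ingest
	}
	svc := newService()
	if s != nil {
		s.ingest = svc
	}
	return svc
}
```

Note that this sketch does no locking, so it assumes the first call does not race with other callers.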
|
||||
|
||||
// SetDetectedVendor sets the detected vendor name
|
||||
func (s *Server) SetDetectedVendor(vendor string) {
|
||||
s.mu.Lock()
|
||||
|
||||
@@ -357,6 +357,82 @@ main {
|
||||
transition: width 0.25s ease;
|
||||
}
|
||||
|
||||
.job-active-modules {
|
||||
margin-bottom: 0.85rem;
|
||||
}
|
||||
|
||||
.job-module-chips {
|
||||
display: flex;
|
||||
flex-wrap: wrap;
|
||||
gap: 0.45rem;
|
||||
margin-top: 0.35rem;
|
||||
}
|
||||
|
||||
.job-module-chip {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 0.4rem;
|
||||
background: #eef6ff;
|
||||
border: 1px solid #bfdcff;
|
||||
border-radius: 999px;
|
||||
padding: 0.32rem 0.68rem;
|
||||
line-height: 1;
|
||||
}
|
||||
|
||||
.job-module-chip-name {
|
||||
font-size: 0.82rem;
|
||||
color: #1f2937;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.job-module-chip-score {
|
||||
font-size: 0.72rem;
|
||||
color: #1d4ed8;
|
||||
background: #dbeafe;
|
||||
border: 1px solid #bfdbfe;
|
||||
border-radius: 999px;
|
||||
padding: 0.1rem 0.38rem;
|
||||
font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, monospace;
|
||||
}
|
||||
|
||||
.job-debug-info {
|
||||
margin-bottom: 0.85rem;
|
||||
border: 1px solid #dbe5f0;
|
||||
background: #f8fbff;
|
||||
border-radius: 8px;
|
||||
padding: 0.75rem;
|
||||
}
|
||||
|
||||
.job-debug-summary {
|
||||
font-size: 0.82rem;
|
||||
color: #334155;
|
||||
margin-top: 0.35rem;
|
||||
}
|
||||
|
||||
.job-phase-telemetry {
|
||||
margin-top: 0.55rem;
|
||||
display: grid;
|
||||
gap: 0.35rem;
|
||||
}
|
||||
|
||||
.job-phase-row {
|
||||
display: grid;
|
||||
grid-template-columns: minmax(120px, 180px) repeat(4, minmax(60px, auto));
|
||||
gap: 0.5rem;
|
||||
align-items: center;
|
||||
font-size: 0.8rem;
|
||||
}
|
||||
|
||||
.job-phase-name {
|
||||
color: #0f172a;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.job-phase-metric {
|
||||
color: #475569;
|
||||
font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, monospace;
|
||||
}
|
||||
|
||||
.meta-label {
|
||||
color: #64748b;
|
||||
font-weight: 600;
|
||||
|
||||
@@ -422,7 +422,12 @@ function startCollectionJob(payload) {
|
||||
id: body.job_id,
|
||||
status: normalizeJobStatus(body.status || 'queued'),
|
||||
progress: 0,
|
||||
currentPhase: '',
|
||||
etaSeconds: null,
|
||||
logs: [],
|
||||
activeModules: [],
|
||||
moduleScores: [],
|
||||
debugInfo: null,
|
||||
payload
|
||||
};
|
||||
appendJobLog(body.message || 'Задача поставлена в очередь');
|
||||
@@ -460,7 +465,12 @@ function pollCollectionJobStatus() {
|
||||
const prevStatus = collectionJob.status;
|
||||
collectionJob.status = normalizeJobStatus(body.status || collectionJob.status);
|
||||
collectionJob.progress = Number.isFinite(body.progress) ? body.progress : collectionJob.progress;
|
||||
collectionJob.currentPhase = body.current_phase || collectionJob.currentPhase || '';
|
||||
collectionJob.etaSeconds = Number.isFinite(body.eta_seconds) ? body.eta_seconds : collectionJob.etaSeconds;
|
||||
collectionJob.error = body.error || '';
|
||||
collectionJob.activeModules = Array.isArray(body.active_modules) ? body.active_modules : collectionJob.activeModules;
|
||||
collectionJob.moduleScores = Array.isArray(body.module_scores) ? body.module_scores : collectionJob.moduleScores;
|
||||
collectionJob.debugInfo = body.debug_info || collectionJob.debugInfo || null;
|
||||
syncServerLogs(body.logs);
|
||||
renderCollectionJob();
|
||||
|
||||
@@ -528,9 +538,14 @@ function renderCollectionJob() {
|
||||
const progressValue = document.getElementById('job-progress-value');
|
||||
const etaValue = document.getElementById('job-eta-value');
|
||||
const progressBar = document.getElementById('job-progress-bar');
|
||||
const activeModulesBlock = document.getElementById('job-active-modules');
|
||||
const activeModulesList = document.getElementById('job-active-modules-list');
|
||||
const debugInfoBlock = document.getElementById('job-debug-info');
|
||||
const debugSummary = document.getElementById('job-debug-summary');
|
||||
const phaseTelemetryNode = document.getElementById('job-phase-telemetry');
|
||||
const logsList = document.getElementById('job-logs-list');
|
||||
const cancelButton = document.getElementById('cancel-job-btn');
|
||||
if (!jobStatusBlock || !jobIdValue || !statusValue || !progressValue || !etaValue || !progressBar || !logsList || !cancelButton) {
|
||||
if (!jobStatusBlock || !jobIdValue || !statusValue || !progressValue || !etaValue || !progressBar || !activeModulesBlock || !activeModulesList || !debugInfoBlock || !debugSummary || !phaseTelemetryNode || !logsList || !cancelButton) {
|
||||
return;
|
||||
}
|
||||
|
||||
@@ -558,6 +573,8 @@ function renderCollectionJob() {
|
||||
etaValue.textContent = eta;
|
||||
progressBar.style.width = `${progressPercent}%`;
|
||||
progressBar.textContent = `${progressPercent}%`;
|
||||
renderJobActiveModules(activeModulesBlock, activeModulesList);
|
||||
renderJobDebugInfo(debugInfoBlock, debugSummary, phaseTelemetryNode);
|
||||
|
||||
logsList.innerHTML = [...collectionJob.logs].reverse().map((log) => (
|
||||
`<li><span class="log-time">${escapeHtml(log.time)}</span><span class="log-message">${escapeHtml(log.message)}</span></li>`
|
||||
@@ -568,6 +585,9 @@ function renderCollectionJob() {
|
||||
}
|
||||
|
||||
function latestCollectionActivityMessage() {
|
||||
if (collectionJob && collectionJob.currentPhase) {
|
||||
return humanizeCollectionPhase(collectionJob.currentPhase);
|
||||
}
|
||||
if (!collectionJob || !Array.isArray(collectionJob.logs) || collectionJob.logs.length === 0) {
|
||||
return 'Сбор данных...';
|
||||
}
|
||||
@@ -584,6 +604,9 @@ function latestCollectionActivityMessage() {
|
||||
}
|
||||
|
||||
function latestCollectionETA() {
|
||||
if (collectionJob && Number.isFinite(collectionJob.etaSeconds) && collectionJob.etaSeconds > 0) {
|
||||
return formatDurationSeconds(collectionJob.etaSeconds);
|
||||
}
|
||||
if (!collectionJob || !Array.isArray(collectionJob.logs) || collectionJob.logs.length === 0) {
|
||||
return '-';
|
||||
}
|
||||
@@ -649,6 +672,94 @@ function normalizeJobStatus(status) {
|
||||
return String(status || '').trim().toLowerCase();
|
||||
}
|
||||
|
||||
function humanizeCollectionPhase(phase) {
|
||||
const value = String(phase || '').trim().toLowerCase();
|
||||
return {
|
||||
discovery: 'Discovery',
|
||||
snapshot: 'Snapshot',
|
||||
snapshot_postprobe_nvme: 'Snapshot NVMe post-probe',
|
||||
snapshot_postprobe_collections: 'Snapshot collection post-probe',
|
||||
prefetch: 'Prefetch critical endpoints',
|
||||
critical_plan_b: 'Critical plan-B',
|
||||
profile_plan_b: 'Profile plan-B'
|
||||
}[value] || value || 'Сбор данных...';
|
||||
}
|
||||
|
||||
function formatDurationSeconds(totalSeconds) {
|
||||
const seconds = Math.max(0, Math.round(Number(totalSeconds) || 0));
|
||||
if (seconds <= 0) {
|
||||
return '-';
|
||||
}
|
||||
const minutes = Math.floor(seconds / 60);
|
||||
const remainingSeconds = seconds % 60;
|
||||
if (minutes === 0) {
|
||||
return `${remainingSeconds}s`;
|
||||
}
|
||||
if (remainingSeconds === 0) {
|
||||
return `${minutes}m`;
|
||||
}
|
||||
return `${minutes}m ${remainingSeconds}s`;
|
||||
}
|
||||
|
||||
function renderJobActiveModules(activeModulesBlock, activeModulesList) {
|
||||
const activeModules = collectionJob && Array.isArray(collectionJob.activeModules) ? collectionJob.activeModules : [];
|
||||
if (activeModules.length === 0) {
|
||||
activeModulesBlock.classList.add('hidden');
|
||||
activeModulesList.innerHTML = '';
|
||||
return;
|
||||
}
|
||||
|
||||
activeModulesBlock.classList.remove('hidden');
|
||||
activeModulesList.innerHTML = activeModules.map((module) => {
|
||||
const score = Number.isFinite(module.score) ? module.score : 0;
|
||||
return `<span class="job-module-chip" title="${escapeHtml(moduleTitle(module))}">
|
||||
<span class="job-module-chip-name">${escapeHtml(module.name || '-')}</span>
|
||||
<span class="job-module-chip-score">${escapeHtml(String(score))}</span>
|
||||
</span>`;
|
||||
}).join('');
|
||||
}
|
||||
|
||||
function renderJobDebugInfo(debugInfoBlock, debugSummary, phaseTelemetryNode) {
|
||||
const debug = collectionJob && collectionJob.debugInfo ? collectionJob.debugInfo : null;
|
||||
if (!debug) {
|
||||
debugInfoBlock.classList.add('hidden');
|
||||
debugSummary.innerHTML = '';
|
||||
phaseTelemetryNode.innerHTML = '';
|
||||
return;
|
||||
}
|
||||
|
||||
debugInfoBlock.classList.remove('hidden');
|
||||
const throttled = debug.adaptive_throttled ? 'on' : 'off';
|
||||
const prefetchEnabled = typeof debug.prefetch_enabled === 'boolean' ? String(debug.prefetch_enabled) : 'auto';
|
||||
debugSummary.innerHTML = `adaptive_throttling=<strong>${escapeHtml(throttled)}</strong>, snapshot_workers=<strong>${escapeHtml(String(debug.snapshot_workers || 0))}</strong>, prefetch_workers=<strong>${escapeHtml(String(debug.prefetch_workers || 0))}</strong>, prefetch_enabled=<strong>${escapeHtml(prefetchEnabled)}</strong>`;
|
||||
|
||||
const phases = Array.isArray(debug.phase_telemetry) ? debug.phase_telemetry : [];
|
||||
if (phases.length === 0) {
|
||||
phaseTelemetryNode.innerHTML = '';
|
||||
return;
|
||||
}
|
||||
phaseTelemetryNode.innerHTML = phases.map((item) => (
|
||||
`<div class="job-phase-row">
|
||||
<span class="job-phase-name">${escapeHtml(humanizeCollectionPhase(item.phase || ''))}</span>
|
||||
<span class="job-phase-metric">req=${escapeHtml(String(item.requests || 0))}</span>
|
||||
<span class="job-phase-metric">err=${escapeHtml(String(item.errors || 0))}</span>
|
||||
<span class="job-phase-metric">avg=${escapeHtml(String(item.avg_ms || 0))}ms</span>
|
||||
<span class="job-phase-metric">p95=${escapeHtml(String(item.p95_ms || 0))}ms</span>
|
||||
</div>`
|
||||
)).join('');
|
||||
}
|
||||
|
||||
function moduleTitle(activeModule) {
|
||||
const name = String(activeModule && activeModule.name || '').trim();
|
||||
const scores = collectionJob && Array.isArray(collectionJob.moduleScores) ? collectionJob.moduleScores : [];
|
||||
const full = scores.find((item) => String(item && item.name || '').trim() === name);
|
||||
if (!full) {
|
||||
return name;
|
||||
}
|
||||
const state = full.active ? 'active' : 'inactive';
|
||||
return `${name}: score=${Number.isFinite(full.score) ? full.score : 0}, priority=${Number.isFinite(full.priority) ? full.priority : 0}, ${state}`;
|
||||
}
|
||||
|
||||
async function loadDataFromStatus() {
|
||||
try {
|
||||
const response = await fetch('/api/status');
|
||||
|
||||
@@ -107,6 +107,15 @@
|
||||
<div class="job-progress" aria-label="Прогресс задачи">
|
||||
<div id="job-progress-bar" class="job-progress-bar" style="width: 0%">0%</div>
|
||||
</div>
|
||||
<div id="job-active-modules" class="job-active-modules hidden">
|
||||
<p class="meta-label">Активные модули:</p>
|
||||
<div id="job-active-modules-list" class="job-module-chips"></div>
|
||||
</div>
|
||||
<div id="job-debug-info" class="job-debug-info hidden">
|
||||
<p class="meta-label">Redfish debug:</p>
|
||||
<div id="job-debug-summary" class="job-debug-summary"></div>
|
||||
<div id="job-phase-telemetry" class="job-phase-telemetry"></div>
|
||||
</div>
|
||||
<div class="job-status-logs">
|
||||
<p class="meta-label">Журнал шагов:</p>
|
||||
<ul id="job-logs-list"></ul>
|
||||
|
||||