15 Commits

Author SHA1 Message Date
Mikhail Chusavitin
1162ccd22e Trim noisy Lenovo Redfish collection paths 2026-04-29 17:02:40 +03:00
Mikhail Chusavitin
3887df6547 Improve Lenovo XCC inventory enrichment 2026-04-29 16:38:30 +03:00
Mikhail Chusavitin
a82fb227e5 submodule update 2026-04-16 15:33:48 +03:00
c9969fc3da feat(parser): lenovo xcc warnings and redfish logs - v1.1 2026-04-13 20:34:04 +03:00
89b6701f43 feat(parser): add Lenovo XCC mini-log parser 2026-04-13 20:20:37 +03:00
b04877549a feat(collector): add Lenovo XCC profile to skip noisy snapshot paths
Lenovo ThinkSystem SR650 V3 (and similar XCC-based servers) caused
collection runs of 23+ minutes because the BMC exposes two large high-
error-rate subtrees in the snapshot BFS:

  - Chassis/1/Sensors: 315 individual sensor members, 282/315 failing,
    ~3.7s per request → ~19 minutes wasted. These documents are never
    read by any LOGPile parser (thermal/power data comes from aggregate
    Chassis/*/Thermal and Chassis/*/Power endpoints).

  - Chassis/1/Oem/Lenovo: 75 requests (LEDs×47, Slots×26, etc.),
    68/75 failing → 8+ minutes wasted on non-inventory data.

Add a Lenovo profile (matched on SystemManufacturer/OEMNamespace "Lenovo")
that sets SnapshotExcludeContains to block individual sensor documents and
non-inventory Lenovo OEM subtrees from the snapshot BFS queue. Also sets
rate policy thresholds appropriate for XCC BMC latency (p95 often 3-5s).

Add SnapshotExcludeContains []string to AcquisitionTuning and check it
in the snapshot enqueue closure in redfish.go.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 19:29:04 +03:00
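The exclusion check described in the b04877549a message can be sketched roughly as below. This is a minimal sketch, not the code from redfish.go: only the field name `SnapshotExcludeContains` and its `[]string` type come from the commit message, while the `shouldEnqueue` helper and the example substrings are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// AcquisitionTuning mirrors only the field named in the commit message.
type AcquisitionTuning struct {
	SnapshotExcludeContains []string
}

// shouldEnqueue is a hypothetical helper: it reports whether a Redfish path
// may enter the snapshot BFS queue given the profile's exclusion substrings.
func shouldEnqueue(path string, tuning AcquisitionTuning) bool {
	for _, fragment := range tuning.SnapshotExcludeContains {
		if fragment != "" && strings.Contains(path, fragment) {
			return false
		}
	}
	return true
}

func main() {
	// Example substrings are assumptions, not the real Lenovo profile values.
	tuning := AcquisitionTuning{
		SnapshotExcludeContains: []string{"/Sensors/", "/Oem/Lenovo/LEDs"},
	}
	paths := []string{
		"/redfish/v1/Chassis/1/Sensors/12",
		"/redfish/v1/Chassis/1/Thermal",
		"/redfish/v1/Chassis/1/Oem/Lenovo/LEDs/3",
	}
	for _, p := range paths {
		fmt.Println(p, shouldEnqueue(p, tuning))
	}
}
```

A substring match keeps the profile declarative: a vendor profile only lists path fragments, and the crawler does not need vendor-specific branches.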
8ca173c99b fix(exporter): preserve all HGX GPUs with generic PCIe slot name
Supermicro HGX BMC reports all 8 B200 GPU PCIe devices with Name
"PCIe Device" — a generic label shared by every GPU, not a unique
hardware position. pcieDedupKey used slot as the primary key, so all
8 GPUs collapsed to one entry in the UI (the first, serial 1654925165720).

Add isGenericPCIeSlotName to detect non-positional slot labels and fall
through to serial/BDF for dedup instead, preserving each GPU separately.
Positional slots (#GPU0, SLOT-NIC1, etc.) continue to use slot-first dedup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 16:05:49 +03:00
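The dedup fallback from the 8ca173c99b message might look roughly like this. The function names `isGenericPCIeSlotName` and `pcieDedupKey` come from the commit message, but their bodies, the `pcieDevice` struct, and the list of generic labels are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// pcieDevice is a hypothetical, reduced device record.
type pcieDevice struct {
	Slot   string
	Serial string
	BDF    string
}

// isGenericPCIeSlotName reports whether the slot label carries no positional
// information. The label set here is an assumption.
func isGenericPCIeSlotName(slot string) bool {
	switch strings.ToLower(strings.TrimSpace(slot)) {
	case "", "pcie device":
		return true
	}
	return false
}

// pcieDedupKey prefers the slot for positional labels and falls through to
// serial, then BDF, when the slot label is generic.
func pcieDedupKey(d pcieDevice) string {
	if !isGenericPCIeSlotName(d.Slot) {
		return "slot:" + d.Slot
	}
	if d.Serial != "" {
		return "serial:" + d.Serial
	}
	return "bdf:" + d.BDF
}

func main() {
	a := pcieDevice{Slot: "PCIe Device", Serial: "SN-A"}
	b := pcieDevice{Slot: "PCIe Device", Serial: "SN-B"}
	c := pcieDevice{Slot: "#GPU0", Serial: "SN-C"}
	fmt.Println(pcieDedupKey(a), pcieDedupKey(b), pcieDedupKey(c))
}
```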
f19a3454fa fix(redfish): gate hgx diagnostic plan-b by debug toggle 2026-04-13 14:45:41 +03:00
Mikhail Chusavitin
becdca1d7e fix(redfish): read PCIeInterface link width for GPU PCIe devices
parseGPUWithSupplementalDocs did not read PCIeInterface from the device
doc, only from function docs. xFusion GPU PCIeCard entries carry link
width/speed in PCIeInterface (LanesInUse/Maxlanes/PCIeType/MaxPCIeType)
so GPU link width was always empty for xFusion servers.

Also apply the xFusion OEM function-level fallback for GPU function docs,
consistent with the NIC and PCIeDevice paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 13:35:29 +03:00
Mikhail Chusavitin
e10440ae32 fix(redfish): collect PCIe link width from xFusion servers
xFusion iBMC exposes PCIe link width in two non-standard ways:
- PCIeInterface uses "Maxlanes" (lowercase 'l') instead of "MaxLanes"
- PCIeFunction docs carry width/speed in Oem.xFusion.LinkWidth ("X8"),
  Oem.xFusion.LinkWidthAbility, Oem.xFusion.LinkSpeed, and
  Oem.xFusion.LinkSpeedAbility rather than the standard CurrentLinkWidth int

Add redfishEnrichFromOEMxFusionPCIeLink and parseXFusionLinkWidth helpers,
apply them as fallbacks in NIC and PCIeDevice enrichment paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 13:35:29 +03:00
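The two xFusion quirks named in the e10440ae32 message can be illustrated with a short sketch. `parseXFusionLinkWidth` is named in the commit, but the body below is an assumption, and `maxLanes` is a hypothetical helper that is not taken from the repository.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseXFusionLinkWidth turns an OEM width string such as "X8" into 8.
// It returns 0 when the value cannot be parsed.
func parseXFusionLinkWidth(raw string) int {
	s := strings.TrimPrefix(strings.ToUpper(strings.TrimSpace(raw)), "X")
	n, err := strconv.Atoi(s)
	if err != nil || n <= 0 {
		return 0
	}
	return n
}

// maxLanes (hypothetical) reads the standard "MaxLanes" key first and falls
// back to the xFusion spelling "Maxlanes". JSON numbers decoded into
// map[string]any arrive as float64.
func maxLanes(pcieInterface map[string]any) int {
	for _, key := range []string{"MaxLanes", "Maxlanes"} {
		if v, ok := pcieInterface[key].(float64); ok {
			return int(v)
		}
	}
	return 0
}

func main() {
	fmt.Println(parseXFusionLinkWidth("X8"))
	fmt.Println(maxLanes(map[string]any{"Maxlanes": float64(16)}))
}
```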
5c2a21aff1 chore: update bible and chart submodules
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 12:17:40 +03:00
Mikhail Chusavitin
9df13327aa feat(collect): remove power-on/off, add skip-hung for Redfish collection
Remove power-on and power-off functionality from the Redfish collector;
keep host power-state detection and show a warning in the UI when the
host is powered off before collection starts.

Add a "Пропустить зависшие" (skip hung) button that lets the user abort
stuck Redfish collection phases without losing already-collected data.
Introduces a two-level context model in Collect(): the outer job context
covers the full lifecycle including replay; an inner collectCtx covers
snapshot, prefetch, and plan-B phases only. Closing the skipCh cancels
collectCtx immediately — aborts all in-flight HTTP requests and exits
plan-B loops — then replay runs on whatever rawTree was collected.

Signal path: UI → POST /api/collect/{id}/skip → JobManager.SkipJob()
→ close(skipCh) → goroutine in Collect() → cancelCollect().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 13:12:38 +03:00
Mikhail Chusavitin
7e9af89c46 Add xFusion file-export parser support 2026-04-04 15:07:10 +03:00
Mikhail Chusavitin
db74df9994 fix(redfish): trim MSI replay noise and unify NIC classes 2026-04-01 17:49:00 +03:00
Mikhail Chusavitin
bb82387d48 fix(redfish): narrow MSI PCIeFunctions crawl 2026-04-01 16:50:51 +03:00
41 changed files with 3967 additions and 854 deletions


@@ -34,6 +34,7 @@ All modes converge on the same normalized hardware model and exporter pipeline.
- NVIDIA HGX Field Diagnostics
- NVIDIA Bug Report
- Unraid
- xFusion iBMC dump / file export
- XigmaNAS
- Generic fallback parser


@@ -58,6 +58,7 @@ Responses:
Optional request field:
- `power_on_if_host_off`: when `true`, Redfish collection may power on the host before collection if preflight found it powered off
- `debug_payloads`: when `true`, collector keeps extra diagnostic payloads and enables extended plan-B retries for slow HGX component inventory branches (`Assembly`, `Accelerators`, `Drives`, `NetworkAdapters`, `PCIeDevices`)
### `POST /api/collect/probe`


@@ -27,6 +27,7 @@ Request fields passed from the server:
- credential field (`password` or token)
- `tls_mode`
- optional `power_on_if_host_off`
- optional `debug_payloads` for extended diagnostics
### Core rule
@@ -35,18 +36,38 @@ If the collector adds a fallback, probe, or normalization rule, replay must mirr
### Preflight and host power
-- `Probe()` may be used before collection to verify API connectivity and current host `PowerState`
-- if the host is off and the user chose power-on, the collector may issue `ComputerSystem.Reset`
-with `ResetType=On`
-- power-on attempts are bounded and logged
-- after a successful power-on, the collector waits an extra stabilization window, then checks
-`PowerState` again and only starts collection if the host is still on
-- if the collector powered on the host itself for collection, it must attempt to power it back off
-after collection completes
-- if the host was already on before collection, the collector must not power it off afterward
-- if power-on fails, collection still continues against the powered-off host
-- all power-control decisions and attempts must be visible in the collection log so they are
-preserved in raw-export bundles
+- `Probe()` is used before collection to verify API connectivity and report current host `PowerState`
+- if the host is off, the collector logs a warning and proceeds with collection; inventory data may
+be incomplete when the host is powered off
+- power-on and power-off are not performed by the collector
### Skip hung requests
Redfish collection uses a two-level context model:
- `ctx` — job lifetime context, cancelled only on explicit job cancel
- `collectCtx` — collection phase context, derived from `ctx`; covers snapshot, prefetch, and plan-B
`collectCtx` is cancelled when the user presses "Пропустить зависшие" (skip hung).
On skip, all in-flight HTTP requests in the current phase are aborted immediately via context
cancellation, the crawler and plan-B loops exit, and execution proceeds to the replay phase using
whatever was collected in `rawTree`. The result is partial but valid.
The skip signal travels: UI button → `POST /api/collect/{id}/skip` → `JobManager.SkipJob()`
closes `skipCh` → goroutine in `Collect()` → `cancelCollect()`.
The skip button is visible during `running` state and hidden once the job reaches a terminal state.
### Extended diagnostics toggle
The live collect form exposes a user-facing checkbox for extended diagnostics.
- default collection prioritizes inventory completeness and bounded runtime
- when extended diagnostics is off, heavy HGX component-chassis critical plan-B retries
(`Assembly`, `Accelerators`, `Drives`, `NetworkAdapters`, `PCIeDevices`) are skipped
- when extended diagnostics is on, those retries are allowed and extra debug payloads are collected
This toggle is intended for operator-driven deep diagnostics on problematic hosts, not for the default path.
### Discovery model


@@ -55,9 +55,11 @@ When `vendor_id` and `device_id` are known but the model name is missing or gene
| `h3c_g6` | H3C SDS G6 bundles | Similar flow with G6-specific files |
| `hpe_ilo_ahs` | HPE iLO Active Health System (`.ahs`) | Proprietary `ABJR` container with gzip-compressed `zbb` members; parser combines SMBIOS-style inventory strings and embedded Redfish storage JSON |
| `inspur` | onekeylog archives | FRU/SDR plus optional Redis enrichment |
| `lenovo_xcc` | Lenovo XCC mini-log ZIP archives | JSON inventory + platform event logs |
| `nvidia` | HGX Field Diagnostics | GPU- and fabric-heavy diagnostic input |
| `nvidia_bug_report` | `nvidia-bug-report-*.log.gz` | dmidecode, lspci, NVIDIA driver sections |
| `unraid` | Unraid diagnostics/log bundles | Server and storage-focused parsing |
| `xfusion` | xFusion iBMC `tar.gz` dump / file export | AppDump + RTOSDump + LogDump merge for hardware and firmware |
| `xigmanas` | XigmaNAS plain logs | FreeBSD/NAS-oriented inventory |
| `generic` | fallback | Low-confidence text fallback when nothing else matches |
@@ -148,6 +150,29 @@ entire internal `zbb` schema.
---
### xFusion iBMC Dump / File Export (`xfusion`)
**Status:** Ready (v1.1.0). Tested on xFusion G5500 V7 `tar.gz` exports.
**Archive format:** `tar.gz` dump exported from the iBMC UI, including `AppDump/`, `RTOSDump/`,
and `LogDump/` trees.
**Detection:** `AppDump/FruData/fruinfo.txt`, `AppDump/card_manage/card_info`,
`RTOSDump/versioninfo/app_revision.txt`, and `LogDump/netcard/netcard_info.txt`.
**Extracted data (current):**
- Board / FRU inventory from `fruinfo.txt`
- CPU inventory from `CpuMem/cpu_info`
- Memory DIMM inventory from `CpuMem/mem_info`
- GPU inventory from `card_info`
- OCP NIC inventory by merging `card_info` with `LogDump/netcard/netcard_info.txt`
- PSU inventory from `BMC/psu_info.txt`
- Physical storage from `StorageMgnt/PhysicalDrivesInfo/*/disk_info`
- System firmware entries from `RTOSDump/versioninfo/app_revision.txt`
- Maintenance events from `LogDump/maintenance_log`
---
### Generic text fallback (`generic`)
**Status:** Ready (v1.0.0).
@@ -170,9 +195,11 @@ entire internal `zbb` schema.
| Reanimator Easy Bee | `easy_bee` | Ready | `bee-support-*.tar.gz` support bundles |
| HPE iLO AHS | `hpe_ilo_ahs` | Ready | iLO 6 `.ahs` exports |
| Inspur / Kaytus | `inspur` | Ready | KR4268X2 onekeylog |
| Lenovo XCC mini-log | `lenovo_xcc` | Ready | ThinkSystem SR650 V3 XCC mini-log ZIP |
| NVIDIA HGX Field Diag | `nvidia` | Ready | Various HGX servers |
| NVIDIA Bug Report | `nvidia_bug_report` | Ready | H100 systems |
| Unraid | `unraid` | Ready | Unraid diagnostics archives |
| xFusion iBMC dump | `xfusion` | Ready | G5500 V7 file-export `tar.gz` bundles |
| XigmaNAS | `xigmanas` | Ready | FreeBSD NAS logs |
| H3C SDS G5 | `h3c_g5` | Ready | H3C UniServer R4900 G5 SDS archives |
| H3C SDS G6 | `h3c_g6` | Ready | H3C UniServer R4700 G6 SDS archives |


@@ -57,6 +57,11 @@ Current behavior:
7. Packages any already-present binaries from `bin/`
8. Generates `SHA256SUMS.txt`
Release tag format:
- project release tags use `vN.M`
- do not create `vN.M.P` tags for LOGPile releases
- release artifacts and `main.version` inherit the exact git tag string
Important limitation:
- `scripts/release.sh` does not run `make build-all` for you
- if you want Linux or additional macOS archives in the release directory, build them before running the script


@@ -1045,3 +1045,112 @@ logical volumes.
- HPE PCIe inventory gets better slot labels like `OCP 3.0 Slot 15` plus concrete classes such as
`LOM/NIC` or `SAS/SATA Storage Controller`.
- `part_number` remains available separately for model identity, without polluting the class field.
---
## ADL-041 — Redfish replay drops topology-only PCIe noise classes from canonical inventory
**Date:** 2026-04-01
**Context:** Some Redfish BMCs, especially MSI/AMI GPU systems, expose a very wide PCIe topology
tree under `Chassis/*/PCIeDevices/*`. Besides real endpoint devices, the replay sees bridge stages,
CPU-side helper functions, IMC/mesh signal-processing nodes, USB/SPI side controllers, and GPU
display-function duplicates reported as generic `Display Device`. Keeping all of them in
`hardware.pcie_devices` pollutes downstream exports such as Reanimator and hides the actual
endpoint inventory signal.
**Decision:**
- Filter topology-only PCIe records during Redfish replay, not in the UI layer.
- Drop PCIe entries with replay-resolved classes:
- `Bridge`
- `Processor`
- `SignalProcessingController`
- `SerialBusController`
- Drop `DisplayController` entries when the source Redfish PCIe document is the generic MSI-style
`Description: "Display Device"` duplicate.
- Drop PCIe network endpoints when their PCIe functions already link to `NetworkDeviceFunctions`,
because those devices are represented canonically in `hardware.network_adapters`.
- When `Systems/*/NetworkInterfaces/*` links back to a chassis `NetworkAdapter`, match against the
fully enriched chassis NIC identity to avoid creating a second ghost NIC row with the raw
`NetworkAdapter_*` slot/name.
- Treat generic Redfish object names such as `NetworkAdapter_*` and `PCIeDevice_*` as placeholder
models and replace them from PCI IDs when a concrete vendor/device match exists.
- Drop MSI-style storage service PCIe endpoints whose resolved device names are only
`Volume Management Device NVMe RAID Controller` or `PCIe Switch management endpoint`; storage
inventory already comes from the Redfish storage tree.
- Normalize Ethernet-class NICs into the single exported class `NetworkController`; do not split
`EthernetController` into a separate top-level inventory section.
- Keep endpoint classes such as `NetworkController`, `MassStorageController`, and dedicated GPU
inventory coming from `hardware.gpus`.
**Consequences:**
- `hardware.pcie_devices` becomes closer to real endpoint inventory instead of raw PCIe topology.
- Reanimator exports stop showing MSI bridge/processor/display duplicate noise.
- Reanimator exports no longer duplicate the same MSI NIC as both `PCIeDevice_*` and
`NetworkAdapter_*`.
- Replay no longer creates extra NIC rows from `Systems/NetworkInterfaces` when the same adapter
was already normalized from `Chassis/NetworkAdapters`.
- MSI VMD / PCIe switch storage service endpoints no longer pollute PCIe inventory.
- UI/Reanimator group all Ethernet NICs under the same `NETWORKCONTROLLER` section.
- Canonical NIC inventory prefers resolved PCI product names over generic Redfish placeholder names.
- The raw Redfish snapshot still remains available in `raw_payloads.redfish_tree` for low-level
troubleshooting if topology details are ever needed.
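
The first two filtering rules of this decision could be sketched as follows. This is an illustration under assumptions, not the replay code: the `pcieRecord` struct and the `keepPCIeRecord` helper are hypothetical, and the NIC, placeholder-name, and storage-service rules are left out.

```go
package main

import "fmt"

// pcieRecord is a hypothetical, reduced view of a replay-resolved PCIe entry.
type pcieRecord struct {
	Class       string
	Description string
}

// topologyOnlyClasses lists the classes the decision drops unconditionally.
var topologyOnlyClasses = map[string]bool{
	"Bridge":                     true,
	"Processor":                  true,
	"SignalProcessingController": true,
	"SerialBusController":        true,
}

// keepPCIeRecord reports whether a record stays in hardware.pcie_devices
// under the class rule and the generic "Display Device" duplicate rule.
func keepPCIeRecord(r pcieRecord) bool {
	if topologyOnlyClasses[r.Class] {
		return false
	}
	if r.Class == "DisplayController" && r.Description == "Display Device" {
		return false
	}
	return true
}

func main() {
	records := []pcieRecord{
		{Class: "Bridge"},
		{Class: "DisplayController", Description: "Display Device"},
		{Class: "MassStorageController"},
	}
	for _, r := range records {
		fmt.Println(r.Class, keepPCIeRecord(r))
	}
}
```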
---
## ADL-042 — xFusion file-export archives merge AppDump inventory with RTOS/Log snapshots
**Date:** 2026-04-04
**Context:** xFusion iBMC `tar.gz` exports expose the base inventory in `AppDump/`, but the most
useful NIC and firmware details live elsewhere: NIC firmware/MAC snapshots in
`LogDump/netcard/netcard_info.txt` and system firmware versions in
`RTOSDump/versioninfo/app_revision.txt`. Parsing only `AppDump/` left xFusion uploads detectable but
incomplete for UI and Reanimator consumers.
**Decision:**
- Treat xFusion file-export `tar.gz` bundles as a first-class archive parser input.
- Merge OCP NIC identity from `AppDump/card_manage/card_info` with the latest per-slot snapshot
from `LogDump/netcard/netcard_info.txt` to produce `hardware.network_adapters`.
- Import system-level firmware from `RTOSDump/versioninfo/app_revision.txt` into
`hardware.firmware`.
- Allow FRU fallback from `RTOSDump/versioninfo/fruinfo.txt` when `AppDump/FruData/fruinfo.txt`
is absent.
**Consequences:**
- xFusion uploads now preserve NIC BDF, MAC, firmware, and serial identity in normalized output.
- System firmware such as BIOS and iBMC versions survives xFusion file exports.
- xFusion archives participate more reliably in canonical device/export flows without special UI
cases.
---
## ADL-043 — Extended HGX diagnostic plan-B is opt-in from the live collect form
**Date:** 2026-04-13
**Context:** Some Supermicro HGX Redfish targets expose slow or hanging component-chassis inventory
collections during critical plan-B, especially under `Chassis/HGX_*` for `Assembly`,
`Accelerators`, `Drives`, `NetworkAdapters`, and `PCIeDevices`. Default collection should not
block operators on deep diagnostic retries that are useful mainly for troubleshooting.
**Decision:** Keep the normal snapshot/replay path unchanged, but gate those heavy HGX
component-chassis critical plan-B retries behind the existing live-collect `debug_payloads` flag,
presented in the UI as "Сбор расширенных данных для диагностики" ("Collect extended data for diagnostics").
**Consequences:**
- Default live collection skips those heavy diagnostic plan-B retries and reaches replay faster.
- Operators can explicitly opt into the slower diagnostic path when they need deeper collection.
- The same user-facing toggle continues to enable extra debug payload capture for troubleshooting.
---
## ADL-044 — LOGPile project release tags use `vN.M`
**Date:** 2026-04-13
**Context:** The repository accumulated release tags in `vN.M.P` form, while the shared module
versioning contract in `bible/rules/patterns/module-versioning/contract.md` standardizes version
shape as `N.M`. Release tooling reads the git tag verbatim into build metadata and release
artifacts, so inconsistent tag shape leaks directly into packaged versions.
**Decision:** Use `vN.M` for LOGPile project release tags going forward. Do not create new
`vN.M.P` tags for repository releases. Build metadata, release directory names, and release notes
continue to inherit the exact git tag string from `git describe --tags`.
**Consequences:**
- Future project releases have a two-component version string such as `v1.12`.
- Release artifacts and `--version` output stay aligned with the tag shape without extra mapping.
- Existing historical `vN.M.P` tags remain as-is unless explicitly rewritten.
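
A tag-shape check matching this decision could be as small as the sketch below. The `isValidReleaseTag` helper and the exact regular expression are assumptions; nothing in the source says that `scripts/release.sh` performs such a check.

```go
package main

import (
	"fmt"
	"regexp"
)

// releaseTagPattern accepts exactly two numeric components after "v".
var releaseTagPattern = regexp.MustCompile(`^v[0-9]+\.[0-9]+$`)

// isValidReleaseTag (hypothetical) reports whether tag has the vN.M shape.
func isValidReleaseTag(tag string) bool {
	return releaseTagPattern.MatchString(tag)
}

func main() {
	for _, tag := range []string{"v1.12", "v1.12.0", "1.12"} {
		fmt.Println(tag, isValidReleaseTag(tag))
	}
}
```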


@@ -112,12 +112,11 @@ func (c *RedfishConnector) Probe(ctx context.Context, req Request) (*ProbeResult
}
powerState := redfishSystemPowerState(systemDoc)
return &ProbeResult{
-Reachable: true,
-Protocol: "redfish",
-HostPowerState: powerState,
-HostPoweredOn: isRedfishHostPoweredOn(powerState),
-PowerControlAvailable: redfishResetActionTarget(systemDoc) != "",
-SystemPath: primarySystem,
+Reachable: true,
+Protocol: "redfish",
+HostPowerState: powerState,
+HostPoweredOn: isRedfishHostPoweredOn(powerState),
+SystemPath: primarySystem,
}, nil
}
@@ -160,17 +159,6 @@ func (c *RedfishConnector) Collect(ctx context.Context, req Request, emit Progre
systemPaths := c.discoverMemberPaths(discoveryCtx, snapshotClient, req, baseURL, "/redfish/v1/Systems", "/redfish/v1/Systems/1")
primarySystem := firstNonEmptyPath(systemPaths, "/redfish/v1/Systems/1")
if primarySystem != "" {
c.ensureHostPowerForCollection(ctx, snapshotClient, req, baseURL, primarySystem, emit)
}
defer func() {
if primarySystem == "" || !req.StopHostAfterCollect {
return
}
shutdownCtx, cancel := context.WithTimeout(context.Background(), 45*time.Second)
defer cancel()
c.restoreHostPowerAfterCollection(shutdownCtx, snapshotClient, req, baseURL, primarySystem, emit)
}()
chassisPaths := c.discoverMemberPaths(discoveryCtx, snapshotClient, req, baseURL, "/redfish/v1/Chassis", "/redfish/v1/Chassis/1")
managerPaths := c.discoverMemberPaths(discoveryCtx, snapshotClient, req, baseURL, "/redfish/v1/Managers", "/redfish/v1/Managers/1")
primaryChassis := firstNonEmptyPath(chassisPaths, "/redfish/v1/Chassis/1")
@@ -269,12 +257,35 @@ func (c *RedfishConnector) Collect(ctx context.Context, req Request, emit Progre
emit(Progress{Status: "running", Progress: 80, Message: "Redfish: подготовка расширенного snapshot...", CurrentPhase: "snapshot", ETASeconds: acquisitionPlan.Tuning.ETABaseline.SnapshotSeconds})
emit(Progress{Status: "running", Progress: 90, Message: "Redfish: сбор расширенного snapshot...", CurrentPhase: "snapshot", ETASeconds: acquisitionPlan.Tuning.ETABaseline.SnapshotSeconds})
}
// collectCtx covers all data-fetching phases (snapshot, prefetch, plan-B).
// Cancelling it via the skip signal aborts only the collection phases while
// leaving the replay phase intact so results from already-fetched data are preserved.
collectCtx, cancelCollect := context.WithCancel(ctx)
defer cancelCollect()
if req.SkipHungCh != nil {
go func() {
select {
case <-req.SkipHungCh:
if emit != nil {
emit(Progress{
Status: "running",
Progress: 97,
Message: "Redfish: пропуск зависших запросов, анализ уже собранных данных...",
})
}
log.Printf("redfish: skip-hung triggered, cancelling collection phases")
cancelCollect()
case <-ctx.Done():
}
}()
}
c.debugSnapshotf("snapshot crawl start host=%s port=%d", req.Host, req.Port)
-rawTree, fetchErrors, postProbeMetrics, snapshotTimingSummary := c.collectRawRedfishTree(withRedfishTelemetryPhase(ctx, "snapshot"), snapshotClient, req, baseURL, seedPaths, acquisitionPlan.Tuning, emit)
+rawTree, fetchErrors, postProbeMetrics, snapshotTimingSummary := c.collectRawRedfishTree(withRedfishTelemetryPhase(collectCtx, "snapshot"), snapshotClient, req, baseURL, seedPaths, acquisitionPlan.Tuning, emit)
c.debugSnapshotf("snapshot crawl done docs=%d", len(rawTree))
fetchErrMap := redfishFetchErrorListToMap(fetchErrors)
-prefetchedCritical, prefetchMetrics := c.prefetchCriticalRedfishDocs(withRedfishTelemetryPhase(ctx, "prefetch"), prefetchClient, req, baseURL, criticalPaths, rawTree, fetchErrMap, acquisitionPlan.Tuning, emit)
+prefetchedCritical, prefetchMetrics := c.prefetchCriticalRedfishDocs(withRedfishTelemetryPhase(collectCtx, "prefetch"), prefetchClient, req, baseURL, criticalPaths, rawTree, fetchErrMap, acquisitionPlan.Tuning, emit)
for p, doc := range prefetchedCritical {
if _, exists := rawTree[p]; exists {
continue
@@ -295,10 +306,10 @@ func (c *RedfishConnector) Collect(ctx context.Context, req Request, emit Progre
prefetchMetrics.Duration.Round(time.Millisecond),
firstNonEmpty(prefetchMetrics.SkipReason, "-"),
)
-if recoveredN := c.recoverCriticalRedfishDocsPlanB(withRedfishTelemetryPhase(ctx, "critical_plan_b"), criticalClient, req, baseURL, criticalPaths, rawTree, fetchErrMap, acquisitionPlan.Tuning, emit); recoveredN > 0 {
+if recoveredN := c.recoverCriticalRedfishDocsPlanB(withRedfishTelemetryPhase(collectCtx, "critical_plan_b"), criticalClient, req, baseURL, criticalPaths, rawTree, fetchErrMap, acquisitionPlan.Tuning, emit); recoveredN > 0 {
c.debugSnapshotf("critical plan-b recovered docs=%d", recoveredN)
}
-if recoveredN := c.recoverProfilePlanBDocs(withRedfishTelemetryPhase(ctx, "profile_plan_b"), criticalClient, req, baseURL, acquisitionPlan, rawTree, emit); recoveredN > 0 {
+if recoveredN := c.recoverProfilePlanBDocs(withRedfishTelemetryPhase(collectCtx, "profile_plan_b"), criticalClient, req, baseURL, acquisitionPlan, rawTree, emit); recoveredN > 0 {
c.debugSnapshotf("profile plan-b recovered docs=%d", recoveredN)
}
// Hide transient fetch errors for endpoints that were eventually recovered into rawTree.
@@ -334,8 +345,9 @@ func (c *RedfishConnector) Collect(ctx context.Context, req Request, emit Progre
"manager_critical_suffixes": acquisitionPlan.ScopedPaths.ManagerCriticalSuffixes,
},
"tuning": map[string]any{
"snapshot_max_documents": acquisitionPlan.Tuning.SnapshotMaxDocuments,
"snapshot_workers": acquisitionPlan.Tuning.SnapshotWorkers,
"snapshot_max_documents": acquisitionPlan.Tuning.SnapshotMaxDocuments,
"snapshot_workers": acquisitionPlan.Tuning.SnapshotWorkers,
"snapshot_exclude_contains": acquisitionPlan.Tuning.SnapshotExcludeContains,
"prefetch_workers": acquisitionPlan.Tuning.PrefetchWorkers,
"prefetch_enabled": boolPointerValue(acquisitionPlan.Tuning.PrefetchEnabled),
"nvme_post_probe": boolPointerValue(acquisitionPlan.Tuning.NVMePostProbeEnabled),
@@ -485,231 +497,6 @@ func (c *RedfishConnector) Collect(ctx context.Context, req Request, emit Progre
return result, nil
}
func (c *RedfishConnector) ensureHostPowerForCollection(ctx context.Context, client *http.Client, req Request, baseURL, systemPath string, emit ProgressFn) (hostOn bool, poweredOnByCollector bool) {
systemDoc, err := c.getJSON(ctx, client, req, baseURL, systemPath)
if err != nil {
if emit != nil {
emit(Progress{Status: "running", Progress: 18, Message: "Redfish: не удалось проверить PowerState host, сбор продолжается без power-control"})
}
return false, false
}
powerState := redfishSystemPowerState(systemDoc)
if isRedfishHostPoweredOn(powerState) {
if emit != nil {
emit(Progress{Status: "running", Progress: 18, Message: fmt.Sprintf("Redfish: host включен (%s)", firstNonEmpty(powerState, "On"))})
}
return true, false
}
if emit != nil {
emit(Progress{Status: "running", Progress: 18, Message: fmt.Sprintf("Redfish: host выключен (%s)", firstNonEmpty(powerState, "Off"))})
}
if !req.PowerOnIfHostOff {
if emit != nil {
emit(Progress{Status: "running", Progress: 19, Message: "Redfish: включение host не запрошено, сбор продолжается на выключенном host"})
}
return false, false
}
// Invalidate all inventory CRC groups before powering on so the BMC accepts
// fresh inventory from the host after boot. Best-effort: failure is logged but
// does not block power-on.
c.invalidateRedfishInventory(ctx, client, req, baseURL, systemPath, emit)
resetTarget := redfishResetActionTarget(systemDoc)
resetType := redfishPickResetType(systemDoc, "On", "ForceOn")
if resetTarget == "" || resetType == "" {
if emit != nil {
emit(Progress{Status: "running", Progress: 19, Message: "Redfish: action ComputerSystem.Reset недоступен, сбор продолжается на выключенном host"})
}
return false, false
}
waitWindows := []time.Duration{5 * time.Second, 10 * time.Second, 30 * time.Second}
for i, waitFor := range waitWindows {
if emit != nil {
emit(Progress{Status: "running", Progress: 19, Message: fmt.Sprintf("Redfish: попытка включения host (%d/%d), ожидание %s", i+1, len(waitWindows), waitFor)})
}
if err := c.postJSON(ctx, client, req, baseURL, resetTarget, map[string]any{"ResetType": resetType}); err != nil {
if emit != nil {
emit(Progress{Status: "running", Progress: 19, Message: fmt.Sprintf("Redfish: включение host не удалось (%v)", err)})
}
continue
}
if c.waitForHostPowerState(ctx, client, req, baseURL, systemPath, true, waitFor) {
if !c.waitForStablePoweredOnHost(ctx, client, req, baseURL, systemPath, emit) {
if emit != nil {
emit(Progress{Status: "running", Progress: 20, Message: "Redfish: host включился, но не подтвердил стабильное состояние; сбор продолжается на выключенном host"})
}
return false, false
}
if emit != nil {
emit(Progress{Status: "running", Progress: 20, Message: "Redfish: host успешно включен и стабилен перед сбором"})
}
return true, true
}
if emit != nil {
emit(Progress{Status: "running", Progress: 20, Message: fmt.Sprintf("Redfish: host не включился за %s", waitFor)})
}
}
if emit != nil {
emit(Progress{Status: "running", Progress: 20, Message: "Redfish: host не удалось включить, сбор продолжается на выключенном host"})
}
return false, false
}
func (c *RedfishConnector) waitForStablePoweredOnHost(ctx context.Context, client *http.Client, req Request, baseURL, systemPath string, emit ProgressFn) bool {
stabilizationDelay := redfishPowerOnStabilizationDelay()
if stabilizationDelay > 0 {
if emit != nil {
emit(Progress{
Status: "running",
Progress: 20,
Message: fmt.Sprintf("Redfish: host включен, ожидание стабилизации %s перед началом сбора", stabilizationDelay),
})
}
timer := time.NewTimer(stabilizationDelay)
select {
case <-ctx.Done():
timer.Stop()
return false
case <-timer.C:
timer.Stop()
}
}
if emit != nil {
emit(Progress{
Status: "running",
Progress: 20,
Message: "Redfish: повторная проверка PowerState после стабилизации host",
})
}
if !c.waitForHostPowerState(ctx, client, req, baseURL, systemPath, true, 5*time.Second) {
return false
}
// After the initial stabilization wait, the BMC may still be populating its
// hardware inventory (PCIeDevices, memory summary). Poll readiness with
// increasing back-off (default: +60s, +120s), then warn and proceed.
readinessWaits := redfishBMCReadinessWaits()
for attempt, extraWait := range readinessWaits {
ready, reason := c.isBMCInventoryReady(ctx, client, req, baseURL, systemPath)
if ready {
if emit != nil {
emit(Progress{
Status: "running",
Progress: 20,
Message: fmt.Sprintf("Redfish: BMC готов (%s)", reason),
})
}
return true
}
if emit != nil {
emit(Progress{
Status: "running",
Progress: 20,
Message: fmt.Sprintf("Redfish: BMC не готов (%s), ожидание %s (попытка %d/%d)", reason, extraWait, attempt+1, len(readinessWaits)),
})
}
timer := time.NewTimer(extraWait)
select {
case <-ctx.Done():
timer.Stop()
return false
case <-timer.C:
timer.Stop()
}
if emit != nil {
emit(Progress{
Status: "running",
Progress: 20,
Message: fmt.Sprintf("Redfish: повторная проверка готовности BMC (%d/%d)...", attempt+1, len(readinessWaits)),
})
}
}
ready, reason := c.isBMCInventoryReady(ctx, client, req, baseURL, systemPath)
if !ready {
if emit != nil {
emit(Progress{
Status: "running",
Progress: 20,
Message: fmt.Sprintf("Redfish: WARNING — BMC не подтвердил готовность (%s), сбор может быть неполным", reason),
})
}
} else if emit != nil {
emit(Progress{
Status: "running",
Progress: 20,
Message: fmt.Sprintf("Redfish: BMC готов (%s)", reason),
})
}
return true
}
// isBMCInventoryReady checks whether the BMC has finished populating its
// hardware inventory after a power-on. Returns (ready, reason).
// It considers the BMC ready if either the system memory summary reports
// a non-zero total or the PCIeDevices collection is non-empty.
func (c *RedfishConnector) isBMCInventoryReady(ctx context.Context, client *http.Client, req Request, baseURL, systemPath string) (bool, string) {
systemDoc, err := c.getJSON(ctx, client, req, baseURL, systemPath)
if err != nil {
return false, "не удалось прочитать System"
}
if summary, ok := systemDoc["MemorySummary"].(map[string]interface{}); ok {
if asFloat(summary["TotalSystemMemoryGiB"]) > 0 {
return true, "MemorySummary заполнен"
}
}
pcieDoc, err := c.getJSON(ctx, client, req, baseURL, joinPath(systemPath, "/PCIeDevices"))
if err == nil {
if asInt(pcieDoc["Members@odata.count"]) > 0 {
return true, "PCIeDevices не пуст"
}
if members, ok := pcieDoc["Members"].([]interface{}); ok && len(members) > 0 {
return true, "PCIeDevices не пуст"
}
}
return false, "MemorySummary=0, PCIeDevices пуст"
}
func (c *RedfishConnector) restoreHostPowerAfterCollection(ctx context.Context, client *http.Client, req Request, baseURL, systemPath string, emit ProgressFn) {
systemDoc, err := c.getJSON(ctx, client, req, baseURL, systemPath)
if err != nil {
if emit != nil {
emit(Progress{Status: "running", Progress: 100, Message: "Redfish: не удалось повторно прочитать system перед выключением host"})
}
return
}
resetTarget := redfishResetActionTarget(systemDoc)
resetType := redfishPickResetType(systemDoc, "GracefulShutdown", "ForceOff", "PushPowerButton")
if resetTarget == "" || resetType == "" {
if emit != nil {
emit(Progress{Status: "running", Progress: 100, Message: "Redfish: выключение host после сбора недоступно"})
}
return
}
if emit != nil {
emit(Progress{Status: "running", Progress: 100, Message: "Redfish: выключаем host после завершения сбора"})
}
if err := c.postJSON(ctx, client, req, baseURL, resetTarget, map[string]any{"ResetType": resetType}); err != nil {
if emit != nil {
emit(Progress{Status: "running", Progress: 100, Message: fmt.Sprintf("Redfish: failed to power off host after collection (%v)", err)})
}
return
}
if c.waitForHostPowerState(ctx, client, req, baseURL, systemPath, false, 20*time.Second) {
if emit != nil {
emit(Progress{Status: "running", Progress: 100, Message: "Redfish: host powered off after collection completed"})
}
return
}
if emit != nil {
emit(Progress{Status: "running", Progress: 100, Message: "Redfish: could not confirm host power-off after collection"})
}
}
// collectDebugPayloads fetches vendor-specific diagnostic endpoints on a best-effort basis.
// Results are stored in rawPayloads["redfish_debug_payloads"] and exported with the bundle.
// Enabled only when Request.DebugPayloads is true.
@@ -724,50 +511,6 @@ func (c *RedfishConnector) collectDebugPayloads(ctx context.Context, client *htt
return out
}
// invalidateRedfishInventory POSTs to the AMI/MSI InventoryCrc endpoint to zero out
// all known CRC groups before a host power-on. This causes the BMC to accept fresh
// inventory from the host after boot, preventing stale inventory (ghost GPUs, wrong
// BIOS version, etc.) from persisting across hardware changes.
// Best-effort: any error is logged and the call silently returns.
func (c *RedfishConnector) invalidateRedfishInventory(ctx context.Context, client *http.Client, req Request, baseURL, systemPath string, emit ProgressFn) {
crcPath := joinPath(systemPath, "/Oem/Ami/Inventory/Crc")
body := map[string]any{
"GroupCrcList": []map[string]any{
{"CPU": 0},
{"DIMM": 0},
{"PCIE": 0},
},
}
if err := c.postJSON(ctx, client, req, baseURL, crcPath, body); err != nil {
log.Printf("redfish: inventory invalidation skipped (not AMI/MSI or endpoint unavailable): %v", err)
return
}
log.Printf("redfish: inventory CRC groups invalidated at %s before host power-on", crcPath)
if emit != nil {
emit(Progress{Status: "running", Progress: 19, Message: "Redfish: BMC inventory invalidated before host power-on (all CRC groups reset)"})
}
}
func (c *RedfishConnector) waitForHostPowerState(ctx context.Context, client *http.Client, req Request, baseURL, systemPath string, wantOn bool, timeout time.Duration) bool {
deadline := time.Now().Add(timeout)
for {
systemDoc, err := c.getJSON(ctx, client, req, baseURL, systemPath)
if err == nil {
if isRedfishHostPoweredOn(redfishSystemPowerState(systemDoc)) == wantOn {
return true
}
}
if time.Now().After(deadline) {
return false
}
select {
case <-ctx.Done():
return false
case <-time.After(1 * time.Second):
}
}
}
func firstNonEmptyPath(paths []string, fallback string) string {
for _, p := range paths {
if strings.TrimSpace(p) != "" {
@@ -799,49 +542,6 @@ func redfishSystemPowerState(systemDoc map[string]interface{}) string {
return ""
}
func redfishResetActionTarget(systemDoc map[string]interface{}) string {
if systemDoc == nil {
return ""
}
actions, _ := systemDoc["Actions"].(map[string]interface{})
reset, _ := actions["#ComputerSystem.Reset"].(map[string]interface{})
target := strings.TrimSpace(asString(reset["target"]))
if target != "" {
return target
}
odataID := strings.TrimSpace(asString(systemDoc["@odata.id"]))
if odataID == "" {
return ""
}
return joinPath(odataID, "/Actions/ComputerSystem.Reset")
}
func redfishPickResetType(systemDoc map[string]interface{}, preferred ...string) string {
actions, _ := systemDoc["Actions"].(map[string]interface{})
reset, _ := actions["#ComputerSystem.Reset"].(map[string]interface{})
allowedRaw, _ := reset["ResetType@Redfish.AllowableValues"].([]interface{})
if len(allowedRaw) == 0 {
if len(preferred) > 0 {
return preferred[0]
}
return ""
}
allowed := make([]string, 0, len(allowedRaw))
for _, item := range allowedRaw {
if v := strings.TrimSpace(asString(item)); v != "" {
allowed = append(allowed, v)
}
}
for _, want := range preferred {
for _, have := range allowed {
if strings.EqualFold(want, have) {
return have
}
}
}
return ""
}
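The selection rule in `redfishPickResetType` is easier to follow with the JSON handling stripped away. This is a sketch under that simplification: `pickPreferred` is a hypothetical name, and it takes the allowable values as a plain string slice rather than reading them from a system document.

```go
package main

import "strings"

// pickPreferred returns the first entry of preferred that also appears in
// allowed (compared case-insensitively), spelled the way allowed spells it.
// When allowed is empty it falls back to the first preference, which is how
// the function above treats BMCs that advertise no allowable values.
// Hypothetical helper for illustration only.
func pickPreferred(allowed []string, preferred ...string) string {
	if len(allowed) == 0 {
		if len(preferred) > 0 {
			return preferred[0]
		}
		return ""
	}
	for _, want := range preferred {
		for _, have := range allowed {
			have = strings.TrimSpace(have)
			if have != "" && strings.EqualFold(want, have) {
				return have
			}
		}
	}
	return ""
}
```

Returning the BMC's own spelling rather than the caller's avoids sending a value that differs in case from what the BMC advertised.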
func (c *RedfishConnector) postJSON(ctx context.Context, client *http.Client, req Request, baseURL, resourcePath string, payload map[string]any) error {
body, err := json.Marshal(payload)
if err != nil {
@@ -1288,12 +988,17 @@ func (c *RedfishConnector) collectNICs(ctx context.Context, client *http.Client,
}
for _, doc := range adapterDocs {
nic := parseNIC(doc)
adapterFunctionDocs := c.getNetworkAdapterFunctionDocs(ctx, client, req, baseURL, doc)
for _, pciePath := range networkAdapterPCIeDevicePaths(doc) {
pcieDoc, err := c.getJSON(ctx, client, req, baseURL, pciePath)
if err != nil {
continue
}
functionDocs := c.getLinkedPCIeFunctions(ctx, client, req, baseURL, pcieDoc)
for _, adapterFnDoc := range adapterFunctionDocs {
functionDocs = append(functionDocs, c.getLinkedPCIeFunctions(ctx, client, req, baseURL, adapterFnDoc)...)
}
functionDocs = dedupeJSONDocsByPath(functionDocs)
supplementalDocs := c.getLinkedSupplementalDocs(ctx, client, req, baseURL, pcieDoc, "EnvironmentMetrics", "Metrics")
for _, fn := range functionDocs {
supplementalDocs = append(supplementalDocs, c.getLinkedSupplementalDocs(ctx, client, req, baseURL, fn, "EnvironmentMetrics", "Metrics")...)
@@ -1639,6 +1344,11 @@ func (c *RedfishConnector) collectRawRedfishTree(ctx context.Context, client *ht
if !shouldCrawlPath(path) {
return
}
for _, pattern := range tuning.SnapshotExcludeContains {
if pattern != "" && strings.Contains(path, pattern) {
return
}
}
mu.Lock()
if len(seen) >= maxDocuments {
mu.Unlock()
@@ -2592,34 +2302,6 @@ func redfishCriticalSlowGap() time.Duration {
return 1200 * time.Millisecond
}
func redfishPowerOnStabilizationDelay() time.Duration {
if v := strings.TrimSpace(os.Getenv("LOGPILE_REDFISH_POWERON_STABILIZATION")); v != "" {
if d, err := time.ParseDuration(v); err == nil && d >= 0 {
return d
}
}
return 60 * time.Second
}
// redfishBMCReadinessWaits returns the extra wait durations used when polling
// BMC inventory readiness after power-on. Defaults: [60s, 120s].
// Override with LOGPILE_REDFISH_BMC_READY_WAITS (comma-separated durations,
// e.g. "60s,120s").
func redfishBMCReadinessWaits() []time.Duration {
if v := strings.TrimSpace(os.Getenv("LOGPILE_REDFISH_BMC_READY_WAITS")); v != "" {
var out []time.Duration
for _, part := range strings.Split(v, ",") {
if d, err := time.ParseDuration(strings.TrimSpace(part)); err == nil && d >= 0 {
out = append(out, d)
}
}
if len(out) > 0 {
return out
}
}
return []time.Duration{60 * time.Second, 120 * time.Second}
}
func redfishSnapshotMemoryRequestTimeout() time.Duration {
if v := strings.TrimSpace(os.Getenv("LOGPILE_REDFISH_MEMORY_TIMEOUT")); v != "" {
if d, err := time.ParseDuration(v); err == nil && d > 0 {
@@ -2810,6 +2492,14 @@ func shouldCrawlPath(path string) bool {
if isAllowedNVSwitchFabricPath(normalized) {
return true
}
if strings.Contains(normalized, "/Chassis/") &&
strings.Contains(normalized, "/PCIeDevices/") &&
strings.HasSuffix(normalized, "/PCIeFunctions") {
// Avoid crawling entire chassis PCIeFunctions collections. Concrete member
// docs can still be reached through direct links such as
// NetworkDeviceFunction Links.PCIeFunction.
return false
}
if strings.Contains(normalized, "/Memory/") {
after := strings.SplitN(normalized, "/Memory/", 2)
if len(after) == 2 && strings.Count(after[1], "/") >= 1 {
@@ -2982,6 +2672,15 @@ func (c *RedfishConnector) getLinkedPCIeFunctions(ctx context.Context, client *h
}
return out
}
if ref, ok := links["PCIeFunction"].(map[string]interface{}); ok {
memberPath := asString(ref["@odata.id"])
if memberPath != "" {
memberDoc, err := c.getJSON(ctx, client, req, baseURL, memberPath)
if err == nil {
return []map[string]interface{}{memberDoc}
}
}
}
}
// Some implementations expose a collection object in PCIeFunctions.@odata.id.
@@ -2997,6 +2696,22 @@ func (c *RedfishConnector) getLinkedPCIeFunctions(ctx context.Context, client *h
return nil
}
func (c *RedfishConnector) getNetworkAdapterFunctionDocs(ctx context.Context, client *http.Client, req Request, baseURL string, adapterDoc map[string]interface{}) []map[string]interface{} {
ndfCol, ok := adapterDoc["NetworkDeviceFunctions"].(map[string]interface{})
if !ok {
return nil
}
colPath := asString(ndfCol["@odata.id"])
if colPath == "" {
return nil
}
funcDocs, err := c.getCollectionMembers(ctx, client, req, baseURL, colPath)
if err != nil {
return nil
}
return funcDocs
}
func (c *RedfishConnector) getCollectionMembers(ctx context.Context, client *http.Client, req Request, baseURL, collectionPath string) ([]map[string]interface{}, error) {
collection, err := c.getJSON(ctx, client, req, baseURL, collectionPath)
if err != nil {
@@ -3165,11 +2880,16 @@ func (c *RedfishConnector) recoverCriticalRedfishDocsPlanB(ctx context.Context,
timings := newRedfishPathTimingCollector(4)
var targets []string
seenTargets := make(map[string]struct{})
skippedDiagnosticTargets := 0
addTarget := func(path string) {
path = normalizeRedfishPath(path)
if path == "" {
return
}
if !shouldIncludeCriticalPlanBPath(req, path) {
skippedDiagnosticTargets++
return
}
if _, ok := seenTargets[path]; ok {
return
}
@@ -3255,6 +2975,13 @@ func (c *RedfishConnector) recoverCriticalRedfishDocsPlanB(ctx context.Context,
return 0
}
if emit != nil {
if skippedDiagnosticTargets > 0 {
emit(Progress{
Status: "running",
Progress: 97,
Message: fmt.Sprintf("Redfish: extended diagnostics disabled, skipped %d heavy diagnostic endpoints", skippedDiagnosticTargets),
})
}
totalETA := redfishCriticalCooldown() + estimatePlanBETA(len(targets))
emit(Progress{
Status: "running",
@@ -3360,6 +3087,39 @@ func (c *RedfishConnector) recoverCriticalRedfishDocsPlanB(ctx context.Context,
return recovered
}
func shouldIncludeCriticalPlanBPath(req Request, path string) bool {
if req.DebugPayloads {
return true
}
return !isExtendedDiagnosticCriticalPlanBPath(path)
}
func isExtendedDiagnosticCriticalPlanBPath(path string) bool {
path = normalizeRedfishPath(path)
if path == "" {
return false
}
parts := strings.Split(strings.Trim(path, "/"), "/")
if len(parts) < 5 || parts[0] != "redfish" || parts[1] != "v1" || parts[2] != "Chassis" {
return false
}
if !strings.HasPrefix(parts[3], "HGX_") {
return false
}
for _, suffix := range []string{
"/Accelerators",
"/Assembly",
"/Drives",
"/NetworkAdapters",
"/PCIeDevices",
} {
if strings.HasSuffix(path, suffix) {
return true
}
}
return false
}
func (c *RedfishConnector) recoverProfilePlanBDocs(ctx context.Context, client *http.Client, req Request, baseURL string, plan redfishprofile.AcquisitionPlan, rawTree map[string]interface{}, emit ProgressFn) int {
if len(plan.PlanBPaths) == 0 || plan.Mode == redfishprofile.ModeFallback || !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
return 0
@@ -3879,7 +3639,7 @@ func parseNIC(doc map[string]interface{}) models.NetworkAdapter {
}
if pcieIf, ok := ctrl["PCIeInterface"].(map[string]interface{}); ok && linkWidth == 0 && maxLinkWidth == 0 && linkSpeed == "" && maxLinkSpeed == "" {
linkWidth = asInt(pcieIf["LanesInUse"])
-maxLinkWidth = asInt(pcieIf["MaxLanes"])
+maxLinkWidth = firstNonZeroInt(asInt(pcieIf["MaxLanes"]), asInt(pcieIf["Maxlanes"]))
linkSpeed = firstNonEmpty(asString(pcieIf["PCIeType"]), asString(pcieIf["CurrentLinkSpeedGTs"]), asString(pcieIf["CurrentLinkSpeed"]))
maxLinkSpeed = firstNonEmpty(asString(pcieIf["MaxPCIeType"]), asString(pcieIf["MaxLinkSpeedGTs"]), asString(pcieIf["MaxLinkSpeed"]))
}
@@ -3992,6 +3752,9 @@ func enrichNICFromPCIe(nic *models.NetworkAdapter, pcieDoc map[string]interface{
if strings.TrimSpace(nic.MaxLinkSpeed) == "" {
nic.MaxLinkSpeed = firstNonEmpty(asString(pcieDoc["MaxLinkSpeedGTs"]), asString(pcieDoc["MaxLinkSpeed"]))
}
if nic.LinkWidth == 0 || nic.MaxLinkWidth == 0 || nic.LinkSpeed == "" || nic.MaxLinkSpeed == "" {
redfishEnrichFromOEMxFusionPCIeLink(pcieDoc, &nic.LinkWidth, &nic.MaxLinkWidth, &nic.LinkSpeed, &nic.MaxLinkSpeed)
}
if normalizeRedfishIdentityField(nic.SerialNumber) == "" {
nic.SerialNumber = findFirstNormalizedStringByKeys(pcieDoc, "SerialNumber")
}
@@ -4023,6 +3786,9 @@ func enrichNICFromPCIe(nic *models.NetworkAdapter, pcieDoc map[string]interface{
if strings.TrimSpace(nic.MaxLinkSpeed) == "" {
nic.MaxLinkSpeed = firstNonEmpty(asString(fn["MaxLinkSpeedGTs"]), asString(fn["MaxLinkSpeed"]))
}
if nic.LinkWidth == 0 || nic.MaxLinkWidth == 0 || nic.LinkSpeed == "" || nic.MaxLinkSpeed == "" {
redfishEnrichFromOEMxFusionPCIeLink(fn, &nic.LinkWidth, &nic.MaxLinkWidth, &nic.LinkSpeed, &nic.MaxLinkSpeed)
}
if normalizeRedfishIdentityField(nic.SerialNumber) == "" {
nic.SerialNumber = findFirstNormalizedStringByKeys(fn, "SerialNumber")
}
@@ -4589,6 +4355,21 @@ func parseGPUWithSupplementalDocs(doc map[string]interface{}, functionDocs []map
gpu.DeviceID = asHexOrInt(doc["DeviceId"])
}
if pcieIf, ok := doc["PCIeInterface"].(map[string]interface{}); ok {
if gpu.CurrentLinkWidth == 0 {
gpu.CurrentLinkWidth = asInt(pcieIf["LanesInUse"])
}
if gpu.MaxLinkWidth == 0 {
gpu.MaxLinkWidth = firstNonZeroInt(asInt(pcieIf["MaxLanes"]), asInt(pcieIf["Maxlanes"]))
}
if gpu.CurrentLinkSpeed == "" {
gpu.CurrentLinkSpeed = firstNonEmpty(asString(pcieIf["PCIeType"]), asString(pcieIf["CurrentLinkSpeedGTs"]), asString(pcieIf["CurrentLinkSpeed"]))
}
if gpu.MaxLinkSpeed == "" {
gpu.MaxLinkSpeed = firstNonEmpty(asString(pcieIf["MaxPCIeType"]), asString(pcieIf["MaxLinkSpeedGTs"]), asString(pcieIf["MaxLinkSpeed"]))
}
}
for _, fn := range functionDocs {
if gpu.BDF == "" {
gpu.BDF = sanitizeRedfishBDF(asString(fn["FunctionId"]))
@@ -4611,6 +4392,9 @@ func parseGPUWithSupplementalDocs(doc map[string]interface{}, functionDocs []map
if gpu.CurrentLinkSpeed == "" {
gpu.CurrentLinkSpeed = firstNonEmpty(asString(fn["CurrentLinkSpeedGTs"]), asString(fn["CurrentLinkSpeed"]))
}
if gpu.CurrentLinkWidth == 0 || gpu.MaxLinkWidth == 0 || gpu.CurrentLinkSpeed == "" || gpu.MaxLinkSpeed == "" {
redfishEnrichFromOEMxFusionPCIeLink(fn, &gpu.CurrentLinkWidth, &gpu.MaxLinkWidth, &gpu.CurrentLinkSpeed, &gpu.MaxLinkSpeed)
}
}
if isMissingOrRawPCIModel(gpu.Model) {
@@ -4671,6 +4455,9 @@ func parsePCIeDeviceWithSupplementalDocs(doc map[string]interface{}, functionDoc
if dev.MaxLinkSpeed == "" {
dev.MaxLinkSpeed = firstNonEmpty(asString(fn["MaxLinkSpeedGTs"]), asString(fn["MaxLinkSpeed"]))
}
if dev.LinkWidth == 0 || dev.MaxLinkWidth == 0 || dev.LinkSpeed == "" || dev.MaxLinkSpeed == "" {
redfishEnrichFromOEMxFusionPCIeLink(fn, &dev.LinkWidth, &dev.MaxLinkWidth, &dev.LinkSpeed, &dev.MaxLinkSpeed)
}
}
if dev.DeviceClass == "" || isGenericPCIeClassLabel(dev.DeviceClass) {
dev.DeviceClass = firstNonEmpty(redfishFirstStringAcrossDocs(supplementalDocs, "DeviceType"), dev.DeviceClass)
@@ -4755,6 +4542,9 @@ func isMissingOrRawPCIModel(model string) bool {
if l == "unknown" || l == "n/a" || l == "na" || l == "none" {
return true
}
if isGenericRedfishInventoryName(l) {
return true
}
if strings.HasPrefix(l, "0x") && len(l) <= 6 {
return true
}
@@ -4773,6 +4563,26 @@ func isMissingOrRawPCIModel(model string) bool {
return false
}
func isGenericRedfishInventoryName(value string) bool {
value = strings.ToLower(strings.TrimSpace(value))
switch {
case value == "":
return false
case value == "networkadapter", strings.HasPrefix(value, "networkadapter_"), strings.HasPrefix(value, "networkadapter "):
return true
case value == "pciedevice", strings.HasPrefix(value, "pciedevice_"), strings.HasPrefix(value, "pciedevice "):
return true
case value == "pciefunction", strings.HasPrefix(value, "pciefunction_"), strings.HasPrefix(value, "pciefunction "):
return true
case value == "ethernetinterface", strings.HasPrefix(value, "ethernetinterface_"), strings.HasPrefix(value, "ethernetinterface "):
return true
case value == "networkport", strings.HasPrefix(value, "networkport_"), strings.HasPrefix(value, "networkport "):
return true
default:
return false
}
}
// isUnidentifiablePCIeDevice returns true for PCIe topology entries that carry no
// useful inventory information: generic class (SingleFunction/MultiFunction), no
// resolved model or serial, and no PCI vendor/device IDs for future resolution.
@@ -4897,6 +4707,59 @@ func buildBDFfromOemPublic(doc map[string]interface{}) string {
return fmt.Sprintf("%04x:%02x:%02x.%x", segment, bus, dev, fn)
}
// redfishEnrichFromOEMxFusionPCIeLink fills in missing PCIe link width/speed
// from the xFusion OEM namespace. xFusion reports link width as a string like
// "X8" in Oem.xFusion.LinkWidth / Oem.xFusion.LinkWidthAbility, and link speed
// as a string like "Gen4 (16.0GT/s)" in Oem.xFusion.LinkSpeed /
// Oem.xFusion.LinkSpeedAbility. These fields appear on PCIeFunction docs.
func redfishEnrichFromOEMxFusionPCIeLink(doc map[string]interface{}, linkWidth, maxLinkWidth *int, linkSpeed, maxLinkSpeed *string) {
oem, _ := doc["Oem"].(map[string]interface{})
if oem == nil {
return
}
xf, _ := oem["xFusion"].(map[string]interface{})
if xf == nil {
return
}
if *linkWidth == 0 {
*linkWidth = parseXFusionLinkWidth(asString(xf["LinkWidth"]))
}
if *maxLinkWidth == 0 {
*maxLinkWidth = parseXFusionLinkWidth(asString(xf["LinkWidthAbility"]))
}
if strings.TrimSpace(*linkSpeed) == "" {
*linkSpeed = strings.TrimSpace(asString(xf["LinkSpeed"]))
}
if strings.TrimSpace(*maxLinkSpeed) == "" {
*maxLinkSpeed = strings.TrimSpace(asString(xf["LinkSpeedAbility"]))
}
}
// parseXFusionLinkWidth converts an xFusion link-width string like "X8" or
// "x16" to the integer lane count. Returns 0 for unrecognised values.
func parseXFusionLinkWidth(s string) int {
s = strings.TrimSpace(s)
if s == "" {
return 0
}
s = strings.TrimPrefix(strings.ToUpper(s), "X")
v := asInt(s)
if v <= 0 {
return 0
}
return v
}
// firstNonZeroInt returns the first argument that is non-zero.
func firstNonZeroInt(vals ...int) int {
for _, v := range vals {
if v != 0 {
return v
}
}
return 0
}
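The xFusion width strings described above ("X8", "x16") reduce to a prefix strip plus an integer parse. The sketch below shows that conversion on its own. It assumes `strconv.Atoi` as a stand-in for the repository's `asInt` helper, whose handling of strings is not visible in this diff, and `parseLinkWidth` is a hypothetical name.

```go
package main

import (
	"strconv"
	"strings"
)

// parseLinkWidth converts a lane-count label such as "X8" or "x16" into an
// integer. It returns 0 for empty, non-numeric, or non-positive input.
// Assumption: strconv.Atoi replaces the repository's asInt helper here.
func parseLinkWidth(s string) int {
	s = strings.TrimPrefix(strings.ToUpper(strings.TrimSpace(s)), "X")
	n, err := strconv.Atoi(s)
	if err != nil || n <= 0 {
		return 0
	}
	return n
}
```

Returning 0 for anything unrecognised keeps the "fill only if still zero" checks in the enrichment function meaningful, since a failed parse leaves the field open for another source.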
func normalizeRedfishIdentityField(v string) string {
v = strings.TrimSpace(v)
if v == "" {
@@ -5612,6 +5475,9 @@ func normalizeNetworkAdapterModel(nic models.NetworkAdapter) string {
if model == "" {
return ""
}
if isMissingOrRawPCIModel(model) {
return ""
}
slot := strings.TrimSpace(nic.Slot)
if slot != "" && strings.EqualFold(slot, model) {
return ""

@@ -50,11 +50,15 @@ func (c *RedfishConnector) collectRedfishLogEntries(ctx context.Context, client
}
for _, systemPath := range systemPaths {
collectFrom(joinPath(systemPath, "/LogServices"), isHardwareLogService)
for _, logServicesPath := range c.redfishLinkedCollectionPaths(ctx, client, req, baseURL, systemPath, "LogServices") {
collectFrom(logServicesPath, isHardwareLogService)
}
}
// Managers hold the IPMI SEL on AMI/MSI BMCs — include only the "SEL" service.
for _, managerPath := range managerPaths {
collectFrom(joinPath(managerPath, "/LogServices"), isManagerSELService)
for _, logServicesPath := range c.redfishLinkedCollectionPaths(ctx, client, req, baseURL, managerPath, "LogServices") {
collectFrom(logServicesPath, isManagerSELService)
}
}
if len(out) > 0 {
@@ -63,6 +67,42 @@ func (c *RedfishConnector) collectRedfishLogEntries(ctx context.Context, client
return out
}
func (c *RedfishConnector) redfishLinkedCollectionPaths(
ctx context.Context,
client *http.Client,
req Request,
baseURL, resourcePath, linkKey string,
) []string {
resourcePath = normalizeRedfishPath(resourcePath)
if resourcePath == "" || strings.TrimSpace(linkKey) == "" {
return nil
}
seen := make(map[string]struct{}, 2)
var out []string
add := func(path string) {
path = normalizeRedfishPath(path)
if path == "" {
return
}
if _, ok := seen[path]; ok {
return
}
seen[path] = struct{}{}
out = append(out, path)
}
add(joinPath(resourcePath, "/"+strings.TrimSpace(linkKey)))
resourceDoc, err := c.getJSON(ctx, client, req, baseURL, resourcePath)
if err == nil {
if linked := redfishLinkedPath(resourceDoc, linkKey); linked != "" {
add(linked)
}
}
return out
}
// fetchRedfishLogEntriesWithPaging fetches entries from a LogEntry collection,
// following nextLink pages. Stops early when entries older than cutoff are encountered
// (assumes BMC returns entries newest-first, which is typical).
@@ -182,7 +222,7 @@ func redfishLogServiceEntriesPath(svc map[string]interface{}) string {
// Audit, authentication, and session events are excluded.
func isHardwareLogEntry(entry map[string]interface{}) bool {
entryType := strings.TrimSpace(asString(entry["EntryType"]))
-if strings.EqualFold(entryType, "Oem") {
+if strings.EqualFold(entryType, "Oem") && !strings.EqualFold(strings.TrimSpace(asString(entry["OemRecordFormat"])), "Lenovo") {
return false
}
@@ -362,6 +402,9 @@ func parseIPMIDumpKV(message string) map[string]string {
// AMI/MSI BMCs often set Severity="OK" on all SEL records regardless of content,
// so we fall back to inferring severity from SensorType when the explicit field is unhelpful.
func redfishLogEntrySeverity(entry map[string]interface{}) models.Severity {
if redfishLogEntryLooksLikeWarning(entry) {
return models.SeverityWarning
}
// Newer Redfish uses MessageSeverity; older uses Severity.
raw := strings.ToLower(firstNonEmpty(
strings.TrimSpace(asString(entry["MessageSeverity"])),
@@ -380,6 +423,16 @@ func redfishLogEntrySeverity(entry map[string]interface{}) models.Severity {
}
}
func redfishLogEntryLooksLikeWarning(entry map[string]interface{}) bool {
joined := strings.ToLower(strings.TrimSpace(strings.Join([]string{
asString(entry["Message"]),
asString(entry["Name"]),
asString(entry["SensorType"]),
asString(entry["EntryCode"]),
}, " ")))
return strings.Contains(joined, "unqualified dimm")
}
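The effect of the keyword check is that message content can outrank the severity the BMC reports. A simplified sketch of that precedence follows. `severityFor` and its plain-string severities are hypothetical, and the mapping of reported values is an assumption made for the example, since the repository's full mapping is not shown in this diff.

```go
package main

import "strings"

// severityFor sketches the precedence used above: a keyword match in the
// message wins over whatever severity the BMC reported.
// Hypothetical helper: the reported-severity mapping below is assumed for
// illustration and is not the repository's models.Severity logic.
func severityFor(message, reported string) string {
	if strings.Contains(strings.ToLower(message), "unqualified dimm") {
		return "warning"
	}
	switch strings.ToLower(strings.TrimSpace(reported)) {
	case "critical":
		return "critical"
	case "warning":
		return "warning"
	default:
		return "info"
	}
}
```

The match is on the literal substring, so a message worded as "DIMM A1 is unqualified" does not trigger the override and is classified by its reported severity instead.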
// redfishSeverityFromSensorType infers event severity from the IPMI/Redfish SensorType string.
func redfishSeverityFromSensorType(sensorType string) models.Severity {
switch strings.ToLower(sensorType) {

@@ -0,0 +1,125 @@
package collector
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"time"
"git.mchus.pro/mchus/logpile/internal/models"
)
func TestCollectRedfishLogEntries_UsesLinkedManagerLogServicesPath(t *testing.T) {
mux := http.NewServeMux()
register := func(path string, payload interface{}) {
mux.HandleFunc(path, func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(payload)
})
}
register("/redfish/v1/Managers/1", map[string]interface{}{
"Id": "1",
"LogServices": map[string]interface{}{
"@odata.id": "/redfish/v1/Systems/1/LogServices",
},
})
register("/redfish/v1/Systems/1/LogServices", map[string]interface{}{
"Members": []map[string]string{
{"@odata.id": "/redfish/v1/Systems/1/LogServices/SEL"},
},
})
register("/redfish/v1/Systems/1/LogServices/SEL", map[string]interface{}{
"Id": "SEL",
"Entries": map[string]interface{}{
"@odata.id": "/redfish/v1/Systems/1/LogServices/SEL/Entries",
},
})
register("/redfish/v1/Systems/1/LogServices/SEL/Entries", map[string]interface{}{
"Members": []map[string]string{
{"@odata.id": "/redfish/v1/Systems/1/LogServices/SEL/Entries/1"},
},
})
register("/redfish/v1/Systems/1/LogServices/SEL/Entries/1", map[string]interface{}{
"Id": "1",
"Created": time.Now().UTC().Format(time.RFC3339),
"Message": "System found Unqualified DIMM in slot DIMM A1",
"MessageSeverity": "OK",
"SensorType": "Memory",
"EntryType": "Event",
})
ts := httptest.NewServer(mux)
defer ts.Close()
c := NewRedfishConnector()
got := c.collectRedfishLogEntries(context.Background(), ts.Client(), Request{
Host: ts.URL,
Port: 443,
Protocol: "redfish",
Username: "admin",
AuthType: "password",
Password: "secret",
TLSMode: "strict",
}, ts.URL, nil, []string{"/redfish/v1/Managers/1"})
if len(got) != 1 {
t.Fatalf("expected 1 collected log entry, got %d", len(got))
}
if got[0]["Message"] != "System found Unqualified DIMM in slot DIMM A1" {
t.Fatalf("unexpected collected message: %#v", got[0]["Message"])
}
}
func TestParseRedfishLogEntries_UnqualifiedDIMMBecomesWarning(t *testing.T) {
rawPayloads := map[string]any{
"redfish_log_entries": []any{
map[string]any{
"Id": "sel-1",
"Created": "2026-04-13T12:00:00Z",
"Message": "System found Unqualified DIMM in slot DIMM A1",
"MessageSeverity": "OK",
"SensorType": "Memory",
"EntryType": "Event",
},
},
}
events := parseRedfishLogEntries(rawPayloads, time.Date(2026, 4, 13, 12, 30, 0, 0, time.UTC))
if len(events) != 1 {
t.Fatalf("expected 1 event, got %d", len(events))
}
if events[0].Severity != models.SeverityWarning {
t.Fatalf("expected warning severity, got %q", events[0].Severity)
}
if events[0].Description != "System found Unqualified DIMM in slot DIMM A1" {
t.Fatalf("unexpected description: %q", events[0].Description)
}
}
func TestParseRedfishLogEntries_LenovoOEMEntryIsKept(t *testing.T) {
rawPayloads := map[string]any{
"redfish_log_entries": []any{
map[string]any{
"Id": "plat-55",
"Created": "2026-04-13T12:00:00Z",
"Message": "DIMM A1 is unqualified",
"MessageSeverity": "Warning",
"SensorType": "Memory",
"EntryType": "Oem",
"OemRecordFormat": "Lenovo",
"EntryCode": "Assert",
},
},
}
events := parseRedfishLogEntries(rawPayloads, time.Date(2026, 4, 13, 12, 30, 0, 0, time.UTC))
if len(events) != 1 {
t.Fatalf("expected 1 Lenovo OEM event, got %d", len(events))
}
if events[0].Severity != models.SeverityWarning {
t.Fatalf("expected warning severity, got %q", events[0].Severity)
}
}

@@ -0,0 +1,57 @@
package collector
import "testing"
func TestShouldIncludeCriticalPlanBPath(t *testing.T) {
tests := []struct {
name string
req Request
path string
want bool
}{
{
name: "skip hgx erot pcie without extended diagnostics",
req: Request{},
path: "/redfish/v1/Chassis/HGX_ERoT_NVSwitch_0/PCIeDevices",
want: false,
},
{
name: "skip hgx chassis assembly without extended diagnostics",
req: Request{},
path: "/redfish/v1/Chassis/HGX_Chassis_0/Assembly",
want: false,
},
{
name: "keep standard chassis inventory without extended diagnostics",
req: Request{},
path: "/redfish/v1/Chassis/1/PCIeDevices",
want: true,
},
{
name: "keep nvme storage backplane drives without extended diagnostics",
req: Request{},
path: "/redfish/v1/Chassis/NVMeSSD.0.Group.0.StorageBackplane/Drives",
want: true,
},
{
name: "keep system processors without extended diagnostics",
req: Request{},
path: "/redfish/v1/Systems/HGX_Baseboard_0/Processors",
want: true,
},
{
name: "include hgx erot pcie when extended diagnostics enabled",
req: Request{DebugPayloads: true},
path: "/redfish/v1/Chassis/HGX_ERoT_NVSwitch_0/PCIeDevices",
want: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := shouldIncludeCriticalPlanBPath(tt.req, tt.path); got != tt.want {
t.Fatalf("shouldIncludeCriticalPlanBPath(%q) = %v, want %v", tt.path, got, tt.want)
}
})
}
}

@@ -1244,6 +1244,15 @@ func (r redfishSnapshotReader) getLinkedPCIeFunctions(doc map[string]interface{}
}
return out
}
if ref, ok := links["PCIeFunction"].(map[string]interface{}); ok {
memberPath := asString(ref["@odata.id"])
if memberPath != "" {
memberDoc, err := r.getJSON(memberPath)
if err == nil {
return []map[string]interface{}{memberDoc}
}
}
}
}
if pcieFunctions, ok := doc["PCIeFunctions"].(map[string]interface{}); ok {
if collectionPath := asString(pcieFunctions["@odata.id"]); collectionPath != "" {
@@ -1256,6 +1265,33 @@ func (r redfishSnapshotReader) getLinkedPCIeFunctions(doc map[string]interface{}
return nil
}
func dedupeJSONDocsByPath(docs []map[string]interface{}) []map[string]interface{} {
if len(docs) == 0 {
return nil
}
seen := make(map[string]struct{}, len(docs))
out := make([]map[string]interface{}, 0, len(docs))
for _, doc := range docs {
if len(doc) == 0 {
continue
}
key := normalizeRedfishPath(asString(doc["@odata.id"]))
if key == "" {
payload, err := json.Marshal(doc)
if err != nil {
continue
}
key = string(payload)
}
if _, ok := seen[key]; ok {
continue
}
seen[key] = struct{}{}
out = append(out, doc)
}
return out
}
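The two-level key used by `dedupeJSONDocsByPath` (resource path first, serialized body as a fallback) can be exercised with a small standalone version. This sketch is simplified: `dedupeDocs` is a hypothetical name, and it uses the raw `@odata.id` string where the original normalizes the path first.

```go
package main

import "encoding/json"

// dedupeDocs keeps the first document seen for each "@odata.id". Documents
// without an id are compared by their JSON encoding instead, which is
// deterministic because encoding/json sorts map keys.
// Simplified sketch: no path normalization is applied here.
func dedupeDocs(docs []map[string]interface{}) []map[string]interface{} {
	seen := make(map[string]struct{}, len(docs))
	out := make([]map[string]interface{}, 0, len(docs))
	for _, doc := range docs {
		key, _ := doc["@odata.id"].(string)
		if key == "" {
			payload, err := json.Marshal(doc)
			if err != nil {
				continue // unencodable documents are dropped
			}
			key = string(payload)
		}
		if _, dup := seen[key]; dup {
			continue
		}
		seen[key] = struct{}{}
		out = append(out, doc)
	}
	return out
}
```

Keeping the first occurrence preserves the order in which function documents were gathered, so documents linked from the PCIe device stay ahead of those reached through NetworkDeviceFunctions.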
func (r redfishSnapshotReader) getLinkedSupplementalDocs(doc map[string]interface{}, keys ...string) []map[string]interface{} {
if len(doc) == 0 || len(keys) == 0 {
return nil

@@ -31,7 +31,7 @@ func (r redfishSnapshotReader) enrichNICsFromNetworkInterfaces(nics *[]models.Ne
// the real NIC that came from Chassis/NetworkAdapters (e.g. "RISER 5
// slot 1 (7)"). Try to find the real NIC via the Links.NetworkAdapter
// cross-reference before creating a ghost entry.
-if linkedIdx := r.findNICIndexByLinkedNetworkAdapter(iface, bySlot); linkedIdx >= 0 {
+if linkedIdx := r.findNICIndexByLinkedNetworkAdapter(iface, *nics, bySlot); linkedIdx >= 0 {
idx = linkedIdx
ok = true
}
@@ -75,28 +75,53 @@ func (r redfishSnapshotReader) collectNICs(chassisPaths []string) []models.Netwo
continue
}
for _, doc := range adapterDocs {
-nic := parseNIC(doc)
-for _, pciePath := range networkAdapterPCIeDevicePaths(doc) {
-pcieDoc, err := r.getJSON(pciePath)
-if err != nil {
-continue
-}
-functionDocs := r.getLinkedPCIeFunctions(pcieDoc)
-supplementalDocs := r.getLinkedSupplementalDocs(pcieDoc, "EnvironmentMetrics", "Metrics")
-for _, fn := range functionDocs {
-supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
-}
-enrichNICFromPCIe(&nic, pcieDoc, functionDocs, supplementalDocs)
-}
-if len(nic.MACAddresses) == 0 {
-r.enrichNICMACsFromNetworkDeviceFunctions(&nic, doc)
-}
-nics = append(nics, nic)
+nics = append(nics, r.buildNICFromAdapterDoc(doc))
}
}
return dedupeNetworkAdapters(nics)
}
func (r redfishSnapshotReader) buildNICFromAdapterDoc(adapterDoc map[string]interface{}) models.NetworkAdapter {
nic := parseNIC(adapterDoc)
adapterFunctionDocs := r.getNetworkAdapterFunctionDocs(adapterDoc)
for _, pciePath := range networkAdapterPCIeDevicePaths(adapterDoc) {
pcieDoc, err := r.getJSON(pciePath)
if err != nil {
continue
}
functionDocs := r.getLinkedPCIeFunctions(pcieDoc)
for _, adapterFnDoc := range adapterFunctionDocs {
functionDocs = append(functionDocs, r.getLinkedPCIeFunctions(adapterFnDoc)...)
}
functionDocs = dedupeJSONDocsByPath(functionDocs)
supplementalDocs := r.getLinkedSupplementalDocs(pcieDoc, "EnvironmentMetrics", "Metrics")
for _, fn := range functionDocs {
supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
}
enrichNICFromPCIe(&nic, pcieDoc, functionDocs, supplementalDocs)
}
if len(nic.MACAddresses) == 0 {
r.enrichNICMACsFromNetworkDeviceFunctions(&nic, adapterDoc)
}
return nic
}
func (r redfishSnapshotReader) getNetworkAdapterFunctionDocs(adapterDoc map[string]interface{}) []map[string]interface{} {
ndfCol, ok := adapterDoc["NetworkDeviceFunctions"].(map[string]interface{})
if !ok {
return nil
}
colPath := asString(ndfCol["@odata.id"])
if colPath == "" {
return nil
}
funcDocs, err := r.getCollectionMembers(colPath)
if err != nil {
return nil
}
return funcDocs
}
func (r redfishSnapshotReader) collectPCIeDevices(systemPaths, chassisPaths []string) []models.PCIeDevice {
collections := make([]string, 0, len(systemPaths)+len(chassisPaths))
for _, systemPath := range systemPaths {
@@ -116,13 +141,16 @@ func (r redfishSnapshotReader) collectPCIeDevices(systemPaths, chassisPaths []st
if looksLikeGPU(doc, functionDocs) {
continue
}
if replayPCIeDeviceBackedByCanonicalNIC(doc, functionDocs) {
continue
}
supplementalDocs := r.getLinkedSupplementalDocs(doc, "EnvironmentMetrics", "Metrics")
supplementalDocs = append(supplementalDocs, r.getChassisScopedPCIeSupplementalDocs(doc)...)
for _, fn := range functionDocs {
supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
}
dev := parsePCIeDeviceWithSupplementalDocs(doc, functionDocs, supplementalDocs)
-if isUnidentifiablePCIeDevice(dev) {
+if shouldSkipReplayPCIeDevice(doc, dev) {
continue
}
out = append(out, dev)
@@ -136,12 +164,134 @@ func (r redfishSnapshotReader) collectPCIeDevices(systemPaths, chassisPaths []st
for idx, fn := range functionDocs {
supplementalDocs := r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")
dev := parsePCIeFunctionWithSupplementalDocs(fn, supplementalDocs, idx+1)
if shouldSkipReplayPCIeDevice(fn, dev) {
continue
}
out = append(out, dev)
}
}
return dedupePCIeDevices(out)
}
func shouldSkipReplayPCIeDevice(doc map[string]interface{}, dev models.PCIeDevice) bool {
if isUnidentifiablePCIeDevice(dev) {
return true
}
if replayNetworkFunctionBackedByCanonicalNIC(doc, dev) {
return true
}
if isReplayStorageServiceEndpoint(doc, dev) {
return true
}
if isReplayNoisePCIeClass(dev.DeviceClass) {
return true
}
if isReplayDisplayDeviceDuplicate(doc, dev) {
return true
}
return false
}
func replayPCIeDeviceBackedByCanonicalNIC(doc map[string]interface{}, functionDocs []map[string]interface{}) bool {
if !looksLikeReplayNetworkPCIeDevice(doc, functionDocs) {
return false
}
for _, fn := range functionDocs {
if hasRedfishLinkedMember(fn, "NetworkDeviceFunctions") {
return true
}
}
return false
}
func replayNetworkFunctionBackedByCanonicalNIC(doc map[string]interface{}, dev models.PCIeDevice) bool {
if !looksLikeReplayNetworkClass(dev.DeviceClass) {
return false
}
return hasRedfishLinkedMember(doc, "NetworkDeviceFunctions")
}
func looksLikeReplayNetworkPCIeDevice(doc map[string]interface{}, functionDocs []map[string]interface{}) bool {
for _, fn := range functionDocs {
if looksLikeReplayNetworkClass(asString(fn["DeviceClass"])) {
return true
}
}
joined := strings.ToLower(strings.TrimSpace(strings.Join([]string{
asString(doc["DeviceType"]),
asString(doc["Description"]),
asString(doc["Name"]),
asString(doc["Model"]),
}, " ")))
return strings.Contains(joined, "network")
}
func looksLikeReplayNetworkClass(class string) bool {
class = strings.ToLower(strings.TrimSpace(class))
return strings.Contains(class, "network") || strings.Contains(class, "ethernet")
}
func isReplayStorageServiceEndpoint(doc map[string]interface{}, dev models.PCIeDevice) bool {
class := strings.ToLower(strings.TrimSpace(dev.DeviceClass))
if class != "massstoragecontroller" && class != "mass storage controller" {
return false
}
name := strings.ToLower(strings.TrimSpace(firstNonEmpty(
dev.PartNumber,
asString(doc["PartNumber"]),
asString(doc["Description"]),
)))
if strings.Contains(name, "pcie switch management endpoint") {
return true
}
if strings.Contains(name, "volume management device nvme raid controller") {
return true
}
return false
}
func hasRedfishLinkedMember(doc map[string]interface{}, key string) bool {
links, ok := doc["Links"].(map[string]interface{})
if !ok {
return false
}
if asInt(links[key+"@odata.count"]) > 0 {
return true
}
linked, ok := links[key]
if !ok {
return false
}
switch v := linked.(type) {
case []interface{}:
return len(v) > 0
case map[string]interface{}:
if asString(v["@odata.id"]) != "" {
return true
}
return len(v) > 0
default:
return false
}
}
func isReplayNoisePCIeClass(class string) bool {
switch strings.ToLower(strings.TrimSpace(class)) {
case "bridge", "processor", "signalprocessingcontroller", "signal processing controller", "serialbuscontroller", "serial bus controller":
return true
default:
return false
}
}
func isReplayDisplayDeviceDuplicate(doc map[string]interface{}, dev models.PCIeDevice) bool {
class := strings.ToLower(strings.TrimSpace(dev.DeviceClass))
if class != "displaycontroller" && class != "display controller" {
return false
}
return strings.EqualFold(strings.TrimSpace(asString(doc["Description"])), "Display Device")
}
func (r redfishSnapshotReader) getChassisScopedPCIeSupplementalDocs(doc map[string]interface{}) []map[string]interface{} {
docPath := normalizeRedfishPath(asString(doc["@odata.id"]))
chassisPath := chassisPathForPCIeDoc(docPath)
@@ -341,8 +491,9 @@ func redfishManagerInterfaceScore(summary map[string]any) int {
// findNICIndexByLinkedNetworkAdapter resolves a NetworkInterface document to an
// existing NIC in bySlot by following Links.NetworkAdapter → the Chassis
-// NetworkAdapter doc → its slot label. Returns -1 if no match is found.
-func (r redfishSnapshotReader) findNICIndexByLinkedNetworkAdapter(iface map[string]interface{}, bySlot map[string]int) int {
+// NetworkAdapter doc and reconstructing the canonical NIC identity. Returns -1
+// if no match is found.
+func (r redfishSnapshotReader) findNICIndexByLinkedNetworkAdapter(iface map[string]interface{}, existing []models.NetworkAdapter, bySlot map[string]int) int {
links, ok := iface["Links"].(map[string]interface{})
if !ok {
return -1
@@ -359,15 +510,58 @@ func (r redfishSnapshotReader) findNICIndexByLinkedNetworkAdapter(iface map[stri
if err != nil || len(adapterDoc) == 0 {
return -1
}
-adapterNIC := parseNIC(adapterDoc)
+adapterNIC := r.buildNICFromAdapterDoc(adapterDoc)
if serial := normalizeRedfishIdentityField(adapterNIC.SerialNumber); serial != "" {
for idx, nic := range existing {
if strings.EqualFold(normalizeRedfishIdentityField(nic.SerialNumber), serial) {
return idx
}
}
}
if bdf := strings.TrimSpace(adapterNIC.BDF); bdf != "" {
for idx, nic := range existing {
if strings.EqualFold(strings.TrimSpace(nic.BDF), bdf) {
return idx
}
}
}
if slot := strings.ToLower(strings.TrimSpace(adapterNIC.Slot)); slot != "" {
if idx, ok := bySlot[slot]; ok {
return idx
}
}
for idx, nic := range existing {
if networkAdaptersShareMACs(nic, adapterNIC) {
return idx
}
}
return -1
}
func networkAdaptersShareMACs(a, b models.NetworkAdapter) bool {
if len(a.MACAddresses) == 0 || len(b.MACAddresses) == 0 {
return false
}
seen := make(map[string]struct{}, len(a.MACAddresses))
for _, mac := range a.MACAddresses {
normalized := strings.ToUpper(strings.TrimSpace(mac))
if normalized == "" {
continue
}
seen[normalized] = struct{}{}
}
for _, mac := range b.MACAddresses {
normalized := strings.ToUpper(strings.TrimSpace(mac))
if normalized == "" {
continue
}
if _, ok := seen[normalized]; ok {
return true
}
}
return false
}
// enrichNICMACsFromNetworkDeviceFunctions reads the NetworkDeviceFunctions
// collection linked from a NetworkAdapter document and populates the NIC's
// MACAddresses from each function's Ethernet.PermanentMACAddress / MACAddress.
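The comment above describes reading MAC addresses from each NetworkDeviceFunction's Ethernet object; the function body is not part of this diff. The following is only a minimal sketch of that idea, not the project's implementation. The helper name `macsFromFunctionDocs`, the preference for `PermanentMACAddress` over `MACAddress`, and the upper-casing and de-duplication are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// macsFromFunctionDocs is a hypothetical helper (not from the diff). It walks
// already-fetched NetworkDeviceFunction documents and collects one MAC per
// function, preferring Ethernet.PermanentMACAddress and falling back to
// Ethernet.MACAddress. The preference order is an assumption.
func macsFromFunctionDocs(functionDocs []map[string]interface{}) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, fn := range functionDocs {
		eth, ok := fn["Ethernet"].(map[string]interface{})
		if !ok {
			continue // function has no Ethernet object
		}
		mac, _ := eth["PermanentMACAddress"].(string)
		if strings.TrimSpace(mac) == "" {
			mac, _ = eth["MACAddress"].(string)
		}
		mac = strings.ToUpper(strings.TrimSpace(mac))
		if mac == "" {
			continue
		}
		if _, dup := seen[mac]; dup {
			continue // keep each MAC once, in first-seen order
		}
		seen[mac] = struct{}{}
		out = append(out, mac)
	}
	return out
}

func main() {
	docs := []map[string]interface{}{
		{"Ethernet": map[string]interface{}{"PermanentMACAddress": "cc:40:f3:d6:9e:de"}},
		{"Ethernet": map[string]interface{}{"MACAddress": "CC:40:F3:D6:9E:DF"}},
		{"Id": "Function2"}, // no Ethernet object, skipped
	}
	fmt.Println(macsFromFunctionDocs(docs))
}
```

Normalizing case before de-duplicating matches the comparison style used by `networkAdaptersShareMACs` in this diff, which upper-cases and trims both sides before comparing.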

@@ -265,9 +265,6 @@ func TestRedfishConnectorProbe(t *testing.T) {
if got.HostPowerState != "Off" {
t.Fatalf("expected power state Off, got %q", got.HostPowerState)
}
if !got.PowerControlAvailable {
t.Fatalf("expected power control available")
}
}
func TestRedfishConnectorProbe_FallsBackToPowerSummary(t *testing.T) {
@@ -330,225 +327,6 @@ func TestRedfishConnectorProbe_FallsBackToPowerSummary(t *testing.T) {
if got.HostPowerState != "On" {
t.Fatalf("expected power state On, got %q", got.HostPowerState)
}
if !got.PowerControlAvailable {
t.Fatalf("expected power control available")
}
}
func TestEnsureHostPowerForCollection_WaitsForStablePowerOn(t *testing.T) {
t.Setenv("LOGPILE_REDFISH_POWERON_STABILIZATION", "1ms")
t.Setenv("LOGPILE_REDFISH_BMC_READY_WAITS", "1ms,1ms")
powerState := "Off"
resetCalls := 0
mux := http.NewServeMux()
mux.HandleFunc("/redfish/v1/Systems/1", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(map[string]interface{}{
"@odata.id": "/redfish/v1/Systems/1",
"PowerState": powerState,
"MemorySummary": map[string]interface{}{
"TotalSystemMemoryGiB": 128,
},
"Actions": map[string]interface{}{
"#ComputerSystem.Reset": map[string]interface{}{
"target": "/redfish/v1/Systems/1/Actions/ComputerSystem.Reset",
"ResetType@Redfish.AllowableValues": []interface{}{"On"},
},
},
})
})
mux.HandleFunc("/redfish/v1/Systems/1/Actions/ComputerSystem.Reset", func(w http.ResponseWriter, r *http.Request) {
resetCalls++
powerState = "On"
w.WriteHeader(http.StatusOK)
})
ts := httptest.NewTLSServer(mux)
defer ts.Close()
u, err := url.Parse(ts.URL)
if err != nil {
t.Fatalf("parse server url: %v", err)
}
port := 443
if u.Port() != "" {
fmt.Sscanf(u.Port(), "%d", &port)
}
c := NewRedfishConnector()
hostOn, changed := c.ensureHostPowerForCollection(context.Background(), c.httpClientWithTimeout(Request{TLSMode: "insecure"}, 5*time.Second), Request{
Host: u.Hostname(),
Protocol: "redfish",
Port: port,
Username: "admin",
AuthType: "password",
Password: "secret",
TLSMode: "insecure",
PowerOnIfHostOff: true,
}, ts.URL, "/redfish/v1/Systems/1", nil)
if !hostOn || !changed {
t.Fatalf("expected stable power-on result, got hostOn=%v changed=%v", hostOn, changed)
}
if resetCalls != 1 {
t.Fatalf("expected one reset call, got %d", resetCalls)
}
}
func TestEnsureHostPowerForCollection_FailsIfHostDoesNotStayOnAfterStabilization(t *testing.T) {
t.Setenv("LOGPILE_REDFISH_POWERON_STABILIZATION", "1ms")
t.Setenv("LOGPILE_REDFISH_BMC_READY_WAITS", "1ms,1ms")
powerState := "Off"
mux := http.NewServeMux()
mux.HandleFunc("/redfish/v1/Systems/1", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
current := powerState
if powerState == "On" {
powerState = "Off"
}
_ = json.NewEncoder(w).Encode(map[string]interface{}{
"@odata.id": "/redfish/v1/Systems/1",
"PowerState": current,
"Actions": map[string]interface{}{
"#ComputerSystem.Reset": map[string]interface{}{
"target": "/redfish/v1/Systems/1/Actions/ComputerSystem.Reset",
"ResetType@Redfish.AllowableValues": []interface{}{"On"},
},
},
})
})
mux.HandleFunc("/redfish/v1/Systems/1/Actions/ComputerSystem.Reset", func(w http.ResponseWriter, r *http.Request) {
powerState = "On"
w.WriteHeader(http.StatusOK)
})
ts := httptest.NewTLSServer(mux)
defer ts.Close()
u, err := url.Parse(ts.URL)
if err != nil {
t.Fatalf("parse server url: %v", err)
}
port := 443
if u.Port() != "" {
fmt.Sscanf(u.Port(), "%d", &port)
}
c := NewRedfishConnector()
hostOn, changed := c.ensureHostPowerForCollection(context.Background(), c.httpClientWithTimeout(Request{TLSMode: "insecure"}, 5*time.Second), Request{
Host: u.Hostname(),
Protocol: "redfish",
Port: port,
Username: "admin",
AuthType: "password",
Password: "secret",
TLSMode: "insecure",
PowerOnIfHostOff: true,
}, ts.URL, "/redfish/v1/Systems/1", nil)
if hostOn || changed {
t.Fatalf("expected unstable power-on result to fail, got hostOn=%v changed=%v", hostOn, changed)
}
}
func TestEnsureHostPowerForCollection_UsesPowerSummaryState(t *testing.T) {
t.Setenv("LOGPILE_REDFISH_POWERON_STABILIZATION", "1ms")
t.Setenv("LOGPILE_REDFISH_BMC_READY_WAITS", "1ms,1ms")
powerState := "On"
mux := http.NewServeMux()
mux.HandleFunc("/redfish/v1/Systems/1", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(map[string]interface{}{
"@odata.id": "/redfish/v1/Systems/1",
"PowerSummary": map[string]interface{}{
"PowerState": powerState,
},
"MemorySummary": map[string]interface{}{
"TotalSystemMemoryGiB": 128,
},
"Actions": map[string]interface{}{
"#ComputerSystem.Reset": map[string]interface{}{
"target": "/redfish/v1/Systems/1/Actions/ComputerSystem.Reset",
"ResetType@Redfish.AllowableValues": []interface{}{"On"},
},
},
})
})
ts := httptest.NewTLSServer(mux)
defer ts.Close()
u, err := url.Parse(ts.URL)
if err != nil {
t.Fatalf("parse server url: %v", err)
}
port := 443
if u.Port() != "" {
fmt.Sscanf(u.Port(), "%d", &port)
}
c := NewRedfishConnector()
hostOn, changed := c.ensureHostPowerForCollection(context.Background(), c.httpClientWithTimeout(Request{TLSMode: "insecure"}, 5*time.Second), Request{
Host: u.Hostname(),
Protocol: "redfish",
Port: port,
Username: "admin",
AuthType: "password",
Password: "secret",
TLSMode: "insecure",
PowerOnIfHostOff: true,
}, ts.URL, "/redfish/v1/Systems/1", nil)
if !hostOn || changed {
t.Fatalf("expected already-on host from PowerSummary, got hostOn=%v changed=%v", hostOn, changed)
}
}
func TestWaitForHostPowerState_UsesPowerSummaryState(t *testing.T) {
powerState := "Off"
mux := http.NewServeMux()
mux.HandleFunc("/redfish/v1/Systems/1", func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
current := powerState
if powerState == "Off" {
powerState = "On"
}
_ = json.NewEncoder(w).Encode(map[string]interface{}{
"@odata.id": "/redfish/v1/Systems/1",
"PowerSummary": map[string]interface{}{
"PowerState": current,
},
})
})
ts := httptest.NewTLSServer(mux)
defer ts.Close()
u, err := url.Parse(ts.URL)
if err != nil {
t.Fatalf("parse server url: %v", err)
}
port := 443
if u.Port() != "" {
fmt.Sscanf(u.Port(), "%d", &port)
}
c := NewRedfishConnector()
ok := c.waitForHostPowerState(context.Background(), c.httpClientWithTimeout(Request{TLSMode: "insecure"}, 5*time.Second), Request{
Host: u.Hostname(),
Protocol: "redfish",
Port: port,
Username: "admin",
AuthType: "password",
Password: "secret",
TLSMode: "insecure",
}, ts.URL, "/redfish/v1/Systems/1", true, 3*time.Second)
if !ok {
t.Fatalf("expected waitForHostPowerState to use PowerSummary")
}
}
func TestParsePCIeDeviceSlot_FromNestedRedfishSlotLocation(t *testing.T) {
@@ -1287,6 +1065,229 @@ func TestEnrichNICFromPCIeFunctions_FillsMissingIdentityFromFunctionDoc(t *testi
}
}
func TestReplayCollectNICs_UsesNetworkDeviceFunctionPCIeFunctionLink(t *testing.T) {
tree := map[string]interface{}{
"/redfish/v1/Chassis/1/NetworkAdapters": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/NIC1"},
},
},
"/redfish/v1/Chassis/1/NetworkAdapters/NIC1": map[string]interface{}{
"Id": "DevType7_NIC1",
"Name": "NetworkAdapter_1",
"Controllers": []interface{}{
map[string]interface{}{
"ControllerCapabilities": map[string]interface{}{
"NetworkPortCount": 2,
},
"Links": map[string]interface{}{
"PCIeDevices": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/00_0F_00"},
},
},
},
},
"NetworkDeviceFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/NIC1/NetworkDeviceFunctions",
},
},
"/redfish/v1/Chassis/1/NetworkAdapters/NIC1/NetworkDeviceFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/NIC1/NetworkDeviceFunctions/Function0"},
},
},
"/redfish/v1/Chassis/1/NetworkAdapters/NIC1/NetworkDeviceFunctions/Function0": map[string]interface{}{
"Id": "Function0",
"Links": map[string]interface{}{
"PCIeFunction": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/00_0F_00/PCIeFunctions/Function0",
},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/00_0F_00": map[string]interface{}{
"Id": "00_0F_00",
"Name": "PCIeDevice_00_0F_00",
"Manufacturer": "Mellanox Technologies",
"FirmwareVersion": "26.43.25.66",
"Slot": map[string]interface{}{
"Location": map[string]interface{}{
"PartLocation": map[string]interface{}{
"ServiceLabel": "RISER4",
},
},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/00_0F_00/PCIeFunctions/Function0": map[string]interface{}{
"Id": "Function0",
"FunctionId": "0000:0f:00.0",
"VendorId": "0x15b3",
"DeviceId": "0x101f",
"SerialNumber": "MT2412X00001",
"PartNumber": "MCX623432AC-GDA_Ax",
},
}
r := redfishSnapshotReader{tree: tree}
nics := r.collectNICs([]string{"/redfish/v1/Chassis/1"})
if len(nics) != 1 {
t.Fatalf("expected one NIC, got %d", len(nics))
}
if nics[0].Slot != "RISER4" {
t.Fatalf("expected slot from PCIe device, got %q", nics[0].Slot)
}
if nics[0].SerialNumber != "MT2412X00001" {
t.Fatalf("expected serial from NetworkDeviceFunction PCIeFunction link, got %q", nics[0].SerialNumber)
}
if nics[0].PartNumber != "MCX623432AC-GDA_Ax" {
t.Fatalf("expected part number from linked PCIeFunction, got %q", nics[0].PartNumber)
}
if nics[0].BDF != "0000:0f:00.0" {
t.Fatalf("expected BDF from linked PCIeFunction, got %q", nics[0].BDF)
}
if nics[0].Model != "MT2894 Family [ConnectX-6 Lx]" {
t.Fatalf("expected model resolved from PCI IDs, got %q", nics[0].Model)
}
}
func TestReplayEnrichNICsFromNetworkInterfaces_DoesNotCreateGhostForLinkedAdapter(t *testing.T) {
tree := map[string]interface{}{
"/redfish/v1/Chassis/1/NetworkAdapters": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/NIC1"},
},
},
"/redfish/v1/Chassis/1/NetworkAdapters/NIC1": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/NIC1",
"Id": "DevType7_NIC1",
"Name": "NetworkAdapter_1",
"Controllers": []interface{}{
map[string]interface{}{
"ControllerCapabilities": map[string]interface{}{
"NetworkPortCount": 1,
},
"Links": map[string]interface{}{
"PCIeDevices": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/00_0F_00"},
},
},
},
map[string]interface{}{
"ControllerCapabilities": map[string]interface{}{
"NetworkPortCount": 1,
},
"Links": map[string]interface{}{
"PCIeDevices": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/00_0F_00"},
},
},
},
},
"NetworkDeviceFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/NIC1/NetworkDeviceFunctions",
},
},
"/redfish/v1/Chassis/1/NetworkAdapters/NIC1/NetworkDeviceFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/NIC1/NetworkDeviceFunctions/Function0"},
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/NIC1/NetworkDeviceFunctions/Function1"},
},
},
"/redfish/v1/Chassis/1/NetworkAdapters/NIC1/NetworkDeviceFunctions/Function0": map[string]interface{}{
"Id": "Function0",
"Ethernet": map[string]interface{}{
"MACAddress": "CC:40:F3:D6:9E:DE",
"PermanentMACAddress": "CC:40:F3:D6:9E:DE",
},
"Links": map[string]interface{}{
"PCIeFunction": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/00_0F_00/PCIeFunctions/Function0",
},
},
},
"/redfish/v1/Chassis/1/NetworkAdapters/NIC1/NetworkDeviceFunctions/Function1": map[string]interface{}{
"Id": "Function1",
"Ethernet": map[string]interface{}{
"MACAddress": "CC:40:F3:D6:9E:DF",
"PermanentMACAddress": "CC:40:F3:D6:9E:DF",
},
"Links": map[string]interface{}{
"PCIeFunction": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/00_0F_00/PCIeFunctions/Function1",
},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/00_0F_00": map[string]interface{}{
"Id": "00_0F_00",
"Name": "PCIeDevice_00_0F_00",
"Manufacturer": "Mellanox Technologies",
"FirmwareVersion": "26.43.25.66",
"Slot": map[string]interface{}{
"Location": map[string]interface{}{
"PartLocation": map[string]interface{}{
"ServiceLabel": "RISER4",
},
},
},
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/00_0F_00/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/00_0F_00/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/00_0F_00/PCIeFunctions/Function0"},
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/00_0F_00/PCIeFunctions/Function1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/00_0F_00/PCIeFunctions/Function0": map[string]interface{}{
"FunctionId": "0000:0f:00.0",
"VendorId": "0x15b3",
"DeviceId": "0x101f",
"DeviceClass": "NetworkController",
"SerialNumber": "N/A",
},
"/redfish/v1/Chassis/1/PCIeDevices/00_0F_00/PCIeFunctions/Function1": map[string]interface{}{
"FunctionId": "0000:0f:00.1",
"VendorId": "0x15b3",
"DeviceId": "0x101f",
"DeviceClass": "NetworkController",
},
"/redfish/v1/Systems/1/NetworkInterfaces": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Systems/1/NetworkInterfaces/DevType7_NIC1"},
},
},
"/redfish/v1/Systems/1/NetworkInterfaces/DevType7_NIC1": map[string]interface{}{
"Id": "DevType7_NIC1",
"Name": "NetworkAdapter_1",
"Links": map[string]interface{}{
"NetworkAdapter": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/NIC1",
},
},
"Status": map[string]interface{}{
"Health": "OK",
"State": "Disabled",
},
},
}
r := redfishSnapshotReader{tree: tree}
nics := r.collectNICs([]string{"/redfish/v1/Chassis/1"})
r.enrichNICsFromNetworkInterfaces(&nics, []string{"/redfish/v1/Systems/1"})
if len(nics) != 1 {
t.Fatalf("expected linked network interface to reuse existing NIC, got %d: %+v", len(nics), nics)
}
if nics[0].Slot != "RISER4" {
t.Fatalf("expected enriched slot to stay canonical, got %q", nics[0].Slot)
}
if nics[0].Model != "MT2894 Family [ConnectX-6 Lx]" {
t.Fatalf("expected resolved Mellanox model, got %q", nics[0].Model)
}
if len(nics[0].MACAddresses) != 2 {
t.Fatalf("expected both MACs to stay on one NIC, got %+v", nics[0].MACAddresses)
}
}
func TestParseNIC_PortCountFromControllerCapabilities(t *testing.T) {
nic := parseNIC(map[string]interface{}{
"Id": "1",
@@ -1340,6 +1341,48 @@ func TestParseNIC_PrefersControllerSlotLabelAndPCIeInterface(t *testing.T) {
}
}
func TestParseNIC_xFusionMaxlanesAndOEMLinkWidth(t *testing.T) {
// xFusion uses "Maxlanes" (lowercase 'l') in PCIeInterface, not "MaxLanes".
// xFusion also stores per-function link width as Oem.xFusion.LinkWidth = "X8".
nic := parseNIC(map[string]interface{}{
"Id": "OCPCard1",
"Model": "ConnectX-6 Lx",
"Controllers": []interface{}{
map[string]interface{}{
"PCIeInterface": map[string]interface{}{
"LanesInUse": 8,
"Maxlanes": 8, // xFusion uses lowercase 'l'
"PCIeType": "Gen4",
"MaxPCIeType": "Gen4",
},
},
},
})
if nic.LinkWidth != 8 || nic.MaxLinkWidth != 8 {
t.Fatalf("expected link widths 8/8 from xFusion Maxlanes, got current=%d max=%d", nic.LinkWidth, nic.MaxLinkWidth)
}
// enrichNICFromPCIe: OEM xFusion LinkWidth on a PCIeFunction doc.
nic2 := models.NetworkAdapter{}
fnDoc := map[string]interface{}{
"Oem": map[string]interface{}{
"xFusion": map[string]interface{}{
"LinkWidth": "X8",
"LinkWidthAbility": "X8",
"LinkSpeed": "Gen4 (16.0GT/s)",
"LinkSpeedAbility": "Gen4 (16.0GT/s)",
},
},
}
enrichNICFromPCIe(&nic2, map[string]interface{}{}, []map[string]interface{}{fnDoc}, nil)
if nic2.LinkWidth != 8 || nic2.MaxLinkWidth != 8 {
t.Fatalf("expected link width 8 from xFusion OEM LinkWidth, got current=%d max=%d", nic2.LinkWidth, nic2.MaxLinkWidth)
}
if nic2.LinkSpeed != "Gen4 (16.0GT/s)" || nic2.MaxLinkSpeed != "Gen4 (16.0GT/s)" {
t.Fatalf("expected link speed from xFusion OEM LinkSpeed, got current=%q max=%q", nic2.LinkSpeed, nic2.MaxLinkSpeed)
}
}
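The test above relies on two xFusion quirks: the `Maxlanes` key spelled with a lowercase `l`, and link widths reported as strings such as `"X8"`. The parser code that handles them is not shown in this diff, so the following is only a hedged sketch of one way to tolerate both. The helper names `lookupIntFold` and `parseLinkWidthLabel` are hypothetical and do not come from the repository.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// toInt converts the numeric types that commonly appear in decoded JSON maps.
func toInt(v interface{}) (int, bool) {
	switch n := v.(type) {
	case int:
		return n, true
	case float64:
		return int(n), true
	}
	return 0, false
}

// lookupIntFold (hypothetical) reads an integer field, trying the exact key
// first and then any key that differs only by letter case, e.g. "Maxlanes"
// when "MaxLanes" is requested. It returns 0 when nothing usable is found.
func lookupIntFold(doc map[string]interface{}, key string) int {
	if n, ok := toInt(doc[key]); ok {
		return n
	}
	for k, v := range doc {
		if strings.EqualFold(k, key) {
			if n, ok := toInt(v); ok {
				return n
			}
		}
	}
	return 0
}

// parseLinkWidthLabel (hypothetical) turns labels like "X8" or "x16" into a
// lane count. It returns 0 for anything it cannot parse.
func parseLinkWidthLabel(label string) int {
	s := strings.TrimPrefix(strings.ToLower(strings.TrimSpace(label)), "x")
	n, err := strconv.Atoi(s)
	if err != nil || n <= 0 {
		return 0
	}
	return n
}

func main() {
	iface := map[string]interface{}{"LanesInUse": 8, "Maxlanes": 8}
	fmt.Println(lookupIntFold(iface, "MaxLanes"), parseLinkWidthLabel("X8"))
}
```

Trying the exact key before the case-insensitive scan keeps the result deterministic when a document carries both spellings, since Go map iteration order is not fixed.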
func TestParseNIC_DropsUnrealisticPortCount(t *testing.T) {
nic := parseNIC(map[string]interface{}{
"Id": "1",
@@ -2388,6 +2431,279 @@ func TestReplayCollectGPUs_DoesNotCollapseOnPlaceholderSerialAndSkipsNIC(t *test
}
}
func TestReplayCollectPCIeDevices_SkipsMSITopologyNoiseClasses(t *testing.T) {
r := redfishSnapshotReader{tree: map[string]interface{}{
"/redfish/v1/Chassis/1/PCIeDevices": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/bridge"},
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/processor"},
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/signal"},
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/serial"},
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/display"},
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/network"},
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/storage"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/bridge": map[string]interface{}{
"Id": "bridge",
"Name": "Bridge",
"Description": "Bridge Device",
"Manufacturer": "Intel Corporation",
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/bridge/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/bridge/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/bridge/PCIeFunctions/1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/bridge/PCIeFunctions/1": map[string]interface{}{
"DeviceClass": "Bridge",
"VendorId": "0x8086",
"DeviceId": "0x0db0",
},
"/redfish/v1/Chassis/1/PCIeDevices/processor": map[string]interface{}{
"Id": "processor",
"Name": "Processor",
"Description": "Processor Device",
"Manufacturer": "Intel Corporation",
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/processor/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/processor/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/processor/PCIeFunctions/1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/processor/PCIeFunctions/1": map[string]interface{}{
"DeviceClass": "Processor",
"VendorId": "0x8086",
"DeviceId": "0x4944",
},
"/redfish/v1/Chassis/1/PCIeDevices/signal": map[string]interface{}{
"Id": "signal",
"Name": "Signal",
"Manufacturer": "Intel Corporation",
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/signal/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/signal/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/signal/PCIeFunctions/1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/signal/PCIeFunctions/1": map[string]interface{}{
"DeviceClass": "SignalProcessingController",
"VendorId": "0x8086",
"DeviceId": "0x3254",
},
"/redfish/v1/Chassis/1/PCIeDevices/serial": map[string]interface{}{
"Id": "serial",
"Name": "Serial",
"Manufacturer": "Renesas",
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/serial/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/serial/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/serial/PCIeFunctions/1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/serial/PCIeFunctions/1": map[string]interface{}{
"DeviceClass": "SerialBusController",
"VendorId": "0x1912",
"DeviceId": "0x0014",
},
"/redfish/v1/Chassis/1/PCIeDevices/display": map[string]interface{}{
"Id": "display",
"Name": "Display",
"Description": "Display Device",
"Manufacturer": "NVIDIA Corporation",
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/display/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/display/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/display/PCIeFunctions/1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/display/PCIeFunctions/1": map[string]interface{}{
"DeviceClass": "DisplayController",
"VendorId": "0x10de",
"DeviceId": "0x233b",
},
"/redfish/v1/Chassis/1/PCIeDevices/network": map[string]interface{}{
"Id": "network",
"Name": "NIC",
"Description": "Network Device",
"Manufacturer": "Mellanox Technologies",
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/network/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/network/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/network/PCIeFunctions/1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/network/PCIeFunctions/1": map[string]interface{}{
"DeviceClass": "NetworkController",
"VendorId": "0x15b3",
"DeviceId": "0x101f",
},
"/redfish/v1/Chassis/1/PCIeDevices/storage": map[string]interface{}{
"Id": "storage",
"Name": "Storage",
"Description": "Storage Device",
"Manufacturer": "Intel Corporation",
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/storage/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/storage/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/storage/PCIeFunctions/1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/storage/PCIeFunctions/1": map[string]interface{}{
"DeviceClass": "MassStorageController",
"VendorId": "0x1234",
"DeviceId": "0x5678",
},
}}
got := r.collectPCIeDevices(nil, []string{"/redfish/v1/Chassis/1"})
if len(got) != 2 {
t.Fatalf("expected only endpoint PCIe devices to remain, got %d: %+v", len(got), got)
}
classes := map[string]bool{}
for _, dev := range got {
classes[dev.DeviceClass] = true
}
if !classes["NetworkController"] || !classes["MassStorageController"] {
t.Fatalf("expected network and storage PCIe devices to remain, got %+v", got)
}
if classes["Bridge"] || classes["Processor"] || classes["SignalProcessingController"] || classes["SerialBusController"] || classes["DisplayController"] {
t.Fatalf("expected MSI topology noise classes to be filtered, got %+v", got)
}
}
func TestReplayCollectPCIeDevices_SkipsNICsAlreadyRepresentedAsNetworkAdapters(t *testing.T) {
r := redfishSnapshotReader{tree: map[string]interface{}{
"/redfish/v1/Chassis/1/PCIeDevices": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/nic"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/nic": map[string]interface{}{
"Id": "nic",
"Name": "PCIeDevice_00_39_00",
"Description": "Network Device",
"Manufacturer": "Mellanox Technologies",
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/nic/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/nic/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/nic/PCIeFunctions/1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/nic/PCIeFunctions/1": map[string]interface{}{
"DeviceClass": "NetworkController",
"VendorId": "0x15b3",
"DeviceId": "0x101f",
"Links": map[string]interface{}{
"NetworkDeviceFunctions": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/NIC1/NetworkDeviceFunctions/Function0"},
},
"NetworkDeviceFunctions@odata.count": 1,
},
},
}}
got := r.collectPCIeDevices(nil, []string{"/redfish/v1/Chassis/1"})
if len(got) != 0 {
t.Fatalf("expected network-backed PCIe duplicate to be skipped, got %+v", got)
}
}
func TestReplayCollectPCIeDevices_SkipsStorageServiceEndpoints(t *testing.T) {
r := redfishSnapshotReader{tree: map[string]interface{}{
"/redfish/v1/Chassis/1/PCIeDevices": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/vmd"},
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/switch-mgmt"},
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/hba"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/vmd": map[string]interface{}{
"Id": "vmd",
"Description": "Storage Device",
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/vmd/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/vmd/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/vmd/PCIeFunctions/1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/vmd/PCIeFunctions/1": map[string]interface{}{
"DeviceClass": "MassStorageController",
"VendorId": "0x8086",
"DeviceId": "0x28c0",
},
"/redfish/v1/Chassis/1/PCIeDevices/switch-mgmt": map[string]interface{}{
"Id": "switch-mgmt",
"Description": "Storage Device",
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/switch-mgmt/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/switch-mgmt/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/switch-mgmt/PCIeFunctions/1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/switch-mgmt/PCIeFunctions/1": map[string]interface{}{
"DeviceClass": "MassStorageController",
"VendorId": "0x1000",
"DeviceId": "0x00b2",
},
"/redfish/v1/Chassis/1/PCIeDevices/hba": map[string]interface{}{
"Id": "hba",
"Description": "Storage Device",
"PCIeFunctions": map[string]interface{}{
"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/hba/PCIeFunctions",
},
},
"/redfish/v1/Chassis/1/PCIeDevices/hba/PCIeFunctions": map[string]interface{}{
"Members": []interface{}{
map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/PCIeDevices/hba/PCIeFunctions/1"},
},
},
"/redfish/v1/Chassis/1/PCIeDevices/hba/PCIeFunctions/1": map[string]interface{}{
"DeviceClass": "MassStorageController",
"VendorId": "0x1234",
"DeviceId": "0x5678",
},
}}
got := r.collectPCIeDevices(nil, []string{"/redfish/v1/Chassis/1"})
if len(got) != 1 {
t.Fatalf("expected only non-service storage controller to remain, got %+v", got)
}
if got[0].VendorID != 0x1234 || got[0].DeviceID != 0x5678 {
t.Fatalf("expected generic HBA to remain, got %+v", got[0])
}
}
func TestParseBoardInfo_NormalizesNullPlaceholders(t *testing.T) {
got := parseBoardInfo(map[string]interface{}{
"Manufacturer": "NULL",
@@ -2499,6 +2815,28 @@ func TestReplayCollectGPUs_DedupUsesRedfishPathBeforeHeuristics(t *testing.T) {
}
}
func TestParseGPU_xFusionPCIeInterfaceMaxlanes(t *testing.T) {
// xFusion GPU PCIeDevices (PCIeCard1..N) carry link width in PCIeInterface
// with "Maxlanes" (lowercase 'l') rather than "MaxLanes".
doc := map[string]interface{}{
"Id": "PCIeCard1",
"Model": "RTX PRO 6000",
"PCIeInterface": map[string]interface{}{
"LanesInUse": 16,
"Maxlanes": 16,
"PCIeType": "Gen5",
"MaxPCIeType": "Gen5",
},
}
gpu := parseGPU(doc, nil, 1)
if gpu.CurrentLinkWidth != 16 || gpu.MaxLinkWidth != 16 {
t.Fatalf("expected link widths 16/16 from PCIeInterface, got current=%d max=%d", gpu.CurrentLinkWidth, gpu.MaxLinkWidth)
}
if gpu.CurrentLinkSpeed != "Gen5" || gpu.MaxLinkSpeed != "Gen5" {
t.Fatalf("expected link speeds Gen5/Gen5 from PCIeInterface, got current=%q max=%q", gpu.CurrentLinkSpeed, gpu.MaxLinkSpeed)
}
}
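Review note: the test above relies on `parseGPU` accepting both `MaxLanes` and the xFusion spelling `Maxlanes`. The parser change itself is not visible in this hunk, so the following is only a minimal sketch of one way to tolerate alternate key spellings. `intFromKeys` is a hypothetical helper name, not a function from the repository.

```go
package main

// intFromKeys returns the first integer found under any of the given keys.
// Hypothetical helper for illustration only.
// Numbers decoded from JSON into map[string]interface{} arrive as float64,
// while hand-built test fixtures often use int, so both are accepted.
func intFromKeys(doc map[string]interface{}, keys ...string) (int, bool) {
	for _, key := range keys {
		switch v := doc[key].(type) {
		case int:
			return v, true
		case float64:
			return int(v), true
		}
	}
	return 0, false
}
```

Calling `intFromKeys(iface, "MaxLanes", "Maxlanes")` tries the standard spelling first and falls back to the vendor variant, which keeps the lookup order explicit instead of lowercasing every key.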
func TestParseGPU_UsesNestedOemSerialNumber(t *testing.T) {
doc := map[string]interface{}{
"Id": "GPU4",
@@ -3527,8 +3865,11 @@ func TestShouldCrawlPath_MemoryAndProcessorMetricsAreAllowed(t *testing.T) {
if !shouldCrawlPath("/redfish/v1/Systems/1/Processors/CPU0/ProcessorMetrics") {
t.Fatalf("expected CPU metrics subresource to be crawlable")
}
if shouldCrawlPath("/redfish/v1/Chassis/1/PCIeDevices/0/PCIeFunctions") {
t.Fatalf("expected broad chassis PCIeFunctions collection to be skipped")
}
if !shouldCrawlPath("/redfish/v1/Chassis/1/PCIeDevices/0/PCIeFunctions/1") {
t.Fatalf("expected chassis pciefunctions resource to be crawlable for NIC/GPU identity recovery")
t.Fatalf("expected direct chassis PCIeFunction member to remain crawlable")
}
if !shouldCrawlPath("/redfish/v1/Fabrics/HGX_NVLinkFabric_0/Switches/NVSwitch_0") {
t.Fatalf("expected NVSwitch fabric resource to be crawlable")

@@ -326,6 +326,95 @@ func TestBuildAnalysisDirectives_SupermicroEnablesStorageRecovery(t *testing.T)
}
}
func TestMatchProfiles_LenovoXCCSelectsMatchedModeAndExcludesSensors(t *testing.T) {
match := MatchProfiles(MatchSignals{
SystemManufacturer: "Lenovo",
ChassisManufacturer: "Lenovo",
OEMNamespaces: []string{"Lenovo"},
})
if match.Mode != ModeMatched {
t.Fatalf("expected matched mode, got %q", match.Mode)
}
found := false
for _, profile := range match.Profiles {
if profile.Name() == "lenovo" {
found = true
break
}
}
if !found {
t.Fatal("expected lenovo profile to be selected")
}
// Verify the acquisition plan excludes noisy Lenovo-specific snapshot paths.
plan := BuildAcquisitionPlan(MatchSignals{
SystemManufacturer: "Lenovo",
ChassisManufacturer: "Lenovo",
OEMNamespaces: []string{"Lenovo"},
})
wantExcluded := []string{
"/Sensors/",
"/Oem/Lenovo/LEDs/",
"/Oem/Lenovo/Slots/",
"/Oem/Lenovo/Configuration",
"/NetworkProtocol/Oem/Lenovo/",
"/VirtualMedia/",
"/ThermalSubsystem/Fans/",
}
for _, want := range wantExcluded {
found := false
for _, ex := range plan.Tuning.SnapshotExcludeContains {
if ex == want {
found = true
break
}
}
if !found {
t.Errorf("expected SnapshotExcludeContains to include %q, got %v", want, plan.Tuning.SnapshotExcludeContains)
}
}
}
func TestResolveAcquisitionPlan_LenovoFiltersNonInventoryChassisBranches(t *testing.T) {
signals := MatchSignals{
SystemManufacturer: "Lenovo",
ChassisManufacturer: "Lenovo",
OEMNamespaces: []string{"Lenovo"},
ResourceHints: []string{
"/redfish/v1/Chassis/1/Power",
"/redfish/v1/Chassis/1/Thermal",
"/redfish/v1/Chassis/1/NetworkAdapters",
"/redfish/v1/Chassis/3",
"/redfish/v1/Chassis/IO_Board",
},
}
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
ChassisPaths: []string{
"/redfish/v1/Chassis/1",
"/redfish/v1/Chassis/3",
"/redfish/v1/Chassis/IO_Board",
},
}, signals)
if !containsString(resolved.CriticalPaths, "/redfish/v1/Chassis/1/Power") {
t.Fatal("expected primary Lenovo chassis power path to remain critical")
}
if containsString(resolved.CriticalPaths, "/redfish/v1/Chassis/3/Power") {
t.Fatal("did not expect non-inventory Lenovo backplane chassis power path")
}
if containsString(resolved.CriticalPaths, "/redfish/v1/Chassis/IO_Board/Assembly") {
t.Fatal("did not expect IO board assembly path without inventory hints")
}
if containsString(resolved.Plan.PlanBPaths, "/redfish/v1/Chassis/3/Assembly") {
t.Fatal("did not expect non-inventory Lenovo chassis plan-b target")
}
if !containsString(resolved.CriticalPaths, "/redfish/v1/Chassis/3") {
t.Fatal("expected chassis root to remain discoverable even when suffixes are filtered")
}
}
func TestMatchProfiles_OrderingIsDeterministic(t *testing.T) {
signals := MatchSignals{
SystemManufacturer: "Micro-Star International Co., Ltd.",

@@ -0,0 +1,175 @@
package redfishprofile
import "strings"
func lenovoProfile() Profile {
return staticProfile{
name: "lenovo",
priority: 20,
safeForFallback: true,
matchFn: func(s MatchSignals) int {
score := 0
if containsFold(s.SystemManufacturer, "lenovo") ||
containsFold(s.ChassisManufacturer, "lenovo") {
score += 80
}
for _, ns := range s.OEMNamespaces {
if containsFold(ns, "lenovo") {
score += 30
break
}
}
// Lenovo XClarity Controller (XCC) is the BMC product line.
if containsFold(s.ServiceRootProduct, "xclarity") ||
containsFold(s.ServiceRootProduct, "xcc") {
score += 30
}
return min(score, 100)
},
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
// Lenovo XCC BMC exposes Chassis/1/Sensors with hundreds of individual
// sensor member documents (e.g. Chassis/1/Sensors/101L1). These are
// not used by any LOGPile parser — thermal/power data is read from
// the aggregate Chassis/*/Thermal and Chassis/*/Power endpoints. On
// a real server they largely return errors, wasting many minutes.
// Lenovo OEM subtrees under Oem/Lenovo/LEDs and Oem/Lenovo/Slots also
// enumerate dozens of individual documents not relevant to inventory.
ensureSnapshotExcludeContains(plan,
"/Sensors/", // individual sensor docs (Chassis/1/Sensors/NNN)
"/Oem/Lenovo/LEDs/", // individual LED status entries (~47 per server)
"/Oem/Lenovo/Slots/", // individual slot detail entries (~26 per server)
"/Oem/Lenovo/Metrics/", // operational metrics, not inventory
"/Oem/Lenovo/History", // historical telemetry
"/Oem/Lenovo/Configuration", // BMC config service, not inventory
"/Oem/Lenovo/DateTimeService", // BMC time service config
"/Oem/Lenovo/GroupService", // XCC fleet/group management state
"/Oem/Lenovo/Recipients", // alert recipient config
"/Oem/Lenovo/RemoteControl", // remote-media/session management
"/Oem/Lenovo/RemoteMap", // remote-media mapping config
"/Oem/Lenovo/SecureKeyLifecycleService", // key lifecycle/cert config
"/Oem/Lenovo/ServerProfile", // profile export/import config
"/Oem/Lenovo/ServiceData", // support/service metadata
"/Oem/Lenovo/SsoCertificates", // SSO certificate config
"/Oem/Lenovo/SystemGuard", // snapshot/history service
"/Oem/Lenovo/Watchdogs", // watchdog config
"/Oem/Lenovo/ScheduledPower", // power scheduling config
"/Oem/Lenovo/BootSettings/BootOrder", // individual boot order lists
"/NetworkProtocol/Oem/Lenovo/", // DNS/LDAP/SMTP/SNMP manager config
"/PortForwardingMap/", // network port forwarding config
"/VirtualMedia/", // virtual media inventory/config, not hardware
"/Boot/Certificates", // secure boot certificate stores, not inventory
"/ThermalSubsystem/Fans/", // per-fan member docs; replay uses aggregate Thermal only
)
// Lenovo XCC BMC is typically slow (p95 latency often 3-5s even under
// normal load). Set rate thresholds that don't over-throttle on the
// first few requests, and give the ETA estimator a realistic baseline.
ensureRatePolicy(plan, AcquisitionRatePolicy{
TargetP95LatencyMS: 2000,
ThrottleP95LatencyMS: 4000,
MinSnapshotWorkers: 2,
MinPrefetchWorkers: 1,
DisablePrefetchOnErrors: true,
})
ensureETABaseline(plan, AcquisitionETABaseline{
DiscoverySeconds: 15,
SnapshotSeconds: 120,
PrefetchSeconds: 30,
CriticalPlanBSeconds: 40,
ProfilePlanBSeconds: 20,
})
addPlanNote(plan, "lenovo xcc acquisition extensions enabled: noisy sensor/oem paths excluded from snapshot")
},
refineAcquisition: func(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, signals MatchSignals) {
allowedChassis := lenovoAllowedInventoryChassis(discovered.ChassisPaths, signals.ResourceHints)
resolved.SeedPaths = filterLenovoChassisInventoryPaths(resolved.SeedPaths, allowedChassis)
resolved.CriticalPaths = filterLenovoChassisInventoryPaths(resolved.CriticalPaths, allowedChassis)
resolved.Plan.SeedPaths = filterLenovoChassisInventoryPaths(resolved.Plan.SeedPaths, allowedChassis)
resolved.Plan.CriticalPaths = filterLenovoChassisInventoryPaths(resolved.Plan.CriticalPaths, allowedChassis)
resolved.Plan.PlanBPaths = filterLenovoChassisInventoryPaths(resolved.Plan.PlanBPaths, allowedChassis)
},
}
}
func lenovoAllowedInventoryChassis(chassisPaths, resourceHints []string) map[string]struct{} {
allowed := make(map[string]struct{}, len(chassisPaths))
for _, chassisPath := range chassisPaths {
normalized := normalizePath(chassisPath)
if normalized == "" {
continue
}
if normalized == "/redfish/v1/Chassis/1" {
allowed[normalized] = struct{}{}
continue
}
for _, hint := range resourceHints {
hint = normalizePath(hint)
if !strings.HasPrefix(hint, normalized+"/") {
continue
}
if lenovoHintLooksLikeChassisInventory(hint) {
allowed[normalized] = struct{}{}
break
}
}
}
return allowed
}
func lenovoHintLooksLikeChassisInventory(path string) bool {
for _, suffix := range []string{
"/Power",
"/PowerSubsystem",
"/PowerSubsystem/PowerSupplies",
"/Thermal",
"/ThresholdSensors",
"/DiscreteSensors",
"/SensorsList",
"/NetworkAdapters",
"/PCIeDevices",
"/Drives",
"/Assembly",
} {
if strings.HasSuffix(path, suffix) || strings.Contains(path, suffix+"/") {
return true
}
}
return false
}
func filterLenovoChassisInventoryPaths(paths []string, allowedChassis map[string]struct{}) []string {
if len(paths) == 0 {
return nil
}
out := make([]string, 0, len(paths))
for _, path := range paths {
normalized := normalizePath(path)
chassis := lenovoPathChassisRoot(normalized)
if chassis == "" {
out = append(out, normalized)
continue
}
if normalized == chassis {
out = append(out, normalized)
continue
}
if _, ok := allowedChassis[chassis]; ok {
out = append(out, normalized)
}
}
return dedupeSorted(out)
}
func lenovoPathChassisRoot(path string) string {
const prefix = "/redfish/v1/Chassis/"
if !strings.HasPrefix(path, prefix) {
return ""
}
rest := strings.TrimPrefix(path, prefix)
if rest == "" {
return ""
}
if idx := strings.IndexByte(rest, '/'); idx >= 0 {
return prefix + rest[:idx]
}
return prefix + rest
}
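Review note: the patterns passed to `ensureSnapshotExcludeContains` end in a slash for collections such as `/Sensors/`. Assuming the snapshot enqueue check is a plain substring match (that closure lives in a file not shown in this part of the diff), the trailing slash would skip individual member documents while still allowing the collection document itself. The sketch below illustrates that assumption; `excludedFromSnapshot` is a hypothetical name.

```go
package main

import "strings"

// excludedFromSnapshot reports whether path contains any exclusion substring.
// Hypothetical helper: it assumes plain substring matching, which may differ
// from the actual enqueue check in the collector.
func excludedFromSnapshot(path string, excludeContains []string) bool {
	for _, pattern := range excludeContains {
		// Skip empty patterns, which would otherwise match every path.
		if pattern != "" && strings.Contains(path, pattern) {
			return true
		}
	}
	return false
}
```

Under this assumption `/redfish/v1/Chassis/1/Sensors/101L1` is excluded by `/Sensors/`, whereas `/redfish/v1/Chassis/1/Sensors` (no trailing slash) is not.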

@@ -56,6 +56,7 @@ func BuiltinProfiles() []Profile {
supermicroProfile(),
dellProfile(),
hpeProfile(),
lenovoProfile(),
inspurGroupOEMPlatformsProfile(),
hgxProfile(),
xfusionProfile(),
@@ -226,6 +227,10 @@ func ensurePrefetchPolicy(plan *AcquisitionPlan, policy AcquisitionPrefetchPolic
addPlanPaths(&plan.Tuning.PrefetchPolicy.ExcludeContains, policy.ExcludeContains...)
}
func ensureSnapshotExcludeContains(plan *AcquisitionPlan, patterns ...string) {
addPlanPaths(&plan.Tuning.SnapshotExcludeContains, patterns...)
}
func min(a, b int) int {
if a < b {
return a

@@ -53,16 +53,17 @@ type AcquisitionScopedPathPolicy struct {
}
type AcquisitionTuning struct {
SnapshotMaxDocuments int
SnapshotWorkers int
PrefetchEnabled *bool
PrefetchWorkers int
NVMePostProbeEnabled *bool
RatePolicy AcquisitionRatePolicy
ETABaseline AcquisitionETABaseline
PostProbePolicy AcquisitionPostProbePolicy
RecoveryPolicy AcquisitionRecoveryPolicy
PrefetchPolicy AcquisitionPrefetchPolicy
SnapshotMaxDocuments int
SnapshotWorkers int
SnapshotExcludeContains []string
PrefetchEnabled *bool
PrefetchWorkers int
NVMePostProbeEnabled *bool
RatePolicy AcquisitionRatePolicy
ETABaseline AcquisitionETABaseline
PostProbePolicy AcquisitionPostProbePolicy
RecoveryPolicy AcquisitionRecoveryPolicy
PrefetchPolicy AcquisitionPrefetchPolicy
}
type AcquisitionRatePolicy struct {

@@ -15,9 +15,8 @@ type Request struct {
Password string
Token string
TLSMode string
PowerOnIfHostOff bool
StopHostAfterCollect bool
DebugPayloads bool
DebugPayloads bool
SkipHungCh <-chan struct{}
}
type Progress struct {
@@ -65,10 +64,9 @@ type PhaseTelemetry struct {
type ProbeResult struct {
Reachable bool
Protocol string
HostPowerState string
HostPoweredOn bool
PowerControlAvailable bool
SystemPath string
HostPowerState string
HostPoweredOn bool
SystemPath string
}
type Connector interface {

@@ -1961,7 +1961,10 @@ func pcieDedupKey(item ReanimatorPCIe) string {
slot := strings.ToLower(strings.TrimSpace(item.Slot))
serial := strings.ToLower(strings.TrimSpace(item.SerialNumber))
bdf := strings.ToLower(strings.TrimSpace(item.BDF))
if slot != "" {
// Generic slot names (e.g. "PCIe Device" from HGX BMC) are not unique
// hardware positions — multiple distinct devices share the same name.
// Fall through to serial/BDF so they are not incorrectly collapsed.
if slot != "" && !isGenericPCIeSlotName(slot) {
return "slot:" + slot
}
if serial != "" {
@@ -1970,9 +1973,22 @@ func pcieDedupKey(item ReanimatorPCIe) string {
if bdf != "" {
return "bdf:" + bdf
}
if slot != "" {
return "slot:" + slot
}
return strings.ToLower(strings.TrimSpace(item.DeviceClass)) + "|" + strings.ToLower(strings.TrimSpace(item.Model))
}
// isGenericPCIeSlotName reports whether slot is a generic device-type label
// rather than a unique hardware position identifier.
func isGenericPCIeSlotName(slot string) bool {
switch slot {
case "pcie device", "pcie slot", "pcie":
return true
}
return false
}
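Review note: the key priority after this change can be tried outside the repository with a reduced stand-in type. The sketch below is an approximation under stated assumptions: `pcieItem` and `dedupKey` are hypothetical names, the `"serial:"` prefix is assumed because that line falls outside the visible hunk, and the final device-class/model fallback is omitted.

```go
package main

import "strings"

// pcieItem is a reduced, hypothetical stand-in for the exporter's PCIe record.
type pcieItem struct {
	Slot, SerialNumber, BDF string
}

// isGenericSlot mirrors the generic-label check from the diff.
// The caller is expected to pass an already lowercased, trimmed slot.
func isGenericSlot(slot string) bool {
	switch slot {
	case "pcie device", "pcie slot", "pcie":
		return true
	}
	return false
}

// dedupKey prefers a positional slot, then serial, then BDF, and only
// falls back to a generic slot label when nothing better is available.
func dedupKey(item pcieItem) string {
	slot := strings.ToLower(strings.TrimSpace(item.Slot))
	serial := strings.ToLower(strings.TrimSpace(item.SerialNumber))
	bdf := strings.ToLower(strings.TrimSpace(item.BDF))
	if slot != "" && !isGenericSlot(slot) {
		return "slot:" + slot
	}
	if serial != "" {
		return "serial:" + serial // prefix assumed, not shown in the hunk
	}
	if bdf != "" {
		return "bdf:" + bdf
	}
	return "slot:" + slot
}
```

With this ordering, two devices that both report `PCIe Device` but carry different serial numbers get distinct keys, while two records for the same positional slot still collapse to one.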
func pcieQualityScore(item ReanimatorPCIe) int {
score := 0
if strings.TrimSpace(item.SerialNumber) != "" {
@@ -2246,10 +2262,8 @@ func normalizePCIeDeviceClass(d models.HardwareDevice) string {
func normalizeLegacyPCIeDeviceClass(deviceClass string) string {
switch strings.ToLower(strings.TrimSpace(deviceClass)) {
case "", "network", "network controller", "networkcontroller":
case "", "network", "network controller", "networkcontroller", "ethernet", "ethernet controller", "ethernetcontroller":
return "NetworkController"
case "ethernet", "ethernet controller", "ethernetcontroller":
return "EthernetController"
case "fibre channel", "fibre channel controller", "fibrechannelcontroller", "fc":
return "FibreChannelController"
case "display", "displaycontroller", "display controller", "vga":
@@ -2270,8 +2284,6 @@ func normalizeLegacyPCIeDeviceClass(deviceClass string) string {
func normalizeNetworkDeviceClass(portType, model, description string) string {
joined := strings.ToLower(strings.TrimSpace(strings.Join([]string{portType, model, description}, " ")))
switch {
case strings.Contains(joined, "ethernet"):
return "EthernetController"
case strings.Contains(joined, "fibre channel") || strings.Contains(joined, " fibrechannel") || strings.Contains(joined, "fc "):
return "FibreChannelController"
default:

@@ -733,6 +733,42 @@ func TestConvertPCIeDevices_SkipsDisplayControllerDuplicates(t *testing.T) {
}
}
func TestConvertPCIeDevices_PreservesAllGPUsWithGenericSlot(t *testing.T) {
// Supermicro HGX BMC reports all GPU PCIe devices with Name "PCIe Device" —
// a generic label that is not a unique hardware position. All 8 GPUs must
// be preserved; dedup by generic slot name must not collapse them into one.
gpus := make([]models.GPU, 8)
serials := []string{
"1654925165720", "1654925166160", "1654925165942", "1654925165271",
"1654925165719", "1654925165252", "1654925165304", "1654925165587",
}
for i, sn := range serials {
gpus[i] = models.GPU{
Slot: "PCIe Device",
Model: "B200 180GB HBM3e",
Manufacturer: "NVIDIA",
SerialNumber: sn,
PartNumber: "2901-886-A1",
Status: "OK",
}
}
hw := &models.HardwareConfig{GPUs: gpus}
result := convertPCIeDevices(hw, "2026-04-13T10:00:00Z")
if len(result) != 8 {
t.Fatalf("expected 8 GPU entries (one per serial), got %d", len(result))
}
seen := make(map[string]bool)
for _, r := range result {
if seen[r.SerialNumber] {
t.Fatalf("duplicate serial %q in PCIe result", r.SerialNumber)
}
seen[r.SerialNumber] = true
if r.DeviceClass != "VideoController" {
t.Fatalf("expected VideoController device class, got %q", r.DeviceClass)
}
}
}
func TestConvertPCIeDevices_MapsGPUStatusHistory(t *testing.T) {
hw := &models.HardwareConfig{
GPUs: []models.GPU{
@@ -1733,6 +1769,43 @@ func TestConvertToReanimator_ExportsContractV24Telemetry(t *testing.T) {
}
}
func TestConvertToReanimator_UnifiesEthernetAndNetworkControllers(t *testing.T) {
input := &models.AnalysisResult{
Hardware: &models.HardwareConfig{
BoardInfo: models.BoardInfo{SerialNumber: "BOARD-123"},
Devices: []models.HardwareDevice{
{
Kind: models.DeviceKindPCIe,
Slot: "PCIe1",
DeviceClass: "EthernetController",
Present: boolPtr(true),
SerialNumber: "ETH-001",
},
{
Kind: models.DeviceKindNetwork,
Slot: "NIC1",
Model: "Ethernet Adapter",
Present: boolPtr(true),
SerialNumber: "NIC-001",
},
},
},
}
out, err := ConvertToReanimator(input)
if err != nil {
t.Fatalf("ConvertToReanimator() failed: %v", err)
}
if len(out.Hardware.PCIeDevices) != 2 {
t.Fatalf("expected two pcie-class exports, got %d", len(out.Hardware.PCIeDevices))
}
for _, dev := range out.Hardware.PCIeDevices {
if dev.DeviceClass != "NetworkController" {
t.Fatalf("expected unified NetworkController class, got %+v", dev)
}
}
}
func TestConvertToReanimator_PreservesLegacyStorageAndPSUDetails(t *testing.T) {
input := &models.AnalysisResult{
Filename: "legacy-details.json",

@@ -0,0 +1,873 @@
// Package lenovo_xcc provides a parser for Lenovo XCC mini-log archives.
// Tested with: ThinkSystem SR650 V3 (XCC mini-log zip, exported via XCC UI)
//
// Archive structure: zip with tmp/ directory containing JSON .log files.
//
// IMPORTANT: Increment parserVersion when modifying parser logic!
package lenovo_xcc
import (
"encoding/json"
"fmt"
"regexp"
"strconv"
"strings"
"time"
"git.mchus.pro/mchus/logpile/internal/models"
"git.mchus.pro/mchus/logpile/internal/parser"
)
const parserVersion = "1.2"
func init() {
parser.Register(&Parser{})
}
// Parser implements VendorParser for Lenovo XCC mini-log archives.
type Parser struct{}
func (p *Parser) Name() string { return "Lenovo XCC Mini-Log Parser" }
func (p *Parser) Vendor() string { return "lenovo_xcc" }
func (p *Parser) Version() string { return parserVersion }
// Detect checks if files match the Lenovo XCC mini-log archive format.
// Returns a confidence score from 0 to 100.
func (p *Parser) Detect(files []parser.ExtractedFile) int {
confidence := 0
for _, f := range files {
path := strings.ToLower(f.Path)
switch {
case strings.HasSuffix(path, "tmp/basic_sys_info.log"):
confidence += 60
case strings.HasSuffix(path, "tmp/inventory_cpu.log"):
confidence += 20
case strings.HasSuffix(path, "tmp/xcc_plat_events1.log"):
confidence += 20
case strings.HasSuffix(path, "tmp/inventory_dimm.log"):
confidence += 10
case strings.HasSuffix(path, "tmp/inventory_fw.log"):
confidence += 10
}
if confidence >= 100 {
return 100
}
}
return confidence
}
// Parse parses the Lenovo XCC mini-log archive and returns an analysis result.
func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, error) {
result := &models.AnalysisResult{
Events: make([]models.Event, 0),
FRU: make([]models.FRUInfo, 0),
Sensors: make([]models.SensorReading, 0),
Hardware: &models.HardwareConfig{
Firmware: make([]models.FirmwareInfo, 0),
CPUs: make([]models.CPU, 0),
Memory: make([]models.MemoryDIMM, 0),
Storage: make([]models.Storage, 0),
PCIeDevices: make([]models.PCIeDevice, 0),
PowerSupply: make([]models.PSU, 0),
},
}
if f := findByPath(files, "tmp/basic_sys_info.log"); f != nil {
parseBasicSysInfo(f.Content, result)
}
if f := findByPath(files, "tmp/inventory_fw.log"); f != nil {
result.Hardware.Firmware = append(result.Hardware.Firmware, parseFirmware(f.Content)...)
}
if f := findByPath(files, "tmp/inventory_cpu.log"); f != nil {
result.Hardware.CPUs = parseCPUs(f.Content)
}
if f := findByPath(files, "tmp/inventory_dimm.log"); f != nil {
memory, events := parseDIMMs(f.Content)
result.Hardware.Memory = memory
result.Events = append(result.Events, events...)
}
if f := findByPath(files, "tmp/inventory_disk.log"); f != nil {
result.Hardware.Storage = parseDisks(f.Content)
}
if f := findByPath(files, "tmp/inventory_card.log"); f != nil {
result.Hardware.PCIeDevices = parseCards(f.Content)
}
if f := findByPath(files, "tmp/inventory_psu.log"); f != nil {
result.Hardware.PowerSupply = parsePSUs(f.Content)
}
if f := findByPath(files, "tmp/inventory_ipmi_fru.log"); f != nil {
result.FRU = parseFRU(f.Content)
enrichBoardFromFRU(result)
}
if f := findByPath(files, "tmp/inventory_ipmi_sensor.log"); f != nil {
result.Sensors = parseSensors(f.Content)
result.Hardware.PowerSupply = enrichPSUsFromSensors(result.Hardware.PowerSupply, result.Sensors)
}
for _, f := range findEventFiles(files) {
result.Events = append(result.Events, parseEvents(f.Content)...)
}
result.Protocol = "ipmi"
result.SourceType = models.SourceTypeArchive
parser.ApplyManufacturedYearWeekFromFRU(result.FRU, result.Hardware)
return result, nil
}
// findByPath returns the first file whose lowercased path ends with the given suffix.
func findByPath(files []parser.ExtractedFile, suffix string) *parser.ExtractedFile {
for i := range files {
if strings.HasSuffix(strings.ToLower(files[i].Path), suffix) {
return &files[i]
}
}
return nil
}
// findEventFiles returns all xcc_plat_eventsN.log files.
func findEventFiles(files []parser.ExtractedFile) []parser.ExtractedFile {
var out []parser.ExtractedFile
for _, f := range files {
path := strings.ToLower(f.Path)
if strings.Contains(path, "tmp/xcc_plat_events") && strings.HasSuffix(path, ".log") {
out = append(out, f)
}
}
return out
}
// --- JSON structures ---
type xccBasicSysInfoDoc struct {
Items []xccBasicSysInfoItem `json:"items"`
}
type xccBasicSysInfoItem struct {
MachineName string `json:"machine_name"`
MachineTypeModel string `json:"machine_typemodel"`
SerialNumber string `json:"serial_number"`
UUID string `json:"uuid"`
PowerState string `json:"power_state"`
ServerState string `json:"server_state"`
CurrentTime string `json:"current_time"`
}
// xccFWEntry covers both basic_sys_info firmware (no type_str) and inventory_fw (has type_str).
type xccFWEntry struct {
Index int `json:"index"`
TypeCode int `json:"type"`
TypeStr string `json:"type_str"` // only in inventory_fw.log
Version string `json:"version"`
Build string `json:"build"`
ReleaseDate string `json:"release_date"`
}
type xccFirmwareDoc struct {
Items []xccFWEntry `json:"items"`
}
type xccCPUDoc struct {
Items []xccCPUItem `json:"items"`
}
type xccCPUItem struct {
Processors []xccCPU `json:"processors"`
}
type xccCPU struct {
Name int `json:"processors_name"`
Model string `json:"processors_cpu_model"`
Cores json.RawMessage `json:"processors_cores"` // may be int or string
Threads json.RawMessage `json:"processors_threads"` // may be int or string
ClockSpeed string `json:"processors_clock_speed"`
L1DataCache string `json:"processors_l1datacache"`
L2Cache string `json:"processors_l2cache"`
L3Cache string `json:"processors_l3cache"`
Status string `json:"processors_status"`
SerialNumber string `json:"processors_serial_number"`
}
type xccDIMMDoc struct {
Items []xccDIMMItem `json:"items"`
}
type xccDIMMItem struct {
Memory []xccDIMM `json:"memory"`
}
type xccDIMM struct {
Index int `json:"memory_index"`
Status string `json:"memory_status"`
Name string `json:"memory_name"`
Type string `json:"memory_type"`
Capacity json.RawMessage `json:"memory_capacity"` // int (GB) or string
PartNumber string `json:"memory_part_number"`
SerialNumber string `json:"memory_serial_number"`
Manufacturer string `json:"memory_manufacturer"`
MemSpeed json.RawMessage `json:"memory_mem_speed"` // int or string
ConfigSpeed json.RawMessage `json:"memory_config_speed"` // int or string
}
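Review note: several fields above are declared as `json.RawMessage` because the XCC export may emit them as either a number or a quoted string. The repository's `rawJSONToInt` is not visible in this part of the diff, so the following is only a sketch of one way such a helper could behave. `intFromRaw` is a hypothetical name, and returning 0 for unparseable input is an assumption.

```go
package main

import (
	"encoding/json"
	"strconv"
	"strings"
)

// intFromRaw decodes a JSON value that may be a number (16) or a quoted
// number ("16"). It returns 0 for null, empty, or non-numeric input.
// Hypothetical helper, not the repository's implementation.
func intFromRaw(raw json.RawMessage) int {
	// First attempt: a bare JSON number.
	var n float64
	if err := json.Unmarshal(raw, &n); err == nil {
		return int(n)
	}
	// Second attempt: a JSON string holding digits.
	var s string
	if err := json.Unmarshal(raw, &s); err == nil {
		if v, err := strconv.Atoi(strings.TrimSpace(s)); err == nil {
			return v
		}
	}
	return 0
}
```

Deferring the decode this way keeps `json.Unmarshal` of the whole document from failing when a single field changes type between firmware versions.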
type xccDiskDoc struct {
Items []xccDiskItem `json:"items"`
}
type xccDiskItem struct {
Disks []xccDisk `json:"disks"`
}
type xccDisk struct {
ID int `json:"id"`
SlotNo int `json:"slotNo"`
Type string `json:"type"`
Interface string `json:"interface"`
Media string `json:"media"`
SerialNo string `json:"serialNo"`
PartNo string `json:"partNo"`
CapacityStr string `json:"capacityStr"` // e.g. "3.20 TB"
Manufacture string `json:"manufacture"`
ProductName string `json:"productName"`
RemainLife int `json:"remainLife"` // 0-100
FWVersion string `json:"fwVersion"`
Temperature int `json:"temperature"`
HealthStatus int `json:"healthStatus"` // int code: 2=Normal
State int `json:"state"`
StateStr string `json:"statestr"`
}
type xccCardDoc struct {
Items []xccCard `json:"items"`
}
type xccCard struct {
Key int `json:"key"`
SlotNo int `json:"slotNo"`
AdapterName string `json:"adapterName"`
ConnectorLabel string `json:"connectorLabel"`
OOBSupported int `json:"oobSupported"`
Location int `json:"location"`
Functions []xccCardFunc `json:"functions"`
}
type xccCardFunc struct {
FunType int `json:"funType"`
BusNo int `json:"generic_busNo"`
DevNo int `json:"generic_devNo"`
FunNo int `json:"generic_funNo"`
VendorID int `json:"generic_vendorId"` // direct int
DeviceID int `json:"generic_devId"` // direct int
SlotDesignation string `json:"generic_slotDesignation"`
}
type xccPSUDoc struct {
Items []xccPSUItem `json:"items"`
}
type xccPSUItem struct {
Power []xccPSU `json:"power"`
}
type xccPSU struct {
Name int `json:"name"`
Status string `json:"status"`
RatedPower int `json:"rated_power"`
PartNumber string `json:"part_number"`
FRUNumber string `json:"fru_number"`
SerialNumber string `json:"serial_number"`
ManufID string `json:"manuf_id"`
}
type xccFRUDoc struct {
Items []xccFRUItem `json:"items"`
}
type xccFRUItem struct {
BuiltinFRU []map[string]string `json:"builtin_fru_device"`
}
type xccSensorDoc struct {
Items []xccSensor `json:"items"`
}
type xccSensor struct {
Name string `json:"Sensor Name"`
Value string `json:"Value"`
Status string `json:"status"`
Unit string `json:"unit"`
}
type xccEventDoc struct {
Items []xccEvent `json:"items"`
}
type xccEvent struct {
Severity string `json:"severity"` // "I", "W", "E", "C"
Source string `json:"source"`
Date string `json:"date"` // "2025-12-22T13:24:02.070"
Index int `json:"index"`
EventID string `json:"eventid"`
CmnID string `json:"cmnid"`
Message string `json:"message"`
}
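Review note: the event `date` field is documented above as `"2025-12-22T13:24:02.070"`, with fractional seconds and no zone. `parseXCCTime` is defined outside this part of the diff, so the sketch below only shows how such a value can be handled with the standard library. `parseEventTime` is a hypothetical name, and treating the timestamp as UTC is an assumption, since the export does not say which zone the BMC clock uses.

```go
package main

import (
	"strings"
	"time"
)

// parseEventTime parses an XCC-style timestamp without a zone suffix.
// Hypothetical helper for illustration only.
// time.Parse accepts a fractional-seconds field in the input even when the
// layout omits it, and returns UTC when the input carries no zone.
func parseEventTime(s string) (time.Time, error) {
	return time.Parse("2006-01-02T15:04:05", strings.TrimSpace(s))
}
```

If the BMC actually reports local time, the `.UTC()` call applied to the parsed value in `parseEvents` is a no-op and the stored timestamp stays offset by the BMC's zone, which may be worth confirming against a real archive.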
// --- Parsers ---
func parseBasicSysInfo(content []byte, result *models.AnalysisResult) {
var doc xccBasicSysInfoDoc
if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
return
}
item := doc.Items[0]
result.Hardware.BoardInfo = models.BoardInfo{
ProductName: cleanXCCValue(item.MachineTypeModel),
SerialNumber: cleanXCCValue(item.SerialNumber),
UUID: cleanXCCValue(item.UUID),
}
if host := cleanXCCValue(item.MachineName); host != "" {
result.TargetHost = host
}
if t, err := parseXCCTime(item.CurrentTime); err == nil {
result.CollectedAt = t.UTC()
}
}
func parseFirmware(content []byte) []models.FirmwareInfo {
var doc xccFirmwareDoc
if err := json.Unmarshal(content, &doc); err != nil {
return nil
}
var out []models.FirmwareInfo
for _, fw := range doc.Items {
if fi := xccFWEntryToModel(fw); fi != nil {
out = append(out, *fi)
}
}
return out
}
func xccFWEntryToModel(fw xccFWEntry) *models.FirmwareInfo {
name := strings.TrimSpace(fw.TypeStr)
version := strings.TrimSpace(fw.Version)
if name == "" && version == "" {
return nil
}
build := strings.TrimSpace(fw.Build)
v := version
if build != "" {
v = version + " (" + build + ")"
}
return &models.FirmwareInfo{
DeviceName: name,
Version: v,
BuildTime: strings.TrimSpace(fw.ReleaseDate),
}
}
func parseCPUs(content []byte) []models.CPU {
var doc xccCPUDoc
if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
return nil
}
var out []models.CPU
for _, item := range doc.Items {
for _, c := range item.Processors {
cpu := models.CPU{
Socket: c.Name,
Model: strings.TrimSpace(c.Model),
Cores: rawJSONToInt(c.Cores),
Threads: rawJSONToInt(c.Threads),
FrequencyMHz: parseMHz(c.ClockSpeed),
L1CacheKB: parseKB(c.L1DataCache),
L2CacheKB: parseKB(c.L2Cache),
L3CacheKB: parseKB(c.L3Cache),
Status: strings.TrimSpace(c.Status),
SerialNumber: strings.TrimSpace(c.SerialNumber),
}
out = append(out, cpu)
}
}
return out
}
func parseDIMMs(content []byte) ([]models.MemoryDIMM, []models.Event) {
var doc xccDIMMDoc
if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
return nil, nil
}
var out []models.MemoryDIMM
var events []models.Event
for _, item := range doc.Items {
for _, m := range item.Memory {
status := strings.TrimSpace(m.Status)
present := !strings.EqualFold(status, "not present") &&
!strings.EqualFold(status, "absent")
// memory_capacity is in GB (int); convert to MB
capacityGB := rawJSONToInt(m.Capacity)
dimm := models.MemoryDIMM{
Slot: strings.TrimSpace(m.Name),
Location: strings.TrimSpace(m.Name),
Present: present,
SizeMB: capacityGB * 1024,
Type: strings.TrimSpace(m.Type),
MaxSpeedMHz: rawJSONToInt(m.MemSpeed),
CurrentSpeedMHz: rawJSONToInt(m.ConfigSpeed),
Manufacturer: strings.TrimSpace(m.Manufacturer),
SerialNumber: strings.TrimSpace(m.SerialNumber),
PartNumber: strings.TrimSpace(m.PartNumber),
Status: status,
}
out = append(out, dimm)
if isUnqualifiedDIMM(status) {
events = append(events, models.Event{
Source: "Memory",
SensorType: "Memory",
SensorName: dimm.Slot,
EventType: "DIMM Qualification",
Severity: models.SeverityWarning,
Description: status,
})
}
}
}
return out, events
}
func parseDisks(content []byte) []models.Storage {
var doc xccDiskDoc
if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
return nil
}
var out []models.Storage
for _, item := range doc.Items {
for _, d := range item.Disks {
sizeGB := parseCapacityToGB(d.CapacityStr)
stateStr := strings.TrimSpace(d.StateStr)
present := !strings.EqualFold(stateStr, "absent") &&
!strings.EqualFold(stateStr, "not present")
status := mapDiskHealthStatus(d.HealthStatus, stateStr)
disk := models.Storage{
Slot: fmt.Sprintf("%d", d.SlotNo),
Type: strings.TrimSpace(d.Media),
Model: cleanXCCValue(d.ProductName),
SizeGB: sizeGB,
SerialNumber: cleanXCCValue(d.SerialNo),
Manufacturer: cleanXCCValue(d.Manufacture),
Firmware: cleanXCCValue(d.FWVersion),
Interface: strings.TrimSpace(d.Interface),
Present: present,
Status: status,
}
if d.Temperature > 0 {
disk.Details = map[string]any{"temperature_c": d.Temperature}
}
if d.RemainLife >= 0 && d.RemainLife <= 100 {
v := d.RemainLife
disk.RemainingEndurancePct = &v
}
out = append(out, disk)
}
}
return out
}
func parseCards(content []byte) []models.PCIeDevice {
var doc xccCardDoc
if err := json.Unmarshal(content, &doc); err != nil {
return nil
}
var out []models.PCIeDevice
for _, card := range doc.Items {
slot := strings.TrimSpace(card.ConnectorLabel)
if slot == "" {
slot = fmt.Sprintf("%d", card.SlotNo)
}
dev := models.PCIeDevice{
Slot: slot,
Description: strings.TrimSpace(card.AdapterName),
}
if len(card.Functions) > 0 {
fn := card.Functions[0]
dev.BDF = fmt.Sprintf("%02x:%02x.%x", fn.BusNo, fn.DevNo, fn.FunNo)
dev.VendorID = fn.VendorID
dev.DeviceID = fn.DeviceID
}
out = append(out, dev)
}
return out
}
func parsePSUs(content []byte) []models.PSU {
var doc xccPSUDoc
if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
return nil
}
var out []models.PSU
for _, item := range doc.Items {
for _, p := range item.Power {
model := cleanXCCValue(p.FRUNumber)
if model == "" {
model = cleanXCCValue(p.PartNumber)
}
psu := models.PSU{
Slot: fmt.Sprintf("%d", p.Name),
Present: true,
Model: model,
WattageW: p.RatedPower,
SerialNumber: cleanXCCValue(p.SerialNumber),
PartNumber: cleanXCCValue(p.PartNumber),
Vendor: cleanXCCValue(p.ManufID),
Status: strings.TrimSpace(p.Status),
}
out = append(out, psu)
}
}
return out
}
func parseFRU(content []byte) []models.FRUInfo {
var doc xccFRUDoc
if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
return nil
}
var out []models.FRUInfo
for _, item := range doc.Items {
for _, entry := range item.BuiltinFRU {
fru := models.FRUInfo{
Description: entry["FRU Device Description"],
Manufacturer: entry["Board Mfg"],
ProductName: entry["Board Product"],
SerialNumber: entry["Board Serial"],
PartNumber: entry["Board Part Number"],
MfgDate: entry["Board Mfg Date"],
}
if fru.ProductName == "" {
fru.ProductName = entry["Product Name"]
}
if fru.SerialNumber == "" {
fru.SerialNumber = entry["Product Serial"]
}
if fru.PartNumber == "" {
fru.PartNumber = entry["Product Part Number"]
}
if fru.Description == "" && fru.ProductName == "" && fru.SerialNumber == "" {
continue
}
out = append(out, fru)
}
}
return out
}
func parseSensors(content []byte) []models.SensorReading {
var doc xccSensorDoc
if err := json.Unmarshal(content, &doc); err != nil {
return nil
}
var out []models.SensorReading
for _, s := range doc.Items {
name := strings.TrimSpace(s.Name)
if name == "" {
continue
}
unit := strings.TrimSpace(s.Unit)
sr := models.SensorReading{
Name: name,
RawValue: strings.TrimSpace(s.Value),
Unit: unit,
Status: strings.TrimSpace(s.Status),
Type: classifySensorType(name, unit),
}
if v, err := strconv.ParseFloat(sr.RawValue, 64); err == nil {
sr.Value = v
}
out = append(out, sr)
}
return out
}
func parseEvents(content []byte) []models.Event {
var doc xccEventDoc
if err := json.Unmarshal(content, &doc); err != nil {
return nil
}
var out []models.Event
for _, e := range doc.Items {
ev := models.Event{
ID: e.EventID,
Source: strings.TrimSpace(e.Source),
Description: strings.TrimSpace(e.Message),
Severity: xccSeverity(e.Severity, e.Message),
}
if t, err := parseXCCTime(e.Date); err == nil {
ev.Timestamp = t.UTC()
}
out = append(out, ev)
}
return out
}
// --- Cross-reference enrichment ---
// enrichBoardFromFRU sets BoardInfo.Manufacturer from the system board FRU entry
// when it is not already populated. Mirrors bee's board parsing from dmidecode type 1.
func enrichBoardFromFRU(result *models.AnalysisResult) {
if result.Hardware.BoardInfo.Manufacturer != "" {
return
}
for _, fru := range result.FRU {
desc := strings.ToLower(fru.Description)
if !strings.Contains(desc, "system board") &&
!strings.Contains(desc, "planar") &&
!strings.Contains(desc, "backplane") {
continue
}
if mfg := cleanXCCValue(fru.Manufacturer); mfg != "" {
result.Hardware.BoardInfo.Manufacturer = mfg
return
}
}
}
// psuSensorSlotPattern captures a 1-based PSU slot number from a sensor name.
// Recognises patterns: "PSU1 ...", "PSU 2 ...", "Power Supply 1 ...", "PWS1 ..."
var psuSensorSlotPattern = regexp.MustCompile(`(?i)(?:PSU|Power\s+Supply|PWS)\s*(\d+)`)
// enrichPSUsFromSensors cross-references sensor readings into PSU InputPowerW /
// OutputPowerW / InputVoltage. Mirrors bee's enrichPSUsWithTelemetry approach.
func enrichPSUsFromSensors(psus []models.PSU, sensors []models.SensorReading) []models.PSU {
if len(psus) == 0 || len(sensors) == 0 {
return psus
}
for i := range psus {
slot, err := strconv.Atoi(psus[i].Slot)
if err != nil {
continue
}
for _, s := range sensors {
m := psuSensorSlotPattern.FindStringSubmatch(s.Name)
if len(m) < 2 {
continue
}
sensorSlot, err := strconv.Atoi(m[1])
if err != nil || sensorSlot != slot {
continue
}
nameLower := strings.ToLower(s.Name)
switch {
case isPSUInputPower(nameLower):
psus[i].InputPowerW = int(s.Value)
case isPSUOutputPower(nameLower):
psus[i].OutputPowerW = int(s.Value)
case isPSUInputVoltage(nameLower):
psus[i].InputVoltage = s.Value
}
}
}
return psus
}
func isPSUInputPower(name string) bool {
return strings.Contains(name, "input power") ||
strings.Contains(name, "input watts") ||
strings.Contains(name, "_pin") ||
strings.Contains(name, " pin")
}
func isPSUOutputPower(name string) bool {
return strings.Contains(name, "output power") ||
strings.Contains(name, "output watts") ||
strings.Contains(name, "_pout") ||
strings.Contains(name, " pout")
}
func isPSUInputVoltage(name string) bool {
return strings.Contains(name, "input voltage") ||
strings.Contains(name, "ac voltage") ||
strings.Contains(name, "_vin") ||
strings.Contains(name, " vin")
}
// mapDiskHealthStatus maps an XCC disk healthStatus integer to a canonical status
// string. Mirrors bee's mapRAIDDriveStatus logic.
// XCC codes: 1=Warning, 2=Normal, 3=Critical, 4=PredictiveFailure; 0=Unknown.
func mapDiskHealthStatus(code int, stateStr string) string {
switch code {
case 2:
return "OK"
case 1, 4:
return "Warning"
case 3:
return "Critical"
default:
if stateStr != "" {
return stateStr
}
return "Unknown"
}
}
// classifySensorType returns a sensor category based on bee's classification logic:
// fan / temperature / power / voltage / current / other.
func classifySensorType(name, unit string) string {
u := strings.ToLower(strings.TrimSpace(unit))
switch u {
case "rpm":
return "fan"
case "c", "celsius", "°c":
return "temperature"
case "w", "watts":
return "power"
case "v", "volts":
return "voltage"
case "a", "amps":
return "current"
}
n := strings.ToLower(name)
switch {
case strings.Contains(n, "fan"):
return "fan"
case strings.Contains(n, "temp"):
return "temperature"
case strings.Contains(n, "power") || strings.Contains(n, " pwr"):
return "power"
case strings.Contains(n, "volt") || strings.Contains(n, " vin") || strings.Contains(n, " vout"):
return "voltage"
case strings.Contains(n, "curr") || strings.Contains(n, " amp"):
return "current"
default:
return "other"
}
}
// cleanXCCValue strips XCC placeholder strings, returning "" for non-values.
// Mirrors bee's cleanDMIValue for IPMI/XCC context.
func cleanXCCValue(v string) string {
v = strings.TrimSpace(v)
switch strings.ToLower(v) {
case "", "n/a", "na", "none", "unknown", "not available",
"not applicable", "not present", "not specified", "-":
return ""
}
return v
}
// --- Helpers ---
// xccSeverity maps an XCC severity letter to a models.Severity. Messages about
// unqualified DIMMs are always reported as warnings, whatever the letter says.
func xccSeverity(s, message string) models.Severity {
if isUnqualifiedDIMM(message) {
return models.SeverityWarning
}
switch strings.ToUpper(strings.TrimSpace(s)) {
case "C", "E":
return models.SeverityCritical
case "W":
return models.SeverityWarning
default:
return models.SeverityInfo
}
}
func isUnqualifiedDIMM(value string) bool {
return strings.Contains(strings.ToLower(strings.TrimSpace(value)), "unqualified dimm")
}
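// parseXCCTime tries each known XCC timestamp layout in turn. The layouts carry
// no zone information, so time.Parse returns the result in UTC.
func parseXCCTime(s string) (time.Time, error) {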
s = strings.TrimSpace(s)
formats := []string{
"2006-01-02T15:04:05.000",
"2006-01-02T15:04:05",
"2006-01-02 15:04:05",
}
for _, f := range formats {
if t, err := time.Parse(f, s); err == nil {
return t, nil
}
}
return time.Time{}, fmt.Errorf("unparseable time: %q", s)
}
// parseMHz parses "4100 MHz" → 4100
func parseMHz(s string) int {
s = strings.TrimSpace(s)
parts := strings.Fields(s)
if len(parts) == 0 {
return 0
}
v, _ := strconv.Atoi(parts[0])
return v
}
// parseKB parses "384 KB" → 384
func parseKB(s string) int {
s = strings.TrimSpace(s)
parts := strings.Fields(s)
if len(parts) == 0 {
return 0
}
v, _ := strconv.Atoi(parts[0])
return v
}
// parseMB parses "32768 MB" → 32768
func parseMB(s string) int {
return parseKB(s)
}
// parseMTs parses "4800 MT/s" → 4800 (treated as MHz equivalent)
func parseMTs(s string) int {
return parseKB(s)
}
// parseCapacityToGB parses "3.20 TB" or "480 GB" → GB integer
func parseCapacityToGB(s string) int {
s = strings.TrimSpace(s)
parts := strings.Fields(s)
if len(parts) < 2 {
return 0
}
v, err := strconv.ParseFloat(parts[0], 64)
if err != nil {
return 0
}
switch strings.ToUpper(parts[1]) {
case "TB":
return int(v * 1000)
case "GB":
return int(v)
case "MB":
return int(v / 1000) // decimal units, consistent with the TB case above
}
return int(v)
}
// rawJSONToInt parses a json.RawMessage that may be an int or a quoted string → int
func rawJSONToInt(raw json.RawMessage) int {
if len(raw) == 0 {
return 0
}
// try direct int
var n int
if err := json.Unmarshal(raw, &n); err == nil {
return n
}
// try string
var s string
if err := json.Unmarshal(raw, &s); err == nil {
v, _ := strconv.Atoi(strings.TrimSpace(s))
return v
}
return 0
}
// parseHexID parses "0x15b3" → 5555
func parseHexID(s string) int {
s = strings.TrimSpace(strings.ToLower(s))
s = strings.TrimPrefix(s, "0x")
v, _ := strconv.ParseInt(s, 16, 32)
return int(v)
}
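
The PSU telemetry enrichment above depends on two things: pulling a slot number out of a free-form sensor name, and deciding which PSU field the reading belongs to. The following is a minimal standalone sketch of that matching, not the parser's own code. The regular expression and the substring lists are copied from the diff; `psuSlotFromName` and `psuMetric` are hypothetical helper names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// slotPattern uses the same expression as psuSensorSlotPattern in the diff.
var slotPattern = regexp.MustCompile(`(?i)(?:PSU|Power\s+Supply|PWS)\s*(\d+)`)

// psuSlotFromName (hypothetical helper) returns the PSU slot embedded in a
// sensor name, or 0 when the name does not refer to a PSU.
func psuSlotFromName(name string) int {
	m := slotPattern.FindStringSubmatch(name)
	if len(m) < 2 {
		return 0
	}
	n, err := strconv.Atoi(m[1])
	if err != nil {
		return 0
	}
	return n
}

// psuMetric (hypothetical helper) folds the three isPSU* predicates into one
// label; "" means the name is not a recognised PSU telemetry reading.
func psuMetric(name string) string {
	n := strings.ToLower(name)
	has := func(subs ...string) bool {
		for _, s := range subs {
			if strings.Contains(n, s) {
				return true
			}
		}
		return false
	}
	switch {
	case has("input power", "input watts", "_pin", " pin"):
		return "input_power"
	case has("output power", "output watts", "_pout", " pout"):
		return "output_power"
	case has("input voltage", "ac voltage", "_vin", " vin"):
		return "input_voltage"
	}
	return ""
}

func main() {
	names := []string{"PSU1 Input Power", "Power Supply 2 Output Power", "PWS1 AC Voltage", "Fan 1"}
	for _, name := range names {
		fmt.Printf("%-30q slot=%d metric=%q\n", name, psuSlotFromName(name), psuMetric(name))
	}
}
```

Because the predicates are plain substring checks, the order of the `switch` cases decides the outcome for any name that happens to match more than one list.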

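parseXCCTime above tries a fixed list of layouts and returns the first one that parses. The sketch below shows that fallback in isolation, assuming the same three layouts; `parseFirstLayout` is a hypothetical name, and the sample timestamps are made up for the illustration.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// layouts mirrors the list in parseXCCTime, most specific first.
var layouts = []string{
	"2006-01-02T15:04:05.000",
	"2006-01-02T15:04:05",
	"2006-01-02 15:04:05",
}

// parseFirstLayout (hypothetical helper) returns the first successful parse.
// None of the layouts names a zone, so time.Parse yields UTC values.
func parseFirstLayout(s string) (time.Time, error) {
	s = strings.TrimSpace(s)
	for _, layout := range layouts {
		if t, err := time.Parse(layout, s); err == nil {
			return t, nil
		}
	}
	return time.Time{}, fmt.Errorf("unparseable time: %q", s)
}

func main() {
	samples := []string{
		"2026-04-13T12:21:50.123",
		"2026-04-13T12:21:50",
		" 2026-04-13 12:21:50 ",
		"13/04/2026",
	}
	for _, s := range samples {
		t, err := parseFirstLayout(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(t.Format(time.RFC3339Nano))
	}
}
```

A caller that ignores the error, as parseEvents does, ends up with a zero `time.Time` for unparseable input, so downstream code should be prepared for events without a timestamp.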

@@ -0,0 +1,398 @@
package lenovo_xcc
import (
"testing"
"git.mchus.pro/mchus/logpile/internal/models"
"git.mchus.pro/mchus/logpile/internal/parser"
)
const exampleArchive = "/Users/mchusavitin/Documents/git/logpile/example/7D76CTO1WW_JF0002KT_xcc_mini-log_20260413-122150.zip"
func TestDetect_LenovoXCCMiniLog(t *testing.T) {
files, err := parser.ExtractArchive(exampleArchive)
if err != nil {
t.Skipf("example archive not available: %v", err)
}
p := &Parser{}
score := p.Detect(files)
if score < 80 {
t.Errorf("expected Detect score >= 80 for XCC mini-log archive, got %d", score)
}
}
func TestParse_LenovoXCCMiniLog_BasicSysInfo(t *testing.T) {
files, err := parser.ExtractArchive(exampleArchive)
if err != nil {
t.Skipf("example archive not available: %v", err)
}
p := &Parser{}
result, err := p.Parse(files)
if err != nil {
t.Fatalf("Parse returned error: %v", err)
}
if result == nil || result.Hardware == nil {
t.Fatal("Parse returned nil result or hardware")
}
hw := result.Hardware
if hw.BoardInfo.SerialNumber == "" {
t.Error("BoardInfo.SerialNumber is empty")
}
if hw.BoardInfo.ProductName == "" {
t.Error("BoardInfo.ProductName is empty")
}
t.Logf("BoardInfo: serial=%s model=%s uuid=%s", hw.BoardInfo.SerialNumber, hw.BoardInfo.ProductName, hw.BoardInfo.UUID)
}
func TestParse_LenovoXCCMiniLog_CPUs(t *testing.T) {
files, err := parser.ExtractArchive(exampleArchive)
if err != nil {
t.Skipf("example archive not available: %v", err)
}
p := &Parser{}
result, _ := p.Parse(files)
if result == nil || result.Hardware == nil {
t.Fatal("Parse returned nil")
}
if len(result.Hardware.CPUs) == 0 {
t.Error("expected at least one CPU, got none")
}
for i, cpu := range result.Hardware.CPUs {
t.Logf("CPU[%d]: socket=%d model=%q cores=%d threads=%d freq=%dMHz", i, cpu.Socket, cpu.Model, cpu.Cores, cpu.Threads, cpu.FrequencyMHz)
}
}
func TestParse_LenovoXCCMiniLog_Memory(t *testing.T) {
files, err := parser.ExtractArchive(exampleArchive)
if err != nil {
t.Skipf("example archive not available: %v", err)
}
p := &Parser{}
result, _ := p.Parse(files)
if result == nil || result.Hardware == nil {
t.Fatal("Parse returned nil")
}
if len(result.Hardware.Memory) == 0 {
t.Error("expected memory DIMMs, got none")
}
t.Logf("Memory: %d DIMMs", len(result.Hardware.Memory))
for i, m := range result.Hardware.Memory {
t.Logf("DIMM[%d]: slot=%s present=%v size=%dMB sn=%s", i, m.Slot, m.Present, m.SizeMB, m.SerialNumber)
}
}
func TestParse_LenovoXCCMiniLog_Storage(t *testing.T) {
files, err := parser.ExtractArchive(exampleArchive)
if err != nil {
t.Skipf("example archive not available: %v", err)
}
p := &Parser{}
result, _ := p.Parse(files)
if result == nil || result.Hardware == nil {
t.Fatal("Parse returned nil")
}
t.Logf("Storage: %d disks", len(result.Hardware.Storage))
for i, s := range result.Hardware.Storage {
t.Logf("Disk[%d]: slot=%s model=%q size=%dGB sn=%s", i, s.Slot, s.Model, s.SizeGB, s.SerialNumber)
}
}
func TestParse_LenovoXCCMiniLog_PCIeCards(t *testing.T) {
files, err := parser.ExtractArchive(exampleArchive)
if err != nil {
t.Skipf("example archive not available: %v", err)
}
p := &Parser{}
result, _ := p.Parse(files)
if result == nil || result.Hardware == nil {
t.Fatal("Parse returned nil")
}
t.Logf("PCIe cards: %d", len(result.Hardware.PCIeDevices))
for i, c := range result.Hardware.PCIeDevices {
t.Logf("Card[%d]: slot=%s desc=%q bdf=%s", i, c.Slot, c.Description, c.BDF)
}
}
func TestParse_LenovoXCCMiniLog_PSUs(t *testing.T) {
files, err := parser.ExtractArchive(exampleArchive)
if err != nil {
t.Skipf("example archive not available: %v", err)
}
p := &Parser{}
result, _ := p.Parse(files)
if result == nil || result.Hardware == nil {
t.Fatal("Parse returned nil")
}
if len(result.Hardware.PowerSupply) == 0 {
t.Error("expected PSUs, got none")
}
for i, psu := range result.Hardware.PowerSupply {
t.Logf("PSU[%d]: slot=%s wattage=%dW status=%s sn=%s", i, psu.Slot, psu.WattageW, psu.Status, psu.SerialNumber)
}
}
func TestParse_LenovoXCCMiniLog_Sensors(t *testing.T) {
files, err := parser.ExtractArchive(exampleArchive)
if err != nil {
t.Skipf("example archive not available: %v", err)
}
p := &Parser{}
result, _ := p.Parse(files)
if result == nil {
t.Fatal("Parse returned nil")
}
if len(result.Sensors) == 0 {
t.Error("expected sensors, got none")
}
t.Logf("Sensors: %d", len(result.Sensors))
}
func TestParse_LenovoXCCMiniLog_Events(t *testing.T) {
files, err := parser.ExtractArchive(exampleArchive)
if err != nil {
t.Skipf("example archive not available: %v", err)
}
p := &Parser{}
result, _ := p.Parse(files)
if result == nil {
t.Fatal("Parse returned nil")
}
if len(result.Events) == 0 {
t.Error("expected events, got none")
}
t.Logf("Events: %d", len(result.Events))
for i, e := range result.Events {
if i >= 5 {
break
}
t.Logf("Event[%d]: severity=%s ts=%s desc=%q", i, e.Severity, e.Timestamp.Format("2006-01-02T15:04:05"), e.Description)
}
}
func TestParse_LenovoXCCMiniLog_FRU(t *testing.T) {
files, err := parser.ExtractArchive(exampleArchive)
if err != nil {
t.Skipf("example archive not available: %v", err)
}
p := &Parser{}
result, _ := p.Parse(files)
if result == nil {
t.Fatal("Parse returned nil")
}
t.Logf("FRU: %d entries", len(result.FRU))
for i, f := range result.FRU {
t.Logf("FRU[%d]: desc=%q product=%q serial=%q", i, f.Description, f.ProductName, f.SerialNumber)
}
}
func TestParse_LenovoXCCMiniLog_Firmware(t *testing.T) {
files, err := parser.ExtractArchive(exampleArchive)
if err != nil {
t.Skipf("example archive not available: %v", err)
}
p := &Parser{}
result, _ := p.Parse(files)
if result == nil || result.Hardware == nil {
t.Fatal("Parse returned nil")
}
if len(result.Hardware.Firmware) == 0 {
t.Error("expected firmware entries, got none")
}
for i, f := range result.Hardware.Firmware {
t.Logf("FW[%d]: name=%q version=%q buildtime=%q", i, f.DeviceName, f.Version, f.BuildTime)
}
}
func TestParseDIMMs_UnqualifiedDIMMAddsWarningEvent(t *testing.T) {
content := []byte(`{
"items": [{
"memory": [{
"memory_name": "DIMM A1",
"memory_status": "Unqualified DIMM",
"memory_type": "DDR5",
"memory_capacity": 32
}]
}]
}`)
memory, events := parseDIMMs(content)
if len(memory) != 1 {
t.Fatalf("expected 1 DIMM, got %d", len(memory))
}
if len(events) != 1 {
t.Fatalf("expected 1 warning event, got %d", len(events))
}
if events[0].Severity != models.SeverityWarning {
t.Fatalf("expected warning severity, got %q", events[0].Severity)
}
if events[0].SensorName != "DIMM A1" {
t.Fatalf("unexpected sensor name: %q", events[0].SensorName)
}
}
func TestSeverity_UnqualifiedDIMMMessageBecomesWarning(t *testing.T) {
if got := xccSeverity("I", "System found Unqualified DIMM in slot DIMM A1"); got != models.SeverityWarning {
t.Fatalf("expected warning severity, got %q", got)
}
}
func TestParseBasicSysInfo_CleansPlaceholderValuesAndSetsTargetHost(t *testing.T) {
result := &models.AnalysisResult{Hardware: &models.HardwareConfig{}}
content := []byte(`{
"items": [{
"machine_name": " sr650v3-node01 ",
"machine_typemodel": " 7D76CTO1WW ",
"serial_number": " Not Specified ",
"uuid": "N/A"
}]
}`)
parseBasicSysInfo(content, result)
if result.TargetHost != "sr650v3-node01" {
t.Fatalf("unexpected target host: %q", result.TargetHost)
}
if result.Hardware.BoardInfo.ProductName != "7D76CTO1WW" {
t.Fatalf("unexpected product name: %q", result.Hardware.BoardInfo.ProductName)
}
if result.Hardware.BoardInfo.SerialNumber != "" {
t.Fatalf("expected serial number to be cleaned, got %q", result.Hardware.BoardInfo.SerialNumber)
}
if result.Hardware.BoardInfo.UUID != "" {
t.Fatalf("expected UUID to be cleaned, got %q", result.Hardware.BoardInfo.UUID)
}
}
func TestEnrichBoardFromFRU_SystemBoardManufacturerOnly(t *testing.T) {
result := &models.AnalysisResult{
Hardware: &models.HardwareConfig{},
FRU: []models.FRUInfo{
{Description: "Power Supply 1", Manufacturer: "Ignore Me"},
{Description: "System Board", Manufacturer: " Lenovo "},
},
}
enrichBoardFromFRU(result)
if result.Hardware.BoardInfo.Manufacturer != "Lenovo" {
t.Fatalf("unexpected manufacturer: %q", result.Hardware.BoardInfo.Manufacturer)
}
}
func TestEnrichPSUsFromSensors_AssignsTelemetryBySlot(t *testing.T) {
psus := []models.PSU{
{Slot: "1"},
{Slot: "2"},
}
sensors := []models.SensorReading{
{Name: "PSU1 Input Power", Value: 430},
{Name: "Power Supply 1 Output Power", Value: 390},
{Name: "PWS1 AC Voltage", Value: 230.5},
{Name: "PSU2 Input Power", Value: 0},
{Name: "PSU3 Input Power", Value: 999},
{Name: "Fan 1", Value: 12000},
}
got := enrichPSUsFromSensors(psus, sensors)
if got[0].InputPowerW != 430 {
t.Fatalf("unexpected PSU1 input power: %d", got[0].InputPowerW)
}
if got[0].OutputPowerW != 390 {
t.Fatalf("unexpected PSU1 output power: %d", got[0].OutputPowerW)
}
if got[0].InputVoltage != 230.5 {
t.Fatalf("unexpected PSU1 input voltage: %v", got[0].InputVoltage)
}
if got[1].InputPowerW != 0 || got[1].OutputPowerW != 0 || got[1].InputVoltage != 0 {
t.Fatalf("unexpected telemetry assigned to PSU2: %+v", got[1])
}
}
func TestMapDiskHealthStatus(t *testing.T) {
tests := []struct {
name string
code int
stateStr string
want string
}{
{name: "normal", code: 2, stateStr: "Online", want: "OK"},
{name: "warning", code: 1, stateStr: "Online", want: "Warning"},
{name: "predictive failure", code: 4, stateStr: "Online", want: "Warning"},
{name: "critical", code: 3, stateStr: "Failed", want: "Critical"},
{name: "fallback state", code: 0, stateStr: "Rebuilding", want: "Rebuilding"},
{name: "unknown", code: 0, stateStr: "", want: "Unknown"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := mapDiskHealthStatus(tt.code, tt.stateStr); got != tt.want {
t.Fatalf("got %q, want %q", got, tt.want)
}
})
}
}
func TestClassifySensorType(t *testing.T) {
tests := []struct {
name string
in string
unit string
want string
}{
{name: "unit rpm", in: "Fan 1", unit: "RPM", want: "fan"},
{name: "unit celsius", in: "CPU Temp", unit: "C", want: "temperature"},
{name: "unit watts", in: "PSU1 Input Power", unit: "W", want: "power"},
{name: "unit volts", in: "PWS1 AC Voltage", unit: "V", want: "voltage"},
{name: "unit amps", in: "PSU1 Current", unit: "A", want: "current"},
{name: "name fallback", in: "GPU Temp", unit: "", want: "temperature"},
{name: "other", in: "Presence", unit: "", want: "other"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := classifySensorType(tt.in, tt.unit); got != tt.want {
t.Fatalf("got %q, want %q", got, tt.want)
}
})
}
}
func TestCleanXCCValue(t *testing.T) {
tests := []struct {
in string
want string
}{
{in: " Lenovo ", want: "Lenovo"},
{in: "N/A", want: ""},
{in: " not specified ", want: ""},
{in: "-", want: ""},
}
for _, tt := range tests {
if got := cleanXCCValue(tt.in); got != tt.want {
t.Fatalf("cleanXCCValue(%q) = %q, want %q", tt.in, got, tt.want)
}
}
}


@@ -14,6 +14,7 @@ import (
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/unraid"
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/xfusion"
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/xigmanas"
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/lenovo_xcc"
// Generic fallback parser (must be last for lowest priority)
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/generic"


@@ -10,6 +10,33 @@ import (
"git.mchus.pro/mchus/logpile/internal/parser"
)
type xfusionNICCard struct {
Slot string
Model string
ProductName string
Vendor string
VendorID int
DeviceID int
BDF string
SerialNumber string
PartNumber string
}
type xfusionNetcardPort struct {
BDF string
MAC string
ActualMAC string
}
type xfusionNetcardSnapshot struct {
Timestamp time.Time
Slot string
ProductName string
Manufacturer string
Firmware string
Ports []xfusionNetcardPort
}
// ── FRU ──────────────────────────────────────────────────────────────────────
// parseFRUInfo parses fruinfo.txt and populates result.FRU and result.Hardware.BoardInfo.
@@ -232,15 +259,15 @@ func parseCPUInfo(content []byte) []models.CPU {
}
cpus = append(cpus, models.CPU{
Socket: socketNum,
Model: model,
Cores: cores,
Threads: threads,
L1CacheKB: l1,
L2CacheKB: l2,
L3CacheKB: l3,
SerialNumber: sn,
Status: "ok",
})
}
return cpus
@@ -338,9 +365,9 @@ func parseMemInfo(content []byte) []models.MemoryDIMM {
// ── Card Info (GPU + NIC) ─────────────────────────────────────────────────────
-// parseCardInfo parses card_info file, extracting GPU and NIC entries.
+// parseCardInfo parses card_info file, extracting GPU and OCP NIC card inventory.
// The file has named sections ("GPU Card Info", "OCP Card Info", etc.) each with a pipe-table.
-func parseCardInfo(content []byte) (gpus []models.GPU, nics []models.NIC) {
+func parseCardInfo(content []byte) (gpus []models.GPU, nicCards []xfusionNICCard) {
sections := splitPipeSections(content)
// Build BDF and VendorID/DeviceID map from PCIe Card Info: slot → info
@@ -396,17 +423,22 @@ func parseCardInfo(content []byte) (gpus []models.GPU, nics []models.NIC) {
}
// OCP Card Info: NIC cards
-for i, row := range sections["ocp card info"] {
-desc := strings.TrimSpace(row["card desc"])
-sn := strings.TrimSpace(row["serialnumber"])
-nics = append(nics, models.NIC{
-Name: fmt.Sprintf("OCP%d", i+1),
-Model: desc,
-SerialNumber: sn,
+for _, row := range sections["ocp card info"] {
+slot := strings.TrimSpace(row["slot"])
+pcie := slotPCIe[slot]
+nicCards = append(nicCards, xfusionNICCard{
+Slot: slot,
+Model: strings.TrimSpace(row["card desc"]),
+ProductName: strings.TrimSpace(row["card desc"]),
+VendorID: parseHexInt(row["vender id"]),
+DeviceID: parseHexInt(row["device id"]),
+BDF: pcie.bdf,
+SerialNumber: strings.TrimSpace(row["serialnumber"]),
+PartNumber: strings.TrimSpace(row["partnum"]),
})
}
-return gpus, nics
+return gpus, nicCards
}
// splitPipeSections parses a multi-section file where each section starts with a
@@ -462,6 +494,301 @@ func parseHexInt(s string) int {
return int(n)
}
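// parseNetcardInfo parses netcard_info.txt, which may hold several timestamped
// snapshots, and returns the best snapshot per slot (see isBetterThan).
// Snapshots without a slot ID are dropped.
func parseNetcardInfo(content []byte) []xfusionNetcardSnapshot {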
if len(content) == 0 {
return nil
}
var snapshots []xfusionNetcardSnapshot
var current *xfusionNetcardSnapshot
var currentPort *xfusionNetcardPort
flushPort := func() {
if current == nil || currentPort == nil {
return
}
current.Ports = append(current.Ports, *currentPort)
currentPort = nil
}
flushSnapshot := func() {
if current == nil {
return
}
// Flush the pending port first so that hasData can see it.
flushPort()
if !current.hasData() {
return
}
snapshots = append(snapshots, *current)
current = nil
}
for _, rawLine := range strings.Split(string(content), "\n") {
line := strings.TrimSpace(rawLine)
if line == "" {
flushPort()
continue
}
if ts, ok := parseXFusionUTCTimestamp(line); ok {
if current == nil {
current = &xfusionNetcardSnapshot{Timestamp: ts}
continue
}
if current.hasData() {
flushSnapshot()
current = &xfusionNetcardSnapshot{Timestamp: ts}
continue
}
current.Timestamp = ts
continue
}
if current == nil {
current = &xfusionNetcardSnapshot{}
}
if port := parseNetcardPortHeader(line); port != nil {
flushPort()
currentPort = port
continue
}
if currentPort != nil {
if value, ok := parseSimpleKV(line, "MacAddr"); ok {
currentPort.MAC = value
continue
}
if value, ok := parseSimpleKV(line, "ActualMac"); ok {
currentPort.ActualMAC = value
continue
}
}
if value, ok := parseSimpleKV(line, "ProductName"); ok {
current.ProductName = value
continue
}
if value, ok := parseSimpleKV(line, "Manufacture"); ok {
current.Manufacturer = value
continue
}
if value, ok := parseSimpleKV(line, "FirmwareVersion"); ok {
current.Firmware = value
continue
}
if value, ok := parseSimpleKV(line, "SlotId"); ok {
current.Slot = value
}
}
flushSnapshot()
bestIndexBySlot := make(map[string]int)
for i, snapshot := range snapshots {
slot := strings.TrimSpace(snapshot.Slot)
if slot == "" {
continue
}
prevIdx, exists := bestIndexBySlot[slot]
if !exists || snapshot.isBetterThan(snapshots[prevIdx]) {
bestIndexBySlot[slot] = i
}
}
ordered := make([]xfusionNetcardSnapshot, 0, len(bestIndexBySlot))
for i, snapshot := range snapshots {
slot := strings.TrimSpace(snapshot.Slot)
bestIdx, ok := bestIndexBySlot[slot]
if !ok || bestIdx != i {
continue
}
ordered = append(ordered, snapshot)
delete(bestIndexBySlot, slot)
}
return ordered
}
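// mergeNetworkAdapters joins the card_info inventory with netcard_info snapshots
// by slot, keeping first-seen slot order. Card data wins for the model; snapshot
// data wins for the BDF and the manufacturer.
func mergeNetworkAdapters(cards []xfusionNICCard, snapshots []xfusionNetcardSnapshot) ([]models.NetworkAdapter, []models.NIC) {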
bySlotCard := make(map[string]xfusionNICCard, len(cards))
bySlotSnapshot := make(map[string]xfusionNetcardSnapshot, len(snapshots))
orderedSlots := make([]string, 0, len(cards)+len(snapshots))
seenSlots := make(map[string]struct{}, len(cards)+len(snapshots))
for _, card := range cards {
slot := strings.TrimSpace(card.Slot)
if slot == "" {
continue
}
bySlotCard[slot] = card
if _, seen := seenSlots[slot]; !seen {
orderedSlots = append(orderedSlots, slot)
seenSlots[slot] = struct{}{}
}
}
for _, snapshot := range snapshots {
slot := strings.TrimSpace(snapshot.Slot)
if slot == "" {
continue
}
bySlotSnapshot[slot] = snapshot
if _, seen := seenSlots[slot]; !seen {
orderedSlots = append(orderedSlots, slot)
seenSlots[slot] = struct{}{}
}
}
adapters := make([]models.NetworkAdapter, 0, len(orderedSlots))
legacyNICs := make([]models.NIC, 0, len(orderedSlots))
for _, slot := range orderedSlots {
card := bySlotCard[slot]
snapshot := bySlotSnapshot[slot]
model := firstNonEmpty(card.Model, snapshot.ProductName)
description := ""
if !strings.EqualFold(strings.TrimSpace(model), strings.TrimSpace(snapshot.ProductName)) {
description = strings.TrimSpace(snapshot.ProductName)
}
macs := snapshot.macAddresses()
bdf := firstNonEmpty(snapshot.primaryBDF(), card.BDF)
firmware := normalizeXFusionValue(snapshot.Firmware)
manufacturer := firstNonEmpty(snapshot.Manufacturer, card.Vendor)
portCount := len(snapshot.Ports)
if portCount == 0 && len(macs) > 0 {
portCount = len(macs)
}
if portCount == 0 {
portCount = 1
}
adapters = append(adapters, models.NetworkAdapter{
Slot: slot,
Location: "OCP",
Present: true,
BDF: bdf,
Model: model,
Description: description,
Vendor: manufacturer,
VendorID: card.VendorID,
DeviceID: card.DeviceID,
SerialNumber: card.SerialNumber,
PartNumber: card.PartNumber,
Firmware: firmware,
PortCount: portCount,
PortType: "ethernet",
MACAddresses: macs,
Status: "ok",
})
legacyNICs = append(legacyNICs, models.NIC{
Name: fmt.Sprintf("OCP%s", slot),
Model: model,
Description: description,
MACAddress: firstNonEmpty(macs...),
SerialNumber: card.SerialNumber,
})
}
return adapters, legacyNICs
}
func parseXFusionUTCTimestamp(line string) (time.Time, bool) {
ts, err := time.Parse("2006-01-02 15:04:05 MST", strings.TrimSpace(line))
if err != nil {
return time.Time{}, false
}
return ts, true
}
func parseNetcardPortHeader(line string) *xfusionNetcardPort {
fields := strings.Fields(strings.TrimSpace(line))
if len(fields) < 2 || !strings.HasPrefix(strings.ToLower(fields[0]), "port") {
return nil
}
joined := strings.Join(fields[1:], " ")
if !strings.HasPrefix(strings.ToLower(joined), "bdf:") {
return nil
}
return &xfusionNetcardPort{BDF: strings.TrimSpace(joined[len("BDF:"):])}
}
func parseSimpleKV(line, key string) (string, bool) {
idx := strings.Index(line, ":")
if idx < 0 {
return "", false
}
gotKey := strings.TrimSpace(line[:idx])
if !strings.EqualFold(gotKey, key) {
return "", false
}
return strings.TrimSpace(line[idx+1:]), true
}
func normalizeXFusionValue(value string) string {
value = strings.TrimSpace(value)
switch strings.ToUpper(value) {
case "", "N/A", "NA", "UNKNOWN":
return ""
default:
return value
}
}
func (s xfusionNetcardSnapshot) hasData() bool {
return strings.TrimSpace(s.Slot) != "" ||
strings.TrimSpace(s.ProductName) != "" ||
strings.TrimSpace(s.Manufacturer) != "" ||
strings.TrimSpace(s.Firmware) != "" ||
len(s.Ports) > 0
}
func (s xfusionNetcardSnapshot) score() int {
score := len(s.Ports)
if normalizeXFusionValue(s.Firmware) != "" {
score += 10
}
score += len(s.macAddresses()) * 2
return score
}
func (s xfusionNetcardSnapshot) isBetterThan(other xfusionNetcardSnapshot) bool {
if s.score() != other.score() {
return s.score() > other.score()
}
if !s.Timestamp.Equal(other.Timestamp) {
return s.Timestamp.After(other.Timestamp)
}
return len(s.Ports) > len(other.Ports)
}
func (s xfusionNetcardSnapshot) primaryBDF() string {
for _, port := range s.Ports {
if bdf := strings.TrimSpace(port.BDF); bdf != "" {
return bdf
}
}
return ""
}
func (s xfusionNetcardSnapshot) macAddresses() []string {
out := make([]string, 0, len(s.Ports))
seen := make(map[string]struct{}, len(s.Ports))
for _, port := range s.Ports {
for _, candidate := range []string{port.ActualMAC, port.MAC} {
mac := normalizeMAC(candidate)
if mac == "" {
continue
}
if _, exists := seen[mac]; exists {
continue
}
seen[mac] = struct{}{}
out = append(out, mac)
break
}
}
return out
}
func normalizeMAC(value string) string {
value = strings.ToUpper(strings.TrimSpace(value))
switch value {
case "", "N/A", "NA", "UNKNOWN", "00:00:00:00:00:00":
return ""
default:
return value
}
}
// ── PSU ───────────────────────────────────────────────────────────────────────
// parsePSUInfo parses the pipe-delimited psu_info.txt.
@@ -525,6 +852,11 @@ func parsePSUInfo(content []byte) []models.PSU {
func parseStorageControllerInfo(content []byte, result *models.AnalysisResult) {
// File may contain multiple controller blocks; parse key:value pairs from each.
// We only look at the first occurrence of each key (first controller).
seen := make(map[string]struct{}, len(result.Hardware.Firmware))
for _, fw := range result.Hardware.Firmware {
key := strings.ToLower(strings.TrimSpace(fw.DeviceName + "\x00" + fw.Version + "\x00" + fw.Description))
seen[key] = struct{}{}
}
text := string(content)
blocks := strings.Split(text, "RAID Controller #")
for _, block := range blocks[1:] { // skip pre-block preamble
@@ -532,7 +864,7 @@ func parseStorageControllerInfo(content []byte, result *models.AnalysisResult) {
name := firstNonEmpty(fields["Component Name"], fields["Controller Name"], fields["Controller Type"])
firmware := fields["Firmware Version"]
if name != "" && firmware != "" {
-result.Hardware.Firmware = append(result.Hardware.Firmware, models.FirmwareInfo{
+appendXFusionFirmware(result, seen, models.FirmwareInfo{
DeviceName: name,
Description: fields["Controller Name"],
Version: firmware,
@@ -541,6 +873,86 @@ func parseStorageControllerInfo(content []byte, result *models.AnalysisResult) {
}
}
func parseAppRevision(content []byte, result *models.AnalysisResult) {
type firmwareLine struct {
versionKey string
deviceName string
description string
buildKey string
}
// A slice rather than a map: ranging over a map would append the firmware
// entries in a different order on every run.
known := []firmwareLine{
{versionKey: "Active iBMC Version", deviceName: "iBMC", description: "active iBMC", buildKey: "Active iBMC Built"},
{versionKey: "Active BIOS Version", deviceName: "BIOS", description: "active BIOS", buildKey: "Active BIOS Built"},
{versionKey: "CPLD Version", deviceName: "CPLD", description: "mainboard CPLD"},
{versionKey: "SDK Version", deviceName: "SDK", description: "iBMC SDK", buildKey: "SDK Built"},
{versionKey: "Active Uboot Version", deviceName: "U-Boot", description: "active U-Boot"},
{versionKey: "Active Secure Bootloader Version", deviceName: "Secure Bootloader", description: "active secure bootloader"},
{versionKey: "Active Secure Firmware Version", deviceName: "Secure Firmware", description: "active secure firmware"},
}
values := parseAlignedKeyValues(content)
if result.Hardware.BoardInfo.ProductName == "" {
if productName := values["Product Name"]; productName != "" {
result.Hardware.BoardInfo.ProductName = productName
}
}
seen := make(map[string]struct{}, len(result.Hardware.Firmware))
for _, fw := range result.Hardware.Firmware {
key := strings.ToLower(strings.TrimSpace(fw.DeviceName + "\x00" + fw.Version + "\x00" + fw.Description))
seen[key] = struct{}{}
}
for _, meta := range known {
version := normalizeXFusionValue(values[meta.versionKey])
if version == "" {
continue
}
appendXFusionFirmware(result, seen, models.FirmwareInfo{
DeviceName: meta.deviceName,
Description: meta.description,
Version: version,
BuildTime: normalizeXFusionValue(values[meta.buildKey]),
})
}
}
func parseAlignedKeyValues(content []byte) map[string]string {
values := make(map[string]string)
for _, rawLine := range strings.Split(string(content), "\n") {
line := strings.TrimRight(rawLine, "\r")
if !strings.Contains(line, ":") {
continue
}
idx := strings.Index(line, ":")
if idx < 0 {
continue
}
key := strings.TrimRight(line[:idx], " \t")
value := strings.TrimSpace(line[idx+1:])
if key == "" || value == "" || values[key] != "" {
continue
}
values[key] = value
}
return values
}
func appendXFusionFirmware(result *models.AnalysisResult, seen map[string]struct{}, fw models.FirmwareInfo) {
if result == nil || result.Hardware == nil {
return
}
key := strings.ToLower(strings.TrimSpace(fw.DeviceName + "\x00" + fw.Version + "\x00" + fw.Description))
if key == "" {
return
}
if _, exists := seen[key]; exists {
return
}
seen[key] = struct{}{}
result.Hardware.Firmware = append(result.Hardware.Firmware, fw)
}
// parseDiskInfo parses a single PhysicalDrivesInfo/DiskN/disk_info file.
func parseDiskInfo(content []byte) *models.Storage {
fields := parseKeyValueBlock(content)
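
Note on `parseAlignedKeyValues` in the hunk above: it splits each line at the first colon only, so values that contain colons themselves (for example the build time in `Active iBMC Built: 16:46:26 Jan 4 2026`) stay intact, and the first non-empty value seen for a key wins. A minimal standalone sketch of that behaviour follows. `parseAligned` is a hypothetical name used for illustration, and it uses `strings.Cut` in place of the `Contains`/`Index` pair in the diff, which should be equivalent here.

```go
package main

import (
	"fmt"
	"strings"
)

// parseAligned is a hypothetical stand-in for the helper in the diff:
// split at the first ':', trim both sides, keep the first value per key.
func parseAligned(content string) map[string]string {
	values := make(map[string]string)
	for _, raw := range strings.Split(content, "\n") {
		line := strings.TrimRight(raw, "\r")
		key, value, found := strings.Cut(line, ":")
		if !found {
			continue // no colon: not a key/value line
		}
		key = strings.TrimRight(key, " \t")
		value = strings.TrimSpace(value)
		if key == "" || value == "" || values[key] != "" {
			continue // empty key, empty value, or key already set
		}
		values[key] = value
	}
	return values
}

func main() {
	sample := "Active iBMC Version:    (U68)3.08.05.85\r\n" +
		"Active iBMC Built:      16:46:26 Jan 4 2026\n" +
		"Product Name:           G5500 V7\n" +
		"Product Name:           ignored duplicate\n"
	v := parseAligned(sample)
	fmt.Println(v["Active iBMC Version"])
	fmt.Println(v["Active iBMC Built"])
	fmt.Println(v["Product Name"])
}
```

Because only the first colon is a separator, the build timestamp survives as a single value, and the duplicate `Product Name` line is ignored.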

@@ -13,7 +13,7 @@ import (
"git.mchus.pro/mchus/logpile/internal/parser"
)
const parserVersion = "1.0"
const parserVersion = "1.1"
func init() {
parser.Register(&Parser{})
@@ -34,11 +34,15 @@ func (p *Parser) Detect(files []parser.ExtractedFile) int {
path := strings.ToLower(f.Path)
switch {
case strings.Contains(path, "appdump/frudata/fruinfo.txt"):
confidence += 60
confidence += 50
case strings.Contains(path, "rtosdump/versioninfo/app_revision.txt"):
confidence += 30
case strings.Contains(path, "appdump/sensor_alarm/sensor_info.txt"):
confidence += 20
confidence += 10
case strings.Contains(path, "appdump/card_manage/card_info"):
confidence += 20
case strings.Contains(path, "logdump/netcard/netcard_info.txt"):
confidence += 20
}
if confidence >= 100 {
return 100
@@ -54,17 +58,21 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
FRU: make([]models.FRUInfo, 0),
Sensors: make([]models.SensorReading, 0),
Hardware: &models.HardwareConfig{
CPUs: make([]models.CPU, 0),
Memory: make([]models.MemoryDIMM, 0),
Storage: make([]models.Storage, 0),
GPUs: make([]models.GPU, 0),
NetworkCards: make([]models.NIC, 0),
PowerSupply: make([]models.PSU, 0),
Firmware: make([]models.FirmwareInfo, 0),
Firmware: make([]models.FirmwareInfo, 0),
Devices: make([]models.HardwareDevice, 0),
CPUs: make([]models.CPU, 0),
Memory: make([]models.MemoryDIMM, 0),
Storage: make([]models.Storage, 0),
Volumes: make([]models.StorageVolume, 0),
PCIeDevices: make([]models.PCIeDevice, 0),
GPUs: make([]models.GPU, 0),
NetworkCards: make([]models.NIC, 0),
NetworkAdapters: make([]models.NetworkAdapter, 0),
PowerSupply: make([]models.PSU, 0),
},
}
if f := findByPath(files, "appdump/frudata/fruinfo.txt"); f != nil {
if f := findByAnyPath(files, "appdump/frudata/fruinfo.txt", "rtosdump/versioninfo/fruinfo.txt"); f != nil {
parseFRUInfo(f.Content, result)
}
if f := findByPath(files, "appdump/sensor_alarm/sensor_info.txt"); f != nil {
@@ -76,10 +84,20 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
if f := findByPath(files, "appdump/cpumem/mem_info"); f != nil {
result.Hardware.Memory = parseMemInfo(f.Content)
}
var nicCards []xfusionNICCard
if f := findByPath(files, "appdump/card_manage/card_info"); f != nil {
gpus, nics := parseCardInfo(f.Content)
gpus, cards := parseCardInfo(f.Content)
result.Hardware.GPUs = gpus
result.Hardware.NetworkCards = nics
nicCards = cards
}
if f := findByPath(files, "logdump/netcard/netcard_info.txt"); f != nil || len(nicCards) > 0 {
var content []byte
if f != nil {
content = f.Content
}
adapters, legacyNICs := mergeNetworkAdapters(nicCards, parseNetcardInfo(content))
result.Hardware.NetworkAdapters = adapters
result.Hardware.NetworkCards = legacyNICs
}
if f := findByPath(files, "appdump/bmc/psu_info.txt"); f != nil {
result.Hardware.PowerSupply = parsePSUInfo(f.Content)
@@ -87,6 +105,9 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
if f := findByPath(files, "appdump/storagemgnt/raid_controller_info.txt"); f != nil {
parseStorageControllerInfo(f.Content, result)
}
if f := findByPath(files, "rtosdump/versioninfo/app_revision.txt"); f != nil {
parseAppRevision(f.Content, result)
}
for _, f := range findDiskInfoFiles(files) {
disk := parseDiskInfo(f.Content)
if disk != nil {
@@ -99,6 +120,7 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
result.Protocol = "ipmi"
result.SourceType = models.SourceTypeArchive
parser.ApplyManufacturedYearWeekFromFRU(result.FRU, result.Hardware)
return result, nil
}
@@ -113,6 +135,15 @@ func findByPath(files []parser.ExtractedFile, substring string) *parser.Extracte
return nil
}
func findByAnyPath(files []parser.ExtractedFile, substrings ...string) *parser.ExtractedFile {
for _, substring := range substrings {
if f := findByPath(files, substring); f != nil {
return f
}
}
return nil
}
// findDiskInfoFiles returns all PhysicalDrivesInfo disk_info files.
func findDiskInfoFiles(files []parser.ExtractedFile) []parser.ExtractedFile {
var out []parser.ExtractedFile
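
Note on the `Detect` weights in the hunk above: the score is additive per matching file and capped at 100. The sketch below reproduces only the path markers visible in this hunk, assuming the second value of each duplicated pair is the added line (50 for `fruinfo.txt`, 10 for `sensor_info.txt`). `detectScore` and `markerWeights` are hypothetical names, and the real `Detect` may contain further cases outside the hunk.

```go
package main

import (
	"fmt"
	"strings"
)

// markerWeights lists the path markers and weights visible in the hunk.
// The table form is illustrative; the diff uses a switch statement.
var markerWeights = []struct {
	marker string
	weight int
}{
	{"appdump/frudata/fruinfo.txt", 50},
	{"rtosdump/versioninfo/app_revision.txt", 30},
	{"appdump/sensor_alarm/sensor_info.txt", 10},
	{"appdump/card_manage/card_info", 20},
	{"logdump/netcard/netcard_info.txt", 20},
}

// detectScore is a hypothetical stand-in for Parser.Detect: it adds the
// weight of the first matching marker for each path and caps at 100.
func detectScore(paths []string) int {
	confidence := 0
	for _, p := range paths {
		path := strings.ToLower(p)
		for _, m := range markerWeights {
			if strings.Contains(path, m.marker) {
				confidence += m.weight
				break // like a switch: only the first matching case applies
			}
		}
		if confidence >= 100 {
			return 100
		}
	}
	return confidence
}

func main() {
	// Same three paths as TestDetect_ServerFileExportMarkers in this commit.
	fmt.Println(detectScore([]string{
		"dump_info/RTOSDump/versioninfo/app_revision.txt",
		"dump_info/LogDump/netcard/netcard_info.txt",
		"dump_info/AppDump/card_manage/card_info",
	}))
}
```

Under these assumptions the three file-export markers sum to 30 + 20 + 20 = 70, which matches the `score < 70` threshold in the new test.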

@@ -1,8 +1,10 @@
package xfusion
import (
"strings"
"testing"
"git.mchus.pro/mchus/logpile/internal/models"
"git.mchus.pro/mchus/logpile/internal/parser"
)
@@ -26,6 +28,29 @@ func TestDetect_G5500V7(t *testing.T) {
}
}
func TestDetect_ServerFileExportMarkers(t *testing.T) {
p := &Parser{}
score := p.Detect([]parser.ExtractedFile{
{Path: "dump_info/RTOSDump/versioninfo/app_revision.txt", Content: []byte("Product Name: G5500 V7")},
{Path: "dump_info/LogDump/netcard/netcard_info.txt", Content: []byte("2026-02-04 03:54:06 UTC")},
{Path: "dump_info/AppDump/card_manage/card_info", Content: []byte("OCP Card Info")},
})
if score < 70 {
t.Fatalf("expected Detect score >= 70 for xFusion file export markers, got %d", score)
}
}
func TestDetect_Negative(t *testing.T) {
p := &Parser{}
score := p.Detect([]parser.ExtractedFile{
{Path: "logs/messages.txt", Content: []byte("plain text")},
{Path: "inventory.json", Content: []byte(`{"vendor":"other"}`)},
})
if score != 0 {
t.Fatalf("expected Detect score 0 for non-xFusion input, got %d", score)
}
}
func TestParse_G5500V7_BoardInfo(t *testing.T) {
files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
p := &Parser{}
@@ -126,6 +151,94 @@ func TestParse_G5500V7_NICs(t *testing.T) {
}
}
func TestParse_ServerFileExport_NetworkAdaptersAndFirmware(t *testing.T) {
p := &Parser{}
files := []parser.ExtractedFile{
{
Path: "dump_info/AppDump/card_manage/card_info",
Content: []byte(strings.TrimSpace(`
Pcie Card Info
Slot | Vender Id | Device Id | Sub Vender Id | Sub Device Id | Segment Number | Bus Number | Device Number | Function Number | Card Desc | Board Id | PCB Version | CPLD Version | Sub Card Bom Id | PartNum | SerialNumber | OriginalPartNum
1 | 0x15b3 | 0x101f | 0x1f24 | 0x2011 | 0x00 | 0x27 | 0x00 | 0x00 | MT2894 Family [ConnectX-6 Lx] | N/A | N/A | N/A | N/A | 0302Y238 | 02Y238X6RC000058 |
OCP Card Info
Slot | Vender Id | Device Id | Sub Vender Id | Sub Device Id | Segment Number | Bus Number | Device Number | Function Number | Card Desc | Board Id | PCB Version | CPLD Version | Sub Card Bom Id | PartNum | SerialNumber | OriginalPartNum
1 | 0x15b3 | 0x101f | 0x1f24 | 0x2011 | 0x00 | 0x27 | 0x00 | 0x00 | MT2894 Family [ConnectX-6 Lx] | N/A | N/A | N/A | N/A | 0302Y238 | 02Y238X6RC000058 |
`)),
},
{
Path: "dump_info/LogDump/netcard/netcard_info.txt",
Content: []byte(strings.TrimSpace(`
2026-02-04 03:54:06 UTC
ProductName :XC385
Manufacture :XFUSION
FirmwareVersion :26.39.2048
SlotId :1
Port0 BDF:0000:27:00.0
MacAddr:44:1A:4C:16:E8:03
ActualMac:44:1A:4C:16:E8:03
Port1 BDF:0000:27:00.1
MacAddr:00:00:00:00:00:00
ActualMac:44:1A:4C:16:E8:04
`)),
},
{
Path: "dump_info/RTOSDump/versioninfo/app_revision.txt",
Content: []byte(strings.TrimSpace(`
------------------- iBMC INFO -------------------
Active iBMC Version: (U68)3.08.05.85
Active iBMC Built: 16:46:26 Jan 4 2026
SDK Version: 13.16.30.16
SDK Built: 07:55:18 Dec 12 2025
Active BIOS Version: (U6216)01.02.08.17
Active BIOS Built: 00:00:00 Jan 05 2026
Product Name: G5500 V7
`)),
},
}
result, err := p.Parse(files)
if err != nil {
t.Fatalf("Parse: %v", err)
}
if result.Protocol != "ipmi" || result.SourceType != models.SourceTypeArchive {
t.Fatalf("unexpected source metadata: protocol=%q source_type=%q", result.Protocol, result.SourceType)
}
if result.Hardware == nil {
t.Fatal("Hardware is nil")
}
if len(result.Hardware.NetworkAdapters) != 1 {
t.Fatalf("expected 1 network adapter, got %d", len(result.Hardware.NetworkAdapters))
}
adapter := result.Hardware.NetworkAdapters[0]
if adapter.BDF != "0000:27:00.0" {
t.Fatalf("expected network adapter BDF 0000:27:00.0, got %q", adapter.BDF)
}
if adapter.Firmware != "26.39.2048" {
t.Fatalf("expected network adapter firmware 26.39.2048, got %q", adapter.Firmware)
}
if adapter.SerialNumber != "02Y238X6RC000058" {
t.Fatalf("expected network adapter serial from card_info, got %q", adapter.SerialNumber)
}
if len(adapter.MACAddresses) != 2 || adapter.MACAddresses[0] != "44:1A:4C:16:E8:03" || adapter.MACAddresses[1] != "44:1A:4C:16:E8:04" {
t.Fatalf("unexpected MAC addresses: %#v", adapter.MACAddresses)
}
fwByDevice := make(map[string]models.FirmwareInfo)
for _, fw := range result.Hardware.Firmware {
fwByDevice[fw.DeviceName] = fw
}
if fwByDevice["iBMC"].Version != "(U68)3.08.05.85" {
t.Fatalf("expected iBMC firmware from app_revision.txt, got %#v", fwByDevice["iBMC"])
}
if fwByDevice["BIOS"].Version != "(U6216)01.02.08.17" {
t.Fatalf("expected BIOS firmware from app_revision.txt, got %#v", fwByDevice["BIOS"])
}
if result.Hardware.BoardInfo.ProductName != "G5500 V7" {
t.Fatalf("expected board product fallback from app_revision.txt, got %q", result.Hardware.BoardInfo.ProductName)
}
}
func TestParse_G5500V7_PSUs(t *testing.T) {
files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
p := &Parser{}

@@ -44,6 +44,9 @@ func TestParserParseExample(t *testing.T) {
examplePath := filepath.Join("..", "..", "..", "..", "example", "xigmanas.txt")
raw, err := os.ReadFile(examplePath)
if err != nil {
if os.IsNotExist(err) {
t.Skipf("example file %s not present", examplePath)
}
t.Fatalf("read example file: %v", err)
}

@@ -3,6 +3,8 @@ package server
import (
"bytes"
"encoding/json"
"fmt"
"net"
"net/http"
"net/http/httptest"
"strings"
@@ -22,6 +24,7 @@ func newCollectTestServer() (*Server, *httptest.Server) {
mux.HandleFunc("POST /api/collect", s.handleCollectStart)
mux.HandleFunc("GET /api/collect/{id}", s.handleCollectStatus)
mux.HandleFunc("POST /api/collect/{id}/cancel", s.handleCollectCancel)
mux.HandleFunc("POST /api/collect/{id}/skip", s.handleCollectSkip)
return s, httptest.NewServer(mux)
}
@@ -29,7 +32,17 @@ func TestCollectProbe(t *testing.T) {
_, ts := newCollectTestServer()
defer ts.Close()
body := `{"host":"bmc-off.local","protocol":"redfish","port":443,"username":"admin","auth_type":"password","password":"secret","tls_mode":"strict"}`
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatalf("listen probe target: %v", err)
}
defer ln.Close()
addr, ok := ln.Addr().(*net.TCPAddr)
if !ok {
t.Fatalf("unexpected listener address type: %T", ln.Addr())
}
body := fmt.Sprintf(`{"host":"127.0.0.1","protocol":"redfish","port":%d,"username":"admin-off","auth_type":"password","password":"secret","tls_mode":"strict"}`, addr.Port)
resp, err := http.Post(ts.URL+"/api/collect/probe", "application/json", bytes.NewBufferString(body))
if err != nil {
t.Fatalf("post collect probe failed: %v", err)
@@ -53,9 +66,6 @@ func TestCollectProbe(t *testing.T) {
if payload.HostPowerState != "Off" {
t.Fatalf("expected host power state Off, got %q", payload.HostPowerState)
}
if !payload.PowerControlAvailable {
t.Fatalf("expected power control to be available")
}
}
func TestCollectLifecycleToTerminal(t *testing.T) {

@@ -21,13 +21,16 @@ func (c *mockConnector) Probe(ctx context.Context, req collector.Request) (*coll
if strings.Contains(strings.ToLower(req.Host), "fail") {
return nil, context.DeadlineExceeded
}
hostPoweredOn := true
if strings.Contains(strings.ToLower(req.Host), "off") || strings.Contains(strings.ToLower(req.Username), "off") {
hostPoweredOn = false
}
return &collector.ProbeResult{
Reachable: true,
Protocol: c.protocol,
HostPowerState: map[bool]string{true: "On", false: "Off"}[!strings.Contains(strings.ToLower(req.Host), "off")],
HostPoweredOn: !strings.Contains(strings.ToLower(req.Host), "off"),
PowerControlAvailable: true,
SystemPath: "/redfish/v1/Systems/1",
Reachable: true,
Protocol: c.protocol,
HostPowerState: map[bool]string{true: "On", false: "Off"}[hostPoweredOn],
HostPoweredOn: hostPoweredOn,
SystemPath: "/redfish/v1/Systems/1",
}, nil
}

@@ -19,18 +19,15 @@ type CollectRequest struct {
Password string `json:"password,omitempty"`
Token string `json:"token,omitempty"`
TLSMode string `json:"tls_mode"`
PowerOnIfHostOff bool `json:"power_on_if_host_off,omitempty"`
StopHostAfterCollect bool `json:"stop_host_after_collect,omitempty"`
DebugPayloads bool `json:"debug_payloads,omitempty"`
DebugPayloads bool `json:"debug_payloads,omitempty"`
}
type CollectProbeResponse struct {
Reachable bool `json:"reachable"`
Protocol string `json:"protocol,omitempty"`
HostPowerState string `json:"host_power_state,omitempty"`
HostPoweredOn bool `json:"host_powered_on"`
PowerControlAvailable bool `json:"power_control_available"`
Message string `json:"message,omitempty"`
HostPowerState string `json:"host_power_state,omitempty"`
HostPoweredOn bool `json:"host_powered_on"`
Message string `json:"message,omitempty"`
}
type CollectJobResponse struct {
@@ -78,7 +75,8 @@ type Job struct {
CreatedAt time.Time
UpdatedAt time.Time
RequestMeta CollectRequestMeta
cancel func()
cancel func()
skipFn func()
}
type CollectModuleStatus struct {

@@ -243,6 +243,8 @@ func BuildHardwareDevices(hw *models.HardwareConfig) []models.HardwareDevice {
Source: "network_adapters",
Slot: nic.Slot,
Location: nic.Location,
BDF: nic.BDF,
DeviceClass: "NetworkController",
VendorID: nic.VendorID,
DeviceID: nic.DeviceID,
Model: nic.Model,
@@ -253,6 +255,11 @@ func BuildHardwareDevices(hw *models.HardwareConfig) []models.HardwareDevice {
PortCount: nic.PortCount,
PortType: nic.PortType,
MACAddresses: nic.MACAddresses,
LinkWidth: nic.LinkWidth,
LinkSpeed: nic.LinkSpeed,
MaxLinkWidth: nic.MaxLinkWidth,
MaxLinkSpeed: nic.MaxLinkSpeed,
NUMANode: nic.NUMANode,
Present: &present,
Status: nic.Status,
StatusCheckedAt: nic.StatusCheckedAt,

@@ -122,6 +122,41 @@ func TestBuildHardwareDevices_ZeroSizeMemoryWithInventoryIsIncluded(t *testing.T
}
}
func TestBuildHardwareDevices_NetworkAdapterPreservesPCIeMetadata(t *testing.T) {
hw := &models.HardwareConfig{
NetworkAdapters: []models.NetworkAdapter{
{
Slot: "1",
Location: "OCP",
Present: true,
BDF: "0000:27:00.0",
Model: "ConnectX-6 Lx",
VendorID: 0x15b3,
DeviceID: 0x101f,
SerialNumber: "NIC-001",
Firmware: "26.39.2048",
MACAddresses: []string{"44:1A:4C:16:E8:03", "44:1A:4C:16:E8:04"},
LinkWidth: 16,
LinkSpeed: "32 GT/s",
NUMANode: 1,
Status: "ok",
},
},
}
devices := BuildHardwareDevices(hw)
for _, d := range devices {
if d.Kind != models.DeviceKindNetwork {
continue
}
if d.BDF != "0000:27:00.0" || d.LinkWidth != 16 || d.LinkSpeed != "32 GT/s" || d.NUMANode != 1 {
t.Fatalf("expected network PCIe metadata to be preserved, got %+v", d)
}
return
}
t.Fatal("expected network device in canonical inventory")
}
func TestBuildSpecification_ZeroSizeMemoryWithInventoryIsShown(t *testing.T) {
hw := &models.HardwareConfig{
Memory: []models.MemoryDIMM{
@@ -223,6 +258,31 @@ func TestBuildHardwareDevices_SkipsFirmwareOnlyNumericSlots(t *testing.T) {
}
}
func TestBuildHardwareDevices_NetworkDevicesUseUnifiedControllerClass(t *testing.T) {
hw := &models.HardwareConfig{
NetworkAdapters: []models.NetworkAdapter{
{
Slot: "NIC1",
Model: "Ethernet Adapter",
Vendor: "Intel",
Present: true,
},
},
}
devices := BuildHardwareDevices(hw)
for _, d := range devices {
if d.Kind != models.DeviceKindNetwork {
continue
}
if d.DeviceClass != "NetworkController" {
t.Fatalf("expected unified network controller class, got %+v", d)
}
return
}
t.Fatalf("expected one canonical network device")
}
func TestHandleGetConfig_ReturnsCanonicalHardware(t *testing.T) {
srv := &Server{}
srv.SetResult(&models.AnalysisResult{

@@ -18,6 +18,7 @@ import (
"sort"
"strconv"
"strings"
"sync"
"sync/atomic"
"time"
@@ -1674,34 +1675,28 @@ func (s *Server) handleCollectProbe(w http.ResponseWriter, r *http.Request) {
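message := "Связь с BMC установлена" // "Connection to BMC established"
if result != nil {
switch {
case !result.HostPoweredOn && result.PowerControlAvailable:
message = "Связь с BMC установлена, host выключен. Можно включить перед сбором." // "Connection to BMC established, host is off. It can be powered on before collection."
case !result.HostPoweredOn:
message = "Связь с BMC установлена, host выключен." // "Connection to BMC established, host is off."
default:
message = "Связь с BMC установлена, host включен." // "Connection to BMC established, host is on."
if result.HostPoweredOn {
message = "Связь с BMC установлена, host включён." // "Connection to BMC established, host is on."
} else {
message = "Связь с BMC установлена, host выключен. Данные инвентаря могут быть неполными." // "Connection to BMC established, host is off. Inventory data may be incomplete."
}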
}
hostPowerState := ""
hostPoweredOn := false
powerControlAvailable := false
reachable := false
if result != nil {
reachable = result.Reachable
hostPowerState = strings.TrimSpace(result.HostPowerState)
hostPoweredOn = result.HostPoweredOn
powerControlAvailable = result.PowerControlAvailable
}
jsonResponse(w, CollectProbeResponse{
Reachable: reachable,
Protocol: req.Protocol,
HostPowerState: hostPowerState,
HostPoweredOn: hostPoweredOn,
PowerControlAvailable: powerControlAvailable,
Message: message,
Reachable: reachable,
Protocol: req.Protocol,
HostPowerState: hostPowerState,
HostPoweredOn: hostPoweredOn,
Message: message,
})
}
@@ -1737,6 +1732,22 @@ func (s *Server) handleCollectCancel(w http.ResponseWriter, r *http.Request) {
jsonResponse(w, job.toStatusResponse())
}
func (s *Server) handleCollectSkip(w http.ResponseWriter, r *http.Request) {
jobID := strings.TrimSpace(r.PathValue("id"))
if !isValidCollectJobID(jobID) {
jsonError(w, "Invalid collect job id", http.StatusBadRequest)
return
}
job, ok := s.jobManager.SkipJob(jobID)
if !ok {
jsonError(w, "Collect job not found", http.StatusNotFound)
return
}
jsonResponse(w, job.toStatusResponse())
}
func (s *Server) startCollectionJob(jobID string, req CollectRequest) {
ctx, cancel := context.WithCancel(context.Background())
if attached := s.jobManager.AttachJobCancel(jobID, cancel); !attached {
@@ -1744,6 +1755,11 @@ func (s *Server) startCollectionJob(jobID string, req CollectRequest) {
return
}
skipCh := make(chan struct{})
var skipOnce sync.Once
skipFn := func() { skipOnce.Do(func() { close(skipCh) }) }
s.jobManager.AttachJobSkip(jobID, skipFn)
go func() {
connector, ok := s.getCollector(req.Protocol)
if !ok {
@@ -1811,7 +1827,9 @@ func (s *Server) startCollectionJob(jobID string, req CollectRequest) {
}
}
result, err := connector.Collect(ctx, toCollectorRequest(req), emitProgress)
collectorReq := toCollectorRequest(req)
collectorReq.SkipHungCh = skipCh
result, err := connector.Collect(ctx, collectorReq, emitProgress)
if err != nil {
if ctx.Err() != nil {
return
@@ -2035,9 +2053,7 @@ func toCollectorRequest(req CollectRequest) collector.Request {
Password: req.Password,
Token: req.Token,
TLSMode: req.TLSMode,
PowerOnIfHostOff: req.PowerOnIfHostOff,
StopHostAfterCollect: req.StopHostAfterCollect,
DebugPayloads: req.DebugPayloads,
DebugPayloads: req.DebugPayloads,
}
}
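
Note on the skip signal in `startCollectionJob` above: closing a channel twice panics in Go, so the diff wraps `close(skipCh)` in a `sync.Once`, which makes `skipFn` safe to call repeatedly. A minimal sketch of that pattern follows. `newSkipSignal` and `skipped` are hypothetical helpers, and how the collector actually consumes `SkipHungCh` is not shown in this diff.

```go
package main

import (
	"fmt"
	"sync"
)

// newSkipSignal returns a receive-only channel and a function that closes
// it at most once, however many times the function is called.
func newSkipSignal() (<-chan struct{}, func()) {
	ch := make(chan struct{})
	var once sync.Once
	return ch, func() { once.Do(func() { close(ch) }) }
}

// skipped reports, without blocking, whether the signal has fired.
// A closed channel is always ready to receive, so every waiter sees it.
func skipped(ch <-chan struct{}) bool {
	select {
	case <-ch:
		return true
	default:
		return false
	}
}

func main() {
	ch, skip := newSkipSignal()
	fmt.Println(skipped(ch)) // not fired yet
	skip()
	skip() // second call is a no-op instead of a panic
	fmt.Println(skipped(ch))
}
```

A closed channel acts as a broadcast: any number of in-flight requests can select on it, which is presumably why a channel is used here rather than a boolean flag.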

@@ -175,6 +175,43 @@ func (m *JobManager) UpdateJobDebugInfo(id string, info *CollectDebugInfo) (*Job
return cloned, true
}
func (m *JobManager) AttachJobSkip(id string, skipFn func()) bool {
m.mu.Lock()
defer m.mu.Unlock()
job, ok := m.jobs[id]
if !ok || job == nil || isTerminalCollectStatus(job.Status) {
return false
}
job.skipFn = skipFn
return true
}
func (m *JobManager) SkipJob(id string) (*Job, bool) {
m.mu.Lock()
job, ok := m.jobs[id]
if !ok || job == nil {
m.mu.Unlock()
return nil, false
}
if isTerminalCollectStatus(job.Status) {
cloned := cloneJob(job)
m.mu.Unlock()
return cloned, true
}
skipFn := job.skipFn
job.skipFn = nil
job.UpdatedAt = time.Now().UTC()
// Log line: "Skipping hung requests at the user's request".
job.Logs = append(job.Logs, formatCollectLogLine(job.UpdatedAt, "Пропуск зависших запросов по команде пользователя"))
cloned := cloneJob(job)
m.mu.Unlock()
if skipFn != nil {
skipFn()
}
return cloned, true
}
func (m *JobManager) AttachJobCancel(id string, cancelFn context.CancelFunc) bool {
m.mu.Lock()
defer m.mu.Unlock()
@@ -229,5 +266,6 @@ func cloneJob(job *Job) *Job {
cloned.CurrentPhase = job.CurrentPhase
cloned.ETASeconds = job.ETASeconds
cloned.cancel = nil
cloned.skipFn = nil
return &cloned
}
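
Note on `SkipJob` above: it copies `job.skipFn` and clears it while holding `m.mu`, then invokes the copy only after unlocking. That keeps arbitrary callback code out of the critical section and makes the callback one-shot. The sketch below shows the same shape in a simplified form. `registry`, `attach`, and `fire` are hypothetical names, and unlike `SkipJob` this version reports `false` when no hook is attached.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a simplified, hypothetical analogue of JobManager that
// stores one callback per id.
type registry struct {
	mu    sync.Mutex
	hooks map[string]func()
}

func (r *registry) attach(id string, fn func()) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.hooks[id] = fn
}

// fire detaches the hook under the lock and runs it after unlocking, so
// the hook may call back into the registry without deadlocking.
func (r *registry) fire(id string) bool {
	r.mu.Lock()
	fn, ok := r.hooks[id]
	delete(r.hooks, id) // one-shot: a second fire finds nothing
	r.mu.Unlock()
	if !ok || fn == nil {
		return false
	}
	fn()
	return true
}

func main() {
	r := &registry{hooks: make(map[string]func())}
	r.attach("job-1", func() {
		// Re-entering the registry is safe because the lock is released.
		r.attach("job-2", func() {})
		fmt.Println("skip requested")
	})
	fmt.Println(r.fire("job-1"))
	fmt.Println(r.fire("job-1"))
}
```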

@@ -99,6 +99,7 @@ func (s *Server) setupRoutes() {
s.mux.HandleFunc("POST /api/collect/probe", s.handleCollectProbe)
s.mux.HandleFunc("GET /api/collect/{id}", s.handleCollectStatus)
s.mux.HandleFunc("POST /api/collect/{id}/cancel", s.handleCollectCancel)
s.mux.HandleFunc("POST /api/collect/{id}/skip", s.handleCollectSkip)
}
func (s *Server) Run() error {

@@ -24,6 +24,7 @@ func newFlowTestServer() (*Server, *httptest.Server) {
mux.HandleFunc("POST /api/collect", s.handleCollectStart)
mux.HandleFunc("GET /api/collect/{id}", s.handleCollectStatus)
mux.HandleFunc("POST /api/collect/{id}/cancel", s.handleCollectCancel)
mux.HandleFunc("POST /api/collect/{id}/skip", s.handleCollectSkip)
return s, httptest.NewServer(mux)
}

Binary file logpile not shown.

@@ -128,6 +128,7 @@ echo ""
# Show next steps
echo -e "${YELLOW}Next steps:${NC}"
echo " 1. Create git tag:"
echo " # LOGPile release tags use vN.M, for example: v1.12"
echo " git tag -a ${VERSION} -m \"Release ${VERSION}\""
echo ""
echo " 2. Push tag to remote:"

@@ -211,8 +211,6 @@ main {
}
#api-connect-btn,
#api-power-on-collect-btn,
#api-collect-off-btn,
#convert-folder-btn,
#convert-run-btn,
#cancel-job-btn,
@@ -229,8 +227,6 @@ main {
}
#api-connect-btn:hover,
#api-power-on-collect-btn:hover,
#api-collect-off-btn:hover,
#convert-folder-btn:hover,
#convert-run-btn:hover,
#cancel-job-btn:hover,
@@ -241,8 +237,6 @@ main {
#convert-run-btn:disabled,
#convert-folder-btn:disabled,
#api-connect-btn:disabled,
#api-power-on-collect-btn:disabled,
#api-collect-off-btn:disabled,
#cancel-job-btn:disabled,
.upload-area button:disabled {
opacity: 0.6;
@@ -311,64 +305,19 @@ main {
border-top: 1px solid #e2e8f0;
}
.api-confirm-modal-backdrop {
position: fixed;
inset: 0;
background: rgba(0, 0, 0, 0.45);
.api-host-off-warning {
display: flex;
align-items: center;
justify-content: center;
z-index: 1000;
}
.api-confirm-modal {
background: #fff;
border-radius: 10px;
padding: 1.5rem 1.75rem;
max-width: 380px;
width: 90%;
box-shadow: 0 8px 32px rgba(0,0,0,0.18);
}
.api-confirm-modal p {
margin-bottom: 1.1rem;
font-size: 0.95rem;
color: #333;
line-height: 1.5;
}
.api-confirm-modal-actions {
display: flex;
gap: 0.6rem;
justify-content: flex-end;
}
.api-confirm-modal-actions button {
border: none;
gap: 0.4rem;
padding: 0.5rem 0.75rem;
background: #fef3c7;
border: 1px solid #f59e0b;
border-radius: 6px;
padding: 0.5rem 1rem;
font-size: 0.9rem;
font-weight: 600;
cursor: pointer;
font-size: 0.875rem;
color: #92400e;
font-weight: 500;
}
.api-confirm-modal-actions .btn-cancel {
background: #e2e8f0;
color: #333;
}
.api-confirm-modal-actions .btn-cancel:hover {
background: #cbd5e1;
}
.api-confirm-modal-actions .btn-confirm {
background: #dc3545;
color: #fff;
}
.api-confirm-modal-actions .btn-confirm:hover {
background: #b02a37;
}
.api-connect-status {
margin-top: 0.75rem;
@@ -445,6 +394,33 @@ main {
cursor: default;
}
.job-status-actions {
display: flex;
gap: 0.5rem;
align-items: center;
}
#skip-hung-btn {
background: #f59e0b;
color: #fff;
border: none;
border-radius: 6px;
padding: 0.5rem 0.9rem;
font-size: 0.875rem;
font-weight: 600;
cursor: pointer;
transition: background-color 0.2s ease, opacity 0.2s ease;
}
#skip-hung-btn:hover {
background: #d97706;
}
#skip-hung-btn:disabled {
opacity: 0.6;
cursor: not-allowed;
}
.job-status-meta {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(230px, 1fr));

@@ -91,9 +91,9 @@ function initApiSource() {
}
const cancelJobButton = document.getElementById('cancel-job-btn');
const skipHungButton = document.getElementById('skip-hung-btn');
const connectButton = document.getElementById('api-connect-btn');
const collectButton = document.getElementById('api-collect-btn');
const powerOffCheckbox = document.getElementById('api-power-off');
const fieldNames = ['host', 'port', 'username', 'password'];
apiForm.addEventListener('submit', (event) => {
@@ -110,6 +110,11 @@ function initApiSource() {
cancelCollectionJob();
});
}
if (skipHungButton) {
skipHungButton.addEventListener('click', () => {
skipHungCollectionJob();
});
}
if (connectButton) {
connectButton.addEventListener('click', () => {
startApiProbe();
@@ -120,22 +125,6 @@ function initApiSource() {
startCollectionWithOptions();
});
}
if (powerOffCheckbox) {
powerOffCheckbox.addEventListener('change', () => {
if (!powerOffCheckbox.checked) {
return;
}
// If host was already on when probed, warn before enabling shutdown
if (apiProbeResult && apiProbeResult.host_powered_on) {
showConfirmModal(
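// Prompt: "The host was powered on before collection started. Are you sure you want to power it off after collection finishes?"
'Хост был включён до начала сбора. Вы уверены, что хотите выключить его после завершения сбора?',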
() => { /* confirmed — leave checked */ },
() => { powerOffCheckbox.checked = false; }
);
}
});
}
fieldNames.forEach((fieldName) => {
const field = apiForm.elements.namedItem(fieldName);
if (!field) {
@@ -163,36 +152,6 @@ function initApiSource() {
renderCollectionJob();
}
function showConfirmModal(message, onConfirm, onCancel) {
const backdrop = document.createElement('div');
backdrop.className = 'api-confirm-modal-backdrop';
backdrop.innerHTML = `
<div class="api-confirm-modal" role="dialog" aria-modal="true">
<p>${escapeHtml(message)}</p>
<div class="api-confirm-modal-actions">
<button class="btn-cancel">Отмена</button> <!-- "Cancel" -->
<button class="btn-confirm">Да, выключить</button> <!-- "Yes, power off" -->
</div>
</div>
`;
document.body.appendChild(backdrop);
const close = () => document.body.removeChild(backdrop);
backdrop.querySelector('.btn-cancel').addEventListener('click', () => {
close();
if (onCancel) onCancel();
});
backdrop.querySelector('.btn-confirm').addEventListener('click', () => {
close();
if (onConfirm) onConfirm();
});
backdrop.addEventListener('click', (e) => {
if (e.target === backdrop) {
close();
if (onCancel) onCancel();
}
});
}
function startApiProbe() {
const { isValid, payload, errors } = validateCollectForm();
@@ -255,11 +214,7 @@ function startCollectionWithOptions() {
return;
}
const powerOnCheckbox = document.getElementById('api-power-on');
const powerOffCheckbox = document.getElementById('api-power-off');
const debugPayloads = document.getElementById('api-debug-payloads');
payload.power_on_if_host_off = powerOnCheckbox ? powerOnCheckbox.checked : false;
payload.stop_host_after_collect = powerOffCheckbox ? powerOffCheckbox.checked : false;
payload.debug_payloads = debugPayloads ? debugPayloads.checked : false;
startCollectionJob(payload);
}
@@ -268,8 +223,6 @@ function renderApiProbeState() {
const connectButton = document.getElementById('api-connect-btn');
const probeOptions = document.getElementById('api-probe-options');
const status = document.getElementById('api-connect-status');
const powerOnCheckbox = document.getElementById('api-power-on');
const powerOffCheckbox = document.getElementById('api-power-off');
if (!connectButton || !probeOptions || !status) {
return;
}
@@ -283,7 +236,6 @@ function renderApiProbeState() {
}
const hostOn = apiProbeResult.host_powered_on;
const powerControlAvailable = apiProbeResult.power_control_available;
if (hostOn) {
status.textContent = apiProbeResult.message || 'Связь с BMC есть, host включён.'; // fallback: "BMC is reachable, host is on."
@@ -295,25 +247,15 @@ function renderApiProbeState() {
probeOptions.classList.remove('hidden');
// "Power on" ("Включить") checkbox
if (powerOnCheckbox) {
const hostOffWarning = document.getElementById('api-host-off-warning');
if (hostOffWarning) {
if (hostOn) {
// Host already on — checkbox is checked and disabled
powerOnCheckbox.checked = true;
powerOnCheckbox.disabled = true;
hostOffWarning.classList.add('hidden');
} else {
// Host off — default: checked (will power on), enabled
powerOnCheckbox.checked = true;
powerOnCheckbox.disabled = !powerControlAvailable;
hostOffWarning.classList.remove('hidden');
}
}
// "Power off" ("Выключить") checkbox, default: unchecked
if (powerOffCheckbox) {
powerOffCheckbox.checked = false;
powerOffCheckbox.disabled = !powerControlAvailable;
}
connectButton.textContent = 'Переподключиться'; // "Reconnect"
}
@@ -535,6 +477,36 @@ function pollCollectionJobStatus() {
});
}
function skipHungCollectionJob() {
if (!collectionJob || isCollectionJobTerminal(collectionJob.status)) {
return;
}
const btn = document.getElementById('skip-hung-btn');
if (btn) {
btn.disabled = true;
btn.textContent = 'Пропуск...'; // "Skipping..."
}
fetch(`/api/collect/${encodeURIComponent(collectionJob.id)}/skip`, {
method: 'POST'
})
.then(async (response) => {
const body = await response.json().catch(() => ({}));
if (!response.ok) {
throw new Error(body.error || 'Не удалось пропустить зависшие запросы'); // "Failed to skip hung requests"
}
syncServerLogs(body.logs);
renderCollectionJob();
})
.catch((err) => {
appendJobLog(`Ошибка пропуска: ${err.message}`); // "Skip error: <message>"
if (btn) {
btn.disabled = false;
btn.textContent = 'Пропустить зависшие'; // "Skip hung"
}
renderCollectionJob();
});
}
function cancelCollectionJob() {
if (!collectionJob || isCollectionJobTerminal(collectionJob.status)) {
return;
@@ -671,6 +643,19 @@ function renderCollectionJob() {
)).join('');
cancelButton.disabled = isTerminal;
const skipBtn = document.getElementById('skip-hung-btn');
if (skipBtn) {
const isCollecting = !isTerminal && collectionJob.status === 'running';
if (isCollecting) {
skipBtn.classList.remove('hidden');
} else {
skipBtn.classList.add('hidden');
skipBtn.disabled = false;
skipBtn.textContent = 'Пропустить зависшие'; // "Skip hung"
}
}
setApiFormBlocked(!isTerminal);
}

@@ -80,18 +80,12 @@
</div>
<div id="api-connect-status" class="api-connect-status"></div>
<div id="api-probe-options" class="api-probe-options hidden">
<label class="api-form-checkbox" for="api-power-on">
<input id="api-power-on" name="power_on_if_host_off" type="checkbox">
<span>Включить перед сбором</span> <!-- "Power on before collection" -->
</label>
<label class="api-form-checkbox" for="api-power-off">
<input id="api-power-off" name="stop_host_after_collect" type="checkbox">
<span>Выключить после сбора</span> <!-- "Power off after collection" -->
</label>
<div class="api-probe-options-separator"></div>
<div id="api-host-off-warning" class="api-host-off-warning hidden">
&#9888; Host выключен — данные инвентаря могут быть неполными <!-- "Host is off: inventory data may be incomplete" -->
</div>
<label class="api-form-checkbox" for="api-debug-payloads">
<input id="api-debug-payloads" name="debug_payloads" type="checkbox">
<span>Сбор расширенных метрик для отладки</span> <!-- "Collect extended metrics for debugging" -->
<span>Сбор расширенных данных для диагностики</span> <!-- "Collect extended data for diagnostics" -->
</label>
<div class="api-form-actions">
<button id="api-collect-btn" type="submit">Собрать</button> <!-- "Collect" -->
@@ -102,7 +96,10 @@
<section id="api-job-status" class="job-status hidden" aria-live="polite">
<div class="job-status-header">
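<h4>Статус задачи сбора</h4> <!-- "Collection job status" -->
<button id="cancel-job-btn" type="button">Отменить</button> <!-- "Cancel" -->
<div class="job-status-actions">
<!-- Button label: "Skip hung"; title: "Abort hung requests and proceed to analysis of the collected data" -->
<button id="skip-hung-btn" type="button" class="hidden" title="Прервать зависшие запросы и перейти к анализу собранных данных">Пропустить зависшие</button>
<button id="cancel-job-btn" type="button">Отменить</button> <!-- "Cancel" -->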
</div>
</div>
<div class="job-status-meta">
<div><span class="meta-label">jobId:</span> <code id="job-id-value">-</code></div>