feat: Redfish hardware event log collection + MSI ghost GPU filter + inventory improvements

- Collect hardware event logs (last 7 days) from Systems and Managers/SEL LogServices
- Parse AMI raw IPMI dump messages into readable descriptions (Sensor_Type: Event_Type)
- Filter out audit/journal/non-hardware log services; only SEL from Managers
- MSI ghost GPU filter: exclude processor GPU entries with temperature=0 when host is powered on
- Reanimator collected_at uses InventoryData/Status.LastModifiedTime (30-day fallback)
- Invalidate Redfish inventory CRC groups before host power-on
- Log inventory LastModifiedTime age in collection logs
- Drop SecureBoot collection (SecureBootMode, SecureBootDatabases): not hardware inventory
- Add build version to UI footer via template
- Add MSI Redfish API reference doc to bible-local/docs/

ADL-032–ADL-035

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -822,3 +822,99 @@ special acquisition strategy.
- Repo-owned compact fixtures under `internal/collector/redfishprofile/testdata/`, derived from
  representative raw-export snapshots, are used to lock profile matching and acquisition tuning
  for known MSI and Supermicro-family shapes.

---

## ADL-032 — MSI ghost GPU filter: exclude GPUs with temperature=0 on powered-on host

**Date:** 2026-03-18
**Context:**
The MSI/AMI BMC caches GPU inventory pushed by the host over the Host Interface (in-band). When
GPUs are removed without a reboot, the old entries remain in `Chassis/GPU*` and
`Systems/Self/Processors/GPU*` with `Status.Health: OK, State: Enabled`. The BMC has no
out-of-band mechanism to detect physical absence. A physically present GPU reports a
temperature above 0 °C even when idle, whereas a stale cached entry returns `Reading: 0`.

**Decision:**
- Add an `EnableMSIGhostGPUFilter` directive (enabled by the MSI profile's `refineAnalysis`
  alongside `EnableProcessorGPUFallback`).
- In `collectGPUsFromProcessors`: for each processor GPU, resolve its chassis path and read
  `Chassis/GPU{n}/Sensors/GPU{n}_Temperature`. If `PowerState=On` and `Reading=0` → skip the entry.
- The filter applies only when the host is powered on; when the host is off, all temperatures
  read 0 and the signal is ambiguous.
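
The decision above reduces to a small predicate. The following is a minimal sketch, not the repository's code: `gpuTempSensor` and `isGhostGPU` are hypothetical names, and treating a missing sensor as "keep the GPU" is an assumption taken from the safe default described in this ADL.

```go
package main

// gpuTempSensor is a hypothetical, trimmed view of the
// Chassis/GPU{n}/Sensors/GPU{n}_Temperature document.
type gpuTempSensor struct {
	Reading *float64 // nil when the document carries no Reading
}

// isGhostGPU reports whether a processor GPU entry should be skipped.
// It only fires while the host is powered on; with the host off every
// temperature reads 0 and the signal is ambiguous.
func isGhostGPU(powerState string, sensor *gpuTempSensor) bool {
	if powerState != "On" {
		return false
	}
	if sensor == nil || sensor.Reading == nil {
		// Unknown sensor path: include rather than wrongly exclude.
		return false
	}
	return *sensor.Reading == 0
}
```

A caller would skip the GPU when `isGhostGPU(powerState, sensor)` returns true and keep it in every other case.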

**Consequences:**
- Ghost GPUs from previous hardware configurations no longer appear in the inventory.
- The filter is MSI-profile-owned and does not affect HGX, Supermicro, or generic paths.
- Any new MSI GPU chassis that uses a different temperature sensor path will bypass the filter
  (safe default: include rather than wrongly exclude).

---

## ADL-033 — Reanimator export collected_at uses inventory LastModifiedTime with 30-day fallback

**Date:** 2026-03-18
**Context:**
For Redfish sources, the BMC Manager `DateTime` is simply the BMC clock at the moment it is read,
not the time the hardware inventory was last known-good. `InventoryData/Status.LastModifiedTime`
(AMI/MSI OEM endpoint) records the timestamp of the last successful host-pushed
inventory cycle and is a better proxy for "when was this hardware configuration last confirmed".

**Decision:**
- `inferInventoryLastModifiedTime` reads `LastModifiedTime` from the snapshot and sets
  `AnalysisResult.InventoryLastModifiedAt`.
- `reanimatorCollectedAt()` in the exporter selects `InventoryLastModifiedAt` when it is set
  and no older than 30 days; otherwise it falls back to `CollectedAt`.
- Fallback rationale: inventory older than 30 days is likely from a long-running server with
  no recent reboot; using the actual collection date is more useful for the downstream consumer.
- The inventory timestamp is also logged during replay and live collection for diagnostics.

**Consequences:**
- Reanimator export `collected_at` reflects the last confirmed inventory cycle on AMI/MSI BMCs.
- On non-AMI BMCs, or when `InventoryData/Status` is absent, behavior is unchanged.
- If the inventory is stale (>30 days), the collection date is used as before.

---

## ADL-034 — Redfish inventory invalidated before host power-on

**Date:** 2026-03-18
**Context:**
When a host is powered on by the collector (`power_on_if_host_off=true`), the BMC still holds
inventory from the previous boot. If hardware changed while the host was off, the new boot will
push fresh inventory, but the BMC re-populates only when it sees a CRC mismatch. Without
explicit invalidation, unchanged CRCs can cause the BMC to skip re-processing even after a
hardware change.

**Decision:**
- Before any power-on attempt, `invalidateRedfishInventory` POSTs to
  `{systemPath}/Oem/Ami/Inventory/Crc` with all groups zeroed (`CPU`, `DIMM`, `PCIE`,
  `CERTIFICATES`, `SECUREBOOT`).
- Best-effort: a 404/405 response (non-AMI BMC) is logged and otherwise ignored.
- The invalidation is logged at `INFO` level and surfaced as a collect progress message.

**Consequences:**
- On AMI/MSI BMCs: the next boot will push a full fresh inventory regardless of whether
  CRCs appear unchanged, eliminating ghost components from prior hardware configurations.
- On non-AMI BMCs: the POST fails immediately (endpoint does not exist) and nothing changes.
- Invalidation runs only when `power_on_if_host_off=true` and the host is confirmed off.

---

## ADL-035 — Redfish hardware event log collection from Systems LogServices

**Date:** 2026-03-18
**Context:** Redfish BMCs expose event logs via `LogServices/{svc}/Entries`. On MSI/AMI this includes the IPMI SEL with hardware events (temperature, power, drive failures, etc.). Live collection previously gathered only inventory/sensor snapshots; event history was unavailable in Reanimator.
**Decision:**
- After the tree-walk, fetch hardware log entries separately via `collectRedfishLogEntries()` (not part of the tree-walk, to avoid bloat).
- Only `Systems/{sys}/LogServices` is queried; Managers LogServices (BMC audit/journal) are excluded.
- Log services whose Id/Name contains "audit", "journal", "bmc", "security", "manager", or "debug" are skipped.
- Entries older than 7 days are discarded (client-side filter). Pages are followed until an out-of-window entry is found (assumes newest-first ordering, typical for BMCs).
- Entries with `EntryType: "Oem"` or a `MessageId` containing user/auth/login keywords are filtered out as non-hardware.
- Raw entries are stored in `rawPayloads["redfish_log_entries"]` as `[]map[string]interface{}`.
- Entries are parsed to `models.Event` in `parseRedfishLogEntries()` during replay, the same path for live and offline.
- At most 200 entries per log service and 500 in total, to limit BMC load.

**Consequences:**
- Hardware event history (last 7 days) is visible in the Reanimator `EventLogs` section.
- No impact on the existing inventory pipeline or offline archive replay (archives without a `redfish_log_entries` key silently skip parsing).
- Adds extra HTTP requests during live collection (sequential, after the tree-walk completes).
343
bible-local/docs/msi-redfish-api.md
Normal file
@@ -0,0 +1,343 @@
# MSI BMC Redfish API Reference

Source: MSI Enterprise Platform Solutions, Redfish BMC User Guide v1.0 (AMI/MegaRAC stack).
Spec compliance: DSP0266 1.15.1, DSP8010 2019.2.

> This document is trimmed to sections relevant to LOGPile collection and inventory analysis.
> Auth, LDAP/AD, SMTP, VirtualMedia, Certificates, RADIUS, Composability, and BMC config
> sections are omitted.

---

## Supported HTTP methods

`GET`, `POST`, `PATCH`, `DELETE`. Unsupported methods return `405`.

PATCH requires an `If-Match` precondition header carrying the resource's current `ETag`; missing header → `428`, mismatch → `412`.

---

## 1. Core Redfish API endpoints

| Resource | URI | Schema |
|---|---|---|
| Service Root | `/redfish/v1/` | ServiceRoot.v1_7_0 |
| ComputerSystem Collection | `/redfish/v1/Systems` | ComputerSystemCollection |
| ComputerSystem | `/redfish/v1/Systems/{sys}` | ComputerSystem.v1_16_2 |
| Memory Collection | `/redfish/v1/Systems/{sys}/Memory` | MemoryCollection |
| Memory | `/redfish/v1/Systems/{sys}/Memory/{mem}` | Memory.v1_19_0 |
| MemoryMetrics | `/redfish/v1/Systems/{sys}/Memory/{mem}/MemoryMetrics` | MemoryMetrics.v1_7_0 |
| MemoryDomain Collection | `/redfish/v1/Systems/{sys}/MemoryDomain` | MemoryDomainCollection |
| MemoryDomain | `/redfish/v1/Systems/{sys}/MemoryDomain/{dom}` | MemoryDomain.v1_2_3 |
| MemoryChunks Collection | `/redfish/v1/Systems/{sys}/MemoryDomain/{dom}/MemoryChunks` | MemoryChunksCollection |
| MemoryChunks | `/redfish/v1/Systems/{sys}/MemoryDomain/{dom}/MemoryChunks/{chunk}` | MemoryChunks.v1_4_0 |
| Processor Collection | `/redfish/v1/Systems/{sys}/Processors` | ProcessorCollection |
| Processor | `/redfish/v1/Systems/{sys}/Processors/{proc}` | Processor.v1_15_0 |
| SubProcessors Collection | `/redfish/v1/Systems/{sys}/Processors/{proc}/SubProcessors` | ProcessorCollection |
| SubProcessor | `/redfish/v1/Systems/{sys}/Processors/{proc}/SubProcessors/{sub}` | Processor.v1_15_0 |
| ProcessorMetrics | `/redfish/v1/Systems/{sys}/Processors/{proc}/ProcessorMetrics` | ProcessorMetrics.v1_4_0 |
| Bios | `/redfish/v1/Systems/{sys}/Bios` | Bios.v1_2_0 |
| SimpleStorage Collection | `/redfish/v1/Systems/{sys}/SimpleStorage` | SimpleStorageCollection |
| SimpleStorage | `/redfish/v1/Systems/{sys}/SimpleStorage/{ss}` | SimpleStorage.v1_3_0 |
| Storage Collection | `/redfish/v1/Systems/{sys}/Storage` | StorageCollection |
| Storage | `/redfish/v1/Systems/{sys}/Storage/{stor}` | Storage.v1_9_0 |
| StorageController Collection | `/redfish/v1/Systems/{sys}/Storage/{stor}/Controllers` | StorageControllerCollection |
| StorageController | `/redfish/v1/Systems/{sys}/Storage/{stor}/Controllers/{ctrl}` | StorageController.v1_0_0 |
| Drive | `/redfish/v1/Systems/{sys}/Storage/{stor}/Drives/{drv}` | Drive.v1_13_0 |
| Volume Collection | `/redfish/v1/Systems/{sys}/Storage/{stor}/Volumes` | VolumeCollection |
| Volume | `/redfish/v1/Systems/{sys}/Storage/{stor}/Volumes/{vol}` | Volume.v1_5_0 |
| NetworkInterface Collection | `/redfish/v1/Systems/{sys}/NetworkInterfaces` | NetworkInterfaceCollection |
| NetworkInterface | `/redfish/v1/Systems/{sys}/NetworkInterfaces/{nic}` | NetworkInterface.v1_2_0 |
| EthernetInterface (System) | `/redfish/v1/Systems/{sys}/EthernetInterfaces/{eth}` | EthernetInterface.v1_6_2 |
| GraphicsController Collection | `/redfish/v1/Systems/{sys}/GraphicsControllers` | GraphicsControllerCollection |
| GraphicsController | `/redfish/v1/Systems/{sys}/GraphicsControllers/{gpu}` | GraphicsController.v1_0_0 |
| USBController Collection | `/redfish/v1/Systems/{sys}/USBControllers` | USBControllerCollection |
| USBController | `/redfish/v1/Systems/{sys}/USBControllers/{usb}` | USBController.v1_0_0 |
| SecureBoot | `/redfish/v1/Systems/{sys}/SecureBoot` | SecureBoot.v1_1_0 |
| LogService Collection (System) | `/redfish/v1/Systems/{sys}/LogServices` | LogServiceCollection |
| LogService (System) | `/redfish/v1/Systems/{sys}/LogServices/{log}` | LogService.v1_1_3 |
| LogEntry Collection | `/redfish/v1/Systems/{sys}/LogServices/{log}/Entries` | LogEntryCollection |
| LogEntry | `/redfish/v1/Systems/{sys}/LogServices/{log}/Entries/{entry}` | LogEntry.v1_12_0 |
| Chassis Collection | `/redfish/v1/Chassis` | ChassisCollection |
| Chassis | `/redfish/v1/Chassis/{ch}` | Chassis.v1_15_0 |
| Power | `/redfish/v1/Chassis/{ch}/Power` | Power.v1_5_4 |
| PowerSubSystem | `/redfish/v1/Chassis/{ch}/PowerSubSystem` | PowerSubsystem.v1_1_0 |
| PowerSupplies Collection | `/redfish/v1/Chassis/{ch}/PowerSubSystem/PowerSupplies` | PowerSupplyCollection |
| PowerSupply | `/redfish/v1/Chassis/{ch}/PowerSubSystem/PowerSupplies/{psu}` | PowerSupply.v1_3_0 |
| PowerSupplyMetrics | `/redfish/v1/Chassis/{ch}/PowerSubSystem/PowerSupplies/{psu}/Metrics` | PowerSupplyMetrics.v1_0_1 |
| Thermal | `/redfish/v1/Chassis/{ch}/Thermal` | Thermal.v1_5_3 |
| ThermalSubSystem | `/redfish/v1/Chassis/{ch}/ThermalSubSystem` | ThermalSubsystem.v1_0_0 |
| ThermalMetrics | `/redfish/v1/Chassis/{ch}/ThermalSubSystem/ThermalMetrics` | ThermalMetrics.v1_0_1 |
| Fans Collection | `/redfish/v1/Chassis/{ch}/ThermalSubSystem/Fans` | FanCollection |
| Fan | `/redfish/v1/Chassis/{ch}/ThermalSubSystem/Fans/{fan}` | Fan.v1_1_1 |
| Sensor Collection | `/redfish/v1/Chassis/{ch}/Sensors` | SensorCollection |
| Sensor | `/redfish/v1/Chassis/{ch}/Sensors/{sen}` | Sensor.v1_0_2 |
| PCIeDevice Collection | `/redfish/v1/Chassis/{ch}/PCIeDevices` | PCIeDeviceCollection |
| PCIeDevice | `/redfish/v1/Chassis/{ch}/PCIeDevices/{dev}` | PCIeDevice.v1_9_0 |
| PCIeFunction Collection | `/redfish/v1/Chassis/{ch}/PCIeDevices/{dev}/PCIeFunctions` | PCIeFunctionCollection |
| PCIeFunction | `/redfish/v1/Chassis/{ch}/PCIeDevices/{dev}/PCIeFunctions/{fn}` | PCIeFunction.v1_2_3 |
| PCIeSlots | `/redfish/v1/Chassis/{ch}/PCIeSlots` | PCIeSlots.v1_5_0 |
| NetworkAdapter Collection | `/redfish/v1/Chassis/{ch}/NetworkAdapters` | NetworkAdapterCollection |
| NetworkAdapter | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}` | NetworkAdapter.v1_8_0 |
| NetworkDeviceFunction Collection | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}/NetworkDeviceFunctions` | NetworkDeviceFunctionCollection |
| NetworkDeviceFunction | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}/NetworkDeviceFunctions/{fn}` | NetworkDeviceFunction.v1_5_0 |
| Assembly | `/redfish/v1/Chassis/{ch}/Assembly` | Assembly.v1_2_2 |
| Assembly (Drive) | `/redfish/v1/Systems/{sys}/Storage/{stor}/Drives/{drv}/Assembly` | Assembly.v1_2_2 |
| Assembly (Processor) | `/redfish/v1/Systems/{sys}/Processors/{proc}/Assembly` | Assembly.v1_2_2 |
| Assembly (Memory) | `/redfish/v1/Systems/{sys}/Memory/{mem}/Assembly` | Assembly.v1_2_2 |
| Assembly (NetworkAdapter) | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}/Assembly` | Assembly.v1_2_2 |
| Assembly (PCIeDevice) | `/redfish/v1/Chassis/{ch}/PCIeDevices/{dev}/Assembly` | Assembly.v1_2_2 |
| MediaController Collection | `/redfish/v1/Chassis/{ch}/MediaControllers` | MediaControllerCollection |
| MediaController | `/redfish/v1/Chassis/{ch}/MediaControllers/{mc}` | MediaController.v1_1_0 |
| LogService Collection (Chassis) | `/redfish/v1/Chassis/{ch}/LogServices` | LogServiceCollection |
| LogService (Chassis) | `/redfish/v1/Chassis/{ch}/LogServices/{log}` | LogService.v1_1_3 |
| Manager Collection | `/redfish/v1/Managers` | ManagerCollection |
| Manager | `/redfish/v1/Managers/{mgr}` | Manager.v1_13_0 |
| EthernetInterface (Manager) | `/redfish/v1/Managers/{mgr}/EthernetInterfaces/{eth}` | EthernetInterface.v1_6_2 |
| LogService Collection (Manager) | `/redfish/v1/Managers/{mgr}/LogServices` | LogServiceCollection |
| LogService (Manager) | `/redfish/v1/Managers/{mgr}/LogServices/{log}` | LogService.v1_1_3 |
| UpdateService | `/redfish/v1/UpdateService` | UpdateService.v1_6_0 |
| TaskService | `/redfish/v1/TaskService` | TaskService.v1_1_4 |
| Task Collection | `/redfish/v1/TaskService/Tasks` | TaskCollection |
| Task | `/redfish/v1/TaskService/Tasks/{task}` | Task.v1_4_2 |

---

## 2. Telemetry API endpoints

| Resource | URI | Schema |
|---|---|---|
| TelemetryService | `/redfish/v1/TelemetryService` | TelemetryService.v1_2_1 |
| MetricDefinition Collection | `/redfish/v1/TelemetryService/MetricDefinitions` | MetricDefinitionCollection |
| MetricDefinition | `/redfish/v1/TelemetryService/MetricDefinitions/{md}` | MetricDefinition.v1_0_3 |
| MetricReportDefinition Collection | `/redfish/v1/TelemetryService/MetricReportDefinitions` | MetricReportDefinitionCollection |
| MetricReportDefinition | `/redfish/v1/TelemetryService/MetricReportDefinitions/{mrd}` | MetricReportDefinition.v1_3_0 |
| MetricReport Collection | `/redfish/v1/TelemetryService/MetricReports` | MetricReportCollection |
| MetricReport | `/redfish/v1/TelemetryService/MetricReports/{mr}` | MetricReport.v1_2_0 |
| Telemetry LogService | `/redfish/v1/TelemetryService/LogService` | LogService.v1_1_3 |
| Telemetry LogEntry Collection | `/redfish/v1/TelemetryService/LogService/Entries` | LogEntryCollection |

---

## 3. Processor / NIC sub-resources (GPU-relevant)

| Resource | URI |
|---|---|
| Processor (NetworkAdapter) | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}/Processors/{proc}` |
| AccelerationFunction Collection | `/redfish/v1/Systems/{sys}/Processors/{proc}/AccelerationFunctions` |
| AccelerationFunction | `/redfish/v1/Systems/{sys}/Processors/{proc}/AccelerationFunctions/{fn}` |
| Port Collection (NetworkAdapter) | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}/Ports` |
| Port (GraphicsController) | `/redfish/v1/Systems/{sys}/GraphicsControllers/{gpu}/Ports/{port}` |
| OperatingConfig Collection | `/redfish/v1/Systems/{sys}/Processors/{proc}/OperatingConfigs` |
| OperatingConfig | `/redfish/v1/Systems/{sys}/Processors/{proc}/OperatingConfigs/{cfg}` |

---

## 4. Error response format

On error, the service returns an HTTP status code and a JSON body with a single `error` property:

```json
{
  "error": {
    "code": "Base.1.12.0.ActionParameterMissing",
    "message": "...",
    "@Message.ExtendedInfo": [
      {
        "@odata.type": "#Message.v1_0_8.Message",
        "MessageId": "Base.1.12.0.ActionParameterMissing",
        "Message": "...",
        "MessageArgs": [],
        "Severity": "Warning",
        "Resolution": "..."
      }
    ]
  }
}
```
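
A client can decode this body with the standard `encoding/json` package. The sketch below is illustrative only: `redfishError` and `parseRedfishError` are hypothetical names, and the struct models just the fields shown in the example above.

```go
package main

import "encoding/json"

// redfishError is a trimmed model of the error body shown above.
type redfishError struct {
	Error struct {
		Code         string `json:"code"`
		Message      string `json:"message"`
		ExtendedInfo []struct {
			MessageID  string `json:"MessageId"`
			Message    string `json:"Message"`
			Severity   string `json:"Severity"`
			Resolution string `json:"Resolution"`
		} `json:"@Message.ExtendedInfo"`
	} `json:"error"`
}

// parseRedfishError decodes an error body and returns the top-level code and
// the MessageId of the first extended-info entry, if there is one.
func parseRedfishError(body []byte) (code, firstMessageID string, err error) {
	var e redfishError
	if err = json.Unmarshal(body, &e); err != nil {
		return "", "", err
	}
	if len(e.Error.ExtendedInfo) > 0 {
		firstMessageID = e.Error.ExtendedInfo[0].MessageID
	}
	return e.Error.Code, firstMessageID, nil
}
```

Logging the returned code next to the HTTP status makes it easier to tell a validation failure from a missing endpoint.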

**Common status codes:**

| Code | Meaning |
|------|---------|
| 200 | OK with body |
| 201 | Created |
| 204 | Success, no body |
| 400 | Bad request / validation error |
| 401 | Unauthorized |
| 403 | Forbidden / firmware update in progress |
| 404 | Resource not found |
| 405 | Method not allowed |
| 412 | ETag precondition failed (PATCH) |
| 415 | Unsupported media type |
| 428 | Missing precondition header (PATCH) |
| 501 | Not implemented |

**Request validation sequence:**
1. Authorization check → 401
2. Entity privilege check → 403
3. URI existence → 404
4. Firmware update lock → 403
5. Method allowed → 405
6. Media type → 415
7. Body format → 400
8. PATCH: ETag header → 428/412
9. Property validation → 400

---

## 5. OEM: Inventory refresh (AMI/MSI-specific)

### 5.1 InventoryCrc: force component re-inventory

`GET/POST/DELETE /redfish/v1/Systems/{sys}/Oem/Ami/Inventory/Crc`

The `GroupCrcList` field lists the current CRC checksum of each component group. When a group's
CRC changes (the host sends new inventory) or is explicitly zeroed via POST, the BMC discards
its cached inventory for that group and repopulates it from the next host inventory push.

**CRC groups:**

| Group | Covers |
|-------|--------|
| `CPU` | Processors, ProcessorMetrics |
| `DIMM` | Memory, MemoryDomains, MemoryChunks, MemoryMetrics |
| `PCIE` | Storage, PCIeDevices, NetworkInterfaces, NetworkAdapters |
| `CERTIFICATES` | Boot Certificates |
| `SECURBOOT` | SecureBoot data |

**POST: invalidate selected groups (force re-inventory):**

```
POST /redfish/v1/Systems/{sys}/Oem/Ami/Inventory/Crc
Content-Type: application/json

{
  "GroupCrcList": [
    { "CPU": 0 },
    { "DIMM": 0 },
    { "PCIE": 0 }
  ]
}
```

Setting a group's value to `0` signals the BMC to invalidate and repopulate that group on the
next host inventory push (typically at the next boot or host-interface inventory cycle).

**DELETE:** remove all CRC records entirely.

**Note:** Inventory data is populated by the host via the Redfish Host Interface (in-band),
not by the BMC itself. Zeroing a CRC group does not immediately re-read hardware; it marks
the group as stale so the next host-side inventory push will be accepted. A cold reboot is the
most reliable trigger.

### 5.2 InventoryData Status: monitor inventory processing

`GET /redfish/v1/Oem/Ami/InventoryData/Status`

Available only after the host has posted an inventory file. Shows the current processing state.

**Status enum:**

| Value | Meaning |
|-------|---------|
| `BootInProgress` | Host is booting |
| `Queued` | Processing task queued |
| `In-Progress` | Processing running in background |
| `Ready` / `Completed` | Processing finished successfully |
| `Failed` | Processing failed |

Response also includes:
- `InventoryData.DeletedModules`: array of groups updated in this population cycle
- `InventoryData.Messages`: warnings/errors encountered during processing
- `ProcessingTime`: milliseconds taken
- `LastModifiedTime`: ISO 8601 timestamp of last successful update
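
When polling this endpoint, the enum values above fall into three groups: still running, finished, and failed. The sketch below shows one way to classify them and to parse `LastModifiedTime`; the names are hypothetical, and it assumes the timestamp is emitted in the RFC 3339 profile of ISO 8601, which the source does not confirm.

```go
package main

import "time"

// inventoryState groups the Status enum values from the table above.
type inventoryState int

const (
	statePending inventoryState = iota // BootInProgress, Queued, In-Progress
	stateDone                          // Ready, Completed
	stateFailed                        // Failed
	stateUnknown                       // anything not listed in the table
)

// classifyInventoryStatus maps a raw Status string to an inventoryState.
func classifyInventoryStatus(status string) inventoryState {
	switch status {
	case "BootInProgress", "Queued", "In-Progress":
		return statePending
	case "Ready", "Completed":
		return stateDone
	case "Failed":
		return stateFailed
	default:
		return stateUnknown
	}
}

// parseLastModified parses a LastModifiedTime value, assuming RFC 3339.
func parseLastModified(v string) (time.Time, error) {
	return time.Parse(time.RFC3339, v)
}
```

A poller would keep waiting while the state is `statePending` and read `LastModifiedTime` only once it reaches `stateDone`.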

### 5.3 Systems OEM properties: Inventory reference

`GET /redfish/v1/Systems/{sys}` → `Oem.Ami` contains:

| Property | Notes |
|----------|-------|
| `Inventory` | Reference to InventoryCrc URI + current GroupCrc data |
| `RedfishVersion` | BIOS Redfish version (populated via Host Interface) |
| `RtpVersion` | BIOS RTP version (populated via Host Interface) |
| `ManagerBootConfiguration.ManagerBootMode` | PATCH to trigger soft reset: `SoftReset` / `ResetTimeout` / `None` |

---

## 6. OEM: Component state actions

### 6.1 Memory enable/disable

```
POST /redfish/v1/Systems/{sys}/Memory/{mem}/Actions/AmiBios.ChangeState
Content-Type: application/json

{ "State": "Disabled" }
```

Response: 204.

### 6.2 PCIeFunction enable/disable

```
POST /redfish/v1/Chassis/{ch}/PCIeDevices/{dev}/PCIeFunctions/{fn}/Actions/AmiBios.ChangeState
Content-Type: application/json

{ "State": "Disabled" }
```

Response: 204.

---

## 7. OEM: Storage sensor readings

`GET /redfish/v1/Systems/{sys}/Storage/{stor}` → `Oem.Ami.StorageControllerSensors`

Array of sensor objects per storage controller instance. Each entry exposes:
- `Reading` (Number): current sensor value
- `ReadingType` (String): type of reading
- `ReadingUnit` (String): unit

---

## 8. OEM: Power and Thermal OwnerLUN

Both `GET /redfish/v1/Chassis/{ch}/Power` and `GET /redfish/v1/Chassis/{ch}/Thermal` expose
`Oem.Ami.OwnerLUN` (Number, read-only), the IPMI LUN associated with each
temperature/fan/voltage sensor entry. Useful for correlating Redfish sensor readings with IPMI
SDR records.

---

## 9. UpdateService

`GET /redfish/v1/UpdateService` → `Oem.Ami.BMC.DualImageConfiguration`:

| Property | Description |
|----------|-------------|
| `ActiveImage` | Currently active BMC image slot |
| `BootImage` | Image slot BMC boots from |
| `FirmwareImage1Name` / `FirmwareImage1Version` | First image slot name + version |
| `FirmwareImage2Name` / `FirmwareImage2Version` | Second image slot name + version |

Standard `SimpleUpdate` action available at `/redfish/v1/UpdateService/Actions/UpdateService.SimpleUpdate`.

---

## 10. Inventory refresh summary

| Approach | Trigger | Latency | Scope |
|----------|---------|---------|-------|
| Host reboot | Physical/soft reset | Minutes | All groups |
| `POST InventoryCrc` (groups = 0) | Explicit API call | Next host inventory push | Selected groups |
| Firmware update (`SimpleUpdate`) | Explicit API call | Minutes + reboot | Full platform |
| Sensor/telemetry reads | Always live on GET | Immediate | Sensors only |

**Key constraint:** `InventoryCrc POST` marks groups stale but does not re-read hardware
directly. Actual inventory data flows from the host to the BMC via the Redfish Host Interface
in-band channel, typically during power-on self-test and boot. For an immediate inventory
refresh without a full reboot, a soft reset via a `ManagerBootMode: SoftReset` PATCH may be
sufficient on some configurations.
@@ -311,6 +311,8 @@ func (c *RedfishConnector) Collect(ctx context.Context, req Request, emit Progre
 	if emit != nil {
 		emit(Progress{Status: "running", Progress: 99, Message: "Redfish: анализ raw snapshot..."})
 	}
+	// Collect hardware event logs separately (not part of tree-walk to avoid bloat).
+	rawLogEntries := c.collectRedfishLogEntries(withRedfishTelemetryPhase(ctx, "log_entries"), snapshotClient, req, baseURL, systemPaths, managerPaths)
 	rawPayloads := map[string]any{
 		"redfish_tree": rawTree,
 		"redfish_profiles": map[string]any{
@@ -413,12 +415,21 @@ func (c *RedfishConnector) Collect(ctx context.Context, req Request, emit Progre
 	if len(fetchErrMap) > 0 {
 		rawPayloads["redfish_fetch_errors"] = redfishFetchErrorMapToList(fetchErrMap)
 	}
+	if len(rawLogEntries) > 0 {
+		rawPayloads["redfish_log_entries"] = rawLogEntries
+	}
 	// Unified tunnel: live collection and raw import go through the same analyzer over redfish_tree.
 	result, err := ReplayRedfishFromRawPayloads(rawPayloads, nil)
 	if err != nil {
 		return nil, err
 	}
 	totalElapsed := time.Since(collectStart).Round(time.Second)
+	if !result.InventoryLastModifiedAt.IsZero() {
+		log.Printf("redfish-collect: inventory last modified at %s (age: %s)",
+			result.InventoryLastModifiedAt.Format(time.RFC3339),
+			time.Since(result.InventoryLastModifiedAt).Round(time.Minute),
+		)
+	}
 	log.Printf(
 		"redfish-postprobe-metrics: nvme_candidates=%d nvme_selected=%d nvme_added=%d candidates=%d selected=%d skipped_explicit=%d added=%d dur=%s",
 		postProbeMetrics.NVMECandidates,
@@ -495,6 +506,11 @@ func (c *RedfishConnector) ensureHostPowerForCollection(ctx context.Context, cli
 		return false, false
 	}
 
+	// Invalidate all inventory CRC groups before powering on so the BMC accepts
+	// fresh inventory from the host after boot. Best-effort: failure is logged but
+	// does not block power-on.
+	c.invalidateRedfishInventory(ctx, client, req, baseURL, systemPath, emit)
+
 	resetTarget := redfishResetActionTarget(systemDoc)
 	resetType := redfishPickResetType(systemDoc, "On", "ForceOn")
 	if resetTarget == "" || resetType == "" {
@@ -602,6 +618,32 @@ func (c *RedfishConnector) restoreHostPowerAfterCollection(ctx context.Context,
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// invalidateRedfishInventory POSTs to the AMI/MSI InventoryCrc endpoint to zero out
|
||||||
|
// all known CRC groups before a host power-on. This causes the BMC to accept fresh
|
||||||
|
// inventory from the host after boot, preventing stale inventory (ghost GPUs, wrong
|
||||||
|
// BIOS version, etc.) from persisting across hardware changes.
|
||||||
|
// Best-effort: any error is logged and the call silently returns.
|
||||||
|
func (c *RedfishConnector) invalidateRedfishInventory(ctx context.Context, client *http.Client, req Request, baseURL, systemPath string, emit ProgressFn) {
|
||||||
|
crcPath := joinPath(systemPath, "/Oem/Ami/Inventory/Crc")
|
||||||
|
body := map[string]any{
|
||||||
|
"GroupCrcList": []map[string]any{
|
||||||
|
{"CPU": 0},
|
||||||
|
{"DIMM": 0},
|
||||||
|
{"PCIE": 0},
|
||||||
|
{"CERTIFICATES": 0},
|
||||||
|
{"SECUREBOOT": 0},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
if err := c.postJSON(ctx, client, req, baseURL, crcPath, body); err != nil {
|
||||||
|
log.Printf("redfish: inventory invalidation skipped (not AMI/MSI or endpoint unavailable): %v", err)
|
||||||
|
return
|
||||||
|
}
|
||||||
|
log.Printf("redfish: inventory CRC groups invalidated at %s before host power-on", crcPath)
|
||||||
|
if emit != nil {
|
||||||
|
emit(Progress{Status: "running", Progress: 19, Message: "Redfish: BMC inventory invalidated before host power-on (all CRC groups reset)"})
|
||||||
|
}
|
||||||
|
}
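For orientation, the request body built in `invalidateRedfishInventory` serializes to a small JSON document. The sketch below is a standalone illustration using only `encoding/json`; `crcInvalidationJSON` is a hypothetical helper name that does not exist in this commit, and the group names are copied from the diff above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// crcInvalidationJSON is a hypothetical helper (not part of the commit).
// It mirrors the body literal above: one single-key object per CRC group,
// each set to 0, wrapped in "GroupCrcList".
func crcInvalidationJSON(groups []string) (string, error) {
	list := make([]map[string]any, 0, len(groups))
	for _, g := range groups {
		list = append(list, map[string]any{g: 0})
	}
	raw, err := json.Marshal(map[string]any{"GroupCrcList": list})
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	payload, err := crcInvalidationJSON([]string{"CPU", "DIMM", "PCIE", "CERTIFICATES", "SECUREBOOT"})
	if err != nil {
		panic(err)
	}
	fmt.Println(payload)
}
```

Each group is a separate single-key object, so the order of groups in the list is preserved by the slice rather than depending on map key ordering.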
|
||||||
|
|
||||||
func (c *RedfishConnector) waitForHostPowerState(ctx context.Context, client *http.Client, req Request, baseURL, systemPath string, wantOn bool, timeout time.Duration) bool {
|
func (c *RedfishConnector) waitForHostPowerState(ctx context.Context, client *http.Client, req Request, baseURL, systemPath string, wantOn bool, timeout time.Duration) bool {
|
||||||
deadline := time.Now().Add(timeout)
|
deadline := time.Now().Add(timeout)
|
||||||
for {
|
for {
|
||||||
@@ -2627,6 +2669,7 @@ func shouldCrawlPath(path string) bool {
|
|||||||
"/Bios/Settings",
|
"/Bios/Settings",
|
||||||
"/GetServerAllUSBStatus",
|
"/GetServerAllUSBStatus",
|
||||||
"/Oem/Public/KVM",
|
"/Oem/Public/KVM",
|
||||||
|
"/SecureBoot/SecureBootDatabases",
|
||||||
} {
|
} {
|
||||||
if strings.Contains(normalized, part) {
|
if strings.Contains(normalized, part) {
|
||||||
return false
|
return false
|
||||||
@@ -5548,7 +5591,7 @@ func storageControllerFromPath(path string) string {
|
|||||||
return ""
|
return ""
|
||||||
}
|
}
|
||||||
|
|
||||||
func parseFirmware(system, bios, manager, secureBoot, networkProtocol map[string]interface{}) []models.FirmwareInfo {
|
func parseFirmware(system, bios, manager, networkProtocol map[string]interface{}) []models.FirmwareInfo {
|
||||||
var out []models.FirmwareInfo
|
var out []models.FirmwareInfo
|
||||||
|
|
||||||
appendFW := func(name, version string) {
|
appendFW := func(name, version string) {
|
||||||
@@ -5562,7 +5605,6 @@ func parseFirmware(system, bios, manager, secureBoot, networkProtocol map[string
|
|||||||
appendFW("BIOS", asString(system["BiosVersion"]))
|
appendFW("BIOS", asString(system["BiosVersion"]))
|
||||||
appendFW("BIOS", asString(bios["Version"]))
|
appendFW("BIOS", asString(bios["Version"]))
|
||||||
appendFW("BMC", asString(manager["FirmwareVersion"]))
|
appendFW("BMC", asString(manager["FirmwareVersion"]))
|
||||||
appendFW("SecureBoot", asString(secureBoot["SecureBootMode"]))
|
|
||||||
|
|
||||||
return out
|
return out
|
||||||
}
|
}
|
||||||
|
|||||||
392
internal/collector/redfish_logentries.go
Normal file
@@ -0,0 +1,392 @@
|
|||||||
|
package collector
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"log"
|
||||||
|
"net/http"
|
||||||
|
"strings"
|
||||||
|
"time"
|
||||||
|
|
||||||
|
"git.mchus.pro/mchus/logpile/internal/models"
|
||||||
|
)
|
||||||
|
|
||||||
|
const (
|
||||||
|
redfishLogEntriesWindow = 7 * 24 * time.Hour
|
||||||
|
redfishLogEntriesMaxTotal = 500
|
||||||
|
redfishLogEntriesMaxPerSvc = 200
|
||||||
|
)
|
||||||
|
|
||||||
|
// collectRedfishLogEntries fetches hardware event log entries from Systems and Managers LogServices.
|
||||||
|
// Only hardware-relevant entries from the last 7 days are returned.
|
||||||
|
// For Systems: all log services except those whose Id or Name contains audit/journal/bmc/security/manager/debug.
|
||||||
|
// For Managers: only the IPMI SEL service (Id="SEL") — audit and event logs are excluded.
|
||||||
|
func (c *RedfishConnector) collectRedfishLogEntries(ctx context.Context, client *http.Client, req Request, baseURL string, systemPaths, managerPaths []string) []map[string]interface{} {
|
||||||
|
cutoff := time.Now().UTC().Add(-redfishLogEntriesWindow)
|
||||||
|
seen := make(map[string]struct{})
|
||||||
|
var out []map[string]interface{}
|
||||||
|
|
||||||
|
collectFrom := func(logServicesPath string, filter func(map[string]interface{}) bool) {
|
||||||
|
if len(out) >= redfishLogEntriesMaxTotal {
|
||||||
|
return
|
||||||
|
}
|
||||||
|
services, err := c.getCollectionMembers(ctx, client, req, baseURL, logServicesPath)
|
||||||
|
if err != nil || len(services) == 0 {
|
||||||
|
return
|
||||||
|
}
|
||||||
|
for _, svc := range services {
|
||||||
|
if len(out) >= redfishLogEntriesMaxTotal {
|
||||||
|
break
|
||||||
|
}
|
||||||
|
if !filter(svc) {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
entriesPath := redfishLogServiceEntriesPath(svc)
|
||||||
|
if entriesPath == "" {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
entries := c.fetchRedfishLogEntriesWithPaging(ctx, client, req, baseURL, entriesPath, cutoff, seen, redfishLogEntriesMaxPerSvc)
|
||||||
|
out = append(out, entries...)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
for _, systemPath := range systemPaths {
|
||||||
|
collectFrom(joinPath(systemPath, "/LogServices"), isHardwareLogService)
|
||||||
|
}
|
||||||
|
// Managers hold the IPMI SEL on AMI/MSI BMCs — include only the "SEL" service.
|
||||||
|
for _, managerPath := range managerPaths {
|
||||||
|
collectFrom(joinPath(managerPath, "/LogServices"), isManagerSELService)
|
||||||
|
}
|
||||||
|
|
||||||
|
if len(out) > 0 {
|
||||||
|
log.Printf("redfish: collected %d hardware log entries (Systems+Managers SEL, window=7d)", len(out))
|
||||||
|
}
|
||||||
|
return out
|
||||||
|
}
|
||||||
|
|
||||||
|
// fetchRedfishLogEntriesWithPaging fetches entries from a LogEntry collection,
|
||||||
|
// following nextLink pages. Stops early when entries older than cutoff are encountered
|
||||||
|
// (assumes BMC returns entries newest-first, which is typical).
|
||||||
|
func (c *RedfishConnector) fetchRedfishLogEntriesWithPaging(ctx context.Context, client *http.Client, req Request, baseURL, entriesPath string, cutoff time.Time, seen map[string]struct{}, limit int) []map[string]interface{} {
|
||||||
|
var out []map[string]interface{}
|
||||||
|
nextPath := entriesPath
|
||||||
|
|
||||||
|
for nextPath != "" && len(out) < limit {
|
||||||
|
collection, err := c.getJSON(ctx, client, req, baseURL, nextPath)
|
||||||
|
if err != nil {
|
||||||
|
break
|
||||||
|
}
|
||||||
|
|
||||||
|
// Handle both linked members (@odata.id only) and inline members (full objects).
|
||||||
|
rawMembers, _ := collection["Members"].([]interface{})
|
||||||
|
hitOldEntry := false
|
||||||
|
|
||||||
|
for _, rawMember := range rawMembers {
|
||||||
|
if len(out) >= limit {
|
||||||
|
break
|
||||||
|
}
|
||||||
|
memberMap, ok := rawMember.(map[string]interface{})
|
||||||
|
if !ok {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
|
||||||
|
var entry map[string]interface{}
|
||||||
|
if _, hasCreated := memberMap["Created"]; hasCreated {
|
||||||
|
// Inline entry — use directly.
|
||||||
|
entry = memberMap
|
||||||
|
} else {
|
||||||
|
// Linked entry — fetch by path.
|
||||||
|
memberPath := normalizeRedfishPath(asString(memberMap["@odata.id"]))
|
||||||
|
if memberPath == "" {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
entry, err = c.getJSON(ctx, client, req, baseURL, memberPath)
|
||||||
|
if err != nil || len(entry) == 0 {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Dedup by entry Id or path.
|
||||||
|
entryKey := asString(entry["Id"])
|
||||||
|
if entryKey == "" {
|
||||||
|
entryKey = asString(entry["@odata.id"])
|
||||||
|
}
|
||||||
|
if entryKey != "" {
|
||||||
|
if _, dup := seen[entryKey]; dup {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
seen[entryKey] = struct{}{}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Time filter.
|
||||||
|
created := parseRedfishEntryTime(asString(entry["Created"]))
|
||||||
|
if !created.IsZero() && created.Before(cutoff) {
|
||||||
|
hitOldEntry = true
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
|
||||||
|
// Hardware relevance filter.
|
||||||
|
if !isHardwareLogEntry(entry) {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
|
||||||
|
out = append(out, entry)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Stop paging once we've seen entries older than the window.
|
||||||
|
if hitOldEntry {
|
||||||
|
break
|
||||||
|
}
|
||||||
|
nextPath = firstNonEmpty(
|
||||||
|
normalizeRedfishPath(asString(collection["Members@odata.nextLink"])),
|
||||||
|
normalizeRedfishPath(asString(collection["@odata.nextLink"])),
|
||||||
|
)
|
||||||
|
}
|
||||||
|
return out
|
||||||
|
}
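The early-stop paging in `fetchRedfishLogEntriesWithPaging` can be tried in isolation. The following is a minimal sketch, assuming in-memory pages instead of HTTP calls; `logEntry`, `page`, `collectRecent`, and `demoPages` are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"time"
)

// logEntry and page are simplified stand-ins for Redfish LogEntry documents
// and collection pages (assumptions for this sketch only).
type logEntry struct {
	ID      string
	Created time.Time
}

type page struct {
	Members []logEntry
	Next    string // key of the next page, "" when there is none
}

// collectRecent walks pages from start, deduplicates by ID, keeps entries that
// are not older than cutoff, and stops following Next once the current page
// contained an older entry or limit entries were collected.
func collectRecent(pages map[string]page, start string, cutoff time.Time, limit int) []logEntry {
	var out []logEntry
	seen := make(map[string]struct{})
	next := start
	for next != "" && len(out) < limit {
		p, ok := pages[next]
		if !ok {
			break
		}
		hitOld := false
		for _, e := range p.Members {
			if len(out) >= limit {
				break
			}
			if _, dup := seen[e.ID]; dup {
				continue
			}
			seen[e.ID] = struct{}{}
			// Entries without a timestamp are kept, as in the diff.
			if !e.Created.IsZero() && e.Created.Before(cutoff) {
				hitOld = true
				continue
			}
			out = append(out, e)
		}
		if hitOld {
			break
		}
		next = p.Next
	}
	return out
}

// demoPages builds three newest-first pages; only the first two are needed.
func demoPages() (map[string]page, time.Time) {
	now := time.Date(2026, 3, 18, 12, 0, 0, 0, time.UTC)
	cutoff := now.Add(-7 * 24 * time.Hour)
	pages := map[string]page{
		"p1": {Members: []logEntry{
			{ID: "3", Created: now.Add(-time.Hour)},
			{ID: "2", Created: now.Add(-48 * time.Hour)},
		}, Next: "p2"},
		"p2": {Members: []logEntry{{ID: "1", Created: now.Add(-10 * 24 * time.Hour)}}, Next: "p3"},
		"p3": {Members: []logEntry{{ID: "0", Created: now.Add(-time.Hour)}}},
	}
	return pages, cutoff
}

func main() {
	pages, cutoff := demoPages()
	for _, e := range collectRecent(pages, "p1", cutoff, 200) {
		fmt.Println(e.ID, e.Created.Format(time.RFC3339))
	}
}
```

In this sample the third page is never read, because the second page already contains an entry older than the cutoff. The same behavior means that a BMC returning entries oldest-first would have its paging cut short after the first page, which is the trade-off the newest-first assumption accepts.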
|
||||||
|
|
||||||
|
// isManagerSELService returns true only for the IPMI SEL exposed under Managers.
|
||||||
|
// On AMI/MSI BMCs the hardware SEL lives at Managers/{mgr}/LogServices/SEL.
|
||||||
|
// All other Manager log services (AuditLog, EventLog, Journal) are excluded.
|
||||||
|
func isManagerSELService(svc map[string]interface{}) bool {
|
||||||
|
id := strings.ToLower(strings.TrimSpace(asString(svc["Id"])))
|
||||||
|
return id == "sel"
|
||||||
|
}
|
||||||
|
|
||||||
|
// isHardwareLogService returns true if the log service looks like a hardware event log
|
||||||
|
// (SEL, System Event Log) rather than a BMC audit/journal log.
|
||||||
|
func isHardwareLogService(svc map[string]interface{}) bool {
|
||||||
|
id := strings.ToLower(strings.TrimSpace(asString(svc["Id"])))
|
||||||
|
name := strings.ToLower(strings.TrimSpace(asString(svc["Name"])))
|
||||||
|
for _, skip := range []string{"audit", "journal", "bmc", "security", "manager", "debug"} {
|
||||||
|
if strings.Contains(id, skip) || strings.Contains(name, skip) {
|
||||||
|
return false
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return true
|
||||||
|
}
|
||||||
|
|
||||||
|
// redfishLogServiceEntriesPath returns the Entries collection path for a LogService document.
|
||||||
|
func redfishLogServiceEntriesPath(svc map[string]interface{}) string {
|
||||||
|
if entriesLink, ok := svc["Entries"].(map[string]interface{}); ok {
|
||||||
|
if p := normalizeRedfishPath(asString(entriesLink["@odata.id"])); p != "" {
|
||||||
|
return p
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if id := normalizeRedfishPath(asString(svc["@odata.id"])); id != "" {
|
||||||
|
return joinPath(id, "/Entries")
|
||||||
|
}
|
||||||
|
return ""
|
||||||
|
}
|
||||||
|
|
||||||
|
// isHardwareLogEntry returns true if the log entry is hardware-related.
|
||||||
|
// Oem-type entries and audit, authentication, and session events are excluded.
|
||||||
|
func isHardwareLogEntry(entry map[string]interface{}) bool {
|
||||||
|
entryType := strings.TrimSpace(asString(entry["EntryType"]))
|
||||||
|
if strings.EqualFold(entryType, "Oem") {
|
||||||
|
return false
|
||||||
|
}
|
||||||
|
|
||||||
|
msgID := strings.ToLower(strings.TrimSpace(asString(entry["MessageId"])))
|
||||||
|
for _, skip := range []string{
|
||||||
|
"user", "account", "password", "login", "logon", "session",
|
||||||
|
"auth", "certificate", "security", "credential", "privilege",
|
||||||
|
} {
|
||||||
|
if strings.Contains(msgID, skip) {
|
||||||
|
return false
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// Also check the human-readable message for obvious audit patterns.
|
||||||
|
msg := strings.ToLower(strings.TrimSpace(asString(entry["Message"])))
|
||||||
|
for _, skip := range []string{"logged in", "logged out", "log in", "log out", "sign in", "signed in"} {
|
||||||
|
if strings.Contains(msg, skip) {
|
||||||
|
return false
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return true
|
||||||
|
}
|
||||||
|
|
||||||
|
// parseRedfishEntryTime parses a Redfish LogEntry Created timestamp (ISO 8601 / RFC 3339).
|
||||||
|
func parseRedfishEntryTime(raw string) time.Time {
|
||||||
|
raw = strings.TrimSpace(raw)
|
||||||
|
if raw == "" {
|
||||||
|
return time.Time{}
|
||||||
|
}
|
||||||
|
for _, layout := range []string{time.RFC3339, time.RFC3339Nano} {
|
||||||
|
if t, err := time.Parse(layout, raw); err == nil {
|
||||||
|
return t.UTC()
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return time.Time{}
|
||||||
|
}
|
||||||
|
|
||||||
|
// parseRedfishLogEntries converts raw log entries stored in RawPayloads into models.Event slice.
|
||||||
|
// Called during Redfish replay for both live and offline (archive) collections.
|
||||||
|
func parseRedfishLogEntries(rawPayloads map[string]any, collectedAt time.Time) []models.Event {
|
||||||
|
raw, ok := rawPayloads["redfish_log_entries"]
|
||||||
|
if !ok {
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
var entries []map[string]interface{}
|
||||||
|
switch v := raw.(type) {
|
||||||
|
case []map[string]interface{}:
|
||||||
|
entries = v
|
||||||
|
case []interface{}:
|
||||||
|
for _, item := range v {
|
||||||
|
if m, ok := item.(map[string]interface{}); ok {
|
||||||
|
entries = append(entries, m)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
default:
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
if len(entries) == 0 {
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
out := make([]models.Event, 0, len(entries))
|
||||||
|
for _, entry := range entries {
|
||||||
|
ev := redfishLogEntryToEvent(entry, collectedAt)
|
||||||
|
if ev == nil {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
out = append(out, *ev)
|
||||||
|
}
|
||||||
|
return out
|
||||||
|
}
|
||||||
|
|
||||||
|
// redfishLogEntryToEvent converts a single Redfish LogEntry document to models.Event.
|
||||||
|
func redfishLogEntryToEvent(entry map[string]interface{}, collectedAt time.Time) *models.Event {
|
||||||
|
// Prefer EventTimestamp (actual hardware event time) over Created (Redfish record creation time).
|
||||||
|
ts := parseRedfishEntryTime(asString(entry["EventTimestamp"]))
|
||||||
|
if ts.IsZero() {
|
||||||
|
ts = parseRedfishEntryTime(asString(entry["Created"]))
|
||||||
|
}
|
||||||
|
if ts.IsZero() {
|
||||||
|
ts = collectedAt
|
||||||
|
}
|
||||||
|
|
||||||
|
severity := redfishLogEntrySeverity(entry)
|
||||||
|
sensorType := strings.TrimSpace(asString(entry["SensorType"]))
|
||||||
|
messageID := strings.TrimSpace(asString(entry["MessageId"]))
|
||||||
|
entryType := strings.TrimSpace(asString(entry["EntryType"]))
|
||||||
|
entryCode := strings.TrimSpace(asString(entry["EntryCode"]))
|
||||||
|
|
||||||
|
// SensorName: prefer "Name", fall back to "SensorNumber" + SensorType.
|
||||||
|
sensorName := strings.TrimSpace(asString(entry["Name"]))
|
||||||
|
if sensorName == "" {
|
||||||
|
num := strings.TrimSpace(asString(entry["SensorNumber"]))
|
||||||
|
if num != "" && sensorType != "" {
|
||||||
|
sensorName = sensorType + " " + num
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
rawMessage := strings.TrimSpace(asString(entry["Message"]))
|
||||||
|
|
||||||
|
// AMI/MSI BMCs dump raw IPMI record fields into Message instead of human-readable text.
|
||||||
|
// Detect this and build a readable description from structured fields instead.
|
||||||
|
description, rawData := redfishDecodeMessage(rawMessage, sensorType, entryCode, entry)
|
||||||
|
if description == "" {
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
return &models.Event{
|
||||||
|
ID: messageID,
|
||||||
|
Timestamp: ts,
|
||||||
|
Source: "redfish",
|
||||||
|
SensorType: sensorType,
|
||||||
|
SensorName: sensorName,
|
||||||
|
EventType: entryType,
|
||||||
|
Severity: severity,
|
||||||
|
Description: description,
|
||||||
|
RawData: rawData,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// redfishDecodeMessage returns a human-readable description and optional raw data.
|
||||||
|
// AMI/MSI BMCs dump raw IPMI record fields into Message as "Key : Value, Key : Value, ..."
|
||||||
|
// instead of a plain human-readable string. We extract the useful decoded fields from it.
|
||||||
|
func redfishDecodeMessage(message, sensorType, entryCode string, entry map[string]interface{}) (description, rawData string) {
|
||||||
|
if !isRawIPMIDump(message) {
|
||||||
|
description = message
|
||||||
|
return
|
||||||
|
}
|
||||||
|
|
||||||
|
rawData = message
|
||||||
|
kv := parseIPMIDumpKV(message)
|
||||||
|
|
||||||
|
// Sensor_Type inside the dump is more specific than the top-level SensorType field.
|
||||||
|
if v := kv["Sensor_Type"]; v != "" {
|
||||||
|
sensorType = v
|
||||||
|
}
|
||||||
|
eventType := kv["Event_Type"] // human-readable IPMI event type, e.g. "Legacy OFF State"
|
||||||
|
|
||||||
|
var parts []string
|
||||||
|
if sensorType != "" {
|
||||||
|
parts = append(parts, sensorType)
|
||||||
|
}
|
||||||
|
if eventType != "" {
|
||||||
|
parts = append(parts, eventType)
|
||||||
|
} else if entryCode != "" {
|
||||||
|
parts = append(parts, entryCode)
|
||||||
|
}
|
||||||
|
description = strings.Join(parts, ": ")
|
||||||
|
return
|
||||||
|
}
|
||||||
|
|
||||||
|
// isRawIPMIDump returns true if the message is an AMI raw IPMI record dump.
|
||||||
|
func isRawIPMIDump(message string) bool {
|
||||||
|
return strings.Contains(message, "Event_Data_1 :") && strings.Contains(message, "Record_Type :")
|
||||||
|
}
|
||||||
|
|
||||||
|
// parseIPMIDumpKV parses the AMI "Key : Value, Key : Value, " format into a map.
|
||||||
|
func parseIPMIDumpKV(message string) map[string]string {
|
||||||
|
out := make(map[string]string)
|
||||||
|
for _, part := range strings.Split(message, ",") {
|
||||||
|
part = strings.TrimSpace(part)
|
||||||
|
idx := strings.Index(part, " : ")
|
||||||
|
if idx < 0 {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
k := strings.TrimSpace(part[:idx])
|
||||||
|
v := strings.TrimSpace(part[idx+3:])
|
||||||
|
if k != "" && v != "" {
|
||||||
|
out[k] = v
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return out
|
||||||
|
}
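To see what the dump decoding produces, the key/value parsing and the `Sensor_Type: Event_Type` description can be run standalone. This is a sketch under stated assumptions: the sample message is hypothetical (real AMI dumps carry more fields), and `parseDumpKV` and `describe` are illustration-only names that mirror `parseIPMIDumpKV` and the join step in `redfishDecodeMessage`.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDumpKV splits "Key : Value, Key : Value, " text into a map,
// following the same steps as parseIPMIDumpKV in the diff.
func parseDumpKV(message string) map[string]string {
	out := make(map[string]string)
	for _, part := range strings.Split(message, ",") {
		part = strings.TrimSpace(part)
		idx := strings.Index(part, " : ")
		if idx < 0 {
			continue
		}
		k := strings.TrimSpace(part[:idx])
		v := strings.TrimSpace(part[idx+3:])
		if k != "" && v != "" {
			out[k] = v
		}
	}
	return out
}

// describe joins Sensor_Type and Event_Type with ": ", skipping empty parts.
func describe(kv map[string]string) string {
	var parts []string
	for _, key := range []string{"Sensor_Type", "Event_Type"} {
		if v := kv[key]; v != "" {
			parts = append(parts, v)
		}
	}
	return strings.Join(parts, ": ")
}

func main() {
	// Hypothetical sample message, not captured from a real BMC.
	msg := "Record_Type : System Event, Sensor_Type : Power Unit, Event_Type : Legacy OFF State, Event_Data_1 : 0x01, "
	fmt.Println(describe(parseDumpKV(msg)))
}
```

Because the message is split on every comma, a value that itself contains a comma is truncated at that comma; the remainder has no ` : ` separator and is skipped.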
|
||||||
|
|
||||||
|
// redfishLogEntrySeverity maps a Redfish LogEntry to models.Severity.
|
||||||
|
// AMI/MSI BMCs often set Severity="OK" on all SEL records regardless of content,
|
||||||
|
// so we fall back to inferring severity from SensorType when the explicit field is unhelpful.
|
||||||
|
func redfishLogEntrySeverity(entry map[string]interface{}) models.Severity {
|
||||||
|
// Newer Redfish uses MessageSeverity; older uses Severity.
|
||||||
|
raw := strings.ToLower(firstNonEmpty(
|
||||||
|
strings.TrimSpace(asString(entry["MessageSeverity"])),
|
||||||
|
strings.TrimSpace(asString(entry["Severity"])),
|
||||||
|
))
|
||||||
|
switch raw {
|
||||||
|
case "critical":
|
||||||
|
return models.SeverityCritical
|
||||||
|
case "warning":
|
||||||
|
return models.SeverityWarning
|
||||||
|
case "ok", "informational", "":
|
||||||
|
// BMC didn't set a meaningful severity — infer from SensorType.
|
||||||
|
return redfishSeverityFromSensorType(strings.TrimSpace(asString(entry["SensorType"])))
|
||||||
|
default:
|
||||||
|
return models.SeverityInfo
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// redfishSeverityFromSensorType infers event severity from the IPMI/Redfish SensorType string.
|
||||||
|
func redfishSeverityFromSensorType(sensorType string) models.Severity {
|
||||||
|
switch strings.ToLower(sensorType) {
|
||||||
|
case "critical interrupt", "processor", "memory", "power unit",
|
||||||
|
"power supply", "drive slot", "system firmware progress":
|
||||||
|
return models.SeverityWarning
|
||||||
|
default:
|
||||||
|
return models.SeverityInfo
|
||||||
|
}
|
||||||
|
}
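The two-step severity mapping above (explicit field first, sensor type as fallback) is easy to check with a few inputs. The sketch below is a simplified stand-in: the `severity` type and its constants are hypothetical replacements for `models.Severity`, whose actual values are not shown in this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// severity and its constants are stand-ins for models.Severity (assumption).
type severity string

const (
	sevInfo     severity = "info"
	sevWarning  severity = "warning"
	sevCritical severity = "critical"
)

// fromSensorType mirrors redfishSeverityFromSensorType: a fixed set of sensor
// types is treated as warning-level, everything else as informational.
func fromSensorType(sensorType string) severity {
	switch strings.ToLower(strings.TrimSpace(sensorType)) {
	case "critical interrupt", "processor", "memory", "power unit",
		"power supply", "drive slot", "system firmware progress":
		return sevWarning
	default:
		return sevInfo
	}
}

// severityFor mirrors redfishLogEntrySeverity: explicit critical/warning wins,
// while "ok", "informational", or an empty value defers to the sensor type.
func severityFor(explicit, sensorType string) severity {
	switch strings.ToLower(strings.TrimSpace(explicit)) {
	case "critical":
		return sevCritical
	case "warning":
		return sevWarning
	case "ok", "informational", "":
		return fromSensorType(sensorType)
	default:
		return sevInfo
	}
}

func main() {
	fmt.Println(severityFor("OK", "Memory"))
	fmt.Println(severityFor("Critical", "Fan"))
	fmt.Println(severityFor("OK", "Fan"))
}
```

Note that the fallback never yields a critical severity; an entry is reported as critical only when the BMC says so explicitly.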
|
||||||
@@ -53,7 +53,7 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
|
|||||||
chassisDoc, _ := r.getJSON(primaryChassis)
|
chassisDoc, _ := r.getJSON(primaryChassis)
|
||||||
managerDoc, _ := r.getJSON(primaryManager)
|
managerDoc, _ := r.getJSON(primaryManager)
|
||||||
biosDoc, _ := r.getJSON(joinPath(primarySystem, "/Bios"))
|
biosDoc, _ := r.getJSON(joinPath(primarySystem, "/Bios"))
|
||||||
secureBootDoc, _ := r.getJSON(joinPath(primarySystem, "/SecureBoot"))
|
|
||||||
systemFRUDoc, _ := r.getJSON(joinPath(primarySystem, "/Oem/Public/FRU"))
|
systemFRUDoc, _ := r.getJSON(joinPath(primarySystem, "/Oem/Public/FRU"))
|
||||||
chassisFRUDoc, _ := r.getJSON(joinPath(primaryChassis, "/Oem/Public/FRU"))
|
chassisFRUDoc, _ := r.getJSON(joinPath(primaryChassis, "/Oem/Public/FRU"))
|
||||||
fruDoc := systemFRUDoc
|
fruDoc := systemFRUDoc
|
||||||
@@ -96,16 +96,19 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
|
|||||||
healthEvents := r.collectHealthSummaryEvents(chassisPaths)
|
healthEvents := r.collectHealthSummaryEvents(chassisPaths)
|
||||||
driveFetchWarningEvents := buildDriveFetchWarningEvents(rawPayloads)
|
driveFetchWarningEvents := buildDriveFetchWarningEvents(rawPayloads)
|
||||||
networkProtocolDoc, _ := r.getJSON(joinPath(primaryManager, "/NetworkProtocol"))
|
networkProtocolDoc, _ := r.getJSON(joinPath(primaryManager, "/NetworkProtocol"))
|
||||||
firmware := parseFirmware(systemDoc, biosDoc, managerDoc, secureBootDoc, networkProtocolDoc)
|
firmware := parseFirmware(systemDoc, biosDoc, managerDoc, networkProtocolDoc)
|
||||||
firmware = dedupeFirmwareInfo(append(firmware, r.collectFirmwareInventory()...))
|
firmware = dedupeFirmwareInfo(append(firmware, r.collectFirmwareInventory()...))
|
||||||
boardInfo.BMCMACAddress = r.collectBMCMAC(managerPaths)
|
boardInfo.BMCMACAddress = r.collectBMCMAC(managerPaths)
|
||||||
assemblyFRU := r.collectAssemblyFRU(chassisPaths)
|
assemblyFRU := r.collectAssemblyFRU(chassisPaths)
|
||||||
collectedAt, sourceTimezone := inferRedfishCollectionTime(managerDoc, rawPayloads)
|
collectedAt, sourceTimezone := inferRedfishCollectionTime(managerDoc, rawPayloads)
|
||||||
|
inventoryLastModifiedAt := inferInventoryLastModifiedTime(r.tree)
|
||||||
|
logEntryEvents := parseRedfishLogEntries(rawPayloads, collectedAt)
|
||||||
|
|
||||||
result := &models.AnalysisResult{
|
result := &models.AnalysisResult{
|
||||||
CollectedAt: collectedAt,
|
CollectedAt: collectedAt,
|
||||||
|
InventoryLastModifiedAt: inventoryLastModifiedAt,
|
||||||
SourceTimezone: sourceTimezone,
|
SourceTimezone: sourceTimezone,
|
||||||
Events: append(append(append(make([]models.Event, 0, len(discreteEvents)+len(healthEvents)+len(driveFetchWarningEvents)+1), healthEvents...), discreteEvents...), driveFetchWarningEvents...),
|
Events: append(append(append(append(make([]models.Event, 0, len(discreteEvents)+len(healthEvents)+len(driveFetchWarningEvents)+len(logEntryEvents)+1), healthEvents...), discreteEvents...), driveFetchWarningEvents...), logEntryEvents...),
|
||||||
FRU: assemblyFRU,
|
FRU: assemblyFRU,
|
||||||
Sensors: dedupeSensorReadings(append(append(thresholdSensors, thermalSensors...), powerSensors...)),
|
Sensors: dedupeSensorReadings(append(append(thresholdSensors, thermalSensors...), powerSensors...)),
|
||||||
RawPayloads: cloneRawPayloads(rawPayloads),
|
RawPayloads: cloneRawPayloads(rawPayloads),
|
||||||
@@ -183,6 +186,35 @@ func inferRedfishCollectionTime(managerDoc map[string]interface{}, rawPayloads m
|
|||||||
return time.Time{}, offset
|
return time.Time{}, offset
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// inferInventoryLastModifiedTime reads InventoryData/Status.InventoryData.LastModifiedTime
|
||||||
|
// from the Redfish snapshot. Returns zero time if not present or unparseable.
|
||||||
|
func inferInventoryLastModifiedTime(snapshot map[string]interface{}) time.Time {
|
||||||
|
docAny, ok := snapshot["/redfish/v1/Oem/Ami/InventoryData/Status"]
|
||||||
|
if !ok {
|
||||||
|
return time.Time{}
|
||||||
|
}
|
||||||
|
doc, ok := docAny.(map[string]interface{})
|
||||||
|
if !ok {
|
||||||
|
return time.Time{}
|
||||||
|
}
|
||||||
|
invData, ok := doc["InventoryData"].(map[string]interface{})
|
||||||
|
if !ok {
|
||||||
|
return time.Time{}
|
||||||
|
}
|
||||||
|
raw := strings.TrimSpace(asString(invData["LastModifiedTime"]))
|
||||||
|
if raw == "" {
|
||||||
|
return time.Time{}
|
||||||
|
}
|
||||||
|
for _, layout := range []string{time.RFC3339, time.RFC3339Nano} {
|
||||||
|
if ts, err := time.Parse(layout, raw); err == nil {
|
||||||
|
t := ts.UTC()
|
||||||
|
log.Printf("redfish replay: inventory last modified at %s (InventoryData/Status.LastModifiedTime)", t.Format(time.RFC3339))
|
||||||
|
return t
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return time.Time{}
|
||||||
|
}
|
||||||
|
|
||||||
func appendMissingServerModelWarning(result *models.AnalysisResult, systemDoc map[string]interface{}, systemFRUPath, chassisFRUPath string) {
|
func appendMissingServerModelWarning(result *models.AnalysisResult, systemDoc map[string]interface{}, systemFRUPath, chassisFRUPath string) {
|
||||||
if result == nil || result.Hardware == nil {
|
if result == nil || result.Hardware == nil {
|
||||||
return
|
return
|
||||||
|
|||||||
@@ -58,6 +58,44 @@ func (r redfishSnapshotReader) collectGPUs(systemPaths, chassisPaths []string, p
|
|||||||
return out
|
return out
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// msiGhostGPUFilter returns true when the GPU chassis for gpuID shows a temperature
|
||||||
|
// of 0 on a powered-on host, which is the reliable MSI/AMI signal that the GPU is
|
||||||
|
// no longer physically installed (stale BMC inventory cache).
|
||||||
|
// It only filters when the system PowerState is "On" — when the host is off, all
|
||||||
|
// temperature readings are 0 and we cannot distinguish absent from idle.
|
||||||
|
func (r redfishSnapshotReader) msiGhostGPUFilter(systemPaths []string, gpuID, chassisPath string) bool {
|
||||||
|
// Require host powered on.
|
||||||
|
for _, sp := range systemPaths {
|
||||||
|
doc, err := r.getJSON(sp)
|
||||||
|
if err != nil {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
if !strings.EqualFold(strings.TrimSpace(asString(doc["PowerState"])), "on") {
|
||||||
|
return false
|
||||||
|
}
|
||||||
|
break
|
||||||
|
}
|
||||||
|
// Read the temperature sensor for this GPU chassis.
|
||||||
|
sensorPath := joinPath(chassisPath, "/Sensors/"+gpuID+"_Temperature")
|
||||||
|
sensorDoc, err := r.getJSON(sensorPath)
|
||||||
|
if err != nil || len(sensorDoc) == 0 {
|
||||||
|
return false
|
||||||
|
}
|
||||||
|
reading, ok := sensorDoc["Reading"]
|
||||||
|
if !ok {
|
||||||
|
return false
|
||||||
|
}
|
||||||
|
switch v := reading.(type) {
|
||||||
|
case float64:
|
||||||
|
return v == 0
|
||||||
|
case int:
|
||||||
|
return v == 0
|
||||||
|
case int64:
|
||||||
|
return v == 0
|
||||||
|
}
|
||||||
|
return false
|
||||||
|
}
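A note on the type switch over `Reading`: when a document is decoded with `encoding/json` into `map[string]interface{}`, every JSON number arrives as `float64`, so the `int` and `int64` cases apply only if the snapshot is populated some other way (which this diff does not show). A minimal sketch of the zero check, with `readingIsZero` as a hypothetical helper name:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// readingIsZero reports whether a sensor document has a numeric "Reading"
// equal to zero. A missing, null, or non-numeric Reading yields false,
// so only an explicit 0 is treated as the ghost signal.
func readingIsZero(rawDoc string) (bool, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal([]byte(rawDoc), &doc); err != nil {
		return false, err
	}
	switch v := doc["Reading"].(type) {
	case float64: // encoding/json decodes JSON numbers into float64 here
		return v == 0, nil
	default:
		return false, nil
	}
}

func main() {
	for _, raw := range []string{`{"Reading": 0}`, `{"Reading": 41.5}`, `{"Reading": null}`, `{}`} {
		zero, err := readingIsZero(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %v\n", raw, zero)
	}
}
```

Treating a missing or null reading as "not a ghost" keeps the filter conservative: a GPU is dropped only when the sensor positively reports 0.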
|
||||||
|
|
||||||
// collectGPUsFromProcessors finds GPUs that some BMCs (e.g. MSI) expose as
|
// collectGPUsFromProcessors finds GPUs that some BMCs (e.g. MSI) expose as
|
||||||
// Processor entries with ProcessorType=GPU rather than as PCIe devices.
|
// Processor entries with ProcessorType=GPU rather than as PCIe devices.
|
||||||
// It supplements the existing gpus slice (already found via PCIe path),
|
// It supplements the existing gpus slice (already found via PCIe path),
|
||||||
@@ -68,6 +106,7 @@ func (r redfishSnapshotReader) collectGPUsFromProcessors(systemPaths, chassisPat
|
|||||||
return append([]models.GPU{}, existing...)
|
return append([]models.GPU{}, existing...)
|
||||||
}
|
}
|
||||||
chassisByID := make(map[string]map[string]interface{})
|
chassisByID := make(map[string]map[string]interface{})
|
||||||
|
chassisPathByID := make(map[string]string)
|
||||||
for _, cp := range chassisPaths {
|
for _, cp := range chassisPaths {
|
||||||
doc, err := r.getJSON(cp)
|
doc, err := r.getJSON(cp)
|
||||||
if err != nil || len(doc) == 0 {
|
if err != nil || len(doc) == 0 {
|
||||||
@@ -76,6 +115,7 @@ func (r redfishSnapshotReader) collectGPUsFromProcessors(systemPaths, chassisPat
|
|||||||
id := strings.TrimSpace(asString(doc["Id"]))
|
id := strings.TrimSpace(asString(doc["Id"]))
|
||||||
if id != "" {
|
if id != "" {
|
||||||
chassisByID[strings.ToUpper(id)] = doc
|
chassisByID[strings.ToUpper(id)] = doc
|
||||||
|
chassisPathByID[strings.ToUpper(id)] = cp
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -108,6 +148,13 @@ func (r redfishSnapshotReader) collectGPUsFromProcessors(systemPaths, chassisPat
|
|||||||
serial = resolveProcessorGPUChassisSerial(chassisByID, gpuID, plan)
|
serial = resolveProcessorGPUChassisSerial(chassisByID, gpuID, plan)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if plan.Directives.EnableMSIGhostGPUFilter {
|
||||||
|
chassisPath := resolveProcessorGPUChassisPath(chassisPathByID, gpuID, plan)
|
||||||
|
if chassisPath != "" && r.msiGhostGPUFilter(systemPaths, gpuID, chassisPath) {
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
uuid := strings.TrimSpace(asString(doc["UUID"]))
|
uuid := strings.TrimSpace(asString(doc["UUID"]))
|
||||||
uuidKey := strings.ToUpper(uuid)
|
uuidKey := strings.ToUpper(uuid)
|
||||||
serialKey := strings.ToUpper(serial)
|
serialKey := strings.ToUpper(serial)
|
||||||
|
|||||||
@@ -45,6 +45,15 @@ func resolveProcessorGPUChassisSerial(chassisByID map[string]map[string]interfac
|
|||||||
return ""
|
return ""
|
||||||
}
|
}
|
||||||
|
|
||||||
|
func resolveProcessorGPUChassisPath(chassisPathByID map[string]string, gpuID string, plan redfishprofile.ResolvedAnalysisPlan) string {
|
||||||
|
for _, candidateID := range processorGPUChassisCandidateIDs(gpuID, plan) {
|
||||||
|
if p, ok := chassisPathByID[strings.ToUpper(candidateID)]; ok {
|
||||||
|
return p
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return ""
|
||||||
|
}
|
||||||
|
|
||||||
func processorGPUChassisCandidateIDs(gpuID string, plan redfishprofile.ResolvedAnalysisPlan) []string {
|
func processorGPUChassisCandidateIDs(gpuID string, plan redfishprofile.ResolvedAnalysisPlan) []string {
|
||||||
gpuID = strings.TrimSpace(gpuID)
|
gpuID = strings.TrimSpace(gpuID)
|
||||||
if gpuID == "" {
|
if gpuID == "" {
|
||||||
|
|||||||
@@ -52,7 +52,6 @@ func baselineSeedPaths(discovered DiscoveredResources) []string {
|
|||||||
for _, p := range discovered.SystemPaths {
|
for _, p := range discovered.SystemPaths {
|
||||||
add(p)
|
add(p)
|
||||||
add(joinPath(p, "/Bios"))
|
add(joinPath(p, "/Bios"))
|
||||||
add(joinPath(p, "/SecureBoot"))
|
|
||||||
add(joinPath(p, "/Oem/Public"))
|
add(joinPath(p, "/Oem/Public"))
|
||||||
add(joinPath(p, "/Oem/Public/FRU"))
|
add(joinPath(p, "/Oem/Public/FRU"))
|
||||||
add(joinPath(p, "/Processors"))
|
add(joinPath(p, "/Processors"))
|
||||||
|
|||||||
@@ -10,7 +10,6 @@ func genericProfile() Profile {
|
|||||||
ensurePrefetchPolicy(plan, AcquisitionPrefetchPolicy{
|
ensurePrefetchPolicy(plan, AcquisitionPrefetchPolicy{
|
||||||
IncludeSuffixes: []string{
|
IncludeSuffixes: []string{
|
||||||
"/Bios",
|
"/Bios",
|
||||||
"/SecureBoot",
|
|
||||||
"/Processors",
|
"/Processors",
|
||||||
"/Memory",
|
"/Memory",
|
||||||
"/Storage",
|
"/Storage",
|
||||||
@@ -47,7 +46,6 @@ func genericProfile() Profile {
|
|||||||
ensureScopedPathPolicy(plan, AcquisitionScopedPathPolicy{
|
ensureScopedPathPolicy(plan, AcquisitionScopedPathPolicy{
|
||||||
SystemCriticalSuffixes: []string{
|
SystemCriticalSuffixes: []string{
|
||||||
"/Bios",
|
"/Bios",
|
||||||
"/SecureBoot",
|
|
||||||
"/Oem/Public",
|
"/Oem/Public",
|
||||||
"/Oem/Public/FRU",
|
"/Oem/Public/FRU",
|
||||||
"/Processors",
|
"/Processors",
|
||||||

@@ -64,8 +64,10 @@ func msiProfile() Profile {
 			if snapshotHasGPUProcessor(snapshot, discovered.SystemPaths) && snapshotHasPathPrefix(snapshot, "/redfish/v1/Chassis/GPU") {
 				plan.Directives.EnableProcessorGPUFallback = true
 				plan.Directives.EnableMSIProcessorGPUChassisLookup = true
+				plan.Directives.EnableMSIGhostGPUFilter = true
 				addAnalysisLookupMode(plan, "msi-index")
 				addAnalysisNote(plan, "msi analysis enables processor-gpu fallback from discovered GPU chassis")
+				addAnalysisNote(plan, "msi ghost-gpu filter enabled: GPUs with temperature=0 on powered-on host are excluded")
 			}
 		},
 	}

@@ -103,6 +103,7 @@ type AnalysisDirectives struct {
 	EnableProcessorGPUChassisAlias       bool
 	EnableGenericGraphicsControllerDedup bool
 	EnableMSIProcessorGPUChassisLookup   bool
+	EnableMSIGhostGPUFilter              bool
 	EnableStorageEnclosureRecovery       bool
 	EnableKnownStorageControllerRecovery bool
 }

@@ -33,7 +33,7 @@ func ConvertToReanimator(result *models.AnalysisResult) (*ReanimatorExport, erro
 	// Determine target host (optional field)
 	targetHost := inferTargetHost(result.TargetHost, result.Filename)
 
-	collectedAt := formatRFC3339(result.CollectedAt)
+	collectedAt := formatRFC3339(reanimatorCollectedAt(result))
 	devices := canonicalDevicesForExport(result.Hardware)
 
 	export := &ReanimatorExport{
@@ -58,6 +58,17 @@ func ConvertToReanimator(result *models.AnalysisResult) (*ReanimatorExport, erro
 	return export, nil
 }
 
+// reanimatorCollectedAt returns the best timestamp for Reanimator export collected_at.
+// Prefers InventoryLastModifiedAt when it is set and no older than 30 days; falls back
+// to CollectedAt (and ultimately to now via formatRFC3339).
+func reanimatorCollectedAt(result *models.AnalysisResult) time.Time {
+	inv := result.InventoryLastModifiedAt
+	if !inv.IsZero() && time.Since(inv) <= 30*24*time.Hour {
+		return inv
+	}
+	return result.CollectedAt
+}
+
 // formatRFC3339 formats time in RFC3339 format, returns current time if zero
 func formatRFC3339(t time.Time) string {
 	if t.IsZero() {

@@ -15,6 +15,7 @@ type AnalysisResult struct {
 	TargetHost              string         `json:"target_host,omitempty"`                // BMC host for live collect
 	SourceTimezone          string         `json:"source_timezone,omitempty"`            // Source timezone/offset used during collection (e.g. +08:00)
 	CollectedAt             time.Time      `json:"collected_at,omitempty"`               // Collection/upload timestamp
+	InventoryLastModifiedAt time.Time      `json:"inventory_last_modified_at,omitempty"` // Redfish inventory last modified (InventoryData/Status)
 	RawPayloads             map[string]any `json:"raw_payloads,omitempty"`               // Additional source payloads (e.g. Redfish tree)
 	Events                  []Event        `json:"events"`
 	FRU                     []FRUInfo      `json:"fru"`

@@ -46,7 +46,10 @@ func (s *Server) handleIndex(w http.ResponseWriter, r *http.Request) {
 	}
 
 	w.Header().Set("Content-Type", "text/html; charset=utf-8")
-	tmpl.Execute(w, nil)
+	tmpl.Execute(w, map[string]string{
+		"AppVersion": s.config.AppVersion,
+		"AppCommit":  s.config.AppCommit,
+	})
 }
 
 func (s *Server) handleChartCurrent(w http.ResponseWriter, r *http.Request) {

@@ -165,7 +165,7 @@
 <div class="footer-buttons">
 </div>
 <div class="footer-info">
-<p>Автор: <a href="https://mchus.pro" target="_blank">mchus.pro</a> | <a href="https://git.mchus.pro/mchus/logpile" target="_blank">Git Repository</a></p>
+<p>Автор: <a href="https://mchus.pro" target="_blank">mchus.pro</a> | <a href="https://git.mchus.pro/mchus/logpile" target="_blank">Git Repository</a>{{if .AppVersion}} | v{{.AppVersion}}{{end}}</p>
 </div>
 </footer>
 