18 Commits

Author SHA1 Message Date
Mikhail Chusavitin
65e65968cf feat: add xFusion iBMC Redfish profile
Matches on ServiceRootVendor "xFusion" and OEM namespace "xFusion"
(score 90+). Enables GenericGraphicsControllerDedup unconditionally and
ProcessorGPUFallback when GPU-type processors are present in the snapshot
(xFusion G5500 V7 exposes H200s simultaneously in PCIeDevices,
GraphicsControllers, and Processors/Gpu* — all three need dedup).

Without this profile, xFusion fell into fallback mode which activated all
vendor profiles (Supermicro, HGX, MSI, Dell) unnecessarily. Now resolves
to matched mode with targeted acquisition tuning (120k cap, 75s baseline).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 15:13:15 +03:00
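The score-based match described in this commit could be sketched as follows. Type and function names (`snapshot`, `matchXFusion`, `enableGPUFallback`) are illustrative assumptions patterned on the message, not the actual source:

```go
package main

import (
	"fmt"
	"strings"
)

// snapshot is a stand-in for the minimal discovery data a profile sees.
type snapshot struct {
	ServiceRootVendor string
	OEMNamespaces     []string
	HasGPUProcessors  bool
}

// matchXFusion returns a score; 90+ resolves to matched mode instead of
// fallback expansion across all vendor profiles.
func matchXFusion(s snapshot) int {
	score := 0
	if strings.EqualFold(s.ServiceRootVendor, "xFusion") {
		score += 60
	}
	for _, ns := range s.OEMNamespaces {
		if strings.EqualFold(ns, "xFusion") {
			score += 40
		}
	}
	return score
}

// enableGPUFallback mirrors the conditional ProcessorGPUFallback hook:
// only active when GPU-type processors are present in the snapshot.
func enableGPUFallback(s snapshot) bool { return s.HasGPUProcessors }

func main() {
	s := snapshot{ServiceRootVendor: "xFusion", OEMNamespaces: []string{"xFusion"}, HasGPUProcessors: true}
	fmt.Println(matchXFusion(s), enableGPUFallback(s))
}
```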
Mikhail Chusavitin
380c199705 chore: update chart submodule
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 08:50:56 +03:00
Mikhail Chusavitin
d650a6ba1c refactor: unified ingest pipeline + modular Redfish profile framework
Implement the full architectural plan: unified ingest.Service entry point
for archive and Redfish payloads, modular redfishprofile package with
composable profiles (generic, ami-family, msi, supermicro, dell,
hgx-topology), score-based profile matching with fallback expansion mode,
and profile-driven acquisition/analysis plans.

Vendor-specific logic moved out of common executors and into profile hooks.
GPU chassis lookup strategies and known storage recovery collections
(IntelVROC/HA-RAID/MRVL) now live in ResolvedAnalysisPlan, populated by
profiles at analysis time. Replay helpers read from the plan; no hardcoded
path lists remain in generic code.

Also splits redfish_replay.go into domain modules (gpu, storage, inventory,
fru, profiles) and adds full fixture/matcher/directive test coverage
including Dell, AMI, unknown-vendor fallback, and deterministic ordering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 08:48:58 +03:00
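The matched/fallback resolution this refactor introduces might look roughly like the sketch below. The `Profile` interface, `ResolvedAnalysisPlan` fields, and the Dell example are assumptions drawn from the commit text:

```go
package main

import "fmt"

// ResolvedAnalysisPlan carries profile-populated strategies, so generic
// code holds no hardcoded vendor path lists.
type ResolvedAnalysisPlan struct {
	GPUChassisLookups []string
	StorageRecovery   []string
}

type Profile interface {
	Name() string
	Score(vendor string) int
	Apply(plan *ResolvedAnalysisPlan)
}

type dellProfile struct{}

func (dellProfile) Name() string { return "dell" }
func (dellProfile) Score(v string) int {
	if v == "Dell" {
		return 95
	}
	return 0
}
func (dellProfile) Apply(p *ResolvedAnalysisPlan) {
	p.StorageRecovery = append(p.StorageRecovery, "PERC")
}

// resolve picks the best-scoring profile, expanding to every profile when
// nothing matches (the "fallback expansion mode").
func resolve(vendor string, profiles []Profile) []Profile {
	var best Profile
	bestScore := 0
	for _, pr := range profiles {
		if s := pr.Score(vendor); s > bestScore {
			best, bestScore = pr, s
		}
	}
	if best == nil {
		return profiles
	}
	return []Profile{best}
}

func main() {
	matched := resolve("Dell", []Profile{dellProfile{}})
	fmt.Println(len(matched), matched[0].Name())
}
```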
Mikhail Chusavitin
d8d3d8c524 build: use local go toolchain in release script 2026-03-16 00:32:09 +03:00
Mikhail Chusavitin
057a222288 ui: embed reanimator chart viewer 2026-03-16 00:20:11 +03:00
Mikhail Chusavitin
f11a43f690 export: merge inspur psu sensor groups 2026-03-15 23:29:44 +03:00
Mikhail Chusavitin
476630190d export: align reanimator contract v2.7 2026-03-15 23:27:32 +03:00
Mikhail Chusavitin
9007f1b360 export: align reanimator and enrich redfish metrics 2026-03-15 21:38:28 +03:00
Mikhail Chusavitin
0acdc2b202 docs: refresh project documentation 2026-03-15 16:35:16 +03:00
Mikhail Chusavitin
47bb0ee939 docs: document firmware filter regression pattern in bible (ADL-019)
Root cause analysis for device-bound firmware leaking into hardware.firmware
on Supermicro Redfish (SYS-A21GE-NBRT HGX B200):

- collectFirmwareInventory (6c19a58) had no coverage for Supermicro naming.
  isDeviceBoundFirmwareName checked "gpu " / "nic " (space-terminated) while
  Supermicro uses "GPU1 System Slot0" / "NIC1 System Slot0 ..." (digit suffix).

- 9c5512d added _fw_gpu_ / _fw_nvswitch_ / _inforom_gpu_ patterns to fix HGX,
  but checked DeviceName which contains "Software Inventory" (from Redfish Name),
  not the firmware Id. Dead code from day one.

09-testing.md: add firmware filter worked example and rule #4 — verify the
filter checks the field that the collector actually populates.

10-decisions.md: ADL-019 — isDeviceBoundFirmwareName must be extended per
vendor with a test case per vendor format before shipping.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 14:03:47 +03:00
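The dead-code pattern ADL-019 documents can be reproduced in a few lines. Field and pattern names below are assumptions from the commit message; the point is that the pattern list only ever fires when checked against the field the collector actually populates:

```go
package main

import (
	"fmt"
	"strings"
)

// fwEntry mimics a FirmwareInventory member as the collector sees it.
type fwEntry struct {
	ID         string // Redfish member Id, e.g. "HGX_FW_GPU_0"
	DeviceName string // Redfish Name — often a generic label
}

// isDeviceBoundID holds the HGX patterns from commit 9c5512d.
func isDeviceBoundID(id string) bool {
	lower := strings.ToLower(id)
	for _, p := range []string{"_fw_gpu_", "_fw_nvswitch_", "_inforom_gpu_"} {
		if strings.Contains(lower, p) {
			return true
		}
	}
	return false
}

func main() {
	e := fwEntry{ID: "HGX_FW_GPU_0", DeviceName: "Software Inventory"}
	// Checking the wrong field silently keeps every entry (the dead code):
	fmt.Println(isDeviceBoundID(e.DeviceName)) // false
	// Checking the field the collector actually fills filters it:
	fmt.Println(isDeviceBoundID(e.ID)) // true
}
```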
Mikhail Chusavitin
5815100e2f exporter: filter Supermicro Redfish device-bound firmware from hardware.firmware
isDeviceBoundFirmwareName did not catch Supermicro FirmwareInventory naming
conventions where a digit follows the type prefix directly ("GPU1 System Slot0",
"NIC1 System Slot0 AOM-DP805-IO") instead of a space. Also missing: "Power supply N",
"NVMeController N", and "Software Inventory" (generic label for all HGX per-component
firmware slots — GPU, NVSwitch, PCIeRetimer, ERoT, InfoROM, etc.).

On SYS-A21GE-NBRT (HGX B200) this caused 29 device-bound entries to leak into
hardware.firmware: 8 GPU, 9 NIC, 1 NVMe, 6 PSU, 4 PCIeSwitch, 1 Software Inventory.

Fix: extend isDeviceBoundFirmwareName with patterns for all four new cases.
Add TestIsDeviceBoundFirmwareName covering both excluded and kept entries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 13:48:55 +03:00
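A minimal sketch of the extended matcher, assuming the digit-suffix handling is regex-based; the exact patterns in `isDeviceBoundFirmwareName` may differ:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// digitSuffix covers Supermicro naming where a digit follows the type
// prefix directly: "GPU1 System Slot0", "NIC1 System Slot0 ...".
var digitSuffix = regexp.MustCompile(`^(gpu|nic|pcieswitch)\d+\b`)

func deviceBound(name string) bool {
	n := strings.ToLower(strings.TrimSpace(name))
	if digitSuffix.MatchString(n) {
		return true
	}
	for _, p := range []string{"power supply ", "nvmecontroller ", "software inventory"} {
		if strings.HasPrefix(n, p) || n == strings.TrimSpace(p) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(deviceBound("GPU1 System Slot0"))              // true
	fmt.Println(deviceBound("NIC1 System Slot0 AOM-DP805-IO")) // true
	fmt.Println(deviceBound("BMC"))                            // false — platform firmware is kept
}
```

Per ADL-019, each new vendor format would need its own true-positive and true-negative case in `TestIsDeviceBoundFirmwareName`.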
Mikhail Chusavitin
1eb639e6bf redfish: skip NVMe bay probe for non-storage chassis types (Module/Component/Zone)
On Supermicro HGX systems (SYS-A21GE-NBRT) ~35 sub-chassis (GPU, NVSwitch,
PCIeRetimer, ERoT/IRoT, BMC, FPGA) all carry ChassisType=Module/Component/Zone
and expose empty /Drives collections. shouldAdaptiveNVMeProbe returned true for
all of them, triggering 35 × 384 = 13 440 HTTP requests → ~22 min wasted per
collection (more than half of total 35 min collection time).

Fix: chassisTypeCanHaveNVMe returns false for Module, Component, Zone. The
candidate selection loop in collectRawRedfishTree now checks the parent chassis
doc before adding a /Drives path to the probe list. Enclosure (NVMe backplane),
RackMount, and unknown types are unaffected.

Tests:
- TestChassisTypeCanHaveNVMe: table-driven, covers excluded and storage-capable types
- TestNVMePostProbeSkipsNonStorageChassis: topology integration, GPU chassis +
  backplane with empty /Drives → exactly 1 candidate selected (backplane only)

Docs:
- ADL-018 in bible-local/10-decisions.md
- Candidate-selection test matrix in bible-local/09-testing.md
- SYS-A21GE-NBRT baseline row in docs/test_server_collection_memory.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 13:38:29 +03:00
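The type gate can be sketched as a simple switch. `chassisTypeCanHaveNVMe` is the real function name from the commit; the table below is an assumption covering only the cases the message lists:

```go
package main

import "fmt"

// chassisTypeCanHaveNVMe gates the adaptive NVMe bay probe by Redfish
// ChassisType, so 35 Module/Component/Zone sub-chassis no longer trigger
// 35 × 384 probe requests against empty /Drives collections.
func chassisTypeCanHaveNVMe(chassisType string) bool {
	switch chassisType {
	case "Module", "Component", "Zone":
		return false // GPU/NVSwitch/ERoT sub-chassis never host drive bays
	default:
		return true // Enclosure, RackMount, and unknown types stay probed
	}
}

func main() {
	for _, t := range []string{"Module", "Enclosure", "RackMount"} {
		fmt.Println(t, chassisTypeCanHaveNVMe(t))
	}
}
```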
Mikhail Chusavitin
a9f58b3cf4 redfish: fix GPU duplication on Supermicro HGX, exclude NVSwitch, restore path dedup
Three bugs, all related to GPU dedup in the Redfish replay pipeline:

1. collectGPUsFromProcessors (redfish_replay.go): GPU-type Processor entries
   (Systems/HGX_Baseboard_0/Processors/GPU_SXM_N) were not deduplicated against
   existing PCIeDevice GPUs on Supermicro HGX. The chassis-ID lookup keyed on
   processor Id ("GPU_SXM_1") but the chassis is named "HGX_GPU_SXM_1" — lookup
   returned nothing, serial stayed empty, UUID was unseen → 8 duplicate GPU rows.
   Fix: read SerialNumber directly from the Processor doc first; chassis lookup
   is now a fallback override (as it was designed for MSI).

2. looksLikeGPU (redfish.go): NVSwitch PCIe devices (Model="NVSwitch",
   Manufacturer="NVIDIA") were classified as GPUs because "nvidia" matched the
   GPU hint list. Fix: early return false when Model contains "nvswitch".

3. gpuDocDedupKey (redfish.go): commit 9df29b1 changed the dedup key to prefer
   slot|model before path, which collapsed two distinct GPUs with identical model
   names in GraphicsControllers into one entry. Fix: only serial and BDF are used
   as cross-path stable dedup keys; fall back to Redfish path when neither is
   present. This also restores TestReplayCollectGPUs_DedupUsesRedfishPathBeforeHeuristics
   which had been broken on main since 9df29b1.

Added tests:
- TestCollectGPUsFromProcessors_SupermicroHGX: Processor GPU dedup when
  chassis-ID naming convention does not match processor Id
- TestReplayCollectGPUs_DedupCrossChassisSerial: same GPU via two Chassis
  PCIeDevice trees with matching serials → collapsed to one
- TestLooksLikeGPU_NVSwitchExcluded: NVSwitch is not a GPU

Added rule to bible-local/09-testing.md: dedup/filter/classify functions must
cover true-positive, true-negative, and the vendor counter-case axes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 15:09:27 +03:00
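The restored key priority from bug 3 (serial → BDF → Redfish path) could be sketched as below; field names are assumptions, the real key lives in `gpuDocDedupKey`:

```go
package main

import "fmt"

type gpuDoc struct {
	Serial string
	BDF    string // PCI bus/device/function address
	Path   string // @odata.id
}

// dedupKey uses only cross-path stable identifiers; slot|model is no
// longer a key, so identical models in GraphicsControllers stay distinct.
func dedupKey(g gpuDoc) string {
	if g.Serial != "" {
		return "serial:" + g.Serial
	}
	if g.BDF != "" {
		return "bdf:" + g.BDF
	}
	return "path:" + g.Path
}

func main() {
	a := gpuDoc{Serial: "S1", Path: "/Chassis/1/PCIeDevices/GPU1"}
	b := gpuDoc{Serial: "S1", Path: "/Chassis/HGX_GPU_SXM_1/PCIeDevices/GPU1"}
	fmt.Println(dedupKey(a) == dedupKey(b)) // true: one GPU seen via two trees

	c := gpuDoc{Path: "/GraphicsControllers/1"}
	d := gpuDoc{Path: "/GraphicsControllers/2"}
	fmt.Println(dedupKey(c) == dedupKey(d)) // false: same model, distinct GPUs
}
```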
Mikhail Chusavitin
d8ffe3d3a5 redfish: add service root to critical endpoints, tolerate missing root in replay
Add /redfish/v1 to redfishCriticalEndpoints so plan-B retries the service
root if it failed during the main crawl. Also downgrade the missing-root
error in ReplayRedfishFromRawPayloads from fatal to a warning so analysis
can complete with defaults when the root doc was not recovered.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-11 08:31:00 +03:00
Mikhail Chusavitin
9df29b1be9 fix: dedup GPUs across multiple chassis PCIeDevice trees in Redfish collector
Supermicro HGX exposes each GPU under both Chassis/1/PCIeDevices and a
dedicated Chassis/HGX_GPU_SXM_N/PCIeDevices. gpuDocDedupKey was keying
by @odata.id path, so identical GPUs with the same serial were not
deduplicated across sources. Now stable identifiers (serial → BDF →
slot+model) take priority over path.

Also includes Inspur parser improvements: NVMe model/serial enrichment
from devicefrusdr.log and audit.log, RAID drive slot normalization to
BP notation, PSU slot normalization, BMC/CPLD/VR firmware from RESTful
version info section, and parser version bump to 1.8.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 14:44:36 +03:00
Mikhail Chusavitin
62d6ad6f66 ui: deduplicate files by name and SHA-256 hash before batch convert
On folder selection, filter out duplicate files before conversion:
- First pass: same basename → skip (same filename in different subdirs)
- Second pass: same SHA-256 hash → skip (identical content, different path)

Duplicates are excluded from the convert queue and shown as a warning
in the summary with reason (same name / same content).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 12:45:09 +03:00
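The two-pass dedup runs in the web UI before conversion; a backend-style Go sketch of the same idea (file list and struct names are assumptions):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

type file struct {
	Path string
	Data []byte
}

// dedup keeps the first file per basename and per content hash; skipped
// entries record the reason shown in the summary warning.
func dedup(files []file) (kept []file, skipped map[string]string) {
	skipped = map[string]string{}
	seenName := map[string]bool{}
	seenHash := map[string]bool{}
	for _, f := range files {
		name := filepath.Base(f.Path)
		if seenName[name] {
			skipped[f.Path] = "same name"
			continue
		}
		sum := sha256.Sum256(f.Data)
		h := hex.EncodeToString(sum[:])
		if seenHash[h] {
			skipped[f.Path] = "same content"
			continue
		}
		seenName[name], seenHash[h] = true, true
		kept = append(kept, f)
	}
	return
}

func main() {
	files := []file{
		{"a/log.tar.gz", []byte("A")},
		{"b/log.tar.gz", []byte("B")}, // same basename, different subdir
		{"c/other.tgz", []byte("A")},  // identical content, different path
	}
	kept, skipped := dedup(files)
	fmt.Println(len(kept), skipped["b/log.tar.gz"], skipped["c/other.tgz"])
}
```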
Mikhail Chusavitin
f09344e288 dell: filter chipset/embedded noise from PCIe device list
Skip FQDD prefixes that are internal AMD EPYC fabric or devices
already captured with richer data from other DCIM views:
- HostBridge/P2PBridge/ISABridge/SMBus.Embedded: AMD internal bus
- AHCI.Embedded: AMD FCH SATA (chipset, not a slot)
- Video.Embedded: BMC Matrox G200eW3, not user-visible
- NIC.Embedded: duplicates DCIM_NICView entries (no model/MAC in PCIe view)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 12:09:40 +03:00
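A sketch of the FQDD prefix filter; the prefix list mirrors the commit message, while the helper name and exact prefix forms are assumptions:

```go
package main

import (
	"fmt"
	"strings"
)

// noisyFQDDPrefixes marks Dell FQDDs that are AMD internal fabric or
// devices already captured with richer data from other DCIM views.
var noisyFQDDPrefixes = []string{
	"HostBridge.", "P2PBridge.", "ISABridge.", "SMBus.",
	"AHCI.Embedded", "Video.Embedded", "NIC.Embedded",
}

func isChipsetNoise(fqdd string) bool {
	for _, p := range noisyFQDDPrefixes {
		if strings.HasPrefix(fqdd, p) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isChipsetNoise("Video.Embedded.1-1")) // true — BMC Matrox G200eW3
	fmt.Println(isChipsetNoise("GPU.Slot.7-1"))       // false — user-visible device
}
```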
19d857b459 redfish: filter PCIe topology noise, deduplicate GPU/NIC cross-sources
- isUnidentifiablePCIeDevice: skip PCIe entries with generic class
  (SingleFunction/MultiFunction) and no model/serial/VendorID — eliminates
  PCH bridges, root ports and other bus infrastructure that MSI BMC
  enumerates exhaustively (59→9 entries on CG480-S5063)
- collectPCIeDevices: skip entries where looksLikeGPU — prevents GPU
  devices from appearing in both hw.GPUs and hw.PCIeDevices (fixed
  Inspur H100 duplicate)
- dedupeCanonicalDevices: secondary model+manufacturer match for noKey
  items (no serial, no BDF) — merges NetworkAdapter entries into
  matching PCIe device entries; isGenericDeviceClass helper for
  DeviceClass identity check (fixed Inspur ENFI1100-T4 duplicate)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 22:08:02 +03:00
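The generic-class skip could be sketched as below; struct and helper names are assumptions patterned on `isUnidentifiablePCIeDevice` and `isGenericDeviceClass` from the commit message:

```go
package main

import "fmt"

type pcieDev struct {
	DeviceClass string
	Model       string
	Serial      string
	VendorID    string
}

// isGenericDeviceClass flags DeviceClass values that carry no identity.
func isGenericDeviceClass(c string) bool {
	return c == "SingleFunction" || c == "MultiFunction"
}

// unidentifiable: generic class and nothing that names the hardware —
// PCH bridges, root ports, and other bus infrastructure that some BMCs
// enumerate exhaustively.
func unidentifiable(d pcieDev) bool {
	return isGenericDeviceClass(d.DeviceClass) &&
		d.Model == "" && d.Serial == "" && d.VendorID == ""
}

func main() {
	bridge := pcieDev{DeviceClass: "SingleFunction"}
	nic := pcieDev{DeviceClass: "MultiFunction", Model: "ENFI1100-T4"}
	fmt.Println(unidentifiable(bridge), unidentifiable(nic))
}
```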
91 changed files with 13937 additions and 3073 deletions

.gitmodules (vendored)

@@ -4,3 +4,6 @@
[submodule "bible"]
path = bible
url = https://git.mchus.pro/mchus/bible.git
[submodule "internal/chart"]
path = internal/chart
url = https://git.mchus.pro/reanimator/chart.git


@@ -6,6 +6,6 @@ Start with `bible/rules/patterns/` for specific contracts.
## Project Architecture
Read `bible-local/` — LOGPile specific architecture.
Read order: `bible-local/README.md` → `01-overview.md` → relevant files for the task.
Read order: `bible-local/README.md` → `01-overview.md` → `02-architecture.md` → `04-data-models.md` → relevant file(s) for the task.
Every architectural decision specific to this project must be recorded in `bible-local/10-decisions.md`.


@@ -6,6 +6,6 @@ Start with `bible/rules/patterns/` for specific contracts.
## Project Architecture
Read `bible-local/` — LOGPile specific architecture.
Read order: `bible-local/README.md` → `01-overview.md` → relevant files for the task.
Read order: `bible-local/README.md` → `01-overview.md` → `02-architecture.md` → `04-data-models.md` → relevant file(s) for the task.
Every architectural decision specific to this project must be recorded in `bible-local/10-decisions.md`.


@@ -2,9 +2,27 @@
Standalone Go application for BMC diagnostics analysis with an embedded web UI.
## What it does
- Parses vendor diagnostic archives into a normalized hardware inventory
- Collects live BMC data via Redfish
- Exports normalized data as CSV, raw re-analysis bundles, and Reanimator JSON
- Runs as a single Go binary with embedded UI assets
## Documentation
- Architecture and technical documentation (single source of truth): [`docs/bible/README.md`](docs/bible/README.md)
- Shared engineering rules: [`bible/README.md`](bible/README.md)
- Project architecture and API contracts: [`bible-local/README.md`](bible-local/README.md)
- Agent entrypoints: [`AGENTS.md`](AGENTS.md), [`CLAUDE.md`](CLAUDE.md)
## Run
```bash
make build
./bin/logpile
```
Default port: `8082`
## License


@@ -1,35 +1,43 @@
# 01 — Overview
## What is LOGPile?
## Purpose
LOGPile is a standalone Go application for BMC (Baseboard Management Controller)
diagnostics analysis with an embedded web UI.
It runs as a single binary with no external file dependencies.
LOGPile is a standalone Go application for BMC diagnostics analysis with an embedded web UI.
It runs as a single binary and normalizes hardware data from archives or live Redfish collection.
## Operating modes
| Mode | Entry point | Description |
|------|-------------|-------------|
| **Offline / archive** | `POST /api/upload` | Upload a vendor diagnostic archive or a JSON snapshot; parse and display in UI |
| **Live / Redfish** | `POST /api/collect` | Connect to a live BMC via Redfish API, collect hardware inventory, display and export |
| Mode | Entry point | Outcome |
|------|-------------|---------|
| Archive upload | `POST /api/upload` | Parse a supported archive, raw export bundle, or JSON snapshot into `AnalysisResult` |
| Live collection | `POST /api/collect` | Collect from a live BMC via Redfish and store the result in memory |
| Batch convert | `POST /api/convert` | Convert multiple supported input files into Reanimator JSON in a ZIP artifact |
Both modes produce the same in-memory `AnalysisResult` structure and expose it
through the same API and UI.
All modes converge on the same normalized hardware model and exporter pipeline.
## Key capabilities
## In scope
- Single self-contained binary with embedded HTML/JS/CSS (no static file serving required).
- Vendor archive parsing: Inspur/Kaytus, Dell TSR, NVIDIA HGX Field Diagnostics,
NVIDIA Bug Report, Unraid, XigmaNAS, Generic text fallback.
- Live Redfish collection with async progress tracking.
- Normalized hardware inventory: CPU / RAM / Storage / GPU / PSU / NIC / PCIe / Firmware.
- Raw `redfish_tree` snapshot stored in `RawPayloads` for future offline re-analysis.
- Re-upload of a JSON snapshot for offline work (`/api/upload` accepts `AnalysisResult` JSON).
- Export in CSV, JSON (full `AnalysisResult`), and Reanimator format.
- PCI device model resolution via embedded `pci.ids` (no hardcoded model strings).
- Single-binary desktop/server utility with embedded UI
- Vendor archive parsing and live Redfish collection
- Canonical hardware inventory across UI and exports
- Reopenable raw export bundles for future re-analysis
- Reanimator export and batch conversion workflows
- Embedded `pci.ids` lookup for vendor/device name enrichment
## Non-goals (current scope)
## Current vendor coverage
- No persistent storage — all state is in-memory per process lifetime.
- IPMI collector is a mock scaffold only; real IPMI support is not implemented.
- No authentication layer on the HTTP server.
- Dell TSR
- H3C SDS G5/G6
- Inspur / Kaytus
- NVIDIA HGX Field Diagnostics
- NVIDIA Bug Report
- Unraid
- XigmaNAS
- Generic fallback parser
## Non-goals
- Persistent storage or multi-user state
- Production IPMI collection
- Authentication/authorization on the built-in HTTP server
- Long-term server-side job history beyond in-memory process lifetime


@@ -2,114 +2,97 @@
## Runtime stack
| Layer | Technology |
|-------|------------|
| Layer | Implementation |
|-------|----------------|
| Language | Go 1.22+ |
| HTTP | `net/http`, `http.ServeMux` |
| UI | Embedded via `//go:embed` in `web/embed.go` (templates + static assets) |
| State | In-memory only — no database |
| Build | `CGO_ENABLED=0`, single static binary |
| HTTP | `net/http` + `http.ServeMux` |
| UI | Embedded templates and static assets via `go:embed` |
| State | In-memory only |
| Build | `CGO_ENABLED=0`, single binary |
Default port: **8082**
Default port: `8082`
## Directory structure
Audit result rendering is delegated to embedded `reanimator/chart`, vendored as git submodule `internal/chart`.
LOGPile remains responsible for upload, collection, parsing, normalization, and Reanimator export generation.
```
cmd/logpile/main.go # Binary entry point, CLI flag parsing
internal/
collector/ # Live data collectors
registry.go # Collector registration
redfish.go # Redfish connector (real implementation)
ipmi_mock.go # IPMI mock connector (scaffold)
types.go # Connector request/progress contracts
parser/ # Archive parsers
parser.go # BMCParser (dispatcher) + parse orchestration
archive.go # Archive extraction helpers
registry.go # Parser registry + detect/selection
interface.go # VendorParser interface
vendors/ # Vendor-specific parser modules
vendors.go # Import-side-effect registrations
dell/
inspur/
nvidia/
nvidia_bug_report/
unraid/
xigmanas/
generic/
pciids/ # PCI IDs lookup (embedded pci.ids)
server/ # HTTP layer
server.go # Server struct, route registration
handlers.go # All HTTP handler functions
exporter/ # Export formatters
exporter.go # CSV + JSON exporters
reanimator_models.go
reanimator_converter.go
models/ # Shared data contracts
web/
embed.go # go:embed directive
templates/ # HTML templates
static/ # JS / CSS
js/app.js # Frontend — API contract consumer
## Code map
```text
cmd/logpile/main.go entrypoint and CLI flags
internal/server/ HTTP handlers, jobs, upload/export flows
internal/ingest/ source-family orchestration for upload and raw replay
internal/collector/ live collection and Redfish replay
internal/analyzer/ shared analysis helpers
internal/parser/ archive extraction and parser dispatch
internal/exporter/ CSV and Reanimator conversion
internal/chart/ vendored `reanimator/chart` viewer submodule
internal/models/ stable data contracts
web/ embedded UI assets
```
## In-memory state
## Server state
The `Server` struct in `internal/server/server.go` holds:
`internal/server.Server` stores:
| Field | Type | Description |
|-------|------|-------------|
| `result` | `*models.AnalysisResult` | Current parsed/collected dataset |
| `detectedVendor` | `string` | Vendor identifier from last parse |
| `jobManager` | `*JobManager` | Tracks live collect job status/logs |
| `collectors` | `*collector.Registry` | Registered live collection connectors |
| Field | Purpose |
|------|---------|
| `result` | Current `AnalysisResult` shown in UI and used by exports |
| `detectedVendor` | Parser/collector identity for the current dataset |
| `rawExport` | Reopenable raw-export package associated with current result |
| `jobManager` | Shared async job state for collect and convert flows |
| `collectors` | Registered live collectors (`redfish`, `ipmi`) |
| `convertOutput` | Temporary ZIP artifacts for batch convert downloads |
State is replaced atomically on successful upload or collect.
On a failed/canceled collect, the previous `result` is preserved unchanged.
State is replaced only on successful upload or successful live collection.
Failed or canceled jobs do not overwrite the previous dataset.
## Upload flow (`POST /api/upload`)
## Main flows
```
multipart form field: "archive"
├─ file looks like JSON?
│ └─ parse as models.AnalysisResult snapshot → store in Server.result
└─ otherwise
└─ parser.NewBMCParser().ParseFromReader(...)
├─ try all registered vendor parsers (highest confidence wins)
└─ result → store in Server.result
```
### Upload
## Live collect flow (`POST /api/collect`)
1. `POST /api/upload` receives multipart field `archive`
2. `internal/ingest.Service` resolves the source family
3. JSON inputs are checked for raw-export package or `AnalysisResult` snapshot
4. Non-JSON archives go through the archive parser family
5. Archive metadata is normalized onto `AnalysisResult`
6. Result becomes the current in-memory dataset
```
validate request (host / protocol / port / username / auth_type / tls_mode)
└─ launch async job
├─ progress callback → job log (queryable via GET /api/collect/{id})
├─ success:
│ set source metadata (source_type=api, protocol, host, date)
│ store result in Server.result
└─ failure / cancel:
previous Server.result unchanged
```
### Live collect
Job lifecycle states: `queued → running → success | failed | canceled`
1. `POST /api/collect` validates request fields
2. Server creates an async job and returns `202 Accepted`
3. Selected collector gathers raw data
4. For Redfish, collector runs minimal discovery, matches Redfish profiles, and builds an acquisition plan
5. Collector applies profile tuning hints (for example crawl breadth, prefetch, bounded plan-B passes)
6. Collector saves `raw_payloads.redfish_tree` plus acquisition diagnostics
7. Result is normalized, source metadata applied, and state replaced on success
### Batch convert
1. `POST /api/convert` accepts multiple files
2. Each supported file is analyzed independently
3. Successful results are converted to Reanimator JSON
4. Outputs are packaged into a temporary ZIP artifact
5. Client polls job status and downloads the artifact when ready
## Redfish design rule
Live Redfish collection and offline Redfish re-analysis must use the same replay path.
The collector first captures `raw_payloads.redfish_tree`, then the replay logic builds the normalized result.
Redfish is being split into two coordinated phases:
- acquisition: profile-driven snapshot collection strategy
- analysis: replay over the saved snapshot with the same profile framework
## PCI IDs lookup
Load/override order (`LOGPILE_PCI_IDS_PATH` has highest priority because it is loaded last):
Lookup order:
1. Embedded `internal/parser/vendors/pciids/pci.ids` (base dataset compiled into binary)
1. Embedded `internal/parser/vendors/pciids/pci.ids`
2. `./pci.ids`
3. `/usr/share/hwdata/pci.ids`
4. `/usr/share/misc/pci.ids`
5. `/opt/homebrew/share/pciids/pci.ids`
6. Paths from `LOGPILE_PCI_IDS_PATH` (colon-separated on Unix; later loaded, override same IDs)
6. Extra paths from `LOGPILE_PCI_IDS_PATH`
This means unknown GPU/NIC model strings can be updated by refreshing `pci.ids`
without any code change.
Later sources override earlier ones for the same IDs.
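The override behavior can be illustrated in a few lines: later-loaded sources simply overwrite earlier entries for the same vendor:device ID. The ID and names below are hypothetical:

```go
package main

import "fmt"

// lookupAfterLayers loads two layers in order and returns the winner.
func lookupAfterLayers() string {
	db := map[string]string{}
	load := func(entries map[string]string) {
		for id, name := range entries {
			db[id] = name // later load overrides earlier for the same ID
		}
	}
	load(map[string]string{"10de:ffff": "GPU name from embedded pci.ids"})
	load(map[string]string{"10de:ffff": "GPU name from LOGPILE_PCI_IDS_PATH"})
	return db["10de:ffff"]
}

func main() {
	fmt.Println(lookupAfterLayers())
}
```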


@@ -2,38 +2,38 @@
## Conventions
- All endpoints under `/api/`.
- Request bodies: `application/json` or `multipart/form-data` where noted.
- Responses: `application/json` unless file download.
- Export filenames follow pattern: `YYYY-MM-DD (SERVER MODEL) - SERVER SN.<ext>`
- All endpoints are under `/api/`
- JSON responses are used unless the endpoint downloads a file
- Async jobs share the same status model: `queued`, `running`, `success`, `failed`, `canceled`
- Export filenames use `YYYY-MM-DD (MODEL) - SERIAL.<ext>` when board metadata exists
- Embedded chart viewer routes live under `/chart/` and return HTML/CSS, not JSON
---
## Upload & Data Input
## Input endpoints
### `POST /api/upload`
Upload a vendor diagnostic archive or a JSON snapshot.
**Request:** `multipart/form-data`, field name `archive`.
Server-side multipart limit: **100 MiB**.
Uploads one file in multipart field `archive`.
Accepted inputs:
- `.tar`, `.tar.gz`, `.tgz` — vendor diagnostic archives
- `.txt` — plain text files
- JSON file containing a serialized `AnalysisResult` — re-loaded as-is
- supported archive/log formats from the parser registry
- `.json` `AnalysisResult` snapshots
- raw-export JSON packages
- raw-export ZIP bundles
**Response:** `200 OK` with parsed result summary, or `4xx`/`5xx` on error.
Result:
- parses or replays the input
- stores the result as current in-memory state
- returns parsed summary JSON
---
## Live Collection
Related helper:
- `GET /api/file-types` returns `archive_extensions`, `upload_extensions`, and `convert_extensions`
### `POST /api/collect`
Start a live collection job (`redfish` or `ipmi`).
Starts a live collection job.
Request body:
**Request body:**
```json
{
"host": "bmc01.example.local",
@@ -47,138 +47,153 @@ Start a live collection job (`redfish` or `ipmi`).
```
Supported values:
- `protocol`: `redfish` | `ipmi`
- `auth_type`: `password` | `token`
- `tls_mode`: `strict` | `insecure`
- `protocol`: `redfish` or `ipmi`
- `auth_type`: `password` or `token`
- `tls_mode`: `strict` or `insecure`
**Response:** `202 Accepted`
```json
{
"job_id": "job_a1b2c3d4e5f6",
"status": "queued",
"message": "Collection job accepted",
"created_at": "2026-02-23T12:00:00Z"
}
```
Responses:
- `202` on accepted job creation
- `400` on malformed JSON
- `422` on validation errors
Validation behavior:
- `400 Bad Request` for invalid JSON
- `422 Unprocessable Entity` for semantic validation errors (missing/invalid fields)
Optional request field:
- `power_on_if_host_off`: when `true`, Redfish collection may power on the host before collection if preflight found it powered off
### `POST /api/collect/probe`
Checks that live API connectivity works and returns host power state before collection starts.
Typical request body is the same as `POST /api/collect`.
Typical response fields:
- `reachable`
- `protocol`
- `host_power_state`
- `host_powered_on`
- `power_control_available`
- `message`
### `GET /api/collect/{id}`
Poll job status and progress log.
**Response:**
```json
{
"job_id": "job_a1b2c3d4e5f6",
"status": "running",
"progress": 55,
"logs": ["..."],
"created_at": "2026-02-23T12:00:00Z",
"updated_at": "2026-02-23T12:00:10Z"
}
```
Status values: `queued` | `running` | `success` | `failed` | `canceled`
Returns async collection job status, progress, timestamps, and accumulated logs.
### `POST /api/collect/{id}/cancel`
Cancel a running job.
Requests cancellation for a running collection job.
---
### `POST /api/convert`
## Data Queries
Starts a batch conversion job that accepts multiple files under `files[]` or `files`.
Each supported file is parsed independently and converted to Reanimator JSON.
Response fields:
- `job_id`
- `status`
- `accepted`
- `skipped`
- `total_files`
### `GET /api/convert/{id}`
Returns batch convert job status using the same async job envelope as collection.
### `GET /api/convert/{id}/download`
Downloads the ZIP artifact produced by a successful convert job.
## Read endpoints
### `GET /api/status`
Returns source metadata for the current dataset.
If nothing is loaded, response is `{ "loaded": false }`.
```json
{
"loaded": true,
"filename": "redfish://bmc01.example.local",
"vendor": "redfish",
"source_type": "api",
"protocol": "redfish",
"target_host": "bmc01.example.local",
"collected_at": "2026-02-10T15:30:00Z",
"stats": { "events": 0, "sensors": 0, "fru": 0 }
}
```
`source_type`: `archive` | `api`
When no dataset is loaded, response is `{ "loaded": false }`.
Typical fields:
- `loaded`
- `filename`
- `vendor`
- `source_type`
- `protocol`
- `target_host`
- `source_timezone`
- `collected_at`
- `stats`
### `GET /api/config`
Returns source metadata plus:
Returns the main UI configuration payload, including:
- source metadata
- `hardware.board`
- `hardware.firmware`
- canonical `hardware.devices`
- computed `specification` summary lines
- computed specification lines
### `GET /api/events`
Returns parsed diagnostic events.
Returns events sorted newest first.
### `GET /api/sensors`
Returns sensor readings (temperatures, voltages, fan speeds).
Returns parsed sensors plus synthesized PSU voltage sensors when telemetry is available.
### `GET /api/serials`
Returns serial numbers built from canonical `hardware.devices`.
Returns serial-oriented inventory built from canonical devices.
### `GET /api/firmware`
Returns firmware versions built from canonical `hardware.devices`.
Returns firmware-oriented inventory built from canonical devices.
### `GET /api/parse-errors`
Returns normalized parse and collection issues combined from:
- Redfish fetch errors in `raw_payloads`
- raw-export collect logs
- derived partial-inventory warnings
### `GET /api/parsers`
Returns list of registered vendor parsers with their identifiers.
Returns registered parser metadata.
---
### `GET /api/file-types`
## Export
Returns supported file extensions for upload and batch convert.
## Viewer endpoints
### `GET /chart/current`
Renders the current in-memory dataset as Reanimator HTML using embedded `reanimator/chart`.
The server first converts the current result to Reanimator JSON, then passes that snapshot to the viewer.
### `GET /chart/static/...`
Serves embedded `reanimator/chart` static assets.
## Export endpoints
### `GET /api/export/csv`
Download serial numbers as CSV.
Downloads serial-number CSV.
### `GET /api/export/json`
Download full `AnalysisResult` as JSON (includes `raw_payloads`).
Downloads a raw-export artifact for reopen and re-analysis.
Current implementation emits a ZIP bundle containing:
- `raw_export.json`
- `collect.log`
- `parser_fields.json`
### `GET /api/export/reanimator`
Download hardware data in Reanimator format for asset tracking integration.
See [`07-exporters.md`](07-exporters.md) for full format spec.
Downloads Reanimator JSON built from the current normalized result.
---
## Management
## Management endpoints
### `DELETE /api/clear`
Clear current in-memory dataset.
Clears current in-memory dataset, raw export state, and temporary convert artifacts.
### `POST /api/shutdown`
Gracefully shut down the server process.
This endpoint terminates the current process after responding.
---
## Source metadata fields
Fields present in `/api/status` and `/api/config`:
| Field | Values |
|-------|--------|
| `source_type` | `archive` \| `api` |
| `protocol` | `redfish` \| `ipmi` (may be empty for archive uploads) |
| `target_host` | IP or hostname |
| `collected_at` | RFC3339 timestamp |
Gracefully shuts down the process after responding.


@@ -1,104 +1,87 @@
# 04 — Data Models
## AnalysisResult
## Core contract: `AnalysisResult`
`internal/models/` — the central data contract shared by parsers, collectors, exporters, and the HTTP layer.
`internal/models/models.go` defines the shared result passed between parsers, collectors, server handlers, and exporters.
**Stability rule:** Never break the JSON shape of `AnalysisResult`.
Backward-compatible additions are allowed; removals or renames are not.
Stability rule:
- do not rename or remove JSON fields from `AnalysisResult`
- additive fields are allowed
- UI and exporter compatibility depends on this shape remaining stable
Key top-level fields:
| Field | Type | Description |
|-------|------|-------------|
| `filename` | `string` | Uploaded filename or synthesized live source name |
| `source_type` | `string` | `archive` or `api` |
| `protocol` | `string` | `redfish`, `ipmi`, or empty for archive uploads |
| `target_host` | `string` | BMC hostname or IP for live collection |
| `source_timezone` | `string` | Source timezone/offset if known |
| `collected_at` | `time.Time` | Canonical collection/upload timestamp |
| `hardware` | `*HardwareConfig` | Normalized hardware inventory |
| `events` | `[]Event` | Parsed diagnostic event timeline |
| `fru` | `[]FRUInfo` | FRU/SDR-derived inventory details |
| `sensors` | `[]SensorReading` | Sensor readings |
| `raw_payloads` | `map[string]any` | Raw source data used for replay or diagnostics (e.g. `redfish_tree`) |
`raw_payloads` is the durable source for offline re-analysis (especially for Redfish).
Normalized fields should be treated as derivable output from raw source data.
## `HardwareConfig`
Main sections:
```text
HardwareConfig
├── board             BoardInfo        — server/motherboard identity
├── devices           []HardwareDevice — CANONICAL INVENTORY (see below)
├── cpus              []CPU
├── memory            []MemoryDIMM
├── storage           []Storage
├── volumes           []StorageVolume  — logical RAID/VROC volumes
├── pcie_devices      []PCIeDevice
├── gpus              []GPU
├── network_adapters  []NetworkAdapter
├── network_cards     []NIC            — legacy/alternate source field
├── power_supplies    []PSU
└── firmware          []FirmwareInfo
```
---
`network_cards` is legacy/alternate source data.
`hardware.devices` is the canonical cross-section inventory.
## Canonical inventory: `hardware.devices`
`hardware.devices` is the **single source of truth** for device-oriented UI and Reanimator export.
### Required rules
1. All UI tabs displaying hardware components **must read from `hardware.devices`**.
2. The Device Inventory tab shows the kinds `pcie`, `storage`, `gpu`, `network`.
3. The Reanimator exporter **must use the same `hardware.devices`** as the UI; any discrepancy between UI data and export data is a **bug**, not accepted divergence.
4. New shared device attributes must be added to the canonical `HardwareDevice` schema **first**, then mapped to Reanimator/UI, never the other way around.
5. The exporter should group/filter canonical records by section, not rebuild data from multiple sources.
### Deduplication logic (applied once by the repository builder)
| Priority | Key |
|----------|-----|
| 1 | usable `serial_number` (not empty, not `N/A`, `NA`, `NONE`, `NULL`, `UNKNOWN`, `-`) |
| 2 | `bdf` (PCI Bus:Device.Function address) |
| 3 | no merge: records without a usable serial or BDF remain distinct |
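A minimal Go sketch of this key-selection rule. Helper names are illustrative, not the real repository builder API:

```go
package main

import (
	"fmt"
	"strings"
)

// usableSerial reports whether a serial number can serve as a dedup key,
// rejecting the placeholder values listed in the table above.
func usableSerial(s string) bool {
	switch strings.ToUpper(strings.TrimSpace(s)) {
	case "", "N/A", "NA", "NONE", "NULL", "UNKNOWN", "-":
		return false
	}
	return true
}

// dedupKey returns the merge key for a device record, or "" when the
// record must stay distinct (priority 3).
func dedupKey(serial, bdf string) string {
	if usableSerial(serial) {
		return "sn:" + strings.ToUpper(strings.TrimSpace(serial))
	}
	if strings.TrimSpace(bdf) != "" {
		return "bdf:" + bdf
	}
	return "" // no usable key: keep records separate
}

func main() {
	fmt.Println(dedupKey("N/A", "0000:17:00.0")) // serial unusable, falls back to BDF
}
```

Records sharing a non-empty key are merged once by the builder; everything else passes through untouched.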
### Device schema alignment
Keep the `hardware.devices` schema as close as possible to Reanimator JSON field names.
This minimizes translation logic in the exporter and prevents drift.
---
## Raw payloads
`raw_payloads` is authoritative for replayable sources. Current important payloads:
- `redfish_tree`
- `redfish_fetch_errors`
- `source_timezone`
## Source metadata fields (stored directly on `AnalysisResult`)
Carried by both `/api/status` and `/api/config`:
```json
{
  "source_type": "api",
  "protocol": "redfish",
  "target_host": "10.0.0.1",
  "collected_at": "2026-02-10T15:30:00Z"
}
```
Valid `source_type` values: `archive`, `api`
Valid `protocol` values: `redfish`, `ipmi` (empty is allowed for archive uploads)
---
## Raw export package (reopenable artifact)
`Export Raw Data` (`/api/export/json`) does not merely dump `AnalysisResult`; it emits a reopenable raw package (JSON or ZIP bundle) that carries the source data required for re-analysis.
Design rules:
- raw source stays authoritative (`redfish_tree` or original file bytes)
- uploads of raw-export artifacts must re-analyze from raw source
- parsed field snapshots included in bundles are diagnostic artifacts, not the source of truth

Collectors live in `internal/collector/`.
Core files:
- `registry.go` for protocol registration
- `redfish.go` for live collection
- `redfish_replay.go` for replay from raw payloads
- `redfish_replay_gpu.go` for profile-driven GPU replay collectors and GPU fallback helpers
- `redfish_replay_storage.go` for profile-driven storage replay collectors and storage recovery helpers
- `redfish_replay_inventory.go` for replay inventory collectors (PCIe, NIC, BMC MAC, NIC enrichment)
- `redfish_replay_fru.go` for board fallback helpers and Assembly/FRU replay extraction
- `redfish_replay_profiles.go` for profile-driven replay helpers and vendor-aware recovery helpers
- `redfishprofile/` for Redfish profile matching and acquisition/analysis hooks
- `ipmi_mock.go` for the placeholder IPMI implementation
- `types.go` for request/progress contracts
---
## Redfish collector (`redfish`)
**Status:** active production path.
### Request contract (from server)
Passed through from `/api/collect` after validation:
- `host`, `port`, `username`
- `auth_type=password|token` (+ matching credential field)
- `tls_mode=strict|insecure`
- optional `power_on_if_host_off`
### Core rule
Live collection and replay must stay behaviorally aligned.
If the collector adds a fallback, probe, or normalization rule, replay must mirror it.
### Preflight and host power
- `Probe()` may be used before collection to verify API connectivity and current host `PowerState`
- if the host is off and the user chose power-on, the collector may issue `ComputerSystem.Reset`
with `ResetType=On`
- power-on attempts are bounded and logged
- after a successful power-on, the collector waits an extra stabilization window, then checks
`PowerState` again and only starts collection if the host is still on
- if the collector powered on the host itself for collection, it must attempt to power it back off
after collection completes
- if the host was already on before collection, the collector must not power it off afterward
- if power-on fails, collection still continues against the powered-off host
- all power-control decisions and attempts must be visible in the collection log so they are
preserved in raw-export bundles
### Discovery model
The collector does not rely on one fixed vendor tree.
It discovers and follows Redfish resources dynamically from root collections such as:
- `Systems`
- `Chassis`
- `Managers`
### Collected data
| Category | Notes |
|----------|-------|
| CPU | Model, cores, threads, socket, status |
| Memory | DIMM slot, size, type, speed, serial, manufacturer |
| Storage | Slot, type, model, serial, firmware, interface, status |
| GPU | Detected via PCIe class + NVIDIA vendor ID |
| PSU | Model, serial, wattage, firmware, telemetry (input/output power, voltage) |
| NIC | Model, serial, port count, BDF |
| PCIe | Slot, vendor_id, device_id, BDF, link width/speed |
| Firmware | BIOS, BMC versions |
### Profile matching and raw snapshot
After minimal discovery the collector builds `MatchSignals` and selects a Redfish profile mode:
- `matched` when one or more profiles score with high confidence
- `fallback` when vendor/platform confidence is low; in this mode the collector aggregates safe additive profile probes to maximize snapshot completeness
The full Redfish response tree is stored in `result.RawPayloads["redfish_tree"]`, allowing offline re-analysis without re-collecting from a live BMC.
Profile modules may contribute:
- primary acquisition seeds
- bounded `PlanBPaths` for secondary recovery
- critical paths
- acquisition notes/diagnostics
- tuning hints such as snapshot document cap, prefetch behavior, and expensive post-probe toggles
- post-probe policy for numeric collection recovery, direct NVMe `Disk.Bay` recovery, and sensor post-probe enablement
- recovery policy for critical collection member retry, slow numeric plan-B probing, and profile-specific plan-B activation
- scoped path policy for discovered `Systems/*`, `Chassis/*`, and `Managers/*` branches when a profile needs extra seeds/critical targets beyond the vendor-neutral core set
- prefetch policy for which critical paths are eligible for adaptive prefetch and which path shapes are explicitly excluded
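Score-based selection with fallback expansion can be sketched as follows. `Profile`, `MatchSignals`, and the threshold value here are illustrative assumptions, not the real `redfishprofile` API:

```go
package main

import "fmt"

// MatchSignals is a stand-in for the discovery-derived signals described above.
type MatchSignals struct {
	ServiceRootVendor string
}

type Profile struct {
	Name  string
	Score func(MatchSignals) int // confidence 0-100
}

// selectProfiles returns the high-confidence profiles in "matched" mode, or
// every profile in "fallback" mode when no score clears the threshold, so
// safe additive probes from all profiles can be aggregated.
func selectProfiles(all []Profile, sig MatchSignals, threshold int) (mode string, active []Profile) {
	for _, p := range all {
		if p.Score(sig) >= threshold {
			active = append(active, p)
		}
	}
	if len(active) > 0 {
		return "matched", active
	}
	return "fallback", all
}

func main() {
	profiles := []Profile{
		{Name: "xfusion", Score: func(s MatchSignals) int {
			if s.ServiceRootVendor == "xFusion" {
				return 90
			}
			return 0
		}},
		{Name: "generic", Score: func(MatchSignals) int { return 10 }},
	}
	mode, active := selectProfiles(profiles, MatchSignals{ServiceRootVendor: "xFusion"}, 70)
	fmt.Println(mode, len(active)) // matched 1
}
```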
### Unified Redfish analysis pipeline (live == replay)
Model- or topology-specific `CriticalPaths` and profile `PlanBPaths` must live in the profile module that owns the behavior. The collector core may execute those paths, but it should not hardcode vendor-specific recovery targets. The same ownership split applies to:
- expensive post-probe decisions: the core runs bounded post-probe loops; profiles own whether they are enabled for a given platform shape
- critical recovery passes: the core runs bounded plan-B loops; profiles own whether member retry, slow numeric recovery, and profile-specific plan-B passes are enabled
- extra discovered-path branches (e.g. storage controller subtrees): profiles provide them as scoped suffix policy, never as platform-shaped suffixes in the core baseline seed list
- prefetch shaping: the core executes adaptive prefetch; profiles own the include/exclude rules for which critical paths participate
- critical inventory shaping: the core keeps only a minimal vendor-neutral critical baseline; profiles own additional system/chassis/manager critical suffixes and top-level critical targets
Resolved live acquisition plans should be built inside `redfishprofile/`, not by hand in
`redfish.go`. The collector core should receive discovered resources plus the selected profile
plan and then execute the resolved seed/critical paths.
When profile behavior depends on what discovery actually returned, use a post-discovery
refinement hook in `redfishprofile/` instead of hardcoding guessed absolute paths in the static
plan. MSI GPU chassis refinement is the reference example.
LOGPile uses a **single Redfish analyzer path**:
1. The live collector crawls the Redfish API and builds `raw_payloads.redfish_tree`.
2. The parsed result is produced by replaying that tree through the same analyzer used by raw import.
Live Redfish collection must expose profile-match diagnostics:
- collector logs must include the selected modules and score for every known module
- job status responses must carry structured `active_modules` and `module_scores`
- the collect page should render active modules as chips from structured status data, not by parsing log lines
On replay, profile-derived analysis directives may enable vendor-specific inventory linking
helpers such as processor-GPU fallback, chassis-ID alias resolution, and bounded storage recovery.
Replay should now resolve a structured analysis plan inside `redfishprofile/`, analogous to the
live acquisition plan. The replay core may execute collectors against the resolved directives, but
snapshot-aware vendor decisions should live in profile analysis hooks, not in `redfish_replay.go`.
GPU and storage replay executors should consume the resolved analysis plan directly, not a raw
`AnalysisDirectives` struct, so the boundary between planning and execution stays explicit.
This guarantees that live collection and `Export Raw Data` re-open/re-analyze produce the same
normalized output for the same `redfish_tree`.
Profile matching and acquisition tuning must be regression-tested against repo-owned compact
fixtures under `internal/collector/redfishprofile/testdata/`, derived from representative
raw-export snapshots, for at least MSI and Supermicro shapes.
When multiple raw-export snapshots exist for the same platform, profile selection must remain
stable across those sibling fixtures unless the topology actually changes.
Analysis-plan metadata should be stored in replay raw payloads so vendor hook activation is
debuggable offline.
### Stored raw data
Important raw payloads:
- `raw_payloads.redfish_tree`
- `raw_payloads.redfish_fetch_errors`
- `raw_payloads.redfish_profiles`
- `raw_payloads.source_timezone` when available
### Snapshot crawler behavior (important)
The Redfish snapshot crawler is intentionally:
- **bounded** by `LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`
- **prioritized** toward high-value inventory paths (PCIe, Fabrics, FirmwareInventory, Storage, PowerSubsystem, ThermalSubsystem)
- **tolerant** of expected vendor-specific failures (skips noisy expected errors)
- **normalizing**: strips `#fragment` from `@odata.id` values before queueing
Design notes:
- Queue capacity is sized to the snapshot cap to avoid worker deadlocks on large trees.
- UI progress is coarse and human-readable; detailed per-request diagnostics are available via debug logs.
- `LOGPILE_REDFISH_DEBUG=1` and `LOGPILE_REDFISH_SNAPSHOT_DEBUG=1` enable console diagnostics.
### Parsing guidelines
When adding Redfish mappings, follow these principles:
- Support alternate collection paths (resources may appear at different odata URLs).
- Follow `@odata.id` references and handle embedded `Members` arrays.
- Prefer **raw-tree replay compatibility**: if live collector adds a fallback/probe, replay analyzer must mirror it.
- Deduplicate by serial / BDF / slot+model (in that priority order).
- Prefer tolerant/fallback parsing — missing fields should be silently skipped,
not cause the whole collection to fail.
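The `@odata.id` normalization mentioned above (fragment stripping before queueing) can be sketched as a simplified illustration, not the collector's actual normalizer:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeODataID strips a trailing #fragment from an @odata.id value so
// equivalent resources queue under a single key instead of duplicating work.
func normalizeODataID(id string) string {
	if i := strings.IndexByte(id, '#'); i >= 0 {
		return id[:i]
	}
	return id
}

func main() {
	fmt.Println(normalizeODataID("/redfish/v1/Chassis/1/Power#/PowerSupplies/0"))
	// -> /redfish/v1/Chassis/1/Power
}
```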
### Redfish implementation guidance
When changing collection logic:
1. Prefer profile modules over ad-hoc vendor branches in the collector core
2. Keep expensive probing bounded
3. Deduplicate by serial, then BDF, then location/model fallbacks
4. Preserve replay determinism from saved raw payloads
5. Add tests for both the motivating topology and a negative case
### Vendor-specific storage fallbacks (Supermicro and similar)
When standard `Storage/.../Drives` collections are empty, collector/replay may recover drives via:
- `Storage.Links.Enclosures[*] -> .../Drives`
- direct probing of finite `Disk.Bay` candidates (`Disk.Bay.0`, `Disk.Bay0`, `.../0`)
This is required for some BMCs that publish drive inventory in vendor-specific paths while leaving standard collections empty.
### PSU source preference (newer Redfish)
PSU inventory source order:
1. `Chassis/*/PowerSubsystem/PowerSupplies` (preferred on X14+/newer Redfish)
2. `Chassis/*/Power` (legacy fallback)
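The source-order rule amounts to a first-non-empty lookup. The map-based tree access below is illustrative; the real collector walks fetched Redfish documents:

```go
package main

import "fmt"

// pickPSUSource returns the first PSU path that yielded entries, preferring
// the modern PowerSubsystem collection over the legacy Power resource.
func pickPSUSource(entries map[string][]string) (string, []string) {
	for _, path := range []string{
		"Chassis/1/PowerSubsystem/PowerSupplies", // preferred on newer Redfish
		"Chassis/1/Power",                        // legacy fallback
	} {
		if psus := entries[path]; len(psus) > 0 {
			return path, psus
		}
	}
	return "", nil
}

func main() {
	tree := map[string][]string{"Chassis/1/Power": {"PSU1", "PSU2"}}
	path, psus := pickPSUSource(tree)
	fmt.Println(path, len(psus)) // Chassis/1/Power 2
}
```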
### Progress reporting
The collector emits progress log entries at each stage (connecting, enumerating systems, collecting CPUs, etc.) so the UI can display meaningful status.
Current progress message strings are user-facing and may be localized.
---
## IPMI collector (`ipmi`)
**Status:** mock scaffold only — not implemented.
Registered in the collector registry for protocol completeness, but it returns placeholder data and is not a real collection path. Real IPMI support is a future work item.

## Framework
### Registration
Parsers live in `internal/parser/` and vendor implementations live in `internal/parser/vendors/`.
Core behavior:
- each vendor parser registers itself via Go's `init()` side-effect import pattern
- all registered parsers run `Detect()` against the upload
- the highest-confidence parser wins
- the generic fallback stays last and low-confidence
All registrations are collected in `internal/parser/vendors/vendors.go`:
```go
import (
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/dell"
	// etc.
)
```
### VendorParser interface
```go
type VendorParser interface {
	Name() string                     // human-readable name
	Vendor() string                   // vendor identifier string
	Version() string                  // parser version (increment on logic changes)
	Detect(files []ExtractedFile) int // confidence 0-100
	Parse(files []ExtractedFile) (*models.AnalysisResult, error)
}
```
### Selection logic
All registered parsers run `Detect()` against the uploaded archive's file list.
The parser with the **highest confidence score** is selected.
Multiple parsers may return >0; only the top scorer is used.
## Adding a parser
1. Create `internal/parser/vendors/<vendor>/`
2. Start from `internal/parser/vendors/template/parser.go.template`
3. Implement `Detect()` and `Parse()`
4. Add a blank import in `internal/parser/vendors/vendors.go`
5. Add at least one positive and one negative detection test
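The selection rule can be sketched against an illustrative subset of the interface (the `detector` and `stub` types here are for the example only):

```go
package main

import "fmt"

type ExtractedFile struct{ Name string }

// detector is the Detect() subset of the VendorParser interface.
type detector interface {
	Name() string
	Detect(files []ExtractedFile) int
}

// selectParser runs Detect() on every registered parser and returns the
// highest scorer; ties keep the earlier registration.
func selectParser(parsers []detector, files []ExtractedFile) (detector, int) {
	var best detector
	bestScore := 0
	for _, p := range parsers {
		if score := p.Detect(files); score > bestScore {
			best, bestScore = p, score
		}
	}
	return best, bestScore
}

// stub is a trivial detector used for demonstration.
type stub struct {
	name  string
	score int
}

func (s stub) Name() string                { return s.name }
func (s stub) Detect([]ExtractedFile) int  { return s.score }

func main() {
	p, score := selectParser([]detector{stub{"generic", 10}, stub{"dell", 85}}, nil)
	fmt.Println(p.Name(), score) // dell 85
}
```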
### `Detect()` tips
- Look for unique filenames or directory names.
- Check file content for vendor-specific markers.
- Return 70+ only when confident; return 0 if clearly not a match.
### Parser versioning
Each parser file contains a `parserVersion` constant.
Increment the version whenever parsing logic changes — this helps trace which version produced a given result.
---
## Active vendor coverage
| Vendor ID | Input family | Notes |
|-----------|--------------|-------|
| `dell` | TSR ZIP archives | Broad hardware, firmware, sensors, lifecycle events |
| `h3c_g5` | H3C SDS G5 bundles | INI/XML/CSV-driven hardware and event parsing |
| `h3c_g6` | H3C SDS G6 bundles | Similar flow with G6-specific files |
| `inspur` | onekeylog archives | FRU/SDR plus optional Redis enrichment |
| `nvidia` | HGX Field Diagnostics | GPU- and fabric-heavy diagnostic input |
| `nvidia_bug_report` | `nvidia-bug-report-*.log.gz` | dmidecode, lspci, NVIDIA driver sections |
| `unraid` | Unraid diagnostics/log bundles | Server and storage-focused parsing |
| `xigmanas` | XigmaNAS plain logs | FreeBSD/NAS-oriented inventory |
| `generic` | fallback | Low-confidence text fallback when nothing else matches |
---
## Parser data quality rules
### FirmwareInfo — system-level only
`Hardware.Firmware` must contain **only system-level firmware**: BIOS, BMC/iDRAC, Lifecycle Controller, CPLD, storage controllers, BOSS adapters.
**Device-bound firmware** (NIC, GPU, PSU, disk, backplane) **must NOT be added to `Hardware.Firmware`**. It belongs to the device's own `Firmware` field and is already present there. Duplicating it in `Hardware.Firmware` causes double entries in Reanimator.
The Reanimator exporter filters by `FirmwareInfo.DeviceName` prefix and by `FirmwareInfo.Description` (FQDD prefix). Parsers must cooperate:
- Store the device's FQDD (or equivalent slot identifier) in `FirmwareInfo.Description` for all firmware entries that come from a per-device inventory source (e.g. Dell `DCIM_SoftwareIdentity`).
- FQDD prefixes that are device-bound: `NIC.`, `PSU.`, `Disk.`, `RAID.Backplane.`, `GPU.`
### NIC/device model names — strip embedded MAC addresses
Some vendors (confirmed: Dell TSR) embed the MAC address in the device model name field,
e.g. `ProductName = "NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"`.
**Rule:** Strip any ` - XX:XX:XX:XX:XX:XX` suffix from model/name strings before storing
them in `FirmwareInfo.DeviceName`, `NetworkAdapter.Model`, or any other model field.
Use `nicMACInModelRE` (defined in the Dell parser) or an equivalent regex:
```
\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$
```
This applies to **all** string fields used as device names or model identifiers.
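A runnable version of the stripping rule, using the regex above:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nicMACInModelRE matches a trailing " - XX:XX:XX:XX:XX:XX" suffix
// embedded in a device model/name string.
var nicMACInModelRE = regexp.MustCompile(`\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$`)

// stripMACSuffix removes an embedded MAC address suffix from a model name.
func stripMACSuffix(model string) string {
	return strings.TrimSpace(nicMACInModelRE.ReplaceAllString(model, ""))
}

func main() {
	fmt.Println(stripMACSuffix("NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"))
}
```

Note that the regex requires whitespace around the dash, so hyphenated product names like `ConnectX-6` are left intact.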
### PCI device name enrichment via pci.ids
If a PCIe device, GPU, NIC, or any hardware component has a `vendor_id` + `device_id`
but its model/name field is **empty or generic** (e.g. blank, equals the description,
or is just a raw hex ID), the parser **must** attempt to resolve the human-readable
model name from the embedded `pci.ids` database before storing the result.
**Rule:** When `Model` (or equivalent name field) is empty and both `VendorID` and
`DeviceID` are non-zero, call the pciids lookup and use the result as the model name.
```go
// Example pattern — use in any parser that handles PCIe/GPU/NIC devices:
if strings.TrimSpace(device.Model) == "" && device.VendorID != 0 && device.DeviceID != 0 {
	if name := pciids.Lookup(device.VendorID, device.DeviceID); name != "" {
		device.Model = name
	}
}
```
This rule applies to all vendor parsers. The pciids package is available at
`internal/parser/vendors/pciids`. See ADL-005 for the rationale.
**Do not hardcode model name strings.** If a device is unknown today, it will be
resolved automatically once `pci.ids` is updated.
---
## Vendor parsers
### Inspur / Kaytus (`inspur`)
**Status:** Ready. Tested on KR4268X2 (onekeylog format).
**Archive format:** `.tar.gz` onekeylog
**Primary source files:**
| File | Content |
|------|---------|
| `asset.json` | Base hardware inventory |
| `component.log` | Component list |
| `devicefrusdr.log` | FRU and SDR data |
| `onekeylog/runningdata/redis-dump.rdb` | Runtime enrichment (optional) |
**Redis RDB enrichment** (applied conservatively — fills missing fields only):
- GPU: `serial_number`, `firmware` (VBIOS/FW), runtime telemetry
- NIC: firmware, serial, part number (when text logs leave fields empty)
**Module structure:**
```
inspur/
parser.go — main parser + registration
sdr.go — sensor/SDR parsing
fru.go — FRU serial parsing
asset.go — asset.json parsing
syslog.go — syslog parsing
```
---
### Dell TSR (`dell`)
**Status:** Ready (v3.0). Tested on nested TSR archives with embedded `*.pl.zip`.
**Archive format:** `.zip` (outer archive + nested `*.pl.zip`)
**Primary source files:**
- `tsr/metadata.json`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_CIM_Sensor.xml`
- `tsr/hardware/sysinfo/lcfiles/curr_lclog.xml`
**Extracted data:**
- Board/system identity and BIOS/iDRAC firmware
- CPU, memory, physical disks, virtual disks, PSU, NIC, PCIe
- GPU inventory (`DCIM_VideoView`) + GPU sensor enrichment (`DCIM_GPUSensor`)
- Controller/backplane inventory (`DCIM_ControllerView`, `DCIM_EnclosureView`)
- Sensor readings (temperature/voltage/current/power/fan/utilization)
- Lifecycle events (`curr_lclog.xml`)
---
### NVIDIA HGX Field Diagnostics (`nvidia`)
**Status:** Ready (v1.1.0). Works with any server vendor.
**Archive format:** `.tar` / `.tar.gz`
**Confidence scoring:**
| File | Score |
|------|-------|
| `unified_summary.json` with "HGX Field Diag" marker | +40 |
| `summary.json` | +20 |
| `summary.csv` | +15 |
| `gpu_fieldiag/` directory | +15 |
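Additive scoring like the table above can be sketched as follows; the exact file checks are illustrative, not the parser's real implementation:

```go
package main

import (
	"fmt"
	"strings"
)

type ExtractedFile struct {
	Name    string
	Content string
}

// detectHGX mirrors the additive confidence table for HGX field diagnostics,
// capping the result at 100.
func detectHGX(files []ExtractedFile) int {
	score := 0
	for _, f := range files {
		switch {
		case f.Name == "unified_summary.json" && strings.Contains(f.Content, "HGX Field Diag"):
			score += 40
		case f.Name == "summary.json":
			score += 20
		case f.Name == "summary.csv":
			score += 15
		case strings.Contains(f.Name, "gpu_fieldiag/"):
			score += 15
		}
	}
	if score > 100 {
		score = 100
	}
	return score
}

func main() {
	files := []ExtractedFile{
		{Name: "unified_summary.json", Content: "HGX Field Diag v2"},
		{Name: "summary.csv"},
	}
	fmt.Println(detectHGX(files)) // 55
}
```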
**Source files:**
| File | Content |
|------|---------|
| `output.log` | dmidecode — server manufacturer, model, serial number |
| `unified_summary.json` | GPU details, NVSwitch devices, PCI addresses |
| `summary.json` | Diagnostic test results and error codes |
| `summary.csv` | Alternative test results format |
**Extracted data:**
- GPUs: slot, model, manufacturer, firmware (VBIOS), BDF
- NVSwitch devices: slot, device_class, vendor_id, device_id, BDF, link speed/width
- Events: diagnostic test failures (connectivity, gpumem, gpustress, pcie, nvlink, nvswitch, power)
**Severity mapping:**
- `info` — tests passed
- `warning` — e.g. "Row remapping failed"
- `critical` — error codes 300+
**Known limitations:**
- Detailed logs in `gpu_fieldiag/*.log` are not parsed.
- No CPU, memory, or storage extraction (not present in field diag archives).
---
### NVIDIA Bug Report (`nvidia_bug_report`)
**Status:** Ready (v1.0.0).
**File format:** `nvidia-bug-report-*.log.gz` (gzip-compressed text)
**Confidence:** 85 (high priority for matching filename pattern)
**Source sections parsed:**
| dmidecode section | Extracts |
|-------------------|---------|
| System Information | server serial, UUID, manufacturer, product name |
| Processor Information | CPU model, serial, core/thread count, frequency |
| Memory Device | DIMM slot, size, type, manufacturer, serial, part number, speed |
| System Power Supply | PSU location, manufacturer, model, serial, wattage, firmware, status |
| Other source | Extracts |
|--------------|---------|
| `lspci -vvv` (Ethernet/Network/IB) | NIC model (from VPD), BDF, slot, P/N, S/N, port count, port type |
| `/proc/driver/nvidia/gpus/*/information` | GPU model, BDF, UUID, VBIOS version, IRQ |
| NVRM version line | NVIDIA driver version |
**Known limitations:**
- Driver error/warning log lines not yet extracted.
- GPU temperature/utilization metrics require additional parsing sections.
---
### XigmaNAS (`xigmanas`)
**Status:** Ready.
**Archive format:** Plain log files (FreeBSD-based NAS system)
**Detection:** Files named `xigmanas`, `system`, or `dmesg`; content containing "XigmaNAS" or "FreeBSD"; SMART data presence.
**Extracted data:**
- System: firmware version, uptime, CPU model, memory configuration, hardware platform
- Storage: disk models, serial numbers, capacity, health, SMART temperatures
- Populates: `Hardware.Firmware`, `Hardware.CPUs`, `Hardware.Memory`, `Hardware.Storage`, `Sensors`
---
### Unraid (`unraid`)
**Status:** Ready (v1.0.0).
**Archive format:** Unraid diagnostics archive contents (text-heavy diagnostics directories).
---
## Practical guidance
- Be conservative with high detect scores
- Prefer filling missing fields over overwriting stronger source data
- Keep parser version constants current when behavior changes
- Any new vendor-specific filtering or dedup logic must ship with tests for that vendor format

# 07 — Exporters & Reanimator Integration
## Export endpoints summary
| Endpoint | Format | Filename pattern |
|----------|--------|-----------------|
| `GET /api/export/csv` | CSV — serial numbers | `YYYY-MM-DD (MODEL) - SN.csv` |
| `GET /api/export/json` | **Raw export package** (JSON or ZIP bundle) for reopen/re-analysis | `YYYY-MM-DD (MODEL) - SN.(json|zip)` |
| `GET /api/export/reanimator` | Reanimator hardware JSON | `YYYY-MM-DD (MODEL) - SN.json` |
---
## Raw Export (`Export Raw Data`)
### Purpose
Preserve enough source data to reproduce parsing later after parser fixes, without requiring
another live collection from the target system.
### Format
`/api/export/json` returns a **raw export package**:
- JSON package (machine-readable), or
- ZIP bundle containing:
- `raw_export.json` — machine-readable package
- `collect.log` — human-readable collection + parsing summary
- `parser_fields.json` — structured parsed field snapshot for diffs between parser versions
### Import / reopen behavior
When a raw export package is uploaded back into LOGPile:
- the app **re-analyzes from raw source**
- it does **not** trust embedded parsed output as source of truth
For Redfish, this means replay from `raw_payloads.redfish_tree`.
### Design rule
Raw export is a **re-analysis artifact**, not a final report dump. Keep it self-contained and
forward-compatible where possible (versioned package format, additive fields only).
---
## Reanimator Export
### Purpose
Exports hardware inventory data in the format expected by the Reanimator asset tracking
system. Enables one-click push from LOGPile to an external asset management platform.
### Implementation files
| File | Role |
|------|------|
| `internal/exporter/reanimator_models.go` | Go structs for Reanimator JSON |
| `internal/exporter/reanimator_converter.go` | `ConvertToReanimator()` and helpers |
| `internal/server/handlers.go` | `handleExportReanimator()` HTTP handler |
### Conversion rules
- Source: canonical `hardware.devices` repository (see [`04-data-models.md`](04-data-models.md))
- CPU manufacturer inferred from model string (Intel / AMD / ARM / Ampere)
- PCIe serial number generated when absent: `{board_serial}-PCIE-{slot}`
- Status values normalized to: `OK`, `Warning`, `Critical`, `Unknown` (`Empty` only for memory slots)
- Timestamps in RFC3339 format
- `target_host` derived from `filename` field (`redfish://…`, `ipmi://…`) if not in source; omitted if undeterminable
- `board.manufacturer` and `board.product_name` values of `"NULL"` treated as absent
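Two of the rules above (inferred CPU manufacturer, generated PCIe serial) can be sketched with hypothetical helper names; the real converter lives in `reanimator_converter.go`:

```go
package main

import (
	"fmt"
	"strings"
)

// inferCPUManufacturer applies the model-string inference rule
// (Intel / AMD / ARM / Ampere).
func inferCPUManufacturer(model string) string {
	m := strings.ToUpper(model)
	switch {
	case strings.Contains(m, "INTEL"):
		return "Intel"
	case strings.Contains(m, "AMD"):
		return "AMD"
	case strings.Contains(m, "AMPERE"):
		return "Ampere"
	case strings.Contains(m, "ARM"):
		return "ARM"
	}
	return ""
}

// pcieSerial generates the fallback serial {board_serial}-PCIE-{slot}
// when a PCIe device has no serial of its own.
func pcieSerial(boardSerial, slot string) string {
	return fmt.Sprintf("%s-PCIE-%s", boardSerial, slot)
}

func main() {
	fmt.Println(inferCPUManufacturer("INTEL(R) XEON(R) GOLD 6530")) // Intel
	fmt.Println(pcieSerial("21D634101", "SLOT2"))                   // 21D634101-PCIE-SLOT2
}
```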
### LOGPile → Reanimator field mapping
| LOGPile type | Reanimator section | Notes |
|---|---|---|
| `BoardInfo` | `board` | Direct mapping |
| `CPU` | `cpus` | + manufacturer (inferred) |
| `MemoryDIMM` | `memory` | Direct; empty slots included (`present=false`) |
| `Storage` | `storage` | Excluded if no `serial_number` |
| `PCIeDevice` | `pcie_devices` | Serial generated if missing |
| `GPU` | `pcie_devices` | `device_class=DisplayController` |
| `NetworkAdapter` | `pcie_devices` | `device_class=NetworkController` |
| `PSU` | `power_supplies` | Excluded if no serial or `present=false` |
| `FirmwareInfo` | `firmware` | Direct mapping |
### Inclusion / exclusion rules
**Included:**
- Memory slots with `present=false` (as Empty slots)
- PCIe devices without serial number (serial is generated)
**Excluded:**
- Storage without `serial_number`
- PSU without `serial_number` or with `present=false`
- NetworkAdapters with `present=false`
---
## Reanimator Integration Guide
This section documents the Reanimator receiver-side JSON format (what the Reanimator
system expects when it ingests a LOGPile export).
> **Important:** The Reanimator endpoint uses a strict JSON decoder (`DisallowUnknownFields`).
> Any unknown field — including nested ones — causes `400 Bad Request`.
> Use only `snake_case` keys listed here.
### Top-level structure
```json
{
"filename": "redfish://10.10.10.103",
"source_type": "api",
"protocol": "redfish",
"target_host": "10.10.10.103",
"collected_at": "2026-02-10T15:30:00Z",
"hardware": {
"board": {...},
"firmware": [...],
"cpus": [...],
"memory": [...],
"storage": [...],
"pcie_devices": [...],
"power_supplies": [...]
}
}
```
**Required:** `collected_at`, `hardware.board.serial_number`
**Optional:** `target_host`, `source_type`, `protocol`, `filename`
`source_type` values: `api`, `logfile`, `manual`
`protocol` values: `redfish`, `ipmi`, `snmp`, `ssh`
### Component status fields (all component sections)
Each component may carry:
| Field | Type | Description |
|-------|------|-------------|
| `status` | string | `OK`, `Warning`, `Critical`, `Unknown`, `Empty` |
| `status_checked_at` | RFC3339 | When status was last verified |
| `status_changed_at` | RFC3339 | When status last changed |
| `status_at_collection` | object | `{ "status": "...", "at": "..." }` — snapshot-time status |
| `status_history` | array | `[{ "status": "...", "changed_at": "...", "details": "..." }]` |
| `error_description` | string | Human-readable error for Warning/Critical |
### Board
```json
{
"board": {
"manufacturer": "Supermicro",
"product_name": "X12DPG-QT6",
"serial_number": "21D634101",
"part_number": "X12DPG-QT6-REV1.01",
"uuid": "d7ef2fe5-2fd0-11f0-910a-346f11040868"
}
}
```
`serial_number` is required. `manufacturer` / `product_name` values of `"NULL"` are treated as absent.
### CPUs
```json
{
"socket": 0,
"model": "INTEL(R) XEON(R) GOLD 6530",
"cores": 32,
"threads": 64,
"frequency_mhz": 2100,
"max_frequency_mhz": 4000,
"manufacturer": "Intel",
"status": "OK"
}
```
`socket` (int) and `model` required. Serial generated: `{board_serial}-CPU-{socket}`.
LOT format: `CPU_{VENDOR}_{MODEL_NORMALIZED}` → e.g. `CPU_INTEL_XEON_GOLD_6530`
### Memory
```json
{
"slot": "CPU0_C0D0",
"location": "CPU0_C0D0",
"present": true,
"size_mb": 32768,
"type": "DDR5",
"max_speed_mhz": 4800,
"current_speed_mhz": 4800,
"manufacturer": "Hynix",
"serial_number": "80AD032419E17CEEC1",
"part_number": "HMCG88AGBRA191N",
"status": "OK"
}
```
`slot` and `present` required. `serial_number` required when `present=true`.
Empty slots (`present=false`, `status="Empty"`) are included but no component created.
LOT format: `DIMM_{TYPE}_{SIZE_GB}GB` → e.g. `DIMM_DDR5_32GB`
### Storage
```json
{
"slot": "OB01",
"type": "NVMe",
"model": "INTEL SSDPF2KX076T1",
"size_gb": 7680,
"serial_number": "BTAX41900GF87P6DGN",
"manufacturer": "Intel",
"firmware": "9CV10510",
"interface": "NVMe",
"present": true,
"status": "OK"
}
```
`slot`, `model`, `serial_number`, `present` required.
LOT format: `{TYPE}_{INTERFACE}_{SIZE_TB}TB` → e.g. `SSD_NVME_07.68TB`
### Power Supplies
```json
{
"slot": "0",
"present": true,
"model": "GW-CRPS3000LW",
"vendor": "Great Wall",
"wattage_w": 3000,
"serial_number": "2P06C102610",
"part_number": "V0310C9000000000",
"firmware": "00.03.05",
"status": "OK",
"input_power_w": 137,
"output_power_w": 104,
"input_voltage": 215.25
}
```
`slot`, `present` required. `serial_number` required when `present=true`.
Telemetry fields (`input_power_w`, `output_power_w`, `input_voltage`) stored in observation only.
LOT format: `PSU_{WATTAGE}W_{VENDOR_NORMALIZED}` → e.g. `PSU_3000W_GREAT_WALL`
### PCIe Devices
```json
{
"slot": "PCIeCard1",
"vendor_id": 32902,
"device_id": 2912,
"bdf": "0000:18:00.0",
"device_class": "MassStorageController",
"manufacturer": "Intel",
"model": "RAID Controller RSP3DD080F",
"link_width": 8,
"link_speed": "Gen3",
"max_link_width": 8,
"max_link_speed": "Gen3",
"serial_number": "RAID-001-12345",
"firmware": "50.9.1-4296",
"status": "OK"
}
```
`slot` required. Serial generated if absent: `{board_serial}-PCIE-{slot}`.
`device_class` values: `NetworkController`, `MassStorageController`, `DisplayController`, etc.
LOT format: `PCIE_{DEVICE_CLASS}_{MODEL_NORMALIZED}` → e.g. `PCIE_NETWORK_CONNECTX5`
### Firmware
```json
[
{ "device_name": "BIOS", "version": "06.08.05" },
{ "device_name": "BMC", "version": "5.17.00" }
]
```
Both fields required. Changes trigger `FIRMWARE_CHANGED` timeline events.
---
### Import process (Reanimator side)
1. Validate `collected_at` (RFC3339) and `hardware.board.serial_number`.
2. Find or create Asset by `board.serial_number` → `vendor_serial`.
3. For each component: filter `present=false`, auto-determine LOT, find or create Component,
create Observation, update Installations.
4. Detect removed components (present in previous snapshot, absent in current) → close Installation.
5. Generate timeline events: `LOG_COLLECTED`, `INSTALLED`, `REMOVED`, `FIRMWARE_CHANGED`.
**Idempotency:** Repeated import of the same snapshot (same content hash) returns `200 OK`
with `"duplicate": true` and does not create duplicate records.
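A content-hash dedupe check of this kind can be sketched as below. The hashing scheme is illustrative only — the actual Reanimator implementation may canonicalize the JSON (key order, whitespace) before hashing:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentHash derives a dedupe key from the raw snapshot bytes.
// Illustrative: the real service may normalize the payload first.
func contentHash(payload []byte) string {
	sum := sha256.Sum256(payload)
	return hex.EncodeToString(sum[:])
}

// seenBefore emulates the duplicate check against stored bundle hashes.
func seenBefore(store map[string]bool, payload []byte) bool {
	h := contentHash(payload)
	if store[h] {
		return true // same content hash → respond 200 with "duplicate": true
	}
	store[h] = true // first import → proceed, respond 201
	return false
}

func main() {
	store := map[string]bool{}
	snap := []byte(`{"collected_at":"2026-02-10T15:30:00Z"}`)
	fmt.Println(seenBefore(store, snap)) // false — first import
	fmt.Println(seenBefore(store, snap)) // true — duplicate
}
```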
### Reanimator API endpoint
```http
POST /ingest/hardware
Content-Type: application/json
```
**Success (201):**
```json
{
"status": "success",
"bundle_id": "lb_01J...",
"asset_id": "mach_01J...",
"collected_at": "2026-02-10T15:30:00Z",
"duplicate": false,
"summary": {
"parts_observed": 15,
"parts_created": 2,
"installations_created": 2,
"timeline_events_created": 9
}
}
```
**Duplicate (200):**
```json
{ "status": "success", "duplicate": true, "message": "LogBundle with this content hash already exists" }
```
**Error (400):**
```json
{ "status": "error", "error": "validation_failed", "details": { "field": "...", "message": "..." } }
```
Common `400` causes:
- Unknown JSON field (strict decoder)
- Wrong key name (e.g. `targetHost` instead of `target_host`)
- Invalid `collected_at` format (must be RFC3339)
- Empty `hardware.board.serial_number`
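On the client side, the three response shapes above can be distinguished with a small classifier. This is a hypothetical helper (not part of the LOGPile codebase), using only field names shown in the contract examples:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ingestResult holds the subset of response fields a client branches on.
type ingestResult struct {
	Status    string `json:"status"`
	Duplicate bool   `json:"duplicate"`
	Error     string `json:"error"`
}

// classifyIngestResponse maps an HTTP status + body to a client action.
// Hypothetical sketch for illustration.
func classifyIngestResponse(httpStatus int, body []byte) string {
	var r ingestResult
	if err := json.Unmarshal(body, &r); err != nil {
		return "unreadable"
	}
	switch {
	case httpStatus == 201:
		return "created"
	case httpStatus == 200 && r.Duplicate:
		return "duplicate"
	case httpStatus == 400:
		return "rejected: " + r.Error
	default:
		return "unexpected"
	}
}

func main() {
	fmt.Println(classifyIngestResponse(201, []byte(`{"status":"success","duplicate":false}`)))
	fmt.Println(classifyIngestResponse(200, []byte(`{"status":"success","duplicate":true}`)))
	fmt.Println(classifyIngestResponse(400, []byte(`{"status":"error","error":"validation_failed"}`)))
}
```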
### LOT normalization rules
1. Remove special chars `( ) - ® ™`; replace spaces with `_`
2. Uppercase all
3. Collapse multiple underscores to one
4. Strip common prefixes like `MODEL:`, `PN:`
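The four rules can be sketched as a single function. Note one assumption beyond the listed rules: the CPU example (`CPU_INTEL_XEON_GOLD_6530`) implies that `(R)` / `(TM)` marks are removed as whole sequences, not just the parentheses, so this sketch strips them first. Not the exact Reanimator implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeLOT sketches the LOT normalization rules above.
// Illustrative only; the server-side implementation may differ.
func normalizeLOT(s string) string {
	u := strings.ToUpper(strings.TrimSpace(s))
	// Rule 4: strip common prefixes.
	for _, p := range []string{"MODEL:", "PN:"} {
		u = strings.TrimSpace(strings.TrimPrefix(u, p))
	}
	// Rule 1: drop trademark marks first, then remaining special chars.
	for _, junk := range []string{"(R)", "(TM)", "®", "™", "(", ")", "-"} {
		u = strings.ReplaceAll(u, junk, "")
	}
	u = strings.ReplaceAll(u, " ", "_")
	// Rule 3: collapse repeated underscores.
	for strings.Contains(u, "__") {
		u = strings.ReplaceAll(u, "__", "_")
	}
	return strings.Trim(u, "_")
}

func main() {
	fmt.Println(normalizeLOT("INTEL(R) XEON(R) GOLD 6530")) // INTEL_XEON_GOLD_6530
	fmt.Println(normalizeLOT("GW-CRPS3000LW"))              // GWCRPS3000LW
	fmt.Println(normalizeLOT("MODEL: Great Wall"))          // GREAT_WALL
}
```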
### Status values
| Value | Meaning | Action |
|-------|---------|--------|
| `OK` | Normal | — |
| `Warning` | Degraded | Create `COMPONENT_WARNING` event (optional) |
| `Critical` | Failed | Auto-create `failure_event`, create `COMPONENT_FAILED` event |
| `Unknown` | Not determinable | Treat as working |
| `Empty` | Slot unpopulated | No component created (memory/PCIe only) |
### Missing field handling
| Field | Fallback |
|-------|---------|
| CPU serial | Generated: `{board_serial}-CPU-{socket}` |
| PCIe serial | Generated: `{board_serial}-PCIE-{slot}` |
| Other serial | Component skipped if absent |
| manufacturer (PCIe) | Looked up from `vendor_id` (8086→Intel, 10de→NVIDIA, 15b3→Mellanox…) |
| status | Treated as `Unknown` |
| firmware | No `FIRMWARE_CHANGED` event |
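The fallback rules in the table can be sketched as follows. The vendor map is abbreviated to the three examples listed above; a full implementation would consult the embedded `pci.ids` data:

```go
package main

import "fmt"

// pciVendorNames holds the vendor-ID fallbacks named in the table
// (abbreviated; real code would resolve against pci.ids).
var pciVendorNames = map[uint16]string{
	0x8086: "Intel",
	0x10de: "NVIDIA",
	0x15b3: "Mellanox",
}

// fallbackSerial builds the generated serial for components that may
// legitimately lack one: CPU keyed by socket, PCIe keyed by slot.
func fallbackSerial(boardSerial, kind, locator string) string {
	return fmt.Sprintf("%s-%s-%s", boardSerial, kind, locator)
}

func main() {
	fmt.Println(fallbackSerial("21D634101", "CPU", "0"))          // 21D634101-CPU-0
	fmt.Println(fallbackSerial("21D634101", "PCIE", "PCIeCard1")) // 21D634101-PCIE-PCIeCard1
	fmt.Println(pciVendorNames[0x15b3])                           // Mellanox
}
```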
# 07 — Exporters
## Export surfaces
| Endpoint | Output | Purpose |
|----------|--------|---------|
| `GET /api/export/csv` | CSV | Serial-number export |
| `GET /api/export/json` | raw-export ZIP bundle | Reopen and re-analyze later |
| `GET /api/export/reanimator` | JSON | Reanimator hardware payload |
| `POST /api/convert` | async ZIP artifact | Batch archive-to-Reanimator conversion |
## Raw export
Raw export is not a final report dump.
It is a replayable artifact that preserves enough source data for future parser improvements.
Current bundle contents:
- `raw_export.json`
- `collect.log`
- `parser_fields.json`
Design rules:
- raw source is authoritative
- uploads of raw export must replay from raw source
- parsed snapshots inside the bundle are diagnostic only
## Reanimator export
Implementation files:
- `internal/exporter/reanimator_models.go`
- `internal/exporter/reanimator_converter.go`
- `internal/server/handlers.go`
- `bible-local/docs/hardware-ingest-contract.md`
Conversion rules:
- canonical source is merged canonical inventory derived from `hardware.devices` plus legacy hardware slices
- output must conform to the strict Reanimator ingest contract in `docs/hardware-ingest-contract.md`
- local mirror currently tracks upstream contract `v2.7`
- timestamps are RFC3339
- status is normalized to Reanimator-friendly values
- missing component serial numbers must stay absent; LOGPile must not synthesize fake serials for Reanimator export
- CPU `firmware` field means CPU microcode, not generic processor firmware inventory
- `NULL`-style board manufacturer/product values are treated as absent
- optional component telemetry/health fields are exported when LOGPile already has the data
- partial `hardware.devices` must not suppress components still present only in legacy parser/collector fields
- `present` is not serialized for exported components; presence is expressed by the existence of the component record itself
- Reanimator ingest may apply its own server-side fallback serial rules for CPU and PCIe when LOGPile leaves serials absent
## Inclusion rules
Included:
- PCIe-class devices when the component itself is present, even if serial number is missing
- contract `v2.7` component telemetry and health fields when source data exists
- hardware sensors grouped into `fans`, `power`, `temperatures`, `other` only when the sensor has a real numeric reading
- sensor `location` is not exported; LOGPile keeps only sensor `name` plus measured values and status
- Redfish linked metric docs that carry component telemetry: `ProcessorMetrics`, `MemoryMetrics`, `DriveMetrics`, `EnvironmentMetrics`, `Metrics`
- `pcie_devices.slot` is treated as the canonical PCIe address; `bdf` is used only as an internal fallback/dedupe key and is not serialized in the payload
- `event_logs` are exported only from normalized parser/collector events that can be mapped to contract sources `host` / `bmc` / `redfish` without synthesizing content
- `manufactured_year_week` is exported only as a reliable passthrough when the parser/collector already extracted a valid `YYYY-Www` value
Excluded:
- storage endpoints from `pcie_devices`; disks and NVMe drives export only through `hardware.storage`
- fake serial numbers for PCIe-class devices; any fallback serial generation belongs to Reanimator ingest, not LOGPile
- sensors without a real numeric reading
- events with internal-only or unmappable sources such as LOGPile internal warnings
- memory with missing serial number
- memory with `present=false` or `status=Empty`
- CPUs with `present=false`
- storage without `serial_number`
- storage with `present=false`
- power supplies without `serial_number`
- power supplies with `present=false`
- non-present network adapters
- non-present PCIe / GPU devices
- device-bound firmware duplicated at top-level firmware list
- any field not present in the strict ingest contract
## Batch convert
`POST /api/convert` accepts multiple supported files and produces a ZIP with:
- one `*.reanimator.json` file per successful input
- `convert-summary.txt`
Behavior:
- unsupported filenames are skipped
- each file is parsed independently
- one bad file must not fail the whole batch if at least one conversion succeeds
- result artifact is temporary and deleted after download
## CSV export
`GET /api/export/csv` uses the same merged canonical inventory as Reanimator export,
with legacy network-card fallback kept only for records that still have no canonical device match.

Defined in `cmd/logpile/main.go`:
| Flag | Default | Purpose |
|------|---------|---------|
| `--port` | `8082` | HTTP server port |
| `--file` | empty | Preload archive file |
| `--version` | `false` | Print version and exit |
| `--no-browser` | `false` | Do not auto-open browser |
| `--hold-on-crash` | `true` on Windows | Keep console open after fatal crash |
## Common commands
```bash
make build            # local binary for current OS/arch → bin/logpile
make build-all        # cross-platform binaries
make test
make fmt
make update-pci-ids   # manual pci.ids sync
```
Notes:
- `make build` outputs `bin/logpile`
- `make build-all` builds the supported cross-platform binaries: `bin/logpile-{linux,darwin}-{amd64,arm64}` and `bin/logpile-windows-amd64.exe`
- `make build` and `make build-all` run `scripts/update-pci-ids.sh --best-effort` before compilation to sync `pci.ids` from the submodule, unless `SKIP_PCI_IDS_UPDATE=1` is set
- builds use `CGO_ENABLED=0`: fully static binaries, no C runtime dependency
## PCI IDs
Source submodule: `third_party/pciids` (git submodule → `github.com/pciutils/pciids`)
Embedded copy at build time: `internal/parser/vendors/pciids/pci.ids`
Typical setup after a fresh clone:
```bash
git submodule update --init third_party/pciids
```
## Release script
Run:
```bash
./scripts/release.sh
```
Current behavior:
1. Reads version from `git describe --tags`
2. Refuses a dirty tree unless `ALLOW_DIRTY=1`
3. Sets stable Go cache/toolchain environment
4. Creates `releases/{VERSION}/`
5. Creates `releases/{VERSION}/RELEASE_NOTES.md` from a template if missing
6. Builds `darwin-arm64` and `windows-amd64`
7. Packages any already-present binaries from `bin/` as `.tar.gz` / `.zip`
8. Generates `SHA256SUMS.txt`
9. Prints next steps (tag, push, create release manually)
Important limitation:
- `scripts/release.sh` does not run `make build-all` for you
- if you want Linux or additional macOS archives in the release directory, build them before running the script
Toolchain note:
- `scripts/release.sh` defaults `GOTOOLCHAIN=local` to use the already installed Go toolchain and avoid implicit network downloads during release builds
- if you intentionally want another toolchain, pass it explicitly, for example `GOTOOLCHAIN=go1.24.0 ./scripts/release.sh`
## Run locally
```bash
./bin/logpile
./bin/logpile --port 9090
./bin/logpile --no-browser
./bin/logpile --version
./bin/logpile --hold-on-crash # keep console open on crash (default on Windows)
```
## macOS Gatekeeper
After downloading a binary, remove the quarantine attribute:
```bash
xattr -d com.apple.quarantine /path/to/logpile-darwin-arm64
```

# 09 — Testing
## Baseline
Required before merge:
```bash
go test ./...
```
## Where to add tests
| Area | Location |
|------|----------|
| Collectors and replay | `internal/collector/*_test.go` |
| HTTP handlers and jobs | `internal/server/*_test.go` |
| Exporters | `internal/exporter/*_test.go` |
| Vendor parsers | `internal/parser/vendors/<vendor>/*_test.go` |
## General rules
- Prefer table-driven tests
- No network access in unit tests
- Cover happy path and realistic failure/partial-data cases (missing fields, empty collections)
- New vendor parsers need both detection and parse coverage: a `Detect()` test with positive and negative sample file lists, and a `Parse()` test with a minimal but representative archive
The Reanimator exporter has dedicated coverage:
| Test file | Coverage |
|-----------|----------|
| `reanimator_converter_test.go` | Unit tests per conversion function |
| `reanimator_integration_test.go` | Full export with realistic `AnalysisResult` |
## Mandatory coverage for dedup/filter/classify logic
Any new deduplication, filtering, or classification function must have:
1. A true-positive case
2. A true-negative case
3. A regression case for the vendor or topology that motivated the change
This is mandatory for inventory logic, firmware filtering, and similar code paths where silent data drift is likely.
## Mandatory coverage for expensive path selection
Any function that decides whether to crawl or probe an expensive path must have:
1. A positive selection case
2. A negative exclusion case
3. A topology-level count/integration case
The goal is to catch runaway I/O regressions before they ship.
## Useful focused commands
```bash
go test ./internal/exporter/...
go test ./internal/exporter/... -v -run Reanimator
go test ./internal/exporter/... -cover
go test ./internal/collector/...
go test ./internal/server/...
go test ./internal/parser/vendors/...
```

---
## ADL-018 — NVMe bay probe must be restricted to storage-capable chassis types
**Date:** 2026-03-12
**Context:** `shouldAdaptiveNVMeProbe` was introduced in `2fa4a12` to recover NVMe drives on
Supermicro BMCs that expose empty `Drives` collections but serve disks at direct `Disk.Bay.N`
paths. The function returns `true` for any chassis with an empty `Members` array. On
Supermicro HGX systems (SYS-A21GE-NBRT and similar) ~35 sub-chassis (GPU, NVSwitch,
PCIeRetimer, ERoT, IRoT, BMC, FPGA) all carry `ChassisType=Module/Component/Zone` and
expose empty `/Drives` collections. Without filtering, each triggered 384 HTTP requests →
13 440 requests ≈ 22 minutes of pure I/O waste per collection.
**Decision:** Before probing `Disk.Bay.N` candidates for a chassis, check its `ChassisType`
via `chassisTypeCanHaveNVMe`. Skip if type is `Module`, `Component`, or `Zone`. Keep probing
for `Enclosure`, `RackMount`, and any unrecognised type (fail-safe).
**Consequences:**
- On HGX systems post-probe NVMe goes from ~22 min to effectively zero.
- NVMe backplane recovery (`Enclosure` type) is unaffected.
- Any new chassis type that hosts NVMe storage is covered by the default `true` path.
- `chassisTypeCanHaveNVMe` and the candidate-selection loop must have unit tests covering
both the excluded types and the storage-capable types (see `TestChassisTypeCanHaveNVMe`
and `TestNVMePostProbeSkipsNonStorageChassis`).
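The decision above implies a predicate along these lines — a sketch of the described rule with the fail-safe default, not the actual source, shown in the table-driven shape the ADL's test requirement calls for:

```go
package main

import "fmt"

// chassisTypeCanHaveNVMe sketches the ADL-018 rule: skip known
// non-storage sub-chassis types, keep probing everything else
// (fail-safe default of true for unrecognised types).
func chassisTypeCanHaveNVMe(chassisType string) bool {
	switch chassisType {
	case "Module", "Component", "Zone":
		return false // HGX sub-chassis: GPU, NVSwitch, ERoT, etc.
	default:
		return true // Enclosure, RackMount, and anything unknown
	}
}

func main() {
	// Table-driven check mirroring the required test shape.
	cases := []struct {
		in   string
		want bool
	}{
		{"Module", false},
		{"Component", false},
		{"Zone", false},
		{"Enclosure", true},
		{"RackMount", true},
		{"SomeFutureType", true}, // fail-safe path
	}
	for _, c := range cases {
		got := chassisTypeCanHaveNVMe(c.in)
		fmt.Printf("%-16s → %v (want %v)\n", c.in, got, c.want)
	}
}
```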
## ADL-019 — Redfish post-probe recovery is profile-owned acquisition policy
**Date:** 2026-03-18
**Context:** Numeric collection post-probe and direct NVMe `Disk.Bay` recovery were still
controlled by collector-core heuristics, which kept platform-specific acquisition behavior in
`redfish.go` and made vendor/topology refactoring incomplete.
**Decision:** Move expensive Redfish post-probe enablement into profile-owned acquisition policy.
The collector core may execute bounded post-probe loops, but profiles must explicitly enable:
- numeric collection post-probe
- direct NVMe `Disk.Bay` recovery
- sensor collection post-probe
**Consequences:**
- Generic collector flow no longer implicitly turns on storage/NVMe recovery for every platform.
- Supermicro-specific direct NVMe recovery and generic numeric collection recovery are now
regression-tested through profile fixtures.
- Future platform storage/post-probe behavior must be added through profile tuning, not new
vendor-shaped `if` branches in collector core.
## ADL-020 — Redfish critical plan-B activation is profile-owned recovery policy
**Date:** 2026-03-18
**Context:** `critical plan-B` and `profile plan-B` were still effectively always-on collector
behavior once paths were present, including critical collection member retry and slow numeric
child probing. That kept acquisition recovery semantics in `redfish.go` instead of the profile
layer.
**Decision:** Move plan-B activation into profile-owned recovery policy. Profiles must explicitly
enable:
- critical collection member retry
- slow numeric probing during critical plan-B
- profile-specific plan-B pass
**Consequences:**
- Recovery behavior is now observable in raw Redfish diagnostics alongside other tuning.
- Generic/fallback recovery remains available through profile policy instead of implicit collector
defaults.
- Future platform-specific plan-B behavior must be introduced through profile tuning and tests,
not through new unconditional collector branches.
## ADL-021 — Extra discovered-path storage seeds must be profile-scoped, not core-baseline
**Date:** 2026-03-18
**Context:** The collector core baseline seed list still contained storage-specific discovered-path
suffixes such as `SimpleStorage` and `Storage/IntelVROC/*`. These are useful on some platforms,
but they are acquisition extensions layered on top of discovered `Systems/*` resources, not part
of the minimal vendor-neutral Redfish baseline.
**Decision:** Move such discovered-path expansions into profile-owned scoped path policy. The
collector core keeps the vendor-neutral baseline; profiles may add extra system/chassis/manager
suffixes that are expanded over discovered members during acquisition planning.
**Consequences:**
- Platform-shaped storage discovery no longer lives in `redfish.go` baseline seed construction.
- Extra discovered-path branches are visible in plan diagnostics and fixture regression tests.
- Future model/vendor storage path expansions must be added through scoped profile policy instead
of editing the shared baseline seed list.
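Such an expansion over discovered members might look like the following. The helper name and shape are hypothetical — the real resolver lives in the `redfishprofile` package:

```go
package main

import "fmt"

// expandScopedSuffixes appends profile-owned suffixes to each discovered
// resource, producing extra acquisition paths in deterministic order.
func expandScopedSuffixes(discovered, suffixes []string) []string {
	var out []string
	for _, base := range discovered {
		for _, s := range suffixes {
			out = append(out, base+"/"+s)
		}
	}
	return out
}

func main() {
	systems := []string{"/redfish/v1/Systems/1"}
	suffixes := []string{"SimpleStorage", "Storage/IntelVROC"}
	for _, p := range expandScopedSuffixes(systems, suffixes) {
		fmt.Println(p)
	}
	// /redfish/v1/Systems/1/SimpleStorage
	// /redfish/v1/Systems/1/Storage/IntelVROC
}
```

Because expansion runs over discovered members rather than a static list, the same profile policy adapts to however many `Systems/*` a platform exposes.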
## ADL-022 — Adaptive prefetch eligibility is profile-owned policy
**Date:** 2026-03-18
**Context:** The adaptive prefetch executor was still driven by hardcoded include/exclude path
rules in `redfish.go`. That made GPU/storage/network prefetch shaping part of collector-core
knowledge rather than profile-owned acquisition policy.
**Decision:** Move prefetch eligibility rules into profile tuning. The collector core still runs
adaptive prefetch, but profiles provide:
- `IncludeSuffixes` for critical paths eligible for prefetch
- `ExcludeContains` for path shapes that must never be prefetched
**Consequences:**
- Prefetch behavior is now visible in raw Redfish diagnostics and test fixtures.
- Platform- or topology-specific prefetch shaping no longer requires editing collector-core
string lists.
- Future prefetch tuning must be introduced through profiles and regression tests.
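An eligibility check driven by those two knobs might be sketched as below. The suffix values are illustrative only; actual lists are profile-owned tuning, and the real evaluation logic may differ:

```go
package main

import (
	"fmt"
	"strings"
)

// prefetchPolicy mirrors the two profile-tuning knobs named above.
type prefetchPolicy struct {
	IncludeSuffixes []string // critical paths eligible for prefetch
	ExcludeContains []string // path shapes that must never be prefetched
}

// eligible sketches one plausible evaluation order:
// exclusions win, then the path must end with an allowed suffix.
func (p prefetchPolicy) eligible(path string) bool {
	for _, bad := range p.ExcludeContains {
		if strings.Contains(path, bad) {
			return false
		}
	}
	for _, suf := range p.IncludeSuffixes {
		if strings.HasSuffix(path, suf) {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative values, not an actual profile's tuning.
	p := prefetchPolicy{
		IncludeSuffixes: []string{"/Processors", "/Memory"},
		ExcludeContains: []string{"/Sensors/"},
	}
	fmt.Println(p.eligible("/redfish/v1/Systems/1/Processors"))  // true
	fmt.Println(p.eligible("/redfish/v1/Chassis/1/Sensors/Fan")) // false
}
```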
## ADL-023 — Core critical baseline is roots-only; critical shaping is profile-owned
**Date:** 2026-03-18
**Context:** `redfishCriticalEndpoints(...)` still encoded a broad set of system/chassis/manager
critical branches directly in collector core. This mixed minimal crawl invariants with profile-
specific acquisition shaping.
**Decision:** Reduce collector-core critical baseline to vendor-neutral roots only:
- `/redfish/v1`
- discovered `Systems/*`
- discovered `Chassis/*`
- discovered `Managers/*`
Profiles now own additional critical shaping through:
- scoped critical suffix policy for discovered resources
- explicit top-level `CriticalPaths`
**Consequences:**
- Critical inventory breadth is now explained by the acquisition plan, not hidden in collector
helper defaults.
- Generic profile still provides the previous broad critical coverage, so behavior stays stable.
- Future critical-path tuning must be implemented in profiles and regression-tested there.
## ADL-024 — Live Redfish execution plans are resolved inside redfishprofile
**Date:** 2026-03-18
**Context:** Even after moving seeds, scoped paths, critical shaping, recovery, and prefetch
policy into profiles, `redfish.go` still manually merged discovered resources with those policy
fragments. That left acquisition-plan resolution logic in collector core.
**Decision:** Introduce `redfishprofile.ResolveAcquisitionPlan(...)` as the boundary between
profile planning and collector execution. `redfishprofile` now resolves:
- baseline seeds
- baseline critical roots
- scoped path expansions
- explicit profile seed/critical/plan-B paths
The collector core consumes the resolved plan and executes it.
**Consequences:**
- Acquisition planning logic is now testable in `redfishprofile` without going through the live
collector.
- `redfish.go` no longer owns path-resolution helpers for seeds/critical planning.
- This creates a clean next step toward true per-profile acquisition hooks beyond static policy
fragments.
## ADL-025 — Post-discovery acquisition refinement belongs to profile hooks
**Date:** 2026-03-18
**Context:** Some acquisition behavior depends not only on vendor/model hints, but on what the
lightweight Redfish discovery actually returned. Static absolute path lists in profile plans are
too rigid for such cases and reintroduce guessed platform knowledge.
**Decision:** Add a post-discovery acquisition refinement hook to Redfish profiles. Profiles may
mutate the resolved execution plan after discovered `Systems/*`, `Chassis/*`, and `Managers/*`
are known.
First concrete use:
- MSI now derives GPU chassis seeds and `.../Sensors` critical/plan-B paths from discovered
`Chassis/GPU*` resources instead of hardcoded `GPU1..GPU4` absolute paths in the static plan.
Additional use:
- Supermicro now derives `UpdateService/Oem/Supermicro/FirmwareInventory` critical/plan-B paths
from resource hints instead of carrying that absolute path in the static plan.
Additional use:
- Dell now derives `Managers/iDRAC.Embedded.*` acquisition paths from discovered manager
resources instead of carrying `Managers/iDRAC.Embedded.1` as a static absolute path.
**Consequences:**
- Profile modules can react to actual discovery results without pushing conditional logic back
into `redfish.go`.
- Diagnostics still show the final refined plan because the collector stores the refined plan,
not only the pre-refinement template.
- Future vendor-specific discovery-dependent acquisition behavior should be implemented through
this hook rather than new collector-core branches.
## ADL-026 — Replay analysis uses a resolved profile plan, not ad-hoc directives only
**Date:** 2026-03-18
**Context:** Replay still relied on a flat `AnalysisDirectives` struct assembled centrally,
while vendor-specific conditions often depended on the actual snapshot shape. That made analysis
behavior harder to explain and kept too much vendor logic in generic replay collectors.
**Decision:** Introduce `redfishprofile.ResolveAnalysisPlan(...)` for replay. The resolved
analysis plan contains:
- active match result
- resolved analysis directives
- analysis notes explaining snapshot-aware hook activation
Profiles may refine this plan using the snapshot and discovered resources before replay collectors
run.
First concrete uses:
- MSI enables processor-GPU fallback and MSI chassis lookup only when the snapshot actually
contains GPU processors and `Chassis/GPU*`
- HGX enables processor-GPU alias fallback from actual HGX/GPU_SXM topology signals in the snapshot
- Supermicro enables NVMe backplane and known-controller recovery from actual snapshot paths
**Consequences:**
- Replay behavior is now closer to the acquisition architecture: a resolved profile plan feeds the
executor.
- `redfish_analysis_plan` is stored in raw payload metadata for offline debugging.
- Future analysis-side vendor logic should move into profile refinement hooks instead of growing the
central directive builder.
## ADL-027 — Replay GPU/storage executors consume resolved analysis plans
**Date:** 2026-03-18
**Context:** Even after introducing `ResolveAnalysisPlan(...)`, replay GPU/storage collectors still
accepted a raw `AnalysisDirectives` struct. That preserved an implicit shortcut from the old design
and weakened the plan/executor boundary.
**Decision:** Replay GPU/storage executors now accept `redfishprofile.ResolvedAnalysisPlan`
directly. The executor reads resolved directives from the plan instead of being passed a standalone
directive bundle.
**Consequences:**
- GPU and storage replay execution now follows the same architectural pattern as acquisition:
resolve plan first, execute second.
- Future profile-owned execution helpers can use plan notes or additional resolved fields without
changing the executor API again.
- Remaining replay areas should migrate the same way instead of continuing to accept raw directive
structs.
## ADL-028 — isDeviceBoundFirmwareName must cover vendor-specific naming patterns per vendor
**Date:** 2026-03-12
**Context:** `isDeviceBoundFirmwareName` was written to filter Dell-style device firmware names
(`"GPU SomeDevice"`, `"NIC OnboardLAN"`). When Supermicro Redfish FirmwareInventory was added
(`6c19a58`), no Supermicro-specific patterns were added. Supermicro names a NIC entry
`"NIC1 System Slot0 AOM-DP805-IO"` — a digit follows the type prefix directly, bypassing the
`"nic "` (space-terminated) check. 29 device-bound entries leaked into `hardware.firmware` on
SYS-A21GE-NBRT (HGX B200). Commit `9c5512d` attempted a fix by adding `_fw_gpu_` patterns,
but checked `DeviceName` which contains `"Software Inventory"` (from the Redfish `Name` field),
not the firmware inventory ID. The patterns were dead code from the moment they were committed.
**Decision:**
- `isDeviceBoundFirmwareName` must be extended for each new vendor whose FirmwareInventory
naming convention differs from the existing patterns.
- When adding HGX/Supermicro patterns, check that the pattern matches the field value that
`collectFirmwareInventory` actually stores — trace the data path from Redfish doc to
`FirmwareInfo.DeviceName` before writing the condition.
- `TestIsDeviceBoundFirmwareName` must contain at least one case per vendor format.
**Consequences:**
- New vendors with FirmwareInventory support require a test covering both device-bound names
(must return true) and system-level names (must return false) before the code ships.
- The dead `_fw_gpu_` / `_fw_nvswitch_` / `_inforom_gpu_` patterns were replaced with
correct prefix+digit checks (`"gpu" + digit`, `"nic" + digit`) and explicit string checks
(`"nvmecontroller"`, `"power supply"`, `"software inventory"`).
## ADL-029 — Dell TSR device-bound firmware filtered via FQDD; InfiniBand routed to NetworkAdapters
**Date:** 2026-03-15
**Context:** Dell TSR `sysinfo_DCIM_SoftwareIdentity.xml` lists firmware for every installed
component. `parseSoftwareIdentityXML` dumped all of these into `hardware.firmware` without
filtering, so device-bound entries such as `"Mellanox Network Adapter"` (FQDD `InfiniBand.Slot.1-1`)
and `"PERC H755 Front"` (FQDD `RAID.SL.3-1`) appeared in the reanimator export alongside system
firmware like BIOS and iDRAC. Confirmed on PowerEdge R6625 (8VS2LG4).
Additionally, `DCIM_InfiniBandView` was not handled in the parser switch, so Mellanox ConnectX-6
appeared only as a PCIe device with `model: "16x or x16"` (from `DataBusWidth` fallback).
`parseControllerView` called `addFirmware` with description `"storage controller"` instead of the
FQDD, so the FQDD-based filter in the exporter could not remove it.
**Decision:**
1. `isDeviceBoundFirmwareFQDD` extended with `"infiniband."` and `"fc."` prefixes; `"raid.backplane."`
broadened to `"raid."` to cover `RAID.SL.*`, `RAID.Integrated.*`, etc.
2. `DCIM_InfiniBandView` routed to `parseNICView` → device appears as `NetworkAdapter` with correct
firmware, MAC address, and VendorID/DeviceID.
3. `"InfiniBand."` added to `pcieFQDDNoisePrefix` to suppress the duplicate `DCIM_PCIDeviceView`
entry (DataBusWidth-only, no useful data).
4. `parseControllerView` now passes `fqdd` as the `addFirmware` description so the FQDD filter
removes the entry in the exporter.
5. `parsePCIeDeviceView` now prioritises `props["description"]` (chip model, e.g. `"MT28908 Family
[ConnectX-6]"`) over `props["devicedescription"]` (location string) for `pcie.Description`.
6. `convertPCIeDevices` model fallback order: `PartNumber → Description → DeviceClass`.
**Consequences:**
- `hardware.firmware` contains only system-level entries; NIC/RAID/storage-controller firmware
lives on the respective device record.
- `TestParseDellInfiniBandView` and `TestIsDeviceBoundFirmwareFQDD` guard the regression.
- Any future Dell TSR device class whose FQDD prefix is not yet in the prefix list may still leak;
extend `isDeviceBoundFirmwareFQDD` and add a test case when encountered.
---
## ADL-021 — pci.ids enrichment: chip model and vendor resolved from PCI IDs when source data is generic or missing
**Date:** 2026-03-15
**Context:**
Dell TSR `DCIM_InfiniBandView.ProductName` reports a generic marketing name ("Mellanox Network
Adapter") instead of the precise chip identifier ("MT28908 Family [ConnectX-6]"). The actual
chip model is available in `pci.ids` by VendorID:DeviceID (15B3:101B). Vendor name may also be
absent when no `VendorName` / `Manufacturer` property is present.
The general rule was established: *if model is not found in source data but PCI IDs are known,
resolve model from `pci.ids`*. This rule applies broadly across all export paths.
**Decision (two-layer enrichment):**
1. **Parser layer (Dell, `parseNICView`):** When `VendorID != 0 && DeviceID != 0`, prefer
`pciids.DeviceName(vendorID, deviceID)` over the product name from logs. This makes the chip
identifier the primary model for NIC/InfiniBand adapters (more specific than marketing name).
Fill `Vendor` from `pciids.VendorName(vendorID)` when the vendor field is otherwise empty.
Same fallback applied in `parsePCIeDeviceView` for empty `Description`.
2. **Exporter layer (`convertPCIeFromDevices`):** General rule — when `d.Model == ""` after all
legacy fallbacks and `VendorID != 0 && DeviceID != 0`, set `model = pciids.DeviceName(...)`.
Also fill empty `manufacturer` from `pciids.VendorName(...)`. This covers all parsers/sources.
**Consequences:**
- Mellanox InfiniBand slot now reports `model: "MT28908 Family [ConnectX-6]"` and
`manufacturer: "Mellanox Technologies"` in the reanimator export.
- For NICs where pci.ids has no entry, the original product name is kept (pci.ids returns "").
- `TestParseDellInfiniBandView` asserts the model and vendor from pci.ids.
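The model-resolution rule can be illustrated with a small sketch. The lookup tables here are stand-ins for the real `pciids` package, which reads the full pci.ids database; only the fallback logic is the point.

```go
package main

import "fmt"

// Stand-ins for pciids.DeviceName / pciids.VendorName lookups.
var deviceNames = map[[2]uint16]string{
	{0x15B3, 0x101B}: "MT28908 Family [ConnectX-6]",
}
var vendorNames = map[uint16]string{
	0x15B3: "Mellanox Technologies",
}

// resolveModel applies the enrichment rule: when both PCI IDs are known and
// pci.ids has an entry, the chip name wins over the source product name;
// otherwise the original name is kept (pci.ids effectively returns "").
func resolveModel(sourceName string, vendorID, deviceID uint16) string {
	if vendorID != 0 && deviceID != 0 {
		if name, ok := deviceNames[[2]uint16{vendorID, deviceID}]; ok && name != "" {
			return name
		}
	}
	return sourceName
}

func main() {
	fmt.Println(resolveModel("Mellanox Network Adapter", 0x15B3, 0x101B))
	fmt.Println(resolveModel("Some NIC", 0x9999, 0x0001))
}
```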
---
## ADL-022 — CPUAffinity parsed into NUMANode for PCIe, NIC, and controller devices
**Date:** 2026-03-15
**Context:**
Dell TSR DCIM view classes report `CPUAffinity` for NIC, InfiniBand, PCIe, and controller
devices. Values are "1", "2" (NUMA node index), or "Not Applicable" (for devices that bridge
both CPUs or have no CPU affinity). This data is needed for topology-aware diagnostics.
**Decision:**
- Add `NUMANode int` (JSON: `"numa_node,omitempty"`) to `models.PCIeDevice`,
`models.NetworkAdapter`, `models.HardwareDevice`, and `ReanimatorPCIe`.
- Parse from `props["cpuaffinity"]` using `parseIntLoose`: numeric values ("1", "2") map
directly; "Not Applicable" returns 0 (omitted via `omitempty`).
- Thread through `buildDevicesFromLegacy` (PCIe and NIC sections) and `convertPCIeFromDevices`.
- `parseControllerView` also parses CPUAffinity since RAID controllers have NUMA affinity.
**Consequences:**
- `numa_node: 1` or `2` appears in reanimator export for devices with known affinity.
- Value 0 / absent means "not reported" — covers both "Not Applicable" and sources that don't
provide CPUAffinity at all.
- `TestParseDellCPUAffinity` verifies numeric values parsed correctly and "Not Applicable"→0.
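A minimal sketch of the parsing rule (the `parseIntLoose` name is taken from the decision text; the body here is illustrative):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseIntLoose parses numeric CPUAffinity values; anything non-numeric,
// including "Not Applicable", maps to 0, which omitempty then drops.
func parseIntLoose(s string) int {
	n, err := strconv.Atoi(strings.TrimSpace(s))
	if err != nil || n < 0 {
		return 0
	}
	return n
}

func main() {
	fmt.Println(parseIntLoose("1"))              // 1
	fmt.Println(parseIntLoose("2"))              // 2
	fmt.Println(parseIntLoose("Not Applicable")) // 0
}
```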
---
## ADL-023 — Reanimator export must match ingest contract exactly
**Date:** 2026-03-15
**Context:**
LOGPile's Reanimator export had drifted from the strict ingest contract. It emitted fields that
Reanimator does not currently accept (`status_at_collection`, `numa_node`),
while missing fields and sections now present in the contract (`hardware.sensors`,
`pcie_devices[].mac_addresses`). Memory export rules also diverged from the ingest side: empty or
serial-less DIMMs were still exported.
**Decision:**
- Treat the Reanimator ingest contract as the authoritative schema for `GET /api/export/reanimator`.
- Emit only fields present in the current upstream contract revision.
- Add `hardware.sensors`, `pcie_devices[].mac_addresses`, `pcie_devices[].numa_node`, and
upstream-approved component telemetry/health fields.
- Leave out fields that are still not part of the upstream contract.
- Map internal `source_type=archive` to external `source_type=logfile`.
- Skip memory entries that are empty, not present, or missing serial numbers.
- Generate CPU and PCIe serials only in the forms allowed by the contract.
- Mirror the applied contract in `bible-local/docs/hardware-ingest-contract.md`.
**Consequences:**
- Some previously exported diagnostic fields are intentionally dropped from the Reanimator payload
until the upstream contract adds them.
- Internal models may retain richer fields than the current export schema.
- `hardware.devices` is canonical only after merge with legacy hardware slices; partial parser-owned
canonical records must not hide CPUs, memory, storage, NICs, or PSUs still stored in legacy
fields.
- CSV and Reanimator exports must use the same merged canonical inventory to avoid divergent export
contents across surfaces.
- Future exporter changes must update both the code and the mirrored contract document together.
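Two of the rules above lend themselves to a short sketch: the `source_type` mapping and the memory-skip rule. The struct and helper names are illustrative, not the exporter's real identifiers.

```go
package main

import "fmt"

type memoryDIMM struct {
	Present      bool
	SerialNumber string
	SizeMB       int
}

// exportableDIMM applies the memory export rule: skip entries that are
// absent, empty, or missing a serial number.
func exportableDIMM(d memoryDIMM) bool {
	return d.Present && d.SizeMB > 0 && d.SerialNumber != ""
}

// externalSourceType maps the internal source_type=archive to the
// contract's source_type=logfile; other values pass through unchanged.
func externalSourceType(internal string) string {
	if internal == "archive" {
		return "logfile"
	}
	return internal
}

func main() {
	fmt.Println(externalSourceType("archive")) // logfile
	fmt.Println(exportableDIMM(memoryDIMM{Present: true, SerialNumber: "80AD0324", SizeMB: 32768})) // true
	fmt.Println(exportableDIMM(memoryDIMM{Present: true}))                                          // false
}
```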
---
## ADL-024 — Component presence is implicit; Redfish linked metrics are part of replay correctness
**Date:** 2026-03-15
**Context:**
The upstream ingest contract allows `present`, but current export semantics do not need to send
`present=true` for populated components. At the same time, several important Redfish component
telemetry fields were only available through linked metric resources such as `ProcessorMetrics`,
`MemoryMetrics`, and `DriveMetrics`. Without collecting and replaying these linked documents,
live collection and raw snapshot replay still underreported component health fields.
**Decision:**
- Do not serialize `present=true` in Reanimator export. The existence of a component record itself
  implies presence.
- Do not export component records marked `present=false`.
- Interpret CPU `firmware` in Reanimator payload as CPU microcode.
- Treat Redfish linked metric resources `ProcessorMetrics`, `MemoryMetrics`, `DriveMetrics`,
`EnvironmentMetrics`, and generic `Metrics` as part of analyzer correctness when they are linked
from component resources.
- Replay logic must merge these linked metric resources back into CPU, memory, storage, PCIe, GPU,
NIC, and PSU component `Details` the same way live collection expects them to be used.
**Consequences:**
- Reanimator payloads are smaller and avoid redundant `present=true` noise while still excluding
empty slots and absent components.
- Any future exporter change that reintroduces serialized component presence needs an explicit
contract review.
- Raw Redfish snapshot completeness now includes linked per-component metric resources, not only
top-level inventory members.
- CPU microcode is no longer expected in top-level `hardware.firmware`; it belongs on the CPU
component record.
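The merge rule for linked metric resources can be sketched as below, assuming component `Details` is a string-keyed map (an assumption for illustration; the real replay code may use typed fields):

```go
package main

import "fmt"

// mergeLinkedMetrics folds fields from a linked metric resource (e.g.
// ProcessorMetrics or MemoryMetrics) into a component's Details without
// overwriting values the component resource already carries.
func mergeLinkedMetrics(details, metrics map[string]any) {
	for k, v := range metrics {
		if _, exists := details[k]; !exists {
			details[k] = v
		}
	}
}

func main() {
	cpu := map[string]any{"Model": "INTEL(R) XEON(R) GOLD 6530"}
	procMetrics := map[string]any{
		"TemperatureCelsius": 61.5,
		"ConsumedPowerWatt":  182.0,
	}
	mergeLinkedMetrics(cpu, procMetrics)
	fmt.Println(len(cpu)) // 3
}
```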
---
<!-- Add new decisions below this line using the format above -->
## ADL-025 — Missing serial numbers must remain absent in Reanimator export
**Date:** 2026-03-15
**Context:**
LOGPile previously generated synthetic serial numbers for components that had no real serial in
source data, especially CPUs and PCIe-class devices. This made the payload look richer, but the
serials were not authoritative and could mislead downstream consumers. Reanimator can already
accept missing serials and generate its own internal fallback identifiers when needed.
**Decision:**
- Do not synthesize fake serial numbers in LOGPile's Reanimator export.
- If a component has no real serial in parsed source data, export the serial field as absent.
- This applies to CPUs, PCIe devices, GPUs, NICs, and any other component class unless an
upstream contract explicitly requires a deterministic exporter-generated identifier.
- Any fallback serial generation defined by the upstream contract is ingest-side Reanimator behavior,
not LOGPile exporter behavior.
**Consequences:**
- Exported payloads carry only source-backed serial numbers.
- Fake identifiers such as `BOARD-...-CPU-...` or synthetic PCIe serials are no longer considered
acceptable exporter behavior.
- Any future attempt to reintroduce generated serials requires an explicit contract review and a
new ADL entry.
---
## ADL-026 — Live Redfish collection uses explicit preflight host-power confirmation
**Date:** 2026-03-15
**Context:**
Live Redfish inventory can be incomplete when the managed host is powered off. At the same time,
LOGPile must not silently power on a host without explicit user choice. The collection workflow
therefore needs a preflight step that verifies connectivity, shows current host power state to the
user, and only powers on the host when the user explicitly chose that path.
**Decision:**
- Add a dedicated live preflight API step before collection starts.
- The UI first runs a connectivity and power-state check, then offers two choices:
  - collect as-is
  - power on and collect
- if the host is off and the user does not answer within 5 seconds, default to collecting without
  powering the host on
- Redfish collection may power on the host only when the request explicitly sets
`power_on_if_host_off=true`
- when LOGPile powers on the host for collection, it must try to power the host back off after
collection completes
- if LOGPile did not power the host on itself, it must never power the host off
- all preflight and power-control steps must be logged into the collection log and therefore into
the raw-export bundle
**Consequences:**
- Live collection becomes a two-step UX: probe first, collect second.
- Raw bundles preserve operator-visible evidence of power-state decisions and power-control attempts.
- Power-on failures do not block collection entirely; they only downgrade completeness expectations.
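The power-control invariants above reduce to two predicates, sketched here with illustrative names (not the collector's real API):

```go
package main

import "fmt"

type preflightResult struct {
	Reachable bool
	HostOn    bool
}

// shouldPowerOn: LOGPile may power the host on only when it is off AND the
// request explicitly set power_on_if_host_off=true.
func shouldPowerOn(p preflightResult, powerOnIfHostOff bool) bool {
	return p.Reachable && !p.HostOn && powerOnIfHostOff
}

// shouldPowerOffAfter: power the host back off only if LOGPile itself
// powered it on for this collection; never otherwise.
func shouldPowerOffAfter(loggerTurnedItOn bool) bool {
	return loggerTurnedItOn
}

func main() {
	fmt.Println(shouldPowerOn(preflightResult{Reachable: true, HostOn: false}, true))  // true
	fmt.Println(shouldPowerOn(preflightResult{Reachable: true, HostOn: false}, false)) // false
	fmt.Println(shouldPowerOffAfter(false))                                            // false
}
```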
---
## ADL-027 — Sensors without numeric readings are not exported
**Date:** 2026-03-15
**Context:**
Some parsed sensor records carry only a name, unit, or status, but no actual numeric reading. Such
records are not useful as telemetry in Reanimator export and create noisy, low-value sensor lists.
**Decision:**
- Do not export temperature, power, fan, or other sensor records unless they carry a real numeric
measurement value.
- Presence of a sensor name or health/status alone is not sufficient for export.
**Consequences:**
- Exported sensor groups contain only actionable telemetry.
- Parsers and collectors may still keep non-numeric sensor artifacts internally for diagnostics, but
Reanimator export must filter them out.
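The filter amounts to one predicate on the reading. A minimal sketch, using a nil pointer to distinguish "no numeric value" (the real sensor model may encode absence differently):

```go
package main

import "fmt"

type sensor struct {
	Name    string
	Reading *float64 // nil when the source gave no numeric value
}

// exportableSensors keeps only records carrying a real numeric measurement;
// name- or status-only records are dropped from the export.
func exportableSensors(in []sensor) []sensor {
	out := make([]sensor, 0, len(in))
	for _, s := range in {
		if s.Reading != nil {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	v := 43.0
	kept := exportableSensors([]sensor{
		{Name: "CPU Temp", Reading: &v},
		{Name: "Fan1"}, // status-only record, filtered out
	})
	fmt.Println(len(kept)) // 1
}
```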
---
## ADL-028 — Reanimator PCIe export excludes storage endpoints and synthetic serials
**Date:** 2026-03-15
**Context:**
Some Redfish and archive sources expose NVMe drives both as storage inventory and as PCIe-visible
endpoints. Exporting such drives in both `hardware.storage` and `hardware.pcie_devices` creates
duplicates without adding useful topology value. At the same time, PCIe-class export still had old
fallback behavior that generated synthetic serial numbers when source serials were absent.
**Decision:**
- Export disks and NVMe drives only through `hardware.storage`.
- Do not export storage endpoints as `hardware.pcie_devices`, even if the source inventory exposes
them as PCIe/NVMe devices.
- Keep real PCIe storage controllers such as RAID and HBA adapters in `hardware.pcie_devices`.
- Do not synthesize PCIe/GPU/NIC serial numbers in LOGPile; missing serials stay absent.
- Treat placeholder names such as `Network Device View` as non-authoritative and prefer resolved
device names when stronger data exists.
**Consequences:**
- Reanimator payloads no longer duplicate NVMe drives between storage and PCIe sections.
- PCIe export remains topology-focused while storage export remains component-focused.
- Missing PCIe-class serials no longer produce fake `BOARD-...-PCIE-...` identifiers.
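One way to express the storage-endpoint exclusion is a device-class predicate. The class names below are assumptions for illustration; the real exporter may classify on richer source metadata.

```go
package main

import (
	"fmt"
	"strings"
)

// isStorageEndpoint reports whether a device belongs in hardware.storage
// only. Drives are excluded from hardware.pcie_devices; RAID/HBA
// controllers are not endpoints and stay in the PCIe section.
func isStorageEndpoint(deviceClass string) bool {
	switch strings.ToLower(deviceClass) {
	case "nvme", "ssd", "hdd", "disk":
		return true
	}
	return false
}

func main() {
	fmt.Println(isStorageEndpoint("NVMe")) // true
	fmt.Println(isStorageEndpoint("RAID")) // false
}
```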
---
## ADL-029 — Local exporter guidance tracks upstream contract v2.7 terminology
**Date:** 2026-03-15
**Context:**
The upstream Reanimator hardware ingest contract moved to `v2.7` and clarified several points that
matter for LOGPile documentation: ingest-side serial fallback rules, canonical PCIe addressing via
`slot`, the optional `event_logs` section, and the shared `manufactured_year_week` field.
**Decision:**
- Keep the local mirrored contract file as an exact copy of the upstream `v2.7` document.
- Describe CPU/PCIe serial fallback as Reanimator ingest behavior, not LOGPile exporter behavior.
- Treat `pcie_devices.slot` as the canonical address on the LOGPile side as well; `bdf` may remain
an internal fallback/dedupe key but is not serialized in the payload.
- Export `event_logs` only from normalized parser/collector events that can be mapped to contract
sources `host` / `bmc` / `redfish` without synthesizing message content.
- Export `manufactured_year_week` only as a reliable passthrough when a parser/collector already
extracted a valid `YYYY-Www` value.
**Consequences:**
- Local bible wording no longer conflicts with upstream contract terminology.
- Reanimator payloads use contract-native PCIe addressing and no longer expose `bdf` as a parallel
coordinate.
- LOGPile event export remains strictly source-derived; internal warnings such as LOGPile analysis
notes do not leak into Reanimator `event_logs`.
---
## ADL-030 — Audit result rendering is delegated to embedded reanimator/chart
**Date:** 2026-03-16
**Context:**
LOGPile already owns file upload, Redfish collection, archive parsing, normalization, and
Reanimator export. Maintaining a second host-side audit renderer for the same data created
presentation drift and duplicated UI logic.
**Decision:**
- Use vendored `reanimator/chart` as the only audit result viewer.
- Keep LOGPile responsible for service flows: upload, live collection, batch convert, raw export,
Reanimator export, and parse-error reporting.
- Render the current dataset by converting it to Reanimator JSON and passing that snapshot to
embedded `chart` under `/chart/current`.
**Consequences:**
- Reanimator JSON becomes the single presentation contract for the audit surface.
- The host UI becomes a service shell around the viewer instead of maintaining its own
field-by-field tabs.
- `internal/chart` must be updated explicitly as a git submodule when the viewer changes.
---
## ADL-031 — Redfish uses profile-driven acquisition and unified ingest entrypoints
**Date:** 2026-03-17
**Context:**
Redfish collection had accumulated platform-specific probing in the shared collector path, while
upload and raw-export replay still entered analysis through direct handler branches. This made
vendor/model tuning harder to contain and increased regression risk when one topology needed a
special acquisition strategy.
**Decision:**
- Introduce `internal/ingest.Service` as the internal source-family entrypoint for archive parsing
and Redfish raw replay.
- Introduce `internal/collector/redfishprofile/` for Redfish profile matching and modular hooks.
- Split Redfish behavior into coordinated phases:
- acquisition planning during live collection
- analysis hooks during snapshot replay
- Use score-based profile matching. If confidence is low, enter fallback acquisition mode and
aggregate only safe additive profile probes.
- Allow profile modules to provide bounded acquisition tuning hints such as crawl cap, prefetch
behavior, and expensive post-probe toggles.
- Allow profile modules to own model-specific `CriticalPaths` and bounded `PlanBPaths` so vendor
recovery targets stop leaking into the collector core.
- Expose Redfish profile matching as structured diagnostics during live collection: logs must
contain all module scores, and collect job status must expose active modules for the UI.
**Consequences:**
- Server handlers stop owning parser-vs-replay branching details directly.
- Vendor/model-specific Redfish logic gets an explicit module boundary.
- Unknown-vendor Redfish collection becomes slower but more complete by design.
- Tactical Redfish fixes should move into profile modules instead of widening generic replay logic.
- Repo-owned compact fixtures under `internal/collector/redfishprofile/testdata/`, derived from
representative raw-export snapshots, are used to lock profile matching and acquisition tuning
for known MSI and Supermicro-family shapes.
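The score-based matching with fallback expansion can be sketched as follows. The types, threshold, and snapshot shape are illustrative, not the actual `redfishprofile` API.

```go
package main

import (
	"fmt"
	"sort"
)

type profile struct {
	Name  string
	Score func(snapshot map[string]string) int
}

const matchThreshold = 80 // assumed confidence cutoff

// resolveProfiles activates every profile scoring at or above the
// threshold. When none qualify, it enters fallback mode and aggregates
// all profiles' safe additive probes instead.
func resolveProfiles(profiles []profile, snapshot map[string]string) (active []string, fallback bool) {
	for _, p := range profiles {
		if p.Score(snapshot) >= matchThreshold {
			active = append(active, p.Name)
		}
	}
	if len(active) == 0 {
		for _, p := range profiles {
			active = append(active, p.Name)
		}
		fallback = true
	}
	sort.Strings(active) // deterministic ordering, as the tests require
	return active, fallback
}

func main() {
	profiles := []profile{
		{Name: "supermicro", Score: func(s map[string]string) int {
			if s["vendor"] == "Supermicro" {
				return 95
			}
			return 0
		}},
		{Name: "generic", Score: func(s map[string]string) int { return 0 }},
	}
	a, fb := resolveProfiles(profiles, map[string]string{"vendor": "Supermicro"})
	fmt.Println(a, fb) // [supermicro] false
	a, fb = resolveProfiles(profiles, map[string]string{"vendor": "SomethingElse"})
	fmt.Println(a, fb) // [generic supermicro] true
}
```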


@@ -1,59 +1,42 @@
# LOGPile Bible
> **Documentation language:** English only. All maintained project documentation must be written in English.
>
> **Architectural decisions:** Every significant architectural decision **must** be recorded in
> [`10-decisions.md`](10-decisions.md) before or alongside the code change.
>
> **Single source of truth:** Architecture and technical design documentation belongs in `docs/bible/`.
> Keep `README.md` and `CLAUDE.md` minimal to avoid duplicate documentation.
`bible-local/` is the project-specific source of truth for LOGPile.
Keep top-level docs minimal and put maintained architecture/API contracts here.
This directory is the single source of truth for LOGPile's architecture, design, and integration contracts.
It is structured so that both humans and AI assistants can navigate it quickly.
## Rules
---
- Documentation language: English only
- Update relevant bible files in the same change as the code
- Record significant architectural decisions in [`10-decisions.md`](10-decisions.md)
- Do not duplicate shared rules from `bible/`
## Reading Map (Hierarchical)
## Read order
### 1. Foundations (read first)
| File | Purpose |
|------|---------|
| [01-overview.md](01-overview.md) | Product scope, modes, non-goals |
| [02-architecture.md](02-architecture.md) | Runtime structure, state, main flows |
| [04-data-models.md](04-data-models.md) | Stable data contracts and canonical inventory |
| [03-api.md](03-api.md) | HTTP endpoints and response contracts |
| [05-collectors.md](05-collectors.md) | Live collection behavior |
| [06-parsers.md](06-parsers.md) | Archive parser framework and vendor coverage |
| [07-exporters.md](07-exporters.md) | Raw export, Reanimator export, batch convert |
| [docs/hardware-ingest-contract.md](docs/hardware-ingest-contract.md) | Reanimator ingest schema mirrored locally |
| [08-build-release.md](08-build-release.md) | Build and release workflow |
| [09-testing.md](09-testing.md) | Test expectations and regression rules |
| [10-decisions.md](10-decisions.md) | Architectural Decision Log |
| File | What it covers |
|------|----------------|
| [01-overview.md](01-overview.md) | Product purpose, operating modes, scope |
| [02-architecture.md](02-architecture.md) | Runtime structure, control flow, in-memory state |
| [04-data-models.md](04-data-models.md) | Core contracts (`AnalysisResult`, canonical `hardware.devices`) |
## Fast orientation
### 2. Runtime Interfaces
| File | What it covers |
|------|----------------|
| [03-api.md](03-api.md) | HTTP API contracts and endpoint behavior |
| [05-collectors.md](05-collectors.md) | Live collection connectors (Redfish, IPMI mock) |
| [06-parsers.md](06-parsers.md) | Archive parser framework and vendor parsers |
| [07-exporters.md](07-exporters.md) | CSV / JSON / Reanimator exports and integration mapping |
### 3. Delivery & Quality
| File | What it covers |
|------|----------------|
| [08-build-release.md](08-build-release.md) | Build, packaging, release workflow |
| [09-testing.md](09-testing.md) | Testing expectations and verification guidance |
### 4. Governance (always current)
| File | What it covers |
|------|----------------|
| [10-decisions.md](10-decisions.md) | Architectural Decision Log (ADL) |
---
## Quick orientation for AI assistants
- Read order for most changes: `01` → `02` → `04` → relevant interface doc(s) → `10`
- Entry point: `cmd/logpile/main.go`
- HTTP server: `internal/server/` — handlers in `handlers.go`, routes in `server.go`
- Data contracts: `internal/models/` — never break `AnalysisResult` JSON shape
- Frontend contract: `web/static/js/app.js` — keep API responses stable
- Canonical inventory: `hardware.devices` in `AnalysisResult` — source of truth for UI and exports
- Parser registry: `internal/parser/vendors/` (`init()` auto-registration pattern)
- Collector registry: `internal/collector/registry.go`
- HTTP layer: `internal/server/`
- Core contracts: `internal/models/models.go`
- Live collection: `internal/collector/`
- Archive parsing: `internal/parser/`
- Export conversion: `internal/exporter/`
- Frontend consumer: `web/static/js/app.js`
## Maintenance rule
If a document becomes stale, either fix it immediately or delete it.
Stale docs are worse than missing docs.


@@ -0,0 +1,793 @@
---
title: Hardware Ingest JSON Contract
version: "2.7"
updated: "2026-03-15"
maintainer: Reanimator Core
audience: external-integrators, ai-agents
language: en
---
# Reanimator integration: hardware ingest JSON contract
Version: **2.7** · Date: **2026-03-15**
This document describes the JSON format for submitting server hardware data to **Reanimator** (hardware lifecycle management).
It is intended for developers of adjacent systems (Redfish collectors, monitoring agents, CMDB exporters) and may be included in the documentation of integrated projects.
> Latest version of this document: https://git.mchus.pro/reanimator/core/src/branch/main/bible-local/docs/hardware-ingest-contract.md
---
## Changelog
| Version | Date | Changes |
|---------|------|---------|
| 2.7 | 2026-03-15 | Explicitly forbids synthesizing data in `event_logs`; integrators must not invent component serial numbers the source did not provide |
| 2.6 | 2026-03-15 | Added optional `event_logs` section for dedup/upsert of `host` / `bmc` / `redfish` logs outside the history timeline |
| 2.5 | 2026-03-15 | Added shared optional `manufactured_year_week` field for component sections (`YYYY-Www`) |
| 2.4 | 2026-03-15 | Added first wave of component telemetry: health/life fields for `cpus`, `memory`, `storage`, `pcie_devices`, `power_supplies` |
| 2.3 | 2026-03-15 | Added component telemetry fields: `pcie_devices.temperature_c`, `pcie_devices.power_w`, `power_supplies.temperature_c` |
| 2.2 | 2026-03-15 | Added `numa_node` field on `pcie_devices` for topology/affinity |
| 2.1 | 2026-03-15 | Added `sensors` section (fans, power, temperatures, other); `mac_addresses` field on `pcie_devices`; extended list of `device_class` values |
| 2.0 | 2026-02-01 | Component status history (`status_history`, `status_changed_at`); PSU telemetry fields; async job response |
| 1.0 | 2026-01-01 | Initial contract version |
---
## Principles
1. **Snapshot**: the JSON describes the server state at collection time. It may include component status-change history.
2. **Idempotency**: re-sending an identical payload does not create duplicates (deduplication by content hash).
3. **Partial payloads**: send only the sections with available data. An empty array and an absent section are equivalent.
4. **Strict schema**: the endpoint uses a strict JSON decoder; unknown fields result in `400 Bad Request`.
5. **Event-driven**: ingest creates timeline events (LOG_COLLECTED, INSTALLED, REMOVED, FIRMWARE_CHANGED, etc.).
6. **No integrator-side synthesis**: the collector submits only values it actually collected. Do not invent `serial_number`, `component_ref`, `message`, `message_id`, or other identifiers/attributes when the source did not provide them or the parser could not extract them reliably.
---
## Endpoint
```
POST /ingest/hardware
Content-Type: application/json
```
**Acceptance response (202 Accepted):**
```json
{
"status": "accepted",
"job_id": "job_01J..."
}
```
Ingest runs asynchronously. The result is available at:
```
GET /ingest/hardware/jobs/{job_id}
```
**Response on job success:**
```json
{
"status": "success",
"bundle_id": "lb_01J...",
"asset_id": "mach_01J...",
"collected_at": "2026-02-10T15:30:00Z",
"duplicate": false,
"summary": {
"parts_observed": 15,
"parts_created": 2,
"parts_updated": 13,
"installations_created": 2,
"installations_closed": 1,
"timeline_events_created": 9,
"failure_events_created": 1
}
}
```
**Response for a duplicate:**
```json
{
"status": "success",
"duplicate": true,
"message": "LogBundle with this content hash already exists"
}
```
**Error response (400 Bad Request):**
```json
{
"status": "error",
"error": "validation_failed",
"details": {
"field": "hardware.board.serial_number",
"message": "serial_number is required"
}
}
```
Common causes of `400`:
- Invalid `collected_at` format (RFC3339 required).
- Empty `hardware.board.serial_number`.
- An unknown JSON field at any level.
- Request body exceeding the allowed size.
---
## Top-level structure
```json
{
"filename": "redfish://10.10.10.103",
"source_type": "api",
"protocol": "redfish",
"target_host": "10.10.10.103",
"collected_at": "2026-02-10T15:30:00Z",
"hardware": {
"board": { ... },
"firmware": [ ... ],
"cpus": [ ... ],
"memory": [ ... ],
"storage": [ ... ],
"pcie_devices": [ ... ],
"power_supplies": [ ... ],
"sensors": { ... },
"event_logs": [ ... ]
}
}
```
### Top-level fields
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `collected_at` | string RFC3339 | **yes** | Collection timestamp |
| `hardware` | object | **yes** | Hardware snapshot |
| `hardware.board.serial_number` | string | **yes** | Board/server serial number |
| `target_host` | string | no | IP or hostname |
| `source_type` | string | no | Source type: `api`, `logfile`, `manual` |
| `protocol` | string | no | Protocol: `redfish`, `ipmi`, `snmp`, `ssh` |
| `filename` | string | no | Source identifier |
---
## Common component status fields
Apply to all component sections (`cpus`, `memory`, `storage`, `pcie_devices`, `power_supplies`).
| Field | Type | Description |
|-------|------|-------------|
| `status` | string | Current status: `OK`, `Warning`, `Critical`, `Unknown`, `Empty` |
| `status_checked_at` | string RFC3339 | Time of the last status check |
| `status_changed_at` | string RFC3339 | Time of the last status change |
| `status_history` | array | Status transition history (see below) |
| `error_description` | string | Error/diagnostic text |
| `manufactured_year_week` | string | Manufacturing date as `YYYY-Www`, e.g. `2024-W07` |
**`status_history[]` object:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `status` | string | **yes** | Status at that moment |
| `changed_at` | string RFC3339 | **yes** | Transition time (entries without this field are ignored) |
| `details` | string | no | Explanation of the transition |
**Event-time priority rules:**
1. `status_changed_at`
2. Latest `status_history` entry with a matching status
3. Latest parseable `status_history` entry
4. `status_checked_at`
**Status submission rules:**
- Send `status` as the component's current state in the snapshot.
- If the source keeps history, send `status_history` sorted by `changed_at` in ascending order.
- Do not include `status_history` entries without `changed_at`.
- All dates are RFC3339; UTC (`Z`) is recommended.
- Use `manufactured_year_week` when the source knows only the manufacturing year and week, not an exact calendar date.
---
## hardware sections
### board
Core server information. Required section.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `serial_number` | string | **yes** | Serial number (the Asset identification key) |
| `manufacturer` | string | no | Manufacturer |
| `product_name` | string | no | Model |
| `part_number` | string | no | Part number |
| `uuid` | string | no | System UUID |
The string value `"NULL"` is treated as absence of data.
```json
"board": {
"manufacturer": "Supermicro",
"product_name": "X12DPG-QT6",
"serial_number": "21D634101",
"part_number": "X12DPG-QT6-REV1.01",
"uuid": "d7ef2fe5-2fd0-11f0-910a-346f11040868"
}
```
---
### firmware
Firmware versions of system components (BIOS, BMC, CPLD, etc.).
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `device_name` | string | **yes** | Device name (`BIOS`, `BMC`, `CPLD`, …) |
| `version` | string | **yes** | Firmware version |
Entries with an empty `device_name` or `version` are ignored.
A version change creates a `FIRMWARE_CHANGED` event for the Asset.
```json
"firmware": [
{ "device_name": "BIOS", "version": "06.08.05" },
{ "device_name": "BMC", "version": "5.17.00" },
{ "device_name": "CPLD", "version": "01.02.03" }
]
```
---
### cpus
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `socket` | int | **yes** | Socket number (used for serial generation) |
| `model` | string | no | Processor model |
| `manufacturer` | string | no | Manufacturer |
| `cores` | int | no | Core count |
| `threads` | int | no | Thread count |
| `frequency_mhz` | int | no | Current frequency |
| `max_frequency_mhz` | int | no | Maximum frequency |
| `temperature_c` | float | no | CPU temperature, °C (telemetry) |
| `power_w` | float | no | Current CPU power draw, W (telemetry) |
| `throttled` | bool | no | Thermal/power throttling observed |
| `correctable_error_count` | int | no | Correctable CPU error count |
| `uncorrectable_error_count` | int | no | Uncorrectable CPU error count |
| `life_remaining_pct` | float | no | Remaining life / health, % |
| `life_used_pct` | float | no | Used life / wear, % |
| `serial_number` | string | no | Serial number (if available) |
| `firmware` | string | no | Microcode version; if the logger reports `Microcode level`, pass it here as-is |
| `present` | bool | no | Presence (defaults to `true`) |
| + common status fields | | | see the section above |
**serial_number generation when absent:** `{board_serial}-CPU-{socket}`
If the source uses a `Microcode level` field/label, pass its value into `cpus[].firmware` without any transformation.
```json
"cpus": [
{
"socket": 0,
"model": "INTEL(R) XEON(R) GOLD 6530",
"cores": 32,
"threads": 64,
"frequency_mhz": 2100,
"max_frequency_mhz": 4000,
"temperature_c": 61.5,
"power_w": 182.0,
"throttled": false,
"manufacturer": "Intel",
"status": "OK",
"status_checked_at": "2026-02-10T15:28:00Z"
}
]
```
---
### memory
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `slot` | string | no | Slot identifier |
| `present` | bool | no | Module presence (defaults to `true`) |
| `serial_number` | string | no | Serial number |
| `part_number` | string | no | Part number (used as the model) |
| `manufacturer` | string | no | Manufacturer |
| `size_mb` | int | no | Capacity in MB |
| `type` | string | no | Type: `DDR3`, `DDR4`, `DDR5`, … |
| `max_speed_mhz` | int | no | Maximum speed |
| `current_speed_mhz` | int | no | Current speed |
| `temperature_c` | float | no | DIMM/module temperature, °C (telemetry) |
| `correctable_ecc_error_count` | int | no | Correctable ECC error count |
| `uncorrectable_ecc_error_count` | int | no | Uncorrectable ECC error count |
| `life_remaining_pct` | float | no | Remaining life / health, % |
| `life_used_pct` | float | no | Used life / wear, % |
| `spare_blocks_remaining_pct` | float | no | Remaining spare blocks, % |
| `performance_degraded` | bool | no | Performance degradation observed |
| `data_loss_detected` | bool | no | Source signals risk or occurrence of data loss |
| + common status fields | | | see the section above |
A module without a `serial_number` is ignored. A module with `present=false` or `status=Empty` is ignored.
```json
"memory": [
{
"slot": "CPU0_C0D0",
"present": true,
"size_mb": 32768,
"type": "DDR5",
"max_speed_mhz": 4800,
"current_speed_mhz": 4800,
"temperature_c": 43.0,
"correctable_ecc_error_count": 0,
"manufacturer": "Hynix",
"serial_number": "80AD032419E17CEEC1",
"part_number": "HMCG88AGBRA191N",
"status": "OK"
}
]
```
---
### storage
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `slot` | string | no | Canonical installation address of the PCIe device; pass the BDF (`0000:18:00.0`) |
| `serial_number` | string | no | Serial number |
| `model` | string | no | Model |
| `manufacturer` | string | no | Manufacturer |
| `type` | string | no | Type: `NVMe`, `SSD`, `HDD` |
| `interface` | string | no | Interface: `NVMe`, `SATA`, `SAS` |
| `size_gb` | int | no | Size in GB |
| `temperature_c` | float | no | Drive temperature, °C (telemetry) |
| `power_on_hours` | int64 | no | Power-on time, hours |
| `power_cycles` | int64 | no | Power cycle count |
| `unsafe_shutdowns` | int64 | no | Unsafe shutdowns |
| `media_errors` | int64 | no | Media errors |
| `error_log_entries` | int64 | no | Error log entry count |
| `written_bytes` | int64 | no | Total bytes written |
| `read_bytes` | int64 | no | Total bytes read |
| `life_used_pct` | float | no | Used life / wear, % |
| `life_remaining_pct` | float | no | Remaining life / health, % |
| `available_spare_pct` | float | no | Available spare, % |
| `reallocated_sectors` | int64 | no | Reallocated sectors |
| `current_pending_sectors` | int64 | no | Sectors pending remap |
| `offline_uncorrectable` | int64 | no | Offline-scan uncorrectable errors |
| `firmware` | string | no | Firmware version |
| `present` | bool | no | Presence (defaults to `true`) |
| + common status fields | | | see the section above |
A drive without a `serial_number` is ignored. A `firmware` change creates a `FIRMWARE_CHANGED` event.
```json
"storage": [
{
"slot": "OB01",
"type": "NVMe",
"model": "INTEL SSDPF2KX076T1",
"size_gb": 7680,
"temperature_c": 38.5,
"power_on_hours": 12450,
"unsafe_shutdowns": 3,
"written_bytes": 9876543210,
"life_remaining_pct": 91.0,
"serial_number": "BTAX41900GF87P6DGN",
"manufacturer": "Intel",
"firmware": "9CV10510",
"interface": "NVMe",
"present": true,
"status": "OK"
}
]
```
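The `FIRMWARE_CHANGED` rule above can be sketched roughly as follows. This is an illustrative model, not the actual ingest code; the dict-of-drives shape keyed by `serial_number` is an assumption (drives without a serial never reach this stage, since they are ignored earlier):

```python
def firmware_changed_events(previous, current):
    """Emit a FIRMWARE_CHANGED event for every drive whose firmware
    differs between two imports; drives are keyed by serial_number."""
    events = []
    for serial, drive in current.items():
        old = previous.get(serial)
        if old is not None and old.get("firmware") != drive.get("firmware"):
            events.append({
                "type": "FIRMWARE_CHANGED",
                "serial_number": serial,
                "from": old.get("firmware"),
                "to": drive.get("firmware"),
            })
    return events
```

New drives (no previous record) produce no event here; only an actual version change does.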
---
### pcie_devices
| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `slot` | string | no | Slot identifier |
| `vendor_id` | int | no | PCI Vendor ID (decimal) |
| `device_id` | int | no | PCI Device ID (decimal) |
| `numa_node` | int | no | NUMA node / CPU affinity of the device |
| `temperature_c` | float | no | Device temperature, °C (telemetry) |
| `power_w` | float | no | Current device power draw, W (telemetry) |
| `life_remaining_pct` | float | no | Remaining life / health, % |
| `life_used_pct` | float | no | Used life / wear, % |
| `ecc_corrected_total` | int64 | no | Total correctable ECC errors |
| `ecc_uncorrected_total` | int64 | no | Total uncorrectable ECC errors |
| `hw_slowdown` | bool | no | Device entered hardware slowdown / protective mode |
| `battery_charge_pct` | float | no | Battery / supercap charge, % |
| `battery_health_pct` | float | no | Battery / supercap health, % |
| `battery_temperature_c` | float | no | Battery / supercap temperature, °C |
| `battery_voltage_v` | float | no | Battery / supercap voltage, V |
| `battery_replace_required` | bool | no | Battery / supercap replacement required |
| `sfp_temperature_c` | float | no | SFP/optic temperature, °C |
| `sfp_tx_power_dbm` | float | no | TX optical power, dBm |
| `sfp_rx_power_dbm` | float | no | RX optical power, dBm |
| `sfp_voltage_v` | float | no | SFP voltage, V |
| `sfp_bias_ma` | float | no | SFP bias current, mA |
| `bdf` | string | no | Deprecated alias for `slot`; when present, ingest normalizes it into `slot` |
| `device_class` | string | no | Device class (see the list below) |
| `manufacturer` | string | no | Manufacturer |
| `model` | string | no | Model |
| `serial_number` | string | no | Serial number |
| `firmware` | string | no | Firmware version |
| `link_width` | int | no | Current link width |
| `link_speed` | string | no | Current speed: `Gen3`, `Gen4`, `Gen5` |
| `max_link_width` | int | no | Maximum link width |
| `max_link_speed` | string | no | Maximum speed |
| `mac_addresses` | string[] | no | Port MAC addresses (for network devices) |
| `present` | bool | no | Presence (defaults to `true`) |
| + common status fields | | | see the section above |
Pass `numa_node` for NIC / InfiniBand / RAID / GPU devices whenever the source knows the CPU/NUMA affinity. The field is stored in the PCIe component's snapshot attributes and mirrored into telemetry for topology use cases.
Use `temperature_c` and `power_w` for device-level telemetry of GPUs / accelerators / smart PCIe devices. They do not affect component identity.
**`serial_number` generation when missing or `"N/A"`:** `{board_serial}-PCIE-{slot}`, where `slot` for PCIe equals the BDF.
`slot` is the single canonical component address. For PCIe, pass the BDF in `slot`. The `bdf` field is kept only as a transitional alias on ingest and must not be used as a separate coordinate alongside `slot`.
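A minimal sketch of the `bdf` normalization and serial fallback rules, assuming a plain-dict device payload (the helper name is hypothetical; the real normalization happens server-side in ingest):

```python
def normalize_pcie_identity(device, board_serial):
    """Fold the deprecated `bdf` alias into `slot`, then derive a serial
    for devices reporting none or "N/A" (slot for PCIe is the BDF)."""
    dev = dict(device)
    bdf = dev.pop("bdf", None)          # transitional alias only, never kept
    if not dev.get("slot") and bdf:
        dev["slot"] = bdf
    serial = (dev.get("serial_number") or "").strip()
    if serial in ("", "N/A") and dev.get("slot"):
        dev["serial_number"] = f"{board_serial}-PCIE-{dev['slot']}"
    return dev
```

Collectors that already put the BDF into `slot` pass through unchanged; `bdf` is dropped either way.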
**`device_class` values:**
| Value | Purpose |
|----------|------------|
| `MassStorageController` | RAID controllers |
| `StorageController` | HBAs, SAS controllers |
| `NetworkController` | Network adapters (InfiniBand, generic) |
| `EthernetController` | Ethernet NICs |
| `FibreChannelController` | Fibre Channel HBAs |
| `VideoController` | GPUs, graphics cards |
| `ProcessingAccelerator` | Compute accelerators (AI/ML) |
| `DisplayController` | Display controllers (BMC VGA) |
The list is open-ended: arbitrary strings are allowed for non-standard classes.
```json
"pcie_devices": [
{
"slot": "0000:3b:00.0",
"vendor_id": 5555,
"device_id": 4401,
"numa_node": 0,
"temperature_c": 48.5,
"power_w": 18.2,
"sfp_temperature_c": 36.2,
"sfp_tx_power_dbm": -1.8,
"sfp_rx_power_dbm": -2.1,
"device_class": "EthernetController",
"manufacturer": "Intel",
"model": "X710 10GbE",
"serial_number": "K65472-003",
"firmware": "9.20 0x8000d4ae",
"mac_addresses": ["3c:fd:fe:aa:bb:cc", "3c:fd:fe:aa:bb:cd"],
"status": "OK"
}
]
```
---
### power_supplies
| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `slot` | string | no | Slot identifier |
| `present` | bool | no | Presence (defaults to `true`) |
| `serial_number` | string | no | Serial number |
| `part_number` | string | no | Part number |
| `model` | string | no | Model |
| `vendor` | string | no | Manufacturer |
| `wattage_w` | int | no | Rated power in watts |
| `firmware` | string | no | Firmware version |
| `input_type` | string | no | Input type (e.g. `ACWideRange`) |
| `input_voltage` | float | no | Input voltage, V (telemetry) |
| `input_power_w` | float | no | Input power, W (telemetry) |
| `output_power_w` | float | no | Output power, W (telemetry) |
| `temperature_c` | float | no | PSU temperature, °C (telemetry) |
| `life_remaining_pct` | float | no | Remaining life / health, % |
| `life_used_pct` | float | no | Used life / wear, % |
| + common status fields | | | see the section above |
The telemetry fields (`input_voltage`, `input_power_w`, `output_power_w`, `temperature_c`, `life_remaining_pct`, `life_used_pct`) are stored in component attributes and do not affect component identity.
A PSU without a `serial_number` is ignored.
```json
"power_supplies": [
{
"slot": "0",
"present": true,
"model": "GW-CRPS3000LW",
"vendor": "Great Wall",
"wattage_w": 3000,
"serial_number": "2P06C102610",
"firmware": "00.03.05",
"status": "OK",
"input_type": "ACWideRange",
"input_power_w": 137,
"output_power_w": 104,
"input_voltage": 215.25,
"temperature_c": 39.5,
"life_remaining_pct": 97.0
}
]
```
---
### sensors
Server sensor readings. The section is optional and not tied to components.
Data is stored at the Asset level as the last known value (last-known-value).
```json
"sensors": {
"fans": [ ... ],
"power": [ ... ],
"temperatures": [ ... ],
"other": [ ... ]
}
```
---
### event_logs
Normalized operational server logs from `host`, `bmc`, or `redfish`.
These records do not enter the history timeline and do not create history events. They are stored in a separate deduplicated log store and shown in a dedicated asset logs / host logs UI block.
| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `source` | string | **yes** | Log source: `host`, `bmc`, `redfish` |
| `event_time` | string RFC3339 | no | Event time from the source; if absent, the ingest/collection time is used |
| `severity` | string | no | Level: `OK`, `Info`, `Warning`, `Critical`, `Unknown` |
| `message_id` | string | no | Source event identifier/code |
| `message` | string | **yes** | Normalized event text |
| `component_ref` | string | no | Reference to a component/device/slot, when it can be extracted |
| `fingerprint` | string | no | Externally supplied dedup key; if not provided, the system computes its own |
| `is_active` | bool | no | Whether the event is still active/uncleared, if the source supports lifecycle |
| `raw_payload` | object | no | Raw vendor-specific payload for diagnostics |
**event_logs rules:**
- Logs are deduplicated per asset + source + fingerprint.
- If `fingerprint` is not provided, the system builds one from the normalized fields (`source`, `message_id`, `message`, `component_ref`, time normalization).
- The integrator/log collector must not synthesize event content: do not invent `message`, `message_id`, `component_ref`, serial/device identifiers, or any other fields that are absent from the source log or were not reliably extracted.
- Receiving the same event again updates `last_seen_at`/the repeat counter and must not create a new timeline/history event.
- `event_logs` feed a separate log UI view and do not change the canonical state of components/asset by default.
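The dedup-key rule can be illustrated roughly like this. The system's actual normalization and hash are not specified here, so the field order, whitespace/case folding, and SHA-256 below are assumptions:

```python
import hashlib

def event_fingerprint(entry):
    """Return the collector-supplied fingerprint if present; otherwise
    derive a stable key from the normalized event fields."""
    supplied = entry.get("fingerprint")
    if supplied:
        return supplied  # externally provided dedup key wins
    parts = [
        entry.get("source", ""),
        entry.get("message_id", ""),
        # collapse whitespace and case so trivial variants dedupe together
        " ".join(entry.get("message", "").split()).lower(),
        entry.get("component_ref", ""),
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
```

Whatever the concrete scheme, the key property is determinism: the same logical event must always map to the same fingerprint so repeats update `last_seen_at` instead of creating new records.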
```json
"event_logs": [
{
"source": "bmc",
"event_time": "2026-03-15T14:03:11Z",
"severity": "Warning",
"message_id": "0x000F",
"message": "Correctable ECC error threshold exceeded",
"component_ref": "CPU0_C0D0",
"raw_payload": {
"sensor": "DIMM_A1",
"sel_record_id": "0042"
}
},
{
"source": "redfish",
"event_time": "2026-03-15T14:03:20Z",
"severity": "Info",
"message_id": "OpenBMC.0.1.SystemReboot",
"message": "System reboot requested by administrator",
"component_ref": "Mainboard"
}
]
```
#### sensors.fans
| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `name` | string | **yes** | Sensor name, unique within the section |
| `location` | string | no | Physical location |
| `rpm` | int | no | Speed, RPM |
| `status` | string | no | Status: `OK`, `Warning`, `Critical`, `Unknown` |
#### sensors.power
| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `name` | string | **yes** | Unique sensor name |
| `location` | string | no | Physical location |
| `voltage_v` | float | no | Voltage, V |
| `current_a` | float | no | Current, A |
| `power_w` | float | no | Power, W |
| `status` | string | no | Status |
#### sensors.temperatures
| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `name` | string | **yes** | Unique sensor name |
| `location` | string | no | Physical location |
| `celsius` | float | no | Temperature, °C |
| `threshold_warning_celsius` | float | no | Warning threshold, °C |
| `threshold_critical_celsius` | float | no | Critical threshold, °C |
| `status` | string | no | Status |
#### sensors.other
| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `name` | string | **yes** | Unique sensor name |
| `location` | string | no | Physical location |
| `value` | float | no | Value |
| `unit` | string | no | Unit of measurement |
| `status` | string | no | Status |
**sensors rules:**
- Sensor identity is the pair `(sensor_type, name)`. For duplicates within one payload, the first occurrence wins.
- Sensors without a `name` are ignored.
- Values are overwritten on every import (upsert by key).
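The sensor rules above amount to a keyed upsert; a minimal sketch, with the flat `(sensor_type, name)`-keyed store shape assumed for illustration:

```python
def upsert_sensors(store, payload):
    """Apply the sensors rules: identity key = (sensor_type, name),
    nameless entries dropped, first duplicate within a payload wins,
    and values overwrite on every import (last-known-value)."""
    for sensor_type, readings in payload.items():
        seen = set()
        for reading in readings:
            name = reading.get("name")
            if not name:
                continue              # sensors without `name` are ignored
            key = (sensor_type, name)
            if key in seen:
                continue              # duplicate in one payload: keep first
            seen.add(key)
            store[key] = reading      # upsert: overwrite on each import
    return store
```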
```json
"sensors": {
"fans": [
{ "name": "FAN1", "location": "Front", "rpm": 4200, "status": "OK" },
{ "name": "FAN_CPU0", "location": "CPU0", "rpm": 5600, "status": "OK" }
],
"power": [
{ "name": "12V Rail", "location": "Mainboard", "voltage_v": 12.06, "status": "OK" },
{ "name": "PSU0 Input", "location": "PSU0", "voltage_v": 215.25, "current_a": 0.64, "power_w": 137.0, "status": "OK" }
],
"temperatures": [
{ "name": "CPU0 Temp", "location": "CPU0", "celsius": 46.0, "threshold_warning_celsius": 80.0, "threshold_critical_celsius": 95.0, "status": "OK" },
{ "name": "Inlet Temp", "location": "Front", "celsius": 22.0, "threshold_warning_celsius": 40.0, "threshold_critical_celsius": 50.0, "status": "OK" }
],
"other": [
{ "name": "System Humidity", "value": 38.5, "unit": "%", "status": "OK" }
]
}
```
---
## Component status handling
| Status | Behavior |
|--------|-----------|
| `OK` | Normal processing |
| `Warning` | A `COMPONENT_WARNING` event is created |
| `Critical` | A `COMPONENT_FAILED` event is created + a `failure_events` record |
| `Unknown` | The component is treated as working; a `COMPONENT_UNKNOWN` event is created |
| `Empty` | The component is not created/updated |
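The status table maps onto a small lookup; a sketch returning a `(process_component, history_event)` pair. The fallback for strings outside the table is an assumption, not documented behavior:

```python
def handle_status(status):
    """Map a component status to (process_component, history_event)."""
    table = {
        "OK":       (True, None),
        "Warning":  (True, "COMPONENT_WARNING"),
        "Critical": (True, "COMPONENT_FAILED"),   # plus a failure_events record
        "Unknown":  (True, "COMPONENT_UNKNOWN"),  # still treated as working
        "Empty":    (False, None),                # component not created/updated
    }
    # Assumed fallback: treat unrecognized statuses like Unknown.
    return table.get(status, (True, "COMPONENT_UNKNOWN"))
```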
---
## Handling missing serial_number
General rule for all sections: if the source did not return a serial number and the collector could not reliably extract one, the integrator must not substitute made-up values, hashes, local placeholder identifiers, or guessed serial numbers. Only the server-side ingest fallback rules explicitly listed below are allowed.
| Type | Behavior |
|-----|-----------|
| CPU | Generated: `{board_serial}-CPU-{socket}` |
| PCIe | Generated: `{board_serial}-PCIE-{slot}` (when serial = `"N/A"` or empty; `slot` for PCIe = BDF) |
| Memory | Component ignored |
| Storage | Component ignored |
| PSU | Component ignored |
If a `serial_number` is not unique within one payload for the same `model`:
- The first occurrence keeps the original serial number.
- Each subsequent duplicate gets a placeholder: `NO_SN-XXXXXXXX`.
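The duplicate-serial rule can be sketched as follows; the spec only fixes the `NO_SN-XXXXXXXX` shape, so using 8 random hex characters for the placeholder is an assumption:

```python
import secrets

def dedupe_serials(components):
    """Within one payload, the first (model, serial_number) occurrence keeps
    its serial; each later duplicate gets a NO_SN-XXXXXXXX placeholder."""
    seen = set()
    out = []
    for comp in components:
        key = (comp.get("model"), comp.get("serial_number"))
        if comp.get("serial_number") and key in seen:
            # Replace the duplicate serial without mutating the caller's dict.
            comp = {**comp, "serial_number": "NO_SN-" + secrets.token_hex(4).upper()}
        else:
            seen.add(key)
        out.append(comp)
    return out
```

Note the key includes `model`: the same serial on two different models is not treated as a duplicate.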
---
## Minimal valid example
```json
{
"collected_at": "2026-02-10T15:30:00Z",
"target_host": "192.168.1.100",
"hardware": {
"board": {
"serial_number": "SRV-001"
}
}
}
```
---
## Full example with status history
```json
{
"filename": "redfish://10.10.10.103",
"source_type": "api",
"protocol": "redfish",
"target_host": "10.10.10.103",
"collected_at": "2026-02-10T15:30:00Z",
"hardware": {
"board": {
"manufacturer": "Supermicro",
"product_name": "X12DPG-QT6",
"serial_number": "21D634101"
},
"firmware": [
{ "device_name": "BIOS", "version": "06.08.05" },
{ "device_name": "BMC", "version": "5.17.00" }
],
"cpus": [
{
"socket": 0,
"model": "INTEL(R) XEON(R) GOLD 6530",
"manufacturer": "Intel",
"cores": 32,
"threads": 64,
"status": "OK"
}
],
"storage": [
{
"slot": "OB01",
"type": "NVMe",
"model": "INTEL SSDPF2KX076T1",
"size_gb": 7680,
"serial_number": "BTAX41900GF87P6DGN",
"manufacturer": "Intel",
"firmware": "9CV10510",
"present": true,
"status": "OK",
"status_changed_at": "2026-02-10T15:22:00Z",
"status_history": [
{ "status": "Critical", "changed_at": "2026-02-10T15:10:00Z", "details": "I/O timeout on NVMe queue 3" },
{ "status": "OK", "changed_at": "2026-02-10T15:22:00Z", "details": "Recovered after controller reset" }
]
}
],
"pcie_devices": [
{
"slot": "0000:18:00.0",
"device_class": "EthernetController",
"manufacturer": "Intel",
"model": "X710 10GbE",
"serial_number": "K65472-003",
"mac_addresses": ["3c:fd:fe:aa:bb:cc", "3c:fd:fe:aa:bb:cd"],
"status": "OK"
}
],
"power_supplies": [
{
"slot": "0",
"present": true,
"model": "GW-CRPS3000LW",
"vendor": "Great Wall",
"wattage_w": 3000,
"serial_number": "2P06C102610",
"firmware": "00.03.05",
"status": "OK",
"input_power_w": 137,
"output_power_w": 104,
"input_voltage": 215.25
}
],
"sensors": {
"fans": [
{ "name": "FAN1", "location": "Front", "rpm": 4200, "status": "OK" }
],
"power": [
{ "name": "12V Rail", "voltage_v": 12.06, "status": "OK" }
],
"temperatures": [
{ "name": "CPU0 Temp", "celsius": 46.0, "threshold_warning_celsius": 80.0, "threshold_critical_celsius": 95.0, "status": "OK" }
],
"other": [
{ "name": "System Humidity", "value": 38.5, "unit": "%" }
]
}
}
}
```


@@ -1,28 +0,0 @@
# Test Server Collection Memory
Keep this table updated after each test-server run.
Definition:
- `Collection Time` = total Redfish collection duration from `collect.log`.
- `Speed` = `Documents / seconds`.
- `Metrics Collected` = sum of `Counts` fields (`cpus + memory + storage + pcie + gpus + nics + psus + firmware`).
- `n/a` means the log does not contain enough timestamp metadata to calculate duration/speed.
## Server Model: `NF5688M7`
| Date (UTC) | App Version | Collection Time | Documents | Speed | Metrics Collected | Notes |
|---|---|---:|---:|---:|---:|---|
| 2026-02-28 | `v1.7.1-12-g612058e` (`612058e`) | 10m10s (610s) | 228 | 0.37 docs/s | 98 | 2026-02-28 (SERVER MODEL) - 23E100043.zip |
| 2026-02-28 | `v1.7.1-11-ge0146ad` (`e0146ad`) | 9m36s (576s) | 138 | 0.24 docs/s | 110 | 2026-02-28 (SERVER MODEL) - 23E100042.zip |
| 2026-02-28 | `v1.7.1-10-g9a30705` (`9a30705`) | 20m47s (1247s) | 106 | 0.09 docs/s | 97 | 2026-02-28 (SERVER MODEL) - 23E100053.zip |
| 2026-02-28 | `v1.7.1` (`6c19a58`) | 15m08s (908s) | 184 | 0.20 docs/s | 96 | 2026-02-28 (DDR5 DIMM) - 23E100051.zip |
| 2026-02-28 | `v1.7.0` (`ddab93a`) | n/a | 193 | n/a | 61 | 2026-02-28 (NULL) - 23E100051.zip |
| 2026-02-28 | `v1.7.0` (`ddab93a`) | n/a | 291 | n/a | 61 | 2026-02-28 (NULL) - 23E100206.zip |
## Server Model: `KR1280-X2-A0-R0-00`
| Date (UTC) | App Version | Collection Time | Documents | Speed | Metrics Collected | Notes |
|---|---|---:|---:|---:|---:|---|
| 2026-02-28 | `v1.7.1-12-g612058e` (`612058e`) | 6m15s (375s) | 185 | 0.49 docs/s | 46 | 2026-02-28 (KR1280-X2-A0-R0-00) - 23D401657.zip |
| 2026-02-28 | `v1.7.1-9-g8dbbec3-dirty` (`8dbbec3`) | 6m16s (376s) | 165 | 0.44 docs/s | 46 | 2026-02-28 (KR1280-X2-A0-R0-00) - 23D401657-2.zip |
| 2026-02-28 | `v1.7.1-7-gc52fea2` (`c52fea2`) | 10m51s (651s) | 227 | 0.35 docs/s | 40 | 2026-02-28 (KR1280-X2-A0-R0-00) - 23D401657 copy.zip |

go.mod

@@ -1,3 +1,7 @@
 module git.mchus.pro/mchus/logpile
 
-go 1.22
+go 1.24.0
+
+require reanimator/chart v0.0.0
+
+replace reanimator/chart => ./internal/chart

internal/chart Submodule

Submodule internal/chart added at c025ae0477

File diff suppressed because it is too large


@@ -3,10 +3,12 @@ package collector
import (
"encoding/json"
"fmt"
"log"
"sort"
"strings"
"time"
"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
"git.mchus.pro/mchus/logpile/internal/models"
)
@@ -29,8 +31,9 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
if emit != nil {
emit(Progress{Status: "running", Progress: 10, Message: "Redfish snapshot: replay service root..."})
}
if _, err := r.getJSON("/redfish/v1"); err != nil {
return nil, fmt.Errorf("redfish service root: %w", err)
serviceRootDoc, err := r.getJSON("/redfish/v1")
if err != nil {
log.Printf("redfish replay: service root /redfish/v1 missing from snapshot, continuing with defaults: %v", err)
}
systemPaths := r.discoverMemberPaths("/redfish/v1/Systems", "/redfish/v1/Systems/1")
@@ -48,6 +51,7 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
return nil, fmt.Errorf("system info: %w", err)
}
chassisDoc, _ := r.getJSON(primaryChassis)
managerDoc, _ := r.getJSON(primaryManager)
biosDoc, _ := r.getJSON(joinPath(primarySystem, "/Bios"))
secureBootDoc, _ := r.getJSON(joinPath(primarySystem, "/SecureBoot"))
systemFRUDoc, _ := r.getJSON(joinPath(primarySystem, "/Oem/Public/FRU"))
@@ -57,22 +61,32 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
fruDoc = chassisFRUDoc
}
boardFallbackDocs := r.collectBoardFallbackDocs(systemPaths, chassisPaths)
resourceHints := append(append([]string{}, systemPaths...), append(chassisPaths, managerPaths...)...)
profileSignals := redfishprofile.CollectSignals(serviceRootDoc, systemDoc, chassisDoc, managerDoc, resourceHints)
profileMatch := redfishprofile.MatchProfiles(profileSignals)
analysisPlan := redfishprofile.ResolveAnalysisPlan(profileMatch, tree, redfishprofile.DiscoveredResources{
SystemPaths: systemPaths,
ChassisPaths: chassisPaths,
ManagerPaths: managerPaths,
}, profileSignals)
if emit != nil {
emit(Progress{Status: "running", Progress: 55, Message: "Redfish snapshot: replay CPU/RAM/Storage..."})
}
processors, _ := r.getCollectionMembers(joinPath(primarySystem, "/Processors"))
memory, _ := r.getCollectionMembers(joinPath(primarySystem, "/Memory"))
storageDevices := r.collectStorage(primarySystem)
storageVolumes := r.collectStorageVolumes(primarySystem)
processors := r.collectProcessors(primarySystem)
memory := r.collectMemory(primarySystem)
storageDevices := r.collectStorage(primarySystem, analysisPlan)
storageVolumes := r.collectStorageVolumes(primarySystem, analysisPlan)
if emit != nil {
emit(Progress{Status: "running", Progress: 80, Message: "Redfish snapshot: replay network/BMC..."})
}
psus := r.collectPSUs(chassisPaths)
pcieDevices := r.collectPCIeDevices(systemPaths, chassisPaths)
gpus := r.collectGPUs(systemPaths, chassisPaths)
gpus = r.collectGPUsFromProcessors(systemPaths, chassisPaths, gpus)
boardInfo := parseBoardInfoWithFallback(systemDoc, chassisDoc, fruDoc)
applyBoardInfoFallbackFromDocs(&boardInfo, boardFallbackDocs)
gpus := r.collectGPUs(systemPaths, chassisPaths, analysisPlan)
gpus = r.collectGPUsFromProcessors(systemPaths, chassisPaths, gpus, analysisPlan)
nics := r.collectNICs(chassisPaths)
r.enrichNICsFromNetworkInterfaces(&nics, systemPaths)
thresholdSensors := r.collectThresholdSensors(chassisPaths)
@@ -81,27 +95,24 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
discreteEvents := r.collectDiscreteSensorEvents(chassisPaths)
healthEvents := r.collectHealthSummaryEvents(chassisPaths)
driveFetchWarningEvents := buildDriveFetchWarningEvents(rawPayloads)
managerDoc, _ := r.getJSON(primaryManager)
networkProtocolDoc, _ := r.getJSON(joinPath(primaryManager, "/NetworkProtocol"))
firmware := parseFirmware(systemDoc, biosDoc, managerDoc, secureBootDoc, networkProtocolDoc)
firmware = dedupeFirmwareInfo(append(firmware, r.collectFirmwareInventory()...))
boardInfo := parseBoardInfoWithFallback(systemDoc, chassisDoc, fruDoc)
applyBoardInfoFallbackFromDocs(&boardInfo, boardFallbackDocs)
boardInfo.BMCMACAddress = r.collectBMCMAC(managerPaths)
assemblyFRU := r.collectAssemblyFRU(chassisPaths)
collectedAt, sourceTimezone := inferRedfishCollectionTime(managerDoc, rawPayloads)
result := &models.AnalysisResult{
CollectedAt: collectedAt,
CollectedAt: collectedAt,
SourceTimezone: sourceTimezone,
Events: append(append(append(make([]models.Event, 0, len(discreteEvents)+len(healthEvents)+len(driveFetchWarningEvents)+1), healthEvents...), discreteEvents...), driveFetchWarningEvents...),
FRU: assemblyFRU,
Sensors: dedupeSensorReadings(append(append(thresholdSensors, thermalSensors...), powerSensors...)),
RawPayloads: cloneRawPayloads(rawPayloads),
Events: append(append(append(make([]models.Event, 0, len(discreteEvents)+len(healthEvents)+len(driveFetchWarningEvents)+1), healthEvents...), discreteEvents...), driveFetchWarningEvents...),
FRU: assemblyFRU,
Sensors: dedupeSensorReadings(append(append(thresholdSensors, thermalSensors...), powerSensors...)),
RawPayloads: cloneRawPayloads(rawPayloads),
Hardware: &models.HardwareConfig{
BoardInfo: boardInfo,
CPUs: parseCPUs(processors),
Memory: parseMemory(memory),
CPUs: processors,
Memory: memory,
Storage: storageDevices,
Volumes: storageVolumes,
PCIeDevices: pcieDevices,
@@ -111,10 +122,36 @@ func ReplayRedfishFromRawPayloads(rawPayloads map[string]any, emit ProgressFn) (
Firmware: firmware,
},
}
match := profileMatch
for _, profile := range match.Profiles {
profile.PostAnalyze(result, tree, profileSignals)
}
if result.RawPayloads == nil {
result.RawPayloads = map[string]any{}
}
appliedProfiles := make([]string, 0, len(match.Profiles))
for _, profile := range match.Profiles {
appliedProfiles = append(appliedProfiles, profile.Name())
}
result.RawPayloads["redfish_analysis_profiles"] = map[string]any{
"mode": match.Mode,
"profiles": appliedProfiles,
}
result.RawPayloads["redfish_analysis_plan"] = map[string]any{
"mode": analysisPlan.Match.Mode,
"profiles": appliedProfiles,
"notes": analysisPlan.Notes,
"directives": map[string]any{
"processor_gpu_fallback": analysisPlan.Directives.EnableProcessorGPUFallback,
"supermicro_nvme_backplane": analysisPlan.Directives.EnableSupermicroNVMeBackplane,
"processor_gpu_chassis_alias": analysisPlan.Directives.EnableProcessorGPUChassisAlias,
"generic_graphics_controller_dedup": analysisPlan.Directives.EnableGenericGraphicsControllerDedup,
"msi_processor_gpu_chassis_lookup": analysisPlan.Directives.EnableMSIProcessorGPUChassisLookup,
"storage_enclosure_recovery": analysisPlan.Directives.EnableStorageEnclosureRecovery,
"known_storage_controller_recovery": analysisPlan.Directives.EnableKnownStorageControllerRecovery,
},
}
if strings.TrimSpace(sourceTimezone) != "" {
if result.RawPayloads == nil {
result.RawPayloads = map[string]any{}
}
result.RawPayloads["source_timezone"] = sourceTimezone
}
appendMissingServerModelWarning(result, systemDoc, joinPath(primarySystem, "/Oem/Public/FRU"), joinPath(primaryChassis, "/Oem/Public/FRU"))
@@ -272,11 +309,7 @@ func (r redfishSnapshotReader) collectFirmwareInventory() []models.FirmwareInfo
if strings.TrimSpace(version) == "" {
continue
}
name := firstNonEmpty(
asString(doc["DeviceName"]),
asString(doc["Name"]),
asString(doc["Id"]),
)
name := firmwareInventoryDeviceName(doc)
name = strings.TrimSpace(name)
if name == "" {
continue
@@ -290,6 +323,25 @@ func (r redfishSnapshotReader) collectFirmwareInventory() []models.FirmwareInfo
return out
}
func firmwareInventoryDeviceName(doc map[string]interface{}) string {
name := strings.TrimSpace(asString(doc["DeviceName"]))
if name != "" {
return name
}
id := strings.TrimSpace(asString(doc["Id"]))
rawName := strings.TrimSpace(asString(doc["Name"]))
if strings.EqualFold(rawName, "Software Inventory") || strings.EqualFold(rawName, "Firmware Inventory") {
if id != "" {
return id
}
}
if rawName != "" {
return rawName
}
return id
}
func dedupeFirmwareInfo(items []models.FirmwareInfo) []models.FirmwareInfo {
seen := make(map[string]struct{}, len(items))
out := make([]models.FirmwareInfo, 0, len(items))
@@ -651,57 +703,6 @@ func (r redfishSnapshotReader) collectHealthSummaryEvents(chassisPaths []string)
return out
}
func (r redfishSnapshotReader) enrichNICsFromNetworkInterfaces(nics *[]models.NetworkAdapter, systemPaths []string) {
if nics == nil {
return
}
bySlot := make(map[string]int, len(*nics))
for i, nic := range *nics {
bySlot[strings.ToLower(strings.TrimSpace(nic.Slot))] = i
}
for _, systemPath := range systemPaths {
ifaces, err := r.getCollectionMembers(joinPath(systemPath, "/NetworkInterfaces"))
if err != nil || len(ifaces) == 0 {
continue
}
for _, iface := range ifaces {
slot := firstNonEmpty(asString(iface["Id"]), asString(iface["Name"]))
if strings.TrimSpace(slot) == "" {
continue
}
idx, ok := bySlot[strings.ToLower(strings.TrimSpace(slot))]
if !ok {
*nics = append(*nics, models.NetworkAdapter{
Slot: slot,
Present: true,
Model: firstNonEmpty(asString(iface["Model"]), asString(iface["Name"])),
Status: mapStatus(iface["Status"]),
})
idx = len(*nics) - 1
bySlot[strings.ToLower(strings.TrimSpace(slot))] = idx
}
portsPath := redfishLinkedPath(iface, "NetworkPorts")
if portsPath == "" {
continue
}
portDocs, err := r.getCollectionMembers(portsPath)
if err != nil || len(portDocs) == 0 {
continue
}
macs := append([]string{}, (*nics)[idx].MACAddresses...)
for _, p := range portDocs {
macs = append(macs, collectNetworkPortMACs(p)...)
}
(*nics)[idx].MACAddresses = dedupeStrings(macs)
if sanitizeNetworkPortCount((*nics)[idx].PortCount) == 0 {
(*nics)[idx].PortCount = len(portDocs)
}
}
}
}
func collectNetworkPortMACs(doc map[string]interface{}) []string {
if len(doc) == 0 {
return nil
@@ -740,79 +741,6 @@ func dedupeStrings(items []string) []string {
return out
}
func (r redfishSnapshotReader) collectBoardFallbackDocs(systemPaths, chassisPaths []string) []map[string]interface{} {
out := make([]map[string]interface{}, 0)
for _, chassisPath := range chassisPaths {
for _, suffix := range []string{"/Boards", "/Backplanes"} {
path := joinPath(chassisPath, suffix)
if docs, err := r.getCollectionMembers(path); err == nil && len(docs) > 0 {
out = append(out, docs...)
continue
}
if doc, err := r.getJSON(path); err == nil && len(doc) > 0 {
out = append(out, doc)
}
}
}
for _, path := range append(append([]string{}, systemPaths...), chassisPaths...) {
for _, suffix := range []string{"/Oem/Public", "/Oem/Public/ThermalConfig", "/ThermalConfig"} {
docPath := joinPath(path, suffix)
if doc, err := r.getJSON(docPath); err == nil && len(doc) > 0 {
out = append(out, doc)
}
}
}
return out
}
func applyBoardInfoFallbackFromDocs(board *models.BoardInfo, docs []map[string]interface{}) {
if board == nil || len(docs) == 0 {
return
}
for _, doc := range docs {
candidate := parseBoardInfoFromFRUDoc(doc)
if !isLikelyServerProductName(candidate.ProductName) {
continue
}
if board.Manufacturer == "" {
board.Manufacturer = candidate.Manufacturer
}
if board.ProductName == "" {
board.ProductName = candidate.ProductName
}
if board.SerialNumber == "" {
board.SerialNumber = candidate.SerialNumber
}
if board.PartNumber == "" {
board.PartNumber = candidate.PartNumber
}
if board.Manufacturer != "" && board.ProductName != "" && board.SerialNumber != "" && board.PartNumber != "" {
return
}
}
}
func isLikelyServerProductName(v string) bool {
v = strings.TrimSpace(v)
if v == "" {
return false
}
n := strings.ToUpper(v)
if strings.Contains(n, "NULL") {
return false
}
componentTokens := []string{
"DIMM", "DDR", "NVME", "SSD", "HDD", "GPU", "NIC", "RAID",
"PSU", "FAN", "BACKPLANE", "FRU",
}
for _, token := range componentTokens {
if strings.Contains(n, strings.ToUpper(token)) {
return false
}
}
return true
}
type redfishSnapshotReader struct {
tree map[string]interface{}
}
@@ -976,204 +904,75 @@ func (r redfishSnapshotReader) getLinkedPCIeFunctions(doc map[string]interface{}
return nil
}
func (r redfishSnapshotReader) collectStorage(systemPath string) []models.Storage {
var out []models.Storage
storageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/Storage"))
for _, member := range storageMembers {
if driveCollection, ok := member["Drives"].(map[string]interface{}); ok {
if driveCollectionPath := asString(driveCollection["@odata.id"]); driveCollectionPath != "" {
driveDocs, err := r.getCollectionMembers(driveCollectionPath)
if err == nil {
for _, driveDoc := range driveDocs {
if !isVirtualStorageDrive(driveDoc) {
out = append(out, parseDrive(driveDoc))
}
}
if len(driveDocs) == 0 {
for _, driveDoc := range r.probeDirectDiskBayChildren(driveCollectionPath) {
out = append(out, parseDrive(driveDoc))
}
}
}
continue
}
}
if drives, ok := member["Drives"].([]interface{}); ok {
for _, driveAny := range drives {
driveRef, ok := driveAny.(map[string]interface{})
if !ok {
continue
}
odata := asString(driveRef["@odata.id"])
if odata == "" {
continue
}
driveDoc, err := r.getJSON(odata)
if err != nil {
continue
}
if !isVirtualStorageDrive(driveDoc) {
out = append(out, parseDrive(driveDoc))
}
}
continue
}
if looksLikeDrive(member) {
out = append(out, parseDrive(member))
}
for _, enclosurePath := range redfishLinkRefs(member, "Links", "Enclosures") {
driveDocs, err := r.getCollectionMembers(joinPath(enclosurePath, "/Drives"))
if err == nil {
for _, driveDoc := range driveDocs {
if looksLikeDrive(driveDoc) {
out = append(out, parseDrive(driveDoc))
}
}
if len(driveDocs) == 0 {
for _, driveDoc := range r.probeDirectDiskBayChildren(joinPath(enclosurePath, "/Drives")) {
out = append(out, parseDrive(driveDoc))
}
}
}
}
func (r redfishSnapshotReader) getLinkedSupplementalDocs(doc map[string]interface{}, keys ...string) []map[string]interface{} {
if len(doc) == 0 || len(keys) == 0 {
return nil
}
for _, driveDoc := range r.collectKnownStorageMembers(systemPath, []string{
"/Storage/IntelVROC/Drives",
"/Storage/IntelVROC/Controllers/1/Drives",
}) {
if looksLikeDrive(driveDoc) {
out = append(out, parseDrive(driveDoc))
}
}
simpleStorageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/SimpleStorage"))
for _, member := range simpleStorageMembers {
devices, ok := member["Devices"].([]interface{})
if !ok {
continue
}
for _, devAny := range devices {
devDoc, ok := devAny.(map[string]interface{})
if !ok || !looksLikeDrive(devDoc) {
continue
}
out = append(out, parseDrive(devDoc))
}
}
chassisPaths := r.discoverMemberPaths("/redfish/v1/Chassis", "/redfish/v1/Chassis/1")
for _, chassisPath := range chassisPaths {
driveDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/Drives"))
if err != nil {
continue
}
for _, driveDoc := range driveDocs {
if !looksLikeDrive(driveDoc) {
continue
}
out = append(out, parseDrive(driveDoc))
}
}
for _, chassisPath := range chassisPaths {
if !isSupermicroNVMeBackplanePath(chassisPath) {
continue
}
for _, driveDoc := range r.probeSupermicroNVMeDiskBays(chassisPath) {
if !looksLikeDrive(driveDoc) {
continue
}
out = append(out, parseDrive(driveDoc))
}
}
return dedupeStorage(out)
}
func (r redfishSnapshotReader) collectStorageVolumes(systemPath string) []models.StorageVolume {
var out []models.StorageVolume
storageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/Storage"))
for _, member := range storageMembers {
controller := firstNonEmpty(asString(member["Id"]), asString(member["Name"]))
volumeCollectionPath := redfishLinkedPath(member, "Volumes")
if volumeCollectionPath == "" {
continue
}
volumeDocs, err := r.getCollectionMembers(volumeCollectionPath)
if err != nil {
continue
}
for _, volDoc := range volumeDocs {
if looksLikeVolume(volDoc) {
out = append(out, parseStorageVolume(volDoc, controller))
}
}
}
for _, volDoc := range r.collectKnownStorageMembers(systemPath, []string{
"/Storage/IntelVROC/Volumes",
"/Storage/HA-RAID/Volumes",
"/Storage/MRVL.HA-RAID/Volumes",
}) {
if looksLikeVolume(volDoc) {
out = append(out, parseStorageVolume(volDoc, storageControllerFromPath(asString(volDoc["@odata.id"]))))
}
}
return dedupeStorageVolumes(out)
}
func (r redfishSnapshotReader) collectKnownStorageMembers(systemPath string, relativeCollections []string) []map[string]interface{} {
	var out []map[string]interface{}
	for _, rel := range relativeCollections {
		docs, err := r.getCollectionMembers(joinPath(systemPath, rel))
		if err != nil || len(docs) == 0 {
			continue
		}
		out = append(out, docs...)
	}
	return out
}
func (r redfishSnapshotReader) getLinkedSupplementalDocs(doc map[string]interface{}, keys ...string) []map[string]interface{} {
	if len(doc) == 0 || len(keys) == 0 {
		return nil
	}
	var out []map[string]interface{}
	seen := make(map[string]struct{})
	for _, key := range keys {
		path := normalizeRedfishPath(redfishLinkedPath(doc, key))
		if path == "" {
			continue
		}
		if _, ok := seen[path]; ok {
			continue
		}
		supplementalDoc, err := r.getJSON(path)
		if err != nil || len(supplementalDoc) == 0 {
			continue
		}
		seen[path] = struct{}{}
		out = append(out, supplementalDoc)
	}
	return out
}
func (r redfishSnapshotReader) probeSupermicroNVMeDiskBays(backplanePath string) []map[string]interface{} {
return r.probeDirectDiskBayChildren(joinPath(backplanePath, "/Drives"))
}
func (r redfishSnapshotReader) probeDirectDiskBayChildren(drivesCollectionPath string) []map[string]interface{} {
var out []map[string]interface{}
for _, path := range directDiskBayCandidates(drivesCollectionPath) {
doc, err := r.getJSON(path)
if err != nil || !looksLikeDrive(doc) {
			continue
		}
		out = append(out, doc)
	}
	return out
}
func (r redfishSnapshotReader) collectProcessors(systemPath string) []models.CPU {
memberDocs, err := r.getCollectionMembers(joinPath(systemPath, "/Processors"))
if err != nil || len(memberDocs) == 0 {
return nil
}
out := make([]models.CPU, 0, len(memberDocs))
socketIdx := 0
for _, doc := range memberDocs {
if pt := strings.TrimSpace(asString(doc["ProcessorType"])); pt != "" &&
!strings.EqualFold(pt, "CPU") && !strings.EqualFold(pt, "General") {
continue
}
cpu := parseCPUs([]map[string]interface{}{doc})[0]
if cpu.Socket == 0 && socketIdx > 0 && strings.TrimSpace(asString(doc["Socket"])) == "" {
cpu.Socket = socketIdx
if cpu.Details == nil {
cpu.Details = map[string]any{}
}
cpu.Details["socket"] = cpu.Socket
}
supplementalDocs := r.getLinkedSupplementalDocs(doc, "ProcessorMetrics", "EnvironmentMetrics", "Metrics")
if len(supplementalDocs) > 0 {
cpu.Details = mergeGenericDetails(cpu.Details, redfishCPUDetailsAcrossDocs(doc, supplementalDocs...))
}
out = append(out, cpu)
socketIdx++
}
return out
}
func (r redfishSnapshotReader) collectNICs(chassisPaths []string) []models.NetworkAdapter {
var nics []models.NetworkAdapter
for _, chassisPath := range chassisPaths {
adapterDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/NetworkAdapters"))
if err != nil {
continue
}
for _, doc := range adapterDocs {
nic := parseNIC(doc)
for _, pciePath := range networkAdapterPCIeDevicePaths(doc) {
pcieDoc, err := r.getJSON(pciePath)
if err != nil {
continue
}
functionDocs := r.getLinkedPCIeFunctions(pcieDoc)
enrichNICFromPCIe(&nic, pcieDoc, functionDocs)
}
// Collect MACs from NetworkDeviceFunctions when not found via PCIe path.
if len(nic.MACAddresses) == 0 {
r.enrichNICMACsFromNetworkDeviceFunctions(&nic, doc)
}
nics = append(nics, nic)
}
	}
	return dedupeNetworkAdapters(nics)
}
func (r redfishSnapshotReader) collectMemory(systemPath string) []models.MemoryDIMM {
	memberDocs, err := r.getCollectionMembers(joinPath(systemPath, "/Memory"))
	if err != nil || len(memberDocs) == 0 {
		return nil
	}
	out := make([]models.MemoryDIMM, 0, len(memberDocs))
for _, doc := range memberDocs {
dimm := parseMemory([]map[string]interface{}{doc})[0]
supplementalDocs := r.getLinkedSupplementalDocs(doc, "MemoryMetrics", "EnvironmentMetrics", "Metrics")
if len(supplementalDocs) > 0 {
dimm.Details = mergeGenericDetails(dimm.Details, redfishMemoryDetailsAcrossDocs(doc, supplementalDocs...))
}
out = append(out, dimm)
}
return out
}
func (r redfishSnapshotReader) collectPSUs(chassisPaths []string) []models.PSU {
@@ -1183,7 +982,8 @@ func (r redfishSnapshotReader) collectPSUs(chassisPaths []string) []models.PSU {
for _, chassisPath := range chassisPaths {
if memberDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/PowerSubsystem/PowerSupplies")); err == nil && len(memberDocs) > 0 {
for _, doc := range memberDocs {
supplementalDocs := r.getLinkedSupplementalDocs(doc, "EnvironmentMetrics", "Metrics")
idx = appendPSU(&out, seen, parsePSUWithSupplementalDocs(doc, idx, supplementalDocs...), idx)
}
continue
}
@@ -1194,7 +994,8 @@ func (r redfishSnapshotReader) collectPSUs(chassisPaths []string) []models.PSU {
if !ok {
continue
}
supplementalDocs := r.getLinkedSupplementalDocs(doc, "EnvironmentMetrics", "Metrics")
idx = appendPSU(&out, seen, parsePSUWithSupplementalDocs(doc, idx, supplementalDocs...), idx)
}
}
}
@@ -1202,319 +1003,9 @@ func (r redfishSnapshotReader) collectPSUs(chassisPaths []string) []models.PSU {
return out
}
func (r redfishSnapshotReader) collectGPUs(systemPaths, chassisPaths []string) []models.GPU {
collections := make([]string, 0, len(systemPaths)*3+len(chassisPaths)*2)
for _, systemPath := range systemPaths {
collections = append(collections, joinPath(systemPath, "/PCIeDevices"))
collections = append(collections, joinPath(systemPath, "/Accelerators"))
collections = append(collections, joinPath(systemPath, "/GraphicsControllers"))
}
for _, chassisPath := range chassisPaths {
collections = append(collections, joinPath(chassisPath, "/PCIeDevices"))
collections = append(collections, joinPath(chassisPath, "/Accelerators"))
}
var out []models.GPU
seen := make(map[string]struct{})
idx := 1
for _, collectionPath := range collections {
memberDocs, err := r.getCollectionMembers(collectionPath)
if err != nil || len(memberDocs) == 0 {
continue
}
for _, doc := range memberDocs {
functionDocs := r.getLinkedPCIeFunctions(doc)
if !looksLikeGPU(doc, functionDocs) {
continue
}
gpu := parseGPU(doc, functionDocs, idx)
idx++
if shouldSkipGenericGPUDuplicate(out, gpu) {
continue
}
key := gpuDocDedupKey(doc, gpu)
if key == "" {
continue
}
if _, ok := seen[key]; ok {
continue
}
seen[key] = struct{}{}
out = append(out, gpu)
}
}
return dropModelOnlyGPUPlaceholders(out)
}
func (r redfishSnapshotReader) collectPCIeDevices(systemPaths, chassisPaths []string) []models.PCIeDevice {
collections := make([]string, 0, len(systemPaths)+len(chassisPaths))
for _, systemPath := range systemPaths {
collections = append(collections, joinPath(systemPath, "/PCIeDevices"))
}
for _, chassisPath := range chassisPaths {
collections = append(collections, joinPath(chassisPath, "/PCIeDevices"))
}
var out []models.PCIeDevice
for _, collectionPath := range collections {
memberDocs, err := r.getCollectionMembers(collectionPath)
if err != nil || len(memberDocs) == 0 {
continue
}
for _, doc := range memberDocs {
functionDocs := r.getLinkedPCIeFunctions(doc)
dev := parsePCIeDevice(doc, functionDocs)
out = append(out, dev)
}
}
for _, systemPath := range systemPaths {
functionDocs, err := r.getCollectionMembers(joinPath(systemPath, "/PCIeFunctions"))
if err != nil || len(functionDocs) == 0 {
continue
}
for idx, fn := range functionDocs {
dev := parsePCIeFunction(fn, idx+1)
out = append(out, dev)
}
}
return dedupePCIeDevices(out)
}
func stringsTrimTrailingSlash(s string) string {
for len(s) > 1 && s[len(s)-1] == '/' {
s = s[:len(s)-1]
}
return s
}
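The helper above strips trailing slashes while preserving a bare "/". A minimal standalone sketch of the same behavior (the name `trimTrailingSlash` is chosen here for illustration, not from the codebase):

```go
package main

import "fmt"

// trimTrailingSlash mirrors stringsTrimTrailingSlash above: strip trailing
// '/' characters but never reduce the string below one character, so a
// bare "/" survives intact.
func trimTrailingSlash(s string) string {
	for len(s) > 1 && s[len(s)-1] == '/' {
		s = s[:len(s)-1]
	}
	return s
}

func main() {
	fmt.Println(trimTrailingSlash("/redfish/v1/Chassis/")) // /redfish/v1/Chassis
	fmt.Println(trimTrailingSlash("/"))                    // /
	fmt.Println(trimTrailingSlash("/a//"))                 // /a
}
```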
// collectBMCMAC returns the MAC address of the first active BMC management
// interface found in Managers/*/EthernetInterfaces. Returns empty string if
// no MAC is available.
func (r redfishSnapshotReader) collectBMCMAC(managerPaths []string) string {
for _, managerPath := range managerPaths {
members, err := r.getCollectionMembers(joinPath(managerPath, "/EthernetInterfaces"))
if err != nil || len(members) == 0 {
continue
}
for _, doc := range members {
mac := strings.TrimSpace(firstNonEmpty(
asString(doc["PermanentMACAddress"]),
asString(doc["MACAddress"]),
))
if mac == "" || strings.EqualFold(mac, "00:00:00:00:00:00") {
continue
}
return strings.ToUpper(mac)
}
}
return ""
}
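The selection rule in collectBMCMAC — first candidate that is non-empty and not the all-zero placeholder, normalized to upper case — can be sketched in isolation (helper name `firstValidMAC` is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// firstValidMAC reproduces the filter from collectBMCMAC: skip blanks and
// the 00:00:00:00:00:00 placeholder that some BMCs report for unused
// interfaces, and return the first usable MAC in upper case.
func firstValidMAC(candidates []string) string {
	for _, mac := range candidates {
		mac = strings.TrimSpace(mac)
		if mac == "" || strings.EqualFold(mac, "00:00:00:00:00:00") {
			continue
		}
		return strings.ToUpper(mac)
	}
	return ""
}

func main() {
	fmt.Println(firstValidMAC([]string{"", "00:00:00:00:00:00", "aa:bb:cc:dd:ee:ff"}))
	// AA:BB:CC:DD:EE:FF
}
```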
// collectAssemblyFRU reads Chassis/*/Assembly documents and returns FRU entries
// for subcomponents (backplanes, PSUs, DIMMs, etc.) that carry meaningful
// serial or part numbers. Entries already present in dedicated collections
// (PSUs, DIMMs) are included here as well so that all FRU data is available
// in one place; deduplication by serial is performed.
func (r redfishSnapshotReader) collectAssemblyFRU(chassisPaths []string) []models.FRUInfo {
seen := make(map[string]struct{})
var out []models.FRUInfo
add := func(fru models.FRUInfo) {
key := strings.ToUpper(strings.TrimSpace(fru.SerialNumber))
if key == "" {
key = strings.ToUpper(strings.TrimSpace(fru.Description + "|" + fru.PartNumber))
}
if key == "" || key == "|" {
return
}
if _, ok := seen[key]; ok {
return
}
seen[key] = struct{}{}
out = append(out, fru)
}
for _, chassisPath := range chassisPaths {
doc, err := r.getJSON(joinPath(chassisPath, "/Assembly"))
if err != nil || len(doc) == 0 {
continue
}
assemblies, _ := doc["Assemblies"].([]interface{})
for _, aAny := range assemblies {
a, ok := aAny.(map[string]interface{})
if !ok {
continue
}
name := strings.TrimSpace(firstNonEmpty(asString(a["Name"]), asString(a["Description"])))
model := strings.TrimSpace(asString(a["Model"]))
partNumber := strings.TrimSpace(asString(a["PartNumber"]))
serial := extractAssemblySerial(a)
if serial == "" && partNumber == "" {
continue
}
add(models.FRUInfo{
Description: name,
ProductName: model,
SerialNumber: serial,
PartNumber: partNumber,
})
}
}
return out
}
// extractAssemblySerial tries to find a serial number in an Assembly entry.
// Standard Redfish Assembly has no top-level SerialNumber; vendors put it in Oem.
func extractAssemblySerial(a map[string]interface{}) string {
// Some implementations expose it at top level.
if s := strings.TrimSpace(asString(a["SerialNumber"])); s != "" {
return s
}
// Dig into Oem for vendor-specific structures (e.g. Huawei COMMONb).
oem, _ := a["Oem"].(map[string]interface{})
for _, v := range oem {
subtree, ok := v.(map[string]interface{})
if !ok {
continue
}
for _, v2 := range subtree {
node, ok := v2.(map[string]interface{})
if !ok {
continue
}
if s := strings.TrimSpace(asString(node["SerialNumber"])); s != "" {
return s
}
}
}
return ""
}
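The two-level Oem walk above can be exercised on a synthetic Assembly document; this self-contained sketch mirrors extractAssemblySerial (the vendor and node names in the sample map are made up for illustration):

```go
package main

import (
	"fmt"
	"strings"
)

func asStr(v interface{}) string { s, _ := v.(string); return s }

// assemblySerial mirrors extractAssemblySerial: prefer a top-level
// SerialNumber, otherwise scan two map levels into Oem (vendor -> node)
// for the first non-empty SerialNumber.
func assemblySerial(a map[string]interface{}) string {
	if s := strings.TrimSpace(asStr(a["SerialNumber"])); s != "" {
		return s
	}
	oem, _ := a["Oem"].(map[string]interface{})
	for _, v := range oem {
		subtree, ok := v.(map[string]interface{})
		if !ok {
			continue
		}
		for _, v2 := range subtree {
			node, ok := v2.(map[string]interface{})
			if !ok {
				continue
			}
			if s := strings.TrimSpace(asStr(node["SerialNumber"])); s != "" {
				return s
			}
		}
	}
	return ""
}

func main() {
	doc := map[string]interface{}{
		"Oem": map[string]interface{}{
			"SomeVendor": map[string]interface{}{
				"Board": map[string]interface{}{"SerialNumber": "2102310ABC"},
			},
		},
	}
	fmt.Println(assemblySerial(doc)) // 2102310ABC
}
```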
// enrichNICMACsFromNetworkDeviceFunctions reads the NetworkDeviceFunctions
// collection linked from a NetworkAdapter document and populates the NIC's
// MACAddresses from each function's Ethernet.PermanentMACAddress / MACAddress.
// Called when PCIe-path enrichment does not produce any MACs.
func (r redfishSnapshotReader) enrichNICMACsFromNetworkDeviceFunctions(nic *models.NetworkAdapter, adapterDoc map[string]interface{}) {
ndfCol, ok := adapterDoc["NetworkDeviceFunctions"].(map[string]interface{})
if !ok {
return
}
colPath := asString(ndfCol["@odata.id"])
if colPath == "" {
return
}
funcDocs, err := r.getCollectionMembers(colPath)
if err != nil || len(funcDocs) == 0 {
return
}
for _, fn := range funcDocs {
eth, _ := fn["Ethernet"].(map[string]interface{})
if eth == nil {
continue
}
mac := strings.TrimSpace(firstNonEmpty(
asString(eth["PermanentMACAddress"]),
asString(eth["MACAddress"]),
))
if mac == "" {
continue
}
nic.MACAddresses = dedupeStrings(append(nic.MACAddresses, strings.ToUpper(mac)))
}
if len(funcDocs) > 0 && nic.PortCount == 0 {
nic.PortCount = sanitizeNetworkPortCount(len(funcDocs))
}
}
// collectGPUsFromProcessors finds GPUs that some BMCs (e.g. MSI) expose as
// Processor entries with ProcessorType=GPU rather than as PCIe devices.
// It supplements the existing gpus slice (already found via PCIe path),
// skipping entries already present by UUID or SerialNumber.
// Serial numbers are looked up from Chassis members named after each GPU Id.
func (r redfishSnapshotReader) collectGPUsFromProcessors(systemPaths, chassisPaths []string, existing []models.GPU) []models.GPU {
// Build a lookup: chassis member ID → chassis doc (for serial numbers).
chassisByID := make(map[string]map[string]interface{})
for _, cp := range chassisPaths {
doc, err := r.getJSON(cp)
if err != nil || len(doc) == 0 {
continue
}
id := strings.TrimSpace(asString(doc["Id"]))
if id != "" {
chassisByID[strings.ToUpper(id)] = doc
}
}
// Build dedup sets from existing GPUs.
seenUUID := make(map[string]struct{})
seenSerial := make(map[string]struct{})
for _, g := range existing {
if u := strings.ToUpper(strings.TrimSpace(g.UUID)); u != "" {
seenUUID[u] = struct{}{}
}
if s := strings.ToUpper(strings.TrimSpace(g.SerialNumber)); s != "" {
seenSerial[s] = struct{}{}
}
}
out := append([]models.GPU{}, existing...)
idx := len(existing) + 1
for _, systemPath := range systemPaths {
procDocs, err := r.getCollectionMembers(joinPath(systemPath, "/Processors"))
if err != nil {
continue
}
for _, doc := range procDocs {
if !strings.EqualFold(strings.TrimSpace(asString(doc["ProcessorType"])), "GPU") {
continue
}
// Resolve serial from Chassis/<Id>.
gpuID := strings.TrimSpace(asString(doc["Id"]))
serial := ""
if chassisDoc, ok := chassisByID[strings.ToUpper(gpuID)]; ok {
serial = strings.TrimSpace(asString(chassisDoc["SerialNumber"]))
}
uuid := strings.TrimSpace(asString(doc["UUID"]))
uuidKey := strings.ToUpper(uuid)
serialKey := strings.ToUpper(serial)
if uuidKey != "" {
if _, dup := seenUUID[uuidKey]; dup {
continue
}
seenUUID[uuidKey] = struct{}{}
}
if serialKey != "" {
if _, dup := seenSerial[serialKey]; dup {
continue
}
seenSerial[serialKey] = struct{}{}
}
slotLabel := firstNonEmpty(
redfishLocationLabel(doc["Location"]),
redfishLocationLabel(doc["PhysicalLocation"]),
)
if slotLabel == "" && gpuID != "" {
slotLabel = gpuID
}
if slotLabel == "" {
slotLabel = fmt.Sprintf("GPU%d", idx)
}
out = append(out, models.GPU{
Slot: slotLabel,
Model: firstNonEmpty(asString(doc["Model"]), asString(doc["Name"])),
Manufacturer: asString(doc["Manufacturer"]),
PartNumber: asString(doc["PartNumber"]),
SerialNumber: serial,
UUID: uuid,
Status: mapStatus(doc["Status"]),
})
idx++
}
}
return out
}


@@ -0,0 +1,159 @@
package collector
import (
"strings"
"git.mchus.pro/mchus/logpile/internal/models"
)
func (r redfishSnapshotReader) collectBoardFallbackDocs(systemPaths, chassisPaths []string) []map[string]interface{} {
out := make([]map[string]interface{}, 0)
for _, chassisPath := range chassisPaths {
for _, suffix := range []string{"/Boards", "/Backplanes"} {
path := joinPath(chassisPath, suffix)
if docs, err := r.getCollectionMembers(path); err == nil && len(docs) > 0 {
out = append(out, docs...)
continue
}
if doc, err := r.getJSON(path); err == nil && len(doc) > 0 {
out = append(out, doc)
}
}
}
for _, path := range append(append([]string{}, systemPaths...), chassisPaths...) {
for _, suffix := range []string{"/Oem/Public", "/Oem/Public/ThermalConfig", "/ThermalConfig"} {
docPath := joinPath(path, suffix)
if doc, err := r.getJSON(docPath); err == nil && len(doc) > 0 {
out = append(out, doc)
}
}
}
return out
}
func applyBoardInfoFallbackFromDocs(board *models.BoardInfo, docs []map[string]interface{}) {
if board == nil || len(docs) == 0 {
return
}
for _, doc := range docs {
candidate := parseBoardInfoFromFRUDoc(doc)
if !isLikelyServerProductName(candidate.ProductName) {
continue
}
if board.Manufacturer == "" {
board.Manufacturer = candidate.Manufacturer
}
if board.ProductName == "" {
board.ProductName = candidate.ProductName
}
if board.SerialNumber == "" {
board.SerialNumber = candidate.SerialNumber
}
if board.PartNumber == "" {
board.PartNumber = candidate.PartNumber
}
if board.Manufacturer != "" && board.ProductName != "" && board.SerialNumber != "" && board.PartNumber != "" {
return
}
}
}
func isLikelyServerProductName(v string) bool {
v = strings.TrimSpace(v)
if v == "" {
return false
}
n := strings.ToUpper(v)
if strings.Contains(n, "NULL") {
return false
}
componentTokens := []string{
"DIMM", "DDR", "NVME", "SSD", "HDD", "GPU", "NIC", "RAID",
"PSU", "FAN", "BACKPLANE", "FRU",
}
for _, token := range componentTokens {
if strings.Contains(n, strings.ToUpper(token)) {
return false
}
}
return true
}
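The token filter above rejects FRU-style component names so a backplane or DIMM label never wins the board-name fallback. A standalone sketch of the same check (function name `likelyServerProduct` is illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// likelyServerProduct mirrors isLikelyServerProductName: reject blanks,
// anything containing "NULL", and strings carrying component tokens that
// indicate a subcomponent FRU rather than a server/board product name.
func likelyServerProduct(v string) bool {
	v = strings.TrimSpace(v)
	if v == "" {
		return false
	}
	n := strings.ToUpper(v)
	if strings.Contains(n, "NULL") {
		return false
	}
	for _, token := range []string{
		"DIMM", "DDR", "NVME", "SSD", "HDD", "GPU", "NIC", "RAID",
		"PSU", "FAN", "BACKPLANE", "FRU",
	} {
		if strings.Contains(n, token) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(likelyServerProduct("G5500 V7"))   // true
	fmt.Println(likelyServerProduct("DDR5 RDIMM")) // false
	fmt.Println(likelyServerProduct("null"))       // false
}
```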
// collectAssemblyFRU reads Chassis/*/Assembly documents and returns FRU entries
// for subcomponents (backplanes, PSUs, DIMMs, etc.) that carry meaningful
// serial or part numbers. Entries already present in dedicated collections
// (PSUs, DIMMs) are included here as well so that all FRU data is available
// in one place; deduplication by serial is performed.
func (r redfishSnapshotReader) collectAssemblyFRU(chassisPaths []string) []models.FRUInfo {
seen := make(map[string]struct{})
var out []models.FRUInfo
add := func(fru models.FRUInfo) {
key := strings.ToUpper(strings.TrimSpace(fru.SerialNumber))
if key == "" {
key = strings.ToUpper(strings.TrimSpace(fru.Description + "|" + fru.PartNumber))
}
if key == "" || key == "|" {
return
}
if _, ok := seen[key]; ok {
return
}
seen[key] = struct{}{}
out = append(out, fru)
}
for _, chassisPath := range chassisPaths {
doc, err := r.getJSON(joinPath(chassisPath, "/Assembly"))
if err != nil || len(doc) == 0 {
continue
}
assemblies, _ := doc["Assemblies"].([]interface{})
for _, aAny := range assemblies {
a, ok := aAny.(map[string]interface{})
if !ok {
continue
}
name := strings.TrimSpace(firstNonEmpty(asString(a["Name"]), asString(a["Description"])))
model := strings.TrimSpace(asString(a["Model"]))
partNumber := strings.TrimSpace(asString(a["PartNumber"]))
serial := extractAssemblySerial(a)
if serial == "" && partNumber == "" {
continue
}
add(models.FRUInfo{
Description: name,
ProductName: model,
SerialNumber: serial,
PartNumber: partNumber,
})
}
}
return out
}
// extractAssemblySerial tries to find a serial number in an Assembly entry.
// Standard Redfish Assembly has no top-level SerialNumber; vendors put it in Oem.
func extractAssemblySerial(a map[string]interface{}) string {
if s := strings.TrimSpace(asString(a["SerialNumber"])); s != "" {
return s
}
oem, _ := a["Oem"].(map[string]interface{})
for _, v := range oem {
subtree, ok := v.(map[string]interface{})
if !ok {
continue
}
for _, v2 := range subtree {
node, ok := v2.(map[string]interface{})
if !ok {
continue
}
if s := strings.TrimSpace(asString(node["SerialNumber"])); s != "" {
return s
}
}
}
return ""
}


@@ -0,0 +1,151 @@
package collector
import (
"fmt"
"strings"
"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
"git.mchus.pro/mchus/logpile/internal/models"
)
func (r redfishSnapshotReader) collectGPUs(systemPaths, chassisPaths []string, plan redfishprofile.ResolvedAnalysisPlan) []models.GPU {
collections := make([]string, 0, len(systemPaths)*3+len(chassisPaths)*2)
for _, systemPath := range systemPaths {
collections = append(collections, joinPath(systemPath, "/PCIeDevices"))
collections = append(collections, joinPath(systemPath, "/Accelerators"))
collections = append(collections, joinPath(systemPath, "/GraphicsControllers"))
}
for _, chassisPath := range chassisPaths {
collections = append(collections, joinPath(chassisPath, "/PCIeDevices"))
collections = append(collections, joinPath(chassisPath, "/Accelerators"))
}
var out []models.GPU
seen := make(map[string]struct{})
idx := 1
for _, collectionPath := range collections {
memberDocs, err := r.getCollectionMembers(collectionPath)
if err != nil || len(memberDocs) == 0 {
continue
}
for _, doc := range memberDocs {
functionDocs := r.getLinkedPCIeFunctions(doc)
if !looksLikeGPU(doc, functionDocs) {
continue
}
supplementalDocs := r.getLinkedSupplementalDocs(doc, "EnvironmentMetrics", "Metrics")
for _, fn := range functionDocs {
supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
}
gpu := parseGPUWithSupplementalDocs(doc, functionDocs, supplementalDocs, idx)
idx++
if plan.Directives.EnableGenericGraphicsControllerDedup && shouldSkipGenericGPUDuplicate(out, gpu) {
continue
}
key := gpuDocDedupKey(doc, gpu)
if key == "" {
continue
}
if _, ok := seen[key]; ok {
continue
}
seen[key] = struct{}{}
out = append(out, gpu)
}
}
if plan.Directives.EnableGenericGraphicsControllerDedup {
return dropModelOnlyGPUPlaceholders(out)
}
return out
}
// collectGPUsFromProcessors finds GPUs that some BMCs (e.g. MSI) expose as
// Processor entries with ProcessorType=GPU rather than as PCIe devices.
// It supplements the existing gpus slice (already found via PCIe path),
// skipping entries already present by UUID or SerialNumber.
// Serial numbers are looked up from Chassis members named after each GPU Id.
func (r redfishSnapshotReader) collectGPUsFromProcessors(systemPaths, chassisPaths []string, existing []models.GPU, plan redfishprofile.ResolvedAnalysisPlan) []models.GPU {
if !plan.Directives.EnableProcessorGPUFallback {
return append([]models.GPU{}, existing...)
}
chassisByID := make(map[string]map[string]interface{})
for _, cp := range chassisPaths {
doc, err := r.getJSON(cp)
if err != nil || len(doc) == 0 {
continue
}
id := strings.TrimSpace(asString(doc["Id"]))
if id != "" {
chassisByID[strings.ToUpper(id)] = doc
}
}
seenUUID := make(map[string]struct{})
seenSerial := make(map[string]struct{})
for _, g := range existing {
if u := strings.ToUpper(strings.TrimSpace(g.UUID)); u != "" {
seenUUID[u] = struct{}{}
}
if s := strings.ToUpper(strings.TrimSpace(g.SerialNumber)); s != "" {
seenSerial[s] = struct{}{}
}
}
out := append([]models.GPU{}, existing...)
idx := len(existing) + 1
for _, systemPath := range systemPaths {
procDocs, err := r.getCollectionMembers(joinPath(systemPath, "/Processors"))
if err != nil {
continue
}
for _, doc := range procDocs {
if !strings.EqualFold(strings.TrimSpace(asString(doc["ProcessorType"])), "GPU") {
continue
}
gpuID := strings.TrimSpace(asString(doc["Id"]))
serial := findFirstNormalizedStringByKeys(doc, "SerialNumber")
if serial == "" {
serial = resolveProcessorGPUChassisSerial(chassisByID, gpuID, plan)
}
uuid := strings.TrimSpace(asString(doc["UUID"]))
uuidKey := strings.ToUpper(uuid)
serialKey := strings.ToUpper(serial)
if uuidKey != "" {
if _, dup := seenUUID[uuidKey]; dup {
continue
}
seenUUID[uuidKey] = struct{}{}
}
if serialKey != "" {
if _, dup := seenSerial[serialKey]; dup {
continue
}
seenSerial[serialKey] = struct{}{}
}
slotLabel := firstNonEmpty(
redfishLocationLabel(doc["Location"]),
redfishLocationLabel(doc["PhysicalLocation"]),
)
if slotLabel == "" && gpuID != "" {
slotLabel = gpuID
}
if slotLabel == "" {
slotLabel = fmt.Sprintf("GPU%d", idx)
}
out = append(out, models.GPU{
Slot: slotLabel,
Model: firstNonEmpty(asString(doc["Model"]), asString(doc["Name"])),
Manufacturer: asString(doc["Manufacturer"]),
PartNumber: asString(doc["PartNumber"]),
SerialNumber: serial,
UUID: uuid,
Status: mapStatus(doc["Status"]),
})
idx++
}
}
return out
}


@@ -0,0 +1,215 @@
package collector
import (
"strings"
"git.mchus.pro/mchus/logpile/internal/models"
)
func (r redfishSnapshotReader) enrichNICsFromNetworkInterfaces(nics *[]models.NetworkAdapter, systemPaths []string) {
if nics == nil {
return
}
bySlot := make(map[string]int, len(*nics))
for i, nic := range *nics {
bySlot[strings.ToLower(strings.TrimSpace(nic.Slot))] = i
}
for _, systemPath := range systemPaths {
ifaces, err := r.getCollectionMembers(joinPath(systemPath, "/NetworkInterfaces"))
if err != nil || len(ifaces) == 0 {
continue
}
for _, iface := range ifaces {
slot := firstNonEmpty(asString(iface["Id"]), asString(iface["Name"]))
if strings.TrimSpace(slot) == "" {
continue
}
idx, ok := bySlot[strings.ToLower(strings.TrimSpace(slot))]
if !ok {
*nics = append(*nics, models.NetworkAdapter{
Slot: slot,
Present: true,
Model: firstNonEmpty(asString(iface["Model"]), asString(iface["Name"])),
Status: mapStatus(iface["Status"]),
})
idx = len(*nics) - 1
bySlot[strings.ToLower(strings.TrimSpace(slot))] = idx
}
portsPath := redfishLinkedPath(iface, "NetworkPorts")
if portsPath == "" {
continue
}
portDocs, err := r.getCollectionMembers(portsPath)
if err != nil || len(portDocs) == 0 {
continue
}
macs := append([]string{}, (*nics)[idx].MACAddresses...)
for _, p := range portDocs {
macs = append(macs, collectNetworkPortMACs(p)...)
}
(*nics)[idx].MACAddresses = dedupeStrings(macs)
if sanitizeNetworkPortCount((*nics)[idx].PortCount) == 0 {
(*nics)[idx].PortCount = len(portDocs)
}
}
}
}
func (r redfishSnapshotReader) collectNICs(chassisPaths []string) []models.NetworkAdapter {
var nics []models.NetworkAdapter
for _, chassisPath := range chassisPaths {
adapterDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/NetworkAdapters"))
if err != nil {
continue
}
for _, doc := range adapterDocs {
nic := parseNIC(doc)
for _, pciePath := range networkAdapterPCIeDevicePaths(doc) {
pcieDoc, err := r.getJSON(pciePath)
if err != nil {
continue
}
functionDocs := r.getLinkedPCIeFunctions(pcieDoc)
supplementalDocs := r.getLinkedSupplementalDocs(pcieDoc, "EnvironmentMetrics", "Metrics")
for _, fn := range functionDocs {
supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
}
enrichNICFromPCIe(&nic, pcieDoc, functionDocs, supplementalDocs)
}
if len(nic.MACAddresses) == 0 {
r.enrichNICMACsFromNetworkDeviceFunctions(&nic, doc)
}
nics = append(nics, nic)
}
}
return dedupeNetworkAdapters(nics)
}
func (r redfishSnapshotReader) collectPCIeDevices(systemPaths, chassisPaths []string) []models.PCIeDevice {
collections := make([]string, 0, len(systemPaths)+len(chassisPaths))
for _, systemPath := range systemPaths {
collections = append(collections, joinPath(systemPath, "/PCIeDevices"))
}
for _, chassisPath := range chassisPaths {
collections = append(collections, joinPath(chassisPath, "/PCIeDevices"))
}
var out []models.PCIeDevice
for _, collectionPath := range collections {
memberDocs, err := r.getCollectionMembers(collectionPath)
if err != nil || len(memberDocs) == 0 {
continue
}
for _, doc := range memberDocs {
functionDocs := r.getLinkedPCIeFunctions(doc)
if looksLikeGPU(doc, functionDocs) {
continue
}
supplementalDocs := r.getLinkedSupplementalDocs(doc, "EnvironmentMetrics", "Metrics")
supplementalDocs = append(supplementalDocs, r.getChassisScopedPCIeSupplementalDocs(doc)...)
for _, fn := range functionDocs {
supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
}
dev := parsePCIeDeviceWithSupplementalDocs(doc, functionDocs, supplementalDocs)
if isUnidentifiablePCIeDevice(dev) {
continue
}
out = append(out, dev)
}
}
for _, systemPath := range systemPaths {
functionDocs, err := r.getCollectionMembers(joinPath(systemPath, "/PCIeFunctions"))
if err != nil || len(functionDocs) == 0 {
continue
}
for idx, fn := range functionDocs {
supplementalDocs := r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")
dev := parsePCIeFunctionWithSupplementalDocs(fn, supplementalDocs, idx+1)
out = append(out, dev)
}
}
return dedupePCIeDevices(out)
}
func (r redfishSnapshotReader) getChassisScopedPCIeSupplementalDocs(doc map[string]interface{}) []map[string]interface{} {
if !looksLikeNVSwitchPCIeDoc(doc) {
return nil
}
docPath := normalizeRedfishPath(asString(doc["@odata.id"]))
chassisPath := chassisPathForPCIeDoc(docPath)
if chassisPath == "" {
return nil
}
out := make([]map[string]interface{}, 0, 4)
for _, path := range []string{
joinPath(chassisPath, "/EnvironmentMetrics"),
joinPath(chassisPath, "/ThermalSubsystem/ThermalMetrics"),
} {
supplementalDoc, err := r.getJSON(path)
if err != nil || len(supplementalDoc) == 0 {
continue
}
out = append(out, supplementalDoc)
}
return out
}
// collectBMCMAC returns the MAC address of the first active BMC management
// interface found in Managers/*/EthernetInterfaces. Returns empty string if
// no MAC is available.
func (r redfishSnapshotReader) collectBMCMAC(managerPaths []string) string {
for _, managerPath := range managerPaths {
members, err := r.getCollectionMembers(joinPath(managerPath, "/EthernetInterfaces"))
if err != nil || len(members) == 0 {
continue
}
for _, doc := range members {
mac := strings.TrimSpace(firstNonEmpty(
asString(doc["PermanentMACAddress"]),
asString(doc["MACAddress"]),
))
if mac == "" || strings.EqualFold(mac, "00:00:00:00:00:00") {
continue
}
return strings.ToUpper(mac)
}
}
return ""
}
// enrichNICMACsFromNetworkDeviceFunctions reads the NetworkDeviceFunctions
// collection linked from a NetworkAdapter document and populates the NIC's
// MACAddresses from each function's Ethernet.PermanentMACAddress / MACAddress.
// Called when PCIe-path enrichment does not produce any MACs.
func (r redfishSnapshotReader) enrichNICMACsFromNetworkDeviceFunctions(nic *models.NetworkAdapter, adapterDoc map[string]interface{}) {
ndfCol, ok := adapterDoc["NetworkDeviceFunctions"].(map[string]interface{})
if !ok {
return
}
colPath := asString(ndfCol["@odata.id"])
if colPath == "" {
return
}
funcDocs, err := r.getCollectionMembers(colPath)
if err != nil || len(funcDocs) == 0 {
return
}
for _, fn := range funcDocs {
eth, _ := fn["Ethernet"].(map[string]interface{})
if eth == nil {
continue
}
mac := strings.TrimSpace(firstNonEmpty(
asString(eth["PermanentMACAddress"]),
asString(eth["MACAddress"]),
))
if mac == "" {
continue
}
nic.MACAddresses = dedupeStrings(append(nic.MACAddresses, strings.ToUpper(mac)))
}
if len(funcDocs) > 0 && nic.PortCount == 0 {
nic.PortCount = sanitizeNetworkPortCount(len(funcDocs))
}
}


@@ -0,0 +1,91 @@
package collector
import (
"strings"
"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
)
func (r redfishSnapshotReader) collectKnownStorageMembers(systemPath string, relativeCollections []string) []map[string]interface{} {
var out []map[string]interface{}
for _, rel := range relativeCollections {
docs, err := r.getCollectionMembers(joinPath(systemPath, rel))
if err != nil || len(docs) == 0 {
continue
}
out = append(out, docs...)
}
return out
}
func (r redfishSnapshotReader) probeSupermicroNVMeDiskBays(backplanePath string) []map[string]interface{} {
return r.probeDirectDiskBayChildren(joinPath(backplanePath, "/Drives"))
}
func (r redfishSnapshotReader) probeDirectDiskBayChildren(drivesCollectionPath string) []map[string]interface{} {
var out []map[string]interface{}
for _, path := range directDiskBayCandidates(drivesCollectionPath) {
doc, err := r.getJSON(path)
if err != nil || !looksLikeDrive(doc) {
continue
}
out = append(out, doc)
}
return out
}
func resolveProcessorGPUChassisSerial(chassisByID map[string]map[string]interface{}, gpuID string, plan redfishprofile.ResolvedAnalysisPlan) string {
for _, candidateID := range processorGPUChassisCandidateIDs(gpuID, plan) {
if chassisDoc, ok := chassisByID[strings.ToUpper(candidateID)]; ok {
if serial := strings.TrimSpace(asString(chassisDoc["SerialNumber"])); serial != "" {
return serial
}
}
}
return ""
}
func processorGPUChassisCandidateIDs(gpuID string, plan redfishprofile.ResolvedAnalysisPlan) []string {
gpuID = strings.TrimSpace(gpuID)
if gpuID == "" {
return nil
}
candidates := []string{gpuID}
for _, mode := range plan.ProcessorGPUChassisLookupModes {
switch strings.ToLower(strings.TrimSpace(mode)) {
case "msi-index":
candidates = append(candidates, msiProcessorGPUChassisCandidateIDs(gpuID)...)
case "hgx-alias":
if strings.HasPrefix(strings.ToUpper(gpuID), "GPU_") {
candidates = append(candidates, "HGX_"+gpuID)
}
}
}
return dedupeStrings(candidates)
}
func msiProcessorGPUChassisCandidateIDs(gpuID string) []string {
gpuID = strings.TrimSpace(strings.ToUpper(gpuID))
if gpuID == "" {
return nil
}
var out []string
switch {
case strings.HasPrefix(gpuID, "GPU_SXM_"):
index := strings.TrimPrefix(gpuID, "GPU_SXM_")
if index != "" {
out = append(out, "GPU"+index, "GPU_"+index)
}
case strings.HasPrefix(gpuID, "GPU_"):
index := strings.TrimPrefix(gpuID, "GPU_")
if index != "" {
out = append(out, "GPU"+index, "GPU_SXM_"+index)
}
case strings.HasPrefix(gpuID, "GPU"):
index := strings.TrimPrefix(gpuID, "GPU")
if index != "" {
out = append(out, "GPU_"+index, "GPU_SXM_"+index)
}
}
return out
}


@@ -0,0 +1,164 @@
package collector
import (
"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
"git.mchus.pro/mchus/logpile/internal/models"
)
func (r redfishSnapshotReader) collectStorage(systemPath string, plan redfishprofile.ResolvedAnalysisPlan) []models.Storage {
var out []models.Storage
storageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/Storage"))
for _, member := range storageMembers {
if driveCollection, ok := member["Drives"].(map[string]interface{}); ok {
if driveCollectionPath := asString(driveCollection["@odata.id"]); driveCollectionPath != "" {
driveDocs, err := r.getCollectionMembers(driveCollectionPath)
if err == nil {
for _, driveDoc := range driveDocs {
if !isVirtualStorageDrive(driveDoc) {
supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
}
}
if len(driveDocs) == 0 {
for _, driveDoc := range r.probeDirectDiskBayChildren(driveCollectionPath) {
supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
}
}
}
continue
}
}
if drives, ok := member["Drives"].([]interface{}); ok {
for _, driveAny := range drives {
driveRef, ok := driveAny.(map[string]interface{})
if !ok {
continue
}
odata := asString(driveRef["@odata.id"])
if odata == "" {
continue
}
driveDoc, err := r.getJSON(odata)
if err != nil {
continue
}
if !isVirtualStorageDrive(driveDoc) {
supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
}
}
continue
}
if looksLikeDrive(member) {
if isVirtualStorageDrive(member) {
continue
}
supplementalDocs := r.getLinkedSupplementalDocs(member, "DriveMetrics", "EnvironmentMetrics", "Metrics")
out = append(out, parseDriveWithSupplementalDocs(member, supplementalDocs...))
}
if plan.Directives.EnableStorageEnclosureRecovery {
for _, enclosurePath := range redfishLinkRefs(member, "Links", "Enclosures") {
driveDocs, err := r.getCollectionMembers(joinPath(enclosurePath, "/Drives"))
if err == nil {
for _, driveDoc := range driveDocs {
if looksLikeDrive(driveDoc) && !isVirtualStorageDrive(driveDoc) {
supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
}
}
if len(driveDocs) == 0 {
for _, driveDoc := range r.probeDirectDiskBayChildren(joinPath(enclosurePath, "/Drives")) {
if isVirtualStorageDrive(driveDoc) {
continue
}
out = append(out, parseDrive(driveDoc))
}
}
}
}
}
}
if len(plan.KnownStorageDriveCollections) > 0 {
for _, driveDoc := range r.collectKnownStorageMembers(systemPath, plan.KnownStorageDriveCollections) {
if looksLikeDrive(driveDoc) && !isVirtualStorageDrive(driveDoc) {
supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
}
}
}
simpleStorageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/SimpleStorage"))
for _, member := range simpleStorageMembers {
devices, ok := member["Devices"].([]interface{})
if !ok {
continue
}
for _, devAny := range devices {
devDoc, ok := devAny.(map[string]interface{})
if !ok || !looksLikeDrive(devDoc) || isVirtualStorageDrive(devDoc) {
continue
}
out = append(out, parseDrive(devDoc))
}
}
chassisPaths := r.discoverMemberPaths("/redfish/v1/Chassis", "/redfish/v1/Chassis/1")
for _, chassisPath := range chassisPaths {
driveDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/Drives"))
if err != nil {
continue
}
for _, driveDoc := range driveDocs {
if !looksLikeDrive(driveDoc) || isVirtualStorageDrive(driveDoc) {
continue
}
out = append(out, parseDrive(driveDoc))
}
}
if plan.Directives.EnableSupermicroNVMeBackplane {
for _, chassisPath := range chassisPaths {
if !isSupermicroNVMeBackplanePath(chassisPath) {
continue
}
for _, driveDoc := range r.probeSupermicroNVMeDiskBays(chassisPath) {
if !looksLikeDrive(driveDoc) || isVirtualStorageDrive(driveDoc) {
continue
}
out = append(out, parseDrive(driveDoc))
}
}
}
return dedupeStorage(out)
}
func (r redfishSnapshotReader) collectStorageVolumes(systemPath string, plan redfishprofile.ResolvedAnalysisPlan) []models.StorageVolume {
var out []models.StorageVolume
storageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/Storage"))
for _, member := range storageMembers {
controller := firstNonEmpty(asString(member["Id"]), asString(member["Name"]))
volumeCollectionPath := redfishLinkedPath(member, "Volumes")
if volumeCollectionPath == "" {
continue
}
volumeDocs, err := r.getCollectionMembers(volumeCollectionPath)
if err != nil {
continue
}
for _, volDoc := range volumeDocs {
if looksLikeVolume(volDoc) {
out = append(out, parseStorageVolume(volDoc, controller))
}
}
}
if len(plan.KnownStorageVolumeCollections) > 0 {
for _, volDoc := range r.collectKnownStorageMembers(systemPath, plan.KnownStorageVolumeCollections) {
if looksLikeVolume(volDoc) {
out = append(out, parseStorageVolume(volDoc, storageControllerFromPath(asString(volDoc["@odata.id"]))))
}
}
}
return dedupeStorageVolumes(out)
}

File diff suppressed because it is too large


@@ -0,0 +1,163 @@
package redfishprofile
import "strings"
func ResolveAcquisitionPlan(match MatchResult, plan AcquisitionPlan, discovered DiscoveredResources, signals MatchSignals) ResolvedAcquisitionPlan {
seedGroups := [][]string{
baselineSeedPaths(discovered),
expandScopedSuffixes(discovered.SystemPaths, plan.ScopedPaths.SystemSeedSuffixes),
expandScopedSuffixes(discovered.ChassisPaths, plan.ScopedPaths.ChassisSeedSuffixes),
expandScopedSuffixes(discovered.ManagerPaths, plan.ScopedPaths.ManagerSeedSuffixes),
plan.SeedPaths,
}
if plan.Mode == ModeFallback {
seedGroups = append(seedGroups, plan.PlanBPaths)
}
criticalGroups := [][]string{
baselineCriticalPaths(discovered),
expandScopedSuffixes(discovered.SystemPaths, plan.ScopedPaths.SystemCriticalSuffixes),
expandScopedSuffixes(discovered.ChassisPaths, plan.ScopedPaths.ChassisCriticalSuffixes),
expandScopedSuffixes(discovered.ManagerPaths, plan.ScopedPaths.ManagerCriticalSuffixes),
plan.CriticalPaths,
}
resolved := ResolvedAcquisitionPlan{
Plan: plan,
SeedPaths: mergeResolvedPaths(seedGroups...),
CriticalPaths: mergeResolvedPaths(criticalGroups...),
}
for _, profile := range match.Profiles {
profile.RefineAcquisitionPlan(&resolved, discovered, signals)
}
resolved.SeedPaths = mergeResolvedPaths(resolved.SeedPaths)
resolved.CriticalPaths = mergeResolvedPaths(resolved.CriticalPaths, resolved.Plan.CriticalPaths)
resolved.Plan.SeedPaths = mergeResolvedPaths(resolved.Plan.SeedPaths)
resolved.Plan.CriticalPaths = mergeResolvedPaths(resolved.Plan.CriticalPaths)
resolved.Plan.PlanBPaths = mergeResolvedPaths(resolved.Plan.PlanBPaths)
return resolved
}
func baselineSeedPaths(discovered DiscoveredResources) []string {
var out []string
add := func(p string) {
if p = normalizePath(p); p != "" {
out = append(out, p)
}
}
add("/redfish/v1/UpdateService")
add("/redfish/v1/UpdateService/FirmwareInventory")
for _, p := range discovered.SystemPaths {
add(p)
add(joinPath(p, "/Bios"))
add(joinPath(p, "/SecureBoot"))
add(joinPath(p, "/Oem/Public"))
add(joinPath(p, "/Oem/Public/FRU"))
add(joinPath(p, "/Processors"))
add(joinPath(p, "/Memory"))
add(joinPath(p, "/EthernetInterfaces"))
add(joinPath(p, "/NetworkInterfaces"))
add(joinPath(p, "/PCIeDevices"))
add(joinPath(p, "/PCIeFunctions"))
add(joinPath(p, "/Accelerators"))
add(joinPath(p, "/GraphicsControllers"))
add(joinPath(p, "/Storage"))
}
for _, p := range discovered.ChassisPaths {
add(p)
add(joinPath(p, "/Oem/Public"))
add(joinPath(p, "/Oem/Public/FRU"))
add(joinPath(p, "/PCIeDevices"))
add(joinPath(p, "/PCIeSlots"))
add(joinPath(p, "/NetworkAdapters"))
add(joinPath(p, "/Drives"))
add(joinPath(p, "/Power"))
}
for _, p := range discovered.ManagerPaths {
add(p)
add(joinPath(p, "/EthernetInterfaces"))
add(joinPath(p, "/NetworkProtocol"))
}
return mergeResolvedPaths(out)
}
func baselineCriticalPaths(discovered DiscoveredResources) []string {
var out []string
for _, group := range [][]string{
{"/redfish/v1"},
discovered.SystemPaths,
discovered.ChassisPaths,
discovered.ManagerPaths,
} {
out = append(out, group...)
}
return mergeResolvedPaths(out)
}
func expandScopedSuffixes(basePaths, suffixes []string) []string {
if len(basePaths) == 0 || len(suffixes) == 0 {
return nil
}
out := make([]string, 0, len(basePaths)*len(suffixes))
for _, basePath := range basePaths {
basePath = normalizePath(basePath)
if basePath == "" {
continue
}
for _, suffix := range suffixes {
suffix = strings.TrimSpace(suffix)
if suffix == "" {
continue
}
out = append(out, joinPath(basePath, suffix))
}
}
return mergeResolvedPaths(out)
}
func mergeResolvedPaths(groups ...[]string) []string {
seen := make(map[string]struct{})
out := make([]string, 0)
for _, group := range groups {
for _, path := range group {
path = normalizePath(path)
if path == "" {
continue
}
if _, ok := seen[path]; ok {
continue
}
seen[path] = struct{}{}
out = append(out, path)
}
}
return out
}
func normalizePath(path string) string {
path = strings.TrimSpace(path)
if path == "" {
return ""
}
if !strings.HasPrefix(path, "/") {
path = "/" + path
}
return strings.TrimRight(path, "/")
}
func joinPath(base, rel string) string {
base = normalizePath(base)
rel = strings.TrimSpace(rel)
if base == "" {
return normalizePath(rel)
}
if rel == "" {
return base
}
if !strings.HasPrefix(rel, "/") {
rel = "/" + rel
}
return normalizePath(base + rel)
}


@@ -0,0 +1,100 @@
package redfishprofile
import "strings"
func ResolveAnalysisPlan(match MatchResult, snapshot map[string]interface{}, discovered DiscoveredResources, signals MatchSignals) ResolvedAnalysisPlan {
plan := ResolvedAnalysisPlan{
Match: match,
Directives: AnalysisDirectives{},
}
if match.Mode == ModeFallback {
plan.Directives.EnableProcessorGPUFallback = true
plan.Directives.EnableSupermicroNVMeBackplane = true
plan.Directives.EnableProcessorGPUChassisAlias = true
plan.Directives.EnableGenericGraphicsControllerDedup = true
plan.Directives.EnableStorageEnclosureRecovery = true
plan.Directives.EnableKnownStorageControllerRecovery = true
addAnalysisLookupMode(&plan, "msi-index")
addAnalysisLookupMode(&plan, "hgx-alias")
addAnalysisStorageDriveCollections(&plan,
"/Storage/IntelVROC/Drives",
"/Storage/IntelVROC/Controllers/1/Drives",
)
addAnalysisStorageVolumeCollections(&plan,
"/Storage/IntelVROC/Volumes",
"/Storage/HA-RAID/Volumes",
"/Storage/MRVL.HA-RAID/Volumes",
)
addAnalysisNote(&plan, "fallback analysis enables broad recovery directives")
}
for _, profile := range match.Profiles {
profile.ApplyAnalysisDirectives(&plan.Directives, signals)
}
for _, profile := range match.Profiles {
profile.RefineAnalysisPlan(&plan, snapshot, discovered, signals)
}
return plan
}
func snapshotHasPathPrefix(snapshot map[string]interface{}, prefix string) bool {
prefix = normalizePath(prefix)
if prefix == "" {
return false
}
for path := range snapshot {
if strings.HasPrefix(normalizePath(path), prefix) {
return true
}
}
return false
}
func snapshotHasPathContaining(snapshot map[string]interface{}, sub string) bool {
sub = strings.ToLower(strings.TrimSpace(sub))
if sub == "" {
return false
}
for path := range snapshot {
if strings.Contains(strings.ToLower(path), sub) {
return true
}
}
return false
}
func snapshotHasGPUProcessor(snapshot map[string]interface{}, systemPaths []string) bool {
for _, systemPath := range systemPaths {
prefix := normalizePath(joinPath(systemPath, "/Processors")) + "/"
for path, docAny := range snapshot {
if !strings.HasPrefix(normalizePath(path), prefix) {
continue
}
doc, ok := docAny.(map[string]interface{})
if !ok {
continue
}
if strings.EqualFold(strings.TrimSpace(asString(doc["ProcessorType"])), "GPU") {
return true
}
}
}
return false
}
func snapshotHasStorageControllerHint(snapshot map[string]interface{}, needles ...string) bool {
for _, needle := range needles {
if snapshotHasPathContaining(snapshot, needle) {
return true
}
}
return false
}
func asString(v interface{}) string {
switch x := v.(type) {
case string:
return x
default:
return ""
}
}


@@ -0,0 +1,450 @@
package redfishprofile
import (
"encoding/json"
"os"
"path/filepath"
"strings"
"testing"
)
func TestBuildAcquisitionPlan_Fixture_MSI_CG480(t *testing.T) {
signals := loadProfileFixtureSignals(t, "msi-cg480.json")
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, discoveredResourcesFromSignals(signals), signals)
if match.Mode != ModeMatched {
t.Fatalf("expected matched mode, got %q", match.Mode)
}
assertProfileSelected(t, match, "msi")
assertProfileSelected(t, match, "ami-family")
assertProfileNotSelected(t, match, "hgx-topology")
if plan.Tuning.PrefetchWorkers < 6 {
t.Fatalf("expected msi prefetch worker tuning, got %d", plan.Tuning.PrefetchWorkers)
}
if !containsString(resolved.SeedPaths, "/redfish/v1/Chassis/GPU1") {
t.Fatal("expected MSI chassis GPU seed path")
}
if !containsString(resolved.CriticalPaths, "/redfish/v1/Chassis/GPU1/Sensors") {
t.Fatal("expected MSI GPU sensor critical path")
}
if !containsString(resolved.Plan.PlanBPaths, "/redfish/v1/Chassis/GPU1/Sensors") {
t.Fatal("expected MSI GPU sensor plan-b path")
}
if plan.Tuning.ETABaseline.SnapshotSeconds <= 0 {
t.Fatal("expected MSI snapshot eta baseline")
}
if !plan.Tuning.PostProbePolicy.EnableNumericCollectionProbe {
t.Fatal("expected MSI fixture to inherit generic numeric post-probe policy")
}
if !containsString(plan.ScopedPaths.SystemSeedSuffixes, "/SimpleStorage") {
t.Fatal("expected MSI fixture to inherit generic SimpleStorage scoped seed suffix")
}
if !containsString(plan.ScopedPaths.SystemCriticalSuffixes, "/Memory") {
t.Fatal("expected MSI fixture to inherit generic system critical suffixes")
}
if !containsString(plan.Tuning.PrefetchPolicy.IncludeSuffixes, "/Storage") {
t.Fatal("expected MSI fixture to inherit generic storage prefetch policy")
}
if !containsString(plan.CriticalPaths, "/redfish/v1/UpdateService") {
t.Fatal("expected MSI fixture to inherit generic top-level critical path")
}
if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
t.Fatal("expected MSI fixture to enable profile plan-b")
}
}
func TestBuildAcquisitionPlan_Fixture_MSI_CG480_CopyMatchesSameProfiles(t *testing.T) {
originalSignals := loadProfileFixtureSignals(t, "msi-cg480.json")
copySignals := loadProfileFixtureSignals(t, "msi-cg480-copy.json")
originalMatch := MatchProfiles(originalSignals)
copyMatch := MatchProfiles(copySignals)
originalPlan := BuildAcquisitionPlan(originalSignals)
copyPlan := BuildAcquisitionPlan(copySignals)
originalResolved := ResolveAcquisitionPlan(originalMatch, originalPlan, discoveredResourcesFromSignals(originalSignals), originalSignals)
copyResolved := ResolveAcquisitionPlan(copyMatch, copyPlan, discoveredResourcesFromSignals(copySignals), copySignals)
assertSameProfileNames(t, originalMatch, copyMatch)
if originalPlan.Tuning.PrefetchWorkers != copyPlan.Tuning.PrefetchWorkers {
t.Fatalf("expected same MSI prefetch worker tuning, got %d vs %d", originalPlan.Tuning.PrefetchWorkers, copyPlan.Tuning.PrefetchWorkers)
}
if containsString(originalResolved.SeedPaths, "/redfish/v1/Chassis/GPU1") != containsString(copyResolved.SeedPaths, "/redfish/v1/Chassis/GPU1") {
t.Fatal("expected same MSI GPU chassis seed presence in both fixtures")
}
}
func TestBuildAcquisitionPlan_Fixture_MSI_CG290(t *testing.T) {
signals := loadProfileFixtureSignals(t, "msi-cg290.json")
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, discoveredResourcesFromSignals(signals), signals)
if match.Mode != ModeMatched {
t.Fatalf("expected matched mode, got %q", match.Mode)
}
assertProfileSelected(t, match, "msi")
assertProfileSelected(t, match, "ami-family")
assertProfileNotSelected(t, match, "hgx-topology")
if plan.Tuning.PrefetchWorkers < 6 {
t.Fatalf("expected MSI prefetch worker tuning, got %d", plan.Tuning.PrefetchWorkers)
}
if !containsString(resolved.SeedPaths, "/redfish/v1/Chassis/GPU1") {
t.Fatal("expected MSI chassis GPU seed path")
}
}
func TestBuildAcquisitionPlan_Fixture_Supermicro_HGX(t *testing.T) {
signals := loadProfileFixtureSignals(t, "supermicro-hgx.json")
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
discovered := discoveredResourcesFromSignals(signals)
discovered.SystemPaths = dedupeSorted(append(discovered.SystemPaths, "/redfish/v1/Systems/HGX_Baseboard_0"))
resolved := ResolveAcquisitionPlan(match, plan, discovered, signals)
if match.Mode != ModeMatched {
t.Fatalf("expected matched mode, got %q", match.Mode)
}
assertProfileSelected(t, match, "supermicro")
assertProfileSelected(t, match, "hgx-topology")
assertProfileNotSelected(t, match, "msi")
if plan.Tuning.SnapshotMaxDocuments < 180000 {
t.Fatalf("expected widened HGX snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
}
if plan.Tuning.NVMePostProbeEnabled == nil || *plan.Tuning.NVMePostProbeEnabled {
t.Fatal("expected HGX fixture to disable NVMe post-probe")
}
if !containsString(resolved.SeedPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
t.Fatal("expected HGX baseboard processors seed path")
}
if !containsString(resolved.CriticalPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
t.Fatal("expected HGX baseboard processors critical path")
}
if !containsString(resolved.Plan.PlanBPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
t.Fatal("expected HGX baseboard processors plan-b path")
}
if plan.Tuning.ETABaseline.SnapshotSeconds < 300 {
t.Fatalf("expected HGX snapshot eta baseline, got %d", plan.Tuning.ETABaseline.SnapshotSeconds)
}
if !plan.Tuning.PostProbePolicy.EnableDirectNVMEDiskBayProbe {
t.Fatal("expected HGX fixture to retain Supermicro direct NVMe disk bay probe policy")
}
if !containsString(plan.ScopedPaths.SystemCriticalSuffixes, "/Storage/IntelVROC/Drives") {
t.Fatal("expected HGX fixture to inherit generic IntelVROC scoped critical suffix")
}
if !containsString(plan.ScopedPaths.ChassisCriticalSuffixes, "/Assembly") {
t.Fatal("expected HGX fixture to inherit generic chassis critical suffixes")
}
if !containsString(plan.Tuning.PrefetchPolicy.ExcludeContains, "/Assembly") {
t.Fatal("expected HGX fixture to inherit generic assembly prefetch exclusion")
}
if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
t.Fatal("expected HGX fixture to enable profile plan-b")
}
}
func TestBuildAcquisitionPlan_Fixture_Supermicro_OAM_NoHGX(t *testing.T) {
signals := loadProfileFixtureSignals(t, "supermicro-oam-amd.json")
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, discoveredResourcesFromSignals(signals), signals)
if match.Mode != ModeMatched {
t.Fatalf("expected matched mode, got %q", match.Mode)
}
assertProfileSelected(t, match, "supermicro")
assertProfileNotSelected(t, match, "hgx-topology")
assertProfileNotSelected(t, match, "msi")
if containsString(resolved.SeedPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
t.Fatal("did not expect HGX baseboard processors seed path for OAM fixture")
}
if containsString(resolved.CriticalPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
t.Fatal("did not expect HGX baseboard processors critical path for OAM fixture")
}
if !containsString(resolved.CriticalPaths, "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
t.Fatal("expected Supermicro firmware critical path")
}
if !containsString(resolved.Plan.PlanBPaths, "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
t.Fatal("expected Supermicro firmware plan-b path")
}
if plan.Tuning.SnapshotMaxDocuments != 150000 {
t.Fatalf("expected generic supermicro snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
}
if plan.Tuning.NVMePostProbeEnabled != nil {
t.Fatal("did not expect HGX NVMe tuning for OAM fixture")
}
if plan.Tuning.ETABaseline.SnapshotSeconds < 180 {
t.Fatalf("expected Supermicro snapshot eta baseline, got %d", plan.Tuning.ETABaseline.SnapshotSeconds)
}
if !plan.Tuning.PostProbePolicy.EnableDirectNVMEDiskBayProbe {
t.Fatal("expected Supermicro OAM fixture to use direct NVMe disk bay probe policy")
}
if !plan.Tuning.PostProbePolicy.EnableNumericCollectionProbe {
t.Fatal("expected Supermicro OAM fixture to inherit generic numeric post-probe policy")
}
if !containsString(plan.ScopedPaths.SystemSeedSuffixes, "/Storage/IntelVROC") {
t.Fatal("expected Supermicro OAM fixture to inherit generic IntelVROC scoped seed suffix")
}
if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
t.Fatal("expected Supermicro OAM fixture to enable profile plan-b")
}
}
func TestBuildAcquisitionPlan_Fixture_Dell_R750(t *testing.T) {
signals := loadProfileFixtureSignals(t, "dell-r750.json")
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
SystemPaths: []string{"/redfish/v1/Systems/System.Embedded.1"},
ChassisPaths: []string{"/redfish/v1/Chassis/System.Embedded.1"},
ManagerPaths: []string{"/redfish/v1/Managers/1", "/redfish/v1/Managers/iDRAC.Embedded.1"},
}, signals)
if match.Mode != ModeMatched {
t.Fatalf("expected matched mode, got %q", match.Mode)
}
assertProfileSelected(t, match, "dell")
assertProfileNotSelected(t, match, "supermicro")
assertProfileNotSelected(t, match, "hgx-topology")
assertProfileNotSelected(t, match, "msi")
if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
t.Fatal("expected dell fixture to enable profile plan-b")
}
if !containsString(resolved.SeedPaths, "/redfish/v1/Managers/iDRAC.Embedded.1") {
t.Fatal("expected Dell refinement to add iDRAC manager seed path")
}
if !containsString(resolved.CriticalPaths, "/redfish/v1/Managers/iDRAC.Embedded.1") {
t.Fatal("expected Dell refinement to add iDRAC manager critical path")
}
directives := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, signals).Directives
if !directives.EnableGenericGraphicsControllerDedup {
t.Fatal("expected dell fixture to enable graphics controller dedup")
}
}
func TestBuildAcquisitionPlan_Fixture_AMI_Generic(t *testing.T) {
signals := loadProfileFixtureSignals(t, "ami-generic.json")
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
if match.Mode != ModeMatched {
t.Fatalf("expected matched mode, got %q", match.Mode)
}
assertProfileSelected(t, match, "ami-family")
assertProfileNotSelected(t, match, "msi")
assertProfileNotSelected(t, match, "supermicro")
assertProfileNotSelected(t, match, "dell")
assertProfileNotSelected(t, match, "hgx-topology")
if plan.Tuning.PrefetchEnabled == nil || !*plan.Tuning.PrefetchEnabled {
t.Fatal("expected ami-family fixture to force prefetch enabled")
}
if !containsString(plan.SeedPaths, "/redfish/v1/Oem/Ami") {
t.Fatal("expected ami-family fixture seed path /redfish/v1/Oem/Ami")
}
if !containsString(plan.SeedPaths, "/redfish/v1/Oem/Ami/InventoryData/Status") {
t.Fatal("expected ami-family fixture seed path /redfish/v1/Oem/Ami/InventoryData/Status")
}
if !containsString(plan.CriticalPaths, "/redfish/v1/UpdateService") {
t.Fatal("expected ami-family fixture to inherit generic critical path")
}
directives := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, signals).Directives
if !directives.EnableGenericGraphicsControllerDedup {
t.Fatal("expected ami-family fixture to enable graphics controller dedup")
}
}
func TestBuildAcquisitionPlan_Fixture_UnknownVendor(t *testing.T) {
signals := loadProfileFixtureSignals(t, "unknown-vendor.json")
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
SystemPaths: []string{"/redfish/v1/Systems/1"},
ChassisPaths: []string{"/redfish/v1/Chassis/1"},
ManagerPaths: []string{"/redfish/v1/Managers/1"},
}, signals)
if match.Mode != ModeFallback {
t.Fatalf("expected fallback mode for unknown vendor, got %q", match.Mode)
}
if len(match.Profiles) == 0 {
t.Fatal("expected fallback to aggregate profiles")
}
for _, profile := range match.Profiles {
if !profile.SafeForFallback() {
t.Fatalf("fallback mode included non-safe profile %q", profile.Name())
}
}
if plan.Tuning.SnapshotMaxDocuments < 180000 {
t.Fatalf("expected fallback to widen snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
}
if plan.Tuning.PrefetchEnabled == nil || !*plan.Tuning.PrefetchEnabled {
t.Fatal("expected fallback fixture to force prefetch enabled")
}
if !containsString(resolved.CriticalPaths, "/redfish/v1/Systems/1") {
t.Fatal("expected fallback resolved critical paths to include discovered system")
}
analysisPlan := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, signals)
if !analysisPlan.Directives.EnableProcessorGPUFallback {
t.Fatal("expected fallback fixture to enable processor GPU fallback")
}
if !analysisPlan.Directives.EnableStorageEnclosureRecovery {
t.Fatal("expected fallback fixture to enable storage enclosure recovery")
}
if !analysisPlan.Directives.EnableGenericGraphicsControllerDedup {
t.Fatal("expected fallback fixture to enable graphics controller dedup")
}
}
func TestBuildAcquisitionPlan_Fixture_xFusion_G5500V7(t *testing.T) {
signals := loadProfileFixtureSignals(t, "xfusion-g5500v7.json")
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
SystemPaths: []string{"/redfish/v1/Systems/1"},
ChassisPaths: []string{"/redfish/v1/Chassis/1"},
ManagerPaths: []string{"/redfish/v1/Managers/1"},
}, signals)
if match.Mode != ModeMatched {
t.Fatalf("expected matched mode for xFusion, got %q", match.Mode)
}
assertProfileSelected(t, match, "xfusion")
assertProfileNotSelected(t, match, "supermicro")
assertProfileNotSelected(t, match, "hgx-topology")
assertProfileNotSelected(t, match, "msi")
assertProfileNotSelected(t, match, "dell")
if plan.Tuning.SnapshotMaxDocuments > 150000 {
t.Fatalf("expected xfusion snapshot cap <= 150000, got %d", plan.Tuning.SnapshotMaxDocuments)
}
if plan.Tuning.PrefetchEnabled == nil || !*plan.Tuning.PrefetchEnabled {
t.Fatal("expected xfusion fixture to enable prefetch")
}
if plan.Tuning.ETABaseline.SnapshotSeconds <= 0 {
t.Fatal("expected xfusion snapshot eta baseline")
}
if !containsString(resolved.CriticalPaths, "/redfish/v1/Systems/1") {
t.Fatal("expected system path in critical paths")
}
analysisPlan := ResolveAnalysisPlan(match, map[string]interface{}{
"/redfish/v1/Systems/1/Processors/Gpu1": map[string]interface{}{"ProcessorType": "GPU"},
}, DiscoveredResources{
SystemPaths: []string{"/redfish/v1/Systems/1"},
}, signals)
if !analysisPlan.Directives.EnableProcessorGPUFallback {
t.Fatal("expected xfusion analysis to enable processor GPU fallback when GPU processors present")
}
if !analysisPlan.Directives.EnableGenericGraphicsControllerDedup {
t.Fatal("expected xfusion analysis to enable graphics controller dedup")
}
}
func loadProfileFixtureSignals(t *testing.T, fixtureName string) MatchSignals {
t.Helper()
path := filepath.Join("testdata", fixtureName)
data, err := os.ReadFile(path)
if err != nil {
t.Fatalf("read fixture %s: %v", path, err)
}
var signals MatchSignals
if err := json.Unmarshal(data, &signals); err != nil {
t.Fatalf("decode fixture %s: %v", path, err)
}
return normalizeSignals(signals)
}
func assertProfileSelected(t *testing.T, match MatchResult, want string) {
t.Helper()
for _, profile := range match.Profiles {
if profile.Name() == want {
return
}
}
t.Fatalf("expected profile %q in %v", want, profileNames(match))
}
func assertProfileNotSelected(t *testing.T, match MatchResult, want string) {
t.Helper()
for _, profile := range match.Profiles {
if profile.Name() == want {
t.Fatalf("did not expect profile %q in %v", want, profileNames(match))
}
}
}
func profileNames(match MatchResult) []string {
out := make([]string, 0, len(match.Profiles))
for _, profile := range match.Profiles {
out = append(out, profile.Name())
}
return out
}
func assertSameProfileNames(t *testing.T, left, right MatchResult) {
t.Helper()
leftNames := profileNames(left)
rightNames := profileNames(right)
if len(leftNames) != len(rightNames) {
t.Fatalf("profile stack size differs: %v vs %v", leftNames, rightNames)
}
for i := range leftNames {
if leftNames[i] != rightNames[i] {
t.Fatalf("profile stack differs: %v vs %v", leftNames, rightNames)
}
}
}
func containsString(items []string, want string) bool {
for _, item := range items {
if item == want {
return true
}
}
return false
}
func discoveredResourcesFromSignals(signals MatchSignals) DiscoveredResources {
var discovered DiscoveredResources
for _, hint := range signals.ResourceHints {
memberPath := discoveredMemberPath(hint)
switch {
case strings.HasPrefix(memberPath, "/redfish/v1/Systems/"):
discovered.SystemPaths = append(discovered.SystemPaths, memberPath)
case strings.HasPrefix(memberPath, "/redfish/v1/Chassis/"):
discovered.ChassisPaths = append(discovered.ChassisPaths, memberPath)
case strings.HasPrefix(memberPath, "/redfish/v1/Managers/"):
discovered.ManagerPaths = append(discovered.ManagerPaths, memberPath)
}
}
discovered.SystemPaths = dedupeSorted(discovered.SystemPaths)
discovered.ChassisPaths = dedupeSorted(discovered.ChassisPaths)
discovered.ManagerPaths = dedupeSorted(discovered.ManagerPaths)
return discovered
}
func discoveredMemberPath(path string) string {
path = strings.TrimSpace(path)
if path == "" {
return ""
}
parts := strings.Split(strings.Trim(path, "/"), "/")
if len(parts) < 4 || parts[0] != "redfish" || parts[1] != "v1" {
return ""
}
switch parts[2] {
case "Systems", "Chassis", "Managers":
return "/" + strings.Join(parts[:4], "/")
default:
return ""
}
}


@@ -0,0 +1,122 @@
package redfishprofile
import (
"sort"
"git.mchus.pro/mchus/logpile/internal/models"
)
const (
ModeMatched = "matched"
ModeFallback = "fallback"
)
func MatchProfiles(signals MatchSignals) MatchResult {
type scored struct {
profile Profile
score int
}
builtins := BuiltinProfiles()
candidates := make([]scored, 0, len(builtins))
allScores := make([]ProfileScore, 0, len(builtins))
for _, profile := range builtins {
score := profile.Match(signals)
allScores = append(allScores, ProfileScore{
Name: profile.Name(),
Score: score,
Priority: profile.Priority(),
})
if score <= 0 {
continue
}
candidates = append(candidates, scored{profile: profile, score: score})
}
sort.Slice(allScores, func(i, j int) bool {
if allScores[i].Score == allScores[j].Score {
if allScores[i].Priority == allScores[j].Priority {
return allScores[i].Name < allScores[j].Name
}
return allScores[i].Priority < allScores[j].Priority
}
return allScores[i].Score > allScores[j].Score
})
sort.Slice(candidates, func(i, j int) bool {
if candidates[i].score == candidates[j].score {
return candidates[i].profile.Priority() < candidates[j].profile.Priority()
}
return candidates[i].score > candidates[j].score
})
if len(candidates) == 0 || candidates[0].score < 60 {
profiles := make([]Profile, 0, len(builtins))
active := make(map[string]struct{}, len(builtins))
for _, profile := range builtins {
if profile.SafeForFallback() {
profiles = append(profiles, profile)
active[profile.Name()] = struct{}{}
}
}
sortProfiles(profiles)
for i := range allScores {
_, ok := active[allScores[i].Name]
allScores[i].Active = ok
}
return MatchResult{Mode: ModeFallback, Profiles: profiles, Scores: allScores}
}
profiles := make([]Profile, 0, len(candidates))
seen := make(map[string]struct{}, len(candidates))
for _, candidate := range candidates {
name := candidate.profile.Name()
if _, ok := seen[name]; ok {
continue
}
seen[name] = struct{}{}
profiles = append(profiles, candidate.profile)
}
sortProfiles(profiles)
for i := range allScores {
_, ok := seen[allScores[i].Name]
allScores[i].Active = ok
}
return MatchResult{Mode: ModeMatched, Profiles: profiles, Scores: allScores}
}
func BuildAcquisitionPlan(signals MatchSignals) AcquisitionPlan {
match := MatchProfiles(signals)
plan := AcquisitionPlan{Mode: match.Mode}
for _, profile := range match.Profiles {
plan.Profiles = append(plan.Profiles, profile.Name())
profile.ExtendAcquisitionPlan(&plan, signals)
}
plan.Profiles = dedupeSorted(plan.Profiles)
plan.SeedPaths = dedupeSorted(plan.SeedPaths)
plan.CriticalPaths = dedupeSorted(plan.CriticalPaths)
plan.PlanBPaths = dedupeSorted(plan.PlanBPaths)
plan.Notes = dedupeSorted(plan.Notes)
if plan.Mode == ModeFallback {
ensureSnapshotMaxDocuments(&plan, 180000)
ensurePrefetchEnabled(&plan, true)
addPlanNote(&plan, "fallback acquisition expands safe profile probes")
}
return plan
}
func ApplyAnalysisProfiles(result *models.AnalysisResult, snapshot map[string]interface{}, signals MatchSignals) MatchResult {
match := MatchProfiles(signals)
for _, profile := range match.Profiles {
profile.PostAnalyze(result, snapshot, signals)
}
return match
}
func BuildAnalysisDirectives(match MatchResult) AnalysisDirectives {
return ResolveAnalysisPlan(match, nil, DiscoveredResources{}, MatchSignals{}).Directives
}
func sortProfiles(profiles []Profile) {
sort.Slice(profiles, func(i, j int) bool {
if profiles[i].Priority() == profiles[j].Priority() {
return profiles[i].Name() < profiles[j].Name()
}
return profiles[i].Priority() < profiles[j].Priority()
})
}
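The selection rule in `MatchProfiles` above can be sketched compactly. This is a simplified stand-in, not the package's implementation: plain structs replace the `Profile` interface, score lists are precomputed, and the fallback branch takes input order instead of `sortProfiles`. What it preserves is the core decision: below the 60-point threshold the matcher gives up on scored winners and aggregates every fallback-safe profile.

```go
package main

import (
	"fmt"
	"sort"
)

type candidate struct {
	name     string
	priority int
	score    int
	safe     bool
}

// selectMode keeps positive-score candidates sorted highest-score first
// (ties break on lower priority, keeping ordering deterministic). If the
// best score is under 60, it returns fallback mode with the safe set.
func selectMode(all []candidate) (string, []string) {
	scored := make([]candidate, 0, len(all))
	for _, c := range all {
		if c.score > 0 {
			scored = append(scored, c)
		}
	}
	sort.Slice(scored, func(i, j int) bool {
		if scored[i].score == scored[j].score {
			return scored[i].priority < scored[j].priority
		}
		return scored[i].score > scored[j].score
	})
	if len(scored) == 0 || scored[0].score < 60 {
		names := []string{}
		for _, c := range all {
			if c.safe {
				names = append(names, c.name)
			}
		}
		return "fallback", names
	}
	names := make([]string, 0, len(scored))
	for _, c := range scored {
		names = append(names, c.name)
	}
	return "matched", names
}

func main() {
	mode, names := selectMode([]candidate{
		{name: "msi", priority: 20, score: 90, safe: true},
		{name: "generic", priority: 100, score: 10, safe: true},
	})
	fmt.Println(mode, names) // -> matched [msi generic]
}
```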


@@ -0,0 +1,390 @@
package redfishprofile
import (
"strings"
"testing"
)
func TestMatchProfiles_UnknownVendorFallsBackToAggregateProfiles(t *testing.T) {
match := MatchProfiles(MatchSignals{
ServiceRootProduct: "Redfish Server",
})
if match.Mode != ModeFallback {
t.Fatalf("expected fallback mode, got %q", match.Mode)
}
if len(match.Profiles) < 2 {
t.Fatalf("expected aggregated fallback profiles, got %d", len(match.Profiles))
}
}
func TestMatchProfiles_MSISelectsMatchedMode(t *testing.T) {
match := MatchProfiles(MatchSignals{
SystemManufacturer: "Micro-Star International Co., Ltd.",
ResourceHints: []string{"/redfish/v1/Chassis/GPU1"},
})
if match.Mode != ModeMatched {
t.Fatalf("expected matched mode, got %q", match.Mode)
}
found := false
for _, profile := range match.Profiles {
if profile.Name() == "msi" {
found = true
break
}
}
if !found {
t.Fatal("expected msi profile to be selected")
}
}
func TestBuildAcquisitionPlan_FallbackIncludesProfileNotes(t *testing.T) {
plan := BuildAcquisitionPlan(MatchSignals{
ServiceRootVendor: "AMI",
})
if len(plan.Profiles) == 0 {
t.Fatal("expected acquisition plan profiles")
}
if len(plan.Notes) == 0 {
t.Fatal("expected acquisition plan notes")
}
}
func TestBuildAcquisitionPlan_FallbackAddsBroadCrawlTuning(t *testing.T) {
plan := BuildAcquisitionPlan(MatchSignals{
ServiceRootProduct: "Unknown Redfish",
})
if plan.Mode != ModeFallback {
t.Fatalf("expected fallback mode, got %q", plan.Mode)
}
if plan.Tuning.SnapshotMaxDocuments < 180000 {
t.Fatalf("expected widened snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
}
if plan.Tuning.PrefetchEnabled == nil || !*plan.Tuning.PrefetchEnabled {
t.Fatal("expected fallback to force prefetch enabled")
}
if !plan.Tuning.RecoveryPolicy.EnableCriticalCollectionMemberRetry {
t.Fatal("expected fallback to inherit critical member retry recovery")
}
if !plan.Tuning.RecoveryPolicy.EnableCriticalSlowProbe {
t.Fatal("expected fallback to inherit critical slow probe recovery")
}
}
func TestBuildAcquisitionPlan_HGXDisablesNVMePostProbe(t *testing.T) {
plan := BuildAcquisitionPlan(MatchSignals{
SystemModel: "HGX B200",
ResourceHints: []string{"/redfish/v1/Systems/HGX_Baseboard_0"},
})
if plan.Mode != ModeMatched {
t.Fatalf("expected matched mode, got %q", plan.Mode)
}
if plan.Tuning.NVMePostProbeEnabled == nil || *plan.Tuning.NVMePostProbeEnabled {
t.Fatal("expected hgx profile to disable NVMe post-probe")
}
}
func TestResolveAcquisitionPlan_ExpandsScopedPaths(t *testing.T) {
signals := MatchSignals{}
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
SystemPaths: []string{"/redfish/v1/Systems/1", "/redfish/v1/Systems/2"},
}, signals)
joined := joinResolvedPaths(resolved.SeedPaths)
for _, wanted := range []string{
"/redfish/v1/Systems/1/SimpleStorage",
"/redfish/v1/Systems/1/Storage/IntelVROC",
"/redfish/v1/Systems/2/SimpleStorage",
"/redfish/v1/Systems/2/Storage/IntelVROC",
} {
if !containsJoinedPath(joined, wanted) {
t.Fatalf("expected resolved seed path %q", wanted)
}
}
}
func TestResolveAcquisitionPlan_CriticalBaselineIsShapedByProfiles(t *testing.T) {
signals := MatchSignals{}
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
SystemPaths: []string{"/redfish/v1/Systems/1"},
ChassisPaths: []string{"/redfish/v1/Chassis/1"},
ManagerPaths: []string{"/redfish/v1/Managers/1"},
}, signals)
joined := joinResolvedPaths(resolved.CriticalPaths)
for _, wanted := range []string{
"/redfish/v1",
"/redfish/v1/Systems/1",
"/redfish/v1/Systems/1/Memory",
"/redfish/v1/Chassis/1/Assembly",
"/redfish/v1/Managers/1/NetworkProtocol",
"/redfish/v1/UpdateService",
} {
if !containsJoinedPath(joined, wanted) {
t.Fatalf("expected resolved critical path %q", wanted)
}
}
}
func TestResolveAcquisitionPlan_FallbackAppendsPlanBToSeeds(t *testing.T) {
signals := MatchSignals{ServiceRootProduct: "Unknown Redfish"}
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
if plan.Mode != ModeFallback {
t.Fatalf("expected fallback mode, got %q", plan.Mode)
}
plan.PlanBPaths = append(plan.PlanBPaths, "/redfish/v1/Systems/1/Oem/TestPlanB")
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
SystemPaths: []string{"/redfish/v1/Systems/1"},
}, signals)
if !containsJoinedPath(joinResolvedPaths(resolved.SeedPaths), "/redfish/v1/Systems/1/Oem/TestPlanB") {
t.Fatal("expected fallback resolved seeds to include plan-b path")
}
}
func TestResolveAcquisitionPlan_MSIRefinesDiscoveredGPUChassis(t *testing.T) {
signals := MatchSignals{
SystemManufacturer: "Micro-Star International Co., Ltd.",
ResourceHints: []string{"/redfish/v1/Chassis/GPU1", "/redfish/v1/Chassis/GPU4/Sensors"},
}
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
ChassisPaths: []string{"/redfish/v1/Chassis/1", "/redfish/v1/Chassis/GPU1", "/redfish/v1/Chassis/GPU4"},
}, signals)
joinedSeeds := joinResolvedPaths(resolved.SeedPaths)
joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
if !containsJoinedPath(joinedSeeds, "/redfish/v1/Chassis/GPU1") || !containsJoinedPath(joinedSeeds, "/redfish/v1/Chassis/GPU4") {
t.Fatal("expected MSI refinement to add discovered GPU chassis seed paths")
}
if containsJoinedPath(joinedSeeds, "/redfish/v1/Chassis/GPU2") {
t.Fatal("did not expect undiscovered MSI GPU chassis in resolved seeds")
}
if !containsJoinedPath(joinedCritical, "/redfish/v1/Chassis/GPU1/Sensors") || !containsJoinedPath(joinedCritical, "/redfish/v1/Chassis/GPU4/Sensors") {
t.Fatal("expected MSI refinement to add discovered GPU sensor critical paths")
}
if containsJoinedPath(joinedCritical, "/redfish/v1/Chassis/GPU3/Sensors") {
t.Fatal("did not expect undiscovered MSI GPU sensor critical path")
}
}
func TestResolveAcquisitionPlan_HGXRefinesDiscoveredBaseboardSystems(t *testing.T) {
signals := MatchSignals{
SystemManufacturer: "Supermicro",
SystemModel: "SYS-821GE-TNHR",
ChassisModel: "HGX B200",
ResourceHints: []string{
"/redfish/v1/Systems/HGX_Baseboard_0",
"/redfish/v1/Systems/HGX_Baseboard_0/Processors",
"/redfish/v1/Systems/1",
},
}
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
SystemPaths: []string{"/redfish/v1/Systems/1", "/redfish/v1/Systems/HGX_Baseboard_0"},
}, signals)
joinedSeeds := joinResolvedPaths(resolved.SeedPaths)
joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
if !containsJoinedPath(joinedSeeds, "/redfish/v1/Systems/HGX_Baseboard_0") || !containsJoinedPath(joinedSeeds, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
t.Fatal("expected HGX refinement to add discovered baseboard system paths")
}
if !containsJoinedPath(joinedCritical, "/redfish/v1/Systems/HGX_Baseboard_0") || !containsJoinedPath(joinedCritical, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
t.Fatal("expected HGX refinement to add discovered baseboard critical paths")
}
if containsJoinedPath(joinedSeeds, "/redfish/v1/Systems/HGX_Baseboard_1") {
t.Fatal("did not expect undiscovered HGX baseboard system path")
}
}
func TestResolveAcquisitionPlan_SupermicroRefinesFirmwareInventoryFromHint(t *testing.T) {
signals := MatchSignals{
SystemManufacturer: "Supermicro",
ResourceHints: []string{
"/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory",
"/redfish/v1/Managers/1/Oem/Supermicro/FanMode",
},
}
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
ManagerPaths: []string{"/redfish/v1/Managers/1"},
}, signals)
joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
if !containsJoinedPath(joinedCritical, "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
t.Fatal("expected Supermicro refinement to add firmware inventory critical path")
}
if !containsJoinedPath(joinResolvedPaths(resolved.Plan.PlanBPaths), "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
t.Fatal("expected Supermicro refinement to add firmware inventory plan-b path")
}
}
func TestResolveAcquisitionPlan_DellRefinesDiscoveredIDRACManager(t *testing.T) {
signals := MatchSignals{
SystemManufacturer: "Dell Inc.",
ServiceRootProduct: "iDRAC Redfish Service",
}
match := MatchProfiles(signals)
plan := BuildAcquisitionPlan(signals)
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
ManagerPaths: []string{"/redfish/v1/Managers/1", "/redfish/v1/Managers/iDRAC.Embedded.1"},
}, signals)
joinedSeeds := joinResolvedPaths(resolved.SeedPaths)
joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
if !containsJoinedPath(joinedSeeds, "/redfish/v1/Managers/iDRAC.Embedded.1") {
t.Fatal("expected Dell refinement to add discovered iDRAC manager seed path")
}
if !containsJoinedPath(joinedCritical, "/redfish/v1/Managers/iDRAC.Embedded.1") {
t.Fatal("expected Dell refinement to add discovered iDRAC manager critical path")
}
}
func TestBuildAnalysisDirectives_SupermicroEnablesVendorStorageFallbacks(t *testing.T) {
signals := MatchSignals{
SystemManufacturer: "Supermicro",
SystemModel: "SYS-821GE",
}
match := MatchProfiles(signals)
plan := ResolveAnalysisPlan(match, map[string]interface{}{
"/redfish/v1/Chassis/NVMeSSD.1.StorageBackplane/Drives": map[string]interface{}{},
}, DiscoveredResources{}, signals)
directives := plan.Directives
if !directives.EnableSupermicroNVMeBackplane {
t.Fatal("expected supermicro nvme backplane fallback")
}
}
func joinResolvedPaths(paths []string) string {
return "\n" + strings.Join(paths, "\n") + "\n"
}
func containsJoinedPath(joined, want string) bool {
return strings.Contains(joined, "\n"+want+"\n")
}
func TestBuildAnalysisDirectives_HGXEnablesGPUFallbacks(t *testing.T) {
signals := MatchSignals{
SystemManufacturer: "Supermicro",
SystemModel: "SYS-821GE-TNHR",
ChassisModel: "HGX B200",
ResourceHints: []string{"/redfish/v1/Systems/HGX_Baseboard_0", "/redfish/v1/Chassis/HGX_Chassis_0/PCIeDevices/GPU_SXM_1"},
}
match := MatchProfiles(signals)
plan := ResolveAnalysisPlan(match, map[string]interface{}{
"/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_SXM_1": map[string]interface{}{"ProcessorType": "GPU"},
"/redfish/v1/Chassis/HGX_Chassis_0/PCIeDevices/GPU_SXM_1": map[string]interface{}{},
}, DiscoveredResources{
SystemPaths: []string{"/redfish/v1/Systems/HGX_Baseboard_0"},
}, signals)
directives := plan.Directives
if !directives.EnableProcessorGPUFallback {
t.Fatal("expected processor GPU fallback for hgx profile")
}
if !directives.EnableProcessorGPUChassisAlias {
t.Fatal("expected processor GPU chassis alias resolution for hgx profile")
}
if !directives.EnableGenericGraphicsControllerDedup {
t.Fatal("expected graphics-controller dedup for hgx profile")
}
}
func TestBuildAnalysisDirectives_MSIEnablesMSIChassisLookup(t *testing.T) {
signals := MatchSignals{
SystemManufacturer: "Micro-Star International Co., Ltd.",
}
match := MatchProfiles(signals)
plan := ResolveAnalysisPlan(match, map[string]interface{}{
"/redfish/v1/Systems/1/Processors/GPU1": map[string]interface{}{"ProcessorType": "GPU"},
"/redfish/v1/Chassis/GPU1": map[string]interface{}{},
}, DiscoveredResources{
SystemPaths: []string{"/redfish/v1/Systems/1"},
ChassisPaths: []string{"/redfish/v1/Chassis/GPU1"},
}, signals)
directives := plan.Directives
if !directives.EnableMSIProcessorGPUChassisLookup {
t.Fatal("expected MSI processor GPU chassis lookup")
}
}
func TestBuildAnalysisDirectives_SupermicroEnablesStorageRecovery(t *testing.T) {
signals := MatchSignals{
SystemManufacturer: "Supermicro",
}
match := MatchProfiles(signals)
plan := ResolveAnalysisPlan(match, map[string]interface{}{
"/redfish/v1/Chassis/1/Drives": map[string]interface{}{},
"/redfish/v1/Systems/1/Storage/IntelVROC": map[string]interface{}{},
"/redfish/v1/Systems/1/Storage/IntelVROC/Drives": map[string]interface{}{},
}, DiscoveredResources{}, signals)
directives := plan.Directives
if !directives.EnableStorageEnclosureRecovery {
t.Fatal("expected storage enclosure recovery for supermicro")
}
if !directives.EnableKnownStorageControllerRecovery {
t.Fatal("expected known storage controller recovery for supermicro")
}
}
func TestMatchProfiles_OrderingIsDeterministic(t *testing.T) {
signals := MatchSignals{
SystemManufacturer: "Micro-Star International Co., Ltd.",
ResourceHints: []string{"/redfish/v1/Chassis/GPU1"},
}
first := MatchProfiles(signals)
second := MatchProfiles(signals)
if len(first.Profiles) != len(second.Profiles) {
t.Fatalf("profile stack size differs across calls: %d vs %d", len(first.Profiles), len(second.Profiles))
}
for i := range first.Profiles {
if first.Profiles[i].Name() != second.Profiles[i].Name() {
t.Fatalf("profile ordering differs at index %d: %q vs %q", i, first.Profiles[i].Name(), second.Profiles[i].Name())
}
}
}
func TestMatchProfiles_FallbackOrderingIsDeterministic(t *testing.T) {
signals := MatchSignals{ServiceRootProduct: "Unknown Redfish"}
first := MatchProfiles(signals)
second := MatchProfiles(signals)
if first.Mode != ModeFallback || second.Mode != ModeFallback {
t.Fatalf("expected fallback mode in both calls")
}
if len(first.Profiles) != len(second.Profiles) {
t.Fatalf("fallback profile stack size differs: %d vs %d", len(first.Profiles), len(second.Profiles))
}
for i := range first.Profiles {
if first.Profiles[i].Name() != second.Profiles[i].Name() {
t.Fatalf("fallback profile ordering differs at index %d: %q vs %q", i, first.Profiles[i].Name(), second.Profiles[i].Name())
}
}
}
func TestMatchProfiles_FallbackOnlySelectsSafeProfiles(t *testing.T) {
match := MatchProfiles(MatchSignals{ServiceRootProduct: "Unknown Generic Redfish Server"})
if match.Mode != ModeFallback {
t.Fatalf("expected fallback mode, got %q", match.Mode)
}
for _, profile := range match.Profiles {
if !profile.SafeForFallback() {
t.Fatalf("fallback mode included non-safe profile %q", profile.Name())
}
}
}
func TestBuildAnalysisDirectives_GenericMatchedKeepsFallbacksDisabled(t *testing.T) {
match := MatchResult{
Mode: ModeMatched,
Profiles: []Profile{genericProfile()},
}
directives := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, MatchSignals{}).Directives
if directives.EnableProcessorGPUFallback {
t.Fatal("did not expect processor GPU fallback for generic matched profile")
}
if directives.EnableSupermicroNVMeBackplane {
t.Fatal("did not expect supermicro nvme fallback for generic matched profile")
}
if directives.EnableGenericGraphicsControllerDedup {
t.Fatal("did not expect generic graphics-controller dedup for generic matched profile")
}
}
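The `joinResolvedPaths`/`containsJoinedPath` helpers in the tests above rely on a small trick worth calling out: wrapping both the joined haystack and the needle in `"\n"` turns `strings.Contains` into an exact whole-element match, so a path can never match another path's prefix. A self-contained sketch of the same pattern:

```go
package main

import (
	"fmt"
	"strings"
)

// join and has re-implement the newline-wrapping membership check used
// by the tests: "/Systems/1" matches only itself, never "/Systems/10".
func join(paths []string) string { return "\n" + strings.Join(paths, "\n") + "\n" }

func has(joined, want string) bool { return strings.Contains(joined, "\n"+want+"\n") }

func main() {
	j := join([]string{"/redfish/v1/Systems/1", "/redfish/v1/Systems/10"})
	fmt.Println(has(j, "/redfish/v1/Systems/1")) // -> true  (exact element)
	fmt.Println(has(j, "/redfish/v1/Systems/"))  // -> false (prefix only)
}
```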


@@ -0,0 +1,33 @@
package redfishprofile
func amiProfile() Profile {
return staticProfile{
name: "ami-family",
priority: 10,
safeForFallback: true,
matchFn: func(s MatchSignals) int {
score := 0
if containsFold(s.ServiceRootVendor, "ami") || containsFold(s.ServiceRootProduct, "ami") {
score += 70
}
for _, ns := range s.OEMNamespaces {
if containsFold(ns, "ami") {
score += 30
break
}
}
return min(score, 100)
},
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
addPlanPaths(&plan.SeedPaths,
"/redfish/v1/Oem/Ami",
"/redfish/v1/Oem/Ami/InventoryData/Status",
)
ensurePrefetchEnabled(plan, true)
addPlanNote(plan, "ami-family acquisition extensions enabled")
},
applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
d.EnableGenericGraphicsControllerDedup = true
},
}
}


@@ -0,0 +1,45 @@
package redfishprofile
func dellProfile() Profile {
return staticProfile{
name: "dell",
priority: 20,
safeForFallback: true,
matchFn: func(s MatchSignals) int {
score := 0
if containsFold(s.SystemManufacturer, "dell") || containsFold(s.ChassisManufacturer, "dell") {
score += 80
}
for _, ns := range s.OEMNamespaces {
if containsFold(ns, "dell") {
score += 30
break
}
}
if containsFold(s.ServiceRootProduct, "idrac") {
score += 30
}
return min(score, 100)
},
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
EnableProfilePlanB: true,
})
addPlanNote(plan, "dell iDRAC acquisition extensions enabled")
},
refineAcquisition: func(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, _ MatchSignals) {
for _, managerPath := range discovered.ManagerPaths {
if !containsFold(managerPath, "idrac") {
continue
}
addPlanPaths(&resolved.SeedPaths, managerPath)
addPlanPaths(&resolved.Plan.SeedPaths, managerPath)
addPlanPaths(&resolved.CriticalPaths, managerPath)
addPlanPaths(&resolved.Plan.CriticalPaths, managerPath)
}
},
applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
d.EnableGenericGraphicsControllerDedup = true
},
}
}


@@ -0,0 +1,117 @@
package redfishprofile
func genericProfile() Profile {
return staticProfile{
name: "generic",
priority: 100,
safeForFallback: true,
matchFn: func(MatchSignals) int { return 10 },
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
ensurePrefetchPolicy(plan, AcquisitionPrefetchPolicy{
IncludeSuffixes: []string{
"/Bios",
"/SecureBoot",
"/Processors",
"/Memory",
"/Storage",
"/SimpleStorage",
"/PCIeDevices",
"/PCIeFunctions",
"/Accelerators",
"/GraphicsControllers",
"/EthernetInterfaces",
"/NetworkInterfaces",
"/NetworkAdapters",
"/Drives",
"/Power",
"/PowerSubsystem/PowerSupplies",
"/NetworkProtocol",
"/UpdateService",
"/UpdateService/FirmwareInventory",
},
ExcludeContains: []string{
"/Fabrics",
"/Backplanes",
"/Boards",
"/Assembly",
"/Sensors",
"/ThresholdSensors",
"/DiscreteSensors",
"/ThermalConfig",
"/ThermalSubsystem",
"/EnvironmentMetrics",
"/Certificates",
"/LogServices",
},
})
ensureScopedPathPolicy(plan, AcquisitionScopedPathPolicy{
SystemCriticalSuffixes: []string{
"/Bios",
"/SecureBoot",
"/Oem/Public",
"/Oem/Public/FRU",
"/Processors",
"/Memory",
"/Storage",
"/PCIeDevices",
"/PCIeFunctions",
"/Accelerators",
"/GraphicsControllers",
"/EthernetInterfaces",
"/NetworkInterfaces",
"/SimpleStorage",
"/Storage/IntelVROC",
"/Storage/IntelVROC/Drives",
"/Storage/IntelVROC/Volumes",
},
ChassisCriticalSuffixes: []string{
"/Oem/Public",
"/Oem/Public/FRU",
"/Power",
"/NetworkAdapters",
"/PCIeDevices",
"/Accelerators",
"/Drives",
"/Assembly",
},
ManagerCriticalSuffixes: []string{
"/NetworkProtocol",
},
SystemSeedSuffixes: []string{
"/SimpleStorage",
"/Storage/IntelVROC",
"/Storage/IntelVROC/Drives",
"/Storage/IntelVROC/Volumes",
},
})
addPlanPaths(&plan.CriticalPaths,
"/redfish/v1/UpdateService",
"/redfish/v1/UpdateService/FirmwareInventory",
)
ensureSnapshotMaxDocuments(plan, 100000)
ensureSnapshotWorkers(plan, 6)
ensurePrefetchWorkers(plan, 4)
ensureETABaseline(plan, AcquisitionETABaseline{
DiscoverySeconds: 8,
SnapshotSeconds: 90,
PrefetchSeconds: 20,
CriticalPlanBSeconds: 20,
ProfilePlanBSeconds: 15,
})
ensurePostProbePolicy(plan, AcquisitionPostProbePolicy{
EnableNumericCollectionProbe: true,
})
ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
EnableCriticalCollectionMemberRetry: true,
EnableCriticalSlowProbe: true,
})
ensureRatePolicy(plan, AcquisitionRatePolicy{
TargetP95LatencyMS: 900,
ThrottleP95LatencyMS: 1800,
MinSnapshotWorkers: 2,
MinPrefetchWorkers: 1,
DisablePrefetchOnErrors: true,
})
},
}
}


@@ -0,0 +1,85 @@
package redfishprofile
func hgxProfile() Profile {
return staticProfile{
name: "hgx-topology",
priority: 30,
safeForFallback: true,
matchFn: func(s MatchSignals) int {
score := 0
if containsFold(s.SystemModel, "hgx") || containsFold(s.ChassisModel, "hgx") {
score += 70
}
for _, hint := range s.ResourceHints {
if containsFold(hint, "hgx_") || containsFold(hint, "gpu_sxm") {
score += 20
break
}
}
return min(score, 100)
},
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
ensureSnapshotMaxDocuments(plan, 180000)
ensureSnapshotWorkers(plan, 4)
ensurePrefetchWorkers(plan, 4)
ensureNVMePostProbeEnabled(plan, false)
ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
EnableProfilePlanB: true,
})
ensureETABaseline(plan, AcquisitionETABaseline{
DiscoverySeconds: 20,
SnapshotSeconds: 300,
PrefetchSeconds: 50,
CriticalPlanBSeconds: 90,
ProfilePlanBSeconds: 40,
})
ensureRatePolicy(plan, AcquisitionRatePolicy{
TargetP95LatencyMS: 1500,
ThrottleP95LatencyMS: 3000,
MinSnapshotWorkers: 1,
MinPrefetchWorkers: 1,
DisablePrefetchOnErrors: true,
})
addPlanNote(plan, "hgx topology acquisition extensions enabled")
},
refineAcquisition: func(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, _ MatchSignals) {
for _, systemPath := range discovered.SystemPaths {
if !containsFold(systemPath, "hgx_baseboard_") {
continue
}
addPlanPaths(&resolved.SeedPaths, systemPath, joinPath(systemPath, "/Processors"))
addPlanPaths(&resolved.Plan.SeedPaths, systemPath, joinPath(systemPath, "/Processors"))
addPlanPaths(&resolved.CriticalPaths, systemPath, joinPath(systemPath, "/Processors"))
addPlanPaths(&resolved.Plan.CriticalPaths, systemPath, joinPath(systemPath, "/Processors"))
addPlanPaths(&resolved.Plan.PlanBPaths, systemPath, joinPath(systemPath, "/Processors"))
}
},
applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
d.EnableGenericGraphicsControllerDedup = true
d.EnableStorageEnclosureRecovery = true
},
refineAnalysis: func(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, _ MatchSignals) {
if snapshotHasGPUProcessor(snapshot, discovered.SystemPaths) && (snapshotHasPathContaining(snapshot, "gpu_sxm") || snapshotHasPathContaining(snapshot, "hgx_")) {
plan.Directives.EnableProcessorGPUFallback = true
plan.Directives.EnableProcessorGPUChassisAlias = true
addAnalysisLookupMode(plan, "hgx-alias")
addAnalysisNote(plan, "hgx analysis enables processor-gpu alias fallback from snapshot topology")
}
if snapshotHasStorageControllerHint(snapshot, "/storage/intelvroc", "/storage/ha-raid", "/storage/mrvl.ha-raid") {
plan.Directives.EnableKnownStorageControllerRecovery = true
addAnalysisStorageDriveCollections(plan,
"/Storage/IntelVROC/Drives",
"/Storage/IntelVROC/Controllers/1/Drives",
)
addAnalysisStorageVolumeCollections(plan,
"/Storage/IntelVROC/Volumes",
"/Storage/HA-RAID/Volumes",
"/Storage/MRVL.HA-RAID/Volumes",
)
}
if snapshotHasPathContaining(snapshot, "/chassis/nvmessd.") && snapshotHasPathContaining(snapshot, ".storagebackplane") {
plan.Directives.EnableSupermicroNVMeBackplane = true
}
},
}
}


@@ -0,0 +1,72 @@
package redfishprofile
import "strings"
func msiProfile() Profile {
return staticProfile{
name: "msi",
priority: 20,
safeForFallback: true,
matchFn: func(s MatchSignals) int {
score := 0
if containsFold(s.SystemManufacturer, "micro-star") || containsFold(s.ChassisManufacturer, "micro-star") {
score += 80
}
if containsFold(s.SystemManufacturer, "msi") || containsFold(s.ChassisManufacturer, "msi") {
score += 40
}
for _, hint := range s.ResourceHints {
if strings.HasPrefix(hint, "/redfish/v1/Chassis/GPU") {
score += 10
break
}
}
return min(score, 100)
},
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
ensureSnapshotWorkers(plan, 6)
ensurePrefetchWorkers(plan, 8)
ensureETABaseline(plan, AcquisitionETABaseline{
DiscoverySeconds: 12,
SnapshotSeconds: 120,
PrefetchSeconds: 25,
CriticalPlanBSeconds: 35,
ProfilePlanBSeconds: 25,
})
ensureRatePolicy(plan, AcquisitionRatePolicy{
TargetP95LatencyMS: 1000,
ThrottleP95LatencyMS: 2200,
MinSnapshotWorkers: 2,
MinPrefetchWorkers: 2,
DisablePrefetchOnErrors: true,
})
ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
EnableProfilePlanB: true,
})
addPlanNote(plan, "msi gpu chassis probes enabled")
},
refineAcquisition: func(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, _ MatchSignals) {
for _, chassisPath := range discovered.ChassisPaths {
if !strings.HasPrefix(chassisPath, "/redfish/v1/Chassis/GPU") {
continue
}
addPlanPaths(&resolved.SeedPaths, chassisPath)
addPlanPaths(&resolved.Plan.SeedPaths, chassisPath)
addPlanPaths(&resolved.CriticalPaths, joinPath(chassisPath, "/Sensors"))
addPlanPaths(&resolved.Plan.CriticalPaths, joinPath(chassisPath, "/Sensors"))
addPlanPaths(&resolved.Plan.PlanBPaths, joinPath(chassisPath, "/Sensors"))
}
},
applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
d.EnableGenericGraphicsControllerDedup = true
},
refineAnalysis: func(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, _ MatchSignals) {
if snapshotHasGPUProcessor(snapshot, discovered.SystemPaths) && snapshotHasPathPrefix(snapshot, "/redfish/v1/Chassis/GPU") {
plan.Directives.EnableProcessorGPUFallback = true
plan.Directives.EnableMSIProcessorGPUChassisLookup = true
addAnalysisLookupMode(plan, "msi-index")
addAnalysisNote(plan, "msi analysis enables processor-gpu fallback from discovered GPU chassis")
}
},
}
}


@@ -0,0 +1,81 @@
package redfishprofile
func supermicroProfile() Profile {
return staticProfile{
name: "supermicro",
priority: 20,
safeForFallback: true,
matchFn: func(s MatchSignals) int {
score := 0
if containsFold(s.SystemManufacturer, "supermicro") || containsFold(s.ChassisManufacturer, "supermicro") {
score += 80
}
for _, hint := range s.ResourceHints {
if containsFold(hint, "hgx_baseboard") || containsFold(hint, "hgx_gpu_sxm") {
score += 20
break
}
}
return min(score, 100)
},
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
ensureSnapshotMaxDocuments(plan, 150000)
ensureSnapshotWorkers(plan, 6)
ensurePrefetchWorkers(plan, 4)
ensureETABaseline(plan, AcquisitionETABaseline{
DiscoverySeconds: 15,
SnapshotSeconds: 180,
PrefetchSeconds: 35,
CriticalPlanBSeconds: 45,
ProfilePlanBSeconds: 30,
})
ensurePostProbePolicy(plan, AcquisitionPostProbePolicy{
EnableDirectNVMEDiskBayProbe: true,
})
ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
EnableProfilePlanB: true,
})
ensureRatePolicy(plan, AcquisitionRatePolicy{
TargetP95LatencyMS: 1200,
ThrottleP95LatencyMS: 2400,
MinSnapshotWorkers: 2,
MinPrefetchWorkers: 1,
DisablePrefetchOnErrors: true,
})
addPlanNote(plan, "supermicro acquisition extensions enabled")
},
refineAcquisition: func(resolved *ResolvedAcquisitionPlan, _ DiscoveredResources, signals MatchSignals) {
for _, hint := range signals.ResourceHints {
if normalizePath(hint) != "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory" {
continue
}
addPlanPaths(&resolved.CriticalPaths, hint)
addPlanPaths(&resolved.Plan.CriticalPaths, hint)
addPlanPaths(&resolved.Plan.PlanBPaths, hint)
break
}
},
applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
d.EnableStorageEnclosureRecovery = true
},
refineAnalysis: func(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, _ DiscoveredResources, _ MatchSignals) {
if snapshotHasPathContaining(snapshot, "/chassis/nvmessd.") && snapshotHasPathContaining(snapshot, ".storagebackplane") {
plan.Directives.EnableSupermicroNVMeBackplane = true
addAnalysisNote(plan, "supermicro analysis enables NVMe backplane recovery from snapshot paths")
}
if snapshotHasStorageControllerHint(snapshot, "/storage/intelvroc", "/storage/ha-raid", "/storage/mrvl.ha-raid") {
plan.Directives.EnableKnownStorageControllerRecovery = true
addAnalysisStorageDriveCollections(plan,
"/Storage/IntelVROC/Drives",
"/Storage/IntelVROC/Controllers/1/Drives",
)
addAnalysisStorageVolumeCollections(plan,
"/Storage/IntelVROC/Volumes",
"/Storage/HA-RAID/Volumes",
"/Storage/MRVL.HA-RAID/Volumes",
)
addAnalysisNote(plan, "supermicro analysis enables known storage-controller recovery from snapshot paths")
}
},
}
}


@@ -0,0 +1,55 @@
package redfishprofile

func xfusionProfile() Profile {
	return staticProfile{
		name:            "xfusion",
		priority:        20,
		safeForFallback: true,
		matchFn: func(s MatchSignals) int {
			score := 0
			if containsFold(s.ServiceRootVendor, "xfusion") {
				score += 90
			}
			for _, ns := range s.OEMNamespaces {
				if containsFold(ns, "xfusion") {
					score += 20
					break
				}
			}
			if containsFold(s.SystemManufacturer, "xfusion") || containsFold(s.ChassisManufacturer, "xfusion") {
				score += 40
			}
			return min(score, 100)
		},
		extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
			ensureSnapshotMaxDocuments(plan, 120000)
			ensureSnapshotWorkers(plan, 4)
			ensurePrefetchWorkers(plan, 4)
			ensurePrefetchEnabled(plan, true)
			ensureETABaseline(plan, AcquisitionETABaseline{
				DiscoverySeconds:     10,
				SnapshotSeconds:      90,
				PrefetchSeconds:      20,
				CriticalPlanBSeconds: 30,
				ProfilePlanBSeconds:  20,
			})
			ensureRatePolicy(plan, AcquisitionRatePolicy{
				TargetP95LatencyMS:      800,
				ThrottleP95LatencyMS:    1800,
				MinSnapshotWorkers:      2,
				MinPrefetchWorkers:      1,
				DisablePrefetchOnErrors: true,
			})
			addPlanNote(plan, "xfusion ibmc acquisition extensions enabled")
		},
		applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
			d.EnableGenericGraphicsControllerDedup = true
		},
		refineAnalysis: func(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, _ MatchSignals) {
			if snapshotHasGPUProcessor(snapshot, discovered.SystemPaths) {
				plan.Directives.EnableProcessorGPUFallback = true
				addAnalysisNote(plan, "xfusion analysis enables processor-gpu fallback from snapshot topology")
			}
		},
	}
}
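The scoring in `matchFn` above is additive and capped at 100. A minimal standalone sketch of that behavior (assuming only two of the signals, `ServiceRootVendor` and a single OEM namespace; the real matcher also checks the manufacturer fields):

```go
package main

import "strings"

// containsFold mirrors the redfishprofile helper: a case-insensitive,
// whitespace-trimmed substring check.
func containsFold(v, sub string) bool {
	return strings.Contains(strings.ToLower(strings.TrimSpace(v)), strings.ToLower(strings.TrimSpace(sub)))
}

// scoreXFusion sketches the xfusion matchFn for two of its signals.
// An exact vendor hit alone clears the usual match threshold; the OEM
// namespace adds corroboration, and the total is capped at 100.
func scoreXFusion(serviceRootVendor, oemNamespace string) int {
	score := 0
	if containsFold(serviceRootVendor, "xfusion") {
		score += 90
	}
	if containsFold(oemNamespace, "xfusion") {
		score += 20
	}
	if score > 100 {
		score = 100
	}
	return score
}
```

With both signals present the raw sum is 110, clamped to 100; a non-matching BMC scores 0 and falls through to other profiles.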


@@ -0,0 +1,229 @@
package redfishprofile

import (
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
)

type staticProfile struct {
	name                    string
	priority                int
	safeForFallback         bool
	matchFn                 func(MatchSignals) int
	extendAcquisition       func(*AcquisitionPlan, MatchSignals)
	refineAcquisition       func(*ResolvedAcquisitionPlan, DiscoveredResources, MatchSignals)
	applyAnalysisDirectives func(*AnalysisDirectives, MatchSignals)
	refineAnalysis          func(*ResolvedAnalysisPlan, map[string]interface{}, DiscoveredResources, MatchSignals)
	postAnalyze             func(*models.AnalysisResult, map[string]interface{}, MatchSignals)
}

func (p staticProfile) Name() string                   { return p.name }
func (p staticProfile) Priority() int                  { return p.priority }
func (p staticProfile) Match(signals MatchSignals) int { return p.matchFn(normalizeSignals(signals)) }
func (p staticProfile) SafeForFallback() bool          { return p.safeForFallback }

func (p staticProfile) ExtendAcquisitionPlan(plan *AcquisitionPlan, signals MatchSignals) {
	if p.extendAcquisition != nil {
		p.extendAcquisition(plan, normalizeSignals(signals))
	}
}

func (p staticProfile) RefineAcquisitionPlan(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, signals MatchSignals) {
	if p.refineAcquisition != nil {
		p.refineAcquisition(resolved, discovered, normalizeSignals(signals))
	}
}

func (p staticProfile) ApplyAnalysisDirectives(directives *AnalysisDirectives, signals MatchSignals) {
	if p.applyAnalysisDirectives != nil {
		p.applyAnalysisDirectives(directives, normalizeSignals(signals))
	}
}

func (p staticProfile) RefineAnalysisPlan(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, signals MatchSignals) {
	if p.refineAnalysis != nil {
		p.refineAnalysis(plan, snapshot, discovered, normalizeSignals(signals))
	}
}

func (p staticProfile) PostAnalyze(result *models.AnalysisResult, snapshot map[string]interface{}, signals MatchSignals) {
	if p.postAnalyze != nil {
		p.postAnalyze(result, snapshot, normalizeSignals(signals))
	}
}

func BuiltinProfiles() []Profile {
	return []Profile{
		genericProfile(),
		amiProfile(),
		msiProfile(),
		supermicroProfile(),
		dellProfile(),
		hgxProfile(),
		xfusionProfile(),
	}
}

func containsFold(v, sub string) bool {
	return strings.Contains(strings.ToLower(strings.TrimSpace(v)), strings.ToLower(strings.TrimSpace(sub)))
}

func addPlanPaths(dst *[]string, paths ...string) {
	*dst = append(*dst, paths...)
	*dst = dedupeSorted(*dst)
}

func addPlanNote(plan *AcquisitionPlan, note string) {
	if strings.TrimSpace(note) == "" {
		return
	}
	plan.Notes = append(plan.Notes, note)
	plan.Notes = dedupeSorted(plan.Notes)
}

func addAnalysisNote(plan *ResolvedAnalysisPlan, note string) {
	if plan == nil || strings.TrimSpace(note) == "" {
		return
	}
	plan.Notes = append(plan.Notes, note)
	plan.Notes = dedupeSorted(plan.Notes)
}

func addAnalysisLookupMode(plan *ResolvedAnalysisPlan, mode string) {
	if plan == nil || strings.TrimSpace(mode) == "" {
		return
	}
	plan.ProcessorGPUChassisLookupModes = dedupeSorted(append(plan.ProcessorGPUChassisLookupModes, mode))
}

func addAnalysisStorageDriveCollections(plan *ResolvedAnalysisPlan, rels ...string) {
	if plan == nil {
		return
	}
	plan.KnownStorageDriveCollections = dedupeSorted(append(plan.KnownStorageDriveCollections, rels...))
}

func addAnalysisStorageVolumeCollections(plan *ResolvedAnalysisPlan, rels ...string) {
	if plan == nil {
		return
	}
	plan.KnownStorageVolumeCollections = dedupeSorted(append(plan.KnownStorageVolumeCollections, rels...))
}

func ensureSnapshotMaxDocuments(plan *AcquisitionPlan, n int) {
	if n <= 0 {
		return
	}
	if plan.Tuning.SnapshotMaxDocuments < n {
		plan.Tuning.SnapshotMaxDocuments = n
	}
}

func ensureSnapshotWorkers(plan *AcquisitionPlan, n int) {
	if n <= 0 {
		return
	}
	if plan.Tuning.SnapshotWorkers < n {
		plan.Tuning.SnapshotWorkers = n
	}
}

func ensurePrefetchEnabled(plan *AcquisitionPlan, enabled bool) {
	if plan.Tuning.PrefetchEnabled == nil {
		plan.Tuning.PrefetchEnabled = new(bool)
	}
	*plan.Tuning.PrefetchEnabled = enabled
}

func ensurePrefetchWorkers(plan *AcquisitionPlan, n int) {
	if n <= 0 {
		return
	}
	if plan.Tuning.PrefetchWorkers < n {
		plan.Tuning.PrefetchWorkers = n
	}
}

func ensureNVMePostProbeEnabled(plan *AcquisitionPlan, enabled bool) {
	if plan.Tuning.NVMePostProbeEnabled == nil {
		plan.Tuning.NVMePostProbeEnabled = new(bool)
	}
	*plan.Tuning.NVMePostProbeEnabled = enabled
}

func ensureRatePolicy(plan *AcquisitionPlan, policy AcquisitionRatePolicy) {
	if policy.TargetP95LatencyMS > plan.Tuning.RatePolicy.TargetP95LatencyMS {
		plan.Tuning.RatePolicy.TargetP95LatencyMS = policy.TargetP95LatencyMS
	}
	if policy.ThrottleP95LatencyMS > plan.Tuning.RatePolicy.ThrottleP95LatencyMS {
		plan.Tuning.RatePolicy.ThrottleP95LatencyMS = policy.ThrottleP95LatencyMS
	}
	if policy.MinSnapshotWorkers > plan.Tuning.RatePolicy.MinSnapshotWorkers {
		plan.Tuning.RatePolicy.MinSnapshotWorkers = policy.MinSnapshotWorkers
	}
	if policy.MinPrefetchWorkers > plan.Tuning.RatePolicy.MinPrefetchWorkers {
		plan.Tuning.RatePolicy.MinPrefetchWorkers = policy.MinPrefetchWorkers
	}
	if policy.DisablePrefetchOnErrors {
		plan.Tuning.RatePolicy.DisablePrefetchOnErrors = true
	}
}

func ensureETABaseline(plan *AcquisitionPlan, baseline AcquisitionETABaseline) {
	if baseline.DiscoverySeconds > plan.Tuning.ETABaseline.DiscoverySeconds {
		plan.Tuning.ETABaseline.DiscoverySeconds = baseline.DiscoverySeconds
	}
	if baseline.SnapshotSeconds > plan.Tuning.ETABaseline.SnapshotSeconds {
		plan.Tuning.ETABaseline.SnapshotSeconds = baseline.SnapshotSeconds
	}
	if baseline.PrefetchSeconds > plan.Tuning.ETABaseline.PrefetchSeconds {
		plan.Tuning.ETABaseline.PrefetchSeconds = baseline.PrefetchSeconds
	}
	if baseline.CriticalPlanBSeconds > plan.Tuning.ETABaseline.CriticalPlanBSeconds {
		plan.Tuning.ETABaseline.CriticalPlanBSeconds = baseline.CriticalPlanBSeconds
	}
	if baseline.ProfilePlanBSeconds > plan.Tuning.ETABaseline.ProfilePlanBSeconds {
		plan.Tuning.ETABaseline.ProfilePlanBSeconds = baseline.ProfilePlanBSeconds
	}
}

func ensurePostProbePolicy(plan *AcquisitionPlan, policy AcquisitionPostProbePolicy) {
	if policy.EnableDirectNVMEDiskBayProbe {
		plan.Tuning.PostProbePolicy.EnableDirectNVMEDiskBayProbe = true
	}
	if policy.EnableNumericCollectionProbe {
		plan.Tuning.PostProbePolicy.EnableNumericCollectionProbe = true
	}
	if policy.EnableSensorCollectionProbe {
		plan.Tuning.PostProbePolicy.EnableSensorCollectionProbe = true
	}
}

func ensureRecoveryPolicy(plan *AcquisitionPlan, policy AcquisitionRecoveryPolicy) {
	if policy.EnableCriticalCollectionMemberRetry {
		plan.Tuning.RecoveryPolicy.EnableCriticalCollectionMemberRetry = true
	}
	if policy.EnableCriticalSlowProbe {
		plan.Tuning.RecoveryPolicy.EnableCriticalSlowProbe = true
	}
	if policy.EnableProfilePlanB {
		plan.Tuning.RecoveryPolicy.EnableProfilePlanB = true
	}
}

func ensureScopedPathPolicy(plan *AcquisitionPlan, policy AcquisitionScopedPathPolicy) {
	addPlanPaths(&plan.ScopedPaths.SystemSeedSuffixes, policy.SystemSeedSuffixes...)
	addPlanPaths(&plan.ScopedPaths.SystemCriticalSuffixes, policy.SystemCriticalSuffixes...)
	addPlanPaths(&plan.ScopedPaths.ChassisSeedSuffixes, policy.ChassisSeedSuffixes...)
	addPlanPaths(&plan.ScopedPaths.ChassisCriticalSuffixes, policy.ChassisCriticalSuffixes...)
	addPlanPaths(&plan.ScopedPaths.ManagerSeedSuffixes, policy.ManagerSeedSuffixes...)
	addPlanPaths(&plan.ScopedPaths.ManagerCriticalSuffixes, policy.ManagerCriticalSuffixes...)
}

func ensurePrefetchPolicy(plan *AcquisitionPlan, policy AcquisitionPrefetchPolicy) {
	addPlanPaths(&plan.Tuning.PrefetchPolicy.IncludeSuffixes, policy.IncludeSuffixes...)
	addPlanPaths(&plan.Tuning.PrefetchPolicy.ExcludeContains, policy.ExcludeContains...)
}

func min(a, b int) int {
	if a < b {
		return a
	}
	return b
}


@@ -0,0 +1,98 @@
package redfishprofile

import "strings"

func CollectSignals(serviceRootDoc, systemDoc, chassisDoc, managerDoc map[string]interface{}, resourceHints []string) MatchSignals {
	signals := MatchSignals{
		ServiceRootVendor:   lookupString(serviceRootDoc, "Vendor"),
		ServiceRootProduct:  lookupString(serviceRootDoc, "Product"),
		SystemManufacturer:  lookupString(systemDoc, "Manufacturer"),
		SystemModel:         lookupString(systemDoc, "Model"),
		SystemSKU:           lookupString(systemDoc, "SKU"),
		ChassisManufacturer: lookupString(chassisDoc, "Manufacturer"),
		ChassisModel:        lookupString(chassisDoc, "Model"),
		ManagerManufacturer: lookupString(managerDoc, "Manufacturer"),
		ResourceHints:       resourceHints,
	}
	signals.OEMNamespaces = dedupeSorted(append(
		oemNamespaces(serviceRootDoc),
		append(oemNamespaces(systemDoc), append(oemNamespaces(chassisDoc), oemNamespaces(managerDoc)...)...)...,
	))
	return normalizeSignals(signals)
}

func CollectSignalsFromTree(tree map[string]interface{}) MatchSignals {
	getDoc := func(path string) map[string]interface{} {
		if v, ok := tree[path]; ok {
			if doc, ok := v.(map[string]interface{}); ok {
				return doc
			}
		}
		return nil
	}
	memberPath := func(collectionPath, fallbackPath string) string {
		collection := getDoc(collectionPath)
		if len(collection) != 0 {
			if members, ok := collection["Members"].([]interface{}); ok && len(members) > 0 {
				if ref, ok := members[0].(map[string]interface{}); ok {
					if path := lookupString(ref, "@odata.id"); path != "" {
						return path
					}
				}
			}
		}
		return fallbackPath
	}
	systemPath := memberPath("/redfish/v1/Systems", "/redfish/v1/Systems/1")
	chassisPath := memberPath("/redfish/v1/Chassis", "/redfish/v1/Chassis/1")
	managerPath := memberPath("/redfish/v1/Managers", "/redfish/v1/Managers/1")
	resourceHints := make([]string, 0, len(tree))
	for path := range tree {
		path = strings.TrimSpace(path)
		if path == "" {
			continue
		}
		resourceHints = append(resourceHints, path)
	}
	return CollectSignals(
		getDoc("/redfish/v1"),
		getDoc(systemPath),
		getDoc(chassisPath),
		getDoc(managerPath),
		resourceHints,
	)
}

func lookupString(doc map[string]interface{}, key string) string {
	if len(doc) == 0 {
		return ""
	}
	value, _ := doc[key]
	if s, ok := value.(string); ok {
		return strings.TrimSpace(s)
	}
	return ""
}

func oemNamespaces(doc map[string]interface{}) []string {
	if len(doc) == 0 {
		return nil
	}
	oem, ok := doc["Oem"].(map[string]interface{})
	if !ok {
		return nil
	}
	out := make([]string, 0, len(oem))
	for key := range oem {
		key = strings.TrimSpace(key)
		if key == "" {
			continue
		}
		out = append(out, key)
	}
	return out
}
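The `memberPath` closure in `CollectSignalsFromTree` is the part that keeps signal collection working on BMCs with non-default member names (Dell's `System.Embedded.1`, AMI's `Self`): it follows the first `Members[].@odata.id` of a collection and only falls back to the conventional `/1` path when the collection is absent or empty. A self-contained sketch of the same resolution logic, taking the snapshot tree as an explicit parameter:

```go
package main

// memberPath mirrors the closure in CollectSignalsFromTree: resolve
// the first collection member's @odata.id from a snapshot tree, or
// return the fallback path when the collection is missing or empty.
func memberPath(tree map[string]interface{}, collectionPath, fallbackPath string) string {
	collection, _ := tree[collectionPath].(map[string]interface{})
	// Reading from a nil map is safe in Go, so a failed assertion above
	// simply makes every lookup below miss and hit the fallback.
	if members, ok := collection["Members"].([]interface{}); ok && len(members) > 0 {
		if ref, ok := members[0].(map[string]interface{}); ok {
			if path, ok := ref["@odata.id"].(string); ok && path != "" {
				return path
			}
		}
	}
	return fallbackPath
}
```

On a Dell-shaped tree this resolves `/redfish/v1/Systems` to `/redfish/v1/Systems/System.Embedded.1`; on an empty tree it returns `/redfish/v1/Systems/1`.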


@@ -0,0 +1,17 @@
{
"ServiceRootVendor": "AMI",
"ServiceRootProduct": "AMI Redfish Server",
"SystemManufacturer": "Gigabyte",
"SystemModel": "G292-Z42",
"SystemSKU": "",
"ChassisManufacturer": "",
"ChassisModel": "",
"ManagerManufacturer": "",
"OEMNamespaces": ["Ami"],
"ResourceHints": [
"/redfish/v1/Chassis/Self",
"/redfish/v1/Managers/Self",
"/redfish/v1/Oem/Ami",
"/redfish/v1/Systems/Self"
]
}


@@ -0,0 +1,18 @@
{
"ServiceRootVendor": "",
"ServiceRootProduct": "iDRAC Redfish Service",
"SystemManufacturer": "Dell Inc.",
"SystemModel": "PowerEdge R750",
"SystemSKU": "0A42H9",
"ChassisManufacturer": "Dell Inc.",
"ChassisModel": "PowerEdge R750",
"ManagerManufacturer": "Dell Inc.",
"OEMNamespaces": ["Dell"],
"ResourceHints": [
"/redfish/v1/Chassis/System.Embedded.1",
"/redfish/v1/Managers/iDRAC.Embedded.1",
"/redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell",
"/redfish/v1/Systems/System.Embedded.1",
"/redfish/v1/Systems/System.Embedded.1/Storage"
]
}


@@ -0,0 +1,33 @@
{
"ServiceRootVendor": "AMI",
"ServiceRootProduct": "AMI Redfish Server",
"SystemManufacturer": "Micro-Star International Co., Ltd.",
"SystemModel": "CG290-S3063",
"SystemSKU": "S3063G290RAU4",
"ChassisManufacturer": "NVIDIA",
"ChassisModel": "",
"ManagerManufacturer": "",
"OEMNamespaces": ["Ami"],
"ResourceHints": [
"/redfish/v1/Chassis/GPU1",
"/redfish/v1/Chassis/GPU1/NetworkAdapters",
"/redfish/v1/Chassis/GPU1/Sensors",
"/redfish/v1/Chassis/GPU1/Sensors/GPU1_Power",
"/redfish/v1/Chassis/GPU1/Sensors/GPU1_TLimit",
"/redfish/v1/Chassis/GPU1/Sensors/GPU1_Temperature",
"/redfish/v1/Chassis/GPU2",
"/redfish/v1/Chassis/GPU2/NetworkAdapters",
"/redfish/v1/Chassis/GPU2/Sensors",
"/redfish/v1/Chassis/GPU2/Sensors/GPU2_Power",
"/redfish/v1/Chassis/GPU2/Sensors/GPU2_TLimit",
"/redfish/v1/Chassis/GPU2/Sensors/GPU2_Temperature",
"/redfish/v1/Chassis/GPU3",
"/redfish/v1/Chassis/GPU3/NetworkAdapters",
"/redfish/v1/Chassis/GPU3/Sensors",
"/redfish/v1/Chassis/GPU3/Sensors/GPU3_Power",
"/redfish/v1/Chassis/GPU3/Sensors/GPU3_TLimit",
"/redfish/v1/Chassis/GPU3/Sensors/GPU3_Temperature",
"/redfish/v1/Chassis/GPU4",
"/redfish/v1/Chassis/GPU4/NetworkAdapters"
]
}


@@ -0,0 +1,33 @@
{
"ServiceRootVendor": "AMI",
"ServiceRootProduct": "AMI Redfish Server",
"SystemManufacturer": "Micro-Star International Co., Ltd.",
"SystemModel": "CG480-S5063",
"SystemSKU": "5063G480RAE20",
"ChassisManufacturer": "NVIDIA",
"ChassisModel": "",
"ManagerManufacturer": "",
"OEMNamespaces": ["Ami"],
"ResourceHints": [
"/redfish/v1/Chassis/GPU1",
"/redfish/v1/Chassis/GPU1/NetworkAdapters",
"/redfish/v1/Chassis/GPU1/Sensors",
"/redfish/v1/Chassis/GPU1/Sensors/GPU1_Power",
"/redfish/v1/Chassis/GPU1/Sensors/GPU1_TLimit",
"/redfish/v1/Chassis/GPU1/Sensors/GPU1_Temperature",
"/redfish/v1/Chassis/GPU2",
"/redfish/v1/Chassis/GPU2/NetworkAdapters",
"/redfish/v1/Chassis/GPU2/Sensors",
"/redfish/v1/Chassis/GPU2/Sensors/GPU2_Power",
"/redfish/v1/Chassis/GPU2/Sensors/GPU2_TLimit",
"/redfish/v1/Chassis/GPU2/Sensors/GPU2_Temperature",
"/redfish/v1/Chassis/GPU3",
"/redfish/v1/Chassis/GPU3/NetworkAdapters",
"/redfish/v1/Chassis/GPU3/Sensors",
"/redfish/v1/Chassis/GPU3/Sensors/GPU3_Power",
"/redfish/v1/Chassis/GPU3/Sensors/GPU3_TLimit",
"/redfish/v1/Chassis/GPU3/Sensors/GPU3_Temperature",
"/redfish/v1/Chassis/GPU4",
"/redfish/v1/Chassis/GPU4/NetworkAdapters"
]
}


@@ -0,0 +1,33 @@
{
"ServiceRootVendor": "Supermicro",
"ServiceRootProduct": "",
"SystemManufacturer": "Supermicro",
"SystemModel": "SYS-821GE-TNHR",
"SystemSKU": "0x1D1415D9",
"ChassisManufacturer": "Supermicro",
"ChassisModel": "X13DEG-OAD",
"ManagerManufacturer": "",
"OEMNamespaces": ["Supermicro"],
"ResourceHints": [
"/redfish/v1/Chassis/HGX_BMC_0",
"/redfish/v1/Chassis/HGX_BMC_0/Assembly",
"/redfish/v1/Chassis/HGX_BMC_0/Controls",
"/redfish/v1/Chassis/HGX_BMC_0/Drives",
"/redfish/v1/Chassis/HGX_BMC_0/EnvironmentMetrics",
"/redfish/v1/Chassis/HGX_BMC_0/LogServices",
"/redfish/v1/Chassis/HGX_BMC_0/PCIeDevices",
"/redfish/v1/Chassis/HGX_BMC_0/PCIeSlots",
"/redfish/v1/Chassis/HGX_BMC_0/PowerSubsystem",
"/redfish/v1/Chassis/HGX_BMC_0/PowerSubsystem/PowerSupplies",
"/redfish/v1/Chassis/HGX_BMC_0/Sensors",
"/redfish/v1/Chassis/HGX_BMC_0/Sensors/HGX_BMC_0_Temp_0",
"/redfish/v1/Chassis/HGX_BMC_0/ThermalSubsystem",
"/redfish/v1/Chassis/HGX_BMC_0/ThermalSubsystem/ThermalMetrics",
"/redfish/v1/Chassis/HGX_Chassis_0",
"/redfish/v1/Chassis/HGX_Chassis_0/Assembly",
"/redfish/v1/Chassis/HGX_Chassis_0/Controls",
"/redfish/v1/Chassis/HGX_Chassis_0/Controls/TotalGPU_Power_0",
"/redfish/v1/Chassis/HGX_Chassis_0/Drives",
"/redfish/v1/Chassis/HGX_Chassis_0/EnvironmentMetrics"
]
}


@@ -0,0 +1,51 @@
{
"ServiceRootVendor": "",
"ServiceRootProduct": "H12DGQ-NT6",
"SystemManufacturer": "Supermicro",
"SystemModel": "AS -4124GQ-TNMI",
"SystemSKU": "091715D9",
"ChassisManufacturer": "Supermicro",
"ChassisModel": "H12DGQ-NT6",
"ManagerManufacturer": "",
"OEMNamespaces": [
"Supermicro"
],
"ResourceHints": [
"/redfish/v1/Chassis/1/PCIeDevices",
"/redfish/v1/Chassis/1/PCIeDevices/GPU1",
"/redfish/v1/Chassis/1/PCIeDevices/GPU1/PCIeFunctions",
"/redfish/v1/Chassis/1/PCIeDevices/GPU1/PCIeFunctions/1",
"/redfish/v1/Chassis/1/PCIeDevices/GPU2",
"/redfish/v1/Chassis/1/PCIeDevices/GPU2/PCIeFunctions",
"/redfish/v1/Chassis/1/PCIeDevices/GPU2/PCIeFunctions/1",
"/redfish/v1/Chassis/1/PCIeDevices/GPU3",
"/redfish/v1/Chassis/1/PCIeDevices/GPU3/PCIeFunctions",
"/redfish/v1/Chassis/1/PCIeDevices/GPU3/PCIeFunctions/1",
"/redfish/v1/Chassis/1/PCIeDevices/GPU4",
"/redfish/v1/Chassis/1/PCIeDevices/GPU4/PCIeFunctions",
"/redfish/v1/Chassis/1/PCIeDevices/GPU4/PCIeFunctions/1",
"/redfish/v1/Chassis/1/PCIeDevices/GPU5",
"/redfish/v1/Chassis/1/PCIeDevices/GPU5/PCIeFunctions",
"/redfish/v1/Chassis/1/PCIeDevices/GPU5/PCIeFunctions/1",
"/redfish/v1/Chassis/1/PCIeDevices/GPU6",
"/redfish/v1/Chassis/1/PCIeDevices/GPU6/PCIeFunctions",
"/redfish/v1/Chassis/1/PCIeDevices/GPU6/PCIeFunctions/1",
"/redfish/v1/Chassis/1/PCIeDevices/GPU7",
"/redfish/v1/Chassis/1/PCIeDevices/GPU7/PCIeFunctions",
"/redfish/v1/Chassis/1/PCIeDevices/GPU7/PCIeFunctions/1",
"/redfish/v1/Chassis/1/PCIeDevices/GPU8",
"/redfish/v1/Chassis/1/PCIeDevices/GPU8/PCIeFunctions",
"/redfish/v1/Chassis/1/PCIeDevices/GPU8/PCIeFunctions/1",
"/redfish/v1/Managers/1/Oem/Supermicro/FanMode",
"/redfish/v1/Oem/Supermicro/DumpService",
"/redfish/v1/UpdateService/FirmwareInventory/GPU1",
"/redfish/v1/UpdateService/FirmwareInventory/GPU2",
"/redfish/v1/UpdateService/FirmwareInventory/GPU3",
"/redfish/v1/UpdateService/FirmwareInventory/GPU4",
"/redfish/v1/UpdateService/FirmwareInventory/GPU5",
"/redfish/v1/UpdateService/FirmwareInventory/GPU6",
"/redfish/v1/UpdateService/FirmwareInventory/GPU7",
"/redfish/v1/UpdateService/FirmwareInventory/GPU8",
"/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory"
]
}


@@ -0,0 +1,16 @@
{
"ServiceRootVendor": "",
"ServiceRootProduct": "Redfish Service",
"SystemManufacturer": "",
"SystemModel": "",
"SystemSKU": "",
"ChassisManufacturer": "",
"ChassisModel": "",
"ManagerManufacturer": "",
"OEMNamespaces": [],
"ResourceHints": [
"/redfish/v1/Chassis/1",
"/redfish/v1/Managers/1",
"/redfish/v1/Systems/1"
]
}


@@ -0,0 +1,24 @@
{
"ServiceRootVendor": "xFusion",
"ServiceRootProduct": "G5500 V7",
"SystemManufacturer": "OEM",
"SystemModel": "G5500 V7",
"SystemSKU": "",
"ChassisManufacturer": "OEM",
"ChassisModel": "G5500 V7",
"ManagerManufacturer": "XFUSION",
"OEMNamespaces": ["xFusion"],
"ResourceHints": [
"/redfish/v1/Chassis/1",
"/redfish/v1/Chassis/1/Drives",
"/redfish/v1/Chassis/1/PCIeDevices",
"/redfish/v1/Chassis/1/Sensors",
"/redfish/v1/Managers/1",
"/redfish/v1/Systems/1",
"/redfish/v1/Systems/1/GraphicsControllers",
"/redfish/v1/Systems/1/Processors",
"/redfish/v1/Systems/1/Processors/Gpu1",
"/redfish/v1/Systems/1/Storages",
"/redfish/v1/UpdateService/FirmwareInventory"
]
}


@@ -0,0 +1,167 @@
package redfishprofile

import (
	"sort"

	"git.mchus.pro/mchus/logpile/internal/models"
)

type MatchSignals struct {
	ServiceRootVendor   string
	ServiceRootProduct  string
	SystemManufacturer  string
	SystemModel         string
	SystemSKU           string
	ChassisManufacturer string
	ChassisModel        string
	ManagerManufacturer string
	OEMNamespaces       []string
	ResourceHints       []string
}

type AcquisitionPlan struct {
	Mode          string
	Profiles      []string
	SeedPaths     []string
	CriticalPaths []string
	PlanBPaths    []string
	Notes         []string
	ScopedPaths   AcquisitionScopedPathPolicy
	Tuning        AcquisitionTuning
}

type DiscoveredResources struct {
	SystemPaths  []string
	ChassisPaths []string
	ManagerPaths []string
}

type ResolvedAcquisitionPlan struct {
	Plan          AcquisitionPlan
	SeedPaths     []string
	CriticalPaths []string
}

type AcquisitionScopedPathPolicy struct {
	SystemSeedSuffixes      []string
	SystemCriticalSuffixes  []string
	ChassisSeedSuffixes     []string
	ChassisCriticalSuffixes []string
	ManagerSeedSuffixes     []string
	ManagerCriticalSuffixes []string
}

type AcquisitionTuning struct {
	SnapshotMaxDocuments int
	SnapshotWorkers      int
	PrefetchEnabled      *bool
	PrefetchWorkers      int
	NVMePostProbeEnabled *bool
	RatePolicy           AcquisitionRatePolicy
	ETABaseline          AcquisitionETABaseline
	PostProbePolicy      AcquisitionPostProbePolicy
	RecoveryPolicy       AcquisitionRecoveryPolicy
	PrefetchPolicy       AcquisitionPrefetchPolicy
}

type AcquisitionRatePolicy struct {
	TargetP95LatencyMS      int
	ThrottleP95LatencyMS    int
	MinSnapshotWorkers      int
	MinPrefetchWorkers      int
	DisablePrefetchOnErrors bool
}

type AcquisitionETABaseline struct {
	DiscoverySeconds     int
	SnapshotSeconds      int
	PrefetchSeconds      int
	CriticalPlanBSeconds int
	ProfilePlanBSeconds  int
}

type AcquisitionPostProbePolicy struct {
	EnableDirectNVMEDiskBayProbe bool
	EnableNumericCollectionProbe bool
	EnableSensorCollectionProbe  bool
}

type AcquisitionRecoveryPolicy struct {
	EnableCriticalCollectionMemberRetry bool
	EnableCriticalSlowProbe             bool
	EnableProfilePlanB                  bool
}

type AcquisitionPrefetchPolicy struct {
	IncludeSuffixes []string
	ExcludeContains []string
}

type AnalysisDirectives struct {
	EnableProcessorGPUFallback           bool
	EnableSupermicroNVMeBackplane        bool
	EnableProcessorGPUChassisAlias       bool
	EnableGenericGraphicsControllerDedup bool
	EnableMSIProcessorGPUChassisLookup   bool
	EnableStorageEnclosureRecovery       bool
	EnableKnownStorageControllerRecovery bool
}

type ResolvedAnalysisPlan struct {
	Match                          MatchResult
	Directives                     AnalysisDirectives
	Notes                          []string
	ProcessorGPUChassisLookupModes []string
	KnownStorageDriveCollections   []string
	KnownStorageVolumeCollections  []string
}

type Profile interface {
	Name() string
	Priority() int
	Match(signals MatchSignals) int
	SafeForFallback() bool
	ExtendAcquisitionPlan(plan *AcquisitionPlan, signals MatchSignals)
	RefineAcquisitionPlan(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, signals MatchSignals)
	ApplyAnalysisDirectives(directives *AnalysisDirectives, signals MatchSignals)
	RefineAnalysisPlan(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, signals MatchSignals)
	PostAnalyze(result *models.AnalysisResult, snapshot map[string]interface{}, signals MatchSignals)
}

type MatchResult struct {
	Mode     string
	Profiles []Profile
	Scores   []ProfileScore
}

type ProfileScore struct {
	Name     string
	Score    int
	Active   bool
	Priority int
}

func normalizeSignals(signals MatchSignals) MatchSignals {
	signals.OEMNamespaces = dedupeSorted(signals.OEMNamespaces)
	signals.ResourceHints = dedupeSorted(signals.ResourceHints)
	return signals
}

func dedupeSorted(items []string) []string {
	if len(items) == 0 {
		return nil
	}
	set := make(map[string]struct{}, len(items))
	for _, item := range items {
		if item == "" {
			continue
		}
		set[item] = struct{}{}
	}
	out := make([]string, 0, len(set))
	for item := range set {
		out = append(out, item)
	}
	sort.Strings(out)
	return out
}
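Every list in the plan types (Notes, OEMNamespaces, ResourceHints, the storage collections) is funneled through `dedupeSorted`, which is what gives "deterministic ordering" in the commit message its teeth: repeated profile hooks can append freely and the result is still a stable, duplicate-free, sorted slice. A standalone copy of the helper to illustrate the invariant:

```go
package main

import "sort"

// dedupeSorted mirrors the helper above: drop empty strings and
// duplicates, return the remainder sorted ascending (nil for empty
// input), so repeated appends stay idempotent and order-independent.
func dedupeSorted(items []string) []string {
	if len(items) == 0 {
		return nil
	}
	set := make(map[string]struct{}, len(items))
	for _, item := range items {
		if item == "" {
			continue
		}
		set[item] = struct{}{}
	}
	out := make([]string, 0, len(set))
	for item := range set {
		out = append(out, item)
	}
	sort.Strings(out)
	return out
}
```

Because map iteration order in Go is randomized, the final `sort.Strings` is load-bearing: without it, the same inputs would produce differently ordered plans across runs.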


@@ -7,25 +7,73 @@ import (
)
type Request struct {
Host string
Protocol string
Port int
Username string
AuthType string
Password string
Token string
TLSMode string
Host string
Protocol string
Port int
Username string
AuthType string
Password string
Token string
TLSMode string
PowerOnIfHostOff bool
}
type Progress struct {
Status string
Progress int
Message string
Status string
Progress int
Message string
CurrentPhase string
ETASeconds int
ActiveModules []ModuleActivation
ModuleScores []ModuleScore
DebugInfo *CollectDebugInfo
}
type ProgressFn func(Progress)
type ModuleActivation struct {
Name string
Score int
}
type ModuleScore struct {
Name string
Score int
Active bool
Priority int
}
type CollectDebugInfo struct {
AdaptiveThrottled bool
SnapshotWorkers int
PrefetchWorkers int
PrefetchEnabled *bool
PhaseTelemetry []PhaseTelemetry
}
type PhaseTelemetry struct {
Phase string
Requests int
Errors int
ErrorRate float64
AvgMS int64
P95MS int64
}
type ProbeResult struct {
Reachable bool
Protocol string
HostPowerState string
HostPoweredOn bool
PowerControlAvailable bool
SystemPath string
}
type Connector interface {
Protocol() string
Collect(ctx context.Context, req Request, emit ProgressFn) (*models.AnalysisResult, error)
}
type Prober interface {
Probe(ctx context.Context, req Request) (*ProbeResult, error)
}


@@ -66,104 +66,15 @@ func (e *Exporter) ExportCSV(w io.Writer) error {
}
}
// CPUs
for _, cpu := range e.result.Hardware.CPUs {
if !hasUsableSerial(cpu.SerialNumber) {
seenCanonical := make(map[string]struct{})
for _, dev := range canonicalDevicesForExport(e.result.Hardware) {
if !hasUsableSerial(dev.SerialNumber) {
continue
}
if err := writer.Write([]string{
cpu.Model,
strings.TrimSpace(cpu.SerialNumber),
"",
"CPU",
}); err != nil {
return err
}
}
// Memory
for _, mem := range e.result.Hardware.Memory {
if !hasUsableSerial(mem.SerialNumber) {
continue
}
location := mem.Location
if location == "" {
location = mem.Slot
}
if err := writer.Write([]string{
mem.PartNumber,
strings.TrimSpace(mem.SerialNumber),
mem.Manufacturer,
location,
}); err != nil {
return err
}
}
// Storage
for _, stor := range e.result.Hardware.Storage {
if !hasUsableSerial(stor.SerialNumber) {
continue
}
if err := writer.Write([]string{
stor.Model,
strings.TrimSpace(stor.SerialNumber),
stor.Manufacturer,
stor.Slot,
}); err != nil {
return err
}
}
// GPUs
for _, gpu := range e.result.Hardware.GPUs {
if !hasUsableSerial(gpu.SerialNumber) {
continue
}
component := gpu.Model
if component == "" {
component = "GPU"
}
if err := writer.Write([]string{
component,
strings.TrimSpace(gpu.SerialNumber),
gpu.Manufacturer,
gpu.Slot,
}); err != nil {
return err
}
}
// PCIe devices
for _, pcie := range e.result.Hardware.PCIeDevices {
if !hasUsableSerial(pcie.SerialNumber) {
continue
}
if err := writer.Write([]string{
pcie.DeviceClass,
strings.TrimSpace(pcie.SerialNumber),
pcie.Manufacturer,
pcie.Slot,
}); err != nil {
return err
}
}
// Network adapters
for _, nic := range e.result.Hardware.NetworkAdapters {
if !hasUsableSerial(nic.SerialNumber) {
continue
}
location := nic.Location
if location == "" {
location = nic.Slot
}
if err := writer.Write([]string{
nic.Model,
strings.TrimSpace(nic.SerialNumber),
nic.Vendor,
location,
}); err != nil {
serial := strings.TrimSpace(dev.SerialNumber)
seenCanonical[serial] = struct{}{}
component, manufacturer, location := csvFieldsFromCanonicalDevice(dev)
if err := writer.Write([]string{component, serial, manufacturer, location}); err != nil {
return err
}
}
@@ -173,26 +84,15 @@ func (e *Exporter) ExportCSV(w io.Writer) error {
if !hasUsableSerial(nic.SerialNumber) {
continue
}
if err := writer.Write([]string{
nic.Model,
strings.TrimSpace(nic.SerialNumber),
"",
"Network",
}); err != nil {
return err
}
}
// Power supplies
for _, psu := range e.result.Hardware.PowerSupply {
if !hasUsableSerial(psu.SerialNumber) {
serial := strings.TrimSpace(nic.SerialNumber)
if _, ok := seenCanonical[serial]; ok {
continue
}
if err := writer.Write([]string{
psu.Model,
strings.TrimSpace(psu.SerialNumber),
psu.Vendor,
psu.Slot,
nic.Model,
serial,
"",
"Network",
}); err != nil {
return err
}
@@ -221,3 +121,52 @@ func hasUsableSerial(serial string) bool {
return true
}
}
func csvFieldsFromCanonicalDevice(dev models.HardwareDevice) (component, manufacturer, location string) {
component = firstNonEmptyString(
dev.Model,
dev.PartNumber,
dev.DeviceClass,
dev.Kind,
)
manufacturer = firstNonEmptyString(dev.Manufacturer, inferCSVVendor(dev))
location = firstNonEmptyString(dev.Location, dev.Slot, dev.BDF, dev.Kind)
switch dev.Kind {
case models.DeviceKindCPU:
if component == "" {
component = "CPU"
}
if location == "" {
location = "CPU"
}
case models.DeviceKindMemory:
component = firstNonEmptyString(dev.PartNumber, dev.Model, "Memory")
case models.DeviceKindPCIe, models.DeviceKindGPU, models.DeviceKindNetwork:
if location == "" {
location = firstNonEmptyString(dev.Slot, dev.BDF, "PCIe")
}
case models.DeviceKindPSU:
component = firstNonEmptyString(dev.Model, "Power Supply")
}
return component, manufacturer, location
}
func inferCSVVendor(dev models.HardwareDevice) string {
switch dev.Kind {
case models.DeviceKindCPU:
return ""
default:
return ""
}
}
func firstNonEmptyString(values ...string) string {
for _, value := range values {
if strings.TrimSpace(value) != "" {
return strings.TrimSpace(value)
}
}
return ""
}

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -19,6 +19,8 @@ type ReanimatorHardware struct {
Storage []ReanimatorStorage `json:"storage,omitempty"`
PCIeDevices []ReanimatorPCIe `json:"pcie_devices,omitempty"`
PowerSupplies []ReanimatorPSU `json:"power_supplies,omitempty"`
Sensors *ReanimatorSensors `json:"sensors,omitempty"`
EventLogs []ReanimatorEventLog `json:"event_logs,omitempty"`
}
// ReanimatorBoard represents motherboard/server information
@@ -36,11 +38,6 @@ type ReanimatorFirmware struct {
Version string `json:"version"`
}
type ReanimatorStatusAtCollection struct {
Status string `json:"status"`
At string `json:"at"`
}
type ReanimatorStatusHistoryEntry struct {
Status string `json:"status"`
ChangedAt string `json:"changed_at"`
@@ -49,105 +46,209 @@ type ReanimatorStatusHistoryEntry struct {
// ReanimatorCPU represents processor information
type ReanimatorCPU struct {
Socket int `json:"socket"`
Model string `json:"model"`
Cores int `json:"cores,omitempty"`
Threads int `json:"threads,omitempty"`
FrequencyMHz int `json:"frequency_mhz,omitempty"`
MaxFrequencyMHz int `json:"max_frequency_mhz,omitempty"`
Manufacturer string `json:"manufacturer,omitempty"`
Status string `json:"status,omitempty"`
StatusCheckedAt string `json:"status_checked_at,omitempty"`
StatusChangedAt string `json:"status_changed_at,omitempty"`
StatusAtCollect *ReanimatorStatusAtCollection `json:"status_at_collection,omitempty"`
StatusHistory []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
Socket int `json:"socket"`
Model string `json:"model,omitempty"`
Cores int `json:"cores,omitempty"`
Threads int `json:"threads,omitempty"`
FrequencyMHz int `json:"frequency_mhz,omitempty"`
MaxFrequencyMHz int `json:"max_frequency_mhz,omitempty"`
TemperatureC float64 `json:"temperature_c,omitempty"`
PowerW float64 `json:"power_w,omitempty"`
Throttled *bool `json:"throttled,omitempty"`
CorrectableErrorCount int64 `json:"correctable_error_count,omitempty"`
UncorrectableErrorCount int64 `json:"uncorrectable_error_count,omitempty"`
LifeRemainingPct float64 `json:"life_remaining_pct,omitempty"`
LifeUsedPct float64 `json:"life_used_pct,omitempty"`
SerialNumber string `json:"serial_number,omitempty"`
Firmware string `json:"firmware,omitempty"`
Present *bool `json:"present,omitempty"`
Manufacturer string `json:"manufacturer,omitempty"`
Status string `json:"status,omitempty"`
StatusCheckedAt string `json:"status_checked_at,omitempty"`
StatusChangedAt string `json:"status_changed_at,omitempty"`
ManufacturedYearWeek string `json:"manufactured_year_week,omitempty"`
StatusHistory []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
}
// ReanimatorMemory represents a memory module (DIMM)
type ReanimatorMemory struct {
Slot string `json:"slot"`
Location string `json:"location,omitempty"`
Present *bool `json:"present,omitempty"`
SizeMB int `json:"size_mb,omitempty"`
Type string `json:"type,omitempty"`
MaxSpeedMHz int `json:"max_speed_mhz,omitempty"`
CurrentSpeedMHz int `json:"current_speed_mhz,omitempty"`
TemperatureC float64 `json:"temperature_c,omitempty"`
CorrectableECCErrorCount int64 `json:"correctable_ecc_error_count,omitempty"`
UncorrectableECCErrorCount int64 `json:"uncorrectable_ecc_error_count,omitempty"`
LifeRemainingPct float64 `json:"life_remaining_pct,omitempty"`
LifeUsedPct float64 `json:"life_used_pct,omitempty"`
SpareBlocksRemainingPct float64 `json:"spare_blocks_remaining_pct,omitempty"`
PerformanceDegraded *bool `json:"performance_degraded,omitempty"`
DataLossDetected *bool `json:"data_loss_detected,omitempty"`
Manufacturer string `json:"manufacturer,omitempty"`
SerialNumber string `json:"serial_number,omitempty"`
PartNumber string `json:"part_number,omitempty"`
Status string `json:"status,omitempty"`
StatusCheckedAt string `json:"status_checked_at,omitempty"`
StatusChangedAt string `json:"status_changed_at,omitempty"`
ManufacturedYearWeek string `json:"manufactured_year_week,omitempty"`
StatusHistory []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
}
// ReanimatorStorage represents a storage device
type ReanimatorStorage struct {
Slot string `json:"slot"`
Type string `json:"type,omitempty"`
Model string `json:"model"`
SizeGB int `json:"size_gb,omitempty"`
SerialNumber string `json:"serial_number"`
Manufacturer string `json:"manufacturer,omitempty"`
Firmware string `json:"firmware,omitempty"`
Interface string `json:"interface,omitempty"`
Present *bool `json:"present,omitempty"`
TemperatureC float64 `json:"temperature_c,omitempty"`
PowerOnHours int64 `json:"power_on_hours,omitempty"`
PowerCycles int64 `json:"power_cycles,omitempty"`
UnsafeShutdowns int64 `json:"unsafe_shutdowns,omitempty"`
MediaErrors int64 `json:"media_errors,omitempty"`
ErrorLogEntries int64 `json:"error_log_entries,omitempty"`
WrittenBytes int64 `json:"written_bytes,omitempty"`
ReadBytes int64 `json:"read_bytes,omitempty"`
LifeUsedPct float64 `json:"life_used_pct,omitempty"`
RemainingEndurancePct *int `json:"remaining_endurance_pct,omitempty"`
LifeRemainingPct float64 `json:"life_remaining_pct,omitempty"`
AvailableSparePct float64 `json:"available_spare_pct,omitempty"`
ReallocatedSectors int64 `json:"reallocated_sectors,omitempty"`
CurrentPendingSectors int64 `json:"current_pending_sectors,omitempty"`
OfflineUncorrectable int64 `json:"offline_uncorrectable,omitempty"`
Status string `json:"status,omitempty"`
StatusCheckedAt string `json:"status_checked_at,omitempty"`
StatusChangedAt string `json:"status_changed_at,omitempty"`
ManufacturedYearWeek string `json:"manufactured_year_week,omitempty"`
StatusHistory []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
}
// ReanimatorPCIe represents a PCIe device
type ReanimatorPCIe struct {
Slot string `json:"slot"`
VendorID int `json:"vendor_id,omitempty"`
DeviceID int `json:"device_id,omitempty"`
NUMANode int `json:"numa_node,omitempty"`
TemperatureC float64 `json:"temperature_c,omitempty"`
PowerW float64 `json:"power_w,omitempty"`
LifeRemainingPct float64 `json:"life_remaining_pct,omitempty"`
LifeUsedPct float64 `json:"life_used_pct,omitempty"`
ECCCorrectedTotal int64 `json:"ecc_corrected_total,omitempty"`
ECCUncorrectedTotal int64 `json:"ecc_uncorrected_total,omitempty"`
HWSlowdown *bool `json:"hw_slowdown,omitempty"`
BatteryChargePct float64 `json:"battery_charge_pct,omitempty"`
BatteryHealthPct float64 `json:"battery_health_pct,omitempty"`
BatteryTemperatureC float64 `json:"battery_temperature_c,omitempty"`
BatteryVoltageV float64 `json:"battery_voltage_v,omitempty"`
BatteryReplaceRequired *bool `json:"battery_replace_required,omitempty"`
SFPTemperatureC float64 `json:"sfp_temperature_c,omitempty"`
SFPTXPowerDBm float64 `json:"sfp_tx_power_dbm,omitempty"`
SFPRXPowerDBm float64 `json:"sfp_rx_power_dbm,omitempty"`
SFPVoltageV float64 `json:"sfp_voltage_v,omitempty"`
SFPBiasMA float64 `json:"sfp_bias_ma,omitempty"`
BDF string `json:"-"`
DeviceClass string `json:"device_class,omitempty"`
Manufacturer string `json:"manufacturer,omitempty"`
Model string `json:"model,omitempty"`
LinkWidth int `json:"link_width,omitempty"`
LinkSpeed string `json:"link_speed,omitempty"`
MaxLinkWidth int `json:"max_link_width,omitempty"`
MaxLinkSpeed string `json:"max_link_speed,omitempty"`
MACAddresses []string `json:"mac_addresses,omitempty"`
Present *bool `json:"present,omitempty"`
SerialNumber string `json:"serial_number,omitempty"`
Firmware string `json:"firmware,omitempty"`
Status string `json:"status,omitempty"`
StatusCheckedAt string `json:"status_checked_at,omitempty"`
StatusChangedAt string `json:"status_changed_at,omitempty"`
ManufacturedYearWeek string `json:"manufactured_year_week,omitempty"`
StatusHistory []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
}
// ReanimatorPSU represents a power supply unit
type ReanimatorPSU struct {
Slot string `json:"slot"`
Present *bool `json:"present,omitempty"`
Model string `json:"model,omitempty"`
Vendor string `json:"vendor,omitempty"`
WattageW int `json:"wattage_w,omitempty"`
SerialNumber string `json:"serial_number,omitempty"`
PartNumber string `json:"part_number,omitempty"`
Firmware string `json:"firmware,omitempty"`
Status string `json:"status,omitempty"`
InputType string `json:"input_type,omitempty"`
InputPowerW float64 `json:"input_power_w,omitempty"`
OutputPowerW float64 `json:"output_power_w,omitempty"`
InputVoltage float64 `json:"input_voltage,omitempty"`
TemperatureC float64 `json:"temperature_c,omitempty"`
LifeRemainingPct float64 `json:"life_remaining_pct,omitempty"`
LifeUsedPct float64 `json:"life_used_pct,omitempty"`
StatusCheckedAt string `json:"status_checked_at,omitempty"`
StatusChangedAt string `json:"status_changed_at,omitempty"`
ManufacturedYearWeek string `json:"manufactured_year_week,omitempty"`
StatusHistory []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
}
type ReanimatorEventLog struct {
Source string `json:"source"`
EventTime string `json:"event_time,omitempty"`
Severity string `json:"severity,omitempty"`
MessageID string `json:"message_id,omitempty"`
Message string `json:"message"`
ComponentRef string `json:"component_ref,omitempty"`
Fingerprint string `json:"fingerprint,omitempty"`
IsActive *bool `json:"is_active,omitempty"`
RawPayload map[string]any `json:"raw_payload,omitempty"`
}
type ReanimatorSensors struct {
Fans []ReanimatorFanSensor `json:"fans,omitempty"`
Power []ReanimatorPowerSensor `json:"power,omitempty"`
Temperatures []ReanimatorTemperatureSensor `json:"temperatures,omitempty"`
Other []ReanimatorOtherSensor `json:"other,omitempty"`
}
type ReanimatorFanSensor struct {
Name string `json:"name"`
Location string `json:"location,omitempty"`
RPM int `json:"rpm,omitempty"`
Status string `json:"status,omitempty"`
}
type ReanimatorPowerSensor struct {
Name string `json:"name"`
Location string `json:"location,omitempty"`
VoltageV float64 `json:"voltage_v,omitempty"`
CurrentA float64 `json:"current_a,omitempty"`
PowerW float64 `json:"power_w,omitempty"`
Status string `json:"status,omitempty"`
}
type ReanimatorTemperatureSensor struct {
Name string `json:"name"`
Location string `json:"location,omitempty"`
Celsius float64 `json:"celsius,omitempty"`
ThresholdWarningCelsius float64 `json:"threshold_warning_celsius,omitempty"`
ThresholdCriticalCelsius float64 `json:"threshold_critical_celsius,omitempty"`
Status string `json:"status,omitempty"`
}
type ReanimatorOtherSensor struct {
Name string `json:"name"`
Location string `json:"location,omitempty"`
Value float64 `json:"value,omitempty"`
Unit string `json:"unit,omitempty"`
Status string `json:"status,omitempty"`
}


@@ -0,0 +1,63 @@
package ingest
import (
	"bytes"
	"fmt"
	"strings"

	"git.mchus.pro/mchus/logpile/internal/collector"
	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)
type Service struct{}
type RedfishSourceMetadata struct {
TargetHost string
SourceTimezone string
Filename string
}
func NewService() *Service {
return &Service{}
}
func (s *Service) AnalyzeArchivePayload(filename string, payload []byte) (*models.AnalysisResult, string, error) {
p := parser.NewBMCParser()
if err := p.ParseFromReader(bytes.NewReader(payload), filename); err != nil {
return nil, "", err
}
return p.Result(), p.DetectedVendor(), nil
}
func (s *Service) AnalyzeRedfishRawPayloads(rawPayloads map[string]any, meta RedfishSourceMetadata) (*models.AnalysisResult, string, error) {
result, err := collector.ReplayRedfishFromRawPayloads(rawPayloads, nil)
if err != nil {
return nil, "", err
}
if result == nil {
return nil, "", fmt.Errorf("redfish replay returned nil result")
}
if strings.TrimSpace(result.Protocol) == "" {
result.Protocol = "redfish"
}
if strings.TrimSpace(result.SourceType) == "" {
result.SourceType = models.SourceTypeAPI
}
if strings.TrimSpace(result.TargetHost) == "" {
result.TargetHost = strings.TrimSpace(meta.TargetHost)
}
if strings.TrimSpace(result.SourceTimezone) == "" {
result.SourceTimezone = strings.TrimSpace(meta.SourceTimezone)
}
if strings.TrimSpace(result.Filename) == "" {
if strings.TrimSpace(meta.Filename) != "" {
result.Filename = strings.TrimSpace(meta.Filename)
} else if target := strings.TrimSpace(result.TargetHost); target != "" {
result.Filename = "redfish://" + target
} else {
result.Filename = "redfish://snapshot"
}
}
return result, "redfish", nil
}
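The defaulting chain in AnalyzeRedfishRawPayloads (fill each blank result field from metadata, falling back to a synthetic value) boils down to a first-non-empty selection. A minimal standalone sketch of that pattern, assuming nothing beyond the standard library (the helper name mirrors the `firstNonEmpty` used elsewhere in this repo but is reimplemented here for illustration):

```go
package main

import (
	"fmt"
	"strings"
)

// firstNonEmpty returns the first candidate that is non-blank after trimming,
// or "" when every candidate is blank — the same fallback shape used when
// filling Filename/TargetHost/SourceTimezone above.
func firstNonEmpty(candidates ...string) string {
	for _, c := range candidates {
		if v := strings.TrimSpace(c); v != "" {
			return v
		}
	}
	return ""
}

func main() {
	filename := firstNonEmpty("", "   ", "redfish://10.0.0.5", "redfish://snapshot")
	fmt.Println(filename) // redfish://10.0.0.5
}
```

Expressing the cascade this way keeps each fallback ordered and makes the final hardcoded default (`"redfish://snapshot"`) just the last candidate.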


@@ -9,17 +9,17 @@ const (
// AnalysisResult contains all parsed data from an archive
type AnalysisResult struct {
Filename string `json:"filename"`
SourceType string `json:"source_type,omitempty"` // archive | api
Protocol string `json:"protocol,omitempty"` // redfish | ipmi
TargetHost string `json:"target_host,omitempty"` // BMC host for live collect
SourceTimezone string `json:"source_timezone,omitempty"` // Source timezone/offset used during collection (e.g. +08:00)
CollectedAt time.Time `json:"collected_at,omitempty"` // Collection/upload timestamp
RawPayloads map[string]any `json:"raw_payloads,omitempty"` // Additional source payloads (e.g. Redfish tree)
Events []Event `json:"events"`
FRU []FRUInfo `json:"fru"`
Sensors []SensorReading `json:"sensors"`
Hardware *HardwareConfig `json:"hardware"`
}
// Event represents a single log event
@@ -110,43 +110,45 @@ const (
// HardwareDevice is canonical device inventory used across UI and exports.
type HardwareDevice struct {
ID string `json:"id"`
Kind string `json:"kind"`
Source string `json:"source,omitempty"`
Slot string `json:"slot,omitempty"`
Location string `json:"location,omitempty"`
BDF string `json:"bdf,omitempty"`
DeviceClass string `json:"device_class,omitempty"`
VendorID int `json:"vendor_id,omitempty"`
DeviceID int `json:"device_id,omitempty"`
Model string `json:"model,omitempty"`
PartNumber string `json:"part_number,omitempty"`
Manufacturer string `json:"manufacturer,omitempty"`
SerialNumber string `json:"serial_number,omitempty"`
Firmware string `json:"firmware,omitempty"`
Type string `json:"type,omitempty"`
Interface string `json:"interface,omitempty"`
Present *bool `json:"present,omitempty"`
SizeMB int `json:"size_mb,omitempty"`
SizeGB int `json:"size_gb,omitempty"`
Cores int `json:"cores,omitempty"`
Threads int `json:"threads,omitempty"`
FrequencyMHz int `json:"frequency_mhz,omitempty"`
MaxFreqMHz int `json:"max_frequency_mhz,omitempty"`
PortCount int `json:"port_count,omitempty"`
PortType string `json:"port_type,omitempty"`
MACAddresses []string `json:"mac_addresses,omitempty"`
LinkWidth int `json:"link_width,omitempty"`
LinkSpeed string `json:"link_speed,omitempty"`
MaxLinkWidth int `json:"max_link_width,omitempty"`
MaxLinkSpeed string `json:"max_link_speed,omitempty"`
WattageW int `json:"wattage_w,omitempty"`
InputType string `json:"input_type,omitempty"`
InputPowerW int `json:"input_power_w,omitempty"`
OutputPowerW int `json:"output_power_w,omitempty"`
InputVoltage float64 `json:"input_voltage,omitempty"`
TemperatureC int `json:"temperature_c,omitempty"`
RemainingEndurancePct *int `json:"remaining_endurance_pct,omitempty"` // 0-100 %; nil = not reported
NUMANode int `json:"numa_node,omitempty"` // 0 = not reported/N/A
Status string `json:"status,omitempty"`
StatusCheckedAt *time.Time `json:"status_checked_at,omitempty"`
StatusChangedAt *time.Time `json:"status_changed_at,omitempty"`
@@ -167,14 +169,14 @@ type FirmwareInfo struct {
// BoardInfo represents motherboard/system information
type BoardInfo struct {
Manufacturer string `json:"manufacturer,omitempty"`
ProductName string `json:"product_name,omitempty"`
Description string `json:"description,omitempty"`
SerialNumber string `json:"serial_number,omitempty"`
PartNumber string `json:"part_number,omitempty"`
Version string `json:"version,omitempty"`
UUID string `json:"uuid,omitempty"`
BMCMACAddress string `json:"bmc_mac_address,omitempty"`
}
// CPU represents processor information
@@ -194,11 +196,12 @@ type CPU struct {
SerialNumber string `json:"serial_number,omitempty"`
Status string `json:"status,omitempty"`
StatusCheckedAt *time.Time `json:"status_checked_at,omitempty"`
StatusChangedAt *time.Time `json:"status_changed_at,omitempty"`
StatusAtCollect *StatusAtCollection `json:"status_at_collection,omitempty"`
StatusHistory []StatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
Details map[string]any `json:"details,omitempty"`
}
// MemoryDIMM represents a memory module
@@ -218,31 +221,34 @@ type MemoryDIMM struct {
Status string `json:"status,omitempty"`
Ranks int `json:"ranks,omitempty"`
StatusCheckedAt *time.Time `json:"status_checked_at,omitempty"`
StatusChangedAt *time.Time `json:"status_changed_at,omitempty"`
StatusAtCollect *StatusAtCollection `json:"status_at_collection,omitempty"`
StatusHistory []StatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
Details map[string]any `json:"details,omitempty"`
}
// Storage represents a storage device
type Storage struct {
Slot string `json:"slot"`
Type string `json:"type"`
Model string `json:"model"`
Description string `json:"description,omitempty"`
SizeGB int `json:"size_gb"`
SerialNumber string `json:"serial_number,omitempty"`
Manufacturer string `json:"manufacturer,omitempty"`
Firmware string `json:"firmware,omitempty"`
Interface string `json:"interface,omitempty"`
Present bool `json:"present"`
Location string `json:"location,omitempty"` // Front/Rear
BackplaneID int `json:"backplane_id,omitempty"`
RemainingEndurancePct *int `json:"remaining_endurance_pct,omitempty"` // 0-100 %; nil = not reported
Status string `json:"status,omitempty"`
Details map[string]any `json:"details,omitempty"`
StatusCheckedAt *time.Time `json:"status_checked_at,omitempty"`
StatusChangedAt *time.Time `json:"status_changed_at,omitempty"`
StatusAtCollect *StatusAtCollection `json:"status_at_collection,omitempty"`
StatusHistory []StatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
@@ -250,15 +256,15 @@ type Storage struct {
// StorageVolume represents a logical storage volume (RAID/VROC/etc.).
type StorageVolume struct {
ID string `json:"id,omitempty"`
Name string `json:"name,omitempty"`
Controller string `json:"controller,omitempty"`
RAIDLevel string `json:"raid_level,omitempty"`
SizeGB int `json:"size_gb,omitempty"`
CapacityBytes int64 `json:"capacity_bytes,omitempty"`
Status string `json:"status,omitempty"`
Bootable bool `json:"bootable,omitempty"`
Encrypted bool `json:"encrypted,omitempty"`
}
// PCIeDevice represents a PCIe device
@@ -277,13 +283,15 @@ type PCIeDevice struct {
PartNumber string `json:"part_number,omitempty"`
SerialNumber string `json:"serial_number,omitempty"`
MACAddresses []string `json:"mac_addresses,omitempty"`
NUMANode int `json:"numa_node,omitempty"` // 0 = not reported/N/A
Status string `json:"status,omitempty"`
StatusCheckedAt *time.Time `json:"status_checked_at,omitempty"`
StatusChangedAt *time.Time `json:"status_changed_at,omitempty"`
StatusAtCollect *StatusAtCollection `json:"status_at_collection,omitempty"`
StatusHistory []StatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
Details map[string]any `json:"details,omitempty"`
}
// NIC represents a network interface card
@@ -298,25 +306,26 @@ type NIC struct {
// PSU represents a power supply unit
type PSU struct {
Slot string `json:"slot"`
Present bool `json:"present"`
Model string `json:"model"`
Description string `json:"description,omitempty"`
Vendor string `json:"vendor,omitempty"`
WattageW int `json:"wattage_w,omitempty"`
SerialNumber string `json:"serial_number,omitempty"`
PartNumber string `json:"part_number,omitempty"`
Firmware string `json:"firmware,omitempty"`
Status string `json:"status,omitempty"`
InputType string `json:"input_type,omitempty"`
InputPowerW int `json:"input_power_w,omitempty"`
OutputPowerW int `json:"output_power_w,omitempty"`
InputVoltage float64 `json:"input_voltage,omitempty"`
OutputVoltage float64 `json:"output_voltage,omitempty"`
TemperatureC int `json:"temperature_c,omitempty"`
Details map[string]any `json:"details,omitempty"`
StatusCheckedAt *time.Time `json:"status_checked_at,omitempty"`
StatusChangedAt *time.Time `json:"status_changed_at,omitempty"`
StatusAtCollect *StatusAtCollection `json:"status_at_collection,omitempty"`
StatusHistory []StatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
@@ -353,11 +362,12 @@ type GPU struct {
CurrentLinkSpeed string `json:"current_link_speed,omitempty"`
Status string `json:"status,omitempty"`
StatusCheckedAt *time.Time `json:"status_checked_at,omitempty"`
StatusChangedAt *time.Time `json:"status_changed_at,omitempty"`
StatusAtCollect *StatusAtCollection `json:"status_at_collection,omitempty"`
StatusHistory []StatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
Details map[string]any `json:"details,omitempty"`
}
// NetworkAdapter represents a network adapter with detailed info
@@ -365,6 +375,7 @@ type NetworkAdapter struct {
Slot string `json:"slot"`
Location string `json:"location"`
Present bool `json:"present"`
BDF string `json:"bdf,omitempty"`
Model string `json:"model"`
Description string `json:"description,omitempty"`
Vendor string `json:"vendor,omitempty"`
@@ -376,11 +387,17 @@ type NetworkAdapter struct {
PortCount int `json:"port_count,omitempty"`
PortType string `json:"port_type,omitempty"`
MACAddresses []string `json:"mac_addresses,omitempty"`
LinkWidth int `json:"link_width,omitempty"`
LinkSpeed string `json:"link_speed,omitempty"`
MaxLinkWidth int `json:"max_link_width,omitempty"`
MaxLinkSpeed string `json:"max_link_speed,omitempty"`
NUMANode int `json:"numa_node,omitempty"` // 0 = not reported/N/A
Status string `json:"status,omitempty"`
StatusCheckedAt *time.Time `json:"status_checked_at,omitempty"`
StatusChangedAt *time.Time `json:"status_changed_at,omitempty"`
StatusAtCollect *StatusAtCollection `json:"status_at_collection,omitempty"`
StatusHistory []StatusHistoryEntry `json:"status_history,omitempty"`
ErrorDescription string `json:"error_description,omitempty"`
Details map[string]any `json:"details,omitempty"`
}


@@ -0,0 +1,135 @@
package parser
import (
	"fmt"
	"regexp"
	"strings"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
)
var manufacturedYearWeekPattern = regexp.MustCompile(`^\d{4}-W\d{2}$`)
// NormalizeManufacturedYearWeek converts common FRU manufacturing date formats
// into contract-compatible YYYY-Www values. Unknown or ambiguous inputs return "".
func NormalizeManufacturedYearWeek(raw string) string {
value := strings.TrimSpace(raw)
if value == "" {
return ""
}
upper := strings.ToUpper(value)
if manufacturedYearWeekPattern.MatchString(upper) {
return upper
}
layouts := []string{
time.RFC3339,
"2006-01-02T15:04:05",
"2006-01-02 15:04:05",
"2006-01-02",
"2006/01/02",
"01/02/2006 15:04:05",
"01/02/2006",
"01-02-2006",
"Mon Jan 2 15:04:05 2006",
"Mon Jan _2 15:04:05 2006",
"Jan 2 2006",
"Jan _2 2006",
}
for _, layout := range layouts {
if ts, err := time.Parse(layout, value); err == nil {
year, week := ts.ISOWeek()
return formatYearWeek(year, week)
}
}
return ""
}
func formatYearWeek(year, week int) string {
if year <= 0 || week <= 0 || week > 53 {
return ""
}
return fmt.Sprintf("%04d-W%02d", year, week)
}
// ApplyManufacturedYearWeekFromFRU attaches normalized manufactured_year_week to
// component details by exact serial-number match. Board-level FRU entries are not
// expanded to components.
func ApplyManufacturedYearWeekFromFRU(frus []models.FRUInfo, hw *models.HardwareConfig) {
if hw == nil || len(frus) == 0 {
return
}
bySerial := make(map[string]string, len(frus))
for _, fru := range frus {
serial := normalizeFRUSerial(fru.SerialNumber)
yearWeek := NormalizeManufacturedYearWeek(fru.MfgDate)
if serial == "" || yearWeek == "" {
continue
}
if _, exists := bySerial[serial]; exists {
continue
}
bySerial[serial] = yearWeek
}
if len(bySerial) == 0 {
return
}
for i := range hw.CPUs {
attachYearWeek(&hw.CPUs[i].Details, bySerial[normalizeFRUSerial(hw.CPUs[i].SerialNumber)])
}
for i := range hw.Memory {
attachYearWeek(&hw.Memory[i].Details, bySerial[normalizeFRUSerial(hw.Memory[i].SerialNumber)])
}
for i := range hw.Storage {
attachYearWeek(&hw.Storage[i].Details, bySerial[normalizeFRUSerial(hw.Storage[i].SerialNumber)])
}
for i := range hw.PCIeDevices {
attachYearWeek(&hw.PCIeDevices[i].Details, bySerial[normalizeFRUSerial(hw.PCIeDevices[i].SerialNumber)])
}
for i := range hw.GPUs {
attachYearWeek(&hw.GPUs[i].Details, bySerial[normalizeFRUSerial(hw.GPUs[i].SerialNumber)])
}
for i := range hw.NetworkAdapters {
attachYearWeek(&hw.NetworkAdapters[i].Details, bySerial[normalizeFRUSerial(hw.NetworkAdapters[i].SerialNumber)])
}
for i := range hw.PowerSupply {
attachYearWeek(&hw.PowerSupply[i].Details, bySerial[normalizeFRUSerial(hw.PowerSupply[i].SerialNumber)])
}
}
func attachYearWeek(details *map[string]any, yearWeek string) {
if yearWeek == "" {
return
}
if *details == nil {
*details = map[string]any{}
}
if existing, ok := (*details)["manufactured_year_week"]; ok && strings.TrimSpace(toString(existing)) != "" {
return
}
(*details)["manufactured_year_week"] = yearWeek
}
func normalizeFRUSerial(v string) string {
s := strings.TrimSpace(v)
if s == "" {
return ""
}
switch strings.ToUpper(s) {
case "N/A", "NA", "NULL", "UNKNOWN", "-", "0":
return ""
default:
return strings.ToUpper(s)
}
}
func toString(v any) string {
switch x := v.(type) {
case string:
return x
default:
return strings.TrimSpace(fmt.Sprint(v))
}
}
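The date-to-ISO-week conversion above (parse with candidate layouts, then `ISOWeek()`, then range-check and format) can be exercised in isolation. A self-contained sketch for a single layout, stdlib only — `normalizeYearWeek` is a hypothetical cut-down of NormalizeManufacturedYearWeek, not the production function:

```go
package main

import (
	"fmt"
	"time"
)

// normalizeYearWeek parses a YYYY-MM-DD date and emits the contract form
// YYYY-Www, returning "" for unparseable or out-of-range input — the same
// shape as NormalizeManufacturedYearWeek + formatYearWeek above.
func normalizeYearWeek(raw string) string {
	ts, err := time.Parse("2006-01-02", raw)
	if err != nil {
		return ""
	}
	year, week := ts.ISOWeek()
	if year <= 0 || week <= 0 || week > 53 {
		return ""
	}
	return fmt.Sprintf("%04d-W%02d", year, week)
}

func main() {
	fmt.Println(normalizeYearWeek("2024-02-13")) // 2024-W07
	fmt.Println(normalizeYearWeek("not-a-date") == "")
}
```

Note that `ISOWeek` can return a year different from the calendar year near January 1st (e.g. dates in late December can fall in week 1 of the following ISO year), which is why the formatter takes the ISO year rather than reusing the parsed one.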


@@ -0,0 +1,65 @@
package parser
import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
)
func TestNormalizeManufacturedYearWeek(t *testing.T) {
tests := []struct {
in string
want string
}{
{"2024-W07", "2024-W07"},
{"2024-02-13", "2024-W07"},
{"02/13/2024", "2024-W07"},
{"Tue Feb 13 12:00:00 2024", "2024-W07"},
{"", ""},
{"not-a-date", ""},
}
for _, tt := range tests {
if got := NormalizeManufacturedYearWeek(tt.in); got != tt.want {
t.Fatalf("NormalizeManufacturedYearWeek(%q) = %q, want %q", tt.in, got, tt.want)
}
}
}
func TestApplyManufacturedYearWeekFromFRU_AttachesByExactSerial(t *testing.T) {
hw := &models.HardwareConfig{
PowerSupply: []models.PSU{
{
Slot: "PSU0",
SerialNumber: "PSU-SN-001",
},
},
Storage: []models.Storage{
{
Slot: "OB01",
SerialNumber: "DISK-SN-001",
},
},
}
fru := []models.FRUInfo{
{
Description: "PSU0_FRU (ID 30)",
SerialNumber: "PSU-SN-001",
MfgDate: "2024-02-13",
},
{
Description: "Builtin FRU Device (ID 0)",
SerialNumber: "BOARD-SN-001",
MfgDate: "2024-02-01",
},
}
ApplyManufacturedYearWeekFromFRU(fru, hw)
if got := hw.PowerSupply[0].Details["manufactured_year_week"]; got != "2024-W07" {
t.Fatalf("expected PSU year week 2024-W07, got %#v", hw.PowerSupply[0].Details)
}
if hw.Storage[0].Details != nil {
t.Fatalf("expected unmatched storage serial to stay untouched, got %#v", hw.Storage[0].Details)
}
}

View File

@@ -16,6 +16,7 @@ import (
"git.mchus.pro/mchus/logpile/internal/models"
"git.mchus.pro/mchus/logpile/internal/parser"
"git.mchus.pro/mchus/logpile/internal/parser/vendors/pciids"
)
const parserVersion = "3.0"
@@ -199,7 +200,7 @@ func parseDCIMViewXML(content []byte, result *models.AnalysisResult) {
parsePowerSupplyView(props, result)
case "DCIM_PCIDeviceView":
parsePCIeDeviceView(props, result)
case "DCIM_NICView":
case "DCIM_NICView", "DCIM_InfiniBandView":
parseNICView(props, result)
case "DCIM_VideoView":
parseVideoView(props, result)
@@ -374,6 +375,10 @@ func parsePhysicalDiskView(props map[string]string, result *models.AnalysisResul
Location: strings.TrimSpace(props["devicedescription"]),
Status: normalizeStatus(firstNonEmpty(props["raidstatus"], props["primarystatus"])),
}
if v := strings.TrimSpace(props["remainingratedwriteendurance"]); v != "" {
n := parseIntLoose(v)
st.RemainingEndurancePct = &n
}
result.Hardware.Storage = append(result.Hardware.Storage, st)
}
@@ -424,20 +429,60 @@ func parsePowerSupplyView(props map[string]string, result *models.AnalysisResult
result.Hardware.PowerSupply = append(result.Hardware.PowerSupply, psu)
}
// pcieFQDDNoisePrefix lists FQDD prefixes that represent internal chipset/CPU
// components or devices already captured with richer data elsewhere:
// - HostBridge/P2PBridge/ISABridge/SMBus: AMD EPYC internal fabric, not PCIe slots
// - AHCI.Embedded: AMD FCH SATA, not a slot device
// - Video.Embedded: BMC/iDRAC Matrox graphics chip, not user-visible
// - NIC.Embedded: already parsed from DCIM_NICView with model and MAC addresses
var pcieFQDDNoisePrefix = []string{
"HostBridge.Embedded.",
"P2PBridge.Embedded.",
"ISABridge.Embedded.",
"SMBus.Embedded.",
"AHCI.Embedded.",
"Video.Embedded.",
// All NIC FQDD classes are parsed from DCIM_NICView / DCIM_InfiniBandView into
// NetworkAdapters with model, MAC, firmware, and VendorID/DeviceID. The
// DCIM_PCIDeviceView duplicate carries only DataBusWidth (values such as
// "Unknown" or "16x or x16") and nothing else of value, so suppress it here.
"NIC.",
"InfiniBand.",
}
func parsePCIeDeviceView(props map[string]string, result *models.AnalysisResult) {
desc := strings.TrimSpace(firstNonEmpty(props["devicedescription"], props["description"]))
// "description" is the chip/device model (e.g. "MT28908 Family [ConnectX-6]"); prefer
// it over "devicedescription" which is the location string ("InfiniBand in Slot 1 Port 1").
desc := strings.TrimSpace(firstNonEmpty(props["description"], props["devicedescription"]))
fqdd := strings.TrimSpace(firstNonEmpty(props["fqdd"], props["instanceid"]))
if desc == "" && fqdd == "" {
return
}
for _, prefix := range pcieFQDDNoisePrefix {
if strings.HasPrefix(fqdd, prefix) {
return
}
}
vendorID := parseHexOrDec(firstNonEmpty(props["pcivendorid"], props["vendorid"]))
deviceID := parseHexOrDec(firstNonEmpty(props["pcideviceid"], props["deviceid"]))
manufacturer := strings.TrimSpace(props["manufacturer"])
// General rule: if chip model not found in logs but PCI IDs are known, resolve from pci.ids
if desc == "" && vendorID != 0 && deviceID != 0 {
desc = pciids.DeviceName(vendorID, deviceID)
}
if manufacturer == "" && vendorID != 0 {
manufacturer = pciids.VendorName(vendorID)
}
p := models.PCIeDevice{
Slot: fqdd,
Description: desc,
VendorID: parseHexOrDec(firstNonEmpty(props["pcivendorid"], props["vendorid"])),
DeviceID: parseHexOrDec(firstNonEmpty(props["pcideviceid"], props["deviceid"])),
VendorID: vendorID,
DeviceID: deviceID,
BDF: formatBDF(props["busnumber"], props["devicenumber"], props["functionnumber"]),
DeviceClass: strings.TrimSpace(props["databuswidth"]),
Manufacturer: strings.TrimSpace(props["manufacturer"]),
Manufacturer: manufacturer,
NUMANode: parseIntLoose(props["cpuaffinity"]),
Status: normalizeStatus(props["primarystatus"]),
}
result.Hardware.PCIeDevices = append(result.Hardware.PCIeDevices, p)
@@ -450,15 +495,31 @@ func parseNICView(props map[string]string, result *models.AnalysisResult) {
return
}
mac := strings.TrimSpace(firstNonEmpty(props["currentmacaddress"], props["permanentmacaddress"]))
vendorID := parseHexOrDec(firstNonEmpty(props["pcivendorid"], props["vendorid"]))
deviceID := parseHexOrDec(firstNonEmpty(props["pcideviceid"], props["deviceid"]))
vendor := strings.TrimSpace(firstNonEmpty(props["vendorname"], props["manufacturer"]))
// Prefer pci.ids chip model over generic ProductName when PCI IDs are available.
// Dell TSR often reports a marketing name (e.g. "Mellanox Network Adapter") while
// pci.ids has the precise chip identifier (e.g. "MT28908 Family [ConnectX-6]").
if vendorID != 0 && deviceID != 0 {
if chipModel := pciids.DeviceName(vendorID, deviceID); chipModel != "" {
model = chipModel
}
if vendor == "" {
vendor = pciids.VendorName(vendorID)
}
}
n := models.NetworkAdapter{
Slot: fqdd,
Location: strings.TrimSpace(firstNonEmpty(props["devicedescription"], fqdd)),
Present: true,
Model: model,
Description: strings.TrimSpace(props["protocol"]),
Vendor: strings.TrimSpace(firstNonEmpty(props["vendorname"], props["manufacturer"])),
VendorID: parseHexOrDec(firstNonEmpty(props["pcivendorid"], props["vendorid"])),
DeviceID: parseHexOrDec(firstNonEmpty(props["pcideviceid"], props["deviceid"])),
Vendor: vendor,
VendorID: vendorID,
DeviceID: deviceID,
SerialNumber: strings.TrimSpace(props["serialnumber"]),
PartNumber: strings.TrimSpace(props["partnumber"]),
Firmware: strings.TrimSpace(firstNonEmpty(
@@ -468,6 +529,7 @@ func parseNICView(props map[string]string, result *models.AnalysisResult) {
props["controllerbiosversion"],
)),
PortCount: inferPortCountFromFQDD(fqdd),
NUMANode: parseIntLoose(props["cpuaffinity"]),
Status: normalizeStatus(props["primarystatus"]),
}
if mac != "" {
@@ -521,10 +583,11 @@ func parseControllerView(props map[string]string, result *models.AnalysisResult)
DeviceClass: "storage-controller",
Manufacturer: strings.TrimSpace(firstNonEmpty(props["devicecardmanufacturer"], props["manufacturer"])),
PartNumber: strings.TrimSpace(firstNonEmpty(props["ppid"], props["boardpartnumber"])),
NUMANode: parseIntLoose(props["cpuaffinity"]),
Status: normalizeStatus(props["primarystatus"]),
})
addFirmware(result, firstNonEmpty(name, fqdd), props["controllerfirmwareversion"], "storage controller")
addFirmware(result, firstNonEmpty(name, fqdd), props["controllerfirmwareversion"], firstNonEmpty(fqdd, "storage controller"))
}
func parseControllerBatteryView(props map[string]string, result *models.AnalysisResult) {
@@ -1110,6 +1173,7 @@ func mergeStorage(dst *models.Storage, src models.Storage) {
}
setIfEmpty(&dst.Location, src.Location)
setIfEmpty(&dst.Status, src.Status)
dst.Details = mergeDellDetails(dst.Details, src.Details)
}
func dedupeVolumes(items []models.StorageVolume) []models.StorageVolume {
@@ -1181,6 +1245,22 @@ func mergePSU(dst *models.PSU, src models.PSU) {
dst.InputVoltage = src.InputVoltage
}
setIfEmpty(&dst.InputType, src.InputType)
dst.Details = mergeDellDetails(dst.Details, src.Details)
}
func mergeDellDetails(primary, secondary map[string]any) map[string]any {
if len(secondary) == 0 {
return primary
}
if primary == nil {
primary = make(map[string]any, len(secondary))
}
for key, value := range secondary {
if _, ok := primary[key]; !ok {
primary[key] = value
}
}
return primary
}
func dedupeNetworkAdapters(items []models.NetworkAdapter) []models.NetworkAdapter {

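The pci.ids fallback rule above (fill in chip model and vendor only when the log left them empty, never overwriting log-supplied values) can be sketched standalone. The two-entry lookup table below is a toy stand-in for the embedded pci.ids database, not the real `pciids` package API:

```go
package main

import "fmt"

// Toy stand-ins for pciids.VendorName / pciids.DeviceName; the real
// package resolves names from an embedded pci.ids snapshot.
var vendors = map[uint16]string{0x15B3: "Mellanox Technologies"}
var devices = map[uint32]string{0x15B3101B: "MT28908 Family [ConnectX-6]"}

// enrich applies the fallback rule from parsePCIeDeviceView: resolve the
// description and manufacturer from PCI IDs only when they are missing.
func enrich(desc, manufacturer string, vendorID, deviceID uint16) (string, string) {
	if desc == "" && vendorID != 0 && deviceID != 0 {
		desc = devices[uint32(vendorID)<<16|uint32(deviceID)]
	}
	if manufacturer == "" && vendorID != 0 {
		manufacturer = vendors[vendorID]
	}
	return desc, manufacturer
}

func main() {
	d, m := enrich("", "", 0x15B3, 0x101B)
	fmt.Println(d) // MT28908 Family [ConnectX-6]
	fmt.Println(m) // Mellanox Technologies

	// A description already present in the logs is never overwritten.
	d, _ = enrich("Some PCIe Card", "", 0x15B3, 0x101B)
	fmt.Println(d) // Some PCIe Card
}
```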
View File

@@ -204,6 +204,262 @@ func TestParseNestedTSRZip(t *testing.T) {
}
}
// TestParseDellPhysicalDiskEndurance verifies that RemainingRatedWriteEndurance from
// DCIM_PhysicalDiskView is parsed into Storage.RemainingEndurancePct.
func TestParseDellPhysicalDiskEndurance(t *testing.T) {
const viewXML = `<CIM><MESSAGE><SIMPLEREQ>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SystemView">
<PROPERTY NAME="Manufacturer"><VALUE>Dell Inc.</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>PowerEdge R6625</VALUE></PROPERTY>
<PROPERTY NAME="ServiceTag"><VALUE>8VS2LG4</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_PhysicalDiskView">
<PROPERTY NAME="FQDD"><VALUE>Disk.Bay.0:Enclosure.Internal.0-1:RAID.SL.3-1</VALUE></PROPERTY>
<PROPERTY NAME="Slot"><VALUE>0</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>HFS480G3H2X069N</VALUE></PROPERTY>
<PROPERTY NAME="SerialNumber"><VALUE>ESEAN5254I030B26B</VALUE></PROPERTY>
<PROPERTY NAME="SizeInBytes"><VALUE>479559942144</VALUE></PROPERTY>
<PROPERTY NAME="MediaType"><VALUE>Solid State Drive</VALUE></PROPERTY>
<PROPERTY NAME="BusProtocol"><VALUE>SATA</VALUE></PROPERTY>
<PROPERTY NAME="Revision"><VALUE>DZ03</VALUE></PROPERTY>
<PROPERTY NAME="RemainingRatedWriteEndurance"><VALUE>100</VALUE><DisplayValue>100 %</DisplayValue></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>1</VALUE><DisplayValue>OK</DisplayValue></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_PhysicalDiskView">
<PROPERTY NAME="FQDD"><VALUE>Disk.Bay.1:Enclosure.Internal.0-1:RAID.SL.3-1</VALUE></PROPERTY>
<PROPERTY NAME="Slot"><VALUE>1</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>TOSHIBA MG08ADA800E</VALUE></PROPERTY>
<PROPERTY NAME="SerialNumber"><VALUE>X1G0A0YXFVVG</VALUE></PROPERTY>
<PROPERTY NAME="SizeInBytes"><VALUE>8001563222016</VALUE></PROPERTY>
<PROPERTY NAME="MediaType"><VALUE>Hard Disk Drive</VALUE></PROPERTY>
<PROPERTY NAME="BusProtocol"><VALUE>SAS</VALUE></PROPERTY>
<PROPERTY NAME="Revision"><VALUE>0104</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
</SIMPLEREQ></MESSAGE></CIM>`
inner := makeZipArchive(t, map[string][]byte{
"tsr/metadata.json": []byte(`{"Make":"Dell Inc.","Model":"PowerEdge R6625","ServiceTag":"8VS2LG4"}`),
"tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml": []byte(viewXML),
})
p := &Parser{}
result, err := p.Parse([]parser.ExtractedFile{
{Path: "signature", Content: []byte("ok")},
{Path: "TSR20260306141852_8VS2LG4.pl.zip", Content: inner},
})
if err != nil {
t.Fatalf("parse failed: %v", err)
}
if len(result.Hardware.Storage) != 2 {
t.Fatalf("expected 2 storage devices, got %d", len(result.Hardware.Storage))
}
ssd := result.Hardware.Storage[0]
if ssd.RemainingEndurancePct == nil {
t.Fatalf("SSD slot 0: expected RemainingEndurancePct to be set")
}
if *ssd.RemainingEndurancePct != 100 {
t.Errorf("SSD slot 0: expected RemainingEndurancePct=100, got %d", *ssd.RemainingEndurancePct)
}
hdd := result.Hardware.Storage[1]
if hdd.RemainingEndurancePct != nil {
t.Errorf("HDD slot 1: expected RemainingEndurancePct absent, got %d", *hdd.RemainingEndurancePct)
}
}
// TestParseDellInfiniBandView verifies that DCIM_InfiniBandView entries are parsed as
// NetworkAdapters (not PCIe devices) and that the corresponding SoftwareIdentity firmware
// entry with FQDD "InfiniBand.Slot.*" does not leak into hardware.firmware.
//
// Regression guard: PowerEdge R6625 (8VS2LG4) — "Mellanox Network Adapter" version
// "20.39.35.60" appeared in hardware.firmware because DCIM_InfiniBandView was ignored
// (device ended up only in PCIeDevices with model "16x or x16") and SoftwareIdentity
// FQDD "InfiniBand.Slot.1-1" was not filtered. (2026-03-15)
func TestParseDellInfiniBandView(t *testing.T) {
const viewXML = `<CIM><MESSAGE><SIMPLEREQ>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SystemView">
<PROPERTY NAME="Manufacturer"><VALUE>Dell Inc.</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>PowerEdge R6625</VALUE></PROPERTY>
<PROPERTY NAME="ServiceTag"><VALUE>8VS2LG4</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_InfiniBandView">
<PROPERTY NAME="FQDD"><VALUE>InfiniBand.Slot.1-1</VALUE></PROPERTY>
<PROPERTY NAME="DeviceDescription"><VALUE>InfiniBand in Slot 1 Port 1</VALUE></PROPERTY>
<PROPERTY NAME="CurrentMACAddress"><VALUE>00:1C:FD:D7:5A:E6</VALUE></PROPERTY>
<PROPERTY NAME="FamilyVersion"><VALUE>20.39.35.60</VALUE></PROPERTY>
<PROPERTY NAME="EFIVersion"><VALUE>14.32.17</VALUE></PROPERTY>
<PROPERTY NAME="PCIVendorID"><VALUE>15B3</VALUE></PROPERTY>
<PROPERTY NAME="PCIDeviceID"><VALUE>101B</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_PCIDeviceView">
<PROPERTY NAME="FQDD"><VALUE>InfiniBand.Slot.1-1</VALUE></PROPERTY>
<PROPERTY NAME="Description"><VALUE>MT28908 Family [ConnectX-6]</VALUE></PROPERTY>
<PROPERTY NAME="DeviceDescription"><VALUE>InfiniBand in Slot 1 Port 1</VALUE></PROPERTY>
<PROPERTY NAME="Manufacturer"><VALUE>Mellanox Technologies</VALUE></PROPERTY>
<PROPERTY NAME="PCIVendorID"><VALUE>15B3</VALUE></PROPERTY>
<PROPERTY NAME="PCIDeviceID"><VALUE>101B</VALUE></PROPERTY>
<PROPERTY NAME="DataBusWidth"><DisplayValue>16x or x16</DisplayValue></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_ControllerView">
<PROPERTY NAME="FQDD"><VALUE>RAID.SL.3-1</VALUE></PROPERTY>
<PROPERTY NAME="ProductName"><VALUE>PERC H755 Front</VALUE></PROPERTY>
<PROPERTY NAME="ControllerFirmwareVersion"><VALUE>52.30.0-6115</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
</SIMPLEREQ></MESSAGE></CIM>`
const swXML = `<CIM><MESSAGE><SIMPLEREQ>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SoftwareIdentity">
<PROPERTY NAME="ElementName"><VALUE>Mellanox Network Adapter - 00:1C:FD:D7:5A:E6</VALUE></PROPERTY>
<PROPERTY NAME="FQDD"><VALUE>InfiniBand.Slot.1-1</VALUE></PROPERTY>
<PROPERTY NAME="VersionString"><VALUE>20.39.35.60</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SoftwareIdentity">
<PROPERTY NAME="ElementName"><VALUE>PERC H755 Front</VALUE></PROPERTY>
<PROPERTY NAME="FQDD"><VALUE>RAID.SL.3-1</VALUE></PROPERTY>
<PROPERTY NAME="VersionString"><VALUE>52.30.0-6115</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SoftwareIdentity">
<PROPERTY NAME="ElementName"><VALUE>BIOS</VALUE></PROPERTY>
<PROPERTY NAME="FQDD"><VALUE>BIOS.Setup.1-1</VALUE></PROPERTY>
<PROPERTY NAME="VersionString"><VALUE>1.15.3</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
</SIMPLEREQ></MESSAGE></CIM>`
inner := makeZipArchive(t, map[string][]byte{
"tsr/metadata.json": []byte(`{"Make":"Dell Inc.","Model":"PowerEdge R6625","ServiceTag":"8VS2LG4"}`),
"tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml": []byte(viewXML),
"tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml": []byte(swXML),
})
p := &Parser{}
result, err := p.Parse([]parser.ExtractedFile{
{Path: "signature", Content: []byte("ok")},
{Path: "TSR20260306141852_8VS2LG4.pl.zip", Content: inner},
})
if err != nil {
t.Fatalf("parse failed: %v", err)
}
// InfiniBand adapter must appear as a NetworkAdapter, not a PCIe device.
if len(result.Hardware.NetworkAdapters) != 1 {
t.Fatalf("expected 1 network adapter, got %d", len(result.Hardware.NetworkAdapters))
}
nic := result.Hardware.NetworkAdapters[0]
if nic.Slot != "InfiniBand.Slot.1-1" {
t.Errorf("unexpected NIC slot: %q", nic.Slot)
}
if nic.Firmware != "20.39.35.60" {
t.Errorf("unexpected NIC firmware: %q", nic.Firmware)
}
if len(nic.MACAddresses) == 0 || nic.MACAddresses[0] != "00:1C:FD:D7:5A:E6" {
t.Errorf("unexpected NIC MAC: %v", nic.MACAddresses)
}
// pci.ids enrichment: VendorID=0x15B3, DeviceID=0x101B → chip model + vendor name.
if nic.Model != "MT28908 Family [ConnectX-6]" {
t.Errorf("NIC model = %q, want MT28908 Family [ConnectX-6] (from pci.ids)", nic.Model)
}
if nic.Vendor != "Mellanox Technologies" {
t.Errorf("NIC vendor = %q, want Mellanox Technologies (from pci.ids)", nic.Vendor)
}
// InfiniBand FQDD must NOT appear in PCIe devices.
for _, pcie := range result.Hardware.PCIeDevices {
if pcie.Slot == "InfiniBand.Slot.1-1" {
t.Errorf("InfiniBand.Slot.1-1 must not appear in PCIeDevices")
}
}
// Firmware entries from SoftwareIdentity and parseControllerView must carry the FQDD
// as their Description so the exporter's isDeviceBoundFirmwareFQDD filter can remove them.
fqddByName := make(map[string]string)
for _, fw := range result.Hardware.Firmware {
fqddByName[fw.DeviceName] = fw.Description
}
if desc := fqddByName["Mellanox Network Adapter"]; desc != "InfiniBand.Slot.1-1" {
t.Errorf("Mellanox firmware Description = %q, want InfiniBand.Slot.1-1 for FQDD filter", desc)
}
if desc := fqddByName["PERC H755 Front"]; desc != "RAID.SL.3-1" {
t.Errorf("PERC H755 Front firmware Description = %q, want RAID.SL.3-1 for FQDD filter", desc)
}
}
// TestParseDellCPUAffinity verifies that CPUAffinity is parsed into NUMANode for
// NIC, PCIe, and controller views. "Not Applicable" must result in NUMANode=0.
func TestParseDellCPUAffinity(t *testing.T) {
const viewXML = `<CIM><MESSAGE><SIMPLEREQ>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SystemView">
<PROPERTY NAME="Manufacturer"><VALUE>Dell Inc.</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>PowerEdge R750</VALUE></PROPERTY>
<PROPERTY NAME="ServiceTag"><VALUE>TESTST1</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_NICView">
<PROPERTY NAME="FQDD"><VALUE>NIC.Slot.2-1-1</VALUE></PROPERTY>
<PROPERTY NAME="ProductName"><VALUE>Some NIC</VALUE></PROPERTY>
<PROPERTY NAME="CPUAffinity"><VALUE>1</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_InfiniBandView">
<PROPERTY NAME="FQDD"><VALUE>InfiniBand.Slot.1-1</VALUE></PROPERTY>
<PROPERTY NAME="DeviceDescription"><VALUE>InfiniBand in Slot 1</VALUE></PROPERTY>
<PROPERTY NAME="CPUAffinity"><VALUE>2</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_ControllerView">
<PROPERTY NAME="FQDD"><VALUE>RAID.Slot.1-1</VALUE></PROPERTY>
<PROPERTY NAME="ProductName"><VALUE>PERC H755</VALUE></PROPERTY>
<PROPERTY NAME="CPUAffinity"><VALUE>Not Applicable</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_PCIDeviceView">
<PROPERTY NAME="FQDD"><VALUE>Slot.7-1</VALUE></PROPERTY>
<PROPERTY NAME="Description"><VALUE>Some PCIe Card</VALUE></PROPERTY>
<PROPERTY NAME="CPUAffinity"><VALUE>2</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
</SIMPLEREQ></MESSAGE></CIM>`
inner := makeZipArchive(t, map[string][]byte{
"tsr/metadata.json": []byte(`{"Make":"Dell Inc.","Model":"PowerEdge R750","ServiceTag":"TESTST1"}`),
"tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml": []byte(viewXML),
})
p := &Parser{}
result, err := p.Parse([]parser.ExtractedFile{
{Path: "signature", Content: []byte("ok")},
{Path: "TSR_TESTST1.pl.zip", Content: inner},
})
if err != nil {
t.Fatalf("parse failed: %v", err)
}
// NIC CPUAffinity=1 → NUMANode=1
nicBySlot := make(map[string]int)
for _, nic := range result.Hardware.NetworkAdapters {
nicBySlot[nic.Slot] = nic.NUMANode
}
if nicBySlot["NIC.Slot.2-1-1"] != 1 {
t.Errorf("NIC.Slot.2-1-1 NUMANode = %d, want 1", nicBySlot["NIC.Slot.2-1-1"])
}
if nicBySlot["InfiniBand.Slot.1-1"] != 2 {
t.Errorf("InfiniBand.Slot.1-1 NUMANode = %d, want 2", nicBySlot["InfiniBand.Slot.1-1"])
}
// PCIe device CPUAffinity=2 → NUMANode=2; controller CPUAffinity="Not Applicable" → NUMANode=0
pcieBySlot := make(map[string]int)
for _, pcie := range result.Hardware.PCIeDevices {
pcieBySlot[pcie.Slot] = pcie.NUMANode
}
if pcieBySlot["Slot.7-1"] != 2 {
t.Errorf("Slot.7-1 NUMANode = %d, want 2", pcieBySlot["Slot.7-1"])
}
if pcieBySlot["RAID.Slot.1-1"] != 0 {
t.Errorf("RAID.Slot.1-1 NUMANode = %d, want 0 (Not Applicable)", pcieBySlot["RAID.Slot.1-1"])
}
}
func makeZipArchive(t *testing.T, files map[string][]byte) []byte {
t.Helper()
var buf bytes.Buffer

View File

@@ -216,6 +216,7 @@ func parseH3CG5(files []parser.ExtractedFile) *models.AnalysisResult {
}
result.Hardware.Storage = dedupeStorage(result.Hardware.Storage)
result.Hardware.Volumes = dedupeVolumes(result.Hardware.Volumes)
parser.ApplyManufacturedYearWeekFromFRU(result.FRU, result.Hardware)
return result
}
@@ -286,6 +287,7 @@ func parseH3CG6(files []parser.ExtractedFile) *models.AnalysisResult {
}
result.Hardware.Storage = dedupeStorage(result.Hardware.Storage)
result.Hardware.Volumes = dedupeVolumes(result.Hardware.Volumes)
parser.ApplyManufacturedYearWeekFromFRU(result.FRU, result.Hardware)
return result
}
@@ -3024,6 +3026,7 @@ func mergeStorage(dst *models.Storage, src models.Storage) {
}
setStorageString(&dst.Location, src.Location)
setStorageString(&dst.Status, normalizeStorageStatus(src.Status, src.Present || dst.Present))
dst.Details = mergeH3CDetails(dst.Details, src.Details)
}
func setStorageString(dst *string, value string) {
@@ -3275,6 +3278,22 @@ func mergePSU(dst *models.PSU, src models.PSU) {
setStorageString(&dst.PartNumber, src.PartNumber)
setStorageString(&dst.Firmware, src.Firmware)
setStorageString(&dst.Status, src.Status)
dst.Details = mergeH3CDetails(dst.Details, src.Details)
}
func mergeH3CDetails(primary, secondary map[string]any) map[string]any {
if len(secondary) == 0 {
return primary
}
if primary == nil {
primary = make(map[string]any, len(secondary))
}
for key, value := range secondary {
if _, ok := primary[key]; !ok {
primary[key] = value
}
}
return primary
}
func dedupeVolumes(items []models.StorageVolume) []models.StorageVolume {

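Both `mergeDellDetails` and `mergeH3CDetails` implement the same first-writer-wins policy: keys already present in the primary map survive, and the secondary map only fills gaps. A standalone sketch of that policy, with the helper copied from the sections above:

```go
package main

import "fmt"

// mergeDetails mirrors mergeDellDetails/mergeH3CDetails: copy keys from
// secondary into primary only when primary does not already carry them.
func mergeDetails(primary, secondary map[string]any) map[string]any {
	if len(secondary) == 0 {
		return primary
	}
	if primary == nil {
		primary = make(map[string]any, len(secondary))
	}
	for key, value := range secondary {
		if _, ok := primary[key]; !ok {
			primary[key] = value
		}
	}
	return primary
}

func main() {
	dst := map[string]any{"manufactured_year_week": "2024-W07"}
	src := map[string]any{"manufactured_year_week": "2023-W01", "fru_id": 30}
	out := mergeDetails(dst, src)
	fmt.Println(out["manufactured_year_week"]) // 2024-W07 (primary wins)
	fmt.Println(out["fru_id"])                 // 30 (gap filled)
}
```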
View File

@@ -94,8 +94,12 @@ type AssetJSON struct {
} `json:"PcieInfo"`
}
// ParseAssetJSON parses Inspur asset.json content
func ParseAssetJSON(content []byte) (*models.HardwareConfig, error) {
// ParseAssetJSON parses Inspur asset.json content.
// - pcieSlotDeviceNames: optional map from integer PCIe slot ID to device name string,
// sourced from devicefrusdr.log PCIe REST section. Fills missing NVMe model names.
// - pcieSlotSerials: optional map from integer PCIe slot ID to serial number string,
// sourced from audit.log SN-changed events. Fills missing NVMe serial numbers.
func ParseAssetJSON(content []byte, pcieSlotDeviceNames map[int]string, pcieSlotSerials map[int]string) (*models.HardwareConfig, error) {
var asset AssetJSON
if err := json.Unmarshal(content, &asset); err != nil {
return nil, err
@@ -175,6 +179,23 @@ func ParseAssetJSON(content []byte) (*models.HardwareConfig, error) {
continue
}
// Enrich model name from PCIe device name (supplied from devicefrusdr.log).
// BMC does not populate HddInfo.ModelName for NVMe drives, but the PCIe REST
// section in devicefrusdr.log carries the drive model as device_name.
if modelName == "" && hdd.PcieSlot > 0 && len(pcieSlotDeviceNames) > 0 {
if devName, ok := pcieSlotDeviceNames[hdd.PcieSlot]; ok && devName != "" {
modelName = devName
}
}
// Enrich serial number from audit.log SN-changed events (supplied via pcieSlotSerials).
// BMC asset.json does not carry NVMe serial numbers; audit.log logs every SN change.
if serial == "" && hdd.PcieSlot > 0 && len(pcieSlotSerials) > 0 {
if sn, ok := pcieSlotSerials[hdd.PcieSlot]; ok && sn != "" {
serial = sn
}
}
storageType := "HDD"
if hdd.DiskInterfaceType == 5 {
storageType = "NVMe"

View File

@@ -28,7 +28,7 @@ func TestParseAssetJSON_NVIDIAGPUModelFromPCIIDs(t *testing.T) {
}]
}`)
hw, err := ParseAssetJSON(raw)
hw, err := ParseAssetJSON(raw, nil, nil)
if err != nil {
t.Fatalf("ParseAssetJSON failed: %v", err)
}

internal/parser/vendors/inspur/audit.go
View File

@@ -0,0 +1,94 @@
package inspur
import (
"fmt"
"regexp"
"strconv"
"strings"
)
// auditSNChangedNVMeRegex matches:
// "Front Back Plane N NVMe DiskM SN changed from X to Y"
// Captures: disk_num, new_serial
var auditSNChangedNVMeRegex = regexp.MustCompile(`NVMe Disk(\d+)\s+SN changed from \S+\s+to\s+(\S+)`)
// auditSNChangedRAIDRegex matches:
// "Raid(Pcie Slot:N) HDD(enclosure id:E slot:S) SN changed from X to Y"
// Captures: pcie_slot, enclosure_id, slot_num, new_serial
var auditSNChangedRAIDRegex = regexp.MustCompile(`Raid\(Pcie Slot:(\d+)\) HDD\(enclosure id:(\d+) slot:(\d+)\)\s+SN changed from \S+\s+to\s+(\S+)`)
// ParseAuditLogNVMeSerials parses audit.log and returns the final (latest) serial number
// per NVMe disk number. The disk number matches the numeric suffix in PCIe location
// strings like "#NVME0", "#NVME2", etc. from devicefrusdr.log.
// A change to "NULL" clears any serial previously recorded for that disk.
func ParseAuditLogNVMeSerials(content []byte) map[int]string {
serials := make(map[int]string)
for _, line := range strings.Split(string(content), "\n") {
m := auditSNChangedNVMeRegex.FindStringSubmatch(line)
if m == nil {
continue
}
diskNum, err := strconv.Atoi(m[1])
if err != nil {
continue
}
serial := strings.TrimSpace(m[2])
if strings.EqualFold(serial, "NULL") || serial == "" {
delete(serials, diskNum)
} else {
serials[diskNum] = serial
}
}
if len(serials) == 0 {
return nil
}
return serials
}
// ParseAuditLogRAIDSerials parses audit.log and returns the final (latest) serial number
// per RAID backplane disk. Key format is "BP{enclosure_id-1}:{slot_num}" (e.g. "BP0:0").
//
// Each disk slot is claimed by a specific RAID controller (Pcie Slot:N). NULL events from
// an old controller do not clear serials assigned by a newer controller, preventing stale
// deletions when disks are migrated between RAID arrays.
func ParseAuditLogRAIDSerials(content []byte) map[string]string {
// owner tracks which PCIe RAID controller slot last assigned a serial to a disk key.
serials := make(map[string]string)
owner := make(map[string]int)
for _, line := range strings.Split(string(content), "\n") {
m := auditSNChangedRAIDRegex.FindStringSubmatch(line)
if m == nil {
continue
}
pcieSlot, err := strconv.Atoi(m[1])
if err != nil {
continue
}
enclosureID, err := strconv.Atoi(m[2])
if err != nil {
continue
}
slotNum, err := strconv.Atoi(m[3])
if err != nil {
continue
}
serial := strings.TrimSpace(m[4])
key := fmt.Sprintf("BP%d:%d", enclosureID-1, slotNum)
if strings.EqualFold(serial, "NULL") || serial == "" {
// Only clear if this controller was the last to set the serial.
if owner[key] == pcieSlot {
delete(serials, key)
delete(owner, key)
}
} else {
serials[key] = serial
owner[key] = pcieSlot
}
}
if len(serials) == 0 {
return nil
}
return serials
}
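The owner-tracking behavior documented above can be demonstrated with a self-contained sketch: the last RAID controller to assign a serial "owns" the disk key, and only the owning controller's NULL event may clear it. The regex is copied from the file; the sample audit lines are synthetic:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

var raidSN = regexp.MustCompile(`Raid\(Pcie Slot:(\d+)\) HDD\(enclosure id:(\d+) slot:(\d+)\)\s+SN changed from \S+\s+to\s+(\S+)`)

// parseRAIDSerials mirrors ParseAuditLogRAIDSerials above, minus the
// nil-on-empty convention, to show the ownership rule in isolation.
func parseRAIDSerials(content string) map[string]string {
	serials := make(map[string]string)
	owner := make(map[string]int) // disk key -> PCIe slot of last assigning controller
	for _, line := range strings.Split(content, "\n") {
		m := raidSN.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		pcieSlot, _ := strconv.Atoi(m[1])
		enc, _ := strconv.Atoi(m[2])
		slot, _ := strconv.Atoi(m[3])
		key := fmt.Sprintf("BP%d:%d", enc-1, slot)
		if strings.EqualFold(m[4], "NULL") {
			if owner[key] == pcieSlot { // only the owner may clear
				delete(serials, key)
				delete(owner, key)
			}
			continue
		}
		serials[key] = m[4]
		owner[key] = pcieSlot
	}
	return serials
}

func main() {
	log := `Raid(Pcie Slot:1) HDD(enclosure id:1 slot:0) SN changed from X to OLDSN
Raid(Pcie Slot:3) HDD(enclosure id:1 slot:0) SN changed from OLDSN to NEWSN
Raid(Pcie Slot:1) HDD(enclosure id:1 slot:0) SN changed from NEWSN to NULL`
	// Slot 1's trailing NULL cannot clear the serial assigned by slot 3.
	fmt.Println(parseRAIDSerials(log)["BP0:0"]) // NEWSN
}
```

Without the `owner` map, the final NULL event from the stale controller would delete the serial the disk's current controller just assigned.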

View File

@@ -100,10 +100,18 @@ func parseMemoryInfo(text string, hw *models.HardwareConfig) {
return
}
// Merge detailed memory info from component.log into any existing entries,
// deduplicated by serial number (falling back to slot/location).
hw.Memory = nil
var merged []models.MemoryDIMM
seen := make(map[string]int)
for _, existing := range hw.Memory {
key := inspurMemoryKey(existing)
if key == "" {
continue
}
seen[key] = len(merged)
merged = append(merged, existing)
}
for _, mem := range memInfo.MemModules {
hw.Memory = append(hw.Memory, models.MemoryDIMM{
item := models.MemoryDIMM{
Slot: mem.MemModSlot,
Location: mem.MemModSlot,
Present: mem.MemModStatus == 1 && mem.MemModSize > 0,
@@ -117,8 +125,18 @@ func parseMemoryInfo(text string, hw *models.HardwareConfig) {
PartNumber: strings.TrimSpace(mem.MemModPartNum),
Status: mem.Status,
Ranks: mem.MemModRanks,
})
}
key := inspurMemoryKey(item)
if idx, ok := seen[key]; ok {
mergeInspurMemoryDIMM(&merged[idx], item)
continue
}
if key != "" {
seen[key] = len(merged)
}
merged = append(merged, item)
}
hw.Memory = merged
}
// PSURESTInfo represents the RESTful PSU info structure
@@ -159,10 +177,18 @@ func parsePSUInfo(text string, hw *models.HardwareConfig) {
return
}
// Merge RESTful PSU data into any existing entries, deduplicated by serial/slot.
hw.PowerSupply = nil
var merged []models.PSU
seen := make(map[string]int)
for _, existing := range hw.PowerSupply {
key := inspurPSUKey(existing)
if key == "" {
continue
}
seen[key] = len(merged)
merged = append(merged, existing)
}
for _, psu := range psuInfo.PowerSupplies {
hw.PowerSupply = append(hw.PowerSupply, models.PSU{
item := models.PSU{
Slot: fmt.Sprintf("PSU%d", psu.ID),
Present: psu.Present == 1,
Model: strings.TrimSpace(psu.Model),
@@ -178,8 +204,18 @@ func parsePSUInfo(text string, hw *models.HardwareConfig) {
InputVoltage: psu.PSInVolt,
OutputVoltage: psu.PSOutVolt,
TemperatureC: psu.PSUMaxTemp,
})
}
key := inspurPSUKey(item)
if idx, ok := seen[key]; ok {
mergeInspurPSU(&merged[idx], item)
continue
}
if key != "" {
seen[key] = len(merged)
}
merged = append(merged, item)
}
hw.PowerSupply = merged
}
// HDDRESTInfo represents the RESTful HDD info structure
@@ -357,7 +393,16 @@ func parseNetworkAdapterInfo(text string, hw *models.HardwareConfig) {
return
}
hw.NetworkAdapters = nil
var merged []models.NetworkAdapter
seen := make(map[string]int)
for _, existing := range hw.NetworkAdapters {
key := inspurNICKey(existing)
if key == "" {
continue
}
seen[key] = len(merged)
merged = append(merged, existing)
}
for _, adapter := range netInfo.SysAdapters {
var macs []string
for _, port := range adapter.Ports {
@@ -377,7 +422,7 @@ func parseNetworkAdapterInfo(text string, hw *models.HardwareConfig) {
vendor = normalizeModelLabel(pciids.VendorName(adapter.VendorID))
}
hw.NetworkAdapters = append(hw.NetworkAdapters, models.NetworkAdapter{
item := models.NetworkAdapter{
Slot: fmt.Sprintf("Slot %d", adapter.Slot),
Location: adapter.Location,
Present: adapter.Present == 1,
@@ -392,8 +437,231 @@ func parseNetworkAdapterInfo(text string, hw *models.HardwareConfig) {
PortType: adapter.PortType,
MACAddresses: macs,
Status: adapter.Status,
})
}
key := inspurNICKey(item)
if idx, ok := seen[key]; ok {
mergeInspurNIC(&merged[idx], item)
continue
}
if slotIdx := inspurFindNICBySlot(merged, item.Slot); slotIdx >= 0 {
mergeInspurNIC(&merged[slotIdx], item)
if key != "" {
seen[key] = slotIdx
}
continue
}
if key != "" {
seen[key] = len(merged)
}
merged = append(merged, item)
}
hw.NetworkAdapters = merged
}
func inspurMemoryKey(item models.MemoryDIMM) string {
return strings.ToLower(strings.TrimSpace(inspurFirstNonEmpty(item.SerialNumber, item.Slot, item.Location)))
}
func mergeInspurMemoryDIMM(dst *models.MemoryDIMM, src models.MemoryDIMM) {
if dst == nil {
return
}
if strings.TrimSpace(dst.Slot) == "" {
dst.Slot = src.Slot
}
if strings.TrimSpace(dst.Location) == "" {
dst.Location = src.Location
}
dst.Present = dst.Present || src.Present
if dst.SizeMB == 0 {
dst.SizeMB = src.SizeMB
}
if strings.TrimSpace(dst.Type) == "" {
dst.Type = src.Type
}
if strings.TrimSpace(dst.Technology) == "" {
dst.Technology = src.Technology
}
if dst.MaxSpeedMHz == 0 {
dst.MaxSpeedMHz = src.MaxSpeedMHz
}
if dst.CurrentSpeedMHz == 0 {
dst.CurrentSpeedMHz = src.CurrentSpeedMHz
}
if strings.TrimSpace(dst.Manufacturer) == "" {
dst.Manufacturer = src.Manufacturer
}
if strings.TrimSpace(dst.SerialNumber) == "" {
dst.SerialNumber = src.SerialNumber
}
if strings.TrimSpace(dst.PartNumber) == "" {
dst.PartNumber = src.PartNumber
}
if strings.TrimSpace(dst.Status) == "" {
dst.Status = src.Status
}
if dst.Ranks == 0 {
dst.Ranks = src.Ranks
}
}
func inspurPSUKey(item models.PSU) string {
return strings.ToLower(strings.TrimSpace(inspurFirstNonEmpty(item.SerialNumber, item.Slot, item.Model)))
}
func mergeInspurPSU(dst *models.PSU, src models.PSU) {
if dst == nil {
return
}
if strings.TrimSpace(dst.Slot) == "" {
dst.Slot = src.Slot
}
dst.Present = dst.Present || src.Present
if strings.TrimSpace(dst.Model) == "" {
dst.Model = src.Model
}
if strings.TrimSpace(dst.Vendor) == "" {
dst.Vendor = src.Vendor
}
if dst.WattageW == 0 {
dst.WattageW = src.WattageW
}
if strings.TrimSpace(dst.SerialNumber) == "" {
dst.SerialNumber = src.SerialNumber
}
if strings.TrimSpace(dst.PartNumber) == "" {
dst.PartNumber = src.PartNumber
}
if strings.TrimSpace(dst.Firmware) == "" {
dst.Firmware = src.Firmware
}
if strings.TrimSpace(dst.Status) == "" {
dst.Status = src.Status
}
if strings.TrimSpace(dst.InputType) == "" {
dst.InputType = src.InputType
}
if dst.InputPowerW == 0 {
dst.InputPowerW = src.InputPowerW
}
if dst.OutputPowerW == 0 {
dst.OutputPowerW = src.OutputPowerW
}
if dst.InputVoltage == 0 {
dst.InputVoltage = src.InputVoltage
}
if dst.OutputVoltage == 0 {
dst.OutputVoltage = src.OutputVoltage
}
if dst.TemperatureC == 0 {
dst.TemperatureC = src.TemperatureC
}
}
func inspurNICKey(item models.NetworkAdapter) string {
return strings.ToLower(strings.TrimSpace(inspurFirstNonEmpty(item.SerialNumber, strings.Join(item.MACAddresses, ","), item.Slot, item.Location)))
}
func mergeInspurNIC(dst *models.NetworkAdapter, src models.NetworkAdapter) {
if dst == nil {
return
}
if strings.TrimSpace(dst.Slot) == "" {
dst.Slot = src.Slot
}
if strings.TrimSpace(dst.Location) == "" {
dst.Location = src.Location
}
dst.Present = dst.Present || src.Present
if strings.TrimSpace(dst.BDF) == "" {
dst.BDF = src.BDF
}
if strings.TrimSpace(dst.Model) == "" {
dst.Model = src.Model
}
if strings.TrimSpace(dst.Description) == "" {
dst.Description = src.Description
}
if strings.TrimSpace(dst.Vendor) == "" {
dst.Vendor = src.Vendor
}
if dst.VendorID == 0 {
dst.VendorID = src.VendorID
}
if dst.DeviceID == 0 {
dst.DeviceID = src.DeviceID
}
if strings.TrimSpace(dst.SerialNumber) == "" {
dst.SerialNumber = src.SerialNumber
}
if strings.TrimSpace(dst.PartNumber) == "" {
dst.PartNumber = src.PartNumber
}
if strings.TrimSpace(dst.Firmware) == "" {
dst.Firmware = src.Firmware
}
if dst.PortCount == 0 {
dst.PortCount = src.PortCount
}
if strings.TrimSpace(dst.PortType) == "" {
dst.PortType = src.PortType
}
if dst.LinkWidth == 0 {
dst.LinkWidth = src.LinkWidth
}
if strings.TrimSpace(dst.LinkSpeed) == "" {
dst.LinkSpeed = src.LinkSpeed
}
if dst.MaxLinkWidth == 0 {
dst.MaxLinkWidth = src.MaxLinkWidth
}
if strings.TrimSpace(dst.MaxLinkSpeed) == "" {
dst.MaxLinkSpeed = src.MaxLinkSpeed
}
if dst.NUMANode == 0 {
dst.NUMANode = src.NUMANode
}
if strings.TrimSpace(dst.Status) == "" {
dst.Status = src.Status
}
for _, mac := range src.MACAddresses {
mac = strings.TrimSpace(mac)
if mac == "" {
continue
}
found := false
for _, existing := range dst.MACAddresses {
if strings.EqualFold(strings.TrimSpace(existing), mac) {
found = true
break
}
}
if !found {
dst.MACAddresses = append(dst.MACAddresses, mac)
}
}
}
func inspurFindNICBySlot(items []models.NetworkAdapter, slot string) int {
slot = strings.ToLower(strings.TrimSpace(slot))
if slot == "" {
return -1
}
for i := range items {
if strings.ToLower(strings.TrimSpace(items[i].Slot)) == slot {
return i
}
}
return -1
}
func inspurFirstNonEmpty(values ...string) string {
for _, value := range values {
if strings.TrimSpace(value) != "" {
return strings.TrimSpace(value)
}
}
return ""
}
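The NIC/PSU/DIMM dedup above follows one pattern: derive a stable key from the first non-empty identifier, merge on key collision instead of appending, and never overwrite a populated field. A minimal, self-contained sketch of that pattern with simplified stand-in types (`nic`, `nicKey`, `mergeNIC`, `dedup` are hypothetical names, not the real `models.NetworkAdapter` helpers):

```go
package main

import (
	"fmt"
	"strings"
)

// nic is a simplified stand-in for models.NetworkAdapter.
type nic struct {
	Serial, Slot, Model string
}

// nicKey mirrors inspurNICKey: first non-empty identifier, lowercased.
func nicKey(n nic) string {
	for _, v := range []string{n.Serial, n.Slot} {
		if s := strings.TrimSpace(v); s != "" {
			return strings.ToLower(s)
		}
	}
	return ""
}

// mergeNIC fills empty fields of dst from src, never overwriting.
func mergeNIC(dst *nic, src nic) {
	if strings.TrimSpace(dst.Model) == "" {
		dst.Model = src.Model
	}
	if strings.TrimSpace(dst.Slot) == "" {
		dst.Slot = src.Slot
	}
}

// dedup merges items that resolve to the same key; keyless items
// are always appended, matching the guarded map writes above.
func dedup(items []nic) []nic {
	seen := map[string]int{}
	var merged []nic
	for _, item := range items {
		key := nicKey(item)
		if idx, ok := seen[key]; ok && key != "" {
			mergeNIC(&merged[idx], item)
			continue
		}
		if key != "" {
			seen[key] = len(merged)
		}
		merged = append(merged, item)
	}
	return merged
}

func main() {
	out := dedup([]nic{
		{Serial: "SN1", Slot: "4"},
		{Serial: "SN1", Model: "ConnectX-6"},
	})
	fmt.Println(len(out), out[0].Model) // 1 ConnectX-6
}
```

The slot-based fallback lookup in the real code adds one more matching stage before appending; the key-first structure stays the same.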
func parseFanSensors(text string) []models.SensorReading {
@@ -713,6 +981,63 @@ func extractComponentFirmware(text string, hw *models.HardwareConfig) {
}
}
}
// Extract BMC, CPLD and VR firmware from RESTful version info section.
// The JSON is a flat array: [{"id":N,"dev_name":"...","dev_version":"..."}, ...]
reVer := regexp.MustCompile(`RESTful version info:\s*(\[[\s\S]*?\])\s*RESTful`)
if match := reVer.FindStringSubmatch(text); match != nil {
type verEntry struct {
DevName string `json:"dev_name"`
DevVersion string `json:"dev_version"`
}
var entries []verEntry
if err := json.Unmarshal([]byte(match[1]), &entries); err == nil {
for _, e := range entries {
name := normalizeVersionInfoName(e.DevName)
if name == "" {
continue
}
version := strings.TrimSpace(e.DevVersion)
if version == "" {
continue
}
if existingFW[name] {
continue
}
hw.Firmware = append(hw.Firmware, models.FirmwareInfo{
DeviceName: name,
Version: version,
})
existingFW[name] = true
}
}
}
}
// normalizeVersionInfoName converts RESTful version info dev_name to a clean label.
// Returns "" for entries that should be skipped (inactive BMC, PSU slots).
func normalizeVersionInfoName(name string) string {
name = strings.TrimSpace(name)
if name == "" {
return ""
}
// Skip PSU_N entries — firmware already extracted from PSU info section.
if regexp.MustCompile(`(?i)^PSU_\d+$`).MatchString(name) {
return ""
}
// Skip the inactive BMC partition.
if strings.HasPrefix(strings.ToLower(name), "inactivate(") {
return ""
}
// Active BMC: "Activate(BMC1)" → "BMC"
if strings.HasPrefix(strings.ToLower(name), "activate(") {
return "BMC"
}
// Strip trailing "Version" suffix (case-insensitive), e.g. "MainBoard0CPLDVersion" → "MainBoard0CPLD"
if strings.HasSuffix(strings.ToLower(name), "version") {
name = name[:len(name)-len("version")]
}
return strings.TrimSpace(name)
}
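The firmware-extraction hunk above relies on capturing a JSON array sandwiched between two plain-text section markers. A standalone sketch of that capture-and-decode step, using the same regex shape (the `verEntry` type mirrors the flat records; `extractVersionEntries` is a hypothetical helper name):

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// verEntry mirrors the flat RESTful version info records:
// [{"id":N,"dev_name":"...","dev_version":"..."}, ...]
type verEntry struct {
	DevName    string `json:"dev_name"`
	DevVersion string `json:"dev_version"`
}

// extractVersionEntries captures the JSON array between the section
// header and the next "RESTful" marker, then decodes it. A decode
// failure is treated the same as a missing section.
func extractVersionEntries(text string) []verEntry {
	re := regexp.MustCompile(`RESTful version info:\s*(\[[\s\S]*?\])\s*RESTful`)
	m := re.FindStringSubmatch(text)
	if m == nil {
		return nil
	}
	var entries []verEntry
	if err := json.Unmarshal([]byte(m[1]), &entries); err != nil {
		return nil
	}
	return entries
}

func main() {
	log := `RESTful version info:
[{"id":1,"dev_name":"MainBoard0CPLDVersion","dev_version":"2.09"}]
RESTful fan info:`
	for _, e := range extractVersionEntries(log) {
		fmt.Println(e.DevName, e.DevVersion)
	}
}
```

The non-greedy `[\s\S]*?` is what keeps the capture from running past the array into later sections when the log contains several JSON blocks.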
// DiskBackplaneRESTInfo represents the RESTful diskbackplane info structure

View File

@@ -51,6 +51,64 @@ RESTful fan`
}
}
func TestParseNetworkAdapterInfo_MergesIntoExistingInventory(t *testing.T) {
text := `RESTful Network Adapter info:
{
"sys_adapters": [
{
"id": 1,
"name": "NIC1",
"Location": "#CPU0_PCIE4",
"present": 1,
"slot": 4,
"vendor_id": 32902,
"device_id": 5409,
"vendor": "Mellanox",
"model": "ConnectX-6",
"fw_ver": "22.1.0",
"status": "OK",
"sn": "",
"pn": "",
"port_num": 2,
"port_type": "QSFP",
"ports": [
{ "id": 1, "mac_addr": "00:11:22:33:44:55" }
]
}
]
}
RESTful fan`
hw := &models.HardwareConfig{
NetworkAdapters: []models.NetworkAdapter{
{
Slot: "Slot 4",
BDF: "0000:17:00.0",
SerialNumber: "NIC-SN-1",
Present: true,
},
},
}
parseNetworkAdapterInfo(text, hw)
if len(hw.NetworkAdapters) != 1 {
t.Fatalf("expected merged single adapter, got %d", len(hw.NetworkAdapters))
}
got := hw.NetworkAdapters[0]
if got.BDF != "0000:17:00.0" {
t.Fatalf("expected existing BDF to survive merge, got %q", got.BDF)
}
if got.Model != "ConnectX-6" {
t.Fatalf("expected model from component log, got %q", got.Model)
}
if got.SerialNumber != "NIC-SN-1" {
t.Fatalf("expected serial from existing inventory to survive merge, got %q", got.SerialNumber)
}
if len(got.MACAddresses) != 1 || got.MACAddresses[0] != "00:11:22:33:44:55" {
t.Fatalf("expected MAC addresses from component log, got %#v", got.MACAddresses)
}
}
func TestParseComponentLogSensors_ExtractsFanBackplaneAndPSUSummary(t *testing.T) {
text := `RESTful PSU info:
{

View File

@@ -0,0 +1,33 @@
package inspur
import "testing"
func TestParseIDLLog_UsesBMCSourceForEventLogs(t *testing.T) {
content := []byte(`|2025-12-02T17:54:27+08:00|MEMORY|Assert|Warning|0C180401|CPU1_C4D0 Memory Device Disabled - Assert|`)
events := ParseIDLLog(content)
if len(events) != 1 {
t.Fatalf("expected 1 event, got %d", len(events))
}
if events[0].Source != "BMC" {
t.Fatalf("expected IDL events to use BMC source, got %#v", events[0])
}
if events[0].SensorName != "CPU1_C4D0" {
t.Fatalf("expected extracted DIMM component ref, got %#v", events[0])
}
}
func TestParseSyslog_UsesHostSourceAndProcessAsSensorName(t *testing.T) {
content := []byte(`<13>2026-03-15T14:03:11+00:00 host123 systemd[1]: Started Example Service`)
events := ParseSyslog(content, "syslog/info")
if len(events) != 1 {
t.Fatalf("expected 1 event, got %d", len(events))
}
if events[0].Source != "syslog" {
t.Fatalf("expected syslog source, got %#v", events[0])
}
if events[0].SensorName != "systemd[1]" {
t.Fatalf("expected process name in sensor/component slot, got %#v", events[0])
}
}

View File

@@ -165,7 +165,10 @@ func TestParseIDLLog_ParsesStructuredJSONLine(t *testing.T) {
if events[0].ID != "17FFB002" {
t.Fatalf("expected event ID 17FFB002, got %q", events[0].ID)
}
-if events[0].Source != "PCIE" {
-t.Fatalf("expected source PCIE, got %q", events[0].Source)
+if events[0].Source != "BMC" {
+t.Fatalf("expected BMC source for IDL event, got %q", events[0].Source)
}
if events[0].SensorType != "pcie" {
t.Fatalf("expected component type pcie, got %#v", events[0])
}
}

View File

@@ -60,7 +60,7 @@ func ParseIDLLog(content []byte) []models.Event {
events = append(events, models.Event{
ID: eventID,
Timestamp: ts,
-Source: component,
+Source: "BMC",
SensorType: strings.ToLower(component),
SensorName: sensorName,
EventType: eventType,

View File

@@ -16,7 +16,7 @@ import (
// parserVersion - version of this parser module
// IMPORTANT: Increment this version when making changes to parser logic!
-const parserVersion = "1.5"
+const parserVersion = "1.8"
func init() {
parser.Register(&Parser{})
@@ -95,9 +95,41 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
Sensors: make([]models.SensorReading, 0),
}
// Pre-parse enrichment maps from devicefrusdr.log for use inside ParseAssetJSON.
// BMC does not populate HddInfo.ModelName or SerialNumber for NVMe drives.
var pcieSlotDeviceNames map[int]string
var nvmeLocToSlot map[int]int
if f := parser.FindFileByName(files, "devicefrusdr.log"); f != nil {
pcieSlotDeviceNames = ParsePCIeSlotDeviceNames(f.Content)
nvmeLocToSlot = ParsePCIeNVMeLocToSlot(f.Content)
}
// Parse NVMe serial numbers from audit.log: every disk SN change is logged there.
// Combine with the NVMe loc→slot mapping to build pcieSlot→serial map.
// Also parse RAID disk serials by backplane slot key (e.g. "BP0:0").
var pcieSlotSerials map[int]string
var raidSlotSerials map[string]string
if f := parser.FindFileByName(files, "audit.log"); f != nil {
if len(nvmeLocToSlot) > 0 {
nvmeDiskSerials := ParseAuditLogNVMeSerials(f.Content)
if len(nvmeDiskSerials) > 0 {
pcieSlotSerials = make(map[int]string, len(nvmeDiskSerials))
for diskNum, serial := range nvmeDiskSerials {
if slot, ok := nvmeLocToSlot[diskNum]; ok {
pcieSlotSerials[slot] = serial
}
}
if len(pcieSlotSerials) == 0 {
pcieSlotSerials = nil
}
}
}
raidSlotSerials = ParseAuditLogRAIDSerials(f.Content)
}
// Parse asset.json first (base hardware info)
if f := parser.FindFileByName(files, "asset.json"); f != nil {
-if hw, err := ParseAssetJSON(f.Content); err == nil {
+if hw, err := ParseAssetJSON(f.Content, pcieSlotDeviceNames, pcieSlotSerials); err == nil {
result.Hardware = hw
}
}
@@ -182,6 +214,10 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
if result.Hardware != nil {
applyGPUStatusFromEvents(result.Hardware, result.Events)
enrichStorageFromSerialFallbackFiles(files, result.Hardware)
// Apply RAID disk serials from audit.log (authoritative: last non-NULL SN change).
// These override redis/component.log serials which may be stale after disk replacement.
applyRAIDSlotSerials(result.Hardware, raidSlotSerials)
parser.ApplyManufacturedYearWeekFromFRU(result.FRU, result.Hardware)
}
return result, nil
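The NVMe serial enrichment above is a two-map join: audit.log yields disk-number→serial, devicefrusdr.log yields disk-number→PCIe-slot, and their composition gives the slot→serial map consumed by `ParseAssetJSON`. A sketch of that composition in isolation (`composeSlotSerials` is a hypothetical helper name):

```go
package main

import "fmt"

// composeSlotSerials joins an NVMe disk-number→serial map with a
// disk-number→PCIe-slot map into a slot→serial map. Returns nil when
// nothing joins, matching the "len == 0 → nil" convention above.
func composeSlotSerials(diskSerials map[int]string, locToSlot map[int]int) map[int]string {
	out := make(map[int]string, len(diskSerials))
	for diskNum, serial := range diskSerials {
		if slot, ok := locToSlot[diskNum]; ok {
			out[slot] = serial
		}
	}
	if len(out) == 0 {
		return nil
	}
	return out
}

func main() {
	serials := map[int]string{0: "NVME-SN-A", 2: "NVME-SN-B"}
	locToSlot := map[int]int{0: 11, 2: 13}
	fmt.Println(composeSlotSerials(serials, locToSlot)[13]) // NVME-SN-B
}
```

Disk numbers with no slot mapping are silently dropped, which is why the caller re-checks for an empty result before passing the map on.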

View File

@@ -4,6 +4,7 @@ import (
"encoding/json"
"fmt"
"regexp"
"strconv"
"strings"
"git.mchus.pro/mchus/logpile/internal/models"
@@ -37,6 +38,84 @@ type PCIeRESTInfo []struct {
FwVer string `json:"fw_ver"`
}
// ParsePCIeSlotDeviceNames parses devicefrusdr.log and returns a map from integer PCIe slot ID
// to device name string. Used to enrich HddInfo entries in asset.json that lack model names.
func ParsePCIeSlotDeviceNames(content []byte) map[int]string {
info, ok := parsePCIeRESTJSON(content)
if !ok {
return nil
}
result := make(map[int]string, len(info))
for _, entry := range info {
if entry.Slot <= 0 {
continue
}
name := sanitizePCIeDeviceName(entry.DeviceName)
if name != "" {
result[entry.Slot] = name
}
}
if len(result) == 0 {
return nil
}
return result
}
// parsePCIeRESTJSON parses the RESTful PCIE Device info JSON from devicefrusdr.log content.
func parsePCIeRESTJSON(content []byte) (PCIeRESTInfo, bool) {
text := string(content)
startMarker := "RESTful PCIE Device info:"
endMarker := "BMC sdr Info:"
startIdx := strings.Index(text, startMarker)
if startIdx == -1 {
return nil, false
}
endIdx := strings.Index(text[startIdx:], endMarker)
if endIdx == -1 {
endIdx = len(text) - startIdx
}
jsonText := strings.TrimSpace(text[startIdx+len(startMarker) : startIdx+endIdx])
var info PCIeRESTInfo
if err := json.Unmarshal([]byte(jsonText), &info); err != nil {
return nil, false
}
return info, true
}
// ParsePCIeNVMeLocToSlot parses devicefrusdr.log and returns a map from NVMe location number
// (the numeric suffix in "#NVME0", "#NVME2", etc.) to the integer PCIe slot ID.
// This is used to correlate audit.log NVMe disk numbers with HddInfo PcieSlot values.
func ParsePCIeNVMeLocToSlot(content []byte) map[int]int {
info, ok := parsePCIeRESTJSON(content)
if !ok {
return nil
}
nvmeLocRegex := regexp.MustCompile(`(?i)^#NVME(\d+)$`)
result := make(map[int]int)
for _, entry := range info {
if entry.Slot <= 0 {
continue
}
loc := strings.TrimSpace(entry.Location)
m := nvmeLocRegex.FindStringSubmatch(loc)
if m == nil {
continue
}
locNum, err := strconv.Atoi(m[1])
if err != nil {
continue
}
result[locNum] = entry.Slot
}
if len(result) == 0 {
return nil
}
return result
}
// ParsePCIeDevices parses RESTful PCIE Device info from devicefrusdr.log
func ParsePCIeDevices(content []byte) []models.PCIeDevice {
text := string(content)

View File

@@ -73,6 +73,24 @@ func looksLikeStorageSerial(v string) bool {
return hasLetter && hasDigit
}
// applyRAIDSlotSerials updates storage serial numbers using the slot→serial map
// derived from audit.log RAID SN change events. Overwrites existing serials since
// audit.log represents the authoritative current state after all disk replacements.
func applyRAIDSlotSerials(hw *models.HardwareConfig, serials map[string]string) {
if hw == nil || len(serials) == 0 {
return
}
for i := range hw.Storage {
slot := strings.TrimSpace(hw.Storage[i].Slot)
if slot == "" {
continue
}
if sn, ok := serials[slot]; ok && sn != "" {
hw.Storage[i].SerialNumber = sn
}
}
}
func applyStorageSerialFallback(hw *models.HardwareConfig, serials []string) {
if hw == nil || len(hw.Storage) == 0 || len(serials) == 0 {
return

View File

@@ -26,7 +26,7 @@ func TestParseAssetJSON_HddSlotFallbackAndPresence(t *testing.T) {
]
}`)
-hw, err := ParseAssetJSON(content)
+hw, err := ParseAssetJSON(content, nil, nil)
if err != nil {
t.Fatalf("ParseAssetJSON failed: %v", err)
}

View File

@@ -48,9 +48,9 @@ func ParseSyslog(content []byte, sourcePath string) []models.Event {
event := models.Event{
ID: generateEventID(sourcePath, lineNum),
Timestamp: timestamp,
-Source: matches[4],
+Source: "syslog",
SensorType: "syslog",
-SensorName: matches[3],
+SensorName: matches[4],
Description: matches[5],
Severity: severity,
RawData: line,

View File

@@ -0,0 +1,69 @@
package server
import (
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"git.mchus.pro/mchus/logpile/internal/models"
)
func TestHandleChartCurrent_RendersCurrentReanimatorSnapshot(t *testing.T) {
s := New(Config{})
s.SetResult(&models.AnalysisResult{
SourceType: models.SourceTypeArchive,
Filename: "example.zip",
CollectedAt: time.Date(2026, 3, 16, 10, 0, 0, 0, time.UTC),
Hardware: &models.HardwareConfig{
BoardInfo: models.BoardInfo{
ProductName: "SYS-TEST",
SerialNumber: "SN123",
},
CPUs: []models.CPU{
{
Socket: 1,
Model: "Xeon Gold",
Cores: 32,
},
},
},
})
req := httptest.NewRequest(http.MethodGet, "/chart/current", nil)
rec := httptest.NewRecorder()
s.mux.ServeHTTP(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("expected 200, got %d", rec.Code)
}
body := rec.Body.String()
if !strings.Contains(body, "SYS-TEST - SN123") {
t.Fatalf("expected chart title in body, got %q", body)
}
if !strings.Contains(body, `/chart/static/view.css`) {
t.Fatalf("expected rewritten chart static path, got %q", body)
}
if !strings.Contains(body, "Snapshot Metadata") {
t.Fatalf("expected rendered chart output, got %q", body)
}
}
func TestHandleChartCurrent_RendersEmptyViewerWithoutResult(t *testing.T) {
s := New(Config{})
req := httptest.NewRequest(http.MethodGet, "/chart/current", nil)
rec := httptest.NewRecorder()
s.mux.ServeHTTP(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("expected 200, got %d", rec.Code)
}
body := rec.Body.String()
if !strings.Contains(body, "Snapshot Viewer") {
t.Fatalf("expected empty chart viewer, got %q", body)
}
}

View File

@@ -14,16 +14,50 @@ import (
func newCollectTestServer() (*Server, *httptest.Server) {
s := &Server{
-jobManager: NewJobManager(),
+jobManager: NewJobManager(),
+collectors: testCollectorRegistry(),
}
mux := http.NewServeMux()
mux.HandleFunc("POST /api/collect/probe", s.handleCollectProbe)
mux.HandleFunc("POST /api/collect", s.handleCollectStart)
mux.HandleFunc("GET /api/collect/{id}", s.handleCollectStatus)
mux.HandleFunc("POST /api/collect/{id}/cancel", s.handleCollectCancel)
return s, httptest.NewServer(mux)
}
func TestCollectProbe(t *testing.T) {
_, ts := newCollectTestServer()
defer ts.Close()
body := `{"host":"bmc-off.local","protocol":"redfish","port":443,"username":"admin","auth_type":"password","password":"secret","tls_mode":"strict"}`
resp, err := http.Post(ts.URL+"/api/collect/probe", "application/json", bytes.NewBufferString(body))
if err != nil {
t.Fatalf("post collect probe failed: %v", err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Fatalf("expected 200, got %d", resp.StatusCode)
}
var payload CollectProbeResponse
if err := json.NewDecoder(resp.Body).Decode(&payload); err != nil {
t.Fatalf("decode probe response: %v", err)
}
if !payload.Reachable {
t.Fatalf("expected reachable=true, got false")
}
if payload.HostPoweredOn {
t.Fatalf("expected host powered off in probe response")
}
if payload.HostPowerState != "Off" {
t.Fatalf("expected host power state Off, got %q", payload.HostPowerState)
}
if !payload.PowerControlAvailable {
t.Fatalf("expected power control to be available")
}
}
func TestCollectLifecycleToTerminal(t *testing.T) {
_, ts := newCollectTestServer()
defer ts.Close()
@@ -57,6 +91,21 @@ func TestCollectLifecycleToTerminal(t *testing.T) {
if len(status.Logs) < 4 {
t.Fatalf("expected detailed logs, got %v", status.Logs)
}
if len(status.ActiveModules) == 0 {
t.Fatal("expected active modules in collect status")
}
if status.ActiveModules[0].Name == "" {
t.Fatal("expected active module name")
}
if len(status.ModuleScores) == 0 {
t.Fatal("expected module scores in collect status")
}
if status.DebugInfo == nil {
t.Fatal("expected debug info in collect status")
}
if len(status.DebugInfo.PhaseTelemetry) == 0 {
t.Fatal("expected phase telemetry in collect debug info")
}
}
func TestCollectCancel(t *testing.T) {

View File

@@ -17,8 +17,44 @@ func (c *mockConnector) Protocol() string {
return c.protocol
}
func (c *mockConnector) Probe(ctx context.Context, req collector.Request) (*collector.ProbeResult, error) {
if strings.Contains(strings.ToLower(req.Host), "fail") {
return nil, context.DeadlineExceeded
}
poweredOn := !strings.Contains(strings.ToLower(req.Host), "off")
state := "Off"
if poweredOn {
state = "On"
}
return &collector.ProbeResult{
Reachable: true,
Protocol: c.protocol,
HostPowerState: state,
HostPoweredOn: poweredOn,
PowerControlAvailable: true,
SystemPath: "/redfish/v1/Systems/1",
}, nil
}
func (c *mockConnector) Collect(ctx context.Context, req collector.Request, emit collector.ProgressFn) (*models.AnalysisResult, error) {
steps := []collector.Progress{
{
Status: CollectStatusRunning,
Progress: 10,
Message: "Подбор модулей Redfish...",
ActiveModules: []collector.ModuleActivation{
{Name: "supermicro", Score: 80},
{Name: "generic", Score: 10},
},
ModuleScores: []collector.ModuleScore{
{Name: "supermicro", Score: 80, Active: true, Priority: 20},
{Name: "generic", Score: 10, Active: true, Priority: 100},
{Name: "hgx-topology", Score: 0, Active: false, Priority: 30},
},
DebugInfo: &collector.CollectDebugInfo{
AdaptiveThrottled: false,
SnapshotWorkers: 6,
PrefetchWorkers: 4,
PhaseTelemetry: []collector.PhaseTelemetry{
{Phase: "discovery", Requests: 6, Errors: 0, ErrorRate: 0, AvgMS: 120, P95MS: 180},
},
},
},
{Status: CollectStatusRunning, Progress: 20, Message: "Подключение..."},
{Status: CollectStatusRunning, Progress: 50, Message: "Сбор инвентаря..."},
{Status: CollectStatusRunning, Progress: 80, Message: "Нормализация..."},

View File

@@ -11,14 +11,24 @@ const (
)
type CollectRequest struct {
-Host string `json:"host"`
-Protocol string `json:"protocol"`
-Port int `json:"port"`
-Username string `json:"username"`
-AuthType string `json:"auth_type"`
-Password string `json:"password,omitempty"`
-Token string `json:"token,omitempty"`
-TLSMode string `json:"tls_mode"`
+Host string `json:"host"`
+Protocol string `json:"protocol"`
+Port int `json:"port"`
+Username string `json:"username"`
+AuthType string `json:"auth_type"`
+Password string `json:"password,omitempty"`
+Token string `json:"token,omitempty"`
+TLSMode string `json:"tls_mode"`
+PowerOnIfHostOff bool `json:"power_on_if_host_off,omitempty"`
}
type CollectProbeResponse struct {
Reachable bool `json:"reachable"`
Protocol string `json:"protocol,omitempty"`
HostPowerState string `json:"host_power_state,omitempty"`
HostPoweredOn bool `json:"host_powered_on"`
PowerControlAvailable bool `json:"power_control_available"`
Message string `json:"message,omitempty"`
}
type CollectJobResponse struct {
@@ -29,13 +39,18 @@ type CollectJobResponse struct {
}
type CollectJobStatusResponse struct {
-JobID string `json:"job_id"`
-Status string `json:"status"`
-Progress *int `json:"progress,omitempty"`
-Logs []string `json:"logs,omitempty"`
-Error string `json:"error,omitempty"`
-CreatedAt time.Time `json:"created_at,omitempty"`
-UpdatedAt time.Time `json:"updated_at"`
+JobID string `json:"job_id"`
+Status string `json:"status"`
+Progress *int `json:"progress,omitempty"`
+CurrentPhase string `json:"current_phase,omitempty"`
+ETASeconds *int `json:"eta_seconds,omitempty"`
+Logs []string `json:"logs,omitempty"`
+Error string `json:"error,omitempty"`
+ActiveModules []CollectModuleStatus `json:"active_modules,omitempty"`
+ModuleScores []CollectModuleStatus `json:"module_scores,omitempty"`
+DebugInfo *CollectDebugInfo `json:"debug_info,omitempty"`
+CreatedAt time.Time `json:"created_at,omitempty"`
+UpdatedAt time.Time `json:"updated_at"`
}
type CollectRequestMeta struct {
@@ -48,27 +63,64 @@ type CollectRequestMeta struct {
}
type Job struct {
-ID string
-Status string
-Progress int
-Logs []string
-Error string
-CreatedAt time.Time
-UpdatedAt time.Time
-RequestMeta CollectRequestMeta
-cancel func()
+ID string
+Status string
+Progress int
+CurrentPhase string
+ETASeconds int
+Logs []string
+Error string
+ActiveModules []CollectModuleStatus
+ModuleScores []CollectModuleStatus
+DebugInfo *CollectDebugInfo
+CreatedAt time.Time
+UpdatedAt time.Time
+RequestMeta CollectRequestMeta
+cancel func()
}
type CollectModuleStatus struct {
Name string `json:"name"`
Score int `json:"score"`
Active bool `json:"active,omitempty"`
Priority int `json:"priority,omitempty"`
}
type CollectDebugInfo struct {
AdaptiveThrottled bool `json:"adaptive_throttled"`
SnapshotWorkers int `json:"snapshot_workers,omitempty"`
PrefetchWorkers int `json:"prefetch_workers,omitempty"`
PrefetchEnabled *bool `json:"prefetch_enabled,omitempty"`
PhaseTelemetry []CollectPhaseTelemetry `json:"phase_telemetry,omitempty"`
}
type CollectPhaseTelemetry struct {
Phase string `json:"phase"`
Requests int `json:"requests,omitempty"`
Errors int `json:"errors,omitempty"`
ErrorRate float64 `json:"error_rate,omitempty"`
AvgMS int64 `json:"avg_ms,omitempty"`
P95MS int64 `json:"p95_ms,omitempty"`
}
func (j *Job) toStatusResponse() CollectJobStatusResponse {
progress := j.Progress
resp := CollectJobStatusResponse{
-JobID: j.ID,
-Status: j.Status,
-Progress: &progress,
-Logs: append([]string(nil), j.Logs...),
-Error: j.Error,
-CreatedAt: j.CreatedAt,
-UpdatedAt: j.UpdatedAt,
+JobID: j.ID,
+Status: j.Status,
+Progress: &progress,
+CurrentPhase: j.CurrentPhase,
+Logs: append([]string(nil), j.Logs...),
+Error: j.Error,
+ActiveModules: append([]CollectModuleStatus(nil), j.ActiveModules...),
+ModuleScores: append([]CollectModuleStatus(nil), j.ModuleScores...),
+DebugInfo: cloneCollectDebugInfo(j.DebugInfo),
+CreatedAt: j.CreatedAt,
+UpdatedAt: j.UpdatedAt,
}
if j.ETASeconds > 0 {
eta := j.ETASeconds
resp.ETASeconds = &eta
}
return resp
}
@@ -81,3 +133,16 @@ func (j *Job) toJobResponse(message string) CollectJobResponse {
CreatedAt: j.CreatedAt,
}
}
func cloneCollectDebugInfo(in *CollectDebugInfo) *CollectDebugInfo {
if in == nil {
return nil
}
out := *in
out.PhaseTelemetry = append([]CollectPhaseTelemetry(nil), in.PhaseTelemetry...)
if in.PrefetchEnabled != nil {
value := *in.PrefetchEnabled
out.PrefetchEnabled = &value
}
return &out
}
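`cloneCollectDebugInfo` exists so a status response can never alias the live job's mutable state: a plain struct copy would still share the telemetry slice backing array and the `PrefetchEnabled` pointer. A sketch of the deep-copy pattern with simplified hypothetical types (`debugInfo`, `cloneDebug` are illustration names only):

```go
package main

import "fmt"

// debugInfo stands in for a status struct with a slice and a pointer field.
type debugInfo struct {
	Workers  int
	Phases   []string
	Prefetch *bool
}

// cloneDebug copies the struct, then re-allocates the slice and the
// pointed-to bool so mutations on the clone cannot reach the original.
func cloneDebug(in *debugInfo) *debugInfo {
	if in == nil {
		return nil
	}
	out := *in
	out.Phases = append([]string(nil), in.Phases...)
	if in.Prefetch != nil {
		v := *in.Prefetch
		out.Prefetch = &v
	}
	return &out
}

func main() {
	on := true
	orig := &debugInfo{Workers: 4, Phases: []string{"discovery"}, Prefetch: &on}
	cp := cloneDebug(orig)
	cp.Phases[0] = "mutated"
	*cp.Prefetch = false
	fmt.Println(orig.Phases[0], *orig.Prefetch) // discovery true
}
```

The `append([]T(nil), src...)` idiom is the same one `toStatusResponse` uses for `Logs`, `ActiveModules`, and `ModuleScores`.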

View File

@@ -21,8 +21,10 @@ import (
"git.mchus.pro/mchus/logpile/internal/collector"
"git.mchus.pro/mchus/logpile/internal/exporter"
"git.mchus.pro/mchus/logpile/internal/ingest"
"git.mchus.pro/mchus/logpile/internal/models"
"git.mchus.pro/mchus/logpile/internal/parser"
chartviewer "reanimator/chart/viewer"
)
func (s *Server) handleIndex(w http.ResponseWriter, r *http.Request) {
@@ -47,6 +49,82 @@ func (s *Server) handleIndex(w http.ResponseWriter, r *http.Request) {
tmpl.Execute(w, nil)
}
func (s *Server) handleChartCurrent(w http.ResponseWriter, r *http.Request) {
result := s.GetResult()
title := chartTitle(result)
if result == nil || result.Hardware == nil {
html, err := chartviewer.RenderHTML(nil, title)
if err != nil {
http.Error(w, "failed to render viewer", http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "text/html; charset=utf-8")
_, _ = w.Write(rewriteChartStaticPaths(html))
return
}
snapshotBytes, err := currentReanimatorSnapshotBytes(result)
if err != nil {
http.Error(w, "failed to build reanimator snapshot: "+err.Error(), http.StatusInternalServerError)
return
}
html, err := chartviewer.RenderHTML(snapshotBytes, title)
if err != nil {
http.Error(w, "failed to render chart: "+err.Error(), http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "text/html; charset=utf-8")
_, _ = w.Write(rewriteChartStaticPaths(html))
}
func currentReanimatorSnapshotBytes(result *models.AnalysisResult) ([]byte, error) {
reanimatorData, err := exporter.ConvertToReanimator(result)
if err != nil {
return nil, err
}
var buf bytes.Buffer
encoder := json.NewEncoder(&buf)
encoder.SetIndent("", " ")
if err := encoder.Encode(reanimatorData); err != nil {
return nil, err
}
return buf.Bytes(), nil
}
func chartTitle(result *models.AnalysisResult) string {
const fallback = "LOGPile Reanimator Viewer"
if result == nil {
return fallback
}
if result.Hardware != nil {
board := result.Hardware.BoardInfo
product := strings.TrimSpace(board.ProductName)
serial := strings.TrimSpace(board.SerialNumber)
switch {
case product != "" && serial != "":
return product + " - " + serial
case product != "":
return product
case serial != "":
return serial
}
}
if host := strings.TrimSpace(result.TargetHost); host != "" {
return host
}
if filename := strings.TrimSpace(result.Filename); filename != "" {
return filename
}
return fallback
}
func rewriteChartStaticPaths(html []byte) []byte {
return bytes.ReplaceAll(html, []byte(`href="/static/view.css"`), []byte(`href="/chart/static/view.css"`))
}
func (s *Server) handleUpload(w http.ResponseWriter, r *http.Request) {
r.Body = http.MaxBytesReader(w, r.Body, uploadMultipartMaxBytes())
if err := r.ParseMultipartForm(uploadMultipartFormMemoryBytes()); err != nil {
@@ -142,13 +220,12 @@ func (s *Server) analyzeUploadedFile(filename, mimeType string, payload []byte)
return nil, "", nil, fmt.Errorf("unsupported archive format: %s", strings.ToLower(filepath.Ext(filename)))
}
-p := parser.NewBMCParser()
-if err := p.ParseFromReader(bytes.NewReader(payload), filename); err != nil {
+result, vendor, err := s.ingestService().AnalyzeArchivePayload(filename, payload)
+if err != nil {
return nil, "", nil, err
}
-result := p.Result()
-applyArchiveSourceMetadata(result)
-return result, p.DetectedVendor(), newRawExportFromUploadedFile(filename, mimeType, payload, result), nil
+return result, vendor, newRawExportFromUploadedFile(filename, mimeType, payload, result), nil
}
func uploadMultipartMaxBytes() int64 {
@@ -220,33 +297,18 @@ func (s *Server) reanalyzeRawExportPackage(pkg *RawExportPackage) (*models.Analy
if !strings.EqualFold(strings.TrimSpace(pkg.Source.Protocol), "redfish") {
return nil, "", fmt.Errorf("unsupported live protocol: %s", pkg.Source.Protocol)
}
-result, err := collector.ReplayRedfishFromRawPayloads(pkg.Source.RawPayloads, nil)
+result, vendor, err := s.ingestService().AnalyzeRedfishRawPayloads(pkg.Source.RawPayloads, ingest.RedfishSourceMetadata{
+TargetHost: pkg.Source.TargetHost,
+SourceTimezone: pkg.Source.SourceTimezone,
+Filename: pkg.Source.Filename,
+})
if err != nil {
return nil, "", err
}
-if result != nil {
-if strings.TrimSpace(result.Protocol) == "" {
-result.Protocol = "redfish"
-}
-if strings.TrimSpace(result.SourceType) == "" {
-result.SourceType = models.SourceTypeAPI
-}
-if strings.TrimSpace(result.TargetHost) == "" {
-result.TargetHost = strings.TrimSpace(pkg.Source.TargetHost)
-}
-if strings.TrimSpace(result.SourceTimezone) == "" {
-result.SourceTimezone = strings.TrimSpace(pkg.Source.SourceTimezone)
-}
-result.CollectedAt = inferRawExportCollectedAt(result, pkg)
-if strings.TrimSpace(result.Filename) == "" {
-target := result.TargetHost
-if target == "" {
-target = "snapshot"
-}
-result.Filename = "redfish://" + target
-}
-}
-return result, "redfish", nil
+return result, vendor, nil
default:
return nil, "", fmt.Errorf("unsupported raw export source kind: %s", pkg.Source.Kind)
}
@@ -265,13 +327,12 @@ func (s *Server) parseUploadedPayload(filename string, payload []byte) (*models.
return snapshotResult, vendor, nil
}
-p := parser.NewBMCParser()
-if err := p.ParseFromReader(bytes.NewReader(payload), filename); err != nil {
+result, vendor, err := s.ingestService().AnalyzeArchivePayload(filename, payload)
+if err != nil {
return nil, "", err
}
-result := p.Result()
-applyArchiveSourceMetadata(result)
-return result, p.DetectedVendor(), nil
+return result, vendor, nil
}
func (s *Server) handleGetParsers(w http.ResponseWriter, r *http.Request) {
@@ -1510,6 +1571,69 @@ func (s *Server) handleCollectStart(w http.ResponseWriter, r *http.Request) {
_ = json.NewEncoder(w).Encode(job.toJobResponse("Collection job accepted"))
}
func (s *Server) handleCollectProbe(w http.ResponseWriter, r *http.Request) {
var req CollectRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
jsonError(w, "Invalid JSON body", http.StatusBadRequest)
return
}
if err := validateCollectRequest(req); err != nil {
jsonError(w, err.Error(), http.StatusBadRequest)
return
}
connector, ok := s.getCollector(req.Protocol)
if !ok {
jsonError(w, "Коннектор для протокола не зарегистрирован", http.StatusBadRequest)
return
}
prober, ok := connector.(collector.Prober)
if !ok {
jsonError(w, "Проверка подключения для протокола не поддерживается", http.StatusBadRequest)
return
}
ctx, cancel := context.WithTimeout(r.Context(), 20*time.Second)
defer cancel()
result, err := prober.Probe(ctx, toCollectorRequest(req))
if err != nil {
jsonError(w, "Проверка подключения не удалась: "+err.Error(), http.StatusBadRequest)
return
}
message := "Связь с BMC установлена"
if result != nil {
switch {
case !result.HostPoweredOn && result.PowerControlAvailable:
message = "Связь с BMC установлена, host выключен. Можно включить перед сбором."
case !result.HostPoweredOn:
message = "Связь с BMC установлена, host выключен."
default:
message = "Связь с BMC установлена, host включен."
}
}
hostPowerState := ""
hostPoweredOn := false
powerControlAvailable := false
reachable := false
if result != nil {
reachable = result.Reachable
hostPowerState = strings.TrimSpace(result.HostPowerState)
hostPoweredOn = result.HostPoweredOn
powerControlAvailable = result.PowerControlAvailable
}
jsonResponse(w, CollectProbeResponse{
Reachable: reachable,
Protocol: req.Protocol,
HostPowerState: hostPowerState,
HostPoweredOn: hostPoweredOn,
PowerControlAvailable: powerControlAvailable,
Message: message,
})
}
func (s *Server) handleCollectStatus(w http.ResponseWriter, r *http.Request) {
jobID := strings.TrimSpace(r.PathValue("id"))
if !isValidCollectJobID(jobID) {
@@ -1566,6 +1690,51 @@ func (s *Server) startCollectionJob(jobID string, req CollectRequest) {
status = CollectStatusRunning
}
s.jobManager.UpdateJobStatus(jobID, status, update.Progress, "")
if update.CurrentPhase != "" || update.ETASeconds > 0 {
s.jobManager.UpdateJobETA(jobID, update.CurrentPhase, update.ETASeconds)
}
if update.DebugInfo != nil {
debugInfo := &CollectDebugInfo{
AdaptiveThrottled: update.DebugInfo.AdaptiveThrottled,
SnapshotWorkers: update.DebugInfo.SnapshotWorkers,
PrefetchWorkers: update.DebugInfo.PrefetchWorkers,
PrefetchEnabled: update.DebugInfo.PrefetchEnabled,
}
if len(update.DebugInfo.PhaseTelemetry) > 0 {
debugInfo.PhaseTelemetry = make([]CollectPhaseTelemetry, 0, len(update.DebugInfo.PhaseTelemetry))
for _, item := range update.DebugInfo.PhaseTelemetry {
debugInfo.PhaseTelemetry = append(debugInfo.PhaseTelemetry, CollectPhaseTelemetry{
Phase: item.Phase,
Requests: item.Requests,
Errors: item.Errors,
ErrorRate: item.ErrorRate,
AvgMS: item.AvgMS,
P95MS: item.P95MS,
})
}
}
s.jobManager.UpdateJobDebugInfo(jobID, debugInfo)
}
if len(update.ActiveModules) > 0 || len(update.ModuleScores) > 0 {
activeModules := make([]CollectModuleStatus, 0, len(update.ActiveModules))
for _, module := range update.ActiveModules {
activeModules = append(activeModules, CollectModuleStatus{
Name: module.Name,
Score: module.Score,
Active: true,
})
}
moduleScores := make([]CollectModuleStatus, 0, len(update.ModuleScores))
for _, module := range update.ModuleScores {
moduleScores = append(moduleScores, CollectModuleStatus{
Name: module.Name,
Score: module.Score,
Active: module.Active,
Priority: module.Priority,
})
}
s.jobManager.UpdateJobModules(jobID, activeModules, moduleScores)
}
if update.Message != "" {
s.jobManager.AppendJobLog(jobID, update.Message)
}
@@ -1787,14 +1956,15 @@ func applyCollectSourceMetadata(result *models.AnalysisResult, req CollectReques
func toCollectorRequest(req CollectRequest) collector.Request {
return collector.Request{
Host: req.Host,
Protocol: req.Protocol,
Port: req.Port,
Username: req.Username,
AuthType: req.AuthType,
Password: req.Password,
Token: req.Token,
TLSMode: req.TLSMode,
Host: req.Host,
Protocol: req.Protocol,
Port: req.Port,
Username: req.Username,
AuthType: req.AuthType,
Password: req.Password,
Token: req.Token,
TLSMode: req.TLSMode,
PowerOnIfHostOff: req.PowerOnIfHostOff,
}
}

View File

@@ -128,6 +128,53 @@ func (m *JobManager) AppendJobLog(id, message string) (*Job, bool) {
return cloned, true
}
func (m *JobManager) UpdateJobModules(id string, activeModules, moduleScores []CollectModuleStatus) (*Job, bool) {
m.mu.Lock()
job, ok := m.jobs[id]
if !ok || job == nil {
m.mu.Unlock()
return nil, false
}
job.ActiveModules = append([]CollectModuleStatus(nil), activeModules...)
job.ModuleScores = append([]CollectModuleStatus(nil), moduleScores...)
job.UpdatedAt = time.Now().UTC()
cloned := cloneJob(job)
m.mu.Unlock()
return cloned, true
}
func (m *JobManager) UpdateJobETA(id, phase string, etaSeconds int) (*Job, bool) {
m.mu.Lock()
job, ok := m.jobs[id]
if !ok || job == nil {
m.mu.Unlock()
return nil, false
}
job.CurrentPhase = phase
job.ETASeconds = etaSeconds
job.UpdatedAt = time.Now().UTC()
cloned := cloneJob(job)
m.mu.Unlock()
return cloned, true
}
func (m *JobManager) UpdateJobDebugInfo(id string, info *CollectDebugInfo) (*Job, bool) {
m.mu.Lock()
job, ok := m.jobs[id]
if !ok || job == nil {
m.mu.Unlock()
return nil, false
}
job.DebugInfo = cloneCollectDebugInfo(info)
job.UpdatedAt = time.Now().UTC()
cloned := cloneJob(job)
m.mu.Unlock()
return cloned, true
}
func (m *JobManager) AttachJobCancel(id string, cancelFn context.CancelFunc) bool {
m.mu.Lock()
defer m.mu.Unlock()
@@ -176,6 +223,11 @@ func cloneJob(job *Job) *Job {
}
cloned := *job
cloned.Logs = append([]string(nil), job.Logs...)
cloned.ActiveModules = append([]CollectModuleStatus(nil), job.ActiveModules...)
cloned.ModuleScores = append([]CollectModuleStatus(nil), job.ModuleScores...)
cloned.DebugInfo = cloneCollectDebugInfo(job.DebugInfo)
cloned.CurrentPhase = job.CurrentPhase
cloned.ETASeconds = job.ETASeconds
cloned.cancel = nil
return &cloned
}

View File
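The `cloneJob` change above follows the clone-on-read pattern: a shallow struct copy still aliases slice backing arrays, so the slice fields must be re-allocated before the clone escapes the mutex. A stripped-down sketch with a hypothetical `job` type (not the real `Job` struct) illustrates why:

```go
package main

import "fmt"

// Hypothetical stand-in for the manager's job record; only one
// slice field is kept to show the aliasing hazard.
type job struct {
	Logs []string
}

func clone(j *job) *job {
	c := *j // shallow copy: c.Logs still shares j.Logs' backing array
	c.Logs = append([]string(nil), j.Logs...) // re-slice so the clone owns its data
	return &c
}

func main() {
	orig := &job{Logs: []string{"queued"}}
	c := clone(orig)
	orig.Logs[0] = "mutated"
	fmt.Println(c.Logs[0]) // prints "queued": the clone is unaffected
}
```

Without the `append([]string(nil), ...)` copy, a caller holding the clone would observe mutations made later under the manager's lock.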

@@ -0,0 +1,72 @@
package server
import (
"os"
"strings"
"testing"
"git.mchus.pro/mchus/logpile/internal/models"
)
// TestManualInspectInput is a persistent local debugging harness for checking
// how the current server code analyzes a real input file. It is skipped unless
// LOGPILE_MANUAL_INPUT points to a file on disk.
//
// Usage:
//
// LOGPILE_MANUAL_INPUT=/abs/path/to/file.zip go test ./internal/server -run TestManualInspectInput -v
func TestManualInspectInput(t *testing.T) {
path := strings.TrimSpace(os.Getenv("LOGPILE_MANUAL_INPUT"))
if path == "" {
t.Skip("set LOGPILE_MANUAL_INPUT to inspect a real input file")
}
payload, err := os.ReadFile(path)
if err != nil {
t.Fatalf("read input: %v", err)
}
s := &Server{}
filename := path
if rawPkg, ok, err := parseRawExportBundle(payload); err != nil {
t.Fatalf("parseRawExportBundle: %v", err)
} else if ok {
result, vendor, err := s.reanalyzeRawExportPackage(rawPkg)
if err != nil {
t.Fatalf("reanalyzeRawExportPackage: %v", err)
}
logManualAnalysisResult(t, "raw_export_bundle", vendor, result)
return
}
result, vendor, err := s.parseUploadedPayload(filename, payload)
if err != nil {
t.Fatalf("parseUploadedPayload: %v", err)
}
logManualAnalysisResult(t, "uploaded_payload", vendor, result)
}
func logManualAnalysisResult(t *testing.T, mode, vendor string, result *models.AnalysisResult) {
t.Helper()
if result == nil || result.Hardware == nil {
t.Fatalf("missing hardware result")
}
t.Logf("mode=%s vendor=%s source_type=%s protocol=%s target=%s", mode, vendor, result.SourceType, result.Protocol, result.TargetHost)
t.Logf("counts: gpus=%d pcie=%d cpus=%d memory=%d storage=%d nics=%d psus=%d",
len(result.Hardware.GPUs),
len(result.Hardware.PCIeDevices),
len(result.Hardware.CPUs),
len(result.Hardware.Memory),
len(result.Hardware.Storage),
len(result.Hardware.NetworkAdapters),
len(result.Hardware.PowerSupply),
)
for i, g := range result.Hardware.GPUs {
t.Logf("gpu[%d]: slot=%s model=%s bdf=%s serial=%s status=%s", i, g.Slot, g.Model, g.BDF, g.SerialNumber, g.Status)
}
for i, p := range result.Hardware.PCIeDevices {
t.Logf("pcie[%d]: slot=%s class=%s model=%s bdf=%s serial=%s vendor=%s", i, p.Slot, p.DeviceClass, p.PartNumber, p.BDF, p.SerialNumber, p.Manufacturer)
}
}

View File

@@ -32,17 +32,17 @@ type RawExportPackage struct {
}
type RawExportSource struct {
Kind string `json:"kind"` // file_bytes | live_redfish | snapshot_json
Filename string `json:"filename,omitempty"`
MIMEType string `json:"mime_type,omitempty"`
Encoding string `json:"encoding,omitempty"` // base64
Data string `json:"data,omitempty"`
Protocol string `json:"protocol,omitempty"`
TargetHost string `json:"target_host,omitempty"`
SourceTimezone string `json:"source_timezone,omitempty"`
RawPayloads map[string]any `json:"raw_payloads,omitempty"`
CollectLogs []string `json:"collect_logs,omitempty"`
CollectMeta *CollectRequestMeta `json:"collect_meta,omitempty"`
Kind string `json:"kind"` // file_bytes | live_redfish | snapshot_json
Filename string `json:"filename,omitempty"`
MIMEType string `json:"mime_type,omitempty"`
Encoding string `json:"encoding,omitempty"` // base64
Data string `json:"data,omitempty"`
Protocol string `json:"protocol,omitempty"`
TargetHost string `json:"target_host,omitempty"`
SourceTimezone string `json:"source_timezone,omitempty"`
RawPayloads map[string]any `json:"raw_payloads,omitempty"`
CollectLogs []string `json:"collect_logs,omitempty"`
CollectMeta *CollectRequestMeta `json:"collect_meta,omitempty"`
}
func newRawExportFromUploadedFile(filename, mimeType string, payload []byte, result *models.AnalysisResult) *RawExportPackage {
@@ -50,13 +50,13 @@ func newRawExportFromUploadedFile(filename, mimeType string, payload []byte, res
Format: rawExportFormatV1,
ExportedAt: time.Now().UTC(),
Source: RawExportSource{
Kind: "file_bytes",
Filename: filename,
MIMEType: mimeType,
Encoding: "base64",
Data: base64.StdEncoding.EncodeToString(payload),
Protocol: resultProtocol(result),
TargetHost: resultTargetHost(result),
Kind: "file_bytes",
Filename: filename,
MIMEType: mimeType,
Encoding: "base64",
Data: base64.StdEncoding.EncodeToString(payload),
Protocol: resultProtocol(result),
TargetHost: resultTargetHost(result),
SourceTimezone: resultSourceTimezone(result),
},
}
@@ -81,13 +81,13 @@ func newRawExportFromLiveCollect(result *models.AnalysisResult, req CollectReque
Format: rawExportFormatV1,
ExportedAt: time.Now().UTC(),
Source: RawExportSource{
Kind: "live_redfish",
Protocol: req.Protocol,
TargetHost: req.Host,
Kind: "live_redfish",
Protocol: req.Protocol,
TargetHost: req.Host,
SourceTimezone: resultSourceTimezone(result),
RawPayloads: rawPayloads,
CollectLogs: append([]string(nil), logs...),
CollectMeta: &meta,
RawPayloads: rawPayloads,
CollectLogs: append([]string(nil), logs...),
CollectMeta: &meta,
},
}
}
@@ -386,6 +386,10 @@ func buildParserFieldSummary(result *models.AnalysisResult) map[string]any {
return out
}
hw := result.Hardware
out["vendor"] = hw.BoardInfo.Manufacturer
out["model"] = hw.BoardInfo.ProductName
out["serial"] = hw.BoardInfo.SerialNumber
out["part_number"] = hw.BoardInfo.PartNumber
out["hardware"] = map[string]any{
"board": hw.BoardInfo,
"counts": map[string]int{

View File
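The `buildParserFieldSummary` hunk above lifts four `BoardInfo` fields to top-level keys alongside the nested `hardware` payload. A minimal sketch of that mirroring, using an assumed stand-in type rather than the real `models.BoardInfo`:

```go
package main

import "fmt"

// Assumed stand-in for models.BoardInfo; only the four mirrored
// fields from the hunk above are included.
type boardInfo struct {
	Manufacturer, ProductName, SerialNumber, PartNumber string
}

func summarize(b boardInfo) map[string]any {
	out := map[string]any{}
	// Mirror board identity to top-level keys so consumers can
	// read vendor/model/serial without digging into "hardware".
	out["vendor"] = b.Manufacturer
	out["model"] = b.ProductName
	out["serial"] = b.SerialNumber
	out["part_number"] = b.PartNumber
	return out
}

func main() {
	got := summarize(boardInfo{Manufacturer: "Supermicro", ProductName: "SYS-821GE-TNHR"})
	fmt.Println(got["vendor"], got["model"]) // prints "Supermicro SYS-821GE-TNHR"
}
```

This is exactly the shape the `TestBuildParserFieldSummary_MirrorsBoardInfoToTopLevel` test below asserts on.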

@@ -1,101 +1,35 @@
package server
import (
"archive/zip"
"bytes"
"encoding/json"
"strings"
"testing"
"time"
"git.mchus.pro/mchus/logpile/internal/models"
)
func TestCollectLogTimeBounds(t *testing.T) {
lines := []string{
"2026-02-28T13:10:13.7442032Z Job queued",
"not-a-timestamp line",
"2026-02-28T13:31:00.5077486Z Collection finished",
}
startedAt, finishedAt, ok := collectLogTimeBounds(lines)
if !ok {
t.Fatalf("expected bounds to be parsed")
}
if got := formatRawExportDuration(finishedAt.Sub(startedAt)); got != "20m47s" {
t.Fatalf("unexpected duration: %s", got)
}
}
func TestBuildHumanReadableCollectionLog_IncludesDurationHeader(t *testing.T) {
pkg := &RawExportPackage{
Format: rawExportFormatV1,
Source: RawExportSource{
Kind: "live_redfish",
CollectLogs: []string{
"2026-02-28T13:10:13.7442032Z Redfish: connecting to BMC...",
"2026-02-28T13:31:00.5077486Z Collection finished",
func TestBuildParserFieldSummary_MirrorsBoardInfoToTopLevel(t *testing.T) {
result := &models.AnalysisResult{
Hardware: &models.HardwareConfig{
BoardInfo: models.BoardInfo{
Manufacturer: "Supermicro",
ProductName: "SYS-821GE-TNHR",
SerialNumber: "A514359X5C08846",
PartNumber: "SYS-821GE-TNHR",
},
},
}
logText := buildHumanReadableCollectionLog(pkg, nil, "LOGPile test")
for _, token := range []string{
"Collection Started:",
"Collection Finished:",
"Collection Duration:",
} {
if !strings.Contains(logText, token) {
t.Fatalf("expected %q in log header", token)
}
}
}
func TestParseRawExportBundle_ExtractsCollectedAtHintFromParserFields(t *testing.T) {
pkg := &RawExportPackage{
Format: rawExportFormatV1,
ExportedAt: time.Date(2026, 2, 25, 9, 59, 41, 479023400, time.UTC),
Source: RawExportSource{
Kind: "live_redfish",
},
}
pkgJSON, err := json.Marshal(pkg)
if err != nil {
t.Fatalf("marshal pkg: %v", err)
}
parserFields := []byte(`{"collected_at":"2026-02-25T09:58:05.9129753Z"}`)
var buf bytes.Buffer
zw := zip.NewWriter(&buf)
jf, err := zw.Create(rawExportBundlePackageFile)
if err != nil {
t.Fatalf("create package file: %v", err)
}
if _, err := jf.Write(pkgJSON); err != nil {
t.Fatalf("write package file: %v", err)
}
ff, err := zw.Create(rawExportBundleFieldsFile)
if err != nil {
t.Fatalf("create parser fields file: %v", err)
}
if _, err := ff.Write(parserFields); err != nil {
t.Fatalf("write parser fields file: %v", err)
}
if err := zw.Close(); err != nil {
t.Fatalf("close zip writer: %v", err)
}
gotPkg, ok, err := parseRawExportBundle(buf.Bytes())
if err != nil {
t.Fatalf("parse bundle: %v", err)
}
if !ok || gotPkg == nil {
t.Fatalf("expected valid raw export bundle")
}
want := time.Date(2026, 2, 25, 9, 58, 5, 912975300, time.UTC)
if !gotPkg.CollectedAtHint.Equal(want) {
t.Fatalf("expected collected_at hint %s, got %s", want, gotPkg.CollectedAtHint)
got := buildParserFieldSummary(result)
if got["vendor"] != "Supermicro" {
t.Fatalf("expected vendor mirror, got %v", got["vendor"])
}
if got["model"] != "SYS-821GE-TNHR" {
t.Fatalf("expected model mirror, got %v", got["model"])
}
if got["serial"] != "A514359X5C08846" {
t.Fatalf("expected serial mirror, got %v", got["serial"])
}
if got["part_number"] != "SYS-821GE-TNHR" {
t.Fatalf("expected part_number mirror, got %v", got["part_number"])
}
}

View File

@@ -0,0 +1,81 @@
package server
import (
"os"
"path/filepath"
"strings"
"testing"
"git.mchus.pro/mchus/logpile/internal/exporter"
)
func TestReanimatorExport_RedfishExampleDoesNotDuplicateCPUs(t *testing.T) {
payload, err := os.ReadFile(filepath.Join("..", "..", "example", "2026-03-11 (SYS-821GE-TNHR) - A514359X5C08846.zip"))
if err != nil {
t.Fatalf("read example bundle: %v", err)
}
rawPkg, ok, err := parseRawExportBundle(payload)
if err != nil {
t.Fatalf("parse raw export bundle: %v", err)
}
if !ok {
t.Fatalf("example bundle was not recognized as raw export bundle")
}
s := &Server{}
result, vendor, err := s.reanalyzeRawExportPackage(rawPkg)
if err != nil {
t.Fatalf("reanalyze raw export bundle: %v", err)
}
if vendor != "redfish" {
t.Fatalf("expected redfish vendor, got %q", vendor)
}
if result == nil || result.Hardware == nil {
t.Fatalf("expected parsed hardware result")
}
if got := len(result.Hardware.CPUs); got != 2 {
t.Fatalf("expected 2 CPUs after replay, got %d", got)
}
reanimatorData, err := exporter.ConvertToReanimator(result)
if err != nil {
t.Fatalf("ConvertToReanimator: %v", err)
}
if got := len(reanimatorData.Hardware.CPUs); got != 2 {
t.Fatalf("expected 2 CPUs in reanimator export, got %d", got)
}
for i, cpu := range reanimatorData.Hardware.CPUs {
if cpu.SerialNumber != "" {
t.Fatalf("expected CPU %d serial to stay empty, got %q", i, cpu.SerialNumber)
}
}
for _, dev := range reanimatorData.Hardware.PCIeDevices {
joined := strings.ToLower(strings.Join([]string{dev.Slot, dev.Model, dev.DeviceClass}, " "))
if strings.Contains(joined, "nvme") || strings.Contains(joined, "ssd") || strings.Contains(joined, "disk") {
t.Fatalf("expected storage endpoint to stay out of pcie export, got %+v", dev)
}
if strings.EqualFold(dev.Model, "Network Device View") {
t.Fatalf("expected placeholder network model to be replaced, got %+v", dev)
}
if strings.Contains(dev.SerialNumber, "-PCIE-") {
t.Fatalf("expected no synthetic pcie serials, got %+v", dev)
}
if isNumericOnly(dev.Slot) && dev.DeviceClass == "NetworkController" && dev.Model == "" && dev.SerialNumber == "" {
t.Fatalf("expected placeholder numeric-slot network export to be skipped, got %+v", dev)
}
}
}
func isNumericOnly(v string) bool {
v = strings.TrimSpace(v)
if v == "" {
return false
}
for _, r := range v {
if r < '0' || r > '9' {
return false
}
}
return true
}

View File

@@ -10,7 +10,9 @@ import (
"time"
"git.mchus.pro/mchus/logpile/internal/collector"
"git.mchus.pro/mchus/logpile/internal/ingest"
"git.mchus.pro/mchus/logpile/internal/models"
chartviewer "reanimator/chart/viewer"
)
// WebFS holds embedded web files (set by main package)
@@ -37,6 +39,7 @@ type Server struct {
jobManager *JobManager
collectors *collector.Registry
ingest *ingest.Service
}
type ConvertArtifact struct {
@@ -50,6 +53,7 @@ func New(cfg Config) *Server {
mux: http.NewServeMux(),
jobManager: NewJobManager(),
collectors: collector.NewDefaultRegistry(),
ingest: ingest.NewService(),
convertJobs: make(map[string]struct{}),
convertOutput: make(map[string]ConvertArtifact),
}
@@ -64,9 +68,13 @@ func (s *Server) setupRoutes() {
panic(err)
}
s.mux.Handle("/static/", http.StripPrefix("/static/", http.FileServer(http.FS(staticContent))))
s.mux.Handle("/chart/", http.StripPrefix("/chart", chartviewer.NewHandler(chartviewer.HandlerOptions{
Title: "LOGPile Reanimator Viewer",
})))
// Pages
s.mux.HandleFunc("/", s.handleIndex)
s.mux.HandleFunc("GET /chart/current", s.handleChartCurrent)
// API endpoints
s.mux.HandleFunc("POST /api/upload", s.handleUpload)
@@ -88,6 +96,7 @@ func (s *Server) setupRoutes() {
s.mux.HandleFunc("DELETE /api/clear", s.handleClear)
s.mux.HandleFunc("POST /api/shutdown", s.handleShutdown)
s.mux.HandleFunc("POST /api/collect", s.handleCollectStart)
s.mux.HandleFunc("POST /api/collect/probe", s.handleCollectProbe)
s.mux.HandleFunc("GET /api/collect/{id}", s.handleCollectStatus)
s.mux.HandleFunc("POST /api/collect/{id}/cancel", s.handleCollectCancel)
}
@@ -154,6 +163,17 @@ func (s *Server) ClientVersionString() string {
return fmt.Sprintf("LOGPile %s (commit: %s)", v, c)
}
func (s *Server) ingestService() *ingest.Service {
if s != nil && s.ingest != nil {
return s.ingest
}
svc := ingest.NewService()
if s != nil {
s.ingest = svc
}
return svc
}
// SetDetectedVendor sets the detected vendor name
func (s *Server) SetDetectedVendor(vendor string) {
s.mu.Lock()

View File

@@ -24,7 +24,9 @@ echo ""
# Stable build env for this machine/toolchain
export GOPATH="${GOPATH:-/tmp/go}"
export GOCACHE="${GOCACHE:-/tmp/gocache}"
export GOTOOLCHAIN="${GOTOOLCHAIN:-go1.22.12}"
# Use the locally installed Go toolchain by default.
# A pinned or auto-downloaded toolchain can still be requested via env override.
export GOTOOLCHAIN="${GOTOOLCHAIN:-local}"
mkdir -p "${GOPATH}" "${GOCACHE}"
# Create release directory

View File

@@ -17,6 +17,18 @@ header {
padding: 1rem 2rem;
}
.app-header-row {
display: flex;
align-items: flex-start;
justify-content: space-between;
gap: 1rem 2rem;
flex-wrap: wrap;
}
.app-header-brand {
min-width: 0;
}
header h1 {
font-size: 1.5rem;
font-weight: 600;
@@ -33,9 +45,34 @@ header p {
opacity: 0.7;
}
.header-log-meta {
display: flex;
align-items: center;
gap: 0.75rem;
flex-wrap: wrap;
justify-content: flex-end;
}
.header-actions {
display: flex;
gap: 0.5rem;
flex-wrap: wrap;
justify-content: flex-end;
}
.header-actions button {
border: none;
border-radius: 6px;
padding: 0.55rem 0.9rem;
font-size: 0.85rem;
font-weight: 600;
cursor: pointer;
}
main {
max-width: 1400px;
margin: 2rem auto;
width: 100%;
max-width: none;
margin: 1rem 0 2rem;
padding: 0 1rem;
}
@@ -109,6 +146,10 @@ main {
color: #2c3e50;
}
#data-section {
margin: 0 -1rem;
}
.api-form-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(220px, 1fr));
@@ -169,7 +210,10 @@ main {
pointer-events: none;
}
#api-check-btn,
#api-connect-btn,
#api-power-on-collect-btn,
#api-collect-off-btn,
#convert-folder-btn,
#convert-run-btn,
#cancel-job-btn,
@@ -185,7 +229,10 @@ main {
transition: background-color 0.2s ease, opacity 0.2s ease;
}
#api-check-btn:hover,
#api-connect-btn:hover,
#api-power-on-collect-btn:hover,
#api-collect-off-btn:hover,
#convert-folder-btn:hover,
#convert-run-btn:hover,
#cancel-job-btn:hover,
@@ -195,7 +242,10 @@ main {
#convert-run-btn:disabled,
#convert-folder-btn:disabled,
#api-check-btn:disabled,
#api-connect-btn:disabled,
#api-power-on-collect-btn:disabled,
#api-collect-off-btn:disabled,
#cancel-job-btn:disabled,
.upload-area button:disabled {
opacity: 0.6;
@@ -219,6 +269,10 @@ main {
color: #0f4dba;
}
.api-connect-status.warning {
color: #b06a00;
}
.convert-progress {
margin-top: 0.9rem;
}
@@ -303,6 +357,82 @@ main {
transition: width 0.25s ease;
}
.job-active-modules {
margin-bottom: 0.85rem;
}
.job-module-chips {
display: flex;
flex-wrap: wrap;
gap: 0.45rem;
margin-top: 0.35rem;
}
.job-module-chip {
display: inline-flex;
align-items: center;
gap: 0.4rem;
background: #eef6ff;
border: 1px solid #bfdcff;
border-radius: 999px;
padding: 0.32rem 0.68rem;
line-height: 1;
}
.job-module-chip-name {
font-size: 0.82rem;
color: #1f2937;
font-weight: 600;
}
.job-module-chip-score {
font-size: 0.72rem;
color: #1d4ed8;
background: #dbeafe;
border: 1px solid #bfdbfe;
border-radius: 999px;
padding: 0.1rem 0.38rem;
font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, monospace;
}
.job-debug-info {
margin-bottom: 0.85rem;
border: 1px solid #dbe5f0;
background: #f8fbff;
border-radius: 8px;
padding: 0.75rem;
}
.job-debug-summary {
font-size: 0.82rem;
color: #334155;
margin-top: 0.35rem;
}
.job-phase-telemetry {
margin-top: 0.55rem;
display: grid;
gap: 0.35rem;
}
.job-phase-row {
display: grid;
grid-template-columns: minmax(120px, 180px) repeat(4, minmax(60px, auto));
gap: 0.5rem;
align-items: center;
font-size: 0.8rem;
}
.job-phase-name {
color: #0f172a;
font-weight: 600;
}
.job-phase-metric {
color: #475569;
font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, monospace;
}
.meta-label {
color: #64748b;
font-weight: 600;
@@ -424,18 +554,6 @@ main {
}
/* File Info */
.file-info {
background: #fff;
border: 1px solid #e0e0e0;
border-radius: 8px;
padding: 1rem 1.5rem;
margin-bottom: 1.5rem;
display: flex;
gap: 2rem;
flex-wrap: wrap;
align-items: center;
}
.parser-badge, .file-name {
display: flex;
align-items: center;
@@ -463,6 +581,28 @@ main {
border-color: #81c784;
}
.result-panel {
background: transparent;
border: none;
border-radius: 0;
padding: 0;
margin-bottom: 0;
}
.audit-viewer-shell {
min-height: 60vh;
margin: 0;
}
.audit-viewer-frame {
width: 100%;
min-height: 60vh;
border: none;
border-radius: 0;
background: #fff;
display: block;
}
/* Tabs */
.tabs {
display: flex;

View File

@@ -5,8 +5,7 @@ document.addEventListener('DOMContentLoaded', () => {
initApiSource();
initUpload();
initConvertMode();
initTabs();
initFilters();
initAuditViewer();
loadParsersInfo();
loadSupportedFileTypes();
});
@@ -14,6 +13,7 @@ document.addEventListener('DOMContentLoaded', () => {
let sourceType = 'archive';
let convertFiles = [];
let isConvertRunning = false;
let convertDuplicates = [];
const CONVERT_MAX_FILES_PER_BATCH = 1000;
let supportedUploadExtensions = null;
let supportedConvertExtensions = null;
@@ -23,6 +23,32 @@ let collectionJobPollTimer = null;
let collectionJobLogCounter = 0;
let apiPortTouchedByUser = false;
let isAutoUpdatingApiPort = false;
let apiProbeResult = null;
let apiPowerDecisionTimer = null;
function initAuditViewer() {
const frame = document.getElementById('audit-viewer-frame');
if (!frame) {
return;
}
frame.addEventListener('load', () => {
resizeAuditViewerFrame();
try {
const win = frame.contentWindow;
if (win) {
win.setTimeout(resizeAuditViewerFrame, 50);
win.setTimeout(resizeAuditViewerFrame, 250);
}
} catch (err) {
console.error('Failed to schedule viewer resize:', err);
}
});
window.addEventListener('resize', () => {
resizeAuditViewerFrame();
});
}
function initSourceType() {
const sourceButtons = document.querySelectorAll('.source-switch-btn');
@@ -65,22 +91,14 @@ function initApiSource() {
}
const cancelJobButton = document.getElementById('cancel-job-btn');
const checkButton = document.getElementById('api-check-btn');
const collectOffButton = document.getElementById('api-collect-off-btn');
const powerOnCollectButton = document.getElementById('api-power-on-collect-btn');
const fieldNames = ['host', 'port', 'username', 'password'];
apiForm.addEventListener('submit', (event) => {
event.preventDefault();
const { isValid, payload, errors } = validateCollectForm();
renderFormErrors(errors);
if (!isValid) {
renderApiConnectStatus(false, null);
apiConnectPayload = null;
return;
}
apiConnectPayload = payload;
renderApiConnectStatus(true, payload);
startCollectionJob(payload);
startCollectionFromCurrentProbe(false);
});
if (cancelJobButton) {
@@ -88,6 +106,23 @@ function initApiSource() {
cancelCollectionJob();
});
}
if (checkButton) {
checkButton.addEventListener('click', () => {
startApiProbe();
});
}
if (collectOffButton) {
collectOffButton.addEventListener('click', () => {
clearApiPowerDecisionTimer();
startCollectionFromCurrentProbe(false);
});
}
if (powerOnCollectButton) {
powerOnCollectButton.addEventListener('click', () => {
clearApiPowerDecisionTimer();
startCollectionFromCurrentProbe(true);
});
}
fieldNames.forEach((fieldName) => {
const field = apiForm.elements.namedItem(fieldName);
@@ -104,6 +139,7 @@ function initApiSource() {
const { errors } = validateCollectForm();
renderFormErrors(errors);
clearApiConnectStatus();
resetApiProbeState();
if (collectionJob && isCollectionJobTerminal(collectionJob.status)) {
resetCollectionJobState();
@@ -115,6 +151,144 @@ function initApiSource() {
renderCollectionJob();
}
function startApiProbe() {
const { isValid, payload, errors } = validateCollectForm();
renderFormErrors(errors);
if (!isValid) {
renderApiConnectStatus(false, null);
resetApiProbeState();
return;
}
apiConnectPayload = payload;
resetApiProbeState();
setApiFormBlocked(true);
renderApiConnectStatus(true, { ...payload, password: '***' });
fetch('/api/collect/probe', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(payload)
})
.then(async (response) => {
const body = await response.json().catch(() => ({}));
if (!response.ok) {
throw new Error(body.error || 'Connection probe failed');
}
apiProbeResult = body;
renderApiProbeState();
})
.catch((err) => {
resetApiProbeState();
renderApiConnectStatus(false, null);
const status = document.getElementById('api-connect-status');
if (status) {
status.textContent = err.message || 'Connection probe failed';
status.className = 'api-connect-status error';
}
})
.finally(() => {
if (!collectionJob || isCollectionJobTerminal(collectionJob.status)) {
setApiFormBlocked(false);
}
});
}
function startCollectionFromCurrentProbe(powerOnIfHostOff) {
const { isValid, payload, errors } = validateCollectForm();
renderFormErrors(errors);
if (!isValid) {
renderApiConnectStatus(false, null);
resetApiProbeState();
return;
}
if (!apiProbeResult || !apiProbeResult.reachable) {
const status = document.getElementById('api-connect-status');
if (status) {
status.textContent = 'Run the connection probe first.';
status.className = 'api-connect-status error';
}
return;
}
clearApiPowerDecisionTimer();
payload.power_on_if_host_off = Boolean(powerOnIfHostOff);
startCollectionJob(payload);
}
function renderApiProbeState() {
const collectButton = document.getElementById('api-connect-btn');
const status = document.getElementById('api-connect-status');
const decision = document.getElementById('api-power-decision');
const decisionText = document.getElementById('api-power-decision-text');
if (!collectButton || !status || !decision || !decisionText) {
return;
}
decision.classList.add('hidden');
clearApiPowerDecisionTimer();
collectButton.disabled = !apiProbeResult || !apiProbeResult.reachable;
if (!apiProbeResult || !apiProbeResult.reachable) {
status.textContent = 'Connection probe failed.';
status.className = 'api-connect-status error';
return;
}
if (apiProbeResult.host_powered_on) {
status.textContent = apiProbeResult.message || 'BMC reachable; host is powered on.';
status.className = 'api-connect-status success';
collectButton.disabled = false;
return;
}
status.textContent = apiProbeResult.message || 'BMC reachable; host is powered off.';
status.className = 'api-connect-status warning';
if (!apiProbeResult.power_control_available) {
collectButton.disabled = false;
return;
}
decision.classList.remove('hidden');
let secondsLeft = 5;
const updateDecisionText = () => {
decisionText.textContent = `If no action is chosen, collection will start without power-on in ${secondsLeft} s.`;
};
updateDecisionText();
apiPowerDecisionTimer = window.setInterval(() => {
secondsLeft -= 1;
if (secondsLeft <= 0) {
clearApiPowerDecisionTimer();
startCollectionFromCurrentProbe(false);
return;
}
updateDecisionText();
}, 1000);
}
function resetApiProbeState() {
apiProbeResult = null;
clearApiPowerDecisionTimer();
const collectButton = document.getElementById('api-connect-btn');
const decision = document.getElementById('api-power-decision');
if (collectButton) {
collectButton.disabled = true;
}
if (decision) {
decision.classList.add('hidden');
}
}
function clearApiPowerDecisionTimer() {
if (!apiPowerDecisionTimer) {
return;
}
window.clearInterval(apiPowerDecisionTimer);
apiPowerDecisionTimer = null;
}
function validateCollectForm() {
const host = getApiValue('host');
const portRaw = getApiValue('port');
@@ -229,6 +403,7 @@ function clearApiConnectStatus() {
}
function startCollectionJob(payload) {
clearApiPowerDecisionTimer();
resetCollectionJobState();
setApiFormBlocked(true);
@@ -247,7 +422,12 @@ function startCollectionJob(payload) {
id: body.job_id,
status: normalizeJobStatus(body.status || 'queued'),
progress: 0,
currentPhase: '',
etaSeconds: null,
logs: [],
activeModules: [],
moduleScores: [],
debugInfo: null,
payload
};
appendJobLog(body.message || 'Job queued');
@@ -285,7 +465,12 @@ function pollCollectionJobStatus() {
const prevStatus = collectionJob.status;
collectionJob.status = normalizeJobStatus(body.status || collectionJob.status);
collectionJob.progress = Number.isFinite(body.progress) ? body.progress : collectionJob.progress;
collectionJob.currentPhase = body.current_phase || collectionJob.currentPhase || '';
collectionJob.etaSeconds = Number.isFinite(body.eta_seconds) ? body.eta_seconds : collectionJob.etaSeconds;
collectionJob.error = body.error || '';
collectionJob.activeModules = Array.isArray(body.active_modules) ? body.active_modules : collectionJob.activeModules;
collectionJob.moduleScores = Array.isArray(body.module_scores) ? body.module_scores : collectionJob.moduleScores;
collectionJob.debugInfo = body.debug_info || collectionJob.debugInfo || null;
syncServerLogs(body.logs);
renderCollectionJob();
@@ -353,9 +538,14 @@ function renderCollectionJob() {
const progressValue = document.getElementById('job-progress-value');
const etaValue = document.getElementById('job-eta-value');
const progressBar = document.getElementById('job-progress-bar');
const activeModulesBlock = document.getElementById('job-active-modules');
const activeModulesList = document.getElementById('job-active-modules-list');
const debugInfoBlock = document.getElementById('job-debug-info');
const debugSummary = document.getElementById('job-debug-summary');
const phaseTelemetryNode = document.getElementById('job-phase-telemetry');
const logsList = document.getElementById('job-logs-list');
const cancelButton = document.getElementById('cancel-job-btn');
if (!jobStatusBlock || !jobIdValue || !statusValue || !progressValue || !etaValue || !progressBar || !logsList || !cancelButton) {
if (!jobStatusBlock || !jobIdValue || !statusValue || !progressValue || !etaValue || !progressBar || !activeModulesBlock || !activeModulesList || !debugInfoBlock || !debugSummary || !phaseTelemetryNode || !logsList || !cancelButton) {
return;
}
@@ -383,6 +573,8 @@ function renderCollectionJob() {
etaValue.textContent = eta;
progressBar.style.width = `${progressPercent}%`;
progressBar.textContent = `${progressPercent}%`;
renderJobActiveModules(activeModulesBlock, activeModulesList);
renderJobDebugInfo(debugInfoBlock, debugSummary, phaseTelemetryNode);
logsList.innerHTML = [...collectionJob.logs].reverse().map((log) => (
`<li><span class="log-time">${escapeHtml(log.time)}</span><span class="log-message">${escapeHtml(log.message)}</span></li>`
@@ -393,6 +585,9 @@ function renderCollectionJob() {
}
function latestCollectionActivityMessage() {
if (collectionJob && collectionJob.currentPhase) {
return humanizeCollectionPhase(collectionJob.currentPhase);
}
if (!collectionJob || !Array.isArray(collectionJob.logs) || collectionJob.logs.length === 0) {
return 'Collecting data...';
}
@@ -409,6 +604,9 @@ function latestCollectionActivityMessage() {
}
function latestCollectionETA() {
if (collectionJob && Number.isFinite(collectionJob.etaSeconds) && collectionJob.etaSeconds > 0) {
return formatDurationSeconds(collectionJob.etaSeconds);
}
if (!collectionJob || !Array.isArray(collectionJob.logs) || collectionJob.logs.length === 0) {
return '-';
}
@@ -474,6 +672,94 @@ function normalizeJobStatus(status) {
return String(status || '').trim().toLowerCase();
}
function humanizeCollectionPhase(phase) {
const value = String(phase || '').trim().toLowerCase();
return {
discovery: 'Discovery',
snapshot: 'Snapshot',
snapshot_postprobe_nvme: 'Snapshot NVMe post-probe',
snapshot_postprobe_collections: 'Snapshot collection post-probe',
prefetch: 'Prefetch critical endpoints',
critical_plan_b: 'Critical plan-B',
profile_plan_b: 'Profile plan-B'
}[value] || value || 'Сбор данных...';
}
function formatDurationSeconds(totalSeconds) {
const seconds = Math.max(0, Math.round(Number(totalSeconds) || 0));
if (seconds <= 0) {
return '-';
}
const minutes = Math.floor(seconds / 60);
const remainingSeconds = seconds % 60;
if (minutes === 0) {
return `${remainingSeconds}s`;
}
if (remainingSeconds === 0) {
return `${minutes}m`;
}
return `${minutes}m ${remainingSeconds}s`;
}
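For reference, the formatter above can be exercised standalone. This is a sketch that copies `formatDurationSeconds` out of the diff so its edge cases (zero, NaN, exact minutes) can be checked outside the page:

```javascript
// Standalone copy of formatDurationSeconds for a quick sanity check.
function formatDurationSeconds(totalSeconds) {
  const seconds = Math.max(0, Math.round(Number(totalSeconds) || 0));
  if (seconds <= 0) {
    return '-';
  }
  const minutes = Math.floor(seconds / 60);
  const remainingSeconds = seconds % 60;
  if (minutes === 0) {
    return `${remainingSeconds}s`;
  }
  if (remainingSeconds === 0) {
    return `${minutes}m`;
  }
  return `${minutes}m ${remainingSeconds}s`;
}

console.log(formatDurationSeconds(NaN));  // '-'  (Number(...) || 0 guards bad input)
console.log(formatDurationSeconds(59));   // '59s'
console.log(formatDurationSeconds(90));   // '1m 30s'
console.log(formatDurationSeconds(3600)); // '60m' (hours are not split out)
```

Note the function never emits hours; a long ETA like 3600 seconds renders as `60m`.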
function renderJobActiveModules(activeModulesBlock, activeModulesList) {
const activeModules = collectionJob && Array.isArray(collectionJob.activeModules) ? collectionJob.activeModules : [];
if (activeModules.length === 0) {
activeModulesBlock.classList.add('hidden');
activeModulesList.innerHTML = '';
return;
}
activeModulesBlock.classList.remove('hidden');
activeModulesList.innerHTML = activeModules.map((module) => {
const score = Number.isFinite(module.score) ? module.score : 0;
return `<span class="job-module-chip" title="${escapeHtml(moduleTitle(module))}">
<span class="job-module-chip-name">${escapeHtml(module.name || '-')}</span>
<span class="job-module-chip-score">${escapeHtml(String(score))}</span>
</span>`;
}).join('');
}
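The chip rendering above pipes every dynamic value through `escapeHtml`, which is defined elsewhere in this file and not shown in this hunk. A minimal sketch of what such a helper typically looks like (an assumption; the real implementation may differ):

```javascript
// Minimal escapeHtml sketch -- hypothetical stand-in for the helper
// defined elsewhere in the file; escapes the five HTML-sensitive chars.
function escapeHtml(value) {
  return String(value ?? '')
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

console.log(escapeHtml('<b title="x">')); // &lt;b title=&quot;x&quot;&gt;
```

The `&` replacement must run first, otherwise already-escaped entities would be double-escaped.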
function renderJobDebugInfo(debugInfoBlock, debugSummary, phaseTelemetryNode) {
const debug = collectionJob && collectionJob.debugInfo ? collectionJob.debugInfo : null;
if (!debug) {
debugInfoBlock.classList.add('hidden');
debugSummary.innerHTML = '';
phaseTelemetryNode.innerHTML = '';
return;
}
debugInfoBlock.classList.remove('hidden');
const throttled = debug.adaptive_throttled ? 'on' : 'off';
const prefetchEnabled = typeof debug.prefetch_enabled === 'boolean' ? String(debug.prefetch_enabled) : 'auto';
debugSummary.innerHTML = `adaptive_throttling=<strong>${escapeHtml(throttled)}</strong>, snapshot_workers=<strong>${escapeHtml(String(debug.snapshot_workers || 0))}</strong>, prefetch_workers=<strong>${escapeHtml(String(debug.prefetch_workers || 0))}</strong>, prefetch_enabled=<strong>${escapeHtml(prefetchEnabled)}</strong>`;
const phases = Array.isArray(debug.phase_telemetry) ? debug.phase_telemetry : [];
if (phases.length === 0) {
phaseTelemetryNode.innerHTML = '';
return;
}
phaseTelemetryNode.innerHTML = phases.map((item) => (
`<div class="job-phase-row">
<span class="job-phase-name">${escapeHtml(humanizeCollectionPhase(item.phase || ''))}</span>
<span class="job-phase-metric">req=${escapeHtml(String(item.requests || 0))}</span>
<span class="job-phase-metric">err=${escapeHtml(String(item.errors || 0))}</span>
<span class="job-phase-metric">avg=${escapeHtml(String(item.avg_ms || 0))}ms</span>
<span class="job-phase-metric">p95=${escapeHtml(String(item.p95_ms || 0))}ms</span>
</div>`
)).join('');
}
function moduleTitle(activeModule) {
const name = String(activeModule && activeModule.name || '').trim();
const scores = collectionJob && Array.isArray(collectionJob.moduleScores) ? collectionJob.moduleScores : [];
const full = scores.find((item) => String(item && item.name || '').trim() === name);
if (!full) {
return name;
}
const state = full.active ? 'active' : 'inactive';
return `${name}: score=${Number.isFinite(full.score) ? full.score : 0}, priority=${Number.isFinite(full.priority) ? full.priority : 0}, ${state}`;
}
async function loadDataFromStatus() {
try {
const response = await fetch('/api/status');
@@ -623,8 +909,16 @@ function initConvertMode() {
return;
}
folderInput.addEventListener('change', () => {
convertFiles = Array.from(folderInput.files || []).filter(file => file && file.name);
folderInput.addEventListener('change', async () => {
const raw = Array.from(folderInput.files || []).filter(file => file && file.name);
const summary = document.getElementById('convert-folder-summary');
if (summary) {
summary.textContent = 'Проверка дубликатов…';
summary.className = 'api-connect-status';
}
const { unique, duplicates } = await deduplicateConvertFiles(raw);
convertFiles = unique;
convertDuplicates = duplicates;
renderConvertSummary();
});
@@ -657,7 +951,14 @@ function renderConvertSummary() {
const batchCount = Math.ceil(supportedFiles.length / CONVERT_MAX_FILES_PER_BATCH);
const batchesText = batchCount > 1 ? ` Будет ${batchCount} прохода(ов) по ${CONVERT_MAX_FILES_PER_BATCH} файлов.` : '';
summary.innerHTML = `<strong>${supportedFiles.length}</strong> файлов готовы к конвертации.${previewText ? ` ${previewText}` : ''}${remaining > 0 ? ` и ещё ${remaining}` : ''}.${skippedText}${batchesText}`;
let dupText = '';
if (convertDuplicates.length > 0) {
const names = convertDuplicates.map(d => escapeHtml(d.name)).join(', ');
const reasons = convertDuplicates.map(d => d.reason === 'hash' ? 'одинаковое содержимое' : 'одинаковое имя');
const uniqueReasons = [...new Set(reasons)].join(', ');
dupText = ` <span style="color:#c0392b">⚠ Пропущено дубликатов: ${convertDuplicates.length} (${uniqueReasons}): ${names}.</span>`;
}
summary.innerHTML = `<strong>${supportedFiles.length}</strong> файлов готовы к конвертации.${previewText ? ` ${previewText}` : ''}${remaining > 0 ? ` и ещё ${remaining}` : ''}.${skippedText}${batchesText}${dupText}`;
summary.className = 'api-connect-status';
}
@@ -860,6 +1161,40 @@ function parseConvertErrorPayload(bodyText) {
}
}
async function deduplicateConvertFiles(files) {
// First pass: deduplicate by basename
const seenNames = new Set();
const unique = [];
const duplicates = [];
for (const file of files) {
if (seenNames.has(file.name)) {
duplicates.push({ name: file.webkitRelativePath || file.name, reason: 'name' });
} else {
seenNames.add(file.name);
unique.push(file);
}
}
// Second pass: deduplicate by SHA-256 content hash
const seenHashes = new Set();
const hashUnique = [];
for (const file of unique) {
try {
const buf = await file.arrayBuffer();
const hashBuf = await crypto.subtle.digest('SHA-256', buf);
const hash = Array.from(new Uint8Array(hashBuf)).map(b => b.toString(16).padStart(2, '0')).join('');
if (seenHashes.has(hash)) {
duplicates.push({ name: file.webkitRelativePath || file.name, reason: 'hash' });
} else {
seenHashes.add(hash);
hashUnique.push(file);
}
} catch (_) {
// Hashing failed (e.g. unreadable file): keep the file rather than drop it.
hashUnique.push(file);
}
}
return { unique: hashUnique, duplicates };
}
function isSupportedConvertFileName(filename) {
const name = String(filename || '').trim().toLowerCase();
if (!name) {
@@ -990,6 +1325,7 @@ let allSerials = [];
let allParseErrors = [];
let currentVendor = '';
let auditViewerNonce = 0;
// Load data from API
async function loadData(vendor, filename) {
@@ -997,16 +1333,9 @@ async function loadData(vendor, filename) {
document.getElementById('upload-section').classList.add('hidden');
document.getElementById('data-section').classList.remove('hidden');
document.getElementById('clear-btn').classList.remove('hidden');
// Update parser name and filename
const parserName = document.getElementById('parser-name');
const fileNameElem = document.getElementById('file-name');
if (parserName && currentVendor) {
parserName.textContent = currentVendor;
}
if (fileNameElem && filename) {
fileNameElem.textContent = filename;
}
document.getElementById('header-raw-btn').classList.remove('hidden');
document.getElementById('header-reanimator-btn').classList.remove('hidden');
document.getElementById('header-log-meta').classList.remove('hidden');
// Update vendor badge if exists (legacy support)
const vendorBadge = document.getElementById('vendor-badge');
@@ -1015,14 +1344,40 @@ async function loadData(vendor, filename) {
vendorBadge.classList.remove('hidden');
}
await Promise.all([
loadConfig(),
loadFirmware(),
loadSensors(),
loadSerials(),
loadEvents(),
loadParseErrors()
]);
loadAuditViewer();
}
function loadAuditViewer() {
const frame = document.getElementById('audit-viewer-frame');
if (!frame) {
return;
}
auditViewerNonce += 1;
frame.style.height = '60vh';
frame.src = `/chart/current?ts=${auditViewerNonce}`;
}
function resizeAuditViewerFrame() {
const frame = document.getElementById('audit-viewer-frame');
if (!frame) {
return;
}
try {
const doc = frame.contentDocument || (frame.contentWindow && frame.contentWindow.document);
if (!doc || !doc.documentElement || !doc.body) {
return;
}
const nextHeight = Math.max(
doc.documentElement.scrollHeight,
doc.body.scrollHeight,
640
);
frame.style.height = `${nextHeight}px`;
} catch (err) {
console.error('Failed to resize audit viewer frame:', err);
}
}
async function loadConfig() {
@@ -1735,11 +2090,19 @@ async function clearData() {
document.getElementById('upload-section').classList.remove('hidden');
document.getElementById('data-section').classList.add('hidden');
document.getElementById('clear-btn').classList.add('hidden');
document.getElementById('header-raw-btn').classList.add('hidden');
document.getElementById('header-reanimator-btn').classList.add('hidden');
document.getElementById('header-log-meta').classList.add('hidden');
document.getElementById('upload-status').textContent = '';
allSensors = [];
allEvents = [];
allSerials = [];
allParseErrors = [];
currentVendor = '';
const frame = document.getElementById('audit-viewer-frame');
if (frame) {
frame.src = 'about:blank';
}
} catch (err) {
console.error('Failed to clear data:', err);
}


@@ -8,8 +8,21 @@
</head>
<body>
<header>
<h1>LOGPile <span class="header-domain">mchus.pro</span></h1>
<p>Анализатор диагностических данных BMC/IPMI</p>
<div class="app-header-row">
<div class="app-header-brand">
<h1>LOGPile <span class="header-domain">mchus.pro</span></h1>
<p>Анализатор диагностических данных BMC/IPMI</p>
</div>
<div id="header-log-meta" class="header-log-meta hidden">
<div class="header-actions">
<button id="clear-btn" class="hidden" onclick="clearData()">Очистить данные</button>
<button id="header-raw-btn" class="hidden" onclick="exportData('json')">Export Raw Data</button>
<button id="header-reanimator-btn" class="hidden" onclick="exportData('reanimator')">Экспорт Reanimator</button>
<button id="restart-btn" onclick="restartApp()">Перезапуск</button>
<button id="exit-btn" onclick="exitApp()">Выход</button>
</div>
</div>
</div>
</header>
<main>
@@ -63,9 +76,18 @@
</div>
<div class="api-form-actions">
<button id="api-connect-btn" type="submit">Подключиться</button>
<button id="api-check-btn" type="button">Проверить</button>
<button id="api-connect-btn" type="submit" disabled>Собрать</button>
</div>
<div id="api-connect-status" class="api-connect-status"></div>
<div id="api-power-decision" class="api-connect-status hidden">
<strong>Host выключен.</strong>
<p id="api-power-decision-text">Если не выбрать действие, сбор начнется без включения через 5 секунд.</p>
<div class="api-form-actions">
<button id="api-power-on-collect-btn" type="button">Включить и собрать</button>
<button id="api-collect-off-btn" type="button">Собирать выключенный</button>
</div>
</div>
</form>
<section id="api-job-status" class="job-status hidden" aria-live="polite">
@@ -85,6 +107,15 @@
<div class="job-progress" aria-label="Прогресс задачи">
<div id="job-progress-bar" class="job-progress-bar" style="width: 0%">0%</div>
</div>
<div id="job-active-modules" class="job-active-modules hidden">
<p class="meta-label">Активные модули:</p>
<div id="job-active-modules-list" class="job-module-chips"></div>
</div>
<div id="job-debug-info" class="job-debug-info hidden">
<p class="meta-label">Redfish debug:</p>
<div id="job-debug-summary" class="job-debug-summary"></div>
<div id="job-phase-telemetry" class="job-phase-telemetry"></div>
</div>
<div class="job-status-logs">
<p class="meta-label">Журнал шагов:</p>
<ul id="job-logs-list"></ul>
@@ -115,142 +146,23 @@
</section>
<section id="data-section" class="hidden">
<div class="file-info">
<div class="parser-badge">
<span class="badge-label">Парсер:</span>
<span id="parser-name" class="badge-value"></span>
<section class="result-panel">
<div class="audit-viewer-shell">
<iframe
id="audit-viewer-frame"
class="audit-viewer-frame"
title="Reanimator chart viewer"
loading="eager"
scrolling="no"
referrerpolicy="same-origin">
</iframe>
</div>
<div class="file-name">
<span class="badge-label">Файл:</span>
<span id="file-name" class="badge-value"></span>
</div>
</div>
<nav class="tabs">
<button class="tab active" data-tab="config">Конфигурация</button>
<button class="tab" data-tab="firmware">Прошивки</button>
<button class="tab" data-tab="sensors">Сенсоры</button>
<button class="tab" data-tab="serials">Серийные номера</button>
<button class="tab" data-tab="events">События</button>
<button class="tab" data-tab="parse-errors">Ошибки разбора</button>
</nav>
<div class="tab-content active" id="config">
<div class="toolbar">
<button onclick="exportData('json')">Export Raw Data</button>
<button onclick="exportData('reanimator')">Экспорт Reanimator</button>
</div>
<div id="config-content"></div>
</div>
<div class="tab-content" id="firmware">
<div class="toolbar">
<span class="toolbar-label">Версии прошивок компонентов</span>
</div>
<table id="firmware-table">
<thead>
<tr>
<th>Компонент</th>
<th>Модель</th>
<th>Версия</th>
</tr>
</thead>
<tbody></tbody>
</table>
</div>
<div class="tab-content" id="sensors">
<div class="toolbar">
<select id="sensor-filter">
<option value="">Все сенсоры</option>
<option value="temperature">Температура</option>
<option value="voltage">Напряжение</option>
<option value="power">Мощность</option>
<option value="fan_speed">Вентиляторы</option>
</select>
</div>
<div id="sensors-content"></div>
</div>
<div class="tab-content" id="serials">
<div class="toolbar">
<select id="serial-filter">
<option value="">Все компоненты</option>
<option value="Board">Материнская плата</option>
<option value="CPU">Процессоры</option>
<option value="Memory">Память</option>
<option value="Storage">Накопители</option>
<option value="PCIe">PCIe устройства</option>
<option value="Network">Сетевые адаптеры</option>
<option value="PSU">Блоки питания</option>
<option value="Firmware">Прошивки</option>
<option value="FRU">FRU</option>
</select>
<button onclick="exportData('csv')">Экспорт CSV</button>
</div>
<table id="serials-table">
<thead>
<tr>
<th>Категория</th>
<th>Компонент</th>
<th>Расположение</th>
<th>Серийный номер</th>
<th>Производитель</th>
</tr>
</thead>
<tbody></tbody>
</table>
</div>
<div class="tab-content" id="events">
<div class="toolbar">
<select id="severity-filter">
<option value="">Все события</option>
<option value="critical">Критические</option>
<option value="warning">Предупреждения</option>
<option value="info">Информационные</option>
</select>
</div>
<table id="events-table">
<thead>
<tr>
<th>Время</th>
<th>Источник</th>
<th>Описание</th>
<th>Важность</th>
</tr>
</thead>
<tbody></tbody>
</table>
</div>
<div class="tab-content" id="parse-errors">
<div class="toolbar">
<span class="toolbar-label">Ошибки сборки / разбора (Redfish, parser, файл)</span>
</div>
<div class="table-scroll">
<table id="parse-errors-table">
<thead>
<tr>
<th>Источник</th>
<th>Категория</th>
<th>Важность</th>
<th>Endpoint / Path</th>
<th>Сообщение</th>
</tr>
</thead>
<tbody></tbody>
</table>
</div>
</div>
</section>
</section>
</main>
<footer>
<div class="footer-buttons">
<button id="clear-btn" class="hidden" onclick="clearData()">Очистить данные</button>
<button id="restart-btn" onclick="restartApp()">Перезапуск</button>
<button id="exit-btn" onclick="exitApp()">Выход</button>
</div>
<div class="footer-info">
<p>Автор: <a href="https://mchus.pro" target="_blank">mchus.pro</a> | <a href="https://git.mchus.pro/mchus/logpile" target="_blank">Git Repository</a></p>