Compare commits

47 Commits

| Author | SHA1 | Date |
|---|---|---|
| | a82fb227e5 | |
| | c9969fc3da | |
| | 89b6701f43 | |
| | b04877549a | |
| | 8ca173c99b | |
| | f19a3454fa | |
| | becdca1d7e | |
| | e10440ae32 | |
| | 5c2a21aff1 | |
| | 9df13327aa | |
| | 7e9af89c46 | |
| | db74df9994 | |
| | bb82387d48 | |
| | 475f6ac472 | |
| | 93ce676f04 | |
| | c47c34fd11 | |
| | d8c3256e41 | |
| | 1b2d978d29 | |
| | 0f310d57c4 | |
| | 3547ef9083 | |
| | 99f0d6217c | |
| | 8acbba3cc9 | |
| | 8942991f0c | |
| | 9b71c4a95f | |
| | 125f77ef69 | |
| | 063b08d5fb | |
| | e3ff1745fc | |
| | 96e65d8f65 | |
| | 30409eef67 | |
| | 65e65968cf | |
| | 380c199705 | |
| | d650a6ba1c | |
| | d8d3d8c524 | |
| | 057a222288 | |
| | f11a43f690 | |
| | 476630190d | |
| | 9007f1b360 | |
| | 0acdc2b202 | |
| | 47bb0ee939 | |
| | 5815100e2f | |
| | 1eb639e6bf | |
| | a9f58b3cf4 | |
| | d8ffe3d3a5 | |
| | 9df29b1be9 | |
| | 62d6ad6f66 | |
| | f09344e288 | |
| | 19d857b459 | |
.gitmodules (vendored): 3 changes

````diff
@@ -4,3 +4,6 @@
 [submodule "bible"]
 	path = bible
 	url = https://git.mchus.pro/mchus/bible.git
+[submodule "internal/chart"]
+	path = internal/chart
+	url = https://git.mchus.pro/reanimator/chart.git
````
````diff
@@ -6,6 +6,6 @@ Start with `bible/rules/patterns/` for specific contracts.
 
 ## Project Architecture
 Read `bible-local/` — LOGPile specific architecture.
-Read order: `bible-local/README.md` → `01-overview.md` → relevant files for the task.
+Read order: `bible-local/README.md` → `01-overview.md` → `02-architecture.md` → `04-data-models.md` → relevant file(s) for the task.
 
 Every architectural decision specific to this project must be recorded in `bible-local/10-decisions.md`.
````
````diff
@@ -6,6 +6,6 @@ Start with `bible/rules/patterns/` for specific contracts.
 
 ## Project Architecture
 Read `bible-local/` — LOGPile specific architecture.
-Read order: `bible-local/README.md` → `01-overview.md` → relevant files for the task.
+Read order: `bible-local/README.md` → `01-overview.md` → `02-architecture.md` → `04-data-models.md` → relevant file(s) for the task.
 
 Every architectural decision specific to this project must be recorded in `bible-local/10-decisions.md`.
````
README.md: 20 changes

@@ -2,9 +2,27 @@

Standalone Go application for BMC diagnostics analysis with an embedded web UI.

## What it does

- Parses vendor diagnostic archives into a normalized hardware inventory
- Collects live BMC data via Redfish
- Exports normalized data as CSV, raw re-analysis bundles, and Reanimator JSON
- Runs as a single Go binary with embedded UI assets

## Documentation

- Architecture and technical documentation (single source of truth): [`docs/bible/README.md`](docs/bible/README.md)
- Shared engineering rules: [`bible/README.md`](bible/README.md)
- Project architecture and API contracts: [`bible-local/README.md`](bible-local/README.md)
- Agent entrypoints: [`AGENTS.md`](AGENTS.md), [`CLAUDE.md`](CLAUDE.md)

## Run

```bash
make build
./bin/logpile
```

Default port: `8082`

## License
bible: 2 changes

Submodule bible updated: 0c829182a1...52444350c1
````diff
@@ -1,35 +1,46 @@
 # 01 — Overview
 
-## What is LOGPile?
+## Purpose
 
-LOGPile is a standalone Go application for BMC (Baseboard Management Controller)
-diagnostics analysis with an embedded web UI.
-It runs as a single binary with no external file dependencies.
+LOGPile is a standalone Go application for BMC diagnostics analysis with an embedded web UI.
+It runs as a single binary and normalizes hardware data from archives or live Redfish collection.
 
 ## Operating modes
 
-| Mode | Entry point | Description |
-|------|-------------|-------------|
-| **Offline / archive** | `POST /api/upload` | Upload a vendor diagnostic archive or a JSON snapshot; parse and display in UI |
-| **Live / Redfish** | `POST /api/collect` | Connect to a live BMC via Redfish API, collect hardware inventory, display and export |
+| Mode | Entry point | Outcome |
+|------|-------------|---------|
+| Archive upload | `POST /api/upload` | Parse a supported archive, raw export bundle, or JSON snapshot into `AnalysisResult` |
+| Live collection | `POST /api/collect` | Collect from a live BMC via Redfish and store the result in memory |
+| Batch convert | `POST /api/convert` | Convert multiple supported input files into Reanimator JSON in a ZIP artifact |
 
-Both modes produce the same in-memory `AnalysisResult` structure and expose it
-through the same API and UI.
+All modes converge on the same normalized hardware model and exporter pipeline.
 
-## Key capabilities
+## In scope
 
-- Single self-contained binary with embedded HTML/JS/CSS (no static file serving required).
-- Vendor archive parsing: Inspur/Kaytus, Dell TSR, NVIDIA HGX Field Diagnostics,
-  NVIDIA Bug Report, Unraid, XigmaNAS, Generic text fallback.
-- Live Redfish collection with async progress tracking.
-- Normalized hardware inventory: CPU / RAM / Storage / GPU / PSU / NIC / PCIe / Firmware.
-- Raw `redfish_tree` snapshot stored in `RawPayloads` for future offline re-analysis.
-- Re-upload of a JSON snapshot for offline work (`/api/upload` accepts `AnalysisResult` JSON).
-- Export in CSV, JSON (full `AnalysisResult`), and Reanimator format.
-- PCI device model resolution via embedded `pci.ids` (no hardcoded model strings).
+- Single-binary desktop/server utility with embedded UI
+- Vendor archive parsing and live Redfish collection
+- Canonical hardware inventory across UI and exports
+- Reopenable raw export bundles for future re-analysis
+- Reanimator export and batch conversion workflows
+- Embedded `pci.ids` lookup for vendor/device name enrichment
 
-## Non-goals (current scope)
+## Current vendor coverage
 
-- No persistent storage — all state is in-memory per process lifetime.
-- IPMI collector is a mock scaffold only; real IPMI support is not implemented.
-- No authentication layer on the HTTP server.
+- Dell TSR
+- Reanimator Easy Bee support bundles
+- H3C SDS G5/G6
+- Inspur / Kaytus
+- HPE iLO AHS
+- NVIDIA HGX Field Diagnostics
+- NVIDIA Bug Report
+- Unraid
+- xFusion iBMC dump / file export
+- XigmaNAS
+- Generic fallback parser
+
+## Non-goals
+
+- Persistent storage or multi-user state
+- Production IPMI collection
+- Authentication/authorization on the built-in HTTP server
+- Long-term server-side job history beyond in-memory process lifetime
````
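The "all modes converge" rule above can be sketched as a tiny Go program, assuming hypothetical entry-point names (`parseArchive`, `collectLive` are illustrative, not the real LOGPile API):

```go
package main

import "fmt"

// AnalysisResult stands in for the shared normalized result type.
type AnalysisResult struct {
	SourceType string // "archive" or "api"
	Protocol   string // "redfish", "ipmi", or empty
}

// Both entry points return the same type, so the exporter pipeline
// downstream does not care which mode produced the data.
func parseArchive(name string) AnalysisResult {
	return AnalysisResult{SourceType: "archive"}
}

func collectLive(host string) AnalysisResult {
	return AnalysisResult{SourceType: "api", Protocol: "redfish"}
}

func main() {
	a := parseArchive("dump.tar.gz")
	b := collectLive("bmc01.example.local")
	fmt.Println(a.SourceType, b.SourceType)
}
```

Both values feed the same downstream model; only the source metadata differs.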
````diff
@@ -2,114 +2,97 @@
 
 ## Runtime stack
 
-| Layer | Technology |
-|-------|------------|
+| Layer | Implementation |
+|-------|----------------|
 | Language | Go 1.22+ |
-| HTTP | `net/http`, `http.ServeMux` |
-| UI | Embedded via `//go:embed` in `web/embed.go` (templates + static assets) |
-| State | In-memory only — no database |
-| Build | `CGO_ENABLED=0`, single static binary |
+| HTTP | `net/http` + `http.ServeMux` |
+| UI | Embedded templates and static assets via `go:embed` |
+| State | In-memory only |
+| Build | `CGO_ENABLED=0`, single binary |
 
-Default port: **8082**
+Default port: `8082`
 
-## Directory structure
+Audit result rendering is delegated to embedded `reanimator/chart`, vendored as git submodule `internal/chart`.
+LOGPile remains responsible for upload, collection, parsing, normalization, and Reanimator export generation.
 
-```
-cmd/logpile/main.go          # Binary entry point, CLI flag parsing
-internal/
-  collector/                 # Live data collectors
-    registry.go              # Collector registration
-    redfish.go               # Redfish connector (real implementation)
-    ipmi_mock.go             # IPMI mock connector (scaffold)
-    types.go                 # Connector request/progress contracts
-  parser/                    # Archive parsers
-    parser.go                # BMCParser (dispatcher) + parse orchestration
-    archive.go               # Archive extraction helpers
-    registry.go              # Parser registry + detect/selection
-    interface.go             # VendorParser interface
-    vendors/                 # Vendor-specific parser modules
-      vendors.go             # Import-side-effect registrations
-      dell/
-      inspur/
-      nvidia/
-      nvidia_bug_report/
-      unraid/
-      xigmanas/
-      generic/
-      pciids/                # PCI IDs lookup (embedded pci.ids)
-  server/                    # HTTP layer
-    server.go                # Server struct, route registration
-    handlers.go              # All HTTP handler functions
-  exporter/                  # Export formatters
-    exporter.go              # CSV + JSON exporters
-    reanimator_models.go
-    reanimator_converter.go
-  models/                    # Shared data contracts
-web/
-  embed.go                   # go:embed directive
-  templates/                 # HTML templates
-  static/                    # JS / CSS
-    js/app.js                # Frontend — API contract consumer
-```
+## Code map
+
+```text
+cmd/logpile/main.go    entrypoint and CLI flags
+internal/server/       HTTP handlers, jobs, upload/export flows
+internal/ingest/       source-family orchestration for upload and raw replay
+internal/collector/    live collection and Redfish replay
+internal/analyzer/     shared analysis helpers
+internal/parser/       archive extraction and parser dispatch
+internal/exporter/     CSV and Reanimator conversion
+internal/chart/        vendored `reanimator/chart` viewer submodule
+internal/models/       stable data contracts
+web/                   embedded UI assets
+```
 
-## In-memory state
+## Server state
 
-The `Server` struct in `internal/server/server.go` holds:
+`internal/server.Server` stores:
 
-| Field | Type | Description |
-|-------|------|-------------|
-| `result` | `*models.AnalysisResult` | Current parsed/collected dataset |
-| `detectedVendor` | `string` | Vendor identifier from last parse |
-| `jobManager` | `*JobManager` | Tracks live collect job status/logs |
-| `collectors` | `*collector.Registry` | Registered live collection connectors |
+| Field | Purpose |
+|-------|---------|
+| `result` | Current `AnalysisResult` shown in UI and used by exports |
+| `detectedVendor` | Parser/collector identity for the current dataset |
+| `rawExport` | Reopenable raw-export package associated with current result |
+| `jobManager` | Shared async job state for collect and convert flows |
+| `collectors` | Registered live collectors (`redfish`, `ipmi`) |
+| `convertOutput` | Temporary ZIP artifacts for batch convert downloads |
 
-State is replaced atomically on successful upload or collect.
-On a failed/canceled collect, the previous `result` is preserved unchanged.
+State is replaced only on successful upload or successful live collection.
+Failed or canceled jobs do not overwrite the previous dataset.
 
-## Upload flow (`POST /api/upload`)
+## Main flows
 
-```
-multipart form field: "archive"
-  │
-  ├─ file looks like JSON?
-  │    └─ parse as models.AnalysisResult snapshot → store in Server.result
-  │
-  └─ otherwise
-       └─ parser.NewBMCParser().ParseFromReader(...)
-            │
-            ├─ try all registered vendor parsers (highest confidence wins)
-            └─ result → store in Server.result
-```
+### Upload
 
-## Live collect flow (`POST /api/collect`)
+1. `POST /api/upload` receives multipart field `archive`
+2. `internal/ingest.Service` resolves the source family
+3. JSON inputs are checked for raw-export package or `AnalysisResult` snapshot
+4. Non-JSON archives go through the archive parser family
+5. Archive metadata is normalized onto `AnalysisResult`
+6. Result becomes the current in-memory dataset
 
-```
-validate request (host / protocol / port / username / auth_type / tls_mode)
-  │
-  └─ launch async job
-       │
-       ├─ progress callback → job log (queryable via GET /api/collect/{id})
-       │
-       ├─ success:
-       │    set source metadata (source_type=api, protocol, host, date)
-       │    store result in Server.result
-       │
-       └─ failure / cancel:
-            previous Server.result unchanged
-```
+### Live collect
 
-Job lifecycle states: `queued → running → success | failed | canceled`
+1. `POST /api/collect` validates request fields
+2. Server creates an async job and returns `202 Accepted`
+3. Selected collector gathers raw data
+4. For Redfish, collector runs minimal discovery, matches Redfish profiles, and builds an acquisition plan
+5. Collector applies profile tuning hints (for example crawl breadth, prefetch, bounded plan-B passes)
+6. Collector saves `raw_payloads.redfish_tree` plus acquisition diagnostics
+7. Result is normalized, source metadata applied, and state replaced on success
+
+### Batch convert
+
+1. `POST /api/convert` accepts multiple files
+2. Each supported file is analyzed independently
+3. Successful results are converted to Reanimator JSON
+4. Outputs are packaged into a temporary ZIP artifact
+5. Client polls job status and downloads the artifact when ready
 
 ## Redfish design rule
 
 Live Redfish collection and offline Redfish re-analysis must use the same replay path.
 The collector first captures `raw_payloads.redfish_tree`, then the replay logic builds the normalized result.
 
 Redfish is being split into two coordinated phases:
 - acquisition: profile-driven snapshot collection strategy
 - analysis: replay over the saved snapshot with the same profile framework
````
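The replay rule can be sketched in a few lines of Go: live collection first captures the raw snapshot, and a single `normalize` function consumes that snapshot in both the live and offline paths. All names here are illustrative, not the real collector API:

```go
package main

import "fmt"

// RawTree stands in for raw_payloads.redfish_tree: a captured map of
// Redfish resource paths to their raw JSON bodies.
type RawTree map[string]string

// fetchLive simulates the acquisition phase capturing raw data.
func fetchLive() RawTree {
	return RawTree{"Systems/1": `{"Model":"X"}`}
}

// normalize is the single replay path shared by live collection and
// offline re-analysis of a saved snapshot.
func normalize(raw RawTree) string {
	return raw["Systems/1"]
}

func main() {
	raw := fetchLive()        // live: capture the raw tree first
	live := normalize(raw)    // live result built by replay logic
	offline := normalize(raw) // offline re-analysis of the same snapshot
	fmt.Println(live == offline)
}
```

Because both paths run the same `normalize`, a saved snapshot reproduces the live result exactly.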
````diff
 ## PCI IDs lookup
 
-Load/override order (`LOGPILE_PCI_IDS_PATH` has highest priority because it is loaded last):
+Lookup order:
 
-1. Embedded `internal/parser/vendors/pciids/pci.ids` (base dataset compiled into binary)
+1. Embedded `internal/parser/vendors/pciids/pci.ids`
 2. `./pci.ids`
 3. `/usr/share/hwdata/pci.ids`
 4. `/usr/share/misc/pci.ids`
 5. `/opt/homebrew/share/pciids/pci.ids`
-6. Paths from `LOGPILE_PCI_IDS_PATH` (colon-separated on Unix; later loaded, override same IDs)
+6. Extra paths from `LOGPILE_PCI_IDS_PATH`
 
-This means unknown GPU/NIC model strings can be updated by refreshing `pci.ids`
-without any code change.
+Later sources override earlier ones for the same IDs.
````
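The "later sources override earlier ones" rule is the usual ordered map-merge. A minimal sketch, assuming each source has already been parsed into an ID-to-name map (the IDs and names below are sample values, not taken from the real dataset):

```go
package main

import "fmt"

// mergeSources loads pci.ids sources in a fixed order; a later source
// overwrites earlier entries that share the same ID.
func mergeSources(sources []map[string]string) map[string]string {
	merged := map[string]string{}
	for _, src := range sources { // embedded first, LOGPILE_PCI_IDS_PATH last
		for id, name := range src {
			merged[id] = name // later source wins for the same ID
		}
	}
	return merged
}

func main() {
	embedded := map[string]string{"10de:2330": "GH100"}
	extra := map[string]string{"10de:2330": "GH100 [H100 SXM5 80GB]"}
	db := mergeSources([]map[string]string{embedded, extra})
	fmt.Println(db["10de:2330"]) // the later source's name wins
}
```

This is why an environment-provided path effectively has the highest priority: it is loaded last.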
````diff
@@ -2,38 +2,38 @@
 
 ## Conventions
 
-- All endpoints under `/api/`.
-- Request bodies: `application/json` or `multipart/form-data` where noted.
-- Responses: `application/json` unless file download.
-- Export filenames follow pattern: `YYYY-MM-DD (SERVER MODEL) - SERVER SN.<ext>`
+- All endpoints are under `/api/`
+- JSON responses are used unless the endpoint downloads a file
+- Async jobs share the same status model: `queued`, `running`, `success`, `failed`, `canceled`
+- Export filenames use `YYYY-MM-DD (MODEL) - SERIAL.<ext>` when board metadata exists
+- Embedded chart viewer routes live under `/chart/` and return HTML/CSS, not JSON
````
````diff
 
 ---
 
-## Upload & Data Input
+## Input endpoints
 
 ### `POST /api/upload`
 
-Upload a vendor diagnostic archive or a JSON snapshot.
-
-**Request:** `multipart/form-data`, field name `archive`.
-Server-side multipart limit: **100 MiB**.
+Uploads one file in multipart field `archive`.
 
 Accepted inputs:
-- `.tar`, `.tar.gz`, `.tgz` — vendor diagnostic archives
-- `.txt` — plain text files
-- JSON file containing a serialized `AnalysisResult` — re-loaded as-is
+- supported archive/log formats from the parser registry
+- `.json` `AnalysisResult` snapshots
+- raw-export JSON packages
+- raw-export ZIP bundles
 
-**Response:** `200 OK` with parsed result summary, or `4xx`/`5xx` on error.
+Result:
+- parses or replays the input
+- stores the result as current in-memory state
+- returns parsed summary JSON
 
 ---
 
-## Live Collection
+Related helper:
+- `GET /api/file-types` returns `archive_extensions`, `upload_extensions`, and `convert_extensions`
 
 ### `POST /api/collect`
 
-Start a live collection job (`redfish` or `ipmi`).
+Starts a live collection job.
 
-**Request body:**
+Request body:
 ```json
 {
   "host": "bmc01.example.local",
````
````diff
@@ -47,138 +47,154 @@ Start a live collection job (`redfish` or `ipmi`).
 ```
 
 Supported values:
-- `protocol`: `redfish` | `ipmi`
-- `auth_type`: `password` | `token`
-- `tls_mode`: `strict` | `insecure`
+- `protocol`: `redfish` or `ipmi`
+- `auth_type`: `password` or `token`
+- `tls_mode`: `strict` or `insecure`
````
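The request body shown above is cut off at the hunk boundary. A plausible full body, assuming only the field names listed under supported values (all values here are placeholders, not taken from the original):

```json
{
  "host": "bmc01.example.local",
  "port": 443,
  "protocol": "redfish",
  "username": "admin",
  "password": "secret",
  "auth_type": "password",
  "tls_mode": "insecure"
}
```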
````diff
 
-**Response:** `202 Accepted`
-```json
-{
-  "job_id": "job_a1b2c3d4e5f6",
-  "status": "queued",
-  "message": "Collection job accepted",
-  "created_at": "2026-02-23T12:00:00Z"
-}
-```
-
-Validation behavior:
-- `400 Bad Request` for invalid JSON
-- `422 Unprocessable Entity` for semantic validation errors (missing/invalid fields)
+Responses:
+- `202` on accepted job creation
+- `400` on malformed JSON
+- `422` on validation errors
 
+Optional request fields:
+- `power_on_if_host_off`: when `true`, Redfish collection may power on the host before collection if preflight found it powered off
+- `debug_payloads`: when `true`, collector keeps extra diagnostic payloads and enables extended plan-B retries for slow HGX component inventory branches (`Assembly`, `Accelerators`, `Drives`, `NetworkAdapters`, `PCIeDevices`)
+
+### `POST /api/collect/probe`
+
+Checks that live API connectivity works and returns host power state before collection starts.
+
+Typical request body is the same as `POST /api/collect`.
+
+Typical response fields:
+- `reachable`
+- `protocol`
+- `host_power_state`
+- `host_powered_on`
+- `power_control_available`
+- `message`
 
 ### `GET /api/collect/{id}`
 
-Poll job status and progress log.
-
-**Response:**
-```json
-{
-  "job_id": "job_a1b2c3d4e5f6",
-  "status": "running",
-  "progress": 55,
-  "logs": ["..."],
-  "created_at": "2026-02-23T12:00:00Z",
-  "updated_at": "2026-02-23T12:00:10Z"
-}
-```
-
-Status values: `queued` | `running` | `success` | `failed` | `canceled`
+Returns async collection job status, progress, timestamps, and accumulated logs.
````
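Client-side, the shared status model means polling `GET /api/collect/{id}` until a terminal status appears. A minimal sketch of the terminal-state check (the helper name is illustrative):

```go
package main

import "fmt"

// isTerminal reports whether an async job status is final, per the
// documented status set: queued, running, success, failed, canceled.
func isTerminal(status string) bool {
	switch status {
	case "success", "failed", "canceled":
		return true
	}
	return false
}

func main() {
	// A poller keeps requesting job status until isTerminal returns true.
	for _, s := range []string{"queued", "running", "success"} {
		fmt.Println(s, isTerminal(s))
	}
}
```

The same check applies to convert jobs, since both flows share the async job envelope.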
````diff
 ### `POST /api/collect/{id}/cancel`
 
-Cancel a running job.
+Requests cancellation for a running collection job.
 
----
+### `POST /api/convert`
 
-## Data Queries
+Starts a batch conversion job that accepts multiple files under `files[]` or `files`.
+Each supported file is parsed independently and converted to Reanimator JSON.
+
+Response fields:
+- `job_id`
+- `status`
+- `accepted`
+- `skipped`
+- `total_files`
+
+### `GET /api/convert/{id}`
+
+Returns batch convert job status using the same async job envelope as collection.
+
+### `GET /api/convert/{id}/download`
+
+Downloads the ZIP artifact produced by a successful convert job.
+
+## Read endpoints
 
 ### `GET /api/status`
 
-Returns source metadata for the current dataset.
-If nothing is loaded, response is `{ "loaded": false }`.
-
-```json
-{
-  "loaded": true,
-  "filename": "redfish://bmc01.example.local",
-  "vendor": "redfish",
-  "source_type": "api",
-  "protocol": "redfish",
-  "target_host": "bmc01.example.local",
-  "collected_at": "2026-02-10T15:30:00Z",
-  "stats": { "events": 0, "sensors": 0, "fru": 0 }
-}
-```
-
-`source_type`: `archive` | `api`
+When no dataset is loaded, response is `{ "loaded": false }`.
+Typical fields:
+- `loaded`
+- `filename`
+- `vendor`
+- `source_type`
+- `protocol`
+- `target_host`
+- `source_timezone`
+- `collected_at`
+- `stats`
 
 ### `GET /api/config`
 
-Returns source metadata plus:
+Returns the main UI configuration payload, including:
+- source metadata
 - `hardware.board`
 - `hardware.firmware`
 - canonical `hardware.devices`
-- computed `specification` summary lines
+- computed specification lines
 
 ### `GET /api/events`
 
-Returns parsed diagnostic events.
+Returns events sorted newest first.
 
 ### `GET /api/sensors`
 
-Returns sensor readings (temperatures, voltages, fan speeds).
+Returns parsed sensors plus synthesized PSU voltage sensors when telemetry is available.
 
 ### `GET /api/serials`
 
-Returns serial numbers built from canonical `hardware.devices`.
+Returns serial-oriented inventory built from canonical devices.
 
 ### `GET /api/firmware`
 
-Returns firmware versions built from canonical `hardware.devices`.
+Returns firmware-oriented inventory built from canonical devices.
 
 ### `GET /api/parse-errors`
 
 Returns normalized parse and collection issues combined from:
 - Redfish fetch errors in `raw_payloads`
 - raw-export collect logs
 - derived partial-inventory warnings
 
 ### `GET /api/parsers`
 
-Returns list of registered vendor parsers with their identifiers.
+Returns registered parser metadata.
 
----
+### `GET /api/file-types`
 
-## Export
+Returns supported file extensions for upload and batch convert.
+
+## Viewer endpoints
+
+### `GET /chart/current`
+
+Renders the current in-memory dataset as Reanimator HTML using embedded `reanimator/chart`.
+The server first converts the current result to Reanimator JSON, then passes that snapshot to the viewer.
+
+### `GET /chart/static/...`
+
+Serves embedded `reanimator/chart` static assets.
+
+## Export endpoints
 
 ### `GET /api/export/csv`
 
-Download serial numbers as CSV.
+Downloads serial-number CSV.
 
 ### `GET /api/export/json`
 
-Download full `AnalysisResult` as JSON (includes `raw_payloads`).
+Downloads a raw-export artifact for reopen and re-analysis.
+Current implementation emits a ZIP bundle containing:
+- `raw_export.json`
+- `collect.log`
+- `parser_fields.json`
 
 ### `GET /api/export/reanimator`
 
-Download hardware data in Reanimator format for asset tracking integration.
-See [`07-exporters.md`](07-exporters.md) for full format spec.
+Downloads Reanimator JSON built from the current normalized result.
 
 ---
 
-## Management
+## Management endpoints
 
 ### `DELETE /api/clear`
 
-Clear current in-memory dataset.
+Clears current in-memory dataset, raw export state, and temporary convert artifacts.
 
 ### `POST /api/shutdown`
 
-Gracefully shut down the server process.
+Gracefully shuts down the process after responding.
+This endpoint terminates the current process after responding.
 
----
-
-## Source metadata fields
-
-Fields present in `/api/status` and `/api/config`:
-
-| Field | Values |
-|-------|--------|
-| `source_type` | `archive` \| `api` |
-| `protocol` | `redfish` \| `ipmi` (may be empty for archive uploads) |
-| `target_host` | IP or hostname |
-| `collected_at` | RFC3339 timestamp |
````
````diff
@@ -1,104 +1,87 @@
 # 04 — Data Models
 
-## AnalysisResult
+## Core contract: `AnalysisResult`
 
-`internal/models/` — the central data contract shared by parsers, collectors, exporters, and the HTTP layer.
+`internal/models/models.go` defines the shared result passed between parsers, collectors, server handlers, and exporters.
 
-**Stability rule:** Never break the JSON shape of `AnalysisResult`.
-Backward-compatible additions are allowed; removals or renames are not.
+Stability rule:
+- do not rename or remove JSON fields from `AnalysisResult`
+- additive fields are allowed
+- UI and exporter compatibility depends on this shape remaining stable
 
-Key top-level fields:
+Key fields:
 
-| Field | Type | Description |
-|-------|------|-------------|
-| `filename` | `string` | Uploaded filename or generated live source identifier |
-| `source_type` | `string` | `archive` or `api` |
-| `protocol` | `string` | `redfish`, `ipmi`, or empty for archive uploads |
-| `target_host` | `string` | BMC host for live collection |
-| `collected_at` | `time.Time` | Upload/collection timestamp |
-| `hardware` | `*HardwareConfig` | All normalized hardware inventory |
-| `events` | `[]Event` | Diagnostic events from parsers |
-| `fru` | `[]FRUInfo` | FRU/SDR-derived inventory details |
-| `sensors` | `[]SensorReading` | Sensor readings |
-| `raw_payloads` | `map[string]any` | Raw vendor data (e.g. `redfish_tree`) |
+| Field | Meaning |
+|-------|---------|
+| `filename` | Original upload name or synthesized live source name |
+| `source_type` | `archive` or `api` |
+| `protocol` | `redfish`, `ipmi`, or empty for archive uploads |
+| `target_host` | Hostname or IP for live collection |
+| `source_timezone` | Source timezone/offset if known |
+| `collected_at` | Canonical collection/upload time |
+| `raw_payloads` | Raw source data used for replay or diagnostics |
+| `events` | Parsed event timeline |
+| `fru` | FRU-derived inventory details |
+| `sensors` | Sensor readings |
+| `hardware` | Normalized hardware inventory |
````
````diff
-`raw_payloads` is the durable source for offline re-analysis (especially for Redfish).
-Normalized fields should be treated as derivable output from raw source data.
+## `HardwareConfig`
 
-### Hardware sub-structure
+Main sections:
 
-```
-HardwareConfig
-├── board             BoardInfo        — server/motherboard identity
-├── devices           []HardwareDevice — CANONICAL INVENTORY (see below)
-├── cpus              []CPU
-├── memory            []MemoryDIMM
-├── storage           []Storage
-├── volumes           []StorageVolume  — logical RAID/VROC volumes
-├── pcie_devices      []PCIeDevice
-├── gpus              []GPU
-├── network_adapters  []NetworkAdapter
-├── network_cards     []NIC            (legacy/alternate source field)
-├── power_supplies    []PSU
-└── firmware          []FirmwareInfo
-```
+```text
+hardware.board
+hardware.devices
+hardware.cpus
+hardware.memory
+hardware.storage
+hardware.volumes
+hardware.pcie_devices
+hardware.gpus
+hardware.network_adapters
+hardware.network_cards
+hardware.power_supplies
+hardware.firmware
+```
 
----
+`network_cards` is legacy/alternate source data.
+`hardware.devices` is the canonical cross-section inventory.
 
-## Canonical Device Repository (`hardware.devices`)
+## Canonical inventory: `hardware.devices`
 
-`hardware.devices` is the **single source of truth** for hardware inventory.
+`hardware.devices` is the single source of truth for device-oriented UI and Reanimator export.
 
-### Rules — must not be violated
+Required rules:
 
-1. All UI tabs displaying hardware components **must read from `hardware.devices`**.
-2. The Device Inventory tab shows kinds: `pcie`, `storage`, `gpu`, `network`.
-3. The Reanimator exporter **must use the same `hardware.devices`** as the UI.
-4. Any discrepancy between UI data and Reanimator export data is a **bug**.
-5. New hardware attributes must be added to the canonical device schema **first**,
-   then mapped to Reanimator/UI — never the other way around.
-6. The exporter should group/filter canonical records by section, not rebuild data
-   from multiple sources.
+1. UI hardware views must read from `hardware.devices`
+2. Reanimator conversion must derive device sections from `hardware.devices`
+3. UI/export mismatches are bugs, not accepted divergence
+4. New shared device fields belong in `HardwareDevice` first
 
-### Deduplication logic (applied once by repository builder)
+Deduplication priority:
 
-| Priority | Key used |
-|----------|----------|
-| 1 | `serial_number` — usable (not empty, not `N/A`, `NA`, `NONE`, `NULL`, `UNKNOWN`, `-`) |
-| 2 | `bdf` — PCI Bus:Device.Function address |
-| 3 | No merge — records remain distinct if both serial and bdf are absent |
+| Priority | Key |
+|----------|-----|
+| 1 | usable `serial_number` |
+| 2 | `bdf` |
+| 3 | keep records separate |
````
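The dedup priority above can be sketched as a key-selection function: a usable serial number wins, then the PCI BDF address, otherwise records stay distinct. The placeholder set of unusable serial strings follows the documented list; function names are illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// usableSerial rejects empty and placeholder serial values
// (N/A, NA, NONE, NULL, UNKNOWN, "-") per the documented rule.
func usableSerial(s string) bool {
	switch strings.ToUpper(strings.TrimSpace(s)) {
	case "", "N/A", "NA", "NONE", "NULL", "UNKNOWN", "-":
		return false
	}
	return true
}

// dedupKey returns the merge key for a device record, or "" when the
// record must stay distinct (no usable serial and no bdf).
func dedupKey(serial, bdf string) string {
	if usableSerial(serial) {
		return "sn:" + serial
	}
	if bdf != "" {
		return "bdf:" + bdf
	}
	return ""
}

func main() {
	fmt.Println(dedupKey("ABC123", "0000:3b:00.0")) // serial wins
	fmt.Println(dedupKey("N/A", "0000:3b:00.0"))    // falls back to bdf
	fmt.Println(dedupKey("", ""))                   // no merge key
}
```

Records that share a non-empty key are merged once by the repository builder; everything else passes through unchanged.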
````diff
-### Device schema alignment
+## Raw payloads
 
-Keep `hardware.devices` schema as close as possible to Reanimator JSON field names.
-This minimizes translation logic in the exporter and prevents drift.
+`raw_payloads` is authoritative for replayable sources.
 
----
+Current important payloads:
+- `redfish_tree`
+- `redfish_fetch_errors`
+- `source_timezone`
 
-## Source metadata fields (stored directly on `AnalysisResult`)
+Normalized hardware fields are derived output, not the long-term source of truth.
 
-Carried by both `/api/status` and `/api/config`:
+## Raw export package
 
-```json
-{
-  "source_type": "api",
-  "protocol": "redfish",
-  "target_host": "10.0.0.1",
-  "collected_at": "2026-02-10T15:30:00Z"
-}
-```
-
-Valid `source_type` values: `archive`, `api`
-Valid `protocol` values: `redfish`, `ipmi` (empty is allowed for archive uploads)
-
----
-
-## Raw Export Package (reopenable artifact)
-
-`Export Raw Data` does not merely dump `AnalysisResult`; it emits a reopenable raw package
-(JSON or ZIP bundle) that carries source data required for re-analysis.
+`/api/export/json` produces a reopenable raw-export artifact.
 
 Design rules:
-- raw source is authoritative (`redfish_tree` or original file bytes)
-- imports must re-analyze from raw source
-- parsed field snapshots included in bundles are diagnostic artifacts, not the source of truth
+- raw source stays authoritative
+- uploads of raw-export artifacts must re-analyze from raw source
+- parsed snapshots inside the bundle are diagnostic only
````
---
Collectors live in `internal/collector/`.

Core files:

- `registry.go` for protocol registration (`redfish`, `ipmi`)
- `redfish.go` for live collection
- `redfish_replay.go` for replay from raw payloads
- `redfish_replay_gpu.go` for profile-driven GPU replay collectors and GPU fallback helpers
- `redfish_replay_storage.go` for profile-driven storage replay collectors and storage recovery helpers
- `redfish_replay_inventory.go` for replay inventory collectors (PCIe, NIC, BMC MAC, NIC enrichment)
- `redfish_replay_fru.go` for board fallback helpers and Assembly/FRU replay extraction
- `redfish_replay_profiles.go` for profile-driven replay helpers and vendor-aware recovery helpers
- `redfishprofile/` for Redfish profile matching and acquisition/analysis hooks
- `ipmi_mock.go` for the placeholder IPMI implementation
- `types.go` for request/progress contracts

---
## Redfish collector

**Status:** active production path.

### Request contract (from server)

Passed through from `/api/collect` after validation:

- `host`, `port`, `username`
- `auth_type=password|token` (+ matching credential field)
- `tls_mode=strict|insecure`
- optional `power_on_if_host_off`
- optional `debug_payloads` for extended diagnostics

### Core rule

Live collection and replay must stay behaviorally aligned.
If the collector adds a fallback, probe, or normalization rule, replay must mirror it.
### Preflight and host power

- `Probe()` is used before collection to verify API connectivity and report the current host `PowerState`
- if the host is off, the collector logs a warning and proceeds with collection; inventory data
  may be incomplete when the host is powered off
- power-on and power-off are not performed by the collector
### Collected data

| Category | Notes |
|----------|-------|
| CPU | Model, cores, threads, socket, status |
| Memory | DIMM slot, size, type, speed, serial, manufacturer |
| Storage | Slot, type, model, serial, firmware, interface, status |
| GPU | Detected via PCIe class + NVIDIA vendor ID |
| PSU | Model, serial, wattage, firmware, telemetry (input/output power, voltage) |
| NIC | Model, serial, port count, BDF |
| PCIe | Slot, vendor_id, device_id, BDF, link width/speed |
| Firmware | BIOS, BMC versions |

### Raw snapshot

The full Redfish response tree is stored in `result.RawPayloads["redfish_tree"]`.
This allows future offline re-analysis without re-collecting from a live BMC.

### Skip hung requests

Redfish collection uses a two-level context model:

- `ctx` — job lifetime context, cancelled only on explicit job cancel
- `collectCtx` — collection phase context, derived from `ctx`; covers snapshot, prefetch, and plan-B

`collectCtx` is cancelled when the user presses the "Skip hung" button («Пропустить зависшие»).
On skip, all in-flight HTTP requests in the current phase are aborted immediately via context
cancellation, the crawler and plan-B loops exit, and execution proceeds to the replay phase using
whatever was collected in `rawTree`. The result is partial but valid.
The skip signal travels: UI button → `POST /api/collect/{id}/skip` → `JobManager.SkipJob()` →
closes `skipCh` → goroutine in `Collect()` → `cancelCollect()`.
The skip button is visible during the `running` state and hidden once the job reaches a terminal state.

### Unified Redfish analysis pipeline (live == replay)

LOGPile uses a **single Redfish analyzer path**:

1. The live collector crawls the Redfish API and builds `raw_payloads.redfish_tree`
2. The parsed result is produced by replaying that tree through the same analyzer used by raw import

This guarantees that live collection and `Export Raw Data` re-open/re-analyze produce the same
normalized output for the same `redfish_tree`.

### Extended diagnostics toggle

The live collect form exposes a user-facing checkbox for extended diagnostics.
- default collection prioritizes inventory completeness and bounded runtime
- when extended diagnostics is off, heavy HGX component-chassis critical plan-B retries
  (`Assembly`, `Accelerators`, `Drives`, `NetworkAdapters`, `PCIeDevices`) are skipped
- when extended diagnostics is on, those retries are allowed and extra debug payloads are collected

This toggle is intended for operator-driven deep diagnostics on problematic hosts, not for the default path.

### Snapshot crawler behavior (important)

The Redfish snapshot crawler is intentionally:

- **bounded** (`LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`)
- **prioritized** (PCIe, Fabrics, FirmwareInventory, Storage, PowerSubsystem, ThermalSubsystem)
- **tolerant** (skips noisy expected failures, strips `#fragment` from `@odata.id`)

Design notes:

- Queue capacity is sized to the snapshot cap to avoid worker deadlocks on large trees.
- UI progress is coarse and human-readable; detailed per-request diagnostics are available via debug logs.
- `LOGPILE_REDFISH_DEBUG=1` and `LOGPILE_REDFISH_SNAPSHOT_DEBUG=1` enable console diagnostics.
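The bounded/tolerant/normalizing behavior can be sketched as a breadth-first crawl. A hedged sketch: the `fetch` callback and queue shape are illustrative, not the real crawler implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// crawl visits Redfish documents breadth-first, stopping at maxDocs,
// skipping paths that fail to fetch (tolerant), and stripping any
// #fragment from paths before visiting (normalization).
func crawl(root string, maxDocs int, fetch func(path string) ([]string, error)) map[string]bool {
	visited := make(map[string]bool)
	queue := []string{root}
	for len(queue) > 0 && len(visited) < maxDocs { // bounded by the doc cap
		path := queue[0]
		queue = queue[1:]
		if i := strings.Index(path, "#"); i >= 0 {
			path = path[:i] // strip #fragment from @odata.id
		}
		if visited[path] {
			continue
		}
		refs, err := fetch(path)
		if err != nil {
			continue // tolerant: skip expected vendor-specific failures
		}
		visited[path] = true
		queue = append(queue, refs...)
	}
	return visited
}

func main() {
	tree := map[string][]string{
		"/redfish/v1": {"/redfish/v1/Systems", "/redfish/v1/Chassis#frag"},
	}
	fetch := func(p string) ([]string, error) { return tree[p], nil }
	fmt.Println(len(crawl("/redfish/v1", 10, fetch)))
}
```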
### Discovery model

The collector does not rely on one fixed vendor tree.
It discovers and follows Redfish resources dynamically from root collections such as:

- `Systems` → per-system resources
- `Chassis` → enclosure/board data
- `Managers` → BMC/firmware info

When adding Redfish mappings, follow these principles:

- Support alternate collection paths (resources may appear at different odata URLs).
- Follow `@odata.id` references and handle embedded `Members` arrays.
- Prefer **raw-tree replay compatibility**: if the live collector adds a fallback/probe, the replay analyzer must mirror it.
- Deduplicate by serial / BDF / slot+model (in that priority order).
- Prefer tolerant/fallback parsing — missing fields should be silently skipped,
  not cause the whole collection to fail.

After minimal discovery the collector builds `MatchSignals` and selects a Redfish profile mode:

- `matched` when one or more profiles score with high confidence
- `fallback` when vendor/platform confidence is low; in this mode the collector aggregates safe additive profile probes to maximize snapshot completeness
### Profile modules

Profile modules may contribute:

- primary acquisition seeds
- bounded `PlanBPaths` for secondary recovery
- critical paths
- acquisition notes/diagnostics
- tuning hints such as the snapshot document cap, prefetch behavior, and expensive post-probe toggles
- post-probe policy for numeric collection recovery, direct NVMe `Disk.Bay` recovery, and sensor post-probe enablement
- recovery policy for critical collection member retry, slow numeric plan-B probing, and profile-specific plan-B activation
- scoped path policy for discovered `Systems/*`, `Chassis/*`, and `Managers/*` branches when a profile needs extra seeds/critical targets beyond the vendor-neutral core set
- prefetch policy for which critical paths are eligible for adaptive prefetch and which path shapes are explicitly excluded
### Vendor-specific storage fallbacks (Supermicro and similar)

When standard `Storage/.../Drives` collections are empty, collector/replay may recover drives via:

- `Storage.Links.Enclosures[*] -> .../Drives`
- direct probing of finite `Disk.Bay` candidates (`Disk.Bay.0`, `Disk.Bay0`, `.../0`)

This is required for some BMCs that publish drive inventory in vendor-specific paths while leaving
standard collections empty.

Model- or topology-specific `CriticalPaths` and profile `PlanBPaths` must live in the profile
module that owns the behavior. The collector core may execute those paths, but it should not
hardcode vendor-specific recovery targets.

The same ownership split applies across the pipeline: the collector core executes bounded loops,
while profiles own the policy.

- Expensive post-probe decisions: profiles own whether bounded post-probe loops are enabled for a
  given platform shape.
- Critical recovery passes: profiles own whether member retry, slow numeric recovery, and
  profile-specific plan-B passes are enabled.
- Extra discovered-path branches (such as storage controller subtrees): profiles provide them as
  scoped suffix policy rather than hardcoding platform-shaped suffixes into the collector core
  baseline seed list.
- Prefetch shaping: profiles own the include/exclude rules for which critical paths participate in
  adaptive prefetch.
- Critical inventory shaping: the collector core keeps only a minimal vendor-neutral critical
  baseline, while profiles own additional system/chassis/manager critical suffixes and top-level
  critical targets.

Resolved live acquisition plans should be built inside `redfishprofile/`, not by hand in
`redfish.go`. The collector core should receive the discovered resources plus the selected profile
plan and then execute the resolved seed/critical paths.
When profile behavior depends on what discovery actually returned, use a post-discovery refinement
hook in `redfishprofile/` instead of hardcoding guessed absolute paths in the static plan. MSI GPU
chassis refinement is the reference example.
Live Redfish collection must expose profile-match diagnostics:

- collector logs must include the selected modules and the score for every known module
- job status responses must carry structured `active_modules` and `module_scores`
- the collect page should render active modules as chips from structured status data, not by
  parsing log lines
Profile matching may use stable platform grammar signals in addition to vendor strings:

- discovered member/resource naming from lightweight discovery collections
- firmware inventory member IDs
- OEM action names and linked target paths embedded in discovery documents
- replay-only snapshot hints such as OEM assembly/type markers when they are present in
  `raw_payloads.redfish_tree`

### PSU source preference (newer Redfish)

PSU inventory source order:

1. `Chassis/*/PowerSubsystem/PowerSupplies` (preferred on X14+/newer Redfish)
2. `Chassis/*/Power` (legacy fallback)

### Replay analysis directives

On replay, profile-derived analysis directives may enable vendor-specific inventory linking
helpers such as processor-GPU fallback, chassis-ID alias resolution, and bounded storage recovery.
Replay should resolve a structured analysis plan inside `redfishprofile/`, analogous to the live
acquisition plan. The replay core may execute collectors against the resolved directives, but
snapshot-aware vendor decisions should live in profile analysis hooks, not in `redfish_replay.go`.
GPU and storage replay executors should consume the resolved analysis plan directly, not a raw
`AnalysisDirectives` struct, so the boundary between planning and execution stays explicit.
### Testing and fixtures

Profile matching and acquisition tuning must be regression-tested against repo-owned compact
fixtures under `internal/collector/redfishprofile/testdata/`, derived from representative
raw-export snapshots, for at least MSI and Supermicro shapes.
When multiple raw-export snapshots exist for the same platform, profile selection must remain
stable across those sibling fixtures unless the topology actually changes.
Analysis-plan metadata should be stored in replay raw payloads so that vendor hook activation is
debuggable offline.

### Progress reporting

The collector emits progress log entries at each stage (connecting, enumerating systems,
collecting CPUs, etc.) so the UI can display meaningful status.
Current progress message strings are user-facing and may be localized.
### Stored raw data

Important raw payloads:

- `raw_payloads.redfish_tree`
- `raw_payloads.redfish_fetch_errors`
- `raw_payloads.redfish_profiles`
- `raw_payloads.source_timezone` when available

---
### Redfish implementation guidance

When changing collection logic:

1. Prefer profile modules over ad-hoc vendor branches in the collector core
2. Keep expensive probing bounded
3. Deduplicate by serial, then BDF, then location/model fallbacks
4. Preserve replay determinism from saved raw payloads
5. Add tests for both the motivating topology and a negative case

### Known vendor fallbacks

- empty standard drive collections may trigger bounded `Disk.Bay` probing
- `Storage.Links.Enclosures[*]` may be followed to recover physical drives
- `PowerSubsystem/PowerSupplies` is preferred over legacy `Power` when available
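The bounded `Disk.Bay` probing fallback can be sketched as a finite candidate loop. A hedged sketch: `probeDiskBays`, the bay cap, and the `fetch` callback are illustrative, not the real collector API.

```go
package main

import "fmt"

// probeDiskBays tries a finite set of vendor-specific drive path shapes
// (Disk.Bay.N, Disk.BayN, plain index) under a storage base path and
// returns the paths that resolve. Probing is bounded by maxBays.
func probeDiskBays(base string, fetch func(path string) bool) []string {
	var found []string
	const maxBays = 4 // keep vendor probing bounded
	for i := 0; i < maxBays; i++ {
		candidates := []string{
			fmt.Sprintf("%s/Drives/Disk.Bay.%d", base, i),
			fmt.Sprintf("%s/Drives/Disk.Bay%d", base, i),
			fmt.Sprintf("%s/Drives/%d", base, i),
		}
		for _, c := range candidates {
			if fetch(c) {
				found = append(found, c)
				break // one hit per bay index is enough
			}
		}
	}
	return found
}

func main() {
	live := map[string]bool{"/redfish/v1/Systems/1/Storage/0/Drives/Disk.Bay.0": true}
	fmt.Println(probeDiskBays("/redfish/v1/Systems/1/Storage/0", func(p string) bool { return live[p] }))
}
```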
## IPMI collector

**Status:** mock scaffold only — not implemented.

It is registered in the collector registry but returns placeholder data. It remains registered
for protocol completeness, but it is not a real collection path. Real IPMI support is a future
work item.
---
## Framework

### Registration

Parsers live in `internal/parser/` and vendor implementations live in `internal/parser/vendors/`.
Each vendor parser registers itself via Go's `init()` side-effect import pattern.

Core behavior:

- registration uses `init()` side effects
- all registered parsers run `Detect()`
- the highest-confidence parser wins
- the generic fallback stays last and low-confidence

All registrations are collected in `internal/parser/vendors/vendors.go`:

```go
import (
    _ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
    _ "git.mchus.pro/mchus/logpile/internal/parser/vendors/dell"
    // etc.
)
```
### VendorParser interface

```go
type VendorParser interface {
	Name() string                     // human-readable name
	Vendor() string                   // vendor identifier string
	Version() string                  // parser version (increment on logic changes)
	Detect(files []ExtractedFile) int // confidence 0–100
	Parse(files []ExtractedFile) (*models.AnalysisResult, error)
}
```
### Selection logic

All registered parsers run `Detect()` against the uploaded archive's file list.
The parser with the **highest confidence score** is selected.
Multiple parsers may return a score above 0; only the top scorer is used.

## Adding a parser

1. Create `internal/parser/vendors/<vendor>/`
2. Start from `internal/parser/vendors/template/parser.go.template`
3. Implement `Detect()` and `Parse()`
4. Add a blank import in `internal/parser/vendors/vendors.go`
5. Add at least one positive and one negative detection test
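The registry plus highest-confidence selection can be sketched end to end. A minimal sketch under assumptions: `Register`, `selectParser`, and the trimmed interface are illustrative; the real `VendorParser` also carries `Vendor`, `Version`, and `Parse`.

```go
package main

import "fmt"

type ExtractedFile struct{ Name string }

// VendorParser is a trimmed, illustrative version of the real interface.
type VendorParser interface {
	Name() string
	Detect(files []ExtractedFile) int
}

var registry []VendorParser

// Register is called from each vendor package's init() via blank imports.
func Register(p VendorParser) { registry = append(registry, p) }

type dellParser struct{}

func (dellParser) Name() string { return "dell" }
func (dellParser) Detect(files []ExtractedFile) int {
	for _, f := range files {
		if f.Name == "tsr/metadata.json" { // unique vendor marker
			return 90
		}
	}
	return 0
}

type genericParser struct{}

func (genericParser) Name() string               { return "generic" }
func (genericParser) Detect([]ExtractedFile) int { return 10 } // low-confidence fallback

// selectParser runs Detect() on every registered parser and keeps the top scorer.
func selectParser(files []ExtractedFile) VendorParser {
	var best VendorParser
	bestScore := 0
	for _, p := range registry {
		if s := p.Detect(files); s > bestScore {
			best, bestScore = p, s
		}
	}
	return best
}

func main() {
	Register(dellParser{})
	Register(genericParser{})
	fmt.Println(selectParser([]ExtractedFile{{Name: "tsr/metadata.json"}}).Name())
}
```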
`Detect()` tips:

- Look for unique filenames or directory names.
- Check file content for vendor-specific markers.
- Return 70+ only when confident; return 0 if clearly not a match.

## Data quality rules

### System firmware only in `hardware.firmware`

`hardware.firmware` must contain system-level firmware only.
Device-bound firmware belongs on the device record and must not be duplicated at the top level.
### Parser versioning

Each parser file contains a `parserVersion` constant.
Increment the version whenever parsing logic changes — this helps trace which version produced a
given result.

### Strip embedded MAC addresses from model names

If a source embeds ` - XX:XX:XX:XX:XX:XX` in a model/name field, remove that suffix before storing it.

### Use `pci.ids` for empty or generic PCI model names

When `vendor_id` and `device_id` are known but the model name is missing or generic, resolve the
name via `internal/parser/vendors/pciids`.
### FirmwareInfo — system-level only

`Hardware.Firmware` must contain **only system-level firmware**: BIOS, BMC/iDRAC,
Lifecycle Controller, CPLD, storage controllers, BOSS adapters.

**Device-bound firmware** (NIC, GPU, PSU, disk, backplane) **must NOT be added to
`Hardware.Firmware`**. It belongs in the device's own `Firmware` field and is already
present there. Duplicating it in `Hardware.Firmware` causes double entries in Reanimator.

## Active vendor coverage

| Vendor ID | Input family | Notes |
|-----------|--------------|-------|
| `dell` | TSR ZIP archives | Broad hardware, firmware, sensors, lifecycle events |
| `easy_bee` | `bee-support-*.tar.gz` | Imports embedded `export/bee-audit.json` snapshot from reanimator-easy-bee bundles |
| `h3c_g5` | H3C SDS G5 bundles | INI/XML/CSV-driven hardware and event parsing |
| `h3c_g6` | H3C SDS G6 bundles | Similar flow with G6-specific files |
| `hpe_ilo_ahs` | HPE iLO Active Health System (`.ahs`) | Proprietary `ABJR` container with gzip-compressed `zbb` members; parser combines SMBIOS-style inventory strings and embedded Redfish storage JSON |
| `inspur` | onekeylog archives | FRU/SDR plus optional Redis enrichment |
| `lenovo_xcc` | Lenovo XCC mini-log ZIP archives | JSON inventory + platform event logs |
| `nvidia` | HGX Field Diagnostics | GPU- and fabric-heavy diagnostic input |
| `nvidia_bug_report` | `nvidia-bug-report-*.log.gz` | dmidecode, lspci, NVIDIA driver sections |
| `unraid` | Unraid diagnostics/log bundles | Server and storage-focused parsing |
| `xfusion` | xFusion iBMC `tar.gz` dump / file export | AppDump + RTOSDump + LogDump merge for hardware and firmware |
| `xigmanas` | XigmaNAS plain logs | FreeBSD/NAS-oriented inventory |
| `generic` | fallback | Low-confidence text fallback when nothing else matches |
## Reanimator exporter cooperation

The Reanimator exporter filters by `FirmwareInfo.DeviceName` prefix and by
`FirmwareInfo.Description` (FQDD prefix). Parsers must cooperate:

- Store the device's FQDD (or equivalent slot identifier) in `FirmwareInfo.Description`
  for all firmware entries that come from a per-device inventory source (e.g. Dell
  `DCIM_SoftwareIdentity`).
- FQDD prefixes that are device-bound: `NIC.`, `PSU.`, `Disk.`, `RAID.Backplane.`, `GPU.`
### NIC/device model names — strip embedded MAC addresses

Some vendors (confirmed: Dell TSR) embed the MAC address in the device model name field,
e.g. `ProductName = "NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"`.

**Rule:** Strip any ` - XX:XX:XX:XX:XX:XX` suffix from model/name strings before storing
them in `FirmwareInfo.DeviceName`, `NetworkAdapter.Model`, or any other model field.

Use `nicMACInModelRE` (defined in the Dell parser) or an equivalent regex:

```
\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$
```

This applies to **all** string fields used as device names or model identifiers.
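Applying the regex above looks like this. A small sketch: the regex is the one documented; `stripMAC` is an illustrative helper name, not the real Dell parser function.

```go
package main

import (
	"fmt"
	"regexp"
)

// nicMACInModelRE matches a trailing " - XX:XX:XX:XX:XX:XX" MAC suffix.
var nicMACInModelRE = regexp.MustCompile(`\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$`)

// stripMAC removes an embedded MAC suffix from a model/name string.
func stripMAC(model string) string {
	return nicMACInModelRE.ReplaceAllString(model, "")
}

func main() {
	fmt.Println(stripMAC("NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"))
}
```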
### PCI device name enrichment via pci.ids

If a PCIe device, GPU, NIC, or any hardware component has a `vendor_id` + `device_id`
but its model/name field is **empty or generic** (e.g. blank, equals the description,
or is just a raw hex ID), the parser **must** attempt to resolve the human-readable
model name from the embedded `pci.ids` database before storing the result.

**Rule:** When `Model` (or the equivalent name field) is empty and both `VendorID` and
`DeviceID` are non-zero, call the pciids lookup and use the result as the model name.

```go
// Example pattern — use in any parser that handles PCIe/GPU/NIC devices:
if strings.TrimSpace(device.Model) == "" && device.VendorID != 0 && device.DeviceID != 0 {
    if name := pciids.Lookup(device.VendorID, device.DeviceID); name != "" {
        device.Model = name
    }
}
```

This rule applies to all vendor parsers. The pciids package is available at
`internal/parser/vendors/pciids`. See ADL-005 for the rationale.

**Do not hardcode model name strings.** If a device is unknown today, it will be
resolved automatically once `pci.ids` is updated.
---

## Vendor parsers
### Inspur / Kaytus (`inspur`)

**Status:** Ready. Tested on KR4268X2 (onekeylog format).

**Archive format:** `.tar.gz` onekeylog

**Primary source files:**

| File | Content |
|------|---------|
| `asset.json` | Base hardware inventory |
| `component.log` | Component list |
| `devicefrusdr.log` | FRU and SDR data |
| `onekeylog/runningdata/redis-dump.rdb` | Runtime enrichment (optional) |

**Redis RDB enrichment** (applied conservatively — fills missing fields only):
- GPU: `serial_number`, `firmware` (VBIOS/FW), runtime telemetry
- NIC: firmware, serial, part number (when text logs leave fields empty)

**Module structure:**
```
inspur/
  parser.go — main parser + registration
  sdr.go    — sensor/SDR parsing
  fru.go    — FRU serial parsing
  asset.go  — asset.json parsing
  syslog.go — syslog parsing
```
---

### Dell TSR (`dell`)

**Status:** Ready (v3.0). Tested on nested TSR archives with embedded `*.pl.zip`.

**Archive format:** `.zip` (outer archive + nested `*.pl.zip`)

**Primary source files:**
- `tsr/metadata.json`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_CIM_Sensor.xml`
- `tsr/hardware/sysinfo/lcfiles/curr_lclog.xml`

**Extracted data:**
- Board/system identity and BIOS/iDRAC firmware
- CPU, memory, physical disks, virtual disks, PSU, NIC, PCIe
- GPU inventory (`DCIM_VideoView`) + GPU sensor enrichment (`DCIM_GPUSensor`)
- Controller/backplane inventory (`DCIM_ControllerView`, `DCIM_EnclosureView`)
- Sensor readings (temperature/voltage/current/power/fan/utilization)
- Lifecycle events (`curr_lclog.xml`)

---
### NVIDIA HGX Field Diagnostics (`nvidia`)

**Status:** Ready (v1.1.0). Works with any server vendor.

**Archive format:** `.tar` / `.tar.gz`

**Confidence scoring:**

| File | Score |
|------|-------|
| `unified_summary.json` with "HGX Field Diag" marker | +40 |
| `summary.json` | +20 |
| `summary.csv` | +15 |
| `gpu_fieldiag/` directory | +15 |

**Source files:**

| File | Content |
|------|---------|
| `output.log` | dmidecode — server manufacturer, model, serial number |
| `unified_summary.json` | GPU details, NVSwitch devices, PCI addresses |
| `summary.json` | Diagnostic test results and error codes |
| `summary.csv` | Alternative test results format |

**Extracted data:**
- GPUs: slot, model, manufacturer, firmware (VBIOS), BDF
- NVSwitch devices: slot, device_class, vendor_id, device_id, BDF, link speed/width
- Events: diagnostic test failures (connectivity, gpumem, gpustress, pcie, nvlink, nvswitch, power)

**Severity mapping:**
- `info` — tests passed
- `warning` — e.g. "Row remapping failed"
- `critical` — error codes 300+

**Known limitations:**
- Detailed logs in `gpu_fieldiag/*.log` are not parsed.
- No CPU, memory, or storage extraction (not present in field diag archives).

---
### NVIDIA Bug Report (`nvidia_bug_report`)

**Status:** Ready (v1.0.0).

**File format:** `nvidia-bug-report-*.log.gz` (gzip-compressed text)

**Confidence:** 85 (high priority for a matching filename pattern)

**Source sections parsed:**

| dmidecode section | Extracts |
|-------------------|----------|
| System Information | server serial, UUID, manufacturer, product name |
| Processor Information | CPU model, serial, core/thread count, frequency |
| Memory Device | DIMM slot, size, type, manufacturer, serial, part number, speed |
| System Power Supply | PSU location, manufacturer, model, serial, wattage, firmware, status |

| Other source | Extracts |
|--------------|----------|
| `lspci -vvv` (Ethernet/Network/IB) | NIC model (from VPD), BDF, slot, P/N, S/N, port count, port type |
| `/proc/driver/nvidia/gpus/*/information` | GPU model, BDF, UUID, VBIOS version, IRQ |
| NVRM version line | NVIDIA driver version |

**Known limitations:**
- Driver error/warning log lines are not yet extracted.
- GPU temperature/utilization metrics require additional parsing sections.

---
### XigmaNAS (`xigmanas`)

**Status:** Ready.

**Archive format:** Plain log files (FreeBSD-based NAS system)

**Detection:** Files named `xigmanas`, `system`, or `dmesg`; content containing "XigmaNAS" or "FreeBSD"; SMART data presence.

**Extracted data:**
- System: firmware version, uptime, CPU model, memory configuration, hardware platform
- Storage: disk models, serial numbers, capacity, health, SMART temperatures
- Populates: `Hardware.Firmware`, `Hardware.CPUs`, `Hardware.Memory`, `Hardware.Storage`, `Sensors`

---
### Unraid (`unraid`)

**Status:** Ready (v1.0.0).

**Archive format:** Unraid diagnostics archive contents (text-heavy diagnostics directories).
Detection relies on content markers (e.g. `Unraid kernel build`, parity data markers).

General parser guidance:

- Be conservative with high detect scores
- Prefer filling missing fields over overwriting stronger source data
- Keep parser version constants current when behavior changes
- Any new vendor-specific filtering or dedup logic must ship with tests for that vendor format
---

### HPE iLO AHS (`hpe_ilo_ahs`)

**Status:** Ready (v1.0.0). Tested on HPE ProLiant Gen11 `.ahs` export from iLO 6.

**Archive format:** `.ahs` single-file Active Health System export.

**Detection:** Single-file input with `ABJR` container header and HPE AHS member names
such as `CUST_INFO.DAT`, `*.zbb`, `ilo_boot_support.zbb`.

**Extracted data (current):**
- System board identity (manufacturer, model, serial, part number)
- iLO / System ROM / SPS top-level firmware
- CPU inventory (model-level)
- Memory DIMM inventory for populated slots
- PSU inventory
- PCIe / OCP NIC inventory from SMBIOS-style slot records
- Storage controller and physical drives from embedded Redfish JSON inside `zbb` members
- Basic iLO event log entries with timestamps when present

**Implementation note:** The format is proprietary. Parser support is intentionally hybrid:
container parsing (`ABJR` + gzip) plus structured extraction from embedded Redfish objects and
printable SMBIOS/FRU payloads. This is sufficient for inventory-grade parsing without decoding the
entire internal `zbb` schema.

---
### xFusion iBMC Dump / File Export (`xfusion`)

**Status:** Ready (v1.1.0). Tested on xFusion G5500 V7 `tar.gz` exports.

**Archive format:** `tar.gz` dump exported from the iBMC UI, including `AppDump/`, `RTOSDump/`,
and `LogDump/` trees.

**Detection:** `AppDump/FruData/fruinfo.txt`, `AppDump/card_manage/card_info`,
`RTOSDump/versioninfo/app_revision.txt`, and `LogDump/netcard/netcard_info.txt`.

**Extracted data (current):**
- Board / FRU inventory from `fruinfo.txt`
- CPU inventory from `CpuMem/cpu_info`
- Memory DIMM inventory from `CpuMem/mem_info`
- GPU inventory from `card_info`
- OCP NIC inventory by merging `card_info` with `LogDump/netcard/netcard_info.txt`
- PSU inventory from `BMC/psu_info.txt`
- Physical storage from `StorageMgnt/PhysicalDrivesInfo/*/disk_info`
- System firmware entries from `RTOSDump/versioninfo/app_revision.txt`
- Maintenance events from `LogDump/maintenance_log`

---
### Generic text fallback (`generic`)

**Status:** Ready (v1.0.0).

---
| Vendor | ID | Status | Tested on |
|--------|----|--------|-----------|
| Dell TSR | `dell` | Ready | TSR nested zip archives |
| Reanimator Easy Bee | `easy_bee` | Ready | `bee-support-*.tar.gz` support bundles |
| HPE iLO AHS | `hpe_ilo_ahs` | Ready | iLO 6 `.ahs` exports |
| Inspur / Kaytus | `inspur` | Ready | KR4268X2 onekeylog |
| Lenovo XCC mini-log | `lenovo_xcc` | Ready | ThinkSystem SR650 V3 XCC mini-log ZIP |
| NVIDIA HGX Field Diag | `nvidia` | Ready | Various HGX servers |
| NVIDIA Bug Report | `nvidia_bug_report` | Ready | H100 systems |
| Unraid | `unraid` | Ready | Unraid diagnostics archives |
| xFusion iBMC dump | `xfusion` | Ready | G5500 V7 file-export `tar.gz` bundles |
| XigmaNAS | `xigmanas` | Ready | FreeBSD NAS logs |
| H3C SDS G5 | `h3c_g5` | Ready | H3C UniServer R4900 G5 SDS archives |
| H3C SDS G6 | `h3c_g6` | Ready | H3C UniServer R4700 G6 SDS archives |
@@ -1,366 +1,93 @@
# 07 — Exporters & Reanimator Integration

## Export endpoints summary

| Endpoint | Format | Filename pattern |
|----------|--------|-----------------|
| `GET /api/export/csv` | CSV — serial numbers | `YYYY-MM-DD (MODEL) - SN.csv` |
| `GET /api/export/json` | **Raw export package** (JSON or ZIP bundle) for reopen/re-analysis | `YYYY-MM-DD (MODEL) - SN.(json\|zip)` |
| `GET /api/export/reanimator` | Reanimator hardware JSON | `YYYY-MM-DD (MODEL) - SN.json` |

---

## Raw Export (`Export Raw Data`)

### Purpose

Preserve enough source data to reproduce parsing later, after parser fixes, without requiring
another live collection from the target system.

### Format

`/api/export/json` returns a **raw export package**:
- JSON package (machine-readable), or
- ZIP bundle containing:
  - `raw_export.json` — machine-readable package
  - `collect.log` — human-readable collection + parsing summary
  - `parser_fields.json` — structured parsed field snapshot for diffs between parser versions

### Import / reopen behavior

When a raw export package is uploaded back into LOGPile:
- the app **re-analyzes from raw source**
- it does **not** trust embedded parsed output as the source of truth

For Redfish, this means replay from `raw_payloads.redfish_tree`.

### Design rule

Raw export is a **re-analysis artifact**, not a final report dump. Keep it self-contained and
forward-compatible where possible (versioned package format, additive fields only).

---

## Reanimator Export

### Purpose

Exports hardware inventory data in the format expected by the Reanimator asset tracking
system. Enables one-click push from LOGPile to an external asset management platform.

### Implementation files

| File | Role |
|------|------|
| `internal/exporter/reanimator_models.go` | Go structs for Reanimator JSON |
| `internal/exporter/reanimator_converter.go` | `ConvertToReanimator()` and helpers |
| `internal/server/handlers.go` | `handleExportReanimator()` HTTP handler |

### Conversion rules

- Source: canonical `hardware.devices` repository (see [`04-data-models.md`](04-data-models.md))
- CPU manufacturer inferred from the model string (Intel / AMD / ARM / Ampere)
- PCIe serial number generated when absent: `{board_serial}-PCIE-{slot}`
- Status values normalized to: `OK`, `Warning`, `Critical`, `Unknown` (`Empty` only for memory slots)
- Timestamps in RFC3339 format
- `target_host` derived from the `filename` field (`redfish://…`, `ipmi://…`) if not in source; omitted if undeterminable
- `board.manufacturer` and `board.product_name` values of `"NULL"` treated as absent

### LOGPile → Reanimator field mapping

| LOGPile type | Reanimator section | Notes |
|---|---|---|
| `BoardInfo` | `board` | Direct mapping |
| `CPU` | `cpus` | + manufacturer (inferred) |
| `MemoryDIMM` | `memory` | Direct; empty slots included (`present=false`) |
| `Storage` | `storage` | Excluded if no `serial_number` |
| `PCIeDevice` | `pcie_devices` | Serial generated if missing |
| `GPU` | `pcie_devices` | `device_class=DisplayController` |
| `NetworkAdapter` | `pcie_devices` | `device_class=NetworkController` |
| `PSU` | `power_supplies` | Excluded if no serial or `present=false` |
| `FirmwareInfo` | `firmware` | Direct mapping |

### Inclusion / exclusion rules

**Included:**
- Memory slots with `present=false` (as Empty slots)
- PCIe devices without a serial number (serial is generated)

**Excluded:**
- Storage without `serial_number`
- PSU without `serial_number` or with `present=false`
- NetworkAdapters with `present=false`
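The inclusion and exclusion rules above can be sketched as simple predicates. This is an illustrative Python sketch, not the Go converter; the record shape (plain dicts with the field names from the mapping table) and the function names are assumptions.

```python
# Hypothetical predicates mirroring the exclusion rules above.
def include_storage(device):
    # Storage without serial_number is excluded.
    return bool(device.get("serial_number"))

def include_psu(device):
    # PSU without serial_number or with present=false is excluded.
    return bool(device.get("serial_number")) and device.get("present", False)
```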
---

## Reanimator Integration Guide

This section documents the Reanimator receiver-side JSON format (what the Reanimator
system expects when it ingests a LOGPile export).

> **Important:** The Reanimator endpoint uses a strict JSON decoder (`DisallowUnknownFields`).
> Any unknown field — including nested ones — causes `400 Bad Request`.
> Use only the `snake_case` keys listed here.
### Top-level structure

```json
{
  "filename": "redfish://10.10.10.103",
  "source_type": "api",
  "protocol": "redfish",
  "target_host": "10.10.10.103",
  "collected_at": "2026-02-10T15:30:00Z",
  "hardware": {
    "board": {...},
    "firmware": [...],
    "cpus": [...],
    "memory": [...],
    "storage": [...],
    "pcie_devices": [...],
    "power_supplies": [...]
  }
}
```

**Required:** `collected_at`, `hardware.board.serial_number`
**Optional:** `target_host`, `source_type`, `protocol`, `filename`

`source_type` values: `api`, `logfile`, `manual`
`protocol` values: `redfish`, `ipmi`, `snmp`, `ssh`

### Component status fields (all component sections)

Each component may carry:

| Field | Type | Description |
|-------|------|-------------|
| `status` | string | `OK`, `Warning`, `Critical`, `Unknown`, `Empty` |
| `status_checked_at` | RFC3339 | When status was last verified |
| `status_changed_at` | RFC3339 | When status last changed |
| `status_at_collection` | object | `{ "status": "...", "at": "..." }` — snapshot-time status |
| `status_history` | array | `[{ "status": "...", "changed_at": "...", "details": "..." }]` |
| `error_description` | string | Human-readable error for Warning/Critical |

### Board

```json
{
  "board": {
    "manufacturer": "Supermicro",
    "product_name": "X12DPG-QT6",
    "serial_number": "21D634101",
    "part_number": "X12DPG-QT6-REV1.01",
    "uuid": "d7ef2fe5-2fd0-11f0-910a-346f11040868"
  }
}
```

`serial_number` is required. `manufacturer` / `product_name` values of `"NULL"` are treated as absent.

### CPUs

```json
{
  "socket": 0,
  "model": "INTEL(R) XEON(R) GOLD 6530",
  "cores": 32,
  "threads": 64,
  "frequency_mhz": 2100,
  "max_frequency_mhz": 4000,
  "manufacturer": "Intel",
  "status": "OK"
}
```

`socket` (int) and `model` are required. Serial generated: `{board_serial}-CPU-{socket}`.

LOT format: `CPU_{VENDOR}_{MODEL_NORMALIZED}` → e.g. `CPU_INTEL_XEON_GOLD_6530`
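The generated-serial patterns above (for CPUs here, and for PCIe devices later in this guide) can be sketched directly. A minimal illustration; the format strings are exactly the documented ones, while the function names are assumptions.

```python
# Sketch of the documented fallback-serial formats.
def cpu_serial(board_serial, socket):
    # {board_serial}-CPU-{socket}
    return f"{board_serial}-CPU-{socket}"

def pcie_serial(board_serial, slot):
    # {board_serial}-PCIE-{slot}
    return f"{board_serial}-PCIE-{slot}"
```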
### Memory

```json
{
  "slot": "CPU0_C0D0",
  "location": "CPU0_C0D0",
  "present": true,
  "size_mb": 32768,
  "type": "DDR5",
  "max_speed_mhz": 4800,
  "current_speed_mhz": 4800,
  "manufacturer": "Hynix",
  "serial_number": "80AD032419E17CEEC1",
  "part_number": "HMCG88AGBRA191N",
  "status": "OK"
}
```

`slot` and `present` are required. `serial_number` is required when `present=true`.
Empty slots (`present=false`, `status="Empty"`) are included, but no component is created.

LOT format: `DIMM_{TYPE}_{SIZE_GB}GB` → e.g. `DIMM_DDR5_32GB`

### Storage

```json
{
  "slot": "OB01",
  "type": "NVMe",
  "model": "INTEL SSDPF2KX076T1",
  "size_gb": 7680,
  "serial_number": "BTAX41900GF87P6DGN",
  "manufacturer": "Intel",
  "firmware": "9CV10510",
  "interface": "NVMe",
  "present": true,
  "status": "OK"
}
```

`slot`, `model`, `serial_number`, and `present` are required.

LOT format: `{TYPE}_{INTERFACE}_{SIZE_TB}TB` → e.g. `SSD_NVME_07.68TB`

### Power Supplies

```json
{
  "slot": "0",
  "present": true,
  "model": "GW-CRPS3000LW",
  "vendor": "Great Wall",
  "wattage_w": 3000,
  "serial_number": "2P06C102610",
  "part_number": "V0310C9000000000",
  "firmware": "00.03.05",
  "status": "OK",
  "input_power_w": 137,
  "output_power_w": 104,
  "input_voltage": 215.25
}
```

`slot` and `present` are required. `serial_number` is required when `present=true`.
Telemetry fields (`input_power_w`, `output_power_w`, `input_voltage`) are stored in the observation only.

LOT format: `PSU_{WATTAGE}W_{VENDOR_NORMALIZED}` → e.g. `PSU_3000W_GREAT_WALL`

### PCIe Devices

```json
{
  "slot": "PCIeCard1",
  "vendor_id": 32902,
  "device_id": 2912,
  "bdf": "0000:18:00.0",
  "device_class": "MassStorageController",
  "manufacturer": "Intel",
  "model": "RAID Controller RSP3DD080F",
  "link_width": 8,
  "link_speed": "Gen3",
  "max_link_width": 8,
  "max_link_speed": "Gen3",
  "serial_number": "RAID-001-12345",
  "firmware": "50.9.1-4296",
  "status": "OK"
}
```

`slot` is required. Serial generated if absent: `{board_serial}-PCIE-{slot}`.

`device_class` values: `NetworkController`, `MassStorageController`, `DisplayController`, etc.

LOT format: `PCIE_{DEVICE_CLASS}_{MODEL_NORMALIZED}` → e.g. `PCIE_NETWORK_CONNECTX5`

### Firmware

```json
[
  { "device_name": "BIOS", "version": "06.08.05" },
  { "device_name": "BMC", "version": "5.17.00" }
]
```

Both fields are required. Changes trigger `FIRMWARE_CHANGED` timeline events.

---

### Import process (Reanimator side)

1. Validate `collected_at` (RFC3339) and `hardware.board.serial_number`.
2. Find or create the Asset by `board.serial_number` → `vendor_serial`.
3. For each component: filter out `present=false`, auto-determine the LOT, find or create the
   Component, create an Observation, update Installations.
4. Detect removed components (present in the previous snapshot, absent in the current one) → close the Installation.
5. Generate timeline events: `LOG_COLLECTED`, `INSTALLED`, `REMOVED`, `FIRMWARE_CHANGED`.

**Idempotency:** Repeated import of the same snapshot (same content hash) returns `200 OK`
with `"duplicate": true` and does not create duplicate records.
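The content-hash idempotency rule above can be sketched as follows. This is a hedged illustration only: the class, the in-memory store, and hashing the canonicalized JSON are assumptions about shape, not Reanimator's actual storage or hashing scheme.

```python
# Hypothetical sketch of content-hash idempotency for repeated imports.
import hashlib
import json

class IngestStore:
    def __init__(self):
        self.seen = {}

    def ingest(self, payload):
        # Canonicalize, then hash; a repeated snapshot maps to the same digest.
        digest = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()).hexdigest()
        if digest in self.seen:
            return {"status": "success", "duplicate": True}
        self.seen[digest] = payload
        return {"status": "success", "duplicate": False}
```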
### Reanimator API endpoint

```http
POST /ingest/hardware
Content-Type: application/json
```

**Success (201):**
```json
{
  "status": "success",
  "bundle_id": "lb_01J...",
  "asset_id": "mach_01J...",
  "collected_at": "2026-02-10T15:30:00Z",
  "duplicate": false,
  "summary": {
    "parts_observed": 15,
    "parts_created": 2,
    "installations_created": 2,
    "timeline_events_created": 9
  }
}
```

**Duplicate (200):**
```json
{ "status": "success", "duplicate": true, "message": "LogBundle with this content hash already exists" }
```

**Error (400):**
```json
{ "status": "error", "error": "validation_failed", "details": { "field": "...", "message": "..." } }
```

Common `400` causes:
- Unknown JSON field (strict decoder)
- Wrong key name (e.g. `targetHost` instead of `target_host`)
- Invalid `collected_at` format (must be RFC3339)
- Empty `hardware.board.serial_number`

### LOT normalization rules

1. Remove the special characters `( ) - ® ™`; replace spaces with `_`
2. Uppercase all
3. Collapse multiple underscores to one
4. Strip common prefixes like `MODEL:`, `PN:`
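The four steps above can be sketched as one function. A hedged illustration: the step order is rearranged for convenience, the `(R)`/`(TM)` token handling is an assumption inferred from the `CPU_INTEL_XEON_GOLD_6530` example earlier, and the real converter may differ in character set and prefix list.

```python
# Hypothetical sketch of LOT normalization per the rules above.
import re

_PREFIXES = ("MODEL:", "PN:")

def normalize_lot(raw):
    s = raw.upper().strip()                 # step 2: uppercase
    for prefix in _PREFIXES:                # step 4 (applied up front here)
        if s.startswith(prefix):
            s = s[len(prefix):].strip()
    s = re.sub(r"\((?:R|TM)\)", "", s)      # assumption: drop (R)/(TM) tokens whole
    s = re.sub(r"[()\-®™]", "", s)          # step 1: drop special characters
    s = s.replace(" ", "_")                 # step 1: spaces -> underscores
    s = re.sub(r"_+", "_", s)               # step 3: collapse underscores
    return s.strip("_")
```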
### Status values

| Value | Meaning | Action |
|-------|---------|--------|
| `OK` | Normal | — |
| `Warning` | Degraded | Create `COMPONENT_WARNING` event (optional) |
| `Critical` | Failed | Auto-create `failure_event`, create `COMPONENT_FAILED` event |
| `Unknown` | Not determinable | Treat as working |
| `Empty` | Slot unpopulated | No component created (memory/PCIe only) |
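Normalizing arbitrary vendor status strings onto the five values above can be sketched like this. The alias table is entirely an assumption for illustration; only the five output values and the "missing means `Unknown`" rule come from this document.

```python
# Hypothetical alias map from raw vendor statuses to the documented values.
_STATUS_ALIASES = {
    "ok": "OK", "good": "OK", "enabled": "OK",
    "warning": "Warning", "degraded": "Warning",
    "critical": "Critical", "failed": "Critical", "error": "Critical",
    "absent": "Empty",
}

def normalize_status(raw):
    if not raw:
        return "Unknown"   # documented fallback for a missing status
    return _STATUS_ALIASES.get(raw.strip().lower(), "Unknown")
```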
### Missing field handling

| Field | Fallback |
|-------|----------|
| CPU serial | Generated: `{board_serial}-CPU-{socket}` |
| PCIe serial | Generated: `{board_serial}-PCIE-{slot}` |
| Other serial | Component skipped if absent |
| manufacturer (PCIe) | Looked up from `vendor_id` (8086→Intel, 10de→NVIDIA, 15b3→Mellanox…) |
| status | Treated as `Unknown` |
| firmware | No `FIRMWARE_CHANGED` event |
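The `vendor_id` fallback lookup above can be sketched with a tiny map. The three entries are the ones named in the table (the real lookup is backed by the much larger `pci.ids` database); the function name and `"Unknown"` default are assumptions. Note that the PCIe JSON example earlier uses decimal `32902`, which is `0x8086`.

```python
# Hypothetical subset of the pci.ids-backed vendor lookup.
PCI_VENDORS = {0x8086: "Intel", 0x10DE: "NVIDIA", 0x15B3: "Mellanox"}

def pcie_manufacturer(vendor_id, explicit=None):
    # Prefer an explicitly reported manufacturer; fall back to the ID map.
    return explicit or PCI_VENDORS.get(vendor_id, "Unknown")
```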
# 07 — Exporters

## Export surfaces

| Endpoint | Output | Purpose |
|----------|--------|---------|
| `GET /api/export/csv` | CSV | Serial-number export |
| `GET /api/export/json` | raw-export ZIP bundle | Reopen and re-analyze later |
| `GET /api/export/reanimator` | JSON | Reanimator hardware payload |
| `POST /api/convert` | async ZIP artifact | Batch archive-to-Reanimator conversion |

## Raw export

Raw export is not a final report dump.
It is a replayable artifact that preserves enough source data for future parser improvements.

Current bundle contents:
- `raw_export.json`
- `collect.log`
- `parser_fields.json`

Design rules:
- raw source is authoritative
- uploads of a raw export must replay from raw source
- parsed snapshots inside the bundle are diagnostic only

## Reanimator export

Implementation files:
- `internal/exporter/reanimator_models.go`
- `internal/exporter/reanimator_converter.go`
- `internal/server/handlers.go`
- `bible-local/docs/hardware-ingest-contract.md`

Conversion rules:
- the canonical source is the merged canonical inventory derived from `hardware.devices` plus legacy hardware slices
- output must conform to the strict Reanimator ingest contract in `docs/hardware-ingest-contract.md`
- the local mirror currently tracks upstream contract `v2.7`
- timestamps are RFC3339
- status is normalized to Reanimator-friendly values
- missing component serial numbers must stay absent; LOGPile must not synthesize fake serials for Reanimator export
- the CPU `firmware` field means CPU microcode, not generic processor firmware inventory
- `NULL`-style board manufacturer/product values are treated as absent
- optional component telemetry/health fields are exported when LOGPile already has the data
- a partial `hardware.devices` must not suppress components still present only in legacy parser/collector fields
- `present` is not serialized for exported components; presence is expressed by the existence of the component record itself
- Reanimator ingest may apply its own server-side fallback serial rules for CPU and PCIe when LOGPile leaves serials absent

## Inclusion rules

Included:
- PCIe-class devices when the component itself is present, even if the serial number is missing
- contract `v2.7` component telemetry and health fields when source data exists
- hardware sensors grouped into `fans`, `power`, `temperatures`, `other`, but only when the sensor has a real numeric reading
- Redfish linked metric docs that carry component telemetry: `ProcessorMetrics`, `MemoryMetrics`, `DriveMetrics`, `EnvironmentMetrics`, `Metrics`
- `event_logs`, but only from normalized parser/collector events that can be mapped to contract sources `host` / `bmc` / `redfish` without synthesizing content
- `manufactured_year_week`, but only as a reliable passthrough when the parser/collector already extracted a valid `YYYY-Www` value

Notes:
- sensor `location` is not exported; LOGPile keeps only the sensor `name` plus measured values and status
- `pcie_devices.slot` is treated as the canonical PCIe address; `bdf` is used only as an internal fallback/dedupe key and is not serialized in the payload

Excluded:
- storage endpoints from `pcie_devices`; disks and NVMe drives export only through `hardware.storage`
- fake serial numbers for PCIe-class devices; any fallback serial generation belongs to Reanimator ingest, not LOGPile
- sensors without a real numeric reading
- events with internal-only or unmappable sources, such as LOGPile internal warnings
- memory with a missing serial number
- memory with `present=false` or `status=Empty`
- CPUs with `present=false`
- storage without `serial_number`
- storage with `present=false`
- power supplies without `serial_number`
- power supplies with `present=false`
- non-present network adapters
- non-present PCIe / GPU devices
- device-bound firmware duplicated in the top-level firmware list
- any field not present in the strict ingest contract
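The `manufactured_year_week` passthrough rule above can be sketched as a validation gate. The ISO-style regex is an assumption about what "valid `YYYY-Www`" means; the real exporter may accept a slightly different pattern.

```python
# Hypothetical check: export manufactured_year_week only when it already
# matches "YYYY-Www" (weeks 01..53); otherwise omit the field (None).
import re

_YEAR_WEEK = re.compile(r"\d{4}-W(0[1-9]|[1-4]\d|5[0-3])")

def manufactured_year_week(value):
    if value and _YEAR_WEEK.fullmatch(value):
        return value
    return None
```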
## Batch convert

`POST /api/convert` accepts multiple supported files and produces a ZIP with:
- one `*.reanimator.json` file per successful input
- `convert-summary.txt`

Behavior:
- unsupported filenames are skipped
- each file is parsed independently
- one bad file must not fail the whole batch if at least one conversion succeeds
- the result artifact is temporary and deleted after download
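The batch behavior above can be sketched as an isolation loop. An illustrative sketch only: the function names, the callback-based shape, and the summary-line format are assumptions, not the handler's actual code.

```python
# Hypothetical sketch of the batch-convert loop: skip unsupported inputs,
# isolate per-file failures, keep going as long as something succeeds.
def batch_convert(files, convert_one, is_supported):
    results, summary = {}, []
    for name, data in files:
        if not is_supported(name):
            summary.append(f"SKIP {name}: unsupported")
            continue
        try:
            results[name + ".reanimator.json"] = convert_one(data)
            summary.append(f"OK   {name}")
        except Exception as exc:  # one bad file must not fail the batch
            summary.append(f"FAIL {name}: {exc}")
    return results, "\n".join(summary)
```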
## CSV export

`GET /api/export/csv` uses the same merged canonical inventory as the Reanimator export,
with the legacy network-card fallback kept only for records that still have no canonical device match.

@@ -4,86 +4,83 @@

Defined in `cmd/logpile/main.go`:

| Flag | Default | Purpose |
|------|---------|---------|
| `--port` | `8082` | HTTP server port |
| `--file` | empty | Preload archive file |
| `--version` | `false` | Print version and exit |
| `--no-browser` | `false` | Do not auto-open browser |
| `--hold-on-crash` | `true` on Windows | Keep console open after fatal crash |

## Common commands

```bash
# Local binary (current OS/arch) — output: bin/logpile
make build

# Cross-platform binaries: bin/logpile-{linux,darwin}-{amd64,arm64},
# bin/logpile-windows-amd64.exe
make build-all

make test
make fmt

# Manual pci.ids refresh
make update-pci-ids
```

Notes:
- `make build` outputs `bin/logpile`
- `make build-all` builds the supported cross-platform binaries
- `make build` and `make build-all` run `scripts/update-pci-ids.sh --best-effort` unless `SKIP_PCI_IDS_UPDATE=1`

To skip the PCI IDs update:

```bash
SKIP_PCI_IDS_UPDATE=1 make build
```

Build flags: `CGO_ENABLED=0` — fully static binary, no C runtime dependency.

## PCI IDs

Source submodule: `third_party/pciids` (git submodule → `github.com/pciutils/pciids`)
Embedded copy: `internal/parser/vendors/pciids/pci.ids`

Typical setup after clone:

```bash
git submodule update --init third_party/pciids
```

## Release script

Run:

```bash
./scripts/release.sh
```

Current behavior:

1. Reads the version from `git describe --tags`
2. Refuses a dirty tree unless `ALLOW_DIRTY=1`
3. Sets a stable Go cache/toolchain environment
4. Creates `releases/{VERSION}/`
5. Creates a release-notes template (`releases/{VERSION}/RELEASE_NOTES.md`) if missing
6. Builds `darwin-arm64` and `windows-amd64`
7. Packages any already-present binaries from `bin/` as `.tar.gz` / `.zip`
8. Generates `SHA256SUMS.txt`
9. Prints next steps (tag, push, create the release manually)

Release tag format:
- project release tags use `vN.M`
- do not create `vN.M.P` tags for LOGPile releases
- release artifacts and `main.version` inherit the exact git tag string
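The tag rule above can be sketched as a one-line validator, e.g. for a pre-tag check in the release script. An illustration; the function name and regex are assumptions.

```python
# Hypothetical validator for the documented tag rule: vN.M only, never vN.M.P.
import re

def is_release_tag(tag):
    return re.fullmatch(r"v\d+\.\d+", tag) is not None
```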
Important limitation:
- `scripts/release.sh` does not run `make build-all` for you
- if you want Linux or additional macOS archives in the release directory, build them before running the script

Toolchain note:
- `scripts/release.sh` defaults `GOTOOLCHAIN=local` to use the already-installed Go toolchain and avoid implicit network downloads during release builds
- if you intentionally want another toolchain, pass it explicitly, for example `GOTOOLCHAIN=go1.24.0 ./scripts/release.sh`

## Run locally

```bash
./bin/logpile
./bin/logpile --port 9090
./bin/logpile --no-browser
./bin/logpile --version
./bin/logpile --hold-on-crash   # keep console open on crash (default on Windows)
```

## macOS Gatekeeper

After downloading a binary, remove the quarantine attribute:

```bash
xattr -d com.apple.quarantine /path/to/logpile-darwin-arm64
```
@@ -1,43 +1,54 @@
# 09 — Testing

## Baseline

Required before merge:

```bash
go test ./...
```

All tests must pass before any change is merged.

## Test locations

| Area | Location |
|------|----------|
| Collectors and replay | `internal/collector/*_test.go` |
| HTTP handlers and jobs | `internal/server/*_test.go` |
| Exporters | `internal/exporter/*_test.go` |
| Vendor parsers | `internal/parser/vendors/<vendor>/*_test.go` |

## General rules

- Prefer table-driven tests (multiple input variants)
- No network access in unit tests
- Cover the happy path and realistic failure/partial-data cases (missing fields, empty collections)
- New vendor parsers need both detection and parse coverage: a `Detect()` test with positive and
  negative sample file lists, and a `Parse()` test with a minimal but representative archive

## Exporter tests

The Reanimator exporter has comprehensive coverage:

| Test file | Coverage |
|-----------|----------|
| `reanimator_converter_test.go` | Unit tests per conversion function |
| `reanimator_integration_test.go` | Full export with realistic `AnalysisResult` |

## Mandatory coverage for dedup/filter/classify logic

Any new deduplication, filtering, or classification function must have:

1. A true-positive case
2. A true-negative case
3. A regression case for the vendor or topology that motivated the change

This is mandatory for inventory logic, firmware filtering, and similar code paths where silent data drift is likely.

## Mandatory coverage for expensive path selection

Any function that decides whether to crawl or probe an expensive path must have:

1. A positive selection case
2. A negative exclusion case
3. A topology-level count/integration case

The goal is to catch runaway I/O regressions before they ship.

## Useful focused commands

```bash
go test ./internal/exporter/...
go test ./internal/exporter/... -v -run Reanimator
go test ./internal/exporter/... -cover
go test ./internal/collector/...
go test ./internal/server/...
go test ./internal/parser/vendors/...
```
@@ -253,4 +253,904 @@ at parse time before storing in any model struct. Use the regex

---

## ADL-018 — NVMe bay probe must be restricted to storage-capable chassis types

**Date:** 2026-03-12
**Context:** `shouldAdaptiveNVMeProbe` was introduced in `2fa4a12` to recover NVMe drives on
Supermicro BMCs that expose empty `Drives` collections but serve disks at direct `Disk.Bay.N`
paths. The function returns `true` for any chassis with an empty `Members` array. On
Supermicro HGX systems (SYS-A21GE-NBRT and similar), ~35 sub-chassis (GPU, NVSwitch,
PCIeRetimer, ERoT, IRoT, BMC, FPGA) all carry `ChassisType=Module/Component/Zone` and
expose empty `/Drives` collections. Without filtering, each triggered 384 HTTP requests →
13,440 requests ≈ 22 minutes of pure I/O waste per collection.
**Decision:** Before probing `Disk.Bay.N` candidates for a chassis, check its `ChassisType`
via `chassisTypeCanHaveNVMe`. Skip if the type is `Module`, `Component`, or `Zone`. Keep probing
for `Enclosure`, `RackMount`, and any unrecognised type (fail-safe).
**Consequences:**
- On HGX systems, post-probe NVMe time drops from ~22 min to effectively zero.
- NVMe backplane recovery (`Enclosure` type) is unaffected.
- Any new chassis type that hosts NVMe storage is covered by the default `true` path.
- `chassisTypeCanHaveNVMe` and the candidate-selection loop must have unit tests covering
  both the excluded types and the storage-capable types (see `TestChassisTypeCanHaveNVMe`
  and `TestNVMePostProbeSkipsNonStorageChassis`).
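The ADL-018 decision reduces to a deny-list with a fail-safe default, which can be sketched directly. The real function is Go's `chassisTypeCanHaveNVMe`; this Python sketch mirrors only its documented rule.

```python
# Sketch of the ADL-018 rule: skip NVMe bay probing for chassis types that
# cannot host drives; default to probing for anything unrecognized (fail-safe).
NON_STORAGE_CHASSIS = {"Module", "Component", "Zone"}

def chassis_type_can_have_nvme(chassis_type):
    return chassis_type not in NON_STORAGE_CHASSIS
```

The fail-safe default is the key design choice: an unknown `ChassisType` costs some extra probing, while a wrongly excluded one silently loses drives.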
## ADL-019 — Redfish post-probe recovery is profile-owned acquisition policy
|
||||
|
||||
**Date:** 2026-03-18
|
||||
**Context:** Numeric collection post-probe and direct NVMe `Disk.Bay` recovery were still
|
||||
controlled by collector-core heuristics, which kept platform-specific acquisition behavior in
|
||||
`redfish.go` and made vendor/topology refactoring incomplete.
|
||||
**Decision:** Move expensive Redfish post-probe enablement into profile-owned acquisition policy.
|
||||
The collector core may execute bounded post-probe loops, but profiles must explicitly enable:
|
||||
- numeric collection post-probe
|
||||
- direct NVMe `Disk.Bay` recovery
|
||||
- sensor collection post-probe
|
||||
**Consequences:**
|
||||
- Generic collector flow no longer implicitly turns on storage/NVMe recovery for every platform.
|
||||
- Supermicro-specific direct NVMe recovery and generic numeric collection recovery are now
|
||||
regression-tested through profile fixtures.
|
||||
- Future platform storage/post-probe behavior must be added through profile tuning, not new
|
||||
vendor-shaped `if` branches in collector core.
|
||||
|
||||
## ADL-020 — Redfish critical plan-B activation is profile-owned recovery policy
|
||||
|
||||
**Date:** 2026-03-18
|
||||
**Context:** `critical plan-B` and `profile plan-B` were still effectively always-on collector
|
||||
behavior once paths were present, including critical collection member retry and slow numeric
|
||||
child probing. That kept acquisition recovery semantics in `redfish.go` instead of the profile
|
||||
layer.
|
||||
**Decision:** Move plan-B activation into profile-owned recovery policy. Profiles must explicitly
|
||||
enable:
|
||||
- critical collection member retry
|
||||
- slow numeric probing during critical plan-B
|
||||
- profile-specific plan-B pass
|
||||
**Consequences:**
|
||||
- Recovery behavior is now observable in raw Redfish diagnostics alongside other tuning.
|
||||
- Generic/fallback recovery remains available through profile policy instead of implicit collector
|
||||
defaults.
|
||||
- Future platform-specific plan-B behavior must be introduced through profile tuning and tests,
|
||||
not through new unconditional collector branches.
|
||||

## ADL-021 — Extra discovered-path storage seeds must be profile-scoped, not core-baseline

**Date:** 2026-03-18

**Context:** The collector core baseline seed list still contained storage-specific discovered-path
suffixes such as `SimpleStorage` and `Storage/IntelVROC/*`. These are useful on some platforms,
but they are acquisition extensions layered on top of discovered `Systems/*` resources, not part
of the minimal vendor-neutral Redfish baseline.

**Decision:** Move such discovered-path expansions into profile-owned scoped path policy. The
collector core keeps the vendor-neutral baseline; profiles may add extra system/chassis/manager
suffixes that are expanded over discovered members during acquisition planning.

**Consequences:**

- Platform-shaped storage discovery no longer lives in `redfish.go` baseline seed construction.
- Extra discovered-path branches are visible in plan diagnostics and fixture regression tests.
- Future model/vendor storage path expansions must be added through scoped profile policy instead
  of editing the shared baseline seed list.

## ADL-022 — Adaptive prefetch eligibility is profile-owned policy

**Date:** 2026-03-18

**Context:** The adaptive prefetch executor was still driven by hardcoded include/exclude path
rules in `redfish.go`. That made GPU/storage/network prefetch shaping part of collector-core
knowledge rather than profile-owned acquisition policy.

**Decision:** Move prefetch eligibility rules into profile tuning. The collector core still runs
adaptive prefetch, but profiles provide:

- `IncludeSuffixes` for critical paths eligible for prefetch
- `ExcludeContains` for path shapes that must never be prefetched

**Consequences:**

- Prefetch behavior is now visible in raw Redfish diagnostics and test fixtures.
- Platform- or topology-specific prefetch shaping no longer requires editing collector-core
  string lists.
- Future prefetch tuning must be introduced through profiles and regression tests.
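
The include/exclude split above can be sketched as a small eligibility check. This is an illustrative shape, not the project's actual implementation; only the `IncludeSuffixes` and `ExcludeContains` field names come from the decision, and exclusions are assumed to win over inclusions:

```go
package main

import (
	"fmt"
	"strings"
)

// PrefetchPolicy mirrors the profile-provided tuning described above.
// The struct and method shapes here are illustrative.
type PrefetchPolicy struct {
	IncludeSuffixes []string // critical path suffixes eligible for prefetch
	ExcludeContains []string // substrings that disqualify a path outright
}

// Eligible reports whether a discovered path may be adaptively prefetched.
// Exclusions win over inclusions, so a profile can carve out unsafe shapes.
func (p PrefetchPolicy) Eligible(path string) bool {
	lower := strings.ToLower(path)
	for _, frag := range p.ExcludeContains {
		if strings.Contains(lower, strings.ToLower(frag)) {
			return false
		}
	}
	for _, suf := range p.IncludeSuffixes {
		if strings.HasSuffix(lower, strings.ToLower(suf)) {
			return true
		}
	}
	return false
}

func main() {
	policy := PrefetchPolicy{
		IncludeSuffixes: []string{"/Sensors", "/Memory"},
		ExcludeContains: []string{"/LogServices/"},
	}
	fmt.Println(policy.Eligible("/redfish/v1/Chassis/GPU1/Sensors"))           // true
	fmt.Println(policy.Eligible("/redfish/v1/Managers/1/LogServices/Sensors")) // false
}
```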

## ADL-023 — Core critical baseline is roots-only; critical shaping is profile-owned

**Date:** 2026-03-18

**Context:** `redfishCriticalEndpoints(...)` still encoded a broad set of system/chassis/manager
critical branches directly in collector core. This mixed minimal crawl invariants with profile-
specific acquisition shaping.

**Decision:** Reduce collector-core critical baseline to vendor-neutral roots only:

- `/redfish/v1`
- discovered `Systems/*`
- discovered `Chassis/*`
- discovered `Managers/*`

Profiles now own additional critical shaping through:

- scoped critical suffix policy for discovered resources
- explicit top-level `CriticalPaths`

**Consequences:**

- Critical inventory breadth is now explained by the acquisition plan, not hidden in collector
  helper defaults.
- Generic profile still provides the previous broad critical coverage, so behavior stays stable.
- Future critical-path tuning must be implemented in profiles and regression-tested there.

## ADL-024 — Live Redfish execution plans are resolved inside redfishprofile

**Date:** 2026-03-18

**Context:** Even after moving seeds, scoped paths, critical shaping, recovery, and prefetch
policy into profiles, `redfish.go` still manually merged discovered resources with those policy
fragments. That left acquisition-plan resolution logic in collector core.

**Decision:** Introduce `redfishprofile.ResolveAcquisitionPlan(...)` as the boundary between
profile planning and collector execution. `redfishprofile` now resolves:

- baseline seeds
- baseline critical roots
- scoped path expansions
- explicit profile seed/critical/plan-B paths

The collector core consumes the resolved plan and executes it.

**Consequences:**

- Acquisition planning logic is now testable in `redfishprofile` without going through the live
  collector.
- `redfish.go` no longer owns path-resolution helpers for seeds/critical planning.
- This creates a clean next step toward true per-profile acquisition hooks beyond static policy
  fragments.

## ADL-025 — Post-discovery acquisition refinement belongs to profile hooks

**Date:** 2026-03-18

**Context:** Some acquisition behavior depends not only on vendor/model hints, but on what the
lightweight Redfish discovery actually returned. Static absolute path lists in profile plans are
too rigid for such cases and reintroduce guessed platform knowledge.

**Decision:** Add a post-discovery acquisition refinement hook to Redfish profiles. Profiles may
mutate the resolved execution plan after discovered `Systems/*`, `Chassis/*`, and `Managers/*`
are known.

First concrete use:

- MSI now derives GPU chassis seeds and `.../Sensors` critical/plan-B paths from discovered
  `Chassis/GPU*` resources instead of hardcoded `GPU1..GPU4` absolute paths in the static plan.

Additional uses:

- Supermicro now derives `UpdateService/Oem/Supermicro/FirmwareInventory` critical/plan-B paths
  from resource hints instead of carrying that absolute path in the static plan.
- Dell now derives `Managers/iDRAC.Embedded.*` acquisition paths from discovered manager
  resources instead of carrying `Managers/iDRAC.Embedded.1` as a static absolute path.

**Consequences:**

- Profile modules can react to actual discovery results without pushing conditional logic back
  into `redfish.go`.
- Diagnostics still show the final refined plan because the collector stores the refined plan,
  not only the pre-refinement template.
- Future vendor-specific discovery-dependent acquisition behavior should be implemented through
  this hook rather than new collector-core branches.
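
The MSI case above can be sketched as a refinement hook that expands whatever `Chassis/GPU*` members discovery returned. The `ExecutionPlan` type and `refineMSIGPUs` function name are illustrative stand-ins, not the real `redfishprofile` API:

```go
package main

import (
	"fmt"
	"strings"
)

// ExecutionPlan is an illustrative stand-in for the resolved plan a profile
// hook is allowed to mutate after discovery.
type ExecutionPlan struct {
	Seeds         []string
	CriticalPaths []string
	PlanBPaths    []string
}

// refineMSIGPUs derives GPU chassis seeds and .../Sensors critical/plan-B
// paths from the Chassis/GPU* members discovery actually returned, instead
// of hardcoding GPU1..GPU4 absolute paths.
func refineMSIGPUs(plan *ExecutionPlan, discoveredChassis []string) {
	for _, ch := range discoveredChassis {
		if !strings.Contains(ch, "/Chassis/GPU") {
			continue
		}
		sensors := strings.TrimRight(ch, "/") + "/Sensors"
		plan.Seeds = append(plan.Seeds, ch)
		plan.CriticalPaths = append(plan.CriticalPaths, sensors)
		plan.PlanBPaths = append(plan.PlanBPaths, sensors)
	}
}

func main() {
	plan := &ExecutionPlan{}
	refineMSIGPUs(plan, []string{
		"/redfish/v1/Chassis/Self",
		"/redfish/v1/Chassis/GPU1",
		"/redfish/v1/Chassis/GPU3",
	})
	fmt.Println(plan.CriticalPaths)
}
```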

## ADL-026 — Replay analysis uses a resolved profile plan, not ad-hoc directives only

**Date:** 2026-03-18

**Context:** Replay still relied on a flat `AnalysisDirectives` struct assembled centrally,
while vendor-specific conditions often depended on the actual snapshot shape. That made analysis
behavior harder to explain and kept too much vendor logic in generic replay collectors.

**Decision:** Introduce `redfishprofile.ResolveAnalysisPlan(...)` for replay. The resolved
analysis plan contains:

- active match result
- resolved analysis directives
- analysis notes explaining snapshot-aware hook activation

Profiles may refine this plan using the snapshot and discovered resources before replay collectors
run.

First concrete uses:

- MSI enables processor-GPU fallback and MSI chassis lookup only when the snapshot actually
  contains GPU processors and `Chassis/GPU*`
- HGX enables processor-GPU alias fallback from actual HGX/GPU_SXM topology signals in the snapshot
- Supermicro enables NVMe backplane and known-controller recovery from actual snapshot paths

**Consequences:**

- Replay behavior is now closer to the acquisition architecture: a resolved profile plan feeds the
  executor.
- `redfish_analysis_plan` is stored in raw payload metadata for offline debugging.
- Future analysis-side vendor logic should move into profile refinement hooks instead of growing the
  central directive builder.

## ADL-027 — Replay GPU/storage executors consume resolved analysis plans

**Date:** 2026-03-18

**Context:** Even after introducing `ResolveAnalysisPlan(...)`, replay GPU/storage collectors still
accepted a raw `AnalysisDirectives` struct. That preserved an implicit shortcut from the old design
and weakened the plan/executor boundary.

**Decision:** Replay GPU/storage executors now accept `redfishprofile.ResolvedAnalysisPlan`
directly. The executor reads resolved directives from the plan instead of being passed a standalone
directive bundle.

**Consequences:**

- GPU and storage replay execution now follows the same architectural pattern as acquisition:
  resolve plan first, execute second.
- Future profile-owned execution helpers can use plan notes or additional resolved fields without
  changing the executor API again.
- Remaining replay areas should migrate the same way instead of continuing to accept raw directive
  structs.

## ADL-019 — isDeviceBoundFirmwareName must cover vendor-specific naming patterns per vendor

**Date:** 2026-03-12

**Context:** `isDeviceBoundFirmwareName` was written to filter Dell-style device firmware names
(`"GPU SomeDevice"`, `"NIC OnboardLAN"`). When Supermicro Redfish FirmwareInventory was added
(`6c19a58`), no Supermicro-specific patterns were added. Supermicro names a NIC entry
`"NIC1 System Slot0 AOM-DP805-IO"` — a digit follows the type prefix directly, bypassing the
`"nic "` (space-terminated) check. 29 device-bound entries leaked into `hardware.firmware` on
SYS-A21GE-NBRT (HGX B200). Commit `9c5512d` attempted a fix by adding `_fw_gpu_` patterns,
but checked `DeviceName` which contains `"Software Inventory"` (from the Redfish `Name` field),
not the firmware inventory ID. The patterns were dead code from the moment they were committed.

**Decision:**

- `isDeviceBoundFirmwareName` must be extended for each new vendor whose FirmwareInventory
  naming convention differs from the existing patterns.
- When adding HGX/Supermicro patterns, check that the pattern matches the field value that
  `collectFirmwareInventory` actually stores — trace the data path from Redfish doc to
  `FirmwareInfo.DeviceName` before writing the condition.
- `TestIsDeviceBoundFirmwareName` must contain at least one case per vendor format.

**Consequences:**

- New vendors with FirmwareInventory support require a test covering both device-bound names
  (must return true) and system-level names (must return false) before the code ships.
- The dead `_fw_gpu_` / `_fw_nvswitch_` / `_inforom_gpu_` patterns were replaced with
  correct prefix+digit checks (`"gpu" + digit`, `"nic" + digit`) and explicit string checks
  (`"nvmecontroller"`, `"power supply"`, `"software inventory"`).
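
The prefix+digit rule above can be sketched as follows. This is a minimal illustration of the technique, not the project's full pattern set; the function name and lists here are examples:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isDeviceBoundName illustrates the rule: a name is device-bound when a known
// type prefix is followed either by a space (Dell style: "GPU SomeDevice") or
// directly by a digit (Supermicro style: "NIC1 System Slot0 ...").
func isDeviceBoundName(name string) bool {
	n := strings.ToLower(strings.TrimSpace(name))
	for _, prefix := range []string{"gpu", "nic"} {
		if !strings.HasPrefix(n, prefix) {
			continue
		}
		rest := n[len(prefix):]
		if rest == "" {
			continue
		}
		if rest[0] == ' ' || unicode.IsDigit(rune(rest[0])) {
			return true
		}
	}
	// Explicit substring checks for known device-bound families.
	for _, frag := range []string{"nvmecontroller", "power supply"} {
		if strings.Contains(n, frag) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isDeviceBoundName("NIC1 System Slot0 AOM-DP805-IO")) // Supermicro style: true
	fmt.Println(isDeviceBoundName("GPU SomeDevice"))                 // Dell style: true
	fmt.Println(isDeviceBoundName("BIOS"))                           // system-level: false
}
```

Note how the digit branch is exactly what the original space-terminated `"nic "` check missed.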

## ADL-020 — Dell TSR device-bound firmware filtered via FQDD; InfiniBand routed to NetworkAdapters

**Date:** 2026-03-15

**Context:** Dell TSR `sysinfo_DCIM_SoftwareIdentity.xml` lists firmware for every installed
component. `parseSoftwareIdentityXML` dumped all of these into `hardware.firmware` without
filtering, so device-bound entries such as `"Mellanox Network Adapter"` (FQDD `InfiniBand.Slot.1-1`)
and `"PERC H755 Front"` (FQDD `RAID.SL.3-1`) appeared in the reanimator export alongside system
firmware like BIOS and iDRAC. Confirmed on PowerEdge R6625 (8VS2LG4).

Additionally, `DCIM_InfiniBandView` was not handled in the parser switch, so Mellanox ConnectX-6
appeared only as a PCIe device with `model: "16x or x16"` (from `DataBusWidth` fallback).
`parseControllerView` called `addFirmware` with description `"storage controller"` instead of the
FQDD, so the FQDD-based filter in the exporter could not remove it.

**Decision:**

1. `isDeviceBoundFirmwareFQDD` extended with `"infiniband."` and `"fc."` prefixes; `"raid.backplane."`
   broadened to `"raid."` to cover `RAID.SL.*`, `RAID.Integrated.*`, etc.
2. `DCIM_InfiniBandView` routed to `parseNICView` → device appears as `NetworkAdapter` with correct
   firmware, MAC address, and VendorID/DeviceID.
3. `"InfiniBand."` added to `pcieFQDDNoisePrefix` to suppress the duplicate `DCIM_PCIDeviceView`
   entry (DataBusWidth-only, no useful data).
4. `parseControllerView` now passes `fqdd` as the `addFirmware` description so the FQDD filter
   removes the entry in the exporter.
5. `parsePCIeDeviceView` now prioritises `props["description"]` (chip model, e.g. `"MT28908 Family
   [ConnectX-6]"`) over `props["devicedescription"]` (location string) for `pcie.Description`.
6. `convertPCIeDevices` model fallback order: `PartNumber → Description → DeviceClass`.

**Consequences:**

- `hardware.firmware` contains only system-level entries; NIC/RAID/storage-controller firmware
  lives on the respective device record.
- `TestParseDellInfiniBandView` and `TestIsDeviceBoundFirmwareFQDD` guard the regression.
- Any future Dell TSR device class whose FQDD prefix is not yet in the prefix list may still leak;
  extend `isDeviceBoundFirmwareFQDD` and add a test case when encountered.
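
The FQDD prefix filter from point 1 can be sketched as a simple case-insensitive prefix match. The prefix list below reflects only the prefixes named in this decision; the surrounding code is illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// deviceBoundFQDDPrefixes reflects the prefixes named in the decision:
// "infiniband." and "fc." added, "raid.backplane." broadened to "raid.".
var deviceBoundFQDDPrefixes = []string{"infiniband.", "fc.", "raid."}

// isDeviceBoundFQDD reports whether a Dell FQDD identifies device-bound
// firmware that must not appear in top-level hardware.firmware.
func isDeviceBoundFQDD(fqdd string) bool {
	f := strings.ToLower(fqdd)
	for _, p := range deviceBoundFQDDPrefixes {
		if strings.HasPrefix(f, p) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isDeviceBoundFQDD("InfiniBand.Slot.1-1")) // device-bound: true
	fmt.Println(isDeviceBoundFQDD("RAID.SL.3-1"))         // covered by "raid." broadening: true
	fmt.Println(isDeviceBoundFQDD("iDRAC.Embedded.1"))    // system-level: false
}
```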

---

## ADL-021 — pci.ids enrichment: chip model and vendor resolved from PCI IDs when source data is generic or missing

**Date:** 2026-03-15

**Context:**
Dell TSR `DCIM_InfiniBandView.ProductName` reports a generic marketing name ("Mellanox Network
Adapter") instead of the precise chip identifier ("MT28908 Family [ConnectX-6]"). The actual
chip model is available in `pci.ids` by VendorID:DeviceID (15B3:101B). Vendor name may also be
absent when no `VendorName` / `Manufacturer` property is present.

The general rule was established: *if model is not found in source data but PCI IDs are known,
resolve model from `pci.ids`*. This rule applies broadly across all export paths.

**Decision (two-layer enrichment):**

1. **Parser layer (Dell, `parseNICView`):** When `VendorID != 0 && DeviceID != 0`, prefer
   `pciids.DeviceName(vendorID, deviceID)` over the product name from logs. This makes the chip
   identifier the primary model for NIC/InfiniBand adapters (more specific than marketing name).
   Fill `Vendor` from `pciids.VendorName(vendorID)` when the vendor field is otherwise empty.
   Same fallback applied in `parsePCIeDeviceView` for empty `Description`.
2. **Exporter layer (`convertPCIeFromDevices`):** General rule — when `d.Model == ""` after all
   legacy fallbacks and `VendorID != 0 && DeviceID != 0`, set `model = pciids.DeviceName(...)`.
   Also fill empty `manufacturer` from `pciids.VendorName(...)`. This covers all parsers/sources.

**Consequences:**

- Mellanox InfiniBand slot now reports `model: "MT28908 Family [ConnectX-6]"` and
  `manufacturer: "Mellanox Technologies"` in the reanimator export.
- For NICs where pci.ids has no entry, the original product name is kept (pci.ids returns "").
- `TestParseDellInfiniBandView` asserts the model and vendor from pci.ids.
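
The exporter-layer rule can be sketched as below. The in-memory maps stand in for the real `pci.ids` database, and `pciidsDeviceName`/`pciidsVendorName` mimic the `pciids.DeviceName`/`pciids.VendorName` helpers named above; everything else is illustrative:

```go
package main

import "fmt"

// Stand-in lookup tables for the real pci.ids database.
var deviceNames = map[[2]uint16]string{
	{0x15B3, 0x101B}: "MT28908 Family [ConnectX-6]",
}
var vendorNames = map[uint16]string{
	0x15B3: "Mellanox Technologies",
}

func pciidsDeviceName(vendorID, deviceID uint16) string {
	return deviceNames[[2]uint16{vendorID, deviceID}]
}

func pciidsVendorName(vendorID uint16) string {
	return vendorNames[vendorID]
}

// enrichModel applies the exporter-layer rule: keep the source model when
// present; otherwise resolve from PCI IDs; when pci.ids has no entry the
// (empty) original value is kept.
func enrichModel(model string, vendorID, deviceID uint16) string {
	if model != "" {
		return model
	}
	if vendorID != 0 && deviceID != 0 {
		if name := pciidsDeviceName(vendorID, deviceID); name != "" {
			return name
		}
	}
	return model
}

func main() {
	fmt.Println(enrichModel("", 0x15B3, 0x101B))
	fmt.Println(enrichModel("Existing Model", 0x15B3, 0x101B))
	fmt.Println(pciidsVendorName(0x15B3))
}
```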

---

## ADL-022 — CPUAffinity parsed into NUMANode for PCIe, NIC, and controller devices

**Date:** 2026-03-15

**Context:**
Dell TSR DCIM view classes report `CPUAffinity` for NIC, InfiniBand, PCIe, and controller
devices. Values are "1", "2" (NUMA node index), or "Not Applicable" (for devices that bridge
both CPUs or have no CPU affinity). This data is needed for topology-aware diagnostics.

**Decision:**

- Add `NUMANode int` (JSON: `"numa_node,omitempty"`) to `models.PCIeDevice`,
  `models.NetworkAdapter`, `models.HardwareDevice`, and `ReanimatorPCIe`.
- Parse from `props["cpuaffinity"]` using `parseIntLoose`: numeric values ("1", "2") map
  directly; "Not Applicable" returns 0 (omitted via `omitempty`).
- Thread through `buildDevicesFromLegacy` (PCIe and NIC sections) and `convertPCIeFromDevices`.
- `parseControllerView` also parses CPUAffinity since RAID controllers have NUMA affinity.

**Consequences:**

- `numa_node: 1` or `2` appears in reanimator export for devices with known affinity.
- Value 0 / absent means "not reported" — covers both "Not Applicable" and sources that don't
  provide CPUAffinity at all.
- `TestParseDellCPUAffinity` verifies numeric values parsed correctly and "Not Applicable"→0.
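
The mapping can be sketched as follows. `parseCPUAffinity` is an illustrative stand-in for the project's `parseIntLoose` usage; the key point is that non-numeric values collapse to 0 so `omitempty` drops the field:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// parseCPUAffinity maps numeric CPUAffinity values directly and returns 0
// for "Not Applicable" or anything else unparsable, so omitempty drops it.
func parseCPUAffinity(raw string) int {
	n, err := strconv.Atoi(strings.TrimSpace(raw))
	if err != nil || n < 0 {
		return 0 // "Not Applicable" and junk values mean "not reported"
	}
	return n
}

// Illustrative slice of the exported shape.
type pcieDevice struct {
	Model    string `json:"model,omitempty"`
	NUMANode int    `json:"numa_node,omitempty"`
}

func main() {
	fmt.Println(parseCPUAffinity("2"))
	fmt.Println(parseCPUAffinity("Not Applicable"))

	// NUMANode 0 disappears from JSON thanks to omitempty.
	b, _ := json.Marshal(pcieDevice{Model: "x", NUMANode: parseCPUAffinity("Not Applicable")})
	fmt.Println(string(b))
}
```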

---

## ADL-023 — Reanimator export must match ingest contract exactly

**Date:** 2026-03-15

**Context:**
LOGPile's Reanimator export had drifted from the strict ingest contract. It emitted fields that
Reanimator does not currently accept (`status_at_collection`, `numa_node`), while omitting fields
and sections now present in the contract (`hardware.sensors`, `pcie_devices[].mac_addresses`).
Memory export rules also diverged from the ingest side: empty or serial-less DIMMs were still
exported.

**Decision:**

- Treat the Reanimator ingest contract as the authoritative schema for `GET /api/export/reanimator`.
- Emit only fields present in the current upstream contract revision.
- Add `hardware.sensors`, `pcie_devices[].mac_addresses`, `pcie_devices[].numa_node`, and
  upstream-approved component telemetry/health fields.
- Leave out fields that are still not part of the upstream contract.
- Map internal `source_type=archive` to external `source_type=logfile`.
- Skip memory entries that are empty, not present, or missing serial numbers.
- Generate CPU and PCIe serials only in the forms allowed by the contract.
- Mirror the applied contract in `bible-local/docs/hardware-ingest-contract.md`.

**Consequences:**

- Some previously exported diagnostic fields are intentionally dropped from the Reanimator payload
  until the upstream contract adds them.
- Internal models may retain richer fields than the current export schema.
- `hardware.devices` is canonical only after merge with legacy hardware slices; partial parser-owned
  canonical records must not hide CPUs, memory, storage, NICs, or PSUs still stored in legacy
  fields.
- CSV and Reanimator exports must use the same merged canonical inventory to avoid divergent export
  contents across surfaces.
- Future exporter changes must update both the code and the mirrored contract document together.

---

## ADL-024 — Component presence is implicit; Redfish linked metrics are part of replay correctness

**Date:** 2026-03-15

**Context:**
The upstream ingest contract allows `present`, but current export semantics do not need to send
`present=true` for populated components. At the same time, several important Redfish component
telemetry fields were only available through linked metric resources such as `ProcessorMetrics`,
`MemoryMetrics`, and `DriveMetrics`. Without collecting and replaying these linked documents,
live collection and raw snapshot replay still underreported component health fields.

**Decision:**

- Do not serialize `present=true` in Reanimator export. Presence is represented by the presence of
  the component record itself.
- Do not export component records marked `present=false`.
- Interpret CPU `firmware` in Reanimator payload as CPU microcode.
- Treat Redfish linked metric resources `ProcessorMetrics`, `MemoryMetrics`, `DriveMetrics`,
  `EnvironmentMetrics`, and generic `Metrics` as part of analyzer correctness when they are linked
  from component resources.
- Replay logic must merge these linked metric resources back into CPU, memory, storage, PCIe, GPU,
  NIC, and PSU component `Details` the same way live collection expects them to be used.

**Consequences:**

- Reanimator payloads are smaller and avoid redundant `present=true` noise while still excluding
  empty slots and absent components.
- Any future exporter change that reintroduces serialized component presence needs an explicit
  contract review.
- Raw Redfish snapshot completeness now includes linked per-component metric resources, not only
  top-level inventory members.
- CPU microcode is no longer expected in top-level `hardware.firmware`; it belongs on the CPU
  component record.

<!-- Add new decisions below this line using the format above -->

## ADL-025 — Missing serial numbers must remain absent in Reanimator export

**Date:** 2026-03-15

**Context:**
LOGPile previously generated synthetic serial numbers for components that had no real serial in
source data, especially CPUs and PCIe-class devices. This made the payload look richer, but the
serials were not authoritative and could mislead downstream consumers. Reanimator can already
accept missing serials and generate its own internal fallback identifiers when needed.

**Decision:**

- Do not synthesize fake serial numbers in LOGPile's Reanimator export.
- If a component has no real serial in parsed source data, export the serial field as absent.
- This applies to CPUs, PCIe devices, GPUs, NICs, and any other component class unless an
  upstream contract explicitly requires a deterministic exporter-generated identifier.
- Any fallback serial generation defined by the upstream contract is ingest-side Reanimator behavior,
  not LOGPile exporter behavior.

**Consequences:**

- Exported payloads carry only source-backed serial numbers.
- Fake identifiers such as `BOARD-...-CPU-...` or synthetic PCIe serials are no longer considered
  acceptable exporter behavior.
- Any future attempt to reintroduce generated serials requires an explicit contract review and a
  new ADL entry.

---

## ADL-026 — Live Redfish collection uses explicit preflight host-power confirmation

**Date:** 2026-03-15

**Context:**
Live Redfish inventory can be incomplete when the managed host is powered off. At the same time,
LOGPile must not silently power on a host without explicit user choice. The collection workflow
therefore needs a preflight step that verifies connectivity, shows current host power state to the
user, and only powers on the host when the user explicitly chose that path.

**Decision:**

- Add a dedicated live preflight API step before collection starts.
- UI first runs connectivity and power-state check, then offers:
  - collect as-is
  - power on and collect
  - if the host is off and the user does not answer within 5 seconds, default to collecting without
    powering the host on
- Redfish collection may power on the host only when the request explicitly sets
  `power_on_if_host_off=true`.
- When LOGPile powers on the host for collection, it must try to power the host back off after
  collection completes.
- If LOGPile did not power the host on itself, it must never power the host off.
- All preflight and power-control steps must be logged into the collection log and therefore into
  the raw-export bundle.

**Consequences:**

- Live collection becomes a two-step UX: probe first, collect second.
- Raw bundles preserve operator-visible evidence of power-state decisions and power-control attempts.
- Power-on failures do not block collection entirely; they only downgrade completeness expectations.

---

## ADL-027 — Sensors without numeric readings are not exported

**Date:** 2026-03-15

**Context:**
Some parsed sensor records carry only a name, unit, or status, but no actual numeric reading. Such
records are not useful as telemetry in Reanimator export and create noisy, low-value sensor lists.

**Decision:**

- Do not export temperature, power, fan, or other sensor records unless they carry a real numeric
  measurement value.
- Presence of a sensor name or health/status alone is not sufficient for export.

**Consequences:**

- Exported sensor groups contain only actionable telemetry.
- Parsers and collectors may still keep non-numeric sensor artifacts internally for diagnostics, but
  Reanimator export must filter them out.
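
A minimal sketch of such an export filter, assuming a pointer distinguishes "measured 0" from "no measurement"; the `Sensor` type is illustrative, not the project's model:

```go
package main

import "fmt"

// Sensor is an illustrative record; Reading is nil when the source provided
// no numeric value at all.
type Sensor struct {
	Name    string
	Status  string
	Reading *float64
}

// exportableSensors keeps only records that carry a real numeric reading.
// A name or health/status alone is not sufficient.
func exportableSensors(in []Sensor) []Sensor {
	out := make([]Sensor, 0, len(in))
	for _, s := range in {
		if s.Reading != nil {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	v := 36.5
	sensors := []Sensor{
		{Name: "CPU_Temp", Reading: &v},
		{Name: "Fan7", Status: "OK"}, // name+status only: filtered out
	}
	fmt.Println(len(exportableSensors(sensors)))
}
```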

---

## ADL-028 — Reanimator PCIe export excludes storage endpoints and synthetic serials

**Date:** 2026-03-15

**Context:**
Some Redfish and archive sources expose NVMe drives both as storage inventory and as PCIe-visible
endpoints. Exporting such drives in both `hardware.storage` and `hardware.pcie_devices` creates
duplicates without adding useful topology value. At the same time, PCIe-class export still had old
fallback behavior that generated synthetic serial numbers when source serials were absent.

**Decision:**

- Export disks and NVMe drives only through `hardware.storage`.
- Do not export storage endpoints as `hardware.pcie_devices`, even if the source inventory exposes
  them as PCIe/NVMe devices.
- Keep real PCIe storage controllers such as RAID and HBA adapters in `hardware.pcie_devices`.
- Do not synthesize PCIe/GPU/NIC serial numbers in LOGPile; missing serials stay absent.
- Treat placeholder names such as `Network Device View` as non-authoritative and prefer resolved
  device names when stronger data exists.

**Consequences:**

- Reanimator payloads no longer duplicate NVMe drives between storage and PCIe sections.
- PCIe export remains topology-focused while storage export remains component-focused.
- Missing PCIe-class serials no longer produce fake `BOARD-...-PCIE-...` identifiers.

---

## ADL-029 — Local exporter guidance tracks upstream contract v2.7 terminology

**Date:** 2026-03-15

**Context:**
The upstream Reanimator hardware ingest contract moved to `v2.7` and clarified several points that
matter for LOGPile documentation: ingest-side serial fallback rules, canonical PCIe addressing via
`slot`, the optional `event_logs` section, and the shared `manufactured_year_week` field.

**Decision:**

- Keep the local mirrored contract file as an exact copy of the upstream `v2.7` document.
- Describe CPU/PCIe serial fallback as Reanimator ingest behavior, not LOGPile exporter behavior.
- Treat `pcie_devices.slot` as the canonical address on the LOGPile side as well; `bdf` may remain
  an internal fallback/dedupe key but is not serialized in the payload.
- Export `event_logs` only from normalized parser/collector events that can be mapped to contract
  sources `host` / `bmc` / `redfish` without synthesizing message content.
- Export `manufactured_year_week` only as a reliable passthrough when a parser/collector already
  extracted a valid `YYYY-Www` value.

**Consequences:**

- Local bible wording no longer conflicts with upstream contract terminology.
- Reanimator payloads use contract-native PCIe addressing and no longer expose `bdf` as a parallel
  coordinate.
- LOGPile event export remains strictly source-derived; internal warnings such as LOGPile analysis
  notes do not leak into Reanimator `event_logs`.

---

## ADL-030 — Audit result rendering is delegated to embedded reanimator/chart

**Date:** 2026-03-16

**Context:**
LOGPile already owns file upload, Redfish collection, archive parsing, normalization, and
Reanimator export. Maintaining a second host-side audit renderer for the same data created
presentation drift and duplicated UI logic.

**Decision:**

- Use vendored `reanimator/chart` as the only audit result viewer.
- Keep LOGPile responsible for service flows: upload, live collection, batch convert, raw export,
  Reanimator export, and parse-error reporting.
- Render the current dataset by converting it to Reanimator JSON and passing that snapshot to
  embedded `chart` under `/chart/current`.

**Consequences:**

- Reanimator JSON becomes the single presentation contract for the audit surface.
- The host UI becomes a service shell around the viewer instead of maintaining its own
  field-by-field tabs.
- `internal/chart` must be updated explicitly as a git submodule when the viewer changes.

---

## ADL-031 — Redfish uses profile-driven acquisition and unified ingest entrypoints

**Date:** 2026-03-17

**Context:**
Redfish collection had accumulated platform-specific probing in the shared collector path, while
upload and raw-export replay still entered analysis through direct handler branches. This made
vendor/model tuning harder to contain and increased regression risk when one topology needed a
special acquisition strategy.

**Decision:**

- Introduce `internal/ingest.Service` as the internal source-family entrypoint for archive parsing
  and Redfish raw replay.
- Introduce `internal/collector/redfishprofile/` for Redfish profile matching and modular hooks.
- Split Redfish behavior into coordinated phases:
  - acquisition planning during live collection
  - analysis hooks during snapshot replay
- Use score-based profile matching. If confidence is low, enter fallback acquisition mode and
  aggregate only safe additive profile probes.
- Allow profile modules to provide bounded acquisition tuning hints such as crawl cap, prefetch
  behavior, and expensive post-probe toggles.
- Allow profile modules to own model-specific `CriticalPaths` and bounded `PlanBPaths` so vendor
  recovery targets stop leaking into the collector core.
- Expose Redfish profile matching as structured diagnostics during live collection: logs must
  contain all module scores, and collect job status must expose active modules for the UI.

**Consequences:**

- Server handlers stop owning parser-vs-replay branching details directly.
- Vendor/model-specific Redfish logic gets an explicit module boundary.
- Unknown-vendor Redfish collection becomes slower but more complete by design.
- Tactical Redfish fixes should move into profile modules instead of widening generic replay logic.
- Repo-owned compact fixtures under `internal/collector/redfishprofile/testdata/`, derived from
  representative raw-export snapshots, are used to lock profile matching and acquisition tuning
  for known MSI and Supermicro-family shapes.

---

## ADL-032 — MSI ghost GPU filter: exclude GPUs with temperature=0 on powered-on host

**Date:** 2026-03-18

**Context:**
MSI/AMI BMC caches GPU inventory from the host via Host Interface (in-band). When GPUs are
removed without a reboot the old entries remain in `Chassis/GPU*` and
`Systems/Self/Processors/GPU*` with `Status.Health: OK, State: Enabled`. The BMC has no
out-of-band mechanism to detect physical absence. A physically present GPU always reports
an ambient temperature (>0°C) even when idle; a stale cached entry returns `Reading: 0`.

**Decision:**

- Add `EnableMSIGhostGPUFilter` directive (enabled by MSI profile's `refineAnalysis`
  alongside `EnableProcessorGPUFallback`).
- In `collectGPUsFromProcessors`: for each processor GPU, resolve its chassis path and read
  `Chassis/GPU{n}/Sensors/GPU{n}_Temperature`. If `PowerState=On` and `Reading=0` → skip.
- Filter only applies when host is powered on; when host is off all temperatures are 0 and
  the signal is ambiguous.

**Consequences:**

- Ghost GPUs from previous hardware configurations no longer appear in the inventory.
- Filter is MSI-profile-owned and does not affect HGX, Supermicro, or generic paths.
- Any new MSI GPU chassis that uses a different temperature sensor path will bypass the filter
  (safe default: include rather than wrongly exclude).
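
The decision rule can be sketched as a small keep/skip predicate. Types and the sensor lookup are illustrative; only the power-state and zero-reading conditions come from the decision:

```go
package main

import "fmt"

// gpuCandidate is an illustrative record for a processor GPU; TempReading is
// nil when the expected chassis temperature sensor path was not found.
type gpuCandidate struct {
	ID          string
	TempReading *float64
}

// keepGPU applies the ghost-GPU rule: on a powered-on host, a GPU whose
// temperature sensor reads exactly 0 is a stale cached entry and is skipped.
func keepGPU(hostPoweredOn bool, g gpuCandidate) bool {
	if !hostPoweredOn {
		return true // host off: all temps are 0, signal is ambiguous
	}
	if g.TempReading == nil {
		return true // unknown sensor path: include rather than wrongly exclude
	}
	return *g.TempReading > 0
}

func main() {
	zero, ambient := 0.0, 31.0
	fmt.Println(keepGPU(true, gpuCandidate{ID: "GPU1", TempReading: &ambient})) // real GPU: true
	fmt.Println(keepGPU(true, gpuCandidate{ID: "GPU2", TempReading: &zero}))    // ghost entry: false
	fmt.Println(keepGPU(false, gpuCandidate{ID: "GPU2", TempReading: &zero}))   // host off: true
}
```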
|
||||
---
|
||||
|
||||
## ADL-033 — Reanimator export collected_at uses inventory LastModifiedTime with 30-day fallback

**Date:** 2026-03-18
**Context:**
For Redfish sources the BMC Manager `DateTime` reflects when the BMC clock read the time, not
when the hardware inventory was last known-good. `InventoryData/Status.LastModifiedTime`
(AMI/MSI OEM endpoint) records the actual timestamp of the last successful host-pushed
inventory cycle and is a better proxy for "when was this hardware configuration last confirmed".

**Decision:**
- `inferInventoryLastModifiedTime` reads `LastModifiedTime` from the snapshot and sets
`AnalysisResult.InventoryLastModifiedAt`.
- `reanimatorCollectedAt()` in the exporter selects `InventoryLastModifiedAt` when it is set
and no older than 30 days; otherwise falls back to `CollectedAt`.
- Fallback rationale: inventory older than 30 days is likely from a long-running server with
no recent reboot; using the actual collection date is more useful for the downstream consumer.
- The inventory timestamp is also logged during replay and live collection for diagnostics.

**Consequences:**
- Reanimator export `collected_at` reflects the last confirmed inventory cycle on AMI/MSI BMCs.
- On non-AMI BMCs or when `InventoryData/Status` is absent, behavior is unchanged.
- If inventory is stale (>30 days), collection date is used as before.

---

## ADL-034 — Redfish inventory invalidated before host power-on

**Date:** 2026-03-18
**Context:**
When a host is powered on by the collector (`power_on_if_host_off=true`), the BMC still holds
inventory from the previous boot. If hardware changed between shutdowns, the new boot will push
fresh inventory — but only if the BMC accepts it (CRC mismatch triggers re-population). Without
explicit invalidation, unchanged CRCs can cause the BMC to skip re-processing even after a
hardware change.

**Decision:**
- Before any power-on attempt, `invalidateRedfishInventory` POSTs to
`{systemPath}/Oem/Ami/Inventory/Crc` with all groups zeroed (`CPU`, `DIMM`, `PCIE`,
`CERTIFICATES`, `SECUREBOOT`).
- Best-effort: a 404/405 response (non-AMI BMC) is logged and otherwise ignored.
- The invalidation is logged at `INFO` level and surfaced as a collect progress message.

**Consequences:**
- On AMI/MSI BMCs: the next boot will push a full fresh inventory regardless of whether
CRCs appear unchanged, eliminating ghost components from prior hardware configurations.
- On non-AMI BMCs: the POST fails immediately (endpoint does not exist), nothing changes.
- Invalidation runs only when `power_on_if_host_off=true` and host is confirmed off.
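
The zeroed-CRC body can be sketched as follows; the exact JSON shape accepted by real AMI firmware may differ, so treat this as an illustration of "all groups zeroed" (`crcInvalidationBody` is a hypothetical helper):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// crcInvalidationBody builds a payload with every inventory CRC group
// zeroed, as POSTed to {systemPath}/Oem/Ami/Inventory/Crc before
// power-on (ADL-034). Group names follow the ADL.
func crcInvalidationBody() ([]byte, error) {
	groups := map[string]int{
		"CPU": 0, "DIMM": 0, "PCIE": 0, "CERTIFICATES": 0, "SECUREBOOT": 0,
	}
	return json.Marshal(groups)
}

func main() {
	body, err := crcInvalidationBody()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```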

---

## ADL-035 — Redfish hardware event log collection from Systems LogServices

**Date:** 2026-03-18
**Context:** Redfish BMCs expose event logs via `LogServices/{svc}/Entries`. On MSI/AMI this includes the IPMI SEL with hardware events (temperature, power, drive failures, etc.). Live collection previously collected only inventory/sensor snapshots; event history was unavailable in Reanimator.
**Decision:**
- After tree-walk, fetch hardware log entries separately via `collectRedfishLogEntries()` (not part of tree-walk to avoid bloat).
- Only `Systems/{sys}/LogServices` is queried — Managers LogServices (BMC audit/journal) are excluded.
- Log services with Id/Name containing "audit", "journal", "bmc", "security", "manager", "debug" are skipped.
- Entries older than 7 days (client-side filter) are discarded. Pages are followed until an out-of-window entry is found (assumes newest-first ordering, typical for BMCs).
- Entries with `EntryType: "Oem"` or `MessageId` containing user/auth/login keywords are filtered as non-hardware.
- Raw entries stored in `rawPayloads["redfish_log_entries"]` as `[]map[string]interface{}`.
- Parsed to `models.Event` in `parseRedfishLogEntries()` during replay — same path for live and offline.
- Max 200 entries per log service, 500 total to limit BMC load.
**Consequences:**
- Hardware event history (last 7 days) visible in Reanimator `EventLogs` section.
- No impact on existing inventory pipeline or offline archive replay (archives without `redfish_log_entries` key silently skip parsing).
- Adds extra HTTP requests during live collection (sequential, after tree-walk completes).

---

## ADL-036 — Redfish profile matching may use platform grammar hints beyond vendor strings

**Date:** 2026-03-25
**Context:**
Some BMCs expose unusable `Manufacturer` / `Model` values (`NULL`, placeholders, or generic SoC
names) while still exposing a stable platform-specific Redfish grammar: repeated member names,
firmware inventory IDs, OEM action names, and target-path quirks. Matching only on vendor
strings forced such systems into fallback mode even when the platform shape was consistent.

**Decision:**
- Extend `redfishprofile.MatchSignals` with doc-derived hint tokens collected from discovery docs
and replay snapshots.
- Allow profile matchers to score on stable platform grammar such as:
  - collection member naming (`outboardPCIeCard*`, drive slot grammars)
  - firmware inventory member IDs
  - OEM action/type markers and linked target paths
- During live collection, gather only lightweight extra hint collections needed for matching
(`NetworkInterfaces`, `NetworkAdapters`, `Drives`, `UpdateService/FirmwareInventory`), not slow
deep inventory branches.
- Keep such profiles out of fallback aggregation unless they are proven safe as broad additive
hints.

**Consequences:**
- Platform-family profiles can activate even when vendor strings are absent or set to `NULL`.
- Matching logic becomes more robust for OEM BMC implementations that differ mainly by Redfish
grammar rather than by explicit vendor strings.
- Live collection gains a small amount of extra discovery I/O to harvest stable member IDs, but
avoids slow deep probes such as `Assembly` just for profile selection.

---

## ADL-037 — easy-bee archives are parsed from the embedded bee-audit snapshot

**Date:** 2026-03-25
**Context:**
`reanimator-easy-bee` support bundles already contain a normalized hardware snapshot in
`export/bee-audit.json` plus supporting logs and techdump files. Rebuilding the same inventory
from raw `techdump/` files inside LOGPile would duplicate parser logic and create drift between
the producer utility and archive importer.

**Decision:**
- Add a dedicated `easy_bee` vendor parser for `bee-support-*.tar.gz` bundles.
- Detect the bundle by `manifest.txt` (`bee_version=...`) plus `export/bee-audit.json`.
- Parse the archive from the embedded snapshot first; treat `techdump/` and runtime files as
secondary context only.
- Normalize snapshot-only fields needed by LOGPile, notably:
  - flatten `hardware.sensors` groups into `[]SensorReading`
  - turn runtime issues/status into `[]Event`
  - synthesize a board FRU entry when the snapshot does not include FRU data

**Consequences:**
- LOGPile stays aligned with the schema emitted by `reanimator-easy-bee`.
- Adding support required only a thin archive adapter instead of a full hardware parser.
- If the upstream utility changes the embedded snapshot schema, the `easy_bee` adapter is the
only place that must be updated.
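
The detection rule can be sketched as a predicate over archive member names and the manifest text (`looksLikeEasyBeeBundle` is a hypothetical helper; the real parser walks the tar stream):

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeEasyBeeBundle sketches the ADL-037 detection rule: a bundle
// is recognized by a bee_version line in manifest.txt plus the embedded
// export/bee-audit.json member. names are archive member paths.
func looksLikeEasyBeeBundle(names []string, manifest string) bool {
	hasAudit := false
	for _, n := range names {
		if n == "export/bee-audit.json" {
			hasAudit = true
		}
	}
	return hasAudit && strings.Contains(manifest, "bee_version=")
}

func main() {
	members := []string{"manifest.txt", "export/bee-audit.json", "techdump/dmesg.txt"}
	fmt.Println(looksLikeEasyBeeBundle(members, "bee_version=1.4\n")) // detected
	fmt.Println(looksLikeEasyBeeBundle(members, "version=1.4\n"))     // not an easy-bee bundle
}
```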

---

## ADL-038 — HPE AHS parser uses hybrid extraction instead of full `zbb` schema decoding

**Date:** 2026-03-30
**Context:** HPE iLO Active Health System exports (`.ahs`) are proprietary `ABJR` containers
with gzip-compressed `zbb` payloads. The sample inventory data contains two practical signal
families: printable SMBIOS/FRU-style strings and embedded Redfish JSON subtrees, especially for
storage controllers and drives. Full `zbb` binary schema decoding is not documented and would add
significant complexity before proving user value.
**Decision:** Support HPE AHS with a hybrid parser:
- decode the outer `ABJR` container
- gunzip embedded members when applicable
- extract inventory from printable SMBIOS/FRU payloads
- extract storage/controller/backplane details from embedded Redfish JSON objects
- enrich firmware and PSU inventory from auxiliary package payloads such as `bcert.pkg`
- do not attempt complete semantic decoding of the internal `zbb` record format
**Consequences:**
- Parser reaches inventory-grade usefulness quickly for HPE `.ahs` uploads.
- Storage inventory is stronger than text-only parsing because it reuses structured Redfish data when present.
- Auxiliary package payloads can supply missing firmware/PSU fields even when the main SMBIOS-like blob is incomplete.
- Future deeper `zbb` decoding can be added incrementally without replacing the current parser contract.

---

## ADL-039 — Canonical inventory keeps DIMMs with unknown capacity when identity is known

**Date:** 2026-03-30
**Context:** Some sources, notably HPE iLO AHS SMBIOS-like blobs, expose installed DIMM identity
(slot, serial, part number, manufacturer) but do not include capacity. The parser already extracts
those modules into `Hardware.Memory`, but canonical device building and export previously dropped
them because `size_mb == 0`.
**Decision:** Treat a DIMM as installed inventory when `present=true` and it has identifying
memory fields such as serial number or part number, even if `size_mb` is unknown.
**Consequences:**
- HPE AHS uploads now show real installed memory modules instead of hiding them.
- Empty slots still stay filtered because they lack inventory identity or are marked absent.
- Specification/export can include "size unknown" memory entries without inventing capacity data.
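
The keep rule reduces to a predicate (`keepDIMM` is an illustrative name; the real check works on the parsed memory model):

```go
package main

import "fmt"

// keepDIMM sketches the ADL-039 rule: a present module with identifying
// fields is installed inventory even when capacity is unknown.
func keepDIMM(present bool, sizeMB int, serial, partNumber string) bool {
	if !present {
		return false
	}
	if sizeMB > 0 {
		return true
	}
	return serial != "" || partNumber != ""
}

func main() {
	fmt.Println(keepDIMM(true, 0, "SN123", ""))     // identity known, size unknown → keep
	fmt.Println(keepDIMM(true, 0, "", ""))          // no identity → filtered
	fmt.Println(keepDIMM(false, 32768, "SN", "PN")) // absent slot → filtered
}
```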

---

## ADL-040 — HPE Redfish normalization prefers chassis `Devices/*` over generic PCIe topology labels

**Date:** 2026-03-30
**Context:** HPE ProLiant Gen11 Redfish snapshots expose parallel inventory trees. `Chassis/*/PCIeDevices/*`
is good for topology presence, but often reports only generic `DeviceType` values such as
`SingleFunction`. `Chassis/*/Devices/*` carries the concrete slot label, richer device type, and
product-vs-spare part identifiers for the same physical NIC/controller. Replay fallback over empty
storage volume collections can also discover `Volumes/Capabilities` children, which are not real
logical volumes.

**Decision:**
- Treat Redfish `SKU` as a valid fallback for `hardware.board.part_number` when `PartNumber` is empty.
- Ignore `Volumes/Capabilities` documents during logical-volume parsing.
- Enrich `Chassis/*/PCIeDevices/*` entries with matching `Chassis/*/Devices/*` documents by
serial/name/part identity.
- Keep `pcie.device_class` semantic; do not replace it with model or part-number strings when
Redfish exposes only generic topology labels.

**Consequences:**
- HPE Redfish imports now keep the server SKU in `hardware.board.part_number`.
- Empty volume collections no longer produce fake `Capabilities` volume records.
- HPE PCIe inventory gets better slot labels like `OCP 3.0 Slot 15` plus concrete classes such as
`LOM/NIC` or `SAS/SATA Storage Controller`.
- `part_number` remains available separately for model identity, without polluting the class field.
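
The SKU fallback is a one-line rule; a sketch with an illustrative helper name:

```go
package main

import "fmt"

// boardPartNumber sketches the ADL-040 fallback: prefer PartNumber,
// fall back to the Redfish SKU when PartNumber is empty.
func boardPartNumber(partNumber, sku string) string {
	if partNumber != "" {
		return partNumber
	}
	return sku
}

func main() {
	fmt.Println(boardPartNumber("", "P28948-B21"))             // SKU fills the gap
	fmt.Println(boardPartNumber("X12DPG-QT6-REV1.01", "SKU1")) // PartNumber wins
}
```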

---

## ADL-041 — Redfish replay drops topology-only PCIe noise classes from canonical inventory

**Date:** 2026-04-01
**Context:** Some Redfish BMCs, especially MSI/AMI GPU systems, expose a very wide PCIe topology
tree under `Chassis/*/PCIeDevices/*`. Besides real endpoint devices, the replay sees bridge stages,
CPU-side helper functions, IMC/mesh signal-processing nodes, USB/SPI side controllers, and GPU
display-function duplicates reported as generic `Display Device`. Keeping all of them in
`hardware.pcie_devices` pollutes downstream exports such as Reanimator and hides the actual
endpoint inventory signal.

**Decision:**
- Filter topology-only PCIe records during Redfish replay, not in the UI layer.
- Drop PCIe entries with replay-resolved classes:
  - `Bridge`
  - `Processor`
  - `SignalProcessingController`
  - `SerialBusController`
- Drop `DisplayController` entries when the source Redfish PCIe document is the generic MSI-style
`Description: "Display Device"` duplicate.
- Drop PCIe network endpoints when their PCIe functions already link to `NetworkDeviceFunctions`,
because those devices are represented canonically in `hardware.network_adapters`.
- When `Systems/*/NetworkInterfaces/*` links back to a chassis `NetworkAdapter`, match against the
fully enriched chassis NIC identity to avoid creating a second ghost NIC row with the raw
`NetworkAdapter_*` slot/name.
- Treat generic Redfish object names such as `NetworkAdapter_*` and `PCIeDevice_*` as placeholder
models and replace them from PCI IDs when a concrete vendor/device match exists.
- Drop MSI-style storage service PCIe endpoints whose resolved device names are only
`Volume Management Device NVMe RAID Controller` or `PCIe Switch management endpoint`; storage
inventory already comes from the Redfish storage tree.
- Normalize Ethernet-class NICs into the single exported class `NetworkController`; do not split
`EthernetController` into a separate top-level inventory section.
- Keep endpoint classes such as `NetworkController`, `MassStorageController`, and dedicated GPU
inventory coming from `hardware.gpus`.

**Consequences:**
- `hardware.pcie_devices` becomes closer to real endpoint inventory instead of raw PCIe topology.
- Reanimator exports stop showing MSI bridge/processor/display duplicate noise.
- Reanimator exports no longer duplicate the same MSI NIC as both `PCIeDevice_*` and
`NetworkAdapter_*`.
- Replay no longer creates extra NIC rows from `Systems/NetworkInterfaces` when the same adapter
was already normalized from `Chassis/NetworkAdapters`.
- MSI VMD / PCIe switch storage service endpoints no longer pollute PCIe inventory.
- UI/Reanimator group all Ethernet NICs under the same `NETWORKCONTROLLER` section.
- Canonical NIC inventory prefers resolved PCI product names over generic Redfish placeholder names.
- The raw Redfish snapshot still remains available in `raw_payloads.redfish_tree` for low-level
troubleshooting if topology details are ever needed.
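
The unconditional noise-class list can be sketched as a set-membership check (illustrative helper; the conditional `DisplayController` and NIC-dedup rules need document context and are omitted here):

```go
package main

import "fmt"

// topologyNoiseClass reports whether a replay-resolved PCIe class is
// topology-only noise dropped under ADL-041. DisplayController is
// handled conditionally in the real code (only the generic MSI-style
// "Display Device" duplicate is dropped), so it is not listed.
func topologyNoiseClass(class string) bool {
	switch class {
	case "Bridge", "Processor", "SignalProcessingController", "SerialBusController":
		return true
	}
	return false
}

func main() {
	fmt.Println(topologyNoiseClass("Bridge"))            // dropped
	fmt.Println(topologyNoiseClass("NetworkController")) // kept
}
```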

---

## ADL-042 — xFusion file-export archives merge AppDump inventory with RTOS/Log snapshots

**Date:** 2026-04-04
**Context:** xFusion iBMC `tar.gz` exports expose the base inventory in `AppDump/`, but the most
useful NIC and firmware details live elsewhere: NIC firmware/MAC snapshots in
`LogDump/netcard/netcard_info.txt` and system firmware versions in
`RTOSDump/versioninfo/app_revision.txt`. Parsing only `AppDump/` left xFusion uploads detectable but
incomplete for UI and Reanimator consumers.

**Decision:**
- Treat xFusion file-export `tar.gz` bundles as a first-class archive parser input.
- Merge OCP NIC identity from `AppDump/card_manage/card_info` with the latest per-slot snapshot
from `LogDump/netcard/netcard_info.txt` to produce `hardware.network_adapters`.
- Import system-level firmware from `RTOSDump/versioninfo/app_revision.txt` into
`hardware.firmware`.
- Allow FRU fallback from `RTOSDump/versioninfo/fruinfo.txt` when `AppDump/FruData/fruinfo.txt`
is absent.

**Consequences:**
- xFusion uploads now preserve NIC BDF, MAC, firmware, and serial identity in normalized output.
- System firmware such as BIOS and iBMC versions survives xFusion file exports.
- xFusion archives participate more reliably in canonical device/export flows without special UI
cases.

---

## ADL-043 — Extended HGX diagnostic plan-B is opt-in from the live collect form

**Date:** 2026-04-13
**Context:** Some Supermicro HGX Redfish targets expose slow or hanging component-chassis inventory
collections during critical plan-B, especially under `Chassis/HGX_*` for `Assembly`,
`Accelerators`, `Drives`, `NetworkAdapters`, and `PCIeDevices`. Default collection should not
block operators on deep diagnostic retries that are useful mainly for troubleshooting.
**Decision:** Keep the normal snapshot/replay path unchanged, but gate those heavy HGX
component-chassis critical plan-B retries behind the existing live-collect `debug_payloads` flag,
presented in the UI as "Сбор расширенных данных для диагностики" ("Collect extended diagnostic data").
**Consequences:**
- Default live collection skips those heavy diagnostic plan-B retries and reaches replay faster.
- Operators can explicitly opt into the slower diagnostic path when they need deeper collection.
- The same user-facing toggle continues to enable extra debug payload capture for troubleshooting.

---

## ADL-044 — LOGPile project release tags use `vN.M`

**Date:** 2026-04-13
**Context:** The repository accumulated release tags in `vN.M.P` form, while the shared module
versioning contract in `bible/rules/patterns/module-versioning/contract.md` standardizes version
shape as `N.M`. Release tooling reads the git tag verbatim into build metadata and release
artifacts, so inconsistent tag shape leaks directly into packaged versions.
**Decision:** Use `vN.M` for LOGPile project release tags going forward. Do not create new
`vN.M.P` tags for repository releases. Build metadata, release directory names, and release notes
continue to inherit the exact git tag string from `git describe --tags`.
**Consequences:**
- Future project releases have a two-component version string such as `v1.12`.
- Release artifacts and `--version` output stay aligned with the tag shape without extra mapping.
- Existing historical `vN.M.P` tags remain as-is unless explicitly rewritten.
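
The adopted tag shape can be checked with a simple pattern (illustrative sketch; release tooling may validate differently):

```go
package main

import (
	"fmt"
	"regexp"
)

// releaseTagRe matches the vN.M tag shape adopted in ADL-044.
var releaseTagRe = regexp.MustCompile(`^v\d+\.\d+$`)

func main() {
	fmt.Println(releaseTagRe.MatchString("v1.12"))   // new two-component shape
	fmt.Println(releaseTagRe.MatchString("v1.12.3")) // legacy three-component shape
}
```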

@@ -1,59 +1,42 @@
# LOGPile Bible

> **Documentation language:** English only. All maintained project documentation must be written in English.
>
> **Architectural decisions:** Every significant architectural decision **must** be recorded in
> [`10-decisions.md`](10-decisions.md) before or alongside the code change.
>
> **Single source of truth:** Architecture and technical design documentation belongs in `docs/bible/`.
> Keep `README.md` and `CLAUDE.md` minimal to avoid duplicate documentation.
`bible-local/` is the project-specific source of truth for LOGPile.
Keep top-level docs minimal and put maintained architecture/API contracts here.

This directory is the single source of truth for LOGPile's architecture, design, and integration contracts.
It is structured so that both humans and AI assistants can navigate it quickly.
## Rules

---
- Documentation language: English only
- Update relevant bible files in the same change as the code
- Record significant architectural decisions in [`10-decisions.md`](10-decisions.md)
- Do not duplicate shared rules from `bible/`

## Reading Map (Hierarchical)
## Read order

### 1. Foundations (read first)
| File | Purpose |
|------|---------|
| [01-overview.md](01-overview.md) | Product scope, modes, non-goals |
| [02-architecture.md](02-architecture.md) | Runtime structure, state, main flows |
| [04-data-models.md](04-data-models.md) | Stable data contracts and canonical inventory |
| [03-api.md](03-api.md) | HTTP endpoints and response contracts |
| [05-collectors.md](05-collectors.md) | Live collection behavior |
| [06-parsers.md](06-parsers.md) | Archive parser framework and vendor coverage |
| [07-exporters.md](07-exporters.md) | Raw export, Reanimator export, batch convert |
| [docs/hardware-ingest-contract.md](docs/hardware-ingest-contract.md) | Reanimator ingest schema mirrored locally |
| [08-build-release.md](08-build-release.md) | Build and release workflow |
| [09-testing.md](09-testing.md) | Test expectations and regression rules |
| [10-decisions.md](10-decisions.md) | Architectural Decision Log |

| File | What it covers |
|------|----------------|
| [01-overview.md](01-overview.md) | Product purpose, operating modes, scope |
| [02-architecture.md](02-architecture.md) | Runtime structure, control flow, in-memory state |
| [04-data-models.md](04-data-models.md) | Core contracts (`AnalysisResult`, canonical `hardware.devices`) |
## Fast orientation

### 2. Runtime Interfaces

| File | What it covers |
|------|----------------|
| [03-api.md](03-api.md) | HTTP API contracts and endpoint behavior |
| [05-collectors.md](05-collectors.md) | Live collection connectors (Redfish, IPMI mock) |
| [06-parsers.md](06-parsers.md) | Archive parser framework and vendor parsers |
| [07-exporters.md](07-exporters.md) | CSV / JSON / Reanimator exports and integration mapping |

### 3. Delivery & Quality

| File | What it covers |
|------|----------------|
| [08-build-release.md](08-build-release.md) | Build, packaging, release workflow |
| [09-testing.md](09-testing.md) | Testing expectations and verification guidance |

### 4. Governance (always current)

| File | What it covers |
|------|----------------|
| [10-decisions.md](10-decisions.md) | Architectural Decision Log (ADL) |

---

## Quick orientation for AI assistants

- Read order for most changes: `01` → `02` → `04` → relevant interface doc(s) → `10`
- Entry point: `cmd/logpile/main.go`
- HTTP server: `internal/server/` — handlers in `handlers.go`, routes in `server.go`
- Data contracts: `internal/models/` — never break `AnalysisResult` JSON shape
- Frontend contract: `web/static/js/app.js` — keep API responses stable
- Canonical inventory: `hardware.devices` in `AnalysisResult` — source of truth for UI and exports
- Parser registry: `internal/parser/vendors/` — `init()` auto-registration pattern
- Collector registry: `internal/collector/registry.go`
- HTTP layer: `internal/server/`
- Core contracts: `internal/models/models.go`
- Live collection: `internal/collector/`
- Archive parsing: `internal/parser/`
- Export conversion: `internal/exporter/`
- Frontend consumer: `web/static/js/app.js`

## Maintenance rule

If a document becomes stale, either fix it immediately or delete it.
Stale docs are worse than missing docs.

793
bible-local/docs/hardware-ingest-contract.md
Normal file
@@ -0,0 +1,793 @@
---
title: Hardware Ingest JSON Contract
version: "2.7"
updated: "2026-03-15"
maintainer: Reanimator Core
audience: external-integrators, ai-agents
language: ru
---

# Reanimator integration: hardware ingest JSON contract

Version: **2.7** · Date: **2026-03-15**

This document describes the JSON format for submitting server hardware data to **Reanimator** (a hardware lifecycle management system).
It is intended for developers of adjacent systems (Redfish collectors, monitoring agents, CMDB exporters) and may be included in the documentation of integrated projects.

> Latest version of this document: https://git.mchus.pro/reanimator/core/src/branch/main/bible-local/docs/hardware-ingest-contract.md

---

## Changelog

| Version | Date | Changes |
|---------|------|---------|
| 2.7 | 2026-03-15 | Explicitly forbade synthesizing data in `event_logs`; integrators must not invent component serial numbers the source did not provide |
| 2.6 | 2026-03-15 | Added the optional `event_logs` section for dedup/upsert of `host` / `bmc` / `redfish` logs outside the history timeline |
| 2.5 | 2026-03-15 | Added the shared optional `manufactured_year_week` field for component sections (`YYYY-Www`) |
| 2.4 | 2026-03-15 | Added the first wave of component telemetry: health/life fields for `cpus`, `memory`, `storage`, `pcie_devices`, `power_supplies` |
| 2.3 | 2026-03-15 | Added component telemetry fields: `pcie_devices.temperature_c`, `pcie_devices.power_w`, `power_supplies.temperature_c` |
| 2.2 | 2026-03-15 | Added the `numa_node` field to `pcie_devices` for topology/affinity |
| 2.1 | 2026-03-15 | Added the `sensors` section (fans, power, temperatures, other); `mac_addresses` field on `pcie_devices`; extended the list of `device_class` values |
| 2.0 | 2026-02-01 | Component status history (`status_history`, `status_changed_at`); PSU telemetry fields; async job response |
| 1.0 | 2026-01-01 | Initial contract version |

---

## Principles

1. **Snapshot**: the JSON describes the server state at collection time. It may include component status change history.
2. **Idempotency**: re-sending an identical payload does not create duplicates (deduplication by hash).
3. **Partial payloads**: only the sections for which data is available need to be sent. An empty array and an absent section are equivalent.
4. **Strict schema**: the endpoint uses a strict JSON decoder; unknown fields result in `400 Bad Request`.
5. **Event-driven**: an import creates timeline events (LOG_COLLECTED, INSTALLED, REMOVED, FIRMWARE_CHANGED, and others).
6. **No integrator-side synthesis**: the collector submits only values it actually collected. Do not invent `serial_number`, `component_ref`, `message`, `message_id`, or other identifiers/attributes when the source did not provide them or the parser could not extract them reliably.

---

## Endpoint

```
POST /ingest/hardware
Content-Type: application/json
```

**Response on acceptance (202 Accepted):**
```json
{
  "status": "accepted",
  "job_id": "job_01J..."
}
```

The import runs asynchronously. The result is available at:
```
GET /ingest/hardware/jobs/{job_id}
```

**Response on job success:**
```json
{
  "status": "success",
  "bundle_id": "lb_01J...",
  "asset_id": "mach_01J...",
  "collected_at": "2026-02-10T15:30:00Z",
  "duplicate": false,
  "summary": {
    "parts_observed": 15,
    "parts_created": 2,
    "parts_updated": 13,
    "installations_created": 2,
    "installations_closed": 1,
    "timeline_events_created": 9,
    "failure_events_created": 1
  }
}
```

**Response on duplicate:**
```json
{
  "status": "success",
  "duplicate": true,
  "message": "LogBundle with this content hash already exists"
}
```

**Response on error (400 Bad Request):**
```json
{
  "status": "error",
  "error": "validation_failed",
  "details": {
    "field": "hardware.board.serial_number",
    "message": "serial_number is required"
  }
}
```

Common causes of `400`:
- Invalid `collected_at` format (RFC3339 is required).
- Empty `hardware.board.serial_number`.
- An unknown JSON field at any level.
- Request body exceeds the allowed size.

---

## Top-level structure

```json
{
  "filename": "redfish://10.10.10.103",
  "source_type": "api",
  "protocol": "redfish",
  "target_host": "10.10.10.103",
  "collected_at": "2026-02-10T15:30:00Z",
  "hardware": {
    "board": { ... },
    "firmware": [ ... ],
    "cpus": [ ... ],
    "memory": [ ... ],
    "storage": [ ... ],
    "pcie_devices": [ ... ],
    "power_supplies": [ ... ],
    "sensors": { ... },
    "event_logs": [ ... ]
  }
}
```

### Top-level fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `collected_at` | string RFC3339 | **yes** | Collection timestamp |
| `hardware` | object | **yes** | Hardware snapshot |
| `hardware.board.serial_number` | string | **yes** | Board/server serial number |
| `target_host` | string | no | IP or hostname |
| `source_type` | string | no | Source type: `api`, `logfile`, `manual` |
| `protocol` | string | no | Protocol: `redfish`, `ipmi`, `snmp`, `ssh` |
| `filename` | string | no | Source identifier |

---
## Common component status fields

Apply to all component sections (`cpus`, `memory`, `storage`, `pcie_devices`, `power_supplies`).

| Field | Type | Description |
|-------|------|-------------|
| `status` | string | Current status: `OK`, `Warning`, `Critical`, `Unknown`, `Empty` |
| `status_checked_at` | string RFC3339 | Time of the last status check |
| `status_changed_at` | string RFC3339 | Time of the last status change |
| `status_history` | array | History of status transitions (see below) |
| `error_description` | string | Error/diagnostic text |
| `manufactured_year_week` | string | Manufacture date in `YYYY-Www` format, e.g. `2024-W07` |

**`status_history[]` object:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `status` | string | **yes** | Status at that moment |
| `changed_at` | string RFC3339 | **yes** | Transition time (entries without this field are ignored) |
| `details` | string | no | Explanation of the transition |

**Event-time priority rules:**

1. `status_changed_at`
2. The last `status_history` entry with a matching status
3. The last parseable `status_history` entry
4. `status_checked_at`

**Status submission rules:**
- Submit `status` as the component's current state in the snapshot.
- If the source keeps history, submit `status_history` sorted by `changed_at` in ascending order.
- Do not include `status_history` entries without `changed_at`.
- All dates are RFC3339; UTC (`Z`) is recommended.
- Use `manufactured_year_week` when the source knows only the manufacture year and week, without an exact calendar date.

---

## Hardware sections

### board

Basic server information. Required section.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `serial_number` | string | **yes** | Serial number (Asset identification key) |
| `manufacturer` | string | no | Manufacturer |
| `product_name` | string | no | Model |
| `part_number` | string | no | Part number |
| `uuid` | string | no | System UUID |

`"NULL"` values in string fields are treated as missing data.

```json
"board": {
  "manufacturer": "Supermicro",
  "product_name": "X12DPG-QT6",
  "serial_number": "21D634101",
  "part_number": "X12DPG-QT6-REV1.01",
  "uuid": "d7ef2fe5-2fd0-11f0-910a-346f11040868"
}
```

---
|
||||
|
||||
### firmware

Firmware versions of system components (BIOS, BMC, CPLD, etc.).

| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `device_name` | string | **yes** | Device name (`BIOS`, `BMC`, `CPLD`, …) |
| `version` | string | **yes** | Firmware version |

Entries with an empty `device_name` or `version` are ignored.
A version change creates a `FIRMWARE_CHANGED` event for the Asset.

```json
"firmware": [
  { "device_name": "BIOS", "version": "06.08.05" },
  { "device_name": "BMC", "version": "5.17.00" },
  { "device_name": "CPLD", "version": "01.02.03" }
]
```

---

### cpus

| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `socket` | int | **yes** | Socket number (used for serial generation) |
| `model` | string | no | Processor model |
| `manufacturer` | string | no | Manufacturer |
| `cores` | int | no | Core count |
| `threads` | int | no | Thread count |
| `frequency_mhz` | int | no | Current frequency |
| `max_frequency_mhz` | int | no | Maximum frequency |
| `temperature_c` | float | no | CPU temperature, °C (telemetry) |
| `power_w` | float | no | Current CPU power draw, W (telemetry) |
| `throttled` | bool | no | Thermal/power throttling observed |
| `correctable_error_count` | int | no | Correctable CPU error count |
| `uncorrectable_error_count` | int | no | Uncorrectable CPU error count |
| `life_remaining_pct` | float | no | Remaining life / health, % |
| `life_used_pct` | float | no | Used life / wear, % |
| `serial_number` | string | no | Serial number (if available) |
| `firmware` | string | no | Microcode version; if the logger reports a `Microcode level`, pass it here as-is |
| `present` | bool | no | Presence (defaults to `true`) |
| + common status fields | | | see the section above |

**Serial generation when absent:** `{board_serial}-CPU-{socket}`

If the source uses a `Microcode level` field/label, pass its value to `cpus[].firmware` without any transformation.

```json
"cpus": [
  {
    "socket": 0,
    "model": "INTEL(R) XEON(R) GOLD 6530",
    "cores": 32,
    "threads": 64,
    "frequency_mhz": 2100,
    "max_frequency_mhz": 4000,
    "temperature_c": 61.5,
    "power_w": 182.0,
    "throttled": false,
    "manufacturer": "Intel",
    "status": "OK",
    "status_checked_at": "2026-02-10T15:28:00Z"
  }
]
```

---

### memory

| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `slot` | string | no | Slot identifier |
| `present` | bool | no | Module presence (defaults to `true`) |
| `serial_number` | string | no | Serial number |
| `part_number` | string | no | Part number (used as the model) |
| `manufacturer` | string | no | Manufacturer |
| `size_mb` | int | no | Capacity in MB |
| `type` | string | no | Type: `DDR3`, `DDR4`, `DDR5`, … |
| `max_speed_mhz` | int | no | Maximum speed |
| `current_speed_mhz` | int | no | Current speed |
| `temperature_c` | float | no | DIMM/module temperature, °C (telemetry) |
| `correctable_ecc_error_count` | int | no | Correctable ECC error count |
| `uncorrectable_ecc_error_count` | int | no | Uncorrectable ECC error count |
| `life_remaining_pct` | float | no | Remaining life / health, % |
| `life_used_pct` | float | no | Used life / wear, % |
| `spare_blocks_remaining_pct` | float | no | Remaining spare blocks, % |
| `performance_degraded` | bool | no | Performance degradation observed |
| `data_loss_detected` | bool | no | Source signals risk or occurrence of data loss |
| + common status fields | | | see the section above |

A module without a `serial_number` is ignored, as is a module with `present=false` or `status=Empty`.

```json
"memory": [
  {
    "slot": "CPU0_C0D0",
    "present": true,
    "size_mb": 32768,
    "type": "DDR5",
    "max_speed_mhz": 4800,
    "current_speed_mhz": 4800,
    "temperature_c": 43.0,
    "correctable_ecc_error_count": 0,
    "manufacturer": "Hynix",
    "serial_number": "80AD032419E17CEEC1",
    "part_number": "HMCG88AGBRA191N",
    "status": "OK"
  }
]
```

---

### storage

| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `slot` | string | no | Canonical installation address; for PCIe-attached drives pass the BDF (`0000:18:00.0`) |
| `serial_number` | string | no | Serial number |
| `model` | string | no | Model |
| `manufacturer` | string | no | Manufacturer |
| `type` | string | no | Type: `NVMe`, `SSD`, `HDD` |
| `interface` | string | no | Interface: `NVMe`, `SATA`, `SAS` |
| `size_gb` | int | no | Size in GB |
| `temperature_c` | float | no | Drive temperature, °C (telemetry) |
| `power_on_hours` | int64 | no | Power-on time, hours |
| `power_cycles` | int64 | no | Power cycle count |
| `unsafe_shutdowns` | int64 | no | Unsafe shutdowns |
| `media_errors` | int64 | no | Media errors |
| `error_log_entries` | int64 | no | Error log entry count |
| `written_bytes` | int64 | no | Total bytes written |
| `read_bytes` | int64 | no | Total bytes read |
| `life_used_pct` | float | no | Used life / wear, % |
| `life_remaining_pct` | float | no | Remaining life / health, % |
| `available_spare_pct` | float | no | Available spare, % |
| `reallocated_sectors` | int64 | no | Reallocated sectors |
| `current_pending_sectors` | int64 | no | Sectors pending remap |
| `offline_uncorrectable` | int64 | no | Uncorrectable errors from offline scan |
| `firmware` | string | no | Firmware version |
| `present` | bool | no | Presence (defaults to `true`) |
| + common status fields | | | see the section above |

A drive without a `serial_number` is ignored. A `firmware` change creates a `FIRMWARE_CHANGED` event.

```json
"storage": [
  {
    "slot": "OB01",
    "type": "NVMe",
    "model": "INTEL SSDPF2KX076T1",
    "size_gb": 7680,
    "temperature_c": 38.5,
    "power_on_hours": 12450,
    "unsafe_shutdowns": 3,
    "written_bytes": 9876543210,
    "life_remaining_pct": 91.0,
    "serial_number": "BTAX41900GF87P6DGN",
    "manufacturer": "Intel",
    "firmware": "9CV10510",
    "interface": "NVMe",
    "present": true,
    "status": "OK"
  }
]
```

---

### pcie_devices

| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `slot` | string | no | Slot identifier |
| `vendor_id` | int | no | PCI Vendor ID (decimal) |
| `device_id` | int | no | PCI Device ID (decimal) |
| `numa_node` | int | no | NUMA node / CPU affinity of the device |
| `temperature_c` | float | no | Device temperature, °C (telemetry) |
| `power_w` | float | no | Current device power draw, W (telemetry) |
| `life_remaining_pct` | float | no | Remaining life / health, % |
| `life_used_pct` | float | no | Used life / wear, % |
| `ecc_corrected_total` | int64 | no | Total correctable ECC errors |
| `ecc_uncorrected_total` | int64 | no | Total uncorrectable ECC errors |
| `hw_slowdown` | bool | no | Device entered hardware slowdown / protective mode |
| `battery_charge_pct` | float | no | Battery / supercap charge, % |
| `battery_health_pct` | float | no | Battery / supercap health, % |
| `battery_temperature_c` | float | no | Battery / supercap temperature, °C |
| `battery_voltage_v` | float | no | Battery / supercap voltage, V |
| `battery_replace_required` | bool | no | Battery / supercap replacement required |
| `sfp_temperature_c` | float | no | SFP/optic temperature, °C |
| `sfp_tx_power_dbm` | float | no | TX optical power, dBm |
| `sfp_rx_power_dbm` | float | no | RX optical power, dBm |
| `sfp_voltage_v` | float | no | SFP voltage, V |
| `sfp_bias_ma` | float | no | SFP bias current, mA |
| `bdf` | string | no | Deprecated alias for `slot`; when present, ingest normalizes it into `slot` |
| `device_class` | string | no | Device class (see the list below) |
| `manufacturer` | string | no | Manufacturer |
| `model` | string | no | Model |
| `serial_number` | string | no | Serial number |
| `firmware` | string | no | Firmware version |
| `link_width` | int | no | Current link width |
| `link_speed` | string | no | Current speed: `Gen3`, `Gen4`, `Gen5` |
| `max_link_width` | int | no | Maximum link width |
| `max_link_speed` | string | no | Maximum speed |
| `mac_addresses` | string[] | no | Port MAC addresses (for network devices) |
| `present` | bool | no | Presence (defaults to `true`) |
| + common status fields | | | see the section above |

Pass `numa_node` for NIC / InfiniBand / RAID / GPU devices when the source knows the CPU/NUMA affinity. The field is stored in the PCIe component's snapshot attributes and duplicated in telemetry for topology use cases.
Use `temperature_c` and `power_w` for device-level telemetry of GPUs / accelerators / smart PCIe devices. They do not affect component identification.

**Serial generation when absent or `"N/A"`:** `{board_serial}-PCIE-{slot}`, where `slot` for PCIe is the BDF.

`slot` is the single canonical component address. For PCIe, pass the BDF in `slot`. The `bdf` field is kept only as a transitional alias on input and must not be used as a separate coordinate alongside `slot`.

**`device_class` values:**

| Value | Purpose |
|----------|------------|
| `MassStorageController` | RAID controllers |
| `StorageController` | HBAs, SAS controllers |
| `NetworkController` | Network adapters (InfiniBand, generic) |
| `EthernetController` | Ethernet NICs |
| `FibreChannelController` | Fibre Channel HBAs |
| `VideoController` | GPUs, graphics cards |
| `ProcessingAccelerator` | Compute accelerators (AI/ML) |
| `DisplayController` | Display controllers (BMC VGA) |

The list is open-ended: arbitrary strings are allowed for non-standard classes.

```json
"pcie_devices": [
  {
    "slot": "0000:3b:00.0",
    "vendor_id": 5555,
    "device_id": 4401,
    "numa_node": 0,
    "temperature_c": 48.5,
    "power_w": 18.2,
    "sfp_temperature_c": 36.2,
    "sfp_tx_power_dbm": -1.8,
    "sfp_rx_power_dbm": -2.1,
    "device_class": "EthernetController",
    "manufacturer": "Intel",
    "model": "X710 10GbE",
    "serial_number": "K65472-003",
    "firmware": "9.20 0x8000d4ae",
    "mac_addresses": ["3c:fd:fe:aa:bb:cc", "3c:fd:fe:aa:bb:cd"],
    "status": "OK"
  }
]
```

---

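The `bdf` → `slot` normalization described above can be sketched on the producer side as follows. The helper name is ours, not part of the ingest API, and we assume the documented precedence: an explicit `slot` wins when both fields are present:

```python
def normalize_pcie_slot(device: dict) -> dict:
    """Fold the deprecated `bdf` alias into the canonical `slot` field."""
    dev = dict(device)          # do not mutate the caller's payload
    bdf = dev.pop("bdf", None)  # `bdf` never survives as a separate coordinate
    if not dev.get("slot") and bdf:
        dev["slot"] = bdf       # slot for PCIe == BDF
    return dev
```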
### power_supplies

| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `slot` | string | no | Slot identifier |
| `present` | bool | no | Presence (defaults to `true`) |
| `serial_number` | string | no | Serial number |
| `part_number` | string | no | Part number |
| `model` | string | no | Model |
| `vendor` | string | no | Manufacturer |
| `wattage_w` | int | no | Rated power in watts |
| `firmware` | string | no | Firmware version |
| `input_type` | string | no | Input type (e.g. `ACWideRange`) |
| `input_voltage` | float | no | Input voltage, V (telemetry) |
| `input_power_w` | float | no | Input power, W (telemetry) |
| `output_power_w` | float | no | Output power, W (telemetry) |
| `temperature_c` | float | no | PSU temperature, °C (telemetry) |
| `life_remaining_pct` | float | no | Remaining life / health, % |
| `life_used_pct` | float | no | Used life / wear, % |
| + common status fields | | | see the section above |

Telemetry fields (`input_voltage`, `input_power_w`, `output_power_w`, `temperature_c`, `life_remaining_pct`, `life_used_pct`) are stored in component attributes and do not affect component identification.

A PSU without a `serial_number` is ignored.

```json
"power_supplies": [
  {
    "slot": "0",
    "present": true,
    "model": "GW-CRPS3000LW",
    "vendor": "Great Wall",
    "wattage_w": 3000,
    "serial_number": "2P06C102610",
    "firmware": "00.03.05",
    "status": "OK",
    "input_type": "ACWideRange",
    "input_power_w": 137,
    "output_power_w": 104,
    "input_voltage": 215.25,
    "temperature_c": 39.5,
    "life_remaining_pct": 97.0
  }
]
```

---

### sensors

Server sensor readings. This section is optional and not tied to components.
Data is stored as the last known value (last-known-value) at the Asset level.

```json
"sensors": {
  "fans": [ ... ],
  "power": [ ... ],
  "temperatures": [ ... ],
  "other": [ ... ]
}
```

---

### event_logs

Normalized operational server logs from `host`, `bmc`, or `redfish`.

These entries do not enter the history timeline and do not create history events. They are stored in a separate deduplicated log store and shown in a dedicated asset logs / host logs UI block.

| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `source` | string | **yes** | Log source: `host`, `bmc`, `redfish` |
| `event_time` | string RFC3339 | no | Event time from the source; if absent, the ingest/collection time is used |
| `severity` | string | no | Level: `OK`, `Info`, `Warning`, `Critical`, `Unknown` |
| `message_id` | string | no | Source event identifier/code |
| `message` | string | **yes** | Normalized event text |
| `component_ref` | string | no | Reference to a component/device/slot, if it can be extracted |
| `fingerprint` | string | no | Externally supplied, ready-made dedup key; if not provided, the system computes its own |
| `is_active` | bool | no | Flag that the event is still active/uncleared, when the source supports lifecycle |
| `raw_payload` | object | no | Raw vendor-specific payload for diagnostics |

**event_logs rules:**

- Logs are deduplicated within asset + source + fingerprint.
- If `fingerprint` is not provided, the system builds one from normalized fields (`source`, `message_id`, `message`, `component_ref`, time normalization).
- The integrator/log collector must not synthesize event content: do not invent `message`, `message_id`, `component_ref`, serial/device identifiers, or any other fields that are absent from the source log or were not reliably extracted.
- Receiving the same event again updates `last_seen_at`/the repeat counter and must not create a new timeline/history event.
- `event_logs` feed a separate log UI view and do not change the canonical state of components/asset by default.

```json
"event_logs": [
  {
    "source": "bmc",
    "event_time": "2026-03-15T14:03:11Z",
    "severity": "Warning",
    "message_id": "0x000F",
    "message": "Correctable ECC error threshold exceeded",
    "component_ref": "CPU0_C0D0",
    "raw_payload": {
      "sensor": "DIMM_A1",
      "sel_record_id": "0042"
    }
  },
  {
    "source": "redfish",
    "event_time": "2026-03-15T14:03:20Z",
    "severity": "Info",
    "message_id": "OpenBMC.0.1.SystemReboot",
    "message": "System reboot requested by administrator",
    "component_ref": "Mainboard"
  }
]
```

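The fingerprint fallback can be illustrated with a sketch. The spec does not define the exact normalization or hash the system uses, so the field choice, whitespace/case folding, and SHA-256 here are assumptions for illustration only:

```python
import hashlib

def event_fingerprint(entry: dict) -> str:
    """Use the supplied dedup key, or derive one from normalized log fields."""
    if entry.get("fingerprint"):
        return entry["fingerprint"]  # externally supplied key wins
    parts = [
        entry.get("source", ""),
        entry.get("message_id", ""),
        (entry.get("message") or "").strip().lower(),  # assumed normalization
        entry.get("component_ref", ""),
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
```

Two receipts of the same event then map to one key, so the store can bump `last_seen_at` instead of creating a new record.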
#### sensors.fans

| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `name` | string | **yes** | Sensor name, unique within the section |
| `location` | string | no | Physical location |
| `rpm` | int | no | Speed, RPM |
| `status` | string | no | Status: `OK`, `Warning`, `Critical`, `Unknown` |

#### sensors.power

| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `name` | string | **yes** | Unique sensor name |
| `location` | string | no | Physical location |
| `voltage_v` | float | no | Voltage, V |
| `current_a` | float | no | Current, A |
| `power_w` | float | no | Power, W |
| `status` | string | no | Status |

#### sensors.temperatures

| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `name` | string | **yes** | Unique sensor name |
| `location` | string | no | Physical location |
| `celsius` | float | no | Temperature, °C |
| `threshold_warning_celsius` | float | no | Warning threshold, °C |
| `threshold_critical_celsius` | float | no | Critical threshold, °C |
| `status` | string | no | Status |

#### sensors.other

| Field | Type | Required | Description |
|------|-----|-------------|----------|
| `name` | string | **yes** | Unique sensor name |
| `location` | string | no | Physical location |
| `value` | float | no | Value |
| `unit` | string | no | Unit of measurement |
| `status` | string | no | Status |

**sensors rules:**

- Sensor identity is the pair `(sensor_type, name)`. For duplicates within one payload, the first occurrence is taken.
- Sensors without a `name` are ignored.
- Values are overwritten on every import (upsert by key).

```json
"sensors": {
  "fans": [
    { "name": "FAN1", "location": "Front", "rpm": 4200, "status": "OK" },
    { "name": "FAN_CPU0", "location": "CPU0", "rpm": 5600, "status": "OK" }
  ],
  "power": [
    { "name": "12V Rail", "location": "Mainboard", "voltage_v": 12.06, "status": "OK" },
    { "name": "PSU0 Input", "location": "PSU0", "voltage_v": 215.25, "current_a": 0.64, "power_w": 137.0, "status": "OK" }
  ],
  "temperatures": [
    { "name": "CPU0 Temp", "location": "CPU0", "celsius": 46.0, "threshold_warning_celsius": 80.0, "threshold_critical_celsius": 95.0, "status": "OK" },
    { "name": "Inlet Temp", "location": "Front", "celsius": 22.0, "threshold_warning_celsius": 40.0, "threshold_critical_celsius": 50.0, "status": "OK" }
  ],
  "other": [
    { "name": "System Humidity", "value": 38.5, "unit": "%", "status": "OK" }
  ]
}
```

---

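The three sensors rules combine into a simple upsert, sketched below. The function and store shape are ours, not part of the system's API; only the keying and precedence follow the rules above:

```python
def upsert_sensors(store: dict, payload: dict) -> dict:
    """Apply one sensors import: upsert by (sensor_type, name).

    Within a single payload the first occurrence of a key wins;
    across imports the newest payload overwrites (last-known-value).
    """
    seen = set()
    for sensor_type, readings in payload.items():
        for reading in readings:
            name = reading.get("name")
            if not name:
                continue  # sensors without `name` are ignored
            key = (sensor_type, name)
            if key in seen:
                continue  # duplicate within this payload: keep the first
            seen.add(key)
            store[key] = reading
    return store
```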
## Component status handling

| Status | Behavior |
|--------|-----------|
| `OK` | Normal processing |
| `Warning` | A `COMPONENT_WARNING` event is created |
| `Critical` | A `COMPONENT_FAILED` event is created, plus a `failure_events` record |
| `Unknown` | The component is treated as operational; a `COMPONENT_UNKNOWN` event is created |
| `Empty` | The component is not created or updated |

---

## Handling missing serial_number

General rule for all sections: if the source did not return a serial number and the collector could not reliably extract one, the integrator must not substitute invented values, hashes, local placeholder identifiers, or best-guess serial numbers. Only the server-side ingest fallback rules explicitly listed below are allowed.

| Type | Behavior |
|-----|-----------|
| CPU | Generated: `{board_serial}-CPU-{socket}` |
| PCIe | Generated: `{board_serial}-PCIE-{slot}` (when the serial is `"N/A"` or empty; `slot` for PCIe is the BDF) |
| Memory | The component is ignored |
| Storage | The component is ignored |
| PSU | The component is ignored |

If a `serial_number` is not unique within one payload for the same `model`:
- The first occurrence keeps the original serial number.
- Each subsequent duplicate gets a placeholder: `NO_SN-XXXXXXXX`.

---

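The fallback table above can be condensed into one decision function. This is a sketch of the server-side ingest rules only (the function name is ours); a client-side collector must never generate these values itself:

```python
def fallback_serial(board_serial, section, component):
    """Apply the server-side fallback rules; None means 'ignore the component'."""
    sn = (component.get("serial_number") or "").strip()
    if section == "cpus":
        # CPUs: generate from the socket number when the serial is absent
        return sn or f"{board_serial}-CPU-{component['socket']}"
    if section == "pcie_devices":
        # PCIe: generate when the serial is empty or "N/A"; slot == BDF
        if sn and sn != "N/A":
            return sn
        return f"{board_serial}-PCIE-{component['slot']}"
    # memory / storage / power_supplies: no fallback, ignore without a serial
    return sn or None
```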
## Minimal valid example

```json
{
  "collected_at": "2026-02-10T15:30:00Z",
  "target_host": "192.168.1.100",
  "hardware": {
    "board": {
      "serial_number": "SRV-001"
    }
  }
}
```

---

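A producer can sanity-check a payload against this minimum before sending. A minimal sketch, assuming `collected_at` and `hardware.board.serial_number` are the hard requirements shown in the example (the function name is ours, not part of any shipped validator):

```python
def validate_minimal(payload: dict) -> list:
    """Return a list of problems; empty list means the payload meets the minimum."""
    errors = []
    if not payload.get("collected_at"):
        errors.append("collected_at is required")
    board = (payload.get("hardware") or {}).get("board") or {}
    if not board.get("serial_number"):
        errors.append("hardware.board.serial_number is required")
    return errors
```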
## Full example with status history

```json
{
  "filename": "redfish://10.10.10.103",
  "source_type": "api",
  "protocol": "redfish",
  "target_host": "10.10.10.103",
  "collected_at": "2026-02-10T15:30:00Z",
  "hardware": {
    "board": {
      "manufacturer": "Supermicro",
      "product_name": "X12DPG-QT6",
      "serial_number": "21D634101"
    },
    "firmware": [
      { "device_name": "BIOS", "version": "06.08.05" },
      { "device_name": "BMC", "version": "5.17.00" }
    ],
    "cpus": [
      {
        "socket": 0,
        "model": "INTEL(R) XEON(R) GOLD 6530",
        "manufacturer": "Intel",
        "cores": 32,
        "threads": 64,
        "status": "OK"
      }
    ],
    "storage": [
      {
        "slot": "OB01",
        "type": "NVMe",
        "model": "INTEL SSDPF2KX076T1",
        "size_gb": 7680,
        "serial_number": "BTAX41900GF87P6DGN",
        "manufacturer": "Intel",
        "firmware": "9CV10510",
        "present": true,
        "status": "OK",
        "status_changed_at": "2026-02-10T15:22:00Z",
        "status_history": [
          { "status": "Critical", "changed_at": "2026-02-10T15:10:00Z", "details": "I/O timeout on NVMe queue 3" },
          { "status": "OK", "changed_at": "2026-02-10T15:22:00Z", "details": "Recovered after controller reset" }
        ]
      }
    ],
    "pcie_devices": [
      {
        "slot": "0000:18:00.0",
        "device_class": "EthernetController",
        "manufacturer": "Intel",
        "model": "X710 10GbE",
        "serial_number": "K65472-003",
        "mac_addresses": ["3c:fd:fe:aa:bb:cc", "3c:fd:fe:aa:bb:cd"],
        "status": "OK"
      }
    ],
    "power_supplies": [
      {
        "slot": "0",
        "present": true,
        "model": "GW-CRPS3000LW",
        "vendor": "Great Wall",
        "wattage_w": 3000,
        "serial_number": "2P06C102610",
        "firmware": "00.03.05",
        "status": "OK",
        "input_power_w": 137,
        "output_power_w": 104,
        "input_voltage": 215.25
      }
    ],
    "sensors": {
      "fans": [
        { "name": "FAN1", "location": "Front", "rpm": 4200, "status": "OK" }
      ],
      "power": [
        { "name": "12V Rail", "voltage_v": 12.06, "status": "OK" }
      ],
      "temperatures": [
        { "name": "CPU0 Temp", "celsius": 46.0, "threshold_warning_celsius": 80.0, "threshold_critical_celsius": 95.0, "status": "OK" }
      ],
      "other": [
        { "name": "System Humidity", "value": 38.5, "unit": "%" }
      ]
    }
  }
}
```

---

**New file:** `bible-local/docs/msi-redfish-api.md` (343 lines)

# MSI BMC Redfish API Reference

Source: MSI Enterprise Platform Solutions — Redfish BMC User Guide v1.0 (AMI/MegaRAC stack).
Spec compliance: DSP0266 1.15.1, DSP8010 2019.2.

> This document is trimmed to the sections relevant to LOGPile collection and inventory analysis.
> Auth, LDAP/AD, SMTP, VirtualMedia, Certificates, RADIUS, Composability, and BMC config
> sections are omitted.

---

## Supported HTTP methods

`GET`, `POST`, `PATCH`, `DELETE`. Unsupported methods return `405`.

PATCH requires an `If-Match` / `ETag` precondition header; a missing header returns `428`, a mismatch returns `412`.

---

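The PATCH precondition behavior can be sketched as the check the BMC performs before processing a request body. This is a simulation for illustration only; a real client simply echoes the `ETag` from a prior GET back in the `If-Match` header:

```python
def check_precondition(request_headers: dict, current_etag: str) -> int:
    """Return the HTTP status a PATCH receives before its body is processed."""
    if_match = request_headers.get("If-Match")
    if if_match is None:
        return 428  # Precondition Required: If-Match header missing
    if if_match != current_etag:
        return 412  # Precondition Failed: stale ETag, re-GET the resource
    return 200      # precondition satisfied; the PATCH proceeds
```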
## 1. Core Redfish API endpoints

| Resource | URI | Schema |
|---|---|---|
| Service Root | `/redfish/v1/` | ServiceRoot.v1_7_0 |
| ComputerSystem Collection | `/redfish/v1/Systems` | ComputerSystemCollection |
| ComputerSystem | `/redfish/v1/Systems/{sys}` | ComputerSystem.v1_16_2 |
| Memory Collection | `/redfish/v1/Systems/{sys}/Memory` | MemoryCollection |
| Memory | `/redfish/v1/Systems/{sys}/Memory/{mem}` | Memory.v1_19_0 |
| MemoryMetrics | `/redfish/v1/Systems/{sys}/Memory/{mem}/MemoryMetrics` | MemoryMetrics.v1_7_0 |
| MemoryDomain Collection | `/redfish/v1/Systems/{sys}/MemoryDomain` | MemoryDomainCollection |
| MemoryDomain | `/redfish/v1/Systems/{sys}/MemoryDomain/{dom}` | MemoryDomain.v1_2_3 |
| MemoryChunks Collection | `/redfish/v1/Systems/{sys}/MemoryDomain/{dom}/MemoryChunks` | MemoryChunksCollection |
| MemoryChunks | `/redfish/v1/Systems/{sys}/MemoryDomain/{dom}/MemoryChunks/{chunk}` | MemoryChunks.v1_4_0 |
| Processor Collection | `/redfish/v1/Systems/{sys}/Processors` | ProcessorCollection |
| Processor | `/redfish/v1/Systems/{sys}/Processors/{proc}` | Processor.v1_15_0 |
| SubProcessors Collection | `/redfish/v1/Systems/{sys}/Processors/{proc}/SubProcessors` | ProcessorCollection |
| SubProcessor | `/redfish/v1/Systems/{sys}/Processors/{proc}/SubProcessors/{sub}` | Processor.v1_15_0 |
| ProcessorMetrics | `/redfish/v1/Systems/{sys}/Processors/{proc}/ProcessorMetrics` | ProcessorMetrics.v1_4_0 |
| Bios | `/redfish/v1/Systems/{sys}/Bios` | Bios.v1_2_0 |
| SimpleStorage Collection | `/redfish/v1/Systems/{sys}/SimpleStorage` | SimpleStorageCollection |
| SimpleStorage | `/redfish/v1/Systems/{sys}/SimpleStorage/{ss}` | SimpleStorage.v1_3_0 |
| Storage Collection | `/redfish/v1/Systems/{sys}/Storage` | StorageCollection |
| Storage | `/redfish/v1/Systems/{sys}/Storage/{stor}` | Storage.v1_9_0 |
| StorageController Collection | `/redfish/v1/Systems/{sys}/Storage/{stor}/Controllers` | StorageControllerCollection |
| StorageController | `/redfish/v1/Systems/{sys}/Storage/{stor}/Controllers/{ctrl}` | StorageController.v1_0_0 |
| Drive | `/redfish/v1/Systems/{sys}/Storage/{stor}/Drives/{drv}` | Drive.v1_13_0 |
| Volume Collection | `/redfish/v1/Systems/{sys}/Storage/{stor}/Volumes` | VolumeCollection |
| Volume | `/redfish/v1/Systems/{sys}/Storage/{stor}/Volumes/{vol}` | Volume.v1_5_0 |
| NetworkInterface Collection | `/redfish/v1/Systems/{sys}/NetworkInterfaces` | NetworkInterfaceCollection |
| NetworkInterface | `/redfish/v1/Systems/{sys}/NetworkInterfaces/{nic}` | NetworkInterface.v1_2_0 |
| EthernetInterface (System) | `/redfish/v1/Systems/{sys}/EthernetInterfaces/{eth}` | EthernetInterface.v1_6_2 |
| GraphicsController Collection | `/redfish/v1/Systems/{sys}/GraphicsControllers` | GraphicsControllerCollection |
| GraphicsController | `/redfish/v1/Systems/{sys}/GraphicsControllers/{gpu}` | GraphicsController.v1_0_0 |
| USBController Collection | `/redfish/v1/Systems/{sys}/USBControllers` | USBControllerCollection |
| USBController | `/redfish/v1/Systems/{sys}/USBControllers/{usb}` | USBController.v1_0_0 |
| SecureBoot | `/redfish/v1/Systems/{sys}/SecureBoot` | SecureBoot.v1_1_0 |
| LogService Collection (System) | `/redfish/v1/Systems/{sys}/LogServices` | LogServiceCollection |
| LogService (System) | `/redfish/v1/Systems/{sys}/LogServices/{log}` | LogService.v1_1_3 |
| LogEntry Collection | `/redfish/v1/Systems/{sys}/LogServices/{log}/Entries` | LogEntryCollection |
| LogEntry | `/redfish/v1/Systems/{sys}/LogServices/{log}/Entries/{entry}` | LogEntry.v1_12_0 |
| Chassis Collection | `/redfish/v1/Chassis` | ChassisCollection |
| Chassis | `/redfish/v1/Chassis/{ch}` | Chassis.v1_15_0 |
| Power | `/redfish/v1/Chassis/{ch}/Power` | Power.v1_5_4 |
| PowerSubSystem | `/redfish/v1/Chassis/{ch}/PowerSubSystem` | PowerSubsystem.v1_1_0 |
| PowerSupplies Collection | `/redfish/v1/Chassis/{ch}/PowerSubSystem/PowerSupplies` | PowerSupplyCollection |
| PowerSupply | `/redfish/v1/Chassis/{ch}/PowerSubSystem/PowerSupplies/{psu}` | PowerSupply.v1_3_0 |
| PowerSupplyMetrics | `/redfish/v1/Chassis/{ch}/PowerSubSystem/PowerSupplies/{psu}/Metrics` | PowerSupplyMetrics.v1_0_1 |
| Thermal | `/redfish/v1/Chassis/{ch}/Thermal` | Thermal.v1_5_3 |
| ThermalSubSystem | `/redfish/v1/Chassis/{ch}/ThermalSubSystem` | ThermalSubsystem.v1_0_0 |
| ThermalMetrics | `/redfish/v1/Chassis/{ch}/ThermalSubSystem/ThermalMetrics` | ThermalMetrics.v1_0_1 |
| Fans Collection | `/redfish/v1/Chassis/{ch}/ThermalSubSystem/Fans` | FanCollection |
| Fan | `/redfish/v1/Chassis/{ch}/ThermalSubSystem/Fans/{fan}` | Fan.v1_1_1 |
| Sensor Collection | `/redfish/v1/Chassis/{ch}/Sensors` | SensorCollection |
| Sensor | `/redfish/v1/Chassis/{ch}/Sensors/{sen}` | Sensor.v1_0_2 |
| PCIeDevice Collection | `/redfish/v1/Chassis/{ch}/PCIeDevices` | PCIeDeviceCollection |
| PCIeDevice | `/redfish/v1/Chassis/{ch}/PCIeDevices/{dev}` | PCIeDevice.v1_9_0 |
| PCIeFunction Collection | `/redfish/v1/Chassis/{ch}/PCIeDevices/{dev}/PCIeFunctions` | PCIeFunctionCollection |
| PCIeFunction | `/redfish/v1/Chassis/{ch}/PCIeDevices/{dev}/PCIeFunctions/{fn}` | PCIeFunction.v1_2_3 |
| PCIeSlots | `/redfish/v1/Chassis/{ch}/PCIeSlots` | PCIeSlots.v1_5_0 |
| NetworkAdapter Collection | `/redfish/v1/Chassis/{ch}/NetworkAdapters` | NetworkAdapterCollection |
| NetworkAdapter | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}` | NetworkAdapter.v1_8_0 |
| NetworkDeviceFunction Collection | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}/NetworkDeviceFunctions` | NetworkDeviceFunctionCollection |
| NetworkDeviceFunction | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}/NetworkDeviceFunctions/{fn}` | NetworkDeviceFunction.v1_5_0 |
| Assembly | `/redfish/v1/Chassis/{ch}/Assembly` | Assembly.v1_2_2 |
| Assembly (Drive) | `/redfish/v1/Systems/{sys}/Storage/{stor}/Drives/{drv}/Assembly` | Assembly.v1_2_2 |
| Assembly (Processor) | `/redfish/v1/Systems/{sys}/Processors/{proc}/Assembly` | Assembly.v1_2_2 |
| Assembly (Memory) | `/redfish/v1/Systems/{sys}/Memory/{mem}/Assembly` | Assembly.v1_2_2 |
| Assembly (NetworkAdapter) | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}/Assembly` | Assembly.v1_2_2 |
| Assembly (PCIeDevice) | `/redfish/v1/Chassis/{ch}/PCIeDevices/{dev}/Assembly` | Assembly.v1_2_2 |
| MediaController Collection | `/redfish/v1/Chassis/{ch}/MediaControllers` | MediaControllerCollection |
| MediaController | `/redfish/v1/Chassis/{ch}/MediaControllers/{mc}` | MediaController.v1_1_0 |
| LogService Collection (Chassis) | `/redfish/v1/Chassis/{ch}/LogServices` | LogServiceCollection |
| LogService (Chassis) | `/redfish/v1/Chassis/{ch}/LogServices/{log}` | LogService.v1_1_3 |
| Manager Collection | `/redfish/v1/Managers` | ManagerCollection |
| Manager | `/redfish/v1/Managers/{mgr}` | Manager.v1_13_0 |
| EthernetInterface (Manager) | `/redfish/v1/Managers/{mgr}/EthernetInterfaces/{eth}` | EthernetInterface.v1_6_2 |
| LogService Collection (Manager) | `/redfish/v1/Managers/{mgr}/LogServices` | LogServiceCollection |
| LogService (Manager) | `/redfish/v1/Managers/{mgr}/LogServices/{log}` | LogService.v1_1_3 |
| UpdateService | `/redfish/v1/UpdateService` | UpdateService.v1_6_0 |
| TaskService | `/redfish/v1/TaskService` | TaskService.v1_1_4 |
| Task Collection | `/redfish/v1/TaskService/Tasks` | TaskCollection |
| Task | `/redfish/v1/TaskService/Tasks/{task}` | Task.v1_4_2 |

---

## 2. Telemetry API endpoints

| Resource | URI | Schema |
|---|---|---|
| TelemetryService | `/redfish/v1/TelemetryService` | TelemetryService.v1_2_1 |
| MetricDefinition Collection | `/redfish/v1/TelemetryService/MetricDefinitions` | MetricDefinitionCollection |
| MetricDefinition | `/redfish/v1/TelemetryService/MetricDefinitions/{md}` | MetricDefinition.v1_0_3 |
| MetricReportDefinition Collection | `/redfish/v1/TelemetryService/MetricReportDefinitions` | MetricReportDefinitionCollection |
| MetricReportDefinition | `/redfish/v1/TelemetryService/MetricReportDefinitions/{mrd}` | MetricReportDefinition.v1_3_0 |
| MetricReport Collection | `/redfish/v1/TelemetryService/MetricReports` | MetricReportCollection |
| MetricReport | `/redfish/v1/TelemetryService/MetricReports/{mr}` | MetricReport.v1_2_0 |
| Telemetry LogService | `/redfish/v1/TelemetryService/LogService` | LogService.v1_1_3 |
| Telemetry LogEntry Collection | `/redfish/v1/TelemetryService/LogService/Entries` | LogEntryCollection |

---

## 3. Processor / NIC sub-resources (GPU-relevant)

| Resource | URI |
|---|---|
| Processor (NetworkAdapter) | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}/Processors/{proc}` |
| AccelerationFunction Collection | `/redfish/v1/Systems/{sys}/Processors/{proc}/AccelerationFunctions` |
| AccelerationFunction | `/redfish/v1/Systems/{sys}/Processors/{proc}/AccelerationFunctions/{fn}` |
| Port Collection (NetworkAdapter) | `/redfish/v1/Chassis/{ch}/NetworkAdapters/{na}/Ports` |
| Port (GraphicsController) | `/redfish/v1/Systems/{sys}/GraphicsControllers/{gpu}/Ports/{port}` |
| OperatingConfig Collection | `/redfish/v1/Systems/{sys}/Processors/{proc}/OperatingConfigs` |
| OperatingConfig | `/redfish/v1/Systems/{sys}/Processors/{proc}/OperatingConfigs/{cfg}` |

---

## 4. Error response format

On error, the service returns an HTTP status code and a JSON body with a single `error` property:

```json
{
  "error": {
    "code": "Base.1.12.0.ActionParameterMissing",
    "message": "...",
    "@Message.ExtendedInfo": [
      {
        "@odata.type": "#Message.v1_0_8.Message",
        "MessageId": "Base.1.12.0.ActionParameterMissing",
        "Message": "...",
        "MessageArgs": [],
        "Severity": "Warning",
        "Resolution": "..."
      }
    ]
  }
}
```

**Common status codes:**

| Code | Meaning |
|------|---------|
| 200 | OK with body |
| 201 | Created |
| 204 | Success, no body |
| 400 | Bad request / validation error |
| 401 | Unauthorized |
| 403 | Forbidden / firmware update in progress |
| 404 | Resource not found |
| 405 | Method not allowed |
| 412 | ETag precondition failed (PATCH) |
| 415 | Unsupported media type |
| 428 | Missing precondition header (PATCH) |
| 501 | Not implemented |

**Request validation sequence:**
1. Authorization check → 401
2. Entity privilege check → 403
3. URI existence → 404
4. Firmware update lock → 403
5. Method allowed → 405
6. Media type → 415
7. Body format → 400
8. PATCH: ETag header → 428/412
9. Property validation → 400
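Because the validation sequence always ends in the same error envelope, clients can decode failures uniformly. A minimal Go sketch of that decoding — the struct and function names are my own; only the JSON field names come from the format above:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// redfishError mirrors the error envelope shown above. The Go type names are
// illustrative; the JSON keys are the ones the service returns.
type redfishError struct {
	Error struct {
		Code         string `json:"code"`
		Message      string `json:"message"`
		ExtendedInfo []struct {
			MessageID  string `json:"MessageId"`
			Message    string `json:"Message"`
			Severity   string `json:"Severity"`
			Resolution string `json:"Resolution"`
		} `json:"@Message.ExtendedInfo"`
	} `json:"error"`
}

// decodeRedfishError extracts the top-level error code from a response body.
func decodeRedfishError(body []byte) (string, error) {
	var e redfishError
	if err := json.Unmarshal(body, &e); err != nil {
		return "", err
	}
	return e.Error.Code, nil
}

func main() {
	body := []byte(`{"error":{"code":"Base.1.12.0.ActionParameterMissing","message":"..."}}`)
	code, err := decodeRedfishError(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(code) // → Base.1.12.0.ActionParameterMissing
}
```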

---

## 5. OEM: Inventory refresh (AMI/MSI-specific)

### 5.1 InventoryCrc — force component re-inventory

`GET/POST/DELETE /redfish/v1/Systems/{sys}/Oem/Ami/Inventory/Crc`

The `GroupCrcList` field lists current CRC checksums per component group. When a group's CRC changes (host sends new inventory) or is explicitly zeroed out via POST, the BMC discards its cached inventory and re-reads that group from the host.

**CRC groups:**

| Group | Covers |
|-------|--------|
| `CPU` | Processors, ProcessorMetrics |
| `DIMM` | Memory, MemoryDomains, MemoryChunks, MemoryMetrics |
| `PCIE` | Storage, PCIeDevices, NetworkInterfaces, NetworkAdapters |
| `CERTIFICATES` | Boot Certificates |
| `SECURBOOT` | SecureBoot data |

**POST — invalidate selected groups (force re-inventory):**

```
POST /redfish/v1/Systems/{sys}/Oem/Ami/Inventory/Crc
Content-Type: application/json

{
  "GroupCrcList": [
    { "CPU": 0 },
    { "DIMM": 0 },
    { "PCIE": 0 }
  ]
}
```

Setting a group's value to `0` signals the BMC to invalidate and repopulate that group on the next host inventory push (typically at next boot or host-interface inventory cycle).

**DELETE** — remove all CRC records entirely.

**Note:** Inventory data is populated by the host via the Redfish Host Interface (in-band), not by the BMC itself. Zeroing a CRC group does not immediately re-read hardware — it marks the group as stale so the next host-side inventory push will be accepted. A cold reboot is the most reliable trigger.

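The POST body above can be built programmatically. A small Go sketch — the helper name is mine; only the `GroupCrcList` payload shape comes from the example above:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// crcResetBody builds the InventoryCrc POST payload that zeroes the given
// groups. Helper name is illustrative; the payload shape matches the POST
// example above. Single-key maps preserve the {"GROUP": 0} element shape.
func crcResetBody(groups ...string) ([]byte, error) {
	list := make([]map[string]int, 0, len(groups))
	for _, g := range groups {
		list = append(list, map[string]int{g: 0})
	}
	return json.Marshal(map[string]interface{}{"GroupCrcList": list})
}

func main() {
	body, err := crcResetBody("CPU", "DIMM", "PCIE")
	if err != nil {
		panic(err)
	}
	// POST this body to /redfish/v1/Systems/{sys}/Oem/Ami/Inventory/Crc
	// with Content-Type: application/json.
	fmt.Println(string(body))
	// → {"GroupCrcList":[{"CPU":0},{"DIMM":0},{"PCIE":0}]}
}
```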
### 5.2 InventoryData Status — monitor inventory processing

`GET /redfish/v1/Oem/Ami/InventoryData/Status`

Available only after the host has posted an inventory file. Shows current processing state.

**Status enum:**

| Value | Meaning |
|-------|---------|
| `BootInProgress` | Host is booting |
| `Queued` | Processing task queued |
| `In-Progress` | Processing running in background |
| `Ready` / `Completed` | Processing finished successfully |
| `Failed` | Processing failed |

Response also includes:
- `InventoryData.DeletedModules` — array of groups updated in this population cycle
- `InventoryData.Messages` — warnings/errors encountered during processing
- `ProcessingTime` — milliseconds taken
- `LastModifiedTime` — ISO 8601 timestamp of last successful update

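A poller only needs to classify each enum value as terminal or not. A hedged Go sketch of that mapping — the function name is mine; the enum values come from the table above:

```go
package main

import "fmt"

// inventoryStatusDone maps the InventoryData Status enum to (finished, failed).
// Function name is illustrative; the string values are from the table above.
func inventoryStatusDone(status string) (finished, failed bool) {
	switch status {
	case "Ready", "Completed":
		return true, false
	case "Failed":
		return true, true
	default: // "BootInProgress", "Queued", "In-Progress" — keep polling
		return false, false
	}
}

func main() {
	for _, s := range []string{"Queued", "In-Progress", "Completed"} {
		done, failed := inventoryStatusDone(s)
		fmt.Printf("%s: done=%v failed=%v\n", s, done, failed)
	}
}
```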
### 5.3 Systems OEM properties — Inventory reference

`GET /redfish/v1/Systems/{sys}` → `Oem.Ami` contains:

| Property | Notes |
|----------|-------|
| `Inventory` | Reference to InventoryCrc URI + current GroupCrc data |
| `RedfishVersion` | BIOS Redfish version (populated via Host Interface) |
| `RtpVersion` | BIOS RTP version (populated via Host Interface) |
| `ManagerBootConfiguration.ManagerBootMode` | PATCH to trigger soft reset: `SoftReset` / `ResetTimeout` / `None` |

---

## 6. OEM: Component state actions

### 6.1 Memory enable/disable

```
POST /redfish/v1/Systems/{sys}/Memory/{mem}/Actions/AmiBios.ChangeState
Content-Type: application/json

{ "State": "Disabled" }
```

Response: 204.

### 6.2 PCIeFunction enable/disable

```
POST /redfish/v1/Chassis/{ch}/PCIeDevices/{dev}/PCIeFunctions/{fn}/Actions/AmiBios.ChangeState
Content-Type: application/json

{ "State": "Disabled" }
```

Response: 204.

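Both actions share the same request shape, differing only in the resource path. A Go sketch that builds the request for either case — the helper name and example host are mine; the `/Actions/AmiBios.ChangeState` suffix and body come from the examples above:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// changeStateRequest builds the AmiBios.ChangeState POST for any resource that
// supports it (Memory or PCIeFunction). Helper name is illustrative; the URI
// suffix and body shape match the examples above.
func changeStateRequest(baseURL, resourcePath, state string) (*http.Request, error) {
	body := []byte(fmt.Sprintf(`{ "State": %q }`, state))
	req, err := http.NewRequest(http.MethodPost,
		baseURL+resourcePath+"/Actions/AmiBios.ChangeState", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// "bmc.example" and the DIMM id are placeholders.
	req, err := changeStateRequest("https://bmc.example",
		"/redfish/v1/Systems/1/Memory/DIMM_A1", "Disabled")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path) // a 204 response indicates success
}
```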
---
## 7. OEM: Storage sensor readings

`GET /redfish/v1/Systems/{sys}/Storage/{stor}` → `Oem.Ami.StorageControllerSensors`

Array of sensor objects per storage controller instance. Each entry exposes:
- `Reading` (Number) — current sensor value
- `ReadingType` (String) — type of reading
- `ReadingUnit` (String) — unit

---

## 8. OEM: Power and Thermal OwnerLUN

Both `GET /redfish/v1/Chassis/{ch}/Power` and `GET /redfish/v1/Chassis/{ch}/Thermal` expose `Oem.Ami.OwnerLUN` (Number, read-only) — the IPMI LUN associated with each temperature/fan/voltage sensor entry. Useful for correlating Redfish sensor readings with IPMI SDR records.

---

## 9. UpdateService

`GET /redfish/v1/UpdateService` → `Oem.Ami.BMC.DualImageConfiguration`:

| Property | Description |
|----------|-------------|
| `ActiveImage` | Currently active BMC image slot |
| `BootImage` | Image slot the BMC boots from |
| `FirmwareImage1Name` / `FirmwareImage1Version` | First image slot name + version |
| `FirmwareImage2Name` / `FirmwareImage2Version` | Second image slot name + version |

Standard `SimpleUpdate` action available at `/redfish/v1/UpdateService/Actions/UpdateService.SimpleUpdate`.

---

## 10. Inventory refresh summary

| Approach | Trigger | Latency | Scope |
|----------|---------|---------|-------|
| Host reboot | Physical/soft reset | Minutes | All groups |
| `POST InventoryCrc` (groups = 0) | Explicit API call | Next host inventory push | Selected groups |
| Firmware update (`SimpleUpdate`) | Explicit API call | Minutes + reboot | Full platform |
| Sensor/telemetry reads | Always live on GET | Immediate | Sensors only |

**Key constraint:** `InventoryCrc POST` marks groups stale but does not re-read hardware directly. Actual inventory data flows from the host to the BMC via the Redfish Host Interface in-band channel, typically during POST/boot. For immediate inventory refresh without a full reboot, a soft reset via `ManagerBootMode: SoftReset` PATCH may be sufficient on some configurations.
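For reference, such a soft reset is a PATCH against the Systems resource. The exact `Oem` nesting below is inferred from the 5.3 property table and should be verified on the target BMC; the `If-Match` header is required by the PATCH precondition rules in section 4:

```
PATCH /redfish/v1/Systems/{sys}
If-Match: "<etag>"
Content-Type: application/json

{ "Oem": { "Ami": { "ManagerBootConfiguration": { "ManagerBootMode": "SoftReset" } } } }
```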
@@ -1,28 +0,0 @@

# Test Server Collection Memory

Keep this table updated after each test-server run.

Definition:
- `Collection Time` = total Redfish collection duration from `collect.log`.
- `Speed` = `Documents / seconds`.
- `Metrics Collected` = sum of `Counts` fields (`cpus + memory + storage + pcie + gpus + nics + psus + firmware`).
- `n/a` means the log does not contain enough timestamp metadata to calculate duration/speed.

## Server Model: `NF5688M7`

| Date (UTC) | App Version | Collection Time | Documents | Speed | Metrics Collected | Notes |
|---|---|---:|---:|---:|---:|---|
| 2026-02-28 | `v1.7.1-12-g612058e` (`612058e`) | 10m10s (610s) | 228 | 0.37 docs/s | 98 | 2026-02-28 (SERVER MODEL) - 23E100043.zip |
| 2026-02-28 | `v1.7.1-11-ge0146ad` (`e0146ad`) | 9m36s (576s) | 138 | 0.24 docs/s | 110 | 2026-02-28 (SERVER MODEL) - 23E100042.zip |
| 2026-02-28 | `v1.7.1-10-g9a30705` (`9a30705`) | 20m47s (1247s) | 106 | 0.09 docs/s | 97 | 2026-02-28 (SERVER MODEL) - 23E100053.zip |
| 2026-02-28 | `v1.7.1` (`6c19a58`) | 15m08s (908s) | 184 | 0.20 docs/s | 96 | 2026-02-28 (DDR5 DIMM) - 23E100051.zip |
| 2026-02-28 | `v1.7.0` (`ddab93a`) | n/a | 193 | n/a | 61 | 2026-02-28 (NULL) - 23E100051.zip |
| 2026-02-28 | `v1.7.0` (`ddab93a`) | n/a | 291 | n/a | 61 | 2026-02-28 (NULL) - 23E100206.zip |

## Server Model: `KR1280-X2-A0-R0-00`

| Date (UTC) | App Version | Collection Time | Documents | Speed | Metrics Collected | Notes |
|---|---|---:|---:|---:|---:|---|
| 2026-02-28 | `v1.7.1-12-g612058e` (`612058e`) | 6m15s (375s) | 185 | 0.49 docs/s | 46 | 2026-02-28 (KR1280-X2-A0-R0-00) - 23D401657.zip |
| 2026-02-28 | `v1.7.1-9-g8dbbec3-dirty` (`8dbbec3`) | 6m16s (376s) | 165 | 0.44 docs/s | 46 | 2026-02-28 (KR1280-X2-A0-R0-00) - 23D401657-2.zip |
| 2026-02-28 | `v1.7.1-7-gc52fea2` (`c52fea2`) | 10m51s (651s) | 227 | 0.35 docs/s | 40 | 2026-02-28 (KR1280-X2-A0-R0-00) - 23D401657 copy.zip |
6
go.mod
@@ -1,3 +1,7 @@

module git.mchus.pro/mchus/logpile

go 1.22
go 1.24.0

require reanimator/chart v0.0.0

replace reanimator/chart => ./internal/chart

1
internal/chart
Submodule
Submodule internal/chart added at 2fb01d30a6
File diff suppressed because it is too large
445
internal/collector/redfish_logentries.go
Normal file
@@ -0,0 +1,445 @@
package collector

import (
	"context"
	"log"
	"net/http"
	"strings"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
)

const (
	redfishLogEntriesWindow    = 7 * 24 * time.Hour
	redfishLogEntriesMaxTotal  = 500
	redfishLogEntriesMaxPerSvc = 200
)

// collectRedfishLogEntries fetches hardware event log entries from Systems and Managers LogServices.
// Only hardware-relevant entries from the last 7 days are returned.
// For Systems: all log services except audit/journal/security/debug.
// For Managers: only the IPMI SEL service (Id="SEL") — audit and event logs are excluded.
func (c *RedfishConnector) collectRedfishLogEntries(ctx context.Context, client *http.Client, req Request, baseURL string, systemPaths, managerPaths []string) []map[string]interface{} {
	cutoff := time.Now().UTC().Add(-redfishLogEntriesWindow)
	seen := make(map[string]struct{})
	var out []map[string]interface{}

	collectFrom := func(logServicesPath string, filter func(map[string]interface{}) bool) {
		if len(out) >= redfishLogEntriesMaxTotal {
			return
		}
		services, err := c.getCollectionMembers(ctx, client, req, baseURL, logServicesPath)
		if err != nil || len(services) == 0 {
			return
		}
		for _, svc := range services {
			if len(out) >= redfishLogEntriesMaxTotal {
				break
			}
			if !filter(svc) {
				continue
			}
			entriesPath := redfishLogServiceEntriesPath(svc)
			if entriesPath == "" {
				continue
			}
			entries := c.fetchRedfishLogEntriesWithPaging(ctx, client, req, baseURL, entriesPath, cutoff, seen, redfishLogEntriesMaxPerSvc)
			out = append(out, entries...)
		}
	}

	for _, systemPath := range systemPaths {
		for _, logServicesPath := range c.redfishLinkedCollectionPaths(ctx, client, req, baseURL, systemPath, "LogServices") {
			collectFrom(logServicesPath, isHardwareLogService)
		}
	}
	// Managers hold the IPMI SEL on AMI/MSI BMCs — include only the "SEL" service.
	for _, managerPath := range managerPaths {
		for _, logServicesPath := range c.redfishLinkedCollectionPaths(ctx, client, req, baseURL, managerPath, "LogServices") {
			collectFrom(logServicesPath, isManagerSELService)
		}
	}

	if len(out) > 0 {
		log.Printf("redfish: collected %d hardware log entries (Systems+Managers SEL, window=7d)", len(out))
	}
	return out
}

func (c *RedfishConnector) redfishLinkedCollectionPaths(
	ctx context.Context,
	client *http.Client,
	req Request,
	baseURL, resourcePath, linkKey string,
) []string {
	resourcePath = normalizeRedfishPath(resourcePath)
	if resourcePath == "" || strings.TrimSpace(linkKey) == "" {
		return nil
	}

	seen := make(map[string]struct{}, 2)
	var out []string
	add := func(path string) {
		path = normalizeRedfishPath(path)
		if path == "" {
			return
		}
		if _, ok := seen[path]; ok {
			return
		}
		seen[path] = struct{}{}
		out = append(out, path)
	}

	add(joinPath(resourcePath, "/"+strings.TrimSpace(linkKey)))

	resourceDoc, err := c.getJSON(ctx, client, req, baseURL, resourcePath)
	if err == nil {
		if linked := redfishLinkedPath(resourceDoc, linkKey); linked != "" {
			add(linked)
		}
	}
	return out
}

// fetchRedfishLogEntriesWithPaging fetches entries from a LogEntry collection,
// following nextLink pages. Stops early when entries older than cutoff are encountered
// (assumes BMC returns entries newest-first, which is typical).
func (c *RedfishConnector) fetchRedfishLogEntriesWithPaging(ctx context.Context, client *http.Client, req Request, baseURL, entriesPath string, cutoff time.Time, seen map[string]struct{}, limit int) []map[string]interface{} {
	var out []map[string]interface{}
	nextPath := entriesPath

	for nextPath != "" && len(out) < limit {
		collection, err := c.getJSON(ctx, client, req, baseURL, nextPath)
		if err != nil {
			break
		}

		// Handle both linked members (@odata.id only) and inline members (full objects).
		rawMembers, _ := collection["Members"].([]interface{})
		hitOldEntry := false

		for _, rawMember := range rawMembers {
			if len(out) >= limit {
				break
			}
			memberMap, ok := rawMember.(map[string]interface{})
			if !ok {
				continue
			}

			var entry map[string]interface{}
			if _, hasCreated := memberMap["Created"]; hasCreated {
				// Inline entry — use directly.
				entry = memberMap
			} else {
				// Linked entry — fetch by path.
				memberPath := normalizeRedfishPath(asString(memberMap["@odata.id"]))
				if memberPath == "" {
					continue
				}
				entry, err = c.getJSON(ctx, client, req, baseURL, memberPath)
				if err != nil || len(entry) == 0 {
					continue
				}
			}

			// Dedup by entry Id or path.
			entryKey := asString(entry["Id"])
			if entryKey == "" {
				entryKey = asString(entry["@odata.id"])
			}
			if entryKey != "" {
				if _, dup := seen[entryKey]; dup {
					continue
				}
				seen[entryKey] = struct{}{}
			}

			// Time filter.
			created := parseRedfishEntryTime(asString(entry["Created"]))
			if !created.IsZero() && created.Before(cutoff) {
				hitOldEntry = true
				continue
			}

			// Hardware relevance filter.
			if !isHardwareLogEntry(entry) {
				continue
			}

			out = append(out, entry)
		}

		// Stop paging once we've seen entries older than the window.
		if hitOldEntry {
			break
		}
		nextPath = firstNonEmpty(
			normalizeRedfishPath(asString(collection["Members@odata.nextLink"])),
			normalizeRedfishPath(asString(collection["@odata.nextLink"])),
		)
	}
	return out
}

// isManagerSELService returns true only for the IPMI SEL exposed under Managers.
// On AMI/MSI BMCs the hardware SEL lives at Managers/{mgr}/LogServices/SEL.
// All other Manager log services (AuditLog, EventLog, Journal) are excluded.
func isManagerSELService(svc map[string]interface{}) bool {
	id := strings.ToLower(strings.TrimSpace(asString(svc["Id"])))
	return id == "sel"
}

// isHardwareLogService returns true if the log service looks like a hardware event log
// (SEL, System Event Log) rather than a BMC audit/journal log.
func isHardwareLogService(svc map[string]interface{}) bool {
	id := strings.ToLower(strings.TrimSpace(asString(svc["Id"])))
	name := strings.ToLower(strings.TrimSpace(asString(svc["Name"])))
	for _, skip := range []string{"audit", "journal", "bmc", "security", "manager", "debug"} {
		if strings.Contains(id, skip) || strings.Contains(name, skip) {
			return false
		}
	}
	return true
}

// redfishLogServiceEntriesPath returns the Entries collection path for a LogService document.
func redfishLogServiceEntriesPath(svc map[string]interface{}) string {
	if entriesLink, ok := svc["Entries"].(map[string]interface{}); ok {
		if p := normalizeRedfishPath(asString(entriesLink["@odata.id"])); p != "" {
			return p
		}
	}
	if id := normalizeRedfishPath(asString(svc["@odata.id"])); id != "" {
		return joinPath(id, "/Entries")
	}
	return ""
}

// isHardwareLogEntry returns true if the log entry is hardware-related.
// Audit, authentication, and session events are excluded.
func isHardwareLogEntry(entry map[string]interface{}) bool {
	entryType := strings.TrimSpace(asString(entry["EntryType"]))
	if strings.EqualFold(entryType, "Oem") && !strings.EqualFold(strings.TrimSpace(asString(entry["OemRecordFormat"])), "Lenovo") {
		return false
	}

	msgID := strings.ToLower(strings.TrimSpace(asString(entry["MessageId"])))
	for _, skip := range []string{
		"user", "account", "password", "login", "logon", "session",
		"auth", "certificate", "security", "credential", "privilege",
	} {
		if strings.Contains(msgID, skip) {
			return false
		}
	}
	// Also check the human-readable message for obvious audit patterns.
	msg := strings.ToLower(strings.TrimSpace(asString(entry["Message"])))
	for _, skip := range []string{"logged in", "logged out", "log in", "log out", "sign in", "signed in"} {
		if strings.Contains(msg, skip) {
			return false
		}
	}
	return true
}

// parseRedfishEntryTime parses a Redfish LogEntry Created timestamp (ISO 8601 / RFC 3339).
func parseRedfishEntryTime(raw string) time.Time {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return time.Time{}
	}
	for _, layout := range []string{time.RFC3339, time.RFC3339Nano, "2006-01-02T15:04:05Z07:00"} {
		if t, err := time.Parse(layout, raw); err == nil {
			return t.UTC()
		}
	}
	return time.Time{}
}

// parseRedfishLogEntries converts raw log entries stored in RawPayloads into models.Event slice.
// Called during Redfish replay for both live and offline (archive) collections.
func parseRedfishLogEntries(rawPayloads map[string]any, collectedAt time.Time) []models.Event {
	raw, ok := rawPayloads["redfish_log_entries"]
	if !ok {
		return nil
	}

	var entries []map[string]interface{}
	switch v := raw.(type) {
	case []map[string]interface{}:
		entries = v
	case []interface{}:
		for _, item := range v {
			if m, ok := item.(map[string]interface{}); ok {
				entries = append(entries, m)
			}
		}
	default:
		return nil
	}

	if len(entries) == 0 {
		return nil
	}

	out := make([]models.Event, 0, len(entries))
	for _, entry := range entries {
		ev := redfishLogEntryToEvent(entry, collectedAt)
		if ev == nil {
			continue
		}
		out = append(out, *ev)
	}
	return out
}

// redfishLogEntryToEvent converts a single Redfish LogEntry document to models.Event.
func redfishLogEntryToEvent(entry map[string]interface{}, collectedAt time.Time) *models.Event {
	// Prefer EventTimestamp (actual hardware event time) over Created (Redfish record creation time).
	ts := parseRedfishEntryTime(asString(entry["EventTimestamp"]))
	if ts.IsZero() {
		ts = parseRedfishEntryTime(asString(entry["Created"]))
	}
	if ts.IsZero() {
		ts = collectedAt
	}

	severity := redfishLogEntrySeverity(entry)
	sensorType := strings.TrimSpace(asString(entry["SensorType"]))
	messageID := strings.TrimSpace(asString(entry["MessageId"]))
	entryType := strings.TrimSpace(asString(entry["EntryType"]))
	entryCode := strings.TrimSpace(asString(entry["EntryCode"]))

	// SensorName: prefer "Name", fall back to "SensorNumber" + SensorType.
	sensorName := strings.TrimSpace(asString(entry["Name"]))
	if sensorName == "" {
		num := strings.TrimSpace(asString(entry["SensorNumber"]))
		if num != "" && sensorType != "" {
			sensorName = sensorType + " " + num
		}
	}

	rawMessage := strings.TrimSpace(asString(entry["Message"]))

	// AMI/MSI BMCs dump raw IPMI record fields into Message instead of human-readable text.
	// Detect this and build a readable description from structured fields instead.
	description, rawData := redfishDecodeMessage(rawMessage, sensorType, entryCode, entry)
	if description == "" {
		return nil
	}

	return &models.Event{
		ID:          messageID,
		Timestamp:   ts,
		Source:      "redfish",
		SensorType:  sensorType,
		SensorName:  sensorName,
		EventType:   entryType,
		Severity:    severity,
		Description: description,
		RawData:     rawData,
	}
}

// redfishDecodeMessage returns a human-readable description and optional raw data.
// AMI/MSI BMCs dump raw IPMI record fields into Message as "Key : Value, Key : Value, ..."
// instead of a plain human-readable string. We extract the useful decoded fields from it.
func redfishDecodeMessage(message, sensorType, entryCode string, entry map[string]interface{}) (description, rawData string) {
	if !isRawIPMIDump(message) {
		description = message
		return
	}

	rawData = message
	kv := parseIPMIDumpKV(message)

	// Sensor_Type inside the dump is more specific than the top-level SensorType field.
	if v := kv["Sensor_Type"]; v != "" {
		sensorType = v
	}
	eventType := kv["Event_Type"] // human-readable IPMI event type, e.g. "Legacy OFF State"

	var parts []string
	if sensorType != "" {
		parts = append(parts, sensorType)
	}
	if eventType != "" {
		parts = append(parts, eventType)
	} else if entryCode != "" {
		parts = append(parts, entryCode)
	}
	description = strings.Join(parts, ": ")
	return
}

// isRawIPMIDump returns true if the message is an AMI raw IPMI record dump.
func isRawIPMIDump(message string) bool {
	return strings.Contains(message, "Event_Data_1 :") && strings.Contains(message, "Record_Type :")
}

// parseIPMIDumpKV parses the AMI "Key : Value, Key : Value, " format into a map.
func parseIPMIDumpKV(message string) map[string]string {
	out := make(map[string]string)
	for _, part := range strings.Split(message, ",") {
		part = strings.TrimSpace(part)
		idx := strings.Index(part, " : ")
		if idx < 0 {
			continue
		}
		k := strings.TrimSpace(part[:idx])
		v := strings.TrimSpace(part[idx+3:])
		if k != "" && v != "" {
			out[k] = v
		}
	}
	return out
}

// redfishLogEntrySeverity maps a Redfish LogEntry to models.Severity.
// AMI/MSI BMCs often set Severity="OK" on all SEL records regardless of content,
// so we fall back to inferring severity from SensorType when the explicit field is unhelpful.
func redfishLogEntrySeverity(entry map[string]interface{}) models.Severity {
	if redfishLogEntryLooksLikeWarning(entry) {
		return models.SeverityWarning
	}
	// Newer Redfish uses MessageSeverity; older uses Severity.
	raw := strings.ToLower(firstNonEmpty(
		strings.TrimSpace(asString(entry["MessageSeverity"])),
		strings.TrimSpace(asString(entry["Severity"])),
	))
	switch raw {
	case "critical":
		return models.SeverityCritical
	case "warning":
		return models.SeverityWarning
	case "ok", "informational", "":
		// BMC didn't set a meaningful severity — infer from SensorType.
		return redfishSeverityFromSensorType(strings.TrimSpace(asString(entry["SensorType"])))
	default:
		return models.SeverityInfo
	}
}

func redfishLogEntryLooksLikeWarning(entry map[string]interface{}) bool {
	joined := strings.ToLower(strings.TrimSpace(strings.Join([]string{
		asString(entry["Message"]),
		asString(entry["Name"]),
		asString(entry["SensorType"]),
		asString(entry["EntryCode"]),
	}, " ")))
	return strings.Contains(joined, "unqualified dimm")
}

// redfishSeverityFromSensorType infers event severity from the IPMI/Redfish SensorType string.
func redfishSeverityFromSensorType(sensorType string) models.Severity {
	switch strings.ToLower(sensorType) {
	case "critical interrupt", "processor", "memory", "power unit",
		"power supply", "drive slot", "system firmware progress":
		return models.SeverityWarning
	default:
		return models.SeverityInfo
	}
}
125
internal/collector/redfish_logentries_test.go
Normal file
@@ -0,0 +1,125 @@
package collector

import (
	"context"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"testing"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestCollectRedfishLogEntries_UsesLinkedManagerLogServicesPath(t *testing.T) {
	mux := http.NewServeMux()
	register := func(path string, payload interface{}) {
		mux.HandleFunc(path, func(w http.ResponseWriter, r *http.Request) {
			w.Header().Set("Content-Type", "application/json")
			_ = json.NewEncoder(w).Encode(payload)
		})
	}

	register("/redfish/v1/Managers/1", map[string]interface{}{
		"Id": "1",
		"LogServices": map[string]interface{}{
			"@odata.id": "/redfish/v1/Systems/1/LogServices",
		},
	})
	register("/redfish/v1/Systems/1/LogServices", map[string]interface{}{
		"Members": []map[string]string{
			{"@odata.id": "/redfish/v1/Systems/1/LogServices/SEL"},
		},
	})
	register("/redfish/v1/Systems/1/LogServices/SEL", map[string]interface{}{
		"Id": "SEL",
		"Entries": map[string]interface{}{
			"@odata.id": "/redfish/v1/Systems/1/LogServices/SEL/Entries",
		},
	})
	register("/redfish/v1/Systems/1/LogServices/SEL/Entries", map[string]interface{}{
		"Members": []map[string]string{
			{"@odata.id": "/redfish/v1/Systems/1/LogServices/SEL/Entries/1"},
		},
	})
	register("/redfish/v1/Systems/1/LogServices/SEL/Entries/1", map[string]interface{}{
		"Id":              "1",
		"Created":         time.Now().UTC().Format(time.RFC3339),
		"Message":         "System found Unqualified DIMM in slot DIMM A1",
		"MessageSeverity": "OK",
		"SensorType":      "Memory",
		"EntryType":       "Event",
	})

	ts := httptest.NewServer(mux)
	defer ts.Close()

	c := NewRedfishConnector()
	got := c.collectRedfishLogEntries(context.Background(), ts.Client(), Request{
		Host:     ts.URL,
		Port:     443,
		Protocol: "redfish",
		Username: "admin",
		AuthType: "password",
		Password: "secret",
		TLSMode:  "strict",
	}, ts.URL, nil, []string{"/redfish/v1/Managers/1"})

	if len(got) != 1 {
		t.Fatalf("expected 1 collected log entry, got %d", len(got))
	}
	if got[0]["Message"] != "System found Unqualified DIMM in slot DIMM A1" {
		t.Fatalf("unexpected collected message: %#v", got[0]["Message"])
	}
}

func TestParseRedfishLogEntries_UnqualifiedDIMMBecomesWarning(t *testing.T) {
	rawPayloads := map[string]any{
		"redfish_log_entries": []any{
			map[string]any{
				"Id":              "sel-1",
				"Created":         "2026-04-13T12:00:00Z",
				"Message":         "System found Unqualified DIMM in slot DIMM A1",
				"MessageSeverity": "OK",
				"SensorType":      "Memory",
				"EntryType":       "Event",
			},
		},
	}

	events := parseRedfishLogEntries(rawPayloads, time.Date(2026, 4, 13, 12, 30, 0, 0, time.UTC))
	if len(events) != 1 {
		t.Fatalf("expected 1 event, got %d", len(events))
	}
	if events[0].Severity != models.SeverityWarning {
		t.Fatalf("expected warning severity, got %q", events[0].Severity)
	}
	if events[0].Description != "System found Unqualified DIMM in slot DIMM A1" {
		t.Fatalf("unexpected description: %q", events[0].Description)
	}
}

func TestParseRedfishLogEntries_LenovoOEMEntryIsKept(t *testing.T) {
	rawPayloads := map[string]any{
		"redfish_log_entries": []any{
			map[string]any{
				"Id":              "plat-55",
				"Created":         "2026-04-13T12:00:00Z",
				"Message":         "DIMM A1 is unqualified",
				"MessageSeverity": "Warning",
				"SensorType":      "Memory",
				"EntryType":       "Oem",
				"OemRecordFormat": "Lenovo",
				"EntryCode":       "Assert",
			},
		},
	}

	events := parseRedfishLogEntries(rawPayloads, time.Date(2026, 4, 13, 12, 30, 0, 0, time.UTC))
	if len(events) != 1 {
		t.Fatalf("expected 1 Lenovo OEM event, got %d", len(events))
	}
	if events[0].Severity != models.SeverityWarning {
		t.Fatalf("expected warning severity, got %q", events[0].Severity)
	}
}
57 internal/collector/redfish_planb_test.go Normal file
@@ -0,0 +1,57 @@
package collector

import "testing"

func TestShouldIncludeCriticalPlanBPath(t *testing.T) {
	tests := []struct {
		name string
		req  Request
		path string
		want bool
	}{
		{
			name: "skip hgx erot pcie without extended diagnostics",
			req:  Request{},
			path: "/redfish/v1/Chassis/HGX_ERoT_NVSwitch_0/PCIeDevices",
			want: false,
		},
		{
			name: "skip hgx chassis assembly without extended diagnostics",
			req:  Request{},
			path: "/redfish/v1/Chassis/HGX_Chassis_0/Assembly",
			want: false,
		},
		{
			name: "keep standard chassis inventory without extended diagnostics",
			req:  Request{},
			path: "/redfish/v1/Chassis/1/PCIeDevices",
			want: true,
		},
		{
			name: "keep nvme storage backplane drives without extended diagnostics",
			req:  Request{},
			path: "/redfish/v1/Chassis/NVMeSSD.0.Group.0.StorageBackplane/Drives",
			want: true,
		},
		{
			name: "keep system processors without extended diagnostics",
			req:  Request{},
			path: "/redfish/v1/Systems/HGX_Baseboard_0/Processors",
			want: true,
		},
		{
			name: "include hgx erot pcie when extended diagnostics enabled",
			req:  Request{DebugPayloads: true},
			path: "/redfish/v1/Chassis/HGX_ERoT_NVSwitch_0/PCIeDevices",
			want: true,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := shouldIncludeCriticalPlanBPath(tt.req, tt.path); got != tt.want {
				t.Fatalf("shouldIncludeCriticalPlanBPath(%q) = %v, want %v", tt.path, got, tt.want)
			}
		})
	}
}
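The test cases above suggest the Plan B filter is a prefix check: noisy HGX-only subtrees are skipped unless extended diagnostics is enabled. A self-contained sketch of that shape (the `shouldInclude` function and its prefix list are illustrative assumptions, not the repo's actual rule set):

```go
package main

import (
	"fmt"
	"strings"
)

// shouldInclude skips known-noisy HGX subtrees unless extended
// diagnostics is requested; everything else is kept.
func shouldInclude(path string, extendedDiagnostics bool) bool {
	if extendedDiagnostics {
		return true
	}
	for _, prefix := range []string{
		"/redfish/v1/Chassis/HGX_ERoT_",
		"/redfish/v1/Chassis/HGX_Chassis_",
	} {
		if strings.HasPrefix(path, prefix) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(shouldInclude("/redfish/v1/Chassis/HGX_ERoT_NVSwitch_0/PCIeDevices", false)) // false
	fmt.Println(shouldInclude("/redfish/v1/Chassis/1/PCIeDevices", false))                   // true
	fmt.Println(shouldInclude("/redfish/v1/Chassis/HGX_ERoT_NVSwitch_0/PCIeDevices", true))  // true
}
```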
File diff suppressed because it is too large
159 internal/collector/redfish_replay_fru.go Normal file
@@ -0,0 +1,159 @@
package collector

import (
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func (r redfishSnapshotReader) collectBoardFallbackDocs(systemPaths, chassisPaths []string) []map[string]interface{} {
	out := make([]map[string]interface{}, 0)
	for _, chassisPath := range chassisPaths {
		for _, suffix := range []string{"/Boards", "/Backplanes"} {
			path := joinPath(chassisPath, suffix)
			if docs, err := r.getCollectionMembers(path); err == nil && len(docs) > 0 {
				out = append(out, docs...)
				continue
			}
			if doc, err := r.getJSON(path); err == nil && len(doc) > 0 {
				out = append(out, doc)
			}
		}
	}
	for _, path := range append(append([]string{}, systemPaths...), chassisPaths...) {
		for _, suffix := range []string{"/Oem/Public", "/Oem/Public/ThermalConfig", "/ThermalConfig"} {
			docPath := joinPath(path, suffix)
			if doc, err := r.getJSON(docPath); err == nil && len(doc) > 0 {
				out = append(out, doc)
			}
		}
	}
	return out
}

func applyBoardInfoFallbackFromDocs(board *models.BoardInfo, docs []map[string]interface{}) {
	if board == nil || len(docs) == 0 {
		return
	}
	for _, doc := range docs {
		candidate := parseBoardInfoFromFRUDoc(doc)
		if !isLikelyServerProductName(candidate.ProductName) {
			continue
		}
		if board.Manufacturer == "" {
			board.Manufacturer = candidate.Manufacturer
		}
		if board.ProductName == "" {
			board.ProductName = candidate.ProductName
		}
		if board.SerialNumber == "" {
			board.SerialNumber = candidate.SerialNumber
		}
		if board.PartNumber == "" {
			board.PartNumber = candidate.PartNumber
		}
		if board.Manufacturer != "" && board.ProductName != "" && board.SerialNumber != "" && board.PartNumber != "" {
			return
		}
	}
}

func isLikelyServerProductName(v string) bool {
	v = strings.TrimSpace(v)
	if v == "" {
		return false
	}
	n := strings.ToUpper(v)
	if strings.Contains(n, "NULL") {
		return false
	}
	componentTokens := []string{
		"DIMM", "DDR", "NVME", "SSD", "HDD", "GPU", "NIC", "RAID",
		"PSU", "FAN", "BACKPLANE", "FRU",
	}
	for _, token := range componentTokens {
		if strings.Contains(n, strings.ToUpper(token)) {
			return false
		}
	}
	return true
}

// collectAssemblyFRU reads Chassis/*/Assembly documents and returns FRU entries
// for subcomponents (backplanes, PSUs, DIMMs, etc.) that carry meaningful
// serial or part numbers. Entries already present in dedicated collections
// (PSUs, DIMMs) are included here as well so that all FRU data is available
// in one place; deduplication by serial is performed.
func (r redfishSnapshotReader) collectAssemblyFRU(chassisPaths []string) []models.FRUInfo {
	seen := make(map[string]struct{})
	var out []models.FRUInfo

	add := func(fru models.FRUInfo) {
		key := strings.ToUpper(strings.TrimSpace(fru.SerialNumber))
		if key == "" {
			key = strings.ToUpper(strings.TrimSpace(fru.Description + "|" + fru.PartNumber))
		}
		if key == "" || key == "|" {
			return
		}
		if _, ok := seen[key]; ok {
			return
		}
		seen[key] = struct{}{}
		out = append(out, fru)
	}

	for _, chassisPath := range chassisPaths {
		doc, err := r.getJSON(joinPath(chassisPath, "/Assembly"))
		if err != nil || len(doc) == 0 {
			continue
		}
		assemblies, _ := doc["Assemblies"].([]interface{})
		for _, aAny := range assemblies {
			a, ok := aAny.(map[string]interface{})
			if !ok {
				continue
			}
			name := strings.TrimSpace(firstNonEmpty(asString(a["Name"]), asString(a["Description"])))
			model := strings.TrimSpace(asString(a["Model"]))
			partNumber := strings.TrimSpace(asString(a["PartNumber"]))
			serial := extractAssemblySerial(a)

			if serial == "" && partNumber == "" {
				continue
			}
			add(models.FRUInfo{
				Description:  name,
				ProductName:  model,
				SerialNumber: serial,
				PartNumber:   partNumber,
			})
		}
	}
	return out
}

// extractAssemblySerial tries to find a serial number in an Assembly entry.
// Standard Redfish Assembly has no top-level SerialNumber; vendors put it in Oem.
func extractAssemblySerial(a map[string]interface{}) string {
	if s := strings.TrimSpace(asString(a["SerialNumber"])); s != "" {
		return s
	}
	oem, _ := a["Oem"].(map[string]interface{})
	for _, v := range oem {
		subtree, ok := v.(map[string]interface{})
		if !ok {
			continue
		}
		for _, v2 := range subtree {
			node, ok := v2.(map[string]interface{})
			if !ok {
				continue
			}
			if s := strings.TrimSpace(asString(node["SerialNumber"])); s != "" {
				return s
			}
		}
	}
	return ""
}
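The `add` closure above deduplicates FRU entries by a normalized serial number, falling back to description plus part number when the serial is missing. A self-contained sketch of that keying scheme (the `fru` type and `dedupeBySerial` helper are illustrative, not the repo's types):

```go
package main

import (
	"fmt"
	"strings"
)

type fru struct{ Description, PartNumber, SerialNumber string }

// dedupeBySerial prefers the serial number as the identity key,
// falls back to description+part number, and drops entries with
// no usable key at all.
func dedupeBySerial(in []fru) []fru {
	seen := make(map[string]struct{})
	var out []fru
	for _, f := range in {
		key := strings.ToUpper(strings.TrimSpace(f.SerialNumber))
		if key == "" {
			key = strings.ToUpper(strings.TrimSpace(f.Description + "|" + f.PartNumber))
		}
		if key == "" || key == "|" {
			continue
		}
		if _, ok := seen[key]; ok {
			continue
		}
		seen[key] = struct{}{}
		out = append(out, f)
	}
	return out
}

func main() {
	got := dedupeBySerial([]fru{
		{Description: "PSU 1", SerialNumber: "abc123"},
		{Description: "PSU 1 (again)", SerialNumber: "ABC123"}, // same serial, different case
		{Description: "Backplane", PartNumber: "BP-9"},         // no serial, fallback key
		{},                                                     // no usable key, dropped
	})
	fmt.Println(len(got)) // 2
}
```

Uppercasing before keying means vendor-reported serials that differ only in case collapse to one entry.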
198 internal/collector/redfish_replay_gpu.go Normal file
@@ -0,0 +1,198 @@
package collector

import (
	"fmt"
	"strings"

	"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
	"git.mchus.pro/mchus/logpile/internal/models"
)

func (r redfishSnapshotReader) collectGPUs(systemPaths, chassisPaths []string, plan redfishprofile.ResolvedAnalysisPlan) []models.GPU {
	collections := make([]string, 0, len(systemPaths)*3+len(chassisPaths)*2)
	for _, systemPath := range systemPaths {
		collections = append(collections, joinPath(systemPath, "/PCIeDevices"))
		collections = append(collections, joinPath(systemPath, "/Accelerators"))
		collections = append(collections, joinPath(systemPath, "/GraphicsControllers"))
	}
	for _, chassisPath := range chassisPaths {
		collections = append(collections, joinPath(chassisPath, "/PCIeDevices"))
		collections = append(collections, joinPath(chassisPath, "/Accelerators"))
	}
	var out []models.GPU
	seen := make(map[string]struct{})
	idx := 1
	for _, collectionPath := range collections {
		memberDocs, err := r.getCollectionMembers(collectionPath)
		if err != nil || len(memberDocs) == 0 {
			continue
		}
		for _, doc := range memberDocs {
			functionDocs := r.getLinkedPCIeFunctions(doc)
			if !looksLikeGPU(doc, functionDocs) {
				continue
			}
			supplementalDocs := r.getLinkedSupplementalDocs(doc, "EnvironmentMetrics", "Metrics")
			for _, fn := range functionDocs {
				supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
			}
			gpu := parseGPUWithSupplementalDocs(doc, functionDocs, supplementalDocs, idx)
			idx++
			if plan.Directives.EnableGenericGraphicsControllerDedup && shouldSkipGenericGPUDuplicate(out, gpu) {
				continue
			}
			key := gpuDocDedupKey(doc, gpu)
			if key == "" {
				continue
			}
			if _, ok := seen[key]; ok {
				continue
			}
			seen[key] = struct{}{}
			out = append(out, gpu)
		}
	}
	if plan.Directives.EnableGenericGraphicsControllerDedup {
		return dropModelOnlyGPUPlaceholders(out)
	}
	return out
}

// msiGhostGPUFilter returns true when the GPU chassis for gpuID shows a temperature
// of 0 on a powered-on host, which is the reliable MSI/AMI signal that the GPU is
// no longer physically installed (stale BMC inventory cache).
// It only filters when the system PowerState is "On" — when the host is off, all
// temperature readings are 0 and we cannot distinguish absent from idle.
func (r redfishSnapshotReader) msiGhostGPUFilter(systemPaths []string, gpuID, chassisPath string) bool {
	// Require host powered on.
	for _, sp := range systemPaths {
		doc, err := r.getJSON(sp)
		if err != nil {
			continue
		}
		if !strings.EqualFold(strings.TrimSpace(asString(doc["PowerState"])), "on") {
			return false
		}
		break
	}
	// Read the temperature sensor for this GPU chassis.
	sensorPath := joinPath(chassisPath, "/Sensors/"+gpuID+"_Temperature")
	sensorDoc, err := r.getJSON(sensorPath)
	if err != nil || len(sensorDoc) == 0 {
		return false
	}
	reading, ok := sensorDoc["Reading"]
	if !ok {
		return false
	}
	switch v := reading.(type) {
	case float64:
		return v == 0
	case int:
		return v == 0
	case int64:
		return v == 0
	}
	return false
}

// collectGPUsFromProcessors finds GPUs that some BMCs (e.g. MSI) expose as
// Processor entries with ProcessorType=GPU rather than as PCIe devices.
// It supplements the existing gpus slice (already found via PCIe path),
// skipping entries already present by UUID or SerialNumber.
// Serial numbers are looked up from Chassis members named after each GPU Id.
func (r redfishSnapshotReader) collectGPUsFromProcessors(systemPaths, chassisPaths []string, existing []models.GPU, plan redfishprofile.ResolvedAnalysisPlan) []models.GPU {
	if !plan.Directives.EnableProcessorGPUFallback {
		return append([]models.GPU{}, existing...)
	}
	chassisByID := make(map[string]map[string]interface{})
	chassisPathByID := make(map[string]string)
	for _, cp := range chassisPaths {
		doc, err := r.getJSON(cp)
		if err != nil || len(doc) == 0 {
			continue
		}
		id := strings.TrimSpace(asString(doc["Id"]))
		if id != "" {
			chassisByID[strings.ToUpper(id)] = doc
			chassisPathByID[strings.ToUpper(id)] = cp
		}
	}

	seenUUID := make(map[string]struct{})
	seenSerial := make(map[string]struct{})
	for _, g := range existing {
		if u := strings.ToUpper(strings.TrimSpace(g.UUID)); u != "" {
			seenUUID[u] = struct{}{}
		}
		if s := strings.ToUpper(strings.TrimSpace(g.SerialNumber)); s != "" {
			seenSerial[s] = struct{}{}
		}
	}

	out := append([]models.GPU{}, existing...)
	idx := len(existing) + 1
	for _, systemPath := range systemPaths {
		procDocs, err := r.getCollectionMembers(joinPath(systemPath, "/Processors"))
		if err != nil {
			continue
		}
		for _, doc := range procDocs {
			if !strings.EqualFold(strings.TrimSpace(asString(doc["ProcessorType"])), "GPU") {
				continue
			}

			gpuID := strings.TrimSpace(asString(doc["Id"]))
			serial := findFirstNormalizedStringByKeys(doc, "SerialNumber")
			if serial == "" {
				serial = resolveProcessorGPUChassisSerial(chassisByID, gpuID, plan)
			}

			if plan.Directives.EnableMSIGhostGPUFilter {
				chassisPath := resolveProcessorGPUChassisPath(chassisPathByID, gpuID, plan)
				if chassisPath != "" && r.msiGhostGPUFilter(systemPaths, gpuID, chassisPath) {
					continue
				}
			}

			uuid := strings.TrimSpace(asString(doc["UUID"]))
			uuidKey := strings.ToUpper(uuid)
			serialKey := strings.ToUpper(serial)

			if uuidKey != "" {
				if _, dup := seenUUID[uuidKey]; dup {
					continue
				}
				seenUUID[uuidKey] = struct{}{}
			}
			if serialKey != "" {
				if _, dup := seenSerial[serialKey]; dup {
					continue
				}
				seenSerial[serialKey] = struct{}{}
			}

			slotLabel := firstNonEmpty(
				redfishLocationLabel(doc["Location"]),
				redfishLocationLabel(doc["PhysicalLocation"]),
			)
			if slotLabel == "" && gpuID != "" {
				slotLabel = gpuID
			}
			if slotLabel == "" {
				slotLabel = fmt.Sprintf("GPU%d", idx)
			}
			out = append(out, models.GPU{
				Slot:         slotLabel,
				Model:        firstNonEmpty(asString(doc["Model"]), asString(doc["Name"])),
				Manufacturer: asString(doc["Manufacturer"]),
				PartNumber:   asString(doc["PartNumber"]),
				SerialNumber: serial,
				UUID:         uuid,
				Status:       mapStatus(doc["Status"]),
			})
			idx++
		}
	}
	return out
}
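The ghost-GPU doc comment above describes a two-condition heuristic: a zero temperature reading is only meaningful when the host is powered on. Reduced to its decision logic, it looks like this (the `isGhostGPU` helper is an illustrative sketch, not the repo's function, which also fetches the sensor document over Redfish):

```go
package main

import "fmt"

// isGhostGPU reports whether a GPU chassis entry is likely stale BMC
// inventory: a 0 temperature reading on a powered-on host. When the
// host is off, every sensor reads 0, so absent and idle look identical
// and the filter must not fire.
func isGhostGPU(hostPoweredOn bool, tempReading float64) bool {
	return hostPoweredOn && tempReading == 0
}

func main() {
	fmt.Println(isGhostGPU(true, 0))  // true: stale inventory entry
	fmt.Println(isGhostGPU(false, 0)) // false: host off, reading meaningless
	fmt.Println(isGhostGPU(true, 41)) // false: GPU present and idle
}
```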
599 internal/collector/redfish_replay_inventory.go Normal file
@@ -0,0 +1,599 @@
|
||||
package collector
|
||||
|
||||
import (
|
||||
"strings"
|
||||
|
||||
"git.mchus.pro/mchus/logpile/internal/models"
|
||||
)
|
||||
|
||||
func (r redfishSnapshotReader) enrichNICsFromNetworkInterfaces(nics *[]models.NetworkAdapter, systemPaths []string) {
|
||||
if nics == nil {
|
||||
return
|
||||
}
|
||||
bySlot := make(map[string]int, len(*nics))
|
||||
for i, nic := range *nics {
|
||||
bySlot[strings.ToLower(strings.TrimSpace(nic.Slot))] = i
|
||||
}
|
||||
|
||||
for _, systemPath := range systemPaths {
|
||||
ifaces, err := r.getCollectionMembers(joinPath(systemPath, "/NetworkInterfaces"))
|
||||
if err != nil || len(ifaces) == 0 {
|
||||
continue
|
||||
}
|
||||
for _, iface := range ifaces {
|
||||
slot := firstNonEmpty(asString(iface["Id"]), asString(iface["Name"]))
|
||||
if strings.TrimSpace(slot) == "" {
|
||||
continue
|
||||
}
|
||||
idx, ok := bySlot[strings.ToLower(strings.TrimSpace(slot))]
|
||||
if !ok {
|
||||
// The NetworkInterface Id (e.g. "2") may not match the display slot of
|
||||
// the real NIC that came from Chassis/NetworkAdapters (e.g. "RISER 5
|
||||
// slot 1 (7)"). Try to find the real NIC via the Links.NetworkAdapter
|
||||
// cross-reference before creating a ghost entry.
|
||||
if linkedIdx := r.findNICIndexByLinkedNetworkAdapter(iface, *nics, bySlot); linkedIdx >= 0 {
|
||||
idx = linkedIdx
|
||||
ok = true
|
||||
}
|
||||
}
|
||||
if !ok {
|
||||
*nics = append(*nics, models.NetworkAdapter{
|
||||
Slot: slot,
|
||||
Present: true,
|
||||
Model: firstNonEmpty(asString(iface["Model"]), asString(iface["Name"])),
|
||||
Status: mapStatus(iface["Status"]),
|
||||
})
|
||||
idx = len(*nics) - 1
|
||||
bySlot[strings.ToLower(strings.TrimSpace(slot))] = idx
|
||||
}
|
||||
|
||||
portsPath := redfishLinkedPath(iface, "NetworkPorts")
|
||||
if portsPath == "" {
|
||||
continue
|
||||
}
|
||||
portDocs, err := r.getCollectionMembers(portsPath)
|
||||
if err != nil || len(portDocs) == 0 {
|
||||
continue
|
||||
}
|
||||
macs := append([]string{}, (*nics)[idx].MACAddresses...)
|
||||
for _, p := range portDocs {
|
||||
macs = append(macs, collectNetworkPortMACs(p)...)
|
||||
}
|
||||
(*nics)[idx].MACAddresses = dedupeStrings(macs)
|
||||
if sanitizeNetworkPortCount((*nics)[idx].PortCount) == 0 {
|
||||
(*nics)[idx].PortCount = len(portDocs)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) collectNICs(chassisPaths []string) []models.NetworkAdapter {
|
||||
var nics []models.NetworkAdapter
|
||||
for _, chassisPath := range chassisPaths {
|
||||
adapterDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/NetworkAdapters"))
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
for _, doc := range adapterDocs {
|
||||
nics = append(nics, r.buildNICFromAdapterDoc(doc))
|
||||
}
|
||||
}
|
||||
return dedupeNetworkAdapters(nics)
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) buildNICFromAdapterDoc(adapterDoc map[string]interface{}) models.NetworkAdapter {
|
||||
nic := parseNIC(adapterDoc)
|
||||
adapterFunctionDocs := r.getNetworkAdapterFunctionDocs(adapterDoc)
|
||||
for _, pciePath := range networkAdapterPCIeDevicePaths(adapterDoc) {
|
||||
pcieDoc, err := r.getJSON(pciePath)
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
functionDocs := r.getLinkedPCIeFunctions(pcieDoc)
|
||||
for _, adapterFnDoc := range adapterFunctionDocs {
|
||||
functionDocs = append(functionDocs, r.getLinkedPCIeFunctions(adapterFnDoc)...)
|
||||
}
|
||||
functionDocs = dedupeJSONDocsByPath(functionDocs)
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(pcieDoc, "EnvironmentMetrics", "Metrics")
|
||||
for _, fn := range functionDocs {
|
||||
supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
|
||||
}
|
||||
enrichNICFromPCIe(&nic, pcieDoc, functionDocs, supplementalDocs)
|
||||
}
|
||||
if len(nic.MACAddresses) == 0 {
|
||||
r.enrichNICMACsFromNetworkDeviceFunctions(&nic, adapterDoc)
|
||||
}
|
||||
return nic
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) getNetworkAdapterFunctionDocs(adapterDoc map[string]interface{}) []map[string]interface{} {
|
||||
ndfCol, ok := adapterDoc["NetworkDeviceFunctions"].(map[string]interface{})
|
||||
if !ok {
|
||||
return nil
|
||||
}
|
||||
colPath := asString(ndfCol["@odata.id"])
|
||||
if colPath == "" {
|
||||
return nil
|
||||
}
|
||||
funcDocs, err := r.getCollectionMembers(colPath)
|
||||
if err != nil {
|
||||
return nil
|
||||
}
|
||||
return funcDocs
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) collectPCIeDevices(systemPaths, chassisPaths []string) []models.PCIeDevice {
|
||||
collections := make([]string, 0, len(systemPaths)+len(chassisPaths))
|
||||
for _, systemPath := range systemPaths {
|
||||
collections = append(collections, joinPath(systemPath, "/PCIeDevices"))
|
||||
}
|
||||
for _, chassisPath := range chassisPaths {
|
||||
collections = append(collections, joinPath(chassisPath, "/PCIeDevices"))
|
||||
}
|
||||
var out []models.PCIeDevice
|
||||
for _, collectionPath := range collections {
|
||||
memberDocs, err := r.getCollectionMembers(collectionPath)
|
||||
if err != nil || len(memberDocs) == 0 {
|
||||
continue
|
||||
}
|
||||
for _, doc := range memberDocs {
|
||||
functionDocs := r.getLinkedPCIeFunctions(doc)
|
||||
if looksLikeGPU(doc, functionDocs) {
|
||||
continue
|
||||
}
|
||||
if replayPCIeDeviceBackedByCanonicalNIC(doc, functionDocs) {
|
||||
continue
|
||||
}
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(doc, "EnvironmentMetrics", "Metrics")
|
||||
supplementalDocs = append(supplementalDocs, r.getChassisScopedPCIeSupplementalDocs(doc)...)
|
||||
for _, fn := range functionDocs {
|
||||
supplementalDocs = append(supplementalDocs, r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")...)
|
||||
}
|
||||
dev := parsePCIeDeviceWithSupplementalDocs(doc, functionDocs, supplementalDocs)
|
||||
if shouldSkipReplayPCIeDevice(doc, dev) {
|
||||
continue
|
||||
}
|
||||
out = append(out, dev)
|
||||
}
|
||||
}
|
||||
for _, systemPath := range systemPaths {
|
||||
functionDocs, err := r.getCollectionMembers(joinPath(systemPath, "/PCIeFunctions"))
|
||||
if err != nil || len(functionDocs) == 0 {
|
||||
continue
|
||||
}
|
||||
for idx, fn := range functionDocs {
|
||||
supplementalDocs := r.getLinkedSupplementalDocs(fn, "EnvironmentMetrics", "Metrics")
|
||||
dev := parsePCIeFunctionWithSupplementalDocs(fn, supplementalDocs, idx+1)
|
||||
if shouldSkipReplayPCIeDevice(fn, dev) {
|
||||
continue
|
||||
}
|
||||
out = append(out, dev)
|
||||
}
|
||||
}
|
||||
return dedupePCIeDevices(out)
|
||||
}
|
||||
|
||||
func shouldSkipReplayPCIeDevice(doc map[string]interface{}, dev models.PCIeDevice) bool {
|
||||
if isUnidentifiablePCIeDevice(dev) {
|
||||
return true
|
||||
}
|
||||
if replayNetworkFunctionBackedByCanonicalNIC(doc, dev) {
|
||||
return true
|
||||
}
|
||||
if isReplayStorageServiceEndpoint(doc, dev) {
|
||||
return true
|
||||
}
|
||||
if isReplayNoisePCIeClass(dev.DeviceClass) {
|
||||
return true
|
||||
}
|
||||
if isReplayDisplayDeviceDuplicate(doc, dev) {
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func replayPCIeDeviceBackedByCanonicalNIC(doc map[string]interface{}, functionDocs []map[string]interface{}) bool {
|
||||
if !looksLikeReplayNetworkPCIeDevice(doc, functionDocs) {
|
||||
return false
|
||||
}
|
||||
for _, fn := range functionDocs {
|
||||
if hasRedfishLinkedMember(fn, "NetworkDeviceFunctions") {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func replayNetworkFunctionBackedByCanonicalNIC(doc map[string]interface{}, dev models.PCIeDevice) bool {
|
||||
if !looksLikeReplayNetworkClass(dev.DeviceClass) {
|
||||
return false
|
||||
}
|
||||
return hasRedfishLinkedMember(doc, "NetworkDeviceFunctions")
|
||||
}
|
||||
|
||||
func looksLikeReplayNetworkPCIeDevice(doc map[string]interface{}, functionDocs []map[string]interface{}) bool {
|
||||
for _, fn := range functionDocs {
|
||||
if looksLikeReplayNetworkClass(asString(fn["DeviceClass"])) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
joined := strings.ToLower(strings.TrimSpace(strings.Join([]string{
|
||||
asString(doc["DeviceType"]),
|
||||
asString(doc["Description"]),
|
||||
asString(doc["Name"]),
|
||||
asString(doc["Model"]),
|
||||
}, " ")))
|
||||
return strings.Contains(joined, "network")
|
||||
}
|
||||
|
||||
func looksLikeReplayNetworkClass(class string) bool {
|
||||
class = strings.ToLower(strings.TrimSpace(class))
|
||||
return strings.Contains(class, "network") || strings.Contains(class, "ethernet")
|
||||
}
|
||||
|
||||
func isReplayStorageServiceEndpoint(doc map[string]interface{}, dev models.PCIeDevice) bool {
|
||||
class := strings.ToLower(strings.TrimSpace(dev.DeviceClass))
|
||||
if class != "massstoragecontroller" && class != "mass storage controller" {
|
||||
return false
|
||||
}
|
||||
name := strings.ToLower(strings.TrimSpace(firstNonEmpty(
|
||||
dev.PartNumber,
|
||||
asString(doc["PartNumber"]),
|
||||
asString(doc["Description"]),
|
||||
)))
|
||||
if strings.Contains(name, "pcie switch management endpoint") {
|
||||
return true
|
||||
}
|
||||
if strings.Contains(name, "volume management device nvme raid controller") {
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func hasRedfishLinkedMember(doc map[string]interface{}, key string) bool {
|
||||
links, ok := doc["Links"].(map[string]interface{})
|
||||
if !ok {
|
||||
return false
|
||||
}
|
||||
if asInt(links[key+"@odata.count"]) > 0 {
|
||||
return true
|
||||
}
|
||||
linked, ok := links[key]
|
||||
if !ok {
|
||||
return false
|
||||
}
|
||||
switch v := linked.(type) {
|
||||
case []interface{}:
|
||||
return len(v) > 0
|
||||
case map[string]interface{}:
|
||||
if asString(v["@odata.id"]) != "" {
|
||||
return true
|
||||
}
|
||||
return len(v) > 0
|
||||
default:
|
||||
return false
|
||||
}
|
||||
}
|
||||
|
||||
func isReplayNoisePCIeClass(class string) bool {
|
||||
switch strings.ToLower(strings.TrimSpace(class)) {
|
||||
case "bridge", "processor", "signalprocessingcontroller", "signal processing controller", "serialbuscontroller", "serial bus controller":
|
||||
return true
|
||||
default:
|
||||
return false
|
||||
}
|
||||
}
|
||||
|
||||
func isReplayDisplayDeviceDuplicate(doc map[string]interface{}, dev models.PCIeDevice) bool {
|
||||
class := strings.ToLower(strings.TrimSpace(dev.DeviceClass))
|
||||
if class != "displaycontroller" && class != "display controller" {
|
||||
return false
|
||||
}
|
||||
return strings.EqualFold(strings.TrimSpace(asString(doc["Description"])), "Display Device")
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) getChassisScopedPCIeSupplementalDocs(doc map[string]interface{}) []map[string]interface{} {
|
||||
docPath := normalizeRedfishPath(asString(doc["@odata.id"]))
|
||||
chassisPath := chassisPathForPCIeDoc(docPath)
|
||||
if chassisPath == "" {
|
||||
return nil
|
||||
}
|
||||
|
||||
out := make([]map[string]interface{}, 0, 6)
|
||||
if looksLikeNVSwitchPCIeDoc(doc) {
|
||||
for _, path := range []string{
|
||||
joinPath(chassisPath, "/EnvironmentMetrics"),
|
||||
joinPath(chassisPath, "/ThermalSubsystem/ThermalMetrics"),
|
||||
} {
|
||||
supplementalDoc, err := r.getJSON(path)
|
||||
if err != nil || len(supplementalDoc) == 0 {
|
||||
continue
|
||||
}
|
||||
out = append(out, supplementalDoc)
|
||||
}
|
||||
}
|
||||
deviceDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/Devices"))
|
||||
if err == nil {
|
||||
for _, deviceDoc := range deviceDocs {
|
||||
if !redfishPCIeMatchesChassisDeviceDoc(doc, deviceDoc) {
|
||||
continue
|
||||
}
|
||||
out = append(out, deviceDoc)
|
||||
}
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// collectBMCMAC returns the MAC address of the best BMC management interface
|
||||
// found in Managers/*/EthernetInterfaces. Prefer an active link with an IP
|
||||
// address over a passive sideband interface.
|
||||
func (r redfishSnapshotReader) collectBMCMAC(managerPaths []string) string {
|
||||
summary := r.collectBMCManagementSummary(managerPaths)
|
||||
if len(summary) == 0 {
|
||||
return ""
|
||||
}
|
||||
return strings.ToUpper(strings.TrimSpace(asString(summary["mac_address"])))
|
||||
}
|
||||
|
||||
func (r redfishSnapshotReader) collectBMCManagementSummary(managerPaths []string) map[string]any {
	bestScore := -1
	var best map[string]any
	for _, managerPath := range managerPaths {
		collectionPath := joinPath(managerPath, "/EthernetInterfaces")
		collectionDoc, _ := r.getJSON(collectionPath)
		ncsiEnabled, lldpMode, lldpByEth := redfishManagerEthernetCollectionHints(collectionDoc)
		members, err := r.getCollectionMembers(collectionPath)
		if err != nil || len(members) == 0 {
			continue
		}
		for _, doc := range members {
			mac := strings.TrimSpace(firstNonEmpty(
				asString(doc["PermanentMACAddress"]),
				asString(doc["MACAddress"]),
			))
			if mac == "" || strings.EqualFold(mac, "00:00:00:00:00:00") {
				continue
			}
			ifaceID := strings.TrimSpace(firstNonEmpty(asString(doc["Id"]), asString(doc["Name"])))
			summary := map[string]any{
				"manager_path":    managerPath,
				"interface_id":    ifaceID,
				"hostname":        strings.TrimSpace(asString(doc["HostName"])),
				"fqdn":            strings.TrimSpace(asString(doc["FQDN"])),
				"mac_address":     strings.ToUpper(mac),
				"link_status":     strings.TrimSpace(asString(doc["LinkStatus"])),
				"speed_mbps":      asInt(doc["SpeedMbps"]),
				"interface_name":  strings.TrimSpace(asString(doc["Name"])),
				"interface_desc":  strings.TrimSpace(asString(doc["Description"])),
				"ncsi_enabled":    ncsiEnabled,
				"lldp_mode":       lldpMode,
				"ipv4_address":    redfishManagerIPv4Field(doc, "Address"),
				"ipv4_gateway":    redfishManagerIPv4Field(doc, "Gateway"),
				"ipv4_subnet":     redfishManagerIPv4Field(doc, "SubnetMask"),
				"ipv6_address":    redfishManagerIPv6Field(doc, "Address"),
				"link_is_active":  strings.EqualFold(strings.TrimSpace(asString(doc["LinkStatus"])), "LinkActive"),
				"interface_score": 0,
			}
			if lldp, ok := lldpByEth[strings.ToLower(ifaceID)]; ok {
				summary["lldp_chassis_name"] = lldp["ChassisName"]
				summary["lldp_port_desc"] = lldp["PortDesc"]
				summary["lldp_port_id"] = lldp["PortId"]
				if vlan := asInt(lldp["VlanId"]); vlan > 0 {
					summary["lldp_vlan_id"] = vlan
				}
			}
			score := redfishManagerInterfaceScore(summary)
			summary["interface_score"] = score
			if score > bestScore {
				bestScore = score
				best = summary
			}
		}
	}
	return best
}

func redfishManagerEthernetCollectionHints(collectionDoc map[string]interface{}) (bool, string, map[string]map[string]interface{}) {
	lldpByEth := make(map[string]map[string]interface{})
	if len(collectionDoc) == 0 {
		return false, "", lldpByEth
	}
	oem, _ := collectionDoc["Oem"].(map[string]interface{})
	public, _ := oem["Public"].(map[string]interface{})
	ncsiEnabled := asBool(public["NcsiEnabled"])
	lldp, _ := public["LLDP"].(map[string]interface{})
	lldpMode := strings.TrimSpace(asString(lldp["LLDPMode"]))
	if members, ok := lldp["Members"].([]interface{}); ok {
		for _, item := range members {
			member, ok := item.(map[string]interface{})
			if !ok {
				continue
			}
			ethIndex := strings.ToLower(strings.TrimSpace(asString(member["EthIndex"])))
			if ethIndex == "" {
				continue
			}
			lldpByEth[ethIndex] = member
		}
	}
	return ncsiEnabled, lldpMode, lldpByEth
}

func redfishManagerIPv4Field(doc map[string]interface{}, key string) string {
	if len(doc) == 0 {
		return ""
	}
	for _, field := range []string{"IPv4Addresses", "IPv4StaticAddresses"} {
		list, ok := doc[field].([]interface{})
		if !ok {
			continue
		}
		for _, item := range list {
			entry, ok := item.(map[string]interface{})
			if !ok {
				continue
			}
			value := strings.TrimSpace(asString(entry[key]))
			if value != "" {
				return value
			}
		}
	}
	return ""
}

func redfishManagerIPv6Field(doc map[string]interface{}, key string) string {
	if len(doc) == 0 {
		return ""
	}
	list, ok := doc["IPv6Addresses"].([]interface{})
	if !ok {
		return ""
	}
	for _, item := range list {
		entry, ok := item.(map[string]interface{})
		if !ok {
			continue
		}
		value := strings.TrimSpace(asString(entry[key]))
		if value != "" {
			return value
		}
	}
	return ""
}

func redfishManagerInterfaceScore(summary map[string]any) int {
	score := 0
	if strings.EqualFold(strings.TrimSpace(asString(summary["link_status"])), "LinkActive") {
		score += 100
	}
	if strings.TrimSpace(asString(summary["ipv4_address"])) != "" {
		score += 40
	}
	if strings.TrimSpace(asString(summary["ipv6_address"])) != "" {
		score += 10
	}
	if strings.TrimSpace(asString(summary["mac_address"])) != "" {
		score += 10
	}
	if asInt(summary["speed_mbps"]) > 0 {
		score += 5
	}
	if ifaceID := strings.ToLower(strings.TrimSpace(asString(summary["interface_id"]))); ifaceID != "" && !strings.HasPrefix(ifaceID, "usb") {
		score += 3
	}
	if asBool(summary["ncsi_enabled"]) {
		score += 1
	}
	return score
}

// findNICIndexByLinkedNetworkAdapter resolves a NetworkInterface document to an
// existing NIC in bySlot by following Links.NetworkAdapter → the Chassis
// NetworkAdapter doc and reconstructing the canonical NIC identity. Returns -1
// if no match is found.
func (r redfishSnapshotReader) findNICIndexByLinkedNetworkAdapter(iface map[string]interface{}, existing []models.NetworkAdapter, bySlot map[string]int) int {
	links, ok := iface["Links"].(map[string]interface{})
	if !ok {
		return -1
	}
	adapterRef, ok := links["NetworkAdapter"].(map[string]interface{})
	if !ok {
		return -1
	}
	adapterPath := normalizeRedfishPath(asString(adapterRef["@odata.id"]))
	if adapterPath == "" {
		return -1
	}
	adapterDoc, err := r.getJSON(adapterPath)
	if err != nil || len(adapterDoc) == 0 {
		return -1
	}
	adapterNIC := r.buildNICFromAdapterDoc(adapterDoc)
	if serial := normalizeRedfishIdentityField(adapterNIC.SerialNumber); serial != "" {
		for idx, nic := range existing {
			if strings.EqualFold(normalizeRedfishIdentityField(nic.SerialNumber), serial) {
				return idx
			}
		}
	}
	if bdf := strings.TrimSpace(adapterNIC.BDF); bdf != "" {
		for idx, nic := range existing {
			if strings.EqualFold(strings.TrimSpace(nic.BDF), bdf) {
				return idx
			}
		}
	}
	if slot := strings.ToLower(strings.TrimSpace(adapterNIC.Slot)); slot != "" {
		if idx, ok := bySlot[slot]; ok {
			return idx
		}
	}
	for idx, nic := range existing {
		if networkAdaptersShareMACs(nic, adapterNIC) {
			return idx
		}
	}
	return -1
}

func networkAdaptersShareMACs(a, b models.NetworkAdapter) bool {
	if len(a.MACAddresses) == 0 || len(b.MACAddresses) == 0 {
		return false
	}
	seen := make(map[string]struct{}, len(a.MACAddresses))
	for _, mac := range a.MACAddresses {
		normalized := strings.ToUpper(strings.TrimSpace(mac))
		if normalized == "" {
			continue
		}
		seen[normalized] = struct{}{}
	}
	for _, mac := range b.MACAddresses {
		normalized := strings.ToUpper(strings.TrimSpace(mac))
		if normalized == "" {
			continue
		}
		if _, ok := seen[normalized]; ok {
			return true
		}
	}
	return false
}

// enrichNICMACsFromNetworkDeviceFunctions reads the NetworkDeviceFunctions
// collection linked from a NetworkAdapter document and populates the NIC's
// MACAddresses from each function's Ethernet.PermanentMACAddress / MACAddress.
// Called when PCIe-path enrichment does not produce any MACs.
func (r redfishSnapshotReader) enrichNICMACsFromNetworkDeviceFunctions(nic *models.NetworkAdapter, adapterDoc map[string]interface{}) {
	ndfCol, ok := adapterDoc["NetworkDeviceFunctions"].(map[string]interface{})
	if !ok {
		return
	}
	colPath := asString(ndfCol["@odata.id"])
	if colPath == "" {
		return
	}
	funcDocs, err := r.getCollectionMembers(colPath)
	if err != nil || len(funcDocs) == 0 {
		return
	}
	for _, fn := range funcDocs {
		eth, _ := fn["Ethernet"].(map[string]interface{})
		if eth == nil {
			continue
		}
		mac := strings.TrimSpace(firstNonEmpty(
			asString(eth["PermanentMACAddress"]),
			asString(eth["MACAddress"]),
		))
		if mac == "" {
			continue
		}
		nic.MACAddresses = dedupeStrings(append(nic.MACAddresses, strings.ToUpper(mac)))
	}
	if len(funcDocs) > 0 && nic.PortCount == 0 {
		nic.PortCount = sanitizeNetworkPortCount(len(funcDocs))
	}
}
100 internal/collector/redfish_replay_profiles.go (new file)
@@ -0,0 +1,100 @@
package collector

import (
	"strings"

	"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
)

func (r redfishSnapshotReader) collectKnownStorageMembers(systemPath string, relativeCollections []string) []map[string]interface{} {
	var out []map[string]interface{}
	for _, rel := range relativeCollections {
		docs, err := r.getCollectionMembers(joinPath(systemPath, rel))
		if err != nil || len(docs) == 0 {
			continue
		}
		out = append(out, docs...)
	}
	return out
}

func (r redfishSnapshotReader) probeSupermicroNVMeDiskBays(backplanePath string) []map[string]interface{} {
	return r.probeDirectDiskBayChildren(joinPath(backplanePath, "/Drives"))
}

func (r redfishSnapshotReader) probeDirectDiskBayChildren(drivesCollectionPath string) []map[string]interface{} {
	var out []map[string]interface{}
	for _, path := range directDiskBayCandidates(drivesCollectionPath) {
		doc, err := r.getJSON(path)
		if err != nil || !looksLikeDrive(doc) {
			continue
		}
		out = append(out, doc)
	}
	return out
}

func resolveProcessorGPUChassisSerial(chassisByID map[string]map[string]interface{}, gpuID string, plan redfishprofile.ResolvedAnalysisPlan) string {
	for _, candidateID := range processorGPUChassisCandidateIDs(gpuID, plan) {
		if chassisDoc, ok := chassisByID[strings.ToUpper(candidateID)]; ok {
			if serial := strings.TrimSpace(asString(chassisDoc["SerialNumber"])); serial != "" {
				return serial
			}
		}
	}
	return ""
}

func resolveProcessorGPUChassisPath(chassisPathByID map[string]string, gpuID string, plan redfishprofile.ResolvedAnalysisPlan) string {
	for _, candidateID := range processorGPUChassisCandidateIDs(gpuID, plan) {
		if p, ok := chassisPathByID[strings.ToUpper(candidateID)]; ok {
			return p
		}
	}
	return ""
}

func processorGPUChassisCandidateIDs(gpuID string, plan redfishprofile.ResolvedAnalysisPlan) []string {
	gpuID = strings.TrimSpace(gpuID)
	if gpuID == "" {
		return nil
	}
	candidates := []string{gpuID}
	for _, mode := range plan.ProcessorGPUChassisLookupModes {
		switch strings.ToLower(strings.TrimSpace(mode)) {
		case "msi-index":
			candidates = append(candidates, msiProcessorGPUChassisCandidateIDs(gpuID)...)
		case "hgx-alias":
			if strings.HasPrefix(strings.ToUpper(gpuID), "GPU_") {
				candidates = append(candidates, "HGX_"+gpuID)
			}
		}
	}
	return dedupeStrings(candidates)
}

func msiProcessorGPUChassisCandidateIDs(gpuID string) []string {
	gpuID = strings.TrimSpace(strings.ToUpper(gpuID))
	if gpuID == "" {
		return nil
	}
	var out []string
	switch {
	case strings.HasPrefix(gpuID, "GPU_SXM_"):
		index := strings.TrimPrefix(gpuID, "GPU_SXM_")
		if index != "" {
			out = append(out, "GPU"+index, "GPU_"+index)
		}
	case strings.HasPrefix(gpuID, "GPU_"):
		index := strings.TrimPrefix(gpuID, "GPU_")
		if index != "" {
			out = append(out, "GPU"+index, "GPU_SXM_"+index)
		}
	case strings.HasPrefix(gpuID, "GPU"):
		index := strings.TrimPrefix(gpuID, "GPU")
		if index != "" {
			out = append(out, "GPU_"+index, "GPU_SXM_"+index)
		}
	}
	return out
}
167 internal/collector/redfish_replay_storage.go (new file)
@@ -0,0 +1,167 @@
package collector

import (
	"git.mchus.pro/mchus/logpile/internal/collector/redfishprofile"
	"git.mchus.pro/mchus/logpile/internal/models"
)

func (r redfishSnapshotReader) collectStorage(systemPath string, plan redfishprofile.ResolvedAnalysisPlan) []models.Storage {
	var out []models.Storage
	storageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/Storage"))
	for _, member := range storageMembers {
		if driveCollection, ok := member["Drives"].(map[string]interface{}); ok {
			if driveCollectionPath := asString(driveCollection["@odata.id"]); driveCollectionPath != "" {
				driveDocs, err := r.getCollectionMembers(driveCollectionPath)
				if err == nil {
					for _, driveDoc := range driveDocs {
						if !isAbsentDriveDoc(driveDoc) && !isVirtualStorageDrive(driveDoc) {
							supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
							out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
						}
					}
					if len(driveDocs) == 0 {
						for _, driveDoc := range r.probeDirectDiskBayChildren(driveCollectionPath) {
							if isAbsentDriveDoc(driveDoc) {
								continue
							}
							supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
							out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
						}
					}
				}
				continue
			}
		}
		if drives, ok := member["Drives"].([]interface{}); ok {
			for _, driveAny := range drives {
				driveRef, ok := driveAny.(map[string]interface{})
				if !ok {
					continue
				}
				odata := asString(driveRef["@odata.id"])
				if odata == "" {
					continue
				}
				driveDoc, err := r.getJSON(odata)
				if err != nil {
					continue
				}
				if !isAbsentDriveDoc(driveDoc) && !isVirtualStorageDrive(driveDoc) {
					supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
					out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
				}
			}
			continue
		}
		if looksLikeDrive(member) {
			if isAbsentDriveDoc(member) || isVirtualStorageDrive(member) {
				continue
			}
			supplementalDocs := r.getLinkedSupplementalDocs(member, "DriveMetrics", "EnvironmentMetrics", "Metrics")
			out = append(out, parseDriveWithSupplementalDocs(member, supplementalDocs...))
		}

		if plan.Directives.EnableStorageEnclosureRecovery {
			for _, enclosurePath := range redfishLinkRefs(member, "Links", "Enclosures") {
				driveDocs, err := r.getCollectionMembers(joinPath(enclosurePath, "/Drives"))
				if err == nil {
					for _, driveDoc := range driveDocs {
						if looksLikeDrive(driveDoc) && !isAbsentDriveDoc(driveDoc) && !isVirtualStorageDrive(driveDoc) {
							supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
							out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
						}
					}
					if len(driveDocs) == 0 {
						for _, driveDoc := range r.probeDirectDiskBayChildren(joinPath(enclosurePath, "/Drives")) {
							if isAbsentDriveDoc(driveDoc) || isVirtualStorageDrive(driveDoc) {
								continue
							}
							out = append(out, parseDrive(driveDoc))
						}
					}
				}
			}
		}
	}

	if len(plan.KnownStorageDriveCollections) > 0 {
		for _, driveDoc := range r.collectKnownStorageMembers(systemPath, plan.KnownStorageDriveCollections) {
			if looksLikeDrive(driveDoc) && !isAbsentDriveDoc(driveDoc) && !isVirtualStorageDrive(driveDoc) {
				supplementalDocs := r.getLinkedSupplementalDocs(driveDoc, "DriveMetrics", "EnvironmentMetrics", "Metrics")
				out = append(out, parseDriveWithSupplementalDocs(driveDoc, supplementalDocs...))
			}
		}
	}

	simpleStorageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/SimpleStorage"))
	for _, member := range simpleStorageMembers {
		devices, ok := member["Devices"].([]interface{})
		if !ok {
			continue
		}
		for _, devAny := range devices {
			devDoc, ok := devAny.(map[string]interface{})
			if !ok || !looksLikeDrive(devDoc) || isAbsentDriveDoc(devDoc) || isVirtualStorageDrive(devDoc) {
				continue
			}
			out = append(out, parseDrive(devDoc))
		}
	}

	chassisPaths := r.discoverMemberPaths("/redfish/v1/Chassis", "/redfish/v1/Chassis/1")
	for _, chassisPath := range chassisPaths {
		driveDocs, err := r.getCollectionMembers(joinPath(chassisPath, "/Drives"))
		if err != nil {
			continue
		}
		for _, driveDoc := range driveDocs {
			if !looksLikeDrive(driveDoc) || isAbsentDriveDoc(driveDoc) || isVirtualStorageDrive(driveDoc) {
				continue
			}
			out = append(out, parseDrive(driveDoc))
		}
	}
	if plan.Directives.EnableSupermicroNVMeBackplane {
		for _, chassisPath := range chassisPaths {
			if !isSupermicroNVMeBackplanePath(chassisPath) {
				continue
			}
			for _, driveDoc := range r.probeSupermicroNVMeDiskBays(chassisPath) {
				if !looksLikeDrive(driveDoc) || isAbsentDriveDoc(driveDoc) || isVirtualStorageDrive(driveDoc) {
					continue
				}
				out = append(out, parseDrive(driveDoc))
			}
		}
	}
	return dedupeStorage(out)
}

func (r redfishSnapshotReader) collectStorageVolumes(systemPath string, plan redfishprofile.ResolvedAnalysisPlan) []models.StorageVolume {
	var out []models.StorageVolume
	storageMembers, _ := r.getCollectionMembers(joinPath(systemPath, "/Storage"))
	for _, member := range storageMembers {
		controller := firstNonEmpty(asString(member["Id"]), asString(member["Name"]))
		volumeCollectionPath := redfishLinkedPath(member, "Volumes")
		if volumeCollectionPath == "" {
			continue
		}
		volumeDocs, err := r.getCollectionMembers(volumeCollectionPath)
		if err != nil {
			continue
		}
		for _, volDoc := range volumeDocs {
			if looksLikeVolume(volDoc) {
				out = append(out, parseStorageVolume(volDoc, controller))
			}
		}
	}
	if len(plan.KnownStorageVolumeCollections) > 0 {
		for _, volDoc := range r.collectKnownStorageMembers(systemPath, plan.KnownStorageVolumeCollections) {
			if looksLikeVolume(volDoc) {
				out = append(out, parseStorageVolume(volDoc, storageControllerFromPath(asString(volDoc["@odata.id"]))))
			}
		}
	}
	return dedupeStorageVolumes(out)
}
File diff suppressed because it is too large
162 internal/collector/redfishprofile/acquisition.go (new file)
@@ -0,0 +1,162 @@
package redfishprofile

import "strings"

func ResolveAcquisitionPlan(match MatchResult, plan AcquisitionPlan, discovered DiscoveredResources, signals MatchSignals) ResolvedAcquisitionPlan {
	seedGroups := [][]string{
		baselineSeedPaths(discovered),
		expandScopedSuffixes(discovered.SystemPaths, plan.ScopedPaths.SystemSeedSuffixes),
		expandScopedSuffixes(discovered.ChassisPaths, plan.ScopedPaths.ChassisSeedSuffixes),
		expandScopedSuffixes(discovered.ManagerPaths, plan.ScopedPaths.ManagerSeedSuffixes),
		plan.SeedPaths,
	}
	if plan.Mode == ModeFallback {
		seedGroups = append(seedGroups, plan.PlanBPaths)
	}

	criticalGroups := [][]string{
		baselineCriticalPaths(discovered),
		expandScopedSuffixes(discovered.SystemPaths, plan.ScopedPaths.SystemCriticalSuffixes),
		expandScopedSuffixes(discovered.ChassisPaths, plan.ScopedPaths.ChassisCriticalSuffixes),
		expandScopedSuffixes(discovered.ManagerPaths, plan.ScopedPaths.ManagerCriticalSuffixes),
		plan.CriticalPaths,
	}

	resolved := ResolvedAcquisitionPlan{
		Plan:          plan,
		SeedPaths:     mergeResolvedPaths(seedGroups...),
		CriticalPaths: mergeResolvedPaths(criticalGroups...),
	}
	for _, profile := range match.Profiles {
		profile.RefineAcquisitionPlan(&resolved, discovered, signals)
	}
	resolved.SeedPaths = mergeResolvedPaths(resolved.SeedPaths)
	resolved.CriticalPaths = mergeResolvedPaths(resolved.CriticalPaths, resolved.Plan.CriticalPaths)
	resolved.Plan.SeedPaths = mergeResolvedPaths(resolved.Plan.SeedPaths)
	resolved.Plan.CriticalPaths = mergeResolvedPaths(resolved.Plan.CriticalPaths)
	resolved.Plan.PlanBPaths = mergeResolvedPaths(resolved.Plan.PlanBPaths)
	return resolved
}

func baselineSeedPaths(discovered DiscoveredResources) []string {
	var out []string
	add := func(p string) {
		if p = normalizePath(p); p != "" {
			out = append(out, p)
		}
	}

	add("/redfish/v1/UpdateService")
	add("/redfish/v1/UpdateService/FirmwareInventory")

	for _, p := range discovered.SystemPaths {
		add(p)
		add(joinPath(p, "/Bios"))
		add(joinPath(p, "/Oem/Public"))
		add(joinPath(p, "/Oem/Public/FRU"))
		add(joinPath(p, "/Processors"))
		add(joinPath(p, "/Memory"))
		add(joinPath(p, "/EthernetInterfaces"))
		add(joinPath(p, "/NetworkInterfaces"))
		add(joinPath(p, "/PCIeDevices"))
		add(joinPath(p, "/PCIeFunctions"))
		add(joinPath(p, "/Accelerators"))
		add(joinPath(p, "/GraphicsControllers"))
		add(joinPath(p, "/Storage"))
	}
	for _, p := range discovered.ChassisPaths {
		add(p)
		add(joinPath(p, "/Oem/Public"))
		add(joinPath(p, "/Oem/Public/FRU"))
		add(joinPath(p, "/PCIeDevices"))
		add(joinPath(p, "/PCIeSlots"))
		add(joinPath(p, "/NetworkAdapters"))
		add(joinPath(p, "/Drives"))
		add(joinPath(p, "/Power"))
	}
	for _, p := range discovered.ManagerPaths {
		add(p)
		add(joinPath(p, "/EthernetInterfaces"))
		add(joinPath(p, "/NetworkProtocol"))
	}
	return mergeResolvedPaths(out)
}

func baselineCriticalPaths(discovered DiscoveredResources) []string {
	var out []string
	for _, group := range [][]string{
		{"/redfish/v1"},
		discovered.SystemPaths,
		discovered.ChassisPaths,
		discovered.ManagerPaths,
	} {
		out = append(out, group...)
	}
	return mergeResolvedPaths(out)
}

func expandScopedSuffixes(basePaths, suffixes []string) []string {
	if len(basePaths) == 0 || len(suffixes) == 0 {
		return nil
	}
	out := make([]string, 0, len(basePaths)*len(suffixes))
	for _, basePath := range basePaths {
		basePath = normalizePath(basePath)
		if basePath == "" {
			continue
		}
		for _, suffix := range suffixes {
			suffix = strings.TrimSpace(suffix)
			if suffix == "" {
				continue
			}
			out = append(out, joinPath(basePath, suffix))
		}
	}
	return mergeResolvedPaths(out)
}

func mergeResolvedPaths(groups ...[]string) []string {
	seen := make(map[string]struct{})
	out := make([]string, 0)
	for _, group := range groups {
		for _, path := range group {
			path = normalizePath(path)
			if path == "" {
				continue
			}
			if _, ok := seen[path]; ok {
				continue
			}
			seen[path] = struct{}{}
			out = append(out, path)
		}
	}
	return out
}

func normalizePath(path string) string {
	path = strings.TrimSpace(path)
	if path == "" {
		return ""
	}
	if !strings.HasPrefix(path, "/") {
		path = "/" + path
	}
	return strings.TrimRight(path, "/")
}

func joinPath(base, rel string) string {
	base = normalizePath(base)
	rel = strings.TrimSpace(rel)
	if base == "" {
		return normalizePath(rel)
	}
	if rel == "" {
		return base
	}
	if !strings.HasPrefix(rel, "/") {
		rel = "/" + rel
	}
	return normalizePath(base + rel)
}
100 internal/collector/redfishprofile/analysis.go (new file)
@@ -0,0 +1,100 @@
package redfishprofile

import "strings"

func ResolveAnalysisPlan(match MatchResult, snapshot map[string]interface{}, discovered DiscoveredResources, signals MatchSignals) ResolvedAnalysisPlan {
	plan := ResolvedAnalysisPlan{
		Match:      match,
		Directives: AnalysisDirectives{},
	}
	if match.Mode == ModeFallback {
		plan.Directives.EnableProcessorGPUFallback = true
		plan.Directives.EnableSupermicroNVMeBackplane = true
		plan.Directives.EnableProcessorGPUChassisAlias = true
		plan.Directives.EnableGenericGraphicsControllerDedup = true
		plan.Directives.EnableStorageEnclosureRecovery = true
		plan.Directives.EnableKnownStorageControllerRecovery = true
		addAnalysisLookupMode(&plan, "msi-index")
		addAnalysisLookupMode(&plan, "hgx-alias")
		addAnalysisStorageDriveCollections(&plan,
			"/Storage/IntelVROC/Drives",
			"/Storage/IntelVROC/Controllers/1/Drives",
		)
		addAnalysisStorageVolumeCollections(&plan,
			"/Storage/IntelVROC/Volumes",
			"/Storage/HA-RAID/Volumes",
			"/Storage/MRVL.HA-RAID/Volumes",
		)
		addAnalysisNote(&plan, "fallback analysis enables broad recovery directives")
	}
	for _, profile := range match.Profiles {
		profile.ApplyAnalysisDirectives(&plan.Directives, signals)
	}
	for _, profile := range match.Profiles {
		profile.RefineAnalysisPlan(&plan, snapshot, discovered, signals)
	}
	return plan
}

func snapshotHasPathPrefix(snapshot map[string]interface{}, prefix string) bool {
	prefix = normalizePath(prefix)
	if prefix == "" {
		return false
	}
	for path := range snapshot {
		if strings.HasPrefix(normalizePath(path), prefix) {
			return true
		}
	}
	return false
}

func snapshotHasPathContaining(snapshot map[string]interface{}, sub string) bool {
	sub = strings.ToLower(strings.TrimSpace(sub))
	if sub == "" {
		return false
	}
	for path := range snapshot {
		if strings.Contains(strings.ToLower(path), sub) {
			return true
		}
	}
	return false
}

func snapshotHasGPUProcessor(snapshot map[string]interface{}, systemPaths []string) bool {
	for _, systemPath := range systemPaths {
		prefix := normalizePath(joinPath(systemPath, "/Processors")) + "/"
		for path, docAny := range snapshot {
			if !strings.HasPrefix(normalizePath(path), prefix) {
				continue
			}
			doc, ok := docAny.(map[string]interface{})
			if !ok {
				continue
			}
			if strings.EqualFold(strings.TrimSpace(asString(doc["ProcessorType"])), "GPU") {
				return true
			}
		}
	}
	return false
}

func snapshotHasStorageControllerHint(snapshot map[string]interface{}, needles ...string) bool {
	for _, needle := range needles {
		if snapshotHasPathContaining(snapshot, needle) {
			return true
		}
	}
	return false
}

func asString(v interface{}) string {
	switch x := v.(type) {
	case string:
		return x
	default:
		return ""
	}
}
450 internal/collector/redfishprofile/fixture_test.go (new file)
@@ -0,0 +1,450 @@
package redfishprofile

import (
	"encoding/json"
	"os"
	"path/filepath"
	"strings"
	"testing"
)

func TestBuildAcquisitionPlan_Fixture_MSI_CG480(t *testing.T) {
	signals := loadProfileFixtureSignals(t, "msi-cg480.json")
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, discoveredResourcesFromSignals(signals), signals)

	if match.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", match.Mode)
	}
	assertProfileSelected(t, match, "msi")
	assertProfileSelected(t, match, "ami-family")
	assertProfileNotSelected(t, match, "hgx-topology")

	if plan.Tuning.PrefetchWorkers < 6 {
		t.Fatalf("expected msi prefetch worker tuning, got %d", plan.Tuning.PrefetchWorkers)
	}
	if !containsString(resolved.SeedPaths, "/redfish/v1/Chassis/GPU1") {
		t.Fatal("expected MSI chassis GPU seed path")
	}
	if !containsString(resolved.CriticalPaths, "/redfish/v1/Chassis/GPU1/Sensors") {
		t.Fatal("expected MSI GPU sensor critical path")
	}
	if !containsString(resolved.Plan.PlanBPaths, "/redfish/v1/Chassis/GPU1/Sensors") {
		t.Fatal("expected MSI GPU sensor plan-b path")
	}
	if plan.Tuning.ETABaseline.SnapshotSeconds <= 0 {
		t.Fatal("expected MSI snapshot eta baseline")
	}
	if !plan.Tuning.PostProbePolicy.EnableNumericCollectionProbe {
		t.Fatal("expected MSI fixture to inherit generic numeric post-probe policy")
	}
	if !containsString(plan.ScopedPaths.SystemSeedSuffixes, "/SimpleStorage") {
		t.Fatal("expected MSI fixture to inherit generic SimpleStorage scoped seed suffix")
	}
	if !containsString(plan.ScopedPaths.SystemCriticalSuffixes, "/Memory") {
		t.Fatal("expected MSI fixture to inherit generic system critical suffixes")
	}
	if !containsString(plan.Tuning.PrefetchPolicy.IncludeSuffixes, "/Storage") {
		t.Fatal("expected MSI fixture to inherit generic storage prefetch policy")
	}
	if !containsString(plan.CriticalPaths, "/redfish/v1/UpdateService") {
		t.Fatal("expected MSI fixture to inherit generic top-level critical path")
	}
	if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
		t.Fatal("expected MSI fixture to enable profile plan-b")
	}
}

func TestBuildAcquisitionPlan_Fixture_MSI_CG480_CopyMatchesSameProfiles(t *testing.T) {
	originalSignals := loadProfileFixtureSignals(t, "msi-cg480.json")
	copySignals := loadProfileFixtureSignals(t, "msi-cg480-copy.json")
	originalMatch := MatchProfiles(originalSignals)
	copyMatch := MatchProfiles(copySignals)
	originalPlan := BuildAcquisitionPlan(originalSignals)
	copyPlan := BuildAcquisitionPlan(copySignals)
	originalResolved := ResolveAcquisitionPlan(originalMatch, originalPlan, discoveredResourcesFromSignals(originalSignals), originalSignals)
	copyResolved := ResolveAcquisitionPlan(copyMatch, copyPlan, discoveredResourcesFromSignals(copySignals), copySignals)

	assertSameProfileNames(t, originalMatch, copyMatch)
	if originalPlan.Tuning.PrefetchWorkers != copyPlan.Tuning.PrefetchWorkers {
		t.Fatalf("expected same MSI prefetch worker tuning, got %d vs %d", originalPlan.Tuning.PrefetchWorkers, copyPlan.Tuning.PrefetchWorkers)
	}
	if containsString(originalResolved.SeedPaths, "/redfish/v1/Chassis/GPU1") != containsString(copyResolved.SeedPaths, "/redfish/v1/Chassis/GPU1") {
		t.Fatal("expected same MSI GPU chassis seed presence in both fixtures")
	}
}

func TestBuildAcquisitionPlan_Fixture_MSI_CG290(t *testing.T) {
	signals := loadProfileFixtureSignals(t, "msi-cg290.json")
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, discoveredResourcesFromSignals(signals), signals)

	if match.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", match.Mode)
	}
	assertProfileSelected(t, match, "msi")
	assertProfileSelected(t, match, "ami-family")
	assertProfileNotSelected(t, match, "hgx-topology")

	if plan.Tuning.PrefetchWorkers < 6 {
		t.Fatalf("expected MSI prefetch worker tuning, got %d", plan.Tuning.PrefetchWorkers)
	}
	if !containsString(resolved.SeedPaths, "/redfish/v1/Chassis/GPU1") {
		t.Fatal("expected MSI chassis GPU seed path")
	}
}

func TestBuildAcquisitionPlan_Fixture_Supermicro_HGX(t *testing.T) {
|
||||
signals := loadProfileFixtureSignals(t, "supermicro-hgx.json")
|
||||
match := MatchProfiles(signals)
|
||||
plan := BuildAcquisitionPlan(signals)
|
||||
discovered := discoveredResourcesFromSignals(signals)
|
||||
discovered.SystemPaths = dedupeSorted(append(discovered.SystemPaths, "/redfish/v1/Systems/HGX_Baseboard_0"))
|
||||
resolved := ResolveAcquisitionPlan(match, plan, discovered, signals)
|
||||
|
||||
if match.Mode != ModeMatched {
|
||||
t.Fatalf("expected matched mode, got %q", match.Mode)
|
||||
}
|
||||
assertProfileSelected(t, match, "supermicro")
|
||||
assertProfileSelected(t, match, "hgx-topology")
|
||||
assertProfileNotSelected(t, match, "msi")
|
||||
|
||||
if plan.Tuning.SnapshotMaxDocuments < 180000 {
|
||||
t.Fatalf("expected widened HGX snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
|
||||
}
|
||||
if plan.Tuning.NVMePostProbeEnabled == nil || *plan.Tuning.NVMePostProbeEnabled {
|
||||
t.Fatal("expected HGX fixture to disable NVMe post-probe")
|
||||
}
|
||||
if !containsString(resolved.SeedPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
|
||||
t.Fatal("expected HGX baseboard processors seed path")
|
||||
}
|
||||
if !containsString(resolved.CriticalPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
|
||||
t.Fatal("expected HGX baseboard processors critical path")
|
||||
}
|
||||
if !containsString(resolved.Plan.PlanBPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
|
||||
t.Fatal("expected HGX baseboard processors plan-b path")
|
||||
}
|
||||
if plan.Tuning.ETABaseline.SnapshotSeconds < 300 {
|
||||
t.Fatalf("expected HGX snapshot eta baseline, got %d", plan.Tuning.ETABaseline.SnapshotSeconds)
|
||||
}
|
||||
if !plan.Tuning.PostProbePolicy.EnableDirectNVMEDiskBayProbe {
|
||||
t.Fatal("expected HGX fixture to retain Supermicro direct NVMe disk bay probe policy")
|
||||
}
|
||||
if !containsString(plan.ScopedPaths.SystemCriticalSuffixes, "/Storage/IntelVROC/Drives") {
|
||||
t.Fatal("expected HGX fixture to inherit generic IntelVROC scoped critical suffix")
|
||||
}
|
||||
if !containsString(plan.ScopedPaths.ChassisCriticalSuffixes, "/Assembly") {
|
||||
t.Fatal("expected HGX fixture to inherit generic chassis critical suffixes")
|
||||
}
|
||||
if !containsString(plan.Tuning.PrefetchPolicy.ExcludeContains, "/Assembly") {
|
||||
t.Fatal("expected HGX fixture to inherit generic assembly prefetch exclusion")
|
||||
}
|
||||
if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
|
||||
t.Fatal("expected HGX fixture to enable profile plan-b")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildAcquisitionPlan_Fixture_Supermicro_OAM_NoHGX(t *testing.T) {
|
||||
signals := loadProfileFixtureSignals(t, "supermicro-oam-amd.json")
|
||||
match := MatchProfiles(signals)
|
||||
plan := BuildAcquisitionPlan(signals)
|
||||
resolved := ResolveAcquisitionPlan(match, plan, discoveredResourcesFromSignals(signals), signals)
|
||||
|
||||
if match.Mode != ModeMatched {
|
||||
t.Fatalf("expected matched mode, got %q", match.Mode)
|
||||
}
|
||||
assertProfileSelected(t, match, "supermicro")
|
||||
assertProfileNotSelected(t, match, "hgx-topology")
|
||||
assertProfileNotSelected(t, match, "msi")
|
||||
|
||||
if containsString(resolved.SeedPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
|
||||
t.Fatal("did not expect HGX baseboard processors seed path for OAM fixture")
|
||||
}
|
||||
if containsString(resolved.CriticalPaths, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
|
||||
t.Fatal("did not expect HGX baseboard processors critical path for OAM fixture")
|
||||
}
|
||||
if !containsString(resolved.CriticalPaths, "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
|
||||
t.Fatal("expected Supermicro firmware critical path")
|
||||
}
|
||||
if !containsString(resolved.Plan.PlanBPaths, "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
|
||||
t.Fatal("expected Supermicro firmware plan-b path")
|
||||
}
|
||||
if plan.Tuning.SnapshotMaxDocuments != 150000 {
|
||||
t.Fatalf("expected generic supermicro snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
|
||||
}
|
||||
if plan.Tuning.NVMePostProbeEnabled != nil {
|
||||
t.Fatal("did not expect HGX NVMe tuning for OAM fixture")
|
||||
}
|
||||
if plan.Tuning.ETABaseline.SnapshotSeconds < 180 {
|
||||
t.Fatalf("expected Supermicro snapshot eta baseline, got %d", plan.Tuning.ETABaseline.SnapshotSeconds)
|
||||
}
|
||||
if !plan.Tuning.PostProbePolicy.EnableDirectNVMEDiskBayProbe {
|
||||
t.Fatal("expected Supermicro OAM fixture to use direct NVMe disk bay probe policy")
|
||||
}
|
||||
if !plan.Tuning.PostProbePolicy.EnableNumericCollectionProbe {
|
||||
t.Fatal("expected Supermicro OAM fixture to inherit generic numeric post-probe policy")
|
||||
}
|
||||
if !containsString(plan.ScopedPaths.SystemSeedSuffixes, "/Storage/IntelVROC") {
|
||||
t.Fatal("expected Supermicro OAM fixture to inherit generic IntelVROC scoped seed suffix")
|
||||
}
|
||||
if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
|
||||
t.Fatal("expected Supermicro OAM fixture to enable profile plan-b")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildAcquisitionPlan_Fixture_Dell_R750(t *testing.T) {
|
||||
signals := loadProfileFixtureSignals(t, "dell-r750.json")
|
||||
match := MatchProfiles(signals)
|
||||
plan := BuildAcquisitionPlan(signals)
|
||||
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
|
||||
SystemPaths: []string{"/redfish/v1/Systems/System.Embedded.1"},
|
||||
ChassisPaths: []string{"/redfish/v1/Chassis/System.Embedded.1"},
|
||||
ManagerPaths: []string{"/redfish/v1/Managers/1", "/redfish/v1/Managers/iDRAC.Embedded.1"},
|
||||
}, signals)
|
||||
|
||||
if match.Mode != ModeMatched {
|
||||
t.Fatalf("expected matched mode, got %q", match.Mode)
|
||||
}
|
||||
assertProfileSelected(t, match, "dell")
|
||||
assertProfileNotSelected(t, match, "supermicro")
|
||||
assertProfileNotSelected(t, match, "hgx-topology")
|
||||
assertProfileNotSelected(t, match, "msi")
|
||||
|
||||
if !plan.Tuning.RecoveryPolicy.EnableProfilePlanB {
|
||||
t.Fatal("expected dell fixture to enable profile plan-b")
|
||||
}
|
||||
if !containsString(resolved.SeedPaths, "/redfish/v1/Managers/iDRAC.Embedded.1") {
|
||||
t.Fatal("expected Dell refinement to add iDRAC manager seed path")
|
||||
}
|
||||
if !containsString(resolved.CriticalPaths, "/redfish/v1/Managers/iDRAC.Embedded.1") {
|
||||
t.Fatal("expected Dell refinement to add iDRAC manager critical path")
|
||||
}
|
||||
directives := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, signals).Directives
|
||||
if !directives.EnableGenericGraphicsControllerDedup {
|
||||
t.Fatal("expected dell fixture to enable graphics controller dedup")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildAcquisitionPlan_Fixture_AMI_Generic(t *testing.T) {
|
||||
signals := loadProfileFixtureSignals(t, "ami-generic.json")
|
||||
match := MatchProfiles(signals)
|
||||
plan := BuildAcquisitionPlan(signals)
|
||||
|
||||
if match.Mode != ModeMatched {
|
||||
t.Fatalf("expected matched mode, got %q", match.Mode)
|
||||
}
|
||||
assertProfileSelected(t, match, "ami-family")
|
||||
assertProfileNotSelected(t, match, "msi")
|
||||
assertProfileNotSelected(t, match, "supermicro")
|
||||
assertProfileNotSelected(t, match, "dell")
|
||||
assertProfileNotSelected(t, match, "hgx-topology")
|
||||
|
||||
if plan.Tuning.PrefetchEnabled == nil || !*plan.Tuning.PrefetchEnabled {
|
||||
t.Fatal("expected ami-family fixture to force prefetch enabled")
|
||||
}
|
||||
if !containsString(plan.SeedPaths, "/redfish/v1/Oem/Ami") {
|
||||
t.Fatal("expected ami-family fixture seed path /redfish/v1/Oem/Ami")
|
||||
}
|
||||
if !containsString(plan.SeedPaths, "/redfish/v1/Oem/Ami/InventoryData/Status") {
|
||||
t.Fatal("expected ami-family fixture seed path /redfish/v1/Oem/Ami/InventoryData/Status")
|
||||
}
|
||||
if !containsString(plan.CriticalPaths, "/redfish/v1/UpdateService") {
|
||||
t.Fatal("expected ami-family fixture to inherit generic critical path")
|
||||
}
|
||||
|
||||
directives := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, signals).Directives
|
||||
if !directives.EnableGenericGraphicsControllerDedup {
|
||||
t.Fatal("expected ami-family fixture to enable graphics controller dedup")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildAcquisitionPlan_Fixture_UnknownVendor(t *testing.T) {
|
||||
signals := loadProfileFixtureSignals(t, "unknown-vendor.json")
|
||||
match := MatchProfiles(signals)
|
||||
plan := BuildAcquisitionPlan(signals)
|
||||
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
|
||||
SystemPaths: []string{"/redfish/v1/Systems/1"},
|
||||
ChassisPaths: []string{"/redfish/v1/Chassis/1"},
|
||||
ManagerPaths: []string{"/redfish/v1/Managers/1"},
|
||||
}, signals)
|
||||
|
||||
if match.Mode != ModeFallback {
|
||||
t.Fatalf("expected fallback mode for unknown vendor, got %q", match.Mode)
|
||||
}
|
||||
if len(match.Profiles) == 0 {
|
||||
t.Fatal("expected fallback to aggregate profiles")
|
||||
}
|
||||
for _, profile := range match.Profiles {
|
||||
if !profile.SafeForFallback() {
|
||||
t.Fatalf("fallback mode included non-safe profile %q", profile.Name())
|
||||
}
|
||||
}
|
||||
|
||||
if plan.Tuning.SnapshotMaxDocuments < 180000 {
|
||||
t.Fatalf("expected fallback to widen snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
|
||||
}
|
||||
if plan.Tuning.PrefetchEnabled == nil || !*plan.Tuning.PrefetchEnabled {
|
||||
t.Fatal("expected fallback fixture to force prefetch enabled")
|
||||
}
|
||||
if !containsString(resolved.CriticalPaths, "/redfish/v1/Systems/1") {
|
||||
t.Fatal("expected fallback resolved critical paths to include discovered system")
|
||||
}
|
||||
|
||||
analysisPlan := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, signals)
|
||||
if !analysisPlan.Directives.EnableProcessorGPUFallback {
|
||||
t.Fatal("expected fallback fixture to enable processor GPU fallback")
|
||||
}
|
||||
if !analysisPlan.Directives.EnableStorageEnclosureRecovery {
|
||||
t.Fatal("expected fallback fixture to enable storage enclosure recovery")
|
||||
}
|
||||
if !analysisPlan.Directives.EnableGenericGraphicsControllerDedup {
|
||||
t.Fatal("expected fallback fixture to enable graphics controller dedup")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildAcquisitionPlan_Fixture_xFusion_G5500V7(t *testing.T) {
|
||||
signals := loadProfileFixtureSignals(t, "xfusion-g5500v7.json")
|
||||
match := MatchProfiles(signals)
|
||||
plan := BuildAcquisitionPlan(signals)
|
||||
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
|
||||
SystemPaths: []string{"/redfish/v1/Systems/1"},
|
||||
ChassisPaths: []string{"/redfish/v1/Chassis/1"},
|
||||
ManagerPaths: []string{"/redfish/v1/Managers/1"},
|
||||
}, signals)
|
||||
|
||||
if match.Mode != ModeMatched {
|
||||
t.Fatalf("expected matched mode for xFusion, got %q", match.Mode)
|
||||
}
|
||||
assertProfileSelected(t, match, "xfusion")
|
||||
assertProfileNotSelected(t, match, "supermicro")
|
||||
assertProfileNotSelected(t, match, "hgx-topology")
|
||||
assertProfileNotSelected(t, match, "msi")
|
||||
assertProfileNotSelected(t, match, "dell")
|
||||
|
||||
if plan.Tuning.SnapshotMaxDocuments > 150000 {
|
||||
t.Fatalf("expected xfusion snapshot cap <= 150000, got %d", plan.Tuning.SnapshotMaxDocuments)
|
||||
}
|
||||
if plan.Tuning.PrefetchEnabled == nil || !*plan.Tuning.PrefetchEnabled {
|
||||
t.Fatal("expected xfusion fixture to enable prefetch")
|
||||
}
|
||||
if plan.Tuning.ETABaseline.SnapshotSeconds <= 0 {
|
||||
t.Fatal("expected xfusion snapshot eta baseline")
|
||||
}
|
||||
if !containsString(resolved.CriticalPaths, "/redfish/v1/Systems/1") {
|
||||
t.Fatal("expected system path in critical paths")
|
||||
}
|
||||
|
||||
analysisPlan := ResolveAnalysisPlan(match, map[string]interface{}{
|
||||
"/redfish/v1/Systems/1/Processors/Gpu1": map[string]interface{}{"ProcessorType": "GPU"},
|
||||
}, DiscoveredResources{
|
||||
SystemPaths: []string{"/redfish/v1/Systems/1"},
|
||||
}, signals)
|
||||
if !analysisPlan.Directives.EnableProcessorGPUFallback {
|
||||
t.Fatal("expected xfusion analysis to enable processor GPU fallback when GPU processors present")
|
||||
}
|
||||
if !analysisPlan.Directives.EnableGenericGraphicsControllerDedup {
|
||||
t.Fatal("expected xfusion analysis to enable graphics controller dedup")
|
||||
}
|
||||
}
|
||||
|
||||
func loadProfileFixtureSignals(t *testing.T, fixtureName string) MatchSignals {
|
||||
t.Helper()
|
||||
path := filepath.Join("testdata", fixtureName)
|
||||
data, err := os.ReadFile(path)
|
||||
if err != nil {
|
||||
t.Fatalf("read fixture %s: %v", path, err)
|
||||
}
|
||||
var signals MatchSignals
|
||||
if err := json.Unmarshal(data, &signals); err != nil {
|
||||
t.Fatalf("decode fixture %s: %v", path, err)
|
||||
}
|
||||
return normalizeSignals(signals)
|
||||
}
|
||||
|
||||
func assertProfileSelected(t *testing.T, match MatchResult, want string) {
|
||||
t.Helper()
|
||||
for _, profile := range match.Profiles {
|
||||
if profile.Name() == want {
|
||||
return
|
||||
}
|
||||
}
|
||||
t.Fatalf("expected profile %q in %v", want, profileNames(match))
|
||||
}
|
||||
|
||||
func assertProfileNotSelected(t *testing.T, match MatchResult, want string) {
|
||||
t.Helper()
|
||||
for _, profile := range match.Profiles {
|
||||
if profile.Name() == want {
|
||||
t.Fatalf("did not expect profile %q in %v", want, profileNames(match))
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func profileNames(match MatchResult) []string {
|
||||
out := make([]string, 0, len(match.Profiles))
|
||||
for _, profile := range match.Profiles {
|
||||
out = append(out, profile.Name())
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
func assertSameProfileNames(t *testing.T, left, right MatchResult) {
|
||||
t.Helper()
|
||||
leftNames := profileNames(left)
|
||||
rightNames := profileNames(right)
|
||||
if len(leftNames) != len(rightNames) {
|
||||
t.Fatalf("profile stack size differs: %v vs %v", leftNames, rightNames)
|
||||
}
|
||||
for i := range leftNames {
|
||||
if leftNames[i] != rightNames[i] {
|
||||
t.Fatalf("profile stack differs: %v vs %v", leftNames, rightNames)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func containsString(items []string, want string) bool {
|
||||
for _, item := range items {
|
||||
if item == want {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func discoveredResourcesFromSignals(signals MatchSignals) DiscoveredResources {
|
||||
var discovered DiscoveredResources
|
||||
for _, hint := range signals.ResourceHints {
|
||||
memberPath := discoveredMemberPath(hint)
|
||||
switch {
|
||||
case strings.HasPrefix(memberPath, "/redfish/v1/Systems/"):
|
||||
discovered.SystemPaths = append(discovered.SystemPaths, memberPath)
|
||||
case strings.HasPrefix(memberPath, "/redfish/v1/Chassis/"):
|
||||
discovered.ChassisPaths = append(discovered.ChassisPaths, memberPath)
|
||||
case strings.HasPrefix(memberPath, "/redfish/v1/Managers/"):
|
||||
discovered.ManagerPaths = append(discovered.ManagerPaths, memberPath)
|
||||
}
|
||||
}
|
||||
discovered.SystemPaths = dedupeSorted(discovered.SystemPaths)
|
||||
discovered.ChassisPaths = dedupeSorted(discovered.ChassisPaths)
|
||||
discovered.ManagerPaths = dedupeSorted(discovered.ManagerPaths)
|
||||
return discovered
|
||||
}
|
||||
|
||||
func discoveredMemberPath(path string) string {
|
||||
path = strings.TrimSpace(path)
|
||||
if path == "" {
|
||||
return ""
|
||||
}
|
||||
parts := strings.Split(strings.Trim(path, "/"), "/")
|
||||
if len(parts) < 4 || parts[0] != "redfish" || parts[1] != "v1" {
|
||||
return ""
|
||||
}
|
||||
switch parts[2] {
|
||||
case "Systems", "Chassis", "Managers":
|
||||
return "/" + strings.Join(parts[:4], "/")
|
||||
default:
|
||||
return ""
|
||||
}
|
||||
}
|
||||
122
internal/collector/redfishprofile/matcher.go
Normal file
@@ -0,0 +1,122 @@
package redfishprofile

import (
	"sort"

	"git.mchus.pro/mchus/logpile/internal/models"
)

const (
	ModeMatched  = "matched"
	ModeFallback = "fallback"
)

func MatchProfiles(signals MatchSignals) MatchResult {
	type scored struct {
		profile Profile
		score   int
	}
	builtins := BuiltinProfiles()
	candidates := make([]scored, 0, len(builtins))
	allScores := make([]ProfileScore, 0, len(builtins))
	for _, profile := range builtins {
		score := profile.Match(signals)
		allScores = append(allScores, ProfileScore{
			Name:     profile.Name(),
			Score:    score,
			Priority: profile.Priority(),
		})
		if score <= 0 {
			continue
		}
		candidates = append(candidates, scored{profile: profile, score: score})
	}
	sort.Slice(allScores, func(i, j int) bool {
		if allScores[i].Score == allScores[j].Score {
			if allScores[i].Priority == allScores[j].Priority {
				return allScores[i].Name < allScores[j].Name
			}
			return allScores[i].Priority < allScores[j].Priority
		}
		return allScores[i].Score > allScores[j].Score
	})
	sort.Slice(candidates, func(i, j int) bool {
		if candidates[i].score == candidates[j].score {
			return candidates[i].profile.Priority() < candidates[j].profile.Priority()
		}
		return candidates[i].score > candidates[j].score
	})
	if len(candidates) == 0 || candidates[0].score < 60 {
		profiles := make([]Profile, 0, len(builtins))
		active := make(map[string]struct{}, len(builtins))
		for _, profile := range builtins {
			if profile.SafeForFallback() {
				profiles = append(profiles, profile)
				active[profile.Name()] = struct{}{}
			}
		}
		sortProfiles(profiles)
		for i := range allScores {
			_, ok := active[allScores[i].Name]
			allScores[i].Active = ok
		}
		return MatchResult{Mode: ModeFallback, Profiles: profiles, Scores: allScores}
	}
	profiles := make([]Profile, 0, len(candidates))
	seen := make(map[string]struct{}, len(candidates))
	for _, candidate := range candidates {
		name := candidate.profile.Name()
		if _, ok := seen[name]; ok {
			continue
		}
		seen[name] = struct{}{}
		profiles = append(profiles, candidate.profile)
	}
	sortProfiles(profiles)
	for i := range allScores {
		_, ok := seen[allScores[i].Name]
		allScores[i].Active = ok
	}
	return MatchResult{Mode: ModeMatched, Profiles: profiles, Scores: allScores}
}

func BuildAcquisitionPlan(signals MatchSignals) AcquisitionPlan {
	match := MatchProfiles(signals)
	plan := AcquisitionPlan{Mode: match.Mode}
	for _, profile := range match.Profiles {
		plan.Profiles = append(plan.Profiles, profile.Name())
		profile.ExtendAcquisitionPlan(&plan, signals)
	}
	plan.Profiles = dedupeSorted(plan.Profiles)
	plan.SeedPaths = dedupeSorted(plan.SeedPaths)
	plan.CriticalPaths = dedupeSorted(plan.CriticalPaths)
	plan.PlanBPaths = dedupeSorted(plan.PlanBPaths)
	plan.Notes = dedupeSorted(plan.Notes)
	if plan.Mode == ModeFallback {
		ensureSnapshotMaxDocuments(&plan, 180000)
		ensurePrefetchEnabled(&plan, true)
		addPlanNote(&plan, "fallback acquisition expands safe profile probes")
	}
	return plan
}

func ApplyAnalysisProfiles(result *models.AnalysisResult, snapshot map[string]interface{}, signals MatchSignals) MatchResult {
	match := MatchProfiles(signals)
	for _, profile := range match.Profiles {
		profile.PostAnalyze(result, snapshot, signals)
	}
	return match
}

func BuildAnalysisDirectives(match MatchResult) AnalysisDirectives {
	return ResolveAnalysisPlan(match, nil, DiscoveredResources{}, MatchSignals{}).Directives
}

func sortProfiles(profiles []Profile) {
	sort.Slice(profiles, func(i, j int) bool {
		if profiles[i].Priority() == profiles[j].Priority() {
			return profiles[i].Name() < profiles[j].Name()
		}
		return profiles[i].Priority() < profiles[j].Priority()
	})
}
431
internal/collector/redfishprofile/matcher_test.go
Normal file
@@ -0,0 +1,431 @@
|
||||
package redfishprofile
|
||||
|
||||
import (
|
||||
"strings"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestMatchProfiles_UnknownVendorFallsBackToAggregateProfiles(t *testing.T) {
|
||||
match := MatchProfiles(MatchSignals{
|
||||
ServiceRootProduct: "Redfish Server",
|
||||
})
|
||||
if match.Mode != ModeFallback {
|
||||
t.Fatalf("expected fallback mode, got %q", match.Mode)
|
||||
}
|
||||
if len(match.Profiles) < 2 {
|
||||
t.Fatalf("expected aggregated fallback profiles, got %d", len(match.Profiles))
|
||||
}
|
||||
}
|
||||
|
||||
func TestMatchProfiles_MSISelectsMatchedMode(t *testing.T) {
|
||||
match := MatchProfiles(MatchSignals{
|
||||
SystemManufacturer: "Micro-Star International Co., Ltd.",
|
||||
ResourceHints: []string{"/redfish/v1/Chassis/GPU1"},
|
||||
})
|
||||
if match.Mode != ModeMatched {
|
||||
t.Fatalf("expected matched mode, got %q", match.Mode)
|
||||
}
|
||||
found := false
|
||||
for _, profile := range match.Profiles {
|
||||
if profile.Name() == "msi" {
|
||||
found = true
|
||||
break
|
||||
}
|
||||
}
|
||||
if !found {
|
||||
t.Fatal("expected msi profile to be selected")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildAcquisitionPlan_FallbackIncludesProfileNotes(t *testing.T) {
|
||||
plan := BuildAcquisitionPlan(MatchSignals{
|
||||
ServiceRootVendor: "AMI",
|
||||
})
|
||||
if len(plan.Profiles) == 0 {
|
||||
t.Fatal("expected acquisition plan profiles")
|
||||
}
|
||||
if len(plan.Notes) == 0 {
|
||||
t.Fatal("expected acquisition plan notes")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildAcquisitionPlan_FallbackAddsBroadCrawlTuning(t *testing.T) {
|
||||
plan := BuildAcquisitionPlan(MatchSignals{
|
||||
ServiceRootProduct: "Unknown Redfish",
|
||||
})
|
||||
if plan.Mode != ModeFallback {
|
||||
t.Fatalf("expected fallback mode, got %q", plan.Mode)
|
||||
}
|
||||
if plan.Tuning.SnapshotMaxDocuments < 180000 {
|
||||
t.Fatalf("expected widened snapshot cap, got %d", plan.Tuning.SnapshotMaxDocuments)
|
||||
}
|
||||
if plan.Tuning.PrefetchEnabled == nil || !*plan.Tuning.PrefetchEnabled {
|
||||
t.Fatal("expected fallback to force prefetch enabled")
|
||||
}
|
||||
if !plan.Tuning.RecoveryPolicy.EnableCriticalCollectionMemberRetry {
|
||||
t.Fatal("expected fallback to inherit critical member retry recovery")
|
||||
}
|
||||
if !plan.Tuning.RecoveryPolicy.EnableCriticalSlowProbe {
|
||||
t.Fatal("expected fallback to inherit critical slow probe recovery")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBuildAcquisitionPlan_HGXDisablesNVMePostProbe(t *testing.T) {
|
||||
plan := BuildAcquisitionPlan(MatchSignals{
|
||||
SystemModel: "HGX B200",
|
||||
ResourceHints: []string{"/redfish/v1/Systems/HGX_Baseboard_0"},
|
||||
})
|
||||
if plan.Mode != ModeMatched {
|
||||
t.Fatalf("expected matched mode, got %q", plan.Mode)
|
||||
}
|
||||
if plan.Tuning.NVMePostProbeEnabled == nil || *plan.Tuning.NVMePostProbeEnabled {
|
||||
t.Fatal("expected hgx profile to disable NVMe post-probe")
|
||||
}
|
||||
}
|
||||
|
||||
func TestResolveAcquisitionPlan_ExpandsScopedPaths(t *testing.T) {
|
||||
signals := MatchSignals{}
|
||||
match := MatchProfiles(signals)
|
||||
plan := BuildAcquisitionPlan(signals)
|
||||
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
|
||||
SystemPaths: []string{"/redfish/v1/Systems/1", "/redfish/v1/Systems/2"},
|
||||
}, signals)
|
||||
joined := joinResolvedPaths(resolved.SeedPaths)
|
||||
for _, wanted := range []string{
|
||||
"/redfish/v1/Systems/1/SimpleStorage",
|
||||
"/redfish/v1/Systems/1/Storage/IntelVROC",
|
||||
"/redfish/v1/Systems/2/SimpleStorage",
|
||||
"/redfish/v1/Systems/2/Storage/IntelVROC",
|
||||
} {
|
||||
if !containsJoinedPath(joined, wanted) {
|
||||
t.Fatalf("expected resolved seed path %q", wanted)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestResolveAcquisitionPlan_CriticalBaselineIsShapedByProfiles(t *testing.T) {
|
||||
signals := MatchSignals{}
|
||||
match := MatchProfiles(signals)
|
||||
plan := BuildAcquisitionPlan(signals)
|
||||
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
|
||||
SystemPaths: []string{"/redfish/v1/Systems/1"},
|
||||
ChassisPaths: []string{"/redfish/v1/Chassis/1"},
|
||||
ManagerPaths: []string{"/redfish/v1/Managers/1"},
|
||||
}, signals)
|
||||
joined := joinResolvedPaths(resolved.CriticalPaths)
|
||||
for _, wanted := range []string{
|
||||
"/redfish/v1",
|
||||
"/redfish/v1/Systems/1",
|
||||
"/redfish/v1/Systems/1/Memory",
|
||||
"/redfish/v1/Chassis/1/Assembly",
|
||||
"/redfish/v1/Managers/1/NetworkProtocol",
|
||||
"/redfish/v1/UpdateService",
|
||||
} {
|
||||
if !containsJoinedPath(joined, wanted) {
|
||||
t.Fatalf("expected resolved critical path %q", wanted)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestResolveAcquisitionPlan_FallbackAppendsPlanBToSeeds(t *testing.T) {
|
||||
signals := MatchSignals{ServiceRootProduct: "Unknown Redfish"}
|
||||
match := MatchProfiles(signals)
|
||||
plan := BuildAcquisitionPlan(signals)
|
||||
if plan.Mode != ModeFallback {
|
||||
t.Fatalf("expected fallback mode, got %q", plan.Mode)
|
||||
}
|
||||
plan.PlanBPaths = append(plan.PlanBPaths, "/redfish/v1/Systems/1/Oem/TestPlanB")
|
||||
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
|
||||
SystemPaths: []string{"/redfish/v1/Systems/1"},
|
||||
}, signals)
|
||||
if !containsJoinedPath(joinResolvedPaths(resolved.SeedPaths), "/redfish/v1/Systems/1/Oem/TestPlanB") {
|
||||
t.Fatal("expected fallback resolved seeds to include plan-b path")
|
||||
}
|
||||
}
|
||||
|
||||
func TestResolveAcquisitionPlan_MSIRefinesDiscoveredGPUChassis(t *testing.T) {
|
||||
signals := MatchSignals{
|
||||
SystemManufacturer: "Micro-Star International Co., Ltd.",
|
||||
ResourceHints: []string{"/redfish/v1/Chassis/GPU1", "/redfish/v1/Chassis/GPU4/Sensors"},
|
||||
}
|
||||
match := MatchProfiles(signals)
|
||||
plan := BuildAcquisitionPlan(signals)
|
||||
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
|
||||
ChassisPaths: []string{"/redfish/v1/Chassis/1", "/redfish/v1/Chassis/GPU1", "/redfish/v1/Chassis/GPU4"},
|
||||
}, signals)
|
||||
joinedSeeds := joinResolvedPaths(resolved.SeedPaths)
|
||||
joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
|
||||
if !containsJoinedPath(joinedSeeds, "/redfish/v1/Chassis/GPU1") || !containsJoinedPath(joinedSeeds, "/redfish/v1/Chassis/GPU4") {
|
||||
t.Fatal("expected MSI refinement to add discovered GPU chassis seed paths")
|
||||
}
|
||||
if containsJoinedPath(joinedSeeds, "/redfish/v1/Chassis/GPU2") {
|
||||
t.Fatal("did not expect undiscovered MSI GPU chassis in resolved seeds")
|
||||
}
|
||||
if !containsJoinedPath(joinedCritical, "/redfish/v1/Chassis/GPU1/Sensors") || !containsJoinedPath(joinedCritical, "/redfish/v1/Chassis/GPU4/Sensors") {
|
||||
t.Fatal("expected MSI refinement to add discovered GPU sensor critical paths")
|
||||
}
|
||||
if containsJoinedPath(joinedCritical, "/redfish/v1/Chassis/GPU3/Sensors") {
|
||||
t.Fatal("did not expect undiscovered MSI GPU sensor critical path")
|
||||
}
|
||||
}
|
||||
|
||||
func TestResolveAcquisitionPlan_HGXRefinesDiscoveredBaseboardSystems(t *testing.T) {
|
||||
signals := MatchSignals{
|
||||
SystemManufacturer: "Supermicro",
|
||||
SystemModel: "SYS-821GE-TNHR",
|
||||
ChassisModel: "HGX B200",
|
||||
ResourceHints: []string{
|
||||
"/redfish/v1/Systems/HGX_Baseboard_0",
|
||||
"/redfish/v1/Systems/HGX_Baseboard_0/Processors",
|
||||
"/redfish/v1/Systems/1",
|
||||
},
|
||||
}
|
||||
match := MatchProfiles(signals)
|
||||
plan := BuildAcquisitionPlan(signals)
|
||||
resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
|
||||
SystemPaths: []string{"/redfish/v1/Systems/1", "/redfish/v1/Systems/HGX_Baseboard_0"},
|
||||
}, signals)
|
||||
joinedSeeds := joinResolvedPaths(resolved.SeedPaths)
|
||||
joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
|
||||
if !containsJoinedPath(joinedSeeds, "/redfish/v1/Systems/HGX_Baseboard_0") || !containsJoinedPath(joinedSeeds, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
|
||||
t.Fatal("expected HGX refinement to add discovered baseboard system paths")
|
||||
}
|
||||
if !containsJoinedPath(joinedCritical, "/redfish/v1/Systems/HGX_Baseboard_0") || !containsJoinedPath(joinedCritical, "/redfish/v1/Systems/HGX_Baseboard_0/Processors") {
|
||||
t.Fatal("expected HGX refinement to add discovered baseboard critical paths")
|
||||
}
|
||||
if containsJoinedPath(joinedSeeds, "/redfish/v1/Systems/HGX_Baseboard_1") {
|
||||
t.Fatal("did not expect undiscovered HGX baseboard system path")
|
||||
}
|
||||
}
|
||||
|
||||
func TestResolveAcquisitionPlan_SupermicroRefinesFirmwareInventoryFromHint(t *testing.T) {
|
||||
signals := MatchSignals{
|
||||
SystemManufacturer: "Supermicro",
|
||||
ResourceHints: []string{
|
||||
"/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory",
|
||||
"/redfish/v1/Managers/1/Oem/Supermicro/FanMode",
|
||||
},
|
||||
}
|
||||
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
		ManagerPaths: []string{"/redfish/v1/Managers/1"},
	}, signals)
	joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
	if !containsJoinedPath(joinedCritical, "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
		t.Fatal("expected Supermicro refinement to add firmware inventory critical path")
	}
	if !containsJoinedPath(joinResolvedPaths(resolved.Plan.PlanBPaths), "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory") {
		t.Fatal("expected Supermicro refinement to add firmware inventory plan-b path")
	}
}

func TestResolveAcquisitionPlan_DellRefinesDiscoveredIDRACManager(t *testing.T) {
	signals := MatchSignals{
		SystemManufacturer: "Dell Inc.",
		ServiceRootProduct: "iDRAC Redfish Service",
	}
	match := MatchProfiles(signals)
	plan := BuildAcquisitionPlan(signals)
	resolved := ResolveAcquisitionPlan(match, plan, DiscoveredResources{
		ManagerPaths: []string{"/redfish/v1/Managers/1", "/redfish/v1/Managers/iDRAC.Embedded.1"},
	}, signals)
	joinedSeeds := joinResolvedPaths(resolved.SeedPaths)
	joinedCritical := joinResolvedPaths(resolved.CriticalPaths)
	if !containsJoinedPath(joinedSeeds, "/redfish/v1/Managers/iDRAC.Embedded.1") {
		t.Fatal("expected Dell refinement to add discovered iDRAC manager seed path")
	}
	if !containsJoinedPath(joinedCritical, "/redfish/v1/Managers/iDRAC.Embedded.1") {
		t.Fatal("expected Dell refinement to add discovered iDRAC manager critical path")
	}
}

func TestBuildAnalysisDirectives_SupermicroEnablesVendorStorageFallbacks(t *testing.T) {
	signals := MatchSignals{
		SystemManufacturer: "Supermicro",
		SystemModel:        "SYS-821GE",
	}
	match := MatchProfiles(signals)
	plan := ResolveAnalysisPlan(match, map[string]interface{}{
		"/redfish/v1/Chassis/NVMeSSD.1.StorageBackplane/Drives": map[string]interface{}{},
	}, DiscoveredResources{}, signals)
	directives := plan.Directives
	if !directives.EnableSupermicroNVMeBackplane {
		t.Fatal("expected supermicro nvme backplane fallback")
	}
}

func joinResolvedPaths(paths []string) string {
	return "\n" + strings.Join(paths, "\n") + "\n"
}

func containsJoinedPath(joined, want string) bool {
	return strings.Contains(joined, "\n"+want+"\n")
}
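The two helpers above rely on newline sentinels: wrapping the joined list in `"\n"` turns a plain `strings.Contains` into an exact whole-line match, so a path that is merely a prefix of another entry cannot false-positive. A minimal standalone sketch (local copies of the helpers):

```go
package main

import (
	"fmt"
	"strings"
)

// joinPaths wraps the list in "\n" sentinels so every entry is delimited
// on both sides.
func joinPaths(paths []string) string {
	return "\n" + strings.Join(paths, "\n") + "\n"
}

// containsPath only matches a complete line, never a substring of one.
func containsPath(joined, want string) bool {
	return strings.Contains(joined, "\n"+want+"\n")
}

func main() {
	joined := joinPaths([]string{"/redfish/v1/Managers/10", "/redfish/v1/Managers/1"})
	fmt.Println(containsPath(joined, "/redfish/v1/Managers/1")) // true: exact entry
	fmt.Println(containsPath(joined, "/redfish/v1/Managers/"))  // false: prefix only
}
```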
func TestBuildAnalysisDirectives_HGXEnablesGPUFallbacks(t *testing.T) {
	signals := MatchSignals{
		SystemManufacturer: "Supermicro",
		SystemModel:        "SYS-821GE-TNHR",
		ChassisModel:       "HGX B200",
		ResourceHints:      []string{"/redfish/v1/Systems/HGX_Baseboard_0", "/redfish/v1/Chassis/HGX_Chassis_0/PCIeDevices/GPU_SXM_1"},
	}
	match := MatchProfiles(signals)
	plan := ResolveAnalysisPlan(match, map[string]interface{}{
		"/redfish/v1/Systems/HGX_Baseboard_0/Processors/GPU_SXM_1": map[string]interface{}{"ProcessorType": "GPU"},
		"/redfish/v1/Chassis/HGX_Chassis_0/PCIeDevices/GPU_SXM_1":  map[string]interface{}{},
	}, DiscoveredResources{
		SystemPaths: []string{"/redfish/v1/Systems/HGX_Baseboard_0"},
	}, signals)
	directives := plan.Directives
	if !directives.EnableProcessorGPUFallback {
		t.Fatal("expected processor GPU fallback for hgx profile")
	}
	if !directives.EnableProcessorGPUChassisAlias {
		t.Fatal("expected processor GPU chassis alias resolution for hgx profile")
	}
	if !directives.EnableGenericGraphicsControllerDedup {
		t.Fatal("expected graphics-controller dedup for hgx profile")
	}
}

func TestBuildAnalysisDirectives_MSIEnablesMSIChassisLookup(t *testing.T) {
	signals := MatchSignals{
		SystemManufacturer: "Micro-Star International Co., Ltd.",
	}
	match := MatchProfiles(signals)
	plan := ResolveAnalysisPlan(match, map[string]interface{}{
		"/redfish/v1/Systems/1/Processors/GPU1": map[string]interface{}{"ProcessorType": "GPU"},
		"/redfish/v1/Chassis/GPU1":              map[string]interface{}{},
	}, DiscoveredResources{
		SystemPaths:  []string{"/redfish/v1/Systems/1"},
		ChassisPaths: []string{"/redfish/v1/Chassis/GPU1"},
	}, signals)
	directives := plan.Directives
	if !directives.EnableMSIProcessorGPUChassisLookup {
		t.Fatal("expected MSI processor GPU chassis lookup")
	}
}

func TestBuildAnalysisDirectives_SupermicroEnablesStorageRecovery(t *testing.T) {
	signals := MatchSignals{
		SystemManufacturer: "Supermicro",
	}
	match := MatchProfiles(signals)
	plan := ResolveAnalysisPlan(match, map[string]interface{}{
		"/redfish/v1/Chassis/1/Drives":                   map[string]interface{}{},
		"/redfish/v1/Systems/1/Storage/IntelVROC":        map[string]interface{}{},
		"/redfish/v1/Systems/1/Storage/IntelVROC/Drives": map[string]interface{}{},
	}, DiscoveredResources{}, signals)
	directives := plan.Directives
	if !directives.EnableStorageEnclosureRecovery {
		t.Fatal("expected storage enclosure recovery for supermicro")
	}
	if !directives.EnableKnownStorageControllerRecovery {
		t.Fatal("expected known storage controller recovery for supermicro")
	}
}

func TestMatchProfiles_LenovoXCCSelectsMatchedModeAndExcludesSensors(t *testing.T) {
	match := MatchProfiles(MatchSignals{
		SystemManufacturer:  "Lenovo",
		ChassisManufacturer: "Lenovo",
		OEMNamespaces:       []string{"Lenovo"},
	})
	if match.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", match.Mode)
	}
	found := false
	for _, profile := range match.Profiles {
		if profile.Name() == "lenovo" {
			found = true
			break
		}
	}
	if !found {
		t.Fatal("expected lenovo profile to be selected")
	}

	// Verify the acquisition plan excludes noisy Lenovo-specific snapshot paths.
	plan := BuildAcquisitionPlan(MatchSignals{
		SystemManufacturer:  "Lenovo",
		ChassisManufacturer: "Lenovo",
		OEMNamespaces:       []string{"Lenovo"},
	})
	wantExcluded := []string{"/Sensors/", "/Oem/Lenovo/LEDs/", "/Oem/Lenovo/Slots/"}
	for _, want := range wantExcluded {
		found := false
		for _, ex := range plan.Tuning.SnapshotExcludeContains {
			if ex == want {
				found = true
				break
			}
		}
		if !found {
			t.Errorf("expected SnapshotExcludeContains to include %q, got %v", want, plan.Tuning.SnapshotExcludeContains)
		}
	}
}

func TestMatchProfiles_OrderingIsDeterministic(t *testing.T) {
	signals := MatchSignals{
		SystemManufacturer: "Micro-Star International Co., Ltd.",
		ResourceHints:      []string{"/redfish/v1/Chassis/GPU1"},
	}
	first := MatchProfiles(signals)
	second := MatchProfiles(signals)
	if len(first.Profiles) != len(second.Profiles) {
		t.Fatalf("profile stack size differs across calls: %d vs %d", len(first.Profiles), len(second.Profiles))
	}
	for i := range first.Profiles {
		if first.Profiles[i].Name() != second.Profiles[i].Name() {
			t.Fatalf("profile ordering differs at index %d: %q vs %q", i, first.Profiles[i].Name(), second.Profiles[i].Name())
		}
	}
}

func TestMatchProfiles_FallbackOrderingIsDeterministic(t *testing.T) {
	signals := MatchSignals{ServiceRootProduct: "Unknown Redfish"}
	first := MatchProfiles(signals)
	second := MatchProfiles(signals)
	if first.Mode != ModeFallback || second.Mode != ModeFallback {
		t.Fatalf("expected fallback mode in both calls")
	}
	if len(first.Profiles) != len(second.Profiles) {
		t.Fatalf("fallback profile stack size differs: %d vs %d", len(first.Profiles), len(second.Profiles))
	}
	for i := range first.Profiles {
		if first.Profiles[i].Name() != second.Profiles[i].Name() {
			t.Fatalf("fallback profile ordering differs at index %d: %q vs %q", i, first.Profiles[i].Name(), second.Profiles[i].Name())
		}
	}
}

func TestMatchProfiles_FallbackOnlySelectsSafeProfiles(t *testing.T) {
	match := MatchProfiles(MatchSignals{ServiceRootProduct: "Unknown Generic Redfish Server"})
	if match.Mode != ModeFallback {
		t.Fatalf("expected fallback mode, got %q", match.Mode)
	}
	for _, profile := range match.Profiles {
		if !profile.SafeForFallback() {
			t.Fatalf("fallback mode included non-safe profile %q", profile.Name())
		}
	}
}

func TestBuildAnalysisDirectives_GenericMatchedKeepsFallbacksDisabled(t *testing.T) {
	match := MatchResult{
		Mode:     ModeMatched,
		Profiles: []Profile{genericProfile()},
	}
	directives := ResolveAnalysisPlan(match, nil, DiscoveredResources{}, MatchSignals{}).Directives
	if directives.EnableProcessorGPUFallback {
		t.Fatal("did not expect processor GPU fallback for generic matched profile")
	}
	if directives.EnableSupermicroNVMeBackplane {
		t.Fatal("did not expect supermicro nvme fallback for generic matched profile")
	}
	if directives.EnableGenericGraphicsControllerDedup {
		t.Fatal("did not expect generic graphics-controller dedup for generic matched profile")
	}
}
33
internal/collector/redfishprofile/profile_ami.go
Normal file
@@ -0,0 +1,33 @@
package redfishprofile

func amiProfile() Profile {
	return staticProfile{
		name:            "ami-family",
		priority:        10,
		safeForFallback: true,
		matchFn: func(s MatchSignals) int {
			score := 0
			if containsFold(s.ServiceRootVendor, "ami") || containsFold(s.ServiceRootProduct, "ami") {
				score += 70
			}
			for _, ns := range s.OEMNamespaces {
				if containsFold(ns, "ami") {
					score += 30
					break
				}
			}
			return min(score, 100)
		},
		extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
			addPlanPaths(&plan.SeedPaths,
				"/redfish/v1/Oem/Ami",
				"/redfish/v1/Oem/Ami/InventoryData/Status",
			)
			ensurePrefetchEnabled(plan, true)
			addPlanNote(plan, "ami-family acquisition extensions enabled")
		},
		applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
			d.EnableGenericGraphicsControllerDedup = true
		},
	}
}
45
internal/collector/redfishprofile/profile_dell.go
Normal file
@@ -0,0 +1,45 @@
package redfishprofile

func dellProfile() Profile {
	return staticProfile{
		name:            "dell",
		priority:        20,
		safeForFallback: true,
		matchFn: func(s MatchSignals) int {
			score := 0
			if containsFold(s.SystemManufacturer, "dell") || containsFold(s.ChassisManufacturer, "dell") {
				score += 80
			}
			for _, ns := range s.OEMNamespaces {
				if containsFold(ns, "dell") {
					score += 30
					break
				}
			}
			if containsFold(s.ServiceRootProduct, "idrac") {
				score += 30
			}
			return min(score, 100)
		},
		extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
			ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
				EnableProfilePlanB: true,
			})
			addPlanNote(plan, "dell iDRAC acquisition extensions enabled")
		},
		refineAcquisition: func(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, _ MatchSignals) {
			for _, managerPath := range discovered.ManagerPaths {
				if !containsFold(managerPath, "idrac") {
					continue
				}
				addPlanPaths(&resolved.SeedPaths, managerPath)
				addPlanPaths(&resolved.Plan.SeedPaths, managerPath)
				addPlanPaths(&resolved.CriticalPaths, managerPath)
				addPlanPaths(&resolved.Plan.CriticalPaths, managerPath)
			}
		},
		applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
			d.EnableGenericGraphicsControllerDedup = true
		},
	}
}
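Every matchFn in these profiles leans on `containsFold`, which is defined elsewhere in the package. A plausible standalone sketch of its likely semantics (an assumption: case-insensitive substring containment), showing how the Dell scoring inputs would evaluate:

```go
package main

import (
	"fmt"
	"strings"
)

// containsFold is a hypothetical stand-in for the package helper of the same
// name: case-insensitive substring containment.
func containsFold(haystack, needle string) bool {
	return strings.Contains(strings.ToLower(haystack), strings.ToLower(needle))
}

func main() {
	fmt.Println(containsFold("Dell Inc.", "dell"))              // true
	fmt.Println(containsFold("iDRAC Redfish Service", "idrac")) // true
	fmt.Println(containsFold("Supermicro", "dell"))             // false
}
```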
116
internal/collector/redfishprofile/profile_generic.go
Normal file
@@ -0,0 +1,116 @@
package redfishprofile

func genericProfile() Profile {
	return staticProfile{
		name:            "generic",
		priority:        100,
		safeForFallback: true,
		matchFn:         func(MatchSignals) int { return 10 },
		extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
			ensurePrefetchPolicy(plan, AcquisitionPrefetchPolicy{
				IncludeSuffixes: []string{
					"/Bios",
					"/Processors",
					"/Memory",
					"/Storage",
					"/SimpleStorage",
					"/PCIeDevices",
					"/PCIeFunctions",
					"/Accelerators",
					"/GraphicsControllers",
					"/EthernetInterfaces",
					"/NetworkInterfaces",
					"/NetworkAdapters",
					"/Drives",
					"/Power",
					"/PowerSubsystem/PowerSupplies",
					"/NetworkProtocol",
					"/UpdateService",
					"/UpdateService/FirmwareInventory",
				},
				ExcludeContains: []string{
					"/Fabrics",
					"/Backplanes",
					"/Boards",
					"/Assembly",
					"/Sensors",
					"/ThresholdSensors",
					"/DiscreteSensors",
					"/ThermalConfig",
					"/ThermalSubsystem",
					"/EnvironmentMetrics",
					"/Certificates",
					"/LogServices",
				},
			})
			ensureScopedPathPolicy(plan, AcquisitionScopedPathPolicy{
				SystemCriticalSuffixes: []string{
					"/Bios",
					"/Oem/Public",
					"/Oem/Public/FRU",
					"/Processors",
					"/Memory",
					"/Storage",
					"/PCIeDevices",
					"/PCIeFunctions",
					"/Accelerators",
					"/GraphicsControllers",
					"/EthernetInterfaces",
					"/NetworkInterfaces",
					"/SimpleStorage",
					"/Storage/IntelVROC",
					"/Storage/IntelVROC/Drives",
					"/Storage/IntelVROC/Volumes",
				},
				ChassisCriticalSuffixes: []string{
					"/Oem/Public",
					"/Oem/Public/FRU",
					"/Power",
					"/NetworkAdapters",
					"/PCIeDevices",
					"/Accelerators",
					"/Drives",
					"/Assembly",
				},
				ManagerCriticalSuffixes: []string{
					"/NetworkProtocol",
				},
				SystemSeedSuffixes: []string{
					"/SimpleStorage",
					"/Storage/IntelVROC",
					"/Storage/IntelVROC/Drives",
					"/Storage/IntelVROC/Volumes",
				},
			})
			addPlanPaths(&plan.CriticalPaths,
				"/redfish/v1/UpdateService",
				"/redfish/v1/UpdateService/FirmwareInventory",
			)
			ensureSnapshotMaxDocuments(plan, 100000)
			ensureSnapshotWorkers(plan, 6)
			ensurePrefetchWorkers(plan, 4)
			ensureETABaseline(plan, AcquisitionETABaseline{
				DiscoverySeconds:     8,
				SnapshotSeconds:      90,
				PrefetchSeconds:      20,
				CriticalPlanBSeconds: 20,
				ProfilePlanBSeconds:  15,
			})
			ensurePostProbePolicy(plan, AcquisitionPostProbePolicy{
				EnableNumericCollectionProbe: true,
			})
			ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
				EnableCriticalCollectionMemberRetry: true,
				EnableCriticalSlowProbe:             true,
				EnableEmptyCriticalCollectionRetry:  true,
			})
			ensureRatePolicy(plan, AcquisitionRatePolicy{
				TargetP95LatencyMS:      900,
				ThrottleP95LatencyMS:    1800,
				MinSnapshotWorkers:      2,
				MinPrefetchWorkers:      1,
				DisablePrefetchOnErrors: true,
			})
		},
	}
}
85
internal/collector/redfishprofile/profile_hgx.go
Normal file
@@ -0,0 +1,85 @@
package redfishprofile

func hgxProfile() Profile {
	return staticProfile{
		name:            "hgx-topology",
		priority:        30,
		safeForFallback: true,
		matchFn: func(s MatchSignals) int {
			score := 0
			if containsFold(s.SystemModel, "hgx") || containsFold(s.ChassisModel, "hgx") {
				score += 70
			}
			for _, hint := range s.ResourceHints {
				if containsFold(hint, "hgx_") || containsFold(hint, "gpu_sxm") {
					score += 20
					break
				}
			}
			return min(score, 100)
		},
		extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
			ensureSnapshotMaxDocuments(plan, 180000)
			ensureSnapshotWorkers(plan, 4)
			ensurePrefetchWorkers(plan, 4)
			ensureNVMePostProbeEnabled(plan, false)
			ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
				EnableProfilePlanB: true,
			})
			ensureETABaseline(plan, AcquisitionETABaseline{
				DiscoverySeconds:     20,
				SnapshotSeconds:      300,
				PrefetchSeconds:      50,
				CriticalPlanBSeconds: 90,
				ProfilePlanBSeconds:  40,
			})
			ensureRatePolicy(plan, AcquisitionRatePolicy{
				TargetP95LatencyMS:      1500,
				ThrottleP95LatencyMS:    3000,
				MinSnapshotWorkers:      1,
				MinPrefetchWorkers:      1,
				DisablePrefetchOnErrors: true,
			})
			addPlanNote(plan, "hgx topology acquisition extensions enabled")
		},
		refineAcquisition: func(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, _ MatchSignals) {
			for _, systemPath := range discovered.SystemPaths {
				if !containsFold(systemPath, "hgx_baseboard_") {
					continue
				}
				addPlanPaths(&resolved.SeedPaths, systemPath, joinPath(systemPath, "/Processors"))
				addPlanPaths(&resolved.Plan.SeedPaths, systemPath, joinPath(systemPath, "/Processors"))
				addPlanPaths(&resolved.CriticalPaths, systemPath, joinPath(systemPath, "/Processors"))
				addPlanPaths(&resolved.Plan.CriticalPaths, systemPath, joinPath(systemPath, "/Processors"))
				addPlanPaths(&resolved.Plan.PlanBPaths, systemPath, joinPath(systemPath, "/Processors"))
			}
		},
		applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
			d.EnableGenericGraphicsControllerDedup = true
			d.EnableStorageEnclosureRecovery = true
		},
		refineAnalysis: func(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, _ MatchSignals) {
			if snapshotHasGPUProcessor(snapshot, discovered.SystemPaths) && (snapshotHasPathContaining(snapshot, "gpu_sxm") || snapshotHasPathContaining(snapshot, "hgx_")) {
				plan.Directives.EnableProcessorGPUFallback = true
				plan.Directives.EnableProcessorGPUChassisAlias = true
				addAnalysisLookupMode(plan, "hgx-alias")
				addAnalysisNote(plan, "hgx analysis enables processor-gpu alias fallback from snapshot topology")
			}
			if snapshotHasStorageControllerHint(snapshot, "/storage/intelvroc", "/storage/ha-raid", "/storage/mrvl.ha-raid") {
				plan.Directives.EnableKnownStorageControllerRecovery = true
				addAnalysisStorageDriveCollections(plan,
					"/Storage/IntelVROC/Drives",
					"/Storage/IntelVROC/Controllers/1/Drives",
				)
				addAnalysisStorageVolumeCollections(plan,
					"/Storage/IntelVROC/Volumes",
					"/Storage/HA-RAID/Volumes",
					"/Storage/MRVL.HA-RAID/Volumes",
				)
			}
			if snapshotHasPathContaining(snapshot, "/chassis/nvmessd.") && snapshotHasPathContaining(snapshot, ".storagebackplane") {
				plan.Directives.EnableSupermicroNVMeBackplane = true
			}
		},
	}
}
67
internal/collector/redfishprofile/profile_hpe.go
Normal file
@@ -0,0 +1,67 @@
package redfishprofile

func hpeProfile() Profile {
	return staticProfile{
		name:            "hpe",
		priority:        20,
		safeForFallback: true,
		matchFn: func(s MatchSignals) int {
			score := 0
			if containsFold(s.SystemManufacturer, "hpe") ||
				containsFold(s.SystemManufacturer, "hewlett packard") ||
				containsFold(s.ChassisManufacturer, "hpe") ||
				containsFold(s.ChassisManufacturer, "hewlett packard") {
				score += 80
			}
			for _, ns := range s.OEMNamespaces {
				if containsFold(ns, "hpe") {
					score += 30
					break
				}
			}
			if containsFold(s.ServiceRootProduct, "ilo") {
				score += 30
			}
			if containsFold(s.ManagerManufacturer, "hpe") || containsFold(s.ManagerManufacturer, "ilo") {
				score += 20
			}
			return min(score, 100)
		},
		extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
			// HPE ProLiant SmartStorage RAID controller inventory is not reachable
			// via standard Redfish Storage paths — it requires the HPE OEM SmartStorage tree.
			ensureScopedPathPolicy(plan, AcquisitionScopedPathPolicy{
				SystemCriticalSuffixes: []string{
					"/SmartStorage",
					"/SmartStorageConfig",
				},
				ManagerCriticalSuffixes: []string{
					"/LicenseService",
				},
			})
			// HPE iLO responds more slowly than average BMCs under load; give the
			// ETA estimator a realistic baseline so progress reports are accurate.
			ensureETABaseline(plan, AcquisitionETABaseline{
				DiscoverySeconds:     12,
				SnapshotSeconds:      180,
				PrefetchSeconds:      30,
				CriticalPlanBSeconds: 40,
				ProfilePlanBSeconds:  25,
			})
			ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
				EnableProfilePlanB: true,
			})
			// HPE iLO starts throttling under high request rates. Setting a higher
			// latency tolerance prevents the adaptive throttler from treating normal
			// iLO slowness as a reason to stall the collection.
			ensureRatePolicy(plan, AcquisitionRatePolicy{
				TargetP95LatencyMS:      1200,
				ThrottleP95LatencyMS:    2500,
				MinSnapshotWorkers:      2,
				MinPrefetchWorkers:      1,
				DisablePrefetchOnErrors: true,
			})
			addPlanNote(plan, "hpe ilo acquisition extensions enabled")
		},
	}
}
@@ -0,0 +1,149 @@
package redfishprofile

import (
	"regexp"
	"strings"
)

var (
	outboardCardHintRe = regexp.MustCompile(`/outboardPCIeCard\d+(?:/|$)`)
	obDriveHintRe      = regexp.MustCompile(`/Drives/OB\d+$`)
	fpDriveHintRe      = regexp.MustCompile(`/Drives/FP00HDD\d+$`)
	vrFirmwareHintRe   = regexp.MustCompile(`^CPU\d+_PVCC.*_VR$`)
)

var inspurGroupOEMFirmwareHints = map[string]struct{}{
	"Front_HDD_CPLD0": {},
	"MainBoard0CPLD":  {},
	"MainBoardCPLD":   {},
	"PDBBoardCPLD":    {},
	"SCMCPLD":         {},
	"SWBoardCPLD":     {},
}

func inspurGroupOEMPlatformsProfile() Profile {
	return staticProfile{
		name:            "inspur-group-oem-platforms",
		priority:        25,
		safeForFallback: false,
		matchFn: func(s MatchSignals) int {
			topologyScore := 0
			boardScore := 0
			chassisOutboard := matchedPathTokens(s.ResourceHints, "/redfish/v1/Chassis/", outboardCardHintRe)
			systemOutboard := matchedPathTokens(s.ResourceHints, "/redfish/v1/Systems/", outboardCardHintRe)
			obDrives := matchedPathTokens(s.ResourceHints, "", obDriveHintRe)
			fpDrives := matchedPathTokens(s.ResourceHints, "", fpDriveHintRe)
			firmwareNames, vrFirmwareNames := inspurGroupOEMFirmwareMatches(s.ResourceHints)

			if len(chassisOutboard) > 0 {
				topologyScore += 20
			}
			if len(systemOutboard) > 0 {
				topologyScore += 10
			}
			switch {
			case len(obDrives) > 0 && len(fpDrives) > 0:
				topologyScore += 15
			}
			switch {
			case len(firmwareNames) >= 2:
				boardScore += 15
			}
			switch {
			case len(vrFirmwareNames) >= 2:
				boardScore += 10
			}
			if anySignalContains(s, "COMMONbAssembly") {
				boardScore += 12
			}
			if anySignalContains(s, "EnvironmentMetrcs") {
				boardScore += 8
			}
			if anySignalContains(s, "GetServerAllUSBStatus") {
				boardScore += 8
			}
			if topologyScore == 0 || boardScore == 0 {
				return 0
			}
			return min(topologyScore+boardScore, 100)
		},
		extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
			addPlanNote(plan, "Inspur Group OEM platform fingerprint matched")
		},
		applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
			d.EnableGenericGraphicsControllerDedup = true
		},
	}
}

func matchedPathTokens(paths []string, requiredPrefix string, re *regexp.Regexp) []string {
	seen := make(map[string]struct{})
	for _, rawPath := range paths {
		path := normalizePath(rawPath)
		if path == "" || (requiredPrefix != "" && !strings.HasPrefix(path, requiredPrefix)) {
			continue
		}
		token := re.FindString(path)
		if token == "" {
			continue
		}
		token = strings.Trim(token, "/")
		if token == "" {
			continue
		}
		seen[token] = struct{}{}
	}
	out := make([]string, 0, len(seen))
	for token := range seen {
		out = append(out, token)
	}
	return dedupeSorted(out)
}

func inspurGroupOEMFirmwareMatches(paths []string) ([]string, []string) {
	firmwareNames := make(map[string]struct{})
	vrNames := make(map[string]struct{})
	for _, rawPath := range paths {
		path := normalizePath(rawPath)
		if !strings.HasPrefix(path, "/redfish/v1/UpdateService/FirmwareInventory/") {
			continue
		}
		name := strings.TrimSpace(path[strings.LastIndex(path, "/")+1:])
		if name == "" {
			continue
		}
		if _, ok := inspurGroupOEMFirmwareHints[name]; ok {
			firmwareNames[name] = struct{}{}
		}
		if vrFirmwareHintRe.MatchString(name) {
			vrNames[name] = struct{}{}
		}
	}
	return mapKeysSorted(firmwareNames), mapKeysSorted(vrNames)
}

func anySignalContains(signals MatchSignals, needle string) bool {
	needle = strings.TrimSpace(needle)
	if needle == "" {
		return false
	}
	for _, signal := range signals.ResourceHints {
		if strings.Contains(signal, needle) {
			return true
		}
	}
	for _, signal := range signals.DocHints {
		if strings.Contains(signal, needle) {
			return true
		}
	}
	return false
}

func mapKeysSorted(items map[string]struct{}) []string {
	out := make([]string, 0, len(items))
	for item := range items {
		out = append(out, item)
	}
	return dedupeSorted(out)
}
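The fingerprint regexps above can be exercised in isolation. A small sketch showing how the outboard-card pattern tokenizes sample resource paths (the paths here are illustrative, not from a real export):

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as outboardCardHintRe above: an outboardPCIeCardN segment,
// either terminating the path or followed by another segment.
var outboardRe = regexp.MustCompile(`/outboardPCIeCard\d+(?:/|$)`)

func main() {
	for _, p := range []string{
		"/redfish/v1/Chassis/1/NetworkAdapters/outboardPCIeCard0",
		"/redfish/v1/Systems/1/NetworkInterfaces/outboardPCIeCard1/Ports",
		"/redfish/v1/Chassis/1/NetworkAdapters/Slot3", // no match
	} {
		fmt.Printf("%q -> %q\n", p, outboardRe.FindString(p))
	}
}
```

Note the `(?:/|$)` alternation: it prevents a segment such as `outboardPCIeCard0Extra` from matching, which is why `matchedPathTokens` then trims the captured slashes off before deduplicating.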
@@ -0,0 +1,182 @@
package redfishprofile

import (
	"archive/zip"
	"encoding/json"
	"os"
	"path/filepath"
	"testing"
)

func TestCollectSignalsFromTree_InspurGroupOEMPlatformsSelectsMatchedMode(t *testing.T) {
	tree := map[string]interface{}{
		"/redfish/v1": map[string]interface{}{
			"@odata.id": "/redfish/v1",
		},
		"/redfish/v1/Systems": map[string]interface{}{
			"Members": []interface{}{
				map[string]interface{}{"@odata.id": "/redfish/v1/Systems/1"},
			},
		},
		"/redfish/v1/Systems/1": map[string]interface{}{
			"@odata.id": "/redfish/v1/Systems/1",
			"Oem": map[string]interface{}{
				"Public": map[string]interface{}{
					"USB": map[string]interface{}{
						"@odata.id": "/redfish/v1/Systems/1/Oem/Public/GetServerAllUSBStatus",
					},
				},
			},
			"NetworkInterfaces": map[string]interface{}{
				"@odata.id": "/redfish/v1/Systems/1/NetworkInterfaces",
			},
		},
		"/redfish/v1/Systems/1/NetworkInterfaces": map[string]interface{}{
			"Members": []interface{}{
				map[string]interface{}{"@odata.id": "/redfish/v1/Systems/1/NetworkInterfaces/outboardPCIeCard0"},
				map[string]interface{}{"@odata.id": "/redfish/v1/Systems/1/NetworkInterfaces/outboardPCIeCard1"},
			},
		},
		"/redfish/v1/Chassis": map[string]interface{}{
			"Members": []interface{}{
				map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1"},
			},
		},
		"/redfish/v1/Chassis/1": map[string]interface{}{
			"@odata.id": "/redfish/v1/Chassis/1",
			"Actions": map[string]interface{}{
				"Oem": map[string]interface{}{
					"Public": map[string]interface{}{
						"NvGpuPowerLimitWatts": map[string]interface{}{
							"target": "/redfish/v1/Chassis/1/GPU/EnvironmentMetrcs",
						},
					},
				},
			},
			"Drives": map[string]interface{}{
				"@odata.id": "/redfish/v1/Chassis/1/Drives",
			},
			"NetworkAdapters": map[string]interface{}{
				"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters",
			},
		},
		"/redfish/v1/Chassis/1/Drives": map[string]interface{}{
			"Members": []interface{}{
				map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/Drives/OB01"},
				map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/Drives/FP00HDD00"},
			},
		},
		"/redfish/v1/Chassis/1/NetworkAdapters": map[string]interface{}{
			"Members": []interface{}{
				map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/outboardPCIeCard0"},
				map[string]interface{}{"@odata.id": "/redfish/v1/Chassis/1/NetworkAdapters/outboardPCIeCard1"},
			},
		},
		"/redfish/v1/Chassis/1/Assembly": map[string]interface{}{
			"Assemblies": []interface{}{
				map[string]interface{}{
					"Oem": map[string]interface{}{
						"COMMONb": map[string]interface{}{
							"COMMONbAssembly": map[string]interface{}{
								"@odata.type": "#COMMONbAssembly.v1_0_0.COMMONbAssembly",
							},
						},
					},
				},
			},
		},
		"/redfish/v1/Managers": map[string]interface{}{
			"Members": []interface{}{
				map[string]interface{}{"@odata.id": "/redfish/v1/Managers/1"},
			},
		},
		"/redfish/v1/Managers/1": map[string]interface{}{
			"Actions": map[string]interface{}{
				"Oem": map[string]interface{}{
					"#PublicManager.ExportConfFile": map[string]interface{}{
						"target": "/redfish/v1/Managers/1/Actions/Oem/Public/ExportConfFile",
					},
				},
			},
		},
		"/redfish/v1/UpdateService/FirmwareInventory": map[string]interface{}{
			"Members": []interface{}{
				map[string]interface{}{"@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/Front_HDD_CPLD0"},
				map[string]interface{}{"@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/SCMCPLD"},
				map[string]interface{}{"@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/CPU0_PVCCD_HV_VR"},
				map[string]interface{}{"@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/CPU1_PVCCIN_VR"},
			},
		},
	}

	signals := CollectSignalsFromTree(tree)
	match := MatchProfiles(signals)

	if match.Mode != ModeMatched {
		t.Fatalf("expected matched mode, got %q", match.Mode)
	}
	assertProfileSelected(t, match, "inspur-group-oem-platforms")
}

func TestCollectSignalsFromTree_InspurGroupOEMPlatformsDoesNotFalsePositiveOnExampleRawExports(t *testing.T) {
	examples := []string{
		"2026-03-18 (G5500 V7) - 210619KUGGXGS2000015.zip",
		"2026-03-11 (SYS-821GE-TNHR) - A514359X5C08846.zip",
		"2026-03-15 (CG480-S5063) - P5T0006091.zip",
		"2026-03-18 (CG290-S3063) - PAT0011258.zip",
		"2024-04-25 (AS -4124GQ-TNMI) - S490387X4418273.zip",
	}
	for _, name := range examples {
		t.Run(name, func(t *testing.T) {
			tree := loadRawExportTreeFromExampleZip(t, name)
			match := MatchProfiles(CollectSignalsFromTree(tree))
			assertProfileNotSelected(t, match, "inspur-group-oem-platforms")
		})
	}
}

func loadRawExportTreeFromExampleZip(t *testing.T, name string) map[string]interface{} {
	t.Helper()
	path := filepath.Join("..", "..", "..", "example", name)
	f, err := os.Open(path)
	if err != nil {
		t.Fatalf("open example zip %s: %v", path, err)
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		t.Fatalf("stat example zip %s: %v", path, err)
	}

	zr, err := zip.NewReader(f, info.Size())
	if err != nil {
		t.Fatalf("read example zip %s: %v", path, err)
	}
	for _, file := range zr.File {
		if file.Name != "raw_export.json" {
			continue
		}
		rc, err := file.Open()
		if err != nil {
			t.Fatalf("open %s in %s: %v", file.Name, path, err)
		}
		defer rc.Close()
		var payload struct {
			Source struct {
				RawPayloads struct {
					RedfishTree map[string]interface{} `json:"redfish_tree"`
				} `json:"raw_payloads"`
			} `json:"source"`
		}
		if err := json.NewDecoder(rc).Decode(&payload); err != nil {
			t.Fatalf("decode raw_export.json from %s: %v", path, err)
		}
		if len(payload.Source.RawPayloads.RedfishTree) == 0 {
			t.Fatalf("example %s has empty redfish_tree", path)
		}
		return payload.Source.RawPayloads.RedfishTree
	}
	t.Fatalf("raw_export.json not found in %s", path)
	return nil
}
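The loader above opens a member file inside a zip and decodes one nested JSON field. The same pattern can be shown self-contained by building the zip in memory (helper names here are hypothetical, for illustration only):

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/json"
	"fmt"
)

// loadTreeFromZip finds raw_export.json inside zip bytes and decodes the
// nested source.raw_payloads.redfish_tree map, mirroring the loader pattern
// above in miniature.
func loadTreeFromZip(data []byte) map[string]interface{} {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil
	}
	for _, f := range zr.File {
		if f.Name != "raw_export.json" {
			continue
		}
		rc, err := f.Open()
		if err != nil {
			return nil
		}
		defer rc.Close()
		var payload struct {
			Source struct {
				RawPayloads struct {
					RedfishTree map[string]interface{} `json:"redfish_tree"`
				} `json:"raw_payloads"`
			} `json:"source"`
		}
		if err := json.NewDecoder(rc).Decode(&payload); err != nil {
			return nil
		}
		return payload.Source.RawPayloads.RedfishTree
	}
	return nil
}

// makeExampleZip builds a one-member zip in memory so the sketch needs no
// files on disk.
func makeExampleZip() []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, _ := zw.Create("raw_export.json")
	w.Write([]byte(`{"source":{"raw_payloads":{"redfish_tree":{"/redfish/v1":{}}}}}`))
	zw.Close()
	return buf.Bytes()
}

func main() {
	tree := loadTreeFromZip(makeExampleZip())
	fmt.Println(len(tree)) // 1
}
```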
65
internal/collector/redfishprofile/profile_lenovo.go
Normal file
@@ -0,0 +1,65 @@
|
||||
package redfishprofile
|
||||
|
||||
func lenovoProfile() Profile {
|
||||
return staticProfile{
|
||||
name: "lenovo",
|
||||
priority: 20,
|
||||
safeForFallback: true,
|
||||
matchFn: func(s MatchSignals) int {
|
||||
score := 0
|
||||
if containsFold(s.SystemManufacturer, "lenovo") ||
|
||||
containsFold(s.ChassisManufacturer, "lenovo") {
|
||||
score += 80
|
||||
}
|
||||
for _, ns := range s.OEMNamespaces {
|
||||
if containsFold(ns, "lenovo") {
|
||||
score += 30
|
||||
break
|
||||
}
|
||||
}
|
||||
// Lenovo XClarity Controller (XCC) is the BMC product line.
|
||||
if containsFold(s.ServiceRootProduct, "xclarity") ||
|
||||
containsFold(s.ServiceRootProduct, "xcc") {
|
||||
score += 30
|
||||
}
|
||||
return min(score, 100)
|
||||
},
|
||||
extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
|
||||
// Lenovo XCC BMC exposes Chassis/1/Sensors with hundreds of individual
|
||||
// sensor member documents (e.g. Chassis/1/Sensors/101L1). These are
|
||||
// not used by any LOGPile parser — thermal/power data is read from
|
||||
// the aggregate Chassis/*/Thermal and Chassis/*/Power endpoints. On
|
||||
// a real server they largely return errors, wasting many minutes.
|
||||
// Lenovo OEM subtrees under Oem/Lenovo/LEDs and Oem/Lenovo/Slots also
|
||||
// enumerate dozens of individual documents not relevant to inventory.
|
||||
ensureSnapshotExcludeContains(plan,
|
||||
"/Sensors/", // individual sensor docs (Chassis/1/Sensors/NNN)
|
||||
"/Oem/Lenovo/LEDs/", // individual LED status entries (~47 per server)
|
||||
"/Oem/Lenovo/Slots/", // individual slot detail entries (~26 per server)
|
||||
"/Oem/Lenovo/Metrics/", // operational metrics, not inventory
|
||||
"/Oem/Lenovo/History", // historical telemetry
|
||||
"/Oem/Lenovo/ScheduledPower", // power scheduling config
|
||||
"/Oem/Lenovo/BootSettings/BootOrder", // individual boot order lists
|
||||
"/PortForwardingMap/", // network port forwarding config
|
||||
)
|
||||
// Lenovo XCC BMC is typically slow (p95 latency often 3-5s even under
|
||||
// normal load). Set rate thresholds that don't over-throttle on the
|
||||
// first few requests, and give the ETA estimator a realistic baseline.
|
||||
ensureRatePolicy(plan, AcquisitionRatePolicy{
|
||||
TargetP95LatencyMS: 2000,
|
||||
ThrottleP95LatencyMS: 4000,
|
||||
MinSnapshotWorkers: 2,
|
||||
MinPrefetchWorkers: 1,
|
||||
DisablePrefetchOnErrors: true,
|
||||
})
|
||||
ensureETABaseline(plan, AcquisitionETABaseline{
|
||||
DiscoverySeconds: 15,
|
||||
SnapshotSeconds: 120,
|
||||
PrefetchSeconds: 30,
|
||||
CriticalPlanBSeconds: 40,
|
||||
ProfilePlanBSeconds: 20,
|
||||
})
|
||||
addPlanNote(plan, "lenovo xcc acquisition extensions enabled: noisy sensor/oem paths excluded from snapshot")
|
||||
},
|
||||
}
|
||||
}
|
||||
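The exclusion patterns passed to `ensureSnapshotExcludeContains` are plain substrings matched against Redfish paths. A minimal sketch of how such a filter could be applied while walking snapshot paths (the collector's actual matching code is not part of this diff; `shouldSkip` is a hypothetical helper illustrating the substring semantics):

```go
package main

import (
	"fmt"
	"strings"
)

// shouldSkip reports whether a Redfish path matches any exclusion substring.
// Hypothetical helper; mirrors the substring semantics of SnapshotExcludeContains.
func shouldSkip(path string, excludeContains []string) bool {
	for _, pat := range excludeContains {
		if strings.Contains(path, pat) {
			return true
		}
	}
	return false
}

func main() {
	exclude := []string{"/Sensors/", "/Oem/Lenovo/LEDs/"}
	// Individual sensor member doc is excluded...
	fmt.Println(shouldSkip("/redfish/v1/Chassis/1/Sensors/101L1", exclude)) // true
	// ...but the aggregate Thermal endpoint (and the Sensors collection
	// itself, which has no trailing slash) still passes.
	fmt.Println(shouldSkip("/redfish/v1/Chassis/1/Thermal", exclude)) // false
}
```

Note that the trailing slash in `"/Sensors/"` is what keeps the collection document itself in the snapshot while dropping its members.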
74 internal/collector/redfishprofile/profile_msi.go Normal file
@@ -0,0 +1,74 @@
package redfishprofile

import "strings"

func msiProfile() Profile {
	return staticProfile{
		name:            "msi",
		priority:        20,
		safeForFallback: true,
		matchFn: func(s MatchSignals) int {
			score := 0
			if containsFold(s.SystemManufacturer, "micro-star") || containsFold(s.ChassisManufacturer, "micro-star") {
				score += 80
			}
			if containsFold(s.SystemManufacturer, "msi") || containsFold(s.ChassisManufacturer, "msi") {
				score += 40
			}
			for _, hint := range s.ResourceHints {
				if strings.HasPrefix(hint, "/redfish/v1/Chassis/GPU") {
					score += 10
					break
				}
			}
			return min(score, 100)
		},
		extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
			ensureSnapshotWorkers(plan, 6)
			ensurePrefetchWorkers(plan, 8)
			ensureETABaseline(plan, AcquisitionETABaseline{
				DiscoverySeconds:     12,
				SnapshotSeconds:      120,
				PrefetchSeconds:      25,
				CriticalPlanBSeconds: 35,
				ProfilePlanBSeconds:  25,
			})
			ensureRatePolicy(plan, AcquisitionRatePolicy{
				TargetP95LatencyMS:      1000,
				ThrottleP95LatencyMS:    2200,
				MinSnapshotWorkers:      2,
				MinPrefetchWorkers:      2,
				DisablePrefetchOnErrors: true,
			})
			ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
				EnableProfilePlanB: true,
			})
			addPlanNote(plan, "msi gpu chassis probes enabled")
		},
		refineAcquisition: func(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, _ MatchSignals) {
			for _, chassisPath := range discovered.ChassisPaths {
				if !strings.HasPrefix(chassisPath, "/redfish/v1/Chassis/GPU") {
					continue
				}
				addPlanPaths(&resolved.SeedPaths, chassisPath)
				addPlanPaths(&resolved.Plan.SeedPaths, chassisPath)
				addPlanPaths(&resolved.CriticalPaths, joinPath(chassisPath, "/Sensors"))
				addPlanPaths(&resolved.Plan.CriticalPaths, joinPath(chassisPath, "/Sensors"))
				addPlanPaths(&resolved.Plan.PlanBPaths, joinPath(chassisPath, "/Sensors"))
			}
		},
		applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
			d.EnableGenericGraphicsControllerDedup = true
		},
		refineAnalysis: func(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, _ MatchSignals) {
			if snapshotHasGPUProcessor(snapshot, discovered.SystemPaths) && snapshotHasPathPrefix(snapshot, "/redfish/v1/Chassis/GPU") {
				plan.Directives.EnableProcessorGPUFallback = true
				plan.Directives.EnableMSIProcessorGPUChassisLookup = true
				plan.Directives.EnableMSIGhostGPUFilter = true
				addAnalysisLookupMode(plan, "msi-index")
				addAnalysisNote(plan, "msi analysis enables processor-gpu fallback from discovered GPU chassis")
				addAnalysisNote(plan, "msi ghost-gpu filter enabled: GPUs with temperature=0 on powered-on host are excluded")
			}
		},
	}
}
81 internal/collector/redfishprofile/profile_supermicro.go Normal file
@@ -0,0 +1,81 @@
package redfishprofile

func supermicroProfile() Profile {
	return staticProfile{
		name:            "supermicro",
		priority:        20,
		safeForFallback: true,
		matchFn: func(s MatchSignals) int {
			score := 0
			if containsFold(s.SystemManufacturer, "supermicro") || containsFold(s.ChassisManufacturer, "supermicro") {
				score += 80
			}
			for _, hint := range s.ResourceHints {
				if containsFold(hint, "hgx_baseboard") || containsFold(hint, "hgx_gpu_sxm") {
					score += 20
					break
				}
			}
			return min(score, 100)
		},
		extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
			ensureSnapshotMaxDocuments(plan, 150000)
			ensureSnapshotWorkers(plan, 6)
			ensurePrefetchWorkers(plan, 4)
			ensureETABaseline(plan, AcquisitionETABaseline{
				DiscoverySeconds:     15,
				SnapshotSeconds:      180,
				PrefetchSeconds:      35,
				CriticalPlanBSeconds: 45,
				ProfilePlanBSeconds:  30,
			})
			ensurePostProbePolicy(plan, AcquisitionPostProbePolicy{
				EnableDirectNVMEDiskBayProbe: true,
			})
			ensureRecoveryPolicy(plan, AcquisitionRecoveryPolicy{
				EnableProfilePlanB: true,
			})
			ensureRatePolicy(plan, AcquisitionRatePolicy{
				TargetP95LatencyMS:      1200,
				ThrottleP95LatencyMS:    2400,
				MinSnapshotWorkers:      2,
				MinPrefetchWorkers:      1,
				DisablePrefetchOnErrors: true,
			})
			addPlanNote(plan, "supermicro acquisition extensions enabled")
		},
		refineAcquisition: func(resolved *ResolvedAcquisitionPlan, _ DiscoveredResources, signals MatchSignals) {
			for _, hint := range signals.ResourceHints {
				if normalizePath(hint) != "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory" {
					continue
				}
				addPlanPaths(&resolved.CriticalPaths, hint)
				addPlanPaths(&resolved.Plan.CriticalPaths, hint)
				addPlanPaths(&resolved.Plan.PlanBPaths, hint)
				break
			}
		},
		applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
			d.EnableStorageEnclosureRecovery = true
		},
		refineAnalysis: func(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, _ DiscoveredResources, _ MatchSignals) {
			if snapshotHasPathContaining(snapshot, "/chassis/nvmessd.") && snapshotHasPathContaining(snapshot, ".storagebackplane") {
				plan.Directives.EnableSupermicroNVMeBackplane = true
				addAnalysisNote(plan, "supermicro analysis enables NVMe backplane recovery from snapshot paths")
			}
			if snapshotHasStorageControllerHint(snapshot, "/storage/intelvroc", "/storage/ha-raid", "/storage/mrvl.ha-raid") {
				plan.Directives.EnableKnownStorageControllerRecovery = true
				addAnalysisStorageDriveCollections(plan,
					"/Storage/IntelVROC/Drives",
					"/Storage/IntelVROC/Controllers/1/Drives",
				)
				addAnalysisStorageVolumeCollections(plan,
					"/Storage/IntelVROC/Volumes",
					"/Storage/HA-RAID/Volumes",
					"/Storage/MRVL.HA-RAID/Volumes",
				)
				addAnalysisNote(plan, "supermicro analysis enables known storage-controller recovery from snapshot paths")
			}
		},
	}
}
55 internal/collector/redfishprofile/profile_xfusion.go Normal file
@@ -0,0 +1,55 @@
package redfishprofile

func xfusionProfile() Profile {
	return staticProfile{
		name:            "xfusion",
		priority:        20,
		safeForFallback: true,
		matchFn: func(s MatchSignals) int {
			score := 0
			if containsFold(s.ServiceRootVendor, "xfusion") {
				score += 90
			}
			for _, ns := range s.OEMNamespaces {
				if containsFold(ns, "xfusion") {
					score += 20
					break
				}
			}
			if containsFold(s.SystemManufacturer, "xfusion") || containsFold(s.ChassisManufacturer, "xfusion") {
				score += 40
			}
			return min(score, 100)
		},
		extendAcquisition: func(plan *AcquisitionPlan, _ MatchSignals) {
			ensureSnapshotMaxDocuments(plan, 120000)
			ensureSnapshotWorkers(plan, 4)
			ensurePrefetchWorkers(plan, 4)
			ensurePrefetchEnabled(plan, true)
			ensureETABaseline(plan, AcquisitionETABaseline{
				DiscoverySeconds:     10,
				SnapshotSeconds:      90,
				PrefetchSeconds:      20,
				CriticalPlanBSeconds: 30,
				ProfilePlanBSeconds:  20,
			})
			ensureRatePolicy(plan, AcquisitionRatePolicy{
				TargetP95LatencyMS:      800,
				ThrottleP95LatencyMS:    1800,
				MinSnapshotWorkers:      2,
				MinPrefetchWorkers:      1,
				DisablePrefetchOnErrors: true,
			})
			addPlanNote(plan, "xfusion ibmc acquisition extensions enabled")
		},
		applyAnalysisDirectives: func(d *AnalysisDirectives, _ MatchSignals) {
			d.EnableGenericGraphicsControllerDedup = true
		},
		refineAnalysis: func(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, _ MatchSignals) {
			if snapshotHasGPUProcessor(snapshot, discovered.SystemPaths) {
				plan.Directives.EnableProcessorGPUFallback = true
				addAnalysisNote(plan, "xfusion analysis enables processor-gpu fallback from snapshot topology")
			}
		},
	}
}
239 internal/collector/redfishprofile/profiles_common.go Normal file
@@ -0,0 +1,239 @@
package redfishprofile

import (
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
)

type staticProfile struct {
	name                    string
	priority                int
	safeForFallback         bool
	matchFn                 func(MatchSignals) int
	extendAcquisition       func(*AcquisitionPlan, MatchSignals)
	refineAcquisition       func(*ResolvedAcquisitionPlan, DiscoveredResources, MatchSignals)
	applyAnalysisDirectives func(*AnalysisDirectives, MatchSignals)
	refineAnalysis          func(*ResolvedAnalysisPlan, map[string]interface{}, DiscoveredResources, MatchSignals)
	postAnalyze             func(*models.AnalysisResult, map[string]interface{}, MatchSignals)
}

func (p staticProfile) Name() string                  { return p.name }
func (p staticProfile) Priority() int                 { return p.priority }
func (p staticProfile) Match(signals MatchSignals) int { return p.matchFn(normalizeSignals(signals)) }
func (p staticProfile) SafeForFallback() bool         { return p.safeForFallback }
func (p staticProfile) ExtendAcquisitionPlan(plan *AcquisitionPlan, signals MatchSignals) {
	if p.extendAcquisition != nil {
		p.extendAcquisition(plan, normalizeSignals(signals))
	}
}
func (p staticProfile) RefineAcquisitionPlan(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, signals MatchSignals) {
	if p.refineAcquisition != nil {
		p.refineAcquisition(resolved, discovered, normalizeSignals(signals))
	}
}
func (p staticProfile) ApplyAnalysisDirectives(directives *AnalysisDirectives, signals MatchSignals) {
	if p.applyAnalysisDirectives != nil {
		p.applyAnalysisDirectives(directives, normalizeSignals(signals))
	}
}
func (p staticProfile) RefineAnalysisPlan(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, signals MatchSignals) {
	if p.refineAnalysis != nil {
		p.refineAnalysis(plan, snapshot, discovered, normalizeSignals(signals))
	}
}
func (p staticProfile) PostAnalyze(result *models.AnalysisResult, snapshot map[string]interface{}, signals MatchSignals) {
	if p.postAnalyze != nil {
		p.postAnalyze(result, snapshot, normalizeSignals(signals))
	}
}

func BuiltinProfiles() []Profile {
	return []Profile{
		genericProfile(),
		amiProfile(),
		msiProfile(),
		supermicroProfile(),
		dellProfile(),
		hpeProfile(),
		lenovoProfile(),
		inspurGroupOEMPlatformsProfile(),
		hgxProfile(),
		xfusionProfile(),
	}
}

func containsFold(v, sub string) bool {
	return strings.Contains(strings.ToLower(strings.TrimSpace(v)), strings.ToLower(strings.TrimSpace(sub)))
}

func addPlanPaths(dst *[]string, paths ...string) {
	*dst = append(*dst, paths...)
	*dst = dedupeSorted(*dst)
}

func addPlanNote(plan *AcquisitionPlan, note string) {
	if strings.TrimSpace(note) == "" {
		return
	}
	plan.Notes = append(plan.Notes, note)
	plan.Notes = dedupeSorted(plan.Notes)
}

func addAnalysisNote(plan *ResolvedAnalysisPlan, note string) {
	if plan == nil || strings.TrimSpace(note) == "" {
		return
	}
	plan.Notes = append(plan.Notes, note)
	plan.Notes = dedupeSorted(plan.Notes)
}

func addAnalysisLookupMode(plan *ResolvedAnalysisPlan, mode string) {
	if plan == nil || strings.TrimSpace(mode) == "" {
		return
	}
	plan.ProcessorGPUChassisLookupModes = dedupeSorted(append(plan.ProcessorGPUChassisLookupModes, mode))
}

func addAnalysisStorageDriveCollections(plan *ResolvedAnalysisPlan, rels ...string) {
	if plan == nil {
		return
	}
	plan.KnownStorageDriveCollections = dedupeSorted(append(plan.KnownStorageDriveCollections, rels...))
}

func addAnalysisStorageVolumeCollections(plan *ResolvedAnalysisPlan, rels ...string) {
	if plan == nil {
		return
	}
	plan.KnownStorageVolumeCollections = dedupeSorted(append(plan.KnownStorageVolumeCollections, rels...))
}

func ensureSnapshotMaxDocuments(plan *AcquisitionPlan, n int) {
	if n <= 0 {
		return
	}
	if plan.Tuning.SnapshotMaxDocuments < n {
		plan.Tuning.SnapshotMaxDocuments = n
	}
}

func ensureSnapshotWorkers(plan *AcquisitionPlan, n int) {
	if n <= 0 {
		return
	}
	if plan.Tuning.SnapshotWorkers < n {
		plan.Tuning.SnapshotWorkers = n
	}
}

func ensurePrefetchEnabled(plan *AcquisitionPlan, enabled bool) {
	if plan.Tuning.PrefetchEnabled == nil {
		plan.Tuning.PrefetchEnabled = new(bool)
	}
	*plan.Tuning.PrefetchEnabled = enabled
}

func ensurePrefetchWorkers(plan *AcquisitionPlan, n int) {
	if n <= 0 {
		return
	}
	if plan.Tuning.PrefetchWorkers < n {
		plan.Tuning.PrefetchWorkers = n
	}
}

func ensureNVMePostProbeEnabled(plan *AcquisitionPlan, enabled bool) {
	if plan.Tuning.NVMePostProbeEnabled == nil {
		plan.Tuning.NVMePostProbeEnabled = new(bool)
	}
	*plan.Tuning.NVMePostProbeEnabled = enabled
}

func ensureRatePolicy(plan *AcquisitionPlan, policy AcquisitionRatePolicy) {
	if policy.TargetP95LatencyMS > plan.Tuning.RatePolicy.TargetP95LatencyMS {
		plan.Tuning.RatePolicy.TargetP95LatencyMS = policy.TargetP95LatencyMS
	}
	if policy.ThrottleP95LatencyMS > plan.Tuning.RatePolicy.ThrottleP95LatencyMS {
		plan.Tuning.RatePolicy.ThrottleP95LatencyMS = policy.ThrottleP95LatencyMS
	}
	if policy.MinSnapshotWorkers > plan.Tuning.RatePolicy.MinSnapshotWorkers {
		plan.Tuning.RatePolicy.MinSnapshotWorkers = policy.MinSnapshotWorkers
	}
	if policy.MinPrefetchWorkers > plan.Tuning.RatePolicy.MinPrefetchWorkers {
		plan.Tuning.RatePolicy.MinPrefetchWorkers = policy.MinPrefetchWorkers
	}
	if policy.DisablePrefetchOnErrors {
		plan.Tuning.RatePolicy.DisablePrefetchOnErrors = true
	}
}

func ensureETABaseline(plan *AcquisitionPlan, baseline AcquisitionETABaseline) {
	if baseline.DiscoverySeconds > plan.Tuning.ETABaseline.DiscoverySeconds {
		plan.Tuning.ETABaseline.DiscoverySeconds = baseline.DiscoverySeconds
	}
	if baseline.SnapshotSeconds > plan.Tuning.ETABaseline.SnapshotSeconds {
		plan.Tuning.ETABaseline.SnapshotSeconds = baseline.SnapshotSeconds
	}
	if baseline.PrefetchSeconds > plan.Tuning.ETABaseline.PrefetchSeconds {
		plan.Tuning.ETABaseline.PrefetchSeconds = baseline.PrefetchSeconds
	}
	if baseline.CriticalPlanBSeconds > plan.Tuning.ETABaseline.CriticalPlanBSeconds {
		plan.Tuning.ETABaseline.CriticalPlanBSeconds = baseline.CriticalPlanBSeconds
	}
	if baseline.ProfilePlanBSeconds > plan.Tuning.ETABaseline.ProfilePlanBSeconds {
		plan.Tuning.ETABaseline.ProfilePlanBSeconds = baseline.ProfilePlanBSeconds
	}
}

func ensurePostProbePolicy(plan *AcquisitionPlan, policy AcquisitionPostProbePolicy) {
	if policy.EnableDirectNVMEDiskBayProbe {
		plan.Tuning.PostProbePolicy.EnableDirectNVMEDiskBayProbe = true
	}
	if policy.EnableNumericCollectionProbe {
		plan.Tuning.PostProbePolicy.EnableNumericCollectionProbe = true
	}
	if policy.EnableSensorCollectionProbe {
		plan.Tuning.PostProbePolicy.EnableSensorCollectionProbe = true
	}
}

func ensureRecoveryPolicy(plan *AcquisitionPlan, policy AcquisitionRecoveryPolicy) {
	if policy.EnableCriticalCollectionMemberRetry {
		plan.Tuning.RecoveryPolicy.EnableCriticalCollectionMemberRetry = true
	}
	if policy.EnableCriticalSlowProbe {
		plan.Tuning.RecoveryPolicy.EnableCriticalSlowProbe = true
	}
	if policy.EnableProfilePlanB {
		plan.Tuning.RecoveryPolicy.EnableProfilePlanB = true
	}
	if policy.EnableEmptyCriticalCollectionRetry {
		plan.Tuning.RecoveryPolicy.EnableEmptyCriticalCollectionRetry = true
	}
}

func ensureScopedPathPolicy(plan *AcquisitionPlan, policy AcquisitionScopedPathPolicy) {
	addPlanPaths(&plan.ScopedPaths.SystemSeedSuffixes, policy.SystemSeedSuffixes...)
	addPlanPaths(&plan.ScopedPaths.SystemCriticalSuffixes, policy.SystemCriticalSuffixes...)
	addPlanPaths(&plan.ScopedPaths.ChassisSeedSuffixes, policy.ChassisSeedSuffixes...)
	addPlanPaths(&plan.ScopedPaths.ChassisCriticalSuffixes, policy.ChassisCriticalSuffixes...)
	addPlanPaths(&plan.ScopedPaths.ManagerSeedSuffixes, policy.ManagerSeedSuffixes...)
	addPlanPaths(&plan.ScopedPaths.ManagerCriticalSuffixes, policy.ManagerCriticalSuffixes...)
}

func ensurePrefetchPolicy(plan *AcquisitionPlan, policy AcquisitionPrefetchPolicy) {
	addPlanPaths(&plan.Tuning.PrefetchPolicy.IncludeSuffixes, policy.IncludeSuffixes...)
	addPlanPaths(&plan.Tuning.PrefetchPolicy.ExcludeContains, policy.ExcludeContains...)
}

func ensureSnapshotExcludeContains(plan *AcquisitionPlan, patterns ...string) {
	addPlanPaths(&plan.Tuning.SnapshotExcludeContains, patterns...)
}

func min(a, b int) int {
	if a < b {
		return a
	}
	return b
}
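The `ensure*` helpers in profiles_common.go merge profile tuning monotonically: each call only raises a numeric field or latches a boolean to true, so stacking several profiles on one plan yields the same result regardless of order. A standalone sketch of that max-merge rule (struct simplified to two fields; not the package's real types):

```go
package main

import "fmt"

// tuning is a simplified stand-in for the plan's tuning fields.
type tuning struct {
	SnapshotWorkers         int
	DisablePrefetchOnErrors bool
}

// ensureSnapshotWorkers mirrors the max-merge rule: only raise, never lower.
func ensureSnapshotWorkers(t *tuning, n int) {
	if n > 0 && t.SnapshotWorkers < n {
		t.SnapshotWorkers = n
	}
}

// ensureDisablePrefetchOnErrors mirrors the boolean latch: once true, stays true.
func ensureDisablePrefetchOnErrors(t *tuning, v bool) {
	if v {
		t.DisablePrefetchOnErrors = true
	}
}

func main() {
	var t tuning
	ensureSnapshotWorkers(&t, 6)           // first profile sets 6
	ensureSnapshotWorkers(&t, 4)           // lower request: no-op
	ensureDisablePrefetchOnErrors(&t, true)
	ensureDisablePrefetchOnErrors(&t, false) // cannot un-latch
	fmt.Println(t.SnapshotWorkers, t.DisablePrefetchOnErrors) // 6 true
}
```

This is why the lenovo and supermicro profiles can both call `ensureRatePolicy` on the same plan without one silently weakening the other's limits.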
177 internal/collector/redfishprofile/signals.go Normal file
@@ -0,0 +1,177 @@
package redfishprofile

import "strings"

func CollectSignals(serviceRootDoc, systemDoc, chassisDoc, managerDoc map[string]interface{}, resourceHints []string, hintDocs ...map[string]interface{}) MatchSignals {
	resourceHints = append([]string{}, resourceHints...)
	docHints := make([]string, 0)
	for _, doc := range append([]map[string]interface{}{serviceRootDoc, systemDoc, chassisDoc, managerDoc}, hintDocs...) {
		embeddedPaths, embeddedHints := collectDocSignalHints(doc)
		resourceHints = append(resourceHints, embeddedPaths...)
		docHints = append(docHints, embeddedHints...)
	}
	signals := MatchSignals{
		ServiceRootVendor:   lookupString(serviceRootDoc, "Vendor"),
		ServiceRootProduct:  lookupString(serviceRootDoc, "Product"),
		SystemManufacturer:  lookupString(systemDoc, "Manufacturer"),
		SystemModel:         lookupString(systemDoc, "Model"),
		SystemSKU:           lookupString(systemDoc, "SKU"),
		ChassisManufacturer: lookupString(chassisDoc, "Manufacturer"),
		ChassisModel:        lookupString(chassisDoc, "Model"),
		ManagerManufacturer: lookupString(managerDoc, "Manufacturer"),
		ResourceHints:       resourceHints,
		DocHints:            docHints,
	}
	signals.OEMNamespaces = dedupeSorted(append(
		oemNamespaces(serviceRootDoc),
		append(oemNamespaces(systemDoc), append(oemNamespaces(chassisDoc), oemNamespaces(managerDoc)...)...)...,
	))
	return normalizeSignals(signals)
}

func CollectSignalsFromTree(tree map[string]interface{}) MatchSignals {
	getDoc := func(path string) map[string]interface{} {
		if v, ok := tree[path]; ok {
			if doc, ok := v.(map[string]interface{}); ok {
				return doc
			}
		}
		return nil
	}

	memberPath := func(collectionPath, fallbackPath string) string {
		collection := getDoc(collectionPath)
		if len(collection) != 0 {
			if members, ok := collection["Members"].([]interface{}); ok && len(members) > 0 {
				if ref, ok := members[0].(map[string]interface{}); ok {
					if path := lookupString(ref, "@odata.id"); path != "" {
						return path
					}
				}
			}
		}
		return fallbackPath
	}

	systemPath := memberPath("/redfish/v1/Systems", "/redfish/v1/Systems/1")
	chassisPath := memberPath("/redfish/v1/Chassis", "/redfish/v1/Chassis/1")
	managerPath := memberPath("/redfish/v1/Managers", "/redfish/v1/Managers/1")

	resourceHints := make([]string, 0, len(tree))
	hintDocs := make([]map[string]interface{}, 0, len(tree))
	for path := range tree {
		path = strings.TrimSpace(path)
		if path == "" {
			continue
		}
		resourceHints = append(resourceHints, path)
	}
	for _, v := range tree {
		doc, ok := v.(map[string]interface{})
		if !ok {
			continue
		}
		hintDocs = append(hintDocs, doc)
	}

	return CollectSignals(
		getDoc("/redfish/v1"),
		getDoc(systemPath),
		getDoc(chassisPath),
		getDoc(managerPath),
		resourceHints,
		hintDocs...,
	)
}

func collectDocSignalHints(doc map[string]interface{}) ([]string, []string) {
	if len(doc) == 0 {
		return nil, nil
	}
	paths := make([]string, 0)
	hints := make([]string, 0)
	var walk func(any)
	walk = func(v any) {
		switch x := v.(type) {
		case map[string]interface{}:
			for rawKey, child := range x {
				key := strings.TrimSpace(rawKey)
				if key != "" {
					hints = append(hints, key)
				}
				if s, ok := child.(string); ok {
					s = strings.TrimSpace(s)
					if s != "" {
						switch key {
						case "@odata.id", "target":
							paths = append(paths, s)
						case "@odata.type":
							hints = append(hints, s)
						default:
							if isInterestingSignalString(s) {
								hints = append(hints, s)
								if strings.HasPrefix(s, "/") {
									paths = append(paths, s)
								}
							}
						}
					}
				}
				walk(child)
			}
		case []interface{}:
			for _, child := range x {
				walk(child)
			}
		}
	}
	walk(doc)
	return paths, hints
}

func isInterestingSignalString(s string) bool {
	switch {
	case strings.HasPrefix(s, "/"):
		return true
	case strings.HasPrefix(s, "#"):
		return true
	case strings.Contains(s, "COMMONb"):
		return true
	case strings.Contains(s, "EnvironmentMetrcs"):
		return true
	case strings.Contains(s, "GetServerAllUSBStatus"):
		return true
	default:
		return false
	}
}

func lookupString(doc map[string]interface{}, key string) string {
	if len(doc) == 0 {
		return ""
	}
	value := doc[key]
	if s, ok := value.(string); ok {
		return strings.TrimSpace(s)
	}
	return ""
}

func oemNamespaces(doc map[string]interface{}) []string {
	if len(doc) == 0 {
		return nil
	}
	oem, ok := doc["Oem"].(map[string]interface{})
	if !ok {
		return nil
	}
	out := make([]string, 0, len(oem))
	for key := range oem {
		key = strings.TrimSpace(key)
		if key == "" {
			continue
		}
		out = append(out, key)
	}
	return out
}
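The `memberPath` closure in `CollectSignalsFromTree` resolves `Members[0].@odata.id` from a collection document and falls back to a conventional path (`/redfish/v1/Systems/1`) when the collection is missing or empty. The same lookup in isolation, with a hand-written collection map for illustration:

```go
package main

import "fmt"

// firstMemberPath returns Members[0].@odata.id from a decoded Redfish
// collection document, or fallback when the collection is absent or empty.
func firstMemberPath(collection map[string]interface{}, fallback string) string {
	if members, ok := collection["Members"].([]interface{}); ok && len(members) > 0 {
		if ref, ok := members[0].(map[string]interface{}); ok {
			if s, ok := ref["@odata.id"].(string); ok && s != "" {
				return s
			}
		}
	}
	return fallback
}

func main() {
	// A Dell-style Systems collection whose sole member uses a non-numeric ID.
	systems := map[string]interface{}{
		"Members": []interface{}{
			map[string]interface{}{"@odata.id": "/redfish/v1/Systems/System.Embedded.1"},
		},
	}
	fmt.Println(firstMemberPath(systems, "/redfish/v1/Systems/1")) // /redfish/v1/Systems/System.Embedded.1
	fmt.Println(firstMemberPath(nil, "/redfish/v1/Systems/1"))     // /redfish/v1/Systems/1
}
```

This is what lets signal collection find the system document on BMCs like iDRAC, where the member is `System.Embedded.1` rather than `1` (compare the dell-r750.json testdata below).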
17 internal/collector/redfishprofile/testdata/ami-generic.json vendored Normal file
@@ -0,0 +1,17 @@
{
  "ServiceRootVendor": "AMI",
  "ServiceRootProduct": "AMI Redfish Server",
  "SystemManufacturer": "Gigabyte",
  "SystemModel": "G292-Z42",
  "SystemSKU": "",
  "ChassisManufacturer": "",
  "ChassisModel": "",
  "ManagerManufacturer": "",
  "OEMNamespaces": ["Ami"],
  "ResourceHints": [
    "/redfish/v1/Chassis/Self",
    "/redfish/v1/Managers/Self",
    "/redfish/v1/Oem/Ami",
    "/redfish/v1/Systems/Self"
  ]
}
18 internal/collector/redfishprofile/testdata/dell-r750.json vendored Normal file
@@ -0,0 +1,18 @@
{
  "ServiceRootVendor": "",
  "ServiceRootProduct": "iDRAC Redfish Service",
  "SystemManufacturer": "Dell Inc.",
  "SystemModel": "PowerEdge R750",
  "SystemSKU": "0A42H9",
  "ChassisManufacturer": "Dell Inc.",
  "ChassisModel": "PowerEdge R750",
  "ManagerManufacturer": "Dell Inc.",
  "OEMNamespaces": ["Dell"],
  "ResourceHints": [
    "/redfish/v1/Chassis/System.Embedded.1",
    "/redfish/v1/Managers/iDRAC.Embedded.1",
    "/redfish/v1/Managers/iDRAC.Embedded.1/Oem/Dell",
    "/redfish/v1/Systems/System.Embedded.1",
    "/redfish/v1/Systems/System.Embedded.1/Storage"
  ]
}
33 internal/collector/redfishprofile/testdata/msi-cg290.json vendored Normal file
@@ -0,0 +1,33 @@
{
  "ServiceRootVendor": "AMI",
  "ServiceRootProduct": "AMI Redfish Server",
  "SystemManufacturer": "Micro-Star International Co., Ltd.",
  "SystemModel": "CG290-S3063",
  "SystemSKU": "S3063G290RAU4",
  "ChassisManufacturer": "NVIDIA",
  "ChassisModel": "",
  "ManagerManufacturer": "",
  "OEMNamespaces": ["Ami"],
  "ResourceHints": [
    "/redfish/v1/Chassis/GPU1",
    "/redfish/v1/Chassis/GPU1/NetworkAdapters",
    "/redfish/v1/Chassis/GPU1/Sensors",
    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Power",
    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_TLimit",
    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Temperature",
    "/redfish/v1/Chassis/GPU2",
    "/redfish/v1/Chassis/GPU2/NetworkAdapters",
    "/redfish/v1/Chassis/GPU2/Sensors",
    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Power",
    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_TLimit",
    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Temperature",
    "/redfish/v1/Chassis/GPU3",
    "/redfish/v1/Chassis/GPU3/NetworkAdapters",
    "/redfish/v1/Chassis/GPU3/Sensors",
    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Power",
    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_TLimit",
    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Temperature",
    "/redfish/v1/Chassis/GPU4",
    "/redfish/v1/Chassis/GPU4/NetworkAdapters"
  ]
}
33 internal/collector/redfishprofile/testdata/msi-cg480-copy.json vendored Normal file
@@ -0,0 +1,33 @@
{
  "ServiceRootVendor": "AMI",
  "ServiceRootProduct": "AMI Redfish Server",
  "SystemManufacturer": "Micro-Star International Co., Ltd.",
  "SystemModel": "CG480-S5063",
  "SystemSKU": "5063G480RAE20",
  "ChassisManufacturer": "NVIDIA",
  "ChassisModel": "",
  "ManagerManufacturer": "",
  "OEMNamespaces": ["Ami"],
  "ResourceHints": [
    "/redfish/v1/Chassis/GPU1",
    "/redfish/v1/Chassis/GPU1/NetworkAdapters",
    "/redfish/v1/Chassis/GPU1/Sensors",
    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Power",
    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_TLimit",
    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Temperature",
    "/redfish/v1/Chassis/GPU2",
    "/redfish/v1/Chassis/GPU2/NetworkAdapters",
    "/redfish/v1/Chassis/GPU2/Sensors",
    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Power",
    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_TLimit",
    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Temperature",
    "/redfish/v1/Chassis/GPU3",
    "/redfish/v1/Chassis/GPU3/NetworkAdapters",
    "/redfish/v1/Chassis/GPU3/Sensors",
    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Power",
    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_TLimit",
    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Temperature",
    "/redfish/v1/Chassis/GPU4",
    "/redfish/v1/Chassis/GPU4/NetworkAdapters"
  ]
}
33 internal/collector/redfishprofile/testdata/msi-cg480.json vendored Normal file
@@ -0,0 +1,33 @@
{
  "ServiceRootVendor": "AMI",
  "ServiceRootProduct": "AMI Redfish Server",
  "SystemManufacturer": "Micro-Star International Co., Ltd.",
  "SystemModel": "CG480-S5063",
  "SystemSKU": "5063G480RAE20",
  "ChassisManufacturer": "NVIDIA",
  "ChassisModel": "",
  "ManagerManufacturer": "",
  "OEMNamespaces": ["Ami"],
  "ResourceHints": [
    "/redfish/v1/Chassis/GPU1",
    "/redfish/v1/Chassis/GPU1/NetworkAdapters",
    "/redfish/v1/Chassis/GPU1/Sensors",
    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Power",
    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_TLimit",
    "/redfish/v1/Chassis/GPU1/Sensors/GPU1_Temperature",
    "/redfish/v1/Chassis/GPU2",
    "/redfish/v1/Chassis/GPU2/NetworkAdapters",
    "/redfish/v1/Chassis/GPU2/Sensors",
    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Power",
    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_TLimit",
    "/redfish/v1/Chassis/GPU2/Sensors/GPU2_Temperature",
    "/redfish/v1/Chassis/GPU3",
    "/redfish/v1/Chassis/GPU3/NetworkAdapters",
    "/redfish/v1/Chassis/GPU3/Sensors",
    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Power",
    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_TLimit",
    "/redfish/v1/Chassis/GPU3/Sensors/GPU3_Temperature",
    "/redfish/v1/Chassis/GPU4",
    "/redfish/v1/Chassis/GPU4/NetworkAdapters"
  ]
}
33  internal/collector/redfishprofile/testdata/supermicro-hgx.json  vendored  Normal file
@@ -0,0 +1,33 @@
{
    "ServiceRootVendor": "Supermicro",
    "ServiceRootProduct": "",
    "SystemManufacturer": "Supermicro",
    "SystemModel": "SYS-821GE-TNHR",
    "SystemSKU": "0x1D1415D9",
    "ChassisManufacturer": "Supermicro",
    "ChassisModel": "X13DEG-OAD",
    "ManagerManufacturer": "",
    "OEMNamespaces": ["Supermicro"],
    "ResourceHints": [
        "/redfish/v1/Chassis/HGX_BMC_0",
        "/redfish/v1/Chassis/HGX_BMC_0/Assembly",
        "/redfish/v1/Chassis/HGX_BMC_0/Controls",
        "/redfish/v1/Chassis/HGX_BMC_0/Drives",
        "/redfish/v1/Chassis/HGX_BMC_0/EnvironmentMetrics",
        "/redfish/v1/Chassis/HGX_BMC_0/LogServices",
        "/redfish/v1/Chassis/HGX_BMC_0/PCIeDevices",
        "/redfish/v1/Chassis/HGX_BMC_0/PCIeSlots",
        "/redfish/v1/Chassis/HGX_BMC_0/PowerSubsystem",
        "/redfish/v1/Chassis/HGX_BMC_0/PowerSubsystem/PowerSupplies",
        "/redfish/v1/Chassis/HGX_BMC_0/Sensors",
        "/redfish/v1/Chassis/HGX_BMC_0/Sensors/HGX_BMC_0_Temp_0",
        "/redfish/v1/Chassis/HGX_BMC_0/ThermalSubsystem",
        "/redfish/v1/Chassis/HGX_BMC_0/ThermalSubsystem/ThermalMetrics",
        "/redfish/v1/Chassis/HGX_Chassis_0",
        "/redfish/v1/Chassis/HGX_Chassis_0/Assembly",
        "/redfish/v1/Chassis/HGX_Chassis_0/Controls",
        "/redfish/v1/Chassis/HGX_Chassis_0/Controls/TotalGPU_Power_0",
        "/redfish/v1/Chassis/HGX_Chassis_0/Drives",
        "/redfish/v1/Chassis/HGX_Chassis_0/EnvironmentMetrics"
    ]
}
51  internal/collector/redfishprofile/testdata/supermicro-oam-amd.json  vendored  Normal file
@@ -0,0 +1,51 @@
{
    "ServiceRootVendor": "",
    "ServiceRootProduct": "H12DGQ-NT6",
    "SystemManufacturer": "Supermicro",
    "SystemModel": "AS -4124GQ-TNMI",
    "SystemSKU": "091715D9",
    "ChassisManufacturer": "Supermicro",
    "ChassisModel": "H12DGQ-NT6",
    "ManagerManufacturer": "",
    "OEMNamespaces": [
        "Supermicro"
    ],
    "ResourceHints": [
        "/redfish/v1/Chassis/1/PCIeDevices",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU1",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU1/PCIeFunctions",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU1/PCIeFunctions/1",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU2",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU2/PCIeFunctions",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU2/PCIeFunctions/1",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU3",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU3/PCIeFunctions",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU3/PCIeFunctions/1",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU4",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU4/PCIeFunctions",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU4/PCIeFunctions/1",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU5",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU5/PCIeFunctions",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU5/PCIeFunctions/1",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU6",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU6/PCIeFunctions",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU6/PCIeFunctions/1",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU7",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU7/PCIeFunctions",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU7/PCIeFunctions/1",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU8",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU8/PCIeFunctions",
        "/redfish/v1/Chassis/1/PCIeDevices/GPU8/PCIeFunctions/1",
        "/redfish/v1/Managers/1/Oem/Supermicro/FanMode",
        "/redfish/v1/Oem/Supermicro/DumpService",
        "/redfish/v1/UpdateService/FirmwareInventory/GPU1",
        "/redfish/v1/UpdateService/FirmwareInventory/GPU2",
        "/redfish/v1/UpdateService/FirmwareInventory/GPU3",
        "/redfish/v1/UpdateService/FirmwareInventory/GPU4",
        "/redfish/v1/UpdateService/FirmwareInventory/GPU5",
        "/redfish/v1/UpdateService/FirmwareInventory/GPU6",
        "/redfish/v1/UpdateService/FirmwareInventory/GPU7",
        "/redfish/v1/UpdateService/FirmwareInventory/GPU8",
        "/redfish/v1/UpdateService/Oem/Supermicro/FirmwareInventory"
    ]
}
16  internal/collector/redfishprofile/testdata/unknown-vendor.json  vendored  Normal file
@@ -0,0 +1,16 @@
{
    "ServiceRootVendor": "",
    "ServiceRootProduct": "Redfish Service",
    "SystemManufacturer": "",
    "SystemModel": "",
    "SystemSKU": "",
    "ChassisManufacturer": "",
    "ChassisModel": "",
    "ManagerManufacturer": "",
    "OEMNamespaces": [],
    "ResourceHints": [
        "/redfish/v1/Chassis/1",
        "/redfish/v1/Managers/1",
        "/redfish/v1/Systems/1"
    ]
}
24  internal/collector/redfishprofile/testdata/xfusion-g5500v7.json  vendored  Normal file
@@ -0,0 +1,24 @@
{
    "ServiceRootVendor": "xFusion",
    "ServiceRootProduct": "G5500 V7",
    "SystemManufacturer": "OEM",
    "SystemModel": "G5500 V7",
    "SystemSKU": "",
    "ChassisManufacturer": "OEM",
    "ChassisModel": "G5500 V7",
    "ManagerManufacturer": "XFUSION",
    "OEMNamespaces": ["xFusion"],
    "ResourceHints": [
        "/redfish/v1/Chassis/1",
        "/redfish/v1/Chassis/1/Drives",
        "/redfish/v1/Chassis/1/PCIeDevices",
        "/redfish/v1/Chassis/1/Sensors",
        "/redfish/v1/Managers/1",
        "/redfish/v1/Systems/1",
        "/redfish/v1/Systems/1/GraphicsControllers",
        "/redfish/v1/Systems/1/Processors",
        "/redfish/v1/Systems/1/Processors/Gpu1",
        "/redfish/v1/Systems/1/Storages",
        "/redfish/v1/UpdateService/FirmwareInventory"
    ]
}
172  internal/collector/redfishprofile/types.go  Normal file
@@ -0,0 +1,172 @@
package redfishprofile

import (
    "sort"

    "git.mchus.pro/mchus/logpile/internal/models"
)

type MatchSignals struct {
    ServiceRootVendor   string
    ServiceRootProduct  string
    SystemManufacturer  string
    SystemModel         string
    SystemSKU           string
    ChassisManufacturer string
    ChassisModel        string
    ManagerManufacturer string
    OEMNamespaces       []string
    ResourceHints       []string
    DocHints            []string
}

type AcquisitionPlan struct {
    Mode          string
    Profiles      []string
    SeedPaths     []string
    CriticalPaths []string
    PlanBPaths    []string
    Notes         []string
    ScopedPaths   AcquisitionScopedPathPolicy
    Tuning        AcquisitionTuning
}

type DiscoveredResources struct {
    SystemPaths  []string
    ChassisPaths []string
    ManagerPaths []string
}

type ResolvedAcquisitionPlan struct {
    Plan          AcquisitionPlan
    SeedPaths     []string
    CriticalPaths []string
}

type AcquisitionScopedPathPolicy struct {
    SystemSeedSuffixes      []string
    SystemCriticalSuffixes  []string
    ChassisSeedSuffixes     []string
    ChassisCriticalSuffixes []string
    ManagerSeedSuffixes     []string
    ManagerCriticalSuffixes []string
}

type AcquisitionTuning struct {
    SnapshotMaxDocuments    int
    SnapshotWorkers         int
    SnapshotExcludeContains []string
    PrefetchEnabled         *bool
    PrefetchWorkers         int
    NVMePostProbeEnabled    *bool
    RatePolicy              AcquisitionRatePolicy
    ETABaseline             AcquisitionETABaseline
    PostProbePolicy         AcquisitionPostProbePolicy
    RecoveryPolicy          AcquisitionRecoveryPolicy
    PrefetchPolicy          AcquisitionPrefetchPolicy
}

type AcquisitionRatePolicy struct {
    TargetP95LatencyMS      int
    ThrottleP95LatencyMS    int
    MinSnapshotWorkers      int
    MinPrefetchWorkers      int
    DisablePrefetchOnErrors bool
}

type AcquisitionETABaseline struct {
    DiscoverySeconds     int
    SnapshotSeconds      int
    PrefetchSeconds      int
    CriticalPlanBSeconds int
    ProfilePlanBSeconds  int
}

type AcquisitionPostProbePolicy struct {
    EnableDirectNVMEDiskBayProbe bool
    EnableNumericCollectionProbe bool
    EnableSensorCollectionProbe  bool
}

type AcquisitionRecoveryPolicy struct {
    EnableCriticalCollectionMemberRetry bool
    EnableCriticalSlowProbe             bool
    EnableProfilePlanB                  bool
    EnableEmptyCriticalCollectionRetry  bool
}

type AcquisitionPrefetchPolicy struct {
    IncludeSuffixes []string
    ExcludeContains []string
}

type AnalysisDirectives struct {
    EnableProcessorGPUFallback           bool
    EnableSupermicroNVMeBackplane        bool
    EnableProcessorGPUChassisAlias       bool
    EnableGenericGraphicsControllerDedup bool
    EnableMSIProcessorGPUChassisLookup   bool
    EnableMSIGhostGPUFilter              bool
    EnableStorageEnclosureRecovery       bool
    EnableKnownStorageControllerRecovery bool
}

type ResolvedAnalysisPlan struct {
    Match                          MatchResult
    Directives                     AnalysisDirectives
    Notes                          []string
    ProcessorGPUChassisLookupModes []string
    KnownStorageDriveCollections   []string
    KnownStorageVolumeCollections  []string
}

type Profile interface {
    Name() string
    Priority() int
    Match(signals MatchSignals) int
    SafeForFallback() bool
    ExtendAcquisitionPlan(plan *AcquisitionPlan, signals MatchSignals)
    RefineAcquisitionPlan(resolved *ResolvedAcquisitionPlan, discovered DiscoveredResources, signals MatchSignals)
    ApplyAnalysisDirectives(directives *AnalysisDirectives, signals MatchSignals)
    RefineAnalysisPlan(plan *ResolvedAnalysisPlan, snapshot map[string]interface{}, discovered DiscoveredResources, signals MatchSignals)
    PostAnalyze(result *models.AnalysisResult, snapshot map[string]interface{}, signals MatchSignals)
}

type MatchResult struct {
    Mode     string
    Profiles []Profile
    Scores   []ProfileScore
}

type ProfileScore struct {
    Name     string
    Score    int
    Active   bool
    Priority int
}

func normalizeSignals(signals MatchSignals) MatchSignals {
    signals.OEMNamespaces = dedupeSorted(signals.OEMNamespaces)
    signals.ResourceHints = dedupeSorted(signals.ResourceHints)
    signals.DocHints = dedupeSorted(signals.DocHints)
    return signals
}

func dedupeSorted(items []string) []string {
    if len(items) == 0 {
        return nil
    }
    set := make(map[string]struct{}, len(items))
    for _, item := range items {
        if item == "" {
            continue
        }
        set[item] = struct{}{}
    }
    out := make([]string, 0, len(set))
    for item := range set {
        out = append(out, item)
    }
    sort.Strings(out)
    return out
}
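As a quick illustration of the signal normalization above, the sketch below restates `dedupeSorted` verbatim from `types.go` in a standalone program (the package wrapper and `main` are added here only for demonstration): empty strings are dropped, duplicates collapse, and the result comes back sorted, so `MatchSignals` comparisons are order-insensitive.

```go
package main

import (
	"fmt"
	"sort"
)

// dedupeSorted is copied from types.go above: it drops empty strings,
// removes duplicates, and returns the remainder sorted (nil for no input).
func dedupeSorted(items []string) []string {
	if len(items) == 0 {
		return nil
	}
	set := make(map[string]struct{}, len(items))
	for _, item := range items {
		if item == "" {
			continue
		}
		set[item] = struct{}{}
	}
	out := make([]string, 0, len(set))
	for item := range set {
		out = append(out, item)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Duplicate and empty OEM namespace entries collapse to a stable list.
	fmt.Println(dedupeSorted([]string{"Supermicro", "", "Ami", "Supermicro"}))
	// Prints: [Ami Supermicro]
}
```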
@@ -7,25 +7,73 @@ import (
 )

 type Request struct {
-	Host     string
-	Protocol string
-	Port     int
-	Username string
-	AuthType string
-	Password string
-	Token    string
-	TLSMode  string
+	Host          string
+	Protocol      string
+	Port          int
+	Username      string
+	AuthType      string
+	Password      string
+	Token         string
+	TLSMode       string
+	DebugPayloads bool
+	SkipHungCh    <-chan struct{}
 }

 type Progress struct {
-	Status   string
-	Progress int
-	Message  string
+	Status        string
+	Progress      int
+	Message       string
+	CurrentPhase  string
+	ETASeconds    int
+	ActiveModules []ModuleActivation
+	ModuleScores  []ModuleScore
+	DebugInfo     *CollectDebugInfo
 }

 type ProgressFn func(Progress)

+type ModuleActivation struct {
+	Name  string
+	Score int
+}
+
+type ModuleScore struct {
+	Name     string
+	Score    int
+	Active   bool
+	Priority int
+}
+
+type CollectDebugInfo struct {
+	AdaptiveThrottled bool
+	SnapshotWorkers   int
+	PrefetchWorkers   int
+	PrefetchEnabled   *bool
+	PhaseTelemetry    []PhaseTelemetry
+}
+
+type PhaseTelemetry struct {
+	Phase     string
+	Requests  int
+	Errors    int
+	ErrorRate float64
+	AvgMS     int64
+	P95MS     int64
+}
+
 type ProbeResult struct {
 	Reachable      bool
 	Protocol       string
 	HostPowerState string
 	HostPoweredOn  bool
 	SystemPath     string
 }

 type Connector interface {
 	Protocol() string
 	Collect(ctx context.Context, req Request, emit ProgressFn) (*models.AnalysisResult, error)
 }

 type Prober interface {
 	Probe(ctx context.Context, req Request) (*ProbeResult, error)
 }
@@ -66,104 +66,15 @@ func (e *Exporter) ExportCSV(w io.Writer) error {
 		}
 	}

-	// CPUs
-	for _, cpu := range e.result.Hardware.CPUs {
-		if !hasUsableSerial(cpu.SerialNumber) {
-			continue
-		}
-		if err := writer.Write([]string{
-			cpu.Model,
-			strings.TrimSpace(cpu.SerialNumber),
-			"",
-			"CPU",
-		}); err != nil {
-			return err
-		}
-	}
-
-	// Memory
-	for _, mem := range e.result.Hardware.Memory {
-		if !hasUsableSerial(mem.SerialNumber) {
-			continue
-		}
-		location := mem.Location
-		if location == "" {
-			location = mem.Slot
-		}
-		if err := writer.Write([]string{
-			mem.PartNumber,
-			strings.TrimSpace(mem.SerialNumber),
-			mem.Manufacturer,
-			location,
-		}); err != nil {
-			return err
-		}
-	}
-
-	// Storage
-	for _, stor := range e.result.Hardware.Storage {
-		if !hasUsableSerial(stor.SerialNumber) {
-			continue
-		}
-		if err := writer.Write([]string{
-			stor.Model,
-			strings.TrimSpace(stor.SerialNumber),
-			stor.Manufacturer,
-			stor.Slot,
-		}); err != nil {
-			return err
-		}
-	}
-
-	// GPUs
-	for _, gpu := range e.result.Hardware.GPUs {
-		if !hasUsableSerial(gpu.SerialNumber) {
-			continue
-		}
-		component := gpu.Model
-		if component == "" {
-			component = "GPU"
-		}
-		if err := writer.Write([]string{
-			component,
-			strings.TrimSpace(gpu.SerialNumber),
-			gpu.Manufacturer,
-			gpu.Slot,
-		}); err != nil {
-			return err
-		}
-	}
-
-	// PCIe devices
-	for _, pcie := range e.result.Hardware.PCIeDevices {
-		if !hasUsableSerial(pcie.SerialNumber) {
-			continue
-		}
-		if err := writer.Write([]string{
-			pcie.DeviceClass,
-			strings.TrimSpace(pcie.SerialNumber),
-			pcie.Manufacturer,
-			pcie.Slot,
-		}); err != nil {
-			return err
-		}
-	}
-
-	// Network adapters
-	for _, nic := range e.result.Hardware.NetworkAdapters {
-		if !hasUsableSerial(nic.SerialNumber) {
-			continue
-		}
-		location := nic.Location
-		if location == "" {
-			location = nic.Slot
-		}
-		if err := writer.Write([]string{
-			nic.Model,
-			strings.TrimSpace(nic.SerialNumber),
-			nic.Vendor,
-			location,
-		}); err != nil {
+	seenCanonical := make(map[string]struct{})
+	for _, dev := range canonicalDevicesForExport(e.result.Hardware) {
+		if !hasUsableSerial(dev.SerialNumber) {
+			continue
+		}
+		serial := strings.TrimSpace(dev.SerialNumber)
+		seenCanonical[serial] = struct{}{}
+		component, manufacturer, location := csvFieldsFromCanonicalDevice(dev)
+		if err := writer.Write([]string{component, serial, manufacturer, location}); err != nil {
 			return err
 		}
 	}
@@ -173,26 +84,15 @@ func (e *Exporter) ExportCSV(w io.Writer) error {
 		if !hasUsableSerial(nic.SerialNumber) {
 			continue
 		}
-		if err := writer.Write([]string{
-			nic.Model,
-			strings.TrimSpace(nic.SerialNumber),
-			"",
-			"Network",
-		}); err != nil {
-			return err
-		}
-	}
-
-	// Power supplies
-	for _, psu := range e.result.Hardware.PowerSupply {
-		if !hasUsableSerial(psu.SerialNumber) {
+		serial := strings.TrimSpace(nic.SerialNumber)
+		if _, ok := seenCanonical[serial]; ok {
 			continue
 		}
 		if err := writer.Write([]string{
-			psu.Model,
-			strings.TrimSpace(psu.SerialNumber),
-			psu.Vendor,
-			psu.Slot,
+			nic.Model,
+			serial,
+			"",
+			"Network",
 		}); err != nil {
 			return err
 		}
@@ -221,3 +121,52 @@ func hasUsableSerial(serial string) bool {
 	return true
 }
+
+func csvFieldsFromCanonicalDevice(dev models.HardwareDevice) (component, manufacturer, location string) {
+	component = firstNonEmptyString(
+		dev.Model,
+		dev.PartNumber,
+		dev.DeviceClass,
+		dev.Kind,
+	)
+	manufacturer = firstNonEmptyString(dev.Manufacturer, inferCSVVendor(dev))
+	location = firstNonEmptyString(dev.Location, dev.Slot, dev.BDF, dev.Kind)
+
+	switch dev.Kind {
+	case models.DeviceKindCPU:
+		if component == "" {
+			component = "CPU"
+		}
+		if location == "" {
+			location = "CPU"
+		}
+	case models.DeviceKindMemory:
+		component = firstNonEmptyString(dev.PartNumber, dev.Model, "Memory")
+	case models.DeviceKindPCIe, models.DeviceKindGPU, models.DeviceKindNetwork:
+		if location == "" {
+			location = firstNonEmptyString(dev.Slot, dev.BDF, "PCIe")
+		}
+	case models.DeviceKindPSU:
+		component = firstNonEmptyString(dev.Model, "Power Supply")
+	}
+
+	return component, manufacturer, location
+}
+
+func inferCSVVendor(dev models.HardwareDevice) string {
+	switch dev.Kind {
+	case models.DeviceKindCPU:
+		return ""
+	default:
+		return ""
+	}
+}
+
+func firstNonEmptyString(values ...string) string {
+	for _, value := range values {
+		if strings.TrimSpace(value) != "" {
+			return strings.TrimSpace(value)
+		}
+	}
+	return ""
+}
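The fallback chains in `csvFieldsFromCanonicalDevice` all lean on `firstNonEmptyString`. The sketch below restates that helper verbatim from the hunk above in a standalone program (the `main` wrapper and the example values are added for demonstration): whitespace-only values are skipped and the first real match is returned already trimmed.

```go
package main

import (
	"fmt"
	"strings"
)

// firstNonEmptyString mirrors the exporter helper above: it returns the
// first argument whose trimmed value is non-empty, with trimming applied.
func firstNonEmptyString(values ...string) string {
	for _, value := range values {
		if strings.TrimSpace(value) != "" {
			return strings.TrimSpace(value)
		}
	}
	return ""
}

func main() {
	// Fallback chain in the style of the CSV "location" column:
	// Location → Slot → BDF, skipping blanks and whitespace.
	fmt.Println(firstNonEmptyString("", "  ", " Slot 3 ", "0000:17:00.0"))
	// Prints: Slot 3
}
```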
File diff suppressed because it is too large
@@ -19,6 +19,8 @@ type ReanimatorHardware struct {
 	Storage       []ReanimatorStorage  `json:"storage,omitempty"`
 	PCIeDevices   []ReanimatorPCIe     `json:"pcie_devices,omitempty"`
 	PowerSupplies []ReanimatorPSU      `json:"power_supplies,omitempty"`
+	Sensors       *ReanimatorSensors   `json:"sensors,omitempty"`
+	EventLogs     []ReanimatorEventLog `json:"event_logs,omitempty"`
 }

 // ReanimatorBoard represents motherboard/server information
@@ -36,11 +38,6 @@ type ReanimatorFirmware struct {
 	Version string `json:"version"`
 }

-type ReanimatorStatusAtCollection struct {
-	Status string `json:"status"`
-	At     string `json:"at"`
-}
-
 type ReanimatorStatusHistoryEntry struct {
 	Status    string `json:"status"`
 	ChangedAt string `json:"changed_at"`
@@ -49,105 +46,209 @@ type ReanimatorStatusHistoryEntry struct {

 // ReanimatorCPU represents processor information
 type ReanimatorCPU struct {
-	Socket           int                            `json:"socket"`
-	Model            string                         `json:"model"`
-	Cores            int                            `json:"cores,omitempty"`
-	Threads          int                            `json:"threads,omitempty"`
-	FrequencyMHz     int                            `json:"frequency_mhz,omitempty"`
-	MaxFrequencyMHz  int                            `json:"max_frequency_mhz,omitempty"`
-	Manufacturer     string                         `json:"manufacturer,omitempty"`
-	Status           string                         `json:"status,omitempty"`
-	StatusCheckedAt  string                         `json:"status_checked_at,omitempty"`
-	StatusChangedAt  string                         `json:"status_changed_at,omitempty"`
-	StatusAtCollect  *ReanimatorStatusAtCollection  `json:"status_at_collection,omitempty"`
-	StatusHistory    []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
-	ErrorDescription string                         `json:"error_description,omitempty"`
+	Socket                  int                            `json:"socket"`
+	Model                   string                         `json:"model,omitempty"`
+	Cores                   int                            `json:"cores,omitempty"`
+	Threads                 int                            `json:"threads,omitempty"`
+	FrequencyMHz            int                            `json:"frequency_mhz,omitempty"`
+	MaxFrequencyMHz         int                            `json:"max_frequency_mhz,omitempty"`
+	TemperatureC            float64                        `json:"temperature_c,omitempty"`
+	PowerW                  float64                        `json:"power_w,omitempty"`
+	Throttled               *bool                          `json:"throttled,omitempty"`
+	CorrectableErrorCount   int64                          `json:"correctable_error_count,omitempty"`
+	UncorrectableErrorCount int64                          `json:"uncorrectable_error_count,omitempty"`
+	LifeRemainingPct        float64                        `json:"life_remaining_pct,omitempty"`
+	LifeUsedPct             float64                        `json:"life_used_pct,omitempty"`
+	SerialNumber            string                         `json:"serial_number,omitempty"`
+	Firmware                string                         `json:"firmware,omitempty"`
+	Present                 *bool                          `json:"present,omitempty"`
+	Manufacturer            string                         `json:"manufacturer,omitempty"`
+	Status                  string                         `json:"status,omitempty"`
+	StatusCheckedAt         string                         `json:"status_checked_at,omitempty"`
+	StatusChangedAt         string                         `json:"status_changed_at,omitempty"`
+	ManufacturedYearWeek    string                         `json:"manufactured_year_week,omitempty"`
+	StatusHistory           []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
+	ErrorDescription        string                         `json:"error_description,omitempty"`
 }

 // ReanimatorMemory represents a memory module (DIMM)
 type ReanimatorMemory struct {
-	Slot             string                         `json:"slot"`
-	Location         string                         `json:"location,omitempty"`
-	Present          bool                           `json:"present"`
-	SizeMB           int                            `json:"size_mb,omitempty"`
-	Type             string                         `json:"type,omitempty"`
-	MaxSpeedMHz      int                            `json:"max_speed_mhz,omitempty"`
-	CurrentSpeedMHz  int                            `json:"current_speed_mhz,omitempty"`
-	Manufacturer     string                         `json:"manufacturer,omitempty"`
-	SerialNumber     string                         `json:"serial_number,omitempty"`
-	PartNumber       string                         `json:"part_number,omitempty"`
-	Status           string                         `json:"status,omitempty"`
-	StatusCheckedAt  string                         `json:"status_checked_at,omitempty"`
-	StatusChangedAt  string                         `json:"status_changed_at,omitempty"`
-	StatusAtCollect  *ReanimatorStatusAtCollection  `json:"status_at_collection,omitempty"`
-	StatusHistory    []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
-	ErrorDescription string                         `json:"error_description,omitempty"`
+	Slot                       string                         `json:"slot"`
+	Location                   string                         `json:"location,omitempty"`
+	Present                    *bool                          `json:"present,omitempty"`
+	SizeMB                     int                            `json:"size_mb,omitempty"`
+	Type                       string                         `json:"type,omitempty"`
+	MaxSpeedMHz                int                            `json:"max_speed_mhz,omitempty"`
+	CurrentSpeedMHz            int                            `json:"current_speed_mhz,omitempty"`
+	TemperatureC               float64                        `json:"temperature_c,omitempty"`
+	CorrectableECCErrorCount   int64                          `json:"correctable_ecc_error_count,omitempty"`
+	UncorrectableECCErrorCount int64                          `json:"uncorrectable_ecc_error_count,omitempty"`
+	LifeRemainingPct           float64                        `json:"life_remaining_pct,omitempty"`
+	LifeUsedPct                float64                        `json:"life_used_pct,omitempty"`
+	SpareBlocksRemainingPct    float64                        `json:"spare_blocks_remaining_pct,omitempty"`
+	PerformanceDegraded        *bool                          `json:"performance_degraded,omitempty"`
+	DataLossDetected           *bool                          `json:"data_loss_detected,omitempty"`
+	Manufacturer               string                         `json:"manufacturer,omitempty"`
+	SerialNumber               string                         `json:"serial_number,omitempty"`
+	PartNumber                 string                         `json:"part_number,omitempty"`
+	Status                     string                         `json:"status,omitempty"`
+	StatusCheckedAt            string                         `json:"status_checked_at,omitempty"`
+	StatusChangedAt            string                         `json:"status_changed_at,omitempty"`
+	ManufacturedYearWeek       string                         `json:"manufactured_year_week,omitempty"`
+	StatusHistory              []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
+	ErrorDescription           string                         `json:"error_description,omitempty"`
 }

 // ReanimatorStorage represents a storage device
 type ReanimatorStorage struct {
-	Slot             string                         `json:"slot"`
-	Type             string                         `json:"type,omitempty"`
-	Model            string                         `json:"model"`
-	SizeGB           int                            `json:"size_gb,omitempty"`
-	SerialNumber     string                         `json:"serial_number"`
-	Manufacturer     string                         `json:"manufacturer,omitempty"`
-	Firmware         string                         `json:"firmware,omitempty"`
-	Interface        string                         `json:"interface,omitempty"`
-	Present          bool                           `json:"present"`
-	Status           string                         `json:"status,omitempty"`
-	StatusCheckedAt  string                         `json:"status_checked_at,omitempty"`
-	StatusChangedAt  string                         `json:"status_changed_at,omitempty"`
-	StatusAtCollect  *ReanimatorStatusAtCollection  `json:"status_at_collection,omitempty"`
-	StatusHistory    []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
-	ErrorDescription string                         `json:"error_description,omitempty"`
+	Slot                  string                         `json:"slot"`
+	Type                  string                         `json:"type,omitempty"`
+	Model                 string                         `json:"model"`
+	SizeGB                int                            `json:"size_gb,omitempty"`
+	SerialNumber          string                         `json:"serial_number"`
+	Manufacturer          string                         `json:"manufacturer,omitempty"`
+	Firmware              string                         `json:"firmware,omitempty"`
+	Interface             string                         `json:"interface,omitempty"`
+	Present               *bool                          `json:"present,omitempty"`
+	TemperatureC          float64                        `json:"temperature_c,omitempty"`
+	PowerOnHours          int64                          `json:"power_on_hours,omitempty"`
+	PowerCycles           int64                          `json:"power_cycles,omitempty"`
+	UnsafeShutdowns       int64                          `json:"unsafe_shutdowns,omitempty"`
+	MediaErrors           int64                          `json:"media_errors,omitempty"`
+	ErrorLogEntries       int64                          `json:"error_log_entries,omitempty"`
+	WrittenBytes          int64                          `json:"written_bytes,omitempty"`
+	ReadBytes             int64                          `json:"read_bytes,omitempty"`
+	LifeUsedPct           float64                        `json:"life_used_pct,omitempty"`
+	RemainingEndurancePct *int                           `json:"remaining_endurance_pct,omitempty"`
+	LifeRemainingPct      float64                        `json:"life_remaining_pct,omitempty"`
+	AvailableSparePct     float64                        `json:"available_spare_pct,omitempty"`
+	ReallocatedSectors    int64                          `json:"reallocated_sectors,omitempty"`
+	CurrentPendingSectors int64                          `json:"current_pending_sectors,omitempty"`
+	OfflineUncorrectable  int64                          `json:"offline_uncorrectable,omitempty"`
+	Status                string                         `json:"status,omitempty"`
+	StatusCheckedAt       string                         `json:"status_checked_at,omitempty"`
+	StatusChangedAt       string                         `json:"status_changed_at,omitempty"`
+	ManufacturedYearWeek  string                         `json:"manufactured_year_week,omitempty"`
+	StatusHistory         []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
+	ErrorDescription      string                         `json:"error_description,omitempty"`
 }

 // ReanimatorPCIe represents a PCIe device
 type ReanimatorPCIe struct {
-	Slot             string                         `json:"slot"`
-	VendorID         int                            `json:"vendor_id,omitempty"`
-	DeviceID         int                            `json:"device_id,omitempty"`
-	BDF              string                         `json:"bdf,omitempty"`
-	DeviceClass      string                         `json:"device_class,omitempty"`
-	Manufacturer     string                         `json:"manufacturer,omitempty"`
-	Model            string                         `json:"model,omitempty"`
-	LinkWidth        int                            `json:"link_width,omitempty"`
-	LinkSpeed        string                         `json:"link_speed,omitempty"`
-	MaxLinkWidth     int                            `json:"max_link_width,omitempty"`
-	MaxLinkSpeed     string                         `json:"max_link_speed,omitempty"`
-	SerialNumber     string                         `json:"serial_number,omitempty"`
-	Firmware         string                         `json:"firmware,omitempty"`
-	TemperatureC     int                            `json:"temperature_c,omitempty"`
-	PowerW           int                            `json:"power_w,omitempty"`
-	VoltageV         float64                        `json:"voltage_v,omitempty"`
-	Status           string                         `json:"status,omitempty"`
-	StatusCheckedAt  string                         `json:"status_checked_at,omitempty"`
-	StatusChangedAt  string                         `json:"status_changed_at,omitempty"`
-	StatusAtCollect  *ReanimatorStatusAtCollection  `json:"status_at_collection,omitempty"`
-	StatusHistory    []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
-	ErrorDescription string                         `json:"error_description,omitempty"`
+	Slot                   string                         `json:"slot"`
+	VendorID               int                            `json:"vendor_id,omitempty"`
+	DeviceID               int                            `json:"device_id,omitempty"`
+	NUMANode               int                            `json:"numa_node,omitempty"`
+	TemperatureC           float64                        `json:"temperature_c,omitempty"`
+	PowerW                 float64                        `json:"power_w,omitempty"`
+	LifeRemainingPct       float64                        `json:"life_remaining_pct,omitempty"`
+	LifeUsedPct            float64                        `json:"life_used_pct,omitempty"`
+	ECCCorrectedTotal      int64                          `json:"ecc_corrected_total,omitempty"`
+	ECCUncorrectedTotal    int64                          `json:"ecc_uncorrected_total,omitempty"`
+	HWSlowdown             *bool                          `json:"hw_slowdown,omitempty"`
+	BatteryChargePct       float64                        `json:"battery_charge_pct,omitempty"`
+	BatteryHealthPct       float64                        `json:"battery_health_pct,omitempty"`
+	BatteryTemperatureC    float64                        `json:"battery_temperature_c,omitempty"`
+	BatteryVoltageV        float64                        `json:"battery_voltage_v,omitempty"`
+	BatteryReplaceRequired *bool                          `json:"battery_replace_required,omitempty"`
+	SFPTemperatureC        float64                        `json:"sfp_temperature_c,omitempty"`
+	SFPTXPowerDBm          float64                        `json:"sfp_tx_power_dbm,omitempty"`
+	SFPRXPowerDBm          float64                        `json:"sfp_rx_power_dbm,omitempty"`
+	SFPVoltageV            float64                        `json:"sfp_voltage_v,omitempty"`
+	SFPBiasMA              float64                        `json:"sfp_bias_ma,omitempty"`
+	BDF                    string                         `json:"-"`
+	DeviceClass            string                         `json:"device_class,omitempty"`
+	Manufacturer           string                         `json:"manufacturer,omitempty"`
+	Model                  string                         `json:"model,omitempty"`
+	LinkWidth              int                            `json:"link_width,omitempty"`
+	LinkSpeed              string                         `json:"link_speed,omitempty"`
+	MaxLinkWidth           int                            `json:"max_link_width,omitempty"`
+	MaxLinkSpeed           string                         `json:"max_link_speed,omitempty"`
+	MACAddresses           []string                       `json:"mac_addresses,omitempty"`
+	Present                *bool                          `json:"present,omitempty"`
+	SerialNumber           string                         `json:"serial_number,omitempty"`
+	Firmware               string                         `json:"firmware,omitempty"`
+	Status                 string                         `json:"status,omitempty"`
+	StatusCheckedAt        string                         `json:"status_checked_at,omitempty"`
+	StatusChangedAt        string                         `json:"status_changed_at,omitempty"`
+	ManufacturedYearWeek   string                         `json:"manufactured_year_week,omitempty"`
+	StatusHistory          []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
+	ErrorDescription       string                         `json:"error_description,omitempty"`
 }

// ReanimatorPSU represents a power supply unit
type ReanimatorPSU struct {
	Slot                 string  `json:"slot"`
	Present              *bool   `json:"present,omitempty"`
	Model                string  `json:"model,omitempty"`
	Vendor               string  `json:"vendor,omitempty"`
	WattageW             int     `json:"wattage_w,omitempty"`
	SerialNumber         string  `json:"serial_number,omitempty"`
	PartNumber           string  `json:"part_number,omitempty"`
	Firmware             string  `json:"firmware,omitempty"`
	Status               string  `json:"status,omitempty"`
	InputType            string  `json:"input_type,omitempty"`
	InputPowerW          float64 `json:"input_power_w,omitempty"`
	OutputPowerW         float64 `json:"output_power_w,omitempty"`
	InputVoltage         float64 `json:"input_voltage,omitempty"`
	TemperatureC         float64 `json:"temperature_c,omitempty"`
	LifeRemainingPct     float64 `json:"life_remaining_pct,omitempty"`
	LifeUsedPct          float64 `json:"life_used_pct,omitempty"`
	StatusCheckedAt      string  `json:"status_checked_at,omitempty"`
	StatusChangedAt      string  `json:"status_changed_at,omitempty"`
	ManufacturedYearWeek string  `json:"manufactured_year_week,omitempty"`
	StatusHistory        []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription     string  `json:"error_description,omitempty"`
}
type ReanimatorEventLog struct {
	Source       string         `json:"source"`
	EventTime    string         `json:"event_time,omitempty"`
	Severity     string         `json:"severity,omitempty"`
	MessageID    string         `json:"message_id,omitempty"`
	Message      string         `json:"message"`
	ComponentRef string         `json:"component_ref,omitempty"`
	Fingerprint  string         `json:"fingerprint,omitempty"`
	IsActive     *bool          `json:"is_active,omitempty"`
	RawPayload   map[string]any `json:"raw_payload,omitempty"`
}

type ReanimatorSensors struct {
	Fans         []ReanimatorFanSensor         `json:"fans,omitempty"`
	Power        []ReanimatorPowerSensor       `json:"power,omitempty"`
	Temperatures []ReanimatorTemperatureSensor `json:"temperatures,omitempty"`
	Other        []ReanimatorOtherSensor       `json:"other,omitempty"`
}

type ReanimatorFanSensor struct {
	Name     string `json:"name"`
	Location string `json:"location,omitempty"`
	RPM      int    `json:"rpm,omitempty"`
	Status   string `json:"status,omitempty"`
}

type ReanimatorPowerSensor struct {
	Name     string  `json:"name"`
	Location string  `json:"location,omitempty"`
	VoltageV float64 `json:"voltage_v,omitempty"`
	CurrentA float64 `json:"current_a,omitempty"`
	PowerW   float64 `json:"power_w,omitempty"`
	Status   string  `json:"status,omitempty"`
}

type ReanimatorTemperatureSensor struct {
	Name                     string  `json:"name"`
	Location                 string  `json:"location,omitempty"`
	Celsius                  float64 `json:"celsius,omitempty"`
	ThresholdWarningCelsius  float64 `json:"threshold_warning_celsius,omitempty"`
	ThresholdCriticalCelsius float64 `json:"threshold_critical_celsius,omitempty"`
	Status                   string  `json:"status,omitempty"`
}

type ReanimatorOtherSensor struct {
	Name     string  `json:"name"`
	Location string  `json:"location,omitempty"`
	Value    float64 `json:"value,omitempty"`
	Unit     string  `json:"unit,omitempty"`
	Status   string  `json:"status,omitempty"`
}
63	internal/ingest/service.go	Normal file
@@ -0,0 +1,63 @@
package ingest

import (
	"bytes"
	"fmt"
	"strings"

	"git.mchus.pro/mchus/logpile/internal/collector"
	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

type Service struct{}

type RedfishSourceMetadata struct {
	TargetHost     string
	SourceTimezone string
	Filename       string
}

func NewService() *Service {
	return &Service{}
}

func (s *Service) AnalyzeArchivePayload(filename string, payload []byte) (*models.AnalysisResult, string, error) {
	p := parser.NewBMCParser()
	if err := p.ParseFromReader(bytes.NewReader(payload), filename); err != nil {
		return nil, "", err
	}
	return p.Result(), p.DetectedVendor(), nil
}

func (s *Service) AnalyzeRedfishRawPayloads(rawPayloads map[string]any, meta RedfishSourceMetadata) (*models.AnalysisResult, string, error) {
	result, err := collector.ReplayRedfishFromRawPayloads(rawPayloads, nil)
	if err != nil {
		return nil, "", err
	}
	if result == nil {
		return nil, "", fmt.Errorf("redfish replay returned nil result")
	}
	if strings.TrimSpace(result.Protocol) == "" {
		result.Protocol = "redfish"
	}
	if strings.TrimSpace(result.SourceType) == "" {
		result.SourceType = models.SourceTypeAPI
	}
	if strings.TrimSpace(result.TargetHost) == "" {
		result.TargetHost = strings.TrimSpace(meta.TargetHost)
	}
	if strings.TrimSpace(result.SourceTimezone) == "" {
		result.SourceTimezone = strings.TrimSpace(meta.SourceTimezone)
	}
	if strings.TrimSpace(result.Filename) == "" {
		if strings.TrimSpace(meta.Filename) != "" {
			result.Filename = strings.TrimSpace(meta.Filename)
		} else if target := strings.TrimSpace(result.TargetHost); target != "" {
			result.Filename = "redfish://" + target
		} else {
			result.Filename = "redfish://snapshot"
		}
	}
	return result, "redfish", nil
}
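The filename-default chain in AnalyzeRedfishRawPayloads can be sketched as a standalone helper (a simplified illustration, not the real method): an already-set filename wins, then explicit metadata, then a `redfish://<host>` pseudo-URL, and finally a generic placeholder.

```go
package main

import (
	"fmt"
	"strings"
)

// fallbackFilename mirrors the cascade of defaults applied to
// result.Filename: first non-blank value in priority order wins.
func fallbackFilename(current, metaFilename, targetHost string) string {
	if strings.TrimSpace(current) != "" {
		return current
	}
	if f := strings.TrimSpace(metaFilename); f != "" {
		return f
	}
	if h := strings.TrimSpace(targetHost); h != "" {
		return "redfish://" + h
	}
	return "redfish://snapshot"
}

func main() {
	fmt.Println(fallbackFilename("", "snap.json", "10.0.0.5")) // snap.json
	fmt.Println(fallbackFilename("", "", "10.0.0.5"))          // redfish://10.0.0.5
	fmt.Println(fallbackFilename("", "", ""))                  // redfish://snapshot
}
```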
29	internal/models/memory.go	Normal file
@@ -0,0 +1,29 @@
package models

import "strings"

// HasInventoryIdentity reports whether the DIMM has enough identifying
// inventory data to treat it as a populated module even when size is unknown.
func (m MemoryDIMM) HasInventoryIdentity() bool {
	return strings.TrimSpace(m.SerialNumber) != "" ||
		strings.TrimSpace(m.PartNumber) != "" ||
		strings.TrimSpace(m.Type) != "" ||
		strings.TrimSpace(m.Technology) != "" ||
		strings.TrimSpace(m.Description) != ""
}

// IsInstalledInventory reports whether the DIMM represents an installed module
// that should be kept in canonical inventory and exports.
func (m MemoryDIMM) IsInstalledInventory() bool {
	if !m.Present {
		return false
	}

	status := strings.ToLower(strings.TrimSpace(m.Status))
	switch status {
	case "empty", "absent", "not installed":
		return false
	}

	return m.SizeMB > 0 || m.HasInventoryIdentity()
}
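The predicate above composes three checks: presence, a status blacklist, then "size or identity". A self-contained sketch of the same decision flow (the `dimm` type and its fields are a trimmed-down stand-in for `models.MemoryDIMM`, for illustration only):

```go
package main

import (
	"fmt"
	"strings"
)

// dimm is a hypothetical subset of the real MemoryDIMM model.
type dimm struct {
	Present      bool
	Status       string
	SizeMB       int
	SerialNumber string
	PartNumber   string
}

// installed mirrors IsInstalledInventory: present, not explicitly
// empty/absent, and either sized or carrying identifying data.
func installed(m dimm) bool {
	if !m.Present {
		return false
	}
	switch strings.ToLower(strings.TrimSpace(m.Status)) {
	case "empty", "absent", "not installed":
		return false
	}
	return m.SizeMB > 0 ||
		strings.TrimSpace(m.SerialNumber) != "" ||
		strings.TrimSpace(m.PartNumber) != ""
}

func main() {
	fmt.Println(installed(dimm{Present: true, SizeMB: 32768}))          // true: sized module
	fmt.Println(installed(dimm{Present: true, Status: "Empty"}))        // false: status wins over presence
	fmt.Println(installed(dimm{Present: true, SerialNumber: "ABC123"})) // true: identity without size
}
```

The status check is deliberately case- and whitespace-insensitive, so BMCs reporting "EMPTY " or "Not Installed" are filtered the same way.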
@@ -9,17 +9,18 @@ const (

// AnalysisResult contains all parsed data from an archive
type AnalysisResult struct {
	Filename                string          `json:"filename"`
	SourceType              string          `json:"source_type,omitempty"` // archive | api
	Protocol                string          `json:"protocol,omitempty"`    // redfish | ipmi
	TargetHost              string          `json:"target_host,omitempty"` // BMC host for live collect
	SourceTimezone          string          `json:"source_timezone,omitempty"` // Source timezone/offset used during collection (e.g. +08:00)
	CollectedAt             time.Time       `json:"collected_at,omitempty"`    // Collection/upload timestamp
	InventoryLastModifiedAt time.Time       `json:"inventory_last_modified_at,omitempty"` // Redfish inventory last modified (InventoryData/Status)
	RawPayloads             map[string]any  `json:"raw_payloads,omitempty"` // Additional source payloads (e.g. Redfish tree)
	Events                  []Event         `json:"events"`
	FRU                     []FRUInfo       `json:"fru"`
	Sensors                 []SensorReading `json:"sensors"`
	Hardware                *HardwareConfig `json:"hardware"`
}

// Event represents a single log event
@@ -110,43 +111,45 @@ const (

// HardwareDevice is canonical device inventory used across UI and exports.
type HardwareDevice struct {
	ID                    string   `json:"id"`
	Kind                  string   `json:"kind"`
	Source                string   `json:"source,omitempty"`
	Slot                  string   `json:"slot,omitempty"`
	Location              string   `json:"location,omitempty"`
	BDF                   string   `json:"bdf,omitempty"`
	DeviceClass           string   `json:"device_class,omitempty"`
	VendorID              int      `json:"vendor_id,omitempty"`
	DeviceID              int      `json:"device_id,omitempty"`
	Model                 string   `json:"model,omitempty"`
	PartNumber            string   `json:"part_number,omitempty"`
	Manufacturer          string   `json:"manufacturer,omitempty"`
	SerialNumber          string   `json:"serial_number,omitempty"`
	Firmware              string   `json:"firmware,omitempty"`
	Type                  string   `json:"type,omitempty"`
	Interface             string   `json:"interface,omitempty"`
	Present               *bool    `json:"present,omitempty"`
	SizeMB                int      `json:"size_mb,omitempty"`
	SizeGB                int      `json:"size_gb,omitempty"`
	Cores                 int      `json:"cores,omitempty"`
	Threads               int      `json:"threads,omitempty"`
	FrequencyMHz          int      `json:"frequency_mhz,omitempty"`
	MaxFreqMHz            int      `json:"max_frequency_mhz,omitempty"`
	PortCount             int      `json:"port_count,omitempty"`
	PortType              string   `json:"port_type,omitempty"`
	MACAddresses          []string `json:"mac_addresses,omitempty"`
	LinkWidth             int      `json:"link_width,omitempty"`
	LinkSpeed             string   `json:"link_speed,omitempty"`
	MaxLinkWidth          int      `json:"max_link_width,omitempty"`
	MaxLinkSpeed          string   `json:"max_link_speed,omitempty"`
	WattageW              int      `json:"wattage_w,omitempty"`
	InputType             string   `json:"input_type,omitempty"`
	InputPowerW           int      `json:"input_power_w,omitempty"`
	OutputPowerW          int      `json:"output_power_w,omitempty"`
	InputVoltage          float64  `json:"input_voltage,omitempty"`
	TemperatureC          int      `json:"temperature_c,omitempty"`
	RemainingEndurancePct *int     `json:"remaining_endurance_pct,omitempty"` // 0-100 %; nil = not reported
	NUMANode              int      `json:"numa_node,omitempty"`               // 0 = not reported/N/A
	Status                string   `json:"status,omitempty"`

	StatusCheckedAt *time.Time `json:"status_checked_at,omitempty"`
	StatusChangedAt *time.Time `json:"status_changed_at,omitempty"`
@@ -167,14 +170,14 @@ type FirmwareInfo struct {

// BoardInfo represents motherboard/system information
type BoardInfo struct {
	Manufacturer  string `json:"manufacturer,omitempty"`
	ProductName   string `json:"product_name,omitempty"`
	Description   string `json:"description,omitempty"`
	SerialNumber  string `json:"serial_number,omitempty"`
	PartNumber    string `json:"part_number,omitempty"`
	Version       string `json:"version,omitempty"`
	UUID          string `json:"uuid,omitempty"`
	BMCMACAddress string `json:"bmc_mac_address,omitempty"`
}

// CPU represents processor information
@@ -194,11 +197,12 @@ type CPU struct {
	SerialNumber string `json:"serial_number,omitempty"`
	Status       string `json:"status,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
	Details          map[string]any       `json:"details,omitempty"`
}

// MemoryDIMM represents a memory module
@@ -218,31 +222,34 @@ type MemoryDIMM struct {
	Status string `json:"status,omitempty"`
	Ranks  int    `json:"ranks,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
	Details          map[string]any       `json:"details,omitempty"`
}

// Storage represents a storage device
type Storage struct {
	Slot                  string         `json:"slot"`
	Type                  string         `json:"type"`
	Model                 string         `json:"model"`
	Description           string         `json:"description,omitempty"`
	SizeGB                int            `json:"size_gb"`
	SerialNumber          string         `json:"serial_number,omitempty"`
	Manufacturer          string         `json:"manufacturer,omitempty"`
	Firmware              string         `json:"firmware,omitempty"`
	Interface             string         `json:"interface,omitempty"`
	Present               bool           `json:"present"`
	Location              string         `json:"location,omitempty"` // Front/Rear
	BackplaneID           int            `json:"backplane_id,omitempty"`
	RemainingEndurancePct *int           `json:"remaining_endurance_pct,omitempty"` // 0-100 %; nil = not reported
	Status                string         `json:"status,omitempty"`
	Details               map[string]any `json:"details,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
@@ -250,15 +257,15 @@ type Storage struct {

// StorageVolume represents a logical storage volume (RAID/VROC/etc.).
type StorageVolume struct {
	ID            string `json:"id,omitempty"`
	Name          string `json:"name,omitempty"`
	Controller    string `json:"controller,omitempty"`
	RAIDLevel     string `json:"raid_level,omitempty"`
	SizeGB        int    `json:"size_gb,omitempty"`
	CapacityBytes int64  `json:"capacity_bytes,omitempty"`
	Status        string `json:"status,omitempty"`
	Bootable      bool   `json:"bootable,omitempty"`
	Encrypted     bool   `json:"encrypted,omitempty"`
}

// PCIeDevice represents a PCIe device
@@ -277,13 +284,15 @@ type PCIeDevice struct {
	PartNumber   string   `json:"part_number,omitempty"`
	SerialNumber string   `json:"serial_number,omitempty"`
	MACAddresses []string `json:"mac_addresses,omitempty"`
	NUMANode     int      `json:"numa_node,omitempty"` // 0 = not reported/N/A
	Status       string   `json:"status,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
	Details          map[string]any       `json:"details,omitempty"`
}

// NIC represents a network interface card
@@ -298,25 +307,26 @@ type NIC struct {

// PSU represents a power supply unit
type PSU struct {
	Slot          string         `json:"slot"`
	Present       bool           `json:"present"`
	Model         string         `json:"model"`
	Description   string         `json:"description,omitempty"`
	Vendor        string         `json:"vendor,omitempty"`
	WattageW      int            `json:"wattage_w,omitempty"`
	SerialNumber  string         `json:"serial_number,omitempty"`
	PartNumber    string         `json:"part_number,omitempty"`
	Firmware      string         `json:"firmware,omitempty"`
	Status        string         `json:"status,omitempty"`
	InputType     string         `json:"input_type,omitempty"`
	InputPowerW   int            `json:"input_power_w,omitempty"`
	OutputPowerW  int            `json:"output_power_w,omitempty"`
	InputVoltage  float64        `json:"input_voltage,omitempty"`
	OutputVoltage float64        `json:"output_voltage,omitempty"`
	TemperatureC  int            `json:"temperature_c,omitempty"`
	Details       map[string]any `json:"details,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
@@ -353,11 +363,12 @@ type GPU struct {
	CurrentLinkSpeed string `json:"current_link_speed,omitempty"`
	Status           string `json:"status,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
	Details          map[string]any       `json:"details,omitempty"`
}

// NetworkAdapter represents a network adapter with detailed info
@@ -365,6 +376,7 @@ type NetworkAdapter struct {
	Slot        string `json:"slot"`
	Location    string `json:"location"`
	Present     bool   `json:"present"`
	BDF         string `json:"bdf,omitempty"`
	Model       string `json:"model"`
	Description string `json:"description,omitempty"`
	Vendor      string `json:"vendor,omitempty"`
@@ -376,11 +388,17 @@ type NetworkAdapter struct {
	PortCount    int      `json:"port_count,omitempty"`
	PortType     string   `json:"port_type,omitempty"`
	MACAddresses []string `json:"mac_addresses,omitempty"`
	LinkWidth    int      `json:"link_width,omitempty"`
	LinkSpeed    string   `json:"link_speed,omitempty"`
	MaxLinkWidth int      `json:"max_link_width,omitempty"`
	MaxLinkSpeed string   `json:"max_link_speed,omitempty"`
	NUMANode     int      `json:"numa_node,omitempty"` // 0 = not reported/N/A
	Status       string   `json:"status,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
	Details          map[string]any       `json:"details,omitempty"`
}

@@ -19,6 +19,7 @@ const maxZipArchiveSize = 50 * 1024 * 1024
const maxGzipDecompressedSize = 50 * 1024 * 1024

var supportedArchiveExt = map[string]struct{}{
	".ahs": {},
	".gz":  {},
	".tgz": {},
	".tar": {},
@@ -45,6 +46,8 @@ func ExtractArchive(archivePath string) ([]ExtractedFile, error) {
	ext := strings.ToLower(filepath.Ext(archivePath))

	switch ext {
	case ".ahs":
		return extractSingleFile(archivePath)
	case ".gz", ".tgz":
		return extractTarGz(archivePath)
	case ".tar", ".sds":
@@ -66,6 +69,8 @@ func ExtractArchiveFromReader(r io.Reader, filename string) ([]ExtractedFile, er
	ext := strings.ToLower(filepath.Ext(filename))

	switch ext {
	case ".ahs":
		return extractSingleFileFromReader(r, filename)
	case ".gz", ".tgz":
		return extractTarGzFromReader(r, filename)
	case ".tar", ".sds":

@@ -76,6 +76,7 @@ func TestIsSupportedArchiveFilename(t *testing.T) {
		name string
		want bool
	}{
		{name: "HPE_CZ2D1X0GS3_20260330.ahs", want: true},
		{name: "dump.tar.gz", want: true},
		{name: "nvidia-bug-report-1651124000923.log.gz", want: true},
		{name: "snapshot.zip", want: true},
@@ -124,3 +125,20 @@ func TestExtractArchiveFromReaderSDS(t *testing.T) {
		t.Fatalf("expected bmc/pack.info, got %q", files[0].Path)
	}
}

func TestExtractArchiveFromReaderAHS(t *testing.T) {
	payload := []byte("ABJRtest")
	files, err := ExtractArchiveFromReader(bytes.NewReader(payload), "sample.ahs")
	if err != nil {
		t.Fatalf("extract ahs from reader: %v", err)
	}
	if len(files) != 1 {
		t.Fatalf("expected 1 extracted file, got %d", len(files))
	}
	if files[0].Path != "sample.ahs" {
		t.Fatalf("expected sample.ahs, got %q", files[0].Path)
	}
	if string(files[0].Content) != string(payload) {
		t.Fatalf("content mismatch")
	}
}

135	internal/parser/fru_manufactured.go	Normal file
@@ -0,0 +1,135 @@
package parser
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"regexp"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"git.mchus.pro/mchus/logpile/internal/models"
|
||||
)
|
||||
|
||||
var manufacturedYearWeekPattern = regexp.MustCompile(`^\d{4}-W\d{2}$`)
|
||||
|
||||
// NormalizeManufacturedYearWeek converts common FRU manufacturing date formats
|
||||
// into contract-compatible YYYY-Www values. Unknown or ambiguous inputs return "".
|
||||
func NormalizeManufacturedYearWeek(raw string) string {
|
||||
value := strings.TrimSpace(raw)
|
||||
if value == "" {
|
||||
return ""
|
||||
}
|
||||
upper := strings.ToUpper(value)
|
||||
if manufacturedYearWeekPattern.MatchString(upper) {
|
||||
return upper
|
||||
}
|
||||
|
||||
layouts := []string{
|
||||
time.RFC3339,
|
||||
"2006-01-02T15:04:05",
|
||||
"2006-01-02 15:04:05",
|
||||
"2006-01-02",
|
||||
"2006/01/02",
|
||||
"01/02/2006 15:04:05",
|
||||
"01/02/2006",
|
||||
"01-02-2006",
|
||||
"Mon Jan 2 15:04:05 2006",
|
||||
"Mon Jan _2 15:04:05 2006",
|
||||
"Jan 2 2006",
|
||||
"Jan _2 2006",
|
||||
}
|
||||
for _, layout := range layouts {
|
||||
if ts, err := time.Parse(layout, value); err == nil {
|
||||
year, week := ts.ISOWeek()
|
||||
return formatYearWeek(year, week)
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func formatYearWeek(year, week int) string {
|
||||
if year <= 0 || week <= 0 || week > 53 {
|
||||
return ""
|
||||
}
|
||||
return fmt.Sprintf("%04d-W%02d", year, week)
|
||||
}
|
||||
|
||||
// ApplyManufacturedYearWeekFromFRU attaches normalized manufactured_year_week to
|
||||
// component details by exact serial-number match. Board-level FRU entries are not
|
||||
// expanded to components.
|
||||
func ApplyManufacturedYearWeekFromFRU(frus []models.FRUInfo, hw *models.HardwareConfig) {
|
||||
if hw == nil || len(frus) == 0 {
|
||||
return
|
||||
}
|
||||
bySerial := make(map[string]string, len(frus))
|
||||
for _, fru := range frus {
|
||||
serial := normalizeFRUSerial(fru.SerialNumber)
|
||||
		yearWeek := NormalizeManufacturedYearWeek(fru.MfgDate)
		if serial == "" || yearWeek == "" {
			continue
		}
		if _, exists := bySerial[serial]; exists {
			continue
		}
		bySerial[serial] = yearWeek
	}
	if len(bySerial) == 0 {
		return
	}

	for i := range hw.CPUs {
		attachYearWeek(&hw.CPUs[i].Details, bySerial[normalizeFRUSerial(hw.CPUs[i].SerialNumber)])
	}
	for i := range hw.Memory {
		attachYearWeek(&hw.Memory[i].Details, bySerial[normalizeFRUSerial(hw.Memory[i].SerialNumber)])
	}
	for i := range hw.Storage {
		attachYearWeek(&hw.Storage[i].Details, bySerial[normalizeFRUSerial(hw.Storage[i].SerialNumber)])
	}
	for i := range hw.PCIeDevices {
		attachYearWeek(&hw.PCIeDevices[i].Details, bySerial[normalizeFRUSerial(hw.PCIeDevices[i].SerialNumber)])
	}
	for i := range hw.GPUs {
		attachYearWeek(&hw.GPUs[i].Details, bySerial[normalizeFRUSerial(hw.GPUs[i].SerialNumber)])
	}
	for i := range hw.NetworkAdapters {
		attachYearWeek(&hw.NetworkAdapters[i].Details, bySerial[normalizeFRUSerial(hw.NetworkAdapters[i].SerialNumber)])
	}
	for i := range hw.PowerSupply {
		attachYearWeek(&hw.PowerSupply[i].Details, bySerial[normalizeFRUSerial(hw.PowerSupply[i].SerialNumber)])
	}
}

func attachYearWeek(details *map[string]any, yearWeek string) {
	if yearWeek == "" {
		return
	}
	if *details == nil {
		*details = map[string]any{}
	}
	if existing, ok := (*details)["manufactured_year_week"]; ok && strings.TrimSpace(toString(existing)) != "" {
		return
	}
	(*details)["manufactured_year_week"] = yearWeek
}

func normalizeFRUSerial(v string) string {
	s := strings.TrimSpace(v)
	if s == "" {
		return ""
	}
	switch strings.ToUpper(s) {
	case "N/A", "NA", "NULL", "UNKNOWN", "-", "0":
		return ""
	default:
		return strings.ToUpper(s)
	}
}

func toString(v any) string {
	switch x := v.(type) {
	case string:
		return x
	default:
		return strings.TrimSpace(fmt.Sprint(v))
	}
}
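NormalizeManufacturedYearWeek is called above but defined outside this diff; the test file that follows pins its behavior ("2024-02-13" normalizes to "2024-W07"). A minimal sketch of that normalization using `time.Time.ISOWeek`, assuming only the `YYYY-MM-DD` input layout (the real function also accepts other FRU date formats, and the name `normalizeYearWeekSketch` is illustrative, not the project's code):

```go
package main

import (
	"fmt"
	"time"
)

// normalizeYearWeekSketch is a hypothetical reduction of
// NormalizeManufacturedYearWeek: it handles only the "2006-01-02"
// layout and formats the ISO year and week as "YYYY-Www".
func normalizeYearWeekSketch(s string) string {
	t, err := time.Parse("2006-01-02", s)
	if err != nil {
		return ""
	}
	year, week := t.ISOWeek()
	return fmt.Sprintf("%04d-W%02d", year, week)
}

func main() {
	fmt.Println(normalizeYearWeekSketch("2024-02-13")) // 2024-W07
	fmt.Println(normalizeYearWeekSketch("not-a-date")) // empty string
}
```

Note that ISOWeek can return a year different from the calendar year for dates around January 1, which is why the year comes from ISOWeek rather than `t.Year()`.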
65
internal/parser/fru_manufactured_test.go
Normal file
@@ -0,0 +1,65 @@
package parser

import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestNormalizeManufacturedYearWeek(t *testing.T) {
	tests := []struct {
		in   string
		want string
	}{
		{"2024-W07", "2024-W07"},
		{"2024-02-13", "2024-W07"},
		{"02/13/2024", "2024-W07"},
		{"Tue Feb 13 12:00:00 2024", "2024-W07"},
		{"", ""},
		{"not-a-date", ""},
	}

	for _, tt := range tests {
		if got := NormalizeManufacturedYearWeek(tt.in); got != tt.want {
			t.Fatalf("NormalizeManufacturedYearWeek(%q) = %q, want %q", tt.in, got, tt.want)
		}
	}
}

func TestApplyManufacturedYearWeekFromFRU_AttachesByExactSerial(t *testing.T) {
	hw := &models.HardwareConfig{
		PowerSupply: []models.PSU{
			{
				Slot:         "PSU0",
				SerialNumber: "PSU-SN-001",
			},
		},
		Storage: []models.Storage{
			{
				Slot:         "OB01",
				SerialNumber: "DISK-SN-001",
			},
		},
	}
	fru := []models.FRUInfo{
		{
			Description:  "PSU0_FRU (ID 30)",
			SerialNumber: "PSU-SN-001",
			MfgDate:      "2024-02-13",
		},
		{
			Description:  "Builtin FRU Device (ID 0)",
			SerialNumber: "BOARD-SN-001",
			MfgDate:      "2024-02-01",
		},
	}

	ApplyManufacturedYearWeekFromFRU(fru, hw)

	if got := hw.PowerSupply[0].Details["manufactured_year_week"]; got != "2024-W07" {
		t.Fatalf("expected PSU year week 2024-W07, got %#v", hw.PowerSupply[0].Details)
	}
	if hw.Storage[0].Details != nil {
		t.Fatalf("expected unmatched storage serial to stay untouched, got %#v", hw.Storage[0].Details)
	}
}
100
internal/parser/vendors/dell/parser.go
vendored
@@ -16,6 +16,7 @@ import (

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
	"git.mchus.pro/mchus/logpile/internal/parser/vendors/pciids"
)

const parserVersion = "3.0"

@@ -199,7 +200,7 @@ func parseDCIMViewXML(content []byte, result *models.AnalysisResult) {
		parsePowerSupplyView(props, result)
	case "DCIM_PCIDeviceView":
		parsePCIeDeviceView(props, result)
	case "DCIM_NICView":
	case "DCIM_NICView", "DCIM_InfiniBandView":
		parseNICView(props, result)
	case "DCIM_VideoView":
		parseVideoView(props, result)

@@ -374,6 +375,10 @@ func parsePhysicalDiskView(props map[string]string, result *models.AnalysisResul
		Location: strings.TrimSpace(props["devicedescription"]),
		Status:   normalizeStatus(firstNonEmpty(props["raidstatus"], props["primarystatus"])),
	}
	if v := strings.TrimSpace(props["remainingratedwriteendurance"]); v != "" {
		n := parseIntLoose(v)
		st.RemainingEndurancePct = &n
	}
	result.Hardware.Storage = append(result.Hardware.Storage, st)
}

@@ -424,20 +429,60 @@ func parsePowerSupplyView(props map[string]string, result *models.AnalysisResult
	result.Hardware.PowerSupply = append(result.Hardware.PowerSupply, psu)
}

// pcieFQDDNoisePrefix lists FQDD prefixes that represent internal chipset/CPU
// components or devices already captured with richer data elsewhere:
//   - HostBridge/P2PBridge/ISABridge/SMBus: AMD EPYC internal fabric, not PCIe slots
//   - AHCI.Embedded: AMD FCH SATA, not a slot device
//   - Video.Embedded: BMC/iDRAC Matrox graphics chip, not user-visible
//   - NIC.Embedded: already parsed from DCIM_NICView with model and MAC addresses
var pcieFQDDNoisePrefix = []string{
	"HostBridge.Embedded.",
	"P2PBridge.Embedded.",
	"ISABridge.Embedded.",
	"SMBus.Embedded.",
	"AHCI.Embedded.",
	"Video.Embedded.",
	// All NIC FQDD classes are parsed from DCIM_NICView / DCIM_InfiniBandView into
	// NetworkAdapters with model, MAC, firmware, and VendorID/DeviceID. The
	// DCIM_PCIDeviceView duplicate carries only DataBusWidth ("Unknown", "16x or x16")
	// and no useful extra data, so suppress it here.
	"NIC.",
	"InfiniBand.",
}

func parsePCIeDeviceView(props map[string]string, result *models.AnalysisResult) {
	desc := strings.TrimSpace(firstNonEmpty(props["devicedescription"], props["description"]))
	// "description" is the chip/device model (e.g. "MT28908 Family [ConnectX-6]"); prefer
	// it over "devicedescription" which is the location string ("InfiniBand in Slot 1 Port 1").
	desc := strings.TrimSpace(firstNonEmpty(props["description"], props["devicedescription"]))
	fqdd := strings.TrimSpace(firstNonEmpty(props["fqdd"], props["instanceid"]))
	if desc == "" && fqdd == "" {
		return
	}
	for _, prefix := range pcieFQDDNoisePrefix {
		if strings.HasPrefix(fqdd, prefix) {
			return
		}
	}
	vendorID := parseHexOrDec(firstNonEmpty(props["pcivendorid"], props["vendorid"]))
	deviceID := parseHexOrDec(firstNonEmpty(props["pcideviceid"], props["deviceid"]))
	manufacturer := strings.TrimSpace(props["manufacturer"])

	// General rule: if chip model not found in logs but PCI IDs are known, resolve from pci.ids
	if desc == "" && vendorID != 0 && deviceID != 0 {
		desc = pciids.DeviceName(vendorID, deviceID)
	}
	if manufacturer == "" && vendorID != 0 {
		manufacturer = pciids.VendorName(vendorID)
	}

	p := models.PCIeDevice{
		Slot:        fqdd,
		Description: desc,
		VendorID:    parseHexOrDec(firstNonEmpty(props["pcivendorid"], props["vendorid"])),
		DeviceID:    parseHexOrDec(firstNonEmpty(props["pcideviceid"], props["deviceid"])),
		VendorID:    vendorID,
		DeviceID:    deviceID,
		BDF:         formatBDF(props["busnumber"], props["devicenumber"], props["functionnumber"]),
		DeviceClass: strings.TrimSpace(props["databuswidth"]),
		Manufacturer: strings.TrimSpace(props["manufacturer"]),
		Manufacturer: manufacturer,
		NUMANode:    parseIntLoose(props["cpuaffinity"]),
		Status:      normalizeStatus(props["primarystatus"]),
	}
	result.Hardware.PCIeDevices = append(result.Hardware.PCIeDevices, p)

@@ -450,15 +495,31 @@ func parseNICView(props map[string]string, result *models.AnalysisResult) {
		return
	}
	mac := strings.TrimSpace(firstNonEmpty(props["currentmacaddress"], props["permanentmacaddress"]))
	vendorID := parseHexOrDec(firstNonEmpty(props["pcivendorid"], props["vendorid"]))
	deviceID := parseHexOrDec(firstNonEmpty(props["pcideviceid"], props["deviceid"]))
	vendor := strings.TrimSpace(firstNonEmpty(props["vendorname"], props["manufacturer"]))

	// Prefer pci.ids chip model over generic ProductName when PCI IDs are available.
	// Dell TSR often reports a marketing name (e.g. "Mellanox Network Adapter") while
	// pci.ids has the precise chip identifier (e.g. "MT28908 Family [ConnectX-6]").
	if vendorID != 0 && deviceID != 0 {
		if chipModel := pciids.DeviceName(vendorID, deviceID); chipModel != "" {
			model = chipModel
		}
		if vendor == "" {
			vendor = pciids.VendorName(vendorID)
		}
	}

	n := models.NetworkAdapter{
		Slot:         fqdd,
		Location:     strings.TrimSpace(firstNonEmpty(props["devicedescription"], fqdd)),
		Present:      true,
		Model:        model,
		Description:  strings.TrimSpace(props["protocol"]),
		Vendor:       strings.TrimSpace(firstNonEmpty(props["vendorname"], props["manufacturer"])),
		VendorID:     parseHexOrDec(firstNonEmpty(props["pcivendorid"], props["vendorid"])),
		DeviceID:     parseHexOrDec(firstNonEmpty(props["pcideviceid"], props["deviceid"])),
		Vendor:       vendor,
		VendorID:     vendorID,
		DeviceID:     deviceID,
		SerialNumber: strings.TrimSpace(props["serialnumber"]),
		PartNumber:   strings.TrimSpace(props["partnumber"]),
		Firmware: strings.TrimSpace(firstNonEmpty(
@@ -468,6 +529,7 @@ func parseNICView(props map[string]string, result *models.AnalysisResult) {
			props["controllerbiosversion"],
		)),
		PortCount: inferPortCountFromFQDD(fqdd),
		NUMANode:  parseIntLoose(props["cpuaffinity"]),
		Status:    normalizeStatus(props["primarystatus"]),
	}
	if mac != "" {
@@ -521,10 +583,11 @@ func parseControllerView(props map[string]string, result *models.AnalysisResult)
		DeviceClass:  "storage-controller",
		Manufacturer: strings.TrimSpace(firstNonEmpty(props["devicecardmanufacturer"], props["manufacturer"])),
		PartNumber:   strings.TrimSpace(firstNonEmpty(props["ppid"], props["boardpartnumber"])),
		NUMANode:     parseIntLoose(props["cpuaffinity"]),
		Status:       normalizeStatus(props["primarystatus"]),
	})

	addFirmware(result, firstNonEmpty(name, fqdd), props["controllerfirmwareversion"], "storage controller")
	addFirmware(result, firstNonEmpty(name, fqdd), props["controllerfirmwareversion"], firstNonEmpty(fqdd, "storage controller"))
}

func parseControllerBatteryView(props map[string]string, result *models.AnalysisResult) {
@@ -1110,6 +1173,7 @@ func mergeStorage(dst *models.Storage, src models.Storage) {
	}
	setIfEmpty(&dst.Location, src.Location)
	setIfEmpty(&dst.Status, src.Status)
	dst.Details = mergeDellDetails(dst.Details, src.Details)
}

func dedupeVolumes(items []models.StorageVolume) []models.StorageVolume {
@@ -1181,6 +1245,22 @@ func mergePSU(dst *models.PSU, src models.PSU) {
		dst.InputVoltage = src.InputVoltage
	}
	setIfEmpty(&dst.InputType, src.InputType)
	dst.Details = mergeDellDetails(dst.Details, src.Details)
}

func mergeDellDetails(primary, secondary map[string]any) map[string]any {
	if len(secondary) == 0 {
		return primary
	}
	if primary == nil {
		primary = make(map[string]any, len(secondary))
	}
	for key, value := range secondary {
		if _, ok := primary[key]; !ok {
			primary[key] = value
		}
	}
	return primary
}

func dedupeNetworkAdapters(items []models.NetworkAdapter) []models.NetworkAdapter {
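parseHexOrDec appears throughout the hunks above for PCI IDs that Dell reports as bare hex strings ("15B3", "101B"); its implementation is not part of this diff. A minimal sketch under the assumption that such values are hexadecimal first, with a decimal fallback (the name `parseHexOrDecSketch` and the exact fallback order are guesses, not the project's code):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseHexOrDecSketch is a hypothetical stand-in for the parseHexOrDec
// helper: DCIM views carry PCI vendor/device IDs as bare hex ("15B3"),
// so hex is tried first, then decimal; anything unparseable yields 0.
// Note the hex-first order means a digits-only string is read as hex.
func parseHexOrDecSketch(v string) int {
	s := strings.TrimPrefix(strings.ToLower(strings.TrimSpace(v)), "0x")
	if n, err := strconv.ParseUint(s, 16, 32); err == nil {
		return int(n)
	}
	if n, err := strconv.ParseUint(s, 10, 32); err == nil {
		return int(n)
	}
	return 0
}

func main() {
	fmt.Printf("0x%04X\n", parseHexOrDecSketch("15B3")) // 0x15B3 (Mellanox)
	fmt.Println(parseHexOrDecSketch("Not Applicable"))  // 0
}
```

Returning 0 for junk input is what lets the `vendorID != 0 && deviceID != 0` guards above skip pci.ids lookups for missing IDs.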
256
internal/parser/vendors/dell/parser_test.go
vendored
@@ -204,6 +204,262 @@ func TestParseNestedTSRZip(t *testing.T) {
	}
}

// TestParseDellPhysicalDiskEndurance verifies that RemainingRatedWriteEndurance from
// DCIM_PhysicalDiskView is parsed into Storage.RemainingEndurancePct.
func TestParseDellPhysicalDiskEndurance(t *testing.T) {
	const viewXML = `<CIM><MESSAGE><SIMPLEREQ>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SystemView">
<PROPERTY NAME="Manufacturer"><VALUE>Dell Inc.</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>PowerEdge R6625</VALUE></PROPERTY>
<PROPERTY NAME="ServiceTag"><VALUE>8VS2LG4</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_PhysicalDiskView">
<PROPERTY NAME="FQDD"><VALUE>Disk.Bay.0:Enclosure.Internal.0-1:RAID.SL.3-1</VALUE></PROPERTY>
<PROPERTY NAME="Slot"><VALUE>0</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>HFS480G3H2X069N</VALUE></PROPERTY>
<PROPERTY NAME="SerialNumber"><VALUE>ESEAN5254I030B26B</VALUE></PROPERTY>
<PROPERTY NAME="SizeInBytes"><VALUE>479559942144</VALUE></PROPERTY>
<PROPERTY NAME="MediaType"><VALUE>Solid State Drive</VALUE></PROPERTY>
<PROPERTY NAME="BusProtocol"><VALUE>SATA</VALUE></PROPERTY>
<PROPERTY NAME="Revision"><VALUE>DZ03</VALUE></PROPERTY>
<PROPERTY NAME="RemainingRatedWriteEndurance"><VALUE>100</VALUE><DisplayValue>100 %</DisplayValue></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>1</VALUE><DisplayValue>OK</DisplayValue></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_PhysicalDiskView">
<PROPERTY NAME="FQDD"><VALUE>Disk.Bay.1:Enclosure.Internal.0-1:RAID.SL.3-1</VALUE></PROPERTY>
<PROPERTY NAME="Slot"><VALUE>1</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>TOSHIBA MG08ADA800E</VALUE></PROPERTY>
<PROPERTY NAME="SerialNumber"><VALUE>X1G0A0YXFVVG</VALUE></PROPERTY>
<PROPERTY NAME="SizeInBytes"><VALUE>8001563222016</VALUE></PROPERTY>
<PROPERTY NAME="MediaType"><VALUE>Hard Disk Drive</VALUE></PROPERTY>
<PROPERTY NAME="BusProtocol"><VALUE>SAS</VALUE></PROPERTY>
<PROPERTY NAME="Revision"><VALUE>0104</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
</SIMPLEREQ></MESSAGE></CIM>`

	inner := makeZipArchive(t, map[string][]byte{
		"tsr/metadata.json": []byte(`{"Make":"Dell Inc.","Model":"PowerEdge R6625","ServiceTag":"8VS2LG4"}`),
		"tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml": []byte(viewXML),
	})

	p := &Parser{}
	result, err := p.Parse([]parser.ExtractedFile{
		{Path: "signature", Content: []byte("ok")},
		{Path: "TSR20260306141852_8VS2LG4.pl.zip", Content: inner},
	})
	if err != nil {
		t.Fatalf("parse failed: %v", err)
	}
	if len(result.Hardware.Storage) != 2 {
		t.Fatalf("expected 2 storage devices, got %d", len(result.Hardware.Storage))
	}

	ssd := result.Hardware.Storage[0]
	if ssd.RemainingEndurancePct == nil {
		t.Fatalf("SSD slot 0: expected RemainingEndurancePct to be set")
	}
	if *ssd.RemainingEndurancePct != 100 {
		t.Errorf("SSD slot 0: expected RemainingEndurancePct=100, got %d", *ssd.RemainingEndurancePct)
	}

	hdd := result.Hardware.Storage[1]
	if hdd.RemainingEndurancePct != nil {
		t.Errorf("HDD slot 1: expected RemainingEndurancePct absent, got %d", *hdd.RemainingEndurancePct)
	}
}

// TestParseDellInfiniBandView verifies that DCIM_InfiniBandView entries are parsed as
// NetworkAdapters (not PCIe devices) and that the corresponding SoftwareIdentity firmware
// entry with FQDD "InfiniBand.Slot.*" does not leak into hardware.firmware.
//
// Regression guard: PowerEdge R6625 (8VS2LG4): "Mellanox Network Adapter" version
// "20.39.35.60" appeared in hardware.firmware because DCIM_InfiniBandView was ignored
// (the device ended up only in PCIeDevices with model "16x or x16") and SoftwareIdentity
// FQDD "InfiniBand.Slot.1-1" was not filtered. (2026-03-15)
func TestParseDellInfiniBandView(t *testing.T) {
	const viewXML = `<CIM><MESSAGE><SIMPLEREQ>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SystemView">
<PROPERTY NAME="Manufacturer"><VALUE>Dell Inc.</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>PowerEdge R6625</VALUE></PROPERTY>
<PROPERTY NAME="ServiceTag"><VALUE>8VS2LG4</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_InfiniBandView">
<PROPERTY NAME="FQDD"><VALUE>InfiniBand.Slot.1-1</VALUE></PROPERTY>
<PROPERTY NAME="DeviceDescription"><VALUE>InfiniBand in Slot 1 Port 1</VALUE></PROPERTY>
<PROPERTY NAME="CurrentMACAddress"><VALUE>00:1C:FD:D7:5A:E6</VALUE></PROPERTY>
<PROPERTY NAME="FamilyVersion"><VALUE>20.39.35.60</VALUE></PROPERTY>
<PROPERTY NAME="EFIVersion"><VALUE>14.32.17</VALUE></PROPERTY>
<PROPERTY NAME="PCIVendorID"><VALUE>15B3</VALUE></PROPERTY>
<PROPERTY NAME="PCIDeviceID"><VALUE>101B</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_PCIDeviceView">
<PROPERTY NAME="FQDD"><VALUE>InfiniBand.Slot.1-1</VALUE></PROPERTY>
<PROPERTY NAME="Description"><VALUE>MT28908 Family [ConnectX-6]</VALUE></PROPERTY>
<PROPERTY NAME="DeviceDescription"><VALUE>InfiniBand in Slot 1 Port 1</VALUE></PROPERTY>
<PROPERTY NAME="Manufacturer"><VALUE>Mellanox Technologies</VALUE></PROPERTY>
<PROPERTY NAME="PCIVendorID"><VALUE>15B3</VALUE></PROPERTY>
<PROPERTY NAME="PCIDeviceID"><VALUE>101B</VALUE></PROPERTY>
<PROPERTY NAME="DataBusWidth"><DisplayValue>16x or x16</DisplayValue></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_ControllerView">
<PROPERTY NAME="FQDD"><VALUE>RAID.SL.3-1</VALUE></PROPERTY>
<PROPERTY NAME="ProductName"><VALUE>PERC H755 Front</VALUE></PROPERTY>
<PROPERTY NAME="ControllerFirmwareVersion"><VALUE>52.30.0-6115</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
</SIMPLEREQ></MESSAGE></CIM>`

	const swXML = `<CIM><MESSAGE><SIMPLEREQ>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SoftwareIdentity">
<PROPERTY NAME="ElementName"><VALUE>Mellanox Network Adapter - 00:1C:FD:D7:5A:E6</VALUE></PROPERTY>
<PROPERTY NAME="FQDD"><VALUE>InfiniBand.Slot.1-1</VALUE></PROPERTY>
<PROPERTY NAME="VersionString"><VALUE>20.39.35.60</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SoftwareIdentity">
<PROPERTY NAME="ElementName"><VALUE>PERC H755 Front</VALUE></PROPERTY>
<PROPERTY NAME="FQDD"><VALUE>RAID.SL.3-1</VALUE></PROPERTY>
<PROPERTY NAME="VersionString"><VALUE>52.30.0-6115</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SoftwareIdentity">
<PROPERTY NAME="ElementName"><VALUE>BIOS</VALUE></PROPERTY>
<PROPERTY NAME="FQDD"><VALUE>BIOS.Setup.1-1</VALUE></PROPERTY>
<PROPERTY NAME="VersionString"><VALUE>1.15.3</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
</SIMPLEREQ></MESSAGE></CIM>`

	inner := makeZipArchive(t, map[string][]byte{
		"tsr/metadata.json": []byte(`{"Make":"Dell Inc.","Model":"PowerEdge R6625","ServiceTag":"8VS2LG4"}`),
		"tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml":             []byte(viewXML),
		"tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml": []byte(swXML),
	})

	p := &Parser{}
	result, err := p.Parse([]parser.ExtractedFile{
		{Path: "signature", Content: []byte("ok")},
		{Path: "TSR20260306141852_8VS2LG4.pl.zip", Content: inner},
	})
	if err != nil {
		t.Fatalf("parse failed: %v", err)
	}

	// InfiniBand adapter must appear as a NetworkAdapter, not a PCIe device.
	if len(result.Hardware.NetworkAdapters) != 1 {
		t.Fatalf("expected 1 network adapter, got %d", len(result.Hardware.NetworkAdapters))
	}
	nic := result.Hardware.NetworkAdapters[0]
	if nic.Slot != "InfiniBand.Slot.1-1" {
		t.Errorf("unexpected NIC slot: %q", nic.Slot)
	}
	if nic.Firmware != "20.39.35.60" {
		t.Errorf("unexpected NIC firmware: %q", nic.Firmware)
	}
	if len(nic.MACAddresses) == 0 || nic.MACAddresses[0] != "00:1C:FD:D7:5A:E6" {
		t.Errorf("unexpected NIC MAC: %v", nic.MACAddresses)
	}
	// pci.ids enrichment: VendorID=0x15B3, DeviceID=0x101B → chip model + vendor name.
	if nic.Model != "MT28908 Family [ConnectX-6]" {
		t.Errorf("NIC model = %q, want MT28908 Family [ConnectX-6] (from pci.ids)", nic.Model)
	}
	if nic.Vendor != "Mellanox Technologies" {
		t.Errorf("NIC vendor = %q, want Mellanox Technologies (from pci.ids)", nic.Vendor)
	}

	// InfiniBand FQDD must NOT appear in PCIe devices.
	for _, pcie := range result.Hardware.PCIeDevices {
		if pcie.Slot == "InfiniBand.Slot.1-1" {
			t.Errorf("InfiniBand.Slot.1-1 must not appear in PCIeDevices")
		}
	}

	// Firmware entries from SoftwareIdentity and parseControllerView must carry the FQDD
	// as their Description so the exporter's isDeviceBoundFirmwareFQDD filter can remove them.
	fqddByName := make(map[string]string)
	for _, fw := range result.Hardware.Firmware {
		fqddByName[fw.DeviceName] = fw.Description
	}
	if desc := fqddByName["Mellanox Network Adapter"]; desc != "InfiniBand.Slot.1-1" {
		t.Errorf("Mellanox firmware Description = %q, want InfiniBand.Slot.1-1 for FQDD filter", desc)
	}
	if desc := fqddByName["PERC H755 Front"]; desc != "RAID.SL.3-1" {
		t.Errorf("PERC H755 Front firmware Description = %q, want RAID.SL.3-1 for FQDD filter", desc)
	}
}

// TestParseDellCPUAffinity verifies that CPUAffinity is parsed into NUMANode for
// NIC, PCIe, and controller views. "Not Applicable" must result in NUMANode=0.
func TestParseDellCPUAffinity(t *testing.T) {
	const viewXML = `<CIM><MESSAGE><SIMPLEREQ>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SystemView">
<PROPERTY NAME="Manufacturer"><VALUE>Dell Inc.</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>PowerEdge R750</VALUE></PROPERTY>
<PROPERTY NAME="ServiceTag"><VALUE>TESTST1</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_NICView">
<PROPERTY NAME="FQDD"><VALUE>NIC.Slot.2-1-1</VALUE></PROPERTY>
<PROPERTY NAME="ProductName"><VALUE>Some NIC</VALUE></PROPERTY>
<PROPERTY NAME="CPUAffinity"><VALUE>1</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_InfiniBandView">
<PROPERTY NAME="FQDD"><VALUE>InfiniBand.Slot.1-1</VALUE></PROPERTY>
<PROPERTY NAME="DeviceDescription"><VALUE>InfiniBand in Slot 1</VALUE></PROPERTY>
<PROPERTY NAME="CPUAffinity"><VALUE>2</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_ControllerView">
<PROPERTY NAME="FQDD"><VALUE>RAID.Slot.1-1</VALUE></PROPERTY>
<PROPERTY NAME="ProductName"><VALUE>PERC H755</VALUE></PROPERTY>
<PROPERTY NAME="CPUAffinity"><VALUE>Not Applicable</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_PCIDeviceView">
<PROPERTY NAME="FQDD"><VALUE>Slot.7-1</VALUE></PROPERTY>
<PROPERTY NAME="Description"><VALUE>Some PCIe Card</VALUE></PROPERTY>
<PROPERTY NAME="CPUAffinity"><VALUE>2</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>0</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
</SIMPLEREQ></MESSAGE></CIM>`

	inner := makeZipArchive(t, map[string][]byte{
		"tsr/metadata.json": []byte(`{"Make":"Dell Inc.","Model":"PowerEdge R750","ServiceTag":"TESTST1"}`),
		"tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml": []byte(viewXML),
	})

	p := &Parser{}
	result, err := p.Parse([]parser.ExtractedFile{
		{Path: "signature", Content: []byte("ok")},
		{Path: "TSR_TESTST1.pl.zip", Content: inner},
	})
	if err != nil {
		t.Fatalf("parse failed: %v", err)
	}

	// NIC CPUAffinity=1 → NUMANode=1
	nicBySlot := make(map[string]int)
	for _, nic := range result.Hardware.NetworkAdapters {
		nicBySlot[nic.Slot] = nic.NUMANode
	}
	if nicBySlot["NIC.Slot.2-1-1"] != 1 {
		t.Errorf("NIC.Slot.2-1-1 NUMANode = %d, want 1", nicBySlot["NIC.Slot.2-1-1"])
	}
	if nicBySlot["InfiniBand.Slot.1-1"] != 2 {
		t.Errorf("InfiniBand.Slot.1-1 NUMANode = %d, want 2", nicBySlot["InfiniBand.Slot.1-1"])
	}

	// PCIe device CPUAffinity=2 → NUMANode=2; controller CPUAffinity="Not Applicable" → NUMANode=0
	pcieBySlot := make(map[string]int)
	for _, pcie := range result.Hardware.PCIeDevices {
		pcieBySlot[pcie.Slot] = pcie.NUMANode
	}
	if pcieBySlot["Slot.7-1"] != 2 {
		t.Errorf("Slot.7-1 NUMANode = %d, want 2", pcieBySlot["Slot.7-1"])
	}
	if pcieBySlot["RAID.Slot.1-1"] != 0 {
		t.Errorf("RAID.Slot.1-1 NUMANode = %d, want 0 (Not Applicable)", pcieBySlot["RAID.Slot.1-1"])
	}
}

func makeZipArchive(t *testing.T, files map[string][]byte) []byte {
	t.Helper()
	var buf bytes.Buffer
601
internal/parser/vendors/easy_bee/parser.go
vendored
Normal file
@@ -0,0 +1,601 @@
package easy_bee

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

const parserVersion = "1.0"

func init() {
	parser.Register(&Parser{})
}

// Parser imports support bundles produced by reanimator-easy-bee.
// These archives embed a ready-to-use hardware snapshot in export/bee-audit.json.
type Parser struct{}

func (p *Parser) Name() string {
	return "Reanimator Easy Bee Parser"
}

func (p *Parser) Vendor() string {
	return "easy_bee"
}

func (p *Parser) Version() string {
	return parserVersion
}

func (p *Parser) Detect(files []parser.ExtractedFile) int {
	confidence := 0
	hasManifest := false
	hasBeeAudit := false
	hasRuntimeHealth := false
	hasTechdump := false
	hasBundlePrefix := false

	for _, f := range files {
		path := strings.ToLower(strings.TrimSpace(f.Path))
		content := strings.ToLower(string(f.Content))

		if !hasBundlePrefix && strings.Contains(path, "bee-support-") {
			hasBundlePrefix = true
			confidence += 5
		}

		if (strings.HasSuffix(path, "/manifest.txt") || path == "manifest.txt") &&
			strings.Contains(content, "bee_version=") {
			hasManifest = true
			confidence += 35
			if strings.Contains(content, "export_dir=") {
				confidence += 10
			}
		}

		if strings.HasSuffix(path, "/export/bee-audit.json") || path == "bee-audit.json" {
			hasBeeAudit = true
			confidence += 55
		}

		if hasBundlePrefix && (strings.HasSuffix(path, "/export/runtime-health.json") || path == "runtime-health.json") {
			hasRuntimeHealth = true
			confidence += 10
		}

		if hasBundlePrefix && !hasTechdump && strings.Contains(path, "/export/techdump/") {
			hasTechdump = true
			confidence += 10
		}
	}

	if hasManifest && hasBeeAudit {
		return 100
	}
	if hasBeeAudit && (hasRuntimeHealth || hasTechdump) {
		confidence += 10
	}
	if confidence > 100 {
		return 100
	}
	return confidence
}

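Detect keys off manifest.txt lines such as `bee_version=` and `export_dir=`, and Parse below calls parseManifest, which is outside this diff. A minimal sketch of that kind of key=value manifest reading, under the assumption of one `key=value` pair per line (the `manifestSketch` type and its fields are illustrative, not the project's beeManifest struct):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// manifestSketch holds only the keys this sketch cares about; the real
// parseManifest and its result type are not shown in this diff.
type manifestSketch struct {
	BeeVersion string
	ExportDir  string
	Host       string
}

// parseManifestSketch reads "key=value" lines, ignoring blanks and
// "#" comments; unknown keys are skipped.
func parseManifestSketch(content string) manifestSketch {
	var m manifestSketch
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue
		}
		switch strings.ToLower(strings.TrimSpace(key)) {
		case "bee_version":
			m.BeeVersion = strings.TrimSpace(value)
		case "export_dir":
			m.ExportDir = strings.TrimSpace(value)
		case "host":
			m.Host = strings.TrimSpace(value)
		}
	}
	return m
}

func main() {
	m := parseManifestSketch("bee_version=1.4.2\nexport_dir=/var/bee/export\nhost=node-17\n")
	fmt.Println(m.BeeVersion, m.ExportDir, m.Host)
}
```

Parse only consumes `manifest.Host` (as a TargetHost fallback) and the manifest timestamp via chooseCollectedAt, so a forgiving skip-unknown-keys reader like this is sufficient.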
func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, error) {
|
||||
snapshotFile := findSnapshotFile(files)
|
||||
if snapshotFile == nil {
|
||||
return nil, fmt.Errorf("easy-bee snapshot not found")
|
||||
}
|
||||
|
||||
var snapshot beeSnapshot
|
||||
if err := json.Unmarshal(snapshotFile.Content, &snapshot); err != nil {
|
||||
return nil, fmt.Errorf("decode %s: %w", snapshotFile.Path, err)
|
||||
}
|
||||
|
||||
manifest := parseManifest(files)
|
||||
|
||||
result := &models.AnalysisResult{
|
||||
SourceType: strings.TrimSpace(snapshot.SourceType),
|
||||
Protocol: strings.TrimSpace(snapshot.Protocol),
|
||||
TargetHost: firstNonEmpty(snapshot.TargetHost, manifest.Host),
|
||||
SourceTimezone: strings.TrimSpace(snapshot.SourceTimezone),
|
||||
CollectedAt: chooseCollectedAt(snapshot, manifest),
|
||||
InventoryLastModifiedAt: snapshot.InventoryLastModifiedAt,
|
||||
RawPayloads: snapshot.RawPayloads,
|
||||
Events: make([]models.Event, 0),
|
||||
FRU: append([]models.FRUInfo(nil), snapshot.FRU...),
|
||||
Sensors: make([]models.SensorReading, 0),
|
||||
Hardware: &models.HardwareConfig{
|
||||
Firmware: append([]models.FirmwareInfo(nil), snapshot.Hardware.Firmware...),
|
||||
BoardInfo: snapshot.Hardware.Board,
|
||||
Devices: append([]models.HardwareDevice(nil), snapshot.Hardware.Devices...),
|
||||
CPUs: append([]models.CPU(nil), snapshot.Hardware.CPUs...),
|
||||
Memory: append([]models.MemoryDIMM(nil), snapshot.Hardware.Memory...),
|
||||
Storage: append([]models.Storage(nil), snapshot.Hardware.Storage...),
|
||||
Volumes: append([]models.StorageVolume(nil), snapshot.Hardware.Volumes...),
|
||||
PCIeDevices: normalizePCIeDevices(snapshot.Hardware.PCIeDevices),
|
||||
GPUs: append([]models.GPU(nil), snapshot.Hardware.GPUs...),
|
||||
NetworkCards: append([]models.NIC(nil), snapshot.Hardware.NetworkCards...),
|
||||
NetworkAdapters: normalizeNetworkAdapters(snapshot.Hardware.NetworkAdapters),
|
||||
PowerSupply: append([]models.PSU(nil), snapshot.Hardware.PowerSupply...),
|
||||
},
|
||||
}
|
||||
|
||||
result.Events = append(result.Events, snapshot.Events...)
|
||||
result.Events = append(result.Events, convertRuntimeToEvents(snapshot.Runtime, result.CollectedAt)...)
|
||||
result.Events = append(result.Events, convertEventLogs(snapshot.Hardware.EventLogs)...)
|
||||
|
||||
result.Sensors = append(result.Sensors, snapshot.Sensors...)
|
||||
result.Sensors = append(result.Sensors, flattenSensorGroups(snapshot.Hardware.Sensors)...)
|
||||
|
||||
if len(result.FRU) == 0 {
|
||||
if boardFRU, ok := buildBoardFRU(snapshot.Hardware.Board); ok {
|
||||
result.FRU = append(result.FRU, boardFRU)
|
||||
}
|
||||
}
|
||||
|
||||
if result.Hardware == nil || (result.Hardware.BoardInfo.SerialNumber == "" &&
|
||||
len(result.Hardware.CPUs) == 0 &&
|
||||
len(result.Hardware.Memory) == 0 &&
|
||||
len(result.Hardware.Storage) == 0 &&
|
||||
len(result.Hardware.PCIeDevices) == 0 &&
|
||||
len(result.Hardware.Devices) == 0) {
|
||||
return nil, fmt.Errorf("unsupported easy-bee snapshot format")
|
||||
}
|
||||
|
||||
return result, nil
|
||||
}
|
||||
|
||||
type beeSnapshot struct {
	SourceType              string                 `json:"source_type,omitempty"`
	Protocol                string                 `json:"protocol,omitempty"`
	TargetHost              string                 `json:"target_host,omitempty"`
	SourceTimezone          string                 `json:"source_timezone,omitempty"`
	CollectedAt             time.Time              `json:"collected_at,omitempty"`
	InventoryLastModifiedAt time.Time              `json:"inventory_last_modified_at,omitempty"`
	RawPayloads             map[string]any         `json:"raw_payloads,omitempty"`
	Events                  []models.Event         `json:"events,omitempty"`
	FRU                     []models.FRUInfo       `json:"fru,omitempty"`
	Sensors                 []models.SensorReading `json:"sensors,omitempty"`
	Hardware                beeHardware            `json:"hardware"`
	Runtime                 beeRuntime             `json:"runtime,omitempty"`
}

type beeHardware struct {
	Board           models.BoardInfo        `json:"board"`
	Firmware        []models.FirmwareInfo   `json:"firmware,omitempty"`
	Devices         []models.HardwareDevice `json:"devices,omitempty"`
	CPUs            []models.CPU            `json:"cpus,omitempty"`
	Memory          []models.MemoryDIMM     `json:"memory,omitempty"`
	Storage         []models.Storage        `json:"storage,omitempty"`
	Volumes         []models.StorageVolume  `json:"volumes,omitempty"`
	PCIeDevices     []models.PCIeDevice     `json:"pcie_devices,omitempty"`
	GPUs            []models.GPU            `json:"gpus,omitempty"`
	NetworkCards    []models.NIC            `json:"network_cards,omitempty"`
	NetworkAdapters []models.NetworkAdapter `json:"network_adapters,omitempty"`
	PowerSupply     []models.PSU            `json:"power_supplies,omitempty"`
	Sensors         beeSensorGroups         `json:"sensors,omitempty"`
	EventLogs       []beeEventLog           `json:"event_logs,omitempty"`
}

type beeSensorGroups struct {
	Fans         []beeFanSensor         `json:"fans,omitempty"`
	Power        []beePowerSensor       `json:"power,omitempty"`
	Temperatures []beeTemperatureSensor `json:"temperatures,omitempty"`
	Other        []beeOtherSensor       `json:"other,omitempty"`
}

type beeFanSensor struct {
	Name     string `json:"name"`
	Location string `json:"location,omitempty"`
	RPM      int    `json:"rpm,omitempty"`
	Status   string `json:"status,omitempty"`
}

type beePowerSensor struct {
	Name     string  `json:"name"`
	Location string  `json:"location,omitempty"`
	VoltageV float64 `json:"voltage_v,omitempty"`
	CurrentA float64 `json:"current_a,omitempty"`
	PowerW   float64 `json:"power_w,omitempty"`
	Status   string  `json:"status,omitempty"`
}

type beeTemperatureSensor struct {
	Name                     string  `json:"name"`
	Location                 string  `json:"location,omitempty"`
	Celsius                  float64 `json:"celsius,omitempty"`
	ThresholdWarningCelsius  float64 `json:"threshold_warning_celsius,omitempty"`
	ThresholdCriticalCelsius float64 `json:"threshold_critical_celsius,omitempty"`
	Status                   string  `json:"status,omitempty"`
}

type beeOtherSensor struct {
	Name     string  `json:"name"`
	Location string  `json:"location,omitempty"`
	Value    float64 `json:"value,omitempty"`
	Unit     string  `json:"unit,omitempty"`
	Status   string  `json:"status,omitempty"`
}

type beeRuntime struct {
	Status        string             `json:"status,omitempty"`
	CheckedAt     time.Time          `json:"checked_at,omitempty"`
	NetworkStatus string             `json:"network_status,omitempty"`
	Issues        []beeRuntimeIssue  `json:"issues,omitempty"`
	Services      []beeRuntimeStatus `json:"services,omitempty"`
	Interfaces    []beeInterface     `json:"interfaces,omitempty"`
}

type beeRuntimeIssue struct {
	Code        string `json:"code,omitempty"`
	Severity    string `json:"severity,omitempty"`
	Description string `json:"description,omitempty"`
}

type beeRuntimeStatus struct {
	Name   string `json:"name,omitempty"`
	Status string `json:"status,omitempty"`
}

type beeInterface struct {
	Name    string   `json:"name,omitempty"`
	State   string   `json:"state,omitempty"`
	IPv4    []string `json:"ipv4,omitempty"`
	Outcome string   `json:"outcome,omitempty"`
}

type beeEventLog struct {
	Source     string         `json:"source,omitempty"`
	EventTime  string         `json:"event_time,omitempty"`
	Severity   string         `json:"severity,omitempty"`
	MessageID  string         `json:"message_id,omitempty"`
	Message    string         `json:"message,omitempty"`
	RawPayload map[string]any `json:"raw_payload,omitempty"`
}
type manifestMetadata struct {
	Host           string
	GeneratedAtUTC time.Time
}

func findSnapshotFile(files []parser.ExtractedFile) *parser.ExtractedFile {
	for i := range files {
		path := strings.ToLower(strings.TrimSpace(files[i].Path))
		if strings.HasSuffix(path, "/export/bee-audit.json") || path == "bee-audit.json" {
			return &files[i]
		}
	}
	for i := range files {
		path := strings.ToLower(strings.TrimSpace(files[i].Path))
		if strings.HasSuffix(path, ".json") && strings.Contains(path, "reanimator") {
			return &files[i]
		}
	}
	return nil
}

func parseManifest(files []parser.ExtractedFile) manifestMetadata {
	var meta manifestMetadata

	for _, f := range files {
		path := strings.ToLower(strings.TrimSpace(f.Path))
		if !(strings.HasSuffix(path, "/manifest.txt") || path == "manifest.txt") {
			continue
		}

		lines := strings.Split(string(f.Content), "\n")
		for _, line := range lines {
			key, value, ok := strings.Cut(strings.TrimSpace(line), "=")
			if !ok {
				continue
			}
			switch strings.TrimSpace(key) {
			case "host":
				meta.Host = strings.TrimSpace(value)
			case "generated_at_utc":
				if ts, err := time.Parse(time.RFC3339, strings.TrimSpace(value)); err == nil {
					meta.GeneratedAtUTC = ts.UTC()
				}
			}
		}
		break
	}

	return meta
}

func chooseCollectedAt(snapshot beeSnapshot, manifest manifestMetadata) time.Time {
	switch {
	case !snapshot.CollectedAt.IsZero():
		return snapshot.CollectedAt.UTC()
	case !snapshot.Runtime.CheckedAt.IsZero():
		return snapshot.Runtime.CheckedAt.UTC()
	case !manifest.GeneratedAtUTC.IsZero():
		return manifest.GeneratedAtUTC.UTC()
	default:
		return time.Time{}
	}
}
func convertRuntimeToEvents(runtime beeRuntime, fallback time.Time) []models.Event {
	events := make([]models.Event, 0)
	ts := runtime.CheckedAt
	if ts.IsZero() {
		ts = fallback
	}

	if status := strings.TrimSpace(runtime.Status); status != "" {
		desc := "Bee runtime status: " + status
		if networkStatus := strings.TrimSpace(runtime.NetworkStatus); networkStatus != "" {
			desc += " (network: " + networkStatus + ")"
		}
		events = append(events, models.Event{
			Timestamp:   ts,
			Source:      "Bee Runtime",
			EventType:   "Runtime Status",
			Severity:    mapSeverity(status),
			Description: desc,
		})
	}

	for _, issue := range runtime.Issues {
		desc := strings.TrimSpace(issue.Description)
		if desc == "" {
			desc = "Bee runtime issue"
		}
		events = append(events, models.Event{
			Timestamp:   ts,
			Source:      "Bee Runtime",
			EventType:   "Runtime Issue",
			Severity:    mapSeverity(issue.Severity),
			Description: desc,
			RawData:     strings.TrimSpace(issue.Code),
		})
	}

	for _, svc := range runtime.Services {
		status := strings.TrimSpace(svc.Status)
		if status == "" || strings.EqualFold(status, "active") {
			continue
		}
		events = append(events, models.Event{
			Timestamp:   ts,
			Source:      "systemd",
			EventType:   "Service Status",
			Severity:    mapSeverity(status),
			Description: fmt.Sprintf("%s is %s", strings.TrimSpace(svc.Name), status),
		})
	}

	for _, iface := range runtime.Interfaces {
		state := strings.TrimSpace(iface.State)
		outcome := strings.TrimSpace(iface.Outcome)
		if state == "" && outcome == "" {
			continue
		}
		if strings.EqualFold(state, "up") && strings.EqualFold(outcome, "lease_acquired") {
			continue
		}
		desc := fmt.Sprintf("interface %s state=%s outcome=%s", strings.TrimSpace(iface.Name), state, outcome)
		events = append(events, models.Event{
			Timestamp:   ts,
			Source:      "network",
			EventType:   "Interface Status",
			Severity:    models.SeverityWarning,
			Description: strings.TrimSpace(desc),
		})
	}

	return events
}

func convertEventLogs(items []beeEventLog) []models.Event {
	events := make([]models.Event, 0, len(items))
	for _, item := range items {
		message := strings.TrimSpace(item.Message)
		if message == "" {
			continue
		}
		ts := parseEventTime(item.EventTime)
		rawData := strings.TrimSpace(item.MessageID)
		events = append(events, models.Event{
			Timestamp:   ts,
			Source:      firstNonEmpty(strings.TrimSpace(item.Source), "Reanimator"),
			EventType:   "Event Log",
			Severity:    mapSeverity(item.Severity),
			Description: message,
			RawData:     rawData,
		})
	}
	return events
}

func parseEventTime(raw string) time.Time {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return time.Time{}
	}
	layouts := []string{time.RFC3339Nano, time.RFC3339}
	for _, layout := range layouts {
		if ts, err := time.Parse(layout, raw); err == nil {
			return ts.UTC()
		}
	}
	return time.Time{}
}
func flattenSensorGroups(groups beeSensorGroups) []models.SensorReading {
	result := make([]models.SensorReading, 0, len(groups.Fans)+len(groups.Power)+len(groups.Temperatures)+len(groups.Other))

	for _, fan := range groups.Fans {
		result = append(result, models.SensorReading{
			Name:   sensorName(fan.Name, fan.Location),
			Type:   "fan",
			Value:  float64(fan.RPM),
			Unit:   "RPM",
			Status: strings.TrimSpace(fan.Status),
		})
	}

	for _, power := range groups.Power {
		name := sensorName(power.Name, power.Location)
		status := strings.TrimSpace(power.Status)
		if power.PowerW != 0 {
			result = append(result, models.SensorReading{
				Name:   name,
				Type:   "power",
				Value:  power.PowerW,
				Unit:   "W",
				Status: status,
			})
		}
		if power.VoltageV != 0 {
			result = append(result, models.SensorReading{
				Name:   name + " Voltage",
				Type:   "voltage",
				Value:  power.VoltageV,
				Unit:   "V",
				Status: status,
			})
		}
		if power.CurrentA != 0 {
			result = append(result, models.SensorReading{
				Name:   name + " Current",
				Type:   "current",
				Value:  power.CurrentA,
				Unit:   "A",
				Status: status,
			})
		}
	}

	for _, temp := range groups.Temperatures {
		result = append(result, models.SensorReading{
			Name:   sensorName(temp.Name, temp.Location),
			Type:   "temperature",
			Value:  temp.Celsius,
			Unit:   "C",
			Status: strings.TrimSpace(temp.Status),
		})
	}

	for _, other := range groups.Other {
		result = append(result, models.SensorReading{
			Name:   sensorName(other.Name, other.Location),
			Type:   "other",
			Value:  other.Value,
			Unit:   strings.TrimSpace(other.Unit),
			Status: strings.TrimSpace(other.Status),
		})
	}

	return result
}

func sensorName(name, location string) string {
	name = strings.TrimSpace(name)
	location = strings.TrimSpace(location)
	if name == "" {
		return location
	}
	if location == "" {
		return name
	}
	return name + " [" + location + "]"
}

func normalizePCIeDevices(items []models.PCIeDevice) []models.PCIeDevice {
	out := append([]models.PCIeDevice(nil), items...)
	for i := range out {
		slot := strings.TrimSpace(out[i].Slot)
		if out[i].BDF == "" && looksLikeBDF(slot) {
			out[i].BDF = slot
		}
		if out[i].Slot == "" && out[i].BDF != "" {
			out[i].Slot = out[i].BDF
		}
	}
	return out
}

func normalizeNetworkAdapters(items []models.NetworkAdapter) []models.NetworkAdapter {
	out := append([]models.NetworkAdapter(nil), items...)
	for i := range out {
		slot := strings.TrimSpace(out[i].Slot)
		if out[i].BDF == "" && looksLikeBDF(slot) {
			out[i].BDF = slot
		}
		if out[i].Slot == "" && out[i].BDF != "" {
			out[i].Slot = out[i].BDF
		}
	}
	return out
}

func looksLikeBDF(value string) bool {
	value = strings.TrimSpace(value)
	if len(value) != len("0000:00:00.0") {
		return false
	}
	for i, r := range value {
		switch i {
		case 4, 7:
			if r != ':' {
				return false
			}
		case 10:
			if r != '.' {
				return false
			}
		default:
			if !((r >= '0' && r <= '9') || (r >= 'a' && r <= 'f') || (r >= 'A' && r <= 'F')) {
				return false
			}
		}
	}
	return true
}

func buildBoardFRU(board models.BoardInfo) (models.FRUInfo, bool) {
	if strings.TrimSpace(board.SerialNumber) == "" &&
		strings.TrimSpace(board.Manufacturer) == "" &&
		strings.TrimSpace(board.ProductName) == "" &&
		strings.TrimSpace(board.PartNumber) == "" {
		return models.FRUInfo{}, false
	}

	return models.FRUInfo{
		Description:  "System Board",
		Manufacturer: strings.TrimSpace(board.Manufacturer),
		ProductName:  strings.TrimSpace(board.ProductName),
		SerialNumber: strings.TrimSpace(board.SerialNumber),
		PartNumber:   strings.TrimSpace(board.PartNumber),
	}, true
}

func mapSeverity(raw string) models.Severity {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "critical", "crit", "error", "failed", "failure":
		return models.SeverityCritical
	case "warning", "warn", "partial", "degraded", "inactive", "activating", "deactivating":
		return models.SeverityWarning
	default:
		return models.SeverityInfo
	}
}

func firstNonEmpty(values ...string) string {
	for _, value := range values {
		value = strings.TrimSpace(value)
		if value != "" {
			return value
		}
	}
	return ""
}
internal/parser/vendors/easy_bee/parser_test.go (vendored, new file, 219 lines)
@@ -0,0 +1,219 @@
package easy_bee

import (
	"testing"
	"time"

	"git.mchus.pro/mchus/logpile/internal/parser"
)

func TestDetectBeeSupportArchive(t *testing.T) {
	p := &Parser{}
	files := []parser.ExtractedFile{
		{
			Path:    "bee-support-debian-20260325-162030/manifest.txt",
			Content: []byte("bee_version=1.0.0\nhost=debian\ngenerated_at_utc=2026-03-25T16:20:30Z\nexport_dir=/appdata/bee/export\n"),
		},
		{
			Path:    "bee-support-debian-20260325-162030/export/bee-audit.json",
			Content: []byte(`{"hardware":{"board":{"serial_number":"SN-BEE-001"}}}`),
		},
		{
			Path:    "bee-support-debian-20260325-162030/export/runtime-health.json",
			Content: []byte(`{"status":"PARTIAL"}`),
		},
	}

	if got := p.Detect(files); got < 90 {
		t.Fatalf("expected high confidence detect score, got %d", got)
	}
}

func TestDetectRejectsNonBeeArchive(t *testing.T) {
	p := &Parser{}
	files := []parser.ExtractedFile{
		{
			Path:    "random/manifest.txt",
			Content: []byte("host=test\n"),
		},
		{
			Path:    "random/export/runtime-health.json",
			Content: []byte(`{"status":"OK"}`),
		},
	}

	if got := p.Detect(files); got != 0 {
		t.Fatalf("expected detect score 0, got %d", got)
	}
}

func TestParseBeeAuditSnapshot(t *testing.T) {
	p := &Parser{}
	files := []parser.ExtractedFile{
		{
			Path:    "bee-support-debian-20260325-162030/manifest.txt",
			Content: []byte("bee_version=1.0.0\nhost=debian\ngenerated_at_utc=2026-03-25T16:20:30Z\nexport_dir=/appdata/bee/export\n"),
		},
		{
			Path: "bee-support-debian-20260325-162030/export/bee-audit.json",
			Content: []byte(`{
  "source_type": "manual",
  "target_host": "debian",
  "collected_at": "2026-03-25T16:08:09Z",
  "runtime": {
    "status": "PARTIAL",
    "checked_at": "2026-03-25T16:07:56Z",
    "network_status": "OK",
    "issues": [
      {
        "code": "nvidia_kernel_module_missing",
        "severity": "warning",
        "description": "NVIDIA kernel module is not loaded."
      }
    ],
    "services": [
      {
        "name": "bee-web",
        "status": "inactive"
      }
    ]
  },
  "hardware": {
    "board": {
      "manufacturer": "Supermicro",
      "product_name": "AS-4124GQ-TNMI",
      "serial_number": "S490387X4418273",
      "part_number": "H12DGQ-NT6",
      "uuid": "d868ae00-a61f-11ee-8000-7cc255e10309"
    },
    "firmware": [
      {
        "device_name": "BIOS",
        "version": "2.8"
      }
    ],
    "cpus": [
      {
        "status": "OK",
        "status_checked_at": "2026-03-25T16:08:09Z",
        "socket": 1,
        "model": "AMD EPYC 7763 64-Core Processor",
        "cores": 64,
        "threads": 128,
        "frequency_mhz": 2450,
        "max_frequency_mhz": 3525
      }
    ],
    "memory": [
      {
        "status": "OK",
        "status_checked_at": "2026-03-25T16:08:09Z",
        "slot": "P1-DIMMA1",
        "location": "P0_Node0_Channel0_Dimm0",
        "present": true,
        "size_mb": 32768,
        "type": "DDR4",
        "max_speed_mhz": 3200,
        "current_speed_mhz": 2933,
        "manufacturer": "SK Hynix",
        "serial_number": "80AD01224887286666",
        "part_number": "HMA84GR7DJR4N-XN"
      }
    ],
    "storage": [
      {
        "status": "Unknown",
        "status_checked_at": "2026-03-25T16:08:09Z",
        "slot": "nvme0n1",
        "type": "NVMe",
        "model": "KCD6XLUL960G",
        "serial_number": "2470A00XT5M8",
        "interface": "NVMe",
        "present": true
      }
    ],
    "pcie_devices": [
      {
        "status": "OK",
        "status_checked_at": "2026-03-25T16:08:09Z",
        "slot": "0000:05:00.0",
        "vendor_id": 5555,
        "device_id": 4123,
        "device_class": "EthernetController",
        "manufacturer": "Mellanox Technologies",
        "model": "MT28908 Family [ConnectX-6]",
        "link_width": 16,
        "link_speed": "Gen4",
        "max_link_width": 16,
        "max_link_speed": "Gen4",
        "mac_addresses": ["94:6d:ae:9a:75:4a"],
        "present": true
      }
    ],
    "sensors": {
      "power": [
        {
          "name": "PPT",
          "location": "amdgpu-pci-1100",
          "power_w": 95
        }
      ],
      "temperatures": [
        {
          "name": "Composite",
          "location": "nvme-pci-0600",
          "celsius": 28.85,
          "threshold_warning_celsius": 72.85,
          "threshold_critical_celsius": 81.85,
          "status": "OK"
        }
      ]
    }
  }
}`),
		},
	}

	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("parse failed: %v", err)
	}

	if result.Hardware == nil {
		t.Fatal("expected hardware to be populated")
	}
	if result.TargetHost != "debian" {
		t.Fatalf("expected target host debian, got %q", result.TargetHost)
	}
	wantCollectedAt := time.Date(2026, 3, 25, 16, 8, 9, 0, time.UTC)
	if !result.CollectedAt.Equal(wantCollectedAt) {
		t.Fatalf("expected collected_at %s, got %s", wantCollectedAt, result.CollectedAt)
	}
	if result.Hardware.BoardInfo.SerialNumber != "S490387X4418273" {
		t.Fatalf("unexpected board serial %q", result.Hardware.BoardInfo.SerialNumber)
	}
	if len(result.Hardware.CPUs) != 1 {
		t.Fatalf("expected 1 cpu, got %d", len(result.Hardware.CPUs))
	}
	if len(result.Hardware.Memory) != 1 {
		t.Fatalf("expected 1 dimm, got %d", len(result.Hardware.Memory))
	}
	if len(result.Hardware.Storage) != 1 {
		t.Fatalf("expected 1 storage device, got %d", len(result.Hardware.Storage))
	}
	if len(result.Hardware.PCIeDevices) != 1 {
		t.Fatalf("expected 1 pcie device, got %d", len(result.Hardware.PCIeDevices))
	}
	if result.Hardware.PCIeDevices[0].BDF != "0000:05:00.0" {
		t.Fatalf("expected BDF to be normalized from slot, got %q", result.Hardware.PCIeDevices[0].BDF)
	}
	if len(result.Sensors) != 2 {
		t.Fatalf("expected 2 flattened sensors, got %d", len(result.Sensors))
	}
	if len(result.Events) < 3 {
		t.Fatalf("expected runtime events to be created, got %d", len(result.Events))
	}
	if len(result.FRU) == 0 {
		t.Fatal("expected board FRU fallback to be populated")
	}
}
internal/parser/vendors/h3c/parser.go (vendored, 19 changed lines)
@@ -216,6 +216,7 @@ func parseH3CG5(files []parser.ExtractedFile) *models.AnalysisResult {
	}
	result.Hardware.Storage = dedupeStorage(result.Hardware.Storage)
	result.Hardware.Volumes = dedupeVolumes(result.Hardware.Volumes)
	parser.ApplyManufacturedYearWeekFromFRU(result.FRU, result.Hardware)

	return result
}
@@ -286,6 +287,7 @@ func parseH3CG6(files []parser.ExtractedFile) *models.AnalysisResult {
	}
	result.Hardware.Storage = dedupeStorage(result.Hardware.Storage)
	result.Hardware.Volumes = dedupeVolumes(result.Hardware.Volumes)
	parser.ApplyManufacturedYearWeekFromFRU(result.FRU, result.Hardware)

	return result
}
@@ -3024,6 +3026,7 @@ func mergeStorage(dst *models.Storage, src models.Storage) {
	}
	setStorageString(&dst.Location, src.Location)
	setStorageString(&dst.Status, normalizeStorageStatus(src.Status, src.Present || dst.Present))
	dst.Details = mergeH3CDetails(dst.Details, src.Details)
}

func setStorageString(dst *string, value string) {
@@ -3275,6 +3278,22 @@ func mergePSU(dst *models.PSU, src models.PSU) {
	setStorageString(&dst.PartNumber, src.PartNumber)
	setStorageString(&dst.Firmware, src.Firmware)
	setStorageString(&dst.Status, src.Status)
	dst.Details = mergeH3CDetails(dst.Details, src.Details)
}

func mergeH3CDetails(primary, secondary map[string]any) map[string]any {
	if len(secondary) == 0 {
		return primary
	}
	if primary == nil {
		primary = make(map[string]any, len(secondary))
	}
	for key, value := range secondary {
		if _, ok := primary[key]; !ok {
			primary[key] = value
		}
	}
	return primary
}

func dedupeVolumes(items []models.StorageVolume) []models.StorageVolume {
internal/parser/vendors/hpe_ilo_ahs/parser.go (vendored, new file, 1706 lines)
File diff suppressed because it is too large
internal/parser/vendors/hpe_ilo_ahs/parser_test.go (vendored, new file, 316 lines)
@@ -0,0 +1,316 @@
package hpe_ilo_ahs

import (
	"bytes"
	"compress/gzip"
	"encoding/binary"
	"os"
	"path/filepath"
	"testing"

	"git.mchus.pro/mchus/logpile/internal/parser"
)

func TestDetectAHS(t *testing.T) {
	p := &Parser{}
	score := p.Detect([]parser.ExtractedFile{{
		Path:    "HPE_CZ2D1X0GS3_20260330.ahs",
		Content: makeAHSArchive(t, []ahsTestEntry{{Name: "CUST_INFO.DAT", Payload: []byte("x")}}),
	}})
	if score < 80 {
		t.Fatalf("expected high confidence detect, got %d", score)
	}
}

func TestParseAHSInventory(t *testing.T) {
	p := &Parser{}
	content := makeAHSArchive(t, []ahsTestEntry{
		{Name: "CUST_INFO.DAT", Payload: make([]byte, 16)},
		{Name: "0000088-2026-03-30.zbb", Payload: gzipBytes(t, []byte(sampleInventoryBlob()))},
		{Name: "bcert.pkg", Payload: []byte(sampleBCertBlob())},
	})

	result, err := p.Parse([]parser.ExtractedFile{{
		Path:    "HPE_CZ2D1X0GS3_20260330.ahs",
		Content: content,
	}})
	if err != nil {
		t.Fatalf("parse failed: %v", err)
	}
	if result.Hardware == nil {
		t.Fatalf("expected hardware section")
	}

	board := result.Hardware.BoardInfo
	if board.Manufacturer != "HPE" {
		t.Fatalf("unexpected board manufacturer: %q", board.Manufacturer)
	}
	if board.ProductName != "ProLiant DL380 Gen11" {
		t.Fatalf("unexpected board product: %q", board.ProductName)
	}
	if board.SerialNumber != "CZ2D1X0GS3" {
		t.Fatalf("unexpected board serial: %q", board.SerialNumber)
	}
	if board.PartNumber != "P52560-421" {
		t.Fatalf("unexpected board part number: %q", board.PartNumber)
	}

	if len(result.Hardware.CPUs) != 1 || result.Hardware.CPUs[0].Model != "Intel(R) Xeon(R) Gold 6444Y" {
		t.Fatalf("unexpected CPUs: %+v", result.Hardware.CPUs)
	}
	if len(result.Hardware.Memory) != 1 {
		t.Fatalf("expected one DIMM, got %d", len(result.Hardware.Memory))
	}
	if result.Hardware.Memory[0].PartNumber != "HMCG88AEBRA115N" {
		t.Fatalf("unexpected DIMM part number: %q", result.Hardware.Memory[0].PartNumber)
	}

	if len(result.Hardware.NetworkAdapters) != 2 {
		t.Fatalf("expected two network adapters, got %d", len(result.Hardware.NetworkAdapters))
	}
	if len(result.Hardware.PowerSupply) != 1 {
		t.Fatalf("expected one PSU, got %d", len(result.Hardware.PowerSupply))
	}
	if result.Hardware.PowerSupply[0].SerialNumber != "5XUWB0C4DJG4BV" {
		t.Fatalf("unexpected PSU serial: %q", result.Hardware.PowerSupply[0].SerialNumber)
	}
	if result.Hardware.PowerSupply[0].Firmware != "2.00" {
		t.Fatalf("unexpected PSU firmware: %q", result.Hardware.PowerSupply[0].Firmware)
	}

	if len(result.Hardware.Storage) != 1 {
		t.Fatalf("expected one physical drive, got %d", len(result.Hardware.Storage))
	}
	drive := result.Hardware.Storage[0]
	if drive.Model != "SAMSUNGMZ7L3480HCHQ-00A07" {
		t.Fatalf("unexpected drive model: %q", drive.Model)
	}
	if drive.SerialNumber != "S664NC0Y502720" {
		t.Fatalf("unexpected drive serial: %q", drive.SerialNumber)
	}
	if drive.SizeGB != 480 {
		t.Fatalf("unexpected drive size: %d", drive.SizeGB)
	}

	if len(result.Hardware.Firmware) == 0 {
		t.Fatalf("expected firmware inventory")
	}
	foundILO := false
	foundControllerFW := false
	foundNICFW := false
	foundBackplaneFW := false
	for _, item := range result.Hardware.Firmware {
		if item.DeviceName == "iLO 6" && item.Version == "v1.63p20" {
			foundILO = true
		}
		if item.DeviceName == "HPE MR408i-o Gen11" && item.Version == "52.26.3-5379" {
			foundControllerFW = true
		}
		if item.DeviceName == "BCM 5719 1Gb 4p BASE-T OCP Adptr" && item.Version == "20.28.41" {
			foundNICFW = true
		}
		if item.DeviceName == "8 SFF 24G x1NVMe/SAS UBM3 BC BP" && item.Version == "1.24" {
			foundBackplaneFW = true
		}
	}
	if !foundILO {
		t.Fatalf("expected iLO firmware entry")
	}
	if !foundControllerFW {
		t.Fatalf("expected controller firmware entry")
	}
	if !foundNICFW {
		t.Fatalf("expected broadcom firmware entry")
	}
	if !foundBackplaneFW {
		t.Fatalf("expected backplane firmware entry")
	}

	broadcomFound := false
	backplaneFound := false
	for _, nic := range result.Hardware.NetworkAdapters {
		if nic.SerialNumber == "1CH0150001" && nic.Firmware == "20.28.41" {
			broadcomFound = true
		}
	}
	for _, dev := range result.Hardware.Devices {
		if dev.DeviceClass == "storage_backplane" && dev.Firmware == "1.24" {
			backplaneFound = true
		}
	}
	if !broadcomFound {
		t.Fatalf("expected broadcom adapter firmware to be enriched")
	}
	if !backplaneFound {
		t.Fatalf("expected backplane canonical device")
	}

	if len(result.Hardware.Devices) < 6 {
		t.Fatalf("expected canonical devices, got %d", len(result.Hardware.Devices))
	}
	if len(result.Events) == 0 {
		t.Fatalf("expected parsed events")
	}
}

func TestParseExampleAHS(t *testing.T) {
	path := filepath.Join("..", "..", "..", "..", "example", "HPE_CZ2D1X0GS3_20260330.ahs")
	content, err := os.ReadFile(path)
	if err != nil {
		t.Skipf("example fixture unavailable: %v", err)
	}

	p := &Parser{}
	result, err := p.Parse([]parser.ExtractedFile{{
		Path:    filepath.Base(path),
		Content: content,
	}})
	if err != nil {
		t.Fatalf("parse example failed: %v", err)
	}
	if result.Hardware == nil {
		t.Fatalf("expected hardware section")
	}

	board := result.Hardware.BoardInfo
	if board.ProductName != "ProLiant DL380 Gen11" {
		t.Fatalf("unexpected board product: %q", board.ProductName)
	}
	if board.SerialNumber != "CZ2D1X0GS3" {
		t.Fatalf("unexpected board serial: %q", board.SerialNumber)
	}

	if len(result.Hardware.Storage) < 2 {
		t.Fatalf("expected at least two drives, got %d", len(result.Hardware.Storage))
	}
	if len(result.Hardware.PowerSupply) != 2 {
		t.Fatalf("expected exactly two PSUs, got %d: %+v", len(result.Hardware.PowerSupply), result.Hardware.PowerSupply)
	}

	foundController := false
	foundBackplaneFW := false
	foundNICFW := false
	for _, device := range result.Hardware.Devices {
		if device.Model == "HPE MR408i-o Gen11" && device.SerialNumber == "PXSFQ0BBIJY3B3" {
			foundController = true
		}
		if device.DeviceClass == "storage_backplane" && device.Firmware == "1.24" {
			foundBackplaneFW = true
		}
	}
	if !foundController {
		t.Fatalf("expected MR408i-o controller in canonical devices")
	}
	for _, fw := range result.Hardware.Firmware {
		if fw.DeviceName == "BCM 5719 1Gb 4p BASE-T OCP Adptr" && fw.Version == "20.28.41" {
			foundNICFW = true
		}
	}
	if !foundBackplaneFW {
		t.Fatalf("expected backplane device in canonical devices")
	}
	if !foundNICFW {
		t.Fatalf("expected broadcom firmware from bcert/pkg lockdown")
	}
}

type ahsTestEntry struct {
	Name    string
	Payload []byte
	Flag    uint32
}

func makeAHSArchive(t *testing.T, entries []ahsTestEntry) []byte {
	t.Helper()

	var buf bytes.Buffer
	for _, entry := range entries {
		header := make([]byte, ahsHeaderSize)
		copy(header[:4], []byte("ABJR"))
		binary.LittleEndian.PutUint16(header[4:6], 0x0300)
		binary.LittleEndian.PutUint16(header[6:8], 0x0002)
		binary.LittleEndian.PutUint32(header[8:12], uint32(len(entry.Payload)))
		flag := entry.Flag
		if flag == 0 {
			flag = 0x80000002
			if len(entry.Payload) >= 2 && entry.Payload[0] == 0x1f && entry.Payload[1] == 0x8b {
				flag = 0x80000001
			}
		}
		binary.LittleEndian.PutUint32(header[16:20], flag)
		copy(header[20:52], []byte(entry.Name))
		buf.Write(header)
		buf.Write(entry.Payload)
	}
	return buf.Bytes()
}

func gzipBytes(t *testing.T, payload []byte) []byte {
	t.Helper()

	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(payload); err != nil {
		t.Fatalf("gzip payload: %v", err)
	}
	if err := zw.Close(); err != nil {
		t.Fatalf("close gzip writer: %v", err)
	}
	return buf.Bytes()
}

func sampleInventoryBlob() string {
	return stringsJoin(
		"iLO 6 v1.63p20 built on Sep 13 2024",
		"HPE",
		"ProLiant DL380 Gen11",
		"CZ2D1X0GS3",
		"P52560-421",
		"Proc 1",
		"Intel(R) Corporation",
		"Intel(R) Xeon(R) Gold 6444Y",
		"PROC 1 DIMM 3",
		"Hynix",
		"HMCG88AEBRA115N",
		"2B5F92C6",
		"Power Supply 1",
		"5XUWB0C4DJG4BV",
		"P03178-B21",
		"PciRoot(0x1)/Pci(0x5,0x0)/Pci(0x0,0x0)",
		"NIC.Slot.1.1",
		"Network Controller",
		"Slot 1",
		"MCX512A-ACAT",
		"MT2230478382",
		"PciRoot(0x3)/Pci(0x1,0x0)/Pci(0x0,0x0)",
		"OCP.Slot.15.1",
		"Broadcom NetXtreme Gigabit Ethernet - NIC",
		"OCP Slot 15",
		"P51183-001",
		"1CH0150001",
		"20.28.41",
		"System ROM",
		"v2.22 (06/19/2024)",
		"03/30/2026 09:47:33",
		"iLO network link down.",
		`{"@odata.id":"/redfish/v1/Systems/1/Storage/DE00A000/Controllers/0","@odata.type":"#StorageController.v1_7_0.StorageController","Id":"0","Name":"HPE MR408i-o Gen11","FirmwareVersion":"52.26.3-5379","Manufacturer":"HPE","Model":"HPE MR408i-o Gen11","PartNumber":"P58543-001","SKU":"P58335-B21","SerialNumber":"PXSFQ0BBIJY3B3","Status":{"State":"Enabled","Health":"OK"},"Location":{"PartLocation":{"ServiceLabel":"Slot=14","LocationType":"Slot","LocationOrdinalValue":14}},"PCIeInterface":{"PCIeType":"Gen4","LanesInUse":8}}`,
		`{"@odata.id":"/redfish/v1/Fabrics/DE00A000","@odata.type":"#Fabric.v1_3_0.Fabric","Id":"DE00A000","Name":"8 SFF 24G x1NVMe/SAS UBM3 BC BP","FabricType":"MultiProtocol"}`,
		`{"@odata.id":"/redfish/v1/Fabrics/DE00A000/Switches/1","@odata.type":"#Switch.v1_9_1.Switch","Id":"1","Name":"Direct Attached","Model":"UBM3","FirmwareVersion":"1.24","SupportedProtocols":["SAS","SATA","NVMe"],"SwitchType":"MultiProtocol","Status":{"State":"Enabled","Health":"OK"}}`,
		`{"@odata.id":"/redfish/v1/Chassis/DE00A000/Drives/0","@odata.type":"#Drive.v1_17_0.Drive","Id":"0","Name":"480GB 6G SATA SSD","Status":{"State":"StandbyOffline","Health":"OK"},"PhysicalLocation":{"PartLocation":{"ServiceLabel":"Slot=14:Port=1:Box=3:Bay=1","LocationType":"Bay","LocationOrdinalValue":1}},"CapacityBytes":480103981056,"MediaType":"SSD","Model":"SAMSUNGMZ7L3480HCHQ-00A07","Protocol":"SATA","Revision":"JXTC604Q","SerialNumber":"S664NC0Y502720","PredictedMediaLifeLeftPercent":100}`,
|
||||
`{"@odata.id":"/redfish/v1/Chassis/DE00A000/Drives/64515","@odata.type":"#Drive.v1_17_0.Drive","Id":"64515","Name":"Empty Bay","Status":{"State":"Absent","Health":"OK"}}`,
|
||||
)
|
||||
}
|
||||
|
||||
func sampleBCertBlob() string {
|
||||
return `<BC><MfgRecord><PowerSupplySlot id="0"><Present>Yes</Present><SerialNumber>5XUWB0C4DJG4BV</SerialNumber><FirmwareVersion>2.00</FirmwareVersion><SparePartNumber>P44412-001</SparePartNumber></PowerSupplySlot><FirmwareLockdown><SystemProgrammableLogicDevice>0x12</SystemProgrammableLogicDevice><ServerPlatformServicesSPSFirmware>6.1.4.47</ServerPlatformServicesSPSFirmware><STMicroGen11TPM>1.512</STMicroGen11TPM><HPEMR408i-oGen11>52.26.3-5379</HPEMR408i-oGen11><UBM3>UBM3/1.24</UBM3><BCM57191Gb4pBASE-TOCP3>20.28.41</BCM57191Gb4pBASE-TOCP3></FirmwareLockdown></MfgRecord></BC>`
|
||||
}
|
||||
|
||||
func stringsJoin(parts ...string) string {
|
||||
return string(bytes.Join(func() [][]byte {
|
||||
out := make([][]byte, 0, len(parts))
|
||||
for _, part := range parts {
|
||||
out = append(out, []byte(part))
|
||||
}
|
||||
return out
|
||||
}(), []byte{0}))
|
||||
}
|
||||
25
internal/parser/vendors/inspur/asset.go
vendored
@@ -94,8 +94,12 @@ type AssetJSON struct {
 	} `json:"PcieInfo"`
 }
 
-// ParseAssetJSON parses Inspur asset.json content
-func ParseAssetJSON(content []byte) (*models.HardwareConfig, error) {
+// ParseAssetJSON parses Inspur asset.json content.
+//   - pcieSlotDeviceNames: optional map from integer PCIe slot ID to device name string,
+//     sourced from devicefrusdr.log PCIe REST section. Fills missing NVMe model names.
+//   - pcieSlotSerials: optional map from integer PCIe slot ID to serial number string,
+//     sourced from audit.log SN-changed events. Fills missing NVMe serial numbers.
+func ParseAssetJSON(content []byte, pcieSlotDeviceNames map[int]string, pcieSlotSerials map[int]string) (*models.HardwareConfig, error) {
 	var asset AssetJSON
 	if err := json.Unmarshal(content, &asset); err != nil {
 		return nil, err
@@ -175,6 +179,23 @@ func ParseAssetJSON(content []byte) (*models.HardwareConfig, error) {
 			continue
 		}
 
+		// Enrich model name from PCIe device name (supplied from devicefrusdr.log).
+		// BMC does not populate HddInfo.ModelName for NVMe drives, but the PCIe REST
+		// section in devicefrusdr.log carries the drive model as device_name.
+		if modelName == "" && hdd.PcieSlot > 0 && len(pcieSlotDeviceNames) > 0 {
+			if devName, ok := pcieSlotDeviceNames[hdd.PcieSlot]; ok && devName != "" {
+				modelName = devName
+			}
+		}
+
+		// Enrich serial number from audit.log SN-changed events (supplied via pcieSlotSerials).
+		// BMC asset.json does not carry NVMe serial numbers; audit.log logs every SN change.
+		if serial == "" && hdd.PcieSlot > 0 && len(pcieSlotSerials) > 0 {
+			if sn, ok := pcieSlotSerials[hdd.PcieSlot]; ok && sn != "" {
+				serial = sn
+			}
+		}
+
 		storageType := "HDD"
 		if hdd.DiskInterfaceType == 5 {
 			storageType = "NVMe"
@@ -28,7 +28,7 @@ func TestParseAssetJSON_NVIDIAGPUModelFromPCIIDs(t *testing.T) {
 	}]
 }`)
 
-	hw, err := ParseAssetJSON(raw)
+	hw, err := ParseAssetJSON(raw, nil, nil)
 	if err != nil {
 		t.Fatalf("ParseAssetJSON failed: %v", err)
 	}
94
internal/parser/vendors/inspur/audit.go
vendored
Normal file
@@ -0,0 +1,94 @@
package inspur

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// auditSNChangedNVMeRegex matches:
//	"Front Back Plane N NVMe DiskM SN changed from X to Y"
// Captures: disk_num, new_serial
var auditSNChangedNVMeRegex = regexp.MustCompile(`NVMe Disk(\d+)\s+SN changed from \S+\s+to\s+(\S+)`)

// auditSNChangedRAIDRegex matches:
//	"Raid(Pcie Slot:N) HDD(enclosure id:E slot:S) SN changed from X to Y"
// Captures: pcie_slot, enclosure_id, slot_num, new_serial
var auditSNChangedRAIDRegex = regexp.MustCompile(`Raid\(Pcie Slot:(\d+)\) HDD\(enclosure id:(\d+) slot:(\d+)\)\s+SN changed from \S+\s+to\s+(\S+)`)

// ParseAuditLogNVMeSerials parses audit.log and returns the final (latest) serial number
// per NVMe disk number. The disk number matches the numeric suffix in PCIe location
// strings like "#NVME0", "#NVME2", etc. from devicefrusdr.log.
// Entries where the serial changed to "NULL" are excluded.
func ParseAuditLogNVMeSerials(content []byte) map[int]string {
	serials := make(map[int]string)

	for _, line := range strings.Split(string(content), "\n") {
		m := auditSNChangedNVMeRegex.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		diskNum, err := strconv.Atoi(m[1])
		if err != nil {
			continue
		}
		serial := strings.TrimSpace(m[2])
		if strings.EqualFold(serial, "NULL") || serial == "" {
			delete(serials, diskNum)
		} else {
			serials[diskNum] = serial
		}
	}
	if len(serials) == 0 {
		return nil
	}
	return serials
}

// ParseAuditLogRAIDSerials parses audit.log and returns the final (latest) serial number
// per RAID backplane disk. Key format is "BP{enclosure_id-1}:{slot_num}" (e.g. "BP0:0").
//
// Each disk slot is claimed by a specific RAID controller (Pcie Slot:N). NULL events from
// an old controller do not clear serials assigned by a newer controller, preventing stale
// deletions when disks are migrated between RAID arrays.
func ParseAuditLogRAIDSerials(content []byte) map[string]string {
	// owner tracks which PCIe RAID controller slot last assigned a serial to a disk key.
	serials := make(map[string]string)
	owner := make(map[string]int)

	for _, line := range strings.Split(string(content), "\n") {
		m := auditSNChangedRAIDRegex.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		pcieSlot, err := strconv.Atoi(m[1])
		if err != nil {
			continue
		}
		enclosureID, err := strconv.Atoi(m[2])
		if err != nil {
			continue
		}
		slotNum, err := strconv.Atoi(m[3])
		if err != nil {
			continue
		}
		serial := strings.TrimSpace(m[4])
		key := fmt.Sprintf("BP%d:%d", enclosureID-1, slotNum)
		if strings.EqualFold(serial, "NULL") || serial == "" {
			// Only clear if this controller was the last to set the serial.
			if owner[key] == pcieSlot {
				delete(serials, key)
				delete(owner, key)
			}
		} else {
			serials[key] = serial
			owner[key] = pcieSlot
		}
	}
	if len(serials) == 0 {
		return nil
	}
	return serials
}
347
internal/parser/vendors/inspur/component.go
vendored
@@ -100,10 +100,18 @@ func parseMemoryInfo(text string, hw *models.HardwareConfig) {
 		return
 	}
 
-	// Replace memory data with detailed info from component.log
-	hw.Memory = nil
+	var merged []models.MemoryDIMM
+	seen := make(map[string]int)
+	for _, existing := range hw.Memory {
+		key := inspurMemoryKey(existing)
+		if key == "" {
+			continue
+		}
+		seen[key] = len(merged)
+		merged = append(merged, existing)
+	}
 	for _, mem := range memInfo.MemModules {
-		hw.Memory = append(hw.Memory, models.MemoryDIMM{
+		item := models.MemoryDIMM{
 			Slot:     mem.MemModSlot,
 			Location: mem.MemModSlot,
 			Present:  mem.MemModStatus == 1 && mem.MemModSize > 0,
@@ -117,8 +125,18 @@ func parseMemoryInfo(text string, hw *models.HardwareConfig) {
 			PartNumber: strings.TrimSpace(mem.MemModPartNum),
 			Status:     mem.Status,
 			Ranks:      mem.MemModRanks,
-		})
+		}
+		key := inspurMemoryKey(item)
+		if idx, ok := seen[key]; ok {
+			mergeInspurMemoryDIMM(&merged[idx], item)
+			continue
+		}
+		if key != "" {
+			seen[key] = len(merged)
+		}
+		merged = append(merged, item)
 	}
+	hw.Memory = merged
 }
 
 // PSURESTInfo represents the RESTful PSU info structure
@@ -159,10 +177,18 @@ func parsePSUInfo(text string, hw *models.HardwareConfig) {
 		return
 	}
 
-	// Clear existing PSU data and populate with RESTful data
-	hw.PowerSupply = nil
+	var merged []models.PSU
+	seen := make(map[string]int)
+	for _, existing := range hw.PowerSupply {
+		key := inspurPSUKey(existing)
+		if key == "" {
+			continue
+		}
+		seen[key] = len(merged)
+		merged = append(merged, existing)
+	}
 	for _, psu := range psuInfo.PowerSupplies {
-		hw.PowerSupply = append(hw.PowerSupply, models.PSU{
+		item := models.PSU{
 			Slot:    fmt.Sprintf("PSU%d", psu.ID),
 			Present: psu.Present == 1,
 			Model:   strings.TrimSpace(psu.Model),
@@ -178,8 +204,18 @@ func parsePSUInfo(text string, hw *models.HardwareConfig) {
 			InputVoltage:  psu.PSInVolt,
 			OutputVoltage: psu.PSOutVolt,
 			TemperatureC:  psu.PSUMaxTemp,
-		})
+		}
+		key := inspurPSUKey(item)
+		if idx, ok := seen[key]; ok {
+			mergeInspurPSU(&merged[idx], item)
+			continue
+		}
+		if key != "" {
+			seen[key] = len(merged)
+		}
+		merged = append(merged, item)
 	}
+	hw.PowerSupply = merged
 }
 
 // HDDRESTInfo represents the RESTful HDD info structure
@@ -357,7 +393,16 @@ func parseNetworkAdapterInfo(text string, hw *models.HardwareConfig) {
 		return
 	}
 
-	hw.NetworkAdapters = nil
+	var merged []models.NetworkAdapter
+	seen := make(map[string]int)
+	for _, existing := range hw.NetworkAdapters {
+		key := inspurNICKey(existing)
+		if key == "" {
+			continue
+		}
+		seen[key] = len(merged)
+		merged = append(merged, existing)
+	}
 	for _, adapter := range netInfo.SysAdapters {
 		var macs []string
 		for _, port := range adapter.Ports {
@@ -377,7 +422,7 @@ func parseNetworkAdapterInfo(text string, hw *models.HardwareConfig) {
 			vendor = normalizeModelLabel(pciids.VendorName(adapter.VendorID))
 		}
 
-		hw.NetworkAdapters = append(hw.NetworkAdapters, models.NetworkAdapter{
+		item := models.NetworkAdapter{
 			Slot:     fmt.Sprintf("Slot %d", adapter.Slot),
 			Location: adapter.Location,
 			Present:  adapter.Present == 1,
@@ -392,8 +437,231 @@ func parseNetworkAdapterInfo(text string, hw *models.HardwareConfig) {
 			PortType:     adapter.PortType,
 			MACAddresses: macs,
 			Status:       adapter.Status,
-		})
+		}
+		key := inspurNICKey(item)
+		if idx, ok := seen[key]; ok {
+			mergeInspurNIC(&merged[idx], item)
+			continue
+		}
+		if slotIdx := inspurFindNICBySlot(merged, item.Slot); slotIdx >= 0 {
+			mergeInspurNIC(&merged[slotIdx], item)
+			if key != "" {
+				seen[key] = slotIdx
+			}
+			continue
+		}
+		if key != "" {
+			seen[key] = len(merged)
+		}
+		merged = append(merged, item)
 	}
+	hw.NetworkAdapters = merged
 }
+
+func inspurMemoryKey(item models.MemoryDIMM) string {
+	return strings.ToLower(strings.TrimSpace(inspurFirstNonEmpty(item.SerialNumber, item.Slot, item.Location)))
+}
+
+func mergeInspurMemoryDIMM(dst *models.MemoryDIMM, src models.MemoryDIMM) {
+	if dst == nil {
+		return
+	}
+	if strings.TrimSpace(dst.Slot) == "" {
+		dst.Slot = src.Slot
+	}
+	if strings.TrimSpace(dst.Location) == "" {
+		dst.Location = src.Location
+	}
+	dst.Present = dst.Present || src.Present
+	if dst.SizeMB == 0 {
+		dst.SizeMB = src.SizeMB
+	}
+	if strings.TrimSpace(dst.Type) == "" {
+		dst.Type = src.Type
+	}
+	if strings.TrimSpace(dst.Technology) == "" {
+		dst.Technology = src.Technology
+	}
+	if dst.MaxSpeedMHz == 0 {
+		dst.MaxSpeedMHz = src.MaxSpeedMHz
+	}
+	if dst.CurrentSpeedMHz == 0 {
+		dst.CurrentSpeedMHz = src.CurrentSpeedMHz
+	}
+	if strings.TrimSpace(dst.Manufacturer) == "" {
+		dst.Manufacturer = src.Manufacturer
+	}
+	if strings.TrimSpace(dst.SerialNumber) == "" {
+		dst.SerialNumber = src.SerialNumber
+	}
+	if strings.TrimSpace(dst.PartNumber) == "" {
+		dst.PartNumber = src.PartNumber
+	}
+	if strings.TrimSpace(dst.Status) == "" {
+		dst.Status = src.Status
+	}
+	if dst.Ranks == 0 {
+		dst.Ranks = src.Ranks
+	}
+}
+
+func inspurPSUKey(item models.PSU) string {
+	return strings.ToLower(strings.TrimSpace(inspurFirstNonEmpty(item.SerialNumber, item.Slot, item.Model)))
+}
+
+func mergeInspurPSU(dst *models.PSU, src models.PSU) {
+	if dst == nil {
+		return
+	}
+	if strings.TrimSpace(dst.Slot) == "" {
+		dst.Slot = src.Slot
+	}
+	dst.Present = dst.Present || src.Present
+	if strings.TrimSpace(dst.Model) == "" {
+		dst.Model = src.Model
+	}
+	if strings.TrimSpace(dst.Vendor) == "" {
+		dst.Vendor = src.Vendor
+	}
+	if dst.WattageW == 0 {
+		dst.WattageW = src.WattageW
+	}
+	if strings.TrimSpace(dst.SerialNumber) == "" {
+		dst.SerialNumber = src.SerialNumber
+	}
+	if strings.TrimSpace(dst.PartNumber) == "" {
+		dst.PartNumber = src.PartNumber
+	}
+	if strings.TrimSpace(dst.Firmware) == "" {
+		dst.Firmware = src.Firmware
+	}
+	if strings.TrimSpace(dst.Status) == "" {
+		dst.Status = src.Status
+	}
+	if strings.TrimSpace(dst.InputType) == "" {
+		dst.InputType = src.InputType
+	}
+	if dst.InputPowerW == 0 {
+		dst.InputPowerW = src.InputPowerW
+	}
+	if dst.OutputPowerW == 0 {
+		dst.OutputPowerW = src.OutputPowerW
+	}
+	if dst.InputVoltage == 0 {
+		dst.InputVoltage = src.InputVoltage
+	}
+	if dst.OutputVoltage == 0 {
+		dst.OutputVoltage = src.OutputVoltage
+	}
+	if dst.TemperatureC == 0 {
+		dst.TemperatureC = src.TemperatureC
+	}
+}
+
+func inspurNICKey(item models.NetworkAdapter) string {
+	return strings.ToLower(strings.TrimSpace(inspurFirstNonEmpty(item.SerialNumber, strings.Join(item.MACAddresses, ","), item.Slot, item.Location)))
+}
+
+func mergeInspurNIC(dst *models.NetworkAdapter, src models.NetworkAdapter) {
+	if dst == nil {
+		return
+	}
+	if strings.TrimSpace(dst.Slot) == "" {
+		dst.Slot = src.Slot
+	}
+	if strings.TrimSpace(dst.Location) == "" {
+		dst.Location = src.Location
+	}
+	dst.Present = dst.Present || src.Present
+	if strings.TrimSpace(dst.BDF) == "" {
+		dst.BDF = src.BDF
+	}
+	if strings.TrimSpace(dst.Model) == "" {
+		dst.Model = src.Model
+	}
+	if strings.TrimSpace(dst.Description) == "" {
+		dst.Description = src.Description
+	}
+	if strings.TrimSpace(dst.Vendor) == "" {
+		dst.Vendor = src.Vendor
+	}
+	if dst.VendorID == 0 {
+		dst.VendorID = src.VendorID
+	}
+	if dst.DeviceID == 0 {
+		dst.DeviceID = src.DeviceID
+	}
+	if strings.TrimSpace(dst.SerialNumber) == "" {
+		dst.SerialNumber = src.SerialNumber
+	}
+	if strings.TrimSpace(dst.PartNumber) == "" {
+		dst.PartNumber = src.PartNumber
+	}
+	if strings.TrimSpace(dst.Firmware) == "" {
+		dst.Firmware = src.Firmware
+	}
+	if dst.PortCount == 0 {
+		dst.PortCount = src.PortCount
+	}
+	if strings.TrimSpace(dst.PortType) == "" {
+		dst.PortType = src.PortType
+	}
+	if dst.LinkWidth == 0 {
+		dst.LinkWidth = src.LinkWidth
+	}
+	if strings.TrimSpace(dst.LinkSpeed) == "" {
+		dst.LinkSpeed = src.LinkSpeed
+	}
+	if dst.MaxLinkWidth == 0 {
+		dst.MaxLinkWidth = src.MaxLinkWidth
+	}
+	if strings.TrimSpace(dst.MaxLinkSpeed) == "" {
+		dst.MaxLinkSpeed = src.MaxLinkSpeed
+	}
+	if dst.NUMANode == 0 {
+		dst.NUMANode = src.NUMANode
+	}
+	if strings.TrimSpace(dst.Status) == "" {
+		dst.Status = src.Status
+	}
+	for _, mac := range src.MACAddresses {
+		mac = strings.TrimSpace(mac)
+		if mac == "" {
+			continue
+		}
+		found := false
+		for _, existing := range dst.MACAddresses {
+			if strings.EqualFold(strings.TrimSpace(existing), mac) {
+				found = true
+				break
+			}
+		}
+		if !found {
+			dst.MACAddresses = append(dst.MACAddresses, mac)
+		}
+	}
+}
+
+func inspurFindNICBySlot(items []models.NetworkAdapter, slot string) int {
+	slot = strings.ToLower(strings.TrimSpace(slot))
+	if slot == "" {
+		return -1
+	}
+	for i := range items {
+		if strings.ToLower(strings.TrimSpace(items[i].Slot)) == slot {
+			return i
+		}
+	}
+	return -1
+}
+
+func inspurFirstNonEmpty(values ...string) string {
+	for _, value := range values {
+		if strings.TrimSpace(value) != "" {
+			return strings.TrimSpace(value)
+		}
+	}
+	return ""
+}
 
 func parseFanSensors(text string) []models.SensorReading {
@@ -713,6 +981,63 @@ func extractComponentFirmware(text string, hw *models.HardwareConfig) {
 			}
 		}
 	}
+
+	// Extract BMC, CPLD and VR firmware from RESTful version info section.
+	// The JSON is a flat array: [{"id":N,"dev_name":"...","dev_version":"..."}, ...]
+	reVer := regexp.MustCompile(`RESTful version info:\s*(\[[\s\S]*?\])\s*RESTful`)
+	if match := reVer.FindStringSubmatch(text); match != nil {
+		type verEntry struct {
+			DevName    string `json:"dev_name"`
+			DevVersion string `json:"dev_version"`
+		}
+		var entries []verEntry
+		if err := json.Unmarshal([]byte(match[1]), &entries); err == nil {
+			for _, e := range entries {
+				name := normalizeVersionInfoName(e.DevName)
+				if name == "" {
+					continue
+				}
+				version := strings.TrimSpace(e.DevVersion)
+				if version == "" {
+					continue
+				}
+				if existingFW[name] {
+					continue
+				}
+				hw.Firmware = append(hw.Firmware, models.FirmwareInfo{
+					DeviceName: name,
+					Version:    version,
+				})
+				existingFW[name] = true
+			}
+		}
+	}
 }
+
+// normalizeVersionInfoName converts RESTful version info dev_name to a clean label.
+// Returns "" for entries that should be skipped (inactive BMC, PSU slots).
+func normalizeVersionInfoName(name string) string {
+	name = strings.TrimSpace(name)
+	if name == "" {
+		return ""
+	}
+	// Skip PSU_N entries — firmware already extracted from PSU info section.
+	if regexp.MustCompile(`(?i)^PSU_\d+$`).MatchString(name) {
+		return ""
+	}
+	// Skip the inactive BMC partition.
+	if strings.HasPrefix(strings.ToLower(name), "inactivate(") {
+		return ""
+	}
+	// Active BMC: "Activate(BMC1)" → "BMC"
+	if strings.HasPrefix(strings.ToLower(name), "activate(") {
+		return "BMC"
+	}
+	// Strip trailing "Version" suffix (case-insensitive), e.g. "MainBoard0CPLDVersion" → "MainBoard0CPLD"
+	if strings.HasSuffix(strings.ToLower(name), "version") {
+		name = name[:len(name)-len("version")]
+	}
+	return strings.TrimSpace(name)
+}
 
 // DiskBackplaneRESTInfo represents the RESTful diskbackplane info structure
58
internal/parser/vendors/inspur/component_test.go
vendored
@@ -51,6 +51,64 @@ RESTful fan`
 	}
 }
 
+func TestParseNetworkAdapterInfo_MergesIntoExistingInventory(t *testing.T) {
+	text := `RESTful Network Adapter info:
+{
+	"sys_adapters": [
+		{
+			"id": 1,
+			"name": "NIC1",
+			"Location": "#CPU0_PCIE4",
+			"present": 1,
+			"slot": 4,
+			"vendor_id": 32902,
+			"device_id": 5409,
+			"vendor": "Mellanox",
+			"model": "ConnectX-6",
+			"fw_ver": "22.1.0",
+			"status": "OK",
+			"sn": "",
+			"pn": "",
+			"port_num": 2,
+			"port_type": "QSFP",
+			"ports": [
+				{ "id": 1, "mac_addr": "00:11:22:33:44:55" }
+			]
+		}
+	]
+}
+RESTful fan`
+
+	hw := &models.HardwareConfig{
+		NetworkAdapters: []models.NetworkAdapter{
+			{
+				Slot:         "Slot 4",
+				BDF:          "0000:17:00.0",
+				SerialNumber: "NIC-SN-1",
+				Present:      true,
+			},
+		},
+	}
+	parseNetworkAdapterInfo(text, hw)
+
+	if len(hw.NetworkAdapters) != 1 {
+		t.Fatalf("expected merged single adapter, got %d", len(hw.NetworkAdapters))
+	}
+	got := hw.NetworkAdapters[0]
+	if got.BDF != "0000:17:00.0" {
+		t.Fatalf("expected existing BDF to survive merge, got %q", got.BDF)
+	}
+	if got.Model != "ConnectX-6" {
+		t.Fatalf("expected model from component log, got %q", got.Model)
+	}
+	if got.SerialNumber != "NIC-SN-1" {
+		t.Fatalf("expected serial from existing inventory to survive merge, got %q", got.SerialNumber)
+	}
+	if len(got.MACAddresses) != 1 || got.MACAddresses[0] != "00:11:22:33:44:55" {
+		t.Fatalf("expected MAC addresses from component log, got %#v", got.MACAddresses)
+	}
+}
+
 func TestParseComponentLogSensors_ExtractsFanBackplaneAndPSUSummary(t *testing.T) {
 	text := `RESTful PSU info:
 {
33
internal/parser/vendors/inspur/event_logs_test.go
vendored
Normal file
@@ -0,0 +1,33 @@
package inspur

import "testing"

func TestParseIDLLog_UsesBMCSourceForEventLogs(t *testing.T) {
	content := []byte(`|2025-12-02T17:54:27+08:00|MEMORY|Assert|Warning|0C180401|CPU1_C4D0 Memory Device Disabled - Assert|`)

	events := ParseIDLLog(content)
	if len(events) != 1 {
		t.Fatalf("expected 1 event, got %d", len(events))
	}
	if events[0].Source != "BMC" {
		t.Fatalf("expected IDL events to use BMC source, got %#v", events[0])
	}
	if events[0].SensorName != "CPU1_C4D0" {
		t.Fatalf("expected extracted DIMM component ref, got %#v", events[0])
	}
}

func TestParseSyslog_UsesHostSourceAndProcessAsSensorName(t *testing.T) {
	content := []byte(`<13>2026-03-15T14:03:11+00:00 host123 systemd[1]: Started Example Service`)

	events := ParseSyslog(content, "syslog/info")
	if len(events) != 1 {
		t.Fatalf("expected 1 event, got %d", len(events))
	}
	if events[0].Source != "syslog" {
		t.Fatalf("expected syslog source, got %#v", events[0])
	}
	if events[0].SensorName != "systemd[1]" {
		t.Fatalf("expected process name in sensor/component slot, got %#v", events[0])
	}
}
@@ -165,7 +165,10 @@ func TestParseIDLLog_ParsesStructuredJSONLine(t *testing.T) {
 	if events[0].ID != "17FFB002" {
 		t.Fatalf("expected event ID 17FFB002, got %q", events[0].ID)
 	}
-	if events[0].Source != "PCIE" {
-		t.Fatalf("expected source PCIE, got %q", events[0].Source)
+	if events[0].Source != "BMC" {
+		t.Fatalf("expected BMC source for IDL event, got %q", events[0].Source)
 	}
+	if events[0].SensorType != "pcie" {
+		t.Fatalf("expected component type pcie, got %#v", events[0])
+	}
 }
2
internal/parser/vendors/inspur/idl.go
vendored
@@ -60,7 +60,7 @@ func ParseIDLLog(content []byte) []models.Event {
 		events = append(events, models.Event{
 			ID:         eventID,
 			Timestamp:  ts,
-			Source:     component,
+			Source:     "BMC",
 			SensorType: strings.ToLower(component),
 			SensorName: sensorName,
 			EventType:  eventType,
40
internal/parser/vendors/inspur/parser.go
vendored
@@ -16,7 +16,7 @@ import (
 
 // parserVersion - version of this parser module
 // IMPORTANT: Increment this version when making changes to parser logic!
-const parserVersion = "1.5"
+const parserVersion = "1.8"
 
 func init() {
 	parser.Register(&Parser{})
@@ -95,9 +95,41 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
 		Sensors: make([]models.SensorReading, 0),
 	}
 
+	// Pre-parse enrichment maps from devicefrusdr.log for use inside ParseAssetJSON.
+	// BMC does not populate HddInfo.ModelName or SerialNumber for NVMe drives.
+	var pcieSlotDeviceNames map[int]string
+	var nvmeLocToSlot map[int]int
+	if f := parser.FindFileByName(files, "devicefrusdr.log"); f != nil {
+		pcieSlotDeviceNames = ParsePCIeSlotDeviceNames(f.Content)
+		nvmeLocToSlot = ParsePCIeNVMeLocToSlot(f.Content)
+	}
+
+	// Parse NVMe serial numbers from audit.log: every disk SN change is logged there.
+	// Combine with the NVMe loc→slot mapping to build pcieSlot→serial map.
+	// Also parse RAID disk serials by backplane slot key (e.g. "BP0:0").
+	var pcieSlotSerials map[int]string
+	var raidSlotSerials map[string]string
+	if f := parser.FindFileByName(files, "audit.log"); f != nil {
+		if len(nvmeLocToSlot) > 0 {
+			nvmeDiskSerials := ParseAuditLogNVMeSerials(f.Content)
+			if len(nvmeDiskSerials) > 0 {
+				pcieSlotSerials = make(map[int]string, len(nvmeDiskSerials))
+				for diskNum, serial := range nvmeDiskSerials {
+					if slot, ok := nvmeLocToSlot[diskNum]; ok {
+						pcieSlotSerials[slot] = serial
+					}
+				}
+				if len(pcieSlotSerials) == 0 {
+					pcieSlotSerials = nil
+				}
+			}
+		}
+		raidSlotSerials = ParseAuditLogRAIDSerials(f.Content)
+	}
+
 	// Parse asset.json first (base hardware info)
 	if f := parser.FindFileByName(files, "asset.json"); f != nil {
-		if hw, err := ParseAssetJSON(f.Content); err == nil {
+		if hw, err := ParseAssetJSON(f.Content, pcieSlotDeviceNames, pcieSlotSerials); err == nil {
 			result.Hardware = hw
 		}
 	}
@@ -182,6 +214,10 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
 	if result.Hardware != nil {
 		applyGPUStatusFromEvents(result.Hardware, result.Events)
 		enrichStorageFromSerialFallbackFiles(files, result.Hardware)
+		// Apply RAID disk serials from audit.log (authoritative: last non-NULL SN change).
+		// These override redis/component.log serials which may be stale after disk replacement.
+		applyRAIDSlotSerials(result.Hardware, raidSlotSerials)
 		parser.ApplyManufacturedYearWeekFromFRU(result.FRU, result.Hardware)
 	}
 
 	return result, nil
79
internal/parser/vendors/inspur/pcie.go
vendored
@@ -4,6 +4,7 @@ import (
 	"encoding/json"
 	"fmt"
+	"regexp"
 	"strconv"
 	"strings"

 	"git.mchus.pro/mchus/logpile/internal/models"

@@ -37,6 +38,84 @@ type PCIeRESTInfo []struct {
	FwVer string `json:"fw_ver"`
}

// ParsePCIeSlotDeviceNames parses devicefrusdr.log and returns a map from integer PCIe slot ID
// to device name string. Used to enrich HddInfo entries in asset.json that lack model names.
func ParsePCIeSlotDeviceNames(content []byte) map[int]string {
	info, ok := parsePCIeRESTJSON(content)
	if !ok {
		return nil
	}
	result := make(map[int]string, len(info))
	for _, entry := range info {
		if entry.Slot <= 0 {
			continue
		}
		name := sanitizePCIeDeviceName(entry.DeviceName)
		if name != "" {
			result[entry.Slot] = name
		}
	}
	if len(result) == 0 {
		return nil
	}
	return result
}

// parsePCIeRESTJSON parses the RESTful PCIE Device info JSON from devicefrusdr.log content.
func parsePCIeRESTJSON(content []byte) (PCIeRESTInfo, bool) {
	text := string(content)
	startMarker := "RESTful PCIE Device info:"
	endMarker := "BMC sdr Info:"

	startIdx := strings.Index(text, startMarker)
	if startIdx == -1 {
		return nil, false
	}
	endIdx := strings.Index(text[startIdx:], endMarker)
	if endIdx == -1 {
		endIdx = len(text) - startIdx
	}
	jsonText := strings.TrimSpace(text[startIdx+len(startMarker) : startIdx+endIdx])

	var info PCIeRESTInfo
	if err := json.Unmarshal([]byte(jsonText), &info); err != nil {
		return nil, false
	}
	return info, true
}

// ParsePCIeNVMeLocToSlot parses devicefrusdr.log and returns a map from NVMe location number
// (the numeric suffix in "#NVME0", "#NVME2", etc.) to the integer PCIe slot ID.
// This is used to correlate audit.log NVMe disk numbers with HddInfo PcieSlot values.
func ParsePCIeNVMeLocToSlot(content []byte) map[int]int {
	info, ok := parsePCIeRESTJSON(content)
	if !ok {
		return nil
	}

	nvmeLocRegex := regexp.MustCompile(`(?i)^#NVME(\d+)$`)
	result := make(map[int]int)
	for _, entry := range info {
		if entry.Slot <= 0 {
			continue
		}
		loc := strings.TrimSpace(entry.Location)
		m := nvmeLocRegex.FindStringSubmatch(loc)
		if m == nil {
			continue
		}
		locNum, err := strconv.Atoi(m[1])
		if err != nil {
			continue
		}
		result[locNum] = entry.Slot
	}
	if len(result) == 0 {
		return nil
	}
	return result
}

// ParsePCIeDevices parses RESTful PCIE Device info from devicefrusdr.log
func ParsePCIeDevices(content []byte) []models.PCIeDevice {
	text := string(content)

@@ -73,6 +73,24 @@ func looksLikeStorageSerial(v string) bool {
	return hasLetter && hasDigit
}

// applyRAIDSlotSerials updates storage serial numbers using the slot→serial map
// derived from audit.log RAID SN change events. Overwrites existing serials since
// audit.log represents the authoritative current state after all disk replacements.
func applyRAIDSlotSerials(hw *models.HardwareConfig, serials map[string]string) {
	if hw == nil || len(serials) == 0 {
		return
	}
	for i := range hw.Storage {
		slot := strings.TrimSpace(hw.Storage[i].Slot)
		if slot == "" {
			continue
		}
		if sn, ok := serials[slot]; ok && sn != "" {
			hw.Storage[i].SerialNumber = sn
		}
	}
}

func applyStorageSerialFallback(hw *models.HardwareConfig, serials []string) {
	if hw == nil || len(hw.Storage) == 0 || len(serials) == 0 {
		return

@@ -26,7 +26,7 @@ func TestParseAssetJSON_HddSlotFallbackAndPresence(t *testing.T) {
	]
}`)

-	hw, err := ParseAssetJSON(content)
+	hw, err := ParseAssetJSON(content, nil, nil)
	if err != nil {
		t.Fatalf("ParseAssetJSON failed: %v", err)
	}

4  internal/parser/vendors/inspur/syslog.go (vendored)
@@ -48,9 +48,9 @@ func ParseSyslog(content []byte, sourcePath string) []models.Event {
	event := models.Event{
		ID:          generateEventID(sourcePath, lineNum),
		Timestamp:   timestamp,
-		Source:      matches[4],
+		Source:      "syslog",
		SensorType:  "syslog",
-		SensorName:  matches[3],
+		SensorName:  matches[4],
		Description: matches[5],
		Severity:    severity,
		RawData:     line,

710  internal/parser/vendors/lenovo_xcc/parser.go (vendored, new file)
@@ -0,0 +1,710 @@
// Package lenovo_xcc provides parser for Lenovo XCC mini-log archives.
// Tested with: ThinkSystem SR650 V3 (XCC mini-log zip, exported via XCC UI)
//
// Archive structure: zip with tmp/ directory containing JSON .log files.
//
// IMPORTANT: Increment parserVersion when modifying parser logic!
package lenovo_xcc

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

const parserVersion = "1.1"

func init() {
	parser.Register(&Parser{})
}

// Parser implements VendorParser for Lenovo XCC mini-log archives.
type Parser struct{}

func (p *Parser) Name() string    { return "Lenovo XCC Mini-Log Parser" }
func (p *Parser) Vendor() string  { return "lenovo_xcc" }
func (p *Parser) Version() string { return parserVersion }

// Detect checks if files match the Lenovo XCC mini-log archive format.
// Returns confidence score 0-100.
func (p *Parser) Detect(files []parser.ExtractedFile) int {
	confidence := 0
	for _, f := range files {
		path := strings.ToLower(f.Path)
		switch {
		case strings.HasSuffix(path, "tmp/basic_sys_info.log"):
			confidence += 60
		case strings.HasSuffix(path, "tmp/inventory_cpu.log"):
			confidence += 20
		case strings.HasSuffix(path, "tmp/xcc_plat_events1.log"):
			confidence += 20
		case strings.HasSuffix(path, "tmp/inventory_dimm.log"):
			confidence += 10
		case strings.HasSuffix(path, "tmp/inventory_fw.log"):
			confidence += 10
		}
		if confidence >= 100 {
			return 100
		}
	}
	return confidence
}

// Parse parses the Lenovo XCC mini-log archive and returns an analysis result.
func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, error) {
	result := &models.AnalysisResult{
		Events:  make([]models.Event, 0),
		FRU:     make([]models.FRUInfo, 0),
		Sensors: make([]models.SensorReading, 0),
		Hardware: &models.HardwareConfig{
			Firmware:    make([]models.FirmwareInfo, 0),
			CPUs:        make([]models.CPU, 0),
			Memory:      make([]models.MemoryDIMM, 0),
			Storage:     make([]models.Storage, 0),
			PCIeDevices: make([]models.PCIeDevice, 0),
			PowerSupply: make([]models.PSU, 0),
		},
	}

	if f := findByPath(files, "tmp/basic_sys_info.log"); f != nil {
		parseBasicSysInfo(f.Content, result)
	}
	if f := findByPath(files, "tmp/inventory_fw.log"); f != nil {
		result.Hardware.Firmware = append(result.Hardware.Firmware, parseFirmware(f.Content)...)
	}
	if f := findByPath(files, "tmp/inventory_cpu.log"); f != nil {
		result.Hardware.CPUs = parseCPUs(f.Content)
	}
	if f := findByPath(files, "tmp/inventory_dimm.log"); f != nil {
		memory, events := parseDIMMs(f.Content)
		result.Hardware.Memory = memory
		result.Events = append(result.Events, events...)
	}
	if f := findByPath(files, "tmp/inventory_disk.log"); f != nil {
		result.Hardware.Storage = parseDisks(f.Content)
	}
	if f := findByPath(files, "tmp/inventory_card.log"); f != nil {
		result.Hardware.PCIeDevices = parseCards(f.Content)
	}
	if f := findByPath(files, "tmp/inventory_psu.log"); f != nil {
		result.Hardware.PowerSupply = parsePSUs(f.Content)
	}
	if f := findByPath(files, "tmp/inventory_ipmi_fru.log"); f != nil {
		result.FRU = parseFRU(f.Content)
	}
	if f := findByPath(files, "tmp/inventory_ipmi_sensor.log"); f != nil {
		result.Sensors = parseSensors(f.Content)
	}
	for _, f := range findEventFiles(files) {
		result.Events = append(result.Events, parseEvents(f.Content)...)
	}

	result.Protocol = "ipmi"
	result.SourceType = models.SourceTypeArchive
	parser.ApplyManufacturedYearWeekFromFRU(result.FRU, result.Hardware)

	return result, nil
}

// findByPath returns the first file whose lowercased path ends with the given suffix.
func findByPath(files []parser.ExtractedFile, suffix string) *parser.ExtractedFile {
	for i := range files {
		if strings.HasSuffix(strings.ToLower(files[i].Path), suffix) {
			return &files[i]
		}
	}
	return nil
}

// findEventFiles returns all xcc_plat_eventsN.log files.
func findEventFiles(files []parser.ExtractedFile) []parser.ExtractedFile {
	var out []parser.ExtractedFile
	for _, f := range files {
		path := strings.ToLower(f.Path)
		if strings.Contains(path, "tmp/xcc_plat_events") && strings.HasSuffix(path, ".log") {
			out = append(out, f)
		}
	}
	return out
}

// --- JSON structures ---

type xccBasicSysInfoDoc struct {
	Items []xccBasicSysInfoItem `json:"items"`
}

type xccBasicSysInfoItem struct {
	MachineName      string `json:"machine_name"`
	MachineTypeModel string `json:"machine_typemodel"`
	SerialNumber     string `json:"serial_number"`
	UUID             string `json:"uuid"`
	PowerState       string `json:"power_state"`
	ServerState      string `json:"server_state"`
	CurrentTime      string `json:"current_time"`
}

// xccFWEntry covers both basic_sys_info firmware (no type_str) and inventory_fw (has type_str).
type xccFWEntry struct {
	Index       int    `json:"index"`
	TypeCode    int    `json:"type"`
	TypeStr     string `json:"type_str"` // only in inventory_fw.log
	Version     string `json:"version"`
	Build       string `json:"build"`
	ReleaseDate string `json:"release_date"`
}

type xccFirmwareDoc struct {
	Items []xccFWEntry `json:"items"`
}

type xccCPUDoc struct {
	Items []xccCPUItem `json:"items"`
}

type xccCPUItem struct {
	Processors []xccCPU `json:"processors"`
}

type xccCPU struct {
	Name         int             `json:"processors_name"`
	Model        string          `json:"processors_cpu_model"`
	Cores        json.RawMessage `json:"processors_cores"`   // may be int or string
	Threads      json.RawMessage `json:"processors_threads"` // may be int or string
	ClockSpeed   string          `json:"processors_clock_speed"`
	L1DataCache  string          `json:"processors_l1datacache"`
	L2Cache      string          `json:"processors_l2cache"`
	L3Cache      string          `json:"processors_l3cache"`
	Status       string          `json:"processors_status"`
	SerialNumber string          `json:"processors_serial_number"`
}

type xccDIMMDoc struct {
	Items []xccDIMMItem `json:"items"`
}

type xccDIMMItem struct {
	Memory []xccDIMM `json:"memory"`
}

type xccDIMM struct {
	Index        int             `json:"memory_index"`
	Status       string          `json:"memory_status"`
	Name         string          `json:"memory_name"`
	Type         string          `json:"memory_type"`
	Capacity     json.RawMessage `json:"memory_capacity"` // int (GB) or string
	PartNumber   string          `json:"memory_part_number"`
	SerialNumber string          `json:"memory_serial_number"`
	Manufacturer string          `json:"memory_manufacturer"`
	MemSpeed     json.RawMessage `json:"memory_mem_speed"`    // int or string
	ConfigSpeed  json.RawMessage `json:"memory_config_speed"` // int or string
}

type xccDiskDoc struct {
	Items []xccDiskItem `json:"items"`
}

type xccDiskItem struct {
	Disks []xccDisk `json:"disks"`
}

type xccDisk struct {
	ID           int    `json:"id"`
	SlotNo       int    `json:"slotNo"`
	Type         string `json:"type"`
	Interface    string `json:"interface"`
	Media        string `json:"media"`
	SerialNo     string `json:"serialNo"`
	PartNo       string `json:"partNo"`
	CapacityStr  string `json:"capacityStr"` // e.g. "3.20 TB"
	Manufacture  string `json:"manufacture"`
	ProductName  string `json:"productName"`
	RemainLife   int    `json:"remainLife"` // 0-100
	FWVersion    string `json:"fwVersion"`
	Temperature  int    `json:"temperature"`
	HealthStatus int    `json:"healthStatus"` // int code: 2=Normal
	State        int    `json:"state"`
	StateStr     string `json:"statestr"`
}

type xccCardDoc struct {
	Items []xccCard `json:"items"`
}

type xccCard struct {
	Key            int           `json:"key"`
	SlotNo         int           `json:"slotNo"`
	AdapterName    string        `json:"adapterName"`
	ConnectorLabel string        `json:"connectorLabel"`
	OOBSupported   int           `json:"oobSupported"`
	Location       int           `json:"location"`
	Functions      []xccCardFunc `json:"functions"`
}

type xccCardFunc struct {
	FunType         int    `json:"funType"`
	BusNo           int    `json:"generic_busNo"`
	DevNo           int    `json:"generic_devNo"`
	FunNo           int    `json:"generic_funNo"`
	VendorID        int    `json:"generic_vendorId"` // direct int
	DeviceID        int    `json:"generic_devId"`    // direct int
	SlotDesignation string `json:"generic_slotDesignation"`
}

type xccPSUDoc struct {
	Items []xccPSUItem `json:"items"`
}

type xccPSUItem struct {
	Power []xccPSU `json:"power"`
}

type xccPSU struct {
	Name         int    `json:"name"`
	Status       string `json:"status"`
	RatedPower   int    `json:"rated_power"`
	PartNumber   string `json:"part_number"`
	FRUNumber    string `json:"fru_number"`
	SerialNumber string `json:"serial_number"`
	ManufID      string `json:"manuf_id"`
}

type xccFRUDoc struct {
	Items []xccFRUItem `json:"items"`
}

type xccFRUItem struct {
	BuiltinFRU []map[string]string `json:"builtin_fru_device"`
}

type xccSensorDoc struct {
	Items []xccSensor `json:"items"`
}

type xccSensor struct {
	Name   string `json:"Sensor Name"`
	Value  string `json:"Value"`
	Status string `json:"status"`
	Unit   string `json:"unit"`
}

type xccEventDoc struct {
	Items []xccEvent `json:"items"`
}

type xccEvent struct {
	Severity string `json:"severity"` // "I", "W", "E", "C"
	Source   string `json:"source"`
	Date     string `json:"date"` // "2025-12-22T13:24:02.070"
	Index    int    `json:"index"`
	EventID  string `json:"eventid"`
	CmnID    string `json:"cmnid"`
	Message  string `json:"message"`
}

// --- Parsers ---

func parseBasicSysInfo(content []byte, result *models.AnalysisResult) {
	var doc xccBasicSysInfoDoc
	if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
		return
	}
	item := doc.Items[0]

	result.Hardware.BoardInfo = models.BoardInfo{
		ProductName:  strings.TrimSpace(item.MachineTypeModel),
		SerialNumber: strings.TrimSpace(item.SerialNumber),
		UUID:         strings.TrimSpace(item.UUID),
	}

	if t, err := parseXCCTime(item.CurrentTime); err == nil {
		result.CollectedAt = t.UTC()
	}
}

func parseFirmware(content []byte) []models.FirmwareInfo {
	var doc xccFirmwareDoc
	if err := json.Unmarshal(content, &doc); err != nil {
		return nil
	}
	var out []models.FirmwareInfo
	for _, fw := range doc.Items {
		if fi := xccFWEntryToModel(fw); fi != nil {
			out = append(out, *fi)
		}
	}
	return out
}

func xccFWEntryToModel(fw xccFWEntry) *models.FirmwareInfo {
	name := strings.TrimSpace(fw.TypeStr)
	version := strings.TrimSpace(fw.Version)
	if name == "" && version == "" {
		return nil
	}
	build := strings.TrimSpace(fw.Build)
	v := version
	if build != "" {
		v = version + " (" + build + ")"
	}
	return &models.FirmwareInfo{
		DeviceName: name,
		Version:    v,
		BuildTime:  strings.TrimSpace(fw.ReleaseDate),
	}
}

func parseCPUs(content []byte) []models.CPU {
	var doc xccCPUDoc
	if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
		return nil
	}
	var out []models.CPU
	for _, item := range doc.Items {
		for _, c := range item.Processors {
			cpu := models.CPU{
				Socket:       c.Name,
				Model:        strings.TrimSpace(c.Model),
				Cores:        rawJSONToInt(c.Cores),
				Threads:      rawJSONToInt(c.Threads),
				FrequencyMHz: parseMHz(c.ClockSpeed),
				L1CacheKB:    parseKB(c.L1DataCache),
				L2CacheKB:    parseKB(c.L2Cache),
				L3CacheKB:    parseKB(c.L3Cache),
				Status:       strings.TrimSpace(c.Status),
				SerialNumber: strings.TrimSpace(c.SerialNumber),
			}
			out = append(out, cpu)
		}
	}
	return out
}

func parseDIMMs(content []byte) ([]models.MemoryDIMM, []models.Event) {
	var doc xccDIMMDoc
	if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
		return nil, nil
	}
	var out []models.MemoryDIMM
	var events []models.Event
	for _, item := range doc.Items {
		for _, m := range item.Memory {
			status := strings.TrimSpace(m.Status)
			present := !strings.EqualFold(status, "not present") &&
				!strings.EqualFold(status, "absent")
			// memory_capacity is in GB (int); convert to MB
			capacityGB := rawJSONToInt(m.Capacity)
			dimm := models.MemoryDIMM{
				Slot:            strings.TrimSpace(m.Name),
				Location:        strings.TrimSpace(m.Name),
				Present:         present,
				SizeMB:          capacityGB * 1024,
				Type:            strings.TrimSpace(m.Type),
				MaxSpeedMHz:     rawJSONToInt(m.MemSpeed),
				CurrentSpeedMHz: rawJSONToInt(m.ConfigSpeed),
				Manufacturer:    strings.TrimSpace(m.Manufacturer),
				SerialNumber:    strings.TrimSpace(m.SerialNumber),
				PartNumber:      strings.TrimSpace(strings.TrimRight(m.PartNumber, " ")),
				Status:          status,
			}
			out = append(out, dimm)
			if isUnqualifiedDIMM(status) {
				events = append(events, models.Event{
					Source:      "Memory",
					SensorType:  "Memory",
					SensorName:  dimm.Slot,
					EventType:   "DIMM Qualification",
					Severity:    models.SeverityWarning,
					Description: status,
				})
			}
		}
	}
	return out, events
}

func parseDisks(content []byte) []models.Storage {
	var doc xccDiskDoc
	if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
		return nil
	}
	var out []models.Storage
	for _, item := range doc.Items {
		for _, d := range item.Disks {
			sizeGB := parseCapacityToGB(d.CapacityStr)
			stateStr := strings.TrimSpace(d.StateStr)
			present := !strings.EqualFold(stateStr, "absent") &&
				!strings.EqualFold(stateStr, "not present")
			disk := models.Storage{
				Slot:         fmt.Sprintf("%d", d.SlotNo),
				Type:         strings.TrimSpace(d.Media),
				Model:        strings.TrimSpace(d.ProductName),
				SizeGB:       sizeGB,
				SerialNumber: strings.TrimSpace(d.SerialNo),
				Manufacturer: strings.TrimSpace(d.Manufacture),
				Firmware:     strings.TrimSpace(d.FWVersion),
				Interface:    strings.TrimSpace(d.Interface),
				Present:      present,
				Status:       stateStr,
			}
			if d.RemainLife >= 0 && d.RemainLife <= 100 {
				v := d.RemainLife
				disk.RemainingEndurancePct = &v
			}
			out = append(out, disk)
		}
	}
	return out
}

func parseCards(content []byte) []models.PCIeDevice {
	var doc xccCardDoc
	if err := json.Unmarshal(content, &doc); err != nil {
		return nil
	}
	var out []models.PCIeDevice
	for _, card := range doc.Items {
		slot := strings.TrimSpace(card.ConnectorLabel)
		if slot == "" {
			slot = fmt.Sprintf("%d", card.SlotNo)
		}
		dev := models.PCIeDevice{
			Slot:        slot,
			Description: strings.TrimSpace(card.AdapterName),
		}
		if len(card.Functions) > 0 {
			fn := card.Functions[0]
			dev.BDF = fmt.Sprintf("%02x:%02x.%x", fn.BusNo, fn.DevNo, fn.FunNo)
			dev.VendorID = fn.VendorID
			dev.DeviceID = fn.DeviceID
		}
		out = append(out, dev)
	}
	return out
}

func parsePSUs(content []byte) []models.PSU {
	var doc xccPSUDoc
	if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
		return nil
	}
	var out []models.PSU
	for _, item := range doc.Items {
		for _, p := range item.Power {
			psu := models.PSU{
				Slot:         fmt.Sprintf("%d", p.Name),
				Present:      true,
				WattageW:     p.RatedPower,
				SerialNumber: strings.TrimSpace(p.SerialNumber),
				PartNumber:   strings.TrimSpace(p.PartNumber),
				Vendor:       strings.TrimSpace(p.ManufID),
				Status:       strings.TrimSpace(p.Status),
			}
			out = append(out, psu)
		}
	}
	return out
}

func parseFRU(content []byte) []models.FRUInfo {
	var doc xccFRUDoc
	if err := json.Unmarshal(content, &doc); err != nil || len(doc.Items) == 0 {
		return nil
	}
	var out []models.FRUInfo
	for _, item := range doc.Items {
		for _, entry := range item.BuiltinFRU {
			fru := models.FRUInfo{
				Description:  entry["FRU Device Description"],
				Manufacturer: entry["Board Mfg"],
				ProductName:  entry["Board Product"],
				SerialNumber: entry["Board Serial"],
				PartNumber:   entry["Board Part Number"],
				MfgDate:      entry["Board Mfg Date"],
			}
			if fru.ProductName == "" {
				fru.ProductName = entry["Product Name"]
			}
			if fru.SerialNumber == "" {
				fru.SerialNumber = entry["Product Serial"]
			}
			if fru.PartNumber == "" {
				fru.PartNumber = entry["Product Part Number"]
			}
			if fru.Description == "" && fru.ProductName == "" && fru.SerialNumber == "" {
				continue
			}
			out = append(out, fru)
		}
	}
	return out
}

func parseSensors(content []byte) []models.SensorReading {
	var doc xccSensorDoc
	if err := json.Unmarshal(content, &doc); err != nil {
		return nil
	}
	var out []models.SensorReading
	for _, s := range doc.Items {
		name := strings.TrimSpace(s.Name)
		if name == "" {
			continue
		}
		sr := models.SensorReading{
			Name:     name,
			RawValue: strings.TrimSpace(s.Value),
			Unit:     strings.TrimSpace(s.Unit),
			Status:   strings.TrimSpace(s.Status),
		}
		if v, err := strconv.ParseFloat(sr.RawValue, 64); err == nil {
			sr.Value = v
		}
		out = append(out, sr)
	}
	return out
}

func parseEvents(content []byte) []models.Event {
	var doc xccEventDoc
	if err := json.Unmarshal(content, &doc); err != nil {
		return nil
	}
	var out []models.Event
	for _, e := range doc.Items {
		ev := models.Event{
			ID:          e.EventID,
			Source:      strings.TrimSpace(e.Source),
			Description: strings.TrimSpace(e.Message),
			Severity:    xccSeverity(e.Severity, e.Message),
		}
		if t, err := parseXCCTime(e.Date); err == nil {
			ev.Timestamp = t.UTC()
		}
		out = append(out, ev)
	}
	return out
}

// --- Helpers ---

func xccSeverity(s, message string) models.Severity {
	if isUnqualifiedDIMM(message) {
		return models.SeverityWarning
	}
	switch strings.ToUpper(strings.TrimSpace(s)) {
	case "C":
		return models.SeverityCritical
	case "E":
		return models.SeverityCritical
	case "W":
		return models.SeverityWarning
	default:
		return models.SeverityInfo
	}
}

func isUnqualifiedDIMM(value string) bool {
	return strings.Contains(strings.ToLower(strings.TrimSpace(value)), "unqualified dimm")
}

func parseXCCTime(s string) (time.Time, error) {
	s = strings.TrimSpace(s)
	formats := []string{
		"2006-01-02T15:04:05.000",
		"2006-01-02T15:04:05",
		"2006-01-02 15:04:05",
	}
	for _, f := range formats {
		if t, err := time.Parse(f, s); err == nil {
			return t, nil
		}
	}
	return time.Time{}, fmt.Errorf("unparseable time: %q", s)
}

|
||||
// parseMHz parses "4100 MHz" → 4100
func parseMHz(s string) int {
	s = strings.TrimSpace(s)
	parts := strings.Fields(s)
	if len(parts) == 0 {
		return 0
	}
	v, _ := strconv.Atoi(parts[0])
	return v
}

// parseKB parses "384 KB" → 384
func parseKB(s string) int {
	s = strings.TrimSpace(s)
	parts := strings.Fields(s)
	if len(parts) == 0 {
		return 0
	}
	v, _ := strconv.Atoi(parts[0])
	return v
}

// parseMB parses "32768 MB" → 32768
func parseMB(s string) int {
	return parseKB(s)
}

// parseMTs parses "4800 MT/s" → 4800 (treated as MHz equivalent)
func parseMTs(s string) int {
	return parseKB(s)
}

// parseCapacityToGB parses "3.20 TB" or "480 GB" → GB integer
func parseCapacityToGB(s string) int {
	s = strings.TrimSpace(s)
	parts := strings.Fields(s)
	if len(parts) < 2 {
		return 0
	}
	v, err := strconv.ParseFloat(parts[0], 64)
	if err != nil {
		return 0
	}
	switch strings.ToUpper(parts[1]) {
	case "TB":
		return int(v * 1000)
	case "GB":
		return int(v)
	case "MB":
		return int(v / 1024)
	}
	return int(v)
}

// rawJSONToInt parses a json.RawMessage that may be an int or a quoted string → int
func rawJSONToInt(raw json.RawMessage) int {
	if len(raw) == 0 {
		return 0
	}
	// try direct int
	var n int
	if err := json.Unmarshal(raw, &n); err == nil {
		return n
	}
	// try string
	var s string
	if err := json.Unmarshal(raw, &s); err == nil {
		v, _ := strconv.Atoi(strings.TrimSpace(s))
		return v
	}
	return 0
}

// parseHexID parses "0x15b3" → 5555
func parseHexID(s string) int {
	s = strings.TrimSpace(strings.ToLower(s))
	s = strings.TrimPrefix(s, "0x")
	v, _ := strconv.ParseInt(s, 16, 32)
	return int(v)
}

258  internal/parser/vendors/lenovo_xcc/parser_test.go (vendored, new file)
@@ -0,0 +1,258 @@
package lenovo_xcc

import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

const exampleArchive = "/Users/mchusavitin/Documents/git/logpile/example/7D76CTO1WW_JF0002KT_xcc_mini-log_20260413-122150.zip"

func TestDetect_LenovoXCCMiniLog(t *testing.T) {
	files, err := parser.ExtractArchive(exampleArchive)
	if err != nil {
		t.Skipf("example archive not available: %v", err)
	}

	p := &Parser{}
	score := p.Detect(files)
	if score < 80 {
		t.Errorf("expected Detect score >= 80 for XCC mini-log archive, got %d", score)
	}
}

func TestParse_LenovoXCCMiniLog_BasicSysInfo(t *testing.T) {
	files, err := parser.ExtractArchive(exampleArchive)
	if err != nil {
		t.Skipf("example archive not available: %v", err)
	}

	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse returned error: %v", err)
	}
	if result == nil || result.Hardware == nil {
		t.Fatal("Parse returned nil result or hardware")
	}

	hw := result.Hardware
	if hw.BoardInfo.SerialNumber == "" {
		t.Error("BoardInfo.SerialNumber is empty")
	}
	if hw.BoardInfo.ProductName == "" {
		t.Error("BoardInfo.ProductName is empty")
	}
	t.Logf("BoardInfo: serial=%s model=%s uuid=%s", hw.BoardInfo.SerialNumber, hw.BoardInfo.ProductName, hw.BoardInfo.UUID)
}

func TestParse_LenovoXCCMiniLog_CPUs(t *testing.T) {
	files, err := parser.ExtractArchive(exampleArchive)
	if err != nil {
		t.Skipf("example archive not available: %v", err)
	}

	p := &Parser{}
	result, _ := p.Parse(files)
	if result == nil || result.Hardware == nil {
		t.Fatal("Parse returned nil")
	}

	if len(result.Hardware.CPUs) == 0 {
		t.Error("expected at least one CPU, got none")
	}
	for i, cpu := range result.Hardware.CPUs {
		t.Logf("CPU[%d]: socket=%d model=%q cores=%d threads=%d freq=%dMHz", i, cpu.Socket, cpu.Model, cpu.Cores, cpu.Threads, cpu.FrequencyMHz)
	}
}

func TestParse_LenovoXCCMiniLog_Memory(t *testing.T) {
	files, err := parser.ExtractArchive(exampleArchive)
	if err != nil {
		t.Skipf("example archive not available: %v", err)
	}

	p := &Parser{}
	result, _ := p.Parse(files)
	if result == nil || result.Hardware == nil {
		t.Fatal("Parse returned nil")
	}

	if len(result.Hardware.Memory) == 0 {
		t.Error("expected memory DIMMs, got none")
	}
	t.Logf("Memory: %d DIMMs", len(result.Hardware.Memory))
	for i, m := range result.Hardware.Memory {
		t.Logf("DIMM[%d]: slot=%s present=%v size=%dMB sn=%s", i, m.Slot, m.Present, m.SizeMB, m.SerialNumber)
	}
}

func TestParse_LenovoXCCMiniLog_Storage(t *testing.T) {
	files, err := parser.ExtractArchive(exampleArchive)
	if err != nil {
		t.Skipf("example archive not available: %v", err)
	}

	p := &Parser{}
	result, _ := p.Parse(files)
	if result == nil || result.Hardware == nil {
		t.Fatal("Parse returned nil")
	}

	t.Logf("Storage: %d disks", len(result.Hardware.Storage))
	for i, s := range result.Hardware.Storage {
		t.Logf("Disk[%d]: slot=%s model=%q size=%dGB sn=%s", i, s.Slot, s.Model, s.SizeGB, s.SerialNumber)
	}
}

func TestParse_LenovoXCCMiniLog_PCIeCards(t *testing.T) {
	files, err := parser.ExtractArchive(exampleArchive)
	if err != nil {
		t.Skipf("example archive not available: %v", err)
	}

	p := &Parser{}
	result, _ := p.Parse(files)
	if result == nil || result.Hardware == nil {
		t.Fatal("Parse returned nil")
	}

	t.Logf("PCIe cards: %d", len(result.Hardware.PCIeDevices))
	for i, c := range result.Hardware.PCIeDevices {
		t.Logf("Card[%d]: slot=%s desc=%q bdf=%s", i, c.Slot, c.Description, c.BDF)
	}
}

func TestParse_LenovoXCCMiniLog_PSUs(t *testing.T) {
	files, err := parser.ExtractArchive(exampleArchive)
	if err != nil {
		t.Skipf("example archive not available: %v", err)
	}

	p := &Parser{}
	result, _ := p.Parse(files)
	if result == nil || result.Hardware == nil {
		t.Fatal("Parse returned nil")
	}

	if len(result.Hardware.PowerSupply) == 0 {
		t.Error("expected PSUs, got none")
	}
	for i, p := range result.Hardware.PowerSupply {
		t.Logf("PSU[%d]: slot=%s wattage=%dW status=%s sn=%s", i, p.Slot, p.WattageW, p.Status, p.SerialNumber)
	}
}

func TestParse_LenovoXCCMiniLog_Sensors(t *testing.T) {
	files, err := parser.ExtractArchive(exampleArchive)
	if err != nil {
		t.Skipf("example archive not available: %v", err)
	}

	p := &Parser{}
	result, _ := p.Parse(files)
	if result == nil {
		t.Fatal("Parse returned nil")
	}

	if len(result.Sensors) == 0 {
		t.Error("expected sensors, got none")
	}
	t.Logf("Sensors: %d", len(result.Sensors))
}

func TestParse_LenovoXCCMiniLog_Events(t *testing.T) {
	files, err := parser.ExtractArchive(exampleArchive)
	if err != nil {
		t.Skipf("example archive not available: %v", err)
	}

	p := &Parser{}
	result, _ := p.Parse(files)
	if result == nil {
		t.Fatal("Parse returned nil")
	}

	if len(result.Events) == 0 {
		t.Error("expected events, got none")
	}
	t.Logf("Events: %d", len(result.Events))
	for i, e := range result.Events {
		if i >= 5 {
			break
		}
		t.Logf("Event[%d]: severity=%s ts=%s desc=%q", i, e.Severity, e.Timestamp.Format("2006-01-02T15:04:05"), e.Description)
	}
}

func TestParse_LenovoXCCMiniLog_FRU(t *testing.T) {
	files, err := parser.ExtractArchive(exampleArchive)
	if err != nil {
		t.Skipf("example archive not available: %v", err)
	}

	p := &Parser{}
	result, _ := p.Parse(files)
	if result == nil {
		t.Fatal("Parse returned nil")
	}

	t.Logf("FRU: %d entries", len(result.FRU))
	for i, f := range result.FRU {
		t.Logf("FRU[%d]: desc=%q product=%q serial=%q", i, f.Description, f.ProductName, f.SerialNumber)
	}
}

func TestParse_LenovoXCCMiniLog_Firmware(t *testing.T) {
	files, err := parser.ExtractArchive(exampleArchive)
	if err != nil {
		t.Skipf("example archive not available: %v", err)
	}

	p := &Parser{}
	result, _ := p.Parse(files)
	if result == nil || result.Hardware == nil {
		t.Fatal("Parse returned nil")
	}

	if len(result.Hardware.Firmware) == 0 {
		t.Error("expected firmware entries, got none")
	}
	for i, f := range result.Hardware.Firmware {
		t.Logf("FW[%d]: name=%q version=%q buildtime=%q", i, f.DeviceName, f.Version, f.BuildTime)
	}
}

func TestParseDIMMs_UnqualifiedDIMMAddsWarningEvent(t *testing.T) {
	content := []byte(`{
		"items": [{
			"memory": [{
				"memory_name": "DIMM A1",
				"memory_status": "Unqualified DIMM",
				"memory_type": "DDR5",
				"memory_capacity": 32
			}]
		}]
	}`)

	memory, events := parseDIMMs(content)
	if len(memory) != 1 {
|
||||
t.Fatalf("expected 1 DIMM, got %d", len(memory))
|
||||
}
|
||||
if len(events) != 1 {
|
||||
t.Fatalf("expected 1 warning event, got %d", len(events))
|
||||
}
|
||||
if events[0].Severity != models.SeverityWarning {
|
||||
t.Fatalf("expected warning severity, got %q", events[0].Severity)
|
||||
}
|
||||
if events[0].SensorName != "DIMM A1" {
|
||||
t.Fatalf("unexpected sensor name: %q", events[0].SensorName)
|
||||
}
|
||||
}
|
||||
|
||||
func TestSeverity_UnqualifiedDIMMMessageBecomesWarning(t *testing.T) {
|
||||
if got := xccSeverity("I", "System found Unqualified DIMM in slot DIMM A1"); got != models.SeverityWarning {
|
||||
t.Fatalf("expected warning severity, got %q", got)
|
||||
}
|
||||
}
|
||||
4	internal/parser/vendors/vendors.go vendored
@@ -5,12 +5,16 @@ package vendors
import (
	// Import vendor modules to trigger their init() registration
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/dell"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/easy_bee"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/h3c"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/hpe_ilo_ahs"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/nvidia"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/nvidia_bug_report"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/unraid"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/xfusion"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/xigmanas"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/lenovo_xcc"

	// Generic fallback parser (must be last for lowest priority)
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/generic"
1081	internal/parser/vendors/xfusion/hardware.go vendored Normal file
File diff suppressed because it is too large
157	internal/parser/vendors/xfusion/parser.go vendored Normal file
@@ -0,0 +1,157 @@
// Package xfusion provides a parser for xFusion iBMC diagnostic dump archives.
// Tested with: xFusion G5500 V7 iBMC dump (tar.gz format, exported via iBMC UI)
//
// Archive structure: dump_info/AppDump/... and dump_info/LogDump/...
//
// IMPORTANT: Increment parserVersion when modifying parser logic!
package xfusion

import (
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

const parserVersion = "1.1"

func init() {
	parser.Register(&Parser{})
}

// Parser implements VendorParser for xFusion iBMC dump archives.
type Parser struct{}

func (p *Parser) Name() string    { return "xFusion iBMC Dump Parser" }
func (p *Parser) Vendor() string  { return "xfusion" }
func (p *Parser) Version() string { return parserVersion }

// Detect checks whether files match the xFusion iBMC dump format.
// Returns a confidence score 0-100.
func (p *Parser) Detect(files []parser.ExtractedFile) int {
	confidence := 0
	for _, f := range files {
		path := strings.ToLower(f.Path)
		switch {
		case strings.Contains(path, "appdump/frudata/fruinfo.txt"):
			confidence += 50
		case strings.Contains(path, "rtosdump/versioninfo/app_revision.txt"):
			confidence += 30
		case strings.Contains(path, "appdump/sensor_alarm/sensor_info.txt"):
			confidence += 10
		case strings.Contains(path, "appdump/card_manage/card_info"):
			confidence += 20
		case strings.Contains(path, "logdump/netcard/netcard_info.txt"):
			confidence += 20
		}
		if confidence >= 100 {
			return 100
		}
	}
	return confidence
}

// Parse parses an xFusion iBMC dump and returns an analysis result.
func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, error) {
	result := &models.AnalysisResult{
		Events:  make([]models.Event, 0),
		FRU:     make([]models.FRUInfo, 0),
		Sensors: make([]models.SensorReading, 0),
		Hardware: &models.HardwareConfig{
			Firmware:        make([]models.FirmwareInfo, 0),
			Devices:         make([]models.HardwareDevice, 0),
			CPUs:            make([]models.CPU, 0),
			Memory:          make([]models.MemoryDIMM, 0),
			Storage:         make([]models.Storage, 0),
			Volumes:         make([]models.StorageVolume, 0),
			PCIeDevices:     make([]models.PCIeDevice, 0),
			GPUs:            make([]models.GPU, 0),
			NetworkCards:    make([]models.NIC, 0),
			NetworkAdapters: make([]models.NetworkAdapter, 0),
			PowerSupply:     make([]models.PSU, 0),
		},
	}

	if f := findByAnyPath(files, "appdump/frudata/fruinfo.txt", "rtosdump/versioninfo/fruinfo.txt"); f != nil {
		parseFRUInfo(f.Content, result)
	}
	if f := findByPath(files, "appdump/sensor_alarm/sensor_info.txt"); f != nil {
		result.Sensors = parseSensorInfo(f.Content)
	}
	if f := findByPath(files, "appdump/cpumem/cpu_info"); f != nil {
		result.Hardware.CPUs = parseCPUInfo(f.Content)
	}
	if f := findByPath(files, "appdump/cpumem/mem_info"); f != nil {
		result.Hardware.Memory = parseMemInfo(f.Content)
	}
	var nicCards []xfusionNICCard
	if f := findByPath(files, "appdump/card_manage/card_info"); f != nil {
		gpus, cards := parseCardInfo(f.Content)
		result.Hardware.GPUs = gpus
		nicCards = cards
	}
	if f := findByPath(files, "logdump/netcard/netcard_info.txt"); f != nil || len(nicCards) > 0 {
		var content []byte
		if f != nil {
			content = f.Content
		}
		adapters, legacyNICs := mergeNetworkAdapters(nicCards, parseNetcardInfo(content))
		result.Hardware.NetworkAdapters = adapters
		result.Hardware.NetworkCards = legacyNICs
	}
	if f := findByPath(files, "appdump/bmc/psu_info.txt"); f != nil {
		result.Hardware.PowerSupply = parsePSUInfo(f.Content)
	}
	if f := findByPath(files, "appdump/storagemgnt/raid_controller_info.txt"); f != nil {
		parseStorageControllerInfo(f.Content, result)
	}
	if f := findByPath(files, "rtosdump/versioninfo/app_revision.txt"); f != nil {
		parseAppRevision(f.Content, result)
	}
	for _, f := range findDiskInfoFiles(files) {
		disk := parseDiskInfo(f.Content)
		if disk != nil {
			result.Hardware.Storage = append(result.Hardware.Storage, *disk)
		}
	}
	if f := findByPath(files, "logdump/maintenance_log"); f != nil {
		result.Events = parseMaintenanceLog(f.Content)
	}

	result.Protocol = "ipmi"
	result.SourceType = models.SourceTypeArchive
	parser.ApplyManufacturedYearWeekFromFRU(result.FRU, result.Hardware)

	return result, nil
}

// findByPath returns the first file whose lowercased path contains the given substring.
func findByPath(files []parser.ExtractedFile, substring string) *parser.ExtractedFile {
	for i := range files {
		if strings.Contains(strings.ToLower(files[i].Path), substring) {
			return &files[i]
		}
	}
	return nil
}

// findByAnyPath returns the first file matching any of the given substrings, tried in order.
func findByAnyPath(files []parser.ExtractedFile, substrings ...string) *parser.ExtractedFile {
	for _, substring := range substrings {
		if f := findByPath(files, substring); f != nil {
			return f
		}
	}
	return nil
}

// findDiskInfoFiles returns all PhysicalDrivesInfo disk_info files.
func findDiskInfoFiles(files []parser.ExtractedFile) []parser.ExtractedFile {
	var out []parser.ExtractedFile
	for _, f := range files {
		path := strings.ToLower(f.Path)
		if strings.Contains(path, "physicaldrivesinfo/") && strings.HasSuffix(path, "/disk_info") {
			out = append(out, f)
		}
	}
	return out
}
332	internal/parser/vendors/xfusion/parser_test.go vendored Normal file
@@ -0,0 +1,332 @@
package xfusion

import (
	"strings"
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

// loadTestArchive extracts the given archive path for use in tests.
// Skips the test if the file is not found (CI environments without testdata).
func loadTestArchive(t *testing.T, path string) []parser.ExtractedFile {
	t.Helper()
	files, err := parser.ExtractArchive(path)
	if err != nil {
		t.Skipf("cannot load test archive %s: %v", path, err)
	}
	return files
}

func TestDetect_G5500V7(t *testing.T) {
	files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
	p := &Parser{}
	score := p.Detect(files)
	if score < 80 {
		t.Fatalf("expected Detect score >= 80, got %d", score)
	}
}

func TestDetect_ServerFileExportMarkers(t *testing.T) {
	p := &Parser{}
	score := p.Detect([]parser.ExtractedFile{
		{Path: "dump_info/RTOSDump/versioninfo/app_revision.txt", Content: []byte("Product Name: G5500 V7")},
		{Path: "dump_info/LogDump/netcard/netcard_info.txt", Content: []byte("2026-02-04 03:54:06 UTC")},
		{Path: "dump_info/AppDump/card_manage/card_info", Content: []byte("OCP Card Info")},
	})
	if score < 70 {
		t.Fatalf("expected Detect score >= 70 for xFusion file export markers, got %d", score)
	}
}

func TestDetect_Negative(t *testing.T) {
	p := &Parser{}
	score := p.Detect([]parser.ExtractedFile{
		{Path: "logs/messages.txt", Content: []byte("plain text")},
		{Path: "inventory.json", Content: []byte(`{"vendor":"other"}`)},
	})
	if score != 0 {
		t.Fatalf("expected Detect score 0 for non-xFusion input, got %d", score)
	}
}

func TestParse_G5500V7_BoardInfo(t *testing.T) {
	files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse: %v", err)
	}
	if result.Hardware == nil {
		t.Fatal("Hardware is nil")
	}
	board := result.Hardware.BoardInfo
	if board.SerialNumber != "210619KUGGXGS2000015" {
		t.Errorf("BoardInfo.SerialNumber = %q, want 210619KUGGXGS2000015", board.SerialNumber)
	}
	if board.ProductName != "G5500 V7" {
		t.Errorf("BoardInfo.ProductName = %q, want G5500 V7", board.ProductName)
	}
}

func TestParse_G5500V7_CPUs(t *testing.T) {
	files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse: %v", err)
	}
	if len(result.Hardware.CPUs) != 2 {
		t.Fatalf("expected 2 CPUs, got %d", len(result.Hardware.CPUs))
	}
	cpu1 := result.Hardware.CPUs[0]
	if cpu1.Cores != 32 {
		t.Errorf("CPU1 cores = %d, want 32", cpu1.Cores)
	}
	if cpu1.Threads != 64 {
		t.Errorf("CPU1 threads = %d, want 64", cpu1.Threads)
	}
	if cpu1.SerialNumber == "" {
		t.Error("CPU1 SerialNumber is empty")
	}
}

func TestParse_G5500V7_Memory(t *testing.T) {
	files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse: %v", err)
	}
	// Only 2 DIMMs are populated (the rest are "NO DIMM")
	if len(result.Hardware.Memory) != 2 {
		t.Fatalf("expected 2 populated DIMMs, got %d", len(result.Hardware.Memory))
	}
	dimm := result.Hardware.Memory[0]
	if dimm.SizeMB != 65536 {
		t.Errorf("DIMM0 SizeMB = %d, want 65536", dimm.SizeMB)
	}
	if dimm.Type != "DDR5" {
		t.Errorf("DIMM0 Type = %q, want DDR5", dimm.Type)
	}
}

func TestParse_G5500V7_GPUs(t *testing.T) {
	files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse: %v", err)
	}
	if len(result.Hardware.GPUs) != 8 {
		t.Fatalf("expected 8 GPUs, got %d", len(result.Hardware.GPUs))
	}
	for _, gpu := range result.Hardware.GPUs {
		if gpu.SerialNumber == "" {
			t.Errorf("GPU slot %s has empty SerialNumber", gpu.Slot)
		}
		if gpu.Model == "" {
			t.Errorf("GPU slot %s has empty Model", gpu.Slot)
		}
		if gpu.Firmware == "" {
			t.Errorf("GPU slot %s has empty Firmware", gpu.Slot)
		}
	}
}

func TestParse_G5500V7_NICs(t *testing.T) {
	files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse: %v", err)
	}
	if len(result.Hardware.NetworkCards) < 1 {
		t.Fatal("expected at least 1 NIC (OCP CX6), got 0")
	}
	nic := result.Hardware.NetworkCards[0]
	if nic.SerialNumber == "" {
		t.Error("NIC SerialNumber is empty")
	}
}

func TestParse_ServerFileExport_NetworkAdaptersAndFirmware(t *testing.T) {
	p := &Parser{}
	files := []parser.ExtractedFile{
		{
			Path: "dump_info/AppDump/card_manage/card_info",
			Content: []byte(strings.TrimSpace(`
Pcie Card Info
Slot | Vender Id | Device Id | Sub Vender Id | Sub Device Id | Segment Number | Bus Number | Device Number | Function Number | Card Desc | Board Id | PCB Version | CPLD Version | Sub Card Bom Id | PartNum | SerialNumber | OriginalPartNum
1 | 0x15b3 | 0x101f | 0x1f24 | 0x2011 | 0x00 | 0x27 | 0x00 | 0x00 | MT2894 Family [ConnectX-6 Lx] | N/A | N/A | N/A | N/A | 0302Y238 | 02Y238X6RC000058 |

OCP Card Info
Slot | Vender Id | Device Id | Sub Vender Id | Sub Device Id | Segment Number | Bus Number | Device Number | Function Number | Card Desc | Board Id | PCB Version | CPLD Version | Sub Card Bom Id | PartNum | SerialNumber | OriginalPartNum
1 | 0x15b3 | 0x101f | 0x1f24 | 0x2011 | 0x00 | 0x27 | 0x00 | 0x00 | MT2894 Family [ConnectX-6 Lx] | N/A | N/A | N/A | N/A | 0302Y238 | 02Y238X6RC000058 |
`)),
		},
		{
			Path: "dump_info/LogDump/netcard/netcard_info.txt",
			Content: []byte(strings.TrimSpace(`
2026-02-04 03:54:06 UTC
ProductName :XC385
Manufacture :XFUSION
FirmwareVersion :26.39.2048
SlotId :1
Port0 BDF:0000:27:00.0
MacAddr:44:1A:4C:16:E8:03
ActualMac:44:1A:4C:16:E8:03
Port1 BDF:0000:27:00.1
MacAddr:00:00:00:00:00:00
ActualMac:44:1A:4C:16:E8:04
`)),
		},
		{
			Path: "dump_info/RTOSDump/versioninfo/app_revision.txt",
			Content: []byte(strings.TrimSpace(`
------------------- iBMC INFO -------------------
Active iBMC Version: (U68)3.08.05.85
Active iBMC Built: 16:46:26 Jan 4 2026
SDK Version: 13.16.30.16
SDK Built: 07:55:18 Dec 12 2025
Active BIOS Version: (U6216)01.02.08.17
Active BIOS Built: 00:00:00 Jan 05 2026
Product Name: G5500 V7
`)),
		},
	}

	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse: %v", err)
	}
	if result.Protocol != "ipmi" || result.SourceType != models.SourceTypeArchive {
		t.Fatalf("unexpected source metadata: protocol=%q source_type=%q", result.Protocol, result.SourceType)
	}
	if result.Hardware == nil {
		t.Fatal("Hardware is nil")
	}
	if len(result.Hardware.NetworkAdapters) != 1 {
		t.Fatalf("expected 1 network adapter, got %d", len(result.Hardware.NetworkAdapters))
	}
	adapter := result.Hardware.NetworkAdapters[0]
	if adapter.BDF != "0000:27:00.0" {
		t.Fatalf("expected network adapter BDF 0000:27:00.0, got %q", adapter.BDF)
	}
	if adapter.Firmware != "26.39.2048" {
		t.Fatalf("expected network adapter firmware 26.39.2048, got %q", adapter.Firmware)
	}
	if adapter.SerialNumber != "02Y238X6RC000058" {
		t.Fatalf("expected network adapter serial from card_info, got %q", adapter.SerialNumber)
	}
	if len(adapter.MACAddresses) != 2 || adapter.MACAddresses[0] != "44:1A:4C:16:E8:03" || adapter.MACAddresses[1] != "44:1A:4C:16:E8:04" {
		t.Fatalf("unexpected MAC addresses: %#v", adapter.MACAddresses)
	}

	fwByDevice := make(map[string]models.FirmwareInfo)
	for _, fw := range result.Hardware.Firmware {
		fwByDevice[fw.DeviceName] = fw
	}
	if fwByDevice["iBMC"].Version != "(U68)3.08.05.85" {
		t.Fatalf("expected iBMC firmware from app_revision.txt, got %#v", fwByDevice["iBMC"])
	}
	if fwByDevice["BIOS"].Version != "(U6216)01.02.08.17" {
		t.Fatalf("expected BIOS firmware from app_revision.txt, got %#v", fwByDevice["BIOS"])
	}
	if result.Hardware.BoardInfo.ProductName != "G5500 V7" {
		t.Fatalf("expected board product fallback from app_revision.txt, got %q", result.Hardware.BoardInfo.ProductName)
	}
}

func TestParse_G5500V7_PSUs(t *testing.T) {
	files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse: %v", err)
	}
	if len(result.Hardware.PowerSupply) != 4 {
		t.Fatalf("expected 4 PSUs, got %d", len(result.Hardware.PowerSupply))
	}
	for _, psu := range result.Hardware.PowerSupply {
		if psu.WattageW != 3000 {
			t.Errorf("PSU slot %s wattage = %d, want 3000", psu.Slot, psu.WattageW)
		}
		if psu.SerialNumber == "" {
			t.Errorf("PSU slot %s has empty SerialNumber", psu.Slot)
		}
	}
}

func TestParse_G5500V7_Storage(t *testing.T) {
	files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse: %v", err)
	}
	if len(result.Hardware.Storage) != 2 {
		t.Fatalf("expected 2 storage devices, got %d", len(result.Hardware.Storage))
	}
	for _, disk := range result.Hardware.Storage {
		if disk.SerialNumber == "" {
			t.Errorf("disk slot %s has empty SerialNumber", disk.Slot)
		}
		if disk.Model == "" {
			t.Errorf("disk slot %s has empty Model", disk.Slot)
		}
	}
}

func TestParse_G5500V7_Sensors(t *testing.T) {
	files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse: %v", err)
	}
	if len(result.Sensors) < 20 {
		t.Fatalf("expected at least 20 sensors, got %d", len(result.Sensors))
	}
}

func TestParse_G5500V7_Events(t *testing.T) {
	files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse: %v", err)
	}
	if len(result.Events) < 5 {
		t.Fatalf("expected at least 5 events, got %d", len(result.Events))
	}
	// All events should have real timestamps (not epoch 0)
	for _, ev := range result.Events {
		if ev.Timestamp.Year() <= 1970 {
			t.Errorf("event has epoch timestamp: %v %s", ev.Timestamp, ev.Description)
		}
	}
}

func TestParse_G5500V7_FRU(t *testing.T) {
	files := loadTestArchive(t, "../../../../example/G5500V7_210619KUGGXGS2000015_20260318-1128.tar.gz")
	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse: %v", err)
	}
	if len(result.FRU) < 3 {
		t.Fatalf("expected at least 3 FRU entries, got %d", len(result.FRU))
	}
	// Check mainboard FRU serial
	found := false
	for _, f := range result.FRU {
		if f.SerialNumber == "210619KUGGXGS2000015" {
			found = true
		}
	}
	if !found {
		t.Error("mainboard serial 210619KUGGXGS2000015 not found in FRU")
	}
}
@@ -44,6 +44,9 @@ func TestParserParseExample(t *testing.T) {
	examplePath := filepath.Join("..", "..", "..", "..", "example", "xigmanas.txt")
	raw, err := os.ReadFile(examplePath)
	if err != nil {
		if os.IsNotExist(err) {
			t.Skipf("example file %s not present", examplePath)
		}
		t.Fatalf("read example file: %v", err)
	}
69	internal/server/chart_view_test.go Normal file
@@ -0,0 +1,69 @@
package server

import (
	"net/http"
	"net/http/httptest"
	"strings"
	"testing"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestHandleChartCurrent_RendersCurrentReanimatorSnapshot(t *testing.T) {
	s := New(Config{})
	s.SetResult(&models.AnalysisResult{
		SourceType:  models.SourceTypeArchive,
		Filename:    "example.zip",
		CollectedAt: time.Date(2026, 3, 16, 10, 0, 0, 0, time.UTC),
		Hardware: &models.HardwareConfig{
			BoardInfo: models.BoardInfo{
				ProductName:  "SYS-TEST",
				SerialNumber: "SN123",
			},
			CPUs: []models.CPU{
				{
					Socket: 1,
					Model:  "Xeon Gold",
					Cores:  32,
				},
			},
		},
	})

	req := httptest.NewRequest(http.MethodGet, "/chart/current", nil)
	rec := httptest.NewRecorder()

	s.mux.ServeHTTP(rec, req)

	if rec.Code != http.StatusOK {
		t.Fatalf("expected 200, got %d", rec.Code)
	}
	body := rec.Body.String()
	if !strings.Contains(body, "SYS-TEST - SN123") {
		t.Fatalf("expected chart title in body, got %q", body)
	}
	if !strings.Contains(body, `/chart/static/view.css`) {
		t.Fatalf("expected rewritten chart static path, got %q", body)
	}
	if !strings.Contains(body, "Snapshot Metadata") {
		t.Fatalf("expected rendered chart output, got %q", body)
	}
}

func TestHandleChartCurrent_RendersEmptyViewerWithoutResult(t *testing.T) {
	s := New(Config{})

	req := httptest.NewRequest(http.MethodGet, "/chart/current", nil)
	rec := httptest.NewRecorder()

	s.mux.ServeHTTP(rec, req)

	if rec.Code != http.StatusOK {
		t.Fatalf("expected 200, got %d", rec.Code)
	}
	body := rec.Body.String()
	if !strings.Contains(body, "Snapshot Viewer") {
		t.Fatalf("expected empty chart viewer, got %q", body)
	}
}
Some files were not shown because too many files have changed in this diff