# Compare commits: `7d9135dc63 ... v1.8.0`

63 commits in this range:

```text
8d80048117 21ea129933 9c5512d238 206496efae 7d1a02cb72 070971685f 78806f9fa0
4940cd9645 736b77f055 0252264ddc 25e3b8bb42 bb4505a249 2fa4a1235a fe5da1dbd7
612058ed16 e0146adfff 9a30705c9a 8dbbec3610 4c60ebbf1d c52fea2fec dae4744eb3
b6ff47fea8 1d282c4196 f35cabac48 a2c9e9a57f b918363252 6c19a58b24 9aadf2f1e9
ddab93a5ee 000199fbdc 68592da9f5 b1dde592ae 693b7346ab a4a1a19a94 66fb90233f
7a1285db99 144d298efa a6c90b6e77 2e348751f3 15dc86a0e4 752b063613 6f66a8b2a1
ce30f943df 810c4b5ff9 5d9e9d73de 38cc051f23 fcd57c1ba9 82ee513835 de5521a4e5
a82b55b144 758fa66282 b33cca5fcc 514da76ddb c13788132b 5e49adaf05 c7b2a7ab29
0af3cee9b6 8715fcace4 1b1bc74fc7 77e25ddc02 bcce975fd6 8b065c6cca aa22034944
```
## .gitignore (vendored, 7 lines changed)

```text
@@ -62,3 +62,10 @@ go.work.sum
# Distribution binaries
dist/

# Release artifacts
release/
releases/
releases/**/SHA256SUMS.txt
releases/**/*.tar.gz
releases/**/*.zip
```
## .gitmodules (vendored, new file, 6 lines)

```ini
[submodule "third_party/pciids"]
	path = third_party/pciids
	url = https://github.com/pciutils/pciids.git
[submodule "bible"]
	path = bible
	url = https://git.mchus.pro/mchus/bible.git
```
## AGENTS.md (new file, 11 lines)

```markdown
# LOGPile — Instructions for Codex

## Shared Engineering Rules
Read `bible/` — shared rules for all projects (CSV, logging, DB, tables, background tasks, code style).
Start with `bible/rules/patterns/` for specific contracts.

## Project Architecture
Read `bible-local/` — LOGPile specific architecture.
Read order: `bible-local/README.md` → `01-overview.md` → relevant files for the task.

Every architectural decision specific to this project must be recorded in `bible-local/10-decisions.md`.
```
## CLAUDE.md (100 lines changed: 95 removed, 11 added)

```diff
@@ -1,95 +1,11 @@
-# LOGPile - Engineering Notes (for Claude/Codex)
+# LOGPile — Instructions for Claude
 
-## Project summary
+## Shared Engineering Rules
+Read `bible/` — shared rules for all projects (CSV, logging, DB, tables, background tasks, code style).
+Start with `bible/rules/patterns/` for specific contracts.
 
-LOGPile is a standalone Go app for BMC diagnostics analysis with embedded web UI.
+## Project Architecture
+Read `bible-local/` — LOGPile specific architecture.
+Read order: `bible-local/README.md` → `01-overview.md` → relevant files for the task.
 
-Current product modes:
-1. Upload and parse vendor archives / JSON snapshots.
-2. Collect live data via Redfish and analyze/export it.
-
-## Runtime architecture
-
-- Go + `net/http` (`http.ServeMux`)
-- Embedded UI (`web/embed.go`, `//go:embed templates static`)
-- In-memory state (`Server.result`, `Server.detectedVendor`)
-- Job manager for live collect status/logs
-
-Default port: `8082`.
-
-## Key flows
-
-### Upload flow (`POST /api/upload`)
-- Accepts multipart file field `archive`.
-- If file looks like JSON, parsed as `models.AnalysisResult` snapshot.
-- Otherwise passed to archive parser (`parser.NewBMCParser().ParseFromReader(...)`).
-- Result stored in memory and exposed by API/UI.
-
-### Live flow (`POST /api/collect`)
-- Validates request (`host/protocol/port/username/auth_type/tls_mode`).
-- Runs collector asynchronously with progress callback.
-- On success:
-  - source metadata set (`source_type=api`, protocol/host/date),
-  - result becomes current in-memory dataset.
-- On failed/canceled previous dataset stays unchanged.
-
-## Collectors
-
-Registry: `internal/collector/registry.go`
-
-- `redfish` (real collector):
-  - dynamic discovery of Systems/Chassis/Managers,
-  - CPU/RAM/Storage/GPU/PSU/NIC/PCIe/Firmware mapping,
-  - raw Redfish snapshot (`result.RawPayloads["redfish_tree"]`) for offline future analysis,
-  - progress logs include active collection stage and snapshot progress.
-- `ipmi` is currently a mock collector scaffold.
-
-## Export behavior
-
-Endpoints:
-- `/api/export/csv`
-- `/api/export/json`
-- `/api/export/txt`
-
-Filename pattern for all exports:
-`YYYY-MM-DD (SERVER MODEL) - SERVER SN.<ext>`
-
-Notes:
-- JSON export contains full `AnalysisResult`, including `raw_payloads`.
-- TXT export is tabular and mirrors UI sections (no raw JSON section).
-
-## CLI flags (`cmd/logpile/main.go`)
-
-- `--port`
-- `--file` (reserved/preload, not active workflow)
-- `--version`
-- `--no-browser`
-- `--hold-on-crash` (default true on Windows) — keeps console open on fatal crash for debugging.
-
-## Build / release
-
-- `make build` -> single local binary (`CGO_ENABLED=0`).
-- `make build-all` -> cross-platform binaries.
-- Tags/releases are published with `tea`.
-- Release notes live in `docs/releases/<tag>.md`.
-
-## Testing expectations
-
-Before merge:
-
-```bash
-go test ./...
-```
-
-If touching collectors/handlers, prefer adding or updating tests in:
-- `internal/collector/*_test.go`
-- `internal/server/*_test.go`
-
-## Practical coding guidance
-
-- Keep API contracts stable with frontend (`web/static/js/app.js`).
-- When adding Redfish mappings, prefer tolerant/fallback parsing:
-  - alternate collection paths,
-  - `@odata.id` references and embedded members,
-  - deduping by serial/BDF/slot+model.
-- Avoid breaking snapshot backward compatibility (`AnalysisResult` JSON shape).
+
+Every architectural decision specific to this project must be recorded in `bible-local/10-decisions.md`.
```
## Makefile (7 lines changed)

```diff
@@ -1,4 +1,4 @@
-.PHONY: build run clean test build-all
+.PHONY: build run clean test build-all update-pci-ids
 
 BINARY_NAME=logpile
 VERSION=$(shell git describe --tags --always --dirty 2>/dev/null || echo "dev")
@@ -6,6 +6,7 @@ COMMIT=$(shell git rev-parse --short HEAD 2>/dev/null || echo "none")
 LDFLAGS=-ldflags "-X main.version=$(VERSION) -X main.commit=$(COMMIT)"
 
 build:
+	@if [ "$(SKIP_PCI_IDS_UPDATE)" != "1" ]; then ./scripts/update-pci-ids.sh --best-effort; fi
 	CGO_ENABLED=0 go build $(LDFLAGS) -o bin/$(BINARY_NAME) ./cmd/logpile
 
 run: build
@@ -19,6 +20,7 @@ test:
 
 # Cross-platform builds
 build-all: clean
+	@if [ "$(SKIP_PCI_IDS_UPDATE)" != "1" ]; then ./scripts/update-pci-ids.sh --best-effort; fi
 	CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build $(LDFLAGS) -o bin/$(BINARY_NAME)-linux-amd64 ./cmd/logpile
 	CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build $(LDFLAGS) -o bin/$(BINARY_NAME)-linux-arm64 ./cmd/logpile
 	CGO_ENABLED=0 GOOS=darwin GOARCH=amd64 go build $(LDFLAGS) -o bin/$(BINARY_NAME)-darwin-amd64 ./cmd/logpile
@@ -33,3 +35,6 @@ fmt:
 
 lint:
 	golangci-lint run
+
+update-pci-ids:
+	./scripts/update-pci-ids.sh --sync-submodule
```
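The `-X main.version=...` / `-X main.commit=...` flags in `LDFLAGS` only take effect if matching package-level string variables exist in `main`. A minimal sketch of the receiving side (the variable names follow the flags above; the `versionString` helper is illustrative, not LOGPile's actual code):

```go
package main

import "fmt"

// These defaults are overwritten at link time by
// -ldflags "-X main.version=... -X main.commit=...".
var (
	version = "dev"
	commit  = "none"
)

// versionString formats the build identity for --version output.
func versionString() string {
	return fmt.Sprintf("logpile %s (%s)", version, commit)
}

func main() {
	fmt.Println(versionString()) // logpile dev (none)
}
```

Without the linker flags the binary reports the `dev`/`none` defaults, which is why `make build` always passes `$(LDFLAGS)`.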
## README.md (150 lines changed: 151 removed, 11 added)

```diff
@@ -1,151 +1,11 @@
 # LOGPile
 
-LOGPile is a standalone Go application for analyzing BMC diagnostic data.
+Standalone Go application for BMC diagnostics analysis with an embedded web UI.
 
-It supports two scenarios:
-1. Uploading archives/snapshots and offline analysis in the web UI.
-2. Live collection via the Redfish API with subsequent export and offline re-upload.
+## Documentation
 
-## Features
+- Architecture and technical documentation (single source of truth): [`docs/bible/README.md`](docs/bible/README.md)
 
-- Standalone binary with embedded UI (no external static files).
-- Parsing of vendor archives (Supermicro, Inspur/Kaytus, NVIDIA, generic fallback).
-- Live collection over Redfish (`/api/collect`) with progress and a step log.
-- Extended Redfish snapshot:
-  - normalized data (CPU/RAM/Storage/GPU/PSU/NIC/PCIe/Firmware),
-  - raw `redfish_tree` for future analysis.
-- Re-uploading a JSON snapshot via `/api/upload` for offline work.
-- Export to CSV / JSON / TXT.
+## License
 
-## Requirements
-
-- Go 1.22+
-
-## Build
-
-```bash
-make build
-```
-
-The binary will be at `bin/logpile`.
-
-For cross-compilation:
-
-```bash
-make build-all
-```
-
-Artifacts:
-- `bin/logpile-linux-amd64`
-- `bin/logpile-linux-arm64`
-- `bin/logpile-darwin-amd64`
-- `bin/logpile-darwin-arm64`
-- `bin/logpile-windows-amd64.exe`
-
-## Run
-
-```bash
-./bin/logpile
-./bin/logpile --port 8082
-./bin/logpile --no-browser
-./bin/logpile --version
-```
-
-Crash debugging (keeps the console from closing):
-
-```bash
-./bin/logpile --hold-on-crash
-```
-
-> On Windows, `--hold-on-crash` is enabled by default.
-
-## Upload formats
-
-`POST /api/upload` accepts:
-- archives: `.tar`, `.tar.gz`, `.tgz`
-- JSON snapshot (`AnalysisResult`)
-
-## Live Redfish
-
-Starting a live collection:
-
-```http
-POST /api/collect
-```
-
-Example body:
-
-```json
-{
-  "host": "bmc01.example.local",
-  "protocol": "redfish",
-  "port": 443,
-  "username": "admin",
-  "auth_type": "password",
-  "password": "secret",
-  "tls_mode": "insecure"
-}
-```
-
-Job lifecycle:
-`queued -> running -> success|failed|canceled`
-
-Status and progress:
-- `GET /api/collect/{id}`
-- `POST /api/collect/{id}/cancel`
-
-## Export
-
-- `GET /api/export/csv` — serial numbers
-- `GET /api/export/json` — full `AnalysisResult` (including `raw_payloads`)
-- `GET /api/export/txt` — tabular report mirroring UI sections
-
-Names of exported files:
-
-`YYYY-MM-DD (SERVER MODEL) - SERVER SN.<ext>`
-
-Example:
-`2026-02-04 (SYS-421GE-TNHR2) - C8X123456789.json`
-
-## API
-
-```text
-POST   /api/upload
-POST   /api/collect
-GET    /api/collect/{id}
-POST   /api/collect/{id}/cancel
-GET    /api/status
-GET    /api/parsers
-GET    /api/events
-GET    /api/sensors
-GET    /api/config
-GET    /api/serials
-GET    /api/firmware
-GET    /api/export/csv
-GET    /api/export/json
-GET    /api/export/txt
-DELETE /api/clear
-POST   /api/shutdown
-```
-
-`/api/status` and `/api/config` contain source metadata:
-- `source_type`: `archive` | `api`
-- `protocol`: `redfish` | `ipmi` (may be empty for archives)
-- `target_host`
-- `collected_at`
-
-## Structure
-
-```text
-cmd/logpile/main.go   # entrypoint
-internal/collector/   # live collectors (redfish, ipmi mock)
-internal/parser/      # archive parsers
-internal/server/      # HTTP handlers
-internal/exporter/    # CSV/JSON/TXT export
-internal/models/      # data contracts
-web/                  # embedded templates/static
-```
-
-## License
-
-MIT — see `LICENSE`.
+MIT (see `LICENSE`)
```
## bible (new submodule)

Submodule `bible` added at `0c829182a1`.
## bible-local/01-overview.md (new file, 35 lines)

```markdown
# 01 — Overview

## What is LOGPile?

LOGPile is a standalone Go application for BMC (Baseboard Management Controller)
diagnostics analysis with an embedded web UI.
It runs as a single binary with no external file dependencies.

## Operating modes

| Mode | Entry point | Description |
|------|-------------|-------------|
| **Offline / archive** | `POST /api/upload` | Upload a vendor diagnostic archive or a JSON snapshot; parse and display in UI |
| **Live / Redfish** | `POST /api/collect` | Connect to a live BMC via Redfish API, collect hardware inventory, display and export |

Both modes produce the same in-memory `AnalysisResult` structure and expose it
through the same API and UI.

## Key capabilities

- Single self-contained binary with embedded HTML/JS/CSS (no static file serving required).
- Vendor archive parsing: Inspur/Kaytus, Dell TSR, NVIDIA HGX Field Diagnostics,
  NVIDIA Bug Report, Unraid, XigmaNAS, Generic text fallback.
- Live Redfish collection with async progress tracking.
- Normalized hardware inventory: CPU / RAM / Storage / GPU / PSU / NIC / PCIe / Firmware.
- Raw `redfish_tree` snapshot stored in `RawPayloads` for future offline re-analysis.
- Re-upload of a JSON snapshot for offline work (`/api/upload` accepts `AnalysisResult` JSON).
- Export in CSV, JSON (full `AnalysisResult`), and Reanimator format.
- PCI device model resolution via embedded `pci.ids` (no hardcoded model strings).

## Non-goals (current scope)

- No persistent storage — all state is in-memory per process lifetime.
- IPMI collector is a mock scaffold only; real IPMI support is not implemented.
- No authentication layer on the HTTP server.
```
## bible-local/02-architecture.md (new file, 115 lines)

````markdown
# 02 — Architecture

## Runtime stack

| Layer | Technology |
|-------|------------|
| Language | Go 1.22+ |
| HTTP | `net/http`, `http.ServeMux` |
| UI | Embedded via `//go:embed` in `web/embed.go` (templates + static assets) |
| State | In-memory only — no database |
| Build | `CGO_ENABLED=0`, single static binary |

Default port: **8082**

## Directory structure

```
cmd/logpile/main.go      # Binary entry point, CLI flag parsing
internal/
  collector/             # Live data collectors
    registry.go          # Collector registration
    redfish.go           # Redfish connector (real implementation)
    ipmi_mock.go         # IPMI mock connector (scaffold)
    types.go             # Connector request/progress contracts
  parser/                # Archive parsers
    parser.go            # BMCParser (dispatcher) + parse orchestration
    archive.go           # Archive extraction helpers
    registry.go          # Parser registry + detect/selection
    interface.go         # VendorParser interface
    vendors/             # Vendor-specific parser modules
      vendors.go         # Import-side-effect registrations
      dell/
      inspur/
      nvidia/
      nvidia_bug_report/
      unraid/
      xigmanas/
      generic/
      pciids/            # PCI IDs lookup (embedded pci.ids)
  server/                # HTTP layer
    server.go            # Server struct, route registration
    handlers.go          # All HTTP handler functions
  exporter/              # Export formatters
    exporter.go          # CSV + JSON exporters
    reanimator_models.go
    reanimator_converter.go
  models/                # Shared data contracts
web/
  embed.go               # go:embed directive
  templates/             # HTML templates
  static/                # JS / CSS
    js/app.js            # Frontend — API contract consumer
```

## In-memory state

The `Server` struct in `internal/server/server.go` holds:

| Field | Type | Description |
|-------|------|-------------|
| `result` | `*models.AnalysisResult` | Current parsed/collected dataset |
| `detectedVendor` | `string` | Vendor identifier from last parse |
| `jobManager` | `*JobManager` | Tracks live collect job status/logs |
| `collectors` | `*collector.Registry` | Registered live collection connectors |

State is replaced atomically on successful upload or collect.
On a failed/canceled collect, the previous `result` is preserved unchanged.

## Upload flow (`POST /api/upload`)

```
multipart form field: "archive"
 │
 ├─ file looks like JSON?
 │   └─ parse as models.AnalysisResult snapshot → store in Server.result
 │
 └─ otherwise
     └─ parser.NewBMCParser().ParseFromReader(...)
         │
         ├─ try all registered vendor parsers (highest confidence wins)
         └─ result → store in Server.result
```

## Live collect flow (`POST /api/collect`)

```
validate request (host / protocol / port / username / auth_type / tls_mode)
 │
 └─ launch async job
     │
     ├─ progress callback → job log (queryable via GET /api/collect/{id})
     │
     ├─ success:
     │    set source metadata (source_type=api, protocol, host, date)
     │    store result in Server.result
     │
     └─ failure / cancel:
          previous Server.result unchanged
```

Job lifecycle states: `queued → running → success | failed | canceled`

## PCI IDs lookup

Load/override order (`LOGPILE_PCI_IDS_PATH` has highest priority because it is loaded last):

1. Embedded `internal/parser/vendors/pciids/pci.ids` (base dataset compiled into the binary)
2. `./pci.ids`
3. `/usr/share/hwdata/pci.ids`
4. `/usr/share/misc/pci.ids`
5. `/opt/homebrew/share/pciids/pci.ids`
6. Paths from `LOGPILE_PCI_IDS_PATH` (colon-separated on Unix; loaded later, so they override the same IDs)

This means unknown GPU/NIC model strings can be updated by refreshing `pci.ids`
without any code change.
````
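The PCI IDs load order amounts to a last-writer-wins merge over the sources listed above. A sketch under stated assumptions: `loadOrder` and `mergeIDs` are illustrative names (not the real loader API), and the ID/name entries are made-up sample data; only the path list and the "loaded later overrides" rule come from the document:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// loadOrder returns candidate pci.ids sources in load order. Entries loaded
// later override earlier ones, so LOGPILE_PCI_IDS_PATH paths win.
func loadOrder(env string) []string {
	paths := []string{
		"(embedded pci.ids)", // base dataset compiled into the binary
		"./pci.ids",
		"/usr/share/hwdata/pci.ids",
		"/usr/share/misc/pci.ids",
		"/opt/homebrew/share/pciids/pci.ids",
	}
	if env != "" {
		paths = append(paths, filepath.SplitList(env)...) // colon-separated on Unix
	}
	return paths
}

// mergeIDs applies sources in order; later sources override earlier entries
// for the same vendor:device ID.
func mergeIDs(sources ...map[string]string) map[string]string {
	out := map[string]string{}
	for _, src := range sources {
		for id, name := range src {
			out[id] = name
		}
	}
	return out
}

func main() {
	embedded := map[string]string{"8086:0001": "Example NIC (embedded name)"}
	override := map[string]string{"8086:0001": "Example NIC (corrected name)"} // e.g. via LOGPILE_PCI_IDS_PATH
	merged := mergeIDs(embedded, override)
	fmt.Println(merged["8086:0001"]) // Example NIC (corrected name)
	fmt.Println(loadOrder("/etc/custom/pci.ids"))
}
```

This is why refreshing a later-loaded `pci.ids` file fixes unknown model strings without touching code: the merged map simply takes the newest name for each ID.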
## bible-local/03-api.md (new file, 184 lines)

````markdown
# 03 — API Reference

## Conventions

- All endpoints under `/api/`.
- Request bodies: `application/json` or `multipart/form-data` where noted.
- Responses: `application/json` unless file download.
- Export filenames follow pattern: `YYYY-MM-DD (SERVER MODEL) - SERVER SN.<ext>`

---

## Upload & Data Input

### `POST /api/upload`

Upload a vendor diagnostic archive or a JSON snapshot.

**Request:** `multipart/form-data`, field name `archive`.
Server-side multipart limit: **100 MiB**.

Accepted inputs:
- `.tar`, `.tar.gz`, `.tgz` — vendor diagnostic archives
- `.txt` — plain text files
- JSON file containing a serialized `AnalysisResult` — re-loaded as-is

**Response:** `200 OK` with parsed result summary, or `4xx`/`5xx` on error.

---

## Live Collection

### `POST /api/collect`

Start a live collection job (`redfish` or `ipmi`).

**Request body:**
```json
{
  "host": "bmc01.example.local",
  "protocol": "redfish",
  "port": 443,
  "username": "admin",
  "auth_type": "password",
  "password": "secret",
  "tls_mode": "insecure"
}
```

Supported values:
- `protocol`: `redfish` | `ipmi`
- `auth_type`: `password` | `token`
- `tls_mode`: `strict` | `insecure`

**Response:** `202 Accepted`
```json
{
  "job_id": "job_a1b2c3d4e5f6",
  "status": "queued",
  "message": "Collection job accepted",
  "created_at": "2026-02-23T12:00:00Z"
}
```

Validation behavior:
- `400 Bad Request` for invalid JSON
- `422 Unprocessable Entity` for semantic validation errors (missing/invalid fields)

### `GET /api/collect/{id}`

Poll job status and progress log.

**Response:**
```json
{
  "job_id": "job_a1b2c3d4e5f6",
  "status": "running",
  "progress": 55,
  "logs": ["..."],
  "created_at": "2026-02-23T12:00:00Z",
  "updated_at": "2026-02-23T12:00:10Z"
}
```

Status values: `queued` | `running` | `success` | `failed` | `canceled`

### `POST /api/collect/{id}/cancel`

Cancel a running job.

---

## Data Queries

### `GET /api/status`

Returns source metadata for the current dataset.

```json
{
  "loaded": true,
  "filename": "redfish://bmc01.example.local",
  "vendor": "redfish",
  "source_type": "api",
  "protocol": "redfish",
  "target_host": "bmc01.example.local",
  "collected_at": "2026-02-10T15:30:00Z",
  "stats": { "events": 0, "sensors": 0, "fru": 0 }
}
```

`source_type`: `archive` | `api`

When no dataset is loaded, the response is `{ "loaded": false }`.

### `GET /api/config`

Returns source metadata plus:
- `hardware.board`
- `hardware.firmware`
- canonical `hardware.devices`
- computed `specification` summary lines

### `GET /api/events`

Returns parsed diagnostic events.

### `GET /api/sensors`

Returns sensor readings (temperatures, voltages, fan speeds).

### `GET /api/serials`

Returns serial numbers built from canonical `hardware.devices`.

### `GET /api/firmware`

Returns firmware versions built from canonical `hardware.devices`.

### `GET /api/parsers`

Returns the list of registered vendor parsers with their identifiers.

---

## Export

### `GET /api/export/csv`

Download serial numbers as CSV.

### `GET /api/export/json`

Download full `AnalysisResult` as JSON (includes `raw_payloads`).

### `GET /api/export/reanimator`

Download hardware data in Reanimator format for asset tracking integration.
See [`07-exporters.md`](07-exporters.md) for the full format spec.

---

## Management

### `DELETE /api/clear`

Clear the current in-memory dataset.

### `POST /api/shutdown`

Gracefully shut down the server process.
This endpoint terminates the current process after responding.

---

## Source metadata fields

Fields present in `/api/status` and `/api/config`:

| Field | Values |
|-------|--------|
| `source_type` | `archive` \| `api` |
| `protocol` | `redfish` \| `ipmi` (may be empty for archive uploads) |
| `target_host` | IP or hostname |
| `collected_at` | RFC3339 timestamp |
````
## bible-local/04-data-models.md (new file, 104 lines)

````markdown
# 04 — Data Models

## AnalysisResult

`internal/models/` — the central data contract shared by parsers, collectors, exporters, and the HTTP layer.

**Stability rule:** Never break the JSON shape of `AnalysisResult`.
Backward-compatible additions are allowed; removals or renames are not.

Key top-level fields:

| Field | Type | Description |
|-------|------|-------------|
| `filename` | `string` | Uploaded filename or generated live source identifier |
| `source_type` | `string` | `archive` or `api` |
| `protocol` | `string` | `redfish`, `ipmi`, or empty for archive uploads |
| `target_host` | `string` | BMC host for live collection |
| `collected_at` | `time.Time` | Upload/collection timestamp |
| `hardware` | `*HardwareConfig` | All normalized hardware inventory |
| `events` | `[]Event` | Diagnostic events from parsers |
| `fru` | `[]FRUInfo` | FRU/SDR-derived inventory details |
| `sensors` | `[]SensorReading` | Sensor readings |
| `raw_payloads` | `map[string]any` | Raw vendor data (e.g. `redfish_tree`) |

`raw_payloads` is the durable source for offline re-analysis (especially for Redfish).
Normalized fields should be treated as derivable output from raw source data.

### Hardware sub-structure

```
HardwareConfig
├── board            BoardInfo        — server/motherboard identity
├── devices          []HardwareDevice — CANONICAL INVENTORY (see below)
├── cpus             []CPU
├── memory           []MemoryDIMM
├── storage          []Storage
├── volumes          []StorageVolume  — logical RAID/VROC volumes
├── pcie_devices     []PCIeDevice
├── gpus             []GPU
├── network_adapters []NetworkAdapter
├── network_cards    []NIC            — legacy/alternate source field
├── power_supplies   []PSU
└── firmware         []FirmwareInfo
```

---

## Canonical Device Repository (`hardware.devices`)

`hardware.devices` is the **single source of truth** for hardware inventory.

### Rules — must not be violated

1. All UI tabs displaying hardware components **must read from `hardware.devices`**.
2. The Device Inventory tab shows kinds: `pcie`, `storage`, `gpu`, `network`.
3. The Reanimator exporter **must use the same `hardware.devices`** as the UI.
4. Any discrepancy between UI data and Reanimator export data is a **bug**.
5. New hardware attributes must be added to the canonical device schema **first**,
   then mapped to Reanimator/UI — never the other way around.
6. The exporter should group/filter canonical records by section, not rebuild data
   from multiple sources.

### Deduplication logic (applied once by the repository builder)

| Priority | Key used |
|----------|----------|
| 1 | `serial_number` — usable (not empty, not `N/A`, `NA`, `NONE`, `NULL`, `UNKNOWN`, `-`) |
| 2 | `bdf` — PCI Bus:Device.Function address |
| 3 | No merge — records remain distinct if both serial and bdf are absent |

### Device schema alignment

Keep the `hardware.devices` schema as close as possible to Reanimator JSON field names.
This minimizes translation logic in the exporter and prevents drift.

---

## Source metadata fields (stored directly on `AnalysisResult`)

Carried by both `/api/status` and `/api/config`:

```json
{
  "source_type": "api",
  "protocol": "redfish",
  "target_host": "10.0.0.1",
  "collected_at": "2026-02-10T15:30:00Z"
}
```

Valid `source_type` values: `archive`, `api`
Valid `protocol` values: `redfish`, `ipmi` (empty is allowed for archive uploads)

---

## Raw Export Package (reopenable artifact)

`Export Raw Data` does not merely dump `AnalysisResult`; it emits a reopenable raw package
(JSON or ZIP bundle) that carries the source data required for re-analysis.

Design rules:
- raw source is authoritative (`redfish_tree` or original file bytes)
- imports must re-analyze from raw source
- parsed field snapshots included in bundles are diagnostic artifacts, not the source of truth
````
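The deduplication table above can be sketched in Go. Assumptions: `Device`, `usableSerial`, and `dedupe` are illustrative stand-ins for the repository builder and the real `HardwareDevice` type; the placeholder-serial list and the serial-then-BDF priority come from the table:

```go
package main

import (
	"fmt"
	"strings"
)

// Device is a trimmed-down stand-in for the canonical HardwareDevice record.
type Device struct {
	Serial string
	BDF    string
	Model  string
}

// usableSerial reports whether a serial can serve as a dedup key:
// not empty and not a known placeholder value.
func usableSerial(s string) bool {
	switch strings.ToUpper(strings.TrimSpace(s)) {
	case "", "N/A", "NA", "NONE", "NULL", "UNKNOWN", "-":
		return false
	}
	return true
}

// dedupe keeps the first record seen per serial (priority 1) or BDF
// (priority 2); records with neither key are never merged (priority 3).
func dedupe(in []Device) []Device {
	seen := map[string]bool{}
	var out []Device
	for _, d := range in {
		var keys []string
		if usableSerial(d.Serial) {
			keys = append(keys, "sn:"+strings.ToUpper(strings.TrimSpace(d.Serial)))
		}
		if d.BDF != "" {
			keys = append(keys, "bdf:"+d.BDF)
		}
		if len(keys) == 0 {
			out = append(out, d) // no usable key: keep the record distinct
			continue
		}
		dup := false
		for _, k := range keys {
			if seen[k] {
				dup = true
			}
		}
		if dup {
			continue
		}
		for _, k := range keys {
			seen[k] = true
		}
		out = append(out, d)
	}
	return out
}

func main() {
	devs := []Device{
		{Serial: "ABC123", BDF: "0000:3b:00.0", Model: "Example NIC"},
		{Serial: "abc123", Model: "Example NIC"}, // same serial, different case → merged
		{Serial: "N/A", BDF: "0000:3b:00.0"},     // placeholder serial, falls back to BDF → merged
		{Model: "Example Riser"},                 // no serial, no BDF → kept distinct
	}
	fmt.Println(len(dedupe(devs))) // 2
}
```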
109
bible-local/05-collectors.md
Normal file
109
bible-local/05-collectors.md
Normal file
@@ -0,0 +1,109 @@
|
||||
# 05 — Collectors
|
||||
|
||||
Collectors live in `internal/collector/`.
|
||||
|
||||
Core files:
|
||||
- `internal/collector/registry.go` — connector registry (`redfish`, `ipmi`)
|
||||
- `internal/collector/redfish.go` — real Redfish connector
|
||||
- `internal/collector/ipmi_mock.go` — IPMI mock connector scaffold
|
||||
- `internal/collector/types.go` — request/progress contracts
|
||||
|
||||
---
|
||||
|
||||
## Redfish Collector (`redfish`)
|
||||
|
||||
**Status:** Production-ready.
|
||||
|
||||
### Request contract (from server)
|
||||
|
||||
Passed through from `/api/collect` after validation:
|
||||
- `host`, `port`, `username`
|
||||
- `auth_type=password|token` (+ matching credential field)
|
||||
- `tls_mode=strict|insecure`
|
||||
|
||||
### Discovery
|
||||
|
||||
Dynamic — does not assume fixed paths. Discovers:
|
||||
- `Systems` collection → per-system resources
|
||||
- `Chassis` collection → enclosure/board data
|
||||
- `Managers` collection → BMC/firmware info
|
||||
|
||||
### Collected data
|
||||
|
||||
| Category | Notes |
|
||||
|----------|-------|
|
||||
| CPU | Model, cores, threads, socket, status |
|
||||
| Memory | DIMM slot, size, type, speed, serial, manufacturer |
|
||||
| Storage | Slot, type, model, serial, firmware, interface, status |
|
||||
| GPU | Detected via PCIe class + NVIDIA vendor ID |
|
||||
| PSU | Model, serial, wattage, firmware, telemetry (input/output power, voltage) |
|
||||
| NIC | Model, serial, port count, BDF |
|
||||
| PCIe | Slot, vendor_id, device_id, BDF, link width/speed |
|
||||
| Firmware | BIOS, BMC versions |
|
||||
|
||||
### Raw snapshot
|
||||
|
||||
Full Redfish response tree is stored in `result.RawPayloads["redfish_tree"]`.
|
||||
This allows future offline re-analysis without re-collecting from a live BMC.
|
||||
|
||||
### Unified Redfish analysis pipeline (live == replay)
|
||||
|
||||
LOGPile uses a **single Redfish analyzer path**:
|
||||
|
||||
1. Live collector crawls the Redfish API and builds `raw_payloads.redfish_tree`
|
||||
2. Parsed result is produced by replaying that tree through the same analyzer used by raw import
|
||||
|
||||
This guarantees that live collection and `Export Raw Data` re-open/re-analyze produce the same
|
||||
normalized output for the same `redfish_tree`.
|
||||
|
||||
### Snapshot crawler behavior (important)
|
||||
|
||||
The Redfish snapshot crawler is intentionally:
|
||||
- **bounded** (`LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`)
|
||||
- **prioritized** (PCIe, Fabrics, FirmwareInventory, Storage, PowerSubsystem, ThermalSubsystem)
|
||||
- **tolerant** (skips noisy expected failures, strips `#fragment` from `@odata.id`)
|
||||
|
||||
Design notes:
|
||||
- Queue capacity is sized to snapshot cap to avoid worker deadlocks on large trees.
|
||||
- UI progress is coarse and human-readable; detailed per-request diagnostics are available via debug logs.
|
||||
- `LOGPILE_REDFISH_DEBUG=1` and `LOGPILE_REDFISH_SNAPSHOT_DEBUG=1` enable console diagnostics.
|
||||
|
||||
### Parsing guidelines

When adding Redfish mappings, follow these principles:
- Support alternate collection paths (resources may appear at different odata URLs).
- Follow `@odata.id` references and handle embedded `Members` arrays.
- Prefer **raw-tree replay compatibility**: if the live collector adds a fallback/probe, the replay analyzer must mirror it.
- Deduplicate by serial / BDF / slot+model (in that priority order).
- Prefer tolerant/fallback parsing — missing fields should be silently skipped,
  not cause the whole collection to fail.

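The deduplication priority above (serial, then BDF, then slot+model) can be sketched as a key function. The `device` struct and field names here are illustrative stand-ins, not the real model types:

```go
package main

import (
	"fmt"
	"strings"
)

// device is a minimal stand-in for a parsed Redfish device record.
type device struct {
	Serial, BDF, Slot, Model string
}

// dedupKey implements the documented priority order:
// serial first, then BDF, then slot+model.
func dedupKey(d device) string {
	switch {
	case strings.TrimSpace(d.Serial) != "":
		return "sn:" + d.Serial
	case strings.TrimSpace(d.BDF) != "":
		return "bdf:" + d.BDF
	default:
		return "slot:" + d.Slot + "|" + d.Model
	}
}

// dedupe keeps the first device seen for each key.
func dedupe(devs []device) []device {
	seen := map[string]bool{}
	var out []device
	for _, d := range devs {
		k := dedupKey(d)
		if !seen[k] {
			seen[k] = true
			out = append(out, d)
		}
	}
	return out
}

func main() {
	devs := []device{
		{Serial: "SN1", BDF: "0000:18:00.0"},
		{Serial: "SN1"},                      // duplicate by serial
		{BDF: "0000:3b:00.0", Slot: "PCIe2"}, // no serial → keyed by BDF
		{Slot: "PCIe3", Model: "ConnectX-5"}, // keyed by slot+model
	}
	fmt.Println(len(dedupe(devs))) // 3
}
```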
### Vendor-specific storage fallbacks (Supermicro and similar)

When standard `Storage/.../Drives` collections are empty, the collector/replay may recover drives via:
- `Storage.Links.Enclosures[*] -> .../Drives`
- direct probing of a finite set of `Disk.Bay` candidates (`Disk.Bay.0`, `Disk.Bay0`, `.../0`)

This is required for some BMCs that publish drive inventory in vendor-specific paths while leaving
standard collections empty.

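The candidate-probing fallback can be sketched as follows. This is an assumption-laden illustration: `probeDriveBays`, the bay limit of 24, and `fetchDrive` (a stand-in for a Redfish GET returning nil on 404) are all hypothetical names, not the real collector's API — only the finite-candidate technique itself comes from the text above.

```go
package main

import "fmt"

// probeDriveBays tries a small, finite list of vendor-specific drive paths
// per bay instead of crawling blindly, stopping at the first spelling that
// answers for each bay.
func probeDriveBays(chassis string, fetchDrive func(path string) map[string]any) []map[string]any {
	var drives []map[string]any
	for bay := 0; bay < 24; bay++ { // finite, bounded probe
		candidates := []string{
			fmt.Sprintf("%s/Drives/Disk.Bay.%d", chassis, bay),
			fmt.Sprintf("%s/Drives/Disk.Bay%d", chassis, bay),
			fmt.Sprintf("%s/Drives/%d", chassis, bay),
		}
		for _, c := range candidates {
			if d := fetchDrive(c); d != nil {
				drives = append(drives, d)
				break // first matching spelling wins for this bay
			}
		}
	}
	return drives
}

func main() {
	// Fake BMC that only answers the "Disk.Bay.N" spelling for bays 0 and 1.
	fetch := func(p string) map[string]any {
		if p == "/redfish/v1/Chassis/1/Drives/Disk.Bay.0" || p == "/redfish/v1/Chassis/1/Drives/Disk.Bay.1" {
			return map[string]any{"@odata.id": p}
		}
		return nil
	}
	fmt.Println(len(probeDriveBays("/redfish/v1/Chassis/1", fetch))) // 2
}
```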
### PSU source preference (newer Redfish)

PSU inventory source order:
1. `Chassis/*/PowerSubsystem/PowerSupplies` (preferred on X14+/newer Redfish)
2. `Chassis/*/Power` (legacy fallback)

### Progress reporting

The collector emits progress log entries at each stage (connecting, enumerating systems,
collecting CPUs, etc.) so the UI can display meaningful status.
Current progress message strings are user-facing and may be localized.

---

## IPMI Collector (`ipmi`)

**Status:** Mock scaffold only — not implemented.

Registered in the collector registry but returns placeholder data.
Real IPMI support is a future work item.

341
bible-local/06-parsers.md
Normal file
@@ -0,0 +1,341 @@

# 06 — Parsers

## Framework

### Registration

Each vendor parser registers itself via Go's `init()` side-effect import pattern.

All registrations are collected in `internal/parser/vendors/vendors.go`:
```go
import (
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/dell"
	// etc.
)
```

### VendorParser interface

```go
type VendorParser interface {
	Name() string    // human-readable name
	Vendor() string  // vendor identifier string
	Version() string // parser version (increment on logic changes)
	Detect(files []ExtractedFile) int // confidence 0–100
	Parse(files []ExtractedFile) (*models.AnalysisResult, error)
}
```

### Selection logic

All registered parsers run `Detect()` against the uploaded archive's file list.
The parser with the **highest confidence score** is selected.
Multiple parsers may return >0; only the top scorer is used.

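The selection rule can be sketched as below. The `parser` struct here is a simplified stand-in for the real `VendorParser` interface (only `Detect()` matters for selection), and the scores are made up for illustration:

```go
package main

import "fmt"

// parser is a minimal stand-in for a registered VendorParser.
type parser struct {
	name   string
	detect func(files []string) int
}

// selectParser runs every registered parser's Detect and returns the top scorer.
func selectParser(parsers []parser, files []string) (string, int) {
	best, bestScore := "", 0
	for _, p := range parsers {
		if s := p.detect(files); s > bestScore {
			best, bestScore = p.name, s
		}
	}
	return best, bestScore
}

func main() {
	parsers := []parser{
		{"generic", func([]string) int { return 15 }}, // low-confidence fallback
		{"dell", func(files []string) int {
			for _, f := range files {
				if f == "tsr/metadata.json" { // vendor-specific marker
					return 90
				}
			}
			return 0
		}},
	}
	name, score := selectParser(parsers, []string{"tsr/metadata.json"})
	fmt.Println(name, score) // dell 90
}
```

Because the generic fallback always scores a nonzero baseline, it wins only when no vendor parser recognizes the archive.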
### Adding a new vendor parser

1. `mkdir -p internal/parser/vendors/VENDORNAME`
2. Copy `internal/parser/vendors/template/parser.go.template` as a starting point.
3. Implement `Detect()` and `Parse()`.
4. Add a blank import to `vendors/vendors.go`.

`Detect()` tips:
- Look for unique filenames or directory names.
- Check file content for vendor-specific markers.
- Return 70+ only when confident; return 0 if clearly not a match.

### Parser versioning

Each parser file contains a `parserVersion` constant.
Increment the version whenever parsing logic changes — this helps trace which
version produced a given result.

---

## Parser data quality rules

### FirmwareInfo — system-level only

`Hardware.Firmware` must contain **only system-level firmware**: BIOS, BMC/iDRAC,
Lifecycle Controller, CPLD, storage controllers, BOSS adapters.

**Device-bound firmware** (NIC, GPU, PSU, disk, backplane) **must NOT be added to
`Hardware.Firmware`**. It belongs to the device's own `Firmware` field and is already
present there. Duplicating it in `Hardware.Firmware` causes double entries in Reanimator.

The Reanimator exporter filters by `FirmwareInfo.DeviceName` prefix and by
`FirmwareInfo.Description` (FQDD prefix). Parsers must cooperate:

- Store the device's FQDD (or equivalent slot identifier) in `FirmwareInfo.Description`
  for all firmware entries that come from a per-device inventory source (e.g. Dell
  `DCIM_SoftwareIdentity`).
- FQDD prefixes that are device-bound: `NIC.`, `PSU.`, `Disk.`, `RAID.Backplane.`, `GPU.`

### NIC/device model names — strip embedded MAC addresses

Some vendors (confirmed: Dell TSR) embed the MAC address in the device model name field,
e.g. `ProductName = "NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"`.

**Rule:** Strip any ` - XX:XX:XX:XX:XX:XX` suffix from model/name strings before storing
them in `FirmwareInfo.DeviceName`, `NetworkAdapter.Model`, or any other model field.

Use `nicMACInModelRE` (defined in the Dell parser) or an equivalent regex:
```
\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$
```

This applies to **all** string fields used as device names or model identifiers.

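Applied in Go, the rule looks like the sketch below. The regex pattern is the one documented above; the helper name `stripEmbeddedMAC` is illustrative (the Dell parser exposes only the `nicMACInModelRE` variable name):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nicMACInModelRE matches a trailing " - XX:XX:XX:XX:XX:XX" MAC suffix.
var nicMACInModelRE = regexp.MustCompile(`\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$`)

// stripEmbeddedMAC removes the MAC suffix from a model/name string, as
// required before storing it in any model or device-name field.
func stripEmbeddedMAC(model string) string {
	return strings.TrimSpace(nicMACInModelRE.ReplaceAllString(model, ""))
}

func main() {
	fmt.Println(stripEmbeddedMAC("NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"))
	// NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF
}
```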
### PCI device name enrichment via pci.ids

If a PCIe device, GPU, NIC, or any hardware component has a `vendor_id` + `device_id`
but its model/name field is **empty or generic** (e.g. blank, equals the description,
or is just a raw hex ID), the parser **must** attempt to resolve the human-readable
model name from the embedded `pci.ids` database before storing the result.

**Rule:** When `Model` (or equivalent name field) is empty and both `VendorID` and
`DeviceID` are non-zero, call the pciids lookup and use the result as the model name.

```go
// Example pattern — use in any parser that handles PCIe/GPU/NIC devices:
if strings.TrimSpace(device.Model) == "" && device.VendorID != 0 && device.DeviceID != 0 {
	if name := pciids.Lookup(device.VendorID, device.DeviceID); name != "" {
		device.Model = name
	}
}
```

This rule applies to all vendor parsers. The pciids package is available at
`internal/parser/vendors/pciids`. See ADL-005 for the rationale.

**Do not hardcode model name strings.** If a device is unknown today, it will be
resolved automatically once `pci.ids` is updated.

---

## Vendor parsers

### Inspur / Kaytus (`inspur`)

**Status:** Ready. Tested on KR4268X2 (onekeylog format).

**Archive format:** `.tar.gz` onekeylog

**Primary source files:**

| File | Content |
|------|---------|
| `asset.json` | Base hardware inventory |
| `component.log` | Component list |
| `devicefrusdr.log` | FRU and SDR data |
| `onekeylog/runningdata/redis-dump.rdb` | Runtime enrichment (optional) |

**Redis RDB enrichment** (applied conservatively — fills missing fields only):
- GPU: `serial_number`, `firmware` (VBIOS/FW), runtime telemetry
- NIC: firmware, serial, part number (when text logs leave fields empty)

**Module structure:**
```
inspur/
  parser.go — main parser + registration
  sdr.go    — sensor/SDR parsing
  fru.go    — FRU serial parsing
  asset.go  — asset.json parsing
  syslog.go — syslog parsing
```

---

### Dell TSR (`dell`)

**Status:** Ready (v3.0). Tested on nested TSR archives with embedded `*.pl.zip`.

**Archive format:** `.zip` (outer archive + nested `*.pl.zip`)

**Primary source files:**
- `tsr/metadata.json`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_CIM_Sensor.xml`
- `tsr/hardware/sysinfo/lcfiles/curr_lclog.xml`

**Extracted data:**
- Board/system identity and BIOS/iDRAC firmware
- CPU, memory, physical disks, virtual disks, PSU, NIC, PCIe
- GPU inventory (`DCIM_VideoView`) + GPU sensor enrichment (`DCIM_GPUSensor`)
- Controller/backplane inventory (`DCIM_ControllerView`, `DCIM_EnclosureView`)
- Sensor readings (temperature/voltage/current/power/fan/utilization)
- Lifecycle events (`curr_lclog.xml`)

---

### NVIDIA HGX Field Diagnostics (`nvidia`)

**Status:** Ready (v1.1.0). Works with any server vendor.

**Archive format:** `.tar` / `.tar.gz`

**Confidence scoring:**

| File | Score |
|------|-------|
| `unified_summary.json` with "HGX Field Diag" marker | +40 |
| `summary.json` | +20 |
| `summary.csv` | +15 |
| `gpu_fieldiag/` directory | +15 |

**Source files:**

| File | Content |
|------|---------|
| `output.log` | dmidecode — server manufacturer, model, serial number |
| `unified_summary.json` | GPU details, NVSwitch devices, PCI addresses |
| `summary.json` | Diagnostic test results and error codes |
| `summary.csv` | Alternative test results format |

**Extracted data:**
- GPUs: slot, model, manufacturer, firmware (VBIOS), BDF
- NVSwitch devices: slot, device_class, vendor_id, device_id, BDF, link speed/width
- Events: diagnostic test failures (connectivity, gpumem, gpustress, pcie, nvlink, nvswitch, power)

**Severity mapping:**
- `info` — tests passed
- `warning` — e.g. "Row remapping failed"
- `critical` — error codes 300+

**Known limitations:**
- Detailed logs in `gpu_fieldiag/*.log` are not parsed.
- No CPU, memory, or storage extraction (not present in field diag archives).

---

### NVIDIA Bug Report (`nvidia_bug_report`)

**Status:** Ready (v1.0.0).

**File format:** `nvidia-bug-report-*.log.gz` (gzip-compressed text)

**Confidence:** 85 (high priority for a matching filename pattern)

**Source sections parsed:**

| dmidecode section | Extracts |
|-------------------|----------|
| System Information | server serial, UUID, manufacturer, product name |
| Processor Information | CPU model, serial, core/thread count, frequency |
| Memory Device | DIMM slot, size, type, manufacturer, serial, part number, speed |
| System Power Supply | PSU location, manufacturer, model, serial, wattage, firmware, status |

| Other source | Extracts |
|--------------|----------|
| `lspci -vvv` (Ethernet/Network/IB) | NIC model (from VPD), BDF, slot, P/N, S/N, port count, port type |
| `/proc/driver/nvidia/gpus/*/information` | GPU model, BDF, UUID, VBIOS version, IRQ |
| NVRM version line | NVIDIA driver version |

**Known limitations:**
- Driver error/warning log lines are not yet extracted.
- GPU temperature/utilization metrics require additional parsing sections.

---

### XigmaNAS (`xigmanas`)

**Status:** Ready.

**Archive format:** Plain log files (FreeBSD-based NAS system)

**Detection:** Files named `xigmanas`, `system`, or `dmesg`; content containing "XigmaNAS" or "FreeBSD"; SMART data presence.

**Extracted data:**
- System: firmware version, uptime, CPU model, memory configuration, hardware platform
- Storage: disk models, serial numbers, capacity, health, SMART temperatures
- Populates: `Hardware.Firmware`, `Hardware.CPUs`, `Hardware.Memory`, `Hardware.Storage`, `Sensors`

---

### Unraid (`unraid`)

**Status:** Ready (v1.0.0).

**Archive format:** Unraid diagnostics archive contents (text-heavy diagnostics directories).

**Detection:** Combines filename/path markers (`diagnostics-*`, `unraid-*.txt`, `vars.txt`)
with content markers (e.g. `Unraid kernel build`, parity data markers).

**Extracted data (current):**
- Board / BIOS metadata (from motherboard/system files)
- CPU summary (from `lscpu.txt`)
- Memory modules (from the diagnostics memory file)
- Storage devices (from `vars.txt` + SMART files)
- Syslog events

---

### H3C SDS G5 (`h3c_g5`)

**Status:** Ready (v1.0.0). Tested on H3C UniServer R4900 G5 SDS archives.

**Archive format:** `.sds` (tar archive)

**Detection:** `hardware_info.ini`, `hardware.info`, `firmware_version.ini`, `user/test*.csv`, plus H3C markers.

**Extracted data (current):**
- Board/FRU inventory (`FRUInfo.ini`, `board_info.ini`)
- Firmware list (`firmware_version.ini`)
- CPU inventory (`hardware_info.ini`)
- Memory DIMM inventory (`hardware_info.ini`)
- Storage inventory (`hardware.info`, `storage_disk.ini`, `NVMe_info.txt`, RAID text enrichments)
- Logical RAID volumes (`raid.json`, `Storage_RAID-*.txt`)
- Sensor snapshot (`sensor_info.ini`)
- SEL events (`user/test.csv`, `user/test1.csv`, fallback `Sel.json` / `sel_list.txt`)

---

### H3C SDS G6 (`h3c_g6`)

**Status:** Ready (v1.0.0). Tested on H3C UniServer R4700 G6 SDS archives.

**Archive format:** `.sds` (tar archive)

**Detection:** `CPUDetailInfo.xml`, `MemoryDetailInfo.xml`, `firmware_version.json`, `Sel.json`, plus H3C markers.

**Extracted data (current):**
- Board/FRU inventory (`FRUInfo.ini`, `board_info.ini`)
- Firmware list (`firmware_version.json`)
- CPU inventory (`CPUDetailInfo.xml`)
- Memory DIMM inventory (`MemoryDetailInfo.xml`)
- Storage inventory + capacity/model/interface (`storage_disk.ini`, `Storage_RAID-*.txt`, `NVMe_info.txt`)
- Logical RAID volumes (`raid.json`, fallback from `Storage_RAID-*.txt` when available)
- Sensor snapshot (`sensor_info.ini`)
- SEL events (`user/Sel.json`, fallback `user/sel_list.txt`)

---

### Generic text fallback (`generic`)

**Status:** Ready (v1.0.0).

**Confidence:** 15 (lowest — only matches if no other parser scores higher)

**Purpose:** Fallback for any text file or single `.gz` file not matching a specific vendor.

**Behavior:**
- If the filename matches `nvidia-bug-report-*.log.gz`: extracts driver version and GPU list.
- Otherwise: confirms the file is text (not binary) and records a basic "Text File" event.

---

## Supported vendor matrix

| Vendor | ID | Status | Tested on |
|--------|----|--------|-----------|
| Dell TSR | `dell` | Ready | TSR nested zip archives |
| Inspur / Kaytus | `inspur` | Ready | KR4268X2 onekeylog |
| NVIDIA HGX Field Diag | `nvidia` | Ready | Various HGX servers |
| NVIDIA Bug Report | `nvidia_bug_report` | Ready | H100 systems |
| Unraid | `unraid` | Ready | Unraid diagnostics archives |
| XigmaNAS | `xigmanas` | Ready | FreeBSD NAS logs |
| H3C SDS G5 | `h3c_g5` | Ready | H3C UniServer R4900 G5 SDS archives |
| H3C SDS G6 | `h3c_g6` | Ready | H3C UniServer R4700 G6 SDS archives |
| Generic fallback | `generic` | Ready | Any text file |

366
bible-local/07-exporters.md
Normal file
@@ -0,0 +1,366 @@

# 07 — Exporters & Reanimator Integration

## Export endpoints summary

| Endpoint | Format | Filename pattern |
|----------|--------|------------------|
| `GET /api/export/csv` | CSV — serial numbers | `YYYY-MM-DD (MODEL) - SN.csv` |
| `GET /api/export/json` | **Raw export package** (JSON or ZIP bundle) for reopen/re-analysis | `YYYY-MM-DD (MODEL) - SN.(json\|zip)` |
| `GET /api/export/reanimator` | Reanimator hardware JSON | `YYYY-MM-DD (MODEL) - SN.json` |

---

## Raw Export (`Export Raw Data`)

### Purpose

Preserve enough source data to reproduce parsing later after parser fixes, without requiring
another live collection from the target system.

### Format

`/api/export/json` returns a **raw export package**:
- JSON package (machine-readable), or
- ZIP bundle containing:
  - `raw_export.json` — machine-readable package
  - `collect.log` — human-readable collection + parsing summary
  - `parser_fields.json` — structured parsed field snapshot for diffs between parser versions

### Import / reopen behavior

When a raw export package is uploaded back into LOGPile:
- the app **re-analyzes from raw source**
- it does **not** trust embedded parsed output as the source of truth

For Redfish, this means replay from `raw_payloads.redfish_tree`.

### Design rule

Raw export is a **re-analysis artifact**, not a final report dump. Keep it self-contained and
forward-compatible where possible (versioned package format, additive fields only).

---

## Reanimator Export

### Purpose

Exports hardware inventory data in the format expected by the Reanimator asset tracking
system. Enables one-click push from LOGPile to an external asset management platform.

### Implementation files

| File | Role |
|------|------|
| `internal/exporter/reanimator_models.go` | Go structs for Reanimator JSON |
| `internal/exporter/reanimator_converter.go` | `ConvertToReanimator()` and helpers |
| `internal/server/handlers.go` | `handleExportReanimator()` HTTP handler |

### Conversion rules

- Source: canonical `hardware.devices` repository (see [`04-data-models.md`](04-data-models.md))
- CPU manufacturer inferred from the model string (Intel / AMD / ARM / Ampere)
- PCIe serial number generated when absent: `{board_serial}-PCIE-{slot}`
- Status values normalized to: `OK`, `Warning`, `Critical`, `Unknown` (`Empty` only for memory slots)
- Timestamps in RFC3339 format
- `target_host` derived from the `filename` field (`redfish://…`, `ipmi://…`) if not in source; omitted if undeterminable
- `board.manufacturer` and `board.product_name` values of `"NULL"` treated as absent

### LOGPile → Reanimator field mapping

| LOGPile type | Reanimator section | Notes |
|---|---|---|
| `BoardInfo` | `board` | Direct mapping |
| `CPU` | `cpus` | + manufacturer (inferred) |
| `MemoryDIMM` | `memory` | Direct; empty slots included (`present=false`) |
| `Storage` | `storage` | Excluded if no `serial_number` |
| `PCIeDevice` | `pcie_devices` | Serial generated if missing |
| `GPU` | `pcie_devices` | `device_class=DisplayController` |
| `NetworkAdapter` | `pcie_devices` | `device_class=NetworkController` |
| `PSU` | `power_supplies` | Excluded if no serial or `present=false` |
| `FirmwareInfo` | `firmware` | Direct mapping |

### Inclusion / exclusion rules

**Included:**
- Memory slots with `present=false` (as Empty slots)
- PCIe devices without a serial number (serial is generated)

**Excluded:**
- Storage without `serial_number`
- PSU without `serial_number` or with `present=false`
- NetworkAdapters with `present=false`

---

## Reanimator Integration Guide

This section documents the Reanimator receiver-side JSON format (what the Reanimator
system expects when it ingests a LOGPile export).

> **Important:** The Reanimator endpoint uses a strict JSON decoder (`DisallowUnknownFields`).
> Any unknown field — including nested ones — causes `400 Bad Request`.
> Use only the `snake_case` keys listed here.

### Top-level structure

```json
{
  "filename": "redfish://10.10.10.103",
  "source_type": "api",
  "protocol": "redfish",
  "target_host": "10.10.10.103",
  "collected_at": "2026-02-10T15:30:00Z",
  "hardware": {
    "board": {...},
    "firmware": [...],
    "cpus": [...],
    "memory": [...],
    "storage": [...],
    "pcie_devices": [...],
    "power_supplies": [...]
  }
}
```

**Required:** `collected_at`, `hardware.board.serial_number`
**Optional:** `target_host`, `source_type`, `protocol`, `filename`

`source_type` values: `api`, `logfile`, `manual`
`protocol` values: `redfish`, `ipmi`, `snmp`, `ssh`

### Component status fields (all component sections)

Each component may carry:

| Field | Type | Description |
|-------|------|-------------|
| `status` | string | `OK`, `Warning`, `Critical`, `Unknown`, `Empty` |
| `status_checked_at` | RFC3339 | When status was last verified |
| `status_changed_at` | RFC3339 | When status last changed |
| `status_at_collection` | object | `{ "status": "...", "at": "..." }` — snapshot-time status |
| `status_history` | array | `[{ "status": "...", "changed_at": "...", "details": "..." }]` |
| `error_description` | string | Human-readable error for Warning/Critical |

### Board

```json
{
  "board": {
    "manufacturer": "Supermicro",
    "product_name": "X12DPG-QT6",
    "serial_number": "21D634101",
    "part_number": "X12DPG-QT6-REV1.01",
    "uuid": "d7ef2fe5-2fd0-11f0-910a-346f11040868"
  }
}
```

`serial_number` is required. `manufacturer` / `product_name` values of `"NULL"` are treated as absent.

### CPUs

```json
{
  "socket": 0,
  "model": "INTEL(R) XEON(R) GOLD 6530",
  "cores": 32,
  "threads": 64,
  "frequency_mhz": 2100,
  "max_frequency_mhz": 4000,
  "manufacturer": "Intel",
  "status": "OK"
}
```

`socket` (int) and `model` are required. Serial generated: `{board_serial}-CPU-{socket}`.

LOT format: `CPU_{VENDOR}_{MODEL_NORMALIZED}` → e.g. `CPU_INTEL_XEON_GOLD_6530`

### Memory

```json
{
  "slot": "CPU0_C0D0",
  "location": "CPU0_C0D0",
  "present": true,
  "size_mb": 32768,
  "type": "DDR5",
  "max_speed_mhz": 4800,
  "current_speed_mhz": 4800,
  "manufacturer": "Hynix",
  "serial_number": "80AD032419E17CEEC1",
  "part_number": "HMCG88AGBRA191N",
  "status": "OK"
}
```

`slot` and `present` are required. `serial_number` is required when `present=true`.
Empty slots (`present=false`, `status="Empty"`) are included, but no component is created.

LOT format: `DIMM_{TYPE}_{SIZE_GB}GB` → e.g. `DIMM_DDR5_32GB`

### Storage

```json
{
  "slot": "OB01",
  "type": "NVMe",
  "model": "INTEL SSDPF2KX076T1",
  "size_gb": 7680,
  "serial_number": "BTAX41900GF87P6DGN",
  "manufacturer": "Intel",
  "firmware": "9CV10510",
  "interface": "NVMe",
  "present": true,
  "status": "OK"
}
```

`slot`, `model`, `serial_number`, and `present` are required.

LOT format: `{TYPE}_{INTERFACE}_{SIZE_TB}TB` → e.g. `SSD_NVME_07.68TB`

### Power Supplies

```json
{
  "slot": "0",
  "present": true,
  "model": "GW-CRPS3000LW",
  "vendor": "Great Wall",
  "wattage_w": 3000,
  "serial_number": "2P06C102610",
  "part_number": "V0310C9000000000",
  "firmware": "00.03.05",
  "status": "OK",
  "input_power_w": 137,
  "output_power_w": 104,
  "input_voltage": 215.25
}
```

`slot` and `present` are required. `serial_number` is required when `present=true`.
Telemetry fields (`input_power_w`, `output_power_w`, `input_voltage`) are stored in the observation only.

LOT format: `PSU_{WATTAGE}W_{VENDOR_NORMALIZED}` → e.g. `PSU_3000W_GREAT_WALL`

### PCIe Devices

```json
{
  "slot": "PCIeCard1",
  "vendor_id": 32902,
  "device_id": 2912,
  "bdf": "0000:18:00.0",
  "device_class": "MassStorageController",
  "manufacturer": "Intel",
  "model": "RAID Controller RSP3DD080F",
  "link_width": 8,
  "link_speed": "Gen3",
  "max_link_width": 8,
  "max_link_speed": "Gen3",
  "serial_number": "RAID-001-12345",
  "firmware": "50.9.1-4296",
  "status": "OK"
}
```

`slot` is required. Serial generated if absent: `{board_serial}-PCIE-{slot}`.

`device_class` values: `NetworkController`, `MassStorageController`, `DisplayController`, etc.

LOT format: `PCIE_{DEVICE_CLASS}_{MODEL_NORMALIZED}` → e.g. `PCIE_NETWORK_CONNECTX5`

### Firmware

```json
[
  { "device_name": "BIOS", "version": "06.08.05" },
  { "device_name": "BMC", "version": "5.17.00" }
]
```

Both fields are required. Changes trigger `FIRMWARE_CHANGED` timeline events.

---

### Import process (Reanimator side)

1. Validate `collected_at` (RFC3339) and `hardware.board.serial_number`.
2. Find or create an Asset by `board.serial_number` → `vendor_serial`.
3. For each component: filter out `present=false`, auto-determine the LOT, find or create the Component,
   create an Observation, and update Installations.
4. Detect removed components (present in the previous snapshot, absent in the current one) → close the Installation.
5. Generate timeline events: `LOG_COLLECTED`, `INSTALLED`, `REMOVED`, `FIRMWARE_CHANGED`.

**Idempotency:** Repeated import of the same snapshot (same content hash) returns `200 OK`
with `"duplicate": true` and does not create duplicate records.

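The idempotency rule can be sketched with a content hash over the request body. This is a minimal illustration, not Reanimator's implementation: the `ingestor` type, its in-memory `seen` map, and the use of SHA-256 specifically are assumptions standing in for whatever hash and store the real receiver uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ingestor remembers the content hash of every snapshot it has accepted.
type ingestor struct{ seen map[string]bool }

// ingest returns true when the exact same payload was already imported,
// mirroring the "duplicate": true / 200 OK behavior described above.
func (in *ingestor) ingest(body []byte) (duplicate bool) {
	sum := sha256.Sum256(body)
	key := hex.EncodeToString(sum[:])
	if in.seen[key] {
		return true // same content hash → no new records
	}
	in.seen[key] = true
	return false // new snapshot → full import
}

func main() {
	in := &ingestor{seen: map[string]bool{}}
	snap := []byte(`{"collected_at":"2026-02-10T15:30:00Z"}`)
	fmt.Println(in.ingest(snap), in.ingest(snap)) // false true
}
```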
### Reanimator API endpoint

```http
POST /ingest/hardware
Content-Type: application/json
```

**Success (201):**
```json
{
  "status": "success",
  "bundle_id": "lb_01J...",
  "asset_id": "mach_01J...",
  "collected_at": "2026-02-10T15:30:00Z",
  "duplicate": false,
  "summary": {
    "parts_observed": 15,
    "parts_created": 2,
    "installations_created": 2,
    "timeline_events_created": 9
  }
}
```

**Duplicate (200):**
```json
{ "status": "success", "duplicate": true, "message": "LogBundle with this content hash already exists" }
```

**Error (400):**
```json
{ "status": "error", "error": "validation_failed", "details": { "field": "...", "message": "..." } }
```

Common `400` causes:
- Unknown JSON field (strict decoder)
- Wrong key name (e.g. `targetHost` instead of `target_host`)
- Invalid `collected_at` format (must be RFC3339)
- Empty `hardware.board.serial_number`

### LOT normalization rules

1. Remove the special characters `( ) - ® ™`; replace spaces with `_`
2. Uppercase everything
3. Collapse multiple underscores into one
4. Strip common prefixes like `MODEL:`, `PN:`

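The four rules above can be sketched in Go as follows. The function name `normalizeLOT` and the exact prefix list are illustrative assumptions; this sketch applies the prefix-stripping step first so the prefix match is case-exact, which is a design choice of the sketch rather than a documented ordering.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	lotPrefixRE     = regexp.MustCompile(`^(MODEL:|PN:)\s*`)
	multiUnderscore = regexp.MustCompile(`_+`)
)

// normalizeLOT applies the documented LOT normalization rules.
func normalizeLOT(s string) string {
	s = lotPrefixRE.ReplaceAllString(strings.TrimSpace(s), "") // 4. strip common prefixes
	repl := strings.NewReplacer("(", "", ")", "", "-", "", "®", "", "™", "", " ", "_")
	s = repl.Replace(s)                          // 1. drop special chars, spaces → _
	s = strings.ToUpper(s)                       // 2. uppercase
	s = multiUnderscore.ReplaceAllString(s, "_") // 3. collapse underscores
	return s
}

func main() {
	fmt.Println(normalizeLOT("Intel® Xeon® Gold 6530")) // INTEL_XEON_GOLD_6530
}
```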
### Status values

| Value | Meaning | Action |
|-------|---------|--------|
| `OK` | Normal | — |
| `Warning` | Degraded | Create `COMPONENT_WARNING` event (optional) |
| `Critical` | Failed | Auto-create `failure_event`, create `COMPONENT_FAILED` event |
| `Unknown` | Not determinable | Treat as working |
| `Empty` | Slot unpopulated | No component created (memory/PCIe only) |

### Missing field handling

| Field | Fallback |
|-------|----------|
| CPU serial | Generated: `{board_serial}-CPU-{socket}` |
| PCIe serial | Generated: `{board_serial}-PCIE-{slot}` |
| Other serial | Component skipped if absent |
| manufacturer (PCIe) | Looked up from `vendor_id` (8086→Intel, 10de→NVIDIA, 15b3→Mellanox…) |
| status | Treated as `Unknown` |
| firmware | No `FIRMWARE_CHANGED` event |

89
bible-local/08-build-release.md
Normal file
@@ -0,0 +1,89 @@

# 08 — Build & Release

## CLI flags

Defined in `cmd/logpile/main.go`:

| Flag | Default | Description |
|------|---------|-------------|
| `--port` | `8082` | HTTP server port |
| `--file` | — | Reserved for archive preload (not active) |
| `--version` | — | Print version and exit |
| `--no-browser` | — | Do not open browser on start |
| `--hold-on-crash` | `true` on Windows | Keep console open on fatal crash for debugging |

## Build

```bash
# Local binary (current OS/arch)
make build
# Output: bin/logpile

# Cross-platform binaries
make build-all
# Output:
#   bin/logpile-linux-amd64
#   bin/logpile-linux-arm64
#   bin/logpile-darwin-amd64
#   bin/logpile-darwin-arm64
#   bin/logpile-windows-amd64.exe
```

Both `make build` and `make build-all` run `scripts/update-pci-ids.sh --best-effort`
before compilation to sync `pci.ids` from the submodule.

To skip the PCI IDs update:
```bash
SKIP_PCI_IDS_UPDATE=1 make build
```

Build flags: `CGO_ENABLED=0` — fully static binary, no C runtime dependency.

## PCI IDs submodule

Source: `third_party/pciids` (git submodule → `github.com/pciutils/pciids`)
Local copy embedded at build time: `internal/parser/vendors/pciids/pci.ids`

```bash
# Manual update
make update-pci-ids

# Init submodule after a fresh clone
git submodule update --init third_party/pciids
```

## Release process

```bash
scripts/release.sh
```

What it does:
1. Reads the version from `git describe --tags`
2. Validates a clean working tree (override: `ALLOW_DIRTY=1`)
3. Sets stable `GOPATH` / `GOCACHE` / `GOTOOLCHAIN` env
4. Creates the `releases/{VERSION}/` directory
5. Generates a `RELEASE_NOTES.md` template if not present
6. Builds `darwin-arm64` and `windows-amd64` binaries
7. Packages all binaries found in `bin/` as `.tar.gz` / `.zip`
8. Generates `SHA256SUMS.txt`
9. Prints next steps (tag, push, create the release manually)

The release notes template is created in `releases/{VERSION}/RELEASE_NOTES.md`.

## Running

```bash
./bin/logpile
./bin/logpile --port 9090
./bin/logpile --no-browser
./bin/logpile --version
./bin/logpile --hold-on-crash   # keep console open on crash (default on Windows)
```

## macOS Gatekeeper

After downloading a binary, remove the quarantine attribute:
```bash
xattr -d com.apple.quarantine /path/to/logpile-darwin-arm64
```

43
bible-local/09-testing.md
Normal file
@@ -0,0 +1,43 @@

# 09 — Testing

## Required before merge

```bash
go test ./...
```

All tests must pass before any change is merged.

## Where to add tests

| Change area | Test location |
|-------------|---------------|
| Collectors | `internal/collector/*_test.go` |
| HTTP handlers | `internal/server/*_test.go` |
| Exporters | `internal/exporter/*_test.go` |
| Parsers | `internal/parser/vendors/<vendor>/*_test.go` |

## Exporter tests
|
||||
|
||||
The Reanimator exporter has comprehensive coverage:
|
||||
|
||||
| Test file | Coverage |
|
||||
|-----------|----------|
|
||||
| `reanimator_converter_test.go` | Unit tests per conversion function |
|
||||
| `reanimator_integration_test.go` | Full export with realistic `AnalysisResult` |
|
||||
|
||||
Run exporter tests only:
|
||||
```bash
|
||||
go test ./internal/exporter/...
|
||||
go test ./internal/exporter/... -v -run Reanimator
|
||||
go test ./internal/exporter/... -cover
|
||||
```
|
||||
|
||||
## Guidelines
|
||||
|
||||
- Prefer table-driven tests for parsing logic (multiple input variants).
|
||||
- Do not rely on network access in unit tests.
|
||||
- Test both the happy path and edge cases (missing fields, empty collections).
|
||||
- When adding a new vendor parser, include at minimum:
|
||||
- `Detect()` test with a positive and a negative sample file list.
|
||||
- `Parse()` test with a minimal but representative archive.
|
||||
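The table-driven guideline above can be sketched as a self-contained toy; `detect` here is a stand-in with an invented scoring rule, not the project's real `Detect()` signature:

```go
package main

import (
	"fmt"
	"strings"
)

// detect is a stand-in for a vendor parser's Detect(): it scores an
// archive's file list from 0 to 100. The filename and score are
// illustrative assumptions for this sketch.
func detect(files []string) int {
	for _, f := range files {
		if strings.EqualFold(f, "CDump.txt") {
			return 70 // confident vendor match
		}
	}
	return 0
}

func main() {
	// Table-driven cases: multiple input variants, one loop body.
	cases := []struct {
		name  string
		files []string
		want  int
	}{
		{"positive sample", []string{"CDump.txt", "fru.bin"}, 70},
		{"negative sample", []string{"readme.md"}, 0},
	}
	for _, tc := range cases {
		got := detect(tc.files)
		fmt.Printf("%s: got=%d want=%d ok=%t\n", tc.name, got, tc.want, got == tc.want)
	}
}
```

In a real `*_test.go` file the loop body would call `t.Run(tc.name, ...)` and fail via `t.Fatalf` instead of printing.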
256
bible-local/10-decisions.md
Normal file
@@ -0,0 +1,256 @@
# 10 — Architectural Decision Log (ADL)

> **Rule:** Every significant architectural decision **must be recorded here** before or alongside
> the code change. This applies to humans and AI assistants alike.
>
> Format: date · title · context · decision · consequences

---

## ADL-001 — In-memory only state (no database)

**Date:** project start
**Context:** LOGPile is designed as a standalone diagnostic tool, not a persistent service.
**Decision:** All parsed/collected data lives in `Server.result` (in-memory). No database, no files written.
**Consequences:**
- Data is lost on process restart — intentional.
- Simple deployment: single binary, no setup required.
- JSON export is the persistence mechanism for users who want to save results.

---

## ADL-002 — Vendor parser auto-registration via init()

**Date:** project start
**Context:** Need an extensible parser registry without a central factory function.
**Decision:** Each vendor parser registers itself in its package's `init()` function.
`vendors/vendors.go` holds blank imports to trigger registration.
**Consequences:**
- Adding a new parser requires only: implement interface + add one blank import.
- No central list to maintain (other than the import file).
- `go test ./...` will include new parsers automatically.
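The registration pattern can be sketched as follows. This is a single-file toy: the real `Parser` interface in `internal/parser` has more methods, and each `init()` lives in its own vendor package triggered by a blank import in `vendors/vendors.go`:

```go
package main

import "fmt"

// Parser mirrors the registry idea from ADL-002; the real interface
// also declares Detect and Parse.
type Parser interface{ Name() string }

var registry []Parser

// Register is called from each vendor package's init().
func Register(p Parser) { registry = append(registry, p) }

type dellParser struct{}

func (dellParser) Name() string { return "dell" }

// In the project this init() sits in the vendor package and runs when
// vendors/vendors.go blank-imports that package.
func init() { Register(dellParser{}) }

func main() {
	for _, p := range registry {
		fmt.Println("registered:", p.Name())
	}
}
```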
---

## ADL-003 — Highest-confidence parser wins

**Date:** project start
**Context:** Multiple parsers may partially match an archive (e.g. generic + specific vendor).
**Decision:** Run all parsers' `Detect()`, select the one returning the highest score (0–100).
**Consequences:**
- Generic fallback (score 15) only activates when no vendor parser scores higher.
- Parsers must be conservative with high scores (70+) to avoid false positives.
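The selection rule reduces to a max-score scan; parser names and scores below are illustrative:

```go
package main

import "fmt"

// detectScore pairs a parser name with its Detect() confidence (0–100).
type detectScore struct {
	name  string
	score int
}

// bestParser implements the ADL-003 rule: run every parser's Detect()
// and pick the highest score.
func bestParser(scores []detectScore) detectScore {
	best := scores[0]
	for _, s := range scores[1:] {
		if s.score > best.score {
			best = s
		}
	}
	return best
}

func main() {
	scores := []detectScore{
		{"generic", 15}, // generic fallback score from ADL-003
		{"dell", 85},    // hypothetical confident vendor match
	}
	fmt.Println(bestParser(scores).name)
}
```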
---

## ADL-004 — Canonical hardware.devices as single source of truth

**Date:** v1.5.0
**Context:** UI tabs and Reanimator exporter were reading from different sub-fields of
`AnalysisResult`, causing potential drift.
**Decision:** Introduce `hardware.devices` as the canonical inventory repository.
All UI tabs and all exporters must read exclusively from this repository.
**Consequences:**
- Any UI vs Reanimator discrepancy is classified as a bug, not a "known difference".
- Deduplication logic runs once in the repository builder (serial → bdf → distinct).
- New hardware attributes must be added to canonical schema first, then mapped to consumers.

---

## ADL-005 — No hardcoded PCI model strings; use pci.ids

**Date:** v1.5.0
**Context:** NVIDIA and other vendors release new GPU models frequently; hardcoded maps
required code changes for each new model ID.
**Decision:** Use the `pciutils/pciids` database (git submodule, embedded at build time).
PCI vendor/device ID → human-readable model name via lookup.
**Consequences:**
- New GPU models can be supported by updating `pci.ids` without code changes.
- `make build` auto-syncs `pci.ids` from submodule before compilation.
- External override via `LOGPILE_PCI_IDS_PATH` env var.

---

## ADL-006 — Reanimator export uses canonical hardware.devices (not raw sub-fields)

**Date:** v1.5.0
**Context:** Early Reanimator exporter read from `Hardware.GPUs`, `Hardware.NICs`, etc.
directly, diverging from UI data.
**Decision:** Reanimator exporter must use `hardware.devices` — the same source as the UI.
Exporter groups/filters canonical records by section; does not rebuild from sub-fields.
**Consequences:**
- Guarantees UI and export consistency.
- Exporter code is simpler — mainly a filter+map, not a data reconstruction.

---

## ADL-007 — Documentation language is English

**Date:** 2026-02-20
**Context:** Codebase documentation was mixed Russian/English, reducing clarity for
international contributors and AI assistants.
**Decision:** All maintained project documentation (`docs/bible/`, `README.md`,
`CLAUDE.md`, and new technical docs) must be written in English.
**Consequences:**
- Bible is authoritative in English.
- AI assistants get consistent, unambiguous context.

---

## ADL-008 — Bible is the single source of truth for architecture docs

**Date:** 2026-02-23
**Context:** Architecture information was duplicated across `README.md`, `CLAUDE.md`,
and the Bible, creating drift risk and stale guidance for humans and AI agents.
**Decision:** Keep architecture and technical design documentation only in `docs/bible/`.
Top-level `README.md` and `CLAUDE.md` must remain minimal pointers/instructions.
**Consequences:**
- Reduces documentation drift and duplicate updates.
- AI assistants are directed to one authoritative source before making changes.
- Documentation updates that affect architecture must include Bible changes (and ADL entries when significant).

---

## ADL-009 — Redfish analysis is performed from raw snapshot replay (unified tunnel)

**Date:** 2026-02-24
**Context:** Live Redfish collection and raw export re-analysis used different parsing paths,
which caused drift and made bug fixes difficult to validate consistently.
**Decision:** Redfish live collection must produce a `raw_payloads.redfish_tree` snapshot first,
then run the same replay analyzer used for imported raw exports.
**Consequences:**
- Same `redfish_tree` input produces the same parsed result in live and offline modes.
- Debugging parser issues can be done against exported raw bundles without live BMC access.
- Snapshot completeness becomes critical; collector seeds/limits are part of analyzer correctness.

---

## ADL-010 — Raw export is a self-contained re-analysis package (not a final result dump)

**Date:** 2026-02-24
**Context:** Exporting only normalized `AnalysisResult` loses raw source fidelity and prevents
future parser improvements from being applied to already collected data.
**Decision:** `Export Raw Data` produces a self-contained raw package (JSON or ZIP bundle)
that the application can reopen and re-analyze. Parsed data in the package is optional and not
the source of truth on import.
**Consequences:**
- Re-opening an export always re-runs analysis from raw source (`redfish_tree` or uploaded file bytes).
- Raw bundles include collection context and diagnostics for debugging (`collect.log`, `parser_fields.json`).
- Endpoint compatibility is preserved (`/api/export/json`) while actual payload format may be a bundle.

---

## ADL-011 — Redfish snapshot crawler is bounded, prioritized, and failure-tolerant

**Date:** 2026-02-24
**Context:** Full Redfish trees on modern GPU systems are large, noisy, and contain many
vendor-specific or non-fetchable links. Unbounded crawling and naive queue design caused hangs
and incomplete snapshots.
**Decision:** Use a bounded snapshot crawler with:
- explicit document cap (`LOGPILE_REDFISH_SNAPSHOT_MAX_DOCS`)
- priority seed paths (PCIe/Fabrics/Firmware/Storage/PowerSubsystem/ThermalSubsystem)
- normalized `@odata.id` paths (strip `#fragment`)
- noisy expected error filtering (404/405/410/501 hidden from UI)
- queue capacity sized to crawl cap to avoid producer/consumer deadlock
**Consequences:**
- Snapshot collection remains stable on large BMC trees.
- Most high-value inventory paths are reached before the cap.
- UI progress remains useful while debug logs retain low-level fetch failures.

---

## ADL-012 — Vendor-specific storage inventory probing is allowed as fallback

**Date:** 2026-02-24
**Context:** Some Supermicro BMCs expose empty standard `Storage/.../Drives` collections while
real disk inventory exists under vendor-specific `Disk.Bay` endpoints and enclosure links.
**Decision:** When standard drive collections are empty, collector/replay may probe vendor-style
`.../Drives/Disk.Bay.*` endpoints and follow `Storage.Links.Enclosures[*]` to recover physical drives.
**Consequences:**
- Higher storage inventory coverage on Supermicro HBA/HA-RAID/MRVL/NVMe backplane implementations.
- Replay must mirror the same probing behavior to preserve deterministic results.
- Probing remains bounded (finite candidate set) to avoid runaway requests.

---

## ADL-013 — PowerSubsystem is preferred over legacy Power on newer Redfish implementations

**Date:** 2026-02-24
**Context:** X14+/newer Redfish implementations increasingly expose authoritative PSU data in
`PowerSubsystem/PowerSupplies`, while legacy `/Power` may be incomplete or schema-shifted.
**Decision:** Prefer `Chassis/*/PowerSubsystem/PowerSupplies` as the primary PSU source and use
legacy `Chassis/*/Power` as fallback.
**Consequences:**
- Better compatibility with newer BMC firmware generations.
- Legacy systems remain supported without special-case collector selection.
- Snapshot priority seeds must include `PowerSubsystem` resources.

---

## ADL-014 — Threshold logic lives on the server; UI reflects status only

**Date:** 2026-02-24
**Context:** Duplicating threshold math in frontend and backend creates drift and inconsistent
highlighting (e.g. PSU mains voltage range checks).
**Decision:** Business threshold evaluation (e.g. PSU voltage nominal range) must be computed on
the server; frontend only renders status/flags returned by the API.
**Consequences:**
- Single source of truth for threshold policies.
- UI can evolve visually without re-implementing domain logic.
- API payloads may carry richer status semantics over time.

---

## ADL-015 — Supermicro crashdump archive parser removed from active registry

**Date:** 2026-03-01
**Context:** The Supermicro crashdump parser (`SMC Crash Dump Parser`) produced low-value
results for current workflows and was explicitly rejected as a supported archive path.
**Decision:** Remove `supermicro` vendor parser from active registration and project source.
Do not include it in `/api/parsers` output or parser documentation matrix.
**Consequences:**
- Supermicro crashdump archives (`CDump.txt` format) are no longer parsed by a dedicated vendor parser.
- Such archives fall back to other matching parsers (typically `generic`) unless a new replacement parser is added.
- Reintroduction requires a new parser package and an explicit registry import in `vendors/vendors.go`.

---

## ADL-016 — Device-bound firmware must not appear in hardware.firmware

**Date:** 2026-03-01
**Context:** Dell TSR `DCIM_SoftwareIdentity` lists firmware for every component (NICs,
PSUs, disks, backplanes) in addition to system-level firmware. Naively importing all entries
into `Hardware.Firmware` caused device firmware to appear twice in Reanimator: once in the
device's own record and again in the top-level firmware list.
**Decision:**
- `Hardware.Firmware` contains only system-level firmware (BIOS, BMC/iDRAC, CPLD,
  Lifecycle Controller, storage controllers, BOSS).
- Device-bound entries (NIC, PSU, Disk, Backplane, GPU) must not be added to
  `Hardware.Firmware`.
- Parsers must store the FQDD (or equivalent slot identifier) in `FirmwareInfo.Description`
  so the Reanimator exporter can filter by FQDD prefix.
- The exporter's `isDeviceBoundFirmwareFQDD()` function performs this filter.
**Consequences:**
- Any new parser that ingests a per-device firmware inventory must follow the same rule.
- Device firmware is accessible only via the device's own record, not the firmware list.
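The FQDD prefix filter can be sketched as below. The prefix set and sample FQDDs are assumptions for illustration; the exporter's actual `isDeviceBoundFirmwareFQDD()` may use a different list:

```go
package main

import (
	"fmt"
	"strings"
)

// isDeviceBoundFirmwareFQDD sketches the ADL-016 filter: firmware whose
// FQDD marks it as belonging to a device (NIC, PSU, disk, ...) is kept
// out of the top-level firmware list. Prefixes here are hypothetical.
func isDeviceBoundFirmwareFQDD(fqdd string) bool {
	deviceBound := []string{"NIC.", "PSU.", "Disk.", "Enclosure.", "GPU."}
	for _, p := range deviceBound {
		if strings.HasPrefix(fqdd, p) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isDeviceBoundFirmwareFQDD("NIC.Integrated.1-1-1")) // device-bound: filtered out
	fmt.Println(isDeviceBoundFirmwareFQDD("BIOS.Setup.1-1"))       // system-level: kept
}
```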
---

## ADL-017 — Vendor-embedded MAC addresses must be stripped from model name fields

**Date:** 2026-03-01
**Context:** Dell TSR embeds MAC addresses directly in `ProductName` and `ElementName`
fields (e.g. `"NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"`).
This caused model names to contain MAC addresses in NIC model, NIC firmware device name,
and potentially other fields.
**Decision:** Strip any ` - XX:XX:XX:XX:XX:XX` suffix from all model/name string fields
at parse time before storing in any model struct. Use the regex
`\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$`.
**Consequences:**
- Model names are clean and consistent across all devices.
- All parsers must apply this stripping to any field used as a device name or model.
- Confirmed affected fields in Dell: `DCIM_NICView.ProductName`, `DCIM_SoftwareIdentity.ElementName`.
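Applied in Go, the decision's regex behaves like this (the helper name is illustrative; the pattern is the one specified above):

```go
package main

import (
	"fmt"
	"regexp"
)

// macSuffix is the exact pattern from ADL-017: a trailing
// " - XX:XX:XX:XX:XX:XX" MAC address embedded in a name field.
var macSuffix = regexp.MustCompile(`\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$`)

// stripMACSuffix cleans a model/name string at parse time.
func stripMACSuffix(name string) string {
	return macSuffix.ReplaceAllString(name, "")
}

func main() {
	in := "NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"
	fmt.Println(stripMACSuffix(in))
	// → NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF
}
```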
---

<!-- Add new decisions below this line using the format above -->
59
bible-local/README.md
Normal file
@@ -0,0 +1,59 @@
# LOGPile Bible

> **Documentation language:** English only. All maintained project documentation must be written in English.
>
> **Architectural decisions:** Every significant architectural decision **must** be recorded in
> [`10-decisions.md`](10-decisions.md) before or alongside the code change.
>
> **Single source of truth:** Architecture and technical design documentation belongs in `docs/bible/`.
> Keep `README.md` and `CLAUDE.md` minimal to avoid duplicate documentation.

This directory is the single source of truth for LOGPile's architecture, design, and integration contracts.
It is structured so that both humans and AI assistants can navigate it quickly.

---

## Reading Map (Hierarchical)

### 1. Foundations (read first)

| File | What it covers |
|------|----------------|
| [01-overview.md](01-overview.md) | Product purpose, operating modes, scope |
| [02-architecture.md](02-architecture.md) | Runtime structure, control flow, in-memory state |
| [04-data-models.md](04-data-models.md) | Core contracts (`AnalysisResult`, canonical `hardware.devices`) |

### 2. Runtime Interfaces

| File | What it covers |
|------|----------------|
| [03-api.md](03-api.md) | HTTP API contracts and endpoint behavior |
| [05-collectors.md](05-collectors.md) | Live collection connectors (Redfish, IPMI mock) |
| [06-parsers.md](06-parsers.md) | Archive parser framework and vendor parsers |
| [07-exporters.md](07-exporters.md) | CSV / JSON / Reanimator exports and integration mapping |

### 3. Delivery & Quality

| File | What it covers |
|------|----------------|
| [08-build-release.md](08-build-release.md) | Build, packaging, release workflow |
| [09-testing.md](09-testing.md) | Testing expectations and verification guidance |

### 4. Governance (always current)

| File | What it covers |
|------|----------------|
| [10-decisions.md](10-decisions.md) | Architectural Decision Log (ADL) |

---

## Quick orientation for AI assistants

- Read order for most changes: `01` → `02` → `04` → relevant interface doc(s) → `10`
- Entry point: `cmd/logpile/main.go`
- HTTP server: `internal/server/` — handlers in `handlers.go`, routes in `server.go`
- Data contracts: `internal/models/` — never break `AnalysisResult` JSON shape
- Frontend contract: `web/static/js/app.js` — keep API responses stable
- Canonical inventory: `hardware.devices` in `AnalysisResult` — source of truth for UI and exports
- Parser registry: `internal/parser/vendors/` — `init()` auto-registration pattern
- Collector registry: `internal/collector/registry.go`
@@ -40,6 +40,8 @@ func main() {
 	cfg := server.Config{
 		Port:        *port,
 		PreloadFile: *file,
+		AppVersion:  version,
+		AppCommit:   commit,
 	}

 	srv := server.New(cfg)
28
docs/test_server_collection_memory.md
Normal file
@@ -0,0 +1,28 @@
# Test Server Collection Memory

Keep this table updated after each test-server run.

Definitions:
- `Collection Time` = total Redfish collection duration from `collect.log`.
- `Speed` = `Documents / seconds`.
- `Metrics Collected` = sum of `Counts` fields (`cpus + memory + storage + pcie + gpus + nics + psus + firmware`).
- `n/a` means the log does not contain enough timestamp metadata to calculate duration/speed.
## Server Model: `NF5688M7`

| Date (UTC) | App Version | Collection Time | Documents | Speed | Metrics Collected | Notes |
|---|---|---:|---:|---:|---:|---|
| 2026-02-28 | `v1.7.1-12-g612058e` (`612058e`) | 10m10s (610s) | 228 | 0.37 docs/s | 98 | 2026-02-28 (SERVER MODEL) - 23E100043.zip |
| 2026-02-28 | `v1.7.1-11-ge0146ad` (`e0146ad`) | 9m36s (576s) | 138 | 0.24 docs/s | 110 | 2026-02-28 (SERVER MODEL) - 23E100042.zip |
| 2026-02-28 | `v1.7.1-10-g9a30705` (`9a30705`) | 20m47s (1247s) | 106 | 0.09 docs/s | 97 | 2026-02-28 (SERVER MODEL) - 23E100053.zip |
| 2026-02-28 | `v1.7.1` (`6c19a58`) | 15m08s (908s) | 184 | 0.20 docs/s | 96 | 2026-02-28 (DDR5 DIMM) - 23E100051.zip |
| 2026-02-28 | `v1.7.0` (`ddab93a`) | n/a | 193 | n/a | 61 | 2026-02-28 (NULL) - 23E100051.zip |
| 2026-02-28 | `v1.7.0` (`ddab93a`) | n/a | 291 | n/a | 61 | 2026-02-28 (NULL) - 23E100206.zip |

## Server Model: `KR1280-X2-A0-R0-00`

| Date (UTC) | App Version | Collection Time | Documents | Speed | Metrics Collected | Notes |
|---|---|---:|---:|---:|---:|---|
| 2026-02-28 | `v1.7.1-12-g612058e` (`612058e`) | 6m15s (375s) | 185 | 0.49 docs/s | 46 | 2026-02-28 (KR1280-X2-A0-R0-00) - 23D401657.zip |
| 2026-02-28 | `v1.7.1-9-g8dbbec3-dirty` (`8dbbec3`) | 6m16s (376s) | 165 | 0.44 docs/s | 46 | 2026-02-28 (KR1280-X2-A0-R0-00) - 23D401657-2.zip |
| 2026-02-28 | `v1.7.1-7-gc52fea2` (`c52fea2`) | 10m51s (651s) | 227 | 0.35 docs/s | 40 | 2026-02-28 (KR1280-X2-A0-R0-00) - 23D401657 copy.zip |
File diff suppressed because it is too large
40
internal/collector/redfish_pciids_test.go
Normal file
@@ -0,0 +1,40 @@
package collector

import (
	"strings"
	"testing"
)

func TestParseNIC_ResolvesModelFromPCIIDs(t *testing.T) {
	doc := map[string]interface{}{
		"Id":       "NIC1",
		"VendorId": "0x8086",
		"DeviceId": "0x1521",
		"Model":    "0x1521",
	}

	nic := parseNIC(doc)
	if nic.Model == "" {
		t.Fatalf("expected model resolved from pci.ids")
	}
	if !strings.Contains(strings.ToUpper(nic.Model), "I350") {
		t.Fatalf("expected I350 in model, got %q", nic.Model)
	}
}

func TestParsePCIeFunction_ResolvesDeviceClassFromPCIIDs(t *testing.T) {
	doc := map[string]interface{}{
		"Id":        "PCIE1",
		"VendorId":  "0x9005",
		"DeviceId":  "0x028f",
		"ClassCode": "0x010700",
	}

	dev := parsePCIeFunction(doc, 0)
	if dev.DeviceClass == "" || strings.EqualFold(dev.DeviceClass, "PCIe device") {
		t.Fatalf("expected device class resolved from pci.ids, got %q", dev.DeviceClass)
	}
	if strings.HasPrefix(strings.ToLower(strings.TrimSpace(dev.DeviceClass)), "0x") {
		t.Fatalf("expected resolved name instead of raw hex, got %q", dev.DeviceClass)
	}
}
1520
internal/collector/redfish_replay.go
Normal file
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -3,9 +3,8 @@ package exporter
 import (
 	"encoding/csv"
 	"encoding/json"
-	"fmt"
 	"io"
-	"text/tabwriter"
+	"strings"

 	"git.mchus.pro/mchus/logpile/internal/models"
 )
@@ -36,7 +35,7 @@ func (e *Exporter) ExportCSV(w io.Writer) error {

 	// FRU data
 	for _, fru := range e.result.FRU {
-		if fru.SerialNumber == "" {
+		if !hasUsableSerial(fru.SerialNumber) {
 			continue
 		}
 		name := fru.ProductName
@@ -55,9 +54,36 @@ func (e *Exporter) ExportCSV(w io.Writer) error {

 	// Hardware data
 	if e.result.Hardware != nil {
+		// Board
+		if hasUsableSerial(e.result.Hardware.BoardInfo.SerialNumber) {
+			if err := writer.Write([]string{
+				e.result.Hardware.BoardInfo.ProductName,
+				strings.TrimSpace(e.result.Hardware.BoardInfo.SerialNumber),
+				e.result.Hardware.BoardInfo.Manufacturer,
+				"Board",
+			}); err != nil {
+				return err
+			}
+		}
+
+		// CPUs
+		for _, cpu := range e.result.Hardware.CPUs {
+			if !hasUsableSerial(cpu.SerialNumber) {
+				continue
+			}
+			if err := writer.Write([]string{
+				cpu.Model,
+				strings.TrimSpace(cpu.SerialNumber),
+				"",
+				"CPU",
+			}); err != nil {
+				return err
+			}
+		}
+
 		// Memory
 		for _, mem := range e.result.Hardware.Memory {
-			if mem.SerialNumber == "" {
+			if !hasUsableSerial(mem.SerialNumber) {
 				continue
 			}
 			location := mem.Location
@@ -66,7 +92,7 @@ func (e *Exporter) ExportCSV(w io.Writer) error {
 			}
 			if err := writer.Write([]string{
 				mem.PartNumber,
-				mem.SerialNumber,
+				strings.TrimSpace(mem.SerialNumber),
 				mem.Manufacturer,
 				location,
 			}); err != nil {
@@ -76,12 +102,12 @@ func (e *Exporter) ExportCSV(w io.Writer) error {

 		// Storage
 		for _, stor := range e.result.Hardware.Storage {
-			if stor.SerialNumber == "" {
+			if !hasUsableSerial(stor.SerialNumber) {
 				continue
 			}
 			if err := writer.Write([]string{
 				stor.Model,
-				stor.SerialNumber,
+				strings.TrimSpace(stor.SerialNumber),
 				stor.Manufacturer,
 				stor.Slot,
 			}); err != nil {
@@ -89,20 +115,88 @@ func (e *Exporter) ExportCSV(w io.Writer) error {
 			}
 		}

+		// GPUs
+		for _, gpu := range e.result.Hardware.GPUs {
+			if !hasUsableSerial(gpu.SerialNumber) {
+				continue
+			}
+			component := gpu.Model
+			if component == "" {
+				component = "GPU"
+			}
+			if err := writer.Write([]string{
+				component,
+				strings.TrimSpace(gpu.SerialNumber),
+				gpu.Manufacturer,
+				gpu.Slot,
+			}); err != nil {
+				return err
+			}
+		}
+
 		// PCIe devices
 		for _, pcie := range e.result.Hardware.PCIeDevices {
-			if pcie.SerialNumber == "" {
+			if !hasUsableSerial(pcie.SerialNumber) {
 				continue
 			}
 			if err := writer.Write([]string{
 				pcie.DeviceClass,
-				pcie.SerialNumber,
+				strings.TrimSpace(pcie.SerialNumber),
 				pcie.Manufacturer,
 				pcie.Slot,
 			}); err != nil {
 				return err
 			}
 		}

+		// Network adapters
+		for _, nic := range e.result.Hardware.NetworkAdapters {
+			if !hasUsableSerial(nic.SerialNumber) {
+				continue
+			}
+			location := nic.Location
+			if location == "" {
+				location = nic.Slot
+			}
+			if err := writer.Write([]string{
+				nic.Model,
+				strings.TrimSpace(nic.SerialNumber),
+				nic.Vendor,
+				location,
+			}); err != nil {
+				return err
+			}
+		}
+
+		// Legacy network cards
+		for _, nic := range e.result.Hardware.NetworkCards {
+			if !hasUsableSerial(nic.SerialNumber) {
+				continue
+			}
+			if err := writer.Write([]string{
+				nic.Model,
+				strings.TrimSpace(nic.SerialNumber),
+				"",
+				"Network",
+			}); err != nil {
+				return err
+			}
+		}
+
+		// Power supplies
+		for _, psu := range e.result.Hardware.PowerSupply {
+			if !hasUsableSerial(psu.SerialNumber) {
+				continue
+			}
+			if err := writer.Write([]string{
+				psu.Model,
+				strings.TrimSpace(psu.SerialNumber),
+				psu.Vendor,
+				psu.Slot,
+			}); err != nil {
+				return err
+			}
+		}
 	}

 	return nil
@@ -115,220 +209,15 @@ func (e *Exporter) ExportJSON(w io.Writer) error {
 	return encoder.Encode(e.result)
 }

-// ExportTXT exports a human-readable text report
-func (e *Exporter) ExportTXT(w io.Writer) error {
-	fmt.Fprintln(w, "LOGPile Analysis Report - mchus.pro")
-	fmt.Fprintln(w, "====================================")
-	fmt.Fprintln(w)
-
-	if e.result == nil {
-		fmt.Fprintln(w, "No data loaded.")
-		return nil
+func hasUsableSerial(serial string) bool {
+	s := strings.TrimSpace(serial)
+	if s == "" {
+		return false
 	}
-
-	fmt.Fprintf(w, "File:\t%s\n", e.result.Filename)
-	fmt.Fprintf(w, "Source:\t%s\n", e.result.SourceType)
-	fmt.Fprintf(w, "Protocol:\t%s\n", e.result.Protocol)
-	fmt.Fprintf(w, "Target:\t%s\n", e.result.TargetHost)
-	fmt.Fprintln(w)
-
-	// Server model and serial number
-	if e.result.Hardware != nil && e.result.Hardware.BoardInfo.ProductName != "" {
-		fmt.Fprintf(w, "Server Model:\t%s\n", e.result.Hardware.BoardInfo.ProductName)
-		fmt.Fprintf(w, "Serial Number:\t%s\n", e.result.Hardware.BoardInfo.SerialNumber)
+	switch strings.ToUpper(s) {
+	case "N/A", "NA", "NONE", "NULL", "UNKNOWN", "-":
+		return false
+	default:
+		return true
 	}
+}
-	fmt.Fprintln(w)
-
-	// Hardware summary
-	if e.result.Hardware != nil {
-		hw := e.result.Hardware
-
-		// Firmware tab
-		if len(hw.Firmware) > 0 {
-			fmt.Fprintln(w, "FIRMWARE VERSIONS")
-			fmt.Fprintln(w, "-----------------")
-			tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
-			fmt.Fprintln(tw, "Component\tVersion\tBuild Time")
-			for _, fw := range hw.Firmware {
-				fmt.Fprintf(tw, "%s\t%s\t%s\n", fw.DeviceName, fw.Version, fw.BuildTime)
-			}
-			_ = tw.Flush()
-			fmt.Fprintln(w)
-		}
-
-		// CPU tab
-		if len(hw.CPUs) > 0 {
-			fmt.Fprintln(w, "PROCESSORS")
-			fmt.Fprintln(w, "----------")
-			tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
-			fmt.Fprintln(tw, "Socket\tModel\tCores\tThreads\tFreq MHz\tTurbo MHz\tTDP W\tPPIN/SN")
-			for _, cpu := range hw.CPUs {
-				id := cpu.SerialNumber
-				if id == "" {
-					id = cpu.PPIN
-				}
-				fmt.Fprintf(tw, "CPU%d\t%s\t%d\t%d\t%d\t%d\t%d\t%s\n",
-					cpu.Socket, cpu.Model, cpu.Cores, cpu.Threads, cpu.FrequencyMHz, cpu.MaxFreqMHz, cpu.TDP, id)
-			}
-			_ = tw.Flush()
-			fmt.Fprintln(w)
-		}
-
-		// Memory tab
-		if len(hw.Memory) > 0 {
-			fmt.Fprintln(w, "MEMORY")
-			fmt.Fprintln(w, "------")
-			tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
-			fmt.Fprintln(tw, "Slot\tPresent\tSize MB\tType\tSpeed MHz\tVendor\tModel/PN\tSerial\tStatus")
-			for _, mem := range hw.Memory {
-				location := mem.Location
-				if location == "" {
-					location = mem.Slot
-				}
-				fmt.Fprintf(tw, "%s\t%t\t%d\t%s\t%d\t%s\t%s\t%s\t%s\n",
-					location, mem.Present, mem.SizeMB, mem.Type, mem.CurrentSpeedMHz, mem.Manufacturer, mem.PartNumber, mem.SerialNumber, mem.Status)
-			}
-			_ = tw.Flush()
-			fmt.Fprintln(w)
-		}
-
-		// Power tab
-		if len(hw.PowerSupply) > 0 {
-			fmt.Fprintln(w, "POWER SUPPLIES")
-			fmt.Fprintln(w, "--------------")
-			tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
-			fmt.Fprintln(tw, "Slot\tPresent\tVendor\tModel\tWattage W\tInput W\tOutput W\tInput V\tTemp C\tStatus\tSerial")
-			for _, psu := range hw.PowerSupply {
-				fmt.Fprintf(tw, "%s\t%t\t%s\t%s\t%d\t%d\t%d\t%.0f\t%d\t%s\t%s\n",
-					psu.Slot, psu.Present, psu.Vendor, psu.Model, psu.WattageW, psu.InputPowerW, psu.OutputPowerW, psu.InputVoltage, psu.TemperatureC, psu.Status, psu.SerialNumber)
-			}
-			_ = tw.Flush()
-			fmt.Fprintln(w)
-		}
-
-		// Storage tab
-		if len(hw.Storage) > 0 {
-			fmt.Fprintln(w, "STORAGE")
-			fmt.Fprintln(w, "-------")
-			tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
-			fmt.Fprintln(tw, "Slot\tPresent\tType\tInterface\tModel\tSize GB\tVendor\tFirmware\tSerial")
-			for _, stor := range hw.Storage {
-				fmt.Fprintf(tw, "%s\t%t\t%s\t%s\t%s\t%d\t%s\t%s\t%s\n",
-					stor.Slot, stor.Present, stor.Type, stor.Interface, stor.Model, stor.SizeGB, stor.Manufacturer, stor.Firmware, stor.SerialNumber)
-			}
-			_ = tw.Flush()
-			fmt.Fprintln(w)
-		}
-
-		// GPU tab
-		if len(hw.GPUs) > 0 {
-			fmt.Fprintln(w, "GPUS")
-			fmt.Fprintln(w, "----")
-			tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
-			fmt.Fprintln(tw, "Slot\tModel\tVendor\tBDF\tPCIe\tSerial\tStatus")
-			for _, gpu := range hw.GPUs {
-				link := fmt.Sprintf("x%d %s", gpu.CurrentLinkWidth, gpu.CurrentLinkSpeed)
-				if gpu.MaxLinkWidth > 0 || gpu.MaxLinkSpeed != "" {
-					link = fmt.Sprintf("%s / x%d %s", link, gpu.MaxLinkWidth, gpu.MaxLinkSpeed)
-				}
-				fmt.Fprintf(tw, "%s\t%s\t%s\t%s\t%s\t%s\t%s\n",
-					gpu.Slot, gpu.Model, gpu.Manufacturer, gpu.BDF, link, gpu.SerialNumber, gpu.Status)
-			}
-			_ = tw.Flush()
-			fmt.Fprintln(w)
-		}
-
-		// Network tab
-		if len(hw.NetworkAdapters) > 0 {
-			fmt.Fprintln(w, "NETWORK ADAPTERS")
-			fmt.Fprintln(w, "----------------")
-			tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
-			fmt.Fprintln(tw, "Slot\tLocation\tModel\tVendor\tPorts\tType\tStatus\tSerial")
-			for _, nic := range hw.NetworkAdapters {
-				fmt.Fprintf(tw, "%s\t%s\t%s\t%s\t%d\t%s\t%s\t%s\n",
-					nic.Slot, nic.Location, nic.Model, nic.Vendor, nic.PortCount, nic.PortType, nic.Status, nic.SerialNumber)
-			}
-			_ = tw.Flush()
-			fmt.Fprintln(w)
-		}
-
-		// Device inventory tab
-		if len(hw.PCIeDevices) > 0 {
-			fmt.Fprintln(w, "PCIE DEVICES")
-			fmt.Fprintln(w, "------------")
-			tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
-			fmt.Fprintln(tw, "Slot\tBDF\tClass\tVendor\tVID:DID\tLink\tSerial")
-			for _, pcie := range hw.PCIeDevices {
-				fmt.Fprintf(tw, "%s\t%s\t%s\t%s\t%04x:%04x\tx%d %s / x%d %s\t%s\n",
-					pcie.Slot, pcie.BDF, pcie.DeviceClass, pcie.Manufacturer, pcie.VendorID, pcie.DeviceID,
-					pcie.LinkWidth, pcie.LinkSpeed, pcie.MaxLinkWidth, pcie.MaxLinkSpeed, pcie.SerialNumber)
-			}
-			_ = tw.Flush()
-			fmt.Fprintln(w)
-		}
-	}
-
-	// Sensors tab
-	if len(e.result.Sensors) > 0 {
-		fmt.Fprintln(w, "SENSOR READINGS")
-		fmt.Fprintln(w, "---------------")
-		tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
-		fmt.Fprintln(tw, "Type\tName\tValue\tUnit\tRaw\tStatus")
-		for _, s := range e.result.Sensors {
-			fmt.Fprintf(tw, "%s\t%s\t%.0f\t%s\t%s\t%s\n", s.Type, s.Name, s.Value, s.Unit, s.RawValue, s.Status)
-		}
-		_ = tw.Flush()
-		fmt.Fprintln(w)
-	}
-
-	// Serials/FRU tab
-	if len(e.result.FRU) > 0 {
-		fmt.Fprintln(w, "FRU COMPONENTS")
-		fmt.Fprintln(w, "--------------")
|
||||
tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
|
||||
fmt.Fprintln(tw, "Description\tManufacturer\tProduct\tSerial\tPart Number")
|
||||
for _, fru := range e.result.FRU {
|
||||
name := fru.ProductName
|
||||
if name == "" {
|
||||
name = fru.Description
|
||||
}
|
||||
fmt.Fprintf(tw, "%s\t%s\t%s\t%s\t%s\n", fru.Description, fru.Manufacturer, name, fru.SerialNumber, fru.PartNumber)
|
||||
}
|
||||
_ = tw.Flush()
|
||||
fmt.Fprintln(w)
|
||||
}
|
||||
|
||||
// Events tab
|
||||
fmt.Fprintf(w, "EVENTS: %d total\n", len(e.result.Events))
|
||||
if len(e.result.Events) > 0 {
|
||||
tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0)
|
||||
fmt.Fprintln(tw, "Time\tSeverity\tSource\tType\tName\tDescription")
|
||||
for _, ev := range e.result.Events {
|
||||
fmt.Fprintf(tw, "%s\t%s\t%s\t%s\t%s\t%s\n",
|
||||
ev.Timestamp.Format("2006-01-02 15:04:05"), ev.Severity, ev.Source, ev.SensorType, ev.SensorName, ev.Description)
|
||||
}
|
||||
_ = tw.Flush()
|
||||
}
|
||||
var critical, warning, info int
|
||||
for _, ev := range e.result.Events {
|
||||
switch ev.Severity {
|
||||
case models.SeverityCritical:
|
||||
critical++
|
||||
case models.SeverityWarning:
|
||||
warning++
|
||||
case models.SeverityInfo:
|
||||
info++
|
||||
}
|
||||
}
|
||||
fmt.Fprintf(w, " Critical: %d\n", critical)
|
||||
fmt.Fprintf(w, " Warning: %d\n", warning)
|
||||
fmt.Fprintf(w, " Info: %d\n", info)
|
||||
|
||||
// Footer
|
||||
fmt.Fprintln(w)
|
||||
fmt.Fprintln(w, "------------------------------------")
|
||||
fmt.Fprintln(w, "Generated by LOGPile - mchus.pro")
|
||||
fmt.Fprintln(w, "https://git.mchus.pro/mchus/logpile")
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
79 internal/exporter/exporter_csv_test.go (Normal file)
@@ -0,0 +1,79 @@
package exporter

import (
	"bytes"
	"encoding/csv"
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestExportCSV_IncludesAllComponentTypesWithUsableSerials(t *testing.T) {
	result := &models.AnalysisResult{
		FRU: []models.FRUInfo{
			{ProductName: "FRU Board", SerialNumber: "FRU-001", Manufacturer: "ACME"},
		},
		Hardware: &models.HardwareConfig{
			BoardInfo: models.BoardInfo{
				ProductName:  "X12",
				SerialNumber: "BOARD-001",
				Manufacturer: "Supermicro",
			},
			CPUs: []models.CPU{
				{Socket: 0, Model: "Xeon", SerialNumber: "CPU-001"},
			},
			Memory: []models.MemoryDIMM{
				{Slot: "DIMM0", PartNumber: "MEM-PN", SerialNumber: "MEM-001", Manufacturer: "Samsung"},
			},
			Storage: []models.Storage{
				{Slot: "U.2-1", Model: "PM9A3", SerialNumber: "SSD-001", Manufacturer: "Samsung"},
			},
			GPUs: []models.GPU{
				{Slot: "GPU1", Model: "H200", SerialNumber: "GPU-001", Manufacturer: "NVIDIA"},
			},
			PCIeDevices: []models.PCIeDevice{
				{Slot: "PCIe1", DeviceClass: "NVSwitch", SerialNumber: "PCIE-001", Manufacturer: "NVIDIA"},
			},
			NetworkAdapters: []models.NetworkAdapter{
				{Slot: "Slot 17", Location: "#CPU0_PCIE4", Model: "I350", SerialNumber: "NIC-001", Vendor: "Intel"},
				{Slot: "Slot 18", Model: "skip-na", SerialNumber: "N/A", Vendor: "Intel"},
			},
			NetworkCards: []models.NIC{
				{Model: "Legacy NIC", SerialNumber: "LNIC-001"},
			},
			PowerSupply: []models.PSU{
				{Slot: "PSU0", Model: "GW-CRPS3000LW", SerialNumber: "PSU-001", Vendor: "Great Wall"},
			},
		},
	}

	var buf bytes.Buffer
	if err := New(result).ExportCSV(&buf); err != nil {
		t.Fatalf("ExportCSV failed: %v", err)
	}

	rows, err := csv.NewReader(bytes.NewReader(buf.Bytes())).ReadAll()
	if err != nil {
		t.Fatalf("read csv: %v", err)
	}
	if len(rows) < 2 {
		t.Fatalf("expected data rows, got %d", len(rows))
	}

	serials := make(map[string]bool)
	for _, row := range rows[1:] {
		if len(row) > 1 {
			serials[row[1]] = true
		}
	}

	want := []string{"FRU-001", "BOARD-001", "CPU-001", "MEM-001", "SSD-001", "GPU-001", "PCIE-001", "NIC-001", "LNIC-001", "PSU-001"}
	for _, sn := range want {
		if !serials[sn] {
			t.Fatalf("expected serial %s in csv export", sn)
		}
	}
	if serials["N/A"] {
		t.Fatalf("did not expect unusable serial N/A in export")
	}
}
164 internal/exporter/generate_example_test.go (Normal file)
@@ -0,0 +1,164 @@
package exporter

import (
	"encoding/json"
	"os"
	"path/filepath"
	"testing"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
)

// TestGenerateReanimatorExample generates an example reanimator.json file.
// This test is marked as skipped by default - run with: go test -v -run TestGenerateReanimatorExample
func TestGenerateReanimatorExample(t *testing.T) {
	t.Skip("Skip by default - run manually to generate example")

	// Create realistic test data matching import-example-full.json structure
	result := &models.AnalysisResult{
		Filename:    "redfish://10.10.10.103",
		SourceType:  "api",
		Protocol:    "redfish",
		TargetHost:  "10.10.10.103",
		CollectedAt: time.Date(2026, 2, 10, 15, 30, 0, 0, time.UTC),
		Hardware: &models.HardwareConfig{
			BoardInfo: models.BoardInfo{
				Manufacturer: "Supermicro",
				ProductName:  "X12DPG-QT6",
				SerialNumber: "21D634101",
				PartNumber:   "X12DPG-QT6-REV1.01",
				UUID:         "d7ef2fe5-2fd0-11f0-910a-346f11040868",
			},
			Firmware: []models.FirmwareInfo{
				{DeviceName: "BIOS", Version: "06.08.05"},
				{DeviceName: "BMC", Version: "5.17.00"},
				{DeviceName: "CPLD", Version: "01.02.03"},
			},
			CPUs: []models.CPU{
				{
					Socket:       0,
					Model:        "INTEL(R) XEON(R) GOLD 6530",
					Cores:        32,
					Threads:      64,
					FrequencyMHz: 2100,
					MaxFreqMHz:   4000,
				},
				{
					Socket:       1,
					Model:        "INTEL(R) XEON(R) GOLD 6530",
					Cores:        32,
					Threads:      64,
					FrequencyMHz: 2100,
					MaxFreqMHz:   4000,
				},
			},
			Memory: []models.MemoryDIMM{
				{
					Slot:            "CPU0_C0D0",
					Location:        "CPU0_C0D0",
					Present:         true,
					SizeMB:          32768,
					Type:            "DDR5",
					MaxSpeedMHz:     4800,
					CurrentSpeedMHz: 4800,
					Manufacturer:    "Hynix",
					SerialNumber:    "80AD032419E17CEEC1",
					PartNumber:      "HMCG88AGBRA191N",
					Status:          "OK",
				},
				{
					Slot:            "CPU1_C0D0",
					Location:        "CPU1_C0D0",
					Present:         true,
					SizeMB:          32768,
					Type:            "DDR5",
					MaxSpeedMHz:     4800,
					CurrentSpeedMHz: 4800,
					Manufacturer:    "Hynix",
					SerialNumber:    "80AD032419E17D6FBA",
					PartNumber:      "HMCG88AGBRA191N",
					Status:          "OK",
				},
			},
			Storage: []models.Storage{
				{
					Slot:         "OB01",
					Type:         "NVMe",
					Model:        "INTEL SSDPF2KX076T1",
					SizeGB:       7680,
					SerialNumber: "BTAX41900GF87P6DGN",
					Manufacturer: "Intel",
					Firmware:     "9CV10510",
					Interface:    "NVMe",
					Present:      true,
				},
				{
					Slot:         "OB02",
					Type:         "NVMe",
					Model:        "INTEL SSDPF2KX076T1",
					SizeGB:       7680,
					SerialNumber: "BTAX41900BEG7P6DGN",
					Manufacturer: "Intel",
					Firmware:     "9CV10510",
					Interface:    "NVMe",
					Present:      true,
				},
			},
			PCIeDevices: []models.PCIeDevice{
				{
					Slot:         "PCIeCard1",
					VendorID:     32902,
					DeviceID:     2912,
					BDF:          "0000:18:00.0",
					DeviceClass:  "MassStorageController",
					Manufacturer: "Intel",
					PartNumber:   "RAID Controller",
					SerialNumber: "RAID-001-12345",
					LinkWidth:    8,
					LinkSpeed:    "Gen3",
					MaxLinkWidth: 8,
					MaxLinkSpeed: "Gen3",
				},
			},
			PowerSupply: []models.PSU{
				{
					Slot:         "0",
					Present:      true,
					Model:        "GW-CRPS3000LW",
					Vendor:       "Great Wall",
					WattageW:     3000,
					SerialNumber: "2P06C102610",
					PartNumber:   "V0310C9000000000",
					Firmware:     "00.03.05",
					Status:       "OK",
					InputType:    "ACWideRange",
					InputPowerW:  137,
					OutputPowerW: 104,
					InputVoltage: 215.25,
				},
			},
		},
	}

	// Convert to Reanimator format
	reanimator, err := ConvertToReanimator(result)
	if err != nil {
		t.Fatalf("ConvertToReanimator failed: %v", err)
	}

	// Marshal to JSON with indentation
	jsonData, err := json.MarshalIndent(reanimator, "", " ")
	if err != nil {
		t.Fatalf("Failed to marshal JSON: %v", err)
	}

	// Write to example file
	examplePath := filepath.Join("../../example/docs", "export-example-logpile.json")
	if err := os.WriteFile(examplePath, jsonData, 0644); err != nil {
		t.Fatalf("Failed to write example file: %v", err)
	}

	t.Logf("Generated example file: %s", examplePath)
	t.Logf("JSON length: %d bytes", len(jsonData))
}
1598 internal/exporter/reanimator_converter.go (Normal file)
File diff suppressed because it is too large
883 internal/exporter/reanimator_converter_test.go (Normal file)
@@ -0,0 +1,883 @@
package exporter

import (
	"encoding/json"
	"strings"
	"testing"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestConvertToReanimator(t *testing.T) {
	tests := []struct {
		name    string
		input   *models.AnalysisResult
		wantErr bool
		errMsg  string
	}{
		{
			name:    "nil result",
			input:   nil,
			wantErr: true,
			errMsg:  "no data available",
		},
		{
			name: "no hardware",
			input: &models.AnalysisResult{
				Filename: "test.json",
			},
			wantErr: true,
			errMsg:  "no hardware data available",
		},
		{
			name: "no board serial",
			input: &models.AnalysisResult{
				Filename: "test.json",
				Hardware: &models.HardwareConfig{
					BoardInfo: models.BoardInfo{},
				},
			},
			wantErr: true,
			errMsg:  "board serial_number is required",
		},
		{
			name: "valid minimal data",
			input: &models.AnalysisResult{
				Filename:    "test.json",
				SourceType:  "api",
				Protocol:    "redfish",
				TargetHost:  "10.10.10.10",
				CollectedAt: time.Date(2026, 2, 10, 15, 30, 0, 0, time.UTC),
				Hardware: &models.HardwareConfig{
					BoardInfo: models.BoardInfo{
						Manufacturer: "Supermicro",
						ProductName:  "X12DPG-QT6",
						SerialNumber: "TEST123",
					},
				},
			},
			wantErr: false,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			result, err := ConvertToReanimator(tt.input)
			if tt.wantErr {
				if err == nil {
					t.Errorf("expected error containing %q, got nil", tt.errMsg)
				}
				return
			}
			if err != nil {
				t.Errorf("unexpected error: %v", err)
				return
			}
			if result == nil {
				t.Error("expected non-nil result")
				return
			}
			if result.Hardware.Board.SerialNumber != tt.input.Hardware.BoardInfo.SerialNumber {
				t.Errorf("board serial mismatch: got %q, want %q",
					result.Hardware.Board.SerialNumber,
					tt.input.Hardware.BoardInfo.SerialNumber)
			}
		})
	}
}

func TestInferCPUManufacturer(t *testing.T) {
	tests := []struct {
		model string
		want  string
	}{
		{"INTEL(R) XEON(R) GOLD 6530", "Intel"},
		{"Intel Core i9-12900K", "Intel"},
		{"AMD EPYC 7763", "AMD"},
		{"AMD Ryzen 9 5950X", "AMD"},
		{"ARM Cortex-A78", "ARM"},
		{"Ampere Altra Max", "Ampere"},
		{"Unknown CPU Model", ""},
	}

	for _, tt := range tests {
		t.Run(tt.model, func(t *testing.T) {
			got := inferCPUManufacturer(tt.model)
			if got != tt.want {
				t.Errorf("inferCPUManufacturer(%q) = %q, want %q", tt.model, got, tt.want)
			}
		})
	}
}

func TestNormalizedSerial(t *testing.T) {
	tests := []struct {
		name string
		in   string
		want string
	}{
		{
			name: "empty",
			in:   "",
			want: "",
		},
		{
			name: "n_a",
			in:   "N/A",
			want: "",
		},
		{
			name: "unknown",
			in:   "unknown",
			want: "",
		},
		{
			name: "normal",
			in:   "SN123",
			want: "SN123",
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := normalizedSerial(tt.in)
			if got != tt.want {
				t.Errorf("normalizedSerial() = %q, want %q", got, tt.want)
			}
		})
	}
}

func TestInferStorageStatus(t *testing.T) {
	tests := []struct {
		name string
		stor models.Storage
		want string
	}{
		{
			name: "present",
			stor: models.Storage{
				Present: true,
			},
			want: "Unknown",
		},
		{
			name: "not present",
			stor: models.Storage{
				Present: false,
			},
			want: "Unknown",
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := inferStorageStatus(tt.stor)
			if got != tt.want {
				t.Errorf("inferStorageStatus() = %q, want %q", got, tt.want)
			}
		})
	}
}

func TestNormalizeStatus_PassFail(t *testing.T) {
	if got := normalizeStatus("PASS", false); got != "OK" {
		t.Fatalf("expected PASS -> OK, got %q", got)
	}
	if got := normalizeStatus("FAIL", false); got != "Critical" {
		t.Fatalf("expected FAIL -> Critical, got %q", got)
	}
}

func TestConvertCPUs(t *testing.T) {
	cpus := []models.CPU{
		{
			Socket:       0,
			Model:        "INTEL(R) XEON(R) GOLD 6530",
			Cores:        32,
			Threads:      64,
			FrequencyMHz: 2100,
			MaxFreqMHz:   4000,
		},
		{
			Socket:       1,
			Model:        "AMD EPYC 7763",
			Cores:        64,
			Threads:      128,
			FrequencyMHz: 2450,
			MaxFreqMHz:   3500,
		},
	}

	result := convertCPUs(cpus, "2026-02-10T15:30:00Z")

	if len(result) != 2 {
		t.Fatalf("expected 2 CPUs, got %d", len(result))
	}

	if result[0].Manufacturer != "Intel" {
		t.Errorf("expected Intel manufacturer for first CPU, got %q", result[0].Manufacturer)
	}

	if result[1].Manufacturer != "AMD" {
		t.Errorf("expected AMD manufacturer for second CPU, got %q", result[1].Manufacturer)
	}

	if result[0].Status != "Unknown" {
		t.Errorf("expected Unknown status, got %q", result[0].Status)
	}
}

func TestConvertMemory(t *testing.T) {
	memory := []models.MemoryDIMM{
		{
			Slot:         "CPU0_C0D0",
			Present:      true,
			SizeMB:       32768,
			Type:         "DDR5",
			SerialNumber: "TEST-MEM-001",
			Status:       "OK",
		},
		{
			Slot:    "CPU0_C1D0",
			Present: false,
		},
	}

	result := convertMemory(memory, "2026-02-10T15:30:00Z")

	if len(result) != 2 {
		t.Fatalf("expected 2 memory modules, got %d", len(result))
	}

	if result[0].Status != "OK" {
		t.Errorf("expected OK status for first module, got %q", result[0].Status)
	}

	if result[1].Status != "Empty" {
		t.Errorf("expected Empty status for second module, got %q", result[1].Status)
	}
}

func TestConvertStorage(t *testing.T) {
	storage := []models.Storage{
		{
			Slot:         "OB01",
			Type:         "NVMe",
			Model:        "INTEL SSDPF2KX076T1",
			SerialNumber: "BTAX41900GF87P6DGN",
			Present:      true,
		},
		{
			Slot:         "OB02",
			Type:         "NVMe",
			Model:        "INTEL SSDPF2KX076T1",
			SerialNumber: "", // No serial - should be skipped
			Present:      true,
		},
	}

	result := convertStorage(storage, "2026-02-10T15:30:00Z")

	if len(result) != 1 {
		t.Fatalf("expected 1 storage device (skipped one without serial), got %d", len(result))
	}

	if result[0].Status != "Unknown" {
		t.Errorf("expected Unknown status, got %q", result[0].Status)
	}
}

func TestConvertPCIeDevices(t *testing.T) {
	hw := &models.HardwareConfig{
		PCIeDevices: []models.PCIeDevice{
			{
				Slot:         "PCIeCard1",
				VendorID:     32902,
				DeviceID:     2912,
				BDF:          "0000:18:00.0",
				DeviceClass:  "MassStorageController",
				Manufacturer: "Intel",
				PartNumber:   "RSP3DD080F",
				SerialNumber: "RAID-001",
			},
			{
				Slot:         "PCIeCard2",
				DeviceClass:  "NetworkController",
				Manufacturer: "Mellanox",
				SerialNumber: "", // Missing serial - should remain empty (no auto-generation)
			},
		},
		GPUs: []models.GPU{
			{
				Slot:         "GPU1",
				Model:        "NVIDIA A100",
				Manufacturer: "NVIDIA",
				SerialNumber: "GPU-001",
				Status:       "OK",
			},
		},
		NetworkAdapters: []models.NetworkAdapter{
			{
				Slot:         "NIC1",
				Model:        "ConnectX-6",
				Vendor:       "Mellanox",
				Present:      true,
				SerialNumber: "NIC-001",
			},
		},
	}

	result := convertPCIeDevices(hw, "2026-02-10T15:30:00Z")

	// Should have: 2 PCIe devices + 1 GPU + 1 NIC = 4 total
	if len(result) != 4 {
		t.Fatalf("expected 4 PCIe devices total, got %d", len(result))
	}

	// Check that serial is empty for second PCIe device (no auto-generation)
	if result[1].SerialNumber != "" {
		t.Errorf("expected empty serial for missing device serial, got %q", result[1].SerialNumber)
	}

	// Check GPU was included
	foundGPU := false
	for _, dev := range result {
		if dev.SerialNumber == "GPU-001" {
			foundGPU = true
			if dev.DeviceClass != "DisplayController" {
				t.Errorf("expected GPU device_class DisplayController, got %q", dev.DeviceClass)
			}
			break
		}
	}
	if !foundGPU {
		t.Error("expected GPU to be included in PCIe devices")
	}
}

func TestConvertPCIeDevices_NVSwitchWithoutSerialRemainsEmpty(t *testing.T) {
	hw := &models.HardwareConfig{
		Firmware: []models.FirmwareInfo{
			{
				DeviceName: "NVSwitch NVSWITCH1 (965-25612-0002-000)",
				Version:    "96.10.6D.00.01",
			},
		},
		PCIeDevices: []models.PCIeDevice{
			{
				Slot:        "NVSWITCH1",
				DeviceClass: "NVSwitch",
				BDF:         "0000:06:00.0",
				// SerialNumber empty on purpose; should remain empty.
			},
		},
	}

	result := convertPCIeDevices(hw, "2026-02-10T15:30:00Z")

	if len(result) != 1 {
		t.Fatalf("expected 1 PCIe device, got %d", len(result))
	}

	if result[0].SerialNumber != "" {
		t.Fatalf("expected empty NVSwitch serial, got %q", result[0].SerialNumber)
	}
	if result[0].Firmware != "96.10.6D.00.01" {
		t.Fatalf("expected NVSwitch firmware 96.10.6D.00.01, got %q", result[0].Firmware)
	}
}

func TestConvertPCIeDevices_SkipsDisplayControllerDuplicates(t *testing.T) {
	hw := &models.HardwareConfig{
		PCIeDevices: []models.PCIeDevice{
			{
				Slot:        "#GPU0",
				DeviceClass: "3D Controller",
			},
		},
		GPUs: []models.GPU{
			{
				Slot:         "#GPU0",
				Model:        "B200 180GB HBM3e",
				Manufacturer: "NVIDIA",
				SerialNumber: "1655024043371",
				Status:       "OK",
			},
		},
	}

	result := convertPCIeDevices(hw, "2026-02-10T15:30:00Z")
	if len(result) != 1 {
		t.Fatalf("expected only dedicated GPU record without duplicate display PCIe, got %d", len(result))
	}
	if result[0].DeviceClass != "DisplayController" {
		t.Fatalf("expected GPU record with DisplayController class, got %q", result[0].DeviceClass)
	}
	if result[0].Status != "OK" {
		t.Fatalf("expected GPU status OK, got %q", result[0].Status)
	}
}

func TestConvertPCIeDevices_MapsGPUStatusHistory(t *testing.T) {
	hw := &models.HardwareConfig{
		GPUs: []models.GPU{
			{
				Slot:         "#GPU6",
				Model:        "B200 180GB HBM3e",
				Manufacturer: "NVIDIA",
				SerialNumber: "1655024043204",
				Status:       "Critical",
				StatusHistory: []models.StatusHistoryEntry{
					{
						Status:    "Critical",
						ChangedAt: time.Date(2026, 1, 12, 15, 5, 18, 0, time.UTC),
						Details:   "BIOS miss F_GPU6",
					},
				},
				ErrorDescription: "BIOS miss F_GPU6",
			},
		},
	}

	result := convertPCIeDevices(hw, "2026-02-10T15:30:00Z")
	if len(result) != 1 {
		t.Fatalf("expected 1 converted GPU, got %d", len(result))
	}

	if len(result[0].StatusHistory) != 1 {
		t.Fatalf("expected 1 history entry, got %d", len(result[0].StatusHistory))
	}
	if result[0].StatusHistory[0].ChangedAt != "2026-01-12T15:05:18Z" {
		t.Fatalf("unexpected history changed_at: %q", result[0].StatusHistory[0].ChangedAt)
	}
	if result[0].StatusAtCollect == nil || result[0].StatusAtCollect.At != "2026-02-10T15:30:00Z" {
		t.Fatalf("expected status_at_collection to be populated from collected_at")
	}
}

func TestConvertPowerSupplies(t *testing.T) {
	psus := []models.PSU{
		{
			Slot:         "0",
			Present:      true,
			Model:        "GW-CRPS3000LW",
			Vendor:       "Great Wall",
			WattageW:     3000,
			SerialNumber: "PSU-001",
			Status:       "OK",
		},
		{
			Slot:         "1",
			Present:      false,
			SerialNumber: "", // Not present, should be skipped
		},
	}

	result := convertPowerSupplies(psus, "2026-02-10T15:30:00Z")

	if len(result) != 1 {
		t.Fatalf("expected 1 PSU (skipped empty), got %d", len(result))
	}

	if result[0].Status != "OK" {
		t.Errorf("expected OK status, got %q", result[0].Status)
	}
}

func TestConvertBoardNormalizesNULL(t *testing.T) {
	board := convertBoard(models.BoardInfo{
		Manufacturer: " NULL ",
		ProductName:  "null",
		SerialNumber: "TEST123",
	})

	if board.Manufacturer != "" {
		t.Fatalf("expected empty manufacturer, got %q", board.Manufacturer)
	}
	if board.ProductName != "" {
		t.Fatalf("expected empty product_name, got %q", board.ProductName)
	}
}

func TestSourceTypeOmittedWhenInvalidOrEmpty(t *testing.T) {
	result, err := ConvertToReanimator(&models.AnalysisResult{
		Filename:   "redfish://10.0.0.1",
		SourceType: "archive",
		TargetHost: "10.0.0.1",
		Hardware: &models.HardwareConfig{
			BoardInfo: models.BoardInfo{SerialNumber: "TEST123"},
		},
	})
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}

	payload, err := json.Marshal(result)
	if err != nil {
		t.Fatalf("marshal failed: %v", err)
	}
	if strings.Contains(string(payload), `"source_type"`) {
		t.Fatalf("expected source_type to be omitted for invalid value, got %s", string(payload))
	}
}

func TestTargetHostOmittedWhenUnavailable(t *testing.T) {
	result, err := ConvertToReanimator(&models.AnalysisResult{
		Filename:   "test.json",
		SourceType: "api",
		Hardware: &models.HardwareConfig{
			BoardInfo: models.BoardInfo{SerialNumber: "TEST123"},
		},
	})
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}

	payload, err := json.Marshal(result)
	if err != nil {
		t.Fatalf("marshal failed: %v", err)
	}
	if strings.Contains(string(payload), `"target_host"`) {
		t.Fatalf("expected target_host to be omitted when unavailable, got %s", string(payload))
	}
}

func TestInferTargetHost(t *testing.T) {
	tests := []struct {
		name       string
		targetHost string
		filename   string
		want       string
	}{
		{
			name:       "explicit target host wins",
			targetHost: "10.0.0.10",
			filename:   "redfish://10.0.0.20",
			want:       "10.0.0.10",
		},
		{
			name:     "hostname from URL",
			filename: "redfish://10.10.10.103",
			want:     "10.10.10.103",
		},
		{
			name:     "ip extracted from archive name",
			filename: "nvidia_bug_report_192.168.12.34.tar.gz",
			want:     "192.168.12.34",
		},
		{
			name:     "no host available",
			filename: "test.json",
			want:     "",
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := inferTargetHost(tt.targetHost, tt.filename)
			if got != tt.want {
				t.Fatalf("inferTargetHost() = %q, want %q", got, tt.want)
			}
		})
	}
}

func TestConvertToReanimator_DeduplicatesAllSections(t *testing.T) {
	input := &models.AnalysisResult{
		Filename:    "dup-test.json",
		CollectedAt: time.Date(2026, 2, 10, 15, 30, 0, 0, time.UTC),
		Hardware: &models.HardwareConfig{
			BoardInfo: models.BoardInfo{SerialNumber: "BOARD-001"},
			Firmware: []models.FirmwareInfo{
				{DeviceName: "BMC", Version: "1.0"},
				{DeviceName: "BMC", Version: "1.1"},
			},
			CPUs: []models.CPU{
				{Socket: 0, Model: "CPU-A"},
				{Socket: 0, Model: "CPU-A-DUP"},
			},
			Memory: []models.MemoryDIMM{
				{Slot: "DIMM_A1", Present: true, SizeMB: 32768, SerialNumber: "MEM-1", Status: "OK"},
				{Slot: "DIMM_A1", Present: true, SizeMB: 32768, SerialNumber: "MEM-1-DUP", Status: "OK"},
			},
			Storage: []models.Storage{
				{Slot: "U.2-1", SerialNumber: "SSD-1", Model: "Disk1", Present: true},
				{Slot: "U.2-2", SerialNumber: "SSD-1", Model: "Disk1-dup", Present: true},
			},
			PCIeDevices: []models.PCIeDevice{
				{Slot: "#GPU0", DeviceClass: "3D Controller", BDF: "17:00.0"},
				{Slot: "SLOT-NIC1", DeviceClass: "NetworkController", BDF: "18:00.0"},
				{Slot: "SLOT-NIC1", DeviceClass: "NetworkController", BDF: "18:00.1"},
			},
			GPUs: []models.GPU{
				{Slot: "#GPU0", Model: "B200 180GB HBM3e", SerialNumber: "GPU-1", Status: "OK"},
			},
			PowerSupply: []models.PSU{
				{Slot: "0", Present: true, SerialNumber: "PSU-1", Status: "OK"},
				{Slot: "1", Present: true, SerialNumber: "PSU-1", Status: "OK"},
			},
		},
	}

	out, err := ConvertToReanimator(input)
	if err != nil {
		t.Fatalf("ConvertToReanimator() failed: %v", err)
	}

	if len(out.Hardware.Firmware) != 1 {
		t.Fatalf("expected deduped firmware len=1, got %d", len(out.Hardware.Firmware))
	}
	if len(out.Hardware.CPUs) != 2 {
		t.Fatalf("expected cpus len=2 (no serial/bdf dedupe), got %d", len(out.Hardware.CPUs))
	}
	if len(out.Hardware.Memory) != 2 {
		t.Fatalf("expected memory len=2 (different serials), got %d", len(out.Hardware.Memory))
	}
	if len(out.Hardware.Storage) != 1 {
		t.Fatalf("expected deduped storage len=1, got %d", len(out.Hardware.Storage))
	}
	if len(out.Hardware.PowerSupplies) != 1 {
		t.Fatalf("expected deduped psu len=1, got %d", len(out.Hardware.PowerSupplies))
	}
	if len(out.Hardware.PCIeDevices) != 4 {
		t.Fatalf("expected pcie len=4 with serial->bdf dedupe, got %d", len(out.Hardware.PCIeDevices))
	}

	gpuCount := 0
	for _, dev := range out.Hardware.PCIeDevices {
		if dev.Slot == "#GPU0" {
			gpuCount++
		}
	}
	if gpuCount != 2 {
		t.Fatalf("expected two #GPU0 records (pcie+gpu kinds), got %d", gpuCount)
	}
}

func TestConvertToReanimator_StatusFallbackUsesCollectedAt(t *testing.T) {
	collectedAt := time.Date(2026, 2, 10, 15, 30, 0, 0, time.UTC)
	input := &models.AnalysisResult{
		Filename:    "status-fallback.json",
		CollectedAt: collectedAt,
		Hardware: &models.HardwareConfig{
			BoardInfo: models.BoardInfo{SerialNumber: "BOARD-001"},
			Storage: []models.Storage{
				{
					Slot:         "U.2-1",
					Model:        "PM9A3",
					SerialNumber: "SSD-001",
					Present:      true,
					Status:       "OK",
				},
			},
		},
	}

	out, err := ConvertToReanimator(input)
	if err != nil {
		t.Fatalf("ConvertToReanimator() failed: %v", err)
	}
	if len(out.Hardware.Storage) != 1 {
		t.Fatalf("expected 1 storage entry, got %d", len(out.Hardware.Storage))
	}

	wantTs := collectedAt.UTC().Format(time.RFC3339)
	got := out.Hardware.Storage[0]
	if got.StatusCheckedAt != wantTs {
		t.Fatalf("expected status_checked_at=%q, got %q", wantTs, got.StatusCheckedAt)
	}
	if got.StatusAtCollect == nil || got.StatusAtCollect.At != wantTs {
		t.Fatalf("expected status_at_collection.at=%q, got %#v", wantTs, got.StatusAtCollect)
	}
}

func TestConvertToReanimator_FirmwareExcludesDeviceBoundEntries(t *testing.T) {
	input := &models.AnalysisResult{
		Filename: "fw-filter-test.json",
		Hardware: &models.HardwareConfig{
			BoardInfo: models.BoardInfo{SerialNumber: "BOARD-001"},
			Firmware: []models.FirmwareInfo{
				{DeviceName: "BIOS", Version: "1.0.0"},
				{DeviceName: "BMC", Version: "2.0.0"},
				{DeviceName: "GPU GPUSXM1 (692-2G520-0280-501)", Version: "96.00.D0.00.03"},
				{DeviceName: "NVSwitch NVSWITCH0 (965-25612-0002-000)", Version: "96.10.6D.00.01"},
				{DeviceName: "NIC #CPU1_PCIE9 (MCX512A-ACAT)", Version: "28.38.1900"},
				{DeviceName: "CPU0 Microcode", Version: "0x2b000643"},
			},
		},
	}

	out, err := ConvertToReanimator(input)
	if err != nil {
		t.Fatalf("ConvertToReanimator() failed: %v", err)
	}

	if len(out.Hardware.Firmware) != 2 {
		t.Fatalf("expected only machine-level firmware entries, got %d", len(out.Hardware.Firmware))
	}

	got := map[string]string{}
	for _, fw := range out.Hardware.Firmware {
		got[fw.DeviceName] = fw.Version
	}

	if got["BIOS"] != "1.0.0" {
		t.Fatalf("expected BIOS firmware to be kept")
	}
	if got["BMC"] != "2.0.0" {
		t.Fatalf("expected BMC firmware to be kept")
	}
	if _, exists := got["GPU GPUSXM1 (692-2G520-0280-501)"]; exists {
		t.Fatalf("expected GPU firmware to be excluded from hardware.firmware")
	}
	if _, exists := got["NVSwitch NVSWITCH0 (965-25612-0002-000)"]; exists {
		t.Fatalf("expected NVSwitch firmware to be excluded from hardware.firmware")
	}
}

func TestConvertToReanimator_UsesCanonicalDevices(t *testing.T) {
	input := &models.AnalysisResult{
		Filename: "canonical.json",
		Hardware: &models.HardwareConfig{
			BoardInfo: models.BoardInfo{SerialNumber: "BOARD-001"},
			Devices: []models.HardwareDevice{
				{
					Kind: models.DeviceKindCPU,
					Slot: "CPU0",
|
||||
Model: "INTEL(R) XEON(R)",
|
||||
Cores: 32,
|
||||
Threads: 64,
|
||||
FrequencyMHz: 2100,
|
||||
},
|
||||
{
|
||||
Kind: models.DeviceKindStorage,
|
||||
Slot: "U.2-1",
|
||||
Model: "Disk1",
|
||||
SerialNumber: "SSD-1",
|
||||
Present: boolPtr(true),
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
out, err := ConvertToReanimator(input)
|
||||
if err != nil {
|
||||
t.Fatalf("ConvertToReanimator() failed: %v", err)
|
||||
}
|
||||
if len(out.Hardware.CPUs) != 1 {
|
||||
t.Fatalf("expected cpu from hardware.devices, got %d", len(out.Hardware.CPUs))
|
||||
}
|
||||
if len(out.Hardware.Storage) != 1 {
|
||||
t.Fatalf("expected storage from hardware.devices, got %d", len(out.Hardware.Storage))
|
||||
}
|
||||
}
|
||||
|
||||
func TestConvertToReanimator_BindsDeviceVitals(t *testing.T) {
|
||||
input := &models.AnalysisResult{
|
||||
Filename: "vitals.json",
|
||||
Hardware: &models.HardwareConfig{
|
||||
BoardInfo: models.BoardInfo{SerialNumber: "BOARD-001"},
|
||||
Devices: []models.HardwareDevice{
|
||||
{
|
||||
Kind: models.DeviceKindGPU,
|
||||
Slot: "#GPU0",
|
||||
Model: "B200 180GB HBM3e",
|
||||
SerialNumber: "GPU-001",
|
||||
BDF: "0000:17:00.0",
|
||||
Details: map[string]any{
|
||||
"temperature": 71,
|
||||
"power": 350,
|
||||
"voltage": 12.2,
|
||||
},
|
||||
},
|
||||
{
|
||||
Kind: models.DeviceKindPSU,
|
||||
Slot: "PSU0",
|
||||
SerialNumber: "PSU-001",
|
||||
Present: boolPtr(true),
|
||||
InputPowerW: 1400,
|
||||
OutputPowerW: 1300,
|
||||
InputVoltage: 229.5,
|
||||
TemperatureC: 44,
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
out, err := ConvertToReanimator(input)
|
||||
if err != nil {
|
||||
t.Fatalf("ConvertToReanimator() failed: %v", err)
|
||||
}
|
||||
|
||||
if len(out.Hardware.PCIeDevices) != 1 {
|
||||
t.Fatalf("expected one pcie device, got %d", len(out.Hardware.PCIeDevices))
|
||||
}
|
||||
pcie := out.Hardware.PCIeDevices[0]
|
||||
if pcie.TemperatureC != 71 {
|
||||
t.Fatalf("expected GPU temperature 71C, got %d", pcie.TemperatureC)
|
||||
}
|
||||
if pcie.PowerW != 350 {
|
||||
t.Fatalf("expected GPU power 350W, got %d", pcie.PowerW)
|
||||
}
|
||||
if pcie.VoltageV != 12.2 {
|
||||
t.Fatalf("expected device voltage 12.2V, got %.2f", pcie.VoltageV)
|
||||
}
|
||||
|
||||
if len(out.Hardware.PowerSupplies) != 1 {
|
||||
t.Fatalf("expected one PSU, got %d", len(out.Hardware.PowerSupplies))
|
||||
}
|
||||
psu := out.Hardware.PowerSupplies[0]
|
||||
if psu.TemperatureC != 44 {
|
||||
t.Fatalf("expected PSU temperature 44C, got %d", psu.TemperatureC)
|
||||
}
|
||||
}
|
||||
|
||||
func TestConvertToReanimator_PreservesVitalsAcrossCanonicalDedup(t *testing.T) {
|
||||
input := &models.AnalysisResult{
|
||||
Filename: "dedup-vitals.json",
|
||||
Hardware: &models.HardwareConfig{
|
||||
BoardInfo: models.BoardInfo{SerialNumber: "BOARD-001"},
|
||||
PCIeDevices: []models.PCIeDevice{
|
||||
{
|
||||
Slot: "#GPU0",
|
||||
BDF: "0000:17:00.0",
|
||||
DeviceClass: "3D Controller",
|
||||
PartNumber: "Generic Display",
|
||||
Manufacturer: "NVIDIA",
|
||||
SerialNumber: "GPU-SN-001",
|
||||
},
|
||||
},
|
||||
GPUs: []models.GPU{
|
||||
{
|
||||
Slot: "#GPU0",
|
||||
BDF: "0000:17:00.0",
|
||||
Model: "B200 180GB HBM3e",
|
||||
Manufacturer: "NVIDIA",
|
||||
SerialNumber: "GPU-SN-001",
|
||||
Temperature: 67,
|
||||
Power: 330,
|
||||
Status: "OK",
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
out, err := ConvertToReanimator(input)
|
||||
if err != nil {
|
||||
t.Fatalf("ConvertToReanimator() failed: %v", err)
|
||||
}
|
||||
if len(out.Hardware.PCIeDevices) != 1 {
|
||||
t.Fatalf("expected deduped one pcie entry, got %d", len(out.Hardware.PCIeDevices))
|
||||
}
|
||||
got := out.Hardware.PCIeDevices[0]
|
||||
if got.TemperatureC != 67 {
|
||||
t.Fatalf("expected deduped GPU temperature 67C, got %d", got.TemperatureC)
|
||||
}
|
||||
if got.PowerW != 330 {
|
||||
t.Fatalf("expected deduped GPU power 330W, got %d", got.PowerW)
|
||||
}
|
||||
}
|
||||
|
||||
func boolPtr(v bool) *bool { return &v }
|
||||
289
internal/exporter/reanimator_integration_test.go
Normal file
@@ -0,0 +1,289 @@
package exporter

import (
	"encoding/json"
	"strings"
	"testing"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
)

// TestFullReanimatorExport tests a complete export with realistic data.
func TestFullReanimatorExport(t *testing.T) {
	// Create a realistic AnalysisResult similar to import-example-full.json.
	result := &models.AnalysisResult{
		Filename:    "redfish://10.10.10.103",
		SourceType:  "api",
		Protocol:    "redfish",
		TargetHost:  "10.10.10.103",
		CollectedAt: time.Date(2026, 2, 10, 15, 30, 0, 0, time.UTC),
		Hardware: &models.HardwareConfig{
			BoardInfo: models.BoardInfo{
				Manufacturer: "Supermicro",
				ProductName:  "X12DPG-QT6",
				SerialNumber: "21D634101",
				PartNumber:   "X12DPG-QT6-REV1.01",
				UUID:         "d7ef2fe5-2fd0-11f0-910a-346f11040868",
			},
			Firmware: []models.FirmwareInfo{
				{DeviceName: "BIOS", Version: "06.08.05"},
				{DeviceName: "BMC", Version: "5.17.00"},
				{DeviceName: "CPLD", Version: "01.02.03"},
			},
			CPUs: []models.CPU{
				{
					Socket:       0,
					Model:        "INTEL(R) XEON(R) GOLD 6530",
					Cores:        32,
					Threads:      64,
					FrequencyMHz: 2100,
					MaxFreqMHz:   4000,
				},
				{
					Socket:       1,
					Model:        "INTEL(R) XEON(R) GOLD 6530",
					Cores:        32,
					Threads:      64,
					FrequencyMHz: 2100,
					MaxFreqMHz:   4000,
				},
			},
			Memory: []models.MemoryDIMM{
				{
					Slot:            "CPU0_C0D0",
					Location:        "CPU0_C0D0",
					Present:         true,
					SizeMB:          32768,
					Type:            "DDR5",
					MaxSpeedMHz:     4800,
					CurrentSpeedMHz: 4800,
					Manufacturer:    "Hynix",
					SerialNumber:    "80AD032419E17CEEC1",
					PartNumber:      "HMCG88AGBRA191N",
					Status:          "OK",
				},
				{
					Slot:            "CPU0_C1D0",
					Location:        "CPU0_C1D0",
					Present:         false,
					SizeMB:          0,
					Type:            "",
					MaxSpeedMHz:     0,
					CurrentSpeedMHz: 0,
					Status:          "Empty",
				},
			},
			Storage: []models.Storage{
				{
					Slot:         "OB01",
					Type:         "NVMe",
					Model:        "INTEL SSDPF2KX076T1",
					SizeGB:       7680,
					SerialNumber: "BTAX41900GF87P6DGN",
					Manufacturer: "Intel",
					Firmware:     "9CV10510",
					Interface:    "NVMe",
					Present:      true,
				},
				{
					Slot:         "FP00HDD00",
					Type:         "HDD",
					Model:        "ST12000NM0008",
					SizeGB:       12000,
					SerialNumber: "ZJV01234ABC",
					Manufacturer: "Seagate",
					Firmware:     "SN03",
					Interface:    "SATA",
					Present:      true,
				},
			},
			PCIeDevices: []models.PCIeDevice{
				{
					Slot:         "PCIeCard1",
					VendorID:     32902,
					DeviceID:     2912,
					BDF:          "0000:18:00.0",
					DeviceClass:  "MassStorageController",
					Manufacturer: "Intel",
					PartNumber:   "RAID Controller RSP3DD080F",
					LinkWidth:    8,
					LinkSpeed:    "Gen3",
					MaxLinkWidth: 8,
					MaxLinkSpeed: "Gen3",
					SerialNumber: "RAID-001-12345",
				},
				{
					Slot:         "PCIeCard2",
					VendorID:     5555,
					DeviceID:     4401,
					BDF:          "0000:3b:00.0",
					DeviceClass:  "NetworkController",
					Manufacturer: "Mellanox",
					PartNumber:   "ConnectX-5",
					LinkWidth:    16,
					LinkSpeed:    "Gen3",
					MaxLinkWidth: 16,
					MaxLinkSpeed: "Gen3",
					SerialNumber: "MT2892012345",
				},
			},
			PowerSupply: []models.PSU{
				{
					Slot:         "0",
					Present:      true,
					Model:        "GW-CRPS3000LW",
					Vendor:       "Great Wall",
					WattageW:     3000,
					SerialNumber: "2P06C102610",
					PartNumber:   "V0310C9000000000",
					Firmware:     "00.03.05",
					Status:       "OK",
					InputType:    "ACWideRange",
					InputPowerW:  137,
					OutputPowerW: 104,
					InputVoltage: 215.25,
				},
			},
		},
	}

	// Convert to Reanimator format.
	reanimator, err := ConvertToReanimator(result)
	if err != nil {
		t.Fatalf("ConvertToReanimator failed: %v", err)
	}

	// Verify top-level fields.
	if reanimator.Filename != "redfish://10.10.10.103" {
		t.Errorf("Filename mismatch: got %q", reanimator.Filename)
	}
	if reanimator.SourceType != "api" {
		t.Errorf("SourceType mismatch: got %q", reanimator.SourceType)
	}
	if reanimator.Protocol != "redfish" {
		t.Errorf("Protocol mismatch: got %q", reanimator.Protocol)
	}
	if reanimator.TargetHost != "10.10.10.103" {
		t.Errorf("TargetHost mismatch: got %q", reanimator.TargetHost)
	}
	if reanimator.CollectedAt != "2026-02-10T15:30:00Z" {
		t.Errorf("CollectedAt mismatch: got %q", reanimator.CollectedAt)
	}

	// Verify hardware sections.
	hw := reanimator.Hardware

	// Board
	if hw.Board.SerialNumber != "21D634101" {
		t.Errorf("Board serial mismatch: got %q", hw.Board.SerialNumber)
	}

	// Firmware
	if len(hw.Firmware) != 3 {
		t.Errorf("Expected 3 firmware entries, got %d", len(hw.Firmware))
	}

	// CPUs
	if len(hw.CPUs) != 2 {
		t.Fatalf("Expected 2 CPUs, got %d", len(hw.CPUs))
	}
	if hw.CPUs[0].Manufacturer != "Intel" {
		t.Errorf("CPU manufacturer not inferred: got %q", hw.CPUs[0].Manufacturer)
	}
	if hw.CPUs[0].Status != "Unknown" {
		t.Errorf("CPU status mismatch: got %q", hw.CPUs[0].Status)
	}

	// Memory (empty slots are excluded)
	if len(hw.Memory) != 1 {
		t.Errorf("Expected 1 memory entry (installed only), got %d", len(hw.Memory))
	}

	// Storage
	if len(hw.Storage) != 2 {
		t.Errorf("Expected 2 storage devices, got %d", len(hw.Storage))
	}
	if hw.Storage[0].Status != "Unknown" {
		t.Errorf("Storage status mismatch: got %q", hw.Storage[0].Status)
	}

	// PCIe devices
	if len(hw.PCIeDevices) != 2 {
		t.Errorf("Expected 2 PCIe devices, got %d", len(hw.PCIeDevices))
	}
	if hw.PCIeDevices[0].Model == "" {
		t.Error("PCIe model should be populated from PartNumber")
	}

	// Power supplies
	if len(hw.PowerSupplies) != 1 {
		t.Errorf("Expected 1 PSU, got %d", len(hw.PowerSupplies))
	}

	// Verify JSON marshaling works.
	jsonData, err := json.MarshalIndent(reanimator, "", "  ")
	if err != nil {
		t.Fatalf("Failed to marshal to JSON: %v", err)
	}

	// Check that the JSON contains the expected fields.
	jsonStr := string(jsonData)
	expectedFields := []string{
		`"filename"`,
		`"source_type"`,
		`"protocol"`,
		`"target_host"`,
		`"collected_at"`,
		`"hardware"`,
		`"board"`,
		`"cpus"`,
		`"memory"`,
		`"storage"`,
		`"pcie_devices"`,
		`"power_supplies"`,
		`"firmware"`,
	}
	for _, field := range expectedFields {
		if !strings.Contains(jsonStr, field) {
			t.Errorf("JSON missing expected field: %s", field)
		}
	}

	// Optional: print JSON for manual inspection (commented out for normal test runs).
	// t.Logf("Generated Reanimator JSON:\n%s", string(jsonData))
}

// TestReanimatorExportWithoutTargetHost tests that target_host is inferred from the filename.
func TestReanimatorExportWithoutTargetHost(t *testing.T) {
	result := &models.AnalysisResult{
		Filename:    "redfish://192.168.1.100",
		SourceType:  "api",
		Protocol:    "redfish",
		TargetHost:  "", // Empty - should be inferred
		CollectedAt: time.Now(),
		Hardware: &models.HardwareConfig{
			BoardInfo: models.BoardInfo{
				SerialNumber: "TEST123",
			},
		},
	}

	reanimator, err := ConvertToReanimator(result)
	if err != nil {
		t.Fatalf("ConvertToReanimator failed: %v", err)
	}

	if reanimator.TargetHost != "192.168.1.100" {
		t.Errorf("Expected target_host to be inferred from filename, got %q", reanimator.TargetHost)
	}
}
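The host-inference behavior exercised by TestReanimatorExportWithoutTargetHost can be sketched with a plain scheme-prefix strip. The helper name `inferHost` and the set of handled schemes are assumptions for illustration only, not the exporter's actual implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// inferHost is a hypothetical sketch of deriving target_host from a
// filename like "redfish://192.168.1.100" when the field is empty.
func inferHost(filename string) string {
	for _, scheme := range []string{"redfish://", "ipmi://"} {
		if strings.HasPrefix(filename, scheme) {
			return strings.TrimPrefix(filename, scheme)
		}
	}
	return "" // not a live-collect source; leave target_host empty
}

func main() {
	fmt.Println(inferHost("redfish://192.168.1.100")) // 192.168.1.100
}
```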
153
internal/exporter/reanimator_models.go
Normal file
@@ -0,0 +1,153 @@
package exporter

// ReanimatorExport represents the top-level structure for Reanimator format export.
type ReanimatorExport struct {
	Filename    string             `json:"filename"`
	SourceType  string             `json:"source_type,omitempty"`
	Protocol    string             `json:"protocol,omitempty"`
	TargetHost  string             `json:"target_host,omitempty"`
	CollectedAt string             `json:"collected_at"` // RFC3339 format
	Hardware    ReanimatorHardware `json:"hardware"`
}

// ReanimatorHardware contains all hardware components.
type ReanimatorHardware struct {
	Board         ReanimatorBoard      `json:"board"`
	Firmware      []ReanimatorFirmware `json:"firmware,omitempty"`
	CPUs          []ReanimatorCPU      `json:"cpus,omitempty"`
	Memory        []ReanimatorMemory   `json:"memory,omitempty"`
	Storage       []ReanimatorStorage  `json:"storage,omitempty"`
	PCIeDevices   []ReanimatorPCIe     `json:"pcie_devices,omitempty"`
	PowerSupplies []ReanimatorPSU      `json:"power_supplies,omitempty"`
}

// ReanimatorBoard represents motherboard/server information.
type ReanimatorBoard struct {
	Manufacturer string `json:"manufacturer,omitempty"`
	ProductName  string `json:"product_name,omitempty"`
	SerialNumber string `json:"serial_number"`
	PartNumber   string `json:"part_number,omitempty"`
	UUID         string `json:"uuid,omitempty"`
}

// ReanimatorFirmware represents firmware version information.
type ReanimatorFirmware struct {
	DeviceName string `json:"device_name"`
	Version    string `json:"version"`
}

// ReanimatorStatusAtCollection captures component status at a specific timestamp.
type ReanimatorStatusAtCollection struct {
	Status string `json:"status"`
	At     string `json:"at"`
}

// ReanimatorStatusHistoryEntry represents a status transition point.
type ReanimatorStatusHistoryEntry struct {
	Status    string `json:"status"`
	ChangedAt string `json:"changed_at"`
	Details   string `json:"details,omitempty"`
}

// ReanimatorCPU represents processor information.
type ReanimatorCPU struct {
	Socket           int                            `json:"socket"`
	Model            string                         `json:"model"`
	Cores            int                            `json:"cores,omitempty"`
	Threads          int                            `json:"threads,omitempty"`
	FrequencyMHz     int                            `json:"frequency_mhz,omitempty"`
	MaxFrequencyMHz  int                            `json:"max_frequency_mhz,omitempty"`
	Manufacturer     string                         `json:"manufacturer,omitempty"`
	Status           string                         `json:"status,omitempty"`
	StatusCheckedAt  string                         `json:"status_checked_at,omitempty"`
	StatusChangedAt  string                         `json:"status_changed_at,omitempty"`
	StatusAtCollect  *ReanimatorStatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string                         `json:"error_description,omitempty"`
}

// ReanimatorMemory represents a memory module (DIMM).
type ReanimatorMemory struct {
	Slot             string                         `json:"slot"`
	Location         string                         `json:"location,omitempty"`
	Present          bool                           `json:"present"`
	SizeMB           int                            `json:"size_mb,omitempty"`
	Type             string                         `json:"type,omitempty"`
	MaxSpeedMHz      int                            `json:"max_speed_mhz,omitempty"`
	CurrentSpeedMHz  int                            `json:"current_speed_mhz,omitempty"`
	Manufacturer     string                         `json:"manufacturer,omitempty"`
	SerialNumber     string                         `json:"serial_number,omitempty"`
	PartNumber       string                         `json:"part_number,omitempty"`
	Status           string                         `json:"status,omitempty"`
	StatusCheckedAt  string                         `json:"status_checked_at,omitempty"`
	StatusChangedAt  string                         `json:"status_changed_at,omitempty"`
	StatusAtCollect  *ReanimatorStatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string                         `json:"error_description,omitempty"`
}

// ReanimatorStorage represents a storage device.
type ReanimatorStorage struct {
	Slot             string                         `json:"slot"`
	Type             string                         `json:"type,omitempty"`
	Model            string                         `json:"model"`
	SizeGB           int                            `json:"size_gb,omitempty"`
	SerialNumber     string                         `json:"serial_number"`
	Manufacturer     string                         `json:"manufacturer,omitempty"`
	Firmware         string                         `json:"firmware,omitempty"`
	Interface        string                         `json:"interface,omitempty"`
	Present          bool                           `json:"present"`
	Status           string                         `json:"status,omitempty"`
	StatusCheckedAt  string                         `json:"status_checked_at,omitempty"`
	StatusChangedAt  string                         `json:"status_changed_at,omitempty"`
	StatusAtCollect  *ReanimatorStatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string                         `json:"error_description,omitempty"`
}

// ReanimatorPCIe represents a PCIe device.
type ReanimatorPCIe struct {
	Slot             string                         `json:"slot"`
	VendorID         int                            `json:"vendor_id,omitempty"`
	DeviceID         int                            `json:"device_id,omitempty"`
	BDF              string                         `json:"bdf,omitempty"`
	DeviceClass      string                         `json:"device_class,omitempty"`
	Manufacturer     string                         `json:"manufacturer,omitempty"`
	Model            string                         `json:"model,omitempty"`
	LinkWidth        int                            `json:"link_width,omitempty"`
	LinkSpeed        string                         `json:"link_speed,omitempty"`
	MaxLinkWidth     int                            `json:"max_link_width,omitempty"`
	MaxLinkSpeed     string                         `json:"max_link_speed,omitempty"`
	SerialNumber     string                         `json:"serial_number,omitempty"`
	Firmware         string                         `json:"firmware,omitempty"`
	TemperatureC     int                            `json:"temperature_c,omitempty"`
	PowerW           int                            `json:"power_w,omitempty"`
	VoltageV         float64                        `json:"voltage_v,omitempty"`
	Status           string                         `json:"status,omitempty"`
	StatusCheckedAt  string                         `json:"status_checked_at,omitempty"`
	StatusChangedAt  string                         `json:"status_changed_at,omitempty"`
	StatusAtCollect  *ReanimatorStatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string                         `json:"error_description,omitempty"`
}

// ReanimatorPSU represents a power supply unit.
type ReanimatorPSU struct {
	Slot             string                         `json:"slot"`
	Present          bool                           `json:"present"`
	Model            string                         `json:"model,omitempty"`
	Vendor           string                         `json:"vendor,omitempty"`
	WattageW         int                            `json:"wattage_w,omitempty"`
	SerialNumber     string                         `json:"serial_number,omitempty"`
	PartNumber       string                         `json:"part_number,omitempty"`
	Firmware         string                         `json:"firmware,omitempty"`
	Status           string                         `json:"status,omitempty"`
	InputType        string                         `json:"input_type,omitempty"`
	InputPowerW      int                            `json:"input_power_w,omitempty"`
	OutputPowerW     int                            `json:"output_power_w,omitempty"`
	InputVoltage     float64                        `json:"input_voltage,omitempty"`
	TemperatureC     int                            `json:"temperature_c,omitempty"`
	StatusCheckedAt  string                         `json:"status_checked_at,omitempty"`
	StatusChangedAt  string                         `json:"status_changed_at,omitempty"`
	StatusAtCollect  *ReanimatorStatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []ReanimatorStatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string                         `json:"error_description,omitempty"`
}
@@ -13,6 +13,7 @@ type AnalysisResult struct {
	SourceType     string         `json:"source_type,omitempty"`     // archive | api
	Protocol       string         `json:"protocol,omitempty"`        // redfish | ipmi
	TargetHost     string         `json:"target_host,omitempty"`     // BMC host for live collect
	SourceTimezone string         `json:"source_timezone,omitempty"` // Source timezone/offset used during collection (e.g. +08:00)
	CollectedAt    time.Time      `json:"collected_at,omitempty"`    // Collection/upload timestamp
	RawPayloads    map[string]any `json:"raw_payloads,omitempty"`    // Additional source payloads (e.g. Redfish tree)
	Events         []Event        `json:"events"`
@@ -43,6 +44,19 @@ const (
	SeverityInfo Severity = "info"
)

// StatusAtCollection captures component status at a specific timestamp.
type StatusAtCollection struct {
	Status string    `json:"status"`
	At     time.Time `json:"at"`
}

// StatusHistoryEntry represents a status transition point.
type StatusHistoryEntry struct {
	Status    string    `json:"status"`
	ChangedAt time.Time `json:"changed_at"`
	Details   string    `json:"details,omitempty"`
}

// SensorReading represents a single sensor reading
type SensorReading struct {
	Name string `json:"name"`
@@ -71,9 +85,11 @@ type FRUInfo struct {
type HardwareConfig struct {
	Firmware     []FirmwareInfo   `json:"firmware,omitempty"`
	BoardInfo    BoardInfo        `json:"board,omitempty"`
	Devices      []HardwareDevice `json:"devices,omitempty"`
	CPUs         []CPU            `json:"cpus,omitempty"`
	Memory       []MemoryDIMM     `json:"memory,omitempty"`
	Storage      []Storage        `json:"storage,omitempty"`
	Volumes      []StorageVolume  `json:"volumes,omitempty"`
	PCIeDevices  []PCIeDevice     `json:"pcie_devices,omitempty"`
	GPUs         []GPU            `json:"gpus,omitempty"`
	NetworkCards []NIC            `json:"network_cards,omitempty"`
@@ -81,27 +97,91 @@ type HardwareConfig struct {
	PowerSupply []PSU `json:"power_supplies,omitempty"`
}

const (
	DeviceKindBoard   = "board"
	DeviceKindCPU     = "cpu"
	DeviceKindMemory  = "memory"
	DeviceKindStorage = "storage"
	DeviceKindPCIe    = "pcie"
	DeviceKindGPU     = "gpu"
	DeviceKindNetwork = "network"
	DeviceKindPSU     = "psu"
)

// HardwareDevice is the canonical device inventory used across UI and exports.
type HardwareDevice struct {
	ID           string   `json:"id"`
	Kind         string   `json:"kind"`
	Source       string   `json:"source,omitempty"`
	Slot         string   `json:"slot,omitempty"`
	Location     string   `json:"location,omitempty"`
	BDF          string   `json:"bdf,omitempty"`
	DeviceClass  string   `json:"device_class,omitempty"`
	VendorID     int      `json:"vendor_id,omitempty"`
	DeviceID     int      `json:"device_id,omitempty"`
	Model        string   `json:"model,omitempty"`
	PartNumber   string   `json:"part_number,omitempty"`
	Manufacturer string   `json:"manufacturer,omitempty"`
	SerialNumber string   `json:"serial_number,omitempty"`
	Firmware     string   `json:"firmware,omitempty"`
	Type         string   `json:"type,omitempty"`
	Interface    string   `json:"interface,omitempty"`
	Present      *bool    `json:"present,omitempty"`
	SizeMB       int      `json:"size_mb,omitempty"`
	SizeGB       int      `json:"size_gb,omitempty"`
	Cores        int      `json:"cores,omitempty"`
	Threads      int      `json:"threads,omitempty"`
	FrequencyMHz int      `json:"frequency_mhz,omitempty"`
	MaxFreqMHz   int      `json:"max_frequency_mhz,omitempty"`
	PortCount    int      `json:"port_count,omitempty"`
	PortType     string   `json:"port_type,omitempty"`
	MACAddresses []string `json:"mac_addresses,omitempty"`
	LinkWidth    int      `json:"link_width,omitempty"`
	LinkSpeed    string   `json:"link_speed,omitempty"`
	MaxLinkWidth int      `json:"max_link_width,omitempty"`
	MaxLinkSpeed string   `json:"max_link_speed,omitempty"`
	WattageW     int      `json:"wattage_w,omitempty"`
	InputType    string   `json:"input_type,omitempty"`
	InputPowerW  int      `json:"input_power_w,omitempty"`
	OutputPowerW int      `json:"output_power_w,omitempty"`
	InputVoltage float64  `json:"input_voltage,omitempty"`
	TemperatureC int      `json:"temperature_c,omitempty"`
	Status       string   `json:"status,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`

	Details map[string]any `json:"details,omitempty"`
}

// FirmwareInfo represents firmware version information
type FirmwareInfo struct {
	DeviceName  string `json:"device_name"`
	Description string `json:"description,omitempty"`
	Version     string `json:"version"`
	BuildTime   string `json:"build_time,omitempty"`
}

// BoardInfo represents motherboard/system information
type BoardInfo struct {
	Manufacturer  string `json:"manufacturer,omitempty"`
	ProductName   string `json:"product_name,omitempty"`
	Description   string `json:"description,omitempty"`
	SerialNumber  string `json:"serial_number,omitempty"`
	PartNumber    string `json:"part_number,omitempty"`
	Version       string `json:"version,omitempty"`
	UUID          string `json:"uuid,omitempty"`
	BMCMACAddress string `json:"bmc_mac_address,omitempty"`
}

// CPU represents processor information
type CPU struct {
	Socket       int    `json:"socket"`
	Model        string `json:"model"`
	Description  string `json:"description,omitempty"`
	Cores        int    `json:"cores"`
	Threads      int    `json:"threads"`
	FrequencyMHz int    `json:"frequency_mhz"`
@@ -112,12 +192,20 @@ type CPU struct {
	TDP          int    `json:"tdp_w,omitempty"`
	PPIN         string `json:"ppin,omitempty"`
	SerialNumber string `json:"serial_number,omitempty"`
	Status       string `json:"status,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
}

// MemoryDIMM represents a memory module
type MemoryDIMM struct {
	Slot        string `json:"slot"`
	Location    string `json:"location"`
	Description string `json:"description,omitempty"`
	Present     bool   `json:"present"`
	SizeMB      int    `json:"size_mb"`
	Type        string `json:"type"`
@@ -129,6 +217,12 @@ type MemoryDIMM struct {
	PartNumber string `json:"part_number,omitempty"`
	Status     string `json:"status,omitempty"`
	Ranks      int    `json:"ranks,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
}

// Storage represents a storage device
@@ -136,6 +230,7 @@ type Storage struct {
	Slot         string `json:"slot"`
	Type         string `json:"type"`
	Model        string `json:"model"`
	Description  string `json:"description,omitempty"`
	SizeGB       int    `json:"size_gb"`
	SerialNumber string `json:"serial_number,omitempty"`
	Manufacturer string `json:"manufacturer,omitempty"`
@@ -144,11 +239,32 @@ type Storage struct {
	Present     bool   `json:"present"`
	Location    string `json:"location,omitempty"` // Front/Rear
	BackplaneID int    `json:"backplane_id,omitempty"`
	Status      string `json:"status,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
}

// StorageVolume represents a logical storage volume (RAID/VROC/etc.).
type StorageVolume struct {
	ID            string `json:"id,omitempty"`
	Name          string `json:"name,omitempty"`
	Controller    string `json:"controller,omitempty"`
	RAIDLevel     string `json:"raid_level,omitempty"`
	SizeGB        int    `json:"size_gb,omitempty"`
	CapacityBytes int64  `json:"capacity_bytes,omitempty"`
	Status        string `json:"status,omitempty"`
	Bootable      bool   `json:"bootable,omitempty"`
	Encrypted     bool   `json:"encrypted,omitempty"`
}

// PCIeDevice represents a PCIe device
type PCIeDevice struct {
	Slot        string `json:"slot"`
	Description string `json:"description,omitempty"`
	VendorID    int    `json:"vendor_id"`
	DeviceID    int    `json:"device_id"`
	BDF         string `json:"bdf"`
@@ -161,12 +277,20 @@ type PCIeDevice struct {
	PartNumber   string   `json:"part_number,omitempty"`
	SerialNumber string   `json:"serial_number,omitempty"`
	MACAddresses []string `json:"mac_addresses,omitempty"`
	Status       string   `json:"status,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
}

// NIC represents a network interface card
type NIC struct {
	Name         string `json:"name"`
	Model        string `json:"model"`
	Description  string `json:"description,omitempty"`
	MACAddress   string `json:"mac_address"`
	SpeedMbps    int    `json:"speed_mbps,omitempty"`
	SerialNumber string `json:"serial_number,omitempty"`
@@ -177,6 +301,7 @@ type PSU struct {
	Slot         string `json:"slot"`
	Present      bool   `json:"present"`
	Model        string `json:"model"`
	Description  string `json:"description,omitempty"`
	Vendor       string `json:"vendor,omitempty"`
	WattageW     int    `json:"wattage_w,omitempty"`
	SerialNumber string `json:"serial_number,omitempty"`
@@ -189,6 +314,12 @@ type PSU struct {
	InputVoltage  float64 `json:"input_voltage,omitempty"`
	OutputVoltage float64 `json:"output_voltage,omitempty"`
	TemperatureC  int     `json:"temperature_c,omitempty"`

	StatusCheckedAt  *time.Time           `json:"status_checked_at,omitempty"`
	StatusChangedAt  *time.Time           `json:"status_changed_at,omitempty"`
	StatusAtCollect  *StatusAtCollection  `json:"status_at_collection,omitempty"`
	StatusHistory    []StatusHistoryEntry `json:"status_history,omitempty"`
	ErrorDescription string               `json:"error_description,omitempty"`
}

// GPU represents a graphics processing unit
@@ -196,6 +327,7 @@ type GPU struct {
	Slot     string `json:"slot"`
	Location string `json:"location,omitempty"`
|
||||
Model string `json:"model"`
|
||||
Description string `json:"description,omitempty"`
|
||||
Manufacturer string `json:"manufacturer,omitempty"`
|
||||
VendorID int `json:"vendor_id,omitempty"`
|
||||
DeviceID int `json:"device_id,omitempty"`
|
||||
@@ -220,6 +352,12 @@ type GPU struct {
|
||||
CurrentLinkWidth int `json:"current_link_width,omitempty"`
|
||||
CurrentLinkSpeed string `json:"current_link_speed,omitempty"`
|
||||
Status string `json:"status,omitempty"`
|
||||
|
||||
StatusCheckedAt *time.Time `json:"status_checked_at,omitempty"`
|
||||
StatusChangedAt *time.Time `json:"status_changed_at,omitempty"`
|
||||
StatusAtCollect *StatusAtCollection `json:"status_at_collection,omitempty"`
|
||||
StatusHistory []StatusHistoryEntry `json:"status_history,omitempty"`
|
||||
ErrorDescription string `json:"error_description,omitempty"`
|
||||
}
|
||||
|
||||
// NetworkAdapter represents a network adapter with detailed info
|
||||
@@ -228,6 +366,7 @@ type NetworkAdapter struct {
|
||||
Location string `json:"location"`
|
||||
Present bool `json:"present"`
|
||||
Model string `json:"model"`
|
||||
Description string `json:"description,omitempty"`
|
||||
Vendor string `json:"vendor,omitempty"`
|
||||
VendorID int `json:"vendor_id,omitempty"`
|
||||
DeviceID int `json:"device_id,omitempty"`
|
||||
@@ -238,4 +377,10 @@ type NetworkAdapter struct {
|
||||
PortType string `json:"port_type,omitempty"`
|
||||
MACAddresses []string `json:"mac_addresses,omitempty"`
|
||||
Status string `json:"status,omitempty"`
|
||||
|
||||
StatusCheckedAt *time.Time `json:"status_checked_at,omitempty"`
|
||||
StatusChangedAt *time.Time `json:"status_changed_at,omitempty"`
|
||||
StatusAtCollect *StatusAtCollection `json:"status_at_collection,omitempty"`
|
||||
StatusHistory []StatusHistoryEntry `json:"status_history,omitempty"`
|
||||
ErrorDescription string `json:"error_description,omitempty"`
|
||||
}
|
||||
|
||||
@@ -9,25 +9,45 @@ import (
	"io"
	"os"
	"path/filepath"
	"sort"
	"strings"
	"time"
)

const maxSingleFileSize = 10 * 1024 * 1024
const maxZipArchiveSize = 50 * 1024 * 1024
const maxGzipDecompressedSize = 50 * 1024 * 1024

var supportedArchiveExt = map[string]struct{}{
	".gz":  {},
	".tgz": {},
	".tar": {},
	".sds": {},
	".zip": {},
	".txt": {},
	".log": {},
}

// ExtractedFile represents a file extracted from archive
type ExtractedFile struct {
-	Path    string
-	Content []byte
+	Path             string
+	Content          []byte
+	ModTime          time.Time
+	Truncated        bool
+	TruncatedMessage string
}

// ExtractArchive extracts tar.gz or zip archive and returns file contents
func ExtractArchive(archivePath string) ([]ExtractedFile, error) {
	if !IsSupportedArchiveFilename(archivePath) {
		return nil, fmt.Errorf("unsupported archive format: %s", strings.ToLower(filepath.Ext(archivePath)))
	}
	ext := strings.ToLower(filepath.Ext(archivePath))

	switch ext {
	case ".gz", ".tgz":
		return extractTarGz(archivePath)
-	case ".tar":
+	case ".tar", ".sds":
		return extractTar(archivePath)
	case ".zip":
		return extractZip(archivePath)
@@ -40,13 +60,18 @@ func ExtractArchive(archivePath string) ([]ExtractedFile, error) {

// ExtractArchiveFromReader extracts archive from reader
func ExtractArchiveFromReader(r io.Reader, filename string) ([]ExtractedFile, error) {
	if !IsSupportedArchiveFilename(filename) {
		return nil, fmt.Errorf("unsupported archive format: %s", strings.ToLower(filepath.Ext(filename)))
	}
	ext := strings.ToLower(filepath.Ext(filename))

	switch ext {
	case ".gz", ".tgz":
		return extractTarGzFromReader(r, filename)
-	case ".tar":
+	case ".tar", ".sds":
		return extractTarFromReader(r)
	case ".zip":
		return extractZipFromReader(r)
	case ".txt", ".log":
		return extractSingleFileFromReader(r, filename)
	default:
@@ -54,6 +79,27 @@ func ExtractArchiveFromReader(r io.Reader, filename string) ([]ExtractedFile, er
	}
}

// IsSupportedArchiveFilename reports whether filename extension is supported by archive extractor.
func IsSupportedArchiveFilename(filename string) bool {
	ext := strings.ToLower(strings.TrimSpace(filepath.Ext(filename)))
	if ext == "" {
		return false
	}
	_, ok := supportedArchiveExt[ext]
	return ok
}

// SupportedArchiveExtensions returns sorted list of archive/file extensions
// accepted by archive extractor.
func SupportedArchiveExtensions() []string {
	out := make([]string, 0, len(supportedArchiveExt))
	for ext := range supportedArchiveExt {
		out = append(out, ext)
	}
	sort.Strings(out)
	return out
}

func extractTarGz(archivePath string) ([]ExtractedFile, error) {
	f, err := os.Open(archivePath)
	if err != nil {
@@ -105,6 +151,7 @@ func extractTarFromReader(r io.Reader) ([]ExtractedFile, error) {
		files = append(files, ExtractedFile{
			Path:    header.Name,
			Content: content,
+			ModTime: header.ModTime,
		})
	}

@@ -118,12 +165,16 @@ func extractTarGzFromReader(r io.Reader, filename string) ([]ExtractedFile, erro
	}
	defer gzr.Close()

-	// Read all decompressed content into buffer
-	// Limit to 50MB for plain gzip files, 10MB per file for tar.gz
-	decompressed, err := io.ReadAll(io.LimitReader(gzr, 50*1024*1024))
+	// Read decompressed content with a hard cap.
+	// When the payload exceeds the cap, keep the first chunk and mark it as truncated.
+	decompressed, err := io.ReadAll(io.LimitReader(gzr, maxGzipDecompressedSize+1))
	if err != nil {
		return nil, fmt.Errorf("read gzip content: %w", err)
	}
+	gzipTruncated := len(decompressed) > maxGzipDecompressedSize
+	if gzipTruncated {
+		decompressed = decompressed[:maxGzipDecompressedSize]
+	}

	// Try to read as tar archive
	tr := tar.NewReader(bytes.NewReader(decompressed))
@@ -139,12 +190,20 @@ func extractTarGzFromReader(r io.Reader, filename string) ([]ExtractedFile, erro
			baseName = gzr.Name
		}

-		return []ExtractedFile{
-			{
-				Path:    baseName,
-				Content: decompressed,
-			},
-		}, nil
+		file := ExtractedFile{
+			Path:    baseName,
+			Content: decompressed,
+			ModTime: gzr.ModTime,
+		}
+		if gzipTruncated {
+			file.Truncated = true
+			file.TruncatedMessage = fmt.Sprintf(
+				"decompressed gzip content exceeded %d bytes and was truncated",
+				maxGzipDecompressedSize,
+			)
+		}
+
+		return []ExtractedFile{file}, nil
		}
		return nil, fmt.Errorf("tar read: %w", err)
	}
@@ -163,6 +222,7 @@ func extractTarGzFromReader(r io.Reader, filename string) ([]ExtractedFile, erro
		files = append(files, ExtractedFile{
			Path:    header.Name,
			Content: content,
+			ModTime: header.ModTime,
		})
	}
@@ -213,6 +273,59 @@ func extractZip(archivePath string) ([]ExtractedFile, error) {
		files = append(files, ExtractedFile{
			Path:    f.Name,
			Content: content,
+			ModTime: f.Modified,
		})
	}

	return files, nil
}

func extractZipFromReader(r io.Reader) ([]ExtractedFile, error) {
	// Read all data into memory with a hard cap
	data, err := io.ReadAll(io.LimitReader(r, maxZipArchiveSize+1))
	if err != nil {
		return nil, fmt.Errorf("read zip data: %w", err)
	}
	if len(data) > maxZipArchiveSize {
		return nil, fmt.Errorf("zip too large: max %d bytes", maxZipArchiveSize)
	}

	// Create a ReaderAt from the byte slice
	readerAt := bytes.NewReader(data)

	// Open the zip archive
	zipReader, err := zip.NewReader(readerAt, int64(len(data)))
	if err != nil {
		return nil, fmt.Errorf("open zip: %w", err)
	}

	var files []ExtractedFile

	for _, f := range zipReader.File {
		if f.FileInfo().IsDir() {
			continue
		}

		// Skip large files (>10MB)
		if f.FileInfo().Size() > 10*1024*1024 {
			continue
		}

		rc, err := f.Open()
		if err != nil {
			return nil, fmt.Errorf("open file %s: %w", f.Name, err)
		}

		content, err := io.ReadAll(rc)
		rc.Close()
		if err != nil {
			return nil, fmt.Errorf("read file %s: %w", f.Name, err)
		}

		files = append(files, ExtractedFile{
			Path:    f.Name,
			Content: content,
			ModTime: f.Modified,
		})
	}

@@ -220,13 +333,24 @@ func extractZip(archivePath string) ([]ExtractedFile, error) {
}

func extractSingleFile(path string) ([]ExtractedFile, error) {
	info, err := os.Stat(path)
	if err != nil {
		return nil, fmt.Errorf("stat file: %w", err)
	}
	f, err := os.Open(path)
	if err != nil {
		return nil, fmt.Errorf("open file: %w", err)
	}
	defer f.Close()

-	return extractSingleFileFromReader(f, filepath.Base(path))
+	files, err := extractSingleFileFromReader(f, filepath.Base(path))
+	if err != nil {
+		return nil, err
+	}
+	if len(files) > 0 {
+		files[0].ModTime = info.ModTime()
+	}
+	return files, nil
}

func extractSingleFileFromReader(r io.Reader, filename string) ([]ExtractedFile, error) {
@@ -234,16 +358,24 @@ func extractSingleFileFromReader(r io.Reader, filename string) ([]ExtractedFile,
	if err != nil {
		return nil, fmt.Errorf("read file content: %w", err)
	}
-	if len(content) > maxSingleFileSize {
-		return nil, fmt.Errorf("file too large: max %d bytes", maxSingleFileSize)
+	truncated := len(content) > maxSingleFileSize
+	if truncated {
+		content = content[:maxSingleFileSize]
	}

-	return []ExtractedFile{
-		{
-			Path:    filepath.Base(filename),
-			Content: content,
-		},
-	}, nil
+	file := ExtractedFile{
+		Path:    filepath.Base(filename),
+		Content: content,
+	}
+	if truncated {
+		file.Truncated = true
+		file.TruncatedMessage = fmt.Sprintf(
+			"file exceeded %d bytes and was truncated",
+			maxSingleFileSize,
+		)
+	}

+	return []ExtractedFile{file}, nil
}

// FindFileByPattern finds files matching pattern in extracted files

@@ -1,6 +1,8 @@
package parser

import (
	"archive/tar"
	"bytes"
	"os"
	"path/filepath"
	"strings"
@@ -46,3 +48,79 @@ func TestExtractArchiveTXT(t *testing.T) {
		t.Fatalf("content mismatch")
	}
}

func TestExtractArchiveFromReaderTXT_TruncatedWhenTooLarge(t *testing.T) {
	large := bytes.Repeat([]byte("a"), maxSingleFileSize+1024)
	files, err := ExtractArchiveFromReader(bytes.NewReader(large), "huge.log")
	if err != nil {
		t.Fatalf("extract huge txt from reader: %v", err)
	}
	if len(files) != 1 {
		t.Fatalf("expected 1 file, got %d", len(files))
	}

	f := files[0]
	if !f.Truncated {
		t.Fatalf("expected file to be marked as truncated")
	}
	if got := len(f.Content); got != maxSingleFileSize {
		t.Fatalf("expected truncated size %d, got %d", maxSingleFileSize, got)
	}
	if f.TruncatedMessage == "" {
		t.Fatalf("expected truncation message")
	}
}

func TestIsSupportedArchiveFilename(t *testing.T) {
	cases := []struct {
		name string
		want bool
	}{
		{name: "dump.tar.gz", want: true},
		{name: "nvidia-bug-report-1651124000923.log.gz", want: true},
		{name: "snapshot.zip", want: true},
		{name: "h3c_20250819.sds", want: true},
		{name: "report.log", want: true},
		{name: "xigmanas.txt", want: true},
		{name: "raw_export.json", want: false},
		{name: "archive.bin", want: false},
	}

	for _, tc := range cases {
		got := IsSupportedArchiveFilename(tc.name)
		if got != tc.want {
			t.Fatalf("IsSupportedArchiveFilename(%q)=%v, want %v", tc.name, got, tc.want)
		}
	}
}

func TestExtractArchiveFromReaderSDS(t *testing.T) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)

	payload := []byte("STARTTIME:0\nENDTIME:0\n")
	if err := tw.WriteHeader(&tar.Header{
		Name: "bmc/pack.info",
		Mode: 0o600,
		Size: int64(len(payload)),
	}); err != nil {
		t.Fatalf("write tar header: %v", err)
	}
	if _, err := tw.Write(payload); err != nil {
		t.Fatalf("write tar payload: %v", err)
	}
	if err := tw.Close(); err != nil {
		t.Fatalf("close tar writer: %v", err)
	}

	files, err := ExtractArchiveFromReader(bytes.NewReader(buf.Bytes()), "sample.sds")
	if err != nil {
		t.Fatalf("extract sds from reader: %v", err)
	}
	if len(files) != 1 {
		t.Fatalf("expected 1 extracted file, got %d", len(files))
	}
	if files[0].Path != "bmc/pack.info" {
		t.Fatalf("expected bmc/pack.info, got %q", files[0].Path)
	}
}

@@ -9,7 +9,7 @@ type VendorParser interface {
	// Name returns human-readable parser name
	Name() string

-	// Vendor returns vendor identifier (e.g., "inspur", "supermicro", "dell")
+	// Vendor returns vendor identifier (e.g., "inspur", "dell", "h3c_g6")
	Vendor() string

	// Version returns parser version string

@@ -3,6 +3,8 @@ package parser
import (
	"fmt"
	"io"
	"strings"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
)
@@ -62,11 +64,74 @@ func (p *BMCParser) parseFiles() error {

	// Preserve filename
	result.Filename = p.result.Filename

	appendExtractionWarnings(result, p.files)
	if result.CollectedAt.IsZero() {
		if ts := inferCollectedAtFromExtractedFiles(p.files); !ts.IsZero() {
			result.CollectedAt = ts.UTC()
		}
	}
	p.result = result

	return nil
}

func inferCollectedAtFromExtractedFiles(files []ExtractedFile) time.Time {
	var latestReliable time.Time
	var latestAny time.Time
	for _, f := range files {
		ts := f.ModTime
		if ts.IsZero() {
			continue
		}
		if latestAny.IsZero() || ts.After(latestAny) {
			latestAny = ts
		}
		// Ignore placeholder archive mtimes like 1980-01-01.
		if ts.Year() < 2000 {
			continue
		}
		if latestReliable.IsZero() || ts.After(latestReliable) {
			latestReliable = ts
		}
	}
	if !latestReliable.IsZero() {
		return latestReliable
	}
	return latestAny
}

func appendExtractionWarnings(result *models.AnalysisResult, files []ExtractedFile) {
	if result == nil {
		return
	}

	truncated := make([]string, 0)
	for _, f := range files {
		if !f.Truncated {
			continue
		}
		if f.TruncatedMessage != "" {
			truncated = append(truncated, fmt.Sprintf("%s: %s", f.Path, f.TruncatedMessage))
			continue
		}
		truncated = append(truncated, fmt.Sprintf("%s: content was truncated due to size limit", f.Path))
	}

	if len(truncated) == 0 {
		return
	}

	result.Events = append(result.Events, models.Event{
		Timestamp:   time.Now(),
		Source:      "LOGPile",
		EventType:   "Analysis Warning",
		Severity:    models.SeverityWarning,
		Description: "Input data was too large; analysis is partial and may be incomplete",
		RawData:     strings.Join(truncated, "; "),
	})
}

// Result returns the analysis result
func (p *BMCParser) Result() *models.AnalysisResult {
	return p.result

62	internal/parser/parser_test.go	Normal file
@@ -0,0 +1,62 @@
package parser

import (
	"testing"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestAppendExtractionWarnings(t *testing.T) {
	result := &models.AnalysisResult{
		Events: make([]models.Event, 0),
	}

	files := []ExtractedFile{
		{Path: "ok.log", Content: []byte("ok")},
		{Path: "big.log", Truncated: true, TruncatedMessage: "file exceeded size limit and was truncated"},
	}

	appendExtractionWarnings(result, files)

	if len(result.Events) != 1 {
		t.Fatalf("expected 1 warning event, got %d", len(result.Events))
	}
	ev := result.Events[0]
	if ev.Severity != models.SeverityWarning {
		t.Fatalf("expected warning severity, got %q", ev.Severity)
	}
	if ev.EventType != "Analysis Warning" {
		t.Fatalf("unexpected event type: %q", ev.EventType)
	}
	if ev.RawData == "" {
		t.Fatalf("expected warning details in RawData")
	}
}

func TestInferCollectedAtFromExtractedFiles_PrefersReliableMTime(t *testing.T) {
	files := []ExtractedFile{
		{Path: "a.log", ModTime: time.Date(1980, 1, 1, 0, 0, 0, 0, time.UTC)},
		{Path: "b.log", ModTime: time.Date(2025, 12, 12, 10, 14, 49, 0, time.FixedZone("EST", -5*3600))},
		{Path: "c.log", ModTime: time.Date(2026, 2, 28, 4, 18, 18, 0, time.FixedZone("UTC+8", 8*3600))},
	}

	got := inferCollectedAtFromExtractedFiles(files)
	want := files[2].ModTime
	if !got.Equal(want) {
		t.Fatalf("expected %s, got %s", want, got)
	}
}

func TestInferCollectedAtFromExtractedFiles_FallsBackToAnyMTime(t *testing.T) {
	files := []ExtractedFile{
		{Path: "a.log", ModTime: time.Date(1980, 1, 1, 0, 0, 0, 0, time.UTC)},
		{Path: "b.log", ModTime: time.Date(1970, 1, 2, 0, 0, 0, 0, time.UTC)},
	}

	got := inferCollectedAtFromExtractedFiles(files)
	want := files[0].ModTime
	if !got.Equal(want) {
		t.Fatalf("expected fallback %s, got %s", want, got)
	}
}

33	internal/parser/timezone.go	Normal file
@@ -0,0 +1,33 @@
package parser

import (
	"sync"
	"time"
)

const fallbackTimezoneName = "Europe/Moscow"

var (
	fallbackTimezoneOnce sync.Once
	fallbackTimezone     *time.Location
)

// DefaultArchiveLocation returns the timezone used for source timestamps
// that do not contain an explicit offset.
func DefaultArchiveLocation() *time.Location {
	fallbackTimezoneOnce.Do(func() {
		loc, err := time.LoadLocation(fallbackTimezoneName)
		if err != nil {
			fallbackTimezone = time.FixedZone("MSK", 3*60*60)
			return
		}
		fallbackTimezone = loc
	})
	return fallbackTimezone
}

// ParseInDefaultArchiveLocation parses timestamps without timezone information
// using Europe/Moscow as the assumed source timezone.
func ParseInDefaultArchiveLocation(layout, value string) (time.Time, error) {
	return time.ParseInLocation(layout, value, DefaultArchiveLocation())
}
96	internal/parser/vendors/README.md	vendored
@@ -1,96 +0,0 @@
# Vendor Parser Modules

Each server manufacturer uses its own format for BMC diagnostic archives.
This directory contains the parser modules for the different vendors.

## Module layout

```
vendors/
├── vendors.go        # Imports of all modules (add new ones here)
├── README.md         # This documentation
├── template/         # Template for a new module
│   └── parser.go.template
├── inspur/           # Inspur/Kaytus module
│   ├── parser.go     # Main parser + registration
│   ├── sdr.go        # SDR parsing (sensors)
│   ├── fru.go        # FRU parsing (serial numbers)
│   ├── asset.go      # asset.json parsing
│   └── syslog.go     # syslog parsing
├── supermicro/       # Future Supermicro module
├── dell/             # Future Dell iDRAC module
└── hpe/              # Future HPE iLO module
```

## How to add a new module

### 1. Create the module directory

```bash
mkdir -p internal/parser/vendors/VENDORNAME
```

### 2. Copy the template

```bash
cp internal/parser/vendors/template/parser.go.template \
   internal/parser/vendors/VENDORNAME/parser.go
```

### 3. Edit parser.go

- Replace `VENDORNAME` with the vendor identifier (e.g., `supermicro`)
- Replace `VENDOR_DESCRIPTION` with a description (e.g., `Supermicro`)
- Implement the `Detect()` method to recognize the format
- Implement the `Parse()` method to parse the data

### 4. Register the module

Add an import to `vendors/vendors.go`:

```go
import (
    _ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
    _ "git.mchus.pro/mchus/logpile/internal/parser/vendors/VENDORNAME" // New module
)
```

### 5. Done!

The module registers itself automatically at application startup via `init()`.

## The VendorParser interface

```go
type VendorParser interface {
    // Name returns the human-readable parser name
    Name() string

    // Vendor returns the vendor identifier
    Vendor() string

    // Detect checks whether this parser can handle the given files.
    // Returns a confidence of 0-100 (0 = no match, 100 = definitely this format).
    Detect(files []ExtractedFile) int

    // Parse parses the extracted files
    Parse(files []ExtractedFile) (*models.AnalysisResult, error)
}
```

## Tips for implementing Detect()

- Look for files/directories unique to the vendor
- Check file contents for characteristic markers
- Return a high confidence (70+) only on a confident match
- Several parsers may return >0; the one with the highest confidence wins

## Supported vendors

| Vendor | Identifier | Status | Tested on |
|--------|------------|--------|-----------|
| Inspur/Kaytus | `inspur` | ✅ Ready | KR4268X2 (onekeylog) |
| Supermicro | `supermicro` | ⏳ Planned | - |
| Dell iDRAC | `dell` | ⏳ Planned | - |
| HPE iLO | `hpe` | ⏳ Planned | - |
| Lenovo XCC | `lenovo` | ⏳ Planned | - |
1493	internal/parser/vendors/dell/parser.go	vendored	Normal file
File diff suppressed because it is too large
224	internal/parser/vendors/dell/parser_test.go	vendored	Normal file
@@ -0,0 +1,224 @@
package dell

import (
	"archive/zip"
	"bytes"
	"testing"

	"git.mchus.pro/mchus/logpile/internal/parser"
)

func TestDetectNestedTSRZip(t *testing.T) {
	inner := makeZipArchive(t, map[string][]byte{
		"tsr/metadata.json": []byte(`{"Make":"Dell Inc.","Model":"PowerEdge R750","ServiceTag":"G37Q064"}`),
		"tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml": []byte(`<CIM><MESSAGE><SIMPLEREQ/></MESSAGE></CIM>`),
	})

	p := &Parser{}
	score := p.Detect([]parser.ExtractedFile{
		{Path: "signature", Content: []byte("ok")},
		{Path: "TSR20241119143901_G37Q064.pl.zip", Content: inner},
	})
	if score < 80 {
		t.Fatalf("expected high detect score for nested TSR zip, got %d", score)
	}
}

func TestParseNestedTSRZip(t *testing.T) {
	const viewXML = `<CIM><MESSAGE><SIMPLEREQ>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SystemView">
<PROPERTY NAME="Manufacturer"><VALUE>Dell Inc.</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>PowerEdge R750</VALUE></PROPERTY>
<PROPERTY NAME="ServiceTag"><VALUE>G37Q064</VALUE></PROPERTY>
<PROPERTY NAME="BIOSVersionString"><VALUE>2.19.1</VALUE></PROPERTY>
<PROPERTY NAME="LifecycleControllerVersion"><VALUE>7.00.30.00</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_CPUView">
<PROPERTY NAME="FQDD"><VALUE>CPU.Socket.1</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>Intel(R) Xeon(R) Gold 6330</VALUE></PROPERTY>
<PROPERTY NAME="Manufacturer"><VALUE>Intel</VALUE></PROPERTY>
<PROPERTY NAME="NumberOfEnabledCores"><VALUE>28</VALUE></PROPERTY>
<PROPERTY NAME="NumberOfEnabledThreads"><VALUE>56</VALUE></PROPERTY>
<PROPERTY NAME="CurrentClockSpeed"><VALUE>2000</VALUE></PROPERTY>
<PROPERTY NAME="MaxClockSpeed"><VALUE>3100</VALUE></PROPERTY>
<PROPERTY NAME="PPIN"><VALUE>ABCD</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_NICView">
<PROPERTY NAME="FQDD"><VALUE>NIC.Slot.1-1-1</VALUE></PROPERTY>
<PROPERTY NAME="ProductName"><VALUE>Broadcom 57414 Dual Port 10/25GbE SFP28 Adapter</VALUE></PROPERTY>
<PROPERTY NAME="VendorName"><VALUE>Broadcom</VALUE></PROPERTY>
<PROPERTY NAME="CurrentMACAddress"><VALUE>00:11:22:33:44:55</VALUE></PROPERTY>
<PROPERTY NAME="SerialNumber"><VALUE>NICSERIAL1</VALUE></PROPERTY>
<PROPERTY NAME="FamilyVersion"><VALUE>22.80.17</VALUE></PROPERTY>
<PROPERTY NAME="PCIVendorID"><VALUE>0x14e4</VALUE></PROPERTY>
<PROPERTY NAME="PCIDeviceID"><VALUE>0x16d7</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_PowerSupplyView">
<PROPERTY NAME="FQDD"><VALUE>PSU.Slot.1</VALUE></PROPERTY>
<PROPERTY NAME="Model"><VALUE>D1400E-S0</VALUE></PROPERTY>
<PROPERTY NAME="Manufacturer"><VALUE>Dell</VALUE></PROPERTY>
<PROPERTY NAME="SerialNumber"><VALUE>PSUSERIAL1</VALUE></PROPERTY>
<PROPERTY NAME="FirmwareVersion"><VALUE>00.1A</VALUE></PROPERTY>
<PROPERTY NAME="TotalOutputPower"><VALUE>1400</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_VideoView">
<PROPERTY NAME="FQDD"><VALUE>Video.Slot.38-1</VALUE></PROPERTY>
<PROPERTY NAME="MarketingName"><VALUE>NVIDIA H100 PCIe</VALUE></PROPERTY>
<PROPERTY NAME="Description"><VALUE>GH100 [H100 PCIe]</VALUE></PROPERTY>
<PROPERTY NAME="Manufacturer"><VALUE>NVIDIA Corporation</VALUE></PROPERTY>
<PROPERTY NAME="PCIVendorID"><VALUE>10DE</VALUE></PROPERTY>
<PROPERTY NAME="PCIDeviceID"><VALUE>2331</VALUE></PROPERTY>
<PROPERTY NAME="BusNumber"><VALUE>74</VALUE></PROPERTY>
<PROPERTY NAME="DeviceNumber"><VALUE>0</VALUE></PROPERTY>
<PROPERTY NAME="FunctionNumber"><VALUE>0</VALUE></PROPERTY>
<PROPERTY NAME="SerialNumber"><VALUE>1793924039808</VALUE></PROPERTY>
<PROPERTY NAME="FirmwareVersion"><VALUE>96.00.AF.00.01</VALUE></PROPERTY>
<PROPERTY NAME="GPUGUID"><VALUE>bc681a6d4785dde08c21f49c46c05cc3</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
</SIMPLEREQ></MESSAGE></CIM>`

	const swXML = `<CIM><MESSAGE><SIMPLEREQ>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_SoftwareIdentity">
<PROPERTY NAME="ElementName"><VALUE>NIC.Slot.1-1-1</VALUE></PROPERTY>
<PROPERTY NAME="VersionString"><VALUE>22.80.17</VALUE></PROPERTY>
<PROPERTY NAME="ComponentType"><VALUE>Network</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
</SIMPLEREQ></MESSAGE></CIM>`

	const eventsXML = `<Log>
<Event AgentID="Lifecycle Controller" Category="System Health" Severity="Warning" Timestamp="2024-11-19T14:39:01-0800">
<MessageID>SYS1001</MessageID>
<Message>Link is down</Message>
<FQDD>NIC.Slot.1-1-1</FQDD>
</Event>
</Log>`

	const cimSensorXML = `<CIM><MESSAGE><SIMPLEREQ>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="DCIM_GPUSensor">
<PROPERTY NAME="DeviceID"><VALUE>Video.Slot.38-1</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryGPUTemperature"><VALUE>290</VALUE></PROPERTY>
<PROPERTY NAME="MemoryTemperature"><VALUE>440</VALUE></PROPERTY>
<PROPERTY NAME="PowerConsumption"><VALUE>295</VALUE></PROPERTY>
<PROPERTY NAME="ThermalAlertStatus"><VALUE>5</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
<VALUE.NAMEDINSTANCE><INSTANCE CLASSNAME="CIM_NumericSensor">
<PROPERTY NAME="ElementName"><VALUE>PS1 Voltage 1</VALUE></PROPERTY>
<PROPERTY NAME="CurrentReading"><VALUE>224.0</VALUE></PROPERTY>
<PROPERTY NAME="BaseUnits"><VALUE>5</VALUE></PROPERTY>
<PROPERTY NAME="UnitModifier"><VALUE>0</VALUE></PROPERTY>
<PROPERTY NAME="PrimaryStatus"><VALUE>5</VALUE></PROPERTY>
</INSTANCE></VALUE.NAMEDINSTANCE>
</SIMPLEREQ></MESSAGE></CIM>`

	inner := makeZipArchive(t, map[string][]byte{
		"tsr/metadata.json": []byte(`{
"Make":"Dell Inc.",
"Model":"PowerEdge R750",
"ServiceTag":"G37Q064",
"FirmwareVersion":"7.00.30.00",
"CollectionDateTime":"2024-11-19 14:39:01.000-0800"
}`),
		"tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml":             []byte(viewXML),
		"tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml": []byte(swXML),
		"tsr/hardware/sysinfo/inventory/sysinfo_CIM_Sensor.xml":            []byte(cimSensorXML),
		"tsr/hardware/sysinfo/lcfiles/curr_lclog.xml":                      []byte(eventsXML),
	})

	p := &Parser{}
	result, err := p.Parse([]parser.ExtractedFile{
		{Path: "signature", Content: []byte("ok")},
		{Path: "TSR20241119143901_G37Q064.pl.zip", Content: inner},
	})
	if err != nil {
		t.Fatalf("parse failed: %v", err)
	}
	if result.Hardware == nil {
		t.Fatalf("expected hardware section")
	}

	if got := result.Hardware.BoardInfo.Manufacturer; got != "Dell Inc." {
		t.Fatalf("unexpected board manufacturer: %q", got)
	}
	if got := result.Hardware.BoardInfo.ProductName; got != "PowerEdge R750" {
		t.Fatalf("unexpected board product: %q", got)
	}
	if got := result.Hardware.BoardInfo.SerialNumber; got != "G37Q064" {
		t.Fatalf("unexpected service tag: %q", got)
	}

	if len(result.Hardware.CPUs) != 1 {
		t.Fatalf("expected 1 cpu, got %d", len(result.Hardware.CPUs))
	}
	if got := result.Hardware.CPUs[0].Model; got != "Intel(R) Xeon(R) Gold 6330" {
		t.Fatalf("unexpected cpu model: %q", got)
	}

	if len(result.Hardware.NetworkAdapters) != 1 {
		t.Fatalf("expected 1 network adapter, got %d", len(result.Hardware.NetworkAdapters))
	}
	adapter := result.Hardware.NetworkAdapters[0]
	if adapter.Vendor != "Broadcom" {
		t.Fatalf("unexpected nic vendor: %q", adapter.Vendor)
	}
	if adapter.Firmware != "22.80.17" {
|
||||
t.Fatalf("unexpected nic firmware: %q", adapter.Firmware)
|
||||
}
|
||||
if adapter.SerialNumber != "NICSERIAL1" {
|
||||
t.Fatalf("unexpected nic serial: %q", adapter.SerialNumber)
|
||||
}
|
||||
|
||||
if len(result.Hardware.PowerSupply) != 1 {
|
||||
t.Fatalf("expected 1 psu, got %d", len(result.Hardware.PowerSupply))
|
||||
}
|
||||
psu := result.Hardware.PowerSupply[0]
|
||||
if psu.Model != "D1400E-S0" {
|
||||
t.Fatalf("unexpected psu model: %q", psu.Model)
|
||||
}
|
||||
if psu.Firmware != "00.1A" {
|
||||
t.Fatalf("unexpected psu firmware: %q", psu.Firmware)
|
||||
}
|
||||
|
||||
if len(result.Hardware.Firmware) == 0 {
|
||||
t.Fatalf("expected firmware entries")
|
||||
}
|
||||
if len(result.Hardware.GPUs) != 1 {
|
||||
t.Fatalf("expected 1 gpu, got %d", len(result.Hardware.GPUs))
|
||||
}
|
||||
if got := result.Hardware.GPUs[0].Model; got != "NVIDIA H100 PCIe" {
|
||||
t.Fatalf("unexpected gpu model: %q", got)
|
||||
}
|
||||
if got := result.Hardware.GPUs[0].SerialNumber; got != "1793924039808" {
|
||||
t.Fatalf("unexpected gpu serial: %q", got)
|
||||
}
|
||||
if got := result.Hardware.GPUs[0].Temperature; got != 29 {
|
||||
t.Fatalf("unexpected gpu temperature: %d", got)
|
||||
}
|
||||
if len(result.Sensors) == 0 {
|
||||
t.Fatalf("expected sensors from CIM_Sensor")
|
||||
}
|
||||
if len(result.Events) != 1 {
|
||||
t.Fatalf("expected one lifecycle event, got %d", len(result.Events))
|
||||
}
|
||||
if got := string(result.Events[0].Severity); got != "warning" {
|
||||
t.Fatalf("unexpected event severity: %q", got)
|
||||
}
|
||||
}
|
||||
|
||||
func makeZipArchive(t *testing.T, files map[string][]byte) []byte {
|
||||
t.Helper()
|
||||
var buf bytes.Buffer
|
||||
zw := zip.NewWriter(&buf)
|
||||
for name, content := range files {
|
||||
w, err := zw.Create(name)
|
||||
if err != nil {
|
||||
t.Fatalf("create zip entry %s: %v", name, err)
|
||||
}
|
||||
if _, err := w.Write(content); err != nil {
|
||||
t.Fatalf("write zip entry %s: %v", name, err)
|
||||
}
|
||||
}
|
||||
if err := zw.Close(); err != nil {
|
||||
t.Fatalf("close zip: %v", err)
|
||||
}
|
||||
return buf.Bytes()
|
||||
}
|
||||
72 internal/parser/vendors/generic/README.md (vendored)
@@ -1,72 +0,0 @@
# Generic Text File Parser

Fallback parser for text files that no other parser has recognized.

## Purpose

This parser handles any text files that:
- Are not vendor-specific archives
- Contain textual information (not binary data)
- Are standalone .gz files or plain text files

## Priority

**Confidence score: 15** (low priority)

This parser is used only when no other parser has matched with a higher confidence.

## Supported files

### Automatically recognized types

1. **NVIDIA Bug Report** (`nvidia-bug-report-*.log.gz`)
   - Extracts NVIDIA driver information
   - Finds GPU devices
   - Reports the driver version

2. **Any text files**
   - Verifies that the content is text (not binary data)
   - Reports basic information about the file

## Extracted data

### Events

- **Text File**: basic information about the uploaded file
- **Driver Info**: NVIDIA driver information (for nvidia-bug-report)
- **GPU Device**: detected GPU devices (for nvidia-bug-report)

## Usage example

```bash
# Run with an nvidia-bug-report
./logpile --file nvidia-bug-report-*.log.gz

# Run with any text file
./logpile --file system.log.gz
```

## Versioning

**Current parser version:** 1.0.0

## Limitations

1. This parser provides only basic information
2. It does not perform deep content analysis
3. For detailed analysis of specific logs, creating a dedicated parser is recommended

## Extending

To add support for a new file type:

1. Add a check to the `Parse()` function
2. Create a `parseXXX()` function to extract the type-specific information
3. Bump the parser version

Example:
```go
if strings.Contains(strings.ToLower(file.Path), "custom-log") {
    parseCustomLog(content, result)
}
```
2 internal/parser/vendors/generic/parser.go (vendored)
@@ -10,7 +10,7 @@ import (
 )

 // parserVersion - version of this parser module
-const parserVersion = "1.0.0"
+const parserVersion = "1.1"

 func init() {
 	parser.Register(&Parser{})
3516 internal/parser/vendors/h3c/parser.go (vendored, new file)
File diff suppressed because it is too large
962 internal/parser/vendors/h3c/parser_test.go (vendored, new file)
@@ -0,0 +1,962 @@
package h3c

import (
	"strings"
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

func TestDetectH3C_GenerationRouting(t *testing.T) {
	g5 := &G5Parser{}
	g6 := &G6Parser{}

	g5Files := []parser.ExtractedFile{
		{Path: "bmc/pack.info", Content: []byte("STARTTIME:0")},
		{Path: "static/FRUInfo.ini", Content: []byte("[Baseboard]\nBoard Manufacturer=H3C\n")},
		{Path: "static/hardware_info.ini", Content: []byte("[Processors: Processor 1]\nModel: Intel Xeon\n")},
		{Path: "static/hardware.info", Content: []byte("[Disk_0_Front_NA]\nSerialNumber=DISK-0\n")},
		{Path: "static/firmware_version.ini", Content: []byte("[System board]\nBIOS Version: 5.59\n")},
		{Path: "user/test1.csv", Content: []byte("Record Time Stamp,DescInfo\n2025-01-01 00:00:00,foo\n")},
	}
	if gotG5, gotG6 := g5.Detect(g5Files), g6.Detect(g5Files); gotG5 <= gotG6 {
		t.Fatalf("expected G5 confidence > G6 for G5 sample, got g5=%d g6=%d", gotG5, gotG6)
	}

	g6Files := []parser.ExtractedFile{
		{Path: "bmc/pack.info", Content: []byte("STARTTIME:0")},
		{Path: "static/FRUInfo.ini", Content: []byte("[Baseboard]\nBoard Manufacturer=H3C\n")},
		{Path: "static/board_info.ini", Content: []byte("[System board]\nBoardMfr=H3C\n")},
		{Path: "static/firmware_version.json", Content: []byte(`{"BIOS":{"Firmware Name":"BIOS","Firmware Version":"6.10"}}`)},
		{Path: "static/CPUDetailInfo.xml", Content: []byte("<Root><CPU1><Model>X</Model></CPU1></Root>")},
		{Path: "static/MemoryDetailInfo.xml", Content: []byte("<Root><DIMM1><Name>A0</Name></DIMM1></Root>")},
		{Path: "user/Sel.json", Content: []byte(`{"Id":1}`)},
	}
	if gotG5, gotG6 := g5.Detect(g6Files), g6.Detect(g6Files); gotG6 <= gotG5 {
		t.Fatalf("expected G6 confidence > G5 for G6 sample, got g5=%d g6=%d", gotG5, gotG6)
	}
}

func TestParseH3CG6_RaidAndNVMeEnrichment(t *testing.T) {
	p := &G6Parser{}
	files := []parser.ExtractedFile{
		{
			Path: "static/storage_disk.ini",
			Content: []byte(`[Disk_000]
DiskSlotDesc=Front0
Present=YES
SerialNumber=SER-0
`),
		},
		{
			Path: "static/raid.json",
			Content: []byte(`{
  "RaidConfig": {
    "CtrlInfo": [
      {
        "CtrlSlot": 1,
        "CtrlName": "RAID-LSI-9560",
        "LDInfo": [
          {
            "LDID": "0",
            "LDName": "VD0",
            "RAIDLevel": "1",
            "CapacityBytes": 1000000000,
            "Status": "Optimal"
          }
        ]
      }
    ]
  }
}`),
		},
		{
			Path: "static/Storage_RAID-LSI-9560-LP-8i-4GB[1].txt",
			Content: []byte(`Controller Information
------------------------------------------------------------------------
AssetTag : RAID-LSI-9560

Logical Device Information
------------------------------------------------------------------------
LDID : 0
Name : VD0
RAID Level : 1
CapacityBytes : 1000000000
Status : Optimal

Physical Device Information
------------------------------------------------------------------------
ConnectionID : 0
Position : Front0
StatusIndicator : OK
Protocol : SATA
MediaType : SSD
Manufacturer : Samsung
Model : PM893
Revision : GDC1
SerialNumber : SER-0
CapacityBytes : 480000000000

ConnectionID : 1
Position : Front1
StatusIndicator : OK
Protocol : SATA
MediaType : SSD
Manufacturer : Samsung
Model : PM893
Revision : GDC1
SerialNumber : SER-1
CapacityBytes : 480000000000
`),
		},
		{
			Path: "static/NVMe_info.txt",
			Content: []byte(`[NVMe_0]
Present=YES
DiskSlotDesc=Front2
Model=INTEL SSDPE2KX010T8
SerialNumber=NVME-1
Firmware=V100
CapacityBytes=1000204886016
Interface=NVMe
Status=OK
`),
		},
	}

	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("parse failed: %v", err)
	}
	if result.Hardware == nil {
		t.Fatalf("expected hardware section")
	}

	if len(result.Hardware.Volumes) != 1 {
		t.Fatalf("expected 1 volume, got %d", len(result.Hardware.Volumes))
	}
	vol := result.Hardware.Volumes[0]
	if vol.RAIDLevel != "RAID1" {
		t.Fatalf("expected RAID1 level, got %q", vol.RAIDLevel)
	}
	if vol.SizeGB != 1 {
		t.Fatalf("expected 1GB logical volume, got %d", vol.SizeGB)
	}

	if len(result.Hardware.Storage) != 3 {
		t.Fatalf("expected 3 unique storage devices, got %d", len(result.Hardware.Storage))
	}

	var front0 *models.Storage
	var nvme *models.Storage
	for i := range result.Hardware.Storage {
		s := &result.Hardware.Storage[i]
		if strings.EqualFold(s.SerialNumber, "SER-0") {
			front0 = s
		}
		if strings.EqualFold(s.SerialNumber, "NVME-1") {
			nvme = s
		}
	}
	if front0 == nil {
		t.Fatalf("expected merged Front0 disk by serial SER-0")
	}
	if front0.Model != "PM893" {
		t.Fatalf("expected Front0 model PM893, got %q", front0.Model)
	}
	if front0.SizeGB != 480 {
		t.Fatalf("expected Front0 size 480GB, got %d", front0.SizeGB)
	}
	if nvme == nil {
		t.Fatalf("expected NVMe disk by serial NVME-1")
	}
	if nvme.Type != "nvme" {
		t.Fatalf("expected nvme type, got %q", nvme.Type)
	}
}

func TestParseH3CG6(t *testing.T) {
	p := &G6Parser{}

	files := []parser.ExtractedFile{
		{
			Path: "static/FRUInfo.ini",
			Content: []byte(`[Baseboard]
Board Manufacturer=H3C
Board Product Name=RS36M2C6SB
Product Product Name=H3C UniServer R4700 G6
Product Serial Number=210235A4FYH257000010
Product Part Number=0235A4FY
`),
		},
		{
			Path: "static/firmware_version.json",
			Content: []byte(`{
  "BMCP": {"Firmware Name":"HDM","Firmware Version":"1.83","Location":"bmc card","Part Model":"-"},
  "BIOS": {"Firmware Name":"BIOS","Firmware Version":"6.10.53","Location":"system board","Part Model":"-"}
}`),
		},
		{
			Path: "static/CPUDetailInfo.xml",
			Content: []byte(`<Root>
  <CPU1>
    <Status>Presence</Status>
    <Model>INTEL(R) XEON(R) GOLD 6542Y</Model>
    <ProcessorSpeed>0xb54</ProcessorSpeed>
    <ProcessorMaxSpeed>0x1004</ProcessorMaxSpeed>
    <TotalCores>0x18</TotalCores>
    <TotalThreads>0x30</TotalThreads>
    <SerialNumber>68-5C-81-C1-0E-A3-4E-40</SerialNumber>
    <PPIN>68-5C-81-C1-0E-A3-4E-40</PPIN>
  </CPU1>
</Root>`),
		},
		{
			Path: "static/MemoryDetailInfo.xml",
			Content: []byte(`<Root>
  <DIMM1>
    <Status>Presence</Status>
    <Name>CPU1_CH1_D0 (A0)</Name>
    <PartNumber>M321R8GA0PB0-CWMXJ</PartNumber>
    <DIMMTech>RDIMM</DIMMTech>
    <SerialNumber>80CE032519135C82ED</SerialNumber>
    <DIMMRanks>0x2</DIMMRanks>
    <DIMMSize>0x10000</DIMMSize>
    <CurFreq>0x1130</CurFreq>
    <MaxFreq>0x15e0</MaxFreq>
    <DIMMSilk>A0</DIMMSilk>
  </DIMM1>
</Root>`),
		},
		{
			Path: "static/storage_disk.ini",
			Content: []byte(`[Disk_000]
SerialNumber=S6KLNN0Y516813
DiskSlotDesc=Front0
Present=YES
`),
		},
		{
			Path: "static/net_cfg.ini",
			Content: []byte(`[Network Configuration]
eth0 Link encap:Ethernet HWaddr 30:C6:D7:94:54:F6
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth0.2 Link encap:Ethernet HWaddr 30:C6:D7:94:54:F6
inet6 addr: fe80::32c6:d7ff:fe94:54f6/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1496 Metric:1

eth1 Link encap:Ethernet HWaddr 30:C6:D7:94:54:F5
inet addr:10.201.129.0 Bcast:10.201.143.255 Mask:255.255.240.0
inet6 addr: fe80::32c6:d7ff:fe94:54f5/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
`),
		},
		{
			Path: "static/psu_cfg.ini",
			Content: []byte(`[Psu0]
SN=210231AGUNH257001569
Max_Power(W)=1600
Manufacturer=Great Wall
Power Status=Input Normal, Output Normal
Present_Status=Present
Power_ID=1
Model=GW-CRPS1600D2
Version=03.02.00

[Psu1]
Manufacturer=Great Wall
Power_ID=2
Version=03.02.00
Power Status=Input Normal, Output Normal
SN=210231AGUNH257001570
Model=GW-CRPS1600D2
Present_Status=Present
Max_Power(W)=1600
`),
		},
		{
			Path: "static/hardware_info.ini",
			Content: []byte(`[Ethernet adapters: Port 1]
Device Type : NIC
Network Port : Port 1
Location : PCIE-[1]
MAC Address : E4:3D:1A:6F:B0:30
Speed : 8.0GT/s
Product Name : NIC-BCM957414-F-B-25Gb-2P
[Ethernet adapters: Port 2]
Device Type : NIC
Network Port : Port 2
Location : PCIE-[1]
MAC Address : E4:3D:1A:6F:B0:31
Speed : 8.0GT/s
Product Name : NIC-BCM957414-F-B-25Gb-2P

[PCIe Card: PCIe 1]
Location : 1
Product Name : NIC-BCM957414-F-B-25Gb-2P
Status : Normal
Vendor ID : 0x14E4
Device ID : 0x16D7
Serial Number : NICSN-G6-001
Part Number : NICPN-G6-001
Firmware Version : 22.35.1010
`),
		},
		{
			Path: "static/sensor_info.ini",
			Content: []byte(`Sensor Name | Reading | Unit | Status| Crit low
Inlet_Temp | 20.000 | degrees C | ok | na
CPU1_Status | 0x0 | discrete | 0x8080| na
`),
		},
		{
			Path: "user/Sel.json",
			Content: []byte(`
{
  "Created": "2025-07-14 03:34:18 UTC+08:00",
  "Severity": "Info",
  "EntryCode": "Asserted",
  "EntryType": "Event",
  "Id": 1,
  "Level": "Info",
  "Message": "Processor Presence detected",
  "SensorName": "CPU1_Status",
  "SensorType": "Processor"
},
{
  "Created": "2025-07-14 20:56:45 UTC+08:00",
  "Severity": "Critical",
  "EntryCode": "Asserted",
  "EntryType": "Event",
  "Id": 2,
  "Level": "Critical",
  "Message": "Power Supply AC lost",
  "SensorName": "PSU1_Status",
  "SensorType": "Power Supply"
}
`),
		},
	}

	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("parse failed: %v", err)
	}

	if result.Hardware == nil {
		t.Fatalf("expected hardware section")
	}
	if result.Hardware.BoardInfo.Manufacturer != "H3C" {
		t.Fatalf("unexpected board manufacturer: %q", result.Hardware.BoardInfo.Manufacturer)
	}
	if result.Hardware.BoardInfo.ProductName != "H3C UniServer R4700 G6" {
		t.Fatalf("unexpected board product: %q", result.Hardware.BoardInfo.ProductName)
	}
	if result.Hardware.BoardInfo.SerialNumber != "210235A4FYH257000010" {
		t.Fatalf("unexpected board serial: %q", result.Hardware.BoardInfo.SerialNumber)
	}

	if len(result.Hardware.Firmware) < 2 {
		t.Fatalf("expected firmware entries, got %d", len(result.Hardware.Firmware))
	}
	if len(result.Hardware.CPUs) != 1 {
		t.Fatalf("expected 1 cpu, got %d", len(result.Hardware.CPUs))
	}
	if result.Hardware.CPUs[0].Cores != 24 {
		t.Fatalf("expected 24 cores, got %d", result.Hardware.CPUs[0].Cores)
	}

	if len(result.Hardware.Memory) != 1 {
		t.Fatalf("expected 1 dimm, got %d", len(result.Hardware.Memory))
	}
	if result.Hardware.Memory[0].SizeMB != 65536 {
		t.Fatalf("expected 65536MB, got %d", result.Hardware.Memory[0].SizeMB)
	}

	if len(result.Hardware.Storage) != 1 {
		t.Fatalf("expected 1 disk, got %d", len(result.Hardware.Storage))
	}
	if result.Hardware.Storage[0].SerialNumber != "S6KLNN0Y516813" {
		t.Fatalf("unexpected disk serial: %q", result.Hardware.Storage[0].SerialNumber)
	}
	if len(result.Hardware.PowerSupply) != 2 {
		t.Fatalf("expected 2 PSUs from psu_cfg.ini, got %d", len(result.Hardware.PowerSupply))
	}
	if result.Hardware.PowerSupply[0].WattageW == 0 {
		t.Fatalf("expected PSU wattage parsed, got 0")
	}

	if len(result.Hardware.NetworkAdapters) != 1 {
		t.Fatalf("expected 1 host network adapter from hardware_info.ini, got %d", len(result.Hardware.NetworkAdapters))
	}
	macs := make(map[string]struct{})
	var hostNIC models.NetworkAdapter
	var hostNICFound bool
	for _, nic := range result.Hardware.NetworkAdapters {
		if len(nic.MACAddresses) == 0 {
			t.Fatalf("expected MAC on network adapter %+v", nic)
		}
		for _, mac := range nic.MACAddresses {
			macs[strings.ToLower(mac)] = struct{}{}
		}
		if strings.EqualFold(nic.Slot, "PCIe 1") && strings.Contains(strings.ToLower(nic.Model), "bcm957414") {
			hostNIC = nic
			hostNICFound = true
		}
	}
	if !hostNICFound {
		t.Fatalf("expected host NIC from hardware_info.ini, got %+v", result.Hardware.NetworkAdapters)
	}
	if _, ok := macs["e4:3d:1a:6f:b0:30"]; !ok {
		t.Fatalf("expected host NIC MAC e4:3d:1a:6f:b0:30 in adapters, got %+v", result.Hardware.NetworkAdapters)
	}
	if _, ok := macs["e4:3d:1a:6f:b0:31"]; !ok {
		t.Fatalf("expected host NIC MAC e4:3d:1a:6f:b0:31 in adapters, got %+v", result.Hardware.NetworkAdapters)
	}
	if !strings.Contains(strings.ToLower(hostNIC.Vendor), "broadcom") {
		t.Fatalf("expected host NIC vendor enrichment from Vendor ID, got %q", hostNIC.Vendor)
	}
	if hostNIC.SerialNumber != "NICSN-G6-001" {
		t.Fatalf("expected host NIC serial from PCIe card section, got %q", hostNIC.SerialNumber)
	}
	if hostNIC.PartNumber != "NICPN-G6-001" {
		t.Fatalf("expected host NIC part number from PCIe card section, got %q", hostNIC.PartNumber)
	}
	if hostNIC.Firmware != "22.35.1010" {
		t.Fatalf("expected host NIC firmware from PCIe card section, got %q", hostNIC.Firmware)
	}

	if len(result.Sensors) != 2 {
		t.Fatalf("expected 2 sensors, got %d", len(result.Sensors))
	}
	if result.Sensors[0].Name != "Inlet_Temp" {
		t.Fatalf("unexpected first sensor: %q", result.Sensors[0].Name)
	}

	if len(result.Events) != 2 {
		t.Fatalf("expected 2 events, got %d", len(result.Events))
	}
	if result.Events[0].Timestamp.Year() != 2025 || result.Events[0].Timestamp.Month() != 7 {
		t.Fatalf("expected SEL timestamp from payload, got %s", result.Events[0].Timestamp)
	}
	if result.Events[1].Severity != models.SeverityCritical {
		t.Fatalf("expected critical severity for AC lost event, got %q", result.Events[1].Severity)
	}
}

func TestParseH3CG5_PCIeArgumentsEnrichesNonNVMeStorage(t *testing.T) {
	p := &G5Parser{}
	files := []parser.ExtractedFile{
		{
			Path: "static/storage_disk.ini",
			Content: []byte(`[Disk_000]
DiskSlotDesc=Front slot 3
Present=YES
SerialNumber=SAT-03
`),
		},
		{
			Path: "static/NVMe_info.txt",
			Content: []byte(`[NVMe_0]
Present=YES
DiskSlotDesc=Front slot 108
SerialNumber=NVME-108
`),
		},
		{
			Path: "static/PCIe_arguments_table.xml",
			Content: []byte(`<root>
  <PCIE100>
    <base_args>
      <type>SSD</type>
      <name>SSD-SATA-960G</name>
    </base_args>
    <type_get_args>
      <bios_args>
        <vendor_id>0x144D</vendor_id>
      </bios_args>
    </type_get_args>
  </PCIE100>
  <PCIE200>
    <base_args>
      <type>SSD</type>
      <name>SSD-3.84T-NVMe-SFF</name>
    </base_args>
    <type_get_args>
      <bios_args>
        <vendor_id>0x144D</vendor_id>
      </bios_args>
    </type_get_args>
  </PCIE200>
</root>`),
		},
	}

	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("parse failed: %v", err)
	}
	if result.Hardware == nil {
		t.Fatalf("expected hardware section")
	}

	if len(result.Hardware.Storage) != 2 {
		t.Fatalf("expected 2 storage devices, got %d", len(result.Hardware.Storage))
	}

	var sata *models.Storage
	var nvme *models.Storage
	for i := range result.Hardware.Storage {
		s := &result.Hardware.Storage[i]
		switch s.SerialNumber {
		case "SAT-03":
			sata = s
		case "NVME-108":
			nvme = s
		}
	}

	if sata == nil {
		t.Fatalf("expected SATA storage SAT-03")
	}
	if sata.Model != "SSD-SATA-960G" {
		t.Fatalf("expected SATA model enrichment from PCIe table, got %q", sata.Model)
	}
	if !strings.Contains(strings.ToLower(sata.Manufacturer), "samsung") {
		t.Fatalf("expected SATA vendor enrichment to Samsung, got %q", sata.Manufacturer)
	}

	if nvme == nil {
		t.Fatalf("expected NVMe storage NVME-108")
	}
	if nvme.Model != "SSD-3.84T-NVMe-SFF" {
		t.Fatalf("expected NVMe model enrichment from PCIe table, got %q", nvme.Model)
	}
	if !strings.Contains(strings.ToLower(nvme.Manufacturer), "samsung") {
		t.Fatalf("expected NVMe vendor enrichment to Samsung, got %q", nvme.Manufacturer)
	}
}

func TestParseH3CG5_VariantLayout(t *testing.T) {
|
||||
p := &G5Parser{}
|
||||
|
||||
files := []parser.ExtractedFile{
|
||||
{
|
||||
Path: "static/FRUInfo.ini",
|
||||
Content: []byte(`[Baseboard]
|
||||
Board Manufacturer=H3C
|
||||
Product Product Name=H3C UniServer R4900 G5
|
||||
Product Serial Number=02A6AX5231C003VM
|
||||
`),
|
||||
},
|
||||
{
|
||||
Path: "static/firmware_version.ini",
|
||||
Content: []byte(`[System board]
|
||||
BIOS Version : 5.59 V100R001B05D078
|
||||
ME Version : 4.4.4.202
|
||||
HDM Version : 3.34.01 HDM V100R001B05D078SP01
|
||||
CPLD Version : V00C
|
||||
`),
|
||||
},
|
||||
{
|
||||
Path: "static/board_cfg.ini",
|
||||
Content: []byte(`[Board Type]
|
||||
Board Type : R4900 G5
|
||||
|
||||
[Board Version]
|
||||
Board Version : VER.D
|
||||
|
||||
[Customer ID]
|
||||
CustomerID : 255
|
||||
|
||||
[OEM ID]
|
||||
OEM Flag : 1
|
||||
`),
|
||||
},
|
||||
{
|
||||
Path: "static/hardware_info.ini",
|
||||
Content: []byte(`[Processors: Processor 1]
|
||||
Model : Intel(R) Xeon(R) Gold 6342 CPU @ 2.80GHz
|
||||
Status : Normal
|
||||
Frequency : 2800 MHz
|
||||
Cores : 24
|
||||
Threads : 48
|
||||
L1 Cache : 1920 KB
|
||||
L2 Cache : 30720 KB
|
||||
L3 Cache : 36864 KB
|
||||
CPU PPIN : 49-A9-50-C0-15-9F-2D-DC
|
||||
|
||||
[Processors: Processor 2]
|
||||
Model : Intel(R) Xeon(R) Gold 6342 CPU @ 2.80GHz
|
||||
Status : Normal
|
||||
Frequency : 2800 MHz
|
||||
Cores : 24
|
||||
Threads : 48
|
||||
CPU PPIN : 49-AC-3D-BF-85-7F-17-58
|
||||
|
||||
[Memory Details: Dimm Index 0]
|
||||
Location : Processor 1
|
||||
Channel : 1
|
||||
Socket ID : A0
|
||||
Status : Normal
|
||||
Size : 65536 MB
|
||||
Maximum Frequency : 3200 MHz
|
||||
Type : DDR4
|
||||
Ranks : 2R DIMM
|
||||
Technology : RDIMM
|
||||
Part Number : M393A8G40AB2-CWE
|
||||
Manufacture : Samsung
|
||||
Serial Number : S02K0D0243351D7079
|
||||
|
||||
[Memory Details: Dimm Index 16]
|
||||
Location : Processor 2
|
||||
Channel : 1
|
||||
Socket ID : A0
|
||||
Status : Normal
|
||||
Size : 65536 MB
|
||||
Maximum Frequency : 3200 MHz
|
||||
Type : DDR4
|
||||
Ranks : 2R DIMM
|
||||
Technology : RDIMM
|
||||
Part Number : M393A8G40AB2-CWE
|
||||
Manufacture : Samsung
|
||||
Serial Number : S02K0D0243351D73F0
|
||||
|
||||
[Ethernet adapters: Port 1]
|
||||
Device Type : NIC
|
||||
Network Port : Port 1
|
||||
Location : PCIE-[1]
|
||||
MAC Address : E4:3D:1A:6F:B0:30
|
||||
Speed : 8.0GT/s
|
||||
Product Name : NIC-BCM957414-F-B-25Gb-2P
|
||||
[Ethernet adapters: Port 2]
|
||||
Device Type : NIC
|
||||
Network Port : Port 2
|
||||
Location : PCIE-[1]
|
||||
MAC Address : E4:3D:1A:6F:B0:31
|
||||
Speed : 8.0GT/s
|
||||
Product Name : NIC-BCM957414-F-B-25Gb-2P
|
||||
|
||||
[Ethernet adapters: Port 1]
|
||||
Device Type : NIC
|
||||
Network Port : Port 1
|
||||
Location : PCIE-[4]
|
||||
MAC Address : E8:EB:D3:4F:2E:90
|
||||
Speed : 8.0GT/s
|
||||
Product Name : NIC-MCX512A-ACAT-2*25Gb-F
|
||||
[Ethernet adapters: Port 2]
|
||||
Device Type : NIC
|
||||
Network Port : Port 2
|
||||
Location : PCIE-[4]
|
||||
MAC Address : E8:EB:D3:4F:2E:91
|
||||
Speed : 8.0GT/s
|
||||
Product Name : NIC-MCX512A-ACAT-2*25Gb-F
|
||||
|
||||
[PCIe Card: PCIe 1]
|
||||
Location : 1
|
||||
Product Name : NIC-BCM957414-F-B-25Gb-2P
|
||||
Status : Normal
|
||||
Vendor ID : 0x14E4
|
||||
Device ID : 0x16D7
|
||||
Serial Number : NICSN-G5-001
|
||||
Part Number : NICPN-G5-001
|
||||
Firmware Version : 21.80.1
|
||||
|
||||
[PCIe Card: PCIe 4]
|
||||
Location : 4
|
||||
Product Name : NIC-MCX512A-ACAT-2*25Gb-F
|
||||
Status : Normal
|
||||
Vendor ID : 0x15B3
|
||||
Device ID : 0x1017
|
||||
Serial Number : NICSN-G5-004
|
||||
Part Number : NICPN-G5-004
|
||||
Firmware Version : 28.33.15
|
||||
`),
|
||||
},
|
||||
{
|
||||
Path: "static/hardware.info",
|
||||
Content: []byte(`[Disk_0_Front_NA]
|
||||
Present=YES
|
||||
SlotNum=0
|
||||
FrontOrRear=Front
|
||||
SerialNumber=22443C4EE184
|
||||
|
||||
[Nvme_Front slot 21]
|
||||
Present=YES
|
||||
NvmePhySlot=Front slot 21
|
||||
SlotNum=121
|
||||
SerialNumber=NVME-21
|
||||
|
||||
[Nvme_255_121]
|
||||
Present=YES
|
||||
SlotNum=121
|
||||
SerialNumber=NVME-21
|
||||
`),
|
||||
},
|
||||
{
|
||||
Path: "static/raid.json",
|
||||
Content: []byte(`{
|
||||
"RAIDCONFIG": {
|
||||
"Ctrl info": [
|
||||
{
|
||||
"CtrlDevice Slot": 3,
|
||||
"CtrlDevice Name": "AVAGO MegaRAID SAS 9460-8i",
|
||||
"LDInfo": [
|
||||
{
|
||||
"LD ID": 0,
|
||||
"LD_name": "SystemRAID",
|
||||
"RAID_level(RAID 0,RAID 1,RAID 5,RAID 6,RAID 00,RAID 10,RAID 50,RAID 60)": "RAID1",
|
||||
"Logical_capicity(per 512byte)": 936640512
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"CtrlDevice Slot": 6,
|
||||
"CtrlDevice Name": "MegaRAID 9560-16i 8GB",
|
||||
"LDInfo": [
|
||||
{
|
||||
"LD ID": 0,
|
||||
"LD_name": "DataRAID",
|
||||
"RAID_level(RAID 0,RAID 1,RAID 5,RAID 6,RAID 00,RAID 10,RAID 50,RAID 60)": "RAID50",
|
||||
"Logical_capicity(per 512byte)": 90004783104
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
}`),
|
||||
},
|
||||
{
|
||||
Path: "static/Raid_BP_Conf_Info.ini",
|
||||
Content: []byte(`[BP Information]
|
||||
Description | BP TYPE | I2cPort | BpConnectorNum | FrontOrRear | Node Num | DiskSlotRange |
|
||||
8SFF SAS/SATA | BP_G5_8SFF | AUX_1 | ~ | ~ | ~ | ~ |
|
||||
8SFF SAS/SATA | BP_G5_8SFF | AUX_2 | ~ | ~ | ~ | ~ |
|
||||
8SFF SAS/SATA | BP_G5_8SFF | AUX_3 | ~ | ~ | ~ | ~ |
|
||||
|
||||
[RAID Information]
|
||||
PCIE SLOT | RAID SAS_NUM |
|
||||
3 | 2 |
|
||||
6 | 4 |
|
||||
`),
|
||||
},
|
||||
{
|
||||
Path: "static/PCIe_arguments_table.xml",
|
||||
Content: []byte(`<root>
|
||||
<PCIE100>
|
||||
<base_args>
|
||||
<type>SSD</type>
|
||||
<name>SSD-1.92T/3.84T-NVMe-EV-SFF-sa</name>
|
||||
</base_args>
|
||||
<type_get_args>
|
||||
<bios_args>
|
||||
<vendor_id>0x144D</vendor_id>
|
||||
</bios_args>
|
||||
</type_get_args>
|
||||
</PCIE100>
|
||||
</root>`),
|
||||
},
|
||||
{
|
||||
Path: "static/psu_cfg.ini",
|
||||
Content: []byte(`[Active / Standby configuration]
|
||||
Power ID : 1
|
||||
Present Status : Present
|
||||
Cold Status : Active Power
|
||||
Model : DPS-1300AB-6 R
|
||||
SN : 210231ACT9H232000080
|
||||
Max Power(W) : 1300
|
||||
|
||||
Power ID : 2
|
||||
Present Status : Present
|
||||
Cold Status : Active Power
|
||||
Model : DPS-1300AB-6 R
|
||||
SN : 210231ACT9H232000079
|
||||
Max Power(W) : 1300
|
||||
`),
|
||||
},
|
||||
{
|
||||
Path: "static/net_cfg.ini",
|
||||
Content: []byte(`[Network Configuration]
|
||||
eth0 Link encap:Ethernet HWaddr 30:C6:D7:94:54:F6
|
||||
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
|
||||
|
||||
eth0.2 Link encap:Ethernet HWaddr 30:C6:D7:94:54:F6
|
||||
inet6 addr: fe80::32c6:d7ff:fe94:54f6/64 Scope:Link
|
||||
UP BROADCAST RUNNING MULTICAST MTU:1496 Metric:1
|
||||
|
||||
eth1 Link encap:Ethernet HWaddr 30:C6:D7:94:54:F5
|
||||
inet addr:10.201.129.0 Bcast:10.201.143.255 Mask:255.255.240.0
|
||||
inet6 addr: fe80::32c6:d7ff:fe94:54f5/64 Scope:Link
|
||||
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
|
||||
|
||||
lo Link encap:Local Loopback
|
||||
inet addr:127.0.0.1 Mask:255.0.0.0
|
||||
UP LOOPBACK RUNNING MTU:65536 Metric:1
|
||||
`),
|
||||
},
|
||||
{
|
||||
Path: "static/smartdata/Front0/first_date_analysis.txt",
|
||||
Content: []byte(`The Current System Time Is 2023_09_22_14_19_39
|
||||
Model Info: ATA Micron_5300_MTFD
|
||||
Serial Number: 22443C4EE184
|
||||
`),
|
||||
},
|
||||
{
|
||||
Path: "user/test1.csv",
|
||||
Content: []byte(`Record Time Stamp,Severity Level,Severity Level ID,SensorTypeStr,SensorName,Event Dir,Event Occurred Time,DescInfo,Explanation,Suggestion
|
||||
2025-04-01 08:50:13,Minor,0x1,NA,NA,NA,2025-04-01 08:50:13,"SSH login failed from IP: 10.200.10.121 user: admin"," "," "
|
||||
Pre-Init,Info,0x0,Management Subsystem Health,Health,Assertion event,Pre-Init,"Management controller off-line"," "," "
|
||||
2025-04-01 08:51:10,Major,0x2,Power Supply,PSU1_Status,Assertion event,2025-04-01 08:51:10,"Power Supply AC lost"," "," "
|
||||
`),
|
||||
},
|
||||
}
|
||||
|
||||
result, err := p.Parse(files)
if err != nil {
t.Fatalf("parse failed: %v", err)
}
if result.Hardware == nil {
t.Fatalf("expected hardware section")
}

if len(result.Hardware.CPUs) != 2 {
t.Fatalf("expected 2 CPUs from hardware_info.ini, got %d", len(result.Hardware.CPUs))
}
if result.Hardware.CPUs[0].FrequencyMHz != 2800 {
t.Fatalf("expected CPU frequency 2800MHz, got %d", result.Hardware.CPUs[0].FrequencyMHz)
}

if len(result.Hardware.Memory) != 2 {
t.Fatalf("expected 2 DIMMs from hardware_info.ini, got %d", len(result.Hardware.Memory))
}
if result.Hardware.Memory[0].SizeMB != 65536 {
t.Fatalf("expected DIMM size 65536MB, got %d", result.Hardware.Memory[0].SizeMB)
}

if len(result.Hardware.Firmware) < 4 {
t.Fatalf("expected firmware entries from firmware_version.ini, got %d", len(result.Hardware.Firmware))
}
if result.Hardware.BoardInfo.Version == "" {
t.Fatalf("expected board version from board_cfg.ini")
}
if !strings.Contains(result.Hardware.BoardInfo.Description, "CustomerID: 255") {
t.Fatalf("expected board description enrichment from board_cfg.ini, got %q", result.Hardware.BoardInfo.Description)
}

if len(result.Hardware.Storage) != 2 {
t.Fatalf("expected 2 unique storage devices from hardware.info, got %d", len(result.Hardware.Storage))
}
var nvmeFound bool
var diskModelEnriched bool
for _, s := range result.Hardware.Storage {
if s.SerialNumber == "NVME-21" {
nvmeFound = true
if s.Type != "nvme" {
t.Fatalf("expected NVME-21 type nvme, got %q", s.Type)
}
if !strings.Contains(strings.ToLower(s.Manufacturer), "samsung") {
t.Fatalf("expected NVME vendor enrichment to Samsung, got %q", s.Manufacturer)
}
if s.Model != "SSD-1.92T/3.84T-NVMe-EV-SFF-sa" {
t.Fatalf("expected NVME model enrichment from PCIe table, got %q", s.Model)
}
}
if s.SerialNumber == "22443C4EE184" && strings.Contains(s.Model, "Micron") {
diskModelEnriched = true
}
}
if !nvmeFound {
t.Fatalf("expected deduped NVME storage by serial NVME-21")
}
if !diskModelEnriched {
t.Fatalf("expected disk model enrichment from smartdata by serial")
}

if len(result.Hardware.PowerSupply) != 2 {
t.Fatalf("expected 2 PSUs from psu_cfg.ini, got %d", len(result.Hardware.PowerSupply))
}
if result.Hardware.PowerSupply[0].WattageW == 0 {
t.Fatalf("expected PSU wattage parsed, got 0")
}
if len(result.Hardware.NetworkAdapters) != 2 {
t.Fatalf("expected 2 host network adapters from hardware_info.ini, got %d", len(result.Hardware.NetworkAdapters))
}
if len(result.Hardware.NetworkCards) != 2 {
t.Fatalf("expected 2 network cards synthesized from adapters, got %d", len(result.Hardware.NetworkCards))
}
var g5NIC models.NetworkAdapter
var g5NICFound bool
for _, nic := range result.Hardware.NetworkAdapters {
if strings.EqualFold(nic.Slot, "PCIe 1") && strings.Contains(strings.ToLower(nic.Model), "bcm957414") {
g5NIC = nic
g5NICFound = true
break
}
}
if !g5NICFound {
t.Fatalf("expected host NIC PCIe 1 from hardware_info.ini, got %+v", result.Hardware.NetworkAdapters)
}
if !strings.Contains(strings.ToLower(g5NIC.Vendor), "broadcom") {
t.Fatalf("expected G5 NIC vendor from Vendor ID, got %q", g5NIC.Vendor)
}
if g5NIC.SerialNumber != "NICSN-G5-001" {
t.Fatalf("expected G5 NIC serial from PCIe card section, got %q", g5NIC.SerialNumber)
}
if g5NIC.PartNumber != "NICPN-G5-001" {
t.Fatalf("expected G5 NIC part number from PCIe card section, got %q", g5NIC.PartNumber)
}
if g5NIC.Firmware != "21.80.1" {
t.Fatalf("expected G5 NIC firmware from PCIe card section, got %q", g5NIC.Firmware)
}

if len(result.Hardware.Devices) != 5 {
t.Fatalf("expected 5 topology devices from Raid_BP_Conf_Info.ini (3 BP + 2 RAID), got %d", len(result.Hardware.Devices))
}
var bpFound bool
var raidFound bool
for _, d := range result.Hardware.Devices {
if strings.Contains(d.ID, "h3c-bp-") && strings.Contains(d.Model, "BP_G5_8SFF") {
bpFound = true
}
desc, _ := d.Details["description"].(string)
if strings.Contains(d.ID, "h3c-raid-slot-3") && strings.Contains(desc, "SAS ports: 2") {
raidFound = true
}
}
if !bpFound || !raidFound {
t.Fatalf("expected parsed backplane and RAID topology devices, got %+v", result.Hardware.Devices)
}

if len(result.Hardware.Volumes) != 2 {
t.Fatalf("expected 2 RAID volumes (same LD ID on different controllers), got %d", len(result.Hardware.Volumes))
}
var raid1Found bool
var raid50Found bool
for _, v := range result.Hardware.Volumes {
if strings.Contains(v.Controller, "slot 3") {
raid1Found = v.RAIDLevel == "RAID1" && v.CapacityBytes > 0
}
if strings.Contains(v.Controller, "slot 6") {
raid50Found = v.RAIDLevel == "RAID50" && v.CapacityBytes > 0
}
}
if !raid1Found || !raid50Found {
t.Fatalf("expected RAID1 and RAID50 volumes with parsed capacities, got %+v", result.Hardware.Volumes)
}

if len(result.Events) != 2 {
t.Fatalf("expected 2 CSV events (Pre-Init skipped), got %d", len(result.Events))
}
if result.Events[0].Severity != models.SeverityWarning {
t.Fatalf("expected Minor CSV severity mapped to warning, got %q", result.Events[0].Severity)
}
if result.Events[1].Severity != models.SeverityCritical {
t.Fatalf("expected Major CSV severity mapped to critical, got %q", result.Events[1].Severity)
}
}
128
internal/parser/vendors/inspur/asset.go
vendored
@@ -3,12 +3,15 @@ package inspur
import (
"encoding/json"
"fmt"
"regexp"
"strings"

"git.mchus.pro/mchus/logpile/internal/models"
"git.mchus.pro/mchus/logpile/internal/parser/vendors/pciids"
)

var rawHexPCIDeviceRegex = regexp.MustCompile(`(?i)^0x[0-9a-f]+$`)

// AssetJSON represents the structure of Inspur asset.json file
type AssetJSON struct {
VersionInfo []struct {
@@ -55,6 +58,7 @@ type AssetJSON struct {
} `json:"MemInfo"`

HddInfo []struct {
PresentBitmap []int `json:"PresentBitmap"`
SerialNumber string `json:"SerialNumber"`
Manufacturer string `json:"Manufacturer"`
ModelName string `json:"ModelName"`
@@ -158,8 +162,19 @@ func ParseAssetJSON(content []byte) (*models.HardwareConfig, error) {
}

// Parse storage info
seenHDDFW := make(map[string]bool)
for _, hdd := range asset.HddInfo {
slot := normalizeAssetHDDSlot(hdd.LocationString, hdd.Location, hdd.DiskInterfaceType)
modelName := strings.TrimSpace(hdd.ModelName)
serial := normalizeRedisValue(hdd.SerialNumber)
present := bitmapHasAnyValue(hdd.PresentBitmap)
if !present && (slot != "" || modelName != "" || serial != "" || hdd.Capacity > 0) {
present = true
}

if !present && slot == "" && modelName == "" && serial == "" && hdd.Capacity == 0 {
continue
}

storageType := "HDD"
if hdd.DiskInterfaceType == 5 {
storageType = "NVMe"
@@ -168,35 +183,21 @@ func ParseAssetJSON(content []byte) (*models.HardwareConfig, error) {
}

// Resolve manufacturer: try vendor ID first, then model name extraction
modelName := strings.TrimSpace(hdd.ModelName)
manufacturer := resolveManufacturer(hdd.Manufacturer, modelName)

config.Storage = append(config.Storage, models.Storage{
Slot: hdd.LocationString,
Slot: slot,
Type: storageType,
Model: modelName,
SizeGB: hdd.Capacity,
SerialNumber: hdd.SerialNumber,
SerialNumber: serial,
Manufacturer: manufacturer,
Firmware: hdd.FirmwareVersion,
Interface: diskInterfaceToString(hdd.DiskInterfaceType),
Present: present,
})

// Add HDD firmware to firmware list (deduplicated by model+version)
if hdd.FirmwareVersion != "" {
fwKey := modelName + ":" + hdd.FirmwareVersion
if !seenHDDFW[fwKey] {
slot := hdd.LocationString
if slot == "" {
slot = fmt.Sprintf("%s %dGB", storageType, hdd.Capacity)
}
config.Firmware = append(config.Firmware, models.FirmwareInfo{
DeviceName: fmt.Sprintf("%s (%s)", modelName, slot),
Version: hdd.FirmwareVersion,
})
seenHDDFW[fwKey] = true
}
}
// Disk firmware is already stored in Storage.Firmware — do not duplicate in Hardware.Firmware.
}

// Parse PCIe info
@@ -207,8 +208,8 @@ func ParseAssetJSON(content []byte) (*models.HardwareConfig, error) {
VendorID: pcie.VendorId,
DeviceID: pcie.DeviceId,
BDF: formatBDF(pcie.BusNumber, pcie.DeviceNumber, pcie.FunctionNumber),
LinkWidth: pcie.NegotiatedLinkWidth,
LinkSpeed: pcieLinkSpeedToString(pcie.CurrentLinkSpeed),
LinkWidth: pcie.NegotiatedLinkWidth,
LinkSpeed: pcieLinkSpeedToString(pcie.CurrentLinkSpeed),
MaxLinkWidth: pcie.MaxLinkWidth,
MaxLinkSpeed: pcieLinkSpeedToString(pcie.MaxLinkSpeed),
DeviceClass: pcieClassToString(pcie.ClassCode, pcie.SubClassCode),
@@ -225,25 +226,22 @@ func ParseAssetJSON(content []byte) (*models.HardwareConfig, error) {
}
// Use device name from PCI IDs database if available
if deviceName != "" {
device.DeviceClass = deviceName
device.DeviceClass = normalizeModelLabel(deviceName)
}
config.PCIeDevices = append(config.PCIeDevices, device)

// Extract GPUs (class 3 = display controller)
if pcie.ClassCode == 3 {
gpuModel := deviceName
if gpuModel == "" {
gpuModel = pcieClassToString(pcie.ClassCode, pcie.SubClassCode)
}
gpuModel := normalizeGPUModel(pcie.VendorId, pcie.DeviceId, deviceName, pcie.ClassCode, pcie.SubClassCode)
gpu := models.GPU{
Slot: pcie.LocString,
Model: gpuModel,
Manufacturer: vendor,
VendorID: pcie.VendorId,
DeviceID: pcie.DeviceId,
BDF: formatBDF(pcie.BusNumber, pcie.DeviceNumber, pcie.FunctionNumber),
CurrentLinkWidth: pcie.NegotiatedLinkWidth,
CurrentLinkSpeed: pcieLinkSpeedToString(pcie.CurrentLinkSpeed),
Slot: pcie.LocString,
Model: gpuModel,
Manufacturer: vendor,
VendorID: pcie.VendorId,
DeviceID: pcie.DeviceId,
BDF: formatBDF(pcie.BusNumber, pcie.DeviceNumber, pcie.FunctionNumber),
CurrentLinkWidth: pcie.NegotiatedLinkWidth,
CurrentLinkSpeed: pcieLinkSpeedToString(pcie.CurrentLinkSpeed),
MaxLinkWidth: pcie.MaxLinkWidth,
MaxLinkSpeed: pcieLinkSpeedToString(pcie.MaxLinkSpeed),
}
@@ -260,6 +258,45 @@ func ParseAssetJSON(content []byte) (*models.HardwareConfig, error) {
return config, nil
}

func normalizeModelLabel(v string) string {
v = strings.TrimSpace(v)
if v == "" {
return ""
}
return strings.Join(strings.Fields(v), " ")
}

func normalizeGPUModel(vendorID, deviceID int, model string, classCode, subClass int) string {
model = normalizeModelLabel(model)

if model == "" || rawHexPCIDeviceRegex.MatchString(model) || isGenericGPUModelLabel(model) {
if pciModel := normalizeModelLabel(pciids.DeviceName(vendorID, deviceID)); pciModel != "" {
model = pciModel
}
}

if model == "" || isGenericGPUModelLabel(model) {
model = pcieClassToString(classCode, subClass)
}

// Last fallback for unknown NVIDIA display devices: expose PCI DeviceID
// instead of generic "3D Controller".
if (model == "" || strings.EqualFold(model, "3D Controller")) && vendorID == 0x10de && deviceID > 0 {
return fmt.Sprintf("0x%04X", deviceID)
}

return model
}

func isGenericGPUModelLabel(model string) bool {
switch strings.ToLower(strings.TrimSpace(model)) {
case "", "gpu", "display", "display controller", "vga", "3d controller", "other", "unknown":
return true
default:
return false
}
}

func memoryTypeToString(memType int) string {
switch memType {
case 26:
@@ -284,6 +321,29 @@ func diskInterfaceToString(ifType int) string {
}
}

func normalizeAssetHDDSlot(locationString string, location int, diskInterfaceType int) string {
slot := strings.TrimSpace(locationString)
if slot != "" {
return slot
}
if location < 0 {
return ""
}
if diskInterfaceType == 5 {
return fmt.Sprintf("OB%02d", location+1)
}
return fmt.Sprintf("%d", location)
}

func bitmapHasAnyValue(values []int) bool {
for _, v := range values {
if v != 0 {
return true
}
}
return false
}

func pcieLinkSpeedToString(speed int) string {
switch speed {
case 1:

48
internal/parser/vendors/inspur/asset_gpu_model_test.go
vendored
Normal file
@@ -0,0 +1,48 @@
package inspur

import "testing"

func TestParseAssetJSON_NVIDIAGPUModelFromPCIIDs(t *testing.T) {
raw := []byte(`{
"VersionInfo": [],
"CpuInfo": [],
"MemInfo": {"MemCommonInfo": [], "DimmInfo": []},
"HddInfo": [],
"PcieInfo": [{
"VendorId": 4318,
"DeviceId": 9019,
"BusNumber": 12,
"DeviceNumber": 0,
"FunctionNumber": 0,
"MaxLinkWidth": 16,
"MaxLinkSpeed": 5,
"NegotiatedLinkWidth": 16,
"CurrentLinkSpeed": 5,
"ClassCode": 3,
"SubClassCode": 2,
"PcieSlot": 11,
"LocString": "#CPU0_PCIE2",
"PartNumber": null,
"SerialNumber": null,
"Mac": []
}]
}`)

hw, err := ParseAssetJSON(raw)
if err != nil {
t.Fatalf("ParseAssetJSON failed: %v", err)
}
if len(hw.GPUs) != 1 {
t.Fatalf("expected 1 GPU, got %d", len(hw.GPUs))
}
if hw.GPUs[0].Model != "GH100 [H200 NVL]" {
t.Fatalf("expected model GH100 [H200 NVL], got %q", hw.GPUs[0].Model)
}
}

func TestNormalizeGPUModel_FallbackToDeviceIDForUnknownNVIDIA(t *testing.T) {
got := normalizeGPUModel(0x10de, 0xbeef, "0xBEEF\t", 3, 2)
if got != "0xBEEF" {
t.Fatalf("expected 0xBEEF, got %q", got)
}
}

442
internal/parser/vendors/inspur/component.go
vendored
@@ -8,6 +8,7 @@ import (
"time"

"git.mchus.pro/mchus/logpile/internal/models"
"git.mchus.pro/mchus/logpile/internal/parser/vendors/pciids"
)

// ParseComponentLog parses component.log file and extracts detailed hardware info
@@ -45,27 +46,38 @@ func ParseComponentLogEvents(content []byte) []models.Event {
// Parse RESTful Memory info for Warning/Error status
memEvents := parseMemoryEvents(text)
events = append(events, memEvents...)
events = append(events, parseFanEvents(text)...)

return events
}

// ParseComponentLogSensors extracts sensor readings from component.log JSON sections.
func ParseComponentLogSensors(content []byte) []models.SensorReading {
text := string(content)
var out []models.SensorReading
out = append(out, parseFanSensors(text)...)
out = append(out, parseDiskBackplaneSensors(text)...)
out = append(out, parsePSUSummarySensors(text)...)
return out
}

// MemoryRESTInfo represents the RESTful Memory info structure
type MemoryRESTInfo struct {
MemModules []struct {
MemModID int `json:"mem_mod_id"`
ConfigStatus int `json:"config_status"`
MemModSlot string `json:"mem_mod_slot"`
MemModStatus int `json:"mem_mod_status"`
MemModSize int `json:"mem_mod_size"`
MemModType string `json:"mem_mod_type"`
MemModTechnology string `json:"mem_mod_technology"`
MemModFrequency int `json:"mem_mod_frequency"`
MemModCurrentFreq int `json:"mem_mod_current_frequency"`
MemModVendor string `json:"mem_mod_vendor"`
MemModPartNum string `json:"mem_mod_part_num"`
MemModSerial string `json:"mem_mod_serial_num"`
MemModRanks int `json:"mem_mod_ranks"`
Status string `json:"status"`
MemModID int `json:"mem_mod_id"`
ConfigStatus int `json:"config_status"`
MemModSlot string `json:"mem_mod_slot"`
MemModStatus int `json:"mem_mod_status"`
MemModSize int `json:"mem_mod_size"`
MemModType string `json:"mem_mod_type"`
MemModTechnology string `json:"mem_mod_technology"`
MemModFrequency int `json:"mem_mod_frequency"`
MemModCurrentFreq int `json:"mem_mod_current_frequency"`
MemModVendor string `json:"mem_mod_vendor"`
MemModPartNum string `json:"mem_mod_part_num"`
MemModSerial string `json:"mem_mod_serial_num"`
MemModRanks int `json:"mem_mod_ranks"`
Status string `json:"status"`
} `json:"mem_modules"`
TotalMemoryCount int `json:"total_memory_count"`
PresentMemoryCount int `json:"present_memory_count"`
@@ -112,21 +124,21 @@ func parseMemoryInfo(text string, hw *models.HardwareConfig) {
// PSURESTInfo represents the RESTful PSU info structure
type PSURESTInfo struct {
PowerSupplies []struct {
ID int `json:"id"`
Present int `json:"present"`
VendorID string `json:"vendor_id"`
Model string `json:"model"`
SerialNum string `json:"serial_num"`
PartNum string `json:"part_num"`
FwVer string `json:"fw_ver"`
InputType string `json:"input_type"`
Status string `json:"status"`
RatedPower int `json:"rated_power"`
PSInPower int `json:"ps_in_power"`
PSOutPower int `json:"ps_out_power"`
PSInVolt float64 `json:"ps_in_volt"`
PSOutVolt float64 `json:"ps_out_volt"`
PSUMaxTemp int `json:"psu_max_temperature"`
ID int `json:"id"`
Present int `json:"present"`
VendorID string `json:"vendor_id"`
Model string `json:"model"`
SerialNum string `json:"serial_num"`
PartNum string `json:"part_num"`
FwVer string `json:"fw_ver"`
InputType string `json:"input_type"`
Status string `json:"status"`
RatedPower int `json:"rated_power"`
PSInPower int `json:"ps_in_power"`
PSOutPower int `json:"ps_out_power"`
PSInVolt float64 `json:"ps_in_volt"`
PSOutVolt float64 `json:"ps_out_volt"`
PSUMaxTemp int `json:"psu_max_temperature"`
} `json:"power_supplies"`
PresentPowerReading int `json:"present_power_reading"`
}
@@ -209,20 +221,49 @@ func parseHDDInfo(text string, hw *models.HardwareConfig) {
})
for _, hdd := range hddInfo {
if hdd.Present == 1 {
hddMap[hdd.LocationString] = struct {
slot := strings.TrimSpace(hdd.LocationString)
if slot == "" {
slot = fmt.Sprintf("HDD%d", hdd.ID)
}
hddMap[slot] = struct {
SN string
Model string
Firmware string
Mfr string
}{
SN: strings.TrimSpace(hdd.SN),
SN: normalizeRedisValue(hdd.SN),
Model: strings.TrimSpace(hdd.Model),
Firmware: strings.TrimSpace(hdd.Firmware),
Firmware: normalizeRedisValue(hdd.Firmware),
Mfr: strings.TrimSpace(hdd.Manufacture),
}
}
}

// Merge into existing inventory first (asset/other sections).
for i := range hw.Storage {
slot := strings.TrimSpace(hw.Storage[i].Slot)
if slot == "" {
continue
}
detail, ok := hddMap[slot]
if !ok {
continue
}
if normalizeRedisValue(hw.Storage[i].SerialNumber) == "" {
hw.Storage[i].SerialNumber = detail.SN
}
if hw.Storage[i].Model == "" {
hw.Storage[i].Model = detail.Model
}
if normalizeRedisValue(hw.Storage[i].Firmware) == "" {
hw.Storage[i].Firmware = detail.Firmware
}
if hw.Storage[i].Manufacturer == "" {
hw.Storage[i].Manufacturer = detail.Mfr
}
hw.Storage[i].Present = true
}

// If storage is empty, populate from HDD info
if len(hw.Storage) == 0 {
for _, hdd := range hddInfo {
@@ -239,21 +280,42 @@ func parseHDDInfo(text string, hw *models.HardwareConfig) {
if hdd.CapableSpeed == 12 {
iface = "SAS"
}
slot := strings.TrimSpace(hdd.LocationString)
if slot == "" {
slot = fmt.Sprintf("HDD%d", hdd.ID)
}

hw.Storage = append(hw.Storage, models.Storage{
Slot: hdd.LocationString,
Slot: slot,
Type: storType,
Model: model,
SizeGB: hdd.Capacity,
SerialNumber: strings.TrimSpace(hdd.SN),
SerialNumber: normalizeRedisValue(hdd.SN),
Manufacturer: extractStorageManufacturer(model),
Firmware: strings.TrimSpace(hdd.Firmware),
Firmware: normalizeRedisValue(hdd.Firmware),
Interface: iface,
Present: true,
})
}
}
}

// FanRESTInfo represents the RESTful fan info structure.
type FanRESTInfo struct {
Fans []struct {
ID int `json:"id"`
FanName string `json:"fan_name"`
Present string `json:"present"`
Status string `json:"status"`
StatusStr string `json:"status_str"`
SpeedRPM int `json:"speed_rpm"`
SpeedPercent int `json:"speed_percent"`
MaxSpeedRPM int `json:"max_speed_rpm"`
FanModel string `json:"fan_model"`
} `json:"fans"`
FansPower int `json:"fans_power"`
}

// NetworkAdapterRESTInfo represents the RESTful Network Adapter info structure
type NetworkAdapterRESTInfo struct {
SysAdapters []struct {
@@ -304,17 +366,28 @@ func parseNetworkAdapterInfo(text string, hw *models.HardwareConfig) {
}
}

model := normalizeModelLabel(adapter.Model)
if model == "" || looksLikeRawDeviceID(model) {
if resolved := normalizeModelLabel(pciids.DeviceName(adapter.VendorID, adapter.DeviceID)); resolved != "" {
model = resolved
}
}
vendor := normalizeModelLabel(adapter.Vendor)
if vendor == "" {
vendor = normalizeModelLabel(pciids.VendorName(adapter.VendorID))
}

hw.NetworkAdapters = append(hw.NetworkAdapters, models.NetworkAdapter{
Slot: fmt.Sprintf("Slot %d", adapter.Slot),
Location: adapter.Location,
Present: adapter.Present == 1,
Model: strings.TrimSpace(adapter.Model),
Vendor: strings.TrimSpace(adapter.Vendor),
Model: model,
Vendor: vendor,
VendorID: adapter.VendorID,
DeviceID: adapter.DeviceID,
SerialNumber: strings.TrimSpace(adapter.SN),
PartNumber: strings.TrimSpace(adapter.PN),
Firmware: adapter.FwVer,
SerialNumber: normalizeRedisValue(adapter.SN),
PartNumber: normalizeRedisValue(adapter.PN),
Firmware: normalizeRedisValue(adapter.FwVer),
PortCount: adapter.PortNum,
PortType: adapter.PortType,
MACAddresses: macs,
@@ -323,6 +396,223 @@
}
}

func parseFanSensors(text string) []models.SensorReading {
re := regexp.MustCompile(`RESTful fan info:\s*(\{[\s\S]*?\})\s*RESTful diskbackplane`)
match := re.FindStringSubmatch(text)
if match == nil {
return nil
}

jsonStr := strings.ReplaceAll(match[1], "\n", "")
var fanInfo FanRESTInfo
if err := json.Unmarshal([]byte(jsonStr), &fanInfo); err != nil {
return nil
}

out := make([]models.SensorReading, 0, len(fanInfo.Fans)+1)
for _, fan := range fanInfo.Fans {
name := strings.TrimSpace(fan.FanName)
if name == "" {
name = fmt.Sprintf("FAN%d", fan.ID)
}
status := normalizeComponentStatus(fan.StatusStr, fan.Status, fan.Present)
raw := fmt.Sprintf("rpm=%d pct=%d model=%s max_rpm=%d", fan.SpeedRPM, fan.SpeedPercent, fan.FanModel, fan.MaxSpeedRPM)
out = append(out, models.SensorReading{
Name: name,
Type: "fan_speed",
Value: float64(fan.SpeedRPM),
Unit: "RPM",
RawValue: raw,
Status: status,
})
}

if fanInfo.FansPower > 0 {
out = append(out, models.SensorReading{
Name: "Fans_Power",
Type: "power",
Value: float64(fanInfo.FansPower),
Unit: "W",
RawValue: fmt.Sprintf("%d", fanInfo.FansPower),
Status: "OK",
})
}

return out
}

func parseFanEvents(text string) []models.Event {
re := regexp.MustCompile(`RESTful fan info:\s*(\{[\s\S]*?\})\s*RESTful diskbackplane`)
match := re.FindStringSubmatch(text)
if match == nil {
return nil
}

jsonStr := strings.ReplaceAll(match[1], "\n", "")
var fanInfo FanRESTInfo
if err := json.Unmarshal([]byte(jsonStr), &fanInfo); err != nil {
return nil
}

var events []models.Event
for _, fan := range fanInfo.Fans {
status := normalizeComponentStatus(fan.StatusStr, fan.Status, fan.Present)
if isHealthyComponentStatus(status) {
continue
}

name := strings.TrimSpace(fan.FanName)
if name == "" {
name = fmt.Sprintf("FAN%d", fan.ID)
}

severity := models.SeverityWarning
lowStatus := strings.ToLower(status)
if strings.Contains(lowStatus, "critical") || strings.Contains(lowStatus, "fail") || strings.Contains(lowStatus, "error") {
severity = models.SeverityCritical
}

events = append(events, models.Event{
ID: fmt.Sprintf("fan_%d_status", fan.ID),
Timestamp: time.Now(),
Source: "Fan",
SensorType: "fan",
SensorName: name,
EventType: "Fan Status",
Severity: severity,
Description: fmt.Sprintf("%s reports %s", name, status),
RawData: fmt.Sprintf("rpm=%d pct=%d model=%s", fan.SpeedRPM, fan.SpeedPercent, fan.FanModel),
})
}

return events
}

func parseDiskBackplaneSensors(text string) []models.SensorReading {
re := regexp.MustCompile(`RESTful diskbackplane info:\s*(\[[\s\S]*?\])\s*BMC`)
match := re.FindStringSubmatch(text)
if match == nil {
return nil
}

jsonStr := strings.ReplaceAll(match[1], "\n", "")
var backplaneInfo DiskBackplaneRESTInfo
if err := json.Unmarshal([]byte(jsonStr), &backplaneInfo); err != nil {
return nil
}

out := make([]models.SensorReading, 0, len(backplaneInfo))
for _, bp := range backplaneInfo {
if bp.Present != 1 {
continue
}
name := fmt.Sprintf("Backplane%d_Temp", bp.BackplaneIndex)
status := "OK"
if bp.Temperature <= 0 {
status = "unknown"
}
raw := fmt.Sprintf("front=%d ports=%d drives=%d cpld=%s", bp.Front, bp.PortCount, bp.DriverCount, bp.CPLDVersion)
out = append(out, models.SensorReading{
Name: name,
Type: "temperature",
Value: float64(bp.Temperature),
Unit: "C",
RawValue: raw,
Status: status,
})
}
return out
}

func parsePSUSummarySensors(text string) []models.SensorReading {
re := regexp.MustCompile(`RESTful PSU info:\s*(\{[\s\S]*?\})\s*RESTful Network`)
match := re.FindStringSubmatch(text)
if match == nil {
return nil
}

jsonStr := strings.ReplaceAll(match[1], "\n", "")
var psuInfo PSURESTInfo
if err := json.Unmarshal([]byte(jsonStr), &psuInfo); err != nil {
return nil
}

out := make([]models.SensorReading, 0, len(psuInfo.PowerSupplies)*3+1)
if psuInfo.PresentPowerReading > 0 {
out = append(out, models.SensorReading{
Name: "PSU_Present_Power_Reading",
Type: "power",
Value: float64(psuInfo.PresentPowerReading),
Unit: "W",
RawValue: fmt.Sprintf("%d", psuInfo.PresentPowerReading),
Status: "OK",
})
}

for _, psu := range psuInfo.PowerSupplies {
if psu.Present != 1 {
continue
}
status := normalizeComponentStatus(psu.Status)
out = append(out, models.SensorReading{
Name: fmt.Sprintf("PSU%d_InputPower", psu.ID),
Type: "power",
Value: float64(psu.PSInPower),
Unit: "W",
RawValue: fmt.Sprintf("%d", psu.PSInPower),
Status: status,
})
out = append(out, models.SensorReading{
Name: fmt.Sprintf("PSU%d_OutputPower", psu.ID),
Type: "power",
Value: float64(psu.PSOutPower),
Unit: "W",
RawValue: fmt.Sprintf("%d", psu.PSOutPower),
Status: status,
})
out = append(out, models.SensorReading{
Name: fmt.Sprintf("PSU%d_Temp", psu.ID),
Type: "temperature",
Value: float64(psu.PSUMaxTemp),
Unit: "C",
RawValue: fmt.Sprintf("%d", psu.PSUMaxTemp),
Status: status,
})
}

return out
}

func normalizeComponentStatus(values ...string) string {
for _, v := range values {
s := strings.TrimSpace(v)
if s == "" {
continue
}
return s
}
return "unknown"
}

func isHealthyComponentStatus(status string) bool {
switch strings.ToLower(strings.TrimSpace(status)) {
case "", "ok", "normal", "present", "enabled":
return true
default:
return false
}
}

var rawDeviceIDLikeRegex = regexp.MustCompile(`(?i)^(?:0x)?[0-9a-f]{3,4}$`)

func looksLikeRawDeviceID(v string) bool {
v = strings.TrimSpace(v)
if v == "" {
return true
}
return rawDeviceIDLikeRegex.MatchString(v)
}

func parseMemoryEvents(text string) []models.Event {
var events []models.Event

@@ -452,28 +742,88 @@ func parseDiskBackplaneInfo(text string, hw *models.HardwareConfig) {
return
}

// Create storage entries based on backplane info
presentByBackplane := make(map[int]int)
totalPresent := 0
for _, bp := range backplaneInfo {
if bp.Present != 1 {
continue
}
if bp.DriverCount <= 0 {
continue
}
limit := bp.DriverCount
if bp.PortCount > 0 && limit > bp.PortCount {
limit = bp.PortCount
}
presentByBackplane[bp.BackplaneIndex] = limit
totalPresent += limit
}

if totalPresent == 0 {
return
}

existingPresent := countPresentStorage(hw.Storage)
remaining := totalPresent - existingPresent
if remaining <= 0 {
return
}

for _, bp := range backplaneInfo {
if bp.Present != 1 || remaining <= 0 {
continue
}
driveCount := presentByBackplane[bp.BackplaneIndex]
if driveCount <= 0 {
continue
}

location := "Rear"
if bp.Front == 1 {
location = "Front"
}

// Create entries for each port (disk slot)
|
||||
for i := 0; i < bp.PortCount; i++ {
|
||||
isPresent := i < bp.DriverCount
|
||||
for i := 0; i < driveCount && remaining > 0; i++ {
|
||||
slot := fmt.Sprintf("BP%d:%d", bp.BackplaneIndex, i)
|
||||
if hasStorageSlot(hw.Storage, slot) {
|
||||
continue
|
||||
}
|
||||
|
||||
hw.Storage = append(hw.Storage, models.Storage{
|
||||
Slot: fmt.Sprintf("%d", i),
|
||||
Present: isPresent,
|
||||
Slot: slot,
|
||||
Present: true,
|
||||
Location: location,
|
||||
BackplaneID: bp.BackplaneIndex,
|
||||
Type: "HDD",
|
||||
})
|
||||
remaining--
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func countPresentStorage(storage []models.Storage) int {
|
||||
count := 0
|
||||
for _, dev := range storage {
|
||||
if dev.Present {
|
||||
count++
|
||||
continue
|
||||
}
|
||||
if strings.TrimSpace(dev.Slot) != "" && (normalizeRedisValue(dev.Model) != "" || normalizeRedisValue(dev.SerialNumber) != "" || dev.SizeGB > 0) {
|
||||
count++
|
||||
}
|
||||
}
|
||||
return count
|
||||
}
|
||||
|
||||
func hasStorageSlot(storage []models.Storage, slot string) bool {
|
||||
slot = strings.ToLower(strings.TrimSpace(slot))
|
||||
if slot == "" {
|
||||
return false
|
||||
}
|
||||
for _, dev := range storage {
|
||||
if strings.ToLower(strings.TrimSpace(dev.Slot)) == slot {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
166 internal/parser/vendors/inspur/component_test.go (vendored, Normal file)
@@ -0,0 +1,166 @@
package inspur

import (
	"strings"
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestParseNetworkAdapterInfo_ResolvesModelFromPCIIDsForRawHexModel(t *testing.T) {
	text := `RESTful Network Adapter info:
{
  "sys_adapters": [
    {
      "id": 1,
      "name": "NIC1",
      "Location": "#CPU0_PCIE4",
      "present": 1,
      "slot": 4,
      "vendor_id": 32902,
      "device_id": 5409,
      "vendor": "",
      "model": "0x1521",
      "fw_ver": "",
      "status": "OK",
      "sn": "",
      "pn": "",
      "port_num": 4,
      "port_type": "Base-T",
      "ports": []
    }
  ]
}
RESTful fan`

	hw := &models.HardwareConfig{}
	parseNetworkAdapterInfo(text, hw)

	if len(hw.NetworkAdapters) != 1 {
		t.Fatalf("expected 1 network adapter, got %d", len(hw.NetworkAdapters))
	}
	got := hw.NetworkAdapters[0]
	if got.Model == "" {
		t.Fatalf("expected NIC model resolved from pci.ids, got empty")
	}
	if !strings.Contains(strings.ToUpper(got.Model), "I350") {
		t.Fatalf("expected I350 in model, got %q", got.Model)
	}
	if got.Vendor == "" {
		t.Fatalf("expected NIC vendor resolved from pci.ids")
	}
}

func TestParseComponentLogSensors_ExtractsFanBackplaneAndPSUSummary(t *testing.T) {
	text := `RESTful PSU info:
{
  "power_supplies": [
    { "id": 0, "present": 1, "status": "OK", "ps_in_power": 123, "ps_out_power": 110, "psu_max_temperature": 41 }
  ],
  "present_power_reading": 999
}
RESTful Network Adapter info:
{ "sys_adapters": [] }
RESTful fan info:
{
  "fans": [
    { "id": 1, "fan_name": "FAN0_F_Speed", "present": "OK", "status": "OK", "status_str": "OK", "speed_rpm": 9200, "speed_percent": 35, "max_speed_rpm": 20000, "fan_model": "6056" }
  ],
  "fans_power": 33
}
RESTful diskbackplane info:
[
  { "port_count": 8, "driver_count": 4, "front": 1, "backplane_index": 0, "present": 1, "cpld_version": "3.1", "temperature": 18 }
]
BMC`

	sensors := ParseComponentLogSensors([]byte(text))
	if len(sensors) == 0 {
		t.Fatalf("expected sensors from component.log, got none")
	}

	has := func(name string) bool {
		for _, s := range sensors {
			if s.Name == name {
				return true
			}
		}
		return false
	}

	if !has("FAN0_F_Speed") {
		t.Fatalf("expected FAN0_F_Speed sensor in parsed output")
	}
	if !has("Backplane0_Temp") {
		t.Fatalf("expected Backplane0_Temp sensor in parsed output")
	}
	if !has("PSU_Present_Power_Reading") {
		t.Fatalf("expected PSU_Present_Power_Reading sensor in parsed output")
	}
}

func TestParseComponentLogEvents_FanCriticalStatus(t *testing.T) {
	text := `RESTful fan info:
{
  "fans": [
    { "id": 7, "fan_name": "FAN3_R_Speed", "present": "OK", "status": "Critical", "status_str": "Critical", "speed_rpm": 0, "speed_percent": 0, "max_speed_rpm": 20000, "fan_model": "6056" }
  ],
  "fans_power": 0
}
RESTful diskbackplane info:
[]
BMC`

	events := ParseComponentLogEvents([]byte(text))
	if len(events) != 1 {
		t.Fatalf("expected 1 fan event, got %d", len(events))
	}
	if events[0].EventType != "Fan Status" {
		t.Fatalf("expected Fan Status event type, got %q", events[0].EventType)
	}
	if events[0].Severity != models.SeverityCritical {
		t.Fatalf("expected critical severity, got %q", events[0].Severity)
	}
}

func TestParseHDDInfo_MergesIntoExistingStorage(t *testing.T) {
	text := `RESTful HDD info:
[
  {
    "id": 1,
    "present": 1,
    "enable": 1,
    "SN": "SER123",
    "model": "Sample SSD",
    "capacity": 1024,
    "manufacture": "ACME",
    "firmware": "1.0.0",
    "locationstring": "OB01",
    "capablespeed": 6
  }
]
RESTful PSU`

	hw := &models.HardwareConfig{
		Storage: []models.Storage{
			{
				Slot: "OB01",
				Type: "SSD",
			},
		},
	}

	parseHDDInfo(text, hw)
	if len(hw.Storage) != 1 {
		t.Fatalf("expected 1 storage item, got %d", len(hw.Storage))
	}
	if hw.Storage[0].SerialNumber != "SER123" {
		t.Fatalf("expected serial from HDD section, got %q", hw.Storage[0].SerialNumber)
	}
	if hw.Storage[0].Model != "Sample SSD" {
		t.Fatalf("expected model from HDD section, got %q", hw.Storage[0].Model)
	}
	if hw.Storage[0].Firmware != "1.0.0" {
		t.Fatalf("expected firmware from HDD section, got %q", hw.Storage[0].Firmware)
	}
}
39 internal/parser/vendors/inspur/fru.go (vendored)
@@ -103,8 +103,9 @@ func extractBoardInfo(fruList []models.FRUInfo, hw *models.HardwareConfig) {
		return
	}

	// Look for the main board/chassis FRU entry
	// Usually it's the first entry or one with "Builtin FRU" or containing board info
	// Look for the main board/chassis FRU entry.
	// Keep the first non-empty serial as the server serial and avoid overwriting it
	// with module-specific serials (e.g., SCM_FRU).
	for _, fru := range fruList {
		// Skip empty entries
		if fru.ProductName == "" && fru.SerialNumber == "" {
@@ -118,25 +119,23 @@ func extractBoardInfo(fruList []models.FRUInfo, hw *models.HardwareConfig) {
			strings.Contains(desc, "chassis") ||
			strings.Contains(desc, "board")

		// If we haven't set board info yet, or this is a main board entry
		if hw.BoardInfo.ProductName == "" || isMainBoard {
			if fru.ProductName != "" {
				hw.BoardInfo.ProductName = fru.ProductName
			}
			if fru.SerialNumber != "" {
				hw.BoardInfo.SerialNumber = fru.SerialNumber
			}
			if fru.Manufacturer != "" {
				hw.BoardInfo.Manufacturer = fru.Manufacturer
			}
			if fru.PartNumber != "" {
				hw.BoardInfo.PartNumber = fru.PartNumber
			}
		if fru.SerialNumber != "" && hw.BoardInfo.SerialNumber == "" {
			hw.BoardInfo.SerialNumber = fru.SerialNumber
		}
		if fru.ProductName != "" && (hw.BoardInfo.ProductName == "" || isMainBoard) {
			hw.BoardInfo.ProductName = fru.ProductName
		}
		// Manufacturer from non-main FRU entries (e.g. PSU vendor) should not become server vendor.
		if fru.Manufacturer != "" && isMainBoard && hw.BoardInfo.Manufacturer == "" {
			hw.BoardInfo.Manufacturer = fru.Manufacturer
		}
		if fru.PartNumber != "" && (hw.BoardInfo.PartNumber == "" || isMainBoard) {
			hw.BoardInfo.PartNumber = fru.PartNumber
		}

		// If we found a main board entry, stop searching
		if isMainBoard && fru.ProductName != "" && fru.SerialNumber != "" {
			break
		}
		// Main board entry with complete data is good enough to stop.
		if isMainBoard && hw.BoardInfo.ProductName != "" && hw.BoardInfo.SerialNumber != "" {
			break
		}
	}
}
59 internal/parser/vendors/inspur/fru_test.go (vendored, Normal file)
@@ -0,0 +1,59 @@
package inspur

import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestExtractBoardInfo_PreservesBuiltinSerial(t *testing.T) {
	hw := &models.HardwareConfig{}
	fruList := []models.FRUInfo{
		{
			Description:  "Builtin FRU Device (ID 0)",
			SerialNumber: "21D634101",
		},
		{
			Description:  "SCM_FRU (ID 8)",
			SerialNumber: "CAR509K10613C10",
			ProductName:  "CA",
			Manufacturer: "inagile",
			PartNumber:   "YZCA-02758-105",
		},
	}

	extractBoardInfo(fruList, hw)

	if hw.BoardInfo.SerialNumber != "21D634101" {
		t.Fatalf("expected board serial 21D634101, got %q", hw.BoardInfo.SerialNumber)
	}
	if hw.BoardInfo.ProductName != "CA" {
		t.Fatalf("expected product name CA, got %q", hw.BoardInfo.ProductName)
	}
}

func TestExtractBoardInfo_DoesNotUsePSUVendorAsBoardManufacturer(t *testing.T) {
	hw := &models.HardwareConfig{}
	fruList := []models.FRUInfo{
		{
			Description:  "Builtin FRU Device (ID 0)",
			SerialNumber: "2KD605238",
		},
		{
			Description:  "PSU0_FRU (ID 30)",
			SerialNumber: "PMR315HS10F1A",
			ProductName:  "AP-CR3000F12BY",
			Manufacturer: "APLUSPOWER",
			PartNumber:   "18XA1M43400C2",
		},
	}

	extractBoardInfo(fruList, hw)

	if hw.BoardInfo.SerialNumber != "2KD605238" {
		t.Fatalf("expected board serial 2KD605238, got %q", hw.BoardInfo.SerialNumber)
	}
	if hw.BoardInfo.Manufacturer != "" {
		t.Fatalf("expected empty board manufacturer, got %q", hw.BoardInfo.Manufacturer)
	}
}
117 internal/parser/vendors/inspur/gpu_status.go (vendored, Normal file)
@@ -0,0 +1,117 @@
package inspur

import (
	"regexp"
	"sort"
	"strconv"
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
)

var reFaultGPU = regexp.MustCompile(`\bF_GPU(\d+)\b`)

func applyGPUStatusFromEvents(hw *models.HardwareConfig, events []models.Event) {
	if hw == nil || len(hw.GPUs) == 0 {
		return
	}

	gpuByIndex := make(map[int]*models.GPU)
	for i := range hw.GPUs {
		gpu := &hw.GPUs[i]
		idx, ok := extractLogicalGPUIndex(gpu.Slot)
		if !ok {
			continue
		}
		gpuByIndex[idx] = gpu
		gpu.StatusHistory = nil
		gpu.ErrorDescription = ""
	}

	relevantEvents := make([]models.Event, 0)
	for _, e := range events {
		if !isGPUFaultEvent(e) || len(extractFaultyGPUSet(e.Description)) == 0 {
			continue
		}
		relevantEvents = append(relevantEvents, e)
	}

	if len(relevantEvents) == 0 {
		for _, gpu := range gpuByIndex {
			if strings.TrimSpace(gpu.Status) == "" {
				gpu.Status = "OK"
			}
		}
		return
	}

	sort.Slice(relevantEvents, func(i, j int) bool {
		return relevantEvents[i].Timestamp.Before(relevantEvents[j].Timestamp)
	})

	currentStatus := make(map[int]string, len(gpuByIndex))
	lastCriticalDetails := make(map[int]string, len(gpuByIndex))
	for idx := range gpuByIndex {
		currentStatus[idx] = "OK"
	}

	for _, e := range relevantEvents {
		faultySet := extractFaultyGPUSet(e.Description)
		for idx, gpu := range gpuByIndex {
			newStatus := "OK"
			if faultySet[idx] {
				newStatus = "Critical"
				lastCriticalDetails[idx] = strings.TrimSpace(e.Description)
			}

			if currentStatus[idx] != newStatus {
				gpu.StatusHistory = append(gpu.StatusHistory, models.StatusHistoryEntry{
					Status:    newStatus,
					ChangedAt: e.Timestamp,
					Details:   strings.TrimSpace(e.Description),
				})
				ts := e.Timestamp
				gpu.StatusChangedAt = &ts
				currentStatus[idx] = newStatus
			}

			ts := e.Timestamp
			gpu.StatusCheckedAt = &ts
		}
	}

	for idx, gpu := range gpuByIndex {
		gpu.Status = currentStatus[idx]
		if gpu.Status == "Critical" {
			gpu.ErrorDescription = lastCriticalDetails[idx]
		} else {
			gpu.ErrorDescription = ""
		}
		if gpu.StatusCheckedAt == nil && strings.TrimSpace(gpu.Status) == "" {
			gpu.Status = "OK"
		}
	}
}

func extractFaultyGPUSet(description string) map[int]bool {
	faulty := make(map[int]bool)
	matches := reFaultGPU.FindAllStringSubmatch(description, -1)
	for _, m := range matches {
		if len(m) < 2 {
			continue
		}
		idx, err := strconv.Atoi(m[1])
		if err == nil && idx >= 0 {
			faulty[idx] = true
		}
	}
	return faulty
}

func isGPUFaultEvent(e models.Event) bool {
	desc := strings.ToLower(e.Description)
	if strings.Contains(desc, "bios miss f_gpu") {
		return true
	}
	return strings.EqualFold(strings.TrimSpace(e.ID), "17FFB002")
}
69 internal/parser/vendors/inspur/hgx_firmware_test.go (vendored, Normal file)
@@ -0,0 +1,69 @@
package inspur

import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestAppendHGXFirmwareFromHWInfo_AppendsInventoryEntries(t *testing.T) {
	hw := &models.HardwareConfig{
		Firmware: []models.FirmwareInfo{
			{DeviceName: "BIOS", Version: "1.0.0"},
		},
	}

	content := []byte(`
{
  "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/HGX_FW_BMC_0",
  "Id": "HGX_FW_BMC_0",
  "Oem": {
    "Nvidia": {
      "ActiveFirmwareSlot": {"Version": "25.05-A"},
      "InactiveFirmwareSlot": {"Version": "25.04-B"}
    }
  },
  "Version": "25.05-A",
  "WriteProtected": false
}
{
  "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/HGX_FW_GPU_SXM_1",
  "Id": "HGX_FW_GPU_SXM_1",
  "Version": "97.00.C5.00.0E",
  "WriteProtected": false
}
{
  "@odata.id": "/redfish/v1/UpdateService/FirmwareInventory/HGX_Driver_GPU_SXM_1",
  "Id": "HGX_Driver_GPU_SXM_1",
  "Version": "",
  "WriteProtected": false
}
`)

	appendHGXFirmwareFromHWInfo(content, hw)

	if len(hw.Firmware) != 5 {
		t.Fatalf("expected 5 firmware entries after append, got %d", len(hw.Firmware))
	}

	seen := make(map[string]string)
	for _, fw := range hw.Firmware {
		seen[fw.DeviceName] = fw.Version
	}

	if seen["HGX_FW_BMC_0"] != "25.05-A" {
		t.Fatalf("expected HGX_FW_BMC_0 version 25.05-A, got %q", seen["HGX_FW_BMC_0"])
	}
	if seen["HGX_FW_BMC_0 Active Slot"] != "25.05-A" {
		t.Fatalf("expected active slot version, got %q", seen["HGX_FW_BMC_0 Active Slot"])
	}
	if seen["HGX_FW_BMC_0 Inactive Slot"] != "25.04-B" {
		t.Fatalf("expected inactive slot version, got %q", seen["HGX_FW_BMC_0 Inactive Slot"])
	}
	if seen["HGX_FW_GPU_SXM_1"] != "97.00.C5.00.0E" {
		t.Fatalf("expected GPU FW entry, got %q", seen["HGX_FW_GPU_SXM_1"])
	}
	if _, ok := seen["HGX_Driver_GPU_SXM_1"]; ok {
		t.Fatalf("did not expect empty version driver entry")
	}
}
171 internal/parser/vendors/inspur/hgx_gpu_status_test.go (vendored, Normal file)
@@ -0,0 +1,171 @@
package inspur

import (
	"testing"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestEnrichGPUsFromHGXHWInfo_UsesHGXLogicalMapping(t *testing.T) {
	hw := &models.HardwareConfig{
		GPUs: []models.GPU{
			{Slot: "#GPU6"},
			{Slot: "#GPU7"},
			{Slot: "#GPU0"},
			{Slot: "#CPU0_PE1_E_BMC", Model: "AST2500 VGA"},
		},
	}

	content := []byte(`
# curl -X GET http://127.0.0.1/redfish/v1/Chassis/HGX_GPU_SXM_1/Assembly
{"Name":"GPU Board Assembly","Model":"B200 180GB HBM3e","PartNumber":"PN1","SerialNumber":"SXM1SN"}
# curl -X GET http://127.0.0.1/redfish/v1/Chassis/HGX_GPU_SXM_3/Assembly
{"Name":"GPU Board Assembly","Model":"B200 180GB HBM3e","PartNumber":"PN3","SerialNumber":"SXM3SN"}
# curl -X GET http://127.0.0.1/redfish/v1/Chassis/HGX_GPU_SXM_5/Assembly
{"Name":"GPU Board Assembly","Model":"B200 180GB HBM3e","PartNumber":"PN5","SerialNumber":"SXM5SN"}
{"Id":"HGX_FW_GPU_SXM_1","Version":"FW1"}
{"Id":"HGX_FW_GPU_SXM_3","Version":"FW3"}
{"Id":"HGX_FW_GPU_SXM_5","Version":"FW5"}
{"Id":"HGX_InfoROM_GPU_SXM_3","Version":"IR3"}
`)

	enrichGPUsFromHGXHWInfo(content, hw)

	if hw.GPUs[0].SerialNumber != "SXM3SN" {
		t.Fatalf("expected #GPU6 to map to SXM3 serial, got %q", hw.GPUs[0].SerialNumber)
	}
	if hw.GPUs[1].SerialNumber != "SXM1SN" {
		t.Fatalf("expected #GPU7 to map to SXM1 serial, got %q", hw.GPUs[1].SerialNumber)
	}
	if hw.GPUs[2].SerialNumber != "SXM5SN" {
		t.Fatalf("expected #GPU0 to map to SXM5 serial, got %q", hw.GPUs[2].SerialNumber)
	}
	if hw.GPUs[0].Firmware != "FW3" {
		t.Fatalf("expected #GPU6 firmware FW3, got %q", hw.GPUs[0].Firmware)
	}
	if hw.GPUs[0].VideoBIOS != "IR3" {
		t.Fatalf("expected #GPU6 InfoROM in VideoBIOS IR3, got %q", hw.GPUs[0].VideoBIOS)
	}
	if hw.GPUs[2].Firmware != "FW5" {
		t.Fatalf("expected #GPU0 firmware FW5, got %q", hw.GPUs[2].Firmware)
	}
	for _, g := range hw.GPUs {
		if g.Slot == "#CPU0_PE1_E_BMC" {
			t.Fatalf("expected non-HGX BMC VGA entry to be filtered out")
		}
	}
}

func TestEnrichGPUsFromHGXHWInfo_AddsMissingLogicalGPU(t *testing.T) {
	hw := &models.HardwareConfig{
		GPUs: []models.GPU{
			{Slot: "#GPU0"},
			{Slot: "#GPU1"},
			{Slot: "#GPU2"},
			{Slot: "#GPU3"},
			{Slot: "#GPU4"},
			{Slot: "#GPU5"},
			{Slot: "#GPU7"},
		},
	}

	content := []byte(`
# curl -X GET http://127.0.0.1/redfish/v1/Chassis/HGX_GPU_SXM_3/Assembly
{"Name":"GPU Board Assembly","Model":"B200 180GB HBM3e","PartNumber":"PN3","SerialNumber":"SXM3SN"}
`)

	enrichGPUsFromHGXHWInfo(content, hw)

	found := false
	for _, g := range hw.GPUs {
		if g.Slot == "#GPU6" {
			found = true
			if g.SerialNumber != "SXM3SN" {
				t.Fatalf("expected synthesized #GPU6 serial SXM3SN, got %q", g.SerialNumber)
			}
		}
	}
	if !found {
		t.Fatalf("expected synthesized #GPU6 entry")
	}
}

func TestApplyGPUStatusFromEvents_MarksFaultedGPU(t *testing.T) {
	hw := &models.HardwareConfig{
		GPUs: []models.GPU{
			{Slot: "#GPU6"},
			{Slot: "#GPU5"},
		},
	}

	events := []models.Event{
		{
			ID:          "17FFB002",
			Timestamp:   time.Now(),
			Description: "PCIe Present mismatch BIOS miss F_GPU6",
		},
	}

	applyGPUStatusFromEvents(hw, events)

	if hw.GPUs[0].Status != "Critical" {
		t.Fatalf("expected #GPU6 status Critical, got %q", hw.GPUs[0].Status)
	}
	if hw.GPUs[1].Status != "OK" {
		t.Fatalf("expected healthy GPU status OK, got %q", hw.GPUs[1].Status)
	}
}

func TestApplyGPUStatusFromEvents_UsesLatestEventAsCurrentStatusAndKeepsHistory(t *testing.T) {
	hw := &models.HardwareConfig{
		GPUs: []models.GPU{
			{Slot: "#GPU1"},
			{Slot: "#GPU3"},
			{Slot: "#GPU6"},
		},
	}

	events := []models.Event{
		{
			ID:          "17FFB002",
			Timestamp:   time.Date(2026, 1, 12, 22, 51, 16, 0, time.FixedZone("UTC+8", 8*3600)),
			Description: "PCIe Present mismatch BIOS miss F_GPU1 F_GPU3 F_GPU6",
		},
		{
			ID:          "17FFB002",
			Timestamp:   time.Date(2026, 1, 12, 23, 5, 18, 0, time.FixedZone("UTC+8", 8*3600)),
			Description: "PCIe Present mismatch BIOS miss F_GPU6",
		},
	}

	applyGPUStatusFromEvents(hw, events)

	if hw.GPUs[0].Status != "OK" {
		t.Fatalf("expected #GPU1 to recover to OK on latest event, got %q", hw.GPUs[0].Status)
	}
	if hw.GPUs[1].Status != "OK" {
		t.Fatalf("expected #GPU3 to recover to OK on latest event, got %q", hw.GPUs[1].Status)
	}
	if hw.GPUs[2].Status != "Critical" {
		t.Fatalf("expected #GPU6 to remain Critical, got %q", hw.GPUs[2].Status)
	}
	if len(hw.GPUs[0].StatusHistory) == 0 {
		t.Fatalf("expected #GPU1 status history to be populated")
	}
}

func TestParseIDLLog_ParsesStructuredJSONLine(t *testing.T) {
	content := []byte(`{ "MESSAGE": "|2026-01-12T23:05:18+08:00|PCIE|Assert|Critical|17FFB002|PCIe Present mismatch BIOS miss F_GPU6 - Assert|" }`)

	events := ParseIDLLog(content)
	if len(events) != 1 {
		t.Fatalf("expected 1 event from JSON line, got %d", len(events))
	}
	if events[0].ID != "17FFB002" {
		t.Fatalf("expected event ID 17FFB002, got %q", events[0].ID)
	}
	if events[0].Source != "PCIE" {
		t.Fatalf("expected source PCIE, got %q", events[0].Source)
	}
}
360 internal/parser/vendors/inspur/hgx_hwinfo.go (vendored, Normal file)
@@ -0,0 +1,360 @@
package inspur

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
)

type hgxGPUAssemblyInfo struct {
	Model  string
	Part   string
	Serial string
}

type hgxGPUFirmwareInfo struct {
	Firmware string
	InfoROM  string
}

type hgxFirmwareInventoryEntry struct {
	ID              string
	Version         string
	ActiveVersion   string
	InactiveVersion string
}

// Logical GPU index mapping used by HGX B200 UI ordering.
// Example from real logs/UI:
// GPU0->SXM5, GPU1->SXM7, GPU2->SXM6, GPU3->SXM8, GPU4->SXM2, GPU5->SXM4, GPU6->SXM3, GPU7->SXM1.
var hgxLogicalToSXM = map[int]int{
	0: 5,
	1: 7,
	2: 6,
	3: 8,
	4: 2,
	5: 4,
	6: 3,
	7: 1,
}

var (
	reHGXGPUBlock = regexp.MustCompile(`(?s)/redfish/v1/Chassis/HGX_GPU_SXM_(\d+)/Assembly.*?"Name":\s*"GPU Board Assembly".*?"Model":\s*"([^"]+)".*?"PartNumber":\s*"([^"]+)".*?"SerialNumber":\s*"([^"]+)"`)
	reHGXFWBlock  = regexp.MustCompile(`(?s)"Id":\s*"HGX_FW_GPU_SXM_(\d+)".*?"Version":\s*"([^"]*)"`)
	reHGXInfoROM  = regexp.MustCompile(`(?s)"Id":\s*"HGX_InfoROM_GPU_SXM_(\d+)".*?"Version":\s*"([^"]*)"`)
	reIDLine      = regexp.MustCompile(`"Id":\s*"([^"]+)"`)
	reVersion     = regexp.MustCompile(`"Version":\s*"([^"]*)"`)
	reSlotGPU     = regexp.MustCompile(`(?i)gpu\s*#?\s*(\d+)`)
)

func enrichGPUsFromHGXHWInfo(content []byte, hw *models.HardwareConfig) {
	if hw == nil || len(hw.GPUs) == 0 || len(content) == 0 {
		return
	}

	bySXM := parseHGXGPUAssembly(content)
	if len(bySXM) == 0 {
		return
	}
	fwBySXM := parseHGXGPUFirmware(content)

	normalizeHGXGPUInventory(hw, bySXM)

	for i := range hw.GPUs {
		gpu := &hw.GPUs[i]
		logicalIdx, ok := extractLogicalGPUIndex(gpu.Slot)
		if !ok {
			// Keep existing info if slot index cannot be determined.
			continue
		}

		sxm := resolveSXMIndex(logicalIdx, bySXM)
		info, found := bySXM[sxm]
		if !found {
			continue
		}

		if strings.TrimSpace(gpu.SerialNumber) == "" {
			gpu.SerialNumber = info.Serial
		}
		if shouldReplaceGPUModel(gpu.Model) {
			gpu.Model = info.Model
		}
		if strings.TrimSpace(gpu.PartNumber) == "" {
			gpu.PartNumber = info.Part
		}
		if strings.TrimSpace(gpu.Manufacturer) == "" {
			gpu.Manufacturer = "NVIDIA"
		}
		if fw, ok := fwBySXM[sxm]; ok {
			if strings.TrimSpace(gpu.Firmware) == "" && strings.TrimSpace(fw.Firmware) != "" {
				gpu.Firmware = fw.Firmware
			}
			if strings.TrimSpace(gpu.VideoBIOS) == "" && strings.TrimSpace(fw.InfoROM) != "" {
				gpu.VideoBIOS = fw.InfoROM
			}
		}
	}
}

func appendHGXFirmwareFromHWInfo(content []byte, hw *models.HardwareConfig) {
	if hw == nil || len(content) == 0 {
		return
	}

	entries := parseHGXFirmwareInventory(content)
	if len(entries) == 0 {
		return
	}

	existing := make(map[string]bool, len(hw.Firmware))
	for _, fw := range hw.Firmware {
		key := strings.ToLower(strings.TrimSpace(fw.DeviceName) + "|" + strings.TrimSpace(fw.Version))
		existing[key] = true
	}

	appendFW := func(name, version string) {
		name = strings.TrimSpace(name)
		version = strings.TrimSpace(version)
		if name == "" || version == "" {
			return
		}
		key := strings.ToLower(name + "|" + version)
		if existing[key] {
			return
		}
		existing[key] = true
		hw.Firmware = append(hw.Firmware, models.FirmwareInfo{
			DeviceName: name,
			Version:    version,
		})
	}

	for _, e := range entries {
		appendFW(e.ID, e.Version)

		if e.ActiveVersion != "" && e.InactiveVersion != "" && e.ActiveVersion != e.InactiveVersion {
			appendFW(e.ID+" Active Slot", e.ActiveVersion)
			appendFW(e.ID+" Inactive Slot", e.InactiveVersion)
		}
	}
}

func parseHGXGPUAssembly(content []byte) map[int]hgxGPUAssemblyInfo {
	result := make(map[int]hgxGPUAssemblyInfo)
	matches := reHGXGPUBlock.FindAllSubmatch(content, -1)
	for _, m := range matches {
		if len(m) != 5 {
			continue
		}

		sxmIdx, err := strconv.Atoi(string(m[1]))
		if err != nil || sxmIdx <= 0 {
			continue
		}

		result[sxmIdx] = hgxGPUAssemblyInfo{
			Model:  strings.TrimSpace(string(m[2])),
			Part:   strings.TrimSpace(string(m[3])),
			Serial: strings.TrimSpace(string(m[4])),
		}
	}
	return result
}

func parseHGXGPUFirmware(content []byte) map[int]hgxGPUFirmwareInfo {
	result := make(map[int]hgxGPUFirmwareInfo)

	matchesFW := reHGXFWBlock.FindAllSubmatch(content, -1)
	for _, m := range matchesFW {
		if len(m) != 3 {
			continue
		}
		sxmIdx, err := strconv.Atoi(string(m[1]))
		if err != nil || sxmIdx <= 0 {
			continue
		}
		version := strings.TrimSpace(string(m[2]))
		if version == "" {
			continue
		}

		current := result[sxmIdx]
		if current.Firmware == "" {
			current.Firmware = version
		}
		result[sxmIdx] = current
	}

	matchesInfoROM := reHGXInfoROM.FindAllSubmatch(content, -1)
	for _, m := range matchesInfoROM {
		if len(m) != 3 {
			continue
		}
		sxmIdx, err := strconv.Atoi(string(m[1]))
		if err != nil || sxmIdx <= 0 {
			continue
		}
		version := strings.TrimSpace(string(m[2]))
		if version == "" {
			continue
		}

		current := result[sxmIdx]
		if current.InfoROM == "" {
			current.InfoROM = version
		}
		result[sxmIdx] = current
	}

	return result
}

func parseHGXFirmwareInventory(content []byte) []hgxFirmwareInventoryEntry {
	lines := strings.Split(string(content), "\n")
	result := make([]hgxFirmwareInventoryEntry, 0)

	var current *hgxFirmwareInventoryEntry
	section := ""

	flush := func() {
		if current == nil {
			return
		}
		if current.Version == "" && current.ActiveVersion == "" && current.InactiveVersion == "" {
			current = nil
			section = ""
			return
		}
		result = append(result, *current)
		current = nil
		section = ""
	}

	for _, line := range lines {
		if m := reIDLine.FindStringSubmatch(line); len(m) > 1 {
			flush()
			id := strings.TrimSpace(m[1])
			if strings.HasPrefix(id, "HGX_") {
				current = &hgxFirmwareInventoryEntry{ID: id}
			}
			continue
		}

		if current == nil {
			continue
		}

		if strings.Contains(line, `"ActiveFirmwareSlot"`) {
			section = "active"
		}
		if strings.Contains(line, `"InactiveFirmwareSlot"`) {
			section = "inactive"
		}

		if m := reVersion.FindStringSubmatch(line); len(m) > 1 {
			version := strings.TrimSpace(m[1])
			if version == "" {
				section = ""
				continue
			}
			switch section {
			case "active":
				if current.ActiveVersion == "" {
					current.ActiveVersion = version
				}
			case "inactive":
				if current.InactiveVersion == "" {
					current.InactiveVersion = version
				}
			default:
				// Keep top-level version from the last seen plain "Version" in current entry.
				current.Version = version
			}
			section = ""
		}
	}
	flush()

	return result
}

func extractLogicalGPUIndex(slot string) (int, bool) {
	m := reSlotGPU.FindStringSubmatch(slot)
	if len(m) < 2 {
		return 0, false
	}

	idx, err := strconv.Atoi(m[1])
	if err != nil || idx < 0 {
		return 0, false
	}
	return idx, true
}

func resolveSXMIndex(logicalIdx int, bySXM map[int]hgxGPUAssemblyInfo) int {
	if sxm, ok := hgxLogicalToSXM[logicalIdx]; ok {
		if _, exists := bySXM[sxm]; exists {
			return sxm
		}
	}

	identity := logicalIdx + 1
	if _, exists := bySXM[identity]; exists {
		return identity
	}

	return identity
}

func shouldReplaceGPUModel(model string) bool {
	trimmed := strings.TrimSpace(model)
	if trimmed == "" {
		return true
	}
	switch strings.ToLower(trimmed) {
	case "vga", "3d controller", "display controller", "unknown":
|
||||
return true
|
||||
default:
|
||||
return false
|
||||
}
|
||||
}
|
||||
|
||||
func normalizeHGXGPUInventory(hw *models.HardwareConfig, bySXM map[int]hgxGPUAssemblyInfo) {
|
||||
// Keep only logical HGX GPUs (#GPU0..#GPU7) and remove BMC VGA entries.
|
||||
filtered := make([]models.GPU, 0, len(hw.GPUs))
|
||||
present := make(map[int]bool)
|
||||
for _, gpu := range hw.GPUs {
|
||||
idx, ok := extractLogicalGPUIndex(gpu.Slot)
|
||||
if !ok || idx < 0 || idx > 7 {
|
||||
continue
|
||||
}
|
||||
present[idx] = true
|
||||
filtered = append(filtered, gpu)
|
||||
}
|
||||
|
||||
// If some logical GPUs are missing in asset.json, add placeholders from HGX Redfish assembly.
|
||||
for logicalIdx := 0; logicalIdx <= 7; logicalIdx++ {
|
||||
if present[logicalIdx] {
|
||||
continue
|
||||
}
|
||||
sxm := resolveSXMIndex(logicalIdx, bySXM)
|
||||
info, ok := bySXM[sxm]
|
||||
if !ok {
|
||||
continue
|
||||
}
|
||||
|
||||
filtered = append(filtered, models.GPU{
|
||||
Slot: fmt.Sprintf("#GPU%d", logicalIdx),
|
||||
Model: info.Model,
|
||||
Manufacturer: "NVIDIA",
|
||||
SerialNumber: info.Serial,
|
||||
PartNumber: info.Part,
|
||||
})
|
||||
}
|
||||
|
||||
hw.GPUs = filtered
|
||||
}
|
||||
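The inventory filtering above keys off logical slot names; a minimal standalone sketch of the slot-index extraction (the `#GPU(\d+)` pattern assumed for `reSlotGPU` is inferred from the `#GPU0..#GPU7` slots used in `normalizeHGXGPUInventory`, not shown in this diff):

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// reSlotGPU is assumed to capture the numeric index in slots like "#GPU3".
var reSlotGPU = regexp.MustCompile(`#GPU(\d+)`)

// extractLogicalGPUIndex returns the logical GPU index encoded in a slot name,
// and false for slots that do not follow the "#GPU<n>" convention (e.g. BMC VGA).
func extractLogicalGPUIndex(slot string) (int, bool) {
	m := reSlotGPU.FindStringSubmatch(slot)
	if len(m) < 2 {
		return 0, false
	}
	idx, err := strconv.Atoi(m[1])
	if err != nil || idx < 0 {
		return 0, false
	}
	return idx, true
}

func main() {
	for _, slot := range []string{"#GPU3", "#GPU0", "BMC VGA"} {
		idx, ok := extractLogicalGPUIndex(slot)
		fmt.Println(slot, idx, ok)
	}
}
```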
10
internal/parser/vendors/inspur/idl.go
vendored
@@ -8,8 +8,10 @@ import (
 	"git.mchus.pro/mchus/logpile/internal/models"
 )
 
-// ParseIDLLog parses the IDL (Inspur Diagnostic Log) file for BMC alarms
-// Format: |timestamp|component|type|severity|eventID|description|
+// ParseIDLLog parses IDL-style entries for BMC alarms.
+// Works for both plain idl.log lines and JSON structured logs (idl_json/run_json)
+// where MESSAGE/LOG2_FMTMSG contains:
+//	|timestamp|component|type|severity|eventID|description|
 func ParseIDLLog(content []byte) []models.Event {
 	var events []models.Event
 
@@ -21,10 +23,6 @@ func ParseIDLLog(content []byte) []models.Event {
 	seenEvents := make(map[string]bool) // Deduplicate events
 
 	for _, line := range lines {
-		if !strings.Contains(line, "CommerDiagnose") {
-			continue
-		}
-
 		matches := re.FindStringSubmatch(line)
 		if matches == nil {
 			continue
129
internal/parser/vendors/inspur/parser.go
vendored
@@ -8,6 +8,7 @@ package inspur
 import (
 	"fmt"
 	"strings"
+	"time"
 
 	"git.mchus.pro/mchus/logpile/internal/models"
 	"git.mchus.pro/mchus/logpile/internal/parser"
@@ -15,7 +16,7 @@ import (
 
 // parserVersion - version of this parser module
 // IMPORTANT: Increment this version when making changes to parser logic!
-const parserVersion = "1.0.0"
+const parserVersion = "1.5"
 
 func init() {
 	parser.Register(&Parser{})
@@ -86,6 +87,8 @@ func containsInspurMarkers(content []byte) bool {
 
 // Parse parses Inspur/Kaytus archive
 func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, error) {
+	selLocation := inferInspurArchiveLocation(files)
+
 	result := &models.AnalysisResult{
 		Events: make([]models.Event, 0),
 		FRU:    make([]models.FRUInfo, 0),
@@ -123,17 +126,29 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
 		// Extract events from component.log (memory errors, etc.)
 		componentEvents := ParseComponentLogEvents(f.Content)
 		result.Events = append(result.Events, componentEvents...)
+
+		// Extract additional telemetry sensors from component.log sections
+		// (fan RPM, backplane temperature, PSU summary power, etc.).
+		componentSensors := ParseComponentLogSensors(f.Content)
+		result.Sensors = mergeSensorReadings(result.Sensors, componentSensors)
 	}
 
-	// Parse IDL log (BMC alarms/diagnose events)
-	if f := parser.FindFileByName(files, "idl.log"); f != nil {
+	// Enrich runtime component data from Redis snapshot (serials, FW, telemetry),
+	// when text logs miss these fields.
+	if f := parser.FindFileByName(files, "redis-dump.rdb"); f != nil && result.Hardware != nil {
+		enrichFromRedisDump(f.Content, result.Hardware)
+	}
+
+	// Parse IDL-like logs (plain and structured JSON logs with embedded IDL messages)
+	idlFiles := parser.FindFileByPattern(files, "/idl.log", "idl_json.log", "run_json.log")
+	for _, f := range idlFiles {
 		idlEvents := ParseIDLLog(f.Content)
 		result.Events = append(result.Events, idlEvents...)
 	}
 
 	// Parse SEL list (selelist.csv)
 	if f := parser.FindFileByName(files, "selelist.csv"); f != nil {
-		selEvents := ParseSELList(f.Content)
+		selEvents := ParseSELListWithLocation(f.Content, selLocation)
 		result.Events = append(result.Events, selEvents...)
 	}
 
@@ -144,9 +159,71 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
 		result.Events = append(result.Events, events...)
 	}
 
+	// Fallback for archives where board serial is missing in parsed FRU/asset data:
+	// recover it from log content, never from archive filename.
+	if strings.TrimSpace(result.Hardware.BoardInfo.SerialNumber) == "" {
+		if serial := inferBoardSerialFromFallbackLogs(files); serial != "" {
+			result.Hardware.BoardInfo.SerialNumber = serial
+		}
+	}
+	if strings.TrimSpace(result.Hardware.BoardInfo.ProductName) == "" {
+		if model := inferBoardModelFromFallbackLogs(files); model != "" {
+			result.Hardware.BoardInfo.ProductName = model
+		}
+	}
+
+	// Enrich GPU inventory from HGX Redfish snapshot (serial/model/part mapping).
+	if f := parser.FindFileByName(files, "HGX_HWInfo_FWVersion.log"); f != nil && result.Hardware != nil {
+		enrichGPUsFromHGXHWInfo(f.Content, result.Hardware)
+		appendHGXFirmwareFromHWInfo(f.Content, result.Hardware)
+	}
+
+	// Mark problematic GPUs from IDL errors like "BIOS miss F_GPU6".
+	if result.Hardware != nil {
+		applyGPUStatusFromEvents(result.Hardware, result.Events)
+		enrichStorageFromSerialFallbackFiles(files, result.Hardware)
+	}
+
 	return result, nil
 }
 
+func inferInspurArchiveLocation(files []parser.ExtractedFile) *time.Location {
+	fallback := parser.DefaultArchiveLocation()
+	f := parser.FindFileByName(files, "timezone.conf")
+	if f == nil {
+		return fallback
+	}
+	locName := parseTimezoneConfigLocation(f.Content)
+	if strings.TrimSpace(locName) == "" {
+		return fallback
+	}
+	loc, err := time.LoadLocation(locName)
+	if err != nil {
+		return fallback
+	}
+	return loc
+}
+
+func parseTimezoneConfigLocation(content []byte) string {
+	lines := strings.Split(string(content), "\n")
+	for _, line := range lines {
+		line = strings.TrimSpace(line)
+		if line == "" || strings.HasPrefix(line, "[") || strings.HasPrefix(line, "#") || strings.HasPrefix(line, ";") {
+			continue
+		}
+		parts := strings.SplitN(line, "=", 2)
+		if len(parts) != 2 {
+			continue
+		}
+		key := strings.ToLower(strings.TrimSpace(parts[0]))
+		val := strings.TrimSpace(parts[1])
+		if key == "timezone" && val != "" {
+			return val
+		}
+	}
+	return ""
+}
+
 func (p *Parser) parseDeviceFruSDR(content []byte, result *models.AnalysisResult) {
 	lines := string(content)
 
@@ -174,14 +251,9 @@ func (p *Parser) parseDeviceFruSDR(content []byte, result *models.AnalysisResult
 	// This supplements data from asset.json with serial numbers, firmware, etc.
 	pcieDevicesFromREST := ParsePCIeDevices(content)
 
-	// Merge PCIe data: keep asset.json data but add RESTful data if available
+	// Merge PCIe data: asset.json is the base inventory, RESTful data enriches names/links/serials.
 	if result.Hardware != nil {
-		// If asset.json didn't have PCIe devices, use RESTful data
-		if len(result.Hardware.PCIeDevices) == 0 && len(pcieDevicesFromREST) > 0 {
-			result.Hardware.PCIeDevices = pcieDevicesFromREST
-		}
-		// If we have both, merge them (RESTful data takes precedence for detailed info)
-		// For now, we keep asset.json data which has more details
+		result.Hardware.PCIeDevices = MergePCIeDevices(result.Hardware.PCIeDevices, pcieDevicesFromREST)
 	}
 
 	// Parse GPU devices and add temperature data from sensors
@@ -236,3 +308,38 @@ func extractSlotNumberFromGPU(slot string) int {
 	}
 	return 0
 }
+
+func mergeSensorReadings(base, extra []models.SensorReading) []models.SensorReading {
+	if len(extra) == 0 {
+		return base
+	}
+
+	out := append([]models.SensorReading{}, base...)
+	seen := make(map[string]struct{}, len(out))
+	for _, s := range out {
+		if key := sensorMergeKey(s); key != "" {
+			seen[key] = struct{}{}
+		}
+	}
+
+	for _, s := range extra {
+		key := sensorMergeKey(s)
+		if key != "" {
+			if _, ok := seen[key]; ok {
+				continue
+			}
+			seen[key] = struct{}{}
+		}
+		out = append(out, s)
+	}
+
+	return out
+}
+
+func sensorMergeKey(s models.SensorReading) string {
+	name := strings.ToLower(strings.TrimSpace(s.Name))
+	if name == "" {
+		return ""
+	}
+	return name
+}
217
internal/parser/vendors/inspur/pcie.go
vendored
@@ -3,36 +3,38 @@ package inspur
 import (
 	"encoding/json"
 	"fmt"
+	"regexp"
 	"strings"
 
 	"git.mchus.pro/mchus/logpile/internal/models"
+	"git.mchus.pro/mchus/logpile/internal/parser/vendors/pciids"
 )
 
 // PCIeRESTInfo represents the RESTful PCIE Device info structure
 type PCIeRESTInfo []struct {
 	ID               int    `json:"id"`
 	Present          int    `json:"present"`
 	Enable           int    `json:"enable"`
 	Status           int    `json:"status"`
 	VendorID         int    `json:"vendor_id"`
 	VendorName       string `json:"vendor_name"`
 	DeviceID         int    `json:"device_id"`
 	DeviceName       string `json:"device_name"`
 	BusNum           int    `json:"bus_num"`
 	DevNum           int    `json:"dev_num"`
 	FuncNum          int    `json:"func_num"`
 	MaxLinkWidth     int    `json:"max_link_width"`
 	MaxLinkSpeed     int    `json:"max_link_speed"`
 	CurrentLinkWidth int    `json:"current_link_width"`
 	CurrentLinkSpeed int    `json:"current_link_speed"`
 	Slot             int    `json:"slot"`
 	Location         string `json:"location"`
 	DeviceLocator    string `json:"DeviceLocator"`
 	DevType          int    `json:"dev_type"`
 	DevSubtype       int    `json:"dev_subtype"`
 	PartNum          string `json:"part_num"`
 	SerialNum        string `json:"serial_num"`
 	FwVer            string `json:"fw_ver"`
 }
 
 // ParsePCIeDevices parses RESTful PCIE Device info from devicefrusdr.log
@@ -73,9 +75,27 @@ func ParsePCIeDevices(content []byte) []models.PCIeDevice {
 
 		// Determine device class based on dev_type
 		deviceClass := determineDeviceClass(pcie.DevType, pcie.DevSubtype, pcie.DeviceName)
+		_, pciDeviceName := pciids.DeviceInfo(pcie.VendorID, pcie.DeviceID)
 
-		// Build BDF string
-		bdf := fmt.Sprintf("%04x/%02x/%02x/%02x", 0, pcie.BusNum, pcie.DevNum, pcie.FuncNum)
+		// Build BDF string in canonical form (bb:dd.f)
+		bdf := formatBDF(pcie.BusNum, pcie.DevNum, pcie.FuncNum)
+
+		partNumber := strings.TrimSpace(pcie.PartNum)
+		if partNumber == "" {
+			partNumber = sanitizePCIeDeviceName(pcie.DeviceName)
+		}
+		if partNumber == "" {
+			partNumber = normalizeModelLabel(pciDeviceName)
+		}
+		if isGenericPCIeClass(deviceClass) {
+			if resolved := normalizeModelLabel(pciDeviceName); resolved != "" {
+				deviceClass = resolved
+			}
+		}
+		manufacturer := strings.TrimSpace(pcie.VendorName)
+		if manufacturer == "" {
+			manufacturer = normalizeModelLabel(pciids.VendorName(pcie.VendorID))
+		}
 
 		device := models.PCIeDevice{
 			Slot: pcie.Location,
@@ -83,12 +103,12 @@ func ParsePCIeDevices(content []byte) []models.PCIeDevice {
 			DeviceID:     pcie.DeviceID,
 			BDF:          bdf,
 			DeviceClass:  deviceClass,
-			Manufacturer: pcie.VendorName,
+			Manufacturer: manufacturer,
 			LinkWidth:    pcie.CurrentLinkWidth,
 			LinkSpeed:    currentSpeed,
 			MaxLinkWidth: pcie.MaxLinkWidth,
 			MaxLinkSpeed: maxSpeed,
-			PartNumber:   strings.TrimSpace(pcie.PartNum),
+			PartNumber:   partNumber,
 			SerialNumber: strings.TrimSpace(pcie.SerialNum),
 		}
 
@@ -98,6 +118,149 @@ func ParsePCIeDevices(content []byte) []models.PCIeDevice {
 	return devices
 }
 
+var rawHexDeviceNameRegex = regexp.MustCompile(`(?i)^0x[0-9a-f]+$`)
+
+func sanitizePCIeDeviceName(name string) string {
+	name = strings.TrimSpace(name)
+	if name == "" {
+		return ""
+	}
+	if strings.EqualFold(name, "N/A") {
+		return ""
+	}
+	if rawHexDeviceNameRegex.MatchString(name) {
+		return ""
+	}
+	return name
+}
+
+// MergePCIeDevices enriches base devices (from asset.json) with detailed RESTful PCIe data.
+// Matching is done by BDF first, then by slot fallback.
+func MergePCIeDevices(base []models.PCIeDevice, rest []models.PCIeDevice) []models.PCIeDevice {
+	if len(rest) == 0 {
+		return base
+	}
+	if len(base) == 0 {
+		return append([]models.PCIeDevice(nil), rest...)
+	}
+
+	type ref struct {
+		index int
+	}
+	byBDF := make(map[string]ref, len(base))
+	bySlot := make(map[string]ref, len(base))
+
+	for i := range base {
+		bdf := normalizePCIeBDF(base[i].BDF)
+		if bdf != "" {
+			byBDF[bdf] = ref{index: i}
+		}
+		slot := strings.ToLower(strings.TrimSpace(base[i].Slot))
+		if slot != "" {
+			bySlot[slot] = ref{index: i}
+		}
+	}
+
+	for _, detailed := range rest {
+		idx := -1
+		if bdf := normalizePCIeBDF(detailed.BDF); bdf != "" {
+			if found, ok := byBDF[bdf]; ok {
+				idx = found.index
+			}
+		}
+		if idx == -1 {
+			slot := strings.ToLower(strings.TrimSpace(detailed.Slot))
+			if slot != "" {
+				if found, ok := bySlot[slot]; ok {
+					idx = found.index
+				}
+			}
+		}
+
+		if idx == -1 {
+			base = append(base, detailed)
+			newIdx := len(base) - 1
+			if bdf := normalizePCIeBDF(detailed.BDF); bdf != "" {
+				byBDF[bdf] = ref{index: newIdx}
+			}
+			if slot := strings.ToLower(strings.TrimSpace(detailed.Slot)); slot != "" {
+				bySlot[slot] = ref{index: newIdx}
+			}
+			continue
+		}
+
+		enrichPCIeDevice(&base[idx], detailed)
+	}
+
+	return base
+}
+
+func enrichPCIeDevice(dst *models.PCIeDevice, src models.PCIeDevice) {
+	if dst == nil {
+		return
+	}
+	if strings.TrimSpace(dst.Slot) == "" {
+		dst.Slot = src.Slot
+	}
+	if strings.TrimSpace(dst.BDF) == "" {
+		dst.BDF = src.BDF
+	}
+	if dst.VendorID == 0 {
+		dst.VendorID = src.VendorID
+	}
+	if dst.DeviceID == 0 {
+		dst.DeviceID = src.DeviceID
+	}
+	if strings.TrimSpace(dst.Manufacturer) == "" {
+		dst.Manufacturer = src.Manufacturer
+	}
+	if strings.TrimSpace(dst.SerialNumber) == "" {
+		dst.SerialNumber = src.SerialNumber
+	}
+	if strings.TrimSpace(dst.PartNumber) == "" {
+		dst.PartNumber = src.PartNumber
+	}
+	if strings.TrimSpace(dst.LinkSpeed) == "" || strings.EqualFold(strings.TrimSpace(dst.LinkSpeed), "unknown") {
+		dst.LinkSpeed = src.LinkSpeed
+	}
+	if strings.TrimSpace(dst.MaxLinkSpeed) == "" || strings.EqualFold(strings.TrimSpace(dst.MaxLinkSpeed), "unknown") {
+		dst.MaxLinkSpeed = src.MaxLinkSpeed
+	}
+	if dst.LinkWidth == 0 {
+		dst.LinkWidth = src.LinkWidth
+	}
+	if dst.MaxLinkWidth == 0 {
+		dst.MaxLinkWidth = src.MaxLinkWidth
+	}
+	if isGenericPCIeClass(dst.DeviceClass) && !isGenericPCIeClass(src.DeviceClass) {
+		dst.DeviceClass = src.DeviceClass
+	}
+}
+
+func normalizePCIeBDF(bdf string) string {
+	bdf = strings.TrimSpace(strings.ToLower(bdf))
+	if bdf == "" {
+		return ""
+	}
+
+	if strings.Contains(bdf, "/") {
+		parts := strings.Split(bdf, "/")
+		if len(parts) == 4 {
+			return fmt.Sprintf("%s:%s.%s", parts[1], parts[2], parts[3])
+		}
+	}
+	return bdf
+}
+
+func isGenericPCIeClass(class string) bool {
+	switch strings.ToLower(strings.TrimSpace(class)) {
+	case "", "unknown", "other", "bridge", "network", "storage", "sas", "sata", "display", "vga", "3d controller", "serial bus":
+		return true
+	default:
+		return false
+	}
+}
+
 // determineDeviceClass maps device type to human-readable class
 func determineDeviceClass(devType, devSubtype int, deviceName string) string {
 	// dev_type mapping:
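Since MergePCIeDevices matches entries by BDF first, the legacy slash-separated form and the colon/dot form have to be reconciled; a standalone sketch of the normalization logic used above:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePCIeBDF lower-cases and trims a BDF string, and converts the
// legacy slash-separated form ("dddd/bb/dd/ff") into colon/dot notation so
// entries from asset.json and the RESTful dump can be keyed consistently.
func normalizePCIeBDF(bdf string) string {
	bdf = strings.TrimSpace(strings.ToLower(bdf))
	if bdf == "" {
		return ""
	}
	if strings.Contains(bdf, "/") {
		parts := strings.Split(bdf, "/")
		if len(parts) == 4 {
			// Drop the domain, keep bus:device.function as-is.
			return fmt.Sprintf("%s:%s.%s", parts[1], parts[2], parts[3])
		}
	}
	return bdf
}

func main() {
	fmt.Println(normalizePCIeBDF("0000/45/00/00")) // 45:00.00
	fmt.Println(normalizePCIeBDF("  98:00.0 "))    // 98:00.0
}
```

Note that the legacy form carries a two-digit function field, so the normalized result keeps it ("45:00.00"); matching works as long as both sides of a comparison pass through the same normalizer.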
77
internal/parser/vendors/inspur/pcie_test.go
vendored
Normal file
@@ -0,0 +1,77 @@
package inspur

import (
	"strings"
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestParsePCIeDevices_UsesDeviceNameAsModelWhenPartNumberMissing(t *testing.T) {
	content := []byte(`RESTful PCIE Device info:
[{"id":1,"present":1,"vendor_id":32902,"vendor_name":"Intel","device_id":5409,"device_name":"I350T4V2","bus_num":69,"dev_num":0,"func_num":0,"max_link_width":4,"max_link_speed":2,"current_link_width":4,"current_link_speed":2,"location":"#CPU0_PCIE4","dev_type":2,"dev_subtype":0,"part_num":"","serial_num":"","fw_ver":""}]
BMC sdr Info:`)

	devices := ParsePCIeDevices(content)
	if len(devices) != 1 {
		t.Fatalf("expected 1 device, got %d", len(devices))
	}
	if devices[0].PartNumber != "I350T4V2" {
		t.Fatalf("expected part/model I350T4V2, got %q", devices[0].PartNumber)
	}
	if devices[0].BDF != "45:00.0" {
		t.Fatalf("expected BDF 45:00.0, got %q", devices[0].BDF)
	}
}

func TestMergePCIeDevices_EnrichesGenericAssetEntry(t *testing.T) {
	base := []models.PCIeDevice{
		{
			Slot:         "#CPU1_PCIE9",
			BDF:          "98:00.0",
			VendorID:     0x9005,
			DeviceID:     0x028f,
			DeviceClass:  "SAS",
			Manufacturer: "Adaptec / Microsemi",
		},
	}
	rest := []models.PCIeDevice{
		{
			Slot:         "#CPU1_PCIE9",
			BDF:          "98:00.0",
			VendorID:     0x9005,
			DeviceID:     0x028f,
			DeviceClass:  "Storage Controller",
			Manufacturer: "Microchip",
			PartNumber:   "PM8222-SHBA",
		},
	}

	got := MergePCIeDevices(base, rest)
	if len(got) != 1 {
		t.Fatalf("expected 1 merged device, got %d", len(got))
	}
	if got[0].PartNumber != "PM8222-SHBA" {
		t.Fatalf("expected merged part number PM8222-SHBA, got %q", got[0].PartNumber)
	}
}

func TestParsePCIeDevices_ResolvesModelFromPCIIDsWhenDeviceNameIsRawHex(t *testing.T) {
	content := []byte(`RESTful PCIE Device info:
[{"id":5,"present":1,"vendor_id":36869,"vendor_name":"","device_id":655,"device_name":"0x028F","bus_num":152,"dev_num":0,"func_num":0,"max_link_width":8,"max_link_speed":3,"current_link_width":8,"current_link_speed":3,"location":"#CPU1_PCIE9","dev_type":1,"dev_subtype":7,"part_num":"","serial_num":"","fw_ver":""}]
BMC sdr Info:`)

	devices := ParsePCIeDevices(content)
	if len(devices) != 1 {
		t.Fatalf("expected 1 device, got %d", len(devices))
	}
	if devices[0].PartNumber == "" {
		t.Fatalf("expected part number resolved from pci.ids, got empty")
	}
	if strings.HasPrefix(strings.ToLower(strings.TrimSpace(devices[0].PartNumber)), "0x") {
		t.Fatalf("expected resolved name instead of raw hex, got %q", devices[0].PartNumber)
	}
	if devices[0].Manufacturer == "" {
		t.Fatalf("expected manufacturer resolved from pci.ids")
	}
}
559
internal/parser/vendors/inspur/redis_dump.go
vendored
Normal file
@@ -0,0 +1,559 @@
package inspur

import (
	"encoding/hex"
	"regexp"
	"sort"
	"strconv"
	"strings"
	"unicode"

	"git.mchus.pro/mchus/logpile/internal/models"
)

var (
	reRedisGPUKey     = regexp.MustCompile(`GPUInfo:REDIS_GPUINFO_T([0-9]+):([A-Za-z0-9_]+)`)
	reRedisNICKey     = regexp.MustCompile(`RedisNicInfo:redis_nic_info_t:stNicDeviceInfo([0-9]+):([A-Za-z0-9_]+)`)
	reRedisRAIDSerial = regexp.MustCompile(`RAIDMSCCInfo:redis_pcie_mscc_raid_info_t([0-9]+):RAIDInfo:SerialNum`)
	reRedisPCIESNPN   = regexp.MustCompile(`AssetInfoPCIE:SNPN([0-9]+):(SN|PN)`)
)

type redisGPUSnapshot struct {
	ByIndex map[int]map[string]string
}

type redisNICSnapshot struct {
	ByIndex map[int]map[string]string
}

type redisPCIESerialSnapshot struct {
	ByPart map[string]string
}

func enrichFromRedisDump(content []byte, hw *models.HardwareConfig) {
	if hw == nil || len(content) == 0 {
		return
	}

	gpuSnap := parseRedisGPUSnapshot(content)
	nicSnap := parseRedisNICSnapshot(content)
	raidSerials := parseRedisRAIDSerials(content)
	pcieSnap := parseRedisPCIESerialSnapshot(content)

	applyRedisGPUEnrichment(hw, gpuSnap)
	applyRedisNICEnrichment(hw, nicSnap)
	applyRedisPCIESNPNEnrichment(hw, pcieSnap)
	applyRedisPCIeEnrichment(hw, raidSerials)
}

func parseRedisRAIDSerials(content []byte) []string {
	matches := reRedisRAIDSerial.FindAllSubmatchIndex(content, -1)
	if len(matches) == 0 {
		return nil
	}

	seen := make(map[string]bool, len(matches))
	serials := make([]string, 0, len(matches))
	for _, m := range matches {
		if len(m) < 4 {
			continue
		}
		value := normalizeRedisValue(extractRedisCandidateValue(content, m[1]))
		if value == "" || seen[value] {
			continue
		}
		seen[value] = true
		serials = append(serials, value)
	}
	return serials
}

func parseRedisPCIESerialSnapshot(content []byte) redisPCIESerialSnapshot {
	type rec struct {
		PN string
		SN string
	}
	tmp := make(map[int]rec)

	matches := reRedisPCIESNPN.FindAllSubmatchIndex(content, -1)
	for _, m := range matches {
		if len(m) < 6 {
			continue
		}
		idxStr := string(content[m[2]:m[3]])
		field := string(content[m[4]:m[5]])
		idx, err := strconv.Atoi(idxStr)
		if err != nil {
			continue
		}

		value := normalizeRedisValue(extractRedisCandidateValue(content, m[1]))
		if value == "" {
			continue
		}

		r := tmp[idx]
		if field == "PN" {
			r.PN = value
		} else if field == "SN" {
			r.SN = value
		}
		tmp[idx] = r
	}

	out := redisPCIESerialSnapshot{ByPart: make(map[string]string)}
	for _, r := range tmp {
		pn := normalizeRedisValue(r.PN)
		sn := normalizeRedisValue(r.SN)
		if pn == "" || sn == "" {
			continue
		}
		out.ByPart[strings.ToLower(strings.TrimSpace(pn))] = sn
	}
	return out
}

func parseRedisGPUSnapshot(content []byte) redisGPUSnapshot {
	snap := redisGPUSnapshot{ByIndex: make(map[int]map[string]string)}
	matches := reRedisGPUKey.FindAllSubmatchIndex(content, -1)
	for _, m := range matches {
		if len(m) < 6 {
			continue
		}

		idxStr := string(content[m[2]:m[3]])
		field := string(content[m[4]:m[5]])
		idx, err := strconv.Atoi(idxStr)
		if err != nil {
			continue
		}

		value := extractRedisInlineValue(content, m[1])
		if value == "" {
			continue
		}

		byField, ok := snap.ByIndex[idx]
		if !ok {
			byField = make(map[string]string)
			snap.ByIndex[idx] = byField
		}
		byField[field] = value
	}
	return snap
}

func parseRedisNICSnapshot(content []byte) redisNICSnapshot {
	snap := redisNICSnapshot{ByIndex: make(map[int]map[string]string)}
	matches := reRedisNICKey.FindAllSubmatchIndex(content, -1)
	for _, m := range matches {
		if len(m) < 6 {
			continue
		}

		idxStr := string(content[m[2]:m[3]])
		field := string(content[m[4]:m[5]])
		idx, err := strconv.Atoi(idxStr)
		if err != nil {
			continue
		}

		value := extractRedisInlineValue(content, m[1])
		if value == "" {
			continue
		}

		byField, ok := snap.ByIndex[idx]
		if !ok {
			byField = make(map[string]string)
			snap.ByIndex[idx] = byField
		}
		byField[field] = value
	}
	return snap
}

func extractRedisInlineValue(content []byte, start int) string {
	if start < 0 || start >= len(content) {
		return ""
	}

	i := start
	for i < len(content) && content[i] <= 0x20 {
		i++
	}
	if i >= len(content) {
		return ""
	}

	j := i
	for j < len(content) {
		c := content[j]
		if c == 0 || c < 0x20 || c > 0x7e {
			break
		}
		j++
	}

	if j <= i {
		return ""
	}

	raw := strings.TrimSpace(string(content[i:j]))
	if raw == "" {
		return ""
	}

	decoded := maybeDecodeHexString(raw)
	if decoded != "" {
		return decoded
	}
	return raw
}

func extractRedisCandidateValue(content []byte, start int) string {
	// Fast-path for simple inline string values.
	if v := extractRedisInlineValue(content, start); normalizeRedisValue(v) != "" {
		return v
	}

	if start < 0 || start >= len(content) {
		return ""
	}

	end := start + 256
	if end > len(content) {
		end = len(content)
	}
	window := content[start:end]

	for _, token := range splitAlphaNumTokens(window) {
		if len(token) < 6 {
			continue
		}
		lower := strings.ToLower(token)
		if strings.Contains(lower, "redis") || strings.Contains(lower, "sensor") || strings.Contains(lower, "fullsdr") {
			continue
		}
		if decoded := maybeDecodeHexString(token); normalizeRedisValue(decoded) != "" {
			return decoded
		}
		if normalizeRedisValue(token) != "" {
			return token
		}
	}
	return ""
}

func splitAlphaNumTokens(b []byte) []string {
	var out []string
	start := -1
	for i := 0; i < len(b); i++ {
		c := rune(b[i])
		if unicode.IsLetter(c) || unicode.IsDigit(c) {
			if start == -1 {
				start = i
			}
			continue
		}
		if start != -1 {
			out = append(out, string(b[start:i]))
			start = -1
		}
	}
	if start != -1 {
		out = append(out, string(b[start:]))
	}
	return out
}

func maybeDecodeHexString(s string) string {
	if len(s) < 8 || len(s)%2 != 0 {
		return ""
	}

	for _, c := range s {
		if (c < '0' || c > '9') && (c < 'a' || c > 'f') && (c < 'A' || c > 'F') {
			return ""
		}
	}

	b, err := hex.DecodeString(s)
	if err != nil {
		return ""
	}
	decoded := strings.TrimSpace(strings.TrimRight(string(b), "\x00"))
	if decoded == "" {
		return ""
	}
	for _, c := range decoded {
		if c < 0x20 || c > 0x7e {
			return ""
		}
	}
	return decoded
}
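The hex decoder above can be exercised on its own; a minimal standalone sketch (the sample payload is hypothetical, chosen to decode to printable ASCII):

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// maybeDecodeHexString mirrors the decoder above: accept only even-length
// strings of at least 8 hex digits whose decoded bytes (after stripping
// trailing NULs and whitespace) are all printable ASCII; otherwise return "".
func maybeDecodeHexString(s string) string {
	if len(s) < 8 || len(s)%2 != 0 {
		return ""
	}
	for _, c := range s {
		if (c < '0' || c > '9') && (c < 'a' || c > 'f') && (c < 'A' || c > 'F') {
			return ""
		}
	}
	b, err := hex.DecodeString(s)
	if err != nil {
		return ""
	}
	decoded := strings.TrimSpace(strings.TrimRight(string(b), "\x00"))
	for _, c := range decoded {
		if c < 0x20 || c > 0x7e {
			return ""
		}
	}
	return decoded
}

func main() {
	fmt.Printf("%q\n", maybeDecodeHexString("4750552d53455249414c")) // "GPU-SERIAL"
	fmt.Printf("%q\n", maybeDecodeHexString("zzzz"))                 // "" (too short, not hex)
}
```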
|
||||

func applyRedisGPUEnrichment(hw *models.HardwareConfig, snap redisGPUSnapshot) {
	if len(hw.GPUs) == 0 || len(snap.ByIndex) == 0 {
		return
	}

	type redisGPU struct {
		Index int
		Data  map[string]string
	}
	redisGPUs := make([]redisGPU, 0, len(snap.ByIndex))
	for idx, data := range snap.ByIndex {
		if data == nil {
			continue
		}
		if data["NV_GPU_SerialNumber"] == "" && data["NV_GPU_FWVersion"] == "" && data["NV_GPU_UUID"] == "" {
			continue
		}
		redisGPUs = append(redisGPUs, redisGPU{Index: idx, Data: data})
	}
	if len(redisGPUs) == 0 {
		return
	}
	sort.Slice(redisGPUs, func(i, j int) bool { return redisGPUs[i].Index < redisGPUs[j].Index })

	target := make([]*models.GPU, 0, len(hw.GPUs))
	for i := range hw.GPUs {
		gpu := &hw.GPUs[i]
		if isNVIDIAGPU(gpu) {
			target = append(target, gpu)
		}
	}
	if len(target) == 0 || len(target) != len(redisGPUs) {
		return
	}
	sort.Slice(target, func(i, j int) bool {
		left := strings.TrimSpace(target[i].BDF)
		right := strings.TrimSpace(target[j].BDF)
		if left != "" && right != "" {
			return left < right
		}
		return strings.TrimSpace(target[i].Slot) < strings.TrimSpace(target[j].Slot)
	})

	for i := range target {
		applyRedisGPUFields(target[i], redisGPUs[i].Data)
	}
}

func isNVIDIAGPU(gpu *models.GPU) bool {
	if gpu == nil {
		return false
	}
	if gpu.VendorID == 0x10de {
		return true
	}
	man := strings.ToLower(strings.TrimSpace(gpu.Manufacturer))
	return strings.Contains(man, "nvidia")
}

func applyRedisGPUFields(gpu *models.GPU, fields map[string]string) {
	if gpu == nil || fields == nil {
		return
	}

	if serial := normalizeRedisValue(fields["NV_GPU_SerialNumber"]); serial != "" && isMissingGPUField(gpu.SerialNumber) {
		gpu.SerialNumber = serial
	}
	if fw := normalizeRedisValue(fields["NV_GPU_FWVersion"]); fw != "" && isMissingGPUField(gpu.Firmware) {
		gpu.Firmware = fw
	}
	if uuid := normalizeRedisValue(fields["NV_GPU_UUID"]); uuid != "" && isMissingGPUField(gpu.UUID) {
		gpu.UUID = uuid
	}

	if part := normalizeRedisValue(fields["NVGPUPartNumber"]); part != "" && isMissingGPUField(gpu.PartNumber) {
		gpu.PartNumber = part
	}
	if model := normalizeRedisValue(fields["NVGPUMarketingName"]); model != "" && isGenericGPUModel(gpu.Model) {
		gpu.Model = model
	}

	if gpu.ClockSpeed == 0 {
		if mhz, ok := parseIntField(fields["OperatingSpeedMHz"]); ok {
			gpu.ClockSpeed = mhz
		}
	}
	if gpu.Power == 0 {
		if pwr, ok := parseIntField(fields["GPUTotalPower"]); ok {
			gpu.Power = pwr
		}
	}
	if gpu.Temperature == 0 {
		if temp, ok := parseIntField(fields["Temp"]); ok {
			gpu.Temperature = temp
		}
	}
	if gpu.MemTemperature == 0 {
		if temp, ok := parseIntField(fields["MemTemp"]); ok {
			gpu.MemTemperature = temp
		}
	}
}

func parseIntField(v string) (int, bool) {
	v = normalizeRedisValue(v)
	if v == "" {
		return 0, false
	}
	n, err := strconv.Atoi(v)
	if err != nil {
		return 0, false
	}
	return n, true
}

func normalizeRedisValue(v string) string {
	v = strings.TrimSpace(v)
	if v == "" {
		return ""
	}
	l := strings.ToLower(v)
	if l == "n/a" || l == "na" || l == "null" || l == "unknown" {
		return ""
	}
	return v
}

func isMissingGPUField(v string) bool {
	return normalizeRedisValue(v) == ""
}

func isGenericGPUModel(model string) bool {
	m := strings.ToLower(strings.TrimSpace(model))
	switch m {
	case "", "unknown", "display", "display controller", "3d controller", "vga", "gpu":
		return true
	default:
		return false
	}
}

func applyRedisNICEnrichment(hw *models.HardwareConfig, snap redisNICSnapshot) {
	if len(hw.NetworkAdapters) == 0 || len(snap.ByIndex) == 0 {
		return
	}

	type redisNIC struct {
		Index int
		Data  map[string]string
	}
	redisNICs := make([]redisNIC, 0, len(snap.ByIndex))
	for idx, data := range snap.ByIndex {
		if data == nil {
			continue
		}
		if normalizeRedisValue(data["FWVersion"]) == "" {
			continue
		}
		redisNICs = append(redisNICs, redisNIC{Index: idx, Data: data})
	}
	if len(redisNICs) == 0 {
		return
	}
	sort.Slice(redisNICs, func(i, j int) bool { return redisNICs[i].Index < redisNICs[j].Index })

	target := make([]*models.NetworkAdapter, 0, len(hw.NetworkAdapters))
	for i := range hw.NetworkAdapters {
		nic := &hw.NetworkAdapters[i]
		if nic.Present {
			target = append(target, nic)
		}
	}
	if len(target) == 0 {
		return
	}
	sort.Slice(target, func(i, j int) bool {
		left := strings.TrimSpace(target[i].Location)
		right := strings.TrimSpace(target[j].Location)
		if left != "" && right != "" {
			return left < right
		}
		return strings.TrimSpace(target[i].Slot) < strings.TrimSpace(target[j].Slot)
	})

	limit := len(target)
	if len(redisNICs) < limit {
		limit = len(redisNICs)
	}
	for i := 0; i < limit; i++ {
		nic := target[i]
		data := redisNICs[i].Data

		if fw := normalizeRedisValue(data["FWVersion"]); fw != "" && normalizeRedisValue(nic.Firmware) == "" {
			nic.Firmware = fw
		}
		if serial := normalizeRedisValue(data["SerialNum"]); serial != "" && normalizeRedisValue(nic.SerialNumber) == "" {
			nic.SerialNumber = serial
		}
		if part := normalizeRedisValue(data["PartNum"]); part != "" && normalizeRedisValue(nic.PartNumber) == "" {
			nic.PartNumber = part
		}
	}
}

func applyRedisPCIeEnrichment(hw *models.HardwareConfig, raidSerials []string) {
	if hw == nil || len(hw.PCIeDevices) == 0 || len(raidSerials) == 0 {
		return
	}

	target := make([]*models.PCIeDevice, 0, len(hw.PCIeDevices))
	for i := range hw.PCIeDevices {
		dev := &hw.PCIeDevices[i]
		if normalizeRedisValue(dev.SerialNumber) != "" {
			continue
		}
		class := strings.ToLower(strings.TrimSpace(dev.DeviceClass))
		part := strings.ToLower(strings.TrimSpace(dev.PartNumber))
		if strings.Contains(class, "raid") || strings.Contains(class, "sas") || strings.Contains(class, "storage") ||
			strings.Contains(part, "raid") || strings.Contains(part, "sas") || strings.Contains(part, "hba") {
			target = append(target, dev)
		}
	}
	if len(target) == 0 {
		return
	}

	sort.Slice(target, func(i, j int) bool {
		left := strings.TrimSpace(target[i].BDF)
		right := strings.TrimSpace(target[j].BDF)
		if left != "" && right != "" {
			return left < right
		}
		return strings.TrimSpace(target[i].Slot) < strings.TrimSpace(target[j].Slot)
	})

	limit := len(target)
	if len(raidSerials) < limit {
		limit = len(raidSerials)
	}
	for i := 0; i < limit; i++ {
		target[i].SerialNumber = raidSerials[i]
	}
}

func applyRedisPCIESNPNEnrichment(hw *models.HardwareConfig, snap redisPCIESerialSnapshot) {
	if hw == nil || len(hw.PCIeDevices) == 0 || len(snap.ByPart) == 0 {
		return
	}

	for i := range hw.PCIeDevices {
		dev := &hw.PCIeDevices[i]
		if normalizeRedisValue(dev.SerialNumber) != "" {
			continue
		}
		part := strings.ToLower(strings.TrimSpace(dev.PartNumber))
		if part == "" {
			continue
		}
		if serial := normalizeRedisValue(snap.ByPart[part]); serial != "" {
			dev.SerialNumber = serial
		}
	}
}
internal/parser/vendors/inspur/redis_dump_test.go (vendored, new file, 144 lines)
@@ -0,0 +1,144 @@
package inspur

import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestExtractRedisInlineValue_DecodesHexEncodedString(t *testing.T) {
	data := []byte("RedisNicInfo:redis_nic_info_t:stNicDeviceInfo0:FWVersion 32362e34332e32353636000000000000\x00tail")
	key := []byte("RedisNicInfo:redis_nic_info_t:stNicDeviceInfo0:FWVersion")
	pos := indexBytes(data, key)
	if pos < 0 {
		t.Fatal("key not found")
	}

	got := extractRedisInlineValue(data, pos+len(key))
	if got != "26.43.2566" {
		t.Fatalf("expected decoded fw 26.43.2566, got %q", got)
	}
}

func TestApplyRedisGPUEnrichment_FillsSerialFirmwareUUID(t *testing.T) {
	hw := &models.HardwareConfig{
		GPUs: []models.GPU{
			{Slot: "#CPU0_PCIE2", BDF: "0c:00.0", VendorID: 0x10de, Model: "3D Controller"},
			{Slot: "#CPU0_PCIE1", BDF: "58:00.0", VendorID: 0x10de, Model: "3D Controller"},
		},
	}

	snap := redisGPUSnapshot{
		ByIndex: map[int]map[string]string{
			1: {
				"NV_GPU_SerialNumber": "1321125009572",
				"NV_GPU_FWVersion":    "96.00.B7.00.02",
				"NV_GPU_UUID":         "GPU-AAA",
			},
			2: {
				"NV_GPU_SerialNumber": "1321125010420",
				"NV_GPU_FWVersion":    "96.00.B7.00.02",
				"NV_GPU_UUID":         "GPU-BBB",
			},
		},
	}

	applyRedisGPUEnrichment(hw, snap)

	if hw.GPUs[0].SerialNumber != "1321125009572" || hw.GPUs[0].Firmware != "96.00.B7.00.02" || hw.GPUs[0].UUID != "GPU-AAA" {
		t.Fatalf("unexpected gpu0 enrichment: %+v", hw.GPUs[0])
	}
	if hw.GPUs[1].SerialNumber != "1321125010420" || hw.GPUs[1].Firmware != "96.00.B7.00.02" || hw.GPUs[1].UUID != "GPU-BBB" {
		t.Fatalf("unexpected gpu1 enrichment: %+v", hw.GPUs[1])
	}
}

func TestApplyRedisGPUEnrichment_SkipsOnCountMismatch(t *testing.T) {
	hw := &models.HardwareConfig{
		GPUs: []models.GPU{
			{Slot: "#CPU0_PCIE2", BDF: "0c:00.0", VendorID: 0x10de, Model: "3D Controller"},
		},
	}
	snap := redisGPUSnapshot{
		ByIndex: map[int]map[string]string{
			1: {"NV_GPU_SerialNumber": "1321125009572"},
			2: {"NV_GPU_SerialNumber": "1321125010420"},
		},
	}

	applyRedisGPUEnrichment(hw, snap)
	if hw.GPUs[0].SerialNumber != "" {
		t.Fatalf("expected no enrichment on count mismatch, got %q", hw.GPUs[0].SerialNumber)
	}
}

func TestParseRedisRAIDSerials_DecodesHexSerial(t *testing.T) {
	raw := []byte("RAIDMSCCInfo:redis_pcie_mscc_raid_info_t0:RAIDInfo:SerialNum\x80%@`5341523531314532 \x00tail")
	got := parseRedisRAIDSerials(raw)
	if len(got) != 1 {
		t.Fatalf("expected 1 raid serial, got %d", len(got))
	}
	if got[0] != "SAR511E2" {
		t.Fatalf("expected decoded serial SAR511E2, got %q", got[0])
	}
}

func TestApplyRedisPCIeEnrichment_FillsStorageControllerSerial(t *testing.T) {
	hw := &models.HardwareConfig{
		PCIeDevices: []models.PCIeDevice{
			{Slot: "#CPU1_PCIE9", BDF: "98:00.0", DeviceClass: "Smart Storage PQI SAS", PartNumber: "PM8222-SHBA"},
			{Slot: "#CPU0_PCIE3", BDF: "32:00.0", DeviceClass: "Fibre Channel", PartNumber: "LPE32002"},
		},
	}

	applyRedisPCIeEnrichment(hw, []string{"SAR511E2"})

	if hw.PCIeDevices[0].SerialNumber != "SAR511E2" {
		t.Fatalf("expected PM8222 serial SAR511E2, got %q", hw.PCIeDevices[0].SerialNumber)
	}
	if hw.PCIeDevices[1].SerialNumber != "" {
		t.Fatalf("expected non-storage device serial untouched, got %q", hw.PCIeDevices[1].SerialNumber)
	}
}

func TestParseRedisPCIESerialSnapshot_MapsPNToSN(t *testing.T) {
	raw := []byte("" +
		"AssetInfoPCIE:SNPN9:PN PM8222-SHBA\x00" +
		"AssetInfoPCIE:SNPN9:SN SAR511E2\x00")

	snap := parseRedisPCIESerialSnapshot(raw)
	got := snap.ByPart["pm8222-shba"]
	if got != "SAR511E2" {
		t.Fatalf("expected SN SAR511E2 for PM8222-SHBA, got %q", got)
	}
}

func TestApplyRedisPCIESNPNEnrichment_FillsByPartNumber(t *testing.T) {
	hw := &models.HardwareConfig{
		PCIeDevices: []models.PCIeDevice{
			{Slot: "#CPU1_PCIE9", PartNumber: "PM8222-SHBA"},
		},
	}
	snap := redisPCIESerialSnapshot{ByPart: map[string]string{"pm8222-shba": "SAR511E2"}}

	applyRedisPCIESNPNEnrichment(hw, snap)
	if hw.PCIeDevices[0].SerialNumber != "SAR511E2" {
		t.Fatalf("expected serial SAR511E2, got %q", hw.PCIeDevices[0].SerialNumber)
	}
}

func indexBytes(haystack, needle []byte) int {
	for i := 0; i+len(needle) <= len(haystack); i++ {
		match := true
		for j := 0; j < len(needle); j++ {
			if haystack[i+j] != needle[j] {
				match = false
				break
			}
		}
		if match {
			return i
		}
	}
	return -1
}
internal/parser/vendors/inspur/sel.go (vendored, 17 lines changed)
@@ -6,12 +6,19 @@ import (
 	"time"

 	"git.mchus.pro/mchus/logpile/internal/models"
+	"git.mchus.pro/mchus/logpile/internal/parser"
 )

 // ParseSELList parses selelist.csv file with SEL events
 // Format: ID, Date (MM/DD/YYYY), Time (HH:MM:SS), Sensor, Event, Status
 // Example: 1,04/18/2025,09:31:18,Event Logging Disabled SEL_Status,Log area reset/cleared,Asserted
 func ParseSELList(content []byte) []models.Event {
+	return ParseSELListWithLocation(content, parser.DefaultArchiveLocation())
+}
+
+// ParseSELListWithLocation parses selelist.csv using provided source timezone
+// for timestamps that don't contain an explicit offset.
+func ParseSELListWithLocation(content []byte, location *time.Location) []models.Event {
 	var events []models.Event

 	text := string(content)
@@ -48,7 +55,7 @@ func ParseSELList(content []byte) []models.Event {
 		status := strings.TrimSpace(records[5])

 		// Parse timestamp: MM/DD/YYYY HH:MM:SS
-		timestamp := parseSELTimestamp(dateStr, timeStr)
+		timestamp := parseSELTimestamp(dateStr, timeStr, location)

 		// Extract sensor type and name
 		sensorType, sensorName := parseSensorInfo(sensorStr)
@@ -76,12 +83,16 @@
 }

 // parseSELTimestamp parses MM/DD/YYYY and HH:MM:SS into time.Time
-func parseSELTimestamp(dateStr, timeStr string) time.Time {
+func parseSELTimestamp(dateStr, timeStr string, location *time.Location) time.Time {
 	// Combine date and time: MM/DD/YYYY HH:MM:SS
 	timestampStr := dateStr + " " + timeStr

+	if location == nil {
+		location = parser.DefaultArchiveLocation()
+	}
+
 	// Try parsing with MM/DD/YYYY format
-	t, err := time.Parse("01/02/2006 15:04:05", timestampStr)
+	t, err := time.ParseInLocation("01/02/2006 15:04:05", timestampStr, location)
 	if err != nil {
 		// Fallback to current time
 		return time.Now()
internal/parser/vendors/inspur/sel_test.go (vendored, new file, 33 lines)
@@ -0,0 +1,33 @@
package inspur

import (
	"testing"
	"time"
)

func TestParseSELListWithLocation_UsesProvidedTimezone(t *testing.T) {
	content := []byte("sel elist:\n1,02/28/2026,04:18:18,Sensor X,Event,Asserted\n")
	shanghai, err := time.LoadLocation("Asia/Shanghai")
	if err != nil {
		t.Fatalf("load location: %v", err)
	}

	events := ParseSELListWithLocation(content, shanghai)
	if len(events) != 1 {
		t.Fatalf("expected 1 event, got %d", len(events))
	}

	// 04:18:18 +08:00 == 20:18:18Z (previous day)
	want := time.Date(2026, 2, 27, 20, 18, 18, 0, time.UTC)
	if !events[0].Timestamp.UTC().Equal(want) {
		t.Fatalf("unexpected timestamp: got %s want %s", events[0].Timestamp.UTC(), want)
	}
}

func TestParseTimezoneConfigLocation(t *testing.T) {
	content := []byte("[TimeZoneConfig]\ntimezone=Asia/Shanghai\n")
	got := parseTimezoneConfigLocation(content)
	if got != "Asia/Shanghai" {
		t.Fatalf("unexpected timezone: %q", got)
	}
}
internal/parser/vendors/inspur/serial_fallback.go (vendored, new file, 92 lines)
@@ -0,0 +1,92 @@
package inspur

import (
	"regexp"
	"strings"

	"git.mchus.pro/mchus/logpile/internal/parser"
)

var (
	hostnameJSONRegex = regexp.MustCompile(`"_HOSTNAME"\s*:\s*"([^"]+)"`)
)

func inferBoardSerialFromFallbackLogs(files []parser.ExtractedFile) string {
	// Prefer FRU dump when present.
	if f := parser.FindFileByName(files, "fru.txt"); f != nil {
		fruList := ParseFRU(f.Content)
		for _, fru := range fruList {
			serial := strings.TrimSpace(fru.SerialNumber)
			if serial == "" || serial == "0" {
				continue
			}
			desc := strings.ToLower(strings.TrimSpace(fru.Description))
			if strings.Contains(desc, "builtin") || strings.Contains(desc, "fru device") {
				return serial
			}
		}
	}

	// Fallback to explicit hostname file.
	if f := parser.FindFileByName(files, "hostname"); f != nil {
		if serial := sanitizeCandidateSerial(firstNonEmptyLine(string(f.Content))); serial != "" {
			return serial
		}
	}

	// Last-resort fallback from structured journal logs.
	if f := parser.FindFileByName(files, "maintenance_json.log"); f != nil {
		if m := hostnameJSONRegex.FindSubmatch(f.Content); len(m) == 2 {
			if serial := sanitizeCandidateSerial(string(m[1])); serial != "" {
				return serial
			}
		}
	}

	return ""
}

func inferBoardModelFromFallbackLogs(files []parser.ExtractedFile) string {
	// Prefer FRU dump when present.
	if f := parser.FindFileByName(files, "fru.txt"); f != nil {
		fruList := ParseFRU(f.Content)
		for _, fru := range fruList {
			model := sanitizeCandidateModel(fru.ProductName)
			if model == "" {
				continue
			}
			desc := strings.ToLower(strings.TrimSpace(fru.Description))
			if strings.Contains(desc, "builtin") || strings.Contains(desc, "fru device") {
				return model
			}
		}
	}

	return ""
}

func firstNonEmptyLine(s string) string {
	for _, line := range strings.Split(s, "\n") {
		line = strings.TrimSpace(line)
		if line != "" {
			return line
		}
	}
	return ""
}

func sanitizeCandidateSerial(s string) string {
	s = strings.TrimSpace(s)
	if s == "" || strings.EqualFold(s, "localhost") || strings.ContainsAny(s, " \t") {
		return ""
	}
	return s
}

func sanitizeCandidateModel(s string) string {
	s = strings.TrimSpace(s)
	if s == "" || strings.EqualFold(s, "null") || s == "0" {
		return ""
	}
	return s
}
internal/parser/vendors/inspur/serial_fallback_test.go (vendored, new file, 76 lines)
@@ -0,0 +1,76 @@
package inspur

import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/parser"
)

func TestInferBoardSerialFromFallbackLogs_PrefersFRU(t *testing.T) {
	files := []parser.ExtractedFile{
		{
			Path: "component/fru.txt",
			Content: []byte(`FRU Device Description : Builtin FRU Device (ID 0)
Product Serial : 23DB01639
`),
		},
		{
			Path:    "runningdata/RTOSDump/hostname",
			Content: []byte("HOSTNAME-FALLBACK\n"),
		},
		{
			Path:    "log/bmc/struct-log/maintenance_json.log",
			Content: []byte(`{ "_HOSTNAME": "JSON-FALLBACK" }`),
		},
	}

	got := inferBoardSerialFromFallbackLogs(files)
	if got != "23DB01639" {
		t.Fatalf("expected FRU serial 23DB01639, got %q", got)
	}
}

func TestInferBoardSerialFromFallbackLogs_UsesHostnameFile(t *testing.T) {
	files := []parser.ExtractedFile{
		{
			Path:    "runningdata/RTOSDump/hostname",
			Content: []byte("23DB01639\n"),
		},
	}

	got := inferBoardSerialFromFallbackLogs(files)
	if got != "23DB01639" {
		t.Fatalf("expected hostname serial 23DB01639, got %q", got)
	}
}

func TestInferBoardSerialFromFallbackLogs_UsesMaintenanceJSON(t *testing.T) {
	files := []parser.ExtractedFile{
		{
			Path:    "log/bmc/struct-log/maintenance_json.log",
			Content: []byte(`{ "_HOSTNAME": "23DB01639", "MESSAGE": "ok" }`),
		},
	}

	got := inferBoardSerialFromFallbackLogs(files)
	if got != "23DB01639" {
		t.Fatalf("expected JSON hostname serial 23DB01639, got %q", got)
	}
}

func TestInferBoardModelFromFallbackLogs_PrefersFRU(t *testing.T) {
	files := []parser.ExtractedFile{
		{
			Path: "component/fru.txt",
			Content: []byte(`FRU Device Description : Builtin FRU Device (ID 0)
Board Product : KR9288-X3-A0-F0-00
Product Name : KR9288-X3-A0-F0-00
`),
		},
	}

	got := inferBoardModelFromFallbackLogs(files)
	if got != "KR9288-X3-A0-F0-00" {
		t.Fatalf("expected board model KR9288-X3-A0-F0-00, got %q", got)
	}
}
internal/parser/vendors/inspur/storage_serial_fallback.go (vendored, new file, 148 lines)
@@ -0,0 +1,148 @@
package inspur

import (
	"regexp"
	"sort"
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

var bpHDDSerialTokenRegex = regexp.MustCompile(`[A-Za-z0-9]{8,32}`)

func enrichStorageFromSerialFallbackFiles(files []parser.ExtractedFile, hw *models.HardwareConfig) {
	if hw == nil {
		return
	}
	f := parser.FindFileByName(files, "BpHDDSerialNumber.info")
	if f == nil {
		return
	}
	serials := extractBPHDDSerials(f.Content)
	if len(serials) == 0 {
		return
	}
	applyStorageSerialFallback(hw, serials)
}

func extractBPHDDSerials(content []byte) []string {
	if len(content) == 0 {
		return nil
	}
	matches := bpHDDSerialTokenRegex.FindAllString(string(content), -1)
	if len(matches) == 0 {
		return nil
	}

	out := make([]string, 0, len(matches))
	seen := make(map[string]struct{}, len(matches))
	for _, m := range matches {
		v := normalizeRedisValue(m)
		if !looksLikeStorageSerial(v) {
			continue
		}
		key := strings.ToLower(v)
		if _, ok := seen[key]; ok {
			continue
		}
		seen[key] = struct{}{}
		out = append(out, v)
	}
	return out
}

func looksLikeStorageSerial(v string) bool {
	if len(v) < 8 {
		return false
	}
	hasLetter := false
	hasDigit := false
	for _, r := range v {
		switch {
		case r >= 'A' && r <= 'Z':
			hasLetter = true
		case r >= 'a' && r <= 'z':
			hasLetter = true
		case r >= '0' && r <= '9':
			hasDigit = true
		default:
			return false
		}
	}
	return hasLetter && hasDigit
}

func applyStorageSerialFallback(hw *models.HardwareConfig, serials []string) {
	if hw == nil || len(hw.Storage) == 0 || len(serials) == 0 {
		return
	}

	existing := make(map[string]struct{}, len(hw.Storage))
	for _, dev := range hw.Storage {
		if sn := normalizeRedisValue(dev.SerialNumber); sn != "" {
			existing[strings.ToLower(sn)] = struct{}{}
		}
	}

	filtered := make([]string, 0, len(serials))
	for _, sn := range serials {
		key := strings.ToLower(sn)
		if _, ok := existing[key]; ok {
			continue
		}
		filtered = append(filtered, sn)
	}
	if len(filtered) == 0 {
		return
	}

	type target struct {
		index int
		rank  int
		slot  string
	}
	targets := make([]target, 0, len(hw.Storage))
	for i := range hw.Storage {
		dev := hw.Storage[i]
		if normalizeRedisValue(dev.SerialNumber) != "" {
			continue
		}
		if !dev.Present && strings.TrimSpace(dev.Slot) == "" {
			continue
		}
		rank := 0
		if !dev.Present {
			rank += 10
		}
		if strings.EqualFold(strings.TrimSpace(dev.Type), "NVMe") {
			rank += 5
		}
		if strings.TrimSpace(dev.Slot) == "" {
			rank += 4
		}
		targets = append(targets, target{
			index: i,
			rank:  rank,
			slot:  strings.ToLower(strings.TrimSpace(dev.Slot)),
		})
	}
	if len(targets) == 0 {
		return
	}

	sort.Slice(targets, func(i, j int) bool {
		if targets[i].rank != targets[j].rank {
			return targets[i].rank < targets[j].rank
		}
		return targets[i].slot < targets[j].slot
	})

	for i := 0; i < len(targets) && i < len(filtered); i++ {
		dev := &hw.Storage[targets[i].index]
		dev.SerialNumber = filtered[i]
		if !dev.Present {
			dev.Present = true
		}
	}
}
internal/parser/vendors/inspur/storage_serial_fallback_test.go (vendored, new file, 106 lines)
@@ -0,0 +1,106 @@
package inspur

import (
	"strings"
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

func TestParseAssetJSON_HddSlotFallbackAndPresence(t *testing.T) {
	content := []byte(`{
  "HddInfo": [
    {
      "PresentBitmap": [1],
      "SerialNumber": "",
      "Manufacturer": "",
      "ModelName": "",
      "FirmwareVersion": "",
      "Capacity": 0,
      "Location": 2,
      "DiskInterfaceType": 5,
      "MediaType": 1,
      "LocationString": ""
    }
  ]
}`)

	hw, err := ParseAssetJSON(content)
	if err != nil {
		t.Fatalf("ParseAssetJSON failed: %v", err)
	}
	if len(hw.Storage) != 1 {
		t.Fatalf("expected 1 storage entry, got %d", len(hw.Storage))
	}
	if hw.Storage[0].Slot != "OB03" {
		t.Fatalf("expected OB03 slot fallback, got %q", hw.Storage[0].Slot)
	}
	if !hw.Storage[0].Present {
		t.Fatalf("expected fallback storage entry marked present")
	}
	if hw.Storage[0].Type != "NVMe" {
		t.Fatalf("expected NVMe type, got %q", hw.Storage[0].Type)
	}
}

func TestParseDiskBackplaneInfo_PopulatesOnlyMissingPresentDrives(t *testing.T) {
	text := `RESTful diskbackplane info:
[
  { "port_count": 8, "driver_count": 4, "front": 1, "backplane_index": 0, "present": 1, "cpld_version": "3.1", "temperature": 18 },
  { "port_count": 8, "driver_count": 3, "front": 1, "backplane_index": 1, "present": 1, "cpld_version": "3.1", "temperature": 17 }
]
BMC`

	hw := &models.HardwareConfig{
		Storage: []models.Storage{
			{Slot: "OB01", Type: "NVMe", Present: true},
			{Slot: "OB02", Type: "NVMe", Present: true},
			{Slot: "OB03", Type: "NVMe", Present: true},
			{Slot: "OB04", Type: "NVMe", Present: true},
		},
	}

	parseDiskBackplaneInfo(text, hw)

	if len(hw.Storage) != 7 {
		t.Fatalf("expected total storage count 7 after backplane merge, got %d", len(hw.Storage))
	}
	bpCount := 0
	for _, dev := range hw.Storage {
		if strings.HasPrefix(dev.Slot, "BP0:") || strings.HasPrefix(dev.Slot, "BP1:") {
			bpCount++
		}
	}
	if bpCount != 3 {
		t.Fatalf("expected 3 synthetic backplane rows, got %d", bpCount)
	}
}

func TestEnrichStorageFromSerialFallbackFiles_AssignsSerials(t *testing.T) {
	files := []parser.ExtractedFile{
		{
			Path: "onekeylog/configuration/conf/BpHDDSerialNumber.info",
			Content: []byte{
				0xA0, 0xA1, 0xA2, 0xA3,
				'S', '6', 'K', 'N', 'N', 'G', '0', 'W', '4', '2', '8', '5', '5', '2',
				0x00,
				'P', 'H', 'Y', 'I', '5', '2', '7', '1', '0', '0', '4', 'B', '1', 'P', '9', 'D', 'G', 'N',
			},
		},
	}

	hw := &models.HardwareConfig{
		Storage: []models.Storage{
			{Slot: "BP0:0", Type: "HDD", Present: true},
			{Slot: "BP0:1", Type: "HDD", Present: true},
			{Slot: "OB01", Type: "NVMe", Present: true},
		},
	}

	enrichStorageFromSerialFallbackFiles(files, hw)

	if hw.Storage[0].SerialNumber == "" || hw.Storage[1].SerialNumber == "" {
		t.Fatalf("expected serials assigned to present storage entries, got %#v", hw.Storage)
	}
}
internal/parser/vendors/nvidia/README.md (vendored, deleted, 175 lines)
@@ -1,175 +0,0 @@
# NVIDIA Field Diagnostics Parser
|
||||
|
||||
Парсер для диагностических архивов NVIDIA HGX Field Diagnostics.
|
||||
Универсальный парсер, не привязанный к конкретному производителю серверов.
|
||||
|
||||
## Поддерживаемые архивы
|
||||
|
||||
- NVIDIA HGX Field Diag (работает с любыми серверами: Supermicro, Dell, HPE, и т.д.)
|
||||
- Архивы с результатами GPU диагностики NVIDIA
|
||||
|
||||
## Формат архива
|
||||
|
||||
Парсер работает с архивами в формате:
|
||||
- `.tar` (несжатый tar)
|
||||
- `.tar.gz` (сжатый gzip)
|
||||
|
||||
## Recognized files

### Primary files

1. **output.log** - dmidecode output with system information
   - Server manufacturer (Manufacturer)
   - Server model (Product Name) - e.g. SYS-821GE-TNHR
   - Server serial number (Serial Number) - e.g. A514359X5A07900
   - UUID, SKU Number, Family

2. **unified_summary.json** - detailed system and component information
   - GPU information (model, manufacturer, VBIOS, PCI addresses)
   - NVSwitch information (VendorID, DeviceID, link speed/width)
   - Server manufacturer and model information

3. **summary.json** - diagnostic test results
   - GPU test results (inforom, checkinforom, gpumem, gpustress, pcie, nvlink, nvswitch, power)
   - Error codes and test statuses

4. **summary.csv** - alternative format for the test results

### Additional files

- `gpu_fieldiag/*.log` - detailed diagnostic logs for each GPU
- `inventory/*.json` - additional configuration information

## Extracted data

### Hardware Configuration

#### GPUs
```json
{
  "slot": "GPUSXM1",
  "model": "NVIDIA Device 2335",
  "manufacturer": "NVIDIA Corporation",
  "firmware": "96.00.D0.00.03",
  "bdf": "0000:3a:00.0"
}
```

#### NVSwitch (as PCIe devices)
```json
{
  "slot": "NVSWITCHNVSWITCH0",
  "device_class": "NVSwitch",
  "manufacturer": "NVIDIA Corporation",
  "vendor_id": 4318,
  "device_id": 8867,
  "bdf": "0000:05:00.0",
  "link_speed": "16GT/s",
  "link_width": 2
}
```

### Events

Events are created for:
- **Warnings and errors** reported by diagnostic tests
- Example events:
  - `Row remapping failed` - GPU memory error (Warning)
  - Individual tests: connectivity, gpumem, gpustress, pcie, nvlink, nvswitch, power

Severity levels:
- `info` - informational events (tests passed successfully)
- `warning` - warnings (e.g. Row remapping failed)
- `critical` - critical errors (error codes 300+)
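The severity rules above can be sketched as a small classifier. This is a hypothetical helper, not the parser's actual code: the function name, the extraction of the numeric code from the last dash-separated segment of strings like `005-000-1-000000000363`, and the precedence of the code-based rule over the notes-based rule are all our assumptions:

```go
package main

import (
    "fmt"
    "strconv"
    "strings"
)

// severityForTest is a hypothetical sketch of the severity levels above.
// Error codes like "005-000-1-000000000363" carry the numeric code in the
// last dash-separated segment; codes 300+ are treated as critical.
func severityForTest(errorCode, notes string) string {
    parts := strings.Split(errorCode, "-")
    // Atoi fails on an empty string after trimming zeros; n stays 0 then.
    n, _ := strconv.Atoi(strings.TrimLeft(parts[len(parts)-1], "0"))
    switch {
    case n >= 300:
        return "critical"
    case strings.TrimSpace(notes) != "":
        return "warning"
    default:
        return "info"
    }
}

func main() {
    fmt.Println(severityForTest("005-000-1-000000000363", "Row remapping failed"))
}
```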
## Usage example

```bash
# Start the web interface
./logpile --file /path/to/A514359X5A07900_logs-20260122-074208.tar

# The web interface will be available at http://localhost:8082
```

## Autodetection

The parser automatically identifies NVIDIA Field Diag archives by the presence of:
- `unified_summary.json` containing the "HGX Field Diag" marker
- `summary.json` and `summary.csv` with test results
- A `gpu_fieldiag/` directory

Confidence score:
- `unified_summary.json` with the "HGX Field Diag" marker: +40
- `summary.json`: +20
- `summary.csv`: +15
- `gpu_fieldiag/` directory: +15
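The scoring table above can be expressed as a simple accumulator over archive paths. A minimal sketch under stated assumptions: the function name `detectConfidence` and the `hasHGXMarker` flag (standing in for the marker check inside `unified_summary.json`) are ours, and the real detection code may weigh files differently:

```go
package main

import (
    "fmt"
    "strings"
)

// detectConfidence is a hypothetical sketch of the confidence table above.
// Each recognized file name adds its weight; the gpu_fieldiag/ directory is
// counted at most once regardless of how many files it contains.
func detectConfidence(paths []string, hasHGXMarker bool) int {
    score := 0
    seenFieldiagDir := false
    for _, p := range paths {
        name := p[strings.LastIndex(p, "/")+1:]
        switch name {
        case "unified_summary.json":
            if hasHGXMarker {
                score += 40
            }
        case "summary.json":
            score += 20
        case "summary.csv":
            score += 15
        }
        if !seenFieldiagDir && strings.Contains(p, "gpu_fieldiag/") {
            score += 15
            seenFieldiagDir = true
        }
    }
    return score
}

func main() {
    paths := []string{
        "unified_summary.json",
        "summary.json",
        "summary.csv",
        "gpu_fieldiag/SXM1_SN_123/output.log",
    }
    fmt.Println(detectConfidence(paths, true)) // 40+20+15+15
}
```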
## Versioning

**Current parser version:** 1.1.0

Whenever the parser logic changes, bump the version in the `parserVersion` constant in `parser.go`.

### Version history

- **1.1.0** - Added output.log (dmidecode) parsing to extract the server model and serial number
- **1.0.0** - Initial version parsing unified_summary.json and summary.json/csv

## Data examples

### Example unified_summary.json
```json
{
  "runInfo": {
    "diagVersion": "24287-XXXX-FLD-42658",
    "diagName": "HGX Field Diag",
    "finalResult": "FAIL",
    "errorCode": 363
  },
  "tests": [{
    "virtualId": "inventory",
    "components": [{
      "componentId": "GPUSXM1",
      "properties": [
        {"id": "Manufacturer", "value": "Any Server Vendor"},
        {"id": "VendorID", "value": "10de"},
        {"id": "DeviceID", "value": "2335"}
      ]
    }]
  }]
}
```

### Example summary.json
```json
[
  {
    "Error Code": "005-000-1-000000000363",
    "Test": "gpumem",
    "Component ID": "SXM5_SN_1653925025497",
    "Notes": "Row remapping failed",
    "Virtual ID": "gpumem"
  }
]
```

## Known limitations

1. The parser focuses on data from `unified_summary.json` and `summary.json`
2. Detailed logs in `gpu_fieldiag/*.log` are not parsed yet
3. CPU, memory, and disk information is not extracted (it is absent from the archive)

## Development

### Adding new fields

1. Study the JSON structure in the archive
2. Add the fields to the `Component` or `Property` structs
3. Update the `parseGPUComponent` or `parseNVSwitchComponent` functions
4. Bump the parser version

### Adding new file types

1. Create a new file with the parser logic (e.g. `gpu_logs.go`)
2. Add the parsing call to the `Parse()` function in `parser.go`
3. Update the documentation
274
internal/parser/vendors/nvidia/component_status_time.go
vendored
Normal file
@@ -0,0 +1,274 @@
package nvidia

import (
    "regexp"
    "strconv"
    "strings"
    "time"

    "git.mchus.pro/mchus/logpile/internal/models"
    "git.mchus.pro/mchus/logpile/internal/parser"
)

var verboseRunTestingLineRegex = regexp.MustCompile(`^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+\s+-\s+Testing\s+([a-zA-Z0-9_]+)\s*$`)
var runLogStartTimeRegex = regexp.MustCompile(`^Start time\s+([A-Za-z]{3}, \d{2} [A-Za-z]{3} \d{4} \d{2}:\d{2}:\d{2})\s*$`)
var runLogTestDurationRegex = regexp.MustCompile(`^Testing\s+([a-zA-Z0-9_]+)\s+\S+\s+\[\s*([0-9]+):([0-9]{2})s\s*\]\s*$`)
var modsStartLineRegex = regexp.MustCompile(`(?m)^MODS start:\s+([A-Za-z]{3}\s+[A-Za-z]{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})\s*$`)
var gpuFieldiagOutputPathRegex = regexp.MustCompile(`(?i)gpu_fieldiag[\\/]+sxm(\d+)_sn_([^\\/]+)[\\/]+output\.log$`)
var nvswitchDevnameRegex = regexp.MustCompile(`devname=[^,\s]+,(NVSWITCH\d+)`)

type componentCheckTimes struct {
    GPUDefault      time.Time
    NVSwitchDefault time.Time
    GPUBySerial     map[string]time.Time // key: GPU serial
    GPUBySlot       map[string]time.Time // key: GPUSXM<idx>
    NVSwitchBySlot  map[string]time.Time // key: NVSWITCH<idx>
}

// CollectGPUAndNVSwitchCheckTimes extracts GPU/NVSwitch check timestamps from NVIDIA logs.
// Priority:
//  1) verbose_run.log "Testing <test>" timestamps
//  2) run.log start time + cumulative durations
func CollectGPUAndNVSwitchCheckTimes(files []parser.ExtractedFile) componentCheckTimes {
    gpuBySerial := make(map[string]time.Time)
    gpuBySlot := make(map[string]time.Time)
    nvsBySlot := make(map[string]time.Time)

    for _, f := range files {
        path := strings.TrimSpace(f.Path)
        pathLower := strings.ToLower(path)

        // Per-GPU timestamp from gpu_fieldiag/<SXMx_SN_serial>/output.log
        if strings.HasSuffix(pathLower, "output.log") && strings.Contains(pathLower, "gpu_fieldiag/") {
            ts := parseModsStartTime(f.Content)
            if ts.IsZero() {
                continue
            }
            matches := gpuFieldiagOutputPathRegex.FindStringSubmatch(path)
            if len(matches) == 3 {
                slot := "GPUSXM" + strings.TrimSpace(matches[1])
                serial := strings.TrimSpace(matches[2])
                if slot != "" {
                    gpuBySlot[slot] = ts
                }
                if serial != "" {
                    gpuBySerial[serial] = ts
                }
            }
        }

        // Per-NVSwitch timestamp and slot list from nvswitch/output.log
        if strings.HasSuffix(pathLower, "nvswitch/output.log") || strings.HasSuffix(pathLower, "nvswitch\\output.log") {
            ts := parseModsStartTime(f.Content)
            if ts.IsZero() {
                continue
            }
            for _, slot := range parseNVSwitchSlotsFromOutput(f.Content) {
                nvsBySlot[slot] = ts
            }
        }
    }

    testStarts := make(map[string]time.Time)

    if f := parser.FindFileByName(files, "verbose_run.log"); f != nil {
        for testName, ts := range parseVerboseRunTestStartTimes(f.Content) {
            testStarts[strings.ToLower(strings.TrimSpace(testName))] = ts
        }
    }

    if len(testStarts) == 0 {
        if f := parser.FindFileByName(files, "run.log"); f != nil {
            for testName, ts := range parseRunLogTestStartTimes(f.Content) {
                testStarts[strings.ToLower(strings.TrimSpace(testName))] = ts
            }
        }
    }

    return componentCheckTimes{
        GPUDefault:      pickFirstTestTime(testStarts, "gpu_fieldiag", "gpumem", "gpustress", "pcie", "inventory"),
        NVSwitchDefault: pickFirstTestTime(testStarts, "nvswitch", "inventory"),
        GPUBySerial:     gpuBySerial,
        GPUBySlot:       gpuBySlot,
        NVSwitchBySlot:  nvsBySlot,
    }
}

func pickFirstTestTime(testStarts map[string]time.Time, names ...string) time.Time {
    for _, name := range names {
        if ts := testStarts[strings.ToLower(strings.TrimSpace(name))]; !ts.IsZero() {
            return ts
        }
    }
    return time.Time{}
}

func parseVerboseRunTestStartTimes(content []byte) map[string]time.Time {
    result := make(map[string]time.Time)
    lines := strings.Split(string(content), "\n")
    for _, line := range lines {
        matches := verboseRunTestingLineRegex.FindStringSubmatch(strings.TrimSpace(line))
        if len(matches) != 3 {
            continue
        }

        ts, err := parser.ParseInDefaultArchiveLocation("2006-01-02 15:04:05", strings.TrimSpace(matches[1]))
        if err != nil {
            continue
        }
        testName := strings.ToLower(strings.TrimSpace(matches[2]))
        if testName == "" {
            continue
        }
        if _, exists := result[testName]; !exists {
            result[testName] = ts
        }
    }
    return result
}

func parseRunLogTestStartTimes(content []byte) map[string]time.Time {
    lines := strings.Split(string(content), "\n")
    start := time.Time{}
    for _, line := range lines {
        matches := runLogStartTimeRegex.FindStringSubmatch(strings.TrimSpace(line))
        if len(matches) != 2 {
            continue
        }
        parsed, err := parser.ParseInDefaultArchiveLocation("Mon, 02 Jan 2006 15:04:05", strings.TrimSpace(matches[1]))
        if err != nil {
            continue
        }
        start = parsed
        break
    }
    if start.IsZero() {
        return nil
    }

    result := make(map[string]time.Time)
    cursor := start
    for _, line := range lines {
        matches := runLogTestDurationRegex.FindStringSubmatch(strings.TrimSpace(line))
        if len(matches) != 4 {
            continue
        }

        testName := strings.ToLower(strings.TrimSpace(matches[1]))
        minutes, errMin := strconv.Atoi(strings.TrimSpace(matches[2]))
        seconds, errSec := strconv.Atoi(strings.TrimSpace(matches[3]))
        if errMin != nil || errSec != nil {
            continue
        }
        if _, exists := result[testName]; !exists {
            result[testName] = cursor
        }
        cursor = cursor.Add(time.Duration(minutes)*time.Minute + time.Duration(seconds)*time.Second)
    }

    return result
}

func parseModsStartTime(content []byte) time.Time {
    matches := modsStartLineRegex.FindSubmatch(content)
    if len(matches) != 2 {
        return time.Time{}
    }
    tsRaw := strings.TrimSpace(string(matches[1]))
    if tsRaw == "" {
        return time.Time{}
    }
    ts, err := parser.ParseInDefaultArchiveLocation("Mon Jan 2 15:04:05 2006", tsRaw)
    if err != nil {
        return time.Time{}
    }
    return ts
}

func parseNVSwitchSlotsFromOutput(content []byte) []string {
    matches := nvswitchDevnameRegex.FindAllSubmatch(content, -1)
    if len(matches) == 0 {
        return nil
    }
    seen := make(map[string]struct{})
    out := make([]string, 0, len(matches))
    for _, m := range matches {
        if len(m) != 2 {
            continue
        }
        slot := strings.ToUpper(strings.TrimSpace(string(m[1])))
        if slot == "" {
            continue
        }
        if _, exists := seen[slot]; exists {
            continue
        }
        seen[slot] = struct{}{}
        out = append(out, slot)
    }
    return out
}

// ApplyGPUAndNVSwitchCheckTimes writes parsed check timestamps to component status metadata.
func ApplyGPUAndNVSwitchCheckTimes(result *models.AnalysisResult, times componentCheckTimes) {
    if result == nil || result.Hardware == nil {
        return
    }

    for i := range result.Hardware.GPUs {
        gpu := &result.Hardware.GPUs[i]
        ts := time.Time{}
        if serial := strings.TrimSpace(gpu.SerialNumber); serial != "" {
            ts = times.GPUBySerial[serial]
        }
        if ts.IsZero() {
            ts = times.GPUBySlot[strings.ToUpper(strings.TrimSpace(gpu.Slot))]
        }
        if ts.IsZero() {
            ts = times.GPUDefault
        }
        if ts.IsZero() {
            continue
        }
        gpu.StatusCheckedAt = &ts
        status := strings.TrimSpace(gpu.Status)
        if status == "" {
            status = "Unknown"
        }
        gpu.StatusAtCollect = &models.StatusAtCollection{
            Status: status,
            At:     ts,
        }
    }

    for i := range result.Hardware.PCIeDevices {
        dev := &result.Hardware.PCIeDevices[i]
        slot := normalizeNVSwitchSlot(strings.TrimSpace(dev.Slot))
        if slot == "" {
            continue
        }
        slot = strings.ToUpper(slot)
        if !strings.EqualFold(strings.TrimSpace(dev.DeviceClass), "NVSwitch") &&
            !strings.HasPrefix(slot, "NVSWITCH") {
            continue
        }

        ts := times.NVSwitchBySlot[slot]
        if ts.IsZero() {
            ts = times.NVSwitchDefault
        }
        if ts.IsZero() {
            continue
        }

        dev.StatusCheckedAt = &ts
        status := strings.TrimSpace(dev.Status)
        if status == "" {
            status = "Unknown"
        }
        dev.StatusAtCollect = &models.StatusAtCollection{
            Status: status,
            At:     ts,
        }
    }
}
143
internal/parser/vendors/nvidia/component_status_time_test.go
vendored
Normal file
@@ -0,0 +1,143 @@
package nvidia

import (
    "testing"
    "time"

    "git.mchus.pro/mchus/logpile/internal/models"
    "git.mchus.pro/mchus/logpile/internal/parser"
)

func TestParseVerboseRunTestStartTimes(t *testing.T) {
    content := []byte(`
2026-01-22 09:11:32,458 - Testing nvswitch
2026-01-22 09:45:36,016 - Testing gpu_fieldiag
`)
    got := parseVerboseRunTestStartTimes(content)

    nvs := got["nvswitch"]
    if nvs.IsZero() {
        t.Fatalf("expected nvswitch timestamp")
    }
    gpu := got["gpu_fieldiag"]
    if gpu.IsZero() {
        t.Fatalf("expected gpu_fieldiag timestamp")
    }
    if nvs.UTC().Format(time.RFC3339) != "2026-01-22T06:11:32Z" {
        t.Fatalf("unexpected nvswitch timestamp: %s", nvs.Format(time.RFC3339))
    }
    if gpu.UTC().Format(time.RFC3339) != "2026-01-22T06:45:36Z" {
        t.Fatalf("unexpected gpu_fieldiag timestamp: %s", gpu.Format(time.RFC3339))
    }
}

func TestParseRunLogTestStartTimes(t *testing.T) {
    content := []byte(`
Start time Thu, 22 Jan 2026 07:42:26
Testing gpumem FAILED [ 26:12s ]
Testing gpustress OK [ 7:10s ]
Testing nvswitch OK [ 9:25s ]
`)

    got := parseRunLogTestStartTimes(content)
    if got["gpumem"].UTC().Format(time.RFC3339) != "2026-01-22T04:42:26Z" {
        t.Fatalf("unexpected gpumem start: %s", got["gpumem"].Format(time.RFC3339))
    }
    if got["gpustress"].UTC().Format(time.RFC3339) != "2026-01-22T05:08:38Z" {
        t.Fatalf("unexpected gpustress start: %s", got["gpustress"].Format(time.RFC3339))
    }
    if got["nvswitch"].UTC().Format(time.RFC3339) != "2026-01-22T05:15:48Z" {
        t.Fatalf("unexpected nvswitch start: %s", got["nvswitch"].Format(time.RFC3339))
    }
}

func TestApplyGPUAndNVSwitchCheckTimes(t *testing.T) {
    gpuTs := time.Date(2026, 1, 22, 9, 45, 36, 0, time.UTC)
    nvsTs := time.Date(2026, 1, 22, 9, 11, 32, 0, time.UTC)

    result := &models.AnalysisResult{
        Hardware: &models.HardwareConfig{
            GPUs: []models.GPU{
                {Slot: "GPUSXM5", Status: "FAIL"},
            },
            PCIeDevices: []models.PCIeDevice{
                {Slot: "NVSWITCH0", DeviceClass: "NVSwitch", Status: "PASS"},
                {Slot: "NIC0", DeviceClass: "NetworkController", Status: "PASS"},
            },
        },
    }

    ApplyGPUAndNVSwitchCheckTimes(result, componentCheckTimes{
        GPUBySlot:      map[string]time.Time{"GPUSXM5": gpuTs},
        NVSwitchBySlot: map[string]time.Time{"NVSWITCH0": nvsTs},
    })

    if got := result.Hardware.GPUs[0].StatusCheckedAt; got == nil || !got.Equal(gpuTs) {
        t.Fatalf("expected gpu status_checked_at %s, got %v", gpuTs.Format(time.RFC3339), got)
    }
    if result.Hardware.GPUs[0].StatusAtCollect == nil || !result.Hardware.GPUs[0].StatusAtCollect.At.Equal(gpuTs) {
        t.Fatalf("expected gpu status_at_collection.at %s", gpuTs.Format(time.RFC3339))
    }
    if got := result.Hardware.PCIeDevices[0].StatusCheckedAt; got == nil || !got.Equal(nvsTs) {
        t.Fatalf("expected nvswitch status_checked_at %s, got %v", nvsTs.Format(time.RFC3339), got)
    }
    if result.Hardware.PCIeDevices[0].StatusAtCollect == nil || !result.Hardware.PCIeDevices[0].StatusAtCollect.At.Equal(nvsTs) {
        t.Fatalf("expected nvswitch status_at_collection.at %s", nvsTs.Format(time.RFC3339))
    }
    if result.Hardware.PCIeDevices[1].StatusCheckedAt != nil {
        t.Fatalf("expected non-nvswitch device status_checked_at to stay nil")
    }
}

func TestCollectGPUAndNVSwitchCheckTimes_FromVerboseRun(t *testing.T) {
    files := []parser.ExtractedFile{
        {
            Path: "verbose_run.log",
            Content: []byte(`
2026-01-22 09:11:32,458 - Testing nvswitch
2026-01-22 09:45:36,016 - Testing gpu_fieldiag
`),
        },
    }

    got := CollectGPUAndNVSwitchCheckTimes(files)
    if got.GPUDefault.UTC().Format(time.RFC3339) != "2026-01-22T06:45:36Z" {
        t.Fatalf("unexpected GPU check time: %s", got.GPUDefault.Format(time.RFC3339))
    }
    if got.NVSwitchDefault.UTC().Format(time.RFC3339) != "2026-01-22T06:11:32Z" {
        t.Fatalf("unexpected NVSwitch check time: %s", got.NVSwitchDefault.Format(time.RFC3339))
    }
}

func TestCollectGPUAndNVSwitchCheckTimes_FromComponentOutputLogs(t *testing.T) {
    files := []parser.ExtractedFile{
        {
            Path: "gpu_fieldiag/SXM5_SN_1653925025497/output.log",
            Content: []byte(`
$ some command
MODS start: Thu Jan 22 09:45:36 2026
`),
        },
        {
            Path: "nvswitch/output.log",
            Content: []byte(`
$ cmd devname=0000:08:00.0,NVSWITCH3 devname=0000:07:00.0,NVSWITCH2 devname=0000:06:00.0,NVSWITCH1 devname=0000:05:00.0,NVSWITCH0
MODS start: Thu Jan 22 09:11:32 2026
`),
        },
    }

    got := CollectGPUAndNVSwitchCheckTimes(files)
    if got.GPUBySerial["1653925025497"].UTC().Format(time.RFC3339) != "2026-01-22T06:45:36Z" {
        t.Fatalf("unexpected GPU serial check time: %s", got.GPUBySerial["1653925025497"].Format(time.RFC3339))
    }
    if got.GPUBySlot["GPUSXM5"].UTC().Format(time.RFC3339) != "2026-01-22T06:45:36Z" {
        t.Fatalf("unexpected GPU slot check time: %s", got.GPUBySlot["GPUSXM5"].Format(time.RFC3339))
    }
    if got.NVSwitchBySlot["NVSWITCH0"].UTC().Format(time.RFC3339) != "2026-01-22T06:11:32Z" {
        t.Fatalf("unexpected NVSwitch0 check time: %s", got.NVSwitchBySlot["NVSWITCH0"].Format(time.RFC3339))
    }
    if got.NVSwitchBySlot["NVSWITCH3"].UTC().Format(time.RFC3339) != "2026-01-22T06:11:32Z" {
        t.Fatalf("unexpected NVSwitch3 check time: %s", got.NVSwitchBySlot["NVSWITCH3"].Format(time.RFC3339))
    }
}
374
internal/parser/vendors/nvidia/gpu_model.go
vendored
Normal file
@@ -0,0 +1,374 @@
package nvidia

import (
    "encoding/json"
    "regexp"
    "strconv"
    "strings"

    "git.mchus.pro/mchus/logpile/internal/models"
    "git.mchus.pro/mchus/logpile/internal/parser"
)

var (
    gpuNameWithSerialRegex = regexp.MustCompile(`^SXM(\d+)_SN_(.+)$`)
    gpuNameSlotOnlyRegex   = regexp.MustCompile(`^SXM(\d+)$`)
    skuCodeRegex           = regexp.MustCompile(`^(G\d{3})[.-](\d{4})`)
    skuCodeInsideRegex     = regexp.MustCompile(`(?:^|[^A-Z0-9])(?:\d)?(G\d{3})[.-](\d{4})(?:[^A-Z0-9]|$)`)
    inforomPathRegex       = regexp.MustCompile(`(?i)(?:^|[\\/])(checkinforom|inforom)[\\/](SXM(\d+))(?:_SN_([^\\/]+))?[\\/]fieldiag\.jso$`)
    inforomProductPNRegex  = regexp.MustCompile(`"product_part_num"\s*:\s*"([^"]+)"`)
    inforomSerialRegex     = regexp.MustCompile(`"serial_number"\s*:\s*"([^"]+)"`)
)

type testSpecData struct {
    Actions []struct {
        VirtualID string `json:"virtual_id"`
        Args      struct {
            SKUToFile   map[string]string          `json:"sku_to_sku_json_file_map"`
            ModsMapping map[string]json.RawMessage `json:"mods_mapping"`
        } `json:"args"`
    } `json:"actions"`
}

type inventoryFieldDiagSummary struct {
    ModsRuns []struct {
        ModsHeader []struct {
            GPUName   string `json:"GpuName"`
            BoardInfo string `json:"BoardInfo"`
        } `json:"ModsHeader"`
    } `json:"ModsRuns"`
}

var hardcodedSKUToFileMap = map[string]string{
    "G520-0200": "sku_hgx-h100-8-gpu_80g_aircooled_field.json",
    "G520-0201": "sku_hgx-h100-8-gpu_80g_aircooled_field.json",
    "G520-0202": "sku_hgx-h100-8-gpu_80g_tpol_field.json",
    "G520-0203": "sku_hgx-h100-8-gpu_80g_tpol_field.json",
    "G520-0205": "sku_hgx-h800-8-gpu_80g_aircooled_field.json",
    "G520-0207": "sku_hgx-h800-8-gpu_80g_tpol_field.json",
    "G520-0221": "sku_hgx-h100-8-gpu_96g_aircooled_field.json",
    "G520-0236": "sku_hgx-h20-8-gpu_96g_aircooled_field.json",
    "G520-0238": "sku_hgx-h20-8-gpu_96g_tpol_field.json",
    "G520-0266": "sku_hgx-h20-8-gpu_141g_aircooled_field.json",
    "G520-0280": "sku_hgx-h200-8-gpu_141g_aircooled_field.json",
    "G520-0282": "sku_hgx-h200-8-gpu_141g_tpol_field.json",
    "G520-0292": "sku_hgx-h100-8-gpu_sku_292_field.json",
}

// ApplyGPUModelsFromSKU updates GPU model names using SKU mapping from testspec.json.
// Mapping source:
//   - inventory/fieldiag_summary.json: GPUName -> BoardInfo(SKU)
//   - hardcoded SKU mapping
//   - testspec.json: SKU -> sku_hgx-... filename (fallback for unknown hardcoded SKU)
//   - inforom/*/fieldiag.jso: product_part_num (full P/N with embedded SKU)
//   - testspec.json gpu_fieldiag.mods_mapping: DeviceID -> GPU generation (last fallback for description)
func ApplyGPUModelsFromSKU(files []parser.ExtractedFile, result *models.AnalysisResult) {
    if result == nil || result.Hardware == nil || len(result.Hardware.GPUs) == 0 {
        return
    }

    skuToFile := parseSKUToFileMap(files)
    generationByDeviceID := parseGenerationByDeviceID(files)

    serialToSKU, slotToSKU, serialToPartNumber, slotToPartNumber := parseGPUSKUMapping(files)

    for i := range result.Hardware.GPUs {
        gpu := &result.Hardware.GPUs[i]
        slot := strings.TrimSpace(gpu.Slot)
        serial := strings.TrimSpace(gpu.SerialNumber)

        if gpu.PartNumber == "" && serial != "" {
            if pn := strings.TrimSpace(serialToPartNumber[serial]); pn != "" {
                gpu.PartNumber = pn
            }
        }
        if gpu.PartNumber == "" {
            if pn := strings.TrimSpace(slotToPartNumber[slot]); pn != "" {
                gpu.PartNumber = pn
            }
        }

        if partNumber := strings.TrimSpace(gpu.PartNumber); partNumber != "" {
            gpu.Model = partNumber
        }

        sku := extractSKUFromPartNumber(gpu.PartNumber)
        if sku == "" && serial != "" {
            sku = serialToSKU[serial]
        }
        if sku == "" {
            sku = slotToSKU[slot]
        }
        if sku != "" {
            if desc := resolveDescriptionFromSKU(sku, skuToFile); desc != "" {
                gpu.Description = desc
                continue
            }
        }

        if gen := resolveGenerationDescription(gpu.DeviceID, generationByDeviceID); gen != "" {
            gpu.Description = gen
        }
    }
}

func parseSKUToFileMap(files []parser.ExtractedFile) map[string]string {
    result := make(map[string]string, len(hardcodedSKUToFileMap))
    for sku, file := range hardcodedSKUToFileMap {
        result[normalizeSKUCode(sku)] = strings.TrimSpace(file)
    }

    specFile := parser.FindFileByName(files, "testspec.json")
    if specFile == nil {
        return result
    }

    var spec testSpecData
    if err := json.Unmarshal(specFile.Content, &spec); err != nil {
        return result
    }

    for _, action := range spec.Actions {
        for sku, file := range action.Args.SKUToFile {
            normSKU := normalizeSKUCode(sku)
            if normSKU == "" {
                continue
            }
            // Priority: hardcoded mapping wins, testspec extends unknown SKU list.
            if _, exists := result[normSKU]; !exists {
                result[normSKU] = strings.TrimSpace(file)
            }
        }
    }
    return result
}

func parseGenerationByDeviceID(files []parser.ExtractedFile) map[string]string {
    specFile := parser.FindFileByName(files, "testspec.json")
    if specFile == nil {
        return nil
    }

    var spec testSpecData
    if err := json.Unmarshal(specFile.Content, &spec); err != nil {
        return nil
    }

    familyToGeneration := make(map[string]string)
    deviceToGeneration := make(map[string]string)

    for _, action := range spec.Actions {
        if strings.TrimSpace(strings.ToLower(action.VirtualID)) != "gpu_fieldiag" {
            continue
        }
        for key, raw := range action.Args.ModsMapping {
            if strings.HasPrefix(key, "#mods.") {
                family := strings.TrimSpace(strings.TrimPrefix(key, "#mods."))
                if family == "" {
                    continue
                }
                var generation string
                if err := json.Unmarshal(raw, &generation); err == nil {
                    generation = strings.TrimSpace(generation)
                    if generation != "" {
                        familyToGeneration[family] = generation
                    }
                }
            }
        }

        for key, raw := range action.Args.ModsMapping {
            family := strings.TrimSpace(key)
            if family == "" || strings.HasPrefix(family, "#") {
                continue
            }
            generation := strings.TrimSpace(familyToGeneration[family])
            if generation == "" {
                continue
            }
            var deviceIDs []string
            if err := json.Unmarshal(raw, &deviceIDs); err != nil {
                continue
            }
            for _, id := range deviceIDs {
                norm := normalizeDeviceIDHex(id)
                if norm != "" {
                    deviceToGeneration[norm] = generation
                }
            }
        }
    }

    return deviceToGeneration
}

func parseGPUSKUMapping(files []parser.ExtractedFile) (map[string]string, map[string]string, map[string]string, map[string]string) {
    serialToSKU := make(map[string]string)
    slotToSKU := make(map[string]string)
    serialToPartNumber := make(map[string]string)
    slotToPartNumber := make(map[string]string)

    // 1) inventory/fieldiag_summary.json mapping (GPUName/BoardInfo).
    var summaryFile *parser.ExtractedFile
    for _, f := range files {
        path := strings.ToLower(f.Path)
        if strings.Contains(path, "inventory/fieldiag_summary.json") ||
            strings.Contains(path, "inventory\\fieldiag_summary.json") {
            summaryFile = &f
            break
        }
    }
    if summaryFile == nil {
        // Continue: inforom may still contain usable part numbers.
    } else {
        var summaries []inventoryFieldDiagSummary
        if err := json.Unmarshal(summaryFile.Content, &summaries); err == nil {
            for _, summary := range summaries {
                addSummaryMapping(summary, serialToSKU, slotToSKU)
            }
        } else {
            var summary inventoryFieldDiagSummary
            if err := json.Unmarshal(summaryFile.Content, &summary); err == nil {
                addSummaryMapping(summary, serialToSKU, slotToSKU)
            }
        }
    }

    // 2) inforom/checkinforom fieldiag.jso mapping (full product_part_num).
    for _, f := range files {
        path := strings.TrimSpace(f.Path)
        m := inforomPathRegex.FindStringSubmatch(path)
        if len(m) == 0 {
            continue
        }

        slot := "GPU" + strings.ToUpper(strings.TrimSpace(m[2])) // SXM7 -> GPUSXM7
        serialFromPath := strings.TrimSpace(m[4])

        productPNMatch := inforomProductPNRegex.FindSubmatch(f.Content)
        if len(productPNMatch) == 2 {
            partNumber := strings.TrimSpace(string(productPNMatch[1]))
            if partNumber != "" {
                slotToPartNumber[slot] = partNumber
                if serialFromPath != "" {
                    serialToPartNumber[serialFromPath] = partNumber
                }
                if sku := extractSKUFromPartNumber(partNumber); sku != "" {
                    slotToSKU[slot] = sku
                    if serialFromPath != "" {
                        serialToSKU[serialFromPath] = sku
                    }
                }
            }
        }

        serialMatch := inforomSerialRegex.FindSubmatch(f.Content)
        if len(serialMatch) == 2 {
            serial := strings.TrimSpace(string(serialMatch[1]))
            if serial != "" {
                if sku := slotToSKU[slot]; sku != "" {
                    serialToSKU[serial] = sku
                }
                if pn := slotToPartNumber[slot]; pn != "" {
                    serialToPartNumber[serial] = pn
                }
            }
        }
    }

    return serialToSKU, slotToSKU, serialToPartNumber, slotToPartNumber
}

func addSummaryMapping(summary inventoryFieldDiagSummary, serialToSKU map[string]string, slotToSKU map[string]string) {
    for _, run := range summary.ModsRuns {
        for _, h := range run.ModsHeader {
            sku := normalizeSKUCode(h.BoardInfo)
            if sku == "" {
                continue
            }

            gpuName := strings.TrimSpace(h.GPUName)
            if matches := gpuNameWithSerialRegex.FindStringSubmatch(gpuName); len(matches) == 3 {
                slotToSKU["GPUSXM"+matches[1]] = sku
                serialToSKU[strings.TrimSpace(matches[2])] = sku
                continue
            }
            if matches := gpuNameSlotOnlyRegex.FindStringSubmatch(gpuName); len(matches) == 2 {
                slotToSKU["GPUSXM"+matches[1]] = sku
            }
        }
    }
}

func resolveDescriptionFromSKU(sku string, skuToFile map[string]string) string {
    file := strings.ToLower(strings.TrimSpace(skuToFile[normalizeSKUCode(sku)]))
    if file == "" {
        return ""
    }

    return skuFilenameToDescription(file)
}

func normalizeSKUCode(v string) string {
    s := strings.TrimSpace(strings.ToUpper(v))
    if s == "" {
        return ""
    }

    if m := skuCodeRegex.FindStringSubmatch(s); len(m) == 3 {
        return m[1] + "-" + m[2]
    }

    return s
}

func extractSKUFromPartNumber(partNumber string) string {
    s := strings.TrimSpace(strings.ToUpper(partNumber))
    if s == "" {
        return ""
    }

    if m := skuCodeInsideRegex.FindStringSubmatch(s); len(m) == 3 {
        return m[1] + "-" + m[2]
    }

    return ""
}

func skuFilenameToDescription(file string) string {
    s := strings.TrimSpace(strings.ToLower(file))
    if s == "" {
        return ""
    }

    s = strings.TrimSuffix(s, ".json")
    s = strings.TrimSuffix(s, "_field")
    s = strings.TrimPrefix(s, "sku_")
    s = strings.ReplaceAll(s, "-", " ")
    s = strings.ReplaceAll(s, "_", " ")
    s = strings.Join(strings.Fields(s), " ")

    return strings.TrimSpace(s)
}

func resolveGenerationDescription(deviceID int, deviceToGeneration map[string]string) string {
    if deviceID <= 0 || len(deviceToGeneration) == 0 {
        return ""
    }
    return strings.TrimSpace(deviceToGeneration[normalizeDeviceIDHex(strconv.FormatInt(int64(deviceID), 16))])
}

func normalizeDeviceIDHex(v string) string {
    s := strings.TrimSpace(strings.ToLower(v))
    s = strings.TrimPrefix(s, "0x")
    if s == "" {
        return ""
    }

    n, err := strconv.ParseUint(s, 16, 32)
    if err != nil {
        return ""
    }

    return "0x" + strings.ToLower(strconv.FormatUint(n, 16))
}
207  internal/parser/vendors/nvidia/gpu_model_test.go  (vendored, new file)
@@ -0,0 +1,207 @@
package nvidia

import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

func TestApplyGPUModelsFromSKU(t *testing.T) {
	files := []parser.ExtractedFile{
		{
			Path: "inventory/fieldiag_summary.json",
			Content: []byte(`{
				"ModsRuns":[
					{"ModsHeader":[
						{"GpuName":"SXM5_SN_1653925025497","BoardInfo":"G520-0280"}
					]}
				]
			}`),
		},
		{
			Path: "testspec.json",
			Content: []byte(`{
				"actions":[
					{
						"virtual_id":"inventory",
						"args":{
							"sku_to_sku_json_file_map":{
								"G520-0280":"sku_hgx-h200-8-gpu_141g_aircooled_field.json"
							}
						}
					}
				]
			}`),
		},
	}

	result := &models.AnalysisResult{
		Hardware: &models.HardwareConfig{
			GPUs: []models.GPU{
				{
					Slot:         "GPUSXM5",
					SerialNumber: "1653925025497",
					Model:        "NVIDIA Device 2335",
				},
			},
		},
	}

	ApplyGPUModelsFromSKU(files, result)

	if got := result.Hardware.GPUs[0].Model; got != "NVIDIA Device 2335" {
		t.Fatalf("expected model NVIDIA Device 2335, got %q", got)
	}
	if got := result.Hardware.GPUs[0].Description; got != "hgx h200 8 gpu 141g aircooled" {
		t.Fatalf("expected description hgx h200 8 gpu 141g aircooled, got %q", got)
	}
}

func TestApplyGPUModelsFromSKU_FromPartNumber(t *testing.T) {
	files := []parser.ExtractedFile{
		{
			Path: "inforom/SXM5/fieldiag.jso",
			Content: []byte(`[
				[
					{
						"__tag__":"inforom",
						"serial_number":"1653925025497",
						"product_part_num":"692-2G520-0280-501"
					}
				]
			]`),
		},
		{
			Path: "testspec.json",
			Content: []byte(`{
				"actions":[
					{
						"virtual_id":"inventory",
						"args":{
							"sku_to_sku_json_file_map":{
								"G520-0280":"sku_hgx-h200-8-gpu_141g_aircooled_field.json"
							}
						}
					}
				]
			}`),
		},
	}

	result := &models.AnalysisResult{
		Hardware: &models.HardwareConfig{
			GPUs: []models.GPU{
				{
					Slot:         "GPUSXM5",
					SerialNumber: "1653925025497",
					Model:        "NVIDIA Device 2335",
				},
			},
		},
	}

	ApplyGPUModelsFromSKU(files, result)

	if got := result.Hardware.GPUs[0].Model; got != "692-2G520-0280-501" {
		t.Fatalf("expected model 692-2G520-0280-501, got %q", got)
	}
	if got := result.Hardware.GPUs[0].PartNumber; got != "692-2G520-0280-501" {
		t.Fatalf("expected part number 692-2G520-0280-501, got %q", got)
	}
	if got := result.Hardware.GPUs[0].Description; got != "hgx h200 8 gpu 141g aircooled" {
		t.Fatalf("expected description hgx h200 8 gpu 141g aircooled, got %q", got)
	}
}

func TestApplyGPUModelsFromSKU_FieldDiagSummaryArrayFormat(t *testing.T) {
	files := []parser.ExtractedFile{
		{
			Path: "inventory/fieldiag_summary.json",
			Content: []byte(`[
				{
					"ModsRuns":[
						{"ModsHeader":[
							{"GpuName":"SXM5_SN_1653925025497","BoardInfo":"G520-0280"}
						]}
					]
				}
			]`),
		},
		{
			Path: "testspec.json",
			Content: []byte(`{
				"actions":[
					{
						"virtual_id":"inventory",
						"args":{
							"sku_to_sku_json_file_map":{
								"G520-0280":"sku_hgx-h200-8-gpu_141g_aircooled_field.json"
							}
						}
					}
				]
			}`),
		},
	}

	result := &models.AnalysisResult{
		Hardware: &models.HardwareConfig{
			GPUs: []models.GPU{
				{
					Slot:         "GPUSXM5",
					SerialNumber: "1653925025497",
					Model:        "NVIDIA Device 2335",
				},
			},
		},
	}

	ApplyGPUModelsFromSKU(files, result)

	if got := result.Hardware.GPUs[0].Model; got != "NVIDIA Device 2335" {
		t.Fatalf("expected model NVIDIA Device 2335, got %q", got)
	}
	if got := result.Hardware.GPUs[0].Description; got != "hgx h200 8 gpu 141g aircooled" {
		t.Fatalf("expected description hgx h200 8 gpu 141g aircooled, got %q", got)
	}
}

func TestApplyGPUModelsFromSKU_FallbackToGenerationFromModsMapping(t *testing.T) {
	files := []parser.ExtractedFile{
		{
			Path: "testspec.json",
			Content: []byte(`{
				"actions":[
					{
						"virtual_id":"gpu_fieldiag",
						"args":{
							"mods_mapping":{
								"#mods.525":"Hopper",
								"525":["0x2335"]
							}
						}
					}
				]
			}`),
		},
	}

	result := &models.AnalysisResult{
		Hardware: &models.HardwareConfig{
			GPUs: []models.GPU{
				{
					Slot:     "GPUSXM5",
					Model:    "NVIDIA Device 2335",
					DeviceID: 0x2335,
				},
			},
		},
	}

	ApplyGPUModelsFromSKU(files, result)

	if got := result.Hardware.GPUs[0].Description; got != "Hopper" {
		t.Fatalf("expected description Hopper, got %q", got)
	}
}
155  internal/parser/vendors/nvidia/inventory_log.go  (vendored, new file)
@@ -0,0 +1,155 @@
package nvidia

import (
	"bufio"
	"regexp"
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

var (
	// Regex to extract devname mappings from the fieldiag command line.
	// Example: "devname=0000:ba:00.0,SXM5_SN_1653925027099"
	devnameRegex = regexp.MustCompile(`devname=([\da-fA-F:\.]+),(\w+)`)
	// Regex to capture the BDF from commands like:
	// "$ lspci -vvvs 0000:05:00.0" or "$ lspci -vvs 0000:05:00.0"
	lspciBDFRegex = regexp.MustCompile(`^\$\s+lspci\s+-[^\s]*\s+([0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s*$`)
	// Example: "Capabilities: [2f0 v1] Device Serial Number 99-d3-61-c8-ac-2d-b0-48"
	deviceSerialRegex = regexp.MustCompile(`Device Serial Number\s+([0-9a-fA-F\-:]+)`)
)

// ParseInventoryLog parses inventory/output.log to extract GPU serial numbers
// from fieldiag devname parameters (e.g., "SXM5_SN_1653925027099").
func ParseInventoryLog(content []byte, result *models.AnalysisResult) error {
	if result.Hardware == nil || len(result.Hardware.GPUs) == 0 {
		// No GPUs to update.
		return nil
	}

	scanner := bufio.NewScanner(strings.NewReader(string(content)))

	// First pass: build a mapping of PCI BDF -> slot name and serial number from the fieldiag command line.
	pciToSlot := make(map[string]string)
	pciToSerial := make(map[string]string)
	for scanner.Scan() {
		line := scanner.Text()
		// Look for a fieldiag command with devname parameters.
		if strings.Contains(line, "devname=") && strings.Contains(line, "fieldiag") {
			matches := devnameRegex.FindAllStringSubmatch(line, -1)
			for _, match := range matches {
				if len(match) == 3 {
					pciBDF := match[1]
					slotName := match[2]
					// Extract slot number and serial from a name like "SXM5_SN_1653925027099".
					if strings.HasPrefix(slotName, "SXM") {
						parts := strings.Split(slotName, "_")
						if len(parts) >= 1 {
							// Convert "SXM5" to "GPUSXM5".
							slot := "GPU" + parts[0]
							pciToSlot[pciBDF] = slot
						}
						// Extract serial number from "SXM5_SN_1653925027099".
						if len(parts) == 3 && parts[1] == "SN" {
							serial := parts[2]
							pciToSerial[pciBDF] = serial
						}
					}
				}
			}
		}
	}

	// Second pass: assign serial numbers to GPUs based on the slot mapping.
	for i := range result.Hardware.GPUs {
		slot := result.Hardware.GPUs[i].Slot
		// Find the PCI BDF for this slot.
		var foundSerial string
		for pciBDF, mappedSlot := range pciToSlot {
			if mappedSlot == slot {
				// Found a matching slot; get its serial number.
				if serial, ok := pciToSerial[pciBDF]; ok {
					foundSerial = serial
					break
				}
			}
		}
		if foundSerial != "" {
			result.Hardware.GPUs[i].SerialNumber = foundSerial
		}
	}

	// Third pass: parse lspci "Device Serial Number" by BDF (useful for NVSwitch serials).
	bdfToDeviceSerial := make(map[string]string)
	currentBDF := ""
	scanner = bufio.NewScanner(strings.NewReader(string(content)))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue
		}

		if m := lspciBDFRegex.FindStringSubmatch(line); len(m) == 2 {
			currentBDF = strings.ToLower(strings.TrimSpace(m[1]))
			continue
		}

		if currentBDF == "" {
			continue
		}

		if m := deviceSerialRegex.FindStringSubmatch(line); len(m) == 2 {
			serial := strings.TrimSpace(m[1])
			if serial != "" {
				bdfToDeviceSerial[currentBDF] = serial
			}
			currentBDF = ""
		}
	}

	// Apply to PCIe devices first (includes NVSwitch).
	for i := range result.Hardware.PCIeDevices {
		dev := &result.Hardware.PCIeDevices[i]
		if strings.TrimSpace(dev.SerialNumber) != "" {
			continue
		}
		bdf := strings.ToLower(strings.TrimSpace(dev.BDF))
		if bdf == "" {
			continue
		}
		if serial := bdfToDeviceSerial[bdf]; serial != "" {
			dev.SerialNumber = serial
		}
	}

	// Apply to GPUs only if the GPU serial is still empty (do not overwrite the prod serial from devname).
	for i := range result.Hardware.GPUs {
		gpu := &result.Hardware.GPUs[i]
		if strings.TrimSpace(gpu.SerialNumber) != "" {
			continue
		}
		bdf := strings.ToLower(strings.TrimSpace(gpu.BDF))
		if bdf == "" {
			continue
		}
		if serial := bdfToDeviceSerial[bdf]; serial != "" {
			gpu.SerialNumber = serial
		}
	}

	return scanner.Err()
}

// findInventoryOutputLog finds the inventory/output.log file.
func findInventoryOutputLog(files []parser.ExtractedFile) *parser.ExtractedFile {
	for _, f := range files {
		// Look for inventory/output.log.
		path := strings.ToLower(f.Path)
		if strings.Contains(path, "inventory/output.log") ||
			strings.Contains(path, "inventory\\output.log") {
			return &f
		}
	}
	return nil
}
126  internal/parser/vendors/nvidia/inventory_log_test.go  (vendored, new file)
@@ -0,0 +1,126 @@
package nvidia

import (
	"os"
	"path/filepath"
	"strings"
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

func TestParseInventoryLog(t *testing.T) {
	// Test with the real archive
	archivePath := filepath.Join("../../../../example", "A514359X5A09844_logs-20260115-151707.tar")

	// Check if file exists
	if _, err := os.Stat(archivePath); os.IsNotExist(err) {
		t.Skip("Test archive not found, skipping test")
	}

	// Extract files from archive
	files, err := parser.ExtractArchive(archivePath)
	if err != nil {
		t.Fatalf("Failed to extract archive: %v", err)
	}

	// Find inventory/output.log
	var inventoryLog *parser.ExtractedFile
	for _, f := range files {
		if strings.Contains(f.Path, "inventory/output.log") {
			inventoryLog = &f
			break
		}
	}

	if inventoryLog == nil {
		t.Fatal("inventory/output.log not found")
	}

	content := string(inventoryLog.Content)

	// Test devname regex - this extracts both slot mapping and serial numbers
	t.Log("Testing devname extraction:")
	lines := strings.Split(content, "\n")
	serialCount := 0
	for i, line := range lines {
		if strings.Contains(line, "devname=") && strings.Contains(line, "fieldiag") {
			t.Logf("Line %d: Found fieldiag command", i)
			matches := devnameRegex.FindAllStringSubmatch(line, -1)
			t.Logf("  Found %d devname matches", len(matches))
			for _, match := range matches {
				if len(match) == 3 {
					pciBDF := match[1]
					slotName := match[2]
					t.Logf("  PCI: %s -> Slot: %s", pciBDF, slotName)

					// Extract serial number from slot name
					if strings.HasPrefix(slotName, "SXM") {
						parts := strings.Split(slotName, "_")
						if len(parts) == 3 && parts[1] == "SN" {
							serial := parts[2]
							t.Logf("    Serial: %s", serial)
							serialCount++
						}
					}
				}
			}
			break
		}
	}
	t.Logf("\nTotal GPU serials extracted: %d", serialCount)

	if serialCount == 0 {
		t.Error("Expected to find GPU serial numbers, but found none")
	}
}

func min(a, b int) int {
	if a < b {
		return a
	}
	return b
}

func TestParseInventoryLog_AssignsNVSwitchSerialByBDF(t *testing.T) {
	content := []byte(`
$ lspci -vvvs 0000:05:00.0
05:00.0 Bridge: NVIDIA Corporation Device 22a3 (rev a1)
Capabilities: [2f0 v1] Device Serial Number 99-d3-61-c8-ac-2d-b0-48

/tmp/fieldiag devname=0000:ba:00.0,SXM5_SN_1653925025497 fieldiag
`)

	result := &models.AnalysisResult{
		Hardware: &models.HardwareConfig{
			GPUs: []models.GPU{
				{
					Slot:         "GPUSXM5",
					BDF:          "0000:ba:00.0",
					SerialNumber: "",
				},
			},
			PCIeDevices: []models.PCIeDevice{
				{
					Slot:         "NVSWITCH0",
					BDF:          "0000:05:00.0",
					SerialNumber: "",
				},
			},
		},
	}

	if err := ParseInventoryLog(content, result); err != nil {
		t.Fatalf("ParseInventoryLog failed: %v", err)
	}

	if got := result.Hardware.PCIeDevices[0].SerialNumber; got != "99-d3-61-c8-ac-2d-b0-48" {
		t.Fatalf("expected NVSwitch serial 99-d3-61-c8-ac-2d-b0-48, got %q", got)
	}

	// GPU serial should come from the fieldiag devname mapping.
	if got := result.Hardware.GPUs[0].SerialNumber; got != "1653925025497" {
		t.Fatalf("expected GPU serial 1653925025497, got %q", got)
	}
}
370  internal/parser/vendors/nvidia/nvflash_verbose.go  (vendored, new file)
@@ -0,0 +1,370 @@
package nvidia

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

var (
	nvflashAdapterRegex = regexp.MustCompile(`^Adapter:\s+.+\(([\da-fA-F]+),([\da-fA-F]+),([\da-fA-F]+),([\da-fA-F]+)\)\s+S:([0-9A-Fa-f]{2}),B:([0-9A-Fa-f]{2}),D:([0-9A-Fa-f]{2}),F:([0-9A-Fa-f])`)
	gpuPCIIDRegex       = regexp.MustCompile(`^GPU_SXM(\d+)_PCIID:\s*(\S+)$`)
	nvsPCIIDRegex       = regexp.MustCompile(`^NVSWITCH_NVSWITCH(\d+)_PCIID:\s*(\S+)$`)
)

var nvswitchProjectToPartNumber = map[string]string{
	"5612-0002": "965-25612-0002-000",
}

type nvflashDeviceRecord struct {
	BDF         string
	VendorID    int
	DeviceID    int
	SSVendorID  int
	SSDeviceID  int
	Version     string
	BoardID     string
	HierarchyID string
	ChipSKU     string
	Project     string
}

// ParseNVFlashVerboseLog parses inventory/nvflash_verbose.log and applies firmware versions
// to already discovered devices using PCI BDF with optional ID checks.
func ParseNVFlashVerboseLog(content []byte, result *models.AnalysisResult) error {
	if result == nil || result.Hardware == nil {
		return nil
	}

	records := parseNVFlashRecords(content)
	if len(records) == 0 {
		return nil
	}

	for i := range result.Hardware.GPUs {
		gpu := &result.Hardware.GPUs[i]
		bdf := normalizePCIBDF(gpu.BDF)
		if bdf == "" {
			continue
		}
		rec, ok := records[bdf]
		if !ok {
			continue
		}
		if gpu.DeviceID != 0 && rec.DeviceID != 0 && gpu.DeviceID != rec.DeviceID {
			continue
		}
		if gpu.VendorID != 0 && rec.VendorID != 0 && gpu.VendorID != rec.VendorID {
			continue
		}
		if strings.TrimSpace(rec.Version) != "" {
			gpu.Firmware = strings.TrimSpace(rec.Version)
		}
	}

	for i := range result.Hardware.PCIeDevices {
		dev := &result.Hardware.PCIeDevices[i]
		bdf := normalizePCIBDF(dev.BDF)
		if bdf == "" {
			continue
		}
		rec, ok := records[bdf]
		if !ok {
			continue
		}
		if dev.DeviceID != 0 && rec.DeviceID != 0 && dev.DeviceID != rec.DeviceID {
			continue
		}
		if dev.VendorID != 0 && rec.VendorID != 0 && dev.VendorID != rec.VendorID {
			continue
		}

		if strings.EqualFold(strings.TrimSpace(dev.DeviceClass), "NVSwitch") || strings.HasPrefix(strings.ToUpper(strings.TrimSpace(dev.Slot)), "NVSWITCH") {
			if mappedPN := mapNVSwitchPartNumberByProject(rec.Project); mappedPN != "" {
				dev.PartNumber = mappedPN
			}
		}

		if strings.TrimSpace(rec.Version) != "" && strings.TrimSpace(dev.PartNumber) == "" {
			// Fallback for non-NVSwitch devices where part number is unknown.
			dev.PartNumber = strings.TrimSpace(rec.Version)
		}
	}

	appendNVFlashFirmwareEntries(result, records)

	return nil
}

// ApplyInventoryPCIIDs enriches devices with PCI BDFs from inventory/inventory.log.
func ApplyInventoryPCIIDs(content []byte, result *models.AnalysisResult) error {
	if result == nil || result.Hardware == nil {
		return nil
	}

	slotToBDF := parseInventoryPCIIDs(content)
	if len(slotToBDF) == 0 {
		return nil
	}

	for i := range result.Hardware.GPUs {
		gpu := &result.Hardware.GPUs[i]
		if strings.TrimSpace(gpu.BDF) != "" {
			continue
		}
		if bdf := slotToBDF[strings.TrimSpace(gpu.Slot)]; bdf != "" {
			gpu.BDF = bdf
		}
	}

	for i := range result.Hardware.PCIeDevices {
		dev := &result.Hardware.PCIeDevices[i]
		if strings.TrimSpace(dev.BDF) != "" {
			continue
		}
		if bdf := slotToBDF[normalizeNVSwitchSlot(strings.TrimSpace(dev.Slot))]; bdf != "" {
			dev.BDF = bdf
		}
	}

	return nil
}

func parseNVFlashRecords(content []byte) map[string]nvflashDeviceRecord {
	scanner := bufio.NewScanner(strings.NewReader(string(content)))
	records := make(map[string]nvflashDeviceRecord)
	var current *nvflashDeviceRecord

	commit := func() {
		if current == nil {
			return
		}
		if current.BDF == "" || strings.TrimSpace(current.Version) == "" {
			return
		}
		records[current.BDF] = *current
	}

	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue
		}

		if m := nvflashAdapterRegex.FindStringSubmatch(line); len(m) == 9 {
			commit()
			vendorID, _ := parseHexInt(m[1])
			deviceID, _ := parseHexInt(m[2])
			ssVendorID, _ := parseHexInt(m[3])
			ssDeviceID, _ := parseHexInt(m[4])

			current = &nvflashDeviceRecord{
				BDF:        fmt.Sprintf("0000:%s:%s.%s", strings.ToLower(m[6]), strings.ToLower(m[7]), strings.ToLower(m[8])),
				VendorID:   vendorID,
				DeviceID:   deviceID,
				SSVendorID: ssVendorID,
				SSDeviceID: ssDeviceID,
			}
			continue
		}

		if current == nil {
			continue
		}

		if !strings.Contains(line, ":") {
			continue
		}
		parts := strings.SplitN(line, ":", 2)
		key := strings.TrimSpace(parts[0])
		val := strings.TrimSpace(parts[1])
		if key == "" || val == "" {
			continue
		}

		switch key {
		case "Version":
			current.Version = val
		case "Board ID":
			current.BoardID = strings.ToLower(strings.TrimPrefix(val, "0x"))
		case "Vendor ID":
			if v, err := parseHexInt(val); err == nil {
				current.VendorID = v
			}
		case "Device ID":
			if v, err := parseHexInt(val); err == nil {
				current.DeviceID = v
			}
		case "Hierarchy ID":
			current.HierarchyID = val
		case "Chip SKU":
			current.ChipSKU = val
		case "Project":
			current.Project = val
		}
	}

	commit()
	return records
}

func parseInventoryPCIIDs(content []byte) map[string]string {
	scanner := bufio.NewScanner(strings.NewReader(string(content)))
	slotToBDF := make(map[string]string)

	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue
		}

		if m := gpuPCIIDRegex.FindStringSubmatch(line); len(m) == 3 {
			slotToBDF["GPUSXM"+m[1]] = normalizePCIBDF(m[2])
			continue
		}
		if m := nvsPCIIDRegex.FindStringSubmatch(line); len(m) == 3 {
			slotToBDF["NVSWITCH"+m[1]] = normalizePCIBDF(m[2])
		}
	}

	return slotToBDF
}

func normalizePCIBDF(v string) string {
	s := strings.TrimSpace(strings.ToLower(v))
	if s == "" {
		return ""
	}

	// bus:device.func -> 0000:bus:device.func
	short := regexp.MustCompile(`^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])$`)
	if m := short.FindStringSubmatch(s); len(m) == 2 {
		return "0000:" + m[1]
	}

	full := regexp.MustCompile(`^([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])$`)
	if m := full.FindStringSubmatch(s); len(m) == 2 {
		return m[1]
	}

	return s
}

func parseHexInt(v string) (int, error) {
	s := strings.TrimSpace(strings.ToLower(v))
	s = strings.TrimPrefix(s, "0x")
	if s == "" {
		return 0, fmt.Errorf("empty hex value")
	}
	n, err := strconv.ParseInt(s, 16, 32)
	if err != nil {
		return 0, err
	}
	return int(n), nil
}

func findNVFlashVerboseLog(files []parser.ExtractedFile) *parser.ExtractedFile {
	for _, f := range files {
		path := strings.ToLower(f.Path)
		if strings.Contains(path, "inventory/nvflash_verbose.log") ||
			strings.Contains(path, "inventory\\nvflash_verbose.log") {
			return &f
		}
	}
	return nil
}

func findInventoryInfoLog(files []parser.ExtractedFile) *parser.ExtractedFile {
	for _, f := range files {
		path := strings.ToLower(f.Path)
		if strings.Contains(path, "inventory/inventory.log") ||
			strings.Contains(path, "inventory\\inventory.log") {
			return &f
		}
	}
	return nil
}

func appendNVFlashFirmwareEntries(result *models.AnalysisResult, records map[string]nvflashDeviceRecord) {
	if result == nil || result.Hardware == nil {
		return
	}

	if result.Hardware.Firmware == nil {
		result.Hardware.Firmware = make([]models.FirmwareInfo, 0)
	}

	seen := make(map[string]struct{})
	for _, fw := range result.Hardware.Firmware {
		key := strings.ToLower(strings.TrimSpace(fw.DeviceName)) + "|" + strings.TrimSpace(fw.Version)
		seen[key] = struct{}{}
	}

	for _, gpu := range result.Hardware.GPUs {
		version := strings.TrimSpace(gpu.Firmware)
		if version == "" {
			continue
		}

		model := strings.TrimSpace(gpu.PartNumber)
		if model == "" {
			model = strings.TrimSpace(gpu.Model)
		}
		if model == "" {
			model = strings.TrimSpace(gpu.Slot)
		}
		deviceName := fmt.Sprintf("GPU %s (%s)", strings.TrimSpace(gpu.Slot), model)
		key := strings.ToLower(deviceName) + "|" + version
		if _, ok := seen[key]; ok {
			continue
		}
		seen[key] = struct{}{}
		result.Hardware.Firmware = append(result.Hardware.Firmware, models.FirmwareInfo{
			DeviceName: deviceName,
			Version:    version,
		})
	}

	for _, dev := range result.Hardware.PCIeDevices {
		bdf := normalizePCIBDF(dev.BDF)
		rec, ok := records[bdf]
		if !ok {
			continue
		}
		version := strings.TrimSpace(rec.Version)
		if version == "" {
			continue
		}
		slot := strings.TrimSpace(dev.Slot)
		deviceClass := strings.TrimSpace(dev.DeviceClass)
		if strings.EqualFold(deviceClass, "NVSwitch") || strings.HasPrefix(strings.ToUpper(slot), "NVSWITCH") {
			model := slot
			if pn := strings.TrimSpace(dev.PartNumber); pn != "" {
				model = pn
			}
			deviceName := fmt.Sprintf("NVSwitch %s (%s)", slot, model)
			key := strings.ToLower(deviceName) + "|" + version
			if _, ok := seen[key]; ok {
				continue
			}
			seen[key] = struct{}{}
			result.Hardware.Firmware = append(result.Hardware.Firmware, models.FirmwareInfo{
				DeviceName: deviceName,
				Version:    version,
			})
		}
	}
}

func mapNVSwitchPartNumberByProject(project string) string {
	key := strings.TrimSpace(strings.ToLower(project))
	if key == "" {
		return ""
	}
	return strings.TrimSpace(nvswitchProjectToPartNumber[key])
}
93  internal/parser/vendors/nvidia/nvflash_verbose_test.go  (vendored, new file)
@@ -0,0 +1,93 @@
package nvidia

import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestApplyInventoryPCIIDsAndNVFlashFirmware(t *testing.T) {
	result := &models.AnalysisResult{
		Hardware: &models.HardwareConfig{
			GPUs: []models.GPU{
				{
					Slot:     "GPUSXM5",
					DeviceID: 0x2335,
				},
			},
			PCIeDevices: []models.PCIeDevice{
				{
					Slot:     "NVSWITCHNVSWITCH2",
					DeviceID: 0x22a3,
				},
			},
		},
	}

	inventoryLog := []byte(`
GPU_SXM5_PCIID: 0000:ba:00.0
NVSWITCH_NVSWITCH2_PCIID: 0000:07:00.0
`)

	nvflashLog := []byte(`
Adapter: Graphics Device (10DE,2335,10DE,18BE) S:00,B:BA,D:00,F:00
Version : 96.00.D0.00.03
Board ID : 0x053C
Vendor ID : 0x10DE
Device ID : 0x2335
Hierarchy ID : Normal Board
Chip SKU : 895-0
Project : G520-0280

Adapter: Graphics Device (10DE,22A3,10DE,1796) S:00,B:07,D:00,F:00
Version : 96.10.6D.00.01
Board ID : 0x03B7
Vendor ID : 0x10DE
Device ID : 0x22A3
Hierarchy ID : Normal Board
Chip SKU : 890-0
Project : 5612-0002
`)

	if err := ApplyInventoryPCIIDs(inventoryLog, result); err != nil {
		t.Fatalf("ApplyInventoryPCIIDs failed: %v", err)
	}
	if err := ParseNVFlashVerboseLog(nvflashLog, result); err != nil {
		t.Fatalf("ParseNVFlashVerboseLog failed: %v", err)
	}

	if got := result.Hardware.GPUs[0].BDF; got != "0000:ba:00.0" {
		t.Fatalf("expected GPU BDF 0000:ba:00.0, got %q", got)
	}
	if got := result.Hardware.GPUs[0].Firmware; got != "96.00.D0.00.03" {
		t.Fatalf("expected GPU firmware 96.00.D0.00.03, got %q", got)
	}

	if got := result.Hardware.PCIeDevices[0].BDF; got != "0000:07:00.0" {
		t.Fatalf("expected NVSwitch BDF 0000:07:00.0, got %q", got)
	}
	if got := result.Hardware.PCIeDevices[0].PartNumber; got != "965-25612-0002-000" {
		t.Fatalf("expected NVSwitch part number 965-25612-0002-000, got %q", got)
	}

	if len(result.Hardware.Firmware) == 0 {
		t.Fatalf("expected firmware entries to be populated from nvflash log")
	}

	hasGPUFW := false
	hasNVSwitchFW := false
	for _, fw := range result.Hardware.Firmware {
		if fw.Version == "96.00.D0.00.03" {
			hasGPUFW = true
		}
		if fw.Version == "96.10.6D.00.01" {
			hasNVSwitchFW = true
		}
	}
	if !hasGPUFW {
		t.Fatalf("expected GPU firmware version 96.00.D0.00.03 in hardware firmware list")
	}
	if !hasNVSwitchFW {
		t.Fatalf("expected NVSwitch firmware version 96.10.6D.00.01 in hardware firmware list")
	}
}
66  internal/parser/vendors/nvidia/parser.go  (vendored)
@@ -14,7 +14,7 @@ import (
 
 // parserVersion - version of this parser module
 // IMPORTANT: Increment this version when making changes to parser logic!
-const parserVersion = "1.1.0"
+const parserVersion = "1.4"
 
 func init() {
 	parser.Register(&Parser{})
@@ -70,7 +70,7 @@ func (p *Parser) Detect(files []parser.ExtractedFile) int {
 		if strings.HasSuffix(path, "output.log") {
 			// Check if it contains dmidecode output
 			if strings.Contains(string(f.Content), "dmidecode") ||
-			strings.Contains(string(f.Content), "System Information") {
+				strings.Contains(string(f.Content), "System Information") {
 				confidence += 10
 			}
 		}
@@ -105,6 +105,9 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
 	result.Hardware = &models.HardwareConfig{
 		GPUs: make([]models.GPU, 0),
 	}
+	gpuStatuses := make(map[string]string)
+	gpuFailureDetails := make(map[string]string)
+	nvswitchStatuses := make(map[string]string)
 
 	// Parse output.log first (contains dmidecode system info)
 	// Find the output.log file that contains dmidecode output
@@ -124,18 +127,75 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
 		}
 	}
 
+	// Parse inventory/output.log (contains GPU serial numbers from lspci)
+	inventoryLogFile := findInventoryOutputLog(files)
+	if inventoryLogFile != nil {
+		if err := ParseInventoryLog(inventoryLogFile.Content, result); err != nil {
+			// Log error but continue parsing other files
+			_ = err // Ignore error for now
+		}
+	}
+
+	// Parse inventory/inventory.log to enrich PCI BDF mapping for components.
+	inventoryInfoLog := findInventoryInfoLog(files)
+	if inventoryInfoLog != nil {
+		if err := ApplyInventoryPCIIDs(inventoryInfoLog.Content, result); err != nil {
+			_ = err
+		}
+	}
+
+	// Enhance GPU model names using SKU mapping from testspec + inventory summary.
+	ApplyGPUModelsFromSKU(files, result)
+
+	// Parse inventory/nvflash_verbose.log and apply firmware versions by BDF + IDs.
+	// This runs after GPU model/part-number enrichment so firmware tab uses final model labels.
+	nvflashVerbose := findNVFlashVerboseLog(files)
+	if nvflashVerbose != nil {
+		if err := ParseNVFlashVerboseLog(nvflashVerbose.Content, result); err != nil {
+			_ = err
+		}
+	}
+
 	// Parse summary.json (test results summary)
 	if f := parser.FindFileByName(files, "summary.json"); f != nil {
 		events := ParseSummaryJSON(f.Content)
 		result.Events = append(result.Events, events...)
+		for componentID, status := range CollectGPUStatusesFromSummaryJSON(f.Content) {
+			gpuStatuses[componentID] = mergeGPUStatus(gpuStatuses[componentID], status)
+		}
+		for slot, status := range CollectNVSwitchStatusesFromSummaryJSON(f.Content) {
+			nvswitchStatuses[slot] = mergeGPUStatus(nvswitchStatuses[slot], status)
+		}
+		for componentID, detail := range CollectGPUFailureDetailsFromSummaryJSON(f.Content) {
+			if _, exists := gpuFailureDetails[componentID]; !exists && strings.TrimSpace(detail) != "" {
+				gpuFailureDetails[componentID] = strings.TrimSpace(detail)
+			}
+		}
 	}
 
 	// Parse summary.csv (alternative format)
 	if f := parser.FindFileByName(files, "summary.csv"); f != nil {
 		csvEvents := ParseSummaryCSV(f.Content)
 		result.Events = append(result.Events, csvEvents...)
+		for componentID, status := range CollectGPUStatusesFromSummaryCSV(f.Content) {
+			gpuStatuses[componentID] = mergeGPUStatus(gpuStatuses[componentID], status)
+		}
+		for slot, status := range CollectNVSwitchStatusesFromSummaryCSV(f.Content) {
+			nvswitchStatuses[slot] = mergeGPUStatus(nvswitchStatuses[slot], status)
+		}
+		for componentID, detail := range CollectGPUFailureDetailsFromSummaryCSV(f.Content) {
+			if _, exists := gpuFailureDetails[componentID]; !exists && strings.TrimSpace(detail) != "" {
+				gpuFailureDetails[componentID] = strings.TrimSpace(detail)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Apply per-GPU PASS/FAIL status derived from summary files.
|
||||
ApplyGPUStatuses(result, gpuStatuses)
|
||||
ApplyGPUFailureDetails(result, gpuFailureDetails)
|
||||
ApplyNVSwitchStatuses(result, nvswitchStatuses)
|
||||
ApplyGPUAndNVSwitchCheckTimes(result, CollectGPUAndNVSwitchCheckTimes(files))
|
||||
|
||||
// Parse GPU field diagnostics logs
|
||||
gpuFieldiagFiles := parser.FindFileByPattern(files, "gpu_fieldiag/", ".log")
|
||||
for _, f := range gpuFieldiagFiles {
|
||||
@@ -158,7 +218,7 @@ func findDmidecodeOutputLog(files []parser.ExtractedFile) *parser.ExtractedFile
|
||||
// Check if it contains dmidecode output
|
||||
content := string(f.Content)
|
||||
if strings.Contains(content, "dmidecode") &&
|
||||
strings.Contains(content, "System Information") {
|
||||
strings.Contains(content, "System Information") {
|
||||
return &f
|
||||
}
|
||||
}
|
||||
|
||||
internal/parser/vendors/nvidia/parser_test.go (vendored, new file, 291 lines)
@@ -0,0 +1,291 @@
package nvidia

import (
    "os"
    "path/filepath"
    "testing"
    "time"

    "git.mchus.pro/mchus/logpile/internal/parser"
)

func TestNVIDIAParser_RealArchive(t *testing.T) {
    // Test with the real archive that was reported as problematic
    archivePath := filepath.Join("../../../../example", "A514359X5A09844_logs-20260115-151707.tar")

    // Check if file exists
    if _, err := os.Stat(archivePath); os.IsNotExist(err) {
        t.Skip("Test archive not found, skipping test")
    }

    // Extract files from archive
    files, err := parser.ExtractArchive(archivePath)
    if err != nil {
        t.Fatalf("Failed to extract archive: %v", err)
    }

    // Check if inventory/output.log exists
    hasInventoryLog := false
    for _, f := range files {
        if filepath.Base(f.Path) == "output.log" {
            t.Logf("Found file: %s", f.Path)
        }
        if f.Path == "./inventory/output.log" || f.Path == "inventory/output.log" {
            hasInventoryLog = true
            t.Logf("Found inventory/output.log with %d bytes", len(f.Content))
        }
    }
    if !hasInventoryLog {
        t.Error("inventory/output.log not found in extracted files")
    }

    // Create parser and parse
    p := &Parser{}
    result, err := p.Parse(files)
    if err != nil {
        t.Fatalf("Failed to parse archive: %v", err)
    }

    // Verify basic system info
    if result.Hardware.BoardInfo.Manufacturer == "" {
        t.Error("Expected Manufacturer to be set")
    }
    if result.Hardware.BoardInfo.ProductName == "" {
        t.Error("Expected ProductName to be set")
    }
    if result.Hardware.BoardInfo.SerialNumber == "" {
        t.Error("Expected SerialNumber to be set")
    }

    t.Logf("System Info:")
    t.Logf("  Manufacturer: %s", result.Hardware.BoardInfo.Manufacturer)
    t.Logf("  Product: %s", result.Hardware.BoardInfo.ProductName)
    t.Logf("  Serial: %s", result.Hardware.BoardInfo.SerialNumber)

    // Verify GPUs were found
    if len(result.Hardware.GPUs) == 0 {
        t.Error("Expected to find GPUs")
    }

    t.Logf("\nFound %d GPUs:", len(result.Hardware.GPUs))

    gpusWithSerials := 0
    for _, gpu := range result.Hardware.GPUs {
        t.Logf("  %s: %s (Firmware: %s, Serial: %s, BDF: %s)",
            gpu.Slot, gpu.Model, gpu.Firmware, gpu.SerialNumber, gpu.BDF)

        if gpu.SerialNumber != "" {
            gpusWithSerials++
        }
    }

    // Verify that GPU serial numbers were extracted
    if gpusWithSerials == 0 {
        t.Error("Expected at least some GPUs to have serial numbers")
    }

    t.Logf("\nGPUs with serial numbers: %d/%d", gpusWithSerials, len(result.Hardware.GPUs))

    // Check events for SXM2 failures
    t.Logf("\nTotal events: %d", len(result.Events))

    // Look for the specific serial or SXM2
    sxm2Events := 0
    for _, event := range result.Events {
        desc := event.Description + " " + event.RawData + " " + event.EventType
        if contains(desc, "SXM2") || contains(desc, "1653925025827") {
            t.Logf("  SXM2 Event: [%s] %s (Severity: %s)", event.EventType, event.Description, event.Severity)
            sxm2Events++
        }
    }

    if sxm2Events == 0 {
        t.Error("Expected to find events for SXM2 (faulty GPU 1653925025827)")
    }
    t.Logf("\nSXM2 failure events: %d", sxm2Events)
}

func TestNVIDIAParser_GPUStatusFromSummary_RealArchive07900(t *testing.T) {
    archivePath := filepath.Join("../../../../example", "A514359X5A07900_logs-20260122-074208.tar")
    if _, err := os.Stat(archivePath); os.IsNotExist(err) {
        t.Skip("Test archive not found, skipping test")
    }

    files, err := parser.ExtractArchive(archivePath)
    if err != nil {
        t.Fatalf("Failed to extract archive: %v", err)
    }

    p := &Parser{}
    result, err := p.Parse(files)
    if err != nil {
        t.Fatalf("Failed to parse archive: %v", err)
    }

    if result.Hardware == nil || len(result.Hardware.GPUs) == 0 {
        t.Fatalf("expected GPUs in parsed result")
    }

    statusBySerial := make(map[string]string, len(result.Hardware.GPUs))
    for _, gpu := range result.Hardware.GPUs {
        if gpu.SerialNumber != "" {
            statusBySerial[gpu.SerialNumber] = gpu.Status
        }
    }

    if got := statusBySerial["1653925025497"]; got != "FAIL" {
        t.Fatalf("expected GPU serial 1653925025497 status FAIL, got %q", got)
    }

    for serial, st := range statusBySerial {
        if serial == "1653925025497" {
            continue
        }
        if st != "PASS" {
            t.Fatalf("expected non-failing GPU serial %s status PASS, got %q", serial, st)
        }
    }
}

func TestNVIDIAParser_GPUErrorDetailsFromSummary_RealArchive07900(t *testing.T) {
    archivePath := filepath.Join("../../../../example", "A514359X5A07900_logs-20260122-074208.tar")
    if _, err := os.Stat(archivePath); os.IsNotExist(err) {
        t.Skip("Test archive not found, skipping test")
    }

    files, err := parser.ExtractArchive(archivePath)
    if err != nil {
        t.Fatalf("Failed to extract archive: %v", err)
    }

    p := &Parser{}
    result, err := p.Parse(files)
    if err != nil {
        t.Fatalf("Failed to parse archive: %v", err)
    }

    if result.Hardware == nil || len(result.Hardware.GPUs) == 0 {
        t.Fatalf("expected GPUs in parsed result")
    }

    errBySerial := make(map[string]string, len(result.Hardware.GPUs))
    for _, gpu := range result.Hardware.GPUs {
        if gpu.SerialNumber != "" {
            errBySerial[gpu.SerialNumber] = gpu.ErrorDescription
        }
    }

    if got := errBySerial["1653925025497"]; got != "Row remapping failed" {
        t.Fatalf("expected GPU serial 1653925025497 error Row remapping failed, got %q", got)
    }
}

func TestNVIDIAParser_GPUModelFromSKU_RealArchive07900(t *testing.T) {
    archivePath := filepath.Join("../../../../example", "A514359X5A07900_logs-20260122-074208.tar")
    if _, err := os.Stat(archivePath); os.IsNotExist(err) {
        t.Skip("Test archive not found, skipping test")
    }

    files, err := parser.ExtractArchive(archivePath)
    if err != nil {
        t.Fatalf("Failed to extract archive: %v", err)
    }

    p := &Parser{}
    result, err := p.Parse(files)
    if err != nil {
        t.Fatalf("Failed to parse archive: %v", err)
    }

    if result.Hardware == nil || len(result.Hardware.GPUs) == 0 {
        t.Fatalf("expected GPUs in parsed result")
    }

    found := false
    for _, gpu := range result.Hardware.GPUs {
        if gpu.Model == "692-2G520-0280-501" && gpu.Description == "hgx h200 8 gpu 141g aircooled" {
            found = true
            break
        }
    }

    if !found {
        t.Fatalf("expected at least one GPU with model 692-2G520-0280-501 and description hgx h200 8 gpu 141g aircooled")
    }
}

func TestNVIDIAParser_ComponentCheckTimes_RealArchive07900(t *testing.T) {
    archivePath := filepath.Join("../../../../example", "A514359X5A07900_logs-20260122-074208.tar")
    if _, err := os.Stat(archivePath); os.IsNotExist(err) {
        t.Skip("Test archive not found, skipping test")
    }

    files, err := parser.ExtractArchive(archivePath)
    if err != nil {
        t.Fatalf("Failed to extract archive: %v", err)
    }

    p := &Parser{}
    result, err := p.Parse(files)
    if err != nil {
        t.Fatalf("Failed to parse archive: %v", err)
    }

    if result.Hardware == nil {
        t.Fatalf("expected hardware in parsed result")
    }

    expectedGPU := time.Date(2026, 1, 22, 6, 45, 36, 0, time.UTC)
    expectedNVSwitch := time.Date(2026, 1, 22, 6, 11, 32, 0, time.UTC)

    if len(result.Hardware.GPUs) == 0 {
        t.Fatalf("expected GPUs in parsed result")
    }
    for _, gpu := range result.Hardware.GPUs {
        if !gpu.StatusCheckedAt.Equal(expectedGPU) {
            t.Fatalf("expected GPU %s status_checked_at %s, got %s", gpu.Slot, expectedGPU.Format(time.RFC3339), gpu.StatusCheckedAt.Format(time.RFC3339))
        }
        if gpu.StatusAtCollect == nil || !gpu.StatusAtCollect.At.Equal(expectedGPU) {
            t.Fatalf("expected GPU %s status_at_collection.at %s", gpu.Slot, expectedGPU.Format(time.RFC3339))
        }
    }

    nvsCount := 0
    for _, dev := range result.Hardware.PCIeDevices {
        slot := normalizeNVSwitchSlot(dev.Slot)
        if slot == "" {
            continue
        }
        if dev.DeviceClass != "NVSwitch" && len(slot) < len("NVSWITCH") {
            continue
        }
        if dev.DeviceClass != "NVSwitch" && slot[:len("NVSWITCH")] != "NVSWITCH" {
            continue
        }
        nvsCount++
        if !dev.StatusCheckedAt.Equal(expectedNVSwitch) {
            t.Fatalf("expected NVSwitch %s status_checked_at %s, got %s", dev.Slot, expectedNVSwitch.Format(time.RFC3339), dev.StatusCheckedAt.Format(time.RFC3339))
        }
        if dev.StatusAtCollect == nil || !dev.StatusAtCollect.At.Equal(expectedNVSwitch) {
            t.Fatalf("expected NVSwitch %s status_at_collection.at %s", dev.Slot, expectedNVSwitch.Format(time.RFC3339))
        }
    }
    if nvsCount == 0 {
        t.Fatalf("expected NVSwitch devices in parsed result")
    }
}

func contains(s, substr string) bool {
    return len(s) >= len(substr) && (s == substr || len(s) > len(substr) &&
        (s[:len(substr)] == substr || s[len(s)-len(substr):] == substr ||
            findSubstring(s, substr)))
}

func findSubstring(s, substr string) bool {
    for i := 0; i <= len(s)-len(substr); i++ {
        if s[i:i+len(substr)] == substr {
            return true
        }
    }
    return false
}
internal/parser/vendors/nvidia/summary.go (vendored, 338 lines)
@@ -4,6 +4,7 @@ import (
    "encoding/csv"
    "encoding/json"
    "fmt"
    "regexp"
    "strings"
    "time"

@@ -20,6 +21,9 @@ type SummaryEntry struct {
    IgnoreError string `json:"Ignore Error"`
}

var gpuComponentIDRegex = regexp.MustCompile(`^SXM(\d+)_SN_(.+)$`)
var nvswitchInventoryComponentRegex = regexp.MustCompile(`^NVSWITCH_(NVSWITCH\d+)_`)

// ParseSummaryJSON parses summary.json file and returns events
func ParseSummaryJSON(content []byte) []models.Event {
    var entries []SummaryEntry

@@ -92,6 +96,340 @@ func ParseSummaryCSV(content []byte) []models.Event {
    return events
}

// CollectGPUStatusesFromSummaryJSON extracts per-GPU PASS/FAIL status from summary.json.
// Key format in returned map is component ID from summary (e.g. "SXM5_SN_1653925025497").
func CollectGPUStatusesFromSummaryJSON(content []byte) map[string]string {
    var entries []SummaryEntry
    if err := json.Unmarshal(content, &entries); err != nil {
        return nil
    }

    statuses := make(map[string]string)
    for _, entry := range entries {
        component := strings.TrimSpace(entry.ComponentID)
        if component == "" || !gpuComponentIDRegex.MatchString(component) {
            continue
        }

        current := statuses[component]
        next := "PASS"
        if !isSummaryJSONRecordPassing(entry.ErrorCode, entry.Notes) {
            next = "FAIL"
        }
        statuses[component] = mergeGPUStatus(current, next)
    }

    return statuses
}

// CollectGPUFailureDetailsFromSummaryJSON extracts per-GPU failure details from summary.json.
// Key format in returned map is component ID from summary (e.g. "SXM5_SN_1653925025497").
func CollectGPUFailureDetailsFromSummaryJSON(content []byte) map[string]string {
    var entries []SummaryEntry
    if err := json.Unmarshal(content, &entries); err != nil {
        return nil
    }

    details := make(map[string]string)
    for _, entry := range entries {
        component := strings.TrimSpace(entry.ComponentID)
        if component == "" || !gpuComponentIDRegex.MatchString(component) {
            continue
        }
        if isSummaryJSONRecordPassing(entry.ErrorCode, entry.Notes) {
            continue
        }

        note := strings.TrimSpace(entry.Notes)
        if note == "" || strings.EqualFold(note, "OK") {
            note = strings.TrimSpace(entry.ErrorCode)
        }
        if note == "" {
            continue
        }

        // Keep first non-empty detail to avoid noisy overrides.
        if _, exists := details[component]; !exists {
            details[component] = note
        }
    }

    return details
}

// CollectGPUStatusesFromSummaryCSV extracts per-GPU PASS/FAIL status from summary.csv.
// Key format in returned map is component ID from summary (e.g. "SXM5_SN_1653925025497").
func CollectGPUStatusesFromSummaryCSV(content []byte) map[string]string {
    reader := csv.NewReader(strings.NewReader(string(content)))
    records, err := reader.ReadAll()
    if err != nil {
        return nil
    }

    statuses := make(map[string]string)
    for i, record := range records {
        if i == 0 || len(record) < 7 {
            continue
        }

        component := strings.TrimSpace(record[5])
        if component == "" || !gpuComponentIDRegex.MatchString(component) {
            continue
        }

        errorCode := strings.TrimSpace(record[0])
        notes := strings.TrimSpace(record[6])

        current := statuses[component]
        next := "PASS"
        if !isSummaryCSVRecordPassing(errorCode, notes) {
            next = "FAIL"
        }
        statuses[component] = mergeGPUStatus(current, next)
    }

    return statuses
}

// CollectNVSwitchStatusesFromSummaryJSON extracts per-NVSwitch PASS/FAIL status from summary.json.
// Key format in returned map is normalized switch slot (e.g. "NVSWITCH0").
func CollectNVSwitchStatusesFromSummaryJSON(content []byte) map[string]string {
    var entries []SummaryEntry
    if err := json.Unmarshal(content, &entries); err != nil {
        return nil
    }

    statuses := make(map[string]string)
    for _, entry := range entries {
        component := strings.TrimSpace(entry.ComponentID)
        matches := nvswitchInventoryComponentRegex.FindStringSubmatch(component)
        if len(matches) != 2 {
            continue
        }

        slot := strings.TrimSpace(matches[1])
        if slot == "" {
            continue
        }

        current := statuses[slot]
        next := "PASS"
        if !isSummaryJSONRecordPassing(entry.ErrorCode, entry.Notes) {
            next = "FAIL"
        }
        statuses[slot] = mergeGPUStatus(current, next)
    }

    return statuses
}

// CollectNVSwitchStatusesFromSummaryCSV extracts per-NVSwitch PASS/FAIL status from summary.csv.
// Key format in returned map is normalized switch slot (e.g. "NVSWITCH0").
func CollectNVSwitchStatusesFromSummaryCSV(content []byte) map[string]string {
    reader := csv.NewReader(strings.NewReader(string(content)))
    records, err := reader.ReadAll()
    if err != nil {
        return nil
    }

    statuses := make(map[string]string)
    for i, record := range records {
        if i == 0 || len(record) < 7 {
            continue
        }

        component := strings.TrimSpace(record[5])
        matches := nvswitchInventoryComponentRegex.FindStringSubmatch(component)
        if len(matches) != 2 {
            continue
        }

        slot := strings.TrimSpace(matches[1])
        if slot == "" {
            continue
        }

        errorCode := strings.TrimSpace(record[0])
        notes := strings.TrimSpace(record[6])

        current := statuses[slot]
        next := "PASS"
        if !isSummaryCSVRecordPassing(errorCode, notes) {
            next = "FAIL"
        }
        statuses[slot] = mergeGPUStatus(current, next)
    }

    return statuses
}

// CollectGPUFailureDetailsFromSummaryCSV extracts per-GPU failure details from summary.csv.
// Key format in returned map is component ID from summary (e.g. "SXM5_SN_1653925025497").
func CollectGPUFailureDetailsFromSummaryCSV(content []byte) map[string]string {
    reader := csv.NewReader(strings.NewReader(string(content)))
    records, err := reader.ReadAll()
    if err != nil {
        return nil
    }

    details := make(map[string]string)
    for i, record := range records {
        if i == 0 || len(record) < 7 {
            continue
        }

        component := strings.TrimSpace(record[5])
        if component == "" || !gpuComponentIDRegex.MatchString(component) {
            continue
        }

        errorCode := strings.TrimSpace(record[0])
        notes := strings.TrimSpace(record[6])
        if isSummaryCSVRecordPassing(errorCode, notes) {
            continue
        }

        note := notes
        if note == "" || strings.EqualFold(note, "OK") {
            note = errorCode
        }
        if note == "" {
            continue
        }

        if _, exists := details[component]; !exists {
            details[component] = note
        }
    }

    return details
}

func isSummaryJSONRecordPassing(errorCode, notes string) bool {
    _ = errorCode
    return strings.TrimSpace(notes) == "OK"
}

func isSummaryCSVRecordPassing(errorCode, notes string) bool {
    _ = errorCode
    return strings.TrimSpace(notes) == "OK"
}

func mergeGPUStatus(current, next string) string {
    // FAIL has highest priority.
    if current == "FAIL" || next == "FAIL" {
        return "FAIL"
    }
    if current == "PASS" || next == "PASS" {
        return "PASS"
    }
    return ""
}

// ApplyGPUStatuses applies aggregated PASS/FAIL statuses from summary components to parsed GPUs.
func ApplyGPUStatuses(result *models.AnalysisResult, componentStatuses map[string]string) {
    if result == nil || result.Hardware == nil || len(result.Hardware.GPUs) == 0 || len(componentStatuses) == 0 {
        return
    }

    slotStatus := make(map[string]string)   // key: GPUSXM<idx>
    serialStatus := make(map[string]string) // key: GPU serial

    for componentID, status := range componentStatuses {
        matches := gpuComponentIDRegex.FindStringSubmatch(strings.TrimSpace(componentID))
        if len(matches) != 3 {
            continue
        }
        slotKey := "GPUSXM" + matches[1]
        serialKey := strings.TrimSpace(matches[2])
        slotStatus[slotKey] = mergeGPUStatus(slotStatus[slotKey], status)
        if serialKey != "" {
            serialStatus[serialKey] = mergeGPUStatus(serialStatus[serialKey], status)
        }
    }

    for i := range result.Hardware.GPUs {
        gpu := &result.Hardware.GPUs[i]
        next := ""
        if serial := strings.TrimSpace(gpu.SerialNumber); serial != "" {
            next = serialStatus[serial]
        }
        if next == "" {
            next = slotStatus[strings.TrimSpace(gpu.Slot)]
        }
        if next != "" {
            gpu.Status = next
        }
    }
}

// ApplyNVSwitchStatuses applies aggregated PASS/FAIL statuses from summary components to parsed NVSwitch devices.
func ApplyNVSwitchStatuses(result *models.AnalysisResult, switchStatuses map[string]string) {
    if result == nil || result.Hardware == nil || len(result.Hardware.PCIeDevices) == 0 || len(switchStatuses) == 0 {
        return
    }

    for i := range result.Hardware.PCIeDevices {
        dev := &result.Hardware.PCIeDevices[i]
        slot := normalizeNVSwitchSlot(strings.TrimSpace(dev.Slot))
        if slot == "" {
            continue
        }
        if !strings.HasPrefix(strings.ToUpper(slot), "NVSWITCH") {
            continue
        }
        if st := switchStatuses[slot]; st != "" {
            dev.Status = st
        }
    }
}

// ApplyGPUFailureDetails maps parsed failure details from summary components to GPUs.
func ApplyGPUFailureDetails(result *models.AnalysisResult, componentDetails map[string]string) {
    if result == nil || result.Hardware == nil || len(result.Hardware.GPUs) == 0 || len(componentDetails) == 0 {
        return
    }

    slotDetails := make(map[string]string)   // key: GPUSXM<idx>
    serialDetails := make(map[string]string) // key: GPU serial

    for componentID, detail := range componentDetails {
        matches := gpuComponentIDRegex.FindStringSubmatch(strings.TrimSpace(componentID))
        if len(matches) != 3 {
            continue
        }
        detail = strings.TrimSpace(detail)
        if detail == "" {
            continue
        }

        slotKey := "GPUSXM" + matches[1]
        serialKey := strings.TrimSpace(matches[2])
        if _, exists := slotDetails[slotKey]; !exists {
            slotDetails[slotKey] = detail
        }
        if serialKey != "" {
            if _, exists := serialDetails[serialKey]; !exists {
                serialDetails[serialKey] = detail
            }
        }
    }

    for i := range result.Hardware.GPUs {
        gpu := &result.Hardware.GPUs[i]
        detail := ""
        if serial := strings.TrimSpace(gpu.SerialNumber); serial != "" {
            detail = serialDetails[serial]
        }
        if detail == "" {
            detail = slotDetails[strings.TrimSpace(gpu.Slot)]
        }
        if detail != "" {
            gpu.ErrorDescription = detail
        }
    }
}

// formatSummaryDescription creates a human-readable description from summary entry
func formatSummaryDescription(entry SummaryEntry) string {
    component := entry.ComponentID
internal/parser/vendors/nvidia/summary_status_test.go (vendored, new file, 122 lines)
@@ -0,0 +1,122 @@
package nvidia

import (
    "strings"
    "testing"

    "git.mchus.pro/mchus/logpile/internal/models"
)

func TestApplyGPUStatuses_FromSummaryCSV_FailAndPass(t *testing.T) {
    csvData := strings.Join([]string{
        "ErrorCode,Test,VirtualID,SubTest,Type,ComponentID,Notes,Level,,,IgnoreError",
        "0,gpumem,gpumem,,GPU,SXM1_SN_111,OK,1,,,False",
        "363,gpumem,gpumem,,GPU,SXM5_SN_1653925025497,Row remapping failed,1,,,False",
        "0,gpu_fieldiag,gpu_fieldiag,,GPU,SXM1_SN_111,OK,1,,,False",
        "0,gpu_fieldiag,gpu_fieldiag,,GPU,SXM2_SN_222,OK,1,,,False",
    }, "\n")

    result := &models.AnalysisResult{
        Hardware: &models.HardwareConfig{
            GPUs: []models.GPU{
                {Slot: "GPUSXM1", SerialNumber: "111"},
                {Slot: "GPUSXM2", SerialNumber: "222"},
                {Slot: "GPUSXM5", SerialNumber: "1653925025497"},
            },
        },
    }

    statuses := CollectGPUStatusesFromSummaryCSV([]byte(csvData))
    ApplyGPUStatuses(result, statuses)

    bySerial := map[string]string{}
    for _, gpu := range result.Hardware.GPUs {
        bySerial[gpu.SerialNumber] = gpu.Status
    }

    if bySerial["1653925025497"] != "FAIL" {
        t.Fatalf("expected serial 1653925025497 status FAIL, got %q", bySerial["1653925025497"])
    }
    if bySerial["111"] != "PASS" {
        t.Fatalf("expected serial 111 status PASS, got %q", bySerial["111"])
    }
    if bySerial["222"] != "PASS" {
        t.Fatalf("expected serial 222 status PASS, got %q", bySerial["222"])
    }
}

func TestApplyGPUFailureDetails_FromSummaryJSON_BySerial(t *testing.T) {
    jsonData := []byte(`[
    {
        "Error Code": "005-000-1-000000000363",
        "Test": "gpumem",
        "Component ID": "SXM5_SN_1653925025497",
        "Notes": "Row remapping failed",
        "Virtual ID": "gpumem",
        "Ignore Error": "False"
    }
]`)

    result := &models.AnalysisResult{
        Hardware: &models.HardwareConfig{
            GPUs: []models.GPU{
                {Slot: "GPUSXM5", SerialNumber: "1653925025497"},
                {Slot: "GPUSXM2", SerialNumber: "1653925024190"},
            },
        },
    }

    details := CollectGPUFailureDetailsFromSummaryJSON(jsonData)
    ApplyGPUFailureDetails(result, details)

    if got := result.Hardware.GPUs[0].ErrorDescription; got != "Row remapping failed" {
        t.Fatalf("expected serial 1653925025497 error Row remapping failed, got %q", got)
    }
    if got := result.Hardware.GPUs[1].ErrorDescription; got != "" {
        t.Fatalf("expected no error description for healthy GPU, got %q", got)
    }
}

func TestApplyNVSwitchStatuses_FromSummaryJSON(t *testing.T) {
    jsonData := []byte(`[
    {
        "Error Code": "0",
        "Test": "inventory",
        "Component ID": "NVSWITCH_NVSWITCH0_VendorID",
        "Notes": "OK",
        "Virtual ID": "inventory",
        "Ignore Error": "False"
    },
    {
        "Error Code": "1",
        "Test": "inventory",
        "Component ID": "NVSWITCH_NVSWITCH1_LinkState",
        "Notes": "Link down",
        "Virtual ID": "inventory",
        "Ignore Error": "False"
    }
]`)

    result := &models.AnalysisResult{
        Hardware: &models.HardwareConfig{
            PCIeDevices: []models.PCIeDevice{
                {Slot: "NVSWITCH0", Status: "Unknown"},
                {Slot: "NVSWITCH1", Status: "Unknown"},
                {Slot: "NVSWITCH2", Status: "Unknown"},
            },
        },
    }

    statuses := CollectNVSwitchStatusesFromSummaryJSON(jsonData)
    ApplyNVSwitchStatuses(result, statuses)

    if got := result.Hardware.PCIeDevices[0].Status; got != "PASS" {
        t.Fatalf("expected NVSWITCH0 status PASS, got %q", got)
    }
    if got := result.Hardware.PCIeDevices[1].Status; got != "FAIL" {
        t.Fatalf("expected NVSWITCH1 status FAIL, got %q", got)
    }
    if got := result.Hardware.PCIeDevices[2].Status; got != "Unknown" {
        t.Fatalf("expected NVSWITCH2 status unchanged Unknown, got %q", got)
    }
}
@@ -3,6 +3,7 @@ package nvidia
|
||||
import (
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"regexp"
|
||||
"strings"
|
||||
|
||||
"git.mchus.pro/mchus/logpile/internal/models"
|
||||
@@ -53,6 +54,8 @@ type Property struct {
	Value interface{} `json:"value"` // Can be string or number
}

var nvswitchComponentIDRegex = regexp.MustCompile(`^(NVSWITCH\d+|NVSWITCHNVSWITCH\d+)$`)

// GetValueAsString returns the value as a string
func (p *Property) GetValueAsString() string {
	switch v := p.Value.(type) {
@@ -107,7 +110,7 @@ func parseInventoryComponents(components []Component, result *models.AnalysisRes
	}

	// Parse NVSwitch components
-	if strings.HasPrefix(comp.ComponentID, "NVSWITCHNVSWITCH") {
+	if isNVSwitchComponentID(comp.ComponentID) {
		nvswitch := parseNVSwitchComponent(comp)
		if nvswitch != nil {
			// Add as PCIe device for now
@@ -152,7 +155,7 @@ func parseSystemInfo(comp Component, result *models.AnalysisResult) bool {
	// Don't overwrite real data from output.log with generic data
	// Only set if empty or still has the default placeholder value
	if result.Hardware.BoardInfo.ProductName == "" ||
-		result.Hardware.BoardInfo.ProductName == "GPU Server (Field Diag)" {
+		result.Hardware.BoardInfo.ProductName == "GPU Server (Field Diag)" {
		result.Hardware.BoardInfo.ProductName = value
	}
	case "SerialNumber", "Serial", "BoardSerial", "SystemSerial":
@@ -183,6 +186,9 @@ func parseGPUComponent(comp Component) *models.GPU {
	switch prop.ID {
	case "DeviceID":
		deviceID = prop.GetValueAsString()
		if deviceID != "" {
			fmt.Sscanf(deviceID, "%x", &gpu.DeviceID)
		}
	case "Vendor":
		gpu.Manufacturer = prop.GetValueAsString()
	case "DeviceName":
@@ -217,7 +223,7 @@ func parseGPUComponent(comp Component) *models.GPU {
// parseNVSwitchComponent parses NVSwitch component information
func parseNVSwitchComponent(comp Component) *models.PCIeDevice {
	device := &models.PCIeDevice{
-		Slot: comp.ComponentID, // e.g., "NVSWITCHNVSWITCH0"
+		Slot: normalizeNVSwitchSlot(comp.ComponentID),
	}

	var vendorIDStr, deviceIDStr, vbios, pciID string
@@ -279,3 +285,15 @@ func parseNVSwitchComponent(comp Component) *models.PCIeDevice {

	return device
}

func normalizeNVSwitchSlot(componentID string) string {
	slot := strings.TrimSpace(componentID)
	if strings.HasPrefix(slot, "NVSWITCHNVSWITCH") {
		return strings.Replace(slot, "NVSWITCHNVSWITCH", "NVSWITCH", 1)
	}
	return slot
}

func isNVSwitchComponentID(componentID string) bool {
	return nvswitchComponentIDRegex.MatchString(strings.TrimSpace(componentID))
}
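The filter and slot normalization above can be exercised in isolation. The snippet below copies the two helpers verbatim from the diff so their behavior on the duplicated-prefix component IDs is visible on its own:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Some field-diag dumps emit "NVSWITCHNVSWITCH0" instead of "NVSWITCH0";
// the regex accepts both forms, and property-check pseudo-components like
// "NVSWITCH_NVSWITCH1_VendorID" are rejected.
var nvswitchComponentIDRegex = regexp.MustCompile(`^(NVSWITCH\d+|NVSWITCHNVSWITCH\d+)$`)

func isNVSwitchComponentID(componentID string) bool {
	return nvswitchComponentIDRegex.MatchString(strings.TrimSpace(componentID))
}

func normalizeNVSwitchSlot(componentID string) string {
	slot := strings.TrimSpace(componentID)
	if strings.HasPrefix(slot, "NVSWITCHNVSWITCH") {
		return strings.Replace(slot, "NVSWITCHNVSWITCH", "NVSWITCH", 1)
	}
	return slot
}

func main() {
	fmt.Println(isNVSwitchComponentID("NVSWITCHNVSWITCH1"))       // true
	fmt.Println(isNVSwitchComponentID("NVSWITCH_NVSWITCH1_VendorID")) // false
	fmt.Println(normalizeNVSwitchSlot("NVSWITCHNVSWITCH1"))       // NVSWITCH1
}
```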
internal/parser/vendors/nvidia/unified_summary_filter_test.go (vendored, new file, 46 lines)
@@ -0,0 +1,46 @@
package nvidia

import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestParseInventoryComponents_IgnoresNVSwitchPropertyChecks(t *testing.T) {
	result := &models.AnalysisResult{
		Hardware: &models.HardwareConfig{},
	}

	components := []Component{
		{
			ComponentID: "NVSWITCHNVSWITCH1",
			Properties: []Property{
				{ID: "VendorID", Value: "10de"},
				{ID: "DeviceID", Value: "22a3"},
				{ID: "PCIID", Value: "0000:06:00.0"},
			},
		},
		{
			ComponentID: "NVSWITCHNum",
			Properties: []Property{
				{ID: "NVSWITCHNum", Value: 4},
			},
		},
		{
			ComponentID: "NVSWITCH_NVSWITCH1_VendorID",
			Properties: []Property{
				{ID: "NVSWITCH_NVSWITCH1_VendorID", Value: "10de"},
			},
		},
	}

	parseInventoryComponents(components, result)

	if got := len(result.Hardware.PCIeDevices); got != 1 {
		t.Fatalf("expected exactly 1 parsed NVSwitch device, got %d", got)
	}

	if result.Hardware.PCIeDevices[0].Slot != "NVSWITCH1" {
		t.Fatalf("expected slot NVSWITCH1, got %q", result.Hardware.PCIeDevices[0].Slot)
	}
}
internal/parser/vendors/nvidia/unified_summary_test.go (vendored, new file, 35 lines)
@@ -0,0 +1,35 @@
package nvidia

import "testing"

func TestParseNVSwitchComponent_NormalizesDuplicatedPrefixInSlot(t *testing.T) {
	comp := Component{
		ComponentID: "NVSWITCHNVSWITCH1",
		Properties: []Property{
			{ID: "VendorID", Value: "10de"},
			{ID: "DeviceID", Value: "22a3"},
			{ID: "Vendor", Value: "NVIDIA Corporation"},
			{ID: "PCIID", Value: "0000:06:00.0"},
			{ID: "PCISpeed", Value: "16GT/s"},
			{ID: "PCIWidth", Value: "x2"},
			{ID: "VBIOS_version", Value: "96.10.6D.00.01"},
		},
	}

	device := parseNVSwitchComponent(comp)
	if device == nil {
		t.Fatal("expected non-nil NVSwitch device")
	}

	if device.Slot != "NVSWITCH1" {
		t.Fatalf("expected normalized slot NVSWITCH1, got %q", device.Slot)
	}

	if device.BDF != "0000:06:00.0" {
		t.Fatalf("expected BDF 0000:06:00.0, got %q", device.BDF)
	}

	if device.DeviceClass != "NVSwitch" {
		t.Fatalf("expected device class NVSwitch, got %q", device.DeviceClass)
	}
}
internal/parser/vendors/nvidia_bug_report/README.md (vendored, deleted, 275 lines)
@@ -1,275 +0,0 @@
# NVIDIA Bug Report Parser

Parser for nvidia-bug-report files generated by the `nvidia-bug-report.sh` script.

## Purpose

This parser processes NVIDIA driver diagnostic logs and extracts:
- Memory module information (from dmidecode)
- GPU device information
- NVIDIA driver version

## File Format

- File name: `nvidia-bug-report-*.log.gz`
- Format: gzip-compressed text file
- Generated by: the `nvidia-bug-report.sh` script

## Confidence Score

**85** — high priority for nvidia-bug-report files

## Extracted Data

### 1. System Information (from dmidecode)

Server information:
- **Serial Number**: server serial number (e.g., 2KD501412)
- **UUID**: unique system identifier (e.g., 2e4054bc-1dd2-11b2-0284-6b0a21737950)
- **Manufacturer**: server manufacturer
- **Product Name**: server model
- **Version**: system version

### 2. CPU Information (from dmidecode)

For each processor, the parser extracts:
- **Model**: processor model (e.g., Intel(R) Xeon(R) Platinum 8480+)
- **Serial Number**: serial number (e.g., 5DB0D6C0DD30ABD8)
- **Core Count**: number of cores (e.g., 56)
- **Thread Count**: number of threads (e.g., 112)
- **Max Speed**: maximum frequency (e.g., 3800 MHz)
- **Current Speed**: current frequency (e.g., 2000 MHz)

Example:
```
Socket 0: Intel(R) Xeon(R) Platinum 8480+
  Serial Number: 5DB0D6C0DD30ABD8
  Cores: 56, Threads: 112
  Frequency: 2000 MHz (Max: 3800 MHz)
```

### 3. Memory Modules (from dmidecode)

For each memory module, the parser extracts:
- **Slot/Location**: e.g., CPU0_C0D0
- **Size**: size in GB (e.g., 64 GB)
- **Type**: memory type (DDR5, DDR4, etc.)
- **Manufacturer**: manufacturer (Hynix, Samsung, Micron, etc.)
- **Part Number**: module P/N (e.g., HMCG94AGBRA179N)
- **Serial Number**: module S/N (e.g., 80AD0224322B3834E6)
- **Speed**: max/current speed (e.g., 5600/4400 MHz)
- **Ranks**: number of ranks

Example:
```
Slot: CPU0_C0D0
  Size: 64 GB
  Type: DDR5
  Manufacturer: Hynix
  Part Number: HMCG94AGBRA179N
  Serial Number: 80AD0224322B3834E6
  Speed: 5600 MT/s (configured: 4400 MT/s)
  Ranks: 2
```

### 4. Power Supplies (from dmidecode)

For each power supply, the parser extracts:
- **Location**: position (e.g., PSU0, PSU1)
- **Manufacturer**: manufacturer (e.g., DELTA, Great Wall)
- **Model Part Number**: PSU model (e.g., V0310DT000000000)
- **Serial Number**: serial number (e.g., DGPLV251500LZ)
- **Max Power Capacity**: maximum power (e.g., 2700 W)
- **Revision**: firmware revision (e.g., 00.01.04)
- **Status**: status (e.g., Present, OK)

Example:
```
PSU0: V0310DT000000000 (DELTA)
  Serial Number: DGPLV251500LZ
  Power: 2700 W, Revision: 00.01.04
  Status: Present, OK
```

### 5. Network Adapters (from lspci)

For each network adapter (Ethernet, Network, InfiniBand), the parser extracts:
- **Model**: full model name from VPD (e.g., "NVIDIA ConnectX-7 HHHL Adapter card, 400GbE / NDR IB (default mode), Single-port OSFP, PCIe 5.0 x16")
- **Location**: PCI BDF address (e.g., 0000:0e:00.0)
- **Slot**: physical slot (e.g., 108)
- **Part Number**: adapter P/N (e.g., MCX75310AAS-NEAT)
- **Serial Number**: adapter S/N (e.g., MT2430600249)
- **Vendor**: manufacturer (Mellanox, NVIDIA)
- **Vendor ID / Device ID**: PCI identifiers (e.g., 15b3:1021)
- **Port Count**: number of ports (derived from the model name: Dual-port = 2, Single-port = 1)
- **Port Type**: port type (QSFP56, OSFP, SFP+)

Example:
```
0000:0e:00.0: NVIDIA ConnectX-7 HHHL Adapter card, 400GbE / NDR IB (default mode), Single-port OSFP
  Slot: 108
  P/N: MCX75310AAS-NEAT
  S/N: MT2430600249
  Ports: 1 x OSFP
```

### 6. GPU Devices

For each GPU, the parser extracts:
- **Model**: GPU model (e.g., NVIDIA H100 80GB HBM3)
- **BDF (Bus:Device.Function)**: PCI address (e.g., 0000:0f:00.0)
- **UUID**: unique GPU identifier (e.g., GPU-64674e47-e036-c12a-3e8d-55a2a9ac8db3)
- **Video BIOS**: video BIOS version (e.g., 96.00.99.00.01)
- **IRQ**: interrupt (e.g., 17)
- **Bus Type**: bus type (PCIe)
- **DMA Size**: DMA size (e.g., 52 bits)
- **DMA Mask**: DMA mask (e.g., 0xfffffffffffff)
- **Device Minor**: device number (e.g., 0)
- **Manufacturer**: NVIDIA

Example:
```
0000:0f:00.0: NVIDIA H100 80GB HBM3
  UUID: GPU-64674e47-e036-c12a-3e8d-55a2a9ac8db3
  Video BIOS: 96.00.99.00.01
  IRQ: 17
```

### 7. Events

- **Memory Configuration**: summary of memory modules (count, manufacturers, total size)
- **GPU Detection**: detected GPU devices
- **Driver Version**: NVIDIA driver version

## Usage Example

```bash
# Run with an nvidia-bug-report file
./logpile --file nvidia-bug-report-2KD501412.log.gz

# The web interface will be available at http://localhost:8082
```

## Sample Output

```
✓ Detected vendor: NVIDIA Bug Report Parser
✓ CPUs: 2
✓ Memory: 32 modules
✓ Power Supplies: 8
✓ GPUs: 8
✓ Network Adapters: 12

System Information:
  Serial Number: 2KD501412
  UUID: 2e4054bc-1dd2-11b2-0284-6b0a21737950
  Version: 0

CPU Information:
  Socket 0: Intel(R) Xeon(R) Platinum 8480+
    S/N: 5DB0D6C0DD30ABD8, Cores: 56, Threads: 112
  Socket 1: Intel(R) Xeon(R) Platinum 8480+
    S/N: 5DB017C05685B3ED, Cores: 56, Threads: 112

Power Supplies:
  PSU0: V0310DT000000000 (DELTA)
    S/N: DGPLV251500LZ
    Power: 2700 W, Revision: 00.01.04
    Status: Present, OK
  PSU1: V0310DT000000000 (DELTA)
    S/N: DGPLV251500GY
    Power: 2700 W, Revision: 00.01.04
    Status: Present, OK
  [... 6 more PSUs ...]

Memory Modules:
  CPU0_C0D0: 64 GB, Hynix
    P/N: HMCG94AGBRA179N, S/N: 80AD0224322B3834E6
    Type: DDR5, Speed: 4400/5600 MHz
  [... 31 more modules ...]

Network Adapters: 12 devices
  0000:0e:00.0: NVIDIA ConnectX-7 HHHL Adapter card, 400GbE / NDR IB (default mode), Single-port OSFP
    Slot: 108
    P/N: MCX75310AAS-NEAT
    S/N: MT2430600249
    Ports: 1 x OSFP
  0000:1f:00.0: ConnectX-6 Dx EN adapter card, 100GbE, Dual-port QSFP56
    Slot: 12
    P/N: MCX623106AN-CDAT
    S/N: MT2434J00PCD
    Ports: 2 x QSFP56
  [... 10 more adapters ...]

GPUs: 8 devices
  0000:0f:00.0: NVIDIA H100 80GB HBM3
    UUID: GPU-64674e47-e036-c12a-3e8d-55a2a9ac8db3
    Video BIOS: 96.00.99.00.01
    IRQ: 17
  0000:34:00.0: NVIDIA H100 80GB HBM3
    UUID: GPU-fa796345-c23a-54aa-1b67-709ac2542852
    Video BIOS: 96.00.99.00.01
    IRQ: 16
  [... 6 more GPUs ...]
```

## Versioning

**Current parser version:** 1.0.0

### Version History

- **1.0.0** — initial version with parsing of System Info, CPU, Memory, PSU, GPU, Network Adapters, and driver version

## Data Structure

The parser relies on the following bug report sections:
1. **dmidecode output (System Information)** — server information
2. **dmidecode output (Processor Information)** — CPU information
3. **dmidecode output (Memory Device)** — memory information
4. **dmidecode output (System Power Supply)** — power supply information
5. **lspci -vvv output (Ethernet/Network/InfiniBand controller)** — network adapter information
6. **lspci VPD (Vital Product Data)** — network adapter P/N, S/N, and model
7. **/proc/driver/nvidia/gpus/.../information** — detailed GPU information
8. **NVRM version** — driver version

## Known Limitations

1. Errors and warnings are not yet extracted from the logs
2. Some GPU-specific metrics (temperature, utilization) are not parsed
3. GPU performance and memory-usage metrics require parsing additional sections

## Extending

To add new capabilities:

1. **Driver errors**: parse NVIDIA driver error sections
2. **nvidia-smi output**: extract detailed information from nvidia-smi output (temperature, utilization)
3. **GPU performance**: parse GPU performance and memory-usage metrics
4. **PCIe details**: extract PCIe configuration details (link speed, width)

## Sample File Structure

```
Start of NVIDIA bug report log file
nvidia-bug-report.sh Version: 34275561
Date: Thu Jul 17 18:18:18 EDT 2025

[... system info ...]

Memory Device
	Data Width: 64 bits
	Size: 64 GB
	Form Factor: DIMM
	Locator: CPU0_C0D0
	Type: DDR5
	Speed: 5600 MT/s
	Manufacturer: Hynix
	Serial Number: 80AD0224322B3834E6
	Part Number: HMCG94AGBRA179N

[... more memory modules ...]

*** /proc/driver/nvidia/./gpus/0000:0f:00.0/power
[... GPU info ...]
```
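The dmidecode "Memory Device" blocks described in the README above are plain indented `Key: Value` lines. A minimal sketch of reducing one block to a lookup map (`parseMemoryDevice` is a hypothetical helper for illustration, not the parser's actual code):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseMemoryDevice turns one dmidecode "Memory Device" block into a
// key/value map; lines without a colon (like the block title) are skipped.
func parseMemoryDevice(block string) map[string]string {
	out := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(block))
	for sc.Scan() {
		parts := strings.SplitN(sc.Text(), ":", 2)
		if len(parts) != 2 {
			continue
		}
		out[strings.TrimSpace(parts[0])] = strings.TrimSpace(parts[1])
	}
	return out
}

func main() {
	block := `Memory Device
	Size: 64 GB
	Type: DDR5
	Manufacturer: Hynix
	Part Number: HMCG94AGBRA179N`
	m := parseMemoryDevice(block)
	fmt.Println(m["Manufacturer"], m["Part Number"]) // Hynix HMCG94AGBRA179N
}
```

Note that real dmidecode values such as `Speed: 5600 MT/s` keep their units here; unit conversion would be a separate step.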
internal/parser/vendors/nvidia_bug_report/gpu.go (vendored, 137 lines changed)
@@ -106,6 +108,8 @@ func parseGPUInfo(content string, result *models.AnalysisResult) {
		result.Hardware.GPUs = append(result.Hardware.GPUs, *currentGPU)
	}

	applyGPUSerialNumbers(content, result.Hardware.GPUs)

	// Create event for GPU summary
	if len(result.Hardware.GPUs) > 0 {
		result.Events = append(result.Events, models.Event{
@@ -168,3 +170,138 @@ func formatGPUSummary(gpus []models.GPU) string {

	return summary.String()
}

func applyGPUSerialNumbers(content string, gpus []models.GPU) {
	if len(gpus) == 0 {
		return
	}

	serialByBDF := parseGPUSerialsFromNvidiaSMI(content)
	if len(serialByBDF) == 0 {
		serialByBDF = parseGPUSerialsFromSummary(content)
	}

	if len(serialByBDF) == 0 {
		return
	}

	for i := range gpus {
		bdf := normalizeGPUAddress(gpus[i].BDF)
		if bdf == "" {
			continue
		}
		if serial, ok := serialByBDF[bdf]; ok && serial != "" {
			gpus[i].SerialNumber = serial
		}
	}
}

func parseGPUSerialsFromNvidiaSMI(content string) map[string]string {
	scanner := bufio.NewScanner(strings.NewReader(content))
	reGPU := regexp.MustCompile(`^GPU\s+([0-9A-F]{8}:[0-9A-F]{2}:[0-9A-F]{2}\.[0-9A-F])$`)

	serialByBDF := make(map[string]string)
	currentBDF := ""

	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue
		}

		if matches := reGPU.FindStringSubmatch(line); len(matches) == 2 {
			currentBDF = normalizeGPUAddress(matches[1])
			continue
		}

		if currentBDF == "" {
			continue
		}

		if strings.HasPrefix(line, "Serial Number") {
			parts := strings.SplitN(line, ":", 2)
			if len(parts) != 2 {
				continue
			}
			serial := strings.TrimSpace(parts[1])
			if serial != "" && !strings.EqualFold(serial, "N/A") {
				serialByBDF[currentBDF] = serial
			}
		}
	}

	return serialByBDF
}

func parseGPUSerialsFromSummary(content string) map[string]string {
	scanner := bufio.NewScanner(strings.NewReader(content))

	serialByBDF := make(map[string]string)
	inGPUDetails := false

	for scanner.Scan() {
		line := scanner.Text()
		trimmed := strings.TrimSpace(line)

		if strings.HasPrefix(trimmed, "NVIDIA GPU Details") {
			inGPUDetails = true
		}
		if !inGPUDetails {
			continue
		}
		if strings.HasPrefix(trimmed, "NVIDIA Switch Details") {
			break
		}

		parts := strings.Split(line, "|")
		if len(parts) < 2 {
			continue
		}
		payload := strings.TrimSpace(parts[len(parts)-1])
		if payload == "" {
			continue
		}

		fields := strings.Split(payload, ",")
		if len(fields) < 6 {
			continue
		}

		bdf := normalizeGPUAddress(strings.TrimSpace(fields[4]))
		serial := strings.TrimSpace(fields[5])
		if bdf == "" || serial == "" || strings.EqualFold(serial, "N/A") {
			continue
		}
		serialByBDF[bdf] = serial
	}

	return serialByBDF
}

func normalizeGPUAddress(addr string) string {
	addr = strings.TrimSpace(addr)
	if addr == "" {
		return ""
	}
	parts := strings.Split(addr, ":")
	if len(parts) != 3 {
		return strings.ToLower(addr)
	}

	domain := parts[0]
	bus := parts[1]
	devFn := parts[2]

	devFnParts := strings.Split(devFn, ".")
	if len(devFnParts) != 2 {
		return strings.ToLower(addr)
	}
	device := devFnParts[0]
	fn := devFnParts[1]

	if len(domain) == 8 {
		domain = domain[4:]
	}

	return strings.ToLower(domain + ":" + bus + ":" + device + "." + fn)
}
internal/parser/vendors/nvidia_bug_report/gpu_test.go (vendored, new file, 54 lines)
@@ -0,0 +1,54 @@
package nvidia_bug_report

import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/models"
)

func TestApplyGPUSerialNumbers_FromNvidiaSMI(t *testing.T) {
	content := `
/usr/bin/nvidia-smi --query
GPU 00000000:18:00.0
Serial Number : 1653925025827
GPU 00000000:2A:00.0
Serial Number : 1653925050608
`

	gpus := []models.GPU{
		{BDF: "0000:18:00.0"},
		{BDF: "0000:2a:00.0"},
	}

	applyGPUSerialNumbers(content, gpus)

	if gpus[0].SerialNumber != "1653925025827" {
		t.Fatalf("unexpected serial for gpu0: %q", gpus[0].SerialNumber)
	}
	if gpus[1].SerialNumber != "1653925050608" {
		t.Fatalf("unexpected serial for gpu1: %q", gpus[1].SerialNumber)
	}
}

func TestApplyGPUSerialNumbers_FromSummaryFallback(t *testing.T) {
	content := `
NVIDIA GPU Details | NVIDIA H200, 570.172.08, 143771 MiB, 96.00.D0.00.03, 00000000:18:00.0, 1653925025827
 | NVIDIA H200, 570.172.08, 143771 MiB, 96.00.D0.00.03, 00000000:2A:00.0, 1653925050608
NVIDIA Switch Details | No devices matching query 'Quantum'
`

	gpus := []models.GPU{
		{BDF: "0000:18:00.0"},
		{BDF: "0000:2a:00.0"},
	}

	applyGPUSerialNumbers(content, gpus)

	if gpus[0].SerialNumber != "1653925025827" {
		t.Fatalf("unexpected serial for gpu0: %q", gpus[0].SerialNumber)
	}
	if gpus[1].SerialNumber != "1653925050608" {
		t.Fatalf("unexpected serial for gpu1: %q", gpus[1].SerialNumber)
	}
}
@@ -3,14 +3,33 @@
package nvidia_bug_report

import (
	"fmt"
	"regexp"
	"strings"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

// parserVersion - version of this parser module
-const parserVersion = "1.0.0"
+const parserVersion = "1.2"

var bugReportDateLineRegex = regexp.MustCompile(`(?m)^Date:\s+(.+?)\s*$`)
var dateWithTZAbbrevRegex = regexp.MustCompile(`^([A-Za-z]{3}\s+[A-Za-z]{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+([A-Za-z]{2,5})\s+(\d{4})$`)

var timezoneAbbrevToOffset = map[string]string{
	"UTC": "+00:00",
	"GMT": "+00:00",
	"EST": "-05:00",
	"EDT": "-04:00",
	"CST": "-06:00",
	"CDT": "-05:00",
	"MST": "-07:00",
	"MDT": "-06:00",
	"PST": "-08:00",
	"PDT": "-07:00",
}

func init() {
	parser.Register(&Parser{})
@@ -81,6 +100,10 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er
	}

	content := string(files[0].Content)
	if collectedAt, tzOffset, ok := parseBugReportCollectedAt(content); ok {
		result.CollectedAt = collectedAt.UTC()
		result.SourceTimezone = tzOffset
	}

	// Parse system information
	parseSystemInfo(content, result)
@@ -105,3 +128,49 @@ func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, er

	return result, nil
}

func parseBugReportCollectedAt(content string) (time.Time, string, bool) {
	matches := bugReportDateLineRegex.FindStringSubmatch(content)
	if len(matches) != 2 {
		return time.Time{}, "", false
	}
	raw := strings.TrimSpace(matches[1])
	if raw == "" {
		return time.Time{}, "", false
	}

	if m := dateWithTZAbbrevRegex.FindStringSubmatch(raw); len(m) == 4 {
		if offset, ok := timezoneAbbrevToOffset[strings.ToUpper(strings.TrimSpace(m[2]))]; ok {
			layout := "Mon Jan 2 15:04:05 -07:00 2006"
			normalized := strings.TrimSpace(m[1]) + " " + offset + " " + strings.TrimSpace(m[3])
			if ts, err := time.Parse(layout, normalized); err == nil {
				return ts, offset, true
			}
		}
	}

	layouts := []string{
		"Mon Jan 2 15:04:05 MST 2006",
		"Mon Jan 2 15:04:05 2006",
	}
	for _, layout := range layouts {
		ts, err := time.Parse(layout, raw)
		if err != nil {
			continue
		}
		return ts, formatOffset(ts), true
	}
	return time.Time{}, "", false
}

func formatOffset(t time.Time) string {
	_, sec := t.Zone()
	sign := '+'
	if sec < 0 {
		sign = '-'
		sec = -sec
	}
	h := sec / 3600
	m := (sec % 3600) / 60
	return fmt.Sprintf("%c%02d:%02d", sign, h, m)
}
internal/parser/vendors/nvidia_bug_report/parser_test.go (vendored, new file, 54 lines)
@@ -0,0 +1,54 @@
package nvidia_bug_report

import (
	"testing"
	"time"

	"git.mchus.pro/mchus/logpile/internal/parser"
)

func TestParseBugReportCollectedAt(t *testing.T) {
	content := `
Start of NVIDIA bug report log file
Date: Fri Dec 12 10:14:49 EST 2025
`

	got, tz, ok := parseBugReportCollectedAt(content)
	if !ok {
		t.Fatalf("expected collected_at to be parsed")
	}
	if tz != "-05:00" {
		t.Fatalf("expected tz offset -05:00, got %q", tz)
	}
	wantUTC := time.Date(2025, 12, 12, 15, 14, 49, 0, time.UTC)
	if !got.UTC().Equal(wantUTC) {
		t.Fatalf("expected %s, got %s", wantUTC, got.UTC())
	}
}

func TestNvidiaBugReportParser_SetsCollectedAtAndTimezone(t *testing.T) {
	p := &Parser{}
	files := []parser.ExtractedFile{
		{
			Path: "nvidia-bug-report-1653925023938.log",
			Content: []byte(`
Start of NVIDIA bug report log file
nvidia-bug-report.sh Version: 34275561
Date: Fri Dec 12 10:14:49 EST 2025
`),
		},
	}

	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("parse failed: %v", err)
	}

	if result.SourceTimezone != "-05:00" {
		t.Fatalf("expected source timezone -05:00, got %q", result.SourceTimezone)
	}
	wantUTC := time.Date(2025, 12, 12, 15, 14, 49, 0, time.UTC)
	if !result.CollectedAt.Equal(wantUTC) {
		t.Fatalf("expected collected_at %s, got %s", wantUTC, result.CollectedAt)
	}
}
internal/parser/vendors/pciids/pci.ids (vendored, new file, 41507 lines)
File diff suppressed because it is too large.
222
internal/parser/vendors/pciids/pciids.go
vendored
222
internal/parser/vendors/pciids/pciids.go
vendored
@@ -1,12 +1,27 @@
|
||||
package pciids
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
_ "embed"
|
||||
"fmt"
|
||||
"os"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync"
|
||||
)
|
||||
|
||||
var (
|
||||
//go:embed pci.ids
|
||||
embeddedPCIIDs string
|
||||
|
||||
loadOnce sync.Once
|
||||
vendors map[int]string
|
||||
devices map[string]string
|
||||
)
|
||||
|
||||
// VendorName returns vendor name by PCI Vendor ID
|
||||
func VendorName(vendorID int) string {
|
||||
loadPCIIDs()
|
||||
if name, ok := vendors[vendorID]; ok {
|
||||
return name
|
||||
}
|
||||
@@ -15,6 +30,7 @@ func VendorName(vendorID int) string {
|
||||
|
||||
// DeviceName returns device name by Vendor ID and Device ID
|
||||
func DeviceName(vendorID, deviceID int) string {
|
||||
loadPCIIDs()
|
||||
key := fmt.Sprintf("%04x:%04x", vendorID, deviceID)
|
||||
if name, ok := devices[key]; ok {
|
||||
return name
|
||||
@@ -46,7 +62,6 @@ func VendorNameFromString(s string) string {
|
||||
} else if c >= 'a' && c <= 'f' {
|
||||
id = id*16 + int(c-'a'+10)
|
||||
} else {
|
||||
// Not a valid hex string, return original
|
||||
return ""
|
||||
}
|
||||
}
|
||||
@@ -54,124 +69,99 @@ func VendorNameFromString(s string) string {
|
||||
return VendorName(id)
|
||||
}
|
||||
|
||||
// Common PCI Vendor IDs
|
||||
// Source: https://pci-ids.ucw.cz/
|
||||
var vendors = map[int]string{
|
||||
// Storage controllers and SSDs
|
||||
0x1E0F: "KIOXIA",
|
||||
0x144D: "Samsung Electronics",
|
||||
0x1C5C: "SK Hynix",
|
||||
0x15B7: "SanDisk (Western Digital)",
|
||||
0x1179: "Toshiba",
|
||||
0x8086: "Intel",
|
||||
0x1344: "Micron Technology",
|
||||
0x126F: "Silicon Motion",
|
||||
0x1987: "Phison Electronics",
|
||||
0x1CC1: "ADATA Technology",
|
||||
0x2646: "Kingston Technology",
|
||||
0x1E95: "Solid State Storage Technology",
|
||||
0x025E: "Solidigm",
|
||||
0x1D97: "Shenzhen Longsys Electronics",
|
||||
0x1E4B: "MAXIO Technology",
|
||||
func loadPCIIDs() {
|
||||
loadOnce.Do(func() {
|
||||
vendors = make(map[int]string)
|
||||
devices = make(map[string]string)
|
||||
|
||||
// Network adapters
|
||||
0x15B3: "Mellanox Technologies",
|
||||
0x14E4: "Broadcom",
|
||||
0x10EC: "Realtek Semiconductor",
|
||||
0x1077: "QLogic",
|
||||
0x19A2: "Emulex",
|
||||
0x1137: "Cisco Systems",
|
||||
0x1924: "Solarflare Communications",
|
||||
0x177D: "Cavium",
|
||||
0x1D6A: "Aquantia",
|
||||
0x1FC9: "Tehuti Networks",
|
||||
0x18D4: "Chelsio Communications",
|
||||
parsePCIIDs(strings.NewReader(embeddedPCIIDs), vendors, devices)
|
||||
|
||||
// GPU / Graphics
|
||||
0x10DE: "NVIDIA",
|
||||
0x1002: "AMD/ATI",
|
||||
0x102B: "Matrox Electronics",
|
||||
0x1A03: "ASPEED Technology",
|
||||
|
||||
// Storage controllers (RAID/HBA)
|
||||
0x1000: "LSI Logic / Broadcom",
|
||||
0x9005: "Adaptec / Microsemi",
|
||||
0x1028: "Dell",
|
||||
0x103C: "Hewlett-Packard",
|
||||
0x17D3: "Areca Technology",
|
||||
0x1CC4: "Union Memory",
|
||||
|
||||
// Server vendors
|
||||
0x1014: "IBM",
|
||||
0x15D9: "Supermicro",
|
||||
0x8088: "Inspur",
|
||||
|
||||
// Other common
|
||||
0x1022: "AMD",
|
||||
0x1106: "VIA Technologies",
|
||||
0x10B5: "PLX Technology",
|
||||
0x1B21: "ASMedia Technology",
|
||||
0x1B4B: "Marvell Technology",
|
||||
0x197B: "JMicron Technology",
|
||||
for _, path := range candidatePCIIDsPaths() {
|
||||
f, err := os.Open(path)
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
parsePCIIDs(f, vendors, devices)
|
||||
_ = f.Close()
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
// Device IDs (vendor:device -> name)
|
||||
var devices = map[string]string{
|
||||
// NVIDIA GPUs (0x10DE)
|
||||
"10de:26b9": "L40S 48GB",
|
||||
"10de:26b1": "L40 48GB",
|
||||
"10de:2684": "RTX 4090",
|
||||
"10de:2704": "RTX 4080",
|
||||
"10de:2782": "RTX 4070 Ti",
|
||||
"10de:2786": "RTX 4070",
|
||||
"10de:27b8": "RTX 4060 Ti",
|
||||
"10de:2882": "RTX 4060",
|
||||
"10de:2204": "RTX 3090",
|
||||
"10de:2208": "RTX 3080 Ti",
|
||||
"10de:2206": "RTX 3080",
|
||||
"10de:2484": "RTX 3070",
|
||||
"10de:2503": "RTX 3060",
|
||||
"10de:20b0": "A100 80GB",
|
||||
"10de:20b2": "A100 40GB",
|
||||
"10de:20f1": "A10",
|
||||
"10de:2236": "A10G",
|
||||
"10de:25b6": "A16",
|
||||
"10de:20b5": "A30",
|
||||
"10de:20b7": "A30X",
|
||||
"10de:1db4": "V100 32GB",
|
||||
"10de:1db1": "V100 16GB",
|
||||
"10de:1e04": "RTX 2080 Ti",
|
||||
"10de:1e07": "RTX 2080",
|
||||
"10de:1f02": "RTX 2070",
|
||||
"10de:26ba": "L40S-PCIE-48G",
|
||||
"10de:2330": "H100 80GB PCIe",
|
||||
"10de:2331": "H100 80GB SXM5",
|
||||
"10de:2322": "H100 NVL",
|
||||
"10de:2324": "H200",
|
||||
func candidatePCIIDsPaths() []string {
|
||||
paths := []string{
|
||||
"pci.ids",
|
||||
"/usr/share/hwdata/pci.ids",
|
||||
"/usr/share/misc/pci.ids",
|
||||
"/opt/homebrew/share/pciids/pci.ids",
|
||||
}
|
||||
|
||||
// AMD GPUs (0x1002)
|
||||
"1002:744c": "Instinct MI250X",
|
||||
"1002:7408": "Instinct MI100",
|
||||
"1002:73a5": "RX 6950 XT",
|
||||
"1002:73bf": "RX 6900 XT",
|
||||
"1002:73df": "RX 6700 XT",
|
||||
	"1002:7480": "RX 7900 XTX",
	"1002:7483": "RX 7900 XT",

	// ASPEED (0x1A03) - BMC VGA
	"1a03:2000": "AST2500 VGA",
	"1a03:1150": "AST2600 VGA",

	// Intel GPUs
	"8086:56c0": "Data Center GPU Flex 170",
	"8086:56c1": "Data Center GPU Flex 140",

	// Mellanox/NVIDIA NICs (0x15B3)
	"15b3:1017": "ConnectX-5 100GbE",
	"15b3:1019": "ConnectX-5 Ex",
	"15b3:101b": "ConnectX-6",
	"15b3:101d": "ConnectX-6 Dx",
	"15b3:101f": "ConnectX-6 Lx",
	"15b3:1021": "ConnectX-7",
	"15b3:a2d6": "ConnectX-4 Lx",
	// Env paths have highest priority, so they are applied last.
	if env := strings.TrimSpace(os.Getenv("LOGPILE_PCI_IDS_PATH")); env != "" {
		for _, p := range strings.Split(env, string(os.PathListSeparator)) {
			p = strings.TrimSpace(p)
			if p != "" {
				paths = append(paths, p)
			}
		}
	}
	return paths
}

func parsePCIIDs(r interface{ Read([]byte) (int, error) }, outVendors map[int]string, outDevices map[string]string) {
	scanner := bufio.NewScanner(r)
	currentVendor := -1

	for scanner.Scan() {
		line := scanner.Text()
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}

		// Subdevice line (tab-tab) - ignored for now
		if strings.HasPrefix(line, "\t\t") {
			continue
		}

		// Device line
		if strings.HasPrefix(line, "\t") {
			if currentVendor < 0 {
				continue
			}
			trimmed := strings.TrimLeft(line, "\t")
			fields := strings.Fields(trimmed)
			if len(fields) < 2 {
				continue
			}
			deviceID, err := strconv.ParseInt(fields[0], 16, 32)
			if err != nil {
				continue
			}
			name := strings.TrimSpace(trimmed[len(fields[0]):])
			if name == "" {
				continue
			}
			key := fmt.Sprintf("%04x:%04x", currentVendor, int(deviceID))
			outDevices[key] = name
			continue
		}

		// Vendor line
		fields := strings.Fields(line)
		if len(fields) < 2 {
			currentVendor = -1
			continue
		}
		vendorID, err := strconv.ParseInt(fields[0], 16, 32)
		if err != nil {
			currentVendor = -1
			continue
		}
		name := strings.TrimSpace(line[len(fields[0]):])
		if name == "" {
			currentVendor = -1
			continue
		}
		currentVendor = int(vendorID)
		outVendors[currentVendor] = name
	}
}
38	internal/parser/vendors/pciids/pciids_external_test.go	vendored	Normal file
@@ -0,0 +1,38 @@
package pciids

import (
	"os"
	"path/filepath"
	"sync"
	"testing"
)

func TestExternalPCIIDsLookup(t *testing.T) {
	dir := t.TempDir()
	idsPath := filepath.Join(dir, "pci.ids")
	content := "" +
		"# sample\n" +
		"10de NVIDIA Corporation\n" +
		"\t233b NVIDIA H200 SXM\n" +
		"8086 Intel Corporation\n" +
		"\t1521 I350 Gigabit Network Connection\n"

	if err := os.WriteFile(idsPath, []byte(content), 0o644); err != nil {
		t.Fatalf("write pci.ids: %v", err)
	}

	t.Setenv("LOGPILE_PCI_IDS_PATH", idsPath)
	loadOnce = sync.Once{}
	vendors = nil
	devices = nil

	if got := DeviceName(0x10de, 0x233b); got != "NVIDIA H200 SXM" {
		t.Fatalf("expected external device name, got %q", got)
	}
	if got := VendorName(0x10de); got != "NVIDIA Corporation" {
		t.Fatalf("expected external vendor name, got %q", got)
	}
	if got := DeviceName(0x8086, 0x1521); got != "I350 Gigabit Network Connection" {
		t.Fatalf("expected external intel device name, got %q", got)
	}
}
133	internal/parser/vendors/supermicro/README.md	vendored
@@ -1,133 +0,0 @@
# SMC Crash Dump Parser

Parser for Supermicro (SMC) BMC crash dump archives.

## Supported servers

- Supermicro SYS-821GE-TNHR
- Other Supermicro servers with BMC crashdump functionality

## Archive format

The parser handles archives in the following formats:
- `.tgz` / `.tar.gz` (compressed tar)
- `.tar` (uncompressed tar)

## Recognized files

### Main files

1. **CDump.txt** - JSON file with crashdump data
   - Metadata (BMC, BIOS, ME firmware versions)
   - CPU information (CPUID, core count, microcode version, PPIN)
   - MCA (Machine Check Architecture) data - processor errors

## Extracted data

### Hardware Configuration

#### CPUs
```json
{
  "slot": "CPU0",
  "model": "CPUID: 0xc06f2",
  "cores": 56,
  "manufacturer": "Intel",
  "firmware": "Microcode: 0x210002b3"
}
```

### FRU Information

- BMC Firmware Version
- BIOS Version
- ME Firmware Version
- CPU PPIN (Protected Processor Inventory Number)

### Events

Events are created for:
- **Crashdump collection** - when the crashdump was collected
- **MCA Errors** - Machine Check Architecture errors
  - Corrected errors (Warning severity)
  - Uncorrected errors (Critical severity)

Severity levels:
- `info` - informational events (on-demand crashdump)
- `warning` - warnings (corrected MCA errors, reset detected)
- `critical` - critical errors (uncorrected MCA errors)

## Usage example

```bash
# Start the web interface
./logpile --file /path/to/CDump_090859_01302026.tgz

# The web interface will be available at http://localhost:8082
```

## Auto-detection

The parser automatically identifies SMC Crash Dump archives by the presence of:
- `CDump.txt` with the markers "crash_data", "METADATA", "bmc_fw_ver"

Confidence score:
- `CDump.txt` with crashdump markers: +80

## Versioning

**Current parser version:** 1.0.0

When modifying the parser logic, increment the version in the `parserVersion` constant in `parser.go`.

## Data examples

### Example CDump.txt (metadata)
```json
{
  "crash_data": {
    "METADATA": {
      "cpu0": {
        "cpuid": "0xc06f2",
        "core_count": "0x38",
        "ppin": "0xa3ccbe7d45026592",
        "ucode_patch_ver": "0x210002b3"
      },
      "bmc_fw_ver": "01.03.18",
      "bios_id": "BIOS Date: 08/04/2025 Rev 2.7",
      "me_fw_ver": "6.1.4.204",
      "timestamp": "2026-01-30T09:06:52Z",
      "trigger_type": "On-Demand"
    }
  }
}
```

### MCA Error Detection

The parser checks the MCA status registers for errors:
- Bit 63 (Valid) - valid error indicator
- Bit 61 (UC) - uncorrected error
- Bit 60 (EN) - error enabled

## Known limitations

1. The parser focuses on the data in `CDump.txt`
2. Detailed MCA error analysis is still simplified (only the status registers are checked)
3. TOR dump and other extended data are not parsed yet

## Development

### Adding new fields

1. Study the JSON structure in CDump.txt
2. Add fields to the `Metadata`, `CPUMetadata`, or `MCAData` structs
3. Update the parsing functions
4. Increment the parser version

### Extending MCA analysis

For more detailed MCA error analysis you can:
1. Add decoding of MCA error codes
2. Parse the MISC and ADDR registers
3. Add error correlation across banks
261	internal/parser/vendors/supermicro/crashdump.go	vendored
@@ -1,261 +0,0 @@
package supermicro

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
	"time"

	"git.mchus.pro/mchus/logpile/internal/models"
)

// CrashDumpData represents the structure of CDump.txt
type CrashDumpData struct {
	CrashData struct {
		METADATA   Metadata       `json:"METADATA"`
		PROCESSORS ProcessorsData `json:"PROCESSORS"`
	} `json:"crash_data"`
}

// ProcessorsData contains processor crash data
type ProcessorsData struct {
	Version string     `json:"_version"`
	CPU0    Processors `json:"cpu0"`
	CPU1    Processors `json:"cpu1"`
}

// Metadata contains crashdump metadata
type Metadata struct {
	CPU0          CPUMetadata `json:"cpu0"`
	CPU1          CPUMetadata `json:"cpu1"`
	BMCFWVer      string      `json:"bmc_fw_ver"`
	BIOSId        string      `json:"bios_id"`
	MEFWVer       string      `json:"me_fw_ver"`
	Timestamp     string      `json:"timestamp"`
	TriggerType   string      `json:"trigger_type"`
	PlatformName  string      `json:"platform_name"`
	CrashdumpVer  string      `json:"crashdump_ver"`
	ResetDetected string      `json:"_reset_detected"`
}

// CPUMetadata contains CPU metadata
type CPUMetadata struct {
	CPUID         string `json:"cpuid"`
	CoreMask      string `json:"core_mask"`
	CHACount      string `json:"cha_count"`
	CoreCount     string `json:"core_count"`
	PPIN          string `json:"ppin"`
	UcodePatchVer string `json:"ucode_patch_ver"`
}

// Processors contains processor crash data
type Processors struct {
	MCA MCAData `json:"MCA"`
}

// MCAData contains Machine Check Architecture data
type MCAData struct {
	Uncore map[string]interface{} `json:"uncore"`
}

// ParseCrashDump parses CDump.txt file
func ParseCrashDump(content []byte, result *models.AnalysisResult) error {
	var data CrashDumpData
	if err := json.Unmarshal(content, &data); err != nil {
		return fmt.Errorf("failed to parse CDump.txt: %w", err)
	}

	// Initialize Hardware.Firmware slice if nil
	if result.Hardware.Firmware == nil {
		result.Hardware.Firmware = make([]models.FirmwareInfo, 0)
	}

	// Parse metadata
	parseMetadata(&data.CrashData.METADATA, result)

	// Parse CPU information
	parseCPUInfo(&data.CrashData.METADATA, result)

	// Parse MCA errors
	parseMCAErrors(&data.CrashData, result)

	return nil
}

// parseMetadata extracts metadata information
func parseMetadata(metadata *Metadata, result *models.AnalysisResult) {
	// Store firmware versions in HardwareConfig.Firmware
	if metadata.BMCFWVer != "" {
		result.Hardware.Firmware = append(result.Hardware.Firmware, models.FirmwareInfo{
			DeviceName: "BMC",
			Version:    metadata.BMCFWVer,
		})
	}

	if metadata.BIOSId != "" {
		result.Hardware.Firmware = append(result.Hardware.Firmware, models.FirmwareInfo{
			DeviceName: "BIOS",
			Version:    metadata.BIOSId,
		})
	}

	if metadata.MEFWVer != "" {
		result.Hardware.Firmware = append(result.Hardware.Firmware, models.FirmwareInfo{
			DeviceName: "ME",
			Version:    metadata.MEFWVer,
		})
	}

	// Create event for crashdump trigger
	timestamp := time.Now()
	if metadata.Timestamp != "" {
		if t, err := time.Parse(time.RFC3339, metadata.Timestamp); err == nil {
			timestamp = t
		}
	}

	triggerType := metadata.TriggerType
	if triggerType == "" {
		triggerType = "Unknown"
	}

	severity := models.SeverityInfo
	if metadata.ResetDetected != "" && metadata.ResetDetected != "NONE" {
		severity = models.SeverityWarning
	}

	result.Events = append(result.Events, models.Event{
		Timestamp:   timestamp,
		Source:      "Crashdump",
		EventType:   "System Crashdump",
		Description: fmt.Sprintf("Crashdump collected (%s)", triggerType),
		Severity:    severity,
		RawData:     fmt.Sprintf("Version: %s, Reset: %s", metadata.CrashdumpVer, metadata.ResetDetected),
	})
}

// parseCPUInfo extracts CPU information
func parseCPUInfo(metadata *Metadata, result *models.AnalysisResult) {
	cpus := []struct {
		socket int
		data   CPUMetadata
	}{
		{0, metadata.CPU0},
		{1, metadata.CPU1},
	}

	for _, cpu := range cpus {
		if cpu.data.CPUID == "" {
			continue
		}

		// Parse core count
		coreCount := 0
		if cpu.data.CoreCount != "" {
			if count, err := strconv.ParseInt(strings.TrimPrefix(cpu.data.CoreCount, "0x"), 16, 64); err == nil {
				coreCount = int(count)
			}
		}

		cpuModel := models.CPU{
			Socket: cpu.socket,
			Model:  fmt.Sprintf("Intel CPU (CPUID: %s)", cpu.data.CPUID),
			Cores:  coreCount,
		}

		// Add PPIN
		if cpu.data.PPIN != "" && cpu.data.PPIN != "0x0" {
			cpuModel.PPIN = cpu.data.PPIN
		}

		result.Hardware.CPUs = append(result.Hardware.CPUs, cpuModel)

		// Add microcode version to firmware list
		if cpu.data.UcodePatchVer != "" {
			result.Hardware.Firmware = append(result.Hardware.Firmware, models.FirmwareInfo{
				DeviceName: fmt.Sprintf("CPU%d Microcode", cpu.socket),
				Version:    cpu.data.UcodePatchVer,
			})
		}
	}
}

// parseMCAErrors extracts Machine Check Architecture errors
func parseMCAErrors(crashData *struct {
	METADATA   Metadata       `json:"METADATA"`
	PROCESSORS ProcessorsData `json:"PROCESSORS"`
}, result *models.AnalysisResult) {
	timestamp := time.Now()
	if crashData.METADATA.Timestamp != "" {
		if t, err := time.Parse(time.RFC3339, crashData.METADATA.Timestamp); err == nil {
			timestamp = t
		}
	}

	// Parse each CPU's MCA data
	cpuProcs := []struct {
		name string
		data Processors
	}{
		{"cpu0", crashData.PROCESSORS.CPU0},
		{"cpu1", crashData.PROCESSORS.CPU1},
	}

	for _, cpu := range cpuProcs {
		if cpu.data.MCA.Uncore == nil {
			continue
		}

		// Check each MCA bank for errors
		for bankName, bankDataRaw := range cpu.data.MCA.Uncore {
			bankData, ok := bankDataRaw.(map[string]interface{})
			if !ok {
				continue
			}

			// Look for status register
			statusKey := strings.ToLower(bankName) + "_status"
			statusRaw, ok := bankData[statusKey]
			if !ok {
				continue
			}

			statusStr, ok := statusRaw.(string)
			if !ok {
				continue
			}

			// Parse status value
			status, err := strconv.ParseUint(strings.TrimPrefix(statusStr, "0x"), 16, 64)
			if err != nil {
				continue
			}

			// Check if MCA error is valid (bit 63 = Valid)
			if status&(1<<63) != 0 {
				// MCA error detected
				severity := models.SeverityWarning
				if status&(1<<61) != 0 { // UC bit = uncorrected error
					severity = models.SeverityCritical
				}

				description := fmt.Sprintf("MCA Error in %s bank %s", cpu.name, bankName)
				if status&(1<<61) != 0 {
					description += " (Uncorrected)"
				} else {
					description += " (Corrected)"
				}

				result.Events = append(result.Events, models.Event{
					Timestamp:   timestamp,
					Source:      "MCA",
					EventType:   "Machine Check",
					Description: description,
					Severity:    severity,
					RawData:     fmt.Sprintf("Status: %s, CPU: %s, Bank: %s", statusStr, cpu.name, bankName),
				})
			}
		}
	}
}
98	internal/parser/vendors/supermicro/parser.go	vendored
@@ -1,98 +0,0 @@
// Package supermicro provides parser for Supermicro BMC crashdump archives
// Tested with: Supermicro SYS-821GE-TNHR (Crashdump format)
//
// IMPORTANT: Increment parserVersion when modifying parser logic!
// This helps track which version was used to parse specific logs.
package supermicro

import (
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
	"git.mchus.pro/mchus/logpile/internal/parser"
)

// parserVersion - version of this parser module
// IMPORTANT: Increment this version when making changes to parser logic!
const parserVersion = "1.0.0"

func init() {
	parser.Register(&Parser{})
}

// Parser implements VendorParser for Supermicro servers
type Parser struct{}

// Name returns human-readable parser name
func (p *Parser) Name() string {
	return "SMC Crash Dump Parser"
}

// Vendor returns vendor identifier
func (p *Parser) Vendor() string {
	return "supermicro"
}

// Version returns parser version
// IMPORTANT: Update parserVersion constant when modifying parser logic!
func (p *Parser) Version() string {
	return parserVersion
}

// Detect checks if archive matches Supermicro crashdump format
// Returns confidence 0-100
func (p *Parser) Detect(files []parser.ExtractedFile) int {
	confidence := 0

	for _, f := range files {
		path := strings.ToLower(f.Path)

		// Strong indicator for Supermicro Crashdump format
		if strings.HasSuffix(path, "cdump.txt") {
			// Check if it's really Supermicro crashdump format
			if containsCrashdumpMarkers(f.Content) {
				confidence += 80
			}
		}

		// Cap at 100
		if confidence >= 100 {
			return 100
		}
	}

	return confidence
}

// containsCrashdumpMarkers checks if content has Supermicro crashdump markers
func containsCrashdumpMarkers(content []byte) bool {
	s := string(content)
	// Check for typical Supermicro Crashdump structure
	return strings.Contains(s, "crash_data") &&
		strings.Contains(s, "METADATA") &&
		(strings.Contains(s, "bmc_fw_ver") || strings.Contains(s, "crashdump_ver"))
}

// Parse parses Supermicro crashdump archive
func (p *Parser) Parse(files []parser.ExtractedFile) (*models.AnalysisResult, error) {
	result := &models.AnalysisResult{
		Events:  make([]models.Event, 0),
		FRU:     make([]models.FRUInfo, 0),
		Sensors: make([]models.SensorReading, 0),
	}

	// Initialize hardware config
	result.Hardware = &models.HardwareConfig{
		CPUs: make([]models.CPU, 0),
	}

	// Parse CDump.txt (JSON crashdump)
	if f := parser.FindFileByName(files, "CDump.txt"); f != nil {
		if err := ParseCrashDump(f.Content, result); err != nil {
			// Log error but continue parsing other files
			_ = err // Ignore error for now
		}
	}

	return result, nil
}
1040	internal/parser/vendors/unraid/parser.go	vendored	Normal file
File diff suppressed because it is too large
393	internal/parser/vendors/unraid/parser_test.go	vendored	Normal file
@@ -0,0 +1,393 @@
package unraid

import (
	"testing"

	"git.mchus.pro/mchus/logpile/internal/parser"
)

func TestDetect(t *testing.T) {
	tests := []struct {
		name       string
		files      []parser.ExtractedFile
		wantMin    int
		wantMax    int
		shouldFind bool
	}{
		{
			name: "typical unraid diagnostics",
			files: []parser.ExtractedFile{
				{
					Path:    "box3-diagnostics-20260205-2333/unraid-7.2.0.txt",
					Content: []byte("7.2.0\n"),
				},
				{
					Path:    "box3-diagnostics-20260205-2333/system/vars.txt",
					Content: []byte("[parity] => Array\n[disk1] => Array\n"),
				},
			},
			wantMin:    50,
			wantMax:    100,
			shouldFind: true,
		},
		{
			name: "unraid with kernel marker",
			files: []parser.ExtractedFile{
				{
					Path:    "diagnostics/system/lscpu.txt",
					Content: []byte("Unraid kernel build 6.12.54"),
				},
			},
			wantMin:    50,
			wantMax:    100,
			shouldFind: true,
		},
		{
			name: "not unraid",
			files: []parser.ExtractedFile{
				{
					Path:    "some/random/file.txt",
					Content: []byte("just some random content"),
				},
			},
			wantMin:    0,
			wantMax:    0,
			shouldFind: false,
		},
	}

	p := &Parser{}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := p.Detect(tt.files)

			if tt.shouldFind && got < tt.wantMin {
				t.Errorf("Detect() = %v, want at least %v", got, tt.wantMin)
			}

			if got > tt.wantMax {
				t.Errorf("Detect() = %v, want at most %v", got, tt.wantMax)
			}

			if !tt.shouldFind && got > 0 {
				t.Errorf("Detect() = %v, want 0 (should not detect)", got)
			}
		})
	}
}

func TestParse_Version(t *testing.T) {
	files := []parser.ExtractedFile{
		{
			Path:    "unraid-7.2.0.txt",
			Content: []byte("7.2.0\n"),
		},
	}

	p := &Parser{}
	result, err := p.Parse(files)

	if err != nil {
		t.Fatalf("Parse() error = %v", err)
	}

	if len(result.Hardware.Firmware) == 0 {
		t.Fatal("expected firmware info")
	}

	fw := result.Hardware.Firmware[0]
	if fw.DeviceName != "Unraid OS" {
		t.Errorf("DeviceName = %v, want 'Unraid OS'", fw.DeviceName)
	}

	if fw.Version != "7.2.0" {
		t.Errorf("Version = %v, want '7.2.0'", fw.Version)
	}
}

func TestParse_CPU(t *testing.T) {
	lscpuContent := `Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
CPU(s): 16
Model name: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
Core(s) per socket: 8
Socket(s): 1
CPU max MHz: 3400.0000
`

	files := []parser.ExtractedFile{
		{
			Path:    "diagnostics/system/lscpu.txt",
			Content: []byte(lscpuContent),
		},
	}

	p := &Parser{}
	result, err := p.Parse(files)

	if err != nil {
		t.Fatalf("Parse() error = %v", err)
	}

	if len(result.Hardware.CPUs) == 0 {
		t.Fatal("expected CPU info")
	}

	cpu := result.Hardware.CPUs[0]
	if cpu.Model != "Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz" {
		t.Errorf("Model = %v", cpu.Model)
	}

	if cpu.Cores != 8 {
		t.Errorf("Cores = %v, want 8", cpu.Cores)
	}

	if cpu.Threads != 16 {
		t.Errorf("Threads = %v, want 16", cpu.Threads)
	}

	if cpu.FrequencyMHz != 3400 {
		t.Errorf("FrequencyMHz = %v, want 3400", cpu.FrequencyMHz)
	}
}

func TestParse_Memory(t *testing.T) {
	memContent := ` total used free shared buff/cache available
Mem: 50Gi 11Gi 1.4Gi 565Mi 39Gi 39Gi
Swap: 0B 0B 0B
Total: 50Gi 11Gi 1.4Gi
`

	files := []parser.ExtractedFile{
		{
			Path:    "diagnostics/system/memory.txt",
			Content: []byte(memContent),
		},
	}

	p := &Parser{}
	result, err := p.Parse(files)

	if err != nil {
		t.Fatalf("Parse() error = %v", err)
	}

	if len(result.Hardware.Memory) == 0 {
		t.Fatal("expected memory info")
	}

	mem := result.Hardware.Memory[0]
	expectedSizeMB := 50 * 1024 // 50 GiB in MB

	if mem.SizeMB != expectedSizeMB {
		t.Errorf("SizeMB = %v, want %v", mem.SizeMB, expectedSizeMB)
	}

	if mem.Type != "DRAM" {
		t.Errorf("Type = %v, want 'DRAM'", mem.Type)
	}
}

func TestParse_SMART(t *testing.T) {
	smartContent := `smartctl 7.5 2025-04-30 r5714 [x86_64-linux-6.12.54-Unraid] (local build)
Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: ST4000NM000B-2TF100
Serial Number: WX103EC9
LU WWN Device Id: 5 000c50 0ed59db60
Firmware Version: TNA1
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
`

	files := []parser.ExtractedFile{
		{
			Path:    "diagnostics/smart/ST4000NM000B-2TF100_WX103EC9-20260205-2333 disk1 (sdi).txt",
			Content: []byte(smartContent),
		},
	}

	p := &Parser{}
	result, err := p.Parse(files)

	if err != nil {
		t.Fatalf("Parse() error = %v", err)
	}

	if len(result.Hardware.Storage) == 0 {
		t.Fatal("expected storage info")
	}

	disk := result.Hardware.Storage[0]

	if disk.Model != "ST4000NM000B-2TF100" {
		t.Errorf("Model = %v, want 'ST4000NM000B-2TF100'", disk.Model)
	}

	if disk.SerialNumber != "WX103EC9" {
		t.Errorf("SerialNumber = %v, want 'WX103EC9'", disk.SerialNumber)
	}

	if disk.Firmware != "TNA1" {
		t.Errorf("Firmware = %v, want 'TNA1'", disk.Firmware)
	}

	if disk.SizeGB != 4000 {
		t.Errorf("SizeGB = %v, want 4000", disk.SizeGB)
	}

	if disk.Type != "hdd" {
		t.Errorf("Type = %v, want 'hdd'", disk.Type)
	}

	// Check that no health warnings were generated (PASSED health)
	healthWarnings := 0
	for _, event := range result.Events {
		if event.EventType == "Disk Health" && event.Severity == "warning" {
			healthWarnings++
		}
	}
	if healthWarnings != 0 {
		t.Errorf("Expected no health warnings for PASSED disk, got %v", healthWarnings)
	}
}

func TestParser_Metadata(t *testing.T) {
	p := &Parser{}

	if p.Name() != "Unraid Parser" {
		t.Errorf("Name() = %v, want 'Unraid Parser'", p.Name())
	}

	if p.Vendor() != "unraid" {
		t.Errorf("Vendor() = %v, want 'unraid'", p.Vendor())
	}

	if p.Version() == "" {
		t.Error("Version() should not be empty")
	}
}

func TestParse_MemoryDIMMsFromMeminfo(t *testing.T) {
	memInfo := `MemTotal: 53393436 kB

Handle 0x002D, DMI type 17, 34 bytes
Memory Device
Size: 16 GB
Locator: Node0_Dimm1
Bank Locator: Node0_Bank0
Type: DDR3
Speed: 1333 MT/s
Manufacturer: Samsung
Serial Number: 238F7649
Part Number: M393B2G70BH0-
Rank: 4
Configured Memory Speed: 1333 MT/s

Handle 0x002F, DMI type 17, 34 bytes
Memory Device
Size: No Module Installed
Locator: Node0_Dimm2
`

	files := []parser.ExtractedFile{
		{Path: "diagnostics/system/memory.txt", Content: []byte("Mem: 50Gi")},
		{Path: "diagnostics/system/meminfo.txt", Content: []byte(memInfo)},
	}

	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse() error = %v", err)
	}

	if got := len(result.Hardware.Memory); got != 1 {
		t.Fatalf("expected only installed DIMM entries, got %d entries", got)
	}
	dimm := result.Hardware.Memory[0]
	if dimm.Slot != "Node0_Dimm1" {
		t.Errorf("Slot = %q, want Node0_Dimm1", dimm.Slot)
	}
	if dimm.SizeMB != 16*1024 {
		t.Errorf("SizeMB = %d, want %d", dimm.SizeMB, 16*1024)
	}
	if dimm.Type != "DDR3" {
		t.Errorf("Type = %q, want DDR3", dimm.Type)
	}
	if dimm.SerialNumber != "238F7649" {
		t.Errorf("SerialNumber = %q, want 238F7649", dimm.SerialNumber)
	}
}

func TestParse_NetworkAndPCIeFromLSPCIAndEthtool(t *testing.T) {
	lspci := `03:00.0 SCSI storage controller [0100]: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
07:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 06)
`
	ethtool := `Settings for eth0:
Speed: 1000Mb/s
Link detected: yes
driver: r8168
firmware-version:
bus-info: 0000:07:00.0
--------------------------------
`
	files := []parser.ExtractedFile{
		{Path: "diagnostics/system/lspci.txt", Content: []byte(lspci)},
		{Path: "diagnostics/system/ethtool.txt", Content: []byte(ethtool)},
	}

	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse() error = %v", err)
	}

	if len(result.Hardware.NetworkAdapters) != 1 {
		t.Fatalf("expected 1 network adapter, got %d", len(result.Hardware.NetworkAdapters))
	}
	nic := result.Hardware.NetworkAdapters[0]
	if nic.Location != "0000:07:00.0" {
		t.Errorf("Location = %q, want 0000:07:00.0", nic.Location)
	}
	if nic.Model == "" {
		t.Error("Model should not be empty")
	}
	if nic.Vendor == "" {
		t.Error("Vendor should not be empty")
	}

	if len(result.Hardware.PCIeDevices) < 2 {
		t.Fatalf("expected at least 2 PCIe devices, got %d", len(result.Hardware.PCIeDevices))
	}
}

func TestParse_HostSerialFallbackFromVarsUUID(t *testing.T) {
	vars := ` [flashGUID] => 1...
 [regGUID] => 1...7
 [uuid] => 2713440667722491190
`
	files := []parser.ExtractedFile{
		{Path: "diagnostics/system/vars.txt", Content: []byte(vars)},
	}

	p := &Parser{}
	result, err := p.Parse(files)
	if err != nil {
		t.Fatalf("Parse() error = %v", err)
	}

	if result.Hardware.BoardInfo.SerialNumber != "2713440667722491190" {
		t.Fatalf("BoardInfo.SerialNumber = %q, want vars uuid", result.Hardware.BoardInfo.SerialNumber)
	}
	if result.Hardware.BoardInfo.UUID != "2713440667722491190" {
		t.Fatalf("BoardInfo.UUID = %q, want vars uuid", result.Hardware.BoardInfo.UUID)
	}
}
6	internal/parser/vendors/vendors.go	vendored
@@ -4,17 +4,17 @@ package vendors
|
import (
	// Import vendor modules to trigger their init() registration
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/dell"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/h3c"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/nvidia"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/nvidia_bug_report"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/supermicro"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/unraid"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/xigmanas"

	// Generic fallback parser (must be last for lowest priority)
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/generic"

	// Future vendors:
	// _ "git.mchus.pro/mchus/logpile/internal/parser/vendors/dell"
	// _ "git.mchus.pro/mchus/logpile/internal/parser/vendors/hpe"
	// _ "git.mchus.pro/mchus/logpile/internal/parser/vendors/lenovo"
)
46	internal/parser/vendors/xigmanas/README.md	vendored
@@ -1,46 +0,0 @@
# Xigmanas Parser

Parser for Xigmanas (FreeBSD-based NAS) system logs.

## Supported Files

- `xigmanas` - Main system log file with configuration and status information
- `dmesg` - Kernel messages and hardware initialization information
- SMART data from disk monitoring

## Features

This parser extracts the following information from Xigmanas logs:

### System Information
- Firmware version
- System uptime
- CPU model and specifications
- Memory configuration
- Hardware platform information

### Storage Information
- Disk models and serial numbers
- Disk capacity and health status
- SMART temperature readings

### Hardware Configuration
- CPU information
- Memory modules
- Storage devices

## Detection Logic

The parser detects Xigmanas format by looking for:
- Files with "xigmanas", "system", or "dmesg" in their names
- Content containing "XigmaNAS" or "FreeBSD" strings
- SMART-related information in log content

## Example Output

The parser populates the following fields in AnalysisResult:
- `Hardware.Firmware` - Firmware versions
- `Hardware.CPUs` - CPU information
- `Hardware.Memory` - Memory configuration
- `Hardware.Storage` - Storage devices with SMART data
- `Sensors` - Temperature readings from SMART data
4 internal/parser/vendors/xigmanas/parser.go vendored
@@ -12,7 +12,7 @@ import (
 )

 // parserVersion - increment when parsing logic changes.
-const parserVersion = "2.1.0"
+const parserVersion = "2.2"

 func init() {
 	parser.Register(&Parser{})
@@ -431,7 +431,7 @@ func parseEventTimestamp(line string) time.Time {
 	prefixRe := regexp.MustCompile(`^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}`)
 	if prefix := prefixRe.FindString(line); prefix != "" {
 		year := time.Now().Year()
-		if ts, err := time.Parse("Jan 2 15:04:05 2006", prefix+" "+strconv.Itoa(year)); err == nil {
+		if ts, err := parser.ParseInDefaultArchiveLocation("Jan 2 15:04:05 2006", prefix+" "+strconv.Itoa(year)); err == nil {
 			return ts
 		}
 	}
779 internal/server/device_repository.go Normal file
@@ -0,0 +1,779 @@
package server

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"

	"git.mchus.pro/mchus/logpile/internal/models"
)

type slotFirmwareInfo struct {
	Model    string
	Version  string
	Category string
}

var (
	psuFirmwareRe = regexp.MustCompile(`(?i)^PSU\s*([0-9A-Za-z_-]+)\s*(?:\(([^)]+)\))?$`)
	nicFirmwareRe = regexp.MustCompile(`(?i)^NIC\s+([^()]+?)\s*(?:\(([^)]+)\))?$`)
	gpuFirmwareRe = regexp.MustCompile(`(?i)^GPU\s+([^()]+?)\s*(?:\(([^)]+)\))?$`)
	nvsFirmwareRe = regexp.MustCompile(`(?i)^NVSwitch\s+([^()]+?)\s*(?:\(([^)]+)\))?$`)
)
func BuildHardwareDevices(hw *models.HardwareConfig) []models.HardwareDevice {
	if hw == nil {
		return nil
	}

	all := make([]models.HardwareDevice, 0, 1+len(hw.CPUs)+len(hw.Memory)+len(hw.Storage)+len(hw.PCIeDevices)+len(hw.GPUs)+len(hw.NetworkAdapters)+len(hw.PowerSupply))
	fwBySlot := buildFirmwareBySlot(hw.Firmware)
	nextID := 0
	add := func(d models.HardwareDevice) {
		d.ID = fmt.Sprintf("%s:%d", d.Kind, nextID)
		nextID++
		all = append(all, d)
	}

	add(models.HardwareDevice{
		Kind:         models.DeviceKindBoard,
		Source:       "board",
		Slot:         "board",
		Model:        strings.TrimSpace(hw.BoardInfo.ProductName),
		PartNumber:   strings.TrimSpace(hw.BoardInfo.PartNumber),
		Manufacturer: strings.TrimSpace(hw.BoardInfo.Manufacturer),
		SerialNumber: strings.TrimSpace(hw.BoardInfo.SerialNumber),
		Details: map[string]any{
			"description": strings.TrimSpace(hw.BoardInfo.Description),
			"version":     strings.TrimSpace(hw.BoardInfo.Version),
			"uuid":        strings.TrimSpace(hw.BoardInfo.UUID),
		},
	})
	for _, cpu := range hw.CPUs {
		add(models.HardwareDevice{
			Kind:             models.DeviceKindCPU,
			Source:           "cpus",
			Slot:             fmt.Sprintf("CPU%d", cpu.Socket),
			Model:            cpu.Model,
			SerialNumber:     cpu.SerialNumber,
			Cores:            cpu.Cores,
			Threads:          cpu.Threads,
			FrequencyMHz:     cpu.FrequencyMHz,
			MaxFreqMHz:       cpu.MaxFreqMHz,
			Status:           cpu.Status,
			StatusCheckedAt:  cpu.StatusCheckedAt,
			StatusChangedAt:  cpu.StatusChangedAt,
			StatusAtCollect:  cpu.StatusAtCollect,
			StatusHistory:    cpu.StatusHistory,
			ErrorDescription: cpu.ErrorDescription,
			Details: map[string]any{
				"description": cpu.Description,
				"socket":      cpu.Socket,
				"l1_cache_kb": cpu.L1CacheKB,
				"l2_cache_kb": cpu.L2CacheKB,
				"l3_cache_kb": cpu.L3CacheKB,
				"tdp_w":       cpu.TDP,
				"ppin":        cpu.PPIN,
			},
		})
	}
	for _, mem := range hw.Memory {
		if !mem.Present || mem.SizeMB == 0 {
			continue
		}
		present := mem.Present
		add(models.HardwareDevice{
			Kind:             models.DeviceKindMemory,
			Source:           "memory",
			Slot:             mem.Slot,
			Location:         mem.Location,
			Manufacturer:     mem.Manufacturer,
			SerialNumber:     mem.SerialNumber,
			PartNumber:       mem.PartNumber,
			Type:             mem.Type,
			Present:          &present,
			SizeMB:           mem.SizeMB,
			Status:           mem.Status,
			StatusCheckedAt:  mem.StatusCheckedAt,
			StatusChangedAt:  mem.StatusChangedAt,
			StatusAtCollect:  mem.StatusAtCollect,
			StatusHistory:    mem.StatusHistory,
			ErrorDescription: mem.ErrorDescription,
			Details: map[string]any{
				"description":       mem.Description,
				"technology":        mem.Technology,
				"max_speed_mhz":     mem.MaxSpeedMHz,
				"current_speed_mhz": mem.CurrentSpeedMHz,
				"ranks":             mem.Ranks,
			},
		})
	}
	for _, stor := range hw.Storage {
		if !stor.Present {
			continue
		}
		present := stor.Present
		add(models.HardwareDevice{
			Kind:             models.DeviceKindStorage,
			Source:           "storage",
			Slot:             stor.Slot,
			Location:         stor.Location,
			Model:            stor.Model,
			Manufacturer:     stor.Manufacturer,
			SerialNumber:     stor.SerialNumber,
			Firmware:         stor.Firmware,
			Type:             stor.Type,
			Interface:        stor.Interface,
			Present:          &present,
			SizeGB:           stor.SizeGB,
			Status:           stor.Status,
			StatusCheckedAt:  stor.StatusCheckedAt,
			StatusChangedAt:  stor.StatusChangedAt,
			StatusAtCollect:  stor.StatusAtCollect,
			StatusHistory:    stor.StatusHistory,
			ErrorDescription: stor.ErrorDescription,
			Details: map[string]any{
				"description":  stor.Description,
				"backplane_id": stor.BackplaneID,
			},
		})
	}
	for _, p := range hw.PCIeDevices {
		if isEmptyPCIeDevice(p) {
			continue
		}
		slotKey := normalizeSlotKey(p.Slot)
		fwInfo := fwBySlot[slotKey]
		model := strings.TrimSpace(p.PartNumber)
		if model == "" {
			model = strings.TrimSpace(p.DeviceClass)
		}
		if model == "" {
			model = strings.TrimSpace(p.Description)
		}
		if model == "" && fwInfo.Model != "" {
			model = fwInfo.Model
		}
		add(models.HardwareDevice{
			Kind:             models.DeviceKindPCIe,
			Source:           "pcie_devices",
			Slot:             p.Slot,
			BDF:              p.BDF,
			DeviceClass:      p.DeviceClass,
			VendorID:         p.VendorID,
			DeviceID:         p.DeviceID,
			Model:            model,
			PartNumber:       p.PartNumber,
			Manufacturer:     p.Manufacturer,
			SerialNumber:     p.SerialNumber,
			Firmware:         fwInfo.Version,
			MACAddresses:     p.MACAddresses,
			LinkWidth:        p.LinkWidth,
			LinkSpeed:        p.LinkSpeed,
			MaxLinkWidth:     p.MaxLinkWidth,
			MaxLinkSpeed:     p.MaxLinkSpeed,
			Status:           p.Status,
			StatusCheckedAt:  p.StatusCheckedAt,
			StatusChangedAt:  p.StatusChangedAt,
			StatusAtCollect:  p.StatusAtCollect,
			StatusHistory:    p.StatusHistory,
			ErrorDescription: p.ErrorDescription,
			Details: map[string]any{
				"description": p.Description,
				"fw_category": fwInfo.Category,
			},
		})
	}
	for _, gpu := range hw.GPUs {
		add(models.HardwareDevice{
			Kind:             models.DeviceKindGPU,
			Source:           "gpus",
			Slot:             gpu.Slot,
			Location:         gpu.Location,
			BDF:              gpu.BDF,
			DeviceClass:      "DisplayController",
			VendorID:         gpu.VendorID,
			DeviceID:         gpu.DeviceID,
			Model:            gpu.Model,
			PartNumber:       gpu.PartNumber,
			Manufacturer:     gpu.Manufacturer,
			SerialNumber:     gpu.SerialNumber,
			Firmware:         gpu.Firmware,
			LinkWidth:        gpu.CurrentLinkWidth,
			LinkSpeed:        gpu.CurrentLinkSpeed,
			MaxLinkWidth:     gpu.MaxLinkWidth,
			MaxLinkSpeed:     gpu.MaxLinkSpeed,
			Status:           gpu.Status,
			StatusCheckedAt:  gpu.StatusCheckedAt,
			StatusChangedAt:  gpu.StatusChangedAt,
			StatusAtCollect:  gpu.StatusAtCollect,
			StatusHistory:    gpu.StatusHistory,
			ErrorDescription: gpu.ErrorDescription,
			Details: map[string]any{
				"description":     gpu.Description,
				"uuid":            gpu.UUID,
				"video_bios":      gpu.VideoBIOS,
				"irq":             gpu.IRQ,
				"bus_type":        gpu.BusType,
				"dma_size":        gpu.DMASize,
				"dma_mask":        gpu.DMAMask,
				"device_minor":    gpu.DeviceMinor,
				"temperature":     gpu.Temperature,
				"mem_temperature": gpu.MemTemperature,
				"power":           gpu.Power,
				"max_power":       gpu.MaxPower,
				"clock_speed":     gpu.ClockSpeed,
			},
		})
	}
	for _, nic := range hw.NetworkAdapters {
		if !nic.Present {
			continue
		}
		present := nic.Present
		add(models.HardwareDevice{
			Kind:             models.DeviceKindNetwork,
			Source:           "network_adapters",
			Slot:             nic.Slot,
			Location:         nic.Location,
			VendorID:         nic.VendorID,
			DeviceID:         nic.DeviceID,
			Model:            nic.Model,
			PartNumber:       nic.PartNumber,
			Manufacturer:     nic.Vendor,
			SerialNumber:     nic.SerialNumber,
			Firmware:         nic.Firmware,
			PortCount:        nic.PortCount,
			PortType:         nic.PortType,
			MACAddresses:     nic.MACAddresses,
			Present:          &present,
			Status:           nic.Status,
			StatusCheckedAt:  nic.StatusCheckedAt,
			StatusChangedAt:  nic.StatusChangedAt,
			StatusAtCollect:  nic.StatusAtCollect,
			StatusHistory:    nic.StatusHistory,
			ErrorDescription: nic.ErrorDescription,
			Details: map[string]any{
				"description": nic.Description,
			},
		})
	}
	for _, psu := range hw.PowerSupply {
		if !psu.Present {
			continue
		}
		present := psu.Present
		add(models.HardwareDevice{
			Kind:             models.DeviceKindPSU,
			Source:           "power_supplies",
			Slot:             psu.Slot,
			Model:            psu.Model,
			PartNumber:       psu.PartNumber,
			Manufacturer:     psu.Vendor,
			SerialNumber:     psu.SerialNumber,
			Firmware:         psu.Firmware,
			Present:          &present,
			WattageW:         psu.WattageW,
			InputType:        psu.InputType,
			InputPowerW:      psu.InputPowerW,
			OutputPowerW:     psu.OutputPowerW,
			InputVoltage:     psu.InputVoltage,
			TemperatureC:     psu.TemperatureC,
			Status:           psu.Status,
			StatusCheckedAt:  psu.StatusCheckedAt,
			StatusChangedAt:  psu.StatusChangedAt,
			StatusAtCollect:  psu.StatusAtCollect,
			StatusHistory:    psu.StatusHistory,
			ErrorDescription: psu.ErrorDescription,
			Details: map[string]any{
				"description":    psu.Description,
				"output_voltage": psu.OutputVoltage,
			},
		})
	}

	return annotateDuplicateSerials(dedupeDevices(all))
}
func isEmptyPCIeDevice(p models.PCIeDevice) bool {
	if isNumericSlot(strings.TrimSpace(p.Slot)) &&
		strings.TrimSpace(p.BDF) == "" &&
		p.VendorID == 0 &&
		p.DeviceID == 0 &&
		normalizedSerial(p.SerialNumber) == "" &&
		!hasMeaningfulText(p.PartNumber) &&
		!hasMeaningfulText(p.Manufacturer) &&
		!hasMeaningfulText(p.Description) &&
		len(p.MACAddresses) == 0 &&
		p.LinkWidth == 0 &&
		p.MaxLinkWidth == 0 {
		return true
	}

	if strings.TrimSpace(p.BDF) != "" {
		return false
	}
	if p.VendorID != 0 || p.DeviceID != 0 {
		return false
	}
	if normalizedSerial(p.SerialNumber) != "" {
		return false
	}
	if hasMeaningfulText(p.PartNumber) {
		return false
	}
	if hasMeaningfulText(p.Manufacturer) {
		return false
	}
	if hasMeaningfulText(p.Description) {
		return false
	}
	if strings.TrimSpace(p.DeviceClass) != "" {
		class := strings.ToLower(strings.TrimSpace(p.DeviceClass))
		if class != "unknown" && class != "other" && class != "pcie device" {
			return false
		}
	}
	return true
}
func isNumericSlot(slot string) bool {
	if slot == "" {
		return false
	}
	for _, r := range slot {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

func hasMeaningfulText(v string) bool {
	s := strings.ToLower(strings.TrimSpace(v))
	if s == "" {
		return false
	}
	switch s {
	case "-", "n/a", "na", "none", "null", "unknown":
		return false
	default:
		return true
	}
}
func dedupeDevices(items []models.HardwareDevice) []models.HardwareDevice {
	if len(items) < 2 {
		return items
	}
	parent := make([]int, len(items))
	for i := range parent {
		parent[i] = i
	}
	find := func(x int) int {
		for parent[x] != x {
			parent[x] = parent[parent[x]]
			x = parent[x]
		}
		return x
	}
	union := func(a, b int) {
		ra := find(a)
		rb := find(b)
		if ra != rb {
			parent[rb] = ra
		}
	}

	for i := 0; i < len(items); i++ {
		for j := i + 1; j < len(items); j++ {
			if shouldMergeDevices(items[i], items[j]) {
				union(i, j)
			}
		}
	}

	groups := make(map[int][]int, len(items))
	order := make([]int, 0, len(items))
	for i := range items {
		root := find(i)
		if _, ok := groups[root]; !ok {
			order = append(order, root)
		}
		groups[root] = append(groups[root], i)
	}

	out := make([]models.HardwareDevice, 0, len(order))
	for _, root := range order {
		indices := groups[root]
		bestIdx := indices[0]
		bestScore := qualityScore(items[bestIdx])
		for _, idx := range indices[1:] {
			if s := qualityScore(items[idx]); s > bestScore {
				bestIdx = idx
				bestScore = s
			}
		}
		merged := items[bestIdx]
		for _, idx := range indices {
			if idx == bestIdx {
				continue
			}
			merged = mergeDevices(merged, items[idx])
		}
		out = append(out, merged)
	}

	for i := range out {
		out[i].ID = out[i].Kind + ":" + strconv.Itoa(i)
	}
	return out
}
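The grouping in dedupeDevices is a plain union-find with path halving. A minimal sketch of the same structure in isolation, over an arbitrary pairwise predicate (the predicate here is a stand-in for shouldMergeDevices, not the real model logic):

```go
package main

import "fmt"

// groupIndices unions indices i, j whenever same(i, j) reports true and
// returns the resulting groups in first-seen order. parent chains are
// flattened with path halving on every find, as in dedupeDevices.
func groupIndices(n int, same func(i, j int) bool) [][]int {
	parent := make([]int, n)
	for i := range parent {
		parent[i] = i
	}
	find := func(x int) int {
		for parent[x] != x {
			parent[x] = parent[parent[x]] // path halving
			x = parent[x]
		}
		return x
	}
	union := func(a, b int) {
		ra, rb := find(a), find(b)
		if ra != rb {
			parent[rb] = ra
		}
	}
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			if same(i, j) {
				union(i, j)
			}
		}
	}
	groups := map[int][]int{}
	var order []int
	for i := 0; i < n; i++ {
		root := find(i)
		if _, ok := groups[root]; !ok {
			order = append(order, root)
		}
		groups[root] = append(groups[root], i)
	}
	out := make([][]int, 0, len(order))
	for _, root := range order {
		out = append(out, groups[root])
	}
	return out
}

func main() {
	// items 0 and 2 "merge"; item 1 stands alone
	g := groupIndices(3, func(i, j int) bool { return i == 0 && j == 2 })
	fmt.Println(g) // [[0 2] [1]]
}
```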
func shouldMergeDevices(a, b models.HardwareDevice) bool {
	aSN := strings.ToLower(normalizedSerial(a.SerialNumber))
	bSN := strings.ToLower(normalizedSerial(b.SerialNumber))
	aBDF := strings.ToLower(strings.TrimSpace(a.BDF))
	bBDF := strings.ToLower(strings.TrimSpace(b.BDF))
	aSlot := normalizeSlot(a.Slot)
	bSlot := normalizeSlot(b.Slot)

	// Memory DIMMs can legitimately share serial number in some dumps.
	// Never merge DIMMs with different slots.
	if a.Kind == models.DeviceKindMemory && b.Kind == models.DeviceKindMemory {
		if aSlot != "" && bSlot != "" && aSlot != bSlot {
			return false
		}
	}

	// Hard conflicts.
	if aSN != "" && bSN != "" && aSN == bSN {
		if a.Kind == models.DeviceKindMemory && b.Kind == models.DeviceKindMemory {
			return aSlot != "" && bSlot != "" && aSlot == bSlot
		}
		return true
	}
	if aSN != "" && bSN != "" && aSN != bSN {
		return false
	}
	if aBDF != "" && bBDF != "" && aBDF != bBDF {
		return false
	}

	// Strong identities.
	if aBDF != "" && aBDF == bBDF {
		return true
	}

	// If both have no strong IDs, be conservative.
	if aSN == "" && bSN == "" && aBDF == "" && bBDF == "" {
		if hasMACOverlap(a.MACAddresses, b.MACAddresses) {
			return true
		}
		if aSlot != "" && aSlot == bSlot {
			return true
		}
		return false
	}

	score := 0
	if samePCIID(a, b) {
		score += 4
	}
	if sameModel(a, b) {
		score += 3
	}
	if sameManufacturer(a, b) {
		score += 2
	}
	if aSlot != "" && aSlot == bSlot {
		score += 2
	}
	if hasMACOverlap(a.MACAddresses, b.MACAddresses) {
		score += 2
	}
	if sameKindFamily(a.Kind, b.Kind) {
		score++
	}
	if samePCIID(a, b) && ((aBDF != "" && bBDF == "") || (aBDF == "" && bBDF != "")) {
		score += 2
	}

	return score >= 7
}
func mergeDevices(primary, secondary models.HardwareDevice) models.HardwareDevice {
	fillString := func(dst *string, src string) {
		if strings.TrimSpace(*dst) == "" && strings.TrimSpace(src) != "" {
			*dst = src
		}
	}
	fillInt := func(dst *int, src int) {
		if *dst == 0 && src != 0 {
			*dst = src
		}
	}
	fillFloat := func(dst *float64, src float64) {
		if *dst == 0 && src != 0 {
			*dst = src
		}
	}

	fillString(&primary.ID, secondary.ID)
	fillString(&primary.Kind, secondary.Kind)
	fillString(&primary.Source, secondary.Source)
	fillString(&primary.Slot, secondary.Slot)
	fillString(&primary.Location, secondary.Location)
	fillString(&primary.BDF, secondary.BDF)
	fillString(&primary.DeviceClass, secondary.DeviceClass)
	fillInt(&primary.VendorID, secondary.VendorID)
	fillInt(&primary.DeviceID, secondary.DeviceID)
	fillString(&primary.Model, secondary.Model)
	fillString(&primary.PartNumber, secondary.PartNumber)
	fillString(&primary.Manufacturer, secondary.Manufacturer)
	fillString(&primary.SerialNumber, secondary.SerialNumber)
	fillString(&primary.Firmware, secondary.Firmware)
	fillString(&primary.Type, secondary.Type)
	fillString(&primary.Interface, secondary.Interface)
	if primary.Present == nil && secondary.Present != nil {
		primary.Present = secondary.Present
	}
	fillInt(&primary.SizeMB, secondary.SizeMB)
	fillInt(&primary.SizeGB, secondary.SizeGB)
	fillInt(&primary.Cores, secondary.Cores)
	fillInt(&primary.Threads, secondary.Threads)
	fillInt(&primary.FrequencyMHz, secondary.FrequencyMHz)
	fillInt(&primary.MaxFreqMHz, secondary.MaxFreqMHz)
	fillInt(&primary.PortCount, secondary.PortCount)
	fillString(&primary.PortType, secondary.PortType)
	if len(primary.MACAddresses) == 0 && len(secondary.MACAddresses) > 0 {
		primary.MACAddresses = secondary.MACAddresses
	}
	fillInt(&primary.LinkWidth, secondary.LinkWidth)
	fillString(&primary.LinkSpeed, secondary.LinkSpeed)
	fillInt(&primary.MaxLinkWidth, secondary.MaxLinkWidth)
	fillString(&primary.MaxLinkSpeed, secondary.MaxLinkSpeed)
	fillInt(&primary.WattageW, secondary.WattageW)
	fillString(&primary.InputType, secondary.InputType)
	fillInt(&primary.InputPowerW, secondary.InputPowerW)
	fillInt(&primary.OutputPowerW, secondary.OutputPowerW)
	fillFloat(&primary.InputVoltage, secondary.InputVoltage)
	fillInt(&primary.TemperatureC, secondary.TemperatureC)
	fillString(&primary.Status, secondary.Status)
	if primary.StatusCheckedAt == nil && secondary.StatusCheckedAt != nil {
		primary.StatusCheckedAt = secondary.StatusCheckedAt
	}
	if primary.StatusChangedAt == nil && secondary.StatusChangedAt != nil {
		primary.StatusChangedAt = secondary.StatusChangedAt
	}
	if primary.StatusAtCollect == nil && secondary.StatusAtCollect != nil {
		primary.StatusAtCollect = secondary.StatusAtCollect
	}
	if len(primary.StatusHistory) == 0 && len(secondary.StatusHistory) > 0 {
		primary.StatusHistory = secondary.StatusHistory
	}
	fillString(&primary.ErrorDescription, secondary.ErrorDescription)
	if primary.Details == nil && secondary.Details != nil {
		primary.Details = secondary.Details
	}
	return primary
}
func samePCIID(a, b models.HardwareDevice) bool {
	if (a.VendorID == 0 && a.DeviceID == 0) || (b.VendorID == 0 && b.DeviceID == 0) {
		return false
	}
	return a.VendorID == b.VendorID && a.DeviceID == b.DeviceID
}

func sameModel(a, b models.HardwareDevice) bool {
	am := normalizeText(coalesce(a.Model, a.PartNumber, a.DeviceClass))
	bm := normalizeText(coalesce(b.Model, b.PartNumber, b.DeviceClass))
	return am != "" && am == bm
}

func sameManufacturer(a, b models.HardwareDevice) bool {
	am := normalizeText(a.Manufacturer)
	bm := normalizeText(b.Manufacturer)
	return am != "" && am == bm
}

func hasMACOverlap(a, b []string) bool {
	if len(a) == 0 || len(b) == 0 {
		return false
	}
	set := make(map[string]struct{}, len(a))
	for _, mac := range a {
		key := normalizeText(mac)
		if key != "" {
			set[key] = struct{}{}
		}
	}
	for _, mac := range b {
		if _, ok := set[normalizeText(mac)]; ok {
			return true
		}
	}
	return false
}

func sameKindFamily(a, b string) bool {
	if a == b {
		return true
	}
	family := map[string]bool{
		models.DeviceKindPCIe:    true,
		models.DeviceKindGPU:     true,
		models.DeviceKindNetwork: true,
	}
	return family[a] && family[b]
}

func normalizeText(v string) string {
	s := strings.ToLower(strings.TrimSpace(v))
	s = strings.ReplaceAll(s, " ", "")
	s = strings.ReplaceAll(s, "_", "")
	s = strings.ReplaceAll(s, "-", "")
	return s
}

func normalizeSlot(slot string) string {
	return normalizeText(slot)
}
func qualityScore(d models.HardwareDevice) int {
	score := 0
	if normalizedSerial(d.SerialNumber) != "" {
		score += 6
	}
	if strings.TrimSpace(d.BDF) != "" {
		score += 4
	}
	if strings.TrimSpace(d.Model) != "" {
		score += 3
	}
	if strings.TrimSpace(d.Firmware) != "" {
		score += 2
	}
	if strings.TrimSpace(d.Status) != "" {
		score++
	}
	return score
}

func normalizedSerial(serial string) string {
	s := strings.TrimSpace(serial)
	if s == "" {
		return ""
	}
	switch strings.ToUpper(s) {
	case "N/A", "NA", "NONE", "NULL", "UNKNOWN", "-":
		return ""
	default:
		return s
	}
}
func buildFirmwareBySlot(firmware []models.FirmwareInfo) map[string]slotFirmwareInfo {
	out := make(map[string]slotFirmwareInfo)
	add := func(slot, model, version, category string) {
		key := normalizeSlotKey(slot)
		if key == "" || strings.TrimSpace(version) == "" {
			return
		}
		existing, ok := out[key]
		if ok && strings.TrimSpace(existing.Model) != "" {
			return
		}
		out[key] = slotFirmwareInfo{
			Model:    strings.TrimSpace(model),
			Version:  strings.TrimSpace(version),
			Category: category,
		}
	}

	for _, fw := range firmware {
		name := strings.TrimSpace(fw.DeviceName)
		if name == "" {
			continue
		}
		if m := psuFirmwareRe.FindStringSubmatch(name); len(m) == 3 {
			model := strings.TrimSpace(m[2])
			if model == "" {
				model = "PSU"
			}
			add(m[1], model, fw.Version, "psu")
			continue
		}
		if m := nicFirmwareRe.FindStringSubmatch(name); len(m) == 3 {
			model := strings.TrimSpace(m[2])
			if model == "" {
				model = "NIC"
			}
			add(m[1], model, fw.Version, "nic")
			continue
		}
		if m := gpuFirmwareRe.FindStringSubmatch(name); len(m) == 3 {
			model := strings.TrimSpace(m[2])
			if model == "" {
				model = "GPU"
			}
			add(m[1], model, fw.Version, "gpu")
			continue
		}
		if m := nvsFirmwareRe.FindStringSubmatch(name); len(m) == 3 {
			model := strings.TrimSpace(m[2])
			if model == "" {
				model = "NVSwitch"
			}
			add(m[1], model, fw.Version, "nvswitch")
			continue
		}
	}

	return out
}

func normalizeSlotKey(slot string) string {
	return strings.ToLower(strings.TrimSpace(slot))
}
func annotateDuplicateSerials(items []models.HardwareDevice) []models.HardwareDevice {
	if len(items) < 2 {
		return items
	}

	countByKindSerial := make(map[string]int)
	for _, d := range items {
		serial := normalizedSerial(d.SerialNumber)
		if serial == "" {
			continue
		}
		key := d.Kind + "|" + strings.ToLower(serial)
		countByKindSerial[key]++
	}

	seenByKindSerial := make(map[string]int)
	for i := range items {
		serial := normalizedSerial(items[i].SerialNumber)
		if serial == "" {
			continue
		}
		key := items[i].Kind + "|" + strings.ToLower(serial)
		if countByKindSerial[key] < 2 {
			continue
		}
		seenByKindSerial[key]++
		items[i].SerialNumber = serial + " (DUP#" + strconv.Itoa(seenByKindSerial[key]) + ")"
	}

	return items
}
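annotateDuplicateSerials keeps colliding serials visible instead of silently dropping rows. The same pattern on a trimmed struct (the two-field `item` type is a stand-in for `models.HardwareDevice`, not the real model):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type item struct{ Kind, Serial string }

// annotateDupes suffixes serials shared by two or more items of the same
// kind with " (DUP#n)", counting case-insensitively, so each row stays
// distinguishable in a listing.
func annotateDupes(items []item) []item {
	count := map[string]int{}
	for _, it := range items {
		s := strings.TrimSpace(it.Serial)
		if s != "" {
			count[it.Kind+"|"+strings.ToLower(s)]++
		}
	}
	seen := map[string]int{}
	for i := range items {
		s := strings.TrimSpace(items[i].Serial)
		if s == "" {
			continue
		}
		key := items[i].Kind + "|" + strings.ToLower(s)
		if count[key] < 2 {
			continue
		}
		seen[key]++
		items[i].Serial = s + " (DUP#" + strconv.Itoa(seen[key]) + ")"
	}
	return items
}

func main() {
	out := annotateDupes([]item{{"memory", "ABC"}, {"memory", "abc"}, {"storage", "ABC"}})
	fmt.Println(out[0].Serial, out[1].Serial, out[2].Serial)
	// ABC (DUP#1) abc (DUP#2) ABC
}
```

Note the counting key lowercases the serial, so "ABC" and "abc" collide, while the storage item with the same serial but a different kind is left untouched.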
Some files were not shown because too many files have changed in this diff.