dell: strip MAC from model names; fix device-bound firmware in dell/inspur

- Dell NICView: strip " - XX:XX:XX:XX:XX:XX" suffix from ProductName
  (Dell TSR embeds MAC in this field for every NIC port)
- Dell SoftwareIdentity: same strip applied to ElementName; store FQDD
  in FirmwareInfo.Description so exporter can filter device-bound entries
- Exporter: add isDeviceBoundFirmwareFQDD() to filter firmware entries
  whose Description matches NIC./PSU./Disk./RAID.Backplane./GPU. FQDD
  prefixes (prevents device firmware from appearing in hardware.firmware)
- Exporter: extend isDeviceBoundFirmwareName() to filter HGX GPU/NVSwitch
  firmware inventory IDs (_fw_gpu_, _fw_nvswitch_, _inforom_gpu_)
- Inspur: remove HDD firmware from Hardware.Firmware — already present
  in Storage.Firmware, duplicating it violates ADL-016
- bible-local/06-parsers.md: document firmware and MAC stripping rules
- bible-local/10-decisions.md: add ADL-016 (device-bound firmware) and
  ADL-017 (vendor-embedded MAC in model name fields)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-03-01 22:07:53 +03:00
parent 206496efae
commit 9c5512d238
6 changed files with 1813 additions and 38 deletions

View File

@@ -10,7 +10,7 @@ All registrations are collected in `internal/parser/vendors/vendors.go`:
```go
import (
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/supermicro"
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/dell"
// etc.
)
```
@@ -53,6 +53,42 @@ version produced a given result.
---
## Parser data quality rules
### FirmwareInfo — system-level only
`Hardware.Firmware` must contain **only system-level firmware**: BIOS, BMC/iDRAC,
Lifecycle Controller, CPLD, storage controllers, BOSS adapters.
**Device-bound firmware** (NIC, GPU, PSU, disk, backplane) **must NOT be added to
`Hardware.Firmware`**. It belongs to the device's own `Firmware` field and is already
present there. Duplicating it in `Hardware.Firmware` causes double entries in Reanimator.
The Reanimator exporter filters by `FirmwareInfo.DeviceName` prefix and by
`FirmwareInfo.Description` (FQDD prefix). Parsers must cooperate:
- Store the device's FQDD (or equivalent slot identifier) in `FirmwareInfo.Description`
for all firmware entries that come from a per-device inventory source (e.g. Dell
`DCIM_SoftwareIdentity`).
- FQDD prefixes that are device-bound: `NIC.`, `PSU.`, `Disk.`, `RAID.Backplane.`, `GPU.`
### NIC/device model names — strip embedded MAC addresses
Some vendors (confirmed: Dell TSR) embed the MAC address in the device model name field,
e.g. `ProductName = "NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"`.
**Rule:** Strip any ` - XX:XX:XX:XX:XX:XX` suffix from model/name strings before storing
them in `FirmwareInfo.DeviceName`, `NetworkAdapter.Model`, or any other model field.
Use `nicMACInModelRE` (defined in the Dell parser) or an equivalent regex:
```
\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$
```
This applies to **all** string fields used as device names or model identifiers.
---
## Vendor parsers
### Inspur / Kaytus (`inspur`)
@@ -86,29 +122,26 @@ inspur/
---
### Supermicro (`supermicro`)
### Dell TSR (`dell`)
**Status:** Ready (v1.0.0). Tested on SYS-821GE-TNHR crash dumps.
**Status:** Ready (v3.0). Tested on nested TSR archives with embedded `*.pl.zip`.
**Archive format:** `.tgz` / `.tar.gz` / `.tar`
**Archive format:** `.zip` (outer archive + nested `*.pl.zip`)
**Primary source file:** `CDump.txt` — JSON crashdump file
**Confidence:** +80 when `CDump.txt` contains `crash_data`, `METADATA`, `bmc_fw_ver` markers.
**Primary source files:**
- `tsr/metadata.json`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_CIM_Sensor.xml`
- `tsr/hardware/sysinfo/lcfiles/curr_lclog.xml`
**Extracted data:**
- CPUs: CPUID, core count, manufacturer (Intel), microcode version (as firmware field)
- FRU: BMC firmware version, BIOS version, ME firmware version, CPU PPIN
- Events: crashdump collection event + MCA errors
**MCA error detection:**
- Bit 63 (Valid), Bit 61 (UC — uncorrected), Bit 60 (EN — enabled)
- Corrected MCA errors → `Warning` severity
- Uncorrected MCA errors → `Critical` severity
**Known limitations:**
- TOR dump and extended MCA register data not yet parsed.
- No CPU model name (only CPUID hex code available in crashdump format).
- Board/system identity and BIOS/iDRAC firmware
- CPU, memory, physical disks, virtual disks, PSU, NIC, PCIe
- GPU inventory (`DCIM_VideoView`) + GPU sensor enrichment (`DCIM_GPUSensor`)
- Controller/backplane inventory (`DCIM_ControllerView`, `DCIM_EnclosureView`)
- Sensor readings (temperature/voltage/current/power/fan/utilization)
- Lifecycle events (`curr_lclog.xml`)
---
@@ -272,8 +305,8 @@ with content markers (e.g. `Unraid kernel build`, parity data markers).
| Vendor | ID | Status | Tested on |
|--------|----|--------|-----------|
| Dell TSR | `dell` | Ready | TSR nested zip archives |
| Inspur / Kaytus | `inspur` | Ready | KR4268X2 onekeylog |
| Supermicro | `supermicro` | Ready | SYS-821GE-TNHR crashdump |
| NVIDIA HGX Field Diag | `nvidia` | Ready | Various HGX servers |
| NVIDIA Bug Report | `nvidia_bug_report` | Ready | H100 systems |
| Unraid | `unraid` | Ready | Unraid diagnostics archives |