dell: strip MAC from model names; fix device-bound firmware in dell/inspur
- Dell NICView: strip " - XX:XX:XX:XX:XX:XX" suffix from ProductName (Dell TSR embeds MAC in this field for every NIC port) - Dell SoftwareIdentity: same strip applied to ElementName; store FQDD in FirmwareInfo.Description so exporter can filter device-bound entries - Exporter: add isDeviceBoundFirmwareFQDD() to filter firmware entries whose Description matches NIC./PSU./Disk./RAID.Backplane./GPU. FQDD prefixes (prevents device firmware from appearing in hardware.firmware) - Exporter: extend isDeviceBoundFirmwareName() to filter HGX GPU/NVSwitch firmware inventory IDs (_fw_gpu_, _fw_nvswitch_, _inforom_gpu_) - Inspur: remove HDD firmware from Hardware.Firmware — already present in Storage.Firmware, duplicating it violates ADL-016 - bible-local/06-parsers.md: document firmware and MAC stripping rules - bible-local/10-decisions.md: add ADL-016 (device-bound firmware) and ADL-017 (vendor-embedded MAC in model name fields) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -10,7 +10,7 @@ All registrations are collected in `internal/parser/vendors/vendors.go`:
|
||||
```go
|
||||
import (
|
||||
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
|
||||
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/supermicro"
|
||||
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/dell"
|
||||
// etc.
|
||||
)
|
||||
```
|
||||
@@ -53,6 +53,42 @@ version produced a given result.
|
||||
|
||||
---
|
||||
|
||||
## Parser data quality rules
|
||||
|
||||
### FirmwareInfo — system-level only
|
||||
|
||||
`Hardware.Firmware` must contain **only system-level firmware**: BIOS, BMC/iDRAC,
|
||||
Lifecycle Controller, CPLD, storage controllers, BOSS adapters.
|
||||
|
||||
**Device-bound firmware** (NIC, GPU, PSU, disk, backplane) **must NOT be added to
|
||||
`Hardware.Firmware`**. It belongs to the device's own `Firmware` field and is already
|
||||
present there. Duplicating it in `Hardware.Firmware` causes double entries in Reanimator.
|
||||
|
||||
The Reanimator exporter filters by `FirmwareInfo.DeviceName` prefix and by
|
||||
`FirmwareInfo.Description` (FQDD prefix). Parsers must cooperate:
|
||||
|
||||
- Store the device's FQDD (or equivalent slot identifier) in `FirmwareInfo.Description`
|
||||
for all firmware entries that come from a per-device inventory source (e.g. Dell
|
||||
`DCIM_SoftwareIdentity`).
|
||||
- FQDD prefixes that are device-bound: `NIC.`, `PSU.`, `Disk.`, `RAID.Backplane.`, `GPU.`
|
||||
|
||||
### NIC/device model names — strip embedded MAC addresses
|
||||
|
||||
Some vendors (confirmed: Dell TSR) embed the MAC address in the device model name field,
|
||||
e.g. `ProductName = "NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"`.
|
||||
|
||||
**Rule:** Strip any ` - XX:XX:XX:XX:XX:XX` suffix from model/name strings before storing
|
||||
them in `FirmwareInfo.DeviceName`, `NetworkAdapter.Model`, or any other model field.
|
||||
|
||||
Use `nicMACInModelRE` (defined in the Dell parser) or an equivalent regex:
|
||||
```
|
||||
\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$
|
||||
```
|
||||
|
||||
This applies to **all** string fields used as device names or model identifiers.
|
||||
|
||||
---
|
||||
|
||||
## Vendor parsers
|
||||
|
||||
### Inspur / Kaytus (`inspur`)
|
||||
@@ -86,29 +122,26 @@ inspur/
|
||||
|
||||
---
|
||||
|
||||
### Supermicro (`supermicro`)
|
||||
### Dell TSR (`dell`)
|
||||
|
||||
**Status:** Ready (v1.0.0). Tested on SYS-821GE-TNHR crash dumps.
|
||||
**Status:** Ready (v3.0). Tested on nested TSR archives with embedded `*.pl.zip`.
|
||||
|
||||
**Archive format:** `.tgz` / `.tar.gz` / `.tar`
|
||||
**Archive format:** `.zip` (outer archive + nested `*.pl.zip`)
|
||||
|
||||
**Primary source file:** `CDump.txt` — JSON crashdump file
|
||||
|
||||
**Confidence:** +80 when `CDump.txt` contains `crash_data`, `METADATA`, `bmc_fw_ver` markers.
|
||||
**Primary source files:**
|
||||
- `tsr/metadata.json`
|
||||
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml`
|
||||
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml`
|
||||
- `tsr/hardware/sysinfo/inventory/sysinfo_CIM_Sensor.xml`
|
||||
- `tsr/hardware/sysinfo/lcfiles/curr_lclog.xml`
|
||||
|
||||
**Extracted data:**
|
||||
- CPUs: CPUID, core count, manufacturer (Intel), microcode version (as firmware field)
|
||||
- FRU: BMC firmware version, BIOS version, ME firmware version, CPU PPIN
|
||||
- Events: crashdump collection event + MCA errors
|
||||
|
||||
**MCA error detection:**
|
||||
- Bit 63 (Valid), Bit 61 (UC — uncorrected), Bit 60 (EN — enabled)
|
||||
- Corrected MCA errors → `Warning` severity
|
||||
- Uncorrected MCA errors → `Critical` severity
|
||||
|
||||
**Known limitations:**
|
||||
- TOR dump and extended MCA register data not yet parsed.
|
||||
- No CPU model name (only CPUID hex code available in crashdump format).
|
||||
- Board/system identity and BIOS/iDRAC firmware
|
||||
- CPU, memory, physical disks, virtual disks, PSU, NIC, PCIe
|
||||
- GPU inventory (`DCIM_VideoView`) + GPU sensor enrichment (`DCIM_GPUSensor`)
|
||||
- Controller/backplane inventory (`DCIM_ControllerView`, `DCIM_EnclosureView`)
|
||||
- Sensor readings (temperature/voltage/current/power/fan/utilization)
|
||||
- Lifecycle events (`curr_lclog.xml`)
|
||||
|
||||
---
|
||||
|
||||
@@ -272,8 +305,8 @@ with content markers (e.g. `Unraid kernel build`, parity data markers).
|
||||
|
||||
| Vendor | ID | Status | Tested on |
|
||||
|--------|----|--------|-----------|
|
||||
| Dell TSR | `dell` | Ready | TSR nested zip archives |
|
||||
| Inspur / Kaytus | `inspur` | Ready | KR4268X2 onekeylog |
|
||||
| Supermicro | `supermicro` | Ready | SYS-821GE-TNHR crashdump |
|
||||
| NVIDIA HGX Field Diag | `nvidia` | Ready | Various HGX servers |
|
||||
| NVIDIA Bug Report | `nvidia_bug_report` | Ready | H100 systems |
|
||||
| Unraid | `unraid` | Ready | Unraid diagnostics archives |
|
||||
|
||||
Reference in New Issue
Block a user