docs: refresh project documentation
@@ -2,261 +2,69 @@

## Framework

### Registration

Parsers live in `internal/parser/` and vendor implementations live in `internal/parser/vendors/`.

Each vendor parser registers itself via Go's `init()` side-effect import pattern.

Core behavior:

- registration uses `init()` side effects
- all registered parsers run `Detect()`
- the highest-confidence parser wins
- generic fallback stays last and low-confidence
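
The selection rule in the list above can be sketched as follows. This is a minimal illustration, not the project's actual registry: `detector`, `fixedParser`, `selectParser`, and the fields of `ExtractedFile` are assumptions made for the example.

```go
package main

import "fmt"

// ExtractedFile is a simplified stand-in for the project's type (fields assumed).
type ExtractedFile struct {
	Path    string
	Content []byte
}

// detector is the subset of VendorParser that selection needs.
type detector interface {
	Name() string
	Detect(files []ExtractedFile) int
}

// fixedParser is a hypothetical parser that always reports the same confidence.
type fixedParser struct {
	name  string
	score int
}

func (p fixedParser) Name() string                 { return p.name }
func (p fixedParser) Detect(_ []ExtractedFile) int { return p.score }

// selectParser runs Detect() on every registered parser and returns the top
// scorer, or nil when no parser reports a confidence above zero.
func selectParser(registered []detector, files []ExtractedFile) detector {
	var best detector
	bestScore := 0
	for _, p := range registered {
		if score := p.Detect(files); score > bestScore {
			best, bestScore = p, score
		}
	}
	return best
}

func main() {
	registered := []detector{
		fixedParser{"generic", 10},
		fixedParser{"dell", 90},
		fixedParser{"inspur", 0},
	}
	if p := selectParser(registered, nil); p != nil {
		fmt.Println(p.Name())
	}
}
```

In this sketch a tie goes to the parser registered first, because a later parser must score strictly higher to replace the current best. How the real registry breaks ties is not stated in this document.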

All registrations are collected in `internal/parser/vendors/vendors.go`:

```go
import (
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/dell"
	// etc.
)
```

### VendorParser interface

`VendorParser` contract:

```go
type VendorParser interface {
	Name() string                     // human-readable name
	Vendor() string                   // vendor identifier string
	Version() string                  // parser version (increment on logic changes)
	Detect(files []ExtractedFile) int // confidence 0–100
	Parse(files []ExtractedFile) (*models.AnalysisResult, error)
}
```

### Selection logic

## Adding a parser

All registered parsers run `Detect()` against the uploaded archive's file list.
The parser with the **highest confidence score** is selected.
Multiple parsers may return >0; only the top scorer is used.

1. Create `internal/parser/vendors/<vendor>/`
2. Start from `internal/parser/vendors/template/parser.go.template`
3. Implement `Detect()` and `Parse()`
4. Add a blank import in `internal/parser/vendors/vendors.go`
5. Add at least one positive and one negative detection test
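
As a rough illustration of steps 3 and 5, the sketch below shows a `Detect()`-style function together with one positive and one negative input. Everything here is hypothetical: the filename `vendor_inventory.xml`, the marker `ExampleVendor BMC`, the score weights, and the fields of `ExtractedFile` are assumptions, not taken from the template.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// ExtractedFile is a simplified stand-in for the project's type (fields assumed).
type ExtractedFile struct {
	Path    string
	Content []byte
}

// detectExample is a hypothetical Detect() body. The filename and the content
// marker are invented for illustration and belong to no real vendor.
func detectExample(files []ExtractedFile) int {
	score := 0
	for _, f := range files {
		if path.Base(f.Path) == "vendor_inventory.xml" {
			score += 50 // unique filename
		}
		if strings.Contains(string(f.Content), "ExampleVendor BMC") {
			score += 30 // vendor-specific content marker
		}
	}
	if score > 100 {
		score = 100 // keep the result inside the 0-100 confidence range
	}
	return score
}

func main() {
	positive := []ExtractedFile{{Path: "logs/vendor_inventory.xml", Content: []byte("ExampleVendor BMC 1.0")}}
	negative := []ExtractedFile{{Path: "logs/dmesg.txt", Content: []byte("unrelated text")}}
	fmt.Println(detectExample(positive), detectExample(negative))
}
```

In a real parser the two inputs would live in a `_test.go` file next to the parser, one asserting a confident match and one asserting a score of 0.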

### Adding a new vendor parser

## Data quality rules

1. `mkdir -p internal/parser/vendors/VENDORNAME`
2. Copy `internal/parser/vendors/template/parser.go.template` as a starting point.
3. Implement `Detect()` and `Parse()`.
4. Add a blank import to `vendors/vendors.go`.

### System firmware only in `hardware.firmware`

`Detect()` tips:

- Look for unique filenames or directory names.
- Check file content for vendor-specific markers.
- Return 70+ only when confident; return 0 if clearly not a match.

`hardware.firmware` must contain system-level firmware only.
Device-bound firmware belongs on the device record and must not be duplicated at the top level.

### Parser versioning

### Strip embedded MAC addresses from model names

Each parser file contains a `parserVersion` constant.
Increment the version whenever parsing logic changes — this helps trace which
version produced a given result.

If a source embeds ` - XX:XX:XX:XX:XX:XX` in a model/name field, remove that suffix before storing it.

---

### Use `pci.ids` for empty or generic PCI model names

## Parser data quality rules

When `vendor_id` and `device_id` are known but the model name is missing or generic, resolve the name via `internal/parser/vendors/pciids`.

### FirmwareInfo — system-level only

## Active vendor coverage

`Hardware.Firmware` must contain **only system-level firmware**: BIOS, BMC/iDRAC,
Lifecycle Controller, CPLD, storage controllers, BOSS adapters.

| Vendor ID | Input family | Notes |
|-----------|--------------|-------|
| `dell` | TSR ZIP archives | Broad hardware, firmware, sensors, lifecycle events |
| `h3c_g5` | H3C SDS G5 bundles | INI/XML/CSV-driven hardware and event parsing |
| `h3c_g6` | H3C SDS G6 bundles | Similar flow with G6-specific files |
| `inspur` | onekeylog archives | FRU/SDR plus optional Redis enrichment |
| `nvidia` | HGX Field Diagnostics | GPU- and fabric-heavy diagnostic input |
| `nvidia_bug_report` | `nvidia-bug-report-*.log.gz` | dmidecode, lspci, NVIDIA driver sections |
| `unraid` | Unraid diagnostics/log bundles | Server and storage-focused parsing |
| `xigmanas` | XigmaNAS plain logs | FreeBSD/NAS-oriented inventory |
| `generic` | fallback | Low-confidence text fallback when nothing else matches |

**Device-bound firmware** (NIC, GPU, PSU, disk, backplane) **must NOT be added to
`Hardware.Firmware`**. It belongs to the device's own `Firmware` field and is already
present there. Duplicating it in `Hardware.Firmware` causes double entries in Reanimator.

## Practical guidance

The Reanimator exporter filters by `FirmwareInfo.DeviceName` prefix and by
`FirmwareInfo.Description` (FQDD prefix). Parsers must cooperate:

- Store the device's FQDD (or equivalent slot identifier) in `FirmwareInfo.Description`
  for all firmware entries that come from a per-device inventory source (e.g. Dell
  `DCIM_SoftwareIdentity`).
- FQDD prefixes that are device-bound: `NIC.`, `PSU.`, `Disk.`, `RAID.Backplane.`, `GPU.`
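
A prefix check along these lines could separate device-bound entries from system-level ones. This is a sketch only: `isDeviceBound` and the sample FQDD strings are illustrative assumptions, and the exporter's real filter may differ (for example in case handling).

```go
package main

import (
	"fmt"
	"strings"
)

// deviceBoundPrefixes lists the FQDD prefixes the text above marks as device-bound.
var deviceBoundPrefixes = []string{"NIC.", "PSU.", "Disk.", "RAID.Backplane.", "GPU."}

// isDeviceBound reports whether an FQDD starts with one of the device-bound
// prefixes. The comparison is case-sensitive, which is an assumption.
func isDeviceBound(fqdd string) bool {
	for _, prefix := range deviceBoundPrefixes {
		if strings.HasPrefix(fqdd, prefix) {
			return true
		}
	}
	return false
}

func main() {
	// Sample FQDD values are illustrative only.
	for _, fqdd := range []string{"NIC.Slot.1-1-1", "BIOS.Setup.1-1"} {
		fmt.Println(fqdd, isDeviceBound(fqdd))
	}
}
```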

### NIC/device model names — strip embedded MAC addresses

Some vendors (confirmed: Dell TSR) embed the MAC address in the device model name field,
e.g. `ProductName = "NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"`.

**Rule:** Strip any ` - XX:XX:XX:XX:XX:XX` suffix from model/name strings before storing
them in `FirmwareInfo.DeviceName`, `NetworkAdapter.Model`, or any other model field.

Use `nicMACInModelRE` (defined in the Dell parser) or an equivalent regex:

```
\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$
```

This applies to **all** string fields used as device names or model identifiers.
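
The regex above works unchanged with Go's `regexp` package. The sketch below applies it to the Dell example string; `macSuffixRE` and `stripMACSuffix` are names chosen for this illustration and are not the Dell parser's `nicMACInModelRE` itself.

```go
package main

import (
	"fmt"
	"regexp"
)

// macSuffixRE mirrors the regex quoted above; the variable name is local to this sketch.
var macSuffixRE = regexp.MustCompile(`\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$`)

// stripMACSuffix removes a trailing " - XX:XX:XX:XX:XX:XX" from a model string
// and returns the input unchanged when no such suffix is present.
func stripMACSuffix(model string) string {
	return macSuffixRE.ReplaceAllString(model, "")
}

func main() {
	fmt.Println(stripMACSuffix("NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"))
}
```

Because the pattern is anchored with `$`, hyphens inside the model name (such as `ConnectX-6`) are left alone.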

### PCI device name enrichment via pci.ids

If a PCIe device, GPU, NIC, or any hardware component has a `vendor_id` + `device_id`
but its model/name field is **empty or generic** (e.g. blank, equals the description,
or is just a raw hex ID), the parser **must** attempt to resolve the human-readable
model name from the embedded `pci.ids` database before storing the result.

**Rule:** When `Model` (or equivalent name field) is empty and both `VendorID` and
`DeviceID` are non-zero, call the pciids lookup and use the result as the model name.

```go
// Example pattern — use in any parser that handles PCIe/GPU/NIC devices:
if strings.TrimSpace(device.Model) == "" && device.VendorID != 0 && device.DeviceID != 0 {
	if name := pciids.Lookup(device.VendorID, device.DeviceID); name != "" {
		device.Model = name
	}
}
```

This rule applies to all vendor parsers. The pciids package is available at
`internal/parser/vendors/pciids`. See ADL-005 for the rationale.

**Do not hardcode model name strings.** If a device is unknown today, it will be
resolved automatically once `pci.ids` is updated.

---

## Vendor parsers

### Inspur / Kaytus (`inspur`)

**Status:** Ready. Tested on KR4268X2 (onekeylog format).

**Archive format:** `.tar.gz` onekeylog

**Primary source files:**

| File | Content |
|------|---------|
| `asset.json` | Base hardware inventory |
| `component.log` | Component list |
| `devicefrusdr.log` | FRU and SDR data |
| `onekeylog/runningdata/redis-dump.rdb` | Runtime enrichment (optional) |

**Redis RDB enrichment** (applied conservatively — fills missing fields only):

- GPU: `serial_number`, `firmware` (VBIOS/FW), runtime telemetry
- NIC: firmware, serial, part number (when text logs leave fields empty)
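
The "fills missing fields only" behavior can be pictured with a small helper like the one below. It is a sketch under assumptions: the `GPU` struct, `fillIfEmpty`, `enrichGPU`, and the sample values are hypothetical and do not reflect the Inspur parser's actual types.

```go
package main

import (
	"fmt"
	"strings"
)

// GPU is a simplified, hypothetical record; the real model has more fields.
type GPU struct {
	SerialNumber string
	Firmware     string
}

// fillIfEmpty copies src into dst only when dst is blank and src is not,
// so values parsed from the text logs are never overwritten.
func fillIfEmpty(dst *string, src string) {
	if strings.TrimSpace(*dst) == "" && strings.TrimSpace(src) != "" {
		*dst = src
	}
}

// enrichGPU applies enrichment values conservatively, field by field.
func enrichGPU(g *GPU, fromRedis GPU) {
	fillIfEmpty(&g.SerialNumber, fromRedis.SerialNumber)
	fillIfEmpty(&g.Firmware, fromRedis.Firmware)
}

func main() {
	g := GPU{SerialNumber: "SN-FROM-LOG"} // firmware missing in the text logs
	enrichGPU(&g, GPU{SerialNumber: "SN-FROM-REDIS", Firmware: "96.00.01"})
	fmt.Printf("%+v\n", g)
}
```

Here the serial number from the text log is kept and only the empty firmware field is filled.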

**Module structure:**

```
inspur/
  parser.go — main parser + registration
  sdr.go — sensor/SDR parsing
  fru.go — FRU serial parsing
  asset.go — asset.json parsing
  syslog.go — syslog parsing
```

---

### Dell TSR (`dell`)

**Status:** Ready (v3.0). Tested on nested TSR archives with embedded `*.pl.zip`.

**Archive format:** `.zip` (outer archive + nested `*.pl.zip`)

**Primary source files:**

- `tsr/metadata.json`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_CIM_Sensor.xml`
- `tsr/hardware/sysinfo/lcfiles/curr_lclog.xml`

**Extracted data:**

- Board/system identity and BIOS/iDRAC firmware
- CPU, memory, physical disks, virtual disks, PSU, NIC, PCIe
- GPU inventory (`DCIM_VideoView`) + GPU sensor enrichment (`DCIM_GPUSensor`)
- Controller/backplane inventory (`DCIM_ControllerView`, `DCIM_EnclosureView`)
- Sensor readings (temperature/voltage/current/power/fan/utilization)
- Lifecycle events (`curr_lclog.xml`)

---

### NVIDIA HGX Field Diagnostics (`nvidia`)

**Status:** Ready (v1.1.0). Works with any server vendor.

**Archive format:** `.tar` / `.tar.gz`

**Confidence scoring:**

| File | Score |
|------|-------|
| `unified_summary.json` with "HGX Field Diag" marker | +40 |
| `summary.json` | +20 |
| `summary.csv` | +15 |
| `gpu_fieldiag/` directory | +15 |
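
One way to read the table is as an additive score in which each signal counts once. The sketch below follows that reading; `detectHGX`, the `ExtractedFile` fields, and the choice to count each signal only once are assumptions, not the parser's confirmed implementation.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// ExtractedFile is a simplified stand-in for the project's type (fields assumed).
type ExtractedFile struct {
	Path    string
	Content []byte
}

// detectHGX adds up the per-signal scores from the table above. Each signal is
// counted at most once, so the maximum in this sketch is 90.
func detectHGX(files []ExtractedFile) int {
	var unified, summaryJSON, summaryCSV, fieldiagDir bool
	for _, f := range files {
		switch path.Base(f.Path) {
		case "unified_summary.json":
			if strings.Contains(string(f.Content), "HGX Field Diag") {
				unified = true
			}
		case "summary.json":
			summaryJSON = true
		case "summary.csv":
			summaryCSV = true
		}
		// Assumes archive paths use forward slashes.
		if strings.Contains(f.Path, "gpu_fieldiag/") {
			fieldiagDir = true
		}
	}
	score := 0
	if unified {
		score += 40
	}
	if summaryJSON {
		score += 20
	}
	if summaryCSV {
		score += 15
	}
	if fieldiagDir {
		score += 15
	}
	return score
}

func main() {
	files := []ExtractedFile{
		{Path: "diag/unified_summary.json", Content: []byte(`{"tool":"HGX Field Diag"}`)},
		{Path: "diag/gpu_fieldiag/gpu0.log"},
	}
	fmt.Println(detectHGX(files))
}
```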

**Source files:**

| File | Content |
|------|---------|
| `output.log` | dmidecode — server manufacturer, model, serial number |
| `unified_summary.json` | GPU details, NVSwitch devices, PCI addresses |
| `summary.json` | Diagnostic test results and error codes |
| `summary.csv` | Alternative test results format |

**Extracted data:**

- GPUs: slot, model, manufacturer, firmware (VBIOS), BDF
- NVSwitch devices: slot, device_class, vendor_id, device_id, BDF, link speed/width
- Events: diagnostic test failures (connectivity, gpumem, gpustress, pcie, nvlink, nvswitch, power)

**Severity mapping:**

- `info` — tests passed
- `warning` — e.g. "Row remapping failed"
- `critical` — error codes 300+
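
The mapping above might be expressed roughly as below. The list does not say how failures with codes under 300 are classified, so treating every such failure as `warning` is purely an assumption of this sketch, as is the `severity` function itself.

```go
package main

import "fmt"

// severity maps a diagnostic test result to an event severity.
// Assumption: any failure with an error code below 300 is a warning;
// the source only gives "Row remapping failed" as a warning example.
func severity(passed bool, errorCode int) string {
	switch {
	case passed:
		return "info"
	case errorCode >= 300:
		return "critical"
	default:
		return "warning"
	}
}

func main() {
	fmt.Println(severity(true, 0), severity(false, 120), severity(false, 305))
}
```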

**Known limitations:**

- Detailed logs in `gpu_fieldiag/*.log` are not parsed.
- No CPU, memory, or storage extraction (not present in field diag archives).

---

### NVIDIA Bug Report (`nvidia_bug_report`)

**Status:** Ready (v1.0.0).

**File format:** `nvidia-bug-report-*.log.gz` (gzip-compressed text)

**Confidence:** 85 (high priority for matching filename pattern)
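
Filename-based detection of this kind can be sketched with the standard library's `path.Match`. The function name `detectBugReport`, its input shape, and the sample filenames are assumptions for illustration; only the pattern and the score of 85 come from the text above.

```go
package main

import (
	"fmt"
	"path"
)

// bugReportPattern is the filename pattern quoted in this section.
const bugReportPattern = "nvidia-bug-report-*.log.gz"

// detectBugReport returns 85 when any file's base name matches the pattern
// and 0 otherwise.
func detectBugReport(paths []string) int {
	for _, p := range paths {
		if ok, err := path.Match(bugReportPattern, path.Base(p)); err == nil && ok {
			return 85
		}
	}
	return 0
}

func main() {
	// The filenames below are made up for the example.
	fmt.Println(detectBugReport([]string{"upload/nvidia-bug-report-host01.log.gz"}))
	fmt.Println(detectBugReport([]string{"upload/dmesg.log.gz"}))
}
```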

**Source sections parsed:**

| dmidecode section | Extracts |
|-------------------|---------|
| System Information | server serial, UUID, manufacturer, product name |
| Processor Information | CPU model, serial, core/thread count, frequency |
| Memory Device | DIMM slot, size, type, manufacturer, serial, part number, speed |
| System Power Supply | PSU location, manufacturer, model, serial, wattage, firmware, status |

| Other source | Extracts |
|--------------|---------|
| `lspci -vvv` (Ethernet/Network/IB) | NIC model (from VPD), BDF, slot, P/N, S/N, port count, port type |
| `/proc/driver/nvidia/gpus/*/information` | GPU model, BDF, UUID, VBIOS version, IRQ |
| NVRM version line | NVIDIA driver version |

**Known limitations:**

- Driver error/warning log lines not yet extracted.
- GPU temperature/utilization metrics require additional parsing sections.

---

### XigmaNAS (`xigmanas`)

**Status:** Ready.

**Archive format:** Plain log files (FreeBSD-based NAS system)

**Detection:** Files named `xigmanas`, `system`, or `dmesg`; content containing "XigmaNAS" or "FreeBSD"; SMART data presence.

**Extracted data:**

- System: firmware version, uptime, CPU model, memory configuration, hardware platform
- Storage: disk models, serial numbers, capacity, health, SMART temperatures
- Populates: `Hardware.Firmware`, `Hardware.CPUs`, `Hardware.Memory`, `Hardware.Storage`, `Sensors`

---

### Unraid (`unraid`)

**Status:** Ready (v1.0.0).

- Be conservative with high detect scores
- Prefer filling missing fields over overwriting stronger source data
- Keep parser version constants current when behavior changes
- Any new vendor-specific filtering or dedup logic must ship with tests for that vendor format

**Archive format:** Unraid diagnostics archive contents (text-heavy diagnostics directories).