242 lines
7.6 KiB
Markdown
242 lines
7.6 KiB
Markdown
# 06 — Parsers
|
||
|
||
## Framework
|
||
|
||
### Registration
|
||
|
||
Each vendor parser registers itself via Go's `init()` side-effect import pattern.
|
||
|
||
All registrations are collected in `internal/parser/vendors/vendors.go`:
|
||
```go
|
||
import (
|
||
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
|
||
_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/supermicro"
|
||
// etc.
|
||
)
|
||
```
|
||
|
||
### VendorParser interface
|
||
|
||
```go
|
||
type VendorParser interface {
|
||
Name() string // human-readable name
|
||
Vendor() string // vendor identifier string
|
||
Version() string // parser version (increment on logic changes)
|
||
Detect(files []ExtractedFile) int // confidence 0–100
|
||
Parse(files []ExtractedFile) (*models.AnalysisResult, error)
|
||
}
|
||
```
|
||
|
||
### Selection logic
|
||
|
||
All registered parsers run `Detect()` against the uploaded archive's file list.
|
||
The parser with the **highest confidence score** is selected.
|
||
Multiple parsers may return >0; only the top scorer is used.
|
||
|
||
### Adding a new vendor parser
|
||
|
||
1. `mkdir -p internal/parser/vendors/VENDORNAME`
|
||
2. Copy `internal/parser/vendors/template/parser.go.template` as starting point.
|
||
3. Implement `Detect()` and `Parse()`.
|
||
4. Add blank import to `vendors/vendors.go`.
|
||
|
||
`Detect()` tips:
|
||
- Look for unique filenames or directory names.
|
||
- Check file content for vendor-specific markers.
|
||
- Return 70+ only when confident; return 0 if clearly not a match.
|
||
|
||
### Parser versioning
|
||
|
||
Each parser file contains a `parserVersion` constant.
|
||
Increment the version whenever parsing logic changes — this helps trace which
|
||
version produced a given result.
|
||
|
||
---
|
||
|
||
## Vendor parsers
|
||
|
||
### Inspur / Kaytus (`inspur`)
|
||
|
||
**Status:** Ready. Tested on KR4268X2 (onekeylog format).
|
||
|
||
**Archive format:** `.tar.gz` onekeylog
|
||
|
||
**Primary source files:**
|
||
|
||
| File | Content |
|
||
|------|---------|
|
||
| `asset.json` | Base hardware inventory |
|
||
| `component.log` | Component list |
|
||
| `devicefrusdr.log` | FRU and SDR data |
|
||
| `onekeylog/runningdata/redis-dump.rdb` | Runtime enrichment (optional) |
|
||
|
||
**Redis RDB enrichment** (applied conservatively — fills missing fields only):
|
||
- GPU: `serial_number`, `firmware` (VBIOS/FW), runtime telemetry
|
||
- NIC: firmware, serial, part number (when text logs leave fields empty)
|
||
|
||
**Module structure:**
|
||
```
|
||
inspur/
|
||
parser.go — main parser + registration
|
||
sdr.go — sensor/SDR parsing
|
||
fru.go — FRU serial parsing
|
||
asset.go — asset.json parsing
|
||
syslog.go — syslog parsing
|
||
```
|
||
|
||
---
|
||
|
||
### Supermicro (`supermicro`)
|
||
|
||
**Status:** Ready (v1.0.0). Tested on SYS-821GE-TNHR crash dumps.
|
||
|
||
**Archive format:** `.tgz` / `.tar.gz` / `.tar`
|
||
|
||
**Primary source file:** `CDump.txt` — JSON crashdump file
|
||
|
||
**Confidence:** +80 when `CDump.txt` contains `crash_data`, `METADATA`, `bmc_fw_ver` markers.
|
||
|
||
**Extracted data:**
|
||
- CPUs: CPUID, core count, manufacturer (Intel), microcode version (as firmware field)
|
||
- FRU: BMC firmware version, BIOS version, ME firmware version, CPU PPIN
|
||
- Events: crashdump collection event + MCA errors
|
||
|
||
**MCA error detection:**
|
||
- Bit 63 (Valid), Bit 61 (UC — uncorrected), Bit 60 (EN — enabled)
|
||
- Corrected MCA errors → `Warning` severity
|
||
- Uncorrected MCA errors → `Critical` severity
|
||
|
||
**Known limitations:**
|
||
- TOR dump and extended MCA register data not yet parsed.
|
||
- No CPU model name (only CPUID hex code available in crashdump format).
|
||
|
||
---
|
||
|
||
### NVIDIA HGX Field Diagnostics (`nvidia`)
|
||
|
||
**Status:** Ready (v1.1.0). Works with any server vendor.
|
||
|
||
**Archive format:** `.tar` / `.tar.gz`
|
||
|
||
**Confidence scoring:**
|
||
|
||
| File | Score |
|
||
|------|-------|
|
||
| `unified_summary.json` with "HGX Field Diag" marker | +40 |
|
||
| `summary.json` | +20 |
|
||
| `summary.csv` | +15 |
|
||
| `gpu_fieldiag/` directory | +15 |
|
||
|
||
**Source files:**
|
||
|
||
| File | Content |
|
||
|------|---------|
|
||
| `output.log` | dmidecode — server manufacturer, model, serial number |
|
||
| `unified_summary.json` | GPU details, NVSwitch devices, PCI addresses |
|
||
| `summary.json` | Diagnostic test results and error codes |
|
||
| `summary.csv` | Alternative test results format |
|
||
|
||
**Extracted data:**
|
||
- GPUs: slot, model, manufacturer, firmware (VBIOS), BDF
|
||
- NVSwitch devices: slot, device_class, vendor_id, device_id, BDF, link speed/width
|
||
- Events: diagnostic test failures (connectivity, gpumem, gpustress, pcie, nvlink, nvswitch, power)
|
||
|
||
**Severity mapping:**
|
||
- `info` — tests passed
|
||
- `warning` — e.g. "Row remapping failed"
|
||
- `critical` — error codes 300+
|
||
|
||
**Known limitations:**
|
||
- Detailed logs in `gpu_fieldiag/*.log` are not parsed.
|
||
- No CPU, memory, or storage extraction (not present in field diag archives).
|
||
|
||
---
|
||
|
||
### NVIDIA Bug Report (`nvidia_bug_report`)
|
||
|
||
**Status:** Ready (v1.0.0).
|
||
|
||
**File format:** `nvidia-bug-report-*.log.gz` (gzip-compressed text)
|
||
|
||
**Confidence:** 85 (high priority for matching filename pattern)
|
||
|
||
**Source sections parsed:**
|
||
|
||
| dmidecode section | Extracts |
|
||
|-------------------|---------|
|
||
| System Information | server serial, UUID, manufacturer, product name |
|
||
| Processor Information | CPU model, serial, core/thread count, frequency |
|
||
| Memory Device | DIMM slot, size, type, manufacturer, serial, part number, speed |
|
||
| System Power Supply | PSU location, manufacturer, model, serial, wattage, firmware, status |
|
||
|
||
| Other source | Extracts |
|
||
|--------------|---------|
|
||
| `lspci -vvv` (Ethernet/Network/IB) | NIC model (from VPD), BDF, slot, P/N, S/N, port count, port type |
|
||
| `/proc/driver/nvidia/gpus/*/information` | GPU model, BDF, UUID, VBIOS version, IRQ |
|
||
| NVRM version line | NVIDIA driver version |
|
||
|
||
**Known limitations:**
|
||
- Driver error/warning log lines not yet extracted.
|
||
- GPU temperature/utilization metrics require additional parsing sections.
|
||
|
||
---
|
||
|
||
### XigmaNAS (`xigmanas`)
|
||
|
||
**Status:** Ready.
|
||
|
||
**Archive format:** Plain log files (FreeBSD-based NAS system)
|
||
|
||
**Detection:** Files named `xigmanas`, `system`, or `dmesg`; content containing "XigmaNAS" or "FreeBSD"; SMART data presence.
|
||
|
||
**Extracted data:**
|
||
- System: firmware version, uptime, CPU model, memory configuration, hardware platform
|
||
- Storage: disk models, serial numbers, capacity, health, SMART temperatures
|
||
- Populates: `Hardware.Firmware`, `Hardware.CPUs`, `Hardware.Memory`, `Hardware.Storage`, `Sensors`
|
||
|
||
---
|
||
|
||
### Unraid (`unraid`)
|
||
|
||
**Status:** Ready (v1.0.0).
|
||
|
||
**Archive format:** Unraid diagnostics archive contents (text-heavy diagnostics directories).
|
||
|
||
**Detection:** Combines filename/path markers (`diagnostics-*`, `unraid-*.txt`, `vars.txt`)
|
||
with content markers (e.g. `Unraid kernel build`, parity data markers).
|
||
|
||
**Extracted data (current):**
|
||
- Board / BIOS metadata (from motherboard/system files)
|
||
- CPU summary (from `lscpu.txt`)
|
||
- Memory modules (from diagnostics memory file)
|
||
- Storage devices (from `vars.txt` + SMART files)
|
||
- Syslog events
|
||
|
||
---
|
||
|
||
### Generic text fallback (`generic`)
|
||
|
||
**Status:** Ready (v1.0.0).
|
||
|
||
**Confidence:** 15 (lowest — only matches if no other parser scores higher)
|
||
|
||
**Purpose:** Fallback for any text file or single `.gz` file not matching a specific vendor.
|
||
|
||
**Behavior:**
|
||
- If filename matches `nvidia-bug-report-*.log.gz`: extracts driver version and GPU list.
|
||
- Otherwise: confirms file is text (not binary) and records a basic "Text File" event.
|
||
|
||
---
|
||
|
||
## Supported vendor matrix
|
||
|
||
| Vendor | ID | Status | Tested on |
|
||
|--------|----|--------|-----------|
|
||
| Inspur / Kaytus | `inspur` | Ready | KR4268X2 onekeylog |
|
||
| Supermicro | `supermicro` | Ready | SYS-821GE-TNHR crashdump |
|
||
| NVIDIA HGX Field Diag | `nvidia` | Ready | Various HGX servers |
|
||
| NVIDIA Bug Report | `nvidia_bug_report` | Ready | H100 systems |
|
||
| Unraid | `unraid` | Ready | Unraid diagnostics archives |
|
||
| XigmaNAS | `xigmanas` | Ready | FreeBSD NAS logs |
|
||
| Generic fallback | `generic` | Ready | Any text file |
|