# 06 — Parsers

## Framework

Parsers live in `internal/parser/`, and vendor implementations live in `internal/parser/vendors/`.

Core behavior:

- registration uses `init()` side effects
- all registered parsers run `Detect()`
- the highest-confidence parser wins
- generic fallback stays last and low-confidence

`VendorParser` contract:

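```go
type VendorParser interface {
	Name() string
	Vendor() string
	// Version reports the parser version; keep it current when behavior changes.
	Version() string
	// Detect returns a confidence score; the highest-scoring parser wins.
	Detect(files []ExtractedFile) int
	// Parse converts the extracted files into an analysis result.
	Parse(files []ExtractedFile) (*models.AnalysisResult, error)
}
```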
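
The registration and selection behavior described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual registry: `Detector`, `Register`, `Best`, the `fixedScore` toy parser, and the `acme` score of 80 are all hypothetical names and values, and `ExtractedFile` is reduced to two assumed fields. Only the generic fallback score of 15 comes from this document.

```go
package main

import "fmt"

// ExtractedFile is a simplified stand-in for the real type (fields assumed).
type ExtractedFile struct {
	Path    string
	Content []byte
}

// Detector is a reduced, hypothetical view of VendorParser: just the
// methods needed to illustrate selection.
type Detector interface {
	Name() string
	Detect(files []ExtractedFile) int
}

// registry holds every parser added through Register.
var registry []Detector

// Register appends a parser; vendor packages would call it from init().
func Register(p Detector) {
	registry = append(registry, p)
}

// Best runs Detect on every registered parser and returns the one with the
// highest positive score, or nil when nothing matches.
func Best(files []ExtractedFile) Detector {
	var best Detector
	bestScore := 0
	for _, p := range registry {
		if score := p.Detect(files); score > bestScore {
			best, bestScore = p, score
		}
	}
	return best
}

// fixedScore is a toy parser that always reports the same confidence.
type fixedScore struct {
	name  string
	score int
}

func (f fixedScore) Name() string { return f.name }

func (f fixedScore) Detect(_ []ExtractedFile) int { return f.score }

// Registration as an init() side effect, mirroring the first bullet above.
func init() {
	Register(fixedScore{name: "generic", score: 15})
	Register(fixedScore{name: "acme", score: 80})
}

func main() {
	if p := Best(nil); p != nil {
		fmt.Println("selected:", p.Name())
	}
}
```

Because the comparison is strict, a low-confidence fallback can only win when every other parser scores lower than it does.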

## Adding a parser

1. Create `internal/parser/vendors/<vendor>/`
2. Start from `internal/parser/vendors/template/parser.go.template`
3. Implement `Detect()` and `Parse()`
4. Add a blank import in `internal/parser/vendors/vendors.go`
5. Add at least one positive and one negative detection test
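
As a rough sketch of steps 3 and 5, the program below implements `Detect()` and `Parse()` for a made-up `acme` vendor and exercises one positive and one negative detection case. Everything vendor-specific here is hypothetical: the marker file `acme_inventory.ini`, the content marker `ACME BMC`, the score weights, and the simplified `ExtractedFile` and `AnalysisResult` types. The real template and the real registration call may look different.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"path"
)

// Simplified stand-ins for the real types (fields assumed).
type ExtractedFile struct {
	Path    string
	Content []byte
}

type AnalysisResult struct {
	Vendor string
}

// parserVersion should be bumped whenever parsing behavior changes.
const parserVersion = "0.1.0"

// Parser is a hypothetical vendor parser for "acme" bundles.
type Parser struct{}

func (Parser) Name() string { return "Acme BMC bundle" }

func (Parser) Vendor() string { return "acme" }

func (Parser) Version() string { return parserVersion }

// Detect combines a filename marker with a content marker and caps the
// score so that it never claims more confidence than the evidence supports.
func (Parser) Detect(files []ExtractedFile) int {
	score := 0
	for _, f := range files {
		if path.Base(f.Path) == "acme_inventory.ini" {
			score += 40
		}
		if bytes.Contains(f.Content, []byte("ACME BMC")) {
			score += 30
		}
	}
	if score > 90 {
		score = 90
	}
	return score
}

// Parse returns a minimal result; a real parser would fill hardware data.
func (p Parser) Parse(files []ExtractedFile) (*AnalysisResult, error) {
	if p.Detect(files) == 0 {
		return nil, errors.New("acme: no recognizable files")
	}
	return &AnalysisResult{Vendor: p.Vendor()}, nil
}

func main() {
	p := Parser{}
	positive := []ExtractedFile{{Path: "bundle/acme_inventory.ini", Content: []byte("ACME BMC 1.2")}}
	negative := []ExtractedFile{{Path: "bundle/readme.txt", Content: []byte("hello")}}
	fmt.Println(p.Detect(positive), p.Detect(negative))
}
```

In the repository layout described above, the positive and negative cases would live in a `_test.go` file next to the parser rather than in `main`.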

## Data quality rules

### System firmware only in `hardware.firmware`

`hardware.firmware` must contain system-level firmware only.
Device-bound firmware belongs on the device record and must not be duplicated at the top level.

### Strip embedded MAC addresses from model names

If a source embeds ` - XX:XX:XX:XX:XX:XX` in a model/name field, remove that suffix before storing it.
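
One possible way to apply this rule is a single anchored regular expression. The helper name `stripMACSuffix` and the sample model string are illustrative only; the sketch assumes the MAC always appears as a trailing suffix with colon-separated hex bytes.

```go
package main

import (
	"fmt"
	"regexp"
)

// macSuffix matches a trailing " - XX:XX:XX:XX:XX:XX", where each XX is a
// hex byte. It is anchored to the end of the string so that a dash inside
// the model name itself is left alone.
var macSuffix = regexp.MustCompile(`\s+-\s+(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\s*$`)

// stripMACSuffix removes an embedded MAC suffix from a model/name field.
func stripMACSuffix(model string) string {
	return macSuffix.ReplaceAllString(model, "")
}

func main() {
	fmt.Println(stripMACSuffix("Example NIC-25G - 00:11:22:33:44:55"))
	fmt.Println(stripMACSuffix("Example NIC-25G"))
}
```

Both calls print `Example NIC-25G`: the first has its suffix removed, the second is returned unchanged.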

### Use `pci.ids` for empty or generic PCI model names

When `vendor_id` and `device_id` are known but the model name is missing or generic, resolve the name via `internal/parser/vendors/pciids`.
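
The decision logic might look like the sketch below. It does not use the real `pciids` package, whose API is not shown here: `sampleNames` is a made-up in-memory table standing in for a `pci.ids` lookup, the IDs in it are fictional, and the set of "generic" names is an assumption chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// pciKey identifies a device by numeric vendor and device ID.
type pciKey struct {
	VendorID, DeviceID uint16
}

// sampleNames stands in for a pci.ids lookup; the entry is fictional.
var sampleNames = map[pciKey]string{
	{0x1234, 0xabcd}: "Example 25GbE Adapter",
}

// genericNames lists placeholder strings treated as "no real model".
// The exact set is an assumption.
var genericNames = map[string]bool{
	"":                    true,
	"unknown":             true,
	"device":              true,
	"ethernet controller": true,
}

// resolveModel keeps a specific source-provided name and falls back to the
// ID lookup only when the name is missing or generic.
func resolveModel(model string, vendorID, deviceID uint16) string {
	if !genericNames[strings.ToLower(strings.TrimSpace(model))] {
		return model
	}
	if name, ok := sampleNames[pciKey{vendorID, deviceID}]; ok {
		return name
	}
	return model // nothing better is known; leave the field as it was
}

func main() {
	fmt.Println(resolveModel("Ethernet controller", 0x1234, 0xabcd))
	fmt.Println(resolveModel("Vendor Model X", 0x1234, 0xabcd))
}
```

The second call keeps `Vendor Model X`, which matches the guidance to prefer stronger source data over a lookup result.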

## Active vendor coverage

| Vendor ID | Input family | Notes |
|-----------|--------------|-------|
| `dell` | TSR ZIP archives | Broad hardware, firmware, sensors, lifecycle events |
| `h3c_g5` | H3C SDS G5 bundles | INI/XML/CSV-driven hardware and event parsing |
| `h3c_g6` | H3C SDS G6 bundles | Similar flow with G6-specific files |
| `inspur` | onekeylog archives | FRU/SDR plus optional Redis enrichment |
| `nvidia` | HGX Field Diagnostics | GPU- and fabric-heavy diagnostic input |
| `nvidia_bug_report` | `nvidia-bug-report-*.log.gz` | dmidecode, lspci, NVIDIA driver sections |
| `unraid` | Unraid diagnostics/log bundles | Server and storage-focused parsing |
| `xigmanas` | XigmaNAS plain logs | FreeBSD/NAS-oriented inventory |
| `generic` | fallback | Low-confidence text fallback when nothing else matches |

## Practical guidance

- Be conservative with high `Detect()` scores; an inflated score can outrank the parser that actually fits the input
- Prefer filling missing fields over overwriting stronger source data
- Keep parser version constants current when behavior changes
- Any new vendor-specific filtering or dedup logic must ship with tests for that vendor format
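
The "fill missing fields" guidance can be illustrated with a small merge helper. The `Device` struct, its fields, and the function names are hypothetical; the point is only that a weaker source contributes a value when the stronger source left the field empty.

```go
package main

import "fmt"

// Device is a hypothetical record with a few string fields.
type Device struct {
	Model    string
	Serial   string
	Firmware string
}

// fillMissing copies the fallback value only when the field is still empty.
func fillMissing(dst *string, fallback string) {
	if *dst == "" && fallback != "" {
		*dst = fallback
	}
}

// enrich starts from the stronger source and fills gaps from the weaker one.
func enrich(primary, secondary Device) Device {
	out := primary
	fillMissing(&out.Model, secondary.Model)
	fillMissing(&out.Serial, secondary.Serial)
	fillMissing(&out.Firmware, secondary.Firmware)
	return out
}

func main() {
	primary := Device{Model: "Model A", Serial: ""}
	secondary := Device{Model: "Generic Device", Serial: "SN123", Firmware: "1.0"}
	fmt.Printf("%+v\n", enrich(primary, secondary))
}
```

Here `Model` stays `Model A` because the primary source already provided it, while `Serial` and `Firmware` are taken from the secondary source.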

### Unraid (`unraid`)

**Archive format:** Unraid diagnostics archive contents (text-heavy diagnostics directories).

**Detection:** Combines filename/path markers (`diagnostics-*`, `unraid-*.txt`, `vars.txt`) with content markers (e.g. `Unraid kernel build`, parity data markers).

**Extracted data (current):**

- Board / BIOS metadata (from motherboard/system files)
- CPU summary (from `lscpu.txt`)
- Memory modules (from diagnostics memory file)
- Storage devices (from `vars.txt` + SMART files)
- Syslog events

---

### H3C SDS G5 (`h3c_g5`)

**Status:** Ready (v1.0.0). Tested on H3C UniServer R4900 G5 SDS archives.

**Archive format:** `.sds` (tar archive)

**Detection:** `hardware_info.ini`, `hardware.info`, `firmware_version.ini`, `user/test*.csv`, plus H3C markers.

**Extracted data (current):**

- Board/FRU inventory (`FRUInfo.ini`, `board_info.ini`)
- Firmware list (`firmware_version.ini`)
- CPU inventory (`hardware_info.ini`)
- Memory DIMM inventory (`hardware_info.ini`)
- Storage inventory (`hardware.info`, `storage_disk.ini`, `NVMe_info.txt`, RAID text enrichments)
- Logical RAID volumes (`raid.json`, `Storage_RAID-*.txt`)
- Sensor snapshot (`sensor_info.ini`)
- SEL events (`user/test.csv`, `user/test1.csv`, fallback `Sel.json` / `sel_list.txt`)

---

### H3C SDS G6 (`h3c_g6`)

**Status:** Ready (v1.0.0). Tested on H3C UniServer R4700 G6 SDS archives.

**Archive format:** `.sds` (tar archive)

**Detection:** `CPUDetailInfo.xml`, `MemoryDetailInfo.xml`, `firmware_version.json`, `Sel.json`, plus H3C markers.

**Extracted data (current):**

- Board/FRU inventory (`FRUInfo.ini`, `board_info.ini`)
- Firmware list (`firmware_version.json`)
- CPU inventory (`CPUDetailInfo.xml`)
- Memory DIMM inventory (`MemoryDetailInfo.xml`)
- Storage inventory + capacity/model/interface (`storage_disk.ini`, `Storage_RAID-*.txt`, `NVMe_info.txt`)
- Logical RAID volumes (`raid.json`, fallback from `Storage_RAID-*.txt` when available)
- Sensor snapshot (`sensor_info.ini`)
- SEL events (`user/Sel.json`, fallback `user/sel_list.txt`)

---

### Generic text fallback (`generic`)

**Status:** Ready (v1.0.0).

**Confidence:** 15 (lowest; selected only when no other parser scores higher)

**Purpose:** Fallback for any text file or single `.gz` file not matching a specific vendor.

**Behavior:**

- If the filename matches `nvidia-bug-report-*.log.gz`: extracts driver version and GPU list.
- Otherwise: confirms the file is text (not binary) and records a basic "Text File" event.
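
The two checks in the behavior list could be approximated as shown below. This is a heuristic sketch, not the fallback parser's actual code: `isBugReport` and `isLikelyText` are hypothetical helper names, and treating "no NUL bytes and valid UTF-8" as "text" is an assumption that will reject some legacy encodings and any sample cut in the middle of a multi-byte character.

```go
package main

import (
	"bytes"
	"fmt"
	"path"
	"unicode/utf8"
)

// isBugReport reports whether the base filename matches the NVIDIA bug
// report pattern named in this section.
func isBugReport(name string) bool {
	ok, err := path.Match("nvidia-bug-report-*.log.gz", path.Base(name))
	return err == nil && ok
}

// isLikelyText treats a sample as text when it contains no NUL bytes and
// is valid UTF-8 (plain ASCII is a subset of UTF-8).
func isLikelyText(sample []byte) bool {
	if bytes.IndexByte(sample, 0) >= 0 {
		return false
	}
	return utf8.Valid(sample)
}

func main() {
	fmt.Println(isBugReport("uploads/nvidia-bug-report-node01.log.gz"))
	fmt.Println(isLikelyText([]byte("kernel: boot ok\n")))
	fmt.Println(isLikelyText([]byte{0x7f, 'E', 'L', 'F', 0x00}))
}
```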

---

## Supported vendor matrix

| Vendor | ID | Status | Tested on |
|--------|----|--------|-----------|
| Dell TSR | `dell` | Ready | TSR nested zip archives |
| Inspur / Kaytus | `inspur` | Ready | KR4268X2 onekeylog |
| NVIDIA HGX Field Diag | `nvidia` | Ready | Various HGX servers |
| NVIDIA Bug Report | `nvidia_bug_report` | Ready | H100 systems |
| Unraid | `unraid` | Ready | Unraid diagnostics archives |
| XigmaNAS | `xigmanas` | Ready | FreeBSD NAS logs |
| H3C SDS G5 | `h3c_g5` | Ready | H3C UniServer R4900 G5 SDS archives |
| H3C SDS G6 | `h3c_g6` | Ready | H3C UniServer R4700 G6 SDS archives |
| Generic fallback | `generic` | Ready | Any text file |