# 06 — Parsers

## Framework

Parsers live in `internal/parser/` and vendor implementations live in `internal/parser/vendors/`.

Core behavior:

- registration uses `init()` side effects
- all registered parsers run `Detect()`
- the highest-confidence parser wins
- generic fallback stays last and low-confidence

`VendorParser` contract:

```go
type VendorParser interface {
	Name() string
	Vendor() string
	Version() string
	Detect(files []ExtractedFile) int
	Parse(files []ExtractedFile) (*models.AnalysisResult, error)
}
```

## Adding a parser

1. Create `internal/parser/vendors//`
2. Start from `internal/parser/vendors/template/parser.go.template`
3. Implement `Detect()` and `Parse()`
4. Add a blank import in `internal/parser/vendors/vendors.go`
5. Add at least one positive and one negative detection test

## Data quality rules

### System firmware only in `hardware.firmware`

`hardware.firmware` must contain system-level firmware only. Device-bound firmware belongs on the device record and must not be duplicated at the top level.

### Strip embedded MAC addresses from model names

If a source embeds ` - XX:XX:XX:XX:XX:XX` in a model/name field, remove that suffix before storing it.

### Use `pci.ids` for empty or generic PCI model names

When `vendor_id` and `device_id` are known but the model name is missing or generic, resolve the name via `internal/parser/vendors/pciids`.
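The MAC-suffix rule under "Data quality rules" can be sketched as a small helper. This is a minimal illustration, not the project's implementation: `StripMACSuffix` and the sample model name are hypothetical, and the pattern assumes the suffix is a dash followed by a colon-separated MAC at the very end of the string.

```go
package main

import (
	"fmt"
	"regexp"
)

// macSuffix matches a trailing " - XX:XX:XX:XX:XX:XX", where each XX is two
// hex digits. Assumption: the MAC is colon-separated and sits at the end of
// the field; extra whitespace around the dash is tolerated.
var macSuffix = regexp.MustCompile(`\s+-\s+(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\s*$`)

// StripMACSuffix is a hypothetical helper that removes an embedded MAC
// suffix from a model or name field and leaves other values untouched.
func StripMACSuffix(model string) string {
	return macSuffix.ReplaceAllString(model, "")
}

func main() {
	// Illustrative input only; not taken from a real vendor archive.
	fmt.Println(StripMACSuffix("Example NIC 25GbE - 0C:42:A1:00:11:22"))
	fmt.Println(StripMACSuffix("Example NIC 25GbE"))
}
```

Anchoring the pattern at the end of the string keeps names that merely contain a dash (for example a revision suffix) unchanged.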
## Active vendor coverage

| Vendor ID | Input family | Notes |
|-----------|--------------|-------|
| `dell` | TSR ZIP archives | Broad hardware, firmware, sensors, lifecycle events |
| `h3c_g5` | H3C SDS G5 bundles | INI/XML/CSV-driven hardware and event parsing |
| `h3c_g6` | H3C SDS G6 bundles | Similar flow with G6-specific files |
| `inspur` | onekeylog archives | FRU/SDR plus optional Redis enrichment |
| `nvidia` | HGX Field Diagnostics | GPU- and fabric-heavy diagnostic input |
| `nvidia_bug_report` | `nvidia-bug-report-*.log.gz` | dmidecode, lspci, NVIDIA driver sections |
| `unraid` | Unraid diagnostics/log bundles | Server and storage-focused parsing |
| `xigmanas` | XigmaNAS plain logs | FreeBSD/NAS-oriented inventory |
| `generic` | fallback | Low-confidence text fallback when nothing else matches |

## Practical guidance

- Be conservative with high detect scores
- Prefer filling missing fields over overwriting stronger source data
- Keep parser version constants current when behavior changes
- Any new vendor-specific filtering or dedup logic must ship with tests for that vendor format

---

### Unraid (`unraid`)

**Archive format:** Unraid diagnostics archive contents (text-heavy diagnostics directories).

**Detection:** Combines filename/path markers (`diagnostics-*`, `unraid-*.txt`, `vars.txt`) with content markers (e.g. `Unraid kernel build`, parity data markers).

**Extracted data (current):**

- Board / BIOS metadata (from motherboard/system files)
- CPU summary (from `lscpu.txt`)
- Memory modules (from diagnostics memory file)
- Storage devices (from `vars.txt` + SMART files)
- Syslog events

---

### H3C SDS G5 (`h3c_g5`)

**Status:** Ready (v1.0.0). Tested on H3C UniServer R4900 G5 SDS archives.

**Archive format:** `.sds` (tar archive)

**Detection:** `hardware_info.ini`, `hardware.info`, `firmware_version.ini`, `user/test*.csv`, plus H3C markers.

**Extracted data (current):**

- Board/FRU inventory (`FRUInfo.ini`, `board_info.ini`)
- Firmware list (`firmware_version.ini`)
- CPU inventory (`hardware_info.ini`)
- Memory DIMM inventory (`hardware_info.ini`)
- Storage inventory (`hardware.info`, `storage_disk.ini`, `NVMe_info.txt`, RAID text enrichments)
- Logical RAID volumes (`raid.json`, `Storage_RAID-*.txt`)
- Sensor snapshot (`sensor_info.ini`)
- SEL events (`user/test.csv`, `user/test1.csv`, fallback `Sel.json` / `sel_list.txt`)

---

### H3C SDS G6 (`h3c_g6`)

**Status:** Ready (v1.0.0). Tested on H3C UniServer R4700 G6 SDS archives.

**Archive format:** `.sds` (tar archive)

**Detection:** `CPUDetailInfo.xml`, `MemoryDetailInfo.xml`, `firmware_version.json`, `Sel.json`, plus H3C markers.

**Extracted data (current):**

- Board/FRU inventory (`FRUInfo.ini`, `board_info.ini`)
- Firmware list (`firmware_version.json`)
- CPU inventory (`CPUDetailInfo.xml`)
- Memory DIMM inventory (`MemoryDetailInfo.xml`)
- Storage inventory + capacity/model/interface (`storage_disk.ini`, `Storage_RAID-*.txt`, `NVMe_info.txt`)
- Logical RAID volumes (`raid.json`, fallback from `Storage_RAID-*.txt` when available)
- Sensor snapshot (`sensor_info.ini`)
- SEL events (`user/Sel.json`, fallback `user/sel_list.txt`)

---

### Generic text fallback (`generic`)

**Status:** Ready (v1.0.0).

**Confidence:** 15 (lowest; only matches if no other parser scores higher)

**Purpose:** Fallback for any text file or single `.gz` file not matching a specific vendor.

**Behavior:**

- If filename matches `nvidia-bug-report-*.log.gz`: extracts driver version and GPU list.
- Otherwise: confirms file is text (not binary) and records a basic "Text File" event.
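To illustrate why a fixed confidence of 15 makes the generic parser a fallback, here is a minimal sketch of highest-score selection. It is an assumption about how selection could work, not the project's code: `detector`, `fixedScore`, and `pickBest` are hypothetical names, `ExtractedFile` is a stand-in with a single assumed field, and the score of 80 for `h3c_g5` is purely illustrative.

```go
package main

import "fmt"

// ExtractedFile is a stand-in for the project's type; only Path is assumed.
type ExtractedFile struct {
	Path string
}

// detector is the subset of the VendorParser contract needed for selection.
type detector interface {
	Name() string
	Detect(files []ExtractedFile) int
}

// fixedScore is a hypothetical parser that always reports the same confidence.
type fixedScore struct {
	name  string
	score int
}

func (f fixedScore) Name() string {
	return f.name
}

func (f fixedScore) Detect(files []ExtractedFile) int {
	return f.score
}

// pickBest runs Detect on every parser and returns the highest scorer.
// Assumptions: a score of 0 or less means "no match", and on a tie the
// parser that appears earlier in the slice is kept.
func pickBest(parsers []detector, files []ExtractedFile) (detector, int) {
	var best detector
	bestScore := 0
	for _, p := range parsers {
		if s := p.Detect(files); s > bestScore {
			best, bestScore = p, s
		}
	}
	return best, bestScore
}

func main() {
	files := []ExtractedFile{{Path: "hardware_info.ini"}}
	parsers := []detector{
		fixedScore{name: "h3c_g5", score: 80}, // illustrative score
		fixedScore{name: "generic", score: 15},
	}
	if p, s := pickBest(parsers, files); p != nil {
		fmt.Println(p.Name(), s)
	}
}
```

Under these assumptions, any vendor parser that scores above 15 displaces the fallback, and the fallback is chosen only when every other parser scores lower.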
---

## Supported vendor matrix

| Vendor | ID | Status | Tested on |
|--------|----|--------|-----------|
| Dell TSR | `dell` | Ready | TSR nested zip archives |
| Inspur / Kaytus | `inspur` | Ready | KR4268X2 onekeylog |
| NVIDIA HGX Field Diag | `nvidia` | Ready | Various HGX servers |
| NVIDIA Bug Report | `nvidia_bug_report` | Ready | H100 systems |
| Unraid | `unraid` | Ready | Unraid diagnostics archives |
| XigmaNAS | `xigmanas` | Ready | FreeBSD NAS logs |
| H3C SDS G5 | `h3c_g5` | Ready | H3C UniServer R4900 G5 SDS archives |
| H3C SDS G6 | `h3c_g6` | Ready | H3C UniServer R4700 G6 SDS archives |
| Generic fallback | `generic` | Ready | Any text file |