# 06 - Parsers

## Framework

### Registration

Each vendor parser registers itself via Go's `init()` side-effect import pattern.
All registrations are collected in `internal/parser/vendors/vendors.go`:

```go
import (
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
	_ "git.mchus.pro/mchus/logpile/internal/parser/vendors/dell"
	// etc.
)
```

### VendorParser interface

```go
type VendorParser interface {
	Name() string                     // human-readable name
	Vendor() string                   // vendor identifier string
	Version() string                  // parser version (increment on logic changes)
	Detect(files []ExtractedFile) int // confidence 0–100
	Parse(files []ExtractedFile) (*models.AnalysisResult, error)
}
```

### Selection logic

All registered parsers run `Detect()` against the uploaded archive's file list.
The parser with the **highest confidence score** is selected.
Multiple parsers may return >0; only the top scorer is used.

### Adding a new vendor parser

1. `mkdir -p internal/parser/vendors/VENDORNAME`
2. Copy `internal/parser/vendors/template/parser.go.template` as a starting point.
3. Implement `Detect()` and `Parse()`.
4. Add a blank import to `vendors/vendors.go`.

`Detect()` tips:

- Look for unique filenames or directory names.
- Check file content for vendor-specific markers.
- Return 70+ only when confident; return 0 if clearly not a match.

### Parser versioning

Each parser file contains a `parserVersion` constant.
Increment the version whenever parsing logic changes; this helps trace which version produced a given result.

---

## Parser data quality rules

### FirmwareInfo: system-level only

`Hardware.Firmware` must contain **only system-level firmware**: BIOS, BMC/iDRAC, Lifecycle Controller, CPLD, storage controllers, BOSS adapters.

**Device-bound firmware** (NIC, GPU, PSU, disk, backplane) **must NOT be added to `Hardware.Firmware`**.
It belongs to the device's own `Firmware` field and is already present there.
Duplicating it in `Hardware.Firmware` causes double entries in Reanimator.
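
As a rough illustration of this rule, the sketch below keeps device-bound entries out of a system-level firmware list by checking the FQDD prefix. The `FirmwareInfo` struct here is a simplified stand-in containing only the two fields named in this document, and `isDeviceBound` and `systemFirmwareOnly` are hypothetical helper names, not functions from the codebase. The prefix list is the device-bound set given in this section.

```go
package main

import "strings"

// FirmwareInfo is a simplified stand-in for the real model type; it holds
// only the two fields this document mentions.
type FirmwareInfo struct {
	DeviceName  string
	Description string // FQDD or equivalent slot identifier
}

// deviceBoundPrefixes lists the FQDD prefixes this document treats as
// device-bound.
var deviceBoundPrefixes = []string{"NIC.", "PSU.", "Disk.", "RAID.Backplane.", "GPU."}

// isDeviceBound reports whether an FQDD starts with a device-bound prefix.
func isDeviceBound(fqdd string) bool {
	for _, prefix := range deviceBoundPrefixes {
		if strings.HasPrefix(fqdd, prefix) {
			return true
		}
	}
	return false
}

// systemFirmwareOnly returns the entries whose FQDD is not device-bound,
// i.e. the candidates for Hardware.Firmware (hypothetical helper).
func systemFirmwareOnly(entries []FirmwareInfo) []FirmwareInfo {
	var out []FirmwareInfo
	for _, entry := range entries {
		if !isDeviceBound(entry.Description) {
			out = append(out, entry)
		}
	}
	return out
}
```

With this shape, an entry whose `Description` is `NIC.Slot.1-1-1` is dropped from the system-level list, while `BIOS.Setup.1-1` and a storage controller such as `RAID.Integrated.1-1` are kept.
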

The Reanimator exporter filters by `FirmwareInfo.DeviceName` prefix and by `FirmwareInfo.Description` (FQDD prefix). Parsers must cooperate:

- Store the device's FQDD (or equivalent slot identifier) in `FirmwareInfo.Description` for all firmware entries that come from a per-device inventory source (e.g. Dell `DCIM_SoftwareIdentity`).
- FQDD prefixes that are device-bound: `NIC.`, `PSU.`, `Disk.`, `RAID.Backplane.`, `GPU.`

### NIC/device model names: strip embedded MAC addresses

Some vendors (confirmed: Dell TSR) embed the MAC address in the device model name field, e.g. `ProductName = "NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF - C4:70:BD:DB:56:08"`.

**Rule:** Strip any ` - XX:XX:XX:XX:XX:XX` suffix from model/name strings before storing them in `FirmwareInfo.DeviceName`, `NetworkAdapter.Model`, or any other model field.

Use `nicMACInModelRE` (defined in the Dell parser) or an equivalent regex:

```
\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$
```

This applies to **all** string fields used as device names or model identifiers.

### PCI device name enrichment via pci.ids

If a PCIe device, GPU, NIC, or any hardware component has a `vendor_id` + `device_id` but its model/name field is **empty or generic** (e.g. blank, equals the description, or is just a raw hex ID), the parser **must** attempt to resolve the human-readable model name from the embedded `pci.ids` database before storing the result.

**Rule:** When `Model` (or equivalent name field) is empty and both `VendorID` and `DeviceID` are non-zero, call the pciids lookup and use the result as the model name.

```go
// Example pattern: use in any parser that handles PCIe/GPU/NIC devices.
if strings.TrimSpace(device.Model) == "" && device.VendorID != 0 && device.DeviceID != 0 {
	if name := pciids.Lookup(device.VendorID, device.DeviceID); name != "" {
		device.Model = name
	}
}
```

This rule applies to all vendor parsers. The pciids package is available at `internal/parser/vendors/pciids`. See ADL-005 for the rationale.
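
The two naming rules in this section (strip a trailing MAC suffix, then fall back to a `pci.ids` lookup when the name is empty) can be combined in one normalization step. The following is a minimal sketch under stated assumptions: `stripMACSuffix` and `normalizeModel` are hypothetical helper names, the IDs are modeled as `int` because the real field types are not shown here, and the `lookup` parameter is a stand-in for the project's pciids lookup rather than its actual signature.

```go
package main

import (
	"regexp"
	"strings"
)

// macSuffixRE matches a trailing " - XX:XX:XX:XX:XX:XX" MAC address suffix,
// using the regex given in this document.
var macSuffixRE = regexp.MustCompile(`\s+-\s+([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$`)

// stripMACSuffix removes an embedded MAC suffix and surrounding whitespace.
func stripMACSuffix(model string) string {
	return strings.TrimSpace(macSuffixRE.ReplaceAllString(model, ""))
}

// normalizeModel strips any MAC suffix, then falls back to lookup when the
// name is still empty and both IDs are non-zero. lookup is a hypothetical
// stand-in for the pci.ids resolver.
func normalizeModel(raw string, vendorID, deviceID int, lookup func(vendorID, deviceID int) string) string {
	model := stripMACSuffix(raw)
	if model == "" && vendorID != 0 && deviceID != 0 {
		if name := lookup(vendorID, deviceID); name != "" {
			return name
		}
	}
	return model
}
```

Applied to the Dell example above, `stripMACSuffix` leaves `NVIDIA ConnectX-6 Lx 2x 25G SFP28 OCP3.0 SFF`; the hyphen inside `ConnectX-6` is untouched because the pattern requires whitespace around the dash and is anchored to the end of the string.
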

**Do not hardcode model name strings.** If a device is unknown today, it will be resolved automatically once `pci.ids` is updated.

---

## Vendor parsers

### Inspur / Kaytus (`inspur`)

**Status:** Ready. Tested on KR4268X2 (onekeylog format).

**Archive format:** `.tar.gz` onekeylog

**Primary source files:**

| File | Content |
|------|---------|
| `asset.json` | Base hardware inventory |
| `component.log` | Component list |
| `devicefrusdr.log` | FRU and SDR data |
| `onekeylog/runningdata/redis-dump.rdb` | Runtime enrichment (optional) |

**Redis RDB enrichment** (applied conservatively; fills missing fields only):

- GPU: `serial_number`, `firmware` (VBIOS/FW), runtime telemetry
- NIC: firmware, serial, part number (when text logs leave fields empty)

**Module structure:**

```
inspur/
  parser.go   - main parser + registration
  sdr.go      - sensor/SDR parsing
  fru.go      - FRU serial parsing
  asset.go    - asset.json parsing
  syslog.go   - syslog parsing
```

---

### Dell TSR (`dell`)

**Status:** Ready (v3.0). Tested on nested TSR archives with embedded `*.pl.zip`.

**Archive format:** `.zip` (outer archive + nested `*.pl.zip`)

**Primary source files:**

- `tsr/metadata.json`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_View.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_DCIM_SoftwareIdentity.xml`
- `tsr/hardware/sysinfo/inventory/sysinfo_CIM_Sensor.xml`
- `tsr/hardware/sysinfo/lcfiles/curr_lclog.xml`

**Extracted data:**

- Board/system identity and BIOS/iDRAC firmware
- CPU, memory, physical disks, virtual disks, PSU, NIC, PCIe
- GPU inventory (`DCIM_VideoView`) + GPU sensor enrichment (`DCIM_GPUSensor`)
- Controller/backplane inventory (`DCIM_ControllerView`, `DCIM_EnclosureView`)
- Sensor readings (temperature/voltage/current/power/fan/utilization)
- Lifecycle events (`curr_lclog.xml`)

---

### NVIDIA HGX Field Diagnostics (`nvidia`)

**Status:** Ready (v1.1.0). Works with any server vendor.

**Archive format:** `.tar` / `.tar.gz`

**Confidence scoring:**

| File | Score |
|------|-------|
| `unified_summary.json` with "HGX Field Diag" marker | +40 |
| `summary.json` | +20 |
| `summary.csv` | +15 |
| `gpu_fieldiag/` directory | +15 |

**Source files:**

| File | Content |
|------|---------|
| `output.log` | dmidecode: server manufacturer, model, serial number |
| `unified_summary.json` | GPU details, NVSwitch devices, PCI addresses |
| `summary.json` | Diagnostic test results and error codes |
| `summary.csv` | Alternative test results format |

**Extracted data:**

- GPUs: slot, model, manufacturer, firmware (VBIOS), BDF
- NVSwitch devices: slot, device_class, vendor_id, device_id, BDF, link speed/width
- Events: diagnostic test failures (connectivity, gpumem, gpustress, pcie, nvlink, nvswitch, power)

**Severity mapping:**

- `info`: tests passed
- `warning`: e.g. "Row remapping failed"
- `critical`: error codes 300+

**Known limitations:**

- Detailed logs in `gpu_fieldiag/*.log` are not parsed.
- No CPU, memory, or storage extraction (not present in field diag archives).

---

### NVIDIA Bug Report (`nvidia_bug_report`)

**Status:** Ready (v1.0.0).

**File format:** `nvidia-bug-report-*.log.gz` (gzip-compressed text)

**Confidence:** 85 (high priority for matching filename pattern)

**Source sections parsed:**

| dmidecode section | Extracts |
|-------------------|---------|
| System Information | server serial, UUID, manufacturer, product name |
| Processor Information | CPU model, serial, core/thread count, frequency |
| Memory Device | DIMM slot, size, type, manufacturer, serial, part number, speed |
| System Power Supply | PSU location, manufacturer, model, serial, wattage, firmware, status |

| Other source | Extracts |
|--------------|---------|
| `lspci -vvv` (Ethernet/Network/IB) | NIC model (from VPD), BDF, slot, P/N, S/N, port count, port type |
| `/proc/driver/nvidia/gpus/*/information` | GPU model, BDF, UUID, VBIOS version, IRQ |
| NVRM version line | NVIDIA driver version |

**Known limitations:**

- Driver error/warning log lines not yet extracted.
- GPU temperature/utilization metrics require additional parsing sections.

---

### XigmaNAS (`xigmanas`)

**Status:** Ready.

**Archive format:** Plain log files (FreeBSD-based NAS system)

**Detection:** Files named `xigmanas`, `system`, or `dmesg`; content containing "XigmaNAS" or "FreeBSD"; SMART data presence.

**Extracted data:**

- System: firmware version, uptime, CPU model, memory configuration, hardware platform
- Storage: disk models, serial numbers, capacity, health, SMART temperatures
- Populates: `Hardware.Firmware`, `Hardware.CPUs`, `Hardware.Memory`, `Hardware.Storage`, `Sensors`

---

### Unraid (`unraid`)

**Status:** Ready (v1.0.0).

**Archive format:** Unraid diagnostics archive contents (text-heavy diagnostics directories).

**Detection:** Combines filename/path markers (`diagnostics-*`, `unraid-*.txt`, `vars.txt`) with content markers (e.g. `Unraid kernel build`, parity data markers).

**Extracted data (current):**

- Board / BIOS metadata (from motherboard/system files)
- CPU summary (from `lscpu.txt`)
- Memory modules (from diagnostics memory file)
- Storage devices (from `vars.txt` + SMART files)
- Syslog events

---

### H3C SDS G5 (`h3c_g5`)

**Status:** Ready (v1.0.0). Tested on H3C UniServer R4900 G5 SDS archives.

**Archive format:** `.sds` (tar archive)

**Detection:** `hardware_info.ini`, `hardware.info`, `firmware_version.ini`, `user/test*.csv`, plus H3C markers.

**Extracted data (current):**

- Board/FRU inventory (`FRUInfo.ini`, `board_info.ini`)
- Firmware list (`firmware_version.ini`)
- CPU inventory (`hardware_info.ini`)
- Memory DIMM inventory (`hardware_info.ini`)
- Storage inventory (`hardware.info`, `storage_disk.ini`, `NVMe_info.txt`, RAID text enrichments)
- Logical RAID volumes (`raid.json`, `Storage_RAID-*.txt`)
- Sensor snapshot (`sensor_info.ini`)
- SEL events (`user/test.csv`, `user/test1.csv`, fallback `Sel.json` / `sel_list.txt`)

---

### H3C SDS G6 (`h3c_g6`)

**Status:** Ready (v1.0.0). Tested on H3C UniServer R4700 G6 SDS archives.

**Archive format:** `.sds` (tar archive)

**Detection:** `CPUDetailInfo.xml`, `MemoryDetailInfo.xml`, `firmware_version.json`, `Sel.json`, plus H3C markers.

**Extracted data (current):**

- Board/FRU inventory (`FRUInfo.ini`, `board_info.ini`)
- Firmware list (`firmware_version.json`)
- CPU inventory (`CPUDetailInfo.xml`)
- Memory DIMM inventory (`MemoryDetailInfo.xml`)
- Storage inventory + capacity/model/interface (`storage_disk.ini`, `Storage_RAID-*.txt`, `NVMe_info.txt`)
- Logical RAID volumes (`raid.json`, fallback from `Storage_RAID-*.txt` when available)
- Sensor snapshot (`sensor_info.ini`)
- SEL events (`user/Sel.json`, fallback `user/sel_list.txt`)

---

### Generic text fallback (`generic`)

**Status:** Ready (v1.0.0).

**Confidence:** 15 (lowest; only matches if no other parser scores higher)

**Purpose:** Fallback for any text file or single `.gz` file not matching a specific vendor.

**Behavior:**

- If filename matches `nvidia-bug-report-*.log.gz`: extracts driver version and GPU list.
- Otherwise: confirms file is text (not binary) and records a basic "Text File" event.

---

## Supported vendor matrix

| Vendor | ID | Status | Tested on |
|--------|----|--------|-----------|
| Dell TSR | `dell` | Ready | TSR nested zip archives |
| Inspur / Kaytus | `inspur` | Ready | KR4268X2 onekeylog |
| NVIDIA HGX Field Diag | `nvidia` | Ready | Various HGX servers |
| NVIDIA Bug Report | `nvidia_bug_report` | Ready | H100 systems |
| Unraid | `unraid` | Ready | Unraid diagnostics archives |
| XigmaNAS | `xigmanas` | Ready | FreeBSD NAS logs |
| H3C SDS G5 | `h3c_g5` | Ready | H3C UniServer R4900 G5 SDS archives |
| H3C SDS G6 | `h3c_g6` | Ready | H3C UniServer R4700 G6 SDS archives |
| Generic fallback | `generic` | Ready | Any text file |
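
Choosing among the parsers in this matrix follows the highest-confidence rule from the Framework section. The sketch below is one way that rule might look, not the project's actual selection code: `Detector` is a trimmed-down, hypothetical subset of `VendorParser`, `ExtractedFile` is a placeholder struct, `fixedDetector` is a test double, and keeping the earlier parser on a tie is an assumption, since tie-breaking is not specified in this document.

```go
package main

// ExtractedFile is a placeholder for the real type; only a path is modeled.
type ExtractedFile struct {
	Path string
}

// Detector is a hypothetical subset of VendorParser: just what selection needs.
type Detector interface {
	Name() string
	Detect(files []ExtractedFile) int // confidence 0-100
}

// selectParser returns the detector with the highest positive confidence,
// or nil when every detector returns 0. On a tie the earlier entry wins
// (an assumption made for this sketch).
func selectParser(parsers []Detector, files []ExtractedFile) Detector {
	var best Detector
	bestScore := 0
	for _, p := range parsers {
		if score := p.Detect(files); score > bestScore {
			best, bestScore = p, score
		}
	}
	return best
}

// fixedDetector is a test double that always reports the same confidence.
type fixedDetector struct {
	name  string
	score int
}

func (d fixedDetector) Name() string               { return d.name }
func (d fixedDetector) Detect([]ExtractedFile) int { return d.score }
```

With the scores documented above, a `generic` detector returning 15 loses to an `nvidia_bug_report` detector returning 85 for the same file, and wins only when every vendor-specific detector returns a lower score.
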