Files
logpile/docs/bible/06-parsers.md

7.6 KiB
Raw Blame History

06 — Parsers

Framework

Registration

Each vendor parser registers itself via Go's init() side-effect import pattern.

All registrations are collected in internal/parser/vendors/vendors.go:

import (
    _ "git.mchus.pro/mchus/logpile/internal/parser/vendors/inspur"
    _ "git.mchus.pro/mchus/logpile/internal/parser/vendors/supermicro"
    // etc.
)

VendorParser interface

type VendorParser interface {
    Name() string                                // human-readable name
    Vendor() string                              // vendor identifier string
    Version() string                             // parser version (increment on logic changes)
    Detect(files []ExtractedFile) int            // confidence 0100
    Parse(files []ExtractedFile) (*models.AnalysisResult, error)
}

Selection logic

All registered parsers run Detect() against the uploaded archive's file list. The parser with the highest confidence score is selected. Multiple parsers may return >0; only the top scorer is used.

Adding a new vendor parser

  1. mkdir -p internal/parser/vendors/VENDORNAME
  2. Copy internal/parser/vendors/template/parser.go.template as starting point.
  3. Implement Detect() and Parse().
  4. Add blank import to vendors/vendors.go.

Detect() tips:

  • Look for unique filenames or directory names.
  • Check file content for vendor-specific markers.
  • Return 70+ only when confident; return 0 if clearly not a match.

Parser versioning

Each parser file contains a parserVersion constant. Increment the version whenever parsing logic changes — this helps trace which version produced a given result.


Vendor parsers

Inspur / Kaytus (inspur)

Status: Ready. Tested on KR4268X2 (onekeylog format).

Archive format: .tar.gz onekeylog

Primary source files:

File Content
asset.json Base hardware inventory
component.log Component list
devicefrusdr.log FRU and SDR data
onekeylog/runningdata/redis-dump.rdb Runtime enrichment (optional)

Redis RDB enrichment (applied conservatively — fills missing fields only):

  • GPU: serial_number, firmware (VBIOS/FW), runtime telemetry
  • NIC: firmware, serial, part number (when text logs leave fields empty)

Module structure:

inspur/
  parser.go   — main parser + registration
  sdr.go      — sensor/SDR parsing
  fru.go      — FRU serial parsing
  asset.go    — asset.json parsing
  syslog.go   — syslog parsing

Supermicro (supermicro)

Status: Ready (v1.0.0). Tested on SYS-821GE-TNHR crash dumps.

Archive format: .tgz / .tar.gz / .tar

Primary source file: CDump.txt — JSON crashdump file

Confidence: +80 when CDump.txt contains crash_data, METADATA, bmc_fw_ver markers.

Extracted data:

  • CPUs: CPUID, core count, manufacturer (Intel), microcode version (as firmware field)
  • FRU: BMC firmware version, BIOS version, ME firmware version, CPU PPIN
  • Events: crashdump collection event + MCA errors

MCA error detection:

  • Bit 63 (Valid), Bit 61 (UC — uncorrected), Bit 60 (EN — enabled)
  • Corrected MCA errors → Warning severity
  • Uncorrected MCA errors → Critical severity

Known limitations:

  • TOR dump and extended MCA register data not yet parsed.
  • No CPU model name (only CPUID hex code available in crashdump format).

NVIDIA HGX Field Diagnostics (nvidia)

Status: Ready (v1.1.0). Works with any server vendor.

Archive format: .tar / .tar.gz

Confidence scoring:

File Score
unified_summary.json with "HGX Field Diag" marker +40
summary.json +20
summary.csv +15
gpu_fieldiag/ directory +15

Source files:

File Content
output.log dmidecode — server manufacturer, model, serial number
unified_summary.json GPU details, NVSwitch devices, PCI addresses
summary.json Diagnostic test results and error codes
summary.csv Alternative test results format

Extracted data:

  • GPUs: slot, model, manufacturer, firmware (VBIOS), BDF
  • NVSwitch devices: slot, device_class, vendor_id, device_id, BDF, link speed/width
  • Events: diagnostic test failures (connectivity, gpumem, gpustress, pcie, nvlink, nvswitch, power)

Severity mapping:

  • info — tests passed
  • warning — e.g. "Row remapping failed"
  • critical — error codes 300+

Known limitations:

  • Detailed logs in gpu_fieldiag/*.log are not parsed.
  • No CPU, memory, or storage extraction (not present in field diag archives).

NVIDIA Bug Report (nvidia_bug_report)

Status: Ready (v1.0.0).

File format: nvidia-bug-report-*.log.gz (gzip-compressed text)

Confidence: 85 (high priority for matching filename pattern)

Source sections parsed:

dmidecode section Extracts
System Information server serial, UUID, manufacturer, product name
Processor Information CPU model, serial, core/thread count, frequency
Memory Device DIMM slot, size, type, manufacturer, serial, part number, speed
System Power Supply PSU location, manufacturer, model, serial, wattage, firmware, status
Other source Extracts
lspci -vvv (Ethernet/Network/IB) NIC model (from VPD), BDF, slot, P/N, S/N, port count, port type
/proc/driver/nvidia/gpus/*/information GPU model, BDF, UUID, VBIOS version, IRQ
NVRM version line NVIDIA driver version

Known limitations:

  • Driver error/warning log lines not yet extracted.
  • GPU temperature/utilization metrics require additional parsing sections.

XigmaNAS (xigmanas)

Status: Ready.

Archive format: Plain log files (FreeBSD-based NAS system)

Detection: Files named xigmanas, system, or dmesg; content containing "XigmaNAS" or "FreeBSD"; SMART data presence.

Extracted data:

  • System: firmware version, uptime, CPU model, memory configuration, hardware platform
  • Storage: disk models, serial numbers, capacity, health, SMART temperatures
  • Populates: Hardware.Firmware, Hardware.CPUs, Hardware.Memory, Hardware.Storage, Sensors

Unraid (unraid)

Status: Ready (v1.0.0).

Archive format: Unraid diagnostics archive contents (text-heavy diagnostics directories).

Detection: Combines filename/path markers (diagnostics-*, unraid-*.txt, vars.txt) with content markers (e.g. Unraid kernel build, parity data markers).

Extracted data (current):

  • Board / BIOS metadata (from motherboard/system files)
  • CPU summary (from lscpu.txt)
  • Memory modules (from diagnostics memory file)
  • Storage devices (from vars.txt + SMART files)
  • Syslog events

Generic text fallback (generic)

Status: Ready (v1.0.0).

Confidence: 15 (lowest — only matches if no other parser scores higher)

Purpose: Fallback for any text file or single .gz file not matching a specific vendor.

Behavior:

  • If filename matches nvidia-bug-report-*.log.gz: extracts driver version and GPU list.
  • Otherwise: confirms file is text (not binary) and records a basic "Text File" event.

Supported vendor matrix

Vendor ID Status Tested on
Inspur / Kaytus inspur Ready KR4268X2 onekeylog
Supermicro supermicro Ready SYS-821GE-TNHR crashdump
NVIDIA HGX Field Diag nvidia Ready Various HGX servers
NVIDIA Bug Report nvidia_bug_report Ready H100 systems
Unraid unraid Ready Unraid diagnostics archives
XigmaNAS xigmanas Ready FreeBSD NAS logs
Generic fallback generic Ready Any text file