Commit Graph

23 Commits

Author SHA1 Message Date
Mikhail Chusavitin
27373aa104 feat: surface BMC collection errors in parse-errors panel and event log
When Inspur component.log sections return {"error":"...","code":N} instead
of hardware data, the parser now:
- stores them in AnalysisResult.CollectionErrors (new model field)
- mirrors each one into result.Events with Source="BMC/<section>"
  so the chart viewer event table shows the specific BMC module
- feeds them into /api/parse-errors as bmc_collection_error entries

UI adds a collapsible "Collection diagnostics" panel below the chart
iframe (outside /chart) that appears when /api/parse-errors returns
any items; resets on data clear.

Affected sections in this dump: HDD (1458), PCIe Devices (1458),
Network Adapters (1458), Disk Backplane.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 14:30:01 +03:00
Mikhail Chusavitin
4f7b5b826a fix(inspur): fix PSU section regex when PCIE section precedes Network
The PSU regex used "RESTful Network" as its end anchor, but in standard
Inspur component.log layout the PCIE Device section sits between PSU and
Network Adapter. The lazy [\s\S]*? captured across the PCIE error block,
producing invalid JSON and silently dropping all PSU data.

Changed anchor to RESTful (?:PCIE|Network) — matches whichever section
immediately follows PSU in a given archive.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 14:21:13 +03:00
Mikhail Chusavitin
dfd64550cf fix(inspur): infer DIMM size from part number when BMC reports size=0
When BMC firmware fails to read capacity for a present DIMM, size_mb stays
0. If another DIMM with the same part number in the same batch has a known
size, use it to fill the gap.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 14:18:15 +03:00
Mikhail Chusavitin
9505303d1d fix(inspur): show microcode version for every CPU, not just the first
Dedup by version caused CPU1 Microcode to be omitted when both CPUs run
the same version, leaving the firmware column blank for the second socket.
Each CPU gets its own firmware entry keyed by index.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 14:15:28 +03:00
Mikhail Chusavitin
f2c04cf0e8 fix(inspur): parse CPUs from component.log and fix DIMM present detection
Two bugs in onekeylog archives that lack asset.json:

- CPU count was always 0: ParseComponentLog never parsed the "RESTful CPU
  info" section. Added parseCPUInfo as a fallback when hw.CPUs is empty
  (asset.json remains the primary source when present). Also worked around
  a Go JSON case-insensitive collision between "proc_id" (int) and
  "PROC_ID" (string CPUID) by adding an explicit PROC_ID field with an
  exact-case tag.

- Only 1 of 2 DIMMs shown: Present condition required mem_mod_size > 0,
  but some BMC firmware reports size=0 for a physically installed module
  while still providing serial and part number. Now treats a DIMM as
  present when status=1 and any of size/serial/partnum is non-empty.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 14:13:55 +03:00
Mikhail Chusavitin
476630190d export: align reanimator contract v2.7 2026-03-15 23:27:32 +03:00
Mikhail Chusavitin
9007f1b360 export: align reanimator and enrich redfish metrics 2026-03-15 21:38:28 +03:00
Mikhail Chusavitin
9df29b1be9 fix: dedup GPUs across multiple chassis PCIeDevice trees in Redfish collector
Supermicro HGX exposes each GPU under both Chassis/1/PCIeDevices and a
dedicated Chassis/HGX_GPU_SXM_N/PCIeDevices. gpuDocDedupKey was keying
by @odata.id path, so identical GPUs with the same serial were not
deduplicated across sources. Now stable identifiers (serial → BDF →
slot+model) take priority over path.

Also includes Inspur parser improvements: NVMe model/serial enrichment
from devicefrusdr.log and audit.log, RAID drive slot normalization to
BP notation, PSU slot normalization, BMC/CPLD/VR firmware from RESTful
version info section, and parser version bump to 1.8.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 14:44:36 +03:00
8d80048117 redfish: MSI support, fix zero dates, BMC MAC, Assembly FRU, crawler cleanup
- Add MSI CG480-S5063 (H100 SXM5) support:
  - collectGPUsFromProcessors: find GPUs via Processors/ProcessorType=GPU,
    resolve serials from Chassis/<GpuId>
  - looksLikeGPU: skip Description="Display Device" PCIe sidecars
  - isVirtualStorageDrive: filter AMI virtual USB drives (0-byte)
  - enrichNICMACsFromNetworkDeviceFunctions: pull MACs for MSI NICs
  - parseCPUs: filter by ProcessorType, parse Socket, L1/L2/L3 from ProcessorMemory
  - parseMemory: Location.PartLocation.ServiceLabel slot fallback
  - shouldCrawlPath: block /SubProcessors subtrees
- Fix status_checked_at/status_changed_at serializing as 0001-01-01:
  change all StatusCheckedAt/StatusChangedAt fields to *time.Time
- Redfish crawler cleanup:
  - Block non-inventory branches: AccountService, CertificateService,
    EventService, Registries, SessionService, TaskService, manager config paths,
    OperatingConfigs, BootOptions, HostPostCode, Bios/Settings, OEM KVM paths
  - Add Assembly to critical endpoints (FRU data)
  - Remove BootOptions from priority seeds
- collectBMCMAC: read BMC MAC from Managers/*/EthernetInterfaces
- collectAssemblyFRU: extract FRU serial/part from Chassis/*/Assembly
- Firmware: remove NetworkProtocol noise, fix SecureBoot field,
  filter BMCImageN redundant backup slots

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 08:12:17 +03:00
21ea129933 misc: sds format support, convert limits, dell dedup, supermicro removal, bible updates
Parser / archive:
- Add .sds extension as tar-format alias (archive.go)
- Add tests for multipart upload size limits (multipart_limits_test.go)
- Remove supermicro crashdump parser (ADL-015)

Dell parser:
- Remove GPU duplicates from PCIeDevices (DCIM_VideoView vs DCIM_PCIDeviceView
  both list the same GPU; VideoView record is authoritative)

Server:
- Add LOGPILE_CONVERT_MAX_MB env var for independent convert batch size limit
- Improve "file too large" error message with current limit value

Web:
- Add CONVERT_MAX_FILES_PER_BATCH = 1000 cap
- Minor UI copy and CSS fixes

Bible:
- bible-local/06-parsers.md: add pci.ids enrichment rule (enrich model from
  pciids when name is empty but vendor_id+device_id are present)
- Sync bible submodule and local overview/architecture docs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 22:23:44 +03:00
9c5512d238 dell: strip MAC from model names; fix device-bound firmware in dell/inspur
- Dell NICView: strip " - XX:XX:XX:XX:XX:XX" suffix from ProductName
  (Dell TSR embeds MAC in this field for every NIC port)
- Dell SoftwareIdentity: same strip applied to ElementName; store FQDD
  in FirmwareInfo.Description so exporter can filter device-bound entries
- Exporter: add isDeviceBoundFirmwareFQDD() to filter firmware entries
  whose Description matches NIC./PSU./Disk./RAID.Backplane./GPU. FQDD
  prefixes (prevents device firmware from appearing in hardware.firmware)
- Exporter: extend isDeviceBoundFirmwareName() to filter HGX GPU/NVSwitch
  firmware inventory IDs (_fw_gpu_, _fw_nvswitch_, _inforom_gpu_)
- Inspur: remove HDD firmware from Hardware.Firmware — already present
  in Storage.Firmware, duplicating it violates ADL-016
- bible-local/06-parsers.md: document firmware and MAC stripping rules
- bible-local/10-decisions.md: add ADL-016 (device-bound firmware) and
  ADL-017 (vendor-embedded MAC in model name fields)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 22:07:53 +03:00
4940cd9645 sync file-type support across upload/convert and fix collected_at timezone handling 2026-02-28 23:27:49 +03:00
0252264ddc parser: fallback zone-less source timestamps to Europe/Moscow 2026-02-28 22:17:00 +03:00
e0146adfff Improve Redfish recovery flow and raw export timing diagnostics 2026-02-28 16:55:58 +03:00
9a30705c9a improve redfish collection progress and robust hardware dedup/serial parsing 2026-02-28 16:07:42 +03:00
758fa66282 feat: improve inspur parsing and pci.ids integration 2026-02-17 18:09:36 +03:00
514da76ddb Update Inspur parsing and align release docs 2026-02-15 23:13:47 +03:00
5e49adaf05 Update parser and project changes 2026-02-15 22:02:07 +03:00
Mikhail Chusavitin
21f4e5a67e v1.2.0: Enhanced Inspur/Kaytus parser with GPU, PCIe, and storage support
Major improvements:
- Add CSV SEL event parser for Kaytus firmware format
- Add PCIe device parser with link speed/width detection
- Add GPU temperature and PCIe link monitoring
- Add disk backplane parser for storage bay information
- Fix memory module detection (only show installed DIMMs)

Parser enhancements:
- Parse RESTful PCIe Device info (max/current link width/speed)
- Parse GPU sensor data (core and memory temperatures)
- Parse diskbackplane info (slot count, installed drives)
- Parse SEL events from CSV format (selelist.csv)
- Fix memory Present status logic (check mem_mod_status)

Web interface improvements:
- Add PCIe link degradation highlighting (red when current < max)
- Add storage table with Present status and location
- Update memory specification to show only installed modules with frequency
- Sort events from newest to oldest
- Filter out N/A serial numbers from display

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-30 12:30:18 +03:00
c7422e95aa v1.1.0: Parser versioning, server info, auto-browser, section overviews
- Add parser versioning with Version() method and version display on main screen
- Add server model and serial number to Configuration tab and TXT export
- Add auto-browser opening on startup with --no-browser flag
- Add Restart and Exit buttons with graceful shutdown
- Add section overview stats (CPU, Power, Storage, GPU, Network)
- Change PCIe Link display to "x16 PCIe Gen4" format
- Add Location column to Serials section
- Extract BoardInfo from FRU and PlatformId from ThermalConfig

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 13:49:43 +03:00
83378fa761 Add firmware versions for all components
Extract firmware from:
- asset.json: BIOS, ME, BKC, CPU Microcode, HDD/SSD/NVMe
- component.log: PSU firmware, Network adapter firmware

Deduplicate entries to avoid showing same firmware twice.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 09:24:36 +03:00
c243b4e141 Add detailed hardware configuration view with sub-tabs
- Redesign config page with tabs: Spec, CPU, Memory, Power, Storage, GPU, Network, PCIe
- Parse detailed memory info from component.log with all fields:
  Location, Present, Size, Type, Max/Current Speed, Manufacturer, Part Number, Status
- Add GPU model extraction from PCIe devices
- Add NetworkAdapter model with detailed fields from RESTful API
- Update PSU model with power metrics (input/output power, voltage, temperature)
- Memory modules with 0GB size (failed) highlighted in warning color
- Add memory overview stats (total GB, installed count, active count)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 09:11:44 +03:00
512957545a Add LOGPile BMC diagnostic log analyzer
Features:
- Modular parser architecture for vendor-specific formats
- Inspur/Kaytus parser supporting asset.json, devicefrusdr.log,
  component.log, idl.log, and syslog files
- PCI Vendor/Device ID lookup for hardware identification
- Web interface with tabs: Events, Sensors, Config, Serials, Firmware
- Server specification summary with component grouping
- Export to CSV, JSON, TXT formats
- BMC alarm parsing from IDL logs (memory errors, PSU events, etc.)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 04:11:23 +03:00