AHS files can exceed 100 MB; the previous 10 MB universal cap silently
truncated them and caused incomplete event parsing. Per-extension limits
are now used: .ahs gets 1 GB, all other single-file types keep 10 MB.
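
A minimal sketch of the per-extension lookup, for illustration only. The helper name maxUploadSize, the constant names, the case-insensitive extension match, and the use of binary units (MiB/GiB) are assumptions, not taken from the actual change:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

const (
	defaultMaxSize int64 = 10 << 20 // cap kept for all other single-file types
	ahsMaxSize     int64 = 1 << 30  // raised cap for .ahs archives
)

// maxUploadSize is a hypothetical helper that picks the size cap
// from the file extension instead of applying one universal limit.
func maxUploadSize(name string) int64 {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".ahs":
		return ahsMaxSize
	default:
		return defaultMaxSize
	}
}

func main() {
	fmt.Println(maxUploadSize("server.ahs"), maxUploadSize("bmc.log"))
}
```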
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When the BMC HDD API returns an empty array (RAID controller attached via
PCIe, e.g. PM8204-2GB), disk serial numbers are now recovered from smartd
startup messages in SOLHostCapture.log.
Enrichment runs in three passes: model-match on existing slots, positional
fill of empty backplane placeholders, then new entries for any remainder.
Both log/ and runningdata/var/ copies are merged with serial deduplication.
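
The three passes could look roughly like the sketch below. The Disk and SmartdDisk types, the function name, and the exact matching rules (case-insensitive model comparison, "empty placeholder" meaning no model and no serial) are assumptions made for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// Disk and SmartdDisk are hypothetical stand-ins for the real models.
type Disk struct{ Slot, Model, Serial string }
type SmartdDisk struct{ Model, Serial string }

// enrichFromSmartd merges serials found in smartd startup messages
// into the inventory in three passes.
func enrichFromSmartd(disks []Disk, found []SmartdDisk) []Disk {
	// Deduplicate by serial (both log copies may list the same drive).
	seen := map[string]bool{}
	for _, d := range disks {
		if d.Serial != "" {
			seen[d.Serial] = true
		}
	}
	var pending []SmartdDisk
	for _, f := range found {
		if f.Serial == "" || seen[f.Serial] {
			continue
		}
		seen[f.Serial] = true
		pending = append(pending, f)
	}

	// Pass 1: model-match on existing slots that lack a serial.
	var rest []SmartdDisk
	for _, f := range pending {
		matched := false
		for i := range disks {
			if disks[i].Serial == "" && disks[i].Model != "" &&
				strings.EqualFold(disks[i].Model, f.Model) {
				disks[i].Serial = f.Serial
				matched = true
				break
			}
		}
		if !matched {
			rest = append(rest, f)
		}
	}

	// Pass 2: positional fill of empty backplane placeholders.
	pending, rest = rest, nil
	for _, f := range pending {
		filled := false
		for i := range disks {
			if disks[i].Serial == "" && disks[i].Model == "" {
				disks[i].Model, disks[i].Serial = f.Model, f.Serial
				filled = true
				break
			}
		}
		if !filled {
			rest = append(rest, f)
		}
	}

	// Pass 3: new entries for any remainder.
	for _, f := range rest {
		disks = append(disks, Disk{Model: f.Model, Serial: f.Serial})
	}
	return disks
}

func main() {
	inv := []Disk{{Slot: "0", Model: "A"}, {Slot: "1"}}
	got := enrichFromSmartd(inv, []SmartdDisk{{"A", "S1"}, {"B", "S2"}, {"B", "S2"}, {"C", "S3"}})
	fmt.Printf("%+v\n", got)
}
```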
Parser version bumped to 2.1.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- identifier-normalization: use strings.EqualFold in h3c/parser.go
- import-export: CSV now uses UTF-8 BOM and semicolon delimiter
- go-code-style: translate all Russian source strings to English (ADL-007)
- go-background-tasks: add Type, Message, Result fields to Job struct
- go-api: wrap list endpoints in {items, total_count, page, per_page, total_pages}
- module-structure: rename helpers.go → context_sleep.go
- build-version-display: htmlError renders version footer on error pages
- go-logging: migrate all log.Printf calls to log/slog with structured attrs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
NF-series storage servers (e.g. NF5280M6) have no GPU/outboard-PCIe
topology, so the previous score gate (topologyScore==0 || boardScore==0
→ return 0) always produced score=0 despite SystemManufacturer="Inspur"
being available. These servers fell into mode=fallback, activating the
AMI profile and probing /Oem/Ami paths that don't exist on the BMC.
Add manufacturer-based detection: SystemManufacturer or
ChassisManufacturer containing "inspur" contributes 60 points —
enough to enter matched mode on its own. GPU servers with full
topology+board signals still score higher as before.
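
Roughly, the scoring change can be pictured as below. The function name, the additive combination with the topology and board scores, and the absence of a cap are assumptions for illustration; only the 60-point manufacturer contribution and the removed zero gate come from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

const manufacturerPoints = 60 // contribution named in the change description

// inspurScore is a hypothetical scorer. The previous gate returned 0
// whenever topologyScore or boardScore was 0; here the manufacturer
// signal contributes on its own and the other signals add on top.
func inspurScore(systemMfr, chassisMfr string, topologyScore, boardScore int) int {
	score := topologyScore + boardScore
	if containsFold(systemMfr, "inspur") || containsFold(chassisMfr, "inspur") {
		score += manufacturerPoints
	}
	return score
}

// containsFold reports whether s contains sub, ignoring ASCII/Unicode case.
func containsFold(s, sub string) bool {
	return strings.Contains(strings.ToLower(s), strings.ToLower(sub))
}

func main() {
	fmt.Println(inspurScore("Inspur", "", 0, 0))   // storage server, no GPU topology
	fmt.Println(inspurScore("Inspur", "", 25, 25)) // GPU server with extra signals
}
```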
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a PDF button to the report header. Clicking it opens
/chart/current?print=true in a new tab, which auto-triggers
window.print() so the user can save to PDF via the browser dialog.
- chart submodule bumped: PrintMode support (no filter JS, auto-print,
@media print CSS)
- handlers.go: passes PrintMode=true when ?print=true query param is set
- index.html: PDF button alongside Raw Data / Reanimator
- app.js: printReport() helper; button shown/hidden with other exports
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three related fixes for IDL event processing:
1. idl.go: include EventType in dedup key so Deassert events are no
longer silently dropped as duplicates of their Assert counterparts.
2. gpu_status.go: treat Deassert events as clearing all GPU faults —
previously the code re-applied the same faulty GPU set from the
description, leaving GPUs stuck in Critical even after alarm cleared.
3. reanimator_models/converter: add bmc_event_summary section to the
Reanimator export — a deduplicated Critical/Warning event table with
Active/Resolved status derived from Assert/Deassert pairs.
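
A sketch of fixes 1 and 3, assuming a simplified Event shape (the field names and the "|"-joined key format are illustrative, not the real idl.go layout). The point is that EventType is part of the dedup key, and that the last event per sensor/description decides Active vs Resolved:

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a hypothetical, simplified IDL event.
type Event struct {
	Time, Sensor, Description, EventType string // EventType: "Assert" or "Deassert"
}

// dedupKey includes EventType so a Deassert is never treated as a
// duplicate of its Assert counterpart.
func dedupKey(e Event) string {
	return strings.Join([]string{e.Time, e.Sensor, e.Description, e.EventType}, "|")
}

// dedupEvents keeps the first occurrence of each key, preserving order.
func dedupEvents(events []Event) []Event {
	seen := make(map[string]bool, len(events))
	out := make([]Event, 0, len(events))
	for _, e := range events {
		k := dedupKey(e)
		if seen[k] {
			continue
		}
		seen[k] = true
		out = append(out, e)
	}
	return out
}

// summaryStatus assumes events are in chronological order; the last
// event seen for a sensor/description pair decides its status.
func summaryStatus(events []Event) map[string]string {
	status := map[string]string{}
	for _, e := range events {
		k := e.Sensor + "|" + e.Description
		if e.EventType == "Deassert" {
			status[k] = "Resolved"
		} else {
			status[k] = "Active"
		}
	}
	return status
}

func main() {
	evs := dedupEvents([]Event{
		{"10:00", "GPU3", "fault", "Assert"},
		{"10:00", "GPU3", "fault", "Deassert"},
		{"10:00", "GPU3", "fault", "Assert"},
	})
	fmt.Println(len(evs), summaryStatus(evs))
}
```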
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Removed max-width/padding constraints — panel now stretches to grid
column width like the viewer-panel above it.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When Inspur component.log sections return {"error":"...","code":N} instead
of hardware data, the parser now:
- stores them in AnalysisResult.CollectionErrors (new model field)
- mirrors each one into result.Events with Source="BMC/<section>"
so the chart viewer event table shows the specific BMC module
- feeds them into /api/parse-errors as bmc_collection_error entries
UI adds a collapsible "Collection diagnostics" panel below the chart
iframe (outside /chart) that appears when /api/parse-errors returns
any items; resets on data clear.
Affected sections in this dump: HDD (code 1458), PCIe Devices (code 1458),
Network Adapters (code 1458), Disk Backplane.
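
Detection of an error payload could be sketched as follows. The type names, the function names, and the event description format are hypothetical; only the {"error","code"} shape and the "BMC/<section>" source come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CollectionError and Event are simplified, hypothetical models.
type CollectionError struct {
	Section string
	Message string
	Code    int
}

type Event struct{ Source, Description string }

// detectCollectionError reports whether a section body is an error
// object of the form {"error":"...","code":N} rather than hardware data.
func detectCollectionError(section string, raw []byte) (CollectionError, bool) {
	var body struct {
		Error string `json:"error"`
		Code  int    `json:"code"`
	}
	// Arrays and other non-object payloads fail to decode and are
	// treated as regular data, as are objects without an error text.
	if err := json.Unmarshal(raw, &body); err != nil || body.Error == "" {
		return CollectionError{}, false
	}
	return CollectionError{Section: section, Message: body.Error, Code: body.Code}, true
}

// toEvent mirrors the error into the event list under BMC/<section>.
func (c CollectionError) toEvent() Event {
	return Event{
		Source:      "BMC/" + c.Section,
		Description: fmt.Sprintf("%s (code %d)", c.Message, c.Code),
	}
}

func main() {
	if ce, ok := detectCollectionError("HDD", []byte(`{"error":"timeout","code":1458}`)); ok {
		fmt.Printf("%+v\n", ce.toEvent())
	}
}
```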
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The PSU regex used "RESTful Network" as its end anchor, but in standard
Inspur component.log layout the PCIE Device section sits between PSU and
Network Adapter. The lazy [\s\S]*? captured across the PCIE error block,
producing invalid JSON and silently dropping all PSU data.
Changed anchor to RESTful (?:PCIE|Network) — matches whichever section
immediately follows PSU in a given archive.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When BMC firmware fails to read capacity for a present DIMM, size_mb stays
0. If another DIMM with the same part number in the same batch has a known
size, use it to fill the gap.
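
As a sketch, with a hypothetical DIMM struct and function name, the gap fill amounts to a lookup by part number:

```go
package main

import "fmt"

// DIMM is a simplified, hypothetical memory module record.
type DIMM struct {
	Slot       string
	PartNumber string
	SizeMB     int
	Present    bool
}

// fillMissingSizes copies a known size to present DIMMs that report
// size 0 but share a part number with a module of known size.
func fillMissingSizes(dimms []DIMM) {
	known := map[string]int{}
	for _, d := range dimms {
		if d.PartNumber == "" || d.SizeMB <= 0 {
			continue
		}
		if _, ok := known[d.PartNumber]; !ok {
			known[d.PartNumber] = d.SizeMB
		}
	}
	for i := range dimms {
		if !dimms[i].Present || dimms[i].SizeMB != 0 {
			continue
		}
		if size, ok := known[dimms[i].PartNumber]; ok {
			dimms[i].SizeMB = size
		}
	}
}

func main() {
	dimms := []DIMM{
		{Slot: "A0", PartNumber: "PN-1", SizeMB: 32768, Present: true},
		{Slot: "B0", PartNumber: "PN-1", SizeMB: 0, Present: true},
	}
	fillMissingSizes(dimms)
	fmt.Printf("%+v\n", dimms)
}
```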
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dedup by version caused CPU1 Microcode to be omitted when both CPUs run
the same version, leaving the firmware column blank for the second socket.
Each CPU now gets its own firmware entry keyed by socket index.
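
Keying by index rather than by version could look like this. The Firmware type and the function name are hypothetical, and the zero-based "CPU<n> Microcode" naming is inferred from the CPU1 example above:

```go
package main

import "fmt"

// Firmware is a simplified, hypothetical firmware inventory entry.
type Firmware struct{ Name, Version string }

// cpuMicrocodeEntries emits one entry per CPU socket, so two CPUs
// running the same microcode version still produce two entries.
func cpuMicrocodeEntries(versions []string) []Firmware {
	out := make([]Firmware, 0, len(versions))
	for i, v := range versions {
		if v == "" {
			continue // no version reported for this socket
		}
		out = append(out, Firmware{Name: fmt.Sprintf("CPU%d Microcode", i), Version: v})
	}
	return out
}

func main() {
	fmt.Printf("%+v\n", cpuMicrocodeEntries([]string{"0x2b000590", "0x2b000590"}))
}
```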
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two bugs in onekeylog archives that lack asset.json:
- CPU count was always 0: ParseComponentLog never parsed the "RESTful CPU
info" section. Added parseCPUInfo as a fallback when hw.CPUs is empty
(asset.json remains the primary source when present). Also worked around
a Go JSON case-insensitive collision between "proc_id" (int) and
"PROC_ID" (string CPUID) by adding an explicit PROC_ID field with an
exact-case tag.
- Only 1 of 2 DIMMs shown: the Present condition required mem_mod_size > 0,
but some BMC firmware reports size=0 for a physically installed module
while still providing serial and part number. Now treats a DIMM as
present when status=1 and any of size/serial/partnum is non-empty.
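
The JSON collision is easy to reproduce in isolation. encoding/json prefers an exact key match and only then falls back to a case-insensitive one, so a struct with just a "proc_id" field also receives the "PROC_ID" string. The struct names and sample values below are invented for the demonstration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cpuNaive has only the integer field; the "PROC_ID" key falls back
// to a case-insensitive match on it and fails to decode as an int.
type cpuNaive struct {
	ProcID int `json:"proc_id"`
}

// cpuFixed adds an explicit field with an exact-case tag, so each
// key finds its own exact match.
type cpuFixed struct {
	ProcID int    `json:"proc_id"`
	CPUID  string `json:"PROC_ID"`
}

const sample = `{"proc_id":1,"PROC_ID":"000806F8"}`

func decodeNaive() (cpuNaive, error) {
	var c cpuNaive
	err := json.Unmarshal([]byte(sample), &c)
	return c, err
}

func decodeFixed() (cpuFixed, error) {
	var c cpuFixed
	err := json.Unmarshal([]byte(sample), &c)
	return c, err
}

func main() {
	_, err := decodeNaive()
	fmt.Println("naive error:", err)
	c, err := decodeFixed()
	fmt.Printf("fixed: %+v err=%v\n", c, err)
}
```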
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
IOMMUGroup was added to models.PCIeDevice but never wired into the
converter — missing from Details in buildDevicesFromLegacy, no field
in ReanimatorPCIe, and convertPCIeFromDevices never read it.
Add IOMMUGroup *int to ReanimatorPCIe, propagate through Details,
add intPtrFromDetailMap helper.
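
One plausible shape for such a helper is sketched below. The signature, the map[string]any type for Details, the "iommu_group" key, and the set of accepted numeric types are all assumptions; the real helper may differ.

```go
package main

import "fmt"

// intPtrFromDetailMap returns a pointer to the integer stored under
// key, or nil when the key is missing or not numeric. float64 is
// accepted because values decoded from JSON arrive in that type.
func intPtrFromDetailMap(details map[string]any, key string) *int {
	v, ok := details[key]
	if !ok {
		return nil
	}
	var n int
	switch t := v.(type) {
	case int:
		n = t
	case int64:
		n = int(t)
	case float64:
		n = int(t)
	default:
		return nil
	}
	return &n
}

func main() {
	details := map[string]any{"iommu_group": 42}
	if g := intPtrFromDetailMap(details, "iommu_group"); g != nil {
		fmt.Println("IOMMU group:", *g)
	}
}
```

Returning a pointer keeps "group 0" distinguishable from "no group reported", which matches the *int field type.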
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Parse inventory_volume.log: Intel VROC (VMD) RAID volumes including
RAID level, capacity (GiB/TiB support added), status and member drives.
Add Drives []string to StorageVolume model.
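
Capacity normalization with GiB/TiB support might be sketched like this. The input format ("<number> <unit>"), the function name, and the choice of GiB as the output unit are assumptions; the real inventory_volume.log format is not shown here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCapacityGiB converts strings such as "894.25 GiB" or
// "1.5 TiB" into a GiB value.
func parseCapacityGiB(s string) (float64, error) {
	fields := strings.Fields(s)
	if len(fields) != 2 {
		return 0, fmt.Errorf("unexpected capacity %q", s)
	}
	v, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return 0, fmt.Errorf("bad capacity number %q: %w", fields[0], err)
	}
	switch strings.ToUpper(fields[1]) {
	case "GIB":
		return v, nil
	case "TIB":
		return v * 1024, nil
	}
	return 0, fmt.Errorf("unknown capacity unit %q", fields[1])
}

func main() {
	for _, s := range []string{"894.25 GiB", "1.5 TiB"} {
		gib, err := parseCapacityGiB(s)
		fmt.Println(s, "=>", gib, err)
	}
}
```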
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Lenovo ThinkSystem SR650 V3 (and similar XCC-based servers) caused
collection runs of 23+ minutes because the BMC exposes two large high-
error-rate subtrees in the snapshot BFS:
- Chassis/1/Sensors: 315 individual sensor members, 282/315 failing,
~3.7s per request → ~19 minutes wasted. These documents are never
read by any LOGPile parser (thermal/power data comes from aggregate
Chassis/*/Thermal and Chassis/*/Power endpoints).
- Chassis/1/Oem/Lenovo: 75 requests (LEDs×47, Slots×26, etc.),
68/75 failing → 8+ minutes wasted on non-inventory data.
Add a Lenovo profile (matched on SystemManufacturer/OEMNamespace "Lenovo")
that sets SnapshotExcludeContains to block individual sensor documents and
non-inventory Lenovo OEM subtrees from the snapshot BFS queue. Also sets
rate policy thresholds appropriate for XCC BMC latency (p95 often 3-5s).
Add SnapshotExcludeContains []string to AcquisitionTuning and check it
in the snapshot enqueue closure in redfish.go.
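
The enqueue check reduces to a substring test per path. In the sketch below the shouldEnqueue name and the two example patterns are hypothetical; only the SnapshotExcludeContains field on AcquisitionTuning comes from the change.

```go
package main

import (
	"fmt"
	"strings"
)

// AcquisitionTuning shows only the field relevant to this sketch.
type AcquisitionTuning struct {
	SnapshotExcludeContains []string
}

// shouldEnqueue reports whether a Redfish path may enter the
// snapshot BFS queue under the given tuning.
func shouldEnqueue(path string, t AcquisitionTuning) bool {
	for _, sub := range t.SnapshotExcludeContains {
		if sub != "" && strings.Contains(path, sub) {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical patterns: the trailing slash after "Sensors"
	// blocks member documents but leaves the collection itself.
	lenovo := AcquisitionTuning{SnapshotExcludeContains: []string{
		"/Sensors/",
		"/Oem/Lenovo/",
	}}
	for _, p := range []string{
		"/redfish/v1/Chassis/1/Sensors/42",
		"/redfish/v1/Chassis/1/Thermal",
	} {
		fmt.Println(p, shouldEnqueue(p, lenovo))
	}
}
```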
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Supermicro HGX BMC reports all 8 B200 GPU PCIe devices with Name
"PCIe Device" — a generic label shared by every GPU, not a unique
hardware position. pcieDedupKey used slot as the primary key, so all
8 GPUs collapsed to one entry in the UI (the first, serial 1654925165720).
Add isGenericPCIeSlotName to detect non-positional slot labels and fall
through to serial/BDF for dedup instead, preserving each GPU separately.
Positional slots (#GPU0, SLOT-NIC1, etc.) continue to use slot-first dedup.
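
A sketch of the fallback order, reusing the function names from this change with guessed bodies. The list of generic labels and the key prefixes are assumptions, as is the example serial format:

```go
package main

import (
	"fmt"
	"strings"
)

// isGenericPCIeSlotName reports whether a slot label carries no
// positional information. The label set here is a guess.
func isGenericPCIeSlotName(slot string) bool {
	switch strings.ToLower(strings.TrimSpace(slot)) {
	case "", "pcie device":
		return true
	}
	return false
}

// pcieDedupKey prefers a positional slot; for generic labels it
// falls through to the serial number, then to the BDF address.
func pcieDedupKey(slot, serial, bdf string) string {
	if !isGenericPCIeSlotName(slot) {
		return "slot:" + slot
	}
	if serial != "" {
		return "sn:" + serial
	}
	return "bdf:" + bdf
}

func main() {
	fmt.Println(pcieDedupKey("PCIe Device", "SN-A", "0000:19:00.0"))
	fmt.Println(pcieDedupKey("PCIe Device", "SN-B", "0000:3b:00.0"))
	fmt.Println(pcieDedupKey("#GPU0", "SN-A", "0000:19:00.0"))
}
```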
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parseGPUWithSupplementalDocs did not read PCIeInterface from the device
doc, only from function docs. xFusion GPU PCIeCard entries carry link
width/speed in PCIeInterface (LanesInUse/Maxlanes/PCIeType/MaxPCIeType)
so GPU link width was always empty for xFusion servers.
Also apply the xFusion OEM function-level fallback for GPU function docs,
consistent with the NIC and PCIeDevice paths.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
xFusion iBMC exposes PCIe link width in two non-standard ways:
- PCIeInterface uses "Maxlanes" (lowercase 'l') instead of "MaxLanes"
- PCIeFunction docs carry width/speed in Oem.xFusion.LinkWidth ("X8"),
Oem.xFusion.LinkWidthAbility, Oem.xFusion.LinkSpeed, and
Oem.xFusion.LinkSpeedAbility rather than the standard CurrentLinkWidth int
Add redfishEnrichFromOEMxFusionPCIeLink and parseXFusionLinkWidth helpers,
apply them as fallbacks in NIC and PCIeDevice enrichment paths.
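
Parsing the OEM width string is a small step. The body below is a guess at what parseXFusionLinkWidth could do, assuming the value is always "X" followed by a lane count and that 0 is an acceptable "unknown" result:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseXFusionLinkWidth converts an OEM width string such as "X8"
// into a lane count, returning 0 when the value is not recognized.
func parseXFusionLinkWidth(s string) int {
	s = strings.TrimSpace(s)
	if len(s) < 2 || (s[0] != 'X' && s[0] != 'x') {
		return 0
	}
	n, err := strconv.Atoi(s[1:])
	if err != nil || n <= 0 {
		return 0
	}
	return n
}

func main() {
	for _, s := range []string{"X8", "x16", "N/A"} {
		fmt.Printf("%q => %d\n", s, parseXFusionLinkWidth(s))
	}
}
```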
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove power-on and power-off functionality from the Redfish collector;
keep host power-state detection and show a warning in the UI when the
host is powered off before collection starts.
Add a "Пропустить зависшие" ("Skip hung") button that lets the user abort
stuck Redfish collection phases without losing already-collected data.
Introduces a two-level context model in Collect(): the outer job context
covers the full lifecycle including replay; an inner collectCtx covers
snapshot, prefetch, and plan-B phases only. Closing skipCh cancels
collectCtx immediately — aborts all in-flight HTTP requests and exits
plan-B loops — then replay runs on whatever rawTree was collected.
Signal path: UI → POST /api/collect/{id}/skip → JobManager.SkipJob()
→ close(skipCh) → goroutine in Collect() → cancelCollect().
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add HPE iLO Redfish profile (priority 20): matches on manufacturer/OEM/iLO signals,
adds SmartStorage/SmartStorageConfig to critical paths, sets realistic ETA baseline
and rate policy for iLO's known slowness
- Fix post-probe hang on HPE iLO: skip numeric probing of collections where
Members@odata.count == len(Members); add 4s postProbeClient timeout as safety net
- Exclude /WorkloadPerformanceAdvisor from crawl paths
- Fix replay parser: skip absent CPU sockets, absent DIMM slots, absent drive bays
- Filter N/A version entries from firmware inventory
- Remove drive firmware from general firmware list (already in Storage[].Firmware)
- Add HPE AHS (.ahs) archive parser with hybrid SMBIOS/Redfish extraction
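
The post-probe guard from the second item might be expressed as below. The function name and the decision to probe when the count is missing or the document is unreadable are assumptions; the comparison itself is the one stated above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// redfishCollection holds the two fields the guard needs.
type redfishCollection struct {
	Count   *int              `json:"Members@odata.count"`
	Members []json.RawMessage `json:"Members"`
}

// needsNumericProbe reports whether numeric member probing is still
// worthwhile: it is skipped when the advertised count already equals
// the number of listed members.
func needsNumericProbe(doc []byte) bool {
	var c redfishCollection
	if err := json.Unmarshal(doc, &c); err != nil || c.Count == nil {
		return true // unknown state: keep the previous behavior
	}
	return *c.Count != len(c.Members)
}

func main() {
	complete := []byte(`{"Members@odata.count":2,"Members":[{"@odata.id":"/a"},{"@odata.id":"/b"}]}`)
	partial := []byte(`{"Members@odata.count":3,"Members":[{"@odata.id":"/a"}]}`)
	fmt.Println(needsNumericProbe(complete), needsNumericProbe(partial))
}
```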
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>