Commit Graph

32 Commits

Author SHA1 Message Date
Mikhail Chusavitin
9df29b1be9 fix: dedup GPUs across multiple chassis PCIeDevice trees in Redfish collector
Supermicro HGX exposes each GPU under both Chassis/1/PCIeDevices and a
dedicated Chassis/HGX_GPU_SXM_N/PCIeDevices. gpuDocDedupKey was keying
by @odata.id path, so identical GPUs with the same serial were not
deduplicated across sources. Now stable identifiers (serial → BDF →
slot+model) take priority over path.

Also includes Inspur parser improvements: NVMe model/serial enrichment
from devicefrusdr.log and audit.log, RAID drive slot normalization to
BP notation, PSU slot normalization, BMC/CPLD/VR firmware from RESTful
version info section, and parser version bump to 1.8.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 14:44:36 +03:00
19d857b459 redfish: filter PCIe topology noise, deduplicate GPU/NIC cross-sources
- isUnidentifiablePCIeDevice: skip PCIe entries with generic class
  (SingleFunction/MultiFunction) and no model/serial/VendorID — eliminates
  PCH bridges, root ports and other bus infrastructure that MSI BMC
  enumerates exhaustively (59→9 entries on CG480-S5063)
- collectPCIeDevices: skip entries where looksLikeGPU — prevents GPU
  devices from appearing in both hw.GPUs and hw.PCIeDevices (fixed
  Inspur H100 duplicate)
- dedupeCanonicalDevices: secondary model+manufacturer match for noKey
  items (no serial, no BDF) — merges NetworkAdapter entries into
  matching PCIe device entries; isGenericDeviceClass helper for
  DeviceClass identity check (fixed Inspur ENFI1100-T4 duplicate)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 22:08:02 +03:00
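
The isUnidentifiablePCIeDevice filter from 19d857b459 is described as skipping entries that have a generic class and no model, serial, or VendorID. A rough Go sketch of that rule follows. The pcieEntry struct and the helper bodies are assumptions for illustration; only the two class names and the function names come from the commit message, and the real functions may check additional fields.

```go
package main

import (
	"fmt"
	"strings"
)

// pcieEntry is a hypothetical, reduced PCIe device record.
type pcieEntry struct {
	DeviceClass string
	Model       string
	Serial      string
	VendorID    string
}

func blank(s string) bool { return strings.TrimSpace(s) == "" }

// isGenericDeviceClass reports whether the class says nothing about
// what the device is (only the two values named in the commit message).
func isGenericDeviceClass(class string) bool {
	switch strings.ToLower(strings.TrimSpace(class)) {
	case "singlefunction", "multifunction":
		return true
	}
	return false
}

// isUnidentifiablePCIeDevice is true for bus infrastructure entries:
// a generic class and no identifying field at all.
func isUnidentifiablePCIeDevice(e pcieEntry) bool {
	return isGenericDeviceClass(e.DeviceClass) &&
		blank(e.Model) && blank(e.Serial) && blank(e.VendorID)
}

func main() {
	entries := []pcieEntry{
		{DeviceClass: "SingleFunction"},                     // bridge or root port noise
		{DeviceClass: "MultiFunction", VendorID: "0x10de"},  // has a vendor ID: keep
		{DeviceClass: "NetworkController", Model: "X710"},   // specific class: keep
	}
	kept := 0
	for _, e := range entries {
		if !isUnidentifiablePCIeDevice(e) {
			kept++
		}
	}
	fmt.Println(kept)
}
```

Requiring every identifying field to be empty keeps the filter conservative: an entry with a generic class but a known VendorID is still reported.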
8d80048117 redfish: MSI support, fix zero dates, BMC MAC, Assembly FRU, crawler cleanup
- Add MSI CG480-S5063 (H100 SXM5) support:
  - collectGPUsFromProcessors: find GPUs via Processors/ProcessorType=GPU,
    resolve serials from Chassis/<GpuId>
  - looksLikeGPU: skip Description="Display Device" PCIe sidecars
  - isVirtualStorageDrive: filter AMI virtual USB drives (0-byte)
  - enrichNICMACsFromNetworkDeviceFunctions: pull MACs for MSI NICs
  - parseCPUs: filter by ProcessorType, parse Socket, L1/L2/L3 from ProcessorMemory
  - parseMemory: Location.PartLocation.ServiceLabel slot fallback
  - shouldCrawlPath: block /SubProcessors subtrees
- Fix status_checked_at/status_changed_at serializing as 0001-01-01:
  change all StatusCheckedAt/StatusChangedAt fields to *time.Time
- Redfish crawler cleanup:
  - Block non-inventory branches: AccountService, CertificateService,
    EventService, Registries, SessionService, TaskService, manager config paths,
    OperatingConfigs, BootOptions, HostPostCode, Bios/Settings, OEM KVM paths
  - Add Assembly to critical endpoints (FRU data)
  - Remove BootOptions from priority seeds
- collectBMCMAC: read BMC MAC from Managers/*/EthernetInterfaces
- collectAssemblyFRU: extract FRU serial/part from Chassis/*/Assembly
- Firmware: remove NetworkProtocol noise, fix SecureBoot field,
  filter BMCImageN redundant backup slots

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-04 08:12:17 +03:00
4940cd9645 sync file-type support across upload/convert and fix collected_at timezone handling 2026-02-28 23:27:49 +03:00
2fa4a1235a collector/redfish: make prefetch/post-probe adaptive with metrics 2026-02-28 19:05:34 +03:00
fe5da1dbd7 Fix NIC port count handling and apply pending exporter updates 2026-02-28 18:42:01 +03:00
612058ed16 redfish: optimize snapshot/plan-b crawl and add timing diagnostics 2026-02-28 17:56:04 +03:00
e0146adfff Improve Redfish recovery flow and raw export timing diagnostics 2026-02-28 16:55:58 +03:00
9a30705c9a improve redfish collection progress and robust hardware dedup/serial parsing 2026-02-28 16:07:42 +03:00
8dbbec3610 optimize redfish post-probe and add eta progress 2026-02-28 15:41:44 +03:00
4c60ebbf1d collector/redfish: remove pre-snapshot critical duplicate pass 2026-02-28 15:28:24 +03:00
c52fea2fec collector/redfish: emit critical warmup branch and eta progress 2026-02-28 15:21:49 +03:00
b6ff47fea8 collector/redfish: skip deep DIMM subresources and remove memory from critical warmup 2026-02-28 15:16:04 +03:00
1d282c4196 collector/redfish: collect and parse platform model fallback 2026-02-28 14:54:55 +03:00
f35cabac48 collector/redfish: fix server model fallback and GPU/NVMe regressions 2026-02-28 14:50:02 +03:00
a2c9e9a57f collector/redfish: add ETA estimates to snapshot and plan-B progress 2026-02-28 14:36:18 +03:00
b918363252 collector/redfish: dedupe model-only GPU rows from graphics controllers 2026-02-28 13:04:34 +03:00
6c19a58b24 collector/redfish: expand endpoint coverage and timestamp collect logs 2026-02-28 12:59:57 +03:00
9aadf2f1e9 collector/redfish: improve GPU SN/model fallback and warnings 2026-02-28 12:52:22 +03:00
Mikhail Chusavitin
000199fbdc Add parse errors tab and improve error diagnostics UI 2026-02-25 13:28:19 +03:00
Mikhail Chusavitin
68592da9f5 Harden Redfish collection for slow BMC endpoints 2026-02-25 12:42:43 +03:00
Mikhail Chusavitin
b1dde592ae Expand Redfish best-effort snapshot crawling 2026-02-25 12:24:06 +03:00
Mikhail Chusavitin
a4a1a19a94 Improve Redfish raw replay recovery and GUI diagnostics 2026-02-25 12:16:31 +03:00
Mikhail Chusavitin
66fb90233f Unify Redfish analysis through raw replay and add storage volumes 2026-02-24 18:34:13 +03:00
Mikhail Chusavitin
7a1285db99 Expand Redfish storage fallback for enclosure Disk.Bay paths 2026-02-24 18:25:00 +03:00
Mikhail Chusavitin
a6c90b6e77 Probe Supermicro NVMe Disk.Bay endpoints for drive inventory 2026-02-24 18:22:02 +03:00
Mikhail Chusavitin
6f66a8b2a1 Raise Redfish snapshot crawl limit and prioritize PCIe paths 2026-02-24 17:41:37 +03:00
Mikhail Chusavitin
810c4b5ff9 Add raw export reanalyze flow for Redfish snapshots 2026-02-24 17:23:26 +03:00
Mikhail Chusavitin
5d9e9d73de Fix Redfish snapshot crawl deadlock and add debug progress 2026-02-24 16:22:37 +03:00
758fa66282 feat: improve inspur parsing and pci.ids integration 2026-02-17 18:09:36 +03:00
Mikhail Chusavitin
bb48b03677 Redfish snapshot/export overhaul and portable release build 2026-02-04 19:43:51 +03:00
Mikhail Chusavitin
c89ee0118f Add pluggable live collectors and simplify API connect form 2026-02-04 19:00:03 +03:00