logpile

mchus/logpile

Commit Graph

Author	SHA1	Message	Date

Michael Chus

b04877549a

feat(collector): add Lenovo XCC profile to skip noisy snapshot paths

Lenovo ThinkSystem SR650 V3 (and similar XCC-based servers) caused
collection runs of 23+ minutes because the BMC exposes two large high-
error-rate subtrees in the snapshot BFS:

  - Chassis/1/Sensors: 315 individual sensor members, 282/315 failing,
    ~3.7s per request → ~19 minutes wasted. These documents are never
    read by any LOGPile parser (thermal/power data comes from aggregate
    Chassis/*/Thermal and Chassis/*/Power endpoints).

  - Chassis/1/Oem/Lenovo: 75 requests (LEDs×47, Slots×26, etc.),
    68/75 failing → 8+ minutes wasted on non-inventory data.
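As a rough check of the sensor figure above, assuming all 315 requests run sequentially at the quoted ~3.7s each (the commit does not state the concurrency), the wasted time works out as follows. The per-request average for the OEM subtree is derived from the quoted totals and is only a lower bound, since the message says "8+ minutes".

```latex
\begin{align*}
t_{\text{Sensors}} &\approx 315 \times 3.7\,\text{s} = 1165.5\,\text{s} \approx 19.4\,\text{min} \\
\bar{t}_{\text{Oem/Lenovo}} &\gtrsim \frac{8 \times 60\,\text{s}}{75} = 6.4\,\text{s per request}
\end{align*}
```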

Add a Lenovo profile (matched on SystemManufacturer/OEMNamespace "Lenovo")
that sets SnapshotExcludeContains to block individual sensor documents and
non-inventory Lenovo OEM subtrees from the snapshot BFS queue. Also sets
rate policy thresholds appropriate for XCC BMC latency (p95 often 3-5s).

Add SnapshotExcludeContains []string to AcquisitionTuning and check it
in the snapshot enqueue closure in redfish.go.
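A minimal sketch of how such a substring exclusion could be checked at enqueue time. Only the AcquisitionTuning type and its SnapshotExcludeContains field come from the commit message; the helper names (excluded, crawl), the in-memory link map, and the example patterns are hypothetical and are not taken from redfish.go.

```go
package main

import (
	"fmt"
	"strings"
)

// AcquisitionTuning shows only the field named in the commit message;
// the real struct presumably carries more settings.
type AcquisitionTuning struct {
	SnapshotExcludeContains []string
}

// excluded reports whether path contains any configured substring.
// (Hypothetical helper, not the project's actual function.)
func excluded(path string, tuning AcquisitionTuning) bool {
	for _, sub := range tuning.SnapshotExcludeContains {
		if sub != "" && strings.Contains(path, sub) {
			return true
		}
	}
	return false
}

// crawl walks links breadth-first from root and returns the visit order.
// Excluded paths are rejected in the enqueue closure, so they are never
// requested and their children are never discovered.
func crawl(root string, links map[string][]string, tuning AcquisitionTuning) []string {
	seen := map[string]bool{root: true}
	queue := []string{root}
	var visited []string

	enqueue := func(p string) {
		if seen[p] || excluded(p, tuning) {
			return
		}
		seen[p] = true
		queue = append(queue, p)
	}

	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		visited = append(visited, cur)
		for _, next := range links[cur] {
			enqueue(next)
		}
	}
	return visited
}

func main() {
	// Example patterns are assumptions chosen for illustration only.
	tuning := AcquisitionTuning{
		SnapshotExcludeContains: []string{"/Sensors/", "/Oem/Lenovo/"},
	}
	links := map[string][]string{
		"/redfish/v1/Chassis/1": {
			"/redfish/v1/Chassis/1/Thermal",
			"/redfish/v1/Chassis/1/Sensors",
			"/redfish/v1/Chassis/1/Oem/Lenovo/LEDs",
		},
		"/redfish/v1/Chassis/1/Sensors": {
			"/redfish/v1/Chassis/1/Sensors/1",
		},
	}
	for _, p := range crawl("/redfish/v1/Chassis/1", links, tuning) {
		fmt.Println(p)
	}
}
```

With these illustrative patterns the Sensors collection document itself is still fetched, while its individual members and the Lenovo OEM subtree never enter the queue. Filtering at enqueue rather than at fetch time also keeps excluded paths out of the seen set and out of any queue-length accounting.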

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-13 19:29:04 +03:00

Mikhail Chusavitin

d650a6ba1c

refactor: unified ingest pipeline + modular Redfish profile framework

Implement the full architectural plan: unified ingest.Service entry point
for archive and Redfish payloads, modular redfishprofile package with
composable profiles (generic, ami-family, msi, supermicro, dell,
hgx-topology), score-based profile matching with fallback expansion mode,
and profile-driven acquisition/analysis plans.

Vendor-specific logic moved out of common executors and into profile hooks.
GPU chassis lookup strategies and known storage recovery collections
(IntelVROC/HA-RAID/MRVL) now live in ResolvedAnalysisPlan, populated by
profiles at analysis time. Replay helpers read from the plan; no hardcoded
path lists remain in generic code.

Also splits redfish_replay.go into domain modules (gpu, storage, inventory,
fru, profiles) and adds full fixture/matcher/directive test coverage
including Dell, AMI, unknown-vendor fallback, and deterministic ordering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-18 08:48:58 +03:00

2 Commits