feat(collector): add Lenovo XCC profile to skip noisy snapshot paths
Lenovo ThinkSystem SR650 V3 (and similar XCC-based servers) caused
collection runs of 23+ minutes because the BMC exposes two large high-
error-rate subtrees in the snapshot BFS:
- Chassis/1/Sensors: 315 individual sensor members, 282/315 failing,
~3.7s per request → ~19 minutes wasted. These documents are never
read by any LOGPile parser (thermal/power data comes from aggregate
Chassis/*/Thermal and Chassis/*/Power endpoints).
- Chassis/1/Oem/Lenovo: 75 requests (LEDs×47, Slots×26, etc.),
68/75 failing → 8+ minutes wasted on non-inventory data.
Add a Lenovo profile (matched on SystemManufacturer/OEMNamespace "Lenovo")
that sets SnapshotExcludeContains to block individual sensor documents and
non-inventory Lenovo OEM subtrees from the snapshot BFS queue. Also sets
rate policy thresholds appropriate for XCC BMC latency (p95 often 3-5s).
Add SnapshotExcludeContains []string to AcquisitionTuning and check it
in the snapshot enqueue closure in redfish.go.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -56,6 +56,7 @@ func BuiltinProfiles() []Profile {
|
||||
supermicroProfile(),
|
||||
dellProfile(),
|
||||
hpeProfile(),
|
||||
lenovoProfile(),
|
||||
inspurGroupOEMPlatformsProfile(),
|
||||
hgxProfile(),
|
||||
xfusionProfile(),
|
||||
@@ -226,6 +227,10 @@ func ensurePrefetchPolicy(plan *AcquisitionPlan, policy AcquisitionPrefetchPolic
|
||||
addPlanPaths(&plan.Tuning.PrefetchPolicy.ExcludeContains, policy.ExcludeContains...)
|
||||
}
|
||||
|
||||
func ensureSnapshotExcludeContains(plan *AcquisitionPlan, patterns ...string) {
|
||||
addPlanPaths(&plan.Tuning.SnapshotExcludeContains, patterns...)
|
||||
}
|
||||
|
||||
func min(a, b int) int {
|
||||
if a < b {
|
||||
return a
|
||||
|
||||
Reference in New Issue
Block a user