Detect thermal throttle with fans below 100% as cooling misconfiguration
released this 2026-04-14 21:44:57 +03:00 | 177 commits to main since this release

During power calibration: if a thermal throttle (sw_thermal/hw_thermal)
causes ≥20% clock drop while server fans are below 98% P95 duty cycle,
record a CoolingWarning on the GPU result and emit an actionable finding
telling the operator to rerun with fans manually fixed at 100%.

During steady-state benchmark: same signal enriches the existing
thermal_limited finding with fan duty cycle and clock drift values.

Covers both the main benchmark (buildBenchmarkFindings) and the power
bench (NvidiaPowerBenchResult.Findings).

Co-Authored-By: Claude Sonnet 4.6 noreply@anthropic.com
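
The detection rule described above can be sketched in Go roughly as follows. This is only an illustration under stated assumptions, not the code shipped in this release: the field layout of CoolingWarning, the helper names (isThermalReason, clockDropPct, detectCoolingWarning), the MHz inputs, and the message wording are all hypothetical. Only the sw_thermal/hw_thermal reasons, the 20% clock drop, and the 98% P95 fan duty threshold come from the notes.

```go
package main

import "fmt"

// Thresholds as stated in the release notes.
const (
	minClockDropPct  = 20.0 // throttle must cost at least this much clock
	fanSaturationPct = 98.0 // P95 fan duty below this means fans were not at full speed
)

// CoolingWarning is a hypothetical shape for the warning recorded on a
// GPU result; the real struct may carry different fields.
type CoolingWarning struct {
	ThrottleReason string
	ClockDropPct   float64
	FanDutyP95Pct  float64
	Message        string
}

// isThermalReason reports whether the throttle reason is one of the two
// thermal reasons named in the notes.
func isThermalReason(reason string) bool {
	return reason == "sw_thermal" || reason == "hw_thermal"
}

// clockDropPct returns the percentage drop from the baseline clock to
// the throttled clock. A non-positive baseline yields 0.
func clockDropPct(baselineMHz, throttledMHz float64) float64 {
	if baselineMHz <= 0 {
		return 0
	}
	return (baselineMHz - throttledMHz) / baselineMHz * 100
}

// detectCoolingWarning returns a warning when a thermal throttle drops
// clocks by at least 20% while P95 fan duty is below 98%; otherwise nil.
func detectCoolingWarning(reason string, baselineMHz, throttledMHz, fanDutyP95Pct float64) *CoolingWarning {
	if !isThermalReason(reason) {
		return nil
	}
	drop := clockDropPct(baselineMHz, throttledMHz)
	if drop < minClockDropPct || fanDutyP95Pct >= fanSaturationPct {
		return nil
	}
	return &CoolingWarning{
		ThrottleReason: reason,
		ClockDropPct:   drop,
		FanDutyP95Pct:  fanDutyP95Pct,
		Message: fmt.Sprintf(
			"%s throttle cut clocks by %.1f%% while P95 fan duty was %.1f%%; rerun with fans manually fixed at 100%%",
			reason, drop, fanDutyP95Pct),
	}
}
```

In this sketch, a call such as detectCoolingWarning("sw_thermal", 2000, 1500, 80) would yield a warning, while the same clock drop with fans at 99% P95 duty would not, since the fans were already effectively saturated.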