Capture per-run IPMI power and GPU telemetry in power benchmark
released this
2026-04-17 17:59:58 +03:00 | 113 commits to main since this release

- Sample IPMI loaded_w per single-card calibration and per ramp step
  instead of averaging over the entire Phase 2; top-level ServerPower
  uses the final (all-GPU) ramp step value
- Add ServerLoadedW/ServerDeltaW to NvidiaPowerBenchGPU and
  NvidiaPowerBenchStep so external tooling can compare wall power per
  phase without re-parsing logs
- Write gpu-metrics.csv/.html inside each single-XX/ and step-XX/
  subdir; aggregate all phases into a top-level gpu-metrics.csv/.html
- Write 00-nvidia-smi-q.log at the start of every power run
- Add Telemetry (p95 temp/power/fan/clock) to NvidiaPowerBenchGPU in
  result.json from the converged calibration attempt
- Power benchmark page: split "Achieved W" into Single-card W and
  Multi-GPU W (StablePowerLimitW); derate highlight and status color
  now reflect the final multi-GPU limit vs nominal
- Performance benchmark page: add Status column and per-GPU score
  color coding (green/yellow/red) based on gpu.Status and OverallStatus

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>