• v8.23 5285c0d101

    Capture per-run IPMI power and GPU telemetry in power benchmark

    mchus released this 2026-04-17 17:59:58 +03:00 | 113 commits to main since this release

    • Sample IPMI loaded_w per single-card calibration and per ramp step
      instead of averaging over the entire Phase 2; top-level ServerPower
      uses the final (all-GPU) ramp step value
    • Add ServerLoadedW/ServerDeltaW to NvidiaPowerBenchGPU and
      NvidiaPowerBenchStep so external tooling can compare wall power per
      phase without re-parsing logs
    • Write gpu-metrics.csv/.html inside each single-XX/ and step-XX/
      subdirectory; aggregate all phases into a top-level
      gpu-metrics.csv/.html
    • Write 00-nvidia-smi-q.log at the start of every power run
    • Add Telemetry (p95 temp/power/fan/clock), taken from the converged
      calibration attempt, to NvidiaPowerBenchGPU in result.json
    • Power benchmark page: split "Achieved W" into Single-card W and
      Multi-GPU W (StablePowerLimitW); derate highlight and status color
      now reflect the final multi-GPU limit vs nominal
    • Performance benchmark page: add Status column and per-GPU score
      color coding (green/yellow/red) based on gpu.Status and OverallStatus
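
For external tooling that consumes the new per-phase fields, the relationship between them can be sketched roughly as below. This is an illustration only, not the project's code: the `PhasePower` type, the helpers `newPhasePower`, `finalStep`, and `p95`, the idle baseline, and all sample values are hypothetical. In particular, treating ServerDeltaW as "loaded minus idle" and using a nearest-rank p95 are assumptions; the notes above name the fields but do not define them.

```go
package main

import (
	"fmt"
	"sort"
)

// PhasePower is a hypothetical stand-in that reuses only the two field
// names from the release notes. It is not the real NvidiaPowerBenchStep.
type PhasePower struct {
	ServerLoadedW float64 // IPMI wall power sampled while the phase was under load
	ServerDeltaW  float64 // ASSUMPTION: loaded power minus an idle baseline
}

// newPhasePower derives the delta from an idle baseline (assumed definition).
func newPhasePower(idleW, loadedW float64) PhasePower {
	return PhasePower{ServerLoadedW: loadedW, ServerDeltaW: loadedW - idleW}
}

// finalStep returns the last ramp step, which the notes say feeds the
// top-level server power value. The bool is false for an empty ramp.
func finalStep(steps []PhasePower) (PhasePower, bool) {
	if len(steps) == 0 {
		return PhasePower{}, false
	}
	return steps[len(steps)-1], true
}

// p95 returns the nearest-rank 95th percentile of samples, or 0 when
// there are none. The input slice is copied, so the caller's order is kept.
func p95(samples []float64) float64 {
	n := len(samples)
	if n == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	rank := (95*n + 99) / 100 // integer ceil(0.95 * n)
	return sorted[rank-1]
}

func main() {
	const idleW = 400.0 // hypothetical idle wall power

	// Hypothetical loaded_w samples, one per ramp step.
	var steps []PhasePower
	for _, loadedW := range []float64{720, 1010, 1290} {
		steps = append(steps, newPhasePower(idleW, loadedW))
	}
	if last, ok := finalStep(steps); ok {
		fmt.Printf("top-level server power: %.0f W (delta %.0f W)\n",
			last.ServerLoadedW, last.ServerDeltaW)
	}

	// Hypothetical GPU temperature samples from one calibration attempt.
	temps := []float64{61, 63, 64, 66, 70, 71, 71, 72, 74, 79}
	fmt.Printf("p95 temp: %.0f C\n", p95(temps))
}
```

Keeping the delta next to the loaded value in each phase record means a consumer can compare phases directly from result.json, which matches the stated goal of avoiding log re-parsing.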

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
