• v8.4 a6a07f2626

    Replace linear power derate with binary search + telemetry-guided jump

    mchus released this 2026-04-14 22:05:23 +03:00 | 176 commits to main since this release

    Power calibration previously stepped down linearly, 25 W at a time,
    requiring up to 6 attempts to find a stable limit within a 150 W range.
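
    The old behaviour can be sketched roughly as below. This is only an
    illustration, not the project's code: linear_derate, is_stable and the
    example wattages are hypothetical names and values chosen to match the
    25 W step and 150 W range described above.

```python
from typing import Callable, Optional, Tuple


def linear_derate(
    start_w: int,
    min_w: int,
    is_stable: Callable[[int], bool],
    step_w: int = 25,
) -> Tuple[Optional[int], int]:
    """Step the limit down by step_w until a run is stable.

    start_w is assumed to have already failed; is_stable is a hypothetical
    callback that runs one calibration attempt at the given limit.
    Returns (stable_limit_or_None, attempts_used).
    """
    attempts = 0
    limit = start_w
    while limit > min_w:
        limit = max(min_w, limit - step_w)  # never go below the floor
        attempts += 1
        if is_stable(limit):
            return limit, attempts
    return None, attempts
```

    With a failed start of 400 W, a floor of 250 W and a true limit of
    262 W, linear_derate(400, 250, lambda w: w <= 262) returns (250, 6):
    six attempts, and the result sits 12 W below what the GPU could hold.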

    New strategy:

    • Binary search between minLimitW (lo, assumed stable floor) and the
      starting/failed limit (hi, confirmed unstable), converging within a
      10 W tolerance in ~4 attempts.
    • For thermal throttle: the first-quarter telemetry rows estimate the
      GPU's pre-throttle power draw (onset). nextLimit =
      round5W(onset - 10 W) is used as the initial candidate instead of the
      binary midpoint, landing much closer to the true limit on the first
      step.
    • On success: lo is updated and a higher level is tried (binary search
      upward) until hi-lo ≤ tolerance, ensuring the highest stable limit is
      found rather than the first stable one.
    • Let targeted_power run to natural completion on throttle (no mid-run
      SIGKILL) so nv-hostengine releases its diagnostic slot cleanly before
      the next attempt.
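
    The strategy above can be sketched as follows. This is a minimal
    illustration under stated assumptions, not the actual implementation:
    Attempt, run_attempt, estimate_onset_w and calibrate are hypothetical
    names; the onset is assumed to be the mean of the first quarter of
    power samples; round5w is assumed to round half up to the nearest 5 W;
    and the midpoint is a plain integer midpoint.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Attempt:
    stable: bool
    throttled: bool = False              # True if the run hit thermal throttle
    power_rows: Sequence[float] = ()     # per-sample power draw in watts


def round5w(watts: float) -> int:
    """Round to the nearest multiple of 5 W (half rounds up; assumption)."""
    return int(math.floor(watts / 5.0 + 0.5)) * 5


def estimate_onset_w(power_rows: Sequence[float]) -> float:
    """Mean of the first quarter of samples (assumed estimator)."""
    head = power_rows[: max(1, len(power_rows) // 4)]
    return sum(head) / len(head)


def calibrate(
    min_limit_w: int,
    start_limit_w: int,
    run_attempt: Callable[[int], Attempt],
    tolerance_w: int = 10,
) -> Tuple[int, List[int]]:
    """Return (highest stable limit found, limits tried in order).

    run_attempt is expected to let the diagnostic finish on its own
    rather than kill it mid-run.
    """
    result = run_attempt(start_limit_w)
    tried = [start_limit_w]
    if result.stable:
        return start_limit_w, tried

    lo, hi = min_limit_w, start_limit_w  # lo assumed stable, hi confirmed unstable
    jump_used = False
    while hi - lo > tolerance_w:
        candidate = (lo + hi) // 2       # default: binary midpoint
        if not jump_used and result.throttled and result.power_rows:
            # Telemetry-guided jump, used once as the initial candidate.
            guess = round5w(estimate_onset_w(result.power_rows) - 10)
            if lo < guess < hi:
                candidate = guess
        jump_used = True
        result = run_attempt(candidate)
        tried.append(candidate)
        if result.stable:
            lo = candidate               # success: search upward
        else:
            hi = candidate               # failure: search downward
    return lo, tried
```

    Because lo only moves on a successful run, the returned value is the
    highest limit confirmed stable (or minLimitW if nothing passed), not
    merely the first one that worked.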

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
