• v7.14 bf6ecab4f0

    Add per-precision benchmark phases, weighted TOPS scoring, and ECC tracking

    mchus released this 2026-04-13 10:49:49 +03:00 | 191 commits to main since this release

    • Split steady window into 6 equal slots: fp8/fp16/fp32/fp64/fp4 + combined
    • Each precision phase runs bee-gpu-burn with a --precision filter, so PowerCVPct reflects single-kernel stability rather than a round-robin artifact
    • Add fp4 support in bee-gpu-stress.c for Blackwell (cc>=100) via existing CUDA_R_4F_E2M1 guard
    • Weighted TOPS: fp64×2.0, fp32×1.0, fp16×0.5, fp8×0.25, fp4×0.125
    • SyntheticScore = sum of weighted TOPS from per-precision phases
    • MixedScore = sum of weighted TOPS from the combined phase; MixedEfficiency = MixedScore / SyntheticScore
    • ComputeScore = SyntheticScore × (1 + MixedEfficiency × 0.3)
    • ECC volatile counters sampled before/after each phase and overall
    • DegradationReasons: ecc_uncorrected_errors, ecc_corrected_errors
    • Report: per-precision stability table with ECC columns, methodology section
    • Ramp-up history table redesign: GPU indices as columns, runs as rows
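The scoring bullets above can be sketched in a few lines of Python. This is an illustrative sketch, not the project's implementation: the function names and the dictionary layout are hypothetical, and it assumes the combined phase reports per-precision TOPS that are weighted the same way as the single-precision phases. The zero-division guard is also an assumption.

```python
# Precision weights as listed in the release notes.
WEIGHTS = {"fp64": 2.0, "fp32": 1.0, "fp16": 0.5, "fp8": 0.25, "fp4": 0.125}

# Bonus factor applied to MixedEfficiency in the ComputeScore formula.
MIXED_BONUS = 0.3


def weighted_tops(tops_by_precision):
    """Sum measured TOPS after applying the per-precision weights."""
    return sum(WEIGHTS[p] * tops for p, tops in tops_by_precision.items())


def compute_scores(per_precision_tops, combined_tops):
    """Hypothetical helper: derive all four scores from two TOPS maps.

    per_precision_tops: TOPS measured in each dedicated precision phase.
    combined_tops: TOPS per precision measured in the combined phase
                   (assumed layout; the release notes do not specify it).
    """
    synthetic = weighted_tops(per_precision_tops)
    mixed = weighted_tops(combined_tops)
    # Assumption: report zero efficiency when nothing was measured.
    efficiency = mixed / synthetic if synthetic > 0 else 0.0
    compute = synthetic * (1 + efficiency * MIXED_BONUS)
    return {
        "SyntheticScore": synthetic,
        "MixedScore": mixed,
        "MixedEfficiency": efficiency,
        "ComputeScore": compute,
    }


if __name__ == "__main__":
    # Made-up TOPS figures, used only to exercise the formulas.
    solo = {"fp64": 10.0, "fp32": 20.0, "fp16": 40.0, "fp8": 80.0, "fp4": 160.0}
    mixed = {"fp64": 5.0, "fp32": 10.0, "fp16": 20.0, "fp8": 40.0, "fp4": 80.0}
    for name, value in compute_scores(solo, mixed).items():
        print(f"{name}: {value:.3f}")
```

Because MixedEfficiency is multiplied by 0.3, a GPU that sustains in the combined phase the same weighted throughput it reached in the dedicated phases gains 30% over its SyntheticScore, while one that produces nothing in the combined phase keeps its SyntheticScore unchanged.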
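The ECC bullets describe sampling volatile counters before and after each phase and turning the result into degradation reasons. A minimal sketch of that before/after delta logic follows. The `EccSample` type and both function names are hypothetical, and the rules that any non-zero delta raises a reason and that negative deltas are clamped to zero (for example after a counter reset) are assumptions, not behavior confirmed by the release notes. How the counters are actually read from the driver is out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EccSample:
    """Hypothetical snapshot of a GPU's volatile ECC counters."""
    corrected: int
    uncorrected: int


def ecc_delta(before, after):
    """Errors accumulated between two samples.

    Assumption: a negative difference means the counters were reset
    mid-phase, so it is clamped to zero rather than reported.
    """
    return EccSample(
        corrected=max(0, after.corrected - before.corrected),
        uncorrected=max(0, after.uncorrected - before.uncorrected),
    )


def degradation_reasons(delta):
    """Map a delta to the reason strings named in the release notes.

    Assumption: any non-zero count is enough to raise the reason.
    """
    reasons = []
    if delta.uncorrected > 0:
        reasons.append("ecc_uncorrected_errors")
    if delta.corrected > 0:
        reasons.append("ecc_corrected_errors")
    return reasons


if __name__ == "__main__":
    # Made-up counter values for a single phase.
    before = EccSample(corrected=3, uncorrected=0)
    after = EccSample(corrected=5, uncorrected=0)
    print(degradation_reasons(ecc_delta(before, after)))
```

The same two functions can be applied to the samples taken at the start and end of the whole run to obtain the overall figure alongside the per-phase ones.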

    Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
