bee/bible-local/decisions/2026-03-05-nvidia-proprietary-driver.md
2026-03-31 11:15:15 +03:00

# Decision: Use NVIDIA proprietary driver, not open kernel modules
**Date:** 2026-03-05
**Status:** active
## Context
bee needs to collect GPU serial numbers, VBIOS versions, and ECC telemetry via `nvidia-smi`.
Two options exist: NVIDIA open-gpu-kernel-modules (MIT/GPLv2, GitHub) or the official
proprietary `.run` installer.
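
As a rough illustration of the telemetry collection described above, the sketch below parses `nvidia-smi --query-gpu=... --format=csv,noheader` output into one record per GPU. The field selection, the function names `parse_gpu_csv` and `query_gpus`, and the sample rows are assumptions for illustration, not bee's actual collector.

```python
import csv
import io
import subprocess

# Query fields passed to `nvidia-smi --query-gpu`; this selection is an assumption.
FIELDS = ["serial", "vbios_version", "ecc.errors.corrected.volatile.total"]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader` output into one dict per GPU."""
    rows = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        rows.append(dict(zip(FIELDS, (cell.strip() for cell in row))))
    return rows


def query_gpus():
    """Run nvidia-smi and return parsed rows; needs a loaded driver."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS), "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    # Made-up sample rows; real values depend on the hardware.
    sample = "0000000000001, 00.00.00.00.01, 0\n0000000000002, 00.00.00.00.01, [N/A]\n"
    for gpu in parse_gpu_csv(sample):
        print(gpu)
```

Keeping the parser separate from the `subprocess` call means the parsing logic can be exercised on machines without a GPU.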
## Decision
Use the official proprietary NVIDIA `.run` installer for both kernel modules and `nvidia-smi`.
## Consequences
- Kernel modules and nvidia-smi come from a single verified source.
- NVIDIA publishes a `.sha256sum` file alongside each installer; download it and verify the installer before use.
- Driver version pinned in `iso/builder/VERSIONS` as `NVIDIA_DRIVER_VERSION`.
- DCGM must track the CUDA user-mode driver major version exposed by `nvidia-smi`.
- For NVIDIA driver branch `590` with CUDA `13.x`, use DCGM 4 package family `datacenter-gpu-manager-4-cuda13`; legacy `datacenter-gpu-manager` 3.x does not provide a working path for this stack.
- Build process: download `.run`, extract, compile `kernel/` sources against `linux-lts-dev`.
- Modules are cached in `dist/nvidia-<version>-<kver>/`; rebuild only when the driver version or the kernel version changes.
- ISO size increases by ~50 MB for the `.ko` files plus `nvidia-smi`.
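
The checksum verification and cache-reuse rules above could be sketched roughly as follows. This is a minimal sketch under stated assumptions: the `.sha256sum` file is assumed to follow the usual `sha256sum` layout (hex digest, whitespace, file name), a cache directory is treated as valid if it contains at least one `.ko` file, and all function names are hypothetical rather than taken from the bee builder.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_installer(run_path, sum_path):
    """Compare the installer digest with the first token of the checksum file.

    Assumes the checksum file starts with the hex digest (sha256sum layout).
    """
    expected = Path(sum_path).read_text().split()[0].lower()
    return sha256_of(run_path) == expected


def module_cache_dir(dist, driver_version, kver):
    """Build the dist/nvidia-<version>-<kver>/ cache path."""
    return Path(dist) / f"nvidia-{driver_version}-{kver}"


def needs_rebuild(dist, driver_version, kver):
    """Rebuild when no cached .ko files exist for this version/kernel pair."""
    cache = module_cache_dir(dist, driver_version, kver)
    return not cache.is_dir() or not any(cache.glob("*.ko"))
```

Because the driver version and kernel version are both part of the directory name, a change in either one points at a different directory and triggers a rebuild without any extra bookkeeping.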