feat(dcgm): add NVIDIA DCGM diagnostics, fix KVM console

- Add 9002-nvidia-dcgm.hook.chroot: installs datacenter-gpu-manager
  from NVIDIA apt repo during live-build
- Enable nvidia-dcgm.service in chroot setup hook
- Replace bee-gpu-stress with dcgmi diag (levels 1-4) in NVIDIA SAT
- TUI: replace GPU checkbox + duration UI with DCGM level selection
- Remove console=tty2 from boot params: KVM/VGA now shows tty1
  where bee-tui runs, fixing unresponsive console
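
As a rough illustration of the DCGM level selection described above, here is a minimal sketch. It assumes the SAT shells out to `dcgmi diag -r <level>`. The helper name `dcgm_diag_cmd` and the validation logic are hypothetical and not taken from this commit.

```shell
#!/bin/sh
# Hypothetical helper: build the dcgmi command line for a diagnostic level.
# `dcgmi diag -r <level>` is the DCGM CLI form for running diagnostics;
# the function name and the 1-4 validation are illustrative assumptions.
dcgm_diag_cmd() {
    level="$1"
    case "$level" in
        1|2|3|4)
            # Print the command instead of running it, so the caller
            # decides how to execute and capture output.
            printf 'dcgmi diag -r %s\n' "$level"
            ;;
        *)
            printf 'invalid DCGM level: %s (expected 1-4)\n' "$level" >&2
            return 1
            ;;
    esac
}
```

A caller could run the result with `$(dcgm_diag_cmd 2)`, or log it first. Printing the command rather than executing it keeps the sketch usable on machines without DCGM installed.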

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 23:08:12 +03:00
parent 967455194c
commit eea98e6d76
12 changed files with 121 additions and 170 deletions


@@ -21,6 +21,7 @@ ensure_bee_console_user() {
ensure_bee_console_user
# Enable bee services
systemctl enable nvidia-dcgm.service 2>/dev/null || true
systemctl enable bee-network.service
systemctl enable bee-nvidia.service
systemctl enable bee-preflight.service