memtest86+ postinst does not place files in /boot in a live-build chroot
without grub triggers. Added fallback: extract directly from the cached
.deb via dpkg-deb -x, with verbose logging throughout.
Also remove "NVIDIA no MSI-X" from boot menu (premature — root cause unknown).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- build.sh: add --variant nvidia|amd; separate work dirs per variant
(live-build-work-nvidia / live-build-work-amd); GPU-specific steps
(modules, NCCL, cuBLAS, nccl-tests) run only for nvidia; deb package
cache synced back to shared location after each lb build so second
variant reuses downloaded packages; ISO output named
easy-bee-{variant}-v{ver}-amd64.iso
- build-in-container.sh: add --variant nvidia|amd|all (default: all);
runs build.sh twice in one container for 'all'; --clean-build wipes
both variant work dirs
- package-lists: remove GPU packages from bee.list.chroot; add
bee-nvidia.list.chroot (DCGM) and bee-amd.list.chroot (ROCm)
- 9000-bee-setup hook: read /etc/bee-gpu-vendor; enable bee-nvidia.service
and DCGM only for nvidia; set up ROCm symlinks only for amd
- auto/config: --iso-volume uses BEE_GPU_VENDOR_UPPER env var
- grub.cfg: add nomodeset to EASY-BEE and EASY-BEE (load to RAM) entries
— fixes X/lightdm on BMC KVM (ASPEED AST chip requires nomodeset for
fbdev to work; NVIDIA H100 compute does not need KMS)
- bee.sh / smoketest.sh: add /usr/sbin to PATH so dmidecode, smartctl,
nvme are found
- 9100-memtest hook: add diagnostic listing of chroot/boot/memtest* files
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add udev rule: /dev/ipmi0 readable by 'ipmi' group (no sudo needed)
- Add 'ipmi' group creation and bee user membership in chroot hook
- Remove legend from all charts (data shown in GPU table below)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add rocm-validation-suite, rocblas, rocrand, hip-runtime-amd,
hipblaslt, comgr to ISO (~700MB, needed for HIP compute)
- RunAMDStressPack: run RVS GST (SGEMM ~31 TFLOPS/GPU) + bandwidth test
- Add rvs symlink in chroot setup hook
- Pin all new package versions in VERSIONS
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add rocm-bandwidth-test package to ISO
- Add bee user to 'render' group (/dev/kfd, /dev/dri/renderD* access)
- Add rocm-bandwidth-test symlink alongside rocm-smi
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
memtest files live in chroot /boot (inside squashfs) but GRUB needs
them on the ISO filesystem. Binary hook copies them out at build time.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Move datacenter-gpu-manager and rocm-smi-lib from dynamic chroot hooks
into live-build's config/archives mechanism so lb caches the .deb files
in cache/packages.chroot/ between builds, eliminating repeated 900+ MB
downloads. Versions pinned via VERSIONS and substituted into package
lists at build time.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Removes ~100-300MB from the squashfs: man pages, non-en locales,
python cache, apt lists and package cache, temp files and logs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
startx from user shell has /dev/fb0 permission issues and is fragile.
LightDM starts Xorg as root — standard LiveCD approach that works
on server hardware / IPMI KVM with nomodeset + fbdev/vesa.
- Add lightdm package, configure autologin as bee/openbox session
- Add /usr/share/xsessions/openbox.desktop
- Remove startx from .profile (LightDM manages X lifecycle)
- Remove Xwrapper.config needs_root_rights workaround (no longer needed)
- Enable lightdm.service in setup hook
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AMD does not publish Debian Bookworm packages at all (only focal/jammy/noble).
Switch ROCM_UBUNTU_DIST to "jammy"; jammy packages install cleanly on
Debian 12 due to compatible glibc. Also expand candidate list to include
point-releases (6.3.4, 6.3.3, …) so we pick the latest actually-published one.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ROCm 6.4 does not yet publish a Release file for Debian Bookworm, causing
the live-build chroot hook to fail with "does not have a Release file".
Try each version in ROCM_CANDIDATES order; skip to the next if apt-get update
fails (repo unavailable). Exit gracefully if none are available.
Also rename inner 'candidate' variable to 'smi_path' to avoid collision.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- firmware-amd-graphics: Aldebaran firmware blobs (fixes amdgpu IB ring
test errors on MI250/MI250X at boot)
- 9001-amd-rocm.hook.chroot: adds AMD ROCm 6.4 apt repo and installs
rocm-smi-lib for GPU monitoring (analogous to nvidia-smi)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>