Merge debug/prod into single ISO build, fix NVIDIA module loading

## ISO build consolidation
- Remove separate debug/prod split: overlay-debug/, build-debug.sh,
  mkimg.bee_debug.sh, genapkovl-bee_debug.sh all deleted
- Single overlay: iso/overlay/ (was overlay-debug content)
- Single build script: build.sh (SSH, TUI, NVIDIA, vendor tools, bee-release)
- Single mkimage profile: bee (with dropbear, dialog, strace, gcompat, etc.)

## NVIDIA fixes
- Modules now stored at /usr/local/lib/nvidia/ instead of
  /lib/modules/<kver>/extra/nvidia/: the modloop squashfs mounts over that
  path at boot, making overlay content there inaccessible
- bee-nvidia init: load via insmod (absolute path), not modprobe
- bee-nvidia init: create libnvidia-ml.so.1/libcuda.so.1 symlinks in /usr/lib/
- build-nvidia-module.sh: always install linux-lts-dev (no longer conditional);
  stale 6.6.x headers produced modules built for the wrong kernel, which never
  loaded at runtime
- build-nvidia-module.sh: create soname symlinks in cache
- KERNEL_VERSION in VERSIONS updated 6.6 → 6.12
- gcompat added to ISO packages (nvidia-smi is a glibc binary on musl Alpine)
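
A minimal sketch of the loading and symlink steps described above, not the actual bee-nvidia script. The function names, the module list and order, and the `INSMOD` override are assumptions made for illustration; only the /usr/local/lib/nvidia/ path and the two soname targets come from this commit message.

```shell
# Assumed defaults; NVIDIA_DIR matches the path named in this commit.
NVIDIA_DIR="${NVIDIA_DIR:-/usr/local/lib/nvidia}"
INSMOD="${INSMOD:-insmod}"   # overridable so the logic can be dry-run

# Load modules by absolute path. modprobe is avoided because it only
# searches /lib/modules/<kver>/, which modloop mounts over at boot.
load_nvidia_modules() {
    # nvidia.ko goes first: nvidia-modeset and nvidia-uvm depend on it.
    for mod in nvidia nvidia-modeset nvidia-uvm; do
        ko="$NVIDIA_DIR/$mod.ko"
        [ -f "$ko" ] || { echo "skip: $ko not found" >&2; continue; }
        "$INSMOD" "$ko" || return 1
    done
}

# Create <name>.so.1 symlinks pointing at the versioned libraries.
# $1 = directory holding the versioned libs, $2 = destination (e.g. /usr/lib)
link_sonames() {
    for base in libnvidia-ml libcuda; do
        for lib in "$1/$base".so.*.*; do
            [ -e "$lib" ] || continue   # glob did not match anything
            ln -sf "$lib" "$2/$base.so.1"
        done
    done
}
```

Setting `INSMOD=echo` prints the paths that would be loaded without touching the kernel, which is convenient when checking the overlay layout on a build host.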

## Service ordering
- bee-audit: add `after bee-nvidia` so NVIDIA enrichment always succeeds
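
For reference, the ordering hint lives in the OpenRC `depend()` function of the service script. The excerpt below is a hypothetical fragment of /etc/init.d/bee-audit (normally started with the `#!/sbin/openrc-run` shebang); only the `after bee-nvidia` line is taken from this commit.

```shell
# Hypothetical excerpt of the bee-audit service script.
depend() {
    # "after" only orders startup; unlike "need", it does not fail
    # bee-audit if bee-nvidia is absent or does not start.
    after bee-nvidia
}
```

Using `after` rather than `need` keeps the audit running on machines without an NVIDIA GPU.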

## New tooling
- iso/builder/smoketest.sh: SSH smoke test for post-boot ISO validation
- iso/builder/build-gpu-burn.sh: builds gpu_burn vendor binary (CUDA 12.8+)
- vendor/gpu_burn included automatically if placed in iso/vendor/
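
The "included automatically" behaviour can be pictured roughly as follows. This is a sketch, not the code in build.sh: the function name `copy_vendor_tools` and the /usr/local/bin destination are assumptions.

```shell
# Copy every regular file from the vendor directory into the overlay
# staging tree and mark it executable. A missing directory is not an error.
# $1 = vendor directory (e.g. iso/vendor), $2 = staging root
copy_vendor_tools() {
    [ -d "$1" ] || return 0
    mkdir -p "$2/usr/local/bin"
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        cp "$f" "$2/usr/local/bin/" &&
            chmod +x "$2/usr/local/bin/$(basename "$f")"
    done
}
```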

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mikhail Chusavitin
2026-03-06 20:14:18 +03:00
parent 0907ba07c3
commit 1768bb58dd
24 changed files with 1296 additions and 261 deletions

@@ -12,18 +12,19 @@ makefile() { OWNER="$1" PERMS="$2" FILENAME="$3"; cat > "$FILENAME"; chown "$OWN
rc_add() { mkdir -p "$tmp/etc/runlevels/$2"; ln -sf /etc/init.d/"$1" "$tmp/etc/runlevels/$2/$1"; }
mkdir -p "$tmp/etc"
makefile root:root 0644 "$tmp/etc/hostname" <<EOT
makefile root:root 0644 "$tmp/etc/hostname" <<EOF
$HOSTNAME
EOT
EOF
# Empty interfaces file — prevents ifupdown from erroring, bee-network handles DHCP
mkdir -p "$tmp/etc/network"
makefile root:root 0644 "$tmp/etc/network/interfaces" <<EOT
makefile root:root 0644 "$tmp/etc/network/interfaces" <<EOF
auto lo
iface lo inet loopback
EOT
EOF
mkdir -p "$tmp/etc/apk"
makefile root:root 0644 "$tmp/etc/apk/world" <<EOT
makefile root:root 0644 "$tmp/etc/apk/world" <<EOF
alpine-base
dmidecode
smartmontools
@@ -34,12 +35,18 @@ util-linux
lsblk
e2fsprogs
lshw
openrc
ca-certificates
dropbear
libqrencode-tools
tzdata
jq
wget
EOT
ca-certificates
strace
procps
lsof
file
less
vim
dialog
EOF
rc_add devfs sysinit
rc_add dmesg sysinit
@@ -58,14 +65,16 @@ rc_add mount-ro shutdown
rc_add killprocs shutdown
rc_add savecache shutdown
rc_add bee-sshsetup default
rc_add bee-network default
rc_add bee-update default
rc_add dropbear default
rc_add bee-nvidia default
rc_add bee-audit default
rc_add bee-audit-debug default
if [ -d "$OVERLAY/etc" ]; then
cp -r "$OVERLAY/etc/." "$tmp/etc/"
chmod +x "$tmp/etc/init.d/"* 2>/dev/null || true
[ -n "$BEE_BUILD_INFO" ] && sed -i "s/%%BUILD_INFO%%/${BEE_BUILD_INFO}/" "$tmp/etc/motd" 2>/dev/null || true
fi
mkdir -p "$tmp/usr"
@@ -74,9 +83,24 @@ if [ -d "$OVERLAY/usr" ]; then
chmod +x "$tmp/usr/local/bin/"* 2>/dev/null || true
fi
if [ -d "$OVERLAY/root" ]; then
mkdir -p "$tmp/root"
cp -r "$OVERLAY/root/." "$tmp/root/"
chmod 700 "$tmp/root/.ssh" 2>/dev/null || true
chmod 600 "$tmp/root/.ssh/authorized_keys" 2>/dev/null || true
fi
if [ -d "$OVERLAY/lib" ]; then
mkdir -p "$tmp/lib"
cp -r "$OVERLAY/lib/." "$tmp/lib/"
fi
tar -c -C "$tmp" etc usr lib 2>/dev/null | gzip -9n > "$HOSTNAME.apkovl.tar.gz"
mkdir -p "$tmp/etc/dropbear" "$tmp/etc/conf.d"
# -R: auto-generate host keys if missing
# no dependency on networking service — bee-network handles DHCP independently
makefile root:root 0644 "$tmp/etc/conf.d/dropbear" <<EOF
DROPBEAR_OPTS="-R -B"
EOF
tar -c -C "$tmp" etc usr root lib 2>/dev/null | gzip -9n > "$HOSTNAME.apkovl.tar.gz"
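
Since the final `tar` line now adds `root` to the archive, a quick check that the directory actually landed in the apkovl could look like the sketch below. The helper name `apkovl_has_dir` is hypothetical and is not part of the script above.

```shell
# Succeeds if gzip-compressed archive $1 contains top-level directory $2.
apkovl_has_dir() {
    tar -tzf "$1" | grep -q "^$2/"
}

# Example use after the build step:
#   apkovl_has_dir "$HOSTNAME.apkovl.tar.gz" root || echo "root/ missing" >&2
```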