The static KERNEL_PKG_VERSION pin was the root cause of nvidia-smi never
working: modules were compiled for pinned version (e.g. 6.12.76-r0) but
the ISO kernel was unpinned (latest from repo at build time). When Alpine
updated linux-lts, the two diverged silently.
Fix: both steps now use whatever linux-lts is current in Alpine 3.21 main
at build time. build-nvidia-module.sh uses `apk add --update linux-lts-dev`
(no version pin), mkimage gets the same package from the same mirror.
Module cache is still keyed by detected KVER so rebuilds remain fast.
Removed: KERNEL_VERSION, KERNEL_PKG_VERSION from VERSIONS, all pin references
from build.sh and build-nvidia-module.sh.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- bee-audit init.d: use --output file: so "audit output written" is logged
(stdout mode silently redirects, never emits the slog confirmation)
- build-nvidia-module.sh: use $KERNEL_SRC in find for .ko collection
(was hardcoded $EXTRACT_DIR/kernel, silent failure if path differs)
- smoketest: add bee-audit to required services (was never checked)
- smoketest: remove legacy bee-audit-debug from service list
- smoketest: internet ping → warn (live CD runs in isolated network, no internet)
- build.sh: auto-copy smoketest.sh → overlay/usr/local/bin/bee-smoketest
(removes manual sync hazard; smoketest.sh is now single source of truth)
- remove static overlay/usr/local/bin/bee-smoketest (generated by build.sh now)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Stale apks_* dirs (from old mirror or previous version pin) cause
"unable to select package" failures. Nuke them on every build.
kernel_*, syslinux_*, grub_* are still preserved — they're large,
stable, and only need to change when KERNEL_PKG_VERSION changes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
mkimage checks CWD (/var/tmp) before ~/.mkimage/ for genapkovl scripts.
Old genapkovl-bee.sh left in /var/tmp from previous builds was overriding
the updated version, causing bee-audit-debug to persist in runlevel.
Also add gcompat to apk world so it's installed at boot (was in apks cache
but missing from world file, so nvidia-smi failed with missing ld-linux).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both build-nvidia-module.sh (apk add) and mkimage.sh (--repository) now
explicitly use dl-cdn. Local builder mirror config is ignored.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Root cause of linux-lts pin failure: mkimage was using dl-cdn.alpinelinux.org
while the builder uses mirrors.hosterion.ro — different mirrors can have different
package availability at any given moment.
Now mkimage reads repositories directly from /etc/apk/repositories on the builder,
ensuring both module build and ISO package install use the same mirror.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
mkimage.sh calls git internally. Running it from inside /root/bee causes
"outside repository" fatal errors. /var/tmp is outside the git repo.
genapkovl is found via ~/.mkimage/ so no copy to /var/tmp needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Pin linux-lts to exact KERNEL_PKG_VERSION=6.12.76-r0 in build and ISO package list
- Add build-time verification that compiled kernel version matches pin (fails loudly)
- Fix bee-audit-debug → bee-audit in genapkovl OpenRC registration (service was never starting)
- Add AUDIT_VERSION=0.1.0 to VERSIONS (was undefined, bee-release had empty fields)
- Pin linux-lts-dev version in second apk add in build-nvidia-module.sh
- Add /root/.profile to overlay so /usr/local/bin is in PATH for SSH sessions
- Remove "DEBUG MODE" from motd
- Fix smoketest: grep for slog "audit output written" instead of non-existent "audit completed"
- Document no-internet constraint in system-overview and runtime-flows
- Remove redundant genapkovl copy to /var/tmp (now found via ~/.mkimage/)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without this, old mkimg.bee_debug.sh left from previous builds
causes mkimage to build both bee and bee_debug profiles.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
gpu_burn requires CUDA toolkit (~4GB) to build and the resulting binary
would significantly inflate the ISO. Removed from vendor tool list and
smoketest. build-gpu-burn.sh dropped as well.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
## ISO build consolidation
- Remove separate debug/prod split: overlay-debug/, build-debug.sh,
mkimg.bee_debug.sh, genapkovl-bee_debug.sh all deleted
- Single overlay: iso/overlay/ (was overlay-debug content)
- Single build script: build.sh (SSH, TUI, NVIDIA, vendor tools, bee-release)
- Single mkimage profile: bee (with dropbear, dialog, strace, gcompat, etc.)
## NVIDIA fixes
- Modules now stored at /usr/local/lib/nvidia/ instead of
/lib/modules/<kver>/extra/nvidia/ — modloop squashfs mounts over that
path at boot making overlay content there inaccessible
- bee-nvidia init: load via insmod (absolute path), not modprobe
- bee-nvidia init: create libnvidia-ml.so.1/libcuda.so.1 symlinks in /usr/lib/
- build-nvidia-module.sh: always install linux-lts-dev (not conditional) —
stale 6.6.x headers caused wrong-kernel modules that never loaded at runtime
- build-nvidia-module.sh: create soname symlinks in cache
- KERNEL_VERSION in VERSIONS updated 6.6 → 6.12
- gcompat added to ISO packages (nvidia-smi is a glibc binary on musl Alpine)
## Service ordering
- bee-audit: add `after bee-nvidia` so NVIDIA enrichment always succeeds
## New tooling
- iso/builder/smoketest.sh: SSH smoke test for post-boot ISO validation
- iso/builder/build-gpu-burn.sh: builds gpu_burn vendor binary (CUDA 12.8+)
- vendor/gpu_burn included automatically if placed in iso/vendor/
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>