5 Commits

Author SHA1 Message Date
Mikhail Chusavitin
11e001cafa fix: add libc6-compat — required for dlopen of glibc shared objects on Alpine
gcompat alone provides only the ELF interpreter entry point (/lib64/ld-linux-x86-64.so.2).
It does NOT provide libpthread.so.0, libm.so.6, libdl.so.2, libc.so.6 stubs.

libnvidia-ml.so.590 has NEEDED: libpthread.so.0 etc. When nvidia-smi calls
dlopen("libnvidia-ml.so.1"), musl's linker fails to satisfy these deps
→ NVML_ERROR_LIBRARY_NOT_FOUND (exit 12), "couldn't find libnvidia-ml.so".

libc6-compat provides the missing stubs (libpthread.so.0, libm.so.6, libdl.so.2,
libc.so.6, librt.so.1) as musl redirects, enabling dlopen of glibc shared objects.
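
One way to confirm the stubs are present before blaming NVML is to look for each soname in the library directory. This is only a sketch: the function name `missing_glibc_stubs` and the directory argument are hypothetical, and the soname list is the one given above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repo): print every glibc soname
# from the list above that is absent from the given library directory.
missing_glibc_stubs() {
    libdir="$1"
    for soname in libpthread.so.0 libm.so.6 libdl.so.2 libc.so.6 librt.so.1; do
        # -e follows symlinks, so a dangling stub also counts as missing
        if [ ! -e "$libdir/$soname" ]; then
            echo "$soname"
        fi
    done
}
```

Calling `missing_glibc_stubs /lib` on the booted image (the `/lib` location is an assumption) should print nothing once the stubs are installed.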

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 17:03:45 +03:00
Mikhail Chusavitin
7d19fb8f60 Fix stale genapkovl in /var/tmp shadowing ~/.mkimage version
mkimage checks CWD (/var/tmp) before ~/.mkimage/ for genapkovl scripts.
Old genapkovl-bee.sh left in /var/tmp from previous builds was overriding
the updated version, causing bee-audit-debug to persist in runlevel.

Also add gcompat to apk world so it's installed at boot (was in apks cache
but missing from world file, so nvidia-smi failed with missing ld-linux).
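
A build script could guard against this shadowing by clearing leftover genapkovl scripts from the working directory before invoking mkimage. The sketch below assumes that approach; `remove_stale_genapkovl` is a hypothetical name and may not match what the actual fix does.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repo): delete genapkovl-*.sh files
# left in the mkimage working directory so the ~/.mkimage/ copy is used.
remove_stale_genapkovl() {
    workdir="$1"
    for f in "$workdir"/genapkovl-*.sh; do
        # If the glob matched nothing, $f is the literal pattern; skip it
        [ -e "$f" ] || continue
        echo "removing stale $f" >&2
        rm -f -- "$f"
    done
}
```

For example, `remove_stale_genapkovl /var/tmp` would run just before the mkimage step.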

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-07 11:55:16 +03:00
Mikhail Chusavitin
ffc7e5c71a Fix critical ISO build bugs: kernel pinning, service registration, PATH, audit checks
- Pin linux-lts to exact KERNEL_PKG_VERSION=6.12.76-r0 in build and ISO package list
- Add build-time verification that compiled kernel version matches pin (fails loudly)
- Fix bee-audit-debug → bee-audit in genapkovl OpenRC registration (service was never starting)
- Add AUDIT_VERSION=0.1.0 to VERSIONS (was undefined, bee-release had empty fields)
- Pin linux-lts-dev version in second apk add in build-nvidia-module.sh
- Add /root/.profile to overlay so /usr/local/bin is in PATH for SSH sessions
- Remove "DEBUG MODE" from motd
- Fix smoketest: grep for slog "audit output written" instead of non-existent "audit completed"
- Document no-internet constraint in system-overview and runtime-flows
- Remove redundant genapkovl copy to /var/tmp (now found via ~/.mkimage/)
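
The build-time kernel check described above could look roughly like the following. It is a sketch under assumptions: `verify_kernel_pin` is a hypothetical name, and it compares only the upstream version before the first dash, since the package revision (`-r0`) and the kernel release suffix are formatted differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repo): fail loudly when the built
# kernel's upstream version differs from the pinned package version.
verify_kernel_pin() {
    pinned="$1"   # package version, e.g. 6.12.76-r0
    built="$2"    # kernel release of the compiled kernel
    pinned_base="${pinned%%-*}"   # strip everything from the first dash
    built_base="${built%%-*}"
    if [ "$pinned_base" != "$built_base" ]; then
        echo "ERROR: built kernel $built does not match pin $pinned" >&2
        return 1
    fi
    return 0
}
```

A build script would call it as `verify_kernel_pin "$KERNEL_PKG_VERSION" "$built_release" || exit 1`, where `built_release` is an assumed variable holding the compiled kernel's release string.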

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-07 10:52:54 +03:00
Mikhail Chusavitin
1768bb58dd Merge debug/prod into single ISO build, fix NVIDIA module loading
## ISO build consolidation
- Remove separate debug/prod split: overlay-debug/, build-debug.sh,
  mkimg.bee_debug.sh, genapkovl-bee_debug.sh all deleted
- Single overlay: iso/overlay/ (was overlay-debug content)
- Single build script: build.sh (SSH, TUI, NVIDIA, vendor tools, bee-release)
- Single mkimage profile: bee (with dropbear, dialog, strace, gcompat, etc.)

## NVIDIA fixes
- Modules now stored at /usr/local/lib/nvidia/ instead of
  /lib/modules/<kver>/extra/nvidia/, because the modloop squashfs mounts over
  that path at boot, making overlay content there inaccessible
- bee-nvidia init: load via insmod (absolute path), not modprobe
- bee-nvidia init: create libnvidia-ml.so.1/libcuda.so.1 symlinks in /usr/lib/
- build-nvidia-module.sh: always install linux-lts-dev (not conditional) —
  stale 6.6.x headers caused wrong-kernel modules that never loaded at runtime
- build-nvidia-module.sh: create soname symlinks in cache
- KERNEL_VERSION in VERSIONS updated 6.6 → 6.12
- gcompat added to ISO packages (nvidia-smi is a glibc binary on musl Alpine)
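
The soname symlink step for libnvidia-ml and libcuda can be sketched as below. `link_sonames` is a hypothetical name, and the assumption that the real libraries are named `<base>.so.<major>.<rest>` is illustrative rather than taken from the repo.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repo): point <base>.so.1 at the
# fully versioned library found in the same directory.
link_sonames() {
    libdir="$1"
    for base in libnvidia-ml libcuda; do
        # The pattern needs at least two dots after ".so", so it never
        # matches the .so.1 link itself
        for real in "$libdir/$base".so.*.*; do
            [ -e "$real" ] || continue
            # Relative link target; if several versions match, the last
            # one in glob order wins
            ln -sf "$(basename "$real")" "$libdir/$base.so.1"
        done
    done
}
```

With the links in place, `dlopen("libnvidia-ml.so.1")` can resolve the name, provided the library's own NEEDED dependencies are also satisfied.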

## Service ordering
- bee-audit: add `after bee-nvidia` so NVIDIA enrichment always succeeds

## New tooling
- iso/builder/smoketest.sh: SSH smoke test for post-boot ISO validation
- iso/builder/build-gpu-burn.sh: builds gpu_burn vendor binary (CUDA 12.8+)
- vendor/gpu_burn included automatically if placed in iso/vendor/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 20:14:18 +03:00
Mikhail Chusavitin
18b8c69bc5 Implement audit enrichments, TUI workflows, and production ISO scaffold
2026-03-06 11:56:26 +03:00