iso: improve burn-in, export, and live boot

2026-03-26 18:56:19 +03:00
parent 67a215c66f
commit fc5c2019aa
23 changed files with 1706 additions and 168 deletions
--- a/bible-local/architecture/runtime-flows.md
+++ b/bible-local/architecture/runtime-flows.md
@@ -9,6 +9,8 @@ DHCP is used only for LAN (operator SSH access). Internet is NOT available.

 ## Boot sequence (single ISO)

+The live system is expected to boot with `toram`, so `live-boot` copies the full read-only medium into RAM before mounting the root filesystem. After that point, runtime must not depend on the original USB/BMC virtual media staying readable.
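
One way to confirm this on a running system is to inspect the kernel command line. The following is a minimal sketch for a POSIX shell; `has_boot_param` is a hypothetical helper and not something the repository is known to ship. It matches only the bare `toram` word, as written in this document.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a parameter appears as a whole word
# in a kernel command line string.
#   $1 = parameter to look for (e.g. "toram")
#   $2 = command line string (e.g. contents of /proc/cmdline)
has_boot_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Read the running kernel's command line; empty if the file is unreadable.
cmdline="$(cat /proc/cmdline 2>/dev/null || true)"

if has_boot_param "boot=live" "$cmdline" && has_boot_param "toram" "$cmdline"; then
    echo "live boot with toram: OK"
else
    echo "warning: not booted with 'boot=live toram'" >&2
fi
```

Padding both the pattern and the string with spaces keeps a parameter such as `notoram` from matching `toram`.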
+
 `systemd` boot order:

 ```
@@ -25,6 +27,7 @@ local-fs.target
 ```

 **Critical invariants:**
+- The live ISO boots with `boot=live toram`. Runtime binaries must continue working even if the original boot media disappears after early boot.
 - OpenSSH MUST start without network. `bee-sshsetup.service` runs before `ssh.service`.
 - `bee-network.service` uses `dhclient -nw` (background) — network bring-up is best effort and non-blocking.
 - `bee-nvidia.service` loads modules via `insmod` with absolute paths — NOT `modprobe`.
@@ -71,24 +74,39 @@ build-in-container.sh [--authorized-keys /path/to/keys]
       d. build kernel modules against Debian headers
       e. create `libnvidia-ml.so.1` / `libcuda.so.1` symlinks in cache
       f. cache in `dist/nvidia-<version>-<kver>/`
-  7. inject NVIDIA `.ko` → staged `/usr/local/lib/nvidia/`
-  8. inject `nvidia-smi` → staged `/usr/local/bin/nvidia-smi`
-  9. inject `libnvidia-ml` + `libcuda` → staged `/usr/lib/`
-  10. write staged `/etc/bee-release` (versions + git commit)
-  11. patch staged `motd` with build metadata
-  12. copy `iso/builder/` into a temporary live-build workdir under `dist/`
-  13. sync staged overlay into workdir `config/includes.chroot/`
-  14. run `lb config && lb build` inside the privileged builder container
+  7. `build-cublas.sh`:
+       a. download `libcublas`, `libcublasLt`, `libcudart` runtime + dev packages from the NVIDIA CUDA Debian repo
+       b. verify packages against repo `Packages.gz`
+       c. extract headers for `bee-gpu-stress` build
+       d. cache userspace libs in `dist/cublas-<version>+cuda<series>/`
+  8. build `bee-gpu-stress` against extracted cuBLASLt/cudart headers
+  9. inject NVIDIA `.ko` → staged `/usr/local/lib/nvidia/`
+  10. inject `nvidia-smi` → staged `/usr/local/bin/nvidia-smi`
+  11. inject `libnvidia-ml` + `libcuda` + `libcublas` + `libcublasLt` + `libcudart` → staged `/usr/lib/`
+  12. write staged `/etc/bee-release` (versions + git commit)
+  13. patch staged `motd` with build metadata
+  14. copy `iso/builder/` into a temporary live-build workdir under `dist/`
+  15. sync staged overlay into workdir `config/includes.chroot/`
+  16. run `lb config && lb build` inside the privileged builder container
 ```

+Build host notes:
+- `build-in-container.sh` targets `linux/amd64` builder containers by default, including Docker Desktop on macOS / Apple Silicon.
+- Override with `BEE_BUILDER_PLATFORM=<os/arch>` only if you intentionally need a different container platform.
+- If the local builder image under the same tag was previously built for the wrong architecture, the script rebuilds it automatically.
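
The default-and-override behaviour described in these notes could be sketched roughly as follows. Both function names are hypothetical, and the actual logic in `build-in-container.sh` may differ; the only pieces taken from this document are the `BEE_BUILDER_PLATFORM` variable and the `linux/amd64` default. The image check assumes the Docker CLI is available.

```shell
#!/bin/sh
# Hypothetical helper: resolve the builder platform, falling back to the
# documented default when BEE_BUILDER_PLATFORM is unset or empty.
builder_platform() {
    echo "${BEE_BUILDER_PLATFORM:-linux/amd64}"
}

# Hypothetical helper: succeed only if a local image exists and was built
# for the wanted os/arch. Requires the Docker CLI.
#   $1 = image tag, $2 = wanted platform (e.g. "linux/amd64")
image_matches_platform() {
    have="$(docker image inspect --format '{{.Os}}/{{.Architecture}}' "$1" 2>/dev/null)" || return 1
    [ "$have" = "$2" ]
}
```

A caller might rebuild the image whenever `image_matches_platform "$tag" "$(builder_platform)"` fails, which covers both a missing image and one built for the wrong architecture.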
+
 **Critical invariants:**
 - `DEBIAN_KERNEL_ABI` in `iso/builder/VERSIONS` pins the exact kernel ABI used in BOTH places:
  1. `build-in-container.sh` / `build-nvidia-module.sh` — Debian kernel headers for module build
  2. `auto/config` — `linux-image-${DEBIAN_KERNEL_ABI}` in the ISO
 - NVIDIA modules go to staged `usr/local/lib/nvidia/` — NOT to `/lib/modules/<kver>/extra/`.
+- `bee-gpu-stress` must be built against the cached CUDA userspace headers produced by `build-cublas.sh`, not against whatever CUDA headers happen to be installed on the build host.
+- The live ISO must ship `libcublas`, `libcublasLt`, and `libcudart` together with `libcuda` so tensor-core stress works without internet or package installs at boot.
 - The source overlay in `iso/overlay/` is treated as immutable source. Build-time files are injected only into the staged overlay.
 - The live-build workdir under `dist/` is disposable; source files under `iso/builder/` stay clean.
 - Container build requires `--privileged` because `live-build` uses mounts/chroots/loop devices during ISO assembly.
+- On macOS / Docker Desktop, the builder still must run as `linux/amd64` so the shipped ISO binaries remain `amd64`.
+- Operators must provision enough RAM to hold the full compressed live medium plus normal runtime overhead, because `toram` copies the entire read-only ISO payload into memory before the system reaches steady state.
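
The RAM requirement above can be turned into a rough pre-flight check. This is only a sketch: the helper names are hypothetical, and the size of the runtime reserve is an assumption the operator has to choose, since this document does not specify a number.

```shell
#!/bin/sh
# Hypothetical helper: succeed if total RAM covers the live medium plus
# a runtime reserve. All values are in KiB.
#   $1 = size of the live medium
#   $2 = total RAM (MemTotal)
#   $3 = reserve for normal runtime overhead (assumed, operator-chosen)
ram_fits_medium() {
    [ "$2" -ge $(( $1 + $3 )) ]
}

# Total RAM in KiB as reported by the Linux kernel.
mem_total_kib() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}
```

For example, `ram_fits_medium "$iso_kib" "$(mem_total_kib)" $((4 * 1024 * 1024))` would require the ISO size plus an assumed 4 GiB reserve, where `iso_kib` is a placeholder for the measured size of the image.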

 ## Post-boot smoke test

@@ -131,10 +149,15 @@ Current validation state:
 Every collector returns `nil, nil` on tool-not-found. Errors are logged, never fatal.

 Acceptance flows:
-- `bee sat nvidia` → diagnostic archive with `nvidia-smi -q` + `nvidia-bug-report` + lightweight `bee-gpu-stress`
+- `bee sat nvidia` → diagnostic archive with `nvidia-smi -q` + `nvidia-bug-report` + mixed-precision `bee-gpu-stress`
 - `bee sat memory` → `memtester` archive
 - `bee sat storage` → SMART/NVMe diagnostic archive and short self-test trigger where supported
 - SAT `summary.txt` now includes `overall_status` and per-job `*_status` values (`OK`, `FAILED`, `UNSUPPORTED`)
+- `bee-gpu-stress` should prefer cuBLASLt GEMM load over the old integer/PTX burn path:
+  - Ampere: `fp16` + `fp32`/TF32 tensor-core load
+  - Ada / Hopper: add `fp8`
+  - Blackwell+: add `fp4`
+  - PTX fallback is used only when cuBLASLt or its userspace libraries are missing, or when the GPU does not support a given narrow datatype
 - Runtime overrides:
  - `BEE_GPU_STRESS_SECONDS`
  - `BEE_GPU_STRESS_SIZE_MB`