Runtime Flows — bee
Network isolation — CRITICAL
The live CD runs in an isolated network segment with no internet access. All binaries, kernel modules, and tools must be baked into the ISO at build time. No package installation, no downloads, and no package manager calls are allowed at boot. DHCP is used only for LAN (operator SSH access). Internet is NOT available.
Boot sequence (single ISO)
systemd boot order:
local-fs.target
├── bee-sshsetup.service (enables SSH key auth; password fallback only if marker exists)
│ └── ssh.service (OpenSSH on port 22 — starts without network)
├── bee-network.service (starts `dhclient -nw` on all physical interfaces, non-blocking)
├── bee-nvidia.service (insmod nvidia*.ko from /usr/local/lib/nvidia/,
│ creates /dev/nvidia* nodes)
└── bee-audit.service (runs `bee audit` → /var/log/bee-audit.json,
never blocks boot on partial collector failures)
Critical invariants:
- OpenSSH MUST start without network. `bee-sshsetup.service` runs before `ssh.service`.
- `bee-network.service` uses `dhclient -nw` (background): network bring-up is best effort and non-blocking.
- `bee-nvidia.service` loads modules via `insmod` with absolute paths, NOT `modprobe`. Reason: the modules are shipped in the ISO overlay under `/usr/local/lib/nvidia/`, not in the host module tree.
- `bee-audit.service` does not wait for `network-online.target`; audit is local and must run even if DHCP is broken.
- `bee-audit.service` logs audit failures but does not turn partial collector problems into a boot blocker.
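Because `insmod` does not resolve dependencies the way `modprobe` does, the load order has to be explicit. The Go sketch below illustrates that idea only; it is not the actual `bee-nvidia.service` logic. The module file names and their order (`nvidia.ko` first, `nvidia-drm.ko` after `nvidia-modeset.ko`) are assumptions based on the usual NVIDIA driver layout, and `insmodCommands` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"path"
	"sort"
	"strings"
)

// moduleDir is the overlay location named in the invariants.
const moduleDir = "/usr/local/lib/nvidia"

// loadRank is an ASSUMED dependency order for the NVIDIA modules;
// the real service may ship a different set of .ko files.
var loadRank = map[string]int{
	"nvidia.ko":         0,
	"nvidia-modeset.ko": 1,
	"nvidia-uvm.ko":     2,
	"nvidia-drm.ko":     3,
}

// rank returns the load position of a module; unknown modules load last.
func rank(name string) int {
	if r, ok := loadRank[name]; ok {
		return r
	}
	return len(loadRank)
}

// insmodCommands returns one insmod invocation per module file,
// ordered by rank and using absolute paths under moduleDir.
func insmodCommands(files []string) [][]string {
	sorted := append([]string(nil), files...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return rank(sorted[i]) < rank(sorted[j])
	})
	cmds := make([][]string, 0, len(sorted))
	for _, f := range sorted {
		cmds = append(cmds, []string{"insmod", path.Join(moduleDir, f)})
	}
	return cmds
}

func main() {
	files := []string{"nvidia-drm.ko", "nvidia-uvm.ko", "nvidia.ko", "nvidia-modeset.ko"}
	for _, cmd := range insmodCommands(files) {
		fmt.Println(strings.Join(cmd, " "))
	}
}
```

The sketch only prints the commands; executing them requires root and a kernel matching the modules.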
ISO build sequence
build.sh [--authorized-keys /path/to/keys]
1. compile `bee` binary (skip if .go files older than binary)
2. create a temporary overlay staging dir under `dist/`
3. inject authorized_keys into staged `root/.ssh/` (or set password fallback marker)
4. copy `bee` binary → staged `/usr/local/bin/bee`
5. copy vendor binaries from `iso/vendor/` → staged `/usr/local/bin/`
(`storcli64`, `sas2ircu`, `sas3ircu`, `mstflint` — each optional)
6. `build-nvidia-module.sh`:
a. install Debian kernel headers if missing
b. download NVIDIA `.run` installer (sha256 verified, cached in `dist/`)
c. extract installer
d. build kernel modules against Debian headers
e. create `libnvidia-ml.so.1` / `libcuda.so.1` symlinks in cache
f. cache in `dist/nvidia-<version>-<kver>/`
7. inject NVIDIA `.ko` → staged `/usr/local/lib/nvidia/`
8. inject `nvidia-smi` → staged `/usr/local/bin/nvidia-smi`
9. inject `libnvidia-ml` + `libcuda` → staged `/usr/lib/`
10. write staged `/etc/bee-release` (versions + git commit)
11. patch staged `motd` with build metadata
12. copy `iso/builder/` into a temporary live-build workdir under `dist/`
13. sync staged overlay into workdir `config/includes.chroot/`
14. run `lb config && lb build` inside the temporary workdir
(either on a Debian host/VM or inside the privileged builder container)
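Step 1 skips compilation when no `.go` file is newer than the binary. `build.sh` is a shell script, so the Go sketch below is only an illustration of that staleness rule, not the script's code. `needsRebuild` is a hypothetical helper, and the `dist/bee` path in `main` is a placeholder.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// needsRebuild reports whether the binary is missing or older than
// at least one .go file under srcDir.
func needsRebuild(binary, srcDir string) (bool, error) {
	binInfo, err := os.Stat(binary)
	if errors.Is(err, fs.ErrNotExist) {
		return true, nil // no binary yet: always build
	}
	if err != nil {
		return false, err
	}

	stale := false
	err = filepath.WalkDir(srcDir, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() || !strings.HasSuffix(p, ".go") {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		if info.ModTime().After(binInfo.ModTime()) {
			stale = true // a source file is newer than the binary
		}
		return nil
	})
	return stale, err
}

func main() {
	// "dist/bee" is a placeholder path, not the real output location.
	rebuild, err := needsRebuild("dist/bee", ".")
	if err != nil {
		fmt.Fprintln(os.Stderr, "staleness check failed:", err)
		os.Exit(1)
	}
	fmt.Println("rebuild needed:", rebuild)
}
```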
Critical invariants:
- `DEBIAN_KERNEL_ABI` in `iso/builder/VERSIONS` pins the exact kernel ABI used in BOTH places:
  - `setup-builder.sh` / `build-in-container.sh` / `build-nvidia-module.sh`: Debian kernel headers for module build
  - `auto/config`: `linux-image-${DEBIAN_KERNEL_ABI}` in the ISO
- NVIDIA modules go to staged `usr/local/lib/nvidia/`, NOT to `/lib/modules/<kver>/extra/`.
- The source overlay in `iso/overlay/` is treated as immutable source. Build-time files are injected only into the staged overlay.
- The live-build workdir under `dist/` is disposable; source files under `iso/builder/` stay clean.
- Container build requires `--privileged` because `live-build` uses mounts/chroots/loop devices during ISO assembly.
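The ABI pin works because every consumer reads the same value from one file. A minimal Go sketch of that lookup follows, assuming `iso/builder/VERSIONS` uses shell-style `KEY=VALUE` lines (the real file format is not shown here). `kernelABI` is a hypothetical helper and the version string in `main` is a made-up sample.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// kernelABI scans KEY=VALUE lines (ASSUMED format) and returns the
// value of DEBIAN_KERNEL_ABI with surrounding quotes removed.
func kernelABI(content string) (string, error) {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blanks and comments
		}
		key, val, ok := strings.Cut(line, "=")
		if ok && strings.TrimSpace(key) == "DEBIAN_KERNEL_ABI" {
			return strings.Trim(strings.TrimSpace(val), `"'`), nil
		}
	}
	if err := sc.Err(); err != nil {
		return "", err
	}
	return "", fmt.Errorf("DEBIAN_KERNEL_ABI not set")
}

func main() {
	// Sample content with a made-up ABI value, for illustration only.
	sample := "# pinned versions\nDEBIAN_KERNEL_ABI=\"0.0.0-0-example\"\n"
	abi, err := kernelABI(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Same package name pattern as auto/config: linux-image-${DEBIAN_KERNEL_ABI}
	fmt.Println("linux-image-" + abi)
}
```

Failing loudly when the key is missing is deliberate: building modules against an unpinned kernel would break the invariant silently.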
Post-boot smoke test
After booting a live ISO, run the smoke test over SSH to verify all critical components:
ssh root@<ip> 'sh -s' < iso/builder/smoketest.sh
Exit code 0 means all required checks passed. The output must contain no FAIL lines before shipping.
Key checks: NVIDIA modules loaded, nvidia-smi sees all GPUs, lib symlinks present,
systemd services running, audit completed with NVIDIA enrichment, LAN reachability.
Overlay mechanism
live-build copies files from config/includes.chroot/ into the ISO filesystem.
build.sh prepares a staged overlay, then syncs it into a temporary workdir's
config/includes.chroot/ before running lb build.
Collector flow
`bee audit` start
1. board collector (dmidecode -t 0,1,2)
2. cpu collector (dmidecode -t 4)
3. memory collector (dmidecode -t 17)
4. storage collector (lsblk -J, smartctl -j, nvme id-ctrl, nvme smart-log)
5. pcie collector (lspci -vmm -D, /sys/bus/pci/devices/)
6. psu collector (ipmitool fru — silent if no /dev/ipmi0)
7. nvidia enrichment (nvidia-smi — skipped if binary absent or driver not loaded)
8. output JSON → /var/log/bee-audit.json
9. QR summary to stdout (qrencode if available)
Every collector returns nil, nil on tool-not-found. Errors are logged, never fatal.
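The `nil, nil` convention can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the actual `bee` source: `BoardInfo` and `collectWithTool` are hypothetical names, and the real collectors parse the tool output into a richer schema.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"os/exec"
)

// BoardInfo is a hypothetical result type; the real bee schema is not shown here.
type BoardInfo struct {
	Raw string
}

// collectWithTool runs an external tool and returns its raw output.
// If the tool is not installed it returns (nil, nil), so the caller
// can skip this collector without treating the absence as an error.
func collectWithTool(tool string, args ...string) (*BoardInfo, error) {
	toolPath, err := exec.LookPath(tool)
	if err != nil {
		if errors.Is(err, exec.ErrNotFound) {
			return nil, nil // tool-not-found: nothing to report
		}
		return nil, err
	}
	out, err := exec.Command(toolPath, args...).Output()
	if err != nil {
		return nil, fmt.Errorf("%s: %w", tool, err)
	}
	return &BoardInfo{Raw: string(out)}, nil
}

func main() {
	info, err := collectWithTool("dmidecode", "-t", "0,1,2")
	switch {
	case err != nil:
		log.Printf("board collector: %v", err) // logged, never fatal
	case info == nil:
		log.Print("board collector: dmidecode not found, skipping")
	default:
		fmt.Printf("collected %d bytes\n", len(info.Raw))
	}
}
```

The caller distinguishes three outcomes (error, absent tool, data), which is what lets the audit continue after a partial collector failure.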