docs: add bible-local with architecture and decisions, fix PLAN.md versions

- bible-local/architecture/system-overview.md: scope, tech stack, key paths
- bible-local/architecture/runtime-flows.md: boot sequence, ISO build, collector flow
- bible-local/decisions/2026-03-05-nvidia-proprietary-driver.md
- PLAN.md: update KERNEL_VERSION 6.6→6.12, NVIDIA 550.54.15→590.48.01

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 18:15:07 +03:00
parent 559fc2961d
commit 871c766194
6 changed files with 180 additions and 2 deletions

@@ -0,0 +1,78 @@
# Runtime Flows — bee
## Boot sequence (debug ISO)
OpenRC default runlevel, service start order:
```
localmount
└── bee-sshsetup (creates bee user, sets password fallback)
    └── dropbear (SSH on port 22, starts regardless of network)
        └── bee-network (udhcpc -b on all physical interfaces, non-blocking)
            └── bee-nvidia (depmod -a, modprobe nvidia nvidia-modeset nvidia-uvm)
                └── bee-audit-debug (runs audit binary, logs to /var/log/bee-audit.json)
```
**Critical invariants:**
- Dropbear MUST start without network. The custom init script in the overlay declares only `need localmount`, NOT `need net`.
- bee-network runs `udhcpc -b`, which moves to the background when no lease is obtained immediately and keeps retrying, so DHCP still succeeds when a cable is connected later.
- bee-audit-debug always finishes with `eend 0`, so the service never fails the boot even if the audit itself errors.
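
The first and last invariants can be sketched as an OpenRC service script. This is a minimal illustration, not the actual `bee-audit-debug` script: the `AUDIT_BIN` and `AUDIT_LOG` variables, the `after bee-nvidia` ordering hint, and the stdout redirection are assumptions. `need`, `after`, `ebegin`, `ewarn`, and `eend` are standard OpenRC helpers.

```shell
# Sketch of an OpenRC service in the style of bee-audit-debug.
# A real script would live in /etc/init.d/ and start with "#!/sbin/openrc-run".
# AUDIT_BIN and AUDIT_LOG are illustrative names, not taken from the repo.
: "${AUDIT_BIN:=/usr/local/bin/audit}"
: "${AUDIT_LOG:=/var/log/bee-audit.json}"

depend() {
    # Require local filesystems only; deliberately no "need net".
    need localmount
    # Ordering hint (assumed): run after the NVIDIA modules are loaded.
    after bee-nvidia
}

start() {
    ebegin "Running hardware audit"
    # A failing audit is reported as a warning, not as a service failure.
    "$AUDIT_BIN" > "$AUDIT_LOG" || ewarn "audit exited with an error"
    # Always report success so the runlevel keeps going.
    eend 0
}
```

Because `eend` receives a literal `0` rather than the audit's exit status, OpenRC records the service as started regardless of what the binary returned.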
## ISO build sequence
```
build-debug.sh
  1. compile audit binary (skip if .go files older than binary)
  2. build-nvidia-module.sh:
       a. download NVIDIA .run installer (sha256 verified, cached)
       b. extract installer
       c. build kernel modules against linux-lts-dev headers
       d. extract nvidia-smi + libnvidia-ml from installer
       e. cache in dist/nvidia-<version>-<kver>/
  3. inject authorized_keys into overlay
  4. inject audit binary → overlay/usr/local/bin/audit
  5. inject NVIDIA .ko → overlay/lib/modules/<kver>/extra/nvidia/
  6. inject nvidia-smi → overlay/usr/local/bin/nvidia-smi
  7. copy mkimg profile + genapkovl to ~/.mkimage/ AND /var/tmp/
  8. mkimage.sh (from /var/tmp, TMPDIR=/var/tmp):
       kernel_* section:      cached (linux-lts modloop, lz4 compressed)
       apks_* section:        cached (downloaded packages)
       syslinux_* / grub_*:   cached
       apkovl:                always regenerated (genapkovl-bee_debug.sh)
       final ISO:             always assembled
```
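
The skip condition in step 1 can be expressed with `find -newer`. This is a minimal sketch, assuming the binary path and the Go source directory are passed as arguments. The function name `needs_rebuild` is hypothetical, and `build-debug.sh` may implement the check differently.

```shell
# needs_rebuild BIN SRC_DIR
# Succeeds (exit 0) when BIN is missing or any .go file under SRC_DIR
# is newer than BIN; fails (exit 1) when the existing binary can be reused.
needs_rebuild() {
    bin=$1
    src_dir=$2
    # No binary yet: always build.
    [ -e "$bin" ] || return 0
    # Pick the first source file modified after the binary, if any.
    newer=$(find "$src_dir" -name '*.go' -newer "$bin" | head -n 1)
    [ -n "$newer" ]
}
```

A caller would wrap the compile step as `if needs_rebuild "$bin" "$src"; then ...; fi`, so an unchanged source tree costs one `find` instead of a full compile.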
**Critical invariants:**
- `genapkovl-bee_debug.sh` must be in `/var/tmp/` (CWD when mkimage runs), not only `~/.mkimage/`.
- `TMPDIR=/var/tmp` is required: the tmpfs `/tmp` is only ~1 GB, too small for kernel firmware.
- Workdir cleanup preserves `apks_*`, `kernel_*`, `syslinux_*`, `grub_*` — only clears apkovl and final image.
- `run-builder.sh` runs the build in a `screen` session so it survives SSH disconnects during long NVIDIA downloads.
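
The workdir cleanup invariant can be sketched as a keep-list over the mkimage workdir. This is a minimal illustration, assuming the cached sections are top-level entries whose names start with the listed prefixes and that everything else is safe to delete. The function name `clean_workdir` is hypothetical.

```shell
# clean_workdir WORKDIR
# Removes every top-level entry except the cached mkimage sections.
clean_workdir() {
    workdir=$1
    for entry in "$workdir"/*; do
        # An empty directory leaves the glob unexpanded; skip it.
        [ -e "$entry" ] || continue
        case "${entry##*/}" in
            apks_*|kernel_*|syslinux_*|grub_*)
                ;;                     # cached section, keep
            *)
                rm -rf -- "$entry"     # apkovl and final image (assumed: anything else)
                ;;
        esac
    done
}
```

A keep-list errs on the side of rebuilding: an unexpected entry is deleted and regenerated rather than reused stale.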
## apkovl mechanism
The apkovl is a `.tar.gz` injected into the ISO at `/boot/`. Alpine's initramfs extracts it at boot, overlaying `/etc`, `/usr`, `/root` on the tmpfs root.
`genapkovl-bee_debug.sh` generates the tarball containing:
- `/etc/apk/world` — package list (apk installs these on first boot)
- `/etc/runlevels/*/` — OpenRC service symlinks
- `/etc/conf.d/dropbear` — DROPBEAR_OPTS="-R -B"
- `/etc/network/interfaces` — lo only (bee-network handles DHCP)
- `/etc/hostname`
- Everything from `iso/overlay-debug/` (init scripts, binaries, ssh keys)
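
The tarball layout listed above can be sketched as a staging directory that is filled and then archived. This is an illustration under stated assumptions, not the contents of `genapkovl-bee_debug.sh`: the function name `make_apkovl`, its arguments, the two-package world file, and the single `dropbear` runlevel symlink are placeholders.

```shell
# make_apkovl HOSTNAME OVERLAY_DIR OUT_DIR
# Writes OUT_DIR/HOSTNAME.apkovl.tar.gz from generated files plus OVERLAY_DIR.
make_apkovl() {
    host=$1
    overlay_dir=$2
    out_dir=$3
    stage=$(mktemp -d) || return 1
    # mktemp creates a 0700 directory; relax it because "." is archived too.
    chmod 755 "$stage"

    mkdir -p "$stage/etc/apk" "$stage/etc/conf.d" \
             "$stage/etc/network" "$stage/etc/runlevels/default"

    printf '%s\n' "$host" > "$stage/etc/hostname"
    # Placeholder package list; the real world file is not shown in this doc.
    printf '%s\n' alpine-base dropbear > "$stage/etc/apk/world"
    printf 'DROPBEAR_OPTS="-R -B"\n' > "$stage/etc/conf.d/dropbear"
    # Loopback only; DHCP is left to the bee-network service.
    printf 'auto lo\niface lo inet loopback\n' > "$stage/etc/network/interfaces"
    # A service is enabled by symlinking its init script into a runlevel.
    ln -s /etc/init.d/dropbear "$stage/etc/runlevels/default/dropbear"

    # Merge the static overlay tree (init scripts, binaries, ssh keys).
    cp -R "$overlay_dir"/. "$stage"/

    tar -c -f - -C "$stage" . | gzip -9n > "$out_dir/$host.apkovl.tar.gz"
    rm -rf "$stage"
}
```

Generating the small config files in the script and copying the overlay tree last means a file in the overlay directory wins over a generated default with the same path.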
## Collector flow
```
audit binary start
  1. board collector (dmidecode -t 0,1,2)
  2. cpu collector (dmidecode -t 4)
  3. memory collector (dmidecode -t 17)
  4. storage collector (lsblk -J, smartctl -j, nvme id-ctrl, nvme smart-log)
  5. pcie collector (lspci -vmm -D, /sys/bus/pci/devices/)
  6. psu collector (ipmitool fru, silent if no /dev/ipmi0)
  7. nvidia enrichment (nvidia-smi, skipped if driver not loaded)
  8. output JSON to stdout / file / usb
  9. QR summary to stdout (qrencode if available)
```
Every collector returns `nil, nil` when its underlying tool is not found. Errors are logged but never fatal.