Commit Graph

98 Commits

Author SHA1 Message Date
Mikhail Chusavitin
f84ec9320c Fix NVIDIA module version selection and add load diagnostics 2026-03-06 17:30:41 +03:00
Mikhail Chusavitin
a55b4108d5 Add wget/curl fallback for vendor and update downloads 2026-03-06 14:45:50 +03:00
Mikhail Chusavitin
18b8c69bc5 Implement audit enrichments, TUI workflows, and production ISO scaffold 2026-03-06 11:56:26 +03:00
bdfb6a0a79 fix: reset VM working tree before pull to clear stale build artifacts
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 23:11:18 +03:00
867565cbf8 fix: inject motd build info in genapkovl tmp, not overlay on disk
sed -i on overlay/etc/motd caused git pull conflict on next build.
Now BEE_BUILD_INFO is exported and substituted in $tmp copy only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 23:11:07 +03:00
b72688cf2c fix: chmod +x overlay scripts on builder VM after git pull
macOS does not reliably apply git file mode changes on disk.
Run chmod explicitly on the VM where it matters.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 23:03:35 +03:00
e8e09e9063 fix: chmod +x in genapkovl to fix permissions regardless of git filemode on VM
- genapkovl now explicitly chmod +x init.d/* and usr/local/bin/* after cp
- add bee-net-restart command (short name, no .sh) and /etc/profile.d/bee.sh for PATH
- udhcpc: add & to ensure non-blocking even when DHCP responds immediately
- motd: short commands without paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 22:59:28 +03:00
63c608711d fix: use agetty --autologin instead of busybox getty -a (unsupported flag)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 22:48:13 +03:00
eecd0799a0 fix: check local/remote sync before building to prevent building stale code
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 22:44:22 +03:00
fd071e28db fix: include build-debug.sh and motd changes missed from previous commit
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 22:43:23 +03:00
c908809991 fix: init scripts not executable, add autologin and build version in motd
- bee-* init.d scripts had mode 644 in git — OpenRC silently skipped them,
  causing bee-network/bee-nvidia/bee-audit to never start at boot
- bee-network.sh also lacked executable bit
- Remove -q from udhcpc (was quitting after first lease, no renewal)
- Add autologin root on tty1 via /etc/inittab
- Inject build date + git commit + versions into motd at build time

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 22:33:45 +03:00
2235a89364 fix: add modloop= to cmdline, revert lz4 compression
modloop was not mounting because:
1. modloop=/boot/modloop-lts was missing from kernel cmdline
2. lz4-compressed squashfs may not be supported by Alpine initramfs

Both issues result in /lib/modules not existing and all modprobe failing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 18:23:26 +03:00
871c766194 docs: add bible-local with architecture and decisions, fix PLAN.md versions
- bible-local/architecture/system-overview.md: scope, tech stack, key paths
- bible-local/architecture/runtime-flows.md: boot sequence, ISO build, collector flow
- bible-local/decisions/2026-03-05-nvidia-proprietary-driver.md
- PLAN.md: update KERNEL_VERSION 6.6→6.12, NVIDIA 550.54.15→590.48.01

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 18:15:07 +03:00
559fc2961d fix: update NVIDIA to 590.48.01, add sha256 verification for installer
- 550.54.15 did not exist on NVIDIA CDN (404)
- updated to 590.48.01 (latest stable, 396MB)
- download sha256sum file first, verify installer before extracting
- re-download if file is missing, empty, or sha256 mismatch

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 18:10:31 +03:00
e5c1ef2c33 fix: run build in screen session to survive SSH disconnects
Long builds (NVIDIA driver download+compile) would abort on SSH timeout.
Now build runs in a detached screen session on the VM, run-builder.sh
streams the log and waits for completion safely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 18:07:17 +03:00
d4a2d7fa55 fix: use proprietary NVIDIA .run installer instead of open kernel modules
Builds kernel modules from the official NVIDIA installer source tree,
same as a standard NVIDIA driver install. No open-gpu-kernel-modules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 18:05:57 +03:00
ec9c65e20e feat: build NVIDIA open kernel modules during ISO build
- build-nvidia-module.sh: downloads nvidia open-gpu-kernel-modules source,
  builds against linux-lts headers, extracts nvidia-smi from .run installer
- modules cached by driver version + kernel version (rebuild only on update)
- .ko files injected into ISO overlay at /lib/modules/<kver>/extra/nvidia/
- bee-nvidia init script loads nvidia/nvidia-modeset/nvidia-uvm at boot
- NVIDIA_DRIVER_VERSION=550.54.15 (Turing+, H100/A100 supported)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 18:01:11 +03:00
5475a0aa77 fix: fall back to scp if rsync not available on builder VM
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 17:44:42 +03:00
fdbf533e6c fix: replace linux-firmware-nfp with linux-firmware-netronome (correct package name)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 17:41:19 +03:00
47d717955c fix: add NIC firmware packages 2026-03-05 17:40:09 +03:00
bd9279f96d perf: use lz4 compression for modloop squashfs
xz → lz4 for mksquashfs: kernel modloop rebuild is ~10x faster.
Size increase is acceptable since modloop is loaded into RAM.
Applied in both setup-builder.sh and build-debug.sh.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 16:23:55 +03:00
34faddb9d5 perf: cache syslinux and grub sections between builds
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 16:22:23 +03:00
836c098044 perf: also cache kernel modloop between builds
kernel_* workdir sections were being deleted alongside other non-apks dirs.
Now both apks_* and kernel_* are preserved — kernel modloop squashfs won't
be rebuilt unless the kernel version changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 16:21:43 +03:00
413f188278 perf: skip go rebuild if sources unchanged, use rsync for ISO download
- audit binary is only rebuilt when .go files are newer than the binary
- rsync replaces scp for ISO download (delta transfer on repeat builds)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 16:21:14 +03:00
bb4ceab452 perf: cache apk packages between ISO builds
Keep apks_* workdir sections so packages aren't re-downloaded on each build.
Only non-apks sections (kernel, apkovl, final image) are cleaned to pick up changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 16:20:07 +03:00
ec1a96976b chore: ignore .DS_Store, remove from tracking, fix genapkovl path in build, udhcpc background mode
- add .DS_Store to .gitignore and remove tracked files
- copy genapkovl-bee_debug.sh to /var/tmp before mkimage (was causing "no such file" error)
- switch udhcpc to background mode (-b -t 0) so network comes up when cable connected after boot
- add -B to DROPBEAR_OPTS to allow password fallback (bee/eeb)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 16:18:26 +03:00
279ef318e1 fix: genapkovl copy to /var/tmp, udhcpc background mode 2026-03-05 16:17:52 +03:00
40815161fe fix: clean workdir before build so apkovl changes are always applied 2026-03-05 15:05:42 +03:00
8c0e66c3ef fix: copy genapkovl-bee_debug.sh to ~/.mkimage in build-debug.sh 2026-03-05 15:01:20 +03:00
8502100074 fix: dropbear/network boot ordering — dropbear starts without network
- dropbear: custom init removes 'need net', only needs localmount + bee-sshsetup
- bee-network: removed 'before dropbear' dependency
- bee-network.sh: removed set -e so single iface failure does not abort script
2026-03-05 14:59:23 +03:00
ab22e3ad74 add: NVMe wear telemetry via nvme smart-log (1.8b) 2026-03-05 14:55:53 +03:00
e79f972fb5 add: PSU collector (1.7) via ipmitool fru, skips gracefully without IPMI 2026-03-05 14:54:12 +03:00
55f6098a17 add: memory, storage, pcie collectors (1.4-1.6) — tested on real hardware 2026-03-05 14:50:34 +03:00
569bbf8909 fix: add interfaces file so networking starts, enable dropbear default 2026-03-05 14:47:21 +03:00
aa051266bb fix: replace build_bee_debug with proper apkovl mechanism for Alpine LiveCD
- genapkovl-bee_debug.sh: creates apkovl tarball with overlay files,
  /etc/apk/world package list, runlevel symlinks, dropbear config
- mkimg.bee_debug.sh: set hostname/apkovl, remove invalid build_bee_debug
2026-03-05 14:21:45 +03:00
5ecbf185ea fix: add initfs_cmdline/features and grub_mod for USB boot 2026-03-05 13:29:49 +03:00
06da04236b add: download ISO to iso/out/ after build 2026-03-05 12:17:23 +03:00
e2a3775342 fix: remove udhcpc (busybox), rename qrencode to libqrencode-tools 2026-03-05 11:59:00 +03:00
569fd72c62 fix: set TMPDIR=/var/tmp to avoid tmpfs overflow during mkinitfs 2026-03-05 11:49:21 +03:00
554e1eee21 fix: use /var/tmp for mkimage workdir (tmpfs too small) 2026-03-05 11:44:09 +03:00
9508743dcd fix: add arch=x86_64 to profile, improve abuild key generation in setup 2026-03-05 11:37:46 +03:00
21c4a42333 fix: install profile to ~/.mkimage, pass overlay via BEE_OVERLAY_DIR env 2026-03-05 11:32:26 +03:00
0c16d9fb76 fix: cd /tmp before mkimage.sh to avoid git repo context conflict 2026-03-05 11:31:20 +03:00
45f3182470 add run-builder.sh, .env, .gitignore; fix community repo in setup-builder.sh
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 11:27:26 +03:00
65d92d59c2 feat(iso): 2.1-2.3 — debug ISO builder with SSH access
Builder setup:
- iso/builder/VERSIONS: pinned Alpine 3.21, Go 1.23.6, NVIDIA 550.54.15
- iso/builder/setup-builder.sh: installs build deps + Go on Alpine VM, verifies packages
- iso/builder/build-debug.sh: compiles audit binary, injects SSH keys, builds ISO
- iso/builder/mkimg.bee_debug.sh: Alpine mkimage profile (all audit packages + dropbear)

SSH access (same Ed25519 key as release signing):
- auto-collects ~/.keys/*.key.pub into authorized_keys at build time
- fallback: user bee / password eeb when no keys available
- bee-sshsetup init.d service: creates bee user, sets password, logs status

Debug overlay:
- bee-network: DHCP on all physical interfaces before SSH/audit
- bee-audit-debug: runs audit on boot, leaves SSH up after
- bee-sshsetup: key/password SSH setup
- motd: shows log paths, re-run command, SSH access info

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 10:43:53 +03:00
00bb2fdace feat(audit): 1.3 — CPU collector (dmidecode type 4, microcode)
- cpu.go: collectCPUs(), parseCPUs(), parseCPUSection()
- splitDMISections(): splits multi-section dmidecode output generically
- parseFieldLines(): reusable key→value parser for DMI sections
- parseCPUStatus(): Populated/Unpopulated → OK/WARNING/EMPTY/UNKNOWN
- parseSocketIndex(): CPU0/Processor 1/Socket 2 → integer
- cleanManufacturer(): strips (R), Corporation, Inc. suffixes
- parseMHz(), parseInt(): field value parsers
- Serial fallback: <board_serial>-CPU-<socket> when DMI serial absent
- readMicrocode(): /sys/devices/system/cpu/cpu0/microcode/version
- cpu_test.go: dual-socket, unpopulated skipped, status, socket, manufacturer, MHz

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 10:37:19 +03:00
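
Two of the normalizations listed in 00bb2fdace (socket designation to integer, and stripping "(R)", "Corporation", "Inc." from manufacturer strings) could look roughly like this. It is a sketch, not the code in cpu.go: the names `socketIndex` and `tidyManufacturer` are hypothetical stand-ins, and the exact matching rules are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// trailingDigits matches a number at the end of a socket designation.
var trailingDigits = regexp.MustCompile(`(\d+)\s*$`)

// socketIndex (hypothetical) extracts the trailing number from designations
// such as "CPU0", "Processor 1" or "Socket 2". ok is false if none is found.
func socketIndex(designation string) (index int, ok bool) {
	m := trailingDigits.FindStringSubmatch(designation)
	if m == nil {
		return 0, false
	}
	n, err := strconv.Atoi(m[1])
	if err != nil {
		return 0, false
	}
	return n, true
}

// tidyManufacturer (hypothetical) drops "(R)" marks and the trailing
// "Corporation" / "Inc." suffixes, then trims leftover commas and spaces.
func tidyManufacturer(s string) string {
	s = strings.ReplaceAll(s, "(R)", "")
	for _, suffix := range []string{"Corporation", "Inc."} {
		s = strings.TrimSuffix(strings.TrimSpace(s), suffix)
	}
	return strings.TrimRight(strings.TrimSpace(s), ", ")
}

func main() {
	for _, d := range []string{"CPU0", "Processor 1", "Socket 2"} {
		n, ok := socketIndex(d)
		fmt.Println(d, n, ok)
	}
	fmt.Println(tidyManufacturer("Intel(R) Corporation"))
}
```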
f1e392a7fe feat(audit): 1.2 — board collector (dmidecode types 0, 1, 2)
- board.go: collectBoard(), parseBoard(), parseBIOSFirmware(), parseDMIFields(), cleanDMIValue()
- Reads System Information (type 1): serial, manufacturer, product_name, uuid
- Reads Base Board Information (type 2): part_number
- Reads BIOS Information (type 0): firmware version record
- cleanDMIValue strips vendor placeholders (O.E.M., Not Specified, Unknown, etc.)
- board_test.go: 6 table/case tests with dmidecode fixtures in testdata/
- collector.go: wired board + BIOS firmware into snapshot

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 10:35:14 +03:00
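
The placeholder stripping attributed to cleanDMIValue in f1e392a7fe, combined with a simple "Key: Value" field parser, might be sketched as below. The function names `stripPlaceholder` and `parseFields`, the case-insensitive matching, and the sample input are assumptions for illustration; the placeholder list contains only the three values named in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholders holds vendor filler values (lowercased) that carry no information.
var placeholders = map[string]bool{
	"o.e.m.":        true,
	"not specified": true,
	"unknown":       true,
}

// stripPlaceholder (hypothetical) returns "" for filler values and the
// trimmed value otherwise.
func stripPlaceholder(v string) string {
	v = strings.TrimSpace(v)
	if placeholders[strings.ToLower(v)] {
		return ""
	}
	return v
}

// parseFields (hypothetical) turns the "Key: Value" lines of one dmidecode
// section into a map, splitting each line at its first colon.
func parseFields(section string) map[string]string {
	fields := make(map[string]string)
	for _, line := range strings.Split(section, "\n") {
		key, value, ok := strings.Cut(line, ":")
		if !ok {
			continue // not a field line
		}
		fields[strings.TrimSpace(key)] = stripPlaceholder(value)
	}
	return fields
}

func main() {
	// Made-up sample text in dmidecode's indented "Key: Value" layout.
	sample := "\tManufacturer: ExampleVendor\n\tSerial Number: Not Specified\n"
	f := parseFields(sample)
	fmt.Printf("%q %q\n", f["Manufacturer"], f["Serial Number"])
}
```

Returning an empty string for filler values lets callers treat "absent" and "placeholder" the same way, for example when falling back to a derived serial.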
a4f70b17f0 feat(audit): 1.1 — project scaffold, schema types, collector stub, updater trust
- go.mod: module bee/audit
- schema/hardware.go: HardwareIngestRequest types (compatible with core)
- collector/collector.go: Run() stub, logs start/finish, returns empty snapshot
- updater/trust.go: Ed25519 multi-key verification via ldflags injection
- updater/trust_test.go: valid sig, tampered, multi-key any-match, dev build
- cmd/audit/main.go: --output stdout|file:<path>|usb, --version flag
- Version = "dev" by default, injected via ldflags at release

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 10:32:12 +03:00