# BEE — Build Plan

Hardware audit LiveCD for offline server inventory.
Produces `HardwareIngestRequest` JSON compatible with core/reanimator.

**Principle:** OS-level collection — reads hardware directly, not through BMC.
Fully unattended — no user interaction required at any stage. Boot → update → audit → output → done.
All errors are logged, never presented interactively. Every failure path has a silent fallback.
Fills the gaps where logpile/Redfish is blind: NVMe, DIMM serials, GPU serials, physical disks behind RAID, full SMART, NIC firmware.

---

## Phase 1 — Go Audit Binary

Self-contained static binary. Runs on any Linux (including Alpine LiveCD).
Calls system utilities, parses their output, produces `HardwareIngestRequest` JSON.

### 1.1 — Project scaffold

- `audit/go.mod` — module `bee/audit`
- `audit/cmd/audit/main.go` — CLI entry point: flags, orchestration, JSON output
- `audit/internal/schema/` — copy of `HardwareIngestRequest` types from core (no import dependency)
- `audit/internal/collector/` — empty package stubs for all collectors
- `const Version = "1.0"` in main
- Output modes: stdout (default), file path flag `--output /path/to/file.json`
- Tests: none yet (stubs only)

### 1.2 — Board collector

Source: `dmidecode -t 0` (BIOS), `-t 1` (System), `-t 2` (Baseboard)

Collects:
- `board.serial_number` — from System Information
- `board.manufacturer`, `board.product_name` — from System Information
- `board.part_number` — from Baseboard
- `board.uuid` — from System Information
- `firmware[BIOS]` — vendor, version, release date from BIOS Information

Tests: table tests with `testdata/dmidecode_*.txt` fixtures
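
A minimal sketch of how the dmidecode text could be turned into lookups for the fields listed above. The function name `parseDMI` and the sample fixture are illustrative assumptions, not part of the plan; the sketch relies only on the usual dmidecode layout (a `Handle` line, a title line, then tab-indented `Key: Value` lines).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseDMI splits dmidecode text output into sections keyed by title
// (for example "System Information") and collects their "Key: Value" lines.
// Sketch only: when a title repeats, the first section wins.
func parseDMI(out string) map[string]map[string]string {
	sections := map[string]map[string]string{}
	var cur map[string]string
	expectTitle := false
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.HasPrefix(line, "Handle "):
			// A "Handle ..." line opens a new structure; its title is on the next line.
			expectTitle, cur = true, nil
		case expectTitle:
			expectTitle = false
			title := strings.TrimSpace(line)
			if _, seen := sections[title]; !seen {
				cur = map[string]string{}
				sections[title] = cur
			}
		case cur != nil && strings.HasPrefix(line, "\t"):
			// Lines without a colon (nested list entries) are ignored.
			if k, v, ok := strings.Cut(strings.TrimSpace(line), ":"); ok {
				cur[k] = strings.TrimSpace(v)
			}
		}
	}
	return sections
}

func main() {
	// Invented fixture in the shape of `dmidecode -t 1` output.
	sample := "Handle 0x0001, DMI type 1, 27 bytes\n" +
		"System Information\n" +
		"\tManufacturer: ExampleCorp\n" +
		"\tProduct Name: EX-1000\n" +
		"\tSerial Number: SN12345\n"
	sys := parseDMI(sample)["System Information"]
	fmt.Println(sys["Manufacturer"], sys["Product Name"], sys["Serial Number"])
}
```

Keeping the parser a pure function over a string is what makes the fixture-based table tests straightforward: each `testdata/dmidecode_*.txt` file can be fed in directly.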

### 1.3 — CPU collector

Source: `dmidecode -t 4`

Collects:
- socket index, model, manufacturer, status
- cores, threads, current/max frequency
- firmware: microcode version from `/sys/devices/system/cpu/cpu0/microcode/version`
- serial: not available on Intel Xeon → fallback `<board_serial>-CPU-<socket>` (matches core logic)

Tests: table tests with dmidecode fixtures

### 1.4 — Memory collector

Source: `dmidecode -t 17`

Collects:
- slot, location, present flag
- size_mb, type (DDR4/DDR5), max_speed_mhz, current_speed_mhz
- manufacturer, serial_number, part_number
- status from "Data Width" / "No Module Installed" detection

Tests: table tests with dmidecode fixtures (populated + empty slots)
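
As an illustration of the populated versus empty slot handling, the sketch below interprets the `Size` field of a type 17 record. The helper name `parseDIMMSize` is hypothetical; it assumes the field is either `No Module Installed` or a number followed by `MB`, `GB` or `TB`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseDIMMSize interprets the "Size" field of a dmidecode type 17 record.
// It returns the size in MB and whether a module is installed.
func parseDIMMSize(field string) (sizeMB int, present bool, err error) {
	field = strings.TrimSpace(field)
	if field == "No Module Installed" {
		return 0, false, nil
	}
	parts := strings.Fields(field)
	if len(parts) != 2 {
		return 0, false, fmt.Errorf("unexpected Size value %q", field)
	}
	n, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, false, fmt.Errorf("unexpected Size value %q: %w", field, err)
	}
	switch parts[1] {
	case "MB":
		return n, true, nil
	case "GB":
		return n * 1024, true, nil
	case "TB":
		return n * 1024 * 1024, true, nil
	}
	return 0, false, fmt.Errorf("unknown Size unit in %q", field)
}

func main() {
	for _, f := range []string{"32 GB", "16384 MB", "No Module Installed"} {
		mb, present, err := parseDIMMSize(f)
		fmt.Println(f, "→", mb, present, err)
	}
}
```

An empty slot is reported as `present == false` with no error, so the collector can still emit a record for the slot.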

### 1.5 — Storage collector

Sources:
- `lsblk -J -o NAME,TYPE,SIZE,SERIAL,MODEL,TRAN,HCTL,MOUNTPOINT` — device enumeration (HCTL is needed for the slot field below)
- `smartctl -j -i /dev/X` — serial, model, firmware, interface per device
- `nvme id-ctrl /dev/nvmeX -o json` — NVMe: serial (sn), firmware (fr), model (mn), size

Collects per device:
- type: SSD/HDD/NVMe
- model, serial_number, manufacturer, firmware
- size_gb, interface (SATA/SAS/NVMe)
- slot: from lsblk HCTL where available

Tests: table tests with `smartctl -j` JSON fixtures and `nvme id-ctrl` JSON fixtures
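
The enumeration step could look roughly like the sketch below, which decodes `lsblk -J` output and keeps only whole disks. The type and function names are illustrative assumptions. The size column is deliberately left out of the struct because its JSON type depends on how lsblk is invoked.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// blockDevice mirrors a subset of the lsblk -J fields from the Sources list.
// A JSON null (for example a missing serial) decodes to an empty string.
type blockDevice struct {
	Name   string `json:"name"`
	Type   string `json:"type"`
	Serial string `json:"serial"`
	Model  string `json:"model"`
	Tran   string `json:"tran"`
}

type lsblkOutput struct {
	BlockDevices []blockDevice `json:"blockdevices"`
}

// parseDisks returns only entries of type "disk", dropping loop devices,
// optical drives and other non-disk entries.
func parseDisks(raw []byte) ([]blockDevice, error) {
	var out lsblkOutput
	if err := json.Unmarshal(raw, &out); err != nil {
		return nil, fmt.Errorf("parse lsblk JSON: %w", err)
	}
	var disks []blockDevice
	for _, d := range out.BlockDevices {
		if d.Type == "disk" {
			disks = append(disks, d)
		}
	}
	return disks, nil
}

func main() {
	// Invented fixture in the shape of `lsblk -J` output.
	sample := []byte(`{"blockdevices":[
		{"name":"loop0","type":"loop","serial":null,"model":null,"tran":null},
		{"name":"sda","type":"disk","serial":"S1EXAMPLE","model":"EXAMPLE SSD","tran":"sata"}]}`)
	disks, err := parseDisks(sample)
	fmt.Println(disks, err)
}
```

Each returned device name would then be passed to `smartctl` or `nvme id-ctrl` for the per-device details.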

### 1.6 — PCIe collector

Sources:
- `lspci -vmm -D` — slot, vendor, device, class
- `lspci -vvv -D` — link width/speed (LnkSta, LnkCap)
- embedded pci.ids (same submodule as logpile: `third_party/pciids`) — model name lookup
- `/sys/bus/pci/devices/<bdf>/` — actual negotiated link state from kernel

Collects per device:
- bdf, vendor_id, device_id, device_class
- manufacturer, model (via pciids lookup if empty)
- link_width, link_speed, max_link_width, max_link_speed
- serial_number: device-specific (see per-type enrichment in 1.8, 1.9)

Dedup: by serial → bdf (mirrors logpile canonical device repository logic)

Tests: table tests with lspci fixtures
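
A possible shape for the `lspci -vmm -D` parser is sketched below. It relies on the machine-readable format (records separated by blank lines, one `Tag:<TAB>value` pair per line); the function name `parseLspciVMM` and the fixture are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLspciVMM parses `lspci -vmm -D` output: records are separated by blank
// lines and each line is "Tag:<TAB>value". The result holds one map per device.
func parseLspciVMM(out string) []map[string]string {
	var devices []map[string]string
	for _, block := range strings.Split(strings.TrimSpace(out), "\n\n") {
		dev := map[string]string{}
		for _, line := range strings.Split(block, "\n") {
			// Cut at the first colon only: the Slot value itself contains colons.
			if tag, val, ok := strings.Cut(line, ":"); ok {
				dev[strings.TrimSpace(tag)] = strings.TrimSpace(val)
			}
		}
		if dev["Slot"] != "" {
			devices = append(devices, dev)
		}
	}
	return devices
}

func main() {
	// Invented fixture with two devices.
	sample := "Slot:\t0000:3b:00.0\nClass:\tEthernet controller\nVendor:\tExample Vendor\nDevice:\tExample NIC\n\n" +
		"Slot:\t0000:5e:00.0\nClass:\t3D controller\nVendor:\tExample Vendor\nDevice:\tExample GPU\n"
	for _, d := range parseLspciVMM(sample) {
		fmt.Println(d["Slot"], d["Class"], d["Device"])
	}
}
```

The `Slot` value doubles as the BDF key that the later enrichment steps match against.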

### 1.7 — PSU collector

Source: `ipmitool fru` — primary (only source for PSU data from OS)
Fallback: `dmidecode -t 39` (System Power Supply, limited availability)

Collects:
- slot, present, model, vendor, serial_number, part_number
- wattage_w from FRU or dmidecode
- firmware from ipmitool FRU `Board Extra` fields
- input_power_w, output_power_w, input_voltage from `ipmitool sdr`

Tests: table tests with ipmitool fru text fixtures

### 1.8 — NVIDIA GPU enrichment

Prerequisite: NVIDIA driver loaded (checked via `nvidia-smi -L` exit code)

Sources:
- `nvidia-smi --query-gpu=index,name,serial,vbios_version,temperature.gpu,power.draw,ecc.errors.uncorrected.aggregate.total --format=csv,noheader,nounits`
- BDF correlation: `nvidia-smi --query-gpu=index,pci.bus_id --format=csv,noheader` → match to PCIe collector records

Enriches PCIe records for NVIDIA devices:
- `serial_number` — real GPU serial from nvidia-smi
- `firmware` — VBIOS version
- `status` — derived from ECC uncorrected errors (0 = OK, >0 = WARNING)
- telemetry: temperature_c, power_w added to the PCIe record `telemetry` map

Fallback (no driver): PCIe record stays as-is with serial fallback `<board_serial>-PCIE-<slot>`

Tests: table tests with nvidia-smi CSV fixtures
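
Two details of the enrichment are easy to get wrong, so here is a hedged sketch of both. `nvidia-smi` prints `pci.bus_id` with an 8-digit domain in upper case, while `lspci -D` uses a 4-digit lower-case domain, so the IDs need normalising before they can be matched. The status mapping follows the rule above; treating a non-numeric counter as `UNKNOWN` is an assumption of this sketch, and both function names are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// normalizeBDF converts an nvidia-smi pci.bus_id such as "00000000:3B:00.0"
// to the lower-case 4-digit-domain form printed by `lspci -D` ("0000:3b:00.0").
func normalizeBDF(busID string) string {
	id := strings.ToLower(strings.TrimSpace(busID))
	domain, rest, ok := strings.Cut(id, ":")
	if !ok {
		return id
	}
	if len(domain) > 4 {
		domain = domain[len(domain)-4:]
	}
	return domain + ":" + rest
}

// statusFromECC maps the uncorrected ECC counter to a status string.
// Non-numeric values (for example "[N/A]" on GPUs without ECC) yield
// "UNKNOWN"; that choice is an assumption, the plan does not specify it.
func statusFromECC(field string) string {
	n, err := strconv.Atoi(strings.TrimSpace(field))
	switch {
	case err != nil:
		return "UNKNOWN"
	case n > 0:
		return "WARNING"
	default:
		return "OK"
	}
}

func main() {
	// Invented CSV line in the column order of the first query above.
	line := "0, Example GPU, 1320000000000, 92.00.36.00.01, 34, 55.12, 0"
	f := strings.Split(line, ", ")
	fmt.Println("serial:", f[2], "vbios:", f[3], "status:", statusFromECC(f[6]))
	fmt.Println(normalizeBDF("00000000:3B:00.0"))
}
```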

### 1.8b — Component wear / age telemetry

Every component that stores its own usage history must have that data collected and placed in the `attributes` / `telemetry` map of the respective record. This is a cross-cutting concern applied on top of the per-collector steps.

**Storage (SATA/SAS) — smartctl:**
- `Power_On_Hours` (attr 9) — total hours powered on
- `Power_Cycle_Count` (attr 12)
- `Reallocated_Sector_Ct` (attr 5) — wear indicator
- `Wear_Leveling_Count` (attr 177, SSD) — NAND wear
- `Total_LBAs_Written` (attr 241) — LBAs written over lifetime (multiply by the logical sector size to get bytes)
- `SSD_Life_Left` (attr 231) — % remaining if reported
- Collected as `telemetry` map keys: `power_on_hours`, `power_cycles`, `reallocated_sectors`, `wear_leveling_pct`, `total_lba_written`, `life_remaining_pct`

**NVMe — nvme smart-log:**
- `power_on_hours` — lifetime hours
- `power_cycles`
- `unsafe_shutdowns` — abnormal power loss count
- `percentage_used` — % of rated lifetime consumed (normally 0–100; may exceed 100, capped at 255)
- `data_units_written` — lifetime writes, in units of 1000 × 512 bytes
- `controller_busy_time` — minutes the controller was busy with I/O commands
- Collected via `nvme smart-log /dev/nvmeX -o json`

**NVIDIA GPU — nvidia-smi:**
- `ecc.errors.uncorrected.aggregate.total` — lifetime uncorrected ECC errors
- `ecc.errors.corrected.aggregate.total` — lifetime corrected ECC errors
- `clocks_throttle_reasons.hw_slowdown` — thermal/power throttle state
- Stored in PCIe device `telemetry`

**NIC SFP/QSFP transceivers — ethtool:**
- `ethtool -m <iface>` — DOM (Digital Optical Monitoring) if supported
- Extracts: TX power, RX power, temperature, voltage, bias current
- Also: `ethtool -i <iface>` → firmware version
- `ip -s link show <iface>` → tx_packets, rx_packets, tx_errors, rx_errors (uptime proxy)
- Stored in PCIe device `telemetry`: `sfp_temperature_c`, `sfp_tx_power_dbm`, `sfp_rx_power_dbm`

**PSU — ipmitool sdr:**
- Input/output power readings are point-in-time only; the BMC does not store them over time
- `ipmitool fru` may include manufacture date for age estimation
- Stored: `input_power_w`, `output_power_w`, `input_voltage` (already in PSU schema)

**All wear telemetry placement rules:**
- Numeric wear indicators go into `telemetry` map (machine-readable, importable by core)
- Boolean flags (throttle_active, ecc_errors_present) go into `attributes` map
- Never flatten into named top-level fields not in the schema — use maps

Tests: table tests for each SMART parser (JSON fixtures from smartctl/nvme smart-log)
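
The placement rules can be enforced in one place rather than in every collector. The sketch below shows one way to do that; the `record` type is a hypothetical stand-in for whichever schema types carry the two maps, and the map value types are assumptions.

```go
package main

import "fmt"

// record is a hypothetical stand-in for a schema type that carries both maps.
type record struct {
	Telemetry  map[string]float64
	Attributes map[string]bool
}

// put routes a value by its Go type: numbers go to Telemetry, booleans to
// Attributes. Anything else is rejected instead of being flattened elsewhere.
func (r *record) put(key string, v any) error {
	switch x := v.(type) {
	case bool:
		if r.Attributes == nil {
			r.Attributes = map[string]bool{}
		}
		r.Attributes[key] = x
		return nil
	case int:
		return r.putNumber(key, float64(x))
	case int64:
		return r.putNumber(key, float64(x))
	case uint64:
		return r.putNumber(key, float64(x))
	case float64:
		return r.putNumber(key, x)
	}
	return fmt.Errorf("telemetry %q: unsupported type %T", key, v)
}

func (r *record) putNumber(key string, n float64) error {
	if r.Telemetry == nil {
		r.Telemetry = map[string]float64{}
	}
	r.Telemetry[key] = n
	return nil
}

func main() {
	var disk record
	_ = disk.put("power_on_hours", 12345)
	_ = disk.put("life_remaining_pct", 97.0)
	_ = disk.put("throttle_active", false)
	fmt.Println(disk.Telemetry, disk.Attributes)
}
```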

### 1.9 — Mellanox/NVIDIA NIC enrichment

Source: `mstflint -d <bdf> q` — if mstflint present and device is Mellanox (vendor_id 0x15b3)
Fallback: `ethtool -i <iface>` — firmware-version field

Enriches PCIe/NIC records:
- `firmware` — from mstflint `FW Version` or ethtool `firmware-version`
- `serial_number` — from mstflint `Board Serial Number` if available

Detection: by PCI vendor_id (0x15b3 = Mellanox/NVIDIA Networking) from PCIe collector

Tests: table tests with mstflint output fixtures

### 1.10 — RAID controller enrichment

Source: tool selected by PCI vendor_id:

| PCI vendor_id | Tool | Controller |
|---|---|---|
| 0x1000 | `storcli64 /c<n> show all J` | Broadcom MegaRAID |
| 0x1000 (SAS) | `sas2ircu <n> display` / `sas3ircu <n> display` | LSI SAS 2.x/3.x |
| 0x9005 | `arcconf getconfig <n>` | Adaptec |
| 0x103c | `ssacli ctrl slot=<n> pd all show detail` | HPE Smart Array |

Collects physical drives behind controller (not visible to OS as block devices):
- serial_number, model, manufacturer, firmware
- size_gb, interface (SAS/SATA), slot/bay
- status (Online/Failed/Rebuilding → OK/CRITICAL/WARNING)

No hardcoded vendor names in detection logic — pure PCI vendor_id map.

Tests: table tests with storcli/sas2ircu text fixtures
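
The vendor_id table translates naturally into a lookup map. The sketch below is one possible arrangement: since 0x1000 covers both MegaRAID and plain SAS controllers, it lists several candidate tools and picks the first one that is installed. The try order and the names `raidTools` and `pickTool` are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"os/exec"
)

// raidTools maps a PCI vendor_id to candidate CLI tools, in the order to try.
// 0x1000 covers both MegaRAID and plain SAS HBAs, so it lists three tools.
var raidTools = map[uint16][]string{
	0x1000: {"storcli64", "sas3ircu", "sas2ircu"},
	0x9005: {"arcconf"},
	0x103c: {"ssacli"},
}

// pickTool returns the first candidate for vendorID that is available.
// The availability check is injected so tests can run without vendor tools.
func pickTool(vendorID uint16, available func(name string) bool) (string, bool) {
	for _, name := range raidTools[vendorID] {
		if available(name) {
			return name, true
		}
	}
	return "", false
}

// inPath reports whether a tool can be found on the current PATH.
func inPath(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

func main() {
	for _, id := range []uint16{0x1000, 0x9005, 0x103c, 0x8086} {
		tool, ok := pickTool(id, inPath)
		fmt.Printf("vendor 0x%04x: tool=%q found=%v\n", id, tool, ok)
	}
}
```

A vendor_id with no entry, or with no installed tool, simply yields `false`, which fits the rule that a missing tool skips the collector rather than failing the audit.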

### 1.11 — Output and USB write

`--output stdout` (default): pretty-printed JSON to stdout
`--output file:<path>`: write JSON to explicit path
`--output usb`: auto-detect first removable block device, mount it, write `audit-<board_serial>-<YYYYMMDD-HHMMSS>.json`

USB detection: scan `/sys/block/*/removable`, pick first `1`, mount to `/tmp/bee-usb`

QR summary to stdout (always): board serial + model + component counts — fits in one QR code
Uses `qrencode` if present, else skips silently
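
The detection and naming rules above could be sketched as follows. The sysfs root is passed in as a parameter so the function can be pointed at a fixture directory in tests; that parameter, the sentinel error and both function names are assumptions of this sketch. Mounting is left out.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"strings"
	"time"
)

var errNoRemovable = errors.New("no removable block device found")

// firstRemovable scans <sysBlock>/*/removable and returns the name of the
// first device whose flag is "1". sysBlock is "/sys/block" in production.
func firstRemovable(sysBlock string) (string, error) {
	flags, err := filepath.Glob(filepath.Join(sysBlock, "*", "removable"))
	if err != nil {
		return "", err
	}
	sort.Strings(flags) // deterministic "first" device
	for _, flag := range flags {
		b, err := os.ReadFile(flag)
		if err != nil {
			continue // unreadable entry: skip it and stay unattended
		}
		if strings.TrimSpace(string(b)) == "1" {
			return filepath.Base(filepath.Dir(flag)), nil
		}
	}
	return "", errNoRemovable
}

// outputName builds audit-<board_serial>-<YYYYMMDD-HHMMSS>.json.
func outputName(boardSerial string, t time.Time) string {
	return fmt.Sprintf("audit-%s-%s.json", boardSerial, t.Format("20060102-150405"))
}

func main() {
	dev, err := firstRemovable("/sys/block")
	fmt.Println(dev, err)
	fmt.Println(outputName("SN12345", time.Now()))
}
```

Optical drives typically also report `removable` as `1`, so a real implementation booted from virtual media would probably need an extra filter before mounting.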

### 1.12 — Integration test (local)

`scripts/test-local.sh` — runs audit binary on developer machine (Linux), captures JSON,
validates required fields are present (board.serial_number non-empty, cpus non-empty, etc.)

Not a unit test — requires real hardware access. Documents how to run for verification.

---

## Phase 2 — Alpine LiveCD

ISO image bootable via BMC virtual media. Runs audit binary automatically on boot.

### 2.1 — Builder environment

`iso/builder/Dockerfile` — Alpine 3.21 build environment with:
- `alpine-sdk`, `abuild`, `squashfs-tools`, `xorriso`
- Go toolchain (for binary compilation inside builder)
- NVIDIA driver `.run` pre-fetched during image build

`iso/builder/build.sh` — orchestrates full ISO build:
1. Compile Go binary (static, `CGO_ENABLED=0`)
2. Compile NVIDIA kernel module against Alpine 3.21 LTS kernel headers
3. Run `mkimage.sh` with bee profile
4. Output: `dist/bee-<version>.iso`

### 2.2 — NVIDIA driver build

Alpine 3.21, LTS kernel 6.6 — fixed versions in builder.

`iso/builder/build-nvidia.sh`:
- Download `NVIDIA-Linux-x86_64-<ver>.run` (version pinned in `iso/builder/VERSIONS`)
- Extract kernel module sources
- Compile against `linux-lts-dev` headers
- Strip and package as `nvidia-<ver>-k6.6.ko.tar.gz` for inclusion in overlay

`iso/overlay/usr/local/bin/load-nvidia.sh`:
- `insmod` sequence: nvidia.ko → nvidia-modeset.ko → nvidia-uvm.ko
- Verify: `nvidia-smi -L` → log result
- On failure: log warning, continue (audit runs without GPU enrichment)

### 2.3 — Alpine mkimage profile

`iso/builder/mkimg.bee.sh` — Alpine mkimage profile:
- Base: `alpine-base`
- Kernel: `linux-lts`
- Packages: `dmidecode smartmontools nvme-cli pciutils ipmitool util-linux e2fsprogs qrencode`
- Overlay: `iso/overlay/` included as apkovl

### 2.4 — Network bring-up on boot

`iso/overlay/usr/local/bin/bee-network.sh`:
- Enumerate all network interfaces: `ip link show` → filter out loopback and virtual (docker/bridge)
- For each physical interface: `ip link set <iface> up` + `udhcpc -i <iface> -t 5 -T 3 -n`
- Log each interface result (got IP / timeout / no carrier)
- Continue regardless — network is best-effort for auto-update

`iso/overlay/etc/init.d/bee-network`:
- runlevel: default, before: bee-update
- Calls bee-network.sh
- Does not block boot if DHCP fails on all interfaces

### 2.5 — OpenRC boot service (bee-audit)

`iso/overlay/etc/init.d/bee-audit`:
- runlevel: default, after: bee-update
- start(): load-nvidia.sh → /usr/local/bin/audit --output usb
- on completion: print QR summary to /dev/tty1 (always, even if USB write failed)
- log everything to /var/log/bee-audit.log
- exits 0 regardless of partial failures — unattended, no prompts, no waits

Unattended invariants:
- No TTY prompts ever. All decisions are automatic.
- Missing USB: output goes to `/tmp/bee-audit-<serial>-<date>.json`, QR shown on screen.
- Missing NVIDIA driver: GPU records have status UNKNOWN, audit continues.
- Missing ipmitool/storcli/any tool: that collector is skipped, rest continue.
- Timeout on any external command: 30s hard limit via `timeout` wrapper, then skip.
- Boot never hangs waiting for user input.
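
The "missing tool" and "timeout" invariants can also be enforced inside the Go binary. The sketch below is an in-process alternative to the `timeout` command named above, using a context deadline; the `result` type and `runTool` name are illustrative assumptions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// defaultTimeout is the 30s hard limit from the invariants above.
const defaultTimeout = 30 * time.Second

// result records either the tool output or why the collector was skipped.
type result struct {
	Output  []byte
	Skipped bool
	Reason  string
}

// runTool runs an external command with a hard deadline. It never prompts
// and never returns an error that would stop the audit: failures become skips.
func runTool(timeout time.Duration, name string, args ...string) result {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	out, err := exec.CommandContext(ctx, name, args...).Output()
	switch {
	case errors.Is(ctx.Err(), context.DeadlineExceeded):
		return result{Skipped: true, Reason: fmt.Sprintf("%s: timed out after %s", name, timeout)}
	case errors.Is(err, exec.ErrNotFound):
		return result{Skipped: true, Reason: name + ": not installed"}
	case err != nil:
		return result{Skipped: true, Reason: fmt.Sprintf("%s: %v", name, err)}
	}
	return result{Output: out}
}

func main() {
	r := runTool(defaultTimeout, "dmidecode", "-t", "1")
	if r.Skipped {
		fmt.Println("collector skipped:", r.Reason)
		return
	}
	fmt.Printf("%d bytes of output\n", len(r.Output))
}
```

Each collector would log `Reason` and move on, which keeps the boot path free of anything that waits on input.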

`iso/overlay/etc/runlevels/default/bee-audit` symlink

### 2.6 — Vendor utilities in overlay

`iso/overlay/usr/local/bin/` includes pre-fetched proprietary tools:
- `storcli64` (Broadcom)
- `sas2ircu`, `sas3ircu` (Broadcom/LSI)
- `mstflint` (NVIDIA Networking / Mellanox)

`scripts/fetch-vendor.sh` — downloads and places these before ISO build.
Checksums verified. Tools not committed to git — fetched at build time.

`iso/vendor/.gitkeep` — placeholder, directory gitignored except .gitkeep

### 2.7 — Auto-update of audit binary (USB + network)

Two update paths, tried in order on every boot:

**Path A — USB (no network required, higher priority):**

`bee-update.sh` scans mounted removable media for an update package before checking network.

Looks for: `<usb>/bee-update/bee-audit-linux-amd64` + `<usb>/bee-update/bee-audit-linux-amd64.sha256`

Steps:
1. Find USB mount point (same detection as audit output: `/sys/block/*/removable`)
2. Check for `bee-update/bee-audit-linux-amd64` on the USB root
3. Read version from `bee-update/VERSION` file (plain text, e.g. `1.3`)
4. Compare with running binary version (`/usr/local/bin/audit --version`)
5. If USB version > running: verify SHA256 checksum, replace binary, log update
6. Re-run audit if updated
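
Step 5 depends on a numeric version comparison, since plain string comparison would rank `1.10` below `1.9`. A minimal sketch, assuming versions are dotted non-negative integers with an optional leading `v` (the function names are hypothetical):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "1.3" or "v1.3" into []int{1, 3}.
func parseVersion(s string) ([]int, error) {
	s = strings.TrimPrefix(strings.TrimSpace(s), "v")
	if s == "" {
		return nil, fmt.Errorf("empty version")
	}
	var parts []int
	for _, p := range strings.Split(s, ".") {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return nil, fmt.Errorf("invalid version %q", s)
		}
		parts = append(parts, n)
	}
	return parts, nil
}

// compareVersions returns -1, 0 or 1 when a is older than, equal to or newer
// than b. Missing trailing components count as 0, so "1" equals "1.0".
func compareVersions(a, b string) (int, error) {
	pa, err := parseVersion(a)
	if err != nil {
		return 0, err
	}
	pb, err := parseVersion(b)
	if err != nil {
		return 0, err
	}
	for i := 0; i < len(pa) || i < len(pb); i++ {
		var x, y int
		if i < len(pa) {
			x = pa[i]
		}
		if i < len(pb) {
			y = pb[i]
		}
		if x != y {
			if x < y {
				return -1, nil
			}
			return 1, nil
		}
	}
	return 0, nil
}

func main() {
	c, err := compareVersions("1.3", "1.0")
	fmt.Println(c, err)
}
```

An unparsable `VERSION` file surfaces as an error, which the updater can log before continuing with the current binary.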

**Authenticity verification — Ed25519 multi-key trust (stdlib only, no external tools):**

Problem: SHA256 alone does not prevent a crafted attack — an attacker places their binary
and a matching SHA256 next to it. The LiveCD would accept it.

Solution: Ed25519 asymmetric signatures via Go stdlib `crypto/ed25519`.
Multiple developer public keys are supported. A binary update is accepted if its signature
verifies against ANY of the embedded trusted public keys.

This mirrors the SSH authorized_keys model: add a developer → add their public key.
Remove a developer → rebuild without their key.

**Key management — centralized across all projects:**

Public keys live in a dedicated repo at git.mchus.pro/mchus/keys (or similar):
```
keys/
  developers/
    mchusavitin.pub    ← Ed25519 public key, base64, one line
    developer2.pub
  README.md            ← how to generate a key pair
```

Public keys are safe to commit — they are not secret.
Private keys stay on each developer's machine, never committed anywhere.

Key generation (one-time per developer, run locally):
```sh
# scripts/keygen.sh — also lives in the keys repo
openssl genpkey -algorithm ed25519 -out ~/.bee-release.key
openssl pkey -in ~/.bee-release.key -pubout -outform DER \
  | tail -c 32 | base64 > mchusavitin.pub
```

**Embedding public keys at release build time (not in source):**

Public keys are injected via `-ldflags` at build time from the keys repo.
The binary does not hardcode keys — they are provided by the release script.

```go
// audit/internal/updater/trust.go
package updater

import (
	"crypto/ed25519"
	"encoding/base64"
	"fmt"
	"os"
	"strings"
)

// trustedKeysRaw is injected at build time via -ldflags
// format: base64(key1):base64(key2):...
var trustedKeysRaw string

// trustedKeys decodes the injected key list; an empty list disables updates.
func trustedKeys() ([]ed25519.PublicKey, error) {
	if trustedKeysRaw == "" {
		return nil, fmt.Errorf("binary built without trusted keys — updates disabled")
	}
	var keys []ed25519.PublicKey
	for _, enc := range strings.Split(trustedKeysRaw, ":") {
		b, err := base64.StdEncoding.DecodeString(strings.TrimSpace(enc))
		if err != nil {
			return nil, fmt.Errorf("invalid trusted key: %w", err)
		}
		if len(b) != ed25519.PublicKeySize {
			return nil, fmt.Errorf("invalid trusted key: got %d bytes, want %d", len(b), ed25519.PublicKeySize)
		}
		keys = append(keys, ed25519.PublicKey(b))
	}
	return keys, nil
}

// verifySignature accepts the binary if any trusted key verifies the signature.
func verifySignature(binaryPath, sigPath string) error {
	keys, err := trustedKeys()
	if err != nil {
		return err
	}
	data, err := os.ReadFile(binaryPath)
	if err != nil {
		return fmt.Errorf("read binary: %w", err)
	}
	sig, err := os.ReadFile(sigPath) // 64 bytes raw Ed25519 signature
	if err != nil {
		return fmt.Errorf("read signature: %w", err)
	}
	for _, key := range keys {
		if ed25519.Verify(key, data, sig) {
			return nil // any trusted key accepts → pass
		}
	}
	return fmt.Errorf("signature verification failed: no trusted key matched")
}
```

Release build injects keys (`-X` takes the package import path followed by the variable name):
```sh
# scripts/build-release.sh
KEYS=$(paste -sd: keys/developers/*.pub)
go build -ldflags "-X bee/audit/internal/updater.trustedKeysRaw=${KEYS}" \
  -o dist/bee-audit-linux-amd64 ./cmd/audit
```

Signing (release engineer signs with their private key):
```sh
# scripts/sign-release.sh <binary>
openssl pkeyutl -sign -inkey ~/.bee-release.key \
  -rawin -in "$1" -out "$1.sig"
```

Binary built without `-ldflags` injection (e.g. local dev build) has `trustedKeysRaw=""`
→ updates are disabled, logged as INFO, audit continues normally.

Update rejected silently (logged as WARNING, audit continues with current binary) if:
- `.sig` file missing
- Signature does not match any trusted key
- `trustedKeysRaw` empty (dev build)

Update package layout on USB:
```
/bee-update/
  bee-audit-linux-amd64       ← new binary (signed with one of the trusted keys)
  bee-audit-linux-amd64.sig   ← Ed25519 signature (64 bytes raw)
  VERSION                     ← plain version string e.g. "1.3"
```

Admin workflow: download `bee-audit-linux-amd64` + `bee-audit-linux-amd64.sig` from Gitea
release assets, place in `bee-update/` on USB.

**Path B — Network (requires DHCP on at least one interface):**
1. Check network: `ping git.mchus.pro -c 1 -W 3 || skip`
2. Fetch: `GET https://git.mchus.pro/api/v1/repos/<org>/bee/releases/latest`
3. Parse tag_name, asset URLs for `bee-audit-linux-amd64` + `bee-audit-linux-amd64.sig`
4. Compare tag with running version
5. If newer: download both files to /tmp, verify Ed25519 signature against all trusted keys
6. Replace binary on pass, log and skip on fail
7. Re-run audit if updated
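
Step 3 could be sketched as below. It assumes the Gitea release JSON exposes `tag_name` plus an `assets` array whose entries carry `name` and `browser_download_url`; the type and function names are illustrative, and the URLs in the example are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const (
	binaryAsset = "bee-audit-linux-amd64"
	sigAsset    = binaryAsset + ".sig"
)

// release holds the subset of the release JSON that the updater needs.
type release struct {
	TagName string `json:"tag_name"`
	Assets  []struct {
		Name string `json:"name"`
		URL  string `json:"browser_download_url"`
	} `json:"assets"`
}

// updateInfo is what the later steps consume: the tag and both download URLs.
type updateInfo struct {
	Tag, BinaryURL, SigURL string
}

// parseRelease extracts the tag and asset URLs. It fails if the binary or
// its signature is missing, so an unsigned release is never downloaded.
func parseRelease(body []byte) (updateInfo, error) {
	var rel release
	if err := json.Unmarshal(body, &rel); err != nil {
		return updateInfo{}, fmt.Errorf("parse release JSON: %w", err)
	}
	info := updateInfo{Tag: rel.TagName}
	for _, a := range rel.Assets {
		switch a.Name {
		case binaryAsset:
			info.BinaryURL = a.URL
		case sigAsset:
			info.SigURL = a.URL
		}
	}
	if info.Tag == "" || info.BinaryURL == "" || info.SigURL == "" {
		return updateInfo{}, fmt.Errorf("release is missing tag, binary or signature asset")
	}
	return info, nil
}

func main() {
	// Invented response body with placeholder URLs.
	body := []byte(`{"tag_name":"v1.3","assets":[
		{"name":"bee-audit-linux-amd64","browser_download_url":"https://example.invalid/bin"},
		{"name":"bee-audit-linux-amd64.sig","browser_download_url":"https://example.invalid/sig"}]}`)
	fmt.Println(parseRelease(body))
}
```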

**Ordering:** USB update checked first, network checked second.
If USB update applied and verified, network check is skipped.

`iso/overlay/etc/init.d/bee-update`:
- runlevel: default
- after: bee-network (network path needs interfaces up)
- before: bee-audit (audit runs with latest binary)
- Calls bee-update.sh

Triggered after bee-audit completes, only if network is available.

`iso/overlay/usr/local/bin/bee-update.sh`:

```
1. Check network: ping git.mchus.pro -c 1 -W 3 || exit 0
2. Fetch latest release metadata:
   GET https://git.mchus.pro/api/v1/repos/<org>/bee/releases/latest
3. Parse: extract tag_name, asset URL for bee-audit-linux-amd64
4. Compare tag_name with /usr/local/bin/audit --version output
5. If newer: download to /tmp/bee-audit-new, verify SHA256 checksum from release assets
6. Replace /usr/local/bin/audit (tmpfs — survives until reboot)
7. Log: updated from vX.Y to vX.Z
8. Re-run audit if update happened: /usr/local/bin/audit --output usb
```

`iso/overlay/etc/init.d/bee-update`:
- runlevel: default
- after: bee-audit, network
- Calls bee-update.sh

Release naming convention: binary asset named `bee-audit-linux-amd64` per release tag.

### 2.8 — Release workflow

`iso/builder/VERSIONS` — pinned versions:
```
AUDIT_VERSION=1.0
ALPINE_VERSION=3.21
KERNEL_VERSION=6.6
NVIDIA_DRIVER_VERSION=550.54.15
```

LiveCD release = full ISO rebuild. Binary-only patch = new Gitea release with binary asset.
On boot with network: ISO auto-patches its binary without full rebuild.

ISO version embedded in `/etc/bee-release`:
```
BEE_ISO_VERSION=1.0
BEE_AUDIT_VERSION=1.0
BUILD_DATE=2026-03-05
```

---

## Eating order

Builder environment is set up early (after 1.3) so every subsequent collector
is developed and tested directly on real hardware in the actual Alpine environment.
No "works on my Mac" drift.

```
1.0   keys repo setup              → git.mchus.pro/mchus/keys, keygen.sh, developer pubkeys
1.1   scaffold + schema types      → binary runs, outputs empty JSON
1.2   board collector              → first real data
1.3   CPU collector                → +CPUs

--- BUILDER + DEBUG ISO (unblock real-hardware testing) ---

2.1   builder VM setup             → Alpine VM with build deps + Go toolchain
2.2   debug ISO profile            → minimal Alpine ISO: audit binary + dropbear SSH + all packages
2.3   boot on real server          → SSH in, verify packages present, run audit manually

--- CONTINUE COLLECTORS (tested on real hardware from here) ---

1.4   memory collector             → +DIMMs
1.5   storage collector            → +disks (SATA/SAS/NVMe)
1.6   PCIe collector               → +all PCIe devices
1.7   PSU collector                → +power supplies
1.8   NVIDIA GPU enrichment        → +GPU serial/VBIOS
1.8b  wear/age telemetry           → +SMART hours, NVMe % used, SFP DOM, ECC
1.9   Mellanox NIC enrichment      → +NIC firmware/serial
1.10  RAID enrichment              → +physical disks behind RAID
1.11  output + USB write           → production-ready output

--- PRODUCTION ISO ---

2.4   NVIDIA driver build          → driver compiled into overlay
2.5   network bring-up on boot     → DHCP on all interfaces
2.6   OpenRC boot service          → audit runs on boot automatically
2.7   vendor utilities             → storcli/sas2ircu/mstflint in image
2.8   auto-update                  → binary self-patches from Gitea
2.9   release workflow             → versioning + release notes
```