Michael Chus 74a3c65f64 Move nvtop to GPU-specific package lists; clean up git-bible
nvtop pulled nvidia-tesla-470-* via Recommends into the nogpu build.
Move it from bee.list.chroot into bee-nvidia and bee-amd lists so it
only appears in GPU variants.

Also remove the stray git-bible/ directory (was not gitignored) and
move grub-bitmap-error docs into bible-local/docs/.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 19:36:27 +03:00

bee — Project Bible

Project-specific architecture, decisions, and runtime contracts. Generic engineering rules live in bible/rules/patterns/.

Files

| File | Contents |
| --- | --- |
| architecture/system-overview.md | What bee does, scope, tech stack |
| architecture/runtime-flows.md | Boot sequence, audit flow, service order |
| docs/customer-gpu-test-methodology.md | Customer-facing GPU PCIe Validate / Validate -> Stress test list |
| docs/hardware-ingest-contract.md | Current Reanimator hardware ingest JSON contract |
| docs/validate-vs-burn.md | Validate and Validate -> Stress hardware test policy |
| decisions/ | Architectural decision log, including read-only submodule policy |

Validate Test Matrix

Validate

  • CPU check
    • lscpu
    • sensors
    • stress-ng
  • Memory check
    • free
    • timeout <timeout_sec> memtester
    • free
  • NVMe storage check
    • nvme id-ctrl
    • nvme smart-log
    • nvme device-self-test
  • SATA/SAS storage check
    • smartctl -H -A
    • smartctl -t short
  • Basic NVIDIA GPU check
    • nvidia-smi -pm 1
    • nvidia-smi -q
    • dmidecode -t baseboard
    • dmidecode -t system
    • dcgmi diag -r 2
  • Inter-GPU communication check
    • all_reduce_perf
  • GPU bandwidth check
    • dcgmi diag -r nvbandwidth
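
The Validate matrix above can be written down as plain data, which makes the ordering and the `<timeout_sec>` placeholder explicit. The following is a minimal sketch, not bee's actual implementation: the names `VALIDATE` and `render_plan` are hypothetical, and the commands are kept abbreviated exactly as listed (device paths and other arguments are omitted), so the output is a dry-run plan rather than something to execute directly.

```python
# Validate matrix transcribed from the list above. Commands are abbreviated
# exactly as in the document; real invocations need device paths and other
# arguments that are not specified here.
VALIDATE = {
    "CPU check": ["lscpu", "sensors", "stress-ng"],
    "Memory check": ["free", "timeout <timeout_sec> memtester", "free"],
    "NVMe storage check": [
        "nvme id-ctrl",
        "nvme smart-log",
        "nvme device-self-test",
    ],
    "SATA/SAS storage check": ["smartctl -H -A", "smartctl -t short"],
    "Basic NVIDIA GPU check": [
        "nvidia-smi -pm 1",
        "nvidia-smi -q",
        "dmidecode -t baseboard",
        "dmidecode -t system",
        "dcgmi diag -r 2",
    ],
    "Inter-GPU communication check": ["all_reduce_perf"],
    "GPU bandwidth check": ["dcgmi diag -r nvbandwidth"],
}


def render_plan(matrix, timeout_sec):
    """Return ordered (check, command) pairs with <timeout_sec> filled in."""
    plan = []
    for check, commands in matrix.items():
        for command in commands:
            filled = command.replace("<timeout_sec>", str(timeout_sec))
            plan.append((check, filled))
    return plan


if __name__ == "__main__":
    # Dry run: print the plan instead of executing anything.
    for check, command in render_plan(VALIDATE, timeout_sec=300):
        print(f"{check}: {command}")
```

The `timeout_sec=300` value is only an illustration; the document does not state a default. Note that the memory check runs `free` both before and after `memtester`, so the plan preserves duplicates rather than deduplicating commands.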

Validate -> Stress

  • Extended NVIDIA GPU check
    • nvidia-smi -pm 1
    • nvidia-smi -q
    • dmidecode -t baseboard
    • dmidecode -t system
    • dcgmi diag -r 3
  • NVIDIA targeted stress
    • nvidia-smi -pm 1
    • nvidia-smi -q
    • dcgmi diag -r targeted_stress
  • NVIDIA targeted power
    • nvidia-smi -pm 1
    • nvidia-smi -q
    • dcgmi diag -r targeted_power
  • NVIDIA pulse test
    • nvidia-smi -pm 1
    • nvidia-smi -q
    • dcgmi diag -r pulse_test
  • Inter-GPU communication check
    • all_reduce_perf
  • GPU bandwidth check
    • dcgmi diag -r nvbandwidth
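
The NVIDIA entries in the Validate -> Stress list share a pattern: the same `nvidia-smi` preamble, optionally the two `dmidecode` queries, then one `dcgmi diag -r <run>` invocation. A minimal sketch of that pattern follows; the helper `dcgm_check` and the constant names are hypothetical and are not taken from the bee codebase.

```python
# Steps shared by every NVIDIA check in the list above.
GPU_PREAMBLE = ["nvidia-smi -pm 1", "nvidia-smi -q"]
# Extra hardware identification, listed only for the basic/extended checks.
BOARD_INFO = ["dmidecode -t baseboard", "dmidecode -t system"]


def dcgm_check(run, with_board_info=False):
    """Build the command list for one DCGM-based check.

    `run` is the value passed to `dcgmi diag -r`, as listed in the document.
    """
    steps = list(GPU_PREAMBLE)
    if with_board_info:
        steps += BOARD_INFO
    steps.append(f"dcgmi diag -r {run}")
    return steps


VALIDATE_STRESS = {
    "Extended NVIDIA GPU check": dcgm_check("3", with_board_info=True),
    "NVIDIA targeted stress": dcgm_check("targeted_stress"),
    "NVIDIA targeted power": dcgm_check("targeted_power"),
    "NVIDIA pulse test": dcgm_check("pulse_test"),
    "Inter-GPU communication check": ["all_reduce_perf"],
    "GPU bandwidth check": ["dcgmi diag -r nvbandwidth"],
}

if __name__ == "__main__":
    for check, commands in VALIDATE_STRESS.items():
        print(check)
        for command in commands:
            print(f"  {command}")
```

Under the same helper, the Basic NVIDIA GPU check from the Validate list would be `dcgm_check("2", with_board_info=True)`, which shows that the two modes differ in the DCGM run level for the general GPU check (2 versus 3) and in the additional targeted stress, targeted power, and pulse tests.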