Fix Runtime Health criteria: network, services, nvidia-fabricmanager
Network: green if at least one interface has IPv4 (drop PARTIAL state).

Bee Services: treat inactive as OK. Oneshot services (bee-sshsetup, bee-preflight, bee-network, bee-audit, etc.) complete successfully and exit to inactive; only failed is a real problem.

nvidia-fabricmanager: add ExecCondition=bee-check-nvswitch drop-in so the service is silently skipped (inactive, not failed) on systems without NVSwitch hardware (e.g. H200 NVL with direct NVLink, no NVSwitch chips). bee-check-nvswitch detects NVSwitch via lspci (vendor 10de, class 0680).

bee-nvidia.service: add ConditionPathExists=/usr/local/bin/bee-nvidia-load so the unit is a no-op if somehow present in a non-nvidia build.

bee-boot-status: read /etc/bee-gpu-vendor and exclude bee-nvidia from CRITICAL/ALL on non-nvidia builds, preventing boot hang if the unit is unexpectedly present.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
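
The network and service rules in the message above can be sketched roughly as follows. This is an illustration only, not the project's code: the function names `network_health` and `service_health`, the OK/FAIL labels, and the choice to feed `ip -o -4 addr show scope global` output on stdin (which leaves out loopback) are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of the health rules; names and labels are assumptions.

# network_health reads `ip -o -4 addr show scope global` style output on stdin
# (one line per IPv4 address) and prints OK if at least one address is present.
# There is no PARTIAL state: a single address is enough.
network_health() {
    if grep -q 'inet '; then
        echo OK
    else
        echo FAIL
    fi
}

# service_health maps a `systemctl is-active` state string to a health label.
# Only "failed" counts as a problem; "inactive" is the normal end state of a
# oneshot unit that does not set RemainAfterExit=yes.
service_health() {
    case "$1" in
        failed) echo FAIL ;;
        *)      echo OK ;;
    esac
}
```

Possible usage, assuming iproute2 and systemd are available: `ip -o -4 addr show scope global | network_health` and `service_health "$(systemctl is-active bee-audit.service)"`.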
@@ -2,6 +2,8 @@
 Description=Bee: load NVIDIA kernel modules and create device nodes
 After=local-fs.target udev.service bee-blackbox.service
 Before=bee-audit.service
+# Skip silently if bee-nvidia-load is absent (non-nvidia builds).
+ConditionPathExists=/usr/local/bin/bee-nvidia-load
 
 [Service]
 Type=oneshot

@@ -0,0 +1,4 @@
+[Service]
+# Skip fabricmanager on systems without NVSwitch hardware.
+# ExecCondition exits 1-254 → unit is silently skipped (inactive, not failed).
+ExecCondition=/usr/local/bin/bee-check-nvswitch
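
The contents of bee-check-nvswitch are not shown in this diff. A minimal sketch of such a helper, assuming it parses `lspci -n` output for class 0680 and vendor 10de as the commit message describes, might look like the following. The function names `has_nvswitch` and `check_nvswitch` are hypothetical. For ExecCondition, exit code 0 lets the unit start, 1-254 skips it without marking it failed, and 255 or an abnormal exit marks it failed.

```shell
#!/bin/sh
# Hypothetical sketch of an ExecCondition helper; not the project's actual script.

# has_nvswitch reads `lspci -n` style lines on stdin, for example
#   "07:00.0 0680: 10de:22a3 (rev a1)"
# where field 2 is the class code and field 3 is vendor:device.
# It returns 0 if any device has class 0680 and vendor 10de, otherwise 1.
has_nvswitch() {
    awk '$2 == "0680:" && $3 ~ /^10de:/ { found = 1 } END { exit !found }'
}

# check_nvswitch runs lspci and maps the result to ExecCondition exit codes:
#   0 -> NVSwitch present, start the unit
#   1 -> no NVSwitch (or lspci unavailable), skip the unit
check_nvswitch() {
    lspci -n 2>/dev/null | has_nvswitch
}
```

A real helper built this way would end by calling `check_nvswitch` so that its exit status becomes the script's exit status. Keeping the parsing in `has_nvswitch` separate from the `lspci` call makes it possible to test the match against captured output on a machine without the hardware.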