Move nvtop to GPU-specific package lists; clean up git-bible

nvtop pulled nvidia-tesla-470-* via Recommends into the nogpu build. Move it
from bee.list.chroot into bee-nvidia and bee-amd lists so it only appears in
GPU variants.

Also remove the stray git-bible/ directory (was not gitignored) and move
grub-bitmap-error docs into bible-local/docs/.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
312
bible-local/docs/grub-bitmap-error-history.md
Normal file
@@ -0,0 +1,312 @@
# GRUB Bitmap Error History

## Symptom

On some servers GRUB prints:

```text
error: null src bitmap in grub_video_bitmap_create_scaled.
Press any key to continue...
```

The important new observation as of `v10.7` is:

- the error still appears even when the logo image block is removed from
  `iso/builder/config/bootloaders/grub-efi/live-theme/theme.txt`
- therefore the current error can no longer be explained only by
  `bee-logo.png` / `bee-logo.tga`

That does not prove the theme system is healthy. It proves only that the
currently remaining failure is deeper than "bad logo file".

## Current State

Current source files:

- [iso/builder/config/bootloaders/grub-efi/live-theme/theme.txt](/Users/mchusavitin/Documents/git/bee/iso/builder/config/bootloaders/grub-efi/live-theme/theme.txt:1)
  has no `image` block anymore
- [iso/builder/config/bootloaders/grub-efi/config.cfg](/Users/mchusavitin/Documents/git/bee/iso/builder/config/bootloaders/grub-efi/config.cfg:1)
  still does `insmod tga` and then `source /boot/grub/theme.cfg`

Implication:

- if the error still fires, the trigger is likely elsewhere in GRUB theme
  rendering or in the assets/config GRUB resolves while sourcing `theme.cfg`
- the old "PNG parser fragility" story is no longer a sufficient explanation
  for the current failure mode

Current artifact facts:

- the provided `easy-bee-nvidia-v10.7-amd64.logs` build logs reference
  `linux-image-6.1.0-45`
- the provided `easy-bee-nvidia-v10.7-amd64.iso` contains
  `live/initrd.img-6.1.0-45-amd64` and `live/vmlinuz-6.1.0-45-amd64`
- a later `BOOT FAILED!` screenshot showed `live/initrd.img-6.1.0-44-amd64`
  and `live/vmlinuz-6.1.0-44-amd64`

Implication:

- the `BOOT FAILED!` screenshot is not from the same artifact as the provided
  `v10.7` ISO/log set
- until the exact ISO filename and checksum are tied to that failure, the
  GRUB bitmap issue and the live-boot failure must be treated as separate
  problems
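
The artifact comparison above can be made mechanical. The following is a minimal sketch, assuming the file names are available as plain strings (copied from an ISO listing and from the screenshot); the helper names `kernel_versions` and `kernels_match` are hypothetical and not part of the bee build scripts. A match would not prove the artifacts are identical, but a mismatch shows they are different builds.

```python
import re

# Matches live-boot kernel/initrd names such as "live/vmlinuz-6.1.0-45-amd64".
KERNEL_RE = re.compile(r"(?:vmlinuz|initrd\.img)-(\d+\.\d+\.\d+-\d+)-amd64$")


def kernel_versions(paths):
    """Return the set of kernel ABI versions named in the given paths."""
    versions = set()
    for path in paths:
        match = KERNEL_RE.search(path)
        if match:
            versions.add(match.group(1))
    return versions


def kernels_match(iso_paths, screenshot_paths):
    """True only if both sources name exactly the same kernel versions."""
    return kernel_versions(iso_paths) == kernel_versions(screenshot_paths)


# File names taken from the artifact facts listed in this document.
iso_paths = ["live/initrd.img-6.1.0-45-amd64", "live/vmlinuz-6.1.0-45-amd64"]
screenshot_paths = ["live/initrd.img-6.1.0-44-amd64", "live/vmlinuz-6.1.0-44-amd64"]

print(kernels_match(iso_paths, screenshot_paths))  # False
```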

## Chronology

### 1. Initial bee GRUB theme introduction

Relevant commit:

- `d52ec67` `Stability hardening, build script fixes, GRUB bee logo`

What changed:

- bee-branded GRUB theme introduced
- image block with explicit `width` / `height`

Observed result:

- bitmap error appeared

### 2. Remove explicit scaling dimensions

Relevant commit:

- `aa284ae` `fix(iso): avoid grub logo scaling error`

What changed:

- removed `width = 400`
- removed `height = 400`

Reason stated by the change:

- try to avoid the scaling path

Observed result:

- error persisted

Conclusion:

- explicit width/height were not the sole trigger

### 3. Rework PNG handling and menu rendering

Relevant commit:

- `6112094` `fix(grub): fix bitmap error and menu rendering`

Commit message says the change was intended to:

- convert `bee-logo.png` to RGBA and strip metadata
- move `terminal_output gfxterm` before `insmod png` / theme load
- remove ASCII banner from GRUB menu area
- fix theme typography/layout fields

Observed result:

- error persisted

Notes:

- this was still operating under the assumption that the issue was the PNG
  payload or the order of gfxterm/theme init

### 4. Convert logo PNG back to RGB

Relevant commit:

- `333c44f` `Fix GRUB splash: convert bee-logo.png from RGBA to RGB`

Intended reason:

- GRUB might dislike RGBA PNG and want RGB PNG

Observed result:

- error still persisted according to later project notes

### 5. Add post-build canonical GRUB/isolinux sync

Relevant commit:

- `0cdfbc5` `fix(iso): restore boot UX and boot logs`

What this introduced:

- post-`lb build` rewriting of `binary/boot/grub/grub.cfg`
- post-`lb build` rewriting of `binary/isolinux/live.cfg`
- forced rebuild of `binary_checksums`, `binary_iso`, `binary_zsync`

Why it was added:

- restore canonical EASY-BEE boot UX after live-build wrote its own bootloader
  outputs
- restore expected boot menu and logs

Important note:

- this commit did not directly solve the bitmap issue
- it added a second layer of bootloader mutation after live-build

### 6. Switch from PNG to TGA

Relevant commit:

- `626763e` `Fix GRUB bitmap error: switch from PNG to TGA for splash logo`

Commit message says:

- GRUB PNG reader was considered fragile
- switch to uncompressed 24-bit TGA
- `config.cfg`: `insmod png` -> `insmod tga`
- `theme.txt`: `bee-logo.png` -> `bee-logo.tga`

Observed result:

- this did not eliminate the problem in the current lineage
- today the system still errors even after the entire image block was removed

Conclusion:

- switching PNG -> TGA was not a durable root-cause fix

### 7. Patch EFI image after build

Relevant commit:

- `4f20c92` `Make UEFI boot safe and remove GRUB logo`

What this introduced:

- `sync_efi_grub_theme_assets`
- direct `mtools` patching of `efi.img`
- copying `config.cfg`, `theme.cfg`, and `live-theme/*` into the EFI FAT image
- removal of the logo image block from `theme.txt`

Why it was added:

- make UEFI path "safe"
- keep EFI GRUB image aligned with canonical bootloader assets

Observed result:

- later this became the direct cause of `Disk full` during build once
  `bee-logo.tga` was large enough
- and even with the logo removed from `theme.txt`, the bitmap error still
  remained

Conclusion:

- EFI post-build patching increased build complexity
- removing the logo alone did not remove the runtime GRUB error

### 8. Remove ASCII logo banners

Relevant commit:

- `14505ef` `Remove easy bee ASCII logo banners`

What changed:

- web loading page ASCII cleanup only

Relevance here:

- none for GRUB bitmap error
- included here only to avoid confusion with other "logo removal" work

### 9. Remove EFI post-build patching

Relevant commit:

- `5dc022d` `Drop post-build EFI bootloader patching`

Why it was done:

- stop mutating `efi.img` post-build
- remove dependence on `mtools` for EFI patching
- remove the `Disk full` failure mode

Impact:

- this did not target the GRUB bitmap error directly
- it targeted build-system complexity and EFI image overflow

### 10. Restore only GRUB/isolinux post-build sync

Relevant commit:

- `42774d4` `Restore post-build GRUB and isolinux sync`

Why it was needed:

- removing all post-build sync caused final ISO validation to fail with
  missing canonical EASY-BEE boot entries
- memtest was still fine, but final GRUB menu was no longer canonical

What it restored:

- only `binary/boot/grub/grub.cfg`
- only `binary/isolinux/live.cfg`

What it did not restore:

- no EFI FAT image patching
- no `mtools` path

## What Is Proven False

The current evidence rules out several simplistic explanations:

- "the error is only caused by explicit image scaling"
- "the error is only caused by PNG vs TGA"
- "the error is only caused by the logo file itself"

Why:

- scaling dimensions were removed and error persisted
- PNG was replaced with TGA and error still survived in the lineage
- the image block itself is now absent, and the error still occurs

## Working Hypotheses Left

The remaining plausible layers are:

- GRUB theme engine still tries to render some bitmap-related element even
  without the logo image block
- GRUB is resolving stale theme assets from the built EFI/ISO path rather than
  what we think the source tree says
- `theme.cfg` / `theme.txt` / GRUB module loading order still triggers a bitmap
  code path elsewhere
- live-build may still package a stale `theme.txt` or stale `live-theme`
  directory into the final image
- the GRUB environment on the failing hardware may behave differently from the
  assumptions in our source tree
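
The first hypothesis can be checked against the text of a `theme.txt` without booting anything. The sketch below is a minimal example under stated assumptions: it only looks for quoted file names ending in `.png`, `.tga`, `.jpg`, or `.jpeg`, and the sample theme is a hypothetical fragment, not the project's actual file. The function name `bitmap_references` is likewise an illustration. A hit means only that the theme still names a bitmap somewhere other than a logo `image` block; it does not show that this reference triggers the error.

```python
import re

# Quoted file names with an extension handled by a GRUB bitmap reader module.
BITMAP_REF_RE = re.compile(r'"([^"]+\.(?:png|tga|jpe?g))"', re.IGNORECASE)


def bitmap_references(theme_text):
    """Return (line_number, referenced_file) for each quoted bitmap path."""
    hits = []
    for number, line in enumerate(theme_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # comment lines are ignored by the theme parser
        for match in BITMAP_REF_RE.finditer(line):
            hits.append((number, match.group(1)))
    return hits


# Hypothetical theme.txt fragment: no "+ image" block, but a styled box remains.
sample = '''\
title-text: ""
desktop-color: "#000000"
+ boot_menu {
  left = 10%
  selected_item_pixmap_style = "select_*.png"
}
'''

print(bitmap_references(sample))  # [(5, 'select_*.png')]
```

Running the same function over the `theme.txt` extracted from a built image, rather than the one in the source tree, keeps the check honest about what GRUB actually loads.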

## Decision Boundary

Before making another change, the next step should be evidence gathering from
the real built artifact, not another speculative edit.

That means checking on the actual built ISO or EFI image:

- exact `boot/grub/theme.cfg`
- exact `boot/grub/live-theme/theme.txt`
- exact contents of `boot/grub/live-theme/`
- whether the final image still contains a stale logo reference
- whether the EFI path and non-EFI path differ
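
One way to gather that evidence is to compare the source theme directory with the same directory extracted from the built image, file by file. The following is a minimal sketch, assuming the ISO (or EFI FAT image) contents have already been extracted to a local directory by some other means; `compare_trees` and the demo directory layout are hypothetical. The same function could be pointed at the EFI and non-EFI copies to see whether they differ.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def list_files(root):
    """Map relative POSIX path -> absolute Path for every file under root."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): p for p in root.rglob("*") if p.is_file()}


def compare_trees(source_dir, built_dir):
    """Classify files as identical, differing, or present on one side only."""
    source, built = list_files(source_dir), list_files(built_dir)
    report = {"same": [], "differs": [], "only_in_source": [], "only_in_built": []}
    for name in sorted(source.keys() | built.keys()):
        if name not in built:
            report["only_in_source"].append(name)
        elif name not in source:
            report["only_in_built"].append(name)
        elif sha256_of(source[name]) == sha256_of(built[name]):
            report["same"].append(name)
        else:
            report["differs"].append(name)
    return report


# Demo with throwaway directories standing in for the source tree and an
# extracted image; the file contents here are invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "source", "live-theme")
    out = Path(tmp, "extracted", "boot", "grub", "live-theme")
    src.mkdir(parents=True)
    out.mkdir(parents=True)
    (src / "theme.txt").write_text('title-text: ""\n')
    (out / "theme.txt").write_text('title-text: ""\n+ image { file = "bee-logo.tga" }\n')
    (out / "bee-logo.tga").write_bytes(b"\x00")
    report = compare_trees(src, out)

print(report)
```

In the demo, `theme.txt` lands under `differs` and `bee-logo.tga` under `only_in_built`, which is the pattern a stale packaged theme would produce.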

## Relevant Commits

- `d52ec67` `Stability hardening, build script fixes, GRUB bee logo`
- `aa284ae` `fix(iso): avoid grub logo scaling error`
- `6112094` `fix(grub): fix bitmap error and menu rendering`
- `333c44f` `Fix GRUB splash: convert bee-logo.png from RGBA to RGB`
- `0cdfbc5` `fix(iso): restore boot UX and boot logs`
- `626763e` `Fix GRUB bitmap error: switch from PNG to TGA for splash logo`
- `4f20c92` `Make UEFI boot safe and remove GRUB logo`
- `5dc022d` `Drop post-build EFI bootloader patching`
- `42774d4` `Restore post-build GRUB and isolinux sync`

@@ -1,5 +1,6 @@
 # AMD GPU firmware
 firmware-amd-graphics
+nvtop

 # AMD ROCm — GPU monitoring, bandwidth test, and compute stress (RVS GST)
 rocm-smi-lib=%%ROCM_SMI_VERSION%%

@@ -5,6 +5,7 @@
 # DCGM 4 is packaged per CUDA major. The image ships NVIDIA driver 590 with
 # CUDA 13 userspace, so install the CUDA 13 build plus proprietary components
 # explicitly.
+nvtop
 nvidia-fabricmanager=%%NVIDIA_FABRICMANAGER_VERSION%%
 datacenter-gpu-manager-4-cuda13=1:%%DCGM_VERSION%%
 datacenter-gpu-manager-4-proprietary=1:%%DCGM_VERSION%%

@@ -47,7 +47,6 @@ less
 vim-tiny
 mc
 htop
-nvtop
 sudo
 zstd
 mstflint