Compare commits

...

6 Commits

Author SHA1 Message Date
Mikhail Chusavitin
bab941ccf1 Fix SAA: set CWD=/usr/local/bin; include all SAA package binaries
- saa_dmi.go: set cmd.Dir=/usr/local/bin on all saa exec calls so
  acpica_bin/acpidump is found relative to correct working directory
- build.sh: copy all saa companion dirs (acpica_bin, ExternalData,
  tool, stunnel, GO_SNMP) to /usr/local/bin/ preserving structure
- iso/vendor: add acpica_bin/acpiexec, ExternalData/, tool/gpu/nVidia/x64/,
  tool/USBController/, stunnel/, GO_SNMP/ from SAA 1.5.0 release package

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 08:24:50 +03:00
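The working-directory fix described in this commit can be sketched as follows. This is a minimal sketch, not the code from saa_dmi.go: the helper name `saaCommand` and the constant `saaWorkDir` are hypothetical, and only the `cmd.Dir = /usr/local/bin` setting is taken from the commit message.

```go
package main

import (
	"fmt"
	"os/exec"
)

// saaWorkDir is the directory the commit message names as the required CWD.
const saaWorkDir = "/usr/local/bin"

// saaCommand is a hypothetical helper: it builds an exec.Cmd for the saa
// binary and pins its working directory, so that relative paths such as
// acpica_bin/acpidump resolve against saaWorkDir instead of the caller's CWD.
func saaCommand(args ...string) *exec.Cmd {
	cmd := exec.Command("saa", args...)
	cmd.Dir = saaWorkDir
	return cmd
}

func main() {
	// The command is only constructed here, not executed.
	cmd := saaCommand()
	fmt.Println("working directory:", cmd.Dir)
}
```

Routing every saa invocation through one constructor of this kind keeps the working directory in a single place, which is presumably why the commit sets it on all exec calls.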
Mikhail Chusavitin
b49c71a980 Add IPMI FRU editor to Tools page
- New card "IPMI — FRU" on Tools page (device 0, in-band)
- Read: GET /api/tools/ipmi-fru → ipmitool fru print 0 → editable table
- Editable fields: chassis (part#, serial, extra), board (mfr, product, serial, part#),
  product (mfr, name, part#, version, serial); read-only fields displayed as text
- Write: POST /api/tools/ipmi-fru/write → task → backup to fru-backups/ → ipmitool fru edit per field
- Dirty tracking + Save (N changed) button, same UX as Supermicro DMI card

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 08:13:35 +03:00
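The read path turns `Name : value` lines into table rows. A minimal sketch of that line split is shown below, assuming Go 1.18+ for `strings.Cut`; the function name `splitFRULine` and the sample lines are illustrative and not taken from the repository. Unlike the handler in this change, the sketch cuts at the first colon regardless of surrounding spaces.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFRULine splits one "Name : value" style line into its name and value.
// It cuts at the first colon, so colons inside the value (for example in a
// timestamp) are preserved. ok is false when the line has no colon or no name.
func splitFRULine(line string) (name, value string, ok bool) {
	before, after, found := strings.Cut(line, ":")
	if !found {
		return "", "", false
	}
	name = strings.TrimSpace(before)
	if name == "" {
		return "", "", false
	}
	return name, strings.TrimSpace(after), true
}

func main() {
	// Made-up sample lines in the documented "Field Name : value" shape.
	samples := []string{
		" Product Name          : Demo Server",
		" Board Mfg Date        : Mon Jan  1 00:00:00 2018",
		"a line without a separator",
	}
	for _, s := range samples {
		name, value, ok := splitFRULine(s)
		fmt.Printf("ok=%v name=%q value=%q\n", ok, name, value)
	}
}
```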
Mikhail Chusavitin
85d1acdaa3 Split validate/stress into separate fixed-mode pages
- Check (2): validate mode only — no mode switcher, no stress-only cards
  (nvidia-targeted-stress, nvidia-targeted-power, nvidia-pulse hidden)
- Load (3): stress mode only — no mode switcher, all cards shown
- satStressMode() hardcoded per page; satModeChanged() removed
- Profile card with radio buttons removed from both pages
- Replaced with simple Run All button + est. time

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-18 19:12:17 +03:00
Mikhail Chusavitin
a2d7513153 Restructure nav to Load/Burn/Benchmark; fix SAA acpidump dependency
- Nav steps 3-5: Load (validate), Burn (burn-in), Benchmark (speed+endurance merged)
- /load now renders validate mode; /burn renders burn-in; /benchmark replaces /speed+/endurance
- Legacy redirects updated: /validate→/load, /burn-in→/burn, /speed+/endurance→/benchmark
- Add acpica_bin/acpidump from SAA 1.5.0 package; required by saa GetDmiInfo (ExitCode 8)
- build.sh copies acpica_bin/acpidump to /usr/local/bin/acpica_bin/ alongside saa

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-18 19:07:51 +03:00
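The legacy redirects listed in this commit can be expressed as a small lookup table. The sketch below is an assumption about shape only: the names `legacyRedirects`, `legacyTarget`, and `registerLegacyRedirects` are hypothetical, and the 302 status code is a choice made for the sketch rather than something the commit states.

```go
package main

import (
	"fmt"
	"net/http"
)

// legacyRedirects maps retired paths to their replacements, as listed in the
// commit message.
var legacyRedirects = map[string]string{
	"/validate":  "/load",
	"/burn-in":   "/burn",
	"/speed":     "/benchmark",
	"/endurance": "/benchmark",
}

// legacyTarget reports where a retired path now points, if anywhere.
func legacyTarget(path string) (string, bool) {
	to, ok := legacyRedirects[path]
	return to, ok
}

// registerLegacyRedirects installs one redirect handler per retired path.
// http.StatusFound (302) is an assumption of this sketch.
func registerLegacyRedirects(mux *http.ServeMux) {
	for from, to := range legacyRedirects {
		mux.Handle(from, http.RedirectHandler(to, http.StatusFound))
	}
}

func main() {
	mux := http.NewServeMux()
	registerLegacyRedirects(mux)
	for _, p := range []string{"/validate", "/burn-in", "/speed", "/endurance"} {
		to, _ := legacyTarget(p)
		fmt.Println(p, "->", to)
	}
}
```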
Mikhail Chusavitin
5b5d8609d3 Refactor nav: remove numbers from Tools/Settings, add separator and Tasks item
- Remove "6." / "7." prefixes from Tools and Settings nav labels and page titles
- Add a horizontal separator (nav-sep) before the Tools/Settings group
- Move Tasks into the nav as a regular nav-item after the separator,
  replacing the separate tasks-nav-btn at the sidebar bottom
- Tasks item retains the active-count badge (tasks-nav-count)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-18 17:54:54 +03:00
Mikhail Chusavitin
e7442972d1 Move session-scoped LiveCD tools from Tools to Settings
Tools page now contains only NVMe Block Format and Supermicro - DMI.

Moved to Settings (7):
- System Install (Install to RAM + Install to Disk)
- Support Bundle + USB Black-Box
- Tool Check
- NVIDIA Self Heal (replaces simple NVIDIA Recovery card)
- Network
- Services

Update TestToolsPageRendersNvidiaSelfHealSection to assert the moved
cards on /settings instead of /tools.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-18 17:52:19 +03:00
26 changed files with 4188 additions and 284 deletions

@@ -0,0 +1,293 @@
package webui
import (
"context"
"encoding/json"
"fmt"
"net/http"
"os"
"os/exec"
"path/filepath"
"strings"
"time"
"unicode"
)
type fruField struct {
Name string `json:"name"`
Value string `json:"value"`
Editable bool `json:"editable"`
Area string `json:"area,omitempty"`
Index int `json:"index"` // no omitempty: index 0 is a valid field index and must reach the client
}
type fruChange struct {
Area string `json:"area"`
Index int `json:"index"`
Name string `json:"name"`
Value string `json:"value"`
}
// fruEditableFields maps display name → area + index for ipmitool fru edit.
var fruEditableFields = map[string]struct {
Area string
Index int
}{
"Chassis Part Number": {"c", 0},
"Chassis Serial Number": {"c", 1},
"Chassis Extra": {"c", 2},
"Board Manufacturer": {"b", 0},
"Board Product Name": {"b", 1},
"Board Serial Number": {"b", 2},
"Board Part Number": {"b", 3},
"Product Manufacturer": {"p", 0},
"Product Name": {"p", 1},
"Product Part Number": {"p", 2},
"Product Version": {"p", 3},
"Product Serial Number": {"p", 4},
}
func parseFRUOutput(output string) []fruField {
var fields []fruField
for _, line := range strings.Split(output, "\n") {
// Lines look like: " Field Name : value"
trimmed := strings.TrimLeft(line, " \t")
if trimmed == "" {
continue
}
// Prefer the " : " separator; fall back to ": " with no space before the colon.
sepLen := 3
colon := strings.Index(trimmed, " : ")
if colon < 0 {
colon = strings.Index(trimmed, ": ")
sepLen = 2
}
if colon < 0 {
continue
}
name := strings.TrimSpace(trimmed[:colon])
value := strings.TrimSpace(trimmed[colon+sepLen:])
if name == "" {
continue
}
editable, area, idx := fruFieldMeta(name)
fields = append(fields, fruField{Name: name, Value: value, Editable: editable, Area: area, Index: idx})
}
return fields
}
func fruFieldMeta(name string) (editable bool, area string, index int) {
if e, ok := fruEditableFields[name]; ok {
return true, e.Area, e.Index
}
return false, "", 0
}
func (h *handler) handleAPIIPMIFRURead(w http.ResponseWriter, r *http.Request) {
ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
defer cancel()
out, err := exec.CommandContext(ctx, "ipmitool", "fru", "print", "0").CombinedOutput()
if err != nil {
msg := strings.TrimSpace(string(out))
if msg == "" {
msg = err.Error()
}
writeError(w, http.StatusInternalServerError, "ipmitool fru print: "+msg)
return
}
fields := parseFRUOutput(string(out))
writeJSON(w, fields)
}
func (h *handler) handleAPIIPMIFRUWrite(w http.ResponseWriter, r *http.Request) {
var req struct {
Changes []fruChange `json:"changes"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
writeError(w, http.StatusBadRequest, "invalid JSON")
return
}
if len(req.Changes) == 0 {
writeError(w, http.StatusUnprocessableEntity, "no changes provided")
return
}
validAreas := map[string]bool{"c": true, "b": true, "p": true}
for _, c := range req.Changes {
if !validAreas[c.Area] {
writeError(w, http.StatusUnprocessableEntity, "invalid area: "+c.Area)
return
}
if c.Index < 0 || c.Index > 9 {
writeError(w, http.StatusUnprocessableEntity, fmt.Sprintf("invalid index %d", c.Index))
return
}
if len(c.Value) > 64 {
writeError(w, http.StatusUnprocessableEntity, "value too long (max 64 chars)")
return
}
for _, ch := range c.Value {
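// Reject non-ASCII and every control character, including NUL and DEL:
// a NUL byte cannot be passed to ipmitool as a command-line argument.
if ch > unicode.MaxASCII || ch < 0x20 || ch == 0x7f {
writeError(w, http.StatusUnprocessableEntity, "value contains non-printable or non-ASCII characters")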
return
}
}
}
t := &Task{
ID: newJobID("ipmi-fru-write"),
Name: fmt.Sprintf("IPMI FRU Write (%d field(s))", len(req.Changes)),
Target: "ipmi-fru-write",
Priority: defaultTaskPriority("ipmi-fru-write", taskParams{}),
Status: TaskPending,
CreatedAt: time.Now(),
params: taskParams{FRUChanges: req.Changes},
}
globalQueue.enqueue(t)
writeJSON(w, map[string]string{"task_id": t.ID})
}
func runIPMIFRUWriteTask(ctx context.Context, j *jobState, exportDir string, p taskParams) error {
// Backup current FRU state
backupDir := filepath.Join(exportDir, "fru-backups")
if err := os.MkdirAll(backupDir, 0755); err != nil {
return fmt.Errorf("mkdir fru-backups: %w", err)
}
stamp := time.Now().Format("20060102150405")
backupPath := filepath.Join(backupDir, "fru-"+stamp+".txt")
backupOut, err := exec.CommandContext(ctx, "ipmitool", "fru", "print", "0").CombinedOutput()
if err != nil {
return fmt.Errorf("backup fru print: %w", err)
}
if err := os.WriteFile(backupPath, backupOut, 0644); err != nil {
return fmt.Errorf("write backup: %w", err)
}
j.append("Backup saved to " + backupPath)
// Apply changes
for _, c := range p.FRUChanges {
j.append(fmt.Sprintf("Setting %s (%s %d) = %q", c.Name, c.Area, c.Index, c.Value))
cmd := exec.CommandContext(ctx, "ipmitool", "fru", "edit", "0", "field", c.Area, fmt.Sprintf("%d", c.Index), c.Value)
if err := streamCmdJob(j, cmd); err != nil {
return fmt.Errorf("fru edit %s %d: %w", c.Area, c.Index, err)
}
}
return nil
}
func renderIPMIFRUCard() string {
return `<div class="card"><div class="card-head card-head-actions">IPMI &#8212; FRU<div class="card-head-buttons"><button class="btn btn-sm btn-secondary" onclick="fruRead()">Read</button></div></div><div class="card-body">
<p style="font-size:13px;color:var(--muted);margin-bottom:12px">Reads and edits FRU fields via ipmitool (In-Band, device 0). Works on any server with IPMI support.</p>
<div id="fru-status" style="font-size:13px;color:var(--muted);margin-bottom:8px"></div>
<div id="fru-table"></div>
<div id="fru-save-row" style="display:none;margin-top:12px">
<button class="btn btn-primary" id="fru-save-btn" onclick="fruSave()">Save</button>
<span id="fru-save-msg" style="font-size:13px;color:var(--muted);margin-left:10px"></span>
</div>
</div></div>
<script>
var fruOriginal = {};
function fruRead() {
document.getElementById('fru-status').textContent = 'Reading...';
document.getElementById('fru-table').innerHTML = '';
document.getElementById('fru-save-row').style.display = 'none';
fetch('/api/tools/ipmi-fru', {cache:'no-store'})
.then(function(r) {
if (!r.ok) return r.json().then(function(e) { throw new Error(e.error || r.statusText); });
return r.json();
})
.then(function(fields) {
fruOriginal = {};
if (!fields || !fields.length) {
document.getElementById('fru-status').textContent = 'No FRU fields returned.';
return;
}
document.getElementById('fru-status').textContent = '';
var rows = fields.map(function(f) {
var val = f.value || '';
if (f.editable) {
fruOriginal[f.area + '_' + f.index] = val;
return '<tr><td style="color:var(--muted);white-space:nowrap;padding-right:16px">' + escHtml(f.name) + '</td>'
+ '<td><input class="fru-input" style="width:100%;padding:4px 6px;border:1px solid var(--border);border-radius:3px;font-size:13px;font-family:inherit;background:var(--surface);color:var(--ink)"'
+ ' data-area="' + escHtml(f.area) + '" data-index="' + f.index + '" data-name="' + escHtml(f.name) + '"'
+ ' data-original="' + escHtml(val) + '" value="' + escHtml(val) + '" oninput="fruDirtyCheck()"></td></tr>';
}
return '<tr><td style="color:var(--muted);white-space:nowrap;padding-right:16px">' + escHtml(f.name) + '</td>'
+ '<td style="color:var(--ink)">' + escHtml(val || '—') + '</td></tr>';
}).join('');
document.getElementById('fru-table').innerHTML = '<table style="width:100%">' + rows + '</table>';
fruDirtyCheck();
})
.catch(function(e) {
document.getElementById('fru-status').textContent = 'Error: ' + e.message;
document.getElementById('fru-status').style.color = 'var(--crit-fg)';
});
}
function escHtml(s) {
return String(s).replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;').replace(/"/g,'&quot;');
}
function fruDirtyCheck() {
var inputs = document.querySelectorAll('.fru-input');
var changed = 0;
inputs.forEach(function(el) { if (el.value !== el.dataset.original) changed++; });
var row = document.getElementById('fru-save-row');
var btn = document.getElementById('fru-save-btn');
if (changed > 0) {
row.style.display = '';
btn.textContent = 'Save (' + changed + ' changed)';
} else {
row.style.display = 'none';
}
}
function fruSave() {
var inputs = document.querySelectorAll('.fru-input');
var changes = [];
inputs.forEach(function(el) {
if (el.value !== el.dataset.original) {
changes.push({area: el.dataset.area, index: parseInt(el.dataset.index, 10), name: el.dataset.name, value: el.value});
}
});
if (!changes.length) return;
document.getElementById('fru-save-btn').disabled = true;
document.getElementById('fru-save-msg').textContent = 'Saving...';
fetch('/api/tools/ipmi-fru/write', {method:'POST', headers:{'Content-Type':'application/json'}, body: JSON.stringify({changes: changes})})
.then(function(r) {
if (!r.ok) return r.json().then(function(e) { throw new Error(e.error || r.statusText); });
return r.json();
})
.then(function(d) {
var taskId = d.task_id;
document.getElementById('fru-save-msg').textContent = 'Task ' + taskId + ' queued…';
var poll = setInterval(function() {
fetch('/api/tasks', {cache:'no-store'}).then(function(r) { return r.json(); }).then(function(tasks) {
var t = Array.isArray(tasks) ? tasks.find(function(x) { return x.id === taskId; }) : null;
if (!t) return;
if (t.status === 'done') {
clearInterval(poll);
document.getElementById('fru-save-msg').textContent = 'Done — backup saved to fru-backups/.';
document.getElementById('fru-save-btn').disabled = false;
inputs.forEach(function(el) { el.dataset.original = el.value; });
fruDirtyCheck();
} else if (t.status === 'failed') {
clearInterval(poll);
document.getElementById('fru-save-msg').textContent = 'Failed: ' + (t.error || 'unknown error');
document.getElementById('fru-save-btn').disabled = false;
}
}).catch(function() { /* transient poll error: retry on the next tick */ });
}, 1500);
})
.catch(function(e) {
document.getElementById('fru-save-msg').textContent = 'Error: ' + e.message;
document.getElementById('fru-save-btn').disabled = false;
});
}
</script>`
}
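The per-value checks in the write handler could be isolated into a pure function so they can be unit-tested without an HTTP request. The following is a minimal sketch under stated assumptions: `validateFRUValue` and `maxFRUValueLen` are hypothetical names, the 64-byte limit comes from the handler, and this sketch rejects every byte outside printable ASCII (0x20 to 0x7e).

```go
package main

import (
	"errors"
	"fmt"
)

// maxFRUValueLen mirrors the 64-character limit enforced by the write handler.
const maxFRUValueLen = 64

// validateFRUValue returns nil when v is at most maxFRUValueLen bytes long
// and consists only of printable ASCII. Checking byte by byte also rejects
// multi-byte UTF-8 sequences, because all of their bytes are >= 0x80.
func validateFRUValue(v string) error {
	if len(v) > maxFRUValueLen {
		return errors.New("value too long (max 64 chars)")
	}
	for i := 0; i < len(v); i++ {
		if c := v[i]; c < 0x20 || c > 0x7e {
			return errors.New("value contains non-printable or non-ASCII characters")
		}
	}
	return nil
}

func main() {
	for _, v := range []string{"SN-12345", "bad\x00value", "caf\u00e9"} {
		fmt.Printf("%q -> %v\n", v, validateFRUValue(v))
	}
}
```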

@@ -68,10 +68,9 @@ tbody tr:hover td{background:rgba(0,0,0,.03)}
.chip-warn{background:var(--warn-bg);color:var(--warn-fg);border:1px solid #c9ba9b}
.chip-fail{background:var(--crit-bg);color:var(--crit-fg);border:1px solid var(--crit-border)}
.chip-unknown{background:var(--surface-2);color:var(--muted);border:1px solid var(--border)}
/* Tasks nav badge */
.tasks-nav-btn{display:flex;justify-content:space-between;align-items:center;padding:10px 16px;color:rgba(255,255,255,.55);font-size:12px;text-decoration:none;border-top:1px solid rgba(255,255,255,.12);margin-top:auto;transition:color .15s}
.tasks-nav-btn:hover{color:#fff}
.tasks-nav-count{background:var(--accent);color:#fff;border-radius:10px;padding:1px 7px;font-size:11px;font-weight:700;display:none}
/* Nav separator and tasks count badge */
.nav-sep{height:1px;background:rgba(255,255,255,.12);margin:6px 0}
.tasks-nav-count{background:var(--accent);color:#fff;border-radius:10px;padding:1px 7px;font-size:11px;font-weight:700;display:none;margin-left:auto}
.tasks-nav-count.active{display:inline}
/* Output terminal */
.terminal{background:#1b1c1d;border:1px solid rgba(0,0,0,.2);border-radius:4px;padding:14px;font-family:monospace;font-size:12px;color:#b5cea8;max-height:400px;overflow-y:auto;white-space:pre-wrap;word-break:break-all;user-select:text;-webkit-user-select:text}
@@ -98,15 +97,21 @@ tbody tr:hover td{background:rgba(0,0,0,.03)}
}
func layoutNav(active string, buildLabel string) string {
items := []struct{ id, label, href string }{
{"dashboard", "Dashboard", "/"},
{"audit", "1. Audit", "/audit"},
{"check", "2. Check", "/check"},
{"load", "3. Load", "/load"},
{"speed", "4. Speed", "/speed"},
{"endurance", "5. Endurance", "/endurance"},
{"tools", "6. Tools", "/tools"},
{"settings", "7. Settings", "/settings"},
type navItem struct {
id, label, href string
sep bool
}
items := []navItem{
{id: "dashboard", label: "Dashboard", href: "/"},
{id: "audit", label: "1. Audit", href: "/audit"},
{id: "check", label: "2. Check", href: "/check"},
{id: "load", label: "3. Load", href: "/load"},
{id: "burn", label: "4. Burn", href: "/burn"},
{id: "benchmark", label: "5. Benchmark", href: "/benchmark"},
{sep: true},
{id: "tasks", label: "Tasks", href: "/tasks"},
{id: "tools", label: "Tools", href: "/tools"},
{id: "settings", label: "Settings", href: "/settings"},
}
var b strings.Builder
b.WriteString(`<aside class="sidebar">`)
@@ -126,19 +131,23 @@ func layoutNav(active string, buildLabel string) string {
}
b.WriteString(`<nav class="nav">`)
for _, item := range items {
if item.sep {
b.WriteString(`<div class="nav-sep"></div>`)
continue
}
cls := "nav-item"
if item.id == active {
cls += " active"
}
b.WriteString(fmt.Sprintf(`<a class="%s" href="%s">%s</a>`, cls, item.href, item.label))
if item.id == "tasks" {
b.WriteString(fmt.Sprintf(`<a class="%s" href="%s" id="tasks-nav-item">%s<span class="tasks-nav-count" id="tasks-nav-count"></span></a>`, cls, item.href, item.label))
} else {
b.WriteString(fmt.Sprintf(`<a class="%s" href="%s">%s</a>`, cls, item.href, item.label))
}
}
b.WriteString(`</nav>`)
b.WriteString(`<a href="/tasks" class="tasks-nav-btn" id="tasks-nav-btn">`)
b.WriteString(`<span>Tasks</span>`)
b.WriteString(`<span class="tasks-nav-count" id="tasks-nav-count"></span>`)
b.WriteString(`</a>`)
b.WriteString(`<script>`)
b.WriteString(`(function(){function u(){fetch('/api/tasks',{cache:'no-store'}).then(function(r){return r.json();}).then(function(d){var n=Array.isArray(d)?d.filter(function(t){return t.status==='pending'||t.status==='running';}).length:0;var c=document.getElementById('tasks-nav-count');var b=document.getElementById('tasks-nav-btn');if(c){c.textContent=n>0?String(n):'';c.className='tasks-nav-count'+(n>0?' active':'');}if(b){b.style.color=n>0?'#f6c90e':'';}}).catch(function(){});}u();setInterval(u,5000);})();`)
b.WriteString(`(function(){function u(){fetch('/api/tasks',{cache:'no-store'}).then(function(r){return r.json();}).then(function(d){var n=Array.isArray(d)?d.filter(function(t){return t.status==='pending'||t.status==='running';}).length:0;var c=document.getElementById('tasks-nav-count');var el=document.getElementById('tasks-nav-item');if(c){c.textContent=n>0?String(n):'';c.className='tasks-nav-count'+(n>0?' active':'');}if(el){el.style.color=n>0?'#f6c90e':'';}}).catch(function(){});}u();setInterval(u,5000);})();`)
b.WriteString(`</script>`)
b.WriteString(`</aside>`)
return b.String()

@@ -612,19 +612,6 @@ func renderPowerBenchmarkResultsCard(exportDir string) string {
return b.String()
}
// renderSpeed renders the Speed page (step 4): performance benchmarks.
// Uses the same benchmark infrastructure; defaults to Standard profile (throughput/bandwidth).
// For long-duration stability/overnight runs, see Endurance (step 5).
func renderSpeed(opts HandlerOptions) string {
base := renderBenchmark(opts)
return `<div class="alert alert-info" style="margin-bottom:16px"><strong>Speed:</strong> Measures GPU compute throughput and memory bandwidth. For overnight stability testing, go to <a href="/endurance">5. Endurance</a>.</div>` + base
}
// renderEndurance renders the Endurance page (step 5): long-duration reliability tests.
// Focuses on Stability and Overnight profiles for multi-hour burn validation.
// For short load tests, see Load (step 3). For throughput measurement, see Speed (step 4).
func renderEndurance(opts HandlerOptions) string {
base := renderBenchmark(opts)
return `<div class="alert alert-warn" style="margin-bottom:16px"><strong>Endurance:</strong> Long-duration reliability tests — Stability (several hours) and Overnight (8+ h) profiles. These profiles run hardware at sustained load; results show whether the server holds its performance envelope over time.</div>
<div class="alert alert-info" style="margin-bottom:16px">Use the <strong>Stability</strong> or <strong>Overnight</strong> profile in the setup card below. The Standard profile is available too but is better suited for the <a href="/speed">4. Speed</a> page.</div>` + base
}
// renderSpeed and renderEndurance are legacy wrappers; canonical page is 5. Benchmark at /benchmark.
func renderSpeed(opts HandlerOptions) string { return renderBenchmark(opts) }
func renderEndurance(opts HandlerOptions) string { return renderBenchmark(opts) }

@@ -1,13 +1,8 @@
package webui
// renderLoad renders the Load page (step 3): sustained stress tests.
// For non-destructive status checks, see Check (step 2).
// For DCGM targeted diagnostics (targeted_stress, targeted_power, pulse), see Check → Validate mode.
func renderLoad() string { return renderBurn() }
func renderBurn() string {
return `<div class="alert alert-warn" style="margin-bottom:16px"><strong>&#9888; Warning:</strong> Stress tests on this page run hardware at high load. Repeated or prolonged use may reduce hardware lifespan. Use only when necessary.</div>
<div class="alert alert-info" style="margin-bottom:16px"><strong>Scope:</strong> Load runs sustained GPU compute and CPU/memory stress recipes. DCGM diagnostics (<code>targeted_stress</code>, <code>targeted_power</code>, <code>pulse_test</code>) and NCCL/NVBandwidth are on the <a href="/check">2. Check</a> page. For overnight endurance runs, see <a href="/endurance">5. Endurance</a>.</div>
<div class="alert alert-info" style="margin-bottom:16px"><strong>Scope:</strong> Burn runs sustained GPU compute and CPU/memory stress recipes. DCGM targeted diagnostics (<code>targeted_stress</code>, <code>targeted_power</code>, <code>pulse_test</code>) and NCCL/NVBandwidth are on the <a href="/load">3. Load</a> page. For performance benchmarks, see <a href="/benchmark">5. Benchmark</a>.</div>
<p style="color:var(--muted);font-size:13px;margin-bottom:16px">Tasks continue in the background — view progress in <a href="/tasks">Tasks</a>.</p>
<div class="card" style="margin-bottom:16px">

@@ -402,96 +402,11 @@ loadNvidiaSelfHeal();
}
func renderTools() string {
return `<div class="card" style="margin-bottom:16px">
<div class="card-head">System Install</div>
<div class="card-body">
<div style="margin-bottom:20px">
<div style="font-weight:600;margin-bottom:8px">Install to RAM</div>
<p id="boot-source-text" style="color:var(--muted);font-size:13px;margin-bottom:8px">Detecting boot source...</p>
<p id="ram-status-text" style="color:var(--muted);font-size:13px;margin-bottom:8px">Checking...</p>
<button id="ram-install-btn" class="btn btn-primary" onclick="installToRAM()" style="display:none">&#9654; Copy to RAM</button>
</div>
<div style="border-top:1px solid var(--line);padding-top:20px">
<div style="font-weight:600;margin-bottom:8px">Install to Disk</div>` +
renderInstallInline() + `
</div>
</div>
</div>
<script>
fetch('/api/system/ram-status').then(r=>r.json()).then(d=>{
const boot = document.getElementById('boot-source-text');
const txt = document.getElementById('ram-status-text');
const btn = document.getElementById('ram-install-btn');
let source = d.device || d.source || 'unknown source';
let kind = d.kind || 'unknown';
let label = source;
if (kind === 'ram') label = 'RAM';
else if (kind === 'usb') label = 'USB (' + source + ')';
else if (kind === 'cdrom') label = 'CD-ROM (' + source + ')';
else if (kind === 'disk') label = 'disk (' + source + ')';
else label = source;
boot.textContent = 'Current boot source: ' + label + '.';
txt.textContent = d.blocked_reason || d.message || 'Checking...';
if (d.status === 'ok' || d.in_ram) {
txt.style.color = 'var(--ok, green)';
} else if (d.status === 'failed') {
txt.style.color = 'var(--err, #b91c1c)';
} else {
txt.style.color = 'var(--muted)';
}
if (d.can_start_task) {
btn.style.display = '';
btn.disabled = false;
} else {
btn.style.display = 'none';
}
});
function installToRAM() {
document.getElementById('ram-install-btn').disabled = true;
fetch('/api/system/install-to-ram', {method:'POST'}).then(r=>r.json()).then(d=>{
window.location.href = '/tasks#' + d.task_id;
});
}
</script>
<div class="card"><div class="card-head">Support Bundle</div><div class="card-body">
<p style="font-size:13px;color:var(--muted);margin-bottom:12px">Downloads a tar.gz archive of all audit files, SAT results, and logs.</p>
` + renderSupportBundleInline() + `
<div style="border-top:1px solid var(--border);margin-top:16px;padding-top:16px">
<div style="font-weight:600;margin-bottom:8px">USB Black-Box</div>
` + renderUSBExportInline() + `
</div>
</div></div>
<div class="card"><div class="card-head">Tool Check <button class="btn btn-sm btn-secondary" onclick="checkTools()" style="margin-left:auto">&#8635; Check</button></div>
<div class="card-body"><div id="tools-table"><p style="color:var(--muted);font-size:13px">Checking...</p></div></div></div>
<div class="card"><div class="card-head">NVIDIA Self Heal</div><div class="card-body">` +
renderNvidiaSelfHealInline() + `</div></div>
<div class="card"><div class="card-head">Network</div><div class="card-body">` +
renderNetworkInline() + `</div></div>
<div class="card"><div class="card-head">Services</div><div class="card-body">` +
renderServicesInline() + `</div></div>
` + renderNVMeFormatCard() + `
return renderNVMeFormatCard() + `
` + renderSAADMICard() + `
<script>
function checkTools() {
document.getElementById('tools-table').innerHTML = '<p style="color:var(--muted);font-size:13px">Checking...</p>';
fetch('/api/tools/check').then(r=>r.json()).then(tools => {
const rows = tools.map(t =>
'<tr><td>'+t.Name+'</td><td><span class="badge '+(t.OK ? 'badge-ok' : 'badge-err')+'">'+(t.OK ? '&#10003; '+t.Path : '&#10007; missing')+'</span></td></tr>'
).join('');
document.getElementById('tools-table').innerHTML =
'<table><tr><th>Tool</th><th>Status</th></tr>'+rows+'</table>';
});
}
checkTools();
</script>`
` + renderIPMIFRUCard()
}
func renderExportIndex(exportDir string) (string, error) {

@@ -7,12 +7,79 @@ func renderSettings(opts HandlerOptions) string {
if version == "" {
version = "dev"
}
return `<div class="grid2">
return `<div class="card" style="margin-bottom:16px">
<div class="card-head">System Install</div>
<div class="card-body">
<div style="margin-bottom:20px">
<div style="font-weight:600;margin-bottom:8px">Install to RAM</div>
<p id="boot-source-text" style="color:var(--muted);font-size:13px;margin-bottom:8px">Detecting boot source...</p>
<p id="ram-status-text" style="color:var(--muted);font-size:13px;margin-bottom:8px">Checking...</p>
<button id="ram-install-btn" class="btn btn-primary" onclick="installToRAM()" style="display:none">&#9654; Copy to RAM</button>
</div>
<div style="border-top:1px solid var(--line);padding-top:20px">
<div style="font-weight:600;margin-bottom:8px">Install to Disk</div>` +
renderInstallInline() + `
</div>
</div>
</div>
<script>
fetch('/api/system/ram-status').then(r=>r.json()).then(d=>{
const boot = document.getElementById('boot-source-text');
const txt = document.getElementById('ram-status-text');
const btn = document.getElementById('ram-install-btn');
let kind = d.kind || 'unknown';
let source = d.device || d.source || 'unknown source';
let label = kind==='ram'?'RAM':kind==='usb'?'USB ('+source+')':kind==='cdrom'?'CD-ROM ('+source+')':kind==='disk'?'disk ('+source+')':source;
boot.textContent = 'Current boot source: ' + label + '.';
txt.textContent = d.blocked_reason || d.message || 'Checking...';
txt.style.color = (d.status==='ok'||d.in_ram)?'var(--ok,green)':d.status==='failed'?'var(--err,#b91c1c)':'var(--muted)';
if (d.can_start_task) { btn.style.display=''; btn.disabled=false; } else { btn.style.display='none'; }
});
function installToRAM() {
document.getElementById('ram-install-btn').disabled = true;
fetch('/api/system/install-to-ram', {method:'POST'}).then(r=>r.json()).then(d=>{
window.location.href = '/tasks#' + d.task_id;
});
}
</script>
<div class="card"><div class="card-head">Support Bundle</div><div class="card-body">
<p style="font-size:13px;color:var(--muted);margin-bottom:12px">Downloads a tar.gz archive of all audit files, SAT results, and logs.</p>
` + renderSupportBundleInline() + `
<div style="border-top:1px solid var(--border);margin-top:16px;padding-top:16px">
<div style="font-weight:600;margin-bottom:8px">USB Black-Box</div>
` + renderUSBExportInline() + `
</div>
</div></div>
<div class="card"><div class="card-head">Tool Check <button class="btn btn-sm btn-secondary" onclick="checkTools()" style="margin-left:auto">&#8635; Check</button></div>
<div class="card-body"><div id="tools-table"><p style="color:var(--muted);font-size:13px">Checking...</p></div></div></div>
<script>
function checkTools() {
document.getElementById('tools-table').innerHTML = '<p style="color:var(--muted);font-size:13px">Checking...</p>';
fetch('/api/tools/check').then(r=>r.json()).then(tools => {
const rows = tools.map(t =>
'<tr><td>'+t.Name+'</td><td><span class="badge '+(t.OK?'badge-ok':'badge-err')+'">'+(t.OK?'&#10003; '+t.Path:'&#10007; missing')+'</span></td></tr>'
).join('');
document.getElementById('tools-table').innerHTML = '<table><tr><th>Tool</th><th>Status</th></tr>'+rows+'</table>';
});
}
checkTools();
</script>
<div class="card"><div class="card-head">NVIDIA Self Heal</div><div class="card-body">` +
renderNvidiaSelfHealInline() + `</div></div>
<div class="card"><div class="card-head">Network</div><div class="card-body">` +
renderNetworkInline() + `</div></div>
<div class="card"><div class="card-head">Services</div><div class="card-body">` +
renderServicesInline() + `</div></div>
<div class="card">
<div class="card-head">Blackbox Logging</div>
<div class="card-body">
<p style="font-size:13px;color:var(--muted);margin-bottom:14px">Continuous hardware monitoring that writes a rolling log of sensor readings to the export directory. Useful for capturing thermal or power anomalies during long runs.</p>
<p style="font-size:13px;color:var(--muted);margin-bottom:14px">Continuous hardware monitoring that writes a rolling log of sensor readings to the export directory.</p>
<div style="display:flex;gap:8px;align-items:center">
<button class="btn btn-primary btn-sm" onclick="blackboxToggle('enable')">Enable</button>
<button class="btn btn-secondary btn-sm" onclick="blackboxToggle('disable')">Disable</button>
@@ -22,19 +89,6 @@ func renderSettings(opts HandlerOptions) string {
</div>
<div class="card">
<div class="card-head">NVIDIA Recovery</div>
<div class="card-body">
<p style="font-size:13px;color:var(--muted);margin-bottom:14px">Reset NVIDIA GPU driver state. Use when <code>nvidia-smi</code> reports errors or GPUs appear stuck after a failed test.</p>
<div style="display:flex;gap:8px;align-items:center">
<button class="btn btn-danger btn-sm" onclick="nvidiaReset()">Reset NVIDIA Driver</button>
<span id="nvidia-reset-status" style="font-size:12px;color:var(--muted)"></span>
</div>
</div>
</div>
</div>
<div class="card" style="margin-top:0">
<div class="card-head">Build Info</div>
<div class="card-body">
<table style="width:auto">
@@ -64,14 +118,5 @@ function blackboxToggle(action) {
.then(d => { if (el) el.textContent = d.enabled ? 'Enabled' : 'Disabled'; })
.catch(err => { if (el) el.textContent = 'Error: ' + err.message; });
}
function nvidiaReset() {
var el = document.getElementById('nvidia-reset-status');
if (!confirm('Reset NVIDIA driver? This will interrupt any running GPU tasks.')) return;
if (el) el.textContent = 'Resetting...';
fetch('/api/gpu/nvidia-reset', {method:'POST', cache:'no-store'})
.then(r => r.json())
.then(d => { if (el) el.textContent = d.error ? ('Error: ' + d.error) : 'Done — driver reset.'; })
.catch(err => { if (el) el.textContent = 'Error: ' + err.message; });
}
</script>`
}

@@ -68,6 +68,14 @@ func validateTotalStressSec(n int) int {
}
func renderValidate(opts HandlerOptions) string {
return renderValidateMode(opts, false)
}
func renderValidateStress(opts HandlerOptions) string {
return renderValidateMode(opts, true)
}
func renderValidateMode(opts HandlerOptions, stressDefault bool) string {
inv := loadValidateInventory(opts)
n := inv.NvidiaGPUCount
validateTotalStr := validateFmtDur(validateTotalValidateSec(n))
@@ -76,26 +84,49 @@ func renderValidate(opts HandlerOptions) string {
if n > 0 {
gpuNote = fmt.Sprintf(" (%d GPU)", n)
}
return `<div class="alert alert-info" style="margin-bottom:16px"><strong>Non-destructive:</strong> Validate tests collect diagnostics only. They do not write to disks, do not run sustained load, and do not increment hardware wear counters.</div>
<p style="color:var(--muted);font-size:13px;margin-bottom:16px">Tasks continue in the background — view progress in <a href="/tasks">Tasks</a>.</p>
estStr := validateTotalStr
if stressDefault {
estStr = stressTotalStr
}
alert := `<div class="alert alert-info" style="margin-bottom:16px"><strong>Non-destructive:</strong> Validate tests collect diagnostics only. They do not write to disks, do not run sustained load, and do not increment hardware wear counters.</div>`
if stressDefault {
alert = `<div class="alert alert-warn" style="margin-bottom:16px"><strong>&#9888; Stress mode:</strong> Runs extended load tests — CPU stress-ng, memory passes, DCGM targeted diagnostics. Higher wear than Validate.</div>`
}
<div class="card" style="margin-bottom:16px">
<div class="card-head">Validate Profile</div>
<div class="card-body validate-profile-body">
<div class="validate-profile-col">
<div class="form-row" style="margin:12px 0 0"><label>Mode</label></div>
<label class="cb-row"><input type="radio" name="sat-mode" id="sat-mode-validate" value="validate" checked onchange="satModeChanged()"><span>Validate — quick non-destructive check</span></label>
<label class="cb-row"><input type="radio" name="sat-mode" id="sat-mode-stress" value="stress" onchange="satModeChanged()"><span>Stress — thorough load test (` + stressTotalStr + gpuNote + `)</span></label>
</div>
<div class="validate-profile-col validate-profile-action">
<p style="color:var(--muted);font-size:12px;margin:0 0 10px">Runs validate modules sequentially. Validate: ` + validateTotalStr + gpuNote + `; Stress: ` + stressTotalStr + gpuNote + `. Estimates are based on real log data and scale with GPU count.</p>
<button type="button" class="btn btn-primary" onclick="runAllSAT()">Validate one by one</button>
<div style="margin-top:12px">
<span id="sat-all-status" style="font-size:12px;color:var(--muted)"></span>
</div>
</div>
</div>
</div>
stressOnlyCards := ""
if stressDefault {
stressOnlyCards = renderSATCard("nvidia-targeted-stress", "NVIDIA GPU Targeted Stress", "runNvidiaValidateSet('nvidia-targeted-stress')", "", renderValidateCardBody(
inv.NVIDIA,
`Runs a controlled NVIDIA DCGM load to check stability under moderate stress.`,
`<code>dcgmi diag targeted_stress</code>`,
validateFmtDur(platform.SATEstimatedNvidiaTargetedStressSec)+` (all GPUs simultaneously).`,
)) +
renderSATCard("nvidia-targeted-power", "NVIDIA Targeted Power", "runNvidiaValidateSet('nvidia-targeted-power')", "", renderValidateCardBody(
inv.NVIDIA,
`Checks that the GPU can sustain its declared power delivery envelope. Pass/fail determined by DCGM.`,
`<code>dcgmi diag targeted_power</code>`,
validateFmtDur(platform.SATEstimatedNvidiaTargetedPowerSec)+` (all GPUs simultaneously).`,
)) +
renderSATCard("nvidia-pulse", "NVIDIA PSU Pulse Test", "runNvidiaFabricValidate('nvidia-pulse')", "", renderValidateCardBody(
inv.NVIDIA,
`Tests power supply transient response by pulsing all GPUs simultaneously between idle and full load. Synchronous pulses across all GPUs create worst-case PSU load spikes — running per-GPU would miss PSU-level failures.`,
`<code>dcgmi diag pulse_test</code>`,
validateFmtDur(platform.SATEstimatedNvidiaPulseTestSec)+` (all GPUs simultaneously; measured on 8-GPU system).`,
))
}
satStressModeJS := "function satStressMode() { return false; }"
if stressDefault {
satStressModeJS = "function satStressMode() { return true; }"
}
return alert + `
<p style="color:var(--muted);font-size:13px;margin-bottom:16px">Tasks continue in the background — view progress in <a href="/tasks">Tasks</a>.</p>
<div style="display:flex;align-items:center;gap:12px;margin-bottom:16px">
<button type="button" class="btn btn-primary" onclick="runAllSAT()">Run All</button>
<span id="sat-all-status" style="font-size:12px;color:var(--muted)"></span>
<span style="font-size:12px;color:var(--muted)">est. ` + estStr + gpuNote + `</span>
</div>
<div class="grid3">
` + renderSATCard("cpu", "CPU", "runSAT('cpu')", "", renderValidateCardBody(
@@ -122,7 +153,7 @@ func renderValidate(opts HandlerOptions) string {
<div class="card-head">NVIDIA GPU Selection</div>
<div class="card-body">
<p style="font-size:12px;color:var(--muted);margin:0 0 8px">` + inv.NVIDIA + `</p>
<p style="font-size:12px;color:var(--muted);margin:0 0 10px">All NVIDIA validate tasks use only the GPUs selected here. The same selection is used by Validate one by one.</p>
<p style="font-size:12px;color:var(--muted);margin:0 0 10px">All NVIDIA validate tasks use only the GPUs selected here. The same selection is used by Run All.</p>
<div style="display:flex;gap:8px;flex-wrap:wrap;margin-bottom:8px">
<button class="btn btn-sm btn-secondary" type="button" onclick="satSelectAllGPUs()">Select All</button>
<button class="btn btn-sm btn-secondary" type="button" onclick="satSelectNoGPUs()">Clear</button>
@@ -143,46 +174,19 @@ func renderValidate(opts HandlerOptions) string {
validateFmtDur(platform.SATEstimatedNvidiaGPUValidateSec),
validateFmtDur(platform.SATEstimatedNvidiaGPUStressSec)),
)) +
`<div id="sat-card-nvidia-targeted-stress">` +
renderSATCard("nvidia-targeted-stress", "NVIDIA GPU Targeted Stress", "runNvidiaValidateSet('nvidia-targeted-stress')", "", renderValidateCardBody(
inv.NVIDIA,
`Runs a controlled NVIDIA DCGM load to check stability under moderate stress.`,
`<code>dcgmi diag targeted_stress</code>`,
"Skipped in Validate. Stress: " + validateFmtDur(platform.SATEstimatedNvidiaTargetedStressSec) + ` (all GPUs simultaneously).<p id="sat-ts-mode-hint" style="color:var(--warn-fg);font-size:12px;margin:8px 0 0">Only runs in Stress mode. Switch mode above to enable in Run All.</p>`,
)) +
`</div>` +
`<div id="sat-card-nvidia-targeted-power">` +
renderSATCard("nvidia-targeted-power", "NVIDIA Targeted Power", "runNvidiaValidateSet('nvidia-targeted-power')", "", renderValidateCardBody(
inv.NVIDIA,
`Checks that the GPU can sustain its declared power delivery envelope. Pass/fail determined by DCGM.`,
`<code>dcgmi diag targeted_power</code>`,
"Skipped in Validate. Stress: " + validateFmtDur(platform.SATEstimatedNvidiaTargetedPowerSec) + ` (all GPUs simultaneously).<p id="sat-tp-mode-hint" style="color:var(--warn-fg);font-size:12px;margin:8px 0 0">Only runs in Stress mode. Switch mode above to enable in Run All.</p>`,
)) +
`</div>` +
`<div id="sat-card-nvidia-pulse">` +
renderSATCard("nvidia-pulse", "NVIDIA PSU Pulse Test", "runNvidiaFabricValidate('nvidia-pulse')", "", renderValidateCardBody(
inv.NVIDIA,
`Tests power supply transient response by pulsing all GPUs simultaneously between idle and full load. Synchronous pulses across all GPUs create worst-case PSU load spikes — running per-GPU would miss PSU-level failures.`,
`<code>dcgmi diag pulse_test</code>`,
`Skipped in Validate. Stress: `+validateFmtDur(platform.SATEstimatedNvidiaPulseTestSec)+` (all GPUs simultaneously; measured on 8-GPU system).`+`<p id="sat-pt-mode-hint" style="color:var(--warn-fg);font-size:12px;margin:8px 0 0">Only runs in Stress mode. Switch mode above to enable in Run All.</p>`,
)) +
`</div>` +
`<div id="sat-card-nvidia-interconnect">` +
stressOnlyCards +
renderSATCard("nvidia-interconnect", "NVIDIA Interconnect (NCCL)", "runNvidiaFabricValidate('nvidia-interconnect')", "", renderValidateCardBody(
inv.NVIDIA,
`Verifies NVLink/NVSwitch fabric bandwidth using NCCL all_reduce_perf across all selected GPUs. Pass/fail based on achieved bandwidth vs. theoretical.`,
`<code>all_reduce_perf</code> (NCCL tests)`,
`Validate and Stress: `+validateFmtDur(platform.SATEstimatedNvidiaInterconnectSec)+` (all GPUs simultaneously, requires ≥2).`,
validateFmtDur(platform.SATEstimatedNvidiaInterconnectSec)+` (all GPUs simultaneously, requires ≥2).`,
)) +
`</div>` +
`<div id="sat-card-nvidia-bandwidth">` +
renderSATCard("nvidia-bandwidth", "NVIDIA Bandwidth (NVBandwidth)", "runNvidiaFabricValidate('nvidia-bandwidth')", "", renderValidateCardBody(
inv.NVIDIA,
`Validates GPU memory copy and peer-to-peer bandwidth paths using NVBandwidth.`,
`<code>nvbandwidth</code>`,
`Validate and Stress: `+validateFmtDur(platform.SATEstimatedNvidiaBandwidthSec)+` (all GPUs simultaneously; nvbandwidth runs all built-in tests without a time limit — duration set by the tool).`,
validateFmtDur(platform.SATEstimatedNvidiaBandwidthSec)+` (all GPUs simultaneously; nvbandwidth runs all built-in tests without a time limit — duration set by the tool).`,
)) +
`</div>` +
`</div>
<div class="grid3" style="margin-top:16px">
` + renderSATCard("amd", "AMD GPU", "runAMDValidateSet()", "", renderValidateCardBody(
@@ -197,36 +201,15 @@ func renderValidate(opts HandlerOptions) string {
<div class="card-body"><div id="sat-terminal" class="terminal"></div></div>
</div>
<style>
.validate-profile-body { display:grid; grid-template-columns:1fr 1fr 1fr; gap:24px; align-items:stretch; }
.validate-profile-col { min-width:0; display:flex; flex-direction:column; }
.validate-profile-action { display:flex; flex-direction:column; align-items:center; justify-content:center; }
.validate-card-body { padding:0; }
.validate-card-section { padding:12px 16px 0; }
.validate-card-section:last-child { padding-bottom:16px; }
.sat-gpu-row { display:flex; align-items:flex-start; gap:8px; padding:6px 0; cursor:pointer; font-size:13px; }
.sat-gpu-row input[type=checkbox] { width:16px; height:16px; margin-top:2px; flex-shrink:0; }
@media(max-width:900px){ .validate-profile-body { grid-template-columns:1fr; } }
</style>
<script>
let satES = null;
function satStressMode() {
return document.querySelector('input[name="sat-mode"]:checked')?.value === 'stress';
}
function satModeChanged() {
const stress = satStressMode();
[
{card: 'sat-card-nvidia-targeted-stress', hint: 'sat-ts-mode-hint'},
{card: 'sat-card-nvidia-targeted-power', hint: 'sat-tp-mode-hint'},
{card: 'sat-card-nvidia-pulse', hint: 'sat-pt-mode-hint'},
].forEach(function(item) {
const card = document.getElementById(item.card);
if (card) {
card.style.opacity = stress ? '1' : '0.5';
const hint = document.getElementById(item.hint);
if (hint) hint.style.display = stress ? 'none' : '';
}
});
}
` + satStressModeJS + `
function satLabels() {
return {nvidia:'Validate GPU', 'nvidia-targeted-stress':'NVIDIA Targeted Stress (dcgmi diag targeted_stress)', 'nvidia-targeted-power':'NVIDIA Targeted Power (dcgmi diag targeted_power)', 'nvidia-pulse':'NVIDIA PSU Pulse Test (dcgmi diag pulse_test)', 'nvidia-interconnect':'NVIDIA Interconnect (NCCL all_reduce_perf)', 'nvidia-bandwidth':'NVIDIA Bandwidth (NVBandwidth)', memory:'Validate Memory', storage:'Validate Storage', cpu:'Validate CPU', amd:'Validate AMD GPU', 'amd-mem':'AMD GPU MEM Integrity', 'amd-bandwidth':'AMD GPU MEM Bandwidth'};
}
@@ -667,7 +650,7 @@ func renderCheck(opts HandlerOptions) string {
if n > 0 {
gpuNote = fmt.Sprintf(" (%d GPU)", n)
}
return `<div class="alert alert-info" style="margin-bottom:16px"><strong>Non-destructive:</strong> Check tests collect diagnostics only — no writes to disks, no sustained load, no hardware wear counters incremented. For stress testing, go to <a href="/load">3. Load</a>.</div>
return `<div class="alert alert-info" style="margin-bottom:16px"><strong>Non-destructive:</strong> Check tests collect diagnostics only — no writes to disks, no sustained load, no hardware wear counters incremented. For stress testing, go to <a href="/burn">4. Burn</a>.</div>
<div style="display:flex;align-items:center;gap:12px;margin-bottom:16px">
<button type="button" class="btn btn-primary" onclick="runAllCheckSAT()">Run All Checks</button>
<span id="sat-all-status" style="font-size:12px;color:var(--muted)"></span>
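
Reviewer note: the hunks above replace the runtime radio-button switch with a `satStressMode()` function whose return value is fixed when the page is rendered. A minimal sketch of that idea, assuming a hypothetical helper `satStressModeJS` (the diff builds the string with an `if` instead), might look like this:

```go
package main

import "fmt"

// satStressModeJS is a hypothetical helper, not part of the diff.
// It returns the JavaScript snippet that hardcodes the page mode:
// false for the Check page, true for the Load page.
func satStressModeJS(stress bool) string {
	return fmt.Sprintf("function satStressMode() { return %t; }", stress)
}

func main() {
	// Check page: validate mode only.
	fmt.Println(satStressModeJS(false))
	// Load page: stress mode only.
	fmt.Println(satStressModeJS(true))
}
```

Because the mode is baked into the rendered page, the client-side code no longer needs `satModeChanged()` or the per-card mode hints.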


@@ -33,36 +33,36 @@ func renderPage(page string, opts HandlerOptions) string {
case "load":
pageID = "load"
title = "3. Load"
body = renderLoad()
case "speed":
pageID = "speed"
title = "4. Speed"
body = renderSpeed(opts)
case "endurance":
pageID = "endurance"
title = "5. Endurance"
body = renderEndurance(opts)
body = renderValidateStress(opts)
case "burn":
pageID = "burn"
title = "4. Burn"
body = renderBurn()
case "benchmark":
pageID = "benchmark"
title = "5. Benchmark"
body = renderBenchmark(opts)
case "tools":
pageID = "tools"
title = "6. Tools"
title = "Tools"
body = renderTools()
case "settings":
pageID = "settings"
title = "7. Settings"
title = "Settings"
body = renderSettings(opts)
// Legacy routes (redirected at HTTP level in handlePage; these are fallbacks)
case "validate", "tests":
pageID = "check"
title = "2. Check"
body = renderCheck(opts)
case "burn", "burn-in":
pageID = "load"
title = "3. Load"
body = renderLoad()
case "benchmark":
pageID = "speed"
title = "4. Speed"
body = renderSpeed(opts)
body = renderValidate(opts)
case "burn-in":
pageID = "burn"
title = "4. Burn"
body = renderBurn()
case "speed", "endurance":
pageID = "benchmark"
title = "5. Benchmark"
body = renderBenchmark(opts)
case "tasks":
pageID = "tasks"
title = "Tasks"


@@ -91,7 +91,9 @@ func (h *handler) handleAPISAADMIRead(w http.ResponseWriter, r *http.Request) {
defer os.RemoveAll(tmpDir)
dmiFile := filepath.Join(tmpDir, "DMI.txt")
out, err := exec.CommandContext(ctx, "saa", "-c", "GetDmiInfo", "--file", dmiFile, "--overwrite").CombinedOutput()
cmd := exec.CommandContext(ctx, "saa", "-c", "GetDmiInfo", "--file", dmiFile, "--overwrite")
cmd.Dir = "/usr/local/bin"
out, err := cmd.CombinedOutput()
if err != nil {
msg := strings.TrimSpace(string(out))
if msg == "" {
@@ -168,7 +170,9 @@ func runSAADMIWriteTask(ctx context.Context, j *jobState, exportDir string, p ta
dmiFile := filepath.Join(tmpDir, "DMI.txt")
j.append("Reading current DMI configuration...")
if err := streamCmdJob(j, exec.CommandContext(ctx, "saa", "-c", "GetDmiInfo", "--file", dmiFile, "--overwrite")); err != nil {
getCmd := exec.CommandContext(ctx, "saa", "-c", "GetDmiInfo", "--file", dmiFile, "--overwrite")
getCmd.Dir = "/usr/local/bin"
if err := streamCmdJob(j, getCmd); err != nil {
return fmt.Errorf("GetDmiInfo: %w", err)
}
@@ -190,13 +194,16 @@ func runSAADMIWriteTask(ctx context.Context, j *jobState, exportDir string, p ta
for _, c := range p.SAADmiChanges {
j.append("Setting " + c.Shn + " = " + c.Value)
cmd := exec.CommandContext(ctx, "saa", "-c", "EditDmiInfo", "--file", dmiFile, "--shn", c.Shn, "--value", c.Value)
cmd.Dir = "/usr/local/bin"
if err := streamCmdJob(j, cmd); err != nil {
return fmt.Errorf("EditDmiInfo %s: %w", c.Shn, err)
}
}
j.append("Applying changes to hardware...")
if err := streamCmdJob(j, exec.CommandContext(ctx, "saa", "-c", "ChangeDmiInfo", "--file", dmiFile)); err != nil {
changeCmd := exec.CommandContext(ctx, "saa", "-c", "ChangeDmiInfo", "--file", dmiFile)
changeCmd.Dir = "/usr/local/bin"
if err := streamCmdJob(j, changeCmd); err != nil {
return fmt.Errorf("ChangeDmiInfo: %w", err)
}


@@ -316,6 +316,8 @@ func NewHandler(opts HandlerOptions) http.Handler {
mux.HandleFunc("POST /api/tools/nvme-format/run", h.handleAPINVMeFormatRun)
mux.HandleFunc("GET /api/tools/saa-dmi", h.handleAPISAADMIRead)
mux.HandleFunc("POST /api/tools/saa-dmi/write", h.handleAPISAADMIWrite)
mux.HandleFunc("GET /api/tools/ipmi-fru", h.handleAPIIPMIFRURead)
mux.HandleFunc("POST /api/tools/ipmi-fru/write", h.handleAPIIPMIFRUWrite)
// GPU presence / tools
mux.HandleFunc("GET /api/gpu/presence", h.handleAPIGPUPresence)
@@ -1424,13 +1426,13 @@ func (h *handler) handlePage(w http.ResponseWriter, r *http.Request) {
// Redirect legacy routes to new named pages
switch page {
case "validate", "tests":
http.Redirect(w, r, "/check", http.StatusMovedPermanently)
return
case "burn", "burn-in":
http.Redirect(w, r, "/load", http.StatusMovedPermanently)
return
case "benchmark":
http.Redirect(w, r, "/speed", http.StatusMovedPermanently)
case "burn-in":
http.Redirect(w, r, "/burn", http.StatusMovedPermanently)
return
case "speed", "endurance":
http.Redirect(w, r, "/benchmark", http.StatusMovedPermanently)
return
}
body := renderPage(page, h.opts)
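
Reviewer note: after this hunk the legacy routes map as follows: `validate` and `tests` go to `/check`, `burn-in` goes to `/burn`, and `speed` and `endurance` go to `/benchmark`, while `burn` and `benchmark` are now real pages. The table-driven sketch below restates that mapping; `legacyRedirects` and `redirectTarget` are hypothetical names used only for illustration, since the diff keeps a `switch` statement.

```go
package main

import "fmt"

// legacyRedirects restates the switch in handlePage after this change.
// Keys are legacy page names, values are the new paths (served with 301).
var legacyRedirects = map[string]string{
	"validate":  "/check",
	"tests":     "/check",
	"burn-in":   "/burn",
	"speed":     "/benchmark",
	"endurance": "/benchmark",
}

// redirectTarget reports the new path for a legacy page name, if any.
func redirectTarget(page string) (string, bool) {
	target, ok := legacyRedirects[page]
	return target, ok
}

func main() {
	for _, page := range []string{"validate", "burn-in", "speed", "burn", "load"} {
		if target, ok := redirectTarget(page); ok {
			fmt.Printf("%s -> %s (301)\n", page, target)
		} else {
			fmt.Printf("%s -> rendered directly\n", page)
		}
	}
}
```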


@@ -666,54 +666,64 @@ func TestTasksPageRendersOpenLinksAndPaginationControls(t *testing.T) {
func TestToolsPageRendersNvidiaSelfHealSection(t *testing.T) {
handler := NewHandler(HandlerOptions{})
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/tools", nil))
if rec.Code != http.StatusOK {
t.Fatalf("status=%d", rec.Code)
// /tools: only NVMe Block Format and Supermicro DMI remain
recTools := httptest.NewRecorder()
handler.ServeHTTP(recTools, httptest.NewRequest(http.MethodGet, "/tools", nil))
if recTools.Code != http.StatusOK {
t.Fatalf("tools status=%d", recTools.Code)
}
body := rec.Body.String()
if !strings.Contains(body, `NVIDIA Self Heal`) {
t.Fatalf("tools page missing nvidia self heal section: %s", body)
toolsBody := recTools.Body.String()
if !strings.Contains(toolsBody, `NVMe Block Format`) {
t.Fatalf("tools page missing nvme block format section: %s", toolsBody)
}
if !strings.Contains(body, `Restart GPU Drivers`) {
t.Fatalf("tools page missing restart gpu drivers button: %s", body)
if !strings.Contains(toolsBody, `/api/tools/nvme-formats`) || !strings.Contains(toolsBody, `/api/tools/nvme-format/run`) {
t.Fatalf("tools page missing nvme format api usage: %s", toolsBody)
}
if !strings.Contains(body, `nvidiaRestartDrivers()`) {
t.Fatalf("tools page missing nvidiaRestartDrivers action: %s", body)
// /settings: system install, support bundle, tool check, nvidia self heal, network, services
recSettings := httptest.NewRecorder()
handler.ServeHTTP(recSettings, httptest.NewRequest(http.MethodGet, "/settings", nil))
if recSettings.Code != http.StatusOK {
t.Fatalf("settings status=%d", recSettings.Code)
}
if !strings.Contains(body, `/api/gpu/nvidia-status`) {
t.Fatalf("tools page missing nvidia status api usage: %s", body)
settingsBody := recSettings.Body.String()
if !strings.Contains(settingsBody, `NVIDIA Self Heal`) {
t.Fatalf("settings page missing nvidia self heal section: %s", settingsBody)
}
if !strings.Contains(body, `nvidiaResetGPU(`) {
t.Fatalf("tools page missing nvidiaResetGPU action: %s", body)
if !strings.Contains(settingsBody, `Restart GPU Drivers`) {
t.Fatalf("settings page missing restart gpu drivers button: %s", settingsBody)
}
if !strings.Contains(body, `id="boot-source-text"`) {
t.Fatalf("tools page missing boot source field: %s", body)
if !strings.Contains(settingsBody, `nvidiaRestartDrivers()`) {
t.Fatalf("settings page missing nvidiaRestartDrivers action: %s", settingsBody)
}
if !strings.Contains(body, `USB Black-Box`) {
t.Fatalf("tools page missing usb black-box section: %s", body)
if !strings.Contains(settingsBody, `/api/gpu/nvidia-status`) {
t.Fatalf("settings page missing nvidia status api usage: %s", settingsBody)
}
if !strings.Contains(body, `/api/blackbox/status`) {
t.Fatalf("tools page missing black-box status api usage: %s", body)
if !strings.Contains(settingsBody, `nvidiaResetGPU(`) {
t.Fatalf("settings page missing nvidiaResetGPU action: %s", settingsBody)
}
if !strings.Contains(body, `NVMe Block Format`) {
t.Fatalf("tools page missing nvme block format section: %s", body)
if !strings.Contains(settingsBody, `id="boot-source-text"`) {
t.Fatalf("settings page missing boot source field: %s", settingsBody)
}
if !strings.Contains(body, `/api/tools/nvme-formats`) || !strings.Contains(body, `/api/tools/nvme-format/run`) {
t.Fatalf("tools page missing nvme format api usage: %s", body)
if !strings.Contains(settingsBody, `USB Black-Box`) {
t.Fatalf("settings page missing usb black-box section: %s", settingsBody)
}
if !strings.Contains(settingsBody, `/api/blackbox/status`) {
t.Fatalf("settings page missing black-box status api usage: %s", settingsBody)
}
}
func TestBenchmarkPageRendersGPUSelectionControls(t *testing.T) {
handler := NewHandler(HandlerOptions{})
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/speed", nil))
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/benchmark", nil))
if rec.Code != http.StatusOK {
t.Fatalf("status=%d", rec.Code)
}
body := rec.Body.String()
for _, needle := range []string{
`href="/speed"`,
`href="/benchmark"`,
`id="benchmark-gpu-list"`,
`/api/gpu/nvidia`,
`/api/bee-bench/nvidia/perf/run`,
@@ -769,7 +779,7 @@ func TestBenchmarkPageRendersSavedResultsTable(t *testing.T) {
handler := NewHandler(HandlerOptions{ExportDir: exportDir})
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/speed", nil))
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/benchmark", nil))
if rec.Code != http.StatusOK {
t.Fatalf("status=%d", rec.Code)
}
@@ -834,10 +844,10 @@ func TestCheckPageRendersNvidiaFabricCards(t *testing.T) {
}
}
func TestLoadPageRendersGoalBasedNVIDIACards(t *testing.T) {
func TestBurnPageRendersGoalBasedNVIDIACards(t *testing.T) {
handler := NewHandler(HandlerOptions{})
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/load", nil))
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/burn", nil))
if rec.Code != http.StatusOK {
t.Fatalf("status=%d", rec.Code)
}


@@ -388,6 +388,12 @@ func executeTaskWithOptions(opts *HandlerOptions, t *Task, j *jobState, ctx cont
break
}
err = runSAADMIWriteTask(ctx, j, opts.ExportDir, t.params)
case "ipmi-fru-write":
if len(t.params.FRUChanges) == 0 {
err = fmt.Errorf("no changes provided")
break
}
err = runIPMIFRUWriteTask(ctx, j, opts.ExportDir, t.params)
default:
j.append("ERROR: unknown target: " + t.Target)
j.finish("unknown target")


@@ -140,7 +140,8 @@ type taskParams struct {
Device string `json:"device,omitempty"` // for install
LBAF int `json:"lbaf,omitempty"`
PlatformComponents []string `json:"platform_components,omitempty"`
SAADmiChanges []saaChange `json:"saa_dmi_changes,omitempty"`
SAADmiChanges []saaChange `json:"saa_dmi_changes,omitempty"`
FRUChanges []fruChange `json:"fru_changes,omitempty"`
}
type persistedTask struct {


@@ -1483,6 +1483,17 @@ for tool in storcli64 sas2ircu sas3ircu arcconf ssacli saa; do
fi
done
# saa companion directories — saa searches for these relative to CWD (/usr/local/bin)
for saa_subdir in acpica_bin ExternalData tool stunnel GO_SNMP; do
if [ -d "${VENDOR_DIR}/${saa_subdir}" ]; then
cp -r "${VENDOR_DIR}/${saa_subdir}" "${OVERLAY_STAGE_DIR}/usr/local/bin/"
find "${OVERLAY_STAGE_DIR}/usr/local/bin/${saa_subdir}" -type f -exec chmod +x {} \; 2>/dev/null || true
echo "vendor saa: ${saa_subdir}/ (included)"
else
echo "vendor saa: ${saa_subdir}/ (not found, skipped)"
fi
done
# --- NVIDIA kernel modules and userspace libs ---
if [ "$BEE_GPU_VENDOR" = "nvidia" ]; then
run_step "build NVIDIA ${NVIDIA_DRIVER_VERSION} modules" "40-nvidia-module" \

1131
iso/vendor/ExternalData/SMCIPID.txt vendored Normal file

File diff suppressed because it is too large

2333
iso/vendor/ExternalData/VENID.txt vendored Normal file

File diff suppressed because it is too large

37
iso/vendor/ExternalData/supportAutoDST vendored Normal file

@@ -0,0 +1,37 @@
(UTC-10:00) Aleutian Islands
(UTC-09:00) Alaska
(UTC-08:00) Baja California
(UTC-08:00) Pacific Time (US & Canada)
(UTC-07:00) Mountain Time (US & Canada)
(UTC-06:00) Central Time (US & Canada)
(UTC-06:00) Easter Island
(UTC-05:00) Eastern Time (US & Canada)
(UTC-05:00) Haiti
(UTC-05:00) Havana
(UTC-05:00) Indiana (East)
(UTC-05:00) Turks and Caicos
(UTC-04:00) Asuncion
(UTC-04:00) Atlantic Time (Canada)
(UTC-04:00) Santiago
(UTC-03:30) Newfoundland
(UTC-03:00) Saint Pierre and Miquelon
(UTC-01:00) Azores
(UTC+00:00) Dublin, Edinburgh, Lisbon, London
(UTC+01:00) Casablanca
(UTC+01:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
(UTC+01:00) Belgrade, Bratislava, Budapest, Ljubljana, Prague
(UTC+01:00) Brussels, Copenhagen, Madrid, Paris
(UTC+01:00) Sarajevo, Skopje, Warsaw, Zagreb
(UTC+02:00) Athens, Bucharest
(UTC+02:00) Beirut
(UTC+02:00) Chisinau
(UTC+02:00) Gaza, Hebron
(UTC+02:00) Helsinki, Kyiv, Riga, Sofia, Tallinn, Vilnius
(UTC+02:00) Jerusalem
(UTC+09:30) Adelaide
(UTC+10:00) Canberra, Melbourne, Sydney
(UTC+10:00) Hobart
(UTC+10:30) Lord Howe Island
(UTC+11:00) Norfolk Island
(UTC+12:00) Auckland, Wellington
(UTC+12:45) Chatham Islands

139
iso/vendor/ExternalData/timezone.txt vendored Normal file

@@ -0,0 +1,139 @@
(UTC-12:00) International Date Line West
(UTC-11:00) Coordinated Universal Time-11
(UTC-10:00) Aleutian Islands
(UTC-10:00) Hawaii
(UTC-09:30) Marquesas Islands
(UTC-09:00) Alaska
(UTC-09:00) Coordinated Universal Time-09
(UTC-08:00) Baja California
(UTC-08:00) Coordinated Universal Time-08
(UTC-08:00) Pacific Time (US & Canada)
(UTC-07:00) Arizona
(UTC-07:00) Chihuahua, La Paz, Mazatlan
(UTC-07:00) Mountain Time (US & Canada)
(UTC-07:00) Yukon
(UTC-06:00) Central America
(UTC-06:00) Central Time (US & Canada)
(UTC-06:00) Easter Island
(UTC-06:00) Guadalajara, Mexico City, Monterrey
(UTC-06:00) Saskatchewan
(UTC-05:00) Bogota, Lima, Quito, Rio Branco
(UTC-05:00) Chetumal
(UTC-05:00) Eastern Time (US & Canada)
(UTC-05:00) Haiti
(UTC-05:00) Havana
(UTC-05:00) Indiana (East)
(UTC-05:00) Turks and Caicos
(UTC-04:00) Atlantic Time (Canada)
(UTC-04:00) Caracas
(UTC-04:00) Cuiaba
(UTC-04:00) Georgetown, La Paz, Manaus, San Juan
(UTC-04:00) Santiago
(UTC-03:30) Newfoundland
(UTC-03:00) Asuncion
(UTC-03:00) Araguaina
(UTC-03:00) Brasilia
(UTC-03:00) Cayenne, Fortaleza
(UTC-03:00) City of Buenos Aires
(UTC-03:00) Greenland
(UTC-03:00) Montevideo
(UTC-03:00) Punta Arenas
(UTC-03:00) Saint Pierre and Miquelon
(UTC-03:00) Salvador
(UTC-02:00) Coordinated Universal Time-02
(UTC-01:00) Azores
(UTC-01:00) Cabo Verde Is.
(UTC+00:00) Coordinated Universal Time
(UTC+00:00) Dublin, Edinburgh, Lisbon, London
(UTC+00:00) Monrovia, Reykjavik
(UTC+00:00) Sao Tome
(UTC+01:00) Casablanca
(UTC+01:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
(UTC+01:00) Belgrade, Bratislava, Budapest, Ljubljana, Prague
(UTC+01:00) Brussels, Copenhagen, Madrid, Paris
(UTC+01:00) Sarajevo, Skopje, Warsaw, Zagreb
(UTC+01:00) West Central Africa
(UTC+02:00) Amman
(UTC+02:00) Athens, Bucharest
(UTC+02:00) Beirut
(UTC+02:00) Cairo
(UTC+02:00) Chisinau
(UTC+02:00) Damascus
(UTC+02:00) Gaza, Hebron
(UTC+02:00) Harare, Pretoria
(UTC+02:00) Helsinki, Kyiv, Riga, Sofia, Tallinn, Vilnius
(UTC+02:00) Jerusalem
(UTC+02:00) Juba
(UTC+02:00) Kaliningrad
(UTC+02:00) Khartoum
(UTC+02:00) Tripoli
(UTC+02:00) Windhoek
(UTC+03:00) Baghdad
(UTC+03:00) Istanbul
(UTC+03:00) Kuwait, Riyadh
(UTC+03:00) Minsk
(UTC+03:00) Moscow, St. Petersburg
(UTC+03:00) Nairobi
(UTC+03:00) Volgograd
(UTC+03:30) Tehran
(UTC+04:00) Abu Dhabi, Muscat
(UTC+04:00) Astrakhan, Ulyanovsk
(UTC+04:00) Baku
(UTC+04:00) Izhevsk, Samara
(UTC+04:00) Port Louis
(UTC+04:00) Saratov
(UTC+04:00) Tbilisi
(UTC+04:00) Yerevan
(UTC+04:30) Kabul
(UTC+05:00) Ashgabat, Tashkent
(UTC+05:00) Astana
(UTC+05:00) Ekaterinburg
(UTC+05:00) Islamabad, Karachi
(UTC+05:00) Qyzylorda
(UTC+05:30) Chennai, Kolkata, Mumbai, New Delhi
(UTC+05:30) Sri Jayawardenepura
(UTC+05:45) Kathmandu
(UTC+06:00) Dhaka
(UTC+06:00) Omsk
(UTC+06:30) Yangon (Rangoon)
(UTC+07:00) Bangkok, Hanoi, Jakarta
(UTC+07:00) Barnaul, Gorno-Altaysk
(UTC+07:00) Hovd
(UTC+07:00) Krasnoyarsk
(UTC+07:00) Novosibirsk
(UTC+07:00) Tomsk
(UTC+08:00) Beijing, Chongqing, Hong Kong, Urumqi
(UTC+08:00) Irkutsk
(UTC+08:00) Kuala Lumpur, Singapore
(UTC+08:00) Perth
(UTC+08:00) Taipei
(UTC+08:00) Ulaanbaatar
(UTC+08:45) Eucla
(UTC+09:00) Chita
(UTC+09:00) Osaka, Sapporo, Tokyo
(UTC+09:00) Pyongyang
(UTC+09:00) Seoul
(UTC+09:00) Yakutsk
(UTC+09:30) Adelaide
(UTC+09:30) Darwin
(UTC+10:00) Brisbane
(UTC+10:00) Canberra, Melbourne, Sydney
(UTC+10:00) Guam, Port Moresby
(UTC+10:00) Hobart
(UTC+10:00) Vladivostok
(UTC+10:30) Lord Howe Island
(UTC+11:00) Bougainville Island
(UTC+11:00) Chokurdakh
(UTC+11:00) Magadan
(UTC+11:00) Norfolk Island
(UTC+11:00) Sakhalin
(UTC+11:00) Solomon Is., New Caledonia
(UTC+12:00) Anadyr, Petropavlovsk-Kamchatsky
(UTC+12:00) Auckland, Wellington
(UTC+12:00) Coordinated Universal Time+12
(UTC+12:00) Fiji
(UTC+12:45) Chatham Islands
(UTC+13:00) Coordinated Universal Time+13
(UTC+13:00) Nuku'alofa
(UTC+13:00) Samoa
(UTC+14:00) Kiritimati Island

BIN
iso/vendor/ExternalData/tui.fnt vendored Normal file

Binary file not shown.

BIN
iso/vendor/GO_SNMP/AlertServer vendored Executable file

Binary file not shown.

BIN
iso/vendor/acpica_bin/acpidump vendored Executable file

Binary file not shown.

BIN
iso/vendor/acpica_bin/acpiexec vendored Executable file

Binary file not shown.

BIN
iso/vendor/stunnel/stunnel64 vendored Executable file

Binary file not shown.

BIN
iso/vendor/tool/USBController/ASMedia/114xfwdl vendored Executable file

Binary file not shown.

BIN
iso/vendor/tool/gpu/nVidia/x64/nvuflash vendored Executable file

Binary file not shown.

BIN
iso/vendor/tool/gpu/nVidia/x64/setrom vendored Executable file

Binary file not shown.