Compare commits

...

9 Commits

Author SHA1 Message Date
Mikhail Chusavitin
ce46a97975 Remove duplicate Blackbox Logging card from Settings page
The USB Black-Box card already provides enable/disable per device.
The standalone Blackbox Logging card was non-functional and redundant.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 09:15:31 +03:00
Mikhail Chusavitin
258ecb3453 Add RAID Controller Management to Tools page
Unified card for LSI/Broadcom and Intel VROC controllers: auto-detects
foreign configurations and warns the operator with Import/Clear actions;
allows creating RAID 1 mirrors from unconfigured drives regardless of
controller type. Live output streams via SSE into an inline terminal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 08:58:19 +03:00
Mikhail Chusavitin
cbb0d1e522 Collect IPMI sensors, SEL and dmesg errors into audit JSON and support bundle
- audit JSON: IPMI sensor readings (ipmitool sensor) merged into hardware.sensors alongside lm-sensors data
- audit JSON: IPMI SEL entries (ipmitool sel list) in hardware.event_logs with source "ipmi-sel"
- audit JSON: dmesg error/warning lines in hardware.event_logs with source "dmesg" (filtered by error/warn/AER/Xid/NVRM/ECC/panic patterns)
- support bundle: added ipmitool-sensor.txt, ipmitool-sel.txt, ipmitool-sel-time.txt to techdump
- saa_dmi.go: fix dmiItemRE to accept SHN with parentheses (e.g. PS(4)LC for PSU fields)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 08:41:37 +03:00
Mikhail Chusavitin
bab941ccf1 Fix SAA: set CWD=/usr/local/bin; include all SAA package binaries
- saa_dmi.go: set cmd.Dir=/usr/local/bin on all saa exec calls so
  acpica_bin/acpidump is found relative to correct working directory
- build.sh: copy all saa companion dirs (acpica_bin, ExternalData,
  tool, stunnel, GO_SNMP) to /usr/local/bin/ preserving structure
- iso/vendor: add acpica_bin/acpiexec, ExternalData/, tool/gpu/nVidia/x64/,
  tool/USBController/, stunnel/, GO_SNMP/ from SAA 1.5.0 release package

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 08:24:50 +03:00
Mikhail Chusavitin
b49c71a980 Add IPMI FRU editor to Tools page
- New card "IPMI — FRU" on Tools page (device 0, in-band)
- Read: GET /api/tools/ipmi-fru → ipmitool fru print 0 → editable table
- Editable fields: chassis (part#, serial, extra), board (mfr, product, serial, part#),
  product (mfr, name, part#, version, serial); read-only fields displayed as text
- Write: POST /api/tools/ipmi-fru/write → task → backup to fru-backups/ → ipmitool fru edit per field
- Dirty tracking + Save (N changed) button, same UX as Supermicro DMI card

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-19 08:13:35 +03:00
Mikhail Chusavitin
85d1acdaa3 Split validate/stress into separate fixed-mode pages
- Check (2): validate mode only — no mode switcher, no stress-only cards
  (nvidia-targeted-stress, nvidia-targeted-power, nvidia-pulse hidden)
- Load (3): stress mode only — no mode switcher, all cards shown
- satStressMode() hardcoded per page; satModeChanged() removed
- Profile card with radio buttons removed from both pages
- Replaced with simple Run All button + est. time

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-18 19:12:17 +03:00
Mikhail Chusavitin
a2d7513153 Restructure nav to Load/Burn/Benchmark; fix SAA acpidump dependency
- Nav steps 3-5: Load (validate), Burn (burn-in), Benchmark (speed+endurance merged)
- /load now renders validate mode; /burn renders burn-in; /benchmark replaces /speed+/endurance
- Legacy redirects updated: /validate→/load, /burn-in→/burn, /speed+/endurance→/benchmark
- Add acpica_bin/acpidump from SAA 1.5.0 package; required by saa GetDmiInfo (ExitCode 8)
- build.sh copies acpica_bin/acpidump to /usr/local/bin/acpica_bin/ alongside saa

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-18 19:07:51 +03:00
Mikhail Chusavitin
5b5d8609d3 Refactor nav: remove numbers from Tools/Settings, add separator and Tasks item
- Remove "6." / "7." prefixes from Tools and Settings nav labels and page titles
- Add a horizontal separator (nav-sep) before the Tools/Settings group
- Move Tasks into the nav as a regular nav-item after the separator,
  replacing the separate tasks-nav-btn at the sidebar bottom
- Tasks item retains the active-count badge (tasks-nav-count)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-18 17:54:54 +03:00
Mikhail Chusavitin
e7442972d1 Move session-scoped LiveCD tools from Tools to Settings
Tools page now contains only NVMe Block Format and Supermicro - DMI.

Moved to Settings (7):
- System Install (Install to RAM + Install to Disk)
- Support Bundle + USB Black-Box
- Tool Check
- NVIDIA Self Heal (replaces simple NVIDIA Recovery card)
- Network
- Services

Update TestToolsPageRendersNvidiaSelfHealSection to assert the moved
cards on /settings instead of /tools.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-18 17:52:19 +03:00
32 changed files with 5339 additions and 312 deletions

View File

@@ -49,7 +49,8 @@ func Run(_ runtimeenv.Mode) schema.HardwareIngestRequest {
snap.VROCLicense = collectVROCLicense(snap.PCIeDevices) snap.VROCLicense = collectVROCLicense(snap.PCIeDevices)
snap.PowerSupplies = collectPSUs(derefString(snap.Board.Manufacturer)) snap.PowerSupplies = collectPSUs(derefString(snap.Board.Manufacturer))
snap.PowerSupplies = enrichPSUsWithTelemetry(snap.PowerSupplies, sensorDoc) snap.PowerSupplies = enrichPSUsWithTelemetry(snap.PowerSupplies, sensorDoc)
snap.Sensors = buildSensorsFromDoc(sensorDoc) snap.Sensors = mergeIPMISensors(buildSensorsFromDoc(sensorDoc), collectIPMISensors())
snap.EventLogs = append(collectIPMISEL(), collectDmesgErrors()...)
finalizeSnapshot(&snap, collectedAt) finalizeSnapshot(&snap, collectedAt)
// remaining collectors added in steps 1.8 1.10 // remaining collectors added in steps 1.8 1.10

View File

@@ -0,0 +1,129 @@
package collector
import (
"bee/audit/internal/schema"
"log/slog"
"os/exec"
"regexp"
"strings"
"time"
)
// dmesg -T output: [Thu Jun 18 14:23:45 2026] message
// dmesg without -T: [ 123.456789] message
var dmesgTimestampRE = regexp.MustCompile(`^\[([^\]]+)\]\s*(.*)$`)
// Keywords that indicate an error or hardware problem worth capturing.
var dmesgErrorPatterns = []*regexp.Regexp{
regexp.MustCompile(`(?i)\berr(or)?\b`),
regexp.MustCompile(`(?i)\bfail(ed|ure)?\b`),
regexp.MustCompile(`(?i)\bfault\b`),
regexp.MustCompile(`(?i)\bwarn(ing)?\b`),
regexp.MustCompile(`(?i)\bAER\b`),
regexp.MustCompile(`(?i)\bXid\b`),
regexp.MustCompile(`(?i)\bNVRM\b`),
regexp.MustCompile(`(?i)\bpanic\b`),
regexp.MustCompile(`(?i)\bcorrected\b`),
regexp.MustCompile(`(?i)\buncorrect`),
regexp.MustCompile(`(?i)\bECC\b`),
regexp.MustCompile(`(?i)\btimeout\b`),
regexp.MustCompile(`(?i)\breset\b`),
regexp.MustCompile(`(?i)\bdead\b`),
regexp.MustCompile(`(?i)\bhang\b`),
regexp.MustCompile(`(?i)\bstall\b`),
regexp.MustCompile(`(?i)\bdisabled\b`),
}
// collectDmesgErrors runs `dmesg -T` (or `dmesg` without -T on failure) and
// returns only lines that match known error/warning patterns.
func collectDmesgErrors() []schema.HardwareEventLog {
out, err := exec.Command("dmesg", "-T").Output()
if err != nil || len(out) == 0 {
// Fallback: dmesg without human-readable timestamps
out, err = exec.Command("dmesg").Output()
if err != nil || len(out) == 0 {
return nil
}
}
entries := parseDmesgErrors(string(out))
if len(entries) == 0 {
return nil
}
slog.Info("dmesg: collected error entries", "count", len(entries))
return entries
}
func parseDmesgErrors(output string) []schema.HardwareEventLog {
var entries []schema.HardwareEventLog
collectedAt := time.Now().UTC().Format(time.RFC3339)
for _, line := range strings.Split(output, "\n") {
line = strings.TrimSpace(line)
if line == "" {
continue
}
var timestamp, message string
if m := dmesgTimestampRE.FindStringSubmatch(line); m != nil {
timestamp = strings.TrimSpace(m[1])
message = strings.TrimSpace(m[2])
} else {
message = line
}
if message == "" {
continue
}
if !matchesAny(message, dmesgErrorPatterns) {
continue
}
severity := dmesgSeverity(message)
source := "dmesg"
var eventTime *string
if timestamp != "" {
t := timestamp
eventTime = &t
} else {
eventTime = &collectedAt
}
entries = append(entries, schema.HardwareEventLog{
Source: source,
EventTime: eventTime,
Severity: &severity,
Message: message,
})
}
return entries
}
func matchesAny(s string, patterns []*regexp.Regexp) bool {
for _, p := range patterns {
if p.MatchString(s) {
return true
}
}
return false
}
func dmesgSeverity(msg string) string {
lower := strings.ToLower(msg)
switch {
case strings.Contains(lower, "panic") ||
strings.Contains(lower, "aer") ||
strings.Contains(lower, "uncorrect") ||
strings.Contains(lower, "xid") ||
strings.Contains(lower, "nvrm"):
return statusCritical
case strings.Contains(lower, "error") ||
strings.Contains(lower, "fault") ||
strings.Contains(lower, "fail") ||
strings.Contains(lower, "dead") ||
strings.Contains(lower, "hang"):
return statusCritical
default:
return statusWarning
}
}

View File

@@ -0,0 +1,90 @@
package collector
import (
"bee/audit/internal/schema"
"fmt"
"log/slog"
"os/exec"
"strings"
)
// collectIPMISEL runs `ipmitool sel list` and returns parsed event log entries.
// Returns nil if ipmitool is unavailable or the SEL is empty.
func collectIPMISEL() []schema.HardwareEventLog {
out, err := exec.Command("ipmitool", "sel", "list").Output()
if err != nil || len(out) == 0 {
return nil
}
entries := parseIPMISELOutput(string(out))
if len(entries) == 0 {
return nil
}
slog.Info("ipmi sel: collected", "entries", len(entries))
return entries
}
// parseIPMISELOutput parses `ipmitool sel list` output.
// Line format: ID | date | time | sensor | event description | direction
// Example: 1 | 06/18/2026 | 14:23:45 | Temperature #0x30 | Upper Critical going high | Asserted
func parseIPMISELOutput(output string) []schema.HardwareEventLog {
var entries []schema.HardwareEventLog
for _, line := range strings.Split(output, "\n") {
line = strings.TrimSpace(line)
if line == "" {
continue
}
parts := strings.SplitN(line, "|", 6)
if len(parts) < 5 {
continue
}
id := strings.TrimSpace(parts[0])
date := strings.TrimSpace(parts[1])
timeStr := strings.TrimSpace(parts[2])
sensor := strings.TrimSpace(parts[3])
event := strings.TrimSpace(parts[4])
direction := ""
if len(parts) == 6 {
direction = strings.TrimSpace(parts[5])
}
var eventTime *string
if date != "" && timeStr != "" {
t := fmt.Sprintf("%s %s", date, timeStr)
eventTime = &t
}
message := event
if direction != "" && strings.EqualFold(direction, "Deasserted") {
message = event + " (Deasserted)"
}
severity := ipmiSELSeverity(event)
isActive := !strings.EqualFold(direction, "Deasserted")
entry := schema.HardwareEventLog{
Source: "ipmi-sel",
EventTime: eventTime,
Severity: &severity,
MessageID: &id,
Message: message,
IsActive: &isActive,
}
if sensor != "" {
entry.ComponentRef = &sensor
}
entries = append(entries, entry)
}
return entries
}
func ipmiSELSeverity(event string) string {
lower := strings.ToLower(event)
switch {
case strings.Contains(lower, "critical") || strings.Contains(lower, "non-recoverable"):
return statusCritical
case strings.Contains(lower, "non-critical") || strings.Contains(lower, "warning") || strings.Contains(lower, "degraded"):
return statusWarning
default:
return "info"
}
}

View File

@@ -0,0 +1,216 @@
package collector
import (
"bee/audit/internal/schema"
"log/slog"
"os/exec"
"strconv"
"strings"
)
// collectIPMISensors runs `ipmitool sensor` and returns parsed sensor readings.
// Returns nil if ipmitool is unavailable or produces no output.
func collectIPMISensors() *schema.HardwareSensors {
out, err := exec.Command("ipmitool", "sensor").Output()
if err != nil || len(out) == 0 {
return nil
}
result := parseIPMISensorOutput(string(out))
if result == nil {
return nil
}
slog.Info("ipmi sensors: collected",
"fans", len(result.Fans),
"temperatures", len(result.Temperatures),
"power", len(result.Power),
"other", len(result.Other),
)
return result
}
// parseIPMISensorOutput parses `ipmitool sensor` text output.
// Each line: name | value | unit | status | lnr | lcr | lnc | unc | ucr | unr
func parseIPMISensorOutput(output string) *schema.HardwareSensors {
result := &schema.HardwareSensors{}
seen := map[string]struct{}{}
for _, line := range strings.Split(output, "\n") {
line = strings.TrimSpace(line)
if line == "" {
continue
}
parts := strings.Split(line, "|")
if len(parts) < 4 {
continue
}
name := strings.TrimSpace(parts[0])
rawVal := strings.TrimSpace(parts[1])
unit := strings.TrimSpace(parts[2])
status := strings.TrimSpace(parts[3])
if name == "" || rawVal == "na" || rawVal == "N/A" || rawVal == "" {
continue
}
value, err := strconv.ParseFloat(rawVal, 64)
if err != nil {
continue
}
statusStr := normalizeIPMISensorStatus(status)
switch {
case strings.EqualFold(unit, "RPM"):
if duplicateSensor(seen, "fan", name) {
continue
}
rpm := int(value)
item := schema.HardwareFanSensor{Name: name, RPM: &rpm}
if statusStr != "" {
item.Status = &statusStr
}
result.Fans = append(result.Fans, item)
case strings.EqualFold(unit, "degrees C") || strings.EqualFold(unit, "C"):
if duplicateSensor(seen, "temp", name) {
continue
}
item := schema.HardwareTemperatureSensor{Name: name, Celsius: &value}
if len(parts) >= 9 {
if unc := parseIPMIThreshold(parts[7]); unc != nil {
item.ThresholdWarningCelsius = unc
}
if ucr := parseIPMIThreshold(parts[8]); ucr != nil {
item.ThresholdCriticalCelsius = ucr
}
}
if statusStr != "" {
item.Status = &statusStr
} else {
item.Status = deriveTemperatureStatus(item.Celsius, item.ThresholdWarningCelsius, item.ThresholdCriticalCelsius)
}
result.Temperatures = append(result.Temperatures, item)
case strings.EqualFold(unit, "Volts") || strings.EqualFold(unit, "V"):
if duplicateSensor(seen, "power", name) {
continue
}
item := schema.HardwarePowerSensor{Name: name, VoltageV: &value}
if statusStr != "" {
item.Status = &statusStr
}
result.Power = append(result.Power, item)
case strings.EqualFold(unit, "Watts") || strings.EqualFold(unit, "W"):
if duplicateSensor(seen, "power", name) {
continue
}
item := schema.HardwarePowerSensor{Name: name, PowerW: &value}
if statusStr != "" {
item.Status = &statusStr
}
result.Power = append(result.Power, item)
case strings.EqualFold(unit, "Amps") || strings.EqualFold(unit, "A"):
if duplicateSensor(seen, "power", name) {
continue
}
item := schema.HardwarePowerSensor{Name: name, CurrentA: &value}
if statusStr != "" {
item.Status = &statusStr
}
result.Power = append(result.Power, item)
default:
if duplicateSensor(seen, "other", name) {
continue
}
item := schema.HardwareOtherSensor{Name: name, Value: &value}
if unit != "" {
item.Unit = &unit
}
if statusStr != "" {
item.Status = &statusStr
}
result.Other = append(result.Other, item)
}
}
if len(result.Fans) == 0 && len(result.Temperatures) == 0 && len(result.Power) == 0 && len(result.Other) == 0 {
return nil
}
return result
}
func parseIPMIThreshold(raw string) *float64 {
s := strings.TrimSpace(raw)
if s == "" || s == "na" || s == "N/A" {
return nil
}
v, err := strconv.ParseFloat(s, 64)
if err != nil {
return nil
}
return &v
}
func normalizeIPMISensorStatus(s string) string {
switch strings.ToLower(s) {
case "ok":
return statusOK
case "cr", "ucr", "lcr":
return statusCritical
case "nc", "unc", "lnc", "nr", "unr", "lnr":
return statusWarning
case "ns", "na":
return ""
default:
return ""
}
}
// mergeIPMISensors appends IPMI sensor entries into existing, skipping names already present.
func mergeIPMISensors(existing, ipmi *schema.HardwareSensors) *schema.HardwareSensors {
if ipmi == nil {
return existing
}
if existing == nil {
return ipmi
}
existingNames := map[string]struct{}{}
for _, s := range existing.Fans {
existingNames["fan\x00"+s.Name] = struct{}{}
}
for _, s := range existing.Temperatures {
existingNames["temp\x00"+s.Name] = struct{}{}
}
for _, s := range existing.Power {
existingNames["power\x00"+s.Name] = struct{}{}
}
for _, s := range existing.Other {
existingNames["other\x00"+s.Name] = struct{}{}
}
for _, s := range ipmi.Fans {
if _, ok := existingNames["fan\x00"+s.Name]; !ok {
existing.Fans = append(existing.Fans, s)
}
}
for _, s := range ipmi.Temperatures {
if _, ok := existingNames["temp\x00"+s.Name]; !ok {
existing.Temperatures = append(existing.Temperatures, s)
}
}
for _, s := range ipmi.Power {
if _, ok := existingNames["power\x00"+s.Name]; !ok {
existing.Power = append(existing.Power, s)
}
}
for _, s := range ipmi.Other {
if _, ok := existingNames["other\x00"+s.Name]; !ok {
existing.Other = append(existing.Other, s)
}
}
return existing
}

View File

@@ -25,6 +25,9 @@ var techDumpFixedCommands = []struct {
{Name: "sensors", Args: []string{"-j"}, File: "sensors.json"}, {Name: "sensors", Args: []string{"-j"}, File: "sensors.json"},
{Name: "ipmitool", Args: []string{"fru", "print"}, File: "ipmitool-fru.txt"}, {Name: "ipmitool", Args: []string{"fru", "print"}, File: "ipmitool-fru.txt"},
{Name: "ipmitool", Args: []string{"sdr"}, File: "ipmitool-sdr.txt"}, {Name: "ipmitool", Args: []string{"sdr"}, File: "ipmitool-sdr.txt"},
{Name: "ipmitool", Args: []string{"sensor"}, File: "ipmitool-sensor.txt"},
{Name: "ipmitool", Args: []string{"sel", "list"}, File: "ipmitool-sel.txt"},
{Name: "ipmitool", Args: []string{"sel", "time", "get"}, File: "ipmitool-sel-time.txt"},
{Name: "nvme", Args: []string{"list", "-o", "json"}, File: "nvme-list.json"}, {Name: "nvme", Args: []string{"list", "-o", "json"}, File: "nvme-list.json"},
} }

View File

@@ -0,0 +1,293 @@
package webui
import (
"context"
"encoding/json"
"fmt"
"net/http"
"os"
"os/exec"
"path/filepath"
"strings"
"time"
"unicode"
)
type fruField struct {
Name string `json:"name"`
Value string `json:"value"`
Editable bool `json:"editable"`
Area string `json:"area,omitempty"`
Index int `json:"index,omitempty"`
}
type fruChange struct {
Area string `json:"area"`
Index int `json:"index"`
Name string `json:"name"`
Value string `json:"value"`
}
// fruEditableFields maps display name → area + index for ipmitool fru edit.
var fruEditableFields = map[string]struct {
Area string
Index int
}{
"Chassis Part Number": {"c", 0},
"Chassis Serial Number": {"c", 1},
"Chassis Extra": {"c", 2},
"Board Manufacturer": {"b", 0},
"Board Product Name": {"b", 1},
"Board Serial Number": {"b", 2},
"Board Part Number": {"b", 3},
"Product Manufacturer": {"p", 0},
"Product Name": {"p", 1},
"Product Part Number": {"p", 2},
"Product Version": {"p", 3},
"Product Serial Number": {"p", 4},
}
func parseFRUOutput(output string) []fruField {
var fields []fruField
for _, line := range strings.Split(output, "\n") {
// Lines look like: " Field Name : value"
trimmed := strings.TrimLeft(line, " \t")
if trimmed == "" {
continue
}
colon := strings.Index(trimmed, " : ")
if colon < 0 {
// try ": " with no leading space before colon
colon = strings.Index(trimmed, ": ")
if colon < 0 {
continue
}
name := strings.TrimSpace(trimmed[:colon])
value := strings.TrimSpace(trimmed[colon+2:])
if name == "" {
continue
}
editable, area, idx := fruFieldMeta(name)
fields = append(fields, fruField{Name: name, Value: value, Editable: editable, Area: area, Index: idx})
continue
}
name := strings.TrimSpace(trimmed[:colon])
value := strings.TrimSpace(trimmed[colon+3:])
if name == "" {
continue
}
editable, area, idx := fruFieldMeta(name)
fields = append(fields, fruField{Name: name, Value: value, Editable: editable, Area: area, Index: idx})
}
return fields
}
func fruFieldMeta(name string) (editable bool, area string, index int) {
if e, ok := fruEditableFields[name]; ok {
return true, e.Area, e.Index
}
return false, "", 0
}
func (h *handler) handleAPIIPMIFRURead(w http.ResponseWriter, r *http.Request) {
ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
defer cancel()
out, err := exec.CommandContext(ctx, "ipmitool", "fru", "print", "0").CombinedOutput()
if err != nil {
msg := strings.TrimSpace(string(out))
if msg == "" {
msg = err.Error()
}
writeError(w, http.StatusInternalServerError, "ipmitool fru print: "+msg)
return
}
fields := parseFRUOutput(string(out))
writeJSON(w, fields)
}
func (h *handler) handleAPIIPMIFRUWrite(w http.ResponseWriter, r *http.Request) {
var req struct {
Changes []fruChange `json:"changes"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
writeError(w, http.StatusBadRequest, "invalid JSON")
return
}
if len(req.Changes) == 0 {
writeError(w, http.StatusUnprocessableEntity, "no changes provided")
return
}
validAreas := map[string]bool{"c": true, "b": true, "p": true}
for _, c := range req.Changes {
if !validAreas[c.Area] {
writeError(w, http.StatusUnprocessableEntity, "invalid area: "+c.Area)
return
}
if c.Index < 0 || c.Index > 9 {
writeError(w, http.StatusUnprocessableEntity, fmt.Sprintf("invalid index %d", c.Index))
return
}
if len(c.Value) > 64 {
writeError(w, http.StatusUnprocessableEntity, "value too long (max 64 chars)")
return
}
for _, ch := range c.Value {
if ch > unicode.MaxASCII || (ch < 0x20 && ch != 0) {
writeError(w, http.StatusUnprocessableEntity, "value contains non-printable characters")
return
}
}
}
t := &Task{
ID: newJobID("ipmi-fru-write"),
Name: fmt.Sprintf("IPMI FRU Write (%d field(s))", len(req.Changes)),
Target: "ipmi-fru-write",
Priority: defaultTaskPriority("ipmi-fru-write", taskParams{}),
Status: TaskPending,
CreatedAt: time.Now(),
params: taskParams{FRUChanges: req.Changes},
}
globalQueue.enqueue(t)
writeJSON(w, map[string]string{"task_id": t.ID})
}
func runIPMIFRUWriteTask(ctx context.Context, j *jobState, exportDir string, p taskParams) error {
// Backup current FRU state
backupDir := filepath.Join(exportDir, "fru-backups")
if err := os.MkdirAll(backupDir, 0755); err != nil {
return fmt.Errorf("mkdir fru-backups: %w", err)
}
stamp := time.Now().Format("20060102150405")
backupPath := filepath.Join(backupDir, "fru-"+stamp+".txt")
backupOut, err := exec.CommandContext(ctx, "ipmitool", "fru", "print", "0").CombinedOutput()
if err != nil {
return fmt.Errorf("backup fru print: %w", err)
}
if err := os.WriteFile(backupPath, backupOut, 0644); err != nil {
return fmt.Errorf("write backup: %w", err)
}
j.append("Backup saved to " + backupPath)
// Apply changes
for _, c := range p.FRUChanges {
j.append(fmt.Sprintf("Setting %s (%s %d) = %q", c.Name, c.Area, c.Index, c.Value))
cmd := exec.CommandContext(ctx, "ipmitool", "fru", "edit", "0", "field", c.Area, fmt.Sprintf("%d", c.Index), c.Value)
if err := streamCmdJob(j, cmd); err != nil {
return fmt.Errorf("fru edit %s %d: %w", c.Area, c.Index, err)
}
}
return nil
}
func renderIPMIFRUCard() string {
return `<div class="card"><div class="card-head card-head-actions">IPMI &#8212; FRU<div class="card-head-buttons"><button class="btn btn-sm btn-secondary" onclick="fruRead()">Read</button></div></div><div class="card-body">
<p style="font-size:13px;color:var(--muted);margin-bottom:12px">Reads and edits FRU fields via ipmitool (In-Band, device 0). Works on any server with IPMI support.</p>
<div id="fru-status" style="font-size:13px;color:var(--muted);margin-bottom:8px"></div>
<div id="fru-table"></div>
<div id="fru-save-row" style="display:none;margin-top:12px">
<button class="btn btn-primary" id="fru-save-btn" onclick="fruSave()">Save</button>
<span id="fru-save-msg" style="font-size:13px;color:var(--muted);margin-left:10px"></span>
</div>
</div></div>
<script>
var fruOriginal = {};
function fruRead() {
document.getElementById('fru-status').textContent = 'Reading...';
document.getElementById('fru-table').innerHTML = '';
document.getElementById('fru-save-row').style.display = 'none';
fetch('/api/tools/ipmi-fru', {cache:'no-store'})
.then(function(r) {
if (!r.ok) return r.json().then(function(e) { throw new Error(e.error || r.statusText); });
return r.json();
})
.then(function(fields) {
fruOriginal = {};
if (!fields || !fields.length) {
document.getElementById('fru-status').textContent = 'No FRU fields returned.';
return;
}
document.getElementById('fru-status').textContent = '';
var rows = fields.map(function(f) {
var val = f.value || '';
if (f.editable) {
fruOriginal[f.area + '_' + f.index] = val;
return '<tr><td style="color:var(--muted);white-space:nowrap;padding-right:16px">' + escHtml(f.name) + '</td>'
+ '<td><input class="fru-input" style="width:100%;padding:4px 6px;border:1px solid var(--border);border-radius:3px;font-size:13px;font-family:inherit;background:var(--surface);color:var(--ink)"'
+ ' data-area="' + escHtml(f.area) + '" data-index="' + f.index + '" data-name="' + escHtml(f.name) + '"'
+ ' data-original="' + escHtml(val) + '" value="' + escHtml(val) + '" oninput="fruDirtyCheck()"></td></tr>';
}
return '<tr><td style="color:var(--muted);white-space:nowrap;padding-right:16px">' + escHtml(f.name) + '</td>'
+ '<td style="color:var(--ink)">' + escHtml(val || '—') + '</td></tr>';
}).join('');
document.getElementById('fru-table').innerHTML = '<table style="width:100%">' + rows + '</table>';
fruDirtyCheck();
})
.catch(function(e) {
document.getElementById('fru-status').textContent = 'Error: ' + e.message;
document.getElementById('fru-status').style.color = 'var(--crit-fg)';
});
}
function escHtml(s) {
return String(s).replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;').replace(/"/g,'&quot;');
}
function fruDirtyCheck() {
var inputs = document.querySelectorAll('.fru-input');
var changed = 0;
inputs.forEach(function(el) { if (el.value !== el.dataset.original) changed++; });
var row = document.getElementById('fru-save-row');
var btn = document.getElementById('fru-save-btn');
if (changed > 0) {
row.style.display = '';
btn.textContent = 'Save (' + changed + ' changed)';
} else {
row.style.display = 'none';
}
}
function fruSave() {
var inputs = document.querySelectorAll('.fru-input');
var changes = [];
inputs.forEach(function(el) {
if (el.value !== el.dataset.original) {
changes.push({area: el.dataset.area, index: parseInt(el.dataset.index, 10), name: el.dataset.name, value: el.value});
}
});
if (!changes.length) return;
document.getElementById('fru-save-btn').disabled = true;
document.getElementById('fru-save-msg').textContent = 'Saving...';
fetch('/api/tools/ipmi-fru/write', {method:'POST', headers:{'Content-Type':'application/json'}, body: JSON.stringify({changes: changes})})
.then(function(r) {
if (!r.ok) return r.json().then(function(e) { throw new Error(e.error || r.statusText); });
return r.json();
})
.then(function(d) {
var taskId = d.task_id;
document.getElementById('fru-save-msg').textContent = 'Task ' + taskId + ' queued…';
var poll = setInterval(function() {
fetch('/api/tasks', {cache:'no-store'}).then(function(r) { return r.json(); }).then(function(tasks) {
var t = Array.isArray(tasks) ? tasks.find(function(x) { return x.id === taskId; }) : null;
if (!t) return;
if (t.status === 'done') {
clearInterval(poll);
document.getElementById('fru-save-msg').textContent = 'Done — backup saved to fru-backups/.';
document.getElementById('fru-save-btn').disabled = false;
inputs.forEach(function(el) { el.dataset.original = el.value; });
fruDirtyCheck();
} else if (t.status === 'failed') {
clearInterval(poll);
document.getElementById('fru-save-msg').textContent = 'Failed: ' + (t.error || 'unknown error');
document.getElementById('fru-save-btn').disabled = false;
}
});
}, 1500);
})
.catch(function(e) {
document.getElementById('fru-save-msg').textContent = 'Error: ' + e.message;
document.getElementById('fru-save-btn').disabled = false;
});
}
</script>`
}

View File

@@ -68,10 +68,9 @@ tbody tr:hover td{background:rgba(0,0,0,.03)}
.chip-warn{background:var(--warn-bg);color:var(--warn-fg);border:1px solid #c9ba9b} .chip-warn{background:var(--warn-bg);color:var(--warn-fg);border:1px solid #c9ba9b}
.chip-fail{background:var(--crit-bg);color:var(--crit-fg);border:1px solid var(--crit-border)} .chip-fail{background:var(--crit-bg);color:var(--crit-fg);border:1px solid var(--crit-border)}
.chip-unknown{background:var(--surface-2);color:var(--muted);border:1px solid var(--border)} .chip-unknown{background:var(--surface-2);color:var(--muted);border:1px solid var(--border)}
/* Tasks nav badge */ /* Nav separator and tasks count badge */
.tasks-nav-btn{display:flex;justify-content:space-between;align-items:center;padding:10px 16px;color:rgba(255,255,255,.55);font-size:12px;text-decoration:none;border-top:1px solid rgba(255,255,255,.12);margin-top:auto;transition:color .15s} .nav-sep{height:1px;background:rgba(255,255,255,.12);margin:6px 0}
.tasks-nav-btn:hover{color:#fff} .tasks-nav-count{background:var(--accent);color:#fff;border-radius:10px;padding:1px 7px;font-size:11px;font-weight:700;display:none;margin-left:auto}
.tasks-nav-count{background:var(--accent);color:#fff;border-radius:10px;padding:1px 7px;font-size:11px;font-weight:700;display:none}
.tasks-nav-count.active{display:inline} .tasks-nav-count.active{display:inline}
/* Output terminal */ /* Output terminal */
.terminal{background:#1b1c1d;border:1px solid rgba(0,0,0,.2);border-radius:4px;padding:14px;font-family:monospace;font-size:12px;color:#b5cea8;max-height:400px;overflow-y:auto;white-space:pre-wrap;word-break:break-all;user-select:text;-webkit-user-select:text} .terminal{background:#1b1c1d;border:1px solid rgba(0,0,0,.2);border-radius:4px;padding:14px;font-family:monospace;font-size:12px;color:#b5cea8;max-height:400px;overflow-y:auto;white-space:pre-wrap;word-break:break-all;user-select:text;-webkit-user-select:text}
@@ -98,15 +97,21 @@ tbody tr:hover td{background:rgba(0,0,0,.03)}
} }
func layoutNav(active string, buildLabel string) string { func layoutNav(active string, buildLabel string) string {
items := []struct{ id, label, href string }{ type navItem struct {
{"dashboard", "Dashboard", "/"}, id, label, href string
{"audit", "1. Audit", "/audit"}, sep bool
{"check", "2. Check", "/check"}, }
{"load", "3. Load", "/load"}, items := []navItem{
{"speed", "4. Speed", "/speed"}, {id: "dashboard", label: "Dashboard", href: "/"},
{"endurance", "5. Endurance", "/endurance"}, {id: "audit", label: "1. Audit", href: "/audit"},
{"tools", "6. Tools", "/tools"}, {id: "check", label: "2. Check", href: "/check"},
{"settings", "7. Settings", "/settings"}, {id: "load", label: "3. Load", href: "/load"},
{id: "burn", label: "4. Burn", href: "/burn"},
{id: "benchmark", label: "5. Benchmark", href: "/benchmark"},
{sep: true},
{id: "tasks", label: "Tasks", href: "/tasks"},
{id: "tools", label: "Tools", href: "/tools"},
{id: "settings", label: "Settings", href: "/settings"},
} }
var b strings.Builder var b strings.Builder
b.WriteString(`<aside class="sidebar">`) b.WriteString(`<aside class="sidebar">`)
@@ -126,19 +131,23 @@ func layoutNav(active string, buildLabel string) string {
} }
b.WriteString(`<nav class="nav">`) b.WriteString(`<nav class="nav">`)
for _, item := range items { for _, item := range items {
if item.sep {
b.WriteString(`<div class="nav-sep"></div>`)
continue
}
cls := "nav-item" cls := "nav-item"
if item.id == active { if item.id == active {
cls += " active" cls += " active"
} }
b.WriteString(fmt.Sprintf(`<a class="%s" href="%s">%s</a>`, cls, item.href, item.label)) if item.id == "tasks" {
b.WriteString(fmt.Sprintf(`<a class="%s" href="%s" id="tasks-nav-item">%s<span class="tasks-nav-count" id="tasks-nav-count"></span></a>`, cls, item.href, item.label))
} else {
b.WriteString(fmt.Sprintf(`<a class="%s" href="%s">%s</a>`, cls, item.href, item.label))
}
} }
b.WriteString(`</nav>`) b.WriteString(`</nav>`)
b.WriteString(`<a href="/tasks" class="tasks-nav-btn" id="tasks-nav-btn">`)
b.WriteString(`<span>Tasks</span>`)
b.WriteString(`<span class="tasks-nav-count" id="tasks-nav-count"></span>`)
b.WriteString(`</a>`)
b.WriteString(`<script>`) b.WriteString(`<script>`)
b.WriteString(`(function(){function u(){fetch('/api/tasks',{cache:'no-store'}).then(function(r){return r.json();}).then(function(d){var n=Array.isArray(d)?d.filter(function(t){return t.status==='pending'||t.status==='running';}).length:0;var c=document.getElementById('tasks-nav-count');var b=document.getElementById('tasks-nav-btn');if(c){c.textContent=n>0?String(n):'';c.className='tasks-nav-count'+(n>0?' active':'');}if(b){b.style.color=n>0?'#f6c90e':'';}}).catch(function(){});}u();setInterval(u,5000);})();`) b.WriteString(`(function(){function u(){fetch('/api/tasks',{cache:'no-store'}).then(function(r){return r.json();}).then(function(d){var n=Array.isArray(d)?d.filter(function(t){return t.status==='pending'||t.status==='running';}).length:0;var c=document.getElementById('tasks-nav-count');var el=document.getElementById('tasks-nav-item');if(c){c.textContent=n>0?String(n):'';c.className='tasks-nav-count'+(n>0?' active':'');}if(el){el.style.color=n>0?'#f6c90e':'';}}).catch(function(){});}u();setInterval(u,5000);})();`)
b.WriteString(`</script>`) b.WriteString(`</script>`)
b.WriteString(`</aside>`) b.WriteString(`</aside>`)
return b.String() return b.String()

View File

@@ -612,19 +612,6 @@ func renderPowerBenchmarkResultsCard(exportDir string) string {
return b.String() return b.String()
} }
// renderSpeed renders the Speed page (step 4): performance benchmarks. // renderSpeed and renderEndurance are legacy wrappers; canonical page is 5. Benchmark at /benchmark.
// Uses the same benchmark infrastructure; defaults to Standard profile (throughput/bandwidth). func renderSpeed(opts HandlerOptions) string { return renderBenchmark(opts) }
// For long-duration stability/overnight runs, see Endurance (step 5). func renderEndurance(opts HandlerOptions) string { return renderBenchmark(opts) }
func renderSpeed(opts HandlerOptions) string {
base := renderBenchmark(opts)
return `<div class="alert alert-info" style="margin-bottom:16px"><strong>Speed:</strong> Measures GPU compute throughput and memory bandwidth. For overnight stability testing, go to <a href="/endurance">5. Endurance</a>.</div>` + base
}
// renderEndurance renders the Endurance page (step 5): long-duration reliability tests.
// Focuses on Stability and Overnight profiles for multi-hour burn validation.
// For short load tests, see Load (step 3). For throughput measurement, see Speed (step 4).
func renderEndurance(opts HandlerOptions) string {
base := renderBenchmark(opts)
return `<div class="alert alert-warn" style="margin-bottom:16px"><strong>Endurance:</strong> Long-duration reliability tests — Stability (several hours) and Overnight (8+ h) profiles. These profiles run hardware at sustained load; results show whether the server holds its performance envelope over time.</div>
<div class="alert alert-info" style="margin-bottom:16px">Use the <strong>Stability</strong> or <strong>Overnight</strong> profile in the setup card below. The Standard profile is available too but is better suited for the <a href="/speed">4. Speed</a> page.</div>` + base
}

View File

@@ -1,13 +1,8 @@
package webui package webui
// renderLoad renders the Load page (step 3): sustained stress tests.
// For non-destructive status checks, see Check (step 2).
// For DCGM targeted diagnostics (targeted_stress, targeted_power, pulse), see Check → Validate mode.
func renderLoad() string { return renderBurn() }
func renderBurn() string { func renderBurn() string {
return `<div class="alert alert-warn" style="margin-bottom:16px"><strong>&#9888; Warning:</strong> Stress tests on this page run hardware at high load. Repeated or prolonged use may reduce hardware lifespan. Use only when necessary.</div> return `<div class="alert alert-warn" style="margin-bottom:16px"><strong>&#9888; Warning:</strong> Stress tests on this page run hardware at high load. Repeated or prolonged use may reduce hardware lifespan. Use only when necessary.</div>
<div class="alert alert-info" style="margin-bottom:16px"><strong>Scope:</strong> Load runs sustained GPU compute and CPU/memory stress recipes. DCGM diagnostics (<code>targeted_stress</code>, <code>targeted_power</code>, <code>pulse_test</code>) and NCCL/NVBandwidth are on the <a href="/check">2. Check</a> page. For overnight endurance runs, see <a href="/endurance">5. Endurance</a>.</div> <div class="alert alert-info" style="margin-bottom:16px"><strong>Scope:</strong> Burn runs sustained GPU compute and CPU/memory stress recipes. DCGM targeted diagnostics (<code>targeted_stress</code>, <code>targeted_power</code>, <code>pulse_test</code>) and NCCL/NVBandwidth are on the <a href="/load">3. Load</a> page. For performance benchmarks, see <a href="/benchmark">5. Benchmark</a>.</div>
<p style="color:var(--muted);font-size:13px;margin-bottom:16px">Tasks continue in the background — view progress in <a href="/tasks">Tasks</a>.</p> <p style="color:var(--muted);font-size:13px;margin-bottom:16px">Tasks continue in the background — view progress in <a href="/tasks">Tasks</a>.</p>
<div class="card" style="margin-bottom:16px"> <div class="card" style="margin-bottom:16px">

View File

@@ -402,96 +402,13 @@ loadNvidiaSelfHeal();
} }
func renderTools() string { func renderTools() string {
return `<div class="card" style="margin-bottom:16px"> return renderNVMeFormatCard() + `
<div class="card-head">System Install</div>
<div class="card-body">
<div style="margin-bottom:20px">
<div style="font-weight:600;margin-bottom:8px">Install to RAM</div>
<p id="boot-source-text" style="color:var(--muted);font-size:13px;margin-bottom:8px">Detecting boot source...</p>
<p id="ram-status-text" style="color:var(--muted);font-size:13px;margin-bottom:8px">Checking...</p>
<button id="ram-install-btn" class="btn btn-primary" onclick="installToRAM()" style="display:none">&#9654; Copy to RAM</button>
</div>
<div style="border-top:1px solid var(--line);padding-top:20px">
<div style="font-weight:600;margin-bottom:8px">Install to Disk</div>` +
renderInstallInline() + `
</div>
</div>
</div>
<script>
fetch('/api/system/ram-status').then(r=>r.json()).then(d=>{
const boot = document.getElementById('boot-source-text');
const txt = document.getElementById('ram-status-text');
const btn = document.getElementById('ram-install-btn');
let source = d.device || d.source || 'unknown source';
let kind = d.kind || 'unknown';
let label = source;
if (kind === 'ram') label = 'RAM';
else if (kind === 'usb') label = 'USB (' + source + ')';
else if (kind === 'cdrom') label = 'CD-ROM (' + source + ')';
else if (kind === 'disk') label = 'disk (' + source + ')';
else label = source;
boot.textContent = 'Current boot source: ' + label + '.';
txt.textContent = d.blocked_reason || d.message || 'Checking...';
if (d.status === 'ok' || d.in_ram) {
txt.style.color = 'var(--ok, green)';
} else if (d.status === 'failed') {
txt.style.color = 'var(--err, #b91c1c)';
} else {
txt.style.color = 'var(--muted)';
}
if (d.can_start_task) {
btn.style.display = '';
btn.disabled = false;
} else {
btn.style.display = 'none';
}
});
function installToRAM() {
document.getElementById('ram-install-btn').disabled = true;
fetch('/api/system/install-to-ram', {method:'POST'}).then(r=>r.json()).then(d=>{
window.location.href = '/tasks#' + d.task_id;
});
}
</script>
<div class="card"><div class="card-head">Support Bundle</div><div class="card-body">
<p style="font-size:13px;color:var(--muted);margin-bottom:12px">Downloads a tar.gz archive of all audit files, SAT results, and logs.</p>
` + renderSupportBundleInline() + `
<div style="border-top:1px solid var(--border);margin-top:16px;padding-top:16px">
<div style="font-weight:600;margin-bottom:8px">USB Black-Box</div>
` + renderUSBExportInline() + `
</div>
</div></div>
<div class="card"><div class="card-head">Tool Check <button class="btn btn-sm btn-secondary" onclick="checkTools()" style="margin-left:auto">&#8635; Check</button></div>
<div class="card-body"><div id="tools-table"><p style="color:var(--muted);font-size:13px">Checking...</p></div></div></div>
<div class="card"><div class="card-head">NVIDIA Self Heal</div><div class="card-body">` +
renderNvidiaSelfHealInline() + `</div></div>
<div class="card"><div class="card-head">Network</div><div class="card-body">` +
renderNetworkInline() + `</div></div>
<div class="card"><div class="card-head">Services</div><div class="card-body">` +
renderServicesInline() + `</div></div>
` + renderNVMeFormatCard() + `
` + renderSAADMICard() + ` ` + renderSAADMICard() + `
<script> ` + renderIPMIFRUCard() + `
function checkTools() {
document.getElementById('tools-table').innerHTML = '<p style="color:var(--muted);font-size:13px">Checking...</p>'; ` + renderRAIDMgmtCard()
fetch('/api/tools/check').then(r=>r.json()).then(tools => {
const rows = tools.map(t =>
'<tr><td>'+t.Name+'</td><td><span class="badge '+(t.OK ? 'badge-ok' : 'badge-err')+'">'+(t.OK ? '&#10003; '+t.Path : '&#10007; missing')+'</span></td></tr>'
).join('');
document.getElementById('tools-table').innerHTML =
'<table><tr><th>Tool</th><th>Status</th></tr>'+rows+'</table>';
});
}
checkTools();
</script>`
} }
func renderExportIndex(exportDir string) (string, error) { func renderExportIndex(exportDir string) (string, error) {

View File

@@ -7,34 +7,76 @@ func renderSettings(opts HandlerOptions) string {
if version == "" { if version == "" {
version = "dev" version = "dev"
} }
return `<div class="grid2"> return `<div class="card" style="margin-bottom:16px">
<div class="card-head">System Install</div>
<div class="card">
<div class="card-head">Blackbox Logging</div>
<div class="card-body"> <div class="card-body">
<p style="font-size:13px;color:var(--muted);margin-bottom:14px">Continuous hardware monitoring that writes a rolling log of sensor readings to the export directory. Useful for capturing thermal or power anomalies during long runs.</p> <div style="margin-bottom:20px">
<div style="display:flex;gap:8px;align-items:center"> <div style="font-weight:600;margin-bottom:8px">Install to RAM</div>
<button class="btn btn-primary btn-sm" onclick="blackboxToggle('enable')">Enable</button> <p id="boot-source-text" style="color:var(--muted);font-size:13px;margin-bottom:8px">Detecting boot source...</p>
<button class="btn btn-secondary btn-sm" onclick="blackboxToggle('disable')">Disable</button> <p id="ram-status-text" style="color:var(--muted);font-size:13px;margin-bottom:8px">Checking...</p>
<span id="blackbox-status" style="font-size:12px;color:var(--muted)">Loading...</span> <button id="ram-install-btn" class="btn btn-primary" onclick="installToRAM()" style="display:none">&#9654; Copy to RAM</button>
</div>
<div style="border-top:1px solid var(--line);padding-top:20px">
<div style="font-weight:600;margin-bottom:8px">Install to Disk</div>` +
renderInstallInline() + `
</div> </div>
</div> </div>
</div> </div>
<script>
fetch('/api/system/ram-status').then(r=>r.json()).then(d=>{
const boot = document.getElementById('boot-source-text');
const txt = document.getElementById('ram-status-text');
const btn = document.getElementById('ram-install-btn');
let kind = d.kind || 'unknown';
let source = d.device || d.source || 'unknown source';
let label = kind==='ram'?'RAM':kind==='usb'?'USB ('+source+')':kind==='cdrom'?'CD-ROM ('+source+')':kind==='disk'?'disk ('+source+')':source;
boot.textContent = 'Current boot source: ' + label + '.';
txt.textContent = d.blocked_reason || d.message || 'Checking...';
txt.style.color = (d.status==='ok'||d.in_ram)?'var(--ok,green)':d.status==='failed'?'var(--err,#b91c1c)':'var(--muted)';
if (d.can_start_task) { btn.style.display=''; btn.disabled=false; } else { btn.style.display='none'; }
});
function installToRAM() {
document.getElementById('ram-install-btn').disabled = true;
fetch('/api/system/install-to-ram', {method:'POST'}).then(r=>r.json()).then(d=>{
window.location.href = '/tasks#' + d.task_id;
});
}
</script>
<div class="card"><div class="card-head">Support Bundle</div><div class="card-body">
<p style="font-size:13px;color:var(--muted);margin-bottom:12px">Downloads a tar.gz archive of all audit files, SAT results, and logs.</p>
` + renderSupportBundleInline() + `
<div style="border-top:1px solid var(--border);margin-top:16px;padding-top:16px">
<div style="font-weight:600;margin-bottom:8px">USB Black-Box</div>
` + renderUSBExportInline() + `
</div>
</div></div>
<div class="card"><div class="card-head">Tool Check <button class="btn btn-sm btn-secondary" onclick="checkTools()" style="margin-left:auto">&#8635; Check</button></div>
<div class="card-body"><div id="tools-table"><p style="color:var(--muted);font-size:13px">Checking...</p></div></div></div>
<script>
function checkTools() {
document.getElementById('tools-table').innerHTML = '<p style="color:var(--muted);font-size:13px">Checking...</p>';
fetch('/api/tools/check').then(r=>r.json()).then(tools => {
const rows = tools.map(t =>
'<tr><td>'+t.Name+'</td><td><span class="badge '+(t.OK?'badge-ok':'badge-err')+'">'+(t.OK?'&#10003; '+t.Path:'&#10007; missing')+'</span></td></tr>'
).join('');
document.getElementById('tools-table').innerHTML = '<table><tr><th>Tool</th><th>Status</th></tr>'+rows+'</table>';
});
}
checkTools();
</script>
<div class="card"><div class="card-head">NVIDIA Self Heal</div><div class="card-body">` +
renderNvidiaSelfHealInline() + `</div></div>
<div class="card"><div class="card-head">Network</div><div class="card-body">` +
renderNetworkInline() + `</div></div>
<div class="card"><div class="card-head">Services</div><div class="card-body">` +
renderServicesInline() + `</div></div>
<div class="card"> <div class="card">
<div class="card-head">NVIDIA Recovery</div>
<div class="card-body">
<p style="font-size:13px;color:var(--muted);margin-bottom:14px">Reset NVIDIA GPU driver state. Use when <code>nvidia-smi</code> reports errors or GPUs appear stuck after a failed test.</p>
<div style="display:flex;gap:8px;align-items:center">
<button class="btn btn-danger btn-sm" onclick="nvidiaReset()">Reset NVIDIA Driver</button>
<span id="nvidia-reset-status" style="font-size:12px;color:var(--muted)"></span>
</div>
</div>
</div>
</div>
<div class="card" style="margin-top:0">
<div class="card-head">Build Info</div> <div class="card-head">Build Info</div>
<div class="card-body"> <div class="card-body">
<table style="width:auto"> <table style="width:auto">
@@ -46,32 +88,5 @@ func renderSettings(opts HandlerOptions) string {
</div> </div>
</div> </div>
<script> `
(function() {
fetch('/api/blackbox/status', {cache:'no-store'}).then(r => r.json()).then(d => {
var el = document.getElementById('blackbox-status');
if (el) el.textContent = d.enabled ? 'Enabled' : 'Disabled';
}).catch(() => {
var el = document.getElementById('blackbox-status');
if (el) el.textContent = 'Status unavailable';
});
})();
function blackboxToggle(action) {
var el = document.getElementById('blackbox-status');
if (el) el.textContent = 'Updating...';
fetch('/api/blackbox/' + action, {method:'POST', cache:'no-store'})
.then(r => r.json())
.then(d => { if (el) el.textContent = d.enabled ? 'Enabled' : 'Disabled'; })
.catch(err => { if (el) el.textContent = 'Error: ' + err.message; });
}
function nvidiaReset() {
var el = document.getElementById('nvidia-reset-status');
if (!confirm('Reset NVIDIA driver? This will interrupt any running GPU tasks.')) return;
if (el) el.textContent = 'Resetting...';
fetch('/api/gpu/nvidia-reset', {method:'POST', cache:'no-store'})
.then(r => r.json())
.then(d => { if (el) el.textContent = d.error ? ('Error: ' + d.error) : 'Done — driver reset.'; })
.catch(err => { if (el) el.textContent = 'Error: ' + err.message; });
}
</script>`
} }

View File

@@ -68,6 +68,14 @@ func validateTotalStressSec(n int) int {
} }
func renderValidate(opts HandlerOptions) string { func renderValidate(opts HandlerOptions) string {
return renderValidateMode(opts, false)
}
func renderValidateStress(opts HandlerOptions) string {
return renderValidateMode(opts, true)
}
func renderValidateMode(opts HandlerOptions, stressDefault bool) string {
inv := loadValidateInventory(opts) inv := loadValidateInventory(opts)
n := inv.NvidiaGPUCount n := inv.NvidiaGPUCount
validateTotalStr := validateFmtDur(validateTotalValidateSec(n)) validateTotalStr := validateFmtDur(validateTotalValidateSec(n))
@@ -76,26 +84,49 @@ func renderValidate(opts HandlerOptions) string {
if n > 0 { if n > 0 {
gpuNote = fmt.Sprintf(" (%d GPU)", n) gpuNote = fmt.Sprintf(" (%d GPU)", n)
} }
return `<div class="alert alert-info" style="margin-bottom:16px"><strong>Non-destructive:</strong> Validate tests collect diagnostics only. They do not write to disks, do not run sustained load, and do not increment hardware wear counters.</div> estStr := validateTotalStr
<p style="color:var(--muted);font-size:13px;margin-bottom:16px">Tasks continue in the background — view progress in <a href="/tasks">Tasks</a>.</p> if stressDefault {
estStr = stressTotalStr
}
alert := `<div class="alert alert-info" style="margin-bottom:16px"><strong>Non-destructive:</strong> Validate tests collect diagnostics only. They do not write to disks, do not run sustained load, and do not increment hardware wear counters.</div>`
if stressDefault {
alert = `<div class="alert alert-warn" style="margin-bottom:16px"><strong>&#9888; Stress mode:</strong> Runs extended load tests — CPU stress-ng, memory passes, DCGM targeted diagnostics. Higher wear than Validate.</div>`
}
<div class="card" style="margin-bottom:16px"> stressOnlyCards := ""
<div class="card-head">Validate Profile</div> if stressDefault {
<div class="card-body validate-profile-body"> stressOnlyCards = renderSATCard("nvidia-targeted-stress", "NVIDIA GPU Targeted Stress", "runNvidiaValidateSet('nvidia-targeted-stress')", "", renderValidateCardBody(
<div class="validate-profile-col"> inv.NVIDIA,
<div class="form-row" style="margin:12px 0 0"><label>Mode</label></div> `Runs a controlled NVIDIA DCGM load to check stability under moderate stress.`,
<label class="cb-row"><input type="radio" name="sat-mode" id="sat-mode-validate" value="validate" checked onchange="satModeChanged()"><span>Validate — quick non-destructive check</span></label> `<code>dcgmi diag targeted_stress</code>`,
<label class="cb-row"><input type="radio" name="sat-mode" id="sat-mode-stress" value="stress" onchange="satModeChanged()"><span>Stress — thorough load test (` + stressTotalStr + gpuNote + `)</span></label> validateFmtDur(platform.SATEstimatedNvidiaTargetedStressSec)+` (all GPUs simultaneously).`,
</div> )) +
<div class="validate-profile-col validate-profile-action"> renderSATCard("nvidia-targeted-power", "NVIDIA Targeted Power", "runNvidiaValidateSet('nvidia-targeted-power')", "", renderValidateCardBody(
<p style="color:var(--muted);font-size:12px;margin:0 0 10px">Runs validate modules sequentially. Validate: ` + validateTotalStr + gpuNote + `; Stress: ` + stressTotalStr + gpuNote + `. Estimates are based on real log data and scale with GPU count.</p> inv.NVIDIA,
<button type="button" class="btn btn-primary" onclick="runAllSAT()">Validate one by one</button> `Checks that the GPU can sustain its declared power delivery envelope. Pass/fail determined by DCGM.`,
<div style="margin-top:12px"> `<code>dcgmi diag targeted_power</code>`,
<span id="sat-all-status" style="font-size:12px;color:var(--muted)"></span> validateFmtDur(platform.SATEstimatedNvidiaTargetedPowerSec)+` (all GPUs simultaneously).`,
</div> )) +
</div> renderSATCard("nvidia-pulse", "NVIDIA PSU Pulse Test", "runNvidiaFabricValidate('nvidia-pulse')", "", renderValidateCardBody(
</div> inv.NVIDIA,
</div> `Tests power supply transient response by pulsing all GPUs simultaneously between idle and full load. Synchronous pulses across all GPUs create worst-case PSU load spikes — running per-GPU would miss PSU-level failures.`,
`<code>dcgmi diag pulse_test</code>`,
validateFmtDur(platform.SATEstimatedNvidiaPulseTestSec)+` (all GPUs simultaneously; measured on 8-GPU system).`,
))
}
satStressModeJS := "function satStressMode() { return false; }"
if stressDefault {
satStressModeJS = "function satStressMode() { return true; }"
}
return alert + `
<p style="color:var(--muted);font-size:13px;margin-bottom:16px">Tasks continue in the background — view progress in <a href="/tasks">Tasks</a>.</p>
<div style="display:flex;align-items:center;gap:12px;margin-bottom:16px">
<button type="button" class="btn btn-primary" onclick="runAllSAT()">Run All</button>
<span id="sat-all-status" style="font-size:12px;color:var(--muted)"></span>
<span style="font-size:12px;color:var(--muted)">est. ` + estStr + gpuNote + `</span>
</div>
<div class="grid3"> <div class="grid3">
` + renderSATCard("cpu", "CPU", "runSAT('cpu')", "", renderValidateCardBody( ` + renderSATCard("cpu", "CPU", "runSAT('cpu')", "", renderValidateCardBody(
@@ -122,7 +153,7 @@ func renderValidate(opts HandlerOptions) string {
<div class="card-head">NVIDIA GPU Selection</div> <div class="card-head">NVIDIA GPU Selection</div>
<div class="card-body"> <div class="card-body">
<p style="font-size:12px;color:var(--muted);margin:0 0 8px">` + inv.NVIDIA + `</p> <p style="font-size:12px;color:var(--muted);margin:0 0 8px">` + inv.NVIDIA + `</p>
<p style="font-size:12px;color:var(--muted);margin:0 0 10px">All NVIDIA validate tasks use only the GPUs selected here. The same selection is used by Validate one by one.</p> <p style="font-size:12px;color:var(--muted);margin:0 0 10px">All NVIDIA validate tasks use only the GPUs selected here. The same selection is used by Run All.</p>
<div style="display:flex;gap:8px;flex-wrap:wrap;margin-bottom:8px"> <div style="display:flex;gap:8px;flex-wrap:wrap;margin-bottom:8px">
<button class="btn btn-sm btn-secondary" type="button" onclick="satSelectAllGPUs()">Select All</button> <button class="btn btn-sm btn-secondary" type="button" onclick="satSelectAllGPUs()">Select All</button>
<button class="btn btn-sm btn-secondary" type="button" onclick="satSelectNoGPUs()">Clear</button> <button class="btn btn-sm btn-secondary" type="button" onclick="satSelectNoGPUs()">Clear</button>
@@ -143,46 +174,19 @@ func renderValidate(opts HandlerOptions) string {
validateFmtDur(platform.SATEstimatedNvidiaGPUValidateSec), validateFmtDur(platform.SATEstimatedNvidiaGPUValidateSec),
validateFmtDur(platform.SATEstimatedNvidiaGPUStressSec)), validateFmtDur(platform.SATEstimatedNvidiaGPUStressSec)),
)) + )) +
`<div id="sat-card-nvidia-targeted-stress">` + stressOnlyCards +
renderSATCard("nvidia-targeted-stress", "NVIDIA GPU Targeted Stress", "runNvidiaValidateSet('nvidia-targeted-stress')", "", renderValidateCardBody(
inv.NVIDIA,
`Runs a controlled NVIDIA DCGM load to check stability under moderate stress.`,
`<code>dcgmi diag targeted_stress</code>`,
"Skipped in Validate. Stress: " + validateFmtDur(platform.SATEstimatedNvidiaTargetedStressSec) + ` (all GPUs simultaneously).<p id="sat-ts-mode-hint" style="color:var(--warn-fg);font-size:12px;margin:8px 0 0">Only runs in Stress mode. Switch mode above to enable in Run All.</p>`,
)) +
`</div>` +
`<div id="sat-card-nvidia-targeted-power">` +
renderSATCard("nvidia-targeted-power", "NVIDIA Targeted Power", "runNvidiaValidateSet('nvidia-targeted-power')", "", renderValidateCardBody(
inv.NVIDIA,
`Checks that the GPU can sustain its declared power delivery envelope. Pass/fail determined by DCGM.`,
`<code>dcgmi diag targeted_power</code>`,
"Skipped in Validate. Stress: " + validateFmtDur(platform.SATEstimatedNvidiaTargetedPowerSec) + ` (all GPUs simultaneously).<p id="sat-tp-mode-hint" style="color:var(--warn-fg);font-size:12px;margin:8px 0 0">Only runs in Stress mode. Switch mode above to enable in Run All.</p>`,
)) +
`</div>` +
`<div id="sat-card-nvidia-pulse">` +
renderSATCard("nvidia-pulse", "NVIDIA PSU Pulse Test", "runNvidiaFabricValidate('nvidia-pulse')", "", renderValidateCardBody(
inv.NVIDIA,
`Tests power supply transient response by pulsing all GPUs simultaneously between idle and full load. Synchronous pulses across all GPUs create worst-case PSU load spikes — running per-GPU would miss PSU-level failures.`,
`<code>dcgmi diag pulse_test</code>`,
`Skipped in Validate. Stress: `+validateFmtDur(platform.SATEstimatedNvidiaPulseTestSec)+` (all GPUs simultaneously; measured on 8-GPU system).`+`<p id="sat-pt-mode-hint" style="color:var(--warn-fg);font-size:12px;margin:8px 0 0">Only runs in Stress mode. Switch mode above to enable in Run All.</p>`,
)) +
`</div>` +
`<div id="sat-card-nvidia-interconnect">` +
renderSATCard("nvidia-interconnect", "NVIDIA Interconnect (NCCL)", "runNvidiaFabricValidate('nvidia-interconnect')", "", renderValidateCardBody( renderSATCard("nvidia-interconnect", "NVIDIA Interconnect (NCCL)", "runNvidiaFabricValidate('nvidia-interconnect')", "", renderValidateCardBody(
inv.NVIDIA, inv.NVIDIA,
`Verifies NVLink/NVSwitch fabric bandwidth using NCCL all_reduce_perf across all selected GPUs. Pass/fail based on achieved bandwidth vs. theoretical.`, `Verifies NVLink/NVSwitch fabric bandwidth using NCCL all_reduce_perf across all selected GPUs. Pass/fail based on achieved bandwidth vs. theoretical.`,
`<code>all_reduce_perf</code> (NCCL tests)`, `<code>all_reduce_perf</code> (NCCL tests)`,
`Validate and Stress: `+validateFmtDur(platform.SATEstimatedNvidiaInterconnectSec)+` (all GPUs simultaneously, requires ≥2).`, validateFmtDur(platform.SATEstimatedNvidiaInterconnectSec)+` (all GPUs simultaneously, requires ≥2).`,
)) + )) +
`</div>` +
`<div id="sat-card-nvidia-bandwidth">` +
renderSATCard("nvidia-bandwidth", "NVIDIA Bandwidth (NVBandwidth)", "runNvidiaFabricValidate('nvidia-bandwidth')", "", renderValidateCardBody( renderSATCard("nvidia-bandwidth", "NVIDIA Bandwidth (NVBandwidth)", "runNvidiaFabricValidate('nvidia-bandwidth')", "", renderValidateCardBody(
inv.NVIDIA, inv.NVIDIA,
`Validates GPU memory copy and peer-to-peer bandwidth paths using NVBandwidth.`, `Validates GPU memory copy and peer-to-peer bandwidth paths using NVBandwidth.`,
`<code>nvbandwidth</code>`, `<code>nvbandwidth</code>`,
`Validate and Stress: `+validateFmtDur(platform.SATEstimatedNvidiaBandwidthSec)+` (all GPUs simultaneously; nvbandwidth runs all built-in tests without a time limit — duration set by the tool).`, validateFmtDur(platform.SATEstimatedNvidiaBandwidthSec)+` (all GPUs simultaneously; nvbandwidth runs all built-in tests without a time limit — duration set by the tool).`,
)) + )) +
`</div>` +
`</div> `</div>
<div class="grid3" style="margin-top:16px"> <div class="grid3" style="margin-top:16px">
` + renderSATCard("amd", "AMD GPU", "runAMDValidateSet()", "", renderValidateCardBody( ` + renderSATCard("amd", "AMD GPU", "runAMDValidateSet()", "", renderValidateCardBody(
@@ -197,36 +201,15 @@ func renderValidate(opts HandlerOptions) string {
<div class="card-body"><div id="sat-terminal" class="terminal"></div></div> <div class="card-body"><div id="sat-terminal" class="terminal"></div></div>
</div> </div>
<style> <style>
.validate-profile-body { display:grid; grid-template-columns:1fr 1fr 1fr; gap:24px; align-items:stretch; }
.validate-profile-col { min-width:0; display:flex; flex-direction:column; }
.validate-profile-action { display:flex; flex-direction:column; align-items:center; justify-content:center; }
.validate-card-body { padding:0; } .validate-card-body { padding:0; }
.validate-card-section { padding:12px 16px 0; } .validate-card-section { padding:12px 16px 0; }
.validate-card-section:last-child { padding-bottom:16px; } .validate-card-section:last-child { padding-bottom:16px; }
.sat-gpu-row { display:flex; align-items:flex-start; gap:8px; padding:6px 0; cursor:pointer; font-size:13px; } .sat-gpu-row { display:flex; align-items:flex-start; gap:8px; padding:6px 0; cursor:pointer; font-size:13px; }
.sat-gpu-row input[type=checkbox] { width:16px; height:16px; margin-top:2px; flex-shrink:0; } .sat-gpu-row input[type=checkbox] { width:16px; height:16px; margin-top:2px; flex-shrink:0; }
@media(max-width:900px){ .validate-profile-body { grid-template-columns:1fr; } }
</style> </style>
<script> <script>
let satES = null; let satES = null;
function satStressMode() { ` + satStressModeJS + `
return document.querySelector('input[name="sat-mode"]:checked')?.value === 'stress';
}
function satModeChanged() {
const stress = satStressMode();
[
{card: 'sat-card-nvidia-targeted-stress', hint: 'sat-ts-mode-hint'},
{card: 'sat-card-nvidia-targeted-power', hint: 'sat-tp-mode-hint'},
{card: 'sat-card-nvidia-pulse', hint: 'sat-pt-mode-hint'},
].forEach(function(item) {
const card = document.getElementById(item.card);
if (card) {
card.style.opacity = stress ? '1' : '0.5';
const hint = document.getElementById(item.hint);
if (hint) hint.style.display = stress ? 'none' : '';
}
});
}
function satLabels() { function satLabels() {
return {nvidia:'Validate GPU', 'nvidia-targeted-stress':'NVIDIA Targeted Stress (dcgmi diag targeted_stress)', 'nvidia-targeted-power':'NVIDIA Targeted Power (dcgmi diag targeted_power)', 'nvidia-pulse':'NVIDIA PSU Pulse Test (dcgmi diag pulse_test)', 'nvidia-interconnect':'NVIDIA Interconnect (NCCL all_reduce_perf)', 'nvidia-bandwidth':'NVIDIA Bandwidth (NVBandwidth)', memory:'Validate Memory', storage:'Validate Storage', cpu:'Validate CPU', amd:'Validate AMD GPU', 'amd-mem':'AMD GPU MEM Integrity', 'amd-bandwidth':'AMD GPU MEM Bandwidth'}; return {nvidia:'Validate GPU', 'nvidia-targeted-stress':'NVIDIA Targeted Stress (dcgmi diag targeted_stress)', 'nvidia-targeted-power':'NVIDIA Targeted Power (dcgmi diag targeted_power)', 'nvidia-pulse':'NVIDIA PSU Pulse Test (dcgmi diag pulse_test)', 'nvidia-interconnect':'NVIDIA Interconnect (NCCL all_reduce_perf)', 'nvidia-bandwidth':'NVIDIA Bandwidth (NVBandwidth)', memory:'Validate Memory', storage:'Validate Storage', cpu:'Validate CPU', amd:'Validate AMD GPU', 'amd-mem':'AMD GPU MEM Integrity', 'amd-bandwidth':'AMD GPU MEM Bandwidth'};
} }
@@ -667,7 +650,7 @@ func renderCheck(opts HandlerOptions) string {
if n > 0 { if n > 0 {
gpuNote = fmt.Sprintf(" (%d GPU)", n) gpuNote = fmt.Sprintf(" (%d GPU)", n)
} }
return `<div class="alert alert-info" style="margin-bottom:16px"><strong>Non-destructive:</strong> Check tests collect diagnostics only — no writes to disks, no sustained load, no hardware wear counters incremented. For stress testing, go to <a href="/load">3. Load</a>.</div> return `<div class="alert alert-info" style="margin-bottom:16px"><strong>Non-destructive:</strong> Check tests collect diagnostics only — no writes to disks, no sustained load, no hardware wear counters incremented. For stress testing, go to <a href="/burn">4. Burn</a>.</div>
<div style="display:flex;align-items:center;gap:12px;margin-bottom:16px"> <div style="display:flex;align-items:center;gap:12px;margin-bottom:16px">
<button type="button" class="btn btn-primary" onclick="runAllCheckSAT()">Run All Checks</button> <button type="button" class="btn btn-primary" onclick="runAllCheckSAT()">Run All Checks</button>
<span id="sat-all-status" style="font-size:12px;color:var(--muted)"></span> <span id="sat-all-status" style="font-size:12px;color:var(--muted)"></span>

View File

@@ -33,36 +33,36 @@ func renderPage(page string, opts HandlerOptions) string {
case "load": case "load":
pageID = "load" pageID = "load"
title = "3. Load" title = "3. Load"
body = renderLoad() body = renderValidateStress(opts)
case "speed": case "burn":
pageID = "speed" pageID = "burn"
title = "4. Speed" title = "4. Burn"
body = renderSpeed(opts) body = renderBurn()
case "endurance": case "benchmark":
pageID = "endurance" pageID = "benchmark"
title = "5. Endurance" title = "5. Benchmark"
body = renderEndurance(opts) body = renderBenchmark(opts)
case "tools": case "tools":
pageID = "tools" pageID = "tools"
title = "6. Tools" title = "Tools"
body = renderTools() body = renderTools()
case "settings": case "settings":
pageID = "settings" pageID = "settings"
title = "7. Settings" title = "Settings"
body = renderSettings(opts) body = renderSettings(opts)
// Legacy routes (redirected at HTTP level in handlePage; these are fallbacks) // Legacy routes (redirected at HTTP level in handlePage; these are fallbacks)
case "validate", "tests": case "validate", "tests":
pageID = "check"
title = "2. Check"
body = renderCheck(opts)
case "burn", "burn-in":
pageID = "load" pageID = "load"
title = "3. Load" title = "3. Load"
body = renderLoad() body = renderValidate(opts)
case "benchmark": case "burn-in":
pageID = "speed" pageID = "burn"
title = "4. Speed" title = "4. Burn"
body = renderSpeed(opts) body = renderBurn()
case "speed", "endurance":
pageID = "benchmark"
title = "5. Benchmark"
body = renderBenchmark(opts)
case "tasks": case "tasks":
pageID = "tasks" pageID = "tasks"
title = "Tasks" title = "Tasks"

View File

@@ -0,0 +1,689 @@
package webui
import (
"context"
"encoding/json"
"fmt"
"net/http"
"os"
"os/exec"
"regexp"
"strconv"
"strings"
"time"
)
// --- Response types ---
type raidDriveInfo struct {
Slot string `json:"slot,omitempty"`
Device string `json:"device,omitempty"`
Model string `json:"model,omitempty"`
SizeGB float64 `json:"size_gb,omitempty"`
Serial string `json:"serial,omitempty"`
State string `json:"state,omitempty"`
}
type raidArrayInfo struct {
Name string `json:"name"`
Level string `json:"level,omitempty"`
Members []string `json:"members"`
Degraded bool `json:"degraded"`
}
type raidControllerInfo struct {
ID string `json:"id"`
Type string `json:"type"`
Index int `json:"index"`
Model string `json:"model"`
ForeignDrives []raidDriveInfo `json:"foreign_drives"`
FreeDrives []raidDriveInfo `json:"free_drives"`
Arrays []raidArrayInfo `json:"arrays,omitempty"`
}
type raidStatusResp struct {
Controllers []raidControllerInfo `json:"controllers"`
}
// --- LSI/storcli detection ---
func detectLSIControllers() []raidControllerInfo {
ctrlOut, err := exec.Command("storcli64", "/call", "show", "J").Output()
if err != nil {
return nil
}
var ctrlDoc struct {
Controllers []struct {
ResponseData struct {
Basics struct {
Controller int `json:"Controller"`
Model string `json:"Model"`
} `json:"Basics"`
} `json:"Response Data"`
} `json:"Controllers"`
}
if err := json.Unmarshal(ctrlOut, &ctrlDoc); err != nil || len(ctrlDoc.Controllers) == 0 {
return nil
}
driveOut, _ := exec.Command("storcli64", "/call/eall/sall", "show", "all", "J").Output()
var driveDoc struct {
Controllers []struct {
ResponseData struct {
DriveInformation []struct {
EIDSlt string `json:"EID:Slt"`
State string `json:"State"`
Size string `json:"Size"`
Intf string `json:"Intf"`
Med string `json:"Med"`
Model string `json:"Model"`
SN string `json:"SN"`
} `json:"Drive Information"`
} `json:"Response Data"`
} `json:"Controllers"`
}
if len(driveOut) > 0 {
json.Unmarshal(driveOut, &driveDoc) //nolint:errcheck
}
var controllers []raidControllerInfo
for i, c := range ctrlDoc.Controllers {
ctrl := raidControllerInfo{
ID: fmt.Sprintf("lsi-%d", c.ResponseData.Basics.Controller),
Type: "lsi",
Index: c.ResponseData.Basics.Controller,
Model: c.ResponseData.Basics.Model,
ForeignDrives: []raidDriveInfo{},
FreeDrives: []raidDriveInfo{},
}
if ctrl.Model == "" {
ctrl.Model = fmt.Sprintf("LSI Controller %d", ctrl.Index)
}
if i < len(driveDoc.Controllers) {
for _, d := range driveDoc.Controllers[i].ResponseData.DriveInformation {
info := raidDriveInfo{
Slot: strings.TrimSpace(d.EIDSlt),
Model: strings.TrimSpace(d.Model),
State: strings.TrimSpace(d.State),
SizeGB: raidParseHumanSizeGB(d.Size),
Serial: strings.TrimSpace(d.SN),
}
switch strings.TrimSpace(d.State) {
case "Frgn":
ctrl.ForeignDrives = append(ctrl.ForeignDrives, info)
case "UGood", "JBOD":
ctrl.FreeDrives = append(ctrl.FreeDrives, info)
}
}
}
controllers = append(controllers, ctrl)
}
return controllers
}
// --- VROC/mdadm detection ---
var raidMDStatDegradedRx = regexp.MustCompile(`\[[U_]+\]`)
type mdStatEntry struct {
Name string
Level string
Members []string
Degraded bool
}
func parseRAIDMDStat(raw string) []mdStatEntry {
var entries []mdStatEntry
var cur *mdStatEntry
for _, line := range strings.Split(raw, "\n") {
if strings.HasPrefix(line, "Personalities") || strings.HasPrefix(line, "unused devices") {
continue
}
if idx := strings.Index(line, " : "); idx > 0 {
name := strings.TrimSpace(line[:idx])
rest := line[idx+3:]
entry := mdStatEntry{Name: name}
for _, tok := range strings.Fields(rest) {
if strings.HasPrefix(tok, "raid") || strings.HasPrefix(tok, "linear") {
entry.Level = tok
}
if bk := strings.Index(tok, "["); bk > 0 && strings.HasSuffix(tok, "]") {
entry.Members = append(entry.Members, tok[:bk])
}
}
entries = append(entries, entry)
cur = &entries[len(entries)-1]
continue
}
if cur != nil {
if m := raidMDStatDegradedRx.FindString(line); m != "" && strings.Contains(m, "_") {
cur.Degraded = true
}
}
}
return entries
}
func detectVROCController() *raidControllerInfo {
out, err := exec.Command("mdadm", "--detail-platform").CombinedOutput()
if err != nil && len(out) == 0 {
return nil
}
hasVROC := false
for _, line := range strings.Split(string(out), "\n") {
lower := strings.ToLower(line)
if strings.Contains(lower, "license") || strings.Contains(lower, "intel") || strings.Contains(lower, "platform") {
hasVROC = true
break
}
}
if !hasVROC {
return nil
}
ctrl := &raidControllerInfo{
ID: "vroc-0",
Type: "vroc",
Model: "Intel VROC",
ForeignDrives: []raidDriveInfo{},
FreeDrives: []raidDriveInfo{},
}
inArray := map[string]bool{}
raw, err := os.ReadFile("/proc/mdstat")
if err == nil {
for _, arr := range parseRAIDMDStat(string(raw)) {
ctrl.Arrays = append(ctrl.Arrays, raidArrayInfo{
Name: arr.Name,
Level: arr.Level,
Members: arr.Members,
Degraded: arr.Degraded,
})
for _, m := range arr.Members {
inArray[m] = true
}
}
}
lsblkOut, err := exec.Command("lsblk", "-J", "-d", "-o", "NAME,SIZE,TYPE,MODEL,SERIAL").Output()
if err == nil {
var lsblkDoc struct {
BlockDevices []struct {
Name string `json:"name"`
Size string `json:"size"`
Type string `json:"type"`
Model string `json:"model"`
Serial string `json:"serial"`
} `json:"blockdevices"`
}
if json.Unmarshal(lsblkOut, &lsblkDoc) == nil {
for _, d := range lsblkDoc.BlockDevices {
if d.Type != "disk" || inArray[d.Name] {
continue
}
ctrl.FreeDrives = append(ctrl.FreeDrives, raidDriveInfo{
Device: "/dev/" + d.Name,
Model: strings.TrimSpace(d.Model),
Serial: strings.TrimSpace(d.Serial),
State: "available",
})
}
}
}
return ctrl
}
// --- API handlers ---
func (h *handler) handleAPIRAIDStatus(w http.ResponseWriter, r *http.Request) {
resp := raidStatusResp{Controllers: []raidControllerInfo{}}
if lsi := detectLSIControllers(); len(lsi) > 0 {
resp.Controllers = append(resp.Controllers, lsi...)
}
if vroc := detectVROCController(); vroc != nil {
resp.Controllers = append(resp.Controllers, *vroc)
}
writeJSON(w, resp)
}
func (h *handler) handleAPIRAIDForeignAction(w http.ResponseWriter, r *http.Request) {
var req struct {
ControllerID string `json:"controller_id"`
Action string `json:"action"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
writeError(w, http.StatusBadRequest, "invalid JSON")
return
}
if req.Action != "import" && req.Action != "clear" {
writeError(w, http.StatusBadRequest, "action must be 'import' or 'clear'")
return
}
ctrlIdx, ok := parseLSIControllerIndex(req.ControllerID)
if !ok {
writeError(w, http.StatusBadRequest, "invalid controller_id")
return
}
target := "raid-foreign-clear"
name := fmt.Sprintf("RAID Foreign Clear (ctrl %d)", ctrlIdx)
if req.Action == "import" {
target = "raid-foreign-import"
name = fmt.Sprintf("RAID Foreign Import (ctrl %d)", ctrlIdx)
}
t := &Task{
ID: newJobID(target),
Name: name,
Target: target,
Priority: defaultTaskPriority(target, taskParams{}),
Status: TaskPending,
CreatedAt: time.Now(),
params: taskParams{RAIDController: ctrlIdx},
}
globalQueue.enqueue(t)
writeJSON(w, map[string]string{"task_id": t.ID})
}
func (h *handler) handleAPIRAIDCreateMirror(w http.ResponseWriter, r *http.Request) {
var req struct {
ControllerID string `json:"controller_id"`
Devices []string `json:"devices"`
ArrayName string `json:"array_name"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
writeError(w, http.StatusBadRequest, "invalid JSON")
return
}
if len(req.Devices) < 2 {
writeError(w, http.StatusBadRequest, "at least 2 devices required")
return
}
var target, name string
var params taskParams
switch {
case strings.HasPrefix(req.ControllerID, "lsi-"):
ctrlIdx, ok := parseLSIControllerIndex(req.ControllerID)
if !ok {
writeError(w, http.StatusBadRequest, "invalid controller_id")
return
}
target = "raid-lsi-create-mirror"
name = fmt.Sprintf("Create RAID 1 Mirror (LSI ctrl %d)", ctrlIdx)
params = taskParams{RAIDController: ctrlIdx, RAIDDevices: req.Devices}
case req.ControllerID == "vroc-0":
arrayName := strings.TrimSpace(req.ArrayName)
if arrayName == "" {
arrayName = "bee-mirror0"
}
target = "raid-vroc-create-mirror"
name = fmt.Sprintf("Create VROC RAID 1 (%s)", arrayName)
params = taskParams{RAIDDevices: req.Devices, RAIDArrayName: arrayName}
default:
writeError(w, http.StatusBadRequest, "unknown controller_id")
return
}
t := &Task{
ID: newJobID(target),
Name: name,
Target: target,
Priority: defaultTaskPriority(target, taskParams{}),
Status: TaskPending,
CreatedAt: time.Now(),
params: params,
}
globalQueue.enqueue(t)
writeJSON(w, map[string]string{"task_id": t.ID})
}
func parseLSIControllerIndex(id string) (int, bool) {
if !strings.HasPrefix(id, "lsi-") {
return 0, false
}
n, err := strconv.Atoi(strings.TrimPrefix(id, "lsi-"))
if err != nil || n < 0 {
return 0, false
}
return n, true
}
// --- Task runner functions ---
func runRAIDForeignClearTask(ctx context.Context, j *jobState, ctrl int) error {
j.append(fmt.Sprintf("Clearing foreign configuration on controller %d...", ctrl))
cmd := exec.CommandContext(ctx, "storcli64", fmt.Sprintf("/c%d/fall", ctrl), "del", "noprompt")
return streamCmdJob(j, cmd)
}
func runRAIDForeignImportTask(ctx context.Context, j *jobState, ctrl int) error {
j.append(fmt.Sprintf("Importing foreign configuration on controller %d...", ctrl))
cmd := exec.CommandContext(ctx, "storcli64", fmt.Sprintf("/c%d/fall", ctrl), "import", "noprompt")
return streamCmdJob(j, cmd)
}
func runRAIDLSICreateMirrorTask(ctx context.Context, j *jobState, ctrl int, drives []string) error {
driveList := strings.Join(drives, ",")
j.append(fmt.Sprintf("Creating RAID 1 on controller %d with drives: %s", ctrl, driveList))
cmd := exec.CommandContext(ctx, "storcli64",
fmt.Sprintf("/c%d", ctrl),
"add", "vd", "type=raid1",
fmt.Sprintf("drives=%s", driveList),
"pdperarray=2",
)
return streamCmdJob(j, cmd)
}
func runRAIDVROCCreateMirrorTask(ctx context.Context, j *jobState, devices []string, arrayName string) error {
if arrayName == "" {
arrayName = "bee-mirror0"
}
devPath := "/dev/md/" + arrayName
args := []string{
"--create", devPath,
"--level=1",
fmt.Sprintf("--raid-devices=%d", len(devices)),
"--run",
}
args = append(args, devices...)
j.append(fmt.Sprintf("Creating VROC RAID 1 array %s with: %s", devPath, strings.Join(devices, " ")))
cmd := exec.CommandContext(ctx, "mdadm", args...)
return streamCmdJob(j, cmd)
}
// raidParseHumanSizeGB parses storcli size strings like "1.818 TB", "745.211 GB".
func raidParseHumanSizeGB(s string) float64 {
s = strings.TrimSpace(s)
if s == "" {
return 0
}
upper := strings.ToUpper(s)
var mul float64
var numStr string
switch {
case strings.Contains(upper, " TB"):
mul = 1024
numStr = strings.TrimSpace(strings.SplitN(upper, " T", 2)[0])
case strings.Contains(upper, " GB"):
mul = 1
numStr = strings.TrimSpace(strings.SplitN(upper, " G", 2)[0])
case strings.Contains(upper, " MB"):
mul = 1.0 / 1024
numStr = strings.TrimSpace(strings.SplitN(upper, " M", 2)[0])
default:
return 0
}
v, err := strconv.ParseFloat(numStr, 64)
if err != nil {
return 0
}
return v * mul
}
// --- UI card ---
func renderRAIDMgmtCard() string {
return `<div class="card"><div class="card-head card-head-actions">RAID Controller Management<div class="card-head-buttons"><button class="btn btn-sm btn-secondary" onclick="raidLoad()">&#8635; Refresh</button></div></div><div class="card-body">
<div id="raid-status" style="font-size:13px;color:var(--muted);margin-bottom:8px">Loading...</div>
<div id="raid-content"></div>
<div id="raid-out-wrap" style="display:none;margin-top:14px">
<div style="display:flex;align-items:center;justify-content:space-between;margin-bottom:4px">
<span id="raid-out-label" style="font-size:12px;font-weight:600;color:var(--muted)">Output</span>
<span id="raid-out-status" style="font-size:12px"></span>
</div>
<div id="raid-terminal" class="terminal" style="max-height:260px;width:100%;box-sizing:border-box"></div>
</div>
</div></div>
<script>
(function(){
function escHtml(s) {
return String(s||'').replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;').replace(/"/g,'&quot;');
}
var _raidControllers = [];
function raidLoad() {
var status = document.getElementById('raid-status');
var content = document.getElementById('raid-content');
status.textContent = 'Detecting RAID controllers...';
status.style.color = 'var(--muted)';
content.innerHTML = '';
fetch('/api/tools/raid/status', {cache:'no-store'})
.then(function(r) {
if (!r.ok) return r.json().then(function(e) { throw new Error(e.error || r.statusText); });
return r.json();
})
.then(function(data) {
_raidControllers = data.controllers || [];
if (_raidControllers.length === 0) {
status.textContent = 'No RAID controllers detected.';
return;
}
status.textContent = _raidControllers.length + ' controller(s) detected.';
content.innerHTML = _raidControllers.map(function(c, i) {
return raidRenderController(c, i);
}).join('<hr style="margin:16px 0;border:none;border-top:1px solid var(--border)">');
})
.catch(function(e) {
status.textContent = 'Error: ' + e.message;
status.style.color = 'var(--crit-fg)';
});
}
function raidRenderController(c, idx) {
var html = '';
var typeLabel = c.type === 'lsi' ? 'LSI / Broadcom' : 'Intel VROC';
html += '<div style="font-weight:600;font-size:13px;margin-bottom:10px">' + typeLabel + ' &mdash; ' + escHtml(c.model) + '</div>';
if (c.type === 'lsi') {
var foreign = c.foreign_drives || [];
if (foreign.length > 0) {
html += '<div style="background:var(--warn-bg,rgba(240,192,0,0.1));border:1px solid var(--warn-border,#c8a800);border-radius:4px;padding:10px 12px;margin-bottom:12px">';
html += '<div style="font-weight:600;font-size:13px;margin-bottom:6px">&#9888;&#xFE0E; Foreign Configuration Detected (' + foreign.length + ' drive(s))</div>';
html += '<table style="margin-bottom:10px"><tr><th>Slot</th><th>Model</th><th>Size</th><th>State</th></tr>';
foreign.forEach(function(d) {
html += '<tr>'
+ '<td style="font-family:monospace">' + escHtml(d.slot) + '</td>'
+ '<td>' + escHtml(d.model||'—') + '</td>'
+ '<td>' + (d.size_gb > 0 ? Math.round(d.size_gb) + ' GB' : '—') + '</td>'
+ '<td><span class="badge badge-warn">' + escHtml(d.state) + '</span></td>'
+ '</tr>';
});
html += '</table>';
html += '<div style="display:flex;gap:8px;flex-wrap:wrap">';
html += '<button class="btn btn-sm btn-primary" onclick="raidForeignAction(\'' + escHtml(c.id) + '\',\'import\',this)">Import Foreign Config</button>';
html += '<button class="btn btn-sm btn-secondary" style="color:var(--crit-fg)" onclick="raidForeignAction(\'' + escHtml(c.id) + '\',\'clear\',this)">Clear Foreign Config</button>';
html += '</div></div>';
}
html += raidRenderMirrorSection(c, idx, 'lsi');
}
if (c.type === 'vroc') {
var arrays = c.arrays || [];
if (arrays.length > 0) {
html += '<div style="font-size:12px;font-weight:600;color:var(--muted);margin-bottom:6px;text-transform:uppercase;letter-spacing:.04em">Active Arrays</div>';
html += '<table style="margin-bottom:14px"><tr><th>Name</th><th>Level</th><th>Members</th><th>Status</th></tr>';
arrays.forEach(function(a) {
var badge = a.degraded
? '<span class="badge badge-err">Degraded</span>'
: '<span class="badge badge-ok">OK</span>';
html += '<tr>'
+ '<td style="font-family:monospace">' + escHtml(a.name) + '</td>'
+ '<td>' + escHtml(a.level||'—') + '</td>'
+ '<td style="font-family:monospace;font-size:12px">' + (a.members||[]).map(escHtml).join(', ') + '</td>'
+ '<td>' + badge + '</td>'
+ '</tr>';
});
html += '</table>';
}
html += raidRenderMirrorSection(c, idx, 'vroc');
}
return html;
}
function raidRenderMirrorSection(c, idx, kind) {
var free = c.free_drives || [];
var html = '<div style="font-size:12px;font-weight:600;color:var(--muted);margin-bottom:6px;text-transform:uppercase;letter-spacing:.04em">Create RAID 1 Mirror</div>';
if (free.length < 2) {
html += '<p style="font-size:13px;color:var(--muted)">No unconfigured drives available (need at least 2).</p>';
return html;
}
html += '<p style="font-size:13px;color:var(--muted);margin-bottom:8px">Select exactly 2 drives:</p>';
html += '<div>';
free.forEach(function(d) {
var val = kind === 'lsi' ? d.slot : d.device;
var label = kind === 'lsi'
? escHtml(d.slot) + (d.model ? ' &mdash; ' + escHtml(d.model) : '') + (d.size_gb > 0 ? ' (' + Math.round(d.size_gb) + ' GB)' : '')
: escHtml(d.device) + (d.model ? ' &mdash; ' + escHtml(d.model) : '') + (d.serial ? ' [' + escHtml(d.serial) + ']' : '');
html += '<label style="display:block;margin-bottom:4px;font-size:13px;cursor:pointer">'
+ '<input type="checkbox" class="raid-mirror-check-' + idx + '" value="' + escHtml(val) + '"> '
+ label + '</label>';
});
html += '</div>';
if (kind === 'vroc') {
html += '<div style="margin-top:10px;display:flex;align-items:center;gap:8px;flex-wrap:wrap">'
+ '<label style="font-size:13px">Array name:&nbsp;<input type="text" id="vroc-arrayname-' + idx + '" value="bee-mirror0" style="font-family:monospace;padding:2px 6px;width:140px"></label>';
} else {
html += '<div style="margin-top:10px;display:flex;gap:8px">';
}
html += '<button class="btn btn-sm btn-primary raid-mirror-btn-' + idx + '" onclick="raidCreateMirror(\'' + escHtml(c.id) + '\',' + idx + ',\'' + kind + '\',this)">Create Mirror</button>';
html += '</div>';
return html;
}
function raidForeignAction(ctrlID, action, btn) {
if (action === 'clear' && !confirm('Clear foreign configuration on ' + ctrlID + '?\n\nThis will DELETE the foreign RAID metadata. Data on those drives may become inaccessible.')) {
return;
}
var original = btn ? btn.textContent : '';
if (btn) { btn.disabled = true; btn.textContent = action === 'import' ? 'Importing...' : 'Clearing...'; }
raidShowOutput('RAID foreign ' + action, '', '');
fetch('/api/tools/raid/foreign', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({controller_id: ctrlID, action: action})
})
.then(function(r) { return r.json(); })
.then(function(d) {
if (d.error) throw new Error(d.error);
var actionLabel = action === 'import' ? 'Import foreign config' : 'Clear foreign config';
raidStreamTask(d.task_id, actionLabel, function() {
if (btn) { btn.disabled = false; btn.textContent = original; }
raidLoad();
});
})
.catch(function(e) {
raidShowOutput('Error', 'failed', e.message);
if (btn) { btn.disabled = false; btn.textContent = original; }
});
}
function raidCreateMirror(ctrlID, idx, kind, btn) {
var checks = document.querySelectorAll('.raid-mirror-check-' + idx + ':checked');
if (checks.length !== 2) {
alert('Select exactly 2 drives.');
return;
}
var devices = Array.from(checks).map(function(c) { return c.value; });
var arrayName = '';
if (kind === 'vroc') {
var nameEl = document.getElementById('vroc-arrayname-' + idx);
arrayName = nameEl ? nameEl.value.trim() : 'bee-mirror0';
if (!arrayName) arrayName = 'bee-mirror0';
}
var original = btn ? btn.textContent : '';
if (btn) { btn.disabled = true; btn.textContent = 'Creating...'; }
raidShowOutput('Create RAID 1', '', '');
fetch('/api/tools/raid/create-mirror', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({controller_id: ctrlID, devices: devices, array_name: arrayName})
})
.then(function(r) { return r.json(); })
.then(function(d) {
if (d.error) throw new Error(d.error);
raidStreamTask(d.task_id, 'Create RAID 1 mirror', function() {
if (btn) { btn.disabled = false; btn.textContent = original; }
raidLoad();
});
})
.catch(function(e) {
raidShowOutput('Error', 'failed', e.message);
if (btn) { btn.disabled = false; btn.textContent = original; }
});
}
function raidShowOutput(label, status, text) {
var wrap = document.getElementById('raid-out-wrap');
var labelEl = document.getElementById('raid-out-label');
var statusEl = document.getElementById('raid-out-status');
var term = document.getElementById('raid-terminal');
wrap.style.display = 'block';
labelEl.textContent = label;
if (status === 'ok') {
statusEl.textContent = '✓ done';
statusEl.style.color = 'var(--ok-fg)';
} else if (status === 'failed') {
statusEl.textContent = '✗ failed';
statusEl.style.color = 'var(--crit-fg)';
} else {
statusEl.textContent = status;
statusEl.style.color = 'var(--muted)';
}
if (text !== undefined) {
term.textContent = text;
term.scrollTop = term.scrollHeight;
}
}
function raidStreamTask(taskID, taskName, onDone) {
var term = document.getElementById('raid-terminal');
term.textContent = '';
raidShowOutput(taskName || 'Running…', 'running…', undefined);
var es = new EventSource('/api/tasks/' + taskID + '/stream');
es.onmessage = function(e) {
term.textContent += e.data + '\n';
term.scrollTop = term.scrollHeight;
};
es.addEventListener('done', function(e) {
es.close();
if (!e.data) {
raidShowOutput(taskName, 'ok', undefined);
} else {
raidShowOutput(taskName, 'failed', undefined);
term.textContent += '\nFailed: ' + e.data;
term.scrollTop = term.scrollHeight;
}
if (onDone) onDone();
});
es.onerror = function() {
es.close();
raidShowOutput(taskName, 'failed', undefined);
if (onDone) onDone();
};
}
window.raidLoad = raidLoad;
raidLoad();
})();
</script>`
}

View File

@@ -28,7 +28,8 @@ var (
shnRE = regexp.MustCompile(`^[A-Za-z0-9_]{1,16}$`) shnRE = regexp.MustCompile(`^[A-Za-z0-9_]{1,16}$`)
dmiSectionRE = regexp.MustCompile(`^\[(.+?)\]$`) dmiSectionRE = regexp.MustCompile(`^\[(.+?)\]$`)
// Item Name {SHN} = value // comment // Item Name {SHN} = value // comment
dmiItemRE = regexp.MustCompile(`^(.+?)\s+\{([A-Za-z0-9]{1,16})\}\s*=\s*(.*)$`) // SHN may contain parentheses, e.g. {PS(4)LC} for power supply fields
dmiItemRE = regexp.MustCompile(`^(.+?)\s+\{([A-Za-z0-9_()\-]{1,24})\}\s*=\s*(.*)$`)
dmiVersionRE = regexp.MustCompile(`(?i)^version\s*=`) dmiVersionRE = regexp.MustCompile(`(?i)^version\s*=`)
) )
@@ -91,7 +92,9 @@ func (h *handler) handleAPISAADMIRead(w http.ResponseWriter, r *http.Request) {
defer os.RemoveAll(tmpDir) defer os.RemoveAll(tmpDir)
dmiFile := filepath.Join(tmpDir, "DMI.txt") dmiFile := filepath.Join(tmpDir, "DMI.txt")
out, err := exec.CommandContext(ctx, "saa", "-c", "GetDmiInfo", "--file", dmiFile, "--overwrite").CombinedOutput() cmd := exec.CommandContext(ctx, "saa", "-c", "GetDmiInfo", "--file", dmiFile, "--overwrite")
cmd.Dir = "/usr/local/bin"
out, err := cmd.CombinedOutput()
if err != nil { if err != nil {
msg := strings.TrimSpace(string(out)) msg := strings.TrimSpace(string(out))
if msg == "" { if msg == "" {
@@ -168,7 +171,9 @@ func runSAADMIWriteTask(ctx context.Context, j *jobState, exportDir string, p ta
dmiFile := filepath.Join(tmpDir, "DMI.txt") dmiFile := filepath.Join(tmpDir, "DMI.txt")
j.append("Reading current DMI configuration...") j.append("Reading current DMI configuration...")
if err := streamCmdJob(j, exec.CommandContext(ctx, "saa", "-c", "GetDmiInfo", "--file", dmiFile, "--overwrite")); err != nil { getCmd := exec.CommandContext(ctx, "saa", "-c", "GetDmiInfo", "--file", dmiFile, "--overwrite")
getCmd.Dir = "/usr/local/bin"
if err := streamCmdJob(j, getCmd); err != nil {
return fmt.Errorf("GetDmiInfo: %w", err) return fmt.Errorf("GetDmiInfo: %w", err)
} }
@@ -190,13 +195,16 @@ func runSAADMIWriteTask(ctx context.Context, j *jobState, exportDir string, p ta
for _, c := range p.SAADmiChanges { for _, c := range p.SAADmiChanges {
j.append("Setting " + c.Shn + " = " + c.Value) j.append("Setting " + c.Shn + " = " + c.Value)
cmd := exec.CommandContext(ctx, "saa", "-c", "EditDmiInfo", "--file", dmiFile, "--shn", c.Shn, "--value", c.Value) cmd := exec.CommandContext(ctx, "saa", "-c", "EditDmiInfo", "--file", dmiFile, "--shn", c.Shn, "--value", c.Value)
cmd.Dir = "/usr/local/bin"
if err := streamCmdJob(j, cmd); err != nil { if err := streamCmdJob(j, cmd); err != nil {
return fmt.Errorf("EditDmiInfo %s: %w", c.Shn, err) return fmt.Errorf("EditDmiInfo %s: %w", c.Shn, err)
} }
} }
j.append("Applying changes to hardware...") j.append("Applying changes to hardware...")
if err := streamCmdJob(j, exec.CommandContext(ctx, "saa", "-c", "ChangeDmiInfo", "--file", dmiFile)); err != nil { changeCmd := exec.CommandContext(ctx, "saa", "-c", "ChangeDmiInfo", "--file", dmiFile)
changeCmd.Dir = "/usr/local/bin"
if err := streamCmdJob(j, changeCmd); err != nil {
return fmt.Errorf("ChangeDmiInfo: %w", err) return fmt.Errorf("ChangeDmiInfo: %w", err)
} }

View File

@@ -316,6 +316,11 @@ func NewHandler(opts HandlerOptions) http.Handler {
mux.HandleFunc("POST /api/tools/nvme-format/run", h.handleAPINVMeFormatRun) mux.HandleFunc("POST /api/tools/nvme-format/run", h.handleAPINVMeFormatRun)
mux.HandleFunc("GET /api/tools/saa-dmi", h.handleAPISAADMIRead) mux.HandleFunc("GET /api/tools/saa-dmi", h.handleAPISAADMIRead)
mux.HandleFunc("POST /api/tools/saa-dmi/write", h.handleAPISAADMIWrite) mux.HandleFunc("POST /api/tools/saa-dmi/write", h.handleAPISAADMIWrite)
mux.HandleFunc("GET /api/tools/ipmi-fru", h.handleAPIIPMIFRURead)
mux.HandleFunc("POST /api/tools/ipmi-fru/write", h.handleAPIIPMIFRUWrite)
mux.HandleFunc("GET /api/tools/raid/status", h.handleAPIRAIDStatus)
mux.HandleFunc("POST /api/tools/raid/foreign", h.handleAPIRAIDForeignAction)
mux.HandleFunc("POST /api/tools/raid/create-mirror", h.handleAPIRAIDCreateMirror)
// GPU presence / tools // GPU presence / tools
mux.HandleFunc("GET /api/gpu/presence", h.handleAPIGPUPresence) mux.HandleFunc("GET /api/gpu/presence", h.handleAPIGPUPresence)
@@ -1424,13 +1429,13 @@ func (h *handler) handlePage(w http.ResponseWriter, r *http.Request) {
// Redirect legacy routes to new named pages // Redirect legacy routes to new named pages
switch page { switch page {
case "validate", "tests": case "validate", "tests":
http.Redirect(w, r, "/check", http.StatusMovedPermanently)
return
case "burn", "burn-in":
http.Redirect(w, r, "/load", http.StatusMovedPermanently) http.Redirect(w, r, "/load", http.StatusMovedPermanently)
return return
case "benchmark": case "burn-in":
http.Redirect(w, r, "/speed", http.StatusMovedPermanently) http.Redirect(w, r, "/burn", http.StatusMovedPermanently)
return
case "speed", "endurance":
http.Redirect(w, r, "/benchmark", http.StatusMovedPermanently)
return return
} }
body := renderPage(page, h.opts) body := renderPage(page, h.opts)

View File

@@ -666,54 +666,64 @@ func TestTasksPageRendersOpenLinksAndPaginationControls(t *testing.T) {
func TestToolsPageRendersNvidiaSelfHealSection(t *testing.T) { func TestToolsPageRendersNvidiaSelfHealSection(t *testing.T) {
handler := NewHandler(HandlerOptions{}) handler := NewHandler(HandlerOptions{})
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/tools", nil)) // /tools: only NVMe Block Format and Supermicro DMI remain
if rec.Code != http.StatusOK { recTools := httptest.NewRecorder()
t.Fatalf("status=%d", rec.Code) handler.ServeHTTP(recTools, httptest.NewRequest(http.MethodGet, "/tools", nil))
if recTools.Code != http.StatusOK {
t.Fatalf("tools status=%d", recTools.Code)
} }
body := rec.Body.String() toolsBody := recTools.Body.String()
if !strings.Contains(body, `NVIDIA Self Heal`) { if !strings.Contains(toolsBody, `NVMe Block Format`) {
t.Fatalf("tools page missing nvidia self heal section: %s", body) t.Fatalf("tools page missing nvme block format section: %s", toolsBody)
} }
if !strings.Contains(body, `Restart GPU Drivers`) { if !strings.Contains(toolsBody, `/api/tools/nvme-formats`) || !strings.Contains(toolsBody, `/api/tools/nvme-format/run`) {
t.Fatalf("tools page missing restart gpu drivers button: %s", body) t.Fatalf("tools page missing nvme format api usage: %s", toolsBody)
} }
if !strings.Contains(body, `nvidiaRestartDrivers()`) {
t.Fatalf("tools page missing nvidiaRestartDrivers action: %s", body) // /settings: system install, support bundle, tool check, nvidia self heal, network, services
recSettings := httptest.NewRecorder()
handler.ServeHTTP(recSettings, httptest.NewRequest(http.MethodGet, "/settings", nil))
if recSettings.Code != http.StatusOK {
t.Fatalf("settings status=%d", recSettings.Code)
} }
if !strings.Contains(body, `/api/gpu/nvidia-status`) { settingsBody := recSettings.Body.String()
t.Fatalf("tools page missing nvidia status api usage: %s", body) if !strings.Contains(settingsBody, `NVIDIA Self Heal`) {
t.Fatalf("settings page missing nvidia self heal section: %s", settingsBody)
} }
if !strings.Contains(body, `nvidiaResetGPU(`) { if !strings.Contains(settingsBody, `Restart GPU Drivers`) {
t.Fatalf("tools page missing nvidiaResetGPU action: %s", body) t.Fatalf("settings page missing restart gpu drivers button: %s", settingsBody)
} }
if !strings.Contains(body, `id="boot-source-text"`) { if !strings.Contains(settingsBody, `nvidiaRestartDrivers()`) {
t.Fatalf("tools page missing boot source field: %s", body) t.Fatalf("settings page missing nvidiaRestartDrivers action: %s", settingsBody)
} }
if !strings.Contains(body, `USB Black-Box`) { if !strings.Contains(settingsBody, `/api/gpu/nvidia-status`) {
t.Fatalf("tools page missing usb black-box section: %s", body) t.Fatalf("settings page missing nvidia status api usage: %s", settingsBody)
} }
if !strings.Contains(body, `/api/blackbox/status`) { if !strings.Contains(settingsBody, `nvidiaResetGPU(`) {
t.Fatalf("tools page missing black-box status api usage: %s", body) t.Fatalf("settings page missing nvidiaResetGPU action: %s", settingsBody)
} }
if !strings.Contains(body, `NVMe Block Format`) { if !strings.Contains(settingsBody, `id="boot-source-text"`) {
t.Fatalf("tools page missing nvme block format section: %s", body) t.Fatalf("settings page missing boot source field: %s", settingsBody)
} }
if !strings.Contains(body, `/api/tools/nvme-formats`) || !strings.Contains(body, `/api/tools/nvme-format/run`) { if !strings.Contains(settingsBody, `USB Black-Box`) {
t.Fatalf("tools page missing nvme format api usage: %s", body) t.Fatalf("settings page missing usb black-box section: %s", settingsBody)
}
if !strings.Contains(settingsBody, `/api/blackbox/status`) {
t.Fatalf("settings page missing black-box status api usage: %s", settingsBody)
} }
} }
func TestBenchmarkPageRendersGPUSelectionControls(t *testing.T) { func TestBenchmarkPageRendersGPUSelectionControls(t *testing.T) {
handler := NewHandler(HandlerOptions{}) handler := NewHandler(HandlerOptions{})
rec := httptest.NewRecorder() rec := httptest.NewRecorder()
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/speed", nil)) handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/benchmark", nil))
if rec.Code != http.StatusOK { if rec.Code != http.StatusOK {
t.Fatalf("status=%d", rec.Code) t.Fatalf("status=%d", rec.Code)
} }
body := rec.Body.String() body := rec.Body.String()
for _, needle := range []string{ for _, needle := range []string{
`href="/speed"`, `href="/benchmark"`,
`id="benchmark-gpu-list"`, `id="benchmark-gpu-list"`,
`/api/gpu/nvidia`, `/api/gpu/nvidia`,
`/api/bee-bench/nvidia/perf/run`, `/api/bee-bench/nvidia/perf/run`,
@@ -769,7 +779,7 @@ func TestBenchmarkPageRendersSavedResultsTable(t *testing.T) {
handler := NewHandler(HandlerOptions{ExportDir: exportDir}) handler := NewHandler(HandlerOptions{ExportDir: exportDir})
rec := httptest.NewRecorder() rec := httptest.NewRecorder()
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/speed", nil)) handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/benchmark", nil))
if rec.Code != http.StatusOK { if rec.Code != http.StatusOK {
t.Fatalf("status=%d", rec.Code) t.Fatalf("status=%d", rec.Code)
} }
@@ -834,10 +844,10 @@ func TestCheckPageRendersNvidiaFabricCards(t *testing.T) {
} }
} }
func TestLoadPageRendersGoalBasedNVIDIACards(t *testing.T) { func TestBurnPageRendersGoalBasedNVIDIACards(t *testing.T) {
handler := NewHandler(HandlerOptions{}) handler := NewHandler(HandlerOptions{})
rec := httptest.NewRecorder() rec := httptest.NewRecorder()
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/load", nil)) handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/burn", nil))
if rec.Code != http.StatusOK { if rec.Code != http.StatusOK {
t.Fatalf("status=%d", rec.Code) t.Fatalf("status=%d", rec.Code)
} }

View File

@@ -388,6 +388,28 @@ func executeTaskWithOptions(opts *HandlerOptions, t *Task, j *jobState, ctx cont
break break
} }
err = runSAADMIWriteTask(ctx, j, opts.ExportDir, t.params) err = runSAADMIWriteTask(ctx, j, opts.ExportDir, t.params)
case "ipmi-fru-write":
if len(t.params.FRUChanges) == 0 {
err = fmt.Errorf("no changes provided")
break
}
err = runIPMIFRUWriteTask(ctx, j, opts.ExportDir, t.params)
case "raid-foreign-clear":
err = runRAIDForeignClearTask(ctx, j, t.params.RAIDController)
case "raid-foreign-import":
err = runRAIDForeignImportTask(ctx, j, t.params.RAIDController)
case "raid-lsi-create-mirror":
if len(t.params.RAIDDevices) < 2 {
err = fmt.Errorf("at least 2 drives required")
break
}
err = runRAIDLSICreateMirrorTask(ctx, j, t.params.RAIDController, t.params.RAIDDevices)
case "raid-vroc-create-mirror":
if len(t.params.RAIDDevices) < 2 {
err = fmt.Errorf("at least 2 devices required")
break
}
err = runRAIDVROCCreateMirrorTask(ctx, j, t.params.RAIDDevices, t.params.RAIDArrayName)
default: default:
j.append("ERROR: unknown target: " + t.Target) j.append("ERROR: unknown target: " + t.Target)
j.finish("unknown target") j.finish("unknown target")

View File

@@ -140,7 +140,11 @@ type taskParams struct {
Device string `json:"device,omitempty"` // for install Device string `json:"device,omitempty"` // for install
LBAF int `json:"lbaf,omitempty"` LBAF int `json:"lbaf,omitempty"`
PlatformComponents []string `json:"platform_components,omitempty"` PlatformComponents []string `json:"platform_components,omitempty"`
SAADmiChanges []saaChange `json:"saa_dmi_changes,omitempty"` SAADmiChanges []saaChange `json:"saa_dmi_changes,omitempty"`
FRUChanges []fruChange `json:"fru_changes,omitempty"`
RAIDController int `json:"raid_controller,omitempty"`
RAIDDevices []string `json:"raid_devices,omitempty"`
RAIDArrayName string `json:"raid_array_name,omitempty"`
} }
type persistedTask struct { type persistedTask struct {

View File

@@ -1483,6 +1483,17 @@ for tool in storcli64 sas2ircu sas3ircu arcconf ssacli saa; do
fi fi
done done
# saa companion directories — saa searches for these relative to CWD (/usr/local/bin)
for saa_subdir in acpica_bin ExternalData tool stunnel GO_SNMP; do
if [ -d "${VENDOR_DIR}/${saa_subdir}" ]; then
cp -r "${VENDOR_DIR}/${saa_subdir}" "${OVERLAY_STAGE_DIR}/usr/local/bin/"
find "${OVERLAY_STAGE_DIR}/usr/local/bin/${saa_subdir}" -type f -exec chmod +x {} \; 2>/dev/null || true
echo "vendor saa: ${saa_subdir}/ (included)"
else
echo "vendor saa: ${saa_subdir}/ (not found, skipped)"
fi
done
# --- NVIDIA kernel modules and userspace libs --- # --- NVIDIA kernel modules and userspace libs ---
if [ "$BEE_GPU_VENDOR" = "nvidia" ]; then if [ "$BEE_GPU_VENDOR" = "nvidia" ]; then
run_step "build NVIDIA ${NVIDIA_DRIVER_VERSION} modules" "40-nvidia-module" \ run_step "build NVIDIA ${NVIDIA_DRIVER_VERSION} modules" "40-nvidia-module" \

1131
iso/vendor/ExternalData/SMCIPID.txt vendored Normal file

File diff suppressed because it is too large Load Diff

2333
iso/vendor/ExternalData/VENID.txt vendored Normal file

File diff suppressed because it is too large Load Diff

37
iso/vendor/ExternalData/supportAutoDST vendored Normal file
View File

@@ -0,0 +1,37 @@
(UTC-10:00) Aleutian Islands
(UTC-09:00) Alaska
(UTC-08:00) Baja California
(UTC-08:00) Pacific Time (US & Canada)
(UTC-07:00) Mountain Time (US & Canada)
(UTC-06:00) Central Time (US & Canada)
(UTC-06:00) Easter Island
(UTC-05:00) Eastern Time (US & Canada)
(UTC-05:00) Haiti
(UTC-05:00) Havana
(UTC-05:00) Indiana (East)
(UTC-05:00) Turks and Caicos
(UTC-04:00) Asuncion
(UTC-04:00) Atlantic Time (Canada)
(UTC-04:00) Santiago
(UTC-03:30) Newfoundland
(UTC-03:00) Saint Pierre and Miquelon
(UTC-01:00) Azores
(UTC+00:00) Dublin, Edinburgh, Lisbon, London
(UTC+01:00) Casablanca
(UTC+01:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
(UTC+01:00) Belgrade, Bratislava, Budapest, Ljubljana, Prague
(UTC+01:00) Brussels, Copenhagen, Madrid, Paris
(UTC+01:00) Sarajevo, Skopje, Warsaw, Zagreb
(UTC+02:00) Athens, Bucharest
(UTC+02:00) Beirut
(UTC+02:00) Chisinau
(UTC+02:00) Gaza, Hebron
(UTC+02:00) Helsinki, Kyiv, Riga, Sofia, Tallinn, Vilnius
(UTC+02:00) Jerusalem
(UTC+09:30) Adelaide
(UTC+10:00) Canberra, Melbourne, Sydney
(UTC+10:00) Hobart
(UTC+10:30) Lord Howe Island
(UTC+11:00) Norfolk Island
(UTC+12:00) Auckland, Wellington
(UTC+12:45) Chatham Islands

139
iso/vendor/ExternalData/timezone.txt vendored Normal file
View File

@@ -0,0 +1,139 @@
(UTC-12:00) International Date Line West
(UTC-11:00) Coordinated Universal Time-11
(UTC-10:00) Aleutian Islands
(UTC-10:00) Hawaii
(UTC-09:30) Marquesas Islands
(UTC-09:00) Alaska
(UTC-09:00) Coordinated Universal Time-09
(UTC-08:00) Baja California
(UTC-08:00) Coordinated Universal Time-08
(UTC-08:00) Pacific Time (US & Canada)
(UTC-07:00) Arizona
(UTC-07:00) Chihuahua, La Paz, Mazatlan
(UTC-07:00) Mountain Time (US & Canada)
(UTC-07:00) Yukon
(UTC-06:00) Central America
(UTC-06:00) Central Time (US & Canada)
(UTC-06:00) Easter Island
(UTC-06:00) Guadalajara, Mexico City, Monterrey
(UTC-06:00) Saskatchewan
(UTC-05:00) Bogota, Lima, Quito, Rio Branco
(UTC-05:00) Chetumal
(UTC-05:00) Eastern Time (US & Canada)
(UTC-05:00) Haiti
(UTC-05:00) Havana
(UTC-05:00) Indiana (East)
(UTC-05:00) Turks and Caicos
(UTC-04:00) Atlantic Time (Canada)
(UTC-04:00) Caracas
(UTC-04:00) Cuiaba
(UTC-04:00) Georgetown, La Paz, Manaus, San Juan
(UTC-04:00) Santiago
(UTC-03:30) Newfoundland
(UTC-03:00) Asuncion
(UTC-03:00) Araguaina
(UTC-03:00) Brasilia
(UTC-03:00) Cayenne, Fortaleza
(UTC-03:00) City of Buenos Aires
(UTC-03:00) Greenland
(UTC-03:00) Montevideo
(UTC-03:00) Punta Arenas
(UTC-03:00) Saint Pierre and Miquelon
(UTC-03:00) Salvador
(UTC-02:00) Coordinated Universal Time-02
(UTC-01:00) Azores
(UTC-01:00) Cabo Verde Is.
(UTC+00:00) Coordinated Universal Time
(UTC+00:00) Dublin, Edinburgh, Lisbon, London
(UTC+00:00) Monrovia, Reykjavik
(UTC+00:00) Sao Tome
(UTC+01:00) Casablanca
(UTC+01:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
(UTC+01:00) Belgrade, Bratislava, Budapest, Ljubljana, Prague
(UTC+01:00) Brussels, Copenhagen, Madrid, Paris
(UTC+01:00) Sarajevo, Skopje, Warsaw, Zagreb
(UTC+01:00) West Central Africa
(UTC+02:00) Amman
(UTC+02:00) Athens, Bucharest
(UTC+02:00) Beirut
(UTC+02:00) Cairo
(UTC+02:00) Chisinau
(UTC+02:00) Damascus
(UTC+02:00) Gaza, Hebron
(UTC+02:00) Harare, Pretoria
(UTC+02:00) Helsinki, Kyiv, Riga, Sofia, Tallinn, Vilnius
(UTC+02:00) Jerusalem
(UTC+02:00) Juba
(UTC+02:00) Kaliningrad
(UTC+02:00) Khartoum
(UTC+02:00) Tripoli
(UTC+02:00) Windhoek
(UTC+03:00) Baghdad
(UTC+03:00) Istanbul
(UTC+03:00) Kuwait, Riyadh
(UTC+03:00) Minsk
(UTC+03:00) Moscow, St. Petersburg
(UTC+03:00) Nairobi
(UTC+03:00) Volgograd
(UTC+03:30) Tehran
(UTC+04:00) Abu Dhabi, Muscat
(UTC+04:00) Astrakhan, Ulyanovsk
(UTC+04:00) Baku
(UTC+04:00) Izhevsk, Samara
(UTC+04:00) Port Louis
(UTC+04:00) Saratov
(UTC+04:00) Tbilisi
(UTC+04:00) Yerevan
(UTC+04:30) Kabul
(UTC+05:00) Ashgabat, Tashkent
(UTC+05:00) Astana
(UTC+05:00) Ekaterinburg
(UTC+05:00) Islamabad, Karachi
(UTC+05:00) Qyzylorda
(UTC+05:30) Chennai, Kolkata, Mumbai, New Delhi
(UTC+05:30) Sri Jayawardenepura
(UTC+05:45) Kathmandu
(UTC+06:00) Dhaka
(UTC+06:00) Omsk
(UTC+06:30) Yangon (Rangoon)
(UTC+07:00) Bangkok, Hanoi, Jakarta
(UTC+07:00) Barnaul, Gorno-Altaysk
(UTC+07:00) Hovd
(UTC+07:00) Krasnoyarsk
(UTC+07:00) Novosibirsk
(UTC+07:00) Tomsk
(UTC+08:00) Beijing, Chongqing, Hong Kong, Urumqi
(UTC+08:00) Irkutsk
(UTC+08:00) Kuala Lumpur, Singapore
(UTC+08:00) Perth
(UTC+08:00) Taipei
(UTC+08:00) Ulaanbaatar
(UTC+08:45) Eucla
(UTC+09:00) Chita
(UTC+09:00) Osaka, Sapporo, Tokyo
(UTC+09:00) Pyongyang
(UTC+09:00) Seoul
(UTC+09:00) Yakutsk
(UTC+09:30) Adelaide
(UTC+09:30) Darwin
(UTC+10:00) Brisbane
(UTC+10:00) Canberra, Melbourne, Sydney
(UTC+10:00) Guam, Port Moresby
(UTC+10:00) Hobart
(UTC+10:00) Vladivostok
(UTC+10:30) Lord Howe Island
(UTC+11:00) Bougainville Island
(UTC+11:00) Chokurdakh
(UTC+11:00) Magadan
(UTC+11:00) Norfolk Island
(UTC+11:00) Sakhalin
(UTC+11:00) Solomon Is., New Caledonia
(UTC+12:00) Anadyr, Petropavlovsk-Kamchatsky
(UTC+12:00) Auckland, Wellington
(UTC+12:00) Coordinated Universal Time+12
(UTC+12:00) Fiji
(UTC+12:45) Chatham Islands
(UTC+13:00) Coordinated Universal Time+13
(UTC+13:00) Nuku'alofa
(UTC+13:00) Samoa
(UTC+14:00) Kiritimati Island

BIN
iso/vendor/ExternalData/tui.fnt vendored Normal file

Binary file not shown.

BIN
iso/vendor/GO_SNMP/AlertServer vendored Executable file

Binary file not shown.

BIN
iso/vendor/acpica_bin/acpidump vendored Executable file

Binary file not shown.

BIN
iso/vendor/acpica_bin/acpiexec vendored Executable file

Binary file not shown.

BIN
iso/vendor/stunnel/stunnel64 vendored Executable file

Binary file not shown.

BIN
iso/vendor/tool/USBController/ASMedia/114xfwdl vendored Executable file

Binary file not shown.

BIN
iso/vendor/tool/gpu/nVidia/x64/nvuflash vendored Executable file

Binary file not shown.

BIN
iso/vendor/tool/gpu/nVidia/x64/setrom vendored Executable file

Binary file not shown.