bee/audit/internal/platform at df1385d3d65c0ea2e94360e9e73f0259af899a32 - bee - MCHUS git PRO

reanimator/bee

Michael Chus df1385d3d6 Fix dcgmproftester parallel mode: use staggered script for all multi-GPU runs

A single dcgmproftester process without -i only loads GPU 0 regardless of
CUDA_VISIBLE_DEVICES. Now always routes multi-GPU runs through
bee-dcgmproftester-staggered (--stagger-seconds 0 for parallel mode),
which spawns one process per GPU so all GPUs are loaded simultaneously.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-19 18:31:34 +03:00
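
The per-GPU launch described in the commit message above can be illustrated with a minimal Go sketch. It is not the contents of bee-dcgmproftester-staggered or sat.go. The names `launch` and `buildLaunchPlan` are hypothetical, and the rule that GPU n starts n times the stagger interval after the first is an assumption about what "staggered" means here. Only the `-i` flag is taken from the commit message; all other dcgmproftester arguments are left out on purpose.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// launch describes one per-GPU process: which GPU it targets, how long
// after the start of the run it should begin, and its arguments.
// Hypothetical type, for illustration only.
type launch struct {
	GPU   int
	Delay time.Duration
	Args  []string
}

// buildLaunchPlan returns one launch per GPU so that every GPU gets its
// own process. With staggerSeconds == 0 all delays are zero, which
// corresponds to parallel mode. Otherwise the n-th GPU in the list is
// assumed to start n*staggerSeconds after the first one.
func buildLaunchPlan(gpuIDs []int, staggerSeconds int) []launch {
	plan := make([]launch, 0, len(gpuIDs))
	for n, id := range gpuIDs {
		plan = append(plan, launch{
			GPU:   id,
			Delay: time.Duration(n*staggerSeconds) * time.Second,
			// Pin the process to a single GPU with -i, the only flag
			// named in the commit message.
			Args: []string{"-i", strconv.Itoa(id)},
		})
	}
	return plan
}

func main() {
	// Parallel mode: four GPUs, stagger of 0 seconds.
	for _, l := range buildLaunchPlan([]int{0, 1, 2, 3}, 0) {
		fmt.Printf("gpu=%d delay=%s args=%v\n", l.GPU, l.Delay, l.Args)
	}
}
```

Treating parallel mode as the zero-stagger case of the same plan keeps a single code path for all multi-GPU runs, which matches the intent stated in the commit message.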

benchmark_report.go

Rework Power Fit report: 90 min stability, aligned tables, PSU/fan sections

2026-04-19 18:04:12 +03:00

benchmark_table.go

Rework Power Fit report: 90 min stability, aligned tables, PSU/fan sections

2026-04-19 18:04:12 +03:00

benchmark_test.go

Disable unstable fp4/fp64 benchmark phases

2026-04-16 09:58:02 +03:00

benchmark_types.go

Rework Power Fit report: 90 min stability, aligned tables, PSU/fan sections

2026-04-19 18:04:12 +03:00

benchmark.go

Rework Power Fit report: 90 min stability, aligned tables, PSU/fan sections

2026-04-19 18:04:12 +03:00

error_patterns.go

feat(watchdog): hardware error monitor + unified component status store

2026-04-02 19:20:59 +03:00

export_test.go

iso: improve burn-in, export, and live boot

2026-03-26 18:56:19 +03:00

export.go

iso: improve burn-in, export, and live boot

2026-03-26 18:56:19 +03:00

gpu_metrics_test.go

Unify benchmark exports and drop ASCII charts

2026-04-13 21:38:28 +03:00

gpu_metrics.go

Estimate fan duty from observed RPM maxima

2026-04-16 10:10:18 +03:00

install_to_ram_linux.go

Improve install-to-RAM verification for ISO boots

2026-04-07 20:21:06 +03:00

install_to_ram_other.go

Improve install-to-RAM verification for ISO boots

2026-04-07 20:21:06 +03:00

install_to_ram_test.go

Unify live RAM runtime state

2026-04-14 16:18:33 +03:00

install_to_ram.go

Add toram boot entry and Install to RAM resume support

2026-04-17 23:48:56 +03:00

install.go

feat(webui): show current boot source

2026-04-02 15:36:32 +03:00

kill_workers.go

Stability hardening, build script fixes, GRUB bee logo

2026-04-19 13:08:31 +03:00

live_metrics_test.go

fix(metrics): stabilize cpu and power sampling

2026-04-01 09:40:42 +03:00

live_metrics.go

Redesign system power chart as stacked per-PSU area chart

2026-04-18 10:42:00 +03:00

network_test.go

release: v3.1

2026-03-28 22:51:36 +03:00

network.go

fix(network): strip linkdown/dead/onlink flags when restoring routes

2026-03-29 10:39:16 +03:00

nvidia_stress.go

Add staged NVIDIA burn ramp-up mode

2026-04-09 15:21:14 +03:00

parse.go

Refactor bee CLI and LiveCD integration

2026-03-13 16:52:16 +03:00

platform_stress_test.go

fix(stress): keep platform burn responsive under load

2026-03-31 22:28:26 +03:00

platform_stress.go

Fix GPU model propagation, export filenames, PSU/service status, and chart perf

2026-04-11 10:05:27 +03:00

runtime.go

Add fabric manager boot and support diagnostics

2026-04-15 16:14:26 +03:00

sat_fan_stress_test.go

Estimate fan duty from observed RPM maxima

2026-04-16 10:10:18 +03:00

sat_fan_stress.go

Estimate fan duty from observed RPM maxima

2026-04-16 10:10:18 +03:00

sat_test.go

Move NCCL and NVBandwidth into validate mode

2026-04-16 11:02:30 +03:00

sat.go

Fix dcgmproftester parallel mode: use staggered script for all multi-GPU runs

2026-04-19 18:31:34 +03:00

services.go

Fix service control buttons: sudo, real error output, UX feedback

2026-04-05 20:25:41 +03:00

system_test.go

Refactor bee CLI and LiveCD integration

2026-03-13 16:52:16 +03:00

techdump_test.go

Tighten support bundles and fix AMD runtime checks

2026-03-25 19:35:25 +03:00

techdump.go

Warn on PCIe link speed degradation and collect lspci -vvv in techdump

2026-04-12 12:42:17 +03:00

tools.go

Refactor bee CLI and LiveCD integration

2026-03-13 16:52:16 +03:00

types_test.go

WIP: checkpoint current tree

2026-04-05 12:05:00 +03:00

types.go

Unify live RAM runtime state

2026-04-14 16:18:33 +03:00