Add continuous hardware health monitoring and component detail view

- kmsg watcher now records kernel errors (GPU Xid, MCE, EDAC, storage I/O)
  at all times, not only during SAT tasks; flushImmediate writes directly
  to ComponentStatusDB
- New health_poller: polls ipmitool sdr every 60s for PSU health
  (watchdog:psu source)
- Hardware Summary card auto-refreshes every 30s via htmx without page reload
- Component rows (CPU/Memory/Storage/GPU/PSU) are now clickable -- opens a
  modal with per-component status, source, timestamp and last 20 history
  entries

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -221,6 +221,11 @@ func NewHandler(opts HandlerOptions) http.Handler {
 		h.kmsg = newKmsgWatcher(opts.App.StatusDB)
 		h.kmsg.start()
 		globalQueue.kmsgWatcher = h.kmsg
+
+		// Start periodic health poller for components that don't emit kernel log events (e.g. PSU).
+		if opts.App.StatusDB != nil {
+			newHealthPoller(opts.App.StatusDB).start()
+		}
 	}
 
 	globalQueue.startWorker(&opts)
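The diff only shows the call site `newHealthPoller(opts.App.StatusDB).start()`; the poller body is not part of this excerpt. As a rough sketch of what a 60-second `ipmitool sdr` poll for PSU health could look like, the block below parses the pipe-separated `name | reading | status` rows and reduces them to one status. The names `sdrReading`, `parseSDR`, `psuHealth` and `pollPSU`, the `record` callback, and the rule that PSU sensors have names starting with "PS" are all assumptions for illustration, not the commit's actual code.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// sdrReading is one row of `ipmitool sdr` output: name | reading | status.
type sdrReading struct {
	Name, Value, Status string
}

// parseSDR splits pipe-separated SDR rows and skips malformed lines.
func parseSDR(out string) []sdrReading {
	var rows []sdrReading
	for _, line := range strings.Split(out, "\n") {
		f := strings.Split(line, "|")
		if len(f) != 3 {
			continue
		}
		rows = append(rows, sdrReading{
			Name:   strings.TrimSpace(f[0]),
			Value:  strings.TrimSpace(f[1]),
			Status: strings.TrimSpace(f[2]),
		})
	}
	return rows
}

// psuHealth reduces PSU rows to "OK", "Warning", "Critical" or "Unknown".
// Assumption: PSU sensors are recognisable by a "PS" name prefix.
func psuHealth(rows []sdrReading) string {
	health := "Unknown"
	for _, r := range rows {
		if !strings.HasPrefix(strings.ToUpper(r.Name), "PS") {
			continue
		}
		switch r.Status {
		case "cr", "nr": // critical / non-recoverable
			return "Critical"
		case "nc": // non-critical threshold crossed
			health = "Warning"
		case "ok":
			if health == "Unknown" {
				health = "OK"
			}
		}
	}
	return health
}

// pollPSU runs `ipmitool sdr` immediately and then on every tick until ctx
// is cancelled. record is a hypothetical stand-in for the status DB write.
func pollPSU(ctx context.Context, every time.Duration, record func(status string)) {
	t := time.NewTicker(every)
	defer t.Stop()
	for {
		out, err := exec.CommandContext(ctx, "ipmitool", "sdr").Output()
		if err == nil {
			record(psuHealth(parseSDR(string(out))))
		}
		select {
		case <-ctx.Done():
			return
		case <-t.C:
		}
	}
}

func main() {
	sample := "PS1 Status | 0x01 | ok\nPS2 Status | 0x00 | cr\nCPU Temp | 45 degrees C | ok\n"
	fmt.Println(psuHealth(parseSDR(sample))) // prints "Critical"
}
```

In a real poller this would be started with something like `go pollPSU(ctx, 60*time.Second, record)`. Keeping the parsing separate from the `exec` call makes the status logic testable on machines without a BMC.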
@@ -328,6 +333,10 @@ func NewHandler(opts HandlerOptions) http.Handler {
 	mux.HandleFunc("GET /api/install/disks", h.handleAPIInstallDisks)
 	mux.HandleFunc("POST /api/install/run", h.handleAPIInstallRun)
 
+	// Hardware component detail (fragment for modal in Hardware Summary card)
+	mux.HandleFunc("GET /api/hardware-summary", h.handleAPIHardwareSummary)
+	mux.HandleFunc("GET /api/components/{type}", h.handleAPIComponentDetail)
+
 	// Metrics — SSE stream of live sensor data + server-side SVG charts + CSV export
 	mux.HandleFunc("GET /api/metrics/stream", h.handleAPIMetricsStream)
 	mux.HandleFunc("GET /api/metrics/latest", h.handleAPIMetricsLatest)