Reanimator

Reanimator is an event-driven hardware lifecycle platform for tracking servers and components from warehouse intake to production operation, failure analysis, and retirement.

It treats infrastructure like a medical record: every asset and component has a complete, queryable history of location, firmware, incidents, and reliability over time.

Project Tracking

Development milestones and the execution checklist are tracked in TODO.md.

What Reanimator Solves

  • Inventory and shipment tracking
  • Full component hierarchy (asset -> subcomponents)
  • Time-based installation history (components can move between assets)
  • Firmware and lifecycle timeline visibility
  • Failure analytics (AFR, MTBF, reliability by part class)
  • Spare-part planning
  • Service ticket correlation
  • Offline log ingestion with later synchronization

Core Principles

  1. Events are the source of truth
    • Current state is derived from event history.
  2. Hardware relationships are time-based
    • A component is linked to an asset through installation intervals, not permanent foreign keys.
  3. Observations are not facts
    • Logs produce observations; observations produce derived events; events update state.
  4. Data integrity first
    • Keep writes idempotent and preserve raw ingested data.
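
The first two principles can be illustrated with a small replay function. This is a minimal sketch, not the project's implementation: the Event type, the two event kinds, and CurrentPlacement are hypothetical names chosen for illustration. It shows current state (which asset holds which component) being computed from event history rather than stored directly.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// EventKind and Event are hypothetical types used only for this sketch.
type EventKind string

const (
	Installed EventKind = "installed"
	Removed   EventKind = "removed"
)

type Event struct {
	Kind        EventKind
	ComponentID string
	AssetID     string
	At          time.Time
}

// CurrentPlacement replays events in time order and returns a map of
// component ID to asset ID for every component that is still installed.
func CurrentPlacement(events []Event) map[string]string {
	sorted := append([]Event(nil), events...) // do not reorder the caller's slice
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].At.Before(sorted[j].At)
	})
	placement := make(map[string]string)
	for _, e := range sorted {
		switch e.Kind {
		case Installed:
			placement[e.ComponentID] = e.AssetID
		case Removed:
			// Ignore a removal that does not match the current placement.
			if placement[e.ComponentID] == e.AssetID {
				delete(placement, e.ComponentID)
			}
		}
	}
	return placement
}

// day is a demo helper: midnight UTC, n days after 2026-01-01.
func day(n int) time.Time {
	return time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC).AddDate(0, 0, n)
}

func main() {
	events := []Event{
		{Installed, "ssd-1", "srv-a", day(0)},
		{Removed, "ssd-1", "srv-a", day(2)},
		{Installed, "ssd-1", "srv-b", day(3)},
	}
	fmt.Println(CurrentPlacement(events)) // map[ssd-1:srv-b]
}
```

Because the result is a pure function of the event list, it can be rebuilt at any time, which is what makes the event history the source of truth.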

High-Level Architecture

LogBundle (offline)
      ->
Ingestion API
      ->
Parser / Observation Layer
      ->
Event Generator
      ->
Domain Model
      ->
MariaDB
      ->
API / UI / Connectors

Domain Model (MVP)

Organizational

  • Customer: owner organization
  • Project: infrastructure grouping within a customer
  • Location (recommended): warehouse, datacenter, rack, repair center

Hardware

  • Asset: deployable hardware unit (usually a server)
  • Component: hardware part installed in an asset (SSD, DIMM, CPU, NIC, PSU)
  • LOT: internal part classification for reliability analytics (for example, SSD_NVME_03.84TB_GEN4)
  • Installation: time-bounded relationship between component and asset

installations.removed_at IS NULL means the installation is currently active.
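
In Go, a nullable removed_at maps naturally to a pointer. The sketch below assumes an installed_at column and a half-open interval [installed_at, removed_at); neither is stated in this README, and the struct and method names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Installation sketches one row of the installations table. Only removed_at
// is named in the README; the other fields are assumptions.
type Installation struct {
	ComponentID string
	AssetID     string
	InstalledAt time.Time
	RemovedAt   *time.Time // nil corresponds to removed_at IS NULL
}

// Active reports whether the installation is currently open.
func (i Installation) Active() bool { return i.RemovedAt == nil }

// ActiveAt reports whether the component was installed at time t,
// treating the interval as [InstalledAt, RemovedAt).
func (i Installation) ActiveAt(t time.Time) bool {
	if t.Before(i.InstalledAt) {
		return false
	}
	return i.RemovedAt == nil || t.Before(*i.RemovedAt)
}

// day is a demo helper: midnight UTC, n days after 2026-01-01.
func day(n int) time.Time {
	return time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC).AddDate(0, 0, n)
}

func main() {
	removed := day(10)
	old := Installation{"ssd-1", "srv-a", day(0), &removed}
	cur := Installation{"ssd-1", "srv-b", day(12), nil}

	fmt.Println(old.Active(), cur.Active())                 // false true
	fmt.Println(old.ActiveAt(day(5)), cur.ActiveAt(day(5))) // true false
}
```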

Ingestion and Lifecycle

  • LogBundle: immutable raw upload package
  • Observation: parsed snapshot at collection time
  • TimelineEvent: normalized lifecycle event (installed, removed, moved, firmware changed, ticket linked, etc.)
  • FailureEvent: explicit failure record with source and confidence
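
One way to derive installed and removed events is to diff two consecutive observations of the same asset. The sketch below is an assumption about how that step could work, not a description of the actual event generator; the field names and the Derive function are hypothetical, and real observations would carry more than a list of serials.

```go
package main

import (
	"fmt"
	"sort"
)

// Observation is a simplified snapshot: the component serials seen in one
// asset at collection time. The fields are assumptions for this sketch.
type Observation struct {
	AssetID    string
	Components []string
}

// TimelineEvent is a simplified lifecycle event.
type TimelineEvent struct {
	Kind      string // "installed" or "removed"
	AssetID   string
	Component string
}

func toSet(items []string) map[string]bool {
	set := make(map[string]bool, len(items))
	for _, s := range items {
		set[s] = true
	}
	return set
}

// Derive compares two snapshots of the same asset and returns the events
// that explain the difference, sorted by component serial.
func Derive(prev, next Observation) []TimelineEvent {
	before, now := toSet(prev.Components), toSet(next.Components)
	var events []TimelineEvent
	for s := range now {
		if !before[s] {
			events = append(events, TimelineEvent{"installed", next.AssetID, s})
		}
	}
	for s := range before {
		if !now[s] {
			events = append(events, TimelineEvent{"removed", prev.AssetID, s})
		}
	}
	// Map iteration order is random, so sort for a deterministic result.
	sort.Slice(events, func(i, j int) bool {
		return events[i].Component < events[j].Component
	})
	return events
}

func main() {
	prev := Observation{"srv-a", []string{"dimm-1", "dimm-2"}}
	next := Observation{"srv-a", []string{"dimm-2", "dimm-3"}}
	for _, e := range Derive(prev, next) {
		fmt.Printf("%s %s on %s\n", e.Kind, e.Component, e.AssetID)
	}
}
```

A diff like this yields observations-based candidates only; per principle 3, the platform would still decide which of them become events.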

Service Correlation

  • Ticket: imported external service case
  • TicketLink: relation to project, asset, or component

Suggested Database Tables (MVP)

customers
projects
locations

assets
components
lots
installations

log_bundles
observations
timeline_events

tickets
ticket_links
failure_events

Important Indexes

  • components(vendor_serial)
  • assets(vendor_serial)
  • installations(component_id, removed_at)
  • timeline_events(subject_type, subject_id, timestamp)
  • observations(component_id)

API (MVP)

Ingestion

  • POST /ingest/logbundle (must be idempotent)
  • POST /ingest/failures
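
The README requires idempotent ingestion but does not say how it is achieved. A common approach, shown here purely as an assumption, is to key each bundle by a hash of its raw bytes so that a repeated upload resolves to the existing record. BundleStore and Ingest are hypothetical names, and the in-memory map stands in for the database.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// BundleStore is an in-memory stand-in for the log_bundles table.
type BundleStore struct {
	mu      sync.Mutex
	bundles map[string][]byte // content hash -> raw bytes, kept unmodified
}

func NewBundleStore() *BundleStore {
	return &BundleStore{bundles: make(map[string][]byte)}
}

// Ingest stores raw under its SHA-256 hash. Uploading the same bytes again
// returns the same ID with created == false and changes nothing.
func (s *BundleStore) Ingest(raw []byte) (id string, created bool) {
	sum := sha256.Sum256(raw)
	id = hex.EncodeToString(sum[:])

	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.bundles[id]; exists {
		return id, false
	}
	s.bundles[id] = append([]byte(nil), raw...) // copy so callers cannot mutate it
	return id, true
}

func main() {
	store := NewBundleStore()
	bundle := []byte("example log bundle contents")

	id1, created1 := store.Ingest(bundle)
	id2, created2 := store.Ingest(bundle)

	fmt.Println(created1, created2, id1 == id2) // true false true
}
```

An HTTP handler built on this could answer a first upload and a repeat upload with the same bundle ID, so clients syncing offline bundles can retry safely.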

Assets and Components

  • GET /assets
  • GET /assets/{id}
  • GET /assets/{id}/components
  • GET /assets/{id}/tickets
  • GET /components
  • GET /components/{id}
  • GET /assets/{id}/timeline
  • GET /components/{id}/timeline

Tickets

  • POST /connectors/tickets/sync
  • GET /tickets

Failures

  • GET /failures

Analytics

  • GET /analytics/lot-metrics
  • GET /analytics/firmware-risk
  • GET /analytics/spare-forecast
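
The README does not define how lot metrics are calculated. Under the usual textbook definitions (AFR as failures per device-year of operation, MTBF as operating hours per failure), a calculation could look like the sketch below. The names and the 8,760-hour year are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// hoursPerYear assumes a 365-day year.
const hoursPerYear = 8760.0

// LotMetrics is a hypothetical result type for one LOT.
type LotMetrics struct {
	AFR       float64 // failures per device-year; 0.02 means 2%
	MTBFHours float64 // meaningful only when HasMTBF is true
	HasMTBF   bool    // false when no failures have been recorded
}

// ComputeLotMetrics derives AFR and MTBF from a failure count and the total
// accumulated operating hours of all components in the LOT.
func ComputeLotMetrics(failures int, deviceHours float64) (LotMetrics, error) {
	if failures < 0 || deviceHours <= 0 {
		return LotMetrics{}, errors.New("need failures >= 0 and deviceHours > 0")
	}
	m := LotMetrics{AFR: float64(failures) / (deviceHours / hoursPerYear)}
	if failures > 0 {
		m.MTBFHours = deviceHours / float64(failures)
		m.HasMTBF = true
	}
	return m, nil
}

func main() {
	// 200 drives running for one year each, 4 recorded failures.
	m, err := ComputeLotMetrics(4, 200*hoursPerYear)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("AFR=%.2f%% MTBF=%.0fh\n", m.AFR*100, m.MTBFHours)
	// AFR=2.00% MTBF=438000h
}
```

The operating hours would come from installation intervals, which is one reason installations are modeled as time ranges.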

UI

  • GET /ui (dashboard)
  • GET /ui/assets, GET /ui/assets/{id}
  • GET /ui/components, GET /ui/components/{id}
  • GET /ui/tickets
  • GET /ui/failures
  • GET /ui/ingest
  • GET /ui/analytics

Project Structure

cmd/reanimator-api
internal/domain
internal/repository
internal/ingest
internal/events
internal/connectors
internal/jobs
internal/api

Initial Delivery Plan

  1. Registry: customers, projects, assets, components, LOT
  2. Ingestion: logbundle upload, observation persistence, installation detection
  3. Timeline: event derivation and lifecycle views
  4. Service Correlation: ticket sync and linking

Future Extensions

  • Predictive failure modeling
  • Spare-part forecasting
  • Firmware risk scoring
  • Automated anomaly detection
  • Rack-level topology
  • Warranty and RMA workflows
  • Multi-tenant support

Immediate Next Steps

  1. Initialize Go module
  2. Create base schema and indexes
  3. Implement ingestion endpoint
  4. Add observation-to-event derivation
  5. Expose asset/component timeline APIs

Start with data model correctness, not UI.

Local Configuration

The API loads settings from environment variables and an optional config.json. If CONFIG_FILE is set, the API loads that file; otherwise it loads config.json from the working directory if one exists. Environment variables override file values.
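
The lookup order can be sketched as follows. This is an illustration of the precedence described here, not the API's actual loader: the Config struct is reduced to one field, Source and MapSource are hypothetical helpers that keep the example independent of the real filesystem, and treating a missing CONFIG_FILE target as an error is an assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// Config is reduced to a single setting for this sketch.
type Config struct {
	DatabaseDSN string `json:"database_dsn"`
}

// Source abstracts the environment and filesystem. In a real program it
// would be Source{Getenv: os.Getenv, ReadFile: os.ReadFile}.
type Source struct {
	Getenv   func(string) string
	ReadFile func(string) ([]byte, error)
}

// MapSource builds a Source from in-memory maps, for demos and tests.
func MapSource(env, files map[string]string) Source {
	return Source{
		Getenv: func(key string) string { return env[key] },
		ReadFile: func(path string) ([]byte, error) {
			data, ok := files[path]
			if !ok {
				return nil, os.ErrNotExist
			}
			return []byte(data), nil
		},
	}
}

// Load reads CONFIG_FILE if set, else config.json if present, then lets
// environment variables override values from the file.
func Load(src Source) (Config, error) {
	var cfg Config
	path, explicit := src.Getenv("CONFIG_FILE"), true
	if path == "" {
		path, explicit = "config.json", false
	}
	data, err := src.ReadFile(path)
	switch {
	case err == nil:
		if err := json.Unmarshal(data, &cfg); err != nil {
			return Config{}, fmt.Errorf("parse %s: %w", path, err)
		}
	case explicit || !errors.Is(err, os.ErrNotExist):
		// Assumption: only the default config.json may be silently absent.
		return Config{}, fmt.Errorf("read %s: %w", path, err)
	}
	if dsn := src.Getenv("DATABASE_DSN"); dsn != "" {
		cfg.DatabaseDSN = dsn
	}
	return cfg, nil
}

func main() {
	src := MapSource(
		map[string]string{"DATABASE_DSN": "from-env"},
		map[string]string{"config.json": `{"database_dsn": "from-file"}`},
	)
	cfg, err := Load(src)
	fmt.Println(cfg.DatabaseDSN, err) // from-env <nil>
}
```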

Database configuration can be provided as a full database_dsn or as discrete fields under database (user/password/host/port/name/params). If database_dsn is empty and the DATABASE_DSN environment variable is not set, the discrete fields are used to build the DSN.
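
As a rough sketch of that fallback, the function below builds a DSN in the user:password@tcp(host:port)/name?params form used by the go-sql-driver/mysql driver. Whether the project uses that driver, the type of params (a plain query string here), and the names Database, BuildDSN, and ResolveDSN are all assumptions.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// Database mirrors the discrete fields under "database" in config.json.
// Field types are assumptions; params is treated as a raw query string.
type Database struct {
	User     string `json:"user"`
	Password string `json:"password"`
	Host     string `json:"host"`
	Port     int    `json:"port"`
	Name     string `json:"name"`
	Params   string `json:"params"`
}

// BuildDSN assembles user:password@tcp(host:port)/name[?params].
func BuildDSN(db Database) string {
	addr := net.JoinHostPort(db.Host, strconv.Itoa(db.Port))
	dsn := fmt.Sprintf("%s:%s@tcp(%s)/%s", db.User, db.Password, addr, db.Name)
	if db.Params != "" {
		dsn += "?" + db.Params
	}
	return dsn
}

// ResolveDSN applies the documented precedence: the DATABASE_DSN environment
// value, then database_dsn from the file, then the discrete fields.
func ResolveDSN(envDSN, fileDSN string, db Database) string {
	if envDSN != "" {
		return envDSN
	}
	if fileDSN != "" {
		return fileDSN
	}
	return BuildDSN(db)
}

func main() {
	db := Database{
		User: "reanimator", Password: "secret",
		Host: "127.0.0.1", Port: 3306,
		Name: "reanimator", Params: "parseTime=true",
	}
	fmt.Println(ResolveDSN("", "", db))
	// reanimator:secret@tcp(127.0.0.1:3306)/reanimator?parseTime=true
}
```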

An example file is available as config.example.json.
