# Reanimator

**Reanimator** is an event-driven hardware lifecycle platform for tracking servers and components from warehouse intake to production operation, failure analysis, and retirement.

It treats infrastructure like a medical record: every asset and component has a complete, queryable history of location, firmware, incidents, and reliability over time.

## What Reanimator Solves

- Inventory and shipment tracking
- Full component hierarchy (`asset -> subcomponents`)
- Time-based installation history (components can move between assets)
- Firmware and lifecycle timeline visibility
- Failure analytics (AFR, MTBF, reliability by part class)
- Spare-part planning
- Service ticket correlation
- Offline log ingestion with later synchronization
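To make the failure analytics item concrete, here is a minimal sketch of how AFR and MTBF could be computed per LOT. It assumes the usual definitions (AFR as failures per device-year, MTBF as device-hours per failure) and a 365-day year; the `LotStats` type and its fields are hypothetical and not part of the Reanimator design.

```go
package main

import "fmt"

// hoursPerYear uses a 365-day year; this is an assumption of the sketch.
const hoursPerYear = 8760.0

// LotStats is a hypothetical aggregate for one LOT (part class).
type LotStats struct {
	Lot         string
	DeviceHours float64 // operating hours summed over all components in the LOT
	Failures    int
}

// AFR returns the annualized failure rate as a fraction (0.02 means 2%).
func (s LotStats) AFR() float64 {
	if s.DeviceHours == 0 {
		return 0
	}
	deviceYears := s.DeviceHours / hoursPerYear
	return float64(s.Failures) / deviceYears
}

// MTBF returns the mean time between failures in hours.
// ok is false when no failures have been recorded yet.
func (s LotStats) MTBF() (hours float64, ok bool) {
	if s.Failures == 0 {
		return 0, false
	}
	return s.DeviceHours / float64(s.Failures), true
}

func main() {
	// 876000 device-hours is 100 device-years under the assumption above.
	s := LotStats{Lot: "SSD_NVME_03.84TB_GEN4", DeviceHours: 876000, Failures: 2}
	mtbf, _ := s.MTBF()
	fmt.Printf("%s AFR=%.2f%% MTBF=%.0fh\n", s.Lot, s.AFR()*100, mtbf)
}
```

In a real deployment the device-hours would presumably come from installation intervals and the failure count from `failure_events`.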
## Core Principles

1. **Events are the source of truth**
   - Current state is derived from event history.
2. **Hardware relationships are time-based**
   - A component is linked to an asset through installation intervals, not permanent foreign keys.
3. **Observations are not facts**
   - Logs produce observations; observations produce derived events; events update state.
4. **Data integrity first**
   - Keep writes idempotent and preserve raw ingested data.
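The first principle can be illustrated with a small replay function. This is only a sketch under simplifying assumptions: the `Event` type, its field names, and the two event kinds are hypothetical stand-ins for whatever the real timeline events look like.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Event is a hypothetical, simplified timeline event for this sketch.
type Event struct {
	At          time.Time
	Kind        string // "installed" or "removed"
	ComponentID string
	AssetID     string
}

// CurrentPlacement replays events in time order and returns, for each
// component, the asset it is currently installed in. Components that were
// removed and not reinstalled are absent from the result.
func CurrentPlacement(events []Event) map[string]string {
	sorted := append([]Event(nil), events...) // do not mutate the caller's slice
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].At.Before(sorted[j].At)
	})

	state := make(map[string]string)
	for _, e := range sorted {
		switch e.Kind {
		case "installed":
			state[e.ComponentID] = e.AssetID
		case "removed":
			delete(state, e.ComponentID)
		}
	}
	return state
}

// day is a helper for the example: n days after 2024-01-01 UTC.
func day(n int) time.Time { return time.Date(2024, 1, 1+n, 0, 0, 0, 0, time.UTC) }

func main() {
	// Events arrive out of order, as they might with offline log bundles.
	events := []Event{
		{At: day(2), Kind: "installed", ComponentID: "ssd-1", AssetID: "srv-B"},
		{At: day(0), Kind: "installed", ComponentID: "ssd-1", AssetID: "srv-A"},
		{At: day(1), Kind: "removed", ComponentID: "ssd-1", AssetID: "srv-A"},
	}
	fmt.Println(CurrentPlacement(events)["ssd-1"])
}
```

Because state is recomputed from the full history, a late-arriving bundle only needs to add events; nothing has to be patched in place.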
## High-Level Architecture

```text
LogBundle (offline)
  -> Ingestion API
  -> Parser / Observation Layer
  -> Event Generator
  -> Domain Model
  -> MariaDB
  -> API / UI / Connectors
```
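The hand-off from the Parser / Observation Layer to the Event Generator could, for example, be a diff between two consecutive snapshots of the same asset. The sketch below assumes an observation is just a list of component serials; `Observation`, `DerivedEvent`, and `Diff` are hypothetical names, and the real event generator may use richer inputs.

```go
package main

import (
	"fmt"
	"sort"
)

// Observation is a hypothetical parsed snapshot: the component serials
// seen in one asset at collection time.
type Observation struct {
	AssetID string
	Serials []string
}

// DerivedEvent is a hypothetical event produced by comparing snapshots.
type DerivedEvent struct {
	Kind    string // "installed" or "removed"
	AssetID string
	Serial  string
}

// Diff compares the previous and current observation of the same asset and
// emits "removed" for serials that disappeared and "installed" for new ones.
func Diff(prev, curr Observation) []DerivedEvent {
	before := make(map[string]bool, len(prev.Serials))
	for _, s := range prev.Serials {
		before[s] = true
	}
	after := make(map[string]bool, len(curr.Serials))
	for _, s := range curr.Serials {
		after[s] = true
	}

	var out []DerivedEvent
	for s := range before {
		if !after[s] {
			out = append(out, DerivedEvent{Kind: "removed", AssetID: curr.AssetID, Serial: s})
		}
	}
	for s := range after {
		if !before[s] {
			out = append(out, DerivedEvent{Kind: "installed", AssetID: curr.AssetID, Serial: s})
		}
	}
	// Map iteration order is random, so sort for deterministic output.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Kind != out[j].Kind {
			return out[i].Kind < out[j].Kind
		}
		return out[i].Serial < out[j].Serial
	})
	return out
}

func main() {
	prev := Observation{AssetID: "srv-A", Serials: []string{"SN1", "SN2"}}
	curr := Observation{AssetID: "srv-A", Serials: []string{"SN2", "SN3"}}
	for _, e := range Diff(prev, curr) {
		fmt.Println(e.Kind, e.AssetID, e.Serial)
	}
}
```

Keeping the diff a pure function of two observations means it can be re-run safely when a bundle is re-ingested.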
## Domain Model (MVP)

### Organizational

- **Customer**: owner organization
- **Project**: infrastructure grouping within a customer
- **Location** (recommended): warehouse, datacenter, rack, repair center

### Hardware

- **Asset**: deployable hardware unit (usually a server)
- **Component**: hardware part installed in an asset (SSD, DIMM, CPU, NIC, PSU)
- **LOT**: internal part classification for reliability analytics (for example, `SSD_NVME_03.84TB_GEN4`)
- **Installation**: time-bounded relationship between component and asset

`installations.removed_at IS NULL` means the installation is currently active.
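As an illustration of the time-bounded relationship, an installation could be modeled in Go with a nullable end time. This is a sketch, not the project's actual type: only `removed_at` is named in this document, so the other field names, the half-open interval convention, and the `AssetAt` helper are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// Installation mirrors the installations table in simplified form.
// Field names other than RemovedAt are assumptions of this sketch.
type Installation struct {
	ComponentID string
	AssetID     string
	InstalledAt time.Time
	RemovedAt   *time.Time // nil corresponds to removed_at IS NULL
}

// Active reports whether the installation is currently active.
func (i Installation) Active() bool { return i.RemovedAt == nil }

// ActiveAt reports whether the installation covered instant t, treating the
// interval as half-open: [InstalledAt, RemovedAt).
func (i Installation) ActiveAt(t time.Time) bool {
	if t.Before(i.InstalledAt) {
		return false
	}
	return i.RemovedAt == nil || t.Before(*i.RemovedAt)
}

// AssetAt returns the asset a component was installed in at instant t.
func AssetAt(history []Installation, componentID string, t time.Time) (string, bool) {
	for _, in := range history {
		if in.ComponentID == componentID && in.ActiveAt(t) {
			return in.AssetID, true
		}
	}
	return "", false
}

// day is a helper for the example: n days after 2024-01-01 UTC.
func day(n int) time.Time { return time.Date(2024, 1, 1+n, 0, 0, 0, 0, time.UTC) }

func main() {
	removed := day(10)
	history := []Installation{
		{ComponentID: "dimm-7", AssetID: "srv-A", InstalledAt: day(0), RemovedAt: &removed},
		{ComponentID: "dimm-7", AssetID: "srv-B", InstalledAt: day(12)},
	}
	for _, n := range []int{5, 11, 30} {
		asset, ok := AssetAt(history, "dimm-7", day(n))
		fmt.Printf("day %d: asset=%q found=%v\n", n, asset, ok)
	}
}
```

The same component appears in two rows here, which is what lets a part move between assets without losing its earlier history.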
### Ingestion and Lifecycle

- **LogBundle**: immutable raw upload package
- **Observation**: parsed snapshot at collection time
- **TimelineEvent**: normalized lifecycle event (installed, removed, moved, firmware changed, ticket linked, etc.)
- **FailureEvent**: explicit failure record with source and confidence

### Service Correlation

- **Ticket**: imported external service case
- **TicketLink**: relation to project, asset, or component

## Suggested Database Tables (MVP)

```text
customers
projects
locations

assets
components
lots
installations

log_bundles
observations
timeline_events

tickets
ticket_links
failure_events
```

### Important Indexes

- `components(vendor_serial)`
- `assets(vendor_serial)`
- `installations(component_id, removed_at)`
- `timeline_events(subject_type, subject_id, timestamp)`
- `observations(component_id)`

## API (MVP)

### Ingestion

- `POST /ingest/logbundle` (must be idempotent)
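One way to get idempotency for this endpoint is to key each bundle by a hash of its raw bytes, so a retried upload maps to the same record. The document does not prescribe a mechanism, so the SHA-256 content key and the in-memory `BundleStore` below are assumptions made purely for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// BundleStore is a hypothetical in-memory stand-in for the log_bundles table.
type BundleStore struct {
	mu      sync.Mutex
	bundles map[string][]byte // content hash -> raw bytes, kept unmodified
}

// NewBundleStore returns an empty store.
func NewBundleStore() *BundleStore {
	return &BundleStore{bundles: make(map[string][]byte)}
}

// Ingest stores raw bundle bytes keyed by their SHA-256 digest. Uploading the
// same bytes again returns the same ID with created=false and stores nothing.
func (s *BundleStore) Ingest(raw []byte) (id string, created bool) {
	sum := sha256.Sum256(raw)
	id = hex.EncodeToString(sum[:])

	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.bundles[id]; exists {
		return id, false
	}
	// Copy so that later changes to the caller's slice cannot alter stored data.
	s.bundles[id] = append([]byte(nil), raw...)
	return id, true
}

// Len returns the number of distinct bundles stored.
func (s *BundleStore) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.bundles)
}

func main() {
	store := NewBundleStore()
	id1, created1 := store.Ingest([]byte("dmesg: nvme0 ok"))
	id2, created2 := store.Ingest([]byte("dmesg: nvme0 ok")) // retry of the same upload
	fmt.Println(id1 == id2, created1, created2, store.Len())
}
```

With MariaDB the same idea would typically rely on a unique constraint over the hash column rather than an in-process mutex.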
### Assets and Components

- `GET /assets`
- `GET /components`
- `GET /assets/{id}/timeline`
- `GET /components/{id}/timeline`
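Behind the two timeline endpoints, the core operation is presumably "select events for one subject, ordered by time", which is also what the `timeline_events(subject_type, subject_id, timestamp)` index would serve. The sketch below shows that selection over an in-memory slice; the `TimelineEvent` fields and the `Timeline` function are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// TimelineEvent is a simplified, hypothetical row of timeline_events.
type TimelineEvent struct {
	SubjectType string // "asset" or "component"
	SubjectID   string
	Timestamp   time.Time
	Kind        string
}

// Timeline returns the events for one subject, oldest first.
func Timeline(all []TimelineEvent, subjectType, subjectID string) []TimelineEvent {
	var out []TimelineEvent
	for _, e := range all {
		if e.SubjectType == subjectType && e.SubjectID == subjectID {
			out = append(out, e)
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Timestamp.Before(out[j].Timestamp)
	})
	return out
}

// day is a helper for the example: n days after 2024-01-01 UTC.
func day(n int) time.Time { return time.Date(2024, 1, 1+n, 0, 0, 0, 0, time.UTC) }

func main() {
	all := []TimelineEvent{
		{SubjectType: "asset", SubjectID: "42", Timestamp: day(3), Kind: "firmware_changed"},
		{SubjectType: "component", SubjectID: "42", Timestamp: day(1), Kind: "installed"},
		{SubjectType: "asset", SubjectID: "42", Timestamp: day(0), Kind: "installed"},
	}
	// Roughly what GET /assets/42/timeline might return.
	for _, e := range Timeline(all, "asset", "42") {
		fmt.Println(e.Timestamp.Format("2006-01-02"), e.Kind)
	}
}
```

Note that the component with the same numeric ID is excluded, which is why the subject type is part of the lookup key.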
### Tickets

- `POST /connectors/tickets/sync`

## Recommended Go Project Layout

```text
cmd/reanimator-api
internal/domain
internal/repository
internal/ingest
internal/events
internal/connectors
internal/jobs
internal/api
```

## Initial Delivery Plan

1. **Registry**: customers, projects, assets, components, LOT
2. **Ingestion**: logbundle upload, observation persistence, installation detection
3. **Timeline**: event derivation and lifecycle views
4. **Service Correlation**: ticket sync and linking

## Future Extensions

- Predictive failure modeling
- Spare-part forecasting
- Firmware risk scoring
- Automated anomaly detection
- Rack-level topology
- Warranty and RMA workflows
- Multi-tenant support

## Immediate Next Steps

1. Initialize Go module
2. Create base schema and indexes
3. Implement ingestion endpoint
4. Add observation-to-event derivation
5. Expose asset/component timeline APIs

Start with data model correctness, not UI.