refactor: unified ingest pipeline + modular Redfish profile framework

Implement the full architectural plan: unified ingest.Service entry point
for archive and Redfish payloads, modular redfishprofile package with
composable profiles (generic, ami-family, msi, supermicro, dell,
hgx-topology), score-based profile matching with fallback expansion mode,
and profile-driven acquisition/analysis plans.
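The score-based matching with fallback expansion might be sketched roughly as below; `Profile`, `SelectProfiles`, and the profile names are illustrative assumptions, not the actual redfishprofile API:

```go
package main

import (
	"fmt"
	"sort"
)

// Profile is a hypothetical stand-in for a composable Redfish profile.
// Score returns 0 for "no match", higher values for stronger matches.
type Profile struct {
	Name  string
	Score func(manufacturer, model string) int
}

// SelectProfiles keeps every matching profile, ordered deterministically
// by score (descending) then name. When nothing matches it falls back to
// a hypothetical expanded generic pass (the "fallback expansion mode").
func SelectProfiles(profiles []Profile, manufacturer, model string) []string {
	type scored struct {
		name  string
		score int
	}
	var hits []scored
	for _, p := range profiles {
		if s := p.Score(manufacturer, model); s > 0 {
			hits = append(hits, scored{p.Name, s})
		}
	}
	if len(hits) == 0 {
		// fallback expansion: widen the crawl under a generic profile
		return []string{"generic-expanded"}
	}
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].score != hits[j].score {
			return hits[i].score > hits[j].score
		}
		return hits[i].name < hits[j].name
	})
	out := make([]string, len(hits))
	for i, h := range hits {
		out[i] = h.name
	}
	return out
}

func main() {
	profiles := []Profile{
		{Name: "generic", Score: func(_, _ string) int { return 1 }},
		{Name: "dell", Score: func(m, _ string) int {
			if m == "Dell Inc." {
				return 10
			}
			return 0
		}},
	}
	fmt.Println(SelectProfiles(profiles, "Dell Inc.", "R750")) // prints: [dell generic]
}
```

The deterministic tiebreak on name is what makes profile selection reproducible across runs.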

Vendor-specific logic is moved out of the common executors and into profile hooks.
GPU chassis lookup strategies and known storage recovery collections
(IntelVROC/HA-RAID/MRVL) now live in ResolvedAnalysisPlan, populated by
profiles at analysis time. Replay helpers read from the plan; no hardcoded
path lists remain in generic code.
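A minimal sketch of how a profile hook might populate `ResolvedAnalysisPlan`; the field names, the hook interface, and the Dell attribution are hypothetical, and only the IntelVROC/HA-RAID/MRVL collection names come from this commit:

```go
package main

import "fmt"

// ResolvedAnalysisPlan (field names assumed for illustration) carries
// the per-analysis lookup data that used to be hardcoded in generic code.
type ResolvedAnalysisPlan struct {
	GPUChassisLookups    []string // chassis lookup strategies, tried in order
	StorageRecoveryPaths []string // known storage recovery collections
}

// ProfileHooks is a hypothetical hook surface a matched profile exposes.
type ProfileHooks interface {
	PopulatePlan(plan *ResolvedAnalysisPlan)
}

type exampleVendorProfile struct{}

func (exampleVendorProfile) PopulatePlan(p *ResolvedAnalysisPlan) {
	p.GPUChassisLookups = append(p.GPUChassisLookups, "by-links", "by-id-prefix")
	p.StorageRecoveryPaths = append(p.StorageRecoveryPaths,
		"IntelVROC", "HA-RAID", "MRVL")
}

func main() {
	plan := &ResolvedAnalysisPlan{}
	// At analysis time, every matched profile contributes to the plan;
	// replay helpers then read only from the plan.
	for _, h := range []ProfileHooks{exampleVendorProfile{}} {
		h.PopulatePlan(plan)
	}
	fmt.Println(plan.StorageRecoveryPaths) // prints: [IntelVROC HA-RAID MRVL]
}
```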

Also split redfish_replay.go into domain modules (gpu, storage, inventory,
fru, profiles) and add full fixture/matcher/directive test coverage,
including Dell, AMI, unknown-vendor fallback, and deterministic ordering.
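The matcher coverage described above could take a table-driven shape like this sketch; `matchProfile` and the vendor strings are illustrative, not the repository's test code:

```go
package main

import "fmt"

// matchProfile is a toy stand-in for profile matching: known vendors map
// to their family profile, anything else falls back to generic.
func matchProfile(manufacturer string) string {
	switch manufacturer {
	case "Dell Inc.":
		return "dell"
	case "American Megatrends":
		return "ami-family"
	default:
		return "generic" // unknown-vendor fallback
	}
}

func main() {
	cases := []struct{ vendor, want string }{
		{"Dell Inc.", "dell"},
		{"American Megatrends", "ami-family"},
		{"Acme Corp", "generic"},
	}
	for _, c := range cases {
		fmt.Printf("%s -> %s (want %s)\n", c.vendor, matchProfile(c.vendor), c.want)
	}
}
```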

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author: Mikhail Chusavitin
Date:   2026-03-18 08:48:58 +03:00
Parent: d8d3d8c524
Commit: d650a6ba1c
45 changed files with 5231 additions and 1011 deletions


@@ -20,6 +20,7 @@ LOGPile remains responsible for upload, collection, parsing, normalization, and
 ```text
 cmd/logpile/main.go   entrypoint and CLI flags
 internal/server/      HTTP handlers, jobs, upload/export flows
+internal/ingest/      source-family orchestration for upload and raw replay
 internal/collector/   live collection and Redfish replay
 internal/analyzer/    shared analysis helpers
 internal/parser/      archive extraction and parser dispatch
@@ -50,18 +51,21 @@ Failed or canceled jobs do not overwrite the previous dataset.
 ### Upload
 1. `POST /api/upload` receives multipart field `archive`
-2. JSON inputs are checked for raw-export package or `AnalysisResult` snapshot
-3. Non-JSON inputs go through `parser.BMCParser`
-4. Archive metadata is normalized onto `AnalysisResult`
-5. Result becomes the current in-memory dataset
+2. `internal/ingest.Service` resolves the source family
+3. JSON inputs are checked for raw-export package or `AnalysisResult` snapshot
+4. Non-JSON archives go through the archive parser family
+5. Archive metadata is normalized onto `AnalysisResult`
+6. Result becomes the current in-memory dataset
 ### Live collect
 1. `POST /api/collect` validates request fields
 2. Server creates an async job and returns `202 Accepted`
 3. Selected collector gathers raw data
-4. For Redfish, collector saves `raw_payloads.redfish_tree`
-5. Result is normalized, source metadata applied, and state replaced on success
+4. For Redfish, collector runs minimal discovery, matches Redfish profiles, and builds an acquisition plan
+5. Collector applies profile tuning hints (for example crawl breadth, prefetch, bounded plan-B passes)
+6. Collector saves `raw_payloads.redfish_tree` plus acquisition diagnostics
+7. Result is normalized, source metadata applied, and state replaced on success
### Batch convert
@@ -76,6 +80,10 @@ Failed or canceled jobs do not overwrite the previous dataset.
 Live Redfish collection and offline Redfish re-analysis must use the same replay path.
 The collector first captures `raw_payloads.redfish_tree`, then the replay logic builds the normalized result.
+Redfish is split into two coordinated phases:
+- acquisition: profile-driven snapshot collection strategy
+- analysis: replay over the saved snapshot with the same profile framework
 ## PCI IDs lookup
 Lookup order: