feat(importing): add orchestrator, wire admin, retire raw-spreadsheet path

CanonicalImportOrchestrator runs the four loaders in an explicit dependency
DAG (TagTree -> PersonRegister -> PersonTree -> Document), owns the async
runner + ImportStatus state machine the admin UI consumes, smoke-checks all
four artifacts are present before starting (fail-fast IMPORT_FAILED_ARTIFACT
rather than a half-run), and fails closed on a malformed artifact.

AdminController now depends on the orchestrator; the {state, statusCode,
processed, skippedFiles, skipped} response shape is unchanged so
ImportStatusCard.svelte keeps working.

Deletes the legacy MassImportService (positional @Value app.import.col.*,
ISO-only parseDate, Java name classification) and the ODS/XXE
XxeSafeXmlParser path now that the loaders cover them — the security guards
were ported to DocumentImporter first (previous commit). Replaces the
positional column config in application.yaml with the canonical artifact
directory.

Refs #669

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Marcel
2026-05-27 10:36:28 +02:00
parent c56ba6219c
commit 459ba14207
8 changed files with 245 additions and 1451 deletions

View File

@@ -125,17 +125,10 @@ app:
password: ${APP_ADMIN_PASSWORD:admin123}
import:
col:
index: 0
box: 1
folder: 2
sender: 3
receivers: 5
date: 7
location: 9
tags: 10
summary: 11
transcription: 13
# Directory holding the normalizer's committed canonical artifacts
# (canonical-{documents,persons,tag-tree}.xlsx + canonical-persons-tree.json).
# The loader maps columns by header name — no positional indices (see ADR-025).
dir: ${IMPORT_DIR:/import}
ocr:
sender-model: