docs: drop remaining stale MassImportService/ExcelService references
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m38s
CI / OCR Service Tests (pull_request) Successful in 21s
CI / Backend Unit Tests (pull_request) Successful in 4m9s
CI / fail2ban Regex (pull_request) Successful in 48s
CI / Semgrep Security Scan (pull_request) Successful in 21s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m5s

Replace the legacy raw-spreadsheet importer references left behind after
#674 with the canonical import architecture (CanonicalImportOrchestrator +
four loaders) and document #686 index-based PDF resolution.

- l3-backend-3b: DocumentImporter now resolves PDF by index (importDir/
  <index>.pdf) with index validation + canonical-path containment + %PDF
  magic-byte check (no recursive walk / homoglyph file-path guards)
- c4-diagrams.md: replace massImport/excelSvc components + their rels with
  an importOrch (CanonicalImportOrchestrator) component wired to doc/person/
  tag services; refresh adminCtrl and adminSystem descriptions
- ARCHITECTURE.md: importing package row now describes the orchestrator +
  four loaders consuming canonical artifacts
- TODO-backend.md: remove obsolete "MassImportService provides no status"
  item (service deleted; orchestrator already exposes import-status); update
  stale ExcelService test-coverage suggestion

Refs #686

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Marcel
2026-05-27 21:30:40 +02:00
parent 7354c3d332
commit ea38efc734
5 changed files with 12 additions and 23 deletions

View File

@@ -93,7 +93,7 @@ C4Component
### 3b — Document Management & Import
Document management, file storage, and bulk Excel/ODS import.
Document management, file storage, and the canonical import.
```mermaid
C4Component
@@ -105,12 +105,11 @@ C4Component
System_Boundary(backend, "API Backend (Spring Boot)") {
Component(docCtrl, "DocumentController", "Spring MVC — /api/documents", "CRUD for documents: search, get by ID, update metadata, upload/download file, conversation thread, and batch metadata updates.")
Component(adminCtrl, "AdminController", "Spring MVC — /api/admin", "Triggers asynchronous Excel/ODS mass import (requires ADMIN permission). Reports import state (IDLE/RUNNING/DONE/FAILED).")
Component(adminCtrl, "AdminController", "Spring MVC — /api/admin", "Triggers the asynchronous canonical import (requires ADMIN permission). Reports import state via GET /api/admin/import-status (IDLE/RUNNING/DONE/FAILED).")
Component(docSvc, "DocumentService", "Spring Service", "Core document business logic: store, update, search. Resolves persons and tags, delegates file I/O to FileService, builds dynamic JPA Specifications, and integrates with audit logging.")
Component(fileSvc, "FileService", "Spring Service", "Wraps AWS SDK v2 S3Client. Uploads files with UUID-keyed paths, computes SHA-256 hash, downloads with content-type detection, and generates presigned URLs for OCR access.")
Component(massImport, "MassImportService", "Spring Service — @Async", "Reads Excel/ODS files from /import mount. Tracks import state (IDLE/RUNNING/DONE/FAILED) and delegates to ExcelService. Returns immediately; processing runs asynchronously.")
Component(excelSvc, "ExcelService", "Spring Service", "Parses Excel/ODS workbooks (Apache POI). Column indices configurable via application.properties. Creates/updates document records per row.")
Component(importOrch, "CanonicalImportOrchestrator", "Spring Service — @Async", "Runs four idempotent loaders (TagTree → PersonRegister → PersonTree → Document) in a fixed DAG over the normalizer's committed canonical artifacts (canonical-*.xlsx + canonical-persons-tree.json) from /import — see diagram 3b. Owns the IDLE/RUNNING/DONE/FAILED state machine.")
Component(minioConf, "MinioConfig", "Spring @Configuration", "Creates the S3Client and S3Presigner beans with path-style access for MinIO. Validates MinIO connectivity on startup.")
Component(docRepo, "DocumentRepository", "Spring Data JPA", "Queries documents with Specification-based dynamic search, bidirectional conversation thread queries, full-text search with ranking and match highlighting, and transcription pipeline queue projections.")
@@ -123,14 +122,15 @@ C4Component
Rel(frontend, docCtrl, "Document requests", "HTTP / JSON")
Rel(frontend, adminCtrl, "Trigger import", "HTTP / JSON")
Rel(docCtrl, docSvc, "Delegates to", "")
Rel(adminCtrl, massImport, "Triggers", "")
Rel(adminCtrl, importOrch, "Triggers", "")
Rel(docSvc, fileSvc, "Upload / download files", "")
Rel(docSvc, docRepo, "Reads / writes documents", "")
Rel(docSvc, docSpec, "Builds search predicates", "")
Rel(docSvc, personSvc, "Resolves sender / receivers", "")
Rel(docSvc, tagSvc, "Finds or creates tags", "")
Rel(massImport, excelSvc, "Parses Excel/ODS file", "")
Rel(excelSvc, docSvc, "Creates / updates documents", "")
Rel(importOrch, docSvc, "Upserts documents (PDF by index) — see 3b", "")
Rel(importOrch, personSvc, "Upserts persons + relationships", "")
Rel(importOrch, tagSvc, "Upserts tag hierarchy", "")
Rel(minioConf, fileSvc, "Provides S3Client and S3Presigner beans", "")
Rel(fileSvc, minio, "PUT / GET / presigned URL objects", "S3 API / HTTP")
Rel(docRepo, db, "SQL queries", "JDBC")
@@ -492,7 +492,7 @@ C4Component
Component(adminGroups, "/admin/groups, /admin/groups/[id], /admin/groups/new", "SvelteKit Routes", "Permission group management: create/edit groups and their permission sets.")
Component(adminTags, "/admin/tags and /admin/tags/[id]", "SvelteKit Routes", "Tag administration: edit tag hierarchy, merge tags, delete subtrees.")
Component(adminOcr, "/admin/ocr and /admin/ocr/[personId]", "SvelteKit Routes", "Global and per-person OCR configuration. Manages script types and triggers sender model training.")
Component(adminSystem, "/admin/system", "SvelteKit Route", "System status panel. Triggers Excel/ODS mass import (POST /api/admin/trigger-import). Displays import state.")
Component(adminSystem, "/admin/system", "SvelteKit Route", "System status panel. Triggers the canonical import (POST /api/admin/trigger-import). Displays import state.")
Component(hilfe, "/hilfe/transkription", "SvelteKit Route", "Static transcription style guide for Kurrent and Sütterlin character recognition. No backend calls.")
}