familienarchiv/docs/architecture/c4/l3-backend-3b-document-management.puml
Marcel 21c85ff081 docs(importing): document the canonical importer rebuild
- ADR-025: add decision 3 (four idempotent loaders over canonical artifacts;
  raw spreadsheet no longer parsed by Java) with the settled Option-A name
  policy, human-edit-preserve precedence, provisional contract, and ported
  security guards.
- l3-backend-3b diagram: replace MassImportService/ExcelService with the
  orchestrator, the four loaders, and CanonicalSheetReader, with the loader
  dependency edges.
- GLOSSARY: Canonical import / canonical artifact / CanonicalSheetReader terms;
  refresh SkippedFile (new INVALID_FILENAME_PATH_TRAVERSAL reason, index key).
- DEPLOYMENT §6: canonical-artifact prerequisite runbook (run normalizer →
  place four artifacts → trigger import); note idempotent re-run.
- CLAUDE.md (root + backend): importing/ package now lists the orchestrator +
  loaders + CanonicalSheetReader.

OpenAPI: no generate:api needed — the ImportStatus/SkippedFile generated
schemas already match the new types byte-for-byte (same fields + SkipReason
enum), so the API surface is unchanged.

Closes #669

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 10:44:45 +02:00


Component Diagram: API Backend - Document Management & Canonical Import

API Backend (Spring Boot) [system]

- «component» DocumentController [Spring MVC - /api/documents]: CRUD for documents: search, get by ID, update metadata, upload/download file, conversation thread, batch metadata updates, and per-month density aggregation for the timeline filter widget.
- «component» AdminController [Spring MVC - /api/admin]: Triggers the asynchronous canonical import (requires ADMIN permission). Reports import state (IDLE/RUNNING/DONE/FAILED).
- «component» DocumentService [Spring Service]: Core document business logic: store, update, search. Resolves persons and tags, delegates file I/O to FileService, builds dynamic JPA Specifications, and integrates with audit logging.
- «component» FileService [Spring Service]: Wraps AWS SDK v2 S3Client. Uploads files with UUID-keyed paths, computes SHA-256 hash, downloads with content-type detection, and generates presigned URLs for OCR access.
- «component» CanonicalImportOrchestrator [Spring Service - @Async]: Runs the four canonical loaders in an explicit dependency DAG (TagTree → PersonRegister → PersonTree → Document). Smoke-checks all four artifacts before starting, owns the IDLE/RUNNING/DONE/FAILED state machine, fails closed on a malformed artifact.
- «component» TagTreeImporter [Spring Component]: Upserts the tag hierarchy from canonical-tag-tree.xlsx via TagService (by canonical tag_path).
- «component» PersonRegisterImporter [Spring Component]: Upserts register persons from canonical-persons.xlsx via PersonService (by normalizer person_id).
- «component» PersonTreeImporter [Spring Component]: Upserts tree persons + relationships from canonical-persons-tree.json via PersonService and RelationshipService.
- «component» DocumentImporter [Spring Component]: Loads canonical-documents.xlsx: routes attribution register-first (raw cell always retained in sender_text/receiver_text), parses clean dates, keeps the S3 upload + thumbnail plumbing, and ports the path-traversal / homoglyph / absolute-path / %PDF magic-byte security guards.
- «component» CanonicalSheetReader [POI helper]: Maps a canonical .xlsx by header name (no positional indices), splits pipe-delimited list columns, fails closed (IMPORT_ARTIFACT_INVALID) on a missing required header.
- «component» MinioConfig [Spring @Configuration]: Creates the S3Client and S3Presigner beans with path-style access for MinIO. Validates MinIO connectivity on startup.
- «component» DocumentRepository [Spring Data JPA]: Queries documents with Specification-based dynamic search, bidirectional conversation thread queries, full-text search with ranking and match highlighting, and transcription pipeline queue projections.
- «component» DocumentSpecifications [JPA Criteria API]: Factory for composable predicates: hasText (full-text), hasSender, hasReceiver, isBetween (date range), hasTags (subquery AND/OR logic).
- «container» Web Frontend [SvelteKit]
- «container» PostgreSQL [PostgreSQL 16]
- «container» Object Storage [MinIO (S3-compatible)]
- «component» PersonService [Spring Service]: See diagram 3e. Resolves sender / receiver persons by ID; upserts persons by source_ref for the importer.
- «component» TagService [Spring Service]: See diagram 3d. Finds or creates tags by name; upserts tags by source_ref for the importer.
- «component» RelationshipService [Spring Service]: See diagram 3e. Creates family relationships from the person tree during import.

Relationship labels:

- Document requests [HTTP / JSON]
- Trigger import [HTTP / JSON]
- Delegates to
- Triggers
- Upload / download files
- Reads / writes documents
- Builds search predicates
- Resolves sender / receivers
- Finds or creates tags
- 1. Loads tags
- 2. Loads register persons
- 3. Loads tree persons + relationships
- 4. Loads documents
- Reads canonical .xlsx
- Reads canonical .xlsx
- Reads canonical .xlsx
- Upserts tags by source_ref
- Upserts persons by source_ref
- Upserts persons by source_ref
- Creates relationships
- Upserts documents by index
- Register-first match / provisional person
- Attaches tag by source_ref
- Uploads resolved file
- Provides S3Client and S3Presigner beans
- PUT / GET / presigned URL objects [S3 API / HTTP]
- SQL queries [JDBC]
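
The CanonicalSheetReader behaviour described in the diagram (lookup by header name, pipe-delimited list columns, fail closed on a missing required header) can be sketched as follows. This is an illustration only, not the project's implementation: the class `HeaderMappedSheet`, its exception type, and the column names in the usage note are hypothetical, and rows are modelled as plain string lists so that the sketch runs without Apache POI.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a header-mapped sheet; not the real CanonicalSheetReader. */
final class HeaderMappedSheet {

    /** Raised when the artifact is malformed; the message carries the error code from the diagram. */
    static final class ArtifactInvalidException extends RuntimeException {
        ArtifactInvalidException(String detail) {
            super("IMPORT_ARTIFACT_INVALID: " + detail);
        }
    }

    private final Map<String, Integer> columnByHeader = new HashMap<>();
    private final List<List<String>> dataRows;

    /** The first row is the header row; every required header must be present. */
    HeaderMappedSheet(List<List<String>> rows, List<String> requiredHeaders) {
        if (rows.isEmpty()) {
            throw new ArtifactInvalidException("sheet has no header row");
        }
        List<String> header = rows.get(0);
        for (int i = 0; i < header.size(); i++) {
            String name = header.get(i);
            if (name != null) {
                columnByHeader.put(name.trim(), i);
            }
        }
        // Fail closed: refuse the whole sheet rather than guess a column position.
        for (String required : requiredHeaders) {
            if (!columnByHeader.containsKey(required)) {
                throw new ArtifactInvalidException("missing required header: " + required);
            }
        }
        this.dataRows = rows.subList(1, rows.size());
    }

    int rowCount() {
        return dataRows.size();
    }

    /** Returns the trimmed cell under the named header, or "" if the row is shorter than the header. */
    String cell(int row, String header) {
        Integer col = columnByHeader.get(header);
        if (col == null) {
            throw new ArtifactInvalidException("unknown header: " + header);
        }
        List<String> values = dataRows.get(row);
        String value = col < values.size() ? values.get(col) : null;
        return value == null ? "" : value.trim();
    }

    /** Splits a pipe-delimited cell into trimmed, non-empty items. */
    List<String> listCell(int row, String header) {
        List<String> items = new ArrayList<>();
        for (String part : cell(row, header).split("\\|")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items;
    }
}
```

With a header row of `tags`, `index`, `title` (illustrative names), `cell(0, "title")` resolves the column by name regardless of its position, `listCell(0, "tags")` splits `family | letters` into two items, and constructing the sheet with a required header that is absent throws before any row is read. Resolving columns by name means that reordering columns in an artifact does not silently shift data into the wrong field.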