As an admin I want to backfill file hashes for already-uploaded documents so existing files are covered by the annotation versioning feature #56

Closed
opened 2026-03-24 08:47:19 +01:00 by marcel · 0 comments
Owner

Context

Issue #55 introduces a file_hash (SHA-256) column on documents and document_annotations. New uploads will populate it automatically, but documents uploaded before the migration will have file_hash = NULL.

Without backfilling, those documents will always show a "stale annotations" notice (a NULL file_hash never compares equal to any stored hash), and their annotations will be hidden.

User Journey

An admin opens the System tab in the admin panel and clicks "Datei-Hashes berechnen". The backend streams through all documents where file_hash IS NULL, downloads each file from MinIO, computes SHA-256, and writes it back. A progress indicator shows how many files have been processed. When done, a success message confirms the count.
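
The backend side of this journey can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual service code: `BackfillSketch`, `Doc`, and the `download` function are hypothetical stand-ins for the real entity, repository, and MinIO access, and pagination and progress reporting are left out. The returned count is what the success message would report.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class BackfillSketch {

    /** Hypothetical stand-in for one row of the documents table. */
    static final class Doc {
        final String s3Key;   // may be null if no file was ever stored
        String fileHash;      // null until backfilled

        Doc(String s3Key, String fileHash) {
            this.s3Key = s3Key;
            this.fileHash = fileHash;
        }
    }

    /** Hashes every document that lacks a hash but has a stored file; returns the count. */
    static int backfill(List<Doc> docs, Function<String, byte[]> download) {
        int processed = 0;
        for (Doc doc : docs) {
            if (doc.fileHash != null || doc.s3Key == null) {
                continue; // already hashed, or nothing to download
            }
            doc.fileHash = sha256Hex(download.apply(doc.s3Key));
            processed++;
        }
        return processed;
    }

    /** Returns the lowercase hex SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder sb = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        // In-memory map standing in for the object store (assumption for the demo).
        Map<String, byte[]> storage = new HashMap<>();
        storage.put("a.pdf", "abc".getBytes(StandardCharsets.US_ASCII));
        List<Doc> docs = Arrays.asList(
                new Doc("a.pdf", null),      // needs a hash
                new Doc("b.pdf", "present"), // skipped: already hashed
                new Doc(null, null));        // skipped: no stored file
        int count = backfill(docs, storage::get);
        System.out.println(count + " file(s) processed");
    }
}
```

Because documents that already have a hash are skipped, running the backfill a second time should be a no-op, which makes the button safe to click repeatedly.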

E2E Scenarios

Scenario: Admin triggers hash backfill from the System tab
  Given at least one document has no file_hash
  When I click "Datei-Hashes berechnen" in the System tab
  Then a success message appears
  And the document's file_hash is no longer null

Implementation notes

  • New POST /api/admin/backfill-file-hashes endpoint (mirrors the existing backfill-versions pattern in DocumentController)
  • DocumentService.backfillFileHashes(): page through documents where fileHash IS NULL AND s3Key IS NOT NULL, download each file via FileService, compute its SHA-256, and save the result
  • SHA-256 helper can be MessageDigest.getInstance("SHA-256") — no new dependency
  • Frontend: add a new card in admin/+page.svelte System tab alongside the existing "Versionen nachfüllen" card
  • Must also backfill document_annotations.file_hash for any annotations that already exist: set them to the document's file_hash after computing it (reasonable assumption — annotations were created against the file that was current at the time)
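
A SHA-256 helper along the lines of the note above might look like the sketch below. It relies only on `MessageDigest` from the JDK; the class name `FileHashSketch` and the method names are illustrative assumptions, and the `InputStream` would in practice come from whatever `FileService` returns for a download. Reading in fixed-size chunks keeps memory use flat for large files.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class FileHashSketch {

    /** Streams the input through SHA-256 and returns the digest as lowercase hex. */
    static String sha256Hex(InputStream in) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform ships SHA-256, so this is unexpected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read); // feed only the bytes actually read
        }
        return toHex(digest.digest());
    }

    /** Converts raw bytes to a lowercase hex string, two characters per byte. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Demo input; a real caller would pass the downloaded file's stream.
        InputStream in = new ByteArrayInputStream("abc".getBytes(StandardCharsets.US_ASCII));
        System.out.println(sha256Hex(in));
    }
}
```

The hex string returned for a document would then also be written to `document_annotations.file_hash` for that document's existing annotations, per the last note above.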

Dependency

Implement after #55.

marcel added the featurecollaboration labels 2026-03-24 10:09:54 +01:00
Reference: marcel/familienarchiv#56