feat(document): replace DocumentSearchItem with flat DocumentListItem DTO (#647) #660

marcel · 2026-05-22T18:39:15+02:00

marcel commented

2026-05-22 18:39:15 +02:00

Summary

Closes #647

[backend] New flat DocumentListItem Java record replaces both the DocumentSearchItem wrapper and the nested Document reference — eliminates OWASP API3:2023 excessive data exposure (no more transcription, filePath, fileHash, contentType, scriptType etc. sent to list views)
[backend] Document.list entity graph pruned to sender + receivers + tags — trainingLabels removed since DocumentListItem.toListItem() never touches it, ending the maintenance trap
[backend] DocumentSearchResult updated to List<DocumentListItem>; DocumentService.enrichItems() now calls a new toListItem() mapping method
[frontend] npm run generate:api regenerated; DocumentRow, DocumentList, +page.server.ts, DocumentMultiSelect all migrated to flat item.title / item.sender access
[test] DocumentListItemIntegrationTest — Testcontainers guard: search with trainingLabels does not throw LazyInitializationException; detail endpoint still returns trainingLabels
[test] DocumentControllerTest — @WebMvcTest shape assertion: $.items[0].id present, $.items[0].transcription / filePath / fileHash absent

Test plan

CI passes (backend unit + Testcontainers integration + frontend type check)
DocumentControllerTest.search_returns_flat_item_with_id_and_without_sensitive_fields green
DocumentListItemIntegrationTest — all 3 tests green (no LazyInitializationException, detail still returns trainingLabels)
npm run check — zero errors in changed files
Document list view loads, groups/sorts correctly, thumbnails, tags, sender/receiver render as before

🤖 Generated with Claude Code

## Summary Closes #647 - **[backend]** New flat `DocumentListItem` Java record replaces both the `DocumentSearchItem` wrapper and the nested `Document` reference — eliminates OWASP API3:2023 excessive data exposure (no more `transcription`, `filePath`, `fileHash`, `contentType`, `scriptType` etc. sent to list views) - **[backend]** `Document.list` entity graph pruned to `sender + receivers + tags` — `trainingLabels` removed since `DocumentListItem.toListItem()` never touches it, ending the maintenance trap - **[backend]** `DocumentSearchResult` updated to `List<DocumentListItem>`; `DocumentService.enrichItems()` now calls a new `toListItem()` mapping method - **[frontend]** `npm run generate:api` regenerated; `DocumentRow`, `DocumentList`, `+page.server.ts`, `DocumentMultiSelect` all migrated to flat `item.title` / `item.sender` access - **[test]** `DocumentListItemIntegrationTest` — Testcontainers guard: search with `trainingLabels` does not throw `LazyInitializationException`; detail endpoint still returns `trainingLabels` - **[test]** `DocumentControllerTest` — `@WebMvcTest` shape assertion: `$.items[0].id` present, `$.items[0].transcription` / `filePath` / `fileHash` absent ## Test plan - [ ] CI passes (backend unit + Testcontainers integration + frontend type check) - [ ] `DocumentControllerTest.search_returns_flat_item_with_id_and_without_sensitive_fields` green - [ ] `DocumentListItemIntegrationTest` — all 3 tests green (no `LazyInitializationException`, detail still returns `trainingLabels`) - [ ] `npm run check` — zero errors in changed files - [ ] Document list view loads, groups/sorts correctly, thumbnails, tags, sender/receiver render as before 🤖 Generated with [Claude Code](https://claude.com/claude-code)

marcel referenced this pull request

2026-05-22 18:39:30 +02:00

Optimize document list API: avoid loading associations not needed by the list view #647

marcel added 4 commits 2026-05-22 19:19:35 +02:00

refactor(document): replace DocumentSearchItem with flat DocumentListItem DTO f22dcaecb7

Eliminates excessive data exposure (OWASP API3:2023) — transcription,
filePath, fileHash, thumbnailKey, scriptType and other detail-only fields
are no longer serialised in the list API response.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(document): add LazyInit guard + detail regression tests; prune Document.list graph 41b205becc

Remove trainingLabels from Document.list entity graph now that DocumentListItem
does not touch that association. Integration tests guard against future
LazyInitializationException regressions and confirm Document.full still
loads trainingLabels for the detail endpoint.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

refactor(document): migrate frontend from DocumentSearchItem to flat DocumentListItem 6583226d79

All components, specs, and the generated API client now use the new
DocumentListItem shape — flat access (item.title, item.sender) instead of
the removed item.document.* nesting.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(document): fix test regressions from DocumentListItem migration

CI / Unit & Component Tests (pull_request) Successful in 3m32s

Details

CI / OCR Service Tests (pull_request) Successful in 20s

Details

CI / Backend Unit Tests (pull_request) Successful in 3m46s

Details

CI / fail2ban Regex (pull_request) Successful in 42s

Details

CI / Semgrep Security Scan (pull_request) Successful in 19s

Details

CI / Compose Bucket Idempotency (pull_request) Successful in 1m0s

Details

627fc44d99

- Use documentService.getDocumentById() in detail_stillReturnsTrainingLabels
  so the Document.full entity graph eager-loads trainingLabels
- Flatten makeItem() factory in DocumentList.svelte.test.ts (nested
  document: {} overrides broke item.id / item.documentDate access)
- Remove { document: {} } wrapper from DocumentMultiSelect.svelte.spec.ts
  mock responses — component now reads body.items directly as flat items
- Flatten single nested item in page.svelte.test.ts document list test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

marcel force-pushed feat/issue-647-document-list-dto from c92eda33b1 to 627fc44d99

2026-05-22 19:19:35 +02:00

Compare

marcel commented

2026-05-22 19:22:44 +02:00

👨‍💻 Felix Brandt — Senior Fullstack Developer

Verdict: ⚠️ Approved with concerns

Blockers

None.

Suggestions

1. DocumentMultiSelect.svelte — unsafe as unknown as Document[] cast
frontend/src/lib/document/DocumentMultiSelect.svelte, line ~49:

const docs = body.items as unknown as Document[];

This works at runtime because GeschichteEditor only reads id, title, and documentDate from selectedDocuments. But the cast tells TypeScript that every Document-only field (status, metadataComplete, scriptType, createdAt, updatedAt) is present — any future consumer that accesses one of those will compile fine and fail silently at runtime.

A minimal improvement: map explicitly to the fields the component actually needs, then cast:

const docs = body.items.map(it => ({
    id: it.id,
    title: it.title,
    documentDate: it.documentDate
})) as unknown as Document[];

The real fix is giving selectedDocuments a purpose-built picker type, but that touches GeschichteEditor and is out of scope here. At minimum, the field list in the map makes the implicit contract explicit.

2. DocumentSearchResult.java — factory method JavaDoc silently removed
of() and paged() had load-bearing documentation explaining non-obvious semantics ("single-page convenience factory used by empty-result shortcuts and by tests that don't care about paging" / "paged factory used by the service when it has a real Pageable + full match count"). These were removed as part of the type rename but the semantics didn't change. A future developer reading the class won't know why two factories exist. Restore the comments.

3. DocumentService.toListItem() — redundant null guards for @Builder.Default collections

doc.getReceivers() != null ? List.copyOf(doc.getReceivers()) : List.of()

Document.receivers and Document.tags both have @Builder.Default with new HashSet<>() — they're null only if a Document is constructed without the builder (a bug, not a runtime scenario). The defensive check implies a uncertainty that doesn't exist and makes readers wonder "when would this be null?". Just List.copyOf(doc.getReceivers()) is correct — if it NPEs in a test, that's the test's broken setup speaking, which is useful signal.

All good

const doc = $derived(item) in DocumentRow.svelte — perfect alias, zero template churn
Test factories across all 6 files cleanly flattened — the diff noise is all mechanical and correct
toListItem() private to DocumentService — mapping owned by the domain, no leakage
Commit granularity: one logical change per commit throughout

## 👨‍💻 Felix Brandt — Senior Fullstack Developer **Verdict: ⚠️ Approved with concerns** ### Blockers None. ### Suggestions **1. `DocumentMultiSelect.svelte` — unsafe `as unknown as Document[]` cast** `frontend/src/lib/document/DocumentMultiSelect.svelte`, line ~49: ```typescript const docs = body.items as unknown as Document[]; ``` This works at runtime because `GeschichteEditor` only reads `id`, `title`, and `documentDate` from `selectedDocuments`. But the cast tells TypeScript that every `Document`-only field (`status`, `metadataComplete`, `scriptType`, `createdAt`, `updatedAt`) is present — any future consumer that accesses one of those will compile fine and fail silently at runtime. A minimal improvement: map explicitly to the fields the component actually needs, then cast: ```typescript const docs = body.items.map(it => ({ id: it.id, title: it.title, documentDate: it.documentDate })) as unknown as Document[]; ``` The real fix is giving `selectedDocuments` a purpose-built picker type, but that touches `GeschichteEditor` and is out of scope here. At minimum, the field list in the map makes the implicit contract explicit. **2. `DocumentSearchResult.java` — factory method JavaDoc silently removed** `of()` and `paged()` had load-bearing documentation explaining non-obvious semantics ("single-page convenience factory used by empty-result shortcuts and by tests that don't care about paging" / "paged factory used by the service when it has a real Pageable + full match count"). These were removed as part of the type rename but the semantics didn't change. A future developer reading the class won't know why two factories exist. Restore the comments. **3. `DocumentService.toListItem()` — redundant null guards for `@Builder.Default` collections** ```java doc.getReceivers() != null ? List.copyOf(doc.getReceivers()) : List.of() ``` `Document.receivers` and `Document.tags` both have `@Builder.Default` with `new HashSet<>()` — they're null only if a `Document` is constructed without the builder (a bug, not a runtime scenario). The defensive check implies a uncertainty that doesn't exist and makes readers wonder "when would this be null?". Just `List.copyOf(doc.getReceivers())` is correct — if it NPEs in a test, that's the test's broken setup speaking, which is useful signal. ### All good - `const doc = $derived(item)` in `DocumentRow.svelte` — perfect alias, zero template churn - Test factories across all 6 files cleanly flattened — the diff noise is all mechanical and correct - `toListItem()` private to `DocumentService` — mapping owned by the domain, no leakage - Commit granularity: one logical change per commit throughout

marcel commented

2026-05-22 19:22:54 +02:00

🔐 Nora "NullX" Steiner — Application Security Engineer

Verdict: ✅ Approved

This PR directly remediates OWASP API3:2023 — Excessive Data Exposure for the document list endpoint. I audited every field in the new DocumentListItem record.

Fields removed from list response (all correct)

Field	Reason to exclude	Status
`transcription`	Full document text — never needed in list view	✅ Removed
`filePath`	Internal S3 object key	✅ Removed
`fileHash`	SHA-256 of file content	✅ Removed
`contentType`	MIME type, internal storage metadata	✅ Removed
`scriptType`	OCR workflow metadata	✅ Removed
`status`	Document workflow state	✅ Removed
`metadataComplete`	Workflow flag	✅ Removed
`thumbnailKey`	Raw S3 key of thumbnail — the `thumbnailUrl` computed property is exposed instead (correct)	✅ Removed
`pageCount`	Storage metadata	✅ Removed

Security regression test: excellent

DocumentControllerTest.search_returns_flat_item_with_id_and_without_sensitive_fields asserts at the HTTP layer that $.items[0].transcription, $.items[0].filePath, and $.items[0].fileHash do not exist in the response. This is exactly the right layer — it catches any accidental re-introduction even if the Java types change.

Note for a follow-up issue (not a blocker)

Person is still serialized as a full entity in sender and receivers. The Person schema exposes birthDate, deathDate, and potentially notes — fields that are not rendered in the list view. This was equally true in the old DocumentSearchItem.document.sender path, so this PR is not a regression. But it's worth a dedicated issue to introduce a PersonSummaryDTO with just id and displayName for list-view contexts.

## 🔐 Nora "NullX" Steiner — Application Security Engineer **Verdict: ✅ Approved** This PR directly remediates **OWASP API3:2023 — Excessive Data Exposure** for the document list endpoint. I audited every field in the new `DocumentListItem` record. ### Fields removed from list response (all correct) | Field | Reason to exclude | Status | |---|---|---| | `transcription` | Full document text — never needed in list view | ✅ Removed | | `filePath` | Internal S3 object key | ✅ Removed | | `fileHash` | SHA-256 of file content | ✅ Removed | | `contentType` | MIME type, internal storage metadata | ✅ Removed | | `scriptType` | OCR workflow metadata | ✅ Removed | | `status` | Document workflow state | ✅ Removed | | `metadataComplete` | Workflow flag | ✅ Removed | | `thumbnailKey` | Raw S3 key of thumbnail — the `thumbnailUrl` computed property is exposed instead (correct) | ✅ Removed | | `pageCount` | Storage metadata | ✅ Removed | ### Security regression test: excellent `DocumentControllerTest.search_returns_flat_item_with_id_and_without_sensitive_fields` asserts at the HTTP layer that `$.items[0].transcription`, `$.items[0].filePath`, and `$.items[0].fileHash` do not exist in the response. This is exactly the right layer — it catches any accidental re-introduction even if the Java types change. ### Note for a follow-up issue (not a blocker) `Person` is still serialized as a full entity in `sender` and `receivers`. The `Person` schema exposes `birthDate`, `deathDate`, and potentially `notes` — fields that are not rendered in the list view. This was equally true in the old `DocumentSearchItem.document.sender` path, so this PR is not a regression. But it's worth a dedicated issue to introduce a `PersonSummaryDTO` with just `id` and `displayName` for list-view contexts.

marcel commented

2026-05-22 19:23:04 +02:00

🏗️ Markus Keller — Application Architect

Verdict: ✅ Approved

Architecture is sound

Domain ownership: DocumentListItem lives in org.raddatz.familienarchiv.document — the right package. The old DocumentSearchItem also lived there, so no boundary shift. The mapping method toListItem() is private to DocumentService — the list projection is fully owned by the document domain, nothing leaks.

DocumentSearchItem fully deleted — no dead code, no deprecated wrapper left behind. Clean.

Entity graph: Removing trainingLabels from Document.list is correct. toListItem() never accesses trainingLabels, so there's no lazy-loading risk on that path. Document.full (used by getDocumentById) still loads trainingLabels correctly. The two graphs now accurately reflect what each query path actually needs.

Pre-existing concerns (not introduced here, but worth flagging)

1. DocumentListItem exposes full Person and Tag entities
sender: Person and receivers: List<Person> carry the complete Person entity to list-view clients. The list row only needs displayName (and id for link targets). A PersonSummaryDTO with {id, displayName} would close this surface. This was equally true through DocumentSearchItem.document.sender — not a regression — but now it's explicit in the list DTO's contract and worth tracking.

2. DocumentMultiSelect implicit coupling
The as unknown as Document[] cast in DocumentMultiSelect.svelte makes visible an implicit contract between DocumentListItem and Document: the component treats search results as if they were full entities. This works today because GeschichteEditor only reads id/title/documentDate. A DocumentPickerItem interface would make the contract compile-time-checkable. Small, but as the component grows this cast will bite someone.

## 🏗️ Markus Keller — Application Architect **Verdict: ✅ Approved** ### Architecture is sound **Domain ownership:** `DocumentListItem` lives in `org.raddatz.familienarchiv.document` — the right package. The old `DocumentSearchItem` also lived there, so no boundary shift. The mapping method `toListItem()` is `private` to `DocumentService` — the list projection is fully owned by the document domain, nothing leaks. **`DocumentSearchItem` fully deleted** — no dead code, no deprecated wrapper left behind. Clean. **Entity graph:** Removing `trainingLabels` from `Document.list` is correct. `toListItem()` never accesses `trainingLabels`, so there's no lazy-loading risk on that path. `Document.full` (used by `getDocumentById`) still loads `trainingLabels` correctly. The two graphs now accurately reflect what each query path actually needs. ### Pre-existing concerns (not introduced here, but worth flagging) **1. `DocumentListItem` exposes full `Person` and `Tag` entities** `sender: Person` and `receivers: List<Person>` carry the complete `Person` entity to list-view clients. The list row only needs `displayName` (and `id` for link targets). A `PersonSummaryDTO` with `{id, displayName}` would close this surface. This was equally true through `DocumentSearchItem.document.sender` — not a regression — but now it's explicit in the list DTO's contract and worth tracking. **2. `DocumentMultiSelect` implicit coupling** The `as unknown as Document[]` cast in `DocumentMultiSelect.svelte` makes visible an implicit contract between `DocumentListItem` and `Document`: the component treats search results as if they were full entities. This works today because `GeschichteEditor` only reads `id/title/documentDate`. A `DocumentPickerItem` interface would make the contract compile-time-checkable. Small, but as the component grows this cast will bite someone.

marcel commented

2026-05-22 19:23:18 +02:00

🧪 Sara Holt — QA Engineer & Test Strategist

Verdict: ⚠️ Approved with concerns

What's solid

DocumentListItemIntegrationTest — Three Testcontainers tests with a real PostgreSQL instance:

search_doesNotThrow_whenDocumentHasTrainingLabels — confirms the LazyInitializationException regression is guarded
search_returns_list_item_without_sensitive_fields_when_document_has_training_labels — confirms flat item shape
detail_stillReturnsTrainingLabels — confirms Document.full graph is unaffected

@AfterEach cleanup() via documentRepository.deleteAll() properly isolates tests. Correct use of documentService.getDocumentById() (not documentRepository.findById()) to exercise the entity graph.

DocumentControllerTest.search_returns_flat_item_with_id_and_without_sensitive_fields — Verifies the HTTP response shape, not just Java types. This is the right place for a data-exposure regression guard — it would catch even a reflection-based serialization change that bypasses the type system.

Test factory flattening — All makeItem() / baseItem() factories across DocumentRow.svelte.spec.ts, DocumentRow.svelte.test.ts, DocumentList.svelte.spec.ts, DocumentList.svelte.test.ts, page.svelte.test.ts are correctly updated. That's thorough.

Concern

DocumentMultiSelect.svelte.spec.ts — docFactory produces fields that don't exist on DocumentListItem

The factory:

const docFactory = (id, title, date) => ({
    id, title, documentDate: date,
    originalFilename: `${title}.pdf`,
    status: 'UPLOADED',           // ← not on DocumentListItem
    metadataComplete: false,       // ← not on DocumentListItem
    scriptType: 'UNKNOWN',         // ← not on DocumentListItem
    createdAt: '2024-01-01T00:00:00',  // ← not on DocumentListItem
    updatedAt: '2024-01-01T00:00:00'   // ← not on DocumentListItem
});

These extra fields are passed directly to the mock response's items array. The component casts to Document[] anyway so the tests pass — but the test data no longer matches the actual API shape. Any future test that relies on those fields being undefined (because DocumentListItem doesn't have them) will get stale factory data instead.

Update docFactory to match DocumentListItem's actual shape: id, title, documentDate, originalFilename, receivers, tags, completionPercentage, contributors, matchData.

Suggestion

DocumentSearchResultTest.item() factory — 15 positional args with many nulls

return new DocumentListItem(
    docId, "Test", "test.pdf", null, null, null,
    List.of(), List.of(), null, null, null, null,
    0, List.of(), SearchMatchData.empty());

If the DocumentListItem record field order ever changes, this silently constructs wrong data — no compile error. Since this is test-only code, a small named factory with only the fields each test cares about would be more robust. Not a blocker, but worth a follow-up.

## 🧪 Sara Holt — QA Engineer & Test Strategist **Verdict: ⚠️ Approved with concerns** ### What's solid **`DocumentListItemIntegrationTest`** — Three Testcontainers tests with a real PostgreSQL instance: - `search_doesNotThrow_whenDocumentHasTrainingLabels` — confirms the `LazyInitializationException` regression is guarded - `search_returns_list_item_without_sensitive_fields_when_document_has_training_labels` — confirms flat item shape - `detail_stillReturnsTrainingLabels` — confirms `Document.full` graph is unaffected `@AfterEach cleanup()` via `documentRepository.deleteAll()` properly isolates tests. Correct use of `documentService.getDocumentById()` (not `documentRepository.findById()`) to exercise the entity graph. **`DocumentControllerTest.search_returns_flat_item_with_id_and_without_sensitive_fields`** — Verifies the HTTP response shape, not just Java types. This is the right place for a data-exposure regression guard — it would catch even a reflection-based serialization change that bypasses the type system. **Test factory flattening** — All `makeItem()` / `baseItem()` factories across `DocumentRow.svelte.spec.ts`, `DocumentRow.svelte.test.ts`, `DocumentList.svelte.spec.ts`, `DocumentList.svelte.test.ts`, `page.svelte.test.ts` are correctly updated. That's thorough. ### Concern **`DocumentMultiSelect.svelte.spec.ts` — `docFactory` produces fields that don't exist on `DocumentListItem`** The factory: ```typescript const docFactory = (id, title, date) => ({ id, title, documentDate: date, originalFilename: `${title}.pdf`, status: 'UPLOADED', // ← not on DocumentListItem metadataComplete: false, // ← not on DocumentListItem scriptType: 'UNKNOWN', // ← not on DocumentListItem createdAt: '2024-01-01T00:00:00', // ← not on DocumentListItem updatedAt: '2024-01-01T00:00:00' // ← not on DocumentListItem }); ``` These extra fields are passed directly to the mock response's `items` array. The component casts to `Document[]` anyway so the tests pass — but the test data no longer matches the actual API shape. Any future test that relies on those fields being `undefined` (because `DocumentListItem` doesn't have them) will get stale factory data instead. Update `docFactory` to match `DocumentListItem`'s actual shape: `id`, `title`, `documentDate`, `originalFilename`, `receivers`, `tags`, `completionPercentage`, `contributors`, `matchData`. ### Suggestion **`DocumentSearchResultTest.item()` factory — 15 positional args with many nulls** ```java return new DocumentListItem( docId, "Test", "test.pdf", null, null, null, List.of(), List.of(), null, null, null, null, 0, List.of(), SearchMatchData.empty()); ``` If the `DocumentListItem` record field order ever changes, this silently constructs wrong data — no compile error. Since this is test-only code, a small named factory with only the fields each test cares about would be more robust. Not a blocker, but worth a follow-up.

marcel commented

2026-05-22 19:23:23 +02:00

🚀 Tobias Wendt — DevOps & Platform Engineer

Verdict: ✅ Approved — LGTM

Checked: CI impact, dependency surface, generated file management, container requirements.

No new Maven dependencies, no new npm packages, no Docker image changes.
DocumentListItemIntegrationTest uses the existing PostgresContainerConfig Testcontainers setup — no new container images. CI already pulls postgres:16 for integration tests.
No CI workflow changes needed.
frontend/src/lib/generated/api.ts is fully managed by npm run generate:api — the diff is reproducible from the current backend spec. The incidental changes to ImportStatus (new skippedFiles field, SkippedFile schema) and PagedResponse field reordering came from a backend feature already on main that the regeneration picked up. Not a problem — the file is always derived from the spec.

Nothing here that touches infrastructure, secrets, or deployment config.

## 🚀 Tobias Wendt — DevOps & Platform Engineer **Verdict: ✅ Approved — LGTM** Checked: CI impact, dependency surface, generated file management, container requirements. - No new Maven dependencies, no new npm packages, no Docker image changes. - `DocumentListItemIntegrationTest` uses the existing `PostgresContainerConfig` Testcontainers setup — no new container images. CI already pulls `postgres:16` for integration tests. - No CI workflow changes needed. - `frontend/src/lib/generated/api.ts` is fully managed by `npm run generate:api` — the diff is reproducible from the current backend spec. The incidental changes to `ImportStatus` (new `skippedFiles` field, `SkippedFile` schema) and `PagedResponse` field reordering came from a backend feature already on `main` that the regeneration picked up. Not a problem — the file is always derived from the spec. Nothing here that touches infrastructure, secrets, or deployment config.

marcel commented

2026-05-22 19:23:31 +02:00

📋 Elicit — Requirements Engineer

Verdict: ✅ Approved

Full traceability audit from issue #647 acceptance criteria to implementation and tests.

AC #1 — List response exposes only list-view fields; sensitive storage fields absent

Criterion	Implementation	Test
`transcription` absent	Not in `DocumentListItem` record	`DocumentControllerTest.search_returns_flat_item_with_id_and_without_sensitive_fields` → `jsonPath("$.items[0].transcription").doesNotExist()`
`filePath` absent	Not in `DocumentListItem` record	Same test → `jsonPath("$.items[0].filePath").doesNotExist()`
`fileHash` absent	Not in `DocumentListItem` record	Same test → `jsonPath("$.items[0].fileHash").doesNotExist()`
`id` present at top level	`DocumentListItem.id()`	Same test → `jsonPath("$.items[0].id").value(...)`

✅ Fully covered.

AC #2 — Search with a document that has `trainingLabels` does not throw `LazyInitializationException`

Criterion	Implementation	Test
`trainingLabels` removed from `Document.list` entity graph	`Document.java` — `@NamedEntityGraph(name = "Document.list")` no longer lists `trainingLabels`	`DocumentListItemIntegrationTest.search_doesNotThrow_whenDocumentHasTrainingLabels`

✅ Fully covered.

AC #3 — Detail endpoint still returns `trainingLabels` after the entity graph change

Criterion	Implementation	Test
`Document.full` graph unchanged	`Document.java` — `@NamedEntityGraph(name = "Document.full")` still includes `trainingLabels`	`DocumentListItemIntegrationTest.detail_stillReturnsTrainingLabels` via `documentService.getDocumentById()`

✅ Fully covered.

All three acceptance criteria are implemented and machine-verified at the appropriate layer (HTTP response shape, integration with real DB).

## 📋 Elicit — Requirements Engineer **Verdict: ✅ Approved** Full traceability audit from issue #647 acceptance criteria to implementation and tests. ### AC #1 — List response exposes only list-view fields; sensitive storage fields absent | Criterion | Implementation | Test | |---|---|---| | `transcription` absent | Not in `DocumentListItem` record | `DocumentControllerTest.search_returns_flat_item_with_id_and_without_sensitive_fields` → `jsonPath("$.items[0].transcription").doesNotExist()` | | `filePath` absent | Not in `DocumentListItem` record | Same test → `jsonPath("$.items[0].filePath").doesNotExist()` | | `fileHash` absent | Not in `DocumentListItem` record | Same test → `jsonPath("$.items[0].fileHash").doesNotExist()` | | `id` present at top level | `DocumentListItem.id()` | Same test → `jsonPath("$.items[0].id").value(...)` | **✅ Fully covered.** ### AC #2 — Search with a document that has `trainingLabels` does not throw `LazyInitializationException` | Criterion | Implementation | Test | |---|---|---| | `trainingLabels` removed from `Document.list` entity graph | `Document.java` — `@NamedEntityGraph(name = "Document.list")` no longer lists `trainingLabels` | `DocumentListItemIntegrationTest.search_doesNotThrow_whenDocumentHasTrainingLabels` | **✅ Fully covered.** ### AC #3 — Detail endpoint still returns `trainingLabels` after the entity graph change | Criterion | Implementation | Test | |---|---|---| | `Document.full` graph unchanged | `Document.java` — `@NamedEntityGraph(name = "Document.full")` still includes `trainingLabels` | `DocumentListItemIntegrationTest.detail_stillReturnsTrainingLabels` via `documentService.getDocumentById()` | **✅ Fully covered.** All three acceptance criteria are implemented and machine-verified at the appropriate layer (HTTP response shape, integration with real DB).

marcel commented

2026-05-22 19:23:37 +02:00

🎨 Leonie Voss — UX Designer & Accessibility Strategist

Verdict: ✅ Approved — LGTM

This is a backend DTO refactor and TypeScript type rename. No rendered HTML changes, no CSS changes, no ARIA attribute changes.

DocumentRow.svelte and DocumentList.svelte now receive a flat DocumentListItem instead of a nested DocumentSearchItem, but the template reads the same fields (title, documentDate, sender, receivers, tags, thumbnailUrl, summary, archiveBox, archiveFolder, location) through the const doc = $derived(item) alias. The visual output is byte-for-byte identical to before.

thumbnailUrl was previously a computed @JsonProperty on the Document entity; it's now an explicit field on DocumentListItem. The URL format is unchanged — the thumbnail renders identically.

Nothing to flag from a design or accessibility perspective.

## 🎨 Leonie Voss — UX Designer & Accessibility Strategist **Verdict: ✅ Approved — LGTM** This is a backend DTO refactor and TypeScript type rename. No rendered HTML changes, no CSS changes, no ARIA attribute changes. `DocumentRow.svelte` and `DocumentList.svelte` now receive a flat `DocumentListItem` instead of a nested `DocumentSearchItem`, but the template reads the same fields (`title`, `documentDate`, `sender`, `receivers`, `tags`, `thumbnailUrl`, `summary`, `archiveBox`, `archiveFolder`, `location`) through the `const doc = $derived(item)` alias. The visual output is byte-for-byte identical to before. `thumbnailUrl` was previously a computed `@JsonProperty` on the `Document` entity; it's now an explicit field on `DocumentListItem`. The URL format is unchanged — the thumbnail renders identically. Nothing to flag from a design or accessibility perspective.

marcel added 1 commit 2026-05-22 19:27:54 +02:00

refactor(document): address review concerns from PR #660

CI / Semgrep Security Scan (pull_request) Successful in 21s

Details

CI / Compose Bucket Idempotency (pull_request) Successful in 1m2s

Details

nightly / deploy-staging (push) Successful in 2m2s

Details

CI / Unit & Component Tests (push) Successful in 3m58s

Details

CI / OCR Service Tests (push) Successful in 20s

Details

CI / Backend Unit Tests (push) Successful in 3m50s

Details

CI / fail2ban Regex (push) Successful in 44s

Details

CI / Unit & Component Tests (pull_request) Successful in 3m29s

Details

CI / Semgrep Security Scan (push) Successful in 21s

Details

CI / OCR Service Tests (pull_request) Successful in 21s

Details

CI / Backend Unit Tests (pull_request) Successful in 3m43s

Details

CI / Compose Bucket Idempotency (push) Successful in 59s

Details

CI / fail2ban Regex (pull_request) Successful in 45s

Details

8e9e3bba06

- Restore JavaDoc on DocumentSearchResult.of() and .paged() factory methods
- Remove redundant null guards on @Builder.Default collections in toListItem()
- Map DocumentListItem fields explicitly in DocumentMultiSelect before cast
- Add DocumentListItem required fields to docFactory in spec

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

marcel referenced this issue from a commit

2026-05-22 19:27:55 +02:00

refactor(document): address review concerns from PR #660

marcel merged commit 8e9e3bba06 into main

2026-05-23 19:35:44 +02:00

marcel deleted branch feat/issue-647-document-list-dto

2026-05-23 19:35:45 +02:00

Sign in to join this conversation.

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: marcel/familienarchiv#660

feat(document): replace DocumentSearchItem with flat DocumentListItem DTO (#647) #660

Summary

Test plan

👨‍💻 Felix Brandt — Senior Fullstack Developer

Blockers

Suggestions

All good

🔐 Nora "NullX" Steiner — Application Security Engineer

Fields removed from list response (all correct)

Security regression test: excellent

Note for a follow-up issue (not a blocker)

🏗️ Markus Keller — Application Architect

Architecture is sound

Pre-existing concerns (not introduced here, but worth flagging)

🧪 Sara Holt — QA Engineer & Test Strategist

What's solid

Concern

Suggestion

🚀 Tobias Wendt — DevOps & Platform Engineer

📋 Elicit — Requirements Engineer

AC #1 — List response exposes only list-view fields; sensitive storage fields absent

AC #2 — Search with a document that has trainingLabels does not throw LazyInitializationException

AC #3 — Detail endpoint still returns trainingLabels after the entity graph change

🎨 Leonie Voss — UX Designer & Accessibility Strategist

AC #2 — Search with a document that has `trainingLabels` does not throw `LazyInitializationException`

AC #3 — Detail endpoint still returns `trainingLabels` after the entity graph change