fix(document): add trainingLabels to Document.full entity graph (#642) #644

Merged
marcel merged 1 commits from fix/issue-642-training-labels-entity-graph into main 2026-05-20 12:36:28 +02:00
Owner

Closes #642

Summary

Add trainingLabels to the Document.full @NamedEntityGraph.

trainingLabels was switched from EAGER to LAZY in #467 but not added to Document.full. DocumentRepository.findById() uses @EntityGraph("Document.full"), which loads sender/receivers/tags eagerly within the transaction — but trainingLabels is left uninitialized. The Hibernate session closes when getDocumentById() returns. When Jackson serializes the response body, it accesses trainingLabels with no session → LazyInitializationException → HTTP 500.

One-line diff:

 @NamedEntityGraph(name = "Document.full", attributeNodes = {
         @NamedAttributeNode("sender"),
         @NamedAttributeNode("receivers"),
         @NamedAttributeNode("tags"),
+        @NamedAttributeNode("trainingLabels")
 })

Test plan

  • GET /api/documents/{id} returns 200 (was 500) for a document with training labels
  • CI passes DocumentLazyLoadingTest — the test already exercises getDocumentById and checks no lazy-load outside session
  • After merging #641 (logging), confirm no more LazyInitializationException entries in GlitchTip / Grafana for the document endpoint
Closes #642 ## Summary Add `trainingLabels` to the `Document.full` `@NamedEntityGraph`. `trainingLabels` was switched from `EAGER` to `LAZY` in #467 but not added to `Document.full`. `DocumentRepository.findById()` uses `@EntityGraph("Document.full")`, which loads sender/receivers/tags eagerly within the transaction — but `trainingLabels` is left uninitialized. The Hibernate session closes when `getDocumentById()` returns. When Jackson serializes the response body, it accesses `trainingLabels` with no session → `LazyInitializationException` → HTTP 500. **One-line diff:** ```java @NamedEntityGraph(name = "Document.full", attributeNodes = { @NamedAttributeNode("sender"), @NamedAttributeNode("receivers"), @NamedAttributeNode("tags"), + @NamedAttributeNode("trainingLabels") }) ``` ## Test plan - [ ] `GET /api/documents/{id}` returns 200 (was 500) for a document with training labels - [ ] CI passes `DocumentLazyLoadingTest` — the test already exercises `getDocumentById` and checks no lazy-load outside session - [ ] After merging #641 (logging), confirm no more `LazyInitializationException` entries in GlitchTip / Grafana for the document endpoint
marcel added 1 commit 2026-05-20 08:17:08 +02:00
fix(document): add trainingLabels to Document.full entity graph (#642)
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m30s
CI / OCR Service Tests (pull_request) Successful in 19s
CI / Backend Unit Tests (pull_request) Successful in 3m22s
CI / fail2ban Regex (pull_request) Successful in 52s
CI / Semgrep Security Scan (pull_request) Successful in 20s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m0s
3a56cc4e87
trainingLabels was switched to LAZY fetch in #467 but not added to the
Document.full @NamedEntityGraph. DocumentRepository.findById() uses
Document.full to eagerly load sender/receivers/tags, but the Hibernate
session closes before Jackson serializes the response. Accessing
trainingLabels outside the session throws LazyInitializationException,
causing GET /api/documents/{id} to return HTTP 500.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Owner

🏛️ Markus Keller — Software Architect

Verdict: Approved

The fix correctly identifies and resolves the root cause at the right layer.

Root cause analysis:
Commit 56bcbcdd switched Document.trainingLabels from EAGER to LAZY as part of the @mention PR. The @NamedEntityGraph("Document.full") was not updated to include trainingLabels. When DocumentService.getDocumentById() loads the entity with the full graph, the session closes at method exit. Jackson then serializes trainingLabels outside the session, triggering LazyInitializationException.

Why this fix is correct:

  • Adding @NamedAttributeNode("trainingLabels") to Document.full tells Hibernate to JOIN-fetch the collection in the same query, making it available post-session.
  • Document.list correctly omits it — list view does not need training labels, so the N+1 guard holds.
  • The fix is in the entity graph declaration, not in changing fetch type back to EAGER (which would affect all load paths) or wrapping the service method in a larger transaction (which would push session lifetime concerns into the wrong layer).

The pattern is established and consistent: Document.full exists precisely for this purpose. This is a minimal, correct fix that follows existing conventions. No architectural concerns.

## 🏛️ Markus Keller — Software Architect **Verdict: ✅ Approved** The fix correctly identifies and resolves the root cause at the right layer. **Root cause analysis:** Commit `56bcbcdd` switched `Document.trainingLabels` from `EAGER` to `LAZY` as part of the @mention PR. The `@NamedEntityGraph("Document.full")` was not updated to include `trainingLabels`. When `DocumentService.getDocumentById()` loads the entity with the `full` graph, the session closes at method exit. Jackson then serializes `trainingLabels` outside the session, triggering `LazyInitializationException`. **Why this fix is correct:** - Adding `@NamedAttributeNode("trainingLabels")` to `Document.full` tells Hibernate to JOIN-fetch the collection in the same query, making it available post-session. - `Document.list` correctly omits it — list view does not need training labels, so the N+1 guard holds. - The fix is in the entity graph declaration, not in changing fetch type back to `EAGER` (which would affect all load paths) or wrapping the service method in a larger transaction (which would push session lifetime concerns into the wrong layer). The pattern is established and consistent: `Document.full` exists precisely for this purpose. This is a minimal, correct fix that follows existing conventions. No architectural concerns.
Author
Owner

👨‍💻 Felix Brandt — Senior Fullstack Developer

Verdict: Approved

Single-line fix, correct, minimal, no collateral changes.

What I verified:

  • Document.full before this PR: sender, receivers, tags — used by getDocumentById() via @EntityGraph("Document.full") in the repository.
  • trainingLabels is @ElementCollection(fetch = FetchType.LAZY) on a Set<TrainingLabel>.
  • Adding it to the graph means Hibernate fetches it via a secondary SELECT (or JOIN depending on Hibernate's strategy for element collections) within the same session boundary.
  • Document.list omits it — correct, list views don't render training labels.

One thing to watch for in future PRs:
The regression pattern here is: "change a field's fetch type without auditing all entity graphs that touch the entity." This is easy to miss. Worth adding a comment on the trainingLabels field itself or in the Document.full graph to make the coupling explicit — but I'd call that a suggestion, not a blocker.

Suggestion (non-blocking):
A comment on Document.full listing what it's used for (full detail view) would help the next person adding a new lazy field know they should check both graphs:

// Used by DocumentRepository.findByIdWithFullGraph — for detail view serialization
@NamedEntityGraph(name = "Document.full", attributeNodes = { ... })

Up to you whether that comment adds value.

## 👨‍💻 Felix Brandt — Senior Fullstack Developer **Verdict: ✅ Approved** Single-line fix, correct, minimal, no collateral changes. **What I verified:** - `Document.full` before this PR: `sender`, `receivers`, `tags` — used by `getDocumentById()` via `@EntityGraph("Document.full")` in the repository. - `trainingLabels` is `@ElementCollection(fetch = FetchType.LAZY)` on a `Set<TrainingLabel>`. - Adding it to the graph means Hibernate fetches it via a secondary SELECT (or JOIN depending on Hibernate's strategy for element collections) within the same session boundary. - `Document.list` omits it — correct, list views don't render training labels. **One thing to watch for in future PRs:** The regression pattern here is: "change a field's fetch type without auditing all entity graphs that touch the entity." This is easy to miss. Worth adding a comment on the `trainingLabels` field itself or in the `Document.full` graph to make the coupling explicit — but I'd call that a suggestion, not a blocker. **Suggestion (non-blocking):** A comment on `Document.full` listing what it's used for (full detail view) would help the next person adding a new lazy field know they should check both graphs: ```java // Used by DocumentRepository.findByIdWithFullGraph — for detail view serialization @NamedEntityGraph(name = "Document.full", attributeNodes = { ... }) ``` Up to you whether that comment adds value.
Author
Owner

🛠️ Tobias Wendt — DevOps / Infrastructure

Verdict: Approved

Pure Java entity change — no infrastructure, compose, or deployment config touched. No deployment concerns.

The fix is self-contained to the backend artifact. The next nightly build will pick it up automatically. No env vars, no volume mounts, no network config affected.

LGTM from infra perspective.

## 🛠️ Tobias Wendt — DevOps / Infrastructure **Verdict: ✅ Approved** Pure Java entity change — no infrastructure, compose, or deployment config touched. No deployment concerns. The fix is self-contained to the backend artifact. The next nightly build will pick it up automatically. No env vars, no volume mounts, no network config affected. LGTM from infra perspective.
Author
Owner

📋 Elicit — Requirements Engineer

Verdict: Approved

Checking against issue #642 requirements:

Requirement Addressed?
Document detail page no longer returns 500 Root cause removed — trainingLabels now loaded within session
Fix scoped to regression introduced by @mention PR One line added to Document.full entity graph
No unrelated changes bundled Single file, single change

The fix addresses exactly what was described. No scope creep, no unrelated cleanup.

The suggested deploy order from PR #643 (merge observability first, then this PR) means the fix can be validated through GlitchTip absence — error disappears after merge. That's a clean acceptance criterion.

## 📋 Elicit — Requirements Engineer **Verdict: ✅ Approved** Checking against issue #642 requirements: | Requirement | Addressed? | |---|---| | Document detail page no longer returns 500 | ✅ Root cause removed — `trainingLabels` now loaded within session | | Fix scoped to regression introduced by @mention PR | ✅ One line added to `Document.full` entity graph | | No unrelated changes bundled | ✅ Single file, single change | The fix addresses exactly what was described. No scope creep, no unrelated cleanup. The suggested deploy order from PR #643 (merge observability first, then this PR) means the fix can be validated through GlitchTip absence — error disappears after merge. That's a clean acceptance criterion.
Author
Owner

🔒 Nora "NullX" Steiner — Security Expert

Verdict: Approved

The change adds trainingLabels to the Document.full entity graph. Security review:

  • trainingLabels is an @ElementCollection of TrainingLabel enum values on the same Document entity. It is already part of the entity's data model and was accessible before the LAZY regression.
  • This fix restores pre-regression behavior — it does not expose any data that wasn't already accessible to users who can fetch a document.
  • No new API surface. No new permissions logic involved.
  • No serialization of credentials, PII, or security-sensitive fields introduced by this change.

No security concerns. The fix is a pure JPA session boundary correction.

## 🔒 Nora "NullX" Steiner — Security Expert **Verdict: ✅ Approved** The change adds `trainingLabels` to the `Document.full` entity graph. Security review: - `trainingLabels` is an `@ElementCollection` of `TrainingLabel` enum values on the same `Document` entity. It is already part of the entity's data model and was accessible before the LAZY regression. - This fix restores pre-regression behavior — it does not expose any data that wasn't already accessible to users who can fetch a document. - No new API surface. No new permissions logic involved. - No serialization of credentials, PII, or security-sensitive fields introduced by this change. No security concerns. The fix is a pure JPA session boundary correction.
Author
Owner

🧪 Sara Holt — QA / Tester

Verdict: ⚠️ Approved with concerns

The fix is correct, but there is no regression test to prevent this from silently breaking again.

What happened: A prior PR (56bcbcdd) changed trainingLabels from EAGER to LAZY without updating Document.full. The document detail page then 500'd. No existing test caught this.

Missing test (non-blocking suggestion):
An integration or @DataJpaTest slice test for DocumentService.getDocumentById() that:

  1. Creates a Document with a non-empty trainingLabels set
  2. Calls getDocumentById()
  3. Asserts that document.getTrainingLabels() is accessible (non-null, correct size) — outside the service method, simulating what Jackson does during serialization

Without this, the next PR that changes a fetch type on Document could introduce the same regression undetected.

Manual test plan for staging:

  1. After merging, deploy nightly
  2. Open any document detail page that has OCR/training labels assigned
  3. Confirm 200 response
  4. (If #643 is merged first) Confirm no new GlitchTip event for this error

This is a suggestion — not a blocker for merging. The fix is correct and the issue is live in production.

## 🧪 Sara Holt — QA / Tester **Verdict: ⚠️ Approved with concerns** The fix is correct, but there is no regression test to prevent this from silently breaking again. **What happened:** A prior PR (`56bcbcdd`) changed `trainingLabels` from `EAGER` to `LAZY` without updating `Document.full`. The document detail page then 500'd. No existing test caught this. **Missing test (non-blocking suggestion):** An integration or `@DataJpaTest` slice test for `DocumentService.getDocumentById()` that: 1. Creates a `Document` with a non-empty `trainingLabels` set 2. Calls `getDocumentById()` 3. Asserts that `document.getTrainingLabels()` is accessible (non-null, correct size) — *outside* the service method, simulating what Jackson does during serialization Without this, the next PR that changes a fetch type on `Document` could introduce the same regression undetected. **Manual test plan for staging:** 1. After merging, deploy nightly 2. Open any document detail page that has OCR/training labels assigned 3. Confirm 200 response 4. (If #643 is merged first) Confirm no new GlitchTip event for this error This is a suggestion — not a blocker for merging. The fix is correct and the issue is live in production.
Author
Owner

🎨 Leonie Voss — UI / UX Expert

Verdict: Approved

This fix restores the document detail page to a working state — which was previously throwing a 500 error for users. From a UI perspective, the expected outcome is that the page loads again without an error.

No frontend code changed. No visual regressions expected.

LGTM.

## 🎨 Leonie Voss — UI / UX Expert **Verdict: ✅ Approved** This fix restores the document detail page to a working state — which was previously throwing a 500 error for users. From a UI perspective, the expected outcome is that the page loads again without an error. No frontend code changed. No visual regressions expected. LGTM.
marcel merged commit 9a460b3c90 into main 2026-05-20 12:36:28 +02:00
marcel deleted branch fix/issue-642-training-labels-entity-graph 2026-05-20 12:36:28 +02:00
Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: marcel/familienarchiv#644