refactor(document): switch Document.tags + receivers + trainingLabels to LAZY + @EntityGraph #467

New Issue

marcel · 2026-05-07T17:26:13+02:00

marcel commented

2026-05-07 17:26:13 +02:00

Context

backend/src/main/java/org/raddatz/familienarchiv/document/Document.java uses FetchType.EAGER on three associations:

@ManyToMany(fetch = FetchType.EAGER)  Set<Person> receivers;   // line 121
@ManyToMany(fetch = FetchType.EAGER)  Set<Tag>    tags;         // line 130
@ElementCollection(fetch = FetchType.EAGER)  Set<String> trainingLabels;  // line 135

Pre-prod audit ran pg_stat_statements against the live stack and confirmed the predicted N+1:

24 HTTP requests (8 endpoints × 3 iterations) →
  2,733 calls to document_receivers join
  2,733 calls to document_tags join
  2,733 calls to document_training_labels
  = 8,199 collection-loading queries (~342 per request)

At 1,520 documents the latency is hidden (mean 0.07 ms each, total ~200 ms over 2,733 calls). At 10× data volume, every list/search response triples query count linearly.

The hot search path uses a hand-written native SQL with LATERAL JOIN so it doesn't suffer the N+1. The N+1 hits the list and detail-after-search paths — exactly the Reader Experience milestone's load profile.

Approach

Switch to LAZY by default; declare @EntityGraphs on read paths that need the associations.

`Document.java`

@ManyToMany(fetch = LAZY)       Set<Person> receivers;
@ManyToMany(fetch = LAZY)       Set<Tag>    tags;
@ElementCollection(fetch = LAZY) Set<String> trainingLabels;

@ManyToOne(fetch = LAZY)        Person sender;   // already LAZY ✓

Named entity graphs

@NamedEntityGraph(name = "Document.full",
    attributeNodes = {
        @NamedAttributeNode("receivers"),
        @NamedAttributeNode("tags"),
        @NamedAttributeNode("sender")
    })
@NamedEntityGraph(name = "Document.list",
    attributeNodes = { @NamedAttributeNode("sender") })

Repository

@EntityGraph(value = "Document.full")
Optional<Document> findById(UUID id);

@EntityGraph(value = "Document.list")
Page<Document> findAll(Specification<Document> spec, Pageable pageable);

LazyInitializationException risk

spring.jpa.open-in-view: false is already set (audit verified). That means lazy collections accessed outside the transactional service method will throw. This is good — it forces every access into a service method or a properly-fetched entity.

The risk is in *Mapper / DocumentService methods that read from collections after returning to the controller. Audit by:

grep -rE "doc\.getReceivers|doc\.getTags|doc\.getTrainingLabels" backend/src/main/java

For every match, confirm it's either inside a @Transactional method or the call site uses an @EntityGraph-annotated repository method.

Bonus: `@BatchSize(50)` on collections

For the inevitable list-of-detail-views case where the graph isn't requested but multiple Documents' collections are accessed, add @BatchSize(50) so Hibernate fetches in 50-row batches instead of one-by-one.

Critical files

backend/src/main/java/org/raddatz/familienarchiv/document/Document.java
backend/src/main/java/org/raddatz/familienarchiv/document/DocumentRepository.java — add @EntityGraph annotations
backend/src/main/java/org/raddatz/familienarchiv/document/DocumentService.java — confirm all collection accesses are in @Transactional scope
backend/src/test/java/.../document/DocumentRepositoryTest.java — add a test that asserts query count for findById and findAll

Verification

Query-count assertion: Hibernate.unproxy() + @SpringBootTest with statistics enabled. After loading 50 documents via findAll(...), query count should be ≤2 (the page query + 1 batch fetch via @BatchSize), not 50+.
Live pg_stat_statements: enable pg_stat_statements, exercise the same 24-call sequence the audit used, confirm document_receivers calls drop from 2,733 to ≤24 (1 per HTTP request that needs the collection).
Smoke: full app sweep — list, detail, search, document edit. No LazyInitializationException anywhere in logs.
Test suite: ./mvnw test all green; coverage gate intact.

Acceptance criteria

All three associations are LAZY.
findById uses @EntityGraph("Document.full").
findAll(Specification, Pageable) uses @EntityGraph("Document.list").
@BatchSize(50) on receivers + tags.
Query-count assertion test added.
Live pg_stat_statements shows ≤24 calls per 24 HTTP requests for document_receivers (1 per request, not 1 per row).
No LazyInitializationException regressions in any existing test.

Effort

M — 2 days. Most of the time is auditing every collection access for transactional scope.

Risk if not addressed

Linear query-count growth as data scales. At 10K documents and a 50-row search page, today's 60-query response becomes 600-query — likely the first 5xx-from-the-DB-pool incident in production.

Tracked in audit doc as F-17 (High, escalated from Medium after dynamic confirmation).

## Context `backend/src/main/java/org/raddatz/familienarchiv/document/Document.java` uses `FetchType.EAGER` on three associations: ```java @ManyToMany(fetch = FetchType.EAGER) Set<Person> receivers; // line 121 @ManyToMany(fetch = FetchType.EAGER) Set<Tag> tags; // line 130 @ElementCollection(fetch = FetchType.EAGER) Set<String> trainingLabels; // line 135 ``` Pre-prod audit ran `pg_stat_statements` against the live stack and confirmed the predicted N+1: ``` 24 HTTP requests (8 endpoints × 3 iterations) → 2,733 calls to document_receivers join 2,733 calls to document_tags join 2,733 calls to document_training_labels = 8,199 collection-loading queries (~342 per request) ``` At 1,520 documents the latency is hidden (mean 0.07 ms each, total ~200 ms over 2,733 calls). At 10× data volume, every list/search response triples query count linearly. The hot search path uses a hand-written native SQL with `LATERAL JOIN` so it doesn't suffer the N+1. The N+1 hits the *list* and *detail-after-search* paths — exactly the Reader Experience milestone's load profile. ## Approach Switch to LAZY by default; declare `@EntityGraph`s on read paths that need the associations. ### `Document.java` ```java @ManyToMany(fetch = LAZY) Set<Person> receivers; @ManyToMany(fetch = LAZY) Set<Tag> tags; @ElementCollection(fetch = LAZY) Set<String> trainingLabels; @ManyToOne(fetch = LAZY) Person sender; // already LAZY ✓ ``` ### Named entity graphs ```java @NamedEntityGraph(name = "Document.full", attributeNodes = { @NamedAttributeNode("receivers"), @NamedAttributeNode("tags"), @NamedAttributeNode("sender") }) @NamedEntityGraph(name = "Document.list", attributeNodes = { @NamedAttributeNode("sender") }) ``` ### Repository ```java @EntityGraph(value = "Document.full") Optional<Document> findById(UUID id); @EntityGraph(value = "Document.list") Page<Document> findAll(Specification<Document> spec, Pageable pageable); ``` ### LazyInitializationException risk `spring.jpa.open-in-view: false` is already set (audit verified). That means lazy collections accessed *outside* the transactional service method will throw. This is good — it forces every access into a service method or a properly-fetched entity. The risk is in `*Mapper` / `DocumentService` methods that read from collections after returning to the controller. Audit by: ``` grep -rE "doc\.getReceivers|doc\.getTags|doc\.getTrainingLabels" backend/src/main/java ``` For every match, confirm it's either inside a `@Transactional` method **or** the call site uses an `@EntityGraph`-annotated repository method. ### Bonus: `@BatchSize(50)` on collections For the inevitable list-of-detail-views case where the graph isn't requested but multiple Documents' collections are accessed, add `@BatchSize(50)` so Hibernate fetches in 50-row batches instead of one-by-one. ## Critical files - `backend/src/main/java/org/raddatz/familienarchiv/document/Document.java` - `backend/src/main/java/org/raddatz/familienarchiv/document/DocumentRepository.java` — add `@EntityGraph` annotations - `backend/src/main/java/org/raddatz/familienarchiv/document/DocumentService.java` — confirm all collection accesses are in `@Transactional` scope - `backend/src/test/java/.../document/DocumentRepositoryTest.java` — add a test that asserts query count for `findById` and `findAll` ## Verification 1. **Query-count assertion**: `Hibernate.unproxy()` + `@SpringBootTest` with statistics enabled. After loading 50 documents via `findAll(...)`, query count should be ≤2 (the page query + 1 batch fetch via @BatchSize), not 50+. 2. **Live `pg_stat_statements`**: enable `pg_stat_statements`, exercise the same 24-call sequence the audit used, confirm `document_receivers` calls drop from 2,733 to ≤24 (1 per HTTP request that needs the collection). 3. **Smoke**: full app sweep — list, detail, search, document edit. No `LazyInitializationException` anywhere in logs. 4. **Test suite**: `./mvnw test` all green; coverage gate intact. ## Acceptance criteria - [ ] All three associations are `LAZY`. - [ ] `findById` uses `@EntityGraph("Document.full")`. - [ ] `findAll(Specification, Pageable)` uses `@EntityGraph("Document.list")`. - [ ] `@BatchSize(50)` on receivers + tags. - [ ] Query-count assertion test added. - [ ] Live `pg_stat_statements` shows ≤24 calls per 24 HTTP requests for `document_receivers` (1 per request, not 1 per row). - [ ] No `LazyInitializationException` regressions in any existing test. ## Effort M — 2 days. Most of the time is auditing every collection access for transactional scope. ## Risk if not addressed Linear query-count growth as data scales. At 10K documents and a 50-row search page, today's 60-query response becomes 600-query — likely the first 5xx-from-the-DB-pool incident in production. Tracked in audit doc as **F-17** (High, escalated from Medium after dynamic confirmation).

marcel added the P1-high refactor labels 2026-05-07 17:27:03 +02:00

marcel commented

2026-05-07 17:37:17 +02:00

👨‍💻 Markus Keller — Application Architect

Observations

The issue is well-diagnosed: FetchType.EAGER on three collections (receivers, tags, trainingLabels) generates the confirmed N+1 pattern. The audit numbers are credible — 2,733 collection-loading queries for 24 HTTP requests is exactly what EAGER on a paged list produces.
open-in-view: false is already set (confirmed in application.yaml line 15), which is the right precondition for a safe LAZY migration. Good foundation.
The proposed @NamedEntityGraph approach is the correct JPA idiom. Document.full (receivers + tags + sender) for detail view, Document.list (sender only) for list — the boundary makes sense given that the list view renders tags as chips but not receivers inline.
One gap in the plan: DocumentService.getRecentActivity() calls documentRepository.findAll(PageRequest...) without a specification, and the returned documents later have their tags resolved via resolveDocumentTagColors(). This call path is NOT annotated @Transactional, so with LAZY collections it will throw LazyInitializationException the moment d.getTags().stream() is called in resolveDocumentTagColors. This is a critical path that the issue's grep audit would NOT catch because the method name is getRecentActivity, not get*ById.
DashboardService.buildCaption() accesses doc.getReceivers() on a document fetched via documentService.getDocumentById(). getDocumentById() is also not @Transactional, meaning the session is closed before buildCaption() is called. Currently safe because EAGER loads receivers into the entity. After LAZY, this becomes a LazyInitializationException. This is a cross-domain collection access that's invisible to the grep suggested in the issue.
MassImportService (line 337-338) calls doc.getReceivers().addAll(receivers) and doc.getTags().add(tag) — but this method IS wrapped in @Transactional (line 263), so it's safe.
DocumentVersionService accesses getReceivers() and getTags() (lines 185-206). These are called from within @Transactional write methods in DocumentService, so they're safe as long as the session is still open.
The @BatchSize(50) suggestion is a good belt-and-suspenders addition. It handles the findSinglePersonCorrespondence / findConversation / getDocumentsBySender paths that return List<Document> without going through an @EntityGraph-annotated repository method.

Recommendations

Before switching to LAZY, add @Transactional to DocumentService.getDocumentById() — it already calls tagService.resolveEffectiveColors(doc.getTags()) outside a transaction. Every caller that uses the returned document's collections (DashboardService, DocumentController detail route) depends on this being safe.
Also add @Transactional to DocumentService.getRecentActivity() or add an @EntityGraph hint to the findAll(PageRequest) call inside it — both are valid; the @Transactional is simpler.
The @NamedEntityGraph definitions belong on Document.java (correct placement). Add @BatchSize(50) to the collection fields directly (Hibernate annotation), not as a separate repository config.
Write the ADR before implementing: the choice to use @NamedEntityGraph over Spring Data projections or explicit JPQL JOIN FETCH queries has lasting consequences. Record it.
The issue lists DocumentRepositoryTest as the target test class. It currently uses @DataJpaTest with real Postgres (good). The query-count assertion should use Hibernate's Statistics API via SessionFactory — this is more deterministic than pg_stat_statements in a test context.

Open Decisions

Should Document.list entity graph include tags as well? The list view renders tags as colored chips on every card. Fetching sender-only in findAll saves one join, but the tag color resolution still happens in resolveDocumentTagColors() — meaning tags will be lazy-loaded one document at a time there. Including tags in Document.list may be the right call depending on how search results are rendered.

## 👨‍💻 Markus Keller — Application Architect ### Observations - The issue is well-diagnosed: `FetchType.EAGER` on three collections (`receivers`, `tags`, `trainingLabels`) generates the confirmed N+1 pattern. The audit numbers are credible — 2,733 collection-loading queries for 24 HTTP requests is exactly what EAGER on a paged list produces. - `open-in-view: false` is already set (confirmed in `application.yaml` line 15), which is the right precondition for a safe LAZY migration. Good foundation. - The proposed `@NamedEntityGraph` approach is the correct JPA idiom. `Document.full` (receivers + tags + sender) for detail view, `Document.list` (sender only) for list — the boundary makes sense given that the list view renders tags as chips but not receivers inline. - **One gap in the plan:** `DocumentService.getRecentActivity()` calls `documentRepository.findAll(PageRequest...)` without a specification, and the returned documents later have their tags resolved via `resolveDocumentTagColors()`. This call path is NOT annotated `@Transactional`, so with LAZY collections it will throw `LazyInitializationException` the moment `d.getTags().stream()` is called in `resolveDocumentTagColors`. This is a critical path that the issue's grep audit would NOT catch because the method name is `getRecentActivity`, not `get*ById`. - `DashboardService.buildCaption()` accesses `doc.getReceivers()` on a document fetched via `documentService.getDocumentById()`. `getDocumentById()` is also not `@Transactional`, meaning the session is closed before `buildCaption()` is called. Currently safe because EAGER loads receivers into the entity. After LAZY, this becomes a `LazyInitializationException`. This is a cross-domain collection access that's invisible to the grep suggested in the issue. - `MassImportService` (line 337-338) calls `doc.getReceivers().addAll(receivers)` and `doc.getTags().add(tag)` — but this method IS wrapped in `@Transactional` (line 263), so it's safe. - `DocumentVersionService` accesses `getReceivers()` and `getTags()` (lines 185-206). These are called from within `@Transactional` write methods in `DocumentService`, so they're safe as long as the session is still open. - The `@BatchSize(50)` suggestion is a good belt-and-suspenders addition. It handles the `findSinglePersonCorrespondence` / `findConversation` / `getDocumentsBySender` paths that return `List<Document>` without going through an `@EntityGraph`-annotated repository method. ### Recommendations 1. **Before switching to LAZY**, add `@Transactional` to `DocumentService.getDocumentById()` — it already calls `tagService.resolveEffectiveColors(doc.getTags())` outside a transaction. Every caller that uses the returned document's collections (DashboardService, DocumentController detail route) depends on this being safe. 2. **Also add `@Transactional` to `DocumentService.getRecentActivity()`** or add an `@EntityGraph` hint to the `findAll(PageRequest)` call inside it — both are valid; the `@Transactional` is simpler. 3. The `@NamedEntityGraph` definitions belong on `Document.java` (correct placement). Add `@BatchSize(50)` to the collection fields directly (Hibernate annotation), not as a separate repository config. 4. Write the ADR before implementing: the choice to use `@NamedEntityGraph` over Spring Data projections or explicit JPQL JOIN FETCH queries has lasting consequences. Record it. 5. The issue lists `DocumentRepositoryTest` as the target test class. It currently uses `@DataJpaTest` with real Postgres (good). The query-count assertion should use Hibernate's `Statistics` API via `SessionFactory` — this is more deterministic than `pg_stat_statements` in a test context. ### Open Decisions - Should `Document.list` entity graph include `tags` as well? The list view renders tags as colored chips on every card. Fetching sender-only in `findAll` saves one join, but the tag color resolution still happens in `resolveDocumentTagColors()` — meaning tags will be lazy-loaded one document at a time there. Including `tags` in `Document.list` may be the right call depending on how search results are rendered.

marcel commented

2026-05-07 17:37:36 +02:00

👨‍💻 Felix Brandt — Fullstack Developer

Observations

The issue's grep audit command is correct but incomplete. I ran a broader search across the codebase. Beyond DocumentService, collection accesses exist in:
- DashboardService.buildCaption() — accesses doc.getReceivers() on a document returned from documentService.getDocumentById(), which has no @Transactional boundary. This will throw LazyInitializationException post-migration.
- DocumentService.getRecentActivity() (line 570-574) — not @Transactional, returns Document list and is followed immediately by resolveDocumentTagColors() which iterates d.getTags().stream(). Session closed before the tag stream.
- DocumentService.applyBulkEditToDocument() (lines 491, 501) — inside @Transactional, safe.
- MassImportService (lines 337-338) — inside @Transactional, safe.
- DocumentVersionService (lines 185-206) — called from within write transactions, safe.
DocumentService.updateDocumentTags() (line 381) is a public write method with @Transactional missing. It's called by applyBatchMetadata (itself inside a transaction), so currently no problem, but it's worth noting as a smell — if someone calls it directly from a controller it would be non-transactional.
The issue proposes @EntityGraph on findById and findAll(Specification, Pageable). The repository currently has no findAll(Specification, Pageable) — it inherits findAll(Specification<T>, Pageable) from JpaSpecificationExecutor. Overriding this in the interface with @EntityGraph is valid but requires care: it overrides the inherited method, which is fine in Spring Data JPA as long as the signature exactly matches.
The @NamedEntityGraph annotation must appear on the Document entity class (not the repository). The placement in the issue spec is correct.
@Builder.Default on the collections is already in place. After switching to LAZY, @Builder.Default with new HashSet<>() remains correct — new entities start with empty initialized sets.

Recommendations

Add @Transactional to getDocumentById() (line 739). It's a read method, but the LAZY access to tags inside it (via tagService.resolveEffectiveColors(doc.getTags())) requires an open session. Per project convention this is one of the documented exceptions to "reads don't need @Transactional."
Wrap getRecentActivity() in @Transactional(readOnly = true) or fetch tags eagerly for that specific call. The tags are accessed right after in resolveDocumentTagColors().
Implement the @NamedEntityGraph + @EntityGraph changes in this exact order: (a) add graphs to Document.java, (b) add @EntityGraph to the two repository methods, (c) add @BatchSize(50) to the collection fields, (d) then run the full test suite. Do not switch FetchType.LAZY first and debug later.
The @EntityGraph override for findAll(Specification, Pageable) should be tested with a query-count assertion in DocumentRepositoryTest — not just a smoke test. The test must assert queryCount <= 2 for a 50-document page.
updateDocumentTags() should get @Transactional annotation — it's a write method without one, which is a clean-code violation regardless of this issue.

Open Decisions

None — the path forward is clear. LAZY + @EntityGraph + @BatchSize(50) + fix the two non-transactional read methods that access collections.

## 👨‍💻 Felix Brandt — Fullstack Developer ### Observations - The issue's grep audit command is correct but incomplete. I ran a broader search across the codebase. Beyond `DocumentService`, collection accesses exist in: - `DashboardService.buildCaption()` — accesses `doc.getReceivers()` on a document returned from `documentService.getDocumentById()`, which has no `@Transactional` boundary. This will throw `LazyInitializationException` post-migration. - `DocumentService.getRecentActivity()` (line 570-574) — not `@Transactional`, returns `Document` list and is followed immediately by `resolveDocumentTagColors()` which iterates `d.getTags().stream()`. Session closed before the tag stream. - `DocumentService.applyBulkEditToDocument()` (lines 491, 501) — inside `@Transactional`, safe. - `MassImportService` (lines 337-338) — inside `@Transactional`, safe. - `DocumentVersionService` (lines 185-206) — called from within write transactions, safe. - `DocumentService.updateDocumentTags()` (line 381) is a public write method with `@Transactional` missing. It's called by `applyBatchMetadata` (itself inside a transaction), so currently no problem, but it's worth noting as a smell — if someone calls it directly from a controller it would be non-transactional. - The issue proposes `@EntityGraph` on `findById` and `findAll(Specification, Pageable)`. The repository currently has no `findAll(Specification, Pageable)` — it inherits `findAll(Specification<T>, Pageable)` from `JpaSpecificationExecutor`. Overriding this in the interface with `@EntityGraph` is valid but requires care: it overrides the inherited method, which is fine in Spring Data JPA as long as the signature exactly matches. - The `@NamedEntityGraph` annotation must appear on the `Document` entity class (not the repository). The placement in the issue spec is correct. - `@Builder.Default` on the collections is already in place. After switching to LAZY, `@Builder.Default` with `new HashSet<>()` remains correct — new entities start with empty initialized sets. ### Recommendations 1. **Add `@Transactional` to `getDocumentById()`** (line 739). It's a read method, but the LAZY access to `tags` inside it (via `tagService.resolveEffectiveColors(doc.getTags())`) requires an open session. Per project convention this is one of the documented exceptions to "reads don't need `@Transactional`." 2. **Wrap `getRecentActivity()` in `@Transactional(readOnly = true)`** or fetch tags eagerly for that specific call. The tags are accessed right after in `resolveDocumentTagColors()`. 3. Implement the `@NamedEntityGraph` + `@EntityGraph` changes in this exact order: (a) add graphs to `Document.java`, (b) add `@EntityGraph` to the two repository methods, (c) add `@BatchSize(50)` to the collection fields, (d) then run the full test suite. Do not switch `FetchType.LAZY` first and debug later. 4. The `@EntityGraph` override for `findAll(Specification, Pageable)` should be tested with a query-count assertion in `DocumentRepositoryTest` — not just a smoke test. The test must assert `queryCount <= 2` for a 50-document page. 5. `updateDocumentTags()` should get `@Transactional` annotation — it's a write method without one, which is a clean-code violation regardless of this issue. ### Open Decisions - None — the path forward is clear. LAZY + `@EntityGraph` + `@BatchSize(50)` + fix the two non-transactional read methods that access collections.

marcel commented

2026-05-07 17:37:49 +02:00

👨‍💻 Tobias Wendt — DevOps & Platform Engineer

Observations

This is a pure backend refactor — no Docker Compose, CI pipeline, or infrastructure changes required. Good.
The issue's verification step 2 mentions pg_stat_statements against the "live stack." This implies running the verification against the local dev Docker Compose stack, not against a shared staging environment. That's fine — pg_stat_statements works the same way locally.
The audit that confirmed the N+1 was done with pg_stat_statements already. Confirm pg_stat_statements is enabled in the local dev Postgres config. The Docker Compose postgres image does not enable it by default. Check docker-compose.yml for POSTGRES_INITDB_ARGS or a mounted postgresql.conf.
The issue's effort estimate is M (2 days). The hidden cost is the audit of collection access sites — which I've verified is more expansive than the grep in the issue suggests (see Markus and Felix's comments). Budget an extra half-day for the cross-domain access audit.
No new Docker services, no new environment variables, no changes to the Compose file. CI pipeline is unaffected.
The test suite gate (88% branch coverage per CLAUDE.md) means adding a query-count test is not optional — it's needed to keep coverage from dropping on the newly modified repository methods.

Recommendations

Before running pg_stat_statements verification, confirm the extension is loaded: docker-compose exec db psql -U postgres -c "SELECT * FROM pg_extension WHERE extname = 'pg_stat_statements';". If not present, add command: postgres -c shared_preload_libraries=pg_stat_statements to the db service in docker-compose.yml for local dev.
Use the Hibernate Statistics approach for CI-safe query count assertions (no pg_stat_statements dependency in tests). Reserve pg_stat_statements for the manual live-stack verification.
After the change lands, run the 24-request sequence from the audit manually and compare the before/after pg_stat_statements numbers. Document the result in the PR description so there's a paper trail for the claimed improvement.
No new environment variables or secrets are introduced. No deployment procedure changes.

Open Decisions

None from infrastructure perspective.

## 👨‍💻 Tobias Wendt — DevOps & Platform Engineer ### Observations - This is a pure backend refactor — no Docker Compose, CI pipeline, or infrastructure changes required. Good. - The issue's verification step 2 mentions `pg_stat_statements` against the "live stack." This implies running the verification against the local dev Docker Compose stack, not against a shared staging environment. That's fine — `pg_stat_statements` works the same way locally. - The audit that confirmed the N+1 was done with `pg_stat_statements` already. Confirm `pg_stat_statements` is enabled in the local dev Postgres config. The Docker Compose postgres image does not enable it by default. Check `docker-compose.yml` for `POSTGRES_INITDB_ARGS` or a mounted `postgresql.conf`. - The issue's effort estimate is M (2 days). The hidden cost is the audit of collection access sites — which I've verified is more expansive than the grep in the issue suggests (see Markus and Felix's comments). Budget an extra half-day for the cross-domain access audit. - No new Docker services, no new environment variables, no changes to the Compose file. CI pipeline is unaffected. - The test suite gate (88% branch coverage per `CLAUDE.md`) means adding a query-count test is not optional — it's needed to keep coverage from dropping on the newly modified repository methods. ### Recommendations 1. Before running `pg_stat_statements` verification, confirm the extension is loaded: `docker-compose exec db psql -U postgres -c "SELECT * FROM pg_extension WHERE extname = 'pg_stat_statements';"`. If not present, add `command: postgres -c shared_preload_libraries=pg_stat_statements` to the db service in `docker-compose.yml` for local dev. 2. Use the Hibernate Statistics approach for CI-safe query count assertions (no `pg_stat_statements` dependency in tests). Reserve `pg_stat_statements` for the manual live-stack verification. 3. After the change lands, run the 24-request sequence from the audit manually and compare the before/after `pg_stat_statements` numbers. Document the result in the PR description so there's a paper trail for the claimed improvement. 4. No new environment variables or secrets are introduced. No deployment procedure changes. ### Open Decisions - None from infrastructure perspective.

marcel commented

2026-05-07 17:38:05 +02:00

👨‍💻 Elicit — Requirements Engineer

Observations

The issue is well-scoped: the problem is clearly stated, the root cause is confirmed with empirical data, the approach is specified, and the acceptance criteria are measurable. This is a high-quality technical issue that meets the Definition of Ready.
The acceptance criteria are binary and testable — good. One is slightly ambiguous: "findAll(Specification, Pageable) uses @EntityGraph(Document.list)" doesn't account for the fact that searchDocuments() has three code paths (RECEIVER sort, SENDER sort, RELEVANCE sort) that call findAll(spec) — the no-pageable overload. These paths also need to be covered, or the acceptance criterion should explicitly list which paths are in scope.
The effort estimate is M (2 days). Given the additional collection access sites identified in DashboardService and the non-transactional getDocumentById(), this estimate is slightly optimistic. S/M boundary is appropriate — L would be over-estimating.
The risk statement is accurate and quantified. The linear scaling claim is well-founded for EAGER collections on list queries. Good NFR framing.
Missing from the issue: there is no acceptance criterion covering DashboardService.buildCaption() or DocumentService.getRecentActivity(). If those are not fixed, the "No LazyInitializationException regressions" criterion will fail in prod even if all listed ACs pass.
The "Bonus: @BatchSize(50)" framing undersells this. @BatchSize should be a required AC, not a bonus — it's the safety net for any list path that doesn't go through an @EntityGraph method.

Recommendations

Add an AC: "All existing collection access sites in DashboardService and DocumentService.getRecentActivity() are confirmed to run inside an open Hibernate session or are annotated @Transactional(readOnly = true)."
Promote @BatchSize(50) from "Bonus" to a required AC. It's cheap to add and eliminates the entire class of fallback lazy-load bugs.
Clarify the findAll(Specification, Pageable) AC: state explicitly which call sites it covers (the fast-path in searchDocuments only, or all findAll overloads).
The verification step using pg_stat_statements is excellent — keep it as a post-merge manual check and document the result in the PR.

Open Decisions

None — the requirements are clear enough to implement. The gaps identified are straightforward additions, not design questions.

## 👨‍💻 Elicit — Requirements Engineer ### Observations - The issue is well-scoped: the problem is clearly stated, the root cause is confirmed with empirical data, the approach is specified, and the acceptance criteria are measurable. This is a high-quality technical issue that meets the Definition of Ready. - The acceptance criteria are binary and testable — good. One is slightly ambiguous: "findAll(Specification, Pageable) uses @EntityGraph(Document.list)" doesn't account for the fact that `searchDocuments()` has three code paths (RECEIVER sort, SENDER sort, RELEVANCE sort) that call `findAll(spec)` — the no-pageable overload. These paths also need to be covered, or the acceptance criterion should explicitly list which paths are in scope. - The effort estimate is M (2 days). Given the additional collection access sites identified in `DashboardService` and the non-transactional `getDocumentById()`, this estimate is slightly optimistic. S/M boundary is appropriate — L would be over-estimating. - The risk statement is accurate and quantified. The linear scaling claim is well-founded for EAGER collections on list queries. Good NFR framing. - Missing from the issue: there is no acceptance criterion covering `DashboardService.buildCaption()` or `DocumentService.getRecentActivity()`. If those are not fixed, the "No LazyInitializationException regressions" criterion will fail in prod even if all listed ACs pass. - The "Bonus: @BatchSize(50)" framing undersells this. `@BatchSize` should be a required AC, not a bonus — it's the safety net for any list path that doesn't go through an `@EntityGraph` method. ### Recommendations 1. Add an AC: "All existing collection access sites in `DashboardService` and `DocumentService.getRecentActivity()` are confirmed to run inside an open Hibernate session or are annotated `@Transactional(readOnly = true)`." 2. Promote `@BatchSize(50)` from "Bonus" to a required AC. It's cheap to add and eliminates the entire class of fallback lazy-load bugs. 3. Clarify the `findAll(Specification, Pageable)` AC: state explicitly which call sites it covers (the fast-path in `searchDocuments` only, or all `findAll` overloads). 4. The verification step using `pg_stat_statements` is excellent — keep it as a post-merge manual check and document the result in the PR. ### Open Decisions - None — the requirements are clear enough to implement. The gaps identified are straightforward additions, not design questions.

marcel commented

2026-05-07 17:38:20 +02:00

👨‍💻 Nora "NullX" Steiner — Security Engineer

Observations

This refactor is performance-motivated, not security-motivated, but LAZY fetch changes carry one indirect security risk worth naming: Hibernate proxy objects in serialized responses.
With EAGER, Jackson serializes fully-loaded entities. With LAZY, Jackson may attempt to serialize Hibernate proxy objects. If @JsonIgnoreProperties({"hibernateLazyInitializer", "handler"}) is not present on Person or Tag, Jackson will attempt to traverse the proxy and either trigger unexpected queries or throw a serialization error. Document.java does not currently annotate Person or Tag references with @JsonIgnoreProperties. Check Person.java and Tag.java for this annotation.
There is no direct security vulnerability introduced by this change. @EntityGraph does not affect authorization checks — @RequirePermission is enforced at the controller layer by AOP regardless of fetch strategy.
The open-in-view: false setting is confirmed present. This is the correct security-aligned default — it prevents Hibernate sessions from being held open across the full HTTP request lifecycle, which would otherwise make lazy loading silently work outside transaction boundaries and mask real bugs.
The issue's grep command (grep -rE "doc\.getReceivers|doc\.getTags|doc\.getTrainingLabels") is a good starting point but misses method-chained accesses like DashboardService.buildCaption() which accesses doc.getReceivers() — a live LazyInitializationException risk after this change. A LazyInitializationException surfacing at the controller layer may produce an unstructured 500 response that leaks stack trace detail if the global exception handler doesn't catch it. Confirm GlobalExceptionHandler covers LazyInitializationException.
No new SQL injection surface is introduced. @EntityGraph uses JPA's join mechanism, not string concatenation.

Recommendations

Check Person.java and Tag.java for @JsonIgnoreProperties({"hibernateLazyInitializer", "handler"}). If absent, add it before switching to LAZY — otherwise Jackson serialization of lazy proxies produces either unexpected extra queries or HttpMessageConversionException in the response.
Confirm GlobalExceptionHandler covers LazyInitializationException with a structured JSON response (not a raw 500 with stack trace). If it doesn't, add a handler that maps it to a 500 with the INTERNAL_ERROR error code. A LazyInitializationException in prod is a bug, but it shouldn't leak implementation details to the client.
Add a regression test that verifies the detail endpoint returns a structured error (not a stack trace) when something goes wrong loading collections — this gives you a safety net if a future change reintroduces a lazy access outside a transaction.

Open Decisions

None — the security surface of this change is limited and the mitigations are clear.

## 👨‍💻 Nora "NullX" Steiner — Security Engineer ### Observations - This refactor is performance-motivated, not security-motivated, but LAZY fetch changes carry one indirect security risk worth naming: **Hibernate proxy objects in serialized responses**. - With EAGER, Jackson serializes fully-loaded entities. With LAZY, Jackson may attempt to serialize Hibernate proxy objects. If `@JsonIgnoreProperties({"hibernateLazyInitializer", "handler"})` is not present on `Person` or `Tag`, Jackson will attempt to traverse the proxy and either trigger unexpected queries or throw a serialization error. `Document.java` does not currently annotate `Person` or `Tag` references with `@JsonIgnoreProperties`. Check `Person.java` and `Tag.java` for this annotation. - There is no direct security vulnerability introduced by this change. `@EntityGraph` does not affect authorization checks — `@RequirePermission` is enforced at the controller layer by AOP regardless of fetch strategy. - The `open-in-view: false` setting is confirmed present. This is the correct security-aligned default — it prevents Hibernate sessions from being held open across the full HTTP request lifecycle, which would otherwise make lazy loading silently work outside transaction boundaries and mask real bugs. - The issue's grep command (`grep -rE "doc\.getReceivers|doc\.getTags|doc\.getTrainingLabels"`) is a good starting point but misses method-chained accesses like `DashboardService.buildCaption()` which accesses `doc.getReceivers()` — a live `LazyInitializationException` risk after this change. A `LazyInitializationException` surfacing at the controller layer may produce an unstructured 500 response that leaks stack trace detail if the global exception handler doesn't catch it. Confirm `GlobalExceptionHandler` covers `LazyInitializationException`. - No new SQL injection surface is introduced. `@EntityGraph` uses JPA's join mechanism, not string concatenation. ### Recommendations 1. **Check `Person.java` and `Tag.java` for `@JsonIgnoreProperties({"hibernateLazyInitializer", "handler"})`**. If absent, add it before switching to LAZY — otherwise Jackson serialization of lazy proxies produces either unexpected extra queries or `HttpMessageConversionException` in the response. 2. **Confirm `GlobalExceptionHandler` covers `LazyInitializationException`** with a structured JSON response (not a raw 500 with stack trace). If it doesn't, add a handler that maps it to a 500 with the `INTERNAL_ERROR` error code. A `LazyInitializationException` in prod is a bug, but it shouldn't leak implementation details to the client. 3. Add a regression test that verifies the detail endpoint returns a structured error (not a stack trace) when something goes wrong loading collections — this gives you a safety net if a future change reintroduces a lazy access outside a transaction. ### Open Decisions - None — the security surface of this change is limited and the mitigations are clear.

marcel commented

2026-05-07 17:38:39 +02:00

👨‍💻 Sara Holt — QA Engineer

Observations

The existing DocumentRepositoryTest is a @DataJpaTest with real Postgres via Testcontainers — the right foundation. The test file exists but currently has no query-count assertion. Adding one is required by the acceptance criteria.
The issue specifies "query count should be ≤2 for 50 documents via findAll". This is testable with Hibernate's Statistics API. The @DataJpaTest slice does NOT include SessionFactory by default — you'll need to autowire EntityManagerFactory and cast to SessionFactoryImpl to access statistics, OR configure spring.jpa.properties.hibernate.generate_statistics=true in src/test/resources/application-test.yaml and inject StatisticsService.
The issue's verification plan covers: query-count assertion test, pg_stat_statements live check, smoke test, and full test suite. This is a solid 4-layer plan. The missing layer is integration test coverage for the cross-domain scenarios — specifically DashboardService.getResume() which is not currently covered by any test that would catch a LazyInitializationException.
Three code paths in searchDocuments() bypass the @EntityGraph-annotated findAll(Specification, Pageable) method and call the no-pageable findAll(spec) instead (SENDER sort, RECEIVER sort, RELEVANCE sort). The query-count assertion test must cover at least the sorted/paged fast-path — the in-memory sort paths load all documents into memory anyway, so their N+1 behavior is different.
DocumentServiceTest uses Mockito — it will not catch LazyInitializationException issues because mock repositories return plain Java objects, not Hibernate proxies. The query-count test MUST be in DocumentRepositoryTest (real Postgres) or a @SpringBootTest integration test.

Recommendations

Query-count test structure — add this to DocumentRepositoryTest:

@Test
void findById_loadsReceiversAndTagsInOneQuery() {
    // arrange: persist a Document with 3 receivers and 5 tags
    // act: call documentRepository.findById(id)
    // assert: Hibernate statistics show ≤2 queries (1 for document + 1 JOIN FETCH)
}

@Test  
void findAll_withSpec_loadsPageOf50DocumentsInTwoQueriesMax() {
    // arrange: persist 50 documents with receivers and tags
    // act: call documentRepository.findAll(spec, PageRequest.of(0, 50))
    // assert: query count ≤ 2 via Hibernate statistics
}

Add Hibernate statistics config to src/test/resources/application-test.yaml:
spring.jpa.properties.hibernate.generate_statistics: true
Smoke-test DashboardService.getResume() in an @SpringBootTest integration test with a document that has receivers — this is currently not covered and will catch the LazyInitializationException in buildCaption() if it's not fixed.
All three findAll(spec) paths in searchDocuments() (SENDER, RECEIVER, RELEVANCE sorts) should be covered by at least one integration test asserting no LazyInitializationException is thrown.

Open Decisions

Should query-count tests be part of DocumentRepositoryTest (@DataJpaTest) or a new DocumentPerformanceTest (@SpringBootTest)? Recommendation: DocumentRepositoryTest for the repository-level assertions; a new @SpringBootTest class for the service-level smoke tests. This keeps the fast @DataJpaTest slice focused on repository contracts.

## 👨‍💻 Sara Holt — QA Engineer ### Observations - The existing `DocumentRepositoryTest` is a `@DataJpaTest` with real Postgres via Testcontainers — the right foundation. The test file exists but currently has no query-count assertion. Adding one is required by the acceptance criteria. - The issue specifies "query count should be ≤2 for 50 documents via findAll". This is testable with Hibernate's `Statistics` API. The `@DataJpaTest` slice does NOT include `SessionFactory` by default — you'll need to autowire `EntityManagerFactory` and cast to `SessionFactoryImpl` to access statistics, OR configure `spring.jpa.properties.hibernate.generate_statistics=true` in `src/test/resources/application-test.yaml` and inject `StatisticsService`. - The issue's verification plan covers: query-count assertion test, `pg_stat_statements` live check, smoke test, and full test suite. This is a solid 4-layer plan. The missing layer is **integration test coverage for the cross-domain scenarios** — specifically `DashboardService.getResume()` which is not currently covered by any test that would catch a `LazyInitializationException`. - Three code paths in `searchDocuments()` bypass the `@EntityGraph`-annotated `findAll(Specification, Pageable)` method and call the no-pageable `findAll(spec)` instead (SENDER sort, RECEIVER sort, RELEVANCE sort). The query-count assertion test must cover at least the sorted/paged fast-path — the in-memory sort paths load all documents into memory anyway, so their N+1 behavior is different. - `DocumentServiceTest` uses Mockito — it will not catch `LazyInitializationException` issues because mock repositories return plain Java objects, not Hibernate proxies. The query-count test MUST be in `DocumentRepositoryTest` (real Postgres) or a `@SpringBootTest` integration test. ### Recommendations 1. **Query-count test structure** — add this to `DocumentRepositoryTest`: ```java @Test void findById_loadsReceiversAndTagsInOneQuery() { // arrange: persist a Document with 3 receivers and 5 tags // act: call documentRepository.findById(id) // assert: Hibernate statistics show ≤2 queries (1 for document + 1 JOIN FETCH) } @Test void findAll_withSpec_loadsPageOf50DocumentsInTwoQueriesMax() { // arrange: persist 50 documents with receivers and tags // act: call documentRepository.findAll(spec, PageRequest.of(0, 50)) // assert: query count ≤ 2 via Hibernate statistics } ``` 2. **Add Hibernate statistics config** to `src/test/resources/application-test.yaml`: `spring.jpa.properties.hibernate.generate_statistics: true` 3. **Smoke-test `DashboardService.getResume()`** in an `@SpringBootTest` integration test with a document that has receivers — this is currently not covered and will catch the `LazyInitializationException` in `buildCaption()` if it's not fixed. 4. All three `findAll(spec)` paths in `searchDocuments()` (SENDER, RECEIVER, RELEVANCE sorts) should be covered by at least one integration test asserting no `LazyInitializationException` is thrown. ### Open Decisions - Should query-count tests be part of `DocumentRepositoryTest` (`@DataJpaTest`) or a new `DocumentPerformanceTest` (`@SpringBootTest`)? Recommendation: `DocumentRepositoryTest` for the repository-level assertions; a new `@SpringBootTest` class for the service-level smoke tests. This keeps the fast `@DataJpaTest` slice focused on repository contracts.

marcel commented

2026-05-07 17:38:50 +02:00

👨‍💻 Leonie Voss — UI/UX Design Lead

Observations

This is a backend performance refactor with no direct UI changes. From a user experience perspective, the change is invisible if implemented correctly — and that's the right outcome.
The Reader Experience milestone impact is the most important UX angle here. The issue correctly identifies that the N+1 hits the list and detail-after-search paths — precisely what readers use on phones. At 10× data volume, a tripled query count would produce noticeable latency on those paths. This refactor protects the reading experience before it degrades.
If the @EntityGraph configuration is wrong and produces LazyInitializationException errors in production, the user-visible result is a blank page or an unhandled 500 — the worst possible outcome for the 60+ audience who are not comfortable with technical failures. The thorough testing approach proposed in the issue is the right insurance.
No impact on i18n, accessibility, component structure, or visual design. No frontend changes required.

Recommendations

After the refactor lands, do a quick smoke test of the search → detail flow at the UI level (not just API level). Confirm tags and sender/receiver names appear correctly in the document detail view and search results — these are the visual elements backed by the refactored collections.
If load time on the document list improves measurably (even at 1.5K documents), consider adding a brief browser-side performance note to the PR description — useful context for anyone investigating future slowdowns.

Open Decisions

None from UX perspective.

## 👨‍💻 Leonie Voss — UI/UX Design Lead ### Observations - This is a backend performance refactor with no direct UI changes. From a user experience perspective, the change is invisible if implemented correctly — and that's the right outcome. - The Reader Experience milestone impact is the most important UX angle here. The issue correctly identifies that the N+1 hits the list and detail-after-search paths — precisely what readers use on phones. At 10× data volume, a tripled query count would produce noticeable latency on those paths. This refactor protects the reading experience before it degrades. - If the `@EntityGraph` configuration is wrong and produces `LazyInitializationException` errors in production, the user-visible result is a blank page or an unhandled 500 — the worst possible outcome for the 60+ audience who are not comfortable with technical failures. The thorough testing approach proposed in the issue is the right insurance. - No impact on i18n, accessibility, component structure, or visual design. No frontend changes required. ### Recommendations 1. After the refactor lands, do a quick smoke test of the search → detail flow at the UI level (not just API level). Confirm tags and sender/receiver names appear correctly in the document detail view and search results — these are the visual elements backed by the refactored collections. 2. If load time on the document list improves measurably (even at 1.5K documents), consider adding a brief browser-side performance note to the PR description — useful context for anyone investigating future slowdowns. ### Open Decisions - None from UX perspective.

marcel commented

2026-05-07 17:39:11 +02:00

Decision Queue

One genuine tradeoff needs a call before implementation starts.

DQ-1 — Should `Document.list` entity graph include `tags`? (raised by Markus)

Context: The issue proposes Document.list fetches only sender. The list view (search results) renders tag chips on every document card. After switching to LAZY, resolveDocumentTagColors() iterates d.getTags().stream() on the returned list — this would trigger N lazy loads (one per document) unless tags are included in the graph or the method runs inside a transaction with @BatchSize as the fallback.

Option A — Include tags in Document.list
One JOIN FETCH covers both sender and tags. Zero lazy loads in resolveDocumentTagColors(). Costs one additional join on every list query, but that join already happened implicitly under EAGER — so this is neutral vs. today.

Option B — Keep Document.list sender-only, rely on @BatchSize(50) + @Transactional
Tags are batch-fetched in groups of 50 when resolveDocumentTagColors() touches them inside a transactional boundary. Requires getRecentActivity() and the search service to be wrapped in @Transactional(readOnly = true). Slightly more complex; the batch fetch adds 1-2 extra queries per 50-document page, but avoids the full JOIN FETCH on list queries that don't render tags.

Recommendation: Option A. The list view always renders tags, so the join is never wasted. It's the simpler invariant and matches what EAGER gave you today — one load, all data present, no session tricks required.

Owner: Marcel. Call needed before the @NamedEntityGraph definitions are written into Document.java.

## Decision Queue One genuine tradeoff needs a call before implementation starts. --- ### DQ-1 — Should `Document.list` entity graph include `tags`? (raised by Markus) **Context:** The issue proposes `Document.list` fetches only `sender`. The list view (search results) renders tag chips on every document card. After switching to LAZY, `resolveDocumentTagColors()` iterates `d.getTags().stream()` on the returned list — this would trigger N lazy loads (one per document) unless tags are included in the graph or the method runs inside a transaction with `@BatchSize` as the fallback. **Option A — Include `tags` in `Document.list`** One JOIN FETCH covers both sender and tags. Zero lazy loads in `resolveDocumentTagColors()`. Costs one additional join on every list query, but that join already happened implicitly under EAGER — so this is neutral vs. today. **Option B — Keep `Document.list` sender-only, rely on `@BatchSize(50)` + `@Transactional`** Tags are batch-fetched in groups of 50 when `resolveDocumentTagColors()` touches them inside a transactional boundary. Requires `getRecentActivity()` and the search service to be wrapped in `@Transactional(readOnly = true)`. Slightly more complex; the batch fetch adds 1-2 extra queries per 50-document page, but avoids the full JOIN FETCH on list queries that don't render tags. **Recommendation:** Option A. The list view always renders tags, so the join is never wasted. It's the simpler invariant and matches what EAGER gave you today — one load, all data present, no session tricks required. **Owner:** Marcel. Call needed before the `@NamedEntityGraph` definitions are written into `Document.java`.

marcel commented

2026-05-10 14:42:43 +02:00

👨‍💻 Felix Brandt — Fullstack Developer — Pre-implementation discussion summary

Walked through 5 open items before touching code. All resolved.

✅ DQ-1 — `Document.list` entity graph includes `{sender, tags}`

Decision: Option A. The list view always renders tag chips, so the join is never wasted. Matches what EAGER gave us today — one load, all data present, no session tricks required. @BatchSize fallback is belt-and-suspenders, not the primary strategy.

✅ `findAll(spec)` no-pageable sort paths also get `@EntityGraph`

Decision: Option A. Override both findAll(Specification<T>) (no-pageable) and findAll(Specification<T>, Pageable) in DocumentRepository with @EntityGraph("Document.list"). The three searchDocuments() sort paths (RECEIVER, SENDER, RELEVANCE) that call the no-pageable overload are covered explicitly — not left to @BatchSize fallback.

✅ `@JsonIgnoreProperties` added to `Person.java` and `Tag.java`

Decision: in scope for this PR. Confirmed neither class currently has @JsonIgnoreProperties({"hibernateLazyInitializer", "handler"}). Without it, Jackson may traverse Hibernate proxy internals on serialization after switching to LAZY — either triggering unexpected queries or throwing HttpMessageConversionException. One-liner per class, added while we're here.

✅ `@Transactional` added to `updateDocumentTags()`

Decision: in scope for this PR. It's a write method with no @Transactional. Currently safe only because its sole caller is itself transactional — an invisible invariant. Fix it now rather than wait for someone to call it from a controller directly.

✅ Test structure confirmed

DocumentRepositoryTest (@DataJpaTest) — query-count assertions for findById and findAll(Specification, Pageable). Hibernate statistics enabled via spring.jpa.properties.hibernate.generate_statistics: true in src/test/resources/application-test.yaml. Access statistics via autowired Statistics bean — no SessionFactoryImpl casting required.
New @SpringBootTest class — service-level smoke tests for DashboardService.getResume() / buildCaption() and the three searchDocuments() sort paths. Asserts no LazyInitializationException is thrown.

Implementation order (from my earlier review comment) stands: (a) add @NamedEntityGraph to Document.java, (b) add @EntityGraph to both repository overloads, (c) add @BatchSize(50) to collection fields, (d) add @Transactional to getDocumentById(), getRecentActivity(), updateDocumentTags(), (e) switch FetchType.LAZY on all three associations, (f) add @JsonIgnoreProperties to Person + Tag, (g) run full test suite before writing the new tests.

## 👨‍💻 Felix Brandt — Fullstack Developer — Pre-implementation discussion summary Walked through 5 open items before touching code. All resolved. --- ### ✅ DQ-1 — `Document.list` entity graph includes `{sender, tags}` **Decision: Option A.** The list view always renders tag chips, so the join is never wasted. Matches what EAGER gave us today — one load, all data present, no session tricks required. `@BatchSize` fallback is belt-and-suspenders, not the primary strategy. --- ### ✅ `findAll(spec)` no-pageable sort paths also get `@EntityGraph` **Decision: Option A.** Override both `findAll(Specification<T>)` (no-pageable) and `findAll(Specification<T>, Pageable)` in `DocumentRepository` with `@EntityGraph("Document.list")`. The three `searchDocuments()` sort paths (RECEIVER, SENDER, RELEVANCE) that call the no-pageable overload are covered explicitly — not left to `@BatchSize` fallback. --- ### ✅ `@JsonIgnoreProperties` added to `Person.java` and `Tag.java` **Decision: in scope for this PR.** Confirmed neither class currently has `@JsonIgnoreProperties({"hibernateLazyInitializer", "handler"})`. Without it, Jackson may traverse Hibernate proxy internals on serialization after switching to LAZY — either triggering unexpected queries or throwing `HttpMessageConversionException`. One-liner per class, added while we're here. --- ### ✅ `@Transactional` added to `updateDocumentTags()` **Decision: in scope for this PR.** It's a write method with no `@Transactional`. Currently safe only because its sole caller is itself transactional — an invisible invariant. Fix it now rather than wait for someone to call it from a controller directly. --- ### ✅ Test structure confirmed - **`DocumentRepositoryTest` (`@DataJpaTest`)** — query-count assertions for `findById` and `findAll(Specification, Pageable)`. Hibernate statistics enabled via `spring.jpa.properties.hibernate.generate_statistics: true` in `src/test/resources/application-test.yaml`. Access statistics via autowired `Statistics` bean — no `SessionFactoryImpl` casting required. - **New `@SpringBootTest` class** — service-level smoke tests for `DashboardService.getResume()` / `buildCaption()` and the three `searchDocuments()` sort paths. Asserts no `LazyInitializationException` is thrown. --- Implementation order (from my earlier review comment) stands: (a) add `@NamedEntityGraph` to `Document.java`, (b) add `@EntityGraph` to both repository overloads, (c) add `@BatchSize(50)` to collection fields, (d) add `@Transactional` to `getDocumentById()`, `getRecentActivity()`, `updateDocumentTags()`, (e) switch `FetchType.LAZY` on all three associations, (f) add `@JsonIgnoreProperties` to `Person` + `Tag`, (g) run full test suite before writing the new tests.

Sign in to join this conversation.

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: marcel/familienarchiv#467

refactor(document): switch Document.tags + receivers + trainingLabels to LAZY + @EntityGraph #467

Context

Approach

Document.java

Named entity graphs

Repository

LazyInitializationException risk

Bonus: @BatchSize(50) on collections

Critical files

Verification

Acceptance criteria

Effort

Risk if not addressed

👨‍💻 Markus Keller — Application Architect

Observations

Recommendations

Open Decisions

👨‍💻 Felix Brandt — Fullstack Developer

Observations

Recommendations

Open Decisions

👨‍💻 Tobias Wendt — DevOps & Platform Engineer

Observations

Recommendations

Open Decisions

👨‍💻 Elicit — Requirements Engineer

Observations

Recommendations

Open Decisions

👨‍💻 Nora "NullX" Steiner — Security Engineer

Observations

Recommendations

Open Decisions

👨‍💻 Sara Holt — QA Engineer

Observations

Recommendations

Open Decisions

👨‍💻 Leonie Voss — UI/UX Design Lead

Observations

Recommendations

Open Decisions

Decision Queue

DQ-1 — Should Document.list entity graph include tags? (raised by Markus)

👨‍💻 Felix Brandt — Fullstack Developer — Pre-implementation discussion summary

✅ DQ-1 — Document.list entity graph includes {sender, tags}

✅ findAll(spec) no-pageable sort paths also get @EntityGraph

✅ @JsonIgnoreProperties added to Person.java and Tag.java

✅ @Transactional added to updateDocumentTags()

✅ Test structure confirmed

`Document.java`

Bonus: `@BatchSize(50)` on collections

DQ-1 — Should `Document.list` entity graph include `tags`? (raised by Markus)

✅ DQ-1 — `Document.list` entity graph includes `{sender, tags}`

✅ `findAll(spec)` no-pageable sort paths also get `@EntityGraph`

✅ `@JsonIgnoreProperties` added to `Person.java` and `Tag.java`

✅ `@Transactional` added to `updateDocumentTags()`