Files
familienarchiv/docs/adr/022-eager-to-lazy-fetch-strategy.md
Marcel cabcf6a6ca docs(adr): add ADR-022 for EAGER→LAZY fetch strategy with @EntityGraph
Records context (2733 queries/24 requests), the two-graph decision,
@BatchSize fallback, @Transactional(readOnly=true) session-lifetime
requirement, and alternatives considered.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 22:25:38 +02:00

5.1 KiB
Raw Blame History

ADR-022 — EAGER→LAZY Fetch Strategy for Document Collections

Date: 2026-05-18 Status: Accepted Issue: #467 PR: #622


Context

A pre-production query audit of 24 HTTP requests to the document list and detail endpoints produced 2,733 SQL statements — primarily N+1 queries caused by FetchType.EAGER on Document.receivers, Document.tags, Document.trainingLabels, and Document.sender.

With EAGER fetch, every Document loaded by any repository method immediately triggers additional SELECT statements for each associated collection, regardless of whether the caller needs those associations. For a list of 100 documents, this means up to 400 extra queries for receivers alone.


Decision

Switch all four associations to FetchType.LAZY and use a two-tier strategy to load exactly what each code path needs:

Tier 1 — Named entity graphs on Document + @EntityGraph overrides on DocumentRepository:

  • Document.full — loads sender, receivers, tags — used by findById (detail view)
  • Document.list — loads sender, tags — used by findAll(Spec, Pageable), findAll(Spec), and findAll(Pageable) (list/search/dashboard paths)

Each repository method that is called from a hot code path has an @EntityGraph override that declares exactly which associations to JOIN-fetch, collapsing N+1 into 12 queries.

Tier 2 — @BatchSize(50) fallback on all four associations:

For any lazy access path not covered by an entity graph (e.g., a future ad-hoc query or an in-memory sort that touches trainingLabels), Hibernate batches the secondary SELECT to at most one statement per 50 entities instead of one per entity.

Session lifetime for post-return lazy access:

getDocumentById and getRecentActivity return entities to callers that may access lazy associations after the repository call returns. Both methods are annotated @Transactional(readOnly = true) to keep the Hibernate session open until the service method returns, making those post-return accesses safe.

This is an intentional exception to the project convention that read methods are not annotated (see CLAUDE.md §Services). The convention remains correct for all other read methods; this exception applies only to methods that serve lazy-initialized associations to their callers.


Alternatives Considered

@BatchSize-only (no entity graphs)

@BatchSize(50) on all associations would eliminate the worst N+1 cases (100 documents → 2 batch queries instead of 100 individual queries) without requiring repository overrides. Simpler to maintain — no named graph definitions, no per-method overrides.

Rejected because batch loading is best-effort: it depends on what Hibernate happens to find in the first-level cache and produces a variable number of statements. Entity graphs produce a deterministic, verifiable statement count that can be asserted in tests. The query-count test suite (DocumentRepositoryTest) validates the exact statement bounds on every CI run.

Single unified entity graph (Document.full everywhere)

Loading receivers on every list query is wasteful — the document list view only needs sender and tags. receivers is a @ManyToMany collection that, when JOIN-fetched together with tags, forces Hibernate to split into two queries anyway (to avoid Cartesian product). Using a single graph on list paths would load data the UI does not display.

Rejected in favour of two graphs with distinct scopes: Document.list for list paths (sender + tags), Document.full for detail paths (sender + receivers + tags).

@Transactional on the Spring Data repository methods

Spring Data allows @Transactional on repository interfaces directly. This would keep the session open for all calls to those methods without touching the service layer.

Rejected because the transaction boundary belongs at the service layer — repositories should not own transaction lifecycle. The service methods are the natural scope for "keep the session open long enough for the caller to use the result."


Consequences

  • Query count reduced from ~2,733 to ≤10 statements per 24 HTTP requests — verified by DocumentRepositoryTest query-count assertions and DocumentLazyLoadingTest smoke tests.
  • Read methods that return lazily-initialized entities must carry @Transactional(readOnly = true). Any future service method that loads a Document and returns it to a caller that accesses lazy associations must follow this pattern. Removing the annotation causes LazyInitializationException in production.
  • New lazy code paths need an entity graph or @BatchSize review. Any new DocumentRepository method added to a hot code path should be assessed for N+1 risk and given an @EntityGraph override if warranted.
  • @JsonIgnoreProperties({"hibernateLazyInitializer", "handler"}) required on serialized lazy-proxy entities. Person and Tag carry this annotation to prevent Jackson from attempting to serialize Hibernate proxy internals when the association is not initialized. Any new entity that is used as a lazy association and serialized directly (without a DTO) needs the same annotation.