fix(documents): paginate FTS match-set in SQL instead of loading all matching IDs #468

Closed
opened 2026-05-07 17:26:47 +02:00 by marcel · 8 comments
Owner

Context

backend/src/main/java/org/raddatz/familienarchiv/document/DocumentRepository.java::findRankedIdsByFts returns List<UUID> of all matching document IDs for a FTS query. The controller (#315 added pagination) then paginates in memory.

Pre-prod audit ran the query against the live stack:

SELECT d.id FROM documents d
CROSS JOIN LATERAL ( SELECT websearch_to_tsquery('german', :query) ...
WHERE d.search_vector @@ q.pq

  rows_per_call: 911 / call (current dataset: 1,520 documents)

So a search like "walter" returns ~60% of the document table as UUIDs, then the controller slices the first 50.

At current scale this is fine — 911 UUIDs is ~14 KB of data and ms-level DB time. But two scaling issues:

  1. Memory: at 100K documents, a broad query returns ~60K UUIDs ≈ 1 MB per request. Multiply by concurrent users and the JVM heap takes a hit.
  2. Latency: the LATERAL JOIN does work proportional to match-set size; at 100K rows with 60K matches that's 100 ms+ of CPU per query, not ms.

Approach

Push pagination + sort into the SQL. Two sub-queries instead of one:

Query 1: paged ID + total count

WITH q AS (
  SELECT websearch_to_tsquery('german', :query) AS pq
), matches AS (
  SELECT d.id, ts_rank(d.search_vector, q.pq) AS rank
  FROM documents d, q
  WHERE d.search_vector @@ q.pq
)
SELECT id, rank, COUNT(*) OVER () AS total
FROM matches
ORDER BY rank DESC, /* meta_date DESC NULLS LAST */
OFFSET :offset LIMIT :limit;

The window-function COUNT(*) OVER () returns total-match-count alongside the page rows in a single round-trip — no separate count query.

Query 2: detail enrichment for the page

Already in place — DocumentService.searchMatchData runs against the page IDs only. Keep it, but don't pass it the full ID list — pass the 50-ID page.

Repository signature

record FtsPage(List<FtsHit> hits, long total) { }
record FtsHit(UUID id, double rank) { }

@Query(value = """
    WITH q AS (...), matches AS (...)
    SELECT id, rank, COUNT(*) OVER () AS total
    FROM matches
    ORDER BY rank DESC
    OFFSET :offset LIMIT :limit
    """, nativeQuery = true)
List<Object[]> findFtsPageRaw(@Param("query") String query,
                               @Param("offset") int offset,
                               @Param("limit") int limit);

default FtsPage findFtsPage(String query, int page, int size) {
    var rows = findFtsPageRaw(query, page * size, size);
    long total = rows.isEmpty() ? 0 : ((Number) rows.get(0)[2]).longValue();
    var hits = rows.stream().map(r -> new FtsHit((UUID) r[0], ((Number) r[1]).doubleValue())).toList();
    return new FtsPage(hits, total);
}

Service refactor

DocumentService.searchDocuments currently calls findRankedIdsByFts(query) then in-memory paginates. New flow:

  1. findFtsPage(query, page, size) → returns 50 IDs + total.
  2. Existing searchMatchData(50 ids) enrichment unchanged.
  3. Existing buildResult unchanged.
  4. Page metadata (totalElements, totalPages, number, size) populated from the new total.

Sort modes

Most sort modes (RELEVANCE, DATE, TITLE, UPLOAD_DATE) move into the SQL ORDER BY. The two in-memory ones (SENDER, RECEIVER) stay in-memory but now operate on a 50-row page, not a 911-row list.

Critical files

  • backend/src/main/java/org/raddatz/familienarchiv/document/DocumentRepository.java — new findFtsPage + nested types
  • backend/src/main/java/org/raddatz/familienarchiv/document/DocumentService.java::searchDocuments — switch to paged repo
  • backend/src/main/java/org/raddatz/familienarchiv/document/DocumentSearchSpec.java — update Specification for non-FTS filter combination
  • backend/src/test/java/.../document/DocumentSearchPagedIntegrationTest.java — extend with assert on rows-per-call
  • No frontend change — DocumentSearchResult shape stays the same per #315.

Verification

  1. Live pg_stat_statements: hit /api/documents/search?q=walter&size=50&page=0. The old SELECT d.id FROM documents d row should drop from 911 rows/call to 50 rows/call.
  2. Test: 1,520-document fixture, search returning 911 matches, page 0 → 50 results + total=911. Page 18 (911 / 50 ≈ 19 pages) → 11 results + total=911.
  3. Test: empty search match-set returns total=0 + no error.
  4. Smoke: same query at page 0, page 5, page 18 — UI renders consistent total counts.
  5. Sort smoke: sort=DATE,DESC works; sort=SENDER,ASC works (in-memory path on 50 rows).

Acceptance criteria

  • findRankedIdsByFts removed (or kept only for tests).
  • Search SQL uses OFFSET ... LIMIT ... pushed down to Postgres.
  • Total match count returned alongside the page (window function, single query).
  • pg_stat_statements confirms rows_per_call for the FTS query is ≤ page size.
  • No regression in #315 acceptance criteria (filters, year-group rendering, etc.).

Effort

M — 1-2 days. The window-function approach is straightforward; the test surface area is the bigger time cost.

Risk if not addressed

At 100K+ documents, search response time grows linearly with match-set size. First user-visible 5xx is likely a search-induced GC pause or pool starvation.

Tracked in audit doc as F-31 (High) — new dynamic finding from pg_stat_statements analysis.

## Context `backend/src/main/java/org/raddatz/familienarchiv/document/DocumentRepository.java::findRankedIdsByFts` returns `List<UUID>` of **all** matching document IDs for a FTS query. The controller (#315 added pagination) then paginates in memory. Pre-prod audit ran the query against the live stack: ``` SELECT d.id FROM documents d CROSS JOIN LATERAL ( SELECT websearch_to_tsquery('german', :query) ... WHERE d.search_vector @@ q.pq rows_per_call: 911 / call (current dataset: 1,520 documents) ``` So a search like "walter" returns ~60% of the document table as UUIDs, then the controller slices the first 50. At current scale this is fine — 911 UUIDs is ~14 KB of data and ms-level DB time. But two scaling issues: 1. **Memory**: at 100K documents, a broad query returns ~60K UUIDs ≈ 1 MB **per request**. Multiply by concurrent users and the JVM heap takes a hit. 2. **Latency**: the `LATERAL JOIN` does work proportional to match-set size; at 100K rows with 60K matches that's 100 ms+ of CPU per query, not ms. ## Approach Push pagination + sort into the SQL. Two sub-queries instead of one: ### Query 1: paged ID + total count ```sql WITH q AS ( SELECT websearch_to_tsquery('german', :query) AS pq ), matches AS ( SELECT d.id, ts_rank(d.search_vector, q.pq) AS rank FROM documents d, q WHERE d.search_vector @@ q.pq ) SELECT id, rank, COUNT(*) OVER () AS total FROM matches ORDER BY rank DESC, /* meta_date DESC NULLS LAST */ OFFSET :offset LIMIT :limit; ``` The window-function `COUNT(*) OVER ()` returns total-match-count alongside the page rows in a single round-trip — no separate count query. ### Query 2: detail enrichment for the page Already in place — `DocumentService.searchMatchData` runs against the page IDs only. Keep it, but don't pass it the full ID list — pass the 50-ID page. ### Repository signature ```java record FtsPage(List<FtsHit> hits, long total) { } record FtsHit(UUID id, double rank) { } @Query(value = """ WITH q AS (...), matches AS (...) SELECT id, rank, COUNT(*) OVER () AS total FROM matches ORDER BY rank DESC OFFSET :offset LIMIT :limit """, nativeQuery = true) List<Object[]> findFtsPageRaw(@Param("query") String query, @Param("offset") int offset, @Param("limit") int limit); default FtsPage findFtsPage(String query, int page, int size) { var rows = findFtsPageRaw(query, page * size, size); long total = rows.isEmpty() ? 0 : ((Number) rows.get(0)[2]).longValue(); var hits = rows.stream().map(r -> new FtsHit((UUID) r[0], ((Number) r[1]).doubleValue())).toList(); return new FtsPage(hits, total); } ``` ### Service refactor `DocumentService.searchDocuments` currently calls `findRankedIdsByFts(query)` then in-memory paginates. New flow: 1. `findFtsPage(query, page, size)` → returns 50 IDs + total. 2. Existing `searchMatchData(50 ids)` enrichment unchanged. 3. Existing `buildResult` unchanged. 4. Page metadata (`totalElements`, `totalPages`, `number`, `size`) populated from the new `total`. ### Sort modes Most sort modes (`RELEVANCE`, `DATE`, `TITLE`, `UPLOAD_DATE`) move into the SQL `ORDER BY`. The two in-memory ones (`SENDER`, `RECEIVER`) stay in-memory but now operate on a 50-row page, not a 911-row list. ## Critical files - `backend/src/main/java/org/raddatz/familienarchiv/document/DocumentRepository.java` — new `findFtsPage` + nested types - `backend/src/main/java/org/raddatz/familienarchiv/document/DocumentService.java::searchDocuments` — switch to paged repo - `backend/src/main/java/org/raddatz/familienarchiv/document/DocumentSearchSpec.java` — update Specification for non-FTS filter combination - `backend/src/test/java/.../document/DocumentSearchPagedIntegrationTest.java` — extend with assert on rows-per-call - No frontend change — `DocumentSearchResult` shape stays the same per #315. ## Verification 1. **Live `pg_stat_statements`**: hit `/api/documents/search?q=walter&size=50&page=0`. The old `SELECT d.id FROM documents d` row should drop from 911 rows/call to **50 rows/call**. 2. **Test**: 1,520-document fixture, search returning 911 matches, page 0 → 50 results + total=911. Page 18 (911 / 50 ≈ 19 pages) → 11 results + total=911. 3. **Test**: empty search match-set returns total=0 + no error. 4. **Smoke**: same query at page 0, page 5, page 18 — UI renders consistent total counts. 5. **Sort smoke**: sort=DATE,DESC works; sort=SENDER,ASC works (in-memory path on 50 rows). ## Acceptance criteria - [ ] `findRankedIdsByFts` removed (or kept only for tests). - [ ] Search SQL uses `OFFSET ... LIMIT ...` pushed down to Postgres. - [ ] Total match count returned alongside the page (window function, single query). - [ ] `pg_stat_statements` confirms `rows_per_call` for the FTS query is ≤ page size. - [ ] No regression in #315 acceptance criteria (filters, year-group rendering, etc.). ## Effort M — 1-2 days. The window-function approach is straightforward; the test surface area is the bigger time cost. ## Risk if not addressed At 100K+ documents, search response time grows linearly with match-set size. First user-visible 5xx is likely a search-induced GC pause or pool starvation. Tracked in audit doc as **F-31** (High) — new dynamic finding from `pg_stat_statements` analysis.
marcel added the P1-highbug labels 2026-05-07 17:27:05 +02:00
Author
Owner

🏗️ Markus Keller — Application Architect

Observations

The current findRankedIdsByFts query is structurally sound for the domain — it uses the GIN-indexed search_vector column from V34 and the CROSS JOIN LATERAL pattern to avoid function calls per row. The problem is not the query itself but where the pagination boundary sits: in the application service rather than in SQL.

The issue proposes moving to a CTE with a window-function COUNT(*) OVER (). This is the right direction. The window function eliminates the classic "two-query" total-count problem and is well-supported in Postgres 16.

However, two things need a closer look:

  1. The findRankedIdsByFts caller in findIdsForFilter (line 419 of DocumentService) is the bulk-edit "select all" fast path. That method intentionally loads all IDs unpaginated — it is not a bug there, it is the feature. The refactor must not change this method's semantics. The issue description focuses on searchDocuments only, which is correct, but the implementation needs to cleanly separate the two call sites and avoid accidentally pushing pagination into the bulk-edit path.

  2. The RELEVANCE sort path is currently the third in-memory sort (alongside SENDER and RECEIVER). The proposed window-function approach pushes ts_rank sorting into SQL, which eliminates the in-memory RELEVANCE sort entirely. That is a genuine improvement. However, SENDER and RECEIVER still fetch the full match set (they require LEFT JOIN to handle null senders/receivers and cannot be expressed with the simple ORDER BY the fast path uses). The issue acknowledges this correctly — those two stay in-memory. The code comment at line 589–593 documents this design consciously.

  3. The default FtsPage findFtsPage(...) default method on the repository interface is a valid pattern in Spring Data, but mixing record types (FtsHit, FtsPage) and raw Object[] parsing inside a repository default method puts business logic in the wrong layer. Prefer moving the Object[]FtsHit mapping to the service, not the repository.

  4. No ADR is proposed. Pushing pagination into SQL for a native FTS query is a significant enough architectural decision to warrant a short ADR — especially because it constrains the SENDER/RECEIVER sort paths permanently (they must stay in-memory).

Recommendations

  • Keep findRankedIdsByFts in the repository untouched and let findIdsForFilter continue to use it. Add a new findFtsPageRaw query alongside it rather than replacing it, to avoid breaking the bulk-edit caller.
  • Move Object[] → record mapping out of the repository default method and into a private service method (toFtsHit(Object[] row)). Repositories should return data; services should interpret it.
  • Write a short ADR before implementation: context (FTS memory cost at scale), decision (SQL-level pagination via window function), alternatives (separate count query, capped result set with a warning), consequences (SENDER/RECEIVER sorts stay in-memory forever until a LEFT JOIN sort is added). File it in docs/adr/.
  • The DocumentSearchSpec.java reference in the issue body does not match any file in the codebase — the class is DocumentSpecifications.java. Clarify before implementation to avoid confusion.

Open Decisions

  • Where to put FtsHit and FtsPage: placing them as nested types inside DocumentRepository (as the issue shows) couples the repository to value-object semantics. Placing them in the document package root is slightly better. Either is defensible for a codebase this size — pick one and be consistent.
  • Whether to keep findRankedIdsByFts permanently or deprecate it: the method is exercised by 16 test assertions in DocumentFtsTest. If it is kept only for findIdsForFilter, rename it to findAllMatchingIdsByFts to communicate intent. If it is deprecated, the test file needs a new approach for those assertions.
## 🏗️ Markus Keller — Application Architect ### Observations The current `findRankedIdsByFts` query is structurally sound for the domain — it uses the GIN-indexed `search_vector` column from V34 and the `CROSS JOIN LATERAL` pattern to avoid function calls per row. The problem is not the query itself but where the pagination boundary sits: in the application service rather than in SQL. The issue proposes moving to a CTE with a window-function `COUNT(*) OVER ()`. This is the right direction. The window function eliminates the classic "two-query" total-count problem and is well-supported in Postgres 16. However, two things need a closer look: 1. **The `findRankedIdsByFts` caller in `findIdsForFilter`** (line 419 of `DocumentService`) is the bulk-edit "select all" fast path. That method intentionally loads *all* IDs unpaginated — it is not a bug there, it is the feature. The refactor must not change this method's semantics. The issue description focuses on `searchDocuments` only, which is correct, but the implementation needs to cleanly separate the two call sites and avoid accidentally pushing pagination into the bulk-edit path. 2. **The `RELEVANCE` sort path is currently the third in-memory sort** (alongside SENDER and RECEIVER). The proposed window-function approach pushes `ts_rank` sorting into SQL, which eliminates the in-memory RELEVANCE sort entirely. That is a genuine improvement. However, SENDER and RECEIVER still fetch the full match set (they require `LEFT JOIN` to handle null senders/receivers and cannot be expressed with the simple `ORDER BY` the fast path uses). The issue acknowledges this correctly — those two stay in-memory. The code comment at line 589–593 documents this design consciously. 3. **The `default FtsPage findFtsPage(...)` default method on the repository interface** is a valid pattern in Spring Data, but mixing record types (`FtsHit`, `FtsPage`) and raw `Object[]` parsing inside a repository default method puts business logic in the wrong layer. Prefer moving the `Object[]` → `FtsHit` mapping to the service, not the repository. 4. **No ADR is proposed.** Pushing pagination into SQL for a native FTS query is a significant enough architectural decision to warrant a short ADR — especially because it constrains the SENDER/RECEIVER sort paths permanently (they must stay in-memory). ### Recommendations - Keep `findRankedIdsByFts` in the repository untouched and let `findIdsForFilter` continue to use it. Add a new `findFtsPageRaw` query alongside it rather than replacing it, to avoid breaking the bulk-edit caller. - Move `Object[]` → record mapping out of the repository default method and into a private service method (`toFtsHit(Object[] row)`). Repositories should return data; services should interpret it. - Write a short ADR before implementation: context (FTS memory cost at scale), decision (SQL-level pagination via window function), alternatives (separate count query, capped result set with a warning), consequences (SENDER/RECEIVER sorts stay in-memory forever until a LEFT JOIN sort is added). File it in `docs/adr/`. - The `DocumentSearchSpec.java` reference in the issue body does not match any file in the codebase — the class is `DocumentSpecifications.java`. Clarify before implementation to avoid confusion. ### Open Decisions - **Where to put `FtsHit` and `FtsPage`**: placing them as nested types inside `DocumentRepository` (as the issue shows) couples the repository to value-object semantics. Placing them in the `document` package root is slightly better. Either is defensible for a codebase this size — pick one and be consistent. - **Whether to keep `findRankedIdsByFts` permanently or deprecate it**: the method is exercised by 16 test assertions in `DocumentFtsTest`. If it is kept only for `findIdsForFilter`, rename it to `findAllMatchingIdsByFts` to communicate intent. If it is deprecated, the test file needs a new approach for those assertions.
Author
Owner

👨‍💻 Felix Brandt — Senior Fullstack Developer

Observations

I read the current searchDocuments method carefully (lines 577–619 of DocumentService). Three separate in-memory paths exist today:

  • RECEIVER (line 594): fetches all + in-memory sort + pageSlice
  • SENDER (line 598): same
  • RELEVANCE (line 604): fetches all, builds a rankMap, sorts, then pageSlice

All three call documentRepository.findAll(spec) with no Pageable, loading the full entity set. The fast path at line 617 is the only one that pushes Pageable into the DB. The issue correctly identifies that RELEVANCE should join the fast path after this refactor.

The proposed findFtsPageRaw returns List<Object[]>. Parsing Object[] by column index is fragile — a future column reorder or addition silently breaks the mapping with no compile-time signal. The default FtsPage findFtsPage(...) approach in the issue would contain the parsing, but it still buries an index-brittle cast chain inside the repository.

The searchDocuments method signature is already long (11 parameters). The refactor should not make it longer.

The DocumentSearchResult.of(List.of()) early-return pattern at line 583 (empty FTS result) must be preserved exactly — the new findFtsPage must return FtsPage(List.of(), 0) for an empty match set so the caller short-circuit still works cleanly.

One clean-code concern: the issue's proposed CTE comment says /* meta_date DESC NULLS LAST */ as a commented-out secondary sort. If it is intended for production, it should be uncommented. If it is a note for the implementation phase, remove it before committing — commented-out code is not documentation.

Recommendations

  • Define the Object[] column indexes as named constants at the top of the parsing method, not inline casts:
    private static final int COL_ID = 0;
    private static final int COL_RANK = 1;
    private static final int COL_TOTAL = 2;
    
    This is the minimum to make column-index access auditable.
  • Keep the Object[]FtsHit parsing in DocumentService.toFtsPage(List<Object[]>), not in the repository. The repository's job ends at returning rows.
  • The findIdsForFilter method (bulk-edit fast path, line 413) must NOT be touched — it intentionally loads all IDs. Guard against this with a comment on the new findFtsPageRaw method: // Paged path only — use findRankedIdsByFts for unpaged bulk-edit.
  • The RELEVANCE sort branch (lines 604–613) should be deleted entirely after the refactor, since ts_rank DESC in the SQL now handles it. Delete it, don't comment it out.
  • The in-memory SENDER and RECEIVER sort branches (lines 594–600) are now operating on a 50-row page instead of a 911-row list. Add a one-line comment to each branch noting that: // In-memory sort on page slice (≤ page size rows) — acceptable.
## 👨‍💻 Felix Brandt — Senior Fullstack Developer ### Observations I read the current `searchDocuments` method carefully (lines 577–619 of `DocumentService`). Three separate in-memory paths exist today: - **RECEIVER** (line 594): fetches all + in-memory sort + `pageSlice` - **SENDER** (line 598): same - **RELEVANCE** (line 604): fetches all, builds a `rankMap`, sorts, then `pageSlice` All three call `documentRepository.findAll(spec)` with no `Pageable`, loading the full entity set. The fast path at line 617 is the only one that pushes `Pageable` into the DB. The issue correctly identifies that RELEVANCE should join the fast path after this refactor. The proposed `findFtsPageRaw` returns `List<Object[]>`. Parsing `Object[]` by column index is fragile — a future column reorder or addition silently breaks the mapping with no compile-time signal. The `default FtsPage findFtsPage(...)` approach in the issue would contain the parsing, but it still buries an index-brittle cast chain inside the repository. The `searchDocuments` method signature is already long (11 parameters). The refactor should not make it longer. The `DocumentSearchResult.of(List.of())` early-return pattern at line 583 (empty FTS result) must be preserved exactly — the new `findFtsPage` must return `FtsPage(List.of(), 0)` for an empty match set so the caller short-circuit still works cleanly. One clean-code concern: the issue's proposed CTE comment says `/* meta_date DESC NULLS LAST */` as a commented-out secondary sort. If it is intended for production, it should be uncommented. If it is a note for the implementation phase, remove it before committing — commented-out code is not documentation. ### Recommendations - Define the `Object[]` column indexes as named constants at the top of the parsing method, not inline casts: ```java private static final int COL_ID = 0; private static final int COL_RANK = 1; private static final int COL_TOTAL = 2; ``` This is the minimum to make column-index access auditable. - Keep the `Object[]` → `FtsHit` parsing in `DocumentService.toFtsPage(List<Object[]>)`, not in the repository. The repository's job ends at returning rows. - The `findIdsForFilter` method (bulk-edit fast path, line 413) must NOT be touched — it intentionally loads all IDs. Guard against this with a comment on the new `findFtsPageRaw` method: `// Paged path only — use findRankedIdsByFts for unpaged bulk-edit`. - The `RELEVANCE` sort branch (lines 604–613) should be deleted entirely after the refactor, since `ts_rank DESC` in the SQL now handles it. Delete it, don't comment it out. - The in-memory `SENDER` and `RECEIVER` sort branches (lines 594–600) are now operating on a 50-row page instead of a 911-row list. Add a one-line comment to each branch noting that: `// In-memory sort on page slice (≤ page size rows) — acceptable`.
Author
Owner

🚀 Tobias Wendt — DevOps & Platform Engineer

Observations

The issue's verification step says: "hit /api/documents/search?q=walter&size=50&page=0. The old SELECT d.id FROM documents d row should drop from 911 rows/call to 50 rows/call." That requires pg_stat_statements to be enabled in the running stack. Looking at the current docker-compose.yml and the audit finding F-24, pg_stat_statements is NOT enabled by default — the db service has no command: postgres -c shared_preload_libraries=pg_stat_statements ... override.

The audit (F-24) already tracked this as a Medium finding. This verification step in issue #468 therefore cannot be executed against the live stack today without first enabling pg_stat_statements. That's a prerequisite that should be noted explicitly in the acceptance criteria.

The Testcontainers integration tests (PostgreSQL 16-alpine) run with a stock Postgres config and will not have pg_stat_statements available either unless the container is started with --command overrides. The issue proposes a "rows_per_call" assertion in DocumentSearchPagedIntegrationTest — that assertion cannot be made via pg_stat_statements in Testcontainers unless the test explicitly enables it via SQL (CREATE EXTENSION IF NOT EXISTS pg_stat_statements).

The DirtiesContext(AFTER_EACH_TEST_METHOD) annotation in the existing DocumentSearchPagedIntegrationTest is expensive — it rebuilds the Spring context after every test. That's inherited from the PR #316 setup. Adding 3–4 more large-fixture tests to this class will noticeably slow CI.

Recommendations

  • Before implementing the verification step, add the following to the db service in docker-compose.yml:
    db:
      command: >
        postgres
          -c shared_preload_libraries=pg_stat_statements
          -c pg_stat_statements.track=all
          -c log_min_duration_statement=200
    
    This unblocks both F-24 and the verification step in this issue in one shot.
  • For the integration test "assert rows_per_call" requirement: use EXPLAIN (ANALYZE, FORMAT JSON) instead of pg_stat_statements in Testcontainers. It does not require the extension and gives exact row counts from the query plan:
    String plan = (String) em.createNativeQuery(
        "EXPLAIN (ANALYZE, FORMAT JSON) SELECT ...").getSingleResult();
    // Parse JSON, assert "Actual Rows" <= pageSize
    
    Alternatively, simply assert that result.items().size() == pageSize and result.totalElements() == 911 — this is behavioral proof without requiring query-level instrumentation.
  • Consider splitting the new fixture tests (1,520-document fixture) into a separate DocumentFtsPagedIntegrationTest class and annotating it @DirtiesContext(classMode = AFTER_CLASS) rather than AFTER_EACH_TEST_METHOD, so the Spring context is rebuilt once per class, not after every test. This alone cuts CI overhead significantly for large Testcontainers tests.

Open Decisions

  • Should pg_stat_statements be enabled in the production Compose file only (to avoid local-dev slowness), or in all environments? The extension adds ~5% query overhead. For a project at this scale it is negligible, but the choice has operational implications for the docker-compose.yml base file vs. a docker-compose.prod.yml overlay.
## 🚀 Tobias Wendt — DevOps & Platform Engineer ### Observations The issue's verification step says: _"hit `/api/documents/search?q=walter&size=50&page=0`. The old `SELECT d.id FROM documents d` row should drop from 911 rows/call to **50 rows/call**."_ That requires `pg_stat_statements` to be enabled in the running stack. Looking at the current `docker-compose.yml` and the audit finding F-24, `pg_stat_statements` is NOT enabled by default — the `db` service has no `command: postgres -c shared_preload_libraries=pg_stat_statements ...` override. The audit (F-24) already tracked this as a Medium finding. This verification step in issue #468 therefore cannot be executed against the live stack today without first enabling `pg_stat_statements`. That's a prerequisite that should be noted explicitly in the acceptance criteria. The Testcontainers integration tests (PostgreSQL 16-alpine) run with a stock Postgres config and will not have `pg_stat_statements` available either unless the container is started with `--command` overrides. The issue proposes a "rows_per_call" assertion in `DocumentSearchPagedIntegrationTest` — that assertion cannot be made via `pg_stat_statements` in Testcontainers unless the test explicitly enables it via SQL (`CREATE EXTENSION IF NOT EXISTS pg_stat_statements`). The `DirtiesContext(AFTER_EACH_TEST_METHOD)` annotation in the existing `DocumentSearchPagedIntegrationTest` is expensive — it rebuilds the Spring context after every test. That's inherited from the PR #316 setup. Adding 3–4 more large-fixture tests to this class will noticeably slow CI. ### Recommendations - Before implementing the verification step, add the following to the `db` service in `docker-compose.yml`: ```yaml db: command: > postgres -c shared_preload_libraries=pg_stat_statements -c pg_stat_statements.track=all -c log_min_duration_statement=200 ``` This unblocks both F-24 and the verification step in this issue in one shot. - For the integration test "assert rows_per_call" requirement: use `EXPLAIN (ANALYZE, FORMAT JSON)` instead of `pg_stat_statements` in Testcontainers. It does not require the extension and gives exact row counts from the query plan: ```java String plan = (String) em.createNativeQuery( "EXPLAIN (ANALYZE, FORMAT JSON) SELECT ...").getSingleResult(); // Parse JSON, assert "Actual Rows" <= pageSize ``` Alternatively, simply assert that `result.items().size() == pageSize` and `result.totalElements() == 911` — this is behavioral proof without requiring query-level instrumentation. - Consider splitting the new fixture tests (1,520-document fixture) into a separate `DocumentFtsPagedIntegrationTest` class and annotating it `@DirtiesContext(classMode = AFTER_CLASS)` rather than `AFTER_EACH_TEST_METHOD`, so the Spring context is rebuilt once per class, not after every test. This alone cuts CI overhead significantly for large Testcontainers tests. ### Open Decisions - Should `pg_stat_statements` be enabled in the production Compose file only (to avoid local-dev slowness), or in all environments? The extension adds ~5% query overhead. For a project at this scale it is negligible, but the choice has operational implications for the `docker-compose.yml` base file vs. a `docker-compose.prod.yml` overlay.
Author
Owner

📋 Elicit — Requirements Engineer

Observations

The issue is well-specified for a performance bug: it gives measured baseline data (911 rows/call at 1,520 documents), a concrete scaling model (linear growth at 100K), a proposed solution, and five acceptance criteria. The effort estimate (M, 1–2 days) is reasonable given the surface area described.

However, two requirements gaps need attention:

Gap 1: The findIdsForFilter path is unspecified.
DocumentService.findIdsForFilter (line 413) also calls findRankedIdsByFts and intentionally returns all matching IDs unpaginated — that is the contract for the bulk-edit "select all" feature. The issue's acceptance criterion "findRankedIdsByFts removed" would break this path. The requirement needs to explicitly state what happens to findIdsForFilter: does it keep calling findRankedIdsByFts, does it get a different new query, or does it call findFtsPageRaw repeatedly? This is not a minor detail — it is a functional regression risk.

Gap 2: The empty-query behavior is under-specified.
The issue says "empty search match-set returns total=0 + no error" but the current CROSS JOIN LATERAL approach handles a subtler case: when websearch_to_tsquery('german', :query) returns an empty tsquery (e.g., for stopword-only input like "der die das"), the lateral subquery yields NULL for pq, and the WHERE d.search_vector @@ q.pq clause correctly returns zero rows. The new CTE must preserve this behavior. The acceptance criterion should be: "a query consisting entirely of German stopwords returns total=0 without an error."

Gap 3: The sort=RELEVANCE path change is implicit.
The current RELEVANCE sort fetches all matching documents and sorts them in-memory by their position in rankedIds. After this refactor, ts_rank DESC in the SQL replaces this. This is a behavioral change: the current in-memory sort uses the full ranked list from a separate query (which includes prefix-match wildcards via the LATERAL hack); the new SQL sort ranks only the OFFSET/LIMIT subset. For pages beyond page 0, this means rank ordering is computed within the page, not globally. This is actually correct behavior (consistent with how pagination + relevance works everywhere), but it should be called out as a deliberate behavior change, not an invisible side effect.

Recommendations

  • Add an explicit acceptance criterion: "The findIdsForFilter method (bulk-edit fast path) is unaffected — it continues to return all matching IDs without pagination."
  • Add a Gherkin-style acceptance criterion for the stopword case:

    Given a search query of "der die das" (German stopwords only)
    When a search is performed
    Then totalElements is 0 and no exception is thrown

  • Add a brief note in the issue body acknowledging the RELEVANCE sort behavioral change: global rank order (current) → per-page rank order (new). This is an acceptable trade-off at scale, but it should be a named decision, not an accidental consequence.
  • The file reference DocumentSearchSpec.java in the "Critical files" section does not exist — the actual file is DocumentSpecifications.java. Fix this before implementation to avoid confusion.
## 📋 Elicit — Requirements Engineer ### Observations The issue is well-specified for a performance bug: it gives measured baseline data (911 rows/call at 1,520 documents), a concrete scaling model (linear growth at 100K), a proposed solution, and five acceptance criteria. The effort estimate (M, 1–2 days) is reasonable given the surface area described. However, two requirements gaps need attention: **Gap 1: The `findIdsForFilter` path is unspecified.** `DocumentService.findIdsForFilter` (line 413) also calls `findRankedIdsByFts` and intentionally returns all matching IDs unpaginated — that is the contract for the bulk-edit "select all" feature. The issue's acceptance criterion "findRankedIdsByFts removed" would break this path. The requirement needs to explicitly state what happens to `findIdsForFilter`: does it keep calling `findRankedIdsByFts`, does it get a different new query, or does it call `findFtsPageRaw` repeatedly? This is not a minor detail — it is a functional regression risk. **Gap 2: The empty-query behavior is under-specified.** The issue says "empty search match-set returns total=0 + no error" but the current `CROSS JOIN LATERAL` approach handles a subtler case: when `websearch_to_tsquery('german', :query)` returns an empty tsquery (e.g., for stopword-only input like "der die das"), the lateral subquery yields `NULL` for `pq`, and the `WHERE d.search_vector @@ q.pq` clause correctly returns zero rows. The new CTE must preserve this behavior. The acceptance criterion should be: "a query consisting entirely of German stopwords returns total=0 without an error." **Gap 3: The `sort=RELEVANCE` path change is implicit.** The current RELEVANCE sort fetches all matching documents and sorts them in-memory by their position in `rankedIds`. After this refactor, `ts_rank DESC` in the SQL replaces this. This is a **behavioral change**: the current in-memory sort uses the full ranked list from a separate query (which includes prefix-match wildcards via the LATERAL hack); the new SQL sort ranks only the OFFSET/LIMIT subset. For pages beyond page 0, this means rank ordering is computed within the page, not globally. This is actually correct behavior (consistent with how pagination + relevance works everywhere), but it should be called out as a deliberate behavior change, not an invisible side effect. ### Recommendations - Add an explicit acceptance criterion: "The `findIdsForFilter` method (bulk-edit fast path) is unaffected — it continues to return all matching IDs without pagination." - Add a Gherkin-style acceptance criterion for the stopword case: > **Given** a search query of "der die das" (German stopwords only) > **When** a search is performed > **Then** `totalElements` is 0 and no exception is thrown - Add a brief note in the issue body acknowledging the RELEVANCE sort behavioral change: global rank order (current) → per-page rank order (new). This is an acceptable trade-off at scale, but it should be a named decision, not an accidental consequence. - The file reference `DocumentSearchSpec.java` in the "Critical files" section does not exist — the actual file is `DocumentSpecifications.java`. Fix this before implementation to avoid confusion.
Author
Owner

🔒 Nora "NullX" Steiner — Security Engineer

Observations

The FTS query uses websearch_to_tsquery('german', :query) with a named JPA parameter :query. This is correctly parameterized — no SQL injection risk from the query string. The LATERAL CTE and window function approach in the proposed query preserves this parameterization. No new injection surface is introduced.

One security-adjacent concern: the current findRankedIdsByFts returns a raw List<UUID>, which is then fed as IN :ids to findEnrichmentData. The IN clause with named parameters in JPA is safe — Hibernate generates a parameterized IN (?, ?, ...) rather than string interpolation. The new findFtsPageRaw approach bypasses this IN clause for the ID list by directly returning the page; findEnrichmentData is then called with only the 50-ID page. This is equally safe and actually reduces the IN clause size from potentially 911 items to 50, which also avoids the Postgres IN clause overhead at scale.

The Object[] row mapping proposed in the issue casts column 0 to UUID. In Postgres + Hibernate native queries, UUID columns may return as java.util.UUID or as String depending on the JDBC driver and column type. If the mapping is (UUID) r[0] and the driver returns a String, this throws a ClassCastException at runtime — not a security issue but a reliability one that could surface as a 500 in production while silently exposing an error trace. Use UUID.fromString(r[0].toString()) defensively.

No @RequirePermission change is involved — the search endpoint is read-only and the permission model is unchanged. The refactor is scope-limited to the repository/service internals.

Recommendations

  • Cast UUID safely in the Object[] mapping:
    UUID id = r[0] instanceof UUID u ? u : UUID.fromString(r[0].toString());
    
    Pattern matching on the type is idiomatic Java 21 and guards against driver-level type variance.
  • Add a test for the UUID cast path with a real Postgres query to verify the JDBC driver returns java.util.UUID (not String) — this has surprised teams before and is worth confirming once at the integration test layer.
  • The ts_rank column (column 1) is a float8 in Postgres. Cast as ((Number) r[1]).doubleValue() (as the issue shows) is correct — Number is the safe supertype for both Float and Double driver returns.
## 🔒 Nora "NullX" Steiner — Security Engineer ### Observations The FTS query uses `websearch_to_tsquery('german', :query)` with a named JPA parameter `:query`. This is correctly parameterized — no SQL injection risk from the query string. The `LATERAL` CTE and window function approach in the proposed query preserves this parameterization. No new injection surface is introduced. One security-adjacent concern: the current `findRankedIdsByFts` returns a raw `List<UUID>`, which is then fed as `IN :ids` to `findEnrichmentData`. The `IN` clause with named parameters in JPA is safe — Hibernate generates a parameterized `IN (?, ?, ...)` rather than string interpolation. The new `findFtsPageRaw` approach bypasses this `IN` clause for the ID list by directly returning the page; `findEnrichmentData` is then called with only the 50-ID page. This is equally safe and actually reduces the `IN` clause size from potentially 911 items to 50, which also avoids the Postgres `IN` clause overhead at scale. The `Object[]` row mapping proposed in the issue casts column 0 to `UUID`. In Postgres + Hibernate native queries, UUID columns may return as `java.util.UUID` or as `String` depending on the JDBC driver and column type. If the mapping is `(UUID) r[0]` and the driver returns a `String`, this throws a `ClassCastException` at runtime — not a security issue but a reliability one that could surface as a 500 in production while silently exposing an error trace. Use `UUID.fromString(r[0].toString())` defensively. No `@RequirePermission` change is involved — the search endpoint is read-only and the permission model is unchanged. The refactor is scope-limited to the repository/service internals. ### Recommendations - Cast UUID safely in the `Object[]` mapping: ```java UUID id = r[0] instanceof UUID u ? u : UUID.fromString(r[0].toString()); ``` Pattern matching on the type is idiomatic Java 21 and guards against driver-level type variance. - Add a test for the UUID cast path with a real Postgres query to verify the JDBC driver returns `java.util.UUID` (not `String`) — this has surprised teams before and is worth confirming once at the integration test layer. - The `ts_rank` column (column 1) is a `float8` in Postgres. Cast as `((Number) r[1]).doubleValue()` (as the issue shows) is correct — `Number` is the safe supertype for both `Float` and `Double` driver returns.
Author
Owner

🧪 Sara Holt — QA Engineer & Test Strategist

Observations

Test coverage for the current FTS path is extensive: DocumentFtsTest (16 tests against real Postgres via @DataJpaTest + Testcontainers) covers findRankedIdsByFts directly. DocumentSearchPagedIntegrationTest (5 tests) covers the service-level slice/total arithmetic. DocumentServiceSortTest and DocumentServiceTest cover the service layer via Mockito mocks that stub findRankedIdsByFts.

The refactor touches three distinct layers, each requiring test updates:

Layer 1 — Repository (findFtsPageRaw): The existing DocumentFtsTest stubs findRankedIdsByFts. Once the new findFtsPageRaw exists alongside it, the test class needs new test methods asserting that:

  • findFtsPageRaw returns ≤ pageSize rows
  • total in the first row equals the full match count, not the page size
  • Empty match set returns an empty list (not a null, not a row with total=0)

Layer 2 — Service unit tests (DocumentServiceSortTest, DocumentServiceTest): Every when(documentRepository.findRankedIdsByFts(...)) stub will need a companion when(documentRepository.findFtsPageRaw(...)) stub for the RELEVANCE path. If findRankedIdsByFts is kept for the bulk-edit path but findFtsPageRaw handles the search path, the mocks need to stay aligned. There are at least 5 places in DocumentServiceTest and 3 in DocumentServiceSortTest that will break.

Layer 3 — Integration (DocumentSearchPagedIntegrationTest): The issue requests adding an assertion for "rows-per-call ≤ page size". As Tobias notes, this cannot be done via pg_stat_statements in Testcontainers without enabling the extension. Use EXPLAIN ANALYZE in a dedicated test or accept behavioral assertion (result size == page size) as sufficient proof. The existing 5 tests in this class do not exercise FTS at all — they use null text queries. New tests need to seed a large fixture and use a real FTS query to prove the paged behavior.

The DirtiesContext(AFTER_EACH_TEST_METHOD) in DocumentSearchPagedIntegrationTest is a performance smell. With 120-document fixture seeding in @BeforeEach and context rebuild after each test, this class is already slower than necessary. Adding a 1,520-document FTS fixture would make it significantly worse. Recommend splitting into a separate class with @DirtiesContext(AFTER_CLASS).

Recommendations

  • Write the integration tests in a new class DocumentFtsPagedIntegrationTest with:
    • @DirtiesContext(classMode = AFTER_CLASS) (not AFTER_EACH)
    • A 1,520-document fixture seeded once in @BeforeAll (use @TestInstance(PER_CLASS))
    • At minimum these test cases:
      1. ftsPage_firstPage_returns50Items_and_totalEquals911()
      2. ftsPage_lastPage_returnsRemainder_andSameTotalEquals911()
      3. ftsPage_emptyMatchSet_returnsZeroTotal_noException()
      4. ftsPage_stopwordOnlyQuery_returnsZeroTotal_noException()
      5. ftsPage_pageSize50_resultSizeIsAtMost50()
  • Update DocumentServiceSortTest to add a test proving RELEVANCE sort now uses findFtsPageRaw, not findRankedIdsByFts — mock both, assert only findFtsPageRaw is called on the RELEVANCE path.
  • Red/green order: write ftsPage_firstPage_returns50Items_and_totalEquals911() first, verify it fails with the old implementation (returns 911 items), then implement findFtsPageRaw.

Open Decisions

  • Do the DocumentFtsTest assertions against findRankedIdsByFts still own the linguistic correctness (stemming, wildcard, stopword) tests after the refactor? Yes — those tests verify query semantics, not pagination. They should stay as-is regardless of whether findRankedIdsByFts is retained or deprecated. If deprecated, migrate the assertions to findFtsPageRaw with total > 0 assertions instead of checking the ID list directly.
## 🧪 Sara Holt — QA Engineer & Test Strategist ### Observations Test coverage for the current FTS path is extensive: `DocumentFtsTest` (16 tests against real Postgres via `@DataJpaTest` + Testcontainers) covers `findRankedIdsByFts` directly. `DocumentSearchPagedIntegrationTest` (5 tests) covers the service-level slice/total arithmetic. `DocumentServiceSortTest` and `DocumentServiceTest` cover the service layer via Mockito mocks that stub `findRankedIdsByFts`. The refactor touches three distinct layers, each requiring test updates: **Layer 1 — Repository (`findFtsPageRaw`):** The existing `DocumentFtsTest` stubs `findRankedIdsByFts`. Once the new `findFtsPageRaw` exists alongside it, the test class needs new test methods asserting that: - `findFtsPageRaw` returns ≤ `pageSize` rows - `total` in the first row equals the full match count, not the page size - Empty match set returns an empty list (not a null, not a row with total=0) **Layer 2 — Service unit tests (`DocumentServiceSortTest`, `DocumentServiceTest`):** Every `when(documentRepository.findRankedIdsByFts(...))` stub will need a companion `when(documentRepository.findFtsPageRaw(...))` stub for the RELEVANCE path. If `findRankedIdsByFts` is kept for the bulk-edit path but `findFtsPageRaw` handles the search path, the mocks need to stay aligned. There are at least 5 places in `DocumentServiceTest` and 3 in `DocumentServiceSortTest` that will break. **Layer 3 — Integration (`DocumentSearchPagedIntegrationTest`):** The issue requests adding an assertion for "rows-per-call ≤ page size". As Tobias notes, this cannot be done via `pg_stat_statements` in Testcontainers without enabling the extension. Use `EXPLAIN ANALYZE` in a dedicated test or accept behavioral assertion (result size == page size) as sufficient proof. The existing 5 tests in this class do not exercise FTS at all — they use `null` text queries. New tests need to seed a large fixture and use a real FTS query to prove the paged behavior. **The `DirtiesContext(AFTER_EACH_TEST_METHOD)` in `DocumentSearchPagedIntegrationTest`** is a performance smell. With 120-document fixture seeding in `@BeforeEach` and context rebuild after each test, this class is already slower than necessary. Adding a 1,520-document FTS fixture would make it significantly worse. Recommend splitting into a separate class with `@DirtiesContext(AFTER_CLASS)`. ### Recommendations - Write the integration tests in a new class `DocumentFtsPagedIntegrationTest` with: - `@DirtiesContext(classMode = AFTER_CLASS)` (not AFTER_EACH) - A 1,520-document fixture seeded once in `@BeforeAll` (use `@TestInstance(PER_CLASS)`) - At minimum these test cases: 1. `ftsPage_firstPage_returns50Items_and_totalEquals911()` 2. `ftsPage_lastPage_returnsRemainder_andSameTotalEquals911()` 3. `ftsPage_emptyMatchSet_returnsZeroTotal_noException()` 4. `ftsPage_stopwordOnlyQuery_returnsZeroTotal_noException()` 5. `ftsPage_pageSize50_resultSizeIsAtMost50()` - Update `DocumentServiceSortTest` to add a test proving RELEVANCE sort now uses `findFtsPageRaw`, not `findRankedIdsByFts` — mock both, assert only `findFtsPageRaw` is called on the RELEVANCE path. - Red/green order: write `ftsPage_firstPage_returns50Items_and_totalEquals911()` first, verify it fails with the old implementation (returns 911 items), then implement `findFtsPageRaw`. ### Open Decisions - **Do the `DocumentFtsTest` assertions against `findRankedIdsByFts` still own the linguistic correctness (stemming, wildcard, stopword) tests after the refactor?** Yes — those tests verify query semantics, not pagination. They should stay as-is regardless of whether `findRankedIdsByFts` is retained or deprecated. If deprecated, migrate the assertions to `findFtsPageRaw` with `total > 0` assertions instead of checking the ID list directly.
Author
Owner

🎨 Leonie Voss — UX Design Lead & Accessibility Strategist

Observations

This is a pure backend/database performance refactor with no planned frontend changes. The issue explicitly states: "No frontend change — DocumentSearchResult shape stays the same per #315." From a UX perspective, this is the right scope decision.

That said, there are two user-visible behaviors worth verifying:

1. Pagination total count accuracy: The current in-memory path returns sorted.size() as totalElements (from the full fetched list). The new window-function approach returns COUNT(*) OVER () from Postgres. These should agree for every filter combination, but there is a subtle case: if the user searches with both a text query AND filters (date range, sender, tag), the CTE computes COUNT(*) OVER () after the WHERE d.search_vector @@ q.pq but before any JPA Specification filters (sender, date, tag) are applied. The Specification filters happen at the findAll(spec, pageRequest) call, not inside the native SQL. If the total displayed to the user comes from the window function alone, it would show the FTS match count, not the combined filter + FTS match count. This could cause the UI to show "911 results" while only 23 are visible after applying a date filter — confusing for the user.

2. Sort mode behavior for RELEVANCE on page > 0: As Elicit noted, relevance is now ranked within the page rather than globally. A user jumping to page 5 of a relevance-sorted search will see results ranked by local page relevance, which may not feel continuous with page 4. At current data volume (1,520 docs), users rarely reach page 5 for a broad query, but it is worth documenting.

Recommendations

  • Verify in a smoke test that totalElements in the API response matches the actual number of results a user would receive across all pages when both text and filters are combined. Specifically, check: GET /api/documents/search?q=walter&sender=<uuid>&size=50&page=0 — does totalElements match the count of documents that match both the FTS query AND the sender filter?
  • If the window-function COUNT(*) OVER () is computed inside the native SQL (before JPA Specification filters are applied), the fix is to keep the Page.getTotalElements() from the JPA findAll(spec, pageRequest) call for filtered queries, and only use the window-function total for pure FTS queries. The issue's proposed flow conflates these two; make the handling explicit in the implementation.
  • No visual changes are needed. The DocumentSearchResult record's field types and names are stable — the TypeScript consumer will see no difference.

Open Decisions

  • When BOTH text and filter parameters are present, which total count takes precedence — the SQL window function (FTS count) or the JPA Specification count (filtered count)? This is a product decision: should "911 results" mean "911 docs match your search term (before your filters)" or "23 docs match your search term AND your filters"? The answer should match what is displayed in the UI result count chip. The current in-memory implementation returns the filtered+FTS count, which is what users expect. Preserve that semantics.
## 🎨 Leonie Voss — UX Design Lead & Accessibility Strategist ### Observations This is a pure backend/database performance refactor with no planned frontend changes. The issue explicitly states: "No frontend change — `DocumentSearchResult` shape stays the same per #315." From a UX perspective, this is the right scope decision. That said, there are two user-visible behaviors worth verifying: **1. Pagination total count accuracy:** The current in-memory path returns `sorted.size()` as `totalElements` (from the full fetched list). The new window-function approach returns `COUNT(*) OVER ()` from Postgres. These should agree for every filter combination, but there is a subtle case: if the user searches with both a text query AND filters (date range, sender, tag), the CTE computes `COUNT(*) OVER ()` **after** the `WHERE d.search_vector @@ q.pq` but **before** any JPA Specification filters (sender, date, tag) are applied. The Specification filters happen at the `findAll(spec, pageRequest)` call, not inside the native SQL. If the total displayed to the user comes from the window function alone, it would show the FTS match count, not the combined filter + FTS match count. This could cause the UI to show "911 results" while only 23 are visible after applying a date filter — confusing for the user. **2. Sort mode behavior for RELEVANCE on page > 0:** As Elicit noted, relevance is now ranked within the page rather than globally. A user jumping to page 5 of a relevance-sorted search will see results ranked by local page relevance, which may not feel continuous with page 4. At current data volume (1,520 docs), users rarely reach page 5 for a broad query, but it is worth documenting. ### Recommendations - Verify in a smoke test that `totalElements` in the API response matches the actual number of results a user would receive across all pages when both text and filters are combined. Specifically, check: `GET /api/documents/search?q=walter&sender=<uuid>&size=50&page=0` — does `totalElements` match the count of documents that match both the FTS query AND the sender filter? - If the window-function `COUNT(*) OVER ()` is computed inside the native SQL (before JPA Specification filters are applied), the fix is to keep the `Page.getTotalElements()` from the JPA `findAll(spec, pageRequest)` call for filtered queries, and only use the window-function total for pure FTS queries. The issue's proposed flow conflates these two; make the handling explicit in the implementation. - No visual changes are needed. The `DocumentSearchResult` record's field types and names are stable — the TypeScript consumer will see no difference. ### Open Decisions - **When BOTH text and filter parameters are present, which total count takes precedence — the SQL window function (FTS count) or the JPA Specification count (filtered count)?** This is a product decision: should "911 results" mean "911 docs match your search term (before your filters)" or "23 docs match your search term AND your filters"? The answer should match what is displayed in the UI result count chip. The current in-memory implementation returns the filtered+FTS count, which is what users expect. Preserve that semantics.
Author
Owner

Decision Queue — Open Items for Human Judgment

Grouped from the persona reviews above. Each item is a genuine tradeoff needing a decision before or during implementation.


Theme 1: Scope of findRankedIdsByFts after the refactor

From: Markus, Felix, Elicit, Sara

findRankedIdsByFts has two callers:

  • searchDocuments — the paged search path (target of this fix)
  • findIdsForFilter — the bulk-edit "select all" path (intentionally unpaginated, must not change)

The acceptance criterion "findRankedIdsByFts removed" in the issue would break the bulk-edit path.

Decision needed: Keep findRankedIdsByFts for findIdsForFilter (rename to findAllMatchingIdsByFts for clarity) and add findFtsPageRaw as a second query? Or replace both with a unified query and rebuild the bulk-edit path to iterate pages? The simplest answer is keep+rename.


Theme 2: Where the totalElements count comes from when filters AND text are combined

From: Leonie, Elicit

The proposed window function COUNT(*) OVER () runs inside the native SQL — it counts FTS matches only, before JPA Specification filters (date range, sender, tag) are applied. If a user searches q=walter&sender=<uuid>, the window function returns 911 (all Walter matches) but the JPA spec filters reduce the visible results to, say, 23.

The UI result count chip would show "911 results" while only 23 appear — a serious UX inconsistency.

Decision needed: For combined text+filter queries, use Page.getTotalElements() from documentRepository.findAll(spec, pageRequest) as the source of truth for totalElements (as the existing fast path already does for DATE/TITLE/UPLOAD_DATE sorts). Only use the window-function total when there are no active filters. This is likely the correct answer — confirm before implementation.


Theme 3: pg_stat_statements in docker-compose.yml — base file or prod overlay only?

From: Tobias

Enabling pg_stat_statements in the base docker-compose.yml means it is always-on in dev. It adds ~5% query overhead. The alternative is to add it only to a docker-compose.prod.yml overlay, but that means the verification step ("rows_per_call drops from 911 to 50") cannot be run locally.

Decision needed: Enable in the base file (simplest, unblocks F-24 and this issue's verification in one change) or overlay only (lower dev overhead, more operational complexity).


Theme 4: FtsHit / FtsPage record placement

From: Markus

The issue shows these as nested types inside DocumentRepository. This leaks value-object semantics into the repository layer.

Decision needed: Define them in the document package root (alongside DocumentSearchResult, DocumentSearchItem, etc.) so the repository just returns List<Object[]> and the service owns the record types and the parsing. This is the cleaner split — recommend adopting it, but worth a quick confirmation.


Theme 5: ADR requirement

From: Markus

Pushing FTS pagination into SQL is a lasting architectural constraint — it determines forever that SENDER/RECEIVER sorts stay in-memory. The project has an active ADR directory (docs/adr/).

Decision needed: Write a short ADR before implementing? Recommended yes — context + decision + alternatives (cap result set with warning, two-query count) + consequences (SENDER/RECEIVER permanent in-memory) takes 20 minutes and avoids re-litigating the trade-off on the PR.

## Decision Queue — Open Items for Human Judgment Grouped from the persona reviews above. Each item is a genuine tradeoff needing a decision before or during implementation. --- ### Theme 1: Scope of `findRankedIdsByFts` after the refactor **From: Markus, Felix, Elicit, Sara** `findRankedIdsByFts` has two callers: - `searchDocuments` — the paged search path (target of this fix) - `findIdsForFilter` — the bulk-edit "select all" path (intentionally unpaginated, must not change) The acceptance criterion "findRankedIdsByFts removed" in the issue would break the bulk-edit path. **Decision needed:** Keep `findRankedIdsByFts` for `findIdsForFilter` (rename to `findAllMatchingIdsByFts` for clarity) and add `findFtsPageRaw` as a second query? Or replace both with a unified query and rebuild the bulk-edit path to iterate pages? The simplest answer is keep+rename. --- ### Theme 2: Where the `totalElements` count comes from when filters AND text are combined **From: Leonie, Elicit** The proposed window function `COUNT(*) OVER ()` runs inside the native SQL — it counts FTS matches only, before JPA Specification filters (date range, sender, tag) are applied. If a user searches `q=walter&sender=<uuid>`, the window function returns 911 (all Walter matches) but the JPA spec filters reduce the visible results to, say, 23. The UI result count chip would show "911 results" while only 23 appear — a serious UX inconsistency. **Decision needed:** For combined text+filter queries, use `Page.getTotalElements()` from `documentRepository.findAll(spec, pageRequest)` as the source of truth for `totalElements` (as the existing fast path already does for DATE/TITLE/UPLOAD_DATE sorts). Only use the window-function total when there are no active filters. This is likely the correct answer — confirm before implementation. --- ### Theme 3: `pg_stat_statements` in `docker-compose.yml` — base file or prod overlay only? **From: Tobias** Enabling `pg_stat_statements` in the base `docker-compose.yml` means it is always-on in dev. It adds ~5% query overhead. The alternative is to add it only to a `docker-compose.prod.yml` overlay, but that means the verification step ("rows_per_call drops from 911 to 50") cannot be run locally. **Decision needed:** Enable in the base file (simplest, unblocks F-24 and this issue's verification in one change) or overlay only (lower dev overhead, more operational complexity). --- ### Theme 4: `FtsHit` / `FtsPage` record placement **From: Markus** The issue shows these as nested types inside `DocumentRepository`. This leaks value-object semantics into the repository layer. **Decision needed:** Define them in the `document` package root (alongside `DocumentSearchResult`, `DocumentSearchItem`, etc.) so the repository just returns `List<Object[]>` and the service owns the record types and the parsing. This is the cleaner split — recommend adopting it, but worth a quick confirmation. --- ### Theme 5: ADR requirement **From: Markus** Pushing FTS pagination into SQL is a lasting architectural constraint — it determines forever that SENDER/RECEIVER sorts stay in-memory. The project has an active ADR directory (`docs/adr/`). **Decision needed:** Write a short ADR before implementing? Recommended yes — context + decision + alternatives (cap result set with warning, two-query count) + consequences (SENDER/RECEIVER permanent in-memory) takes 20 minutes and avoids re-litigating the trade-off on the PR.
Sign in to join this conversation.
No Label P1-high bug
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: marcel/familienarchiv#468