fix(db): add indexes on documents.sender_id and document_comments.author_id #470

New Issue

marcel · 2026-05-07T17:28:06+02:00

marcel commented

2026-05-07 17:28:06 +02:00

Context

Pre-prod audit's schema inspection found two FK columns without supporting indexes:

       table          |  column      | status
----------------------+--------------+-----------
 documents            | sender_id    | NO INDEX
 document_comments    | author_id    | NO INDEX

These are queried on common UI paths:

documents.sender_id — every "show all letters from this person" query (used on /persons/[id] Korrespondenz-Überblick per #306, on /briefwechsel, and as part of mass-edit flows).
document_comments.author_id — every "show all comments by this user" query (admin user detail, activity feed, future contributor profile per #335).

Without an index, Postgres does a sequential scan. At 1,520 documents and a few hundred comments today the sequential scan is fast (< 1 ms). At 10× data volume and 100× comments, both queries become bottlenecks for the read-side experience the team is building toward (Reader Experience v1 / Persons Redesign v2 milestones).

Approach

Single Flyway migration adding both indexes. Indexes on FK columns rarely cause downsides — query planner can ignore them, and Postgres maintains them cheaply at the volumes this app sees.

V64__index_fk_columns.sql

-- Speeds up "documents by sender" queries used on /persons/[id] Korrespondenz-Überblick (#306),
-- /briefwechsel, and bulk-edit flows.
CREATE INDEX IF NOT EXISTS idx_documents_sender_id
    ON documents(sender_id);

-- Speeds up "comments by author" queries on admin user detail and (future) contributor profile.
CREATE INDEX IF NOT EXISTS idx_comments_author_id
    ON document_comments(author_id);

CREATE INDEX CONCURRENTLY would be production-correct for live deploys, but Flyway transactional DDL doesn't allow it; rely on the brief lock at deploy-time. At current data volume the index build is sub-second.

Optional: cover audit_log / notifications

While we're here, consider whether audit_log.actor_id or notifications.user_id deserve the same treatment. Audit didn't flag them, but they're queried on similar patterns. Out of scope for this issue unless cheap.

Critical files

backend/src/main/resources/db/migration/V64__index_fk_columns.sql — new

Verification

EXPLAIN before:

EXPLAIN SELECT * FROM documents WHERE sender_id = '<known-uuid>';
-- Seq Scan on documents (cost=...)

EXPLAIN after:

EXPLAIN SELECT * FROM documents WHERE sender_id = '<known-uuid>';
-- Index Scan using idx_documents_sender_id (cost=...)

pg_stat_statements: re-exercise the audit's 24-call sequence; total time for sender-filtered queries drops.
Test suite: existing tests pass. No new tests needed — index correctness is a pg_indexes check.

Acceptance criteria

Both indexes present after migration: \d documents and \d document_comments show them.
EXPLAIN for WHERE sender_id = ? and WHERE author_id = ? shows Index Scan.
No test regressions.
Migration is idempotent (IF NOT EXISTS), so re-running on a partially-migrated DB doesn't fail.

Effort

XS — 30 minutes.

Risk if not addressed

At 10K+ documents, sender-filtered queries on /persons/[id] become user-visibly slow (>200 ms).

Tracked in audit doc as F-33 (Medium) — new dynamic finding from pg_attribute join.

## Context Pre-prod audit's schema inspection found two FK columns without supporting indexes: ``` table | column | status ----------------------+--------------+----------- documents | sender_id | NO INDEX document_comments | author_id | NO INDEX ``` These are queried on common UI paths: - **`documents.sender_id`** — every "show all letters from this person" query (used on `/persons/[id]` Korrespondenz-Überblick per #306, on `/briefwechsel`, and as part of mass-edit flows). - **`document_comments.author_id`** — every "show all comments by this user" query (admin user detail, activity feed, future contributor profile per #335). Without an index, Postgres does a sequential scan. At 1,520 documents and a few hundred comments today the sequential scan is fast (< 1 ms). At 10× data volume and 100× comments, both queries become bottlenecks for the read-side experience the team is building toward (Reader Experience v1 / Persons Redesign v2 milestones). ## Approach Single Flyway migration adding both indexes. Indexes on FK columns rarely cause downsides — query planner can ignore them, and Postgres maintains them cheaply at the volumes this app sees. ### V64__index_fk_columns.sql ```sql -- Speeds up "documents by sender" queries used on /persons/[id] Korrespondenz-Überblick (#306), -- /briefwechsel, and bulk-edit flows. CREATE INDEX IF NOT EXISTS idx_documents_sender_id ON documents(sender_id); -- Speeds up "comments by author" queries on admin user detail and (future) contributor profile. CREATE INDEX IF NOT EXISTS idx_comments_author_id ON document_comments(author_id); ``` `CREATE INDEX CONCURRENTLY` would be production-correct for live deploys, but Flyway transactional DDL doesn't allow it; rely on the brief lock at deploy-time. At current data volume the index build is sub-second. ### Optional: cover audit_log / notifications While we're here, consider whether `audit_log.actor_id` or `notifications.user_id` deserve the same treatment. Audit didn't flag them, but they're queried on similar patterns. Out of scope for this issue unless cheap. ## Critical files - `backend/src/main/resources/db/migration/V64__index_fk_columns.sql` — **new** ## Verification 1. **EXPLAIN before**: ```sql EXPLAIN SELECT * FROM documents WHERE sender_id = '<known-uuid>'; -- Seq Scan on documents (cost=...) ``` 2. **EXPLAIN after**: ```sql EXPLAIN SELECT * FROM documents WHERE sender_id = '<known-uuid>'; -- Index Scan using idx_documents_sender_id (cost=...) ``` 3. **`pg_stat_statements`**: re-exercise the audit's 24-call sequence; total time for sender-filtered queries drops. 4. **Test suite**: existing tests pass. No new tests needed — index correctness is a `pg_indexes` check. ## Acceptance criteria - [ ] Both indexes present after migration: `\d documents` and `\d document_comments` show them. - [ ] EXPLAIN for `WHERE sender_id = ?` and `WHERE author_id = ?` shows Index Scan. - [ ] No test regressions. - [ ] Migration is idempotent (`IF NOT EXISTS`), so re-running on a partially-migrated DB doesn't fail. ## Effort XS — 30 minutes. ## Risk if not addressed At 10K+ documents, sender-filtered queries on `/persons/[id]` become user-visibly slow (>200 ms). Tracked in audit doc as **F-33** (Medium) — new dynamic finding from `pg_attribute` join.

marcel added the P2-medium bug labels 2026-05-07 17:29:37 +02:00

marcel commented

2026-05-07 17:47:47 +02:00

👨‍💻 Markus Keller — Application Architect

Observations

The gap is real and confirmed in the codebase. V1__initial_schema.sql defines documents.sender_id as a plain FK column with no CREATE INDEX statement. V12__add_document_comments.sql creates indexes on document_id, annotation_id, and parent_id — but explicitly omits author_id. No subsequent migration (V2–V60) fills either gap.
The query surface is larger than the issue title suggests. sender_id is used in at least six distinct query shapes across DocumentRepository and PersonRepository: findBySenderId, two briefwechsel JOIN queries, two Korrespondenz-Überblick correlated subqueries (one used inside a SELECT COUNT(*) per person in list views), and a bulk-merge UPDATE. Every WHERE d.sender_id = :personId hits a sequential scan today.
audit_log.actor_id and notifications.recipient_id are already indexed (V46 and V16 respectively). The issue's "optional" consideration is moot — those are already covered.
The IF NOT EXISTS guard is correct practice. This codebase's existing migrations (V28, V30, V36, V42) all use this guard. Consistency is good.
The proposed migration version V64 is correct: the last migration is V60. Skipping V61–V63 is fine only if those numbers aren't reserved elsewhere — confirm no in-flight branches use V61/V62/V63.
The issue correctly notes that CREATE INDEX CONCURRENTLY is incompatible with Flyway's transactional DDL. At the documented data volume, the brief table lock during index creation is sub-second. Accept the trade-off.
No diagram updates required for this change: indexes are not represented in docs/architecture/db/db-orm.puml or docs/architecture/db/db-relationships.puml (they show entity relationships, not physical storage). No CLAUDE.md or ADR changes needed.

Recommendations

Implement exactly as specified. The SQL is correct, the version number is right, the IF NOT EXISTS guard matches the project's pattern. Add no scope creep.
Do not add geschichten.author_id to this migration. That FK references app_users, not persons, and the table is new (V58). Its query patterns (story list, contributor profile) are not yet production-load bottlenecks. Create a separate issue if the audit eventually flags it.
Extend MigrationIntegrationTest with two index-existence assertions. The pattern for verifying database-layer invariants is already established in that file. A pg_indexes query asserting idx_documents_sender_id and idx_comments_author_id exist after migration gives the same enforcement guarantee as the constraint tests already there. See Sara's review for the exact test shape.
Confirm version number availability by running git log --all --oneline -- backend/src/main/resources/db/migration/V6*.sql before merging to check no in-flight branch has claimed V61–V63.

## 👨‍💻 Markus Keller — Application Architect ### Observations - The gap is real and confirmed in the codebase. `V1__initial_schema.sql` defines `documents.sender_id` as a plain FK column with no `CREATE INDEX` statement. `V12__add_document_comments.sql` creates indexes on `document_id`, `annotation_id`, and `parent_id` — but explicitly omits `author_id`. No subsequent migration (V2–V60) fills either gap. - The query surface is larger than the issue title suggests. `sender_id` is used in **at least six distinct query shapes** across `DocumentRepository` and `PersonRepository`: `findBySenderId`, two briefwechsel JOIN queries, two Korrespondenz-Überblick correlated subqueries (one used inside a `SELECT COUNT(*)` per person in list views), and a bulk-merge UPDATE. Every `WHERE d.sender_id = :personId` hits a sequential scan today. - `audit_log.actor_id` and `notifications.recipient_id` are already indexed (V46 and V16 respectively). The issue's "optional" consideration is moot — those are already covered. - The `IF NOT EXISTS` guard is correct practice. This codebase's existing migrations (V28, V30, V36, V42) all use this guard. Consistency is good. - The proposed migration version V64 is correct: the last migration is V60. Skipping V61–V63 is fine only if those numbers aren't reserved elsewhere — confirm no in-flight branches use V61/V62/V63. - The issue correctly notes that `CREATE INDEX CONCURRENTLY` is incompatible with Flyway's transactional DDL. At the documented data volume, the brief table lock during index creation is sub-second. Accept the trade-off. - No diagram updates required for this change: indexes are not represented in `docs/architecture/db/db-orm.puml` or `docs/architecture/db/db-relationships.puml` (they show entity relationships, not physical storage). No CLAUDE.md or ADR changes needed. ### Recommendations - **Implement exactly as specified.** The SQL is correct, the version number is right, the `IF NOT EXISTS` guard matches the project's pattern. Add no scope creep. - **Do not add `geschichten.author_id` to this migration.** That FK references `app_users`, not `persons`, and the table is new (V58). Its query patterns (story list, contributor profile) are not yet production-load bottlenecks. Create a separate issue if the audit eventually flags it. - **Extend `MigrationIntegrationTest` with two index-existence assertions.** The pattern for verifying database-layer invariants is already established in that file. A `pg_indexes` query asserting `idx_documents_sender_id` and `idx_comments_author_id` exist after migration gives the same enforcement guarantee as the constraint tests already there. See Sara's review for the exact test shape. - **Confirm version number availability** by running `git log --all --oneline -- backend/src/main/resources/db/migration/V6*.sql` before merging to check no in-flight branch has claimed V61–V63.

marcel commented

2026-05-07 17:48:01 +02:00

👨‍💻 Felix Brandt — Fullstack Developer

Observations

The migration SQL as written in the issue is clean and correct. Both statements use CREATE INDEX IF NOT EXISTS, matching the style in V28, V36, V38, V48, V49, and V55. The names follow the existing convention: idx_<table>_<column>.
sender_id is not just a simple FK — it participates in complex correlated subqueries in PersonRepository. Two queries each use WHERE d.sender_id = :personId and WHERE dr.person_id = :personId AND d.sender_id IS NOT NULL inside UNION expressions. These scan the documents table per-person during list rendering. Without the index, every person in a directory listing fires a sequential scan.
document_comments.author_id is stored in the entity (@Column(name = "author_id")) but CommentService currently queries only by blockId, parentId, and documentId. The index anticipates the future "comments by this user" admin view (#335 / contributor profile) — which is explicitly described in the issue. There is no findByAuthorId in CommentRepository today, which means the index is speculative for the write path. It is still cheap and correct to add.
The migration file is the entirety of the implementation: there are no Java changes, no DTO changes, no frontend changes, no API type regeneration needed. This is genuinely an XS change.
The comment block in the proposed SQL clearly explains the why of each index, referencing issue numbers. That's consistent with the project's comment style in migrations like V36 and V49.

Recommendations

Write the migration exactly as specified in the issue. No Java code changes accompany this. Do not add a findByAuthorId method to CommentRepository speculatively — let the query consumer drive that when the feature exists.
Follow the TDD flow for schema changes: write the MigrationIntegrationTest assertions first (they fail because the indexes don't exist yet), then create the migration file (they go green). The test body is a pg_indexes query — two assertions, red then green. This upholds the red/green discipline for infrastructure changes.
Commit atomically: one commit for the migration file + the test assertions together. They are one logical change. Do not split across two commits.

## 👨‍💻 Felix Brandt — Fullstack Developer ### Observations - The migration SQL as written in the issue is clean and correct. Both statements use `CREATE INDEX IF NOT EXISTS`, matching the style in V28, V36, V38, V48, V49, and V55. The names follow the existing convention: `idx_<table>_<column>`. - `sender_id` is not just a simple FK — it participates in complex correlated subqueries in `PersonRepository`. Two queries each use `WHERE d.sender_id = :personId` and `WHERE dr.person_id = :personId AND d.sender_id IS NOT NULL` inside UNION expressions. These scan the `documents` table per-person during list rendering. Without the index, every person in a directory listing fires a sequential scan. - `document_comments.author_id` is stored in the entity (`@Column(name = "author_id")`) but `CommentService` currently queries only by `blockId`, `parentId`, and `documentId`. The index anticipates the future "comments by this user" admin view (#335 / contributor profile) — which is explicitly described in the issue. There is no `findByAuthorId` in `CommentRepository` today, which means the index is speculative for the write path. It is still cheap and correct to add. - The migration file is the entirety of the implementation: there are no Java changes, no DTO changes, no frontend changes, no API type regeneration needed. This is genuinely an XS change. - The comment block in the proposed SQL clearly explains the `why` of each index, referencing issue numbers. That's consistent with the project's comment style in migrations like V36 and V49. ### Recommendations - **Write the migration exactly as specified in the issue.** No Java code changes accompany this. Do not add a `findByAuthorId` method to `CommentRepository` speculatively — let the query consumer drive that when the feature exists. - **Follow the TDD flow for schema changes:** write the `MigrationIntegrationTest` assertions *first* (they fail because the indexes don't exist yet), then create the migration file (they go green). The test body is a `pg_indexes` query — two assertions, red then green. This upholds the red/green discipline for infrastructure changes. - **Commit atomically:** one commit for the migration file + the test assertions together. They are one logical change. Do not split across two commits.

marcel commented

2026-05-07 17:48:15 +02:00

👨‍💻 Tobias Wendt — DevOps & Platform Engineer

Observations

This is a pure schema migration — no Compose file changes, no CI changes, no Docker image changes. The DevOps surface is minimal.
The issue correctly identifies the CREATE INDEX CONCURRENTLY limitation with Flyway's transactional DDL. The brief table lock at deploy time is the accepted trade-off. At sub-second index build time (confirmed by the issue at current data volumes), this is a non-issue for the deploy window.
The existing CI pipeline runs ./mvnw clean package which triggers MigrationIntegrationTest via Testcontainers against a real postgres:16-alpine container. The new migration will run through CI on every PR with no additional pipeline configuration needed.
The IF NOT EXISTS guard means re-running Flyway against a partially-migrated database (e.g., a failed mid-deploy recovery) will not error. This is correct for a self-hosted single-node deployment where rollback is a pg_restore, not a migration rollback.
No backup procedure changes needed. The new indexes are recreated automatically from the migration on any restored database — they are not separately backed up, nor do they need to be.
The migration naming (V64) is consistent with the Flyway sequential numbering convention this project uses. However: the last checked-in migration is V60. Check that no in-flight branch has migrations at V61–V63 before merging, to avoid a Flyway checksum conflict on the shared dev database.

Recommendations

Nothing to change in the infrastructure layer. The migration runs through the existing CI pipeline, the Testcontainers integration test catches regressions, and the IF NOT EXISTS guard makes it deploy-safe.
Document the version gap check in the PR description, not in code: confirm V61–V63 are not taken by active branches before this PR merges. This is a one-line git command, not a systemic concern.
No downtime risk at current data volume. If data grows to >100k rows and an online migration is needed, use a separate CREATE INDEX CONCURRENTLY step outside Flyway (via a maintenance script). That is a future concern, not a present one.

## 👨‍💻 Tobias Wendt — DevOps & Platform Engineer ### Observations - This is a pure schema migration — no Compose file changes, no CI changes, no Docker image changes. The DevOps surface is minimal. - The issue correctly identifies the `CREATE INDEX CONCURRENTLY` limitation with Flyway's transactional DDL. The brief table lock at deploy time is the accepted trade-off. At sub-second index build time (confirmed by the issue at current data volumes), this is a non-issue for the deploy window. - The existing CI pipeline runs `./mvnw clean package` which triggers `MigrationIntegrationTest` via Testcontainers against a real `postgres:16-alpine` container. The new migration will run through CI on every PR with no additional pipeline configuration needed. - The `IF NOT EXISTS` guard means re-running Flyway against a partially-migrated database (e.g., a failed mid-deploy recovery) will not error. This is correct for a self-hosted single-node deployment where rollback is a `pg_restore`, not a migration rollback. - No backup procedure changes needed. The new indexes are recreated automatically from the migration on any restored database — they are not separately backed up, nor do they need to be. - The migration naming (V64) is consistent with the Flyway sequential numbering convention this project uses. However: the last checked-in migration is V60. Check that no in-flight branch has migrations at V61–V63 before merging, to avoid a Flyway checksum conflict on the shared dev database. ### Recommendations - **Nothing to change in the infrastructure layer.** The migration runs through the existing CI pipeline, the Testcontainers integration test catches regressions, and the `IF NOT EXISTS` guard makes it deploy-safe. - **Document the version gap check in the PR description**, not in code: confirm V61–V63 are not taken by active branches before this PR merges. This is a one-line git command, not a systemic concern. - **No downtime risk at current data volume.** If data grows to >100k rows and an online migration is needed, use a separate `CREATE INDEX CONCURRENTLY` step outside Flyway (via a maintenance script). That is a future concern, not a present one.

marcel commented

2026-05-07 17:48:32 +02:00

👨‍💻 Elicit — Requirements Engineer

Observations

The issue is well-structured and passes the Definition of Ready checklist: it has a problem statement, a measured impact projection (sequential scan → Index Scan, 10× data volume threshold, >200ms risk), a concrete acceptance criteria list, an effort estimate (XS / 30 min), and explicit verification steps (EXPLAIN before/after, pg_stat_statements, idempotency test). This is production-grade issue writing.
The user impact framing is concrete and correct: /persons/[id] Korrespondenz-Überblick is a reader-facing path (confirmed by #306 linkage), which means this issue is directly on the Reader Experience v1 critical path.
The "Optional: cover audit_log / notifications" section in the issue body is a requirements smell — it introduces scope ambiguity. Audit confirmed: audit_log.actor_id (V46) and notifications.recipient_id (V16) are already indexed. This optional section is noise and should be struck from the issue to avoid confusion during implementation. A note that both are already covered would be better than leaving the "consider whether" language.
The acceptance criteria are testable and specific:
- "Both indexes present" → verifiable via pg_indexes
- "EXPLAIN shows Index Scan" → verifiable with a real Postgres query
- "No test regressions" → CI gate
- "Migration is idempotent" → IF NOT EXISTS structure in the SQL
The issue references #306 (Korrespondenz-Überblick) and #335 (contributor profile) to explain why author_id matters. These forward references are valuable — they connect the infrastructure investment to product milestones.
One requirement gap: the issue does not specify what the EXPLAIN Index Scan threshold is. "EXPLAIN shows Index Scan" passes trivially if the query planner chooses a seq scan due to table statistics. A tighter criterion would be: "EXPLAIN ANALYZE for a WHERE sender_id = ? query against a table with ≥100 rows shows Bitmap Index Scan or Index Scan (not Seq Scan)." This is verifiable in the MigrationIntegrationTest if seed data is added.

Recommendations

Close the optional section ambiguity: update the issue body to strike the audit_log/notifications paragraph and replace it with: "~~Audit log and notifications are already covered: audit_log has idx_audit_log_actor_id (V46) and notifications has idx_notifications_recipient (V16).~~" This prevents any implementer from adding unnecessary indexes.
The acceptance criteria are sufficient as written. Do not over-specify — the IF NOT EXISTS criterion is already testable and the EXPLAIN criterion is verifiable during manual QA or via the MigrationIntegrationTest.
No new issues needed from this review. The scope is tight, the rationale is documented, and the linkage to upstream features is clear.

## 👨‍💻 Elicit — Requirements Engineer ### Observations - The issue is well-structured and passes the Definition of Ready checklist: it has a problem statement, a measured impact projection (sequential scan → Index Scan, 10× data volume threshold, >200ms risk), a concrete acceptance criteria list, an effort estimate (XS / 30 min), and explicit verification steps (EXPLAIN before/after, `pg_stat_statements`, idempotency test). This is production-grade issue writing. - The user impact framing is concrete and correct: `/persons/[id]` Korrespondenz-Überblick is a reader-facing path (confirmed by #306 linkage), which means this issue is directly on the Reader Experience v1 critical path. - The "Optional: cover audit_log / notifications" section in the issue body is a requirements smell — it introduces scope ambiguity. Audit confirmed: `audit_log.actor_id` (V46) and `notifications.recipient_id` (V16) are **already indexed**. This optional section is noise and should be struck from the issue to avoid confusion during implementation. A note that both are already covered would be better than leaving the "consider whether" language. - The acceptance criteria are testable and specific: - "Both indexes present" → verifiable via `pg_indexes` - "EXPLAIN shows Index Scan" → verifiable with a real Postgres query - "No test regressions" → CI gate - "Migration is idempotent" → `IF NOT EXISTS` structure in the SQL - The issue references #306 (Korrespondenz-Überblick) and #335 (contributor profile) to explain why `author_id` matters. These forward references are valuable — they connect the infrastructure investment to product milestones. - One requirement gap: the issue does not specify what the EXPLAIN Index Scan threshold *is*. "EXPLAIN shows Index Scan" passes trivially if the query planner chooses a seq scan due to table statistics. A tighter criterion would be: "EXPLAIN ANALYZE for a `WHERE sender_id = ?` query against a table with ≥100 rows shows Bitmap Index Scan or Index Scan (not Seq Scan)." This is verifiable in the MigrationIntegrationTest if seed data is added. ### Recommendations - **Close the optional section ambiguity:** update the issue body to strike the `audit_log`/`notifications` paragraph and replace it with: "~~Audit log and notifications are already covered: `audit_log` has `idx_audit_log_actor_id` (V46) and `notifications` has `idx_notifications_recipient` (V16).~~" This prevents any implementer from adding unnecessary indexes. - **The acceptance criteria are sufficient as written.** Do not over-specify — the `IF NOT EXISTS` criterion is already testable and the EXPLAIN criterion is verifiable during manual QA or via the MigrationIntegrationTest. - **No new issues needed from this review.** The scope is tight, the rationale is documented, and the linkage to upstream features is clear.

marcel commented

2026-05-07 17:48:48 +02:00

👨‍💻 Nora "NullX" Steiner — Security Engineer

Observations

This is a pure read-performance change with no attack surface impact. Adding a non-unique B-tree index on sender_id and author_id does not change any access control boundary, does not expose new query paths, and does not affect authentication or authorization logic.
No JPQL changes, no service changes, no controller changes. The migration adds two CREATE INDEX statements — the SQL is not user-controlled and cannot be injected. This is a DDL-only change applied at deploy time by Flyway.
The IF NOT EXISTS guard is safe and correct: it prevents double-execution errors without bypassing any constraint.
From an information-disclosure perspective: indexes do not change what data is accessible — only how fast it is retrieved. The existing @RequirePermission + PermissionAspect authorization model remains unchanged. There is no IDOR or privilege escalation risk from adding a covering index on a FK column.
One peripheral observation: document_comments.author_id is nullable (REFERENCES users(id) ON DELETE SET NULL). Deleted-user comments retain their content with a null author_id. An index on a nullable column is fine in PostgreSQL — null values are indexed by default in B-tree indexes, which is correct behavior here. No security concern.
The broader audit document (docs/audits/2026-05-07-pre-prod-architectural-review.md) flags several Critical findings (no CSP, auth-token-in-cookie brute-force, default admin password admin123). Those are materially higher severity than this fix. This issue is correctly labeled P2-medium.

Recommendations

Approve from a security perspective — no security-relevant changes. This migration is safe to ship.
Do not let this XS fix delay progress on the P0 items flagged in the pre-prod audit (CSP headers, rate limiting on /login, the admin123 default credential). This issue is on the correct priority tier; the Critical findings are not.

## 👨‍💻 Nora "NullX" Steiner — Security Engineer ### Observations - This is a pure read-performance change with no attack surface impact. Adding a non-unique B-tree index on `sender_id` and `author_id` does not change any access control boundary, does not expose new query paths, and does not affect authentication or authorization logic. - No JPQL changes, no service changes, no controller changes. The migration adds two `CREATE INDEX` statements — the SQL is not user-controlled and cannot be injected. This is a DDL-only change applied at deploy time by Flyway. - The `IF NOT EXISTS` guard is safe and correct: it prevents double-execution errors without bypassing any constraint. - From an information-disclosure perspective: indexes do not change what data is accessible — only how fast it is retrieved. The existing `@RequirePermission` + `PermissionAspect` authorization model remains unchanged. There is no IDOR or privilege escalation risk from adding a covering index on a FK column. - One peripheral observation: `document_comments.author_id` is nullable (`REFERENCES users(id) ON DELETE SET NULL`). Deleted-user comments retain their content with a null `author_id`. An index on a nullable column is fine in PostgreSQL — null values are indexed by default in B-tree indexes, which is correct behavior here. No security concern. - The broader audit document (`docs/audits/2026-05-07-pre-prod-architectural-review.md`) flags several Critical findings (no CSP, auth-token-in-cookie brute-force, default admin password `admin123`). Those are materially higher severity than this fix. This issue is correctly labeled P2-medium. ### Recommendations - **Approve from a security perspective — no security-relevant changes.** This migration is safe to ship. - **Do not let this XS fix delay progress on the P0 items** flagged in the pre-prod audit (CSP headers, rate limiting on `/login`, the `admin123` default credential). This issue is on the correct priority tier; the Critical findings are not.

marcel commented

2026-05-07 17:49:08 +02:00

👨‍💻 Sara Holt — QA Engineer

Observations

The issue explicitly says "No new tests needed — index correctness is a pg_indexes check." This is partially correct — index existence is verifiable at the DB layer — but the existing MigrationIntegrationTest pattern is the right place to add that verification, not skip it entirely.
The project already has MigrationIntegrationTest using @DataJpaTest + PostgresContainerConfig + JdbcTemplate. Every other schema invariant in the project (CHECK constraints, partial unique indexes, nullable columns) has a corresponding test in that class. Leaving the index unverified breaks the project's own standard that schema-level guarantees are tested.

The two assertions are a JdbcTemplate query against pg_catalog.pg_indexes:

@Test
void v64_idx_documents_sender_id_exists() {
    Integer count = jdbc.queryForObject(
        "SELECT COUNT(*) FROM pg_catalog.pg_indexes WHERE tablename = 'documents' AND indexname = 'idx_documents_sender_id'",
        Integer.class);
    assertThat(count).isEqualTo(1);
}

@Test
void v64_idx_comments_author_id_exists() {
    Integer count = jdbc.queryForObject(
        "SELECT COUNT(*) FROM pg_catalog.pg_indexes WHERE tablename = 'document_comments' AND indexname = 'idx_comments_author_id'",
        Integer.class);
    assertThat(count).isEqualTo(1);
}

These run against real PostgreSQL via Testcontainers and fail fast if the migration doesn't apply. They take under 50ms.

The verification steps in the issue (EXPLAIN before/after, pg_stat_statements) are good manual QA steps but are not substitutes for automated regression coverage. Once the index exists, a future migration that accidentally drops it would go undetected without the pg_indexes assertions.
The acceptance criterion "No test regressions" is satisfied automatically — the migration has no Java code changes, so the existing test suite is unaffected.
The TDD flow for this issue: write the two v64_* test methods first (they fail with 0 != 1), then create V64__index_fk_columns.sql (they go green). This follows the red/green discipline for infrastructure changes.

Recommendations

Add the two pg_indexes assertions to MigrationIntegrationTest as part of this issue. The issue says "no new tests needed" — disagree. The project's standard is: every schema invariant has a test. Two 4-line test methods prevent silent regression.
Place the test methods under a clearly labeled section header comment in MigrationIntegrationTest consistent with the existing style (// ─── V64: ...).
Do not add EXPLAIN ANALYZE assertions in the test suite. Query plan assertions are brittle — the planner can legitimately choose a seq scan on a small test dataset. The pg_indexes check is the right automated gate; the EXPLAIN verification is a one-time manual QA step.

Open Decisions

Should MigrationIntegrationTest gain seed-data-based query plan tests? The issue's verification step 2 (EXPLAIN shows Index Scan) cannot be automated reliably against the Testcontainers database, which starts empty. If this project eventually wants automated EXPLAIN-based regression tests, it would require a seed fixture with enough rows to force the planner past the seq-scan threshold (~50+ rows for a selective FK). That is broader than this issue — worth a separate spike.

## 👨‍💻 Sara Holt — QA Engineer ### Observations - The issue explicitly says "No new tests needed — index correctness is a `pg_indexes` check." This is partially correct — index existence is verifiable at the DB layer — but the existing `MigrationIntegrationTest` pattern is the right place to add that verification, not skip it entirely. - The project already has `MigrationIntegrationTest` using `@DataJpaTest` + `PostgresContainerConfig` + `JdbcTemplate`. Every other schema invariant in the project (CHECK constraints, partial unique indexes, nullable columns) has a corresponding test in that class. Leaving the index unverified breaks the project's own standard that schema-level guarantees are tested. - The two assertions are a `JdbcTemplate` query against `pg_catalog.pg_indexes`: ```java @Test void v64_idx_documents_sender_id_exists() { Integer count = jdbc.queryForObject( "SELECT COUNT(*) FROM pg_catalog.pg_indexes WHERE tablename = 'documents' AND indexname = 'idx_documents_sender_id'", Integer.class); assertThat(count).isEqualTo(1); } @Test void v64_idx_comments_author_id_exists() { Integer count = jdbc.queryForObject( "SELECT COUNT(*) FROM pg_catalog.pg_indexes WHERE tablename = 'document_comments' AND indexname = 'idx_comments_author_id'", Integer.class); assertThat(count).isEqualTo(1); } ``` These run against real PostgreSQL via Testcontainers and fail fast if the migration doesn't apply. They take under 50ms. - The verification steps in the issue (EXPLAIN before/after, `pg_stat_statements`) are good manual QA steps but are not substitutes for automated regression coverage. Once the index exists, a future migration that accidentally drops it would go undetected without the `pg_indexes` assertions. - The acceptance criterion "No test regressions" is satisfied automatically — the migration has no Java code changes, so the existing test suite is unaffected. - The TDD flow for this issue: write the two `v64_*` test methods first (they fail with 0 != 1), then create `V64__index_fk_columns.sql` (they go green). This follows the red/green discipline for infrastructure changes. ### Recommendations - **Add the two `pg_indexes` assertions to `MigrationIntegrationTest` as part of this issue.** The issue says "no new tests needed" — disagree. The project's standard is: every schema invariant has a test. Two 4-line test methods prevent silent regression. - **Place the test methods under a clearly labeled section header comment** in `MigrationIntegrationTest` consistent with the existing style (`// ─── V64: ...`). - **Do not add EXPLAIN ANALYZE assertions in the test suite.** Query plan assertions are brittle — the planner can legitimately choose a seq scan on a small test dataset. The `pg_indexes` check is the right automated gate; the EXPLAIN verification is a one-time manual QA step. ### Open Decisions - **Should `MigrationIntegrationTest` gain seed-data-based query plan tests?** The issue's verification step 2 (EXPLAIN shows Index Scan) cannot be automated reliably against the Testcontainers database, which starts empty. If this project eventually wants automated EXPLAIN-based regression tests, it would require a seed fixture with enough rows to force the planner past the seq-scan threshold (~50+ rows for a selective FK). That is broader than this issue — worth a separate spike.

marcel commented

2026-05-07 17:49:20 +02:00

👨‍💻 Leonie Voss — UI/UX Designer

Observations

This is a backend/database-only change. No frontend components, no routes, no Svelte files, no CSS tokens, no i18n strings, and no accessibility-relevant markup are involved. This issue has no UI/UX surface to review.
From a user experience perspective, the motivation is sound: the /persons/[id] Korrespondenz-Überblick and /briefwechsel are reader-facing paths used by the senior audience (60+). Latency on those views directly degrades usability for the primary transcriber audience — reducing query time from potentially 200ms+ to sub-millisecond supports the "no perceptible lag on navigation" experience goal for the reader experience milestone.
No loading state changes, skeleton updates, or error boundary changes accompany this fix. The improvement is invisible to the UI layer — which is exactly as it should be for a database optimization.

Recommendations

No UI/UX changes required or recommended. Implement as specified.
If the Korrespondenz-Überblick or Briefwechsel pages ever show a loading skeleton, this index reduces the duration that skeleton is visible — a passive UX improvement with no additional frontend work needed.

## 👨‍💻 Leonie Voss — UI/UX Designer ### Observations - This is a backend/database-only change. No frontend components, no routes, no Svelte files, no CSS tokens, no i18n strings, and no accessibility-relevant markup are involved. This issue has no UI/UX surface to review. - From a user experience perspective, the motivation is sound: the `/persons/[id]` Korrespondenz-Überblick and `/briefwechsel` are reader-facing paths used by the senior audience (60+). Latency on those views directly degrades usability for the primary transcriber audience — reducing query time from potentially 200ms+ to sub-millisecond supports the "no perceptible lag on navigation" experience goal for the reader experience milestone. - No loading state changes, skeleton updates, or error boundary changes accompany this fix. The improvement is invisible to the UI layer — which is exactly as it should be for a database optimization. ### Recommendations - **No UI/UX changes required or recommended.** Implement as specified. - **If the Korrespondenz-Überblick or Briefwechsel pages ever show a loading skeleton**, this index reduces the duration that skeleton is visible — a passive UX improvement with no additional frontend work needed.

marcel commented

2026-05-07 17:49:51 +02:00

Decision Queue

One open decision surfaced across the persona reviews. It is out of scope for this issue but worth deciding before the next migration-test cycle.

Theme: Automated EXPLAIN-based regression testing in `MigrationIntegrationTest`

Raised by: Sara Holt (@saraholt)

Question: Should MigrationIntegrationTest gain seed-data-based query plan assertions that verify EXPLAIN shows an Index Scan (not Seq Scan) after FK-column index migrations?

Why it matters: The issue's verification step 2 ("EXPLAIN shows Index Scan") is currently a manual QA step. The Testcontainers database starts empty, so the query planner legitimately chooses a sequential scan on an empty table regardless of index presence. Automating this requires pre-seeding ≥50 rows of test data before the EXPLAIN call, which adds complexity to MigrationIntegrationTest.

Options:

A (current): pg_indexes assertion confirms the index exists; EXPLAIN verification remains a one-time manual step. Simple, no test flakiness risk.
B: Add a seeded EXPLAIN assertion method that inserts N rows, runs EXPLAIN, and checks for "Index Scan" in the output string. More thorough, but brittle if the planner changes thresholds between Postgres minor versions.

Recommendation from review: Option A is correct for this issue. Revisit Option B as a standalone spike if the team wants automated query-plan regression coverage more broadly.

Owner: Marcel — decide before or during implementation of this issue.

## Decision Queue One open decision surfaced across the persona reviews. It is out of scope for this issue but worth deciding before the next migration-test cycle. --- ### Theme: Automated EXPLAIN-based regression testing in `MigrationIntegrationTest` **Raised by:** Sara Holt (@saraholt) **Question:** Should `MigrationIntegrationTest` gain seed-data-based query plan assertions that verify `EXPLAIN` shows an Index Scan (not Seq Scan) after FK-column index migrations? **Why it matters:** The issue's verification step 2 ("EXPLAIN shows Index Scan") is currently a manual QA step. The Testcontainers database starts empty, so the query planner legitimately chooses a sequential scan on an empty table regardless of index presence. Automating this requires pre-seeding ≥50 rows of test data before the EXPLAIN call, which adds complexity to `MigrationIntegrationTest`. **Options:** - **A (current):** `pg_indexes` assertion confirms the index exists; EXPLAIN verification remains a one-time manual step. Simple, no test flakiness risk. - **B:** Add a seeded EXPLAIN assertion method that inserts N rows, runs `EXPLAIN`, and checks for "Index Scan" in the output string. More thorough, but brittle if the planner changes thresholds between Postgres minor versions. **Recommendation from review:** Option A is correct for this issue. Revisit Option B as a standalone spike if the team wants automated query-plan regression coverage more broadly. **Owner:** Marcel — decide before or during implementation of this issue.

marcel referenced this issue from a commit

2026-05-09 15:09:02 +02:00

fix(db): add indexes on documents.sender_id and document_comments.author_id

marcel referenced a pull request that will close this issue

2026-05-09 15:09:14 +02:00

fix(db): add indexes on documents.sender_id and document_comments.author_id (#470) #491

marcel referenced this issue

2026-05-09 15:11:28 +02:00

fix(db): add indexes on documents.sender_id and document_comments.author_id (#470) #491

marcel referenced this issue

2026-05-09 16:29:36 +02:00

fix(db): add indexes on documents.sender_id and document_comments.author_id (#470) #491