docs(legibility): DOC-5 — write docs/DEPLOYMENT.md #443
Summary

Adds `docs/DEPLOYMENT.md` — the Day-1 operational reference and bootstrap checklist.

What's covered (7 sections)

- Env var reference from `docker-compose.yml` and `application.yaml`, organized by service (backend, PostgreSQL, MinIO, OCR); includes two undocumented gaps: `APP_ADMIN_USERNAME`/`APP_ADMIN_PASSWORD` (not in `.env.example`) and `ALLOWED_PDF_HOSTS`/`BLLA_MODEL_PATH` (not in compose); a Sensitive column for secrets management
- Health checks (`/actuator/health`), `docker compose logs` commands, log locations, Prometheus port note, future monitoring (Phase 7)
- Backups: Manual today (`pg_dump`) vs Planned (Phase 5); avoids implying backups exist
- Common tasks: `reset-db.sh`, `rebuild-frontend.sh`, `download-kraken-models.sh`, mass import; includes caveats for hardcoded values

Design decisions implemented

- Summarise and link to `docs/infrastructure/production-compose.md` for the full Compose file and VPS provisioning — does not duplicate it

Test plan

- Render `docs/DEPLOYMENT.md` in Gitea — verify the Mermaid diagram and all tables render
- Check cross-links: `docs/infrastructure/production-compose.md`, `docs/adr/001-ocr-python-microservice.md`
- Verify env var completeness: `grep -E '\$\{[A-Z_]+\}' docker-compose.yml backend/src/main/resources/application*.yaml | grep -oE '\$\{[A-Z_]+[^}:]*\}' | sort -u`

Closes #399
🤖 Generated with Claude Code
🛠️ Tobias Wendt (@tobiwendt) — DevOps & Platform Engineer
Verdict: 🚫 Changes requested
Good operational doc overall — the structure is solid, the security checklist is actionable, and the "no automated backup yet" honesty is exactly right. But there are two factual errors in the commands that would bite an operator on day one.
Blockers
1. Wrong container name in the backup command (§5)
The backup snippet addresses the database container as `familienarchiv-db-1`. But `docker-compose.yml` sets an explicit `container_name: archive-db`, and Docker Compose only generates the `<project>-<service>-<n>` pattern when there is no explicit `container_name`. The correct command targets `archive-db`. This is the kind of error that costs 10 minutes of debugging during an incident, when you least want it.
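For concreteness, a sketch of the corrected command (the user and database names are the hardcoded values the reviews cite from `reset-db.sh`; adjust to your environment):

```shell
# Sketch only: container name from docker-compose.yml (container_name: archive-db);
# credentials are the values reset-db.sh hardcodes.
docker exec archive-db pg_dump -U archive_user family_archive_db \
  > "backup_$(date +%F).sql"
```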
2. `KRAKEN_MODEL_PATH` missing from the OCR env var table (§2)

The doc lists `BLLA_MODEL_PATH` (the segmentation model) but omits `KRAKEN_MODEL_PATH`, which is the main HTR model path set in `docker-compose.yml` line 92 (`/app/models/german_kurrent.mlmodel`) and read by `main.py` line 32. This is the more operationally critical of the two model paths — it controls which Kurrent model is loaded for every OCR run.

Add a row:

| `KRAKEN_MODEL_PATH` | Path to the Kurrent HTR model loaded at startup | `/app/models/german_kurrent.mlmodel` | — | — |

Concerns (non-blocking but worth fixing)
3. Topology diagram shows `:5173` — ambiguous for a production-oriented doc

The Mermaid diagram shows `HTTP :5173` for the frontend. That's the dev port (SvelteKit Vite dev server). The production overlay changes this to `:3000` (Node adapter). Since this doc is framed as a production operational reference, the diagram should clarify or split: either show the prod topology (`:3000`) or label `:5173` as dev-only.

4. Management port 8081 stated but not configured in `application.yaml`

§4 says "management port 8081 (Spring Actuator / Prometheus scrape) is internal only." The production compose overlay does expose `8081`, but `application.yaml` has no `management.server.port: 8081` entry. Either this is configured in a prod-profile YAML not checked into this repo, or the statement is aspirational. Worth a note clarifying where the port separation is actually configured.

5. `SPRING_PROFILES_ACTIVE` default in the table is `dev,e2e`

The table correctly states the default. But the security checklist (§3) says to set `SPRING_PROFILES_ACTIVE=prod` — this guidance is correct, but it would be helpful to note that this override goes in the prod overlay env, not in `.env` (the `.env` file has no `SPRING_PROFILES_ACTIVE` entry).

6. MinIO service name in log commands

The §4 log command lists `# services: frontend, backend, db, minio, ocr-service`. The actual service names in `docker-compose.yml` are `minio` ✓ and `ocr-service` ✓. Those are correct — no action needed, just confirming I checked.

What's done well

- Summarising and linking to `docs/infrastructure/production-compose.md` rather than duplicating is exactly the right call.
- The `reset-db.sh` caveat about hardcoded values is accurate (`DB_USER=archive_user`, `DB_NAME=family_archive_db` are indeed hardcoded in the script).

🔒 Nora "NullX" Steiner — Application Security Engineer
Verdict: ⚠️ Approved with concerns
The security posture documented here is generally sound. The SSRF warning on `ALLOWED_PDF_HOSTS`, the STARTTLS/SMTP-auth guidance, and the "use a dedicated MinIO service account" instruction are all the right things to say. A couple of items need sharpening before this doc can be trusted as the production security reference.

Concerns
1. Internal inconsistency: `OCR_TRAINING_TOKEN` vs `TRAINING_TOKEN` (§2 vs §3)

The §2 env table (OCR service section) lists the container-level variable as `TRAINING_TOKEN`, while the §3 security checklist refers to `OCR_TRAINING_TOKEN`. These are two different names. `OCR_TRAINING_TOKEN` is the `.env` file variable; `TRAINING_TOKEN` is what the OCR container actually reads (confirmed: `docker-compose.yml` maps `TRAINING_TOKEN: "${OCR_TRAINING_TOKEN:-}"`). The doc mixes the two levels without explaining the mapping. An operator reading §3 might set `TRAINING_TOKEN` directly in `.env` and wonder why it doesn't work.

Recommend: in §3, clarify that `OCR_TRAINING_TOKEN` in `.env` maps to `TRAINING_TOKEN` inside the OCR container. Alternatively, add a note in the table row explaining the indirection.

2. `APP_ADMIN_PASSWORD` default of `admin123` in `application.yaml` is a hardcoded fallback credential (CWE-259)

The doc correctly flags this as "⚠ ships as `admin123`" and tells operators to change it. But `application.yaml` line 67 uses the colon-default placeholder form (`${APP_ADMIN_PASSWORD:admin123}`): if `APP_ADMIN_PASSWORD` is absent from the environment, Spring silently uses `admin123` in production. A missed env var = live admin account with known password. This is a real risk, not just a documentation concern.

The doc should escalate the severity here: this misconfiguration should block deployment if left unchanged. Consider recommending that `application.yaml` be changed to `${APP_ADMIN_PASSWORD}` (no default) so the application fails to start — not silently launches with a weak credential — when the var is missing. That's the "fail closed" principle (CWE-636 is the alternative).

3. No mention of HTTPS-only / HSTS configuration
§3's bootstrap checklist has 8 items, but none of them mention ensuring TLS is enforced via Caddy and HSTS. The production Caddyfile in `docs/infrastructure/production-compose.md` does include `Strict-Transport-Security`, but an operator following only this doc wouldn't know to verify it. Add a checklist item: "Confirm the Caddyfile includes `Strict-Transport-Security: max-age=31536000; includeSubDomains; preload`."

4. MinIO root credential used as S3 credentials in dev compose
The doc notes in §1 (dev vs prod differences): "MinIO service account for `S3_ACCESS_KEY`/`S3_SECRET_KEY`" — good. But looking at `docker-compose.yml` lines 133–134, the dev compose wires `S3_ACCESS_KEY: ${MINIO_ROOT_USER}` — i.e. root credentials. The doc doesn't explain that dev intentionally uses root (acceptable for local dev) but production must use a service account. Adding a one-line note clarifying "dev uses root credentials by design; production must create a dedicated MinIO service account" would close this gap for operators who follow the dev → prod path.

What's done well
- The `ALLOWED_PDF_HOSTS` SSRF warning is correctly framed, and the default value (`minio,localhost,127.0.0.1`) matches what `main.py` actually uses.
- `python3 -c "import secrets; print(secrets.token_hex(32))"` for the training token is the right way to generate a secure random token.
- The note that `SPRING_PROFILES_ACTIVE=prod` is critical — the dev profile exposes Swagger UI and `/v3/api-docs`. Getting this wrong would expose the full API schema externally.
- The reverse-proxy note (Caddy blocks `/actuator/*` externally) is accurate and important.

🏗️ Markus Keller (@mkeller) — Senior Application Architect
Verdict: ✅ Approved
This is a documentation PR — no code changes, no architectural decisions introduced. My review focuses on whether the documented topology accurately represents the actual architecture and whether the ADR references are sound.
What I checked
Topology accuracy (§1): The Mermaid diagram matches the actual service graph in `docker-compose.yml`. The OCR service has `expose: ["8000"]` (no external port — correct). SSE going directly browser → backend without passing through SvelteKit SSR is correct: the SvelteKit frontend acts as a pass-through for the SSE connection, not a mediator. The Caddy → backend at `:8080` and Caddy → frontend at `:5173` are accurately represented (noting Tobias' comment that `:5173` is dev-only is valid — production switches to `:3000`).

ADR references (§7): The Known Limitations table references ADR-001 for the single-node OCR constraint. `docs/adr/001-ocr-python-microservice.md` exists in the repository. The link path `adr/001-ocr-python-microservice.md` (relative from `docs/`) resolves correctly.

Monolith-first framing: The doc correctly identifies that horizontal scaling of OCR would require a job queue, and notes this as a deliberate scope decision rather than a bug. ADR-001 captures the architectural rationale. The Known Limitations table structure — with reason and reference — is the right way to document deliberate constraints.

Domain layering: The mass import documentation in §6 correctly identifies `POST /api/admin/trigger-import` and `GET /api/admin/import-status` — verified against `AdminController.java`. The note that the import runs asynchronously is accurate (`MassImportService` uses `@Async`).

Minor observation

The doc references `docs/infrastructure/production-compose.md` throughout. That document exists and contains the prod overlay. The cross-linking strategy (summarise and link, don't duplicate) is exactly right for operational docs.

No architectural concerns. The blockers flagged by Tobias (wrong container name, missing `KRAKEN_MODEL_PATH`) are operational accuracy issues, not architectural ones.

👨💻 Felix Brandt (@felixbrandt) — Senior Fullstack Developer
Verdict: ✅ Approved
This is a docs-only PR and outside my primary review domain (code quality, TDD, clean code). I read it through the lens of: does this accurately represent how the backend and frontend work, and would a developer following these instructions end up in trouble?
What I verified
`/actuator/health` smoke test (§3): The bootstrap command `curl http://localhost:8080/actuator/health` is correct. The backend health check in `docker-compose.yml` also uses this endpoint: `wget -qO- http://localhost:8080/actuator/health | grep -q UP`. Consistent.

Upload size limits (§7): The Known Limitations table says "50 MB per file (500 MB per request for multi-file)" — verified against `application.yaml`. Accurate.

Mass import endpoint (§6): `POST /api/admin/trigger-import` and `GET /api/admin/import-status` match `AdminController.java`. ✓

`import/` bind mount (§6): Step 1 says "Place the import file in the `import/` bind mount on the backend container." `docker-compose.yml` mounts `- ./import:/import` on the backend. ✓

Spring profile note: The doc states the dev compose sets `SPRING_PROFILES_ACTIVE: dev,e2e`. That's exactly what `docker-compose.yml` line 137 shows. The `dev` profile enables OpenAPI/Swagger (verified in `application-dev.yaml`). Accurate.

One minor naming nit

§4 "Log locations" says: "Backend application log: stdout (captured by Docker). Access inside the container at `/app/logs/` via `docker exec`." I can't confirm from the codebase that logging is also written to `/app/logs/` — the backend logs to stdout by default with Spring Boot's default Logback config, and there's no `logback-spring.xml` with a file appender visible in the repo. This might be aspirational rather than current. Low severity — worth confirming.

Otherwise, nothing that a developer would stumble over. LGTM from a code-accuracy perspective.
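For reference, the limits Felix verified would look like this in `application.yaml` (a sketch using Spring Boot's standard multipart property names; the values are from the review, the exact file layout is assumed):

```yaml
spring:
  servlet:
    multipart:
      max-file-size: 50MB      # per-file limit cited in §7
      max-request-size: 500MB  # per-request limit for multi-file uploads
```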
🧪 Sara Holt (@saraholt) — QA Engineer & Test Strategist
Verdict: ✅ Approved
This is a documentation PR with no testable code — no test strategy is directly applicable. But I reviewed it for testability of the documented operational procedures: can an operator verify they followed the instructions correctly?
What I checked
Smoke test in bootstrap sequence (§3): The sequence ends with the `curl http://localhost:8080/actuator/health` check. This is a concrete, machine-verifiable smoke test. Good. One minor gap: there is no verification that the frontend is serving correctly — the health check only confirms the backend is up. Adding a `curl -s http://localhost:5173 | grep -q "Familienarchiv"` (or similar) as step 7 would give an end-to-end confirmation.

Log commands (§4): All log service names (`frontend`, `backend`, `db`, `minio`, `ocr-service`) are verified against `docker-compose.yml`. Correct.

Backup restore procedure (§5): The restore snippet has the wrong container name (Tobias flagged this — it should be `archive-db`). More importantly: there is no verification step after the restore. A complete operational procedure should include one. Without it, an operator has no way to confirm the restore succeeded beyond "the command didn't error."
Security checklist (§3): All 8 checklist items are concrete and binary (can check/uncheck). This is good test design for a checklist. However, "Confirm `ALLOWED_PDF_HOSTS` is locked to your MinIO/S3 hostname" has no verification command. Adding `docker exec archive-ocr printenv ALLOWED_PDF_HOSTS` as an inline suggestion would close this.

Summary
The doc is honest about what is and isn't automated. The main testability gap is the missing restore verification step in §5 — that's the highest-risk gap because a failed restore with no verification is effectively no backup at all.
📋 Elicit — Requirements Engineer
Verdict: ✅ Approved
Reviewing this as a requirements artifact: does the document fulfill the stated purpose (Day-1 operational reference and bootstrap checklist), is the audience clearly defined, and are all claims testable and complete?
Audience and scope
The stated audience ("operator bringing up a fresh instance, or Successor-X debugging a live incident") is well-defined and consistent throughout. The doc is correctly scoped: it summarises and links rather than duplicating `docs/infrastructure/production-compose.md`. That scope decision is documented explicitly, which is good requirements hygiene.

Completeness gaps
§2 env var table — stated rule not self-enforcing:
The table header says: "Any var found in `docker-compose.yml` or `application*.yaml` that is not in this table is a blocking review comment on any PR that changes those files." This is a requirements-style enforcement claim. The verification command in the test plan (`grep -E ...`) is a good start, but the rule should also be in the PR template or a CI lint step — otherwise it's a statement of intent, not an enforceable requirement. Recommend adding a note that the rule is enforced by convention, not automation, until a lint step is added.

`SPRING_PROFILES_ACTIVE` ownership ambiguity:

The table lists `SPRING_PROFILES_ACTIVE` as a backend env var with default `dev,e2e` (compose). But `.env.example` doesn't include this variable — it's hardcoded in `docker-compose.yml`. An operator reading the table might look for it in `.env` and not find it. The "how to override for production" path is only implicit (set it in the prod overlay). A one-line clarification ("this variable is set directly in the Compose file, not in `.env` — override it in the prod overlay") would eliminate ambiguity.

Requirement testability
The §3 bootstrap checklist items are correctly formulated as verifiable conditions (binary checkbox items). The Known Limitations table in §7 has the right columns (Limitation, Reason, Reference). The linking to ADRs for each limitation is excellent — this closes the "why is this a limitation and not a bug?" question for future readers.
Open question the doc doesn't surface
The doc says to call `POST /api/admin/trigger-import` for mass import. It doesn't specify what authentication the operator must use (it requires the `ADMIN` permission — noted in CLAUDE.md but not here). An operator who doesn't have the admin user set up would get a 403 with no explanation. A one-sentence note ("authenticate as the admin user configured via `APP_ADMIN_USERNAME`/`APP_ADMIN_PASSWORD`") would close this.

Overall
The document meets its stated purpose with good structure and honest scope. The gaps are all clarifications rather than missing sections. Approving with the above suggestions logged for the next iteration.
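If that note is added, it could be paired with a short sketch. The `-u` Basic-auth flag here is an assumption; if the backend uses session-cookie auth instead, the operator must log in first and pass the cookie:

```shell
# Hypothetical invocation; the auth mechanism is an assumption,
# the endpoints are the ones documented in §6.
curl -u "$APP_ADMIN_USERNAME:$APP_ADMIN_PASSWORD" \
  -X POST http://localhost:8080/api/admin/trigger-import
curl -u "$APP_ADMIN_USERNAME:$APP_ADMIN_PASSWORD" \
  http://localhost:8080/api/admin/import-status
```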
🎨 Leonie Voss (@leonievoss) — UI/UX Design Lead
Verdict: ✅ Approved
This PR adds an operational reference document — not UI, not a user-facing component, not a design spec. There is nothing here for me to review in terms of brand compliance, touch targets, contrast ratios, or component structure.
I did a quick scan for anything that could affect the frontend developer experience (since `rebuild-frontend.sh` is documented here), and found nothing that contradicts the established patterns.

One observation for completeness: §3 bootstrap step 6 says "Open the app and log in with the admin credentials from `.env`." The frontend in dev runs on port 5173 (`http://localhost:5173`). It would be helpful for a non-technical operator to have the explicit URL. Minor cosmetic suggestion, not a blocker.
Follow-up commit
Commit `1b61af93` addresses the two blockers:

- `familienarchiv-db-1` → `archive-db` in the §5 backup/restore commands (verified against `docker-compose.yml` line 5: `container_name: archive-db`)
- `KRAKEN_MODEL_PATH` added to the OCR service env vars table — this is the main HTR model path, set in `docker-compose.yml:92` as `/app/models/german_kurrent.mlmodel` and read by `main.py:32`

🔧 Tobias Wendt (DevOps) — Platform Engineer
Verdict: ⚠️ Approved with concerns
This is a solid operational reference. The command-first format, the dev-vs-prod diff table, and the explicit "no automated backup" honesty are exactly the kind of pragmatic documentation a successor needs at 3am. A few things need attention before I'd call this production-ready.
Blockers
1. Bootstrap sequence references a `docker-compose.prod.yml` that may not exist yet

`docs/DEPLOYMENT.md` §3 tells operators to bring the stack up with a `docker-compose.prod.yml` overlay. But the PR description notes that full provisioning lives in `docs/infrastructure/production-compose.md` (a docs file, not a compose file). If `docker-compose.prod.yml` doesn't exist in the repo yet, this command fails on day one — and there's no fallback or callout that it needs to be created first. Either the file exists (and should be referenced or linked directly), or the command needs a "create this first" note.

Action: Verify `docker-compose.prod.yml` exists. If it's the file documented in `docs/infrastructure/production-compose.md`, say so explicitly: "See `docs/infrastructure/production-compose.md` for the prod overlay file you need to create/copy."

Suggestions
2. `minio/minio:latest` flagged but not resolved

The dev-vs-prod table correctly calls out the unpinned MinIO tag in dev. The doc says prod uses "Pinned in prod overlay" — but it doesn't say what the pinned tag is. For a self-contained reference this matters; when the operator creates the prod overlay from scratch, they need a starting point.

Consider: add the recommended stable tag (e.g., `minio/minio:RELEASE.2024-xx-xx`) in a note, or link to the MinIO release page.
The backup snippet handles dev bind mounts (
./data/minio/). In production the doc says named volumes are used —docker cpormc mirroris different from a bind-mount copy. Add a production MinIO backup note, even a placeholder, so the successor isn't surprised.4. OCR model volume path in backup
The OCR models are stored in `/app/models/` (from `KRAKEN_MODEL_PATH`). They're not in the backup section. Models can be re-downloaded via `download-kraken-models.sh`, but a note like "OCR models are reproducible via `download-kraken-models.sh`; no backup needed" removes ambiguity.

5. Security checklist item uses inconsistent env var name
The §3 security checklist refers to the training token by a single name. But the §2 env table names it `APP_OCR_TRAINING_TOKEN` (backend) and `TRAINING_TOKEN` (OCR service). The checklist item should reference both names, or at minimum match the env table's names exactly, to avoid operator confusion.

6. `rebuild-frontend.sh` volume name assumption

The caveat already notes this (line: "If your project directory is not named `familienarchiv`, edit line 16") — good. But I'd add: this is exactly the kind of hardcoding that causes incidents. A future improvement would be to read the volume name from `docker compose config` instead of hardcoding it. Not a blocker for this docs PR, but worth a follow-up issue.

What's done well
- The `mem_limit` guidance is exactly right — this prevents a common "I deployed on CX22 and OCR crashed" incident.
- Blocking `/actuator/*` externally is documented and explained. Future operators will know why.

👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ✅ Approved
This is a documentation-only PR — no production code, no tests, no API changes. My review focuses on whether the doc serves its audience accurately and doesn't create confusion for developers who will follow it.
Observations (not blockers)
1. `generate:api` workflow not mentioned in common tasks

§6 covers `reset-db.sh`, `rebuild-frontend.sh`, `download-kraken-models.sh`, and mass import. Missing from this list: `npm run generate:api` after any backend model/endpoint change. This is the most common source of TypeScript type errors per CLAUDE.md — a "Common operational tasks" section is exactly where a developer running the app locally would look for it.

Not a blocker for a deployment reference doc (which is ops-focused), but worth a note in a follow-up or in a separate dev-runbook issue.
2. Mass import API paths should be verified
§6 documents `POST /api/admin/trigger-import` and `GET /api/admin/import-status`. I haven't verified these exact paths against the current codebase. If they are stale (e.g., the path is `/api/importing/...` or similar), an operator following this doc will get a 404. The test plan checklist in the PR description doesn't include an API path verification step.

Suggestion: Add to the PR test plan: "Verify the `/api/admin/trigger-import` and `/api/admin/import-status` endpoints exist in the current backend."

3. Spring profiles note is accurate

The doc correctly distinguishes `dev,e2e` (exposes Swagger UI + `/v3/api-docs`) from `prod`. This matches CLAUDE.md and the `application-dev.yaml` pattern. Good.

4. `docker-compose.ci.yml` warning is valuable

This is a real footgun — including it prevents the "why isn't my bind mount working?" debugging session. Appreciated.
Summary
Clean documentation, accurate to the stack I know. The mass import API path is the only thing I'd verify before merge. Everything else is correct and genuinely useful for a developer picking up the project.
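One quick way to run that verification (a sketch; the source directory layout is an assumption, and the pattern only matches the path fragments, not full Spring mappings):

```shell
# Prints matching controller lines, or a warning when nothing is found.
grep -rnE "trigger-import|import-status" backend/src/main/java/ 2>/dev/null \
  || echo "endpoints not found - doc may be stale"
```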
🔒 Nora "NullX" Steiner — Application Security Engineer
Verdict: ⚠️ Approved with concerns
This document does a lot of things right from a security standpoint. It surfaces default credentials, documents SSRF protection, and calls out the service-account requirement for MinIO. A few items need tightening before this ships as the canonical operational reference.
Blockers
1. `APP_ADMIN_PASSWORD` default `admin123` is documented but the fail-safe is not enforced

The env table documents the `admin123` default, and the security checklist tells operators to change it. This is good documentation. But the security posture of documentation-only prevention for a default credential is weak. The doc should also call out explicitly that the application does not prevent startup with the default password — so an operator who skips the checklist has an exposed admin account with no warning. A future hardening issue should enforce either: startup failure when `APP_ADMIN_PASSWORD == "admin123"` in a non-dev profile, or a forced password change on first login.

Action: Add a note to the checklist item: "Note: the application does not currently prevent startup with the default password in the prod profile — this is tracked as a known limitation / future hardening item." This sets expectations honestly and prevents operators from thinking the system will protect them if they miss the step.
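For the record, the fail-closed variant recommended in the earlier review would look roughly like this (the `app.admin.password` property path is an assumption; match whatever key `application.yaml` actually uses):

```yaml
app:
  admin:
    # No colon-default: Spring fails startup when APP_ADMIN_PASSWORD is unset,
    # instead of silently falling back to admin123.
    password: ${APP_ADMIN_PASSWORD}
```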
2. `ALLOWED_PDF_HOSTS` SSRF guidance is good but incomplete on what "widening" means

The env table warns against widening the allowlist, and the bootstrap checklist tells operators to confirm it is locked down. But the document should be more explicit about the threat model, since the SSRF surface is subtle: if an operator sets `ALLOWED_PDF_HOSTS=*` because they're "just testing with a public URL," they've opened the internal Docker network to SSRF from any user who can trigger an OCR job. A one-liner explaining that real consequence would help.

Suggestions
3. MinIO service account guidance is correct but asymmetric

The §2 backend env table and the §3 security checklist both call for a dedicated MinIO service account. Good. But there's no guidance on what permissions the service account should have. "Service account" without a least-privilege scope could still result in an account with bucket-admin rights. Add: "The service account needs `s3:GetObject`, `s3:PutObject`, `s3:DeleteObject` on the archive bucket only — no bucket creation, no policy management."

4. `APP_OCR_TRAINING_TOKEN` generation command in checklist — good, but mismatch

The checklist provides a secure token generation command. But the checklist item says `OCR_TRAINING_TOKEN`, while the env table names it `APP_OCR_TRAINING_TOKEN` (backend) and `TRAINING_TOKEN` (OCR service). This naming inconsistency is a security smell — an operator might set the wrong variable and leave the training endpoints unprotected. This echoes Tobias's concern; from a security perspective it's more serious because the training endpoint accepts file uploads.

5. `SPRING_PROFILES_ACTIVE=prod` prevents OpenAPI/Swagger UI exposure — this is correctly documented

The note "that exposes Swagger UI and `/v3/api-docs`" gives the right reason. No issue here.

6. `MAIL_STARTTLS_ENABLE` and `MAIL_SMTP_AUTH` — cleartext email risk not called out

The env table defaults `MAIL_STARTTLS_ENABLE=false` and `MAIL_SMTP_AUTH=false` in dev. Both should be `true` in production, and the table already says `YES (prod)`. But there's no warning that leaving these as `false` in production means credentials and password-reset links travel in cleartext. Add a note: "Leaving STARTTLS disabled in production sends password-reset links in cleartext over SMTP — always enable it in prod."

What's done well

- The `ALLOWED_PDF_HOSTS` SSRF risk is documented, explained, and in the security checklist. Many teams miss this.
- `python3 -c "import secrets; print(secrets.token_hex(32))"` is the right way to generate a secure random token.
- Blocking `/actuator/*` externally is explicitly documented with the reason (heapdump contains passwords/tokens). This is exactly the right threat model explanation.

🧪 Sara Holt — Senior QA Engineer
Verdict: ✅ Approved
This is a documentation-only PR with no test code changes, no new API surface, and no behavior changes. My review focuses on testability of the claims made in the document and whether the test plan in the PR description is sufficient.
Observations
1. PR test plan is partially verifiable but missing key steps
The PR test plan includes:

- Render `docs/DEPLOYMENT.md` in Gitea — verify Mermaid and tables render ✓
- The env var `grep` ✓

The `grep` command for env var verification is a good automated check. But the plan omits:

- API path verification (`POST /api/admin/trigger-import`, `GET /api/admin/import-status`) — Felix raised this too; a dead link in the common tasks section is a real QA failure
- Script existence: `reset-db.sh`, `rebuild-frontend.sh`, `download-kraken-models.sh` should be confirmed to exist at `scripts/` in the repo

These aren't blocking the merge of a docs PR, but I'd recommend completing them as part of the test plan execution before marking this done.
2. The actuator/health smoke test command in §3 is correct and testable
This is exactly the kind of automated smoke test that should be in CI post-deploy. The fact that it's documented here is good — but it would be better if this command was also in a CI smoke test step (not in scope for this PR, but worth a follow-up issue).
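A minimal sketch of that CI step (the `"status":"UP"` grep assumes Actuator's default JSON shape; the endpoint is the one documented in §3):

```shell
# Exits non-zero when the backend is unreachable or not reporting UP.
curl -fsS http://localhost:8080/actuator/health | grep -q '"status":"UP"'
```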
3. Backup/restore procedure is documented but untested
§5 gives the manual `pg_dump` backup and `psql` restore commands.
These commands look correct. However, per Tobias's philosophy: a backup without a tested restore is a hope, not a backup. The PR description acknowledges no automation exists yet (Phase 5 of Production v1 milestone). I'd flag this as a follow-up: when Phase 5 ships, the backup procedure should include a CI-scheduled monthly restore test, not just documentation.
Not a blocker here — the doc is honest about the manual-only state.
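A sketch of what a verified restore could look like. The container, user, and database names are the values the reviews cite; the `documents` table is purely illustrative, so substitute a table you know holds data:

```shell
# Restore, then confirm rows actually landed.
docker exec -i archive-db psql -U archive_user -d family_archive_db < backup.sql
docker exec archive-db psql -U archive_user -d family_archive_db -t \
  -c "SELECT count(*) FROM documents;"   # hypothetical table; expect non-zero
```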
4. `reset-db.sh` caveat is correctly scoped

This matches the E2E test reset pattern and is correctly distinguished from a full migration re-run. Good.
Summary
The document's testable claims (commands, API paths, env var completeness) are mostly verifiable. The test plan execution should include the two missing items before closing #399. No test code is affected by this PR.
🎨 Leonie Voss — UI/UX Designer & Accessibility Strategist
Verdict: ✅ Approved
This PR adds an operational reference document — no UI, no frontend, no component changes. It's outside my primary domain. I'll give it a brief pass from a documentation usability and readability perspective, since docs are a product too.
Observations
1. Document structure and scannability are good
The emergency-first anchor at the top of the document is excellent UX for a runbook. Operators under stress need the shortest path to the most critical action. This is the right instinct.
The numbered ToC, section headers, and command-first format (commands before explanations in §6) are all consistent with how operators actually use runbooks: they scan, find the command, run it, read the context if it fails.
2. Mermaid diagram aids topology comprehension
The Mermaid graph gives a visual mental model before the operator reads the detail. For a non-specialist successor trying to understand the system, this is more valuable than a bullet list of services.
3. One readability concern: the security checklist items mix two audiences
The security checklist in §3 starts with operator actions ("Set `APP_ADMIN_PASSWORD`...") then mixes in dev-specific clarifications ("not in `.env.example`"). An operator bootstrapping production doesn't need the dev parenthetical — it may confuse them about which step applies to them.

This is cosmetic, not a blocker. Consider separating dev notes from production operator instructions with a consistent visual treatment (e.g., a "💡 Dev note:" callout vs. the main checklist items).

4. Known limitations table is well-structured
The table format with Reason and Reference columns is exactly the right way to document intentional design constraints. It prevents future operators from "fixing" things that are deliberate. The ADR back-references are the right mechanism.
Summary
From a documentation-as-product perspective: clear structure, correct audience framing, good emergency-first orientation. Cosmetic note on the security checklist mixing two audience concerns. No UI impact. Approved.
🏗️ Markus Keller — Application Architect
Verdict: ✅ Approved
This is the deployment reference the project has needed since the OCR service was extracted. It correctly describes the current topology, is honest about what doesn't exist yet (backup automation, monitoring infrastructure), and links to rather than duplicates the production Compose details. A few architecture-level observations.
Observations
1. Topology diagram accurately reflects the deliberate service extraction
The Mermaid graph shows the OCR service correctly: internal-only, no external port, accessed via presigned URL to MinIO. This matches ADR-001's justification (different memory profile, different deployment cadence) and doesn't suggest further extraction. Good.
2. "No multi-tenancy" known limitation is correctly framed as deliberate scope, not oversight
This is the right framing. Documenting this explicitly prevents a future contributor from adding per-user document isolation "as an improvement" when it would break the intentional design. The ADR back-reference pattern should be applied here too — when the family-only scope ADR is written (if it isn't already), link it from this row.
3. SSE note is architecturally accurate
This is the correct topology for SSE in this stack (SvelteKit SSR can't relay a persistent SSE connection efficiently). Documenting it prevents the "why doesn't the frontend proxy the SSE?" question.
4. Management port separation (8080 app, 8081 actuator/Prometheus) is documented
This is the correct architecture: separate ports, separate exposure rules. Documented here so future contributors don't collapse them.
5. The "no message broker" implicit decision is good
By not mentioning RabbitMQ, Kafka, or Redis anywhere in the topology, the doc implicitly confirms: async work goes through PostgreSQL + Spring's `@Async`. For current volume (family archive, single-family use), this is exactly right. The absence of infrastructure noise is itself an architectural statement.

One question (not a blocker)
The doc references `docs/adr/001-ocr-python-microservice.md` in the Known Limitations table. The PR description mentions this file but doesn't include it in the diff. Is ADR-001 an existing file, or is this a forward reference to a document that doesn't exist yet? If it doesn't exist, the link is a dead reference that will confuse operators.

Same question for `docs/infrastructure/production-compose.md` — the Tobias review raises this more operationally. From my perspective: if these linked documents don't exist yet, this DEPLOYMENT.md is incomplete as a navigation hub. The "Summarise and link" design decision (D1) only works if the targets of the links exist.

Suggested note: Add a small callout if any linked document is aspirational/upcoming: "Note: `docs/infrastructure/production-compose.md` is created as part of the Production v1 milestone Phase N."

Summary

Architecture is correctly described, scope decisions are explicitly documented, and the single-node OCR constraint has an ADR reference. The document's value as a topology reference is high. Pending verification that linked documents (`ADR-001`, `production-compose.md`) actually exist.

📋 Elicit — Requirements Engineer
Verdict: ⚠️ Approved with concerns
Reviewing `docs/DEPLOYMENT.md` through a requirements lens: does this document deliver on what issue #399 asked for, are the claims testable, and are there gaps that would cause an operator to fail at their goal?

Against Issue #399 Goal
The PR closes #399. Without reading the original issue text directly, the PR description defines scope as: "Day-1 checklist and operational reference." The document covers 7 stated sections and matches its own scope definition well. The design decisions (D1: summarise-and-link, D2: OCR sizing, D3: README link relaxed) are all documented as implemented. The acceptance criteria in the test plan are concrete and executable.
Coverage assessment: complete against stated scope.
Gaps and Ambiguities
1. "Audience" definition creates an implicit NFR that isn't enforced
The document states:
"Successor-X" implies someone who may have no prior context — they've inherited the system. This is a high-bar audience. But the document links to `docs/infrastructure/production-compose.md` and `docs/adr/001-ocr-python-microservice.md` without confirming these exist or are accessible. For Successor-X, a broken link on step 1 of bootstrap is a critical blocker.

Requirement gap: The implied NFR is "all linked documents SHALL be accessible at the time this document is published." This isn't verified in the test plan.
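That accessibility NFR is mechanically checkable. A minimal sketch, assuming the two link targets named in the PR description (adjust for the real repo layout):

```shell
# Sketch: verify that DEPLOYMENT.md's internal links point at files
# that actually exist. The paths passed in are assumptions from the PR.
check_links() {
  missing=0
  for f in "$@"; do
    if [ -f "$f" ]; then
      echo "OK   $f"
    else
      echo "MISS $f"
      missing=$((missing + 1))
    fi
  done
  return "$missing"   # non-zero when any link target is absent
}

# Typical use from the repo root:
#   check_links docs/infrastructure/production-compose.md \
#               docs/adr/001-ocr-python-microservice.md
```

Wired into CI, this turns "links SHALL resolve" into an enforced check rather than a review-time hope.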
Action: Add to test plan: "Confirm all internal document links resolve to existing files: `infrastructure/production-compose.md`, `adr/001-ocr-python-microservice.md`."

2. "Ownership" claim is untestable as written
This is a maintenance requirement embedded in prose, which means it will be followed inconsistently. It has no enforcement mechanism. An EARS phrasing would make it testable:
As a requirements engineer, I'd recommend adding this as a PR checklist item in `.gitea/ISSUE_TEMPLATE/` or the contributing guide — not just in the document itself.

3. Phase references are used but the phase numbering isn't stable
The document references:
If phase numbering changes in the milestone (phases reordered, merged, or renamed), these references become stale. The document would be better served by linking to specific Gitea milestone issues or using feature names rather than phase numbers:
rather than "phase 5" which is an internal planning artifact, not a stable reference.
4. "Common operational tasks" §6 is missing a trigger for the OCR model update case
The document covers: reset DB, rebuild frontend, download models, mass import. But there's no guidance for: what to do when an OCR model is updated/replaced. The `download-kraken-models.sh` note says "Run once after a fresh clone or when models are updated" — but what's the operator workflow for the "models are updated" case? Is there a model version file? Does the service need to be restarted? Is there a zero-downtime path?

This may be out of scope for this PR, but it's a gap in the operational reference. I'd recommend a follow-up issue: "Document OCR model update procedure in DEPLOYMENT.md."
5. "Last reviewed" header creates a review obligation without a tracking mechanism
This is a good practice. However, without a Gitea issue template or milestone checklist item that says "review DEPLOYMENT.md," this date will drift. The date itself is useful; the mechanism to keep it current is absent.
What's done well
- The `grep` command for env var completeness is a practical, executable acceptance criterion — the kind of thing I'd write into a formal acceptance test.
- The "not in `.env.example`" annotations in the env table are gap-flagging done correctly inline.

Overall
The document achieves its stated purpose. The concerns are about maintenance durability (phase references, stale links, ownership enforcement) rather than correctness of the current content. I'd address the linked-documents verification gap before merge and file follow-up issues for the phase reference and OCR update procedure gaps.
🏗️ Markus Keller — Senior Application Architect
Verdict: ✅ Approved
Documentation PR — I reviewed for architecture accuracy and ADR traceability.
Topology correctness
The Mermaid diagram matches the actual system exactly:
- `/actuator/*` not exposed externally ✅

ADR back-references
The known limitations table links ADR-001 for the single-node OCR constraint. That's exactly the right pattern — a limitation without an ADR ref looks like an oversight; with one, it's a documented decision. The "deliberate scope decision" designations for no-multi-tenancy and no-multi-region are clean: they tell a future operator these were choices, not omissions.
Scope separation
Cross-referencing `docs/infrastructure/production-compose.md` rather than duplicating prod Compose configuration is correct. This document's job is to orient the operator; the infrastructure docs contain the full config. No duplication.
Calling out bind mounts (dev) vs named volumes (prod), unpinned MinIO tag (dev) vs pinned (prod), and `dev,e2e` Spring profile (dev) vs `prod` — these are the exact architecture-level differences a new operator needs to understand before touching production. Good artifact.

No architectural concerns. Ship it.
👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ✅ Approved
Pure documentation PR — no production code changed. Checking for developer usability.
Commands and copy-paste quality
All code blocks are formatted correctly and copy-pasteable. The bootstrap sequence steps are numbered and sequentially logical. The `curl http://localhost:8080/actuator/health` smoke test is specific — it tells you exactly what a healthy response looks like (`{"status":"UP"}`), which is more useful than just "check the logs."

Caveats are accurate
- `reset-db.sh` caveat about hardcoded `DB_USER` and `DB_NAME` — correct, the script does hardcode these.
- `rebuild-frontend.sh` caveat about the volume name depending on the project directory name — correct, the script uses `familienarchiv_frontend_node_modules`.

These are the kind of foot-guns that waste an hour of developer time if undocumented.
Mass import workflow
The import trigger steps (place file → `POST /api/admin/trigger-import` → poll `GET /api/admin/import-status`) match the actual `AdminController.java` endpoints at `/api/admin`. Verified.

No concerns from my side.
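The trigger-and-poll flow can be sketched in a few lines of shell. The endpoint paths are the ones verified above; the `RUNNING` status marker and the base URL are assumptions — check the actual `AdminController.java` response shape before relying on this:

```shell
# Trigger the mass import, then poll until the status payload no longer
# reports a running import. RUNNING is an assumed marker in the payload.
BASE=${BASE:-http://localhost:8080}

import_running() {
  case "$1" in
    *RUNNING*) return 0 ;;   # still in flight
    *)         return 1 ;;   # finished, failed, or idle
  esac
}

trigger_and_wait() {
  curl -fsS -X POST "$BASE/api/admin/trigger-import" || return 1
  while import_running "$(curl -fsS "$BASE/api/admin/import-status")"; do
    sleep 5
  done
}
```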
🔧 Tobias Wendt — DevOps & Platform Engineer
Verdict: ⚠️ Approved with concerns
This is the doc I'd want to have on Day 1 of running this stack. Most of it is solid. A few non-blocking notes.
What's right
- `docker-compose.prod.yml` now correctly noted as not committed, with the guide reference
- `SPRING_PROFILES_ACTIVE=prod` reminder prevents Swagger UI exposure on prod

Suggestions (non-blocking)
1. Smoke test port clarification
Step 5 in the bootstrap sequence:
This works when run directly on the server shell (bypasses Caddy), but an operator who reads this from their laptop may be confused about why they're hitting `:8080` directly instead of going through the proxy. A one-line comment helps.

2. `pg_dump` command uses shell variables that aren't exported by default

A freshly SSH'd shell won't have `POSTGRES_USER` and `POSTGRES_DB` exported from `.env`. In practice the operator will need to `source .env` first, or use the literal values. Consider showing the literal defaults with a substitution hint.

Neither of these is a blocker — the doc is accurate. Just friction for a first-time operator.
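The substitution hint could look like this (a sketch: the `archive-db` container name comes from `docker-compose.yml`, while the literal `archive` fallbacks are illustrative placeholders, not the project's real values):

```shell
# Build the backup command, falling back to literal placeholders when
# the compose variables aren't exported in the current shell.
# NOTE: the "archive" fallbacks are assumptions; check your .env.
backup_cmd() {
  echo "docker exec archive-db pg_dump -U ${POSTGRES_USER:-archive} ${POSTGRES_DB:-archive}"
}

# On the server, export the .env values first so the real names win:
#   set -a; . ./.env; set +a
#   $(backup_cmd) > backup-$(date +%F).sql
```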
🔐 Nora "NullX" Steiner — Application Security Engineer
Verdict: ✅ Approved
The security-critical naming inconsistency from the first review round has been correctly fixed in this latest commit.
Checklist coverage
- `APP_OCR_TRAINING_TOKEN` (backend) and `TRAINING_TOKEN` (OCR service) now both listed, with the critical note that both must be the same value
- `ALLOWED_PDF_HOSTS` warning is explicit — "Do not widen to `*`"
- Default credentials `admin123`, `change-me` for Postgres and MinIO are flagged
- `SPRING_PROFILES_ACTIVE=prod` reminder prevents Swagger UI and `/v3/api-docs` exposure in prod
- No external exposure of `/actuator/*` — this is the correct defense pattern

Non-blocking observation
The `APP_ADMIN_USERNAME` env var is marked "⚠ not in `.env.example`" in the env table. This means a first-time operator may not know to set it, leaving the admin account at the default `admin` username. A predictable admin username combined with a weak password (before the checklist is run) is a weak point. This is a pre-existing gap outside the scope of this documentation PR — but worth tracking as a separate issue: add `APP_ADMIN_USERNAME=changeme` as a placeholder line in `.env.example` so operators are prompted to change it at env setup time.

No blockers on this PR. The security checklist is now accurate and complete.
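The suggested fix could be as small as two placeholder lines in `.env.example`. The values below are illustrative and must be changed at setup; `APP_ADMIN_PASSWORD` is the companion var the env table already flags as missing:

```shell
# Proposed .env.example additions: placeholders only, never real values
APP_ADMIN_USERNAME=changeme
APP_ADMIN_PASSWORD=changeme
```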
🧪 Sara Holt — Senior QA Engineer
Verdict: ✅ Approved
Documentation PR — test coverage doesn't apply directly. Reviewing the document for testability of its own claims.
Test plan in PR description
The three test plan items are appropriate for a documentation PR:
The grep command (`grep -E '\$\{[A-Z_]+\}' docker-compose.yml backend/src/main/resources/application*.yaml`) is a concrete, reproducible verification step. This is good documentation quality practice — the PR description tells you how to verify it, not just that it was verified.

Smoke test quality
The bootstrap smoke test (`curl http://localhost:8080/actuator/health`, expected: `{"status":"UP"}`) is specific and verifiable. This is better than "check the logs and see if it looks healthy."

Backup procedure
The backup and restore commands in §5 are concrete enough to test — they would fail predictably if the container name or database name was wrong. The restore command correctly shows input redirection rather than a copy-paste-difficult alternative.
No QA concerns. The doc gives operators concrete, verifiable verification steps rather than hand-wavy instructions.
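For completeness, the smoke test also wraps neatly into a scriptable pass/fail check. A sketch: the endpoint and the `{"status":"UP"}` body are the ones the doc states; the wrapper itself is illustrative:

```shell
# Scriptable version of the bootstrap smoke test: fetch the actuator
# health JSON and assert it reports UP.
is_up() {
  printf '%s' "$1" | grep -q '"status":"UP"'
}

# Run on the server shell; :8080 bypasses the reverse proxy.
health_json=$(curl -fsS http://localhost:8080/actuator/health) || health_json=""
if is_up "$health_json"; then
  echo "backend healthy"
else
  echo "backend NOT healthy (got: ${health_json:-no response})" >&2
fi
```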
📋 Elicit — Requirements Engineer
Verdict: ✅ Approved
Reviewing for documentation requirements quality: audience clarity, scope definition, completeness, and traceability.
Audience and ownership
Completeness governance
The statement in §2 — "Any var found in docker-compose.yml or application*.yaml that is not in this table is a blocking review comment" — creates a concrete, auditable completeness contract. This is good requirements practice: the document defines its own verification criterion.
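That self-defined criterion is executable today: it is the grep pipeline from the PR description, wrapped here as a function for convenience (the wrapper is illustrative; the regexes are the PR's own):

```shell
# List every ${VAR} placeholder in the given config files, deduplicated,
# for manual diffing against the env table in DEPLOYMENT.md.
list_env_vars() {
  grep -E '\$\{[A-Z_]+\}' "$@" | grep -oE '\$\{[A-Z_]+[^}:]*\}' | sort -u
}

# Typical use:
#   list_env_vars docker-compose.yml backend/src/main/resources/application*.yaml
```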
State vs planned distinction
The backup section correctly distinguishes current state ("no automated backup") from planned state ("Phase 5 of Production v1 milestone"). This prevents an operator from assuming automated backups exist when they don't. ✅
Traceability
The known limitations table links ADR-001 for single-node OCR — limitation → architectural decision record. ✅
Suggestion (non-blocking)
The Phase 5 and Phase 7 references ("planned in the Production v1 milestone phase 5", "Phase 7 adds Prometheus + Loki + Grafana") assume the reader knows the milestone structure. If those phases are completed in the future, a future operator has no way to know from this document whether those items are still "planned" or have shipped. Consider linking to the Gitea milestone or epic so the reader can check current status, or note that these references should be updated when the phases complete.
🎨 Leonie Voss — UX Designer & Accessibility Strategist
Verdict: ✅ Approved
This is a Markdown documentation file — no Svelte components, no UI, no accessibility implementation to review. Checking document readability as a UX concern.
Document UX
- The `> ⚠️` blockquote convention for warnings is consistent with the project's documentation style ✅

The document is well-structured and scannable. The jump link at the top and the numbered table of contents both support non-linear navigation, which is exactly what an operator under pressure needs.
No UI/UX concerns on this documentation PR.