Replace the stale Basic-Auth picture with the post-#523 model:
AuthSessionController + AuthService (the new auth/ package), Spring Session
JDBC (spring_session*, 8h idle timeout, fa_session cookie), and the
ChangeSessionIdAuthenticationStrategy bean used by login to defend against
session fixation. Addresses PR #612 / Markus M3.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Removes the cookie-promotion step (auth_token → Authorization: Basic) and
splits the diagram into three labelled phases: Login, Authenticated
request, Logout. Adds the spring_session DB round-trip on every
authenticated request and the alt branch for an expired session
returning 401 → /login?reason=expired.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mirror the CIS Docker §4.1/§4.6 hardening from docker-compose.yml to the
production/staging compose file, which is standalone (not an overlay).
- Fix cache volume mount path: ocr-cache:/root/.cache → /app/cache (matches
the non-root user's HF_HOME/XDG_CACHE_HOME, avoids PermissionError)
- Add HF_HOME, XDG_CACHE_HOME, TORCH_HOME env vars so HuggingFace, ketos,
and PyTorch all write to the declared writable volumes, not HOME
- Add read_only: true, tmpfs (/tmp:512m), cap_drop: [ALL],
no-new-privileges:true — matching the dev baseline
Also extend DEPLOYMENT.md §8 upgrade notes to cover all three environments
(dev/production/staging), each with its correct project-namespaced volume name.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the decision to use the Sentry SDK with self-hosted GlitchTip,
sendDefaultPii:false rationale, errorId surfacing to users, and alternatives
considered (Sentry SaaS rejected for data-minimisation reasons).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Port 4317 is gRPC; the backend uses HttpExporter (HTTP/1.1) and sends
to port 4318. Update Container description and Rel label to match.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New docs/OBSERVABILITY.md: developer-facing guide with a "where to look
for what" table, common LogQL queries, trace exploration workflow,
log→trace correlation via traceId links, and a signal summary table
- Link from DEPLOYMENT.md §4 (ops section now points to dev guide) and
from CLAUDE.md Infrastructure section
- Fix stale DEPLOYMENT.md env var table: OTEL_EXPORTER_OTLP_ENDPOINT
now documents port 4318 (HTTP) not 4317 (gRPC); add the three new
env vars wired in this PR (OTEL_LOGS_EXPORTER, OTEL_METRICS_EXPORTER,
MANAGEMENT_METRICS_TAGS_APPLICATION) with their rationale
- Fix stale obs-tempo service description (port 4318, not 4317)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The job label (derived from the Docker Compose service name) is what
powers {job="backend"} queries in Loki dashboards and populates the
Grafana "App" variable dropdown. Operators need to know this mapping
when writing custom Loki queries.
Addresses @markus non-blocker suggestion from PR #606 review.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the architectural decision behind the dedicated management
SecurityFilterChain, the discovery that SB4+Jetty removed the isolated
management child-context security, and the consequences for actuator
endpoint exposure.
Addresses @markus blocker from PR #606 review.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace with a cross-reference to DEPLOYMENT.md §4 now that the obs
stack shipped as docker-compose.observability.yml.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- GlitchTip image corrected from glitchtip:v4 to glitchtip:6.1.6 in services table
- Grafana default port corrected from 3001 to 3003 in services table description
- SENTRY_DSN added to backend env vars table (wired in docker-compose.yml and application.yaml)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The live runner config was missing /opt/familienarchiv in valid_volumes
and options, so deploy steps wrote files into the ephemeral job
container rather than the host — silently discarded on exit.
Updated /root/docker/gitea/runner-config.yaml on the server and
restarted gitea-runner. Repo file now matches the server exactly,
including the network: gitea_gitea setting that was previously
only on the server.
DEPLOYMENT.md: clarifies that /opt/familienarchiv does not need to be
in the runner container's own volumes (DooD spawns job containers from
the host daemon directly); updates restart command from systemctl to
docker restart; narrows the cp-r stale-file note to manual ops only
(CI uses rm -rf before copying).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CI uses 'cp -r' which does not remove deleted files. Documents the
manual cleanup step for config files removed from git.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Decision section described an operator-managed /opt/familienarchiv/.env
that CI does not touch. The actual implementation is a two-source model:
obs.env (git-tracked, non-secret config) + obs-secrets.env (CI-written
fresh from Gitea secrets on every deploy). Also updates the Consequences
bullet that incorrectly stated secrets are decoupled from CI.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add the /srv/gitea-workspace prerequisite step to DEPLOYMENT.md §3.1
and a new "Workspace bind-mount setup" subsection plus failure mode 4
to ci-gitea.md, covering the root cause, one-time host setup, disk
management, and troubleshooting for the bind-mount resolution fix
introduced in ADR-015.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the decision to use workdir_parent + identical host<->container
path instead of the overlay2 MergedDir sync that was in the initial fix.
Captures the alternatives (nsenter sync, image-baked configs, path mismatch)
and the operational consequences (prereq directory, out-of-band compose.yaml).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GRAFANA_ADMIN_PASSWORD, GLITCHTIP_SECRET_KEY, and SENTRY_DSN were
referenced in the workflow env files but absent from the secrets table,
leaving the first-run operator without a complete checklist.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds GlitchTip env vars to the observability env var table, extends the
services table, and adds a first-run section with superuser creation and
project setup steps. Updates the C4 L2 container diagram with GlitchTip
and Redis containers and their relationships.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add Grafana row to the observability services table, Grafana access details
(URL, credentials, auto-provisioned datasources, pre-loaded dashboards), and
GRAFANA_ADMIN_PASSWORD to the env vars table in DEPLOYMENT.md.
Update C4 l2-containers.puml: replace placeholder Grafana entry with pinned
image version, expand observability boundary with node_exporter and cadvisor
containers, and add Rel() edges for Grafana → Prometheus, Loki, and Tempo.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the stale "no monitoring infrastructure in place yet" note in
§4 with a brief description of the observability compose file and a
pointer to issue #581 for full docs.
Add a placeholder System_Boundary block for Prometheus + Loki + Grafana
to l2-containers.puml, showing the stack joins archiv-net.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Lines 203, 230, and 332 carried comments that actively encouraged
the regression (they read as if v4 is the canonical target). Replaced
with the correct pinned-at-v3 comment referencing ADR-014.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the three-incident history, the enforcement layers (inline
comments + grep guard + ADR), how to spot the symptom, and the explicit
upgrade trigger (act_runner v4 protocol support OR v3 CVE).
Cross-references ADR-011 (single-tenant Gitea runner) and #557.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Branches gate was blocking CI at 75% measured coverage. The 80% floor
suffers Istanbul parent/child denominator coupling (long-tail grind, per
#496) that makes the remaining gap disproportionately costly to close.
Drop branches to 75 to match current state; leave lines/functions/
statements at 80. ADR-013 documents the rationale and the ratchet rule
for raising the gate back incrementally.
Closes#556
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a second binding invariant section to ADR-012 covering the
duplicate-id mechanism named in #553's follow-up investigation: same
resolved module URL referenced via two distinct vi.mock id strings →
@vitest/browser-playwright leaks an orphan Playwright route → birpc-closed
crash in the next session.
Records the rule (one canonical id per mocked module, prefer the spelling
production uses, no-extension for .svelte rune modules), the in-suite
detector (no-duplicate-mock-ids.test.ts), and the patch-package backport
of vitest PR #10267 with its removal trigger.
Extends the existing Consequences enforcement list from four layers to
six, adding the duplicate-id detector and the patch-package layer.
Refs: #553 · vitest-dev/vitest#9957 · vitest-dev/vitest#10267
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous revision allowed vi.mock for virtual modules on the "consumer
import is static" argument. #553 proved that argument wrong: a statically-
imported module with an async factory body whose dynamic import landed
after teardown still produced the race. The factory body — not the
consumer — is the failure surface.
- Drop the "residual exceptions" table.
- Add the binding invariant: factory bodies under `**/*.svelte.{test,spec}.ts`
must be synchronous (no `await`, no `import(...)`).
- Document the canonical vi.hoisted + getter pattern, with file references.
- Record the $app/stores → $app/state architectural call (Markus's
recommendation), removing one of the last two deprecated-import
outliers.
- Record the preload-data=off hardening (Tobias's recommendation) as a
pattern note.
- Update the Enforcement section to list all four defence layers (ESLint,
CI grep, in-suite meta-test, CI birpc assert) and the coverage-flake-
probe verification workflow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a no-restricted-syntax rule scoped to *.spec.ts / *.test.ts that
flags any vi.mock call whose first argument starts with 'pdfjs-dist'.
Turns the ~2-min CI wait into an immediate lint error on save.
Updates ADR 012 Enforcement section to document the rule.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Explicitly states no lint rule is planned; CI guard is the backstop
(addresses Elicit OQ-001 from PR #536 round 4)
- Adds a "when to revisit" note: extract shared DynamicImportLoader<T>
if 3+ components adopt the libLoader pattern
(addresses Markus Keller round-4 observation on PR #536)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents why vi.mock(module, factory) races with birpc teardown for
dynamically-imported modules, the libLoader injection pattern used to fix
#535, and the residual exceptions ($app/*, $env/*) that are safe to keep
as vi.mock because they are resolved statically before any test runs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers the three failure modes Sara flagged: Caddy stopped (explicit
systemctl error), symlink missing/mis-pointed (silent reload, stale
smoke test), and Docker socket / nsenter unavailable (container error).
Each failure mode includes symptoms and recovery steps.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>