familienarchiv

Author	SHA1	Message	Date
Marcel	57ffb7d751	chore(ocr): lower OCR_MAX_CACHED_MODELS to 2 with memory budget comment Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 20:20:53 +02:00
Marcel	64d27d6d61	feat(ocr): per-sender model registry and /train-sender endpoint engines/kraken.py: - Add _SenderModelRegistry with LRU eviction (max configurable via OCR_MAX_CACHED_MODELS env var), double-checked locking, invalidate(), and path whitelist (/app/models/ only) - Add _load_sender_model() helper for testability - extract_page_blocks() and extract_region_text() accept optional sender_model_path; route to sender registry when provided models.py: - OcrRequest gains senderModelPath: str | None = None field main.py: - /ocr and /ocr/stream pass request.senderModelPath to Kraken engine - New /train-sender endpoint: validates output_model_path, runs ketos train with base model as starting point, invalidates sender cache docker-compose.yml: - Add OCR_MAX_CACHED_MODELS: "5" to ocr-service environment test_sender_registry.py: - 4 tests: cache hit, LRU eviction, invalidate, path traversal guard Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 18:05:39 +02:00
Marcel	615d404ba9	chore(ocr): add opencv-python-headless, libglib2.0-0, and CLAHE env vars Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 14:14:47 +02:00
Marcel	57c44cf02f	devops(backend): reduce healthcheck start_period to 30s With a pre-built JAR, Spring Boot + Flyway starts in ~15 seconds. The previous 60s was sized for runtime compilation (90+ seconds). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 11:33:03 +02:00
Marcel	3c46d820ad	devops(backend): switch to multi-stage Docker build Replace runtime mvn spring-boot:run with a proper multi-stage build: - Stage 1 (builder): compiles JAR with BuildKit cache mount for ~/.m2 - Stage 2 (runtime): eclipse-temurin:21-jre with only the JAR Removes the backend source volume mount and maven_cache named volume. Deploy with: docker compose up -d --build Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 11:33:03 +02:00
Marcel	e4719b9487	fix(deploy): increase OCR healthcheck start_period, comment ocr_cache volume, add token hint - start_period 60s → 120s: Zenodo download on cold start can exceed 60s on slow connections - ocr_cache volume comment: documents what the cache stores for future operators - .env.example: add token generation command to prevent weak placeholder in production Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 21:17:53 +02:00
Marcel	f08897b801	fix(deploy): wire OCR training token to backend and raise container memory limit - Pass OCR_TRAINING_TOKEN through to the backend container as APP_OCR_TRAINING_TOKEN so RestClientOcrClient sends the X-Training-Token header when calling /train and /segtrain. - Raise mem_limit/memswap_limit from 8g to 12g to give segtrain headroom on hosts with more available RAM. - Uncomment OCR_TRAINING_TOKEN in .env.example — it is now required. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 21:17:53 +02:00
Marcel	287920a982	docs(ocr): document single-node constraint for OCR training Training reloads the Kraken model in-process on the Python service. The DB-level RUNNING constraint prevents concurrent API calls but cannot protect against multi-replica deployments. Added explicit comments in docker-compose.yml and OcrTrainingService to prevent accidental horizontal scaling. See ADR-001. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-13 23:01:45 +02:00
Marcel	bc97a2dade	feat(ocr): add /train endpoint to OCR service and OcrClient.trainModel() - POST /train in ocr-service with ZIP Slip validation, TemporaryDirectory, ketos transfer learning, timestamped backups (keep last 3), in-process reload - X-Training-Token auth (no-op in dev when TRAINING_TOKEN env is empty) - trainModel() in OcrClient interface + RestClientOcrClient (10-min timeout, multipart upload, forwards X-Training-Token when configured) - TRAINING_TOKEN env var wired in docker-compose; --workers 2 in Dockerfile so /health stays responsive during synchronous training Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-13 14:40:53 +02:00
Marcel	0beaf351f0	fix(docker): soften ocr-service dependency and clean up compose Changed ocr-service dependency from service_healthy to service_started since the backend already handles OCR unavailability gracefully. Removed unused APP_S3_INTERNAL_URL env var. Added expose directive and .dockerignore for ocr-service. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 12:29:21 +02:00
Marcel	0b0d4a7d5e	perf(ocr): double batch sizes (detector=8, recognition=16) 4GB headroom in the container. Doubling batches should use ~2GB more RAM but significantly speed up inference. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 23:23:13 +02:00
Marcel	1b7540143e	fix(ocr): persist model cache across container restarts Surya downloads models from HuggingFace to /root/.cache on first use. Without a volume, every container restart re-downloads ~73MB+. Added ocr_cache volume to persist the cache. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 23:21:51 +02:00
Marcel	2cc7dcd5e3	perf(ocr): increase batch sizes (detector=4, recognition=8) 5GB free on host during OCR, container at 3.8/8GB. Larger batches use more memory but process faster on CPU. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 23:19:22 +02:00
Marcel	741979304c	fix(ocr): increase to 8g mem_limit and larger batch sizes 5GB free on host while OCR runs; give the container more room. Bump batch sizes (detector=2, recognition=4) so it processes faster with the available memory. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 22:35:34 +02:00
Marcel	e9cf2998fe	fix(ocr): reduce mem_limit to 4g, allow 4g swap for 16GB dev machines mem_limit 4g keeps more RAM free for the host. memswap_limit 8g (= 4g swap) lets peaks spill to disk instead of OOM-killing. Slower during peak inference but won't starve the dev machine. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 22:33:05 +02:00
Marcel	902d423f3c	fix(ocr): reduce memory usage for 16GB dev machines - Surya models lazy-load on first OCR request instead of at startup (saves ~3-4GB idle RAM; Kraken stays eager at ~16MB) - Process one page at a time in Surya engine (limits peak memory) - RECOGNITION_BATCH_SIZE=1, DETECTOR_BATCH_SIZE=1 (slower but fits in RAM) - Revert mem_limit back to 6GB (sufficient with these optimizations) - Render DPI stays at 200 Idle memory: ~2GB (Kraken only). Peak during OCR: ~5-6GB (Surya loaded). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 22:26:50 +02:00
Marcel	7f78bc9cf4	fix(ocr): increase memory limit to 10GB, reduce render DPI to 200 Surya 0.17 models use ~5GB idle. At 300 DPI on a multi-page PDF, page images + inference tensors push past the 6GB limit, causing OOM kills during 'Detecting bboxes'. Increased to 10GB and reduced render DPI to 200 (still sufficient for OCR; page images need ~56% less memory, since pixel count scales with DPI squared: (200/300)² ≈ 0.44). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 22:20:36 +02:00
Marcel	f064b27439	feat(ocr): per-script-type confidence thresholds Kurrent OCR produces much lower confidence than typewriter/Latin. Separate thresholds allow aggressive filtering for Kurrent (0.5) while keeping typewriter lenient (0.3). - OCR_CONFIDENCE_THRESHOLD: default for Surya paths (0.3) - OCR_CONFIDENCE_THRESHOLD_KURRENT: Kraken Kurrent path (0.5) - apply_confidence_markers() now accepts threshold parameter - get_threshold(script_type) selects the right threshold Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 20:50:59 +02:00
Marcel	c74539b04b	feat(ocr): auto-insert [unleserlich] markers for low-confidence words New confidence.py module with two functions: - apply_confidence_markers(): replaces words below threshold with [unleserlich], collapses adjacent markers into one - words_from_characters(): reconstructs word-level confidence from Kraken's character-level data Surya 0.17 provides native word-level confidence via line.words. Kraken 7.0 provides per-character confidences via record.confidences. Both engines now pass word+confidence data through main.py, which applies the marker post-processing before returning the API response. Threshold configurable via OCR_CONFIDENCE_THRESHOLD env var (default 0.3). Frontend already renders [unleserlich] markers via transcriptionMarkers.ts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 19:16:17 +02:00
Marcel	6737bd6db5	feat(ocr): add Python OCR microservice, RestClientOcrClient, Docker Compose Python microservice (ocr-service/): - FastAPI app with /ocr and /health endpoints - Surya engine: transformer-based OCR for typewritten/modern handwriting - Kraken engine: historical HTR for Kurrent/Suetterlin with pure-Python polygon-to-quad approximation (gift wrapping + rotating calipers) - Eager model loading at startup via lifespan context manager - PDF download via httpx, page rendering via pypdfium2 at 300 DPI Java RestClientOcrClient: - Implements OcrClient + OcrHealthClient interfaces - Calls Python service via Spring RestClient - Health check with graceful fallback Docker Compose: - New ocr-service container (mem_limit 6g, no host ports) - Health check with start_period 60s for model loading - ocr_models volume for Kraken model files - Backend depends on ocr-service health Refs #226, #227 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 15:26:40 +02:00
Marcel	ea1c097ae0	fix(e2e): activate e2e profile in dev mode and create reader user idempotently - Add e2e to the dev Maven profile's spring.profiles.active so DataInitializer always runs when developing/testing locally - Create the reader test user independently of the person-seed guard so it survives restarts where seed data already exists - Set SPRING_PROFILES_ACTIVE=dev,e2e in docker-compose backend service Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-24 08:25:54 +01:00
Marcel	c18cdbfac1	feat(dev): add Mailpit mail catcher to docker-compose Adds a Mailpit container that catches all outgoing emails locally so password reset links can be tested without a real SMTP server. - Backend defaults to MAIL_HOST=mailpit / MAIL_PORT=1025 in compose - SMTP auth and STARTTLS disabled for Mailpit (no credentials needed) - Web inbox available at http://localhost:8025 - Production SMTP still works by overriding MAIL_HOST, MAIL_PORT, MAIL_USERNAME, MAIL_SMTP_AUTH, and MAIL_STARTTLS_ENABLE in .env Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-23 09:10:17 +01:00
Marcel	908221f04d	feat(frontend): add forgot-password and reset-password pages - /forgot-password: email form → sends POST /api/auth/forgot-password → success banner - /reset-password: password form reads token from URL → sends POST /api/auth/reset-password - Login page: add "Passwort vergessen?" link - hooks.server.ts: add /forgot-password and /reset-password to PUBLIC_PATHS; skip auth injection for public auth API endpoints - errors.ts: add INVALID_RESET_TOKEN error code - i18n: add all new message keys in de/en/es - playwright.config.ts: use E2E_BASE_URL for webServer check URL (allows reusing docker dev server at port 5173 locally) - ci.yml: pass E2E_BACKEND_URL=http://localhost:8080 to E2E test step - e2e/password-reset.spec.ts: 5 tests (4 pass locally, full flow requires e2e profile in CI) - Regenerated OpenAPI types including new /api/auth/* endpoints Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-23 07:26:35 +01:00
Marcel	9b67db74eb	feat: auto-start Spring Boot backend via docker-compose Replace the devcontainer (sleep infinity + VS Code image) with a proper dev setup: - Dockerfile: eclipse-temurin:21-jdk-alpine running ./mvnw spring-boot:run - Source mounted at /app, Maven deps cached in named volume maven_cache - Healthcheck on /actuator/health so frontend waits until backend is ready - frontend depends_on backend: service_healthy (was service_started) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 12:03:14 +01:00
Marcel	3280125140	feat: add frontend dev container to docker-compose - frontend/Dockerfile: Node 20 Alpine image running npm run dev - docker-compose: frontend service with depends_on db/minio/backend, source mounted as volume, named volume for node_modules to avoid OS binary conflicts between host and container - vite.config.ts: make API proxy target configurable via API_PROXY_TARGET env var (defaults to localhost:8080 for local dev, set to http://backend:8080 inside Docker) - .env: update PORT_FRONTEND to 5173 (actual vite dev server port) Usage: docker compose up frontend # starts frontend + all dependencies docker compose up # starts everything Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 12:03:14 +01:00
Marcel	0cb8812692	fix: correct devcontainer workspace path mismatch Volume was mounting ./backend to /workspaces/backend, but devcontainer.json pointed VS Code to /workspaces/familienarchiv — causing the broken path shown in Remote Explorer. Now mounts the full project root to /workspaces/familienarchiv, which matches the workspaceFolder variable. Also gives container access to frontend/ for running npm run generate:api without leaving the devcontainer. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-15 13:44:06 +01:00
Marcel	e63adb964d	restructure: flatten workspace nesting, move devcontainer to root - backend/workspaces/backend/ → backend/ - backend/workspaces/frontend/ → frontend/ - backend/.devcontainer/ + .vscode/ → repo root (where VS Code expects them) - loose scripts/SQL files → scripts/ - replace nested git repo with single repo at project root - update docker-compose.yml build context and devcontainer.json path - add root .gitignore Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-15 11:47:58 +01:00
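
Commit 64d27d6d61 describes a per-sender model registry with LRU eviction, double-checked locking, invalidate(), and a path whitelist. The repository code is not shown here, so the following is only a rough sketch of how such a registry might look. The class name, the constructor parameters, and the fallback value of 2 are assumptions made for illustration, and the loader is a hypothetical stand-in for _load_sender_model().

```python
import os
import threading
from collections import OrderedDict
from pathlib import Path
from typing import Any, Callable

# Assumption: cache size comes from the env var named in the commits.
MAX_CACHED = int(os.environ.get("OCR_MAX_CACHED_MODELS", "2"))


class SenderModelRegistry:
    """Illustrative LRU cache of loaded models, keyed by resolved path."""

    def __init__(self, loader: Callable[[str], Any],
                 max_models: int = MAX_CACHED,
                 allowed_root: str = "/app/models") -> None:
        self._loader = loader            # assumed to return a non-None model
        self._max = max_models
        self._root = Path(allowed_root).resolve()
        self._cache: "OrderedDict[str, Any]" = OrderedDict()
        self._lock = threading.Lock()

    def _checked(self, path: str) -> str:
        # Resolve ".." segments first, then require the whitelisted root.
        resolved = Path(path).resolve()
        if not resolved.is_relative_to(self._root):
            raise ValueError(f"model path outside {self._root}: {path}")
        return str(resolved)

    def get(self, path: str) -> Any:
        key = self._checked(path)
        model = self._cache.get(key)          # first check, without the lock
        if model is None:
            with self._lock:
                model = self._cache.get(key)  # second check, under the lock
                if model is None:
                    model = self._loader(key)
                    self._cache[key] = model
                    while len(self._cache) > self._max:
                        self._cache.popitem(last=False)  # drop least recent
        with self._lock:
            if key in self._cache:
                self._cache.move_to_end(key)  # mark as most recently used
        return model

    def invalidate(self, path: str) -> None:
        # Forget a model, e.g. after retraining wrote a new file.
        with self._lock:
            self._cache.pop(self._checked(path), None)
```

In this sketch the loader runs while the lock is held, so a slow load for one sender blocks lookups for others; that keeps the example short and guarantees each model is loaded only once, but a real service might prefer per-key locks.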

27 Commits
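
Commits c74539b04b and f064b27439 describe the confidence post-processing in prose only. As a minimal sketch of that behaviour, assuming words arrive as (text, confidence) pairs: the function names follow the commit messages, but the signatures, the mean aggregation in words_from_characters(), and the hard-coded thresholds (the commits say they come from environment variables) are assumptions, not the repository's actual code.

```python
from typing import Iterable, List, Sequence, Tuple

MARKER = "[unleserlich]"

# Assumed defaults, taken from the values quoted in the commit messages.
DEFAULT_THRESHOLD = 0.3   # Surya paths (typewriter / Latin)
KURRENT_THRESHOLD = 0.5   # Kraken Kurrent path

Word = Tuple[str, float]


def get_threshold(script_type: str) -> float:
    """Pick the stricter threshold for Kurrent, the default otherwise."""
    if script_type.lower() == "kurrent":
        return KURRENT_THRESHOLD
    return DEFAULT_THRESHOLD


def apply_confidence_markers(words: Iterable[Word], threshold: float) -> str:
    """Replace low-confidence words with one marker per adjacent run."""
    out: List[str] = []
    for text, confidence in words:
        if confidence < threshold:
            if not out or out[-1] != MARKER:   # collapse adjacent markers
                out.append(MARKER)
        else:
            out.append(text)
    return " ".join(out)


def words_from_characters(text: str,
                          confidences: Sequence[float]) -> List[Word]:
    """Group characters into words; word confidence = mean (an assumption)."""
    words: List[Word] = []
    chars: List[str] = []
    confs: List[float] = []
    for ch, conf in zip(text, confidences):
        if ch.isspace():
            if chars:
                words.append(("".join(chars), sum(confs) / len(confs)))
                chars, confs = [], []
        else:
            chars.append(ch)
            confs.append(conf)
    if chars:   # flush the last word
        words.append(("".join(chars), sum(confs) / len(confs)))
    return words
```

A call might look like apply_confidence_markers(words_from_characters(line_text, char_confidences), get_threshold("kurrent")), where line_text and char_confidences are placeholders for whatever the OCR engine returns for one line.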