From ca93cde06e599f95301530d40234515dc0da5d81 Mon Sep 17 00:00:00 2001
From: Marcel <marcel@familienarchiv>
Date: Sat, 6 Jun 2026 14:50:40 +0200
Subject: [PATCH] =?UTF-8?q?docs(infra):=20correct=20server=20specs=20?=
 =?UTF-8?q?=E2=80=94=20Hetzner=20Serverb=C3=B6rse=20i7-6700=2064=20GB,=20n?=
 =?UTF-8?q?ot=20CX32?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replace all references to the CX32 VPS (8 GB RAM, Hetzner Cloud) and the
other Hetzner Cloud tiers (CX22, CX42) with the actual production server:
a Hetzner Serverbörse dedicated server with an Intel Core i7-6700
(4C/8T, 3.4 GHz) and 64 GB RAM.

Affected files:
- .claude/personas/devops.md — monthly cost line + upgrade example
- docs/infrastructure/production-compose.md — sizing section + cost table
- docs/DEPLOYMENT.md — OCR memory table + OCR_MEM_LIMIT env var description
- docs/adr/004-pdfbox-thumbnails.md — thumbnailExecutor memory ceiling note
- docs/adr/021-tmpdir-persistent-volume-staging.md — OOMKill rationale in alternatives

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 .claude/personas/devops.md                    |  8 +++---
 docs/DEPLOYMENT.md                            | 11 ++++----
 docs/adr/004-pdfbox-thumbnails.md             |  2 +-
 .../021-tmpdir-persistent-volume-staging.md   |  2 +-
 docs/infrastructure/production-compose.md     | 26 +++++++------------
 5 files changed, 22 insertions(+), 27 deletions(-)
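
Note for reviewers (below the `---`, so not part of the commit message):
the RAM-to-limit mapping in the updated docs/DEPLOYMENT.md table can be
sketched as a small helper. This is an illustrative sketch only. The
function name `recommended_ocr_mem_limit` is hypothetical and does not
exist in the repo, and the handling of RAM sizes that fall between the
table's rows (for example 12 GB) is an assumption, not something the docs
state.

```python
from typing import Optional


def recommended_ocr_mem_limit(total_ram_gb: float) -> Optional[str]:
    """Suggest an OCR_MEM_LIMIT value for a host, or None to disable OCR.

    Mirrors the table in docs/DEPLOYMENT.md:
      >= 16 GB -> 12g (the prod compose default, so the var can stay unset)
      8 GB     -> 6g
      4 GB     -> disable the OCR service
    """
    if total_ram_gb >= 16:
        return "12g"
    if total_ram_gb >= 8:
        # Assumption: hosts between 8 and 16 GB are treated like 8 GB hosts.
        return "6g"
    # Assumption: anything below 8 GB is treated like the 4 GB row.
    return None


if __name__ == "__main__":
    for ram in (64, 16, 8, 4):
        limit = recommended_ocr_mem_limit(ram)
        print(f"{ram} GB RAM -> OCR_MEM_LIMIT={limit or '(OCR disabled)'}")
```

For the current server (64 GB) this returns the 12g default, which is why
the docs now say the variable can be left unset there.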

diff --git a/.claude/personas/devops.md b/.claude/personas/devops.md
index 1e7638ef..ee7fbe61 100644
--- a/.claude/personas/devops.md
+++ b/.claude/personas/devops.md
@@ -154,9 +154,9 @@ Schedule monthly automated restore tests. If the restore fails, the backup is wo
 ```
 Every alert needs: description, severity, likely cause, resolution steps, escalation path.
 
-3. **Upgrading VPS tier before profiling**
+3. **Upgrading hardware before profiling**
 ```
-# "The app feels slow" → upgrade from CX32 to CX42
+# "The app feels slow" → order more RAM / a faster CPU
 # Actual cause: unindexed query scanning 100k rows
 ```
 Profile with Grafana dashboards first. Most perceived performance issues are application bugs, not resource constraints.
@@ -404,8 +404,8 @@ Hetzner Object Storage (S3-compatible, replaces MinIO in prod)
 Prometheus + Loki + Alertmanager
 ```
 
-### Monthly Cost: ~23 EUR
-CX32 VPS (4 vCPU, 8GB RAM): 17 EUR · Object Storage (~200GB): 5 EUR · SMTP relay: ~1 EUR
+### Monthly Cost: ~6 EUR (excl. server)
+Hetzner dedicated server (Serverbörse, i7-6700, 64 GB RAM): see invoice · Object Storage (~200 GB): 5 EUR · SMTP relay: ~1 EUR
 
 ### Reference Documentation
 - Full CI workflow, Gitea vs GitHub differences: `docs/infrastructure/ci-gitea.md`
diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md
index 2e79481e..3fddf929 100644
--- a/docs/DEPLOYMENT.md
+++ b/docs/DEPLOYMENT.md
@@ -52,11 +52,12 @@ The OCR service requires significant RAM for model loading. The dev compose sets
 
 | Production target | RAM | Recommended OCR limit | Notes |
 |---|---|---|---|
-| Hetzner CX42 | 16 GB | 12 GB | Recommended for OCR-enabled production |
-| Hetzner CX32 | 8 GB | 6 GB | Accept reduced batch sizes and slower throughput |
-| Hetzner CX22 | 4 GB | — | Disable the OCR service (`profiles: [ocr]`); run OCR on demand only |
+| Current server (Hetzner Serverbörse, i7-6700) | 64 GB | 12 GB | Default `mem_limit: 12g` works comfortably |
+| ≥ 16 GB RAM | 16+ GB | 12 GB | Default works |
+| 8 GB RAM | 8 GB | 6 GB | Set `OCR_MEM_LIMIT=6g`; accept reduced batch sizes |
+| 4 GB RAM | 4 GB | — | Disable OCR service (`profiles: [ocr]`); run OCR on demand only |
 
-A CX32 cannot honour the default `mem_limit: 12g` — set the `OCR_MEM_LIMIT=6g` env var (in `.env.production` / `.env.staging`, or as a Gitea secret consumed by the workflow) before deploying on a CX32. The prod compose interpolates this var with a 12g default.
+On servers with less than 16 GB RAM the default `mem_limit: 12g` cannot be honoured; lower it via the `OCR_MEM_LIMIT` env var, e.g. `OCR_MEM_LIMIT=6g` on an 8 GB server (set it in `.env.production` / `.env.staging`, or as a Gitea secret consumed by the workflow). The prod compose interpolates this var with a 12g default.
 
 ### Dev vs production differences
 
@@ -140,7 +141,7 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
 | `ALLOWED_PDF_HOSTS` | SSRF protection — comma-separated list of allowed PDF source hosts. **Do not widen to `*`** | `minio,localhost,127.0.0.1` | YES | — |
 | `KRAKEN_MODEL_PATH` | Directory containing Kraken HTR models (populated by `download-kraken-models.sh`) | `/app/models/` | — | — |
 | `BLLA_MODEL_PATH` | Kraken baseline layout analysis model path | `/app/models/blla.mlmodel` | — | — |
-| `OCR_MEM_LIMIT` | Container memory cap for ocr-service in `docker-compose.prod.yml`. Set to `6g` on CX32 hosts; leave unset on CX42+ to use the 12g default | `12g` (prod compose default) | — | — |
+| `OCR_MEM_LIMIT` | Container memory cap for ocr-service in `docker-compose.prod.yml`. Set to `6g` on servers with 8 GB RAM; leave unset (12g default) on servers with ≥ 16 GB RAM | `12g` (prod compose default) | — | — |
 | `XDG_CACHE_HOME` | XDG cache base dir — redirects Matplotlib and other XDG-aware libraries away from the read-only `HOME` (`/home/ocr`) to the writable cache volume | `/app/cache` | — | — |
 | `TORCH_HOME` | PyTorch model cache — redirects `~/.cache/torch` to the writable models volume | `/app/models/torch` | — | — |
 
diff --git a/docs/adr/004-pdfbox-thumbnails.md b/docs/adr/004-pdfbox-thumbnails.md
index 4d799545..de6ed4b0 100644
--- a/docs/adr/004-pdfbox-thumbnails.md
+++ b/docs/adr/004-pdfbox-thumbnails.md
@@ -35,7 +35,7 @@ Render thumbnails in-process in Spring Boot using **Apache PDFBox 3.0.4** (alrea
 
 **Harder:**
 - PDFBox is a parser attack surface. Mitigated by a 30-second watchdog timeout in `ThumbnailAsyncRunner` and by the fire-and-forget contract (failures never break upload).
-- Memory ceiling: the `thumbnailExecutor` is capped at 2 threads on the CX32 (8 GB). A busy backfill alongside OCR can approach the 3 GB heap — acceptable but not comfortable. Streaming via `FileService.downloadFileStream` keeps this bounded for PDFs up to 50 MB.
+- Memory ceiling: the `thumbnailExecutor` is capped at 2 threads on memory-constrained hosts. A busy backfill alongside OCR can approach the 3 GB heap on an 8 GB server — acceptable but not comfortable. The current production server (64 GB) has ample headroom. Streaming via `FileService.downloadFileStream` keeps this bounded for PDFs up to 50 MB.
 
 ### Operational caveats (intentional)
 
diff --git a/docs/adr/021-tmpdir-persistent-volume-staging.md b/docs/adr/021-tmpdir-persistent-volume-staging.md
index aaa7a203..7f712974 100644
--- a/docs/adr/021-tmpdir-persistent-volume-staging.md
+++ b/docs/adr/021-tmpdir-persistent-volume-staging.md
@@ -62,7 +62,7 @@ The `/tmp` tmpfs remains at 512 MB and continues to serve training-ZIP extractio
 ## Alternatives considered
 
 **Approach B — Enlarge `/tmp` to 4 GB**  
-One-line change. Discarded because: (1) 4 GB tmpfs counts against the cgroup `mem_limit`; on CX32 hosts with `OCR_MEM_LIMIT=6g` the combined Surya resident set + tmpfs would trigger OOMKill on cold start; (2) staging GB-scale model files through RAM is using the wrong storage tier; (3) any future model larger than 4 GB requires another bump.
+One-line change. Discarded because: (1) 4 GB tmpfs counts against the cgroup `mem_limit`; on servers with `OCR_MEM_LIMIT=6g` the combined Surya resident set + tmpfs would trigger OOMKill on cold start; (2) staging GB-scale model files through RAM is using the wrong storage tier; (3) any future model larger than 4 GB requires another bump.
 
 **Approach C — Both TMPDIR redirect and enlarged /tmp**  
 Belt-and-suspenders: Approach A + 1 GB tmpfs. Discarded in favour of the cleaner Approach A. The defence-in-depth benefit does not outweigh the extra compose churn; the 512 MB cap on `/tmp` is intentional.
diff --git a/docs/infrastructure/production-compose.md b/docs/infrastructure/production-compose.md
index cf873bfe..c209a401 100644
--- a/docs/infrastructure/production-compose.md
+++ b/docs/infrastructure/production-compose.md
@@ -20,24 +20,19 @@ The observability stack (Prometheus, Loki, Grafana, Tempo, GlitchTip) ships as a
 
 ---
 
-## VPS Sizing Recommendations
+## Server Sizing
 
-### Recommended: Hetzner CX32
+### Current Production Server: Hetzner Dedicated (Serverbörse)
 
-**Specs**: 4 vCPU, 8 GB RAM, 80 GB SSD · **Cost**: 17 EUR/mo
+**Specs**: Intel Core i7-6700 (4C/8T, 3.4 GHz), 64 GB RAM · acquired via Hetzner server auction
 
-Sufficient for the application stack (Postgres, MinIO, OCR with `mem_limit: 12g`, backend, frontend, Caddy) on a CX32 today. Once the observability stack lands (Prometheus/Loki/Grafana/Alertmanager add ~2 GB) consider a CX42.
+Comfortably handles the application stack (Postgres, MinIO, OCR with `mem_limit: 12g`, backend, frontend, Caddy) together with the full observability stack. With 64 GB RAM, OCR, Ollama inference, and observability can all run concurrently without memory pressure.
 
-### When to Upgrade: Hetzner CX42
+### When to Reconsider Hardware
 
-**Specs**: 8 vCPU, 16 GB RAM · **Cost**: 29 EUR/mo
-
-Upgrade when:
-- Observability stack adds memory pressure (Loki + Grafana with >30 days retention)
-- OCR throughput needs scaling beyond a single-node Surya/Kraken setup
-- Real user load profiled in Grafana shows response-time degradation
-
-Never upgrade the VPS tier before profiling — most perceived performance issues are application bugs, not resource constraints.
+- CPU is Skylake (2015) — single-threaded performance is the likely bottleneck before RAM
+- Profile with Grafana dashboards before concluding hardware is the constraint
+- Most perceived performance issues are application bugs (unindexed queries, N+1 loads), not resource limits
 
 ---
 
@@ -45,12 +40,11 @@ Never upgrade the VPS tier before profiling — most perceived performance issue
 
 | Service | Cost |
 |---|---|
-| Hetzner CX32 VPS | 17.00 EUR |
+| Hetzner dedicated server (Serverbörse, i7-6700, 64 GB RAM) | see invoice |
 | Hetzner DNS | 0.00 EUR |
 | Hetzner SMTP relay | ~1.00 EUR |
-| **Total** | **~18 EUR/mo** |
 
-MinIO data lives on the VPS disk (no Object Storage line item yet). The Hetzner OBS migration would add ~5 EUR/mo at ~200 GB.
+MinIO data lives on the server disk (no Object Storage line item yet). The Hetzner OBS migration would add ~5 EUR/mo at ~200 GB.
 
 Equivalent SaaS stack: 200–300 EUR/mo.