familienarchiv/docs/adr/002-polygon-jsonb-storage.md
Marcel ec32d225b5 docs(adr): add ADR-001 (OCR microservice) and ADR-002 (polygon JSONB)
ADR-001 documents the decision to use a separate Python container for
OCR (Surya + Kraken), the interface contract, and why alternatives
like Tess4J were rejected.

ADR-002 documents the decision to store polygon annotations as JSONB
with a 4-point CHECK constraint, backed by an AttributeConverter.

Refs #226, #227

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 15:07:46 +02:00

ADR-002: Polygon JSONB Storage for Annotations

Status

Accepted

Context

Document annotations currently store axis-aligned bounding boxes (x, y, width, height). Kraken OCR outputs polygon boundaries for text lines — historical handwriting (Kurrent, Suetterlin) produces rotated and curved text that axis-aligned rectangles approximate poorly.

We need to store an optional quadrilateral (4 corner points) per annotation to represent the precise text region. The polygon is display-only: overlap detection and all server-side geometry logic continue to use the axis-aligned bounding box (AABB) fields.

Decision

Add a polygon JSONB column to document_annotations:

ALTER TABLE document_annotations ADD COLUMN polygon JSONB;
ALTER TABLE document_annotations
    ADD CONSTRAINT chk_annotation_polygon_quad
    CHECK (polygon IS NULL OR jsonb_array_length(polygon) = 4);

  • null means rectangle: render using the existing x, y, width, height fields (fully backward compatible)
  • Non-null value is a normalized 4-point quadrilateral: [[x1,y1],[x2,y2],[x3,y3],[x4,y4]] with coordinates in the 0-1 range relative to page dimensions

The existing AABB fields are always populated (even when a polygon is present) and remain the authoritative geometry for overlap detection.
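Because the AABB fields stay populated even when a polygon is present, whatever produces a polygon also has to supply a box. The following is a minimal sketch of one way to derive it, assuming the box is simply the tight min/max envelope of the four points; the ADR does not specify this, and the names `Aabb` and `PolygonBounds` are hypothetical.

```java
import java.util.List;

/** Hypothetical holder mirroring the existing x, y, width, height fields. */
final class Aabb {
    final double x, y, width, height;

    Aabb(double x, double y, double width, double height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }
}

final class PolygonBounds {
    private PolygonBounds() {}

    /** Returns the smallest axis-aligned box that contains every point of the polygon. */
    static Aabb enclose(List<List<Double>> polygon) {
        double minX = Double.POSITIVE_INFINITY;
        double minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY;
        double maxY = Double.NEGATIVE_INFINITY;
        for (List<Double> point : polygon) {
            double px = point.get(0);
            double py = point.get(1);
            minX = Math.min(minX, px);
            minY = Math.min(minY, py);
            maxX = Math.max(maxX, px);
            maxY = Math.max(maxY, py);
        }
        // Width and height are expressed in the same normalized units as the points.
        return new Aabb(minX, minY, maxX - minX, maxY - minY);
    }
}
```

With this approach the box always encloses the polygon, so overlap detection on the AABB fields errs on the side of reporting an overlap rather than missing one.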

Java entity: List<List<Double>> polygon backed by a custom AttributeConverter<List<List<Double>>, String>. No new dependency (Hypersistence Utils is not in the project and won't be added for a single column).
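A dependency-free sketch of the conversion logic might look like the following. It is an illustration under stated assumptions, not the project's converter: the class name `PolygonJsonConverter` is hypothetical, the `implements AttributeConverter<List<List<Double>>, String>` clause and the `@Converter` annotation are left out so the snippet compiles on its own, and the regex-based parsing is a stand-in for whatever JSON library the project already uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/**
 * Sketch only. In the project this class would also implement
 * AttributeConverter<List<List<Double>>, String>; the method names below
 * match that interface.
 */
final class PolygonJsonConverter {

    // A JSON number, and a two-element array of such numbers: [x, y].
    private static final String NUMBER = "-?\\d+(?:\\.\\d+)?(?:[eE][-+]?\\d+)?";
    private static final Pattern POINT =
            Pattern.compile("\\[\\s*(" + NUMBER + ")\\s*,\\s*(" + NUMBER + ")\\s*\\]");

    /** Serializes the polygon to a JSON array of [x, y] pairs; null stays null. */
    public String convertToDatabaseColumn(List<List<Double>> polygon) {
        if (polygon == null) {
            return null;
        }
        return polygon.stream()
                .map(p -> "[" + p.get(0) + "," + p.get(1) + "]")
                .collect(Collectors.joining(",", "[", "]"));
    }

    /** Parses every [x, y] pair found in the JSON text; null stays null. */
    public List<List<Double>> convertToEntityAttribute(String json) {
        if (json == null) {
            return null;
        }
        List<List<Double>> polygon = new ArrayList<>();
        Matcher m = POINT.matcher(json);
        while (m.find()) {
            polygon.add(List.of(Double.valueOf(m.group(1)), Double.valueOf(m.group(2))));
        }
        return polygon;
    }
}
```

The null pass-through in both directions is what keeps existing rectangle-only rows readable. Depending on the driver and ORM configuration, writing a String into a JSONB column may also need an explicit cast; that wiring is outside this sketch.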

Semantic invariant: polygon, if present, is a 4-point quadrilateral with coordinates normalized to [0, 1] relative to page dimensions. It may originate from OCR engine output (Kraken) or from a future manual drawing tool. The AABB fields remain the geometry source of truth for server-side logic.

Alternatives Considered

| Alternative | Why rejected |
| --- | --- |
| 8 NUMERIC(8,6) columns (x1, y1, ..., x4, y4) | Verbose, no structural enforcement, awkward to query or extend |
| Separate annotation_polygons join table | Unnecessary complexity for a 1:1 optional relationship |
| PostGIS geometry column | Adds a heavyweight extension for a display-only field with no spatial queries |
| String polygon on the entity | Requires manual parsing at every callsite; error-prone |

Consequences

Easier:

  • Backward compatible — all existing annotations continue to work unchanged
  • Frontend renders <polygon> or <rect> based on a simple null check
  • Schema can accommodate N-point polygons in the future (JSONB is flexible), though the CHECK constraint currently enforces exactly 4

Harder:

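  • Range checks (0 <= x <= 1) are not enforced by the database: a CHECK constraint cannot contain a subquery, so covering every coordinate would need, for example, a helper function (PL/pgSQL) or an explicit expression per coordinate; they are validated at the DTO layer instead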
  • No server-side geometry queries on polygon coordinates (acceptable — polygon is display-only)
  • AttributeConverter adds a small amount of serialization code to maintain
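
Since the CHECK constraint only counts the elements of the outer array, the shape of each point and the [0, 1] range have to be verified in application code. A minimal sketch of such a DTO-layer check follows, written as a plain static method; the class name `PolygonValidator` is hypothetical, and the project may well express the same rule through its validation framework instead.

```java
import java.util.List;

final class PolygonValidator {
    private PolygonValidator() {}

    /** Returns true when the polygon is absent or a well-formed normalized quadrilateral. */
    static boolean isValid(List<List<Double>> polygon) {
        if (polygon == null) {
            return true; // null means "plain rectangle", which is always allowed
        }
        if (polygon.size() != 4) {
            return false; // mirrors the database CHECK constraint
        }
        for (List<Double> point : polygon) {
            if (point == null || point.size() != 2) {
                return false; // each point must be exactly [x, y]
            }
            for (Double coordinate : point) {
                // The negated form also rejects NaN, which fails every comparison.
                if (coordinate == null || !(coordinate >= 0.0 && coordinate <= 1.0)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Running this before persisting means a malformed polygon is rejected with an application-level error rather than surfacing as a constraint violation, or worse, reaching the frontend.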