marcel/familienarchiv

marcel d650b6c066

CI / Unit & Component Tests (push) Successful in 3m23s
CI / OCR Service Tests (push) Successful in 24s
CI / Backend Unit Tests (push) Successful in 3m46s
CI / fail2ban Regex (push) Successful in 46s
CI / Semgrep Security Scan (push) Successful in 25s
CI / Compose Bucket Idempotency (push) Successful in 1m8s

refactor(search): remove NLP/smart-search feature entirely (#772)

## Summary

- Removes the NLP/smart-search feature completely — the feature was too unreliable and slow; users get better results with the regular search filters
- Deletes the entire backend `search/` package (NlSearchController, NlQueryParserService, NlpClient, NlSearchRateLimiter — 14 classes + 6 test classes)
- Deletes the `nlp-service/` Python microservice (FastAPI, rapidfuzz, DB-backed person matching)
- Removes all frontend NL search components: SmartModeToggle, SmartSearchStatus, InterpretationChipRow, DisambiguationPicker, chip-types, theme-chip-removal
- Strips smart-mode logic from SearchFilterBar and documents/+page.svelte
- Removes `SMART_SEARCH_UNAVAILABLE` / `SMART_SEARCH_RATE_LIMITED` error codes from backend, frontend types, and all three i18n files (de/en/es)
- Removes `nlp-service` container and `APP_NLP_BASE_URL` from both docker-compose files
- Removes Ollama/NLP Prometheus scrape job and Grafana dashboard
- Deletes ADRs 028 (×2), 034, 035

## Test plan

- [ ] Backend compiles: `cd backend && ./mvnw compile -q` exits with status 0 (`-q` suppresses the BUILD SUCCESS banner)
- [ ] Frontend server tests pass: `cd frontend && npm run test -- --project=server`
- [ ] No NLP/smart-search references remain in source: `grep -r "SmartSearch\|NlSearch\|nlp-service\|SMART_SEARCH" backend/src frontend/src` prints no matches
- [ ] `docker compose config` validates both compose files
- [ ] Search page loads, filter bar works, no smart-mode toggle visible
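
The grep check in the test plan is easy to misread in a script, because `grep` exits 0 when it finds a match, which is the failure case here. A minimal sketch of a wrapper that inverts this follows; the function name `check_no_smart_search_refs` is hypothetical and not part of the repository, and only the pattern and directories come from the test plan.

```shell
# Hypothetical helper (not in the repo): succeed only when none of the
# removed NLP/smart-search identifiers appear under the given paths.
check_no_smart_search_refs() {
  local pattern='SmartSearch\|NlSearch\|nlp-service\|SMART_SEARCH'
  local status=0
  # grep exit codes: 0 = match found, 1 = no match, 2 = error.
  grep -rn "$pattern" "$@" || status=$?
  case "$status" in
    0) echo "stale references found" >&2; return 1 ;;
    1) echo "no references found"; return 0 ;;
    *) echo "grep failed (exit $status)" >&2; return 2 ;;
  esac
}

# Usage from the repository root:
#   check_no_smart_search_refs backend/src frontend/src
```

Treating grep's exit status 2 as its own case keeps a mistyped or missing directory from being reported as a clean result.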

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Marcel <marcel@familienarchiv>
Reviewed-on: #772

2026-06-08 10:57:00 +02:00

Name	Last commit	Date
.claude	docs(infra): correct server specs — Hetzner Serverbörse i7-6700 64 GB, not CX32	2026-06-06 14:51:07 +02:00
.devcontainer	docs(legibility): migrate CLAUDE.md rules into human docs — DOC-7	2026-05-06 07:41:02 +02:00
.gitea	ci(deploy): use ::error:: annotations for smoke-test failures	2026-06-02 19:41:07 +02:00
.husky	chore(hooks): remove pre-push E2E hook	2026-03-22 22:15:00 +01:00
.semgrep	fix(ci): pin semgrep version, add pip cache, harden rule severity	2026-05-17 16:18:03 +02:00
.vscode	docs(c4): add VS Code PlantUML server config and diagram index	2026-05-06 22:52:21 +02:00
backend	refactor(search): remove NLP/smart-search feature entirely (#772)	2026-06-08 10:57:00 +02:00
docs	refactor(search): remove NLP/smart-search feature entirely (#772)	2026-06-08 10:57:00 +02:00
frontend	refactor(search): remove NLP/smart-search feature entirely (#772)	2026-06-08 10:57:00 +02:00
infra	refactor(search): remove NLP/smart-search feature entirely (#772)	2026-06-08 10:57:00 +02:00
ocr-service	docs(ocr): explain why two metrics tests skip fresh_metrics fixture	2026-05-21 17:23:32 +02:00
scripts	docs(legibility): migrate CLAUDE.md rules into human docs — DOC-7	2026-05-06 07:41:02 +02:00
tools/import-normalizer	chore(import): stop tracking real family PII canonical artifacts	2026-05-28 10:20:38 +02:00
.env.example	docs(env): correct OLLAMA_API_KEY comment — tested on 0.6.5 and 0.30.6	2026-06-06 14:59:35 +02:00
.gitignore	chore(import): stop tracking real family PII canonical artifacts	2026-05-28 10:20:38 +02:00
CLAUDE.md	refactor(search): remove NLP/smart-search feature entirely (#772)	2026-06-08 10:57:00 +02:00
CODESTYLE.md	docs(codestyle): add Svelte 5 specific rules with examples	2026-03-20 15:58:40 +01:00
COLLABORATING.md	docs(legibility): link GLOSSARY.md from COLLABORATING.md — DOC-3	2026-05-05 22:29:07 +02:00
CONTRIBUTING.md	docs: note honest date formatter, title formatter and drift fixture	2026-05-27 12:08:00 +02:00
docker-compose.ci.yml	ci: set up CI pipeline with unit, backend, and E2E test jobs	2026-03-19 12:03:37 +01:00
docker-compose.observability.yml	feat(observability): wire obs-grafana to archive-db and inject GRAFANA_DB_PASSWORD	2026-05-21 20:21:05 +02:00
docker-compose.prod.yml	refactor(search): remove NLP/smart-search feature entirely (#772)	2026-06-08 10:57:00 +02:00
docker-compose.yml	refactor(search): remove NLP/smart-search feature entirely (#772)	2026-06-08 10:57:00 +02:00
README.md	docs(legibility): write human-targeted README.md at repo root — DOC-1	2026-05-06 07:01:16 +02:00
renovate.json	ci(deploy): extend Renovate privileged-digest watch to .gitea/actions	2026-06-02 19:23:56 +02:00
runner-config.yaml	chore(runner): mount /opt/familienarchiv into job containers	2026-05-16 10:19:09 +02:00

README.md

Familienarchiv

Familienarchiv is a private web application for digitising, organising, and searching a family document collection — letters, postcards, and photographs from 1899 to 1950. Family members upload scans, transcribe handwritten text (Kurrent/Sütterlin), and read the archive from any device.

Subsystems

frontend/ — SvelteKit 2 / Svelte 5 / TypeScript / Tailwind 4 web app (server-side rendered)
backend/ — Spring Boot 4 (Java 21) REST API; handles documents, persons, search, and user management
ocr-service/ — Python FastAPI microservice for OCR and handwritten text recognition (HTR); single-node by design — see ADR-001. Not part of the default dev stack (see Quick start below)
infra/ — Gitea Actions CI/CD config; future home for infrastructure-as-code
scripts/ — operational and data-pipeline helpers (reset-db.sh, clean-e2e-data.sh, import scripts)

Quick start

Prerequisites: Java 21, Node 24, Docker with the docker compose plugin (V2).

1. Configure environment

cp .env.example .env
# The defaults in .env.example work for local development without changes.

2. Start infrastructure

# Starts PostgreSQL, MinIO (object storage), and Mailpit (dev mail catcher)
docker compose up -d db minio mailpit

3. Start the backend

cd backend
./mvnw spring-boot:run
# Starts on http://localhost:8080
# API docs (dev profile, auto-enabled): http://localhost:8080/v3/api-docs

4. Start the frontend

cd frontend
npm install
npm run dev
# Starts on http://localhost:5173

Open http://localhost:5173 — you should see the Familienarchiv login screen.
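
The backend and the frontend dev server both take a moment to come up, so a small retry wrapper can confirm they answer before you open the browser. This is a minimal sketch: the `retry` helper and its arguments are hypothetical and not part of the repository, and only the two URLs come from the steps above.

```shell
# Hypothetical helper (not in the repo): run a command until it succeeds,
# up to a maximum number of attempts, pausing between tries.
retry() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "ok after $i attempt(s): $*"
      return 0
    fi
    # No need to pause after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  echo "gave up after $attempts attempt(s): $*" >&2
  return 1
}

# Usage, assuming curl is installed and steps 2-4 have been started:
#   retry 30 2 curl -fsS http://localhost:8080/v3/api-docs
#   retry 30 2 curl -fsS http://localhost:5173
```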

Default development credentials:

# local dev only — change before any network-exposed deployment
Email:    admin@familyarchive.local
Password: admin123

Development setup only. The default docker compose config exposes the database port and uses root MinIO credentials. Do not connect this to a network without first reading docs/DEPLOYMENT.md (coming: DOC-5, #399).

Running the full stack via Docker (optional)

To run everything including the backend and frontend in containers:

docker compose up -d

Note: the OCR service (ocr-service/) builds its Docker image locally and downloads ~6 GB of ML models on first start, so expect a first run to take 30–60 minutes. The rest of the stack starts independently. The OCR service needs at least 12 GB of RAM; on memory-constrained machines, exclude it with --scale ocr-service=0.
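
To show how the `--scale` flag fits into the command, here is a small sketch that picks the extra arguments from the machine's RAM. It reads the note above as meaning the OCR service needs at least 12 GB; the helper name `ocr_compose_args` is hypothetical and not part of the repository.

```shell
# Hypothetical helper (not in the repo): given total RAM in whole GB,
# print the extra `docker compose up` arguments to use.
ocr_compose_args() {
  local mem_gb="$1"
  # Assumption: the 12 GB figure in the README is the OCR service's minimum.
  if [ "$mem_gb" -lt 12 ]; then
    echo "--scale ocr-service=0"
  fi
}

# Usage:
#   docker compose up -d $(ocr_compose_args 8)    # starts the stack without OCR
#   docker compose up -d $(ocr_compose_args 16)   # starts the full stack
```

The RAM figure is passed in explicitly because the way to read it differs per operating system, and the kernel often reports slightly less than the installed amount.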

Where to go next

Resource	Purpose
docs/architecture/c4-diagrams.md	C4 container and component diagrams (current system view)
docs/ARCHITECTURE.md (coming: DOC-2, #396)	Full architecture guide with domain list
docs/GLOSSARY.md	Overloaded terms: Person vs AppUser, Chronik vs Aktivität, etc.
CONTRIBUTING.md (coming: DOC-4, #398)	How to add a domain, endpoint, or SvelteKit route
docs/DEPLOYMENT.md (coming: DOC-5, #399)	Production deployment checklist and secrets guide
docs/adr/	Architecture Decision Records — the "why" behind key choices
Gitea issue tracker (internal — home network only)	Bug reports, feature requests, and project planning

License

Languages

Python 69.8%

TypeScript 12.9%

Java 12.7%

Svelte 4.3%

Shell 0.1%

Other 0.1%
