refactor(document): move document domain core to document/ package
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
144
scripts/CLAUDE.md
Normal file
@@ -0,0 +1,144 @@
# Scripts — Familienarchiv

## Overview

Utility scripts for development, data management, model downloads, and database operations. These are standalone shell and Python scripts used outside the normal application runtime.

## Scripts

### `reset-db.sh`
**Purpose**: Hard-reset the development database, wiping all documents, persons, tags, and related data.

**Usage:**
```bash
./scripts/reset-db.sh
# Type 'yes' to confirm
```

**What it truncates:**
- `transcription_block_versions`
- `transcription_blocks`
- `comment_mentions`
- `document_comments`
- `document_annotations`
- `document_versions`
- `notifications`
- `documents`
- `person_name_aliases`
- `persons`
- `tag`

> ⚠️ **Destructive operation** — only for development!
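
The confirmation step and the table list above could be sketched roughly as follows. This is not the contents of `reset-db.sh`: the helper names `confirmed` and `build_truncate_sql` are hypothetical, and issuing a single `TRUNCATE ... CASCADE` statement is an assumption about how such a reset might be done in PostgreSQL.

```python
# Hypothetical sketch, not the actual reset-db.sh.
# Table names are copied from the list above; everything else is assumed.
TABLES = [
    "transcription_block_versions",
    "transcription_blocks",
    "comment_mentions",
    "document_comments",
    "document_annotations",
    "document_versions",
    "notifications",
    "documents",
    "person_name_aliases",
    "persons",
    "tag",
]


def confirmed(answer: str) -> bool:
    """Accept only a literal 'yes', mirroring the prompt in the usage block."""
    return answer.strip() == "yes"


def build_truncate_sql(tables: list[str]) -> str:
    """Build one TRUNCATE statement covering every listed table.

    CASCADE is an assumption: it makes PostgreSQL also truncate any tables
    that reference the listed ones through foreign keys.
    """
    return "TRUNCATE TABLE " + ", ".join(tables) + " CASCADE;"
```

Keeping the statement builder separate from the prompt makes it possible to print the SQL for review before anything is executed.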

---

### `rebuild-frontend.sh`
**Purpose**: Force a clean rebuild of the frontend Docker container.

**Usage:**
```bash
./scripts/rebuild-frontend.sh
```

---

### `download-kraken-models.sh`
**Purpose**: Download Kraken HTR models for German Kurrent and Sütterlin scripts.

**Usage:**
```bash
./scripts/download-kraken-models.sh
```

Downloads models into `./ocr-service/models/` or the `ocr_models` Docker volume. Each model is roughly 100 to 500 MB.

---

### `download-paperless.sh`
**Purpose**: Download exported documents from a Paperless-ngx instance.

**Usage:**
```bash
./scripts/download-paperless.sh
```

Requires environment variables or config for the Paperless API endpoint and token.
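
As an illustration of the environment-variable route, the sketch below reads an endpoint and token and prepares an authenticated request without sending it. The variable names `PAPERLESS_URL` and `PAPERLESS_TOKEN` and both helper functions are hypothetical; the script may use different names or a config file. The `Authorization: Token <token>` header follows the token scheme described in the Paperless-ngx API documentation.

```python
import os
import urllib.request


def load_paperless_config(env=os.environ) -> tuple[str, str]:
    """Read the API base URL and token. The variable names are hypothetical."""
    try:
        url = env["PAPERLESS_URL"].rstrip("/")
        token = env["PAPERLESS_TOKEN"]
    except KeyError as missing:
        # Fail early with a readable message instead of a traceback.
        raise SystemExit(f"Missing environment variable: {missing}") from None
    return url, token


def build_request(url: str, token: str, path: str = "/api/documents/") -> urllib.request.Request:
    """Prepare (but do not send) a token-authenticated request."""
    return urllib.request.Request(
        url + path,
        headers={"Authorization": f"Token {token}"},
    )
```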

---

### `flatten-paperless.sh`
**Purpose**: Flatten nested Paperless export directories into a single import-ready structure.

**Usage:**
```bash
./scripts/flatten-paperless.sh
```
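
To make "flattening" concrete, here is a minimal sketch of the idea in Python. It is not the shell script itself: the function name `flatten`, the copy-instead-of-move behaviour, and the `_1`, `_2` suffix used for name clashes are all assumptions.

```python
import shutil
from pathlib import Path


def flatten(src: Path, dest: Path) -> list[Path]:
    """Copy every file found under src into dest, without subdirectories.

    Name clashes are resolved by appending a counter (scan.pdf, scan_1.pdf).
    dest is expected to live outside src.
    """
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # Sort for a deterministic order, so clash suffixes are reproducible.
    for file in sorted(p for p in src.rglob("*") if p.is_file()):
        target = dest / file.name
        counter = 1
        while target.exists():
            target = dest / f"{file.stem}_{counter}{file.suffix}"
            counter += 1
        shutil.copy2(file, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied
```

Copying rather than moving leaves the original export untouched, which is convenient if the flattening rules need to be adjusted and rerun.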

---

### `generate_data.py`
**Purpose**: Generate synthetic test data for development.

**Usage:**
```bash
python scripts/generate_data.py
```

Generates fake documents, persons, and tags suitable for load testing or UI development.
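
A seeded generator is one way to produce such data reproducibly. The sketch below is illustrative only: the `FakeDocument` fields, the name and tag pools, and `generate_documents` are hypothetical and do not reflect the real script or the application's schema.

```python
import random
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    # Hypothetical fields; the real data model may look different.
    title: str
    person: str
    tags: list[str] = field(default_factory=list)


FIRST_NAMES = ["Anna", "Karl", "Marie", "Otto"]
LAST_NAMES = ["Becker", "Hoffmann", "Schmidt"]
TAGS = ["letter", "photo", "certificate", "diary"]


def generate_documents(count: int, seed: int = 0) -> list[FakeDocument]:
    """Generate count fake documents; the same seed yields the same data."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    docs = []
    for i in range(count):
        person = f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
        tags = rng.sample(TAGS, k=rng.randint(1, 2))  # one or two distinct tags
        docs.append(FakeDocument(f"Document {i + 1}", person, tags))
    return docs
```

A fixed seed keeps UI screenshots and load-test runs comparable between developers.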

---

### `prepare_historical_dict.py`
**Purpose**: Build a historical German word dictionary for the OCR spell-checker.

**Usage:**
```bash
python scripts/prepare_historical_dict.py
```

Processes raw word lists into the format expected by `ocr-service/spell_check.py`.
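
The target format is not specified here, so the following is only a sketch of the kind of cleanup such a step might perform. It assumes the output is a sorted, de-duplicated list of lowercase words; `normalize_words` is a hypothetical helper, and the real format expected by `spell_check.py` may differ.

```python
import unicodedata


def normalize_words(raw_lines):
    """Return a sorted, de-duplicated word list.

    Assumed output: one NFC-normalized, lowercase word per entry.
    """
    words = set()
    for line in raw_lines:
        # NFC makes composed and decomposed umlauts compare equal.
        word = unicodedata.normalize("NFC", line.strip()).lower()
        # Keep purely alphabetic entries; drops blanks, numbers, punctuation.
        if word.isalpha():
            words.add(word)
    return sorted(words)
```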

---

### `schema.sql`
**Purpose**: Complete database schema dump for reference.

**Note**: Flyway migrations in `backend/src/main/resources/db/migration/` are the source of truth for schema evolution. `schema.sql` is a snapshot for quick reference only.

---

### `large-data.sql`
**Purpose**: Pre-seeded dataset with a large number of documents for performance testing.

**Usage:**
```bash
# Import into PostgreSQL
docker exec -i archive-db psql -U archive_user -d family_archive_db < scripts/large-data.sql
```

## How to Use

Most scripts should be run from the **repository root**:

```bash
# Database reset
./scripts/reset-db.sh

# Model download
./scripts/download-kraken-models.sh

# Data generation
python scripts/generate_data.py
```

Ensure scripts are executable:
```bash
chmod +x scripts/*.sh
```
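
To see which scripts still lack the executable bit before running `chmod`, a small check along these lines could help. The function name `non_executable_scripts` is hypothetical, and testing only the owner-execute bit is a simplifying assumption.

```python
import stat
from pathlib import Path


def non_executable_scripts(directory: Path) -> list[Path]:
    """Return the *.sh files in directory whose owner-execute bit is not set."""
    return sorted(
        p for p in directory.glob("*.sh")
        if not (p.stat().st_mode & stat.S_IXUSR)
    )
```

Calling `non_executable_scripts(Path("scripts"))` from the repository root would list the files that `chmod +x` still needs to touch.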

## Adding New Scripts

1. Place the script in `scripts/`
2. Add a header comment describing purpose and usage
3. Make it executable (`chmod +x`)
4. Document it in this `CLAUDE.md`
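
For a Python script, the header described in step 2 might look like the skeleton below. The file name `example_script.py`, the `--dry-run` flag, and `parse_args` are placeholders chosen for illustration, not an existing convention in this repository.

```python
#!/usr/bin/env python3
"""example_script.py (hypothetical name)

Purpose: one-sentence description of what the script does.
Usage:   python scripts/example_script.py [--dry-run]
"""
import argparse


def parse_args(argv=None):
    """Parse command-line arguments; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description="Describe the script here.")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Report what would happen without changing anything.",
    )
    return parser.parse_args(argv)
```

Putting purpose and usage in the module docstring keeps them visible both in the source and in anything that reads `__doc__`.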