chore(import): stop tracking real family PII canonical artifacts
The four files in tools/import-normalizer/out/ contain real names, addresses, and attribution prose for ~163 living and deceased family members and were committed by mistake. They are now removed from the index (kept on disk for local development) and gitignored.

The canonical artifacts are produced locally by the Python normalizer and synced into IMPORT_HOST_DIR out-of-band alongside the PDFs. The contract between normalizer and importer is the header schema, not the file contents: CanonicalSheetReader fails closed on a missing header, which is what locks the contract.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -42,6 +42,12 @@ built) transforms the raw xlsx + person register into a clean canonical dataset
re-run. The Java importer is adjusted to consume the canonical contract in a later **Phase 2**.
See the spec for the full contract.

The canonical artifacts themselves (the `out/` files) are **produced locally and not
version-controlled** — they contain real family PII. They are synced onto the ops host's
`IMPORT_HOST_DIR` alongside the PDFs, out-of-band. The contract is the header schema in
`02-normalization-spec.md` §6, not any particular file in `out/`. See ADR-025 for the full
rationale.
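
As a rough illustration of what "the contract is the header schema" means in practice, the sketch below shows a reader that fails closed when a required header is missing. It is not the project's `CanonicalSheetReader`; the CSV format, the column names in `REQUIRED_HEADERS`, and the names `read_canonical_rows` and `MissingHeaderError` are all hypothetical, chosen only for illustration. The real header list lives in the spec.

```python
import csv
import io

# Hypothetical column names; the real schema is defined in the spec, not here.
REQUIRED_HEADERS = ("person_id", "display_name")


class MissingHeaderError(ValueError):
    """Raised when a canonical file lacks a required header column."""


def read_canonical_rows(text: str) -> list[dict[str, str]]:
    """Parse CSV text, refusing to return any rows if a required header is absent."""
    reader = csv.DictReader(io.StringIO(text))
    present = reader.fieldnames or []  # None for empty input
    missing = [name for name in REQUIRED_HEADERS if name not in present]
    if missing:
        # Fail closed: reject the whole file rather than import partial data.
        raise MissingHeaderError(f"missing required header(s): {', '.join(missing)}")
    return list(reader)
```

Because the check depends only on the header row, any locally produced file with the right headers satisfies it, which is why no particular file needs to be tracked in the repository.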
## Status board

| ID | Issue | Severity | Status |