cleanup(ocr): use %n instead of \n in TrainingDataExportService format string #474

Open
opened 2026-05-07 17:29:29 +02:00 by marcel · 0 comments
Owner

Context

SpotBugs (BAD_PRACTICE category) reports:

VA_FORMAT_STRING_USES_NEWLINE
location: org/raddatz/familienarchiv/ocr/TrainingDataExportService.java:28
short:    Format string should use %n rather than \n

The export service writes training data with String.format("...\n", ...) (or similar). CI runs on ubuntu-latest, so this is fine today, but the project also has contributors on Windows (frontend/Dockerfile mentions cross-platform use). On Windows, the resulting file ends up with mixed line endings if the hard-coded \n is combined with native writers that emit \r\n.

Pure code-hygiene fix. Smallest issue in the audit; bundling for completeness.

Approach

Replace \n in String.format / printf / Formatter calls with %n. %n resolves to the platform line separator, so it matches what BufferedWriter.newLine() emits into the same file.

// before
String line = String.format("text\t%s\n", training.getValue());

// after
String line = String.format("text\t%s%n", training.getValue());
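To illustrate the difference between the two variants, here is a minimal, self-contained sketch. The class and method names are hypothetical and not taken from TrainingDataExportService; only the format strings mirror the before/after snippet above.

```java
// Hypothetical demo class; not part of the familienarchiv code base.
class LineSeparatorDemo {

    // "before" variant: always ends with a single LF, on every platform.
    static String withEscapedNewline(String value) {
        return String.format("text\t%s\n", value);
    }

    // "after" variant: ends with the platform line separator
    // (LF on Linux/macOS, CRLF on Windows).
    static String withPlatformNewline(String value) {
        return String.format("text\t%s%n", value);
    }

    public static void main(String[] args) {
        String fixed = withEscapedNewline("abc");
        String platform = withPlatformNewline("abc");

        // The \n variant never contains a carriage return.
        System.out.println("escaped ends with LF only: "
                + (fixed.endsWith("\n") && !fixed.contains("\r")));

        // The %n variant matches System.lineSeparator().
        System.out.println("%n matches platform separator: "
                + platform.endsWith(System.lineSeparator()));
    }
}
```

On a Linux runner both variants produce identical bytes, which is why the finding has no visible effect in CI today.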

If the file format is meant to use Unix line endings regardless of platform (likely, since training tools are typically Unix-first), then leave \n and suppress the SpotBugs finding with a @SuppressFBWarnings("VA_FORMAT_STRING_USES_NEWLINE") annotation plus a one-line comment explaining the intent. Either fix is correct; the point is to make the choice explicit.
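If the Unix-only route is chosen, the intent could also be pinned with a small check on the generated output. The sketch below is an assumption about how such a check might look; the class and helper names are hypothetical, and the sample line only mirrors the format string from the snippet above.

```java
// Hypothetical helper; not part of the existing test suite.
class LineEndingCheck {

    // Returns true if the text contains no carriage return,
    // i.e. every line ending is a bare LF.
    static boolean hasOnlyUnixLineEndings(String text) {
        return text.indexOf('\r') < 0;
    }

    public static void main(String[] args) {
        // Same shape as the "before" snippet: \n is kept on purpose.
        String line = String.format("text\t%s\n", "abc");
        System.out.println("unix-only: " + hasOnlyUnixLineEndings(line));
    }
}
```

A check like this would fail on Windows if someone later switched the format string to %n, which makes the decision visible in the test run rather than only in a comment.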

Critical files

  • backend/src/main/java/org/raddatz/familienarchiv/ocr/TrainingDataExportService.java (line ~28)
  • backend/spotbugs-exclude.xml if you go the suppression route (created in the new devops(ci) SAST issue)

Verification

  1. SpotBugs re-run: VA_FORMAT_STRING_USES_NEWLINE finding disappears.
  2. Existing tests pass — ./mvnw test -Dtest='*TrainingDataExport*'.

Acceptance criteria

  • SpotBugs no longer flags this line.
  • Either: format string uses %n (platform line ending) or an explicit suppression with rationale exists for keeping \n (Unix-required output format).

Effort

XS — 5 minutes.

Risk if not addressed

None operationally. Cleanup-only; bundled with the audit roadmap for completeness.

Tracked in audit doc as F-37 (Low) — new dynamic finding from SpotBugs SAST.

marcel added the P3-later and cleanup labels 2026-05-07 17:29:42 +02:00
Reference: marcel/familienarchiv#474