fix(security): validate file upload MIME type from magic bytes, not client header #84
Security Issue — HIGH
Found in:
backend/src/main/java/org/raddatz/familienarchiv/controller/DocumentController.java (lines ~122–143)
The vulnerable pattern
file.getContentType() reads the Content-Type value from the multipart HTTP header, which the client controls entirely. An attacker can upload an HTML file containing <script>alert(document.cookie)</script> with the header Content-Type: image/jpeg, and it passes the check. The file is stored in MinIO and later served back to users.
Attack path:
- The attacker uploads evil.html with Content-Type: image/jpeg in the multipart header.
- If the file is later served with Content-Type: text/html (or if the stored content-type is later lost), the browser executes the script: stored XSS via file upload.
The fix
Use Apache Tika to detect the real MIME type from the file's magic bytes, independent of what the client claims. Move the validation into FileService so it cannot be bypassed at the controller level.
Add dependency to backend/pom.xml:
FileService.java:
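A minimal sketch of what a service-level check could look like, under stated assumptions. UploadTypeValidator, requireAllowedType and InvalidFileTypeException are hypothetical names, not the project's actual API. The content-based detector is injected as a function so the sketch compiles with the JDK alone; in the fix proposed here that role would be filled by Apache Tika (for example via its Tika.detect(InputStream) facade method).

```java
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical service-level check; all names are illustrative. */
final class UploadTypeValidator {

    /** Allowlist taken from the issue text. */
    static final Set<String> ALLOWED =
            Set.of("application/pdf", "image/jpeg", "image/png", "image/tiff");

    /** Thrown on rejection; a web layer could map this to HTTP 400. */
    static final class InvalidFileTypeException extends RuntimeException {
        InvalidFileTypeException(String detected) {
            super("INVALID_FILE_TYPE: detected " + detected);
        }
    }

    // Content-based detector (assumption: backed by Apache Tika in the real fix).
    private final Function<byte[], String> detector;

    UploadTypeValidator(Function<byte[], String> detector) {
        this.detector = detector;
    }

    /**
     * Returns the detected MIME type if it is on the allowlist.
     * The client-claimed type is accepted only to make explicit
     * that it plays no part in the decision.
     */
    String requireAllowedType(byte[] content, String clientClaimedType) {
        String detected = detector.apply(content);
        // Set.of(...) rejects null lookups, so guard against a null result first.
        if (detected == null || !ALLOWED.contains(detected)) {
            throw new InvalidFileTypeException(detected);
        }
        return detected; // store this value, not the client's header
    }

    public static void main(String[] args) {
        // Stand-in detector for the demo: recognises only the PDF signature.
        Function<byte[], String> standIn = bytes ->
                new String(bytes, StandardCharsets.ISO_8859_1).startsWith("%PDF-")
                        ? "application/pdf"
                        : "application/octet-stream";
        UploadTypeValidator validator = new UploadTypeValidator(standIn);

        byte[] html = "<script>alert(document.cookie)</script>"
                .getBytes(StandardCharsets.UTF_8);
        try {
            // The client claims image/jpeg, but the bytes decide.
            validator.requireAllowedType(html, "image/jpeg");
            System.out.println("accepted");
        } catch (InvalidFileTypeException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Returning the detected type lets the caller persist it as the stored content type, so the value later sent on download never originates from the client.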
Remove the duplicate content-type check from DocumentController; validation now lives in the service.
Why
Magic bytes are the first few bytes of a file that identify its real format. They are part of the content, so they cannot be changed without altering the file itself. A PDF always starts with %PDF, a JPEG with FF D8 FF. The HTTP Content-Type header is just a string the client supplies; it has no binding relationship to the actual bytes.
Detection in CI
Priority
HIGH — exploitable without authentication bypass, leads to stored XSS.
Audit confirmation (2026-05-07)
Pre-prod audit confirms this is still present.
backend/src/main/java/org/raddatz/familienarchiv/filestorage/FileService.java:46-66 reads the client-supplied Content-Type header from MultipartFile without server-side magic-byte detection. DocumentController.java:179-180 validates against an allowlist (application/pdf, image/jpeg, image/png, image/tiff) but trusts the header.
Recommended fix: Apache Tika
Tika reads the magic bytes (PDF: %PDF-, JPEG: FF D8 FF, PNG: 89 50 4E 47, TIFF: 49 49 2A 00 / 4D 4D 00 2A) and ignores the client-sent header. Pair with Content-Disposition: attachment and the X-Content-Type-Options: nosniff header (already set by Spring Security defaults on /api/*) to prevent browser MIME sniffing on download.
Suggested AC tightening
- .exe renamed to evil.pdf with Content-Type: application/pdf → rejected with 400 + INVALID_FILE_TYPE code.
Tracked in audit doc as F-11 (High). See docs/audits/2026-05-07-pre-prod-architectural-review.md.
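
To make the signatures listed under the recommended fix concrete, here is a JDK-only sketch that matches a file's leading bytes against them. MagicByteSniffer is a hypothetical helper written for illustration; it is not how Tika is implemented and is not meant to replace it, since a real detector covers many more formats and edge cases.

```java
import java.util.Arrays;
import java.util.Optional;

/** Hypothetical JDK-only sniffer for the four allowlisted formats. */
final class MagicByteSniffer {

    private MagicByteSniffer() {}

    // Signatures as listed in the audit note.
    private static final byte[] PDF = {'%', 'P', 'D', 'F', '-'};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] PNG = {(byte) 0x89, 0x50, 0x4E, 0x47};
    private static final byte[] TIFF_LE = {0x49, 0x49, 0x2A, 0x00}; // "II*\0"
    private static final byte[] TIFF_BE = {0x4D, 0x4D, 0x00, 0x2A}; // "MM\0*"

    /** Returns the MIME type implied by the leading bytes, or empty if none match. */
    static Optional<String> detect(byte[] content) {
        if (startsWith(content, PDF)) return Optional.of("application/pdf");
        if (startsWith(content, JPEG)) return Optional.of("image/jpeg");
        if (startsWith(content, PNG)) return Optional.of("image/png");
        if (startsWith(content, TIFF_LE) || startsWith(content, TIFF_BE)) {
            return Optional.of("image/tiff");
        }
        return Optional.empty();
    }

    private static boolean startsWith(byte[] content, byte[] prefix) {
        return content.length >= prefix.length
                && Arrays.equals(content, 0, prefix.length, prefix, 0, prefix.length);
    }

    public static void main(String[] args) {
        // A Windows executable starts with "MZ" (4D 5A). Renaming it to
        // evil.pdf and claiming application/pdf changes neither byte.
        byte[] renamedExe = {0x4D, 0x5A, (byte) 0x90, 0x00};
        byte[] realPdf = {'%', 'P', 'D', 'F', '-', '1', '.', '7'};

        System.out.println("evil.pdf allowed: " + detect(renamedExe).isPresent());
        System.out.println("real PDF allowed: " + detect(realPdf).isPresent());
    }
}
```

An empty result corresponds to the rejection case in the acceptance criterion above: nothing in the file name or the client's header can turn it into a match.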