security(import): harden DocumentBuilderFactory against XXE in MassImportService #528
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Type: Security hardening
Priority: P0-critical — prod-reachable XML parsing of attacker-influenceable input
Source: review of #526 by Nora "NullX" Steiner (security_expert) — comment #8648, finding 4
Parent PR: #526 (mass-import bind mount → first prod-reachable use)
Summary
Explicitly disable DOCTYPE declarations, external entity resolution, XInclude, and entity-reference expansion on every
DocumentBuilderFactoryin the backend, starting withMassImportService. Cover the regression with a test and a Semgrep rule.Context
MassImportServiceparses the ODS spreadsheet's inner XML viaDocumentBuilderFactory(MassImportService.java:29). With #526 merged, the mass-import endpoint becomes reachable on production for the first time — the risk profile of any XXE bug in this parser goes from "dev-only" to "prod-reachable". An attacker who controls or substitutes the ODS source (e.g. supply-chain compromise of thezzfamilienarchiv*.odsfile before it lands on the host) could exfiltrate any file the backend container can read.POI 5.5.0 is recent enough that defaults are probably safe — but "probably" is not enough for prod-reachable XML parsing of attacker-influenceable input.
Required
DocumentBuilderFactory.newInstance()call in the codebase (currently onlyMassImportService).<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/hostname">]>block and an&xxe;reference; assert the parse rejects or ignores it.Detection
Add the Semgrep rule from Nora's review comment to catch future regressions across all stacks:
Acceptance criteria
DocumentBuilderFactorycall sites harden their factory with the features aboveLinked NFRs
Definition of Ready
🤖 Generated with Claude Code during /implement on #526