security(uploads): integrate ClamAV scan before persisting documents to MinIO #464
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Context
backend/src/main/java/org/raddatz/familienarchiv/filestorage/FileService.java:46-66writes uploaded multipart files straight to MinIO without antimalware scanning. Combined with #84 (no magic-byte MIME validation) and theBODY_SIZE_LIMITbypass in@sveltejs/adapter-node(covered in the newsecurity(deps): bump @sveltejs/kit + viteissue), the upload path is the largest untrusted-input surface in the system.The OCR pipeline then loads each PDF into PyTorch / Surya / Kraken / opencv. Several CVE classes target image/PDF parsers specifically; a malicious PDF can ship an exploit payload regardless of MIME validation.
Note: this is not about email-style virus scanning of family content — it's about catching adversarial content before it reaches native image-processing code.
Approach
Add a ClamAV sidecar to the stack and call it from
FileService.persistFile()before the MinIOPutObject. Reject infected uploads with a translated 400.docker-compose.ymlBackend
depends_on: { clamav: { condition: service_healthy } }. Memory: ~2 GB resident (signature DB).backend/pom.xmlPick a maintained Java client.
xenit-eu/spring-boot-starter-clamavor the lightweightde.malkusch.clamav.ClamAvClient. The latter is a single-file dep:FileService.persistFile()app.clamav.enabled: falsefor the dev profile (skip scan locally) andtruefor prod.Streaming consideration
MultipartFile.getInputStream()is fine for ≤50 MB (the configured max). Don't read bytes into a buffer — pass the InputStream directly toclamavClient.scan(InputStream)so the file is streamed through clamd.Critical files
docker-compose.yml— addclamavservice + volumebackend/pom.xml— clamav-client depbackend/src/main/java/org/raddatz/familienarchiv/filestorage/FileService.javabackend/src/main/java/org/raddatz/familienarchiv/filestorage/ClamAvConfig.java— new (host/port/timeout config)backend/src/main/java/org/raddatz/familienarchiv/exception/ErrorCode.java—MALWARE_DETECTEDfrontend/src/lib/shared/errors.ts— mirrorfrontend/messages/{de,en,es}.json— i18n keybackend/src/main/resources/application.yaml—app.clamav.{host,port,enabled}backend/src/test/java/.../filestorage/FileServiceClamAvTest.java— new, uses EICAR test stringVerification
X5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*). Expect 400 +MALWARE_DETECTED.clamavservice. Verify upload fails closed (rejected), not fail-open. Decision: fail closed is correct here — upload is a low-frequency operation, and silently disabling AV scan because the sidecar is down is unacceptable.freshclamruns daily by default. Verify in logs that DB version increases over a week of operation.Acceptance criteria
MALWARE_DETECTEDErrorCode + translated message.app.clamav.enabled: falseworks for local dev (no sidecar required).clamav_db).Effort
M — 1-2 days. Most of the time is the EICAR test fixture + the fail-closed integration test.
Risk if not addressed
Adversarial PDFs reach native image-processing code (libvips, OpenEXR, PyTorch image decoders) where image-parser CVEs (zero-days, n-days) are a recurring class.
Tracked in audit doc as F-12 (High).
🏗️ Markus Keller — Application Architect
Observations
docs/architecture/c4/l2-containers.pumlanddocs/DEPLOYMENT.md. The currentl2-containers.pumlhas no antivirus layer; adding it without updating the diagram is a doc omission that I treat as a blocker.docs/architecture/c4/seq-document-upload.pumlmust be updated. The current sequence diagram shows the upload flow as:Frontend → Backend → PermissionAspect → DocSvc → FileSvc → MinIO. ClamAV adds a new step betweenFileSvcandMinIOthat the diagram must reflect, along with the rejection path.FileService(currently S3 only), a 2 GB memory addition to the stack, a fail-closed availability constraint, and a decision that ClamAV is the right tool over alternatives (magic-byte-only validation, no AV, server-side VirusTotal API). ADRs 001–006 exist; ADR-007 should capture this. The "why ClamAV over X" reasoning in the issue body is exactly the kind of content that needs to be preserved.ClamAvConfigplacement. The issue proposes a newfilestorage/ClamAvConfig.java. That's correct — it lives in thefilestoragepackage, not inconfig/(which is for infrastructure wiring likeMinioConfig). Keep it there.app.clamav.enabledas a boolean flag. The issue proposes using@Value("${app.clamav.enabled:true}")to skip scanning in dev. This is fine as a tactical measure, but it means the dev profile silently bypasses a security control. A comment on the field explaining why this is acceptable in dev (sidecar not running locally, low-risk dev data) keeps the intent visible for future readers.filestorage/is the right home. Do not create a top-levelantivirus/package.uploadFilemethod currently reads the entire file into abyte[]for SHA-256 hashing before the S3 put. The issue's streaming note says "passInputStreamdirectly toclamavClient.scan(InputStream)" — but the file has already been fully buffered. For this iteration, scanning the samebyte[]as aByteArrayInputStreamis fine and avoids re-streaming concerns. The issue's streaming guidance is aspirationally correct but slightly misleading given the current implementation.Recommendations
docs/architecture/c4/l2-containers.pumlto add theclamavcontainer insidearchivboundary, with a relationshipFileSvc → ClamAV: "Scan before PutObject".docs/architecture/c4/seq-document-upload.pumlto show the scan step and the 400 rejection path.docs/DEPLOYMENT.mdto document the 2 GB ClamAV memory requirement and thestart_period: 5mfor signature DB initialization.MALWARE_DETECTEDtoCLAUDE.md's error handling section once theErrorCodeis added.Open Decisions
MassImportService)? The issue scopes toFileService.uploadFile(), which covers all upload paths that go throughFileService. Verify that mass import does not write files to MinIO via a different code path — if it does, that path needs the same guard.app.clamav.enabled: falsedev shortcut is acceptable long-term or if a Testcontainers-based ClamAV container in CI is planned to test the scan path end-to-end.👨💻 Felix Brandt — Fullstack Developer
Observations
uploadFilecurrently reads the full file intobyte[]. Line 47 ofFileService.java:byte[] bytes = file.getBytes(). The SHA-256 hash is computed on that buffer, then the bytes are passed toRequestBody.fromBytes(bytes). The issue says "passInputStreamdirectly toclamavClient.scan(InputStream)" — but the file is already in memory. In practice, scanning aByteArrayInputStream(bytes)is the right call here; wrapping with anew ByteArrayInputStream(bytes)before the scan keeps things clean without refactoring SHA-256.ClamAvConfigshould follow project conventions. The issue sketches@Valueinjection directly intoFileService. Better: a@ConfigurationProperties(prefix = "app.clamav")record in a newClamAvConfig.java— matching the existingMinioConfigpattern — and a@Bean ClamAvClientproduced there. This keepsFileServiceclean and makes the connection pool (host, port, timeout) configurable and testable.FileServiceis currently constructed with manual constructor injection (not@RequiredArgsConstructor) because it uses@ValueforbucketName. Adding a second@ValueforclamavEnabledis fine, but if you add aClamAvClientbean it should go through the constructor, not field injection. Prefer:FileService(S3Client, S3Presigner, String bucketName, ClamAvClient clamavClient, boolean clamavEnabled).MALWARE_DETECTEDerror code path. The issue usesDomainException.badRequest(ErrorCode.MALWARE_DETECTED, ...). That's correct per project conventions. Make sure theGlobalExceptionHandlermapsDomainExceptionto 400 forbadRequest— it does, based on the existing pattern.log.warnon infected upload is correct. Use SLF4J parameterized form:log.warn("Rejected infected upload: {} — signature: {}", file.getOriginalFilename(), result.getSignature())— already shown in the issue. Ensure the original filename is not logged unsanitized if it could contain log injection characters (strip newlines/CRLFs before logging).MALWARE_DETECTEDerror code viagetErrorMessage()once it's wired inerrors.ts. The i18n keys need to land inmessages/{de,en,es}.json.Recommendations
ClamAvConfigas a@ConfigurationPropertiesbean (not@Valuescattered inFileService). This matchesMinioConfig's pattern and makes the config class testable in isolation.FileService.uploadFile(), scan before hashing and before the S3 put — reject as early as possible. Order:scan → hash → put.file.getOriginalFilename()before logging:filename.replaceAll("[\r\n]", "_").NOTIFICATION_NOT_FOUNDtoerrors.tswhile you're in there — it's inErrorCode.javabut absent from the TypeScript mirror (unrelated debt, but easy to fix in the same PR).error_malware_detectedinmessages/de.json,messages/en.json,messages/es.json.Open Decisions
🛠️ Tobias Wendt — DevOps & Platform Engineer
Observations
minio/minio:latestis already a problem in the currentdocker-compose.yml— unpinned image, non-reproducible builds. This issue adds another service (clamav/clamav:1.4) that the issue does pin. That's good. But the existingminio:latestandaxllent/mailpit:latestshould be pinned in the same PR as hygiene, since we're in the Compose file anyway.clamav/clamav:1.4is pinned to a minor version, not a patch-pinned digest.1.4will pull different images over time asclamav/clamav:1.4is a mutable tag. Pin toclamav/clamav:1.4.2(or whatever the current stable patch is) and add to Renovate config. The security team accepting an unpinned AV container is ironic.start_period: 5mis appropriate — the signature database download on first boot genuinely takes 3–5 minutes. Document this in a comment next to the healthcheck so operators know why deploys appear slow on first run.clamav_dbis specified in the issue. The current Compose file uses bind mounts for Postgres (./data/postgres) and MinIO (./data/minio) — both of which are flagged in my standard review. The ClamAV volume correctly uses a named volume. Staying consistent: addclamav_dbto thevolumes:section at the bottom of the Compose file.backend: depends_on: { clamav: { condition: service_healthy } }is fail-closed at startup — correct. But note that once the backend is running, if ClamAV crashes, the backend will not restart automatically. The issue correctly states uploads should fail closed (503) when ClamAV is down. Verify the Java client throws a catchable exception on connection refused that can be mapped to a 503.~2 GBresident. Current stack on CX32 (8 GB RAM): Postgres ~300 MB, MinIO ~200 MB, OCR service ~5 GB, backend ~512 MB, frontend ~256 MB. Adding 2 GB for ClamAV brings estimated total to ~8.3 GB — over the CX32 limit. The OCR service alone is why the VPS is already at the edge. This needs a sizing conversation before deploying to production.expose: ["3310"]is correct — clamd listens on 3310, no external port needed. Do not add aports:mapping.freshclamruns daily inside the ClamAV container by default. No additional configuration needed; verify withdocker logs archive-clamavafter a week. The issue's verification step #4 is correct.app.clamav.enabled: falsein dev, the CI pipeline also skips the scan. That means the fail-closed behavior (503 when clamd is unreachable) is only manually tested. At minimum, adocker-compose.ci.ymloverlay could start a real ClamAV container for the integration test suite.Recommendations
clamav/clamav:1.4to a patch-pinned tag (e.g.,clamav/clamav:1.4.2) and add it to the Renovate config so it gets automatic version bump PRs.# ClamAV downloads the virus signature DB on first boot — allow 5 minutes before marking healthy.clamav_db:to thevolumes:section at the bottom ofdocker-compose.yml.8gfrom12gto make room. At the current estimate, the stack exceeds available RAM with ClamAV added.CLAMAV_SERVICE_UNAVAILABLErunbook entry: "If uploads return 503, checkdocker logs archive-clamav. If freshclam failed to download signatures, restart the container."Open Decisions
📋 Elicit — Requirements Engineer
Observations
throw DomainException.badRequest(...)with a translated 400) but the user-facing string is not specified. "Datei wurde durch Virusscan abgewiesen" is a developer default. The actual translated message keyerror_malware_detectedneeds to be defined in all three message files (de/en/es). The German string in the issue is functional; the English and Spanish equivalents need to be written.freshclamruns daily. There is no requirement stating how stale a signature database is acceptable before uploads should be rejected. If freshclam fails for 7 days, the database is 7 days stale — is that acceptable? This is a deliberate gap the issue leaves open, but it should be a tracked decision.Recommendations
error_malware_detectedin de/en/es. Suggested text: de: "Die Datei wurde vom Virusscan abgewiesen.", en: "The file was rejected by the antivirus scan.", es: "El archivo fue rechazado por el análisis antivirus."Open Decisions
freshclambut no alert threshold. This is a DevOps/Ops question, but the requirement boundary should be noted.🔐 Nora "NullX" Steiner — Security Engineer
Observations
DocumentController(lines 179/188–190) checksfile.getContentType(), which comes from theContent-Typemultipart field — a client-controlled value. An attacker can setContent-Type: application/pdfon an ELF binary. Magic-byte validation (Apache Tika or similar) closes this gap. ClamAV and magic-byte validation are complementary, not substitutes. This PR should note that #84 remains open.app.clamav.enabled: falsein dev is a security control bypass — the code path that rejects malicious files is not exercised in the dev profile. This is architecturally acceptable for local development but means the integration tests (which also run withapp.clamav.enabled: falseif they inherit the dev profile) never verify the scan path. The EICAR test must run with ClamAV enabled — either via a Testcontainers ClamAV or a dedicated integration test profile.de.malkusch.clamav.ClamAvClientthrowsClamAvException(or similar) on connection failure.FileServicemust catch that and throw a 503, not swallow it and allow the upload. This must be tested explicitly.scan → hash → putorder matters. CurrentlyuploadFiledoes:getBytes() → sha256Hex(bytes) → putObject. The scan should happen before SHA-256 — reject before doing any work. Correct order:getBytes() → scan(bytes) → sha256Hex(bytes) → putObject.Eicar-Signature) is logged so it's auditable. Do not log the file content or the full original filename without sanitization (strip newlines).Recommendations
application-test-clamav.yaml) whereapp.clamav.enabled: trueand a Testcontainers ClamAV container is started. The EICAR test runs only in this profile.@Tag("clamav-integration")to exclude from default test run.ClamAvClientto throwIOExceptiononscan(), assert thatFileService.uploadFile()throws aDomainExceptionmapping to 503, not that it succeeds silently.clamavEnabledfield explaining the threat model:// When false (dev profile): ClamAV sidecar not required. Never set false in production — adversarial PDFs reach OCR image parsers without this check.Open Decisions
CLEAN,INFECTED,ERROR. The fail-closed recommendation means treatERRORas rejected — confirm this is the intended behavior and test it.🧪 Sara Holt — QA Engineer
Observations
FileServiceClamAvTest.java— a new test class for the EICAR fixture. This is the right layer: unit test with a mocked or real ClamAV client testing theFileServicescan integration. The test class name is well-chosen.FileServiceTest.javagives us the test structure to follow:MALWARE_DETECTEDclamavEnabled: false→ scan skipped, upload proceedsFileServiceTest.javauses manual mocking (mock(S3Client.class)) not@ExtendWith(MockitoExtension.class). The new ClamAV test should use@ExtendWith(MockitoExtension.class)for consistency with the project's test style guide.src/test/resources/eicar.txtfile and read it in the test:Files.readAllBytes(Path.of(getClass().getResource("/eicar.txt").toURI())).ClamAvClient.scan()to throwIOExceptiondirectly in the unit test; (b) start a real ClamAV container via Testcontainers and stop it mid-test (fragile). Option (a) is the correct unit-test approach; the Testcontainers approach belongs in an@Tag("clamav-integration")test.MALWARE_DETECTEDerror code flows through the existing error-handling pipeline. Existing frontend error handling tests cover thegetErrorMessage()dispatch table; just add the new case.clamavEnabled: falseshould use@TestPropertySource(properties = "app.clamav.enabled=false")or constructFileServicewith the flag set tofalsedirectly in the unit test constructor call. The latter is simpler and avoids Spring context loading.FileService.uploadFile()currently has test coverage viaFileServiceTest. Adding the scan path increases branch count — the new branches (infected, error, disabled) all need coverage to maintain the 88% JaCoCo gate.Recommendations
src/test/resources/eicar.txtand load it viagetClass().getResource(). Addeicar.txtto.gitignoreif the Gitea instance flags it (unlikely for self-hosted, but worth checking).FileServiceClamAvTest.javaas a pure unit test (@ExtendWith(MockitoExtension.class),@Mock ClamAvClient clamavClient,@InjectMocks FileService fileService). Four tests: EICAR rejected, clean accepted, client throws IOException → 503 mapped, enabled=false skips scan.@Tag("clamav-integration")integration test class that starts a real ClamAV container via Testcontainers and runs the EICAR test end-to-end. This gives confidence that the Java client actually talks to clamd correctly — the unit test mocks the client, so real protocol behavior is only verified here.Open Decisions
🎨 Leonie Voss — UI/UX Design Lead
Observations
getErrorMessage(). TheMALWARE_DETECTEDcase must produce a message that:Recommendations
"Die Datei konnte nicht hochgeladen werden, da sie möglicherweise schädliche Inhalte enthält. Bitte wende dich an den Administrator, falls du das für einen Fehler hältst.""The file could not be uploaded because it may contain harmful content. Please contact the administrator if you believe this is a mistake.""El archivo no pudo cargarse porque puede contener contenido dañino. Ponte en contacto con el administrador si crees que esto es un error."Open Decisions
🗳️ Decision Queue — Open Items Requiring Human Judgment
Grouped by theme. All items below are genuine tradeoffs where a clear winner does not exist without product/ops input.
Infrastructure / VPS Sizing
Raised by: Tobias
The CX32 VPS (8 GB RAM) is already near capacity: OCR service alone is ~5 GB, backend ~512 MB, Postgres ~300 MB, MinIO ~200 MB, frontend ~256 MB. ClamAV adds ~2 GB. Estimated total: ~8.3 GB — over the hardware limit.
Decision needed: Before Production v1, choose one:
mem_limitfrom12gto8g(acceptable if no concurrent large OCR jobs), freeing ~4 GB.Testing Strategy: EICAR in CI vs. Dev-Only Bypass
Raised by: Nora, Sara, Markus
app.clamav.enabled: falsein the dev profile means the scan path is never exercised by CI unless a dedicated profile is set up.Decision needed: Choose one:
@Tag("clamav-integration")test suite that CI runs on PR. This verifies real protocol behavior but adds ~2–3 minutes to CI and requires Docker-in-Docker or equivalent.ClamAvClient), accept that real clamd protocol behavior is validated only by manual EICAR verification on first deploy. Simpler, faster, but leaves a gap.ClamAV Indeterminate Scan State
Raised by: Nora
The
de.malkusch.clamav.ClamAvClientlibrary (and most clamd clients) distinguishes three result states:CLEAN,INFECTED,ERROR. The issue only explicitly handlesINFECTED(reject) and connection failure (fail-closed 503). AnERRORstate (clamd responded but couldn't complete the scan — e.g., corrupted stream) is not specified.Decision needed: For the
ERRORstate (clamd reachable but scan returned error):INFECTED— reject the upload (fully fail-closed). Safest.CLEAN— allow upload (fail-open). Not recommended; contradicts the fail-closed decision.Recommendation: Option (A) is consistent with the issue's stated fail-closed intent.
Mass Import Upload Path Coverage
Raised by: Markus
The issue scopes the scan to
FileService.uploadFile().MassImportServicealso stores files, but does it route throughFileService.uploadFile()or write directly to S3? The codebase showsMassImportServicecallsfileService.uploadFile()for file persistence — so the scan will cover that path automatically. Verify during implementation: if any code path callss3Client.putObject()directly (bypassingFileService), it needs its own scan call.Decision needed: Confirm during implementation that no direct
s3Clientcalls exist outsideFileService. If found, decide whether to add scan there or refactor to route throughFileService.Signature Staleness Alert Threshold
Raised by: Elicit
freshclamruns daily by default. No requirement specifies what happens if freshclam fails for an extended period (signatures become stale).Decision needed: Define an acceptable staleness window:
For a family archive, Option (A) is likely sufficient.