4GB headroom in the container. Doubling batches should use ~2GB more RAM but significantly speed up inference. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
5.6 KiB
5.6 KiB