Two workers × ~5 GB Surya model load = ~10 GB required, exceeding the 8 GB memory cap and causing OOM on the first /train call. Two OS processes also cause model-state divergence after training, contradicting the single-node constraint documented in ADR-001. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
773 B
773 B