fix(segtrain): reduce input height to 800px on first run to avoid OOM

ketos segtrain has no batch-size flag (-B), so with the default 1800px input height the intermediate CNN feature maps consume ~500 MB+ per image, causing the kernel OOM-killer (exit -9) to terminate the process.

On first run (no existing blla.mlmodel), override the VGSL spec to use 800px height instead. Subsequent runs load the saved model with --resize both, preserving incremental fine-tuning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
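As a rough illustration of why the input height matters, the sketch below estimates the size of the first convolution's output alone (`Cr7,7,64,2,2`: 64 channels, stride 2 in both directions) for one float32 image. The 7:10 page aspect ratio, the assumption that width scales with height, and the helper name `first_conv_bytes` are all hypothetical. The estimate ignores every later layer, gradients, and framework overhead, so it is not expected to reproduce the totals quoted in the commit message.

```python
def first_conv_bytes(height, channels=64, stride=2, bytes_per_value=4):
    """Approximate float32 size of the first conv layer's output for one image.

    Assumes (hypothetically) a page whose width is 7/10 of its height and
    that the stride-2 convolution roughly halves both spatial dimensions.
    """
    width = height * 7 // 10          # assumed aspect ratio, integer math
    out_h = height // stride
    out_w = width // stride
    return out_h * out_w * channels * bytes_per_value


for h in (1800, 800):
    print(f"{h}px input: {first_conv_bytes(h) / 1e6:.1f} MB in the first feature map")
```

Under these assumptions the activation size scales with the square of the height, because the width shrinks along with it.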
@@ -465,7 +465,15 @@ async def segtrain_model(
         "-N", "10",
     ]
     if os.path.exists(blla_model_path):
-        cmd += ["-i", blla_model_path]
+        cmd += ["-i", blla_model_path, "--resize", "both"]
+    else:
+        # No pretrained model — train from scratch with reduced height (800px)
+        # to keep peak RAM under ~200 MB on CPU (default 1800px uses ~500 MB+)
+        cmd += [
+            "-s",
+            "[1,800,0,3 Cr7,7,64,2,2 Gn32 Cr3,3,128,2,2 Gn32 Cr3,3,128 Gn32 "
+            "Cr3,3,256 Gn32 Cr3,3,256 Gn32 Lbx32 Lby32 Cr1,1,32 Gn32 Lby32 Lbx32]",
+        ]
     cmd += xml_files
 
     log.info("Running: %s", " ".join(cmd[:5]) + " ...")
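The branch added in this hunk can be read in isolation as a small function. The following is a minimal sketch, not the project's actual code: the helper name `model_args` and the injectable `exists` parameter are hypothetical, introduced only so the two cases can be exercised without touching the filesystem. The argument lists and the VGSL spec are copied from the diff.

```python
import os

# VGSL spec copied from the diff; the input height is 800 instead of the
# 1800px default mentioned in the commit message.
REDUCED_HEIGHT_SPEC = (
    "[1,800,0,3 Cr7,7,64,2,2 Gn32 Cr3,3,128,2,2 Gn32 Cr3,3,128 Gn32 "
    "Cr3,3,256 Gn32 Cr3,3,256 Gn32 Lbx32 Lby32 Cr1,1,32 Gn32 Lby32 Lbx32]"
)


def model_args(blla_model_path, exists=os.path.exists):
    """Hypothetical helper: extra segtrain arguments for this run."""
    if exists(blla_model_path):
        # A saved model is present: continue training from it.
        return ["-i", blla_model_path, "--resize", "both"]
    # First run: train from scratch with the reduced-height spec.
    return ["-s", REDUCED_HEIGHT_SPEC]


# Simulate both cases without creating any files.
print(model_args("blla.mlmodel", exists=lambda p: True))
print(model_args("blla.mlmodel", exists=lambda p: False)[0])
```

Passing `exists` as a parameter is purely a testing convenience here; the committed code calls `os.path.exists` directly.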