Show HN: Keep your PyTorch model in VRAM by hot swapping code

Published on: 2025-08-18 19:21:27

Training Hot Swap

This is an example of how to hot-swap PyTorch training code without unloading your model weights from VRAM. For large LLMs it can take upwards of 30 seconds to load a model from disk into VRAM, and waiting 30 seconds every time you want to rerun your script slows down development. This is a barebones implementation of a method that keeps large models in VRAM even after your training script exits. If a model reload is necessary, it happens in the background after exit, so the model is ready immediately the next time your script runs.

This works by spawning a second process that stays alive after your target script exits. The script you change is not run directly; instead, this background process runs the code on your behalf using Python's eval().

This can also be used over a VPN for remote code execution. IntelliJ's remote SSH interpreter is quite buggy and not ideal for seamless remote development. Configure model_server.py to run on a remote machine, and run c ...
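The excerpt cuts off before the server code, but the mechanism is easy to sketch. Below is a minimal, hypothetical version of the idea: a long-lived process loads the model once, then executes edited training scripts on request, so the weights never leave VRAM between runs. The port, the model choice, and the wire format are all assumptions for illustration, not the repo's actual model_server.py. The post mentions eval(); running a whole script file maps more naturally onto compile() plus exec(), which this sketch uses.

    # Hypothetical sketch of the resident server, not the repo's model_server.py.
    # Assumptions: a Hugging Face "gpt2" stand-in model and local TCP port 5555.
    import socket
    import traceback

    import torch
    from transformers import AutoModelForCausalLM

    HOST, PORT = "127.0.0.1", 5555  # assumed address; the real server may differ

    def main() -> None:
        # Load once; the weights stay in VRAM for the life of this process.
        device = "cuda" if torch.cuda.is_available() else "cpu"
        model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

        with socket.create_server((HOST, PORT)) as server:
            while True:
                conn, _ = server.accept()
                with conn:
                    # The client sends the path of the freshly edited script.
                    path = conn.recv(4096).decode()
                    try:
                        source = open(path).read()
                        # Run the new code with the resident model in scope,
                        # instead of launching a fresh Python process.
                        exec(compile(source, path, "exec"), {"model": model})
                        conn.sendall(b"ok")
                    except Exception:
                        conn.sendall(traceback.format_exc().encode())

    if __name__ == "__main__":
        main()

On the client side, rerunning your training code then means telling the server to execute it rather than invoking python train.py directly:

    # Hypothetical client: ask the resident server to run the edited script.
    import socket

    with socket.create_connection(("127.0.0.1", 5555)) as conn:
        conn.sendall(b"train.py")
        print(conn.recv(65536).decode())

Note that this is remote code execution by design: anything that can reach the port can run arbitrary code alongside the model, which is why the post treats the VPN as the trust boundary.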