Reserve real H100s, B200s, or a CPU box, run your code, and tear it all down, all in a single context manager. Pre-warmed pods mean you get a shell before your terminal finishes scrolling.
No tickets, no Slack pings, no idle instances you forgot to kill. The
sandbox lives exactly as long as your with block.
Pick a GPU type and count. Defaults are sane; everything is a keyword arg.
A pre-booted pod with sshd already running is handed to you — your key injected on the fly.
Call .exec(), .upload(), or .download(), or just SSH in like any box.
Leaving the block cancels the reservation. Persistent disks survive if you asked for one.
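The reserve, claim, use, release steps above map onto Python's context-manager protocol. The sketch below is not the gpu-dev client: `FakeSandbox`, its `gpu_type` and `gpu_count` arguments, and the `events` log are hypothetical stand-ins, and only the method name `exec` is borrowed from the text. It shows why the release step still runs when the code inside the block fails.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSandbox:
    """Hypothetical stand-in for a reserved sandbox; not the real gpu-dev client."""

    gpu_type: str = "h100"  # assumed keyword argument
    gpu_count: int = 1      # assumed keyword argument
    events: list = field(default_factory=list)

    def __enter__(self):
        # Reserve + claim: a real service would hand over a pod here.
        self.events.append(f"reserve {self.gpu_count}x{self.gpu_type}")
        return self

    def exec(self, command: str) -> str:
        # Use: a real client would run the command on the pod; we only record it.
        self.events.append(f"exec {command}")
        return command

    def __exit__(self, exc_type, exc, tb):
        # Release: runs on normal exit and on exceptions alike.
        self.events.append("release")
        return False  # do not swallow exceptions raised in the body


def run_demo() -> list:
    """Simulate a job that crashes inside the block and return the event log."""
    sandbox = FakeSandbox(gpu_type="h100", gpu_count=2)
    try:
        with sandbox as sb:
            sb.exec("nvidia-smi")
            raise RuntimeError("training crashed")
    except RuntimeError:
        pass
    return sandbox.events
```

Calling `run_demo()` returns the log with `release` as its last entry even though the body raised, which is the property that keeps forgotten instances from lingering.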
We keep a standby pool of booted pods per GPU type. Reserving claims one instead of building it from scratch — no scheduling, no image pull, no sshd boot on your critical path.
The classic path: place the pod, pull the image, boot the SSH daemon, wait for the readiness probe. Fine — but you feel it.
The pod is already running. Claiming it is a label swap and an
authorized_keys append. That's the whole story.
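To make the "label swap and an authorized_keys append" concrete, here is a toy in-memory model of a per-GPU-type standby pool. Every name in it (`Pod`, `WarmPool`, the `state` and `owner` labels) is an assumption made for illustration; the actual scheduler and label schema are not described in the text.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical record for an already-booted pod."""

    name: str
    gpu_type: str
    labels: dict = field(default_factory=lambda: {"state": "standby"})
    authorized_keys: list = field(default_factory=list)


class WarmPool:
    """Toy model of a standby pool keyed by GPU type (in-memory only)."""

    def __init__(self):
        self._standby = defaultdict(deque)

    def add(self, pod: Pod) -> None:
        self._standby[pod.gpu_type].append(pod)

    def claim(self, gpu_type: str, owner: str, public_key: str) -> Pod:
        queue = self._standby[gpu_type]
        if not queue:
            # A real system would presumably fall back to the cold path here.
            raise LookupError(f"no standby pod for {gpu_type}")
        pod = queue.popleft()
        # The claim is two mutations: relabel the pod and append the user's key.
        pod.labels.update({"state": "claimed", "owner": owner})
        pod.authorized_keys.append(public_key)
        return pod

    def standby_count(self, gpu_type: str) -> int:
        return len(self._standby[gpu_type])
```

Nothing in `claim` schedules, pulls, or boots anything, which is why the warm path stays off the critical path; the cost moves to whoever keeps the pool topped up.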
Every sandbox is a genuine GPU node on our cluster — full driver stack, NVLink, EFA, the works.
T4, L4, A10G, RTX PRO 6000, A100, H100, B200, B300 — single card up to multi-node NVLink fabrics.
Need a sliver? Reserve h100-mig-1g and share a card. Pre-warmed and instant too.
Name a disk once; it follows you across sandboxes. Clone it for parallel experiments in seconds.
pip install gpu-dev gives you both gpu-dev reserve and the Python GpuDev() client.
Pass spot=True for ~70% off. Checkpoints land on a persistent disk that survives reclaim.
A failing PyTorch job is one reserve away from a live shell on a matching box. soon: 1-command
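The spot-pricing note above relies on checkpoints that outlive a reclaim. A minimal sketch of that resume pattern follows, using only the standard library; a temporary directory stands in for the persistent disk, and the function names and the `step-N.json` file layout are assumptions rather than anything gpu-dev prescribes.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def latest_step(ckpt_dir: Path) -> int:
    """Return the highest saved step, or 0 if nothing has been saved yet."""
    steps = [int(p.stem.split("-")[1]) for p in ckpt_dir.glob("step-*.json")]
    return max(steps, default=0)


def save_checkpoint(ckpt_dir: Path, step: int, state: dict) -> None:
    # Write to a temp file, then rename, so an interrupted write
    # never leaves a half-written checkpoint under the final name.
    tmp = ckpt_dir / f"step-{step}.json.tmp"
    tmp.write_text(json.dumps(state))
    tmp.replace(ckpt_dir / f"step-{step}.json")


def train(ckpt_dir: Path, total_steps: int, stop_after: Optional[int] = None) -> int:
    """Resume from the last checkpoint; optionally stop early to mimic a reclaim."""
    start = latest_step(ckpt_dir)
    for step in range(start + 1, total_steps + 1):
        save_checkpoint(ckpt_dir, step, {"step": step})
        if stop_after is not None and step == stop_after:
            break  # simulated spot reclaim
    return latest_step(ckpt_dir)


def demo() -> tuple:
    """Run once with a simulated reclaim at step 4, then resume to completion."""
    with tempfile.TemporaryDirectory() as d:
        ckpt = Path(d)
        first = train(ckpt, total_steps=10, stop_after=4)
        second = train(ckpt, total_steps=10)
        return first, second
```

The second `train` call picks up at step 5 because it reads the directory rather than any in-process state, which is the same reason a fresh sandbox mounting the same disk could continue the job.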