One visual language across the whole CLI, in the Freesolo website palette
(navy #1b1b4b, periwinkle #5f72ff, green #57ff8f): a brand header, colored
status badges, aligned tables, key/value panels, and syntax-highlighted JSON.
Like the website, it ships light and dark variants, auto-detected from the
terminal background or forced with FLASH_THEME, and each command below shows
both. The themed view renders only on an interactive terminal; piped or
scripted output stays byte-for-byte plain, so jq, scripts, and the agent
contract are untouched.
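The rule in the paragraph above (themed on a terminal, plain when piped, theme forced by FLASH_THEME or auto-detected) can be sketched roughly as follows. This is an illustration, not the CLI's actual code: the function names are hypothetical, the accepted FLASH_THEME values `light` and `dark` are assumed, and the `COLORFGBG` heuristic stands in for whatever background detection flash really performs.

```python
import os
import sys


def detect_background(env):
    """Guess 'light' or 'dark' from COLORFGBG ("fg;bg"); default to dark."""
    parts = env.get("COLORFGBG", "").split(";")
    try:
        background = int(parts[-1])
    except ValueError:
        return "dark"  # variable unset or malformed
    # ANSI colors 7 and 15 are the light gray / white backgrounds.
    return "light" if background in (7, 15) else "dark"


def choose_render_mode(stream=None, env=None):
    """Return 'plain' for pipes and files, otherwise 'light' or 'dark'."""
    stream = sys.stdout if stream is None else stream
    env = os.environ if env is None else env
    if not stream.isatty():
        return "plain"  # piped or scripted: no colors, no panels
    forced = env.get("FLASH_THEME", "").strip().lower()
    if forced in ("light", "dark"):  # assumed value names
        return forced
    return detect_background(env)
```

Calling `choose_render_mode()` with no arguments inspects the real stdout and environment. Checking `isatty()` first is what keeps redirected output plain even when a theme is forced.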

flash version
package version

Plain:
flash 0.2.19

Themed (light / dark):
flash v0.2.19

flash login
verify + store your freesolo key

Plain and themed (light / dark), same text:
✓ logged in to flash
account  ml-team@acme.ai
org      org_acme
user     usr_4192
key      fs_live_8Kd2… (freesolo key)

flash whoami
identity behind the stored key

Plain and themed (light / dark), same text:
logged in to flash
account  ml-team@acme.ai
org      org_acme
user     usr_4192
key      fs_live_8Kd2… (freesolo key)

flash models
supported base models

Plain:
Qwen/Qwen3.5-0.8B
openbmb/MiniCPM5-1B
Qwen/Qwen3.5-2B
Qwen/Qwen3.5-4B
Qwen/Qwen3.5-9B

Themed (light / dark):
flash › models  supported base models
────────────────────────────────────────────────────────────────────────────
• Qwen/Qwen3.5-0.8B
• openbmb/MiniCPM5-1B
• Qwen/Qwen3.5-2B
• Qwen/Qwen3.5-4B
• Qwen/Qwen3.5-9B
→ train one with: flash train configs/rl.toml

flash gpus
managed GPU classes + $/hr

Plain:
gpu           vram  runpod$/hr
L4            24G   0.39
A40           48G   0.44
RTX 3090      24G   0.46
RTX A6000     48G   0.49
RTX 4090      24G   0.69
RTX 6000 Ada  48G   0.77
RTX 5090      32G   0.99
A100 PCIe     80G   1.39
A100 SXM      80G   1.49
RTX Pro 6000  96G   2.09
H100          80G   3.29
Tip: GPU class selection is fully automatic — the submit-time allocator always picks the cheapest validated RunPod class that fits the model, so you don't pin a GPU type.

Themed (light / dark):
flash › gpus  managed GPU classes
────────────────────────────────────────────────────────────────────────────
GPU           VRAM   $/HR
────────────  ─────  ─────
L4            24 GB  $0.39
A40           48 GB  $0.44
RTX 3090      24 GB  $0.46
RTX A6000     48 GB  $0.49
RTX 4090      24 GB  $0.69
RTX 6000 Ada  48 GB  $0.77
RTX 5090      32 GB  $0.99
A100 PCIe     80 GB  $1.39
A100 SXM      80 GB  $1.49
RTX Pro 6000  96 GB  $2.09
H100          80 GB  $3.29
Tip: GPU class selection is fully automatic — the submit-time allocator always picks the cheapest validated RunPod class that fits the model, so you don't pin a GPU type.
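
The tip above describes a "cheapest class that fits" rule. A minimal sketch of that rule, using the prices and VRAM sizes from the table, might look like this. The `pick_gpu` helper is hypothetical, and the sketch ignores which classes count as "validated" for a given model as well as live availability, both of which the real allocator presumably considers.

```python
# (name, vram_gb, usd_per_hour), copied from the `flash gpus` table above.
GPU_CLASSES = [
    ("L4", 24, 0.39),
    ("A40", 48, 0.44),
    ("RTX 3090", 24, 0.46),
    ("RTX A6000", 48, 0.49),
    ("RTX 4090", 24, 0.69),
    ("RTX 6000 Ada", 48, 0.77),
    ("RTX 5090", 32, 0.99),
    ("A100 PCIe", 80, 1.39),
    ("A100 SXM", 80, 1.49),
    ("RTX Pro 6000", 96, 2.09),
    ("H100", 80, 3.29),
]


def pick_gpu(required_gb, classes=GPU_CLASSES):
    """Return the cheapest (name, vram_gb, usd_per_hour) with enough VRAM."""
    fitting = [gpu for gpu in classes if gpu[1] >= required_gb]
    if not fitting:
        raise ValueError(f"no GPU class offers {required_gb} GB of VRAM")
    return min(fitting, key=lambda gpu: gpu[2])


if __name__ == "__main__":
    name, vram_gb, price = pick_gpu(35)
    print(f"{name} ({vram_gb} GB) @ ${price:.2f}/hr")
```

For a run that needs at least 35 GB, the sketch selects the A40, which agrees with the GPU shown in the `flash train --cost` estimate on this page.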

flash env setup
scaffold a starter environment

Plain:
ensured environment.py, datasets/train.jsonl, configs/, configs/rl.toml, configs/sft.toml

Themed (light / dark):
flash › env setup  starter Freesolo environment
────────────────────────────────────────────────────────────────────────────
✓ scaffold ready
environment.py        env entrypoint — edit the reward + prompt
datasets/train.jsonl  starter training rows
configs/rl.toml       GRPO run config
configs/sft.toml      SFT run config
→ publish it: flash env push --name my-env .

flash env list
installed + local environments

Plain:
local env sources (publish with `flash env push --name <name> <path>`):
.
environments/math-grader

Themed (light / dark):
flash › env list  installed + local environments
────────────────────────────────────────────────────────────────────────────
local sources (publish with flash env push --name <name> <path>)
· .
· environments/math-grader

flash env install acme/math-grader
record a published environment

Plain:
installed acme/math-grader; recorded in /Users/you/.flash/environments.json
use it via: [environment]\nid = "acme/math-grader"

Themed (light / dark):
✓ recorded acme/math-grader
manifest: /Users/you/.flash/environments.json
use it in your config:
[environment]
id = "acme/math-grader"

flash env push --name math-grader .
package + upload a local environment

Plain:
published acme/math-grader
reference it in your config:
[environment]
id = "acme/math-grader"

Themed (light / dark):
✓ published acme/math-grader
reference it in your config:
[environment]
id = "acme/math-grader"

flash train rl.toml
submit a run and follow its logs

Plain:
run flash-1718900000-a1b2c3d4 submitted; following logs (Ctrl-C detaches, `flash status flash-1718900000-a1b2c3d4 --follow` resumes)
worker: provisioning RTX 5090 on runpod ...
worker: loading Qwen/Qwen3.5-4B (bf16, LoRA r=32) ...
step 10/150 loss=1.842 reward=0.41 lr=1.0e-4
step 150/150 loss=0.213 reward=0.88 lr=2.0e-6
worker: pushing adapter -> acme/qwen3.5-4b-grpo-runs
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "done",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "grpo",
    "gpu": {
      "type": "RTX 5090"
    }
  },
  "remote": {
    "provider": "runpod",
    "gpu": "RTX 5090"
  },
  "cost_usd": 1.8421,
  "realized_cost_usd": 1.774,
  "created_at": 1718896400.0,
  "updated_at": 1718900000.0,
  "error": null,
  "artifacts_dir": "acme/qwen3.5-4b-grpo-runs"
}

Themed (light / dark):
✓ run flash-1718900000-a1b2c3d4 submitted
following logs — Ctrl-C detaches; resume with `flash status flash-1718900000-a1b2c3d4 --follow`
worker: provisioning RTX 5090 on runpod ...
worker: loading Qwen/Qwen3.5-4B (bf16, LoRA r=32) ...
step 10/150 loss=1.842 reward=0.41 lr=1.0e-4
step 150/150 loss=0.213 reward=0.88 lr=2.0e-6
worker: pushing adapter -> acme/qwen3.5-4b-grpo-runs
flash › status
────────────────────────────────────────────────────────────────────────────
● done
run id     · flash-1718900000-a1b2c3d4
model      · Qwen/Qwen3.5-4B
algorithm  · GRPO
gpu        · RTX 5090 @ runpod
cost       · $1.8421
realized   · $1.7740
created    · 2024-06-20 15:13 UTC
updated    · 2024-06-20 16:13 UTC
artifacts  · acme/qwen3.5-4b-grpo-runs
details
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "done",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "grpo",
    "gpu": { "type": "RTX 5090" }
  },
  "remote": { "provider": "runpod", "gpu": "RTX 5090" },
  "cost_usd": 1.8421,
  "realized_cost_usd": 1.774,
  "created_at": 1718896400.0,
  "updated_at": 1718900000.0,
  "error": null,
  "artifacts_dir": "acme/qwen3.5-4b-grpo-runs"
}

flash train --cost rl.toml
pre-flight cost estimate

Plain:
Run        : Qwen/Qwen3.5-4B [GRPO, 150 steps]
GPU        : A40 on auto (48 GB; run needs >= 35 GB) @ $0.44/hr
Setup      : 9.8 min (cold start: boot + deps + model load + vLLM init)
Per step   : 234.90 s
Train      : 587.3 min
Wall clock : 9.95 h
TOTAL      : $4.38
Notes      :
  - GRPO step = vLLM rollout of 64x8=512 completions @ 320 tok + reward (1.00s/completion, env acme/math-grader) + policy+reference update
  - GPU sized with 10% VRAM headroom; static GPU $/hr

Themed (light / dark):
flash › train  pre-flight cost estimate
────────────────────────────────────────────────────────────────────────────
run         · Qwen/Qwen3.5-4B [GRPO, 150 steps]
gpu         · A40 on auto (48 GB; needs >= 35 GB) @ $0.44/hr
setup       · 9.8 min (cold start: boot + deps + model load + vLLM init)
per step    · 234.90 s
train       · 587.3 min
wall clock  · 9.95 h
────────────────────────────────────────────────────────────────────────────
TOTAL       · $4.38
notes       · GRPO step = vLLM rollout of 64x8=512 completions @ 320 tok + reward (1.00s/completion, env acme/math-grader) + policy+reference update
            · GPU sized with 10% VRAM headroom; static GPU $/hr
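
The figures in the estimate above are consistent with a simple calculation: training time is steps times seconds per step, wall clock adds the cold-start setup, and the total is wall-clock hours times the static hourly price. The sketch below reproduces that arithmetic; `estimate_cost` is a hypothetical helper, and the formula is inferred from the displayed numbers rather than taken from the CLI's source.

```python
def estimate_cost(steps, seconds_per_step, setup_minutes, usd_per_hour):
    """Recompute train time, wall clock, and total cost from the inputs."""
    train_minutes = steps * seconds_per_step / 60
    wall_hours = (setup_minutes + train_minutes) / 60
    total_usd = wall_hours * usd_per_hour
    return train_minutes, wall_hours, total_usd


if __name__ == "__main__":
    # Inputs taken from the estimate shown above.
    train_minutes, wall_hours, total_usd = estimate_cost(
        steps=150, seconds_per_step=234.90, setup_minutes=9.8, usd_per_hour=0.44
    )
    print(f"train      : about {train_minutes:.0f} min")
    print(f"wall clock : {wall_hours:.2f} h")   # 9.95 h
    print(f"total      : ${total_usd:.2f}")     # $4.38
```

Because the per-step time dominates (about 587 of roughly 597 minutes), the step count and the per-step estimate matter far more to the total than the cold-start setup does.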

flash train --dry-run sft.toml
validate a config locally

Plain:
{
  "run_id": "flash-1718900000-d0cf00ed",
  "state": "dry_run",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "sft",
    "environment": {
      "id": "acme/math-grader",
      "params": {},
      "pip": [],
      "secrets": []
    },
    "train": {
      "steps": null,
      "epochs": 1,
      "lora_rank": 32,
      "lora_alpha": 64,
      "seeds": [
        0
      ],
      "init_from_adapter": "",
      "hf_repo": "",
      "learning_rate": null,
      "batch_size": null,
      "max_length": null,
      "save_every": null,
      "max_steps": null,
      "max_examples": null,
      "group_size": null,
      "temperature": null,
      "max_tokens": null,
      "kl_penalty_coef": null,
      "advantage_clip": null,
      "thinking_length_penalty_coef": null,
      "stop_sequences": []
    },
    "gpu": {
      "type": "RTX 3090",
      "disk_gb": 60,
      "max_wall_seconds": 86400,
      "max_retries": 2,
      "network_volume": null,
      "network_volume_gb": 100,
      "datacenter": null
    },
    "run_id": "flash-1718900000-d0cf00ed",
    "worker_env": {},
    "model_policy": "catalog",
    "thinking": false,
    "wandb": {
      "project": null,
      "run_name": null
    }
  }
}

Themed (light / dark):
flash › train  dry run — validated locally, not submitted
────────────────────────────────────────────────────────────────────────────
○ dry_run flash-1718900000-d0cf00ed
{
  "run_id": "flash-1718900000-d0cf00ed",
  "state": "dry_run",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "sft",
    "environment": {
      "id": "acme/math-grader",
      "params": {},
      "pip": [],
      "secrets": []
    },
    "train": {
      "steps": null,
      "epochs": 1,
      "lora_rank": 32,
      "lora_alpha": 64,
      "seeds": [0],
      "init_from_adapter": "",
      "hf_repo": "",
      "learning_rate": null,
      "batch_size": null,
      "max_length": null,
      "save_every": null,
      "max_steps": null,
      "max_examples": null,
      "group_size": null,
      "temperature": null,
      "max_tokens": null,
      "kl_penalty_coef": null,
      "advantage_clip": null,
      "thinking_length_penalty_coef": null,
      "stop_sequences": []
    },
    "gpu": {
      "type": "RTX 3090",
      "disk_gb": 60,
      "max_wall_seconds": 86400,
      "max_retries": 2,
      "network_volume": null,
      "network_volume_gb": 100,
      "datacenter": null
    },
    "run_id": "flash-1718900000-d0cf00ed",
    "worker_env": {},
    "model_policy": "catalog",
    "thinking": false,
    "wandb": { "project": null, "run_name": null }
  }
}

flash status <run>
a run's full status

Plain:
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "done",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "grpo",
    "gpu": {
      "type": "RTX 5090"
    }
  },
  "remote": {
    "provider": "runpod",
    "gpu": "RTX 5090"
  },
  "cost_usd": 1.8421,
  "realized_cost_usd": 1.774,
  "created_at": 1718896400.0,
  "updated_at": 1718900000.0,
  "error": null,
  "artifacts_dir": "acme/qwen3.5-4b-grpo-runs"
}

Themed (light / dark):
flash › status
────────────────────────────────────────────────────────────────────────────
● done
run id     · flash-1718900000-a1b2c3d4
model      · Qwen/Qwen3.5-4B
algorithm  · GRPO
gpu        · RTX 5090 @ runpod
cost       · $1.8421
realized   · $1.7740
created    · 2024-06-20 15:13 UTC
updated    · 2024-06-20 16:13 UTC
artifacts  · acme/qwen3.5-4b-grpo-runs
details
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "done",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "grpo",
    "gpu": { "type": "RTX 5090" }
  },
  "remote": { "provider": "runpod", "gpu": "RTX 5090" },
  "cost_usd": 1.8421,
  "realized_cost_usd": 1.774,
  "created_at": 1718896400.0,
  "updated_at": 1718900000.0,
  "error": null,
  "artifacts_dir": "acme/qwen3.5-4b-grpo-runs"
}
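
Since the plain output is ordinary JSON, a script can derive the same summary fields that the themed panel shows. The sketch below is one way to do that, assuming the field names visible in the sample above. The `summarize` helper and its output keys are hypothetical, and in practice the JSON would come from piping `flash status <run>` rather than from an inline string.

```python
import json
from datetime import datetime, timezone

# Sample copied from the plain `flash status` output above.
SAMPLE = """
{"run_id": "flash-1718900000-a1b2c3d4", "state": "done",
 "spec": {"model": "Qwen/Qwen3.5-4B", "algorithm": "grpo",
          "gpu": {"type": "RTX 5090"}},
 "remote": {"provider": "runpod", "gpu": "RTX 5090"},
 "cost_usd": 1.8421, "realized_cost_usd": 1.774,
 "created_at": 1718896400.0, "updated_at": 1718900000.0,
 "error": null, "artifacts_dir": "acme/qwen3.5-4b-grpo-runs"}
"""


def format_utc(epoch_seconds):
    """Render a Unix timestamp as 'YYYY-MM-DD HH:MM UTC'."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M UTC")


def summarize(status_json):
    """Map the status JSON onto the key/value pairs of the themed panel."""
    run = json.loads(status_json)
    return {
        "state": run["state"],
        "run id": run["run_id"],
        "model": run["spec"]["model"],
        "algorithm": run["spec"]["algorithm"].upper(),
        "gpu": f'{run["remote"]["gpu"]} @ {run["remote"]["provider"]}',
        "cost": f'${run["cost_usd"]:.4f}',
        "realized": f'${run["realized_cost_usd"]:.4f}',
        "created": format_utc(run["created_at"]),
        "updated": format_utc(run["updated_at"]),
        "artifacts": run["artifacts_dir"],
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key:<10} · {value}")
```

The epoch timestamps in the JSON convert to the `created` and `updated` values shown in the panel, so the themed view adds presentation without adding data.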

flash runs
all runs and their state/cost

Plain:
RUN_ID                     STATE    ALGO  COST($)  GPU              MODEL
flash-1718903000-e5f6a7b8  running  SFT   0.4210   L40S@vast        Qwen/Qwen3.5-9B
flash-1718900000-a1b2c3d4  done     GRPO  1.8421   RTX 5090@runpod  Qwen/Qwen3.5-4B
flash-1718800000-99887766  failed   GRPO  0.0712                    Qwen/Qwen3.5-2B
flash-1718700000-12ab34cd  queued   SFT   0.0000                    openbmb/MiniCPM5-1B

Themed (light / dark):
flash › runs  4 run(s)
────────────────────────────────────────────────────────────────────────────
RUN ID                     STATE      ALGO  COST     GPU              MODEL
─────────────────────────  ─────────  ────  ───────  ───────────────  ───────────────────
flash-1718903000-e5f6a7b8  ● running  SFT   $0.4210  L40S@vast        Qwen/Qwen3.5-9B
flash-1718900000-a1b2c3d4  ● done     GRPO  $1.8421  RTX 5090@runpod  Qwen/Qwen3.5-4B
flash-1718800000-99887766  ● failed   GRPO  $0.0712                   Qwen/Qwen3.5-2B
flash-1718700000-12ab34cd  ○ queued   SFT   $0.0000                   openbmb/MiniCPM5-1B

flash cancel <run>
cancel a run

Plain:
{
  "run_id": "flash-1718903000-e5f6a7b8",
  "state": "cancelled"
}

Themed (light / dark):
flash › cancel
────────────────────────────────────────────────────────────────────────────
● cancelled flash-1718903000-e5f6a7b8
{
  "run_id": "flash-1718903000-e5f6a7b8",
  "state": "cancelled"
}

flash deploy <run>
serve a trained adapter

Plain:
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "deployed",
  "mode": "dev",
  "openai_model": "flash-1718900000-a1b2c3d4",
  "endpoint_name": "https://api.freesolo.co/v1",
  "gpu": "RTX 4090"
}

Themed (light / dark):
flash › deploy
────────────────────────────────────────────────────────────────────────────
● deployed flash-1718900000-a1b2c3d4
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "deployed",
  "mode": "dev",
  "openai_model": "flash-1718900000-a1b2c3d4",
  "endpoint_name": "https://api.freesolo.co/v1",
  "gpu": "RTX 4090"
}

flash undeploy <run>
tear down a serving endpoint

Plain:
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "deleted_endpoints": [
    "flash-a1b2c3d4-serve"
  ]
}

Themed (light / dark):
flash › undeploy
────────────────────────────────────────────────────────────────────────────
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "deleted_endpoints": [
    "flash-a1b2c3d4-serve"
  ]
}

flash deployments
active serving deployments

Plain:
RUN_ID                     GPU       ENDPOINT
flash-1718900000-a1b2c3d4  RTX 4090  https://api.freesolo.co/v1

Themed (light / dark):
flash › deployments  1 active
────────────────────────────────────────────────────────────────────────────
RUN ID                     GPU       ENDPOINT
─────────────────────────  ────────  ──────────────────────────
flash-1718900000-a1b2c3d4  RTX 4090  https://api.freesolo.co/v1

flash chat <run> -m ...
chat with a deployed adapter

Plain:
The capital of France is Paris. It has been the country's political and cultural center since the 12th century.

Themed (light / dark):
assistant
The capital of France is Paris. It has been the country's political and cultural center since the 12th century.