flash cli

A standardized output theme for every command

One visual language across the whole CLI, in the Freesolo website palette (navy #1b1b4b, periwinkle #5f72ff, green #57ff8f): a brand header, colored status badges, aligned tables, key/value panels, and syntax-highlighted JSON. Like the website, it ships light and dark variants (auto-detected from the terminal background, or forced with FLASH_THEME) — each command below shows both. The themed view renders on an interactive terminal; piped or scripted output stays byte-for-byte plain, so jq, scripts, and the agent contract are untouched.
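The themed-versus-plain rule described above could be decided along these lines. This is a minimal sketch, not the CLI's actual code: the function name `choose_theme`, the `detected_background` parameter, and the accepted `FLASH_THEME` values (`light`, `dark`) are assumptions made for illustration.

```python
import os
import sys


def choose_theme(stream=sys.stdout, env=os.environ, detected_background="dark"):
    """Return "plain", "light", or "dark" for the given output stream.

    `detected_background` stands in for whatever terminal-background
    detection the CLI performs; it is a hypothetical parameter.
    """
    # Piped or redirected output is never styled, so scripts see plain bytes.
    if not stream.isatty():
        return "plain"
    # An explicit override wins over auto-detection (value names are assumed).
    forced = env.get("FLASH_THEME", "").strip().lower()
    if forced in ("light", "dark"):
        return forced
    return detected_background
```

Checking `isatty()` first is what keeps piped output untouched: the override and the detection only matter once the stream is known to be an interactive terminal.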

periwinkle: accent / links
green: success / done
teal: amounts
indigo: literals
red: failure
structure
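To make the palette concrete, here is one way the three hex values quoted in the introduction could be turned into 24-bit ANSI foreground colors. It is a sketch under assumptions: the helper names `hex_to_rgb` and `colorize` are hypothetical, it assumes a terminal that supports truecolor escape sequences, and only the colors whose hex codes appear in this document are included.

```python
# Hex values copied from the introduction; teal, indigo, and red are
# omitted because their hex codes are not given in this document.
PALETTE = {
    "navy": "#1b1b4b",
    "periwinkle": "#5f72ff",
    "green": "#57ff8f",
}


def hex_to_rgb(hex_color):
    """Convert "#rrggbb" to an (r, g, b) tuple of ints."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def colorize(text, hex_color):
    """Wrap text in a 24-bit ANSI foreground color, then reset styling."""
    r, g, b = hex_to_rgb(hex_color)
    return f"\x1b[38;2;{r};{g};{b}m{text}\x1b[0m"


if __name__ == "__main__":
    print(colorize("done", PALETTE["green"]))          # success / done
    print(colorize("flash env push", PALETTE["periwinkle"]))  # accent / links
```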
flash version · package version
before
flash 0.2.19
after · dark
flash v0.2.19
after · light
flash v0.2.19
flash login · verify + store your freesolo key
before
✓ logged in to flash

  account  ml-team@acme.ai
  org      org_acme
  user     usr_4192
  key      fs_live_8Kd2… (freesolo key)
after · dark
 logged in to flash

  account  ml-team@acme.ai
  org      org_acme
  user     usr_4192
  key      fs_live_8Kd2… (freesolo key)
after · light
 logged in to flash

  account  ml-team@acme.ai
  org      org_acme
  user     usr_4192
  key      fs_live_8Kd2… (freesolo key)
flash whoami · identity behind the stored key
before
logged in to flash

  account  ml-team@acme.ai
  org      org_acme
  user     usr_4192
  key      fs_live_8Kd2… (freesolo key)
after · dark
logged in to flash

  account  ml-team@acme.ai
  org      org_acme
  user     usr_4192
  key      fs_live_8Kd2… (freesolo key)
after · light
logged in to flash

  account  ml-team@acme.ai
  org      org_acme
  user     usr_4192
  key      fs_live_8Kd2… (freesolo key)
flash models · supported base models
before
Qwen/Qwen3.5-0.8B
openbmb/MiniCPM5-1B
Qwen/Qwen3.5-2B
Qwen/Qwen3.5-4B
Qwen/Qwen3.5-9B
after · dark
flash  models  supported base models
────────────────────────────────────────────────────────────────────────────
   Qwen/Qwen3.5-0.8B
   openbmb/MiniCPM5-1B
   Qwen/Qwen3.5-2B
   Qwen/Qwen3.5-4B
   Qwen/Qwen3.5-9B

 train one with: flash train configs/rl.toml
after · light
flash  models  supported base models
────────────────────────────────────────────────────────────────────────────
   Qwen/Qwen3.5-0.8B
   openbmb/MiniCPM5-1B
   Qwen/Qwen3.5-2B
   Qwen/Qwen3.5-4B
   Qwen/Qwen3.5-9B

 train one with: flash train configs/rl.toml
flash gpus · managed GPU classes + $/hr
before
gpu               vram runpod$/hr
L4                 24G       0.39
A40                48G       0.44
RTX 3090           24G       0.46
RTX A6000          48G       0.49
RTX 4090           24G       0.69
RTX 6000 Ada       48G       0.77
RTX 5090           32G       0.99
A100 PCIe          80G       1.39
A100 SXM           80G       1.49
RTX Pro 6000       96G       2.09
H100               80G       3.29

Tip: GPU class selection is fully automatic — the submit-time allocator always picks the
cheapest validated RunPod class that fits the model, so you don't pin a GPU type.
after · dark
flash  gpus  managed GPU classes
────────────────────────────────────────────────────────────────────────────
GPU            VRAM   $/HR
────────────  ─────  ─────
L4            24 GB  $0.39
A40           48 GB  $0.44
RTX 3090      24 GB  $0.46
RTX A6000     48 GB  $0.49
RTX 4090      24 GB  $0.69
RTX 6000 Ada  48 GB  $0.77
RTX 5090      32 GB  $0.99
A100 PCIe     80 GB  $1.39
A100 SXM      80 GB  $1.49
RTX Pro 6000  96 GB  $2.09
H100          80 GB  $3.29

Tip: GPU class selection is fully automatic — the submit-time allocator always picks the
cheapest validated RunPod class that fits the model, so you don't pin a GPU type.
after · light
flash  gpus  managed GPU classes
────────────────────────────────────────────────────────────────────────────
GPU            VRAM   $/HR
────────────  ─────  ─────
L4            24 GB  $0.39
A40           48 GB  $0.44
RTX 3090      24 GB  $0.46
RTX A6000     48 GB  $0.49
RTX 4090      24 GB  $0.69
RTX 6000 Ada  48 GB  $0.77
RTX 5090      32 GB  $0.99
A100 PCIe     80 GB  $1.39
A100 SXM      80 GB  $1.49
RTX Pro 6000  96 GB  $2.09
H100          80 GB  $3.29

Tip: GPU class selection is fully automatic — the submit-time allocator always picks the
cheapest validated RunPod class that fits the model, so you don't pin a GPU type.
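The selection rule in the tip (cheapest class that fits the model) can be illustrated with the prices from the table. This is only a sketch: `pick_gpu` is a hypothetical name, "fits" is reduced to a single VRAM threshold, and the real allocator's notion of a "validated" class is not modeled.

```python
# (class, VRAM in GB, $/hr), copied from the table above.
GPU_CLASSES = [
    ("L4", 24, 0.39),
    ("A40", 48, 0.44),
    ("RTX 3090", 24, 0.46),
    ("RTX A6000", 48, 0.49),
    ("RTX 4090", 24, 0.69),
    ("RTX 6000 Ada", 48, 0.77),
    ("RTX 5090", 32, 0.99),
    ("A100 PCIe", 80, 1.39),
    ("A100 SXM", 80, 1.49),
    ("RTX Pro 6000", 96, 2.09),
    ("H100", 80, 3.29),
]


def pick_gpu(required_vram_gb, classes=GPU_CLASSES):
    """Return the cheapest (name, vram_gb, usd_per_hour) with enough VRAM."""
    fitting = [c for c in classes if c[1] >= required_vram_gb]
    if not fitting:
        raise ValueError(f"no GPU class offers {required_vram_gb} GB of VRAM")
    # Cheapest hourly price wins among the classes that fit.
    return min(fitting, key=lambda c: c[2])


if __name__ == "__main__":
    print(pick_gpu(35))
```

With a 35 GB requirement the sketch selects the A40 at $0.44/hr, the same class that appears in the `flash train --cost` sample later in this document.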
flash env setup · scaffold a starter environment
before
ensured environment.py, datasets/train.jsonl, configs/, configs/rl.toml, configs/sft.toml
after · dark
flash  env setup  starter Freesolo environment
────────────────────────────────────────────────────────────────────────────
 scaffold ready

  environment.py        env entrypoint — edit the reward + prompt
  datasets/train.jsonl  starter training rows
  configs/rl.toml       GRPO run config
  configs/sft.toml      SFT run config

 publish it: flash env push --name my-env .
after · light
flash  env setup  starter Freesolo environment
────────────────────────────────────────────────────────────────────────────
 scaffold ready

  environment.py        env entrypoint — edit the reward + prompt
  datasets/train.jsonl  starter training rows
  configs/rl.toml       GRPO run config
  configs/sft.toml      SFT run config

 publish it: flash env push --name my-env .
flash env list · installed + local environments
before
local env sources (publish with `flash env push --name <name> <path>`):
  .
  environments/math-grader
after · dark
flash  env list  installed + local environments
────────────────────────────────────────────────────────────────────────────
local sources  (publish with flash env push --name <name> <path>)
  · .
  · environments/math-grader
after · light
flash  env list  installed + local environments
────────────────────────────────────────────────────────────────────────────
local sources  (publish with flash env push --name <name> <path>)
  · .
  · environments/math-grader
flash env install acme/math-grader · record a published environment
before
installed acme/math-grader; recorded in /Users/you/.flash/environments.json
use it via:  [environment]\nid = "acme/math-grader"
after · dark
 recorded acme/math-grader
  manifest: /Users/you/.flash/environments.json

use it in your config:
  [environment]
  id = "acme/math-grader"
after · light
 recorded acme/math-grader
  manifest: /Users/you/.flash/environments.json

use it in your config:
  [environment]
  id = "acme/math-grader"
flash env push --name math-grader . · package + upload a local environment
before
published acme/math-grader
reference it in your config:

  [environment]
  id = "acme/math-grader"
after · dark
 published acme/math-grader

reference it in your config:
  [environment]
  id = "acme/math-grader"
after · light
 published acme/math-grader

reference it in your config:
  [environment]
  id = "acme/math-grader"
flash train rl.toml · submit a run and follow its logs
before
run flash-1718900000-a1b2c3d4 submitted; following logs (Ctrl-C detaches, `flash status flash-1718900000-a1b2c3d4 --follow` resumes)
worker: provisioning RTX 5090 on runpod ...
worker: loading Qwen/Qwen3.5-4B (bf16, LoRA r=32) ...
step  10/150  loss=1.842  reward=0.41  lr=1.0e-4
step 150/150  loss=0.213  reward=0.88  lr=2.0e-6
worker: pushing adapter -> acme/qwen3.5-4b-grpo-runs
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "done",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "grpo",
    "gpu": {
      "type": "RTX 5090"
    }
  },
  "remote": {
    "provider": "runpod",
    "gpu": "RTX 5090"
  },
  "cost_usd": 1.8421,
  "realized_cost_usd": 1.774,
  "created_at": 1718896400.0,
  "updated_at": 1718900000.0,
  "error": null,
  "artifacts_dir": "acme/qwen3.5-4b-grpo-runs"
}
after · dark
 run flash-1718900000-a1b2c3d4 submitted
following logs — Ctrl-C detaches; resume with `flash status flash-1718900000-a1b2c3d4 --follow`
worker: provisioning RTX 5090 on runpod ...
worker: loading Qwen/Qwen3.5-4B (bf16, LoRA r=32) ...
step  10/150  loss=1.842  reward=0.41  lr=1.0e-4
step 150/150  loss=0.213  reward=0.88  lr=2.0e-6
worker: pushing adapter -> acme/qwen3.5-4b-grpo-runs
flash  status
────────────────────────────────────────────────────────────────────────────
  ● done

  run id    · flash-1718900000-a1b2c3d4
  model     · Qwen/Qwen3.5-4B
  algorithm · GRPO
  gpu       · RTX 5090 @ runpod
  cost      · $1.8421
  realized  · $1.7740
  created   · 2024-06-20 15:13 UTC
  updated   · 2024-06-20 16:13 UTC
  artifacts · acme/qwen3.5-4b-grpo-runs

details
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "done",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "grpo",
    "gpu": {
      "type": "RTX 5090"
    }
  },
  "remote": {
    "provider": "runpod",
    "gpu": "RTX 5090"
  },
  "cost_usd": 1.8421,
  "realized_cost_usd": 1.774,
  "created_at": 1718896400.0,
  "updated_at": 1718900000.0,
  "error": null,
  "artifacts_dir": "acme/qwen3.5-4b-grpo-runs"
}
after · light
 run flash-1718900000-a1b2c3d4 submitted
following logs — Ctrl-C detaches; resume with `flash status flash-1718900000-a1b2c3d4 --follow`
worker: provisioning RTX 5090 on runpod ...
worker: loading Qwen/Qwen3.5-4B (bf16, LoRA r=32) ...
step  10/150  loss=1.842  reward=0.41  lr=1.0e-4
step 150/150  loss=0.213  reward=0.88  lr=2.0e-6
worker: pushing adapter -> acme/qwen3.5-4b-grpo-runs
flash  status
────────────────────────────────────────────────────────────────────────────
  ● done

  run id    · flash-1718900000-a1b2c3d4
  model     · Qwen/Qwen3.5-4B
  algorithm · GRPO
  gpu       · RTX 5090 @ runpod
  cost      · $1.8421
  realized  · $1.7740
  created   · 2024-06-20 15:13 UTC
  updated   · 2024-06-20 16:13 UTC
  artifacts · acme/qwen3.5-4b-grpo-runs

details
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "done",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "grpo",
    "gpu": {
      "type": "RTX 5090"
    }
  },
  "remote": {
    "provider": "runpod",
    "gpu": "RTX 5090"
  },
  "cost_usd": 1.8421,
  "realized_cost_usd": 1.774,
  "created_at": 1718896400.0,
  "updated_at": 1718900000.0,
  "error": null,
  "artifacts_dir": "acme/qwen3.5-4b-grpo-runs"
}
flash train --cost rl.toml · pre-flight cost estimate
before
Run        : Qwen/Qwen3.5-4B  [GRPO, 150 steps]
GPU        : A40 on auto (48 GB; run needs >= 35 GB) @ $0.44/hr
Setup      : 9.8 min (cold start: boot + deps + model load + vLLM init)
Per step   : 234.90 s
Train      : 587.3 min
Wall clock : 9.95 h
TOTAL      : $4.38
Notes      :
  - GRPO step = vLLM rollout of 64x8=512 completions @ 320 tok + reward (1.00s/completion, env acme/math-grader) + policy+reference update
  - GPU sized with 10% VRAM headroom; static GPU $/hr
after · dark
flash  train  pre-flight cost estimate
────────────────────────────────────────────────────────────────────────────
  run        · Qwen/Qwen3.5-4B  [GRPO, 150 steps]
  gpu        · A40 on auto  (48 GB; needs >= 35 GB)  @ $0.44/hr
  setup      · 9.8 min  (cold start: boot + deps + model load + vLLM init)
  per step   · 234.90 s
  train      · 587.3 min
  wall clock · 9.95 h
────────────────────────────────────────────────────────────────────────────
  TOTAL      · $4.38

notes
  · GRPO step = vLLM rollout of 64x8=512 completions @ 320 tok + reward (1.00s/completion, env acme/math-grader) + policy+reference update
  · GPU sized with 10% VRAM headroom; static GPU $/hr
after · light
flash  train  pre-flight cost estimate
────────────────────────────────────────────────────────────────────────────
  run        · Qwen/Qwen3.5-4B  [GRPO, 150 steps]
  gpu        · A40 on auto  (48 GB; needs >= 35 GB)  @ $0.44/hr
  setup      · 9.8 min  (cold start: boot + deps + model load + vLLM init)
  per step   · 234.90 s
  train      · 587.3 min
  wall clock · 9.95 h
────────────────────────────────────────────────────────────────────────────
  TOTAL      · $4.38

notes
  · GRPO step = vLLM rollout of 64x8=512 completions @ 320 tok + reward (1.00s/completion, env acme/math-grader) + policy+reference update
  · GPU sized with 10% VRAM headroom; static GPU $/hr
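The figures in the estimate follow from simple arithmetic: training time is steps times seconds per step, wall clock adds the setup time, and the total is wall-clock hours times the hourly GPU price. The sketch below reproduces that chain with the sample's inputs; the function name `estimate_cost` is hypothetical, and the formula is inferred from the sample numbers rather than taken from the CLI's source.

```python
def estimate_cost(steps, seconds_per_step, setup_minutes, usd_per_hour):
    """Return (train_minutes, wall_clock_hours, total_usd)."""
    train_minutes = steps * seconds_per_step / 60
    wall_clock_hours = (setup_minutes + train_minutes) / 60
    total_usd = wall_clock_hours * usd_per_hour
    return train_minutes, wall_clock_hours, total_usd


if __name__ == "__main__":
    # Inputs from the sample: 150 steps, 234.90 s/step, 9.8 min setup, A40 @ $0.44/hr.
    train_min, wall_h, total = estimate_cost(150, 234.90, 9.8, 0.44)
    print(f"wall clock {wall_h:.2f} h, total ${total:.2f}")
```

For the sample inputs this gives about 587.25 minutes of training, 9.95 hours of wall clock, and $4.38 in total, matching the panel up to rounding.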
flash train --dry-run sft.toml · validate a config locally
before
{
  "run_id": "flash-1718900000-d0cf00ed",
  "state": "dry_run",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "sft",
    "environment": {
      "id": "acme/math-grader",
      "params": {},
      "pip": [],
      "secrets": []
    },
    "train": {
      "steps": null,
      "epochs": 1,
      "lora_rank": 32,
      "lora_alpha": 64,
      "seeds": [
        0
      ],
      "init_from_adapter": "",
      "hf_repo": "",
      "learning_rate": null,
      "batch_size": null,
      "max_length": null,
      "save_every": null,
      "max_steps": null,
      "max_examples": null,
      "group_size": null,
      "temperature": null,
      "max_tokens": null,
      "kl_penalty_coef": null,
      "advantage_clip": null,
      "thinking_length_penalty_coef": null,
      "stop_sequences": []
    },
    "gpu": {
      "type": "RTX 3090",
      "disk_gb": 60,
      "max_wall_seconds": 86400,
      "max_retries": 2,
      "network_volume": null,
      "network_volume_gb": 100,
      "datacenter": null
    },
    "run_id": "flash-1718900000-d0cf00ed",
    "worker_env": {},
    "model_policy": "catalog",
    "thinking": false,
    "wandb": {
      "project": null,
      "run_name": null
    }
  }
}
after · dark
flash  train  dry run — validated locally, not submitted
────────────────────────────────────────────────────────────────────────────
  ○ dry_run   flash-1718900000-d0cf00ed

{
  "run_id": "flash-1718900000-d0cf00ed",
  "state": "dry_run",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "sft",
    "environment": {
      "id": "acme/math-grader",
      "params": {},
      "pip": [],
      "secrets": []
    },
    "train": {
      "steps": null,
      "epochs": 1,
      "lora_rank": 32,
      "lora_alpha": 64,
      "seeds": [0],
      "init_from_adapter": "",
      "hf_repo": "",
      "learning_rate": null,
      "batch_size": null,
      "max_length": null,
      "save_every": null,
      "max_steps": null,
      "max_examples": null,
      "group_size": null,
      "temperature": null,
      "max_tokens": null,
      "kl_penalty_coef": null,
      "advantage_clip": null,
      "thinking_length_penalty_coef": null,
      "stop_sequences": []
    },
    "gpu": {
      "type": "RTX 3090",
      "disk_gb": 60,
      "max_wall_seconds": 86400,
      "max_retries": 2,
      "network_volume": null,
      "network_volume_gb": 100,
      "datacenter": null
    },
    "run_id": "flash-1718900000-d0cf00ed",
    "worker_env": {},
    "model_policy": "catalog",
    "thinking": false,
    "wandb": {
      "project": null,
      "run_name": null
    }
  }
}
after · light
flash  train  dry run — validated locally, not submitted
────────────────────────────────────────────────────────────────────────────
  ○ dry_run   flash-1718900000-d0cf00ed

{
  "run_id": "flash-1718900000-d0cf00ed",
  "state": "dry_run",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "sft",
    "environment": {
      "id": "acme/math-grader",
      "params": {},
      "pip": [],
      "secrets": []
    },
    "train": {
      "steps": null,
      "epochs": 1,
      "lora_rank": 32,
      "lora_alpha": 64,
      "seeds": [0],
      "init_from_adapter": "",
      "hf_repo": "",
      "learning_rate": null,
      "batch_size": null,
      "max_length": null,
      "save_every": null,
      "max_steps": null,
      "max_examples": null,
      "group_size": null,
      "temperature": null,
      "max_tokens": null,
      "kl_penalty_coef": null,
      "advantage_clip": null,
      "thinking_length_penalty_coef": null,
      "stop_sequences": []
    },
    "gpu": {
      "type": "RTX 3090",
      "disk_gb": 60,
      "max_wall_seconds": 86400,
      "max_retries": 2,
      "network_volume": null,
      "network_volume_gb": 100,
      "datacenter": null
    },
    "run_id": "flash-1718900000-d0cf00ed",
    "worker_env": {},
    "model_policy": "catalog",
    "thinking": false,
    "wandb": {
      "project": null,
      "run_name": null
    }
  }
}
flash status <run> · a run's full status
before
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "done",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "grpo",
    "gpu": {
      "type": "RTX 5090"
    }
  },
  "remote": {
    "provider": "runpod",
    "gpu": "RTX 5090"
  },
  "cost_usd": 1.8421,
  "realized_cost_usd": 1.774,
  "created_at": 1718896400.0,
  "updated_at": 1718900000.0,
  "error": null,
  "artifacts_dir": "acme/qwen3.5-4b-grpo-runs"
}
after · dark
flash  status
────────────────────────────────────────────────────────────────────────────
  ● done

  run id    · flash-1718900000-a1b2c3d4
  model     · Qwen/Qwen3.5-4B
  algorithm · GRPO
  gpu       · RTX 5090 @ runpod
  cost      · $1.8421
  realized  · $1.7740
  created   · 2024-06-20 15:13 UTC
  updated   · 2024-06-20 16:13 UTC
  artifacts · acme/qwen3.5-4b-grpo-runs

details
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "done",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "grpo",
    "gpu": {
      "type": "RTX 5090"
    }
  },
  "remote": {
    "provider": "runpod",
    "gpu": "RTX 5090"
  },
  "cost_usd": 1.8421,
  "realized_cost_usd": 1.774,
  "created_at": 1718896400.0,
  "updated_at": 1718900000.0,
  "error": null,
  "artifacts_dir": "acme/qwen3.5-4b-grpo-runs"
}
after · light
flash  status
────────────────────────────────────────────────────────────────────────────
  ● done

  run id    · flash-1718900000-a1b2c3d4
  model     · Qwen/Qwen3.5-4B
  algorithm · GRPO
  gpu       · RTX 5090 @ runpod
  cost      · $1.8421
  realized  · $1.7740
  created   · 2024-06-20 15:13 UTC
  updated   · 2024-06-20 16:13 UTC
  artifacts · acme/qwen3.5-4b-grpo-runs

details
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "done",
  "spec": {
    "model": "Qwen/Qwen3.5-4B",
    "algorithm": "grpo",
    "gpu": {
      "type": "RTX 5090"
    }
  },
  "remote": {
    "provider": "runpod",
    "gpu": "RTX 5090"
  },
  "cost_usd": 1.8421,
  "realized_cost_usd": 1.774,
  "created_at": 1718896400.0,
  "updated_at": 1718900000.0,
  "error": null,
  "artifacts_dir": "acme/qwen3.5-4b-grpo-runs"
}
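The key/value panel above is a summary of the JSON beneath it: epoch timestamps become UTC dates, costs are shown to four decimals, and the GPU and provider are joined. A rough sketch of that mapping follows. The helper names (`format_timestamp`, `panel_rows`, `render`) are hypothetical and the field choices are inferred from the sample output, not from the CLI's source.

```python
from datetime import datetime, timezone

# Sample payload copied from the `flash status` output above.
STATUS = {
    "run_id": "flash-1718900000-a1b2c3d4",
    "state": "done",
    "spec": {"model": "Qwen/Qwen3.5-4B", "algorithm": "grpo", "gpu": {"type": "RTX 5090"}},
    "remote": {"provider": "runpod", "gpu": "RTX 5090"},
    "cost_usd": 1.8421,
    "realized_cost_usd": 1.774,
    "created_at": 1718896400.0,
    "updated_at": 1718900000.0,
    "error": None,
    "artifacts_dir": "acme/qwen3.5-4b-grpo-runs",
}


def format_timestamp(epoch_seconds):
    """Render epoch seconds as 'YYYY-MM-DD HH:MM UTC'."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M UTC")


def panel_rows(status):
    """Pick the fields shown in the panel and format each value."""
    remote = status["remote"]
    return [
        ("run id", status["run_id"]),
        ("model", status["spec"]["model"]),
        ("algorithm", status["spec"]["algorithm"].upper()),
        ("gpu", f"{remote['gpu']} @ {remote['provider']}"),
        ("cost", f"${status['cost_usd']:.4f}"),
        ("realized", f"${status['realized_cost_usd']:.4f}"),
        ("created", format_timestamp(status["created_at"])),
        ("updated", format_timestamp(status["updated_at"])),
        ("artifacts", status["artifacts_dir"]),
    ]


def render(rows):
    """Left-align keys to the widest key and join with a middle dot."""
    width = max(len(key) for key, _ in rows)
    return "\n".join(f"  {key.ljust(width)} · {value}" for key, value in rows)


if __name__ == "__main__":
    print(render(panel_rows(STATUS)))
```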
flash runs · all runs and their state/cost
before
RUN_ID                            STATE        ALGO    COST($)  GPU                     MODEL
flash-1718903000-e5f6a7b8         running      SFT      0.4210  L40S@vast               Qwen/Qwen3.5-9B
flash-1718900000-a1b2c3d4         done         GRPO     1.8421  RTX 5090@runpod         Qwen/Qwen3.5-4B
flash-1718800000-99887766         failed       GRPO     0.0712                          Qwen/Qwen3.5-2B
flash-1718700000-12ab34cd         queued       SFT      0.0000                          openbmb/MiniCPM5-1B
after · dark
flash  runs  4 run(s)
────────────────────────────────────────────────────────────────────────────
RUN ID                     STATE      ALGO     COST  GPU              MODEL              
─────────────────────────  ─────────  ────  ───────  ───────────────  ───────────────────
flash-1718903000-e5f6a7b8  ● running  SFT   $0.4210  L40S@vast        Qwen/Qwen3.5-9B
flash-1718900000-a1b2c3d4  ● done     GRPO  $1.8421  RTX 5090@runpod  Qwen/Qwen3.5-4B
flash-1718800000-99887766  ● failed   GRPO  $0.0712                   Qwen/Qwen3.5-2B
flash-1718700000-12ab34cd  ○ queued   SFT   $0.0000                   openbmb/MiniCPM5-1B
after · light
flash  runs  4 run(s)
────────────────────────────────────────────────────────────────────────────
RUN ID                     STATE      ALGO     COST  GPU              MODEL              
─────────────────────────  ─────────  ────  ───────  ───────────────  ───────────────────
flash-1718903000-e5f6a7b8  ● running  SFT   $0.4210  L40S@vast        Qwen/Qwen3.5-9B
flash-1718900000-a1b2c3d4  ● done     GRPO  $1.8421  RTX 5090@runpod  Qwen/Qwen3.5-4B
flash-1718800000-99887766  ● failed   GRPO  $0.0712                   Qwen/Qwen3.5-2B
flash-1718700000-12ab34cd  ○ queued   SFT   $0.0000                   openbmb/MiniCPM5-1B
flash cancel <run> · cancel a run
before
{
  "run_id": "flash-1718903000-e5f6a7b8",
  "state": "cancelled"
}
after · dark
flash  cancel
────────────────────────────────────────────────────────────────────────────
  ● cancelled   flash-1718903000-e5f6a7b8

{
  "run_id": "flash-1718903000-e5f6a7b8",
  "state": "cancelled"
}
after · light
flash  cancel
────────────────────────────────────────────────────────────────────────────
  ● cancelled   flash-1718903000-e5f6a7b8

{
  "run_id": "flash-1718903000-e5f6a7b8",
  "state": "cancelled"
}
flash deploy <run> · serve a trained adapter
before
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "deployed",
  "mode": "dev",
  "openai_model": "flash-1718900000-a1b2c3d4",
  "endpoint_name": "https://api.freesolo.co/v1",
  "gpu": "RTX 4090"
}
after · dark
flash  deploy
────────────────────────────────────────────────────────────────────────────
  ● deployed   flash-1718900000-a1b2c3d4

{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "deployed",
  "mode": "dev",
  "openai_model": "flash-1718900000-a1b2c3d4",
  "endpoint_name": "https://api.freesolo.co/v1",
  "gpu": "RTX 4090"
}
after · light
flash  deploy
────────────────────────────────────────────────────────────────────────────
  ● deployed   flash-1718900000-a1b2c3d4

{
  "run_id": "flash-1718900000-a1b2c3d4",
  "state": "deployed",
  "mode": "dev",
  "openai_model": "flash-1718900000-a1b2c3d4",
  "endpoint_name": "https://api.freesolo.co/v1",
  "gpu": "RTX 4090"
}
flash undeploy <run> · tear down a serving endpoint
before
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "deleted_endpoints": [
    "flash-a1b2c3d4-serve"
  ]
}
after · dark
flash  undeploy
────────────────────────────────────────────────────────────────────────────
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "deleted_endpoints": [
    "flash-a1b2c3d4-serve"
  ]
}
after · light
flash  undeploy
────────────────────────────────────────────────────────────────────────────
{
  "run_id": "flash-1718900000-a1b2c3d4",
  "deleted_endpoints": [
    "flash-a1b2c3d4-serve"
  ]
}
flash deployments · active serving deployments
before
RUN_ID                            GPU        ENDPOINT
flash-1718900000-a1b2c3d4         RTX 4090   https://api.freesolo.co/v1
after · dark
flash  deployments  1 active
────────────────────────────────────────────────────────────────────────────
RUN ID                     GPU       ENDPOINT                  
─────────────────────────  ────────  ──────────────────────────
flash-1718900000-a1b2c3d4  RTX 4090  https://api.freesolo.co/v1
after · light
flash  deployments  1 active
────────────────────────────────────────────────────────────────────────────
RUN ID                     GPU       ENDPOINT                  
─────────────────────────  ────────  ──────────────────────────
flash-1718900000-a1b2c3d4  RTX 4090  https://api.freesolo.co/v1
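The aligned tables used throughout the themed output can be produced by sizing each column to its widest cell and drawing a rule of the same width under the header. The following is a simplified sketch, with a hypothetical `render_table` helper; it left-aligns every column, whereas the themed `flash gpus` and `flash runs` samples appear to right-align numeric columns.

```python
def render_table(headers, rows):
    """Render headers, a rule line, and rows with two-space column gaps."""
    columns = list(zip(headers, *rows))
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(str(cell)) for cell in column) for column in columns]

    def line(cells):
        padded = (str(cell).ljust(width) for cell, width in zip(cells, widths))
        return "  ".join(padded).rstrip()

    rule = "  ".join("─" * width for width in widths)
    return "\n".join([line(headers), rule] + [line(row) for row in rows])


if __name__ == "__main__":
    print(render_table(
        ["RUN ID", "GPU", "ENDPOINT"],
        [["flash-1718900000-a1b2c3d4", "RTX 4090", "https://api.freesolo.co/v1"]],
    ))
```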
flash chat <run> -m ... · chat with a deployed adapter
before
The capital of France is Paris. It has been the country's political and cultural center since the 12th century.
after · dark
assistant
The capital of France is Paris. It has been the country's political and cultural center since the 12th century.
after · light
assistant
The capital of France is Paris. It has been the country's political and cultural center since the 12th century.