Metadata-Version: 2.4
Name: soup-cli
Version: 0.71.1
Summary: Fine-tune LLMs in one command. No SSH, no config hell.
Project-URL: Homepage, https://github.com/MakazhanAlpamys/Soup
Project-URL: Repository, https://github.com/MakazhanAlpamys/Soup
Project-URL: Issues, https://github.com/MakazhanAlpamys/Soup/issues
Author: Soup Team
License-Expression: Apache-2.0
License-File: LICENSE
License-File: NOTICE
Keywords: fine-tuning,llm,lora,machine-learning,qlora
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: huggingface-hub>=0.16.0
Requires-Dist: plotext>=5.2.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.0.0
Requires-Dist: typer<0.21.0,>=0.9.0
Provides-Extra: all
Requires-Dist: accelerate>=0.25.0; extra == 'all'
Requires-Dist: bitsandbytes>=0.41.0; extra == 'all'
Requires-Dist: datasets>=2.14.0; extra == 'all'
Requires-Dist: datasketch>=1.6.0; extra == 'all'
Requires-Dist: fastapi>=0.104.0; extra == 'all'
Requires-Dist: peft>=0.7.0; extra == 'all'
Requires-Dist: torch>=2.0.0; extra == 'all'
Requires-Dist: transformers<5.0.0,>=4.36.0; extra == 'all'
Requires-Dist: trl>=0.7.0; extra == 'all'
Requires-Dist: uvicorn>=0.24.0; extra == 'all'
Provides-Extra: audio
Requires-Dist: librosa>=0.10.0; extra == 'audio'
Requires-Dist: soundfile>=0.12.0; extra == 'audio'
Provides-Extra: awq
Requires-Dist: autoawq>=0.2.0; extra == 'awq'
Provides-Extra: cce
Requires-Dist: cut-cross-entropy>=24.10.0; extra == 'cce'
Provides-Extra: data
Requires-Dist: datasketch>=1.6.0; extra == 'data'
Provides-Extra: data-pro
Requires-Dist: langdetect>=1.0.9; extra == 'data-pro'
Requires-Dist: presidio-analyzer>=2.2.0; extra == 'data-pro'
Provides-Extra: deepspeed
Requires-Dist: deepspeed>=0.12.0; extra == 'deepspeed'
Provides-Extra: dev
Requires-Dist: accelerate>=0.25.0; extra == 'dev'
Requires-Dist: bitsandbytes>=0.41.0; extra == 'dev'
Requires-Dist: datasets>=2.14.0; extra == 'dev'
Requires-Dist: httpx>=0.24.0; extra == 'dev'
Requires-Dist: mypy>=1.8.0; extra == 'dev'
Requires-Dist: peft>=0.7.0; extra == 'dev'
Requires-Dist: pre-commit>=3.5.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Requires-Dist: torch>=2.0.0; extra == 'dev'
Requires-Dist: transformers<5.0.0,>=4.36.0; extra == 'dev'
Requires-Dist: trl>=0.7.0; extra == 'dev'
Provides-Extra: eval
Requires-Dist: lm-eval>=0.4.0; extra == 'eval'
Provides-Extra: fast
Requires-Dist: unsloth>=2024.8; extra == 'fast'
Provides-Extra: generate
Requires-Dist: httpx>=0.24.0; extra == 'generate'
Provides-Extra: gptq
Requires-Dist: auto-gptq>=0.7.0; extra == 'gptq'
Provides-Extra: liger
Requires-Dist: liger-kernel>=0.3.0; extra == 'liger'
Provides-Extra: mix
Requires-Dist: scikit-optimize>=0.9.0; extra == 'mix'
Provides-Extra: mlx
Requires-Dist: mlx-lm>=0.20.0; extra == 'mlx'
Requires-Dist: mlx>=0.20.0; extra == 'mlx'
Provides-Extra: onnx
Requires-Dist: optimum[onnxruntime]>=1.16.0; extra == 'onnx'
Provides-Extra: qat
Requires-Dist: torchao>=0.4.0; extra == 'qat'
Provides-Extra: remote
Requires-Dist: adlfs>=2024.1.0; extra == 'remote'
Requires-Dist: fsspec>=2024.1.0; extra == 'remote'
Requires-Dist: gcsfs>=2024.1.0; extra == 'remote'
Requires-Dist: s3fs>=2024.1.0; extra == 'remote'
Provides-Extra: ring-attn
Requires-Dist: ring-flash-attn>=0.1.0; extra == 'ring-attn'
Provides-Extra: serve
Requires-Dist: fastapi>=0.104.0; extra == 'serve'
Requires-Dist: uvicorn>=0.24.0; extra == 'serve'
Provides-Extra: serve-fast
Requires-Dist: fastapi>=0.104.0; extra == 'serve-fast'
Requires-Dist: uvicorn>=0.24.0; extra == 'serve-fast'
Requires-Dist: vllm>=0.4.0; extra == 'serve-fast'
Provides-Extra: sglang
Requires-Dist: fastapi>=0.104.0; extra == 'sglang'
Requires-Dist: sglang>=0.2.0; extra == 'sglang'
Requires-Dist: uvicorn>=0.24.0; extra == 'sglang'
Provides-Extra: tensorrt
Requires-Dist: tensorrt-llm>=0.9.0; extra == 'tensorrt'
Provides-Extra: trackers
Requires-Dist: mlflow>=2.0.0; extra == 'trackers'
Requires-Dist: swanlab>=0.3.0; extra == 'trackers'
Requires-Dist: trackio>=0.0.1; extra == 'trackers'
Provides-Extra: train
Requires-Dist: accelerate>=0.25.0; extra == 'train'
Requires-Dist: bitsandbytes>=0.41.0; extra == 'train'
Requires-Dist: datasets>=2.14.0; extra == 'train'
Requires-Dist: peft>=0.7.0; extra == 'train'
Requires-Dist: torch>=2.0.0; extra == 'train'
Requires-Dist: transformers<5.0.0,>=4.36.0; extra == 'train'
Requires-Dist: trl>=0.7.0; extra == 'train'
Provides-Extra: tui
Requires-Dist: textual>=0.50.0; extra == 'tui'
Provides-Extra: ui
Requires-Dist: fastapi>=0.104.0; extra == 'ui'
Requires-Dist: uvicorn>=0.24.0; extra == 'ui'
Provides-Extra: vision
Requires-Dist: pillow>=9.0.0; extra == 'vision'
Provides-Extra: wandb
Requires-Dist: wandb<0.18.0,>=0.15.0; extra == 'wandb'
Description-Content-Type: text/markdown

<p align="center">
  <img src="soup.png" alt="Soup" width="280">
</p>

<h1 align="center">Soup</h1>

<p align="center">
  <strong>Fine-tune LLMs in one command. No SSH, no config hell.</strong>
</p>

<p align="center">
  <a href="https://trysoup.dev">Website</a> &middot;
  <a href="#quick-start">Quick Start</a> &middot;
  <a href="#configuration">Config</a> &middot;
  <a href="#documentation">Docs</a> &middot;
  <a href="docs/commands.md">Commands</a> &middot;
  <a href="docs/models.md">Models</a>
</p>

<p align="center">
  <a href="https://pypi.org/project/soup-cli/"><img src="https://img.shields.io/pypi/v/soup-cli?color=blue" alt="PyPI"></a>
  <a href="https://pepy.tech/project/soup-cli"><img src="https://img.shields.io/pepy/dt/soup-cli?color=blue" alt="Downloads"></a>
  <img src="https://img.shields.io/badge/python-3.10%2B-blue" alt="Python 3.10+">
  <img src="https://img.shields.io/badge/license-Apache--2.0-blue" alt="Apache-2.0 License">
  <a href="https://github.com/MakazhanAlpamys/Soup/actions"><img src="https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/MakazhanAlpamys/65fdc943f85f3b2c46ecddb415c2b779/raw/soup_tests.json" alt="Tests"></a>
  <a href="https://github.com/MakazhanAlpamys/Soup/actions"><img src="https://github.com/MakazhanAlpamys/Soup/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
  <a href="https://trysoup.dev"><img src="https://img.shields.io/badge/website-trysoup.dev-blue" alt="Website"></a>
</p>

---

Soup turns the pain of LLM fine-tuning into a simple workflow. One config, one command, done.

```bash
pip install 'soup-cli[train]'   # add [train] to fine-tune; bare `soup-cli` is the light CLI
soup init --template chat
soup train
```

## Why Soup?

Training LLMs is still painful. Even experienced teams spend 30-50% of their time fighting
infrastructure instead of improving models. Soup fixes that.

- **Zero SSH.** Never SSH into a broken GPU box again.
- **One config.** A simple YAML file is all you need.
- **Auto everything.** Batch size, GPU detection, quantization — handled.
- **Works locally.** Train on your own GPU with QLoRA. No cloud required.

## What's New

**v0.71.1: Quick wins + wiring.** Seven small-but-sharp closures, including:

- **`soup env fix`** renders a reproducible install plan (copy/paste `uv pip` commands or a
  `requirements.txt`) straight from `soup-env.lock` — print-only, no surprise package-manager calls.
- **`soup lock write --env-lock`** auto-derives the env hash from `soup-env.lock` so you never
  hand-copy a 64-hex string after `soup env lock`.
- **`soup serve --record-thumbs <db>`** captures 👍/👎 feedback into a local-RL SQLite, plus a new
  `POST /v1/thumbs` endpoint — the start of an on-box feedback flywheel.
- **Judge calibration persistence** — write/load a `JudgeCalibrationReport` as JSON, backed by a new
  `judge_calibration` registry artifact kind.
- **Bundled MUSE + WMDP unlearning eval fixtures** so `soup eval unlearning --benchmark muse|wmdp`
  runs out of the box (WMDP forget-set probes ship **redacted** — never verbatim hazardous content).
- **`soup completions`** now introspects a cached base model's real LoRA target modules.

Full history: [CHANGELOG.md](CHANGELOG.md) &middot; [GitHub Releases](https://github.com/MakazhanAlpamys/Soup/releases).

## Quick Start

### 1. Install

```bash
pip install soup-cli            # light: CLI + config + data tools (no PyTorch)
pip install 'soup-cli[train]'   # add the training stack (torch, transformers, peft, trl, …)
pip install git+https://github.com/MakazhanAlpamys/Soup.git   # latest dev
```

`soup init`, `soup data …`, and the other data/inspection commands work on the light install.
Fine-tuning (`soup train`) needs the `[train]` extra.

### 2. Create a config

```bash
soup init                       # interactive wizard
soup init --template chat       # or start from a template
```

Templates: `chat`, `code`, `tool-calling`, `medical`, `reasoning`, `vision`, `kto`, `orpo`,
`simpo`, `ipo`, `bco`, `rlhf`, `pretrain`, `moe`, `longcontext`, `embedding`, `audio`.

### 3. Train, test, ship

```bash
soup train --config soup.yaml                 # LoRA, quantization, batching — all handled
soup chat  --model ./output                    # talk to your model
soup push  --model ./output --repo you/my-model

soup merge  --adapter ./output                              # merge LoRA into the base
soup export --model ./output --format gguf --quant q4_k_m   # GGUF for Ollama / llama.cpp
```

More export targets (ONNX, TensorRT, AWQ, GPTQ, BitNet) and deployment options live in
[`docs/serving-and-export.md`](docs/serving-and-export.md).

## Configuration

A complete `soup.yaml`:

```yaml
base: meta-llama/Llama-3.1-8B-Instruct
task: sft
# backend: unsloth  # 2-5x faster, pip install 'soup-cli[fast]'

data:
  train: ./data/train.jsonl
  format: alpaca
  val_split: 0.1

training:
  epochs: 3
  lr: 2e-5
  batch_size: auto
  lora:
    r: 64
    alpha: 16
  quantization: 4bit

output: ./output
```

`config/schema.py` is the single source of truth for every field. Advanced data, training,
and PEFT options are documented under [Documentation](#documentation).
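
To make the `lora` block above more concrete, here is a small sketch of how `r` and `alpha` combine into the scaling factor applied to the adapter update. It assumes Soup passes these values through to a standard LoRA implementation (scaling `alpha / r`), which this README does not state; the function names are hypothetical. The rsLoRA variant listed in the docs uses `alpha / sqrt(r)` instead.

```python
import math

def lora_scaling(r: int, alpha: float) -> float:
    """Standard LoRA scaling: the low-rank update is multiplied by alpha / r."""
    return alpha / r

def rslora_scaling(r: int, alpha: float) -> float:
    """Rank-stabilized LoRA (rsLoRA) scaling: alpha / sqrt(r)."""
    return alpha / math.sqrt(r)

# Values from the sample soup.yaml: r = 64, alpha = 16.
print(lora_scaling(64, 16))    # 0.25
print(rslora_scaling(64, 16))  # 2.0
```

With the sample values the adapter update is scaled down by a factor of four under standard LoRA, so raising `alpha` or lowering `r` increases the adapter's effective contribution.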

## Documentation

The full feature reference lives in [`docs/`](docs/). Start here:

| Guide | Covers |
|---|---|
| [Training tasks & methods](docs/training.md) | SFT, DPO/GRPO/PPO/KTO/ORPO/SimPO/IPO/BCO, tool-calling, PRM, pre-training, distillation, classification, vision/audio/TTS, unlearning, RAFT/RA-DIT, loop-hardening detectors |
| [PEFT, long context & efficiency](docs/peft-and-efficiency.md) | DoRA, LoRA+, rsLoRA, VeRA, OLoRA, NEFTune, PiSSA, ReLoRA, optimizer & PEFT zoo, LLaMA Pro, GaLore, YaRN/LongLoRA, packing, curriculum, auto-tuning |
| [Performance & quantization](docs/performance-and-quantization.md) | QAT, FP8, Quant Menu (I + II), KV-cache, NVFP4, save formats, Cut Cross-Entropy, gradient checkpointing, kernels, activation offloading, multi-GPU / DeepSpeed / FSDP |
| [Data engineering](docs/data.md) | Formats, the Axolotl/LF-parity pipeline, data tools, synthetic generation & forge, quality scorecards, trace tooling, remote datasets, mixing, recipe DAGs |
| [Evaluation & probes](docs/evaluation.md) | Eval design/gate, eval-gated training, benchmarks, NLG metrics, calibration, Elo arena, diagnose, post-train X-ray probes, A/B, drift, tunability, `soup advise` |
| [Serving & export](docs/serving-and-export.md) | OpenAI-compatible server, batch inference, benchmarking, merge/export, Anthropic Messages endpoint, speculative decoding, deploy autopilot, Web UI, Agent Forge |
| [Adapters, registry & governance](docs/adapters-and-governance.md) | Adapter lifecycle/management, model registry, Soup Cans, the data flywheel (`soup loop`), knowledge editing, steering, supply-chain controls (scan/sign/BOM/attest/audit/airgap) |
| [Backends, platform & ops](docs/backends-and-ops.md) | MLX/Unsloth backends, alternative hubs, HF Hub integration, autopilot, experiment tracking, plan/apply, env lockfiles, hardware-fit, completions, plugins, utility commands |
| [Command reference](docs/commands.md) | The full `soup` command list |
| [Supported models & extras](docs/models.md) | Recommended model families, the VRAM size guide, the pip extras matrix |

## Data Formats

All formats are auto-detected from JSONL, JSON, CSV, Parquet, or TXT:

- **alpaca** — `{"instruction": ..., "input": ..., "output": ...}`
- **sharegpt** — `{"conversations": [{"from": "human", "value": ...}, ...]}`
- **chatml** — `{"messages": [{"role": "user", "content": ...}, ...]}`
- **dpo / orpo / simpo / ipo** — `{"prompt": ..., "chosen": ..., "rejected": ...}`
- **kto** — `{"prompt": ..., "completion": ..., "label": true}`
- **llava / sharegpt4v** (vision), **audio**, **plaintext** (pre-training), **embedding**,
  **prm**, **pre_tokenized**, **video**, **multimodal**

Full schemas and the Axolotl/LlamaFactory-parity data pipeline (remote URIs, streaming,
sharding, interleaving, vocab expansion, document ingestion) are in
[`docs/data.md`](docs/data.md).
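
As a rough illustration of the `alpaca` and `chatml` shapes listed above, the sketch below serializes a few records to JSONL (one JSON object per line). The field names come from the format list; the text values, the `assistant` role, and the helper name `to_jsonl` are assumptions made for the example.

```python
import json

# Field names follow the alpaca format above; the values are made up.
alpaca_rows = [
    {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"},
    {"instruction": "Name a prime number.", "input": "", "output": "7"},
]

# Field names follow the chatml format above; the "assistant" role is assumed.
chatml_rows = [
    {
        "messages": [
            {"role": "user", "content": "Name a prime number."},
            {"role": "assistant", "content": "7"},
        ]
    },
]

def to_jsonl(rows):
    """Serialize an iterable of dicts as JSONL text, one object per line."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)

print(to_jsonl(alpaca_rows), end="")
```

Writing the returned string to `./data/train.jsonl` gives a file at the path used in the sample config.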

## Common Commands

```bash
soup train  --config soup.yaml        # train (SFT/DPO/GRPO/PPO/KTO/ORPO/SimPO/IPO/...)
soup infer  --model ./output --input prompts.jsonl   # batch inference
soup chat   --model ./output          # interactive chat
soup serve  --model ./output          # OpenAI-compatible API server
soup merge  --adapter ./output        # merge LoRA into the base model
soup export --model ./output --format gguf           # export for deployment
soup eval   benchmark --model ./output               # evaluate
soup data   inspect ./data/train.jsonl               # dataset stats
soup recipes list                     # 100+ ready-made model recipes
soup autopilot --model <id> --data d.jsonl --goal chat  # zero-config
soup doctor                           # check GPU / deps / environment
```

The complete command list is in [`docs/commands.md`](docs/commands.md).
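
Because `soup serve` is described as OpenAI-compatible, a client can in principle talk to it with plain HTTP. The sketch below builds a chat request using only the standard library. The base URL, port, and model name are placeholders, and the `/v1/chat/completions` route is the usual OpenAI convention rather than something this README confirms, so check [`docs/serving-and-export.md`](docs/serving-and-export.md) before relying on it.

```python
import json
import urllib.request

def build_chat_request(base_url, model, prompt):
    """Build (but do not send) a POST request for an OpenAI-style chat route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",  # assumed route
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send(request, timeout=60):
    """Send the request and return the first choice's message text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Hypothetical host, port, and model name; adjust to your running server.
request = build_chat_request("http://localhost:8000", "my-model", "Hello!")
print(request.full_url)
```

Calling `send(request)` performs the actual network call, so it only works while a server is running at the chosen address.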

## Supported Models

Soup works with **any** text-generation model on the
[Hugging Face Hub](https://huggingface.co/models?pipeline_tag=text-generation): if a model loads
with `AutoModelForCausalLM`, it works with no config changes. Llama 3.x/4, Qwen 2.5/3, Gemma 3,
Mistral, Mixtral, DeepSeek R1/V3, Phi-4, and 100+ others ship as ready-made recipes (`soup recipes list`).

| VRAM | Max model (QLoRA 4-bit) | Example |
|---|---|---|
| 8 GB | ~7-8B | Llama-3.1-8B, Mistral-7B |
| 16 GB | ~14B | Phi-4-14B, Qwen2.5-14B |
| 24 GB | ~34B | CodeLlama-34B, Yi-1.5-34B |
| 48 GB | ~70B | Llama-3.3-70B |
| 80 GB+ | 70B+ (full) or MoE | Mixtral-8x22B, DeepSeek-V3 |

Full model + vision tables and the optional-extras matrix are in [`docs/models.md`](docs/models.md).
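
The arithmetic behind the table can be approximated with a back-of-the-envelope estimate of weight memory alone. This is a sketch, not Soup's own sizing logic: it ignores activations, LoRA adapter parameters, optimizer state, and quantization metadata, all of which add to the real footprint. The function name is hypothetical.

```python
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9

for size in (7, 14, 34, 70):
    four_bit = weight_memory_gb(size, 4)
    sixteen_bit = weight_memory_gb(size, 16)
    print(f"{size}B: ~{four_bit:.1f} GB at 4-bit, ~{sixteen_bit:.1f} GB at 16-bit")
```

For a 7B model this gives roughly 3.5 GB of weights at 4-bit, which is why the remaining headroom on an 8 GB card can go to activations and adapter training state.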

## Docker

Run Soup without installing CUDA or PyTorch locally (image published to GHCR on every release):

```bash
docker pull ghcr.io/makazhanalpamys/soup:latest
docker run --gpus all -v "$(pwd)":/workspace ghcr.io/makazhanalpamys/soup train --config soup.yaml
docker compose up   # or build locally
```

## Requirements

- Python 3.10+
- GPU with CUDA (recommended), Apple Silicon (MPS), or CPU (experimental — very slow)
- 8 GB+ VRAM for 7B models with QLoRA

All training tasks run on CPU for testing (quantization auto-disabled). Optional extras
(`train`, `all`, `fast`, `vision`, `qat`, `serve`, `serve-fast`, `ui`, `eval`, `deepspeed`,
`liger`, `mlx`, `onnx`, `tensorrt`, …) are listed in
[`docs/models.md`](docs/models.md#optional-extras).
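
To see which extras the installed distribution actually declares, the package metadata can be read with the standard library. This is a minimal sketch; `declared_extras` is a hypothetical helper, not part of Soup.

```python
from importlib import metadata

def declared_extras(dist_name="soup-cli"):
    """Return the extras an installed distribution declares, or [] if it is not installed."""
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return []
    # Each "Provides-Extra" header in the package metadata names one extra.
    return sorted(meta.get_all("Provides-Extra") or [])

print(declared_extras())
```

An empty list means either the distribution is not installed in the current environment or it declares no extras.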

## Troubleshooting

```bash
soup doctor    # GPU, system resources, dependencies, and version in one place
```

- **`ImportError: DLL load failed while importing _C` (Windows)** — reinstall PyTorch for your
  CUDA version: `pip install torch --index-url https://download.pytorch.org/whl/cu121`.
- **`soup version` ≠ `pip show soup-cli`** — multiple Python installs; use a virtualenv.
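
For the version-mismatch case, it can help to check which interpreter is running and which `soup-cli` version that interpreter sees. The sketch below uses only the standard library; `describe_install` is a hypothetical helper name.

```python
import sys
from importlib import metadata

def describe_install(dist_name="soup-cli"):
    """Report the running interpreter and the distribution version it can see."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # not installed for this interpreter
    return {"python": sys.executable, "version": version}

print(describe_install())
```

Running this with each Python on your `PATH` and comparing the output shows which interpreter holds which install.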

## Development

```bash
git clone https://github.com/MakazhanAlpamys/Soup.git
cd Soup
pip install -e ".[dev]"

ruff check src/soup_cli/ tests/    # lint
pytest tests/ -v                   # unit tests (fast, no GPU)
pytest tests/ -m smoke -v          # smoke tests (downloads a tiny model, trains)

pre-commit install                 # optional: ruff lint+format on commit
```

See [CONTRIBUTING.md](CONTRIBUTING.md) for the full workflow and [SECURITY.md](SECURITY.md) to
report a vulnerability.

## License

[Apache-2.0](LICENSE). Copyright © the Soup contributors.
