Metadata-Version: 2.4
Name: gpu-installer
Version: 4.5.0
Summary: Intelligent GPU-accelerated ML package installer for PyTorch, CuPy, and llama-cpp-python
Project-URL: Repository, https://gitlab.com/yoanncure/gpu_installer
License: MIT
Keywords: cuda,gpu,installer,pytorch,rocm
Classifier: Environment :: GPU :: NVIDIA CUDA
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.9
Provides-Extra: color
Requires-Dist: rich-argparse>=1; extra == 'color'
Provides-Extra: completion
Requires-Dist: argcomplete>=3; extra == 'completion'
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == 'dev'
Description-Content-Type: text/markdown

# Smart GPU Package Installer

An intelligent, autonomous installer that detects your system, GPU, and CUDA/ROCm configuration to install the optimal ML ecosystem (`PyTorch`, `CuPy`, `JAX`, `llama-cpp-python`) with zero guesswork. It can also optionally provision the underlying system GPU stack (CUDA toolkit + cuDNN, or ROCm) for you.

This project is based on [this gist](https://gist.github.com/kipavy/2c5acabbff81a410b340464a667a12c4/93a56a0cc6bcca4868bcb7724e0f7f3331b32f02) and on [`torch-installer-coff33ninja`](https://pypi.org/project/torch-installer-coff33ninja/).

---

## ✨ Features

- 🔍 **Automatic hardware detection** — identifies GPU vendor (NVIDIA / AMD / Apple Silicon), driver, and CUDA/ROCm version with no flags.
- 🧩 **Synchronized version matrix** — pins `torch`, `torchvision`, and `torchaudio` to compatible triplets, eliminating ABI mismatches.
- 🌐 **Cross-platform** — Linux (CUDA & ROCm), Windows (CUDA, with optional auto-install), macOS (MPS), and CPU-only fallback.
- 📦 **Multi-package** — installs PyTorch, CuPy, JAX, and llama-cpp-python individually or all at once, with GPU-aware skipping.
- 🛠️ **System stack auto-install (opt-in)** — installs the GPU *userspace* stack (CUDA toolkit + cuDNN, or ROCm) from official vendor repos via `apt`/`dnf`/`winget`/`chocolatey`. Never touches kernel drivers, and previews the full plan before any `sudo` step.
- 🗑️ **Clean uninstall** — `--uninstall` removes the selected ecosystem, including stray `nvidia-*` fat-wheels, for a sterile environment.
- 🧹 **Sterile cross-grades** — purges stale `nvidia-*` fat-wheels and CPU-only builds before reinstalling.
- ✅ **Isolated verification** — pre- and post-install checks run in a subprocess to defeat `sys.modules` caching, with retry/backoff for linker sync lag.
- ♻️ **Smart, idempotent reinstall** — skips packages already working and only acts when the environment actually needs it.
- 🚀 **Zero-install usage** — run on the fly via `uvx`, or embed programmatically through the dependency-free `ensure()` API.
- ⚡ **Faster installs with `uv`** — automatically uses `uv` when available for significantly faster installs, falling back to `pip` transparently.
- 🔬 **Dry-run & diagnostics** — preview exact commands, inspect detected GPU info, and list supported CUDA wheel versions without touching your environment.

---

## 🚀 Quick Start

The fastest path — **no install, no manual steps**. With [`uv`](https://docs.astral.sh/uv/getting-started/installation/) installed, one command fetches the tool on the fly and installs the right GPU wheels into your **active** environment:

```bash
# Detect GPU/CUDA and install the full ecosystem (torch + cupy + jax + llamacpp)
uvx --from git+https://gitlab.com/yoanncure/gpu_installer gpu-installer

# Just PyTorch — space- or comma-separated for a subset
uvx --from git+https://gitlab.com/yoanncure/gpu_installer gpu-installer torch
uvx --from git+https://gitlab.com/yoanncure/gpu_installer gpu-installer torch,cupy

# Preview only — print the exact commands without running them
uvx --from git+https://gitlab.com/yoanncure/gpu_installer gpu-installer --dry-run
```

> **Not on PyPI yet**, hence `--from git+…`. Once published this shortens to just `uvx gpu-installer torch`.
> `uvx` installs into the currently active venv/conda env. If none is active, add `--python /path/to/python` so it knows where to install.

Prefer a traditional install? Install the CLI once, then call it directly:

```bash
pip install git+https://gitlab.com/yoanncure/gpu_installer   # or: uv pip install git+…

gpu-installer                 # full auto-detect + install
gpu-installer --cpu-only      # force CPU-only builds
gpu-installer torch cupy      # a subset (space- or comma-separated)
gpu-installer --dry-run       # preview without executing
```

---

## Why This Exists

Installing GPU-accelerated ML packages is notoriously fragile. The wrong wheel, a stale `nvidia-*` package, or a `uv`/`pip` index race can silently install a CPU-only build or brick your environment. This script handles all of that:

- Detects your GPU vendor (NVIDIA / AMD / Apple Silicon) and driver version automatically
- Pins `torch`, `torchvision`, and `torchaudio` to a synchronized release matrix, so the three never resolve to ABI-incompatible builds
- Purges stale `nvidia-*` fat-wheel residue before any cross-grade
- Runs pre- and post-install verification in an **isolated subprocess** to defeat `sys.modules` caching
- Retries verification after fast installs (`uv`) with increasing backoff (3s / 5s / 8s) to ride out dynamic linker sync lag

---

## Supported Platforms

| Platform | GPU Backend | Notes |
|---|---|---|
| Linux | NVIDIA CUDA | Auto-detected via `nvcc` / `nvidia-smi`; optional toolkit+cuDNN auto-install via apt/dnf |
| Linux | AMD ROCm | Auto-detected via `rocminfo` / `rocm-smi`; optional ROCm userspace auto-install via apt/dnf |
| Windows | NVIDIA CUDA | Optional auto-install via `winget` / `chocolatey` |
| macOS | Apple MPS (Metal) | Auto-detected, no CUDA needed |
| Any | CPU-only | Use `--cpu-only` |

---

## Requirements

- Python 3.9+
- [`uv`](https://docs.astral.sh/uv/getting-started/installation/) *(recommended, falls back to `pip` automatically)*
- NVIDIA drivers / ROCm stack already installed for GPU builds

---

## Shell Completion (optional)

Tab-completion for package names and flags is provided via
[`argcomplete`](https://github.com/kislyuk/argcomplete). It's an optional
extra — the core tool stays dependency-free.

```bash
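# Install with the completion extra
pip install "gpu-installer[completion]"
# Not on PyPI yet? Install the extra straight from the repository instead:
pip install "gpu-installer[completion] @ git+https://gitlab.com/yoanncure/gpu_installer"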

# Activate for the current shell (bash / zsh)
eval "$(register-python-argcomplete gpu-installer)"

# Fish
register-python-argcomplete --shell fish gpu-installer | source
```

Add the `eval` line to your `~/.bashrc` / `~/.zshrc` to make it permanent. For
fish, write it to a completions file instead:

```fish
register-python-argcomplete --shell fish gpu-installer > ~/.config/fish/completions/gpu-installer.fish
```

Then `gpu-installer <TAB>` completes `torch`, `cupy`, `jax`, and `llamacpp`.

## Colored Help (optional)

Colorized `--help` output is provided via
[`rich-argparse`](https://github.com/hamdanal/rich-argparse). Like completion,
it's an optional extra — the core tool stays dependency-free and falls back to
plain help when it isn't installed.

```bash
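pip install "gpu-installer[color]"
# Not on PyPI yet? Use: pip install "gpu-installer[color] @ git+https://gitlab.com/yoanncure/gpu_installer"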
```

---

## Use in Another Project

Other projects can install their GPU dependencies through gpu-installer without hooking into `pip install` (wheels have no reliable install-time hook). Trigger it explicitly — from a setup command, a first-run guard, or your app's startup.

**Zero permanent dependency (recommended).** Shell out via `uvx`, passing your own interpreter so the target is unambiguous. gpu-installer is fetched and run on the fly — never added to your dependency tree:

```python
import subprocess, sys

subprocess.run(
    [
        "uvx", "--from", "git+https://gitlab.com/yoanncure/gpu_installer",
        "gpu-installer", "--python", sys.executable, "torch", "cupy",
    ],
    check=True,
)
```
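
A first-run guard can wrap the same call so that it only fires when something is missing. The sketch below is one possible shape, not part of gpu-installer: `missing_modules`, `build_uvx_command`, and `ensure_gpu_deps` are hypothetical helper names, and the guard only checks that a module can be found, not that its GPU build actually works.

```python
import importlib.util
import subprocess
import sys

REPO = "git+https://gitlab.com/yoanncure/gpu_installer"


def missing_modules(module_names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def build_uvx_command(packages, python=sys.executable):
    """Build the same uvx invocation as above for the given package names."""
    return ["uvx", "--from", REPO, "gpu-installer", "--python", python, *packages]


def ensure_gpu_deps(packages=("torch", "cupy")):
    """Hypothetical guard: shell out to gpu-installer only if an import is missing.

    Assumes the import name equals the gpu-installer package name, which holds
    for torch and cupy but not for llamacpp (imported as `llama_cpp`).
    """
    needed = missing_modules(packages)
    if needed:
        subprocess.run(build_uvx_command(needed), check=True)
    return needed
```

Calling `ensure_gpu_deps()` once at startup keeps the common case (everything already present) free of any subprocess cost.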

**Programmatic API (if you accept the tiny zero-dep dependency).** Add gpu-installer to your deps and call `ensure()` — it is idempotent (skips packages already working) and returns a structured result:

```python
from gpu_installer import ensure

result = ensure(["torch", "cupy"])   # installs into the active environment
if not result.ok:
    raise RuntimeError(f"GPU deps failed: {result.failed}")
```

`ensure()` accepts a list or comma string, plus `python=`, `cpu_only=`, `force_cuda=`, `force_reinstall=`, `dry_run=`, and `quiet=`. It returns an `EnsureResult` with `target_python`, `cuda`, `installed`, `skipped`, `failed`, `errors` (a `{package: reason}` map for anything that failed — carrying the installer's captured error text or the GPU probe's real failure message, not just a category), and an `ok` flag.

Pass `quiet=True` to keep gpu-installer's own status output off your stdout — it's routed to the `gpu_installer` logger instead, so configure that logger to capture it (the underlying `uv`/`pip` install still streams its progress):

```python
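import logging

from gpu_installer import ensure

# StreamHandler writes to stderr by default, keeping stdout clean.
logging.getLogger("gpu_installer").addHandler(logging.StreamHandler())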

result = ensure(["torch", "cupy"], quiet=True)
if not result.ok:
    raise RuntimeError(f"GPU deps failed: {result.errors}")
```

---

## CUDA / PyTorch Compatibility Matrix

The script uses a strict pinned matrix to prevent rolling-release index desyncs where `torch` and `torchvision` resolve to incompatible builds:

| CUDA Version | PyTorch | TorchVision | TorchAudio |
|---|---|---|---|
| 11.6 | 2.1.2 | 0.16.2 | 2.1.2 |
| 11.7 | 2.2.2 | 0.17.2 | 2.2.2 |
| 11.8 | 2.5.1 | 0.20.1 | 2.5.1 |
| 12.1 | 2.5.1 | 0.20.1 | 2.5.1 |
| 12.4 | 2.5.1 | 0.20.1 | 2.5.1 |
| 12.6 | 2.11.0 | 0.26.0 | 2.11.0 |
| 12.8 | 2.11.0 | 0.26.0 | 2.11.0 |
| 13.0 | 2.11.0 | 0.26.0 | 2.11.0 |

If your exact CUDA version isn't listed, the script automatically picks the highest compatible version below yours.
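
The fallback rule can be illustrated with a small lookup. This is a sketch of the rule as described here, not the tool's actual code: `MATRIX` is copied from the table above, and `pick_pinned_versions` is a hypothetical name.

```python
# Pinned triplets copied from the table above:
# (CUDA major, minor) -> (torch, torchvision, torchaudio)
MATRIX = {
    (11, 6): ("2.1.2", "0.16.2", "2.1.2"),
    (11, 7): ("2.2.2", "0.17.2", "2.2.2"),
    (11, 8): ("2.5.1", "0.20.1", "2.5.1"),
    (12, 1): ("2.5.1", "0.20.1", "2.5.1"),
    (12, 4): ("2.5.1", "0.20.1", "2.5.1"),
    (12, 6): ("2.11.0", "0.26.0", "2.11.0"),
    (12, 8): ("2.11.0", "0.26.0", "2.11.0"),
    (13, 0): ("2.11.0", "0.26.0", "2.11.0"),
}


def pick_pinned_versions(detected):
    """Return (matrix_cuda, triplet) for the highest entry <= detected, or None.

    `detected` is a (major, minor) tuple; tuples compare element by element,
    so (12, 5) sorts between (12, 4) and (12, 6).
    """
    candidates = [cuda for cuda in MATRIX if cuda <= detected]
    if not candidates:
        return None  # older than every supported CUDA version
    best = max(candidates)
    return best, MATRIX[best]
```

For example, a detected CUDA 12.5 resolves to the 12.4 row, while anything older than 11.6 has no match in this sketch.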

---

## CLI Reference

### Install Options

| Argument | Default | Description |
|---|---|---|
| `[PACKAGE ...]` | all | Packages to install, space- or comma-separated (`torch`, `cupy`, `jax`, `llamacpp`) |
| `-p`, `--python PATH` | active env | Target interpreter to install into (active venv/conda env by default) |
| `--cpu-only` | off | Force CPU-only builds for all packages |
| `--force-cuda VER` | auto | Override detected CUDA version (e.g. `121`, `cu12.1`) |
| `--force-reinstall` | off | Force reinstall even if already detected as working |
| `-u`, `--uninstall` | off | Uninstall the selected packages' distributions (default: all). Mutually exclusive with `--force-reinstall` |
| `-n`, `--dry-run` | off | Print all commands that would run, without executing |
| `-l`, `--log` | off | Tee all output to a timestamped log file |
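
To show what accepting both `121` and `cu12.1` for `--force-cuda` might involve, here is a hedged sketch of a normalizer. The function name `normalize_cuda_version` is hypothetical, and only those two spellings come from the table above; treating `12.1` and `cu121` as valid too is an assumption.

```python
import re


def normalize_cuda_version(value):
    """Turn '121', '12.1', 'cu121', or 'cu12.1' into a (major, minor) tuple."""
    text = value.strip().lower()
    if text.startswith("cu"):
        text = text[2:]  # drop the wheel-tag prefix
    if re.fullmatch(r"\d+\.\d+", text):
        major, minor = text.split(".")
    elif re.fullmatch(r"\d{3}", text):
        # Compact form: the last digit is the minor version, e.g. '121' -> 12.1
        major, minor = text[:-1], text[-1]
    else:
        raise ValueError(f"unrecognized CUDA version: {value!r}")
    return int(major), int(minor)
```

Normalizing to a tuple early keeps later comparisons against the compatibility matrix simple.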

### Diagnostic / Info Commands

| Flag | Description |
|---|---|
| `--gpu-info` | Show detected GPU model, VRAM, CUDA version, and upgrade guidance |
| `-d`, `--doctor` | Diagnose the active environment — run GPU verification probes on each package |
| `--list-cuda` | List all supported CUDA wheel versions |
| `--show-matching` | Show which wheel version your detected CUDA maps to |

### System Stack Auto-Install (opt-in)

Installs the GPU **userspace** stack from official vendor repos. Never touches
kernel drivers. Privileged steps run via `sudo` after showing the full plan.

| Flag | Description |
|---|---|
| `--auto-install-system` | Install the system stack: CUDA toolkit + cuDNN (NVIDIA) or ROCm (AMD) on Linux; CUDA via `winget`/`chocolatey` on Windows |
| `--cuda-version VER` | Toolkit version for auto-install (e.g. `12.4`) |
| `-y`, `--yes` | Skip the confirmation prompt (scripted / CI use) |

Linux coverage: `apt` (Ubuntu/Debian) and `dnf` (RHEL/Fedora/Rocky). Always
preview with `--dry-run` first.

---

## Packages Installed

### PyTorch Ecosystem (`torch`)
Installs `torch`, `torchvision`, and `torchaudio` as a pinned, synchronized triplet from the official PyTorch wheel index. Handles CUDA, ROCm, MPS, and CPU targets.

### CuPy (`cupy`)
Installs the appropriate `cupy-cuda{major}x` wheel based on your detected CUDA major version. Skipped automatically on CPU-only or MPS systems.

### JAX (`jax`)
Installs `jax[cuda12]` or `jax[cuda13]` keyed to your detected CUDA major version. The wheels bundle their own CUDA redistributables, so only a recent NVIDIA driver is required. Fully standalone: it does not depend on your installed `torch`. Plain `jax` is a real CPU build, so with `--cpu-only` (and on macOS, where JAX has no Metal backend) it is installed as a CPU build, like `llama-cpp-python`, rather than skipped. The CUDA wheels are Linux-only, so the **GPU** path is skipped on Windows (use WSL2 or `--cpu-only`) and on ROCm (experimental and local-build-only upstream).
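
The selection rules in this paragraph can be summarized as a small decision function. This is a sketch under stated assumptions, not the project's implementation: `jax_requirement` and the backend labels (`'cuda'`, `'rocm'`, `'mps'`, `'cpu'`) are hypothetical, and the behavior for CUDA majors other than 12 and 13 is not described here, so the sketch simply skips them.

```python
def jax_requirement(system, backend, cuda_major=None, cpu_only=False):
    """Return the pip requirement for JAX, or None when the GPU path is skipped.

    system:  value of platform.system() ('Linux', 'Windows', 'Darwin')
    backend: 'cuda', 'rocm', 'mps', or 'cpu' (labels assumed for this sketch)
    """
    if cpu_only or backend in ("cpu", "mps") or system == "Darwin":
        return "jax"  # plain jax is a real CPU build
    if backend == "rocm" or system == "Windows":
        return None  # GPU path skipped on ROCm and on native Windows
    if backend == "cuda" and system == "Linux" and cuda_major in (12, 13):
        return f"jax[cuda{cuda_major}]"
    return None  # e.g. CUDA 11: not covered by the text, skipped in this sketch
```

Returning `None` for "skip" keeps the caller's logic to a single check before building the install command.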

### Llama-CPP-Python (`llamacpp`)
Installs `llama-cpp-python` with GPU offload support via the `abetlen` pre-built wheel index. Falls back to a CPU build if no CUDA/ROCm is detected.

---

## Adding a Package

Each GPU package is a `GpuPackage` subclass in `src/gpu_installer/packages/`.
To add one:

1. Create `src/gpu_installer/packages/<name>.py` with a class implementing
   `install(self, plan) -> Outcome` and `verify(self, python=None) -> bool`
   (optionally override `preflight` and declare a `purge_names` list).
2. Register it in the `_PACKAGES` tuple in `packages/__init__.py`.

That's the whole change — the CLI, `ensure()`, and the purge logic pick it up
automatically.
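
As a rough illustration of step 1, the sketch below shows the shape such a class could take. To stay runnable on its own it defines stand-ins for `GpuPackage` and `Outcome`; the real classes live in gpu-installer and their fields, the structure of `plan`, and the import paths are all assumptions here. `ExamplePackage` and the `example-gpu` distribution are made up.

```python
from dataclasses import dataclass


# Stand-ins so the sketch runs by itself. In the real project these would be
# imported from gpu_installer; their actual fields may differ.
@dataclass
class Outcome:
    ok: bool
    detail: str = ""


class GpuPackage:
    name = ""
    purge_names = []

    def preflight(self, plan):
        """Optional hook; the default here accepts every plan."""
        return True


class ExamplePackage(GpuPackage):
    """Hypothetical package backed by a made-up 'example-gpu' distribution."""

    name = "example"
    purge_names = ["example-gpu"]  # distributions to remove on purge/uninstall

    def install(self, plan) -> Outcome:
        # A real implementation would build and run the uv/pip command here.
        # `plan` is treated as a plain dict purely for illustration.
        if plan.get("dry_run"):
            return Outcome(ok=True, detail="dry run: nothing installed")
        return Outcome(ok=False, detail="install step not implemented in this sketch")

    def verify(self, python=None) -> bool:
        # A real implementation would probe the package in a subprocess.
        return False
```

Check the existing modules in `src/gpu_installer/packages/` for the real base class and `Outcome` definition before copying this shape.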

---

## How Detection Works

```
nvidia-smi / nvcc          →  CUDA version
rocminfo / rocm-smi        →  ROCm version
platform.system() Darwin   →  Apple MPS
(none found)               →  CPU-only
```

GPU model and VRAM are also parsed from `nvidia-smi` / `rocminfo` output for upgrade guidance.
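
The detection order above can be approximated in a few lines. This is a simplified sketch, not the tool's code: `detect_backend` is a hypothetical name, and it only checks whether the vendor tools are on `PATH`, whereas the real detection also parses versions from their output.

```python
import platform
import shutil


def detect_backend(which=shutil.which, system=platform.system):
    """Return 'cuda', 'rocm', 'mps', or 'cpu', following the order shown above.

    `which` and `system` are injectable so the logic can be exercised
    without the actual tools installed.
    """
    if which("nvidia-smi") or which("nvcc"):
        return "cuda"
    if which("rocminfo") or which("rocm-smi"):
        return "rocm"
    if system() == "Darwin":
        return "mps"
    return "cpu"
```

Calling `detect_backend()` with no arguments probes the current machine; passing stub callables makes the ordering easy to test.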

---

## Smart Reinstall Logic

The script avoids unnecessary reinstalls by checking the current state before acting:

| Detected State | Action |
|---|---|
| Not installed | Install |
| CPU-only build, GPU requested | Auto-purge + upgrade |
| CUDA broken / ABI mismatch | Auto-purge + reinstall |
| Already working (CUDA tensor test passes) | Skip |
| `--force-reinstall` passed | Always purge + reinstall |
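
Read top to bottom, the table amounts to a short decision function. The sketch below is one way to express it, with hypothetical names (`decide_action` and its boolean inputs) and string labels chosen only for illustration.

```python
def decide_action(installed, gpu_build, gpu_works,
                  gpu_requested=True, force_reinstall=False):
    """Map the detected state of a package to the action from the table."""
    if force_reinstall:
        return "purge+reinstall"  # --force-reinstall always wins
    if not installed:
        return "install"
    if gpu_requested and not gpu_build:
        return "purge+upgrade"  # CPU-only build, GPU requested
    if gpu_requested and not gpu_works:
        return "purge+reinstall"  # CUDA broken / ABI mismatch
    return "skip"  # already working
```

Checking `force_reinstall` first mirrors the last table row: it overrides every other state.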

The purge step removes `torch`, `torchvision`, `torchaudio`, `cupy`, `jax`, `jaxlib` (plus its `jax-cuda*` plugins), `llama-cpp-python`, and **all `nvidia-*` fat-wheel packages** scraped from `pip freeze` to ensure a sterile environment before reinstalling.
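
As an illustration of scraping purge targets from `pip freeze`, the sketch below filters freeze output down to the names listed above. It is an approximation, not the tool's code: `purge_targets` is a hypothetical name, and matching `cupy-cuda*` wheel names in addition to plain `cupy` is an assumption.

```python
import re

# Fixed names taken from the paragraph above.
FIXED_NAMES = {
    "torch", "torchvision", "torchaudio",
    "cupy", "jax", "jaxlib", "llama-cpp-python",
}
# Prefix matches; 'cupy-cuda' is an assumption for CuPy's per-CUDA wheel names.
PREFIXES = ("nvidia-", "jax-cuda", "cupy-cuda")


def purge_targets(freeze_output):
    """Return distribution names from `pip freeze` output that would be purged."""
    targets = []
    for line in freeze_output.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "-e")):
            continue  # skip blanks, comments, and editable installs
        # Lines look like 'name==1.2.3' or 'name @ file:///...'
        name = re.split(r"==|\s*@\s*", line, maxsplit=1)[0]
        name = name.lower().replace("_", "-")
        if name in FIXED_NAMES or name.startswith(PREFIXES):
            targets.append(name)
    return targets
```

The returned list could then be passed to a single `pip uninstall -y` (or `uv pip uninstall`) call.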

---

## Post-Install Verification

After installation, the script runs isolated subprocess checks for each package:

- **PyTorch**: imports `torch`, checks `cuda.is_available()`, runs a `.cuda()` tensor allocation test. Retries up to 4 times with backoff (3s / 5s / 8s) to handle dynamic linker sync lag after `uv` installs.
- **CuPy**: imports `cupy`, calls `cupy.cuda.runtime.getDeviceCount()`.
- **JAX**: imports `jax`, inspects `jax.devices()`, and — when a `gpu` backend is present — runs a small device computation (`block_until_ready()`) to confirm acceleration (a CPU build passes too). Retries with backoff like PyTorch/CuPy to ride out linker sync lag.
- **Llama-CPP**: calls `llama_supports_gpu_offload()` to confirm hardware acceleration.
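
The isolated-check-with-retry pattern can be sketched as follows. This is a minimal illustration, not the project's verifier: `verify_isolated` and `TORCH_CHECK` are hypothetical names, and reading "up to 4 times with backoff (3s / 5s / 8s)" as four attempts separated by those three delays is an assumption.

```python
import subprocess
import sys
import time


def verify_isolated(snippet, python=sys.executable, delays=(3, 5, 8), timeout=120):
    """Run `snippet` in a fresh interpreter; retry after each delay until it exits 0.

    A new process starts with an empty sys.modules, so a stale import cached
    in the parent process cannot mask a broken or freshly replaced install.
    """
    attempts = len(delays) + 1
    for attempt in range(attempts):
        try:
            result = subprocess.run(
                [python, "-c", snippet],
                capture_output=True, text=True, timeout=timeout,
            )
            if result.returncode == 0:
                return True
        except subprocess.TimeoutExpired:
            pass  # treat a hang like a failed attempt
        if attempt < len(delays):
            time.sleep(delays[attempt])
    return False


# Example probe mirroring the PyTorch bullet above.
TORCH_CHECK = (
    "import torch\n"
    "assert torch.cuda.is_available()\n"
    "torch.zeros(1).cuda()\n"
)
```

`verify_isolated(TORCH_CHECK)` would then return `True` only if the tensor allocation succeeds in a clean interpreter within the allowed attempts.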

---

## Examples

```bash
# Check what GPU and CUDA are detected
gpu-installer --gpu-info

# Diagnose the environment without reinstalling anything
gpu-installer --doctor

# Install only PyTorch and Llama-CPP, forcing CUDA 12.1
gpu-installer torch llamacpp --force-cuda 121

# Preview a full install without executing (on an AMD system this prints the ROCm commands)
gpu-installer --dry-run

# Force a clean reinstall of everything with logging
gpu-installer --force-reinstall --log

# Uninstall just PyTorch (torch, torchvision, torchaudio)
gpu-installer --uninstall torch

# Preview removal of the whole ecosystem without executing
gpu-installer --uninstall --dry-run

# Auto-install the system GPU stack (preview first!)
gpu-installer --auto-install-system --dry-run
gpu-installer --auto-install-system --cuda-version 12.4

# Then install the Python packages
gpu-installer
```

---

## Troubleshooting

**"CUDA available but tensor test failed"**
This is usually a dynamic linker sync issue immediately after a `uv` install. The verifier retries automatically. If it persists, run `gpu-installer --doctor` a few seconds later — it will re-verify without reinstalling.

**"CPU-only version detected"**
Your installed `torch` was built without CUDA. A normal `gpu-installer torch` run detects this and auto-purges and upgrades it; add `--force-reinstall` to force a clean reinstall if the upgrade does not take.

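**CuPy was skipped**
CuPy requires an NVIDIA or AMD GPU, so it is intentionally skipped on macOS/MPS and CPU-only environments. If it is skipped on a machine that does have a CUDA GPU, run `gpu-installer --gpu-info` to check what was detected.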

**`uv` not found**
The script falls back to `pip` automatically. Install `uv` for significantly faster installs: `pip install uv`.
