Metadata-Version: 2.4
Name: llmboost_hub
Version: 0.4.3rc2
Summary: A lightweight CLI tool for managing LLMBoost™ model images and environments.
Author-email: Harish Kambhampaty <harish.kambhampaty@mangoboost.io>
Project-URL: Homepage, https://llmboost.mangoboost.io/
Project-URL: Documentation, https://llmboost.mangoboost.io/docs/
Keywords: LLM,Docker,CLI,HPC,AI,LLMBoost
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.0
Requires-Dist: requests>=2.31
Requires-Dist: openai>=0.27
Requires-Dist: tabulate>=0.9
Requires-Dist: rich>=13.7
Requires-Dist: docker>=7.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: pandas>=2.0
Requires-Dist: licensing
Requires-Dist: bcrypt>=4.0
Requires-Dist: nvidia-ml-py>=12.570.86
Requires-Dist: amdsmi>=6.4.2
Requires-Dist: huggingface_hub>=0.23
Requires-Dist: black>=23.1.0
Requires-Dist: cryptography>=41.0
Provides-Extra: dev
Requires-Dist: build>=1.0.0; extra == "dev"
Requires-Dist: wheel>=0.42; extra == "dev"
Requires-Dist: setuptools>=68.0; extra == "dev"
Requires-Dist: twine>=4.0; extra == "dev"
Requires-Dist: nuitka>=2.1; extra == "dev"
Requires-Dist: codeenigma>=1.2; extra == "dev"
Requires-Dist: pytest>=7.4; extra == "dev"
Requires-Dist: pytest-cov>=4.1; extra == "dev"
Requires-Dist: pytest-dependency; extra == "dev"
Requires-Dist: kubernetes==32.0.1; extra == "dev"
Requires-Dist: sh; extra == "dev"
Dynamic: license-file

# [LLMBoost Hub (lbh)](https://llmboost.mangoboost.io/docs/)

Manage LLMBoost™ model containers and environments to run, serve, and tune large language models.

Note: This is proprietary software and requires a valid LLMBoost™ license to use. Request a license at [support@mangoboost.io](mailto:support@mangoboost.io).

---

## Prerequisites

### Dependencies:
- Python 3.11+
- Docker 27.3.1+
- NVIDIA GPU: [nvidia-docker2](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) or AMD GPU: [ROCm 6.3+](https://rocm.docs.amd.com/en/latest/Installation_Guide/Installation-Guide.html)

### Install LLMBoost Hub:

```bash
pip install llmboost_hub

# Verify installation
lbh --version
```

Upgrade:
```bash
pip install --upgrade llmboost_hub
```

Note: This document uses `lbh` interchangeably with `llmboost_hub`.

### Login to Hugging Face and Docker:
```bash
huggingface-cli login     # or set HF_TOKEN env var
docker login -u <your_docker_username>
```

---

## Quick start

Fetch the list of supported models from the remote (automatically authenticates your LLMBoost license):
```bash
lbh fetch # only needed once a day
```

One-liner to start serving a model (automatically downloads the image and model assets, if needed):
```bash
lbh serve <Repo/Model-Name> # full model name (including the repository/organization) must match the name on https://huggingface.co
```

For example:
```bash
lbh serve meta-llama/Llama-3.2-1B-Instruct
```

#### Basic workflow:
```bash
lbh fetch # authenticate LLMBoost license
lbh list # list models you've prepared (from model_paths.yaml)
lbh list --discover /path # discover models in a directory and add to model_paths.yaml
lbh prep <Repo/Model-Name> # download image and model assets
lbh run <Repo/Model-Name> # start container
lbh serve <Repo/Model-Name> # start LLMBoost server inside container
lbh test <Repo/Model-Name> # send test request
lbh stop <Repo/Model-Name> # stop container
```

For more details, see the [Command Reference](#command-reference) and [Configuration Options](#configuration-options) sections below.

#### Shell completions (auto-complete commands and model names):
```bash
eval "$(lbh completions)"                 # current shell
lbh completions [--venv|--profile]        # persist for venv or profile
```
**Note:** Model name completion shows all supported models from `lbh list` (includes wildcard-expanded models).

## Configuration Options

*llmboost_hub* uses the following environment variables:

- `LBH_HOME`: base directory for all *llmboost_hub* data. (default: `~/.llmboost_hub` on the host, mounted as `/llmboost_hub` in the container)
- `LBH_MODELS`: directory for storing and retrieving model assets. (default: `$LBH_HOME/models`)
- `LBH_MODEL_PATHS`: YAML file mapping model names to their paths. Automatically updated by `prep` and `run --model_path`. (default: `$LBH_HOME/model_paths.yaml`)
- `LBH_LOCAL_DB`: persistent local database for tuning results. Survives container restarts, system reboots, and LLMBoost version upgrades. (default: `$LBH_HOME/local_inference.db`)
- `LBH_WORKSPACE`: mounted user workspace for manually transferring files out of containers. (default: `$LBH_HOME/workspace` on the host, mounted as `/user_workspace` in the container)

Notes:
- A configuration file is stored at `$LBH_HOME/config.yaml` with all of the settings above (and other advanced settings).
  - Precedence order for settings: Environment variables > Configuration file > Defaults
- `LBH_HOME` itself can only be changed via the environment variable (e.g., exported in `~/.bashrc`).
  - WARNING: Changing `LBH_HOME` will cause a new data directory to be used, and all configuration will be reset.
- `HF_TOKEN` is injected automatically when set.
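The precedence rule above can be sketched in a few lines of Python. The `resolve_setting` helper below is purely illustrative and not part of `lbh` itself:

```python
import os

# Illustrative sketch of the precedence rule:
# environment variable > configuration file > built-in default.
def resolve_setting(name, config_file_values, default):
    env_value = os.environ.get(name)       # highest priority: env var
    if env_value is not None:
        return env_value
    if name in config_file_values:         # then: value from config.yaml
        return config_file_values[name]
    return default                         # finally: the built-in default

# As if parsed from $LBH_HOME/config.yaml (path is an example):
config = {"LBH_MODELS": "/data/models"}
print(resolve_setting("LBH_MODELS", config,
                      os.path.expanduser("~/.llmboost_hub/models")))
```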

---

## Command Reference

Use `lbh -h` for a summary of all commands, and `lbh [COMMAND] -h` for help with a specific command and all available options.

Use `lbh -v [COMMAND]` for verbose output with any command; this shows diagnostic information useful for troubleshooting.

- `lbh login`
  - If the EULA has not yet been accepted, displays a full-screen, scrollable End User License Agreement; exits if it is rejected.
  - Reads `$LBH_LICENSE_PATH` if valid.
  - Else asks for license file path (or use `--license-file`).
  - Imports JSON-style `.skm` file, validates, saves to `$LBH_LICENSE_PATH` (simultaneously generating `legacy_license.skm` for backwards image compatibility).

- `lbh fetch`
  - Requires EULA acceptance (run `lbh login` first).
  - Fetches latest models supported by LLMBoost.
  - Filters the list to models supported by the available GPU(s).

- `lbh list [query] [--discover PATH]`
  - Lists models you've prepared (tracked in `model_paths.yaml`).
  - GPU matching follows the same rule as `lbh fetch`, but filters based on local model availability.
  - **Default mode**: Shows only models prepared via `lbh prep`.
  - **Discovery mode** (`--discover /path`): Scans directory for models and prompts to add them to `model_paths.yaml`.
  - Status meanings:
    - `pending`: model path doesn't exist or is empty
    - `stopped`: model exists but container not running
    - `running`: container running but idling
    - `initializing`: container running and starting LLMBoost server
    - `serving`: LLMBoost server ready to accept requests
    - `tuning`: autotuner running
  - Supports query filtering (case-insensitive, e.g., `lbh list llama`)
  - Works correctly even when `LBH_MODELS` changes (paths in `model_paths.yaml` are absolute)
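Conceptually, `model_paths.yaml` is a simple name-to-path mapping. The sketch below is illustrative only; the actual file is written and managed by `lbh prep` and `lbh run --model_path`, and the paths shown are examples:

```yaml
# Illustrative sketch -- paths are stored as absolute paths.
meta-llama/Llama-3.2-1B-Instruct: /home/user/.llmboost_hub/models/meta-llama/Llama-3.2-1B-Instruct
facebook/opt-125m: /data/models/facebook/opt-125m
```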

- `lbh prep <Repo/Model-Name> [--only-verify] [--fresh]`
  - Pulls the image and downloads HF assets.
  - Automatically saves model path to `LBH_MODEL_PATHS` after successful preparation.
  - `--only-verify` checks digests and sizes.
  - `--fresh` removes existing image and re-downloads model assets from Hugging Face.

- `lbh run <Repo/Model-Name> [OPTIONS] -- [DOCKER FLAGS...]`
  - Resolves and starts the container detached.
  - Mounts `$LBH_HOME` and `$LBH_WORKSPACE`. Injects HF_TOKEN.
  - For NVIDIA GPUs, passes `--gpus all`; for AMD, maps `/dev/dri` and `/dev/kfd`.
  - Path resolution: checks `LBH_MODEL_PATHS` first, then falls back to `$LBH_MODELS/<repo>/<model>`.
  - Useful options: 
    - `--image <image>`: override docker image.
    - `--model_path <model_path>`: override model assets path (saved to `LBH_MODEL_PATHS` for future use).
    - `--restart`: restarts the container if it is already running.
    - `--use-local-db`: merges the persistent local database (`$LBH_LOCAL_DB`) into the container to leverage historical tuning data.
    - Pass extra docker flags after `--`.

- `lbh serve <Repo/Model-Name> [--host 0.0.0.0] [--port 8011] [--detached] [--force] -- [LLMBOOST ARGS...]`
  - Requires EULA acceptance (run `lbh login` first).
  - Starts LLMBoost server inside the container.
  - Waits until ready, unless `--detached`.
  - `--force` skips GPU utilization checks (use this if GPU utilization is incorrectly reported by the NVIDIA or AMD drivers).
  - `--use-local-db`: merges the persistent local database (`$LBH_LOCAL_DB`) into the container to leverage historical tuning data.
  - Pass extra llmboost serve arguments after `--`.

- `lbh test <Repo/Model-Name> [--query "..."] [-t N] [--host 127.0.0.1] [--port 8011]`
  - Sends a test request to `/v1/chat/completions`.
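Because `lbh serve` exposes an OpenAI-compatible endpoint, you can also query it directly instead of using `lbh test`. A minimal sketch using only the standard library; the model name and prompt are examples, and `send_chat_request` assumes a server is already serving on the given host and port:

```python
import json
from urllib.request import Request, urlopen

def build_chat_request(host="127.0.0.1", port=8011,
                       model="meta-llama/Llama-3.2-1B-Instruct",
                       query="Say hello in one sentence."):
    # Standard OpenAI-style chat completions payload.
    url = f"http://{host}:{port}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": query}],
    }
    return url, payload

def send_chat_request(url, payload):
    # Performs the actual POST; requires a running `lbh serve` instance.
    req = Request(url, data=json.dumps(payload).encode(),
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        return json.load(resp)

url, payload = build_chat_request()
print(url)  # endpoint the request would be sent to
```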

- `lbh attach <Repo/Model-Name> [-c <container name or ID>]`
  - Opens a shell in the running container.

- `lbh stop <Repo/Model-Name> [-c <container name or ID>]`
  - Stops the container.

- `lbh status [model]`
  - Shows the current status for all models, or only the given model.

- `lbh tune <Repo/Model-Name> [--metrics throughput] [--detached] [--image <image>]`
  - Runs the autotuner.
  - Results are automatically saved to the persistent local database (`$LBH_LOCAL_DB`) and survive container restarts, system reboots, and LLMBoost version upgrades.
  - Use `lbh serve --use-local-db` to leverage tuning results from previous sessions.

### Cluster Commands (Multi-Node Deployments)

- `lbh cluster install [--kubeconfig PATH] [--docker-username USER] [--docker-pat TOKEN] [--docker-email EMAIL] [-- EXTRA_HELM_ARGS]`
  - Install LLMBoost Helm chart and Kubernetes infrastructure for multi-node deployments.
  - Displays access credentials for management and monitoring UIs after installation.
  - If `$LBH_CLUSTER_CONFIG_PATH` already exists, `install` automatically invokes `lbh cluster deploy` after Helm installation succeeds.
  - Requires a running Kubernetes cluster and Helm installed.
  - Docker authentication options:
    - `--docker-username`, `--docker-pat`, `--docker-email`: Provide credentials directly (all three required together)
    - Alternatively, run `docker login` and credentials will be read from `~/.docker/config.json`
    - If neither provided, cluster will be installed without Docker registry secret

- `lbh cluster deploy [-f CONFIG_FILE] [--kubeconfig PATH] [--get-schema VERSION]`
  - Deploy models across cluster nodes based on configuration file.
  - Use `--get-schema 1.1` to print a generated JSONC template for the recommended schema version.
  - Recommended workflow:
    - `lbh cluster deploy --get-schema 1.1 > $LBH_CLUSTER_CONFIG_PATH`
    - edit the generated config
    - `lbh cluster deploy`
  - Supports schema `1.1` (recommended) and schema `1.0` (backward compatibility).
  - Writes generated Kubernetes CRD manifests to `$LBH_KUBE_MODEL_DEPLOYMENTS_PATH` before applying them.

- `lbh cluster status [--kubeconfig PATH] [--show-secrets]`
  - Show status of all model deployments and management services.
  - Displays summary statistics: `Models: <ready>/<total>` and `Mgmt.: <ready>/<total>`.
  - Shows model deployment table with pod status, restarts, and error messages.
  - Service URLs for management UI and monitoring (Grafana).
  - Use `--show-secrets` to display access credentials (masked).
  - Use `lbh -v cluster status --show-secrets` for full unmasked credentials.

- `lbh cluster logs [--models|--management] [--pod POD_NAME] [--tail TAIL_ARGS...] [--grep GREP_ARGS...] [--kubeconfig PATH]`
  - View logs from model deployment or management pods.
  - `--models`: Show logs from model deployment pods.
  - `--management`: Show logs from management/monitoring pods (displays as table).
  - `--pod POD_NAME`: Filter to specific pod by name.
  - `--tail TAIL_ARGS`: Show last N lines from workspace logs (default: 10).
  - `--grep GREP_ARGS`: Filter logs by pattern (uses awk for pattern matching).
  - Defaults to showing both model and management logs if no filter is specified.

- `lbh cluster remove <MODEL_NAME> [--all] [--kubeconfig PATH] [--force]`
  - Remove specific model deployments from the cluster.
  - Deletes LLMBoostDeployment custom resources by name.
  - `--all`: Remove all model deployments (requires confirmation unless used with `--force`).
  - Example: `lbh cluster remove facebook/opt-125m` or `lbh cluster remove --all`

- `lbh cluster uninstall [--kubeconfig PATH] [--force]`
  - Uninstall LLMBoost cluster resources.
  - Prompts for confirmation unless `--force` is used.
  - Does not automatically delete the namespace.

---

## Support

- Docs: https://llmboost.mangoboost.io/docs/
- Website: https://llmboost.mangoboost.io/
- Email: support@mangoboost.io
