Metadata-Version: 2.4
Name: llmboost_hub
Version: 0.3.0
Summary: A lightweight CLI tool for managing LLMBoost model images and environments.
Author-email: Harish Kambhampaty <harish.kambhampaty@mangoboost.io>
Project-URL: Homepage, https://llmboost.mangoboost.io/
Project-URL: Documentation, https://llmboost.mangoboost.io/
Keywords: LLM,Docker,CLI,HPC,AI,LLMBoost
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.0
Requires-Dist: requests>=2.31
Requires-Dist: openai>=0.27
Requires-Dist: tabulate>=0.9
Requires-Dist: rich>=13.7
Requires-Dist: docker>=7.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: pandas>=2.0
Requires-Dist: licensing
Requires-Dist: bcrypt>=4.0
Requires-Dist: huggingface_hub>=0.23
Requires-Dist: black>=23.1.0
Requires-Dist: cryptography>=41.0
Provides-Extra: dev
Requires-Dist: build>=1.0.0; extra == "dev"
Requires-Dist: wheel>=0.42; extra == "dev"
Requires-Dist: setuptools>=68.0; extra == "dev"
Requires-Dist: twine>=4.0; extra == "dev"
Requires-Dist: nuitka>=2.1; extra == "dev"
Requires-Dist: codeenigma>=1.2; extra == "dev"
Requires-Dist: pytest>=7.4; extra == "dev"
Requires-Dist: pytest-cov>=4.1; extra == "dev"
Requires-Dist: pytest-dependency; extra == "dev"
Requires-Dist: kubernetes==32.0.1; extra == "dev"
Requires-Dist: sh; extra == "dev"
Dynamic: license-file

# [LLMBoost Hub (lbh)](https://docs.mangoboost.io/llmboost_hub/)

Manage LLMBoost™ model containers and environments to run, serve, and tune large language models.

---

## Pre-requisites

### Dependencies:
- Python 3.10+
- Docker 27.3.1+
- NVIDIA GPU: [nvidia-docker2](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) or AMD GPU: [ROCm 6.3+](https://rocm.docs.amd.com/en/latest/Installation_Guide/Installation-Guide.html)

### Install LLMBoost Hub:

```bash
pip install llmboost_hub

# Verify installation
lbh --version
```

Upgrade:
```bash
pip install --upgrade llmboost_hub
```

Note: This document uses `lbh` interchangeably with `llmboost_hub`.

### Login to Hugging Face and Docker:
```bash
huggingface-cli login     # or set HF_TOKEN env var
docker login -u <your_docker_username>
```

---

## Quick start

Fetch list of supported models from remote (automatically authenticates LLMBoost license):
```bash
lbh fetch # only needed once a day
```

One-liner to start serving a model (automatically downloads image and model, if needed):
```bash
lbh serve <Repo/Model-Name> # Full model name (including repository or organization name) must match the name from https://huggingface.co
```

For example:
```bash
lbh serve meta-llama/Llama-3.1-1B-Instruct
```

#### Basic workflow:
```bash
lbh fetch # authenticate LLMBoost license
lbh list [model] # list supported models; case-insensitive, regex-style match
lbh prep <Repo/Model-Name> # download image and model assets
lbh run <Repo/Model-Name> # start container
lbh serve <Repo/Model-Name> # start LLMBoost server inside container
lbh test <Repo/Model-Name> # send test request
lbh stop <Repo/Model-Name> # stop container
```

For more details, see the [Command Reference](#command-reference) section below.
For more details, see the [Configuration Options](#configuration-options) section below.

#### Shell completions:
```bash
eval "$(lbh completions)"                 # current shell
lbh completions [--venv|--profile]        # persist for venv or profile
```

## Configuration Options

*llmboost_hub* uses the following environment variables:

- `LBH_HOME`: base directory for all *llmboost_hub* data. (defaults: (host) `~/.llmboost_hub` <- (container) `/llmboost_hub`)
- `LBH_MODELS`: directory for storing and retrieving model assets. (default: `$LBH_HOME/models`)
- `LBH_MODEL_PATHS`: YAML file mapping model names to their paths. Automatically updated by `prep` and `run --model_path`. (default: `$LBH_HOME/model_paths.yaml`)
- `LBH_WORKSPACE`: mounted user workspace for manually transferring files out of containers. (defaults: (host) `$LBH_HOME/workspace` <- (container) `/user_workspace`)

Notes:
- A configuation file is stored at `$LBH_HOME/config.yaml` with all the above mentioned settings (and other advanced settings). 
  - Precedence order for settings: Environment variables > Configuration file > Defaults
- `LBH_HOME` can only be changed by setting the env var (or in `~/.bashrc`). 
  - WARNING: Changing `LBH_HOME` will cause a new data directory to be used, and all configuration will be reset.
- `HF_TOKEN` is injected automatically when set.

---

## Command Reference

Use `lbh -h` for a summary of all commands, and `lbh [COMMAND] -h` for help with a specific command and all available options.

Use `lbh -v [COMMAND]` for verbose output with any command; shows useful diagnostic info for troubleshooting.

- `lbh login`
  - Reads `$LBH_LICENSE_PATH` or prompts for a token.
  - Validates online and saves the license file.

- `lbh fetch`
  - Fetches latest models supported by LLMBoost.
  - Filters to available GPU.

- `lbh list [model]`
  - Lists local images joined with lookup.
  - Shows status: 
    - pending: model not prepared; docker image or model assets missing
    - stopped: model prepared but container not running
    - running: container running but idling
    - initializing: container running and starting LLMBoost server
    - serving: LLMBoost server ready to accept requests
    - tuning: autotuner running

- `lbh prep <Repo/Model-Name> [--only-verify] [--fresh]`
  - Pulls the image and downloads HF assets.
  - Automatically saves model path to `LBH_MODEL_PATHS` after successful preparation.
  - `--only-verify` checks digests and sizes.
  - `--fresh` removes existing image and re-downloads model assets from Hugging Face.

- `lbh run <Repo/Model-Name> [OPTIONS] -- [DOCKER FLAGS...]`
  - Resolves and starts the container detached.
  - Mounts `$LBH_HOME` and `$LBH_WORKSPACE`. Injects HF_TOKEN.
  - NVIDIA GPUs use `--gpus all`. AMD maps `/dev/dri` and `/dev/kfd`.
  - Path resolution: checks `LBH_MODEL_PATHS` first, then falls back to `$LBH_MODELS/<repo>/<model>`.
  - Useful options: 
    - `--image <image>`: override docker image.
    - `--model_path <model_path>`: override model assets path (saved to `LBH_MODEL_PATHS` for future use).
    - `--restart`: restarts container, if already running.
    - Pass extra docker flags after `--`.

- `lbh serve <Repo/Model-Name> [--host 0.0.0.0] [--port 8011] [--detached] [--force] -- [LLMBOOST ARGS...]`
  - Starts LLMBoost server inside the container.
  - Waits until ready, unless `--detached`.
  - `--force` skips GPU utilization checks (use if GPU utilization is incorrectly reported by NVidia or AMD GPU drivers).
  - Pass extra llmboost serve arguments after `--`.

- `lbh test <Repo/Model-Name> [--query "..."] [-t N] [--host 127.0.0.1] [--port 8011]`
  - Sends a test request to `/v1/chat/completions`.

- `lbh attach <Repo/Model-Name> [-c <container name or ID>]`
  - Opens a shell in the running container.

- `lbh stop <Repo/Model-Name> [-c <container name or ID>]`
  - Stops the container.

- `lbh status [model]`
  - Shows status and model.

- `lbh tune <Repo/Model-Name> [--metrics throughput] [--detached] [--image <image>]`
  - Runs the autotuner. 
  - Store results to `$LBH_HOME/inference.db`, and loads this on next `lbh serve`.

### Cluster Commands (Multi-Node Deployments)

- `lbh cluster install [--kubeconfig PATH] [-- EXTRA_HELM_ARGS]`
  - Install LLMBoost Helm chart and Kubernetes infrastructure for multi-node deployments.
  - Displays access credentials for management and monitoring UIs after installation.
  - Requires running Kubernetes cluster and helm installed.
  - Note: Ensure Docker authentication is configured (`docker login`) before deploying models.

- `lbh cluster deploy [-f CONFIG_FILE] [--kubeconfig PATH]`
  - Deploy models across cluster nodes based on configuration file.
  - Generates and applies Kubernetes CRD manifests.
  - Config template: `$LBH_HOME/utils/template_cluster_config.jsonc`

- `lbh cluster status [--kubeconfig PATH] [--show-secrets]`
  - Show status of all model deployments and management services.
  - Displays summary statistics: Models: <ready>/<total> and Mgmt.: <ready>/<total>
  - Shows model deployment table with pod status, restarts, and error messages.
  - Service URLs for management UI and monitoring (Grafana).
  - Use `--show-secrets` to display access credentials (masked).
  - Use `-v --show-secrets` for full unmasked credentials.

- `lbh cluster logs [--models|--management] [--pod POD_NAME] [--tail TAIL_ARGS...] [--grep GREP_ARGS...] [--kubeconfig PATH]`
  - View logs from model deployment or management pods.
  - `--models`: Show logs from model deployment pods.
  - `--management`: Show logs from management/monitoring pods (displays as table).
  - `--pod POD_NAME`: Filter to specific pod by name.
  - `--tail TAIL_ARGS`: Show last N lines from workspace logs (default: 10).
  - `--grep GREP_ARGS`: Filter logs by pattern (uses awk for pattern matching).
  - Defaults to showing both model and management logs if no filter specified.

- `lbh cluster remove <MODEL_NAME> [--all] [--kubeconfig PATH]`
  - Remove specific model deployments from the cluster.
  - Deletes LLMBoostDeployment custom resources by name.
  - `--all`: Remove all model deployments (requires confirmation unless used with --force).
  - Example: `lbh cluster remove facebook/opt-125m` or `lbh cluster remove --all`

- `lbh cluster uninstall [--kubeconfig PATH] [--force]`
  - Uninstall LLMBoost cluster resources.
  - Prompts for confirmation unless `--force` is used.
  - Does not automatically delete the namespace.

---

## Support

- Docs: https://docs.mangoboost.io/llmboost_hub/
- Website: https://docs.mangoboost.io/llmboost_hub/
- Email: support@mangoboost.io
