Metadata-Version: 2.4
Name: afrilink-sdk
Version: 0.8.16
Summary: AfriLink SDK — One-line access to GPUs, models and datasets from your notebook
Home-page: https://github.com/dataspires/afrilink-sdk
Author: DataSpires
Author-email: DataSpires <info@dataspires.com>
License-Expression: MIT
Project-URL: Homepage, https://dataspires.com
Project-URL: Documentation, https://www.dataspires.com/docs
Project-URL: Repository, https://github.com/DataSpires/afrilink-sdk
Project-URL: Bug Tracker, https://github.com/DataSpires/afrilink-sdk/issues
Keywords: hpc,high-performance-computing,finetuning,llm,lora,notebook,gpu,slurm,afrilink
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: System :: Distributed Computing
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: full
Requires-Dist: requests>=2.28.0; extra == "full"
Requires-Dist: psutil>=5.9.0; extra == "full"
Provides-Extra: build
Requires-Dist: requests>=2.28.0; extra == "build"
Requires-Dist: cryptography>=41.0.0; extra == "build"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# AfriLink SDK

**Version:** 0.8.16

**Last Updated:** May 29, 2026

**Train & Finetune on a Dedicated OpenToken A100 from Any Notebook**

AfriLink SDK gives you one-line access to a dedicated **NVIDIA A100 80 GB** hosted by [OpenToken](https://opentoken.global) for **training and finetuning** across text, vision and multimodal models. It works on **Google Colab, Kaggle, Jupyter, VS Code**, and any other Python environment.

| Capability | API | What It Does |
|------------|-----|-------------|
| **Curated finetune** | `client.finetune()` | LoRA/QLoRA LLM fine-tuning in our pre-built `afrilink-finetune` container |
| **Curated training** | `client.train()` | Run any training script in our pre-built `afrilink-yolo` container (Ultralytics, vision) |
| **Build-your-own container** | `client.build_image()` | Define base image + pip / apt deps + model source, build on Cloud Build, push to private Artifact Registry |
| **Build-and-train** | `client.build_and_train()` | One-shot: builds (or hits the cache), runs on the A100, downloads artefacts, cleans up |
| **Image cache lookup** | `client.find_existing_image()` | Check if a matching image already exists before triggering a fresh ~5 min build |

```bash
pip install 'afrilink-sdk[build]'
```

---

## Quick Start — Finetune an LLM

```python
from afrilink import AfriLinkClient

client = AfriLinkClient()
client.authenticate()   # reads AFRILINK_API_KEY from notebook secrets / env

import pandas as pd
data = pd.DataFrame({"text": [
    "Below is an instruction...\n\n### Response:\nHere is the answer..."
]})

job = client.finetune(
    model="qwen2.5-0.5b",
    training_mode="low",
    data=data,
    gpus=1,
    time_limit="01:00:00",
)
result = job.run(wait=True)

if result["status"] == "completed":
    client.download_model(result["job_id"], "./my-model")
```

## Quick Start — Train a Vision Model

```python
from afrilink import AfriLinkClient

client = AfriLinkClient()
client.authenticate()

# Submit a YOLOv8 training job to the A100
job = client.train(
    script="train_yolo.py",        # your training script
    container="afrilink-yolo",      # pre-built container with YOLOv8 + PyTorch
    data="./dataset.tar.gz",        # dataset (uploaded automatically)
    data_config="dataset.yaml",     # YOLO dataset.yaml
    gpus=1,
    time_limit="02:00:00",
)
result = job.run(wait=True)
print(job.get_logs(tail=50))
client.download_model(result["job_id"], "./yolo-out")
```

## Quick Start — Custom Container

```python
from afrilink import AfriLinkClient

client = AfriLinkClient()
client.authenticate()

# Define exactly the environment your training needs.
spec = dict(
    base_image="pytorch",                                # preset
    pip_packages=["transformers>=4.45", "accelerate>=0.34", "peft>=0.13"],
    apt_packages=["git"],
    model_source={
        "kind": "huggingface",
        "id": "Qwen/Qwen2.5-0.5B-Instruct",
    },
)

# Builds the image on Cloud Build (~5 min first time, instant on cache hit
# for the same spec on subsequent runs), runs on the A100, deletes the
# local image layer afterwards, returns the artefact directory.
result = client.build_and_train(
    script="my_train.py",
    gpus=1,
    time_limit_hours=0.5,
    reuse_existing_image=True,   # default — short-circuits identical specs
    **spec,
)
client.download_model(result["run"]["job_id"], "./output")
```

---

## Installation

```bash
pip install 'afrilink-sdk[build]'
```

The `[build]` extra installs `cryptography` and `requests`, which the custom-container path needs to sign GCP service-account JWTs. Without `[build]`, only the curated `client.train()` and `client.finetune()` paths work.

The core package has **zero required dependencies** — heavy libraries (`torch`, `transformers`, `peft`, etc.) are only loaded when you actually call into code that needs them, and are pre-installed in most notebook environments.

---

## Authentication

As of v0.8.x the SDK uses **stateless API-key auth** — no email/password prompts, no 12-hour certificate refreshes, no SSH key management on your side.

### Get an API key

1. Sign up at [dataspires.com](https://dataspires.com).
2. Go to **Profile → AfriLink SDK keys**, click **Create new key**, copy the `afk_live_…` value (shown once).
3. Add it to your notebook environment as `AFRILINK_API_KEY`.

### Set the key

| Where you run | How to set the key |
|---|---|
| **Google Colab** | 🔑 sidebar → **Add secret** → name `AFRILINK_API_KEY`, paste, **enable for notebook** |
| **Kaggle** | Add-ons → **Secrets** → name `AFRILINK_API_KEY`, paste, **attach to notebook** |
| **Local Jupyter / VS Code** | `os.environ["AFRILINK_API_KEY"] = "afk_live_…"` before `client.authenticate()` |
| **Anywhere** | Pass directly: `client.authenticate(api_key="afk_live_…")` |

```python
from afrilink import AfriLinkClient

client = AfriLinkClient()
client.authenticate()   # resolves the key from an explicit argument, env var, or notebook secret
```
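
If you want to fail fast before calling `client.authenticate()`, a small pre-flight check can help. The sketch below is not part of the SDK: `resolve_api_key` is a hypothetical helper, and the precedence it uses (explicit argument first, then the `AFRILINK_API_KEY` environment variable) is an assumption made for the example. It does not read Colab or Kaggle secrets.

```python
import os


def resolve_api_key(api_key=None, environ=None):
    """Return an AfriLink API key from an argument or the environment.

    Hypothetical helper for illustration only; the SDK performs its own
    resolution inside client.authenticate().
    """
    environ = os.environ if environ is None else environ
    # Assumed precedence: explicit argument wins over the environment.
    candidate = api_key or environ.get("AFRILINK_API_KEY")
    if not candidate:
        raise RuntimeError(
            "No API key found: pass api_key= or set AFRILINK_API_KEY"
        )
    # Keys shown on the dashboard start with the afk_live_ prefix.
    if not candidate.startswith("afk_live_"):
        raise ValueError("Expected a key starting with 'afk_live_'")
    return candidate
```

You could then pass the result on explicitly, for example `client.authenticate(api_key=resolve_api_key())`.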

### What happens at auth time

| Phase | What runs |
|-------|-----------|
| **1. DataSpires session** | The SDK exchanges your API key at `api.dataspires.com` for a short-lived Supabase JWT used for billing writes (`sessions`, `deduct_credits` RPC) |
| **2. A100 reachability** | Silent SSH probe to the OpenToken A100 to confirm your slot is live and pull-ready |

Both phases together take ~1–2 seconds. The session keeps the JWT in memory for the kernel lifetime — no on-disk state. To rotate the key, revoke it on the dashboard and mint a new one.

---

## Built-in User Guide

The SDK ships with an inline reference manual you can query from any notebook cell using a slash-style syntax:

```python
import afrilink

afrilink/help          # top-level index of all topics
afrilink/quickstart    # step-by-step getting started guide
afrilink/auth          # authentication
afrilink/finetune      # finetune job parameters & training modes
afrilink/training      # general training jobs and containers
afrilink/specs         # A100 hardware spec sheet
afrilink/datasets      # dataset formats and upload
afrilink/billing       # rates, credits, invoices
```

Each page prints a formatted reference to your notebook output — no internet connection required.

---

## API Reference

### `AfriLinkClient`

Main entry point. Created once per notebook session.

| Method | Description |
|--------|-------------|
| `authenticate(api_key=None)` | Resolve API key (arg / env / Colab Secrets / Kaggle Secrets), exchange at `api.dataspires.com`, probe the A100 |
| `finetune(model, training_mode, data, gpus, ...)` | Create a `FinetuneJob` in the curated `afrilink-finetune` container |
| `train(script, container, data, gpus, ...)` | Create a `TrainJob` in a curated container (`afrilink-yolo`) |
| `find_existing_image(base_image, pip_packages, apt_packages, model_source, ...)` | Check the A100 + Artifact Registry for a matching cached image; returns `{"image", "source", "spec_hash"}` or `None` |
| `build_image(base_image, pip_packages, apt_packages, script, model_source, ...)` | Build a custom Docker image on Cloud Build, push to private Artifact Registry |
| `build_and_train(...)` | One-shot: cache-check → build (or skip) → run on the A100 → ephemeral cleanup |
| `delete_built_image(job_id_or_image)` | Remove a built image from the A100's local Docker cache (Artifact Registry copy persists) |
| `download_model(job_id, local_dir)` | Download the entire `output/` directory from the A100 |
| `upload_dataset(local_path, dataset_name)` | Upload a dataset to the A100's job-scoped staging area |
| `list_containers()` | List available curated training containers |
| `list_available_models(size=None)` | List models in the registry |
| `list_available_datasets()` | List datasets in the registry |
| `get_model_requirements(model, training_mode)` | GPU/memory recommendations |
| `cancel_job(job_id)` | Stop + remove a running container |
| `run_command(command)` | Run arbitrary shell command on the A100 |

### `client.finetune()`

```python
job = client.finetune(
    model="qwen2.5-0.5b",         # model ID from registry
    training_mode="low",           # "low" | "medium" | "high"
    data=my_dataframe,             # pandas DataFrame, HF Dataset, or file path
    gpus=1,                        # clamped to 1 with a console note (A100 backend has 1 GPU)
    time_limit="01:00:00",         # max wallclock (HH:MM:SS)
    output_dir=None,               # default: /workspace/job/output
)
```

**Training modes:**

| Mode | Strategy | Quantization |
|------|----------|--------------|
| `low` | QLoRA (rank 8) | 4-bit |
| `medium` | LoRA (rank 16) | 8-bit / none |
| `high` | Full LoRA (rank 64) | none |

The A100 backend has 1 GPU, so a request for distributed training (`gpus>1`) is clamped to 1 and a note is printed to the console. Multi-GPU is on the roadmap.
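
To make the table and the GPU clamp concrete, here is a minimal sketch that encodes the three training modes as data and applies the same clamp locally. `TRAINING_MODES`, `MAX_GPUS` and `plan_finetune` are hypothetical names invented for this example; the SDK's internal representation may differ.

```python
# Values copied from the training-modes table; the dict layout is an assumption.
TRAINING_MODES = {
    "low": {"strategy": "QLoRA", "lora_rank": 8, "quantization": "4-bit"},
    "medium": {"strategy": "LoRA", "lora_rank": 16, "quantization": "8-bit / none"},
    "high": {"strategy": "Full LoRA", "lora_rank": 64, "quantization": "none"},
}

MAX_GPUS = 1  # the A100 backend exposes a single GPU


def plan_finetune(training_mode, gpus=1):
    """Return the settings a finetune request would resolve to (illustrative)."""
    if training_mode not in TRAINING_MODES:
        raise ValueError(
            f"training_mode must be one of {sorted(TRAINING_MODES)}, got {training_mode!r}"
        )
    effective_gpus = min(max(gpus, 1), MAX_GPUS)
    if effective_gpus != gpus:
        # Mirrors the documented behaviour: clamp and tell the user.
        print(f"note: gpus={gpus} clamped to {effective_gpus}")
    return {**TRAINING_MODES[training_mode], "gpus": effective_gpus}
```

Calling `plan_finetune("low", gpus=4)` returns the QLoRA rank-8 settings with `gpus` set to 1.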

### `client.train()`

```python
job = client.train(
    script="train_yolo.py",        # local Python script to upload and run
    container="afrilink-yolo",      # pre-built container
    data="./dataset/",              # local path, archive, DataFrame, or remote path
    data_config="dataset.yaml",     # config file (e.g. YOLO dataset.yaml)
    gpus=1,
    time_limit="04:00:00",
    script_args=["--epochs", "100"],
    extra_files=["weights.pt"],
    container_env={"KEY": "val"},
)
```

**Curated containers (`container=` argument):**

| Name | Frameworks | Use case |
|------|-----------|----------|
| `afrilink-yolo` | Ultralytics, PyTorch, torchvision | Object detection, segmentation, pose estimation |
| `afrilink-finetune` | PyTorch, Transformers, PEFT, bitsandbytes | LLM fine-tuning (used internally by `client.finetune()`) |

Need a different stack? Use `client.build_image()` / `client.build_and_train()` (next section).

**Data handling:**

| Input type | What happens |
|------------|-------------|
| Local directory | Uploaded via SCP to `/mnt/data/sdk-jobs/<job_id>/input/`, mounted at `/workspace/job/input/` inside the container |
| `.tar.gz` / `.zip` archive | Uploaded and extracted on the A100 |
| Single file | Uploaded to job directory |
| `pandas.DataFrame` | Serialised to JSONL, uploaded |
| Path starting with `/` | Treated as a remote A100 path (no upload) |
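
The rows above can be read as a dispatch on the type and shape of the `data=` value. The sketch below shows one way to predict which row applies before submitting a job. `classify_data_input` is a hypothetical helper, and the order of the checks is an assumption; the SDK may apply them differently.

```python
import os

ARCHIVE_SUFFIXES = (".tar.gz", ".zip")


def classify_data_input(data):
    """Guess which data-handling row a `data=` value falls under (illustrative)."""
    # DataFrame-like objects are detected by duck typing so that pandas
    # does not have to be installed for this check.
    if hasattr(data, "to_json") and hasattr(data, "columns"):
        return "dataframe"
    path = os.fspath(data)
    # Paths starting with "/" are documented as remote A100 paths (no upload).
    if path.startswith("/"):
        return "remote_path"
    if path.endswith(ARCHIVE_SUFFIXES):
        return "archive"
    if path.endswith("/") or os.path.isdir(path):
        return "local_directory"
    return "single_file"
```

Note the consequence of the last table row: an absolute local path such as `/home/me/dataset` would be treated as remote, so use a relative path for local uploads.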

### `TrainJob` / `FinetuneJob`

Returned by `client.train()` / `client.finetune()`.

| Method / Property | Description |
|-------------------|-------------|
| `run(wait=True)` | Submit to the A100. `wait=True` polls until done. |
| `cancel()` | Stop + remove the running container |
| `get_logs(tail=100)` | Fetch recent log lines |
| `estimated_cost_usd()` | Estimate max cost based on GPUs and time limit |
| `status` | Current status string |
| `job_id` | AfriLink job ID (8-char UUID prefix) |
| `container_id` | Docker container ID on the A100 (set after `run()`) |
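
If you submit with `wait=False` and want to poll yourself, a generic loop like the one below is one option. This is a sketch, not SDK code: `wait_for_terminal` is a hypothetical helper, the terminal status names are taken from the values documented for `run()`, and it assumes that reading `job.status` returns the current state each time it is accessed.

```python
import time

# Status strings documented for run(); "submitted" is treated as non-terminal.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_terminal(get_status, poll_seconds=10.0, timeout_seconds=3600.0,
                      sleep=time.sleep):
    """Call get_status() until it returns a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds} s")
        sleep(poll_seconds)
```

With a real job this might be used as `wait_for_terminal(lambda: job.status)`. The `sleep` parameter exists only so the loop can be exercised without real delays.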

`run()` returns a dict:

```python
{
    "job_id": "a1b2c3d4",
    "container_id": "d9072f194771...",
    "status": "completed",        # or "submitted" / "failed" / "cancelled"
    "output_dir": "/mnt/data/sdk-jobs/a1b2c3d4/output",
    "billing": {
        "total_gpu_minutes": 5.0,
        "total_cost_usd": 0.1667,
        "rate_per_gpu_hour": 2.00,
        "billing_source": "wall-clock-docker",
    },
}
```
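
A short helper can turn that dict into a one-line summary for notebook output. The sketch assumes only the keys shown above and tolerates a missing `billing` block; `summarise_run` is a hypothetical name, not an SDK function.

```python
def summarise_run(result):
    """Return (succeeded, summary_text) for a run() result dict (illustrative)."""
    billing = result.get("billing") or {}
    succeeded = result.get("status") == "completed"
    parts = [f"job {result.get('job_id', '?')}: {result.get('status', 'unknown')}"]
    minutes = billing.get("total_gpu_minutes")
    if minutes is not None:
        parts.append(f"{minutes:g} GPU-min")
    cost = billing.get("total_cost_usd")
    if cost is not None:
        parts.append(f"${cost:.4f}")
    return succeeded, ", ".join(parts)


example = {
    "job_id": "a1b2c3d4",
    "status": "completed",
    "billing": {"total_gpu_minutes": 5.0, "total_cost_usd": 0.1667},
}
print(summarise_run(example))
# → (True, 'job a1b2c3d4: completed, 5 GPU-min, $0.1667')
```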

---

## Custom Containers — `client.build_image()` / `client.build_and_train()`

If the curated containers don't have the framework, version, or model you need, define it yourself. Cloud Build builds the image, Artifact Registry hosts it, and the A100 runs it ephemerally.

### Define the spec

```python
spec = dict(
    base_image="pytorch",                                # preset name, or full image:tag
    pip_packages=["transformers>=4.45", "accelerate>=0.34"],
    apt_packages=["git"],
    pip_index_url=None,                                  # optional alternative index
    pip_extra_index_urls=[],
    model_source={                                       # fetched at job runtime
        "kind": "huggingface",                           # huggingface | url | git | gs | s3
        "id": "Qwen/Qwen2.5-0.5B-Instruct",
        "revision": "main",
    },
    env={"WANDB_PROJECT": "demo"},                       # baked into image (non-secret)
)
```

**Presets for `base_image`:**

| Preset | Resolves to | Notes |
|--------|-------------|-------|
| `pytorch` | `pytorch/pytorch:2.5.0-cuda12.4-cudnn9-runtime` | GPU default |
| `pytorch-2.4` | `pytorch/pytorch:2.4.0-cuda12.4-cudnn9-runtime` | |
| `pytorch-cpu` | `pytorch/pytorch:2.5.0-cpu-runtime` | CPU-only build (smaller, no GPU at runtime) |
| `cuda-12.4` | `nvidia/cuda:12.4.0-runtime-ubuntu22.04` | bring-your-own-Python |
| `ultralytics` | `ultralytics/ultralytics:latest` | YOLOv8 ready |

You can also pass any full `image:tag` you want.

**Model sources (`model_source=`):**

| `kind` | Required fields | Example |
|---|---|---|
| `"huggingface"` | `id`, optional `revision`, `subfolder` | `{"kind":"huggingface","id":"meta-llama/Llama-3.2-1B","revision":"main"}` |
| `"url"` | `url` | `{"kind":"url","url":"https://example.com/weights.tar.gz"}` |
| `"git"` | `url`, optional `revision` | `{"kind":"git","url":"https://github.com/openai/whisper.git"}` |
| `"gs"` | `uri` | `{"kind":"gs","uri":"gs://bucket/checkpoints/"}` |
| `"s3"` | `uri` | `{"kind":"s3","uri":"s3://bucket/checkpoints/"}` |
| (omitted) | — | Your script handles model loading itself |
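
Because a typo in `model_source` would otherwise only surface after a build, it can be worth checking the dict locally first. The validator below is a sketch derived from the table above; `validate_model_source` and its error messages are hypothetical, and the SDK may accept or reject inputs differently.

```python
# Required and optional fields per kind, as listed in the model-sources table.
REQUIRED_FIELDS = {
    "huggingface": ("id",),
    "url": ("url",),
    "git": ("url",),
    "gs": ("uri",),
    "s3": ("uri",),
}
OPTIONAL_FIELDS = {
    "huggingface": ("revision", "subfolder"),
    "git": ("revision",),
}


def validate_model_source(source):
    """Return a list of problems with a model_source dict (empty list means OK)."""
    if source is None:
        return []  # omitted: the training script loads the model itself
    kind = source.get("kind")
    if kind not in REQUIRED_FIELDS:
        return [f"unknown kind {kind!r}; expected one of {sorted(REQUIRED_FIELDS)}"]
    problems = []
    for field in REQUIRED_FIELDS[kind]:
        if not source.get(field):
            problems.append(f"{kind!r} source is missing required field {field!r}")
    allowed = {"kind", *REQUIRED_FIELDS[kind], *OPTIONAL_FIELDS.get(kind, ())}
    for field in sorted(set(source) - allowed):
        problems.append(f"unexpected field {field!r} for kind {kind!r}")
    return problems
```

For example, `validate_model_source({"kind": "gs"})` reports the missing `uri` field.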

The model is **fetched at container runtime**, not baked at build time. This keeps user images thin (~2 GB instead of 7+ GB) and lets you iterate on dependencies without re-shipping weights. The downloaded model lands at `/workspace/models/<sanitised_id>/`, and the path is exposed to your script through the `MODELS_DIR` environment variable.

**For gated HF models** (Llama, Gemma, etc.): add `HUGGINGFACE_TOKEN` as a notebook secret and the SDK forwards it to the container automatically.
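
Inside your training script you then need to locate the downloaded weights. The sketch below avoids guessing how the ID is sanitised by listing the directory instead. It assumes that `MODELS_DIR` points at the parent directory (`/workspace/models`) and that exactly one model was fetched; `find_model_dir` is a hypothetical helper, so adjust it if your container sets `MODELS_DIR` to the model directory itself.

```python
import os
from pathlib import Path


def find_model_dir(models_dir=None):
    """Return the single model directory under MODELS_DIR (illustrative).

    Assumes MODELS_DIR is the parent folder that contains one
    sub-directory per fetched model.
    """
    root = Path(models_dir or os.environ.get("MODELS_DIR", "/workspace/models"))
    candidates = sorted(p for p in root.iterdir() if p.is_dir()) if root.is_dir() else []
    if len(candidates) != 1:
        raise FileNotFoundError(
            f"expected exactly one model directory under {root}, found {len(candidates)}"
        )
    return candidates[0]
```

The returned path can be passed to whatever loader your script uses, for example `AutoModelForCausalLM.from_pretrained(find_model_dir())` when the source is a Hugging Face model.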

### Check the cache before building

```python
hit = client.find_existing_image(
    base_image="pytorch",
    pip_packages=["transformers>=4.45", "accelerate>=0.34"],
    apt_packages=["git"],
    model_source={"kind": "huggingface", "id": "Qwen/Qwen2.5-0.5B-Instruct"},
)
# hit == None  → no match, will build
# hit == {"image": "...", "source": "a100" | "artifact_registry", "spec_hash": "..."}
```

The hash includes only the inputs that change what gets *baked* into the image:

| Included | Excluded |
|----------|----------|
| `base_image` (after preset resolution) | `script` / `script_content` (uploaded but not baked) |
| `pip_packages` (sorted, exact strings) | `env` (runtime injection, not bake-time) |
| `apt_packages` (sorted, exact strings) | `extra_files` |
| `pip_index_url` / `pip_extra_index_urls` | `user_id` / `job_id` |
| `model_source` (kind + id/url/uri + revision + subfolder) | Cloud Build `machine_type` / `build_timeout` |

Two specs that would bake the same image hash to the same value, which gives an instant cache hit. A version bump on any pip package, an extra apt dependency, or a different model revision produces a fresh hash and therefore a fresh build.
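
The idea behind such a cache key can be sketched in a few lines: normalise the included inputs, serialise them deterministically, and hash the result. The function below is an illustration under assumptions, not the SDK's algorithm. The choice of SHA-256 over canonical JSON and the name `spec_hash` are made up for the example, so its output will not match the `spec_hash` returned by `find_existing_image()`.

```python
import hashlib
import json


def spec_hash(base_image, pip_packages=(), apt_packages=(), pip_index_url=None,
              pip_extra_index_urls=(), model_source=None):
    """Hash only the inputs that affect the baked image (illustrative)."""
    canonical = {
        "base_image": base_image,  # assumed to be already preset-resolved
        "pip_packages": sorted(pip_packages),  # order must not matter
        "apt_packages": sorted(apt_packages),
        "pip_index_url": pip_index_url,
        "pip_extra_index_urls": list(pip_extra_index_urls),
        "model_source": model_source,
    }
    # sort_keys gives a stable serialisation regardless of dict ordering.
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


a = spec_hash("pytorch/pytorch:2.5.0-cuda12.4-cudnn9-runtime",
              pip_packages=["transformers>=4.45", "accelerate>=0.34"])
b = spec_hash("pytorch/pytorch:2.5.0-cuda12.4-cudnn9-runtime",
              pip_packages=["accelerate>=0.34", "transformers>=4.45"])
c = spec_hash("pytorch/pytorch:2.5.0-cuda12.4-cudnn9-runtime",
              pip_packages=["accelerate>=0.35", "transformers>=4.45"])
print(a == b, a == c)
```

Reordering the package list leaves the hash unchanged, while bumping a version string changes it, which matches the behaviour described above.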

### Build and run

```python
# Build only — useful if you want to inspect the image or run it multiple ways
build = client.build_image(**spec, script="my_train.py")
# build["image"] = "europe-west4-docker.pkg.dev/.../<job>:latest"
# build["status"] = "success"
# build["build_id"] = "<...>"

# Build (or skip if cached) + run + cleanup, all in one call
result = client.build_and_train(
    **spec,
    script="my_train.py",
    data="./train.jsonl",
    gpus=1,
    time_limit_hours=1.0,
    reuse_existing_image=True,   # default; False forces a fresh build
    cleanup_image_after=True,    # default; False keeps the A100's local layer
)
# result["build"]["status"] = "success" (fresh) or "cached" (reused)
# result["run"]["status"]   = "completed"
# result["run"]["output_dir"] = "/mnt/data/sdk-jobs/<job_id>/output"

client.download_model(result["run"]["job_id"], "./local-out")
```

### Container lifecycle

| Where | Lifetime |
|---|---|
| A100 disk (pulled image) | **Ephemeral**: removed at the end of `build_and_train()` unless `cleanup_image_after=False` |
| A100 disk (running container) | Always removed at job end |
| Artifact Registry (image) | Persistent: cache hits read from here on subsequent runs |
| Your notebook (downloaded output) | Yours to manage |

To delete an image from Artifact Registry too: `gcloud artifacts docker images delete <uri>`.

---

## Working With Your Model

Once you've downloaded the adapter, the directory is ready for standard Hugging Face tooling.

### GGUF Conversion & Ollama

Convert your adapter to GGUF format for use with Ollama or llama.cpp:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# 1. Merge adapter into base model
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")
model = PeftModel.from_pretrained(base, "./my-model")
merged = model.merge_and_unload()
merged.save_pretrained("./my-model-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B").save_pretrained("./my-model-merged")

# 2. Convert to GGUF (requires llama.cpp built locally)
# python convert_hf_to_gguf.py ./my-model-merged --outfile my-model.gguf --outtype f16

# 3. Quantize (optional, 4-bit)
# ./llama-quantize my-model.gguf my-model-q4.gguf Q4_K_M

# 4. Run with Ollama
# Create a Modelfile:  FROM ./my-model-q4.gguf
# ollama create my-model -f Modelfile
# ollama run my-model
```

### Publishing to Hugging Face Hub

```python
from huggingface_hub import HfApi

api = HfApi(token="hf_...")
repo_id = "your-username/my-finetuned-model"
api.create_repo(repo_id, exist_ok=True)

# Option A — adapter only (small, loads on top of base model)
api.upload_folder(folder_path="./my-model", repo_id=repo_id)

# Option B — full merged model
api.upload_folder(folder_path="./my-model-merged", repo_id=repo_id)

# Option C — GGUF file
api.upload_file(path_or_fileobj="./my-model-q4.gguf",
                path_in_repo="my-model-q4.gguf",
                repo_id=repo_id)
```

---

## Hardware Specs

**OpenToken A100 80 GB** ([opentoken.global](https://opentoken.global)) — the dedicated GPU node the SDK runs on:

| Component | Specification |
|-----------|---------------|
| GPU | 1× NVIDIA A100 PCIe |
| GPU memory | 80 GB HBM2e |
| FP64 performance | 9.7 TFLOPS |
| FP32 performance | 19.5 TFLOPS |
| TensorFloat-32 | 156 TFLOPS |
| BF16 / FP16 (tensor cores) | 312 TFLOPS |
| CPU cores | 12 |
| System RAM | 82 GB |
| Job runtime | Containerised (Docker, CUDA-aware via `--gpus all`) |

**Per-job memory guide for 1× A100 80 GB:**

| Model size | Training mode | Fits on 1 GPU? |
|------------|---------------|----------------|
| 0.5B – 1B | low (QLoRA 4-bit) | yes |
| 3B – 7B | low / medium | yes |
| 7B – 13B | low (QLoRA) | yes |
| 13B | high (bf16) | tight — checkpoint-heavy |
| 30B+ | low (QLoRA) | marginal |
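
For a rough sanity check of the table, the memory taken by the model weights alone is the parameter count times the bytes per parameter. The sketch below computes only that lower bound. It deliberately ignores activations, gradients, optimizer state, LoRA adapters and the KV cache, all of which add to the real footprint, so treat it as a first filter rather than a guarantee. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(params_billion, bits_per_param):
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    # 1e9 parameters * (bits / 8) bytes each, divided by 1e9 bytes per GB.
    return params_billion * bits_per_param / 8


for label, params, bits in [
    ("7B, 4-bit (QLoRA)", 7, 4),
    ("13B, bf16", 13, 16),
    ("30B, 4-bit (QLoRA)", 30, 4),
]:
    print(f"{label}: ~{weight_memory_gb(params, bits):.1f} GB of weights")
```

A 13B model in bf16 already needs about 26 GB for weights before any training state is added, which is why that row is marked as tight.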

---

## Billing

**$2.00 / GPU-hour**, charged per completed GPU-minute (minimum 1 minute). Credits are deducted automatically from your DataSpires balance via the `deduct_credits` Supabase RPC at job end. Invoices appear on the [DataSpires Billing dashboard](https://dataspires.com/dashboard/billing) in real time.

Build-time minutes on Cloud Build are absorbed by the platform — you only pay GPU-time.
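
The arithmetic is simple enough to reproduce locally when budgeting. The sketch below assumes that "per completed GPU-minute" means partial minutes are rounded down, with a floor of one minute; that interpretation, and the helper names, are assumptions for this example rather than a statement of how the billing backend rounds.

```python
import math

RATE_PER_GPU_HOUR = 2.00  # USD, from the rate quoted above


def hhmmss_to_hours(time_limit):
    """Convert an 'HH:MM:SS' time limit into fractional hours."""
    hours, minutes, seconds = (int(part) for part in time_limit.split(":"))
    return hours + minutes / 60 + seconds / 3600


def job_cost_usd(wall_seconds, gpus=1, rate=RATE_PER_GPU_HOUR):
    """Cost of a finished job, billing whole minutes with a 1-minute minimum."""
    # Assumption: only completed minutes are billed, never fewer than one.
    billable_minutes = max(1, math.floor(wall_seconds / 60))
    return round(billable_minutes * gpus * rate / 60, 4)


def max_cost_usd(time_limit, gpus=1, rate=RATE_PER_GPU_HOUR):
    """Upper bound on cost if the job runs for its full time limit."""
    return round(hhmmss_to_hours(time_limit) * gpus * rate, 4)


print(job_cost_usd(300))          # 5 minutes of wall-clock time
print(max_cost_usd("01:00:00"))   # a one-hour time limit
```

Five minutes on one GPU comes to 5 / 60 × $2.00 ≈ $0.1667, and a one-hour limit caps the job at $2.00.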

---

## Model & Dataset Registry

```python
client.list_available_models()                      # all models
client.list_available_models(size="tiny")           # tiny | small | medium | large
client.list_available_datasets()
client.get_model_requirements("qwen2.5-0.5b", "low")
```

**Curated models:**

| ID | Name | Type | Params | Min VRAM |
|----|------|------|--------|----------|
| `qwen2.5-0.5b` | Qwen 2.5 0.5B | text | 0.5B | 4 GB |
| `gemma-3-270m` | Gemma 3 270M | text | 0.27B | 2 GB |
| `llama-3.2-1b` | Llama 3.2 1B | text | 1.0B | 4 GB |
| `deepseek-r1-1.5b` | DeepSeek R1 1.5B | text | 1.5B | 6 GB |
| `ministral-3b` | Ministral 3B | text | 3.3B | 8 GB |
| `florence-2-base` | Florence 2 Base | vision | 0.23B | 4 GB |
| `smolvlm-256m` | SmolVLM 256M | vision | 0.26B | 2 GB |
| `moondream2` | Moondream 2 | vision | 1.9B | 8 GB |
| `internvl2-1b` | InternVL2 1B | vision | 1.0B | 4 GB |
| `llava-1.5-7b` | LLaVA 1.5 7B | vision | 7.0B | 16 GB |

For anything outside this registry, use `client.build_image()` / `client.build_and_train()` with `model_source=`.

---

## Architecture

```
Notebook (Colab / Kaggle / Local)         api.dataspires.com (Cloudflare Worker)
+---------------------+                   +---------------------------+
| AfriLink SDK        | --- POST -----→   | exchange afk_live_… for:  |
|  client.authenticate()                  |  - Supabase JWT (billing) |
|                     | ←-- response ---  |  - A100 SSH key (in-mem)  |
+---------------------+                   |  - GCP SA key (build)     |
     |        ↓                           |  - GHCR PAT (image pulls) |
     |   (in-memory state)                +---------------------------+
     |
     ↓
+---------------------+        SSH        +---------------------+
| docker_runner.py    | ----------------→ | OpenToken A100 80GB |
|  - prepare_job_dir  |   /mnt/data/      |  Docker daemon      |
|  - upload via SCP   |    sdk-jobs/      |  (containerd at     |
|  - docker run --gpus=all                |   /mnt/data/)       |
|  - docker inspect (poll)                +---------------------+
+---------------------+
     |
     ↓ build path
+---------------------+   Cloud Build    +---------------------+
| build.py            | --→ submit job → | europe-west4-       |
|  - generate Docker- |     (anadrome)   | docker.pkg.dev/...  |
|    file from spec   |                  |  afrilink-user-     |
|  - tar build context|                  |  images/<user>/<job>|
|  - upload to GCS    |                  +---------------------+
+---------------------+                            |
                                                   ↓ docker pull
                                              (A100 fetches image,
                                               runs it, deletes
                                               local layer at end)
```

The A100 backend, the Cloudflare Worker, the Cloud Build pipeline, the Artifact Registry, the Supabase backend — all of it lives behind `client.authenticate()`. As a user you set one notebook secret and get on with training.

---

## License

MIT
