Metadata-Version: 2.4
Name: aquin
Version: 0.1.2
Summary: Aquin CLI. Run GPU inspection, steering, simulation, and evals locally with aquin connect, aquin load, aquin chat, and aquin inspect.
Project-URL: Homepage, https://aquin.app
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.28
Requires-Dist: websockets>=12.0
Requires-Dist: pydantic>=2.0
Requires-Dist: httpx>=0.27
Requires-Dist: certifi>=2024.0
Requires-Dist: safetensors>=0.4
Requires-Dist: typer>=0.12
Requires-Dist: rich>=13.0
Requires-Dist: torch>=2.3
Requires-Dist: transformer-lens>=1.19
Requires-Dist: transformers>=4.40
Requires-Dist: accelerate>=0.30
Requires-Dist: datasets>=2.18
Requires-Dist: peft>=0.10
Requires-Dist: sentencepiece>=0.2
Requires-Dist: protobuf>=4.0
Requires-Dist: huggingface-hub>=0.23
Requires-Dist: sae-lens>=3.0
Requires-Dist: sentence-transformers>=3.0
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: scipy>=1.12
Requires-Dist: scikit-learn>=1.4
Requires-Dist: umap-learn>=0.5
Requires-Dist: matplotlib>=3.8
Requires-Dist: openai>=1.0
Dynamic: license-file

# Aquin SDK

Record your training runs locally and push them to [Aquin](https://aquin.app) for post-hoc inspection — loss curves, learning rate, grad norm, epoch summaries, SAE feature diffs, model behaviour diffs, and more.

## Install

```bash
pip install aquin
```

## Quickstart

```python
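import torch  # used below for gradient clipping

import aquin

# `model`, `dataloader`, `scheduler`, and `train_step` are placeholders for your own training code.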

run = aquin.init(
    base_model="meta-llama/Llama-3.2-1B-Instruct",
    run_name="my-lora-run",
    config={
        "lr": 2e-4, "epochs": 3, "rank": 16, "lora_alpha": 32,
        "method": "qlora", "per_device_train_batch_size": 2,
        "gradient_accumulation_steps": 8, "dataset": "data.jsonl",
    },
)

global_step = 0  # counts steps across all epochs; enumerate() would restart at 0 each epoch
for epoch in range(3):
    for batch in dataloader:
        loss = train_step(batch)
        grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0).item()
        global_step += 1
        run.log(
            global_step,
            loss=loss.item(),
            learning_rate=scheduler.get_last_lr()[0],
            grad_norm=grad_norm,
            epoch=epoch,
        )

run.checkpoint(model, step=global_step)
run.finish()
```

Then push:

```bash
aquin package
aquin push
```

Your run appears in the Aquin dashboard under **CLI runs** with the full inspection suite.

## API

### `aquin.init(base_model, run_name, config)`

Starts a new run. Creates `aquin_run/` in the current directory.

| Param | Description |
|---|---|
| `base_model` | HuggingFace model ID, e.g. `"meta-llama/Llama-3.2-1B-Instruct"` |
| `run_name` | Display name for the run |
| `config` | Dict of training hyperparameters. Optional; can be passed to `run.finish()` instead |

### `run.log(step, *, loss, ...)`

Record metrics for one training step. Call every step inside your loop.

| Param | Description |
|---|---|
| `step` | Global training step (required) |
| `loss` | Scalar training loss (required) |
| `learning_rate` | Current LR — enables LR chart |
| `grad_norm` | Gradient norm — enables grad norm chart |
| `epoch` | Current epoch — enables epoch summary table |
| `momentum_norm` | Optimizer momentum norm — enables momentum chart |
| `step_ms` | Wall-clock time for this step in ms |
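
The Quickstart loop does not measure `step_ms`. A minimal sketch of one way to do it with the standard library follows; `StepTimer` is a hypothetical helper written for this example and is not part of `aquin`, and the `time.sleep` call stands in for your own training step.

```python
import time


class StepTimer:
    """Hypothetical helper (not part of aquin): times one training step."""

    def __enter__(self):
        self.ms = 0
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Whole milliseconds, like the round(...) used in the Trainer callback example.
        self.ms = round((time.perf_counter() - self._start) * 1000)
        return False  # never swallow exceptions raised by the training step


with StepTimer() as timer:
    time.sleep(0.01)  # stand-in for train_step(batch)

print(f"step took {timer.ms} ms")
# In a real loop you would then call: run.log(step, loss=..., step_ms=timer.ms)
```

On a GPU, kernels run asynchronously, so the measured time is only meaningful if the timed region ends with something that waits for the device, such as `loss.item()`.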

### `run.checkpoint(model, step)`

Saves the model checkpoint locally. Only one checkpoint is kept per run: each call replaces the previous save, so call it once at the end of training. The checkpoint is included in the push and used for SAE diff and model diff analysis.

### `run.finish(config)`

Flushes all metrics to disk. Pass `config` here if you didn't pass it to `aquin.init()`.

## CLI

```bash
aquin login       # save your API key
aquin package     # bundle aquin_run/ into aquin_run.tar.gz
aquin push        # push to Aquin
aquin whoami      # check which account you're logged in as
```

## Using with HuggingFace Trainer / TRL

Use a `TrainerCallback` to hook into the training loop, and register it by passing `callbacks=[AquinCallback(run)]` when you construct the trainer:

```python
import time
from transformers import TrainerCallback

class AquinCallback(TrainerCallback):
    def __init__(self, run):
        self.run = run
        self._step_start = 0.0

    def on_step_begin(self, args, state, control, **kwargs):
        self._step_start = time.time()

    def on_log(self, args, state, control, logs=None, **kwargs):
        if not logs or "loss" not in logs:
            return
        self.run.log(
            step=state.global_step,
            loss=float(logs["loss"]),
            learning_rate=float(logs["learning_rate"]) if "learning_rate" in logs else None,
            grad_norm=float(logs["grad_norm"]) if "grad_norm" in logs else None,
            epoch=int(state.epoch) if state.epoch is not None else None,
            step_ms=round((time.time() - self._step_start) * 1000),
        )

    def on_train_end(self, args, state, control, **kwargs):
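        model = kwargs.get("model")
        # Compare against None: truthiness of a module can be False (e.g. an empty nn.Sequential).
        if model is not None:
            self.run.checkpoint(model, step=state.global_step)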
```
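
The filtering in `on_log` matters because the Trainer also emits log events without a `loss` key, such as evaluation results. The sketch below isolates that mapping as a plain function so it can be checked without a training run. `logs_to_metrics` is a hypothetical helper, and dropping missing metrics instead of passing `None` is a choice made for this example, since the README does not say how `run.log()` treats `None`.

```python
from typing import Any, Dict, Optional


def logs_to_metrics(
    logs: Optional[Dict[str, Any]], epoch: Optional[float]
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: map a Trainer `logs` dict to keyword arguments for run.log()."""
    # Evaluation and end-of-training events carry no "loss" key, so skip them.
    if not logs or "loss" not in logs:
        return None
    metrics: Dict[str, Any] = {"loss": float(logs["loss"])}
    # Copy optional metrics only when the Trainer actually reported them.
    for key in ("learning_rate", "grad_norm"):
        if key in logs:
            metrics[key] = float(logs[key])
    if epoch is not None:
        metrics["epoch"] = int(epoch)  # the Trainer reports fractional epochs, e.g. 1.25
    return metrics


train_logs = {"loss": 1.5, "learning_rate": 2e-4, "epoch": 1.25}
eval_logs = {"eval_loss": 1.4, "epoch": 1.25}

print(logs_to_metrics(train_logs, epoch=1.25))  # training event: converted
print(logs_to_metrics(eval_logs, epoch=1.25))   # evaluation event: skipped
```

Inside the callback, the result could be used as `self.run.log(step=state.global_step, **metrics)` whenever it is not `None`.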

---

## Building and publishing a new release

**Prerequisites:** Python 3.13, Nuitka, the `build` package (`pip install build`), and MSVC (Visual Studio Build Tools with the "Desktop development with C++" workload).

**1. Compile to native extensions**
```bash
cd cli
python scripts/build_nuitka.py
# Compiles engine/ + compute/ to .pyd, removes .py source, audits on finish
```

**2. Build the wheel**
```bash
python -m build --wheel
# Output: dist/aquin-<version>-py3-none-any.whl
```

**3. Audit — confirm no source leaked**
```bash
python scripts/build_nuitka.py --check
# Must print: Audit passed
```

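**4. Bump version before releasing**
Edit `version` in `pyproject.toml`. If you have already built, restore the source first (`git restore cli/aquin/engine cli/aquin/compute`), then repeat steps 1–3 so the wheel carries the new version.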

**5. Distribute**
Send the wheel directly to users (`pip install aquin-*.whl`) or upload to R2 and share a signed link.

**Notes:**
- `.pyd` files and `dist/` are gitignored — never commit compiled artifacts
- After building, `engine/` and `compute/` have no `.py` source in your working copy either. To keep the git working tree clean, run builds in a separate branch or restore the source from git after building
- To rebuild from scratch: `git restore cli/aquin/engine cli/aquin/compute` then repeat from step 1
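
As an independent cross-check on the audit step, a wheel can be inspected directly, since it is a zip archive. The sketch below is not the logic of `scripts/build_nuitka.py --check`, which is not shown here. It assumes the compiled packages sit under `aquin/engine/` and `aquin/compute/` inside the wheel, and the file names in the demo archive are made up.

```python
import io
import zipfile
from typing import BinaryIO, List, Union

# Assumption: these prefixes mirror cli/aquin/engine and cli/aquin/compute inside the wheel.
PROTECTED_PREFIXES = ("aquin/engine/", "aquin/compute/")


def leaked_sources(wheel: Union[str, BinaryIO]) -> List[str]:
    """Return the .py files found under the protected packages of a wheel."""
    with zipfile.ZipFile(wheel) as archive:  # a wheel is a zip archive
        return sorted(
            name
            for name in archive.namelist()
            if name.startswith(PROTECTED_PREFIXES) and name.endswith(".py")
        )


# Demo on an in-memory archive standing in for a real wheel (hypothetical file names).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("aquin/engine/core.pyd", b"")                # compiled: fine
    zf.writestr("aquin/compute/kernels.py", "print('oops')")  # source: should be flagged
    zf.writestr("aquin/cli.py", "")                          # outside the protected packages
buf.seek(0)

print(leaked_sources(buf))
```

For a real build you would pass the path of the wheel in `dist/` instead of the in-memory buffer. If the compiled packages are meant to keep an `__init__.py`, that file would need to be excluded from the check.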
