Metadata-Version: 2.4
Name: neotune
Version: 0.3.0
Summary: Supervised fine-tuning of LLMs with LoRA and DeepSpeed
License-Expression: MIT
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Intended Audience :: Science/Research
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.0
Requires-Dist: transformers>=4.40
Requires-Dist: peft>=0.10
Requires-Dist: datasets>=2.0
Requires-Dist: deepspeed>=0.12
Requires-Dist: accelerate>=0.30
Requires-Dist: scikit-learn>=1.0
Requires-Dist: numpy>=1.24
Requires-Dist: pyyaml>=6.0
Requires-Dist: python-dotenv>=1.0
Requires-Dist: tqdm>=4.60
Provides-Extra: ray
Requires-Dist: ray[train]>=2.0; extra == "ray"
Provides-Extra: logging
Requires-Dist: mlflow>=2.0; extra == "logging"
Requires-Dist: wandb>=0.15; extra == "logging"
Provides-Extra: all
Requires-Dist: neotune[logging,ray]; extra == "all"
Dynamic: license-file

# neotune

Supervised fine-tuning of LLMs with LoRA and DeepSpeed, packaged as a simple Python API.

## Installation

```bash
pip install neotune
```

With optional extras:

```bash
pip install "neotune[ray]"       # Ray Train + Kubernetes support
pip install "neotune[logging]"   # MLflow + Weights & Biases
pip install "neotune[all]"       # everything
```

## Quick Start

```python
from datasets import load_dataset
from neotune import finetune

# Load and prepare your datasets (each split needs a "text" column).
# tatsu-lab/alpaca ships only a "train" split, so carve one out for validation.
ds = load_dataset("tatsu-lab/alpaca")
split = ds["train"].train_test_split(test_size=0.1, seed=42)
train_ds = split["train"]  # already includes a formatted "text" column
val_ds = split["test"]

results = finetune(
    model="meta-llama/Llama-3.1-8B-Instruct",
    datasets={"train": train_ds, "validation": val_ds},
    hyperparameters={"learning_rate": 2e-4, "num_train_epochs": 3},
)  # metrics dict; empty here because no "test" split was given
```

Or use the class-based API for more control:

```python
from neotune import NeoTune

# train_ds / val_ds as in Quick Start; test_ds is a further held-out split
nt = NeoTune(
    model="meta-llama/Llama-3.1-8B-Instruct",
    datasets={"train": train_ds, "validation": val_ds, "test": test_ds},
    hyperparameters={"learning_rate": 2e-4, "output_dir": "./my-adapter"},
)

results = nt.train()
nt.evaluate()
```

## API Reference

### `finetune(model, datasets, hyperparameters)` → `dict`

One-call convenience wrapper: constructs a `NeoTune` instance from the same arguments and immediately calls `.train()`, returning its metrics dict.

### `NeoTune(model, datasets, hyperparameters)`

#### `model` — `str`
HuggingFace model ID or local path.

```python
model="meta-llama/Llama-3.1-8B-Instruct"
```

#### `datasets` — `dict[str, Dataset]`
A dict of HuggingFace `Dataset` objects. Only `"train"` is required; `"validation"` and `"test"` are optional.

Each dataset must contain **either**:

- A `"text"` column with fully-formatted prompt/response text (will be tokenized automatically), **or**
- Pre-tokenized columns: `input_ids`, `attention_mask`, and `labels` (see the sketch after the example below)

```python
from datasets import load_dataset

ds = load_dataset("my_dataset")
datasets = {
    "train": ds["train"],
    "validation": ds["validation"],
    "test": ds["test"],       # optional — used for final evaluation
}
```
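
If your data is already tokenized, here is a minimal sketch of the expected columns. Assumptions: the tokenizer matches your base model, and `labels` mirrors `input_ids`, as is conventional for causal-LM fine-tuning; the gated Llama tokenizer also requires `HF_TOKEN` to be set.

```python
from datasets import Dataset
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
enc = tok(
    ["<formatted example 1>", "<formatted example 2>"],
    truncation=True,
    max_length=2048,  # keep in sync with the max_len hyperparameter
)

# Pre-tokenized splits carry input_ids, attention_mask, and labels.
train_ds = Dataset.from_dict({
    "input_ids": enc["input_ids"],
    "attention_mask": enc["attention_mask"],
    "labels": enc["input_ids"],
})
```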

#### `hyperparameters` — `dict`, optional
Override any default. All keys are optional.

**Training:**

| Key | Default | Description |
|-----|---------|-------------|
| `learning_rate` | `1e-4` | Learning rate |
| `num_train_epochs` | `3` | Number of training epochs |
| `batch_size` | `1` | Per-device train/eval batch size |
| `gradient_accumulation_steps` | `4` | Gradient accumulation steps |
| `warmup_ratio` | `0.03` | Warmup ratio |
| `weight_decay` | `0.01` | Weight decay |
| `bf16` | `True` | Use bfloat16 mixed precision |
| `gradient_checkpointing` | `False` | Enable gradient checkpointing |
| `logging_steps` | `10` | Log every N steps |
| `eval_steps` | `50` | Evaluate every N steps |
| `save_steps` | `100` | Save checkpoint every N steps |
| `save_total_limit` | `3` | Maximum checkpoints to keep |

**LoRA:**

| Key | Default | Description |
|-----|---------|-------------|
| `lora_r` | `16` | LoRA rank |
| `lora_alpha` | `32` | LoRA alpha scaling |
| `lora_dropout` | `0.05` | LoRA dropout |
| `lora_target_modules` | `["q_proj", "k_proj", ...]` | Modules to apply LoRA to |

**Output:**

| Key | Default | Description |
|-----|---------|-------------|
| `output_dir` | `"./adapter-output"` | Where to save the trained adapter |
| `hf_repo` | `None` | HuggingFace Hub repo to push to |

**DeepSpeed:**

| Key | Default | Description |
|-----|---------|-------------|
| `ds_config` | `None` | DeepSpeed config: `None` (built-in ZeRO-2), a file path, or a dict |
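
For example, a custom ZeRO-2 config passed as a dict. This is a sketch: the keys are standard DeepSpeed JSON-schema fields, and the `"auto"` values assume neotune forwards the config through the HF `Trainer` integration, which resolves them.

```python
hyperparameters = {
    "ds_config": {
        "train_micro_batch_size_per_gpu": "auto",
        "gradient_accumulation_steps": "auto",
        "bf16": {"enabled": True},
        "zero_optimization": {"stage": 2, "overlap_comm": True},
    },
}
```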

**Data:**

| Key | Default | Description |
|-----|---------|-------------|
| `max_len` | `2048` | Max token length (used when tokenizing a `"text"` column) |

```python
hyperparameters={
    "learning_rate": 2e-4,
    "num_train_epochs": 5,
    "lora_r": 32,
    "output_dir": "./my-adapter",
    "hf_repo": "username/my-adapter",
}
```

### Methods

#### `.train()` → `dict`
Fine-tunes the model and returns test-set evaluation metrics (`accuracy`, `f1_macro`, `precision_macro`, `recall_macro`). Returns an empty dict if no test split was provided.

#### `.evaluate()` → `tuple`
Runs inference on the test split using the saved adapter. Returns `(accuracy, f1, precision, recall)`. Requires a `"test"` split.
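
Putting both together (splits prepared as in Quick Start, plus a held-out `"test"` split):

```python
from neotune import NeoTune

nt = NeoTune(
    model="meta-llama/Llama-3.1-8B-Instruct",
    datasets={"train": train_ds, "validation": val_ds, "test": test_ds},
)

metrics = nt.train()  # dict of test metrics; {} without a "test" split
accuracy, f1, precision, recall = nt.evaluate()  # re-runs inference with the saved adapter
```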

## Advanced Usage

### Distributed training with DeepSpeed (CLI)

```bash
deepspeed --num_gpus 4 -m neotune.train --config config.yaml --mode train
```
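
The schema of `config.yaml` is not documented here; the layout below is a guess that simply mirrors the Python API, so treat every field name as an assumption and check the examples shipped with your installed version.

```yaml
# Hypothetical config.yaml: field names assumed to mirror the Python API.
model: meta-llama/Llama-3.1-8B-Instruct
hyperparameters:
  learning_rate: 2.0e-4
  num_train_epochs: 3
  lora_r: 16
  output_dir: ./my-adapter
```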

### Distributed training with Ray

```bash
python -m neotune.ray_train --config config.yaml --num_workers 4
```

### Kubernetes (KubeRay)

See `k8s/rayjob-lora-sft.yaml` for a KubeRay `RayJob` template.

## Environment Variables

| Variable | Description |
|----------|-------------|
| `HF_TOKEN` | HuggingFace access token (required for gated models) |
| `WANDB_API_KEY` | Weights & Biases API key (optional) |

Create a `.env` file in your working directory:

```
HF_TOKEN=your_token_here
WANDB_API_KEY=your_wandb_key_here
```
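
neotune depends on `python-dotenv`, so if the file isn't picked up automatically in your context (for example, a notebook launched from another directory), you can load it yourself:

```python
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory into os.environ
```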

## License

MIT
