Metadata-Version: 2.4
Name: eggroll-es
Version: 0.2.0
Summary: Gradient-free fine-tuning for any HuggingFace model. EGGROLL Evolution Strategies in PyTorch.
Author: ShipItAndPray
License: MIT
Project-URL: Homepage, https://github.com/ShipItAndPray/eggroll
Project-URL: Issues, https://github.com/ShipItAndPray/eggroll/issues
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.0.0
Requires-Dist: transformers>=4.36.0
Requires-Dist: datasets>=2.14.0
Requires-Dist: tqdm>=4.60.0
Provides-Extra: vllm
Requires-Dist: vllm>=0.4.0; extra == "vllm"
Requires-Dist: safetensors>=0.4.0; extra == "vllm"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: safetensors>=0.4.0; extra == "dev"
Dynamic: license-file

<p align="center">
  <pre align="center">
 ███████╗ ██████╗  ██████╗ ██████╗  ██████╗ ██╗     ██╗
 ██╔════╝██╔════╝ ██╔════╝ ██╔══██╗██╔═══██╗██║     ██║
 █████╗  ██║  ███╗██║  ███╗██████╔╝██║   ██║██║     ██║
 ██╔══╝  ██║   ██║██║   ██║██╔══██╗██║   ██║██║     ██║
 ███████╗╚██████╔╝╚██████╔╝██║  ██║╚██████╔╝███████╗███████╗
 ╚══════╝ ╚═════╝  ╚═════╝ ╚═╝  ╚═╝ ╚═════╝ ╚══════╝╚══════╝
  </pre>
</p>

<h3 align="center">Gradient-Free Fine-Tuning for Any HuggingFace Model</h3>

<p align="center">
  <a href="https://pypi.org/project/eggroll-es/"><img alt="PyPI" src="https://img.shields.io/pypi/v/eggroll-es?color=blue"></a>
  <a href="https://github.com/ShipItAndPray/eggroll/actions/workflows/ci.yml"><img alt="CI" src="https://github.com/ShipItAndPray/eggroll/actions/workflows/ci.yml/badge.svg"></a>
  <a href="https://github.com/ShipItAndPray/eggroll/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/github/license/ShipItAndPray/eggroll"></a>
  <a href="https://pypi.org/project/eggroll-es/"><img alt="Python" src="https://img.shields.io/pypi/pyversions/eggroll-es"></a>
</p>

<p align="center">
  PyTorch implementation of <a href="https://arxiv.org/abs/2511.16652">EGGROLL</a> (Evolution Strategies at the Hyperscale, NVIDIA + Oxford).<br>
  No backprop. No gradients. Just evolution.
</p>

---

## Why EGGROLL?

Backpropagation requires differentiable objectives, massive memory for activations, and complex distributed training setups. EGGROLL replaces all of that with **evolution** — mutate, evaluate, keep what works.

| | Backprop (LoRA/GRPO) | EGGROLL |
|---|---|---|
| Gradients needed | Yes | **No** |
| Memory (activations) | O(layers) | **O(1)** |
| Differentiable reward | Required | **Any function** |
| Works on quantized models | Limited | **Native** |
| Throughput | Training speed | **~91% of inference speed** |

---

## Install

```bash
pip install eggroll-es
```

Or from source:

```bash
git clone https://github.com/ShipItAndPray/eggroll.git
cd eggroll
pip install -e ".[dev]"
```

---

## Quick Start

### CLI — One Command

```bash
# Evolve GPT-2 to minimize perplexity
eggroll tune gpt2 --reward perplexity --generations 50

# Evolve Llama with custom reward, only attention layers
eggroll tune meta-llama/Llama-3.1-8B \
  --reward score.py \
  --target-modules q_proj v_proj \
  --population 128 \
  --rank 8

# Use an inline lambda as reward
eggroll tune gpt2 --reward "lambda text: 1.0 if 'yes' in text else 0.0"

# Show model info + EGGROLL memory estimates
eggroll info meta-llama/Llama-3.1-8B
```

### Python API

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from eggroll import EggrollTrainer, EggrollConfig, PerplexityReward

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

config = EggrollConfig(
    population_size=64,   # mutations per generation
    rank=8,               # low-rank perturbation rank
    sigma=0.01,           # noise magnitude
    lr=0.001,             # learning rate
    generations=100,      # evolution steps
    target_modules=["c_attn", "c_proj"],  # only evolve these layers
)

reward = PerplexityReward(tokenizer)
trainer = EggrollTrainer(model, config, reward, tokenizer)
# NOTE: `dataloader` is not defined in this snippet; build it from your own dataset first
trainer.evolve(dataloader)
trainer.save("./evolved-model")
```

### Custom Reward Functions

Because no gradients are needed, the reward can be **any** function that returns a number, not just a differentiable loss:

```python
from eggroll import (
    EggrollTrainer, EggrollConfig,
    TextReward, CustomReward, PerplexityReward, MultiReward,
)

# `tokenizer` is the AutoTokenizer loaded in the Python API example above.

# Score generated text (non-differentiable!)
def my_scorer(text: str) -> float:
    if "correct answer" in text:
        return 1.0
    return 0.0

reward = TextReward(tokenizer, scorer=my_scorer)

# Or use any function of (model, inputs) -> float.
# `run_tests` stands in for your own evaluation function.
reward = CustomReward(lambda model, inputs: run_tests(model, inputs))

# Combine multiple rewards as (reward, weight) pairs.
# `code_scorer` stands in for your own text scorer, like `my_scorer` above.
reward = MultiReward([
    (PerplexityReward(tokenizer), 0.3),
    (TextReward(tokenizer, scorer=code_scorer), 0.7),
])
```

---

## How It Works

Based on [NVIDIA + Oxford's EGGROLL paper](https://arxiv.org/abs/2511.16652):

1. **Mutate** — Generate low-rank perturbations of model weights (A × B.T instead of full-rank noise)
2. **Evaluate** — Run each mutated model on your reward function
3. **Select** — Fitness-weighted combination of best mutations updates the parameters
4. **Repeat** — Each generation gets closer to optimal

```
Generation 0    →    Generation N
  θ₀                    θ*

  ┌──── mutate ────┐
  │  θ + ε₁  → 0.3 │    Rank-r perturbations:
  │  θ + ε₂  → 0.8 │    ε = σ · A · Bᵀ / √r
  │  θ + ε₃  → 0.1 │
  │  θ + ε₄  → 0.9 │    Update:
  └──── select ────┘    θ ← θ + lr · Σ fᵢεᵢ / nσ
         ↓
    θ + weighted avg
```
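
The loop in the diagram can be sketched in NumPy for a single weight matrix. This is a minimal illustration, not the library's implementation: `low_rank_es_step`, the toy quadratic fitness, and the hyperparameter values are all assumptions made for the example. It uses antithetic pairs and normalized fitness shaping, and applies the update θ ← θ + lr · Σ fᵢεᵢ / nσ from the diagram.

```python
import numpy as np

def low_rank_es_step(theta, fitness_fn, rng, population=64, rank=4, sigma=0.05, lr=0.05):
    """One generation of low-rank ES on one matrix (hypothetical helper, illustration only)."""
    m, n = theta.shape
    half = population // 2
    # Mutate: sample low-rank factors and form A @ B.T / sqrt(rank) per member.
    A = rng.standard_normal((half, m, rank))
    B = rng.standard_normal((half, n, rank))
    directions = A @ B.transpose(0, 2, 1) / np.sqrt(rank)       # (half, m, n)
    # Antithetic sampling: evaluate each direction with both signs.
    directions = np.concatenate([directions, -directions], axis=0)
    # Evaluate: the perturbation is eps_i = sigma * direction_i.
    fitness = np.array([fitness_fn(theta + sigma * d) for d in directions])
    # Normalized fitness shaping (zero mean, unit standard deviation).
    shaped = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
    # Select: sum_i f_i * eps_i / (n * sigma) simplifies to mean_i f_i * direction_i.
    update = np.tensordot(shaped, directions, axes=1) / len(directions)
    return theta + lr * update

# Toy problem: move an 8 x 8 matrix toward a fixed target without any gradients.
rng = np.random.default_rng(0)
target = rng.standard_normal((8, 8))

def fitness_fn(w):
    return -float(np.sum((w - target) ** 2))   # higher is better

theta = np.zeros((8, 8))
initial_loss = -fitness_fn(theta)
for _ in range(300):
    theta = low_rank_es_step(theta, fitness_fn, rng)
final_loss = -fitness_fn(theta)
print(f"squared error: {initial_loss:.2f} -> {final_loss:.2f}")
```

For clarity the sketch materializes every perturbation as a dense matrix. A real implementation would keep the factors separate, which is where the memory savings come from.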

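**Why low-rank?** For an m × n weight matrix, full-rank ES stores O(mn) noise values per population member. EGGROLL stores only the two factors, O(r(m + n)) values with r ≪ min(m, n). This is what enables the reported **100x speedup**, while the approximation error drops as O(1/r).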
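
To make the memory argument concrete, here is a small worked calculation for one weight matrix. The 4096 × 4096 shape is an illustrative assumption (a typical hidden size), and `perturbation_floats` is a hypothetical helper, not part of the library. It counts the noise values stored per population member.

```python
def perturbation_floats(m: int, n: int, rank: int) -> tuple[int, int]:
    """Return (full_rank, low_rank) noise-value counts for one m x n matrix."""
    full_rank = m * n            # one noise value per weight
    low_rank = rank * (m + n)    # factors A (m x rank) and B (n x rank)
    return full_rank, low_rank

full, low = perturbation_floats(4096, 4096, rank=8)
print(full, low, full // low)    # 16777216 65536 256
```

This ratio measures storage only. End-to-end throughput depends on the model and hardware, so it should not be read as a speedup figure.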

---

## vLLM Backend (Massively Parallel)

Instead of evaluating mutations one at a time, the vLLM backend converts each EGGROLL perturbation into a LoRA adapter and evaluates the **entire population in one batched vLLM call**.

```bash
# CLI
eggroll tune meta-llama/Llama-3.1-8B-Instruct \
  --backend vllm \
  --reward score.py \
  --population 128 \
  --target-modules q_proj v_proj
```

```python
# Python
from eggroll.vllm_backend import VllmEggrollTrainer
from eggroll import EggrollConfig

config = EggrollConfig(population_size=128, rank=8, generations=50)

def reward_fn(outputs: list[str], prompts: list[str]) -> list[float]:
    return [1.0 if "correct" in o else 0.0 for o in outputs]

trainer = VllmEggrollTrainer(
    model_id="meta-llama/Llama-3.1-8B-Instruct",
    config=config,
    reward_fn=reward_fn,
)
results = trainer.evolve(prompts=["What is 2+2?", "Explain gravity."])
trainer.save_best_adapter("./evolved-adapter")
```

**How it works:**
1. Each EGGROLL perturbation is factorized via SVD into LoRA A/B matrices
2. Saved as PEFT-compatible adapter directories on disk
3. vLLM loads all adapters via `LoRARequest` and evaluates them in parallel
4. Fitnesses collected, parameters updated, repeat
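
Step 1 can be sketched with NumPy. This is an illustration under stated assumptions, not the library's conversion code: `perturbation_to_lora` is a hypothetical helper, the shapes are arbitrary, and LoRA scaling factors and the on-disk adapter layout are left out. It follows the common LoRA convention ΔW = B · A with B of shape (out, rank) and A of shape (rank, in).

```python
import numpy as np

def perturbation_to_lora(delta_w: np.ndarray, rank: int):
    """Factor an (out, in) perturbation into LoRA-style A (rank, in) and B (out, rank)."""
    U, S, Vt = np.linalg.svd(delta_w, full_matrices=False)
    root = np.sqrt(S[:rank])               # split singular values evenly between factors
    lora_B = U[:, :rank] * root            # (out, rank)
    lora_A = root[:, None] * Vt[:rank]     # (rank, in)
    return lora_A, lora_B

# Build a rank-4 perturbation sigma * A @ B.T / sqrt(r), as in the diagram above.
rng = np.random.default_rng(0)
out_dim, in_dim, rank, sigma = 32, 48, 4, 0.01
A = rng.standard_normal((out_dim, rank))
B = rng.standard_normal((in_dim, rank))
delta_w = sigma * A @ B.T / np.sqrt(rank)

lora_A, lora_B = perturbation_to_lora(delta_w, rank)
max_err = np.abs(lora_B @ lora_A - delta_w).max()
print(lora_A.shape, lora_B.shape, max_err)
```

Because the perturbation already has rank `r`, truncating the SVD at rank `r` loses nothing beyond floating-point error.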

**Speed:** ~91% of pure inference throughput (figure from the EGGROLL paper). With vLLM's batching, a population of 128 is evaluated as one batched call rather than 128 sequential passes.

**Requirements:** `pip install eggroll-es[vllm]`

---

## Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `population_size` | 256 | Mutations per generation (higher = better gradient estimate) |
| `rank` | 8 | Low-rank perturbation rank (higher = more accurate, more memory) |
| `sigma` | 0.01 | Noise magnitude (too high = chaos, too low = no exploration) |
| `lr` | 0.001 | Learning rate for parameter updates |
| `generations` | 100 | Number of evolution steps |
| `antithetic` | True | Evaluate each perturbation as a mirrored ±ε pair to reduce variance |
| `fitness_shaping` | "centered_rank" | "centered_rank", "normalized", or "raw" |
| `target_modules` | None | Only evolve layers matching these patterns |
| `elite_k` | 0 | Keep only top-k members (0 = use all) |
| `weight_decay` | 0.0 | L2 regularization |

---

## CLI Reference

```
eggroll tune MODEL [OPTIONS]

  MODEL                      HuggingFace model ID or local path

  --reward, -r REWARD        perplexity, path/to/score.py, or lambda
  --generations, -g N        Evolution generations (default: 100)
  --population, -p N         Population size (default: 64)
  --rank N                   Low-rank perturbation rank (default: 8)
  --sigma F                  Noise std dev (default: 0.01)
  --lr F                     Learning rate (default: 0.001)
  --output, -o DIR           Output directory
  --dataset, -d DATASET      HuggingFace dataset (default: wikitext-2)
  --target-modules M [M..]   Only evolve matching layers
  --seed N                   Random seed (default: 42)

eggroll info MODEL           Show model info + memory estimates
```

---

## Custom Reward File Format

Create a Python file with either:

```python
# Option 1: Score generated text
def score(text: str) -> float:
    return 1.0 if "correct" in text else 0.0

# Option 2: Full model access
def reward_fn(model, inputs) -> float:
    output = model(**inputs)
    return -output.loss.item()
```

Then: `eggroll tune gpt2 --reward my_reward.py`
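
A slightly fuller reward file in the Option 1 format might look like the sketch below. The scoring rules (expected answer `4`, partial credit of `0.25`) are invented for illustration; only the `score(text) -> float` signature comes from the format described above.

```python
# my_reward.py (illustrative scorer; the scoring rules are assumptions)
import re

def score(text: str) -> float:
    """Reward outputs that end on the expected number, with partial credit."""
    numbers = re.findall(r"-?\d+", text)
    if not numbers:
        return 0.0     # no numeric answer at all
    if numbers[-1] == "4":
        return 1.0     # last number in the output matches the expected answer
    return 0.25        # partial credit for giving some numeric answer

print(score("2+2 is 4"), score("I think 5"), score("no idea"))
```

Graded rewards like this give the population more signal to rank on than a strict 0/1 score, which can matter when most early outputs are wrong.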

---

## Comparison to Existing EGGROLL Implementations

| | **This library** | HyperscaleES (official) | egg.c | eggroll-embedding-trainer |
|---|---|---|---|---|
| Language | **PyTorch** | JAX | CUDA/C | PyTorch |
| HuggingFace integration | **Yes** | No | No | No |
| vLLM multi-LoRA backend | **Yes** | No | No | No |
| CLI | **Yes** | No | No | No |
| Custom rewards | **Any function** | Hardcoded | Hardcoded | NDCG only |
| Install | **pip install** | Manual | Compile | Manual |
| Use case | **General fine-tuning** | Research | Edge/embedded | Retrieval |

---

## Development

```bash
pip install -e ".[dev]"
pytest tests/ -v
```

---

## Paper

> Sarkar et al. "Evolution Strategies at the Hyperscale" (2025)
> NVIDIA + University of Oxford + MILA
> [arxiv.org/abs/2511.16652](https://arxiv.org/abs/2511.16652) | [Project Page](https://eshyperscale.github.io/)

---

## License

MIT
