Metadata-Version: 2.4
Name: rqgm
Version: 0.1.0
Summary: Red Queen Gödel Machine — epoch-based utility evolution for self-improving systems
License: Apache-2.0
Keywords: rqgm,self-improvement,agents,evaluation,co-evolution
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown

<div align="center">
  <img src="./assets/logo.svg" alt="rqgm logo" width="120"/>
  <h1>🧬 rqgm — Red Queen Gödel Machine</h1>
  <h3>Co-Evolving Evaluators for Self-Improving AI Systems</h3>
  <p><em>First open implementation of arXiv 2606.26294 (Cambridge, June 2026)</em></p>
</div>

<p align="center">
  <a href="https://arxiv.org/abs/2606.26294"><img src="https://img.shields.io/badge/Paper-2606.26294-b31b1b.svg?logo=arxiv&style=for-the-badge" alt="Paper"></a>
  <a href="https://github.com/observeco/rqgm-core"><img src="https://img.shields.io/badge/GitHub-rqgm--core-181717?logo=github&style=for-the-badge" alt="GitHub"></a>
  <a href="https://github.com/observeco/rqgm-core/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-Apache%202.0-007ec6?style=for-the-badge" alt="License"></a>
  <a href="https://pypi.org/project/rqgm/"><img src="https://img.shields.io/badge/PyPI-rqgm-3775A9?logo=pypi&style=for-the-badge" alt="PyPI"></a>
  <img src="https://img.shields.io/badge/dependencies-0-success?style=for-the-badge" alt="Zero Dependencies">
  <img src="https://img.shields.io/badge/python-%3E%3D3.10-blue?style=for-the-badge" alt="Python 3.10+">
  <a href="https://x.com/1571keplerj"><img src="https://img.shields.io/badge/Follow%20%401571keplerj-000000?logo=x&style=for-the-badge" alt="X / Twitter"></a>
</p>

---

## 🚨 The Problem: Every Self-Improving Agent Eventually Cheats

**Every self-improvement loop has a hidden failure mode.** The agent learns to satisfy the evaluator rather than genuinely improving. The moment the judge stops getting harder, the loop stalls and reward hacking creeps in.

You've seen this before:

- **RLHF reward hacking** — models learn to produce plausible-sounding but vacuous text that scores well
- **Benchmark overfitting** — agents memorize benchmark patterns instead of learning general capabilities
- **LLM-as-a-judge collapse** — evaluator LLMs learn to prefer certain writing styles over correctness
- **Your own agent loops** — Dreamer padding source lists, Pragma satisfying checklists without real quality

**The structural answer:** Co-evolve the agent AND its evaluator together, so the bar keeps rising as the agent climbs.

---

## 🧬 The Solution: RQGM

The **Red Queen Gödel Machine** (arXiv 2606.26294, Cambridge) introduces **controlled utility evolution** — the evaluator itself evolves at epoch boundaries, preventing reward hacking and keeping improvement loops honest.

```
Epoch 0 (tolerances: [0.0, 0.001, 0.01, 0.025, 0.05, 0.1])
  ├── Iteration 1: score 0.42
  ├── Iteration 2: score 0.51
  ├── Iteration 3: score 0.49
  ├── Iteration 4: score 0.53
  └── Iteration 5: score 0.55
       │
       └── Boundary check:
            ├── Hack ratio = 0.48 (strict/loose) → exploitation detected
            └── Drop loosest tolerance (0.1) → tighten evaluator

Epoch 1 (tolerances: [0.0, 0.001, 0.01, 0.025, 0.05])
  └── ... evaluator gets harder as agent improves
```
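
The boundary check in the diagram can be sketched in a few lines of plain Python. This is an illustrative sketch, not the library's implementation: the helper names `hack_ratio` and `tighten` are hypothetical, the ratio is assumed to be the mean strict score divided by the mean loose score, and the strict scores are invented so that the result matches the 0.48 shown above.

```python
def hack_ratio(strict_scores, loose_scores):
    """Mean strict score divided by mean loose score (assumed definition)."""
    mean_strict = sum(strict_scores) / len(strict_scores)
    mean_loose = sum(loose_scores) / len(loose_scores)
    # With no loose credit at all there is nothing to exploit, so report 1.0.
    return mean_strict / mean_loose if mean_loose > 0 else 1.0


def tighten(tolerances, ratio, threshold=0.6):
    """Drop the loosest tolerance when the ratio falls below the threshold."""
    if ratio < threshold and len(tolerances) > 1:
        return sorted(tolerances)[:-1]
    return list(tolerances)


tolerances = [0.0, 0.001, 0.01, 0.025, 0.05, 0.1]
loose = [0.42, 0.51, 0.49, 0.53, 0.55]   # iteration scores from the diagram
strict = [0.20, 0.25, 0.24, 0.25, 0.26]  # invented for illustration

ratio = hack_ratio(strict, loose)
new_tolerances = tighten(tolerances, ratio)
print(round(ratio, 2), new_tolerances)  # 0.48 [0.0, 0.001, 0.01, 0.025, 0.05]
```

Keeping the check a pure function of the epoch's scores means the evaluator only changes at the boundary, never mid-epoch.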

### How It Works

| Concept | What It Means | Why It Matters |
|---------|---------------|----------------|
| **Epoch** | A fixed window of iterations with a frozen evaluator | Within-epoch guarantees hold; the agent can't game mid-epoch |
| **Hack ratio** | `strict_score / loose_score`, a measure of exploitation | A ratio below the configured threshold (0.6 in the Quick Start) means the agent is gaming the evaluator |
| **Utility evolution** | Tolerances tighten when exploitation detected | The bar rises as the agent climbs |
| **Adversarial scoring** | Penalises answers that game loose criteria | Prevents pattern-matching the evaluator |
| **Selective erasure** | Invalidates scores from old evaluators | Stale hacked scores don't survive the boundary |
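
As a rough illustration of selective erasure, each stored score can carry the index of the epoch whose evaluator produced it, so that stale entries are filtered out once the evaluator changes. The `ScoredCandidate` record and `erase_stale` helper below are hypothetical names for this sketch and are not part of the `rqgm` API.

```python
from dataclasses import dataclass


@dataclass
class ScoredCandidate:
    name: str
    score: float
    epoch: int  # index of the epoch whose evaluator produced this score


def erase_stale(archive, current_epoch):
    """Keep only scores produced by the current epoch's evaluator."""
    return [c for c in archive if c.epoch == current_epoch]


archive = [
    ScoredCandidate("a", 0.55, epoch=0),  # scored under the old, looser evaluator
    ScoredCandidate("b", 0.40, epoch=1),  # scored under the current evaluator
]
valid = erase_stale(archive, current_epoch=1)
print([c.name for c in valid])  # ['b']
```

A variant would re-score the stale candidates under the new evaluator instead of discarding them; either way, a score earned under looser criteria never competes directly with one earned under stricter criteria.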

### Key Results from the Paper

| Domain | Improvement |
|--------|-------------|
| **Coding benchmarks** | 1.35x–1.72x **fewer tokens** than prior SOTA |
| **Scientific writing** | 1.78x–1.86x **higher acceptance rates** |
| **Proof grading** | **9% higher** ground-truth accuracy |
| **Paper reviewing** | Corrects 1.91x over-acceptance of AI-generated papers |

---

## ⚡ Quick Start

**Zero dependencies. Python stdlib only.**

```bash
pip install rqgm
```

```python
from rqgm import EpochManager, EpochConfig, TransitionReason

# Configure: 5 iterations per epoch, tighten if hack_ratio < 0.6
config = EpochConfig(epoch_size=5, exploitation_hack_ratio_threshold=0.6)
mgr = EpochManager(config)

for i in range(20):
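    # Your agent produces a result, then you score it.
    # run_agent, evaluate_agent, evaluate_strict and evaluate_loose are
    # placeholders for your own functions.
    agent_result = run_agent()
    best_score = evaluate_agent(agent_result)
    strict_score = evaluate_strict(agent_result)
    loose_score = evaluate_loose(agent_result)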

    mgr.record_iteration_result(i, best_score, strict_score, loose_score)

    if mgr.is_epoch_boundary(i):
        transition = mgr.evaluate_epoch_boundary(i)
        if transition.reason != TransitionReason.NO_TRANSITION:
            print(f"⚠️  Epoch {mgr.epoch_index}: {transition.reason.name}")
            print(f"   Tolerances: {mgr.current_tolerances} → {transition.new_tolerances}")
            # Apply the new evaluator criteria
            update_evaluator(transition.new_tolerances)
        mgr.advance_epoch(transition)
```

<p align="center">
  <img src="./assets/demo.gif" alt="rqgm demo — epoch boundaries, hack ratio detection, tolerance tightening" width="100%"/>
  <br>
  <em>Demo: agent improves → starts gaming → evaluator tightens → adversarial scoring penalises gaming</em>
</p>

---

## 🎯 Where to Use It

RQGM is a **general-purpose primitive** for any self-improvement loop. Typical applications:

### AI Agent Loops
```python
# Detect when your agent is gaming the evaluator
mgr = EpochManager(EpochConfig(epoch_size=10))
for i, walk in enumerate(agent_walks):
    # quality_score, strict_score and loose_score come from your own scoring of `walk`
    mgr.record_iteration_result(i, quality_score, strict_score, loose_score)
    if mgr.is_epoch_boundary(i):
        transition = mgr.evaluate_epoch_boundary(i)
        if transition.reason == TransitionReason.EXPLOITATION_DETECTED:
            tighten_evaluation_criteria()  # Agent is gaming you
        mgr.advance_epoch(transition)
```

### RLHF / Preference Learning
```python
# Prevent reward model overfitting
dist = ScoreDistribution(scores_at_strict=human_preferences, scores_at_loose=model_scores)
new_tols, log = evolve_tolerances(current_tolerances, dist, params, 0.6, 0.02, 0.02)
# new_tols drops the loosest criterion → reward model gets harder
```

### Benchmark Evaluation
```python
# Detect benchmark overfitting
score = adversarial_score(
    question="What is 2+2?",
    predicted="approximately 4",  # gaming answer
    ground_truth="4",
    current_tolerances=[0.0, 0.1],
    adversarial_pool=gaming_examples,
)
# Returns 0.7 instead of 1.0 — penalised for gaming
```
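
The strict and loose scores used throughout these examples have to come from somewhere. One possible reading, which is an assumption of this sketch rather than documented library behaviour, is that each tolerance is a relative-error bound on a numeric answer: the strict score uses the tightest tolerance and the loose score the loosest. The helpers `passes` and `strict_and_loose` are hypothetical.

```python
def passes(predicted, truth, tolerance):
    """True when the relative error is within the tolerance."""
    if truth == 0:
        # Fall back to absolute error when the truth is zero.
        return abs(predicted) <= tolerance
    return abs(predicted - truth) / abs(truth) <= tolerance


def strict_and_loose(predicted, truth, tolerances):
    """Score once against the tightest tolerance and once against the loosest."""
    strict = 1.0 if passes(predicted, truth, min(tolerances)) else 0.0
    loose = 1.0 if passes(predicted, truth, max(tolerances)) else 0.0
    return strict, loose


print(strict_and_loose(4.0, 4.0, [0.0, 0.1]))  # (1.0, 1.0)
print(strict_and_loose(4.3, 4.0, [0.0, 0.1]))  # (0.0, 1.0)
```

The second answer earns full loose credit but no strict credit, which is exactly the gap the hack ratio is meant to expose.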

### CI/CD Quality Gates
```python
# Evolve test pass thresholds based on historical exploitation.
# `transition` is the result of mgr.evaluate_epoch_boundary(i), as in the Quick Start.
if transition.reason == TransitionReason.STAGNATION:
    raise_quality_bar()  # Tests haven't caught a bug in N cycles
```

---

## 📦 Components

| Module | Class / Function | Purpose |
|--------|-----------------|---------|
| `epoch.py` | `EpochManager` | Tracks iterations, detects boundaries, triggers transitions |
| `epoch.py` | `EpochConfig` | Configuration: epoch size, thresholds, mutation params |
| `epoch.py` | `EpochTransition` | What the runner should do after a boundary |
| `epoch.py` | `AdversarialExample` | A gaming example (high loose score, low strict score) |
| `evolution.py` | `evolve_tolerances()` | Pure function: given scores, returns new tolerance schedule |
| `evolution.py` | `adversarial_score()` | Scorer that penalises answers resembling gaming patterns |
| `evolution.py` | `ScoreDistribution` | Stats over a set of per-answer scores |
| `evolution.py` | `UtilityEvolution` | Applies mutations to evaluator config at boundaries |
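
The table describes a gaming example as one with a high loose score and a low strict score. A minimal sketch of how such answers might be picked out of paired per-answer scores is shown below. The function name `gaming_indices` and the `min_gap` parameter are assumptions for illustration, so the actual `ScoreDistribution.get_gaming_indices()` may use a different criterion.

```python
def gaming_indices(strict_scores, loose_scores, min_gap=0.5):
    """Indices of answers whose loose score exceeds the strict score by min_gap or more."""
    if len(strict_scores) != len(loose_scores):
        raise ValueError("strict and loose score lists must have the same length")
    return [
        i
        for i, (s, l) in enumerate(zip(strict_scores, loose_scores))
        if l - s >= min_gap
    ]


strict = [1.0, 0.0, 0.9, 0.1]
loose = [1.0, 1.0, 1.0, 0.2]
print(gaming_indices(strict, loose))  # [1]
```

Only the second answer is flagged: it satisfies the loose criteria completely while failing the strict ones, whereas the last answer is simply wrong under both.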

---

## 📊 Tested

**35 unit tests, all passing.** Covers:
- Epoch boundary detection
- Tolerance tightening on exploitation
- Tolerance relaxation on genuine improvement
- Adversarial pool collection
- Checkpoint serialisation round-trip
- `evolve_tolerances()` pure function
- `adversarial_score()` penalty computation
- `ScoreDistribution.get_gaming_indices()`

```bash
python3 -m tests.test_rqgm
```

---

## 🔧 Installation

```bash
pip install rqgm
```

Or from source:

```bash
git clone https://github.com/observeco/rqgm-core
cd rqgm-core
pip install -e .
```

**Dependencies:** Zero. Python stdlib only. No PyTorch, no transformers, no numpy.

---

## 📚 Reference

- [The Red Queen Gödel Machine: Co-Evolving Agents and Their Evaluators](https://arxiv.org/abs/2606.26294) — Iacob et al., University of Cambridge, June 2026
- [EvoSkill-RQGM](https://github.com/observeco/EvoSkill-RQGM) — Full integration with EvoSkill's self-improvement loop (first open RQGM implementation)

## 📖 Citation

If you use rqgm in your research, please cite the original paper:

```bibtex
@article{iacob2026redqueen,
      title={The Red Queen G{\"o}del Machine: Co-Evolving Agents and Their Evaluators},
      author={Iacob, Alex and Jovanovi{\'c}, Andrej and Shen, William F. and
              Burkhardt, Daniel and Kurmanji, Meghdad and Tastan, Nurbek and
              Sani, Lorenzo and Venanzi, Niccol{\`o} Alberto Elia and
              Odonnat, Ambroise and Cao, Zeyu and Marino, Bill and
              Qiu, Xinchi and Lane, Nicholas D.},
      year={2026},
      eprint={2606.26294},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```

---

## 📄 License

Apache 2.0 — free for commercial and research use.

---

<p align="center">
  <b>Built by <a href="https://github.com/observeco">ObserveCo</a></b>
  <br>
  <i>Self-healing observability for AI agents.</i>
</p>
