Metadata-Version: 2.4
Name: reason-critic
Version: 0.1.0
Summary: A self-verification model that critiques agent output — it doesn't generate, it flags errors.
Author: FableForge
License: MIT
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: rich>=13.0
Requires-Dist: click>=8.1
Requires-Dist: pydantic>=2.0
Requires-Dist: httpx>=0.25
Requires-Dist: fastapi>=0.104.0
Requires-Dist: uvicorn>=0.24.0
Provides-Extra: train
Requires-Dist: torch>=2.1.0; extra == "train"
Requires-Dist: transformers>=4.36.0; extra == "train"
Requires-Dist: peft>=0.7.0; extra == "train"
Requires-Dist: datasets>=2.14.0; extra == "train"
Requires-Dist: accelerate>=0.25.0; extra == "train"
Requires-Dist: unsloth>=2024.1; extra == "train"
Provides-Extra: gpu
Requires-Dist: bitsandbytes>=0.43.0; extra == "gpu"
Provides-Extra: dpo
Requires-Dist: trl>=0.7.0; extra == "dpo"
Provides-Extra: all
Requires-Dist: reason-critic[dpo,gpu,train]; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Dynamic: license-file

# ReasonCritic

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE) [![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/) [![Tests](https://img.shields.io/badge/tests-0-yellow.svg)](tests/)


> A self-verification model that critiques agent output. It doesn't generate — it flags errors.

## Overview

ReasonCritic is a verification model trained to detect bugs, security issues, logic errors, and style problems in code generated by AI agents. Unlike generative models, it focuses exclusively on **critique**: given code, it produces a structured verdict (PASS/FAIL), confidence score, issue list, and actionable suggestions.

### Data Sources

- **v-Fable verification phase**: 62.2% of traces contain verification steps, which are extracted as (code, pass/fail) pairs
- **Glint error/recovery pairs**: 3,725 examples of agent mistakes and their corrections

### Architecture

- **Base model**: Qwen3-7B
- **Training**: Three-stage pipeline (contrastive → LoRA → DPO)
- **Output**: Structured verification result with verdict, confidence, issues, and suggestions

## Installation

```bash
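pip install -e .

# With training dependencies:
pip install -e ".[train]"

# With DPO training support:
pip install -e ".[dpo]"

# All training extras (train, gpu, dpo):
pip install -e ".[all]"

# With development tools:
pip install -e ".[dev]"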
```

## Quick Start

### CLI

```bash
# Verify a code snippet
critic verify --code "def add(a, b): return a + b"

# Verify a file
critic verify --file app.py

# Verify an agent trace
critic verify --trace trace.jsonl

# Train the critic model
critic train --data pairs.jsonl --model Qwen/Qwen3-7B

# Start the API server
critic serve --port 8000
```

### Python API

```python
from reason_critic import ReasonCritic, VerificationResult

# Initialize critic
critic = ReasonCritic(backend="local", model_name="reason-critic-7b")

# Verify code
result = critic.verify(
    code="def factorial(n):\n    if n <= 1:\n        return 1\n    return n * factorial(n - 1)",
    language="python",
)

print(f"Verdict: {result.pass_fail}")        # PASS or FAIL
print(f"Confidence: {result.confidence}")    # 0.0 to 1.0
print(f"Issues: {result.issues}")            # List of issues
print(f"Suggestions: {result.suggestions}")  # List of suggestions
```

### Verify an Agent Step

```python
step = {
    "index": 0,
    "type": "code_generation",
    "code": "for i in range(11):\n    print(data[i])",
    "name": "process_data",
}
step_result = critic.verify_step(step, context="Processing user data")
print(step_result.result.pass_fail)  # FAIL expected if data has 10 items (off-by-one)
```

### Verify a Full Agent Run

```python
run = {
    "id": "run-abc123",
    "steps": [
        {"index": 0, "type": "generation", "code": "x = 1", "name": "init"},
        {"index": 1, "type": "generation", "code": "y = x + 1", "name": "compute"},
    ]
}
run_result = critic.verify_run(run)
print(f"Overall: {run_result.overall_verdict}")  # PASS or FAIL
print(f"Steps passed: {run_result.num_passed}/{len(run_result.step_verifications)}")
```

### Generate-then-Verify Pipeline

```python
from reason_critic.pipeline import GenerateVerifyPipeline, GeneratorWrapper
from reason_critic import ReasonCritic

pipeline = GenerateVerifyPipeline(
    generator=GeneratorWrapper(model_name="Qwen/Qwen3-7B"),
    critic=ReasonCritic(backend="local", model_name="reason-critic-7b"),
    max_attempts=3,
)

result = pipeline.generate_and_verify(
    task="Write a function that checks if a string is a palindrome",
    language="python",
)

print(f"Passed: {result.passed}")
print(f"Attempts: {result.total_attempts}")
print(f"Final code:\n{result.final_code}")
```

If verification fails, the pipeline feeds issues back to the generator for re-generation, up to `max_attempts` cycles.
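
The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual implementation: `Verdict`, `generate_and_verify`, `toy_generate`, and `toy_verify` are hypothetical stand-ins for the generator and the critic.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Verdict:
    """Hypothetical stand-in for a critic result."""
    passed: bool
    issues: list[str] = field(default_factory=list)


def generate_and_verify(
    task: str,
    generate: Callable[[str, list[str]], str],
    verify: Callable[[str], Verdict],
    max_attempts: int = 3,
) -> tuple[bool, int, str]:
    """Regenerate until the critic passes the code or attempts run out."""
    feedback: list[str] = []
    code = ""
    for attempt in range(1, max_attempts + 1):
        code = generate(task, feedback)  # issues from the last attempt are fed back in
        verdict = verify(code)
        if verdict.passed:
            return True, attempt, code
        feedback = verdict.issues
    return False, max_attempts, code


# Toy generator: wrong on the first try, fixed once it receives feedback.
def toy_generate(task: str, feedback: list[str]) -> str:
    return "def add(a, b): return a + b" if feedback else "def add(a, b): return a - b"


# Toy critic: flags the subtraction bug and passes everything else.
def toy_verify(code: str) -> Verdict:
    if "a - b" in code:
        return Verdict(False, ["Subtraction instead of addition"])
    return Verdict(True)


passed, attempts, final_code = generate_and_verify("Write add(a, b)", toy_generate, toy_verify)
print(passed, attempts)  # True 2
```

Passing the generator and critic in as plain callables keeps the loop easy to test with stubs before a real model is attached.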

## API Server

```bash
# Start the server
critic serve --port 8000
```

### Endpoints

#### `POST /verify` — Verify code

```json
{
    "code": "def add(a, b): return a - b",
    "context": "Addition function",
    "language": "python"
}
```

Response:
```json
{
    "pass_fail": "FAIL",
    "confidence": 0.92,
    "issues": ["Subtraction instead of addition"],
    "suggestions": ["Use + instead of -"],
    "explanation": "Function uses subtraction where addition is expected",
    "language": "python"
}
```
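
A client call to this endpoint might look like the sketch below. It uses only the standard library. The base URL `http://localhost:8000` assumes the server was started with `critic serve --port 8000`, and the helper names `build_verify_request` and `post_verify` are illustrative, not part of the package.

```python
import json
import urllib.request


def build_verify_request(
    base_url: str,
    code: str,
    context: str = "",
    language: str = "python",
) -> urllib.request.Request:
    """Build a POST /verify request carrying the JSON body shown above."""
    payload = {"code": code, "context": context, "language": language}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/verify",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_verify(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_verify_request(
    "http://localhost:8000",
    code="def add(a, b): return a - b",
    context="Addition function",
)
# With the server running:
# result = post_verify(request)
# print(result["pass_fail"], result["issues"])
```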

#### `POST /verify/step` — Verify a single step

```json
{
    "step": {
        "index": 0,
        "type": "code_generation",
        "code": "for i in range(11): print(data[i])",
        "name": "loop_data"
    },
    "context": "Processing array"
}
```

#### `POST /verify/run` — Verify a full agent run

```json
{
    "run": {
        "id": "run-123",
        "steps": [
            {"index": 0, "type": "generation", "code": "x = 1"},
            {"index": 1, "type": "generation", "code": "y = x / 0"}
        ]
    },
    "context": "Data processing pipeline"
}
```

#### `POST /pipeline` — Generate-then-verify

```json
{
    "task": "Write a sorting function",
    "max_attempts": 3,
    "language": "python"
}
```

#### `GET /health` — Health check

```json
{
    "status": "healthy",
    "model": "reason-critic-7b",
    "backend": "local"
}
```

## Training Pipeline

### Three-Stage Training

ReasonCritic uses a three-stage training pipeline:

1. **Stage 1: Contrastive Learning** — Train on correct/incorrect code pairs to learn the difference
2. **Stage 2: LoRA Fine-Tuning** — Efficient fine-tuning with Low-Rank Adaptation
3. **Stage 3: DPO Alignment** — Direct Preference Optimization for better verification preferences
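
For readers unfamiliar with Stage 3, the standard DPO objective for a single preference pair fits in a few lines of plain Python. This is the textbook loss, shown for illustration only. The function name and the `beta` default are assumptions, and the project's trainer (the `dpo` extra pulls in `trl`) may configure it differently.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, not taken from the project
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair of critiques.

    Inputs are summed token log-probabilities under the policy being trained
    and under a frozen reference model.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)), computed as a numerically stable softplus(-margin)
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss equals log(2).
print(dpo_loss(-12.0, -15.0, -12.0, -15.0))
# Policy favors the chosen critique more than the reference does: the loss drops.
print(dpo_loss(-10.0, -18.0, -12.0, -15.0))
```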

### Data Preparation

```python
from reason_critic.data_prep import (
    extract_verification_pairs,
    generate_incorrect_versions,
    create_contrastive_pairs,
    load_glint_error_recovery,
)

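# Extract (code, pass/fail) examples from agent traces
# (`traces` is trace data you have already loaded)
examples = extract_verification_pairs(traces)

# Generate buggy versions of known-good code for contrastive learning
buggy = generate_incorrect_versions(correct_code, num_versions=3)

# Pair the correct code with one incorrect version
# (`incorrect_code` is a buggy counterpart of `correct_code`)
pair = create_contrastive_pairs(correct_code, incorrect_code)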

# Load Glint error/recovery data
glint_examples = load_glint_error_recovery("glint_data.jsonl")
```

### Bug Templates

`generate_incorrect_versions` applies systematic bug-introduction strategies:

| Bug Type | Description |
|----------|-------------|
| `off_by_one` | Off-by-one errors in loop bounds |
| `wrong_operator` | Swapped comparison operators |
| `missing_none_check` | Missing None check before attribute access |
| `forgotten_await` | Missing await on async call |
| `mutable_default` | Mutable default arguments |
| `shadowed_variable` | Variable shadowing in inner scope |
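
To make the table concrete, here is a rough sketch of what one such template could look like. It is not the package's implementation: `inject_off_by_one` is a hypothetical helper, and the regex only handles the simple `range(<name or integer>)` form.

```python
import re

# Matches single-argument range() calls such as range(n) or range(10).
_RANGE_CALL = re.compile(r"\brange\((\w+)\)")


def inject_off_by_one(code: str) -> str:
    """Return a copy of `code` whose first range() bound is one too large."""
    return _RANGE_CALL.sub(lambda m: f"range({m.group(1)} + 1)", code, count=1)


correct = "for i in range(n):\n    total += data[i]"
buggy = inject_off_by_one(correct)
print(buggy)
# for i in range(n + 1):
#     total += data[i]
```

A text-level rewrite like this is easy to write but brittle. An AST-based transform would be the sturdier choice for anything beyond simple patterns.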

### Training

```python
from reason_critic.trainer import TrainingConfig, run_three_stage_pipeline

config = TrainingConfig(
    model_name="Qwen/Qwen3-7B",
    output_dir="./reason-critic-output",
    contrastive_epochs=3,
    lora_epochs=2,
    dpo_epochs=1,
)

# `examples` and `pairs` come from the data preparation step
results = run_three_stage_pipeline(examples, pairs, output_dir="./output", config=config)
```

Or via CLI:
```bash
critic train --data pairs.jsonl --model Qwen/Qwen3-7B --stage all
critic train --data pairs.jsonl --stage contrastive
critic train --data pairs.jsonl --stage lora
critic train --data pairs.jsonl --stage dpo
```

## Benchmarks

The project includes 130 verification benchmark tasks across 4 categories:

| Category | Count | Description |
|----------|-------|-------------|
| Code Correctness | 50 | Off-by-one, wrong operators, missing checks, mutations, async bugs |
| Security Issues | 30 | SQL injection, XSS, CSRF, command injection, crypto weaknesses |
| Logic Errors | 30 | Condition order, inverted logic, De Morgan's law, scope issues |
| Style Issues | 20 | Missing docs, magic numbers, god objects, naming, logging |

```python
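import json
from pathlib import Path

import reason_critic.benchmarks as benchmarks
from reason_critic.benchmarks import BENCHMARK_CATEGORIES

# Resolve task files relative to the installed benchmarks package,
# not the script that runs this snippet
benchmarks_dir = Path(benchmarks.__file__).parent

for category in BENCHMARK_CATEGORIES:
    path = benchmarks_dir / category / "tasks.json"
    tasks = json.loads(path.read_text())
    print(f"{category}: {len(tasks)} tasks")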
```
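
Once tasks are loaded, a critic can be scored against them. The sketch below assumes each task carries an expected verdict of `"PASS"` or `"FAIL"`. That assumption and the `score_verdicts` helper are made up for illustration, since the task schema is not documented here.

```python
def score_verdicts(expected: list[str], predicted: list[str]) -> dict[str, float]:
    """Accuracy plus precision and recall for the FAIL class (FAIL = issue flagged)."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    pairs = list(zip(expected, predicted))
    true_fail = sum(e == "FAIL" and p == "FAIL" for e, p in pairs)
    flagged = sum(p == "FAIL" for _, p in pairs)
    actual_fail = sum(e == "FAIL" for e, _ in pairs)
    correct = sum(e == p for e, p in pairs)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": true_fail / flagged if flagged else 0.0,
        "recall": true_fail / actual_fail if actual_fail else 0.0,
    }


expected = ["FAIL", "FAIL", "PASS", "PASS"]
predicted = ["FAIL", "PASS", "PASS", "FAIL"]
print(score_verdicts(expected, predicted))
# {'accuracy': 0.5, 'precision': 0.5, 'recall': 0.5}
```

Reporting precision and recall alongside accuracy matters for a critic, because a model that always answers FAIL reaches perfect recall while being useless in practice.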

## Architecture

```
ReasonCritic
├── critic.py               # Core verification model + backends (local, API, hybrid)
├── data_prep.py            # Training data preparation from traces
├── trainer.py              # Three-stage training pipeline
├── pipeline.py             # Generate-then-verify pipeline
├── server.py               # FastAPI server
├── cli.py                  # CLI interface
└── benchmarks/             # Verification benchmark tasks
    ├── code_correctness/   # 50 tasks
    ├── security_issues/    # 30 tasks
    ├── logic_errors/       # 30 tasks
    └── style_issues/       # 20 tasks
```

### Backends

- **Local**: Load model via transformers/Unsloth for local inference
- **API**: Call a remote verification service
- **Hybrid**: Try local first, fall back to API for low-confidence results
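
The hybrid strategy can be sketched as follows. This is an illustration under stated assumptions rather than the actual backend code: the `Result` type, the `hybrid_verify` function, and the 0.7 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    """Hypothetical, trimmed-down verification result."""
    pass_fail: str
    confidence: float
    source: str


def hybrid_verify(
    code: str,
    local: Callable[[str], Result],
    remote: Callable[[str], Result],
    min_confidence: float = 0.7,  # assumed threshold, not a documented default
) -> Result:
    """Use the local result unless its confidence falls below the threshold."""
    local_result = local(code)
    if local_result.confidence >= min_confidence:
        return local_result
    return remote(code)


# Stub backends so the sketch runs without a model or a server.
def unsure_local(code: str) -> Result:
    return Result("PASS", 0.55, "local")


def confident_remote(code: str) -> Result:
    return Result("FAIL", 0.93, "api")


print(hybrid_verify("y = x / 0", unsure_local, confident_remote).source)  # api
```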

### VerificationResult Schema

```python
from dataclasses import dataclass


@dataclass
class VerificationResult:
    pass_fail: str         # "PASS" or "FAIL"
    confidence: float      # 0.0 to 1.0
    issues: list[str]      # List of detected issues
    suggestions: list[str] # List of suggested fixes
    explanation: str       # Brief explanation
    language: str          # Programming language
    raw_output: str        # Raw model output
    model_name: str        # Model that produced this result
```
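
As an illustration of how a `/verify` response maps onto this schema, the sketch below rebuilds the dataclass locally and fills it from a JSON payload. The `from_response` helper and the empty-string defaults for `raw_output` and `model_name` are assumptions. The package may construct results differently.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class VerificationResult:
    """Local copy of the schema above, so the sketch is self-contained."""
    pass_fail: str
    confidence: float
    issues: list[str]
    suggestions: list[str]
    explanation: str
    language: str
    raw_output: str = ""   # assumed default; absent from the example response
    model_name: str = ""   # assumed default; absent from the example response


def from_response(body: str) -> VerificationResult:
    """Build a result from a JSON response body, ignoring unknown keys."""
    data = json.loads(body)
    known = {f.name for f in fields(VerificationResult)}
    result = VerificationResult(**{k: v for k, v in data.items() if k in known})
    if result.pass_fail not in ("PASS", "FAIL"):
        raise ValueError(f"unexpected verdict: {result.pass_fail!r}")
    if not 0.0 <= result.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {result.confidence}")
    return result


body = json.dumps({
    "pass_fail": "FAIL",
    "confidence": 0.92,
    "issues": ["Subtraction instead of addition"],
    "suggestions": ["Use + instead of -"],
    "explanation": "Function uses subtraction where addition is expected",
    "language": "python",
})
result = from_response(body)
print(result.pass_fail, result.confidence, result.issues)
```

Validating the verdict and the confidence range at the boundary means a malformed response is caught at parse time instead of reaching downstream code.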

## Running Tests

```bash
pip install -e ".[dev]"
pytest tests/ -v
```

## License

MIT

## Ecosystem

Part of the [FableForge](../) ecosystem — 21 open-source projects built from 210K real agent traces:

| Project | Description |
| --- | --- |
| **[Anvil](../anvil)** | Self-verified coding agent |
| **[VerifyLoop](../verifyloop)** | Plan→Execute→Verify→Recover framework |
| **[ErrorRecovery](../error-recovery)** | Self-healing middleware (3,725 error patterns) |
| **[FableForge-14B](../fableforge-14b)** | The fine-tuned 14B model (4-stage training) |
| **[ShellWhisperer](../shell-whisperer)** | 1.5B edge agent (phone/RPi, 50ms) |
| **[ReasonCritic](../reason-critic)** | Verification model (130 benchmark tasks) |
| **[TraceCompiler](../trace-compiler)** | Compile traces → LoRA skills |
| **[AgentRuntime](../agent-runtime)** | Persistent agent daemon (systemd for AI) |
| **[AgentSwarm](../agent-swarm)** | Multi-agent from real trace transitions |
| **[AgentTelemetry](../agent-telemetry)** | Datadog for agents (token tracking, costs) |
| **[BenchAgent](../bench-agent)** | HumanEval for tool-use (107 tasks) |
| **[AgentDev](../agent-dev)** | VSCode extension with verification |
| **[TraceViz](../trace-viz)** | Trace replay visualizer (Next.js) |
| **[AgentSkills](../agent-skills)** | npm for agent behaviors |
| **[AgentCurriculum](../agent-curriculum)** | 5-stage progressive training |
| **[AgentFuzzer](../agent-fuzzer)** | Adversarial testing for agents |
| **[AgentConstitution](../agent-constitution)** | Safety guardrails from traces |
| **[CostOptimizer](../cost-optimizer)** | Token cost reduction (50-80%) |
| **[AgentProfiler](../agent-profiler)** | Behavioral fingerprinting |
| **[TrajectoryDistiller](../trajectory-distiller)** | Trace→training data pipeline |
| **[Fable5-Dataset](../fable5-dataset)** | HuggingFace dataset release |
