Metadata-Version: 2.4
Name: meta-reasoning
Version: 0.0.5
Summary: Reasoning is not a property of the model — it is an emergent dynamic of external control.
Author: Tommaso Bredariol
License: AGPL-3.0
Project-URL: Homepage, https://github.com/tommasobredariol/meta-reasoning
Project-URL: Repository, https://github.com/tommasobredariol/meta-reasoning
Keywords: llm,reasoning,cognitive-control,meta-reasoning,ai
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU Affero General Public License v3
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: openai>=1.0
Requires-Dist: pydantic>=2.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Dynamic: license-file

<p align="center">
  <img src="https://raw.githubusercontent.com/tictacguy/Meta-Reasoning/main/docs/static/logo_gh.png" alt="Meta-Reasoning" width="500">
</p>

<p align="center">
  <strong>Cognitive Heteronomy for LLMs</strong>
</p>

<p align="center">
  <a href="https://pypi.org/project/meta-reasoning/"><img src="https://img.shields.io/pypi/v/meta-reasoning" alt="PyPI"></a>
  <a href="https://pypi.org/project/meta-reasoning/"><img src="https://img.shields.io/pypi/pyversions/meta-reasoning" alt="Python"></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/license-AGPL--3.0-blue.svg" alt="License"></a>
</p>

---

> **Reasoning is not a property of the model — it is an emergent dynamic of external control.**

An SDK that rejects the illusion of autonomous LLM reasoning. Instead of treating language models as cognitive agents, Meta-Reasoning introduces **cognitive heteronomy**: reasoning is governed, observed, and mutated from the outside.

The model doesn't think. It executes. The thinking happens in the architecture around it.

🌐 Meta-Reasoning Website: https://tictacguy.github.io/Meta-Reasoning/

### 🔌 Native Integrations

| 🟠 **Claude Code** | 🦞 **OpenClaw** | 🤖 **Codex** |
|:---:|:---:|:---:|
| Native tool definitions with strict JSON schemas. Claude plans multi-step cognitive executions autonomously. | Declarative plugin with capability discovery, cost/risk metadata, and autonomous chaining. | Typed Pydantic API. Codex generates correct calls by reading the type contracts. |
| [`integrations/claude/`](integrations/claude/) | [`integrations/openclaw/`](integrations/openclaw/) | [`integrations/codex/`](integrations/codex/) |

---

## Core Thesis

LLMs are generative substrates, not minds. What is commonly called "reasoning" is pattern replay — not deliberation. This SDK externalizes all meta-cognitive functions into a **Cognitive Controller** that:

- **Observes** the *form* of reasoning (not its content)
- **Measures** trajectory, redundancy, stall, and premature convergence
- **Mutates** the reasoning process through formal constraint operators
- **Records** cognitive trajectories in an **Epistemic Ledger**

No self-reflection. No "think step by step". No autonomous agents.

## Architecture

<p align="center">
  <img src="https://raw.githubusercontent.com/tictacguy/Meta-Reasoning/main/docs/static/architecture.png" alt="Architecture" width="100%">
</p>

### Level 1 — Generative Substrate (LLM)
Produces text and structures. Decides nothing. Stateless by design.

### Level 2 — Cognitive Controller
The heart. Semantically blind — it doesn't evaluate truth, it evaluates *cognitive form*:
- Entropy of reasoning moves
- Strategy repetition index
- Depth without novelty
- Constraint violation rate
- Premature closure score
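
Every one of these is computable from the trace alone. For instance, move entropy — a self-contained illustration using only the standard library, not the actual implementation in `metrics.py`:

```python
# Illustrative only -- the real metric lives in meta_reasoning/metrics.py.
# Low entropy over the move distribution signals rigid, repetitive reasoning.
import math
from collections import Counter

def move_entropy(moves: list[str]) -> float:
    """Shannon entropy of the cognitive-move distribution."""
    counts = Counter(moves)
    n = len(moves)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

print(round(move_entropy(["deduction"] * 4), 2))                      # 0.0  -> total stall
print(round(move_entropy(["deduction", "analogy", "abduction"]), 2))  # 1.58 -> diverse
```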

### Level 3 — Epistemic Ledger
Not RAG. Not content memory. A structural trace of:
- Cognitive transformations attempted
- Strategies that produced stall
- Failure maps that prevent regression
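
The three levels compose into a single governed loop. A schematic sketch with hypothetical names (`observe`, `mutate`, `record`) — the actual loop lives behind `CognitiveEngine.run` in `engine.py`:

```python
# Schematic only: the names below are illustrative, not the engine's internals.
def governed_loop(substrate, controller, ledger, task: str, max_cycles: int = 5):
    messages = [{"role": "user", "content": task}]
    for cycle in range(max_cycles):
        output = substrate.generate(messages)                     # Level 1: produce, decide nothing
        metrics = controller.observe(output["reasoning_trace"])   # Level 2: measure form, not truth
        ledger.record(cycle, metrics)                             # Level 3: structural trace
        for mutation in controller.mutate(metrics):               # constraint operators (BAN, REQUIRE, ...)
            messages.append({"role": "system", "content": str(mutation)})
    return ledger
```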

## Key Concepts

### Structured Output Protocol
Every LLM generation must include a formal reasoning trace:
```json
{
  "content": "...",
  "reasoning_trace": {
    "moves": ["assumption", "deduction", "analogy"],
    "depth": 4,
    "confidence_markers": 2,
    "abstraction_level": "medium"
  }
}
```
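
Since the SDK depends on Pydantic v2, the protocol can be read as a typed contract. A hedged sketch — the actual models live in `meta_reasoning/types.py` and may differ:

```python
# Hypothetical sketch of the contract (the real models are in types.py).
from pydantic import BaseModel

class ReasoningTrace(BaseModel):
    moves: list[str]          # drawn from the cognitive move taxonomy below
    depth: int                # number of reasoning steps taken
    confidence_markers: int   # count of confidence-signalling phrases
    abstraction_level: str    # e.g. "low" | "medium" | "high"

class StructuredOutput(BaseModel):
    content: str
    reasoning_trace: ReasoningTrace

raw = """{"content": "...", "reasoning_trace": {"moves": ["assumption",
"deduction", "analogy"], "depth": 4, "confidence_markers": 2,
"abstraction_level": "medium"}}"""
out = StructuredOutput.model_validate_json(raw)
print(out.reasoning_trace.moves)  # -> ['assumption', 'deduction', 'analogy']
```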

### Cognitive Move Taxonomy
A finite, observable alphabet:
`assumption` · `deduction` · `induction` · `abduction` · `analogy` · `contradiction` · `enumeration` · `compression` · `narrative_simulation`
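
In code, the alphabet maps naturally to a string-valued enum — a sketch inferred from how `CognitiveMove` and `m.value` are used in the examples below; the real definition is in `types.py`:

```python
# Sketch of the move alphabet (member names inferred from the taxonomy above).
from enum import Enum

class CognitiveMove(str, Enum):
    ASSUMPTION = "assumption"
    DEDUCTION = "deduction"
    INDUCTION = "induction"
    ABDUCTION = "abduction"
    ANALOGY = "analogy"
    CONTRADICTION = "contradiction"
    ENUMERATION = "enumeration"
    COMPRESSION = "compression"
    NARRATIVE_SIMULATION = "narrative_simulation"
```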

### Mutation Operators
The controller doesn't say "reason better". It says:
- **BAN**: "deduction is forbidden"
- **REQUIRE**: "you must use analogy"
- **LIMIT_DEPTH**: "max 2 reasoning steps"
- **FORCE_COMPRESSION**: "reduce to 2 concepts"
- **INVERT_CAUSALITY**: "reverse the causal direction"
- **REQUIRE_CONTRADICTION**: "find an internal contradiction"

Improvisation emerges from constraint, not freedom — like jazz.
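
Each operator is a data object, not prompt phrasing. A minimal sketch, assuming `Mutation`, `MutationType`, and `CognitiveMove` are exported from the top-level package as the policy and plugin examples below suggest:

```python
# Mutations are typed constraints the controller emits -- never free-form advice.
from meta_reasoning import CognitiveMove, Mutation, MutationType

ban_deduction = Mutation(type=MutationType.BAN, target=CognitiveMove.DEDUCTION)       # "deduction is forbidden"
require_analogy = Mutation(type=MutationType.REQUIRE, target=CognitiveMove.ANALOGY)   # "you must use analogy"
```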

### Failure as First-Class Output
The system does not optimize for correct answers. Failure is informative:
- Every collapsed trajectory is recorded
- Every stall enriches the ledger
- The system learns *which cognitive spaces to avoid*

## Features

### 1. Reasoning Debugger
Put a breakpoint in thought. Step through the cognitive loop cycle by cycle, inspect mutations, understand *why* a strategy was banned, and rewind to any previous cognitive state.

```python
from meta_reasoning import ReasoningDebugger

dbg = ReasoningDebugger(backend=my_backend, max_cycles=5)
dbg.add_breakpoint(lambda cycle, metrics, muts: metrics.entropy < 1.0)
result = dbg.run("Your task")

snap = dbg.rewind_to(2)
print(snap.explain())
```

### 2. Reasoning Policies as Code
Write cognitive governance rules in Python — not prompts. Versionable, testable, reviewable.

```python
from meta_reasoning import ReasoningPolicy, PolicyRule, CognitiveEngine, strict_diversity_policy, Mutation, MutationType

# Use a built-in policy
engine = CognitiveEngine(backend=my_backend, policy=strict_diversity_policy())

# Or define your own
policy = ReasoningPolicy("my_policy")
policy.add_rule(PolicyRule(
    name="ban_dominant",
    condition=lambda m, c: m.dominant_move is not None,
    mutations=lambda m, c: [Mutation(type=MutationType.BAN, target=m.dominant_move)],
))
```

### 3. Model-Agnostic Benchmarks
Compare models not by accuracy, but by cognitive behavior: rigidity, diversity, constraint response, improvisation capacity.

```python
from meta_reasoning import benchmark_models

result = benchmark_models(
    backends={"gpt-4o": gpt_backend, "claude": claude_backend},
    task="Your task",
)
print(result.comparison_table())
```

### 4. Cognitive Fingerprinting
Every model has a cognitive signature. Profile which strategies it prefers, which it avoids, how it reacts to pressure, and where it collapses.

```python
from meta_reasoning import fingerprint_from_result

fp = fingerprint_from_result("gpt-4o", engine_result)
print(fp.summary())
# → Preferred moves: deduction, assumption
# → Stall rate: 15%
# → Collapse at cycles: [4, 7]
```

### 5. Failure Atlas
Instead of hiding failures: map them, visualize them, query them.

```python
atlas = engine.ledger.failure_atlas()
atlas.by_reason()              # Group by failure cause
atlas.stall_inducing_mutations()  # Which mutations caused stall?
atlas.query(max_entropy=0.5)   # Find low-entropy failures
```

### 6. Deterministic Replay
Same task + same controller = reproducible, replayable, diffable reasoning. Enterprise-ready and CI-testable.

```python
from meta_reasoning import record_session, ReplaySession

session = record_session(result, "task", max_cycles=5)
session.save("session_v1.json")

# Later: compare two sessions
s1 = ReplaySession.load("session_v1.json")
s2 = ReplaySession.load("session_v2.json")
print(s1.diff(s2))
```

### 7. Anti-Hallucination via Governance
Instead of filtering output after the fact, detect cognitive patterns that correlate with hallucination and break them *before* they produce output.

```python
from meta_reasoning import assess_hallucination_risk

risk = assess_hallucination_risk(metrics, confidence_markers=4, depth=1)
# → score=0.60, triggers=["high_confidence_low_depth", "single_strategy"]
# → preventive_mutations=[BAN deduction, REQUIRE contradiction]
```

### 8. Mutation Plugins
An open ecosystem where the community can define new mutations, constraints, and metrics.

```python
from meta_reasoning import PluginRegistry, MutationPlugin, CognitiveEngine, Mutation, MutationType, CognitiveMove

registry = PluginRegistry()
registry.register_mutation(MutationPlugin(
    name="force_narrative",
    description="Always require narrative_simulation",
    generate=lambda m, c: [Mutation(type=MutationType.REQUIRE, target=CognitiveMove.NARRATIVE_SIMULATION)],
))
engine = CognitiveEngine(backend=my_backend, plugin_registry=registry)
```

### 9. CI/CD for Reasoning
Automated cognitive regression testing. Detect when a model becomes more rigid, less diverse, or more prone to stall — in your CI pipeline.

```python
from meta_reasoning import CognitiveCI, assert_min_entropy, assert_no_total_stall, assert_min_move_diversity

ci = CognitiveCI(backend=my_backend, max_cycles=5)
report = ci.run("Your task", [
    assert_min_entropy(1.0),
    assert_no_total_stall(),
    assert_min_move_diversity(3),
])
assert report.passed  # Fails CI if cognitive behavior regresses
```

### 10. Reasoning Runtime
A first-class runtime that treats reasoning as a computational process — not text. Explicit cognitive states (INITIAL → ANALYSIS → HYPOTHESIS → VALIDATION → REFLECTION → FINAL), typed transitions driven by metrics, budget management, and deterministic forking. The model does not choose to reflect. The runtime forces it.

```python
from meta_reasoning import ReasoningRuntime, ReasoningBudget

rt = ReasoningRuntime(
    backend=my_backend,
    budget=ReasoningBudget(max_cycles=8, max_branches=4),
)
result = rt.run("Your task")

print(result.summary())
# Runtime: Your task
#   Final state:    final
#   States visited: initial → analysis → hypothesis → validation → reflection → final
#   Cycles:         6
#   Budget: 6/8 cycles, 0/4 branches
```

## Installation

```bash
pip install meta-reasoning
```

Or from source with dev dependencies:
```bash
pip install -e ".[dev]"
```

## Quick Start

### Without an API key (mock backend)
```bash
python -m examples.mock_example
```

### With OpenAI
```bash
export OPENAI_API_KEY=<your-key>
python -m examples.openai_example
```

### Programmatic usage
```python
from meta_reasoning import CognitiveEngine

class MyBackend:
    def generate(self, messages):
        # Call your LLM here, return {"content": "..."}
        ...

engine = CognitiveEngine(backend=MyBackend(), max_cycles=5)
result = engine.run("Your task here")

for cycle in result.cycles:
    print(f"Cycle {cycle.cycle}: {cycle.outcome}")
    print(f"  Moves: {[m.value for m in cycle.output.reasoning_trace.moves]}")
    print(f"  Entropy: {cycle.metrics.entropy:.2f}")

# Save the epistemic ledger for analysis
engine.ledger.save("session.json")
```

## Running Tests

```bash
pip install -e ".[dev]"
pytest tests/ -v
```

## Project Structure

```
meta_reasoning/
├── __init__.py        # Public API
├── types.py           # Cognitive moves, traces, mutations, metrics
├── substrate.py       # Level 1 — LLM interface
├── controller.py      # Level 2 — Cognitive Controller
├── ledger.py          # Level 3 — Epistemic Ledger + Failure Atlas
├── metrics.py         # Semantically-blind cognitive metrics
├── mutations.py       # Mutation operator generation
├── engine.py          # The governed cognitive loop
├── runtime.py         # Reasoning Runtime (state machine + budget + forking)
├── debugger.py        # Reasoning Debugger
├── policies.py        # Reasoning Policies as code
├── benchmark.py       # Benchmarking & Cognitive Fingerprinting
├── replay.py          # Deterministic Replay
├── hallucination.py   # Anti-Hallucination Governance
├── plugins.py         # Mutation Plugin system
└── ci.py              # CI/CD for Reasoning
```

## Related Work & Philosophy

For a detailed comparison with Chain-of-Thought, Tree-of-Thoughts, Meta-Reasoning Prompting, Reflexion, Self-Refine, ReAct, and other approaches — including a comparative table — see the **[full Related Work page](https://tictacguy.github.io/Meta-Reasoning/#related-work)** on the project website.

The short version: every existing approach keeps the LLM as the cognitive subject. We don't. The model is a substrate. The reasoning is governed from outside.

## License

AGPL-3.0 -- See [LICENSE](LICENSE) for details.
