Metadata-Version: 2.4
Name: agentprop
Version: 0.1.0a1
Summary: Graph optimization for multi-agent LLM workflows.
Project-URL: Homepage, https://github.com/aryan5v/AgentProp
Project-URL: Repository, https://github.com/aryan5v/AgentProp
Project-URL: Documentation, https://github.com/aryan5v/AgentProp#readme
Project-URL: Issues, https://github.com/aryan5v/AgentProp/issues
Author: AgentProp contributors
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: graph-optimization,influence-maximization,llm,multi-agent,reinforcement-learning
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Requires-Dist: networkx>=3.2
Provides-Extra: dev
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Provides-Extra: dl
Requires-Dist: torch>=2.2; extra == 'dl'
Provides-Extra: ml
Requires-Dist: numpy>=1.26; extra == 'ml'
Provides-Extra: rl
Requires-Dist: gymnasium>=0.29; extra == 'rl'
Description-Content-Type: text/markdown

# AgentProp

<p align="center">
  <img src="docs/assets/agentprop-logo.png" alt="AgentProp logo" width="180" />
</p>

<p align="center">
  <strong>Graph optimization for multi-agent LLM workflows.</strong>
</p>

<p align="center">
  <a href="https://github.com/aryan5v/AgentProp/actions/workflows/ci.yml"><img alt="CI" src="https://github.com/aryan5v/AgentProp/actions/workflows/ci.yml/badge.svg" /></a>
  <img alt="Version" src="https://img.shields.io/badge/version-0.1.0a1-black" />
  <img alt="License" src="https://img.shields.io/badge/license-Apache--2.0-black" />
  <img alt="Status" src="https://img.shields.io/badge/status-public_alpha-12c95b" />
</p>

AgentProp models agents, tools, memories, documents, and verifiers as nodes in a
directed weighted graph. It then uses propagation models, classical graph
algorithms, optional GNN-style policies, and reinforcement learning to optimize:

- which agents receive full context first
- which edges are redundant or risky to prune
- where verifier agents should observe or intercept failures
- how much cost is saved versus broadcast routing
- how learned routing policies compare with graph-theoretic baselines

AgentProp is not an agent orchestrator. It is an analysis and optimization layer
for workflows you already have, want to inspect, or want to study.
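
To make the graph model concrete, here is a minimal sketch that uses plain NetworkX rather than AgentProp's own `AgentGraph` API. The node names, the `kind` attribute, and the edge weights are illustrative assumptions, with each weight read as the chance that context passes along that edge.

```python
import networkx as nx

# Hypothetical workflow: names, kinds, and weights are illustrative only.
graph = nx.DiGraph()
graph.add_node("docs", kind="document")
graph.add_node("planner", kind="agent")
graph.add_node("coder", kind="agent")
graph.add_node("tester", kind="verifier")

# Assumed meaning of weight: probability that context passes along the edge.
graph.add_edge("docs", "planner", weight=0.9)
graph.add_edge("planner", "coder", weight=0.8)
graph.add_edge("coder", "tester", weight=0.7)
graph.add_edge("planner", "tester", weight=0.4)

# Weighted out-degree is a rough proxy for how much each node pushes downstream.
out_strength = dict(graph.out_degree(weight="weight"))
```

Representing the workflow this way is what lets propagation models and classical graph algorithms run on it without any knowledge of the agents themselves.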

## Status

AgentProp is usable as a public alpha framework. The graph backbone, CLI,
reports, workflow templates, ML/RL baselines, MCP/coding-agent briefs,
checkpoints, and experiment artifact registry are implemented and tested.

The main limitation is evidence depth: results from real routed LLM runs should
be treated as directional until larger, repeated studies are published. The
current library prioritizes reproducible artifacts and conservative claims.

Check the current rollout state:

```bash
agentprop readiness
```

## Install

```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install -e ".[dev]"
```

Optional extras:

```bash
python -m pip install -e ".[ml]"  # numpy for the optional ML extra
python -m pip install -e ".[dl]"  # torch-backed GNN experiments
python -m pip install -e ".[rl]"  # optional Gymnasium ecosystem compatibility
```

The current dependency-light alpha workflows do not require CUDA or a GPU.
A GPU, local or hosted on a service such as Modal, becomes useful for larger
torch sweeps and hyperparameter searches.

## First Recipes

Analyze a workflow:

```bash
agentprop analyze benchmarks/workflows/planner_coder_tester_reviewer.json
```

Recommend seed agents for context routing:

```bash
agentprop optimize planner_coder_tester_reviewer --budget 2 --algorithm greedy
```
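
For intuition about what a greedy seed recommendation involves, the sketch below implements the textbook greedy heuristic for influence maximization under an Independent Cascade model, written against plain NetworkX. It is not AgentProp's implementation: the function names, the `weight` edge attribute, and the trial count are assumptions made for illustration.

```python
import random

import networkx as nx


def simulate_ic(graph, seeds, rng):
    """One Independent Cascade run; returns the set of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.successors(node):
                if neighbor in active:
                    continue
                # Each edge gets a single activation attempt.
                if rng.random() < graph[node][neighbor]["weight"]:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return active


def expected_spread(graph, seeds, trials=200, seed=0):
    """Monte Carlo estimate of how many nodes the seeds reach on average."""
    rng = random.Random(seed)  # fixed seed keeps comparisons reproducible
    return sum(len(simulate_ic(graph, seeds, rng)) for _ in range(trials)) / trials


def greedy_seeds(graph, budget, trials=200):
    """Repeatedly add the node that yields the largest estimated spread."""
    chosen = []
    for _ in range(budget):
        candidates = [n for n in graph.nodes if n not in chosen]
        if not candidates:
            break
        best = max(candidates, key=lambda n: expected_spread(graph, chosen + [n], trials))
        chosen.append(best)
    return chosen
```

Calling `greedy_seeds(graph, budget=2)` on a weighted `nx.DiGraph` returns two node names. The cost grows with budget, node count, and trial count, which is the usual motivation for lazy variants such as CELF.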

Use quality-aware routing when correctness-sensitive roles should be protected:

```bash
agentprop optimize planner_coder_tester_reviewer \
  --budget 2 \
  --algorithm quality-aware-greedy
```

Simulate propagation:

```bash
agentprop simulate chain --seeds node_0 --model zero-forcing
```
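
The deterministic zero-forcing rule can be sketched in a few lines: a colored node that has exactly one uncolored successor forces that successor to become colored, and the process repeats until nothing changes. The version below is a generic illustration on a NetworkX directed graph, not AgentProp's model. The `node_i` labels mirror the `node_0` seed in the command above, and the assumption that the chain is a simple directed path is ours.

```python
import networkx as nx


def zero_forcing_closure(graph, seeds):
    """Apply the zero-forcing color-change rule until no node can force."""
    colored = set(seeds)
    changed = True
    while changed:
        changed = False
        for node in list(colored):
            uncolored = [n for n in graph.successors(node) if n not in colored]
            # A node forces only when exactly one successor is still uncolored.
            if len(uncolored) == 1:
                colored.add(uncolored[0])
                changed = True
    return colored


# Assumed stand-in for a chain workflow: node_0 -> node_1 -> ... -> node_4.
chain = nx.relabel_nodes(
    nx.path_graph(5, create_using=nx.DiGraph), lambda i: f"node_{i}"
)
reached = zero_forcing_closure(chain, ["node_0"])
```

On a directed path, seeding the first node colors the whole chain, while a hub with two or more uncolored successors forces nothing until all but one of them are colored by other nodes.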

Prune toward a token-reduction target:

```bash
agentprop prune planner_coder_tester_reviewer --target-token-reduction 0.3
```
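
One way to picture pruning toward a token-reduction target is a simple heuristic: remove the least reliable edges first, skip any edge whose removal would disconnect the workflow, and stop once the requested share of tokens is gone. The sketch below is that heuristic only. The `tokens` and `weight` edge attributes and the connectivity guard are assumptions, and AgentProp's actual pruning criteria may differ.

```python
import networkx as nx


def prune_to_token_target(graph, target_reduction):
    """Drop low-weight edges until the target fraction of edge tokens is removed."""
    pruned = graph.copy()
    total = sum(data["tokens"] for _, _, data in graph.edges(data=True))
    goal = target_reduction * total
    removed = 0
    # Visit the least reliable edges first (assumed: low weight means low value).
    for u, v, data in sorted(graph.edges(data=True), key=lambda edge: edge[2]["weight"]):
        if removed >= goal:
            break
        pruned.remove_edge(u, v)
        if nx.is_weakly_connected(pruned):
            removed += data["tokens"]
        else:
            # Restore edges whose removal would split the workflow.
            pruned.add_edge(u, v, **data)
    return pruned, (removed / total if total else 0.0)
```

The function returns the pruned copy together with the fraction of tokens actually removed, which can exceed the target because edges are removed whole, or fall short when every remaining edge is needed for connectivity.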

Write an HTML report:

```bash
agentprop report planner_coder_tester_reviewer --out reports/demo.html --format html
```

Generate a Codex or Claude Code brief:

```bash
agentprop agent-instructions planner_coder_tester_reviewer \
  --target codex \
  --out reports/codex_agent_brief.md
```

## Experiment Recipes

Run the benchmark table and SVG plot:

```bash
PYTHONPATH=src:. python experiments/run_benchmark.py \
  --workflows planner_coder_tester_reviewer,chain,tree \
  --budget 2 \
  --trials 20 \
  --out-dir results/benchmark
```

Run a small ML/RL sweep:

```bash
PYTHONPATH=src:. python experiments/run_ml_rl_sweep.py \
  --config configs/sweeps/ml_rl_smoke.json \
  --artifact-root results/ml_rl_smoke
```

Dry-run the full recipe suite:

```bash
PYTHONPATH=src:. python experiments/run_experiment_suite.py \
  --config configs/experiment_suites/ml_core.json \
  --artifact-root results/ml_core \
  --dry-run
```

Preflight the real LLM case study without making LLM calls:

```bash
PYTHONPATH=src:. python experiments/run_case_study.py \
  --execution-mode llm \
  --preflight \
  --out-dir docs/results/case_study_001
```

## Artifacts

AgentProp writes plain, inspectable artifacts:

- `results.json` / `results.csv` for benchmark and case-study rows
- `summary.json` for aggregate metrics
- `traces.jsonl` and `outputs.jsonl` for routed LLM execution traces
- `verification_logs.jsonl` when command verification is enabled
- `registry.json` for ML/RL checkpoints and metric artifacts
- `*.svg` plots for benchmark and case-study summaries
- Markdown, JSON, or HTML optimization reports

This recipe-first layout is intentional: every claim should point to a command
and a saved artifact.
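
Because the artifacts are plain JSON and JSON Lines, they can be inspected with the standard library alone. The helpers below are a small sketch that makes no assumption about field names: they parse records and count which top-level keys appear. The function names are hypothetical, and the file path in the usage note is only an example.

```python
import json
from collections import Counter


def parse_jsonl(lines):
    """Parse an iterable of JSON Lines strings, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def key_counts(records):
    """Count how often each top-level key appears across dict records."""
    counts = Counter()
    for record in records:
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts
```

A typical use is `with open("traces.jsonl", encoding="utf-8") as handle: records = parse_jsonl(handle)`, followed by `key_counts(records)` to see which fields a run actually recorded.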

## What Is Implemented

- Directed weighted `AgentGraph` with JSON serialization, validation, NetworkX
  conversion, and Graphviz DOT export.
- Propagation models: Independent Cascade, Linear Threshold, Bootstrap
  Percolation, Randomized Zero Forcing, deterministic Zero Forcing, and learned
  trace-calibrated propagation.
- Classical baselines: random, degree, in-degree, out-degree, PageRank,
  betweenness, closeness, k-core, greedy, CELF, cost-aware greedy, and
  quality-aware greedy.
- Bottleneck, articulation, bridge, low-reliability, failure-sensitive, pruning,
  observability, and verifier-placement diagnostics.
- Role-critical routing with context-sensitivity scores, graded context
  allocation, calibrated compression ratios, risk annotations, and
  verifier-placement coupling.
- Workflow templates for agent-inspired workflows and synthetic graph families.
- Quality scorers for exact match, human labels, rubrics, and injected
  LLM-as-judge adapters.
- Dependency-light ML baselines, optional torch GNNs, Q-learning, REINFORCE, PPO,
  expanded workflow-control actions, a category-conditioned online bandit,
  checkpoints, and artifact registry.
- Framework interchange adapters and optional native hooks for LangGraph, CrewAI,
  and OpenAI Agents SDK.
- Claude Code/Codex instructions and a lightweight stdio MCP server.
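
Several of the classical baselines and structural diagnostics listed above correspond to standard NetworkX calls. The sketch below ranks nodes by a few centrality scores and finds articulation points and bridges on a small hypothetical workflow. The topology, node names, and the `top_k` helper are assumptions, and this does not show how AgentProp wires these measures together.

```python
import networkx as nx


def top_k(scores, k):
    """Return the k highest-scoring nodes, breaking ties by node name."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], str(item[0])))
    return [node for node, _ in ranked[:k]]


# Hypothetical four-role workflow; the edges are illustrative only.
workflow = nx.DiGraph(
    [
        ("planner", "coder"),
        ("planner", "tester"),
        ("coder", "tester"),
        ("tester", "reviewer"),
    ]
)

baselines = {
    "out-degree": dict(workflow.out_degree()),
    "betweenness": nx.betweenness_centrality(workflow),
    "closeness": nx.closeness_centrality(workflow),
}
seed_candidates = {name: top_k(scores, 2) for name, scores in baselines.items()}

# Articulation points and bridges are defined on the undirected view.
undirected = workflow.to_undirected()
cut_nodes = set(nx.articulation_points(undirected))
bridges = set(nx.bridges(undirected))
```

Centrality rankings like these are cheap to compute and ignore propagation dynamics, which is why they serve as baselines for comparison against simulation-based methods such as greedy and CELF.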

## Documentation

- [Documentation index](docs/index.md)
- [Tutorial](docs/tutorial.md)
- [Quality-aware routing](docs/routing_quality.md)
- [Case-study protocol](docs/research/case_study_protocol.md)
- [ML/DL/RL guide](docs/deep_learning.md)
- [Coding-agent integration](docs/coding_agents.md)
- [Framework integrations](docs/framework_integrations.md)
- [Contributing](CONTRIBUTING.md)

## Development

```bash
ruff check .
mypy src
pytest
```

CI runs the same gates on every push and pull request.
