Metadata-Version: 2.4
Name: argus-debate-ai
Version: 0.1.0
Summary: ARGUS: A Debate-Native Multi-Agent AI Architecture for Accelerating Scientific Discovery
Author: Ankush Pandey, Rishi Ghodawat, Ronit Mehta
License: MIT
Project-URL: Homepage, https://github.com/argus-ai/argus
Project-URL: Documentation, https://argus-ai.readthedocs.io
Project-URL: Repository, https://github.com/argus-ai/argus
Project-URL: Issues, https://github.com/argus-ai/argus/issues
Keywords: ai,multi-agent,debate,reasoning,scientific-discovery,llm,knowledge-graph,argumentation,bayesian-inference,provenance
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic<3.0,>=2.0
Requires-Dist: pydantic-settings<3.0,>=2.0
Requires-Dist: numpy<2.0,>=1.24
Requires-Dist: scipy<2.0,>=1.10
Requires-Dist: networkx<4.0,>=3.0
Requires-Dist: litellm<2.0,>=1.40
Requires-Dist: openai<2.0,>=1.30
Requires-Dist: anthropic<1.0,>=0.25
Requires-Dist: google-generativeai<1.0,>=0.5
Requires-Dist: sentence-transformers<3.0,>=2.2
Requires-Dist: rank-bm25<1.0,>=0.2
Requires-Dist: faiss-cpu<2.0,>=1.7
Requires-Dist: pymupdf<2.0,>=1.23
Requires-Dist: beautifulsoup4<5.0,>=4.12
Requires-Dist: lxml<6.0,>=4.9
Requires-Dist: chardet<6.0,>=5.0
Requires-Dist: click<9.0,>=8.0
Requires-Dist: rich<14.0,>=13.0
Requires-Dist: python-dotenv<2.0,>=1.0
Requires-Dist: httpx<1.0,>=0.25
Requires-Dist: tenacity<9.0,>=8.2
Requires-Dist: tiktoken<1.0,>=0.5
Provides-Extra: dev
Requires-Dist: pytest<9.0,>=7.4; extra == "dev"
Requires-Dist: pytest-asyncio<1.0,>=0.21; extra == "dev"
Requires-Dist: pytest-cov<6.0,>=4.1; extra == "dev"
Requires-Dist: black<25.0,>=23.0; extra == "dev"
Requires-Dist: ruff<1.0,>=0.1; extra == "dev"
Requires-Dist: mypy<2.0,>=1.5; extra == "dev"
Requires-Dist: pre-commit<4.0,>=3.4; extra == "dev"
Provides-Extra: ollama
Requires-Dist: ollama<1.0,>=0.2; extra == "ollama"
Provides-Extra: all
Requires-Dist: argus-debate-ai[dev]; extra == "all"
Requires-Dist: argus-debate-ai[ollama]; extra == "all"
Dynamic: license-file

# ARGUS

<div align="center">

**Agentic Research & Governance Unified System**

*A debate-native, multi-agent AI framework for evidence-based reasoning with structured argumentation, decision-theoretic planning, and full provenance tracking.*

[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![PyPI version](https://badge.fury.io/py/argus-debate-ai.svg)](https://badge.fury.io/py/argus-debate-ai)

</div>

---

## 🎯 Overview

ARGUS implements the **Research Debate Chain (RDC)**, an approach to AI reasoning that structures knowledge evaluation as a multi-agent debate. Instead of relying on single-pass inference, ARGUS orchestrates specialist agents that gather evidence, generate rebuttals, and render verdicts through Bayesian aggregation.

### Key Innovations

- **Conceptual Debate Graph (C-DAG)**: A directed graph structure where propositions, evidence, and rebuttals are nodes with signed edges representing support/attack relationships
- **Evidence-Directed Debate Orchestration (EDDO)**: Algorithm for managing multi-round debates with stopping criteria
- **Value of Information Planning**: Decision-theoretic experiment selection using Expected Information Gain
- **Full Provenance**: PROV-O compatible ledger with hash-chain integrity for audit trails

---

## ✨ Features

### 🧠 Multi-Agent Debate System
- **Moderator**: Creates debate agendas, manages rounds, evaluates stopping criteria
- **Specialist Agents**: Domain-specific evidence gathering with hybrid retrieval
- **Refuter**: Generates counter-evidence and methodological critiques
- **Jury**: Aggregates evidence via Bayesian updating, renders verdicts
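
The Moderator's stopping check could look something like the sketch below. This is not ARGUS's actual rule: the function name `should_stop` and the parameters `min_delta` and `patience` are hypothetical, and the rule (stop at the round limit, or once the posterior has stopped moving) is only one plausible choice.

```python
def should_stop(posterior_history, max_rounds, min_delta=0.01, patience=2):
    """Decide whether a debate should end (illustrative rule, not ARGUS's own).

    posterior_history[0] is the prior; each later entry is the posterior
    after one completed round.
    """
    rounds_done = len(posterior_history) - 1
    if rounds_done >= max_rounds:
        return True  # round budget exhausted
    if rounds_done < patience:
        return False  # too few rounds to judge convergence
    # Converged if each of the last `patience` updates moved less than min_delta.
    recent = posterior_history[-(patience + 1):]
    return all(abs(b - a) < min_delta for a, b in zip(recent, recent[1:]))


print(should_stop([0.5, 0.62, 0.70], max_rounds=5))
print(should_stop([0.5, 0.62, 0.70, 0.703, 0.705], max_rounds=5))
```

The first call keeps the debate going because the posterior is still moving; the second stops because the last two updates are both below `min_delta`.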

### 📊 Conceptual Debate Graph (C-DAG)
- **Node Types**: Propositions, Evidence, Rebuttals, Findings, Assumptions
- **Edge Types**: Supports, Attacks, Refines, Rebuts with signed weights
- **Propagation**: Log-odds Bayesian belief updating across the graph
- **Visualization**: Export to NetworkX for analysis

### 🔍 Hybrid Retrieval System
- **BM25 Sparse**: Traditional keyword-based retrieval
- **FAISS Dense**: Semantic vector search with sentence-transformers
- **Fusion Methods**: Weighted combination or Reciprocal Rank Fusion (RRF)
- **Cross-Encoder Reranking**: Neural reranking for precision
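
For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal standalone sketch of the technique. It does not use the ARGUS API: the function name, the document IDs, and the constant `k=60` (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_ranking = ["d1", "d2", "d3"]   # hypothetical sparse (BM25) results
dense_ranking = ["d3", "d1", "d4"]  # hypothetical dense (FAISS) results

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because RRF uses only ranks, it needs no score normalization between the sparse and dense retrievers. A weighted combination, by contrast, requires the two score scales to be made comparable first.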

### 🎯 Decision-Theoretic Planning
- **Expected Information Gain (EIG)**: Monte Carlo estimation for experiment value
- **VoI Planner**: Knapsack-based optimal action selection under budget
- **Calibration**: Brier score, Expected Calibration Error (ECE), and temperature scaling for confidence tuning
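
Knapsack-based selection under a budget can be sketched as a standard 0/1 knapsack over candidate actions. The action names, integer costs, and information values below are invented for illustration, and `select_actions` is a hypothetical helper rather than the VoI planner's real interface.

```python
def select_actions(actions, budget):
    """0/1 knapsack: pick actions maximizing total value within an integer budget.

    actions: list of (name, cost, value) tuples with integer costs.
    Returns (best_total_value, chosen_names).
    """
    # best[b] holds the best (value, chosen names) using total cost <= b.
    best = [(0.0, [])] * (budget + 1)
    for name, cost, value in actions:
        # Walk budgets downward so each action is selected at most once.
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + value
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + [name])
    return best[budget]


# Hypothetical candidates: (name, cost, estimated information gain)
candidates = [
    ("run_assay", 3, 0.45),
    ("literature_search", 1, 0.15),
    ("replicate_trial", 4, 0.50),
    ("expert_review", 2, 0.25),
]

total_value, chosen = select_actions(candidates, budget=5)
print(chosen, round(total_value, 2))
```

In a real planner the values would come from an estimate such as Expected Information Gain. The dynamic program assumes the values of different actions are additive, which is a simplification.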

### 🔐 Provenance & Governance
- **PROV-O Compatible**: W3C standard provenance model
- **Hash-Chain Integrity**: SHA-256 linked events for tamper detection
- **Attestations**: Cryptographic proofs for content integrity
- **Query API**: Filter events by entity, agent, time range
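
The idea behind hash-chain integrity can be shown with the standard library alone. The sketch below is not the ARGUS ledger format: the event fields, the all-zero genesis hash, and the helper names are assumptions chosen to keep the example small.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first event's predecessor


def event_hash(prev_hash, payload):
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps({"prev_hash": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_event(ledger, payload):
    """Append an event linked to the hash of the previous one."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    ledger.append(
        {"prev_hash": prev_hash, "payload": payload, "hash": event_hash(prev_hash, payload)}
    )


def verify_chain(ledger):
    """Return True only if every link and every stored hash is consistent."""
    prev_hash = GENESIS_HASH
    for event in ledger:
        if event["prev_hash"] != prev_hash:
            return False
        if event["hash"] != event_hash(prev_hash, event["payload"]):
            return False
        prev_hash = event["hash"]
    return True


ledger = []
append_event(ledger, {"agent": "specialist", "action": "add_evidence"})
append_event(ledger, {"agent": "jury", "action": "render_verdict"})
print(verify_chain(ledger))   # chain is intact

ledger[0]["payload"]["action"] = "tampered"
print(verify_chain(ledger))   # modification is detected
```

Because each hash covers the previous hash, editing any earlier event invalidates that event's stored hash and breaks verification from that point on.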

### 🔌 LLM Provider Support
| Provider | Models | Features |
|----------|--------|----------|
| **OpenAI** | GPT-4o, GPT-4, GPT-3.5 | Generate, Stream, Embed |
| **Anthropic** | Claude 3.5, Claude 3 | Generate, Stream |
| **Google** | Gemini 1.5 Pro/Flash | Generate, Stream, Embed |
| **Ollama** | Llama, Mistral, Phi | Local deployment |

---

## 🚀 Installation

### From PyPI (Recommended)

```bash
pip install argus-debate-ai
```

### From Source

```bash
git clone https://github.com/argus-ai/argus.git
cd argus
pip install -e ".[dev]"
```

### Optional Dependencies

```bash
# For local models via Ollama
pip install "argus-debate-ai[ollama]"

# For all optional extras (development tooling and Ollama)
pip install "argus-debate-ai[all]"
```

---

## 📖 Quick Start

### Basic Usage

```python
from argus import RDCOrchestrator, get_llm

# Initialize with any supported LLM
llm = get_llm("openai", model="gpt-4o")

# Run a debate on a proposition
orchestrator = RDCOrchestrator(llm=llm, max_rounds=5)
result = orchestrator.debate(
    "The new treatment reduces symptoms by more than 20%",
    prior=0.5,  # Start with 50/50 uncertainty
)

print(f"Verdict: {result.verdict.label}")
print(f"Posterior: {result.verdict.posterior:.3f}")
print(f"Evidence: {result.num_evidence} items")
print(f"Reasoning: {result.verdict.reasoning}")
```

### Building a Debate Graph Manually

```python
from argus import CDAG, Proposition, Evidence, EdgeType
from argus.cdag.nodes import EvidenceType
from argus.cdag.propagation import compute_posterior

# Create the graph
graph = CDAG(name="drug_efficacy_debate")

# Add the proposition to evaluate
prop = Proposition(
    text="Drug X is effective for treating condition Y",
    prior=0.5,
    domain="clinical",
)
graph.add_proposition(prop)

# Add supporting evidence
trial_evidence = Evidence(
    text="Phase 3 RCT showed 35% symptom reduction (n=500, p<0.001)",
    evidence_type=EvidenceType.EMPIRICAL,
    polarity=1,  # Supports
    confidence=0.9,
    relevance=0.95,
    quality=0.85,
)
graph.add_evidence(trial_evidence, prop.id, EdgeType.SUPPORTS)

# Add challenging evidence
side_effect = Evidence(
    text="15% of patients experienced adverse events",
    evidence_type=EvidenceType.EMPIRICAL,
    polarity=-1,  # Attacks
    confidence=0.8,
    relevance=0.7,
)
graph.add_evidence(side_effect, prop.id, EdgeType.ATTACKS)

# Compute Bayesian posterior
posterior = compute_posterior(graph, prop.id)
print(f"Posterior probability: {posterior:.3f}")
```

### Document Ingestion & Retrieval

```python
from argus import DocumentLoader, Chunker, EmbeddingGenerator
from argus.retrieval import HybridRetriever

# Load documents
loader = DocumentLoader()
doc = loader.load("research_paper.pdf")

# Chunk with overlap
chunker = Chunker(chunk_size=512, chunk_overlap=50)
chunks = chunker.chunk(doc)

# Create hybrid retriever
retriever = HybridRetriever(
    embedding_model="all-MiniLM-L6-v2",
    lambda_param=0.7,  # Weight toward dense retrieval
    use_reranker=True,
)
retriever.index_chunks(chunks)

# Search
results = retriever.retrieve("treatment efficacy results", top_k=10)
for r in results:
    print(f"[{r.rank}] Score: {r.score:.3f} - {r.chunk.text[:100]}...")
```

### Multi-Agent Debate

```python
from argus import CDAG, Proposition, get_llm
from argus.agents import Moderator, Specialist, Refuter, Jury

llm = get_llm("anthropic", model="claude-3-5-sonnet-20241022")

# Initialize agents
moderator = Moderator(llm)
specialist = Specialist(llm, domain="clinical")
refuter = Refuter(llm)
jury = Jury(llm)

# Create debate
graph = CDAG()
prop = Proposition(text="The intervention is cost-effective", prior=0.5)
graph.add_proposition(prop)

# Moderator creates agenda
agenda = moderator.create_agenda(graph, prop.id)

# Specialists gather evidence
evidence = specialist.gather_evidence(graph, prop.id)

# Refuter challenges
rebuttals = refuter.generate_rebuttals(graph, prop.id)

# Jury renders verdict
verdict = jury.evaluate(graph, prop.id)
print(f"Verdict: {verdict.label} (posterior={verdict.posterior:.3f})")
```

---

## 🖥️ Command Line Interface

ARGUS provides a full CLI for common operations:

```bash
# Run a debate
argus debate "The hypothesis is supported by evidence" --prior 0.5 --rounds 3

# Quick evaluation (single LLM call)
argus evaluate "Climate change increases wildfire frequency"

# Ingest documents into index
argus ingest ./documents --output ./index

# Show configuration
argus config

# Specify provider
argus debate "Query" --provider anthropic --model claude-3-5-sonnet-20241022
```

---

## ⚙️ Configuration

### Environment Variables

```bash
# LLM API Keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."

# Default settings
export ARGUS_DEFAULT_PROVIDER="openai"
export ARGUS_DEFAULT_MODEL="gpt-4o"
export ARGUS_TEMPERATURE="0.7"
export ARGUS_MAX_TOKENS="2048"

# Ollama (local)
export ARGUS_OLLAMA_HOST="http://localhost:11434"
```
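
To illustrate how such variables map onto typed settings, here is a standard-library sketch. It is not how ARGUS loads its configuration: `EnvSettings` and `load_settings` are hypothetical names, and the fallback values simply mirror the example exports above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical container for the ARGUS_* variables shown above."""
    provider: str
    model: str
    temperature: float
    max_tokens: int


def load_settings(env=None):
    """Read ARGUS_* variables from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return EnvSettings(
        provider=env.get("ARGUS_DEFAULT_PROVIDER", "openai"),    # assumed fallback
        model=env.get("ARGUS_DEFAULT_MODEL", "gpt-4o"),          # assumed fallback
        temperature=float(env.get("ARGUS_TEMPERATURE", "0.7")),  # assumed fallback
        max_tokens=int(env.get("ARGUS_MAX_TOKENS", "2048")),     # assumed fallback
    )


# Pass an explicit mapping so the example does not depend on the real environment.
settings = load_settings({"ARGUS_DEFAULT_PROVIDER": "anthropic", "ARGUS_TEMPERATURE": "0.5"})
print(settings)
```

Environment variables are always strings, so numeric settings such as the temperature and token limit need explicit conversion.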

### Programmatic Configuration

```python
from argus import ArgusConfig, get_config

config = ArgusConfig(
    default_provider="anthropic",
    default_model="claude-3-5-sonnet-20241022",
    temperature=0.5,
    max_tokens=4096,
)

# Or get global config
config = get_config()
```

---

## 🏗️ Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                              ARGUS Architecture                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  ┌───────────────┐    ┌───────────────┐    ┌───────────────┐                │
│  │   Moderator   │───▶│  Specialists  │───▶│    Refuter    │                │
│  │   (Planner)   │    │  (Evidence)   │    │ (Challenges)  │                │
│  └───────┬───────┘    └───────┬───────┘    └───────┬───────┘                │
│          │                    │                    │                         │
│          ▼                    ▼                    ▼                         │
│  ┌─────────────────────────────────────────────────────────────────┐        │
│  │                    C-DAG (Debate Graph)                          │        │
│  │  ┌────────┐     ┌──────────┐     ┌──────────┐                   │        │
│  │  │Props   │────▶│ Evidence │────▶│ Rebuttals│                   │        │
│  │  └────────┘     └──────────┘     └──────────┘                   │        │
│  │         ▲              │               │                         │        │
│  │         └──────────────┴───────────────┘                         │        │
│  │                 Signed Influence Propagation                     │        │
│  └─────────────────────────────────────────────────────────────────┘        │
│                                    │                                         │
│                                    ▼                                         │
│  ┌─────────────────────────────────────────────────────────────────┐        │
│  │                         Jury (Verdict)                           │        │
│  │           Bayesian Aggregation → Posterior → Label               │        │
│  └─────────────────────────────────────────────────────────────────┘        │
│                                                                              │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐              │
│  │ Knowledge Layer │  │ Decision Layer  │  │   Provenance    │              │
│  │ • Ingestion     │  │ • Bayesian      │  │ • PROV-O Ledger │              │
│  │ • Chunking      │  │ • EIG/VoI       │  │ • Hash Chain    │              │
│  │ • Hybrid Index  │  │ • Calibration   │  │ • Attestations  │              │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘              │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
```

### Module Overview

| Module | Description |
|--------|-------------|
| `argus.core` | Configuration, data models, LLM abstractions |
| `argus.cdag` | Conceptual Debate Graph implementation |
| `argus.decision` | Bayesian updating, EIG, VoI planning, calibration |
| `argus.knowledge` | Document ingestion, chunking, embeddings, indexing |
| `argus.retrieval` | Hybrid retrieval, reranking, cite & critique |
| `argus.agents` | Moderator, Specialist, Refuter, Jury agents |
| `argus.provenance` | PROV-O ledger, integrity, attestations |
| `argus.orchestrator` | RDC orchestration engine |

---

## 📊 Algorithms

### Signed Influence Propagation

The C-DAG uses log-odds space for numerically stable belief propagation:

```
posterior = σ(log-odds(prior) + Σ signed_weight_i × log(LR_i))
```

Where:
- `σ` is the sigmoid function
- `LR_i` is the likelihood ratio for evidence i
- `signed_weight = polarity × confidence × relevance × quality`
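
A worked instance of the formula above, written as a standalone sketch. The confidence, relevance, and quality values are taken from the Quick Start example. The likelihood ratios (3.0 and 2.0) and the quality of 1.0 for the second item are assumptions, since the README does not state them.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def log_odds(p):
    return math.log(p / (1.0 - p))


def posterior(prior, evidence):
    """posterior = sigmoid(log-odds(prior) + sum(signed_weight_i * log(LR_i)))."""
    total = log_odds(prior)
    for item in evidence:
        signed_weight = (
            item["polarity"] * item["confidence"] * item["relevance"] * item["quality"]
        )
        total += signed_weight * math.log(item["lr"])
    return sigmoid(total)


evidence = [
    # Supporting trial result; lr=3.0 is an assumed likelihood ratio.
    {"polarity": 1, "confidence": 0.9, "relevance": 0.95, "quality": 0.85, "lr": 3.0},
    # Attacking adverse-event report; quality=1.0 and lr=2.0 are assumed.
    {"polarity": -1, "confidence": 0.8, "relevance": 0.7, "quality": 1.0, "lr": 2.0},
]

p = posterior(0.5, evidence)
print(f"Posterior: {p:.3f}")
```

With a prior of 0.5 the prior log-odds are zero, so the result depends only on the net signed contribution: the supporting item outweighs the attacking one and the posterior ends up above 0.5.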

### Expected Information Gain

For experiment planning, ARGUS computes EIG via Monte Carlo:

```
EIG(a) = H(p) - E_y[H(p|y)]
```

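Where `H` is the Shannon entropy of the current belief `p`, `a` is the candidate action (experiment), and the expectation is taken over the action's possible outcomes `y`.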
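
The estimate can be illustrated for the simplest case: a binary proposition and an experiment with a positive or negative outcome. This sketch is independent of the ARGUS implementation. The function names and the outcome probabilities (0.9 if the proposition is true, 0.2 if false) are assumptions, and an exact computation is included only as a check on the sampled estimate.

```python
import math
import random


def entropy(p):
    """Shannon entropy, in bits, of a Bernoulli belief p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def posterior_given(p, positive, p_pos_if_true, p_pos_if_false):
    """Bayes update of belief p after a positive or negative outcome."""
    like_true = p_pos_if_true if positive else 1 - p_pos_if_true
    like_false = p_pos_if_false if positive else 1 - p_pos_if_false
    return p * like_true / (p * like_true + (1 - p) * like_false)


def eig_exact(p, p_pos_if_true, p_pos_if_false):
    """EIG = H(p) - E_y[H(p|y)], summing over both outcomes."""
    p_pos = p * p_pos_if_true + (1 - p) * p_pos_if_false
    expected = p_pos * entropy(posterior_given(p, True, p_pos_if_true, p_pos_if_false))
    expected += (1 - p_pos) * entropy(posterior_given(p, False, p_pos_if_true, p_pos_if_false))
    return entropy(p) - expected


def eig_monte_carlo(p, p_pos_if_true, p_pos_if_false, n_samples=20000, seed=0):
    """Same quantity, with the expectation estimated from sampled outcomes."""
    rng = random.Random(seed)
    p_pos = p * p_pos_if_true + (1 - p) * p_pos_if_false
    total = 0.0
    for _ in range(n_samples):
        positive = rng.random() < p_pos  # draw an outcome from the predictive distribution
        total += entropy(posterior_given(p, positive, p_pos_if_true, p_pos_if_false))
    return entropy(p) - total / n_samples


exact = eig_exact(0.5, 0.9, 0.2)
estimate = eig_monte_carlo(0.5, 0.9, 0.2)
print(f"exact={exact:.4f} bits, monte_carlo={estimate:.4f} bits")
```

Sampling is unnecessary when outcomes can be enumerated, as here. It becomes useful when the outcome space is large or continuous.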

### Calibration

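Temperature scaling fits a single scalar temperature `T` by minimizing the negative log-likelihood of the observed labels:

```
T* = argmin_T Σ_i [ -y_i log(σ(z_i/T)) - (1-y_i) log(1-σ(z_i/T)) ]
```

Where:
- `z_i` is the uncalibrated logit for example i
- `y_i ∈ {0, 1}` is the observed label for example i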
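
A minimal sketch of fitting the temperature follows. It uses a plain grid search so that it needs only the standard library (a real implementation would more likely use a numerical optimizer), and the logits and labels are made up to represent an overconfident model.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def nll(temperature, logits, labels, eps=1e-12):
    """Binary negative log-likelihood of the labels at a given temperature."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = min(max(sigmoid(z / temperature), eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def fit_temperature(logits, labels):
    """Grid search over T in [0.05, 10.0] with step 0.05 (illustrative only)."""
    grid = [i / 20 for i in range(1, 201)]
    return min(grid, key=lambda t: nll(t, logits, labels))


# Hypothetical held-out data: large-magnitude logits, two confident mistakes.
logits = [4.0, 3.5, 5.0, -4.0, -3.0, 4.5, -5.0, 3.0]
labels = [1, 1, 0, 0, 0, 1, 1, 1]

best_t = fit_temperature(logits, labels)
print(f"T* = {best_t:.2f}")
print(f"NLL at T=1: {nll(1.0, logits, labels):.3f}, at T*: {nll(best_t, logits, labels):.3f}")
```

A fitted `T` above 1 softens the probabilities, which is the expected correction when a model is confidently wrong on some examples.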

---

## 🧪 Testing

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=argus --cov-report=html

# Run specific test module
pytest tests/unit/test_cdag.py -v

# Run integration tests
pytest tests/integration/ -v

# Skip slow tests
pytest -m "not slow"
```

---

## 📄 Examples

### Clinical Evidence Evaluation

```python
from argus import RDCOrchestrator, get_llm
from argus.retrieval import HybridRetriever

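# Index clinical literature. `clinical_chunks` is a list of chunks produced by
# Chunker.chunk(...), as in "Document Ingestion & Retrieval" above.
retriever = HybridRetriever()
retriever.index_chunks(clinical_chunks)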

# Evaluate treatment claim
orchestrator = RDCOrchestrator(
    llm=get_llm("openai", model="gpt-4o"),
    max_rounds=5,
)

result = orchestrator.debate(
    "Metformin reduces HbA1c by >1% in Type 2 diabetes",
    prior=0.6,  # Prior based on existing knowledge
    retriever=retriever,
    domain="clinical",
)
```

### Research Claim Verification

```python
from argus import CDAG, Proposition, Evidence
from argus.cdag.propagation import compute_all_posteriors

graph = CDAG(name="research_verification")

# Main claim
claim = Proposition(
    text="Neural scaling laws predict emergent capabilities",
    prior=0.5,
)
graph.add_proposition(claim)

# Add evidence from multiple papers
# ... (add supporting/attacking evidence)

# Compute all posteriors
posteriors = compute_all_posteriors(graph)

for prop_id, posterior in posteriors.items():
    prop = graph.get_proposition(prop_id)
    print(f"{prop.text[:50]}... : {posterior:.3f}")
```

---

## 🤝 Contributing

We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

---

## 📜 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## 📚 Citation

If you use ARGUS in your research, please cite:

```bibtex
@software{argus2024,
  title={ARGUS: Agentic Research \& Governance Unified System},
  author={ARGUS Team},
  year={2024},
  url={https://github.com/argus-ai/argus}
}
```

---

## 🙏 Acknowledgments

- Inspired by debate-native reasoning approaches in AI safety research
- Built on excellent open-source libraries: Pydantic, NetworkX, FAISS, Sentence-Transformers
- LLM integrations powered by OpenAI, Anthropic, and Google APIs

---

<div align="center">

**[Documentation](https://argus-ai.readthedocs.io)** • **[PyPI](https://pypi.org/project/argus-debate-ai/)** • **[GitHub](https://github.com/argus-ai/argus)**

</div>
