Metadata-Version: 2.4
Name: swiss-truth-haystack
Version: 0.1.0
Summary: Haystack components for Swiss Truth — certified knowledge base for AI agents
Project-URL: Homepage, https://swisstruth.org
Project-URL: Documentation, https://swisstruth.org/docs/integrations/haystack
Project-URL: Repository, https://github.com/swisstruthorg/swiss-truth-mcp
Project-URL: Changelog, https://github.com/swisstruthorg/swiss-truth-mcp/blob/main/CHANGELOG.md
Author-email: Swiss Truth <hello@swisstruth.org>
License: MIT
Keywords: agents,ai,compliance,component,deepset,eu-ai-act,fact-checking,grounding,hallucination,hallucination-prevention,haystack,haystack-ai,knowledge-base,mcp,pipeline,rag,retriever,swiss-truth,verified-facts
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Requires-Dist: haystack-ai>=2.0.0
Requires-Dist: httpx>=0.24.0
Requires-Dist: pydantic>=2.0.0
Provides-Extra: dev
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: responses>=0.23; extra == 'dev'
Requires-Dist: respx>=0.20; extra == 'dev'
Description-Content-Type: text/markdown

# swiss-truth-haystack

[![PyPI version](https://img.shields.io/pypi/v/swiss-truth-haystack.svg)](https://pypi.org/project/swiss-truth-haystack/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![Haystack 2.x](https://img.shields.io/badge/haystack-2.x-orange.svg)](https://haystack.deepset.ai/)

**Haystack 2.x components for [Swiss Truth](https://swisstruth.org)** — the certified, hallucination-free knowledge base for AI agents.

Stop your Haystack pipelines from hallucinating. Ground every LLM response in 3000+ certified Swiss and EU facts, verified by a 5-stage validation pipeline.

---

## Features

- 🔍 **`SwissTruthRetriever`** — Drop-in Haystack Retriever component. Plug into any RAG pipeline.
- ✅ **`SwissTruthFactChecker`** — Verify LLM outputs against certified facts. Returns `supported` / `contradicted` / `unverified`.
- 🌍 **3000+ certified claims** across 38 domains (Swiss law, EU health, AI/ML, finance, climate, …)
- 🗣️ **10 languages** — DE, EN, FR, IT, ES, ZH, AR, RU, JA, KO
- 🔒 **EU AI Act compliant** — SHA256 integrity hashes + weekly blockchain anchoring
- ⚡ **Zero config** — no API key required for public endpoints

---

## Installation

```bash
pip install swiss-truth-haystack
```

---

## Quick Start

### RAG Pipeline with Certified Facts

```python
from swiss_truth_haystack import SwissTruthRetriever
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

retriever = SwissTruthRetriever(domain="swiss-law", top_k=5)

prompt = PromptBuilder(template="""
Answer the question using only the certified facts below.

Facts:
{% for doc in documents %}
- {{ doc.content }} (confidence: {{ doc.meta.confidence }})
{% endfor %}

Question: {{ query }}
Answer:
""")

pipeline = Pipeline()
pipeline.add_component("retriever", retriever)
pipeline.add_component("prompt", prompt)
pipeline.add_component("generator", OpenAIGenerator(model="gpt-4o"))

pipeline.connect("retriever.documents", "prompt.documents")
pipeline.connect("prompt.prompt", "generator.prompt")

# The same question feeds both the retriever and the prompt template.
question = "Is health insurance mandatory in Switzerland?"
result = pipeline.run({
    "retriever": {"query": question},
    "prompt": {"query": question},
})

print(result["generator"]["replies"][0])
```

### Standalone Retriever

```python
from swiss_truth_haystack import SwissTruthRetriever

retriever = SwissTruthRetriever(domain="ai-ml", top_k=8)
result = retriever.run(query="What are the EU AI Act risk categories?")

for doc in result["documents"]:
    print(f"[{doc.meta['confidence']:.0%}] {doc.content}")
    print(f"  Source: {doc.meta['source_url']}")
```

### Fact-Checking Pipeline

```python
from swiss_truth_haystack import SwissTruthFactChecker

checker = SwissTruthFactChecker(domain="swiss-law")
result = checker.run(claim="Health insurance is optional in Switzerland.")

print(f"Verdict: {result['verdict']}")   # "contradicted"
for doc in result["documents"]:
    print(f"  Certified fact: {doc.content}")
```

### Post-Generation Fact Check

```python
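from swiss_truth_haystack import SwissTruthFactChecker
from haystack import Pipeline
from haystack.components.converters import OutputAdapter
from haystack.components.generators import OpenAIGenerator

pipeline = Pipeline()
pipeline.add_component("generator", OpenAIGenerator(model="gpt-4o"))
# The generator emits a list of replies, while the checker expects a single
# string, so an OutputAdapter passes the first reply on as the claim.
pipeline.add_component("adapter", OutputAdapter(template="{{ replies[0] }}", output_type=str))
pipeline.add_component("checker", SwissTruthFactChecker(domain="swiss-law"))

pipeline.connect("generator.replies", "adapter.replies")
pipeline.connect("adapter.output", "checker.claim")

result = pipeline.run({"generator": {"prompt": "Explain Swiss health insurance."}})
print(result["checker"]["verdict"])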
```
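
`SwissTruthFactChecker.run()` takes a single `claim`, while a generated reply usually contains several statements. One possible approach, sketched below, is to split the reply into sentences and check each one separately. The helpers `split_into_claims` and `check_reply` are hypothetical and not part of this package, and the regex-based sentence splitter is deliberately naive. `check_fn` stands for any callable that returns a dict with a `verdict` key, for example `lambda c: checker.run(claim=c)`.

```python
import re
from typing import Callable, Dict, List, Tuple


def split_into_claims(reply: str) -> List[str]:
    """Naively split a reply into sentences on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", reply.strip())
    return [p for p in parts if p]


def check_reply(reply: str, check_fn: Callable[[str], Dict]) -> List[Tuple[str, str]]:
    """Run check_fn on every sentence and collect (sentence, verdict) pairs."""
    return [(claim, check_fn(claim)["verdict"]) for claim in split_into_claims(reply)]


# Stand-in for `lambda c: checker.run(claim=c)` so the sketch runs offline.
# Its verdicts are made up for illustration only.
def fake_check(claim: str) -> Dict:
    verdict = "contradicted" if "optional" in claim else "unverified"
    return {"verdict": verdict, "documents": [], "raw": {}}


results = check_reply(
    "Health insurance is optional in Switzerland. Premiums vary by canton.",
    fake_check,
)
for sentence, verdict in results:
    print(f"{verdict:>12}: {sentence}")
```

Checking sentence by sentence costs one API call per sentence, so it may be worth limiting this to replies that are short or that matter most.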

---

## Components

### `SwissTruthRetriever`

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `api_key` | `str` | `None` | Optional API key (not required for public endpoints) |
| `base_url` | `str` | `https://swisstruth.org` | API base URL |
| `timeout` | `float` | `30.0` | HTTP timeout in seconds |
| `domain` | `str` | `None` | Domain filter (e.g. `"swiss-law"`, `"ai-ml"`) |
| `top_k` | `int` | `5` | Max documents to retrieve (1-20) |
| `min_confidence` | `float` | `0.8` | Minimum confidence threshold |

**`run()` inputs:**

| Input | Type | Description |
|-------|------|-------------|
| `query` | `str` | Search query |
| `domain` | `str` (optional) | Per-call domain override |
| `top_k` | `int` (optional) | Per-call result count override |

**`run()` outputs:**

| Output | Type | Description |
|--------|------|-------------|
| `documents` | `List[Document]` | Certified facts as Haystack Documents |

---

### `SwissTruthFactChecker`

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `api_key` | `str` | `None` | Optional API key |
| `base_url` | `str` | `https://swisstruth.org` | API base URL |
| `timeout` | `float` | `30.0` | HTTP timeout in seconds |
| `domain` | `str` | `None` | Domain hint for verification |

**`run()` inputs:**

| Input | Type | Description |
|-------|------|-------------|
| `claim` | `str` | Factual statement to verify |
| `domain` | `str` (optional) | Per-call domain override |

**`run()` outputs:**

| Output | Type | Description |
|--------|------|-------------|
| `documents` | `List[Document]` | Matching certified claims |
| `verdict` | `str` | `"supported"` / `"contradicted"` / `"unverified"` |
| `raw` | `dict` | Raw API response |
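
The three `verdict` values lend themselves to a simple branching policy. The sketch below shows one possible policy, not behavior built into the component: `annotate_answer` is a hypothetical helper, the annotation wording is an assumption, and a `SimpleNamespace` stands in for a Haystack `Document` so the example runs offline.

```python
from types import SimpleNamespace
from typing import Dict


def annotate_answer(answer: str, result: Dict) -> str:
    """Annotate an answer according to the fact checker's verdict."""
    verdict = result.get("verdict", "unverified")
    if verdict == "supported":
        return answer
    if verdict == "contradicted":
        # Surface the certified facts returned alongside the verdict.
        facts = "; ".join(d.content for d in result.get("documents", []))
        return f"[CONTRADICTED] {answer}\nCertified facts: {facts or 'n/a'}"
    # "unverified" or any unexpected value: pass through with a caveat.
    return f"[UNVERIFIED] {answer}"


# Hand-built result in the documented output shape (placeholder content).
example = {
    "verdict": "contradicted",
    "documents": [SimpleNamespace(content="Certified fact text")],
    "raw": {},
}
print(annotate_answer("Some generated claim.", example))
```

Treating unknown verdicts like `"unverified"` keeps the helper conservative if the API ever returns a value not listed in the table.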

---

## Available Domains

| Domain | Description |
|--------|-------------|
| `swiss-law` | Swiss federal and cantonal law |
| `eu-health` | EU health regulations |
| `ai-ml` | AI/ML facts and EU AI Act |
| `finance` | Swiss and EU financial regulations |
| `climate` | Climate science and Swiss policy |
| `quantum-computing` | Quantum computing facts |
| `swiss-history` | Swiss historical facts |
| … | 38 domains total — see [swisstruth.org/domains](https://swisstruth.org/domains) |

---

## Document Metadata

Every retrieved `Document` includes:

```python
doc.meta = {
    "claim_id": "uuid",          # Unique claim identifier
    "domain": "swiss-law",       # Knowledge domain
    "language": "en",            # ISO 639-1 language code
    "source_url": "https://...", # Primary source URL
    "verified": True,            # Verification status
    "confidence": 0.97,          # Confidence score 0.0-1.0
    "swiss_truth": True,         # Swiss Truth provenance marker
}
doc.score = 0.97                 # Same as confidence (for Haystack ranking)
```
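
These metadata fields can be used to post-filter and cite retrieved facts. The following is a minimal sketch that assumes only the fields listed above. `format_citations` is a hypothetical helper, and `SimpleNamespace` objects with placeholder content stand in for Haystack `Document` instances so the example runs without network access.

```python
from types import SimpleNamespace
from typing import Iterable, List


def format_citations(docs: Iterable, min_confidence: float = 0.9, language: str = "en") -> List[str]:
    """Keep verified documents in one language above a confidence floor, best first."""
    kept = [
        d for d in docs
        if d.meta.get("verified")
        and d.meta.get("language") == language
        and d.meta.get("confidence", 0.0) >= min_confidence
    ]
    kept.sort(key=lambda d: d.meta["confidence"], reverse=True)
    return [
        f"{d.content} [{d.meta['confidence']:.0%}, {d.meta['source_url']}]"
        for d in kept
    ]


def _doc(content: str, confidence: float, language: str = "en") -> SimpleNamespace:
    """Build a placeholder document carrying the documented metadata keys."""
    meta = {
        "verified": True,
        "language": language,
        "confidence": confidence,
        "source_url": "https://example.org/" + content.lower().replace(" ", "-"),
    }
    return SimpleNamespace(content=content, meta=meta)


docs = [_doc("Fact B", 0.92), _doc("Fact A", 0.97), _doc("Fact C", 0.85), _doc("Fakt D", 0.99, "de")]
for line in format_citations(docs):
    print(line)
```

With a real pipeline, the list returned in `result["documents"]` could be passed in place of `docs`.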

---

## Serialization

Both components support Haystack's standard serialization for pipeline YAML export:

```python
from haystack import Pipeline
from swiss_truth_haystack import SwissTruthRetriever

pipeline = Pipeline()
pipeline.add_component("retriever", SwissTruthRetriever(domain="swiss-law"))

# Export
print(pipeline.dumps())

# Import
pipeline2 = Pipeline.loads(pipeline.dumps())
```

---

## Links

- 🌐 **API**: [swisstruth.org](https://swisstruth.org)
- 📖 **Docs**: [swisstruth.org/docs/integrations/haystack](https://swisstruth.org/docs/integrations/haystack)
- 🐙 **GitHub**: [github.com/swisstruthorg/swiss-truth-mcp](https://github.com/swisstruthorg/swiss-truth-mcp)
- 📦 **PyPI**: [pypi.org/project/swiss-truth-haystack](https://pypi.org/project/swiss-truth-haystack/)
- 🤖 **MCP Server**: [swisstruth.org/mcp](https://swisstruth.org/mcp)

---

## License

MIT — see [LICENSE](../../LICENSE)
