Metadata-Version: 2.4
Name: truthcheck
Version: 0.2.0
Summary: Open source AI content verification
Project-URL: Homepage, https://github.com/truthscore/truthscore
Project-URL: Documentation, https://github.com/truthscore/truthscore#readme
Project-URL: Repository, https://github.com/truthscore/truthscore
Project-URL: Issues, https://github.com/truthscore/truthscore/issues
Author: TruthScore Contributors
License-Expression: MIT
Keywords: ai,fact-check,mcp,misinformation,verification
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: click>=8.0
Requires-Dist: ddgs>=7.0
Requires-Dist: mcp>=1.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: requests>=2.28
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.18; extra == 'anthropic'
Provides-Extra: dev
Requires-Dist: anthropic>=0.18; extra == 'dev'
Requires-Dist: black>=23.0; extra == 'dev'
Requires-Dist: google-generativeai>=0.5; extra == 'dev'
Requires-Dist: mypy>=1.0; extra == 'dev'
Requires-Dist: openai>=1.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest-mock>=3.10; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: python-dotenv>=1.0; extra == 'dev'
Requires-Dist: ruff>=0.1; extra == 'dev'
Provides-Extra: dotenv
Requires-Dist: python-dotenv>=1.0; extra == 'dotenv'
Provides-Extra: gemini
Requires-Dist: google-generativeai>=0.5; extra == 'gemini'
Provides-Extra: llm
Requires-Dist: anthropic>=0.18; extra == 'llm'
Requires-Dist: google-generativeai>=0.5; extra == 'llm'
Requires-Dist: openai>=1.0; extra == 'llm'
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == 'openai'
Provides-Extra: trace
Requires-Dist: sentence-transformers>=2.2; extra == 'trace'
Description-Content-Type: text/markdown

# TruthScore 🔍

**Open source AI content verification.** Score claims 0-100 to detect misinformation.

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Tests](https://img.shields.io/badge/tests-150%20passing-green.svg)]()
[![Publishers](https://img.shields.io/badge/publishers-8%2C974-blue.svg)]()

## The Problem

AI chatbots retrieve content from the web and present it as fact. Bad actors exploit this by creating fake articles designed to fool AI systems — effectively laundering misinformation through "trusted" AI interfaces.

**Example:** BBC journalist Thomas Germain [demonstrated](https://www.bbc.com/future/article/20260218-i-hacked-chatgpt-and-googles-ai-and-it-only-took-20-minutes) he could make ChatGPT and Google's AI tell users he's "the best tech journalist at eating hot dogs" — by publishing a single fake article on his personal website.

## The Solution

TruthScore catches these attacks using multi-factor credibility analysis:

```bash
$ truthcheck trace "Thomas Germain is the best tech journalist at eating hot dogs" --llm gemini --deep

Claim: Thomas Germain is the best tech journalist at eating hot dogs
TruthScore: 0/100 (FALSE)

Score Breakdown:
  Publisher Credibility: 41/100 (30%)
  Content Analysis:      2/100 (30%)
  Corroboration:         0/100 (20%)
  Fact-Check:            20/100 (20%)

⚠️ ZERO FLAG: Content identified as satire

Evidence:
  • 🚨 Content identified as satire
  • 📰 Reputable source(s) report this claim as misinformation
  • ⚠️ Self-published: tomgermain.com publishes claims about its own subject
  • 🎭 Satire detected: tomgermain.com
  • 🎬 Entertainment: Not factual content
  • [tomgermain.com] The article presents an absurd premise: ranking tech journalists by hot dog eating ability.
  • [tomgermain.com] The 'update' section acknowledges that some readers might interpret the list as a joke

Sources Analyzed: 14
```

TruthScore correctly identifies the claim as **FALSE (0/100)** by detecting:
- Self-published source (tomgermain.com publishing claims about its own subject, Thomas Germain)
- Satire/entertainment content
- No corroboration from reputable sources
- BBC reporting it as a deliberate hoax experiment

## Features

- 🎯 **TruthScore 0-100** — Clear, weighted credibility score
- 📊 **8,974 Publishers** — Auto-synced from MBFC
- 🔍 **Multi-Factor Analysis** — Publisher + Content + Corroboration + Fact-checks
- 🚨 **Zero Flags** — Automatic 0 for satire, fake experiments, and self-published claims
- 🦆 **Free Search** — DuckDuckGo by default (no API key needed)
- 🔌 **MCP Server** — Works with Claude Desktop, Cursor
- 🚀 **LLM Optional** — Basic verification works without LLM

## Quick Start

### Installation

```bash
pip install truthcheck
```

### CLI Usage

```bash
# Trace a claim (uses DuckDuckGo, no API key needed)
truthcheck trace "Some claim to verify"

# Deep analysis with LLM (recommended)
truthcheck trace "Some claim" --llm gemini --deep

# Verify a URL
truthcheck check https://example.com/article

# Look up publisher reputation
truthcheck lookup breitbart.com
```

### Python API

```python
from truthscore import trace_claim
from truthscore.search import DuckDuckGoProvider
from truthscore.llm import GeminiProvider

# Basic (no LLM, rules-based)
result = trace_claim("Earth is flat", search_provider=DuckDuckGoProvider())
print(result.truthscore)  # 0-100
print(result.label)       # FALSE / LIKELY FALSE / UNCERTAIN / POSSIBLY TRUE / LIKELY TRUE

# With LLM for deep analysis
llm = GeminiProvider(api_key="your-key")
result = trace_claim(
    "Some claim to verify",
    search_provider=DuckDuckGoProvider(),
    llm_provider=llm,
    deep_analysis=True
)

print(f"TruthScore: {result.truthscore}/100")
print(f"Label: {result.label}")
print(f"Evidence: {result.evidence}")
```

## How TruthScore Works

### Scoring Formula (0-100)

| Factor | Weight | What It Measures |
|--------|--------|------------------|
| **Publisher Credibility** | 30% | Is the source in MBFC? What's their trust rating? |
| **Content Analysis** | 30% | Does the content make sense? Red flags? |
| **Corroboration** | 20% | Do other reputable sources confirm? |
| **Fact-Check** | 20% | What do fact-checkers say? |
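
These weights combine into a single number. Here is a minimal sketch of that combination, using the weights from the table; the function and field names are illustrative, and the actual logic lives in `src/truthscore/trace.py` and may differ:

```python
# Illustrative weighted combination of the four factors above.
WEIGHTS = {
    "publisher": 0.30,
    "content": 0.30,
    "corroboration": 0.20,
    "fact_check": 0.20,
}

def combine(scores: dict[str, float], zero_flagged: bool = False) -> int:
    """Combine per-factor scores (each 0-100) into a TruthScore."""
    if zero_flagged:
        return 0  # zero flags (next section) override everything
    return round(sum(scores[k] * w for k, w in WEIGHTS.items()))

# Hot-dog example above: 41*0.3 + 2*0.3 + 0*0.2 + 20*0.2 ≈ 17,
# but the satire zero flag forces the final score to 0.
```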

### Zero Flags (Automatic Score = 0)

These patterns force TruthScore to 0:

- 🎭 **Satire** — Content is humor/parody, not factual
- 🧪 **Fake Experiment** — Deliberately fake content to test AI/media
- 🎬 **Entertainment** — Not meant to be taken as fact
- 🤖 **AI-Generated** — Synthetic misinformation
- ⚠️ **Self-Published** — Subject of claim publishes their own claims

### Score Interpretation

| Score | Label | Meaning |
|-------|-------|---------|
| 0 | FALSE | Zero flag triggered or definitely false |
| 1-24 | LIKELY FALSE | Strong evidence against |
| 25-49 | UNCERTAIN | Mixed or insufficient evidence |
| 50-74 | POSSIBLY TRUE | Some supporting evidence |
| 75-100 | LIKELY TRUE | Strong evidence supporting |
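
In code, this mapping is a simple threshold ladder. A sketch matching the table (not necessarily the package's exact function):

```python
def label_for(score: int) -> str:
    """Map a 0-100 TruthScore to the labels in the table above (sketch)."""
    if score == 0:
        return "FALSE"
    if score <= 24:
        return "LIKELY FALSE"
    if score <= 49:
        return "UNCERTAIN"
    if score <= 74:
        return "POSSIBLY TRUE"
    return "LIKELY TRUE"
```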

## URL Verification

```python
from truthscore import verify

result = verify("https://reuters.com/article/...")
print(result.trust_score)      # 0.85
print(result.recommendation)   # TRUST / CAUTION / REJECT

result = verify("https://infowars.com/...")
print(result.trust_score)      # 0.30
print(result.recommendation)   # REJECT
```
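
One way to use this in a retrieval pipeline: gate pages on the recommendation before handing them to an AI system. A sketch, assuming only the `verify` fields shown above (`requests` is already a dependency):

```python
import requests

from truthscore import verify

def safe_retrieve(url: str) -> str | None:
    """Verify a URL's publisher before fetching its content (sketch)."""
    result = verify(url)
    if result.recommendation == "REJECT":
        return None  # drop low-credibility sources entirely
    text = requests.get(url, timeout=10).text
    if result.recommendation == "CAUTION":
        # annotate borderline sources so downstream consumers can weigh them
        text = f"[trust={result.trust_score:.2f}]\n{text}"
    return text
```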

## Publisher Database

TruthScore includes **8,974 publishers** from [Media Bias/Fact Check](https://mediabiasfactcheck.com/):

```bash
$ truthcheck lookup reuters.com

Publisher Found:
  Name: Reuters
  Trust Score: 0.85
  Bias: center
  Fact Check Rating: very-high
```

- Auto-syncs on first use if data is >7 days old
- Works offline with bundled snapshot
- Manual sync: `truthcheck sync`
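
This README documents only the CLI lookup; if you need the same data from Python, something like the following may work. The module name comes from the project layout (`src/truthscore/publisher_db.py`), but the function name and result fields here are assumptions, not a documented API:

```python
# Hypothetical sketch: programmatic lookup mirroring `truthcheck lookup`.
from truthscore import publisher_db

pub = publisher_db.lookup("reuters.com")  # assumed helper, may not exist
if pub is not None:
    print(pub.name, pub.trust_score, pub.bias, pub.fact_check_rating)
```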

## Configuration

```bash
# LLM for deep analysis (pick one)
GEMINI_API_KEY=...              # Google Gemini (recommended)
OPENAI_API_KEY=sk-...           # OpenAI
ANTHROPIC_API_KEY=...           # Anthropic Claude

# Search provider (optional - DuckDuckGo works without key)
BRAVE_API_KEY=...               # Brave Search (if you prefer)
SEARXNG_URL=http://localhost:8080  # Self-hosted SearXNG
```
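
For example, with the optional `dotenv` extra installed, a script can load these variables from a local `.env` file before constructing a provider. A sketch using the `GeminiProvider` import shown earlier:

```python
import os

from dotenv import load_dotenv  # pip install "truthcheck[dotenv]"
from truthscore.llm import GeminiProvider

load_dotenv()  # reads a .env file into the environment, if one exists
llm = GeminiProvider(api_key=os.environ["GEMINI_API_KEY"])
```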

## MCP Server (Claude Desktop, Cursor)

```bash
# Run the MCP server
truthscore-mcp
```

Then add it to your Claude Desktop config:

```json
{
  "mcpServers": {
    "truthscore": {
      "command": "truthscore-mcp"
    }
  }
}
```

## LangChain Integration

```python
from langchain.tools import Tool
from truthscore import trace_claim
from truthscore.search import DuckDuckGoProvider

def check_claim(claim: str) -> str:
    result = trace_claim(claim, search_provider=DuckDuckGoProvider())
    return f"TruthScore: {result.truthscore}/100 ({result.label})"

verify_tool = Tool(
    name="verify_claim",
    func=check_claim,
    description="Check if a claim is true. Returns score 0-100."
)
```
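
You can also invoke the tool directly, the way an agent would (the printed score is illustrative):

```python
# Direct invocation of the wrapped function, outside an agent loop.
print(verify_tool.run("Thomas Germain is the best tech journalist at eating hot dogs"))
# -> "TruthScore: <score>/100 (<label>)"
```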

## Project Structure

```
truthscore/
├── src/truthscore/
│   ├── verify.py          # URL verification
│   ├── trace.py           # Claim tracing with TruthScore
│   ├── models.py          # ScoreBreakdown, TraceResult
│   ├── publisher_db.py    # 8,974 publishers from MBFC
│   ├── search.py          # DuckDuckGo, Brave, SearXNG
│   ├── llm.py             # Gemini, OpenAI, Anthropic, Ollama
│   ├── cli.py             # Command-line interface
│   └── mcp_server.py      # MCP server
├── tests/
└── .env.template
```

## Philosophy

1. **Scores over verdicts** — 0-100 is clearer than TRUE/FALSE/MIXED
2. **Evidence over summaries** — Show why, not just what
3. **Open over proprietary** — Verification is a public good
4. **Local over cloud** — Your data stays on your machine

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

- **Add publishers:** Submit to [MBFC](https://mediabiasfactcheck.com/submit-source/)
- **Report issues:** GitHub Issues
- **Code:** PRs welcome

## License

MIT License. Use it however you want.

## Acknowledgments

- [Media Bias/Fact Check](https://mediabiasfactcheck.com/) — Publisher database
- [Thomas Germain / BBC](https://www.bbc.com/future/article/20260218-i-hacked-chatgpt-and-googles-ai-and-it-only-took-20-minutes) — Hot dog experiment inspiration

---

**Questions?** Open an issue.
