Metadata-Version: 2.4
Name: llm-cache-guard
Version: 0.1.0
Summary: Prevent LLM cache poisoning with Confidence Gap Analysis — intent-aware thresholds, zero false serves
Author: Shrinidhi Mahishi
License-Expression: MIT
Project-URL: Repository, https://github.com/shrinidhi-mahishi/cache-guard
Keywords: llm,cache,poisoning,confidence-gap,semantic-cache
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Provides-Extra: demo
Requires-Dist: streamlit; extra == "demo"
Requires-Dist: plotly; extra == "demo"
Requires-Dist: pandas; extra == "demo"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Provides-Extra: all
Requires-Dist: streamlit; extra == "all"
Requires-Dist: plotly; extra == "all"
Requires-Dist: pandas; extra == "all"
Requires-Dist: pytest; extra == "all"
Dynamic: license-file

# Cache Guard

Prevent LLM cache poisoning with Confidence Gap Analysis — intent-aware thresholds, zero false serves.

## The Problem

"Reset my password" and "Reset my admin password" score **0.92 cosine similarity**, yet they need different answers. A naive cache with a static threshold below that score treats them as the same query and serves the wrong answer. This is **cache poisoning**, and it happens silently in production.

## The Fix

Instead of trusting the top match score, look at the **gap** between the top two matches:

- **Large gap** (top match far ahead) -> safe to serve from cache
- **Small gap** (top two neck-and-neck) -> bypass to the LLM
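
The rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not the library's implementation: the names `cosine_similarity` and `gap_decision` are hypothetical, the default `threshold` and `min_gap` are borrowed from the informational tier in the table further down, and treating a single-entry cache as having a runner-up score of 0.0 is an assumption.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def gap_decision(scores, threshold=0.78, min_gap=0.08):
    """Return (serve, top_index, gap) for a list of similarity scores.

    Hypothetical helper: serve only if the best score clears the threshold
    AND leads the second-best score by at least min_gap.
    """
    if not scores:
        return False, None, 0.0
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = scores[ranked[0]]
    # Assumption: with a single cached entry, the runner-up score is 0.0.
    runner_up = scores[ranked[1]] if len(ranked) > 1 else 0.0
    gap = top - runner_up
    return (top >= threshold and gap >= min_gap), ranked[0], gap

# Two near-identical candidates: high top score, but the gap is too small.
print(gap_decision([0.92, 0.90]))
# One clear winner: safe to serve.
print(gap_decision([0.95, 0.60, 0.30]))
```

The key point is that a high top score alone is not enough: the first call is rejected even though 0.92 clears the threshold, because the second-best candidate is almost as close.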

## Features

* **Confidence Gap Analysis** — serve cached responses only when the match is unambiguous
* **Intent-aware thresholds** — informational (loose), actionable (medium), transactional (strict)
* **TTL by risk tier** — 30 / 14 / 7 days based on query intent
* **Model staleness check** — reject entries generated by an older model version
* **Semantic drift revalidation** — sample entries periodically to detect drift
* **Framework-agnostic** — works with any LLM and any embedding model
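
As a rough illustration of the TTL and staleness features, an entry's freshness could be checked as below. This is a sketch under stated assumptions: `TTL_DAYS` and `is_fresh` are hypothetical names, the 30 / 14 / 7 day values are mapped to tiers as in the thresholds table further down, and any model-version mismatch is treated as stale (the library may compare versions differently).

```python
from datetime import datetime, timedelta, timezone

# Assumed mapping of risk tier to TTL, taken from the 30 / 14 / 7 day figures.
TTL_DAYS = {"informational": 30, "actionable": 14, "transactional": 7}

def is_fresh(created_at, intent, entry_model, current_model, now=None):
    """Hypothetical check: True if the entry is within its TTL and model-current."""
    now = now or datetime.now(timezone.utc)
    if entry_model != current_model:
        return False  # generated by a different model version
    return now - created_at <= timedelta(days=TTL_DAYS[intent])

# A 10-day-old entry is still valid for low-risk queries but not high-risk ones.
now = datetime(2025, 1, 31, tzinfo=timezone.utc)
created = datetime(2025, 1, 21, tzinfo=timezone.utc)
print(is_fresh(created, "informational", "model-a", "model-a", now=now))
print(is_fresh(created, "transactional", "model-a", "model-a", now=now))
```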

## Installation

```bash
pip install llm-cache-guard
```

With optional dependencies:

```bash
pip install "llm-cache-guard[demo]"    # Streamlit demo + Plotly charts
pip install "llm-cache-guard[dev]"     # pytest
pip install "llm-cache-guard[all]"     # Everything
```

## Quick Start

```python
from cache_guard import SafeCache

cache = SafeCache()

# Add a cached response (your_embed_fn is a placeholder for your own embedding function)
cache.add(
    query="How do I reset my password?",
    embedding=your_embed_fn("How do I reset my password?"),
    response="Go to Settings > Security > Reset.",
)

# Lookup with confidence gap analysis
result = cache.lookup(
    query="Reset my password",
    embedding=your_embed_fn("Reset my password"),
)

if result.hit:
    print(result.response)  # Serve from cache
else:
    print(result.reason)    # e.g. "gap 0.03 < min_gap 0.08 (ambiguous)"
    # -> call the LLM instead
```

## With Any LLM + Embedding Model

Cache Guard manages the cache decision — you bring your own LLM and embeddings:

```python
from openai import OpenAI
from cache_guard import SafeCache

# Use one model identifier for both the cache's staleness check and the chat call
MODEL = "gpt-4o-2024-08-06"

client = OpenAI()
cache = SafeCache(model_version=MODEL)

def embed(text):
    r = client.embeddings.create(model="text-embedding-3-small", input=text)
    return r.data[0].embedding

user_query = "How do I reset my password?"
query_embedding = embed(user_query)  # embed once, reuse for lookup and add

# Check cache first
result = cache.lookup(user_query, query_embedding)

if result.hit:
    answer = result.response
else:
    # Miss or ambiguous match: call the LLM, then cache its answer
    answer = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": user_query}]
    ).choices[0].message.content
    cache.add(user_query, query_embedding, answer)
```

## Intent-Aware Thresholds

Different query types carry different risk levels:

| Intent | Threshold | Min Gap | TTL | Example |
|--------|-----------|---------|-----|---------|
| Informational | 0.78 | 0.08 | 30d | "What are your hours?" |
| Actionable | 0.90 | 0.15 | 14d | "Reset my password" |
| Transactional | 0.95 | 0.20 | 7d | "Pay my bill" |

```python
from cache_guard import SafeCache, IntentClassifier

# Use built-in keyword heuristic
cache = SafeCache()  # auto-classifies queries

# Or plug in your own ML classifier
classifier = IntentClassifier(custom_fn=your_ml_model.predict)
cache = SafeCache(classifier=classifier)
```
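
For illustration, a custom classifier function might look like the sketch below. It is not the built-in heuristic: the keyword lists and the name `keyword_intent` are made up for this example, and it assumes `custom_fn` takes the query string and returns one of the three tier names.

```python
import re

# Illustrative keyword lists only; tune these for your own domain.
TRANSACTIONAL = {"pay", "refund", "transfer", "purchase", "cancel"}
ACTIONABLE = {"reset", "change", "update", "delete", "enable", "disable"}

def keyword_intent(query: str) -> str:
    """Hypothetical classifier: map a query to a risk tier by keyword."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    # Check the strictest tier first so mixed queries get the safer setting.
    if words & TRANSACTIONAL:
        return "transactional"
    if words & ACTIONABLE:
        return "actionable"
    return "informational"

for q in ("What are your hours?", "Reset my password", "Pay my bill"):
    print(q, "->", keyword_intent(q))
```

If `custom_fn` accepts a callable of this shape, it could be passed as `IntentClassifier(custom_fn=keyword_intent)`. Checking the strictest tier first means a query that mixes intents falls into the more conservative threshold.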

## Custom Thresholds

```python
from cache_guard import SafeCache
from cache_guard.types import IntentConfig

cache = SafeCache(thresholds={
    "informational": IntentConfig(threshold=0.80, min_gap=0.10, ttl_days=30),
    "actionable":    IntentConfig(threshold=0.92, min_gap=0.18, ttl_days=14),
    "transactional": IntentConfig(threshold=0.97, min_gap=0.25, ttl_days=3),
})
```

## Health Monitoring

```python
stats = cache.get_stats()
print(f"Hit rate:    {stats['hit_rate']:.1%}")
print(f"Bypass rate: {stats['bypass_rate']:.1%}")
print(f"Entries:     {stats['entries']}")

# Periodic drift check
sample = cache.revalidate(sample_rate=0.05)
print(f"Sampled {sample['sampled']} of {sample['total']} entries")
```
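
What a drift check does with the sampled entries is up to you. One possible approach, sketched below, is to re-embed each sampled query and flag entries whose stored embedding no longer matches the fresh one. Everything here is an assumption for illustration: the `find_drifted` helper, the dict-shaped entries with `query` and `embedding` keys, and the 0.95 tolerance are not part of the library's documented API.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def find_drifted(entries, embed_fn, min_similarity=0.95, sample_rate=0.05, seed=None):
    """Hypothetical drift check: re-embed a random sample and flag mismatches."""
    rng = random.Random(seed)
    # Sample at least one entry when the cache is non-empty.
    k = min(len(entries), max(1, math.ceil(len(entries) * sample_rate))) if entries else 0
    drifted = []
    for entry in rng.sample(entries, k):
        fresh = embed_fn(entry["query"])
        if cosine(fresh, entry["embedding"]) < min_similarity:
            drifted.append(entry["query"])
    return drifted

# Toy data: the stored embedding for "b" no longer matches what embed_fn returns.
entries = [
    {"query": "a", "embedding": [1.0, 0.0]},
    {"query": "b", "embedding": [0.0, 1.0]},
]
print(find_drifted(entries, lambda q: [1.0, 0.0], sample_rate=1.0, seed=0))
```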

## Examples

See the `examples/` folder for complete working demos:

| Example | Description |
|---------|-------------|
| `01_basic_usage.py` | Add entries, look them up, inspect stats |
| `02_poisoning_demo.py` | Naive cache vs SafeCache — poisoning comparison |
| `03_intent_thresholds.py` | Risk-based thresholds by query type |
| `04_streamlit_demo.py` | Interactive demo with live charts (DevConf talk) |

### Running Examples

```bash
git clone https://github.com/shrinidhi-mahishi/cache-guard.git
cd cache-guard
python -m venv venv && source venv/bin/activate
pip install -e ".[all]"

python examples/01_basic_usage.py
streamlit run examples/04_streamlit_demo.py
```

## API Reference

### SafeCache

| Method | Description |
|--------|-------------|
| `lookup(query, embedding)` | Confidence-gap lookup -> `LookupResult` |
| `add(query, embedding, response, model_version=None)` | Add entry -> `CacheEntry` |
| `revalidate(sample_rate=0.01)` | Sample entries for drift checking |
| `get_stats()` | Hit rate, bypass rate, entry count |
| `clear()` | Remove all entries and reset counters |

### IntentClassifier

| Method | Description |
|--------|-------------|
| `classify(query)` | Return intent tier (informational / actionable / transactional) |

### QuerySimulator

| Method | Description |
|--------|-------------|
| `get_queries(shuffle=True)` | Generate synthetic queries with poison injections |
| `get_cluster_centroids()` | Return embedding centroids for analysis |

## Configuration

```python
from cache_guard import SafeCache, IntentClassifier

cache = SafeCache(
    thresholds=DEFAULT_THRESHOLDS,  # Per-intent configs
    classifier=IntentClassifier(),  # Or a custom ML classifier
    model_version="gpt-4o",         # For staleness checks
)
```

## License

MIT
