Metadata-Version: 2.4
Name: ragdrift-py
Version: 0.1.4
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: System :: Monitoring
Classifier: Typing :: Typed
Requires-Dist: numpy>=1.24
Requires-Dist: boto3>=1.34 ; extra == 'aws'
Requires-Dist: datadog-api-client>=2.20 ; extra == 'datadog'
Requires-Dist: pytest>=7.4 ; extra == 'dev'
Requires-Dist: pytest-mock>=3.12 ; extra == 'dev'
Requires-Dist: mypy>=1.8 ; extra == 'dev'
Requires-Dist: ruff>=0.4 ; extra == 'dev'
Requires-Dist: maturin>=1.7,<2.0 ; extra == 'dev'
Requires-Dist: opensearch-py>=2.4 ; extra == 'opensearch'
Requires-Dist: sqlalchemy>=2.0 ; extra == 'pgvector'
Requires-Dist: pgvector>=0.2 ; extra == 'pgvector'
Requires-Dist: pinecone>=5.0 ; extra == 'pinecone'
Requires-Dist: prometheus-client>=0.19 ; extra == 'prometheus'
Provides-Extra: aws
Provides-Extra: datadog
Provides-Extra: dev
Provides-Extra: opensearch
Provides-Extra: pgvector
Provides-Extra: pinecone
Provides-Extra: prometheus
License-File: LICENSE-APACHE
License-File: LICENSE-MIT
Summary: 5-dimensional drift detection for production RAG systems.
Keywords: rag,drift,monitoring,embeddings,llm,observability
Home-Page: https://github.com/MukundaKatta/ragdrift
Author: Mukunda Katta
License-Expression: MIT OR Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Changelog, https://github.com/MukundaKatta/ragdrift/blob/main/CHANGELOG.md
Project-URL: Documentation, https://mukundakatta.github.io/ragdrift/
Project-URL: Homepage, https://github.com/MukundaKatta/ragdrift
Project-URL: Issues, https://github.com/MukundaKatta/ragdrift/issues
Project-URL: Repository, https://github.com/MukundaKatta/ragdrift

# ragdrift

5-dimensional drift detection for production RAG systems. Rust core, Python frontend.

[![CI](https://github.com/MukundaKatta/ragdrift/actions/workflows/ci.yml/badge.svg)](https://github.com/MukundaKatta/ragdrift/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/ragdrift.svg)](https://pypi.org/project/ragdrift/)
[![crates.io](https://img.shields.io/crates/v/ragdrift-core.svg)](https://crates.io/crates/ragdrift-core)
[![docs](https://img.shields.io/badge/docs-mkdocs-blue.svg)](https://mukundakatta.github.io/ragdrift/)
[![license](https://img.shields.io/badge/license-MIT%2FApache--2.0-blue.svg)](#license)

## The problem

You have a RAG system in production. It worked when you launched. Then,
quietly, it stopped working as well. By the time the support tickets arrive
and you finally get an eval-set rerun on the calendar, the regression has
been live for three weeks.

The reason is that retrieval quality drifts in five different places, and
none of them produce a loud signal on their own. The embedding model gets
re-trained and now maps the same input to a slightly different region.
The corpus gets reindexed and the document distribution shifts. Users find
your product and start asking different questions. The re-ranker score
distribution drifts because of one of the above. Each of these can degrade
answer quality without breaking a single test.

ragdrift watches all five and alerts on the one that moves.

## Install

```bash
pip install ragdrift                    # core
pip install 'ragdrift[opensearch,aws]'  # OpenSearch adapter + CloudWatch exporter
pip install 'ragdrift[dev]'             # plus pytest, mypy, ruff, maturin
```

## 30-second quickstart

```python
import numpy as np
import ragdrift

rng = np.random.default_rng(0)

# Baseline: a frozen sample from when retrieval quality was known-good.
baseline = ragdrift.BaselineSnapshot(
    embeddings=rng.standard_normal((4096, 384)).astype(np.float32),
    confidence_scores=rng.uniform(0.85, 0.99, size=4096).astype(np.float64),
)

# Current: today's window. In production, pull from your vector store.
current_emb = rng.standard_normal((4096, 384)).astype(np.float32) + 1.5
current_conf = rng.uniform(0.55, 0.75, size=4096).astype(np.float64)

monitor = ragdrift.RagDriftMonitor(baseline)
report = monitor.check(embeddings=current_emb, confidence_scores=current_conf)

if report.any_exceeded():
    for s in report.scores:
        print(f"[{'DRIFT' if s.exceeded else 'ok':5s}] {s.dimension:11s} "
              f"score={s.score:.4f} method={s.method}")
```

Run `python examples/quickstart.py` for a synthetic end-to-end demo.
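
To see why the synthetic confidence window above counts as drifted, here is a
minimal NumPy-only sketch of a two-sample Kolmogorov-Smirnov statistic. It is
an illustration under assumptions, not necessarily the test ragdrift applies
to confidence scores; `ks_statistic` is a hypothetical helper and not part of
the ragdrift API.

```python
import numpy as np


def ks_statistic(a, b) -> float:
    """Two-sample KS statistic: the largest gap between the empirical CDFs.

    Hypothetical helper for illustration only; not a ragdrift function.
    """
    a = np.sort(np.asarray(a, dtype=np.float64))
    b = np.sort(np.asarray(b, dtype=np.float64))
    pooled = np.concatenate([a, b])
    # Empirical CDF of each sample, evaluated at every pooled observation.
    cdf_a = np.searchsorted(a, pooled, side="right") / a.size
    cdf_b = np.searchsorted(b, pooled, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


rng = np.random.default_rng(0)
baseline_conf = rng.uniform(0.85, 0.99, size=4096)
current_conf = rng.uniform(0.55, 0.75, size=4096)  # same ranges as the quickstart
stable_conf = rng.uniform(0.85, 0.99, size=4096)   # a window with no drift

ks_drifted = ks_statistic(baseline_conf, current_conf)
ks_stable = ks_statistic(baseline_conf, stable_conf)
print(f"drifted={ks_drifted:.3f} stable={ks_stable:.3f}")
```

The two confidence ranges in the quickstart do not overlap, so the drifted
statistic reaches its maximum of 1.0, while the stable window stays close to 0.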

## Why not X

- **Arize Phoenix.** Excellent eval and tracing. Less focused on
  multi-dimensional batch drift; its drift surface is embeddings-only and
  tied to the Phoenix collector.
- **Evidently.** Great at tabular data drift with broad statistical coverage,
  but no native treatment of embedding or query drift in the RAG sense.
- **WhyLabs / whylogs.** Excellent profiling primitive. The whylogs profile
  format is optimized for compaction; getting an embedding distribution test
  out of it is more work than calling MMD directly.
- **NannyML.** Strong for supervised tabular drift with a holdout label.
  Not a fit for the unlabeled embedding/query side of RAG.

ragdrift is the smaller-scope option: five concrete dimensions, one batch
function call per dimension, fast Rust core, no service to run.
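
For reference, the kind of embedding distribution test mentioned above (MMD,
maximum mean discrepancy) fits in a few lines of NumPy. The sketch below is a
biased estimate with an RBF kernel and a median-heuristic bandwidth. Those
choices, and the name `rbf_mmd2`, are assumptions made for illustration; this
is not ragdrift's Rust implementation or its API.

```python
from typing import Optional

import numpy as np


def rbf_mmd2(x: np.ndarray, y: np.ndarray, gamma: Optional[float] = None) -> float:
    """Biased estimate of squared MMD between two samples, RBF kernel.

    Hypothetical helper for illustration only; not a ragdrift function.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    z = np.concatenate([x, y], axis=0)
    # Pairwise squared Euclidean distances over the pooled sample.
    sq_norms = np.sum(z * z, axis=1)
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * (z @ z.T), 0.0)
    if gamma is None:
        # Median heuristic: scale the kernel by the median pairwise distance.
        med = float(np.median(d2[np.triu_indices_from(d2, k=1)]))
        gamma = 1.0 / med if med > 0 else 1.0
    k = np.exp(-gamma * d2)
    n = x.shape[0]
    k_xx, k_yy, k_xy = k[:n, :n], k[n:, n:], k[:n, n:]
    return float(k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean())


rng = np.random.default_rng(0)
base = rng.standard_normal((256, 16))
same = rng.standard_normal((256, 16))           # same distribution as base
shifted = rng.standard_normal((256, 16)) + 1.5  # mean-shifted, as in the quickstart

mmd_same = rbf_mmd2(base, same)
mmd_shifted = rbf_mmd2(base, shifted)
print(f"same={mmd_same:.4f} shifted={mmd_shifted:.4f}")
```

The pairwise kernel matrix is quadratic in the sample size, so a pure-NumPy
version like this one is only practical on modest windows.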

## Architecture

```
+----------------+     +------------------+     +-------------------+
| python facade  | --> | ragdrift._native | --> | ragdrift-core     |
| (typed)        |     | (PyO3 bindings)  |     | (Rust statistics) |
+----------------+     +------------------+     +-------------------+
       |                                                 ^
       v                                                 |
+----------------+                                       |
| adapters/      |  pull baseline + current  ------------+
|   opensearch   |  embeddings, features, etc.
|   pgvector     |
|   pinecone     |
+----------------+
       |
       v
+----------------+
| exporters/     |  publish DriftReport
|   cloudwatch   |
|   prometheus   |
|   datadog      |
+----------------+
```

All numerics live in Rust. The Python layer is a thin typed facade plus
adapters and exporters. One `abi3-py310` wheel covers Python 3.10–3.13.

## Status

`0.x`. The public API may break between minor versions while the library
finds its shape; patch releases within a given `0.x` line stay backward
compatible.

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md). The local quality gates are:

```bash
cargo fmt --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-features
maturin develop
pytest -v
mypy --strict python/ragdrift
ruff check . && ruff format --check .
```

## License

Dual-licensed under MIT and Apache-2.0. Pick whichever is friendlier to
your downstream. See [LICENSE-MIT](LICENSE-MIT) and [LICENSE-APACHE](LICENSE-APACHE).

