Metadata-Version: 2.4
Name: arc-sentry
Version: 3.1.1
Summary: Arc Sentry — prompt injection detection for LLMs. 100% detection, 0% false positives across Mistral 7B, Qwen 2.5 7B, Llama 3.1 8B. Geometric detection via Fisher-Rao manifold (Nine 2026).
Author-email: Hannah Nine <9hannahnine@gmail.com>
Project-URL: Homepage, https://bendexgeometry.com/sentry
Project-URL: Repository, https://github.com/9hannahnine-jpg/arc-sentry
Keywords: llm,security,prompt injection,ai safety,monitoring
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Security
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.21
Requires-Dist: torch>=1.10
Requires-Dist: transformers>=4.20
Requires-Dist: scikit-learn>=1.0

# Arc Sentry v3.1.1

Pre-generation prompt injection detection for open source LLMs. Blocks attacks before `model.generate()` is called.

## Benchmark

| Metric | Result |
|--------|--------|
| Detection rate | 100% |
| False positive rate | 0% |
| Session requests | 450 |
| Latency | 42ms/req |
| Layer SNR (Mistral 7B) | 2.053 |
| FR separation | 0.0787 |

450-request session benchmark on Mistral-7B-Instruct-v0.2. 270 normal requests, 180 injection attempts (dense + subtle roleplay/hypothetical). Zero false positives across all safe blocks.

Also validated: Garak promptinject suite 192/192 blocked, Crescendo flagged Turn 3 (LLM Guard: 0/8).

## Install

    pip install arc-sentry

## Usage

    from arc_sentry import ArcSentryV3, MistralAdapter
    from transformers import AutoTokenizer, AutoModelForCausalLM
    import torch

    model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-Instruct-v0.2",
        torch_dtype=torch.float16, device_map="auto")
    tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

    adapter = MistralAdapter(model, tokenizer)
    sentry = ArcSentryV3(adapter, route_id="my-deployment")
    sentry.calibrate(warmup_prompts)

    response, result = sentry.observe_and_block(user_prompt)
    if result["blocked"]:
        pass  # model.generate() was never called

## How it works

Three detection layers:

1. Phrase check — 80+ injection patterns, zero latency
2. Geometric detection — mean-pooled hidden states at optimal layer, Fisher-Rao distance from calibrated centroid. Catches injections with no explicit language.
3. Session D(t) monitor — stability scalar over rolling request history. Catches gradual campaigns (Crescendo-style) invisible to single-request detection.

Grounded in the second-order Fisher manifold (H2 x H2, R = -4, tau* = 1.2247). Full theory: bendexgeometry.com/theory

## Detection mechanism

    1. Mean-pool hidden states at layer L (validated: L=16 on Mistral-7B)
    2. L2-normalize: h = h / ||h||
    3. Fisher-Rao distance to warmup centroid
    4. Distance > threshold -> BLOCK
       model.generate() is never called

## Also available

- Arc Gate — behavioral monitoring proxy for closed model APIs (GPT-4, Claude, Gemini). One URL change. bendexgeometry.com/gate
- Arc Vigil — training stability monitor. 100% detection, 0% FP, 90% auto-recovery. pip install arc-vigil

---

Bendex Geometry LLC · Patent Pending · 2026 Hannah Nine
bendexgeometry.com
