Metadata-Version: 2.4
Name: pytest-semantix
Version: 0.1.0
Summary: pytest plugin for semantic LLM output testing — validate meaning, not just shape.
Project-URL: Homepage, https://github.com/labrat-akhona/pytest-semantix
Project-URL: Repository, https://github.com/labrat-akhona/pytest-semantix
Project-URL: Bug Tracker, https://github.com/labrat-akhona/pytest-semantix/issues
Author-email: Akhona Eland <akhonabest7@gmail.com>
License-Expression: MIT
License-File: LICENSE
Keywords: ai,llm,nlp,pytest,pytest-plugin,semantic,testing,validation
Classifier: Development Status :: 3 - Alpha
Classifier: Framework :: Pytest
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.10
Requires-Dist: pytest>=7.0
Requires-Dist: semantix-ai>=0.1.10
Description-Content-Type: text/markdown

# pytest-semantix

**Semantic LLM output testing for pytest.** Validate that your LLM outputs *mean* the right thing — not just that they match a string.

```bash
pip install pytest-semantix
```

## Usage

### The `assert_semantic` fixture

```python
def test_chatbot_is_polite(assert_semantic):
    # my_chatbot is a placeholder for your own function that calls the LLM
    response = my_chatbot("handle angry customer")
    assert_semantic(response, "polite and professional")
```

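Each check runs locally on CPU in about 15 ms and needs no API key. Because it only inspects the output text, it works with any LLM.

On failure, the assertion message reports the score, the intent, the output, and the reason: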

```
AssertionError: Semantic check failed (score=0.12)
  Intent:  polite and professional
  Output:  "You're an idiot for asking that."
  Reason:  Text contains aggressive language
```

### Markers

Use `@pytest.mark.semantic` to attach an intent to a test:

```python
import pytest

@pytest.mark.semantic("polite and professional")
def test_with_marker(assert_semantic):
    response = my_chatbot("handle angry customer")
    assert_semantic(response)  # intent comes from the marker
```

### Intent classes

Define an intent once as an `Intent` subclass, with the requirement stated in its docstring, and reuse it across tests:

```python
from semantix import Intent

class Polite(Intent):
    """The text must be polite and professional."""

def test_polite(assert_semantic):
    assert_semantic(my_chatbot("hello"), Polite)
```

### Negation

Prefix an intent class with `~` to assert that the output does *not* match it:

```python
from semantix import Intent

class MedicalAdvice(Intent):
    """The text provides medical diagnoses or treatment recommendations."""

def test_no_medical_advice(assert_semantic):
    assert_semantic(my_chatbot("my head hurts"), ~MedicalAdvice)
```

## CLI Options

```
--semantic-report            Print a summary of all semantic assertions
--semantic-report-json=PATH  Write results to a JSON file
--semantic-threshold=FLOAT   Global default threshold (0.0-1.0)
```

### Report example

```
$ pytest --semantic-report

======================== semantic assertion report =========================
  Total: 5  |  Passed: 4  |  Failed: 1

  [PASS] tests/test_bot.py::test_polite  [12ms]
  [PASS] tests/test_bot.py::test_helpful  [14ms]
  [FAIL] tests/test_bot.py::test_no_pii  (score=0.67)  Contains email address  [11ms]
  [PASS] tests/test_bot.py::test_on_topic  [13ms]
  [PASS] tests/test_bot.py::test_concise  [15ms]

============================================================================
```

### JSON report

```
$ pytest --semantic-report-json=semantic-results.json
```

```json
{
  "summary": { "total": 5, "passed": 4, "failed": 1 },
  "results": [
    {
      "nodeid": "tests/test_bot.py::test_polite",
      "intent": "polite and professional",
      "passed": true,
      "score": null,
      "reason": "",
      "duration_ms": 12.3
    }
  ]
}
```
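
The JSON report can be post-processed with a few lines of Python, for example to list failed assertions in CI. The sketch below assumes only the fields shown above (`summary`, `results`, `nodeid`, `passed`, `score`, `reason`). The inline sample data, including the second entry's intent, and the helper names `failed_results` and `load_report` are made up for illustration and are not part of the plugin.

```python
import json

# Made-up sample in the shape shown above. In practice, load the file
# written by --semantic-report-json with load_report() instead.
SAMPLE_REPORT = """
{
  "summary": {"total": 2, "passed": 1, "failed": 1},
  "results": [
    {"nodeid": "tests/test_bot.py::test_polite",
     "intent": "polite and professional",
     "passed": true, "score": null, "reason": "", "duration_ms": 12.3},
    {"nodeid": "tests/test_bot.py::test_no_pii",
     "intent": "contains no personal data",
     "passed": false, "score": 0.67,
     "reason": "Contains email address", "duration_ms": 11.0}
  ]
}
"""


def load_report(path: str) -> dict:
    """Read a report file from disk (hypothetical helper)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def failed_results(report: dict) -> list[dict]:
    """Return the entries of report["results"] whose "passed" flag is false."""
    return [r for r in report.get("results", []) if not r.get("passed", False)]


report = json.loads(SAMPLE_REPORT)
failures = failed_results(report)
for entry in failures:
    print(f'{entry["nodeid"]}: {entry["reason"]} (score={entry["score"]})')
```

To use it on a real run, replace `json.loads(SAMPLE_REPORT)` with `load_report("semantic-results.json")`.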

## How it works

pytest-semantix wraps [semantix-ai](https://pypi.org/project/semantix-ai/)'s `assert_semantic()` function as a pytest fixture. Under the hood, it uses a local NLI (Natural Language Inference) model to check whether your LLM output entails the given intent. No network calls, no API keys, no tokens burned.
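
As a rough illustration of the wrapping idea, and not the plugin's actual code, the sketch below builds an assertion helper around a scoring function and raises an `AssertionError` in the format shown under Usage. Everything here is an assumption made for the example: the names `toy_scorer` and `make_assert_semantic`, the `0.5` default threshold, and the keyword check that stands in for the NLI model.

```python
from typing import Callable

# Hypothetical stand-in for the NLI model: any callable that maps
# (output, intent) to a score in [0.0, 1.0] plus a short reason.
Scorer = Callable[[str, str], tuple[float, str]]


def toy_scorer(output: str, intent: str) -> tuple[float, str]:
    """Keyword check used only so this sketch runs without a model."""
    rude_words = {"idiot", "stupid"}
    words = {w.strip(".,!?\"'").lower() for w in output.split()}
    if words & rude_words:
        return 0.0, "Text contains aggressive language"
    return 1.0, ""


def make_assert_semantic(scorer: Scorer, threshold: float = 0.5):
    """Build an assertion helper around a scorer (default threshold is assumed)."""

    def assert_semantic(output: str, intent: str) -> None:
        score, reason = scorer(output, intent)
        if score < threshold:
            raise AssertionError(
                f"Semantic check failed (score={score:.2f})\n"
                f"  Intent:  {intent}\n"
                f'  Output:  "{output}"\n'
                f"  Reason:  {reason}"
            )

    return assert_semantic


check = make_assert_semantic(toy_scorer)
check("Happy to help with that.", "polite and professional")  # passes silently
```

A pytest fixture could return a helper like `check` so that tests receive it by name, which matches how the `assert_semantic` fixture is used in the examples above.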

## License

MIT
