Metadata-Version: 2.3
Name: callspec
Version: 0.1.0
Summary: Contract testing for LLM tool calls.
Project-URL: Homepage, https://github.com/moonrunnerkc/callspec
Project-URL: Documentation, https://github.com/moonrunnerkc/callspec
Project-URL: Repository, https://github.com/moonrunnerkc/callspec
Project-URL: Issues, https://github.com/moonrunnerkc/callspec/issues
Author-email: "Bradley R. Kinnard" <brad@aftermath.tech>
License: Apache-2.0
License-File: LICENSE
Keywords: agents,ai,contracts,llm,pytest,testing,tool-calls
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.9
Requires-Dist: click>=8.0.0
Requires-Dist: jsonschema>=4.0.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.0.0
Provides-Extra: all
Requires-Dist: anthropic>=0.18.0; extra == 'all'
Requires-Dist: coverage>=7.0.0; extra == 'all'
Requires-Dist: google-generativeai>=0.3.0; extra == 'all'
Requires-Dist: litellm>=1.0.0; extra == 'all'
Requires-Dist: mistralai>=0.1.0; extra == 'all'
Requires-Dist: mypy>=1.0.0; extra == 'all'
Requires-Dist: ollama>=0.1.0; extra == 'all'
Requires-Dist: openai>=1.0.0; extra == 'all'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'all'
Requires-Dist: pytest-cov>=4.0.0; extra == 'all'
Requires-Dist: pytest>=7.0.0; extra == 'all'
Requires-Dist: ruff>=0.1.0; extra == 'all'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.18.0; extra == 'anthropic'
Provides-Extra: dev
Requires-Dist: coverage>=7.0.0; extra == 'dev'
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Provides-Extra: google
Requires-Dist: google-generativeai>=0.3.0; extra == 'google'
Provides-Extra: litellm
Requires-Dist: litellm>=1.0.0; extra == 'litellm'
Provides-Extra: mistral
Requires-Dist: mistralai>=0.1.0; extra == 'mistral'
Provides-Extra: ollama
Requires-Dist: ollama>=0.1.0; extra == 'ollama'
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == 'openai'
Description-Content-Type: text/markdown

# callspec

Contract testing for LLM tool calls.

```bash
pip install callspec
```

```python
from callspec import Callspec, ToolCall, ToolCallTrajectory
from callspec.providers.mock import MockProvider

# MockProvider stands in for a real model: it returns a canned text response
# and a fixed list of tool calls.
provider = MockProvider(
    response_fn=lambda p, m: "Booked flight",
    tool_calls=[
        {"name": "search_flights", "arguments": {"origin": "SFO", "dest": "JFK"}},
        {"name": "book_flight", "arguments": {"flight_id": "UA123"}},
    ],
)

v = Callspec(provider)
response = provider.call("Book me a flight from SFO to JFK")
# Build the ordered tool-call trajectory from the provider response.
trajectory = ToolCallTrajectory.from_provider_response(response)

# Chain the contract assertions, then evaluate them with run().
result = (
    v.assert_trajectory(trajectory)
    .calls_tools_in_order(["search_flights", "book_flight"])
    .does_not_call("cancel_flight")
    .argument_not_empty("search_flights", "origin")
    .run()
)
assert result.passed
```

Your agent calls tools. Those calls are the contract between your code and the model. When you swap models, update a prompt, or change your retrieval pipeline, callspec tells you whether the agent still calls the right tools, in the right order, with the right arguments. No LLM-as-judge. No API calls for evaluation. Deterministic pass/fail that runs in CI.
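
To make "deterministic pass/fail" concrete, here is a minimal plain-Python sketch of the kind of checks described above. It is not callspec's implementation: the helper functions and the list-of-dicts trajectory shape are hypothetical, and the ordering check assumes that other calls may be interleaved between the expected ones, which may differ from how callspec treats ordering.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for a recorded trajectory: a list of
# {"name": ..., "arguments": {...}} dicts, mirroring the example above.
Trajectory = List[Dict[str, Any]]


def calls_in_order(trajectory: Trajectory, expected: List[str]) -> bool:
    """True if the expected tool names appear in this order.

    Assumption: unrelated calls may sit between the expected ones.
    """
    names = iter(call["name"] for call in trajectory)
    # `name in names` consumes the iterator up to the match,
    # so a later expected name can only match a later call.
    return all(name in names for name in expected)


def does_not_call(trajectory: Trajectory, tool: str) -> bool:
    """True if no call in the trajectory targets `tool`."""
    return all(call["name"] != tool for call in trajectory)


def argument_not_empty(trajectory: Trajectory, tool: str, key: str) -> bool:
    """True if `tool` was called and every call has a non-empty `key`."""
    calls = [c for c in trajectory if c["name"] == tool]
    return bool(calls) and all(
        c["arguments"].get(key) not in (None, "", [], {}) for c in calls
    )


recorded: Trajectory = [
    {"name": "search_flights", "arguments": {"origin": "SFO", "dest": "JFK"}},
    {"name": "book_flight", "arguments": {"flight_id": "UA123"}},
]

# Pure functions over recorded data: no model call, no judge, same result every run.
passed = (
    calls_in_order(recorded, ["search_flights", "book_flight"])
    and does_not_call(recorded, "cancel_flight")
    and argument_not_empty(recorded, "search_flights", "origin")
)
assert passed
```

Because each check is a pure function of the recorded calls, the same trajectory always yields the same verdict, which is what makes this style of test suitable for CI.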

## Docs

- [Getting Started](docs/getting_started.md)
- [Trajectory Assertions](docs/trajectory_assertions.md)
- [Contract Assertions](docs/contract_assertions.md)
- [Snapshots and Drift](docs/snapshots_and_drift.md)
- [pytest and CI](docs/pytest_and_ci.md)

## License

Apache 2.0
