Metadata-Version: 2.4
Name: inferencache
Version: 0.1.0
Summary: Semantic LLM response caching — stop paying for the same call twice.
Project-URL: Homepage, https://github.com/lavondev/inferencache
Project-URL: Repository, https://github.com/lavondev/inferencache
Project-URL: Issues, https://github.com/lavondev/inferencache/issues
Author-email: Justin Gaddy <justingaddy1@gmail.com>
License: MIT
Keywords: agents,anthropic,cache,caching,embeddings,llm,openai,proxy,semantic
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.10
Requires-Dist: duckdb>=0.10.0
Requires-Dist: qdrant-client>=1.9.0
Provides-Extra: all
Requires-Dist: anyio>=4.0.0; extra == 'all'
Requires-Dist: duckdb>=0.10.0; extra == 'all'
Requires-Dist: fastapi>=0.110.0; extra == 'all'
Requires-Dist: httpx>=0.27.0; extra == 'all'
Requires-Dist: mcp>=1.0.0; extra == 'all'
Requires-Dist: openai>=1.0.0; extra == 'all'
Requires-Dist: python-multipart>=0.0.9; extra == 'all'
Requires-Dist: qdrant-client>=1.9.0; extra == 'all'
Requires-Dist: sentence-transformers>=3.0.0; extra == 'all'
Requires-Dist: uvicorn[standard]>=0.29.0; extra == 'all'
Provides-Extra: dev
Requires-Dist: mypy>=1.10.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.4.0; extra == 'dev'
Requires-Dist: sentence-transformers>=3.0.0; extra == 'dev'
Provides-Extra: embed
Requires-Dist: sentence-transformers>=3.0.0; extra == 'embed'
Provides-Extra: mcp
Requires-Dist: anyio>=4.0.0; extra == 'mcp'
Requires-Dist: mcp>=1.0.0; extra == 'mcp'
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == 'openai'
Provides-Extra: serve
Requires-Dist: fastapi>=0.110.0; extra == 'serve'
Requires-Dist: httpx>=0.27.0; extra == 'serve'
Requires-Dist: python-multipart>=0.0.9; extra == 'serve'
Requires-Dist: uvicorn[standard]>=0.29.0; extra == 'serve'
Description-Content-Type: text/markdown

# inferencache

**Multi-tier semantic caching for LLM APIs. Stop paying for the same prompt twice.**

```bash
pip install "inferencache[embed,serve]"
export ANTHROPIC_API_KEY=sk-ant-...
inferencache serve
# landing:   http://localhost:8080/
# dashboard: http://localhost:8080/dashboard/
# proxy:     http://localhost:8080/v1/messages
```

Point Cursor or Claude Code at `http://localhost:8080` — no code changes required.

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup.
