Metadata-Version: 2.4
Name: cachecore-python
Version: 0.1.0
Summary: Python client for CacheCore — semantic cache gateway for LLM agent workloads
Project-URL: Homepage, https://cachecore.it
Project-URL: Repository, https://github.com/cachecore/cachecore-python
Author: Fabrizio
License-Expression: MIT
License-File: LICENSE
Keywords: agents,cache,llm,openai,proxy,semantic-cache
Classifier: Development Status :: 3 - Alpha
Classifier: Framework :: AsyncIO
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: httpx>=0.25.0
Description-Content-Type: text/markdown

# cachecore

Python client for [CacheCore](https://cachecore.it) — the LLM API caching proxy that reduces cost and latency for AI agent workloads.

CacheCore sits transparently between your application and LLM providers (OpenAI, Anthropic via OpenAI-compat, etc.) and caches responses at two levels: L1 exact-match and L2 semantic similarity. This client handles the CacheCore-specific plumbing — header injection, dependency encoding, invalidation — without replacing your LLM SDK.

## Install

```bash
pip install cachecore-python
```

```python
import cachecore  # installed as cachecore-python, imported as cachecore
```

## Quick start

### Rung 1 — one-line change: swap `base_url`

Point your existing SDK at CacheCore and get L1 exact-match caching immediately.
No `import cachecore` required. Snippets in this README use top-level `await`; run them inside an async function or an asyncio REPL.

```python
from openai import AsyncOpenAI

oai = AsyncOpenAI(
    api_key="your-openai-key",
    base_url="https://gateway.cachecore.it/v1",  # ← only change
)

# Identical requests are now served from cache.
resp = await oai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
```
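
Optionally, you can confirm that a repeat request was served from cache by reading CacheCore's response headers and parsing them with `CacheStatus` (documented below). A minimal sketch using the openai SDK's `with_raw_response` accessor:

```python
from cachecore import CacheStatus

# Fetch the raw response so CacheCore's headers are visible.
raw = await oai.chat.completions.with_raw_response.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
status = CacheStatus.from_headers(raw.headers)
print(status.status)  # "MISS" on the first call, "HIT_L1" on a repeat
resp = raw.parse()    # the usual ChatCompletion object
```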

### Rung 2 — tenant isolation (3 lines)

Add `CacheCoreClient` to unlock tenant-scoped namespaces, L2 semantic caching, and per-tenant
metrics. Three extra lines wired into the SDK's `http_client`.

```python
from cachecore import CacheCoreClient
import httpx
from openai import AsyncOpenAI

cc = CacheCoreClient(
    gateway_url="https://gateway.cachecore.it",
    tenant_jwt="ey...",  # your tenant JWT from the CacheCore dashboard
)

oai = AsyncOpenAI(
    api_key="ignored",  # gateway injects its own upstream key
    base_url="https://gateway.cachecore.it/v1",
    http_client=httpx.AsyncClient(transport=cc.transport),
)

# Requests now carry your tenant identity.
# Semantically similar prompts hit L2 cache.
resp = await oai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain photosynthesis"}],
)
```
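
Each tenant JWT scopes its own cache namespace, so the same prompt cached for one tenant is a miss for another. A sketch with two placeholder tenant JWTs:

```python
# Two tenants, two isolated cache namespaces (JWT values are placeholders).
cc_a = CacheCoreClient(gateway_url="https://gateway.cachecore.it", tenant_jwt="ey...tenant-a")
cc_b = CacheCoreClient(gateway_url="https://gateway.cachecore.it", tenant_jwt="ey...tenant-b")

oai_a = AsyncOpenAI(
    api_key="ignored",
    base_url="https://gateway.cachecore.it/v1",
    http_client=httpx.AsyncClient(transport=cc_a.transport),
)
oai_b = AsyncOpenAI(
    api_key="ignored",
    base_url="https://gateway.cachecore.it/v1",
    http_client=httpx.AsyncClient(transport=cc_b.transport),
)
# The same prompt sent through oai_a and then oai_b produces two upstream
# calls: cache entries never cross tenant boundaries.
```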

### Rung 3 — dep invalidation

Declare which data a cached response depends on. When that data changes, invalidate the dep;
every cached entry tagged with it is evicted automatically.

```python
from cachecore import CacheCoreClient, Dep
import httpx
from openai import AsyncOpenAI

cc = CacheCoreClient(
    gateway_url="https://gateway.cachecore.it",
    tenant_jwt="ey...",
)

oai = AsyncOpenAI(
    api_key="ignored",
    base_url="https://gateway.cachecore.it/v1",
    http_client=httpx.AsyncClient(transport=cc.transport),
)

# Read path — declare what data this response depends on
with cc.request_context(deps=[Dep("table:products"), Dep("table:orders")]):
    resp = await oai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "List all products under $50"}],
    )

# Write path — bypass cache for the LLM call, then invalidate
with cc.request_context(bypass=True):
    resp = await oai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Confirm order created."}],
    )
await cc.invalidate("table:products")

# Invalidate multiple deps at once
await cc.invalidate_many(["table:orders", "table:products"])
```

## Works with LangChain / LangGraph

The transport works with any SDK that accepts an `httpx.AsyncClient`:

```python
from langchain_openai import ChatOpenAI
import httpx
from cachecore import CacheCoreClient, Dep

cc = CacheCoreClient(gateway_url="https://gateway.cachecore.it", tenant_jwt="ey...")

llm = ChatOpenAI(
    model="gpt-4o",
    api_key="ignored",
    base_url="https://gateway.cachecore.it/v1",
    http_async_client=httpx.AsyncClient(transport=cc.transport),
)

# Use request_context() around any ainvoke / astream call
with cc.request_context(deps=[Dep("doc:policy-42")]):
    result = await llm.ainvoke("Summarise the compliance policy")
```
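
Streaming goes through the same context manager; a sketch with `astream`:

```python
with cc.request_context(deps=[Dep("doc:policy-42")]):
    async for chunk in llm.astream("Summarise the compliance policy"):
        print(chunk.content, end="", flush=True)
```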

## API reference

### `CacheCoreClient`

```python
CacheCoreClient(
    gateway_url: str,       # "https://gateway.cachecore.it"
    tenant_jwt: str,        # tenant HS256/RS256 JWT
    timeout: float = 30.0,  # for invalidation calls
    debug: bool = False,    # log cache status per request
)
```

| Property / Method | Description |
|---|---|
| `.transport` | `httpx.AsyncBaseTransport` — pass to `httpx.AsyncClient(transport=...)` |
| `.request_context(deps, bypass)` | Context manager — sets per-request deps / bypass |
| `await .invalidate(dep_id)` | Evict all entries tagged with this dep |
| `await .invalidate_many(dep_ids)` | Invalidate multiple deps concurrently |
| `await .aclose()` | Close internal HTTP clients; also usable as `async with CacheCoreClient(...) as cc:` |
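
The context-manager form ties cleanup to scope; a minimal sketch:

```python
async with CacheCoreClient(
    gateway_url="https://gateway.cachecore.it",
    tenant_jwt="ey...",
) as cc:
    await cc.invalidate("table:products")
# internal HTTP clients are closed here, as if aclose() had been awaited
```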

### `Dep` / `DepDeclaration`

```python
Dep("table:products")                  # simple — hash defaults to "v1"
Dep("table:products", hash="abc123")   # explicit hash for versioned deps
```
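
One convention for the `hash` field is to fingerprint the data's current version, so entries tagged with an older hash stop matching after a change. A sketch; the helper and versioning scheme are illustrative, not part of the library:

```python
import hashlib

from cachecore import Dep

def versioned_dep(table: str, version: str) -> Dep:
    # Illustrative helper: any stable fingerprint of the data's version
    # (schema rev, updated_at timestamp, migration id) works here.
    digest = hashlib.sha256(version.encode()).hexdigest()[:12]
    return Dep(f"table:{table}", hash=digest)

dep = versioned_dep("products", "2024-06-01T12:00:00Z")
```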

### `CacheStatus`

Parsed from response headers after a proxied request:

```python
from cachecore import CacheStatus

status = CacheStatus.from_headers(response.headers)
# status.status      → "HIT_L1" | "HIT_L1_STALE" | "HIT_L2" | "MISS" | "BYPASS" | "UNKNOWN"
# status.similarity  → float 0.0–1.0  (non-zero on L2 hits)
# status.age_seconds → int
```

### Exceptions

| Exception | When |
|---|---|
| `CacheCoreError` | Base class for all CacheCore errors |
| `CacheCoreAuthError` | 401 / 403 from the gateway |
| `CacheCoreRateLimitError` | 429 — check `.retry_after` attribute (seconds, or `None`) |
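
A sketch of retrying invalidation on 429s using the `retry_after` hint (the helper itself is illustrative):

```python
import asyncio

from cachecore import CacheCoreRateLimitError

async def invalidate_with_retry(cc, dep_id: str, attempts: int = 3) -> None:
    for attempt in range(attempts):
        try:
            await cc.invalidate(dep_id)
            return
        except CacheCoreRateLimitError as exc:
            if attempt == attempts - 1:
                raise
            # Sleep for the gateway's hint, or a fixed 1s when absent.
            await asyncio.sleep(exc.retry_after or 1.0)
```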

## How it works

The client injects headers at the httpx transport layer — below the LLM SDK, above the network. Your SDK continues to work exactly as before:

```
Your code  →  openai SDK  →  httpx  →  [CacheCoreTransport]  →  CacheCore proxy  →  OpenAI API
                                              ↑
                                  injects X-CacheCore-Token
                                  injects X-CacheCore-Deps
```
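
Because the hook lives at the transport layer, any httpx-based caller picks it up. A sketch that talks to the gateway with plain httpx and no LLM SDK at all:

```python
import httpx

from cachecore import CacheCoreClient

cc = CacheCoreClient(gateway_url="https://gateway.cachecore.it", tenant_jwt="ey...")

http = httpx.AsyncClient(
    transport=cc.transport,
    base_url="https://gateway.cachecore.it",
)
# The transport attaches X-CacheCore-Token (and X-CacheCore-Deps when a
# request_context is active) before the request leaves the process.
r = await http.post(
    "/v1/chat/completions",
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "What is 2+2?"}],
    },
)
```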

## Requirements

- Python 3.10+
- `httpx >= 0.25.0`

## Links

- Website: [cachecore.it](https://cachecore.it)
- Source: [github.com/cachecore/cachecore-python](https://github.com/cachecore/cachecore-python)

## License

MIT — see [LICENSE](LICENSE)
