Metadata-Version: 2.4
Name: langchain-hana-cache
Version: 0.1.0
Summary: Semantic caching for LLM responses on SAP HANA Cloud
Author: David Diaz
License-Expression: MIT
Project-URL: Homepage, https://github.com/stubborncoder/langchain-hana-cache
Project-URL: Repository, https://github.com/stubborncoder/langchain-hana-cache
Project-URL: Issues, https://github.com/stubborncoder/langchain-hana-cache/issues
Keywords: langchain,sap,hana,hana-cloud,cache,semantic,llm,btp
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: langchain-core>=0.2.0
Requires-Dist: hdbcli>=2.18.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21; extra == "dev"
Requires-Dist: python-dotenv>=1.0; extra == "dev"
Dynamic: license-file

# langchain-hana-cache

Semantic caching for LLM responses on SAP HANA Cloud.

The cache stores prompt embeddings and LLM responses in HANA Cloud. When a semantically similar prompt arrives, it returns the cached response instead of calling the LLM, which saves tokens and reduces latency.

## How it works

1. The user sends a prompt to the LLM.
2. The cache embeds the prompt using the configured embedding model.
3. The cache searches HANA for cached entries using `COSINE_SIMILARITY` on a `REAL_VECTOR` column.
4. If the similarity exceeds the threshold (default 0.95), the cache returns the stored response and no LLM call is made.
5. If there is no match, the LLM is called normally, the prompt embedding and response are cached, and the response is returned.
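
The lookup in steps 2 to 4 can be illustrated with a small in-memory sketch. This is not the library's implementation: in the real cache the similarity is computed inside HANA, and the `Entry` class and both functions below are hypothetical names used only to show the threshold logic. The sketch also assumes a similarity equal to the threshold counts as a hit.

```python
import math
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical in-memory stand-in for one cached row."""
    embedding: list[float]
    response: str


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def semantic_lookup(query: list[float], entries: list[Entry], threshold: float = 0.95):
    """Return the response of the most similar entry, or None if it is below the threshold."""
    best = max(entries, key=lambda e: cosine_similarity(query, e.embedding), default=None)
    if best is None:
        return None  # empty cache
    # Assumption: a similarity equal to the threshold is treated as a hit.
    if cosine_similarity(query, best.embedding) >= threshold:
        return best.response
    return None
```

A higher threshold reduces the chance of returning an answer to a different question, at the cost of fewer cache hits.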

## Installation

```bash
pip install langchain-hana-cache
```

## Usage

### As LangChain global cache

```python
import hdbcli.dbapi
from langchain_hana_cache import HANASemanticLLMCache
# langchain-openai is not a dependency of this package; install it separately
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.globals import set_llm_cache

connection = hdbcli.dbapi.connect(
    address="your-host.hanacloud.ondemand.com",
    port=443,
    user="DBADMIN",
    password="your-password",
    encrypt=True,
)

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

cache = HANASemanticLLMCache(
    connection=connection,
    embedding=embeddings,
    table_name="LLM_CACHE",
    similarity_threshold=0.95,
    ttl_seconds=86400,
)

set_llm_cache(cache)

llm = ChatOpenAI(model="gpt-4o")
response1 = llm.invoke("What are the reporting requirements for article 12?")
response2 = llm.invoke("Tell me about article 12 reporting requirements")  # served from cache if similar enough
```
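
The example above hardcodes credentials for brevity. One possible way to keep them out of source code is to read them from environment variables, sketched below. The `HANA_*` variable names and the `hana_connect_kwargs` helper are hypothetical and not part of this package; the returned keys mirror the `connect` arguments used above.

```python
import os
from typing import Mapping


def hana_connect_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Build keyword arguments for hdbcli.dbapi.connect from environment variables.

    The HANA_* names are hypothetical; use whatever your deployment defines.
    """
    required = ("HANA_HOST", "HANA_USER", "HANA_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "address": env["HANA_HOST"],
        "port": int(env.get("HANA_PORT", "443")),  # 443 matches the example above
        "user": env["HANA_USER"],
        "password": env["HANA_PASSWORD"],
        "encrypt": True,
    }
```

With this helper, the connection would be created as `hdbcli.dbapi.connect(**hana_connect_kwargs())`.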

### Manual usage

```python
from langchain_core.outputs import Generation

# Store a response
cache.update(
    "What is the capital of France?",
    "gpt-4o",
    [Generation(text="The capital of France is Paris.")],
)

# Look up a similar prompt
result = cache.lookup("Tell me the capital of France", "gpt-4o")
# On a cache hit: result == [Generation(text="The capital of France is Paris.")]
```
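
The two calls above can be combined into a "look up, otherwise compute and store" helper. The sketch below is an illustration rather than part of the package: `get_or_compute` and `DictCache` are hypothetical names, and it assumes `lookup` returns `None` on a miss, as LangChain's `BaseCache` interface specifies. `DictCache` is an exact-match stand-in so the example runs without a HANA connection.

```python
from typing import Any, Callable, Sequence


class DictCache:
    """Exact-match stand-in exposing the same lookup/update shape (hypothetical)."""

    def __init__(self) -> None:
        self._store: dict = {}

    def lookup(self, prompt: str, llm_string: str):
        return self._store.get((prompt, llm_string))  # None on a miss

    def update(self, prompt: str, llm_string: str, return_val: Sequence[Any]) -> None:
        self._store[(prompt, llm_string)] = return_val


def get_or_compute(cache, prompt: str, llm_string: str,
                   compute: Callable[[str], Sequence[Any]]):
    """Return (generations, was_hit), calling compute only on a cache miss."""
    cached = cache.lookup(prompt, llm_string)
    if cached is not None:
        return cached, True
    generations = compute(prompt)
    cache.update(prompt, llm_string, generations)
    return generations, False
```

Passing the `cache` object from the earlier example instead of `DictCache()` would match similar prompts, not only identical ones.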

### Eviction

```python
# Remove entries older than TTL
cache.evict_expired()

# Keep only the 1000 most recently accessed entries
cache.evict_lru(max_entries=1000)

# Clear all cached entries
cache.clear()
```
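
To make the two policies concrete, here is an in-memory sketch of what TTL and LRU eviction mean. It does not reflect how the library runs these operations against HANA. `CacheRow` and both functions are hypothetical, and the sketch assumes the TTL is measured from the time an entry was stored.

```python
from dataclasses import dataclass


@dataclass
class CacheRow:
    """Hypothetical view of one cache entry's timestamps (seconds since epoch)."""
    key: str
    created_at: float
    last_accessed: float


def evict_expired(rows: list[CacheRow], ttl_seconds, now: float) -> list[CacheRow]:
    """Keep rows whose age is within ttl_seconds; None disables expiry."""
    if ttl_seconds is None:
        return list(rows)
    # Assumption: age is counted from creation, not from the last access.
    return [r for r in rows if now - r.created_at <= ttl_seconds]


def evict_lru(rows: list[CacheRow], max_entries: int) -> list[CacheRow]:
    """Keep the max_entries most recently accessed rows."""
    ranked = sorted(rows, key=lambda r: r.last_accessed, reverse=True)
    return ranked[:max_entries]
```

TTL eviction bounds how stale an answer can be, while LRU eviction bounds the table size, so the two are often used together.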

## Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `connection` | `hdbcli.dbapi.Connection` | required | HANA database connection |
| `embedding` | `Embeddings` | required | LangChain embedding model for encoding prompts |
| `table_name` | `str` | `"LLM_CACHE"` | Name of the cache table |
| `similarity_threshold` | `float` | `0.95` | Minimum cosine similarity for a cache hit |
| `ttl_seconds` | `int \| None` | `None` | Time-to-live in seconds (None = no expiry) |

## Development

```bash
git clone https://github.com/stubborncoder/langchain-hana-cache.git
cd langchain-hana-cache
pip install -e ".[dev]"

# Run unit tests
pytest tests/test_utils.py tests/test_llm_cache.py -v

# Run integration tests (requires HANA credentials in .env)
pytest tests/test_integration.py -v
```

## License

MIT
