Metadata-Version: 2.4
Name: streetai-memory
Version: 0.1.0
Summary: Continuously learning memory layer for LLM applications: signals, stacks, decay, two-tier retrieval.
Author: Tem-Degu
License:                                  Apache License
                                   Version 2.0, January 2004
                                http://www.apache.org/licenses/
        
           Licensed under the Apache License, Version 2.0 (the "License");
           you may not use this file except in compliance with the License.
           You may obtain a copy of the License at
        
               http://www.apache.org/licenses/LICENSE-2.0
        
           Unless required by applicable law or agreed to in writing, software
           distributed under the License is distributed on an "AS IS" BASIS,
           WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
           See the License for the specific language governing permissions and
           limitations under the License.
        
           Copyright 2026 Street AI
        
           Full license text: https://www.apache.org/licenses/LICENSE-2.0.txt
        
Project-URL: Homepage, https://github.com/Tem-Degu/streetai-memory
Project-URL: Repository, https://github.com/Tem-Degu/streetai-memory
Project-URL: Issues, https://github.com/Tem-Degu/streetai-memory/issues
Keywords: llm,memory,ai,rag,embeddings,chatbot
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy<3,>=1.24
Requires-Dist: faiss-cpu>=1.7.4
Requires-Dist: fastembed>=0.3.0
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.40.0; extra == "anthropic"
Provides-Extra: openai
Requires-Dist: openai>=1.50.0; extra == "openai"
Provides-Extra: gemini
Requires-Dist: google-genai>=0.3.0; extra == "gemini"
Provides-Extra: all
Requires-Dist: anthropic>=0.40.0; extra == "all"
Requires-Dist: openai>=1.50.0; extra == "all"
Requires-Dist: google-genai>=0.3.0; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: ruff>=0.5; extra == "dev"
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Dynamic: license-file

# Street AI

> Continuously learning memory layer for LLM applications.
> Your AI's memory grows forever. Your token bill doesn't.

Street AI sits between your application and the LLM API. It stores conversation as
**signals** organized into **stacks**, decays old data automatically, and retrieves
only what's relevant on each turn — so you send a tiny prompt instead of the full
conversation history.

## Status

Alpha (`0.1.0`). The API will change between releases, so pin an exact version
(for example `streetai-memory==0.1.0`) if you depend on it.

## Install

```bash
pip install streetai-memory
```

The PyPI name is `streetai-memory`; the import path is `streetai`:

```python
from streetai import Memory, MemoryRegistry, Config
```

First use downloads a ~25MB embedding model (`all-MiniLM-L6-v2`) into a local cache.

To install with provider adapters:

```bash
pip install "streetai-memory[anthropic]"  # Anthropic
pip install "streetai-memory[openai]"     # OpenAI (also DeepSeek, Together, Groq)
pip install "streetai-memory[gemini]"     # Google Gemini
pip install "streetai-memory[all]"        # all of the above
```

## Quickstart

```python
from streetai import MemoryRegistry

registry = MemoryRegistry("./memory.db")
mem = registry.get("user_123")

mem.add_message("Hi, I'm planning a trip to Japan.", role="user")
mem.add_message("Great! Which cities?", role="assistant")

prompt = mem.build_prompt("What did I say about Japan?")

# prompt.messages   -> list of {role, content} ready for any LLM API
# prompt.retrieved  -> signals that were pulled in (pass to post_process)
# prompt.inspector  -> debug info (stacks activated, scores, etc.)

# After your LLM responds:
# response_text = your_llm(messages=prompt.messages)
# mem.post_process(prompt.retrieved, response_text)
# mem.add_message("What did I say about Japan?", role="user")
# mem.add_message(response_text, role="assistant")
```

For a fully runnable version, see [`examples/quickstart.py`](./examples/quickstart.py).
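
The commented-out steps above can be folded into one helper. This is a minimal sketch, not part of the library: `chat_turn` and the `llm` callable are hypothetical names, and the only methods it calls on `mem` are the ones shown in the quickstart (`build_prompt`, `post_process`, `add_message`).

```python
from typing import Any, Callable


def chat_turn(mem: Any, llm: Callable[[list], str], user_text: str) -> str:
    """Run one conversation turn against a memory object.

    `mem` is expected to behave like the object returned by registry.get(...).
    `llm` is a hypothetical callable: it takes a list of {role, content}
    messages and returns the response text from whatever provider you use.
    """
    # Build the small prompt: retrieved context + recent messages + query.
    prompt = mem.build_prompt(user_text)

    # Call the model with the ready-made messages.
    response_text = llm(prompt.messages)

    # Feed the outcome back so retrieved signals can be boosted or demoted.
    mem.post_process(prompt.retrieved, response_text)

    # Store both sides of the turn, in the same order as the quickstart.
    mem.add_message(user_text, role="user")
    mem.add_message(response_text, role="assistant")
    return response_text
```

With a real memory you would call `chat_turn(registry.get("user_123"), your_llm, "What did I say about Japan?")`, where `your_llm` wraps the provider SDK of your choice.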

## Drop-in adapters

The adapters wrap a real provider client. You use the same SDK API you already know;
memory is read and written transparently on every call.

### Anthropic

```python
from anthropic import Anthropic
from streetai.adapters.anthropic import with_memory

client = with_memory(Anthropic(), memory_id="user_123")

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are helpful.",
    messages=[{"role": "user", "content": "What did I mention earlier?"}],
)
print(response.content[0].text)
```

Full example: [`examples/anthropic_chat.py`](./examples/anthropic_chat.py).

### OpenAI

```python
from openai import OpenAI
from streetai.adapters.openai import with_memory

client = with_memory(OpenAI(), memory_id="user_123")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What did I mention earlier?"}],
)
print(response.choices[0].message.content)
```

Full example: [`examples/openai_chat.py`](./examples/openai_chat.py).

### DeepSeek (uses the OpenAI adapter)

DeepSeek is OpenAI-API-compatible. Use the OpenAI adapter with `base_url`:

```python
import os
from openai import OpenAI
from streetai.adapters.openai import with_memory

deepseek = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com/v1",
)
client = with_memory(deepseek, memory_id="user_123")

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "What did I mention earlier?"}],
)
```

The same pattern works for **Together**, **Anyscale**, **Groq**, and any other
OpenAI-compatible endpoint. Full example: [`examples/deepseek_chat.py`](./examples/deepseek_chat.py).

### Google Gemini

```python
from google import genai
from streetai.adapters.gemini import with_memory

client = with_memory(genai.Client(api_key="..."), memory_id="user_123")

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="What did I mention earlier?",
)
print(response.text)
```

Full example: [`examples/gemini_chat.py`](./examples/gemini_chat.py).

## How it works

```
your message
     |
     v
[1] split into chunks (sentence-sized signals)
     |
     v
[2] embed each chunk to a 384-dim vector
     |
     v
[3] assign to a stack (cluster of related signals) by cosine similarity
     |
     v
[4] when a new query arrives:
       - find top-K most relevant stacks (FAISS)
       - within those stacks, surface signals that pass the activation threshold
       - drop signals whose effective weight has decayed below death
     |
     v
[5] build a small prompt:
       [retrieved context] + [last N messages verbatim] + [new query]
     |
     v
[6] after the LLM responds:
       - boost signals that matched the response (they helped)
       - demote signals that didn't (they were noise)
       - decay continues until the signal is used again
```
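
Step [3] can be illustrated with a small standalone sketch. It is not the library's implementation: the README only says assignment is "by cosine similarity", so the running-mean centroid, the dict layout, and the function names below are assumptions. The `0.55` default mirrors the documented default of `stack_threshold`.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_to_stack(vector, stacks, threshold=0.55):
    """Add `vector` to the most similar stack, or open a new stack.

    `stacks` is a list of {"centroid": [...], "members": [[...], ...]}.
    Returns the index of the stack the vector ended up in.
    """
    best_idx, best_sim = -1, -1.0
    for idx, stack in enumerate(stacks):
        sim = cosine(vector, stack["centroid"])
        if sim > best_sim:
            best_idx, best_sim = idx, sim

    if best_idx >= 0 and best_sim >= threshold:
        stack = stacks[best_idx]
        stack["members"].append(list(vector))
        # Assumption: the centroid is the mean of all member vectors.
        n = len(stack["members"])
        stack["centroid"] = [sum(col) / n for col in zip(*stack["members"])]
        return best_idx

    # Nothing was similar enough: start a new stack seeded by this vector.
    stacks.append({"centroid": list(vector), "members": [list(vector)]})
    return len(stacks) - 1
```

Raising the threshold produces more, tighter stacks; lowering it merges loosely related signals into fewer stacks.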

Signals refresh their age clock every time they're retrieved — frequently useful
data stays sharp; unused data fades. No retraining, no manual pruning.
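
The decay and feedback loop can be sketched in a few lines. Everything here is illustrative and assumed rather than taken from the library: the exponential decay form, the `Signal` fields, the boost and demote factors, and the death threshold are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Signal:
    weight: float = 1.0      # base weight, changed by boost/demote
    last_used: float = 0.0   # timestamp in seconds of the last retrieval

    def effective_weight(self, now: float, decay_rate: float) -> float:
        """Base weight discounted by age (assumed exponential decay)."""
        age = max(0.0, now - self.last_used)
        return self.weight * math.exp(-decay_rate * age)


def mark_retrieved(signal: Signal, now: float) -> None:
    """Refresh the age clock: decay restarts from this moment."""
    signal.last_used = now


def apply_feedback(signal: Signal, matched: bool,
                   boost: float = 1.2, demote: float = 0.8) -> None:
    """Boost a signal that matched the response, demote one that did not."""
    signal.weight *= boost if matched else demote


def is_dead(signal: Signal, now: float, decay_rate: float,
            death: float = 0.05) -> bool:
    """True once the effective weight has decayed below the death threshold."""
    return signal.effective_weight(now, decay_rate) < death
```

A signal that keeps being retrieved and matched has its clock reset and its weight raised, so it stays above the death threshold; one that is never retrieved decays until it is dropped.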

## Compared to alternatives

| | Plain chat history | RAG (vector DB) | Street AI |
|---|---|---|---|
| Prompt grows with conversation | Yes — linear | No (replaces history) | No (compresses history) |
| Recent context kept verbatim | Yes | No — replaced by retrieval | Yes — recency window |
| Time-aware (decay) | No | No | Yes — built in |
| Learns from outcomes | No | No | Yes — boost/demote |
| Self-organizing | N/A | Manual chunking | Yes — auto-stacks |
| Cross-provider | Yes | Sometimes | Yes |

## Configuration

Override defaults with `Config`:

```python
from streetai import MemoryRegistry, Config

cfg = Config(
    recency_turns=5,          # last 5 messages verbatim (default 3)
    decay_rate=1.0/86400,     # 1-day half-life (default ~ 1 week)
    stack_threshold=0.65,     # tighter stack assignment (default 0.55)
    activation_threshold=0.1, # min score for a signal to surface (default 0.15)
)

registry = MemoryRegistry("./memory.db", config=cfg)
```

All tunables: see [`streetai/config.py`](./streetai/config.py).
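
How `decay_rate` maps to a half-life depends on the decay formula, which this README does not state. As a sketch, the helpers below assume the weight is multiplied by `base ** (-decay_rate * age_seconds)`; both function names and the formula are assumptions, so check `streetai/config.py` before relying on them. With `base=2`, a rate of `1.0/86400` gives exactly a one-day half-life, matching the comment above. With `base=math.e`, the same rate gives a half-life of about 0.69 days.

```python
import math

SECONDS_PER_DAY = 86400


def half_life_seconds(decay_rate: float, base: float = math.e) -> float:
    """Age at which the weight halves, if weight ~ base ** (-decay_rate * age)."""
    return math.log(2) / (decay_rate * math.log(base))


def rate_for_half_life(half_life_s: float, base: float = math.e) -> float:
    """Inverse of half_life_seconds: the rate that yields a given half-life."""
    return math.log(2) / (half_life_s * math.log(base))


if __name__ == "__main__":
    rate = 1.0 / SECONDS_PER_DAY
    print(half_life_seconds(rate, base=2) / SECONDS_PER_DAY)  # base-2 form
    print(half_life_seconds(rate) / SECONDS_PER_DAY)          # natural-exponential form
```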

## Limitations (v0.1)

- **Sync clients only.** Async wrappers come later.
- **Non-streaming only.** `stream=True` raises `NotImplementedError`.
- **English-tuned defaults.** Chunking and thresholds may need tuning for other languages.
- **fastembed is required.** Pluggable encoders come in a future version.

## Development

```bash
git clone https://github.com/Tem-Degu/streetai-memory.git
cd streetai-memory
pip install -e ".[dev]"
pytest
```

## License

Apache License 2.0. See [`LICENSE`](./LICENSE).
