Metadata-Version: 2.4
Name: merlin-langchain
Version: 0.1.0
Summary: Merlin dedup integration for LangChain - strip byte-redundant context before it reaches the LLM.
Author-email: Corbenic AI <hello@corbenic.ai>
License: MIT License
        
        Copyright (c) 2026 Corbenic
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
        ----------------------------------------------------------------------
        
        NOTE on scope of this MIT license:
        
        This license covers ONLY the integration glue in this repository:
          - The Python install scripts and MCP server in `shared/`, `claude_desktop/`,
            `claude_code/`, `openclaw/`
          - The VSCode extension source in `vscode/`
        
        It does NOT cover the merlin C++ engine binary that these tools invoke. The
        binary is distributed separately under its own license terms (free for
        community use within stated caps; see corbenic.ai/community for details).
        
Project-URL: Homepage, https://github.com/corbenicai/merlin-community
Project-URL: Repository, https://github.com/corbenicai/merlin-community
Project-URL: Issues, https://github.com/corbenicai/merlin-community/issues
Project-URL: Documentation, https://github.com/corbenicai/merlin-community/tree/main/repo/integrations/langchain
Keywords: langchain,llm,dedup,context-compression,merlin
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: langchain-core>=0.1.0
Requires-Dist: langchain-classic>=1.0
Requires-Dist: pydantic>=2.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: tiktoken>=0.5; extra == "dev"
Requires-Dist: langchain>=0.1.0; extra == "dev"
Dynamic: license-file

# merlin-langchain

Drop-in `MerlinBufferMemory` for LangChain. Strips redundant text from your
chat history before it reaches the LLM, so multi-turn agents stop choking
on context-window overflow.

* Real-world demo: a coding agent fed two real lock files
  (`facebook/react/yarn.lock` + `vercel/next.js/pnpm-lock.yaml`,
  ~2 MB / 1 M tokens per turn) crashes vanilla LangChain on turn 2 with
  Gemini's `400 INVALID_ARGUMENT "exceeds 1048576"`. With
  `MerlinBufferMemory` the same agent survives 6 turns and the
  same Gemini call returns `200 OK` (receipts in
  [docs/benchmarks/langchain_2026-05-14.pdf](../../docs/benchmarks/langchain_2026-05-14.pdf)).

---

## Quick start (3 minutes)

### 1 - Install the Python package

```bash
pip install merlin-langchain
```

### 2 - Get the Merlin binary

The Python package only contains the LangChain glue. The dedup engine itself
ships as a small native binary, downloaded once.

* **Windows x64:** download from the latest GitHub release:
  <https://github.com/corbenicai/merlin-community/releases/latest>
* **Linux / macOS:** native builds are landing soon - see the issues tracker
  for status. Until then, the package falls back to vanilla LangChain
  behavior on those platforms (the same pass-through described under
  *What happens when a cap is reached*, below).

Place the binary anywhere you like. Most users put it in `~/.merlin/`:

```bash
mkdir -p ~/.merlin
mv merlin-lite-windows-x64.exe ~/.merlin/merlin.exe
```

### 3 - Tell the package where the binary lives

```bash
# Windows PowerShell
$env:MERLIN_BINARY = "$HOME\.merlin\merlin.exe"

# bash / zsh
export MERLIN_BINARY=~/.merlin/merlin
```

If you skip this step, the package looks in `~/.merlin/merlin[.exe]` by
default. If the binary still isn't found, `MerlinBufferMemory` transparently
falls back to vanilla LangChain - no crash, just no optimization.
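
To check ahead of time which path would be picked up, the lookup order described above can be approximated as follows. This is a minimal sketch under assumptions: `resolve_merlin_binary` is a hypothetical helper, not part of `merlin-langchain`, and it assumes that an explicitly set `MERLIN_BINARY` is not followed by a second lookup in the default location. The package's real lookup may differ in detail.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def resolve_merlin_binary(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
    windows: Optional[bool] = None,
) -> Optional[Path]:
    """Return the binary path that would be used, or None if no file is found.

    Hypothetical helper mirroring the documented order:
    1. the MERLIN_BINARY environment variable, if set
    2. ~/.merlin/merlin (merlin.exe on Windows)
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    windows = sys.platform.startswith("win") if windows is None else windows

    explicit = env.get("MERLIN_BINARY")
    if explicit:
        # Assumption: an explicit path is used as-is, with "~" expanded.
        candidate = Path(explicit).expanduser()
    else:
        candidate = home / ".merlin" / ("merlin.exe" if windows else "merlin")

    return candidate if candidate.is_file() else None


found = resolve_merlin_binary()
print(found if found else "Merlin binary not found; expect vanilla LangChain fallback")
```

The `env`, `home`, and `windows` parameters exist only so the lookup can be exercised without touching the real environment.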

### 4 - Use it

```python
from merlin_langchain import MerlinBufferMemory
from langchain.chains import ConversationChain
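from langchain_openai import ChatOpenAI  # separate install: pip install langchain-openai

# Use the default memory_key ("history"): ConversationChain's default prompt
# expects that variable and rejects a memory that exposes a different key.
memory = MerlinBufferMemory()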
chain = ConversationChain(llm=ChatOpenAI(model="gpt-4o"), memory=memory)
chain.invoke({"input": "..."})
```

That's it. Your agent now silently dedupes its rolling chat history before
each LLM call. No code changes elsewhere.

---

## What you get

| Component | Drop-in replacement for |
|---|---|
| `MerlinBufferMemory` | `langchain.memory.ConversationBufferMemory` |
| `merlin_format_log_to_str` | `langchain.agents.format_scratchpad.format_log_to_str` |

`MerlinBufferMemory` inherits the LangChain memory interface, so it passes
Pydantic validation in `Chain.memory` slots and works in any chain that
accepts `BaseMemory`. `merlin_format_log_to_str` mirrors the interface of the
function it replaces.

The async surface (`aload_memory_variables`, `asave_context`, `aclear`) is
implemented for use behind LangServe / FastAPI / `await agent.ainvoke()`.

---

## Limits (community tier)

The community binary processes **up to**:

| Window | Cap |
|---|---|
| Per call | 50 MB |
| Per day  | 200 MB |
| Per month | 2 GB |

A solo developer is unlikely to hit these caps. A heavy commercial pipeline
typically hits them within 2-3 days; for higher caps see <https://corbenic.ai>.
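
If you want to know in advance whether a single history is anywhere near the per-call cap, you can measure its encoded size before handing it to the memory. This is a minimal sketch with two assumptions the table does not confirm: that the cap is counted in bytes of UTF-8 text, and that 1 MB means 10^6 bytes. `within_per_call_cap` is a hypothetical helper, not part of the package.

```python
# Assumption: decimal megabytes; the binary may count differently.
PER_CALL_CAP_BYTES = 50 * 1_000_000


def within_per_call_cap(text: str, cap_bytes: int = PER_CALL_CAP_BYTES) -> bool:
    """Return True if the UTF-8 encoding of `text` fits under `cap_bytes`."""
    return len(text.encode("utf-8")) <= cap_bytes


history = "Human: install deps\nAI: done\n" * 1000
print(within_per_call_cap(history))
```

Measuring encoded bytes rather than `len(text)` matters for non-ASCII content, where one character can occupy several bytes.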

### What happens when a cap is reached

`MerlinBufferMemory` **transparently falls back to vanilla LangChain
behavior**. Your prompts pass through unchanged - exactly as if the
package weren't installed - and your LLM call proceeds normally.

* You'll see one `WARNING` in your logs the first time fallback kicks in.
* The package will automatically retry the binary every hour (configurable
  via the `MERLIN_RETRY_AFTER_S` environment variable, minimum 60 seconds).
* When the cap rolls over (daily at 00:00 UTC, monthly on the 1st), the
  next retry succeeds and you'll see `INFO: Merlin dedup recovered`.

This means **you cannot get stuck in a degraded state** because of a
forgotten reset - long-running web servers self-heal across midnight UTC
without restart.
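
The fallback-and-retry cycle described above behaves like a cooldown gate. The following is a minimal sketch of that pattern under stated assumptions, not the package's actual implementation: `CooldownGate`, `dedup_or_passthrough`, and the use of `RuntimeError` as the "cap reached" signal are all hypothetical, and the clock is injected so the behavior can be exercised without waiting an hour.

```python
import time
from typing import Callable, Optional


class CooldownGate:
    """Skip an operation for `retry_after_s` seconds after a failure."""

    def __init__(self, retry_after_s: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.retry_after_s = retry_after_s
        self._clock = clock
        self._blocked_until: Optional[float] = None

    def should_try(self) -> bool:
        """True when no cooldown is active or the cooldown has expired."""
        return self._blocked_until is None or self._clock() >= self._blocked_until

    def record_failure(self) -> None:
        """Start (or restart) the cooldown, e.g. after a cap-hit."""
        self._blocked_until = self._clock() + self.retry_after_s

    def record_success(self) -> None:
        """Clear the cooldown once the operation works again."""
        self._blocked_until = None


def dedup_or_passthrough(text: str, gate: CooldownGate,
                         dedup: Callable[[str], str]) -> str:
    """Apply `dedup` if the gate allows it; otherwise return `text` unchanged."""
    if not gate.should_try():
        return text  # still cooling down: behave like vanilla LangChain
    try:
        result = dedup(text)
    except RuntimeError:  # hypothetical stand-in for "cap reached"
        gate.record_failure()
        return text
    gate.record_success()
    return result
```

Because the gate re-probes on its own once the cooldown expires, nothing has to be reset by hand when the cap rolls over.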

---

## Configuration

| Variable | Default | Purpose |
|---|---|---|
| `MERLIN_BINARY` | `~/.merlin/merlin[.exe]` | Path to the binary |
| `MERLIN_RETRY_AFTER_S` | `3600` | Seconds to skip dedup after a cap-hit before re-probing. Min 60. |
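
As an illustration of how a value like `MERLIN_RETRY_AFTER_S` could be interpreted, here is a minimal sketch. It assumes that values below 60 are raised to 60 and that unset or non-numeric values fall back to 3600; the package may handle those cases differently. `read_retry_after_s` is a hypothetical name.

```python
import os
from typing import Mapping, Optional

DEFAULT_RETRY_AFTER_S = 3600  # documented default
MIN_RETRY_AFTER_S = 60        # documented minimum


def read_retry_after_s(env: Optional[Mapping[str, str]] = None) -> int:
    """Read MERLIN_RETRY_AFTER_S, falling back to the default and clamping to the minimum."""
    env = os.environ if env is None else env
    raw = env.get("MERLIN_RETRY_AFTER_S", "").strip()
    try:
        value = int(raw)
    except ValueError:
        # Assumption: unset or malformed values use the default.
        return DEFAULT_RETRY_AFTER_S
    # Assumption: too-small values are clamped rather than rejected.
    return max(value, MIN_RETRY_AFTER_S)


print(read_retry_after_s({"MERLIN_RETRY_AFTER_S": "900"}))  # prints 900
```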

Constructor parameters on `MerlinBufferMemory`:

| Param | Default | Purpose |
|---|---|---|
| `memory_key` | `"history"` | Key under which the rendered string is returned |
| `keep_tail_lines` | `2` | Trailing lines preserved verbatim (the most-recent context) |
| `human_prefix` / `ai_prefix` | `"Human"` / `"AI"` | Standard LangChain prefixes |
| `return_messages` | `False` | If `True`, returns the message list instead of a string (no dedup applied; mirrors `ConversationBufferMemory` behavior) |
| `extra_env` | `None` | Optional env-var dict for the binary subprocess (advanced) |

---

## When MerlinBufferMemory helps - and when it doesn't

**Helps:** multi-turn agents that re-feed tool outputs into the prompt
each turn (ReAct, Cline, AutoGPT, Devin-style workflows). Anywhere the
chat history accumulates large repeated content (lock files, terminal
logs, file dumps, retrieved documents).

**Doesn't help:** single-shot LLM calls with no rolling history. Tiny
prompts under a few KB. Workloads where every turn introduces only fresh
unique content.

When it doesn't help, you don't pay for it - the dedup just shrinks the
prompt by zero bytes.
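
To get a rough feel for which category a workload falls into, you can measure how much of a history consists of repeated lines. The sketch below is a toy line-level estimate for illustration only: it is not Merlin's algorithm, and `repeated_line_fraction` is a hypothetical helper.

```python
def repeated_line_fraction(history: str) -> float:
    """Fraction of bytes in `history` that sit in lines already seen earlier."""
    seen = set()
    total = 0
    repeated = 0
    for line in history.splitlines():
        size = len(line.encode("utf-8"))
        total += size
        if line in seen:
            repeated += size
        else:
            seen.add(line)
    return repeated / total if total else 0.0


# A tool output (here a fake lock file) re-fed on each of three turns.
lockfile = "\n".join(f"pkg-{i}@1.0.0" for i in range(100))
refeed = "\n".join(f"Turn {t}\n{lockfile}" for t in range(3))

# A history where every line is fresh content.
unique = "\n".join(f"line {i}" for i in range(100))

print(f"re-fed history: {repeated_line_fraction(refeed):.0%} repeated")
print(f"unique history: {repeated_line_fraction(unique):.0%} repeated")
```

A high fraction suggests the workload re-feeds content each turn, which is the "helps" case above; a fraction near zero matches the "doesn't help" case.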

---

## License

MIT. See `LICENSE`.

## Links

* GitHub: <https://github.com/corbenicai/merlin-community>
* Issues: <https://github.com/corbenicai/merlin-community/issues>
* Pro tier (no caps, multi-threaded engine, server-side validation):
  <https://corbenic.ai>
