Metadata-Version: 2.4
Name: liel
Version: 0.1.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Dist: mcp>=1.0 ; extra == 'mcp'
Provides-Extra: mcp
License-File: LICENSE
Summary: Single-file graph memory for local AI, agents, and Python applications
Keywords: graph,database,embedded,memory,ai,mcp
Home-Page: https://github.com/hy-token/liel
Author-email: hy-token <51951093+hy-token@users.noreply.github.com>
License: MIT
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Bug Tracker, https://github.com/hy-token/liel/issues
Project-URL: Documentation, https://github.com/hy-token/liel/blob/main/docs/index.md
Project-URL: Homepage, https://github.com/hy-token/liel
Project-URL: Repository, https://github.com/hy-token/liel

# liel

[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/hy-token/liel/blob/main/LICENSE)
[![CI](https://img.shields.io/github/actions/workflow/status/hy-token/liel/ci.yml?branch=main&label=CI)](https://github.com/hy-token/liel/actions/workflows/ci.yml)

**Single-file graph memory for local AI, agents, and Python applications. Standalone. Zero core dependencies. No server.**

`liel` is a lightweight local graph memory store for LLM tools, AI agents, and Python applications.
It stores facts, decisions, tasks, files, sources, tool results, and their relationships in one portable `.liel` file.

The core package has **no runtime dependencies**. No external database server, cloud service, or background daemon is required. On supported platforms, `pip install liel` is enough to get started.

MCP integration is optional. Install `liel[mcp]` only when you want to expose a `.liel` memory file to an MCP-capable AI tool.

Under the hood, `liel` is a Rust-core embedded **Property Graph database** with a Python-first API and optional MCP integration.
If SQLite is the one-file relational database, `liel` is the one-file graph memory layer for relationship-centric AI workflows.

> *Etymology: a portmanteau of French* lier *(to connect) and Latin* ligare *(to bind)*.

---

## Table of contents

- [What liel gives AI tools](#what-liel-gives-ai-tools)
- [Problems this helps solve](#problems-this-helps-solve)
- [Install](#install)
- [Quickstart: LLM memory with MCP](#quickstart-llm-memory-with-mcp)
- [Quickstart: Python property graph](#quickstart-python-property-graph)
- [What to store](#what-to-store)
- [Vector stores and liel](#vector-stores-and-liel)
- [When to use liel](#when-to-use-liel)
- [When not to use liel](#when-not-to-use-liel)
- [Features](#features)
- [Reliability and failure model](#reliability-and-failure-model)
- [API reference](#api-reference)
- [File format](#file-format)
- [Limitations](#limitations)
- [Documentation](#documentation)
- [Contributing](#contributing)
- [License](#license)

---

## What liel gives AI tools

`liel` gives local AI tools a memory file they can update, traverse, inspect, and carry between sessions.

With one `.liel` file, an AI tool can:

- Store entities such as projects, files, tasks, people, sources, and notes.
- Store explicit facts, decisions, observations, and tool results.
- Connect those records with typed relationships.
- Retrieve nearby context by traversing the graph.
- Keep memory local, portable, and easy to back up.
- Run without a database server or background daemon.
- Use the core library with no required runtime dependencies.

This turns scattered AI memory into a durable graph file that both humans and tools can inspect.

---

## Problems this helps solve

Because memory is stored as an explicit local graph, `liel` helps with problems common in local AI workflows:

- Decisions and assumptions get lost across sessions in chat history.
- Facts, files, sources, tasks, and tool outputs become hard to connect later.
- Keyword search and vector similarity alone do not model explicit relationships.
- AI memory is hard for humans to inspect, clean up, copy, or back up.
- Small local agents often do not need a database server or cloud service.
- Memory needs to move between machines, archives, and experiments as one file.

---

## Install

Install the dependency-free core package:

```bash
pip install liel
```

This installs prebuilt wheels for supported platforms — **Rust is not required** at install time.

Install the optional MCP integration only when you want an MCP-capable AI tool to use a `.liel` file as external memory:

```bash
pip install "liel[mcp]"
```

**Platform support**

- OS: Linux, macOS, Windows
- Architecture: x86_64 first, arm64 where practical
- Python: **3.9 or newer**

### Source build (for contributors)

You only need this if you are hacking on liel itself, or your platform/Python combination has no prebuilt wheel.

**Prerequisites**

```bash
# Linux / WSL
sudo apt-get update && sudo apt-get install -y build-essential

# macOS
xcode-select --install

# Rust (any OS)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
```

**Build and install in editable mode**

```bash
git clone https://github.com/hy-token/liel.git
cd liel
python3 -m venv .venv
source .venv/bin/activate          # Windows: .venv\Scripts\activate
pip install -r requirements-dev.txt
maturin develop
```

**Verify**

```bash
python -c "import liel; print(liel.__version__)"
```

See [CONTRIBUTING.md](https://github.com/hy-token/liel/blob/main/CONTRIBUTING.md) for the full developer workflow.

---

## Quickstart: LLM memory with MCP

Install the MCP-enabled package:

```bash
pip install "liel[mcp]"
```

Start the MCP server with a local memory file:

```bash
liel-mcp --path agent-memory.liel
```

The `.liel` file becomes local external memory for your AI tool. Through MCP, the tool can read and write nodes, edges, and properties, then retrieve related context by traversing the graph.

An agent can store things like:

- Project goals
- User decisions
- Important files and their roles
- Tool results
- Task dependencies
- Sources behind a decision

For real MCP client configuration, prefer an absolute path because clients may start servers from a different working directory:

```bash
liel-mcp --path /absolute/path/to/agent-memory.liel
```
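Most MCP clients are launched from a JSON configuration file rather than a shell. The exact schema varies by client, but an entry following the widely used `mcpServers` convention might look like this (the server name `liel-memory` and the surrounding config shape are illustrative, not prescribed by liel):

```json
{
  "mcpServers": {
    "liel-memory": {
      "command": "liel-mcp",
      "args": ["--path", "/absolute/path/to/agent-memory.liel"]
    }
  }
}
```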

`agent-memory.liel` is just a file. Copy it to back it up, move it to another machine, or delete it when you no longer need it.

---

## Quickstart: Python property graph

You can also use `liel` directly as an embedded property graph database from Python.

### Basic graph

```python
import liel

with liel.open(":memory:") as db:
    alice = db.add_node(["Person"], name="Alice", age=30)
    bob   = db.add_node(["Person"], name="Bob",   age=25)
    db.add_edge(alice, "KNOWS", bob, since=2020)
    db.commit()

    friends = db.neighbors(alice, edge_label="KNOWS")
    print(friends[0]["name"])   # Bob
```

→ [examples/01_quickstart.py](https://github.com/hy-token/liel/blob/main/examples/01_quickstart.py)

### Heterogeneous knowledge graph with QueryBuilder

```python
import liel

with liel.open(":memory:") as db:
    alice = db.add_node(["Person"],            name="Alice", role="Engineer")
    bob   = db.add_node(["Person"],            name="Bob",   role="Designer")
    carol = db.add_node(["Person"],            name="Carol", role="Engineer")
    dave  = db.add_node(["Person", "Manager"], name="Dave",  role="Manager")

    acme = db.add_node(["Company"],    name="Acme",   industry="SaaS")
    py   = db.add_node(["Technology"], name="Python", category="Language")

    db.add_edge(alice, "WORKS_AT", acme, since=2021)
    db.add_edge(alice, "USES", py, proficiency="expert")
    db.add_edge(alice, "KNOWS", carol)
    db.commit()

    engineers = (
        db.nodes()
          .label("Person")
          .where_(lambda n: n.get("role") == "Engineer")
          .fetch()
    )
    print([n["name"] for n in engineers])           # ['Alice', 'Carol']

    managers = db.nodes().label("Manager").fetch()  # multi-label filter
    print([n["name"] for n in managers])            # ['Dave']
```

→ [examples/02_knowledge_graph.py](https://github.com/hy-token/liel/blob/main/examples/02_knowledge_graph.py)

### Bulk import public graph data in a single transaction

```python
import json, urllib.request, liel

url = "https://raw.githubusercontent.com/vega/vega-datasets/main/data/miserables.json"
data = json.loads(urllib.request.urlopen(url).read().decode("utf-8"))

with liel.open(":memory:") as db:
    node_ids = []
    with db.transaction():                # 1 fsync for the whole batch
        for n in data["nodes"]:
            node = db.add_node(["Character"], name=n["name"], group=n["group"])
            node_ids.append(node.id)

        for e in data["links"]:
            db.add_edge(
                node_ids[e["source"]],
                "APPEARS_WITH",
                node_ids[e["target"]],
                weight=e["value"],
            )
```

The WAL is flushed only on commit, so wrapping a bulk import in `db.transaction()` keeps I/O cost flat regardless of row count.

→ [examples/03_bulk_import.py](https://github.com/hy-token/liel/blob/main/examples/03_bulk_import.py)

---

## What to store

A `.liel` file can hold structured AI memory such as:

- `Project`: repositories, products, research topics
- `Task`: work items, TODOs, blockers
- `Decision`: choices made by the user or agent
- `Observation`: facts learned during tool use
- `Source`: files, URLs, documents, command outputs
- `Person` / `Team`: people and ownership
- `File` / `Module`: codebase structure

Relationships can express:

- `DEPENDS_ON`
- `MENTIONS`
- `DERIVED_FROM`
- `DECIDED_BY`
- `BLOCKED_BY`
- `RELATED_TO`
- `UPDATED_BY`

See [examples/07_agent_memory.py](https://github.com/hy-token/liel/blob/main/examples/07_agent_memory.py)
for a small project-memory graph using tasks, files, decisions, sources, and observations.

---

## Vector stores and liel

`liel` is not a vector database replacement.

Vector stores are useful for semantic similarity search over text. `liel` is for explicit memory: facts, entities, decisions, dependencies, provenance, and relationships you can traverse.

Many AI workflows can use both:

- Use vector search to find similar text.
- Use graph memory to answer "what is this decision based on?", "which files are related to this task?", or "what changed this assumption?"
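That split can be sketched without committing to any particular vector store. The similarity function below is a trivial keyword stand-in, and the adjacency dict stands in for `.liel` traversal; both are illustrative only:

```python
def keyword_score(query: str, text: str) -> float:
    """Stand-in for vector similarity: fraction of query words present in the text."""
    words = query.lower().split()
    return sum(w in text.lower() for w in words) / len(words)

def retrieve(query, documents, related, top_k=1):
    """Rank documents by 'similarity', then expand each hit with its graph neighbors."""
    ranked = sorted(documents, key=lambda d: keyword_score(query, documents[d]), reverse=True)
    hits = ranked[:top_k]
    # Graph expansion: pull in the explicitly related records for each hit.
    return {d: sorted(related.get(d, set())) for d in hits}

docs = {
    "note-1": "decision to adopt TOML for config",
    "note-2": "meeting notes about hiring",
}
edges = {"note-1": {"source-7", "task-3"}}   # e.g. DERIVED_FROM / BLOCKED_BY edges

print(retrieve("TOML config decision", docs, edges))
# {'note-1': ['source-7', 'task-3']}
```

In a real pipeline you would swap `keyword_score` for embedding similarity and the dict lookup for `db.neighbors(...)` calls.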

---

## When to use liel

Use `liel` when:

- You want local AI memory as a file, not a server.
- Relationships between entities matter.
- You want to persist decisions, facts, sources, tasks, and tool outputs across sessions.
- You need graph traversal and relationship modeling without running a separate database server.
- You want all memory in one portable `.liel` file that is easy to copy, back up, and archive.
- You want a practical Rust-core graph engine with a Python-first developer experience.

Example use cases:

- Local agent memory
- Project memory for coding assistants
- Personal or project knowledge graphs
- MCP-backed memory for AI tools
- Tool result caches with provenance
- Research assistant memory
- Lightweight relationship stores for RAG pipelines

---

## When not to use liel

| If you want… | Use instead |
|---|---|
| Semantic similarity search over text | A vector database or embedding index |
| Graph queries on top of existing tabular data | **DuckDB** recursive CTEs / [DuckPGQ](https://duckdb.org/docs/extensions/pgq.html) |
| Tens of millions of nodes/edges, or concurrent writes | **Neo4j**, **Amazon Neptune** |
| Graph-style queries from a SQL-familiar team on existing relational data | **PostgreSQL** `WITH RECURSIVE` |
| High-throughput, low-latency writes | A dedicated server-backed graph database |
| Documents or full-text search as the primary access pattern | **MongoDB**, **Elasticsearch** |

liel uses a page-level WAL, has no full-text or aggregation queries, and is single-process by design — see [Limitations](#limitations).

---

## Features

- **Single file** — memory lives in one `.liel` file.
- **Zero core runtime dependencies** — the core `liel` package has no required runtime dependencies.
- **Optional MCP integration** — install `liel[mcp]` only when you want to expose graph memory to MCP clients.
- **No database server** — no external service, daemon, or cloud database is required.
- **Property Graph** — nodes and edges support multiple labels and arbitrary properties.
- **Crash-safe** — transactional guarantees via a Write-Ahead Log (WAL).
- **`:memory:` mode** — in-memory operation for tests and experiments.
- **Python-first API** — type stubs are included for editor support.

### Status

- The Rust core is implemented and tested.
- The Python API is usable today for local development, scripts, research, prototypes, and local AI memory experiments.
- CI runs Rust + Python tests on Linux, Windows, and macOS for every pull request and for version-tag pushes such as `v0.1.0` — see the [Actions tab](https://github.com/hy-token/liel/actions).
- **Practical scale (guidance, not a warranty):** **a few gigabytes** in a single `.liel` file is a reasonable comfort zone on typical desktop hardware. Beyond that depends on RAM, disk, and access patterns — measure your workload.
- This project does not promise fitness for a particular purpose, SLA-style support, or legal indemnity. See [product trade-offs](https://github.com/hy-token/liel/blob/main/docs/design/product-tradeoffs.md) for the explicit list of trade-offs.

### Tests

```bash
cargo test            # Rust unit tests
pytest tests/python/  # Python integration tests
```

Latest CI results: [GitHub Actions](https://github.com/hy-token/liel/actions).

---

## Reliability and failure model

`liel` is designed around a narrow reliability contract: **one writer process, one local file, explicit commits**.

What is covered:

- **Committed data survives process crashes.** `commit()` writes modified pages to the page-level WAL, fsyncs the WAL, applies the pages to their canonical locations, and fsyncs the data file.
- **Interrupted commits are recovered on open.** If a file is opened with a non-empty WAL, recovery replays complete WAL entries back into the data file.
- **Double-open is rejected.** Opening the same `.liel` path twice for writing raises `AlreadyOpenError`; same-process conflicts use an in-process registry, and cross-process conflicts use a `<file>.lock/` directory.
- **Corrupt or incompatible files fail closed.** Header, checksum, layout, and WAL validation errors surface as explicit `GraphDBError` subclasses rather than silent best-effort reads.
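The commit sequence in the first point (log first, fsync, apply, fsync) is the classic write-ahead pattern. Here is a drastically simplified single-page sketch in plain Python; it illustrates the ordering only and is not liel's actual implementation:

```python
import os
import tempfile

def commit_page(wal_path, data_path, offset, page):
    """Durably apply one modified page: WAL first, canonical location second."""
    # 1. Append the page to the WAL and fsync it. From here on, a crash is
    #    recoverable: recovery can replay the complete WAL entry on next open.
    with open(wal_path, "ab") as wal:
        wal.write(offset.to_bytes(8, "little") + page)
        wal.flush()
        os.fsync(wal.fileno())
    # 2. Apply the page at its canonical offset and fsync the data file.
    with open(data_path, "r+b") as data:
        data.seek(offset)
        data.write(page)
        data.flush()
        os.fsync(data.fileno())
    # 3. Only now is it safe to empty the WAL.
    with open(wal_path, "wb"):
        pass

# Demo on throwaway files: patch 4 bytes at offset 4 of a 16-byte file.
tmp = tempfile.mkdtemp()
wal, data = os.path.join(tmp, "g.wal"), os.path.join(tmp, "g.db")
with open(data, "wb") as f:
    f.write(b"\x00" * 16)
commit_page(wal, data, 4, b"abcd")
print(open(data, "rb").read())
```

A crash after step 1 is replayed on recovery; a crash before step 1 simply loses the change, which matches the "uncommitted changes are disposable" rule below.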

What is not covered:

- **Multi-process concurrent mutation is not supported.** The lock directory rejects a second writer to protect the file, but it does not make concurrent writes safe. If several tools need to write, put one service or worker in charge of the `.liel` file.
- **Uncommitted changes are disposable.** If a process exits before `commit()`, the next open returns to the last committed state.
- **Filesystem guarantees matter.** `liel` relies on the local filesystem honoring write and fsync ordering. Network filesystems, sync folders, and unusual virtual filesystems may not provide the same durability semantics.

See the full [reliability and failure model](https://github.com/hy-token/liel/blob/main/docs/reference/reliability.md) and [product trade-offs](https://github.com/hy-token/liel/blob/main/docs/design/product-tradeoffs.md) before using `liel` as durable application state.

---

## API reference

### `liel.open(path)` → `GraphDB`

```python
db = liel.open("path/to/graph.liel")   # file (created if it does not exist)
db = liel.open(":memory:")             # in-memory (for testing)

with liel.open("graph.liel") as db:    # context manager
    ...
```

Use **one writer process per `.liel` file**. Concurrent multi-process writes are not supported; if several applications need to modify the same graph, centralize writes through one service or worker.

Opening the same `.liel` path twice is detected and rejected with `liel.AlreadyOpenError`. Within one process this uses an in-process registry; across processes it uses a `<file>.lock/` directory. Close the previous handle (or let its `with` block exit) before re-opening:

```python
with liel.open("graph.liel") as db:
    ...
# the with block releases the writer slot; re-opening here is fine
with liel.open("graph.liel") as db:
    ...
```

If a writer crashes and leaves `.lock/` behind, the next `open()` reclaims it when the recorded owner PID is clearly dead. See [product trade-offs](https://github.com/hy-token/liel/blob/main/docs/design/product-tradeoffs.md) for the write-safety trade-off and recommended deployment pattern.

### Node operations

```python
node = db.add_node(["Person", "Employee"], name="Alice", age=30)

node.id           # int: node ID (1-based)
node.labels       # list[str]
node["name"]      # "Alice"
node.properties   # dict (a copy)
"name" in node    # True
node.get("x")     # None (missing key)

db.get_node(1)
db.update_node(1, age=31)   # replace the node's property map
db.delete_node(node)        # also deletes incident edges
db.all_nodes()
db.node_count()
```

### Edge operations

```python
edge = db.add_edge(alice, "KNOWS", bob, since=2020)

edge.id         # int
edge.label      # "KNOWS"
edge.from_node  # source node ID
edge.to_node    # target node ID
edge["since"]   # 2020

db.get_edge(1)
db.update_edge(1, since=2021)
db.delete_edge(edge)
db.all_edges()
db.edge_count()

# Returns an existing edge matching label + properties, or creates one
e = db.merge_edge(alice, "KNOWS", bob, since=2020)

db.out_edges(alice)
db.out_edges(alice, label="KNOWS")
db.in_edges(bob)
```

### Adjacency queries

```python
# direction: "out" (default) | "in" | "both"
db.neighbors(alice)
db.neighbors(alice, edge_label="KNOWS")
db.neighbors(alice, direction="in")
db.neighbors(alice, direction="both")
```

### Traversal

```python
# BFS / DFS → [(Node, depth), ...]
for node, depth in db.bfs(alice, max_depth=3):
    print(f"{'  ' * depth}{node['name']} (depth={depth})")

for node, depth in db.dfs(alice, max_depth=3):
    ...

# Minimum-hop directed path → [Node, ...] | None
# (unweighted BFS on out-edges; not Dijkstra)
path = db.shortest_path(alice, carol)
path = db.shortest_path(alice, carol, edge_label="KNOWS")
```

`shortest_path` follows **out-edges only** and minimizes the number of hops; edge properties are **not** weights. Performance notes for traversal and scan-heavy APIs live in the [Python guide](https://github.com/hy-token/liel/blob/main/docs/guide/connectors/python.md).
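For intuition, the hop-minimizing behavior is textbook BFS. This self-contained sketch runs on a plain adjacency dict rather than a `GraphDB`:

```python
from collections import deque

def shortest_path_hops(adj, start, goal):
    """Minimum-hop directed path via BFS over out-edges; node list or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:                  # reconstruct by walking parent links
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj.get(node, ()):     # out-edges only, no weights
            if nxt not in parent:         # first visit is the fewest hops
                parent[nxt] = node
                queue.append(nxt)
    return None

adj = {"alice": ["bob", "dave"], "bob": ["carol"], "dave": [], "carol": []}
print(shortest_path_hops(adj, "alice", "carol"))   # ['alice', 'bob', 'carol']
print(shortest_path_hops(adj, "carol", "alice"))   # None
```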

### QueryBuilder (chained methods)

```python
results = db.nodes().label("Person").where_(lambda n: n["age"] > 20).fetch()
count   = db.nodes().label("Person").count()
exists  = db.nodes().label("Person").where_(lambda n: n["name"] == "Alice").exists()
page2   = db.nodes().label("Person").skip(10).limit(10).fetch()

edges = db.edges().label("KNOWS").where_(lambda e: e["since"] >= 2020).fetch()
```

### Transactions

```python
db.add_node(["Person"], name="Alice")
db.commit()
db.rollback()

with db.transaction():                    # recommended
    db.add_node(["Person"], name="Alice")
    db.add_edge(alice, "KNOWS", bob)
# normal exit -> commit; exception -> rollback

db.begin()  # compatibility shim — no state change today
```

### Utilities

```python
db.vacuum()       # compact the property region
db.clear()        # fully reset the DB, discard dirty state, and reset IDs to 1
db.repair_adjacency()  # rebuild adjacency heads / degrees from live edges
db.info()         # {"version": "1.0", "node_count": N, "edge_count": E, "file_size": bytes}

rows = db.all_nodes_as_records()  # bulk dict records (fewer PyO3 objects)
rows = db.all_edges_as_records()

stats = db.degree_stats()                               # { node_id: (out_deg, in_deg) }
sub   = db.edges_between({alice.id, bob.id, carol.id}) # edges fully inside the set
```

JSON import/export is **not** built into `GraphDB`. See [examples/06_export.py](https://github.com/hy-token/liel/blob/main/examples/06_export.py) and [examples/03_bulk_import.py](https://github.com/hy-token/liel/blob/main/examples/03_bulk_import.py) for reference scripts.
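A hypothetical export helper along those lines serializes plain record dicts, the shape produced by `all_nodes_as_records()` / `all_edges_as_records()`. The helper name and the exact keys in the sample records are assumptions for illustration:

```python
import json

def graph_to_json(node_records, edge_records, indent=2):
    """Bundle node and edge record dicts into one JSON document."""
    return json.dumps({"nodes": node_records, "edges": edge_records}, indent=indent)

# With a live database you would call:
#   graph_to_json(db.all_nodes_as_records(), db.all_edges_as_records())
nodes = [{"id": 1, "labels": ["Person"], "name": "Alice"}]
edges = [{"id": 1, "label": "KNOWS", "from": 1, "to": 2}]
print(graph_to_json(nodes, edges))
```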

If `liel.CorruptedFileError` reports damaged adjacency metadata, stop writing to the file, take a backup, and run `db.repair_adjacency()` before retrying. If repair fails because a live edge points at a missing node, treat the file as more deeply damaged and restore from backup or salvage readable records into a new database.

### `Node` / `Edge` objects

| Attribute / method | Type | Description |
|---|---|---|
| `.id` | `int` | Auto-assigned ID (1-based) |
| `.labels` | `list[str]` | Node labels |
| `.label` | `str` | Edge label |
| `.from_node` | `int` | Edge source node ID |
| `.to_node` | `int` | Edge target node ID |
| `.properties` | `dict` | Property dict (a copy) |
| `obj["key"]` | `Any` | Property access (raises `KeyError`) |
| `obj.get("key")` | `Any \| None` | Property access (default `None`) |
| `"key" in obj` | `bool` | Check property existence |

### Supported property types

| Python | Stored as |
|---|---|
| `None` | Null |
| `bool` | Bool |
| `int` | Int64 |
| `float` | Float64 |
| `str` | String (UTF-8) |
| `list` | List (recursive) |
| `dict` | Map (recursive) |

### Exception classes

```python
liel.GraphDBError        # base class for all liel exceptions
liel.NodeNotFoundError   # node does not exist
liel.EdgeNotFoundError   # edge does not exist
liel.CorruptedFileError  # file is corrupted
liel.TransactionError    # transaction violation

try:
    db.delete_node(9999)
except liel.GraphDBError as e:
    print(e)
```

### Type stubs

Type definitions are provided in `python/liel/liel.pyi`, compatible with mypy and pyright.

---

## File format

The on-disk unit is a **4096-byte page**. **Page 0** (offsets `0..4096`) starts with the **128-byte file header**; the remaining **3968 bytes** of page 0 are unused. The **WAL** has a **fixed 4 MiB reservation** starting at byte offset **4096** (`PAGE_SIZE`). After the WAL reservation, node / edge / property **extents** (1 MiB each) and **extent-index** pages are appended as needed. Extent locations are tracked via header fields and index-page chains — there is no single contiguous "data region".

```
Offset      0 -    127 : File header (128 bytes); magic, counts, IDs, extent-index heads, WAL fields
Offset    128 -   4095 : Unused (padding to complete page 0)
Offset   4096 - (4096 + 4 MiB - 1) : WAL reservation (1024 pages; live length in header `wal_length`)
Offset 4198400 -    end : Extents and index pages (4 KiB pages), allocated toward EOF
```
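The numbers in the layout are easy to cross-check from the page size and the WAL reservation:

```python
PAGE_SIZE = 4096                   # bytes per page; page 0 holds the header
WAL_BYTES = 4 * 1024 * 1024        # fixed 4 MiB WAL reservation

print(WAL_BYTES // PAGE_SIZE)      # 1024 WAL pages
print(PAGE_SIZE + WAL_BYTES)       # 4198400, first byte of the extent area
```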

- NodeSlot: fixed 64 bytes
- EdgeSlot: fixed 80 bytes
- Adjacency list: singly linked, prepend on insert
- Properties: custom binary format (no external crate dependencies)

The byte-level format specification lives in the GitHub repository: [docs/reference/format-spec.md](https://github.com/hy-token/liel/blob/main/docs/reference/format-spec.md).

---

## Limitations

- **No concurrent writes to the same file.** A second writer is rejected with `AlreadyOpenError` using an in-process registry plus a cross-process lock directory. This protects the file, but it does not make peer-to-peer multi-writer mutation supported.
- **The Python `GraphDB` uses a process-wide lock** (`Arc<Mutex<...>>`). Concurrent calls from multiple threads serialize on the same handle.
- **No query language.** Python API and QueryBuilder only. Cypher and similar DSLs are deliberate non-goals for the current product shape.
- **No property index.** Filtered queries use full scans plus optional Python predicates — see the [Python guide](https://github.com/hy-token/liel/blob/main/docs/guide/connectors/python.md) for API-level performance notes.
- **No WASM support.** Browser and WASM support are backlog ideas, not part of the current compatibility promise.

If your deployment needs several producers, the recommended pattern today is **one writer + many readers** rather than peer-to-peer multi-process mutation of the same file.
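That pattern fits in a few lines of standard-library Python: producers enqueue mutation requests and a single worker thread owns the write handle. The `apply` callable stands in for actual liel operations; everything named here is illustrative:

```python
import queue
import threading

def writer_worker(requests, apply):
    """Single thread owns the write handle; mutations apply strictly in order."""
    applied = []
    while True:
        req = requests.get()
        if req is None:             # sentinel: shut the worker down
            break
        applied.append(apply(req))  # the only place writes ever happen
    return applied

requests = queue.Queue()
results = []
worker = threading.Thread(
    target=lambda: results.extend(writer_worker(requests, str.upper))
)
worker.start()

# Any number of producers enqueue; none of them touches the file directly.
for item in ("add alice", "add bob", None):
    requests.put(item)
worker.join()
print(results)   # ['ADD ALICE', 'ADD BOB']
```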

---

## Documentation

The PyPI source distribution is intentionally small and does not include the
full documentation tree or example scripts. Use the GitHub repository for:

- [Documentation](https://github.com/hy-token/liel/tree/main/docs)
- [Examples](https://github.com/hy-token/liel/tree/main/examples)
- [Notebooks](https://github.com/hy-token/liel/tree/main/notebooks)
- [Contributing guide](https://github.com/hy-token/liel/blob/main/CONTRIBUTING.md)

---

## Contributing

Pull requests and issues are welcome. Please:

1. Read [CONTRIBUTING.md](https://github.com/hy-token/liel/blob/main/CONTRIBUTING.md) before opening a PR.
2. Run the local checks (`cargo fmt`, `cargo clippy`, `cargo test`, `pytest tests/python/`) — they mirror CI.
3. Keep changes focused. For larger changes, open an issue first to discuss the approach.

---

## License

[MIT](https://github.com/hy-token/liel/blob/main/LICENSE)

