Metadata-Version: 2.4
Name: gnokee
Version: 0.4.0
Summary: Memory infrastructure for personal AI — bi-temporal facts, honest contradictions, autonomous maintenance, real forgetting. MCP-native, multi-tenant, multilingual.
Project-URL: Homepage, https://github.com/gnokeelabs/gnokee
Project-URL: Source, https://github.com/gnokeelabs/gnokee
Project-URL: Issues, https://github.com/gnokeelabs/gnokee/issues
Author: gnokeelabs
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: ai,bi-temporal,graphiti,knowledge-graph,mcp,memory,personal-ai
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database :: Front-Ends
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Requires-Dist: asyncpg>=0.31
Requires-Dist: graphiti-core<0.30,>=0.29
Requires-Dist: httpx>=0.28
Requires-Dist: langdetect>=1.0
Requires-Dist: neo4j<7,>=5.20
Requires-Dist: openai>=2.0
Requires-Dist: pgvector>=0.4
Requires-Dist: pydantic-settings>=2.5
Requires-Dist: pydantic<3,>=2.13
Requires-Dist: structlog>=25.0
Provides-Extra: core
Provides-Extra: dev
Requires-Dist: hatchling>=1.29; extra == 'dev'
Requires-Dist: mypy>=2.0; extra == 'dev'
Requires-Dist: pre-commit>=4; extra == 'dev'
Requires-Dist: pynacl>=1.5; extra == 'dev'
Requires-Dist: pytest-asyncio>=1.3; extra == 'dev'
Requires-Dist: pytest>=9; extra == 'dev'
Requires-Dist: pyyaml>=6.0; extra == 'dev'
Requires-Dist: ruff>=0.15; extra == 'dev'
Requires-Dist: tach>=0.30; extra == 'dev'
Requires-Dist: testcontainers>=4.14; extra == 'dev'
Provides-Extra: ingest
Provides-Extra: mcp
Requires-Dist: mcp<2.0,>=1.27; extra == 'mcp'
Provides-Extra: retrieval
Description-Content-Type: text/markdown

# gnokee

> Memory infrastructure for personal AI — bi-temporal facts, honest contradictions, autonomous maintenance, real forgetting. MCP-native, multi-tenant, multilingual.

[![PyPI](https://img.shields.io/pypi/v/gnokee.svg)](https://pypi.org/project/gnokee/)
[![npm](https://img.shields.io/npm/v/gnokee.svg)](https://www.npmjs.com/package/gnokee)
[![License: Apache-2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)

**Status:** v0.2. This is the v0.1 surface (ingest + recall + MCP) plus typed clinical-data reads (`gnokee_lab_query` / `gnokee_med_query`) per [ADR-0009](docs/adr/0009-clinical-labs-need-structured-store.md) and [ADR-0010](docs/adr/0010-meds-share-lab-record-shape.md). See [`docs/specs/v0.1.md`](docs/specs/v0.1.md) for the core surface and the ADRs for the v0.2 deltas.

## Quickstart

Requirements: Docker, Python 3.10+, and an API key for an OpenAI-compatible LLM (default model `gpt-4o-mini`, selected in the Q4 spike).

```bash
# 1. Bring up Postgres + Neo4j + TEI
make up

# 2. Create a virtualenv and install (editable)
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[mcp,dev]"

# 3. Configure env
cp .env.example .env
# edit .env to set GNOKEE_OPENAI_API_KEY (or GNOKEE_LLM_API_KEY)

# 4. Apply migrations and run the demo
make migrate
make demo
```

The demo ingests five episodes about a person, runs a recall, and prints any contradictions. Output is plain stdout — gnokee is a library + MCP server, not a UI.

### MCP server

```bash
python -m gnokee.mcp        # stdio (default; for Claude Desktop / Code / Cursor)
GNOKEE_MCP_HTTP=1 python -m gnokee.mcp   # streamable-http (dev only)
```

Tools surfaced:

| Tool | Purpose |
| --- | --- |
| `gnokee_ingest_episode` | Store a fact / event / observation in bi-temporal memory. |
| `gnokee_recall` | Natural-language fact retrieval with provenance handles. |
| `gnokee_fact_provenance` | Fetch the original episode body behind a `fact_uuid`. |
| `gnokee_lab_query` (v0.2) | Typed clinical-lab reads: `latest \| history \| min \| max \| avg \| count` over `lab_record`. |
| `gnokee_med_query` (v0.2) | Typed medication-history reads: `active \| history \| allergies \| switches` over `med_record`. |

Core schemas are in [`docs/specs/v0.1.md`](docs/specs/v0.1.md) §6; the v0.2 typed reads are documented in [ADR-0009](docs/adr/0009-clinical-labs-need-structured-store.md) and [ADR-0010](docs/adr/0010-meds-share-lab-record-shape.md).
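
To check that a local server exposes the tools listed above, a minimal sketch using the `mcp` Python SDK (installed by the `mcp` extra) could look like the following. The helper names `missing_tools` and `list_gnokee_tools` are illustrative and not part of gnokee. Only tool names are inspected, because the argument schemas live in the spec rather than in this README.

```python
import os
import sys

# Tool names copied from the table above.
EXPECTED_TOOLS = {
    "gnokee_ingest_episode",
    "gnokee_recall",
    "gnokee_fact_provenance",
    "gnokee_lab_query",
    "gnokee_med_query",
}


def missing_tools(advertised):
    """Return the expected tool names the server did not advertise, sorted."""
    return sorted(EXPECTED_TOOLS - set(advertised))


async def list_gnokee_tools():
    """Start `python -m gnokee.mcp` over stdio and return its tool names."""
    # Imported lazily so missing_tools() works without the `mcp` extra installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(
        command=sys.executable,
        args=["-m", "gnokee.mcp"],
        # Pass the full environment so any exported GNOKEE_* variables reach the server.
        env=dict(os.environ),
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]
```

With the compose stack up and `.env` configured, `asyncio.run(list_gnokee_tools())` should return the advertised names, and `missing_tools(...)` on that list reports any gaps.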

### Environment variables

| Var | Purpose |
| --- | --- |
| `GNOKEE_PG_DSN` | Postgres DSN (`postgresql://…`) |
| `GNOKEE_NEO4J_URI` / `GNOKEE_NEO4J_USER` / `GNOKEE_NEO4J_PASSWORD` | Neo4j Bolt URI and credentials |
| `GNOKEE_TEI_URL` | TEI base URL (e.g. `http://localhost:8080`) |
| `GNOKEE_EMBED_MODEL` | default `bge-m3` (1024-dim, locked) |
| `GNOKEE_OPENAI_BASE_URL` / `GNOKEE_OPENAI_API_KEY` | Base URL and API key for the consumer-supplied, OpenAI-compatible LLM (alias: `GNOKEE_LLM_*`) |
| `GNOKEE_LLM_MODEL` | default `gpt-4o-mini` |
| `GNOKEE_TENANT_DEFAULT` | demo + tests only; production paths require `tenant_id` explicitly |
| `GNOKEE_MCP_HTTP` | `0` for stdio, `1` for streamable-http (dev only) |
| `GNOKEE_LOG_LEVEL` | default `info` |
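
The defaults in the table can be illustrated with a small stdlib-only loader. This is a sketch, not gnokee's actual settings class: the `Settings` dataclass and `load_settings` function are hypothetical, only a subset of the variables is shown, and the precedence of `GNOKEE_OPENAI_API_KEY` over its `GNOKEE_LLM_API_KEY` alias is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for a subset of the variables in the table."""
    pg_dsn: Optional[str]
    tei_url: Optional[str]
    embed_model: str
    llm_api_key: Optional[str]
    llm_model: str
    mcp_http: bool
    log_level: str


def load_settings(env=None):
    """Read GNOKEE_* variables from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        pg_dsn=env.get("GNOKEE_PG_DSN"),
        tei_url=env.get("GNOKEE_TEI_URL"),
        embed_model=env.get("GNOKEE_EMBED_MODEL", "bge-m3"),
        # Assumption: the OPENAI-prefixed name wins when both are set.
        llm_api_key=env.get("GNOKEE_OPENAI_API_KEY") or env.get("GNOKEE_LLM_API_KEY"),
        llm_model=env.get("GNOKEE_LLM_MODEL", "gpt-4o-mini"),
        # "1" selects streamable-http; anything else falls back to stdio.
        mcp_http=env.get("GNOKEE_MCP_HTTP", "0") == "1",
        log_level=env.get("GNOKEE_LOG_LEVEL", "info"),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in unit tests without touching the process environment.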

### Tests

```bash
make test-unit             # no compose required
make up && make test-integration   # needs LLM key + compose stack
```


## What gnokee is

A memory layer that treats facts the way infrastructure treats state: declared, versioned, reconciled, garbage-collected. It ingests episodes from any source (files, APIs, event streams), stores them with bi-temporal validity, detects contradictions at write and surfaces them at read, supersedes facts explicitly rather than overwriting, forgets verifiably when asked, and runs maintenance autonomously.

MCP-native. Multi-tenant from day one. Multilingual by default (bge-m3). Built on top of Graphiti's bi-temporal knowledge graph primitive.
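
The idea of superseding a fact rather than overwriting it can be sketched in a few lines. This is a simplified illustration, not gnokee's or Graphiti's data model: the `Fact` fields and the `supersede` / `as_of` helpers are hypothetical, and only the valid-time axis is queried (`recorded_at` is kept as an audit stamp).

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: datetime          # when the fact became true in the world
    valid_to: Optional[datetime]  # when it stopped being true (None = still true)
    recorded_at: datetime         # when the store learned about it


def supersede(facts, new_fact):
    """Close the open fact in the same subject/predicate slot, then append the new one."""
    out = []
    for f in facts:
        same_slot = (f.subject, f.predicate) == (new_fact.subject, new_fact.predicate)
        if same_slot and f.valid_to is None:
            # The old fact is kept; only its validity interval is closed.
            f = replace(f, valid_to=new_fact.valid_from)
        out.append(f)
    out.append(new_fact)
    return out


def as_of(facts, subject, predicate, when):
    """Return the value that was valid at `when`, or None if nothing was."""
    for f in facts:
        if (f.subject, f.predicate) != (subject, predicate):
            continue
        if f.valid_from <= when and (f.valid_to is None or when < f.valid_to):
            return f.value
    return None


t1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2025, 1, 1, tzinfo=timezone.utc)
history = supersede([], Fact("alice", "lives_in", "Berlin", t1, None, t1))
history = supersede(history, Fact("alice", "lives_in", "Lisbon", t2, None, t2))
```

After the second call both facts are still in `history`; the Berlin fact simply carries a `valid_to` equal to `t2`, so a query for mid-2024 still resolves to Berlin while a query for mid-2025 resolves to Lisbon.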

## What gnokee is not

- Not a chat UI (use any MCP client)
- Not an LLM host (use Ollama, LiteLLM, direct APIs)
- Not a sync engine, workflow engine, agent framework, document parser, vault editor, auth provider, or federation layer

See `docs/architecture.md` for the full refusal list.

## Status

- [x] Namespace claimed (PyPI, npm, GitHub org `gnokeelabs`, domains gnokee.com/.dev/.io)
- [x] Q1 Graphiti spike — ADOPT as storage primitive ([ADR-0001](docs/adr/0001-build-on-graphiti.md))
- [x] Q2 bge-m3 cross-lingual — ADOPT (100% top-1 via direct cosine); retrieval reassigned to gnokee ([ADR-0004](docs/adr/0004-gnokee-owns-retrieval.md))
- [x] Q3 MCP token-efficiency — ADOPT gnokee response shape (60% token reduction at 91.7% top-3 recall vs Graphiti-raw 50%)
- [x] Q4 contradiction-classifier smoke test — ADOPT gpt-4o-mini (8/10 = 80% accuracy on labeled pairs)
- [x] Q5 forgetting hard-delete propagation — ADOPT (2/2 probes; Neo4j cascade + retrieval-surface clean)
- [x] Q5 storage adapter audit (FalkorDB) — Neo4j for v0.1; FalkorDB swap deferred to v0.2 (Graphiti-internal API differs; gnokee's Cypher is portable)
- [x] v0.1 spec finalized ([`docs/specs/v0.1.md`](docs/specs/v0.1.md))
- [x] v0.1 implementation — ingest + recall + MCP + contradictions, integration tests on real compose stack
- [x] Q7 clinical-labs spike — OFF-RAMP at 13.3%; typed `lab_record` table per [ADR-0009](docs/adr/0009-clinical-labs-need-structured-store.md)
- [x] Q8 med-supersession spike — ADOPT_WITH_GAPS at 53.3%; typed `med_record` table per [ADR-0010](docs/adr/0010-meds-share-lab-record-shape.md)
- [x] Q9 wearable-throughput off-ramp ([ADR-0011](docs/adr/0011-wearables-batch-or-off-ramp.md))
- [x] v0.2 typed clinical reads landed (`gnokee_lab_query` + `gnokee_med_query`); Q7 lifted to 66.7% ADOPT_WITH_GAPS, Q8 lifted to 93.3% ADOPT
- [x] Cross-tool eval suite formalised (vs Mem0 / Graphiti-alone / Zep-OSS / basic-memory). Stage A, 2026-05-09: 10 Q × 10 sessions, 3 SUTs, double-run judge agreed. Stage C-pilot, 2026-05-10: 30 Q × 20 sessions, 3 SUTs, double-run judge agreed; it confirms the magnitude, with all three SUTs at 0/30 strict on LongMemEval-S. Verdict: gnokee REJECT for the v0.2.x retrieval surface ([ADR-0012](docs/adr/0012-cross-tool-eval-verdict.md) Accepted). Q10.1: Zep-OSS + basic-memory adapters wired and smoke-tested (5 wired SUTs). The bottleneck moved from retrieval to synth-prompt abstention bias ([#62](https://github.com/gnokeelabs/gnokee/issues/62)); full-pilot Stage C (100 Q × ~47 sessions × Track 1 + Track 2) is tracked under [#63](https://github.com/gnokeelabs/gnokee/issues/63)
- [ ] First tagged release

## Roadmap

| Phase | Milestone |
|---|---|
| Spike | Q1 — Graphiti's bi-temporal model fits Omur + personal corpora |
| v0.1 | Single-tenant, single-binary; ingest + retrieval + MCP; basic forgetting; Apache 2.0 |
| v0.2 | Typed clinical reads (labs + meds) per ADR-0009/0010; structured Postgres siblings to Graphiti's narrative |
| v0.3 | Omur consumes gnokee; multi-tenant validation; encrypted-body branch |
| v1.0 | API stability commitment; published evals vs Mem0 / Graphiti-alone / basic-memory |

## Project name

`gnokee` = *gno* (Greek γνῶσις, "knowledge") + *kee* (English *keep*, custodian). Pun: "no key" — gnokee never holds keys, never decrypts (see `docs/architecture.md` § Privacy). Originally drafted as `Veda`; rebranded after namespace collision.

## License

Apache-2.0. See [`LICENSE`](LICENSE).

## Contributing

Pre-v0.1: closed to external contributions until spec stabilizes. Watch the repo for issues / RFCs once v0.1 spec lands.

## Source of truth

Design documents lived in Notion (private) until the first commit; from that point forward the repo is canonical. See [`AGENTS.md`](AGENTS.md) for the convention.
