Metadata-Version: 2.4
Name: pretensor
Version: 0.1.0a3
Summary: Pretensor database knowledge graph — Kuzu-backed schema graph from precomputed DB introspection
Project-URL: Homepage, https://github.com/pretensor-ai/pretensor
Project-URL: Repository, https://github.com/pretensor-ai/pretensor
Project-URL: Issues, https://github.com/pretensor-ai/pretensor/issues
Project-URL: Changelog, https://github.com/pretensor-ai/pretensor/blob/main/CHANGELOG.md
Project-URL: Documentation, https://github.com/pretensor-ai/pretensor#readme
Author-email: Carlos <275843998+tensorcarlos@users.noreply.github.com>
License-Expression: MIT
License-File: LICENSE
Keywords: bigquery,data-catalog,database,graph,knowledge-graph,kuzu,llm,mcp,postgres,schema,snowflake
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: cryptography>=42.0
Requires-Dist: httpx>=0.27
Requires-Dist: igraph<1,>=0.11
Requires-Dist: kuzu<1,>=0.7
Requires-Dist: mcp>=1.0
Requires-Dist: psycopg2-binary>=2.9
Requires-Dist: pydantic>=2.0
Requires-Dist: rapidfuzz>=3.0
Requires-Dist: rich>=13.0
Requires-Dist: ruamel-yaml>=0.18
Requires-Dist: sqlalchemy>=2.0
Requires-Dist: sqlglot>=25.0
Requires-Dist: typer>=0.12
Provides-Extra: all-connectors
Requires-Dist: google-cloud-bigquery>=3.25; extra == 'all-connectors'
Requires-Dist: snowflake-sqlalchemy>=1.5; extra == 'all-connectors'
Provides-Extra: bigquery
Requires-Dist: google-cloud-bigquery>=3.25; extra == 'bigquery'
Provides-Extra: clustering
Requires-Dist: leidenalg>=0.10; extra == 'clustering'
Provides-Extra: dev
Requires-Dist: gitlint==0.19.1; extra == 'dev'
Requires-Dist: numpy>=1.26; extra == 'dev'
Requires-Dist: pre-commit>=3.0; extra == 'dev'
Requires-Dist: pyright>=1.1; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.15; extra == 'dev'
Provides-Extra: e2e
Requires-Dist: testcontainers[postgres]>=4.0; extra == 'e2e'
Provides-Extra: embeddings
Requires-Dist: huggingface-hub>=0.20; extra == 'embeddings'
Requires-Dist: numpy>=1.26; extra == 'embeddings'
Requires-Dist: onnxruntime>=1.17; extra == 'embeddings'
Requires-Dist: transformers>=4.38; extra == 'embeddings'
Provides-Extra: snowflake
Requires-Dist: snowflake-sqlalchemy>=1.5; extra == 'snowflake'
Description-Content-Type: text/markdown

# Pretensor OSS

[![PyPI](https://img.shields.io/pypi/v/pretensor.svg)](https://pypi.org/project/pretensor/)
[![CI](https://github.com/pretensor-ai/pretensor/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/pretensor-ai/pretensor/actions/workflows/ci.yml)
[![Status: Alpha](https://img.shields.io/badge/status-alpha-yellow.svg)](#status)
[![Python: 3.11 | 3.12](https://img.shields.io/badge/python-3.11%20%7C%203.12-blue.svg)](#prerequisites)

**Pretensor OSS** introspects **PostgreSQL** out of the box, and **Snowflake** and **BigQuery** through optional extras. It builds a **Kuzu** knowledge graph of tables, columns, foreign keys, inferred joins, and related metadata, and exposes that graph to AI tools through an **MCP** (Model Context Protocol) server. Agents query schema context and search without issuing raw SQL against your graph store.

> **Status: Alpha.** Pretensor is on PyPI as `pretensor` and currently in alpha. CLI flags, MCP tools, and graph schema can still change between alpha versions — pin exact versions until `1.0.0`. See [docs/releases.md](https://github.com/pretensor-ai/pretensor/blob/main/docs/releases.md) for the versioning policy.

## Who is this for

- Data analysts using AI to explore warehouses.
- Data engineers tired of copy-pasting DDLs into chat.
- Data architects who need grounded schema context for agents.
- Anyone feeding database schemas to an LLM by hand.

## Prerequisites

- **Python 3.11 or 3.12** (3.13 not yet tested).
- A reachable database for `pretensor index`. PostgreSQL is the fastest local path; Snowflake and BigQuery are supported via the `pretensor[snowflake]` and `pretensor[bigquery]` extras.

## Install

```bash
pip install pretensor
# or, inside a uv-managed environment:
uv pip install pretensor
```

Optional features are exposed as extras:

| Extra | Adds | Use when |
|-------|------|----------|
| `pretensor[snowflake]` | `snowflake-sqlalchemy` | You're indexing a Snowflake warehouse. |
| `pretensor[bigquery]` | `google-cloud-bigquery` | You're indexing BigQuery. |
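| `pretensor[clustering]` | `leidenalg` | You want Leiden community detection during indexing. Without this, Pretensor falls back to igraph Louvain (works, but no resolution tuning). |
| `pretensor[all-connectors]` | `snowflake-sqlalchemy`, `google-cloud-bigquery` | You want both the Snowflake and BigQuery connectors in one install. |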

Combine extras with commas, e.g. `pip install 'pretensor[snowflake,clustering]'`.
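
To check which optional dependencies are importable in the current environment, a small sketch along these lines may help. The import names are those of the packages listed in the table above; the helper names `module_available` and `available_extras` are hypothetical and not part of Pretensor.

```python
from importlib.util import find_spec

# Import name of the package each extra installs (illustrative mapping).
EXTRA_MODULES = {
    "snowflake": "snowflake.sqlalchemy",
    "bigquery": "google.cloud.bigquery",
    "clustering": "leidenalg",
}


def module_available(name: str) -> bool:
    """Return True if `name` can be located (parent packages may get imported)."""
    try:
        return find_spec(name) is not None
    except (ImportError, ValueError):
        # ImportError covers a missing parent package such as `google`.
        return False


def available_extras() -> dict[str, bool]:
    """Map each extra name to whether its package is importable."""
    return {extra: module_available(mod) for extra, mod in EXTRA_MODULES.items()}


if __name__ == "__main__":
    for extra, present in available_extras().items():
        print(f"pretensor[{extra}]: {'installed' if present else 'missing'}")
```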

Try it without installing:

```bash
uvx --from pretensor pretensor --help
```

> **A note on alpha versions.** Plain `pip install pretensor` currently installs the latest alpha because PyPI has no stable release yet. Once `1.0.0` ships, installing later alphas will require `--pre` (e.g. `pip install --pre pretensor`). For a deterministic install today, pin a specific version (e.g. `pretensor==<version>`); see the [PyPI badge above](https://pypi.org/project/pretensor/) for the latest.
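
The selection rule described in the note can be sketched in a few lines. This is a simplified model of the default pre-release handling, not pip's actual resolver: it only understands `X.Y.Z` with an optional `a`/`b`/`rc` suffix, and the version lists in the usage lines are made up for illustration.

```python
import re

_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?$")
_PRE_RANK = {"a": 0, "b": 1, "rc": 2}
_FINAL_RANK = 3  # final releases sort after every pre-release of the same X.Y.Z


def parse(version: str) -> tuple[int, int, int, int, int]:
    """Turn a string like '0.1.0a3' into a sortable key."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch, tag, num = match.groups()
    rank = _PRE_RANK[tag] if tag else _FINAL_RANK
    return int(major), int(minor), int(patch), rank, int(num or 0)


def is_prerelease(version: str) -> bool:
    return parse(version)[3] < _FINAL_RANK


def pick(versions: list[str], allow_pre: bool = False) -> str:
    """Pick the newest version, skipping pre-releases when a stable one exists."""
    stable = [v for v in versions if not is_prerelease(v)]
    # Pre-releases are candidates only on request or when nothing stable exists.
    candidates = versions if (allow_pre or not stable) else stable
    return max(candidates, key=parse)


# Hypothetical release histories, for illustration only.
today = ["0.1.0a1", "0.1.0a2", "0.1.0a3"]
later = ["0.1.0a3", "1.0.0", "1.1.0a1"]
print(pick(today))                  # only alphas exist, so an alpha is chosen
print(pick(later))                  # a stable release exists, so alphas are skipped
print(pick(later, allow_pre=True))  # the equivalent of passing --pre
```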

If you want to hack on Pretensor itself rather than use it, see the contributor setup in [CONTRIBUTING.md](https://github.com/pretensor-ai/pretensor/blob/main/CONTRIBUTING.md) for the `git clone` + `make install` flow.

## Quickstart

```bash
pretensor index postgresql://USER:PASSWORD@HOST:5432/DBNAME
pretensor serve --config-only   # prints mcpServers JSON for Claude / Cursor
```

`serve --config-only` prints the **`mcpServers` JSON** to stdout. Merge the `pretensor` entry into your Claude or Cursor MCP settings — the IDE starts the server automatically. Run `pretensor serve` directly if you prefer a long-running terminal process (config hints go to stderr, keeping stdout clean for JSON-RPC).
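
If you prefer to script the merge step, a minimal sketch might look like the following. It assumes the printed JSON has a top-level `mcpServers` object containing a `pretensor` key; the `command` and `args` values shown are placeholders, not the real output, and `merge_mcp_entry` is a hypothetical helper.

```python
import json


def merge_mcp_entry(settings: dict, printed: dict, name: str = "pretensor") -> dict:
    """Return a copy of `settings` with one server entry from `printed` merged in."""
    entry = printed["mcpServers"][name]
    merged = dict(settings)
    # Copy the server map so the caller's settings object is left untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


# Placeholder standing in for the stdout of `pretensor serve --config-only`;
# the actual fields inside the entry may differ.
printed = json.loads(
    '{"mcpServers": {"pretensor": {"command": "pretensor", "args": ["serve"]}}}'
)
# Placeholder for settings that already list another MCP server.
existing = {"mcpServers": {"other": {"command": "other-server"}}}

merged = merge_mcp_entry(existing, printed)
print(json.dumps(merged, indent=2))
```

Returning a new dictionary rather than mutating `settings` makes it easy to diff the result against the original before writing it back to the settings file.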

Use **`--state-dir`** on `index` / `reindex` and **`--graph-dir`** on `serve` when overriding the default state directory (`.pretensor`).

**Full guide — install, tools, visibility, reindexing, graph visualization:** [guides/quickstart.md](https://github.com/pretensor-ai/pretensor/blob/main/guides/quickstart.md)

## MCP tools

| Name | Role |
|------|------|
| `list_databases` | List indexed database connections with table counts and staleness. |
| `schema` | Inspect node labels, edge types, and available properties before writing Cypher. |
| `query` | BM25 keyword search over table and entity metadata. |
| `cypher` | Read-only Kuzu Cypher for one indexed database; mutating clauses are rejected. |
| `context` | Full context for one physical table, including columns, joins, lineage, and cluster metadata. |
| `traverse` | Join paths between two physical tables, including confirmed cross-database paths. |
| `impact` | Downstream tables reachable from a table via FK and inferred-join edges. |
| `detect_changes` | Compare the live database schema to the last indexed snapshot without mutating the graph. |
| `compile_metric` | Compile semantic-layer YAML into validated SQL for one indexed database. |
| `validate_sql` | Validate SQL against the indexed graph before execution. |

## Architecture

`src/pretensor/` is organized by subsystem:

- **`connectors/`** — database-specific introspection (PostgreSQL, Snowflake, BigQuery)
- **`core/`** — Kuzu graph store, schema writing, relationship discovery
- **`intelligence/`** — deterministic graph intelligence (classification, clustering, join-path precomputation; metric-template code exists but is not part of the default OSS indexing flow)
- **`mcp/`** — MCP server, tools, resources
- **`cli/`** — Typer CLI (`index`, `reindex`, `serve`, `list`, `quickstart`, `export`, `validate`, `sync-grants`, `add`, `remove`, plus the `semantic` subcommand group)

## Status

Pretensor is in **alpha**. Until the first stable release:

- The package on PyPI is named `pretensor`. The first stable release will be `1.0.0`; everything before that is alpha. `pip install pretensor` works today because no stable version exists yet; once `1.0.0` ships, installing later pre-releases will require `--pre`.
- There is no SemVer stability guarantee yet, so CLI flags, MCP tools, and graph schema may change between alphas. Pin exact versions.
- Treat current builds as evaluation software and test upgrades in a staging environment before production use.

Progress and release notes: [CHANGELOG.md](https://github.com/pretensor-ai/pretensor/blob/main/CHANGELOG.md).

## Contributing

See [CONTRIBUTING.md](https://github.com/pretensor-ai/pretensor/blob/main/CONTRIBUTING.md). Security issues: see [SECURITY.md](https://github.com/pretensor-ai/pretensor/blob/main/SECURITY.md).

## Tests

```bash
make verify
```

Individual commands are also available:

```bash
make test      # pytest
make lint      # ruff check
make typecheck # pyright
```

## License

MIT — see [LICENSE](https://github.com/pretensor-ai/pretensor/blob/main/LICENSE).
