Metadata-Version: 2.4
Name: simplegraph-kernel
Version: 0.2.0
Summary: Small typed graph kernel with canonical stores, projections, and sparse algorithms.
Keywords: graph,cli,sqlite,duckdb,pagerank,traversal
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Provides-Extra: algorithms
Requires-Dist: numpy>=1.23; extra == "algorithms"
Requires-Dist: scipy>=1.10; extra == "algorithms"
Provides-Extra: duckdb
Requires-Dist: duckdb>=1.0; extra == "duckdb"
Requires-Dist: pyarrow>=14.0; extra == "duckdb"
Provides-Extra: postgres
Requires-Dist: psycopg[binary]>=3.1; extra == "postgres"
Provides-Extra: dev
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: twine>=5; extra == "dev"
Requires-Dist: numpy>=1.23; extra == "dev"
Requires-Dist: scipy>=1.10; extra == "dev"
Requires-Dist: duckdb>=1.0; extra == "dev"
Requires-Dist: pyarrow>=14.0; extra == "dev"
Requires-Dist: psycopg[binary]>=3.1; extra == "dev"

# simpleGraph

`simpleGraph` is a small typed graph kernel for projects that need stable refs,
typed node and edge definitions, trace rows, bounded traversal, and analytical graph
projections.

The design keeps mutable application state separate from analytical views:

- runtime configuration: `open_graph(config)`
- canonical stores: memory and SQLite, plus a Postgres schema adapter
- analytical projection: DuckDB export and sparse matrix snapshots
- algorithms: PageRank, seed propagation, connected components, multi-source
  distance, and seed reachability

The project is intentionally not a Neo4j replacement. It provides graph
building blocks that Katib and LogBD can share without inheriting each other's
domain model.

## Traversal

Traversal is intentionally bounded. `simpleGraph` supports indexed local graph
operations, not arbitrary graph-query planning:

- `iter_nodes(...)`
- `iter_edges(...)`
- `iter_incoming_edges(...)`
- `iter_outgoing_edges(...)`
- `iter_incident_edges(...)`
- `neighbors(..., direction="both")`
- `subgraph(..., direction="both")`
- `path(..., direction="outgoing")`
- `traverse(..., max_depth=..., max_nodes=..., max_edges=...)`

Directional edges are followed along their stored direction for `outgoing`
traversal and against it for `incoming` traversal. Edges whose `EdgeType` is not
directional are traversable both ways. Deep traversal must set explicit limits
so high-branching graphs and cycles cannot expand without bounds.
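
The direction and limit rules above can be illustrated with a small standalone sketch. It is not the library's implementation: `EdgeRow` and `bounded_traverse` are hypothetical names, the edge list is scanned linearly instead of using indexes, and treating `direction="both"` as "follow every edge either way" is an assumption.

```python
from collections import deque
from typing import NamedTuple


class EdgeRow(NamedTuple):
    """Hypothetical edge record; `directional` is copied from its EdgeType."""
    source_ref: str
    target_ref: str
    directional: bool


def bounded_traverse(edges, start, *, direction="outgoing",
                     max_depth, max_nodes, max_edges):
    """Breadth-first walk that stops at whichever limit is reached first."""
    if direction not in ("outgoing", "incoming", "both"):
        raise ValueError(f"unknown direction: {direction}")

    def steps(ref):
        # Yield (edge, next_ref) pairs that may be followed from `ref`.
        for edge in edges:
            two_way = direction == "both" or not edge.directional
            if edge.source_ref == ref and (two_way or direction == "outgoing"):
                yield edge, edge.target_ref
            if edge.target_ref == ref and (two_way or direction == "incoming"):
                yield edge, edge.source_ref

    seen_nodes, seen_edges = {start}, set()
    queue = deque([(start, 0)])
    while queue:
        ref, depth = queue.popleft()
        if depth >= max_depth:
            continue  # depth limit: do not expand this node further
        for edge, nxt in steps(ref):
            new_edge = edge not in seen_edges
            new_node = nxt not in seen_nodes
            if (new_edge and len(seen_edges) >= max_edges) or (
                new_node and len(seen_nodes) >= max_nodes
            ):
                return seen_nodes, seen_edges  # size limit reached
            seen_edges.add(edge)
            if new_node:
                seen_nodes.add(nxt)  # visited set keeps cycles finite
                queue.append((nxt, depth + 1))
    return seen_nodes, seen_edges


edges = [
    EdgeRow("a", "b", True),
    EdgeRow("b", "c", True),
    EdgeRow("c", "a", True),   # cycle back to the start
    EdgeRow("d", "a", False),  # non-directional: traversable both ways
]
nodes, used = bounded_traverse(edges, "a", max_depth=2, max_nodes=10, max_edges=10)
print(sorted(nodes), len(used))
```

The visited sets stop cycles from being re-expanded, while the three limits cap the work done on high-branching graphs even when no cycle is involved.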

`MemoryGraphStore` is for small local graphs, tests, and embedded use.
`SQLiteGraphStore` is durable local storage with in-process indexes. Larger
graphs should use query-backed stores or DuckDB/sparse-matrix projections for
bulk analysis.

## Graph Model

Node types define node categories. A `NodeType` has `key`, `label`,
`properties_schema`, `system`, and `created_at`/`updated_at` timestamps.

Nodes are things. A node has `ref`, `type_key`, `label`, `properties`, and
`created_at`/`updated_at` timestamps. Custom fields belong in `properties`;
node types do not create custom SQL columns or per-type detail tables.

Edge types define connection kinds. An `EdgeType` has `key`,
`forward_label`, `reverse_label`, `directional`, `constraints`,
`properties_schema`, `system`, and `created_at`/`updated_at` timestamps.
`constraints` is for graph validity such as endpoint types or acyclic
containment. `properties_schema` validates edge `properties`.
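
As a rough illustration of what validating `properties` against a `properties_schema` could look like, the sketch below checks only the `required` and `properties[...]["type"]` keys that appear in the Quick Start schema. It assumes a JSON-Schema-like subset; `validate_properties` is a hypothetical helper, and the dialect the library actually supports may differ.

```python
# Maps the JSON-Schema-style type names used in this sketch to Python types.
_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_properties(properties, schema):
    """Return a list of problems; an empty list means the properties pass."""
    problems = []
    for name in schema.get("required", []):
        if name not in properties:
            problems.append(f"missing required property: {name}")
    for name, rule in schema.get("properties", {}).items():
        if name not in properties:
            continue  # absence is handled by the `required` check above
        type_name = rule.get("type")
        expected = _TYPES.get(type_name)
        if expected is None:
            continue  # unknown or absent type: this sketch does not check it
        value = properties[name]
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(value, bool) and type_name in ("number", "integer")
        if is_bool_as_number or not isinstance(value, expected):
            problems.append(f"{name}: expected {type_name}")
    return problems


file_schema = {
    "required": ["mime_type"],
    "properties": {"mime_type": {"type": "string"}},
}
print(validate_properties({"mime_type": "text/plain"}, file_schema))
print(validate_properties({}, file_schema))
```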

Edges are actual connections. An edge has `id`, `source_ref`, `target_ref`,
`edge_type_key`, `label` (a free-form user annotation), `weight`, `properties`,
and `created_at`/`updated_at` timestamps. Custom connection data belongs in
`properties`.

Traces record source, import, and process metadata as append-only support rows.
They are audit evidence for a node or edge, not the source of truth for whether
canonical graph rows exist. `source` is free text.

Containment is a regular directional edge with `edge_type_key="contains"`.
Objects may have multiple parents. The only built-in containment validation for
now is acyclic containment, represented by `constraints={"acyclic": True}` on
the `contains` edge type.
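
One way to picture the acyclic check: before adding a `contains` edge from a parent to a child, test whether the parent is already reachable from the child. The sketch below is a standalone illustration under that assumption; `would_create_cycle` is a hypothetical helper and the refs are made up.

```python
def would_create_cycle(contains_edges, parent, child):
    """True if adding parent -> child would close a containment cycle."""
    if parent == child:
        return True  # a node cannot contain itself
    children = {}
    for src, dst in contains_edges:
        children.setdefault(src, set()).add(dst)
    # A cycle appears exactly when `parent` is already reachable from `child`.
    stack, seen = [child], {child}
    while stack:
        ref = stack.pop()
        for nxt in children.get(ref, ()):
            if nxt == parent:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


existing = [
    ("folder:root", "folder:docs"),
    ("folder:docs", "file:b"),
    ("folder:archive", "file:b"),  # a second parent for file:b is allowed
]
print(would_create_cycle(existing, "file:b", "folder:root"))
print(would_create_cycle(existing, "folder:root", "folder:archive"))
```

Multiple parents never trip this check on their own; only an edge that leads back to one of its own ancestors does.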

Canonical storage tables are named `sg_node_types`, `sg_nodes`,
`sg_edge_types`, `sg_edges`, `sg_traces`, and `sg_node_metrics`.

## Quick Start

```python
from simplegraph import Edge, Node, Trace, open_graph

schema = {
    "node_types": [
        {"key": "note", "label": "Note"},
        {
            "key": "file",
            "label": "File",
            "properties_schema": {
                "required": ["mime_type"],
                "properties": {"mime_type": {"type": "string"}},
            },
        },
    ]
}
runtime = open_graph({"store": {"backend": "memory"}, "schema": schema})
store = runtime.store
store.upsert_node(Node(ref="note:a", type_key="note", label="A"))
store.upsert_node(Node(ref="file:b", type_key="file", label="B", properties={"mime_type": "text/plain"}))
edge = store.upsert_edge(Edge(source_ref="note:a", target_ref="file:b", edge_type_key="contains"))
store.add_trace(Trace(subject_kind="edge", subject_id=edge.id, source="import"))

print(list(store.iter_incoming_edges("file:b")))
print(list(store.iter_edges(source_ref="note:a", edge_type_key="contains")))
print(store.path("note:a", "file:b"))
print(store.traverse("note:a", max_depth=2, max_nodes=100, max_edges=100))
```

Runtime config keeps backend-specific settings inside adapter-owned `options`:

```python
runtime = open_graph({
    "store": {
        "backend": "sqlite",
        "options": {"path": "/state/simplegraph.sqlite"},
    },
    "projections": {
        "analytics": {
            "backend": "duckdb",
            "options": {"path": "/state/simplegraph.duckdb"},
        }
    },
})
```

## CLI

`simpleGraph` is a library plus deterministic CLI, not a server. It does not
ship an HTTP API, daemon, auth layer, or deployment runtime. Applications that
need a service should wrap the library or CLI themselves.

Install the package, then use the supported command groups:

```bash
pip install simplegraph-kernel

simplegraph --config simplegraph.toml --path /tmp/graph.sqlite graph info

simplegraph graph init --schema schema.json
simplegraph graph info

simplegraph graph validate records --input records.json
simplegraph graph validate snapshot --input snapshot.json

simplegraph graph import records --input records.json
simplegraph graph export records --output records.json
simplegraph graph export snapshot --output snapshot.json
simplegraph graph export duckdb --output graph.duckdb

simplegraph graph trace add --subject-kind edge --subject-id simplegraph:edge:... --source import
simplegraph graph trace list --subject-kind edge --subject-id simplegraph:edge:...

simplegraph node upsert --ref note:a --type note --label "A"
simplegraph node get --ref note:a
simplegraph node list --type note
simplegraph node delete --ref note:a
simplegraph node types

simplegraph edge upsert --from note:a --to file:b --type uses
simplegraph edge get --from note:a --to file:b --type uses
simplegraph edge list --from note:a --type uses
simplegraph edge delete --from note:a --to file:b --type uses
simplegraph edge types

simplegraph query neighbors --ref note:a --depth 2
simplegraph query path --from note:a --to file:b
simplegraph algorithm pagerank --snapshot snapshot.json --output pagerank.json
```

The CLI resolves the store path from `--path`, then `SIMPLEGRAPH_STORE_PATH`,
then the config file, then a default state path. Records input can be a JSON
object with `node_types`, `edge_types`, `nodes`, `edges`, and `traces` arrays,
or JSONL records with `record_type` set to `node_type`, `edge_type`, `node`,
`edge`, or `trace`. Algorithm commands consume `GraphSnapshot` JSON exported by
`simplegraph graph export snapshot`.
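
To show how the two records formats relate, here is a small sketch that folds JSONL records into the single-object form. It assumes each JSONL row carries its fields at the top level next to `record_type`; `jsonl_to_records_object` is a hypothetical helper, not part of the CLI.

```python
import json

# record_type value -> array name in the single-object form
_ARRAY_FOR = {
    "node_type": "node_types",
    "edge_type": "edge_types",
    "node": "nodes",
    "edge": "edges",
    "trace": "traces",
}


def jsonl_to_records_object(text):
    """Group JSONL rows by record_type into one records object."""
    records = {name: [] for name in _ARRAY_FOR.values()}
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        kind = row.pop("record_type", None)
        if kind not in _ARRAY_FOR:
            raise ValueError(f"line {line_no}: unknown record_type {kind!r}")
        records[_ARRAY_FOR[kind]].append(row)
    return records


jsonl = "\n".join([
    json.dumps({"record_type": "node_type", "key": "note", "label": "Note"}),
    json.dumps({"record_type": "node", "ref": "note:a", "type_key": "note", "label": "A"}),
    json.dumps({"record_type": "node", "ref": "note:b", "type_key": "note", "label": "B"}),
])
records = jsonl_to_records_object(jsonl)
print(json.dumps(records, indent=2))
```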

Type definitions are installed by schema initialization or records import.
Existing type definitions are immutable by default: identical definitions are
accepted and changed definitions fail with `schema.definition_mismatch`. Host
applications own migrations when a schema must change. The built-in system
types `entity`, `related_to`, and `contains` are available without being
declared in application schemas.
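
The immutability rule can be sketched as follows. This is an illustration only: `SchemaError` and `install_type` are hypothetical names, and comparing whole dictionaries is an assumption (a real store might ignore timestamps, for example). Only the `schema.definition_mismatch` code comes from the description above.

```python
class SchemaError(Exception):
    """Hypothetical error type carrying a machine-readable code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def install_type(registry, definition):
    """Install a type definition, accepting identical re-installs only."""
    key = definition["key"]
    existing = registry.get(key)
    if existing is None:
        registry[key] = dict(definition)  # store a copy of the definition
        return "installed"
    if existing == definition:
        return "unchanged"  # identical definitions are accepted
    raise SchemaError(
        "schema.definition_mismatch",
        f"type {key!r} is already defined differently",
    )


registry = {}
first = install_type(registry, {"key": "note", "label": "Note"})
second = install_type(registry, {"key": "note", "label": "Note"})
try:
    install_type(registry, {"key": "note", "label": "Memo"})
    error_code = None
except SchemaError as exc:
    error_code = exc.code
print(first, second, error_code)
```

Because a changed definition is rejected rather than merged, the host application decides how and when a schema migration happens.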

## Optional Extras

```bash
pip install "simplegraph-kernel[dev]"
```

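The core package has no required third-party dependencies. Each optional
feature has its own extra: `algorithms` installs NumPy and SciPy for the sparse
algorithms, `duckdb` installs DuckDB and PyArrow for DuckDB projections, and
`postgres` installs psycopg for the Postgres schema adapter. The `dev` extra
shown above installs all of these plus `build`, `pytest`, and `twine`.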

## Adapter Ownership

Project-specific adapters do not live in this package. Katib, LogBD, and other
host applications should own their own translation layers and depend on
`simplegraph-kernel` only for the shared graph models, stores, projections,
algorithms, and CLI.
