Metadata-Version: 2.4
Name: tomo-sdk
Version: 1.0.0
Summary: Python SDK for Apache AGE + pgvector — graph and vector queries on PostgreSQL
Project-URL: Homepage, https://tomo.rizlabs.com
Project-URL: Documentation, https://tomo.rizlabs.com
Project-URL: Repository, https://git.rizlabs.com/gregf/tomo
Project-URL: Issues, https://git.rizlabs.com/gregf/tomo/issues
Author: Greg Felice
License-Expression: Apache-2.0
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: antlr4-python3-runtime==4.11.1
Requires-Dist: psycopg[binary]>=3.1
Provides-Extra: all
Requires-Dist: igraph>=0.11; extra == 'all'
Requires-Dist: langchain-community>=0.1; extra == 'all'
Requires-Dist: langchain-core>=0.1; extra == 'all'
Requires-Dist: llama-index-core>=0.10; extra == 'all'
Requires-Dist: networkx>=3.0; extra == 'all'
Requires-Dist: node2vec>=0.4; extra == 'all'
Requires-Dist: pandas>=2.0; extra == 'all'
Requires-Dist: pyarrow>=14.0; extra == 'all'
Requires-Dist: torch-geometric>=2.4; extra == 'all'
Requires-Dist: torch>=2.0; extra == 'all'
Provides-Extra: arrow
Requires-Dist: pyarrow>=14.0; extra == 'arrow'
Provides-Extra: dev
Requires-Dist: igraph>=0.11; extra == 'dev'
Requires-Dist: jupyterlab>=4.0; extra == 'dev'
Requires-Dist: mypy>=1.8; extra == 'dev'
Requires-Dist: networkx>=3.0; extra == 'dev'
Requires-Dist: pandas>=2.0; extra == 'dev'
Requires-Dist: pyarrow>=14.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.3; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.5; extra == 'docs'
Provides-Extra: embeddings
Requires-Dist: node2vec>=0.4; extra == 'embeddings'
Provides-Extra: igraph
Requires-Dist: igraph>=0.11; extra == 'igraph'
Provides-Extra: jupyter
Requires-Dist: jupyterlab>=4.0; extra == 'jupyter'
Provides-Extra: langchain
Requires-Dist: langchain-community>=0.1; extra == 'langchain'
Requires-Dist: langchain-core>=0.1; extra == 'langchain'
Provides-Extra: llamaindex
Requires-Dist: llama-index-core>=0.10; extra == 'llamaindex'
Provides-Extra: networkx
Requires-Dist: networkx>=3.0; extra == 'networkx'
Provides-Extra: pandas
Requires-Dist: pandas>=2.0; extra == 'pandas'
Provides-Extra: pool
Requires-Dist: psycopg-pool>=3.1; extra == 'pool'
Provides-Extra: pyg
Requires-Dist: torch-geometric>=2.4; extra == 'pyg'
Requires-Dist: torch>=2.0; extra == 'pyg'
Description-Content-Type: text/markdown

# Piggie

[![PyPI version](https://img.shields.io/pypi/v/piggie)](https://pypi.org/project/piggie/)
[![Python versions](https://img.shields.io/pypi/pyversions/piggie)](https://pypi.org/project/piggie/)
[![License](https://img.shields.io/github/license/gregfelice/tomo)](https://github.com/gregfelice/tomo/blob/main/LICENSE)

**One PostgreSQL instance replaces your graph database, your vector database, and your ETL pipeline.**

Piggie is a Python SDK for [Apache AGE](https://age.apache.org/) + [pgvector](https://github.com/pgvector/pgvector) on PostgreSQL. It provides Cypher graph queries, vector similarity search, and hybrid graph+vector operations through a single connection string, with no vendor lock-in.

```
┌──────────────────────────────────────────────────┐
│                  Your Application                 │
├──────────────────────────────────────────────────┤
│              Piggie SDK (Python)                  │
│  Cypher │ SQL │ Vector Search │ Hybrid │ KM API  │
├──────────────────────────────────────────────────┤
│                  PostgreSQL                       │
│  Apache AGE     │  pgvector    │  Core SQL       │
│  (graphs)       │  (vectors)   │  (relational)   │
└──────────────────────────────────────────────────┘
```

## Quick Start

```bash
# Start the database (Docker, any platform, ARM + x86)
git clone https://github.com/gregfelice/tomo.git && cd tomo
docker compose up -d

# Install the SDK
pip install piggie
```

```python
import piggie

db = piggie.connect("postgresql://piggie:piggie@localhost:5488/piggie", graph="my_graph")

# Cypher queries → DataFrames
df = db.cypher("MATCH (n:Person) RETURN n.name, n.age").to_df()

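# Vector search (query_vec is a placeholder; use an embedding whose length matches the column)
query_vec = [0.1, 0.2, 0.3]
results = db.vector_search("documents", "embedding", query_vec, k=10).to_df()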

# Hybrid: graph traversal + vector similarity
results = db.hybrid_search(
    cypher="MATCH (p:Paper)-[:CITES]->(cited) RETURN cited",
    vector_table="papers",
    vector_column="abstract_embedding",
    query_vector=query_vec,
    k=10,
).to_df()
```

## Installation

```bash
pip install piggie

# With pandas support (quotes keep shells such as zsh from expanding the brackets)
pip install "piggie[pandas]"

# With all optional dependencies
pip install "piggie[all]"
```

## Database Setup

Piggie requires PostgreSQL with [Apache AGE](https://age.apache.org/) (graph) and [pgvector](https://github.com/pgvector/pgvector) (vectors).

**Docker (recommended)** — multi-arch image (ARM + x86), hardened config, everything pre-configured:

```bash
docker compose up -d
```

Verify the stack:

```bash
python docker/smoke-test.py
```
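
Independently of the smoke test script, the extension requirement can be checked by hand. The following is a minimal sketch, assuming psycopg 3 (already a dependency of the SDK) and the extension names `age` and `vector` under which Apache AGE and pgvector register themselves. The helper names `missing_extensions` and `installed_extensions` are hypothetical and are not part of the Piggie API.

```python
# Extension names registered by Apache AGE and pgvector in pg_extension.
REQUIRED_EXTENSIONS = {"age", "vector"}


def missing_extensions(installed):
    """Return the required extensions that are absent from `installed`, sorted by name."""
    return sorted(REQUIRED_EXTENSIONS - set(installed))


def installed_extensions(dsn):
    """List the extensions created in the database that `dsn` points to."""
    import psycopg  # imported lazily so missing_extensions() works without a driver

    with psycopg.connect(dsn) as conn:
        rows = conn.execute("SELECT extname FROM pg_extension").fetchall()
    return [row[0] for row in rows]
```

With the Docker stack running, `missing_extensions(installed_extensions("postgresql://piggie:piggie@localhost:5488/piggie"))` should return an empty list if both extensions have been created in that database.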

**Other platforms:** See the [Install Database](https://tomo.rizlabs.com/guide/install-database/) guide for Ubuntu, macOS, Fedora, and from-source instructions.

**Cloud deployment:** Terraform modules for [AWS, GCP, Azure, DigitalOcean, and Hetzner](https://tomo.rizlabs.com/guide/cloud-deployment/) — one `terraform apply` and you're running.

## Docker Image

The `piggie/db` image ships PostgreSQL 18 + Apache AGE 1.7.0 + pgvector 0.8.2 with:

- **Multi-arch:** linux/amd64 + linux/arm64 (native on Apple M-series, AWS Graviton)
- **scram-sha-256** authentication (no trust or md5)
- **TLS enabled** with a self-signed certificate (replace it for production)
- **Audit logging** for connections, disconnections, and DDL
- **Minimal attack surface** — multi-stage build, no build tools in runtime
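
Because the image enables TLS, a client can ask libpq to insist on an encrypted connection. The sketch below uses only the standard library to add the libpq `sslmode` parameter to a connection URL. The helper name `with_sslmode` is hypothetical, and whether `piggie.connect` forwards URL parameters to libpq unchanged is an assumption to verify against the API reference.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_sslmode(dsn, mode="require"):
    """Return `dsn` with the libpq `sslmode` query parameter set to `mode`."""
    parts = urlsplit(dsn)
    params = dict(parse_qsl(parts.query))  # keep any parameters already present
    params["sslmode"] = mode
    return urlunsplit(parts._replace(query=urlencode(params)))


# Connection URL from the Quick Start, now requiring TLS.
dsn = with_sslmode("postgresql://piggie:piggie@localhost:5488/piggie")
print(dsn)
```

`sslmode=require` encrypts the connection without verifying the server certificate, which is why it works with the bundled self-signed certificate. Once a certificate from a trusted CA is in place, `verify-full` is the stricter choice.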

## Benchmarks

We benchmarked AGE + pgvector against Neo4j, Kuzu, and NebulaGraph across 12 workloads at three scales (10K, 100K, and 1M nodes). **AGE won all 12 workloads at every scale.** Selected results against Neo4j at 1M nodes:

| Workload | AGE | Neo4j | Speedup |
|----------|-----|-------|---------|
| Point Lookup | 0.29ms | 0.97ms | 3x |
| Pattern Match | 0.14ms | 1.8ms | 13x |
| VLE Traversal | 0.18ms | 0.62ms | 3x |
| Concurrent Queries | 52ms | 1,323ms | 25x |
| Bulk Load | 59ms | 133ms | 2x |
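
The Speedup column is consistent with dividing the Neo4j latency by the AGE latency and rounding to the nearest integer. That reading is an inference from the numbers in the table, not a statement of the benchmark methodology. A short check:

```python
# Latencies in milliseconds at 1M nodes, copied from the table: (AGE, Neo4j).
latencies_ms = {
    "Point Lookup": (0.29, 0.97),
    "Pattern Match": (0.14, 1.8),
    "VLE Traversal": (0.18, 0.62),
    "Concurrent Queries": (52, 1323),
    "Bulk Load": (59, 133),
}

# Speedup = Neo4j latency / AGE latency, rounded to the nearest integer.
speedups = {name: round(neo4j / age) for name, (age, neo4j) in latencies_ms.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor}x")
```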

The Piggie SDK also **won 14/16 algorithm benchmarks** against Neo4j GDS, using igraph (C) and networkit (C++) backends.

Read the full analysis: [Can One PostgreSQL Replace Your Graph Database and Your Vector Database?](https://rizlabs.com/graph-database-benchmarks/)

## Graph Algorithms

19 algorithms across 7 categories with automatic backend selection:

```python
# Auto-selects igraph C backend
df = db.centrality(measure="betweenness")

# Auto-selects networkit C++ backend
df = db.centrality(measure="closeness")

# Community detection
df = db.communities(method="louvain")
```

Backends: igraph (C), networkit (C++), NetworkX (Python). The SDK detects installed backends and routes each algorithm to the fastest available implementation.
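
The detect-and-route idea can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the SDK's implementation: the preference order and the function names `available_backends` and `pick_backend` are hypothetical, and the real routing table may rank backends differently for each algorithm.

```python
from importlib.util import find_spec

# Hypothetical preference order, fastest first; the SDK's real routing may differ.
PREFERENCE = ("igraph", "networkit", "networkx")


def available_backends(candidates=PREFERENCE):
    """Return the candidate backends whose packages are importable."""
    return [name for name in candidates if find_spec(name) is not None]


def pick_backend(supported, available):
    """Pick the first backend, in preference order, that is installed and supports the algorithm."""
    for name in PREFERENCE:
        if name in supported and name in available:
            return name
    raise RuntimeError("no installed backend supports this algorithm")
```

For an algorithm that only networkit and NetworkX implement, `pick_backend({"networkit", "networkx"}, available_backends())` falls back to NetworkX when networkit is not installed.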

## Integrations

- **LangChain** — `PiggieVectorStore` and `PiggieGraphStore` for RAG pipelines
- **LlamaIndex** — `PiggieGraphStore` (triplets) and `PiggiePropertyGraphStore` (labeled property graph)
- **NetworkX** — `to_networkx()` conversion + built-in PageRank, community detection, shortest path, centrality
- **PyTorch Geometric** — `FeatureStore` and `GraphStore` for GNN training on AGE data

## Notebooks

| Notebook | Description |
|----------|-------------|
| [Quickstart](notebooks/01_quickstart.ipynb) | Getting started with Piggie |
| [Hybrid Search](notebooks/02_hybrid_demo.ipynb) | Graph + vector search combined |
| [Bulk Load](notebooks/03_bulk_load.ipynb) | Bulk load and export |
| [Algorithms](notebooks/04_algorithms.ipynb) | Graph algorithms with igraph/networkit backends |
| [RAG Pipeline](notebooks/05_rag_pipeline.ipynb) | RAG with LlamaIndex and LangChain |
| [GNN Training](notebooks/06_gnn_training.ipynb) | GNN training with PyTorch Geometric |

## Documentation

- [Quick Start](https://tomo.rizlabs.com/guide/quickstart/) — 5-minute getting-started guide
- [Install Database](https://tomo.rizlabs.com/guide/install-database/) — Docker, Ubuntu, macOS, Fedora, from source
- [API Reference](https://tomo.rizlabs.com/guide/api-reference/) — full method signatures and parameters
- [Graph Algorithms](https://tomo.rizlabs.com/guide/algorithms/) — algorithm catalog and backend selection
- [Integrations](https://tomo.rizlabs.com/guide/integrations/) — LangChain, LlamaIndex, NetworkX, PyG
- [Compatibility](https://tomo.rizlabs.com/guide/compatibility/) — tested version matrix
- [Cloud Deployment](https://tomo.rizlabs.com/guide/cloud-deployment/) — Terraform for AWS, GCP, Azure, DO, Hetzner
- [Troubleshooting](https://tomo.rizlabs.com/guide/troubleshooting/) — common issues and fixes
- [Security](https://tomo.rizlabs.com/guide/security/) — TLS, authentication, credential handling
- [Hardening](https://tomo.rizlabs.com/guide/hardening/) — CIS benchmark alignment

## Security

See [SECURITY.md](SECURITY.md) for the vulnerability disclosure policy and security model.

## Community

- [GitHub Discussions](https://github.com/gregfelice/tomo/discussions) — questions, ideas, show and tell
- [Issue Tracker](https://github.com/gregfelice/tomo/issues) — bug reports and feature requests
- [Contributing Guide](CONTRIBUTING.md) — how to set up and contribute

## Requirements

- Python 3.10+
- PostgreSQL 16–18 with Apache AGE and pgvector extensions (or use the [Docker image](https://tomo.rizlabs.com/guide/install-database/))

## License

Apache 2.0
