Metadata-Version: 2.4
Name: sporedb
Version: 0.1.0
Summary: Bioprocess-native time-series database
Project-URL: Homepage, https://github.com/spore-db/SporeDB
Project-URL: Repository, https://github.com/spore-db/SporeDB
Project-URL: Documentation, https://spore-db.github.io/SporeDB/
Project-URL: Issues, https://github.com/spore-db/SporeDB/issues
Author: Rishikesh Ranjan
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: bioprocess,bioreactor,database,fermentation,phase-detection,time-series
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: argon2-cffi>=23.1.0
Requires-Dist: charset-normalizer>=3.4
Requires-Dist: click>=8.1.0
Requires-Dist: cryptography>=46.0.0
Requires-Dist: duckdb>=1.5.0
Requires-Dist: fastdtw>=0.3.4
Requires-Dist: filelock>=3.12
Requires-Dist: lark>=1.3.1
Requires-Dist: openpyxl>=3.1.5
Requires-Dist: pandas>=2.2.0
Requires-Dist: pyarrow>=23.0.0
Requires-Dist: pydantic>=2.12.0
Requires-Dist: pyjwt[crypto]>=2.12.0
Requires-Dist: python-dateutil>=2.9
Requires-Dist: pytz>=2024.1
Requires-Dist: rich>=13.0.0
Requires-Dist: ruptures>=1.1.9
Requires-Dist: scipy>=1.12
Requires-Dist: thefuzz[speedup]>=0.22
Requires-Dist: uncertainties>=3.2.0
Requires-Dist: uuid-utils>=0.14.0
Provides-Extra: all
Requires-Dist: aiosqlite>=0.20.0; extra == 'all'
Requires-Dist: alembic>=1.18.4; extra == 'all'
Requires-Dist: anywidget>=0.9.0; extra == 'all'
Requires-Dist: asyncpg>=0.31.0; extra == 'all'
Requires-Dist: authlib>=1.0; extra == 'all'
Requires-Dist: boto3>=1.42.0; extra == 'all'
Requires-Dist: fastapi>=0.136.0; extra == 'all'
Requires-Dist: httpx>=0.28.0; extra == 'all'
Requires-Dist: hypothesis>=6.100; extra == 'all'
Requires-Dist: influxdb-client>=1.50.0; extra == 'all'
Requires-Dist: influxdb>=5.3.0; extra == 'all'
Requires-Dist: ipywidgets>=8.1.0; extra == 'all'
Requires-Dist: jinja2>=3.1.0; extra == 'all'
Requires-Dist: mkdocs-material>=9.7.6; extra == 'all'
Requires-Dist: mkdocstrings[python]>=1.0.4; extra == 'all'
Requires-Dist: mypy>=1.10; extra == 'all'
Requires-Dist: pandas-stubs>=2.2; extra == 'all'
Requires-Dist: pi-web-sdk>=0.1.34; extra == 'all'
Requires-Dist: plotly>=6.7.0; extra == 'all'
Requires-Dist: pydantic-settings>=2.9.0; extra == 'all'
Requires-Dist: pymdown-extensions>=10.0; extra == 'all'
Requires-Dist: pytest-asyncio>=0.24.0; extra == 'all'
Requires-Dist: pytest-cov>=5.0; extra == 'all'
Requires-Dist: pytest>=8.0; extra == 'all'
Requires-Dist: python-multipart>=0.0.9; extra == 'all'
Requires-Dist: pyyaml>=6.0; extra == 'all'
Requires-Dist: ruff>=0.4.0; extra == 'all'
Requires-Dist: slowapi>=0.1.9; extra == 'all'
Requires-Dist: sqlalchemy[asyncio]>=2.0.49; extra == 'all'
Requires-Dist: types-openpyxl>=3.1; extra == 'all'
Requires-Dist: types-python-dateutil>=2.9; extra == 'all'
Requires-Dist: types-pyyaml>=6.0; extra == 'all'
Requires-Dist: uvicorn[standard]>=0.45.0; extra == 'all'
Provides-Extra: cloud
Requires-Dist: alembic>=1.18.4; extra == 'cloud'
Requires-Dist: asyncpg>=0.31.0; extra == 'cloud'
Requires-Dist: boto3>=1.42.0; extra == 'cloud'
Requires-Dist: fastapi>=0.136.0; extra == 'cloud'
Requires-Dist: httpx>=0.28.0; extra == 'cloud'
Requires-Dist: jinja2>=3.1.0; extra == 'cloud'
Requires-Dist: pydantic-settings>=2.9.0; extra == 'cloud'
Requires-Dist: python-multipart>=0.0.9; extra == 'cloud'
Requires-Dist: slowapi>=0.1.9; extra == 'cloud'
Requires-Dist: sqlalchemy[asyncio]>=2.0.49; extra == 'cloud'
Requires-Dist: uvicorn[standard]>=0.45.0; extra == 'cloud'
Provides-Extra: connectors
Requires-Dist: authlib>=1.0; extra == 'connectors'
Requires-Dist: httpx>=0.28.0; extra == 'connectors'
Requires-Dist: influxdb-client>=1.50.0; extra == 'connectors'
Requires-Dist: influxdb>=5.3.0; extra == 'connectors'
Requires-Dist: pyyaml>=6.0; extra == 'connectors'
Provides-Extra: dev
Requires-Dist: aiosqlite>=0.20.0; extra == 'dev'
Requires-Dist: hypothesis>=6.100; extra == 'dev'
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pandas-stubs>=2.2; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.24.0; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.4.0; extra == 'dev'
Requires-Dist: types-openpyxl>=3.1; extra == 'dev'
Requires-Dist: types-python-dateutil>=2.9; extra == 'dev'
Requires-Dist: types-pyyaml>=6.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.7.6; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=1.0.4; extra == 'docs'
Requires-Dist: pymdown-extensions>=10.0; extra == 'docs'
Provides-Extra: osisoft
Requires-Dist: pi-web-sdk>=0.1.34; extra == 'osisoft'
Provides-Extra: viz
Requires-Dist: anywidget>=0.9.0; extra == 'viz'
Requires-Dist: ipywidgets>=8.1.0; extra == 'viz'
Requires-Dist: plotly>=6.7.0; extra == 'viz'
Description-Content-Type: text/markdown

# SporeDB

[![CI](https://github.com/spore-db/SporeDB/actions/workflows/ci.yml/badge.svg)](https://github.com/spore-db/SporeDB/actions/workflows/ci.yml)
[![PyPI version](https://img.shields.io/pypi/v/sporedb)](https://pypi.org/project/sporedb/)
[![Python versions](https://img.shields.io/pypi/pyversions/sporedb)](https://pypi.org/project/sporedb/)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)

Bioprocess-native time-series database for fermentation scientists, cell culture engineers, and biologics developers.

Built on Apache Arrow, Parquet, and DuckDB, SporeDB provides first-class primitives for batch management, automatic phase detection, cross-run alignment, and audit trails designed for FDA 21 CFR Part 11 compliance.

## Features

- **Batch Management** -- create, track, and compare fermentation runs with full metadata
- **Automatic Phase Detection** -- PELT + BOCPD algorithms identify lag, exponential, stationary, and decline phases
- **Cross-Run Alignment** -- `align(runs, by='phase')` for multi-batch comparison
- **FDA 21 CFR Part 11 Compliance** -- cryptographic audit trails, electronic signatures, access controls
- **PromQL-style Query Language** -- a domain-specific language (DSL) compiled to DuckDB SQL
- **Columnar Storage** -- Apache Arrow + Parquet + DuckDB for fast analytical queries
- **Industrial Connectors** -- import/export for InfluxDB and the OSIsoft PI Web API
- **Interactive Visualization** -- Plotly-based charts for Jupyter notebooks
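
To illustrate what phase-based alignment means, here is a minimal sketch in plain Python. It is not SporeDB's `align()` implementation: the `Phase` record, the `align_by_phase` helper, and the sample values are all hypothetical. The idea is to express each sample as a fraction of the way through its phase, so runs whose phases differ in length can be compared point for point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """Hypothetical phase record: a label plus start/end times in hours."""
    name: str
    start_h: float
    end_h: float


def align_by_phase(samples, phases):
    """Map (time_h, value) samples to (phase, fraction, value) tuples.

    `fraction` is the position inside the phase, from 0.0 (phase start)
    towards 1.0 (phase end), which makes runs of different lengths comparable.
    """
    aligned = []
    for time_h, value in samples:
        for phase in phases:
            if phase.start_h <= time_h < phase.end_h:
                span = phase.end_h - phase.start_h
                fraction = (time_h - phase.start_h) / span
                aligned.append((phase.name, round(fraction, 3), value))
                break  # phases are assumed not to overlap
    return aligned


# Two made-up runs whose exponential phases differ in length (12 h vs 24 h).
run_a_phases = [Phase("lag", 0, 6), Phase("exponential", 6, 18)]
run_b_phases = [Phase("lag", 0, 4), Phase("exponential", 4, 28)]
run_a = [(3.0, 0.11), (12.0, 1.9)]  # (hours since inoculation, made-up reading)
run_b = [(2.0, 0.10), (16.0, 2.1)]

aligned_a = align_by_phase(run_a, run_a_phases)
aligned_b = align_by_phase(run_b, run_b_phases)
print(aligned_a)  # [('lag', 0.5, 0.11), ('exponential', 0.5, 1.9)]
print(aligned_b)  # [('lag', 0.5, 0.1), ('exponential', 0.5, 2.1)]
```

Both runs land on the same `(phase, fraction)` coordinates even though their wall-clock timestamps differ, which is what makes a side-by-side comparison meaningful.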

## Installation

```bash
pip install sporedb
```

### Optional extras

```bash
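# Quote the requirement so shells such as zsh do not expand the brackets
pip install "sporedb[cloud]"       # FastAPI server, SQLAlchemy, Alembic
pip install "sporedb[viz]"         # Plotly interactive visualizations
pip install "sporedb[connectors]"  # InfluxDB, PI Web API connectors
pip install "sporedb[osisoft]"     # pi-web-sdk client for OSIsoft PI
pip install "sporedb[all]"         # Everything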
```
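
If your code relies on an optional feature, it can help to check whether the extra's packages are importable before using them. The sketch below uses only the standard library. The mapping from extras to import names is an illustrative subset based on the package's declared dependencies, and `missing_modules` is a hypothetical helper, not part of SporeDB.

```python
from importlib.util import find_spec

# Illustrative subset: extra name -> top-level modules it is expected to provide.
EXTRA_MODULES = {
    "viz": ["plotly", "ipywidgets", "anywidget"],
    "cloud": ["fastapi", "sqlalchemy", "alembic"],
    "connectors": ["influxdb_client", "httpx", "yaml"],
}


def missing_modules(extra):
    """Return the modules of `extra` that cannot be imported right now."""
    return [name for name in EXTRA_MODULES[extra] if find_spec(name) is None]


for extra in EXTRA_MODULES:
    missing = missing_modules(extra)
    status = "ok" if not missing else f"missing {', '.join(missing)}"
    print(f"sporedb[{extra}]: {status}")
```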

## Quick Start

```python
from sporedb import SporeDB

# Connect to a local SporeDB instance
with SporeDB("./my_data") as db:
    # Create a batch
    batch = db.create_batch("CHO-Run-001", strain="CHO-K1")
    print(batch)

    # Import telemetry from CSV
    result = db.import_csv("telemetry.csv", "CHO-Run-001")
    print(f"Imported {result.rows_imported} rows in {result.elapsed_seconds:.2f}s")

    # Detect phases automatically
    phases = db.detect_phases(result.batch_id)
    print(f"Detected {len(phases)} phases:")
    for phase in phases:
        print(f"  {phase.phase_type.value}: {phase.start_ts} - {phase.end_ts}")

    # Retrieve telemetry as a Pandas DataFrame
    df = db.get_telemetry(result.batch_id)
    print(df.head())
```

Expected output (DataFrame preview omitted):

```
Batch CHO-Run-001 created (id=019...)
Imported 2847 rows in 0.42s
Detected 4 phases:
  lag: 2024-01-01T00:00 - 2024-01-01T06:00
  exponential: 2024-01-01T06:00 - 2024-01-02T12:00
  stationary: 2024-01-02T12:00 - 2024-01-03T18:00
  decline: 2024-01-03T18:00 - 2024-01-04T00:00
```
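
The phase boundaries shown above come from changepoint detection. As a rough illustration of the PELT idea (not SporeDB's actual code, which the Architecture section attributes to `ruptures`), the following pure-Python sketch finds mean shifts in a made-up growth-rate series. The least-squares cost, the `penalty` value, and the function name `pelt_mean_shift` are assumptions chosen for the example; mapping changepoints to named phases is out of scope here.

```python
def pelt_mean_shift(signal, penalty):
    """Return changepoint indices for a roughly piecewise-constant signal.

    A segment's cost is its sum of squared deviations from the segment mean,
    and every additional changepoint costs `penalty`.
    """
    n = len(signal)
    # Prefix sums of x and x^2 give O(1) segment costs.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(signal):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def cost(a, b):
        """Least-squares cost of signal[a:b], with b > a."""
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)  # best[t]: minimal penalised cost of signal[:t]
    best[0] = -penalty
    last = [0] * (n + 1)    # last[t]: start index of the final segment
    candidates = [0]
    for t in range(1, n + 1):
        scored = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], last[t] = min(scored)
        # PELT pruning: drop start points that can no longer be optimal.
        candidates = [s for value, s in scored if value - penalty <= best[t]]
        candidates.append(t)

    # Walk back through `last` to recover the changepoints.
    changepoints = []
    t = n
    while last[t] > 0:
        t = last[t]
        changepoints.append(t)
    return sorted(changepoints)


# Made-up specific growth rates: slow, fast, then slow again (5 samples each).
rates = [0.1] * 5 + [0.9] * 5 + [0.2] * 5
print(pelt_mean_shift(rates, penalty=0.5))  # [5, 10]
```

A larger `penalty` yields fewer changepoints; real telemetry is noisy, so the penalty usually needs tuning per signal.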

## Architecture

SporeDB is a **library-first** database that embeds directly in your Python process:

- **Storage**: Apache Arrow in-memory + Parquet on-disk + DuckDB for SQL analytics
- **Phase Detection**: PELT (offline, via `ruptures`) and BOCPD (online) changepoint algorithms
- **Query Language**: PromQL-style DSL parsed by Lark, compiled to DuckDB SQL
- **Compliance**: SHA-256 hash chains + Ed25519 signatures for tamper-evident audit trails
- **Cloud Tier**: Optional FastAPI server with PostgreSQL metadata + S3-compatible object storage
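
To show why a hash chain makes an audit trail tamper-evident, here is a minimal standard-library sketch. It is an assumption-laden illustration, not SporeDB's audit format: the entry layout, the `GENESIS_HASH` placeholder, and the helper names are hypothetical, and the Ed25519 signing step is left out entirely.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain, payload):
    """Append a payload, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    })


def verify_chain(chain):
    """Return the index of the first broken entry, or None if intact."""
    prev_hash = GENESIS_HASH
    for index, entry in enumerate(chain):
        expected = entry_hash(prev_hash, entry["payload"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return index
        prev_hash = entry["hash"]
    return None


audit_log = []
append_entry(audit_log, {"user": "alice", "action": "create_batch", "batch": "CHO-Run-001"})
append_entry(audit_log, {"user": "alice", "action": "import_csv", "rows": 2847})
append_entry(audit_log, {"user": "bob", "action": "sign", "meaning": "reviewed"})

print(verify_chain(audit_log))  # None: the chain is intact

audit_log[1]["payload"]["rows"] = 9999  # tamper with a stored record
print(verify_chain(audit_log))  # 1: first entry whose hash no longer matches
```

Because each hash covers the previous one, altering any record invalidates that entry and every entry after it, unless the whole tail is recomputed. Signing the hashes closes that gap by making recomputation detectable.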

## Deploy

Deploy SporeDB on your platform of choice:

| Platform | Deploy | Time |
|----------|--------|------|
| **Railway** | [![Deploy on Railway](https://railway.com/button.svg)](https://railway.com/template/SporeDB) | ~3 min |
| **Render** | [![Deploy to Render](https://render.com/images/deploy-to-render-button.svg)](https://render.com/deploy?repo=https://github.com/spore-db/SporeDB) | ~5 min |
| **Fly.io** | [Deploy Guide](docs/deploy/fly-io.md) | ~5 min |
| **AWS** | [ECS/Fargate Guide](docs/deploy/aws.md) | ~20 min |
| **DigitalOcean** | [App Platform Guide](docs/deploy/digitalocean.md) | ~10 min |
| **Self-Hosted** | [Docker Compose Guide](docs/deployment/selfhosted.md) | ~5 min |

**Quick start with Docker Compose:**

```bash
git clone https://github.com/spore-db/SporeDB.git
cd SporeDB && make generate-keys && make build && make up
```

See [all deployment guides](docs/deploy/index.md) for detailed instructions, cost estimates, and recommended sizing.

## Contributing

We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, testing, and PR guidelines.

## License

Apache-2.0 -- see [LICENSE](LICENSE) for details.
