Metadata-Version: 2.4
Name: swiss-ai-hub-pipeline
Version: 0.291.5
Summary: Swiss AI Hub Pipeline SDK: Dagster-based document ingestion, parsing, embedding, and vector storage for RAG.
Author: Joel Barmettler, Marius Högger, Michèle Fundneider, Thomas Mannhart, Noah Hermann
Author-email: Joel Barmettler <joel.barmettler@bbv.ch>, Marius Högger <marius.hoegger@bbv.ch>, Michèle Fundneider <michele.fundneider@bbv.ch>, Thomas Mannhart <thomas.mannhart@bbv.ch>, Noah Hermann <noah.hermann@bbv.ch>
License-Expression: Apache-2.0
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Dist: swiss-ai-hub-core==0.291.5
Requires-Dist: dagster-webserver>=1.11.2
Requires-Dist: dagster-postgres>=0.27.2
Requires-Dist: dagster-azure>=0.27.2
Requires-Dist: dagster-aws>=0.27.2
Requires-Dist: matplotlib>=3.9.2
Requires-Dist: dagster-apprise>=0.0.2
Requires-Dist: apprise>=1.9.9
Requires-Python: >=3.13, <3.14
Project-URL: Homepage, https://github.com/bbvch-ai/aihub-core
Project-URL: Repository, https://github.com/bbvch-ai/aihub-core
Project-URL: Documentation, https://bbvch-ai.github.io/aihub-core/
Project-URL: Issues, https://github.com/bbvch-ai/aihub-core/issues
Description-Content-Type: text/markdown

# Swiss AI Hub Pipeline

Data ingestion and processing SDK for the [Swiss AI Hub](https://github.com/bbvch-ai/aihub-core) platform: a
Dagster-based, asset-oriented framework that parses, chunks, and embeds documents into RAG-ready vectors.

- **Two-stage architecture** — source-specific ingestion (SharePoint, OneDrive, S3, local FS via rclone) into an
  S3-compatible data lake, then a unified parse → chunk → embed → store pipeline into Milvus and MongoDB.
- **Factory pattern** — compose pipelines from asset factories, resources, IO managers, and sensors.
- **Lineage** — every vector embedding traces back to its source document.

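The lineage idea can be illustrated without any SDK code. The sketch below is not the package's API: `Chunk`, `VectorRecord`, `chunk_document`, `toy_embed`, `embed_chunks`, and the `s3://lake/...` URI are all hypothetical names chosen for illustration, and the "embedding" is a hash-based placeholder rather than a real model. It only shows how carrying the source URI and chunk index through each stage lets a stored vector be traced back to its document.

```python
from dataclasses import dataclass
from hashlib import sha256


@dataclass(frozen=True)
class Chunk:
    source_uri: str  # hypothetical: location of the source document in the data lake
    index: int       # position of this chunk within the document
    text: str


@dataclass(frozen=True)
class VectorRecord:
    vector: tuple[float, ...]
    source_uri: str   # lineage: which document the vector came from
    chunk_index: int  # lineage: which chunk of that document


def chunk_document(source_uri: str, text: str, size: int = 40) -> list[Chunk]:
    """Split text into fixed-size character windows, tagging each with its source."""
    return [
        Chunk(source_uri, i, text[start:start + size])
        for i, start in enumerate(range(0, len(text), size))
    ]


def toy_embed(text: str, dims: int = 4) -> tuple[float, ...]:
    """Placeholder embedding (assumption): hash bytes scaled to [0, 1]."""
    digest = sha256(text.encode("utf-8")).digest()
    return tuple(b / 255 for b in digest[:dims])


def embed_chunks(chunks: list[Chunk]) -> list[VectorRecord]:
    """Embed each chunk and copy its lineage fields onto the vector record."""
    return [VectorRecord(toy_embed(c.text), c.source_uri, c.index) for c in chunks]


# A 100-character document split into 40-character windows yields three chunks.
records = embed_chunks(chunk_document("s3://lake/docs/handbook.txt", "x" * 100))
```

Each record in `records` keeps `source_uri` and `chunk_index` next to its vector, so a search hit can be mapped back to the exact passage it came from.
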
## Installation

```bash
pip install swiss-ai-hub-pipeline
```

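This also installs [`swiss-ai-hub-core`](https://pypi.org/project/swiss-ai-hub-core/), pinned to the same version as
this package, together with the Dagster webserver and the Dagster integrations for PostgreSQL, Azure, and AWS.
Python 3.13 is required.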

## Usage

```python
# Import the package's default pipeline definitions (see the documentation for configuration).
from swiss_ai_hub.pipeline import default_definitions
```

## Links

- Source & issues: https://github.com/bbvch-ai/aihub-core
- Documentation: https://bbvch-ai.github.io/aihub-core/

## License

Apache-2.0
