Metadata-Version: 2.2
Name: isage-flow
Version: 0.1.0
Summary: Vector-native stream processing engine for incremental semantic state snapshots
Keywords: stream processing,semantic search,vector database,incremental indexing
Author-Email: IntelliStream Team <shuhao_zhang@hust.edu.cn>
License: Apache-2.0
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Database
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: C++
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Project-URL: Homepage, https://github.com/intellistream/sageFlow
Project-URL: Documentation, https://github.com/intellistream/sageFlow#readme
Project-URL: Repository, https://github.com/intellistream/sageFlow
Project-URL: Issues, https://github.com/intellistream/sageFlow/issues
Requires-Python: >=3.10
Requires-Dist: numpy>=1.19.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: isort; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Provides-Extra: publish
Requires-Dist: sage-pypi-publisher==0.1.4; extra == "publish"
Description-Content-Type: text/markdown

## sageFlow

`sageFlow` is a vector-native stream processing engine that maintains and materializes semantic state snapshots for real-time, LLM-based generation tasks. It offers a declarative API for composing stateful vector operations within temporal windows, so the semantic context of a dynamically changing dataset can be updated quickly and efficiently.

## Features

-   **Vector-Native Stream Processing**: At its core, sageFlow is built to handle high-dimensional vector streams efficiently.
-   **Declarative API**: Easily compose complex, stateful vector operations such as `TopK`, `Filter`, and `Join` within defined temporal windows.
-   **Incremental Low-Latency Updates**: Optimized for incremental computations, ensuring semantic states are updated with minimal delay.
-   **Optimized Three-Phase Pipeline**: Abstracts stream processing into three distinct phases—ingestion, state materialization, and snapshot exposure—unlocking significant optimization opportunities.
-   **Stateful and Windowed Operations**: Natively supports windowing to create time-bound semantic snapshots from continuous data streams.
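
To make the windowed `TopK` idea and the three phases above concrete, here is a minimal sketch in plain NumPy. It is not sageFlow's API: the class `WindowedTopK` and its methods `ingest` and `snapshot` are hypothetical names chosen for illustration, and the sketch assumes records arrive with non-decreasing timestamps and are ranked by cosine similarity.

```python
from collections import deque

import numpy as np


class WindowedTopK:
    """Toy sliding-window top-k over a vector stream (hypothetical, not sageFlow's API)."""

    def __init__(self, window_seconds: float, k: int) -> None:
        self.window_seconds = window_seconds
        self.k = k
        self._items = deque()  # entries: (timestamp, item_id, unit-length vector)

    def ingest(self, timestamp: float, item_id: str, vector) -> None:
        # Phase 1 (ingestion): normalize the incoming vector and append it.
        v = np.asarray(vector, dtype=np.float64)
        norm = np.linalg.norm(v)
        if norm == 0.0:
            raise ValueError("zero vector cannot be normalized")
        self._items.append((timestamp, item_id, v / norm))
        # Phase 2 (state materialization): evict records that left the window.
        # Assumes timestamps are non-decreasing, so the oldest record is at the left.
        cutoff = timestamp - self.window_seconds
        while self._items and self._items[0][0] <= cutoff:
            self._items.popleft()

    def snapshot(self, query) -> list:
        # Phase 3 (snapshot exposure): return the k most similar (id, score) pairs.
        if not self._items:
            return []
        q = np.asarray(query, dtype=np.float64)
        q = q / np.linalg.norm(q)
        ids = [item_id for _, item_id, _ in self._items]
        matrix = np.stack([v for _, _, v in self._items])
        scores = matrix @ q  # cosine similarity, since all vectors are unit length
        order = np.argsort(-scores, kind="stable")[: self.k]
        return [(ids[i], float(scores[i])) for i in order]


# Usage: a 10-second window; record "a" has expired by the time "c" arrives.
state = WindowedTopK(window_seconds=10.0, k=2)
state.ingest(0.0, "a", [1.0, 0.0])
state.ingest(5.0, "b", [0.0, 1.0])
state.ingest(12.0, "c", [1.0, 1.0])
top = state.snapshot([1.0, 0.0])
```

Each update touches only the newly arrived record and the expired ones, which is the incremental behavior the feature list describes. This sketch still rescans the whole window on every snapshot, which an engine with an incremental index would presumably avoid.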

## Key Use Cases

-   **Real-time LLM Generation**: Provide large language models with fresh, stateful context snapshots for more accurate and timely responses.
-   **Dynamic Context Maintenance**: Ideal for conversational AI or interactive applications where the context evolves rapidly over time.
-   **Streaming Data Analytics**: Serve high-velocity data analysis use cases that require complex, stateful semantic queries on vector data.
-   **Adaptive Recommendation Systems**: Build systems that can update recommendations in real-time based on the most recent user interactions and streaming events.

## Setup

To set up `sageFlow` and its dependencies, first make sure that you have `docker` installed, or that you are running a **Linux** distribution that provides `apt`, such as `Ubuntu` or `Debian`.

We suggest starting with `docker` until you are familiar with `sageFlow`.

### Quick Installation (Ubuntu/Debian)

For a quick one-click installation of all dependencies including DiskANN support, run:

```bash
cd <PATH_TO_REPO>
sudo ./scripts/install-deps.sh
```

This script will install:
- Build essentials (gcc, g++, cmake, etc.)
- DiskANN dependencies (libaio, boost, etc.)
- Intel MKL (Math Kernel Library)
- Environment configuration

After installation, reload your environment:
```bash
source /etc/profile.d/mkl.sh
```

### Docker

Make sure `Docker` is installed and running.

#### Windows

```
cd <PATH_TO_REPO>/setup
./start_win.bat
```

#### Linux

```bash
cd <PATH_TO_REPO>/setup
./start.sh
```

### Linux with apt

Check the dependencies listed in `<PATH_TO_REPO>/setup/Dockerfile` and install them to build your environment.

## sageFlow: examples

Run the following commands to build the examples:

```bash
cmake -B build
cmake --build build -j $(nproc)
```
This will generate the examples in the `build/bin` directory.
You can run the examples with:

```bash
./build/bin/itopk
```
