Metadata-Version: 2.4
Name: flowyml
Version: 1.10.0
Summary: Next-Generation ML Pipeline Framework
License: Apache-2.0
License-File: LICENSE
Author: flowyml Team
Author-email: support@unicolab.ai
Requires-Python: >=3.10,<4.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Provides-Extra: all
Provides-Extra: aws
Provides-Extra: azure
Provides-Extra: gcp
Provides-Extra: genai
Provides-Extra: langchain
Provides-Extra: langgraph
Provides-Extra: openai
Provides-Extra: pytorch
Provides-Extra: rich
Provides-Extra: sklearn
Provides-Extra: tensorflow
Provides-Extra: ui
Requires-Dist: alembic (>=1.13.0)
Requires-Dist: boto3 (>=1.28) ; extra == "aws" or extra == "all"
Requires-Dist: click (>=8.0.0)
Requires-Dist: cloudpickle (>=2.0.0)
Requires-Dist: croniter (>=2.0.1,<3.0.0)
Requires-Dist: fastapi (>=0.122.0) ; extra == "ui" or extra == "all"
Requires-Dist: google-cloud-aiplatform (>=1.35.0) ; extra == "gcp" or extra == "all"
Requires-Dist: google-cloud-storage (>=2.10.0) ; extra == "gcp" or extra == "all"
Requires-Dist: httpx (>=0.24,<0.28)
Requires-Dist: kfp (>=2.0) ; extra == "gcp"
Requires-Dist: kfp-server-api (>=2.0.0) ; extra == "gcp"
Requires-Dist: langchain-core (>=0.2.0) ; extra == "langchain" or extra == "langgraph" or extra == "genai" or extra == "all"
Requires-Dist: langgraph (>=0.2.0) ; extra == "langgraph" or extra == "genai" or extra == "all"
Requires-Dist: loguru (>=0.7.3,<0.8.0)
Requires-Dist: numpy (>=1.20.0)
Requires-Dist: openai (>=1.0.0) ; extra == "openai" or extra == "genai" or extra == "all"
Requires-Dist: opentelemetry-api (>=1.39.1,<2.0.0)
Requires-Dist: opentelemetry-exporter-prometheus (>=0.60b1,<0.61)
Requires-Dist: opentelemetry-instrumentation-fastapi (>=0.60b1,<0.61)
Requires-Dist: opentelemetry-sdk (>=1.39.1,<2.0.0)
Requires-Dist: pandas (>=1.3.0)
Requires-Dist: psutil (>=7.2.2,<8.0.0)
Requires-Dist: psycopg2-binary (>=2.9.0)
Requires-Dist: pydantic (>=2.0.0)
Requires-Dist: python-multipart (>=0.0.9) ; extra == "ui" or extra == "all"
Requires-Dist: pytz (>=2024.1,<2025.0)
Requires-Dist: pyyaml (>=6.0)
Requires-Dist: requests (>=2.28.0)
Requires-Dist: rich (>=13.0.0) ; extra == "rich"
Requires-Dist: rich-click (>=1.9.7,<2.0.0)
Requires-Dist: scikit-learn (>=1.0.0) ; extra == "sklearn" or extra == "all"
Requires-Dist: sqlalchemy (>=2.0.0)
Requires-Dist: tensorflow (>=2.12.0) ; extra == "tensorflow" or extra == "all"
Requires-Dist: textual (>=8.1.1,<9.0.0)
Requires-Dist: toml (>=0.10.2)
Requires-Dist: torch (>=2.0.0) ; extra == "pytorch" or extra == "all"
Requires-Dist: uvicorn[standard] (>=0.23.0) ; extra == "ui" or extra == "all"
Requires-Dist: websockets (>=11.0) ; extra == "ui" or extra == "all"
Project-URL: Documentation, https://unicolab.github.io/FlowyML/latest
Project-URL: Homepage, https://github.com/UnicoLab/FlowyML
Project-URL: Repository, https://github.com/UnicoLab/FlowyML
Description-Content-Type: text/markdown

# 🌊 flowyml

<p align="center">
  <img src="docs/logo.png" width="350" alt="flowyml Logo"/>
  <br>
  <em>The Enterprise-Grade ML Pipeline Framework for Humans</em>
  <br>
  <br>
  <p align="center">
    <a href="https://github.com/UnicoLab/FlowyML/actions"><img src="https://img.shields.io/github/actions/workflow/status/UnicoLab/FlowyML/ci.yml?branch=main" alt="CI Status"></a>
    <a href="https://pypi.org/project/flowyml/"><img src="https://img.shields.io/pypi/v/flowyml" alt="PyPI Version"></a>
    <a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.10+-blue.svg" alt="Python 3.10+"></a>
    <a href="LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue.svg" alt="License"></a>
    <a href="https://unicolab.ai"><img src="https://img.shields.io/badge/UnicoLab-ai-red.svg" alt="UnicoLab"></a>
  </p>
</p>

---

**FlowyML** is a lightweight yet powerful ML pipeline orchestration framework. It bridges the gap between rapid experimentation and enterprise production by making assets first-class citizens. Write pipelines in pure Python, and scale them to production without changing a single line of code.

## 🚀 Why FlowyML?

| Feature | FlowyML | Traditional Orchestrators |
|---------|---------|---------------------------|
| **Developer Experience** | 🐍 **Native Python** - No DSLs, no YAML hell. | 📜 Complex YAML or rigid DSLs. |
| **Type-Based Routing** | 🧠 **Auto-Routing** - Define WHAT, we handle WHERE. | 🔌 Manual wiring to cloud buckets. |
| **Smart Caching** | ⚡ **Multi-Level** - Smart content-hashing skips re-runs. | 🐢 Basic file-timestamp checking. |
| **Asset Management** | 📦 **First-Class Assets** - Models & Datasets with lineage. | 📁 Generic file paths only. |
| **Multi-Stack** | 🌍 **Abstract Infra** - Switch local/prod with one env var. | 🔒 Vendor lock-in or complex setup. |
| **GenAI Ready** | 🤖 **LLM Tracing** - Built-in token & cost tracking. | 🧩 Requires external tools. |
| **Build-Time Validation** | ✅ **Type Safety** - Catches mismatches at build time. | 💥 Runtime errors only. |
| **Map Tasks** | 🗺️ **Parallel Maps** - `@map_task` with retries & concurrency. | 🔁 Manual parallelism boilerplate. |
| **Dynamic Workflows** | 🔀 **Runtime DAGs** - Generate pipelines based on data. | 📐 Static definitions only. |
| **GenAI Assets** | 🎯 **Prompt & Checkpoint** - First-class prompt versioning and training resumability. | 📝 Unmanaged text files. |
| **Stack Hydration** | 🏗️ **YAML → Live Stack** - `StackConfig.to_stack()` wires infra automatically. | ⚙️ Manual component assembly. |

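The content-hash caching mentioned in the table can be pictured with a small, framework-free sketch. It is not FlowyML's implementation; `cache_key` and `run_cached` are hypothetical helpers, and the sketch assumes step inputs are JSON-serializable.

```python
import hashlib
import json

_cache = {}  # cache key -> previously computed result

def cache_key(step_name, inputs):
    """Hash the step name together with its inputs (assumed JSON-serializable)."""
    payload = json.dumps({"step": step_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def run_cached(step_name, func, **inputs):
    """Run func only if this exact (step, inputs) pair has not been seen before."""
    key = cache_key(step_name, inputs)
    if key not in _cache:
        _cache[key] = func(**inputs)
    return _cache[key]

calls = []  # records every real execution so cache hits are visible

def double(values):
    calls.append(values)
    return [v * 2 for v in values]

first = run_cached("double", double, values=[1, 2, 3])
second = run_cached("double", double, values=[1, 2, 3])  # same content: cache hit
third = run_cached("double", double, values=[4])         # new content: re-run
```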
---

## ⚡️ Quick Start

This is a complete two-step pipeline. Values stored in the context are injected into step parameters with matching names:

```python
from flowyml import Pipeline, step, context

@step(outputs=["dataset"])
def load_data(batch_size: int = 32):
    return list(range(batch_size))

@step(inputs=["dataset"], outputs=["model"])
def train_model(dataset, learning_rate: float = 0.01):
    print(f"Training on {len(dataset)} items with lr={learning_rate}")
    return "model_v1"

# Configure and Run
ctx = context(learning_rate=0.05, batch_size=64)
pipeline = Pipeline("quickstart", context=ctx)
pipeline.add_step(load_data).add_step(train_model)

pipeline.run()
```
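
One way to picture the context injection above is as a lookup of each step's parameter names in the context. The sketch below is plain Python rather than FlowyML's implementation; `inject_params` is a hypothetical helper used only for illustration.

```python
import inspect

def inject_params(func, context_values, upstream=None):
    """Build kwargs for func: upstream outputs first, then context values."""
    upstream = upstream or {}
    kwargs = {}
    for name in inspect.signature(func).parameters:
        if name in upstream:
            kwargs[name] = upstream[name]
        elif name in context_values:
            kwargs[name] = context_values[name]
        # Otherwise the parameter keeps its own default, if it has one.
    return kwargs

def load_data(batch_size: int = 32):
    return list(range(batch_size))

def train_model(dataset, learning_rate: float = 0.01):
    return f"trained on {len(dataset)} items with lr={learning_rate}"

ctx = {"learning_rate": 0.05, "batch_size": 64}
dataset = load_data(**inject_params(load_data, ctx))
summary = train_model(**inject_params(train_model, ctx, {"dataset": dataset}))
```

Here `load_data` receives only `batch_size`, while `train_model` receives the upstream `dataset` plus `learning_rate`, so neither step needs to know where its values come from.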

---

## 🌟 Key Features

### 1. 🧠 Type-Based Artifact Routing (New in 1.8.0)
Define artifact types in code, and FlowyML automatically routes them to your cloud infrastructure.
```python
@step
def train(...) -> Model:
    # Auto-saved to GCS/S3 and registered to Vertex AI / SageMaker
    return Model(obj, name="classifier")
```
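
To illustrate the idea of routing by type, the sketch below dispatches on a step's declared return annotation. It is an assumption-laden illustration, not FlowyML's routing code: the `Model` and `Dataset` dataclasses are stand-ins for the real asset classes, and `ROUTES` and `route_for` are hypothetical names.

```python
from dataclasses import dataclass
from typing import get_type_hints

@dataclass
class Model:  # stand-in for an asset class, not FlowyML's own
    obj: object
    name: str

@dataclass
class Dataset:  # stand-in for an asset class, not FlowyML's own
    obj: object
    name: str

# Hypothetical routing table: artifact type -> destination label.
ROUTES = {Model: "model-registry", Dataset: "object-storage"}

def route_for(step_func):
    """Pick a destination from the step's declared return type."""
    return_type = get_type_hints(step_func).get("return")
    return ROUTES.get(return_type, "default-store")

def train() -> Model:
    return Model(obj={"weights": [0.1, 0.2]}, name="classifier")

def helper():  # no return annotation, so it falls back to the default
    return 42

destination = route_for(train)
```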

### 2. 🌍 Multi-Stack Configuration
Manage local, staging, and production environments in a single `flowyml.yaml`.
```bash
export FLOWYML_STACK=production
python pipeline.py  # Now runs on Vertex AI with GCS storage
```
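
The environment-variable switch can be sketched as a lookup into a set of named stacks. The stack names, keys, and bucket path below are hypothetical and the real `flowyml.yaml` schema may differ; only the `FLOWYML_STACK` variable comes from the example above.

```python
import os

# Hypothetical stack definitions; the real flowyml.yaml schema may differ.
STACKS = {
    "local": {"orchestrator": "local", "artifact_store": "./artifacts"},
    "production": {"orchestrator": "vertex_ai", "artifact_store": "gs://my-bucket"},
}

def active_stack(env=None, default="local"):
    """Resolve the stack named by FLOWYML_STACK, falling back to a default."""
    env = os.environ if env is None else env
    name = env.get("FLOWYML_STACK", default)
    if name not in STACKS:
        raise KeyError(f"Unknown stack {name!r}; expected one of {sorted(STACKS)}")
    return name, STACKS[name]

# Passing a dict instead of os.environ keeps the example deterministic.
name, stack = active_stack({"FLOWYML_STACK": "production"})
```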

### 3. 🛡️ Intelligent Step Grouping
Group consecutive steps to run in the same container. This reduces per-step container overhead while keeping step boundaries clear.
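
As a rough sketch of the grouping idea (not FlowyML's scheduler), consecutive steps that share a group label can be merged into one execution unit. The step names, labels, and `plan_containers` helper are hypothetical.

```python
from itertools import groupby

# (step name, group label); None means the step runs on its own.
steps = [
    ("load", "prep"),
    ("clean", "prep"),
    ("train", None),
    ("evaluate", "report"),
    ("publish", "report"),
]

def plan_containers(steps):
    """Merge runs of consecutive steps that share a non-None group label."""
    plan = []
    for label, run in groupby(steps, key=lambda s: s[1]):
        names = [name for name, _ in run]
        if label is None:
            plan.extend([n] for n in names)  # ungrouped steps stay separate
        else:
            plan.append(names)
    return plan

containers = plan_containers(steps)
```

With this input, five steps are planned as three containers, and each step still appears by name inside its group.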

### 4. 📊 Built-in Observability
A dark-mode dashboard for monitoring pipelines, visualizing DAGs, and inspecting artifacts in real time.

### 5. 🎯 Evaluations Framework
A production-grade evaluation system with 29+ scorers covering classification, regression, and GenAI (LLM-as-a-judge), plus adapters for **DeepEval**, **RAGAS**, and **Phoenix**:
```python
from flowyml.evals import evaluate, EvalDataset, get_scorer

data = EvalDataset.create_genai("my_test", examples=[...])
result = evaluate(data=data, scorers=[get_scorer("relevance"), get_scorer("ragas.faithfulness")])
result.notify_if_regression(threshold=0.05)
```
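
One plausible reading of a regression threshold is "flag any scorer whose score dropped by more than the threshold relative to a baseline". The sketch below implements that reading in plain Python; it is an assumption about the semantics, not the behavior of `notify_if_regression`, and the scores are made-up illustration values.

```python
def find_regressions(baseline, current, threshold=0.05):
    """Return scorers whose score dropped by more than threshold, with the drop."""
    return {
        name: round(baseline[name] - score, 4)
        for name, score in current.items()
        if name in baseline and baseline[name] - score > threshold
    }

# Made-up scores for illustration only.
baseline = {"relevance": 0.90, "faithfulness": 0.80}
current = {"relevance": 0.82, "faithfulness": 0.79}

regressions = find_regressions(baseline, current)
```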

### 6. 🗺️ Map Tasks & Dynamic Workflows
Distribute work over collections with `@map_task` and generate pipelines at runtime with `@dynamic`:
```python
from flowyml import Pipeline, map_task, dynamic

@map_task(concurrency=8, retries=2, min_success_ratio=0.95)
def process_document(doc: dict) -> dict:
    return transform(doc)  # transform: your own per-document function

@dynamic(outputs=["best_model"])
def hyperparameter_search(config: dict):
    # Build a sub-pipeline at runtime, one training step per learning rate
    sub = Pipeline("hp_search")
    for lr in config["learning_rates"]:
        sub.add_step(train_with_lr(lr))  # train_with_lr: your own step factory
    return sub
```
```
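
The three `@map_task` parameters can be understood through a standard-library sketch. This is an assumed interpretation of `concurrency`, `retries`, and `min_success_ratio`, not FlowyML's implementation; `run_map` and `parse` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def run_map(func, items, concurrency=4, retries=0, min_success_ratio=1.0):
    """Apply func to every item in parallel, retrying failures per item."""
    def attempt(item):
        last_error = None
        for _ in range(retries + 1):
            try:
                return True, func(item)
            except Exception as exc:  # sketch only: real code would narrow this
                last_error = exc
        return False, last_error

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        outcomes = list(pool.map(attempt, items))  # preserves input order

    successes = [value for ok, value in outcomes if ok]
    ratio = len(successes) / len(items) if items else 1.0
    if ratio < min_success_ratio:
        raise RuntimeError(f"only {ratio:.0%} of items succeeded")
    return successes

def parse(doc):
    if "text" not in doc:
        raise ValueError("missing text")
    return {"length": len(doc["text"])}

docs = [{"text": "a"}, {"text": "bb"}, {}, {"text": "dddd"}]
results = run_map(parse, docs, concurrency=2, retries=1, min_success_ratio=0.75)
```

Three of the four documents succeed, which meets the 0.75 ratio, so the call returns the successful results instead of failing the whole map.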

### 7. 📦 Artifact Catalog with Lineage
Centralized artifact discovery, tagging, and lineage tracking that works both locally and remotely:
```python
from flowyml import ArtifactCatalog

catalog = ArtifactCatalog()  # Auto-selects local SQLite or remote API
# dataset_id and model_id are placeholders for IDs of artifacts you have already registered
catalog.register(name="classifier", artifact_type="Model", parent_ids=[dataset_id])
lineage = catalog.get_lineage(model_id)  # Full parent→child graph
```
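
Lineage built from `parent_ids` amounts to a graph walk over parent links. The sketch below shows that walk on hypothetical in-memory records; it does not use or describe the `ArtifactCatalog` storage format.

```python
from collections import deque

# Hypothetical in-memory records: artifact id -> list of parent ids.
parents = {
    "raw_data": [],
    "clean_data": ["raw_data"],
    "features": ["clean_data"],
    "classifier": ["features", "clean_data"],
}

def ancestors(artifact_id, parents):
    """Return every upstream artifact id, nearest first, without duplicates."""
    seen, order = set(), []
    queue = deque(parents.get(artifact_id, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        order.append(current)
        queue.extend(parents.get(current, []))
    return order

upstream = ancestors("classifier", parents)
```

The breadth-first order lists direct parents before more distant ancestors, and the `seen` set keeps shared ancestors from appearing twice.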

---

## 📦 Installation

```bash
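# Install core
pip install flowyml

# Install with the bundled optional integrations (recommended)
pip install "flowyml[all]"

# Or install only the extras you need, for example GCP and the UI
pip install "flowyml[gcp,ui]"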
```

## 📚 Documentation

Visit [FlowyML Docs](https://unicolab.github.io/FlowyML/latest/) for:
- **[Getting Started](https://unicolab.github.io/FlowyML/latest/getting-started)**
- **[Core Concepts](https://unicolab.github.io/FlowyML/latest/core/pipelines)**
- **[Type-Based Routing](https://unicolab.github.io/FlowyML/latest/plugins/type_routing)**
- **[API Reference](https://unicolab.github.io/FlowyML/latest/api/core)**

---
<p align="center">
  <strong>Built with ❤️ by <a href="https://unicolab.ai">UnicoLab</a></strong>
</p>

