Metadata-Version: 2.4
Name: cogflow
Version: 2.0.1b4
Summary: COGFlow — modular machine learning workflow management system
Author-email: Sai Kireeti <sai.kireeti@hiro-microdatacenters.nl>
License-Expression: MIT
Project-URL: Homepage, https://github.com/HIRO-MicroDataCenters-BV/cogflow
Project-URL: Issues, https://github.com/HIRO-MicroDataCenters-BV/cogflow/issues
Project-URL: Documentation, https://hiro-microdatacenters-bv.github.io/cogflow
Keywords: ml,workflow,kfp,federated-learning,cogflow,mlops
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE.md
License-File: NOTICE
License-File: kfp/LICENSE
Requires-Dist: mlflow==2.22.0
Requires-Dist: kfp-pipeline-spec<0.2.0,>=0.1.16
Requires-Dist: kfp-server-api<2.0.0,>=1.1.2
Requires-Dist: absl-py<2,>=0.9
Requires-Dist: click<9,>=7.1.2
Requires-Dist: cloudpickle<3,>=2.0.0
Requires-Dist: Deprecated<2,>=1.2.7
Requires-Dist: docstring-parser<1,>=0.7.3
Requires-Dist: fire<1,>=0.3.1
Requires-Dist: google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5
Requires-Dist: google-api-python-client<2,>=1.7.8
Requires-Dist: google-auth<3,>=1.6.1
Requires-Dist: google-cloud-storage<3,>=2.2.1
Requires-Dist: jsonschema<5,>=3.0.1
Requires-Dist: protobuf<4,>=3.13.0
Requires-Dist: requests-toolbelt<1,>=0.8.0
Requires-Dist: strip-hints<1,>=0.1.8
Requires-Dist: tabulate<1,>=0.8.6
Requires-Dist: typer<1.0,>=0.3.2
Requires-Dist: uritemplate<4,>=3.0.1
Requires-Dist: urllib3<3.0.0
Requires-Dist: pyyaml<7,>=6.0.2
Requires-Dist: pydantic<3.0.0,>=2.0.0
Requires-Dist: pydantic-settings>=2.0.0
Requires-Dist: scikit-learn==1.7.0
Requires-Dist: tenacity
Requires-Dist: kubernetes
Requires-Dist: kubernetes_asyncio
Requires-Dist: minio>=7.2.0
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: pytest-mock>=3.12.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: coverage; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Provides-Extra: docs
Requires-Dist: mkdocs>=1.6.0; extra == "docs"
Requires-Dist: mkdocs-material>=9.4.0; extra == "docs"
Requires-Dist: pdoc3>=0.11.0; extra == "docs"
Dynamic: license-file

# CogFlow

**CogFlow** is a modular, SDK-first machine learning workflow management system
built on **Kubeflow Pipelines**, **MLflow**, **Kubernetes**, and **MinIO**.

It provides a clean Python API for:
- building production-grade ML pipelines
- managing datasets and components
- orchestrating federated learning workflows
- enforcing consistent error handling and validation

CogFlow is designed for **real infrastructure**, not just notebooks.

---

## Why CogFlow?

Modern ML platforms are powerful but fragmented:

- Kubeflow Pipelines → great orchestration, weak ergonomics
- MLflow → experiment tracking, limited workflow control
- Kubernetes → powerful, but verbose and error-prone
- Federated learning → no standard orchestration layer

**CogFlow bridges these gaps** by providing:

- a stable Python SDK
- safe lazy-loading of heavy dependencies
- unified error handling
- infrastructure-aware abstractions
- zero circular imports

---

## Core Features

### 🧩 Pipeline Orchestration
- Lazy-loaded Kubeflow Pipelines client
- Safe pipeline compilation and submission
- Runtime environment injection
- Kubernetes service lifecycle management
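
As a rough illustration of the lazy-loading idea, the sketch below defers a heavy import until a client is first requested and then caches the result. `LazyClientHolder` is a hypothetical name used only for this example and is not part of the CogFlow API; the standard-library `decimal` module stands in for a heavy SDK so the snippet runs anywhere.

```python
import importlib
from functools import cached_property


class LazyClientHolder:
    """Hypothetical helper: imports a module only when the client is first used."""

    def __init__(self, module_name, factory_name, **kwargs):
        self._module_name = module_name
        self._factory_name = factory_name
        self._kwargs = kwargs

    @cached_property
    def client(self):
        # The import runs here, on first access, not when the package is imported.
        module = importlib.import_module(self._module_name)
        factory = getattr(module, self._factory_name)
        return factory(**self._kwargs)


# Stand-in for a heavy dependency: decimal.Context plays the role of the client.
holder = LazyClientHolder("decimal", "Context", prec=6)
ctx = holder.client          # first access triggers the import and construction
same_ctx = holder.client     # later accesses reuse the cached object
```

With Kubeflow Pipelines installed, the same pattern could presumably wrap `kfp.Client`, for example `LazyClientHolder("kfp", "Client", host=...)`, so that importing the SDK stays cheap until a pipeline is actually submitted.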

### 📦 Component Management
- YAML-based component registry
- MinIO-backed component storage
- Automatic component registration
- Runtime-safe component loading

### 📊 Dataset Management
- Dataset registration and metadata retrieval
- Secure dataset downloads
- Silent deletes with strict error semantics
- Pluggable storage backends

### 🤝 Federated Learning
- Auto-generated FL pipelines
- Dynamic pipeline signatures
- Connector-based and dataspace-based workflows
- Region-aware scheduling with node selectors
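
One way to picture a dynamic pipeline signature is a function whose parameters are generated at runtime from a dictionary of defaults. The sketch below uses only the standard-library `inspect` module; `make_pipeline_func` and the parameter names `rounds` and `region` are assumptions made for illustration, not CogFlow's actual implementation.

```python
import inspect


def make_pipeline_func(param_defaults):
    """Hypothetical helper: build a function whose signature comes from a dict."""
    parameters = [
        inspect.Parameter(name, inspect.Parameter.KEYWORD_ONLY, default=default)
        for name, default in param_defaults.items()
    ]
    signature = inspect.Signature(parameters)

    def pipeline(*args, **kwargs):
        # Bind against the generated signature so unknown names raise TypeError.
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        return dict(bound.arguments)

    # inspect.signature() reports this attribute instead of (*args, **kwargs).
    pipeline.__signature__ = signature
    return pipeline


# Illustrative parameters only; a real FL pipeline would define its own.
fl_pipeline = make_pipeline_func({"rounds": 3, "region": "eu-west"})
resolved = fl_pipeline(rounds=5)
```

Here `resolved` holds the caller's override for `rounds` together with the default for `region`, and a call such as `fl_pipeline(unknown=1)` fails with `TypeError` instead of being silently accepted.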

### 🧠 Unified Error Handling
- Strongly typed error hierarchy
- Context-aware exception wrapping
- API-ready error serialization
- Zero silent failures
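
The points above can be sketched with a small exception hierarchy that carries context and serializes to a dictionary. The class names `WorkflowError` and `DatasetError`, the `to_dict` method, and the `load_dataset` helper are hypothetical and chosen for this example; the real classes in `utils/exceptions.py` may look different.

```python
class WorkflowError(Exception):
    """Hypothetical base class for all errors raised by the SDK."""

    code = "workflow_error"

    def __init__(self, message, **context):
        super().__init__(message)
        self.message = message
        self.context = context  # extra key/value details about the failure

    def to_dict(self):
        # Produce a JSON-friendly payload that an API layer could return.
        cause = self.__cause__
        return {
            "code": self.code,
            "message": self.message,
            "context": self.context,
            "cause": repr(cause) if cause is not None else None,
        }


class DatasetError(WorkflowError):
    code = "dataset_error"


def load_dataset(path):
    """Hypothetical loader that wraps low-level errors with context."""
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except OSError as exc:
        # "from exc" keeps the original exception attached as __cause__.
        raise DatasetError("could not read dataset", path=path) from exc


try:
    load_dataset("/nonexistent/dataset.csv")
except WorkflowError as err:
    payload = err.to_dict()
```

Catching the base class is enough to handle every subclass, while `payload` keeps the specific error code, the path that failed, and a description of the underlying `OSError`.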

---

## Architecture Overview

CogFlow follows a **layered SDK architecture**:

    cogflow/
    ├── core/
    │   ├── pipelines/    # Kubeflow orchestration & FL pipelines
    │   ├── datasets/     # Dataset lifecycle management
    │   ├── components/   # Component registry & YAML handling
    │   └── models/       # (future extension)
    │
    ├── utils/
    │   ├── common.py     # UUIDs, paths, Kubernetes helpers
    │   ├── network.py    # HTTP utilities with retry & streaming
    │   ├── storage.py    # MinIO client abstraction
    │   └── exceptions.py # Unified error framework
    │
    ├── config.py
    └── api.py
