Metadata-Version: 2.4
Name: kailash
Version: 0.1.1
Summary: Python SDK for the Kailash container-node architecture
Home-page: https://github.com/integrum/kailash-python-sdk
Author: Integrum
Author-email: Integrum <info@integrum.com>
Project-URL: Homepage, https://github.com/integrum/kailash-python-sdk
Project-URL: Bug Tracker, https://github.com/integrum/kailash-python-sdk/issues
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: networkx>=2.7
Requires-Dist: pydantic>=1.9
Requires-Dist: matplotlib>=3.5
Requires-Dist: pyyaml>=6.0
Requires-Dist: click>=8.0
Requires-Dist: pytest>=8.3.5
Requires-Dist: mcp[cli]>=1.9.0
Requires-Dist: pandas>=2.2.3
Requires-Dist: numpy>=2.2.5
Requires-Dist: scipy>=1.15.3
Requires-Dist: scikit-learn>=1.6.1
Requires-Dist: requests>=2.32.3
Requires-Dist: pytest-cov>=6.1.1
Requires-Dist: isort>=6.0.1
Requires-Dist: aiohttp>=3.12.4
Requires-Dist: ruff>=0.11.12
Requires-Dist: msal>=1.32.3
Requires-Dist: sphinx>=8.2.3
Requires-Dist: sphinx-rtd-theme>=3.0.2
Requires-Dist: sphinx-copybutton>=0.5.2
Requires-Dist: sphinxcontrib-mermaid>=1.0.0
Requires-Dist: sphinx-autobuild>=2024.10.3
Requires-Dist: autodoc>=0.5.0
Requires-Dist: myst-parser>=4.0.1
Requires-Dist: black>=25.1.0
Requires-Dist: psutil>=7.0.0
Requires-Dist: fastapi[all]>=0.115.12
Requires-Dist: pytest-asyncio>=1.0.0
Requires-Dist: pre-commit>=4.2.0
Requires-Dist: twine>=6.1.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=3.0; extra == "dev"
Requires-Dist: black>=22.0; extra == "dev"
Requires-Dist: isort>=5.10; extra == "dev"
Requires-Dist: mypy>=0.9; extra == "dev"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# Kailash Python SDK

<p align="center">
  <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/v/kailash.svg" alt="PyPI version"></a>
  <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/pyversions/kailash.svg" alt="Python versions"></a>
  <a href="https://pypi.org/project/kailash/"><img src="https://img.shields.io/pypi/dm/kailash.svg" alt="Downloads"></a>
  <img src="https://img.shields.io/badge/license-MIT-green.svg" alt="MIT License">
  <img src="https://img.shields.io/badge/code%20style-black-000000.svg" alt="Code style: black">
  <img src="https://img.shields.io/badge/tests-544%20passing-brightgreen.svg" alt="Tests: 544 passing">
  <img src="https://img.shields.io/badge/coverage-100%25-brightgreen.svg" alt="Coverage: 100%">
</p>

<p align="center">
  <strong>A Pythonic SDK for the Kailash container-node architecture</strong>
</p>

<p align="center">
  Build workflows that seamlessly integrate with Kailash's production environment while maintaining the flexibility to prototype quickly and iterate locally.
</p>

---

## ✨ Highlights

- 🚀 **Rapid Prototyping**: Create and test workflows locally without containerization
- 🏗️ **Architecture-Aligned**: Automatically ensures compliance with Kailash standards
- 🔄 **Seamless Handoff**: Export prototypes directly to production-ready formats
- 📊 **Real-time Monitoring**: Live dashboards with WebSocket streaming and performance metrics
- 🧩 **Extensible**: Easy to create custom nodes for domain-specific operations
- ⚡ **Fast Installation**: Uses `uv` for lightning-fast Python package management

## 🎯 Who Is This For?

The Kailash Python SDK is designed for:

- **AI Business Coaches (ABCs)** who need to prototype workflows quickly
- **Data Scientists** building ML pipelines compatible with production infrastructure
- **Engineers** who want to test Kailash workflows locally before deployment
- **Teams** looking to standardize their workflow development process

## 🚀 Quick Start

### Installation

```bash
# For users: install from PyPI
pip install kailash

# For developers: install uv if you haven't already, then clone and sync
curl -LsSf https://astral.sh/uv/install.sh | sh
git clone https://github.com/integrum/kailash-python-sdk.git
cd kailash-python-sdk
uv sync
```

### Your First Workflow

```python
from kailash.workflow import Workflow
from kailash.nodes.data import CSVReader
from kailash.nodes.code import PythonCodeNode
from kailash.runtime.local import LocalRuntime
import pandas as pd

# Create a workflow
workflow = Workflow("customer_analysis", name="customer_analysis")

# Add data reader
reader = CSVReader(file_path="customers.csv")
workflow.add_node("read_customers", reader)

# Add custom processing using Python code
def analyze_customers(data):
    """Analyze customer data and compute metrics."""
    df = pd.DataFrame(data)
    # Convert total_spent to numeric
    df['total_spent'] = pd.to_numeric(df['total_spent'])
    return {
        "total_customers": len(df),
        "avg_spend": df["total_spent"].mean(),
        "top_customers": df.nlargest(10, "total_spent").to_dict("records")
    }

analyzer = PythonCodeNode.from_function(analyze_customers, name="analyzer")
workflow.add_node("analyze", analyzer)

# Connect nodes
workflow.connect("read_customers", "analyze", {"data": "data"})

# Run locally
runtime = LocalRuntime()
results, run_id = runtime.execute(workflow)
print(f"Analysis complete! Results: {results}")

# Export for production
workflow.save("customer_analysis.yaml", format="yaml")
```

### SharePoint Integration Example

```python
from kailash.workflow import Workflow
from kailash.nodes.data import SharePointGraphReader, CSVWriter
import os

# Create workflow for SharePoint file processing
workflow = Workflow("sharepoint_processor", name="sharepoint_processor")

# Configure SharePoint reader (using environment variables)
sharepoint = SharePointGraphReader()
workflow.add_node("read_sharepoint", sharepoint)

# Process downloaded files
csv_writer = CSVWriter()
workflow.add_node("save_locally", csv_writer)

# Connect nodes
workflow.connect("read_sharepoint", "save_locally")

# Execute with credentials
from kailash.runtime.local import LocalRuntime

inputs = {
    "read_sharepoint": {
        "tenant_id": os.getenv("SHAREPOINT_TENANT_ID"),
        "client_id": os.getenv("SHAREPOINT_CLIENT_ID"),
        "client_secret": os.getenv("SHAREPOINT_CLIENT_SECRET"),
        "site_url": "https://yourcompany.sharepoint.com/sites/YourSite",
        "operation": "list_files",
        "library_name": "Documents"
    }
}

runtime = LocalRuntime()
results, run_id = runtime.execute(workflow, inputs=inputs)
```

## 📚 Documentation

| Resource | Description |
|----------|-------------|
| 📖 [User Guide](docs/user-guide.md) | Comprehensive guide for using the SDK |
| 📋 [API Reference](docs/) | Detailed API documentation |
| 🌐 [API Integration Guide](examples/API_INTEGRATION_README.md) | Complete API integration documentation |
| 🎓 [Examples](examples/) | Working examples and tutorials |
| 🤝 [Contributing](CONTRIBUTING.md) | Contribution guidelines |

## 🛠️ Features

### 📦 Pre-built Nodes

The SDK includes a rich set of pre-built nodes for common operations:

<table>
<tr>
<td width="50%">

**Data Operations**
- `CSVReader` - Read CSV files
- `JSONReader` - Read JSON files
- `SQLDatabaseNode` - Query databases
- `CSVWriter` - Write CSV files
- `JSONWriter` - Write JSON files

</td>
<td width="50%">

**Processing Nodes**
- `PythonCodeNode` - Custom Python logic
- `DataTransformer` - Transform data
- `Filter` - Filter records
- `Aggregator` - Aggregate data
- `TextProcessor` - Process text

</td>
</tr>
<tr>
<td width="50%">

**AI/ML Nodes**
- `EmbeddingNode` - Generate embeddings
- `VectorDatabaseNode` - Vector search
- `ModelPredictorNode` - ML predictions
- `LLMNode` - LLM integration

</td>
<td width="50%">

**API Integration Nodes**
- `HTTPRequestNode` - HTTP requests
- `RESTAPINode` - REST API client
- `GraphQLClientNode` - GraphQL queries
- `OAuth2AuthNode` - OAuth 2.0 authentication
- `RateLimitedAPINode` - Rate-limited API calls

**Other Integration Nodes**
- `KafkaConsumerNode` - Kafka streaming
- `WebSocketNode` - WebSocket connections
- `EmailNode` - Send emails

**SharePoint Integration**
- `SharePointGraphReader` - Read SharePoint files
- `SharePointGraphWriter` - Upload to SharePoint

**Real-time Monitoring**
- `RealTimeDashboard` - Live workflow monitoring
- `WorkflowPerformanceReporter` - Comprehensive reports
- `SimpleDashboardAPI` - REST API for metrics
- `DashboardAPIServer` - WebSocket streaming server

</td>
</tr>
</table>

### 🔧 Core Capabilities

#### Workflow Management
```python
from kailash.workflow import Workflow

# Create complex workflows with branching logic
workflow = Workflow("data_pipeline", name="data_pipeline")

# Add conditional branching
# (ValidationNode, processor_a, and error_handler are placeholders for your own nodes)
validator = ValidationNode()
workflow.add_node("validate", validator)

# Different paths based on validation
workflow.add_node("process_valid", processor_a)
workflow.add_node("handle_errors", error_handler)

# Connect with conditions
workflow.connect("validate", "process_valid", condition="is_valid")
workflow.connect("validate", "handle_errors", condition="has_errors")
```
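
To make condition-tagged connections concrete, here is a minimal, library-independent sketch. It does not use the Kailash API: `route`, `CONNECTIONS`, and the idea that a condition is a truthy flag in the upstream output are all assumptions made for illustration, and the SDK may evaluate `condition` differently.

```python
# Hypothetical sketch: pick downstream nodes from condition-tagged edges.
# Plain Python, not the Kailash implementation.

# Each connection is (source, target, condition), mirroring the snippet above.
CONNECTIONS = [
    ("validate", "process_valid", "is_valid"),
    ("validate", "handle_errors", "has_errors"),
]


def route(source, output, connections=CONNECTIONS):
    """Return the targets whose condition flag is truthy in the source's output."""
    return [
        target
        for src, target, condition in connections
        if src == source and output.get(condition)
    ]


print(route("validate", {"is_valid": True, "has_errors": False}))
print(route("validate", {"is_valid": False, "has_errors": True}))
```

Only the branch whose flag is set receives the data, so a validation node can fan out to a success path and an error path without either path knowing about the other.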

#### Immutable State Management
```python
from kailash.workflow.state import WorkflowStateWrapper
from pydantic import BaseModel

# Define state model
class MyStateModel(BaseModel):
    counter: int = 0
    status: str = "pending"
    nested: dict = {}

# Create and wrap state object (`workflow` is an existing Workflow instance)
state = MyStateModel()
state_wrapper = workflow.create_state_wrapper(state)

# Single path-based update
updated_wrapper = state_wrapper.update_in(
    ["counter"],
    42
)

# Batch update multiple fields atomically
updated_wrapper = state_wrapper.batch_update([
    (["counter"], 10),
    (["status"], "processing")
])

# Execute workflow with state management
final_state, results = workflow.execute_with_state(state_model=state)
```
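
The path-based updates above return a new wrapper instead of mutating the existing one. The sketch below shows that idea on plain nested dictionaries. It is an illustration under stated assumptions, not the SDK's code: the functions `update_in` and `batch_update` here are standalone helpers written for this example and only borrow the method names.

```python
# Hypothetical sketch of immutable, path-based updates on nested dicts.
from copy import deepcopy


def update_in(state, path, value):
    """Return a new state with `value` set at `path`; `state` is left untouched."""
    if not path:
        raise ValueError("path must contain at least one key")
    new_state = deepcopy(state)
    cursor = new_state
    for key in path[:-1]:
        # Create intermediate dicts on the copy when a key is missing.
        cursor = cursor.setdefault(key, {})
    cursor[path[-1]] = value
    return new_state


def batch_update(state, updates):
    """Apply several (path, value) pairs and return one new state."""
    new_state = state
    for path, value in updates:
        new_state = update_in(new_state, path, value)
    return new_state


original = {"counter": 0, "status": "pending", "nested": {}}
updated = batch_update(original, [(["counter"], 10), (["nested", "retries"], 3)])
print(original)
print(updated)
```

Because every update works on a copy, earlier snapshots of the state stay valid, which makes it easier to compare state before and after a node runs.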

#### Task Tracking
```python
from kailash.tracking import TaskManager

# Initialize task manager
task_manager = TaskManager()

# Create a sample workflow
from kailash.workflow import Workflow
workflow = Workflow("sample_workflow", name="Sample Workflow")

# Run workflow with tracking
from kailash.runtime.local import LocalRuntime
runtime = LocalRuntime()
results, run_id = runtime.execute(workflow, task_manager=task_manager)

# Query execution history
runs = task_manager.list_runs(status="completed", limit=10)
details = task_manager.get_run(run_id)
```

#### Local Testing
```python
from kailash.runtime.local import LocalRuntime

# Create test runtime with debugging enabled
runtime = LocalRuntime(debug=True)

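# Execute with test data (replace the placeholder with real records)
test_data = {"customers": [...]}
results, run_id = runtime.execute(workflow, inputs=test_data)

# Validate results ("node_id", "output_key", and expected_value are placeholders)
assert results["node_id"]["output_key"] == expected_value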
```

#### Performance Monitoring & Real-time Dashboards
```python
from kailash.visualization.performance import PerformanceVisualizer
from kailash.visualization.dashboard import RealTimeDashboard, DashboardConfig
from kailash.visualization.reports import ReportFormat, WorkflowPerformanceReporter
from kailash.tracking import TaskManager
from kailash.runtime.local import LocalRuntime

# Run workflow with task tracking
task_manager = TaskManager()
runtime = LocalRuntime()
results, run_id = runtime.execute(workflow, task_manager=task_manager)

# Static performance analysis
# (run_id_1 and run_id_2 are IDs of earlier runs you want to compare)
perf_viz = PerformanceVisualizer(task_manager)
outputs = perf_viz.create_run_performance_summary(run_id, output_dir="performance_report")
perf_viz.compare_runs([run_id_1, run_id_2], output_path="comparison.png")

# Real-time monitoring dashboard
config = DashboardConfig(
    update_interval=1.0,
    max_history_points=100,
    auto_refresh=True,
    theme="light"
)

dashboard = RealTimeDashboard(task_manager, config)
dashboard.start_monitoring(run_id)

# Add real-time callbacks
def on_metrics_update(metrics):
    print(f"Tasks: {metrics.completed_tasks} completed, {metrics.active_tasks} active")

dashboard.add_metrics_callback(on_metrics_update)

# Generate live HTML dashboard
dashboard.generate_live_report("live_dashboard.html", include_charts=True)
dashboard.stop_monitoring()

# Comprehensive performance reports
reporter = WorkflowPerformanceReporter(task_manager)
report_path = reporter.generate_report(
    run_id,
    output_path="workflow_report.html",
    format=ReportFormat.HTML,
    compare_runs=[run_id_1, run_id_2]
)
```

**Real-time Dashboard Features**:
- ⚡ **Live Metrics Streaming**: Real-time task progress and resource monitoring
- 📊 **Interactive Charts**: CPU, memory, and throughput visualizations with Chart.js
- 🔌 **API Endpoints**: REST and WebSocket APIs for custom integrations
- 📈 **Performance Reports**: Multi-format reports (HTML, Markdown, JSON) with insights
- 🎯 **Bottleneck Detection**: Automatic identification of performance issues
- 📱 **Responsive Design**: Mobile-friendly dashboards with auto-refresh

**Performance Metrics Collected**:
- **Execution Timeline**: Gantt charts showing node execution order and duration
- **Resource Usage**: Real-time CPU and memory consumption
- **I/O Analysis**: Read/write operations and data transfer volumes
- **Performance Heatmaps**: Identify bottlenecks across workflow runs
- **Throughput Metrics**: Tasks per minute and completion rates
- **Error Tracking**: Failed task analysis and error patterns
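
As a rough illustration of how throughput and bottleneck figures like these can be derived from per-task timings, here is a small standalone sketch. `TaskTiming` and `summarize` are hypothetical names invented for this example, and "bottleneck" is simplified to mean the single longest-running node. The SDK's metrics collector may compute these differently.

```python
# Hypothetical sketch: derive throughput and a bottleneck from task timings.
from dataclasses import dataclass


@dataclass
class TaskTiming:
    node_id: str
    started: float   # seconds since run start
    finished: float  # seconds since run start

    @property
    def duration(self):
        return self.finished - self.started


def summarize(timings):
    """Return tasks per minute and the slowest node for one run."""
    if not timings:
        return {"tasks_per_minute": 0.0, "bottleneck": None}
    wall_clock = max(t.finished for t in timings) - min(t.started for t in timings)
    slowest = max(timings, key=lambda t: t.duration)
    per_minute = len(timings) / wall_clock * 60 if wall_clock > 0 else 0.0
    return {"tasks_per_minute": per_minute, "bottleneck": slowest.node_id}


timings = [
    TaskTiming("read_customers", 0.0, 2.0),
    TaskTiming("analyze", 2.0, 10.0),
    TaskTiming("save", 10.0, 12.0),
]
summary = summarize(timings)
print(summary)
```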

#### API Integration
```python
from kailash.nodes.api import (
    HTTPRequestNode as RESTAPINode,
    # OAuth2AuthNode,
    # RateLimitedAPINode,
    # RateLimitConfig
)

# OAuth 2.0 authentication (commented out in this example)
# auth_node = OAuth2AuthNode(
#     client_id="your_client_id",
#     client_secret="your_client_secret",
#     token_url="https://api.example.com/oauth/token",
# )

# Rate-limit configuration (commented out in this example)
# rate_config = RateLimitConfig(
#     max_requests=100,
#     time_window=60.0,
#     strategy="token_bucket",
# )

# Basic API client
api_client = RESTAPINode(
    base_url="https://api.example.com",
    # auth_node=auth_node,
)

# Rate-limited wrapper (commented out in this example)
# rate_limited_client = RateLimitedAPINode(
#     wrapped_node=api_client,
#     rate_limit_config=rate_config,
# )
```
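
The rate-limit configuration above names a `token_bucket` strategy. For readers unfamiliar with it, the following is a minimal sketch of the general technique. The `TokenBucket` class is hypothetical and written for this README. It reuses the parameter names `max_requests` and `time_window` for familiarity, but it is not the SDK's implementation.

```python
# Hypothetical sketch of a token-bucket limiter; not the Kailash implementation.
import time


class TokenBucket:
    """Allow bursts up to `max_requests`, refilling evenly over `time_window` seconds."""

    def __init__(self, max_requests, time_window, clock=time.monotonic):
        self.capacity = float(max_requests)
        self.refill_rate = max_requests / time_window  # tokens per second
        self.tokens = self.capacity
        self.clock = clock
        self.last_refill = clock()

    def try_acquire(self):
        """Consume one token if available; return whether the call may proceed."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# A fake clock keeps the demonstration deterministic.
fake_now = [0.0]
bucket = TokenBucket(max_requests=2, time_window=4.0, clock=lambda: fake_now[0])
first, second, third = bucket.try_acquire(), bucket.try_acquire(), bucket.try_acquire()
fake_now[0] = 2.0  # half the window later, one token has been refilled
fourth = bucket.try_acquire()
print(first, second, third, fourth)
```

Injecting the clock is a design choice for testability: production code can keep the default `time.monotonic`, while tests advance time explicitly instead of sleeping.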

#### Export Formats
```python
from kailash.utils.export import WorkflowExporter, ExportConfig

exporter = WorkflowExporter()

# Export to different formats
workflow.save("workflow.yaml", format="yaml")  # Kailash YAML format
workflow.save("workflow.json", format="json")  # JSON representation

# Export with custom configuration
config = ExportConfig(
    include_metadata=True,
    container_tag="latest"
)
workflow.save("deployment.yaml", format="yaml")
```

### 🎨 Visualization

```python
from kailash.workflow.visualization import WorkflowVisualizer

# Visualize workflow structure
visualizer = WorkflowVisualizer(workflow)
visualizer.visualize(output_path="workflow.png")

# Show in Jupyter notebook
visualizer.show()
```

## 💻 CLI Commands

The SDK includes a comprehensive CLI for workflow management:

```bash
# Project initialization
kailash init my-project --template data-pipeline

# Workflow operations
kailash validate workflow.yaml
kailash run workflow.yaml --inputs data.json
kailash export workflow.py --format kubernetes

# Task management
kailash tasks list --status running
kailash tasks show run-123
kailash tasks cancel run-123

# Development tools
kailash test workflow.yaml --data test_data.json
kailash debug workflow.yaml --breakpoint node-id
```

## 🏗️ Architecture

The SDK follows a clean, modular architecture:

```
kailash/
├── nodes/           # Node implementations and base classes
│   ├── base.py      # Abstract Node class
│   ├── data/        # Data I/O nodes
│   ├── transform/   # Transformation nodes
│   ├── logic/       # Business logic nodes
│   └── ai/          # AI/ML nodes
├── workflow/        # Workflow management
│   ├── graph.py     # DAG representation
│   └── visualization.py  # Visualization tools
├── visualization/   # Performance visualization
│   └── performance.py    # Performance metrics charts
├── runtime/         # Execution engines
│   ├── local.py     # Local execution
│   └── docker.py    # Docker execution (planned)
├── tracking/        # Monitoring and tracking
│   ├── manager.py   # Task management
│   ├── metrics_collector.py  # Performance metrics
│   └── storage/     # Storage backends
├── cli/             # Command-line interface
└── utils/           # Utilities and helpers
```
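
The workflow layer represents a pipeline as a DAG. As a sketch of what DAG validation and ordering involve, the example below uses the standard-library `graphlib` module. This is an assumption made to keep the example self-contained: the SDK lists `networkx` as a dependency and may rely on that instead, and `execution_order` is a hypothetical helper, not an SDK function.

```python
# Hypothetical sketch: validate a workflow graph and derive an execution order.
from graphlib import CycleError, TopologicalSorter


def execution_order(connections):
    """Return node ids in dependency order, or raise ValueError on a cycle."""
    sorter = TopologicalSorter()
    for source, target in connections:
        sorter.add(target, source)  # target depends on source
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        # The second element of args holds the nodes that form the cycle.
        raise ValueError(f"workflow is not a DAG: {exc.args[1]}") from exc


order = execution_order([("read_customers", "analyze"), ("analyze", "export")])
print(order)
```

A connection list that loops back on itself, such as `[("a", "b"), ("b", "a")]`, raises `ValueError` instead of returning an order.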

## 🧪 Testing

The test suite is organized into unit, integration, and end-to-end tests:

```bash
# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=kailash --cov-report=html

# Run specific test categories
uv run pytest tests/unit/
uv run pytest tests/integration/
uv run pytest tests/e2e/
```

## 🤝 Contributing

We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.

### Development Setup

```bash
# Clone the repository
git clone https://github.com/integrum/kailash-python-sdk.git
cd kailash-python-sdk

# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Sync dependencies (creates venv automatically and installs everything)
uv sync

# Run commands using uv (no need to activate venv)
uv run pytest
uv run kailash --help

# Or activate the venv if you prefer
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install development dependencies
uv add --dev pre-commit detect-secrets doc8

# Install Trivy (macOS with Homebrew)
brew install trivy

# Set up pre-commit hooks
pre-commit install
pre-commit install --hook-type pre-push

# Run initial setup (formats code and fixes issues)
pre-commit run --all-files
```

### Code Quality & Pre-commit Hooks

We use automated pre-commit hooks to ensure code quality:

**Hooks Include:**
- **Black**: Code formatting
- **isort**: Import sorting
- **Ruff**: Fast Python linting
- **pytest**: Unit tests
- **Trivy**: Security vulnerability scanning
- **detect-secrets**: Secret detection
- **doc8**: Documentation linting
- **mypy**: Type checking

**Manual Quality Checks:**
```bash
# Format code
black src/ tests/
isort src/ tests/

# Linting and fixes
ruff check src/ tests/ --fix

# Type checking
mypy src/

# Run all pre-commit hooks manually
pre-commit run --all-files

# Run specific hooks
pre-commit run black
pre-commit run pytest-check
```

## 📈 Project Status

<table>
<tr>
<td width="40%">

### ✅ Completed
- Core node system with 15+ node types
- Workflow builder with DAG validation
- Local & async execution engines
- Task tracking with metrics
- Multiple storage backends
- Export functionality (YAML/JSON)
- CLI interface
- Immutable state management
- API integration with rate limiting
- OAuth 2.0 authentication
- SharePoint Graph API integration
- **Real-time performance metrics collection**
- **Performance visualization dashboards**
- **Real-time monitoring dashboard with WebSocket streaming**
- **Comprehensive performance reports (HTML, Markdown, JSON)**
- **100% test coverage (544 tests)**
- **15 test categories all passing**
- 21+ working examples

</td>
<td width="30%">

### 🚧 In Progress
- Comprehensive API documentation
- Security audit & hardening
- Performance optimizations
- Docker runtime finalization

</td>
<td width="30%">

### 📋 Planned
- Cloud deployment templates
- Visual workflow editor
- Plugin system
- Additional integrations

</td>
</tr>
</table>

### 🎯 Test Suite Status
- **Total Tests**: 544 passing (100%)
- **Test Categories**: 15/15 at 100%
- **Integration Tests**: 65 passing
- **Examples**: 21/21 working
- **Code Coverage**: Comprehensive

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- The Integrum team for the Kailash architecture
- All contributors who have helped shape this SDK
- The Python community for excellent tools and libraries

## 📞 Support

- 📋 [GitHub Issues](https://github.com/integrum/kailash-python-sdk/issues)
- 📧 Email: support@integrum.com
- 💬 Slack: [Join our community](https://integrum.slack.com/kailash-sdk)

---

<p align="center">
  Made with ❤️ by the Integrum Team
</p>
