Metadata-Version: 2.4
Name: kohakuboard
Version: 0.0.1
Summary: High efficiency Local/self-hosted ML Experiment Tracking System
Author-email: "Shih-Ying Yeh(KohakuBlueLeaf)" <apolloyeh0123@gmail.com>
License: Apache-2.0
Project-URL: Homepage, https://github.com/KohakuBlueleaf/KohakuBoard
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: aiofiles
Requires-Dist: aiohttp
Requires-Dist: bcrypt
Requires-Dist: boto3
Requires-Dist: cachetools
Requires-Dist: click>=8.0
Requires-Dist: click-completion
Requires-Dist: cryptography>=46.0
Requires-Dist: duckdb
Requires-Dist: fastapi
Requires-Dist: fsspec
Requires-Dist: httpx
Requires-Dist: kohakuvault
Requires-Dist: numpy
Requires-Dist: orjson
Requires-Dist: pandas
Requires-Dist: peewee
Requires-Dist: psycopg2-binary
Requires-Dist: pydantic[email]
Requires-Dist: python-multipart
Requires-Dist: pytz
Requires-Dist: pyyaml
Requires-Dist: questionary
Requires-Dist: requests
Requires-Dist: rich
Requires-Dist: toml
Requires-Dist: tqdm
Requires-Dist: uvicorn
Requires-Dist: pillow
Provides-Extra: inspector
Requires-Dist: customtkinter>=5.2.2; extra == "inspector"
Requires-Dist: matplotlib>=3.5.0; extra == "inspector"
Dynamic: license-file

# KohakuBoard

High-performance ML experiment tracking with **zero training overhead**.

[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/KohakuBlueleaf/KohakuBoard)

> **Part of [KohakuHub](https://github.com/KohakuBlueleaf/KohakuHub)** - Self-hosted AI Infrastructure

---

## Quick Start

```bash
pip install -e .
```

```python
from kohakuboard.client import Board

board = Board(name="my-experiment", config={"lr": 0.001, "batch_size": 32})

# Training loop
for epoch in range(10):
    for data, target in train_loader:
        loss = train_step(data, target)

        board.step()  # Once per optimizer step
        board.log(loss=loss.item())  # Non-blocking, <0.1ms
        # Alternative: move board.step() after board.log() for 0-indexed steps

# logs are stored under ./kohakuboard using KohakuVault column stores + SQLite metadata
```


---

|![1761752427584](image/README/1761752427584.png)|![1761752450957](image/README/1761752450957.png)|
|-|-|

**Join our community:** https://discord.gg/xWYrkyvJ2s

---

## Why KohakuBoard?

### KohakuBoard's Advantages

- **Zero Training Overhead** - Non-blocking logging returns in <0.1ms
- **Local-First** - No server required during training, view results instantly
- **High Throughput** - 20,000+ metrics/second sustained
- **Rich Data Types** - Scalars, images, videos, tables, histograms
- **WebGL Visualization** - Handle 100K+ datapoints smoothly
- **Self-Hosted** - Your data stays on your infrastructure

---

## Features

### Non-Blocking Architecture

**Background Writer Process** ensures training never waits:

```
Training Script          Background Process
     │                          │
     ├─ board.log(loss=0.5)     │
     │  └─> Queue.put()         │
     │      (<0.1ms return!)    │
     │                          ├─ Queue.get()
     ├─ Continue training...    ├─ Batch write
     │                          └─ Flush to disk
```

**Performance:**
- Log call latency: **<0.1ms**
- Throughput: **20,000+ metrics/sec**
- Queue capacity: **50,000 messages**
- Memory overhead: **~100-200 MB**

### Rich Data Types

**Unified API** for all data types - no step inflation:

```python
board.log(
    loss=0.5,                           # Scalar
    sample_img=Media(image),            # Image
    predictions=Table(results),         # Table
    gradients=Histogram(grads)          # Histogram
)
# All logged at SAME step with 1 queue message!
```

**Supported Types:**
- **Scalars** - Metrics, learning rates, accuracies
- **Media** - Images (PNG/JPG), videos (MP4), audio (WAV)
- **Tables** - Structured data with embedded images
- **Histograms** - Weight/gradient distributions with compression (99.8% size reduction)

### Three-Tier SQLite Storage Architecture

**Powered by [KohakuVault](https://github.com/KohakuBlueleaf/KohakuVault)** - A high-performance storage library with dual interfaces over SQLite:

```
Three Specialized SQLite Implementations:

1. KohakuVault KVault        2. KohakuVault ColumnVault     3. Standard SQLite
   (K-V Store)                   (Columnar Storage)             (Relational)
   ├─ Media blobs                ├─ Metrics                     ├─ Media metadata
   ├─ B+Tree index on K          ├─ Histograms                  ├─ Tables
   ├─ Content-addressable        ├─ Blob-based columnar         └─ Step info
   └─ .cache() for bulk ops      └─ Dynamic chunk growth
```

**Why KohakuVault?**
- **Zero dependencies** - Single SQLite file, no external services
- **Simple deployment** - Just .db files, no infrastructure
- **Dual-interface design** - Dict-like for blobs, list-like for sequences
- **High performance** - Native speed with Pythonic API
- **Memory efficient** - Streaming support, dynamic chunk growth
- **True SWMR** - Multiple readers, single writer via SQLite WAL

**Why Three Tiers?**
- **KVault**: Optimized for blob storage with B+Tree index, content-addressable
- **ColumnVault**: Optimized for append-heavy time-series with columnar layout
- **Standard SQLite**: Optimized for structured metadata with ACID guarantees

### Advanced Visualization

**WebGL-Based Charts** powered by Plotly.js:
- Handle **100K+ datapoints** smoothly
- Configurable smoothing (EMA, MA, Gaussian)
- X-axis selection (step, global_step, any metric)
- Multi-metric overlays
- Dark/light mode
- Responsive design

**Rich Viewers:**
- **Histogram Navigator** - Step-by-step distribution exploration
- **Media Viewer** - Image grids, video playback
- **Table Viewer** - Structured data with embedded images
- **Dashboard** - Customizable metric layouts

### Local-First Workflow

```bash
# Train locally
python train.py              # Logs to ./kohakuboard/

# View results (no server required!)
kobo open ./kohakuboard --browser

# Optional server for team sharing (requires kohakuboard_server)
kobo-serve --port 48889
```

**No server setup, no configuration, no hassle.**

---

## Quick Start

### Installation

```bash
pip install -e .
```

### Basic Usage

```python
from kohakuboard.client import Board

# Create board - automatically saves on program exit
board = Board(name="my-experiment", config={"lr": 0.001, "batch_size": 32})

# Training loop
for epoch in range(10):
    for batch_idx, (data, target) in enumerate(train_loader):
        loss = train_step(data, target)

        # Increment step once per optimizer step (not per epoch!)
        board.step()

        # Log metrics (non-blocking, returns in <0.1ms)
        board.log(loss=loss.item(), lr=optimizer.param_groups[0]['lr'])

    # Log validation at end of epoch
    val_loss = validate(model, val_loader)
    board.log(**{"val/loss": val_loss})

# That's it! No .finish() needed - auto-cleanup via atexit
```

### View Results

```bash
# Local viewer (no server)
kobo open ./kohakuboard --browser

# Or launch the authenticated server (requires kohakuboard_server)
kobo-serve --port 48889
# Drop/copy board folders into the configured data dir to share runs
```

---

## Complete Example

```python
from kohakuboard.client import Board, Histogram, Table, Media
import torch

# Create board with hyperparameters
board = Board(
    name="cifar10-resnet18",
    config={"lr": 0.001, "batch_size": 128, "epochs": 100, "optimizer": "AdamW"}
)

# Training loop
for epoch in range(100):
    model.train()
    for data, target in train_loader:
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()

        # Step once per optimizer step
        board.step()

        # Log scalars (non-blocking, <0.1ms)
        board.log(loss=loss.item(), lr=optimizer.param_groups[0]['lr'])

    # Validation
    model.eval()
    val_loss, correct, predictions_table = 0, 0, []

    with torch.no_grad():
        for batch_idx, (data, target) in enumerate(val_loader):
            output = model(data)
            val_loss += criterion(output, target).item()
            pred = output.argmax(dim=1)
            correct += (pred == target).sum().item()

            # Sample predictions for table (first batch only)
            if batch_idx == 0:
                for i in range(min(8, len(data))):
                    predictions_table.append({
                        "image": Media(data[i].cpu().numpy()),
                        "true": class_names[target[i]],
                        "pred": class_names[pred[i]],
                        "correct": "✓" if pred[i] == target[i] else "✗"
                    })

    # Log validation (scalars + table + histograms - all at same step!)
    hist_data = {f"grad/{n}": Histogram(p.grad) for n, p in model.named_parameters() if p.grad is not None}
    board.log(**{
        "val/loss": val_loss / len(val_loader),
        "val/accuracy": correct / len(val_loader.dataset),
        "val/predictions": Table(predictions_table),
        **hist_data
    })

# No .finish() needed - automatic cleanup when script exits
```

---

## Architecture

### Client (Training Script)

```
Main Process (Training)          Background Writer Process
       │                                   │
       ├─ board.log(loss=0.5)              │
       │  └─> Queue.put()                  │
       │      (returns instantly!)         │
       │                                   ├─ Queue.get()
       │                                   ├─ Process batch
       ├─ Continue training...             ├─ Write to storage
       │                                   └─ Flush to disk
```

**Key Features:**
- **Non-blocking**: `log()` returns in <0.1ms
- **Message Queue**: 50,000 message capacity
- **Writer Process**: Background process drains queue
- **Storage Layer**: Three-tier SQLite architecture (KohakuVault KVault + ColumnVault + Standard SQLite)
- **Graceful Shutdown**: atexit hooks + signal handlers ensure no data loss

### Backend (Visualization Server)

```
FastAPI Backend (Port 48889)
    ↓ Read-only connections
Board Files (./kohakuboard/)
    ├── {board_id}/
    │   ├── metadata.json
    │   ├── data/           ← SQL/columnar queries here
    │   │   ├── metrics/    ← KohakuVault DB files
    │   │   └── metadata.db ← SQLite database
    │   └── media/
    │       └── *.png, *.mp4
        ↓ REST API
Vue 3 Frontend (WebGL Charts)
```

**Key Features:**
- **Zero-copy serving**: Reads files directly (no database)
- **Concurrent reads**: Multiple connections supported
- **Fast queries**: Columnar storage for metrics
- **Static serving**: Media files served directly

---

## Data Model

### Directory Structure

```
./kohakuboard/
└── {board_id_timestamp}/
    ├── metadata.json           # Board info, config, timestamps
    ├── data/                   # Storage backend files
    │   ├── metrics/            # (hybrid) KohakuVault columnar files
    │   │   ├── train__loss.db
    │   │   ├── val__accuracy.db
    │   │   └── ...
    │   ├── metadata.db         # (hybrid) SQLite metadata
    │   └── histograms/
    │       ├── gradients_i32.db  # int32 precision
    │       └── params_u8.db      # uint8 precision (compact)
    ├── media/                  # Content-addressed storage
    │   ├── {name}_{idx}_{step}_{sha256}.png
    │   ├── {name}_{idx}_{step}_{sha256}.mp4
    │   └── {name}_{idx}_{step}_{sha256}.wav
    └── logs/
        ├── output.log          # Captured stdout/stderr
        └── writer.log          # Writer process logs
```

### Metadata Schema

```json
{
  "board_id": "20250129_150423_abc123",
  "name": "cifar10-resnet18",
  "config": {
    "lr": 0.001,
    "batch_size": 128,
    "epochs": 100
  },
  "created_at": "2025-01-29T15:04:23",
  "finished_at": "2025-01-29T18:32:45",
  "status": "finished",
  "version": "0.0.1"
}
```

### Manual Sync / Remote Sharing

Both the training-side package (`kohakuboard`) and the optional server (`kohakuboard_server`) read the exact same directory layout. To move a run between machines:

1. Copy the entire board folder (`{base_dir}/{project}/{board_id}`) using `cp`, `rsync`, or any file transfer tool.
2. Drop it into the destination data directory (the folder you pass to `kobo open ...` or the directory configured via `KOHAKU_BOARD_DATA_DIR` / `--data-dir` on `kobo-serve`).
3. Restart the viewer or refresh the UI. The new run is immediately available.

No export/import step is required because metrics, metadata, tensors, and media already live in KohakuVault + SQLite files. The legacy `kobo sync` command still expects a DuckDB export and will fail on modern boards—use manual copy until the new sync API lands.

---

## CLI Tool

```bash
# Open local viewer (no server)
kobo open ./kohakuboard --browser

# Start authenticated server (kohakuboard_server package)
kobo-serve --port 48889 --host 0.0.0.0

# Manual sync (recommended today): copy the entire board folder into the server's data dir
# (kobo sync is still wired to the legacy DuckDB exporter and will error on modern boards)
```

---

## Configuration

### Basic Usage

```python
# All boards use KohakuVault + SQLite (no backend parameter needed)
board = Board(name="my-experiment", project="vision")
```

### Advanced Options

```python
board = Board(
    name="experiment",
    board_id="custom-id",           # Auto-generated if not provided
    config={"lr": 0.001},           # Hyperparameters
    project="custom-project",       # Sub-directory inside base_dir
    base_dir="./my-boards",         # Custom directory
    capture_output=True,            # Capture stdout/stderr
    remote_url="https://board.example.com",  # Optional future sync target
    remote_token="...",             # Token for remote sync (WIP)
    sync_enabled=False,             # Enable when remote endpoints are ready
    memory_mode=False,              # Keep data in RAM (requires sync to persist)
    annotation="debug-run",         # Suffix appended to run directory name
)
```

**Storage Architecture:**
- KohakuVault KVault: Media blobs (K-V table with B+Tree index)
- KohakuVault ColumnVault: Metrics/histograms (blob-based columnar)
- Standard SQLite: Metadata (traditional relational tables)

### Context Manager

```python
with Board(name="experiment") as board:
    board.log(loss=0.5)
    # Automatic flush() and finish() on exit
```

---

## API Reference

### Board

```python
Board(
    name: str | None = None,
    board_id: str | None = None,
    config: dict | None = None,
    project: str | None = None,
    base_dir: str | Path | None = None,
    capture_output: bool = True,
    remote_url: str | None = None,
    remote_token: str | None = None,
    remote_project: str | None = None,
    sync_enabled: bool = False,
    sync_interval: int = 10,
    memory_mode: bool = False,
    *,
    annotation: str | None = None,
)
```

**Methods:**

**`board.step()`** - Increment global_step

```python
for batch_idx, batch in enumerate(train_loader):
    loss = train_step(batch)
    board.step()  # Increment ONCE per optimizer step
    board.log(**{"train/loss": loss, "train/lr": scheduler.get_last_lr()[0]})
```

**`board.log(**metrics)`** - Log data (non-blocking)

```python
board.log(
    loss=0.5,
    accuracy=0.95,
    learning_rate=0.001,            # Scalars
    sample=Media(image_array),      # Images/video/audio
    predictions=Table(rows),        # Tables (optionally with Media)
    grad_norm=Histogram(values),    # Histograms
)

# Namespaces (creates tabs in UI)
board.log(**{
    "train/loss": 0.5,
    "val/accuracy": 0.95
})

# Tensor + KDE payloads (specialized viewers)
board.log(
    attention_tensor=TensorLog(tensor),
    density=KernelDensity(values, grid_size=256),
)
```

**`board.flush()`** - Force flush (blocks until complete)

```python
board.flush()  # Wait for all pending writes
```

**`board.finish()`** - Manual cleanup (auto-called on exit)

```python
board.finish()  # Flush buffers, close connections
```

### Data Types

#### Media

```python
from kobo.client.types import Media

# Images
board.log(
    sample_img=Media(image_array),  # numpy, PIL, torch tensor
    prediction=Media(pred_img, caption="Predicted: cat")
)

# Video
board.log(
    training_video=Media("output.mp4", media_type="video")
)

# Audio (if supported)
board.log(
    audio_sample=Media("sample.wav", media_type="audio")
)
```

#### Table

```python
from kobo.client.types import Table

# From list of dicts
results = Table([
    {"name": "Alice", "score": 95, "pass": True},
    {"name": "Bob", "score": 87, "pass": True},
])
board.log(student_results=results)

# Tables with embedded images
predictions = Table([
    {"image": Media(img), "label": "cat", "confidence": 0.95},
    {"image": Media(img2), "label": "dog", "confidence": 0.87},
])
board.log(val_predictions=predictions)
```

#### Histogram

```python
from kobo.client.types import Histogram

# Log gradient distributions
board.log(
    gradients=Histogram(param.grad),
    weights=Histogram(param.data)
)

# Precompute for efficiency (optional)
hist = Histogram(gradients).compute_bins()
board.log(grad_distribution=hist)

# Compact precision (75% size reduction, ~1% accuracy loss)
hist = Histogram(gradients, precision="compact")
board.log(grad_distribution=hist)
```

---

## Deployment

### Local Mode (Recommended)

```bash
# Install
pip install -e .

# Train
python train.py

# View results
kobo open ./kohakuboard --browser
```

### Remote Mode (WIP)

```bash
# Run the authenticated server (still stabilizing)
kobo-serve --data-dir /var/kohakuboard --db sqlite:///kohakuboard.db

# Share boards by copying folders into /var/kohakuboard/<project>/
# Restart/reload the server to pick up new runs
```

See [docs/kohakuboard/](./docs/kohakuboard/) for complete deployment guides.

---

## Comparison with Alternatives

| Feature | WandB | TensorBoard | MLflow | KohakuBoard |
|---------|-------|-------------|--------|-------------|
| **Latency** | ~10ms | ~1ms | ~5ms | **<0.1ms** |
| **Throughput** | ~1K/sec | ~10K/sec | ~5K/sec | **20K+/sec** |
| **Offline** | ❌ No | ✅ Yes | ✅ Yes | ✅ Yes |
| **File-Based** | ❌ No | ✅ Yes | ❌ No | ✅ Yes |
| **Non-Blocking** | ❌ No | ❌ No | ❌ No | ✅ Yes |
| **Columnar Reads** | ❌ No | ❌ No | ✅ Yes | ✅ Yes (KohakuVault ColumnVault) |
| **WebGL Charts** | ❌ No | ❌ No | ❌ No | ✅ Yes |
| **100K+ Points** | Slow | Slow | Slow | **Fast** |
| **Self-Hosted** | Limited | ✅ Yes | ✅ Yes | ✅ Yes |
| **Setup** | Cloud | Local | Server | **None** |

---

## Documentation

- [Getting Started](./docs/kohakuboard/getting-started.md) - Installation and basic usage
- [API Reference](./docs/kohakuboard/api.md) - Complete API documentation
- [Architecture](./docs/kohakuboard/architecture.md) - System design and internals
- [CLI Guide](./docs/kohakuboard/cli.md) - Command-line tool usage
- [Usage Manual](./docs/kohakuboard/usage-manual.md) - Day-to-day operations checklist
- [Examples](./examples/) - Real-world usage patterns

---

## Examples

See `examples/` directory:

- `kohakuboard_basic.py` - Simple scalar logging
- `kohakuboard_all_media_types.py` - Images, videos, tables
- `kohakuboard_cifar_training.py` - Complete CIFAR-10 training example
- `kohakuboard_media_in_tables.py` - Tables with embedded images
- `kohakuboard_histogram_logging.py` - Gradient distribution tracking

---

## Roadmap

### ✅ Complete

**Client Library:**
- Non-blocking logging architecture
- Rich data types (scalars, media, tables, histograms)
- Three-tier SQLite architecture (KohakuVault KVault + ColumnVault + Standard SQLite)
- Graceful shutdown with queue draining
- Content-addressed media storage

**Backend & UI:**
- FastAPI REST API
- Vue 3 interface with dark/light mode
- WebGL charts (100K+ points)
- Histogram navigator
- Media/table viewers
- CLI tool (`kobo`)

### 🚧 In Progress

- Remote server mode with authentication
- Sync protocol for uploading local boards
- Project management (group related boards)
- Run comparison UI (side-by-side metrics)
- Real-time streaming (live updates while training)

### 📋 Planned

**Client Features:**
- PyTorch Lightning integration
- Keras callback
- Hugging Face Trainer integration
- Custom callback system

**Backend Features:**
- Multi-board comparison API
- Advanced filtering (tags, date range)
- Export to CSV/JSON
- Aggregation queries

**UI Features:**
- Diff viewer (compare runs)
- Scatter plots (metric vs metric)
- Custom dashboards
- Annotations
- Search and filter

**Infrastructure:**
- Docker/Kubernetes deployment
- Cloud storage backends (S3, GCS)
- Multi-user authentication


---

## License

KohakuBoard is a multi-component project with different licenses:

- **Client Library** (`kohakuboard`): [Apache License 2.0](src/kohakuboard/LICENSE)
  - Free for commercial and non-commercial use
  - Permissive license with minimal requirements

- **Web UI** (`kohaku-board-ui`): [AGPL-3.0](src/kohaku-board-ui/LICENSE)
  - Free to use and modify
  - Source code disclosure required for network services

- **Server** (`kohakuboard_server`): [Kohaku Software License 1.0](src/kohakuboard_server/LICENSE)
  - Free for non-commercial use
  - Free for commercial use under revenue/duration limits
  - Commercial licenses available for larger deployments

**Commercial Licensing:** For commercial licenses or exemptions, contact kohaku@kblueleaf.net

See [LICENSE](./LICENSE) for complete details.

---

## Contributing

KohakuBoard is part of the KohakuHub ecosystem. We welcome contributions!

**Before contributing:**
- Read [CONTRIBUTING.md](./CONTRIBUTING.md) for code style and guidelines
- Join our [Discord](https://discord.gg/xWYrkyvJ2s) for discussions
- Check [open issues](https://github.com/KohakuBlueleaf/KohakuHub/issues) tagged with `kohakuboard`

**Areas we need help:**
- 🎨 Frontend (chart improvements, UI/UX)
- 🔧 Backend (storage backends, performance)
- 📊 Client library (framework integrations)
- 📚 Documentation (tutorials, guides)
- 🧪 Testing (unit tests, benchmarks)

---

## Support

- **Discord:** https://discord.gg/xWYrkyvJ2s (Use #kohakuboard channel)
- **Issues:** https://github.com/KohakuBlueleaf/KohakuHub/issues (Tag with `kohakuboard`)
- **Email:** kohaku@kblueleaf.net

---

## Acknowledgments

- [KohakuVault](https://github.com/KohakuBlueleaf/KohakuVault) - High-performance storage library with dual SQLite interfaces (KVault for blobs, ColumnVault for sequences)
- [Plotly.js](https://plotly.com/javascript/) - WebGL charts
- [Vue 3](https://vuejs.org/) - Modern UI framework
- [FastAPI](https://fastapi.tiangolo.com/) - Backend framework

---

**Production Ready!** Core features are stable and performant. Use in real training workflows and help us improve.
