Metadata-Version: 2.4
Name: modosaic
Version: 0.1.0
Summary: Modosaic: A Multimodal Mosaic for In-Context Learning
Author-email: Oriol Agost Batalla <oriol.agost@udl.cat>, Oriol Agost Batalla <oriol.agost@gft.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/dcg-udl-cat/modosaic
Project-URL: Documentation, https://modosaic.udl.cat
Project-URL: Source, https://github.com/dcg-udl-cat/modosaic
Project-URL: Issues, https://github.com/dcg-udl-cat/modosaic/issues
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Education
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Utilities
Requires-Python: ~=3.13.0
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=2.4.2
Requires-Dist: torch==2.10.0
Requires-Dist: pyarrow>=20.0.0
Requires-Dist: transformers==5.2.0
Requires-Dist: einops>=0.8.2
Requires-Dist: timm>=1.0.25
Requires-Dist: sentencepiece>=0.2.1
Requires-Dist: qwen-vl-utils==0.0.14
Requires-Dist: accelerate==1.12.0
Requires-Dist: diffusers>=0.36.0
Requires-Dist: imagebind
Requires-Dist: setuptools==81.0.0
Requires-Dist: segment-anything
Requires-Dist: opencv-python==4.13.0.92
Requires-Dist: typer>=0.24.1
Dynamic: license-file

# Modosaic

Modosaic is a multimodal image-dataset pipeline for generating, validating, and
saving complementary modalities from a shared image source. It gives you:

- A unified dataset layer for local image folders and parquet datasets.
- A configurable generation pipeline for source images, captions, segmentation
  masks, depth maps, and surface-normal fields.
- Validators and quality-gate constraints that decide which generated
  modalities are saved.
- A CLI for default runs, fully configurable runs, and config-file driven runs.
- A clean Python API for composing custom modalities, validators, and
  postprocessors.
- Reproducible experiment folders with generated artifacts, validation JSON, and
  structured logs.

---

## Installation

```bash
uv sync
```

or, from an activated environment:

```bash
pip install -e .
```

or, from PyPI:

```bash
pip install modosaic
```

Python 3.13 is required. CUDA is optional but strongly recommended for the
heavier text, segmentation, depth, and normals models.

### Nix dev shell

The repository includes a Nix flake for a CUDA-ready development shell. Before
entering it, configure the Nix daemon to trust the binary caches used by the
flake; this avoids long local builds for CUDA and community packages.

Add the following to `/etc/nix/nix.conf`, replacing `olal_gft_com` in
`trusted-users` with your own username:

```ini
experimental-features = nix-command flakes

trusted-users = root olal_gft_com

extra-substituters = https://cache.nixos.org https://nix-community.cachix.org https://cache.nixos-cuda.org

extra-trusted-public-keys = nix-community.cachix.org-1:mB9ZQ+4kTq9qUqM96H8P6oz+ZWHR+Hh3wlgYx9oSt1A= cache.nixos-cuda.org:74DUi4Ye579gUqzH4ziL9IyiJBlDpMRn9MBN8oNan9M=
```

Restart the Nix daemon after changing the file, then enter the shell:

```bash
nix develop
```

### Hugging Face model access

If you use the SAM 3 segmentation model (`sam3`, backed by `facebook/sam3`),
run Modosaic with `HF_TOKEN` set to a Hugging Face token from an account that
has access to Meta's SAM 3 model:

```bash
export HF_TOKEN=hf_...
modosaic run --dataset local --root ./images --segmentation-model sam3
```

Do not commit tokens to the repository.
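
If you drive Modosaic from your own script, a small pre-flight check can fail
early when the token is missing. The sketch below is not part of Modosaic;
`require_hf_token` and `redact` are hypothetical helper names, and the only
thing taken from this README is the `HF_TOKEN` variable name.

```python
import os


def require_hf_token(env=None):
    """Return the Hugging Face token, or raise a clear error if it is unset."""
    # `env` defaults to the process environment; passing a dict eases testing.
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; export a token with access to facebook/sam3."
        )
    return token


def redact(token):
    """Shorten a token so it is safe to include in log output."""
    return token[:3] + "..." if len(token) > 3 else "..."
```

Logging `redact(token)` instead of the token itself keeps secrets out of the
structured logs.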

---

## Quick Start

### 1. List supported modalities and models

```bash
modosaic models
```

When running directly from a checkout without installing the console script:

```bash
python -m modosaic.cli.cli models
```

### 2. Run the default pipeline on a local image folder

```bash
modosaic simple ./images --limit 10
```

This runs all default modalities in dependency-safe order:

```text
image -> text -> segmentation -> depth -> normals
```
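
To illustrate what "dependency-safe" means, here is a minimal sketch using the
standard-library `graphlib`. The `DEPENDENCIES` map is an assumption made for
this example, not Modosaic's actual dependency table, and
`dependency_safe_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each modality lists the modalities it needs.
DEPENDENCIES = {
    "image": set(),
    "text": {"image"},
    "segmentation": {"image"},
    "depth": {"image", "segmentation"},
    "normals": {"depth"},
}


def dependency_safe_order(enabled):
    """Order the enabled modalities so each runs after the ones it needs."""
    enabled_set = set(enabled)
    # Keep only dependencies that are themselves enabled for this run.
    graph = {name: DEPENDENCIES[name] & enabled_set for name in enabled}
    return list(TopologicalSorter(graph).static_order())
```

Under this assumed map, `image` always comes first and `normals` always follows
`depth`; the relative order of independent modalities such as `text` and
`segmentation` is not fixed by the sort.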

Artifacts are written under `./experiments/<timestamp>/`.

### 3. Run a selected local-folder experiment

```bash
modosaic run \
  --dataset local \
  --root ./images \
  --modality image \
  --modality segmentation \
  --modality depth \
  --segmentation-model sam3 \
  --depth-model depth-anything-v2-small \
  --segmentation-mask-quality-min 0.70 \
  --depth-segmentation-boundary-min 0.15 \
  --limit 20 \
  --experiment-root ./experiments \
  --experiment-name local-seg-depth
```

Validators can be disabled with `--no-validators`. To run validators without
using them as save/discard gates, pass `--no-constraints`.

### 4. Run from a parquet dataset

```bash
modosaic run \
  --dataset parquet \
  --parquet-path ./data/imagenet-a \
  --image-column image.bytes \
  --metadata-column label \
  --modality image \
  --modality text \
  --text-model qwen-2-2b \
  --text-siglip-min 0.65 \
  --limit 20
```

Parquet image columns can contain bytes, bytearray/memoryview values,
`list[int]`, nested fields such as `image.bytes`, or paths relative to the
parquet file.
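
The accepted cell shapes can be pictured as a small normalization step. This is
a sketch under assumptions, not Modosaic's loader: `resolve_nested` and
`cell_to_image_bytes` are hypothetical names, rows are modeled as plain nested
dicts, and relative paths are resolved against the folder containing the
parquet file.

```python
from pathlib import Path


def resolve_nested(row, column):
    """Follow a dotted column name such as 'image.bytes' through nested dicts."""
    value = row
    for key in column.split("."):
        value = value[key]
    return value


def cell_to_image_bytes(value, parquet_path):
    """Convert one parquet cell into raw image bytes."""
    if isinstance(value, (bytes, bytearray, memoryview)):
        return bytes(value)
    if isinstance(value, list):
        # list[int]: every entry must be in the range 0..255.
        return bytes(value)
    if isinstance(value, (str, Path)):
        # Assumption: text cells are paths relative to the parquet file's folder.
        return (Path(parquet_path).parent / value).read_bytes()
    raise TypeError(f"Unsupported image cell type: {type(value).__name__}")
```

For example, `resolve_nested(row, "image.bytes")` followed by
`cell_to_image_bytes(...)` mirrors what `--image-column image.bytes` asks for.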

---

## Full Pipeline From Config

Pipeline config files can be JSON, TOML, or YAML (`.yaml` or `.yml`). YAML
requires PyYAML. The config maps directly into `modosaic.cli.config.RunConfig`.

Example `examples/config.yaml`:

```yaml
dataset:
  type: local
  root: ./images
  recursive: true
  extensions: [.jpg, .png]

modalities:
  enabled: [image, segmentation, depth]
  models:
    segmentation: sam3
    depth: depth-anything-v2-small

validators:
  enabled: true
  constraints:
    enabled: true
    segmentation_mask_quality_minimum: 0.70
    segmentation_boundary_minimum: 0.20
    depth_imagebind_minimum: 0.55
    depth_segmentation_boundary_minimum: 0.15
  segmentation_boundary_thickness: 1
  segmentation_tolerance_radius: 2
  segmentation_rgb_edge_quantile: 0.90
  depth_boundary_thickness: 1
  depth_tolerance_radius: 2
  depth_edge_quantile: 0.90

run:
  limit: 20
  experiment_root: ./experiments
  experiment_name: local-seg-depth
  log_path: ./.logs
  seed: 42
```

Run it with:

```bash
modosaic pipeline examples/config.yaml
```

Override selected config values at execution time:

```bash
modosaic pipeline examples/config.yaml --limit 5 --seed 123 --json
```
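
Conceptually, an override replaces a single value under `run` and leaves the
rest of the file's settings alone. A minimal sketch, assuming the config has
already been parsed into a nested dict (`apply_run_overrides` is a hypothetical
helper, not a Modosaic function):

```python
import copy


def apply_run_overrides(config, limit=None, seed=None):
    """Return a copy of the config with non-None overrides applied to `run`."""
    merged = copy.deepcopy(config)  # never mutate the parsed file contents
    run = merged.setdefault("run", {})
    for key, value in {"limit": limit, "seed": seed}.items():
        if value is not None:
            run[key] = value
    return merged
```

With the example config above, `apply_run_overrides(config, limit=5, seed=123)`
changes only `run.limit` and `run.seed`.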

---

## Python API

```python
from pathlib import Path

from modosaic import ExperimentService, ImageDataset, LoggingService, Pipeline
from modosaic.depth.preconfigured_modality import build_preconfigured_depth_modality
from modosaic.image import build_preconfigured_image_modality
from modosaic.segmentation.preconfigured_modality import (
    build_preconfigured_segmentation_modality,
)
from modosaic.services.seeding import SeedingService

LoggingService.setup_logging()
SeedingService.set_global_seed(42)

dataset = ImageDataset.from_local_folder(Path("images"))

pipeline = Pipeline(
    dataset=dataset,
    modalities=[
        build_preconfigured_image_modality(),
        build_preconfigured_segmentation_modality(),
        build_preconfigured_depth_modality(),
    ],
    experiment=ExperimentService(
        root="experiments",
        experiment_name="local-seg-depth",
    ),
)

results = pipeline.run(limit=10)

for result in results:
    print(result.record.sample_id, result.artifact_paths)
```

Each `ConfiguredModality` owns a generator, validators, and a postprocessor.
Plain validators receive `(record, generated)`. A `ValidatorStep` can also pass
generated outputs from earlier modalities as keyword dependencies.

```python
from modosaic.core.validation_constraint import ValidationConstraint
from modosaic.core.validator_step import ValidatorStep
from modosaic.segmentation.validators.impl.mask_statistics import MaskStatsValidator

mask_quality_gate = ValidatorStep(
    validator=MaskStatsValidator(),
    constraint=ValidationConstraint(
        minimum=0.75,
        score_name="weighted_mask_quality",
        score_fn=lambda stats: (
            0.4 * stats.coverage_score
            + 0.3 * stats.distinctness_score
            + 0.3 * stats.fragmentation_score
        ),
    ),
)
```

A constrained modality is saved only when every configured constraint passes.
Rejected modalities do not write generated artifacts or validation JSON, and
later validators cannot use them as dependencies.
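
The gating rule can be sketched independently of Modosaic's classes. In this
illustration, `Gate` is a hypothetical stand-in for a validator paired with a
minimum score, and `accepted` is a plain dict playing the role of the outputs
that later validators may depend on.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Gate:
    """Hypothetical stand-in for a validator paired with a minimum score."""

    name: str
    score_fn: Callable[[Any], float]
    minimum: float


def run_gated(generated, gates, accepted, modality):
    """Store `generated` in `accepted` only if every gate meets its minimum."""
    results = [(g.name, g.score_fn(generated), g.minimum) for g in gates]
    passed = all(score >= minimum for _, score, minimum in results)
    if passed:
        # Only accepted outputs become visible to later modalities.
        accepted[modality] = generated
    return passed, results
```

A rejected modality never enters `accepted`, so a later step that looks it up
simply does not find it.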

---

## CLI Reference

### `modosaic simple ROOT [--limit N]`

Run the default Modosaic pipeline on a local image folder.

### `modosaic run [OPTIONS]`

Run with CLI-provided dataset, modality, model, validator, constraint, and
experiment settings.

### `modosaic pipeline CONFIG_PATH [--limit N] [--seed SEED] [--json]`

Run from a JSON, TOML, or YAML (`.yaml` or `.yml`) config file.

### `modosaic models`

Print valid modality and model names for CLI options and config files.

---

## Concepts & Extensibility

- Dataset adapters -> `modosaic.providers`: local folders and parquet data.
- Generators -> `modosaic.<modality>.generators`: model-backed modality
  generation.
- Validators -> `modosaic.<modality>.validators`: quality checks and
  cross-modality consistency checks.
- Constraints -> `modosaic.core.validation_constraint`: pass/fail gates over
  validator output.
- Postprocessors -> `modosaic.<modality>.postprocessor`: conversion from model
  output to saved experiment artifacts.
- Services -> logging, seeding, image conversion, artifact persistence,
  boundaries, edges, and tolerances.

Add custom components by subclassing:

```text
DatasetAdapter -> providers.adapters.adapter.DatasetAdapter
ModalityGenerator -> core.modality_generator.ModalityGenerator
ModalityValidator -> core.validator.ModalityValidator
ModalityPostprocessor -> core.postprocessor.ModalityPostprocessor
Modality -> core.modality.Modality
```

or by composing existing pieces with `ConfiguredModality`.

---

## Experiment Outputs

`ExperimentService` writes every accepted artifact beneath the configured run
folder. Typical output includes:

```text
experiments/<run>/
  image/
  text/
  segmentation/
  depth/
  normals/
  validations/
```

Validation files include the validator name, raw value, optional score,
threshold, and pass/fail result.
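
To make the listed fields concrete, the sketch below serializes one validation
record with the standard library. The key names (`validator`, `raw_value`,
`score`, `threshold`, `passed`) are assumptions for illustration; the actual
JSON schema written by `ExperimentService` may differ.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ValidationRecord:
    """Hypothetical shape of one entry in a validation file."""

    validator: str
    raw_value: Any
    score: Optional[float]  # None when the validator has no scalar score
    threshold: Optional[float]  # None when no constraint is configured
    passed: bool


def to_json(record):
    """Serialize a record to stable, human-readable JSON."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)
```

Sorting keys keeps the files diff-friendly across runs.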

---

## Documentation

MkDocs pages live in `docs/`, and API pages are generated from Google-style
Python docstrings through `mkdocstrings`.

Public classes, functions, and methods should carry useful Google-style
docstrings because they form the API reference. Module docstrings are optional
for simple implementation modules; add them when a file exposes important
package-level behavior, re-exports public symbols, or needs context that is not
clear from the documented objects inside it.

```bash
uv run --group docs mkdocs serve
uv run --group docs mkdocs build --strict
```

---

## Examples

See `examples/main.py` for a complete local demo that loads a parquet dataset,
runs the preconfigured modality stack, and prints validation summaries.

---

## License

This project is licensed under the MIT License. See the `LICENSE` file for
details.
