Metadata-Version: 2.4
Name: tileflow
Version: 0.5.0
Summary: Fast, memory-efficient image tiling and reconstruction for deep learning and scientific computing
Project-URL: Homepage, https://github.com/mantalys/tileflow
Project-URL: Repository, https://github.com/mantalys/tileflow.git
Author: TileFlow Contributors
Maintainer: TileFlow Contributors
License: MIT License
        
        Copyright (c) 2025 Mantalys France, SAS
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: computer-vision,deep-learning,image-processing,memory-efficient,scientific-computing,tiling
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Processing
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.13
Requires-Dist: numpy>=2.2.0
Requires-Dist: zarr>=3.1.1
Provides-Extra: all
Requires-Dist: codecarbon>=2.4.0; extra == 'all'
Requires-Dist: matplotlib>=3.8.0; extra == 'all'
Requires-Dist: pandas>=2.2.0; extra == 'all'
Requires-Dist: psutil>=5.9.0; extra == 'all'
Requires-Dist: scikit-image>=0.22.0; extra == 'all'
Requires-Dist: seaborn>=0.13.0; extra == 'all'
Provides-Extra: benchmarks
Requires-Dist: matplotlib>=3.8.0; extra == 'benchmarks'
Requires-Dist: pandas>=2.2.0; extra == 'benchmarks'
Requires-Dist: scikit-image>=0.22.0; extra == 'benchmarks'
Requires-Dist: seaborn>=0.13.0; extra == 'benchmarks'
Provides-Extra: monitoring
Requires-Dist: codecarbon>=2.4.0; extra == 'monitoring'
Requires-Dist: psutil>=5.9.0; extra == 'monitoring'
Description-Content-Type: text/markdown

# TileFlow

> **High-performance tile-based image processing for scientific computing**

Process gigapixel images with minimal memory footprint through intelligent tiling and reconstruction. Designed for microscopy, whole-slide imaging, and large-scale computer vision workflows.

[![Python 3.13+](https://img.shields.io/badge/python-3.13+-blue.svg)](https://www.python.org/downloads/)
[![NumPy](https://img.shields.io/badge/numpy-2.2.0+-orange.svg)](https://numpy.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

## 🚀 Key Features

- **🧠 Memory Efficient**: Process images larger than RAM using intelligent tiling
- **🔬 Multi-Channel Ready**: Native CHW format support for microscopy workflows  
- **⚡ Zero-Copy Views**: Leverages numpy slicing for maximum performance
- **🔧 Seamless Reconstruction**: Intelligent overlap handling eliminates artifacts
- **☁️ Cloud-Scale**: Built-in zarr integration for massive datasets
- **🎯 Pluggable Pipeline**: Custom processing functions integrate seamlessly

## 📦 Installation

```bash
pip install tileflow
```

## 🔥 Quick Start

### Basic Image Processing

```python
from tileflow import TileFlow
import numpy as np

# Define your processing function
def enhance_contrast(tile):
    """Enhance contrast using histogram stretching."""
    p2, p98 = np.percentile(tile, (2, 98))
    return np.clip((tile - p2) / (p98 - p2 + 1e-8), 0, 1)

# Configure and run processor
processor = TileFlow(tile_size=(256, 256), overlap=(16, 16))
processor.configure(function=enhance_contrast)
result = processor.run(large_image)
```

### Multi-Channel Microscopy

```python
from tileflow import generate_multichannel_image, SobelEdgeDetector

# Generate realistic 8-channel microscopy data [C, H, W]
image_chw = generate_multichannel_image(shape=(8, 2048, 2048))

# Process DAPI channel for nuclei detection
dapi_channel = image_chw[0]  # Extract nuclei channel
sobel = SobelEdgeDetector(tile_size=(256, 256), overlap=(16, 16))
nuclei_edges = sobel.process(dapi_channel)

# Apply different processing to each channel
for i, channel in enumerate(image_chw):
    if i == 0:  # DAPI - nuclei segmentation
        processed = sobel.process(channel)
        nuclei_mask = processed > np.percentile(processed, 95)
    else:  # Other channels - generic enhancement
        processed = sobel.process(channel)
    
    print(f"Channel {i}: {processed.max():.3f} max intensity")
```

### Zarr for Massive Datasets

```python
import zarr
import numpy as np
from tileflow import TileFlow

# Create zarr dataset for efficient storage
dataset = zarr.open('microscopy.zarr', mode='w', 
                   shape=(16, 8192, 8192), chunks=(1, 1024, 1024))

# Process channels individually to manage memory
processor = TileFlow(tile_size=(512, 512), overlap=(32, 32))
processor.configure(function=your_analysis_function)

for channel_idx in range(16):
    # Load single channel from zarr (memory efficient)
    channel_data = np.array(dataset[channel_idx])
    
    # Process with TileFlow
    result = processor.run(channel_data)
    
    # Save or analyze result
    print(f"Channel {channel_idx} processed: {result.shape}")
```

## 🧪 Advanced Examples

### Custom Multi-Channel Pipeline

```python
class NucleiSegmentationPipeline:
    """Specialized pipeline for DAPI nuclei segmentation."""
    
    def __init__(self, sensitivity=0.95):
        self.sensitivity = sensitivity
        self.processor = TileFlow(tile_size=(256, 256), overlap=(16, 16))
    
    def segment_nuclei(self, tile):
        """Apply Sobel + thresholding for nuclei detection."""
        # Sobel edge detection
        gx = np.gradient(tile, axis=1)
        gy = np.gradient(tile, axis=0) 
        edges = np.sqrt(gx*gx + gy*gy)
        
        # Adaptive thresholding
        threshold = np.percentile(edges, self.sensitivity * 100)
        return (edges > threshold).astype(np.uint8)
    
    def process(self, dapi_channel):
        """Process DAPI channel for nuclei segmentation."""
        self.processor.configure(function=self.segment_nuclei)
        return self.processor.run(dapi_channel)

# Use the pipeline
pipeline = NucleiSegmentationPipeline(sensitivity=0.95)
nuclei_mask = pipeline.process(image_chw[0])
print(f"Detected {nuclei_mask.sum()} nuclei pixels")
```

### Concurrent Multi-Channel Processing

```python
from concurrent.futures import ThreadPoolExecutor
from tileflow import SobelEdgeDetector

def process_channel_pair(args):
    """Process a single channel with appropriate algorithm."""
    channel_idx, channel_data = args
    
    if channel_idx == 0:  # DAPI
        processor = NucleiSegmentationPipeline()
        return channel_idx, processor.process(channel_data)
    else:  # Other channels
        sobel = SobelEdgeDetector()
        return channel_idx, sobel.process(channel_data)

# Process multiple channels concurrently
with ThreadPoolExecutor(max_workers=4) as executor:
    tasks = [(i, image_chw[i]) for i in range(8)]
    futures = [executor.submit(process_channel_pair, task) for task in tasks]
    
    results = {}
    for future in futures:
        channel_idx, result = future.result()
        results[channel_idx] = result
        print(f"✓ Channel {channel_idx} complete")
```

## 🎯 Use Cases

| Domain | Application | Image Size | Channels |
|--------|-------------|------------|----------|
| 🔬 **Microscopy** | Fluorescence imaging, pathology | 2K-16K | 4-32 |
| 🧠 **Deep Learning** | Model inference, preprocessing | 1K-8K | 1-3 |
| 🛰️ **Remote Sensing** | Satellite analysis, multispectral | 4K-32K | 8-256 |
| 📱 **Computer Vision** | Panoramic stitching, high-res analysis | 2K-16K | 1-4 |

## 📊 Performance Benchmarks

| Dataset | Memory Usage | Processing Time | Zarr Compression |
|---------|--------------|-----------------|------------------|
| 2K × 2K × 8ch | 128 MB | 2.1s | 77% reduction |
| 4K × 4K × 16ch | 256 MB | 8.4s | 75% reduction |
| 8K × 8K × 32ch | 512 MB | 33.2s | 76% reduction |
| 16K × 16K × 8ch | 1.2 GB | 45.6s | 78% reduction |

*Consumer hardware (16GB RAM, 8-core CPU) with Sobel edge detection*

## 🔍 Enhanced Monitoring & Callbacks

TileFlow provides a comprehensive callback system for monitoring processing performance, memory usage, and energy consumption:

### Progress & Performance Tracking

```python
from tileflow import TileFlow, ProgressCallback, MetricsCallback

# Basic progress tracking
processor = TileFlow(tile_size=(256, 256), overlap=(32, 32))
processor.configure(function=your_function)

progress = ProgressCallback(verbose=True, show_rate=True)
metrics = MetricsCallback(verbose=True)

result = processor.run(image, callbacks=[progress, metrics])
```

### Memory Usage Monitoring

```python
from tileflow import MemoryTracker

# Track memory usage during processing
memory_tracker = MemoryTracker(detailed=True)
result = processor.run(large_image, callbacks=[memory_tracker])

# Get detailed statistics
memory_stats = memory_tracker.get_memory_stats()
print(f"Peak memory: {memory_stats['peak_delta_bytes'] / 1024**2:.1f} MB")
```

### Energy Consumption Tracking

```python
from tileflow import CodeCarbonTracker

# Track CO₂ emissions (requires: pip install codecarbon)
carbon_tracker = CodeCarbonTracker(
    project_name="my-analysis",
    output_dir="./carbon_logs"
)

result = processor.run(image, callbacks=[carbon_tracker])
emissions = carbon_tracker.get_emissions_data()
print(f"CO₂ emissions: {emissions['emissions_kg']:.6f} kg")
```

### Comprehensive Monitoring Suite

```python
from tileflow import CompositeCallback, ProgressCallback, MemoryTracker, CodeCarbonTracker

# Combine multiple monitoring callbacks
monitoring_suite = CompositeCallback([
    ProgressCallback(verbose=True),
    MemoryTracker(detailed=False),
    CodeCarbonTracker(project_name="scientific-analysis")
])

result = processor.run(image, callbacks=[monitoring_suite])
```

### Custom Scientific Callbacks

```python
from tileflow import TileFlowCallback

class ImageQualityCallback(TileFlowCallback):
    """Custom callback for scientific image analysis."""
    
    def on_tile_end(self, tile, tile_index, total_tiles):
        # Analyze each processed tile
        data = tile.image_data[0]
        snr = np.mean(data) / np.std(data)
        self.quality_metrics.append(snr)
    
    def on_processing_end(self, stats):
        print(f"Average SNR: {np.mean(self.quality_metrics):.2f}")

processor.run(image, callbacks=[ImageQualityCallback()])
```

## 📚 Complete Examples

Run comprehensive examples from the `scripts/` directory:

```bash
# Basic CHW format processing
uv run python scripts/basic_usage.py

# Advanced multi-channel workflows
uv run python scripts/multichannel_processing.py

# Zarr integration for large datasets  
uv run python scripts/zarr_integration.py
```

**Example Scripts:**
- **[`basic_usage.py`](scripts/basic_usage.py)** - Interface validation with CHW arrays
- **[`multichannel_processing.py`](scripts/multichannel_processing.py)** - Specialized channel processors
- **[`zarr_integration.py`](scripts/zarr_integration.py)** - Cloud-scale dataset handling

## 🏗️ Architecture

TileFlow processes images hierarchically for optimal memory usage:

```
Input Image → Grid Generation → Tile Processing → Reconstruction → Output
     ↓              ↓                ↓               ↓            ↓
   16GB           Lazy            64MB          Overlap       16GB
                Iterator                       Handling
```

**Processing Modes:**
- **Direct Tiling**: Image → Tiles → Process → Reconstruct
- **Hierarchical**: Image → Chunks → Tiles → Process → Reconstruct

**Core Components:**
- **GridSpec**: Defines tiling strategy with intelligent overlap
- **TileFlow**: Main processor with configure/run interface  
- **Reconstruction**: Seamless merge with artifact elimination
- **Backends**: Support for numpy, zarr, and custom data sources

## 🛠️ Development

```bash
# Clone and setup
git clone <repository>
cd TileFlow
uv sync

# Run tests
uv run pytest

# Code quality
uv run ruff check
uv run ruff format
uv run mypy src/tileflow

# Build package
uv build
```

## 🧬 Scientific Applications

**Fluorescence Microscopy:**
```python
# Multi-fluorophore analysis
channels = ["DAPI", "FITC", "TRITC", "Cy5"]
for i, name in enumerate(channels):
    channel_data = image_chw[i]
    processed = specialized_processor[name].process(channel_data)
    analyze_fluorophore_distribution(processed, name)
```

**Whole-Slide Pathology:**
```python
# Process gigapixel pathology slides
wsi_processor = TileFlow(
    tile_size=(512, 512), 
    overlap=(64, 64),
    chunk_size=(2048, 2048)  # Hierarchical processing
)
wsi_processor.configure(function=tissue_classifier)
classification_map = wsi_processor.run(whole_slide_image)
```

**Satellite Imagery:**
```python
# Multispectral satellite analysis
spectral_bands = ["red", "green", "blue", "nir", "swir1", "swir2"]
for band_idx, band_name in enumerate(spectral_bands):
    band_data = satellite_image[band_idx]
    vegetation_index = calculate_ndvi(band_data)
```

## 🤝 Contributing

TileFlow is built for the scientific computing community. We welcome contributions for:

- **New Backends**: TIFF, HDF5, cloud storage adapters
- **Processing Algorithms**: Segmentation, enhancement, feature extraction
- **Performance**: GPU acceleration, distributed processing
- **Documentation**: Tutorials, use case examples

## 📄 License

MIT License - see [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

**Valentin Poque** - Core development and architecture (July-September 2025)

---

<div align="center">

**Process any image, any size, any channel count.**  
*TileFlow scales with your data.*

[📖 Documentation](docs/) • [🐛 Issues](issues/) • [💬 Discussions](discussions/)

</div>