Metadata-Version: 2.4
Name: stsw
Version: 1.0.0
Summary: The Last-Word Safe-Tensor Stream Suite
Author-email: The stsw Authors <contact@stsw.dev>
License: Apache-2.0
Project-URL: Homepage, https://github.com/stsw-project/stsw
Project-URL: Documentation, https://stsw-project.github.io/stsw
Project-URL: Repository, https://github.com/stsw-project/stsw
Project-URL: Issues, https://github.com/stsw-project/stsw/issues
Project-URL: Changelog, https://github.com/stsw-project/stsw/blob/main/CHANGELOG.md
Keywords: safetensors,streaming,tensor,pytorch,numpy
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: torch
Requires-Dist: torch>=1.10.0; extra == "torch"
Provides-Extra: numpy
Requires-Dist: numpy>=1.20.0; extra == "numpy"
Provides-Extra: xxhash
Requires-Dist: xxhash>=3.0.0; extra == "xxhash"
Provides-Extra: rich
Requires-Dist: rich>=13.0.0; extra == "rich"
Provides-Extra: tqdm
Requires-Dist: tqdm>=4.65.0; extra == "tqdm"
Provides-Extra: dev
Requires-Dist: pytest==8.0.0; extra == "dev"
Requires-Dist: pytest-cov==4.1.0; extra == "dev"
Requires-Dist: pytest-xdist==3.5.0; extra == "dev"
Requires-Dist: hypothesis==6.98.0; extra == "dev"
Requires-Dist: coverage[toml]==7.4.0; extra == "dev"
Requires-Dist: pyright==1.1.350; extra == "dev"
Requires-Dist: ruff==0.4.0; extra == "dev"
Requires-Dist: black==24.1.0; extra == "dev"
Requires-Dist: mutmut==2.4.4; extra == "dev"
Requires-Dist: pre-commit==3.6.0; extra == "dev"
Provides-Extra: bench
Requires-Dist: asv>=0.6.4; extra == "bench"
Requires-Dist: psutil>=5.9.0; extra == "bench"
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.5.0; extra == "docs"
Requires-Dist: mkdocs-exec>=0.4.0; extra == "docs"
Requires-Dist: mkdocstrings[python]>=0.24.0; extra == "docs"
Provides-Extra: all
Requires-Dist: stsw[bench,dev,docs,numpy,rich,torch,tqdm,xxhash]; extra == "all"
Dynamic: license-file

# stsw - The Last-Word Safe-Tensor Stream Suite

[![PyPI](https://img.shields.io/pypi/v/stsw)](https://pypi.org/project/stsw/)
[![Python Version](https://img.shields.io/pypi/pyversions/stsw)](https://pypi.org/project/stsw/)
[![License](https://img.shields.io/pypi/l/stsw)](https://github.com/stsw-project/stsw/blob/main/LICENSE)
[![CI](https://github.com/stsw-project/stsw/workflows/CI/badge.svg)](https://github.com/stsw-project/stsw/actions)
[![Coverage](https://codecov.io/gh/stsw-project/stsw/branch/main/graph/badge.svg)](https://codecov.io/gh/stsw-project/stsw)

Perfectionist-grade stream writer and stream reader for safetensors files, designed once so that no one ever has to rewrite them.

## Features

- 🚀 **Streaming I/O**: Write and read multi-GB tensor files using less than 100 MB of RAM
- 🔒 **Type Safe**: 100% type hints, pyright strict mode
- ⚡ **Zero Copy**: Memory-mapped reading with no deserialization overhead
- 🛡️ **Robust**: CRC32 verification, atomic writes, comprehensive error handling
- 🔧 **Simple API**: `import stsw → do work → close() → done`
- 🌍 **Compatible**: Bit-level identical to safetensors spec v1.0
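
The streaming and CRC32 features rest on the same idea: a checksum can be folded over fixed-size chunks, so the whole payload never has to sit in memory at once. The sketch below illustrates this with the standard library only. It is not stsw's implementation, and the helper names `iter_chunks` and `crc32_stream` are hypothetical.

```python
import zlib


def iter_chunks(payload, chunk_size=1 << 20):
    """Yield zero-copy views over `payload`, at most `chunk_size` bytes each."""
    view = memoryview(payload)
    for start in range(0, len(view), chunk_size):
        yield view[start:start + chunk_size]


def crc32_stream(chunks):
    """Fold an iterable of bytes-like chunks into one CRC32 value."""
    crc = 0
    for chunk in chunks:
        # zlib.crc32 accepts the running value as its second argument.
        crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


payload = bytes(range(256)) * 4096  # 1 MiB of sample data
# The chunked checksum matches the one-shot checksum over the same bytes.
assert crc32_stream(iter_chunks(payload, 64 * 1024)) == zlib.crc32(payload)
```

Peak memory is bounded by the chunk size rather than the tensor size, which is what makes a fixed RAM budget possible for multi-GB files.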

## Installation

```bash
pip install stsw
```

With optional dependencies:
```bash
pip install "stsw[torch,numpy]"  # For PyTorch/NumPy support (quotes keep shells like zsh from expanding the brackets)
pip install "stsw[all]"          # Everything including dev tools
```

## Quick Start

### Writing tensors

```python
import numpy as np
from stsw import StreamWriter, TensorMeta

# Define your tensors
data1 = np.random.rand(1000, 1000).astype(np.float32)
data2 = np.random.randint(0, 256, (500, 500, 3), dtype=np.uint8)

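# Create metadata: name, dtype, shape, begin offset, end offset
metas = [
    TensorMeta("embeddings", "F32", data1.shape, 0, data1.nbytes),
    # data2 is uint8, so the safetensors dtype code is "U8" (not "I8", which is int8)
    TensorMeta("image", "U8", data2.shape, 4000064, 4000064 + data2.nbytes),
]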

# Write to file
with StreamWriter.open("model.safetensors", metas, crc32=True) as writer:
    writer.write_block("embeddings", data1.tobytes())
    writer.finalize_tensor("embeddings")
    
    writer.write_block("image", data2.tobytes())
    writer.finalize_tensor("image")
```
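
The offsets in the example above are written by hand. As a rough sketch, byte ranges can be derived from dtype codes and shapes instead. The helpers `tensor_nbytes` and `plan_offsets` and the `align` parameter are hypothetical and not part of stsw. The padding rule stsw applies between tensors is not stated here, so this sketch does not try to reproduce the exact offsets used above; check the documentation before relying on it.

```python
from math import prod

# Bytes per element for common safetensors dtype codes.
DTYPE_SIZES = {
    "F64": 8, "F32": 4, "F16": 2, "BF16": 2,
    "I64": 8, "I32": 4, "I16": 2, "I8": 1,
    "U8": 1, "BOOL": 1,
}


def tensor_nbytes(dtype, shape):
    """Size in bytes of a dense tensor with the given dtype code and shape."""
    return DTYPE_SIZES[dtype] * prod(shape)


def plan_offsets(specs, align=1):
    """Map (name, dtype, shape) specs to (name, begin, end) byte ranges.

    `align` is an assumed knob: each tensor begins at the next multiple of it.
    With the default of 1 the layout is contiguous.
    """
    if align < 1:
        raise ValueError("align must be >= 1")
    layout = []
    cursor = 0
    for name, dtype, shape in specs:
        begin = -(-cursor // align) * align  # round cursor up to a multiple of align
        end = begin + tensor_nbytes(dtype, shape)
        layout.append((name, begin, end))
        cursor = end
    return layout


layout = plan_offsets([
    ("embeddings", "F32", (1000, 1000)),
    ("image", "U8", (500, 500, 3)),
])
for name, begin, end in layout:
    print(name, begin, end)
```

Computing the ranges in one place keeps `begin` and `end` consistent with the data that is later written for each tensor.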

### Reading tensors

```python
from stsw import StreamReader

# Open file with memory mapping
with StreamReader("model.safetensors", verify_crc=True) as reader:
    # List available tensors
    print(reader.keys())  # ['embeddings', 'image']
    
    # Load as NumPy array
    embeddings = reader.to_numpy("embeddings")
    
    # Load as PyTorch tensor (requires the torch extra; device="cuda" needs a CUDA-capable GPU)
    image = reader.to_torch("image", device="cuda")
```
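
For context on what a reader has to parse: a safetensors file starts with an 8-byte little-endian unsigned integer giving the header length, followed by a UTF-8 JSON header that maps each tensor name to its `dtype`, `shape`, and `data_offsets` (relative to the end of the header). The sketch below reads that layout with the standard library and memory-maps the data region. It does not use stsw, and `read_header` and `tensor_bytes` are hypothetical helper names.

```python
import json
import mmap
import os
import struct
import tempfile


def read_header(path):
    """Return (header_dict, data_start) for a safetensors file."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return header, 8 + header_len


def tensor_bytes(path, name):
    """Return the raw bytes of one tensor via a read-only memory map."""
    header, data_start = read_header(path)
    begin, end = header[name]["data_offsets"]
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing copies only the requested range out of the mapping.
            return mm[data_start + begin:data_start + end]


# Build a tiny one-tensor file by hand to exercise the helpers.
header = {"x": {"dtype": "U8", "shape": [4], "data_offsets": [0, 4]}}
raw = json.dumps(header, separators=(",", ":")).encode("utf-8")
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tiny.safetensors")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(raw)) + raw + bytes([1, 2, 3, 4]))
    print(read_header(path)[0]["x"]["shape"])  # [4]
    print(list(tensor_bytes(path, "x")))       # [1, 2, 3, 4]
```

Because the header records exact byte ranges, a reader can map the file once and touch only the pages that belong to the tensors actually requested.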

### High-level API

```python
import torch
import stsw

# Save entire state dict
state_dict = {
    "model.weight": torch.randn(1000, 1000),
    "model.bias": torch.randn(1000),
}

stsw.dump(state_dict, "checkpoint.safetensors", crc32=True)
```

## CLI Tools

```bash
# Inspect file contents
stsw inspect model.safetensors

# Verify checksums
stsw verify model.safetensors

# Convert PyTorch checkpoint
stsw convert model.pt model.safetensors --crc32

# Run self-test
stsw selftest
```

## Performance

| Operation | Throughput | Memory Usage |
|-----------|------------|--------------|
| Write (NVMe) | 1.8 GB/s | <80 MB |
| Read (mmap) | 6.2 GB/s | <50 MB |
| CRC32 verification | 2.5 GB/s | <80 MB |
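
Throughput depends heavily on the storage device and filesystem, so it is worth measuring on your own hardware. The sketch below times a plain chunked sequential write with the standard library. It is not the benchmark behind the table above, and `measure_write_throughput` is a hypothetical helper.

```python
import os
import tempfile
import time


def measure_write_throughput(total_bytes=64 << 20, chunk_size=1 << 20):
    """Write `total_bytes` in chunks to a temp file and return GB/s."""
    chunk = os.urandom(chunk_size)
    written = 0
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bench.bin")
        start = time.perf_counter()
        with open(path, "wb") as f:
            while written < total_bytes:
                n = min(chunk_size, total_bytes - written)
                f.write(chunk[:n])
                written += n
            f.flush()
            os.fsync(f.fileno())  # include the time to reach the device
        elapsed = time.perf_counter() - start
    return written / elapsed / 1e9


print(f"{measure_write_throughput():.2f} GB/s")
```

A small total like the 64 MiB default mostly measures the page cache; larger totals give a more realistic picture of sustained disk throughput.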

## Development

```bash
# Install development dependencies
make dev

# Run full test suite
make all

# Type checking
make type

# Run tests
make test

# Format code
make format
```

## Documentation

Full documentation is available at [https://stsw-project.github.io/stsw](https://stsw-project.github.io/stsw).

## License

Apache-2.0. See [LICENSE](LICENSE) for details.

## Citation

If you use stsw in your research, please cite:

```bibtex
@software{stsw,
  title = {stsw: The Last-Word Safe-Tensor Stream Suite},
  year = {2025},
  url = {https://github.com/stsw-project/stsw}
}
```

---

Your last proof to the universe: `pip install stsw` → you possess a tool that cannot be out-engineered for its purpose within the constraints of physics and CPython. Nothing left to streamline – only data to move.
