Metadata-Version: 2.4
Name: sparsekit
Version: 0.1.4
Summary: Sparsity Kit for Structured Sparsity Specification
License: CC-BY-NC-4.0
License-File: LICENSE
Keywords: sparsity,pruning,neural networks,structured sparsity,triton,pytorch
Author: Ayoub Ghriss
Author-email: research@ayghri.me
Requires-Python: >=3.11,<3.14
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: Other/Proprietary License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Provides-Extra: scripts
Provides-Extra: viz
Requires-Dist: accelerate (>=1.13.0,<2.0.0) ; extra == "scripts"
Requires-Dist: datasets (>=4.8.4,<5.0.0) ; extra == "scripts"
Requires-Dist: lm-eval (>=0.4.11,<0.5.0) ; extra == "scripts"
Requires-Dist: matplotlib ; extra == "viz"
Requires-Dist: numpy (>=2.4.0)
Requires-Dist: pandas (>=3.0.2,<4.0.0) ; extra == "scripts"
Requires-Dist: torch (>=2.8.0)
Requires-Dist: tqdm (>=4.67.3,<5.0.0) ; extra == "scripts"
Requires-Dist: transformers (>=5.5.3,<6.0.0) ; extra == "scripts"
Requires-Dist: triton (>=3.4.0)
Project-URL: Homepage, https://github.com/ayghri/sparsekit
Project-URL: Repository, https://github.com/ayghri/sparsekit
Description-Content-Type: text/markdown

# SparseKit

**SparseKit** is the reference implementation of **S3 (Structured Sparsity Specification)**,
a unified framework for expressing and pruning structured sparse neural networks.

## Library Overview

```
sparsekit/
├── view.py          # View         — zero-copy strided parameter wrapper (torch.as_strided)
├── block.py         # BlockSpec / BlockCoupling   — atomic pruning unit (block)
├── scope.py         # ScopeSpec / ScopeCoupling   — decision scope
├── builder.py       # SparsityBuilder fluent API
├── linalg.py        # Utility solvers (proximal, thresholds)
├── tensor_ops.py    # kth_largest, layout helpers
├── kernels.py       # Triton kernels (auto-dispatched for large K/k)
├── viz.py           # draw_layout() — visualize sparsity patterns
├── pruners/
│   ├── obs.py       # StructuredOBS — S-OBS with per-row Schur updates
│   ├── sparsegpt.py # SparseGPT column-sequential pruning
│   └── obd.py       # OBD and magnitude pruning
└── training/
    ├── data.py      # Calibration data loaders (C4)
    └── hooks.py     # ModuleInputCatcher, transfer_to_device
```

**Terminology:**
- **Block** — atomic pruning unit: the smallest set of weights pruned or kept together.
- **Scope** — decision scope: a set of blocks that compete; the pruning budget is enforced per scope.
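
To make the two terms concrete, here is a plain-PyTorch sketch of the 2:4 case (this is *not* the sparsekit API, just the semantics): each block is a single weight, shape `(1, 1)`, and every scope groups 4 blocks that compete for a budget of 2 survivors.

```python
import torch

torch.manual_seed(0)

# Group columns into scopes of 4 scalar blocks each.
W = torch.randn(2, 8)
scopes = W.reshape(2, -1, 4)                  # (rows, n_scopes, 4 blocks)

# Score each block by magnitude and keep the top 2 per scope —
# the pruning budget is enforced per scope, never globally.
kept = scopes.abs().argsort(dim=-1, descending=True)[..., :2]
mask = torch.zeros_like(scopes, dtype=torch.bool).scatter_(-1, kept, True)

W_pruned = (scopes * mask).reshape(2, 8)      # exactly 2 of every 4 survive
```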

## Quick Example

```python
import torch
from torch.nn import Parameter
from sparsekit import BlockSpec, ScopeSpec, StructuredOBS

M, K = 2560, 9728
W = Parameter(torch.randn(M, K, device="cuda"))
X = torch.randn(1024, K, device="cuda")          # calibration inputs

# Express 2:4 sparsity: scalar blocks, scopes of 4
block = BlockSpec(W, shape=(1, 1))
scope = ScopeSpec(block, shape=(1, 4))

# Prune with Structured OBS
hessian = (X.T @ X) / X.shape[0]
obs = StructuredOBS(scope, hessian)
obs.prune_true_obs(nnz=2)                     # keep 2 of 4, in-place
```

Each of the four experimental patterns below is expressed by changing only the two
`BlockSpec`/`ScopeSpec` lines above; the `StructuredOBS` call stays the same.
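
Assuming `prune_true_obs` zeroes pruned entries in place (as the comment in the example suggests), the resulting pattern can be sanity-checked with plain PyTorch. `check_nm_columns` is a hypothetical helper for illustration, not part of sparsekit:

```python
import torch

def check_nm_columns(W: torch.Tensor, n: int, m: int) -> bool:
    """True iff every contiguous group of m columns holds at most n nonzeros."""
    rows, cols = W.shape
    assert cols % m == 0, "column count must be divisible by the scope width"
    nnz = (W != 0).reshape(rows, -1, m).sum(dim=-1)
    return bool(nnz.le(n).all())
```

After the quick example above, `check_nm_columns(W.detach(), 2, 4)` should return `True`.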

## Sparsity Patterns

| Pattern | Block shape | Scope shape (in blocks) | Description |
|---|---|---|---|
| 2:4 | `(1, 1)` | `(1, 4)` | Keep 2 of 4 contiguous columns |
| 4:8 | `(1, 2)` | `(1, 4)` | Keep 2 of 4 column-pairs (4 of every 8 columns) |
| Coupled 2:4 | `(1, 1, 1, 2)` | `(1, 1, 4, 1)` | Pair columns 8 apart via `View` |
| 16-col block | `(1, 1, 16)` | `(1, 2, 1)` | 16-col blocks, 8-row coupling |
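
The 4:8 row differs from 2:4 in that a whole column-pair is pruned or kept as a unit, so the block score aggregates over both columns. A plain-PyTorch sketch of these semantics (again, not the sparsekit API):

```python
import torch

torch.manual_seed(0)

# Blocks are column pairs, shape (1, 2); a scope of 4 blocks spans 8
# columns, and a budget of 2 blocks keeps 4 of every 8 columns.
W = torch.randn(3, 16)
pairs = W.reshape(3, -1, 4, 2)                # (rows, scopes, 4 pairs, 2)

# One score per block: a pair is pruned or kept together.
scores = pairs.pow(2).sum(dim=-1)
kept = scores.argsort(dim=-1, descending=True)[..., :2]
mask = torch.zeros_like(scores, dtype=torch.bool).scatter_(-1, kept, True)

W_pruned = (pairs * mask[..., None]).reshape(3, 16)
```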

## Reproducing Paper Results

**Table 1** (single-layer, 4 patterns):
```bash
python scripts/structured_obs.py --pattern 24         --ng 64   # 2:4
python scripts/structured_obs.py --pattern 48         --ng 64   # 4:8
python scripts/structured_obs.py --pattern coupled24  --ng 64   # Coupled 2:4
python scripts/structured_obs.py --pattern block16    --ng 64   # 16-col block, 8-row coupled
```

**Table 2 + Figures** (end-to-end LLM pruning):
```bash
# SparseGPT baseline
python scripts/prune_gpt.py --method sparsegpt_24 --model Qwen/Qwen3-1.7B

# S-OBS (True OBS)
python scripts/prune_gpt.py --method true_obs_24 --model Qwen/Qwen3-1.7B --ng 64
```

**Plots** (from saved CSVs):
```bash
python scripts/plot_results.py experiments/results --model Qwen3-1.7B
```

## Requirements

- Python >= 3.11, < 3.14
- PyTorch >= 2.8
- Triton >= 3.4
- CUDA GPU

Additional dependencies for the LLM experiments (`prune_gpt.py`), installable via the `scripts` extra:
- `transformers`, `datasets`, `lm-eval`, `pandas`

For plotting (`plot_results.py`): `matplotlib`, installable via the `viz` extra.

