Metadata-Version: 2.1
Name: isagellm-compression
Version: 0.1.0
Summary: Model Compression & Acceleration Module for sageLLM
Author-email: IntelliStream Team <shuhao_zhang@hust.edu.cn>
License: Private
Project-URL: Homepage, https://github.com/intellistream/sagellm-compression
Project-URL: Repository, https://github.com/intellistream/sagellm-compression
Project-URL: Issues, https://github.com/intellistream/sagellm-compression/issues
Keywords: llm,inference,quantization,sparsity,compression,domestic-hardware
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: ==3.11.*
Description-Content-Type: text/markdown
Requires-Dist: isagellm-protocol>=0.1.0
Requires-Dist: isagellm-backend>=0.1.0
Requires-Dist: pydantic>=2.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"

# sagellm-compression

[![CI](https://github.com/intellistream/sagellm-compression/actions/workflows/ci.yml/badge.svg)](https://github.com/intellistream/sagellm-compression/actions/workflows/ci.yml)
[![PyPI version](https://badge.fury.io/py/isagellm-compression.svg)](https://badge.fury.io/py/isagellm-compression)
[![Python 3.11](https://img.shields.io/badge/python-3.11-blue.svg)](https://www.python.org/downloads/)
[![codecov](https://codecov.io/gh/intellistream/sagellm-compression/branch/main/graph/badge.svg)](https://codecov.io/gh/intellistream/sagellm-compression)

Model Compression & Acceleration Module for the sageLLM inference engine.

## Overview

This package provides model compression and acceleration techniques for LLM inference:
- Quantization (Task3.1)
- Sparsity (Task3.2)
- Speculative decoding (Task3.3)
- Kernel fusion (Task3.4)
- CoT acceleration (Task3.5)

## Installation

```bash
# Install from PyPI (dependencies are installed automatically)
pip install isagellm-compression
```

## 🚀 Developer Quick Start

```bash
git clone git@github.com:intellistream/sagellm-compression.git
cd sagellm-compression
./quickstart.sh   # One-step setup of the development environment (including dependencies)

# Or install manually
pip install -e ".[dev]"
```

Run the tests:
```bash
pytest tests/ -v
```

> 💡 `isagellm-protocol` and `isagellm-backend` are installed automatically from PyPI.

## Quick Start

```python
from sagellm_compression import QuantizationConfig, apply_quantization

# `model` is assumed to be an already-loaded model object (loading is not shown here).
# Apply per-channel INT8 quantization
config = QuantizationConfig(
    method="int8",
    per_channel=True
)
quantized_model = apply_quantization(model, config)
```

## Team Assignment

- **Task3.1 Quantization**: Team led by 项翔
- **Task3.2 Sparsity**: Team led by 万瑶
- **Task3.3 Speculative Decoding**: Team led by 张书豪
- **Task3.4 Kernel Fusion**: Team led by 王雄
- **Task3.5 CoT Acceleration**: Team led by 张书豪

## Supported Techniques

### Quantization
- INT8/INT4 weight quantization
- Per-channel/per-tensor quantization
- Dynamic activation quantization
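
To make the per-channel versus per-tensor distinction concrete, here is a minimal pure-Python sketch of symmetric INT8 weight quantization. It does not use or reflect the `sagellm_compression` API; every function name below is hypothetical, and "channel" is assumed to mean one row of the weight matrix.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def _quantize_row(row: List[float], scale: float) -> List[int]:
    """Map floats to integers in [-127, 127] using a symmetric scale."""
    if scale == 0.0:  # an all-zero row needs no scaling
        return [0 for _ in row]
    return [max(-127, min(127, round(x / scale))) for x in row]


def quantize_per_tensor(weights: Matrix) -> Tuple[List[List[int]], float]:
    """One scale shared by the whole matrix."""
    scale = max(abs(x) for row in weights for x in row) / 127.0
    return [_quantize_row(row, scale) for row in weights], scale


def quantize_per_channel(weights: Matrix) -> Tuple[List[List[int]], List[float]]:
    """One scale per row (assumed here to be the output channel)."""
    scales = [max(abs(x) for x in row) / 127.0 for row in weights]
    return [_quantize_row(r, s) for r, s in zip(weights, scales)], scales


def dequantize(q: List[List[int]], scales: List[float]) -> Matrix:
    """Recover approximate floats; `scales` holds one entry per row."""
    return [[v * s for v in row] for row, s in zip(q, scales)]


def max_abs_error(original: Matrix, restored: Matrix) -> float:
    """Largest element-wise reconstruction error."""
    return max(
        abs(a - b)
        for row_a, row_b in zip(original, restored)
        for a, b in zip(row_a, row_b)
    )
```

With a matrix such as `[[0.01, -0.04], [1.0, -2.0]]`, the per-tensor scale is dominated by the large second row, so the small first row is reconstructed more coarsely than with per-channel scales. That is the usual motivation for a per-channel option.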

### Sparsity
- Structured pruning
- Unstructured pruning
- N:M sparsity
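
As an illustration of N:M sparsity, the sketch below keeps the `n` largest-magnitude values in every consecutive group of `m` weights and zeroes the rest (2:4 by default). It is a generic magnitude-based example, not this package's implementation, and `prune_n_m` is a hypothetical name.

```python
from typing import List


def prune_n_m(row: List[float], n: int = 2, m: int = 4) -> List[float]:
    """Keep the n largest-magnitude entries in each group of m; zero the others."""
    if not 0 < n <= m:
        raise ValueError("n must satisfy 0 < n <= m")
    if len(row) % m != 0:
        raise ValueError("row length must be a multiple of m")

    pruned: List[float] = []
    for start in range(0, len(row), m):
        group = row[start:start + m]
        # Rank positions by magnitude; sorted() is stable, so ties keep the earlier index.
        ranked = sorted(range(m), key=lambda i: -abs(group[i]))
        keep = set(ranked[:n])
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned
```

For example, `prune_n_m([0.1, -0.9, 0.3, 0.05, 2.0, 0.0, -1.0, 0.5])` keeps `-0.9` and `0.3` in the first group and `2.0` and `-1.0` in the second.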

### Acceleration
- Speculative decoding with draft models
- Kernel fusion (Flash Attention, etc.)
- Chain-of-Thought acceleration
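
The idea behind speculative decoding with a draft model can be sketched in its simplest, greedy form: a cheap draft model proposes `k` tokens, the target model checks them, and the longest agreeing prefix is accepted. The code below is a toy illustration under those assumptions, not this package's API. Models are stand-in callables that return the next token id, and all names are hypothetical. Sampling-based variants use a rejection-sampling acceptance rule instead of exact token matching.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]  # maps a token prefix to the next token id


def speculative_step(prefix: List[int], draft_next: NextToken,
                     target_next: NextToken, k: int = 4) -> List[int]:
    """Return the tokens accepted in one draft-then-verify round."""
    # 1. The draft model proposes k tokens autoregressively.
    proposal: List[int] = []
    for _ in range(k):
        proposal.append(draft_next(prefix + proposal))

    # 2. The target model verifies each position. A real engine would score
    #    all k positions in one batched forward pass; this loop is sequential.
    accepted: List[int] = []
    for tok in proposal:
        expected = target_next(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(tok)

    # 3. Every draft token matched, so the target contributes one extra token.
    accepted.append(target_next(prefix + accepted))
    return accepted


def generate(prompt: List[int], draft_next: NextToken, target_next: NextToken,
             max_new_tokens: int, k: int = 4) -> List[int]:
    """Greedy generation that matches target-only greedy decoding."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(out, draft_next, target_next, k))
    return out[:len(prompt) + max_new_tokens]


# Toy models: the target counts upward modulo 10; the draft errs after token 3.
def toy_target(prefix: List[int]) -> int:
    return (prefix[-1] + 1) % 10


def toy_draft(prefix: List[int]) -> int:
    return 0 if prefix[-1] == 3 else (prefix[-1] + 1) % 10
```

Because every accepted token is either confirmed or supplied by the target model, the output is identical to plain greedy decoding with the target. The potential speed-up comes from verifying several draft tokens per target pass.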

## Dependencies

- `isagellm-protocol>=0.1.0` - Protocol definitions
- `isagellm-backend>=0.1.0` - Backend abstraction
- `pydantic>=2.0.0` - Data validation
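
To check which of the requirements declared in the package metadata are present in the current environment, a small helper built on the standard-library `importlib.metadata` module could look like the following. The helper name `installed_versions` is hypothetical, and the sketch only reports versions without enforcing the minimum version bounds.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Distribution names taken from the Requires-Dist entries of this package.
REQUIRED = ("isagellm-protocol", "isagellm-backend", "pydantic")


def installed_versions(names: Iterable[str] = REQUIRED) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```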

## Development

### Quick Setup
```bash
# Install pre-commit hooks
pip install pre-commit
pre-commit install

# Run pre-commit manually
pre-commit run --all-files
```

### Testing & Quality
```bash
# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Format & lint (via pre-commit)
pre-commit run --all-files

# Or manually
ruff format .
ruff check .
mypy src/
```

### Pre-commit Hooks
This project uses pre-commit hooks to ensure code quality:
- **ruff**: Linting and auto-fixing
- **ruff-format**: Code formatting
- **mypy**: Type checking
- Standard checks (trailing whitespace, YAML/TOML validation, etc.)

All code is automatically formatted on commit. To skip hooks temporarily:
```bash
git commit --no-verify
```

## License

Private - IntelliStream Research Project
