Metadata-Version: 2.4
Name: tilus
Version: 0.2.0
Summary: Tilus
Author: Tilus Team
License-Expression: Apache-2.0
Project-URL: Homepage, https://github.com/NVIDIA/tilus
Project-URL: Documentation, https://nvidia.github.io/tilus
Keywords: GPU,Compiler,CUDA,hidet,tensor,torch
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: apache-tvm-ffi
Requires-Dist: filelock
Requires-Dist: numpy
Requires-Dist: torch
Requires-Dist: tabulate
Requires-Dist: tqdm
Provides-Extra: dev
Requires-Dist: ruff==0.11.0; extra == "dev"
Requires-Dist: mypy==1.15.0; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pre-commit; extra == "dev"
Requires-Dist: types-tabulate; extra == "dev"
Requires-Dist: types-tqdm; extra == "dev"
Requires-Dist: pandas; extra == "dev"
Requires-Dist: nvidia-ml-py; extra == "dev"
Requires-Dist: einops; extra == "dev"
Requires-Dist: jinja2; extra == "dev"
Requires-Dist: sphinx==8.1.3; extra == "dev"
Requires-Dist: sphinx-autobuild==2024.10.3; extra == "dev"
Requires-Dist: sphinx-copybutton==0.5.2; extra == "dev"
Requires-Dist: autodocsumm==0.2.14; extra == "dev"
Requires-Dist: sphinx-book-theme==1.1.4; extra == "dev"
Requires-Dist: matplotlib; extra == "dev"
Dynamic: license-file

# Tilus: A Tile-Level GPU Kernel Programming Language
[Documentation](https://nvidia.github.io/tilus/index.html) | [Paper](https://arxiv.org/abs/2504.12984)

**Tilus** is a research domain-specific language (DSL) for GPU programming that offers:

* **Thread-block-level granularity** with **tensors** as the primary data type.
* **Explicit control** over shared memory and register tensors (unlike Triton).
* **Low-precision types** with arbitrary bit-widths (1 to 8 bits).

It also includes automatic tuning, caching, and a Pythonic interface for ease of use.

Tilus is pronounced "tie-lus" (/ˈtaɪləs/).

## Status

Tilus currently supports the Ampere architecture, and we are actively working on support for Hopper and Blackwell GPUs (see the [roadmap](https://github.com/NVIDIA/tilus/issues/49)).

## Getting Started

### Installation
Install Tilus using `pip`:
```
pip install tilus
```

> [!NOTE]
> Tilus depends on `cuda-python`. If your GPU driver is older than **580.65.06**, install an older version of `cuda-python` to ensure compatibility:
> ```
> pip install tilus "cuda-python<13"
> ```
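
To decide which of the two install commands applies, you can compare your driver version against the threshold in the note. The following is a minimal sketch, not part of Tilus: the helper names (`parse_driver_version`, `needs_older_cuda_python`, `detect_driver_version`) are hypothetical, and it assumes `nvidia-smi` is on your `PATH` and reports the driver version as dot-separated integers.

```python
import shutil
import subprocess

# Threshold taken from the note above (580.65.06).
MIN_DRIVER_FOR_LATEST = (580, 65, 6)


def parse_driver_version(text):
    """Turn a string such as '580.65.06' into a tuple of ints: (580, 65, 6)."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_older_cuda_python(driver_version):
    """Return True when the driver is older than the threshold in the note."""
    return parse_driver_version(driver_version) < MIN_DRIVER_FOR_LATEST


def detect_driver_version():
    """Ask nvidia-smi for the driver version; return None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = result.stdout.strip().splitlines()
    # With several GPUs, nvidia-smi prints one line per device; use the first.
    return lines[0].strip() if result.returncode == 0 and lines else None


if __name__ == "__main__":
    version = detect_driver_version()
    if version is None:
        print("Could not detect an NVIDIA driver via nvidia-smi.")
    elif needs_older_cuda_python(version):
        print(f'Driver {version}: pip install tilus "cuda-python<13"')
    else:
        print(f"Driver {version}: pip install tilus")
```

The comparison uses integer tuples rather than strings so that, for example, `570.124.06` is correctly treated as older than `580.65.06`.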

### Usage

To get started, refer to the [tutorials](https://nvidia.github.io/tilus/getting-started/tutorials/__init__.html) to learn how to program kernels with Tilus.

More [examples](https://github.com/NVIDIA/tilus/tree/main/examples) of Tilus kernels are available in the repository.

The [programming guide](https://nvidia.github.io/tilus/programming-guides/overview.html) covers individual topics in more depth.

## Research
This project is based on the following research paper:

```bibtex
@inproceedings{ding2025tilus,
 author = {Ding, Yaoyao and Hou, Bohan and Zhang, Xiao and Lin, Allan and Chen, Tianqi and Yu, Cody Hao and Wang, Yida and Pekhimenko, Gennady},
 title = {Tilus: A Tile-Level GPGPU Programming Language for Low-Precision Computation},
 url = {https://doi.org/10.1145/3760250.3762219},
 booktitle = {Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1},
 series = {ASPLOS '26}
}
```

## Acknowledgement
We would like to acknowledge the following projects for their influence on Tilus's design and development:
- **Hidet**: We take Hidet IR as our low-level target and reuse its runtime system.
- **TVM**: Hidet's initial IR was adopted from TVM, and we also learned a lot from TVM about how to build a compiler.
- **Triton**: The core idea of defining kernels at a thread-block level and working with tiles was inspired by Triton.
- **Hexcute**: We adopted from Hexcute the idea of using automatic layout inference to simplify programming.
