Metadata-Version: 2.4
Name: gwo-benchmark
Version: 0.2.5
Summary: A benchmark for Generalized Windowed Operations in neural networks.
Home-page: https://github.com/Kim-Ai-gpu/gwo-benchmark
Author: Youngseong Kim
Author-email: dafaafafaf33@gmail.com
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch
Requires-Dist: torchvision
Requires-Dist: tqdm
Requires-Dist: pandas
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# GWO Benchmark: The Architect's Arena

[![PyPI version](https://badge.fury.io/py/gwo-benchmark.svg)](https://badge.fury.io/py/gwo-benchmark)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

**Is your neural network 'smart' or just big? This benchmark tells you the difference.**

This Python package provides a framework for benchmarking neural network operations, inspired by the GWO (Generalized Windowed Operation) theory from the paper ["Window is Everything: A Grammar for Neural Operations"](https://zenodo.org/records/17103133).

Instead of just measuring accuracy, this benchmark scores operations on their **architectural efficiency**. It quantifies the relationship between an operation's theoretical **Operational Complexity (`Ω_proxy`)** and its real-world performance, helping you design smarter, more efficient models.

---

## Key Concepts in 1 Minute

The core idea is to break down any neural network operation (like Convolution or Self-Attention) into its fundamental building blocks and score its complexity.

- **GWO (Generalized Windowed Operation):** A "grammar" that describes any operation using three components:
    - **Path (P):** *Where* to look for information (e.g., a local sliding window).
    - **Shape (S):** *What form* of information to look for (e.g., a square patch).
    - **Weight (W):** *What* to value in that information (e.g., a learnable kernel).

- **Operational Complexity (`Ω_proxy`):** The "intelligence score" of your operation. A lower score for the same performance means a more efficient design. It's calculated as:
    `Ω_proxy = C_D (Descriptive Complexity) + α * C_P (Parametric Complexity)`
    Here `α` is a coefficient that weights the parametric term.

    - **`C_D` (Descriptive Complexity):** How many basic "primitives" does it take to describe your operation's structure? (You define this based on our guide).
    - **`C_P` (Parametric Complexity):** How many extra parameters are needed to *generate* the operation's behavior dynamically? (e.g., the offset prediction network in Deformable Convolution). This is calculated automatically.
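
As a rough worked example of the formula above, the sketch below compares a plain 3x3 convolution with a deformable-style convolution that adds an offset predictor. Several things here are assumptions made for illustration and are not taken from the package: `ALPHA = 0.001`, the use of a raw parameter count as `C_P`, the value `C_D = 4` for the dynamic operation, and the helper names `conv2d_param_count` and `omega_proxy`. The package may scale or normalize these quantities differently.

```python
def conv2d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Learnable parameters in a standard (groups=1) 2D convolution."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def omega_proxy(c_d, c_p, alpha):
    """Omega_proxy = C_D + alpha * C_P, following the formula above."""
    return c_d + alpha * c_p


ALPHA = 0.001  # hypothetical weighting, not the package's actual value

# Plain 3x3 convolution: three primitives, no dynamic part, so C_P = 0.
static_score = omega_proxy(c_d=3, c_p=0, alpha=ALPHA)

# Deformable-style convolution: the offsets come from an extra 3x3 conv
# that outputs 2 * 3 * 3 = 18 channels (an x and y offset per kernel tap).
offset_params = conv2d_param_count(in_channels=16, out_channels=18, kernel_size=3)

# C_D = 4 is a placeholder; the real value depends on the primitive guide.
dynamic_score = omega_proxy(c_d=4, c_p=offset_params, alpha=ALPHA)

print(f"static:  {static_score}")
print(f"dynamic: {dynamic_score:.2f} (C_P = {offset_params})")
```

Under these assumptions the dynamic operation scores higher, so it would need to deliver better performance to count as the more efficient design.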

## Installation

```bash
pip install gwo-benchmark
```
Or, for development, install from this repository:
```bash
git clone https://github.com/Kim-Ai-gpu/gwo-benchmark.git
cd gwo-benchmark
pip install -e .
```

## Quick Start in 3 Steps

Let's benchmark a simple custom CNN on CIFAR-10.

**Step 1: Define your model inheriting from `GWOModule`**

Create your model file `my_models.py`:
```python
# my_models.py
import torch.nn as nn
from gwo_benchmark import GWOModule

class MySimpleConv(GWOModule):
    # PRIMITIVES: STATIC_SLIDING(1) + DENSE_SQUARE(1) + SHARED_KERNEL(1)
    # Based on the official primitive guide, the complexity is 3.
    C_D = 3

    def __init__(self, in_channels=3, out_channels=16):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))

    # This model has no dynamic components, so C_P is zero.
    # We can omit get_parametric_complexity_modules() as it defaults to [].
```

**Step 2: Create your benchmark script**

Create your main script `run_benchmark.py`:
```python
# run_benchmark.py
from gwo_benchmark import run, Evaluator
from my_models import MySimpleConv

# 1. Instantiate your model
model = MySimpleConv()

# 2. Configure the evaluation environment
#    The standard Evaluator handles training and testing for you.
evaluator = Evaluator(
    dataset_name="cifar10",
    train_config={ "epochs": 2, "batch_size": 64 }
)

# 3. Run the benchmark!
if __name__ == "__main__":
    result = run(model, evaluator, result_dir="benchmark_results")
    print(result)
```

**Step 3: Run from your terminal**

```bash
python run_benchmark.py
```

You'll see a detailed analysis of your model's complexity and performance, saved in the `benchmark_results` directory.

## How It Works

The framework is designed for flexibility and extension.

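1.  **`GWOModule` (`gwo_benchmark.base.GWOModule`):** The heart of your submission. Inherit from this abstract class and provide:
    - `C_D` (property): The Descriptive Complexity you calculated for the operation. A plain class attribute, as in the Quick Start, is enough.
    - `get_parametric_complexity_modules()` (method): Returns the list of `nn.Module`s that contribute to `C_P`. It defaults to `[]`, so override it only when the operation has dynamic components.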

2.  **`Evaluator` (`gwo_benchmark.evaluator.BaseEvaluator`):** This class encapsulates all evaluation logic (training, testing, performance measurement).
    - Use the built-in `Evaluator` for standard datasets like CIFAR-10.
    - Create your own custom evaluation loop by inheriting from `BaseEvaluator` for specialized tasks.

3.  **Datasets (`gwo_benchmark.datasets`):** Easily add support for new datasets by inheriting from `BaseDataset` and registering your class. See the `datasets` directory for examples.

## Contributing

We welcome contributions! This project is in its early stages, and we believe it can grow into a standard tool for the deep learning community.

-   **Add New GWO Models:** Implement novel or existing operations (like Transformers, Attention variants, MLPs) as `GWOModule`s in the `examples` directory.
-   **Support More Datasets:** Help us expand the benchmark to new domains like NLP, Graphs, etc.
-   **Improve the Core Engine:** Enhance the `Evaluator`, `ComplexityCalculator`, or add new analysis tools.

Please see our `CONTRIBUTING.md` for more details.

## Running Tests

To ensure the integrity of the framework, please run tests before submitting a pull request.

```bash
python -m unittest discover tests
```

## Citation

If you use this framework in your research, please consider citing the original paper:

    @article{https://doi.org/10.5281/zenodo.17103133,
      doi = {10.5281/ZENODO.17103133},
      url = {https://zenodo.org/doi/10.5281/zenodo.17103133},
      author = {Kim, Youngseong},
      keywords = {Machine learning, Machine Learning, Supervised Machine Learning, Machine Learning/classification, Machine Learning/ethics, Machine Learning/standards, Unsupervised Machine Learning, Machine Learning/history, Machine Learning/trends, Machine Learning/economics, Supervised Machine Learning/standards, Unsupervised Machine Learning/classification},
      language = {en},
      title = {Window is Everything: A Grammar for Neural Operations},
      publisher = {Zenodo},
      year = {2025},
      copyright = {Creative Commons Attribution 4.0 International}
    }
