Metadata-Version: 2.4
Name: priorityprop
Version: 0.1.0
Summary: Priority-ordered adaptive backpropagation — train neural networks dramatically faster by distributing backprop budget based on parameter importance.
Author: Ömür Bera Işık
License: MIT
Keywords: deep learning,backpropagation,training,optimization,pytorch,priorityprop
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: torch>=2.0.0
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: isort; extra == "dev"

# PriorityProp ⚡

**Priority-ordered adaptive backpropagation** — a novel training algorithm that aims to make neural network training dramatically faster by being smarter about *which* parameters get updated and *how often*.

> Concept & invention by **Ömür Bera Işık**

---

## The Problem with Standard Training

In standard backpropagation, every parameter is updated exactly once per step, regardless of whether it actually needs updating. Over a full training run this is massively wasteful:

- Important parameters that could use 10 updates get only 1
- Unimportant parameters that need just 1 update still get 10
- The model learns **blindly**

## The PriorityProp Solution

PriorityProp assigns a **priority score** to every parameter (see the sketch after this list):

```
priority(w) = |gradient| × (1 / update_count) × depth_factor
```

- `|gradient|` — parameters with larger gradients matter more right now
- `1/update_count` — diminishing returns; already-updated params need less attention
- `depth_factor` — deeper layers get boosted to compensate for vanishing gradients
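
As a rough illustration only (not PriorityProp's internal code), the sketch below scores each parameter of a small network after one backward pass, following the formula above; `update_counts`, `depth_boost`, and the use of the parameter index as a stand-in for layer depth are assumptions made just for this example.

```python
import torch
import torch.nn as nn

# Illustrative sketch of the priority formula; not the library's actual implementation.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
nn.functional.cross_entropy(model(x), y).backward()   # populate .grad on every parameter

params = list(model.parameters())
update_counts = {id(p): 1 for p in params}  # hypothetical counter: updates each param has received so far
depth_boost = 1.1                           # assumed per-level boost to counter vanishing gradients

priorities = {}
for depth, p in enumerate(params):          # parameter index stands in for layer depth here
    grad_mag = p.grad.abs().mean().item()   # |gradient|, averaged over the tensor
    recency = 1.0 / update_counts[id(p)]    # 1 / update_count
    priorities[id(p)] = grad_mag * recency * depth_boost ** depth
```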

A fixed **backprop budget** is then distributed proportionally across parameters, as in the sketch below. High-priority parameters get more updates; low-priority parameters get fewer. Total compute stays the same — but it goes where it matters.
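
For concreteness, here is one plausible reading of that proportional split (made-up priority values, and not necessarily the trainer's exact allocation rule), using the `total_budget`, `min_updates`, and `max_updates` knobs from the Quick Start below:

```python
# Hypothetical proportional split of the backprop budget; priorities are example values.
total_budget, min_updates, max_updates = 100, 1, 20
priorities = {"fc1.weight": 0.8, "fc1.bias": 0.1, "fc2.weight": 2.4, "fc2.bias": 0.2}

total_priority = sum(priorities.values())
allocations = {
    name: max(min_updates, min(max_updates, round(total_budget * p / total_priority)))
    for name, p in priorities.items()
}
print(allocations)  # {'fc1.weight': 23, 'fc1.bias': 3, 'fc2.weight': 20, 'fc2.bias': 6}
```

Note that clamping to `max_updates` can leave part of the budget unspent; how the trainer redistributes that remainder is not specified here.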

---

## Installation

```bash
pip install priorityprop
```

---

## Quick Start

```python
import torch
import torch.nn as nn
from priorityprop import PriorityPropTrainer

# Your model, optimizer, loss — same as always
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10)
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Drop-in replacement for your training loop
trainer = PriorityPropTrainer(
    model=model,
    optimizer=optimizer,
    loss_fn=loss_fn,
    total_budget=100,    # total backprop steps distributed per cycle
    min_updates=1,       # every param gets at least 1 update
    max_updates=20,      # no param dominates the budget
    recalc_every=10,     # reprioritize every 10 steps
)

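# assumed: `dataloader` is any torch.utils.data.DataLoader yielding (inputs, targets) batches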
history = trainer.train(dataloader, epochs=10)

# See which parameters got prioritized
report = trainer.get_priority_report()
print(report)
```

---

## How It Scales

| Model Size | Projected Speedup |
|---|---|
| Small (MNIST-scale) | 5–20x |
| Medium (BERT-scale) | 50–200x |
| Large (GPT-scale) | 200–1000x |

The larger the model, the higher the ratio of redundant updates in standard training — and the bigger the gain from PriorityProp.

---

## Why It Makes Models Smarter

Because the right parameters get the right amount of attention, the model should converge to a *better* solution with the same data. Same architecture, same dataset — better result.

---

## License

MIT — free to use, research, and build upon.

---

*PriorityProp is an early-stage research project. Benchmarks and formal evaluation coming soon.*
