Metadata-Version: 2.4
Name: pulseopt
Version: 0.1.0
Summary: PulseOpt: episodic adaptive control for optimizer dynamics (LR multiplier and gradient noise)
Author-email: David Kneringer Foss <david.k.foss@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/davidkfoss/adaptive-scheduler
Project-URL: Repository, https://github.com/davidkfoss/adaptive-scheduler
Project-URL: Issues, https://github.com/davidkfoss/adaptive-scheduler/issues
Keywords: deep learning,optimizer,learning rate,bandit,adaptive,scheduler
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.0
Provides-Extra: experiments
Requires-Dist: torchvision>=0.22; extra == "experiments"
Requires-Dist: transformers>=4.52; extra == "experiments"
Requires-Dist: datasets>=3.6; extra == "experiments"
Requires-Dist: matplotlib>=3.10.9; extra == "experiments"
Provides-Extra: dev
Requires-Dist: pytest<9,>=8; extra == "dev"
Dynamic: license-file

# PulseOpt

**PulseOpt: episodic adaptive control for optimizer dynamics.**

`pulseopt` wraps any PyTorch optimizer with an episode-level bandit that adapts a learning-rate multiplier and a gradient-noise level online. Instead of committing to one static schedule, it evaluates short training episodes ("pulses"), scores them with a shaped log-loss-improvement reward, and picks the next configuration with a discounted-UCB controller. The underlying method is **Adaptive Episodic Exploration Scheduling (AEES)**, exposed as the `AEES` class.

The library is small, has a single runtime dependency (`torch>=2.0`), and drops into an existing training loop with two extra calls per step (`step_start` and `step_end`).

## Install

```bash
pip install pulseopt
```

## Quick start

```python
import torch
from torch import nn
from pulseopt import AEES

model = nn.Linear(8, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000)

aees = AEES(
    optimizer,
    lr_candidates=[0.5, 1.0, 2.0],   # tried as multipliers on the optimizer's base LR
    noise_candidates=[0.0, 0.005],   # tried as gradient-noise std
    episode_length=50,
    lr_scheduler=scheduler,           # optional — AEES calls .step() for you
    seed=0,
)

for step in range(1000):
    aees.step_start(step)            # selects the candidate for this step
    optimizer.zero_grad()
    loss = model(torch.randn(32, 8)).pow(2).mean()
    loss.backward()
    aees.step_end(loss)              # runs optimizer.step() + scheduler.step()

aees.finalize()
logs = aees.get_logs()
print(f"Episodes run: {len(logs['episode_rewards'])}")
print(f"Last selected LR multiplier: {logs['selected_lr_values'][-1]}")
```

The wrapper owns `optimizer.step()` and `lr_scheduler.step()`; you keep `zero_grad()` and `loss.backward()`. The LR multiplier is applied transiently around `optimizer.step()`, so any external scheduler still advances on the optimizer's base learning rate.
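
Conceptually, the transient multiplier follows a scale-step-restore pattern. The sketch below illustrates the idea only; it is not `pulseopt`'s actual internals, and `multiplier` stands for whatever value the controller selected for the current episode.

```python
import torch

def step_with_multiplier(optimizer: torch.optim.Optimizer, multiplier: float) -> None:
    """Scale every param group's LR, step once, then restore the base LR."""
    base_lrs = [group["lr"] for group in optimizer.param_groups]
    try:
        for group in optimizer.param_groups:
            group["lr"] *= multiplier        # apply the episode's LR multiplier
        optimizer.step()                     # update weights at the scaled LR
    finally:
        for group, lr in zip(optimizer.param_groups, base_lrs):
            group["lr"] = lr                 # scheduler keeps seeing the base LR
```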

## How it works

- **Episode**: a fixed-length window of training steps with one frozen candidate (LR multiplier, noise std).
- **Reward**: log-EMA-loss improvement over the episode, minus an instability penalty proportional to within-episode loss variance, clipped to `[-1, 1]`.
- **Controller**: discounted-UCB by default; an optional bucketed-contextual variant uses a coarse loss-trend (and optional training-phase) bucket to share information across similar regimes.

Axes with a single candidate are treated as fixed constants and get no controller — passing `lr_candidates=[1.0]` keeps the LR multiplier disabled, and `noise_candidates=[0.0]` keeps gradient noise off.
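
As a rough picture of the reward shaping, here is a minimal sketch; the constants (`ema_beta`, `lam`, `clip`) are assumptions for illustration, with the real knobs being `reward_instability_lambda` and the clip bounds:

```python
import math

def episode_reward(losses, ema_beta=0.9, lam=0.1, clip=1.0):
    """Sketch: log-EMA-loss improvement minus a variance penalty, clipped.

    Assumes positive losses (e.g. cross-entropy); illustrative only.
    """
    ema = losses[0]
    for x in losses[1:]:
        ema = ema_beta * ema + (1.0 - ema_beta) * x
    improvement = math.log(losses[0]) - math.log(ema)            # > 0 if loss fell
    mean = sum(losses) / len(losses)
    variance = sum((x - mean) ** 2 for x in losses) / len(losses)
    return max(-clip, min(clip, improvement - lam * variance))
```

The discounted-UCB selection can likewise be sketched in skeleton form (`gamma` and the exploration constant `c` are illustrative values, not the library's defaults):

```python
import math

class DiscountedUCB:
    """Sketch of a discounted-UCB arm selector."""

    def __init__(self, n_arms: int, gamma: float = 0.95, c: float = 1.0):
        self.gamma, self.c = gamma, c
        self.counts = [0.0] * n_arms            # discounted pull counts
        self.sums = [0.0] * n_arms              # discounted reward sums

    def select(self) -> int:
        if any(n == 0.0 for n in self.counts):
            return self.counts.index(0.0)       # try every arm at least once
        total = sum(self.counts)

        def ucb(i: int) -> float:
            mean = self.sums[i] / self.counts[i]
            return mean + self.c * math.sqrt(math.log(total) / self.counts[i])

        return max(range(len(self.counts)), key=ucb)

    def update(self, arm: int, reward: float) -> None:
        for i in range(len(self.counts)):       # decay all arms, then credit one
            self.counts[i] *= self.gamma
            self.sums[i] *= self.gamma
        self.counts[arm] += 1.0
        self.sums[arm] += reward
```

Discounting lets old episodes fade, so the controller can track a non-stationary training process instead of averaging over the whole run.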

## Common knobs

| Argument | Meaning |
|---|---|
| `lr_candidates` | Multipliers tried against the optimizer's base LR. |
| `noise_candidates` | Gradient-noise std values; `0.0` means no noise. |
| `episode_length` | Steps per episode; reward is computed at episode end. |
| `lr_scheduler` | Optional `torch.optim.lr_scheduler.*` instance; `step()` is called for you. |
| `structured_control_mode` | `"independent"` (default) or `"conditional"` (one noise controller per LR arm). |
| `context_mode` | `"none"` (default), `"trend"`, or `"trend_phase"` (requires `total_training_steps`). |
| `reward_instability_lambda` | Weight on the variance penalty in the reward. |
| `seed` | Seeds controllers and gradient-noise generators. |
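
Putting a few of these together, a contextual, conditional setup might look like the following (the specific values are illustrative, not recommended defaults):

```python
aees = AEES(
    optimizer,
    lr_candidates=[0.5, 1.0, 2.0],
    noise_candidates=[0.0, 0.005, 0.01],
    episode_length=50,
    structured_control_mode="conditional",  # one noise controller per LR arm
    context_mode="trend_phase",             # bucket by loss trend + training phase
    total_training_steps=10_000,            # required by "trend_phase"
    reward_instability_lambda=0.1,          # weight on the variance penalty
    seed=0,
)
```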

`AEES.step_end(loss)` raises `ValueError` on a non-finite loss. If you train with mixed precision (`torch.cuda.amp` / `torch.amp`) and expect occasional NaN/Inf during loss-scaling backoff, guard the call yourself or skip the step.
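
A minimal version of that guard, assuming you simply drop the bad step, might look like this (whether a skipped `step_end` needs extra episode bookkeeping depends on AEES internals, so treat it as a sketch):

```python
loss = compute_loss()                        # your forward pass (hypothetical helper)
loss.backward()
if torch.isfinite(loss):
    aees.step_end(loss)                      # normal path: optimizer + scheduler advance
else:
    optimizer.zero_grad(set_to_none=True)    # discard the bad gradients, skip the step
```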

## Caveats

- AEES does not adapt weight decay; keep it as a normal optimizer hyperparameter.
- Each step clones the optimizer's parameters once to compute an update norm for the reward signal, so expect a transient extra memory cost of roughly one model copy; see the sketch after this list.
- There is no `state_dict` / `load_state_dict` yet — checkpoint and resume are planned for a future minor release.
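
The per-step clone in the second caveat corresponds to roughly this pattern (a sketch of the cost, not the library's exact code):

```python
params = [p for group in optimizer.param_groups for p in group["params"]]
before = [p.detach().clone() for p in params]     # the ~1x model-size copy
optimizer.step()
update_norm = sum(
    (p.detach() - b).pow(2).sum() for p, b in zip(params, before)
).sqrt()
```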

## For researchers / thesis reproduction

The package is the library half of a thesis project. The thesis-facing experiment runners and orchestration helpers live alongside the library in this repository but are **not** part of the published wheel.

Main experiment scripts:

- [`experiments/task_cifar100.py`](experiments/task_cifar100.py)
- [`experiments/task_sst2.py`](experiments/task_sst2.py)

Plan generators:

- [`experiments/run_v1_quick.py`](experiments/run_v1_quick.py)
- [`experiments/run_v1_matrix.py`](experiments/run_v1_matrix.py)
- [`experiments/utils/run_plan.py`](experiments/utils/run_plan.py)

Key structured AEES flags exposed by the runners:

- `--lr-candidates`, `--noise-candidates`
- `--structured-control-mode {independent,conditional}`
- `--context-mode {none,trend,trend_phase}`
- `--context-trend-window`, `--context-trend-epsilon`
- `--episode-length`

Scheduler flags: `--lr-scheduler {none,cosine,linear,warmup_linear}`, `--scheduler-t-max`, `--warmup-epochs`.

Reward flags: `--reward-epsilon`, `--reward-instability-lambda`, `--reward-clip-min`, `--reward-clip-max`.

CIFAR-specific noise flags: `--label-noise-type {none,symmetric,asymmetric}`, `--label-noise-rate`.

The structured v1 path does not adapt weight decay and does not use the older hierarchical driver-axis interface as its main method; `--mode-names` and `FixedMode` remain as a small compatibility shim for the older named-mode surface.

Example quick-plan generation:

```bash
python3.11 experiments/run_v1_quick.py --write-manifest
python3.11 experiments/run_v1_matrix.py --plan main --seeds 0,1,2 --write-manifest
```

## Repo layout

- [`src/pulseopt/`](src/pulseopt) — published library (controllers, episode managers, reward, optimizer wrappers, the `AEES` high-level API).
- [`experiments/`](experiments) — task runners and orchestration helpers (not packaged).
- [`configs/`](configs) — reference defaults (not packaged).
- [`tests/`](tests) — regression and unit tests.
- [`data/`](data), [`results/`](results) — local datasets and outputs (gitignored).

## Development

```bash
python3.11 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,experiments]"
pytest
```

## License

MIT — see [LICENSE](LICENSE).
