Metadata-Version: 2.4
Name: ber-equalization-studio
Version: 0.1.1
Summary: Research studio for BER equalization experiments in nonlinear optical links.
Keywords: ber,equalization,optical-communications,photonics,pytorch
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Physics
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: matplotlib>=3.8
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: plotly>=5.18
Requires-Dist: torch>=2.1
Provides-Extra: kan
Provides-Extra: notebook
Requires-Dist: ipykernel>=6.29; extra == "notebook"
Requires-Dist: ipython>=8.18; extra == "notebook"
Requires-Dist: ipywidgets>=8.1; extra == "notebook"
Requires-Dist: jupyterlab>=4.0; extra == "notebook"
Requires-Dist: nbformat>=5.9; extra == "notebook"
Provides-Extra: dev
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: twine>=5.1; extra == "dev"
Provides-Extra: publish
Requires-Dist: build>=1.2; extra == "publish"
Requires-Dist: twine>=5.1; extra == "publish"

# BER Equalization Studio

Research library for BER equalization experiments in nonlinear fiber-optic links.

The studio wraps the existing, validated `BER_minimization_survey/ber_equalization.py`
backend with a cleaner API:

- structured dataclass configs instead of editing one large global `Config`;
- model registry and aliases;
- reusable experiment runner;
- per-model run folders with checkpoints, metrics, histories, and plots;
- result comparison helpers for paper tables;
- Plotly HTML visualizations for BER/loss curves and model tradeoffs.

## Layout

```text
ber-equalization-studio/
  src/ber_equalization_studio/
    config.py          # DataConfig, ModelConfig, TrainingConfig, StudioConfig
    legacy.py          # adapter to the original ber_equalization.py backend
    models.py          # model registry and constructors
    data.py            # dataset preparation wrapper
    experiment.py      # ExperimentRunner
    api.py             # notebook-friendly Studio, RunResult, StudyResult
    results.py         # result loading/comparison helpers
    visualization.py   # BER/loss/complexity plots
    cli.py             # ber-studio command
  examples/
    run_smoke.py
```

## Quick Start

Install from PyPI:

```powershell
python -m pip install "ber-equalization-studio[notebook]"
```

For local development from the repository root:

```powershell
cd ber-equalization-studio
python -m pip install -e ".[notebook,dev]"
ber-studio models
```

Run a short smoke experiment:

```powershell
ber-studio run `
  --name smoke_complex_fastkan `
  --models complex_fastkan,mlp `
  --data-dir ..\BER_minimization_survey\symbols_new `
  --epochs 5 `
  --max-test-files 1
```

If your symbol CSV files live directly under `BER_minimization_survey`, pass that
folder to `--data-dir` instead. The backend searches for files named:

```text
Symbols_1m_1ch_PR_*.csv
```
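
To check a folder before launching a run, the pattern above can be tested with a few lines of standard-library Python. This is a minimal sketch, not the backend's own discovery code: the helper names are hypothetical, the sample file names are made up for illustration, and it assumes matching is case-sensitive and limited to files directly inside the folder.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# Pattern quoted in this README; everything else here is illustrative.
SYMBOL_PATTERN = "Symbols_1m_1ch_PR_*.csv"


def is_symbol_file(name):
    """Return True if a bare file name matches the symbol CSV pattern."""
    return fnmatchcase(name, SYMBOL_PATTERN)


def find_symbol_files(data_dir):
    """Return sorted matching files directly inside data_dir (no recursion)."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir() if p.is_file() and is_symbol_file(p.name))


# Hypothetical file names, used only to show which ones would match.
names = [
    "Symbols_1m_1ch_PR_01.csv",
    "symbols_1m_1ch_pr_01.csv",
    "Symbols_1m_1ch_PR_01.txt",
    "notes.csv",
]
print([n for n in names if is_symbol_file(n)])  # ['Symbols_1m_1ch_PR_01.csv']
```

An empty list from `find_symbol_files(...)` suggests the path points at the wrong folder or the files use a different naming scheme.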

The PyPI package bundles the legacy training backend and the local EfficientKAN
implementation used by the studio, so the only external input required is one or more
data directories containing symbol CSV files.

## Notebook API

The easiest way to work from Jupyter is the high-level `Studio` interface:

```python
from ber_equalization_studio import Studio

studio = Studio(
    data_dirs=["../test"],
    out_dir="notebook_runs",
    device="cuda",  # assumes a CUDA-capable GPU is available
)

studio.models()
```

Run one experiment:

```python
run = studio.run(
    name="fastkan_smoke",
    models=["complex_fastkan", "mlp"],
    epochs=5,
    lr=1e-3,
    context_k=32,
    max_test_files=1,
)

run.results
```

Useful result helpers:

```python
run.best()
run.compare()
run.history("complex_fastkan")
run.plot_history("complex_fastkan")
run.plot_comparison()
run.run_dir
```

Sweep a small research grid:

```python
study = studio.sweep(
    name="context_lr_sweep",
    models=["complex_fastkan"],
    grid={
        "context_k": [16, 32, 64],
        "lr": [1e-3, 3e-4],
    },
    epochs=50,
    max_test_files=2,
)

study.results.sort_values("equalized_ber")
study.best()
study.plot_tradeoff(x="trainable_params", y="equalized_ber")
```
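
To estimate the cost of a sweep before starting it, it helps to enumerate the grid by hand. The sketch below assumes that `studio.sweep` trains every combination of the listed values (a full Cartesian product), which this README does not state explicitly. `expand_grid` is a hypothetical helper, not part of the package.

```python
from itertools import product


def expand_grid(grid):
    """Return one dict of parameter overrides per point of the Cartesian product."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


# Same grid as in the sweep example above.
grid = {"context_k": [16, 32, 64], "lr": [1e-3, 3e-4]}
combos = expand_grid(grid)

print(len(combos))  # 6
for combo in combos:
    print(combo)
```

Under that assumption, the example sweep amounts to six training runs of 50 epochs each for a single model, and every extra grid key multiplies the count.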

Short keyword arguments such as `epochs`, `lr`, `context_k`, `max_test_files`,
`data_dirs`, and `out_dir` are mapped onto the structured config. For settings without
a short name, pass dotted `section.field` keys:

```python
run = studio.run(
    name="custom_eval",
    models="complex_fastkan",
    **{
        "evaluation.ber_scale_steps": 20,
        "training.early_stopping_patience": 100,
    },
)
```
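
One way to picture the dotted-key convention is as a path into nested dataclasses. The sketch below is an illustration under stated assumptions, not the studio's implementation: the `*Sketch` classes, their default values, and `apply_dotted_overrides` are all hypothetical stand-ins, and only two-level `section.field` keys are handled.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class TrainingSketch:
    epochs: int = 10                   # placeholder default, not the library's
    early_stopping_patience: int = 20  # placeholder default, not the library's


@dataclass(frozen=True)
class EvaluationSketch:
    ber_scale_steps: int = 10          # placeholder default, not the library's


@dataclass(frozen=True)
class ConfigSketch:
    training: TrainingSketch = field(default_factory=TrainingSketch)
    evaluation: EvaluationSketch = field(default_factory=EvaluationSketch)


def apply_dotted_overrides(config, overrides):
    """Return a copy of config with each "section.field" override applied."""
    for dotted_key, value in overrides.items():
        section_name, field_name = dotted_key.split(".", 1)
        section = getattr(config, section_name, None)
        if section is None or not hasattr(section, field_name):
            raise KeyError(f"unknown config field: {dotted_key}")
        # Rebuild the section, then the top-level config, leaving the input untouched.
        new_section = replace(section, **{field_name: value})
        config = replace(config, **{section_name: new_section})
    return config


cfg = apply_dotted_overrides(
    ConfigSketch(),
    {"evaluation.ber_scale_steps": 20, "training.early_stopping_patience": 100},
)
print(cfg.evaluation.ber_scale_steps, cfg.training.early_stopping_patience)  # 20 100
```

Rejecting unknown keys, as the sketch does, catches typos in a dotted key instead of silently ignoring the override.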

## Low-Level Python API

```python
from pathlib import Path

from ber_equalization_studio import (
    DataConfig,
    ExperimentConfig,
    ExperimentRunner,
    OutputConfig,
    StudioConfig,
    TrainingConfig,
)

config = StudioConfig(
    data=DataConfig(
        data_dirs=[Path("../BER_minimization_survey/symbols_new")],
        context_k=32,
        max_test_files=1,
    ),
    training=TrainingConfig(epochs=10, learning_rate=1e-3),
    output=OutputConfig(out_dir=Path("studio_outputs"), experiment_name="fastkan_baseline"),
    experiment=ExperimentConfig(models=["complex_fastkan", "efficient_kan_baseline", "mlp"]),
)

result = ExperimentRunner(config).run()
print(result["results_csv"])
```

## Available Models

List the registered models from the command line:

```powershell
ber-studio models
```

Important current models:

- `complex_fastkan`: lightweight complex temporal encoder + RBF/FastKAN regression head.
- `complex_fastkan_classifier`: same encoder + RBF/FastKAN 16-class detector.
- `efficient_kan_baseline`: flat IQ window + B-spline EfficientKAN regression.
- `kan_classifier`: flat IQ window + B-spline EfficientKAN classifier.
- `cnn_kan`: temporal CNN + EfficientKAN head.
- `mlp`, `cnn`, `tcn`, `lstm`, `transformer`: neural baselines.

## Outputs

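Each run creates the following files under `<out_dir>/<experiment_name>/` (shown here
with `out_dir` set to `studio_outputs`):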

```text
studio_outputs/<experiment_name>/
  config.json
  dataset_summary.json
  results.csv
  model_comparison_ber.html
  model_comparison_complexity.html
  <model_name>/
    metrics.json
    history.json
    final_state_dict.pt
    <model_name>_loss.html
    <model_name>_ber.html
```

Compare previous runs:

```powershell
ber-studio compare studio_outputs
```
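
The saved `results.csv` files can also be gathered by hand, for example when assembling a table outside the CLI. The sketch below relies only on the folder layout shown above and the standard library. `collect_results`, `rank_by_ber`, and the added `run_folder` key are hypothetical names, and the `equalized_ber` column is assumed to be present in `results.csv` because it appears in the sweep example.

```python
import csv
from pathlib import Path


def collect_results(root):
    """Read every <root>/<experiment_name>/results.csv into one list of row dicts."""
    rows = []
    for csv_path in sorted(Path(root).glob("*/results.csv")):
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                # Tag each row with the run folder it came from.
                row["run_folder"] = csv_path.parent.name
                rows.append(row)
    return rows


def rank_by_ber(rows, column="equalized_ber"):
    """Sort rows by ascending BER; rows without a value in `column` are skipped."""
    scored = [r for r in rows if r.get(column) not in (None, "")]
    return sorted(scored, key=lambda r: float(r[column]))


# Example call, assuming runs were written under studio_outputs/:
# best_first = rank_by_ber(collect_results("studio_outputs"))
```

Values stay as strings exactly as stored in the CSV; only the sort key is converted to `float`.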

## Research Workflow

Recommended progression:

1. In Jupyter, start with `Studio(...).models()` to choose candidate models.
2. Run `complex_fastkan`, `efficient_kan_baseline`, and `mlp` as a short smoke test.
3. Increase epochs and use the full test split for promising candidates.
4. Use `run.results`, `study.results`, and saved `results.csv` files for paper tables.
5. Sweep one family at a time with `studio.sweep(...)`, changing only a few parameters per study.

The current implementation intentionally leaves the original backend unchanged, so the
existing BER computation, Gray labels, file-level split protocol, normalization, KAN
implementations, and training-loop behavior are all preserved. The studio API adds
configuration, run management, and result comparison on top of that backend.
