Metadata-Version: 2.4
Name: photon-tools
Version: 0.1.1
Summary: Lightweight tools for loading and handling photon counting data
Author: Janosch Kappel
License: MIT
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE.md
Requires-Dist: numpy
Requires-Dist: h5py
Requires-Dist: plotly
Requires-Dist: nbformat
Requires-Dist: pandas
Requires-Dist: ipywidgets
Requires-Dist: phconvert
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

# photon-tools

Lightweight Python tools for loading, inspecting, and screening single-molecule photon counting data.

`photon-tools` is designed for **interactive, notebook-based workflows** commonly used in single-molecule fluorescence experiments.  
The focus is on **data loading, standardization, and visual inspection**, not on enforcing a specific analysis pipeline.

`photon-tools` provides:
- a clean and extensible **data loading layer**
- a **standardized in-memory data model**
- fast, interactive **Plotly-based previews**
- a **Jupyter-based browser** for screening and annotating many files

## Features

### ✔ Data loading
- Built-in loader for **Photon-HDF5**
- Unified data representation via `PhotonDataset` / `PhotonData`
- Optional runtime registration of **custom loaders** (no forking required)

### ✔ Clean data model
- Integer timestamps (ticks)
- Optional detector/channel information
- Explicit `timing_resolution` (seconds per tick)
- Safe conversion to physical time
- Easy splitting by detector channel

### ✔ Interactive preview (Plotly)
- Fast binning of large photon streams
- Multiple detector channels in one plot
- Clickable legend to enable/disable channels
- Scroll-wheel zoom + mouse pan
- Physically meaningful defaults (axes clamped to zero)
- Fully customizable via returned Plotly `Figure`

### ✔ Screening workflow (Jupyter)
- Browse many files interactively
- Next / Previous navigation
- Visual evaluation instead of purely numeric filtering
- Mark files as *keep / reject*
- Store annotations and notes in a CSV file



## Installation

Create a virtual environment (recommended):

```bash
python -m venv .venv
source .venv/bin/activate
```

Install the latest release from PyPI:

```bash
pip install photon-tools
```

Required dependencies:
- numpy
- h5py
- plotly
- nbformat
- ipywidgets
- pandas
- phconvert


## Basic Usage

A runnable introduction is available as a notebook:

```
notebooks/01_quickstart.ipynb
```
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/JKL453/photon-tools/HEAD?urlpath=%2Fdoc%2Ftree%2F%2Fnotebooks%2F01_quickstart.ipynb)

### Load a Photon-HDF5 file

```python
import photon_tools as pt

ds = pt.load(
    "measurement.hdf5",
    timing_resolution=5e-9,  # seconds per tick
)
```

### Access data

```python
ds.photons.timestamps        # raw integer timestamps (ticks)
ds.photons.times_s           # physical time in seconds
ds.photons.detectors         # detector/channel IDs
```

Split by detector channel:

```python
by_ch = ds.photons.by_detector()
t_ch0 = by_ch[0]
t_ch1 = by_ch[1]
```
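The same split can be reproduced on plain arrays, which also shows how per-channel count rates fall out of the data model. A minimal numpy sketch on synthetic data (the arrays and the 5 ns resolution are invented for illustration):

```python
import numpy as np

# Synthetic photon stream: integer tick timestamps and per-photon detector IDs
timestamps = np.array([1, 3, 4, 7, 9, 12, 15, 18, 20, 23])
detectors = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0, 0])

# Equivalent of by_detector(): one timestamp array per detector ID
by_ch = {d: timestamps[detectors == d] for d in np.unique(detectors)}

timing_resolution = 5e-9  # seconds per tick (assumed)
duration_s = timestamps.max() * timing_resolution

# Mean count rate per channel in counts per second
rates = {d: len(ts) / duration_s for d, ts in by_ch.items()}
```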



## Interactive Preview

Quick visual inspection of a time trace:

```python
pt.preview(ds, bin_width_ms=10)
```

Customize appearance and detector labels:

```python
pt.preview(
    ds,
    bin_width_ms=5,
    detector_labels={0: "donor", 1: "acceptor"},
    colors={0: "royalblue", 1: "firebrick"},
    width=1000,
    height=400,
)
```

Further customization via Plotly:

```python
fig = pt.preview(ds, show=False)
fig.update_yaxes(type="log")
fig.show(config={"scrollZoom": True})
```

Because `preview()` returns a Plotly `Figure`, all Plotly features remain available.



## Screening Many Files (Notebook Browser)

The browser is demonstrated in:

```
notebooks/02_screening_browser.ipynb
```
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/JKL453/photon-tools/HEAD?urlpath=%2Fdoc%2Ftree%2F%2Fnotebooks%2F02_screening_browser.ipynb)

The browser allows you to:
- step through many measurement files
- inspect traces interactively (zoom, pan, toggle channels)
- mark files as *keep* or *reject*
- add free-text notes
- store all annotations in a CSV file

This workflow is intended for **expert-driven screening**, where visual judgment is essential and cannot be replaced by scalar metrics alone.
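The annotation store behind such a workflow can be sketched as a small pandas table written to CSV. This is an illustrative stand-in, not the package's actual implementation; the `annotate` helper and column names are invented:

```python
import pandas as pd

COLUMNS = ["file", "decision", "note"]

def annotate(df: pd.DataFrame, filename: str, decision: str, note: str = "") -> pd.DataFrame:
    """Record a keep/reject decision, overwriting any earlier entry for the file."""
    if decision not in ("keep", "reject"):
        raise ValueError(f"unknown decision: {decision!r}")
    row = pd.DataFrame([[filename, decision, note]], columns=COLUMNS)
    if df.empty:
        return row
    kept = df[df["file"] != filename]  # drop any previous decision for this file
    return pd.concat([kept, row], ignore_index=True)

annotations = pd.DataFrame(columns=COLUMNS)
annotations = annotate(annotations, "m001.hdf5", "keep", "clean trace")
annotations = annotate(annotations, "m002.hdf5", "reject", "bleaching")
annotations = annotate(annotations, "m001.hdf5", "reject")  # revise an earlier decision
# annotations.to_csv("screening.csv", index=False) would persist the table
```

Overwriting by filename keeps exactly one decision per file, so the CSV stays a clean summary even after repeated passes through the data.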



## Custom Loaders

A full walkthrough is available in:

```
notebooks/03_custom_loaders.ipynb
```
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/JKL453/photon-tools/HEAD?urlpath=%2Fdoc%2Ftree%2F%2Fnotebooks%2F03_custom_loaders.ipynb)

Custom file formats can be supported without modifying or forking the package.

Define a loader function and register it at runtime:

```python
def my_loader(path):
    ...
    return PhotonDataset(...)
```

```python
pt.register_loader(".dat", loader=my_loader)
ds = pt.load("custom_format.dat")
```

This allows extending `photon-tools` in notebooks or scripts in a lightweight and flexible way.



## Files Without Extensions

Some binary formats (e.g. custom NI time-tagged data) do not use file extensions.
In this case, the loader must be specified explicitly:

```python
ds = pt.load(
    "measurement_001",
    loader=pt.load_ni_binary,
    timing_resolution=10e-9,
)
```

## Data Model

`photon-tools` uses a **small, explicit, immutable data model** to represent
photon-counting data and scan images in memory.

The goal of this model is **not** to mirror file formats, but to provide a
**stable and analysis-friendly abstraction layer** that decouples:

- file I/O
- experimental setup specifics
- downstream analysis and visualization

All loaders (built-in or custom) convert raw data into this common model.

---

### PhotonDataset

`PhotonDataset` is the **top-level container** returned by all loaders.

```python
PhotonDataset(
    photons: PhotonData | None,
    images: dict[str, ImageData],
    meta: dict[str, Any],
    raw: dict[str, Any],
    source: str | None,
)
```

**Fields:**

- `photons`  
  Time-tagged photon data (`PhotonData`).  
  May be `None` for pure image data.

- `images`  
  Mapping of image identifiers to `ImageData` objects  
  (e.g. `"scan"`, `"preview"`, `"apd_sum"`).

- `meta`  
  High-level, standardized metadata (sample name, setup, excitation power, etc.).

- `raw`  
  Loader-specific metadata and diagnostics  
  (file paths, header dumps, original parameters).

- `source`  
  Optional string identifying the data source  
  (file path, measurement ID, etc.).

`PhotonDataset` is intentionally lightweight and **does not enforce**
any analysis workflow.

---

### PhotonData

`PhotonData` represents a **single stream of photon arrival times**.

```python
PhotonData(
    timestamps: np.ndarray,
    detectors: np.ndarray | None = None,
    nanotimes: np.ndarray | None = None,
    timing_resolution: float | None = None,
    unit: str = "ticks",
)
```

**Concepts:**

- `timestamps`  
  Integer macro-times (usually hardware clock ticks).

- `timing_resolution`  
  Seconds per tick (e.g. `5e-9`).  
  Required to convert timestamps into physical time.

- `detectors`  
  Optional detector/channel assignment per photon  
  (`0, 1, 2, ...`).  
  If absent, the data is treated as a single channel.

- `nanotimes`  
  Optional microtime / TCSPC information (same length as `timestamps`).

**Key properties & helpers:**

```python
ds.photons.times_s         # timestamps converted to seconds
ds.photons.by_detector()   # split timestamps by detector ID
```

**Design notes:**

- `PhotonData` is immutable.
- No implicit unit conversions.
- Missing information (e.g. timing resolution) raises explicit errors.

---

### ImageData

`ImageData` represents **multi-channel 2D scan images**.

```python
ImageData(
    channels: Mapping[str, np.ndarray],
    meta: ImageMeta,
    raw: dict[str, Any] = {},
)
```

**Fields:**

- `channels`  
  Dictionary mapping channel names to 2D arrays  
  (e.g. `"detector0"`, `"detector1"`, `"sum"`).

- `meta`  
  Physical scan metadata (`ImageMeta`).

- `raw`  
  Loader-specific information (binary headers, parsing details).

All image channels must:
- be 2D
- share the same shape
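These constraints are cheap to check up front. A sketch of such a validation (the `validate_channels` helper is illustrative, not part of the package):

```python
import numpy as np

def validate_channels(channels: dict[str, np.ndarray]) -> tuple[int, int]:
    """Check that every channel is 2D and that all channels share a single shape."""
    shapes = {name: arr.shape for name, arr in channels.items()}
    for name, shape in shapes.items():
        if len(shape) != 2:
            raise ValueError(f"channel {name!r} is not 2D: shape {shape}")
    if len(set(shapes.values())) != 1:
        raise ValueError(f"channel shapes differ: {shapes}")
    return next(iter(shapes.values()))

channels = {
    "detector0": np.zeros((128, 128)),
    "detector1": np.ones((128, 128)),
}
channels["sum"] = channels["detector0"] + channels["detector1"]
validate_channels(channels)  # returns (128, 128)
```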

---

### ImageMeta

`ImageMeta` stores **physical scan parameters** in explicit units.

```python
ImageMeta(
    n_pixels_x: int,
    n_pixels_y: int,
    range_x_um: float,
    range_y_um: float,
    offset_x_um: float = 0.0,
    offset_y_um: float = 0.0,
    pixel_dwell_time_s: float | None = None,
)
```

This allows downstream code to:
- display axes in µm
- compute pixel sizes
- remain independent of scan file formats
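For example, the pixel size follows directly from the scan range and pixel counts. A minimal sketch using a hypothetical `ScanMeta` stand-in that mirrors the relevant `ImageMeta` field names:

```python
from dataclasses import dataclass

# Hypothetical stand-in for ImageMeta, holding only the fields used below
@dataclass(frozen=True)
class ScanMeta:
    n_pixels_x: int
    n_pixels_y: int
    range_x_um: float
    range_y_um: float

def pixel_size_um(meta: ScanMeta) -> tuple[float, float]:
    """Physical pixel size in µm along x and y."""
    return meta.range_x_um / meta.n_pixels_x, meta.range_y_um / meta.n_pixels_y

m = ScanMeta(n_pixels_x=256, n_pixels_y=256, range_x_um=20.0, range_y_um=20.0)
pixel_size_um(m)  # returns (0.078125, 0.078125)
```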

---

### Writing Custom Loaders

Custom loaders should:

1. Parse the raw file format
2. Convert data into `PhotonData` and/or `ImageData`
3. Populate `meta` and `raw` dictionaries as needed
4. Return a `PhotonDataset`

Minimal example:

```python
def my_loader(path, **kwargs):
    # ts, det: timestamp and detector arrays parsed from the file
    # (format-specific parsing omitted here)
    photons = PhotonData(
        timestamps=ts,
        detectors=det,
        timing_resolution=5e-9,
    )

    return PhotonDataset(
        photons=photons,
        meta={"format": "custom"},
        raw={"path": str(path)},
        source=str(path),
    )
```

The loader does **not** need to:
- perform binning
- normalize data
- apply analysis logic

Those steps are intentionally left to the user.

### Data Model Diagram

```
PhotonDataset
 ├─ photons: PhotonData | None
 │    ├─ timestamps: int ticks (N)
 │    ├─ detectors: int ids (N) | None
 │    ├─ nanotimes: int microtimes (N) | None
 │    └─ timing_resolution: seconds per tick | None
 │
 ├─ images: dict[str, ImageData]
 │    └─ ImageData
 │         ├─ channels: {name -> 2D array (H,W)}   (all same shape)
 │         ├─ meta: ImageMeta (µm + seconds)
 │         └─ raw: loader-specific dict
 │
 ├─ meta: dict (standardized, analysis-friendly)
 ├─ raw: dict (loader-specific diagnostics)
 └─ source: str | None
```


## Design Philosophy

- **Notebook-first**
- **Explicit over implicit**
- **No silent assumptions**
- **Visual inspection before automation**
- Keep the core lightweight; downstream analysis is user-specific

`photon-tools` is not an analysis framework — it is a **foundation** for interactive and exploratory workflows.



## Status

This project is under active development and tailored to real experimental workflows.  
APIs may evolve, but changes are made conservatively and with practical use cases in mind.

### Todos
- add support for loading pixel images from binary files
- add TTL data support
- add support for setup3
- settle naming convention: timestamps -> macro times vs. micro times vs. nano times -> tt vs. mt vs. ttl
