Metadata-Version: 2.4
Name: w2t_bkin
Version: 0.0.9
Summary: Mouse whiskers body kinematics and behaviour
Author: Larkum Lab
Requires-Python: ~=3.10.0
Description-Content-Type: text/markdown
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Information Analysis
License-File: LICENSE
Requires-Dist: pydantic~=2.12.0
Requires-Dist: pydantic-settings~=2.11.0
Requires-Dist: tomli~=2.3.0
Requires-Dist: facemap~=1.0.0
Requires-Dist: deeplabcut[tf]~=2.3.0
Requires-Dist: pynwb~=3.1.0
Requires-Dist: hdmf~=4.1.0
Requires-Dist: ndx-pose~=0.2.0
Requires-Dist: ndx-events~=0.4.0
Requires-Dist: ndx-structured-behavior~=0.1.0
Requires-Dist: ffmpeg-python~=0.2.0
Requires-Dist: scipy~=1.15.0
Requires-Dist: pandas~=2.3.0
Requires-Dist: h5py~=3.15.0
Requires-Dist: tables~=3.8.0
Requires-Dist: nwbinspector~=0.6.5
Requires-Dist: typer~=0.20.0
Requires-Dist: rich~=14.2.0
Requires-Dist: black~=25.9.0 ; extra == "dev"
Requires-Dist: isort~=7.0.0 ; extra == "dev"
Requires-Dist: pytest~=9.0.0 ; extra == "dev"
Requires-Dist: matplotlib~=3.8.0 ; extra == "dev"
Requires-Dist: numpy~=1.26.0 ; extra == "dev"
Provides-Extra: dev

# W2T Body Kinematics Pipeline (w2t-bkin)

A modular, reproducible Python pipeline for processing multi-camera rodent behavior recordings. It integrates synchronization, pose estimation (DeepLabCut/SLEAP), facial metrics, and behavioral events into standardized **NWB (Neurodata Without Borders)** datasets.

## Key Features

- **NWB-First Architecture**: Produces NWB-native data structures directly, eliminating intermediate conversion layers.
- **Hierarchical Metadata**: Supports cascading configuration from global → subject → session levels for efficient metadata management.
- **Bpod Integration**: Parses Bpod `.mat` files and converts them to `ndx-structured-behavior` format.
- **Pose Estimation**: Imports and harmonizes data from DeepLabCut and SLEAP into `ndx-pose`.
- **Synchronization**: Aligns behavioral data and video frames to a common timebase using hardware TTL pulses.
- **Modular Design**: Distinct modules for behavior, pose, sync, and session management.

## Installation

The project requires Python 3.10 (`Requires-Python: ~=3.10.0`, i.e. any 3.10.x release; 3.11 and later are excluded).

1. **Install `ndx-structured-behavior`** (currently required from source):

   ```bash
   git clone https://github.com/rly/ndx-structured-behavior.git
   pip install -U ./ndx-structured-behavior
   ```

2. **Install `w2t-bkin`**:

   ```bash
   pip install w2t-bkin
   ```

## Project Structure

```text
project/
├── config.toml              # Pipeline configuration
├── data/
│   ├── raw/                 # Raw data organized by subject/session
│   │   ├── metadata.toml    # Optional: Global metadata (lab-wide defaults)
│   │   └── subject-001/
│   │       ├── subject.toml # Optional: Subject-specific metadata
│   │       └── session-001/
│   │           ├── session.toml  # Session-specific NWB metadata
│   │           ├── Video/        # Raw video files
│   │           ├── TTLs/         # TTL pulse timestamps
│   │           └── Bpod/         # Bpod behavior files
│   ├── interim/             # Processed data (pose estimation, etc.)
│   │   └── subject-001/
│   │       └── session-001/
│   │           └── Pose/
│   └── processed/           # Final NWB output files
└── models/                  # Pose estimation models (DLC/SLEAP)
```
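
The layout above maps directly onto `pathlib` paths. The following is a minimal sketch of that mapping; `session_paths` is a hypothetical helper written for illustration and is not part of `w2t_bkin`.

```python
from pathlib import Path


def session_paths(project_root: Path, subject_id: str, session_id: str) -> dict[str, Path]:
    """Return the conventional directories for one session (hypothetical helper)."""
    data = project_root / "data"
    raw_session = data / "raw" / subject_id / session_id
    return {
        "raw_session": raw_session,
        "video": raw_session / "Video",
        "ttls": raw_session / "TTLs",
        "bpod": raw_session / "Bpod",
        "pose": data / "interim" / subject_id / session_id / "Pose",
        "processed": data / "processed",
    }


paths = session_paths(Path("project"), "subject-001", "session-001")
print(paths["pose"].as_posix())
# → project/data/interim/subject-001/session-001/Pose
```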

## Configuration

The pipeline uses TOML for both pipeline configuration and session metadata.

### Pipeline Configuration (`config.toml`)

Defines paths, timebase, and synchronization settings:

```toml
[project]
name = "my-experiment"

[paths]
raw_root = "data/raw"
intermediate_root = "data/interim"
output_root = "data/processed"
models_root = "models"
root_metadata = "data/raw/metadata.toml"  # Optional global metadata

[synchronization]
strategy = "hardware_pulse"
reference_channel = "ttl_camera"

[synchronization.alignment]
method = "nearest"
tolerance_s = 0.001

[[bpod.sync.trial_types]]
trial_type = 1
sync_signal = "W2T_Audio"
sync_ttl = "ttl_cue"
```
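
To make the `[synchronization.alignment]` settings concrete, here is one plausible reading of `method = "nearest"` with `tolerance_s`: each event is matched to the closest reference pulse, and the match is rejected if the gap exceeds the tolerance. This is a sketch under that assumption; `align_nearest` is a hypothetical function, not the implementation in `w2t_bkin.sync`.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def align_nearest(
    event_times: Sequence[float],
    reference_times: Sequence[float],
    tolerance_s: float = 0.001,
) -> list[Optional[int]]:
    """For each event, return the index of the nearest reference pulse,
    or None when the nearest pulse is farther away than tolerance_s.

    reference_times must be sorted in ascending order.
    """
    matches: list[Optional[int]] = []
    for t in event_times:
        pos = bisect_left(reference_times, t)
        # Candidates are the neighbours on either side of the insertion point.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(reference_times)]
        if not candidates:
            matches.append(None)
            continue
        best = min(candidates, key=lambda i: abs(reference_times[i] - t))
        within = abs(reference_times[best] - t) <= tolerance_s
        matches.append(best if within else None)
    return matches


camera_ttl = [0.000, 0.010, 0.020, 0.030]  # illustrative pulse times (s)
events = [0.0102, 0.0249, 0.0305]          # illustrative event times (s)
print(align_nearest(events, camera_ttl, tolerance_s=0.001))
# → [1, None, 3]
```

The second event lies about 5 ms from both neighbouring pulses, so it is left unmatched rather than forced onto a pulse.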

### Hierarchical Metadata

Metadata is loaded and merged from multiple levels (later files override earlier ones):

1. **`root_metadata`** (optional): Lab/project-wide defaults
2. **`raw_root/metadata.toml`** (optional): Experiment-wide settings
3. **`raw_root/subject_id/subject.toml`** (optional): Subject-specific metadata
4. **`raw_root/subject_id/session_id/session.toml`**: Session-specific NWB metadata
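
The override order above can be illustrated with a recursive dictionary merge over the parsed TOML contents. This is a minimal sketch: `deep_merge` and the sample values are assumptions made for illustration, and the package's own merging rules (for example, how arrays such as `[[cameras]]` are combined) may differ.

```python
from copy import deepcopy
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which values from override replace those in base.

    Nested tables (dicts) are merged recursively; any other value,
    including lists, is replaced wholesale.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Parsed contents of each level, lowest priority first (illustrative values).
levels = [
    {"institution": "My Lab", "subject": {"species": "Mus musculus"}},  # metadata.toml
    {"subject": {"subject_id": "subject-001", "sex": "M"}},             # subject.toml
    {"identifier": "session-001", "subject": {"age": "P90D"}},          # session.toml
]

metadata: dict[str, Any] = {}
for level in levels:
    metadata = deep_merge(metadata, level)

print(metadata["subject"])
# → {'species': 'Mus musculus', 'subject_id': 'subject-001', 'sex': 'M', 'age': 'P90D'}
```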

Example `session.toml`:

```toml
session_description = "Behavioral training with pose tracking"
identifier = "session-001"
session_start_time = "2025-11-21T14:30:00Z"
experimenter = ["Esteban, Borja"]
institution = "My Lab"
lab = "Neuroscience Lab"

[subject]
subject_id = "subject-001"
species = "Mus musculus"
sex = "M"
age = "P90D"

# Camera configuration
[[cameras]]
id = "camera_0"             # Must match a device name in [[devices]]
paths = "Video/cam0_*.avi"  # Glob pattern for video files
fps = 150.0                 # Acquisition frame rate (defaults to 30.0 if omitted)
ttl_id = "ttl_camera"       # Associated TTL stream for synchronization
```
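
The `paths` entry is a glob pattern relative to the session directory. As a sketch of how such a pattern could be expanded with the standard library, the hypothetical helper below returns the matching files sorted by name; whether the pipeline itself orders files this way is an assumption.

```python
import tempfile
from pathlib import Path


def resolve_video_files(session_dir: Path, pattern: str) -> list[Path]:
    """Expand a session-relative glob pattern into a name-sorted list of files."""
    return sorted(p for p in session_dir.glob(pattern) if p.is_file())


# Demonstrate on a throwaway session directory.
with tempfile.TemporaryDirectory() as tmp:
    session_dir = Path(tmp)
    (session_dir / "Video").mkdir()
    for name in ("cam0_002.avi", "cam0_001.avi", "cam1_001.avi"):
        (session_dir / "Video" / name).touch()

    files = resolve_video_files(session_dir, "Video/cam0_*.avi")
    print([f.name for f in files])
    # → ['cam0_001.avi', 'cam0_002.avi']
```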

## Quick Start

### Using the High-Level Helper

```python
from pathlib import Path
from w2t_bkin.config import load_config
from w2t_bkin.utils import load_session_metadata_and_nwb

# Load configuration
config = load_config("config.toml")

# Load hierarchical metadata and create NWBFile in one step
metadata, nwbfile = load_session_metadata_and_nwb(
    config=config,
    subject_id="subject-001",
    session_id="session-001"
)

# Continue with your pipeline...
```

### Manual Approach

```python
from pathlib import Path
from w2t_bkin import config, sync
from w2t_bkin.core import session
from w2t_bkin.ingest import behavior, bpod, ttl

# 1. Load Configuration
settings = config.load_config("config.toml")

# 2. Build metadata paths and load hierarchically
metadata_paths = session.build_metadata_paths(
    raw_root=settings.paths.raw_root,
    subject_id="subject-001",
    session_id="session-001",
    root_metadata=settings.paths.root_metadata
)
metadata = session.load_metadata(metadata_paths)

# 3. Create NWBFile
nwbfile = session.create_nwb_file(metadata)

# 4. Get session directory
session_dir = Path(settings.paths.raw_root) / "subject-001" / "session-001"

# 5. Import TTL Signals
ttl_patterns = {
    "ttl_camera": "TTLs/*.xa_7_0*.txt",
    "ttl_cue": "TTLs/*.xia_3_0*.txt",
}
ttl_pulses = ttl.get_ttl_pulses(session_dir, ttl_patterns)

# 6. Parse Bpod Data
bpod_data = bpod.parse_bpod(
    session_dir=session_dir,
    pattern="Bpod/*.mat",
    order="name_asc"
)

# 7. Synchronize Bpod to TTL
trial_offsets, sync_warnings = sync.align_bpod_trials_to_ttl(
    trial_type_configs=settings.bpod.sync.trial_types,
    bpod_data=bpod_data,
    ttl_pulses=ttl_pulses,
)

# 8. Extract Behavioral Data (NWB objects)
task, recording, trials = behavior.extract_behavioral_data(
    bpod_data,
    trial_offsets
)

# 9. Add to NWB
nwbfile.add_lab_meta_data(task)
nwbfile.add_acquisition(recording.states)
nwbfile.add_acquisition(recording.events)
nwbfile.add_acquisition(recording.actions)
nwbfile.trials = trials
```
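
Step 7 yields per-trial offsets that place Bpod trials on the TTL timebase. The idea can be sketched as adding each trial's offset to its trial-relative timestamps. The data shapes below (dictionaries keyed by trial number) and the helper `to_session_time` are assumptions for illustration; the actual return type of `align_bpod_trials_to_ttl` is not documented in this README.

```python
def to_session_time(
    trial_events: dict[int, list[float]],
    trial_offsets: dict[int, float],
) -> dict[int, list[float]]:
    """Shift trial-relative event times (seconds) onto the session timebase.

    Trials without an offset are skipped, since they could not be aligned.
    """
    return {
        trial: [t + trial_offsets[trial] for t in times]
        for trial, times in trial_events.items()
        if trial in trial_offsets
    }


# Illustrative values: event times relative to each trial's start, and the
# offset of each trial start on the TTL timebase. Trial 3 has no offset.
trial_events = {1: [0.0, 0.5], 2: [0.0, 0.25], 3: [0.0]}
trial_offsets = {1: 10.0, 2: 15.5}

print(to_session_time(trial_events, trial_offsets))
# → {1: [10.0, 10.5], 2: [15.5, 15.75]}
```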

## Examples

The `examples/` directory contains complete working examples:

- **`bpod_camera_sync.py`**: Demonstrates Bpod-camera synchronization with TTL alignment
- **`pose_camera_nwb.py`**: Shows pose estimation data import and NWB file creation
- **`sync_recovery_demo.py`**: Demonstrates sync recovery when TTL pulses are missing

Run an example:

```bash
python examples/pose_camera_nwb.py
```

## Module Overview

| Module                     | Description                                                                                        |
| :------------------------- | :------------------------------------------------------------------------------------------------- |
| `w2t_bkin.ingest.behavior` | Converts Bpod data into `ndx-structured-behavior` classes (StatesTable, EventsTable, TrialsTable). |
| `w2t_bkin.ingest.bpod`     | Low-level parsing and validation of Bpod `.mat` files.                                             |
| `w2t_bkin.ingest.pose`     | Imports pose estimation data (DLC/SLEAP) and builds `ndx-pose` objects (PoseEstimation, Skeleton). |
| `w2t_bkin.ingest.ttl`      | Loads hardware TTL pulse timestamps and creates `ndx-events` tables.                               |
| `w2t_bkin.sync`            | Handles timebase alignment, jitter calculation, and synchronization of video/behavior to TTLs.     |
| `w2t_bkin.core.session`    | Loads metadata hierarchically and assembles the root `NWBFile`.                                    |
| `w2t_bkin.core.pipeline`   | High-level orchestration of the entire workflow.                                                   |
| `w2t_bkin.utils`           | Shared utilities including datetime parsing, dictionary merging, and helper functions.             |

## CLI Utilities

The `scripts/` directory contains useful utilities:

- `mat2json.py`: Converts MATLAB `.mat` files to JSON, handling nested structures and arrays.
- `pose2ttl.py`: Generates mock TTL signals from DeepLabCut pose data (useful for testing or when hardware sync fails).
- `trials2df.py`: Converts NWB `TrialsTable` and `TaskRecording` objects into a flat pandas DataFrame for analysis.
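
As an illustration of the mock-TTL idea behind `pose2ttl.py`, the sketch below generates one synthetic pulse per video frame at a nominal frame rate. This is an assumption about the general approach, not a description of the script: `mock_frame_ttl` is a hypothetical function, and a real recording would show jitter and dropped frames that this ignores.

```python
def mock_frame_ttl(n_frames: int, fps: float, start_s: float = 0.0) -> list[float]:
    """Return one synthetic pulse timestamp (seconds) per video frame,
    assuming a perfectly regular frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return [start_s + i / fps for i in range(n_frames)]


pulses = mock_frame_ttl(n_frames=4, fps=150.0)
print(len(pulses), pulses[0])
# → 4 0.0
```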

## Testing

The project includes synthetic data generation for testing:

```python
from pathlib import Path

from synthetic import build_raw_folder, build_interim_pose

# Generate synthetic session
session = build_raw_folder(
    out_root=Path("output/test/raw"),
    project_name="test-project",
    subject_id="subject-001",
    session_id="session-001",
    camera_ids=["cam0", "cam1"],
    ttl_ids=["ttl_camera", "ttl_bpod"],
    n_frames=300,
    n_trials=10,
)
```

## License

Licensed under the Apache Software License. See the `LICENSE` file for details.

