Metadata-Version: 2.3
Name: pfb-imaging
Version: 0.0.9
Summary: Radio interferometric imaging suite based on a preconditioned forward-backward approach
Author: landmanbester
Author-email: landmanbester <lbester@sarao.ac.za>
License: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Astronomy
Requires-Dist: hip-cargo>=0.2.0
Requires-Dist: katbeam>=0.1 ; extra == 'full'
Requires-Dist: codex-africanus[dask,scipy,astropy,python-casacore]>=0.4.4 ; extra == 'full'
Requires-Dist: dask-ms[s3,xarray,zarr]>=0.2.23 ; extra == 'full'
Requires-Dist: reproject ; extra == 'full'
Requires-Dist: ray[default]>=2.54.0 ; extra == 'full'
Requires-Dist: scikit-image>=0.24.0 ; extra == 'full'
Requires-Dist: pywavelets>=1.7.0 ; extra == 'full'
Requires-Dist: numexpr>=2.10.1 ; extra == 'full'
Requires-Dist: ducc0>=0.35.0 ; extra == 'full'
Requires-Dist: sympy>=1.9 ; extra == 'full'
Requires-Dist: tbb>=2021.13.1 ; extra == 'full'
Requires-Dist: jax[cpu]>=0.4.31 ; extra == 'full'
Requires-Dist: lz4>=4.3.3 ; extra == 'full'
Requires-Dist: bokeh>=3.1.0 ; extra == 'full'
Requires-Dist: regions>=0.9 ; extra == 'full'
Requires-Dist: psutil>=5.9.8 ; extra == 'full'
Requires-Dist: matplotlib>=3.9.2 ; extra == 'full'
Requires-Dist: distributed<2024.11.0 ; extra == 'full'
Requires-Dist: pydantic>=2 ; extra == 'full'
Requires-Dist: omegaconf>=2.3 ; extra == 'full'
Requires-Dist: pillow>=12.2.0 ; extra == 'full'
Requires-Python: >=3.10, <3.13
Project-URL: Homepage, https://github.com/ratt-ru/pfb-imaging
Provides-Extra: full
Description-Content-Type: text/markdown

# pfb-imaging

Radio interferometric imaging suite based on the preconditioned forward-backward algorithm.
The project follows the [hip-cargo](https://github.com/landmanbester/hip-cargo) package format:
lightweight CLI installation with auto-generated [stimela](https://github.com/caracal-pipeline/stimela) cab definitions and containerised execution.

## Installation

**Lightweight (CLI + cabs only):**

```bash
pip install pfb-imaging
```

This installs the CLI and stimela cab definitions without the full scientific stack.
The cabs can be included in stimela recipes using:

```yaml
_include:
  - (pfb_imaging.cabs)init.yml
```

**Full stack:**

To run the code natively, install the full stack:

```bash
pip install "pfb-imaging[full]"
```

For maximum performance install `ducc0` in no-binary mode:

```bash
pip install ducc0 --no-binary ducc0
```

See the [Development](#development) section for instructions on how to set the package up in development mode and make contributions.

## Quick start

The easiest way to use `pfb-imaging` is via the `stimela` recipes given in the [recipes folder](recipes/).
Once the package is installed, a recipe can be queried for its input and output parameters using the `stimela doc` command.
For example, to see the inputs and outputs of the `sara` recipe, simply run

```bash
stimela doc 'pfb_imaging.recipes::sara.yaml'
```

The recipe can then be run with the `stimela run` command:

```bash
stimela run 'pfb_imaging.recipes::sara.yaml' sara \
  ms=path/to/data.ms \
  base-dir=path/to/base/output/directory \
  image-name=saraout
```

The recipe should contain sensible defaults for MeerKAT data at L-band.

## CLI documentation

The CLI is built with [Typer](https://typer.tiangolo.com/) and provides rich, auto-generated documentation.
To list all available commands:

```bash
pfb --help
```

To get detailed documentation for a specific command including all parameters, types, and defaults:

```bash
pfb init --help
```

This is often more useful than `stimela doc` as it shows the full parameter documentation with types and defaults directly in the terminal.

## CLI commands

The processing pipeline follows a modular pattern where each step is a separate command:

1. `pfb init` -- Parse measurement sets into xarray datasets
2. `pfb grid` -- Create dirty images, PSFs, and weights
3. `pfb kclean` -- Classical deconvolution (Hogbom/Clark)
4. `pfb sara` -- Advanced deconvolution with sparsity constraints
5. `pfb restore` -- Restore clean components to final image
6. `pfb degrid` -- Subtract model from visibilities

Additional commands:

- `pfb deconv` -- General deconvolution (replaces individual algorithm apps)
- `pfb hci` -- High cadence imaging
- `pfb fluxtractor` -- Flux extraction
- `pfb model2comps` -- Convert model to components

## Execution backends

Every command supports a `--backend` option that controls how the command is executed.
This is provided by [hip-cargo](https://github.com/landmanbester/hip-cargo) and enables container fallback execution: when the full scientific stack is not installed locally, commands automatically run inside a container.

Available backends:

- `auto` (default) -- Try native execution first; if the core module import fails (lightweight install), fall back to the best available container runtime.
- `native` -- Run natively using the locally installed Python environment. Fails with `ImportError` if dependencies are missing.
- `docker` -- Run inside a Docker container.
- `podman` -- Run inside a Podman container (daemonless, rootless).
- `apptainer` -- Run inside an Apptainer container (HPC-friendly, formerly Singularity).
- `singularity` -- Run inside a Singularity container.

An additional `--always-pull-images` flag forces re-pulling the container image before execution, which is useful for ensuring you have the latest version.

Example usage:

```bash
# Run natively (requires full install)
pfb init --ms data.ms --output-filename out --backend native

# Run in a Docker container (lightweight install only)
pfb init --ms data.ms --output-filename out --backend docker

# Auto-detect: native if available, otherwise container
pfb init --ms data.ms --output-filename out
```

Volume mounts are resolved automatically from the command's type hints: input paths are mounted read-only, output paths read-write.
Docker and Podman run as the current user to avoid root-owned output files.
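
The `auto` backend's fallback behaviour can be sketched as follows. This is a simplified illustration, not the actual hip-cargo implementation; the module name passed in and the runtime preference order are assumptions:

```python
import importlib.util
import shutil


def module_available(name: str) -> bool:
    """True if `name` can be imported in the current environment."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        return False


def pick_backend(core_module: str) -> str:
    """Sketch of `auto` backend selection: run natively if the core
    module imports, otherwise fall back to the first container
    runtime found on PATH."""
    if module_available(core_module):
        return "native"
    for runtime in ("docker", "podman", "apptainer", "singularity"):
        if shutil.which(runtime):
            return runtime
    raise RuntimeError("no native install and no container runtime found")
```

With a full install the core module imports and the command runs natively; with a lightweight install the import probe fails and a container runtime is selected instead.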

## Default naming conventions

Output files follow consistent naming patterns using `--output-filename`, `--product`, and `--suffix`:

- XDS datasets: `{output_filename}_{product}.xds`
- DDS datasets: `{output_filename}_{product}_{suffix}.dds`
- Models: `{output_filename}_{product}_{suffix}_model.mds`
- FITS files: same convention with appropriate extensions

The `--suffix` parameter (default `main`) allows imaging multiple fields from a single set of corrected Stokes visibilities.
For example, the sun can be imaged by setting `--target sun --suffix sun`.
The `--target` parameter accepts any object name recognised by `astropy`, or coordinates in `HH:MM:SS,DD:MM:SS` format.
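
As an illustration, the naming patterns above can be expressed as a small helper. This function is hypothetical (the real CLI builds these names internally); only the patterns themselves come from the conventions listed above:

```python
def output_name(output_filename: str, product: str,
                suffix: str = "main", kind: str = "dds") -> str:
    """Compose an output name following the conventions above.

    Illustrative only; `kind` selects which pattern to apply.
    """
    if kind == "xds":  # XDS datasets carry no suffix
        return f"{output_filename}_{product}.xds"
    if kind == "dds":
        return f"{output_filename}_{product}_{suffix}.dds"
    if kind == "mds":  # model datasets
        return f"{output_filename}_{product}_{suffix}_model.mds"
    raise ValueError(f"unknown kind: {kind}")
```

For example, `output_name("out", "I", suffix="sun", kind="mds")` yields `out_I_sun_model.mds`.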

## Parallelism settings

Two settings control parallelism:

- `--nworkers` controls how many chunks (usually imaging bands) are processed in parallel.
- `--nthreads` specifies threads available to each worker (gridding, FFTs, wavelet transforms).

By default, a single worker is used, which gives the smallest memory footprint and the easiest debugging.
Set `--nworkers` larger than one to use multiple Dask workers for parallel chunk processing.
The product of `--nworkers` and `--nthreads` should not exceed available resources.

## Package structure

The project follows the hip-cargo src layout:

```
pfb-imaging/
├── src/pfb_imaging/
│   ├── cli/          # Lightweight CLI wrappers (Typer)
│   ├── core/         # Core implementations (lazy-loaded)
│   ├── cabs/         # Generated Stimela cab definitions (YAML)
│   ├── deconv/       # Deconvolution algorithms
│   ├── operators/    # Mathematical operators (gridding, PSF, Psi)
│   ├── opt/          # Optimization algorithms (PCG, FISTA, primal-dual)
│   ├── prox/         # Proximal operators
│   ├── utils/        # Utility functions
│   └── wavelets/     # Wavelet transform implementations
├── scripts/          # Profiling and automation scripts
├── tests/
├── Dockerfile
└── pyproject.toml
```

**Key separation:** CLI modules (`cli/`) are lightweight with lazy imports so that `pfb --help` and cab generation don't pull in the full scientific stack.
Core implementations live in `core/` and are imported only when a command is executed.

## Container images

Container images are published to GitHub Container Registry at `ghcr.io/ratt-ru/pfb-imaging`.
The full image URL (including tag) is the single source of truth and lives in `src/pfb_imaging/_container_image.py` as the `CONTAINER_IMAGE` variable, loaded via `importlib` (no CWD dependency, no `uv sync` needed).

```python
CONTAINER_IMAGE = "ghcr.io/ratt-ru/pfb-imaging:<tag>"
```

The `<tag>` is managed by three mechanisms:

- **Feature branches:** the developer manually updates the tag in `_container_image.py` to match the branch name.
- **Merge to main:** the `update-cabs.yml` GitHub Action rewrites the tag to `latest`, regenerates cab definitions, and commits the changes.
- **Releases:** `tbump` rewrites the tag to the semantic version (e.g. `0.0.9`) via `before_commit` hooks in `tbump.toml`.

Cab definitions are auto-generated with the correct image tag via pre-commit hooks and the `update-cabs.yml` GitHub Action -- the image URL is read from `_container_image.py` at generation time, so the `--image` flag is not needed.
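
Loading a single-variable module by file path, as described above, can be done with `importlib`. This is a generic sketch of the technique, not the package's exact loader:

```python
import importlib.util
from pathlib import Path


def load_container_image(path: Path) -> str:
    """Load CONTAINER_IMAGE from a module file without importing
    the package and without any CWD dependency."""
    spec = importlib.util.spec_from_file_location("_container_image", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.CONTAINER_IMAGE
```

Because the module is resolved by an absolute file path, this works regardless of the current working directory or whether the project environment is synced.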

## Development

This project uses:
- [uv](https://github.com/astral-sh/uv) for dependency management
- [ruff](https://github.com/astral-sh/ruff) for linting and formatting (a core dependency: `generate-function` runs `ruff format` and `ruff check --fix` on generated code)
- [typer](https://typer.tiangolo.com/) for the CLI
- [git-cliff](https://git-cliff.org/) for `CHANGELOG` automation


### Setting Up Development Environment

```bash
# Clone the repository
git clone https://github.com/ratt-ru/pfb-imaging.git
cd pfb-imaging

# Install dependencies with development tools
uv sync --extra full --group dev --group test

# Install pre-commit hooks (recommended)
uv run pre-commit install --hook-type commit-msg
```

This will automatically run the hooks before each commit.
If any checks fail, the commit will be blocked until you fix the issues.

#### Running Hooks Manually

You can run the hooks manually on all files:

```bash
# Run on all files
uv run pre-commit run --all-files

# Run on staged files only
uv run pre-commit run
```

#### Updating Hook Versions

To update hook versions to the latest:

```bash
uv run pre-commit autoupdate
```

### Manual Code Quality Checks

If you prefer to run checks manually without pre-commit:

```bash
# Format code
uv run ruff format .

# Check and auto-fix linting issues
uv run ruff check . --fix

# Run tests
uv run pytest -v
```

### Commit Message Convention

This project uses [Conventional Commits](https://www.conventionalcommits.org/) to enable automated changelog generation via [git-cliff](https://git-cliff.org/).

Every commit message should follow this format:

```
<type>: <description>

[optional body]
```

**Types:**

| Type | When to use | Changelog section |
|------|------------|-------------------|
| `feat` | New feature or capability | Added |
| `fix` | Bug fix | Fixed |
| `refactor` | Code change that neither fixes a bug nor adds a feature | Changed |
| `perf` | Performance improvement | Changed |
| `docs` | Documentation only | Documentation |
| `test` | Adding or updating tests | Testing |
| `ci` | CI/CD changes | CI |
| `deps` | Dependency updates | Dependencies |
| `chore` | Maintenance tasks (cab regeneration, formatting) | Miscellaneous |

**Examples:**

```bash
git commit -m "feat: add support for MS dtype in type inference"
git commit -m "fix: handle empty docstrings in introspector"
git commit -m "refactor: simplify generate_cabs output formatting"
git commit -m "docs: add container fallback section to README"
git commit -m "test: add roundtrip test for List types"
```

**Scoped commits** (optional): Use parentheses to specify the affected component:

```bash
git commit -m "feat(init): add --license-type option for BSD-3-Clause"
git commit -m "fix(runner): resolve volume mount for symlinked paths"
```

### Contributing Workflow


1. **Create a feature branch**:
   ```bash
   git checkout -b your-feature-name
   ```

2. **Update the container image tag** in `src/pfb_imaging/_container_image.py` to match your branch name.

   This ensures the cab definitions generated by pre-commit hooks use the correct branch-specific image tag during development. You do not need to reset the tag before merging — the `update-cabs` workflow handles that automatically on merge to `main`.

3. **Make your changes** and ensure tests pass:
   ```bash
   uv run pytest -v
   ```

4. **Commit using [conventional commit messages](#commit-message-convention)**:
   ```bash
   git add .
   git commit -m "feat: your feature description"
   # Pre-commit hooks run automatically
   ```

   The pre-commit hooks keep the CLI and the corresponding cab definitions in sync, and enforce code quality and conventional commit messages.

5. **Push and create a pull request**:
   ```bash
   git push origin your-feature-name
   ```

The GitHub Actions workflows automate containerisation by pushing container images to the GitHub Container Registry. Once the PR is merged, the container image corresponding to the branch is also retagged as `:latest`.

## Acknowledgement

If you find any of this useful, please cite the [pfb-imaging paper](https://arxiv.org/abs/2412.10073/).
