Metadata-Version: 2.4
Name: modelarrayio
Version: 26.0.0rc1
Summary: ModelArrayIO: convert fixel, voxel, and greyordinate neuroimaging data to/from HDF5 for the ModelArray R package
Project-URL: Homepage, https://github.com/PennLINC/ModelArrayIO
Project-URL: Repository, https://github.com/PennLINC/ModelArrayIO
Project-URL: Documentation, https://modelarrayio.readthedocs.io
Author-email: The PennLINC developers <Matthew.Cieslak@pennmedicine.upenn.edu>
Maintainer-email: Matt Cieslak <Matthew.Cieslak@pennmedicine.upenn.edu>
License: BSD 3-Clause License
        
        Copyright (c) 2022,  Lifespan Informatics and Neuroimaging Center
        All rights reserved.
        
        Redistribution and use in source and binary forms, with or without
        modification, are permitted provided that the following conditions are met:
        
        1. Redistributions of source code must retain the above copyright notice, this
           list of conditions and the following disclaimer.
        
        2. Redistributions in binary form must reproduce the above copyright notice,
           this list of conditions and the following disclaimer in the documentation
           and/or other materials provided with the distribution.
        
        3. Neither the name of the copyright holder nor the names of its
           contributors may be used to endorse or promote products derived from
           this software without specific prior written permission.
        
        THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
        AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
        IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
        DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
        FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
        DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
        SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
        CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
        OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
        OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
License-File: LICENSE
Keywords: cifti,fixel,hdf5,modelarray,neuroimaging,nifti
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: >=3.11
Requires-Dist: h5py>=3.10
Requires-Dist: nibabel>=5.0
Requires-Dist: numpy>=1.26
Requires-Dist: pandas>=2.0
Requires-Dist: tiledb>=0.23
Requires-Dist: tqdm>=4.65
Provides-Extra: all
Requires-Dist: boto3; extra == 'all'
Requires-Dist: coverage[toml]>=7.0; extra == 'all'
Requires-Dist: myst-parser>=2; extra == 'all'
Requires-Dist: pre-commit>=3.5; extra == 'all'
Requires-Dist: pytest-cov>=5; extra == 'all'
Requires-Dist: pytest-env>=1.0; extra == 'all'
Requires-Dist: pytest-xdist>=3; extra == 'all'
Requires-Dist: pytest>=8; extra == 'all'
Requires-Dist: ruff>=0.4; extra == 'all'
Requires-Dist: sphinx-argparse; extra == 'all'
Requires-Dist: sphinx-copybutton; extra == 'all'
Requires-Dist: sphinx-rtd-theme; extra == 'all'
Requires-Dist: sphinx>=6.2.1; extra == 'all'
Requires-Dist: sphinxcontrib-apidoc; extra == 'all'
Requires-Dist: sphinxcontrib-bibtex; extra == 'all'
Provides-Extra: doc
Requires-Dist: myst-parser>=2; extra == 'doc'
Requires-Dist: sphinx-argparse; extra == 'doc'
Requires-Dist: sphinx-copybutton; extra == 'doc'
Requires-Dist: sphinx-rtd-theme; extra == 'doc'
Requires-Dist: sphinx>=6.2.1; extra == 'doc'
Requires-Dist: sphinxcontrib-apidoc; extra == 'doc'
Requires-Dist: sphinxcontrib-bibtex; extra == 'doc'
Provides-Extra: s3
Requires-Dist: boto3; extra == 's3'
Provides-Extra: test
Requires-Dist: coverage[toml]>=7.0; extra == 'test'
Requires-Dist: pre-commit>=3.5; extra == 'test'
Requires-Dist: pytest-cov>=5; extra == 'test'
Requires-Dist: pytest-env>=1.0; extra == 'test'
Requires-Dist: pytest-xdist>=3; extra == 'test'
Requires-Dist: pytest>=8; extra == 'test'
Requires-Dist: ruff>=0.4; extra == 'test'
Description-Content-Type: text/markdown

# ModelArrayIO

**ModelArrayIO** is a Python package that converts between neuroimaging formats (fixel `.mif`, voxel NIfTI, CIFTI-2 dscalar) and the HDF5 (`.h5`) layout used by the R package [ModelArray](https://pennlinc.github.io/ModelArray/). It can also write ModelArray statistical results back to imaging formats.

**Relationship to ConFixel:** The earlier project [**ConFixel**](https://github.com/PennLINC/ConFixel) is superseded by ModelArrayIO. The ConFixel repository is retained for history (including links from publications) and will be archived; new work should use this repository.

Installation and usage are documented in this README ([ModelArrayIO on GitHub](https://github.com/PennLINC/ModelArrayIO#installation)). For conda environments, HDF5 libraries, and installing the ModelArray R package, see the ModelArray vignette [Installation](https://pennlinc.github.io/ModelArray/articles/installations.html).

<p align="center">

![Overview](overview_structure.png)

</p>

ModelArrayIO provides three converter families, each with an import and an export command. After [installation](#installation), the following commands are available in your terminal:

* **Fixel-wise** data (MRtrix `.mif`):
    * `.mif` → `.h5`: `confixel` (CLI name kept for compatibility with earlier ConFixel workflows)
    * `.h5` → `.mif`: `fixelstats_write`
* **Voxel-wise** data (NIfTI):
    * NIfTI → `.h5`: `convoxel`
    * `.h5` → NIfTI: `volumestats_write`
* **Greyordinate-wise** data (CIFTI-2):
    * CIFTI-2 → `.h5`: `concifti`
    * `.h5` → CIFTI-2: `ciftistats_write`

## Installation
### Install MRtrix (required only for fixel-wise `.mif` data)
For fixel-wise `.mif` conversion, the `confixel` and `fixelstats_write` tools call MRtrix's `mrconvert`. If needed, install MRtrix from the [MRtrix download page](https://www.mrtrix.org/download/), then run `mrview` in the terminal to verify the installation.

If your data are voxel-wise or CIFTI only, you can skip this step.

### Install ModelArrayIO
You may want to create a conda environment first; see [ModelArray: Installation](https://pennlinc.github.io/ModelArray/articles/installations.html). If MRtrix is installed in a specific environment, install ModelArrayIO in that same environment.

Install from GitHub:

``` console
git clone https://github.com/PennLINC/ModelArrayIO.git
cd ModelArrayIO
pip install .   # build via pyproject.toml
```

Editable install for development:

``` console
# From the repository root
pip install -e .
```

With `hatch` installed, you can build wheels/sdist locally:

``` console
hatch build
pip install dist/*.whl
```

## How to use
We provide a [walkthrough for fixel-wise data](notebooks/walkthrough_fixel-wise_data.md) (`confixel` / `fixelstats_write`) and a [walkthrough for voxel-wise data](notebooks/walkthrough_voxel-wise_data.md) (`convoxel` / `volumestats_write`).

For an end-to-end example that pairs ModelArrayIO with [ModelArray](https://pennlinc.github.io/ModelArray/), see the [combined walkthrough](https://pennlinc.github.io/ModelArray/articles/walkthrough.html), which uses example fixel-wise data.

CLI help:

``` console
confixel --help
```

Use the same pattern for `convoxel`, `concifti`, `fixelstats_write`, `volumestats_write`, and `ciftistats_write`.

## Storage backends: HDF5 and TileDB

ModelArrayIO supports two on-disk backends for the subject-by-element matrix:

- HDF5 (default), implemented in `modelarrayio/h5_storage.py`
- TileDB, implemented in `modelarrayio/tiledb_storage.py`

Both backends expose a similar API:

- create a dense 2D array `(subjects, items)` and write all values at once
- create an empty array with the same shape and write by column stripes
- write/read column names alongside the data
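
As a minimal sketch of that API shape (dataset paths, chunk sizes, and names here are illustrative, not the package's actual classes or layout), the HDF5 side can be approximated with `h5py` like this:

``` python
import h5py
import numpy as np

n_subjects, n_items = 3, 8
data = np.arange(n_subjects * n_items, dtype=np.float32).reshape(n_subjects, n_items)
column_names = [f"sub-{i:02d}" for i in range(n_subjects)]  # hypothetical names

with h5py.File("scalars.h5", "w") as f:
    # Path 1: create a dense 2D dataset and write all values at once.
    dset = f.create_dataset("scalars/values", data=data,
                            chunks=(n_subjects, 4), compression="gzip")
    # Column names stored alongside the data as a dataset attribute.
    dset.attrs["column_names"] = column_names

    # Path 2: create an empty dataset of the same shape, then fill by column stripes.
    striped = f.create_dataset("scalars/striped", shape=(n_subjects, n_items),
                               dtype=np.float32, chunks=(n_subjects, 4),
                               compression="gzip")
    for start in range(0, n_items, 4):
        striped[:, start:start + 4] = data[:, start:start + 4]

# Both write paths produce identical on-disk values.
with h5py.File("scalars.h5", "r") as f:
    assert np.array_equal(f["scalars/values"][:], f["scalars/striped"][:])
```

The stripe width is tied to the chunk size here so that each write touches whole chunks, which keeps writes sequential for large arrays.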

Notes and minor differences:
- Chunking vs tiling: HDF5 uses chunks; TileDB uses tiles. We compute tile sizes analogous to chunk sizes to keep write/read patterns similar.
- Compression: HDF5 uses `gzip` by default; TileDB defaults to `zstd`+shuffle for better speed/ratio. You can switch to `gzip` for parity.
- Metadata: HDF5 stores `column_names` as a dataset attribute; TileDB stores names as JSON metadata on the array/group.
- Layout: Both backends keep dimensions in the same order and use zero-based indices.
