Metadata-Version: 2.4
Name: dendros
Version: 0.1.0
Summary: A Python toolkit for analyzing Galacticus semi-analytic model outputs.
Author-email: Andrew Benson <abenson@carnegiescience.edu>
License: GPL-3.0-or-later
Project-URL: Homepage, https://github.com/galacticusorg/dendros
Project-URL: Documentation, https://dendros.readthedocs.io
Project-URL: Bug Tracker, https://github.com/galacticusorg/dendros/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Astronomy
Classifier: Intended Audience :: Science/Research
Classifier: Development Status :: 3 - Alpha
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: h5py>=3.0
Requires-Dist: numpy>=1.20
Requires-Dist: astropy>=5.0
Provides-Extra: pandas
Requires-Dist: pandas>=1.3; extra == "pandas"
Provides-Extra: tabulate
Requires-Dist: tabulate>=0.10.0; extra == "tabulate"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: pandas>=1.3; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx>=5.0; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=1.0; extra == "docs"
Requires-Dist: myst-parser>=0.18; extra == "docs"
Requires-Dist: nbsphinx>=0.8; extra == "docs"
Dynamic: license-file

# Dendros

<p align="center">
  <img src="https://github.com/galacticusorg/dendros/blob/main/assets/dendros.png?raw=true" width="400" alt="Dendros Logo">
</p>

[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![PyPI version](https://badge.fury.io/py/dendros.svg)](https://badge.fury.io/py/dendros)
[![Documentation](https://readthedocs.org/projects/dendros/badge/?version=latest)](https://dendros.readthedocs.io/en/latest/)

A Python toolkit for analyzing [Galacticus](https://github.com/galacticusorg/galacticus)
semi-analytic model outputs.

---

## Installation

```bash
pip install dendros
```

To enable pandas and tabulate table output as well, install the optional extras:

```bash
pip install 'dendros[pandas,tabulate]'
```

Install the latest development version directly from GitHub:

```bash
pip install git+https://github.com/galacticusorg/dendros.git
```

---

## Quickstart

### Opening files

```python
from dendros import open_outputs

# Single file
c = open_outputs("galacticus.hdf5")

# Auto-detect MPI-split outputs (given any one rank's file)
c = open_outputs("galacticus_MPI:0000.hdf5")

# Explicit list of files
c = open_outputs(["rank0.hdf5", "rank1.hdf5"])

# Glob pattern
c = open_outputs("run001/galacticus*.hdf5")

# Lightcone run (different top-level group)
c = open_outputs("lightcone.hdf5", output_root="Lightcone")
```

`open_outputs` returns a `Collection`; use it as a context manager so the underlying files are closed on exit:

```python
with open_outputs("galacticus.hdf5") as c:
    ...
```

### Checking completion status

Galacticus writes a `statusCompletion` attribute when a run finishes.
By default, `validate_completion` raises a `RuntimeError` if any file is
incomplete; the `mode` argument relaxes this to a warning or disables the check:

```python
with open_outputs("galacticus.hdf5") as c:
    c.validate_completion()               # raises RuntimeError if incomplete
    c.validate_completion(mode="warn")    # emit a warning instead
    c.validate_completion(mode="ignore")  # do nothing
```
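
The error/warn/ignore policy can be pictured with a small standalone sketch. It does not reproduce the dendros implementation: `check_completion`, the per-file boolean flags, and the default mode name `"error"` are all assumptions made for illustration.

```python
import warnings


def check_completion(complete_by_file, mode="error"):
    """Apply an error/warn/ignore policy to per-file completion flags.

    complete_by_file maps a file name to True (finished) or False.
    Returns the sorted list of incomplete file names.
    NOTE: hypothetical helper; "error" is a placeholder mode name.
    """
    if mode not in ("error", "warn", "ignore"):
        raise ValueError(f"unknown mode: {mode!r}")
    incomplete = sorted(name for name, ok in complete_by_file.items() if not ok)
    if not incomplete or mode == "ignore":
        return incomplete
    message = "incomplete outputs: " + ", ".join(incomplete)
    if mode == "warn":
        warnings.warn(message, RuntimeWarning)
        return incomplete
    raise RuntimeError(message)


flags = {"rank0.hdf5": True, "rank1.hdf5": False}  # made-up flags
print(check_completion(flags, mode="ignore"))
```

Returning the list of incomplete files in every non-raising mode lets the caller decide what to do next, for example skipping those files.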

### Listing available outputs

```python
with open_outputs("galacticus.hdf5") as c:
    tbl = c.list_outputs()          # astropy Table by default
    print(tbl)

    # or as a pandas DataFrame:
    df = c.list_outputs(format="pandas")

    # or as a tabulate string:
    text = c.list_outputs(format="tabulate")
```

Example output:

```
index   name  time scale_factor redshift
----- ------- ---- ------------ --------
    1 Output1 13.8          1.0      0.0
    2 Output2  6.0          0.5      1.0
```
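
The `scale_factor` and `redshift` columns carry the same information, linked by the standard cosmological relation `a = 1 / (1 + z)`. As a quick sanity check on the rows above (the helper name is illustrative and not part of dendros):

```python
def redshift_from_scale_factor(a: float) -> float:
    """Convert expansion scale factor a to redshift z via a = 1 / (1 + z)."""
    if a <= 0:
        raise ValueError("scale factor must be positive")
    return 1.0 / a - 1.0


# Rows from the example table: (scale_factor, redshift)
for a, z in [(1.0, 0.0), (0.5, 1.0)]:
    assert redshift_from_scale_factor(a) == z
```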

You can also access the index object directly:

```python
with open_outputs("galacticus.hdf5") as c:
    for meta in c.outputs:
        print(meta.name, meta.redshift)
```

### Listing available properties

```python
with open_outputs("galacticus.hdf5") as c:
    tbl = c.list_properties("Output1")   # by name
    tbl = c.list_properties(1)           # by 1-based integer index
    print(tbl)
```

Example output:

```
name        dtype   shape   description          unitsInSI
----------- ------- ------- -------------------- ---------
haloMass    float64 (1000,) Halo virial mass     1.989e+30
stellarMass float64 (1000,) Stellar mass of disk 1.989e+30
...
```
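
The `unitsInSI` column presumably holds the factor that converts a stored value to SI units. Since 1.989e+30 kg is one solar mass, the masses in this example would be stored in solar masses. A minimal sketch of the conversion under that assumption (`to_si` is a hypothetical helper and the mass values are made up):

```python
M_SUN_KG = 1.989e30  # value shown in the unitsInSI column of the example


def to_si(values, units_in_si):
    """Scale stored values by their unitsInSI factor to obtain SI values."""
    return [v * units_in_si for v in values]


halo_masses_msun = [1.0e12, 2.5e11]  # made-up masses in solar masses
halo_masses_kg = to_si(halo_masses_msun, M_SUN_KG)
print(halo_masses_kg)
```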

### Reading datasets

```python
with open_outputs("galacticus.hdf5") as c:
    # List of dataset paths → same strings used as dict keys
    data = c.read("Output1", ["nodeData/basicMass", "nodeData/diskMassStellar"])
    print(data["nodeData/basicMass"])   # numpy array

    # Dict → custom labels
    data = c.read(
        "Output1",
        {"Mhalo": "nodeData/basicMass", "Mstar": "nodeData/diskMassStellar"},
    )
    print(data["Mhalo"])
```
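
One way to think about the two call styles is that a list is shorthand for a dict whose labels equal the dataset paths. The sketch below illustrates that equivalence only. `normalize_datasets` is a hypothetical helper, not a dendros function, and may not match how the library handles its arguments internally.

```python
def normalize_datasets(spec):
    """Return a {label: dataset_path} mapping from a list or dict spec."""
    if isinstance(spec, dict):
        return dict(spec)  # copy so the caller's dict is not mutated
    return {path: path for path in spec}


as_list = normalize_datasets(["nodeData/basicMass"])
as_dict = normalize_datasets({"Mhalo": "nodeData/basicMass"})
print(as_list)
print(as_dict)
```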

### Filtering galaxies

Pass a boolean mask or integer index array as `where`:

```python
with open_outputs("galacticus.hdf5") as c:
    # First read to build a mask
    masses = c.read("Output1", ["nodeData/basicMass"])["nodeData/basicMass"]
    mask = masses > 1e12

    # Then read everything for the selected galaxies only
    data = c.read(
        "Output1",
        {"Mhalo": "nodeData/basicMass", "Mstar": "nodeData/diskMassStellar"},
        where=mask,
    )
```
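
The two accepted forms of `where` mirror NumPy's own indexing. The sketch below uses plain NumPy arrays with made-up values (no dendros calls) to show that a boolean mask and the integer index array derived from it select the same galaxies:

```python
import numpy as np

masses = np.array([5e11, 2e12, 8e11, 3e13])  # made-up halo masses
stellar = np.array([1e9, 4e10, 3e9, 2e11])   # made-up stellar masses

mask = masses > 1e12        # boolean mask, one entry per galaxy
idx = np.flatnonzero(mask)  # equivalent integer index array

# Both forms select the same rows, in the same order.
assert np.array_equal(stellar[mask], stellar[idx])
print(idx, stellar[mask])
```

An index array is the more compact choice when only a few galaxies out of many are selected, while a mask is usually easier to build from a comparison.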

### h5py-like browsing

```python
with open_outputs("galacticus.hdf5") as c:
    print(c.keys())                        # top-level groups
    grp = c["Outputs/Output1"]
    print(grp.keys())                      # subgroups / datasets
    print(grp.attrs)                       # group attributes
    ds = c["Outputs/Output1/nodeData/basicMass"]
    print(ds.dtype, ds.shape)
```

---

## MPI outputs

When Galacticus runs with MPI, it writes one file per rank with the suffix
`_MPI:NNNN` (e.g. `galacticus_MPI:0000.hdf5`, `galacticus_MPI:0001.hdf5`, …).
All ranks contain identical metadata groups; galaxy datasets are split across
ranks.

`open_outputs` handles this automatically:

```python
# Any single-rank file → auto-detects all peers
c = open_outputs("galacticus_MPI:0000.hdf5")

# Or pass a glob pattern (an explicit list of files also works)
c = open_outputs("galacticus_MPI:????.hdf5")
```

`c.read(...)` transparently concatenates arrays across all ranks along axis 0.
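
To make the naming convention concrete, here is a rough sketch of how peer files could be recognized from the `_MPI:NNNN` suffix. It is not the dendros implementation: `find_mpi_peers` and the regular expression are assumptions, and the function works on a plain list of names rather than on the filesystem.

```python
import re

# Assumed pattern: "<stem>_MPI:<4-digit rank><extension>"
_MPI_RE = re.compile(r"^(?P<stem>.*_MPI:)(?P<rank>\d{4})(?P<ext>\.[^.]+)$")


def find_mpi_peers(filename, candidates):
    """Return candidates sharing filename's stem and extension, sorted by rank.

    If filename has no _MPI:NNNN suffix, it is returned on its own.
    """
    m = _MPI_RE.match(filename)
    if m is None:
        return [filename]
    peers = []
    for name in candidates:
        c = _MPI_RE.match(name)
        if c and c["stem"] == m["stem"] and c["ext"] == m["ext"]:
            peers.append((int(c["rank"]), name))
    return [name for _, name in sorted(peers)]


listing = [
    "galacticus_MPI:0001.hdf5",
    "other_MPI:0000.hdf5",
    "galacticus_MPI:0000.hdf5",
    "notes.txt",
]
print(find_mpi_peers("galacticus_MPI:0000.hdf5", listing))
```

Sorting by rank gives a deterministic file order, which matters if per-rank arrays are later concatenated.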

---

## Lightcone outputs

For lightcone runs the top-level group is typically `Lightcone` rather than
`Outputs`.  Pass `output_root` to override the default:

```python
c = open_outputs("lightcone.hdf5", output_root="Lightcone")
```

---

## Documentation

Full API reference and more examples are available at
[dendros.readthedocs.io](https://dendros.readthedocs.io).

---

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, coding style,
and how to propose changes.

---

## License

Dendros is released under the
[GNU General Public License v3.0 or later](LICENSE).

