Metadata-Version: 2.4
Name: eo-history
Version: 0.2.8
Summary: Metadata providing creation histories for spatial files
Author-email: rdenham <robert.denham@uq.edu.au>
License-File: AUTHORS.md
License-File: LICENSE
License-File: LICENSE.txt
Requires-Python: >=3.11
Requires-Dist: click>=8.3.1
Requires-Dist: colorful>=0.5.8
Provides-Extra: gdal
Requires-Dist: gdal>=3.0; extra == 'gdal'
Provides-Extra: laspy
Requires-Dist: laspy[laszip]>=2.7.0; extra == 'laspy'
Provides-Extra: rasterio
Requires-Dist: rasterio>=1.3; extra == 'rasterio'
Description-Content-Type: text/markdown

# eo-history

This Python package adds processing-history metadata to spatial files, such as GDAL-supported rasters and the LiDAR formats LAS and LAZ.

## Installation

You can install directly from this repository using a personal access token or a deploy token. For example:


```sh
python3 -m pip install git+https://<token-name>:<token>@gitlab.com/jrsrp/sys/eo-history
```

Replace `<token-name>` with your username if using a personal access token, or with `_token_` if
using a deploy token.

If you have read access to our package registry, you can also install from it:

```sh
python3 -m pip install eo-history --extra-index-url=https://<token-name>:<token>@gitlab.com/api/v4/projects/64206106/packages/pypi/simple
```

### Using `history_view` as a standalone tool with uv

If you only want the `history_view` command without adding `eo-history` to a project environment,
[`uv tool install`](https://docs.astral.sh/uv/guides/tools/) makes it available globally:

```sh
uv tool install git+https://<token-name>:<token>@gitlab.com/jrsrp/sys/eo-history
```

After that, `history_view` is on your PATH and can be run directly:

```sh
history_view my_image.tif
```

To run it once without a permanent install, use `uvx`:

```sh
uvx --from git+https://<token-name>:<token>@gitlab.com/jrsrp/sys/eo-history history_view my_image.tif
```

## Usage

This package reads and writes processing history for GDAL-supported
images and LAS/LAZ files.
When a file is created, you can capture information about each of its inputs
and store it in the new file. This lets you build up a history of a
processing workflow.

For example, imagine you have a base image, which is used to create a second image,
which itself is used to create a third, final image. The metadata for the final image can have
the history of all its antecedents.

As a diagram:

```mermaid
graph TD;
final_image.tif --> second_image.tif;
second_image.tif --> first_image.tif;
```

The history can be created like this:

```python
from osgeo import gdal
import numpy as np
from eo_history import history


# for demonstration we create three identical images
driver = gdal.GetDriverByName("GTiff")
dst_ds = driver.Create("first_image.tif", xsize=9, ysize=9, bands=1, eType=gdal.GDT_Byte)
opt = {}
opt["DESCRIPTION"] = "Our source image"

# this image has no parents, so this is an empty list
history.insertMetadataDataset(dst_ds, [], opt)
dst_ds = None

# imagine this image was the input to a second image
dst_ds = driver.Create("second_image.tif", xsize=9, ysize=9, bands=1, eType=gdal.GDT_Byte)
opt = {}
opt["DESCRIPTION"] = "This image was created using first_image.tif"
# this image has a single parent
history.insertMetadataDataset(dst_ds, ["first_image.tif"], opt)
dst_ds = None

# and our final image
dst_ds = driver.Create("final_image.tif", xsize=9, ysize=9, bands=1, eType=gdal.GDT_Byte)
opt = {}
opt["DESCRIPTION"] = "This image is the result of a chain of processing"
# this image has a single parent
history.insertMetadataDataset(dst_ds, ["second_image.tif"], opt)
dst_ds = None
```

The metadata of the final image includes information on its input:

```python
histobj = history.readTreeFromFilename("final_image.tif")
histobj.directparents.keys()
# dict_keys(['second_image.tif 2024-12-15 17:44:53'])

```

It also includes dictionaries of the earlier files and of the relationships between them:

```python
histobj.files.keys()
# dict_keys(['first_image.tif 2024-12-15 17:44:53'])
histobj.relationships
# {('second_image.tif 2024-12-15 17:44:53',
#  'first_image.tif 2024-12-15 17:44:53'): None}
```
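
The direct parents and the relationships can be combined to walk the whole ancestry. The sketch below works on plain dictionaries shaped like the output above, so it runs without `eo_history` installed. It assumes each key in `relationships` is a `(child, parent)` tuple, which is inferred from the example rather than confirmed by the package, and `ancestry_edges` is a hypothetical helper, not part of the library.

```python
from collections import deque


def ancestry_edges(direct_parents, relationships):
    """Return (child, parent) edges reachable from the queried file.

    direct_parents: iterable of keys, like histobj.directparents.keys()
    relationships:  dict keyed by (child, parent) tuples (assumed order)
    """
    # Index the relationship keys by child for quick lookup.
    parents_of = {}
    for child, parent in relationships:
        parents_of.setdefault(child, []).append(parent)

    edges = []
    seen = set(direct_parents)
    queue = deque(direct_parents)
    # Breadth-first walk from the direct parents back to the oldest ancestors.
    while queue:
        child = queue.popleft()
        for parent in parents_of.get(child, []):
            edges.append((child, parent))
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return edges


# Data copied from the example output above.
direct = ["second_image.tif 2024-12-15 17:44:53"]
rels = {
    ("second_image.tif 2024-12-15 17:44:53",
     "first_image.tif 2024-12-15 17:44:53"): None,
}
for child, parent in ancestry_edges(direct, rels):
    print(f"{child} <- {parent}")
```

With a real history object, the same function could be called with `histobj.directparents.keys()` and `histobj.relationships`, provided the tuple order matches the assumption.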

The package tries to capture some basic information automatically, and users
can add any extra data they like by passing an optional dictionary
to `history.insertMetadataDataset`.

The standard captured data includes:

```python
list(histobj.thismeta.keys())
# ['timestamp',
#  'login',
#  'uname_os',
#  'uname_host',
#  'uname_release',
#  'uname_version',
#  'uname_machine',
#  'cwd',
#  'python_version',
#  'commandline',
#  'script',
#  'script_dir',
#  'package_version_dict',
#  'DESCRIPTION'] # user supplied
```
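
To give a feel for what these fields hold, here is a minimal standard-library sketch that gathers similar values. It is not the package's implementation: the mapping from each key to a particular call is an assumption, `collect_basic_metadata` is a hypothetical name, the timestamp format is copied from the example output above, and `package_version_dict` is left out.

```python
import getpass
import os
import platform
import sys
from datetime import datetime


def collect_basic_metadata():
    """Gather fields resembling the standard keys listed above."""
    uname = platform.uname()
    try:
        login = getpass.getuser()
    except Exception:
        # No resolvable user name (for example in some containers).
        login = "unknown"
    script = sys.argv[0] if sys.argv else ""
    return {
        "timestamp": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "login": login,
        "uname_os": uname.system,
        "uname_host": uname.node,
        "uname_release": uname.release,
        "uname_version": uname.version,
        "uname_machine": uname.machine,
        "cwd": os.getcwd(),
        "python_version": platform.python_version(),
        "commandline": " ".join(sys.argv),
        "script": os.path.basename(script),
        "script_dir": os.path.dirname(os.path.abspath(script)) if script else "",
    }


meta = collect_basic_metadata()
# User-supplied entries sit alongside the automatic ones.
meta.update({"DESCRIPTION": "Our source image"})
print(sorted(meta))
```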


## Command-line tools

Two command-line tools are distributed with this package.

### simple_history

Shows the top-level history of a file: its description, creation date,
command used, and direct parents. For example:

```sh
simple_history final_image.tif
```

This produces:

```sh
Description:
This image is the result of a chain of processing
Creation date:

        2025-01-22 13:30:30

Command used:

        ipython

Direct Parents:

 ─┬ final_image.tif
  └──── second_image.tif
```

Both tools also accept GDAL virtual filesystem paths, so remote files can be
inspected without downloading them first:

```sh
simple_history /vsicurl/https://example.org/image.tif
```
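
If the file locations come from a script, a small helper can add GDAL's `/vsicurl/` prefix to plain HTTP(S) URLs before they are passed to either tool. This is only a sketch: `to_gdal_path` is a hypothetical name, not part of `eo-history`, and it leaves local paths and existing `/vsi...` paths unchanged.

```python
def to_gdal_path(location):
    """Map an http(s) URL to a /vsicurl/ path; leave other inputs unchanged."""
    if location.startswith(("http://", "https://")):
        return "/vsicurl/" + location
    return location


for item in ["https://example.org/image.tif", "final_image.tif"]:
    print(to_gdal_path(item))
```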

### history_view

Opens a full interactive graph of the processing history in a web browser.
Every ancestor file appears as a node; edges flow from oldest (top) to newest
(bottom). Clicking a node shows its complete metadata in a side panel.

```sh
history_view final_image.tif
history_view /vsicurl/https://example.org/image.tif
```

Use `--output` / `-o` to write the HTML to a specific file instead of opening
the browser directly:

```sh
history_view -o history.html final_image.tif
```

For example,

```sh
history_view -o cemsre_t52jes_20260318_aj0m2.html cemsre_t52jes_20260318_aj0m2.img
```

will produce [`cemsre_t52jes_20260318_aj0m2.html`](https://jrsrp.gitlab.io/sys/eo-history/cemsre_t52jes_20260318_aj0m2.html).
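
When several files each need a saved report, the `-o` option can be driven from a short script. The sketch below only builds and prints the commands, so it runs without `history_view` installed. `history_view_command` is a hypothetical helper, and naming each HTML file after the image stem is an assumption that mirrors the example above.

```python
import shlex
from pathlib import Path


def history_view_command(image, out_dir="."):
    """Build the argv that writes <image stem>.html into out_dir."""
    image = Path(image)
    html = Path(out_dir) / (image.stem + ".html")
    return ["history_view", "-o", str(html), str(image)]


for name in ["first_image.tif", "second_image.tif", "final_image.tif"]:
    # Printed for inspection; pass the list to subprocess.run(..., check=True)
    # to execute it once history_view is on your PATH.
    print(shlex.join(history_view_command(name)))
```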



The toolbar provides several controls for navigating complex histories:

| Control | Description |
|---|---|
| Depth slider | Show only the N most-recent levels of ancestry |
| Hierarchical | Toggle between depth-locked rows and free positioning |
| Untangle | Enable physics to spread overlapping nodes; click Lock to freeze |
| Reset | Snap all nodes back to their original depth-based positions |
| Fit | Zoom to fit all visible nodes on screen |

You can also run it with the `--serve` option, which serves the HTML from a
lightweight local web server instead of opening a browser (useful over SSH).
Use `--port` to change the port from the default of 8000.

The full usage is:

```sh
Usage: history_view [OPTIONS] FILENAME

  Display the full processing history of a spatial file as an interactive graph in a web browser.

  Nodes represent files; edges flow from oldest ancestor (top) to the queried file (bottom).  Click a node to see its
  full metadata in the side panel.

  Controls:   Depth slider — show only the N most-recent levels of ancestry   Untangle     — enable physics to spread
  overlapping nodes; click Lock to freeze   Fit          — zoom to fit all visible nodes

  Works with local files and GDAL virtual filesystem paths (e.g. /vsicurl/https://... or /vsis3/...).

  Example:

  \b $ history_view final_image.tif $ history_view /vsicurl/https://example.org/image.tif $ history_view -o out.html
  final_image.tif $ history_view --serve final_image.tif $ history_view --serve --port 9000 final_image.tif

Options:
  -o, --output TEXT   Write HTML to this file instead of opening in browser.
  -s, --serve         Serve the HTML via a local web server (useful over SSH).
  -p, --port INTEGER  Port to use with --serve.  [default: 8000]
  --help              Show this message and exit.
```

