Metadata-Version: 2.2
Name: marisco
Version: 0.9.5
Summary: MARIS companion package and tutorials
Home-page: https://github.com/franckalbinet/marisco
Author: Franck Albinet, Niall Murphy
Author-email: franckalbinet@gmail.com
License: Apache Software License 2.0
Keywords: nbdev jupyter notebook python netcdf marine radioactivity data
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas
Requires-Dist: openpyxl
Requires-Dist: fastcore
Requires-Dist: rich
Requires-Dist: tqdm
Requires-Dist: netcdf4
Requires-Dist: tomli
Requires-Dist: tomli-w
Requires-Dist: shapely
Requires-Dist: pyzotero
Requires-Dist: jellyfish
Requires-Dist: requests
Requires-Dist: pyarrow
Requires-Dist: gevent>=22.10.2
Provides-Extra: dev
Requires-Dist: nbdev; extra == "dev"
Requires-Dist: ipykernel; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# MARISCO


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

The [IAEA **M**arine **R**adioactivity **I**nformation **S**ystem
(MARIS)](https://maris.iaea.org) provides open access to radioactivity
measurements in marine environments. Developed by the [IAEA Marine
Environmental
Laboratories](https://www.iaea.org/about/organizational-structure/department-of-nuclear-sciences-and-applications/division-of-iaea-environment-laboratories)
in Monaco, MARIS offers data on seawater, biota, sediment, and suspended
matter.

This Python package includes command-line tools to convert MARIS
datasets into [`NetCDF`](https://www.unidata.ucar.edu/software/netcdf/)
or `.csv` formats, enhancing compatibility with various scientific and
data analysis software.

## Core Concept: Handlers

`marisco` is built around the concept of `handlers` - specialized
modules designed to convert MARIS datasets into NetCDF format. Each
handler is tailored to a specific data provider and implemented as a
dedicated Jupyter notebook.

### Literate Programming Approach

We’ve adopted a Literate Programming approach, which means:

1.  **Documentation**: Each handler serves as comprehensive
    documentation.
2.  **Code Reference**: The notebooks contain the actual implementation
    code.
3.  **Communication Tool**: They facilitate discussions with data
    providers about discrepancies or inconsistencies.

### Powered by nbdev

To achieve this, we leverage [nbdev](https://nbdev.fast.ai), a powerful
tool that allows us to:

1.  Write code within Jupyter notebooks
2.  Automatically export relevant parts as dedicated Python modules

This approach bridges the gap between documentation and implementation,
ensuring they remain in sync.
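As a minimal sketch of what this looks like in a notebook, the cells below use nbdev's `#| default_exp` and `#| export` directives. The module name `handlers.example` and the function `clean_nuclide_name` are hypothetical and chosen only for illustration; they are not part of `marisco`. Each `# --- notebook cell ---` marker stands for a separate notebook cell.

```python
# --- notebook cell 1 ---
#| default_exp handlers.example
# Hypothetical module name: nbdev writes exported cells to this module.

# --- notebook cell 2 ---
#| export
def clean_nuclide_name(name: str) -> str:
    "Strip whitespace, lower-case and remove hyphens (illustrative only)."
    return name.strip().lower().replace("-", "")

# --- notebook cell 3 ---
# No `#| export` directive here: this cell stays in the notebook,
# where it doubles as documentation and as a small test.
assert clean_nuclide_name(" Cs-137 ") == "cs137"
```

Running `nbdev_export` then writes the cells marked `#| export` to the corresponding Python module, while the unmarked cells remain in the notebook as narrative and checks.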

### See It in Action

For a concrete example of this approach, check out our [OSPAR dataset
handler
implementation](https://fr.anckalbi.net/marisco/handlers/ospar.html).

### List of currently available handlers

MARISCO includes a suite of specialized data handlers designed to:

- Convert provider-specific data formats into standardized MARIS NetCDF
  files
- Ensure data quality and consistency across providers
- Facilitate integration with the MARIS marine radioactivity database
- Support automated data processing workflows

The following handlers are currently implemented:

| Handler | Description | Link to Data Source |
|----|----|----|
| [MARIS Legacy](https://fr.anckalbi.net/marisco/handlers/maris_legacy.html) | All legacy MARIS datasets from the MARIS Master Database | \- |
| [HELCOM](https://fr.anckalbi.net/marisco/handlers/helcom.html) | HELCOM marine environment protection datasets | [HELCOM](https://helcom.fi/about-us) |
| [OSPAR](https://fr.anckalbi.net/marisco/handlers/ospar.html) | OSPAR marine environment datasets | [ODIMS OSPAR](https://odims.ospar.org/en/) |
| [TEPCO](https://fr.anckalbi.net/marisco/handlers/tepco.html) | TEPCO Fukushima monitoring data | [TEPCO Monitoring](https://radioactivity.nsr.go.jp/ja/list/349/list-1.html) |
| [GEOTRACES](https://fr.anckalbi.net/marisco/handlers/geotraces.html) | BODC GEOTRACES oceanographic data | [GEOTRACES IDP2021](https://www.geotraces.org/geotraces-intermediate-data-product-2021/) |

## Install

To install `marisco`, run:

``` console
pip install marisco
```

Once successfully installed, run the following command:

``` console
maris_init
```

This command:

1.  creates a `.marisco/` directory in your home directory to hold the
    configuration files and other resources listed below;
2.  creates a `configs.toml` file containing default but configurable
    settings (default paths, …);
3.  downloads several MARIS DB nomenclature/lookup tables into the
    `.marisco/lut/` directory;
4.  downloads `maris-template.nc`, the MARIS NetCDF4 template.
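
To check what `maris_init` produced, the settings file can be read back from Python. The sketch below is an assumption-laden illustration: it supposes that `configs.toml` sits directly inside `~/.marisco/`, and the helper name `load_configs` is hypothetical rather than part of the `marisco` API.

```python
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # backport, listed among marisco's dependencies


def load_configs(base_dir: Path = Path.home() / ".marisco") -> dict:
    """Return the parsed `configs.toml`, or an empty dict if it is missing."""
    cfg_path = base_dir / "configs.toml"  # assumed location, not confirmed
    if not cfg_path.exists():
        return {}
    # TOML parsers require the file to be opened in binary mode.
    with cfg_path.open("rb") as f:
        return tomllib.load(f)
```

Calling `load_configs()` returns a plain dictionary, so the default paths can be inspected before editing the file by hand.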

### Zotero API key

Upon conversion, `marisco` will automatically retrieve the bibliographic
metadata of each MARIS dataset from [Zotero](https://www.zotero.org/).
To do so, you need to define the following environment variable
`ZOTERO_API_KEY` containing the MARIS Zotero API key. Please contact the
MARIS team to get your API key.
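
A small pre-flight check can catch a missing key before a long conversion starts. This is only a sketch: the helper `get_zotero_api_key` is hypothetical and does not describe how `marisco` reads the variable internally; only the variable name `ZOTERO_API_KEY` comes from the text above.

```python
import os


def get_zotero_api_key(env=os.environ) -> str:
    """Return the Zotero API key, or raise a clear error if it is not set."""
    key = env.get("ZOTERO_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ZOTERO_API_KEY is not set; define it in your environment "
            "before running a conversion."
        )
    return key
```

Passing the environment as an argument keeps the check easy to test without touching the real shell environment.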

## Getting started

### Command line utilities

Every command accepts a `-h` flag that prints its documentation.

#### `maris_init`

Download configuration file, NetCDF MARIS template and required lookup
tables (nomenclatures).

#### `maris_to_nc`

Convert `helcom`, `geotraces`, `tepco` or `ospar` marine radioactivity
datasets to MARIS NetCDF4 format.

    usage: maris_to_nc [-h] [--src SRC] ds dest

    positional arguments:
      ds          Name of the dataset to encode as NetCDF4
      dest        Output path for NetCDF file

    options:
      -h, --help  show this help message and exit
      --src SRC   Optional input data path only required for the 'GEOTRACES' dataset

For instance: `maris_to_nc ospar 191-OSPAR-2024.nc`

#### `maris_db_to_nc`

The MARIS Master Database integrates two types of datasets:

- Historical datasets retrieved from published scientific papers
- Ongoing monitoring data from international programs like `HELCOM`,
  `OSPAR`, `TEPCO`, and `GEOTRACES`

This command-line utility converts MARIS datasets from their legacy
format to NetCDF4, making them more accessible for modern data analysis
workflows. Users can either convert the entire database or specify
particular datasets by their reference IDs for selective conversion.

    usage: maris_db_to_nc [-h] [--ref_ids REF_IDS] src dest

    Convert MARIS legacy database to NetCDF4 format. If ref_ids is provided as comma-separated values, only encodes those subsets.

    positional arguments:
      src                Path to MARIS database dump as `.txt` file
      dest               Output path for NetCDF file(s)

    options:
      -h, --help         show this help message and exit
      --ref_ids REF_IDS  Optional comma-separated reference IDs (e.g., "123,456,789") (default: )

For instance:

- `maris_db_to_nc "~/pro/data/maris/2024-11-20 MARIS_QA_shapetype_id=1.txt" ~/pro/tmp/output`  
- or
  `maris_db_to_nc "~/pro/data/maris/2024-11-20 MARIS_QA_shapetype_id=1.txt" ~/pro/tmp/output --ref_ids="16,30"`
  for a subset of the MARIS Master Database.
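
When the conversion is scripted rather than typed, the same call can be assembled from Python. The sketch below is illustrative: the helper names `build_db_to_nc_cmd` and `run_db_to_nc` are hypothetical, and only the command name and the `--ref_ids` option come from the usage text above. Because POSIX shells do not expand `~` inside double quotes, and this document does not say whether the tool expands it itself, the paths are expanded explicitly.

```python
import subprocess
from pathlib import Path
from typing import List, Optional, Sequence


def build_db_to_nc_cmd(
    src: str, dest: str, ref_ids: Optional[Sequence[int]] = None
) -> List[str]:
    """Assemble the argument list for `maris_db_to_nc`."""
    cmd = [
        "maris_db_to_nc",
        str(Path(src).expanduser()),   # expand a leading "~" explicitly
        str(Path(dest).expanduser()),
    ]
    if ref_ids:
        # --ref_ids expects comma-separated values, e.g. "16,30"
        cmd.append("--ref_ids=" + ",".join(str(i) for i in ref_ids))
    return cmd


def run_db_to_nc(src: str, dest: str, ref_ids: Optional[Sequence[int]] = None) -> None:
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_db_to_nc_cmd(src, dest, ref_ids), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces, such as the database dump in the examples above.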

## Development

The MARIS NetCDF template is generated from the
`nbs/files/cdl/maris.cdl` Common Data Language (CDL) file, a format
defined by [Unidata](https://docs.unidata.ucar.edu/). To generate the
MARIS NetCDF template `nbs/files/nc/maris-template.nc`, install the
[NetCDF-C](https://pjbartlein.github.io/REarthSysSci/install_netCDF.html)
utilities. Then, from the `Marisco` home directory, run:

``` console
ncgen -4 -o nc/maris-template.nc cdl/maris.cdl
```
