oceanum.datamesh.Connector

class oceanum.datamesh.Connector(token=None, service='https://datamesh.oceanum.io', gateway=None)

Datamesh connector class.

All datamesh operations are methods of this class.

Attributes

host

Datamesh host

Methods

__init__(token=None, service='https://datamesh.oceanum.io', gateway=None)

Datamesh connector constructor

Parameters
  • token (string) – Your datamesh access token. Defaults to os.environ.get("DATAMESH_TOKEN", None).

  • service (string, optional) – URL of datamesh service. Defaults to os.environ.get("DATAMESH_SERVICE", "https://datamesh.oceanum.io").

  • gateway (string, optional) – URL of gateway service. Defaults to os.environ.get("DATAMESH_GATEWAY", "https://gateway.<datamesh_service_domain>").

Raises

ValueError – Missing or invalid arguments
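The environment fallbacks listed above can be mimicked in plain Python. This is a minimal sketch, not the library's implementation: resolve_connector_settings is a hypothetical helper, and treating a missing token as a ValueError is an assumption drawn from the Raises entry. The gateway default is left out because its derivation from the service domain is not spelled out here.

```python
import os

DEFAULT_SERVICE = "https://datamesh.oceanum.io"


def resolve_connector_settings(token=None, service=None, environ=None):
    """Hypothetical helper mirroring the documented environment fallbacks."""
    env = os.environ if environ is None else environ
    # An explicit argument wins; otherwise fall back to the environment.
    token = token or env.get("DATAMESH_TOKEN")
    service = service or env.get("DATAMESH_SERVICE", DEFAULT_SERVICE)
    if not token:
        # Assumption: a missing token counts as a "missing or invalid
        # argument" and is reported with ValueError.
        raise ValueError("A datamesh token is required")
    return {"token": token, "service": service}
```

The returned dictionary could then be unpacked into the constructor, for example Connector(**resolve_connector_settings()).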

delete_datasource(datasource_id)

Delete a datasource from datamesh. This will delete the datamesh registration and any stored data.

Parameters

datasource_id (string) – Unique datasource id

Returns

True if the datasource was deleted successfully

Return type

boolean

async delete_datasource_async(datasource_id)

Asynchronously delete a datasource from datamesh. This will delete the datamesh registration and any stored data.

Parameters

datasource_id (string) – Unique datasource id

Returns

True if the datasource was deleted successfully

Return type

boolean

get_catalog(search=None, timefilter=None, geofilter=None)

Get datamesh catalog

Parameters
  • search (string, optional) – Search string for filtering datasources

  • timefilter (Union[oceanum.datamesh.query.TimeFilter, list], optional) – Time filter as a valid Query TimeFilter or a list of [start, end]

  • geofilter (Union[oceanum.datamesh.query.GeoFilter, dict, shapely.geometry], optional) – Spatial filter as a valid Query GeoFilter, a GeoJSON geometry as a dict, or a shapely Geometry

Returns

A datamesh catalog instance

Return type

oceanum.datamesh.Catalog
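The list and dict forms of the two filters can be assembled without importing the query classes. The sketch below is illustrative only: build_catalog_filters is a hypothetical helper, the bounding-box arguments are invented for the example, and the element type accepted in the [start, end] list (ISO 8601 strings here) is an assumption.

```python
def build_catalog_filters(start, end, west, south, east, north, search=None):
    """Hypothetical helper: assemble keyword arguments for get_catalog."""
    # GeoJSON polygon ring: counter-clockwise and closed, so the first
    # position is repeated as the last one.
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],
    ]
    return {
        "search": search,
        # Time filter in its list form: [start, end].
        "timefilter": [start, end],
        # Spatial filter in its GeoJSON-geometry-as-dict form.
        "geofilter": {"type": "Polygon", "coordinates": [ring]},
    }
```

A call might then look like connector.get_catalog(**build_catalog_filters("2020-01-01T00:00:00Z", "2020-02-01T00:00:00Z", 170.0, -40.0, 175.0, -35.0)), where connector is an existing Connector instance.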

async get_catalog_async(filter={})

Get datamesh catalog asynchronously

Parameters
  • filter (dict, optional) – Set of filters to apply. Defaults to {}.

  • loop – event loop. default=None will use asyncio.get_running_loop()

  • executor – concurrent.futures.Executor instance. default=None will use the default executor

Returns

A datamesh catalog instance

Return type

Coroutine<oceanum.datamesh.Catalog>

get_datasource(datasource_id)

Get a Datasource instance from the datamesh. This does not load the actual data.

Parameters

datasource_id (string) – Unique datasource id

Returns

A datasource instance

Return type

oceanum.datamesh.Datasource

Raises

DatameshConnectError – Datasource cannot be found or is not authorized for the datamesh key
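Because the lookup raises rather than returning an empty value, an existence check has to catch the error. The following is a rough sketch: find_datasource is a hypothetical helper, and since the import path of DatameshConnectError is not shown here, the exception classes to catch are passed in by the caller. The default of Exception is deliberately broad and is an assumption of this example.

```python
def find_datasource(connector, datasource_id, not_found=(Exception,)):
    """Hypothetical helper: return the Datasource, or None if lookup fails.

    `not_found` is a tuple of exception classes treated as "not found or
    not authorized"; pass the library's DatameshConnectError when available.
    """
    try:
        # Fetches the Datasource description only; no data is loaded.
        return connector.get_datasource(datasource_id)
    except not_found:
        return None
```

Narrowing not_found to the library's own error class keeps unrelated failures, such as network problems, from being silently reported as a missing datasource.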

async get_datasource_async(datasource_id)

Get a Datasource instance from the datamesh asynchronously. This does not load the actual data.

Parameters
  • datasource_id (string) – Unique datasource id

  • loop – event loop. default=None will use asyncio.get_running_loop()

  • executor – concurrent.futures.Executor instance. default=None will use the default executor

Returns

A datasource instance

Return type

Coroutine<oceanum.datamesh.Datasource>

Raises

DatameshConnectError – Datasource cannot be found or is not authorized for the datamesh key

load_datasource(datasource_id, parameters={}, use_dask=True)

Load a datasource into the work environment. For datasources that load into DataFrames or GeoDataFrames, this returns an in-memory instance of the DataFrame. For datasources that load into an xarray Dataset, an open zarr-backed dataset is returned.

Parameters
  • datasource_id (string) – Unique datasource id

  • parameters (dict) – Additional datasource parameters

  • use_dask (bool, optional) – Load datasource as a dask enabled datasource if possible. Defaults to True.

Returns

The datasource container

Return type

Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset]
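Since the return value is either already in memory (DataFrame, GeoDataFrame) or an open zarr-backed xarray Dataset, code that needs the values in memory has to handle both cases. A minimal sketch follows, assuming the returned Dataset exposes xarray's standard load() method. load_in_memory is a hypothetical helper, not part of the connector.

```python
def load_in_memory(connector, datasource_id, parameters=None):
    """Hypothetical helper: load a datasource and force it into memory."""
    data = connector.load_datasource(
        datasource_id, parameters=parameters or {}, use_dask=False
    )
    # xarray.Dataset has a callable load() that reads the backing store;
    # DataFrames have no such method and are returned unchanged.
    loader = getattr(data, "load", None)
    if callable(loader):
        return loader()
    return data
```

The callable check matters for tabular data: a DataFrame with a column named "load" would expose it as an attribute, but that attribute is not callable, so the frame is still returned as-is.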

async load_datasource_async(datasource_id, parameters={}, use_dask=True)

Load a datasource asynchronously into the work environment

Parameters
  • datasource_id (string) – Unique datasource id

  • parameters (dict) – Additional datasource parameters

  • use_dask (bool, optional) – Load datasource as a dask-enabled datasource if possible. Defaults to True.

  • loop – event loop. default=None will use asyncio.get_running_loop()

  • executor – concurrent.futures.Executor instance. default=None will use the default executor

Returns

The datasource container

Return type

Coroutine<Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset]>

query(query=None, *, use_dask=True, **query_keys)

Make a datamesh query

Parameters

query (Union[oceanum.datamesh.Query, dict]) – Datamesh query as a query object or a valid query dictionary

Kwargs:

  • use_dask (bool, optional) – Load datasource as a dask-enabled datasource if possible. Defaults to True.

  • **query_keys – Keyword form of the query, for example datamesh.query(datasource="my_datasource")

Returns

The datasource container

Return type

Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset]
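The dictionary form and the keyword form describe the same query. As a hedged sketch, query_datasource below is a hypothetical wrapper that builds the dictionary form. Only the datasource key is taken from this page; any extra keys must be fields that oceanum.datamesh.Query accepts, which are not listed here.

```python
def query_datasource(connector, datasource_id, use_dask=True, **extra_keys):
    """Hypothetical wrapper: run a query given in its dictionary form."""
    # "datasource" is the only key shown in the documentation above;
    # extra_keys are passed through untouched and are the caller's
    # responsibility to get right.
    query = {"datasource": datasource_id, **extra_keys}
    return connector.query(query, use_dask=use_dask)
```

With no extra keys, query_datasource(connector, "my_datasource") is intended to match the keyword form connector.query(datasource="my_datasource").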

async query_async(query, *, use_dask=True, **query_keys)

Make a datamesh query asynchronously

Parameters

query (Union[oceanum.datamesh.Query, dict]) – Datamesh query as a query object or a valid query dictionary

Kwargs:

  • use_dask (bool, optional) – Load datasource as a dask-enabled datasource if possible. Defaults to True.

  • loop – event loop. default=None will use asyncio.get_running_loop()

  • executor – concurrent.futures.Executor instance. default=None will use the default executor

  • **query_keys – Keyword form of the query, for example datamesh.query(datasource="my_datasource")

Returns

The datasource container

Return type

Coroutine<Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset]>
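One reason to prefer the coroutine form is to run several queries concurrently. The sketch below uses asyncio.gather for that. query_many is a hypothetical helper, and whether the service benefits from concurrent requests is an assumption of this example.

```python
import asyncio


async def query_many(connector, queries, use_dask=True):
    """Hypothetical helper: run several datamesh queries concurrently."""
    # gather() schedules every coroutine at once and returns the results
    # in the same order as the input queries.
    return await asyncio.gather(
        *(connector.query_async(q, use_dask=use_dask) for q in queries)
    )
```

From synchronous code this could be driven with asyncio.run(query_many(connector, [{"datasource": "a"}, {"datasource": "b"}])), where the datasource ids are placeholders.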

write_datasource(datasource_id, data, geometry=None, append=None, overwrite=False, **properties)

Write a datasource to datamesh from the work environment

Parameters
  • datasource_id (string) – Unique datasource id

  • data (Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset, None]) – The data to be written to datamesh. If data is None, just update metadata properties.

  • geometry (oceanum.datasource.Geometry, optional) – GeoJSON geometry of the datasource

  • append (string, optional) – Coordinate to append on. default=None

  • overwrite (bool, optional) – Overwrite existing datasource. default=False

  • **properties – Additional properties for the datasource - see oceanum.datamesh.Datasource

Returns

The datasource instance that was written to

Return type

oceanum.datamesh.Datasource
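Passing None as data turns the call into a metadata-only update, as noted in the data parameter above. A minimal sketch of that pattern follows. update_metadata is a hypothetical helper, and the property names a caller passes (for example description) are assumptions that must match fields of oceanum.datamesh.Datasource.

```python
def update_metadata(connector, datasource_id, **properties):
    """Hypothetical helper: update datasource properties without writing data."""
    if not properties:
        # Guard added for this example; not a documented library check.
        raise ValueError("Provide at least one property to update")
    # data=None: only the metadata properties are updated.
    return connector.write_datasource(datasource_id, None, **properties)
```

A call such as update_metadata(connector, "my_datasource", description="Revised description") would leave any stored data untouched and return the Datasource instance that was written to.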

async write_datasource_async(datasource_id, data, append=None, overwrite=False, **properties)

Write a datasource to datamesh from the work environment asynchronously

Parameters
  • datasource_id (string) – Unique datasource id

  • data (Union[pandas.DataFrame, geopandas.GeoDataFrame, xarray.Dataset, None]) – The data to be written to datamesh. If data is None, just update metadata properties.

  • geometry (oceanum.datasource.Geometry) – GeoJSON geometry of the datasource

  • append (string, optional) – Coordinate to append on. default=None

  • overwrite (bool, optional) – Overwrite existing datasource. default=False

  • **properties – Additional properties for the datasource - see oceanum.datamesh.Datasource constructor

Returns

The datasource instance that was written to

Return type

Coroutine<oceanum.datamesh.Datasource>