evo.objects.client.object_client.DownloadedObject
A downloaded geoscience object.
schema
property
schema: ObjectSchema
The schema of the object.
metadata
property
metadata: ObjectMetadata
The metadata of the object.
__init__
__init__(
object_: GeoscienceObject,
metadata: ObjectMetadata,
urls_by_name: dict[str, str],
connector: APIConnector,
cache: ICache | None = None,
) -> None
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| object_ | GeoscienceObject | The raw geoscience object model. | required |
| metadata | ObjectMetadata | The parsed metadata for the object. | required |
| urls_by_name | dict[str, str] | A mapping of data names to their initial download URLs. | required |
| connector | APIConnector | The API connector to use for downloading data. | required |
| cache | ICache \| None | An optional cache to use for data downloads. | None |
from_reference
async
staticmethod
from_reference(
connector: APIConnector,
reference: ObjectReference | str,
cache: ICache | None = None,
request_timeout: int | float | tuple[int | float, int | float] | None = None,
) -> DownloadedObject
Download a geoscience object from the service, given an object reference.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| connector | APIConnector | The API connector to use for downloading data. | required |
| reference | ObjectReference \| str | The reference to the object to download, or a URL as a string that can be parsed into a reference. | required |
| cache | ICache \| None | An optional cache to use for data downloads. | None |
| request_timeout | int \| float \| tuple[int \| float, int \| float] \| None | An optional timeout to use for API requests. See evo.common.APIConnector for details. | None |
Raises:
| Type | Description |
|---|---|
| ValueError | If the reference is invalid, or if the connector base URL does not match the reference hub URL. |
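As a minimal sketch of handling the `ValueError` case, the helper below wraps `from_reference` and returns `None` when the reference is rejected. The helper name `try_download` is hypothetical, the import path is taken from this page's title, and `connector` is assumed to be an already configured `APIConnector`.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


async def try_download(connector: Any, reference: str, cache: Any = None) -> Any:
    """Return the DownloadedObject for `reference`, or None if it is rejected."""
    # Imported lazily so this module can be imported without the SDK installed.
    from evo.objects.client.object_client import DownloadedObject

    try:
        return await DownloadedObject.from_reference(connector, reference, cache=cache)
    except ValueError as exc:
        # Documented as raised for an invalid reference or a hub URL mismatch.
        logger.warning("Could not download %r: %s", reference, exc)
        return None
```

From synchronous code this could be driven with `asyncio.run(try_download(connector, url))`. Whether swallowing the error is appropriate depends on the caller; re-raising is equally reasonable.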
from_context
async
staticmethod
from_context(context: IContext, reference: ObjectReference | str) -> DownloadedObject
Download a geoscience object from the service using a context.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| context | IContext | The context providing the connector and cache. | required |
| reference | ObjectReference \| str | The reference to the object to download, or a URL as a string that can be parsed into a reference. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If the reference is invalid, or if the connector base URL does not match the reference hub URL. |
as_dict
as_dict() -> dict
Get this object as a dictionary.
search
search(expression: str) -> Any
Search the object metadata using a JMESPath expression.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expression | str | The JMESPath expression to use for the search. | required |
Returns:
| Type | Description |
|---|---|
| Any | The result of the search. |
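A small sketch of using `search` with a fallback value. It assumes that an expression matching nothing yields `None`, as in standard JMESPath; this page does not confirm that for `search`. The helper name `search_or_default` is hypothetical.

```python
from typing import Any


def search_or_default(obj: Any, expression: str, default: Any = None) -> Any:
    """Run a JMESPath search on a DownloadedObject-like `obj`.

    Assumption: a non-matching expression returns None.
    """
    result = obj.search(expression)
    return default if result is None else result
```

A call might look like `search_or_default(downloaded, "some.nested.key", default="unspecified")`, where the expression is a placeholder for a path that exists in the object's schema.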
prepare_data_download
prepare_data_download(data_identifiers: Sequence[str | UUID]) -> Iterator[ObjectDataDownload]
Prepare to download multiple data files from the geoscience object service, for this object.
Any data IDs that are not associated with the requested object will raise a DataNotFoundError.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_identifiers | Sequence[str \| UUID] | A list of sha256 digests or UUIDs for the data to be downloaded. | required |
Returns:
| Type | Description |
|---|---|
| Iterator[ObjectDataDownload] | An iterator of data download contexts that can be used to download the data. |
Raises:
| Type | Description |
|---|---|
| DataNotFoundError | If any requested data ID is not associated with this object. |
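Because the method returns an iterator, it can help to materialise it up front so that any error raised while iterating surfaces in one place. The sketch below also drops repeated identifiers first; the helper name `prepare_unique_downloads` is hypothetical, and whether the service tolerates duplicates is not stated on this page.

```python
from __future__ import annotations

from typing import Any, Sequence
from uuid import UUID


def prepare_unique_downloads(obj: Any, data_identifiers: Sequence[str | UUID]) -> list:
    """Prepare downloads for each distinct identifier, in first-seen order."""
    # dict.fromkeys drops repeats while preserving order (str and UUID are hashable).
    unique_ids = list(dict.fromkeys(data_identifiers))
    # Consume the iterator here so errors raised during iteration are not deferred.
    return list(obj.prepare_data_download(unique_ids))
```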
update
async
update(
object_dict: dict,
check_for_conflict: bool = True,
request_timeout: int | float | tuple[int | float, int | float] | None = None,
) -> DownloadedObject
Update the geoscience object on the geoscience object service, returning a new DownloadedObject that represents the new version of the object.
This creates a new version of the object that fully replaces the existing properties of the object with those provided in object_dict.
Note that this does not update the DownloadedObject instance in place: it still represents the original version of the object. To work with the updated version, use the DownloadedObject returned by this method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| object_dict | dict | The new properties of the object as a dictionary. | required |
| check_for_conflict | bool | If True, and if a newer version of the object exists on the geoscience object service, the update will fail with an ObjectModifiedError exception. If False, it will not check whether there is a newer version, so will perform the update regardless. | True |
| request_timeout | int \| float \| tuple[int \| float, int \| float] \| None | An optional timeout to use for API requests. See evo.common.APIConnector for details. | None |
Returns:
| Type | Description |
|---|---|
| DownloadedObject | The new version of the object as a DownloadedObject. |
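Since `object_dict` fully replaces the object's properties, one plausible pattern is to start from `as_dict()`, apply changes to a copy, and pass the whole dictionary back. This is a sketch under that assumption; the helper name `update_with_changes` is hypothetical, and only top-level keys are merged.

```python
import copy
from typing import Any


async def update_with_changes(obj: Any, changes: dict) -> Any:
    """Create a new object version with `changes` merged over the current properties."""
    # Deep-copy so the dictionary returned by as_dict() is never mutated.
    object_dict = copy.deepcopy(obj.as_dict())
    # Shallow merge: each key in `changes` replaces the existing top-level value.
    object_dict.update(changes)
    # Keep the default conflict check so a newer remote version causes a failure.
    return await obj.update(object_dict, check_for_conflict=True)
```

The original `obj` keeps representing the version it was downloaded from; the value returned by the helper is what `update` returned.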
download_table
async
download_table(
table_info: TableInfo | str,
fb: IFeedback = NoFeedback,
*,
nan_values: list[int] | list[float] | str | None = None,
column_names: Sequence[str] | None = None,
) -> pa.Table
Download the data referenced by the given table info as a PyArrow Table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| table_info | TableInfo \| str | The table info dict, or a JMESPath expression to the table info within the object. | required |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
| nan_values | list[int] \| list[float] \| str \| None | An optional list of values to treat as null. Can also be a JMESPath expression to the list of nan values, or the nan_description structure. | None |
| column_names | Sequence[str] \| None | An optional list of column names for the table, instead of those in the Parquet file. | None |
Returns:
| Type | Description |
|---|---|
| Table | A PyArrow Table containing the downloaded data. |
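Note that `nan_values` and `column_names` follow the bare `*` in the signature, so they must be passed by keyword. A minimal sketch of a wrapper that does so; the helper name `download_renamed_table` is hypothetical.

```python
from typing import Any, Optional, Sequence


async def download_renamed_table(
    obj: Any,
    table_info: Any,
    column_names: Sequence[str],
    nan_values: Optional[list] = None,
) -> Any:
    """Download a table, overriding the column names stored in the Parquet file."""
    # Both options are keyword-only in download_table, so they are named explicitly.
    return await obj.download_table(
        table_info,
        nan_values=nan_values,
        column_names=list(column_names),
    )
```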
download_category_table
async
download_category_table(
category_info: CategoryInfo | str,
*,
nan_values: list[int] | list[float] | str | None = None,
column_names: Sequence[str] | None = None,
fb: IFeedback = NoFeedback,
) -> pa.Table
Download the data referenced by the given category info as a PyArrow Table.
The arrays in the table are DictionaryArrays constructed from the values and lookup tables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| category_info | CategoryInfo \| str | The category info dict, or a JMESPath expression to the category info within the object. | required |
| nan_values | list[int] \| list[float] \| str \| None | An optional list of values to treat as null. Can also be a JMESPath expression to the nan_description structure. | None |
| column_names | Sequence[str] \| None | An optional list of column names for the table, instead of those in the Parquet file. | None |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
Returns:
| Type | Description |
|---|---|
| Table | A PyArrow Table containing the downloaded data. |
download_attribute_table
async
download_attribute_table(attribute: AttributeInfo | str, fb: IFeedback = NoFeedback) -> pa.Table
Download the data referenced by the given attribute as a PyArrow Table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| attribute | AttributeInfo \| str | The attribute info dict, or a JMESPath expression to the attribute info within the object. | required |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
Returns:
| Type | Description |
|---|---|
| Table | A PyArrow Table containing the downloaded data. |
download_dataframe
async
download_dataframe(
table_info: TableInfo | str,
fb: IFeedback = NoFeedback,
*,
nan_values: list[int] | list[float] | str | None = None,
column_names: Sequence[str] | None = None,
) -> pd.DataFrame
Download the data referenced by the given table info as a Pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| table_info | TableInfo \| str | The table info dict, or a JMESPath expression to the table info within the object. | required |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
| nan_values | list[int] \| list[float] \| str \| None | An optional list of values to treat as null. Can also be a JMESPath expression to the nan_description structure. | None |
| column_names | Sequence[str] \| None | An optional list of column names for the table, instead of those from the Parquet file. | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | A Pandas DataFrame containing the downloaded data. |
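Because the download methods are coroutines, several tables can in principle be requested concurrently with `asyncio.gather`. The sketch below assumes that concurrent calls on one object are safe, which this page does not state; the helper name `download_frames` is hypothetical and it accepts JMESPath strings only.

```python
import asyncio
from typing import Any, Sequence


async def download_frames(obj: Any, paths: Sequence[str]) -> dict:
    """Download one DataFrame per JMESPath expression, keyed by that expression."""
    # gather runs the coroutines concurrently and returns results in input order.
    frames = await asyncio.gather(*(obj.download_dataframe(path) for path in paths))
    return dict(zip(paths, frames))
```

If any single download raises, `gather` propagates that exception to the caller by default, so partial results are not returned.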
download_category_dataframe
async
download_category_dataframe(
category_info: CategoryInfo | str,
fb: IFeedback = NoFeedback,
*,
nan_values: list[int] | list[float] | str | None = None,
column_names: Sequence[str] | None = None,
) -> pd.DataFrame
Download the data referenced by the given category info as a Pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| category_info | CategoryInfo \| str | The category info dict, or a JMESPath expression to the category info within the object. | required |
| nan_values | list[int] \| list[float] \| str \| None | An optional list of values to treat as null. Can also be a JMESPath expression to the nan_description structure. | None |
| column_names | Sequence[str] \| None | An optional list of column names for the table, instead of those from the Parquet file. | None |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
Returns:
| Type | Description |
|---|---|
| DataFrame | A Pandas DataFrame containing the downloaded data. |
download_attribute_dataframe
async
download_attribute_dataframe(attribute: AttributeInfo | str, fb: IFeedback = NoFeedback) -> pd.DataFrame
Download the data referenced by the given attribute as a Pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| attribute | AttributeInfo \| str | The attribute info dict, or a JMESPath expression to the attribute within the object. | required |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
Returns:
| Type | Description |
|---|---|
| DataFrame | A Pandas DataFrame containing the downloaded data. |
download_array
async
download_array(table_info: TableInfo | str, fb: IFeedback = NoFeedback) -> np.ndarray
Download the data referenced by the given table info as a NumPy array.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| table_info | TableInfo \| str | The table info dict, or a JMESPath expression to the table info within the object. | required |
| fb | IFeedback | An optional feedback instance to report download progress to. | NoFeedback |
Returns:
| Type | Description |
|---|---|
| ndarray | A NumPy array containing the downloaded data. |
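All of the download methods are `async`, so a plain script needs an event loop to call them. A minimal sketch of a blocking wrapper around `download_array` follows; the helper name `download_array_sync` is hypothetical.

```python
import asyncio
from typing import Any


def download_array_sync(obj: Any, table_info: Any) -> Any:
    """Blocking wrapper for scripts that are not already running an event loop."""
    # asyncio.run creates a new event loop, runs the coroutine, then closes the loop.
    return asyncio.run(obj.download_array(table_info))
```

`asyncio.run` raises `RuntimeError` when called from a running event loop (for example inside a notebook cell that is already async); in that setting, `await obj.download_array(table_info)` directly instead.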