RemotePathDataset

class pyremotedata.dataloader.RemotePathDataset(remote_path_iterator: RemotePathIterator, prefetch: int = 64, transform: Callable | None = None, target_transform: Callable | None = None, device: device | None = None, dtype: dtype | None = None, hierarchical: int = 0, hierarchy_parser: Callable | None = None, shuffle: bool = False, return_remote_path: bool = False, return_local_path: bool = False, verbose: bool = False)

Bases: IterableDataset

Creates a PyTorch dataset from a RemotePathIterator.

By default, the dataset yields each image as a tensor, with its remote path (a string) as the label.

Hierarchical mode

If hierarchical >= 1, the dataset is in “hierarchical mode” and yields the image as a tensor and the label as a list of integers (one class index per level of the hierarchy).
The class_handles property provides the class-to-index mappings for the dataset.
By default, the dataset uses a parser that assumes the hierarchical levels are encoded as directories in the remote path, like so:
…/level_n/…/level_1/level_0/image.jpg
where n = hierarchical - 1 and level_0 is the leaf level.
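
The directory layout above can be illustrated with a small, standalone sketch. It is not the library's default parser: the function names parse_levels and build_class_indices, the leaf-first ordering, and the sorted class-to-index mapping are all assumptions made for illustration, and the real structure of class_handles may differ.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Sequence, Tuple


def parse_levels(remote_path: str, hierarchical: int) -> List[str]:
    """Hypothetical parser: return the `hierarchical` directory names
    closest to the file, ordered leaf first (level_0, level_1, ...)."""
    if hierarchical < 1:
        raise ValueError("hierarchical must be >= 1")
    path = PurePosixPath(remote_path)
    directories = path.parent.parts
    if path.anchor:
        # Drop the root ("/") so it is never mistaken for a level.
        directories = directories[1:]
    if len(directories) < hierarchical:
        raise ValueError(
            f"{remote_path!r} has fewer than {hierarchical} directory levels"
        )
    # The last directory is level_0 (the leaf), so reverse the tail.
    return list(reversed(directories[-hierarchical:]))


def build_class_indices(
    paths: Sequence[str], hierarchical: int
) -> Tuple[List[Dict[str, int]], List[List[int]]]:
    """Assumed mapping scheme: one name-to-index dict per level, with
    names sorted alphabetically so the indices are deterministic."""
    parsed = [parse_levels(p, hierarchical) for p in paths]
    class_to_idx = [
        {name: i for i, name in enumerate(sorted({lv[k] for lv in parsed}))}
        for k in range(hierarchical)
    ]
    labels = [
        [class_to_idx[k][lv[k]] for k in range(hierarchical)] for lv in parsed
    ]
    return class_to_idx, labels


example_paths = [
    "data/birds/sparrow/a.jpg",
    "data/birds/robin/b.jpg",
    "data/mammals/cat/c.jpg",
]
class_to_idx, labels = build_class_indices(example_paths, hierarchical=2)
```

With hierarchical=2, each label in this sketch holds two indices: the first for the leaf directory (level_0) and the second for its parent (level_1).
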
param remote_path_iterator:

The remote path iterator to create the dataset from.

type remote_path_iterator:

RemotePathIterator

param prefetch:

The number of items to prefetch from the remote path iterator.

type prefetch:

int

param transform:

A function/transform that takes in an image as a torch.Tensor and returns a transformed version.

type transform:

callable, optional

param target_transform:

A function/transform that takes in the label (after any parsing by parse_hierarchical) and returns a transformed version.

type target_transform:

callable, optional

param device:

The device to move the tensors to.

type device:

torch.device, optional

param dtype:

The data type to convert the tensors to.

type dtype:

torch.dtype, optional

param hierarchical:

The number of hierarchical levels to use for the labels. Default: 0, i.e. no hierarchy.

type hierarchical:

int, optional

param hierarchy_parser:

A function to parse the hierarchical levels from the remote path. Default: None, i.e. use the default parser.

type hierarchy_parser:

callable, optional

param return_remote_path:

Whether to return the remote path. Default: False.

type return_remote_path:

bool, optional

param return_local_path:

Whether to return the local path. Default: False.

type return_local_path:

bool, optional

param verbose:

Whether to print verbose output. Default: False.

type verbose:

bool, optional

Yields:

Tuple[torch.Tensor, Union[str, List[int]]]

A tuple containing the image as a tensor and the label (the remote path, or the class indices in hierarchical mode).

or

Tuple[torch.Tensor, Union[str, List[int]], str]

A tuple containing the image as a tensor, the label, and either the local path or the remote path, depending on which of return_local_path and return_remote_path is set.

or

Tuple[torch.Tensor, Union[str, List[int]], str, str]

A tuple containing the image as a tensor, the label, the local path, and the remote path.
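
Because the tuple length depends on return_local_path and return_remote_path, consuming code may find it convenient to normalize each item. The helper below is a hypothetical sketch (unpack_item is not part of pyremotedata). It assumes the ordering described above: image, label, then the local path, then the remote path. Plain strings stand in for tensors so that the example runs without PyTorch.

```python
from typing import Any, Dict, Optional, Sequence


def unpack_item(
    item: Sequence[Any],
    return_local_path: bool = False,
    return_remote_path: bool = False,
) -> Dict[str, Optional[Any]]:
    """Hypothetical helper: turn a 2-, 3- or 4-element item into a dict
    with fixed keys, using the same flags that were passed to the dataset."""
    expected = 2 + int(return_local_path) + int(return_remote_path)
    if len(item) != expected:
        raise ValueError(f"expected {expected} elements, got {len(item)}")
    image, label, *paths = item
    result: Dict[str, Optional[Any]] = {
        "image": image,
        "label": label,
        "local_path": None,
        "remote_path": None,
    }
    # Assumed order of the trailing elements: local path first, then remote.
    if return_local_path:
        result["local_path"] = paths.pop(0)
    if return_remote_path:
        result["remote_path"] = paths.pop(0)
    return result


# Placeholder values stand in for what the dataset would yield.
two = unpack_item(("image", [2, 0]))
three = unpack_item(("image", [2, 0], "remote/a.jpg"), return_remote_path=True)
four = unpack_item(
    ("image", [2, 0], "/tmp/a.jpg", "remote/a.jpg"),
    return_local_path=True,
    return_remote_path=True,
)
```
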