RemotePathDataset
- class pyremotedata.dataloader.RemotePathDataset(remote_path_iterator: RemotePathIterator, prefetch: int = 64, transform: Callable | None = None, target_transform: Callable | None = None, device: device | None = None, dtype: dtype | None = None, hierarchical: int = 0, hierarchy_parser: Callable | None = None, shuffle: bool = False, return_remote_path: bool = False, return_local_path: bool = False, verbose: bool = False)
Bases: IterableDataset
Creates a PyTorch dataset from a RemotePathIterator.
By default, the dataset returns the image as a tensor and the remote path as a string.
Hierarchical mode
If hierarchical >= 1, the dataset is in “Hierarchical mode” and will return the image as a tensor and the label as a list of integers (class indices for each level in the hierarchy).
The class_handles property can be used to get the class-idx mappings for the dataset.
By default the dataset will use a parser which assumes that the hierarchical levels are encoded in the remote path as directories like so:
…/level_n/…/level_1/level_0/image.jpg
Where n = (hierarchical - 1) and level_0 is the leaf level.
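The directory layout above can be illustrated with a minimal sketch. The helpers `parse_levels` and `build_class_to_idx` are hypothetical and not part of pyremotedata. They assume forward-slash remote paths, return levels ordered from level_0 (the leaf) upward, and assign indices in sorted-name order; none of these choices is confirmed for the library's default parser or for class_handles. The example paths are made up.

```python
import posixpath


def parse_levels(remote_path: str, hierarchical: int) -> list[str]:
    """Return [level_0, level_1, ..., level_n] for a path laid out as
    .../level_n/.../level_1/level_0/image.jpg (hypothetical helper)."""
    if hierarchical < 1:
        raise ValueError("hierarchical must be >= 1")
    # Drop the file name, then split the remaining directory into components.
    directories = [part for part in posixpath.dirname(remote_path).split("/") if part]
    if len(directories) < hierarchical:
        raise ValueError(f"{remote_path!r} has fewer than {hierarchical} directory levels")
    # The last directory is level_0 (the leaf), the one before it is level_1, and so on.
    return directories[::-1][:hierarchical]


def build_class_to_idx(paths: list[str], hierarchical: int) -> list[dict[str, int]]:
    """Build one name-to-index mapping per level (assumed ordering: sorted by name)."""
    names_per_level = [set() for _ in range(hierarchical)]
    for path in paths:
        for level, name in enumerate(parse_levels(path, hierarchical)):
            names_per_level[level].add(name)
    return [{name: idx for idx, name in enumerate(sorted(names))} for names in names_per_level]


# Made-up remote paths, used only for illustration.
paths = [
    "data/Insecta/Lepidoptera/Noctuidae/img_001.jpg",
    "data/Insecta/Lepidoptera/Erebidae/img_002.jpg",
    "data/Insecta/Diptera/Syrphidae/img_003.jpg",
]
class_to_idx = build_class_to_idx(paths, hierarchical=2)

# One list of integers per image: [index at level_0, index at level_1].
labels = [
    [class_to_idx[level][name] for level, name in enumerate(parse_levels(p, 2))]
    for p in paths
]
print(labels)
```

With hierarchical=2, only the two directories closest to the file are used, so each label pairs a level_0 index with a level_1 index and the directories above them are ignored.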
- Parameters:
remote_path_iterator (RemotePathIterator) – The remote path iterator to create the dataset from.
prefetch (int) – The number of items to prefetch from the remote path iterator. Default: 64.
transform (callable, optional) – A function/transform that takes in an image as a torch.Tensor and returns a transformed version.
target_transform (callable, optional) – A function/transform that takes in the label (after potential parsing by parse_hierarchical) and transforms it.
device (torch.device, optional) – The device to move the tensors to.
dtype (torch.dtype, optional) – The data type to convert the tensors to.
hierarchical (int, optional) – The number of hierarchical levels to use for the labels. Default: 0, i.e. no hierarchy.
hierarchy_parser (callable, optional) – A function to parse the hierarchical levels from the remote path. Default: None, i.e. use the default parser.
shuffle (bool, optional) – Whether to shuffle the dataset. Default: False.
return_remote_path (bool, optional) – Whether to return the remote path. Default: False.
return_local_path (bool, optional) – Whether to return the local path. Default: False.
verbose (bool, optional) – Whether to print verbose output. Default: False.
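The general idea behind a prefetch parameter can be sketched as a bounded buffer filled by a background thread. This is a generic illustration using only the standard library, not how pyremotedata implements prefetching; the function name `prefetched` and its behaviour are assumptions made for this example.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")
_SENTINEL = object()  # marks the end of the stream


def prefetched(source: Iterable[T], prefetch: int = 64) -> Iterator[T]:
    """Yield items from `source` while a background thread keeps up to
    `prefetch` items buffered ahead of the consumer (hypothetical helper)."""
    if prefetch < 1:
        raise ValueError("prefetch must be >= 1")
    buffer: queue.Queue = queue.Queue(maxsize=prefetch)

    def producer() -> None:
        try:
            for item in source:
                buffer.put(item)  # blocks while the buffer is full
        finally:
            # Always signal the end, even if `source` raised an error.
            buffer.put(_SENTINEL)

    # Daemon thread: it will not keep the process alive if the consumer stops early.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()
        if item is _SENTINEL:
            return
        yield item


# Stand-in for a slow remote source; order is preserved because there is one producer.
items = list(prefetched(range(10), prefetch=4))
print(items)
```

A larger buffer hides more latency from the consumer at the cost of holding more items at once. Note that this sketch silently ends the stream if the source raises; a fuller version would pass the exception on to the consumer.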