zea.data.dataloader
H5 dataloader for loading images from zea datasets.
Functions
generate_h5_indices – Generate indices for h5 files.
Classes
H5Generator – H5Generator class for iterating over hdf5 files in an advanced way.
- class zea.data.dataloader.H5Generator(file_paths, key='data/image', n_frames=1, shuffle=True, return_filename=False, limit_n_samples=None, limit_n_frames=None, seed=None, cache=False, additional_axes_iter=None, sort_files=True, overlapping_blocks=False, initial_frame_axis=0, insert_frame_axis=True, frame_index_stride=1, frame_axis=-1, validate=True, **kwargs)
Bases: Dataset
H5Generator class for iterating over hdf5 files in an advanced way. Mostly used internally; you might want to use the Dataloader class instead. Loads one item at a time. Always outputs numpy arrays.
Initializes the Dataset.
- Parameters:
file_paths (List[str]) – Path(s) to the folder(s) containing the HDF5 file(s), or a list of HDF5 file paths. Can be a mixed list of folders and files.
validate (bool) – Whether to validate the dataset. Defaults to True.
directory_splits (list, optional) – List of directory splits: floats between 0 and 1, with the same length as the number of file_paths given. If None, all files in file_paths are used.
- load(file, key, indices=None)
Extract data from hdf5 file.
- Parameters:
file_name (str) – Name of the file to extract the image from.
key (str) – Key of the hdf5 dataset to grab data from.
indices (Union[Tuple[Union[list, slice, int], ...], List[int], int, None]) – Indices to extract the image from (tuple of slices).
- Returns:
Image extracted from the hdf5 file and indexed by indices.
- Return type:
np.ndarray
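To make the role of indices more concrete, here is a minimal sketch of how a tuple of per-axis selectors picks a block of frames. It uses an in-memory NumPy array as a stand-in for the HDF5 dataset; the array shape and variable names are assumptions for illustration, and this is not the library's implementation of load.

```python
import numpy as np

# In-memory stand-in for an HDF5 dataset: 4 frames of 3 x 2 (assumed shape)
data = np.arange(4 * 3 * 2).reshape(4, 3, 2)

# One index tuple: frames 1 and 2 on the first axis, other axes kept whole
indices = (range(1, 3), slice(None, 3, None), slice(None, 2, None))

# Indexing with the tuple returns the selected block as a numpy array
block = data[indices]
print(block.shape)  # (2, 3, 2)
```

The first element selects frames, and the remaining slices keep the other axes whole, which mirrors the shape of the index tuples shown for generate_h5_indices.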
- zea.data.dataloader.generate_h5_indices(file_paths, file_shapes, n_frames, frame_index_stride, key='data/image', initial_frame_axis=0, additional_axes_iter=None, sort_files=True, overlapping_blocks=False, limit_n_frames=None)
Generate indices for h5 files.
Generates a list of indices to extract images from hdf5 files. Length of this list is the length of the extracted dataset.
- Parameters:
file_paths (List[str]) – List of file paths.
file_shapes (list) – List of file shapes.
n_frames (int) – Number of frames to load from each hdf5 file.
frame_index_stride (int) – Interval between frames to load.
key (str) – Key of hdf5 dataset to grab data from. Defaults to "data/image".
initial_frame_axis (int) – Axis to iterate over. Defaults to 0.
additional_axes_iter (Optional[List[int]]) – Additional axes to iterate over in the dataset. Defaults to None.
sort_files (bool) – Sort files by number. Defaults to True.
overlapping_blocks (bool) – Will take n_frames from the sequence, then move by 1. Defaults to False.
limit_n_frames (int | None) – Limit the number of frames to load from each file. This means only limit_n_frames frames per data file will be used. These will be the first frames in the file. Defaults to None.
- Returns:
List of tuples with indices to extract images from hdf5 files. Each tuple is (file_name, key, indices), with indices being a tuple of slices.
- Return type:
list
Example
    [
        (
            "/folder/path_to_file.hdf5",
            "data/image",
            (range(0, 1), slice(None, 256, None), slice(None, 256, None)),
        ),
        (
            "/folder/path_to_file.hdf5",
            "data/image",
            (range(1, 2), slice(None, 256, None), slice(None, 256, None)),
        ),
        ...,
    ]
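As a rough illustration of how n_frames, frame_index_stride, overlapping_blocks and limit_n_frames could interact, the sketch below builds index tuples of the same shape as the example. The function sketch_h5_indices is hypothetical and is not zea's implementation. It assumes that frames lie along axis 0, that no additional axes are iterated, and that one block spans n_frames * frame_index_stride frames.

```python
def sketch_h5_indices(
    file_paths,
    file_shapes,
    n_frames,
    frame_index_stride=1,
    key="data/image",
    overlapping_blocks=False,
    limit_n_frames=None,
):
    """Hypothetical re-implementation for illustration; not zea's code.

    Assumes frames lie along axis 0 and no additional axes are iterated.
    """
    # Frames spanned by one block when taking every `frame_index_stride`-th frame
    block_size = n_frames * frame_index_stride
    # Overlapping blocks advance by one frame, otherwise by a whole block
    step = 1 if overlapping_blocks else block_size

    indices = []
    for path, shape in zip(file_paths, file_shapes):
        n_total = shape[0]
        if limit_n_frames is not None:
            # Only the first `limit_n_frames` frames of the file are considered
            n_total = min(n_total, limit_n_frames)
        # Keep every non-frame axis whole
        rest = tuple(slice(None, size, None) for size in shape[1:])
        for start in range(0, n_total - block_size + 1, step):
            frames = range(start, start + block_size, frame_index_stride)
            indices.append((path, key, (frames,) + rest))
    return indices


example = sketch_h5_indices(
    ["/folder/path_to_file.hdf5"], [(4, 256, 256)], n_frames=1, frame_index_stride=1
)
print(len(example))  # 4
print(example[0])
```

Under these assumptions, a file with 4 frames of 256 x 256 yields one entry per frame, and setting overlapping_blocks=True with n_frames=2 would yield blocks starting at every frame rather than at every second one.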