zea.data.file

zea H5 file functionality.

Functions

assert_key(file, key)

Asserts that a key is present in an h5py.File.

dict_to_sorted_list(dictionary)

Convert a dictionary with sortable keys to a sorted list of values.

load_file(path[, data_type, indices, ...])

Loads a zea data file (h5py file).

load_file_all_data_types(path[, indices, ...])

Loads all data types from a zea data file (h5py file).

validate_file([path, file])

Reads the hdf5 file at the given path and validates its structure.

Classes

File(name, *args, **kwargs)

h5py.File in zea format.

class zea.data.file.File(name, *args, **kwargs)[source]

Bases: h5py.File

h5py.File in zea format.

Initialize the file.

Parameters:
  • name (str, Path, HFPath) – The path to the file, given as a string or a Path object. A string with the prefix ‘hf://’ is resolved to a Hugging Face path.

  • *args – Additional arguments to pass to h5py.File.

  • **kwargs – Additional keyword arguments to pass to h5py.File.
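
To illustrate the name handling described above, the following minimal sketch classifies a name as either a Hugging Face reference or a local path. The helper describe_name and the constant HF_PREFIX are hypothetical and not part of zea; only the ‘hf://’ prefix itself comes from the parameter description, and the actual resolution logic in zea may differ.

```python
from pathlib import Path

HF_PREFIX = "hf://"  # prefix named in the parameter description


def describe_name(name):
    """Hypothetical helper: report how a File name would be interpreted."""
    text = str(name)
    if text.startswith(HF_PREFIX):
        # Everything after the prefix identifies the remote location.
        return ("huggingface", text[len(HF_PREFIX):])
    # Anything else is treated as a path on the local file system.
    return ("local", text)


remote = describe_name("hf://zeahub/picmus")
local = describe_name(Path("data") / "example.hdf5")
```

Note that the prefix is only detected on plain strings here: wrapping an ‘hf://’ string in pathlib.Path collapses the double slash, which is one reason to pass such names as strings.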

copy_key(key, dst)[source]

Copy a specific key to another file.

Will always copy the attributes and the scan data if it exists. Will warn if the key is not in this file or if the key already exists in the destination file.

Parameters:
  • key (str) – The key to copy.

  • dst (File) – The destination file to copy the key to.

property description

Reads the description from the data file and returns it.

property event_keys

Return all events in the file.

events_have_same_shape(key)[source]

Check if all events have the same shape for a given key.

format_key(key)[source]

Format the key to match the data type.

get_event_shapes(key)[source]

Get the shapes of a key for all events.

get_parameters(event=None)[source]

Returns a dictionary of parameters for initializing the scan object that comes with the file (stored inside the data file).

If there are no scan parameters in the hdf5 file, returns an empty dictionary.

Parameters:

event (int, optional) –

Event number. When specified, an event structure is expected as follows:

event_0 / scan
event_1 / scan
...

Defaults to None. In that case no event structure is expected.

Returns:

The scan parameters.

Return type:

dict
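
The event layout described above can be sketched as a small path builder. This is an illustration only: the helper scan_group_path is hypothetical, and the assumption that the scan parameters live in a top-level group named "scan" when no event is given is not confirmed by this page.

```python
def scan_group_path(event=None):
    """Hypothetical helper: HDF5 group path that holds the scan parameters."""
    if event is None:
        # Assumption: without events, parameters sit in a top-level group.
        return "scan"
    # With events, each event has its own group: event_0/scan, event_1/scan, ...
    return f"event_{int(event)}/scan"


paths = [scan_group_path(), scan_group_path(0), scan_group_path(1)]
```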

get_probe_parameters(event=None)[source]

Returns a dictionary of probe parameters for initializing the probe object that comes with the file (stored inside the data file).

Returns:

The probe parameters.

Return type:

dict

get_scan_parameters(event=None)[source]

Returns a dictionary of scan parameters stored in the file.

Return type:

dict

classmethod get_shape(path, key)[source]

Get the shape of a key in a file.

Parameters:
  • path (str) – The path to the file.

  • key (str) – The key to get the shape of.

Returns:

The shape of the key.

Return type:

tuple

property has_events

Check if the file has events.

has_key(key)[source]

Check if the file has a specific key.

Parameters:

key (str) – The key to check.

Returns:

True if the key exists, False otherwise.

Return type:

bool

static key_to_data_type(key)[source]

Convert the key to a data type.

load_data(data_type, indices=None)[source]

Load data from the file.

The indices parameter can be used to load a subset of the data. This can be

  • 'all' or None to load all data

  • an int to load a single frame

  • a List[int] to load specific frames

  • a Tuple[Union[list, slice, int], ...] to index multiple axes (i.e. frames and transmits). Note that indexing with lists of indices for multiple axes is not supported. In that case, try to define one of the axes with a slice for optimal performance. Alternatively, slice the data after loading.

For more information on the indexing options, see indexing on ndarrays and fancy indexing in h5py.

>>> from zea import File

>>> path_to_file = (
...     "hf://zeahub/picmus/database/experiments/contrast_speckle/"
...     "contrast_speckle_expe_dataset_iq/contrast_speckle_expe_dataset_iq.hdf5"
... )

>>> with File(path_to_file, mode="r") as file:
...     # data has shape (n_frames, n_tx, n_ax, n_el, n_ch)
...     data = file.load_data("raw_data")
...     data.shape
...     # load first frame only
...     data = file.load_data("raw_data", indices=0)
...     data.shape
...     # load frame 0 and transmits 0, 2 and 4
...     data = file.load_data("raw_data", indices=(0, [0, 2, 4]))
...     data.shape
(1, 75, 832, 128, 2)
(75, 832, 128, 2)
(3, 832, 128, 2)
Parameters:
  • data_type (str) – The type of data to load. Options are ‘raw_data’, ‘aligned_data’, ‘beamformed_data’, ‘envelope_data’, ‘image’ and ‘image_sc’.

  • indices (Union[Tuple[Union[list, slice, int], ...], List[int], int, None]) – The indices to load. Defaults to None in which case all data is loaded.

Return type:

ndarray
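
The accepted forms of indices, including the restriction on using lists for more than one axis, can be summarised in a short sketch. The helper normalize_indices is hypothetical and only mirrors the rules listed above; it does not reproduce how zea processes the argument internally, and the exception types are assumptions.

```python
def normalize_indices(indices=None):
    """Hypothetical helper: turn an `indices` argument into an index tuple."""
    if indices is None or (isinstance(indices, str) and indices == "all"):
        return (slice(None),)  # load all data
    if isinstance(indices, (int, list, slice)):
        return (indices,)  # index the frame axis only
    if isinstance(indices, tuple):
        # Lists of indices on more than one axis are documented as unsupported.
        n_list_axes = sum(isinstance(axis, list) for axis in indices)
        if n_list_axes > 1:
            raise ValueError(
                "Use a list for at most one axis; use slices for the others "
                "or slice the data after loading."
            )
        return indices
    raise TypeError(f"Unsupported indices type: {type(indices).__name__}")


everything = normalize_indices("all")
first_frame = normalize_indices(0)
frame_and_transmits = normalize_indices((0, [0, 2, 4]))
```

Under these assumptions, a request such as ([0, 1], [0, 2, 4]) is rejected, while (slice(0, 2), [0, 2, 4]) is accepted because only one axis uses a list.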

load_scan(event=None)[source]

Alias for get_scan_parameters.

load_transmits(key, selected_transmits)[source]

Load raw_data or aligned_data for a given list of transmits.

Parameters:
  • key (str) – The type of data to load. Options are ‘raw_data’ and ‘aligned_data’.

  • selected_transmits (list, np.ndarray) – The transmits to load.

property n_frames

Return number of frames in a file.

property name

Return the name of the file.

property path

Return the path of the file.

probe(event=None)[source]

Returns a Probe object initialized with the parameters from the file.

Parameters:

event (int, optional) –

Event number. When specified, an event structure is expected as follows:

event_0 / scan
event_1 / scan
...

Defaults to None. In that case, no event structure is expected.

Returns:

The probe object.

Return type:

Probe

property probe_name

Reads the probe name from the data file and returns it.

recursively_load_dict_contents_from_group(path)[source]

Load a dictionary from the contents of a group.

Values inside the group are converted to numpy arrays or primitive types (int, float, str).

Parameters:

path (str) – Path to the group.

Returns:

Dictionary with the contents of the group.

Return type:

dict

scan(event=None, safe=True, **kwargs)[source]

Returns a Scan object initialized with the parameters from the file.

Parameters:
  • event (int, optional) –

    Event number. When specified, an event structure is expected as follows:

    event_0 / scan
    event_1 / scan
    ...
    

    Defaults to None. In that case no event structure is expected.

  • safe (bool, optional) – If True, will only use parameters that are defined in the Scan class. If False, will use all parameters from the file. Defaults to True.

  • **kwargs – Additional keyword arguments to pass to the Scan object. These will override the parameters from the file if they are present in the file.

Returns:

The scan object.

Return type:

Scan
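
The interaction between safe and **kwargs described above can be sketched with plain dictionaries. The helper build_scan_kwargs, the allowed set, and the parameter names used below are hypothetical stand-ins; in particular, this sketch assumes that keyword overrides are applied after filtering, which this page does not confirm.

```python
def build_scan_kwargs(file_params, allowed, safe=True, **overrides):
    """Hypothetical helper mirroring the documented safe / **kwargs behaviour."""
    if safe:
        # Keep only parameters the (hypothetical) Scan class defines.
        params = {k: v for k, v in file_params.items() if k in allowed}
    else:
        # Use every parameter stored in the file.
        params = dict(file_params)
    # Keyword arguments override values that came from the file.
    params.update(overrides)
    return params


file_params = {"n_tx": 75, "vendor_field": 1}  # illustrative file contents
allowed = {"n_tx", "sound_speed"}  # illustrative Scan parameter names

safe_params = build_scan_kwargs(file_params, allowed, n_tx=3)
all_params = build_scan_kwargs(file_params, allowed, safe=False)
```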

shape(key)[source]

Return the shape of a key, or the shapes for all events.

Return type:

tuple

property stem

Return the stem of the file.

summary()[source]

Print the contents of the file.

to_iterator(key)[source]

Convert the data to an iterator over all frames.

validate()[source]

Validate the file structure.

Returns:

A dictionary with the validation results.

Return type:

dict

zea.data.file.assert_key(file, key)[source]

Asserts that a key is present in an h5py.File.

zea.data.file.dict_to_sorted_list(dictionary)[source]

Convert a dictionary with sortable keys to a sorted list of values.

Note

This function operates on the top level of the dictionary only. If the dictionary contains nested dictionaries, those will not be sorted.

Example

>>> from zea.data.file import dict_to_sorted_list
>>> input_dict = {"number_000": 5, "number_001": 1, "number_002": 23}
>>> dict_to_sorted_list(input_dict)
[5, 1, 23]
Parameters:

dictionary (dict) – The dictionary to convert. The keys must be sortable.

Returns:

The sorted list of values.

Return type:

list
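
The documented behaviour can be approximated in a few lines of plain Python. The function sorted_values below is a hypothetical stand-in written for illustration, not the zea implementation: it orders the values by their sorted keys and, like the note above says, touches only the top level.

```python
def sorted_values(dictionary):
    """Values of `dictionary`, ordered by sorted key (top level only)."""
    return [dictionary[key] for key in sorted(dictionary)]


# Insertion order does not matter; the keys decide the output order.
frames = {"number_002": 23, "number_000": 5, "number_001": 1}
values = sorted_values(frames)

# String keys sort lexicographically in this sketch, so zero-padding matters.
unpadded = sorted_values({"number_2": "b", "number_10": "a"})
```

In this sketch, unpadded keys such as "number_10" sort before "number_2", which is why zero-padded keys like those in the example are the safer choice.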

zea.data.file.load_file(path, data_type='raw_data', indices=None, scan_kwargs=None)[source]

Loads a zea data file (h5py file).

Returns the data together with a scan object containing the parameters of the acquisition and a probe object containing the parameters of the probe.

Additionally, it can load a specific subset of frames / transmits.

The indices parameter can be used to load a subset of the data. This can be

  • 'all' or None to load all data

  • an int to load a single frame

  • a List[int] to load specific frames

  • a Tuple[Union[list, slice, int], ...] to index multiple axes (i.e. frames and transmits). Note that indexing with lists of indices for multiple axes is not supported. In that case, try to define one of the axes with a slice for optimal performance. Alternatively, slice the data after loading.

For more information on the indexing options, see indexing on ndarrays and fancy indexing in h5py.

Parameters:
  • path (str, pathlike) – The path to the hdf5 file.

  • data_type (str, optional) – The type of data to load. Defaults to ‘raw_data’. Other options are ‘aligned_data’, ‘beamformed_data’, ‘envelope_data’, ‘image’ and ‘image_sc’.

  • indices (Union[Tuple[Union[list, slice, int], ...], List[int], int, None]) – The indices to load. Defaults to None in which case all frames are loaded.

  • scan_kwargs (dict) – Additional keyword arguments to pass to the Scan object. These will override the parameters from the file if they are present in the file. Defaults to None.

Returns:

  • ndarray – The raw data of shape (n_frames, n_tx, n_ax, n_el, n_ch).

  • Scan – A scan object containing the parameters of the acquisition.

  • Probe – A probe object containing the parameters of the probe.

Return type:

Tuple[ndarray, Scan, Probe]

zea.data.file.load_file_all_data_types(path, indices=None, scan_kwargs=None)[source]

Loads all data types from a zea data file (h5py file).

Returns all data types together with a scan object containing the parameters of the acquisition and a probe object containing the parameters of the probe.

Additionally, it can load a specific subset of frames / transmits.

The indices parameter can be used to load a subset of the data. This can be

  • 'all' or None to load all data

  • an int to load a single frame

  • a List[int] to load specific frames

  • a Tuple[Union[list, slice, int], ...] to index multiple axes (i.e. frames and transmits). Note that indexing with lists of indices for multiple axes is not supported. In that case, try to define one of the axes with a slice for optimal performance. Alternatively, slice the data after loading.

For more information on the indexing options, see indexing on ndarrays and fancy indexing in h5py.

Parameters:
  • path (str, pathlike) – The path to the hdf5 file.

  • indices (Union[Tuple[Union[list, slice, int], ...], List[int], int, None]) – The indices to load. Defaults to None in which case all frames are loaded.

  • scan_kwargs (dict) – Additional keyword arguments to pass to the Scan object. These will override the parameters from the file if they are present in the file. Defaults to None.

Returns:

  • dict – A dictionary with all data types as keys and the corresponding data as values.

  • Scan – A scan object containing the parameters of the acquisition.

  • Probe – A probe object containing the parameters of the probe.

Return type:

Tuple[dict, Scan, Probe]

zea.data.file.validate_file(path=None, file=None)[source]

Reads the hdf5 file at the given path and validates its structure.

Provide either the path or the file, but not both.

Parameters:
  • path (str) – The path to the hdf5 file.

  • file (File) – The hdf5 file.
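
The "either path or file, but not both" rule can be sketched as an argument check. The helper check_exactly_one is hypothetical, and raising ValueError (including when neither argument is given) is an assumption; this page does not say how validate_file reports a violation.

```python
def check_exactly_one(path=None, file=None):
    """Hypothetical helper: require exactly one of `path` and `file`."""
    # The two comparisons are equal when both or neither argument is given.
    if (path is None) == (file is None):
        raise ValueError("Provide either `path` or `file`, but not both.")
    return "path" if path is not None else "file"


source = check_exactly_one(path="example.hdf5")
```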