zea.data.convert.echonet
Script to convert the EchoNet database to zea format.
Note
Will segment the images and convert them to polar coordinates.
For more information about the dataset, refer to the following links:
Functions
| accept_shape | Acceptance algorithm that determines whether to reject an image based on left and right corner data. |
| cartesian_to_polar_matrix | Function that converts a timeseries of a cartesian cone to a polar representation that is more compatible with CNNs/action selection. |
| convert_echonet | Convert an EchoNet dataset into zea files, organizing results into train/val/test/rejected splits. |
| count_init | Initialize the module-level shared counter used by worker processes. |
| find_split_for_file | Locate which split contains a given filename. |
| rotate_coordinates | Function that rotates the data points by a given angle in degrees. |
| segment | Segments the background of the echonet images by setting it to 0 and creating a hard edge. |
Classes
| H5Processor | Stores a few variables and paths to allow for hyperthreading. |
- class zea.data.convert.echonet.H5Processor(path_out_h5, num_val=500, num_test=500, range_from=(0, 255), range_to=(-60, 0), splits=None)
Bases: object
Stores a few variables and paths to allow for hyperthreading.
- __call__(avi_file)
Convert a single AVI file into a zea dataset entry. Loads the AVI, validates and rescales pixel ranges, applies segmentation, assigns a data split (train/val/test/rejected), and converts accepted frames to polar coordinates. It then constructs and returns the zea dataset descriptor used by generate_zea_dataset; the descriptor always includes path, image_sc, probe_name, and description, and includes image when the file is accepted.
- Parameters:
avi_file (pathlib.Path) – Path to the source .avi file to process.
- Returns:
The value returned by generate_zea_dataset, containing the dataset entry for the processed file.
- Return type:
dict
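The rescaling step can be pictured as a linear map from range_from to range_to. The sketch below is illustrative only: the function name rescale is hypothetical, and both the linear mapping and the out-of-range check are assumptions about what "validates and rescales pixel ranges" means, using the constructor defaults (0, 255) and (-60, 0).

```python
def rescale(value, range_from=(0, 255), range_to=(-60, 0)):
    """Linearly map value from range_from to range_to (assumed behaviour)."""
    lo_in, hi_in = range_from
    lo_out, hi_out = range_to
    # Assumed validation: refuse values outside the declared input range.
    if not lo_in <= value <= hi_in:
        raise ValueError(f"{value} lies outside {range_from}")
    fraction = (value - lo_in) / (hi_in - lo_in)
    return lo_out + fraction * (hi_out - lo_out)


# Pixel extremes and the midpoint, mapped with the default ranges.
darkest, middle, brightest = rescale(0), rescale(127.5), rescale(255)
```

With the defaults, a pixel value of 0 maps to -60 and 255 maps to 0, which matches the endpoints of the documented ranges.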
- get_split(hdf5_file, sequence)
Determine the dataset split label for a given file and its image sequence.
This method checks acceptance based on the first frame of sequence. If explicit splits were provided to the processor, it returns the split found for hdf5_file (and asserts that the acceptance result matches the split). If no explicit splits are provided, rejected sequences are labeled “rejected”. Accepted sequences increment a shared counter and are assigned “val”, “test”, or “train” according to the processor’s num_val and num_test quotas.
- Parameters:
hdf5_file (str) – Filename or identifier used to look up an existing split when splits are provided.
sequence (array-like) – Time-ordered sequence of images; the first frame is used for acceptance checking.
- Returns:
One of “train”, “val”, “test”, or “rejected” indicating the assigned split.
- Return type:
str
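A minimal sketch of the quota logic when no explicit splits are given. The helper assign_split is hypothetical and uses a plain integer instead of the shared counter; the order in which quotas are filled (val first, then test, then train) is an assumption, since the description only says the assignment follows num_val and num_test.

```python
def assign_split(accepted, counter, num_val=500, num_test=500):
    """Return (split, updated_counter). The val -> test -> train order is assumed."""
    if not accepted:
        # Rejected sequences do not consume any quota.
        return "rejected", counter
    counter += 1
    if counter <= num_val:
        return "val", counter
    if counter <= num_val + num_test:
        return "test", counter
    return "train", counter


# Tiny quotas make the behaviour easy to follow.
counter = 0
labels = []
for accepted in [True, True, False, True, True]:
    label, counter = assign_split(accepted, counter, num_val=1, num_test=2)
    labels.append(label)
```

In this run the rejected sequence leaves the counter untouched, so the quotas are filled by accepted sequences only.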
- validate_split_copy(split_file)
Validate that a generated split YAML matches the original splits provided to the processor.
Reads the YAML at split_file and compares its train, val, test, and rejected lists (or other split keys present in self.splits) against self.splits; logs confirmation when a split matches and logs which entries are missing or extra when they differ. If the processor was not initialized with splits, validation is skipped and a message is logged.
- Parameters:
split_file (str or os.PathLike) – Path to the YAML file containing the generated dataset splits.
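The comparison described above amounts to a per-split set difference. The following sketch assumes both splits are already loaded into dictionaries (YAML reading and logging are left out), and the name compare_splits and the file names are hypothetical.

```python
def compare_splits(original, generated):
    """Return {split: (missing, extra)} for every split that differs."""
    differences = {}
    for name, expected in original.items():
        expected_set = set(expected)
        produced_set = set(generated.get(name, []))
        missing = sorted(expected_set - produced_set)  # in original only
        extra = sorted(produced_set - expected_set)    # in generated only
        if missing or extra:
            differences[name] = (missing, extra)
    return differences


original = {"train": ["a.hdf5", "b.hdf5"], "val": ["c.hdf5"], "test": ["e.hdf5"]}
generated = {"train": ["a.hdf5"], "val": ["c.hdf5", "d.hdf5"], "test": ["e.hdf5"]}
report = compare_splits(original, generated)
```

An empty result means every split in the original is reproduced exactly.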
- zea.data.convert.echonet.accept_shape(tensor)
Acceptance algorithm that determines whether to reject an image based on left and right corner data.
- Parameters:
tensor (ndarray) – Input image (sc) with 2 dimensions, shape (112, 112).
- Returns:
Whether or not the tensor should be rejected.
- Return type:
decision (bool)
- zea.data.convert.echonet.cartesian_to_polar_matrix(cartesian_matrix, tip=(61, 7), r_max=107, angle=0.79, interpolation='nearest')
Function that converts a timeseries of a cartesian cone to a polar representation that is more compatible with CNNs/action selection.
- Parameters:
cartesian_matrix – (rows, cols) matrix containing time sequence of image_sc data.
tip – Coordinates (in indices) of the tip of the cone. Defaults to (61, 7).
r_max – Expected radius of the cone. Defaults to 107.
angle – Expected angle of the cone, used as (-angle, angle). Defaults to 0.79.
interpolation – One of 'nearest', 'linear', or 'cubic'. Defaults to 'nearest'.
- Returns:
polar conversion of the input.
- Return type:
polar_matrix (2d array)
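To illustrate the idea behind the conversion, here is a rough nearest-neighbour sketch for a single frame, written with plain lists so it has no dependencies. It is not the library's implementation: the function name, the n_r and n_theta parameters, the reading of tip as (column, row), and measuring the angle from the downward vertical in radians are all assumptions.

```python
import math


def cartesian_to_polar_sketch(image, tip, r_max, angle, n_r, n_theta):
    """Resample one 2D frame (list of rows) onto an (n_r, n_theta) polar grid."""
    rows, cols = len(image), len(image[0])
    tip_x, tip_y = tip  # assumption: tip is given as (column, row)
    polar = [[0] * n_theta for _ in range(n_r)]
    for i in range(n_r):
        r = r_max * i / max(n_r - 1, 1)  # radius from 0 to r_max
        for j in range(n_theta):
            # Angle sweeps from -angle to +angle around the downward vertical.
            theta = -angle + 2 * angle * j / max(n_theta - 1, 1)
            x = int(round(tip_x + r * math.sin(theta)))
            y = int(round(tip_y + r * math.cos(theta)))
            if 0 <= y < rows and 0 <= x < cols:
                polar[i][j] = image[y][x]  # nearest-neighbour lookup
    return polar


# Toy frame whose pixel value encodes its position: value = 10 * row + column.
frame = [[10 * y + x for x in range(9)] for y in range(8)]
polar = cartesian_to_polar_sketch(frame, tip=(4, 0), r_max=6, angle=0.79, n_r=7, n_theta=5)
```

In the output, each row corresponds to one radius and each column to one angle, so the cone-shaped sector becomes a rectangular grid. Samples that fall outside the frame stay at 0 in this sketch.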
- zea.data.convert.echonet.convert_echonet(args)
Convert an EchoNet dataset into zea files, organizing results into train/val/test/rejected splits.
- Parameters:
args (argparse.Namespace) –
An object with the following attributes:
- src (str|Path): Path to the source archive or directory containing .avi files. Will be unzipped if needed.
- dst (str|Path): Destination directory for generated zea files; per-split subdirectories (train, val, test, rejected) and a split.yaml are created or updated.
- split_path (str|Path|None): If provided, must contain a split.yaml to reproduce an existing split; the function asserts that the file exists.
- no_hyperthreading (bool): When false, processing uses a ProcessPoolExecutor with a shared counter; when true, processing runs sequentially.
Note
May unzip the source into a working directory.
Writes zea files into dst.
Writes a split.yaml into dst summarizing produced files per split.
Logs progress and validation results.
Asserts that split.yaml exists at split_path when split reproduction is requested.
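A possible way to call the function from Python rather than from the command line is to build the namespace by hand. In this sketch, build_args and run_conversion are hypothetical helpers and the paths are placeholders; only the four attribute names come from the description above.

```python
from argparse import Namespace


def build_args(src, dst, split_path=None, no_hyperthreading=False):
    """Bundle the four documented attributes into a Namespace."""
    return Namespace(
        src=src,
        dst=dst,
        split_path=split_path,
        no_hyperthreading=no_hyperthreading,
    )


def run_conversion(args):
    """Run the conversion; zea is imported lazily so it is only needed here."""
    from zea.data.convert.echonet import convert_echonet

    convert_echonet(args)


# Placeholder paths; replace them with real locations before calling run_conversion.
args = build_args("path/to/echonet_source", "path/to/zea_output", no_hyperthreading=True)
```

Setting no_hyperthreading=True selects the sequential path, which can be easier to debug than the process pool.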
- zea.data.convert.echonet.count_init(shared_counter)
Initialize the module-level shared counter used by worker processes.
- Parameters:
shared_counter (multiprocessing.Value) – A process-shared integer Value that will be assigned to the module-global COUNTER for coordinated counting across processes.
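The usual pattern for this kind of initializer is sketched below. The names count_init_sketch, next_count, and make_pool are hypothetical, and the locking inside next_count is an assumption about how workers would increment the counter safely. The demonstration runs in a single process; the pool factory is shown but not started.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

COUNTER = None  # module-level slot, filled in by the initializer


def count_init_sketch(shared_counter):
    """Store the process-shared Value in a module global."""
    global COUNTER
    COUNTER = shared_counter


def next_count():
    """Increment the shared counter under its lock and return the new value."""
    with COUNTER.get_lock():
        COUNTER.value += 1
        return COUNTER.value


def make_pool(shared_counter, max_workers=2):
    """Create a pool whose workers each run the initializer once at start-up."""
    return ProcessPoolExecutor(
        max_workers=max_workers,
        initializer=count_init_sketch,
        initargs=(shared_counter,),
    )


# In-process demonstration; each worker would do the same after start-up.
count_init_sketch(multiprocessing.Value("i", 0))
first, second = next_count(), next_count()
```

Passing the Value through initargs gives every worker a handle to the same counter, so counts stay consistent across processes.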
- zea.data.convert.echonet.find_split_for_file(file_dict, target_file)
Locate which split contains a given filename.
- Parameters:
file_dict (dict) – Mapping from split name (e.g., “train”, “val”, “test”, “rejected”) to an iterable of filenames.
target_file (str) – Filename to search for within the split lists.
- Returns:
The split name that contains target_file, or “rejected” if the file is not found.
- Return type:
str
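The documented behaviour, including the "rejected" fallback, can be sketched in a few lines. The function name find_split_sketch and the file names are hypothetical.

```python
def find_split_sketch(file_dict, target_file):
    """Return the split whose file list contains target_file, else 'rejected'."""
    for split, files in file_dict.items():
        if target_file in files:
            return split
    return "rejected"


splits = {"train": ["a.hdf5", "b.hdf5"], "val": ["c.hdf5"], "test": ["d.hdf5"]}
found = find_split_sketch(splits, "c.hdf5")
fallback = find_split_sketch(splits, "unknown.hdf5")
```

Note that an unknown filename is treated the same as a rejected one, so a typo in a filename would not raise an error.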
- zea.data.convert.echonet.rotate_coordinates(data_points, degrees)
Function that rotates the data points by a given angle in degrees.
- Parameters:
data_points (ndarray) – Tensor of shape [N, 2] containing the x and y coordinates of the data points.
degrees (int) – Angle, in degrees, by which to rotate the data points.
- Returns:
the rotated data_points.
- Return type:
rotated_points (ndarray)
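As a worked illustration, a 2D rotation applies the standard rotation matrix to each (x, y) pair. The sketch uses plain tuples instead of an ndarray, and the name rotate_points is hypothetical. The rotation direction (counter-clockwise) and centre (the origin) are assumptions, since neither is stated above.

```python
import math


def rotate_points(points, degrees):
    """Rotate (x, y) pairs counter-clockwise about the origin (assumed convention)."""
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Standard rotation matrix applied to each point.
    return [(x * cos_t - y * sin_t, x * sin_t + y * cos_t) for x, y in points]


# A quarter turn moves (1, 0) onto the y axis and (0, 2) onto the negative x axis,
# up to floating-point rounding.
rotated = rotate_points([(1.0, 0.0), (0.0, 2.0)], 90)
```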
- zea.data.convert.echonet.segment(tensor, number_erasing=0, min_clip=0)
Segments the background of the echonet images by setting it to 0 and creating a hard edge.
- Parameters:
tensor (ndarray) – Input image (sc) with 3 dimensions, shape (N, 112, 112).
number_erasing (float, optional) – Number to fill the background with. Defaults to 0.
min_clip (float, optional) – If > 0, values on the computed cone edge will be clipped to be at least this value. Defaults to 0.
- Returns:
Segmented matrix of same dimensions as input
- Return type:
tensor (ndarray)