zea.data.convert.camus

Functionality to convert the CAMUS dataset to the zea format.

Note

Requires SimpleITK to be installed: pip install SimpleITK.

The CAMUS (Cardiac Acquisitions for Multi-structure Ultrasound Segmentation) dataset contains 2D echocardiographic sequences from 500 patients. The sequences are stored in NIfTI (.nii.gz) format.

The dataset can be downloaded automatically using the --download flag:

python -m zea.data.convert camus <source_folder> <destination_folder> --download


Functions

convert_camus(args)

Convert the CAMUS dataset into zea HDF5 files across dataset splits.

download_camus(destination[, patients])

Download the CAMUS dataset from the Girder server.

get_split(patient_id)

Determine which dataset split a patient ID belongs to.

process_camus(source_path, output_path[, ...])

Convert the CAMUS dataset to the zea format.

transform_sc_image_to_polar(image_sc[, ...])

Transform a scan-converted input image (cone) into a square image.

zea.data.convert.camus.convert_camus(args)

Convert the CAMUS dataset into zea HDF5 files across dataset splits.

Processes files found under the CAMUS source folder (after unzipping or downloading if needed), assigns each patient to a train/val/test split, creates matching output paths, and executes per-file conversion tasks either serially or in parallel.

Usage:

python -m zea.data.convert camus <source_folder> <destination_folder>
python -m zea.data.convert camus <source_folder> <destination_folder> --download
Parameters:

args (argparse.Namespace) –

An object with attributes:

  • src (str | Path): Path to the CAMUS archive or extracted folder, or a directory to download into when --download is set.

  • dst (str | Path): Root destination folder for the zea HDF5 outputs; one subfolder per split is created inside it.

  • download (bool, optional): If True, download the dataset first from the Girder server.

  • no_hyperthreading (bool, optional): If True, run tasks serially instead of using a process pool.
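The serial-versus-parallel choice described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `convert_one`, `run_tasks`, and the file names are hypothetical, and only the attribute names on the `argparse.Namespace` (`src`, `dst`, `download`, `no_hyperthreading`) come from the parameter list.

```python
import argparse
from concurrent.futures import ProcessPoolExecutor


def convert_one(task):
    """Hypothetical stand-in for a per-file conversion task.

    Takes a (source_path, output_path) pair and returns the output path.
    A real task would read the NIfTI file and write an HDF5 file here.
    """
    source_path, output_path = task
    return output_path


def run_tasks(tasks, no_hyperthreading=False):
    """Run the tasks serially, or in a process pool when allowed."""
    if no_hyperthreading:
        # Serial path: one task after another in the current process.
        return [convert_one(task) for task in tasks]
    # Parallel path: distribute the tasks over worker processes.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(convert_one, tasks))


if __name__ == "__main__":
    # Namespace with the attributes listed in the parameter description.
    args = argparse.Namespace(
        src="camus_raw", dst="camus_zea", download=False, no_hyperthreading=True
    )
    # Hypothetical file names, for illustration only.
    tasks = [
        (f"{args.src}/patient{i:04d}.nii.gz", f"{args.dst}/train/patient{i:04d}.hdf5")
        for i in (1, 2)
    ]
    print(run_tasks(tasks, no_hyperthreading=args.no_hyperthreading))
```

Keeping the per-file task a plain top-level function is what allows it to be sent to worker processes, which is one reason a conversion of this kind is usually organized as independent per-file tasks.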

zea.data.convert.camus.download_camus(destination, patients=None)

Download the CAMUS dataset from the Girder server.

Downloads NIfTI files for each patient.

Parameters:
  • destination (str | Path) – Directory where the dataset will be downloaded.

  • patients (list[int] | None) – List of patient IDs to download (1-500). If None, all patients are downloaded.

Return type:

Path

Returns:

Path to the downloaded dataset directory.

zea.data.convert.camus.get_split(patient_id)

Determine which dataset split a patient ID belongs to.

Parameters:

patient_id (int) – Integer ID of the patient.

Returns:

“train”, “val”, or “test”.

Return type:

str

Raises:

ValueError – If the patient_id does not fall into any defined split range.
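A range-based split lookup of this kind might look like the sketch below. The function name `get_split_sketch` and the boundaries (1 to 400 for train, 401 to 450 for val, 451 to 500 for test) are assumptions chosen for illustration; the actual ranges used by `get_split` are not stated here and should be checked in the source.

```python
def get_split_sketch(patient_id: int) -> str:
    """Map a patient ID to a split name using ASSUMED ID ranges."""
    # Assumed boundaries, for illustration only (not taken from zea).
    if 1 <= patient_id <= 400:
        return "train"
    if 401 <= patient_id <= 450:
        return "val"
    if 451 <= patient_id <= 500:
        return "test"
    # IDs outside every range are rejected, mirroring the documented ValueError.
    raise ValueError(f"Patient ID {patient_id} does not belong to any split.")
```

Deriving the split from the patient ID alone keeps the assignment deterministic, so repeated conversions place each patient in the same subfolder.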

zea.data.convert.camus.process_camus(source_path, output_path, overwrite=False)

Convert a CAMUS file to the zea format.

Parameters:
  • source_path (str, pathlike) – The path to the original CAMUS file.

  • output_path (str, pathlike) – The path to the output file.

  • overwrite (bool, optional) – Set to True to overwrite an existing output file. Defaults to False.

zea.data.convert.camus.transform_sc_image_to_polar(image_sc, output_size=None, fit_outline=True)

Transform a scan-converted input image (cone) into a square image using radial stretching and downsampling.

Note that the function assumes a zero-valued background. Verify that the results make sense, especially if the image contains black regions at the edges. The function is not perfect, but it works for most cases.

Parameters:
  • image_sc (numpy.ndarray) – Input image as a 2D numpy array (height, width).

  • output_size (tuple, optional) – Output size of the image as a tuple. Defaults to image_sc.shape.

  • fit_outline (bool, optional) – Whether to fit a polynomial to the outline of the image. Defaults to True. If set to False and the ultrasound image contains black regions at the edges, artifacts can occur because the jagged outline is stretched to the desired width.

Returns:

Square image as a 2D numpy array (height, width).

Return type:

numpy.ndarray
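To make the stretching idea concrete, here is a simplified sketch that stretches each row's non-zero extent to the full output width. It is not the library's algorithm: `stretch_rows_to_square` is a hypothetical name, it uses plain row-wise linear interpolation and nearest-neighbour row selection, and it has no outline fitting (roughly comparable to `fit_outline=False`). Like the function documented above, it assumes a zero-valued background.

```python
import numpy as np


def stretch_rows_to_square(image_sc, output_size=None):
    """Stretch the non-zero span of every row to the full output width."""
    image_sc = np.asarray(image_sc, dtype=float)
    if output_size is None:
        output_size = image_sc.shape
    out_h, out_w = output_size

    stretched = np.zeros((image_sc.shape[0], out_w))
    columns = np.arange(image_sc.shape[1])
    for r, row in enumerate(image_sc):
        nonzero = np.flatnonzero(row)
        if nonzero.size == 0:
            continue  # Row contains only background; leave it at zero.
        left, right = nonzero[0], nonzero[-1]
        # Sample the row at out_w positions between its first and last
        # non-zero pixel, which stretches the cone outline to full width.
        sample_positions = np.linspace(left, right, out_w)
        stretched[r] = np.interp(sample_positions, columns, row)

    # Resample to the requested height by nearest-neighbour row selection.
    row_index = np.round(np.linspace(0, image_sc.shape[0] - 1, out_h)).astype(int)
    return stretched[row_index]
```

Because the outline is taken directly from the first and last non-zero pixel of each row, dark pixels at the sector edge shift the detected outline. That is the jagged-outline artifact the `fit_outline` description warns about.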