zea.data.convert.cetus
Functionality to convert the CETUS dataset to the zea format.
Note
Requires SimpleITK to be installed: pip install SimpleITK.
The CETUS (Challenge on Endocardial Three-dimensional Ultrasound Segmentation) dataset contains 3D echocardiographic volumes from 45 patients. Each patient has end-diastolic (ED) and end-systolic (ES) B-mode volumes with corresponding ground truth left ventricle segmentation masks. The volumes are stored in NIfTI (.nii.gz) format with isotropic voxel spacing.
License: CC BY-NC-SA 4.0
The CETUS dataset is available free of charge strictly for non-commercial scientific research purposes only.
Citation (required for any use of the CETUS database):
O. Bernard, et al. “Standardized Evaluation System for Left Ventricular Segmentation Algorithms in 3D Echocardiography” IEEE Transactions on Medical Imaging, vol. 35, no. 4, pp. 967-977, April 2016. DOI: 10.1109/tmi.2015.2503890
Functions
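| convert_cetus(args) | Convert the CETUS dataset into zea HDF5 files across dataset splits. |
| download_cetus(destination[, patients]) | Download the CETUS dataset from the Girder server. |
| get_split(patient_id) | Determine which dataset split a patient ID belongs to. |
| process_cetus(source_path, output_path[, overwrite]) | Convert a single CETUS patient time-point to a zea HDF5 file. |
| upload_cetus(output_folder) | Upload the converted CETUS dataset to HuggingFace Hub. |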
- zea.data.convert.cetus.convert_cetus(args)
Convert the CETUS dataset into zea HDF5 files across dataset splits.
Processes all NIfTI B-mode volumes found under the source folder, assigns each patient to a train/val/test split, and executes per-file conversion tasks either serially or in parallel.
Usage:
python -m zea.data.convert cetus <source_folder> <destination_folder> --download
- Parameters:
args (argparse.Namespace) –
An object with attributes:
src (str | Path): Path to the folder containing CETUS patient subfolders, or a directory to download into when --download is set.
dst (str | Path): Root destination folder for zea HDF5 outputs; split subfolders (train/val/test) will be created.
download (bool, optional): If True, download the dataset first from the Girder server.
no_hyperthreading (bool, optional): If True, run tasks serially instead of using a process pool.
upload (bool, optional): If True, upload the converted dataset to HuggingFace Hub after conversion. Only for zea maintainers with push access to the repository.
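When calling the converter from Python instead of the command line, the attributes listed above can be supplied through an argparse.Namespace. The following is a minimal sketch; the folder names are hypothetical placeholders, and run_conversion is an illustrative wrapper that is not part of zea. The import is deferred so that the snippet can be read and adapted without zea installed.

```python
import argparse
from pathlib import Path

# Hypothetical paths; replace with your own source and destination folders.
args = argparse.Namespace(
    src=Path("data/cetus_raw"),   # folder with CETUS patient subfolders
    dst=Path("data/cetus_zea"),   # train/val/test subfolders are created here
    download=False,               # set True to fetch the data from the Girder server first
    no_hyperthreading=True,       # True runs the per-file tasks serially
    upload=False,                 # maintainers only
)


def run_conversion(namespace: argparse.Namespace) -> None:
    """Illustrative wrapper: import zea lazily and run the conversion."""
    from zea.data.convert.cetus import convert_cetus

    convert_cetus(namespace)


# Call run_conversion(args) once zea and SimpleITK are installed.
```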
- zea.data.convert.cetus.download_cetus(destination, patients=None)
Download the CETUS dataset from the Girder server.
Downloads NIfTI files for each patient (B-mode volumes and ground truth segmentations for ED and ES time points).
- Parameters:
destination (str | Path) – Directory where the dataset will be downloaded.
patients (list[int] | None) – List of patient IDs to download (1-45). If None, all 45 patients are downloaded.
- Return type:
Path
- Returns:
Path to the downloaded dataset directory.
- zea.data.convert.cetus.get_split(patient_id)
Determine which dataset split a patient ID belongs to.
- Parameters:
patient_id (int) – Integer ID of the patient (1-45).
- Returns:
"train", "val", or "test".
- Return type:
str
- Raises:
ValueError – If the patient_id does not fall into any defined split range.
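The range-based lookup described above can be sketched as follows. The split boundaries in HYPOTHETICAL_SPLITS are assumptions chosen only for illustration; the actual ranges used by zea are not stated in this page, and sketch_get_split is not the library function.

```python
# Hypothetical ID ranges (end-exclusive); the real boundaries may differ.
HYPOTHETICAL_SPLITS = {
    "train": range(1, 31),
    "val": range(31, 38),
    "test": range(38, 46),
}


def sketch_get_split(patient_id: int) -> str:
    """Return the split whose ID range contains patient_id."""
    for split_name, id_range in HYPOTHETICAL_SPLITS.items():
        if patient_id in id_range:
            return split_name
    # Mirrors the documented behaviour: IDs outside every range are rejected.
    raise ValueError(f"Patient ID {patient_id} does not fall into any split range.")
```

Keeping the ranges in a single mapping makes it easy to check that they cover IDs 1-45 without overlap.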
- zea.data.convert.cetus.process_cetus(source_path, output_path, overwrite=False)
Convert a single CETUS patient time-point to a zea HDF5 file.
Each file stores the 3D B-mode volume as image_sc (scan-converted image). If a corresponding ground truth segmentation file exists, it is stored as an additional element under non_standard_elements/segmentation.
The voxel spacing from the NIfTI header is stored as non_standard_elements/voxel_spacing (in meters). License and citation information is embedded in the file description.
- Parameters:
source_path (str or Path) – Path to the source .nii.gz B-mode file.
output_path (str or Path) – Path to the output .hdf5 file.
overwrite (bool, optional) – Whether to overwrite an existing output file. Defaults to False.
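To illustrate the voxel spacing step, the sketch below reads a NIfTI volume with SimpleITK and converts the header spacing to meters. It assumes the header reports spacing in millimeters, which is common for NIfTI files but is not confirmed by this page. The helper names read_volume and spacing_to_meters are hypothetical and are not part of zea.

```python
from pathlib import Path

MM_TO_M = 1e-3  # assumption: header spacing is given in millimeters


def spacing_to_meters(spacing_mm):
    """Convert a spacing tuple from millimeters to meters."""
    return tuple(float(s) * MM_TO_M for s in spacing_mm)


def read_volume(path):
    """Read a NIfTI volume and return (array, spacing in meters)."""
    import SimpleITK as sitk  # imported lazily; optional dependency

    image = sitk.ReadImage(str(Path(path)))
    volume = sitk.GetArrayFromImage(image)  # NumPy array indexed (z, y, x)
    spacing_m = spacing_to_meters(image.GetSpacing())  # ordered (x, y, z)
    return volume, spacing_m
```

Note that SimpleITK returns the spacing in (x, y, z) order while the array is indexed (z, y, x), so the two need to be matched up before use.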
- zea.data.convert.cetus.upload_cetus(output_folder)
Upload the converted CETUS dataset to HuggingFace Hub.
Only for zea maintainers with push access to the repository.
Writes a dataset card, prints an upload summary, and asks for confirmation before pushing.
- Parameters:
output_folder (str | Path) – Root folder containing the train/val/test splits.
- Return type:
None