Process images and compute quantitative image features

Two of the main uses for MIRP are to process images and compute quantitative features from images. Both use the same standardized workflow that is compliant with the Image Biomarker Standardisation Initiative (IBSI) reference standards [Zwanenburg2020], [Whybra2024]. Two versions of the image processing and feature computation function exist:

extract_features_and_images(): conventional function that processes images and computes features.
extract_features_and_images_generator(): generator that yields processed images and features computed therefrom.

For convenience, the above functions are wrapped to allow for only computing feature values (without exporting images) and only processing images (without computing features):

extract_features(): conventional function that only computes features.
extract_features_generator(): generator that only yields feature values.
extract_images(): conventional function that only processes images.
extract_images_generator(): generator that yields processed images.
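
The difference between the conventional and generator variants can be illustrated with a minimal, library-independent sketch. The names process_case, process_all and process_lazily are hypothetical stand-ins and are not part of MIRP. The sketch only shows that a conventional function returns all results at once, whereas a generator hands back one result at a time, which limits how many results are held in memory.

```python
from typing import Iterator


def process_case(case_id: str) -> dict:
    # Hypothetical stand-in for processing one image and its mask.
    return {"case": case_id, "n_features": 3}


def process_all(case_ids: list[str]) -> list[dict]:
    # Conventional style: every result is computed before anything is returned.
    return [process_case(case_id) for case_id in case_ids]


def process_lazily(case_ids: list[str]) -> Iterator[dict]:
    # Generator style: each result is computed only when the caller asks for it.
    for case_id in case_ids:
        yield process_case(case_id)


cases = ["patient_1", "patient_2", "patient_3"]
all_results = process_all(cases)

lazy_results = process_lazily(cases)
first_result = next(lazy_results)   # only the first case has been processed so far
remaining = list(lazy_results)      # the other two cases are processed here
```

The generator style tends to be the better fit when processed images are large and each result can be handled, stored or discarded before the next one is produced.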

The features computed by MIRP are listed in Feature name references.

Examples

MIRP can compute features from regions of interest in images. The features are described in [Zwanenburg2016].

Minimal example

The following computes features from a single image and mask:

from mirp import extract_features

feature_data = extract_features(
    image="path to image",
    mask="path to mask",
    base_discretisation_method="fixed_bin_number",
    base_discretisation_n_bins=32
)

The base_discretisation_method and its corresponding parameters are required as long as any texture or intensity-histogram features are involved.

Interpolation example

A more realistic example involves interpolation to ensure that voxel spacing is the same for all images in a dataset. For example, a positron emission tomography (PET) dataset may be resampled to 3 by 3 by 3 mm isotropic voxels. This is achieved by providing the new_spacing argument, i.e. new_spacing=3.0 or new_spacing=[3.0, 3.0, 3.0].

from mirp import extract_features

feature_data = extract_features(
    image="path to PET image",
    mask="path to PET mask",
    image_modality="PET",
    new_spacing=3.0,
    base_discretisation_method="fixed_bin_number",
    base_discretisation_n_bins=32
)

Here, image_modality="PET" is used to declare that the image is a PET image. If this is a DICOM image, this argument is not necessary – the modality can be inferred from the metadata.
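
To get a feeling for what resampling does to the image grid, the sketch below estimates the grid size after interpolation. It is not MIRP's implementation. It assumes that the new grid covers the same physical extent as the original, rounded up to whole voxels, and that axes are ordered (z, y, x). The helper names as_spacing and resampled_shape and the PET dimensions are hypothetical.

```python
import math


def as_spacing(new_spacing, n_dims=3):
    # Accept either a single value (isotropic) or one value per axis.
    if isinstance(new_spacing, (int, float)):
        return [float(new_spacing)] * n_dims
    return [float(value) for value in new_spacing]


def resampled_shape(shape, spacing, new_spacing):
    # Assumed rule: the new grid covers the same physical extent, rounded up.
    target = as_spacing(new_spacing, n_dims=len(shape))
    return [
        math.ceil(n * old / new)
        for n, old, new in zip(shape, spacing, target)
    ]


# Hypothetical PET volume: 64 slices of 128 x 128 voxels, ordered (z, y, x).
pet_shape = [64, 128, 128]
pet_spacing = [5.0, 4.0, 4.0]  # mm

new_shape = resampled_shape(pet_shape, pet_spacing, new_spacing=3.0)
same_shape = resampled_shape(pet_shape, pet_spacing, new_spacing=[3.0, 3.0, 3.0])
```

Under these assumptions both calls give a grid of 107 by 171 by 171 voxels, which illustrates that a scalar new_spacing and a three-element list with identical values describe the same isotropic target.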

Slice-wise example

Sometimes, in-plane resolution is much higher than axial resolution. For example, in (older) computed tomography (CT) images, in-plane resolution may be 1 by 1 mm, but the distance between slices can be 7 mm or greater. Resampling to isotropic 1 by 1 by 1 mm voxels causes considerable data to be inferred between slices, which may not be desirable. In that case, images may be better processed slice-by-slice (2D). This is achieved by providing the by_slice argument, i.e. by_slice=True.

from mirp import extract_features

feature_data = extract_features(
    image="path to CT image",
    mask="path to CT mask",
    image_modality="CT",
    by_slice=True,
    new_spacing=1.0,
    base_discretisation_method="fixed_bin_number",
    base_discretisation_n_bins=32
)

In the above example new_spacing=1.0 causes all images to be resampled in-plane to a 1 mm resolution.
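
The effect of by_slice on the target voxel spacing can be sketched as follows. This is an illustration and not MIRP code: the helper effective_spacing is hypothetical, the (z, y, x) axis order is an assumption, and the CT spacing is made up.

```python
def effective_spacing(original_spacing, new_spacing, by_slice):
    # original_spacing is assumed to be ordered (z, y, x), in mm.
    z_spacing, _, _ = original_spacing
    if by_slice:
        # Slice-wise: only the in-plane (y, x) spacing changes.
        return (z_spacing, new_spacing, new_spacing)
    # Volumetric: all three axes are resampled.
    return (new_spacing, new_spacing, new_spacing)


ct_spacing = (7.0, 0.8, 0.8)  # hypothetical CT: 7 mm slices, 0.8 mm in-plane

slice_wise = effective_spacing(ct_spacing, new_spacing=1.0, by_slice=True)
volumetric = effective_spacing(ct_spacing, new_spacing=1.0, by_slice=False)

# Resampling 7 mm slices to 1 mm means roughly 6 of every 7 slices are interpolated.
interpolated_fraction = 1.0 - volumetric[0] / ct_spacing[0]
```

In the slice-wise case the 7 mm slice distance is left untouched, so no slices have to be inferred between the acquired ones.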

Fixed Bin Size discretisation example

The previous examples used the Fixed Bin Number method to discretise intensities within the mask into a fixed number of bins. For some imaging modalities, intensities carry a physical (or at least calibrated) meaning, such as Hounsfield units in computed tomography and standardised uptake values in positron emission tomography. For these modalities, the Fixed Bin Size method (also known as Fixed Bin Width) can be a better choice, as it creates a mapping between intensities and bins that is consistent across the dataset. MIRP sets the lower bound of the initial bin using the resegmentation range, or in its absence, a default value (if any).

Below we compute features from a computed tomography image using a Fixed Bin Size discretisation method. Because the resegmentation range is not set, the lower bound of the initial bin defaults to -1000 Hounsfield Units.

from mirp import extract_features

feature_data = extract_features(
    image="path to CT image",
    mask="path to CT mask",
    image_modality="CT",
    new_spacing=1.0,
    base_discretisation_method="fixed_bin_size",
    base_discretisation_bin_width=25.0
)
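
The mapping from intensities to bins under a Fixed Bin Size scheme can be sketched in a few lines. This follows the commonly used formula (bin index = floor((intensity - lower bound) / bin width) + 1) and is not taken from MIRP's source code. The helper name fixed_bin_size and the sample intensities are hypothetical.

```python
import math


def fixed_bin_size(intensity, lower_bound, bin_width):
    # Bin 1 starts at lower_bound; each subsequent bin is bin_width wide.
    return math.floor((intensity - lower_bound) / bin_width) + 1


# Settings from the example above: bins start at -1000 HU and are 25 HU wide.
hounsfield_units = [-1000.0, -50.0, 40.0]
bins = [
    fixed_bin_size(hu, lower_bound=-1000.0, bin_width=25.0)
    for hu in hounsfield_units
]
```

Because the lower bound and bin width are fixed, a given intensity lands in the same bin in every image. In this sketch, 40 Hounsfield Units always maps to bin 42.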

Mask resegmentation example

If the region of interest contained in the mask in the above example covers soft tissue, the default lower bound of -1000 Hounsfield Units is far below the intensities of interest. We can change this by providing the resegmentation_intensity_range argument. Here, we provide a window more fitting for soft tissues: resegmentation_intensity_range=[-200.0, 200.0]. The lower bound of the initial bin is then set to -200 Hounsfield Units, and the 400 Hounsfield Unit range is divided into 16 bins of 25 Hounsfield Units each.

from mirp import extract_features

feature_data = extract_features(
    image="path to CT image",
    mask="path to CT mask",
    image_modality="CT",
    new_spacing=1.0,
    resegmentation_intensity_range=[-200.0, 200.0],
    base_discretisation_method="fixed_bin_size",
    base_discretisation_bin_width=25.0
)
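
The arithmetic behind these settings can be checked with a small sketch. It is an illustration under assumptions, not MIRP's implementation. The voxel intensities are made up, and the handling of an intensity that lies exactly on the upper bound is left out.

```python
import math

intensity_range = (-200.0, 200.0)  # resegmentation_intensity_range
bin_width = 25.0                   # base_discretisation_bin_width

# Number of bins spanned by the resegmentation range.
n_bins = round((intensity_range[1] - intensity_range[0]) / bin_width)

# Hypothetical voxel intensities inside the original mask, in HU.
voxels = [-950.0, -200.0, -30.0, 0.0, 60.0, 199.0, 450.0]

# Resegmentation: voxels outside the range are dropped from the mask.
kept = [v for v in voxels if intensity_range[0] <= v <= intensity_range[1]]

# Fixed Bin Size discretisation, with the first bin starting at the lower bound.
bins = [math.floor((v - intensity_range[0]) / bin_width) + 1 for v in kept]
```

The two voxels outside the soft-tissue window are removed, and the remaining intensities are spread over bins 1 to 16 instead of being concentrated in a few high-numbered bins.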

Basic image filter example

The above examples all compute features from the base image. Filters can be applied to images to enhance patterns such as edges. MIRP implements multiple filters [Depeursinge2020]. In the example below, we compute features from a Laplacian-of-Gaussian filtered image:

from mirp import extract_features

feature_data = extract_features(
    image="path to image",
    mask="path to mask",
    new_spacing=1.0,
    base_discretisation_method="fixed_bin_size",
    base_discretisation_bin_width=25.0,
    filter_kernels="laplacian_of_gaussian",
    laplacian_of_gaussian_sigma=2.0
)
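
To give an impression of what a Laplacian-of-Gaussian filter looks like, the sketch below samples the one-dimensional Laplacian-of-Gaussian function at integer voxel offsets. It is a simplified illustration: MIRP filters images in two or three dimensions, and its kernel construction may differ. The helper log_kernel_1d and the truncate parameter are hypothetical. Converting sigma from mm to voxels by dividing by the voxel spacing is an assumption made here.

```python
import math


def log_kernel_1d(sigma, truncate=4.0):
    # Sample the 1D Laplacian-of-Gaussian at integer voxel offsets.
    # sigma is in voxel units; truncate sets the radius in multiples of sigma.
    radius = math.ceil(truncate * sigma)
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    kernel = []
    for k in range(-radius, radius + 1):
        gaussian = norm * math.exp(-(k ** 2) / (2.0 * sigma ** 2))
        kernel.append(-(1.0 - k ** 2 / sigma ** 2) / sigma ** 2 * gaussian)
    return kernel


# Assumption: a sigma of 2.0 mm on a 1.0 mm grid corresponds to 2 voxels.
sigma_voxels = 2.0 / 1.0
kernel = log_kernel_1d(sigma_voxels)
centre = kernel[len(kernel) // 2]
```

The kernel is symmetric, negative at its centre, and sums to approximately zero, so it responds to intensity changes such as edges and gives almost no response in regions of constant intensity.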

Image filter with additional features

By default, only statistical features are computed from filtered images, and features are still extracted from the base image. You can change this by specifying base_feature_families="none" (to prevent computing features from the base image) and specifying response_map_feature_families. In the example below, we compute both statistical features and intensity histogram features.

from mirp import extract_features

feature_data = extract_features(
    image="path to image",
    mask="path to mask",
    new_spacing=1.0,
    base_feature_families="none",
    response_map_feature_families=["statistics", "intensity_histogram"],
    filter_kernels="laplacian_of_gaussian",
    laplacian_of_gaussian_sigma=2.0
)

Even though intensity histogram features require discretisation, you don’t have to provide a discretisation method and associated parameters. This is because for many filters, intensities in the filtered images no longer represent a measurable quantity such as Hounsfield Units. Hence a Fixed Bin Number algorithm is used by default, with 16 bins. These parameters can be changed using the response_map_discretisation_method and response_map_discretisation_n_bins arguments.
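
A Fixed Bin Number discretisation with 16 bins can be sketched as follows. It uses the commonly used formula in which bins are spread evenly between the lowest and highest value within the mask, and is not copied from MIRP. The helper name fixed_bin_number and the filter responses are hypothetical.

```python
import math


def fixed_bin_number(values, n_bins):
    # Bins are spread evenly between the minimum and maximum of the values.
    lowest, highest = min(values), max(values)
    discretised = []
    for value in values:
        if value == highest:
            # The maximum is assigned to the last bin.
            discretised.append(n_bins)
        else:
            scaled = n_bins * (value - lowest) / (highest - lowest)
            discretised.append(math.floor(scaled) + 1)
    return discretised


# Hypothetical filter responses within a mask; these have no physical unit.
responses = [-4.0, -1.0, 0.0, 2.5, 4.0]
bins = fixed_bin_number(responses, n_bins=16)
```

Because the bin edges depend only on the range of values present in the mask, this scheme works for response maps whose values do not correspond to a calibrated scale.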

Parallel processing example

MIRP supports parallel processing using ray and joblib. With parallel processing, multiple images can be processed at the same time. There are two relevant parameters: num_cpus and parallel_backend. num_cpus determines the number of workers that will be spawned. parallel_backend determines the backend used for parallel processing, i.e. "ray" or "joblib".

In the example below, we extract features from images using 2 workers on a joblib backend.

from mirp import extract_features

feature_data = extract_features(
    image="path to image",
    mask="path to mask",
    base_discretisation_method="fixed_bin_number",
    base_discretisation_n_bins=32,
    num_cpus=2,
    parallel_backend="joblib"
)

joblib can also be used within a generator context, e.g. with extract_features_generator, but ray cannot. Both ray and joblib are optional dependencies of MIRP and need to be installed separately.
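
A generator combined with the joblib backend might be used as sketched below. The wrapper stream_features is a hypothetical helper, and the structure of each yielded item is not specified here. Only the arguments shown elsewhere on this page are used.

```python
def stream_features(image_path, mask_path):
    # Imported lazily so that this sketch can be defined without MIRP installed.
    from mirp import extract_features_generator

    feature_generator = extract_features_generator(
        image=image_path,
        mask=mask_path,
        base_discretisation_method="fixed_bin_number",
        base_discretisation_n_bins=32,
        num_cpus=2,
        parallel_backend="joblib"
    )

    # Results are yielded one at a time instead of being returned all at once.
    for feature_data in feature_generator:
        yield feature_data
```

Calling stream_features("path to image", "path to mask") returns a generator; features are only computed once the generator is iterated over.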

API documentation

mirp.extract_features_and_images.extract_features_and_images(write_features: None | bool = None, export_features: None | bool = None, write_images: None | bool = None, export_images: None | bool = None, write_dir: None | str = None, image_export_format: str = 'dict', num_cpus: None | int = None, parallel_backend: None | str = None, **kwargs)

Processes images and computes features from regions of interest.

Parameters:

write_features (bool, optional) – Determines whether features computed from images should be written to the directory indicated by the write_dir keyword argument.
export_features (bool, optional) – Determines whether features computed from images should be returned by the function.
write_images (bool, optional) – Determines whether processed images and masks should be written to the directory indicated by the write_dir keyword argument.
export_images (bool, optional) – Determines whether processed images and masks should be returned by the function.
write_dir (str, optional) – Path to directory where features, processed images and masks should be written. If not set, features, processed images and masks are returned by this function. Required if write_features=True or write_images=True.
image_export_format ({"dict", "native", "numpy"}, default: "dict") – Return format for processed images and masks. "dict" returns dictionaries of images and masks as numpy arrays and associated characteristics. "native" returns images and masks in their internal format. "numpy" returns images and masks in numpy format. This argument is only used if export_images=True.
num_cpus (int, optional, default: None) – Number of CPU nodes that should be used for parallel processing. Image and mask processing can be parallelized using the ray or joblib packages. If a ray cluster is defined by the user, this cluster will be used instead. By default, images and masks are processed sequentially.
parallel_backend ({"none", "ray", "joblib"}, optional, default: "none") – Type of backend to use. Default is the sequential backend ("none"). Alternative backends are "ray" and "joblib", which rely on the ray and joblib libraries respectively.
**kwargs –
Keyword arguments passed for importing images and masks ( mirp.data_import.import_image_and_mask.import_image_and_mask()) and configuring settings:
- general settings (GeneralSettingsClass)
- image post-processing (ImagePostProcessingClass)
- image perturbation / augmentation (ImagePerturbationSettingsClass)
- image interpolation / resampling ( ImageInterpolationSettingsClass and MaskInterpolationSettingsClass)
- mask resegmentation (ResegmentationSettingsClass)
- image transformation (ImageTransformationSettingsClass)
- feature computation / extraction ( FeatureExtractionSettingsClass)

Returns:

List of features, images and masks, depending on export_features and export_images.

Return type:

None | list[Any]

See also

Keyword arguments can be provided to configure the following:

image and mask import (import_image_and_mask())
general settings (GeneralSettingsClass)
image post-processing (ImagePostProcessingClass)
image perturbation / augmentation (ImagePerturbationSettingsClass)
image interpolation / resampling (ImageInterpolationSettingsClass and MaskInterpolationSettingsClass)
mask resegmentation (ResegmentationSettingsClass)
image transformation (ImageTransformationSettingsClass)
feature computation / extraction (FeatureExtractionSettingsClass)

mirp.extract_features_and_images.extract_features_and_images_generator(write_features: None | bool = None, export_features: None | bool = None, write_images: None | bool = None, export_images: None | bool = None, write_dir: None | str = None, image_export_format: str = 'dict', num_cpus: None | int = None, parallel_backend: None | str = None, **kwargs)

Processes images and computes features from regions of interest as a generator.

Parameters:

write_features (bool, optional) – Determines whether features computed from images should be written to the directory indicated by the write_dir keyword argument.
export_features (bool, optional) – Determines whether features computed from images should be returned by the function.
write_images (bool, optional) – Determines whether processed images and masks should be written to the directory indicated by the write_dir keyword argument.
export_images (bool, optional) – Determines whether processed images and masks should be returned by the function.
write_dir (str, optional) – Path to directory where features, processed images and masks should be written. If not set, features, processed images and masks are returned by this function. Required if write_features=True or write_images=True.
image_export_format ({"dict", "native", "numpy"}, default: "dict") – Return format for processed images and masks. "dict" returns dictionaries of images and masks as numpy arrays and associated characteristics. "native" returns images and masks in their internal format. "numpy" returns images and masks in numpy format. This argument is only used if export_images=True.
num_cpus (int, optional, default: None) – Number of CPU nodes that should be used for parallel processing. Image and mask processing can be parallelized using the joblib package. By default, images and masks are processed sequentially.
parallel_backend ({"none", "joblib"}, optional, default: "none") – Type of backend to use. Default is the sequential backend ("none"). "joblib" can be used as an alternative backend. "ray" cannot be used in a generator context, because only a single worker will be used.
**kwargs –
Keyword arguments passed for importing images and masks ( mirp.data_import.import_image_and_mask.import_image_and_mask()) and configuring settings:
- general settings (GeneralSettingsClass)
- image post-processing (ImagePostProcessingClass)
- image perturbation / augmentation (ImagePerturbationSettingsClass)
- image interpolation / resampling ( ImageInterpolationSettingsClass and MaskInterpolationSettingsClass)
- mask resegmentation (ResegmentationSettingsClass)
- image transformation (ImageTransformationSettingsClass)
- feature computation / extraction ( FeatureExtractionSettingsClass)

Yields:

None | list[Any] – List of features, images and masks, depending on export_features and export_images.

See also

Keyword arguments can be provided to configure the following:

image and mask import (import_image_and_mask())
general settings (GeneralSettingsClass)
image post-processing (ImagePostProcessingClass)
image perturbation / augmentation (ImagePerturbationSettingsClass)
image interpolation / resampling (ImageInterpolationSettingsClass and MaskInterpolationSettingsClass)
mask resegmentation (ResegmentationSettingsClass)
image transformation (ImageTransformationSettingsClass)
feature computation / extraction (FeatureExtractionSettingsClass)

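mirp.extract_features_and_images.extract_features(write_features, export_features, write_dir, num_cpus, **kwargs)

Compute features from regions of interest in images. This function is a wrapper around extract_features_and_images().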

Parameters:

write_features (bool, optional) – Determines whether features computed from images should be written to the directory indicated by the write_dir keyword argument.
export_features (bool, optional) – Determines whether features computed from images should be returned by the function.
write_dir (str, optional) – Path to directory where feature tables should be written. If not set, feature tables are returned by this function. Required if write_features=True.
num_cpus (int, optional, default: None) – Number of CPU nodes that should be used for parallel processing. Image processing and feature computation can be parallelized using the ray package. If a ray cluster is defined by the user, this cluster will be used instead. By default, images are processed sequentially.
**kwargs –
Keyword arguments passed for importing images and masks ( mirp.data_import.import_image_and_mask.import_image_and_mask()) and configuring settings:
- general settings (GeneralSettingsClass)
- image post-processing (ImagePostProcessingClass)
- image perturbation / augmentation (ImagePerturbationSettingsClass)
- image interpolation / resampling ( ImageInterpolationSettingsClass and MaskInterpolationSettingsClass)
- mask resegmentation (ResegmentationSettingsClass)
- image transformation (ImageTransformationSettingsClass)
- feature computation / extraction ( FeatureExtractionSettingsClass)

Returns:

List of feature tables, if export_features=True.

Return type:

None | list[Any]

See also

extract_features_and_images()
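
As an illustration of how write_features, export_features and write_dir interact, the sketch below writes feature tables to a directory without returning them. The wrapper write_feature_tables is a hypothetical helper. The names and format of the files that are written are not covered here.

```python
def write_feature_tables(image_path, mask_path, output_dir):
    # Imported lazily so that this sketch can be defined without MIRP installed.
    from mirp import extract_features

    # write_dir is required because write_features=True.
    # With export_features=False, feature tables are not returned to the caller.
    return extract_features(
        write_features=True,
        export_features=False,
        write_dir=output_dir,
        image=image_path,
        mask=mask_path,
        base_discretisation_method="fixed_bin_number",
        base_discretisation_n_bins=32
    )
```

Setting export_features=True instead would additionally return the feature tables, which can be convenient when results are both archived and analysed in the same session.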

mirp.extract_features_and_images.extract_features_generator(write_features: bool = False, export_features: bool = True, **kwargs)

Compute features from regions of interest in images. This generator is a wrapper around extract_features_and_images_generator().

Parameters:

write_features (bool, default: False) – Determines whether features computed from images should be written to the directory indicated by the write_dir keyword argument.
export_features (bool, default: True) – Determines whether features computed from images should be returned by the function.
**kwargs –
Keyword arguments passed for importing images and masks ( mirp.data_import.import_image_and_mask.import_image_and_mask()) and configuring settings:
- general settings (GeneralSettingsClass)
- image post-processing (ImagePostProcessingClass)
- image perturbation / augmentation (ImagePerturbationSettingsClass)
- image interpolation / resampling ( ImageInterpolationSettingsClass and MaskInterpolationSettingsClass)
- mask resegmentation (ResegmentationSettingsClass)
- image transformation (ImageTransformationSettingsClass)
- feature computation / extraction ( FeatureExtractionSettingsClass)

Yields:

None | list[Any] – List of feature tables, if export_features=True.

See also

extract_features_and_images_generator()

mirp.extract_features_and_images.extract_images(write_images: None | bool = None, export_images: None | bool = None, write_dir: None | str = None, image_export_format: str = 'dict', num_cpus: None | int = None, **kwargs)

Process images and masks. This function is a wrapper around extract_features_and_images().

Parameters:

write_images (bool, optional) – Determines whether processed images and masks should be written to the directory indicated by the write_dir keyword argument.
export_images (bool, optional) – Determines whether processed images and masks should be returned by the function.
write_dir (str, optional) – Path to directory where processed images and masks should be written. If not set, processed images and masks are returned by this function. Required if write_images=True.
image_export_format ({"dict", "native", "numpy"}, default: "dict") – Return format for processed images and masks. "dict" returns dictionaries of images and masks as numpy arrays and associated characteristics. "native" returns images and masks in their internal format. "numpy" returns images and masks in numpy format. This argument is only used if export_images=True.
num_cpus (int, optional, default: None) – Number of CPU nodes that should be used for parallel processing. Image processing and feature computation can be parallelized using the ray package. If a ray cluster is defined by the user, this cluster will be used instead. By default, images are processed sequentially.
**kwargs –
Keyword arguments passed for importing images and masks ( mirp.data_import.import_image_and_mask.import_image_and_mask()) and configuring settings:
- general settings (GeneralSettingsClass)
- image post-processing (ImagePostProcessingClass)
- image perturbation / augmentation (ImagePerturbationSettingsClass)
- image interpolation / resampling ( ImageInterpolationSettingsClass and MaskInterpolationSettingsClass)
- mask resegmentation (ResegmentationSettingsClass)
- image transformation (ImageTransformationSettingsClass)

Returns:

List of processed images and masks, if export_images=True.

Return type:

None | list[Any]

See also

extract_features_and_images()

mirp.extract_features_and_images.extract_images_generator(write_images: bool = False, export_images: bool = True, write_dir: None | str = None, image_export_format: str = 'dict', **kwargs)

Process images and masks. This generator is a wrapper around extract_features_and_images_generator().

Parameters:

write_images (bool, default: False) – Determines whether processed images and masks should be written to the directory indicated by the write_dir keyword argument.
export_images (bool, default: True) – Determines whether processed images and masks should be returned by the function.
write_dir (str, optional) – Path to directory where processed images and masks should be written. If not set, processed images and masks are returned by this function. Required if write_images=True.
image_export_format ({"dict", "native", "numpy"}, default: "dict") – Return format for processed images and masks. "dict" returns dictionaries of images and masks as numpy arrays and associated characteristics. "native" returns images and masks in their internal format. "numpy" returns images and masks in numpy format. This argument is only used if export_images=True.
**kwargs –
Keyword arguments passed for importing images and masks ( mirp.data_import.import_image_and_mask.import_image_and_mask()) and configuring settings:
- general settings (GeneralSettingsClass)
- image post-processing (ImagePostProcessingClass)
- image perturbation / augmentation (ImagePerturbationSettingsClass)
- image interpolation / resampling ( ImageInterpolationSettingsClass and MaskInterpolationSettingsClass)
- mask resegmentation (ResegmentationSettingsClass)
- image transformation (ImageTransformationSettingsClass)

Yields:

None | list[Any] – List of processed images and masks, if export_images=True.

See also

extract_features_and_images_generator()

References

[Zwanenburg2016]

Zwanenburg A, Leger S, Vallieres M, Loeck S. Image Biomarker Standardisation Initiative. arXiv [cs.CV]. 2016. doi:10.48550/arXiv.1612.07003

[Zwanenburg2020]

Zwanenburg A, Vallieres M, Abdalah MA, Aerts HJWL, Andrearczyk V, Apte A, et al. The Image Biomarker Standardization Initiative: Standardized Quantitative Radiomics for High-Throughput Image-based Phenotyping. Radiology. 2020;295: 328-338. doi:10.1148/radiol.2020191145

[Depeursinge2020]

Depeursinge A, Andrearczyk V, Whybra P, van Griethuysen J, Mueller H, Schaer R, et al. Standardised convolutional filtering for radiomics. arXiv [eess.IV]. 2020. doi:10.48550/arXiv.2006.05470

[Whybra2024]

Whybra P, Zwanenburg A, Andrearczyk V, Schaer R, Apte AP, Ayotte A, et al. The Image Biomarker Standardization Initiative: Standardized Convolutional Filters for Reproducible Radiomics and Enhanced Clinical Insights. Radiology. 2024;310: e231319. doi:10.1148/radiol.231319