Allow segmentations to reference multiple images #199

Open · wants to merge 14 commits into base: master
206 changes: 131 additions & 75 deletions src/highdicom/seg/sop.py
@@ -99,9 +99,14 @@ def __init__(
"""
Parameters
----------
-    source_images: Sequence[Dataset]
+    source_images: Sequence[pydicom.dataset.Dataset]
         One or more single- or multi-frame images (or metadata of images)
-        from which the segmentation was derived
+        from which the segmentation was derived. The images must have the
+        same dimensions (rows, columns) and orientation, have the same frame
+        of reference, and contain the same number of frames.
Member:
I believe in the general case, it is not required that the geometry of the images being segmented is consistent with the geometry of the segmentation. I was not sure if this constraint is here to simplify the implementation, which would make a lot of sense, or for some other reason.

Collaborator:

To clarify, this means that the source images must have the same "(rows, columns) and orientation, have the same frame of reference, and contain the same number of frames" as each other. We do handle the case where this differs from the geometry of the segmentation (through the plane_positions parameter)
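The kind of consistency constraint being discussed can be sketched in isolation. This is not highdicom's actual implementation; `images_are_consistent` is a hypothetical helper, and `SimpleNamespace` stands in for `pydicom.dataset.Dataset` (attribute names mirror DICOM keywords):

```python
from types import SimpleNamespace

def images_are_consistent(images):
    """Return True if all images share shape, orientation, frame of
    reference, and frame count (a sketch of the documented constraint)."""
    def key(img):
        return (
            img.Rows,
            img.Columns,
            tuple(getattr(img, 'ImageOrientationPatient', [])),
            getattr(img, 'FrameOfReferenceUID', None),
            int(getattr(img, 'NumberOfFrames', 1)),
        )
    # All images are consistent iff they collapse to a single key
    return len({key(img) for img in images}) <= 1

img_a = SimpleNamespace(Rows=512, Columns=512,
                        ImageOrientationPatient=[1, 0, 0, 0, 1, 0],
                        FrameOfReferenceUID='1.2.3', NumberOfFrames=1)
img_b = SimpleNamespace(Rows=256, Columns=256,
                        ImageOrientationPatient=[1, 0, 0, 0, 1, 0],
                        FrameOfReferenceUID='1.2.3', NumberOfFrames=1)

assert images_are_consistent([img_a, img_a])
assert not images_are_consistent([img_a, img_b])
```

Note that the segmentation's own geometry may still differ from this shared geometry, since the caller can pass plane positions explicitly.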

Member:

I see. I don't think that is required either by the standard (other than the same FoR).

Collaborator (Author):

The standard may not place any constraints on the referenced source images. However, in practice the constraints should generally apply. For example, all instances of a series of CT images generally have the same size and orientation.

If this should ever become a problem in practice, we can further relax the constraints. However, this would further complicate mapping segmentation frames to source image frames.

Member (@fedorov, Oct 12, 2022):

This all makes sense. I want to make sure it is clear that I am not pushing for those features, I just wanted to raise awareness.

Those are not hypothetical situations, however. In practice, it is common for different series within a single MR study (in prostate and brain imaging, to give a few examples) to have varying resolution/orientation.

Collaborator (Author):

Thanks @fedorov. That's a good point. In the short term, I think it's unlikely that we will be able to support such use cases. However, we should consider them for the API design.

Collaborator:

Yes, as we open ourselves up to having multiple source series, this becomes relevant as different series may have different values for these attributes. Previously this test was there since we were assuming that they were consistent within a series.

I think at the very least we would still want these checks:

  • Within a series, and
  • Within all images if the user has not provided plane positions for the segmentation
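The two-tier rule proposed above could look roughly like this. It is purely illustrative (`check_source_image_geometry` is a hypothetical helper, not code from this PR): geometry is always enforced within each series, and across all series only when the caller supplies no plane positions for the segmentation:

```python
from collections import defaultdict
from types import SimpleNamespace

def check_source_image_geometry(images, plane_positions=None):
    """Enforce geometric consistency within each series, and across all
    images only when no plane positions were provided (sketch)."""
    def key(img):
        return (img.Rows, img.Columns,
                tuple(img.ImageOrientationPatient),
                img.FrameOfReferenceUID)

    # Group images by their series
    by_series = defaultdict(list)
    for img in images:
        by_series[img.SeriesInstanceUID].append(img)

    # Tier 1: consistency within each series
    for uid, series_images in by_series.items():
        if len({key(i) for i in series_images}) > 1:
            raise ValueError(f'Inconsistent geometry within series {uid}.')

    # Tier 2: global consistency, only without explicit plane positions
    if plane_positions is None and len({key(i) for i in images}) > 1:
        raise ValueError(
            'Source series have differing geometry; plane positions for '
            'the segmentation must be provided explicitly.'
        )

a = SimpleNamespace(SeriesInstanceUID='1.1', Rows=512, Columns=512,
                    ImageOrientationPatient=[1, 0, 0, 0, 1, 0],
                    FrameOfReferenceUID='1.2.3')
b = SimpleNamespace(SeriesInstanceUID='1.2', Rows=256, Columns=256,
                    ImageOrientationPatient=[1, 0, 0, 0, 1, 0],
                    FrameOfReferenceUID='1.2.3')

check_source_image_geometry([a, a])                     # passes: one series
check_source_image_geometry([a, b], plane_positions=[]) # passes: per-series only
```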

        In case of multi-frame images that are tiled (e.g., VL Whole Slide
        Microscopy Image instances), the images may be from multiple
        series as long as the other requirements are satisfied.
pixel_array: numpy.ndarray
Array of segmentation pixel data of boolean, unsigned integer or
floating point data type representing a mask image. The array may
@@ -274,29 +279,52 @@ def __init__(
if len(source_images) == 0:
raise ValueError('At least one source image is required.')

unique_sop_instance_uids = set(
[image.SOPInstanceUID for image in source_images]
)
if len(source_images) != len(unique_sop_instance_uids):
raise ValueError(
'Source images must all have a unique SOP Instance UID.'
)

uniqueness_criteria = set(
(
image.StudyInstanceUID,
image.SeriesInstanceUID,
image.Rows,
image.Columns,
int(getattr(image, 'NumberOfFrames', '1')),
hasattr(image, 'FrameOfReferenceUID'),
getattr(image, 'FrameOfReferenceUID', None),
hasattr(image, 'TotalPixelMatrixRows'),
getattr(image, 'TotalPixelMatrixRows', None),
hasattr(image, 'TotalPixelMatrixColumns'),
getattr(image, 'TotalPixelMatrixColumns', None),
hasattr(image, 'TotalPixelMatrixFocalPlanes'),
getattr(image, 'TotalPixelMatrixFocalPlanes', None),
tuple(getattr(image, 'ImageOrientationPatient', [])),
tuple(getattr(image, 'ImageOrientationSlide', [])),
hasattr(image, 'DimensionOrganizationType'),
getattr(image, 'DimensionOrganizationType', None),
len(getattr(image, 'PerFrameFunctionalGroupsSequence', [])),
len(getattr(image, 'SharedFunctionalGroupsSequence', [])),
)
for image in source_images
)
if len(uniqueness_criteria) > 1:
raise ValueError(
-                'Source images must all be part of the same series and must '
-                'have the same image dimensions (number of rows/columns).'
+                'Source images must all have the same image dimensions '
+                '(number of rows/columns) and image orientation, '
+                'have the same frame of reference, '
+                'and contain the same number of frames.'
)

if pixel_array.ndim == 2:
pixel_array = pixel_array[np.newaxis, ...]
if pixel_array.ndim not in [3, 4]:
raise ValueError('Pixel array must be a 2D, 3D, or 4D array.')

src_img = source_images[0]
is_multiframe = hasattr(src_img, 'NumberOfFrames')
if is_multiframe and len(source_images) > 1:
raise ValueError(
'Only one source image should be provided in case images '
'are multi-frame images.'
)
is_tiled = hasattr(src_img, 'TotalPixelMatrixRows')
supported_transfer_syntaxes = {
ImplicitVRLittleEndian,
@@ -310,11 +338,6 @@ def __init__(
f'Transfer syntax "{transfer_syntax_uid}" is not supported.'
)

if pixel_array.ndim == 2:
pixel_array = pixel_array[np.newaxis, ...]
if pixel_array.ndim not in [3, 4]:
raise ValueError('Pixel array must be a 2D, 3D, or 4D array.')

super().__init__(
study_instance_uid=src_img.StudyInstanceUID,
series_instance_uid=series_instance_uid,
@@ -342,6 +365,55 @@ def __init__(
**kwargs
)

# General Reference
self.SourceImageSequence: List[Dataset] = []
referenced_series: Dict[str, List[Dataset]] = defaultdict(list)
for s_img in source_images:
ref = Dataset()
ref.ReferencedSOPClassUID = s_img.SOPClassUID
ref.ReferencedSOPInstanceUID = s_img.SOPInstanceUID
self.SourceImageSequence.append(ref)
referenced_series[s_img.SeriesInstanceUID].append(ref)

if len(referenced_series) > 1:
if is_multiframe and not is_tiled:
raise ValueError(
'If source images are multiple-frame images that are '
'not tiled, then only a single source image from a single '
'series must be provided.'
)
Collaborator:

I really don't understand why there is different behaviour for tiled non-tiled. Can you please explain?

Collaborator (Author):

My thinking goes as follows: If a multi-frame image is not tiled, then it likely represents an Enhanced CT/MR/PET Image and I wasn't sure how that should be handled. Would we need to check whether the number of frames in the image (or in each of the images, if multiple were allowed) matches the first dimension of pixel_array?

Collaborator (Author):

You are probably right and we can just handle tiled and non-tiled the same way.

Collaborator (Author):

Addressed via a2c0d1d

Collaborator (Author):

For consistency, we should now also support source_images: Sequence[Sequence[pydicom.Dataset]] to allow users to specify multiple series of single-frame images. However, that will require further thought and work.
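The suggested nested signature could be normalized internally along these lines. This is not part of the PR, just a sketch; `normalize_source_images` is a hypothetical helper name, and strings stand in for `pydicom.Dataset` instances:

```python
def normalize_source_images(source_images):
    """Flatten a sequence of images, or a sequence of per-series
    sequences of images, into a single flat list (sketch)."""
    # A nested sequence is detected by its first element being itself
    # a list/tuple rather than a Dataset-like object.
    if source_images and isinstance(source_images[0], (list, tuple)):
        return [img for series in source_images for img in series]
    return list(source_images)

# Nested form: two series of single-frame images
assert normalize_source_images([['a', 'b'], ['c']]) == ['a', 'b', 'c']
# Flat form is passed through unchanged
assert normalize_source_images(['a', 'b']) == ['a', 'b']
```

With such a helper, the rest of the constructor could operate on a flat list regardless of which form the caller provided.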

elif not is_multiframe:
raise ValueError(
'If source images are single-frame images, then all '
'source images must be from a single series.'
)

# Common Instance Reference
self.ReferencedSeriesSequence: List[Dataset] = []
for series_instance_uid, referenced_images in referenced_series.items():
if is_multiframe and not is_tiled:
if len(referenced_images) > 1:
raise ValueError(
'If source images are multiple-frame images that are '
'not tiled, then only a single source image must be '
f'provided. However, n={len(referenced_images)} images '
f'were provided for series "{series_instance_uid}".'
)
elif not is_multiframe:
if len(referenced_images) != pixel_array.shape[0]:
raise ValueError(
                        'If source images are single-frame images, then '
                        f'n={pixel_array.shape[0]} source images must be '
                        'provided per series. '
f'However, n={len(referenced_images)} images were '
f'provided for series "{series_instance_uid}".'
)

ref = Dataset()
ref.SeriesInstanceUID = series_instance_uid
ref.ReferencedInstanceSequence = list(referenced_images)
self.ReferencedSeriesSequence.append(ref)

# Frame of Reference
has_ref_frame_uid = hasattr(src_img, 'FrameOfReferenceUID')
if has_ref_frame_uid:
@@ -395,24 +467,6 @@ def __init__(
)
self._coordinate_system = None

# General Reference
self.SourceImageSequence: List[Dataset] = []
referenced_series: Dict[str, List[Dataset]] = defaultdict(list)
for s_img in source_images:
ref = Dataset()
ref.ReferencedSOPClassUID = s_img.SOPClassUID
ref.ReferencedSOPInstanceUID = s_img.SOPInstanceUID
self.SourceImageSequence.append(ref)
referenced_series[s_img.SeriesInstanceUID].append(ref)

# Common Instance Reference
self.ReferencedSeriesSequence: List[Dataset] = []
for series_instance_uid, referenced_images in referenced_series.items():
ref = Dataset()
ref.SeriesInstanceUID = series_instance_uid
ref.ReferencedInstanceSequence = referenced_images
self.ReferencedSeriesSequence.append(ref)

# Image Pixel
self.Rows = pixel_array.shape[1]
self.Columns = pixel_array.shape[2]
@@ -742,6 +796,8 @@ def __init__(
# bitpacking at the end
full_pixel_array = np.array([], np.bool_)

derivation_code = codes.cid7203.Segmentation
purpose_code = codes.cid7202.SourceImageForImageProcessingOperation
for i, segment_number in enumerate(described_segment_numbers):
# Pixel array for just this segment
if pixel_array.dtype in (np.float_, np.float32, np.float64):
@@ -787,17 +843,11 @@ def __init__(
# absent. Such frames should be removed
if omit_empty_frames and np.sum(planes[j]) == 0:
logger.info(
-                        'skip empty plane {} of segment #{}'.format(
-                            j, segment_number
-                        )
+                        f'skip empty plane {j} of segment #{segment_number}'
)
continue
contained_plane_index.append(j)
-                logger.info(
-                    'add plane #{} for segment #{}'.format(
-                        j, segment_number
-                    )
-                )
+                logger.info(f'add plane #{j} for segment #{segment_number}')

pffp_item = Dataset()
frame_content_item = Dataset()
@@ -837,9 +887,9 @@ def __init__(
]
except IndexError as error:
raise IndexError(
-                            'Could not determine position of plane #{} in '
+                            f'Could not determine position of plane #{j} in '
                             'three dimensional coordinate system based on '
-                            'dimension index values: {}'.format(j, error)
+                            f'dimension index values: {error}'
)
frame_content_item.DimensionIndexValues = (
[segment_number] + index_values
@@ -858,43 +908,49 @@ def __init__(
pffp_item.DerivationImageSequence = []

if are_spatial_locations_preserved:
-                        derivation_image_item = Dataset()
-                        derivation_code = codes.cid7203.Segmentation
-                        derivation_image_item.DerivationCodeSequence = [
+                        derivation_img_item = Dataset()
+                        derivation_img_item.DerivationCodeSequence = [
CodedConcept.from_code(derivation_code)
]

-                        derivation_src_img_item = Dataset()
-                        if hasattr(source_images[0], 'NumberOfFrames'):
-                            # A single multi-frame source image
-                            src_img_item = self.SourceImageSequence[0]
-                            # Frame numbers are one-based
-                            derivation_src_img_item.ReferencedFrameNumber = (
-                                source_image_index + 1
-                            )
-                        else:
-                            # Multiple single-frame source images
-                            src_img_item = self.SourceImageSequence[
-                                source_image_index
-                            ]
-                        derivation_src_img_item.ReferencedSOPClassUID = \
-                            src_img_item.ReferencedSOPClassUID
-                        derivation_src_img_item.ReferencedSOPInstanceUID = \
-                            src_img_item.ReferencedSOPInstanceUID
-                        purpose_code = \
-                            codes.cid7202.SourceImageForImageProcessingOperation
-                        derivation_src_img_item.PurposeOfReferenceCodeSequence = [
-                            CodedConcept.from_code(purpose_code)
-                        ]
-                        derivation_src_img_item.SpatialLocationsPreserved = 'YES'
-                        derivation_image_item.SourceImageSequence = [
-                            derivation_src_img_item,
-                        ]
+                        derivation_img_item.SourceImageSequence = []

for _, referenced_images in referenced_series.items():
if is_multiframe:
for src_item in referenced_images:
drv_src_item = Dataset()
drv_src_item.ReferencedFrameNumber = (
source_image_index + 1
)
drv_src_item.ReferencedSOPClassUID = \
src_item.ReferencedSOPClassUID
drv_src_item.ReferencedSOPInstanceUID = \
src_item.ReferencedSOPInstanceUID
drv_src_item.PurposeOfReferenceCodeSequence = [
CodedConcept.from_code(purpose_code)
]
drv_src_item.SpatialLocationsPreserved = 'YES'
derivation_img_item.SourceImageSequence.append(
drv_src_item
)
else:
src_item = referenced_images[source_image_index]
drv_src_item = Dataset()
drv_src_item.ReferencedSOPClassUID = \
src_item.ReferencedSOPClassUID
drv_src_item.ReferencedSOPInstanceUID = \
src_item.ReferencedSOPInstanceUID
drv_src_item.PurposeOfReferenceCodeSequence = [
CodedConcept.from_code(purpose_code)
]
drv_src_item.SpatialLocationsPreserved = 'YES'
derivation_img_item.SourceImageSequence.append(
drv_src_item
)
pffp_item.DerivationImageSequence.append(
-                                derivation_image_item
+                                derivation_img_item
)
else:
-                        logger.warning('spatial locations not preserved')
+                        logger.warning('spatial locations are not preserved')

identification = Dataset()
identification.ReferencedSegmentNumber = segment_number