Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Support for pydicom v3 #972

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

medihack
Copy link

@medihack medihack commented Oct 12, 2024

Reference issue

Support for pydicom v3, but drops support for pydicom v2.

Tasks

  • Unit tests added that reproduce issue or prove feature is working
  • Fix or feature added
  • Documentation and examples updated (if relevant)
  • Unit tests passing and coverage at 100% after adding fix/feature
  • Type annotations updated and passing with mypy
  • Apps updated and tested (if relevant)

Copy link

codecov bot commented Oct 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (bc5f1ae) to head (32ef447).

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #972   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           28        28           
  Lines         8708      8706    -2     
=========================================
- Hits          8708      8706    -2     
Files with missing lines Coverage Δ
pynetdicom/association.py 100.00% <100.00%> (ø)
pynetdicom/dsutils.py 100.00% <100.00%> (ø)
pynetdicom/events.py 100.00% <ø> (ø)

@medihack
Copy link
Author

medihack commented Oct 13, 2024

The coverage went down a bit as the original_encoding property with pydicom v3 is automatically set (in contrast to pydicom v2) when a dataset is loaded. That way this if block is not reachable anymore with the loaded dataset in test_dataset_encoding_mismatch (for example ds.is_implicit_VR = True doesn't influence ds_encoding anymore).
The original_encoding is (None, None) in a manually created dataset. So, in contrast to pydicom v2, original_encoding is always set in pydicom v3.

@medihack
Copy link
Author

This should hopefully have 100% coverage again. There is one caveat. It behaves differently than before when loading a dataset (because original_encoding seems to be always present under pydicom v3), but I think that is the way it should be. The comment in the code dataset might also be created from scratch makes more sense to me now.

@medihack
Copy link
Author

Good to know ... ruff formatting is not fully compatible with black. But for that conda error I am unguilty 😉.

@mrbean-bremen
Copy link
Member

ruff formatting is not fully compatible with black

Yeah, it probably makes sense to introduce pre-commit into this repo like in pydicom, so the same tests can be run locally as in the CI.

Conda seemed to have a temporary problem - I restarted the test, and it seems to work now.

@mrbean-bremen
Copy link
Member

mrbean-bremen commented Oct 14, 2024

BTW, I think you can remove this "WIP" - I'll leave the actual review to @scaramallion, but for me it all looks good!

@medihack medihack changed the title [WIP] Support for pydicom v3 [MRG] Support for pydicom v3 Oct 14, 2024
@medihack
Copy link
Author

medihack commented Oct 14, 2024

One thing that just came to my mind. Now that I create the dataset manually in test_dataset_encoding_mismatch we should maybe add another test that explicitly shows that the original_encoding is used when loading a dataset and the is_implicit_VR and is_little_endian are ignored then. And should also show some warning when there is a mismatch. So I am not fully happy with the PR and revise it a bit. Sorry for the long chain-of-thought in this PR 😉

@medihack medihack changed the title [MRG] Support for pydicom v3 [WIP] Support for pydicom v3 Oct 14, 2024
@mrbean-bremen mrbean-bremen mentioned this pull request Oct 14, 2024
6 tasks
@medihack
Copy link
Author

After much thought, I would like to propose a change where the transfer syntax is always preferred over the encoding of the dataset in the send_c_store method. But this would be a breaking change. Currently, the now deprecated Dataset.is_implicit_VR and Dataset.is_little_endian can override the transfer syntax (but only if both are set). With the proposed change, the encoding would be selected more similarly to the dcmwrite function of pydicom v3 (resp. the _determine_encoding function).

This means that the priority of the selected encoding in send_c_store would be

  1. The encoding of the transfer syntax in the file_meta
  2. Dataset.is_implicit_VR and Dataset.is_little_endian (if 1. is not present)
  3. Dataset.original_encoding (if 1. and 2. is not present)

I don't have a strong opinion about this. That's why I want to discuss this before implementing it. If this change is unwanted I will of course do it in a different way, but we need a clear way of how to handle the encoding in send_c_store now that Dataset.is_implicit_VR and Dataset.is_little_endian are deprecated and original_dataset is always set.

Comment on lines -64 to -66
# File Meta Information is always encoded as Explicit VR Little Endian
file_meta.is_little_endian = True
file_meta.is_implicit_VR = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file_meta.set_original_encoding(False, True)

if elem.VM == 1:
value = f"(Sequence with {len(elem.value)} item)"
# Sequence elements always have a VM of 1
assert elem.VM == 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the assert

Comment on lines -871 to -873
ds.is_little_endian = t_syntax.is_little_endian
ds.is_implicit_VR = t_syntax.is_implicit_VR

Copy link
Member

@scaramallion scaramallion Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ds.set_original_encoding(...)

Comment on lines -1895 to -1896
ds.is_implicit_VR = False
ds.is_little_endian = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ds.set_original_encoding(...)

@scaramallion
Copy link
Member

scaramallion commented Oct 15, 2024

After much thought, I would like to propose a change where the transfer syntax is always preferred...

The actual encoding of the dataset is what's important when sending, as we need to determine if it matches the agreed upon transfer syntax, and if not then convert the dataset to match.

For example, if a dataset:

  • Had Implicit VR Little Endian as the Transfer Syntax UID
  • Was actually encoded as explicit VR
  • The agreed upon transfer syntax is Explicit VR Little Endian

Then the dataset can be sent as-is with no conversion. If the agreed upon transfer syntax were instead Implicit VR Little Endian it would need to be read into memory and converted to implicit VR prior to sending.

If the dataset is new, then you can just use the agreed upon transfer syntax.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants