Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CU-8695d4www pydantic 2 #476

Draft
wants to merge 46 commits into
base: master
Choose a base branch
from
Draft

CU-8695d4www pydantic 2 #476

wants to merge 46 commits into from

Conversation

mart-r
Copy link
Collaborator

@mart-r mart-r commented Aug 12, 2024

Let's see if we can do both pydantic 1 and pydantic 2.

This should in principle work with both.

Though I don't want to run GHA for both all the time. So I'll first run it for pydantic 2, then (if successful), change back to 1 and run it on that. And (if that's all good), change back to pydantic 2.

EDIT:
I've now added something to the main GHA workflow that should catch any use of the .dict( method or the .__fields__ attribute from the pydantic1 era.
However, this is only limited to those two (the only things that were previously being used). So if something else that doesn't work cross 1 and 2 is used, it wouldn't be caught.
This could also at some point match other use cases unrelated the pydantic since it's a simple grep (it doesn't currently).

EDIT#2:
TODO: Support pydantic 2 only

@tomolopolis
Copy link
Member

Task linked: CU-8695d4www Support pydantic 2?

@mart-r
Copy link
Collaborator Author

mart-r commented Aug 13, 2024

NOTE:
Now I've run through everything with both pydantic 1 and 2, so it should support either.

Though the issue is that any future changes would probably not be tested in any capacity on pydantic 1 (v2 will be installed by default).

Copy link
Member

@baixiac baixiac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. Just that I noticed pydantic v1 dict() is still called on the MultiDescriptor base model at

jsonpath.write_text(json.dumps(res.dict(), indent=jsonindent))

@@ -158,7 +158,7 @@ class Train(MixingConfig, BaseModel):
"""If set only this CUIs will be used for training"""
auto_save_model: bool = True
"""Should do model be saved during training for best results"""
last_train_on: Optional[int] = None
last_train_on: Optional[float] = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

def deprecated(message: str, depr_version: Tuple[int, int, int], removal_version: Tuple[int, int, int]) -> Callable:
def deprecated(message: str, depr_version: Tuple[int, int, int],
removal_version: Tuple[int, int, int],
allow_usage: bool = False) -> Callable:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there is no need to add and expose this argument to the public API? It looks deprecated() will be monkey-patched before tests run anyway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method does indeed get monkey-patched during testing.

I suppose we could add **kwargs to the method to avoid describing the argument. But we can't remove it entirely since then the addition of the allow_usage keyword/argument would cause an exception during use (TypeError due to unexpected keyword argument).
But at the same time, we if this was hidden, the future maintainer of this resource might not know how to achieve the same result.

The main reason the method gets monkey-patched during testing is because we (generally) want to avoid using deprecated methods in our code. And if we had 100% test coverage (which we certainly don't - but that's besides the point), raising an exception during test time would guarantee that we're not.
But I've made an exception to this one, mostly so we can pre-specify when we'd stop supporting pydantic 1. So that we don't forget to do this at this later date, and so that it's actually documented (at least in code) and caught during GHA workflow (at the appropriate release time).

But if you've got some ideas on how to do this more elegantly, then don't hesitate to propose them!

Copy link
Member

@baixiac baixiac Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright. In that case, I think there is no need to deprecate get_model_dump and get_model_fields in this PR given the goal is to support both versions ATM and they are supposed to do the job well. Besides, it is a bit weird to add a new method and deprecate it at the same time while I have no strong opinion on this.

There will be warnings like Field "model_X" has conflict with protected namespace "model_" so to me, extra base model field renaming will be needed before medcat fully embraces pydantic v2 and deprecates v1.

@mart-r mart-r mentioned this pull request Sep 27, 2024
@mart-r mart-r marked this pull request as draft October 14, 2024 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants