Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Item model utils #269

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

Item model utils #269

wants to merge 10 commits into from

Conversation

drio18
Copy link
Contributor

@drio18 drio18 commented Jul 17, 2023

Here, we add utils for creating "item models", classes that enable navigating portal items directly as well as centralize common operations. Related snovault PR that adds navigation via requests here.

Additionally, we add a new testing_utils module to share code related to writing tests.

@coveralls
Copy link

coveralls commented Jul 17, 2023

Pull Request Test Coverage Report for Build 5754997444

  • 135 of 140 (96.43%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.3%) to 76.251%

Changes Missing Coverage Covered Lines Changed/Added Lines %
dcicutils/item_model_utils.py 120 122 98.36%
dcicutils/testing_utils.py 15 18 83.33%
Totals Coverage Status
Change from base Build 5590491337: 0.3%
Covered Lines: 7590
Relevant Lines: 9954

💛 - Coveralls

Copy link
Collaborator

@netsettler netsettler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of these comments are stylistic. I apologize in advance for that, butI found this code a bit hard to read because of the chosen coding style, which doesn't match our normal style in other modules. I've tried to cite rationales for why we use the style we use. I fought with PEP8 for a long time when I came to Python but mostly feel like trying to normalize style makes it easier to both read and especially to grep.

There are a few non-syntactic comments in here, and I may have more later. I'd like to talk to you interactively about some of this stuff before taking another look.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might suggest either model_utils or item_utils. The name doesn't have to be complex, just enough to distinguish it from other things we're likely to have. Are we likely to have any other kinds of models that we want utils for? Or any other kinds of items? I'm OK leaving it if you want to, I'll leave that to you since you'll use it. But it just seems clumsy and like unnecessarily many letters.

Comment on lines +17 to +21
def get_link_to(
existing_item: PortalItem,
link_to: LinkTo,
item_to_create: PortalItem,
) -> Union[str, PortalItem]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a matter of personal preference, I'd like to see this indented like this:

Suggested change
def get_link_to(
existing_item: PortalItem,
link_to: LinkTo,
item_to_create: PortalItem,
) -> Union[str, PortalItem]:
def get_link_to(existing_item: PortalItem,
link_to: LinkTo,
item_to_create: PortalItem,
) -> Union[str, PortalItem]:

The reason I like this style better is:

  1. It leaves "def" and the function name easier to pick out of the crowd.
  2. It means that the only thing in column 0 after a def is the next definition.

Comment on lines +22 to +29
"""Create new item model from existing one for given linkTo.

LinkTos be identifiers (e.g. UUIDs) or (partially) embedded objects.

Follow rules of existing item model for fetching linkTo via
request. If not fetching via request, then make item model from
existing properties if possible.
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are referring to parameters by their type rather than their name. I get why that is. The name is somewhat encouraged to be a python name and fights the notation used in the JSON. But in the worst case this style can be confusing if there's more than one thing with the same type, and falling back on different notation in that case can be irregular. I tend to think of documentation as being program-like and it's useful to refer to things that are parameters by the names of the parameter so the reader of the doc doesn't have to guess. e.g.,

Suggested change
"""Create new item model from existing one for given linkTo.
LinkTos be identifiers (e.g. UUIDs) or (partially) embedded objects.
Follow rules of existing item model for fetching linkTo via
request. If not fetching via request, then make item model from
existing properties if possible.
"""
"""Create new item model from existing_item for given link_to.
A link_to should be an identifier (e.g. a UUID) or a (partially) embedded object.
Follow rules of existing item model for fetching linkTo via
request. If not fetching via request, then make item model from
existing properties if possible.
"""

Note also that you referred in your text to LinkTos in plural, which is awkward to start because you are pluralizing a type name and there is no such plural name, but also the argument doesn't seem to allow multiple link_to ids, just one.

I was unable to interpret what the sentence starting "Follow rules of..." even means, so I left it alone. I also don't quite get how an existing item (which is presumably an object?) can also be an item to create (which presumably can't be an object (because it doesn't exist yet). Are these item types?

I am surprised PortalItem works as a forward reference. Did you try that in Python 3.7? I thought that did not work if you're using it in a type description but maybe it does if it's a class. I don't think it works for a free-standing assignment such as you did LinkTo if it follows. Usually you have to use 'PortalItem' instead for forward references.

@@ -1,6 +1,6 @@
[tool.poetry]
name = "dcicutils"
version = "7.6.0"
version = "7.7.0.1b1"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally the convention we are using is to make the beta be epsilon more than the version it is built from, not the build it is targeting. For example, if a 7.7.0 is issued while this is being developed, and you have not merged that version, this version will sort higher and so things that seek "^7.7.0" will think you have support for everything below this number and I don't think you do. Unless maybe you've merged 7.7.0 into here, in which case ignore me. :) I often add a comment saying something about what the highest merged version is. The entire tech for all this is heuristic, but I hope that makes sense.


@dataclass(frozen=True)
class NestedProperty:
properties: Dict[str, Any]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like a blank line before this, since there are blank lines between the definitions. It just looks better, even if PEP8 might not require it.

from unittest import mock


AN_UNLIKELY_RETURN_VALUE = "unlikely return value"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ordinary convention for Python is to just do

AN_UNLIKELY_RETURN_VALUE = object()

because as soon as you name a string, it becomes interned and possible for someone else to find by name in a pointer-equal way. It's annoying at times. But this is why I added dcicutils.NamedObject, so that you can do something a little classier that is actually reliably unique (you could name it something less probabilistic, too):

_UNSUPPLIED = NamedObject('unsupplied')

where the token NamedObject has unique identity and prints in a more stringy way.

>>> x = NamedObject('unsupplied')
>>> str(x)
'<unsupplied>'
>>> repr(x)
'<unsupplied@101036400>'
>>> x
<unsupplied@101036400>

(Note that I recommend using _ in the name of this because in my long-time considered opinion it's a bad idea to export such a name so that different modules or classes use the same variable for missing/undefined/etc or else one might pass its missing argument explicitly to something else who might think no argument had been passed. The semantics gets quickly messy because it's not clear that an explicitly passed missing argument is intended to mean missing (though that's subject to intent in each such situation). Better each use of unsupplied should be a separate object in my view unless you've thought it through carefully and really mean to be creating a whole ecosystem that conspires on the notion of what a filled or missing argument is, and usually that's a distraction from work you're trying to do. So making the var non-exported/shared seems better to me.)

else:
target = f"{module.__name__}.{to_patch.__qualname__}"
with mock.patch(target, new_callable=new_callable, **kwargs) as mocked_item:
if return_value != AN_UNLIKELY_RETURN_VALUE:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use NamedObject, you can rely on object identity and this will read nicer:

if return_value is not UNSUPPLIED:
    mocked_item.return_value = return_value

Comment on lines +99 to +101
def _get_link_tos(
self, link_tos: Iterable[LinkTo], item_to_create: PortalItem
) -> List[Union[str, PortalItem]]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't mark all of the occasions of this issue in this PR, like I did in its companion, but there's no reason for this to extend over three lines. The function header would fit on one line. If you did have to break it, it shouldn't go to column zero (as I explained in more detail elsewhere).

assert result.should_fetch_links() == portal_item.should_fetch_links()


def get_nested_property(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is just making a sample property. I find this kind of thing hard to read because in test files, it's often hard to tell what the newly-defined functions to be tested are and what are the testing artifacts. I like to name things like this something like make_sample_tested_property so that in context of use, it's clear it's a testing artifact. Also, and it's a subtle point, but I like verbs like get, find, lookup, and make to have consistent meanings, and while we don't have a global convention for some of the subtleties I use for myself, I do think get usually connotes finding something that exists already, and here you're making something anew, which is why I'd use make as the part of the function name that's the verb, though maybe you're thinking of this as a representation of something that exists in the data? If so, I would leave it as get but I would clarify that in a doc string, which should be present in any case since this function does something non-trivial.

Note that I'm not perfect at writing doc strings myself, but Will usually beats me up about it, presumably because it's hard to look at someone else's code and figure out what they're doing and he's usually on the outside of what I'm doing trying to claw his way in, as I'm doing here. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the subject of docstrings, I really think some of these tests could use them and even some inline comments. They are quite complex relatively speaking and merit such documentation.

)
assert len(mock_get_metadata.call_args_list) == 1

@pytest.mark.parametrize(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily something to change, but I have to say the body of this is so small and so conditional-ridden that I found myself wondering if individual tests for each of these cases would not be clearer. It's probably fine. But just something to ponder. There is a sense in which tests not only test things but offer little snippets where people can learn things from use cases, and sometimes the tests that are heavily conditionalized like this rob someone of the ability to see what the use case would be. I'm guilty of a lot of this stuff myself, so perhaps not one to talk. It's an ongoing struggle to balance sometimes-conflicting needs.

Copy link
Member

@willronchetti willronchetti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kent left some good feedback that should be considered, not sure I have much more but I do really think it's worth documenting some of what's going on here especially in the more complex unit tests

assert result.should_fetch_links() == portal_item.should_fetch_links()


def get_nested_property(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the subject of docstrings, I really think some of these tests could use them and even some inline comments. They are quite complex relatively speaking and merit such documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants