Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate Enum value in bpf_map_type #1060

Closed
vobst opened this issue Dec 13, 2023 · 2 comments · Fixed by #1062
Closed

Duplicate Enum value in bpf_map_type #1060

vobst opened this issue Dec 13, 2023 · 2 comments · Fixed by #1062

Comments

@vobst
Copy link

vobst commented Dec 13, 2023

Commit bpf: Implement cgroup storage available to non-cgroup-attached bpf progs merged in Linux 6.2-rc1 introduced a duplicate enum value in enum bpf_map_type.

Currently Volatility throws an exception in that case, could we maybe handle it more gracefully, e.g., by making the mapping int -> str | list[str] or int -> list[str] ?

The relevant code seems to be here and it looks like the problem was already anticipated ;)

    @classmethod
    def _generate_inverse_choices(cls, choices: Dict[str, int]) -> Dict[int, str]:
        """Generates the inverse choices for the object."""
        inverse_choices: Dict[int, str] = {}
        for k, v in choices.items():
            if v in inverse_choices:
                # Technically this shouldn't be a problem, but since we inverse cache
                # and can't map one value to two possibilities we throw an exception during build
                # We can remove/work around this if it proves a common issue
                raise ValueError(
                    f"Enumeration value {v} duplicated as {k} and {inverse_choices[v]}"
                )
            inverse_choices[v] = k
        return inverse_choices

https://github.com/volatilityfoundation/volatility3/blob/713b5f96dd218eea9c3fdb490a2436a474d22352/volatility3/framework/objects/__init__.py#L599C19-L599C19

@vobst
Copy link
Author

vobst commented Dec 14, 2023

As I see it, there are four possible solutions here:

  1. leave everything as it is
  2. leave the function signature the same, but either silently replace the preimage, i.e., remove the if, or stick with the first preimage, i.e., if and continue
  3. change the function signature to dict[int, str | list[str]]
  4. change the function signature to dict[int, list[str]]

My thoughts on those are:

  1. not rly an option since important enums in the kernel are evidently not injective
  2. also not optimal since it could potentially mean that the inverse mapping depends on the order in which items were inserted in the dict (in 3.7+ dictionary iteration order is guaranteed to be in order of insertion)
  3. would be a pain for callers since they have to check what they got back, and it will be a string for 99.9% of enums in the kernel
  4. a bit less of a pain but still makes caller code look awkward to handle 0.1% of cases

To me option 2 seems to be the lesser evil as those strings are equivalent from C's perspective so it shouldn't matter which one is returned. However, I would keep the if an output a debug log message so people can figure out what is going on.

However, I've got no clue about the context of the larger code base so I might be missing something here. Happy to hear some feedback :)

@ikelos
Copy link
Member

ikelos commented Dec 18, 2023

So the point of the function is to generate a map, so that if we have a value, we can look-up what it means. That appears to be the only place it's used, and it is private, so the only public expose is through the lookup function. However, it feels problematic to change the API for what is a so-far, extremely uncommon situation, so I'd prefer to rule out lookup potentially returning a list (it would have to always return a list, handing back different types creates an unstable API and leads to way more problems than just a break on a once-in-a-blue-moon event).

So the real options open to us are:

  • Leave it as is (barf on the situation)
  • Ignore any additional values, so that lookup always returns the first definition
  • Always update the lookup, so the latest defined value wins

I'm kind of ok with any of them. I tend to err on the side of barfing rather than deciding things for the user, but that may work out to be extremely inconvenient, in which case my second choice would just be to ignore duplicates and debug log for the user. I'll mock up a PR that would make the change...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants