[TB Optimization] Skip subtrees based on the subtree's root node's permissions #4008

Draft
wants to merge 3 commits into
base: master

JoJoDeveloping
Contributor

@JoJoDeveloping JoJoDeveloping commented Nov 1, 2024

In #4006, we re-added the functionality for skipping subtrees. It turns out that just skipping subtrees based on their last recorded access is imprecise. In certain cases, we know we can skip subtrees purely based on the root's current permission, without having to track the last access. Specifically:

  • Disabled nodes can always be skipped, since the whole subtree must necessarily be invariant under all foreign accesses
  • Frozen nodes can be skipped for foreign reads, since the whole subtree must necessarily be invariant under all foreign reads.
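
The two rules above can be sketched as a single predicate. This is only an illustration: `Permission`, `AccessKind`, and `can_skip_subtree` are simplified, hypothetical names, not the types or functions used in the actual code, and the "last recorded access" check from #4006 is left out.

```rust
/// Simplified stand-in for a Tree Borrows permission (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Permission {
    Reserved,
    Active,
    Frozen,
    Disabled,
}

/// Kind of the foreign access reaching the subtree (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AccessKind {
    Read,
    Write,
}

/// Decide, from the subtree root's current permission alone, whether a
/// foreign access may skip the whole subtree.
fn can_skip_subtree(root: Permission, foreign: AccessKind) -> bool {
    match (root, foreign) {
        // Disabled root: skip for every foreign access.
        (Permission::Disabled, _) => true,
        // Frozen root: skip for foreign reads only.
        (Permission::Frozen, AccessKind::Read) => true,
        // Everything else still has to be visited.
        _ => false,
    }
}
```

Note that a foreign write below a Frozen root is not skipped in this sketch, matching the second bullet, which only covers reads.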

Note that this PR loosens the notion of "invariant" a bit. For example, it is possible that there is a protected Reserved node that is a child of a Frozen node. When that node undergoes a foreign read, it becomes conflicted. If we skip accessing that subtree, it no longer becomes conflicted.
The reason this is still OK is that the only effect of this conflictedness is blocking child write accesses. But such accesses are already blocked by the Frozen node further up the tree. So no UB is missed; all that happens is that diagnostics are triggered at a different node.
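
The argument above can be illustrated with a toy model. This is a sketch only: the `Perm` type, the two functions, and the rule that the first rejecting node on the path from the accessed node to the root gets blamed are all simplifying assumptions, not the actual implementation.

```rust
/// Toy permission type (hypothetical; far simpler than the real one).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Perm {
    /// A protected `Reserved` node; `conflicted` is set by a foreign read.
    ReservedProtected { conflicted: bool },
    Frozen,
}

/// Apply a foreign read to a subtree consisting of `root` and one `child`.
/// With `skip_frozen` set, a `Frozen` root means the subtree is not visited.
fn foreign_read(root: Perm, child: &mut Perm, skip_frozen: bool) {
    if skip_frozen && root == Perm::Frozen {
        return; // subtree skipped: the child stays unconflicted
    }
    if let Perm::ReservedProtected { conflicted } = child {
        *conflicted = true;
    }
}

/// Attempt a child write. `path` lists the accessed node first, then its
/// ancestors. Returns the index of the first node that rejects the write.
fn child_write(path: &[Perm]) -> Result<(), usize> {
    for (idx, perm) in path.iter().enumerate() {
        let rejects = match perm {
            Perm::Frozen => true,
            Perm::ReservedProtected { conflicted } => *conflicted,
        };
        if rejects {
            return Err(idx);
        }
    }
    Ok(())
}
```

In this model, a write through the child after a foreign read is rejected either way. With skipping, `child_write` blames the Frozen parent (index 1); without skipping, it blames the conflicted child (index 0).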

For more detailed analysis of why this is correct, see the in-code comments.

Here is a performance analysis, comparing this PR's improvements with those of #4006:

[Figure: performance comparison]

As in #4006, this is a log graph. The blue line shows performance without #4006, red is with the re-added optimization of #4006, yellow is this PR (which is stacked on top of #4006), and green is just the changes proposed here, but with the "latest foreign access tracking" machinery of #4006 removed. As can be seen, having both combined gives the greatest performance.

Finally, note that this PR is a draft, since it is stacked on top of #4006. This PR only intends to contribute one commit; the rest are included in #4006 and should be discussed there.

This commit supplies a real fix, which makes retags more complicated, at the benefit of
making accesses more performant.
@RalfJung RalfJung self-assigned this Nov 4, 2024
@RalfJung
Member

RalfJung commented Nov 4, 2024

> all that happens is that diagnostics are triggered at a different node.

That is at least potentially confusing. :/ But maybe the fix here should be on the diagnostic side, not the core algorithm.

Does this depend on when the GC runs, or is it deterministic?

> As in #4006, this is a log graph. The blue line shows performance without #4006, red is with the re-added optimization of #4006, yellow is this PR (which is stacked on top of #4006), and green is just the changes proposed here, but with the "latest foreign access tracking" machinery of #4006 removed. As can be seen, having both combined gives the greatest performance.

It seems like most benchmarks are unchanged by this PR (compared to just #4006), only a few of them benefit. big-allocs gets slightly worse.

Do you have evidence that this is beneficial on (a non-trivial fraction of) real-world code?

@JoJoDeveloping
Contributor Author

JoJoDeveloping commented Nov 4, 2024

> only a few of them benefit

Indeed. Intuitively, if you use lots of shared references, you benefit.

> big-allocs gets slightly worse.

True, but I'd say that this is within measurement imprecision. That test just allocates a lot, without ever touching the memory.

@JoJoDeveloping
Contributor Author

> That is at least potentially confusing. :/ But maybe the fix here should be on the diagnostic side, not the core algorithm.

It depends. Arguably, an error that blames the Frozen parent could be clearer than one that blames a protected, conflicted Reserved node. But note that the diagnostics cannot do anything differently here, because with this change the child node would never become conflicted.

> Does this depend on when the GC runs, or is it deterministic?

It is deterministic.

// of `ReservedIM`, `Disabled`, or a not-yet-accessed "lazy" permission thing.
// The two former are already invariant under all foreign accesses, and for
// the latter it does not really matter, since they can not be used/initialized
// due to having a protected parent. So this only affects diagnostics, but the
Member

should be "disabled parent", right?

@@ -185,6 +185,30 @@ impl LocationState {
// need to be applied to this subtree.
_ => false,
};
if self.permission.is_disabled() {
// A foreign access to a `Disabled` tag will have almost no observable effect.
// It's a theorem that `Disabled` node have no protected initialized children,
Member

That's not an obvious theorem -- can you give a brief argument?

Contributor Author

It's proven in Coq 😛.

The reason it holds is that, to become Disabled, a node needs to have a foreign write access happen to it. But that access would have triggered the protector of any protected initialized nodes that are children of the node being disabled. And you can't have a new child of a Disabled node become initialized, because that would mean the to-be-initialized node has a child access, which is blocked by the Disabled parent.

// It's a theorem that `Disabled` node have no protected initialized children,
// and so this foreign access will never trigger any protector.
// Further, the children will never be able to read or write again, since they
// have a `Disabled` parents. Even further, all children of `Disabled` are one
Member

The argument could end here, right? The permissions below don't matter since anyway no access is possible.

// effect, the only further thing they could do is make protected `Reserved`
// nodes become conflicted, i.e. make them reject child writes for the further
// duration of their protector. But such a child write is already rejected
// because this node is frozen. So this only affects diagnostics, but the
Member

Is it possible to add a testcase that demonstrates the effect on diagnostics?
