-
It's not enough to prevent The immediate next step is to ask why we can't insert dynamic checks at the moment an object would be transferred/leaked between threads, but then the complication is that the "moment of transfer" of an object can be highly implicit (e.g. another object transitively pointing to it could be transferred instead). This has led us to think that static annotations are the right thing, but I'd be happy to discuss other ideas that take the above concerns into account.
Realistically, these source languages are likely going to just mark everything in Wasm shared, and then work out how to recover acceptable JS interop using one of the thread-local mechanisms we're proposing.
I think the case where the source language wants to hold a strong reference to a DOM node cross-thread is one of the hardest to support in Wasm. But to some extent I think this is an inherent friction in the Web platform (because of the issues with cross-thread GC I sketched above). There are possible compromises: instead of the compiled Wasm code holding the DOM node as a true reference type, it could more easily hold a scalar handle that points to a slot in a table of DOM nodes managed in JS via thread-local functions called from shared Wasm.
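To make the handle-table compromise concrete, here is a minimal sketch of what the JS side might look like. The `HandleTable` class and its method names are hypothetical and not part of any proposal; the sketch assumes one table per thread, with shared Wasm code storing only the numeric handle and calling back into thread-local functions to resolve it.

```typescript
// Hypothetical per-thread table mapping scalar handles to host objects
// (e.g. DOM nodes). Shared Wasm code would store only the number.
class HandleTable<T> {
  private slots: (T | undefined)[] = [];
  private free: number[] = [];

  // Store a host object and return a scalar handle for it.
  insert(value: T): number {
    const handle = this.free.pop() ?? this.slots.length;
    this.slots[handle] = value;
    return handle;
  }

  // Resolve a handle. Only meaningful on the thread that owns this table.
  get(handle: number): T {
    const value = this.slots[handle];
    if (value === undefined) {
      throw new Error(`stale or foreign handle ${handle}`);
    }
    return value;
  }

  // Release the slot. Until this is called the table keeps the object alive.
  release(handle: number): void {
    if (this.slots[handle] === undefined) {
      throw new Error(`double release of handle ${handle}`);
    }
    this.slots[handle] = undefined;
    this.free.push(handle);
  }
}
```

Note the trade-off this sketch makes visible: the table holds a strong reference until `release` is called, so the lifetime of the DOM node is managed manually rather than by the GC.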
-
My assumption was that all objects exist within the same unified heap, but reading this doc, it seems we are designing towards a system with one shared heap (for shared objects) and a number of individual unshared heaps. In such a design you would of course prefer to avoid shared -> unshared references, because having them means you need to perform GC across the shared and unshared heaps at the same time to fully reclaim garbage.

To be honest, I don't entirely understand why this design is preferred. You can't collect the shared heap without some amount of synchronization with the threads touching unshared heaps anyway (because unshared heaps point into the shared heap). Is it that we want to support GC within unshared heaps without safepointing all threads touching the shared heap? If that's an important property, we could still achieve it by employing a write barrier (WB) that tracks shared -> unshared references (similar to generational collection). You could then perform "minor" collections of unshared heaps without fully marking through the whole shared heap. There is also a question of how things like
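As a rough illustration of the write-barrier idea above, the following toy model records shared -> unshared stores in a remembered set and uses it as an extra root set for a "minor" collection of one unshared heap. All names (`Obj`, `UnsharedHeap`, `writeField`) are invented for this sketch, and it ignores concurrency entirely; a real engine would need atomic barriers and a way to prune stale entries.

```typescript
// Toy object model: every object is either shared or owned by one unshared heap.
type Obj = { shared: boolean; fields: (Obj | null)[] };

class UnsharedHeap {
  readonly objects = new Set<Obj>();
  // Shared objects that were seen storing a reference into this heap.
  readonly remembered = new Set<Obj>();

  alloc(fieldCount: number): Obj {
    const o: Obj = { shared: false, fields: new Array<Obj | null>(fieldCount).fill(null) };
    this.objects.add(o);
    return o;
  }

  // "Minor" collection: mark from local roots plus the remembered set,
  // never tracing through the shared heap. Returns the number of freed objects.
  collect(roots: Obj[]): number {
    const work: Obj[] = [...roots];
    for (const holder of this.remembered) {
      for (const f of holder.fields) if (f !== null && !f.shared) work.push(f);
    }
    const live = new Set<Obj>();
    while (work.length > 0) {
      const o = work.pop()!;
      if (o.shared || live.has(o)) continue; // stop at the shared heap boundary
      live.add(o);
      for (const f of o.fields) if (f !== null) work.push(f);
    }
    let freed = 0;
    for (const o of this.objects) {
      if (!live.has(o)) { this.objects.delete(o); freed++; }
    }
    return freed;
  }
}

// Write barrier: every field store goes through here so that
// shared -> unshared edges are recorded in the remembered set.
function writeField(heap: UnsharedHeap, holder: Obj, index: number, value: Obj | null): void {
  holder.fields[index] = value;
  if (holder.shared && value !== null && !value.shared) heap.remembered.add(holder);
}
```

The remembered set here is conservative: a shared holder stays in it even after the field is overwritten, which is safe but can keep garbage alive until a full collection.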
-
The proposal adds threads, shared types, and atomics. So clearly the engines will have to support a unified heap that mutators can concurrently allocate into and concurrently update. That work is unavoidable. So the argument isn't about the GC / compiler technology (which has to be built anyway) but rather the risk of

As somewhat elaborated above as well, having multiple heaps that can be collected independently, but have pointers one way (and weak references the other way), is technically highly complex and risky. It may also deliver subpar performance, because remembered sets have to be maintained not only from the old to the new generation but also from one heap to another, along with the associated write barriers. And since the static types determine which heap an object gets allocated into, and languages may compile most of their types to shareable types, it may cause high overhead (as one has to do all this tracking) even for objects that die very quickly and are never actually shared or concurrently accessed. Also, if the heaps are imbalanced (most is shared or most is non-shared), then being able to collect them independently isn't that useful.

The hope is that WebAssembly is going to be alive for many years to come, so I think it would be the wrong choice to guide its specification based on short-term thinking (more engineering work in JS engines, ...) at the expense of actual usefulness.
Existing workloads don't use threading & shared memory, so how can they be regressed? Also one can introduce the capability to spawn a different execution environment that shares nothing with the original heap, but then require those execution environments to communicate with each other via message passing.
This is a very problematic aspect. It means there's no top type. WasmGC is mainly targeted by managed, object-oriented languages. If the language is soundly typed (e.g. Dart, Java, C#, ...) then it almost always has a top type (some also have value types that are boxable into top types). The spec as written now makes it extremely hard for those languages to compile to it. One could think that those languages should simply compile only to shared types, problem solved. But the problem isn't solved, because those languages have to interact with host objects (aka

Changing all these source languages to have the shared concept in their type systems, for the purpose of targeting Wasm (which isn't the main target of those languages), seems very hard. With the current proposal an app will have to choose: either you can use shared multithreading but not interact with host objects, or you can interact with host objects but not use shared multithreading. That will simply hinder the usefulness of this in the first place.

So the suggestion I made (and @mraleph wrote down above), to instead put responsibility on the embedder to check at access time, seems much preferable. It will make WasmGC multithreading a much more useful technology.
-
Shared-Everything Threads proposal says:
Why not enforce this property dynamically instead of introducing a transitive restriction into the type system? Each `externref` could carry around some sort of "ownership token", and trying to pass an `externref` back to the embedder would perform an embedder-specific ownership check and cause an error when ownership restrictions do not allow that.

The way things are currently specified in the proposal makes this rather challenging, because a single non-shareable reference poisons the whole type, IIUC, making it non-shareable. This makes it challenging to compile languages where everything is shareable by default.

Consider, for example, Java. Everything is shareable there, including containers like `ArrayList` or `Object[]`. Now consider what happens when we create a subclass of `Object` which contains a reference to a DOM node inside. Such a type can't be shared between threads, so it can't be `shared`. But that means neither can `Object`, `Object[]`, or `ArrayList`. This breaks the uniformity of representation which is present in the original language.

This is closely related to #5 - I actually think toolchains are going to have a very hard time, unless the source language itself has this "partitioning" into shared and non-shared parts built in.
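The dynamic ownership check suggested above could be sketched roughly as follows. This is only an illustration of the idea, not how any engine or the proposal implements `externref`: the `OwnedRef` wrapper, the numeric thread ids, and the `unwrap` method are all assumptions made for the example.

```typescript
// Hypothetical wrapper modelling an externref that carries an "ownership token":
// here the token is simply the id of the thread that created the reference.
class OwnedRef<T> {
  private readonly value: T;
  readonly ownerThread: number;

  constructor(value: T, ownerThread: number) {
    this.value = value;
    this.ownerThread = ownerThread;
  }

  // Embedder-side check performed when the reference is handed back to the host.
  // The reference itself may travel freely between threads; only access is checked.
  unwrap(currentThread: number): T {
    if (currentThread !== this.ownerThread) {
      throw new TypeError(
        `reference owned by thread ${this.ownerThread} used on thread ${currentThread}`
      );
    }
    return this.value;
  }
}
```

With a check like this, a container type would not need to become non-shareable just because one of its fields may hold a host reference; the cost is a runtime check (and a possible trap) on every access from the embedder.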
cc @mkustermann
cc @tlively @syg