WebAssembly Relaxed SIMD #651
Comments
Hi @dtig, I don't know how to find a specification from the links you have provided. The "modified specification" link returns 404. Is there an explanation of the proposal you can provide? |
@martinthomson The directory structure makes these docs hard to find, these are the ones I should have probably linked: |
@dtig, thanks for that, I updated the request. I don't see any discussion of how this might result in fingerprinting of users based on the answers provided varying between CPUs. It's possible that the variation between CPU-level implementation is such that fingerprinting information is highly correlated with other fingerprinting surfaces, but some analysis supporting that theory would be good to have. We've taken significant steps to remove fingerprinting surfaces like this, so I'd like to see what our experts in the area think (cc @tomrittervg). |
There is some discussion in this issue: WebAssembly/relaxed-simd#11. We believe that the entropy exposed by this proposal correlates with existing fingerprinting surfaces on the web. Detailed analysis for each of the operations, including instruction lowerings, will be available in the associated issues/PRs (open and closed) for the relevant operations. A recent example is the Relaxed Rounding Q-format Multiplication issue, WebAssembly/relaxed-simd#40. |
@martinthomson, I'm no longer with Mozilla, try @eqrion. |
So a concrete leak exposed by wasm today is distinguishing x86/x86-64 via this trick (and possibly others). This comment expresses the belief that Intel/AMD can be distinguished. There seems to be consensus not to provide an API to determine hardware capabilities, and also that there won't be any capabilities exposed that only work on certain processors. However, it seems there will be instructions exposed that map to different assembly instructions (or behaviors) on different processors, and that could sometimes be probed. Ref
Individual instruction proposals have fingerprinting details. This one, sure enough, talks about how it can distinguish the presence of an instruction set and how the wasm engine can work around it at a perf cost. This one and this one add new x86/ARM distinguishers, as does this one (which also distinguishes POWER), this one distinguishes another instruction set (FMA), and this one distinguishes x86/ARM as well as AVX512 presence. This one is kind of ambiguous but mentions detecting SSE4.1.
x86/ARM is apparently already exposed. I don't know of formalized instruction set exposure through any existing fingerprintable surface, at least not in the way this would expose a clear 'yes/no'. Intel/AMD differences could likely be exposed via timing attacks (and WebGL debug info tells you the vendor of the graphics card...), but I tend to put timing attacks on hardware behavior in a different class of fingerprinting protection than 'get a direct answer to a direct question.'
Overall I get the impression that it is possible to hide the behavioral differences; it just introduces performance penalties. No Mozilla hat in this comment, but I would prefer to see a line drawn in the sand at x86/ARM distinguishing. That is, a specification would require implementors to fix up outputs such that processor manufacturer and instruction set presence cannot be distinguished in the output (I'm not counting timing here).
At a minimum I think a specification must support implementors doing that, and provide guidance for how. Related work in the fingerprinting space would be
|
The Wasm specification is usually at a lower level than that. For example, the spec formally specifies the execution semantics of instructions, but doesn't make assumptions about behavior on underlying hardware. For Relaxed SIMD specifically, specifying this is still under discussion, but the feedback from the Community Group has been to ensure that the spec doesn't introduce correlations between instructions, to avoid relying on detection patterns (more about that here). The instruction lowerings in the linked issues are a recommendation on what is possible on different hardware, and these would all be allowed by the spec. When lowered to the Wasm SIMD operations, for example, the behavior should be deterministic across different hardware (with the exception of IEEE 754 non-compliant hardware). Engine implementations are able to make the tradeoff between performance and determinism, and to pick the lowering (or a different one that isn't already described in the related issue) that matches the entropy requirements for their platforms. So it's not necessarily hiding the differences; rather, the results of the slower, more deterministic instructions are also allowed by the spec. While the spec is at too low a level to provide that guidance, the issues linked above do provide it.
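To make the "set of allowed results" idea concrete, here is a minimal Python sketch. It is loosely modeled on a relaxed float-to-int32 truncation; the exact allowed sets, and the names `allowed_results`, `saturating_lowering` and `x86_style_lowering`, are illustrative assumptions and are not quoted from the proposal. It shows two lowerings that both stay inside the allowed set but disagree on an out-of-range probe, which is why an engine that always picks one of them exposes no hardware difference.

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def allowed_results(x: float) -> set[int]:
    """Hypothetical per-lane result set a relaxed truncation might permit."""
    if math.isnan(x):
        return {0, INT32_MIN}
    if x >= 2**31:                      # too large for int32 (includes +inf)
        return {INT32_MAX, INT32_MIN}
    if x < -2**31:                      # too small for int32 (includes -inf)
        return {INT32_MIN}
    return {math.trunc(x)}              # in range: exactly one answer

def saturating_lowering(x: float) -> int:
    """Deterministic choice: NaN -> 0, out-of-range values clamp."""
    if math.isnan(x):
        return 0
    if x >= 2**31:
        return INT32_MAX
    if x < -2**31:
        return INT32_MIN
    return math.trunc(x)

def x86_style_lowering(x: float) -> int:
    """Assumed single-instruction behavior: NaN/out-of-range -> INT32_MIN."""
    if math.isnan(x) or x >= 2**31 or x < -2**31:
        return INT32_MIN
    return math.trunc(x)

probe = 3e9  # out of int32 range, so the two lowerings may disagree
for lowering in (saturating_lowering, x86_style_lowering):
    assert lowering(probe) in allowed_results(probe)
print(saturating_lowering(probe) != x86_style_lowering(probe))  # True
```

In-range inputs give the same answer under both lowerings, so only deliberately chosen edge-case probes reveal which lowering is in use.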
|
@tomrittervg, I think this means that an implementation must, on at least some of its supported platforms, implement every floating point addition or subtraction (and probably other operations) whose output's bit pattern can be inspected at some point in the future, in a way that produces a bit pattern that is platform-invariant. Operations whose results have simple lifetimes and are consumed in easy-to-understand ways could be exempted from this laundering by the optimizer, but I would still expect a very large fraction of FP ops to produce results that would have to be laundered, likely resulting in a very significant performance drop for FP intensive code. This does not seem related to SIMD or Relaxed SIMD at all, it's a core Wasm concern. See https://github.com/WebAssembly/design/blob/main/Nondeterminism.md for more about this. |
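As a rough illustration of what "laundering" a result into a platform-invariant bit pattern could look like, the Python sketch below canonicalizes NaN bit patterns for 64-bit floats. It assumes the platform-varying part is the NaN sign and payload, which is the case the linked Nondeterminism.md describes; the helper names and the choice of canonical pattern are hypothetical, not taken from any engine.

```python
import struct

CANONICAL_NAN_BITS = 0x7FF8000000000000  # positive quiet NaN, zero payload

def f64_bits(x: float) -> int:
    """Raw IEEE 754 binary64 bit pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def canonicalize_bits(bits: int) -> int:
    """Map every NaN bit pattern to one canonical pattern; keep the rest."""
    exponent_all_ones = ((bits >> 52) & 0x7FF) == 0x7FF
    fraction_nonzero = (bits & ((1 << 52) - 1)) != 0
    if exponent_all_ones and fraction_nonzero:
        return CANONICAL_NAN_BITS
    return bits

# Two NaNs that differ in sign and payload become indistinguishable.
print(hex(canonicalize_bits(0xFFF8000000000001)))  # 0x7ff8000000000000
print(hex(canonicalize_bits(0x7FF0000000000123)))  # 0x7ff8000000000000
# Ordinary values and infinities pass through unchanged.
print(canonicalize_bits(f64_bits(1.5)) == f64_bits(1.5))  # True
```

Doing this after every floating point operation whose bits may later be observed is the per-operation cost the comment above is concerned about.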
Hi Deepti, thanks for posting this! Speaking here for the WebAssembly team in SpiderMonkey, we view this proposal as worth-prototyping. This proposal adds several performance-oriented instructions to WebAssembly in order to aid porting high performance native code to the web. This fits well with our vision for WebAssembly's evolution. There are two risks here:
We are engaged in the specification process for this proposal in the WebAssembly CG and will require the above two risks to be resolved in order to advance the proposal. |
Thanks for your reply!
Most instructions only expose x86/ARM-Neon; I've tried to encapsulate this into a separate document in this PR. The one detail that the proposal as it stands now leaks is the availability of native FMA support, which has significant performance wins, documented here. There is some unresolved discussion of adding both a deterministic FMA and a QFMA operation that might mitigate this somewhat - more discussion here.
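For readers unfamiliar with why FMA availability is observable, the Python sketch below shows the numeric difference a probe could rely on. It uses 64-bit floats for convenience (the proposal's operations work on SIMD lanes), and it emulates the fused operation with exact rational arithmetic rather than calling any hardware instruction; the function names and probe inputs are illustrative assumptions.

```python
from fractions import Fraction

def unfused_madd(a: float, b: float, c: float) -> float:
    """a*b is rounded to a double, then the sum is rounded again."""
    return a * b + c

def fused_madd(a: float, b: float, c: float) -> float:
    """Emulated FMA: compute a*b + c exactly, then round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # one correctly rounded conversion

# Probe inputs: the exact product is 1 - 2**-54, which rounds to 1.0.
A = 1.0 + 2.0**-27
B = 1.0 - 2.0**-27
C = -1.0

def looks_fused(madd) -> bool:
    """A non-zero result means the intermediate product was not rounded."""
    return madd(A, B, C) != 0.0

print(unfused_madd(A, B, C))                                # 0.0
print(fused_madd(A, B, C) == -(2.0**-54))                   # True
print(looks_fused(unfused_madd), looks_fused(fused_madd))   # False True
```

An operation that is allowed to return either result therefore answers "does this machine have native FMA?" with a single evaluation, whereas a deterministic FMA would give the fused result everywhere at some cost on hardware without the instruction.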
There's performance data, either measured or estimated, included in the issues where the instructions are proposed, as well as in the QFMA PR linked above. One note is that, from an implementation perspective, this proposal doesn't introduce implementation complexity to the instructions or the spec, but it does set a precedent for being less strict about the specification. I expect the complexity here is for applications or libraries using the proposal, with potential compat issues if the instructions are used incorrectly. This is somewhat mitigated by the fact that tools will not generate relaxed-simd operations by default (compared to fixed-width SIMD instructions, which can be generated by the auto-vectorizer); they are available using a special intrinsics header, or potentially a compiler flag, so the proposal already assumes a higher threshold of knowledge to be able to use the proposed operations.
|
Request for Mozilla Position on an Emerging Web Specification
Other information
None