SIMD distributions aren't well documented #1227

clarfonthey · 2022-04-17T23:34:37Z

Right now, the exact definitions of the distributions for SIMD types aren't documented at all, and I had to dig through the source code (with some difficulty) to find out which implementation is actually provided. I'll split my thoughts into sections so they're a bit easier to read.

Uniform distributions

Uniform distributions for SIMD types have two natural definitions: either as a linear interpolation between the two values, or as a random value inside the box enclosed by the two provided points.

For floating-point SIMD vectors, both definitions would make sense to include somewhere, even though the latter can easily be computed from existing float distributions, with one caveat: does "excluding" the end point from the distribution mean that every face of the box touching the end point is excluded (no component may equal the corresponding component of the end point), or that only the end point itself is excluded? The former composes easily with existing distributions; the latter does not, although I can't think of a case where the latter would be that useful.

For integers, the bounding-box definition still is natural, but the linear interpolation is not, since it would require computing the distance between the points and removing common factors from all components before adding. And, unlike floats where precision can be lost, the linear interpolation could easily be done losslessly.

As expected, integers use the bounding-box approach. However, floats use the linear interpolation approach, at least from what I can see. While the linear interpolation is (IMHO) the objectively more useful version to provide, I think that both could be useful in their own right, and there's not really any documentation anywhere that describes what version is provided, whereas floats have entire pages written about what the distributions do.
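To make the two candidate definitions concrete, here is a minimal sketch on plain arrays rather than SIMD types. It is not rand's implementation: `SplitMix64`, `sample_on_segment` and `sample_in_box` are hypothetical names, and the tiny generator is only a dependency-free stand-in for a real RNG.

```rust
/// Tiny SplitMix64 generator: a stand-in for a real RNG (assumption, not rand's API).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Definition 1 (linear interpolation): a point on the segment from `a` to `b`.
/// A single random parameter `t` is shared by every component.
fn sample_on_segment<const N: usize>(rng: &mut SplitMix64, a: [f64; N], b: [f64; N]) -> [f64; N] {
    let t = rng.next_f64();
    std::array::from_fn(|i| a[i] + t * (b[i] - a[i]))
}

/// Definition 2 (bounding box): a point in the axis-aligned box with corners `a` and `b`.
/// Every component draws its own independent random parameter.
fn sample_in_box<const N: usize>(rng: &mut SplitMix64, a: [f64; N], b: [f64; N]) -> [f64; N] {
    std::array::from_fn(|i| a[i] + rng.next_f64() * (b[i] - a[i]))
}
```

With `a = [0.0, 0.0]` and `b = [1.0, 2.0]`, every result of `sample_on_segment` satisfies `p[1] == 2.0 * p[0]`, while `sample_in_box` can land anywhere in the half-open rectangle.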

Standard distributions

While it seems obvious, the standard distributions (and the open and open-closed distributions) simply generate a random number for each lane of the vector for floats. However, a mathematically useful alternative could be generating a random point inside a unit hypersphere, which is not trivial, but definitely a reasonable distribution to add.
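For illustration, one simple way to draw a point inside a unit ball is rejection sampling: draw from the enclosing cube and retry until the point falls inside. This is only a sketch under that assumption, not a proposal for the actual implementation; `SplitMix64` and `sample_in_unit_ball` are hypothetical names, and the acceptance rate drops quickly as the dimension grows.

```rust
/// Tiny SplitMix64 generator: a stand-in for a real RNG (assumption, not rand's API).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Rejection sampling: uniform point in the closed unit ball of dimension N.
fn sample_in_unit_ball<const N: usize>(rng: &mut SplitMix64) -> [f64; N] {
    loop {
        // Candidate drawn uniformly from the cube [-1, 1)^N.
        let p: [f64; N] = std::array::from_fn(|_| 2.0 * rng.next_f64() - 1.0);
        let norm_sq: f64 = p.iter().map(|x| x * x).sum();
        // Keep the candidate only if it lies inside the ball.
        if norm_sq <= 1.0 {
            return p;
        }
    }
}
```

A call such as `sample_in_unit_ball::<3>(&mut rng)` returns a `[f64; 3]` whose Euclidean norm is at most 1.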

Const generics

One thing that would be useful alongside the SIMD types' distributions is const-generic versions of the existing gen_iter methods that provide a fixed number of points. This could potentially mitigate concerns about the distributions' implementations as well, by allowing users to generate values as f64x4::from_array(rng.gen_array::<4>()). While not absolutely required for SIMD support, it's something else I'd like to see that's probably mentioned in another issue, so I won't focus too much on it.
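A rough sketch of what such a const-generic method could look like, written as an extension trait. Everything here is hypothetical: `gen_array` does not exist in rand, and `SplitMix64` is just a dependency-free stand-in generator so the example compiles on stable Rust.

```rust
/// Tiny SplitMix64 generator: a stand-in for a real RNG (assumption, not rand's API).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical extension trait providing a const-generic `gen_array`.
trait GenArray {
    /// Fill an array of length N with independent samples from [0, 1).
    fn gen_array<const N: usize>(&mut self) -> [f64; N];
}

impl GenArray for SplitMix64 {
    fn gen_array<const N: usize>(&mut self) -> [f64; N] {
        // `from_fn` calls the closure once per index, walking forward.
        std::array::from_fn(|_| self.next_f64())
    }
}
```

With this in scope, `rng.gen_array::<4>()` yields a `[f64; 4]`, which on nightly could then be passed to a SIMD constructor such as `from_array`.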

Non-SIMD versions

And finally… it would be useful to have these distributions available outside of the SIMD implementations. Although it would require extra computation, it would be useful to be able to compute uniform ranges along a line without the loss of accuracy that could be accumulated via multiplication. I'm not sure what this API would look like, but IMHO, the SIMD versions should just be wrappers around these distributions, where the distributions themselves might be optimised to use SIMD operations internally regardless of the end result. (once SIMD support is stabilised, that is)


clarfonthey · 2022-04-18T16:39:26Z

This also might be related to #496 but that issue hasn't had any comments for 4 years and a lot has been done since then, so, I'm not sure what the status of that is.

dhardy · 2022-04-18T18:35:17Z

Thanks for the detailed issue. I won't read into the specifics now, but do plan to go over SIMD Uniform distributions as part of #1196 (and would welcome relevant comments on the report there).

clarfonthey · 2022-04-18T19:02:37Z

I should also add that before I wrote this, I didn't know about the rand_distr crate which does have unit ball distributions. So, I guess there is precedent that that kind of distribution wouldn't go in the rand crate, but elsewhere.

dhardy · 2023-02-06T17:38:09Z

I really should have read this earlier, apologies.

> either as a linear interpolation between the two values

This is a line inside the plane (or space). I'm surprised you'd think we might have a sampler for that, and especially surprised you might think SIMD algorithms would do that. Granted, we do have UnitCircle which samples from a line in the plane, but... SIMD is Single Instruction Multiple Data, not interpolation. You can think of a SIMD sampler as sampling in (hyper-)box defined by two points or you can think of it as independently sampling N values from N independent ranges. This applies to both int and float variants.
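The "N values from N independent ranges" view can be sketched with plain arrays standing in for SIMD lanes. This is an illustration only, not the code rand uses: `SplitMix64` and `sample_lanes` are hypothetical names, and the rejection step is one simple way to avoid modulo bias.

```rust
/// Tiny SplitMix64 generator: a stand-in for a real RNG (assumption, not rand's API).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Sample lane `i` uniformly from `low[i]..=high[i]`, independently of the other lanes.
fn sample_lanes<const N: usize>(rng: &mut SplitMix64, low: [i32; N], high: [i32; N]) -> [i32; N] {
    std::array::from_fn(|i| {
        assert!(low[i] <= high[i], "empty range in lane {i}");
        // Number of values in the inclusive range; always fits in u64 for i32 bounds.
        let span = (high[i] as i64 - low[i] as i64 + 1) as u64;
        // 2^64 mod span: size of the incomplete slice at the top of the u64 range.
        let rem = (u64::MAX % span + 1) % span;
        let zone = u64::MAX - rem;
        // Reject draws above `zone` so each value in the range is equally likely.
        let offset = loop {
            let v = rng.next_u64();
            if v <= zone {
                break v % span;
            }
        };
        (low[i] as i64 + offset as i64) as i32
    })
}
```

For example, `sample_lanes(&mut rng, [0; 4], [10, 100, 1000, 10000])` gives four values, each within its own lane's bounds.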

SIMD stuff isn't well documented because it's experimental and not even stable really, but... maybe we should add some basic docs.

> generating a random point inside a unit hypersphere

This is a different problem solved by UnitBall.

Const generics

gen_iter was removed a long time ago. In its place there is sample_iter, e.g. rng.sample_iter(Standard). As here, it can output arrays just fine. I guess you could use this to construct SIMD values, though you can sample them directly too.

This works (using rand master since the previous release used packed_simd_2):

[package]
name = "simd-tests"
version = "0.1.0"
edition = "2021"

[dependencies.rand]
git = "https://github.com/rust-random/rand.git"
rev = "7d73990096890960dbc086e5ad93c453e4435b25"
features = ["simd_support"]

#![feature(portable_simd)]

use rand::prelude::*;
use std::simd::Simd;

fn main() {
    let mut rng = rand::thread_rng();

    let x: Simd<i8, 4> = rng.gen();
    println!("x = {x:?}");

    let y: Simd<f32, 4> = rng.gen_range(
        Simd::<f32, 4>::splat(0.0) ..=
        Simd::from_array([1.0, 2.0, 3.0, 1.0])
    );
    println!("y = {y:?}");

    let z: Simd<i32, 4> = rng.gen_range(
        Simd::<i32, 4>::splat(0) ..=
        Simd::from_array([10, 100, 1000, 10000])
    );
    println!("z = {z:?}");
}

Non-SIMD versions

Sounds like you are talking about UnitBall etc. again.

Action

We should add a little documentation clarifying exactly what SIMD stuff is good for.

Maybe we should also support something like rng.gen_range([0u8; 4] ..= [255u8; 4]), I don't know.

Problem: rng.gen() is generic, but should have stable output. It generates tuple and array output by calling rng.gen() for each element; we can't exactly optimise this properly. The same would be true of rng.gen_range using arrays as above.
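To illustrate the stability constraint (a sketch only, not rand's code): filling an array element by element consumes the generator exactly like a sequence of scalar calls, whereas a batched shortcut would generally produce different values from the same seed. `SplitMix64`, `gen_elementwise` and `gen_pair_batched` are hypothetical names used for this example.

```rust
/// Tiny SplitMix64 generator: a stand-in for a real RNG (assumption, not rand's API).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// One u32 per call, taken from the upper half of a 64-bit draw.
    fn next_u32(&mut self) -> u32 {
        (self.next_u64() >> 32) as u32
    }
}

/// Element-by-element: one generator call per element, in index order.
/// The output matches N consecutive scalar calls on the same generator state.
fn gen_elementwise<const N: usize>(rng: &mut SplitMix64) -> [u32; N] {
    std::array::from_fn(|_| rng.next_u32())
}

/// A hypothetical "optimised" path: one 64-bit draw split into two elements.
/// Cheaper, but it consumes the stream differently, so it is not a drop-in
/// replacement if the output for a given seed has to stay stable.
fn gen_pair_batched(rng: &mut SplitMix64) -> [u32; 2] {
    let v = rng.next_u64();
    [v as u32, (v >> 32) as u32]
}
```

Swapping `gen_elementwise::<2>` for `gen_pair_batched` would be a value-breaking change for anyone relying on seeded output.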

TheIronBorn · 2023-02-06T21:20:15Z

When I ported to std::simd, I added a small bit to Standard's docs:

rand/src/distributions/mod.rs

Lines 156 to 159 in 7d73990

    
    /// * SIMD types like x86's [`__m128i`], `std::simd`'s [`u32x4`]/[`f32x4`]/
    ///   [`mask32x4`] (requires [`simd_support`]), where each lane is distributed
    ///   like their scalar `Standard` variants. See the list of `Standard`
    ///   implementations for more.

This issue still deserves more thought, though.

TheIronBorn mentioned this issue on Jul 8, 2022: switch to std::simd, expand SIMD & docs #1239 (merged)
