Tuesday, 16 July 2013

Random Floats in GLSL 330

I had some fun putting together a random number generator in GLSL.

I wanted a set functions with the following specs:
  • Outputs a float in the half-open range 0:1  (a fraction)
  • Outputs all variations in the range given sufficient input variation
  • Multidimensional variation: accepts between 1 and 4 arbitrary float inputs
  • Uniform distribution
  • No apparent patterns
  • Self-contained: no texture lookup
  • Portable: identical results on all GPUs
  • Low overhead
Fairly stringent requirements.  I think I managed to pull it off though.

The key to making a decent set of functions was realising that in a shader - which has no state - a random function is actually a kind of hash function.  It returns a uniformly distributed output value which varies wildly in value given small changes in input.

But we're dealing with floats.   Floats which according to the GLSL spec explicitly lack precision guarantees.  Writing a hashing function in terms of floats isn't going to work very well and isn't likely to be portable either.  Also, writing a good hashing function is a difficult task in itself.  I'd rather use an existing integer hash with well defined properties.

That's what we're going to do.  So that we only have to hash integers, we're going to move the bits of the float directly into an unsigned integer, hash it, and move the bits back into a float - accompanied by a little bit-twiddling of course.  For this purpose GLSL 330 has floatBitsToUint and uintBitsToFloat.  This way the PRNG will be capable of dealing with any input value without range or precision being a concern.

The GLSL spec guarantees that float is in IEEE-754 binary32 format and also that uint is a 32-bit unsigned integer.  Which means we can do bit-twiddling tricks without worrying about compatibility issues.

Here's the algorithm I came up with:
  1. Hash all inputs together into a uint
  2. AND away everything but the low 23 bits
  3. Construct a valid float where the mantissa bits are all zero (value 1.0)
  4. OR the constructed float with the 23-bit uint to get range [1:2]
  5. With float arithmetic, subtract 1.0 to get desired [0:1] range
This probably sounds quite confusing if you're not intimately familiar with the binary32 format.  I'll explain it more thoroughly after I post the actual code.

Here's our integer hash function, a single iteration of Bob Jenkins' OAT algorithm.  This is what we're going to mix up our bits with.
uint hash( uint x ) {
    x += ( x << 10u );
    x ^= ( x >>  6u );
    x += ( x <<  3u );
    x ^= ( x >> 11u );
    x += ( x << 15u );
    return x;
}
Now for the bit-twiddling random() function.  Let's keep it simple and stick with one input for now:
float random( float f ) {
    const uint mantissaMask = 0x007FFFFFu;
    const uint one          = 0x3F800000u;
   
    uint h = hash( floatBitsToUint( f ) );
    h &= mantissaMask;
    h |= one;
    
    float  r2 = uintBitsToFloat( h );
    return r2 - 1.0;
}
What the hell is going on here?  To explain this I have to first explain some aspects of the floating point format:
The mantissa of a binary32 float is a 23-bit binary fraction in the half-open range [0:1].  There's an implicit leading 1 in the mantissa so the effective value of the mantissa is always 1.0 + fraction.  The absolute value of a float can be calculated as:
(1.0 + fraction) * power(2,exponent)
This means that the float value 1.0 has both an exponent and mantissa of zero.  Zero mantissa means 23 zero bits.  As the value increases from 1.0 to 2.0 only those mantissa bits change - they count upwards the same as a 23-bit unsigned integer.  By exploiting this property we can easily construct a float in that range using only bitwise operations.  Stick those bits back inside a float, subtract 1.0, and you've got a float in the desired range of [0:1].

To get multidimensional variants of this function we simply combine more inputs into the hash function.



Let's try our one-dimensional variant first:
float r = random( gl_FragCoord.x );
fragment = vec4( r,r,r, 1.0 );
This should vary the luminance randomly over the X axis.  Result:


Already we can see some good indications.  The average luminance is about 50% which means the distribution of the output is roughly uniform and covers the full range.

But it's not possible to tell whether there are nasty patterns using only one-dimensional input.  Let's make some multi-dimensional variants of the hash function:
uint hash( uvec2 v ) {
    return hash( v.x ^ hash(v.y) );
}

uint hash( uvec3 v ) {
    return hash( v.x ^ hash(v.y) ^ hash(v.z) );
}

uint hash( uvec4 v ) {
    return hash( v.x ^ hash(v.y) ^ hash(v.z) ^ hash(v.w) );
}
The way I combined the inputs is totally arbitrary and rather lazy.  I just experimented with them until I found something that worked on my test inputs.  There's probably a better and faster way of doing this, like unrolling the OAT algorithm separately for each one.

Making versions of random() which use these functions is very simple so I'll spare you the wall of text.



Anyway, let's try some 2D randomness:
float r = random( gl_FragCoord.xy );
fragment = vec4( r,r,r, 1.0 );

Great - not a pattern to be seen.  Bad RNGs create visual artifacts which spoil the effect, typically diagonal lines.  Something that looks exactly like static is the best possible outcome.



Our noise is still stationary though.  If you want the value to differ each frame then you need another input: time.  You can pass this into the shader as a uniform.  With both spatial and temporal inputs it now looks like this:
float r = random( vec3(gl_FragCoord.xy, time) );
fragment = vec4( r,r,r, 1.0 );

Wonderful.  (It looks much better in practice at 60hz.  There's only so much I can do with a GIF)



It would be interesting to see how it holds up compared to other techniques.  I've never seen anyone generate 'random' floats this way before.  Knowing computer science somebody probably invented this 50 years ago though.



What about performance?  It seems pretty good but I'm no expert on measuring GPU perf.  My Geforce 660 Ti manages ~3400 FPS at 1920x1200x32 but this is probably a useless figure.  GPUs can execute multiple instruction streams in parallel, interleave them to hide latency, have very deep pipelines, and are often limited by memory bandwidth rather than computational power.  You'd have to compare it with other random number techniques in an actual application to get useful answers.




2 comments:

  1. Thanks, works perfectly in Metal too. Very nice looking noise also... :)

    ReplyDelete
  2. Hello! The idea is sound, but your 2D, 3D and 4D hash functions are very expensive... and the 1D hash is prone to visible artifacts.

    Note that the hash functions only need to care about the lowest 23 bits, with a focus on good avalanche behavior in the higher bits of that range.

    Here's what I use:

    uint hashInt1D( uint x )
    {
    x += x >> 11;
    x ^= x << 7;
    x += x >> 15;
    x ^= x << 5;
    x += x >> 12;
    x ^= x << 9;
    return x;
    }

    uint hashInt2D( uint x, uint y )
    {
    x += x >> 11;
    x ^= x << 7;
    x += y;
    x ^= x << 6;
    x += x >> 15;
    x ^= x << 5;
    x += x >> 12;
    x ^= x << 9;
    return x;
    }

    uint hashInt3D( uint x, uint y, uint z )
    {
    x += x >> 11;
    x ^= x << 7;
    x += y;
    x ^= x << 3;
    x += z ^ ( x >> 14 );
    x ^= x << 6;
    x += x >> 15;
    x ^= x << 5;
    x += x >> 12;
    x ^= x << 9;
    return x;
    }

    ReplyDelete