Thursday, August 15, 2013

Improving on the Distortion Shader

In our last article we examined the distortion required by the Oculus Rift and created a shader similar to the one used in the example code in the Oculus VR SDK.  Similar, but not identical.  One big difference was that I broke out the code that did coordinate transformation into separate functions for clarity.  In addition, I used viewport coordinates instead of screen coordinates, again for reasons of clarity.

However, there is more that could be done to reduce the complexity of the shader and improve its performance.

Motivation 

While my earlier post explicitly noted that I was not attempting to follow best practices or focus on performance, latency is a critical issue when working with the Oculus Rift, or any VR headset.  The rendered scene must respond to head movement with no perceptible lag: at best, lag breaks immersion; at worst, it makes the user motion sick.  Because of this it's important to maintain a high frame rate and shave off unnecessary work wherever you can.

Vertex vs Fragment Operations

In OpenGL and DirectX, a shader program is made up of, at minimum, a vertex shader and a fragment shader.  Vertex shaders act on the vertices you pass in, while fragment shaders (also sometimes known as pixel shaders) act on the individual pixels to be rendered.  As you can imagine, the fragment shader tends to be executed far more often than the vertex shader.  If I render a two-triangle strip that produces a 100x100 pixel square, the vertex shader is going to be called 4 times, but the fragment shader will be called 10,000 times.  The upshot is that any operations that can be moved from the fragment shader to the vertex shader, or out of the shader program completely, should be moved.
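
For a concrete sense of where each stage runs, here's a minimal (and deliberately boring) GLSL pair in the old 1.20 style.  The u_texture and v_texCoord names are just placeholders, not anything from the Rift shader.  The vertex shader runs once per vertex and writes a varying; that varying is interpolated across the triangle and handed to the fragment shader essentially for free.

// Vertex shader: runs once per vertex (4 times for our quad)
varying vec2 v_texCoord;

void main() {
    // Anything computed here is computed per-vertex; the result is
    // interpolated across the triangle before the fragment shader runs
    v_texCoord = gl_MultiTexCoord0.xy;
    gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
}

// Fragment shader: runs once per covered pixel (10,000 times for a
// 100x100 quad)
uniform sampler2D u_texture;
varying vec2 v_texCoord;

void main() {
    gl_FragColor = texture2D(u_texture, v_texCoord);
}

Anything we can compute per-vertex and stuff into a varying like this costs us 4 executions instead of 10,000.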

Eligible Transforms

So what can we remove from the fragment shader?  Well, when I discussed coordinate systems in the earlier post, I noted that by default, screen coordinates go from -1 to 1 on each axis, while texture coordinates go from 0 to 1.  Because of this mismatch, the first transformation we do in the fragment shader is to move from the texture coordinate system into the screen coordinate system.  However, while it's true that the coordinates you pass to texture2D() in the fragment shader must be in the 0 to 1 range, the texture coordinates we specify on the vertices in OpenGL can be any values we like.  Right off the bat, this means we can specify the values in screen coordinates when we pass them into OpenGL, like so:

glBegin(GL_QUADS);
glTexCoord2f( -1, -1); glVertex2f(-1, -1);
glTexCoord2f(  1, -1); glVertex2f( 1, -1);
glTexCoord2f(  1,  1); glVertex2f( 1,  1);
glTexCoord2f( -1,  1); glVertex2f(-1,  1);
glEnd();

But that's just scratching the surface.  In fact, if you look at all the transformations we apply to the texture coordinates before applying the distortion function, you'll see they're all simple translations or scalings:

vec2 textureCoordsToDistortionOffsetCoords(vec2 texCoord) {
    vec2 result = texCoord;
    // Convert the texture coordinates from "0 to 1" to "-1 to 1"
    result *= 2.0;
    result -= 1.0;

    // Convert from using the center of the screen as the origin to
    // using the lens center as the origin
    result -= u_lensCenterOffset;

    // Correct for the aspect ratio
    result.y /= u_aspect;

    return result;
}

That means this entire function doesn't need to be in the fragment shader at all.  You can perform the equivalent operations on the initial 2D texture coordinates you pass into OpenGL, and the interpolation of those values before the fragment shader runs will do the rest for you.
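
One way to do this is in the vertex shader (equivalently, you could bake the same arithmetic into the values you hand to glTexCoord2f).  Here's a sketch, assuming the same u_lensCenterOffset and u_aspect uniforms the fragment shader used, and a made-up varying name, v_distortionCoord:

// Vertex shader sketch: the linear steps now run once per vertex
uniform vec2  u_lensCenterOffset;
uniform float u_aspect;

varying vec2 v_distortionCoord;

void main() {
    vec2 result = gl_MultiTexCoord0.xy;

    // Convert from "0 to 1" to "-1 to 1" (skip this if you already
    // specify the coordinates in the -1 to 1 range, as in the glBegin
    // block above)
    result = result * 2.0 - 1.0;

    // Use the lens center, not the screen center, as the origin
    result -= u_lensCenterOffset;

    // Correct for the aspect ratio
    result.y /= u_aspect;

    v_distortionCoord = result;
    gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
}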

Ineligible Transforms

You may be asking yourself why we can't also move the distortion scale function out of the fragment shader.  The answer is that it's non-linear.  In a linear function, as you scale the input, the output scales at the same rate.  Hence, if you have a linear function f and you know f(0) = 0 and f(2) = 10, then you also know f(1) = 5.  Plotting the input and output values of a linear function on a graph will always give you a line, hence the name linear.

With a non-linear function you don't have any such guarantee.  The values the fragment shader receives from OpenGL are linearly interpolated between the values produced (or passed through) by the vertex shader, so any non-linear function applied to the texture coordinates must be evaluated in the fragment shader.  Further, any operation that occurs after a non-linear function cannot be moved either, even if it is itself linear, because it depends on the non-linear result.
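
To make that concrete, here's a sketch of what the slimmed-down fragment shader looks like.  The names and the exact remaining steps are illustrative rather than lifted from the original shader: u_distortionCoefficients stands in for the Rift's K values, and details like the fit scale and edge clamping are left out.

uniform sampler2D u_texture;
uniform vec2  u_lensCenterOffset;
uniform float u_aspect;
uniform vec4  u_distortionCoefficients;

// The linear steps were already applied per-vertex
varying vec2 v_distortionCoord;

// Non-linear radial distortion: scale the offset by a polynomial in
// r squared.  This is the step that cannot be interpolated from the
// vertices.
vec2 applyDistortion(vec2 offset) {
    float rSq = dot(offset, offset);
    float scale = u_distortionCoefficients.x
                + u_distortionCoefficients.y * rSq
                + u_distortionCoefficients.z * rSq * rSq
                + u_distortionCoefficients.w * rSq * rSq * rSq;
    return offset * scale;
}

// Reverse of the per-vertex transforms: back into "0 to 1" texture space
vec2 distortionOffsetCoordsToTextureCoords(vec2 offset) {
    vec2 result = offset;
    result.y *= u_aspect;
    result += u_lensCenterOffset;
    result = (result + 1.0) / 2.0;
    return result;
}

void main() {
    vec2 distorted = applyDistortion(v_distortionCoord);
    vec2 texCoord  = distortionOffsetCoordsToTextureCoords(distorted);
    gl_FragColor   = texture2D(u_texture, texCoord);
}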

Results

I haven't finished my benchmarking application yet, so this is still pending.

Further Improvements

There is actually a much better way of implementing the fragment shader.  All the shader does is transform one set of texture coordinates into another based on a number of inputs.  If you work backwards through the code, it becomes clear that all of those inputs are fixed values determined by the physical parameters of the Rift headset: the size of the screen, the position of the lenses relative to the screen, and the distortion coefficients of the lenses.  Given those values, you can precompute the distortion offset for every possible input coordinate.  Once that's done, the entire distortion calculation can be replaced by a simple texture lookup.
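
As a rough sketch of where this is heading: bake the result into a second texture, generated offline from the headset parameters, whose red and green channels hold the precomputed source coordinate for each screen position.  The fragment shader then collapses to a couple of lookups.  The u_offsetMap name and the two-channel encoding here are just one way to do it:

// Fragment shader sketch: distortion replaced with a lookup
uniform sampler2D u_texture;     // the rendered scene
uniform sampler2D u_offsetMap;   // precomputed distortion lookup

varying vec2 v_texCoord;         // plain "0 to 1" coordinates

void main() {
    // Fetch where this pixel should sample from, then sample it
    vec2 sourceCoord = texture2D(u_offsetMap, v_texCoord).rg;
    gl_FragColor = texture2D(u_texture, sourceCoord);
}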

More on this later...


