5.6 Using Timewarp: catching up to the user

In a Rift application, head pose information is captured before you render the image for each eye. However, rendering isn't instantaneous; processing time and the wait for vertical sync ("vsync") mean that each frame can take a dozen milliseconds to reach the screen. This presents a problem, because the head pose captured at the start of the frame probably won't match where the head actually is when the frame is displayed. So the head pose has to be predicted for some point in the future, and that prediction is necessarily imperfect: while the eye views are being rendered, the user could change the speed or direction they're turning their head, start moving from a still position, or otherwise change their current motion vector (Figure 5.8).
Figure 5.8: There will be a gap between when you sample the predicted orientation of the Rift for each eye and when you actually display the rendered and distorted frame. In that time the user could change their head movement direction or speed. This is usually perceived as latency.
THE PROBLEM: POSE AND PREDICTION DIVERGE

As a consequence, the predicted head pose used to render an eye view will rarely exactly match the actual pose of the head when the image is displayed on the screen. The gap is typically less than 13 ms, and the error is tiny in absolute terms, but the human visual cortex has millions of years of evolution behind it and is perfectly capable of perceiving the discrepancy, even if the user can't necessarily pin down what exactly is wrong. They'll perceive it as latency; or worse, they won't be able to say what it is that they perceive, but they'll declare the whole experience "un-immersive." You could even make them ill (see Chapter 8 for the relationship between latency and simulation sickness).
THE SOLUTION: TIMEWARP

To correct for these issues, timewarp was introduced in the 0.3.x versions of the Oculus SDK. The SDK can't actually use time travel to send head pose information back from the future, so it does the next best thing: immediately before putting the image on the screen, it samples the predicted head pose one more time. Because this prediction occurs so close to the moment the image will be displayed, it's much more accurate than the earlier per-eye poses. The SDK then looks at the difference between this late timewarp pose and the original predicted pose and shifts the image slightly to compensate.
5.6.1 Using timewarp in your code

Because timewarp is implemented as part of the SDK's overall distortion mechanism, all you need to do to use it (assuming you're using SDK-side distortion) is pass the relevant flag to the SDK during distortion setup:
int configResult = ovrHmd_ConfigureRendering(hmd, &config.Config,
    ovrDistortionCap_TimeWarp | otherDistortionCaps,  // ovrDistortionCap_TimeWarp enables timewarp
    eyeFovPorts, eyeRenderDescs);

(Here hmd, config, otherDistortionCaps, eyeFovPorts, and eyeRenderDescs stand in for the handle and structures set up earlier in your rendering configuration.)
That’s all there is to it.
5.6.2 How timewarp works

Consider an application running at 75 frames per second. It has 13.33 milliseconds to render each frame (not to mention everything else it has to do for each frame). Suppose your update-state code takes 1 millisecond, each eye render takes 4 milliseconds, and the distortion (handled by the SDK) takes 1 millisecond. Assuming you start your rendering loop immediately after the previous refresh, the sequence of events would look something like Figure 5.9.
Figure 5.9: A simple timeline for a single frame showing the points at which the (predicted) head pose is fetched. By capturing the head orientation a third time, immediately before ending the frame, it's possible to warp the image to adjust for the difference between the predicted and actual orientations. Only a few milliseconds, probably less than ten, have passed since the original orientations were captured, but this last-moment update can still strongly improve the perceived responsiveness of the Rift.
- Immediately after the previous screen refresh, you begin your game loop, starting by updating any game state you might have.
- Having updated the game state, you grab the predicted head pose and start rendering the first eye. About 12 ms remain until the screen refresh, so the predicted head pose is for 12 ms in the future.
- When you've finished with the first eye, you grab the predicted head pose again and start rendering the second eye. This pose is for about 8 ms in the future, so it's likely more accurate than the first-eye pose, but still imperfect.
- After rendering has completed for both eyes, you pass the rendered offscreen images to the SDK. About 4 ms remain until the screen refresh.
- The SDK wants the most accurate head pose it can get for timewarp, so it waits until the last possible moment to perform the distortion.
- With just enough time left to perform the distortion, the SDK fetches the head pose one last time. This pose is predicted only about 1 ms into the future, so it's much more accurate than either of the per-eye render predictions. The difference between each per-eye pose and this final pose is computed and fed into the distortion mechanism, which corrects the rendered image's position by rotating it slightly, as if on the inner surface of a sphere centered on the user.
- The distorted points of view are displayed on the Rift screen.
The effect of the shift is as if you took the image, painted it on the interior of a sphere centered on your eye, and then rotated the sphere slightly. So, for instance, if the predicted head pose was incorrect and you ended up turning your head further to the right than predicted, the timewarp mechanism would compensate by rotating the image to the left (Figure 5.10).
Figure 5.10: The rendered image is shifted to compensate for the difference between the predicted head pose at eye render time and the actual head pose at distortion time.
One caveat: when you add extra swivel to an image, some pixels at the edge of the frame that weren't originally rendered now need to be filled in. How best to handle these un-computed pixels is a topic of ongoing study, although initial research from Valve and Oculus suggests that simply coloring them black is fine.
5.6.3 Limitations of timewarp
Timewarp isn’t a latency panacea. This ‘rotation on rails’ works fine if the user’s point of view only rotates, but in real life our anatomy isn’t so simple. When you turn your head your eyes translate as well, swinging around the axis of your neck, producing parallax. So for instance, if you’re looking at a soda can on your desk, and you turn your head to the right, you’d expect a little bit more of the desk behind the right side of the can to be visible, because by turning your head you’ve also moved your eyes a little bit in 3D space. The timewarped view can’t do that, because it can’t manufacture those previously hidden pixels out of nothing. For that matter, the timewarp code doesn’t know where new pixels should appear, because by the time you’re doing timewarp, the scene is simply a flat 2D image, devoid of any information about the distance from the eye to a given pixel.
This is especially visible in motion with a strong translation component, but (perhaps fortunately) the human head's range and rate of motion in rotation is much greater than in translation. Translation generally involves large, coarse motions of the upper body, which are easily predicted by the tracking hardware and difficult for the user to change faster than the Rift can anticipate.
Oculus recognizes that the lack of parallax in timewarped images is an issue, and they're actively researching the topic. But all the evidence so far has been that, basically, users just don't notice. It seems probable that a parallax-aware timewarp would further reduce perceived latency if it were easy to build, but even without it we still get real and significant improvements from rotation alone.