Friday, September 26, 2014

Unity: Playing a video on a TV screen at the start of a Rift application

Let’s say you wanted to have a TV screen in your scene that plays a short welcome video on start-up, such as in this demo I'm working on:



Displaying video on a screen in a scene in Unity Pro is typically done using a Movie Texture. Movie Textures do not play automatically - you need to use a script to tell the video when to play. The Rift, however, presents some challenges that you wouldn’t face when working with a more traditional monitor, and they make knowing when to start the video a bit tricky.
  1. You can’t assume that the user has the headset on when the application starts. This means you can’t assume that the user can see anything that you are displaying. 
  2. On start-up, all Rift applications display a Health and Safety Warning (HSW). The HSW is a big rectangle pinned to the user’s perspective that largely obscures the user’s view of everything else in the scene.
  3. You aren’t in control of where the user looks (or rather, you shouldn’t be - moving the camera for the user can be a major motion sickness trigger), so you can’t be sure the user is even looking at the part of the scene where the video will be displayed.
In my demo, I addressed the first two issues by making sure the HSW had been dismissed before I started the video. Once the user has dismissed the HSW, it is no longer in the way of their view, and it is a good bet that they have the headset on and are ready to start the demo. I addressed the third issue by making sure the video is in the user’s field of view before it starts playing.

Making sure the Health and Safety Warning (HSW) has been dismissed

The HSW says “Press any key to dismiss.” My first thought was to use the key press as the trigger for starting the video. Unfortunately this doesn’t quite work. The HSW must be displayed for a minimum amount of time before it can actually be dismissed - 15 seconds the first time it is displayed for a given profile and 6 seconds for subsequent times. The result was that the key was often pressed, and the welcome video would start, but the HSW had not yet gone away. I also wanted the video to replay if the user reloaded the scene. When the scene is reloaded, the HSW is not displayed, so the user does not need to press a key, and therefore the video would never start.

Fortunately, the Oculus Unity Integration package provides a way to check whether the HSW is still being displayed:
OVRDevice.HMD.GetHSWDisplayState().Displayed
The above will return true if the HSW is still on screen.

Making sure the video is in the player’s field of view

How you get the user to look at the video will depend a lot on what kind of scene you are using. You can, for example, put a TV in every corner of the room so that no matter which direction the user is looking, a screen is in view. Or, if you have only a single TV screen, you can use audio cues to get the user’s attention. (I haven't decided yet how I will get the user's attention in my demo.)

No matter how you get the user to look at where the video is playing, you can check that the video is within the user’s field of view by checking the screen’s render state before playing:
renderer.isVisible
The above will return true if the object (in this case, the TV screen) is currently being rendered in the scene.
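Putting the two checks together, a minimal Unity script might look like the following. This is a sketch against the 2014-era Oculus Unity Integration and Unity 4.x scripting API; the class name and the `welcomeMovie` field are my own inventions, and the OVR names may differ in your integration version.

```csharp
using UnityEngine;

// Attach this to the TV-screen object whose material uses the Movie Texture.
// Requires Unity Pro (for Movie Textures) and the Oculus Unity Integration.
public class PlayWelcomeVideo : MonoBehaviour
{
    public MovieTexture welcomeMovie;  // assign in the Inspector
    private bool started = false;

    void Update()
    {
        if (started || welcomeMovie == null)
            return;

        // Issues 1 and 2: wait until the HSW has been dismissed, which also
        // implies the user has the headset on and is ready to begin.
        if (OVRDevice.HMD.GetHSWDisplayState().Displayed)
            return;

        // Issue 3: wait until the screen is actually being rendered in view.
        if (!renderer.isVisible)
            return;

        welcomeMovie.Play();
        started = true;
    }
}
```

Because the script simply returns until both conditions hold, it also covers the scene-reload case: no key press is required for the video to start.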

Thursday, September 25, 2014

Video: Asynchronous timewarp with the Oculus Rift

In this video Brad discusses an example of using asynchronous timewarp in order to maintain a smooth experience in the Rift even if your rendering engine can't maintain the full required framerate at all times.

 


Friday, August 22, 2014

Using basic statistical analysis to discover whether or not the Oculus Rift headset is being worn

As we were getting ready for our talk next week at PAX Dev 2014, entitled "Pitfalls & Perils of VR Development: How to Avoid Them", an interesting question came up: how can you tell if the Rift is actually on the user's head, instead of on their desk?  It's a pretty common (and annoying) scenario right now--you double-click to launch a cool new game, and immediately you can hear intro music and cutscene dialog but the Rift's still on a table.  I hate feeling like, ack!, I have to scramble to get the Rift on my face to see the intro.

Valve's SteamVR will help with this a lot, I expect; if I launch a game when I'm already wearing the Rift, there'll be no jarring switch.  But I'm leery--half the Rift demos I download today start by popping up a Unity dialog on my desktop before they switch to fullscreen VR, and that's going to be an even worse experience if I'm using Steam.

So I was mulling over how to figure out programmatically whether or not the Rift is on the user's head.  I figure that we can't just look at the position data from the tracker camera, because "camera can see Rift" isn't a firm indicator of "Rift is being used".  (Maybe the Rift is sitting on my desk, in view of the camera.)  Instead, we need to look at the noise of its position.

I recorded the eye pose at each frame, taking an average of all eye poses recorded every tenth of a second.  At 60FPS that's about six positions per decisecond.  The Rift's positional sensors are pretty freaking sensitive; when the Rift is sitting on my desk, the difference in position from one decisecond to the next from ambient vibration is on the order of a hundredth of a millimeter.  Pick it up, though, and those differences spike.

I plotted the standard deviation of the Rift's position, in a rolling window of ten samples for the past second, versus time:


This is a graph of log(standard deviation(average change in position per decisecond)) over time.  The units on the left are in log scale.  I found that when the Rift was inert on my desk, casual vibration kept log(σ) < -10.5; as I picked it up log(σ) spiked, and then while worn would generally hover between -4.5 and -10.5.  When the Rift was being put on or taken off, log(σ) climbed as high as -2, but only very briefly.

I found that distinguishing a Rift that was being put on or taken off from a Rift that was being worn normally was pretty hard with this method, but the distinction between "not in human hands at all" and "in use" was clear.  So this demonstrates a method for programmatically determining whether the Rift is in active use or not.  I hope it's useful.

Sample code was written in Java, and is available on the book's github repo.  (File "HeadMotionStatsDemo.java".)
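A stripped-down sketch of the same statistical test follows. This is not the repo's HeadMotionStatsDemo.java; the class name and window handling are illustrative, and the -10.5 cutoff is taken from the measurements above (it will vary with your desk and environment).

```java
public class WearDetector {
    // log(sigma) threshold separating "idle on a desk" from "in use";
    // -10.5 comes from the measurements described above.
    static final double IDLE_LOG_SIGMA = -10.5;

    /** positions: averaged head position (meters), one sample per decisecond. */
    static double logSigmaOfDeltas(double[] positions) {
        int n = positions.length - 1;
        double mean = 0;
        for (int i = 0; i < n; i++) mean += positions[i + 1] - positions[i];
        mean /= n;
        double var = 0;
        for (int i = 0; i < n; i++) {
            double d = positions[i + 1] - positions[i] - mean;
            var += d * d;
        }
        return Math.log(Math.sqrt(var / n));
    }

    static boolean isWorn(double[] positions) {
        return logSigmaOfDeltas(positions) > IDLE_LOG_SIGMA;
    }

    public static void main(String[] args) {
        // Ambient desk vibration: deltas around a hundredth of a millimeter.
        double[] idle = {0, 1e-5, 2e-5, 1e-5, 0, 1e-5, 2e-5, 1e-5, 0, 1e-5, 0};
        // Worn: deltas on the order of millimeters per decisecond.
        double[] worn = {0, 2e-3, 1e-3, 4e-3, 2e-3, 5e-3, 3e-3, 6e-3, 4e-3, 7e-3, 5e-3};
        System.out.println("idle worn? " + isWorn(idle)); // false
        System.out.println("worn worn? " + isWorn(worn)); // true
    }
}
```

In a real application you would feed `logSigmaOfDeltas` a rolling window of the last ten decisecond averages and re-evaluate every frame.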

Advanced uses of Timewarp II - When you're running late

[This is post three of three on Timewarp, a new technology available on the Oculus Rift. This is a draft of work in progress of Chapter 5.7 from our upcoming book, "Oculus Rift in Action", Manning Press. By posting this draft on the blog, we're looking for feedback and comments: is this useful, and is it intelligible?]


5.7.2 When you're running late

Of course, when the flak really starts to fly, odds are that you won’t be rendering frames ahead of the clock—it’s a lot more likely that you’ll be scrambling to catch up.  Sometimes rendering a single frame takes longer than the number of milliseconds your target framerate allows.  But timewarp can be useful here too.

Say your engine realizes that it’s going to be running late.  Instead of continuing to render the current frame, you can send the previous frame to the Rift and let the Rift apply timewarp to the images generated a dozen milliseconds ago.  (Figure 5.12.)  Sure, they won’t be quite right—but if it buys you enough time to get back on top of your rendering load, it’ll be worth it, and no human eye will catch it when you occasionally drop one frame out of 75.  Far more importantly, the image sent to the Rift will continue to respond to the user’s head motions with absolute fidelity; low latency means responsive software, even with the occasional lost frame.

Remember, timewarp can distort any frame, so long as it’s clear when that frame was originally generated so that the Rift knows how much distortion to apply.

Figure 5.12: If you’re squeezed for rendering time, you can occasionally save a few cycles by dropping a frame and re-rendering the previous frame through timewarp.

The assumption here is that your code is sufficiently instrumented and capable of self-analysis that you do more than just render a frame and hope it was fast enough.  Carefully instrumented timing code isn’t hard to add, especially with display-bound timing methods such as ovrHmd_GetFrameTiming, but it does mean more complexity in the rendering loop.  If you’re using a commercial graphics engine, it may already have this support baked in.  This is the sort of monitoring that any 3D app engine that handles large, complicated, variable-density scenes will hopefully be capable of performing.

Dropping frames with timewarp is an advanced technique, and probably not worth investing engineering resources into early in a project.  This is something that you should only build when your scene has grown so complicated that you anticipate having spikes of rendering time.  But if that’s you, then timewarp will help.
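The core decision can be sketched as a simple time-budget check. Everything below is illustrative: the names are mine, the fixed 75 Hz budget comes from the example above, and a real engine would hand the previous frame back to the SDK for timewarp rather than just return a flag.

```java
public class FrameScheduler {
    static final double FRAME_BUDGET = 1.0 / 75.0;  // seconds per frame at 75 Hz

    /**
     * Decide whether to render a fresh frame or fall back to re-warping the
     * previous one. estimatedRenderSeconds should come from your engine's own
     * instrumentation, e.g. a moving average of recent frame render times.
     */
    static boolean shouldDropFrame(double elapsedThisFrameSeconds,
                                   double estimatedRenderSeconds) {
        return elapsedThisFrameSeconds + estimatedRenderSeconds > FRAME_BUDGET;
    }

    public static void main(String[] args) {
        // 2 ms into the frame, render estimated at 9 ms: 11 ms < 13.3 ms, render.
        System.out.println(shouldDropFrame(0.002, 0.009)); // false
        // 6 ms into the frame, render estimated at 9 ms: 15 ms > 13.3 ms, drop.
        System.out.println(shouldDropFrame(0.006, 0.009)); // true
    }
}
```

The estimate is the hard part; the comparison itself is trivial, which is why the real engineering cost of this technique is in the instrumentation, not the scheduler.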

Tuesday, August 19, 2014

Advanced Uses of Timewarp I - When you're running early

[This is post two of three on Timewarp, a new technology available on the Oculus Rift. This is a draft of work in progress of Chapter 5.7 from our upcoming book, "Oculus Rift in Action", Manning Press. By posting this draft on the blog, we're looking for feedback and comments: is this useful, and is it intelligible?]


5.7.1 When you're running early

One obvious use of timewarp is to fit in extra processing, when you know that you can afford it.  The Rift SDK provides access to its timing data through several API functions:
  • ovrHmd_BeginFrame       // Typically used in the render loop
  • ovrHmd_GetFrameTiming   // Typically used for custom timing and optimization
  • ovrHmd_BeginFrameTiming // Typically used when doing client-side distortion
These methods return an instance of the ovrFrameTiming structure, which stores the absolute time values associated with the frame. The Rift uses system time as an absolute time marker, instead of computing a series of differences from one frame to the next, because doing so reduces the gradual build-up of incremental error. These times are stored as doubles, which is a blessing after all the cross-platform confusion over how to count milliseconds.

ovrFrameTiming includes:
  • float DeltaSeconds
    The amount of time that has passed since the previous frame returned its BeginFrameSeconds value; usable for movement scaling. This will be clamped to no more than 0.1 seconds to prevent excessive movement after pauses for loading or initialization.
  • double ThisFrameSeconds
    Absolute time value of when rendering of this frame began or is expected to begin; generally equal to NextFrameSeconds of the previous frame. Can be used for animation timing.
  • double TimewarpPointSeconds
    Absolute point when the IMU is expected to be sampled for this frame by timewarp.
  • double NextFrameSeconds
    Absolute time when frame Present + GPU Flush will finish, and the next frame starts.
  • double ScanoutMidpointSeconds
    Time when half of the screen will be scanned out. Can be passed as a prediction value to ovrHmd_GetSensorState() to get general orientation.
  • double EyeScanoutSeconds[2]
    Timing points when each eye will be scanned out to display. Used for rendering each eye.

Generally speaking, it is expected that the following should hold:

ThisFrameSeconds
    < TimewarpPointSeconds
        < NextFrameSeconds
            < EyeScanoutSeconds[EyeOrder[0]]
                <= ScanoutMidpointSeconds
                    <= EyeScanoutSeconds[EyeOrder[1]]

…although actual results may vary during execution.

Knowing when the Rift is going to reach TimewarpPointSeconds and ScanoutMidpointSeconds gives us a lot of flexibility if we happen to be rendering faster than necessary. There are some interesting possibilities here: if we know that our code will finish generating the current frame before the clock hits TimewarpPointSeconds, then we effectively have ‘empty time’ to play with in the frame. You could use that time to do almost anything (provided it’s quick)—send data to the GPU to prepare for the next frame, compute another million particle positions, prove the Riemann Hypothesis—whatever, really (Figure 5.11.)
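For example, the "do I have empty time?" test is just a comparison against TimewarpPointSeconds. The sketch below is illustrative, not SDK code; the names and the safety margin are my own assumptions.

```java
public class FrameBudget {
    /**
     * Returns true if an optional task of estimatedTaskSeconds can finish
     * before the SDK's timewarp sample point, with some margin to spare.
     * nowSeconds and timewarpPointSeconds are absolute times, in the same
     * clock the SDK's ovrFrameTiming values use.
     */
    static boolean canDoExtraWork(double nowSeconds,
                                  double timewarpPointSeconds,
                                  double estimatedTaskSeconds,
                                  double safetyMarginSeconds) {
        return nowSeconds + estimatedTaskSeconds + safetyMarginSeconds
               < timewarpPointSeconds;
    }

    public static void main(String[] args) {
        double now = 100.000;            // current absolute time
        double timewarpPoint = 100.012;  // timewarp sample ~12 ms away
        // 5 ms task + 2 ms margin = 7 ms: fits inside the 12 ms window.
        System.out.println(canDoExtraWork(now, timewarpPoint, 0.005, 0.002)); // true
        // 11 ms task + 2 ms margin = 13 ms: would overrun the window.
        System.out.println(canDoExtraWork(now, timewarpPoint, 0.011, 0.002)); // false
    }
}
```

A per-frame loop could keep pulling tasks off a queue as long as this check passes, giving you the dynamic scene-density scaling described below.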


Figure 5.11: Timewarp means you’ve got a chance to do extra processing for ‘free’ if you know when you’re idle.

Keep this in mind when using timewarp. It effectively gives your app free license to scale its scene density, graphics level, and just plain awesomeness up or down dynamically as a function of current performance, measured and decided right down to the individual frame.

But it’s not a free pass! Remember that there are nasty consequences to overrunning your available frame time: a dropped frame. And if you don’t adjust your own timing, you risk the SDK spending a busywait cycle for almost all of the following frame, using past data for the next image, which can consume valuable CPU. So you’ve got a powerful weapon here, but you must be careful not to shoot yourself in the foot with it.

[Next post: Chapter 5.7, "Advanced uses of timewarp", part 2]

Monday, August 18, 2014

Using Timewarp on the Oculus Rift

[This is post one of three on Timewarp, a new technology available on the Oculus Rift. This is a draft of work in progress of Chapter 5.6 from our upcoming book, "Oculus Rift in Action", Manning Press. By posting this draft on the blog, we're looking for feedback and comments: is this useful, and is it intelligible?]


5.6 Using Timewarp: catching up to the user

In a Rift application, head pose information is captured before you render the image for each eye. However, rendering is not an instantaneous operation; processing time and vertical sync (“vsync”) mean that every frame can take a dozen milliseconds to get to the screen. This presents a problem, because the head pose information at the start of the frame probably won’t match where the head actually is when the frame is rendered. So, the head pose information has to be predicted for some point in the future; but the prediction is necessarily imperfect. During the rendering of eye views the user could change the speed or direction they’re turning their head, or start moving from a still position, or otherwise change their current motion vector. (Figure 5.8.)


Figure 5.8: There will be a gap between when you sample the predicted orientation of the Rift for each eye and when you actually display the rendered and distorted frame. In that time the user could change their head movement direction or speed. This is usually perceived as latency.


THE PROBLEM: POSE AND PREDICTION DIVERGE

As a consequence, the predicted head pose used to render an eye-view will rarely exactly match the actual pose the head has when the image is displayed on the screen. Even though the gap is typically less than 13ms, and the amount of error is very small in the grand scheme of things, the human visual system has millions of years of evolution behind it and is perfectly capable of perceiving the discrepancy, even if it can’t necessarily nail down what exactly is wrong. Users will perceive it as latency—or worse, they won’t be able to say what it is that they perceive, but they’ll declare the whole experience “un-immersive”. You could even make them ill (see Chapter 8 for the relationship between latency and simulation sickness).

THE SOLUTION: TIMEWARP

To attempt to correct for these issues, Timewarp was introduced in the 0.3.x versions of the Oculus SDK. The SDK can’t actually use time travel to send back head pose information from the future[1], so it does the next best thing. Immediately before putting the image on the screen it samples the predicted head pose again. Because this prediction occurs so close to the time at which the images will be displayed on the screen it’s much more accurate than the earlier poses. The SDK can look at the difference between the timewarp head pose and the original predicted head pose and shift the image slightly to compensate for the difference. 

5.6.1 Using timewarp in your code

Because the functionality and implementation of timewarp is part of the overall distortion mechanism inside the SDK, all you need to do to use it (assuming you’re using SDK-side distortion) is to pass the relevant flag into the SDK during distortion setup:

    int configResult = ovrHmd_ConfigureRendering(
        hmd, 
        &cfg, 
        ovrDistortionCap_TimeWarp, 
        hmdDesc.DefaultEyeFov, 
        eyeRenderDescs);

That’s all there is to it.

5.6.2 How timewarp works

Consider an application running at 75 frames per second. It has 13.33 milliseconds to render each frame (not to mention do everything else it has to do for each frame). Suppose your ‘update state’ code takes 1 millisecond, each eye render takes 4 milliseconds, and the distortion (handled by the SDK) takes 1 millisecond. Assuming you start your rendering loop immediately after the previous refresh, the sequence of events would look something like Figure 5.9.

Figure 5.9: A simple timeline for a single frame showing the points at which the (predicted) head pose is fetched. By capturing the head orientation a third time immediately before ending the frame, it’s possible to warp the image to adjust for the differences between the predicted and actual orientations. Only a few milliseconds—probably less than ten—have passed since the original orientations were captured, but this last-moment update can still strongly improve the perception of responsiveness in the Rift.

  1. Immediately after the previous screen refresh, you begin your game loop, starting by updating any game state you might have.
  2. Having updated the game state, we grab the predicted head pose and start rendering the first eye. ~12 ms remain until the screen refresh, so the predicted head pose is for 12 ms in the future.
  3. We’ve finished with the first eye, so we grab the predicted head pose again, and start rendering the second eye. This pose is for ~8ms in the future, so it’s likely more accurate than the first eye pose, but still imperfect. 
  4. After rendering has completed for each eye, we pass the rendered offscreen images to the SDK. ~4ms remain until the screen refresh.
  5. The SDK wants to fetch the most accurate head pose it can for timewarp, so it will wait until the last possible moment to perform the distortion.
  6. With just enough time to perform the distortion, the SDK fetches the head pose one last time. This head pose is only predicted about 1ms into the future, so it’s much more accurate than either of the per-eye render predictions. The difference between each per-eye pose and this final pose is computed and sent into the distortion mechanism so that it can correct the rendered image position by rotating it slightly, as if on the inner surface of a sphere centered on the user. 
  7. The distorted points of view are displayed on the Rift screen. 
By capturing the head pose a third time, so close to rendering, the Rift can ‘catch up’ to unexpected motion. When the user’s head rotates, the point where the image is projected can be shifted, appearing where it would have been rendered if the Rift could have known where the head pose was going to be.
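To make the arithmetic in the timeline concrete: with the example numbers above (75 Hz refresh, 1 ms of game state, 4 ms per eye), each pose sample's prediction horizon is simply the time remaining until scanout. This is an illustrative calculation, not SDK code.

```java
public class PoseSampleTimeline {
    static final double FRAME = 1.0 / 75.0;  // ~13.33 ms between refreshes

    // Prediction horizon for a pose sampled elapsedSeconds into the frame:
    // the time remaining until the frame is scanned out.
    static double predictionHorizon(double elapsedSeconds) {
        return FRAME - elapsedSeconds;
    }

    public static void main(String[] args) {
        // Pose for eye one, grabbed after 1 ms of game state updates.
        System.out.printf("eye one pose:  +%.1f ms%n", predictionHorizon(0.001) * 1000);
        // Pose for eye two, grabbed after the first 4 ms eye render.
        System.out.printf("eye two pose:  +%.1f ms%n", predictionHorizon(0.005) * 1000);
        // Timewarp pose, grabbed roughly 1 ms before scanout.
        System.out.printf("timewarp pose: +%.1f ms%n", predictionHorizon(FRAME - 0.001) * 1000);
    }
}
```

The shrinking horizons (about 12 ms, 8 ms, and 1 ms) are exactly why the timewarp sample is so much more accurate than the per-eye samples.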

The exact nature of the shifting is similar to if you took the image and painted it on the interior of a sphere which was centered on your eye, and then slightly rotated the sphere. So for instance if the predicted head pose was incorrect, and you ended up turning your head further to the right than predicted, the timewarp mechanism would compensate by rotating the image to the left (Figure 5.10.)

Figure 5.10: The rendered image is shifted to compensate for the difference between the predicted head pose at eye render time and the actual head pose at distortion time.

One caveat: when you go adding extra swivel to an image, there’ll be some pixels at the edge of the frame that weren’t rendered before and now need to be filled in. How best to handle these un-computed pixels is a topic of ongoing study, although initial research from Valve and Oculus suggests that simply coloring them black is fine.

5.6.3 Limitations of timewarp 

Timewarp isn’t a latency panacea. This ‘rotation on rails’ works fine if the user’s point of view only rotates, but in real life our anatomy isn’t so simple. When you turn your head your eyes translate as well, swinging around the axis of your neck, producing parallax. So for instance, if you’re looking at a soda can on your desk, and you turn your head to the right, you’d expect a little bit more of the desk behind the right side of the can to be visible, because by turning your head you’ve also moved your eyes a little bit in 3D space. The timewarped view can’t do that, because it can’t manufacture those previously hidden pixels out of nothing. For that matter, the timewarp code doesn’t know where new pixels should appear, because by the time you’re doing timewarp, the scene is simply a flat 2D image, devoid of any information about the distance from the eye to a given pixel. 
 
This is especially visible in motion with a strong translation component, but (perhaps fortunately) the human head’s range and rate of motion in rotation is much greater than in translation. Translation generally involves large, coarse motions of the upper body, which are comparatively easy for predictive tracking; it’s hard to translate your head faster than the Rift can anticipate.

Oculus recognizes that the lack of parallax in timewarped images is an issue, and they’re actively researching the topic. But all the evidence so far has been that, basically, users just don’t notice. It seems probable that parallax-aware timewarp would be a real latency win if it were easy to do, but even without it we still get real and significant improvements from rotation alone.

[Next post: Chapter 5.7, "Advanced uses of timewarp", part 1]

______________________
[1]
It's very difficult to accelerate a Rift up to 88 miles per hour