Friday, January 31, 2014

Managing non-code resources in cross platform projects and cross language projects

For the book, I am writing a fair amount of example code.  The example code uses a lot of non-compiled resources: meshes, images, fonts, and even shader sources are all needed for rendering.  I need to be able to access these resources on all my target platforms (Windows, OSX, Linux, Android) and in all my target languages (C++, Java, Android Java).

Accessing these resources at run-time poses several challenges:
  • How do I store the resource so that it's available to my application regardless of platform?
  • How do I specify the resource I want in the application?
On top of these basic problems, which I must solve in order to have the examples work at all, there are two other constraints I would like to satisfy:
  • How do I avoid duplicating the resource storage for each of my example applications?
  • How can I enable a debugging mode so that if a resource changes while an application is running, the changes are reflected in the application immediately (without rebuilding or even restarting the application)? 
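
One possible direction for the first two questions (a minimal sketch, not a final solution) is to identify each resource by a platform-neutral relative path and resolve it against a per-platform root at runtime.  In the sketch below, RESOURCE_ROOT is a hypothetical compile-time define; on Android the same lookup would go through the asset manager rather than the filesystem.

```
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Resolve a platform-neutral relative path against a per-platform root.
// RESOURCE_ROOT is a hypothetical compile-time define (e.g. set by the build
// system); on Android this lookup would go through the asset manager instead.
std::string resolveResourcePath(const std::string& relativePath) {
#ifdef RESOURCE_ROOT
    return std::string(RESOURCE_ROOT) + "/" + relativePath;
#else
    return relativePath; // fall back to the current working directory
#endif
}

// Load an entire resource (say, a shader source) into a string.
std::string loadResourceAsString(const std::string& relativePath) {
    std::ifstream in(resolveResourcePath(relativePath), std::ios::binary);
    if (!in) {
        throw std::runtime_error("Unable to open resource: " + relativePath);
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

A lookup like this also leaves room for the last two goals: the root can point at a single shared resource directory used by every example, and in a debug build the loader could stat the file and reload it when the timestamp changes.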

Thursday, January 30, 2014

Book announcement - Oculus Rift in Action

Manning Publications (creators of the popular 'in Action' series of books) is working on a new title on development for the Oculus Rift, titled "Oculus Rift in Action", by Brad Davis (<-- me!), Alex Benton and Karen Bryla.

This book will focus on driving understanding of the tasks required for developing for the Rift via a series of cross-platform examples written in C++ and OpenGL.  The intent is that regardless of your favored language and rendering API, these examples should be clear enough to allow you to apply the lessons learned to your preferred environment.

In addition to the basics, we will cover topics such as best practices in developing immersive applications, integrating other VR elements, and working with the Rift in other languages.  A tentative full chapter listing is available at the end of this message.

As part of the publication process, the book will go through MEAP, the Manning Early Access Program.  Interested readers can purchase the book early (before it's finished), get a look at chapters as they become available, and provide feedback on how the book could better serve the needs of the community.  The MEAP will start with the first two chapters of the book and incorporate more material as it's completed.  It should be live within the next couple of weeks.

I'm hoping that both beginner and experienced developers will participate.  We want to receive feedback on our methods of teaching so that we can build a resource to help create consistently great experiences with the Rift.  This will hopefully drive mainstream adoption.  A transformative product like the Rift needs to start with a great developer community.

Chapter listing [as of the date of this post]
  • Part 1: Getting Started
    • Chapter 1:  Meet the Oculus Rift
    • Chapter 2:  Creating Your First Rift Interactions
  • Part 2: Building Immersive Environments
    • Chapter 3: Getting input and changing the point of view: Working with the head tracker
    • Chapter 4: Sending output to the Rift: Working with the display
    • Chapter 5: Creating Rift viewable images: Transforming the image
    • Chapter 6: Rendering a scene
    • Chapter 7: Building an Immersive Environment
    • Chapter 8:  Improving the VR Experience 
  • Part 3: Building on the Basics
    • Chapter 9: Using the Rift with Other Languages
    • Chapter 10: Using the Rift with Body Tracking Hardware
    • Chapter 11: Accessing the Rift Hardware Directly
  • Part 4: Putting it into Practice
    • Chapter 12: Working with Unity: Getting a HUD of the Game
    • Chapter 13: Using the Rift with WebGL: Google Streetview
    • Chapter 14: Merging External Inputs with the Rift: Augmented Reality
  • Appendix A: An OpenGL Primer
  • Appendix B: Sample Values
  • Appendix C: Recommended Reading


Sunday, January 26, 2014

Flexible displays & the Rift

I keep seeing posts on the Oculus forums along the lines of this one:
I think the next major step in head mounted displays will be flexable OLED panels as displayed here: http://www.oled-info.com/lg-display-starts-6-flexible-oled-mass-production-products-expected-2014
no... no... NO...

Flexible display panels are a solution in desperate need of a problem. Maybe they'll be handy for making more durable electronics at some point, but certainly not until all the rest of your cellphone is also flexible. Even then, the screen is going to be extremely fragile and vulnerable to damage unless it's covered by a hard surface such as glass, so it's basically a non-starter.

Curved TVs were all the rage at the last CES, but Ars Technica has a good deconstruction of why it's total bullshit.  Short version:  all the marketing from the various vendors says different things about why curved TVs are awesome.  If three people tell you three different reasons why something is useful, you can bet that none of them are exactly the truth, and perhaps all three have some ulterior motive.  In the case of TV manufacturers, the ulterior motive is pretty clear (hint: it's "Buy a new TV").

Curved display panels would be a disaster for HMDs.  First off, they're not going to curve like Geordi's visor.  For HMDs that use radially symmetrical lenses (like the Rift), the curvature has to be centered on the lens axis.  So that means that there's going to need to be a kink in the middle, like so:


While OLED panels can currently bend, I don't believe there are any that can fold.  So you'd have to have two separate screen panels, each separately curved, which instantly doubles the cost of making the display.  Unless curved OLEDs are miraculously cheaper than today's displays, this would push the price of a Rift a lot closer to the $500 mark.  Badness.

Even more damning is the need for radial symmetry.  The radial symmetry in the lenses means that you need radial symmetry in the display, to maintain a constant distance between the lens and the panel for a given theta (the angle between the lens axis and the direction you're looking). This doesn't work with a flexible panel if you bend it, because a flexible panel only bends along one axis. Suddenly the distance between the panel and the lens changes depending on whether you're looking up or to the left. Look 30 degrees to the left and the panel surface is closer because of the curvature. Look 30 degrees up and the panel is the same distance as it was on a flat panel.  Which means that we're still stuck with all the complexity of the distortion shader.  We were supposed to be free of that complexity!  OLEDs!  You were the chosen one!
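
To put a rough number on that (an idealized model, not measurements from the Rift's actual optics): say the panel sits at distance $d$ from the lens and is bent into a cylinder of radius $d$ about a vertical axis through the lens center. The lens-to-panel distance along a ray tilted by an angle $\theta$ is then

$$r_{\text{left/right}}(\theta) = d, \qquad r_{\text{up/down}}(\theta) = \frac{d}{\cos\theta},$$

so at $\theta = 30^\circ$ the vertical ray travels about 15% farther than the horizontal one ($1/\cos 30^\circ \approx 1.15$), where a radially symmetric design needs the two to be identical.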

The only way to solve that would be to make a panel which can bend in two dimensions, like a contact lens.  But that's not a flexible screen any more--that's custom-molded plastics and circuitry, i.e., super-expensive again.  There's no way to flex a single flat sheet (a surface with zero Gaussian curvature, for the more mathematically inclined among the audience) into a cup.  One (entertainingly named) way to visualize the problem is via the Hairy Ball Theorem, which says that if you have a sphere that is covered with hair, there is no way to arrange the hair so that it lies flat everywhere.  Somewhere you're going to have a cowlick.

Finally, curving the display doesn't actually gain you anything. The limiting factor to the field of view is the lens, not the display. Sure, the display being slightly bigger might take some work off the lens, and possibly simplify the shader code, but ultimately the greatest angle you're going to be able to see is determined by how far off the lens axis you can look and still be looking through the lens itself. No amount of screen curvature is going to change that.

Oculus Rift rendering from Java & mesh based distortion


I've just updated our public examples repository, located here, to add an example of rendering content for the Oculus Rift using Java (via LWJGL).

The Java example does its distortion using an approach that is new (to me anyway): a mesh, rather than a shader, does the heavy lifting. More details after the break.
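
For a rough sense of what a mesh-based approach looks like, here's a sketch of the idea only, not the repository's actual code: build a grid covering the eye's viewport and pre-warp each vertex's texture coordinate with the radial distortion polynomial, so the fragment shader only has to do a plain texture lookup.  The coefficients below are placeholders, not the values the SDK reports for real hardware.

```
#include <cmath>
#include <vector>

// One vertex of the distortion mesh: a screen position and the (pre-distorted)
// coordinate at which to sample the rendered eye texture.
struct DistortionVertex {
    float pos[2]; // normalized device coordinates on screen
    float uv[2];  // pre-distorted texture coordinate
};

std::vector<DistortionVertex> buildDistortionMesh(int rows, int cols) {
    // Placeholder distortion coefficients; real values come from the headset.
    const float k[4] = { 1.0f, 0.22f, 0.24f, 0.0f };
    std::vector<DistortionVertex> mesh;
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            DistortionVertex v;
            // Position on a [-1, 1] grid across the eye's viewport.
            v.pos[0] = -1.0f + 2.0f * x / (cols - 1);
            v.pos[1] = -1.0f + 2.0f * y / (rows - 1);
            // Radial scale from the distortion polynomial, evaluated per vertex
            // here instead of per fragment in a shader.
            float rSq = v.pos[0] * v.pos[0] + v.pos[1] * v.pos[1];
            float scale = k[0] + rSq * (k[1] + rSq * (k[2] + rSq * k[3]));
            // Map the scaled position back into [0, 1] texture space.
            v.uv[0] = (v.pos[0] * scale + 1.0f) * 0.5f;
            v.uv[1] = (v.pos[1] * scale + 1.0f) * 0.5f;
            mesh.push_back(v);
        }
    }
    return mesh;
}
```

Index generation for the triangle list is omitted, and in practice you'd also clamp or discard coordinates that fall outside the source texture.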

Saturday, January 18, 2014

Digging Into the Oculus SDK - Part "something": Head Tracking Data

Hardware

The Oculus Rift sensor hardware reports the acceleration and angular velocity as vectors up to 1000 times a second.  It also has a magnetometer and a temperature sensor.  The iFixit teardown of the Rift reports the gyro+accelerometer chip to be an MPU-6000 from InvenSense, which also includes the temperature sensor.  The same teardown reports that the magnetometer is believed to be an A983 2206 chip, which they highlight on the PCB.  All of this hardware is ultimately accessed through the HID interface we discussed in our last post.

Inside the SDK

When tracker data is requested, the SDK opens up the HID device if necessary and configures it with an HID feature report.

Opening the HID device gives you a handle that can be used to read from and write to the device, similar to the kind of handle you'd get from opening a file or a network socket, and typically usable with the same reading and writing APIs.

Feature reports, however, are for reading and writing the capabilities and configuration of the device.  These do not go over the normal read and write API, but are instead performed 'out-of-band' using an OS-specific API.  On Linux one uses a special ioctl command.  On Win32-based systems it's done with functions defined in hid.dll.

The configuration I spoke of consists of issuing a KeepAlive feature report.  This instructs the hardware to start sending messages with the acceleration and gyroscope data.  If this command isn't issued, reading from the HID device handle will never return any data (and if you're doing blocking reads, will hang the thread).  The KeepAlive command includes as a parameter a duration during which the device should continue to provide updates.  The parameter is stored in a 16-bit value and interpreted as milliseconds, which means that the maximum value is about 65 seconds if it's treated as unsigned, or half that if it's treated as signed.  The SDK code uses a hard-coded value of 10 seconds, and is set up to re-send the keep-alive every 3 seconds, which provides plenty of redundancy in ensuring there are no gaps in the data, even if the background processing thread is somehow held in abeyance for several seconds.
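
For illustration, here is roughly what issuing that keep-alive looks like using the cross-platform hidapi library rather than the SDK's internal code.  The product ID, report ID, and report layout below are assumptions for the sake of the example; the SDK's own constants are authoritative.

```
#include <cstdint>
#include <hidapi/hidapi.h>

// Send a keep-alive feature report asking the headset to stream sensor data
// for intervalMs milliseconds.  The report ID and layout are assumed here.
bool sendKeepAlive(hid_device* dev, uint16_t intervalMs) {
    unsigned char report[5];
    report[0] = 0x08;                                    // assumed report ID
    report[1] = 0x00;                                    // command ID (unused)
    report[2] = 0x00;
    report[3] = static_cast<unsigned char>(intervalMs & 0xFF);
    report[4] = static_cast<unsigned char>(intervalMs >> 8);
    return hid_send_feature_report(dev, report, sizeof(report)) >= 0;
}

int main() {
    hid_init();
    // 0x2833 is the Oculus VR vendor ID; the product ID is an assumption.
    hid_device* dev = hid_open(0x2833, 0x0001, nullptr);
    if (dev) {
        sendKeepAlive(dev, 10000); // ask for 10 seconds of data, as the SDK does
        // ... read input reports from dev, re-sending the keep-alive every 3 s ...
        hid_close(dev);
    }
    hid_exit();
    return 0;
}
```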

The actual reading of the HID device is done through an asynchronous mechanism that is different for each platform.  In an earlier installment we touched on the background thread that handles all SDK commands.  The same thread also handles all the reading of data from the HID device handle.  When data is available it's copied to a buffer and the buffer is passed to a handler object.

SDK Internal Handler

For once we have a single place to look in the SDK for the implementation.  While the code to asynchronously read data and copy it into a buffer is platform specific, the code for interpreting the buffer, a simple array of bytes, is thankfully platform neutral.  It is technically located in the SensorDeviceImpl::OnInputReport method, but this is really just a thin wrapper around a free function, DecodeTrackerMessage, located in the same file.

DecodeTrackerMessage converts the incoming data from a byte array into an actual C structure called TrackerSensors, again defined in the same file.  This conversion mostly consists of simply copying the bytes to the appropriate location in the structure.  The only unusual part of this decoding is that the actual gyroscope and accelerometer vectors are not stored as conventionally sized values.  Typically integer values are stored in either 8, 16, 32, or 64 bits, i.e. sizes that are powers of 2.  However, for whatever reason, 16 bit values for the individual components of the vectors were deemed insufficiently precise, while 32 bit values were deemed overkill.  This makes sense when you look at the units.

Acceleration is typically measured in meters per second squared or in Gs.  1 G is about 9.8 meters per second squared.  However, in order to avoid dealing with floating point numbers this close to the hardware, the values are scaled by a factor of 1000 and reported as integers, so what the hardware is actually reporting is millimeters per second squared.  If we were limited to 16 bits, the most acceleration we could represent on any given axis would be about +/- 3 Gs, something that's pretty easy to exceed, particularly on small timescales.  On the other hand, if we use 32 bits we have space for representing about 200,000 Gs.  This is starting to approach the surface gravity of a white dwarf star.  Perhaps that range has some value in the fields of super-villainy and cartoons, but it's certainly more than we need.

The compromise is to use a non-standard bit length.  By encoding each axis in 21 bits, we can fit the entire vector into a single 64 bit integer, and still have 1 bit left over.  21 bits allows us to encode about +/- 100 Gs in any direction, which provides plenty of leeway for rapid movements, and probably allows the chip to be used in a variety of more interesting applications that might involve high, though not ludicrous, levels of acceleration.
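
Decoding those packed vectors is mostly bit twiddling.  Here's a sketch of the idea; the actual field order and endianness in the SDK may differ, and this assumes the three components are packed most-significant-bit first into the 64-bit word.

```
#include <cstdint>

// Sign-extend a 21-bit two's-complement value into a 32-bit integer.
static int32_t signExtend21(uint32_t v) {
    // If the 21st bit (the sign bit) is set, fill the upper 11 bits with ones.
    if (v & 0x00100000) {
        v |= 0xFFE00000;
    }
    return static_cast<int32_t>(v);
}

// Unpack three 21-bit components (x, y, z) from an 8-byte buffer.
void unpackSensor(const uint8_t* buffer, int32_t& x, int32_t& y, int32_t& z) {
    // Gather the 8 bytes into one 64-bit word (big-endian in this sketch).
    uint64_t packed = 0;
    for (int i = 0; i < 8; ++i) {
        packed = (packed << 8) | buffer[i];
    }
    // Three 21-bit fields occupy the top 63 bits; the last bit is unused.
    x = signExtend21(static_cast<uint32_t>((packed >> 43) & 0x1FFFFF));
    y = signExtend21(static_cast<uint32_t>((packed >> 22) & 0x1FFFFF));
    z = signExtend21(static_cast<uint32_t>((packed >>  1) & 0x1FFFFF));
}
```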

The input byte array and C structure actually contain room for up to 3 samples, each containing one reading from the accelerometer and one from the gyroscope.  The structure also contains a field for the number of samples that have occurred since the last report.  If this number is between 1 and 3, then that is the number of valid samples in the structure.  If it is greater than 3, then all three samples are valid and N - 3 samples have been dropped.  The structure only contains one magnetic field vector and one temperature value, both of which are encoded more conventionally, though still as integers.
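
Put together, the decoded structure looks roughly like the following.  This is a paraphrase for illustration; the real TrackerSensors type in the SDK may name and order its fields differently, and the units noted for the gyro are an assumption.

```
#include <cstdint>

// One decoded sample: accelerometer and gyroscope readings as 21-bit integers.
struct TrackerSample {
    int32_t accel[3];   // acceleration, in mm/s^2
    int32_t gyro[3];    // angular velocity, as a scaled integer (units assumed)
};

// Approximate shape of the decoded tracker report described above.
struct TrackerSensors {
    uint8_t       sampleCount;   // samples since the last report (may exceed 3)
    int16_t       temperature;   // encoded as an integer
    TrackerSample samples[3];    // up to 3 valid samples per report
    int16_t       mag[3];        // one magnetometer reading per report
};
```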

Integers to floats

Once the data has been decoded, it needs to be processed by whatever message handler is attached to the sensor device.  But dealing with integers in non-SI units could easily be a source of bugs if you get the conversions wrong, and each tracker message can contain up to 3 samples from the sensors.  So the next step is for the sensor device to pass the tracker message to an internal method, onTrackerMessage().  This code is responsible for taking the TrackerMessage type, containing ints and up to 3 samples, and converting it into up to 3 individual instances of the MessageBodyFrame class.  MessageBodyFrame is a subclass of the basic Message type in the SDK, part of its generic event handling system.  If you connect a callback to the SensorDevice, these are the messages you would expect to get from it, at a rate of about 1000 per second.  Most people don't do this though, relying instead on a SensorFusion instance to handle the messages for them and turn them into a continuously available quaternion representing the current orientation of the Rift.
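
The conversion itself is just a per-axis multiply.  A minimal sketch, assuming the mm/s^2 encoding described earlier for the accelerometer and a placeholder scale factor for the gyro (the SDK's actual gyro constant may differ):

```
#include <cstdint>

// Floating point, SI-unit view of one sample (roughly what MessageBodyFrame holds).
struct BodyFrameSample {
    float accel[3];    // m/s^2
    float rotation[3]; // rad/s
};

BodyFrameSample toSIUnits(const int32_t rawAccel[3], const int32_t rawGyro[3]) {
    const float kAccelScale = 0.001f;  // mm/s^2 -> m/s^2
    const float kGyroScale  = 0.0001f; // assumed scale for the gyro integers
    BodyFrameSample out;
    for (int i = 0; i < 3; ++i) {
        out.accel[i]    = rawAccel[i] * kAccelScale;
        out.rotation[i] = rawGyro[i]  * kGyroScale;
    }
    return out;
}
```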

MessageBodyFrame contains representations of the acceleration, rotation, and magnetic field, all as 3-dimensional vectors composed of floating point values and represented in SI units.  Actually, the docs say the magnetic field is represented in gauss, not teslas, but this isn't really important, because a) the conversion factor is a power of 10, and b) the magnetic field isn't combined with any other units, so it might as well be a unit-less vector.  Indeed, the magnetic calibration utility could potentially apply a scaling factor so that the field is always reported as a unit vector, but it doesn't appear that it does this currently.

Well, that's all we have time for today.  More detailed inner workings of the SDK that you don't care about coming soon to a blog near you.




Friday, January 3, 2014

Tech Demo - Strabismus correction in the Rift

Strabismus is the condition of having divergent eyes, also known colloquially as being cross-eyed or wall-eyed.  To some extent it can be corrected for with glasses, and in more extreme cases, surgery.  Those afflicted who cannot get full correction with glasses or surgery are often at least partially stereo-blind, unable to use the differing parallax of objects to resolve their depth, and are forced to rely on other cues for depth perception.  Fortunately the brain is very good at generating a sense of depth both from parallax caused by moving the head and from cues such as the known sizes of common objects.

I myself have divergent eyes.  My eyes diverge about 5° vertically and about 10° laterally.  I wear prismatic glasses that compensate to some degree, but not completely.  In fact, since I've started wearing the glasses the divergence has increased, as the glasses now do some of the work, allowing the eyes to work less hard to compensate, a situation my optometrist refers to as 'eating the prism'.

Since the release of the Rift I've been excited about the prospect not only of using it for immersive VR applications and games, but also of using it to correct for my divergent eyes.  While the Oculus VR SDK doesn't (yet) natively support measuring and correcting for this condition, doing so is actually pretty simple.  In fact, I've created a small tech demo that does precisely that.
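
The core idea is small enough to sketch.  This is not the demo's actual code, just an illustration using GLM: apply a per-eye angular offset to the view transform before rendering, so each eye's image is rotated toward where that eye actually points.  The angles would come from an interactive calibration step.

```
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Apply a small per-eye angular correction to an eye's view matrix.
// horizontalDegrees and verticalDegrees are per-eye calibration values.
glm::mat4 applyStrabismusCorrection(const glm::mat4& eyeView,
                                    float horizontalDegrees,
                                    float verticalDegrees) {
    glm::mat4 correction(1.0f);
    // Rotate about the Y axis for horizontal (lateral) divergence...
    correction = glm::rotate(correction, glm::radians(horizontalDegrees),
                             glm::vec3(0.0f, 1.0f, 0.0f));
    // ...and about the X axis for vertical divergence.
    correction = glm::rotate(correction, glm::radians(verticalDegrees),
                             glm::vec3(1.0f, 0.0f, 0.0f));
    return correction * eyeView;
}
```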

The tech demo is located here and is built for 64-bit Windows platforms.  I will soon be uploading Mac & Linux versions, as well as uploading the source code to my GitHub repository here.

If you have issues with perceiving depth, or wear glasses for correcting strabismus, I suggest you give it a try.