The day has finally come when I looked at a real-life practical problem in my actual work as a video director, and said to myself, “Wait! I can figure this out using the hairy ball theorem.”
1. What’s the Hairy Ball Theorem?
The hairy ball theorem is an abstract, century-old result in algebraic topology, popularly known for its amusing name and wonderfully intuitive visualization: rather than thinking of continuous tangent vector fields on the 2-sphere, imagine hairs on a ball, and try to comb them all smooth and flat with no whorls or cowlicks. The hairy ball theorem says you can’t.
The theorem works even if the ball isn’t perfect. According to topology, a squashed ball or a ball molded into the shape of a cow is just as good as a perfect sphere. But the ball must theoretically have an infinitely fine hair at every point, and also must be complete, all the way around. So if you were thinking of a mostly-round and hair-covered body part that is connected to other more hairless parts of the body, such as how the human head has a lot of hair but is connected to hairless parts of the face and neck, that’s not what this is about.
Hair on the head often whorls around a point. You could also part it down the middle and comb to either side, or comb it inward along a line like a mohawk, or comb it all together to a single point such as a ponytail. In a limited patch of hair like the head, it is possible to comb away all discontinuities and zeroes by either slicking it all straight back or all to one side as in a combover, or almost straight back but with a bit of a wave, or in graceful arcs that gently transition from a combover on top to straight down in the back. These continuous hairstyles rely on the fact that all zeroes can be combed out of your patch of hair and onto the hairless part where they no longer exist, but when you’re an entire hairy ball you can’t avoid them. Every cow has a cowlick. Every tribble has a tuft. Continuous hairstyles seem to be favored in business situations, so this mathematically explains why so few tribbles become CEOs.
The canonical example of this theorem in action is wind around the globe. The speed and direction of wind are naturally represented by a vector, and the hairy ball theorem tells us that there will always be somewhere on Earth with no horizontal wind speed. Of course the atmosphere is 3d, so there could be vertical wind speed, as in a hurricane where the wind vectors whorl around a horizontally-calm eye, while downdrafts from the upper atmosphere give the eye its distinctive visual clarity.
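To make the wind picture concrete, here is a minimal sketch of about the simplest wind you could invent: a steady breeze blowing due east everywhere. The function names are made up for illustration. The wind is perfectly horizontal at every point, but its speed shrinks as you approach the poles and hits zero exactly there, which is the kind of calm spot the theorem insists on.

```python
import math

def point_on_sphere(lat_deg, lon_deg):
    """Unit vector for a latitude and longitude given in degrees."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def steady_east_wind(p):
    """Hypothetical wind field: the north-south axis crossed with p."""
    x, y, z = p
    return (-y, x, 0.0)

def speed(v):
    return math.sqrt(dot(v, v))

for lat in (0, 45, 89, 90):
    p = point_on_sphere(lat, 30)
    wind = steady_east_wind(p)
    # dot(wind, p) stays 0, so the wind never points up or down.
    print(lat, round(speed(wind), 4), round(dot(wind, p), 4))
```

The speed works out to the cosine of the latitude, so this particular wind has two calm points, one at each pole. The theorem only promises at least one.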
Stereo spherical video is full of vectors on spheres, which means plenty of opportunity for the hairy ball theorem to ruin everything. Relevant vectors on spheres include:
-The final pair of spherical videos, where each point has a panoramic twist vector for how 3d space is projected onto it.
-The original sphere of cameras, each of which points in some direction.
-The position and orientation of your eye on the sphere as you view the video, which determines which portion of the spherical video is displayed.
-What point on the sphere of video you are actually looking at, and what direction you are looking at it from.
It’s a good thing the theorem has such a lovable name, otherwise we’d have no choice but to hate it, because the hairy ball theorem sure leads to a lot of inconvenient truths, some of which follow.
2. Hairy Ball Theorem and Panoramic Twist
In my last post we discussed how stereo video is created by panoramic twist, and how the angles of the cameras on a circle or ball can be thought of as vectors that map views of 3d space onto a circle or sphere. On a 360-degree circular panorama it is easy to get a nice even panoramic twist all the way around so that the stereo effect is convincing no matter where you turn your head. When you’ve got an entire ball of cameras for full spherical video, things are less clear. How do you do panoramic twist on a sphere?
Whenever you have a vector field on a sphere, you can bet the hairy ball theorem is lurking somewhere close by.
The hairy ball theorem applies to tangent vectors, while panoramic twist vectors should all be directed out from the ball, not combed completely flat. You can get a field of tangent vectors from the panoramic twist by taking the part of each twist vector that lies flat against the sphere (the shadows of the sticking-out hairs, or where they’d be if you used heavy hair gel), and then apply the hairy ball theorem to that. This will tell you that, no matter what, there must exist either a discontinuity or a point where the camera view faces directly outward, such as the center of the whorl of views you’d get at the poles of the video if you simply twisted the sphere around its axis.
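The hair gel step is just a vector projection. Here is a minimal sketch, assuming the camera ball is a unit sphere and each view direction is a unit vector. The name flat_part is made up for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def flat_part(p, view):
    """Part of `view` that lies flat against the unit sphere at unit point p."""
    outward = dot(view, p)  # how much of the view points straight out
    return tuple(v - outward * c for v, c in zip(view, p))

# A camera on the equator, twisted 20 degrees sideways from straight out.
twist = math.radians(20)
equator_cam = (1.0, 0.0, 0.0)
twisted_view = (math.cos(twist), math.sin(twist), 0.0)
print(flat_part(equator_cam, twisted_view))

# A camera at the top pointing straight out: nothing left to comb.
top_cam = (0.0, 0.0, 1.0)
print(flat_part(top_cam, top_cam))
```

A view that faces directly outward has a flat part of zero, which is why "zero of the tangent field" and "camera view faces directly outward" are the same thing here.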
You might think that in the real world these theoretical problems with continuous vector fields don’t exist. In practice, any camera ball will have a discrete number of cameras yielding discrete pieces of footage. You can give an angle to every physical camera or crop every field of view, but even then, the hairy ball theorem cannot be escaped! If the footage is stitched together into a spherical video, then somewhere in the final video exists a place where the point of view is from a ray that sticks straight out, or where the stitching itself does not work because the discontinuity in the fields of view is too great and thus you end up with an unavoidable stitching error in your final video. Because you need an area of good overlap for stereo vision, the effect of a tiny discontinuity actually spreads out.
3. Hairy Ball Theorem and View Orientation
Thinking about the problem from the other direction, there’s another natural way to get a tangent vector on a sphere. For the point representing where one eye is, there’s a vector that points to your other eye. This tells you the orientation of your head.
View orientation is a classic problem in computer graphics, where the hairy ball theorem prevents game developers from being able to make a nice continuous function that will always output an orientation of your field of view given a direction you are looking in. You might be familiar with this phenomenon in first person shooters, where when you look straight up or down there’s a point where the view suddenly flips around, and depending on which side of the discontinuity your mouse hits, a matter of one pixel, you might suddenly whirl around to the right, or left, or not whirl at all.
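Here is a minimal sketch of that flip, assuming the common trick of keeping the camera level by crossing the look direction with a fixed world up vector. WORLD_UP and camera_right are made-up names for illustration, not from any particular game engine.

```python
import math

WORLD_UP = (0.0, 0.0, 1.0)  # assumption: z is up

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def camera_right(forward):
    """Right-hand axis of a camera that is kept level with the horizon.

    Looking exactly straight up or down makes the cross product zero,
    so this raises ZeroDivisionError there: no orientation exists.
    """
    return normalize(cross(forward, WORLD_UP))

# Two look directions a hair's breadth apart, on either side of straight up.
eps = 1e-6
just_one_side = normalize((eps, 0.0, 1.0))
just_other_side = normalize((-eps, 0.0, 1.0))
print(camera_right(just_one_side))
print(camera_right(just_other_side))
```

The two look directions are nearly identical, but the two right-hand axes come out pointing in opposite directions, which is the sudden whirl you see on screen.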
VR video completely solves this problem for games! In VR, you don’t just have a single point telling the game where you’re looking. You don’t need an algorithm to choose an orientation for the field of view, you already have it! If you decide to do a backbend and see what’s behind you upside-down, your VR headset will (or soon will be able to) detect that, unlike flat games which will flip the field of view right-side-up. Your own body is an engine for creating realistic view orientations. You store your own view-state, in physical positions like backbends and the non-commutative rotations of your head.
The body as a state machine is a beautiful concept that’s worth a bit of a tangent. If you look straight down and want to continue looking back and behind you, you might want to turn your head around yourself to the right and come back up facing up, or turn to the left, or do a handstand and come out with a view that’s upside-down. If you start going to the right but then tilt your head to get a field of view that’s a bit more to the left, the same view you’d have gotten if you’d turned to the left in the first place, your body won’t suddenly flip around as if you’d turned to the left. Your body’s position stores the state of having turned to the right, so the computer doesn’t need to know anything about it.
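The non-commutative part is easy to check numerically. This is a minimal sketch with made-up names: turn rotates about the vertical axis and nod rotates about the ear-to-ear axis, both fixed in the room. The same two quarter-turns done in a different order leave your nose pointing in different directions, so knowing only where you ended up looking does not tell you how you got there.

```python
import math

def turn(v, degrees):
    """Rotate v about the vertical (z) axis."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    x, y, z = v
    return (x * c - y * s, x * s + y * c, z)

def nod(v, degrees):
    """Rotate v about the ear-to-ear (x) axis."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    x, y, z = v
    return (x, y * c - z * s, y * s + z * c)

def tidy(v):
    """Round away floating-point dust for printing."""
    return tuple(round(c, 6) + 0.0 for c in v)

nose = (0.0, 1.0, 0.0)  # assumption: you start out facing along +y

turn_then_nod = nod(turn(nose, 90), 90)
nod_then_turn = turn(nod(nose, 90), 90)
print(tidy(turn_then_nod))
print(tidy(nod_then_turn))
```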
Games also have the luxury of being able to compute specific views in real time. You can tilt your head however you want, and change the angle from which each eye sees the thing you’re looking at. In a static video, you can’t do this. However a section of the video is shown to one eye, it’s going to stay that way. There’s exactly one panoramic twist vector for that point in the video, and whatever angle the camera saw it from, that’s the angle that’s shown. Forever. So for stereo to work in static video, you have to have exactly one expected head orientation.
The hairy ball theorem in this case works just like it does in flat 3d games: there’s no continuous way to assign an expected head orientation to every viewing direction. So even if you imagine someday there is an expectation for standard viewing orientations, and that savvy viewers will naturally put their heads only in those orientations while viewing videos, if everything is stereo there will be some point where, just one pixel over from the viewer-expected, video-approved stereo, there is something that appears incredibly misaligned.
Unlike the 3d games case however, zeroes are ok. We need an orientation if we want stereo, but we don’t necessarily need stereo. It’s better to have the very top of a video smooth out to flatness than to have a jarring discontinuity and double-vision. Stereo is a useful effect, but certainly not necessary everywhere all the time. Which is good, because it’s mathematically impossible.
4. So what do we do about it?
On the one hand, the nature of spheres ruins all our hopes and dreams, but on the other hand, the hairy ball theorem lets us quickly figure out that certain things are impossible, so now we can focus our efforts on working within these limitations. For example:
-Have panoramic twist with proper stereo around in a circle, with vectors fading to perpendicularity at the top and bottom so that up and down are not stereo, but also not misaligned. Definitely wanna try this soon.
-Have panoramic twist optimized for a forward-facing bias, that has good stereo to the right and left, straight up and straight down, but not behind you. There’s a great vector field on the sphere that has only one pole, and it should work really well for this case. If the vectors go perpendicular at the pole but the background behind you is far away, you might not even notice the lack of stereo.
-Future stereo spherical video editing software will want to be able to have a variety of available panoramic twists, and to let you input your own, and be able to apply the twist to your pile of camera footage dynamically, computing vector fields that smoothly change from one to another. This way you can change the twist depending on what’s important in the scene or where you expect the viewer to be looking, and always have great stereo at that point, or even use changes in panoramic twist as a method of moving the viewer’s attention.
-Perhaps smoothly fading or deforming from one static video to another is not too jarring or difficult to do in real time, and if the video player detects that the viewer is looking at misaligned stereo (it would need to know the vector field of expected orientations and compare) the player can fade one eye’s video to be the same as the other eye’s, so it is at least flat instead of misaligned. Perhaps some small number of static videos, stitched and rendered ahead of time, can have a combination of panoramic twists that give really good stereo in all natural head positions.
-We need to do more research on just how much we can misalign or warp images and still get a stereo effect, to find optimal twists. The twists for each eye don’t have to be exact mirror images, and don’t necessarily have to have twist in opposite directions for the two rays to converge, but we’ll have to test to what extent this actually works. We just put our first preliminary 2-camera width and angle test on our downloads page, “This is what science sounds like” (note that the angle of cameras is for the center ray only. Even when the cameras have a divergent angle, much of their field of view may be convergent).
-It is likely that people differ in their tolerances and preferences, so maybe videos will have to come in different amounts of stereo, or maybe viewers will want to input their own biometrics, then render a video that works just for them.
-Stereo isn’t everything. There’s other ways to trick the brain into thinking there’s depth, and they can probably be combined with panoramic twist in effective ways. Stereo can be used where it’s most effective, and is not needed all the time everywhere.
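For the forward-facing idea in the list above, one well-known tangent field with a single zero comes from stereographic projection: take a constant field on the flat plane and wrap it onto the sphere, so everything pinches off at one point. This is a minimal sketch assuming that is the kind of field meant, with the zero placed directly behind the viewer. The function name and the choice of axes are made up for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def single_pole_twist(p, pole=(0.0, -1.0, 0.0), e=(1.0, 0.0, 0.0)):
    """Tangent vector at unit point p; the field's only zero is at `pole`.

    Assumptions: `pole` is the direction behind the viewer, and `e` is
    any unit vector perpendicular to `pole`.
    """
    fade = 1.0 - dot(p, pole)   # 2 straight ahead, 0 at the pole
    slide = dot(p, e)
    return tuple(fade * ei - slide * (pi - ni)
                 for pi, ni, ei in zip(p, pole, e))

def strength(v):
    return math.sqrt(dot(v, v))

ahead, behind = (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)
left, up = (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
for name, p in (("ahead", ahead), ("left", left), ("up", up), ("behind", behind)):
    print(name, round(strength(single_pole_twist(p)), 6))
```

The length of the vector at p is one minus the dot product of p with the pole, so the twist is strongest straight ahead, still healthy to the sides and overhead, and fades smoothly to nothing behind you.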
It’s possible that the most reasonable response to this hairy ball problem is to stop bothering with static stereo spherical video altogether. We could simply give up, in which case there’s two reasonable alternatives. Option 1: assume it won’t be all that long until average people’s computers and VR video software will have the power to create 3d point clouds and render specific views in real time, and focus on producing videos using full wide-angle camera balls, which would be compatible with these future theoretical players. Or…
5. Option 2: I never liked spheres anyway
The hairy ball problem ruins spheres, but there’s other shapes in the world. A torus, for example, is easy to comb completely smooth. And in many ways, a torus is a more natural shape for video to be on: take a normal flat rectangle, wrap the top to the bottom, and the right side to the left side! No weird stretching of pixels, no stereo problems. We can’t help but think of tons of ideas for what we’d love to do with this. We’d have to write a player that could display it, but it’s tempting enough that we might have to do it.
See, with toroidal video, the 360 of video you see as you spin horizontally is different from the 360 of video you see when you turn vertically. So many super cool things could be done with it. I want it so bad. Should probably make an entire post about this.
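The wrapping rule for a torus really is that simple. Here is a minimal sketch of the pixel lookup a toroidal player might use, assuming integer pixel coordinates. The function name is made up for illustration.

```python
def torus_pixel(x, y, width, height):
    """Wrap any integer pixel coordinate back onto the flat rectangle."""
    # Python's % always returns a non-negative result for a positive
    # divisor, so walking off the left or top edge wraps correctly too.
    return (x % width, y % height)

print(torus_pixel(1925, 10, 1920, 1080))
print(torus_pixel(-1, -1, 1920, 1080))
```

Every pixel keeps the same "which way is right" everywhere, and that constant direction is the completely smooth comb-over a sphere can never have.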
Or we take a panorama and apply twist in a different direction: Möbius videos! When you spin all the way around, you see the same exact footage but upside-down, with no seam, and stereo still works great. A vertical Möbius strip would be a bit trickier; we’d have to make sure one eye’s footage wraps to the other eye’s, so the stereo still aligns. For a stereo Klein bottle, you’d want it to wrap horizontally to itself with a flip, and vertically to the other eye’s video with a flip. So basically each eye has four copies of the video in a 2×2 torus that appears to be wrapped onto itself into a Klein bottle.
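Here is a minimal sketch of the horizontal case only, assuming integer pixel coordinates and a made-up function name. Each time you cross the seam the footage comes back flipped top to bottom, and crossing twice puts it right-side-up again. The vertical and Klein bottle versions would also need to swap eyes, and that is not attempted here.

```python
def mobius_pixel(x, y, width, height):
    """Look up footage for a panorama that wraps horizontally with a flip.

    x may be any integer (keep turning as long as you like);
    y must already be inside 0..height-1.
    """
    wraps, x_in = divmod(x, width)
    if wraps % 2:              # odd number of trips around: upside-down
        y = height - 1 - y
    return (x_in, y)

print(mobius_pixel(0, 0, 4, 3))
print(mobius_pixel(4, 0, 4, 3))
print(mobius_pixel(8, 0, 4, 3))
```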
Or you could wrap without a twist, a 720 panorama. It seems like a 360 panorama, but when you turn once you come back to a different view, or two turns get back to the same view. In theory you could have any amount of wrapping, and some interesting storytelling opportunities, such as a story that unfolds at your own pace as you turn around to simulate moving around a space.
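A minimal sketch of how a player might pick footage for this, with made-up names. The big assumption is that the player keeps a running total of how far you have turned, rather than a compass heading that resets every 360 degrees. Your body stores that state for free, but the software has to count.

```python
def panorama_column(total_yaw_deg, footage_width, turns=2):
    """Column of footage at the centre of view after turning total_yaw_deg.

    turns=2 gives the 720 panorama: the footage only repeats after
    two full spins. Any positive whole number of turns works.
    """
    span = 360.0 * turns
    return int((total_yaw_deg % span) / span * footage_width)

for yaw in (0, 360, 720, -90):
    print(yaw, panorama_column(yaw, 7200))
```

After one full turn you are halfway through the footage, and only after the second turn are you back at column zero.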
There’s any number of cool spaces you could theoretically view video from within, and I’d like to spend some hardcore time thinking about which ones would work. Remember that tangent about how the body stores information about its position in space? It’s one of those things that seems too obvious to even notice, but when you start thinking about how the information stored in the body’s state might interact with virtual spaces that are different from ours, you start to realize just how crazy even the simplest things are.
For now, we’re still experimenting with stereo and spheres, because the weirdness of how our brain actually perceives stereo video is hard to predict with theory. As always, you can get our latest experiments on our downloads page, and see for yourself.