Don’t Look Down

“Down” seems to be a recurring problem in VR film. We’ve talked extensively about the theoretical problems, from stereo to stitching distance, so today’s post looks at “down” from a content perspective, surveying what people have done so far.

Here are 11 ways to deal with down!

1. Embrace the tripod

In most of our live-captured videos, you can look down and see our tripod, in all its too-close stitch-artifact glory.

This is not exactly the most professional of options, but we made an early decision to focus on content and research rather than production quality. Especially in our talk shows, which are about VR in VR, we simply leave in all our production equipment. Laptops, mics, everything, all visible when you look down.

But there might be an unexpected benefit to leaving in the tripod. Some viewers, especially those new to VR, find it disturbing to look down and not “exist”. We’ve had many people look down in our videos and comment “I am a tripod”, which perhaps is better than “I am nothing at all”.

Then there are our new Ricoh Theta cameras, which are tiny and stitch almost around themselves in the down direction, leaving a stitch line and a tiny sliver of color. They’re small and light enough that they can easily be mounted from any angle, or hand-held, with the internal gyroscope correcting the final footage to always keep down down. They’re great for very informal low-production vlogging, so often the “tripod” is an arm and hand.

It wouldn’t be hard to mask out this tiny sliver of color. We also got a tip from Jim Watters that you can stick reflective tape on that part of the camera so that it reflects the colors of the environment instead of showing a bright yellow sliver. But for informal purposes, I kind of love the bright yellow sliver, and for vlogging there’s something appealing and authentic to me about it being hand-held.

For non-fiction content, it’s easy to embrace the reality of production. But this attitude does not work for all things!

2. Mask out tripod

In our stop-motion animation library.gif, the “down” direction is just a grey rug with nothing going on, so it was easy to do a quick hacky job masking over the tripod using nearby rug footage. This gif is meant to be seen while sitting down on a similar rug (we did a couple installations of it including a rug and library objects), and we didn’t want the tripod to get in the way of the viewer feeling the real rug, or remind the viewer of the reality behind the whimsically moving objects.
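As a rough illustration of this kind of patching, here is a minimal Python sketch. It is not how we actually masked library.gif; it assumes the footage is stored as an equirectangular frame (top row is straight up, bottom row is straight down), and the function names `nadir_cap_start_row` and `patch_nadir`, as well as the idea of a pre-made `patch` frame built from nearby rug footage, are hypothetical.

```python
import math


def nadir_cap_start_row(image_height, cap_radius_deg):
    """First pixel row inside a cap centred on straight down.

    Assumes an equirectangular frame: the centre of row y sits at a polar
    angle of (y + 0.5) * 180 / image_height degrees from straight up.
    """
    if not 0.0 <= cap_radius_deg <= 180.0:
        raise ValueError("cap_radius_deg must be between 0 and 180")
    # A row is inside the cap when its polar angle is >= 180 - cap_radius_deg.
    first = math.ceil(image_height * (1.0 - cap_radius_deg / 180.0) - 0.5)
    return min(max(first, 0), image_height)


def patch_nadir(frame, patch, cap_radius_deg):
    """Copy of `frame` with every row inside the nadir cap taken from `patch`.

    `frame` and `patch` are lists of pixel rows with the same dimensions.
    `patch` is a hypothetical frame prepared from nearby floor footage.
    """
    height = len(frame)
    start = nadir_cap_start_row(height, cap_radius_deg)
    return [list(patch[y]) if y >= start else list(frame[y])
            for y in range(height)]
```

A hard row cut like this only hides the tripod if the floor is as featureless as our rug; anything with texture would want a feathered edge rather than a straight swap.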

Ideally, library.gif would have had fun stop motion stuff going on straight down as well as everywhere else, but because the camera is set so low to the ground, it’s just too close to get good stitching with a multi-camera setup (this was done with 14 GoPros).

There is also the question of mono vs stereo. All around you, the space is animated in stereo. But down would have to be mono (or else have terrible disparity problems), which might be jarringly different if there were actual content there. Some people barely notice stereo vs mono, or don’t have stereo vision at all. But the occasional person is very sensitive, and will look down and report that it looks like the ground drops infinitely away from them as it goes to mono. The rug is fuzzy and smooth enough that you can’t really focus on it or get distracted by it, avoiding all the problems.

Google Street View also masks out their tripod using footage taken of the ground under the car in nearby frames. Of course, in their case the tripod is an entire car and the process is automatic, leaving plenty of artifacts, but it works well enough that most people don’t even think about the fact that there should be a car when they look down. The car’s shadow is often there though, following you in a ghostly manner. Google Street View’s camel cam is especially strange, with a trail of footprints leading to nowhere.

Screenshot of Google Street View,

3. Offset tripod

In “Don’t Look Down!”, Emily uses a “selfie stick” (in this case also an everywhere-elsie stick) to hold the camera off the edge of a cliff. The “bottom” of our spherical camera is, then, off to the side, not blocking the down direction.

There’s a lot of VR out there that plays with people’s fear of height, putting you on the edge of a cliff or dropping the floor out from under you. To get that effect, you need very embodied VR, where the person feels they are in a persistent space and the only changes in view are the ones made by their own motion. In the above video, the camera moves around, giving a very third-person camera feeling from the beginning. Being in black and white adds to that.

JauntVR’s cliff footage for North Face also offsets the bottom of their camera, which in their case is much more imperative since their camera does not film a full sphere. This lets them save on the number of cameras and avoid stereo/mono stitching problems, but you can see why their next camera is reportedly going to capture a full sphere.

screenshot of The North Face: Climb, from

But let’s take a detour and ask: if you can’t film a full sphere, what do you do with your empty bottom space?

4. Brand Your Bottom

In the above screenshot, you can see that JauntVR has branded their empty space with, in this case, the North Face logo (as well as their own). All of their video content I’ve seen has this sort of branding, which I find distracting, but makes sense for JauntVR given that most of their work is with and for brands.

Our Giroptic 360cam also does not film quite a full sphere, and the footage comes out with a branded bottom that also marks it as a development kit.

Because the Giroptic camera is a consumer camera and we can do what we like with our own footage, we can just remove their branding and put something else there (or nothing at all). So what do we do with that empty space?

5. Do art to it

In 9:72, filmed with the Giroptic 360 developer camera, we reflected the rest of the sphere into the empty space at the bottom. If you look down, you get a bubble view of everything.

Relatedly, though it’s the opposite problem, when we filmed episode 3 of our talk chat show thing, the upwards-facing camera did not function, leaving us with a hole in the top of our video. Emily fixed this by cutting out the ceiling entirely and using footage we’d taken in a bamboo forest.

6. Put Equipment There

If you’re going to leave a hole at the bottom of the video anyway, you might consider just how much equipment you could hide down there, including the camera itself.

The larger the distance between cameras, the harder it is to stitch the footage. There’s a limit to how small you could theoretically make the radius if you want stereo, but for mono ideally all the lenses would actually be in the same exact spot.
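To put rough numbers on why distance between cameras matters, here is a small Python sketch of the parallax angle: how far apart, in degrees, two cameras see the same object. It assumes the simplest possible geometry, with the object sitting straight out from the midpoint between the two lenses. The function name `parallax_deg` and the 10 cm spacing in the example are made up for illustration and are not measurements of any of our rigs.

```python
import math


def parallax_deg(baseline_m, distance_m):
    """Angle in degrees between two cameras' views of the same object.

    Assumes the object lies on the perpendicular bisector of the line
    joining the two lenses (a simplifying assumption for this sketch).
    """
    if baseline_m < 0 or distance_m <= 0:
        raise ValueError("baseline must be >= 0 and distance must be > 0")
    return math.degrees(2.0 * math.atan(baseline_m / (2.0 * distance_m)))


# Hypothetical rig with lenses 10 cm apart:
tripod_leg = parallax_deg(0.10, 0.30)   # object 30 cm away, about 19 degrees
far_wall = parallax_deg(0.10, 3.00)     # object 3 m away, about 2 degrees
```

The closer the object, the more the two views disagree, which is why things right under the camera stitch so badly. With a baseline of zero the angle is zero at every distance, which is the "all lenses in the same exact spot" ideal for mono.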

Omnicam360, image adapted from

The Omnicam360, developed by research organization Fraunhofer, uses mirrors so that each camera, while virtually in the exact center of the panorama, is actually arranged vertically below the mirrors.

It’s fun to imagine using a similar technique to film simultaneously with many RED cameras, which are otherwise too bulky to fit close together.

7. Put Player Controls There

Some players assume that “down” is a dead zone, and use looking down to control things like exiting the video. This makes sense as a short-term hack to get around the limited controls of things like the Google Cardboard. For some content this doesn’t work, as it restricts you from looking down in videos where there is actually something to look at, but for certain types of content it might be nice to reserve space for things outside the content, such as player controls, links, metadata, or whatever else.
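A look-down control like this can be sketched in a few lines of Python. This is not the code of any particular player; the class name `DownGazeTrigger`, the -60 degree threshold, and the 1.5 second dwell time are all assumptions chosen for illustration. Pitch is taken to be 0 at the horizon and -90 straight down.

```python
class DownGazeTrigger:
    """Fires once the viewer has looked steeply down for long enough.

    Hypothetical sketch: pitch is in degrees, 0 at the horizon and -90
    straight down. The threshold and dwell time are illustrative defaults.
    """

    def __init__(self, pitch_threshold_deg=-60.0, dwell_seconds=1.5):
        self.pitch_threshold_deg = pitch_threshold_deg
        self.dwell_seconds = dwell_seconds
        self._held = 0.0  # seconds spent continuously below the threshold

    def update(self, pitch_deg, dt_seconds):
        """Feed one head-tracking sample; returns True while triggered."""
        if pitch_deg <= self.pitch_threshold_deg:
            self._held += dt_seconds
        else:
            # Looking back up resets the timer, so a glance down does nothing.
            self._held = 0.0
        return self._held >= self.dwell_seconds
```

The dwell time is what keeps a quick glance at the floor from exiting the video, but it does not help content where there is something down there worth staring at.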

8. Hang your camera

Pretty much all of what we’re talking about is live captured 360 content. There may be post-production, but the sphere of images is captured simultaneously, which necessarily includes any visible equipment. If there’s no hole in your footage, that’s going to include whatever’s holding up your camera.

Chris Milk and Beck’s “Hello Again” minimizes this by hanging the camera with what looks like wire or fishing line.

“Hello Again”, screenshot from

Doing this requires multiple lines and anchors in different places so that the camera doesn’t rotate, and it’s certainly not the most convenient or flexible option for all cases, but there are many situations where I think this is ideal.

9. Live-Rendered Hybrids

Another way to go is non-simultaneous capture: shoot each direction one at a time and composite later. This solves a lot of problems and seems ideal when filming anything with scripted action, or where the important action happens in one direction only. You can hide all your equipment, lighting, tripod, etc, behind the camera’s field of view just as in traditional filming. For the purpose of this post, it wouldn’t make for a very interesting research question, but it does get interesting when you go further than simple non-simultaneous capture and actually live render objects in the view.

One great example of non-simultaneous captured content plus live-rendered fixing of “down” is Felix and Paul’s Strangers with Patrick Watson.

Felix and Paul: “Strangers”, image from

The video itself is beautifully-stitched non-simultaneously-captured video using two REDs, and definitely involved quite a lot of post production. But the most novel part to me was that, in the Gear VR version, when you look down you find that you are on a rendered futon, composited on top live in the player, that responds to head motion to create a parallax effect.

I think it’s a brilliant way to solve both the problem of covering the tripod and the panoramic twist problems of getting down to look good without stereo disparity. It’s also very subtle; probably most people don’t notice it.

In Emily’s, video and live-rendered objects are combined in a more obvious and content-driven way. There’s a dome of video above, while an entire half-sphere of “down” is replaced with a live-rendered ground. There are also 3D-modeled creatures wandering around, and you can use the arrow keys or a gamepad to wander around the space.

This is an art piece that is not trying to simulate the “down” direction as it appears in the filmed space, but it hints at how one might go about solving the problem for regular spherical video. Flat floors are very easy to render live with proper stereo, as are many other static objects, and it’s interesting to think of the possibilities of combining video with rendered objects, even things as simple as a flat wooden floor.
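To show why a flat floor is such an easy case, here is a minimal Python sketch of the one calculation a player would need: where a viewing ray from an eye hits a horizontal floor. It is an illustration under assumed conventions (y is up, the floor is the plane at `floor_y`), not the code behind any of the pieces above, and the name `floor_hit` is hypothetical.

```python
def floor_hit(eye, direction, floor_y=0.0):
    """Point (x, z) where a viewing ray meets a horizontal floor, or None.

    `eye` is the eye position (x, y, z) with y pointing up, and `direction`
    is the viewing ray (dx, dy, dz). Returns None when the ray does not
    point downward or the eye is not above the floor.
    """
    ex, ey, ez = eye
    dx, dy, dz = direction
    if dy >= 0.0 or ey <= floor_y:
        return None
    t = (floor_y - ey) / dy  # how far along the ray the floor is reached
    return (ex + t * dx, ez + t * dz)


# Stereo comes for free: run the same lookup once per eye position.
# The 6.4 cm eye spacing here is an assumed, typical value.
left = floor_hit((-0.032, 1.6, 0.0), (0.0, -1.0, -1.0))
right = floor_hit((0.032, 1.6, 0.0), (0.0, -1.0, -1.0))
```

Because each eye gets its own ray from its own position, the floor texture lands at a slightly different spot per eye, which is exactly the stereo disparity that is so hard to capture with cameras in the down direction.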

10. Fake Body

3D games avoid all the problems of capture and stitching, so “down” can be anything. “Presence” is the buzzword, and many games strive to make you feel that you are not a disembodied camera or 3rd-person viewer, but a person embodied in the scene. I’ve seen many, many VR games where, when you look down, you see the static neck-down body of a generic man, sitting perfectly still.

Right now the only fake body I’ve seen in video is in Kite & Lightning’s Insurgent VR experience. Kite & Lightning are doing really interesting things with combining video and 3D rendered environments, both in live-rendered experiences and rendered out as video. Insurgent VR was created in Unreal Engine 4 using live captured video as well as 3D models, then rendered out as a spherical video from the location where the head of this headless body would be.

Screenshot from Kite & Lightning’s Insurgent VR on Android

We’ve idly considered how fun it would be to use a headless mannikin as a tripod, but haven’t taken any steps in that direction. Some people like having a fake static body. Personally, I feel less embodied when I look down and see a static body different from my own than when I see no body at all. But I’ve also seen very convincing demos where my motions are tracked and move a VR body, making it “my” body whether it looks like it or not, which will definitely be the thing for embodied games! Perhaps the VR video players of the future will allow you to import an avatar, and live-render your body below you, on top of the video.

11. Real Body

So far, we’ve seen several different people’s home-made VR camera helmets and head-mounts, so that their own body is seen in the down direction. Now there are several companies starting to produce these sorts of cameras, though as far as I know none are available yet.

This sort of head-mounted camera seems ideal for personal embodied experiences, and we’ve seen it used in vlogging, action shots, and erotica, as with Natacha Merritt’s VR work.

from upcoming work by Natacha Merritt, (link may contain nudity)

Real-body first person VR has a lot of potential for empathy, because you are seeing the body of someone you know is a person, sharing their real experiences. The most well known thing in this space right now is probably The Machine to Be Another. This project by BeAnotherLab, along with MIT, is doing some interesting research as well as art, with switching the video feed between one body and another.

screenshot of The Machine to Be Another:

The Machine to Be Another is done live, with one “performer” who copies the movements of the “user” to allow the user to feel they are experiencing being in another body. So it’s not exactly in the category of video questions this post is about, but is perhaps one of the more interesting answers to what should happen when you look down.


Genres emerge and separate through a feedback loop: content creators set expectations, audiences come to hold those expectations, and creators in turn try to fulfill them in order to communicate effectively, which reinforces the expectations further. For VR video, one of those sets of potential expectation/content interactions will involve the treatment of “down”.

It will be interesting to see how viewer expectations change. Right now, many people don’t bother to look down at all. Some people, when first introduced to VR, face forward the entire time and need to be taught to look around, and whether they learn to keep looking around depends on whether there are things to look at. Others find their embodiment, or lack thereof, to be an important sticking point.

People will continue to try all sorts of things and I don’t think any one technique or set of expectations will win out, but I expect different genres of VR film will emerge with different standards. For some artists, having complete control over “down” will be essential. For some types of content, a player that lets you import your own seating and avatar seems like the right thing. Some types of content will gravitate towards embodied viewing, others will gravitate towards 3rd-person disembodied viewing.

I’m looking forward to seeing the many more creative techniques people will come up with, as an increasing number of people gain access to the tools required to make VR video.


P.S. all our own videos referenced are available on our downloads page and YouTube.