Today we discuss setups for mono spherical video, the case where you simply want to collect all the incoming light from the world as seen by a single point in the center of the sphere. Orthogonal projection, simultaneously in all directions.
There’s a bunch of cameras designed to do full or almost-full spherical capture, from just a few wide-angle lenses to dozens of lenses facing all directions. (Note that I am saying “spherical,” not “360,” because “360” is often used to refer to panoramic cameras that capture all around you in a circle but are missing the bottom and/or top).
The more lenses, the fancier your camera looks. But how many lenses do you really need, depending on what you’re trying to do? What’s the best setup for different potential situations?
For mono spherical, you don’t need cameras placed in a way that captures parallax, and you don’t need to capture 3d information, so you can get away with having a small number of cameras. Fewer cameras means less expense, fewer stitching lines, smaller file size, easier file management, less computational power, and a host of other advantages that make everything so much easier.
Mono cameras with just a few lenses are where inexpensive consumer-level cameras with an easy end-to-end pipeline are going to be for a while. I like where the bublcam is going, and look forward to when the capabilities of small end-to-end setups start to rival what we currently need more expensive and complicated hardware and software for.
There are two advantages to having more lenses for mono video. First, you can get slightly more accurate views with less distortion around the edges of each camera’s view, which matters if you want something with really high production quality. The closer the cameras are, the more similar the overlap will look, so with good stitching algorithms, more cameras means more seamless stitching. But this is only an advantage if you have precise enough hardware that tilt and distortion errors don’t negate the benefits of precise camera placement, and if you have good enough stitching that the seams really are invisible. Otherwise, instead of the occasional stitching error contained to a few locations, you’ll have tiny stitching errors all over everything.
Especially for applications where you know what you want to focus on, such as a person’s face, it’s better to be able to easily tell where the large stitch-safe zones are so you can put important things there. This is true for both stereo and mono filming, and knowing your actor’s face won’t have a huge stitching error down the middle is an absolute necessity if you don’t have the ability to really tweak the stitching calibration or paint out stitching errors in post-production.
The other advantage of more cameras is resolution. A 4k camera becomes low-res when you slap on a super wide-angle lens and spread the footage across your entire vision. If you’re working with gopros for example, and are willing to put up with the annoyance of working with a ton of cameras in exchange for higher resolution, it’s better to use more gopros with a narrower field of view just to get the resolution up.
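To put rough numbers on that tradeoff, here is a quick sketch in Python. It assumes the lens spreads pixels evenly across its field of view, which real wide-angle lenses don’t quite do, and the 3840-pixel width and the two field-of-view values are made-up examples rather than specs for any particular camera.

```python
def pixels_per_degree(width_px, hfov_deg):
    """Average horizontal pixels per degree, assuming pixels are spread evenly across the FoV."""
    return width_px / hfov_deg


def equivalent_panorama_width(width_px, hfov_deg):
    """Width in pixels of a full 360-degree panorama at the same angular resolution."""
    return pixels_per_degree(width_px, hfov_deg) * 360


# Hypothetical example: the same 3840-pixel-wide sensor behind a wider and a narrower lens.
for hfov in (170, 120):
    ppd = pixels_per_degree(3840, hfov)
    pano = equivalent_panorama_width(3840, hfov)
    print(f"{hfov} degree lens: {ppd:.1f} px/degree, like a {pano:.0f} px wide panorama")
```

The narrower lens packs more pixels into each degree, which is the whole argument for using more cameras with a narrower field of view.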
However, depending on your application, you might actually need to have lower resolution and easier stitching. VisiSonics makes a 5-lens camera that outputs synchronized video and audio in real time, made for streaming live events. Live streaming means no time for complicated stitching algorithms or tweaking things by hand, and internet speeds can’t handle extremely high-res spherical video anyway, so five cameras is probably a good number.
It’s hard to arrange five cameras in a mathematically beautiful way, but I must say, VisiSonics more than made up for the ugly camera placement with the most beautiful mic arrangement I’ve ever seen. I am quite surprised to see that the many tiny mics, which together capture a sound field, are arranged like the vertices of a circumscribed propello-icosahedron! I didn’t even know anyone knew that shape! Whaaat!
But an ugly camera arrangement is ok when dealing in mono video, because in addition to mono’s easier stitching, lack of head-tilt problems, fewer necessary lenses, and lower file size, it is very forgiving in camera placement. You can get away with placing cameras in any arbitrary locations on the sphere, as long as you’re covering the entire sphere of vision (“arbitrary” appears to be the case for Panono, a camera ball for spherical still images). You can just keep sticking lenses on there and adding them to your finished video, since what you are trying to capture is an orthogonal projection of the sphere, which unlike the problems of stereo capture does have a perfect solution.
Of course, there is a question of efficiency. You want the least amount of camera overlap necessary for getting a good stitch at the distance you’re filming at, because in the end there is no camera overlap, just one mono video. If you have cameras with variable field of view, you can give them narrower field of view for farther stitching distance in higher resolution, or change to a wider field of view for lower resolution but closer stitching distance. Just make sure all cameras have exactly the same field of view, unless you really want to spend some time struggling with your stitching software.
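Here is a rough sketch of that field-of-view versus stitching-distance tradeoff, simplified to two dimensions. It assumes two ideal cameras sitting on a ring of radius `radius` around the rig’s center, both facing straight outward, `spacing_deg` apart, each with field of view `fov_deg`. Their edge rays cross at a distance of radius × sin(fov/2) / sin(fov/2 − spacing/2) from the center, and anything closer than that falls in the gap between the two views. The function name and the 5 cm radius are made up for illustration.

```python
import math


def min_stitch_distance(radius, spacing_deg, fov_deg):
    """Distance from the rig center at which two neighboring views start to overlap.

    2D sketch: ideal cameras on a ring, facing outward, fov_deg below 180.
    Returns math.inf if the views never overlap.
    """
    half_fov = math.radians(fov_deg) / 2
    margin = half_fov - math.radians(spacing_deg) / 2
    if margin <= 0:
        return math.inf  # edge rays are parallel or diverging
    return radius * math.sin(half_fov) / math.sin(margin)


# Hypothetical rig: lenses 5 cm from the center, 90 degrees apart.
for fov in (100, 120, 140):
    print(fov, round(min_stitch_distance(0.05, 90, fov), 3), "meters")
```

Narrowing the field of view toward the camera spacing pushes the closest stitchable distance out toward infinity, and widening it pulls that distance in at the cost of pixels per degree.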
As far as mono camera placement goes, my first instinct is to treat the lenses like points on a sphere with regular placement. The dodecahedral arrangement of Immersive Media‘s Dodeca camera is real pretty, and makes me want an icosahedral setup.
Problem is, right now cameras film rectangles, not pentagons or triangles or circles. What would have been a very efficient circle packing on the sphere is not necessarily a good idea for rectangles, and the pretty picture hides the fact that each lens represents an oriented rectangle.
(Someone let me know how easy it would be to manufacture a camera with a natively circular image sensor. I can’t think of any technical barrier that should stop it from being produced the moment a manufacturer decides to do it, and I am looking forward to VR freeing us from the idea that everything has to be rectangles!)
There is one perfect mathematical solution for rectangles. It’s the pyritohedral arrangement you might be familiar with from volleyballs. The arrangement is similar to a cube in that it has six cameras filming six faces, but by carefully orienting them (related to the lovely fact that the vertex graph of the cube is two-colorable so you can have points of rotation that alternate clockwise and counterclockwise, and have I mentioned this is one of my very favourite symmetry groups and I loves it?), you can take full advantage of the rectangular view of the camera. A cube would waste all the pixels outside the center square of footage. A pyritohedral arrangement is optimal.
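One way to see what the alternating orientation buys you is to count, for each seam between neighboring cube faces, how many of the two cameras have their long image axis pointing across that seam. The sketch below is my own bookkeeping, not anyone’s rig software: axes are numbered 0, 1, 2 for x, y, z, a camera facing along axis i in the volleyball layout has its long axis along axis (i + 1) mod 3, and the “naive” layout (side cameras all held level, top and bottom cameras turned along x) is a made-up comparison.

```python
from itertools import combinations

AXES = "xyz"


def pyritohedral_long_axes():
    """Volleyball layout: the camera facing along axis i is long along axis (i + 1) mod 3."""
    return {i: (i + 1) % 3 for i in range(3)}


def naive_long_axes():
    """Hypothetical comparison: x-facing cameras long along y, y- and z-facing long along x."""
    return {0: 1, 1: 0, 2: 0}


def seam_overlap_counts(long_axes):
    """For each pair of face axes, count cameras whose long side reaches across that seam."""
    counts = {}
    for i, j in combinations(range(3), 2):
        counts[AXES[i] + AXES[j]] = int(long_axes[i] == j) + int(long_axes[j] == i)
    return counts


print("pyritohedral:", seam_overlap_counts(pyritohedral_long_axes()))
print("naive:       ", seam_overlap_counts(naive_long_axes()))
```

Each axis pair stands for four of the cube’s twelve edges. In the pyritohedral layout every seam gets the long side of exactly one camera, while the naive layout doubles up on some seams and leaves others with no extra footage at all.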
This is perfect for GoPros, which have exactly the right field of view such that six gopros of alternating orientations will have just enough overlap for good stitching. We used this for our gopro mono spherical rig (made out of foam core), and so have 360heros and Freedom360, and you can 3d print your own holder at home if you have a makerbot, thanks to dtLAB.
Given a bunch of specs for cameras, I’m sure there’s some relevant research on sphere covering that would help you write a program that tells you how many of what type of camera in what setup leads to the best bang for your buck.
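As a starting point for that kind of program, here is a minimal brute-force sketch rather than anything from the sphere-covering literature. It samples directions on the sphere and checks what fraction no camera sees. It assumes every camera is an ideal pinhole (rectilinear) camera sitting exactly at the center with a field of view under 180 degrees, which is not how real fisheye lenses behave, and it only builds the six-camera alternating rig. All function names are made up for illustration.

```python
import math


def fibonacci_sphere(n):
    """Return n roughly evenly spaced unit vectors on the sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(golden_angle * k), r * math.sin(golden_angle * k), z))
    return points


def alternating_cube_rig():
    """Six cameras as (forward axis, sign, long image axis, short image axis)."""
    return [(a, s, (a + 1) % 3, (a + 2) % 3) for a in range(3) for s in (1.0, -1.0)]


def sees(camera, direction, tan_half_h, tan_half_v):
    """True if an ideal pinhole camera has this direction inside its rectangular frame."""
    forward, sign, long_axis, short_axis = camera
    depth = sign * direction[forward]
    if depth <= 0.0:
        return False  # direction is behind the camera
    return (abs(direction[long_axis]) <= tan_half_h * depth
            and abs(direction[short_axis]) <= tan_half_v * depth)


def uncovered_fraction(rig, hfov_deg, vfov_deg, samples=20000):
    """Fraction of sampled directions that no camera in the rig can see."""
    tan_half_h = math.tan(math.radians(hfov_deg) / 2)
    tan_half_v = math.tan(math.radians(vfov_deg) / 2)
    missed = sum(
        1 for d in fibonacci_sphere(samples)
        if not any(sees(cam, d, tan_half_h, tan_half_v) for cam in rig)
    )
    return missed / samples


rig = alternating_cube_rig()
print(uncovered_fraction(rig, 110, 94))  # short side a bit over 90 degrees
print(uncovered_fraction(rig, 100, 80))  # short side under 90 degrees
```

Under the pinhole assumption the short side of the frame has to reach 90 degrees or small holes open up at the cube corners. Swapping in a different `rig` list and a real lens model is where the actual work would be.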
For eight cameras, it might be tempting to space them evenly into an octahedron, but cutting an equilateral triangle out of a rectangular FoV means you’re wasting at least half of your footage. You’d actually be better off just filming in a pyritohedral arrangement with six cameras.
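A flat back-of-the-envelope check of that “at least half” figure, ignoring the curvature of the sphere and any overlap needed for stitching: the sketch below fits the largest equilateral triangle with one side parallel to the frame’s long edge into a rectangle and reports how much of the frame it uses. The function name is made up, and the axis-aligned placement is an assumption (for frames that are nearly square a tilted triangle can do a little better).

```python
import math


def equilateral_fill_fraction(width, height):
    """Fraction of a width x height frame used by the largest axis-aligned equilateral triangle."""
    long_side, short_side = max(width, height), min(width, height)
    # The triangle's base lies along the long side; its height (side * sqrt(3) / 2)
    # must fit within the short side.
    side = min(long_side, 2 * short_side / math.sqrt(3))
    triangle_area = math.sqrt(3) / 4 * side ** 2
    return triangle_area / (long_side * short_side)


for w, h in ((4, 3), (16, 9)):
    print(f"{w}:{h} frame, triangle uses {equilateral_fill_fraction(w, h):.1%}")
```

No triangle can fill more than half of a rectangle it sits inside, so half is the ceiling, and for the usual video aspect ratios the equilateral triangle lands well under it.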
My best guess for eight cameras is orienting them like in a tetragonal trapezohedron, but I have a suspicion that you’d still be able to cover the sphere with only six of those same cameras filming in the same FoV. If there’s a benefit, it’s probably not worth the hassle and expense.
But someone else can math that one out! I’m more interested in the question of a 12-camera setup.
[Edit: someone took the above challenge! Spherical video creator Jim Watters compared a trapezohedral 8-camera rig using a particular 16:9 field of view to the same cameras in the pyritohedral arrangement, and as you can see below, the pyritohedral arrangement just barely does not cover the whole sphere.]
[I think you’re better off switching to a squarer aspect ratio or otherwise increasing your horizontal FoV by 5 degrees rather than adding the extra two cameras, but at least we know exactly what the tradeoff is now, so thank you Jim for sending that in and letting us use your images 😀 /end edit, back to 12-camera setups]
This is where things get kind of interesting, because there’s two fundamentally different highly-symmetric ways to arrange 12 points on a sphere. The regular dodecahedral arrangement seems obvious, and it in fact is very closely related to the pyritohedral arrangement. If cameras filmed circles, it would definitely be the right choice. The other contender is a rhombic dodecahedral arrangement. Rhombic dodecahedra not only tile space but have nice rhombic faces that look closer to being like a rectangle.
So which is better? Well, if we compare a best-fit of a regular pentagon and a sqrt(2) rhombus on a rectangle with our camera’s aspect ratio, we should get a pretty good idea (the sphericalness complicates things only slightly, in favor of the rhombus). In fact, it turns out the sqrt(2) rhombus always wins, even in a best-fit rectangle snuggled around a pentagon. The pentagonal dodecahedral arrangement is slightly more regular so there will be slightly closer stitching at the edges, while the rhombic dodecahedron has very slightly further stitching at eight locations, but the difference is very small. More significant is that the rhombic dodecahedron only has 24 edges, that’s 24 stitching lines, as compared to the regular dodecahedron’s 30.
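The edge counts are easy to check: every edge is shared by exactly two faces, so edges = faces × sides per face / 2, and Euler’s formula (vertices − edges + faces = 2) then gives the number of corners where three or more views meet. A tiny sketch, treating each face as one camera’s stitch region (the function name is made up):

```python
def seams_and_corners(faces, sides_per_face):
    """Edges (stitch lines) and vertices of a convex polyhedron with identical n-sided faces."""
    edges = faces * sides_per_face // 2  # each edge is shared by two faces
    vertices = 2 - faces + edges         # Euler's formula: V - E + F = 2
    return edges, vertices


for name, faces, sides in (("regular dodecahedron", 12, 5), ("rhombic dodecahedron", 12, 4)):
    edges, vertices = seams_and_corners(faces, sides)
    print(f"{name}: {edges} seams, {vertices} corners")
```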
And resolution! There’s a very nice way to symmetrically arrange all the cameras on the faces of a rhombic dodecahedron, oriented to have maximum overlap. For GoPros, unlike the 4×3 Wide setting necessary for the pyritohedral arrangement, you can go down to the narrower 16×9 Wide for higher resolution and close stitching, and I believe you can go down as far as the 16×9 Medium setting and still have room to stitch.
The pyritohedron and rhombic dodecahedron have a nice relationship and fourish symmetry that makes them work well, and for 24 cameras the pentagonal icositetrahedron would do nicely, but once the number of cameras grows high enough you might want to abandon those symmetry groups in favor of something icosahedral. I’d want 30 cameras placed and oriented like the faces of a rhombic triacontahedron, but I’m holding out for a pentagonal hexacontahedral setup with 60 cameras (see title image).
One final note: it is possible to get a good spherical video even out of a setup that’s inefficient or that doesn’t have all the cameras quite facing out from the center. Every single one of our stereo videos contains two mono videos stitched from a subset of the cameras, and even though the projection has a panoramic twist it looks fine (though it’s more work to do the stitching).
The beautiful efficient highly-symmetric polyhedral setups perfect for mono recording are, unfortunately, not so appropriate when it comes to filming in stereo. On the bright side, stereo filming has its own set of interesting questions. Stay tuned for part 2!