Welcome to another edition of “What the heck is Spherical Cinematography!” This time we are looking at what texture is, how to use it, and why. There are two preceding posts in this exploration, Spherical Cinematography 101: Scale and its companion piece The Choreography of Attention, if you want to learn more, but for now let’s focus on texture and using it to create good immersion.
While immersive media is not new, sporting a steady track record back to panoramic paintings like la chambre du cerf and even cave paintings, spherical video capture of the real world is brand spanking new. So it’s not that surprising that most of the work lauded as exceptional is pretty bland. I mean, the NYT Magazine piece Take Flight by Daniel Askill is best described as famous people floating around. That’s it. And they never even get that close; it’s all out of arm’s reach.
So it’s new to makers but it’s also new to viewers. Audiences in general have never seen the medium before, so all the flabbergasted oohing and aahing is understandable, but once the novelty wears off, and it does really fast, there needs to be style and composition and authorship in the wings to carry viewers along. So let’s figure out how to do that: how to make great spherical video that will keep viewers interested long after “Wait, what?” and “Holy shit, really?”. In other words, how to texture.
First I’ll cover what texture is, then how to organize it over an entire sphere, and finally use the concepts laid out here to analyze the recent spherical video piece Waves of Grace.
What is Texture?
In film school the primary question we were taught to answer when going from script to screen was: What is the best viewpoint from which to see this event? Great vistas? Intimate conversation? Looking down the mouth of a valley snaked with train tracks? Up at the face of a deliberating judge? All of these position the audience in relationship to the material of the film creating tension, emotional context, foreshadowing, power relationships, et cetera.
Many of these techniques used in flat film can still work in spherical: using a lower camera height to make figures loom powerfully over the viewer; big open spaces to establish settings; using a close-up shot to enhance details by portraying small scale portions of a scene at larger than life scale. We have names for these recurring composition techniques: Close Up, Medium Shot, Two Shot, Long Shot, Extreme Long Shot, High, Low and Level Angle. Any one of which can be either subjectively—meaning the camera is situated or addressed as a present entity in the scene either verbally or with eye contact—or objectively styled (if you are totes unfamiliar with these terms check out the reference section for some cinematography reading material).
The difference between flat and spherical then lies in how these basic components are combined. Flat cinema relies on sequencing, either through camera movement or editing, to combine shot types. Spherical combines those same component types simultaneously, all in one shot then sequentially combines those aggregates.
I need to take a few lines here to lay down a shift in vocabulary. Instead of “shot types” as they were previously labeled I will now call them “region types” since you can have many different regions in one spherical shot. As in “Look at that close up region, the details, so pixel.” So texture then is how region types are mixed in one spherical shot. It is the juxtaposition of objects and actions at a variety of distances from the camera.
Cropped or Uncropped?
“Yeah, but how can I tell if it’s a close up or not?! It’s spherical!”
-Hypothetical human who asks convenient transitional questions
All the shot types listed above are determined by the ratio of the figure size to the size of the whole image. For example, when the figure is larger than, and thus cropped by, the frame, the shot is a medium or close up. Since in spherical there is no frame doing any cropping of close figures, we need to find a different metric: one that maintains the figure to ground relationship but requires no frame.
There are two basic states an object or person can occupy within a spherical video: cropped or uncropped. Uncropped things can be seen all at once from at least one viewer position. Cropped objects or figures can’t. The viewer has to look around a bit to see them completely. How much movement the viewer has to do to see the entire figure depends on two factors: how many degrees of the visual field the figure covers in the footage and how many degrees of that same field the player used to play the video displays.
This will vary by player, but the standard YouTube player, as an example, displays 60 degrees vertically and 105 degrees horizontally in mono mode (meaning the video is displayed full screen with no headset) while in stereo mode (with the headset) it displays 90 degrees vertically and 75 horizontally. Yup, not even close to the same. This is why it pays to aim at a particular viewing platform. My daily videos are shot with the lower immersion mono mode in mind, since my findings from Play/Room showed that most viewers preferred that watch method. Suki Sleeping, however, was framed for the stereo ratio since the stereo depth was crucial to the design of the animation.
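As a rough sketch, the cropped test boils down to comparing the degrees a figure covers against the degrees the player shows. The field-of-view numbers are the YouTube figures quoted above; the names `PLAYER_FOV`, `is_cropped`, and `screens_to_scan` are made up for illustration and do not come from any player's API.

```python
# Player fields of view in degrees as (vertical, horizontal), quoted in this post.
# Other players will differ, so treat these as example values only.
PLAYER_FOV = {
    "mono": (60.0, 105.0),
    "stereo": (90.0, 75.0),
}


def is_cropped(figure_v_deg, figure_h_deg, mode="mono"):
    """Return True if the figure cannot be seen all at once in the given mode."""
    fov_v, fov_h = PLAYER_FOV[mode]
    return figure_v_deg > fov_v or figure_h_deg > fov_h


def screens_to_scan(figure_deg, fov_deg):
    """How many screen heights (or widths) the figure spans along one axis."""
    return figure_deg / fov_deg
```

With a hypothetical figure covering 80 degrees vertically and 30 horizontally, `is_cropped(80, 30, "mono")` is true while `is_cropped(80, 30, "stereo")` is false, which is the whole argument for picking a target platform before you frame.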
I will use the 60° x 105° field of view for the examples in this post. In order to know if a figure is cropped at a certain distance we need to do some basic measurements. These measurements are for a human figure at a given distance. The sky’s gonna cover as many degrees as it gosh darn pleases, and object coverage at these distances is too variable by size to be a good guide. For this test video the camera was placed around 1 meter up. This is approximately half the height of Vi, who was around 2 meters tall in that day’s shoes, which kept her centered pole to pole for easier measurement. Each square in the grid is 15° on a side.
These distance-to-degree measurements are going to vary based on the specs of the camera you are using. I’m using a Ricoh Theta S. It would behoove you to do a few experiments and figure out what the ratios are for your setup… even if you never actually measure a scene before recording. But enough disclaiming; for my setup at 0.5 meters, Vi occupies 135 degrees vertically. Since our example player displays 60 degrees at any viewpoint, we know the viewer will need to move over twice the screen height to view her fully. In fact she does not become fully visible in one viewing position until she is 2m from the camera.
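If you want a ballpark before you measure, idealized geometry gets reasonably close. The sketch below assumes a perfect spherical projection, a figure standing upright on flat ground, and a camera at a known height. The function names `vertical_coverage_deg` and `min_uncropped_distance_m` are illustrative, not part of any tool, and a real camera will deviate: for the 0.5 meter case this gives about 127 degrees, against the 135 degrees measured above.

```python
import math


def vertical_coverage_deg(distance_m, figure_height_m=2.0, camera_height_m=1.0):
    """Angle in degrees between a standing figure's feet and head, seen from the camera."""
    # Angle from the horizon down to the feet.
    below = math.atan2(camera_height_m, distance_m)
    # Angle from the horizon up to the top of the head.
    above = math.atan2(figure_height_m - camera_height_m, distance_m)
    return math.degrees(below + above)


def min_uncropped_distance_m(fov_v_deg, figure_height_m=2.0):
    """Closest distance at which the figure fits the vertical field of view.

    Assumes the camera sits at half the figure's height, as in the test shoot.
    """
    half_height = figure_height_m / 2.0
    return half_height / math.tan(math.radians(fov_v_deg / 2.0))


for d in (0.5, 1.0, 2.0):
    print(f"{d} m: {vertical_coverage_deg(d):.1f} degrees")
print(f"fits a 60 degree view from {min_uncropped_distance_m(60.0):.2f} m")
```

Under these assumptions the figure drops to about 53 degrees at 2 meters and first fits a 60 degree view at roughly 1.73 meters, which squares with her being fully visible at the 2 meter mark. Treat it as a starting estimate and trust your own test footage over the math.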
The point of all of this is not that the viewer will carefully scan her head to toe before moving to the next section of the screen. It is that if a character’s whole outfit needs to make an impression the first time you see her, then she needs to be at least two meters from the camera. I am not far enough along in the research to make definitive statements about how a cropped and uncropped figure differ emotionally for the viewer but it points toward things around intimacy and personal space. Whether that’s welcomed or violating will depend on how the shot and the scene are constructed.
Okay, now that we have covered region types, vertical coverage, and cropped versus uncropped, let’s work on how to organize them around a sphere, starting simple and stepping up to more advanced techniques.
Beginner: The Baseball Rule
Since engaging immersive shots are ones with texture, even the simplest shot should follow at least this rule. Think of your shot like the skin of a baseball with two wide curving regions that meet edge to edge. Each of these 2 regions needs a different region type, with objects and actions at a different distance from the camera. My video J is for Jaggery is a good, simple example. The close-up region includes the ceiling as well as the microwave and the pot of chai which are both very close (the lower quality and visibility of the details due to poor resolution should just be ignored at this point; cameras will catch up eventually). The other region includes the cutting board, door, table, and all of the action Steve and I do in the scene. This is a medium depth region.
Even a drone-enabled shot from high above a city would benefit from the baseball rule. The addition of a flock of migrating geese, for example, which fly close to the camera but not between it and the scene below, a fast-moving cloud bank overhead, or even a plane taking off from the city’s airport and flying above and away would awaken the rest of the shot. (Again this is limited by current camera resolutions and by the fact we only have one kind of best-at-medium-range cameras at the moment, but near- and long-range capable spherical cameras are an eventuality we should stylistically plan for.)
Intermediate: Finger Framing
Are you a fan of the good old fashioned finger framing method, using the thumb and index finger of both hands to make a rectangle to look through for mocking up images? It’s a silly yet tactile way to get your brain thinking about frames… so here’s a silly yet tactile way to get your brain thinking about spherical shots:
Make a C shape with each hand and hold them so they interlock, one hand running zenith to nadir while the other wraps three quarters of the equator. It’s like holding an invisible ball between your palms, fingers wrapped around on all sides. You now have an imaginary spherical video with six general regions: right palm, left palm, right fingers, left fingers, right thumb, left thumb.
Yes, I know you can’t look through it to make a frame like the old way, and yes, I know this won’t work for my digitally disabled brethren, but use your imaginations people! I prefer this to thinking of the video as a clock face, with a detail at 5 o’clock, action at 7 and vista from 8 through 4, because the clock face is so flat. It keeps you thinking about the horizon and organizing things along it, instead of the actual, contiguous, immersive space of spherical capture.
Okay, so you’ve got your 6 regions. Not every region needs its own treatment, but this technique can get you thinking about how you organize different textural elements around the sphere. The baseball rule, by the way, would be one texture for the right hand and one for the left, but there are lots of other ways too. Maybe thumbs and palms are an extreme long region, the right fingers a medium region and the left a close-up.
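If it helps to jot a layout down before a shoot, the six hand regions map onto a small lookup table. This is only a note-taking sketch: the region names come from the hand model above, while `REGIONS`, `layout`, `distinct_region_types`, and `has_texture` are invented labels, and the check is just the bare minimum the baseball rule asks for (at least two different region types somewhere on the sphere).

```python
# The six regions of the two-handed model.
REGIONS = (
    "right palm", "right fingers", "right thumb",
    "left palm", "left fingers", "left thumb",
)

# Example layout from the text: thumbs and palms extreme long,
# right fingers medium, left fingers close-up.
layout = {
    "right palm": "extreme long",
    "right thumb": "extreme long",
    "left palm": "extreme long",
    "left thumb": "extreme long",
    "right fingers": "medium",
    "left fingers": "close-up",
}


def distinct_region_types(layout):
    """Set of region types used anywhere on the sphere."""
    return set(layout.values())


def has_texture(layout):
    """True if every region is assigned and the shot mixes at least two region types."""
    missing = [r for r in REGIONS if r not in layout]
    if missing:
        raise ValueError(f"unassigned regions: {missing}")
    return len(distinct_region_types(layout)) >= 2
```

A layout where all six regions are "medium" fails the check, which is the evenly textured shot this post keeps warning about.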
Advanced: Strategic Visual Obstacles
Spherical video has been denigrated by some as “not real VR” because, as Will Smith put it in his Wired piece Stop Calling Google Cardboard’s 360-Degree Videos ‘VR’, “360 video is inherently limited… you won’t be able to get up and walk around in a 360 video. The cameras just can’t capture the data required to allow that.” This is supposed to be the earth-shattering insult, the last nail in the coffin, for why spherical (*cough* I will never call it 360 *cough*) video is lesser and should be banished from the happy, perfect fairyland of ‘real’ VR forever. But what if we didn’t jump to conclusions? What if we looked at the qualities of the medium without assuming we know better, to see what it wants to be?
In spherical video the viewer can’t move spatially within the scene. How can we use this technical reality stylistically? Voilà! Strategic Visual Obstacles! Just because you can place the camera out in the open, at an average eye height, where everything is visible for meters around, doesn’t mean it’s a good idea. Blocking the viewer’s access to a particular part of the scene, whether partially or completely, builds tension. And conveniently you know for a fact that viewers can’t just lean an inch to the right and mess up all your careful planning.
Let’s look at The light on a pot and a petal as a super-simple example.
There are three separate regions in this video. There is a close-up region which includes the flower pot, drinking glass, and me drawing, the threesome to my left chatting over coffee in a medium region, and the background of people ordering food and chatting in a medium-to-long region. But then, what about the woman clearly visible between the drinking glass and the pot? She is sitting with her friends. She is completely visible, if a bit pixelated, but the person sitting across from her to whom she is talking is completely blocked by the glass. If the glass were not blocking the view I would say this was a medium region but since it is I’ll go with calling it a layered region. Layering a close-up over a medium shot increases the depth of field of the region but it can also increase suspense by adding mystery.
Denying a viewer access to some part of the scene is really powerful in an immersive piece where the viewer has a built-in expectation of agency over what they look at. One of my personal favorite implementations of this technique so far is attaching the camera to the back of my head. This camera position means the viewer can see everything around except what I am looking at. My gaze blocks the gaze of the viewers.
Waves of Grace
Now let’s use our newfangled tools to look at a few shots from the 2015 piece Waves of Grace by Gabo Arora and Chris Milk. I want to take a look at two different shots, the first of which starts at 2:02. Here a group of people prepare to bury a recently deceased Ebola victim by suiting up in full-body bio-safety gear. The camera is placed in the center of the tarp-roofed shelter with people prepping on all sides. In the mostly obscured background you can see a field of white crosses that mark the graves of previous victims and another shelter a few meters away. The shot feels busy, cluttered and at emotional odds with the context of funeral preparations because the camera placement has no tension. It’s all one medium region with glimpses of other region types but no clear layering. Many figures are within the cropping boundary, but here it doesn’t serve to foster a sense of intimacy with the people around, but rather a feeling of being overwhelmed and disconnected, like a strange crowd pushing past you in a narrow place.
Instead of shooting from the dead center of activity which gives the scene even texture on all sides, the camera could have been placed to one side of the open shelter, juxtaposing the diligent efforts of the workers with the field of completely still, high-contrast white crosses. In traditional flat documentary footage, you would normally shoot the two pieces separately and leave it to the editor to reveal the visual tension between the two, but in spherical the two scenes can coexist. That’s the best part about spherical video: it can capture the wild contrasts and strange proximities and heart mending meetings that are part of all lives.
The second shot we’ll look at directly precedes the funeral preparations scene, starting at 1:45. This shot has better texture. It combines a long region, a field with milling people, with a medium region, a man chopping wood next to two girls chatting. A long row of risers stretches out from where the girls sit, cutting a beautiful perspective line toward the field and giving that region great depth. As you look around notice that no object or figure is within the cropped boundary. Even the large pile of chopped wood is uncropped. This gives the scene a sense of distance. Neither the people nor the world confront you proximally and you are free to look from vignette to vignette unobstructed.
We covered what texture is and why it’s important for creating engaging immersion, how flat shot types can be retooled into spherical region types, how to measure your own spherical video setup for vertical coverage and cropping boundaries, and a few guidelines on how to organize regions around the sphere. Then we used all that sweet, sweet know-how to take a more in-depth look at two shots from Waves of Grace.
If you have a specific piece you would like me to analyze or questions about these techniques feel free to find me on Twitter, @emilyeifler. Happy sphering everyone!
References
Cinematography, Theory and Practice. By Blain Brown. 2002.
The Five C’s of Cinematography, Motion Picture Filming Techniques. By Joseph V. Mascelli. 1965.
Video Art. By Michael Rush. 2003.
Virtual Art, From Illusion to Immersion. By Oliver Grau. 2003. (Translated from German by Gloria Custance)