Hello eleVR blog readers! This is Henry Segerman, guest blogging about a research project I’ve been working on with eleVR. I’m a mathematician and mathematical artist, and I also worked with eleVR on Hypernom and Monkeys.
First of all, go watch this discussion and tech demo spherical video:
If you want to jump straight into the code, here it is on Github.
Almost all current devices and platforms use an equirectangular projection of spherical video data for storage and streaming. This converts a sphere of data into a 2×1 rectangle of data, which fits in nicely with current infrastructure for video. What doesn’t currently work very nicely is editing spherical video in equirectangular format. As Emily outlined in her talk at Vidcon, what you can do in ordinary rectangular video editing software is quite limited. For example, if you want to rotate the video around anything other than a vertical axis, you’re out of luck.
It seems obvious that future spherical video editing software should include the ability to rotate all or part of a scene around any axis, the equivalent of translating and rotating flat video. What about other effects, for example scaling content to be larger or smaller (or equivalently, performing digital zoom)?
Möbius transformations are transformations of the sphere that include ordinary rotations of the sphere, as well as very natural zoom-like transformations and many other interesting effects, as we show in the video. The code we used to make the transformations to the video above is available on Github. It is written in Python, and modifies individual png files, very slowly, taking around a minute for each frame at 1920×960 resolution. (But fast enough for research purposes. Someone implement this in proper video editing software please!) The code should hopefully be easy to follow, but I’ll also outline the process here.
We have a sequence of transformations:
Pixel coordinates (say on a 1920×960 rectangle) → (0, 2π) × (-π/2, π/2) → unit sphere in R³ → CP¹
The first transformation just scales the pixel coordinates to be angles, and the second is the inverse of equirectangular projection. Next, I’ll describe CP¹ (one-dimensional complex projective space), and the third transformation. CP¹ is the set of pairs of complex numbers, (z,w), not both zero, where we say that (z,w) is the same as (λz,λw) for any non-zero complex number λ. In other words, we can scale both coordinates and it doesn’t change the point in CP¹. It’s useful to think about a pair (z,w) as the single complex number z/w. Of course if you scale both the numerator and the denominator of a fraction by the same number, it doesn’t change the number that you get. So, CP¹ is almost the same as the set of complex numbers. The difference is that in CP¹ we can talk about the pair (1,0), while 1/0 doesn’t make sense as a complex number. So CP¹ is just the complex plane, with a single point added “at infinity”. The plane, plus a point at infinity, is topologically the same as a sphere. The third transformation in the sequence above, from the unit sphere in R³ to CP¹, is just realizing this topological fact, using stereographic projection. Here’s a photo of a model illustrating stereographic projection as a map from the sphere to the plane (you follow where the light rays go to see what the map does).
The north pole of the sphere doesn’t map anywhere on the plane. Instead, it gets mapped to a point at infinity, which corresponds to (1,0) in CP¹.
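The chain of maps above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the Github repository: the function names, the choice of the top pixel row as latitude +π/2, and the convention that stereographic projection sends a sphere point (X, Y, Z) to (X + iY)/(1 − Z) are all assumptions made for this example.

```python
import math


def pixel_to_angles(px, py, width=1920, height=960):
    """Scale pixel coordinates to (longitude, latitude) in radians.

    Assumption: column 0 is longitude 0 and row 0 is latitude +pi/2.
    """
    lon = 2.0 * math.pi * px / width
    lat = math.pi / 2.0 - math.pi * py / height
    return lon, lat


def angles_to_sphere(lon, lat):
    """Inverse equirectangular projection onto the unit sphere in R^3."""
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def sphere_to_cp1(point):
    """Stereographic projection from the north pole, as a pair (z, w).

    The pair stands for the complex number z / w. Both branches describe
    the same point of CP^1, because (X + iY)(X - iY) = (1 - Z)(1 + Z) on
    the unit sphere. Switching branches keeps the pair away from (0, 0).
    """
    X, Y, Z = point
    if Z > 0:
        # Upper hemisphere: w shrinks to 0 at the north pole.
        return (complex(1.0 + Z, 0.0), complex(X, -Y))
    # Lower hemisphere: z shrinks to 0 at the south pole.
    return (complex(X, Y), complex(1.0 - Z, 0.0))


# A pixel on the left edge, halfway down, lands on the equator.
lon, lat = pixel_to_angles(0, 480)
print(sphere_to_cp1(angles_to_sphere(lon, lat)))

# The north pole becomes a pair whose second entry is zero.
print(sphere_to_cp1((0.0, 0.0, 1.0)))
```

Working with the pair (z, w) instead of the single number z/w is what lets the north pole be handled without dividing by zero.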
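Ok, so now we are in CP¹. Möbius transformations are what you get by applying 2×2 complex matrices to each point (z,w), viewed as a two-dimensional complex vector. So, for example,

$$\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} z \\ w \end{pmatrix} = \begin{pmatrix} 2z \\ w \end{pmatrix}$$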
Thinking again of points of CP¹ as single complex numbers, this converts z/w into 2z/w. In other words, this scales everything away from zero by a factor of two. On the sphere however, it scales things away from the south pole, and towards the north pole. If we do this to a spherical video (that is, mapping all the way from pixel coordinates to CP¹, applying the matrix and then mapping all the way back to pixel coordinates), it looks like we are zooming in on the south pole, and zooming out from the north pole. Here’s a test equirectangular image, and the result of doing this kind of “zoom by a factor of two”, this time between two opposite points on the equator:
Notice that close to the center of these images, you really do get a zoom by a factor of two, and on the opposite side of the sphere (at the midpoints of the left and right edges), you zoom out by a factor of two.
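Here is a small sketch of the simpler pole-to-pole zoom described earlier, the one that sends z/w to 2z/w. The helper names `apply_matrix` and `cp1_to_sphere` are made up for this example, and the inverse stereographic projection assumes projection from the north pole, so that z/w = 0 is the south pole.

```python
def apply_matrix(m, p):
    """Apply a 2x2 complex matrix ((a, b), (c, d)) to a pair (z, w)."""
    (a, b), (c, d) = m
    z, w = p
    return (a * z + b * w, c * z + d * w)


def cp1_to_sphere(p):
    """Inverse stereographic projection of the pair (z, w) to the unit sphere."""
    z, w = complex(p[0]), complex(p[1])
    norm = abs(z) ** 2 + abs(w) ** 2
    xy = 2.0 * z * w.conjugate() / norm
    height = (abs(z) ** 2 - abs(w) ** 2) / norm
    return (xy.real, xy.imag, height)


# The "zoom by a factor of two" matrix: z/w becomes 2z/w.
ZOOM = ((2, 0), (0, 1))

equator_point = (1 + 0j, 1 + 0j)   # z/w = 1, a point on the equator
moved = apply_matrix(ZOOM, equator_point)

print(cp1_to_sphere(equator_point))  # where the point started
print(cp1_to_sphere(moved))          # where it ends up
```

The equator point moves to roughly (0.8, 0, 0.6), that is, up towards the north pole, while the pairs (0, 1) and (1, 0) representing the two poles stay where they are.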
If you want to rotate using a 2×2 complex matrix, you would multiply by the appropriate complex number. So for example to rotate by 90 degrees, you want to multiply by i, so you would use the matrix

$$\begin{pmatrix} i & 0 \\ 0 & 1 \end{pmatrix}$$
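As a quick check of the rotation case, here is a sketch that builds the matrix for a given angle and applies it to one point. The names `rotation_matrix` and `apply_matrix` are hypothetical helpers for this example, not functions taken from the Github code.

```python
import cmath
import math


def rotation_matrix(angle):
    """Matrix that multiplies z/w by e^(i*angle).

    On the sphere this is a rotation about the axis through the poles.
    """
    return ((cmath.exp(1j * angle), 0), (0, 1))


def apply_matrix(m, p):
    """Apply a 2x2 complex matrix ((a, b), (c, d)) to a pair (z, w)."""
    (a, b), (c, d) = m
    z, w = p
    return (a * z + b * w, c * z + d * w)


quarter_turn = rotation_matrix(math.pi / 2)   # multiplication by i

before = 0.5 + 0.5j
z, w = apply_matrix(quarter_turn, (before, 1 + 0j))
after = z / w

# The distance from zero is unchanged, so the latitude stays the same...
print(abs(before), abs(after))
# ...while the argument, which is the longitude, grows by a quarter turn.
print(math.degrees(cmath.phase(after) - cmath.phase(before)))
```

In equirectangular terms, a rotation like this one just slides the image sideways, which is the one kind of rotation that ordinary video editing software can already do.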
Using other kinds of matrix (functions for which are included in the code on Github) you can rotate about any two points of the sphere, by whatever angle you want, or zoom from any point of the sphere towards any other point.
So, apart from the math being nice, why else are Möbius transformations a good idea for video editing? As we mention in the video above, they are all conformal, meaning that they don’t change angles. In ordinary flat video or still image editing, zooming and rotating don’t change the angle at which two lines meet, while if you shear an image then angles do change. If you look closely at just a small part of an image, it will look pretty much the same after a conformal mapping, while if you shear an image you can tell by looking at any tiny part of it that something is distorted. Of course things can change drastically on the larger scale, but at least each small part of the scene looks reasonable on its own.
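The difference between a conformal map and a shear can be checked numerically. The sketch below works in the complex plane rather than on the sphere (stereographic projection is itself conformal, so angles carry over). It takes two tiny steps that meet at 60 degrees and measures the angle between their images. The matrix `M`, the base point and the step size are arbitrary choices made for this example.

```python
import cmath
import math


def mobius(m, zeta):
    """The Möbius transformation of the plane given by the matrix m."""
    (a, b), (c, d) = m
    return (a * zeta + b) / (c * zeta + d)


def shear(zeta):
    """A horizontal shear of the plane: (x, y) goes to (x + y, y)."""
    return complex(zeta.real + zeta.imag, zeta.imag)


def angle_after(f, p, u, v):
    """Angle between the images under f of two tiny steps u and v from p."""
    return cmath.phase((f(p + v) - f(p)) / (f(p + u) - f(p)))


M = ((2, 1j), (1, 3))                    # arbitrary matrix, determinant 6 - i
p = 0.3 + 0.2j                           # arbitrary base point
u = 1e-6 + 0j                            # tiny step to the right
v = 1e-6 * cmath.exp(1j * math.pi / 3)   # tiny step at 60 degrees to u

print(math.degrees(angle_after(lambda t: mobius(M, t), p, u, v)))
print(math.degrees(angle_after(shear, p, u, v)))
```

The Möbius transformation should report an angle very close to 60 degrees, while the shear reports something noticeably smaller.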
If you have any questions about the theory, or the implementation, or make something cool using Möbius transformations, please tweet me at @henryseg!
P. S. I was inspired to start thinking about Möbius transformations applied to spherical imagery when I read The Mercator Redemption, a paper by Sébastien Pérez-Duarte and David Swart.
P. P. S. There is an awesome video, Moebius transformations revealed, which shows visually how various Möbius transformations of the complex plane can be interpreted as motions of a sphere. This sphere is close to, but not really the same as, the sphere of our spherical video!