May 24, 2011
Making-of: Spheres on a plane
This is a somewhat lengthy piece focused around our demo “Spheres on a plane”, and the technical aspects that went into making it. I should state early on that if you’re not interested in the demo (or any demo) or the way they’re made, this entry might not be for you. Before you read on, watch a video capture of the demo embedded above or download the executable.
Over the years, an big pile of links pointing to Vimeo.com and similar sites had been building up in the Skype logs between me and Bent. We quickly realized two things: 1) that we like, more or less, the same sort of visual expressions in demos, and 2) that we never seemed to actually do anything remotely similar even though we wanted to.
Therefore, it was almost with a sigh of relief that we canned our megalomanic dreams about making a compo killer demo for this year’s The Gathering, and instead concentrated on doing a demo consisting of a few simple, yet beautiful (at least we think so), scenes.
This was around mid March. The picture that stayed on as the main reference was the following:
In what follows, I’ll go through the demo and talk a bit about some of the steps we took, on the technology side of things, when making this demo – enjoy!
In order to capture the visual richness of the reference image, the very first thing to get control of is the auto occlusion of the colletion of pyramids that make up the central object. For every point on the object, this information is then used to give it the correct shade of darkness. In computer graphics terms, this auto occlusion information is referred to as “ambient occlusion“, as it measures the amount of visibility of the environment from any given point of view on the object.
In theory, to calculate ambient occlusion correctly for a given point on a given object, you can do the following: take a million rays, all starting at the given point and pointing in every imaginable direction in space. Then, for each ray, find out if there’s any intersection with the ray and the object. The ratio of number of rays that intersects by a million, is then the occlusion factor. For a point inside a sphere, this occlusion factor should be close to 1. For a point in the interior of a face of a cube, it should be close to 1/2, and for a point far far away from the occluding object, it should be close to 0.
This approach would amount to something of a Monte Carlo integration method for determining the occlusion factor. Since we do not want to raycast a million rays in realtime, it’s better (in this case at least) to know what kind of integral we are trying to integrate and solve it in a different way. Of course, what the procedure above is doing for you is that it gives an approximation to the surface area covered by the occluding object after radially projecting it onto a unit sphere centered at the point of view (i.e. the origin of those million rays).
Here’s what we did: Consider the reference image. The interesting object is built up of copies of the same building block, namely a pyramid. Take one such pyramid occluder, fix an orientation and position it so that its center of mass is at the origin. For a given point, p, outside the pyramid, radially project all the triangular faces of the pyramid facing towards the point p onto the unit sphere centered at p. Calculate the area of the projected point set. The area is then the occlusion factor for p with respect to the occluder pyramid. Notice that this works since the pyramid is a convex polyhedron.
For an inconvex triangular mesh, two forward facing trianglular faces might have overlapping projections, and the correct occlusion factor with respect to these two triangles would be the sum of their projected areas minus the area of their intersection. Luckily, we can disregard this difficulty.
We will not be getting into the formulas for calculating the area of the projected triangles. Suffice to say, it is an area integral with domain the union of the triangle faces visible from the given point of view. Originally, I was hoping that the integral had a nice exact and closed solution. But after having wolframalpha.com chew on it and fail a couple of times, I decided it was time to invoke some good old brute force precalculation.
So, back to the pyramid. We choose a bounding volume containing it and proceeded tocalculate the occlusion factor with respect to the pyramid for all points on a regular grid inside the volume. The results were put in a volume texture and saved offline.
When the effect runs in realtime, we proceed in a way similar to deferred shading: first we draw the color of every copy of our pyrmid into a color render target. We then create a light buffer and populate it by simply placing some 20 point lights, randomly distributed, inside the view frustrum. These lights cast no shadows, so if we had stopped here and combined the light buffer with the color buffer, the object would appear with a somewhat interesting lighting but without any ambient occlusion.
Hence, before doing this combining, we do the following: for every pyramid in the object, we render a bounding mesh. For each pixel inside the bounding mesh which is on the object we are shading, we look up the occlusion factor from our precalculated volume texture and decrease the value of the corresponding texel in the lightbuffer accordingly.
Doing this for one pyramid has the effect of causing that pyramid to “cast occlusion” to all nearby pyramids. By doing this for all pyramids in the object, we arrive at the shading we are looking for.
For the sake of self-ridicule, here’s the very first visual outcome of the having implemented the above algorithm (in delicious debug colors!):
Notice the super cool patterns one the floor, close to the pyramid base, due to non-normalization of the precalculated occlusion values. After having massaged the code a bit, the rendering looked like this:
The remaining artifacts were now down to floating point imprecisions and texture resolution. To fix this, more tweaks were made to the precalculation code and finally, ta-daa:
It is worth noticing one very cool shadow effect in this final image that does not stem fromthe algorithm described here: Along the edges of each pyramid, we (well, Bent that is) added a nice shadow in the color texture of the mesh. It has absolutely nothing to do with the geometry of the mesh, but it is just as effective as any realtime ambient occlusion scheme. :)
You can add more pyramids to the object by mouse-clicking on any pyramid face
Notice also that even the pyramids have no relative motion with respect to eachother. This is a bit boring, since it does not show the real power of the ambient occlusion shading. The only place where there the ambient occlusion shading is affected by relative motion is on the ground plane of which the object hovers above – this looks really good. However, having a dynanic shading like this enabled us to put in an easter egg in the demo: you can add more pyramids to the object by mouse-clicking on any pyramid face. Run the demo and try for yourself.
Finally, I have to add that the above description of our algorithm is a bit simplified. For example, I did not say a word about how to keep a pyramid from occluding itself. This, however, belongs to the realm of hacking and any hack works. More seriously, however, the algorithm as portrayed produces dead wrong results in many cases.
Consider for example what happens if you place two pyramids base down on a plane, side-by-side, and look the area around the edge where they intersect. According to the algorithm above, the base faces of both pyramids would “get affected” by ambient occlusion, producing an “ambient occlusion halo” around the edge.. Juck. Luckily, this is easily mendable by including some simple visibility considerations. However, for this demo, time ran out and things looked OK as they were. But take a look at the image above on the lower right edge near the floor to see this pathology in action.
As I mentioned briefly already, we are using a deferred rendering path in this demo. The benefits of doing deferred rendering when shading local light phenomena is very comparable to the benefits of having a spatial hash of rigid objects when doing collision detection in a physics engine. The code also gets a lot more practicle to work with and the temptation to throw in a couple of lights here and there often becomes far greater than the urge to keep the framerate below a sober limit. The former is exciting, the latter is not. Creating the glowing spheres in the following two images is a direct result of this:
In addition to standard use of textures, pointlights and the ambient occlusion technique already mentioned, there’s one part of the demo that uses raytracing. The very last scene (before the credits/greetings) displays a red ball and some reflective pyramids, using a specialized pixel shader for rendering reflections between convex objects. In any pixel shader that relies on raytracing, the crucial thing is always to speed up the calculation needed to find the intersection point between rays and the mesh we are shading.
In our situation, we are (again) lucky enough to be dealing with convex polyhedra, essentially defined by a low number of trianglular faces. Any convex polyhedron can be thought of as the intersection of a collection of half-spaces in Eucledian 3-space. For a cube, you would need six half-spaces and for our pyramid, we need five. Specifying a half-space can be done by specifying a plane together with a choice of normal direction. Thus, we can represent a pyramid as five planes with five chosen normal directions. If you think of a standard mesh with faces and normals, the faces give you the planes and the normals give you the normal directions.
Representing our pyramid by five planes and corresponding normal directions, the problem of finding the intersection between a given ray and the pyramid boils down to doing at most five ray/plane-intersections and some bookkeeping.
Thus, given a pixel on a pyramid, we can now do quite efficient reflection calculation between our pyramids to find out how any initial ray bounces around between our pyramids before it escapes into the environment. Originally implemented in CUDA, I was quite happy to finally have a reason for porting it to DX10 and put it into a release.
There’s something to be said for the shortcomings of the reflection algorithm. Look for the sentence above were I wrote “essentially defined by a low number of trianglular faces”. Of course, the pyramids that we actually rasterize are more refined than the representation by five planes would indicate. The mesh we use has nice round corners and edges, while the convex objects that we actually raytrace has hard, sharp edges and corners.
It would be more appropriate to say that for the raytracing, producing the higher order reflections, we are using a rougher approximation of the pyramid mesh than we use for rasterization and first order reflections. This is quite noticable when you are aware and start looking for it, and is a good reason why we decrease the intensity of reflected light inverse proportional to the distance travelled by the ray while bouncing between the pyramids. Just take a look at the following screenshot for an example of the sharper corners in the secondary reflections:
At this point, I would like to make a comment related to this effect and to demoscene raytracing trends. Within the demoscene, lots of people has been concentrating on distance fields and produced some very interesting effects with them. However, I feel the focus has become too narrow. By this I mean that it somehow seems like people forget that distance fields are just another way of optimizing ray tracing.
The important question is almost always: how do we, in the most efficient way, calculate the intersection between this ray and that object? It is not: how do I produce a distance field that encapsulates the geometric shape of that object? Sometimes, an answer to the latter question combined with the standard way of calculating the intersection between a ray and an object described by a distance field gives you the answer to the first question, but I think it would be healthy to keep the broader picture in mind.
Starting with the reference image, we always imagined having some physics code controlling the movements of our objects in the demo. For a long time, I had been doing GPU accelerated physics without really being able to produce something worth releasing. It was frustratingly difficult (for me at least) to create something that shows off the code and at the same time does not look like myFirstPhysicsSimulation.avi (just go to YouTube and search for “krakatoa” or “mograph” for plentiful examples).
Anyway, in the course of all this, I have been using Verlet integration for the simulation step. This is more or less by accident, having been seduced many years ago by the simplicity of rag doll simulations described by Jacobson.
(I would like to take this opportunity to rant and complain about the incredible mess that is physics coding tutorials online. Even mathematicians get dizzy when seeing awful inertia tensor formulas, and least one I know tends to run to his fridge and seek comfort in beer every time he tries to read through one of these tutorials. Rant over.)
Anyhow, for this demo we were not after world records in GPU accelerated physics, so I ripped out my code and made a nice cpu implementation of it. I must admit that it was done too hastily, and I am sure you can notice some phsyically questionable behaviour in the first scene of the demo :)
The physics simulation in this demo runs in realtime. No big surprise, really. After all, we only simulate a handful of objects at any given time. However, I did have to admit defeat in one scene and bake the simulation. The reason for this was that my code suffered from the fact that it wasn’t deterministic. This was never a big issue when I was doing gpu accelerated physics, as I was focusing on one thing and one thing only: simulating the maximum number of bodies in realtime. (In fact, the C++ class for my GPU rigid bodies had a name reflecting this: FUCRRS — Fast UnaCcurate Realtime RigidS. Yes, really.)
So why was determinism important now all of a sudden? Because of the following scene:
The way this part of the demo was supposed to be syncronized with the music was that every time one of these spheres would collide, a corresponding sound would be played in the soundtrack. Of course, having an indetermistic phsyics simulation is crucial if this should work.
The way indeterminism played a role here is a bit complicated. The whole story involves three software timers trying to be in sync with each other using (finite precision) floating point numbers. Add to that the chaotic nature of rigid body simulations and imagine trying to fix this at 2 AM the night before our deadline. The choice fell quickly on precalculating the whole thing and be done with it. Of course, forgetting that I can’t code at all when being sleep deprived, I messed up the precalculation and still managed to screw up one more time before finally fixing it the morning after. Unfortunately, the end result still didn’t quite work, but the reason for that includes FRAPS (which sucks) and multiple other factors. Needless to say, sync-nazi Bent wasn’t happy, and it will be fixed. :)
In 2010, I promised myself to never again try to do rigid body physics simulations. I probably have to fix a few things and release a final of this demo, but after having done that I will make the same promise to myself once more.
Since I wasn’t responsible for the sound or design-aspects, I’ll let Bent talk a bit about the sound-design.
I knew very early on that I didn’t want to have any melodies or detectable rythm in the soundtrack (or soundscape, I guess). It was also important to maintain a feeling of a large, emtpy space, since the visuals would be in part very intimate, but also reflect something very empty.
For reference material, I went to freesound.org and downloaded a whole bunch of drones, blips and recordings of traffic and wind. In the demo archive, you can find a list of all the IDs and filenames so you are free to listen to them to see how the sound was built up. In addition to lots of ambience-samples, I used my favourite VSTi – Gladiator 2 – to generate some of the lower-end of the soundscapes. If you pay close attention, you may notice that the amount of low-end in the various parts reflects the amount of visual “weight” in the same scenes. For example, in the last scene of the demo, there is quite a bit of bass and chaos, building up to the end.
I also played with conventions in the part with the falling spheres. In this part, I didn’t want to have sounds that fit the visuals, and ended up using four different breathing-samples (also from freesound.org) and mixed them together (including some pitching and time-stretching). The ping-pong-samples at the end of the scene was meant to play on the fact that the “force-field” had been switched off, and that the gravity (= normality) would reintroduce the sounds that the viewer was expecting. Still, the ping-pong sounds are still a bit “off”, seeing as the textures of the spheres are indicating something heavy and hard. Fun.
In the part with the wooden pyramids hanging by a rope, the starting-point was a long sample of a person pulling a big rope back and forth over a metal railing (at least, that was the sample description). This sound is used throughout the part, but is first introduced when the first extra part is added to the object. For the evolution of the scene I needed to find some samples tha fit the object itself, and I was lucky to stumble upon a series of samples of drawers being opened and shut. The internet is a fantastic place. Various edits later, and the part worked well.
The design (look, feel, motion, editing) was the easiest part of the whole demo. After having consumed more than my fair share of random motion graphics pieces from xplsv.tv (R.I.P) and Vimeo, quite a few design and editing conventions were quite clear, and most of the time actually went into deciding the order of the parts and tweaking the cuts (both with sound and timings).
If you watch very closely, you’ll notice that the empty cuts between the parts are all of various lengths and there are various amounts of sound-spillover between them (example: the reverbs that end into blackness are sometimes very long (or “wet”, as we say), and in other times they are very short – almost instant (or “dry”, if you will). This is of course completely intentional. For example, in the opening shots where the scenes are empty, the cuts are shorter and the sound is dryer. This is because they are establishing shots, and the viewer does not need a lot of time to process the different parts. Later on, when there is something to focus on, the cuts are longer.
Personally I think it’s quite confident to opt not to show off a complex raytracing-scene
With regards to the camera-paths, I decided to stick with very simple moves. You can see the camera either dollying back, forth or to the sides. The only part with any sort of complex camera-move is the last part (the raytraced pyramids), and even though I didn’t really want to do it in the beginning, it works well in that part because it’s the last part of the demo, and the viewer is guided towards relalizing that the demo is about to end. The last part also went through a few versions before we settled on the one that’s in the demo now. The fact that nothing is happening (apart from the camera move) for 95% of the scene really makes it, even though it doesn’t show off the raytracing very well. Personally I think it’s quite confident to opt not to show off a complex raytracing-scene, but then again that was always what this demo was all about – minimalism and mood over technology showoffs.
As usual, my weapon of choice for syncing the demo was the very excellent GNU Rocket System by Kusma (of Excess-fame) and Skrebbel. If you don’t use it to sync your demos — start now. It’s a life-saver, trust me. Luckily, Sverre had already implemented it in the demo engine, and seeing as I’m quite comfortable in using it (see Sunshine in a box, Regus Ademordna or Scyphozoa for references).
I also worked more than a fair bit with Sverre on the textures, because getting the “right look” isn’t easy. The wooden pyramids went through at least five revisions before we settled on the final look, and the various “rooms” also took a lot of tweaking to get right. Most of the textures aren’t remarkably high-res (1024×1024), but I paid attention to sharpening cleverly before the final export – a neat trick I’d like to see more of elsewhere and that I’ll most definitely repeat in later projects as well. One final word on the textures – the floor in the part with the hanging spheres is a direct reference to two things: the classic “checkerboard” of Amiga-demos of the 90s, and “American McGee’s: Alice“, one of the best games I have ever played. The whole tone of the demo is directly related to the opening video sequence of that game.
We hope this post has been interesting and not too snobby. Apologies if any of us went off on tangents or became too hipster-like in our descriptions. That’s sometimes what happens when someone is asked to go back and analyze old thoughts and ideas.
Until next time, thanks for reading!