May 18, 2012

Making the Machine – part 1

Preamble

Ödönke is bothering me.

In the hope that most people enjoy making-ofs as much as I do, the following series of articles is an attempt to describe the creation process of my Revision demo “Machine” by Ümlaüt Design, both from a technical and an artistic standpoint, and perhaps to shed some light on a few easily avoidable mistakes I made along the way.

A disclaimer: I’m not aiming for this to be a tutorial or a white paper; if you’re a competent demomaker, you will probably not find anything novel in these texts. This is aimed more at people who want to get into demomaking but are perhaps intimidated by the chore, or who have made a few simple demos but are unsure how to direct them better or how to approach the production process more easily. Note that I’m not a very good visual effect coder, so don’t expect a lot of linear algebra. I’ll also try to keep the actual maths / technical stuff to a minimum, or at least digestible for a non-technical / artist audience.

Preparations

I first had the idea to do a demo for Revision around February. The reasons were simple: I hadn’t done a demo in several years at that point (meaning actually worked on one instead of just supplying music), and I felt I needed to do something to get back into the loop, and to stay sane. Another motivation was that I found out about ASSIMP, a fantastic little loader library for 3D formats; I would often discard ideas for a demotool simply because I felt the management of a 3D content pipeline was a right pain: I tried things like Flexporter or writing my own plugin, but it just never worked out quite right, and I also realized that to be really effective (and perhaps, somewhere down the line, artist-friendly), I would need to use standard formats like COLLADA. I often shelved ideas because of this, but then ASSIMP came along and I was in a sense reinvigorated, knowing that there’s a whole open source project that will gladly deal with these kinds of problems.

An early concept model of the first robot.

My plan was simple: I had a track from around 2007, influenced heavily by Drew Cope’s Drum Machine animation, that I had never gotten around to making anything with, so having the track essentially finished already put me in a good position. I also had some really rudimentary basecode from around that time for the next generation of ÜD tools we wanted to make – this certainly spared me some time messing around with datafiles and parsers and loaders. Of course it was by no means finished, but it was a good start.

So I started planning:

  • I had roughly two months. First, I really don’t like party coding. (Why code in Germany when I can code at home any time I want? It’s a waste of a good time.) Second, my current portable computer is a netbook: a fantastic little machine for everything other than Shader Model 3.0 demos. All of this meant that I had to finish at least a day before the party.
  • I already had the soundtrack (or at least something that needed at most a day of work to finish), and it wasn’t particularly long either.
  • Some basecode, mostly just boilerplate STL-like stuff.
  • A one-person project – you’d think that after 12 years in the scene I would know some willing graphics artists, but the problem was that they were either uninterested or unavailable at the time, and my concept wasn’t fully fledged either, as I was anticipating making some of the stuff up as I went along. (Plus there’s something to be said about masochism, megalomania and a lack of human skills, but that’s a different story – let’s not dwell on that.)

Demomaking. Serious business.

I also planned what I’ll be needing:

  • I’ll need a tool. You can argue for handcoding, but no. For the stuff I planned – razor-sharp editing and a continuous flow – you need a tool. And no, GNU Rocket won’t cut it: it just wasn’t precise enough or convenient enough for what I wanted. I’m a firm believer that if spending a week on writing a tool saves you a week somewhere down the line, you should do it.
  • I’ll need a 3D engine that can somehow load an animated 3D mesh and display it. I’ve written a few of these so I wasn’t concerned, although I really wanted it to be non-hacky this time, i.e. unified lighting, material system, and hopefully looking the same in the tool as in the modeller.
  • I’ll need a couple of effects. Nothing special – I’m not good at them, so I’ll just keep them basic but hopefully interesting.
  • A LOT of content. I was originally planning about 3-4 scenes, but listening to the music, I realized I needed at least double that.

I made a rough timeline for the project: about two weeks for the tool, two weeks for the engine + loader + effects, two weeks for content, and I allocated the last two weeks to edit the demo together. This last point might sound like overkill, but I really wanted the demo to look polished and feel well thought-out, and I was also aware that making a demo, being a non-linear process, sometimes involves going back to previous steps and fixing content or code during the phase when you’re supposed to be editing (in film, I suppose this is what you’d call “re-shooting”), so in a sense I left myself some headroom to fail.

The tool

2012.02.11 – We all start somewhere.

I decided early on what I did NOT want my tool to be: I didn’t want to write a content tool. I had no room (and in this case, no time) to make a full-fledged 3D editor, even just to place cameras or tweak materials. I decided I was better off doing those in the content package and then just making sure it looks as close to that as possible (within reason). All I really wanted was to manage resources (to an extent), composite, but most of all, edit. That was it.

I also decided that since I didn’t have much time, I needed to tone down the functionality: the basic idea of a tool is to make certain things easier, in this case editing and scrubbing back and forth in the demo. That’s all I wanted to have. The underlying system was capable of managing resources, loading them, reloading them, etc., but since my internal format was XML, I decided that as long as it didn’t get too tedious, everything else I could just do by hand. Now, it’s up to each and every demomaker to decide what’s tedious and what’s not – for me, copy-pasting XML tags to insert a new envelope was a perfectly acceptable trade, even if I had to do it several times; compared to adding a GUI for a “New Effect” window with foolproofing and all, it was really just a minor nuisance. Of course I’ll finish the GUI properly at some point, sure, but then and there I had two months, of which I had allocated two weeks for the tool. From scratch. I had to triage.

As far as the tool’s interface goes, I went back to what I’ve been using a lot: music editors, specifically DAWs. DAWs have a particular way of arranging envelopes, clips and loops, and they allow a really precise level of editing, but the thing that struck me the most about them is how much ease of use they have developed over the years. I used to be a tracker person, and my switch to DAWs was somewhat bumpy, but now I don’t regret it – adaptive grids, triplet beats, beat snapping, envelopes: these are all things a demotool can borrow and gain a lot of strength from. With that in mind, I set out to essentially build a DAW by my own rules.

The underlying system was simple:

  • I can load an arbitrary number of plugins.
  • These plugins can produce an arbitrary number of resource loaders and effects. These are object factories that can instance any number of effects or resources.
  • I have a common resource pool where I manage the outputs of the loaders: textures, meshes, videos, shaders. If there’s a loader for it, it goes here. Resource loaders can also use this pool if they depend on other resources (e.g. 3D scenes can reach for textures that have already been loaded).
  • Each effect can specify a set of resources (by type) to use. None of them HAS to be there – it’s the plugin’s responsibility to fall back gracefully.
  • Each effect can specify a set of parameters to be wired out to envelopes or left as a static value.
  • The demo then loads plugins, instantiates the loaders and effects, loads the resources, feeds them to the effects, loads the envelopes, and runs.

That’s it. That’s all I needed at the moment.
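In rough C++ terms, those interfaces boil down to something like the sketch below. This is a hypothetical illustration – none of these names come from the actual tool, and the real thing has more plumbing – but it shows the shape of the system:

// Hypothetical sketch of the plugin / loader / effect interfaces described above.
#include <map>
#include <string>

struct Resource { virtual ~Resource() {} };             // texture, mesh, video, shader...

typedef std::map<std::string, Resource*> ResourcePool;  // the common resource pool

struct ResourceLoader {
  virtual ~ResourceLoader() {}
  // loaders may reach into the pool for resources that are already loaded
  // (e.g. a 3D scene grabbing its textures)
  virtual Resource* Load(const std::string& path, ResourcePool& pool) = 0;
};

struct Effect {
  virtual ~Effect() {}
  // resources are optional - the effect has to fall back gracefully if one is missing
  virtual void SetResource(const std::string& slot, Resource* resource /* may be NULL */) = 0;
  // parameters are either wired to envelopes or left as static values
  virtual void SetParameter(const std::string& name, float value) = 0;
  virtual void Render(float demoTime) = 0;
};

struct Plugin {
  virtual ~Plugin() {}
  // object factories: a plugin can hand out any number of loaders and effects
  virtual ResourceLoader* CreateLoader(const std::string& type) = 0;
  virtual Effect* CreateEffect(const std::string& type) = 0;
};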

2012.02.16 – GUI shaping up slowly. Note the time signature change.

Early on, one of the things I thought about was having arithmetic envelopes, i.e. being able to have two splines form an envelope by adding, multiplying, and so on (e.g. a pulsating beat that gets progressively stronger), but I ditched the idea (AFTER I wrote the code for it) because it was a hassle to manage, and I had a better idea: if it’s all just envelopes producing floating point values, why not subclass them? So right now I have one subclass for static values, one for a standard spline-based envelope, and if I need specific GUI functionality later, I can just implement a new type. This all might sound like overkill (I haven’t actually implemented new ones yet), and arguing about GUI design can certainly be pretty pointless, but I thought about how I would imagine an envelope that, say, switches between cameras conveniently: with an ordinary spline-based system, you can set your splines to flat and just move your points up and down until the value crosses over to another integer, but that sounds like a nightmare from a convenience point of view (and due to the time constraint, this is what I ended up using, so I can confirm it is).

A better idea, of course, would be displaying the GUI itself as a list of ranges (i.e. “between 2:00 and 3:00, the value is constantly 7”), and not just as a spline – like the pattern editor for the GlitchVST. While you could argue that this way you’re actually spending time on removing functionality, my perspective is that for certain parameters (colors are a good example too), a specific GUI feels much better, and with a subclassed GUI that’s fairly easy to do without breaking something else. Once again, this might sound like a tangent, but the idea is this: as long as it spits out a floating point number, anything goes, and a bit of forward thinking pays off later.
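To make this concrete, here’s a rough sketch of what I mean by subclassing – hypothetical names, not the actual tool code, and the spline math itself is left out:

// Hypothetical envelope hierarchy: anything goes as long as it spits out a float.
#include <vector>

class Envelope {
public:
  virtual ~Envelope() {}
  virtual float GetValue(float time) const = 0;
  // whether the effect driven by this envelope should be doing anything at all
  // at this point in time - more on this below
  virtual bool IsActive(float time) const { return true; }
};

// Static value - no timeline GUI needed, just a number box.
class StaticEnvelope : public Envelope {
public:
  float value;
  virtual float GetValue(float /*time*/) const { return value; }
};

// Standard spline-based envelope; the actual spline evaluation is omitted here.
class SplineEnvelope : public Envelope {
public:
  virtual float GetValue(float time) const { /* evaluate the spline at 'time' */ return 0.0f; }
  virtual bool IsActive(float time) const { /* is 'time' inside one of the clips? */ return true; }
};

// A possible later addition: a list of ranges ("between 2:00 and 3:00 the value is 7"),
// much more convenient for things like camera switching - different GUI, same float output.
class RangeListEnvelope : public Envelope {
public:
  struct Range { float start, end, value; };
  std::vector<Range> ranges;
  virtual float GetValue(float time) const {
    for (size_t i = 0; i < ranges.size(); i++)
      if (time >= ranges[i].start && time < ranges[i].end)
        return ranges[i].value;
    return 0.0f;
  }
};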

One thing I kind of forgot about initially, and then just sort of picked the simplest solution for, was the actual rendering range: if there are all these possible envelope types, how does the demo know whether it’s time to render an effect or time to leave it alone? This I decided to leave up to the discretion of the actual envelope class: each class had a bool function that returned whether, according to that envelope, something should be happening or not, and when all envelopes said “true”, I rendered the effect. Constant values always returned true, and splines returned true if the play cursor was over one of their clips (= the white boxes visible on the screenshot below). This provided fairly convenient control over when effects were active or inactive, and again, with a different envelope type I would’ve been able to define a whole new behaviour if I wanted to.
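Continuing the hypothetical sketch above, the check on the effect side is then trivial – something along these lines:

// The effect only renders when every envelope wired to it agrees that it should.
// Constants always say yes; splines only say yes under one of their clips.
bool ShouldRenderEffect(const std::vector<Envelope*>& envelopes, float time)
{
  for (size_t i = 0; i < envelopes.size(); i++)
    if (!envelopes[i]->IsActive(time))
      return false;
  return true;
}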

Another thing before I forget: for splines, I didn’t limit myself to a 0.0-1.0 range. When I did my first 4k synth, I realized a lot of the fun stuff comes from out-of-range values, and this way, if you want to extend your animation by 20 more frames, you don’t have to re-sync your entire demo. (This, by the way, is hindsight. I did make this mistake.) Of course the question is, how do you fit an infinite range into a GUI like this? Well, I went for the simple and stupid solution – the default range is 0.0-1.0, and a right-click “Set Range” menu can change it to anything. This is purely a GUI thing and the demo doesn’t even have to care about it.

As far as the GUI goes, I went with MFC for a number of reasons: I’m really familiar with it, having worked with it for a number of years, and I find it not only reliable but also very convenient. The REAL real real reason, however, was docking panes. I fell in love with the Visual Studio-style docking panes at first sight and had wanted them in my tool for ages, not only because they’re really easy to rearrange, but also because I wanted a level of multi-monitor support where I can see the demo on one monitor and edit the envelopes on the other – just like in a convenient DAW. (That hissing sound you just heard was the Ableton camp. Heh heh.) So for that I needed my docking panes to be able to leave the main window and maximize on the other screen.
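For what it’s worth, the MFC Feature Pack makes this fairly painless; here’s a rough sketch along the lines of what the Visual Studio application wizard generates for a CFrameWndEx-based main frame (CEnvelopePane, m_wndEnvelopes and ID_VIEW_ENVELOPEPANE are made-up names here):

// CEnvelopePane is a hypothetical CDockablePane-derived class holding the envelope editor.
int CMainFrame::OnCreate(LPCREATESTRUCT lpCreateStruct)
{
  if (CFrameWndEx::OnCreate(lpCreateStruct) == -1)
    return -1;

  EnableDocking(CBRS_ALIGN_ANY);   // the frame accepts panes on any side

  if (!m_wndEnvelopes.Create(_T("Envelopes"), this, CRect(0, 0, 400, 300), TRUE,
                             ID_VIEW_ENVELOPEPANE,
                             WS_CHILD | WS_VISIBLE | WS_CLIPSIBLINGS | WS_CLIPCHILDREN |
                             CBRS_BOTTOM | CBRS_FLOAT_MULTI))
    return -1;

  m_wndEnvelopes.EnableDocking(CBRS_ALIGN_ANY);
  DockPane(&m_wndEnvelopes);   // docked by default, but it can be torn off to another monitor

  return 0;
}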

2012.02.19 – The basic spline editor.

Another reason I enjoyed MFC was the library perks of the Document/View model – anything that keeps me from having to manage stuff like “Do you want to save the changes?” and the clipboard and keyboard shortcuts and loading bars and the whole New Document business is bliss. To peek ahead a few weeks: somewhere down the line I occasionally found myself working with multiple UDS project files to cut down on loading time and then eventually merging them by hand, all thanks to the fact that I could pull off the “New Document” trick fairly quickly. (And yes, some of this sounds like the epiphany of a 3-year-old finally figuring out how a toilet works, but hey. As said, it’s been years.)

The first iterations of the tool were obviously spartan, but I wanted the idea to be clear from the word go – the View has a rendering window in the middle, rendering at the exact same resolution the demo runs in (optimally – resolutions can vary, but there’s always an intended one, in this case 720p), and everything else lives in panes, toolbars, menus, keyboard shortcuts, etc. One positive thing about MFC (and WINAPI in general, I guess) is that it encourages redundancy in the GUI: if you have a menu item, you can immediately add it to the toolbar, assign an accelerator, and so on. It’s free, and it makes the GUI adapt to the user instead of the other way around.

Now, this isn’t strictly code-related, but it’s worth mentioning that I noticed an interesting trick a few days into development: I worked better when I left basic stuff unfinished at the end of the day. My reasoning was that if I had done a 100% job that day, the next day I’d sit down in front of Visual Studio and spend time wondering how to jump into the next task; I would start to procrastinate. However, when I left really run-of-the-mill stuff unfinished, I would often spend the next day thinking about how I would solve that particular problem, and by the time I got home, I just jumped into the project, fixed it, and essentially carried on working without losing momentum. This worked best when the problem at hand was as simple as it gets: reproducible crash bugs, broken colors, textures being upside down – stuff you’re able to debug and fix in your sleep. But because they’re so easy, you don’t feel as if there’s a mountain to climb for the next task; by the time you’ve fixed them, you’re “warmed up” and ready for the next big chunk. I highly recommend it if you tend to lose motivation while democoding.

After I reached a satisfying stage with the basic envelope GUI, I switched to the next step: the underlying engine.

The engine

I had a pretty good idea from the get-go of what I wanted my engine to achieve, both as far as timing and compositing goes and as far as rendering goes:

For each “effect”:

  1. Load some sort of standard mesh format
  2. Use incoming splines to animate camera, light and meshes
  3. Render in multiple passes, perform compositing

2012.02.21 – FK in action, rendering normals.

Step one used to be a huge pain; my first demos used 3DS files, but those are a bit outdated by now, and they’ve always been a bit of a nuisance to handle. Later I made my own file format using Flexporter – a step above 3DS files, but still a fairly dodgy solution: you’re using middleware to create even more middleware, and things just go wrong. Finally, I attempted to write my own exporter, and that just ended up being torturous because of the intricacies of the tools’ inner workings. Around 2007, as a result of all this, I spectacularly gave up trying to create a content pipeline – and with it, making demos.

Revelation came in the form of a set of Pouët posts about a library called ASSIMP, a 3D mesh loader library that takes dozens of known 3D formats and loads them into a unified data structure. Not only that, but it’s also able to do a bunch of really cool transformations on the data, and it comes with a set of fun little tools to test your meshes as well. I took a look at the API and was sold immediately. I did some tests, and it worked so well that I decided to start integrating it into the engine with newfound enthusiasm.
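For the curious, such a test is only a handful of lines. A minimal sketch (using the Assimp C++ API – header names and post-process flags may differ slightly between versions, and the flags below are just an example):

// Minimal Assimp smoke test: load a scene and see what's inside.
#include <assimp/Importer.hpp>
#include <assimp/scene.h>
#include <assimp/postprocess.h>
#include <cstdio>

bool LoadScene(const char* path)
{
  Assimp::Importer importer;
  const aiScene* scene = importer.ReadFile(path,
      aiProcess_Triangulate |            // only triangles, please
      aiProcess_GenSmoothNormals |       // make sure every mesh has normals
      aiProcess_CalcTangentSpace |       // tangents for normal mapping
      aiProcess_ConvertToLeftHanded);    // D3D-style handedness and winding

  if (!scene)
  {
    printf("Import failed: %s\n", importer.GetErrorString());
    return false;
  }

  // everything lives in one unified structure under scene->mRootNode
  printf("%u meshes, %u materials, %u cameras, %u animations\n",
         scene->mNumMeshes, scene->mNumMaterials,
         scene->mNumCameras, scene->mNumAnimations);
  return true;
}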

Of course, it wasn’t all fun and games – the 3DS loader in ASSIMP is still kinda broken, but the COLLADA importer works. The COLLADA exporter that ships with MAX, however, is rather lacking; the solution in this case was OpenCOLLADA, a unified exporter for MAX and Maya, which is somewhat better. A weird thing I noticed is that I had to flip my cameras around to get them to face the right way – this might be a bug in my code, but unfortunately ASSIMP’s reference viewer doesn’t use cameras, so I have no way to check. In any case, with the loader slowly creeping into the engine, and step 2 already covered above (splines and envelopes – I’m jumping around a bit in chronology here), I was ready to tackle step 3.

Over the years I’ve done some intermediate rendering stuff, but I never really built a comprehensive engine where everything was combined with everything, and this time I really wanted to do something that “just works”. My wishlist was simple:

  • Forward kinematics animation. This is easy on paper – it comes from step 1 – but of course the whole row-major/column-major/nodegraph/LHS/RHS matrix quagmire can take up a bit of time to sort out.
  • “Correct” lighting, i.e. lighting done in the tool, and not the usual “it looks kinda okay if I change this to 0.5” debacle.
  • Normal maps. I’m a huge sucker for bump maps, and I’m sure I’ll die happy if they’re the last thing I see.
  • Motion blur. Ever since the Realtime Generation invitation I’ve been infatuated with realtime motion blur, because I realized what a nice subtle film-like quality it adds to everything.
  • Shadowing. Again, not a big thing, but it does make a considerable difference.

2012.02.24. – Shadow maps up and running.

None of this sounds like a big deal, but I wasn’t sure how to manage it all, as far as rendering order, render targets, etc. go. The first “a-ha!” moment came when I started looking into D3DXEffect. Granted, some 7-8 years after its release, perhaps it was bloody time, but I guess I was always wary of particularly high-level stuff. This time, it came like a breath of fresh air – all the tedious pass management is in there, and more. The only problem was render targets.

Luckily, I came across a little note in the SDK about annotations. Annotations are essentially metadata that allow the coder to attach variables and values to passes and then read them back when the passes are run. This gave me the freedom I needed; I built my little system like so:

  • During initialization, the effect plugin builds the required render targets and depth buffers, and stores them in a dictionary, designated by some sort of identifier e.g. “shadowmap” or “velocitybuffer”. (Later I built a facility where the effects would be able to request render targets from the engine core, and the core would either allocate a new one or hand out one of the older ones, depending on the parameters. This cut down on the video memory usage considerably.)
  • The .fx file is built in a way that all the passes designate which buffer they want to render to. The actual rendering code then loops through the passes, sets up the render target, and pushes out all the draw calls. Also, since I designed .fx files to be handled as resources, I was able to switch between them easily during editing, changing one rendering chain to another.
  • Because I wanted to do some compositing / VFX / etc., I extended this by adding another flag to specify whether the current pass is a “scene”-pass or a “post-processing”-pass. In the latter case, instead of pushing out the scene’s 3D geometry, the engine would simply render a 2D quad, and I’d be able to use the results of the previous passes as input.

Down the line, I also realized that if I wanted to integrate “hard-coded” effects into the chain, I could just insert a pass where I render the geometry and specify an annotation saying that, for example, I want to render the color pass into a certain buffer but use the depth values of the previous render. That way, my “effect” would be correctly occluded by the geometry around it.

To this end, I even introduced flags to specify that during certain passes, while the matrix setup would remain the same, the effect would “skip” rendering the 3D scene-graph and instead render the custom effect code. Additionally, I subclassed my renderer and made sure that all the “custom” code stayed in a separate class – this wasn’t strictly necessary, but it kept my actual engine code lean and effective, and more importantly, reusable.

An example “technique” for one of the custom passes would end up looking something like this:

technique StandardRender
{
  // render the scene depth into the shadow map
  pass passShadow<string renderTarget="shadow";>
  {
    AlphaBlendEnable = false;
    vertexShader = compile vs_2_0 vsDepthToColor();
    pixelShader  = compile ps_2_0 psDepthToColor();
  }
  // the custom "blob" objects cast shadows too (blobs=true switches to the custom geometry)
  pass passBlobObjectsShadow<bool blobs=true; string clear=""; string renderTarget="shadow";>
  {
    vertexShader = compile vs_2_0 vsDepthToColor();
    pixelShader  = compile ps_2_0 psDepthToColor();
  }
  // screen-space motion vectors for the motion blur
  pass passVelocity<string renderTarget="velocitybuffer";>
  {
    vertexShader = compile vs_2_0 vsVelocity();
    pixelShader  = compile ps_2_0 psVelocity();
  }
  pass passBlobObjectsVelocity<bool blobs=true; string clear=""; string renderTarget="velocitybuffer";>
  {
    vertexShader = compile vs_2_0 vsVelocity();
    pixelShader  = compile ps_2_0 psVelocity();
  }
  // main color pass for the scene-graph
  pass passRender<string renderTarget="colorbuffer";>
  {
    vertexShader = compile vs_3_0 vsRender();
    pixelShader  = compile ps_3_0 psRender();
  }
  // the blobs go into their own buffer, but reuse the depth of the color pass for occlusion
  pass passBlobObjects<bool blobs=true; string clear="color"; string renderTarget="blobbuffer"; string depthTarget="colorbuffer";>
  {
    vertexShader = compile vs_3_0 vsRenderBlobObjects();
    pixelShader  = compile ps_3_0 psRenderBlobObjects();
  }
  // chained blur passes over the blob buffer, with increasing radius
  pass passBlur<bool postProcess=true; string renderTarget="blurbuffer";>
  {
    vertexShader = compile vs_3_0 vsPostProcess();
    pixelShader  = compile ps_3_0 psBlur( 2.0, samBlobmap );
  }
  pass passBlur2<bool postProcess=true; string renderTarget="blur2buffer";>
  {
    vertexShader = compile vs_3_0 vsPostProcess();
    pixelShader  = compile ps_3_0 psBlur( 4.0, samBlurmap );
  }
  pass passBlur3<bool postProcess=true; string renderTarget="blurbuffer";>
  {
    vertexShader = compile vs_3_0 vsPostProcess();
    pixelShader  = compile ps_3_0 psBlur( 8.0, samBlur2map );
  }
  // combine everything into the backbuffer
  pass passComposite<bool postProcess=true; string renderTarget="backbuffer";>
  {
    vertexShader = compile vs_2_0 vsPostProcess();
    pixelShader  = compile ps_2_0 psComposite();
  }
}

Nice and clean, and more importantly: doesn’t require touching the source.
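On the engine side, the loop that consumes those annotations isn’t much more complicated either. Here’s a simplified sketch (D3D9 / ID3DXEffect; the “clear” and “depthTarget” annotations and all error handling are left out, and RenderScene() / RenderFullscreenQuad() stand in for the actual draw code):

#include <d3dx9.h>
#include <map>
#include <string>

void RenderScene(IDirect3DDevice9* device, ID3DXEffect* effect);
void RenderFullscreenQuad(IDirect3DDevice9* device);

void RenderTechnique(IDirect3DDevice9* device, ID3DXEffect* effect, D3DXHANDLE technique,
                     std::map<std::string, IDirect3DSurface9*>& renderTargets)
{
  effect->SetTechnique(technique);

  UINT numPasses = 0;
  effect->Begin(&numPasses, 0);
  for (UINT i = 0; i < numPasses; i++)
  {
    D3DXHANDLE pass = effect->GetPass(technique, i);

    // read the annotations off the pass
    LPCSTR targetName = NULL;
    effect->GetString(effect->GetAnnotationByName(pass, "renderTarget"), &targetName);
    BOOL postProcess = FALSE;
    effect->GetBool(effect->GetAnnotationByName(pass, "postProcess"), &postProcess);

    // look up the render target by its identifier and bind it
    if (targetName)
      device->SetRenderTarget(0, renderTargets[targetName]);

    effect->BeginPass(i);
    if (postProcess)
      RenderFullscreenQuad(device);   // post-processing pass: just a 2D quad
    else
      RenderScene(device, effect);    // scene pass: push out the scene-graph draw calls
    effect->EndPass();
  }
  effect->End();
}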

2012.02.26. – Motion blur and lighting.

As far as the actual shading/effects go, I chose simple methods: I restricted myself to a single spotlight source (which worked well with the concept I had in mind), so I was able to do rudimentary PCF shadow mapping, which is jaggy as hell, but for some reason worked really well with the gritty textures I was making. For normal maps, I used simple hand-painted stuff and ran it through NVIDIA’s Texture Tools.

For the motion blur, I used basic velocity buffering, which I achieved by doubling all my matrices: before I render anything, I pre-transform all my node matrices according to my animation values. (Since the cameras and lights are all just nodes in the scene-graph, they transform with it for free and come with all sorts of fun perks – but more on this later.) Then I do the same transformation calculation again, but with an epsilon value subtracted from my animation timer. This gives me two matrices for each draw call, one slightly behind in time compared to the other. For the velocity pass, I simply calculate the two screen-space positions and their difference vector, store that in a D3DFMT_G32R32F buffer, and later use it as a directional blur input. This method is obviously not perfect, but my assumption was that motion blur is only visible when there’s fast motion, so while it doesn’t necessarily look good on a screenshot, in motion it does give the demo that subtle film-like feel I was looking for.
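In code, the matrix-doubling part boils down to something like this sketch – EvaluateNodeTransform() is a hypothetical stand-in for whatever computes a node’s world matrix from the animation data, and the shader parameter names are made up:

#include <d3dx9.h>

struct Node;  // whatever the scene-graph node type is
D3DXMATRIX EvaluateNodeTransform(const Node& node, float animTime);

const float MOTION_BLUR_EPSILON = 1.0f / 60.0f;  // how far "back in time" the second matrix sits

void SetVelocityMatrices(ID3DXEffect* effect, const Node& node, float animTime,
                         const D3DXMATRIX& view, const D3DXMATRIX& projection)
{
  // the same node transform evaluated "now" and a tiny bit earlier
  D3DXMATRIX worldNow  = EvaluateNodeTransform(node, animTime);
  D3DXMATRIX worldPrev = EvaluateNodeTransform(node, animTime - MOTION_BLUR_EPSILON);

  D3DXMATRIX wvpNow  = worldNow  * view * projection;
  D3DXMATRIX wvpPrev = worldPrev * view * projection;

  // the vertex shader projects each vertex with both matrices and writes the
  // screen-space difference into the G32R32F velocity buffer
  effect->SetMatrix("matWVP",     &wvpNow);
  effect->SetMatrix("matWVPPrev", &wvpPrev);
}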

So I had my basic engine and tool-chain ready. It was time to sit down and come up with what I considered the hard part: content.

[to be continued, in theaters near you]

About the author, Gargaj / Conspiracy

Polyurethane audio breeder / semi-organic code regurgitation trooper.
