Deformable Snow and DirectX 11 in Batman: Arkham Origins


It’s been a while, but I finally found some time for a quick post rounding up the presentations I gave this year at the Game Developers Conference (GDC) and NVIDIA’s GPU Technology Conference (GTC). These presentations showcase and explain some of the features developed for Batman: Arkham Origins.

Below you will find a quick description of each talk, as well as links to the slides and accompanying video. The GTC presentation is an extended version of the GDC one, covering additional features as well as the integration of NVIDIA GameWorks.

Feel free to leave comments if you have any questions. :)

GDC 2014 – Deformable Snow Rendering in Batman: Arkham Origins

This talk presents a novel technique for rendering surfaces covered with fallen, deformable snow, featured in Batman: Arkham Origins. Scalable from current-generation consoles to next-generation consoles and high-end PCs, the technique allows for visually convincing and organically interactive deformable snow surfaces everywhere characters can stand, walk, fight or fall. It is extremely fast, has a low memory footprint, and can be used extensively in an open-world game. We explain how this technique is novel in its approach to acquiring arbitrary deformation, and present all the details required for implementation. Moreover, we share the results of our collaboration with NVIDIA, and how it allowed us to bring this technique to the next level on PC using DirectX 11 tessellation. Attendees will learn about a fast, low-memory-footprint technique for rendering surfaces with deformable snow, one that adds interaction between players and the world, depicts iconic and organic deformable-snow visuals, and makes a good case for supporting tessellation in a DX11 game with minimal editing and art tweaks.

GDC 2014 – Deformable Snow Rendering in Batman: Arkham Origins (slides)
GDC 2014 – Deformable Snow Rendering in Batman: Arkham Origins (video)

GTC 2014 – DirectX 11 Rendering and NVIDIA GameWorks in Batman: Arkham Origins

This talk presents several rendering techniques behind Batman: Arkham Origins (BAO), the third installment in the critically acclaimed Batman: Arkham series, focusing on DirectX 11 features developed in collaboration with NVIDIA specifically for the high-end PC enthusiast. We present tessellation, and how it significantly improves the visuals of Batman’s iconic cape while bringing our deformable snow technique from the consoles to the next level on PC. We also cover physically-based particles with PhysX, particle fields with Turbulence, improved shadows, temporally stable dynamic ambient occlusion, bokeh depth-of-field and improved anti-aliasing. Additionally, we showcase other improvements to image quality, visual fidelity and compression, such as improved detail normal mapping via Reoriented Normal Mapping, and how chroma subsampling at various stages of our lighting pipeline was essential in doubling the size of our open world while still fitting on a single DVD.

GTC 2014 – DirectX 11 Rendering and NVIDIA GameWorks in Batman: Arkham Origins (slides)

Blending Normal Maps?

– What is the best way to blend two normal maps together?
– Why can’t I just add two normal maps together in Photoshop? I heard that to combine two normals together, you need to add the positive components and subtract the negative components, then renormalize. Looks right to me…
– Why shouldn’t I be using Overlay (or a series of Photoshop blend modes) to blend normal maps together? 

– I want to add detail to surfaces. How does one combine normal maps in real-time so that the detail normal map follows the topology described by the base normal map?

These are valid questions that come up again and again, from one game project to the next.

There are a lot of approaches out there that try to tackle normal map blending. Some do it better than others: the good ones are mathematically sound and also suitable for real-time use. One can also find many techniques that are purely ad hoc and non-rigorous, yet have unfortunately been accepted by the game development art community as savoir-faire when it comes to normal map blending. :(

If this is something you’ve heard before, or something you’ve asked yourself, check out this article, written with Stephen Hill (@self_shadow), on the topic of blending normal maps. We go through the various techniques out there, and present a neat alternative (“Reoriented Normal Mapping”). Our mathematically based approach to normal map blending retains more detail and performs at a similar instruction cost to existing techniques. We also provide code, and a real-time demo to compare all the techniques. This is by no means a complete analysis – particularly as we focus on detail mapping – so we might return to the subject at a later date and tie up some loose ends. In the meantime, we hope you find the article useful. Please let us know in the comments!

Blending in Detail

Approximating Translucency Revisited – With “Simplified” Spherical Gaussian Exponentiation


Lately, someone at work pointed out the approximation of translucency that Marc Bouchard and I developed back at EA [1], which ended up in DICE’s Frostbite engine [2] (aka the Battlefield 3 engine). Wanting to know more, we started browsing the slides one by one and revisiting the technique. Looking at the HLSL, an optimization came to mind, which I’ll discuss in this post. In case you missed the technique, here are a few cool screenshots made by Marc, as well as tips & tricks on implementing the technique and generating the inverted ambient-occlusion/thickness map. See the references for additional links.


As mentioned in [2], the approximation of translucency is implemented as such:

// fLTDistortion = Translucency Distortion Scale Factor
// fLTScale = Scale Factor
// fLTThickness = Thickness (from a texture, per-vertex, or generated)
// fLTPower = Power Factor
// fLTAmbient = Ambient translucency term (view-independent)
half3 vLTLight = vLight + vNormal * fLTDistortion;
half fLTDot = pow(saturate(dot(vEye, -vLTLight)), fLTPower) * fLTScale;
half3 fLT = fLightAttenuation * (fLTDot + fLTAmbient) * fLTThickness;
half3 cLT = cDiffuseAlbedo * cLightDiffuse * fLT;

In parallel, as mentioned in Christina Coffin’s talk on tile-based deferred shading for PS3 [3], Matthew Jones (from Criterion Games) provided an optimization for computing a power function (i.e. exponentiation, as used in specular lighting) using a Spherical Gaussian approximation. This was also documented by Sébastien Lagarde [4][5]. pow is generally implemented as follows:

// Generalized Power Function
float pow(float x, float n)
{
    return exp(log(x) * n);
}

The Spherical Gaussian approximation replaces the log(x) and the exp(x) by an exp2(x). The specular power (n) is also scaled and biased by 1/ln(2):

// Spherical Gaussian Power Function
float pow(float x, float n)
{
    n = n * 1.4427f + 1.4427f; // 1.4427f --> 1/ln(2)
    return exp2(x * n - n);
}

If possible, you should handle the scale and bias offline, or somewhere else. Additionally, if you have to compute the scale and bias at runtime but don’t really care what exact number is passed as the exponent, a quick hack is to get rid of the scale and bias altogether. While this is not something you necessarily want to do with physically-based BRDFs – where exponents are tweaked based on surface types – in the case where you (or artists) are visually tweaking results (i.e. for ad hoc techniques, such as this approximation of translucency), this is totally fine. In our case, artists don’t care whether the value is 8 or 12.9843 (8 * 1.4427 + 1.4427); they just want a specific visual response, and it saves ALU. Again, this is not to be used for all cases of pow(x, n), but you should try it with other techniques. You’d be surprised how often people won’t see a difference. :)

In the end, after injecting the “Simplified” Spherical Gaussian approximation in our translucency technique, we get:

// fLTDistortion = Translucency Distortion Scale Factor
// fLTScale = Scale Factor
// fLTThickness = Thickness (from a texture, per-vertex, or generated)
// fLTPower = Power Factor
// fLTAmbient = Ambient translucency term (view-independent)
half3 vLTLight = vLight + vNormal * fLTDistortion;
half fLTDot = exp2(saturate(dot(vEye, -vLTLight)) * fLTPower - fLTPower) * fLTScale;
half3 fLT = fLightAttenuation * (fLTDot + fLTAmbient) * fLTThickness;
half3 cLT = cDiffuseAlbedo * cLightDiffuse * fLT;


[1] BARRÉ-BRISEBOIS, Colin and BOUCHARD, Marc. “Real-Time Approximation of Light Transport in Translucent Homogenous Media”, GPU Pro 2, Wolfgang Engel, Ed. Charles River Media, 2011.

[2] BARRÉ-BRISEBOIS, Colin and BOUCHARD, Marc. “Approximating Translucency for a Fast, Cheap and Convincing Subsurface Scattering Look”, GDC 2011, available online.

[3] COFFIN, Christina. “SPU-based Deferred Shading for Battlefield 3 on Playstation 3”, GDC 2011, available online.

[4] LAGARDE, Sébastien. “Adopting a physically based shading model”, Personal Blog, available online.

[5] LAGARDE, Sébastien. “Spherical Gaussian approximation for Blinn-Phong, Phong and Fresnel”, Personal Blog, available online.

A Taste of Live Code Editing With Visual Studio’s Tracepoints

Needless to say, compile times vary greatly from one software project to another. When debugging, we often spend a significant amount of time changing a few lines of code, recompiling, waiting and then relaunching the software. While it is true that one has to adapt their debugging “style” from one project to another (i.e. having a thorough understanding of the problem before “poking around” the code is definitely a plus when dealing with big code bases where edit-and-continue is not possible), waiting for the compiler and linker when debugging is never fun. It’s also very unproductive.

In parallel, some of the tools we use every day allow us to greatly improve our debugging efficiency, though sometimes we are completely oblivious to the power available to us. Regardless of how fast or slow compile and link times are on your current project, any tool that can mitigate the recompile cycle when debugging is welcome. One such tool is Visual Studio’s tracepoint: a breakpoint with a custom action associated with it.

Many think tracepoints are only useful for outputting additional “on-the-fly” debugging information. Of course tracepoints can be used to add extra printf’s as you’re stepping through code (or quickly spinning inside loops and various functions), but they also have the lesser-known ability to “execute code”. Case in point: I can’t actually recall when this feature was added, but it’s been news to most people I’ve talked to about it…

As we can see in the (very useful :)) code above, two breakpoints and a tracepoint are set. To convert a breakpoint to a tracepoint, right-click it, then select the “When Hit…” option:

The following window will open:

Microsoft has put up several pages explaining how one can use tracepoints to format and output information (using the special keywords mentioned above, among others). While it is true that you can use tracepoints to fetch the content of variables (and output them in the Output window), it is not clearly mentioned that tracepoints also allow you to programmatically modify variables, in real time.

If we go back to the previous code example, let’s say that while debugging we realize the while-loop should stop after 100 iterations (rather than 512). Instead of editing the code and recompiling, we can simply set up a tracepoint that will update the done variable.

By setting { done = (i == 100); } along with Continue execution, the done variable gets updated on every iteration of the loop. The loop then stops when the condition is fulfilled.

You can also chain several statements on the same line; each simply has to be wrapped in its own curly braces:

i.e. { {done = (i == 100);} { object.x -= 1.0f; } { data[15] = 3; } }

While this is a very simple example, it also works well with arrays of simple types, as well as structures. Although the scope of what you can do is limited – for instance, you can’t go wild and call functions – the ability to quickly update variables and behavior on the fly definitely adds to the debugging process. See how this applies to your codebase, and try using this feature the next time you debug code. I can guarantee you will enjoy the ability to temporarily change calculations and behavior without having to recompile!

Many thanks to Stephen Hill (Ubisoft) for reviewing this post.

Approximating Translucency – Part II (addendum to GDC 2011 talk / GPU Pro 2 article)


Thanks to everyone who attended my GDC talk! Was quite happy to see all those faces I hadn’t seen in a while, as well as meet those whom I only had contact with via Twitter, IM or e-mail.

For those who contacted me post-GDC, it seems the content I submitted for GPU Pro 2 didn’t make it into the final samples archive. I must’ve submitted it too late, or it didn’t make it to the editor. Either way, the code in the paper is the most up-to-date, so you should definitely check it out (and/or simply buy the book)!

Roger Cordes sent the following questions. I want to share the answers, since they cover most of the questions people had after the talk:

1. An “inward-facing” occlusion value is used as the local thickness of the object, a key component of this technique. You mention that you render a normal-inverted and color-inverted ambient occlusion texture to obtain this value. Can you give any further detail on how you render an ambient occlusion value with an inverted normal? I am attempting to use an off-the-shelf tool (xNormal) to render ambient occlusion from geometry that has had its normals inverted, and I don’t believe my results are consistent with the Hebe example you include in your article. A related question: is there a difference between rendering the “inward-facing” ambient occlusion versus rendering the normal (outward-facing) ambient occlusion and then inverting the result?

Yes, it is quite different actually. :)

If we take the cube (from the talk) as an example, when rendering the ambient occlusion with the original normals you basically end up with white on all sides, since there’s nothing really occluding each face (or, approximately, the hemisphere of each face, oriented by the face normal). Now, if you flip the surface normals, you end up with a different value at each vertex, because the inner faces that meet at each vertex create occlusion between each other. Basically, by flipping the normals during the AO computation, you are averaging “occlusion” inside the object. The cube is not the best example for this, though, because it’s pretty uniform, and the demo with the cubes at GDC didn’t really have a map on each object.

In the case of Hebe, it’s quite different:

You can see in the image above how the nose is “whiter” than the other, “thicker” parts of the head. And this is the key behind the technique: this map represents the local variation of thickness on an object. Transposed to the head example, this means that the polygons inside the head generate occlusion between each other: the closer they are to each other, the more occlusion we get. If we get a lot of occlusion, we know the faces are close to each other, which means that this area of the object is generally “thin”.

Inverting the AO computation doesn’t give the same result, because we’re not comparing the same thing: the surface is “flipped”, and faces are now oriented towards the inside of the mesh.

If we compare this with regular AO, it’s really not the same thing, and we can clearly see why:

Even color-inverted, you can see that the results are not the same. The occlusion on the original mesh comes from its outside hull, whereas the normal-inverted version gathers occlusion from the inside hull. These are totally different results (and, in a way, totally different shapes).

2. In the GDC talk, some examples were shown of the results from using a full RGB color in the local thickness map (e.g. for human skin). The full-color local thickness map itself was not shown, though. I’m curious how these assets are authored, and wonder if you have any examples you might share?

For this demo, we basically just generated the inverted AO map and multiplied it by a nice, uniform, saturated red (the same kind of red you see when you put your finger in front of a light). It can be more complex: artists can also paint the various colors they want in order to get better results (in this case, some shades of orange and red, rather than a uniform color). Nonetheless, we felt it was good enough for this example. With proper tweaking, I’m sure you can make it look even better! :)

To illustrate how we generated the translucency/thickness maps for the objects in the talk, here’s another example with everyone’s favorite teapot:

We used 3D Studio MAX. To generate inverted normals, we use the Normal modifier and select Flip normals:

Make sure your mesh is properly UV-unwrapped. We then use Render To Texture with Mental Ray to generate the ambient occlusion map for the normal-inverted object:

Notice how the Bright and Dark are inverted (compared to the default values). Here are some general rules/tips for generating a proper translucency/thickness map:

  • Make sure to select Ambient Occlusion when adding the object as output, and pick the proper UV channel
  • Set the Dark color to the maximum value you want for translucency (white = max, black = min)
  • Set the Bright color to the minimum value you want for translucency (i.e. we use 50% gray, because we want minimal translucency everywhere on the object)
  • Play with the values in the yellow rectangle to get the results you want
  • Max Dist is the distance traveled by the rays: make it bigger for a big object, smaller for a small one
  • Once you roughly have what you want, increase the number of samples and the resolution for improved visuals (i.e. less noise)

That is all for now. Hope this helps! :)

GDC 2011 – Approximating Translucency for a Fast, Cheap and Convincing Subsurface Scattering Look

As presented at GDC 2011, here’s my (and the legendary Marc Bouchard’s) talk on our real-time approximation of translucency, featured in the Frostbite 2 engine (used for DICE’s Battlefield 3). These are the slides we presented, along with audio. Enjoy! :)



Marc and I would like to thank the following people for their time, reviews and constant support:

For those we managed to meet, we had such a good time with all of you at GDC. Always happy to interact with passionate game developers – this is what makes our industry so great! We hope to see you soon again! :)

GDC 2011 Talks You Should Attend

As seen in the previous post, I’ll be presenting at GDC 2011. We also have several AMAZING speakers from EA (Electronic Arts) whose talks you should attend:

SPU-based Deferred Shading in BATTLEFIELD 3 for Playstation 3


Christina Coffin (DICE), @ChristinaCoffin


This session presents a detailed programmer oriented overview of our SPU based shading system implemented in DICE’s Frostbite 2 engine and how it enables more visually rich environments in BATTLEFIELD 3 and better performance over traditional GPU-only based renderers. We explain in detail how our SPU Tile-based deferred shading system is implemented, and how it supports rich material variety, High Dynamic Range Lighting, and large amounts of light sources of different types through an extensive set of culling, occlusion and optimization techniques.


Attendees will learn how SPU based shading allows a rich variety in materials, more complex lighting and enables offloading of traditional GPU work over to SPUs. Optimization techniques used to minimize SPU processing time for various scenarios will also be taught. Attendees will understand how to technically design, balance and analyze the performance of a game environment that uses an SPU based shading system. Attendees will learn key points of creating and optimizing code and data processing for high throughput shading on SPUs.

[Intended Audience]

This session is intended for advanced programmers with an understanding of current forward and deferred rendering techniques, as well as console development experience. Knowledge of lower-level programming with vector intrinsics, assembly language, and structure-of-arrays versus array-of-structures data processing is recommended.


Lighting You Up in BATTLEFIELD 3


Kenny Magnusson (DICE)


This session presents a detailed overview of the new lighting system implemented in DICE’s Frostbite 2 engine and how it enables us to stretch the boundaries of lighting in BATTLEFIELD 3, with its highly dynamic, varied and destructible environments. BATTLEFIELD 3 goes beyond the lighting limitations found in our previous Battlefield games, avoiding costly static prebaked lighting without compromising quality. We discuss the technical implementation of the art direction in BATTLEFIELD 3, the workflows we created for it, and how all the individual lighting components fit together: deferred rendering, HDR, dynamic radiosity and particle lighting.


Attendees will learn the workflow we use to light our worlds, as well as memory and performance considerations to hit our performance budgets from a technical art perspective. Attendees will also get a thorough insight into an exciting new approach to lighting both open landscapes and indoor environments with dynamic radiosity in a fully destructible world.

[Intended Audience]

Attendees should understand the fundamentals of lighting systems used in contemporary game development, as well as the basic principles of rendering technology. Primarily directed at technical artists and rendering programmers, the presentation is accessible enough that anyone attending will gain an insight into the world of lighting.


Advanced Visual Effects with DirectX 11


Johan Andersson (DICE, @repi), Evan Hart (NVIDIA), Richard Huddy (AMD), Nicolas Thibieroz (AMD), Cem Cebenoyan (NVIDIA), Jon Story (AMD), John McDonald (NVIDIA), Jon Jansen (NVIDIA), Holger Grün (AMD), Takahiro Harada (Havok) and Nathan Hoobler (NVIDIA)


Brought to you through the collaboration of the industry’s leading hardware and software vendors, this day-long tutorial provides an in-depth look at the Direct3D technologies in DirectX 11 and how they can be applied to cutting-edge PC game graphics for GPUs and APUs. This year we focus exclusively on DirectX 11, examining a variety of special effects that illustrate its use in real game content. This will include detailed presentations from AMD’s and NVIDIA’s demo and developer support teams, as well as some of the top game developers who ship real games into the marketplace. In addition to illustrating the details of rendering advanced real-time visual effects, this tutorial will cover a series of vendor-neutral optimizations that developers need to keep in mind when designing their engines and shaders.


Attendees will gain greater insights into advanced utilization of the Direct3D 11 graphics API as used in popular shipping titles.

[Intended Audience]

The intended audience for this session is a graphics programmer who is planning or actively developing a Direct3D 11 application.


Culling the Battlefield: Data Oriented Design in Practice


Daniel Collin (DICE), @daniel_collin


This talk will highlight the evolution of the object culling system used in the Frostbite engine over the years, and why we decided to rewrite, for BATTLEFIELD 3, a system that had worked well for 4 shipping titles. The new culling system is built with a data-oriented design that favors simple data layouts, enabling very efficient computation using pipelined vector instructions. Concrete examples of how code is developed with this approach, along with the implications and benefits compared to traditional tree-based systems, will be given.


Attendees will learn how to apply data oriented design in practice to write simple but high throughput code that works well on all platforms. This is especially important for the current consoles.

[Intended Audience]

Intended for programmers on all levels but some background on vector math and basic threading would be beneficial.


Four Guns West


Ben Minto (DICE), Chuck Russom (Chuck Russom FX), Jeffrey Wesevich (38 Studios), Chris Sweetman (Splash Damage Ltd.), and Charles Maynes (Freelance)


This session aims to give an insight into the shadowy world of audio in AAA FPS titles, featuring the sound designers behind MEDAL OF HONOR, BRINK, BLACK, HBO’s THE PACIFIC, and CALL OF DUTY. The face-off is split into bite-size chunks concentrating on key areas required to design the weapon audio for a AAA shooter. Areas of focus include insight into weapons field recording headed up by Charles Maynes, sound design with Chuck Russom, creating believable worlds and mixing practices with Ben Minto, and real vs. hyper-real with Chris Sweetman. The panel will also discuss the emotional power of weapon sound design in video games and film.


New attendees will get tips and tactics on approaching audio in an FPS, which they can then apply to their own productions. It will empower producers and game designers to consider audio early in a title’s development, which will increase the player’s experience and enjoyment tenfold.

[Intended Audience]

The target audience is sound designers, producers, game designers and creatives from all aspects of video games wanting insight into the tricks behind great-sounding AAA titles. The session will be structured to allow for all levels of knowledge in the specific fields.


