Working Around Constructors in HLSL (or lack thereof)

As of today, HLSL doesn’t officially support constructors for user-created classes. In the meantime, with the current tools at our disposal, maybe there is a way to work around this limitation and emulate constructor functionality ourselves?

It turns out we can!

This post describes how to implement constructor-like* functionality in HLSL. The following implementation relies on Microsoft’s DirectX Shader Compiler (DXC) and works for default constructors and constructors with a variable number of arguments.

Note: constructor-like is essential here. Not all functionality is supported, but enough to make it meaningful and worth your while.

Constructor Support in HLSL

Currently, constructors are only available for native HLSL types such as vector and matrix types:

bool2    u = bool2(true, false);
float3   v = float3(1, 2, 3);
float4x4 m = float4x4(1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1);

For example, the following is not supported (yet):

class MyClass 
{
public:
    // Default Constructor
    MyClass()
    {
        x = 0;
        b = false;
        i = 0;
    }

    // Constructor with a single parameter
    MyClass(const float inX) : MyClass()
    {
        x = inX;
    }

    // Constructor with multiple parameters
    MyClass(const float inX, const bool inB, const int inI)
    {
        x = inX;
        b = inB;
        i = inI;
    }

private:
    // Member variables
    float x;
    bool  b;
    int   i; 
};

MyClass a = MyClass();              // Default Constructor
MyClass b = MyClass(3.0f);          // Constructor with a single parameter
MyClass c = MyClass(3.0f, true, 7); // Constructor with multiple parameters

At compilation, several errors will appear due to the lack of functionality.

So, typically, workarounds come in different shapes or forms:

// Initializes the whole class to 0
MyClass a = (MyClass)0;

// Initializes an instance of the class via another function, 
// with single parameter
MyClass CreateMyClass(const float x) 
{ 
    MyClass a; 
    a.x = x; 
    return a; 
}
MyClass b = CreateMyClass(3.0f);

// Initializes an instance of the class via another function, 
// with multiple parameters
MyClass CreateMyClass(float inX, bool inB, int inI) 
{ 
    MyClass a; 
    a.x = inX; 
    a.b = inB; 
    a.i = inI; 
    return a; 
}
MyClass c = CreateMyClass(3.0f, true, 7);

While this might be fine for a simple case, it really adds up in bigger codebases. Also, wouldn’t it be great if these functions could live together with the class they initialize like a typical constructor does? HLSL supports public classes implemented via structs, where member functions are possible. For sure we can do better.

Implementing HLSL Constructors

It turns out that constructors can be emulated in DXC using variadic macros, which its LLVM/Clang-based preprocessor supports, and a bit of elbow grease. Let’s adjust our previous example:

#define MyClass(...) static MyClass ctor(__VA_ARGS__)
class MyClass
{
    // Default Constructor
    MyClass() 
    { 
        return (MyClass)0; 
    }

    // Constructor with a single parameter
    MyClass(const float x) 
    { 
        MyClass a = (MyClass)0; // MyClass() would be macro-expanded here
        a.x = x;
        return a;
    }

    // Constructor with multiple parameters
    MyClass(const float x, const bool b, const int i)
    {
        MyClass a;
        a.x = x;
        a.b = b;
        a.i = i;
        return a;
    }

    // Member variables
    float x;
    bool  b;
    int   i; 
};
#define MyClass(...) MyClass::ctor(__VA_ARGS__)

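Note: public and private are reserved keywords in HLSL, but access specifiers are not implemented, so they are omitted here. Currently, classes behave like structs: all members are public.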

How Does it Work?

First, on line 1, the macro matches the signature one would expect from a constructor. The variadic ... and __VA_ARGS__ allow constructor-like member functions with any number of parameters. During preprocessing, every following occurrence of MyClass(…) is replaced with static MyClass ctor(...), so each "constructor" in the class body becomes a static member function named ctor that returns an instance.

Then, line 33 redefines the macro so that MyClass(...) now expands to MyClass::ctor(...), a call to the matching static member function. From that point on we can write MyClass(...) directly, which creates and initializes an instance of that class:

MyClass a = MyClass();              // Default Constructor
MyClass b = MyClass(3.0f);          // Constructor with a single parameter
MyClass c = MyClass(3.0f, true, 7); // Constructor with multiple parameters

Success!
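Since HLSL cannot be run outside a shader compiler, here is a small C++ sketch of the same preprocessor mechanics that you can build and step through on the CPU. It is an illustration under a few assumptions, not the DXC code path: the #undef is my addition (it avoids a macro-redefinition diagnostic), the constructors call ctor() directly because the first macro is still active inside the class body, and constructorDemo is a hypothetical helper.

```cpp
#include <cassert>

// While this definition is active, `MyClass(args)` in the class body
// is rewritten to `static MyClass ctor(args)`.
#define MyClass(...) static MyClass ctor(__VA_ARGS__)
struct MyClass
{
    // Default "constructor": becomes static MyClass ctor()
    MyClass()
    {
        MyClass a;
        a.x = 0.0f;
        a.b = false;
        a.i = 0;
        return a;
    }

    // Single parameter: becomes static MyClass ctor(const float inX)
    MyClass(const float inX)
    {
        MyClass a = ctor(); // the macro is still active, so call ctor() by name
        a.x = inX;
        return a;
    }

    // Multiple parameters
    MyClass(const float inX, const bool inB, const int inI)
    {
        MyClass a;
        a.x = inX;
        a.b = inB;
        a.i = inI;
        return a;
    }

    float x;
    bool  b;
    int   i;
};
// From here on, `MyClass(args)` is a call to the matching overload of ctor.
#undef MyClass
#define MyClass(...) MyClass::ctor(__VA_ARGS__)

// Hypothetical helper exercising the three overloads.
void constructorDemo()
{
    MyClass a = MyClass();
    MyClass b = MyClass(3.0f);
    MyClass c = MyClass(3.0f, true, 7);
    assert(a.x == 0.0f);
    assert(b.x == 3.0f);
    assert(c.i == 7);
}
```

Overload resolution does the rest: the argument list at the call site selects which ctor runs, which is what makes the single macro work for any number of parameters.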

Further Simplification

If one wants to eliminate the generic zero-ing default constructor, further simplification is possible with this macro:

#define DEFAULT_CONSTRUCTOR(Type) static Type ctor() { return (Type)0; }

This optional macro further helps remove code duplication, especially as one implements many classes throughout a big HLSL codebase.

#define MyClass(...) static MyClass ctor(__VA_ARGS__)
class MyClass
{
    // Default Constructor
    DEFAULT_CONSTRUCTOR(MyClass);

    ...
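To see how the helper macro composes with the two per-class macros, here is a C++ sketch of the same idea. The names Particle, mass, id and particleDemo are hypothetical, Type{} stands in for the HLSL (Type)0 cast (which C++ does not accept for structs), and the #undef is again my addition.

```cpp
#include <cassert>

// Generic zero-initializing default "constructor".
#define DEFAULT_CONSTRUCTOR(Type) static Type ctor() { return Type{}; }

#define Particle(...) static Particle ctor(__VA_ARGS__)
struct Particle
{
    // Expands to: static Particle ctor() { return Particle{}; }
    DEFAULT_CONSTRUCTOR(Particle)

    // Expands to: static Particle ctor(const float inMass)
    Particle(const float inMass)
    {
        Particle p = ctor(); // start from the zeroed instance
        p.mass = inMass;
        return p;
    }

    float mass;
    int   id;
};
#undef Particle
#define Particle(...) Particle::ctor(__VA_ARGS__)

// Hypothetical helper showing both call forms.
void particleDemo()
{
    Particle p = Particle();
    Particle q = Particle(2.0f);
    assert(p.mass == 0.0f);
    assert(q.mass == 2.0f);
}
```

Passing the bare type name to DEFAULT_CONSTRUCTOR is safe even while the first macro is active, because a function-like macro only expands when its name is followed by an opening parenthesis.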

Wrapping-Up & Additional Thoughts

This post demonstrates an implementation of constructor-like functionality in HLSL that works with out-of-the-box DXC.

Woohoo!

Now, as you probably noticed, it lacks some functionality to fully cover what constructors enable. But it’s close!

In the event where you have your own HLSL parser, you might be able to work around this whole problem altogether. You could, for example, as a precompilation step, parse your entire codebase and create, build, and call constructors with unique signatures to prevent name collisions. In my case, I wanted to build something that would work out of the box with vanilla DXC. This is what the previous examples solve.

Also, it would be nice if recursive macros were supported. One could use recursive macros to generalize the two #define above and possibly eliminate the various return calls. Unfortunately, recursive macros are not available.

Either way, I’ve been experimenting with this approach for a few months now, and I have found it quite helpful. I find it cleans up usage of user-created classes and brings us one step closer. I hope you find it helpful too!

Until we get native support for constructors in HLSL, please post in the comments if you manage to improve or simplify this approach further or stumble on a more straightforward way. Thanks!

PS: Thanks to Jon Greenberg for reviewing this small blog post.

Gathering Feedback: Open Problems in Real-Time Raytracing

For HPG 2018‘s keynote (co-located / two days before SIGGRAPH 2018) I’ll be discussing some of the latest advances in game raytracing, but most notably some of the open problems.

With DXR making raytracing more accessible, and bringing us one step closer to “real-time raytracing for the masses”, the gap between offline and real-time is getting significantly smaller. That said, tailoring existing offline raytracing approaches to real-time doesn’t happen overnight, can’t be done 1:1, and isn’t free of compromises, as many of you saw in our GDC/DigitalDragons PICA PICA presentations. Existing offline approaches are definitely not free of problems, as raytracing literature and algorithms have originally been designed with offline in mind.

HPG is a great forum for discussing this sort of thing, since lots of folks in research are definitely interested in what DXR can enable for their research, want to know what problems we are trying to solve, and how their research can be adopted by the games industry.

That said, I would appreciate any feedback from fellow developers & researchers about what you think are the most important open problems in real-time raytracing. Already have a few, but definitely interested in hearing your thoughts on the matter.

Feel free to answer here, tweet at me, or privately. Additionally, if you’re around Vancouver for SIGGRAPH you should consider attending HPG. Schedule is shaping up to be pretty awesome! 🙂

So, what’s your #1 open problem with real-time raytracing?

GDC Retrospective and Additional Thoughts on Real-Time Raytracing

This post is part of the series “Finding Next-Gen“.

Just got back from GDC. Had a great time showcasing the hard work we’ve been up to at SEED. In case you missed it, we did two presentations on real-time raytracing:

DirectX Raytracing Announcement (Microsoft) and Shiny Pixels and Beyond: Real-Time Raytracing at SEED (NVIDIA)

In case you were at GDC and saw the presentation, you can skip directly here.

During the first session Matt Sandy from Microsoft announced DirectX Raytracing (DXR). He went into great detail over the API changes, and showed how DirectX 12 has evolved to support raytracing. We then followed with our own presentations, where we showcased Project PICA PICA, a real-time raytracing experiment featuring a mini-game for self-learning AI agents in a procedurally-assembled world. The team has worked super hard on this demo, and the results really show it! 🙂

PICA PICA is powered by DXR.

DirectX Raytracing?

The addition of raytracing to DirectX 12 is exposed via simple concepts: acceleration structures (bottom & top), new shader types (ray-generation, closest-hit, any-hit, and miss), new HLSL types and intrinsics, commandlist-level DispatchRays(…) and a raytracing pipeline state. You can read more about it here.

Taken from our presentation, here’s a brief overview of how this works in PICA PICA:

Using Bottom/top acceleration structures and shader table (from GDC slides)

Ray Generation Shadow – HLSL Pseudo Code – Does Not Compile (from GDC slides)

While you don’t necessarily need to use DXR to do real-time raytracing on current GPUs (see Sebastian Aaltonen’s Claybook rendering presentation), it’s a flexible new tool in the toolbox. From the code above, you benefit from the fact that it’s unified with the rest of DirectX 12. DXR relies on well-known HLSL functionality and types, allowing you to share code between rasterization, compute and raytracing. More than just raytracing, DXR also lets you solve sparse and incoherent problems that you can’t easily solve with rasterization and compute. It’s also a centralized implementation for hardware vendors to optimize, and it becomes a common language for every developer that wants to do raytracing in DirectX 12. It’s not perfect, but it’s a good start and it works well.

Presentation Retrospective

During the presentation we talked about our hybrid rendering pipeline where rasterization, compute and raytracing work together:

PICA PICA’s Hybrid Rendering Pipeline (from GDC slides)

Our hybrid approach allows us to solve, develop and apply several interesting techniques and algorithms that rely on rasterization, compute or raytracing while balancing quality and performance. This shows the flexibility of the API, where one is free to choose a specific pipeline to solve a specific problem. Again since raytracing is another tool in the toolbox, it can be used where it makes sense and doesn’t prevent you from using other available pipelines.

First we talked about how we raytrace reflections from the G-Buffer at half resolution, reconstruct at full resolution, and how it allows us to handle varying levels of roughness. We also presented our multi-layer material system, shared between rasterization, compute and raytracing.

Raytraced Reflections (left) and Multi-Layer Materials (right) (from GDC slides)

We then followed by describing a novel texture-space approach for order-independent transparency, translucency and subsurface scattering:

Glass and Translucency (from GDC slides)

We then presented a sparse surfel-based approach where we use raytracing to pathtrace irradiance from surfels spawned from the camera.

Surfel-based Global Illumination (from GDC slides)

We also covered ambient occlusion (AO), and how raytraced AO compares to screen-space AO.

Inspired by Schied/NVIDIA’s Spatiotemporal Variance-Guided Filtering (SVGF), we also presented a super-optimized denoising filter specialized for soft shadows with varying penumbra.

Surfel-based Global Illumination (from GDC slides)

Finally we talked about how we handle multiple GPUs (mGPU) and split the frame, relying on the first GPU to act as an arbiter that dispatches work to secondary GPUs in parallel fork-join style.

mGPU in PICA PICA (from GDC slides)

All in all, it was a lot of content for the time slot we had. In case you want more info, check out the presentation:

You can also download the slides: Powerpoint and PDF. You can also watch the presentation live here (starts around 21:30).

Here are a few additional links that talk about DirectX Raytracing and Project PICA PICA:

Microsoft: Announcing Microsoft DirectX Raytracing
Ars Technica: DirectX Raytracing is the first step toward a graphics revolution
PC World: Microsoft’s DirectX Raytracing paves the way for lifelike gaming, the graphics holy grail
PC Gamer: What Microsoft’s DirectX Raytracing means for gaming
Anandtech: Expanding DirectX 12: Microsoft Announces DirectX Raytracing
Rock Paper Shotgun: EA’s Project Pica Pica leads new wave of photorealistic ray-tracing graphics demos at GDC 2018
Eurogamer: EA just showed off Project Pica Pica, an adorable new tech demo
Polygon: Nvidia’s latest tech will enable ‘cinematic-quality’ graphics — on unannounced GPUs

Additional Thoughts

As mentioned at GDC we’ve had the chance to be involved early with DXR, to experiment and provide feedback as the API evolved. Super glad to have been part of this initiative. We still have a lot to explore, and the future is exciting! Some additional thoughts:

Noise vs Ghosting vs Performance

DXR opens the door to an entirely new class of techniques that have never been achieved in games. With real-time raytracing it feels like the upcoming years will be about managing complex tradeoffs, such as noise, ghosting, quality vs performance. While you can add more samples to reduce noise (and improve convergence) during stochastic sampling, it decreases performance. Alternatively you can reuse samples from previous frames (via temporal filtering), but it can add ghosting. It feels like achieving the right balance here will be important. As DXR gets adopted in games this topic will generate a lot of good presentations at conferences.
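To make the noise-versus-ghosting tradeoff concrete, here is a minimal C++ sketch of one common form of temporal filtering, an exponential moving average. It is not necessarily the filter used in PICA PICA; the names temporalAccumulate, framesToConverge and the alpha blend factor are assumptions for illustration. The second function measures how many frames the history needs to catch up after the signal jumps from 0 to 1, which is the lag that shows up on screen as ghosting.

```cpp
#include <cassert>
#include <cmath>

// Blend the new (noisy) sample into the accumulated history.
// alpha close to 1 trusts the new frame: responsive but noisy.
// alpha close to 0 trusts the history: smooth but slow to react.
float temporalAccumulate(float history, float current, float alpha)
{
    return history + alpha * (current - history);
}

// Frames needed for the history to get within `tolerance` of a signal
// that jumps from 0 to 1. Assumes 0 < alpha <= 1 and tolerance > 0.
int framesToConverge(float alpha, float tolerance)
{
    float history = 0.0f;
    int frames = 0;
    while (std::fabs(1.0f - history) > tolerance)
    {
        history = temporalAccumulate(history, 1.0f, alpha);
        ++frames;
    }
    return frames;
}

// Hypothetical check: heavier history reuse means a longer lag.
void temporalDemo()
{
    assert(framesToConverge(0.1f, 0.05f) > framesToConverge(0.5f, 0.05f));
}
```

With a 5% tolerance, alpha = 0.5 catches up in 5 frames while alpha = 0.1 needs 29, so the setting that averages away the most noise is also the one that trails the furthest behind a change.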

Comparing Against Ground Truth

We also mentioned that we built our own pathtracer inside our framework. This pathtracer acts as a reference implementation, which we can toggle at any point when working on a feature for our hybrid renderer. This allows us to rapidly compare results, and see how a feature looks against ground truth. Since a lot of code is shared between the reference and the various hybrid techniques, no significant additional maintenance is required. At the end of the day, having a reference implementation will help you make the best decision in order to achieve the balance between quality and performance for your (hybrid) techniques.

If raytracing is new to you and building a reference ray/pathtracer is of interest, many books and online resources are available. Peter Shirley’s Ray Tracing in One Weekend is quite popular. You should check it out! 🙂

Specialized Denoising and Reconstruction

Also mentioned during the presentation, we built a denoising filter specialized for soft penumbra shadows. While one can use general denoising algorithms like SVGF on the whole image, building a denoising filter around a specific term will undeniably achieve greater quality and performance. This is true since you can really customize the filter around the constraints of that term. In the near future one can expect that significant time and energy will be spent on specialized denoisers, and custom reconstruction of stochastically sampled terms.

DXR Interop

As mentioned earlier we share a lot of code between raytracing, rasterization and compute. In the event where one wants to bake lightmaps inside their engine (see Sébastien Hillaire‘s talk on Real-Time Raytracing For Interactive Global Illumination Workflows in Frostbite), DXR is very appealing because you can evaluate your actual HLSL material shaders. No need for (limited) parameter conversion, which is often necessary when using an external lightmap baking tool.

This is awesome!

Wrapping-up

Even though the API is there and available to everyone, this is just the beginning. It’s an important tool going forward that will enable new techniques in games, and could end up pushing the industry to new heights. I’m looking forward to the new techniques that evolve from everyone having access to DXR, and what kind of rendering problems get solved. I also find it quite appealing for the research community to be able to try and solve problems closer to the realm of real-time raytracing, where researchers can implement their solutions using a raytracing API that everyone can use.

Because it’s unified, it should also be easy for you to pick up the API, experiment and integrate it in your own engine. Again, one doesn’t need this API to do real-time raytracing, but it provides a really nice package and a common language that all DirectX 12 developers can talk around. It’s also a clear focal point for hardware makers’ optimization efforts. Also, compute hasn’t really changed in a while, so hopefully these improvements will drive improvements in compute and in the pipelines as well. That being said, the API is obviously not perfect, and is still at the proposal stage. Microsoft is open to additional feedback and discussion. Try it out and send your feedback!

Can’t wait to see what you will do with DXR! 🙂

SIGGRAPH 2017 – Past, Present and Future Challenges of Global Illumination in Games

This post is part of the series “Finding Next-Gen“.

Just got back from Los Angeles, where I presented in the Open Problems in Real-Time Rendering Course at this year’s SIGGRAPH:

Global illumination (GI) has been an ongoing quest in games. The perpetual tug-of-war between visual quality and performance often forces developers to take the latest and greatest from academia and tailor it to push the boundaries of what has been realized in a game product. Many elements need to align for success, including image quality, performance, scalability, interactivity, ease of use, as well as game-specific and production challenges.

First we will paint a picture of the current state of global illumination in games, addressing how the state of the union compares to the latest and greatest research. We will then explore various GI challenges that game teams face from the art, engineering, pipelines and production perspective. The games industry lacks an ideal solution, so the goal here is to raise awareness by being transparent about the real problems in the field. Finally, we will talk about the future. This will be a call to arms, with the objective of uniting game developers and researchers on the same quest to evolve global illumination in games from being mostly static, or sometimes perceptually real-time, to fully real-time.

You can also download my slides with notes here.

Super grateful to have been part of this initiative. Lots of great content was presented. Thanks to everyone who came to the course!

HLSL to ISPC

For the past few weeks, I’ve been exploring ISPC (Intel SPMD Program Compiler) and experimenting with a few ideas I have in mind around CPU and GPU interop that work well with the SPMD (single program, multiple data) model.

Along the way I felt like something was missing. What if I was able to write ISPC kernels in a way that I’m super familiar with, such as HLSL?

And so I’ve created this HLSL-to-ISPC helper library: a utility library with helper types and functions to provide similar syntax to HLSL inside the ISPC programming environment.

I’ve used it for the following mini-projects:

ispc-smallpt
ispc-mandelbrot
ispc-flower
ispc-worley

The first project (ispc-smallpt) is an ISPC implementation of Kevin Beason’s famous smallpt path tracer. The following two are shadertoys (originally from Inigo Quilez) that got converted to ISPC. Finally, the fourth example is an implementation of Worley cellular noise.

All of the previous use the HLSL-to-ISPC helper library. It’s also a great way to validate that the library works, with minimal alteration to the original code as it gets transformed to ISPC.

The library is not complete, but good enough for now to get started. Please check the Github page for upcoming features and updates. I plan to keep improving it during the following months, with additional test cases, mini-projects and new features. Moreover there is a lot to be said regarding CPU/GPU interop, and I hope to find some time in the following months to chat more about it.

In the meantime, please check out the library and let me know what you think, whether you find any bugs, or even if you want to contribute.

Thanks!

Hexagonal Bokeh Blur Revisited – Part 4: Rhombi Overlap

This post is from a multi-part series titled “Hexagonal Bokeh Blur Revisited”. Want to jump directly to the index? If so, click here.

Another common artifact is the Y-shaped pattern of overlapping rhombi:

Y-shaped Artifact

From the first post in this series, you might remember our blur function:

float4 BlurTexture(sampler2D tex, float2 uv, float2 direction)
{
    float4 finalColor = 0.0f;
    float blurAmount = 0.0f;
 
    // This offset is important. Will explain later. ;)
    uv += direction * 0.5f;
 
    for (int i = 0; i < NUM_SAMPLES; ++i)
    {
        float4 color = tex2D(tex, uv + direction * i);
        color *= color.a;
        blurAmount += color.a; 
        finalColor += color;
    }
 
    return (finalColor / blurAmount);
}

The half-sample offset (the uv += direction * 0.5f line) is what prevents this issue.

Rhombi Overlap (Left) vs Proper Alignment (Right)

Steve Hill reminded me that this was actually mentioned in the notes on slide 15:

We also apply a half sample offset to stop overlapping rhombi. Otherwise you’ll end up with a double brightening artifact in an upside Y shape.
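One way to read this is in terms of where the taps land along the blur direction. The C++ sketch below is an illustration, not the shader itself: the names tapPositions and hitsSharedEdge are assumptions. It lists tap positions in units of direction, matching uv + direction * (i + offset) in BlurTexture. Without the offset, the first tap of every blur sits exactly on the starting pixel, which lies on the edge that neighbouring rhombi share. With the half-sample offset, each tap is centred in its own interval and none lands on that edge.

```cpp
#include <cassert>
#include <vector>

// Tap positions along the blur direction, in units of `direction`.
// startOffset = 0.0f reproduces the version without the fix,
// startOffset = 0.5f reproduces `uv += direction * 0.5f`.
std::vector<float> tapPositions(int numSamples, float startOffset)
{
    std::vector<float> taps;
    for (int i = 0; i < numSamples; ++i)
        taps.push_back(startOffset + static_cast<float>(i));
    return taps;
}

// True if any tap sits exactly on the starting pixel (position 0),
// which every rhombus touching that pixel would also sample.
bool hitsSharedEdge(const std::vector<float>& taps)
{
    for (float t : taps)
        if (t == 0.0f)
            return true;
    return false;
}

// Hypothetical check comparing the two variants.
void offsetDemo()
{
    assert(hitsSharedEdge(tapPositions(4, 0.0f)));
    assert(!hitsSharedEdge(tapPositions(4, 0.5f)));
}
```

For four samples the taps move from 0, 1, 2, 3 to 0.5, 1.5, 2.5, 3.5, so the blur covers the same length but no longer double-counts its starting point.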

As you can see, it’s easily solvable! 😀

Hexagonal Bokeh Blur Revisited – Part 3: Additional Features: Rotation

This post is from a multi-part series titled “Hexagonal Bokeh Blur Revisited”. Want to jump directly to the index? If so, click here.

So far, we’ve shown how to build a separable hexagonal blur in two passes. While the shape is interesting in its basic form, one can definitely change it.

For example: rotation!

GRWL Rotated Hexagonal Bokeh Depth-of-Field in Ghost Recon Wildlands

It also works really nicely with a ton of them!

Separable Hexagonal Bokeh Blur – Demo On Github

It’s Actually Quite Simple…

While this might sound obvious to many of you out there, I’ve had 2 people mention on separate occasions that they had issues achieving this. Might be with the way they approached the hexagonal blur, but with our separable approach it’s actually quite simple.

Just offset your angles and let the trigonometry do its magic.

float2 blurDir = coc * invViewDims * float2(cos(angle + PI/2), sin(angle + PI/2));
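As a quick sanity check of why a single angle offset is enough, the C++ sketch below evaluates the three blur directions used in this series (PI/2, -PI/6 and -5*PI/6) with the same rotation added to each. The names Float2, blurDirection, dot and kPi are assumptions for illustration, and the coc * invViewDims scale is left out since it does not affect the angles. The directions stay unit length and 120 degrees apart, so the hexagon rotates without changing shape.

```cpp
#include <cassert>
#include <cmath>

constexpr float kPi = 3.14159265f;

struct Float2 { float x; float y; };

// Unit blur direction for a base angle plus a user-chosen rotation,
// mirroring float2(cos(angle + base), sin(angle + base)) in the shader.
Float2 blurDirection(float baseAngle, float rotation)
{
    return { std::cos(baseAngle + rotation), std::sin(baseAngle + rotation) };
}

float dot(Float2 a, Float2 b)
{
    return a.x * b.x + a.y * b.y;
}

// Hypothetical check: any rotation keeps the directions 120 degrees apart,
// i.e. their pairwise dot product stays at cos(120 degrees) = -0.5.
void rotationDemo(float rotation)
{
    const Float2 d0 = blurDirection(kPi / 2.0f, rotation);
    const Float2 d1 = blurDirection(-kPi / 6.0f, rotation);
    const Float2 d2 = blurDirection(-5.0f * kPi / 6.0f, rotation);
    assert(std::fabs(dot(d0, d1) + 0.5f) < 1e-5f);
    assert(std::fabs(dot(d1, d2) + 0.5f) < 1e-5f);
    assert(std::fabs(dot(d0, d2) + 0.5f) < 1e-5f);
}
```

The important part is that all three directions receive the same offset. Rotating only one of them would skew the rhombi and break the hexagon.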

Hexagonal Bokeh Blur Revisited – Part 2: Improved 2-pass Version

This post is from a multi-part series titled “Hexagonal Bokeh Blur Revisited”. Want to jump directly to the index? If so, click here.

As seen previously, we can achieve this blur in a pretty straightforward fashion in three passes. The code below demonstrates an improvement over such approach, by achieving the blur in two passes. Since it builds on the previous post, make sure to read it beforehand.

If this is obvious to you, I invite you to skip to the next part.

Step 1 – Combined Vertical & Diagonal Blur

We have MRTs, so let’s combine both blurs in the same pass.

struct PSOUTPUT
{
    float4 vertical : COLOR0;
    float4 diagonal : COLOR1;
};

// Get the local CoC to determine the radius of the blur.
float coc = tex2D(sceneTexture, uv).a; 

// CoC-weighted vertical blur.
float2 blurDir = coc * invViewDims * float2(cos(PI/2), sin(PI/2));
float4 color = BlurTexture(sceneTexture, uv, blurDir) * coc; 

// CoC-weighted diagonal blur.
float2 blurDir2 = coc * invViewDims * float2(cos(-PI/6), sin(-PI/6));
float4 color2 = BlurTexture(sceneTexture, uv, blurDir2) * coc;

// Output to MRT
PSOUTPUT output;
output.vertical = float4(color.rgb, coc);
output.diagonal = float4(color2.rgb + output.vertical.xyz, coc);

Much simpler! Also means we don’t have to read a temporary (vertical) buffer unlike in the previous 3-pass approach, since we’re doing this all at once.

Step 2 – Rhomboid Blur

The final step is the rhomboid blur. This is similar to the 3-pass approach. Again, this is done in two parts: via a 30 degrees (-PI/6) blur, as well as its reflection at 150 degrees (-5PI/6).

// Get the center to determine the radius of the blur
float coc = tex2D(verticalBlurTexture, uv).a;
float coc2 = tex2D(diagonalBlurTexture, uv).a;

// Sample the vertical blur (1st MRT) texture with this new blur direction
float2 blurDir = coc * invViewDims * float2(cos(-PI/6), sin(-PI/6));
float4 color = BlurTexture(verticalBlurTexture, uv, blurDir) * coc;

// Sample the diagonal blur (2nd MRT) texture with this new blur direction
float2 blurDir2 = coc2 * invViewDims * float2(cos(-5*PI/6), sin(-5*PI/6));
float4 color2 = BlurTexture(diagonalBlurTexture, uv, blurDir2) * coc2;
 
float3 output = (color.rgb + color2.rgb) * 0.5f;

Well That Was Kind of Obvious…

Yup! Just making sure. Details provided for posterity, and I’ll also be building on this part and the previous for the upcoming sections.

Again, a code sample is provided here. You should be able to toggle between both versions and see… that there is no difference.

Hexagonal Bokeh Blur Revisited – Part 1: Basic 3-pass Version

This post is from a multi-part series titled “Hexagonal Bokeh Blur Revisited“. Want to jump directly to the index? If so, click here.

The code below demonstrates the most straightforward way to achieve this blur. It is done in 3 passes.

Animation

Step 0 – Blur Function

First, let’s define our blur function. This will be reused along the way.

float4 BlurTexture(sampler2D tex, float2 uv, float2 direction)
{
    float4 finalColor = 0.0f;
    float blurAmount = 0.0f;
 
    // This offset is important. Will explain later. ;)
    uv += direction * 0.5f;
 
    for (int i = 0; i < NUM_SAMPLES; ++i)
    {
        float4 color = tex2D(tex, uv + direction * i);
        color *= color.a;
        blurAmount += color.a; 
        finalColor += color;
    }
 
    return (finalColor / blurAmount);
}

Step 1 – Vertical Blur

First, we blur vertically.

Combined1

// Get the local CoC to determine the radius of the blur.
float coc = tex2D(sceneTexture, uv).a; 

// CoC-weighted vertical blur.
float2 blurDirection = coc * invViewDims * float2(cos(PI/2), sin(PI/2));
float3 color = BlurTexture(sceneTexture, uv, blurDirection).rgb * coc;

// Done!
return float4(color, coc);

Step 2 – Diagonal Blur

Second we blur diagonally.

This step is similar to Step 1, but now with a 30 degree (-PI/6) angle. We also combine the diagonal blur with the vertical blur.

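// Get the local CoC to determine the radius of the blur.
float coc = tex2D(sceneTexture, uv).a;

// CoC-weighted diagonal blur of the scene (same input as the 2-pass version)
float2 blurDir = coc * invViewDims * float2(cos(-PI/6), sin(-PI/6));
float4 color = BlurTexture(sceneTexture, uv, blurDir) * coc;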

// Combine with the vertical blur 
// We don't need to divide by 2 here, because there is no overlap 
return float4(color.xyz + tex2D(verticalBlurTexture, uv).rgb, coc);

Which gives:

4_SceneBottomRight

Step 3 – Rhomboid Blur

The final step is the rhomboid blur.

This is done in two parts: via a 30 degrees (-PI/6) blur, as well as its reflection at 150 degrees (-5PI/6).

// Get the center to determine the radius of the blur
float coc = tex2D(verticalBlurTexture, uv).a;
float coc2 = tex2D(diagonalBlurTexture, uv).a;

// Sample the vertical blur (1st MRT) texture with this new blur direction
float2 blurDir = coc * invViewDims * float2(cos(-PI/6), sin(-PI/6));
float4 color = BlurTexture(verticalBlurTexture, uv, blurDir) * coc;

// Sample the diagonal blur (2nd MRT) texture with this new blur direction
float2 blurDir2 = coc2 * invViewDims * float2(cos(-5*PI/6), sin(-5*PI/6));
float4 color2 = BlurTexture(diagonalBlurTexture, uv, blurDir2) * coc2;

// And we're done!
float3 output = (color.rgb + color2.rgb) * 0.5f;

Putting It All Together

Animation

As you can see, the code listed previously is pretty straightforward and should be a good base for you to achieve this blur. Additionally a code sample is provided here.

We can do better. Let’s do it in 2 passes!

Hexagonal Bokeh Blur Revisited

Hello Bokeh, My Old Friend

I’ve come to talk with you again…

It’s been a while. The last time we spoke to each other was back at SIGGRAPH 2011 in the Advances in Real-Time Rendering course with John White.

Separable Hexagonal Bokeh Depth-of-Field in Need For Speed: The Run

You’ve Been Around

Back then, we didn’t give out the code on how to achieve this effect. It turns out many developers out there were still able to realize it, solely based on John’s slides and notes!

Watch Dogs – Ubisoft

Ghost Recon Wildlands – Ubisoft

Tom Clancy’s The Division – Ubisoft

Mortal Kombat X – Netherrealm (WB Games)

Sniper Elite 2 – Rebellion

NBA 2K 2014

NBA 2K 2017

Gears of War: Ultimate Edition

Mikkel Gjoel’s – ShaderToy

Evan Wallace’s – WebGL Lens Filter

We Meet Again?

While the technique presented six years ago has been showcased in a myriad of games and can be easily implemented in a straightforward way, it turns out some of the implementations out there have unfortunate visual artifacts. 😦

Luckily these artifacts can be easily solved! 😀

Back then, for the sake of time John left out information regarding circle-of-confusion management, as well as other details. If not handled correctly, this omission could lead to some unwanted artifacts, like in the images above. Again it’s been six years since the presentation, so I feel it’s time we set everything straight and clear these artifacts out.

The goal behind this post is to provide an “artifact-free” implementation, or rather a code companion complementary to the SIGGRAPH 2011 presentation. The code is hosted on Github.

I’m super busy, I want to blog more and I want to make this manageable, so this post is split into multiple parts. Once the whole series is done, I might collapse all the parts into something more concise.

In the meantime, thanks for stopping by! 🙂

Index

Part 1 – Basic 3-pass Version
Part 2 – Improved 2-pass Version
Part 3 – Additional Features: Rotation
Part 4 – Managing Visual Artifacts: Rhombi Overlaps
Part 5 – Managing Visual Artifacts: CoC
Part 6 – Additional Features: Improved Sampling and Other Tricks
Part 7 – Putting It All Together: Demo
Part 8 – Forward-Looking…

Acknowledgements

Thanks to John White for coming up with the original idea of “scatter-gather” separable hexagonal bokeh depth of field by rhomboid decomposition. I really miss the days when we used to work together, back when I was on Battlefield 3 and he was on NFS: The Run. Crazy-but-good times with lots of good exchanges. We shared a lot, and I sure learned a lot. Thanks John! 🙂

References

WHITE, John, and BARRÉ-BRISEBOIS, Colin. More Performance! Five Rendering Ideas From Battlefield 3 and Need For Speed: The Run, Advances in Real-Time Rendering in Games, SIGGRAPH 2011. Available Online.