GDC Retrospective and Additional Thoughts on Real-Time Raytracing

This post is part of the series “Finding Next-Gen”.

Just got back from GDC. Had a great time showcasing the hard work we’ve been up to at SEED. In case you missed it, we did two presentations on real-time raytracing:

DirectX Raytracing Announcement (Microsoft) and Shiny Pixels and Beyond: Real-Time Raytracing at SEED (NVIDIA)

In case you were at GDC and saw the presentation, you can skip directly here.

During the first session, Matt Sandy from Microsoft announced DirectX Raytracing (DXR). He went into great detail about the API changes and showed how DirectX 12 has evolved to support raytracing. We then followed with our own presentations, where we showcased Project PICA PICA, a real-time raytracing experiment featuring a mini-game for self-learning AI agents in a procedurally-assembled world. The team has worked super hard on this demo, and the results really show it! 🙂

PICA PICA is powered by DXR.

DirectX Raytracing?

The addition of raytracing to DirectX 12 is exposed via simple concepts: acceleration structures (bottom-level and top-level), new shader types (ray-generation, closest-hit, any-hit, and miss), new HLSL types and intrinsics, command-list-level DispatchRays(…) and a raytracing pipeline state. You can read more about it here.

Taken from our presentation, here’s a brief overview of how this works in PICA PICA:

Using Bottom/Top Acceleration Structures and Shader Table (from GDC slides)
Ray Generation Shadow – HLSL Pseudo Code – Does Not Compile (from GDC slides)

While you don’t necessarily need DXR to do real-time raytracing on current GPUs (see Sebastian Aaltonen’s Claybook rendering presentation), it’s a flexible new tool in the toolbox. From the code above, you benefit from the fact that it’s unified with the rest of DirectX 12. DXR relies on well-known HLSL functionality and types, allowing you to share code between rasterization, compute and raytracing. Beyond raytracing itself, DXR also lets you solve sparse and incoherent problems that you can’t easily solve with rasterization and compute. It’s also a centralized implementation for hardware vendors to optimize, and it becomes a common language for every developer who wants to do raytracing in DirectX 12. It’s not perfect, but it’s a good start and it works well.

Presentation Retrospective

During the presentation we talked about our hybrid rendering pipeline where rasterization, compute and raytracing work together:

PICA PICA’s Hybrid Rendering Pipeline (from GDC slides)

Our hybrid approach allows us to solve, develop and apply several interesting techniques and algorithms that rely on rasterization, compute or raytracing while balancing quality and performance. This shows the flexibility of the API, where one is free to choose a specific pipeline to solve a specific problem. Again, since raytracing is another tool in the toolbox, it can be used where it makes sense and doesn’t prevent you from using the other available pipelines.

First we talked about how we raytrace reflections from the G-Buffer at half resolution, reconstruct at full resolution, and how it allows us to handle varying levels of roughness. We also presented our multi-layer material system, shared between rasterization, compute and raytracing.

Raytraced Reflections (left) and Multi-Layer Materials (right) (from GDC slides)

We then followed by describing a novel texture-space approach for order-independent transparency, translucency and subsurface scattering:

Glass and Translucency (from GDC slides)

We then presented a sparse surfel-based approach where we use raytracing to pathtrace irradiance from surfels spawned from the camera.

Surfel-based Global Illumination (from GDC slides)

We also covered ambient occlusion (AO), and how raytraced AO compares to screen-space AO.


Inspired by Schied/NVIDIA’s Spatiotemporal Variance-Guided Filtering (SVGF), we also presented a super-optimized denoising filter specialized for soft shadows with varying penumbra.

Soft Shadows (from GDC slides)

Finally we talked about how we handle multiple GPUs (mGPU) and split the frame, relying on the first GPU to act as an arbiter that dispatches work to secondary GPUs in parallel fork-join style.

mGPU in PICA PICA (from GDC slides)
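To make the fork-join idea a bit more concrete, here is a minimal CPU-side sketch in Python. It is only an analogy, not how PICA PICA does it: worker threads stand in for the secondary GPUs, the frame is split into horizontal strips (the split scheme is my assumption, the slides may do it differently), and `render_rows`, `split_rows` and `render_frame_fork_join` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

def render_rows(row_start, row_end, width):
    # Stand-in for real rendering: a deterministic value per pixel.
    return [[(x + y) % 256 for x in range(width)]
            for y in range(row_start, row_end)]

def split_rows(height, device_count):
    # Split the frame into contiguous strips, one per device.
    base, extra = divmod(height, device_count)
    ranges, start = [], 0
    for device in range(device_count):
        end = start + base + (1 if device < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges

def render_frame_fork_join(width, height, device_count):
    ranges = split_rows(height, device_count)
    with ThreadPoolExecutor(max_workers=device_count) as pool:
        # Fork: the arbiter dispatches one strip per device.
        futures = [pool.submit(render_rows, start, end, width)
                   for start, end in ranges]
        # Join: wait for every strip, in order.
        strips = [future.result() for future in futures]
    frame = []
    for strip in strips:
        frame.extend(strip)
    return frame
```

The joined frame should match what a single device would have rendered on its own, which is a handy sanity check when experimenting with the split.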

All in all, it was a lot of content for the time slot we had. In case you want more info, check out the presentation:

You can also download the slides: Powerpoint and PDF. You can also watch the presentation live here (starts around 21:30).

Here are a few additional links that talk about DirectX Raytracing and Project PICA PICA:

Additional Thoughts

As mentioned at GDC we’ve had the chance to be involved early with DXR, to experiment and provide feedback as the API evolved. Super glad to have been part of this initiative. We still have a lot to explore, and the future is exciting! Some additional thoughts:

Noise vs Ghosting vs Performance

DXR opens the door to an entirely new class of techniques that have never been achieved in games. With real-time raytracing, it feels like the upcoming years will be about managing complex tradeoffs between noise, ghosting, quality and performance. While you can add more samples to reduce noise (and improve convergence) during stochastic sampling, doing so decreases performance. Alternatively you can reuse samples from previous frames (via temporal filtering), but this can add ghosting. Achieving the right balance here will be important. As DXR gets adopted in games, this topic should generate a lot of good presentations at conferences.
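The noise vs ghosting tradeoff of temporal reuse can be illustrated with a tiny Python sketch. This is not our actual filter: it assumes a plain exponential moving average on a single value, and `temporal_filter`, `residual_noise`, `ghosting_frames` and the blend factor `alpha` are hypothetical names picked for the example.

```python
import random
import statistics

def temporal_filter(samples, alpha):
    # history = lerp(history, sample, alpha), applied once per frame.
    history = samples[0]
    filtered = []
    for sample in samples:
        history = (1.0 - alpha) * history + alpha * sample
        filtered.append(history)
    return filtered

def residual_noise(alpha, frames=500, sigma=0.5, seed=7):
    # Filter pure noise around a constant signal and measure what is left.
    rng = random.Random(seed)
    noisy = [rng.gauss(0.0, sigma) for _ in range(frames)]
    return statistics.pstdev(temporal_filter(noisy, alpha)[100:])

def ghosting_frames(alpha, threshold=0.9):
    # Noise-free signal that jumps from 0 to 1 after the first frame.
    # Returns how many frames the filter needs to reach the threshold.
    signal = [0.0] + [1.0] * 1000
    filtered = temporal_filter(signal, alpha)
    for frame, value in enumerate(filtered[1:], start=1):
        if value >= threshold:
            return frame
    return None
```

A small `alpha` leans heavily on history: less residual noise, but more frames before a change in the signal shows up (ghosting). A large `alpha` reacts quickly but keeps more of the noise.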

Comparing Against Ground Truth

We also mentioned that we built our own pathtracer inside our framework. This pathtracer acts as a reference implementation, which we can toggle at any point when working on a feature for our hybrid renderer. This allows us to rapidly compare results, and see how a feature looks against ground truth. Since a lot of code is shared between the reference and the various hybrid techniques, no significant additional maintenance is required. At the end of the day, having a reference implementation will help you make the best decisions when balancing quality and performance for your (hybrid) techniques.
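Beyond eyeballing the toggle, one can also put a number on the difference. The sketch below is an assumption on my part rather than something from the presentation: it computes the mean squared error and PSNR between a reference image and a hybrid result, both given as flat lists of values in the 0 to 1 range. `mse` and `psnr` are hypothetical helper names.

```python
import math

def mse(reference, test):
    # Mean squared error between two images stored as flat lists.
    if len(reference) != len(test):
        raise ValueError("images must have the same number of values")
    total = 0.0
    for ref_value, test_value in zip(reference, test):
        total += (ref_value - test_value) ** 2
    return total / len(reference)

def psnr(reference, test, peak=1.0):
    # Peak signal-to-noise ratio in dB; higher means closer to the reference.
    error = mse(reference, test)
    if error == 0.0:
        return math.inf
    return 10.0 * math.log10(peak * peak / error)
```

A single number never replaces looking at the images side by side, but it makes it easier to track whether a change moved a technique closer to or further from ground truth.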

If raytracing is new to you and building a reference ray/pathtracer is of interest, many books and online resources are available. Peter Shirley’s Ray Tracing in One Weekend is quite popular. You should check it out! 🙂

Specialized Denoising and Reconstruction

As also mentioned during the presentation, we built a denoising filter specialized for soft penumbra shadows. While one can use general denoising algorithms like SVGF on the whole image, building a denoising filter around a specific term can achieve greater quality and performance, since you can customize the filter around the constraints of that term. In the near future, one can expect that significant time and energy will be spent on specialized denoisers, and on custom reconstruction of stochastically sampled terms.

DXR Interop

As mentioned earlier, we share a lot of code between raytracing, rasterization and compute. If you want to bake lightmaps inside your engine (see Sébastien Hillaire’s talk on Real-Time Raytracing For Interactive Global Illumination Workflows in Frostbite), DXR is very appealing because you can evaluate your actual HLSL material shaders. No need for (limited) parameter conversion, which is often necessary when using an external lightmap baking tool.

This is awesome!

Wrapping-up

Even though the API is there and available to everyone, this is just the beginning. It’s an important tool going forward that will enable new techniques in games, and could end up pushing the industry to new heights. I’m looking forward to the new techniques that evolve from everyone having access to DXR, and what kind of rendering problems get solved. I also find it quite appealing for the research community to be able to try and solve problems closer to the realm of real-time raytracing, where researchers can implement their solutions using a raytracing API that everyone can use.

Because it’s unified, it should also be easy for you to pick up the API, experiment and integrate it in your own engine. Again, one doesn’t need this API to do real-time raytracing, but it provides a really nice package and a common language that all DirectX 12 developers can talk around. It’s also a clear focal point for hardware makers to concentrate their optimization efforts. Compute also hasn’t really changed in a while, so hopefully these improvements will drive improvements in compute and in the other pipelines as well. That being said, the API is obviously not perfect, and is still at the proposal stage. Microsoft is open to additional feedback and discussion. Try it out and send your feedback!

Can’t wait to see what you will do with DXR! 🙂

SIGGRAPH 2017 – Past, Present and Future Challenges of Global Illumination in Games

This post is part of the series “Finding Next-Gen”.

Just got back from Los Angeles, where I presented in the Open Problems in Real-Time Rendering Course at this year’s SIGGRAPH:

Global illumination (GI) has been an ongoing quest in games. The perpetual tug-of-war between visual quality and performance often forces developers to take the latest and greatest from academia and tailor it to push the boundaries of what has been realized in a game product. Many elements need to align for success, including image quality, performance, scalability, interactivity, ease of use, as well as game-specific and production challenges.

First we will paint a picture of the current state of global illumination in games, addressing how the state of the union compares to the latest and greatest research. We will then explore various GI challenges that game teams face from the art, engineering, pipelines and production perspective. The games industry lacks an ideal solution, so the goal here is to raise awareness by being transparent about the real problems in the field. Finally, we will talk about the future. This will be a call to arms, with the objective of uniting game developers and researchers on the same quest to evolve global illumination in games from being mostly static, or sometimes perceptually real-time, to fully real-time.

You can also download my slides with notes here.

Super grateful to have been part of this initiative. Lots of great content was presented. Thanks to everyone who came to the course!

HLSL to ISPC

For the past few weeks, I’ve been exploring ISPC (Intel SPMD Program Compiler) and experimenting with a few ideas I have in mind around CPU and GPU interop that work well with the SPMD (single program, multiple data) model.

Along the way I felt like something was missing. What if I was able to write ISPC kernels in a way that I’m super familiar with, such as HLSL?

And so I’ve created this HLSL-to-ISPC helper library: a utility library with helper types and functions to provide similar syntax to HLSL inside the ISPC programming environment.

I’ve used it for the following mini-projects:

ispc-smallpt, ispc-mandelbrot, ispc-flower and ispc-worley

The first project (ispc-smallpt) is an ISPC implementation of Kevin Beason’s famous smallpt path tracer. The following two are shadertoys (originally from Inigo Quilez) that got converted to ISPC. Finally, the fourth example is an implementation of Worley cellular noise.

All of the above use the HLSL-to-ISPC helper library. They are also a great way to validate that the library works, with minimal alteration to the original code as it gets converted to ISPC.

The library is not complete, but good enough for now to get started. Please check the Github page for upcoming features and updates. I plan to keep improving it during the following months, with additional test cases, mini-projects and new features. Moreover there is a lot to be said regarding CPU/GPU interop, and I hope to find some time in the following months to chat more about it.

In the meantime, please check out the library and let me know what you think, whether you find any bugs, or if you’d like to contribute.

Thanks!

Hexagonal Bokeh Blur Revisited – Part 4: Rhombi Overlap

This post is from a multi-part series titled “Hexagonal Bokeh Blur Revisited”. Want to jump directly to the index? If so, click here.

Another common artifact is the Y-shaped pattern of overlapping rhombi:

Y-shaped Artifact

From the first post in this series, you might remember our blur function:

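float4 BlurTexture(sampler2D tex, float2 uv, float2 direction)
{
    float4 finalColor = 0.0f;
    float blurAmount = 0.0f;

    // This offset is important. Will explain later. ;)
    uv += direction * 0.5f;

    // Accumulate NUM_SAMPLES taps along the blur direction.
    for (int i = 0; i < NUM_SAMPLES; ++i)
    {
        float4 color = tex2D(tex, uv + direction * i);
        // Weight the tap by its alpha. This scales color.a as well.
        color *= color.a;
        // Accumulate the (already weighted) alpha for normalization.
        blurAmount += color.a;
        finalColor += color;
    }

    // Normalize by the accumulated weight.
    return (finalColor / blurAmount);
}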

The half-sample offset (uv += direction * 0.5f) is what prevents this issue.

Rhombi Overlap (Left) vs Proper Alignment (Right)

Steve Hill reminded me that this was actually mentioned in the notes on slide 15:

We also apply a half sample offset to stop overlapping rhombi. Otherwise you’ll end up with a double brightening artifact in an upside Y shape.

As you can see, it’s easily solvable! 😀
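For the curious, here is a simplified way to picture why the offset helps, as a small Python sketch. It only looks at tap positions relative to the current pixel (in units of the blur step) for the two rhomboid blur directions, which is a simplification of what happens in the shader. `tap_positions` and `shared_taps` are hypothetical helpers written for this example.

```python
import math

def tap_positions(direction, num_samples, half_sample_offset):
    # Positions, relative to the current pixel, visited by one directional blur.
    start = 0.5 if half_sample_offset else 0.0
    return [((start + i) * direction[0], (start + i) * direction[1])
            for i in range(num_samples)]

def shared_taps(taps_a, taps_b, eps=1e-9):
    # Taps that both directional blurs have in common.
    return [a for a in taps_a for b in taps_b
            if abs(a[0] - b[0]) < eps and abs(a[1] - b[1]) < eps]

# The two rhomboid blur directions used in this series.
DIR_A = (math.cos(-math.pi / 6), math.sin(-math.pi / 6))
DIR_B = (math.cos(-5 * math.pi / 6), math.sin(-5 * math.pi / 6))

without_offset = shared_taps(tap_positions(DIR_A, 8, False),
                             tap_positions(DIR_B, 8, False))
with_offset = shared_taps(tap_positions(DIR_A, 8, True),
                          tap_positions(DIR_B, 8, True))
```

Without the offset, both blurs start on the very same tap (the current pixel), so that contribution gets counted twice when the results are combined. With the offset, the two sets of taps no longer share any position.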

Hexagonal Bokeh Blur Revisited – Part 3: Additional Features: Rotation

This post is from a multi-part series titled “Hexagonal Bokeh Blur Revisited”. Want to jump directly to the index? If so, click here.

So far, we’ve shown how to build a separable hexagonal blur in two passes. While the shape is interesting in its basic form, one can definitely change it.

For example: rotation!

Rotated Hexagonal Bokeh Depth-of-Field in Ghost Recon Wildlands

It also works really nicely with a ton of them!

Separable Hexagonal Bokeh Blur – Demo on GitHub

It’s Actually Quite Simple…

While this might sound obvious to many of you out there, I’ve had 2 people mention on separate occasions that they had issues achieving this. Might be with the way they approached the hexagonal blur, but with our separable approach it’s actually quite simple.

Just offset your angles and let the trigonometry do its magic. 

float2 blurDir = coc * invViewDims * float2(cos(angle + PI/2), sin(angle + PI/2));
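As a quick sanity check, here is a small Python sketch that applies the same angle offset to all three blur directions used in this series (PI/2, -PI/6 and -5PI/6). Applying the offset to all three is my reading of "offset your angles"; `rotated_blur_directions` is a hypothetical helper, and the scalar `scale` only stands in for the coc * invViewDims factor of the shader.

```python
import math

# Base angles used in the series: vertical, then the two rhomboid diagonals.
BASE_ANGLES = (math.pi / 2, -math.pi / 6, -5 * math.pi / 6)

def rotated_blur_directions(angle, scale=1.0):
    # Same rotation offset added to every base angle.
    return [(scale * math.cos(base + angle), scale * math.sin(base + angle))
            for base in BASE_ANGLES]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]
```

Whatever the rotation, the three directions stay unit length and 120 degrees apart, so the hexagon keeps its shape and simply turns.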

Hexagonal Bokeh Blur Revisited – Part 2: Improved 2-pass Version

This post is from a multi-part series titled “Hexagonal Bokeh Blur Revisited”. Want to jump directly to the index? If so, click here.

As seen previously, we can achieve this blur in a pretty straightforward fashion in three passes. The code below improves on that approach by achieving the blur in two passes. Since it builds on the previous post, make sure to read it beforehand.

If this is obvious to you, I invite you to skip to the next part.

Step 1 – Combined Vertical & Diagonal Blur

We have MRTs, so let’s combine both blurs in the same pass.


struct PSOUTPUT
{
    float4 vertical : COLOR0;
    float4 diagonal : COLOR1;
};

// Get the local CoC to determine the radius of the blur.
float coc = tex2D(sceneTexture, uv).a; 

// CoC-weighted vertical blur.
float2 blurDir = coc * invViewDims * float2(cos(PI/2), sin(PI/2));
float4 color = BlurTexture(sceneTexture, uv, blurDir) * coc; 

// CoC-weighted diagonal blur.
float2 blurDir2 = coc * invViewDims * float2(cos(-PI/6), sin(-PI/6));
float4 color2 = BlurTexture(sceneTexture, uv, blurDir2) * coc;

// Output to MRT
PSOUTPUT output;
output.vertical = float4(color.rgb, coc);
output.diagonal = float4(color2.rgb + output.vertical.xyz, coc);

Much simpler! Also means we don’t have to read a temporary (vertical) buffer unlike in the previous 3-pass approach, since we’re doing this all at once.

Step 2 – Rhomboid Blur


The final step is the rhomboid blur. This is similar to the 3-pass approach. Again, this is done in two parts: a 30-degree (-PI/6) blur, as well as its reflection at 150 degrees (-5PI/6).

// Get the center to determine the radius of the blur
float coc = tex2D(verticalBlurTexture, uv).a;
float coc2 = tex2D(diagonalBlurTexture, uv).a;

// Sample the vertical blur (1st MRT) texture with this new blur direction
float2 blurDir = coc * invViewDims * float2(cos(-PI/6), sin(-PI/6));
float4 color = BlurTexture(verticalBlurTexture, uv, blurDir) * coc;

// Sample the diagonal blur (2nd MRT) texture with this new blur direction
float2 blurDir2 = coc2 * invViewDims * float2(cos(-5*PI/6), sin(-5*PI/6));
float4 color2 = BlurTexture(diagonalBlurTexture, uv, blurDir2) * coc2;
 
float3 output = (color.rgb + color2.rgb) * 0.5f;

Well That Was Kind of Obvious…

Yup! Just making sure. Details provided for posterity, and I’ll also be building on this part and the previous for the upcoming sections.

Again, a code sample is provided here. You should be able to toggle between both versions and see… that there is no difference.

Hexagonal Bokeh Blur Revisited – Part 1: Basic 3-pass Version

This post is from a multi-part series titled “Hexagonal Bokeh Blur Revisited”. Want to jump directly to the index? If so, click here.

The code below demonstrates the most straightforward way to achieve this blur. It is done in 3 passes.


Step 0 – Blur Function

First, let’s define our blur function. This will be reused along the way.

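float4 BlurTexture(sampler2D tex, float2 uv, float2 direction)
{
    float4 finalColor = 0.0f;
    float blurAmount = 0.0f;

    // This offset is important. Will explain later. ;)
    uv += direction * 0.5f;

    // Accumulate NUM_SAMPLES taps along the blur direction.
    for (int i = 0; i < NUM_SAMPLES; ++i)
    {
        float4 color = tex2D(tex, uv + direction * i);
        // Weight the tap by its alpha. This scales color.a as well.
        color *= color.a;
        // Accumulate the (already weighted) alpha for normalization.
        blurAmount += color.a;
        finalColor += color;
    }

    // Normalize by the accumulated weight.
    return (finalColor / blurAmount);
}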
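If you want to poke at the weighting outside of a shader, here is a rough CPU port of the function in Python. It is only a sketch under a few assumptions: the texture is 1D, `sample(position)` is a hypothetical callback returning an (r, g, b, a) tuple in place of tex2D, and the guard against a zero weight is an addition that the shader above does not have.

```python
def blur_texture(sample, uv, direction, num_samples):
    # Same half-sample offset as the shader.
    uv = uv + direction * 0.5
    final_color = [0.0, 0.0, 0.0, 0.0]
    blur_amount = 0.0
    for i in range(num_samples):
        color = sample(uv + direction * i)
        alpha = color[3]
        # color *= color.a (all four channels, alpha included)
        weighted = [channel * alpha for channel in color]
        # blurAmount += color.a (the already weighted alpha)
        blur_amount += weighted[3]
        final_color = [acc + w for acc, w in zip(final_color, weighted)]
    if blur_amount == 0.0:
        # Guard added for this sketch only; not part of the shader.
        return (0.0, 0.0, 0.0, 0.0)
    return tuple(acc / blur_amount for acc in final_color)
```

Two things are easy to check with it: a constant opaque texture comes back unchanged, and taps with an alpha of zero contribute nothing to the result.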

Step 1 – Vertical Blur

First, we blur vertically.


// Get the local CoC to determine the radius of the blur.
float coc = tex2D(sceneTexture, uv).a; 

// CoC-weighted vertical blur.
float2 blurDirection = coc * invViewDims * float2(cos(PI/2), sin(PI/2));
float3 color = BlurTexture(sceneTexture, uv, blurDirection).rgb * coc;

// Done!
return float4(color, coc);

Step 2 – Diagonal Blur

Second, we blur diagonally.

This step is similar to Step 1, but now with a 30-degree (-PI/6) angle. We also combine the diagonal blur with the vertical blur.


// Get the local CoC to determine the radius of the blur.
float coc = tex2D(sceneTexture, uv).a;

// CoC-weighted diagonal blur of the scene
float2 blurDir = coc * invViewDims * float2(cos(-PI/6), sin(-PI/6));
float4 color = BlurTexture(sceneTexture, uv, blurDir) * coc;

// Combine with the vertical blur
// We don't need to divide by 2 here, because there is no overlap
return float4(color.rgb + tex2D(verticalBlurTexture, uv).rgb, coc);

Which gives:

4_SceneBottomRight

Step 3 – Rhomboid Blur

The final step is the rhomboid blur.

This is done in two parts: a 30-degree (-PI/6) blur, as well as its reflection at 150 degrees (-5PI/6).


// Get the center to determine the radius of the blur
float coc = tex2D(verticalBlurTexture, uv).a;
float coc2 = tex2D(diagonalBlurTexture, uv).a;

// Sample the vertical blur (1st MRT) texture with this new blur direction
float2 blurDir = coc * invViewDims * float2(cos(-PI/6), sin(-PI/6));
float4 color = BlurTexture(verticalBlurTexture, uv, blurDir) * coc;

// Sample the diagonal blur (2nd MRT) texture with this new blur direction
float2 blurDir2 = coc2 * invViewDims * float2(cos(-5*PI/6), sin(-5*PI/6));
float4 color2 = BlurTexture(diagonalBlurTexture, uv, blurDir2) * coc2;

// And we're done!
float3 output = (color.rgb + color2.rgb) * 0.5f;

Putting It All Together


As you can see, the code listed previously is pretty straightforward and should be a good base for you to achieve this blur. Additionally a code sample is provided here.

We can do better. Let’s do it in 2 passes!