Optimization

Our fragment shader runs once for every pixel of the image. At 1080p, that’s 2,073,600 times per frame. With 71 samples per pass and two passes, we end up with 294,451,200 texture fetches, square roots, and exponent calls per frame. Modern GPUs are very good at this kind of parallel work, but we can ease the burden on mobile hardware or older devices with a few tricks.
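As a quick sanity check of that figure, here is a minimal sketch of the arithmetic, written as a script you can run from the script editor with File -> Run. The helper name fetches_per_frame() is made up for illustration; it only multiplies the numbers quoted above.

```gdscript
tool
extends EditorScript

# Hypothetical helper: multiplies the pixel count by the number of
# samples taken per pixel and by the number of blur passes.
func fetches_per_frame(width: int, height: int, samples: int, passes: int) -> int:
    return width * height * samples * passes


func _run() -> void:
    # 1080p, 71 samples, one horizontal pass plus one vertical pass.
    print(fetches_per_frame(1920, 1080, 71, 2))  # 294451200
```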

Pre-computing Gaussian weights

We can improve performance by looking up each Gaussian weight instead of computing it: for a fixed sample count, the weights never change. That means no square root, no squaring, no exponent, and no function calls; just texture fetches. You could type the values in by hand, but Godot can run standalone scripts in the editor through the EditorScript class, which we can leverage to make our job easier.

Create a new script, PrecomputeGaussian.gd, and have it extend EditorScript. We’ll recreate our Gaussian function in GDScript.

tool
extends EditorScript

const SAMPLES := 71

func gaussian(x: float) -> float:
    var x_squared := x*x
    var width := 1.0 / sqrt(PI*2*SAMPLES)

    return width * exp((x_squared / (2.0 * SAMPLES)) * -1.0)

When the script runs, Godot calls the _run() function. We’ll build some code with some string manipulation and put it in the clipboard. That way, we can paste it directly into our shader.

func _run() -> void:
    var output := "const float SAMPLES = %s.0;\n" % SAMPLES
    output += "\tconst float WEIGHTS[%s] = {" % (SAMPLES if SAMPLES %2 == 0 else SAMPLES-1)

    for i in range(-SAMPLES/2.0, SAMPLES/2.0):
        output += "\n\t\t%s," % gaussian(float(i))

    output = output.rstrip(',') + "\n\t};"

    OS.clipboard = output
    print("Precomputing gaussian weights: \n%s" % output)

With the script open in the script editor, we run it using File -> Run (Ctrl+Shift+X by default). Now we can open our Gaussian blur shader and paste the generated code at the start of the fragment() function. We can then remove the old constants and the Gaussian function.

We also have to change how the for loop reads the weight: the loop counter i starts at a negative value, but array indices start at 0, so we offset i by half the sample count. Our shader should now look like this:

shader_type canvas_item;

uniform vec2 blur_scale = vec2(1, 0);

void fragment() {
    const float SAMPLES = 71.0;
    const float WEIGHTS[70] = {
        0.000008,
        0.000014,
        0.000022,
        ...etc...
    };

    vec2 scale = TEXTURE_PIXEL_SIZE * blur_scale;

    float weight = 0.0;
    float total_weight = 0.0;
    vec4 color = vec4(0.0);

    for(int i=-int(SAMPLES)/2; i < int(SAMPLES)/2; ++i) {
        int w = i + int(SAMPLES)/2;
        weight = WEIGHTS[w];
        color.rgb += texture(TEXTURE, UV + scale * vec2(float(i))).rgb * weight;
        total_weight += weight;
    }

    COLOR.rgb = color.rgb / total_weight;
}

As a fun bonus, you can see that Gaussian bell curve in action by looking through the array’s values. They start near 0, climb to a peak of roughly 0.047 around the middle, then shrink back down.
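To inspect the curve without reading the whole array, a sketch along these lines prints the first weight, the middle weight, and the sum. It reuses the gaussian() function from PrecomputeGaussian.gd; compute_weights() is a name introduced here for illustration only.

```gdscript
tool
extends EditorScript

const SAMPLES := 71

# Same formula as the gaussian() function in PrecomputeGaussian.gd.
func gaussian(x: float) -> float:
    return exp(-(x * x) / (2.0 * SAMPLES)) / sqrt(2.0 * PI * SAMPLES)


# Hypothetical helper: collects the weights the shader loop will read.
func compute_weights() -> Array:
    var weights := []
    # Integer division gives 35, so the loop covers -35 to 34 like the shader.
    var half := SAMPLES / 2
    for i in range(-half, half):
        weights.append(gaussian(float(i)))
    return weights


func _run() -> void:
    var weights := compute_weights()
    var total := 0.0
    for weight in weights:
        total += weight
    print("first: %f" % weights[0])
    print("middle: %f" % weights[weights.size() / 2])
    print("sum: %f" % total)
```

The sum does not have to be exactly 1, because the shader divides the accumulated color by total_weight.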

Side note: If you really want to squeeze every instruction out of your GPU, you could unroll the for loop and pre-calculate every part of the call to texture. In effect, you’d be hard-coding the shader using the EditorScript technique.
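One possible way to generate such an unrolled loop is sketched below. It is not taken from the original project: build_unrolled_lines() is a hypothetical name, and dividing each weight by the total up front is an assumption made here so that the shader no longer needs total_weight.

```gdscript
tool
extends EditorScript

const SAMPLES := 71

# Same formula as the gaussian() function in PrecomputeGaussian.gd.
func gaussian(x: float) -> float:
    return exp(-(x * x) / (2.0 * SAMPLES)) / sqrt(2.0 * PI * SAMPLES)


# Hypothetical helper: emits one shader line per sample, with the offset
# and the normalized weight written out as literals.
func build_unrolled_lines() -> String:
    var half := SAMPLES / 2
    var total := 0.0
    for i in range(-half, half):
        total += gaussian(float(i))

    var output := ""
    for i in range(-half, half):
        var weight := gaussian(float(i)) / total
        output += "\tcolor.rgb += texture(TEXTURE, UV + scale * %.1f).rgb * %f;\n" % [float(i), weight]
    return output


func _run() -> void:
    # Paste the result in place of the for loop in the shader.
    OS.clipboard = build_unrolled_lines()
```

With lines generated this way, the final assignment in the shader would become COLOR.rgb = color.rgb, since the weights are already normalized.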

Blurring at half scale

We could cut the work necessary to blur our scene by 75% by blurring the image at half the size but showing it at full size: half the width and half the height means a quarter of the pixels. It doesn’t matter that up-scaling makes the result blurry because it’s going to be blurry anyway! Be careful if you animate the blur or allow turning it off, though: a non-blurred image rendered at half the size won’t look as good. In those cases, you should instead look into fading in the blurred image on top of the non-blurred one.
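As a rough sketch of that fading approach, the script below animates the alpha of the node that displays the blurred texture. It rests on assumptions that are not part of this tutorial's setup: the script is attached to a TextureRect showing the blurred result, the sharp scene is drawn underneath it, and fade_blur() and fade_duration are names invented for the example.

```gdscript
extends TextureRect

# Hypothetical script for the node that displays the blurred texture.
# Assumes the sharp, unblurred scene is drawn underneath this node.

export var fade_duration := 0.4

onready var _tween := Tween.new()


func _ready() -> void:
    add_child(_tween)
    # Start fully transparent so only the sharp scene shows.
    modulate.a = 0.0


func fade_blur(show_blur: bool) -> void:
    var target_alpha := 1.0 if show_blur else 0.0
    # Cancel any fade in progress and start from the current alpha.
    _tween.stop_all()
    _tween.interpolate_property(
        self, "modulate:a", modulate.a, target_alpha,
        fade_duration, Tween.TRANS_SINE, Tween.EASE_IN_OUT
    )
    _tween.start()
```

Calling fade_blur(true) brings the blurred image in over the sharp one, and fade_blur(false) fades it back out, so the half-size image is never shown unblurred.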

Either way, we’ll have to change how we render our blurred image. ViewportContainer can’t scale down and then scale back up partway through the pipeline, but TextureRect nodes can.

Here are the steps to set up your scene with an optimized blur.

Create a new User Interface scene. Add a ViewportContainer node and name it SceneView. This node will render your base scene. Enable its Stretch property, and set its Anchor -> Right and Anchor -> Bottom to 1 to make it span over the entire screen. Also, set its Self Modulate’s alpha to 0. Otherwise, it will show up on the screen.

Add a Viewport as a child of SceneView and instantiate the scene you want to blur as a child of the Viewport.

Then, set up your blur render targets.

Add a ViewportContainer as a child of the root Control node and name it Blur1. Enable its Stretch property, set its Stretch Shrink to 2, and set its Anchor -> Right and Anchor -> Bottom to 1. Set the node’s Self Modulate alpha to 0.

The Stretch Shrink option divides the resolution of a child viewport node by the property’s value. For example, if the child viewport has a resolution of 1920x1080 and the Stretch Shrink is set to 2, the viewport will render at 960x540.
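The division, and the saving it buys, can be checked with a few lines of GDScript. This is only an illustration of the arithmetic; shrunk_size() is a made-up helper, not an engine function.

```gdscript
tool
extends EditorScript

# Hypothetical helper mirroring the division described above.
func shrunk_size(size: Vector2, shrink: int) -> Vector2:
    return (size / float(shrink)).floor()


func _run() -> void:
    var full := Vector2(1920, 1080)
    var half := shrunk_size(full, 2)
    # Fraction of pixels the blur shader no longer has to process.
    var saved := 1.0 - (half.x * half.y) / (full.x * full.y)
    print(half)  # (960, 540)
    print("%d%% fewer pixels" % int(round(saved * 100.0)))  # 75% fewer pixels
```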

Note that in Godot 3.2, the Stretch Shrink option appears to distort the image in some cases. If that happens to you, reset the value to 1 and instead, manually set the Resolution of the child Viewport node to half the game’s resolution.

Add a Viewport as a child of Blur1 and a TextureRect as its child.

On the TextureRect, enable Expand and set the Stretch Mode to Scale so the texture stretches to the node’s bounding box. Under Texture, make a new ViewportTexture, and select the Viewport under SceneView.

Set its Right and Bottom anchor to 1 to grow that bounding box to fill the viewport. Otherwise, it will not actually render at full size, or at all!

Add a new ShaderMaterial to the TextureRect, assign the Gaussian blur shader, and make sure the scale is (1, 0) so the blur applies horizontally.

Repeat the above steps but this time, name the viewport container Blur2, set the Texture of the TextureRect to the Blur1/Viewport, and set the scale of the shader so the blur applies vertically. Instead of (0, 1), we set it to (0, 0.5) to make it half as strong.

During the first blur, the shader samples the full-size texture from SceneView, so each pixel step in UV space covers one pixel of the original image. The second blur samples the output of Blur1, which is half the size because of the ViewportContainer’s Stretch Shrink, so each step covers twice the distance. Halving the scale compensates for that.
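If you would rather set that compensation from code than in the Inspector, a sketch like the following could go on each blur TextureRect. The exported properties direction and source_resolution_ratio are hypothetical names; only the blur_scale uniform comes from the shader above.

```gdscript
extends TextureRect

# Hypothetical properties, introduced for this example only.
# Blur axis: (1, 0) for the horizontal pass, (0, 1) for the vertical pass.
export var direction := Vector2(0, 1)
# Size of the sampled texture relative to full resolution:
# 1.0 when sampling SceneView, 0.5 when sampling Blur1.
export var source_resolution_ratio := 0.5


func _ready() -> void:
    var shader_material := material as ShaderMaterial
    if shader_material:
        # (0, 1) * 0.5 gives (0, 0.5), the value set by hand in the steps above.
        shader_material.set_shader_param("blur_scale", direction * source_resolution_ratio)
```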

Add a last TextureRect to the scene root and name it Presentation. Apply Layout -> Full Rect to it.

Under Texture, make a new ViewportTexture, and select the Blur2/Viewport node. Enable Expand and set Stretch Mode to Scale. If your image is upside down, enable Flip V to flip it vertically.

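Here is an example scene tree built following the steps above. The root is assumed to be named Control, and Scene stands for whichever scene you instanced.

Control
    SceneView (ViewportContainer)
        Viewport
            Scene
    Blur1 (ViewportContainer)
        Viewport
            TextureRect
    Blur2 (ViewportContainer)
        Viewport
            TextureRect
    Presentation (TextureRect)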

If any of the textures appear pixelated once stretched to fit the window, select the TextureRect nodes, expand their Texture, and under the Flags category, enable Filter. This option enables bilinear filtering of textures when scaling them.
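The same flag can, in principle, be set from a script attached to the TextureRect. Treat this as a sketch for Godot 3.x: it assumes the texture is already assigned when _ready() runs, and it has not been checked against every 3.x release.

```gdscript
extends TextureRect


func _ready() -> void:
    if texture:
        # Turn on the Filter flag while keeping any other flags as they are.
        texture.flags |= Texture.FLAG_FILTER
```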

Save the scene and revert it so that all the containers reset their sizes. The blur should then work as intended.

Mipmap blurring

We have one final technique to mention: blurring for free with mipmaps. Mipmapping is a graphics technique in which the engine keeps progressively smaller copies of a full-size texture in memory, each one half the size of the previous, until they become too small to matter. This is an integral part of Level of Detail implementations: objects that are further away don’t need to be as detailed as objects closer to the camera.

As textures are shrunk, the graphics engine can optionally filter them so their pixels are less obvious. We can use this to reduce the amount of work needed to blur.

We can replace the texture call with textureLod, which takes a third parameter: how far into the mipmap chain to sample. Higher values pick a smaller, blurrier version of the texture. To turn on mipmaps, select the original texture in the FileSystem dock and, in the Import tab, enable the Mipmaps flag, then reimport the texture.

But this technique does not work on viewport textures. If you want to blur a sprite or object, this is an option; but blurring a scene in a viewport this way does not work.
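For a sprite whose texture was imported with mipmaps, a minimal sketch of the textureLod approach could look like this. The script builds a small canvas_item shader at runtime; the blur_lod property and the lod uniform are names chosen for the example.

```gdscript
extends Sprite

# Hypothetical property: which mipmap level to sample.
# 0.0 is the full-size texture; higher values are smaller and blurrier.
export var blur_lod := 3.0


func _ready() -> void:
    var shader := Shader.new()
    shader.code = """
shader_type canvas_item;

uniform float lod = 0.0;

void fragment() {
    COLOR = textureLod(TEXTURE, UV, lod);
}
"""
    var shader_material := ShaderMaterial.new()
    shader_material.shader = shader
    shader_material.set_shader_param("lod", blur_lod)
    material = shader_material
```

This costs a single texture fetch per pixel, at the price of much less control over the shape of the blur than the Gaussian weights give.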