There’s a decent chance you don’t need an anisotropic specular and a specular highlight. You might not need a highlight at all, and your object might not be metal. But we’re still using up instructions calculating them even if the end result is 0.
We’ve advised you throughout these tutorials not to use if statements, but this is where this cardinal rule gets an exception.
Shader invocations run in parallel, executing the same instructions at the same time on different input values but with the same uniform values (hence their name). A group of invocations that runs in lockstep like this is a ‘wavefront’ (or ‘warp’, depending on the vendor), a wave of data affecting vertices and fragments.
When a conditional statement comes up and the code path has to diverge, different invocations inside the wavefront have to execute different code. A wavefront can only follow one instruction stream at a time, so the GPU typically runs each path in turn with the invocations that did not take it switched off. This is an expensive operation, during which the fragments that the wave controls have to wait.
There are three types of branches: static, uniform and dynamic. In static branching, the value of the condition is already known before a single frame has been drawn with the shader:
const bool RUN_THIS_BRANCH = true;

void fragment() {
    if(RUN_THIS_BRANCH) {
    }
}
In uniform branching, the GPU does not know the value ahead of time, but it is going to be the same for every invocation of the shader, for every fragment, because it depends on a uniform value:
uniform float value;

void fragment() {
    if(value > 0.0) {
    }
}
In those two cases, there is no divergence. The GPU calling if(value > 0.0) on one fragment or another will not change the path it takes through the code.
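As a minimal sketch of these two non-divergent cases with some actual work inside them, the shader below gates a simple tint behind one static and one uniform condition. The names ENABLE_TINT, tint_strength and tint_color are hypothetical and only exist for this illustration; they are not part of the tutorial's shader.

```glsl
shader_type spatial;
render_mode unshaded;

// Static branch: the condition is a compile-time constant.
// ENABLE_TINT is a hypothetical name used only in this sketch.
const bool ENABLE_TINT = true;

// Uniform branch: the condition depends on a value set per material,
// so it is identical for every fragment in a draw call.
// Both uniforms are hypothetical.
uniform float tint_strength : hint_range(0, 1) = 0.0;
uniform vec4 tint_color : hint_color = vec4(1.0, 0.5, 0.0, 1.0);

void fragment() {
    vec3 out_color = vec3(0.5);

    if (ENABLE_TINT) {
        // Every invocation agrees on ENABLE_TINT.
        if (tint_strength > 0.0) {
            // Every invocation agrees on tint_strength too, so the whole
            // wavefront either enters this block or skips it together.
            out_color = mix(out_color, tint_color.rgb, tint_strength);
        }
    }

    ALBEDO = out_color;
}
```

With tint_strength left at 0.0, the mix is skipped for every fragment at once, which is the behaviour we want from an optional feature.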
In dynamic branching, the GPU does not know the value and it’s different from fragment to fragment:
void fragment() {
    if(UV.x <= 0.5) {
    }
}
Wavefronts that lie entirely on one side of UV.x = 0.5 will all take the same path, but somewhere along the line between the two halves there will almost certainly be a wavefront that straddles it, with some invocations running with UV.x <= 0.5 and some running with UV.x > 0.5. That wavefront diverges and pays the cost described above.
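When a per-fragment choice like this one is simple, it can often be written without a branch at all, which is the usual reason for avoiding if statements in the first place. The sketch below assumes the two halves only differ in color; left_color and right_color are hypothetical uniforms introduced for the example.

```glsl
shader_type spatial;
render_mode unshaded;

// Hypothetical colors, for illustration only.
uniform vec4 left_color : hint_color = vec4(1.0, 0.0, 0.0, 1.0);
uniform vec4 right_color : hint_color = vec4(0.0, 0.0, 1.0, 1.0);

void fragment() {
    // step(edge, x) returns 0.0 when x < edge and 1.0 otherwise.
    // With edge = UV.x and x = 0.5 this is 1.0 exactly when UV.x <= 0.5,
    // matching the condition of the dynamic branch.
    float in_left_half = step(UV.x, 0.5);

    // Every invocation runs the same instructions;
    // only the blend factor differs from fragment to fragment.
    ALBEDO = mix(right_color.rgb, left_color.rgb, in_left_half);
}
```

This trades the branch for a little arithmetic that every fragment performs. It only pays off when both sides are cheap to compute. For an expensive block that most materials never use, a uniform branch is the better fit.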
This is hardware-specific, and different eras of GPUs handle branching differently. On hardware that only supports GLES 2.0, support for uniform branching varies from vendor to vendor. Hardware that supports GLES 3.0 and the upcoming Vulkan build of Godot is almost guaranteed to support uniform branching.
Advances in hardware are even improving support for dynamic branching, and GPUs can make more accurate guesses; but as a best practice, stick with static and uniform branching, if you must branch at all.
Our shader is an all-in-one shader, and there is a definite performance gain to be had in optimizing away some of its calls.
We can hide metalness behind the metalness uniform, the specular behind specular_size, the anisotropic highlight behind anisotropy_specular_strength, and the dynamic outline behind outline_size. They all start at 0 and go up from there.
shader_type spatial;
render_mode unshaded;

//Specular constants
const float SPECULAR_SOFT_MIN = 0.0;
const float SPECULAR_SOFT_MAX = 0.64;
const float SPECULAR_HARD_MIN = 0.17;
const float SPECULAR_HARD_MAX = 0.18;

//Outline constants
const float OUTLINE_MIN = 0.45;
const float OUTLINE_MAX = 0.47;

//Anisotropic constants
const float ANISOTROPY_SHARPNESS_MIN = 0.34;
const float ANISOTROPY_SHARPNESS_MAX = 0.35;
const float ANISOTROPY_SOFTNESS_MIN = 0.06;
const float ANISOTROPY_SOFTNESS_MAX = 0.364;
const float ANISOTROPY_HOTSPOT_MIN = 0.083;
const float ANISOTROPY_HOTSPOT_MAX = 0.637;
const float ANISOTROPY_BAND_MIN = 0.42;
const float ANISOTROPY_BAND_MIDDLE = 0.5;
const float ANISOTROPY_BAND_MAX = 0.58;

//Data textures
uniform sampler2D light_data : hint_black;
uniform sampler2D specular_data : hint_black;
uniform sampler2D key_light_ramp : hint_black;
uniform sampler2D fill_light_ramp : hint_black;
uniform sampler2D kick_light_ramp : hint_black;
uniform sampler2D metalness_texture : hint_black_albedo;
uniform sampler2D high_frequency_anisotropy_noise : hint_black;
uniform sampler2D low_frequency_anisotropy_noise : hint_black;
uniform sampler2D spottiness_anisotropy_noise : hint_black;

//Outline
uniform vec4 outline_color : hint_color = vec4(0, 0, 0, 1.0);
uniform float outline_size : hint_range(0, 1) = 0.5;

//Metalness
uniform vec4 dark_metalness_color : hint_color = vec4(0, 0, 0, 1);
uniform vec4 light_metalness_color : hint_color = vec4(1, 1, 1, 1);
uniform float metalness_contrast_factor : hint_range(0, 5) = 1.0;
uniform float metalness : hint_range(0, 1) = 0.0;

//Specular
uniform float specular_softness : hint_range(0, 1);
uniform float specular_size : hint_range(0, 4);
uniform vec4 specular_color : hint_color;

//Light colors
uniform vec4 key_light_color : hint_color;
uniform vec4 shadow_color : hint_color;
uniform vec4 fill_light_color : hint_color;
uniform vec4 kick_light_color : hint_color;

//Anisotropic highlight
uniform float anisotropy_specular_width = 10.0;
uniform float anisotropy_specular_strength : hint_range(0, 1) = 0.0;
uniform float anisotropy_specular_contrast : hint_range(0, 12) = 5.0;
uniform float anisotropy_specular_brightness : hint_range(0, 2) = 0.85;
uniform float anisotropy_in_shadow_strength : hint_range(0, 1) = 0.1;

varying vec3 down_camera_angle;

void vertex() {
    down_camera_angle = (vec4(0, -1, 0, 1) * CAMERA_MATRIX).xyz;
}

void fragment() {
    //Data
    vec3 diffuse = texture(light_data, SCREEN_UV).rgb;

    //Key light
    float key_light_value = texture(key_light_ramp, vec2(diffuse.r, 0)).r;
    vec3 out_color = key_light_value * key_light_color.rgb;
    out_color = max(out_color, shadow_color.rgb);

    //Fill light
    float fill_light_value = texture(fill_light_ramp, vec2(diffuse.g, 0)).r;
    out_color += fill_light_value * fill_light_color.rgb;

    //Kick light
    float kick_light_value = texture(kick_light_ramp, vec2(diffuse.b, 0)).r;
    out_color += kick_light_value * kick_light_color.rgb;

    //Metalness
    if(metalness > 0.0) {
        vec2 metalness_uv = (NORMAL.xy * vec2(0.5, -0.5) + vec2(0.5, 0.5));
        vec3 metalness_value = texture(metalness_texture, metalness_uv).rgb;
        metalness_value = clamp(pow(metalness_value, vec3(metalness_contrast_factor)), 0, 1);
        metalness_value = clamp(metalness_value, dark_metalness_color.rgb, light_metalness_color.rgb);
        out_color = mix(out_color, metalness_value, metalness);
    }

    //Specular
    float specular = texture(specular_data, SCREEN_UV).r;
    if(specular_size > 0.0) {
        float soft_specular = smoothstep(SPECULAR_SOFT_MIN, SPECULAR_SOFT_MAX, specular * specular_size);
        float hard_specular = smoothstep(SPECULAR_HARD_MIN, SPECULAR_HARD_MAX, specular * specular_size);
        vec3 specular_out = mix(hard_specular, soft_specular, specular_softness) * specular_color.rgb;
        out_color += specular_out;
    }

    //Anisotropic highlight
    if(anisotropy_specular_strength > 0.0) {
        float anisotropy_angle = down_camera_angle.z * 0.33;
        float high_anisotropy_noise_value = (texture(high_frequency_anisotropy_noise, vec2(UV.x, 0)).r - 0.5) * 0.2;
        float low_anisotropy_noise_value = (texture(low_frequency_anisotropy_noise, vec2(UV.x, 0)).r - 0.5) * 0.2;
        float anisotropy_specular_hotspot = smoothstep(
            ANISOTROPY_HOTSPOT_MIN,
            ANISOTROPY_HOTSPOT_MAX,
            specular * anisotropy_specular_width);
        float spottiness_anisotropy_noise_value = (texture(spottiness_anisotropy_noise, vec2(UV.x, 0)).r - 0.5) * anisotropy_specular_contrast + anisotropy_specular_brightness;
        float anisotropy_uv = UV.y + high_anisotropy_noise_value + low_anisotropy_noise_value - anisotropy_angle;
        float lower_sample = smoothstep(ANISOTROPY_BAND_MIN, ANISOTROPY_BAND_MIDDLE, anisotropy_uv);
        float higher_sample = 1.0 - smoothstep(ANISOTROPY_BAND_MIDDLE, ANISOTROPY_BAND_MAX, anisotropy_uv);
        float anisotropy_sample = lower_sample * higher_sample * spottiness_anisotropy_noise_value * max(anisotropy_specular_hotspot, anisotropy_in_shadow_strength);
        float sharp_anisotropy_value = smoothstep(ANISOTROPY_SHARPNESS_MIN, ANISOTROPY_SHARPNESS_MAX, anisotropy_sample);
        float soft_anisotropy_value = smoothstep(ANISOTROPY_SOFTNESS_MIN, ANISOTROPY_SOFTNESS_MAX, anisotropy_sample);
        float anisotropy_value = mix(sharp_anisotropy_value, soft_anisotropy_value, specular_softness);
        vec3 anisotropy_color = specular_color.rgb * anisotropy_value;
        out_color = out_color + (anisotropy_color * anisotropy_specular_strength);
    }

    //Outline
    if(outline_size > 0.0) {
        float outline_factor = outline_size * (1.0 - diffuse.r);
        float rim_value = pow(dot(NORMAL, VIEW), outline_factor);
        float outline_amount = smoothstep(OUTLINE_MIN, OUTLINE_MAX, rim_value);
        out_color = mix(outline_color.rgb, out_color, outline_amount);
    }

    ALBEDO = vec3(out_color);
}