Here are some random thoughts and notes on the deferred rendering pipeline.
Deferred shading is generally faster than forward shading
According to Unity3D,
> As a general rule, Deferred Rendering is likely to be a better choice if your game runs on higher-end hardware and uses a lot of realtime lights, shadows and reflections. Forward Rendering is likely to be more suitable if your game runs on lower-end hardware and does not use these features.
In forward shading, with M lights and N objects, we use an O(MN) loop to render them:
```
For each light:
    For each object affected by the light:
        fragColor += object * light
```
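To make the O(MN) loop concrete, here is a rough forward-shading fragment shader in GLSL; the `Light` struct, the uniform names, and the `MAX_LIGHTS` cap are illustrative assumptions rather than any specific engine's interface.

```glsl
// Sketch of a forward-shading fragment shader: every fragment walks the full
// light list, which is where the O(MN) cost and the divergence come from.
#version 330 core

struct Light { vec3 position; vec3 color; float radius; };

const int MAX_LIGHTS = 64;              // assumed, engine-specific limit
uniform Light uLights[MAX_LIGHTS];
uniform int   uLightCount;
uniform vec3  uAlbedo;

in  vec3 vWorldPos;
in  vec3 vNormal;
out vec4 fragColor;

void main() {
    vec3 n = normalize(vNormal);
    vec3 color = vec3(0.0);
    for (int i = 0; i < uLightCount; ++i) {
        vec3  toLight = uLights[i].position - vWorldPos;
        float dist    = length(toLight);
        if (dist > uLights[i].radius) continue;   // per-fragment test -> divergence
        float atten   = 1.0 - dist / uLights[i].radius;
        float ndotl   = max(dot(n, toLight / dist), 0.0);
        color += uAlbedo * uLights[i].color * ndotl * atten;
    }
    fragColor = vec4(color, 1.0);
}
```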
There are four problems:
- Ineffective light culling
- Large memory footprint: all geometry, light resources (shadow maps, environment maps), and textures must be allocated, initialized, and accessed.
- Shading small triangles is inefficient
- Divergence in the fragment shader: inside the O(MN) loop, we have to test whether the current fragment is actually illuminated by the current light
However, in deferred shading, this is mostly fixed:
```
For each object:
    Render to multiple targets
For each light:
    Apply light as a 2D post-process
```
Usually, the geometry buffer (G-buffer) is organized as a set of screen-sized render targets, typically storing depth, normals, albedo, and specular/roughness parameters. Because lighting then runs as a per-pixel pass over these buffers instead of a per-object loop, on the order of 1000 lights can be handled with deferred shading but not with forward shading.
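To sketch the lighting half of the deferred pipeline, a single-light full-screen pass could look roughly like the GLSL below; the G-buffer bindings (a position texture instead of depth reconstruction) and the uniform names are assumptions made for readability, not a prescribed layout.

```glsl
// Sketch of a deferred lighting pass: a full-screen quad reads the G-buffer
// and accumulates one light per draw call (additive blending assumed).
#version 330 core

uniform sampler2D uGPosition;   // world-space position (or reconstruct from depth)
uniform sampler2D uGNormal;     // world-space normal
uniform sampler2D uGAlbedo;     // base color
uniform vec3  uLightPos;
uniform vec3  uLightColor;
uniform float uLightRadius;

in  vec2 vUV;
out vec4 fragColor;

void main() {
    vec3 pos    = texture(uGPosition, vUV).xyz;
    vec3 n      = normalize(texture(uGNormal, vUV).xyz);
    vec3 albedo = texture(uGAlbedo, vUV).rgb;

    vec3  toLight = uLightPos - pos;
    float dist    = length(toLight);
    float atten   = max(1.0 - dist / uLightRadius, 0.0);
    float ndotl   = max(dot(n, toLight / dist), 0.0);

    // Cost is proportional to lit pixels, not to lights x objects.
    fragColor = vec4(albedo * uLightColor * ndotl * atten, 1.0);
}
```

In practice, the position texture is often dropped and world position is reconstructed from the depth buffer to save bandwidth.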
As for individual texture samples, modern GPUs are really fast. According to Xiaoxu Meng, randomly sampling a texture 100 times per pixel at a resolution of 1920×1080 costs almost nothing:
| Random texture sample count | Time |
| --- | --- |
| 1 | 0.25 ms |
| 10 | 0.28 ms |
| 100 | 0.45 ms |
| 1000 | 2.43 ms |
| 10000 | 20.15 ms |
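The exact benchmark setup is not described here, but the workload looks roughly like the ShaderToy-style sketch below; `SAMPLE_COUNT`, the hash function, and `iChannel0` as the source texture are illustrative choices only.

```glsl
// ShaderToy-style sketch: SAMPLE_COUNT pseudo-random texture fetches per fragment.
// Not the original benchmark, just an illustration of the workload being timed.
#define SAMPLE_COUNT 100

// Simple hash to scatter the sample coordinates (an assumption, not from the source).
vec2 hash2(vec2 p) {
    return fract(sin(vec2(dot(p, vec2(127.1, 311.7)),
                          dot(p, vec2(269.5, 183.3)))) * 43758.5453);
}

void mainImage(out vec4 fragColor, in vec2 fragCoord) {
    vec2 uv = fragCoord / iResolution.xy;
    vec3 acc = vec3(0.0);
    for (int i = 0; i < SAMPLE_COUNT; ++i) {
        vec2 p = hash2(uv + float(i));      // scattered, cache-unfriendly coordinates
        acc += texture(iChannel0, p).rgb;
    }
    fragColor = vec4(acc / float(SAMPLE_COUNT), 1.0);
}
```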
For instance, I could run a Gaussian blur on video input at 60 FPS on ShaderToy:
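A simple single-pass version looks roughly like this (the 5×5 kernel, the sigma, and the video bound to `iChannel0` are arbitrary illustration choices, not necessarily the exact shader I ran):

```glsl
// ShaderToy-style single-pass Gaussian blur over a video input on iChannel0.
void mainImage(out vec4 fragColor, in vec2 fragCoord) {
    vec2 texel = 1.0 / iResolution.xy;
    vec3 acc = vec3(0.0);
    float wsum = 0.0;
    for (int x = -2; x <= 2; ++x) {
        for (int y = -2; y <= 2; ++y) {
            float w = exp(-float(x * x + y * y) / (2.0 * 1.5 * 1.5)); // sigma = 1.5
            acc  += w * texture(iChannel0, (fragCoord + vec2(x, y)) * texel).rgb;
            wsum += w;
        }
    }
    fragColor = vec4(acc / wsum, 1.0);
}
```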
Therefore, I am inclined to believe that texture sampling is NOT the bottleneck of the modern rendering pipeline; lighting is.
For instance, the Falcor engine has very complicated lighting; take a look at this shader to get a sense of it:
Cheers,
Ruofei