# Introduction

As GPU power is becoming significantly better than CPU, ray casting / tracing and ray marching is widely used in 4k intros and demoscenes to create 3D (interactive) graphics.

– Thanks to all the references listed in the end, I learnt so much about and I would like to share with the knowledge to my friends. This tutorial might contain some shortcomings, so please, if you find anything wrong, correct my errors.

This tutorial is built on rendering with two triangles and invoking a pixel shader that will create an animated or static image.

Everyone* was a “noob” at one point!

None* of us was born knowing this stuff!“If I have seen further it is by standing on the shoulders of giants.” — Isaac Newton

**Terminology and Background**

All of the three technologies (ray casting / tracing and marching) can handle global effects such as shadows and reflections; they need all objects available to render any object. Some say that it is not a real-time strategy yet while others disagree.

- Ray casting
- Ray casting, as the original method of the other two variants, is a technique that transform a limited form of data (a very simplified map or floor plan) into a 3D projection by tracing rays from the view point into the viewing volume.
**Computation**: Rays are cast and traced*in groups*based on some geometric constraints.**For instance**: on a 320×200 display resolution, a ray-caster traces only 320 rays**Speed**: Very fast compared to ray-tracing; suitable for real time process.**Limitation**: Simple geometric objects.- Resulting image is not very realistic. Often, they are blocky.

- Ray tracing
- Ray tracing is a more sophisticated algorithm in global rendering involving diffraction and reflection, thus generating images looking even more realistic; alas, it emulates the physics but is computationally more expensive.
**Computation:**Each ray is traced*separately,*so that every point (usually a pixel) on the display is traced by one ray.**For instance**: on a 320×200 display resolution, a ray-tracer needs to trace 320×200 (64,000) rays.**Speed**: slow; unsuitable for real time process- Resulting image is very realistic – sometimes too realistic
**Applications**: Raster space antialiasing; Time antialiasing; Camera exposition simulation; Surface shaders (procedural texturing and lighting); Volume shaders (procedural volumetric effects, see first video); Reflections, refractions, shadows; Able of raytrace spheres, cylindres, cubes, planes, cones, revolution objects, 3d julias and polygonal meshes; Lame bump mapping.**Accelerations**: Use kd-trees, bih or bvh acceleration structure does a couple million rays per second (per core) in modern CPUs.**GPU**: GPUs are so fast that even brute force implementations of raytracing run fast at high screen resolutions.

- Ray marching (a.k.a. sphere tracing)
**Ray marching**is a variant of ray casting that permits the use of objects for which there is no analytic formula so that the intersection with the ray cannot be simply computed by solving an algebraic equation.**Accelerations**: Demo coders and researchers do raymarching for heightmaps, volume texture, procedural isosurfaces and analytic surfaces.- Special Thanks to Iñigo Quilez (rgba) – for ShaderToy and for sharing his rendering knowledge!

In contrast, traditional graphics pipeline uses the **pipeline model**, which renders one object at a time and tricks to get global effects.

In pipeline model, geometry is in the form of vertices: two vertices form a line or a sphere (one center and one radius), three vertices form a triangle. **Clipping** is used to decide what can be seen. **Rasterizer** produces potential pixels (fragments).

**Prior Knowledge**

## Interaction of a Ray with An Object

A ray is a straight line. In 2D, it can be defined as y = k * x + d; in 3D, we need an additional coordinate: z = l * x + e. With two points, it is easy to compute the line:

1 2 |
k = (y2 - y1) / (x2 - x1), l = (z2 - z1) / (x2 - x1); d = y - k * x = y1 - x1 * (y2 - y1) / (x2 - x1) = (x2 * y1 - x1 * y2) / (x2 - x1); e = (x2 * z1 - x1 * z2) / (x2 - x1); |

Thus,

1 2 |
y = ((x - x1) * y2 - (x - x2) * y1) / (x2 - x1); z = ((x - x1) * z2 - (x - x2) * z1) / (x2 - x1); |

Take that into a sphere, we can get, but the solution to X is really long, a.k.a. solving

1 |
x^2 - 2 * x0 * x + x0^2 + (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1))^2 - 2 * y0 * (x * (y2 - y1) / (x2 - x1) + (x2 * y1 - x1 * y2) / (x2 - x1)) + y0^2 + (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1))^2 - 2 * z0 * (x * (z2 - z1) / (x2 - x1) + (x2 * z1 - x1 * z2) / (x2 - x1)) + z0^2 - r^2 = 0 |

## Ray Tracing

Here is a nice demo of ray tracing from Polytonic:

This is from iq. Once you have such a function ** f(x,z)**, the goal is to use a raytracing setup to render the image and do other effects as shadows or reflections. That means that for a given ray with some starting point in space (like the camera position) and a direction (like the view direction) we want to compute the intersection of the ray and the terrain

**. The simplest approach is to slowly advance in the ray direction in small steps, and at each stepping point determine if we are above of below the terrain level. The image below shows the process. We start at some point in the ray close to the camera (the leftmost blue point in the image). We evaluate the terrain function**

*f***at the current**

*f***and**

*x***coordinates to get the altitude of the terrain**

*z***:**

*h***. Now we compare the altitude of the blue sampling point**

*h = f(x,z)***with**

*y***, and we realize that**

*h***, or in other words, that the blue point is above the mountains. So, we step to the next blue point in the ray, and we repeat the process. At some point, perhapse, one of the sampling point will fall below the terrain, like the yellow point in the image. When that happens,**

*y > h***, and we know our ray crossed the terrain surface. We can just stop here and mark the current yellow point as the intersection point (even if we know it’s slightly further than the real intersection point), or peharse take the last blue point as interesection point (slightly closer than the real intersection) or the average of the last blue and yellow points.**

*y < h*
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
bool castRay( const vec3 & ro, const vec3 & rd, float & resT ) { const float delt = 0.01f; const float mint = 0.001f; const float maxt = 10.0f; for( float t = mint; t < maxt; t += delt ) { const vec3 p = ro + rd*t; if( p.y < f( p.x, p.z ) ) { resT = t - 0.5f*delt; return true; } } return false; } |

**,**

*mint***and**

*maxt***constants should be adapted for every scene. The first one is the distance to the near clipping plane, you can set it to zero. The second one is the maximun distance the ray is allowed to travel, ie, the visibility distance. The third one is the step size, and it directly influences the rendering speed and the quality of the image. The bigger it is, the faster of course, but the lower the terrain sampling quality will be.**

*delt*As you can see the code is terribly simple. There are many optimizations and improvements possible of course. For example, the accuracy of the intersection can be done more accurately by doing a linear approximation of the terrain altitudes between the blue and yellow points and compute the analytical intersection between the ray and the idealized terrain.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
bool castRay( const vec3 & ro, const vec3 & rd, float & resT ) { const float delt = 0.01f; const float mint = 0.001f; const float maxt = 10.0f; float lh = 0.0f; float ly = 0.0f; for( float t = mint; t < maxt; t += delt ) { const vec3 p = ro + rd*t; const float h = f( p.xz ); if( p.y < h ) { // interpolate the intersection distance resT = t - dt + dt*(lh-ly)/(p.y-ly-h+lh); return true; } lh = h; ly = p.y; } return false; } |

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
bool castRay( const vec3 & ro, const vec3 & rd, float & resT ) { float delt = 0.01f; const float mint = 0.001f; const float maxt = 10.0f; float lh = 0.0f; float ly = 0.0f; for( float t = mint; t < maxt; t += delt ) { const vec3 p = ro + rd*t; const float h = f( p.xz ); if( p.y < h ) { // interpolate the intersection distance resT = t - dt + dt*(lh-ly)/(p.y-ly-h+lh); return true; } // allow the error to be proportinal to the distance delt = 0.01f*t; lh = h; ly = p.y; } return false; } |

The other optimization is to note that as the marching moves the potential intersection point further and further (the bigger ** t** becomes), the less important the error becomes, as geometric details get smaller in screen space as they get further from the camera. In fact, details decay inverse-linearly with the distance, so we can make our error or accuracy

**linear to the distance. This saves lot of rendering time and gives more uniform artifactas than the naive approach. This trick is described in several places, for example in the “Texturing And Modeling – A Procedural Approach” book.**

*delt*# Main Idea

## Ray Marching

Instead of computing the sophisticated intersection points, ray marching algorithm walks along the ray and simply check after each step whether you’ve hit an object.

paniq designed a set of interactive, step-by-step illustration of ray marching:

### Walking

Suppose A is the location of the camera and B is the surface point which you want to compute its color. All we need to do is calculate the x, y, z coordinates for the current point and insert it into the formulas of the objects. The walking steps is increased each time we did not hit the surface. By keeping track of the number of steps you have walked on the ray, we have automatically obtained the distance, and thus are able to calculate the color of the pixel to put.

### Disadvantages

Ray marching is not precise and depends on the distance we walk per step.

**Distance field**

The basic idea is that the distance per step D is not constant, but variable. D depends on the distance of the current point to the object that is closest to it. If we know D, it is safe to walk exactly that distance on the ray, as chances are zero that we will hit an object if we walk less than that.

The hack is that the distance to the closest objects is not computed directly, but is read from a 3D matrix – the distance field.

### Advantages

- Much faster than constant-size stepping.
- Much easier to control than root finders (bisection, Newton… )
- Room for optimization, like using bigger steps when we are further from the ray origin
- Error in world coordinates decreases as 1/ d • So stepping proportionally to d results in constant screen space error.

### Disadvantages

- Slow on the boundaries of the objects (can be controlled by imposing a minimum step size)

### Combination Equals to Minimization

Since only the pixel from the closest objects is rendered, combination of geometry objects is equivalent to minimization of distance field. For example,

1 2 3 4 5 6 7 |
float combinedDistanceField( vec3 p ) { float dist1 = distanceField_A( M1inv*p ); float dist2 = distanceField_A( M2inv*p ); float dist3 = distanceField_B( M3inv*p ); return min( dist1, min( dist2, dist3 ) ); } |

### Domain Distortion (example by iq/rgba)

Here is a single rotate on a column:

1 2 3 4 5 |
float twistedColumn( vec3 p ) { vec3 q = rotateY(p, p.y*1.7); return distanceToColumn(q); } |

Here is slightly rotated tentacles:

1 2 3 4 5 6 |
float rr = dot(p.xy,p.xy); for( int i=0; i<6; i++ ) { vec3 q = rotateY( p, TWOPI*i/6.0 ); distance = min( distance, distanceToTheXAxis(q) ); } |

Here is how the tentacles are lifted and rotated:

1 2 3 4 5 6 7 |
float rr = dot(p.xy,p.xy); for( int i=0; i<6; i++ ) { vec3 q = rotateY( p, TWOPI*i/6.0 ); q.y += 0.6*rr*exp2(-10.0*rr); distance = min( distance, distanceToTheXAxis(q) ); } |

### Blending Fields

1 2 3 4 5 6 7 |
float distanceToMonster( vec3 p ) { float dist1 = distanceToBall(p); float dist2 = distanceToTentacles(p); float bfact = smoothstep( length(p), 0, 1 ); return mix( dist1, dist2, bfact ); } |

# Lighting

### Normal Map

Normal Maps are usually computed by central differences on the distance field at the shading point (gradient apprixmation).

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
const float heightScale = 0.0125; float sampleHeight(in vec2 coord) { return heightScale * dot(texture2D(iChannel0, coord), vec4(1.0/3.0, 1.0/3.0, 1.0/3.0, 0.0)); } void mainImage( out vec4 fragColor, in vec2 fragCoord ) { vec2 uv = fragCoord.xy / iResolution.xy; vec2 du = vec2(1.0 / 1024.0, 0.0); vec2 dv = vec2(0.0, 1.0 / 1024.0); float h0 = sampleHeight(uv); float hpx = sampleHeight(uv + du); float hmx = sampleHeight(uv - du); float hpy = sampleHeight(uv + dv); float hmy = sampleHeight(uv - dv); float dHdU = (hmx - hpx) / (2.0 * du.x); float dHdV = (hmy - hpy) / (2.0 * dv.y); vec3 normal = normalize(vec3(dHdU, dHdV, 1.0)); fragColor = vec4(0.5 + 0.5 * normal, 1.0); } |

### Bump Map

Bump map are usually computed by adding the gradient of a fractal sum of Perlin noise function to the surface normal.

Perlin noise is a type of gradient noise developed by Ken Perlin in 1983 as a result of his frustration with the “machine-like” look of computer graphics at the time.

1 2 3 4 5 6 |
n = normalize( grad( distance, p) ) + bump * grad( fbm, p) ) ); grad( func, p ) = normalize( func(p+{eps,0,0}) - func(p-{eps,0,0}), func(p+{0,eps,0}) - func(p-{0,eps,0}), func(p+{0,0,eps}) - func(p-{0,0,eps}) ); |

### Phone

*Bui Tuong Phong* was a student at the University of Utah when he invented the process that took his name (Phong lighting). It is a quick and dirty (while I think clever) way to render objects that reflect light in a privileged direction without a full blown reflection.

1 2 3 4 5 6 7 |
float reflect = 2.0f * (lightRay.dir * n); vec3 phongDir = lightRay.dir - reflect * n; float phongTerm = max(phongDir * viewRay.dir, 0.0f); phongTerm = currentMat.specvalue * pow(phongTerm, currentMat.specpower) * coef; red += phongTerm * current.red; green += phongTerm * current.green; blue += phongTerm * current.blue; |

## Ambient Occulusion

The basic technique was invented by Alex Evans, aka Statix (“Fast Approximation for Global Illumnation on Dynamic Scenes”, 2006)

# Fog

I added this section from iq’s tutorial:

Fog is very popular element in computer graphics, so popular that in fact we are always introduced to it early in textbooks or tutorials. Iñigo Quilez introduced a nice advanced tutorial on *Better Fog*.

The fog quickly helps us understand the distances and therefore scale of objects, and the world itself.

A basic fog looks like this:

1 2 3 4 5 6 7 |
vec3 applyFog( in vec3 rgb, // original color of the pixel in float distance ) // camera to point distance { float fogAmount = 1.0 - exp( -distance*b ); vec3 fogColor = vec3(0.5,0.6,0.7); return mix( rgb, fogColor, fogAmount ); } |

Adding camera and sun light direction to the fog provides a more vivid scene:

1 2 3 4 5 6 7 8 9 10 11 12 |
vec3 applyFog( in vec3 rgb, // original color of the pixel in float distance, // camera to point distance in vec3 rayDir, // camera to point vector in vec3 sunDir ) // sun light direction { float fogAmount = 1.0 - exp( -distance*b ); float sunAmount = max( dot( rayDir, sunDir ), 0.0 ); vec3 fogColor = mix( vec3(0.5,0.6,0.7), // bluish vec3(1.0,0.9,0.7), // yellowish pow(sunAmount,8.0) ); return mix( rgb, fogColor, fogAmount ); } |

As for extintion and inscattering of the light vs. fog, we can adjust the parameters more carefully:

1 2 3 |
vec3 extColor = vec3( exp(-distance*be.x), exp(-distance*be.y) exp(-distance*be.z) ); vec3 insColor = vec3( exp(-distance*bi.x), exp(-distance*bi.y) exp(-distance*bi.z) ); finalColor = pixelColor*(1.0-extColor) + fogColor*insColor; |

Real atmosphere is less dense in the height athmosphere than at the sea level. We can model that density variation with an exponential. The good thing of the exponential function is that the soluting to the formulas is analytical. Let’s see. We start with this exponential density function, which depends on the height * y* of our point:

1 2 3 4 5 6 7 8 9 |
vec3 applyFog( in vec3 rgb, // original color of the pixel in float distance, // camera to point distance in vec3 rayOri, // camera position in vec3 rayDir ) // camera to point vector { float fogAmount = c * exp(-rayOri.y*b) * (1.0-exp( -distance*rayDir.y*b ))/rayDir.y; vec3 fogColor = vec3(0.5,0.6,0.7); return mix( rgb, fogColor, fogAmount ); } |

# Reference

- Andre LaMothe.
*Black Art of 3D Game Programming*. 1995, ISBN 1-57169-004-2, pp. 14, 398, 935-936, 941-943. *Jon Macey, University of Bournemouth Ray-tracing and other Rendering Approaches*” (PDF), lecture notes, MSc Computer Animation and Visual Effects,*Alex Evans. Fast Approximation for Global Illumnation**on Dynamic Scenes*- Iñigo Quilez (rgba).
*Rendering Worlds with Two Triangles with raytracing on the GPU in 4096 bytes*in NVScene 2008. - glöplog on pouet.net.
*Raymarching Tutorial.* - F. Permadi.
*Raycasting Tutorial*. 1996. - Codermind team in UCI.
*Raytracing Turoial.* - Iñigo Quilez (rgba). Better Fog.2010
- Iñigo Quilez (rgba). Modeling with Distance Functions. 2008
- Iñigo Quilez (rgba). Raymarching Distance Fields. 2012.
- Iñigo Quilez (rgba). Menger Fractal. 2011
- Bui Tuong Phong. Illumination for computer generated pictures. 1975
- Iñigo Quilez (rgba). Simple Pathtracing 2012
- Iñigo Quilez (rgba). Terrain Raymarching. 2002. 2007.
- Iñigo Quilez (rgba). Menger Fractal
- Nop Jiarathanakul. “Ray Marching Distance Fields in Real-time on WebGL“
- 9bit Science. “Raymarching Distance Fields”
- László Szirmay-Kalos, Tamás Umenhoffer “Displacement Mapping on the GPU ― State of the Art” 2006.

More collections can be found in http://d.hatena.ne.jp/hanecci/20131005/p1.

Hi there ,great page but impossible to read because the headers are all different sizes so the page is jumping up and down every few seconds.

Thank you so much for your suggestion! I have fixed the headers!

It’s still jumping around. The girl in the hands image is vertically larger than the others.

Thank you so much, Ron!!! It’s fixed now 🙂