Assignment 3: PathTracer

Stephanie Claudino Daffara

In this project I implemented methods to accurately model the flow of light in the real world by using a pathtracing algorithm. The core routines include ray generation and scene intersection, a bounding volume hierarchy (BVH), direct illumination, global illumination, and adaptive sampling. It was very rewarding to write an algorithm and then see it properly generate rays and create normal maps and BVHs. Then, once I started to get results from the direct illumination function, and next from global illumination, I was blown away at each step when my renders (which took a while!) were done. I enjoyed rendering a scene with direct lighting only and with indirect lighting only. Viewing these two outputs made the entire flow of things make much more sense. Plus, the indirect-lighting-only scene looks awesome (even though it is unrealistic).

Part 1: Ray Generation and Intersection

Walk through

The ray generation and primitive intersection parts of a rendering pipeline start with the Pinhole camera Model.

Diagram of Pinhole Camera model taken from lecture slide.

I started off by implementing the PathTracer::raytrace_pixel() function. As you can see in the diagram above, each ray passes through a particular pixel. This function casts a number of rays through random positions within the pixel, averages over all the samples, and returns the corresponding Spectrum. I achieved this by looping num_samples times, and in each iteration taking the original given pixel coordinate and adding an offset from a random sample (between 0 and 1). This way I get different random points within my pixel (which ranges over [x, x+1] and [y, y+1]). Next I call the camera's generate_ray(newPoint.x, newPoint.y), passing in the new point, which includes the offset and is normalized by the sample buffer's width and height to give a point between (0,0) and (1,1). This way, inside of the camera class we can easily manipulate the point regardless of its original mapped space. generate_ray() returns a ray, which I then use as an argument to est_radiance_global_illumination, which evaluates that ray's radiance and returns a Spectrum. To get the average I accumulate a sum of all the sample rays' radiances, and after the loop I divide the sum by the number of samples and return the average.
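
Here is a rough sketch of that loop. The field and helper names (ns_aa, gridSampler, sampleBuffer, camera->generate_ray, est_radiance_global_illumination) follow the writeup and the course framework, but treat the exact signatures as assumptions rather than the definitive implementation:

    // Sketch of the per-pixel sampling loop described above (names assumed).
    Spectrum PathTracer::raytrace_pixel(size_t x, size_t y) {
      int num_samples = ns_aa;                              // samples to average per pixel
      Spectrum total(0, 0, 0);
      for (int i = 0; i < num_samples; i++) {
        Vector2D offset = gridSampler->get_sample();        // random point in [0,1)^2
        double px = (x + offset.x) / sampleBuffer.w;        // normalize to [0,1]
        double py = (y + offset.y) / sampleBuffer.h;
        Ray r = camera->generate_ray(px, py);
        total += est_radiance_global_illumination(r);       // radiance along this ray
      }
      return total * (1.0 / num_samples);                   // average over all samples
    }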

The Camera::generate_ray() function, as mentioned above, takes the x and y coordinates of a pixel to cast a ray through. I first mapped the normalized [0,1] space to [-1,1] by multiplying each coordinate by 2 and subtracting 1, to account for the camera looking along the -z axis. Next I mapped those coordinates to the ray's direction in camera space using the following two formulas:

x *= tan(radians(hFov) * 0.5)
y *= tan(radians(vFov) * 0.5)

Where hFov and vFov are the horizontal and vertical field of view angles.

Finally, to convert this vector into world space we apply the given c2w transform matrix and then normalize it. This is our ray's direction vector. Before returning the ray I set its segment to begin at the provided nClip camera parameter and end at fClip.
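
A minimal sketch of this mapping, assuming the hFov, vFov, c2w, pos, nClip, and fClip camera members described above and a radians() degree-to-radian helper (the exact framework interface may differ):

    // Sketch of Camera::generate_ray following the mapping described above.
    Ray Camera::generate_ray(double x, double y) const {
      // map normalized [0,1]^2 screen coordinates to [-1,1]^2 on the sensor plane
      double sx = 2 * x - 1;
      double sy = 2 * y - 1;
      // scale by the tangent of half the field of view along each axis
      sx *= tan(radians(hFov) * 0.5);
      sy *= tan(radians(vFov) * 0.5);
      // direction in camera space (camera looks down -z), transformed to world space
      Vector3D dir = c2w * Vector3D(sx, sy, -1);
      dir.normalize();
      Ray r(pos, dir);
      r.min_t = nClip;   // ray segment begins at the near clipping plane...
      r.max_t = fClip;   // ...and ends at the far clipping plane
      return r;
    }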

Triangle Intersection

In order to get something to show up on the screen I had to implement the triangle intersection in Triangle::intersect(). I used the Möller-Trumbore algorithm, which computes a ray-triangle intersection quickly. Given a ray and the triangle's three points, it defines 5 terms: e1, the difference between p2 and p1; e2, the difference between p3 and p1; s, the difference between the ray's origin and p1; s1, the cross product of the ray's direction and e2; and s2, the cross product of s and e1. With these 5 terms defined I can then calculate t, the intersection parameter, along with b1 and b2, which are used for the barycentric coordinate interpolation. I calculate these values with the following transformation:

Part of Möller Trumbore's Algorithm.

With the computed values for t, b1, and b2, I finally have to check whether my t value actually lies within the triangle or not. I do this by making sure that t is within the beginning and end of the ray's segment, that b1 and b2 are between 0 and 1, and that b1 + b2 is less than 1. If all of these checks pass, then the ray does in fact intersect the triangle.
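
A sketch of this test, using the same e1, e2, s, s1, s2 terms; p1, p2, p3 stand for the triangle's vertices, and the exact Triangle/Ray interface is assumed:

    // Möller-Trumbore ray-triangle test as described above (sketch).
    bool Triangle::intersect(const Ray& r) const {
      Vector3D e1 = p2 - p1;
      Vector3D e2 = p3 - p1;
      Vector3D s  = r.o - p1;
      Vector3D s1 = cross(r.d, e2);
      Vector3D s2 = cross(s, e1);
      double denom = dot(s1, e1);
      if (denom == 0) return false;                 // ray is parallel to the triangle
      double t  = dot(s2, e2) / denom;              // distance along the ray
      double b1 = dot(s1, s)  / denom;              // barycentric coordinates
      double b2 = dot(s2, r.d) / denom;
      // accept only hits inside the triangle and within the ray's valid segment
      if (t < r.min_t || t > r.max_t) return false;
      if (b1 < 0 || b1 > 1 || b2 < 0 || b2 > 1 || b1 + b2 > 1) return false;
      return true;
    }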

Here are some dae files rendered with normal shading, produced using the implementation above:

Spheres lambertian.
Gems.
Cow.
Banana.

Part 2: Bounding Volume Hierarchy

Although the ray generation method for creating normal maps is cool, it's not very fast on its own. Rendering images with thousands or more triangles takes a very long time. Therefore I implemented the bounding volume hierarchy technique, which speeds up the ray tracing process by creating a tight (but not too tight) bound around the object I am tracing.

BVH Construction

To construct the BVH I followed these steps:

  1. Iterate through the primitives and compute a union of their bounding boxes.
  2. Then I initialize a node and assign to it that bounding box.
  3. If there are at most max_leaf_size primitives, this means that we are at a leaf node and I simply create a vector with all of the primitives and assign it to the new node and return that node.
  4. If I am not at a leaf node, then I have to recursively split the primitives into two subgroups.
  5. In order to find the split point I used a simple solution instead of a heuristic that could potentially be even faster. I find the longest axis using the bbox.extent property, comparing its x, y, and z lengths. My split axis becomes the longest axis, and the split point itself is its center. Next I loop through all of my primitives once again and compare their bounding box's centroid on the selected axis with the calculated split point. If it is smaller than the split point, I insert the primitive into a vector I named left, and if it is equal to or bigger than the split point, I insert it into the vector right.
  6. A special case is when the left or right vector ends up empty. To account for this scenario I placed all of this logic inside a while loop that keeps looping until both the left and right vectors are non-empty. If either vector is empty, I compute a new split point at the mean centroid of the vector that holds all of the primitives, reset that vector to empty, and try again.
  7. If there is no special case to account for, then I simply recurse on the two vectors and assign the results to node->l and node->r. Finally I return the node (a condensed sketch of this construction follows the list).
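
Here is that condensed sketch. BVHNode, BBox, and get_bbox() follow the writeup; the empty-side handling is simplified here to a half/half fallback rather than the while-loop described in step 6, and the exact framework signatures are assumptions:

    // Recursive BVH construction (sketch of the steps listed above).
    BVHNode* BVHAccel::construct_bvh(const std::vector<Primitive*>& prims,
                                     size_t max_leaf_size) {
      BBox bbox;
      for (Primitive* p : prims) bbox.expand(p->get_bbox());   // union of all bounding boxes
      BVHNode* node = new BVHNode(bbox);

      if (prims.size() <= max_leaf_size) {                     // leaf node: store primitives
        node->prims = new std::vector<Primitive*>(prims);
        return node;
      }

      // split along the longest axis of the bounding box, at its midpoint
      Vector3D extent = bbox.extent;
      int axis = (extent.x >= extent.y && extent.x >= extent.z) ? 0
               : (extent.y >= extent.z) ? 1 : 2;
      double split = bbox.min[axis] + extent[axis] * 0.5;

      std::vector<Primitive*> left, right;
      for (Primitive* p : prims) {
        if (p->get_bbox().centroid()[axis] < split) left.push_back(p);
        else right.push_back(p);
      }
      // degenerate split: simplified fallback, put half of the primitives on each side
      if (left.empty() || right.empty()) {
        left.assign(prims.begin(), prims.begin() + prims.size() / 2);
        right.assign(prims.begin() + prims.size() / 2, prims.end());
      }
      node->l = construct_bvh(left, max_leaf_size);
      node->r = construct_bvh(right, max_leaf_size);
      return node;
    }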

The following pictures show a summary of the steps in splitting the bounding volume:

Initial BV.
1st split.
2nd split.
3rd split.
Skipping some steps... the penultimate split.
The final split, the leaf node.

BB Intersection

I also implemented a ray-bounding box intersection function that returns true if the ray intersects the box and false if it does not. The implementation uses the ray-plane intersection formula for planes perpendicular to a particular axis:

Perpendicular to the x-axis.

Therefore I calculated the min and max for each axis using the formula above. I also made sure that the min was in fact smaller than the max, swapping them if they were inverted. Then I set my intersection's min to the max of {xmin, ymin, zmin} and the intersection's max to the min of {xmax, ymax, zmax}. If the intersection's min is greater than or equal to its max, then there is no intersection and I return false. Otherwise an intersection has been found and I return true. The following image depicts this final intersection test clearly:

Image taken from lecture slides.
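
A sketch of this slab test is below, with t0/t1 holding the current valid segment of the ray; the real BBox::intersect signature in the framework is assumed, and std::swap/min/max come from <algorithm>:

    // Ray-box slab test as described above (sketch).
    bool BBox::intersect(const Ray& r, double& t0, double& t1) const {
      double tmin = t0, tmax = t1;
      for (int axis = 0; axis < 3; axis++) {
        double inv   = 1.0 / r.d[axis];
        double tnear = (min[axis] - r.o[axis]) * inv;   // ray-plane hits against the
        double tfar  = (max[axis] - r.o[axis]) * inv;   // two planes of this slab
        if (tnear > tfar) std::swap(tnear, tfar);       // swap if the direction is negative
        tmin = std::max(tmin, tnear);                   // max of the per-axis mins
        tmax = std::min(tmax, tfar);                    // min of the per-axis maxes
        if (tmin >= tmax) return false;                 // slabs no longer overlap
      }
      t0 = tmin;
      t1 = tmax;
      return true;
    }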

And here are some images that I was able to render in less than 2 seconds each (on my personal MacBook Pro):

Beast
Lucy

Rendering experiments

Here is a table that compares the times in seconds of different processes done before and after BVH was implemented on the cow geometry:

Process                              Before BVH     After BVH
Collecting primitives                0.0004s        0.0004s
Building BVH from 5856 primitives    0.0011s        0.0282s
Rendering                            191.3375s      0.2567s
Rays traced                          479977         471153
Average intersections per ray        4008.378       3.0121

This table makes it easy to see the difference between using a bounding volume hierarchy and not. It makes sense that collecting the primitives takes the same amount of time, since the number of primitives has not changed between the two implementations. It is also no surprise that the build step is quicker before the real BVH is implemented, since without it there is essentially no hierarchy to construct. Things get interesting when we look at the difference in rendering time. Once BVH was implemented, rendering became roughly 745 times faster! That is an incredible improvement. Another improvement was the number of rays traced, which decreased by about 8,800 rays. Lastly, another important observation is the average number of intersections per ray, which dropped by a factor of roughly 1,330: yet another impressive improvement made with the simple addition of a BVH.

Part 3: Direct Illumination

Direct illumination is when a scene is lit only by the light sources in the scene. In order to achieve this I started off by implementing a BSDF (bidirectional scattering distribution function), which represents the color of a material and how light arriving from an incoming direction is scattered into an outgoing direction. I specifically implemented the diffuse case, in which light is scattered uniformly in all directions, so that the incoming direction of the light and the outgoing direction from the object do not affect the final color of the material. Therefore all the diffuse function does is return the reflectance divided by pi.
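
As a concrete (and very short) sketch, assuming the framework's DiffuseBSDF has a reflectance member:

    // Diffuse BSDF: independent of incoming/outgoing directions, so just
    // return the albedo scaled by 1/pi (sketch).
    Spectrum DiffuseBSDF::f(const Vector3D& wo, const Vector3D& wi) {
      return reflectance * (1.0 / PI);
    }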

Walk through of direct lighting functions:

With a material available, I next implemented two direct lighting estimators: uniform hemisphere sampling and importance sampling. In a general sense, uniform hemisphere sampling estimates the direct lighting on a point by sampling directions uniformly over a hemisphere, while importance sampling samples the lights directly.

Hemisphere sampling takes samples uniformly around a point of interest. It does this by creating a number of sample rays that originate at the hit point, pointing in directions sampled uniformly over the hemisphere, and then checks whether each ray intersects the scene. One thing to keep in mind is that it is necessary to offset the origin of each ray from the hit point by a small constant so that the ray does not frequently intersect the same triangle at the same spot again (this is due to floating point imprecision). If a ray does intersect something, I take the emitted light of the material it hits and transform that incoming radiance into outgoing radiance at our point. Finally I weight each sample by the pdf (following Monte Carlo integration) and return the average.
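
A sketch of this estimator is below. The names (hemisphereSampler, make_coord_space, EPS_D, ns_area_light, the Intersection fields) follow the writeup and the usual Monte Carlo setup, but treat the exact framework calls as assumptions:

    // Direct lighting by uniform hemisphere sampling (sketch of the steps above).
    Spectrum PathTracer::estimate_direct_lighting_hemisphere(const Ray& r,
                                                             const Intersection& isect) {
      Matrix3x3 o2w;
      make_coord_space(o2w, isect.n);            // local frame with the normal along +z
      Matrix3x3 w2o = o2w.T();

      Vector3D hit_p = r.o + r.d * isect.t;      // point being shaded
      Vector3D w_out = w2o * (-r.d);             // outgoing direction, local space

      int num_samples = scene->lights.size() * ns_area_light;
      double pdf = 1.0 / (2.0 * PI);             // uniform hemisphere pdf
      Spectrum L_out(0, 0, 0);

      for (int i = 0; i < num_samples; i++) {
        Vector3D w_in = hemisphereSampler->get_sample();       // local-space direction
        Vector3D d = o2w * w_in;                               // same direction, world space
        Ray shadow(hit_p + d * EPS_D, d);                      // offset origin to avoid self-hits
        Intersection light_isect;
        if (bvh->intersect(shadow, &light_isect)) {
          Spectrum emitted = light_isect.bsdf->get_emission(); // radiance arriving from that direction
          L_out += emitted * isect.bsdf->f(w_out, w_in) * (w_in.z / pdf);
        }
      }
      return L_out * (1.0 / num_samples);
    }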

We no longer have only a normal map! But our image is still a bit grainy. Although we could keep increasing the number of camera rays per pixel and the number of samples per area light until convergence, there is a better method: importance sampling over the lights. For this method I sum over every light source in the scene, sample the surface of each light, compute the incoming radiance from those sampled directions, and then convert it into outgoing radiance using the BSDF (which in our case is diffuse) at the surface. The biggest difference to take away is that we sample each light directly instead of sampling uniformly over a hemisphere. First I loop over every light in the scene. If the light is a point light, every sample toward it would fall in the same place, so a single sample is enough. If the light is not a point light, we loop over the number of samples, creating a ray for each. For each sample we cast a shadow ray from the intersection point towards the light and check if it intersects the scene anywhere. If it finds no intersection, then the light reaches the sampled point, and I accumulate its contribution weighted by the diffuse BSDF.
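
And a sketch of the light-sampling estimator. Here sample_L returning the light's radiance together with the direction, distance, and pdf follows the usual framework interface, but the exact signature and names are assumptions:

    // Direct lighting by importance-sampling the lights (sketch).
    Spectrum PathTracer::estimate_direct_lighting_importance(const Ray& r,
                                                             const Intersection& isect) {
      Matrix3x3 o2w;
      make_coord_space(o2w, isect.n);
      Matrix3x3 w2o = o2w.T();
      Vector3D hit_p = r.o + r.d * isect.t;
      Vector3D w_out = w2o * (-r.d);

      Spectrum L_out(0, 0, 0);
      for (SceneLight* light : scene->lights) {
        int ns = light->is_delta_light() ? 1 : ns_area_light;  // point light: one sample is enough
        Spectrum L_light(0, 0, 0);
        for (int i = 0; i < ns; i++) {
          Vector3D wi;                                         // world-space direction toward the light
          float dist, pdf;
          Spectrum radiance = light->sample_L(hit_p, &wi, &dist, &pdf);
          Vector3D w_in = w2o * wi;
          if (w_in.z < 0) continue;                            // light is behind the surface
          Ray shadow(hit_p + wi * EPS_D, wi);                  // shadow ray toward the light
          shadow.max_t = dist - EPS_D;                         // stop just short of the light itself
          Intersection blocker;
          if (!bvh->intersect(shadow, &blocker)) {             // nothing occludes the light
            L_light += radiance * isect.bsdf->f(w_out, w_in) * (w_in.z / pdf);
          }
        }
        L_out += L_light * (1.0 / ns);
      }
      return L_out;
    }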

Here are the results of both methods:

Direct lighting: Uniform hemisphere sampling.
Direct Lighting: Importance sampling.

Uniform Hemisphere Sampling vs. Light Sampling

It is clear from the two images above that importance sampling does in fact reduce noise, and it also resolves the problem of point light sources, which uniformly sampled rays essentially never hit. In uniform hemisphere sampling a single sample point casts many rays in uniformly random directions, and maybe one of them hits the light source. This causes noise: a neighboring sample point will also cast many rays, but its rays might not actually hit the light. Therefore, two points near each other might end up with one being lit and the other not.

Showing two neighboring rays receiving very different colors.

What importance sampling does instead (and better) is take samples in directions toward the lights, and check whether the shadow ray from the point toward the light hits any object, causing an occlusion. There is less randomness in this method, which results in less noise. Here are two more nice-looking importance sampling images:

Dragon colored using Importance sampling.
Bench colored using Importance sampling.

And one more example of uniform hemisphere sampling for comparison:

Uniform hemisphere sampling on spheres.

Comparison of noise levels in soft shadows:

Here are four scenes that compare the noise levels in soft shadows when rendering with 1, 4, 8, and 64 light rays. These images use 1 sample per pixel and importance-sampled direct lighting:

1 light ray.
4 light rays.
8 light rays.
64 light rays.

With only 1 camera ray per pixel it is easier to see the difference between different numbers of samples per area light. With 1 sample per area light the noise level is pretty high and the shadow under the bunny is pretty hard. Under the bunny it's pitch black and even kind of hard to distinguish where its feet are. With 4 light rays you can start to see better shadows, but the further the shadow gets from the object, the noisier and more random the "dark" shadow pixels are. Using 8 light rays produces a smoother image. It is still grainy, giving a sandy look to the image's texture. The shadow under the bunny is less harsh, although it still gets noisier further from the object. In the last image, with 64 samples per area light, the scene's shadows are beautifully smooth! The shadow cast under the bunny smoothly dissolves as it gets further away from it, and is not harsh.

Part 4: Global Illumination

Walk through of indirect lighting function:

Indirect lighting creates a more realistic-looking scene because it accounts for light that bounces off of objects and colors them appropriately. I implemented four functions:

  1. zero_bounce_radiance: Which returns the light that results from no bounces of light, i.e. simply the light emitted at the ray's intersection point.
  2. one_bounce_radiance: Which computes direct lighting at the hit point using either the uniform hemisphere sampling or the importance sampling function.
  3. est_radiance_global_illumination: Which simply returns the sum of zero_bounce_radiance and at_least_one_bounce_radiance.
  4. at_least_one_bounce_radiance: Which handles most of the indirect lighting logic, so I explain it in more detail below.

The at_least_one_bounce_radiance function does exactly that: it calculates the lighting for at least one bounce of light. It takes a sample from the surface BSDF at the intersection point and uses Russian roulette to decide whether to randomly terminate the ray. This decision also takes into account whether the depth of the original ray is greater than 1, and whether the original ray's depth equals max_ray_depth. If these conditions hold then we recurse through the function, passing in the newly created bounced ray and its intersection point. The result of this recursion is scaled by the sampled BSDF value and a cosine factor, divided by the BSDF pdf, and added to our output.
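
Below is a sketch of this function. coin_flip, sample_f (returning the BSDF value along with the sampled direction and pdf), and the ray's depth field follow the usual formulation in the course framework; the division by the continuation probability is the standard Russian roulette correction even though the paragraph above doesn't spell it out. Treat the exact names and signatures as assumptions:

    // Recursive indirect bounce with Russian roulette (sketch of the logic above).
    Spectrum PathTracer::at_least_one_bounce_radiance(const Ray& r,
                                                      const Intersection& isect) {
      Matrix3x3 o2w;
      make_coord_space(o2w, isect.n);
      Matrix3x3 w2o = o2w.T();
      Vector3D hit_p = r.o + r.d * isect.t;
      Vector3D w_out = w2o * (-r.d);

      Spectrum L_out = one_bounce_radiance(r, isect);   // direct lighting at this hit point

      double cpdf = 0.65;                               // probability of continuing the path
      bool keep_going = coin_flip(cpdf) || r.depth == max_ray_depth;
      if (r.depth > 1 && keep_going) {
        Vector3D w_in;                                  // sampled incoming direction (local space)
        float pdf;
        Spectrum f = isect.bsdf->sample_f(w_out, &w_in, &pdf);
        Vector3D d = o2w * w_in;
        Ray bounce(hit_p + d * EPS_D, d);
        bounce.depth = r.depth - 1;                     // one fewer bounce remaining
        Intersection bounce_isect;
        if (bvh->intersect(bounce, &bounce_isect)) {
          Spectrum Li = at_least_one_bounce_radiance(bounce, bounce_isect);
          L_out += Li * f * (fabs(w_in.z) / pdf / cpdf); // scale by BSDF, cosine, pdf, and RR prob.
        }
      }
      return L_out;
    }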

Here are two scenes that compare direct-only lighting versus indirect-only lighting:

Direct Lighting only.
Indirect Lighting only.

In the direct lighting scene we get the effect we had before, using the direct lighting equations prior to implementing the indirect ones. There is no color bouncing off of the walls onto the spheres, and the shadows under the spheres are pitch black and thick. With indirect lighting the shadows are much softer and do reflect the colors of the walls: you can see red and blue bouncing onto the left and right sides of the spheres in the indirect-lighting scene. The ceiling is also visible in the indirect lighting render because light bounces off the objects and hits the ceiling as well, whereas in the direct scene no light rays are cast onto the ceiling. And by joining the two you get:

Combined indirect + direct lighting.

Max Ray Depth Comparison:

Now lets do a max-ray-depth comparison:

Max ray depth = 0
Max ray depth = 1
Max ray depth = 2
Max ray depth = 3
Max ray depth = 100

Since the indirect lighting function is implemented in such a way that a ray's depth is initialized to the max ray depth and we only add another light bounce when this depth is greater than 1, it makes sense that at max ray depths of 0 and 1 the images above have no global illumination: the scene is only lit the way direct lighting lights it. The other max ray depths (2, 3, and 100) look very similar to each other. It seems that at this point, since recursion is expensive, you could set the max ray depth to a small value and still get a very similar, high-quality result.

Sample-per-pixel rate Comparison:

Sample-per-pixel = 1
Sample-per-pixel = 2
Sample-per-pixel = 4
Sample-per-pixel = 8
Sample-per-pixel = 16
Sample-per-pixel = 64
Sample-per-pixel = 1024

The results above are interesting but not unfamiliar. It makes sense that the fewer samples per pixel you have, the more "random" the bounced color becomes. So at sample-per-pixel = 1 you can see the static-looking image, with many colorful pixels scattered around (but still in a way that lets you distinguish the image). As the sample-per-pixel rate goes up, the estimate of what color each pixel should be, based on the bounced-back rays, gets better and better.

Here are some more images displaying the global illumination effect:

Part 5: Adaptive Sampling

Walk through of adaptive sampling:

Adaptive sampling allows for quicker results with less noise, since it uses an algorithm to test batches of samples. We know by now that we can reduce noise by increasing the number of samples per pixel, but not all pixels in the scene converge at the same number of samples. Some pixels converge faster (with lower sampling rates) while others require more samples; adaptive sampling exploits exactly this by figuring out which areas of the scene require more samples.

This algorithm runs on each pixel and uses confidence-interval statistics to detect whether the pixel has converged as we trace ray samples through it. Although the theory and numbers behind it take a bit more reading and studying to fully understand, the algorithm itself is simple to use. It is implemented inside the raytrace_pixel function from Part 1.

In addition to what I was doing before in this method (see Part 1), I now also keep track of a batch_count and a sample_count variable. These track how many samples I have accumulated in the current batch and how many I have accumulated in total, and I increment them at each iteration over num_samples. I also keep two accumulators of each current sample's illuminance: s1 += illuminance and s2 += illuminance * illuminance. Inside my for loop I check whether batch_count equals samplesPerBatch (which is set from the command line).

If batch_count equals the samplesPerBatch then I calculate a mean, a variance and a convergence variable I. These equations are as follows:

Mean
Variance
Pixel's Convergence
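
Written out, these follow the standard confidence-interval formulation, where n is the number of samples accumulated so far for this pixel:

    mean = s1 / n
    variance = (s2 - s1 * s1 / n) / (n - 1)
    I = 1.96 * sqrt(variance) / sqrt(n)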

If the convergence variable I is <= maxTolerance * mean (where maxTolerance is also set from the command line), then I break out of my for loop because the number of samples for this specific pixel has converged. Isn't this amazing and magical?

Lastly, I made sure to update sampleCountBuffer[index] = sample_count so that the colors appear correctly in the output rate image. I also return the average of my samples divided by the actual sample_count I accumulated, as opposed to num_samples as before, since there is now a chance of exiting the loop early.
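
Putting it together, here is a sketch of the extended raytrace_pixel loop. samplesPerBatch, maxTolerance, sampleCountBuffer, and Spectrum::illum() follow the writeup; the exact framework fields (and the buffer indexing) are assumptions:

    // raytrace_pixel with adaptive sampling (sketch of the logic above).
    Spectrum PathTracer::raytrace_pixel(size_t x, size_t y) {
      int num_samples = ns_aa;
      Spectrum total(0, 0, 0);
      double s1 = 0, s2 = 0;           // running sums of illuminance and illuminance^2
      int sample_count = 0;            // samples actually taken so far
      int batch_count = 0;             // samples taken in the current batch

      for (int i = 0; i < num_samples; i++) {
        Vector2D offset = gridSampler->get_sample();
        Ray r = camera->generate_ray((x + offset.x) / sampleBuffer.w,
                                     (y + offset.y) / sampleBuffer.h);
        Spectrum sample = est_radiance_global_illumination(r);
        total += sample;
        double illum = sample.illum();
        s1 += illum;
        s2 += illum * illum;
        sample_count++;
        batch_count++;

        if (batch_count == samplesPerBatch) {            // end of a batch: test convergence
          batch_count = 0;
          double mean = s1 / sample_count;
          double variance = (s2 - s1 * s1 / sample_count) / (sample_count - 1);
          double I = 1.96 * sqrt(variance) / sqrt((double)sample_count);
          if (I <= maxTolerance * mean) break;           // this pixel has converged
        }
      }
      sampleCountBuffer[x + y * sampleBuffer.w] = sample_count;   // for the rate image
      return total * (1.0 / sample_count);
    }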

Here are my outputs for 64 samples per batch and 0.05 max tolerance:

2048 samples/pixel, 1 sample/light, 5 max ray depth
The rate image 2048 samples/pixel image.
4096 samples/pixel, 1 sample/light, 5 max ray depth
The rate image 4096 samples/pixel image.

One interesting thing to note about the images above is that the ceiling in the 4096 samples/pixel rate image shows more color variation than in the 2048 samples/pixel one, so we can conclude that the ceiling needs more than 2048 but fewer than 4096 samples per pixel in order to converge.