Ray Tracing 101: How does a look-at transform work?

The matrix that transforms world space coordinates to view space is called the view matrix. A look-at matrix is a view matrix that is constructed from an eye point, a look-at direction or target point, and an up vector.

If you take a look at the source code of pbrt-v3 and figure out how a camera spawns a ray, you will notice that the ray is first described in camera space, which means that the origin of the ray initially sits at the point (0,0,0). The ray is then transformed from camera space to world space:

// BSD-2-Clause License - see https://github.com/mmp/pbrt-v3
Float PerspectiveCamera::GenerateRay(const CameraSample &sample,
                                    Ray *ray) const {
    ProfilePhase prof(Prof::GenerateCameraRay);
    // Compute raster and camera sample positions
    Point3f pFilm = Point3f(sample.pFilm.x, sample.pFilm.y, 0);
    Point3f pCamera = RasterToCamera(pFilm);
    *ray = Ray(Point3f(0, 0, 0), Normalize(Vector3f(pCamera)));
    // Modify ray for depth of field
    if (lensRadius > 0) {
        // Sample point on lens
        Point2f pLens = lensRadius * ConcentricSampleDisk(sample.pLens);

        // Compute point on plane of focus
        Float ft = focalDistance / ray->d.z;
        Point3f pFocus = (*ray)(ft);

        // Update ray for effect of lens
        ray->o = Point3f(pLens.x, pLens.y, 0);
        ray->d = Normalize(pFocus - ray->o);
    }
    ray->time = Lerp(sample.time, shutterOpen, shutterClose);
    ray->medium = medium;
    *ray = CameraToWorld(*ray);
    return 1;
}

To keep things simple, let's consider only a 2D environment. Since I favor a test-driven development process, I came up with a test first. In Cucumber style using Gherkin we can formulate the following scenario:

Scenario Outline: Compute a look-at transform in a 2D environment
Given an <eye> and <target> point
When I compute a look-at matrix given those points
Then my computed look-at matrix (listed in row-major fashion) should be <look-at-matrix>

Examples:
    | eye   | target | look-at-matrix                      |
    | (0,0) | (1,0)  | (1,0,0, 0, 0,1,0,0,0,0,1,0,0,0,0,1) |
    | (3,1) | (3,2)  | (0,1,0,-1,-1,0,0,3,0,0,1,0,0,0,0,1) |

I translated this feature description manually into Google Test cases. Of course, the Cucumber framework could be used to simplify this translation process, but that is another story.

TEST(Transform44f, lookat1) {
    Point2f eye{0.0f, 0.0f};
    Point2f target{1.0f, 0.0f};

    Transform44f transform = look_at(eye, target);

    Matrix44f expectedMatrix;
    expectedMatrix << 1.0f,0.0f,0.0f,0.0f,
                      0.0f,1.0f,0.0f,0.0f,
                      0.0f,0.0f,1.0f,0.0f,
                      0.0f,0.0f,0.0f,1.0f;

    ASSERT_TRUE(transform.getMatrix().isApprox(expectedMatrix));
}

TEST(Transform44f, lookat2) {
    Point2f eye{3.0f, 1.0f};
    Point2f target{3.0f, 2.0f};

    Transform44f transform = look_at(eye, target);

    Matrix44f expectedMatrix;
    expectedMatrix <<   0.0f,1.0f,0.0f,-1.0f,
                       -1.0f,0.0f,0.0f, 3.0f,
                        0.0f,0.0f,1.0f, 0.0f,
                        0.0f,0.0f,0.0f, 1.0f;

    ASSERT_TRUE(transform.getMatrix().isApprox(expectedMatrix));
}

A look-at transform is just a simple change of basis. Let’s consider such a basis transform:

[Figure: basis $B_1$, basis $B_2$ and the point $A$]

Let’s assume that basis 1 is the standard basis and basis 2 is translated by $(3,2)$ and rotated by 45 degrees (counter-clockwise). What are the coordinates of point $A$ with respect to basis $B_2$?

The answer is as simple as:

$$ A_{B_2} = (T * R)^{-1} * A_{B_1} $$

All you have to do is move basis 2 back onto basis 1, and then do the same for the point $A$.

$(T*R)$ describes the transform to convert basis 1 to basis 2. This is composed of a translation $T$ and a rotation $R$.

$$ T = \begin{pmatrix} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} $$
$$ R(45°) = \begin{pmatrix} \cos(45°) & -\sin(45°) & 0 & 0 \\ \sin(45°) & \cos(45°) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0.5\sqrt 2 & -0.5\sqrt 2 & 0 & 0 \\ 0.5\sqrt 2 & 0.5\sqrt 2 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} $$
$$ T * R = \begin{pmatrix} 0.5\sqrt 2 & -0.5\sqrt 2 & 0 & 3 \\ 0.5\sqrt 2 & 0.5\sqrt 2 & 0 & 2 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} $$
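The product above can be reproduced numerically. The following is only a minimal sketch: `Mat4`, `identity`, `translation`, `rotation_z` and `multiply` are hypothetical helpers built on `std::array`, not the `Matrix44f` type used in the tests.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical helper type for this sketch: a row-major 4x4 matrix.
using Mat4 = std::array<std::array<double, 4>, 4>;

// 4x4 identity matrix.
Mat4 identity() {
    Mat4 m{};  // value-initialized, so all entries start at zero
    for (std::size_t i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

// Homogeneous translation by (tx, ty) in the xy-plane.
Mat4 translation(double tx, double ty) {
    Mat4 m = identity();
    m[0][3] = tx;
    m[1][3] = ty;
    return m;
}

// Counter-clockwise rotation about the z-axis, angle in radians.
Mat4 rotation_z(double radians) {
    const double c = std::cos(radians);
    const double s = std::sin(radians);
    Mat4 m = identity();
    m[0][0] = c;  m[0][1] = -s;
    m[1][0] = s;  m[1][1] = c;
    return m;
}

// Plain matrix product a * b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            for (std::size_t k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}
```

Calling `multiply(translation(3, 2), rotation_z(pi / 4))` with `pi = std::acos(-1.0)` yields the rotation block in the upper left and the translation $(3,2)$ in the last column. The order matters: the rotation is applied first, the translation second.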

To go the other way around, you just have to invert this transformation:

$$ (T * R)^{-1} \approx \begin{pmatrix} 0.707107 & 0.707107 & 0 & -3.53553 \\ -0.707107 & 0.707107 & 0 & 0.707107 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} $$
$$ A_{B_2} = (T * R)^{-1} * A_{B_1} \approx \begin{pmatrix} 0.707107 & 0.707107 & 0 & -3.53553 \\ -0.707107 & 0.707107 & 0 & 0.707107 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} * \begin{pmatrix} 4 \\ 3 \\ 0 \\ 1 \end{pmatrix} \approx \begin{pmatrix} \sqrt 2 \\ 0 \\ 0 \\ 1 \end{pmatrix} $$
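The inverse does not require a general matrix inversion. As a sketch, write $R_2$ for the upper-left $2 \times 2$ block of $R$ and $t = (3,2)^T$ for the translation part of $T$ (both symbols are introduced here only for illustration). Since the inverse of a rotation is its transpose:

$$ (T * R)^{-1} = R^{-1} * T^{-1}, \qquad R_2^{-1} = R_2^{T} = \begin{pmatrix} 0.5\sqrt 2 & 0.5\sqrt 2 \\ -0.5\sqrt 2 & 0.5\sqrt 2 \end{pmatrix} $$
$$ -R_2^{T} * t = -\begin{pmatrix} 0.5\sqrt 2 & 0.5\sqrt 2 \\ -0.5\sqrt 2 & 0.5\sqrt 2 \end{pmatrix} * \begin{pmatrix} 3 \\ 2 \end{pmatrix} = \begin{pmatrix} -2.5\sqrt 2 \\ 0.5\sqrt 2 \end{pmatrix} \approx \begin{pmatrix} -3.53553 \\ 0.707107 \end{pmatrix} $$

So the rotation block of the inverse is $R_2^{T}$, and its last column is $-R_2^{T} * t$, which agrees with the numeric matrix above.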

Let’s translate this into a unit test for the look_at function:

Examples:
    | eye   | target | look-at-matrix                                                               |
    | (3,2) | (4,3)  | (0.707107,0.707107,0,-3.53553,-0.707107,0.707107,0,0.707107,0,0,1,0,0,0,0,1) |

And here as a gtest:

TEST(Transform44f, lookat3) {
    Point2f eye{3.0f, 2.0f};
    Point2f target{4.0f, 3.0f};

    Transform44f transform = look_at(eye, target);

    Matrix44f expectedMatrix;
    expectedMatrix << 0.707107f, 0.707107f, 0.0f, -3.53553f,
                     -0.707107f, 0.707107f, 0.0f, 0.707107f,
                           0.0f,      0.0f, 1.0f, 0.0f,
                           0.0f,      0.0f, 0.0f, 1.0f;

    ASSERT_TRUE(transform.getMatrix().isApprox(expectedMatrix));
}
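One possible way to make these three tests pass is to build the inverse of $T * R$ directly from the eye and target points. The following is only a sketch under stated assumptions, not the implementation behind the tests above: `Point2` and `Mat4` are hypothetical stand-ins for `Point2f` and `Matrix44f`, and `transform_point` is a helper added for illustration. It also assumes, as the expected matrices imply, that the camera looks along the positive x-axis of view space.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for the Point2f and Matrix44f types of the tests.
struct Point2 { double x, y; };
using Mat4 = std::array<std::array<double, 4>, 4>;  // row-major

// World-to-view matrix for a 2D camera placed at `eye` whose view-space
// x-axis points towards `target`. Precondition: eye != target.
Mat4 look_at(Point2 eye, Point2 target) {
    const double dx = target.x - eye.x;
    const double dy = target.y - eye.y;
    const double len = std::hypot(dx, dy);
    assert(len > 0.0 && "eye and target must differ");

    // Cosine and sine of the counter-clockwise angle between the world
    // x-axis and the viewing direction.
    const double c = dx / len;
    const double s = dy / len;

    Mat4 m{};  // all zeros
    // Upper-left block: transpose (= inverse) of the rotation.
    m[0][0] = c;   m[0][1] = s;
    m[1][0] = -s;  m[1][1] = c;
    // Last column: the eye position mapped through the inverse rotation, negated.
    m[0][3] = -(c * eye.x + s * eye.y);
    m[1][3] = -(-s * eye.x + c * eye.y);
    m[2][2] = 1.0;
    m[3][3] = 1.0;
    return m;
}

// Applies m to the homogeneous point (p.x, p.y, 0, 1) and returns x and y.
Point2 transform_point(const Mat4& m, Point2 p) {
    return {m[0][0] * p.x + m[0][1] * p.y + m[0][3],
            m[1][0] * p.x + m[1][1] * p.y + m[1][3]};
}
```

With this sketch, `look_at({3, 2}, {4, 3})` reproduces the matrix expected in `lookat3`, and `transform_point` maps the target $(4,3)$ onto the view-space x-axis at a distance of $\sqrt 2$ from the eye.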