Homogeneous Coordinates for Computer Graphics

Doug Baldwin
Dept. of Mathematics, SUNY Geneseo

Last modified September 28, 2021

“Homogeneous coordinates” are a convention from geometry (specifically so-called “projective” geometry) for identifying points in space. Homogeneous coordinates also turn out to be a very useful tool for computer graphics. This document introduces homogeneous coordinates as defined in mathematics, and then explores their uses in computer graphics.

Homogeneous Coordinates in Mathematics

Imagine lying on your back and looking up at the night sky, as in Figure 1. From your point of view, the sky is a vast plane above you, and the stars appear to be points of light on that plane.

Person lying on back beneath line studded with stars

Figure 1. The night sky as a plane punctuated by stars

Of course, the stars aren’t really points in the sky plane; they are points scattered through 3-dimensional space. What you see are the “projections” of the stars onto the sky plane. Adding a coordinate system with its origin at your eye, as in Figure 2, provides some vocabulary to talk about these projections. In particular, your view of a star whose actual position is $(x, y, z)$ is the point at which a straight line from that star to your eye intersects the apparent plane of the sky. Calling the height of this plane $z_{s}$, we can call the apparent position of the star $(x^{'}, y^{'}, z_{s})$. This point is the projection of the star onto the sky plane.

Star at position X,Y,Z projects to X prime, Y prime, Z sub s

Figure 2. A star with actual position $(x, y, z)$ has apparent position in the sky plane of $(x^{'}, y^{'}, z_{s})$

Notice that many different points project onto the sky plane at $(x^{'}, y^{'}, z_{s})$. Specifically, all points on the line from the origin (your eye) through $(x^{'}, y^{'}, z_{s})$ project to that point. These points are exactly those with coordinates of the form $(k x^{'}, k y^{'}, k z_{s})$, for any non-zero constant $k$. (The origin itself doesn’t have a unique projection, so we require non-zero $k$. In our physical example, only positive values for $k$ make sense, but in mathematics $k$ can be negative as well.)
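The projection just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `project_to_sky` and the sample coordinates are hypothetical, and the sky plane is taken to be the plane at height `z_s`.

```python
def project_to_sky(point, z_s=1.0):
    """Project a 3D point onto the plane z = z_s along the line through the origin.

    `project_to_sky` is a hypothetical helper name used only for this sketch.
    """
    x, y, z = point
    if z == 0:
        # The line from the origin through such a point never reaches the sky plane.
        raise ValueError("points with z = 0 have no projection onto the sky plane")
    scale = z_s / z  # factor that brings the point's height to z_s
    return (scale * x, scale * y, z_s)


# Three points on the same line through the origin (k = 1, 2, and -1).
near_star = (2.0, 4.0, 8.0)
far_star = (4.0, 8.0, 16.0)
behind_eye = (-2.0, -4.0, -8.0)

print(project_to_sky(near_star, z_s=2.0))   # (0.5, 1.0, 2.0)
print(project_to_sky(far_star, z_s=2.0))    # (0.5, 1.0, 2.0)
print(project_to_sky(behind_eye, z_s=2.0))  # (0.5, 1.0, 2.0)
```

All three sample points share one projection, which is the many-to-one behavior that homogeneous coordinates build on.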

Homogeneous coordinates are a way of describing points in some lower number of dimensions in terms of the points that project to them from some higher number of dimensions. In the night sky example, this means describing points in the sky by points on the lines that project to them. In particular, assume that $z_{s}$ is a known, fixed constant. Then there is no need to include $z_{s}$ when saying where a point in the sky plane is; points in the sky plane can be completely specified by just their $x$ and $y$ coordinates. Another way to look at this is that because the sky plane is indeed a plane, points within it are described by two coordinates. Thus I will henceforth describe sky points by pairs, for example $(x^{'}, y^{'})$. Taking $z_{s} = 1$ for convenience, the homogeneous forms for sky point $(x^{'}, y^{'})$ are then all triples $(x, y, z)$ with $z \neq 0$ in which $\frac{x}{z} = x^{'}$ and $\frac{y}{z} = y^{'}$. Similarly, points in 3-dimensional space, $(x^{'}, y^{'}, z^{'})$, have homogeneous representations $(x, y, z, w)$ in which $x^{'} = \frac{x}{w}$, $y^{'} = \frac{y}{w}$, and $z^{'} = \frac{z}{w}$; the idea generalizes to any number of dimensions.
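As a minimal sketch of the conversions just described, the following Python works in any number of dimensions by treating the last component as the divisor. The function names `to_cartesian` and `to_homogeneous` are assumptions made for this illustration, not names from any particular graphics library.

```python
def to_cartesian(h):
    """Divide every component but the last by the last one."""
    *coords, w = h
    if w == 0:
        raise ValueError("a final component of 0 does not name a Cartesian point")
    return tuple(c / w for c in coords)


def to_homogeneous(p, w=1.0):
    """Return one of the many homogeneous forms of Cartesian point p."""
    return tuple(c * w for c in p) + (w,)


print(to_cartesian((2.0, 4.0, 2.0)))        # (1.0, 2.0)
print(to_cartesian((3.0, 6.0, 9.0, 3.0)))   # (1.0, 2.0, 3.0)
print(to_homogeneous((1.0, 2.0), w=4.0))    # (4.0, 8.0, 4.0)
```

Any non-zero choice of `w` in `to_homogeneous` names the same Cartesian point, so converting back with `to_cartesian` recovers the original coordinates.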

Homogeneous coordinates have lots of applications in the geometry of projections and their relationships to the things they are projections of, but those applications are generally beyond the scope of this introduction. One consequence of homogeneous representations does bear mention before turning to their use in computer graphics, though: if the homogeneous form $(x, y, z, w)$ represents the standard Cartesian point $(\frac{x}{w}, \frac{y}{w}, \frac{z}{w})$, what does $(x, y, z, 0)$ represent? Since in the limit as $w$ goes to $0$, $(x, y, z, w)$ represents points increasingly far from the origin in direction $⟨ x, y, z ⟩$, the convention is to interpret homogeneous points with a $0$ final component as being infinitely far from the origin, in the direction indicated by the other components. Describing points “at infinity” but in a specific direction is thus something that homogeneous coordinates allow but Cartesian coordinates do not.
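The limiting behavior can be seen numerically. The sketch below uses the arbitrarily chosen direction $⟨ 1, 2, 2 ⟩$ and a few positive values of $w$; the helper names are hypothetical.

```python
import math


def to_cartesian(h):
    """Divide every component but the last by the last one (assumed non-zero)."""
    *coords, w = h
    return tuple(c / w for c in coords)


def distance_from_origin(p):
    """Euclidean length of a Cartesian tuple."""
    return math.sqrt(sum(c * c for c in p))


direction = (1.0, 2.0, 2.0)  # sample direction, chosen only for illustration
for w in (1.0, 0.5, 0.25, 0.125):
    p = to_cartesian(direction + (w,))
    # Each halving of w doubles the distance while the direction stays fixed.
    print(w, p, distance_from_origin(p))
```

With these sample values the distances come out as 3, 6, 12, and 24, growing without bound as $w$ shrinks while the point stays on the same ray.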

Homogeneous Coordinates in Computer Graphics

Homogeneous coordinates are used in one of two ways in computer graphics. The most widespread is a restricted form, in which the “extra” coordinate (i.e., the third in two dimensions or the fourth in three) can only take on the values $0$ or $1$. The more general form of homogeneous coordinate, in which the extra coordinate can have any value and is interpreted as a divisor for the other components, is also used occasionally.

The restricted form of homogeneous coordinate is valuable in computer graphics because it solves a problem in representing and implementing transformations of geometric objects. In particular, most of the important geometric transformations in computer graphics can be represented by matrices, and can be applied to points or vectors in Cartesian form by treating the point or vector as a column vector and multiplying it by the transformation’s matrix. This is true of rotations and dilations, and it makes applying those transformations fast and easy because matrix-vector multiplication is well understood and lends itself to full or partial implementation directly in computer hardware. However, the third major transformation in computer graphics, translation, cannot be represented as matrix multiplication in Cartesian coordinates: any matrix multiplied by the origin $(0, 0, 0)$ yields the origin again, whereas a non-trivial translation moves the origin. Translation can, however, be represented as matrix multiplication in homogeneous coordinates.

In particular, let a point $(x, y, z)$ in 3-dimensional space be represented by the homogeneous quadruple $(x, y, z, 1)$, and let a vector $⟨ x, y, z ⟩$ be represented as $(x, y, z, 0)$. (Analogous representations work for points and vectors in two dimensions; I’ll use three-dimensional illustrations in this discussion.) Recalling that in general the homogeneous tuple $(x, y, z, w)$ represents the Cartesian point $(\frac{x}{w}, \frac{y}{w}, \frac{z}{w})$, the representation of points is very reasonable. The representation of vectors draws on the interpretation of homogeneous points with fourth coordinate $0$ as being infinitely far from the origin in a certain direction, i.e., no actual point, but a well-defined direction (and magnitude, from the finite $x$, $y$, and $z$ components).

With these homogeneous representations of points and vectors, a translation by $a$ units in the $x$ direction, $b$ in the $y$, and $c$ in the $z$ can be represented by the matrix

$$\begin{pmatrix} 1 & 0 & 0 & a \\ 0 & 1 & 0 & b \\ 0 & 0 & 1 & c \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

Multiplying a point in column form by this matrix, i.e.,

$$\begin{pmatrix} 1 & 0 & 0 & a \\ 0 & 1 & 0 & b \\ 0 & 0 & 1 & c \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}$$

produces the translated point, namely

$$\begin{pmatrix} x + a \\ y + b \\ z + c \\ 1 \end{pmatrix}$$

as it should. Similarly, multiplying a vector by a translation matrix produces the original vector. This is again the correct result, since vectors do not have positions and thus should be unaffected by translations:

$$\begin{pmatrix} 1 & 0 & 0 & a \\ 0 & 1 & 0 & b \\ 0 & 0 & 1 & c \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 0 \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \\ 0 \end{pmatrix}$$
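The two products above can be checked with a short, dependency-free Python sketch. The helper names `translation` and `mat_vec`, and the sample numbers, are assumptions made for this illustration; real graphics code would normally use a matrix library or the GPU.

```python
def translation(a, b, c):
    """4x4 homogeneous matrix for a translation by (a, b, c)."""
    return [
        [1, 0, 0, a],
        [0, 1, 0, b],
        [0, 0, 1, c],
        [0, 0, 0, 1],
    ]


def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by column vector v (a list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


T = translation(10, 20, 30)
point = [1, 2, 3, 1]    # fourth coordinate 1 marks a point
vector = [1, 2, 3, 0]   # fourth coordinate 0 marks a vector

print(mat_vec(T, point))   # [11, 22, 33, 1]
print(mat_vec(T, vector))  # [1, 2, 3, 0]
```

The fourth coordinate acts as a switch: the translation column is multiplied by 1 for points and by 0 for vectors.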

For transformations that can already be represented by matrices operating on Cartesian coordinates, there is a simple conversion to matrices for homogeneous coordinates: if

$$M = \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix}$$

represents a transformation in Cartesian coordinates, then

$$M^{'} = \begin{pmatrix} m_{11} & m_{12} & m_{13} & 0 \\ m_{21} & m_{22} & m_{23} & 0 \\ m_{31} & m_{32} & m_{33} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

represents the same transformation in homogeneous coordinates, i.e., a transformation that when applied to a homogeneous representation of point $P$ produces the homogeneous representation of point $M P$, and when applied to the homogeneous representation of a vector $V$ produces the homogeneous representation of vector $M V$.
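A numerical spot check of this construction, for one sample matrix, might look like the following sketch. It is not a proof (the general argument is left to the exercise below), and the helper names `embed` and `mat_vec` are hypothetical. The sample $M$ is a 90-degree rotation about the $z$ axis.

```python
def embed(m):
    """Build the 4x4 homogeneous matrix M' from a 3x3 Cartesian matrix M."""
    return [list(row) + [0] for row in m] + [[0, 0, 0, 1]]


def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by column vector v (a list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


# Sample transformation: rotation by 90 degrees about the z axis.
M = [
    [0, -1, 0],
    [1,  0, 0],
    [0,  0, 1],
]
M_prime = embed(M)
p = [1, 2, 3]

print(mat_vec(M, p))              # [-2, 1, 3]
print(mat_vec(M_prime, p + [1]))  # [-2, 1, 3, 1]  (p treated as a point)
print(mat_vec(M_prime, p + [0]))  # [-2, 1, 3, 0]  (p treated as a vector)
```

In both homogeneous results the first three components match the Cartesian product and the fourth component is carried through unchanged.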

Exercises

1. Verify that the matrix $M^{'}$ given in the “Homogeneous Coordinates in Computer Graphics” section does in fact implement the “same” transformation, as defined in the text, as matrix $M$.