
I'm learning about 3D computer graphics, but I'm having a hard time understanding why the near plane of a viewing frustum cannot be placed at z position $0$ (right at the camera).

I can understand conceptually that the near plane is essentially the retinal canvas -- so by definition it must exist to some extent -- but I have trouble understanding why the near plane isn't simply an abstract concept whose actual position is infinitesimally close to the camera position, as opposed to sitting at a definite z position away from it.

In several explanations, the following formula is given to explain why $0$ cannot be used as the near plane position (where $A$ is the camera position, $B$ is an object vertex, and $D$ is the perspective projection of point $B$ onto the near plane):

[Figure: similar right triangles]

$${ BC \over AC } = { DE \over AE }$$

In this case, the geometry of similar triangles $ABC$ and $ADE$ is used to determine the height of $D$ via the solution of $DE$. It is obvious that if the near plane is at $0$ ($AE = 0$), then a division by $0$ occurs -- which is given as the reason the near plane cannot be located at position $0$.

However, why is this method used to determine the position of $D$ on the canvas?

I've written a simple raycasting visualizer before and didn't have an explicitly defined near plane. In my engine, I simply defined a $60^\circ$ field of view and divided the number of pixels on my screen across that field of view. For example, for a $300 \times 300$ screen:

$$300 \text{ pixels} \div 60^\circ = 5 \text{ pixels per degree}$$

Next, I found the angle at my camera between the view axis and the object vertex ($\angle BAC$) and multiplied it by $5$ pixels per degree to acquire the pixel coordinate on my screen. In this method, no explicit near plane was necessary and I used my actual camera position to determine the angle.
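Roughly, the mapping I used looked like this (a simplified sketch with made-up names, not my actual engine code):

```cpp
// Simplified sketch of the angle-based mapping described above (hypothetical
// names): the horizontal angle from the view axis to a vertex is converted
// directly into a pixel column, with no near plane involved.
#include <cmath>
#include <cstdio>

int main() {
    const double pi              = std::acos(-1.0);
    const double fovDegrees      = 60.0;
    const int    screenWidth     = 300;
    const double pixelsPerDegree = screenWidth / fovDegrees;  // 5 px per degree

    // Camera at the origin looking down the +X axis; a sample vertex B.
    const double bx = 10.0, by = 2.0;

    // Signed angle BAC between the view axis and the vertex, in degrees.
    const double angle = std::atan2(by, bx) * 180.0 / pi;

    // Shift so the left edge of the view (-30 degrees) maps to column 0.
    const int column = static_cast<int>((angle + fovDegrees / 2.0) * pixelsPerDegree);

    std::printf("vertex maps to pixel column %d\n", column);
    return 0;
}
```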

So how was I able to perform a perspective projection without a near plane in my raycasting method?

  • One key point of the near and far planes is that they set how the depth changes throughout the region between them. Commented May 16, 2017 at 15:08

2 Answers

12

The near and far planes of a viewing frustum aren't needed for simple 3D→2D projection. What the near and far planes actually do, in a typical rasterizer setup, is define the range of values for the depth buffer.

Depths in the [near, far] range will be mapped into [0, 1] to be stored in the depth buffer. However, the depths aren't simply linearly rescaled. Instead, we use a hyperbolic mapping for depth: the value stored in the depth buffer is the reciprocal of the actual depth distance, with a scale and offset applied.
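Written out (my notation here; the exact constants depend on the API and depth-range convention), the stored value for an eye-space distance $z$ between the near distance $n$ and far distance $f$ is

$$d(z) \;=\; \frac{f}{f-n} \;-\; \frac{fn}{(f-n)\,z}$$

i.e. a function of the form $a + b/z$, with $a$ and $b$ chosen so that $d(n) = 0$ and $d(f) = 1$.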

[Figure: graph of the standard hyperbolic depth mapping]

If you look at this curve and imagine moving the near plane value toward $z = 0$, the corresponding point on the $1/z$ curve would shoot up toward infinity. The math blows up and the depth buffer becomes useless (it suffers a catastrophic loss of precision).
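As a rough numerical illustration (using the $d(z)$ form above rather than any particular API's exact constants), here is what the stored values do as the near plane is pushed toward zero:

```cpp
// Evaluate the hyperbolic depth mapping d(z) = f/(f-n) - f*n/((f-n)*z)
// for a reasonable near distance and for one pushed toward zero, to show
// how the stored values collapse toward 1 and precision is destroyed.
#include <cstdio>

double storedDepth(double z, double n, double f) {
    return f / (f - n) - (f * n) / ((f - n) * z);
}

int main() {
    const double f           = 1000.0;
    const double nears[]     = {0.1, 0.0001};
    const double distances[] = {1.0, 10.0, 100.0, 1000.0};

    for (double n : nears) {
        std::printf("near = %g:\n", n);
        for (double z : distances)
            std::printf("  z = %7.1f -> stored depth = %.6f\n", z, storedDepth(z, n, f));
    }
    return 0;
}
```

With a near distance of $0.1$ the values are already crowded toward $1$; with a near distance of $0.0001$, everything from $1$ to $1000$ units lands in the top sliver of the $[0, 1]$ range, and at exactly $0$ the mapping degenerates entirely.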

The reason why we use the reciprocal instead of some other function of depth is basically convenience: the reciprocal depth interpolates nicely in screen space, and it fits naturally into the mathematical framework of perspective projections: there's already a divide by $z$ being applied to the $x, y$ coordinates.

If you'd like to know more, I have a short article on the topic: Depth Precision Visualized.

  • I want to understand the purpose of the depth buffer. I've only attempted a wireframe renderer, so I didn't consider depth when plotting the 3D points to a 2D screen (I could plot the points and lines in any order and it would look right). Is the purpose of the depth buffer to maintain a record of the Z distance of each point once it's been projected to 2D so that solid polygons may be drawn in the correct order? Is this why I didn't require a near/far plane in my simple wireframe visualizer? Commented May 17, 2017 at 18:01
  • @VilhelmGray Right, the depth buffer records the depth of the nearest surface at each pixel, so that when rasterizing a triangle, you can tell if it should be occluded by the previously rendered pixels or not. But if you don't care about depth sorting (because you're drawing wireframe, you've pre-sorted your triangles already, or some other reason) then there's no need for a depth buffer. Commented May 17, 2017 at 18:31
7

In this case, the geometry of similar triangles ABC and ADE is used to determine the height of D via the solution of DE. It is obvious that if the near plane is at 0 (AE=0), then a division by 0 occurs -- hence, why the near plane cannot be located at position 0.

This is not why the nearZ plane cannot be zero. The goal of the perspective math is not to project point $B$ onto the near plane. If you look at the actual perspective matrices, you'll find that the nearZ only applies to the computed clip-space Z coordinate, not the X and Y.

Remember: projection is essentially transforming a position such that you lose one or more dimensions. You're projecting from 3D space to 2D space. So projection is removing the Z component, projecting a scene into a 2D region of space. Orthographic projection just discards the Z; perspective projection does something more complex, rescaling the X and Y based on the Z.
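Stripped of field-of-view and aspect-ratio scaling (my simplification, assuming OpenGL's convention of the camera looking down the $-Z$ axis), that rescaling is just

$$X_{ndc} = \frac{X_{camera}}{-Z_{camera}}, \qquad Y_{ndc} = \frac{Y_{camera}}{-Z_{camera}}$$

and note that no near or far value appears in it.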

Of course, we're not really doing that; Z still exists. Otherwise, the depth buffer wouldn't work. But the basic perspective projection math does not define how the Z is computed.

Therefore, we can compute the Z however we want. The traditional depth equation (for OpenGL's standard NDC space) is where near/farZ come from:

$$Z_{ndc} = \frac{\frac{Z_{camera}(F + N)}{N - F} + \frac{2NF}{N - F}}{-Z_{camera}}$$

N and F are near/farZ, respectively. If N is zero, then the equation degenerates:

$$Z_{ndc} = \frac{\frac{Z_{camera}(F)}{-F} + 0}{-Z_{camera}}$$ $$Z_{ndc} = \frac{-Z_{camera}}{-Z_{camera}}$$ $$Z_{ndc} = 1$$

Oops.
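A quick numeric check (my own, not part of the original derivation): plugging a few camera-space depths into the equation with a reasonable near value and then with $N = 0$ shows the collapse directly.

```cpp
// Evaluate the OpenGL-style Z_ndc equation above for a few camera-space
// depths, once with a normal near plane and once with N = 0, showing that
// the N = 0 case maps every depth to the same NDC value of 1.
#include <cstdio>

double zNdc(double zCamera, double n, double f) {
    return (zCamera * (f + n) / (n - f) + 2.0 * n * f / (n - f)) / -zCamera;
}

int main() {
    const double zCameras[] = {-0.5, -1.0, -10.0, -100.0};  // points in front of the camera
    const double nears[]    = {0.5, 0.0};
    const double f          = 100.0;

    for (double n : nears) {
        std::printf("N = %g:\n", n);
        for (double z : zCameras)
            std::printf("  Z_camera = %6.1f -> Z_ndc = %.6f\n", z, zNdc(z, n, f));
    }
    return 0;
}
```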

So your question really is why we use depth computations (which again, don't really have anything to do with perspective projection) that require near/farZ values like this.

This equation (and others like it) has some very nice properties. It plays well with the clip-space W component needed for perspective projection (the division by $-Z_{camera}$). But it also has the effect of being a non-linear Z projection. It puts more precision at the front of the scene than in the distance. And that's generally where you need depth precision the most.

There are alternative ways to compute the clip-space Z. You can use a linear method, which doesn't forbid near Z from being 0. You can also use a method that allows for an infinitely distant far Z (but still relies on near Z); this requires a floating-point Z buffer, though. And so forth.
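As one example (a worked limit I'm adding here, not spelled out above), letting $F \to \infty$ in the earlier equation removes the far plane entirely:

$$Z_{ndc} \;=\; \lim_{F \to \infty} \frac{\frac{Z_{camera}(F + N)}{N - F} + \frac{2NF}{N - F}}{-Z_{camera}} \;=\; \frac{-Z_{camera} - 2N}{-Z_{camera}} \;=\; 1 + \frac{2N}{Z_{camera}}$$

The result still sweeps $[-1, 1)$ as $Z_{camera}$ runs from $-N$ toward $-\infty$, but notice that it still requires $N > 0$.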

You haven't really shown your code, so I can't say if it truly "works". The clip-space W in particular is important, as it is used for perspective-correct interpolation of vertex data. If you set it to one, then you won't get proper perspective-correct interpolation.
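For context (this is standard rasterizer math, not something specific to the code in question), perspective-correct interpolation of a vertex attribute $a$ across a triangle interpolates $a/w$ and $1/w$ linearly in screen space and divides at each pixel:

$$a_{pixel} \;=\; \frac{\sum_i \lambda_i \, a_i / w_i}{\sum_i \lambda_i / w_i}$$

where the $\lambda_i$ are screen-space barycentric weights. If every $w_i$ is $1$, this collapses to plain affine interpolation, and the perspective foreshortening of textures and other attributes is lost.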

