$\begingroup$

I am an amateur game developer, and lately I've been reading a lot about how computer graphics rendering works. I was reading about the Z-buffer recently and I can't quite wrap my head around what exactly the Z-buffer looks like in terms of memory. It's described as containing depth information for each fragment that will be drawn on-screen, and modern Z-buffers are said to use 32 bits per value, so would that mean that on, say, a 1920x1080 screen it'd be just above 8 MB (1920 × 1080 × 32 bits ≈ 8.3 MB) per frame?

I still don't quite understand how significant that amount of data is to a GPU (maybe it can crunch it easily), or whether the figure is even correct. Most demonstration implementations I found represent the Z-buffer as a simple array of size (height * width), so I'm basing my reasoning on that, as sketched below.
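
For concreteness, this is roughly the naive layout I have in mind from those examples (just a sketch, with the resolution hard-coded):

```c
#include <stdio.h>
#include <stdlib.h>

#define WIDTH  1920
#define HEIGHT 1080

int main(void)
{
    /* One 32-bit depth value per pixel, stored scanline by scanline. */
    float *zbuffer = malloc((size_t)WIDTH * HEIGHT * sizeof(float));

    /* 1920 * 1080 * 4 bytes = 8,294,400 bytes, i.e. just above 8 MB. */
    printf("Z-buffer size: %zu bytes\n",
           (size_t)WIDTH * HEIGHT * sizeof(float));

    free(zbuffer);
    return 0;
}
```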

$\endgroup$

2 Answers

$\begingroup$

The Z-buffer used to be specialized memory set aside for a single purpose, and some websites still explain it that way, but that is no longer the case.

Now the Z-buffer is just a chunk of memory that you allocate yourself, or that an API like OpenGL allocates on your behalf.

The size of that chunk of memory depends on the type of the Z-buffer values. In the example you gave they are 32-bit floating-point values, but 24 bits per value is also very common. Always choose the smallest size the program needs, as it can have a large effect on the performance of the application. The per-value size is indeed multiplied by the number of pixels in the framebuffer, so roughly 8 MB is correct for the example you gave.
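
For example, with OpenGL you typically let the API allocate that memory as a depth attachment. A minimal sketch (assuming a current GL context and a framebuffer object `fbo` created earlier) might look like this:

```c
/* Sketch: attach a 24-bit depth renderbuffer to an existing FBO.
 * Assumes a current OpenGL context and that `fbo` was created earlier. */
GLuint depthRb;
glGenRenderbuffers(1, &depthRb);
glBindRenderbuffer(GL_RENDERBUFFER, depthRb);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT24, 1920, 1080);

glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                          GL_RENDERBUFFER, depthRb);

/* Swap GL_DEPTH_COMPONENT24 for GL_DEPTH_COMPONENT32F if you really
 * need a 32-bit float depth buffer (roughly 8 MB at 1920x1080). */
```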

The values that get stored in it are the depth values of any geometry drawn to the associated framebuffer. It is important to realize that these are NOT linear view-space depths: they are the non-linear values that result from the projection (MVP) transform and perspective divide, so they cannot be used directly as linear distances for things like shadow maps without converting them back.
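
If you do need a linear depth back, you have to undo the projection. A common sketch (assuming a standard OpenGL perspective projection with given near and far planes, and a depth-buffer value in [0, 1]) is:

```c
/* Sketch: recover linear eye-space depth from a non-linear depth-buffer
 * value, assuming a standard OpenGL perspective projection.
 * depth01 is the value read from the depth buffer, in [0, 1]. */
float linearize_depth(float depth01, float nearPlane, float farPlane)
{
    float zNdc = depth01 * 2.0f - 1.0f;   /* remap [0,1] -> [-1,1] */
    return (2.0f * nearPlane * farPlane) /
           (farPlane + nearPlane - zNdc * (farPlane - nearPlane));
}
```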

Fragments produced by each draw call have their depth values tested against the existing values in the Z-buffer. If the test passes, the GPU writes the fragment and updates the Z-buffer with the new value; if not, the fragment is discarded and the Z-buffer is left untouched.
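
Conceptually, the per-fragment test the hardware performs is equivalent to something like the following software sketch (using the common "less than" comparison; the `framebuffer` and fragment parameters here are illustrative, not any real API):

```c
/* Sketch of the depth test for one fragment at pixel (x, y).
 * zbuffer is a width*height array of floats; names are illustrative. */
void depth_test_fragment(float *zbuffer, unsigned *framebuffer, int width,
                         int x, int y, float fragDepth, unsigned color)
{
    int idx = y * width + x;
    if (fragDepth < zbuffer[idx]) {   /* test passes: fragment is closer  */
        zbuffer[idx]     = fragDepth; /* update the stored depth          */
        framebuffer[idx] = color;     /* write the fragment's color       */
    }
    /* otherwise the fragment is discarded and the Z-buffer is untouched */
}
```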

A few other little details:

The Z-buffer is generally cleared at the beginning of each frame with a clear value that can be set (or must be set) via the API. This becomes the default value that incoming depth values are tested against.
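
In OpenGL, for instance, that looks roughly like this (just a sketch; the clear value defaults to 1.0, which is the far plane under the default depth comparison):

```c
/* Sketch: clear the depth buffer at the start of a frame. */
glClearDepth(1.0);
glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT);
```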

GPUs have specialized hardware for writing the Z-buffer. This hardware can speed up writes to memory by a factor of 2 or more and can be leveraged when creating things like shadow maps, so it is not limited to use with just the Z-buffer.

Depth testing can be turned off/on for each draw call, which can be useful.
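
With OpenGL that is just state you flip between draw calls, roughly like this (the two draw helpers are illustrative placeholders, not GL calls):

```c
/* Sketch: toggling the depth test around draw calls. */
glEnable(GL_DEPTH_TEST);
glDepthFunc(GL_LESS);        /* pass fragments closer than the stored value */
drawOpaqueGeometry();        /* illustrative helper, not a GL call          */

glDisable(GL_DEPTH_TEST);
drawFullscreenOverlay();     /* e.g. UI that should ignore scene depth      */
```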

$\endgroup$
  • $\begingroup$ Thank you for the detailed answer! So the GPU is specialized to write and read these buffers, huh. I was under the impression that, with the z-buffer, stencil and possibly other stuff, the data would be somewhat too large to handle effectively (with it being in the megabytes already), but I may be underestimating modern data transfer and processing tech. $\endgroup$
    – Carmo
    Commented Mar 25, 2021 at 20:50
  • $\begingroup$ All that data flowing around does have a performance effect: it has to move in and out of the cache, and there are limits that have to be considered. But yeah, modern GPUs can be pretty impressive in how much data they can move and how much processing they can do. My "old and slow" GPU clocks in at 2 teraflops. $\endgroup$
    – pmw1234
    Commented Mar 25, 2021 at 21:12
  • $\begingroup$ "Depth testing can be turned off/on for each draw call, which can be useful" It could also be counter-productive. Whenever possible (i.e. there are cases when it isn't), the GPUs I've helped develop do the Z-test first and (again, if possible) update the Z. Turning it on/off may reduce the chances of rejecting pointless calculations. $\endgroup$
    – Simon F
    Commented Mar 29, 2021 at 10:42
$\begingroup$

"Most demonstration implementations I found represent the Z-buffer as a simple array of size (height * width), so I'm basing my reasoning on that."

I suspect a purely linear (i.e., scanline-by-scanline) arrangement for the Z-buffer, at least for HW systems, is unlikely as that would tend to lead to a lot of DRAM "page-breaks". This is because each triangle tends to get rendered in its entirety, which means that as you go from one scanline of pixels to the next, the memory addresses are likely to jump dramatically.

Tiled/Morton ("twiddled") layouts for the Z-buffer are probably more likely (along with a more tile-based rendering order).
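
As an illustration of what a Morton ("twiddled") layout means in practice, here is a small sketch that interleaves the bits of the x and y pixel coordinates to form the memory index, so that pixels that are close together in 2D also land close together in memory:

```c
#include <stdint.h>

/* Spread the low 16 bits of v so there is a zero bit between each bit. */
static uint32_t part1by1(uint32_t v)
{
    v &= 0x0000FFFF;
    v = (v | (v << 8)) & 0x00FF00FF;
    v = (v | (v << 4)) & 0x0F0F0F0F;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}

/* Morton (Z-order) index: bits of x and y interleaved. */
static uint32_t morton_index(uint32_t x, uint32_t y)
{
    return part1by1(x) | (part1by1(y) << 1);
}
```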

$\endgroup$
  • $\begingroup$ That's interesting! I'm not that knowledgeable about how memory-layout optimizations work, so this "Morton" layout you mentioned is new to me. I do have a superficial understanding of what you mean about a purely linear layout not being ideal, though. $\endgroup$
    – Carmo
    Commented Mar 30, 2021 at 23:56
  • $\begingroup$ It's worth finding and reading Jim Blinn's "The Truth about Texture Mapping" article for a related topic dealing with pixel/memory layout. $\endgroup$
    – Simon F
    Commented Mar 31, 2021 at 18:02
