
If I understood correctly, this is the process of rendering an object:

  1. An array of 4D vectors defining 3D points (with the fourth component set to 1) representing the object's vertices in object/local space:
    {1, 1.21, 2.12, 1}
    {1, 1.21, 2.12, 1}
    {1, 1.21, 2.12, 1}
    {1, 1.21, 2.12, 1}
  2. Constructing a translation matrix (where x, y, z are the new positions on each axis):
1 0 0 0
0 1 0 0
0 0 1 0
x y z 1
  3. Constructing a rotation matrix (here I don't yet understand how quaternions and rotations work; what I know is that there is a rotation matrix that you fill with the sines and cosines of the angles on each axis).

  4. Constructing a scaling matrix (x, y, z are the scaling amounts on each axis):

x 0 0 0
0 y 0 0
0 0 z 0
0 0 0 1
  5. Multiplying all three to obtain the Model/Object matrix (see the sketch after this list).

  6. The view/camera matrix, which handles the position of the scene observer: feed it the location of the observer, the direction it is looking, and the global up direction in the right places, and it will move the scene accordingly.

  7. The projection matrix, which handles the aspect ratio, field of view, and near/far clipping planes.

  8. Multiplying the Model, View, and Projection matrices, and then multiplying the resulting matrix with each vertex position.

  9. On each vertex, dividing the x, y, z components by w to project the 3D scene onto the 2D screen (obtaining vertices in NDC space, ranging from -1 to 1).

  10. Remapping the NDC values to screen coordinates with a simple calculation.

  11. The GPU interpolates pixel positions in screen coordinates and fills them with color accordingly.
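
Putting the list together, here is a minimal CPU-side sketch of steps 2 through 10. It assumes the GLM math library (not mentioned above, just a common choice alongside OpenGL); note that GLM uses column vectors, so the combined matrix is built as Projection * View * Model rather than the row-vector order implied by the translation matrix layout above. All positions, angles, and the 1280x720 screen size are made-up illustrative values.

    #include <glm/glm.hpp>
    #include <glm/gtc/matrix_transform.hpp>
    #include <cstdio>

    int main() {
        // Step 1: an object-space vertex with w = 1
        glm::vec4 vertex(1.0f, 1.21f, 2.12f, 1.0f);

        // Steps 2-4: translation, rotation, scale
        glm::mat4 T = glm::translate(glm::mat4(1.0f), glm::vec3(2.0f, 0.0f, -1.0f));
        glm::mat4 R = glm::rotate(glm::mat4(1.0f), glm::radians(45.0f), glm::vec3(0.0f, 1.0f, 0.0f));
        glm::mat4 S = glm::scale(glm::mat4(1.0f), glm::vec3(1.5f));

        // Step 5: model matrix (scale first, then rotate, then translate)
        glm::mat4 model = T * R * S;

        // Step 6: view matrix from eye position, look-at target, and world up
        glm::mat4 view = glm::lookAt(glm::vec3(0.0f, 2.0f, 5.0f),   // eye
                                     glm::vec3(0.0f, 0.0f, 0.0f),   // target
                                     glm::vec3(0.0f, 1.0f, 0.0f));  // up

        // Step 7: projection from fov, aspect ratio, and near/far planes
        glm::mat4 proj = glm::perspective(glm::radians(60.0f), 16.0f / 9.0f, 0.1f, 100.0f);

        // Step 8: combined MVP, then transform the vertex into clip space
        glm::mat4 mvp  = proj * view * model;
        glm::vec4 clip = mvp * vertex;

        // Step 9: perspective divide -> NDC in [-1, 1]
        glm::vec3 ndc = glm::vec3(clip) / clip.w;

        // Step 10: remap NDC to window coordinates (the GPU's viewport transform)
        float width = 1280.0f, height = 720.0f;
        float sx = (ndc.x * 0.5f + 0.5f) * width;
        float sy = (1.0f - (ndc.y * 0.5f + 0.5f)) * height;  // y flipped for a top-left origin

        std::printf("screen position: (%.1f, %.1f)\n", sx, sy);
    }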

In DirectX and OpenGL, I know that the matrices are computed on the CPU and sent to the GPU, and the vertex shader computes vertexPos * MVP (Model * View * Projection). The GPU then handles the perspective divide by w, the remapping to screen coordinates, and the pixel interpolation. But why do the initial multiplication on the CPU if we could just send the parameters to the GPU (translation, rotation, scaling, eye position, eye direction, up, fov, aspect ratio, near/far clipping planes) and let the GPU handle the matrix magic behind the scenes? Wouldn't the GPU do those matrix multiplications faster than the CPU?
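
For reference, this is roughly what the conventional CPU-side loop looks like: the matrix chain is collapsed once per object, and only the finished MVP is handed to the GPU. This is a sketch assuming GLM; the Object struct, its fields, and the mvpLocation/uMVP names in the comment are hypothetical.

    #include <glm/glm.hpp>
    #include <glm/gtc/matrix_transform.hpp>
    #include <vector>
    #include <cstdio>

    struct Object {
        glm::vec3 position = glm::vec3(0.0f);
        float     angleY   = 0.0f;               // rotation about the Y axis, in radians
        glm::vec3 scale    = glm::vec3(1.0f);
    };

    int main() {
        std::vector<Object> objects(3);

        // Camera and projection are shared by every object this frame.
        glm::mat4 view     = glm::lookAt(glm::vec3(0, 2, 5), glm::vec3(0), glm::vec3(0, 1, 0));
        glm::mat4 proj     = glm::perspective(glm::radians(60.0f), 16.0f / 9.0f, 0.1f, 100.0f);
        glm::mat4 viewProj = proj * view;

        for (const Object& o : objects) {
            glm::mat4 model = glm::translate(glm::mat4(1.0f), o.position)
                            * glm::rotate(glm::mat4(1.0f), o.angleY, glm::vec3(0, 1, 0))
                            * glm::scale(glm::mat4(1.0f), o.scale);
            glm::mat4 mvp = viewProj * model;    // a handful of multiplies per OBJECT, not per vertex

            // In a real renderer this single matrix would now be uploaded once, e.g.
            // glUniformMatrix4fv(mvpLocation, 1, GL_FALSE, glm::value_ptr(mvp)),
            // and the vertex shader would only do: gl_Position = uMVP * vertexPosition;
            std::printf("object mvp[3][3] = %f\n", mvp[3][3]);
        }
    }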


2 Answers


Wouldn't the GPU do those matrix multiplications faster than the CPU?

No. GPUs are not a magic "make things go faster" box.

Vertex shaders operate on every vertex. If you render 1 million vertices, there have to be (roughly) one million executions of a vertex shader.

So let's have a VS take these matrix parameters and build a matrix from them. One million times. But because these parameters are uniforms, they don't change; you're computing the exact same matrix each time. One million times.

That is a fantastic waste of GPU processing time. Vertex shaders exist to do per-vertex processing. That's what they ought to do. Per-object computations, or per-scene computations ought to be handled by something else.

Also, how many objects do you have? Maybe 100,000 at the most? A modern CPU can do 100,000 matrix multiplies per frame just fine.
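
That claim is easy to sanity-check with a rough micro-benchmark sketch (assuming GLM; the objectCount and the timing approach are mine, and the actual numbers depend entirely on hardware and compiler flags):

    #include <glm/glm.hpp>
    #include <glm/gtc/matrix_transform.hpp>
    #include <chrono>
    #include <cstdio>

    int main() {
        const int objectCount = 100000;
        glm::mat4 viewProj = glm::perspective(glm::radians(60.0f), 16.0f / 9.0f, 0.1f, 100.0f)
                           * glm::lookAt(glm::vec3(0, 2, 5), glm::vec3(0), glm::vec3(0, 1, 0));

        glm::mat4 acc(0.0f);   // accumulate results so the loop isn't optimized away
        auto t0 = std::chrono::steady_clock::now();
        for (int i = 0; i < objectCount; ++i) {
            glm::mat4 model = glm::translate(glm::mat4(1.0f), glm::vec3(float(i), 0.0f, 0.0f));
            acc += viewProj * model;             // one 4x4 matrix multiply per object
        }
        auto t1 = std::chrono::steady_clock::now();

        double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        std::printf("%d multiplies in %.2f ms (acc[0][0] = %f)\n", objectCount, ms, acc[0][0]);
    }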

Furthermore, you're presupposing that a transformation for a model can be boiled down to a single "translation, rotation, scaling". That is usually not the case.

Let's say you have a base object, like a train. And you have a person on top of the train. If the train rotates, the person should rotate accordingly. But the person also needs to be able to rotate relative to the train. Same goes for the translation and such; if the train moves, it needs to move the person on it. But the person also needs to move relative to the train.

You could send an array of translations/rotations/scalings/etc. that the VS would then compose into a final transformation.

Or the CPU can just multiply the train matrix and the local person matrix to produce the full composite transform for the person. Plus, you gain the benefit of being able to render both objects with the same vertex shader, since the VS does not need to know what the actual transformation matrix represents. It just needs to know that this one transformation matrix is how this object gets to camera space. Each object gets different data, but the shape of that data is always just one matrix.
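
A sketch of that composition, again assuming GLM; the train and person transforms are made-up values:

    #include <glm/glm.hpp>
    #include <glm/gtc/matrix_transform.hpp>
    #include <cstdio>

    int main() {
        // World transform of the train (illustrative values).
        glm::mat4 train = glm::translate(glm::mat4(1.0f), glm::vec3(10.0f, 0.0f, 0.0f))
                        * glm::rotate(glm::mat4(1.0f), glm::radians(30.0f), glm::vec3(0, 1, 0));

        // Person's transform relative to the train: standing 2 units above its origin,
        // turned 90 degrees relative to the train.
        glm::mat4 personLocal = glm::translate(glm::mat4(1.0f), glm::vec3(0.0f, 2.0f, 0.0f))
                              * glm::rotate(glm::mat4(1.0f), glm::radians(90.0f), glm::vec3(0, 1, 0));

        // Single composite matrix handed to the (shared) vertex shader for the person.
        glm::mat4 personWorld = train * personLocal;

        // If the train moves or rotates, only `train` changes; the composition
        // automatically carries the person along.
        std::printf("person world position: (%.2f, %.2f, %.2f)\n",
                    personWorld[3].x, personWorld[3].y, personWorld[3].z);
    }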

---

It is perfectly reasonable to do the multiplication on the GPU. Part of the reason it is taught this way is that it is the historical standard, but the main reason is this: if you have a vertex shader with a line like the following in it...

gl_Position = Projection * View * ModelTranslation * ModelScale * ModelReflection * vertex;

...and there are in the neighborhood of 10 million vertices across thousands of models, then you get in the neighborhood of 50 million 4x4 matrix multiplications per frame.

Each 4x4 matrix multiplication is roughly 64 basic operations, so that is about 3.2 billion basic operations that can easily be avoided.

But if the majority of the multiplications are handled on the CPU, then you can easily remove 40 or 50 million 4x4 matrix multiplications, i.e. billions of basic operations, per frame.

And that is one of the major reasons (perhaps the major reason) for doing as much work as possible on the CPU.


It is easy to say "don't do that" and give the standard argument, but I wanted to elaborate a little on the pros and cons of doing these computations on the GPU.

Also, your question is more along the lines of CPU vs. GPU, so I've included a few good reasons to use the GPU that don't involve the vertex shader.

Pros:

  1. It reduces bandwidth usage between the GPU and CPU. If there is already a lot of chatter between the GPU and CPU, such as transferring texture data, then the options for reducing bandwidth should include computing the matrices on the GPU, perhaps in the vertex shader, perhaps in a compute shader.

  2. Don't use the vertex shader, but rather a dedicated compute shader that runs once and does the computations. There is a window at the end of each frame where the GPU is still producing output and is "ramping down". It is hard to squeeze graphics work in there, but compute shaders can sneak in and perform a considerable amount of work.

  3. Debugging and development reasons; once that's done, go back to using the CPU.

  4. Education, or because you just feel like it! Don't let us push you around; breakthroughs are made by stubborn developers who have an aha moment.

Cons:

  1. What a waste of computational power, especially when the numbers start to grow very large. Enough said.

  2. I see older GPUs, they're everywhere! Having an app that can run on as many GPUs as possible broadens its user base.

  3. GPUs tend to be power hungry. Squeezing more work onto the GPU will just make that worse. This is particularly important for battery-operated devices.

  4. Debugging and development (it's on both lists!). Debugging on the GPU is not nearly as easy as on the CPU. If you lack confidence in the matrices themselves, debugging on the CPU will be much easier.
