9
$\begingroup$

tl;dr: Math problem in projective geometry: How does one find some 4x4 camera matrix that gives a projection as illustrated below, such that points A,B,C,D are somewhere on the edges of the unit box (e.g. OpenGL normalized device coordinates), and the corners of the unit box fall somewhere reasonable along the rays EA, EB, EC, ED?

(This may be a special case of a homography, a perspectivity, and/or a collineation; I'm not familiar with the terminology.)


elaboration

Given a quadrilateral ABCD within the viewport, I think there exists a unique(?) transformation that maps it back to a rectangle. As seen in the image below, the quadrilateral ABCD in the viewport acts as a physical 'window', and if we map it back to a rectangle it will appear distorted.

enter image description here

(the box on the right represents NDC, which I talk about later)

The goal is to quickly obtain the image on the right. We could raytrace every point to obtain the image (which I've done), but I would prefer to use OpenGL or other projective techniques because I want to take advantage of things like blending, primitives, etc.

first attempt

I believe I can solve the problem of finding the 3x4 camera matrix that takes a 3+1-dimensional homogeneous coordinate in 3-space (on the left) and projects it down to a 2+1-dimensional homogeneous coordinate in 2-space (on the right). One can solve this using the direct linear transformation (DLT), which gives a system of equations Ba=0 for the unknown entries a of the camera matrix, and then solving that system using singular value decomposition (SVD). I would take the vectors EA, EB, EC, ED (where E is the physical eye or the camera in world-space) as points in the pre-image, and (0,0), (1,0), (1,1), (0,1) or something similar as the points in the post-image; each pair of points would give a few linear equations to plug into the SVD. The resulting matrix would map EA->(0,0) etc. (assuming there are enough degrees of freedom, i.e. the solution is unique, which I'm not sure about; see note [a].)

But to my chagrin this is not how OpenGL works. OpenGL does not directly project 3d to 2d with a 3x4 matrix. OpenGL requires "normalized device coordinates" (NDC), which are three-dimensional points. After projecting into NDC, everything in the 'unit' box from (-1,-1,-1,1) to (1,1,1,1) is drawn; everything outside is clipped (since we're dealing with homogeneous coordinates, any point (x,y,z,w) will appear on-screen only if the first three coordinates of (x/w,y/w,z/w,1) are within the unit box from -1 to 1).
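To make the clipping rule concrete, here is a minimal sketch in plain Python. The function names are hypothetical, and the clip-space form of the test (-w <= x, y, z <= w with w > 0) is an assumption about how the check is usually phrased; for positive w it is equivalent to the divide-then-compare wording above.

```python
def is_inside_clip_volume(clip):
    """Return True if a clip-space point (x, y, z, w) would be kept.

    For w > 0, the test -w <= x, y, z <= w is the same as asking that
    x/w, y/w and z/w all lie in [-1, 1].
    """
    x, y, z, w = clip
    return w > 0 and all(-w <= c <= w for c in (x, y, z))


def to_ndc(clip):
    """Perspective divide: (x, y, z, w) -> (x/w, y/w, z/w)."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


print(is_inside_clip_volume((-1, -1, 1, 1)))  # a corner of the box: kept
print(is_inside_clip_volume((3, 3, 1, 1)))    # outside the box: clipped
print(to_ndc((2, 2, 2, 4)))                   # (0.5, 0.5, 0.5)
```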

So the question becomes: does there exist some reasonable transformation that maps some weird-looking cuboid in homogeneous coordinates (specifically the cuboid drawn on the left, with ABCD (front points) and A'B'C'D' (back points, hidden behind front points)) to the unit cube, e.g. using a 4x4 matrix? How does one do it?

what I've tried

I've tried something stronger: I made ABCD and A'B'C'D' look like a regular pyramidal frustum (e.g. glFrustum) (i.e. in this hypothetical setup, the image on the left would just have a black rectangle superposed on it, not a quadrilateral), and then used the DLT/direct linear transformation to solve for the alleged 4x4 matrix. However, when I tried it, there did not seem to be enough degrees of freedom... the resulting 4x4 matrix did not map every input vector to its output vector. When using A,B,C,D,A' (5 pairs of pre-transform and post-transform vectors), I /almost/ get the result I want... those vectors are mapped correctly, but for example B',C',D' are mapped to (3,3,1,1) instead of (-1,-1,1,1) and are clipped away by OpenGL. If I try adding a sixth point (6 pairs of points for the 4x4 matrix to project), my solution seems degenerate (zeroes, infinities). How many degrees of freedom am I dealing with here, and is this possible with a 4x4 matrix mapping the usual 4-vectors (3+1-dimensional homogeneous-coordinate vectors) that we know and love?

random minor thoughts

I'm guessing that it's not possible to map any arbitrary cuboid to any arbitrary cuboid with a 4x4 matrix, though I'm confused because I thought it was possible to map any convex quadrilateral to any other convex quadrilateral in 2d with some matrix, as in, say, Photoshop. Can this be done with a projective transform or not? And how does it generalize to 3d? Also, given the failed attempt to find a 4x4 matrix: linear algebra says we should not expect an NxN matrix to map more than N linearly independent points to N target points in the best case, but I feel that somehow homogeneous coordinates cheat this because there is some hidden collinearity going on? I guess not?

another solution?

I guess one could also do the following ugly thing: use a typical frustum camera projection matrix, find the 2d points corresponding to the corners, then perform a 2d perspective-distort homography. But if that were to happen after the pixels were rendered (e.g. in Photoshop), there would be problems with resolution... maybe hypothetically one could figure out a matrix to perform this transformation on the XY-plane within NDC-space, then compose it with the normal frustum-based matrix?

(note [a]: Degrees of freedom: ABCD can be further constrained to be the post-image of a projective transformation acting on a rectangle, if that is necessary... that is, the black rectangle on the left could be said to be the result of projecting a picture frame clipart model)

$\endgroup$
1
  • 1
    $\begingroup$ If you google for corner pin you get a few implementations of this $\endgroup$
    – joojaa
    Commented Jul 8, 2016 at 15:42

2 Answers

1
$\begingroup$

I think the solution is to look for the projective transform that correctly maps the four points.

i.e.

$$y' = P x'$$

where $x' = [x_0, x_1, 1]$ and $y = [\frac{y'_0}{y'_2}, \frac{y'_1}{y'_2}]$

$P$ is a 3x3 matrix with 9 entries. Due to the final normalization it is unique up to scaling, leaving 8 degrees of freedom, which are uniquely determined by the 8 equations given by the correspondence (2 per point pair).
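
To spell out where the "2 per point pair" comes from (writing $p_{ij}$ for the entries of $P$, a notation assumed here): multiplying the normalization through by $y'_2$ gives, for each pair $(x_0, x_1) \mapsto (y_0, y_1)$, two equations that are linear in the entries:

$$\begin{aligned}
p_{11} x_0 + p_{12} x_1 + p_{13} - y_0 \,(p_{31} x_0 + p_{32} x_1 + p_{33}) &= 0 \\
p_{21} x_0 + p_{22} x_1 + p_{23} - y_1 \,(p_{31} x_0 + p_{32} x_1 + p_{33}) &= 0
\end{aligned}$$

Four pairs give 8 such equations for the 9 entries, which determines $P$ up to the overall scale as long as no three of the points are collinear.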

Now you can use algebra to do this, or just use OpenCV's getPerspectiveTransform :).
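
As a rough sketch of the "use algebra" route, assuming NumPy is available: fix the scale by setting the bottom-right entry of $P$ to 1 and solve the remaining 8x8 linear system. The function names are hypothetical, and fixing that entry to 1 is a convenience that fails in the rare case where the true entry is 0.

```python
import numpy as np


def homography_from_4_points(src, dst):
    """Solve for the 3x3 matrix P mapping four 2d points src[i] to dst[i].

    Assumes the bottom-right entry of P can be normalized to 1.
    """
    rows, rhs = [], []
    for (x0, x1), (y0, y1) in zip(src, dst):
        # y0 * (p31*x0 + p32*x1 + 1) = p11*x0 + p12*x1 + p13
        rows.append([x0, x1, 1, 0, 0, 0, -y0 * x0, -y0 * x1])
        rhs.append(y0)
        # y1 * (p31*x0 + p32*x1 + 1) = p21*x0 + p22*x1 + p23
        rows.append([0, 0, 0, x0, x1, 1, -y1 * x0, -y1 * x1])
        rhs.append(y1)
    p = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(p, 1.0).reshape(3, 3)


def apply_homography(P, point):
    """Apply P to a 2d point, including the final normalization."""
    y = P @ np.array([point[0], point[1], 1.0])
    return y[:2] / y[2]


# Example: map an arbitrary convex quadrilateral onto the unit square.
quad = [(0.1, 0.2), (0.9, 0.1), (0.8, 0.95), (0.05, 0.8)]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
P = homography_from_4_points(quad, square)
for corner in quad:
    print(np.round(apply_homography(P, corner), 6))
```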

Also check out homogeneous coordinates on Wikipedia to get familiar with the concept.

$\endgroup$
1
  • $\begingroup$ Thank you! (I solved this a while ago and posted the solution just now when I saw your comment.) $\endgroup$
    – ninjagecko
    Commented Sep 20, 2017 at 11:41
0
$\begingroup$

I solved my own question by implementing the Direct Linear Transformation. The examples section on Wikipedia was my use case.

To get the equations, plug the matrices (e.g. [x1 x2 x3 x4; x5 x6 x7 x8; x9 x10 x11 x12]) into your favorite computer algebra system like SageMath, then solve the required matrix equation as illustrated, copy-paste the solutions in terms of variables into your code, and adjust formatting.

One could then adapt the solution to one's use case by scaling or ignoring particular dimensions as appropriate (e.g. ignore the depth/z coordinate in the Normalized Device Coordinates matrix as appropriate to the use case).
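
One possible reading of "ignoring the depth coordinate", sketched here as an assumption and not necessarily what was actually done: take a 3x3 homography H acting on (x, y, w) and lift it to a 4x4 matrix that passes z through unchanged, so it can be multiplied onto an ordinary projection matrix. The function name is hypothetical and NumPy is assumed.

```python
import numpy as np


def embed_homography_in_4x4(H):
    """Lift a 3x3 homography on (x, y, w) to a 4x4 matrix on (x, y, z, w).

    Rows/columns 0, 1, 3 receive H; z is copied through unchanged.
    """
    M = np.eye(4)
    M[np.ix_([0, 1, 3], [0, 1, 3])] = np.asarray(H, dtype=float)
    return M


# Example with an arbitrary homography and an arbitrary clip-space point.
H = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [0.5, 0.0, 1.0]])
M = embed_homography_in_4x4(H)
clip = np.array([1.0, 2.0, 5.0, 2.0])

out = M @ clip
print(out[:2] / out[3])  # same x, y as applying H to (x/w, y/w)
print(out[2])            # z is untouched before the divide
```

Note that the new w changes z/w as well: at a given output pixel the depth becomes the original NDC depth divided by a per-pixel factor, so it may leave the [-1, 1] range and be clipped. Whether that matters depends on the use case.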

You will need an SVD function or library in your language.
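
For illustration only, here is what the DLT-plus-SVD step can look like for the 2d homography case, assuming NumPy. It builds the matrix B of the system Ba=0 from the point pairs and takes the right singular vector for the smallest singular value as the solution. The function names are hypothetical, and this is a generic sketch, not necessarily the exact formulation derived with the computer algebra system.

```python
import numpy as np


def dlt_homography(src, dst):
    """Estimate a 3x3 H (up to scale) with dst ~ H @ src from >= 4 point pairs."""
    B = []
    for (x0, x1), (y0, y1) in zip(src, dst):
        # Two linear equations per pair, in the 9 unknown entries of H.
        B.append([x0, x1, 1, 0, 0, 0, -y0 * x0, -y0 * x1, -y0])
        B.append([0, 0, 0, x0, x1, 1, -y1 * x0, -y1 * x1, -y1])
    # The last row of vt spans the (approximate) null space of B.
    _, _, vt = np.linalg.svd(np.asarray(B, dtype=float))
    return vt[-1].reshape(3, 3)


def project(H, point):
    """Apply H to a 2d point and divide by the homogeneous coordinate."""
    y = H @ np.array([point[0], point[1], 1.0])
    return y[:2] / y[2]


# Example: recover a known homography from five exact correspondences.
H_true = np.array([[1.2, 0.1, -0.3],
                   [0.2, 0.9, 0.4],
                   [0.05, -0.02, 1.0]])
src = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.25)]
dst = [project(H_true, p) for p in src]
H = dlt_homography(src, dst)
print(np.round(H / H[2, 2], 6))  # rescaled so the last entry is 1
```

Because the result is only defined up to scale (and sign), compare matrices after normalizing, for example by dividing by the last entry when it is nonzero.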

$\endgroup$
