
I am trying to build a VR tracking system with a laptop webcam, and I have succeeded in identifying and tracking paper markers I put in front of the webcam. For context, I am using OpenCV with the latest OpenCvSharp wrapper in .NET 5.

Currently I can track all 4 corners of an object (ordered, so orientation is accounted for) as a 2D quad on the webcam's image, as long as I am provided with a reference image beforehand. I know the exact measurements of the reference object, and now I want to use both the 4 corners I calculate and the measurements of the object to determine the position and rotation of that object relative to my webcam.

In the future I would also like to add additional cameras to the setup for more precision and fewer dead zones. For simplicity, each camera would know its position relative to the other sensors.

How would I go about calculating both the position and rotation of an object given its measurements and 4 ordered corners on an image?

1 Answer

Hello Djgaven588 and welcome to the forums!

This is not a computer graphics question but a computer vision one. The answer is a little involved and I cannot do it justice in a forum post. However, I can point you in the right direction so that you can find the answer yourself.

Here's a helpful link: https://docs.opencv.org/master/d9/db7/tutorial_py_table_of_contents_calib3d.html

The process is somewhat straightforward, but how easy you find it will depend on how much you already know about projections and solving systems of equations.

First, you need to find key points on a known object in your image through some kind of image-processing algorithm; it seems you have already done that. By "known object", I mean that you know its exact real-world measurements at the selected key points.
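
Since you have this step working already, just for other readers: OpenCV's ArUco module is a common way to get ordered, unambiguous marker corners. A minimal sketch in Python (the language of the linked tutorials; the OpenCvSharp API mirrors these calls closely). The dictionary choice and file name are placeholder assumptions, and note the aruco module lives in the contrib package and its API changed in OpenCV 4.7:

```python
import cv2

# Load a frame and convert to grayscale for detection.
frame = cv2.imread("frame.png")  # hypothetical input image
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# ArUco markers are one common choice of "known object" with
# ordered corners. DICT_4X4_50 is an arbitrary pick.
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
corners, ids, rejected = cv2.aruco.detectMarkers(gray, dictionary)

# 'corners' holds the 4 ordered image-space corners of each detected
# marker, which is exactly the input the pose-estimation step needs.
```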

Second, you need to "unproject" your 2D image points into 3D space by reversing the projection math from 3D to 2D. This requires building a projection matrix containing the intrinsic and extrinsic camera parameters, as well as finding coefficients that model the camera's lens distortion. This process is known as camera calibration.
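
To make this concrete, here is a minimal calibration sketch in Python, assuming a printed chessboard target photographed from several angles. The 9×6 inner-corner pattern and the file-name glob are assumptions, not requirements:

```python
import glob
import cv2
import numpy as np

# Inner-corner count of the chessboard target (assumption: 9x6 board).
pattern = (9, 6)

# 3D coordinates of the board corners in the board's own plane (z = 0),
# in board units (one unit per square).
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("calib_*.png"):  # hypothetical capture set
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Solve for the intrinsic matrix and lens-distortion coefficients.
ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```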

Third, when using a single camera, you can unproject the points only up to a scale factor, because the depth coordinate was lost during the original projection. To work out the scale, rotation, and position of the object, you can set up a system of equations relating the image key points to the known real-world key points at a specific position and rotation of the object. Solving this system gives you the transform. Note that it is important to have a reasonable number of key points (even redundant ones) so that the error can be minimised.
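
OpenCV packages this step as solvePnP (perspective-n-point). A minimal sketch, assuming a flat square marker of known side length, corner ordering that matches your detection, and the camera_matrix / dist_coeffs from the calibration sketch above; the image points are placeholder values:

```python
import cv2
import numpy as np

s = 0.05  # marker side length in metres (assumption)

# 3D model of the marker's 4 corners in its own coordinate frame,
# ordered the same way as the detected image corners.
object_points = np.array([
    [0, 0, 0],
    [s, 0, 0],
    [s, s, 0],
    [0, s, 0],
], dtype=np.float32)

# The 4 ordered corners tracked in the image (placeholder values).
image_points = np.array([
    [320, 240], [400, 245], [395, 320], [315, 315],
], dtype=np.float32)

# Solve the perspective-n-point problem for rotation and translation.
ok, rvec, tvec = cv2.solvePnP(
    object_points, image_points, camera_matrix, dist_coeffs)

# rvec is an axis-angle vector; convert it to a 3x3 rotation matrix.
R, _ = cv2.Rodrigues(rvec)
# R and tvec now give the marker's pose relative to the camera.
```

Four coplanar points are the minimum solvePnP needs; as noted above, extra redundant key points let the solver minimise reprojection error instead of fitting the noise in any single corner.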

OpenCV has functions for all of these procedures, and there are plenty of tutorials on the web, like the one linked above.

I hope you can figure it out! :)

  • Minor nitpick: Stack Exchange is not a forum, it's strictly a Q&A site. Anyway, I think the OP wants to know how to unproject.
    – joojaa
    Commented Nov 27, 2020 at 6:42
  • The link has information on camera calibration, unprojection, and using solvePnP for the system of equations. I am sorry for not providing a proper answer.
    – vgs
    Commented Nov 30, 2020 at 16:18
