0

Might be silly, but I'm looking for a way to programmatically enrich images of vehicles with some technical, length measurements.

Images will always:

  • have a white background
  • have the same angle (side of vehicle shown)

Vehicles will always:

  • have the same wheels
  • have the same color
  • have different dimensions (longer/shorter, higher/lower)

Example (before and after):
enter image description here

enter image description here

Ideal implementation should take a list of images of vehicles and associated measurements (like total_length, wheelbase, height, front_wheel_to_front_bumper...) and draw the measurements on top.
It should be able to recognize relatively simple features of images, like "(left/right/upper)most non-white pixel" but also more difficult ones like "a wheel".
For "a wheel", it should be able to find it's center.
Based on these "coordinates", it would draw both "guides" and lines between them, denoting length.

I'd appreciate thoughts about feasibility and high-level implementation:

  • is there something that can do "simple stuff" already, like find the left/rightmost pixel and draw a vertical line on it
  • would training a neural network be the way to go?
  • will I spend the rest of my life on it?

2 Answers 2

1

If you need to recognise arbitrary features I would suggest training an AI model or running various filters like edge detection etc before doing complicated maths.

But with the restrictions you outline it seems like you can just find the two lowest non-white pixels in the image and it will give you the wheel locations.

To do this just loop through the pixels checking their colour, starting at the bottom left and working your way up row by row.

2
  • Clever solution with finding the wheel center!
    – Flater
    Commented Feb 11 at 23:33
  • hopefully I will be credited in the homework answer!! :)
    – Ewan
    Commented Feb 12 at 10:22
1

Identifying components of arbitrary vehicles, under diverse pose and lighting conditions, could be a good fit for ResNet50 types of models. But for the highly constrained problem space you outline, very simple techniques suffice: bbox and Hough transform.

Computing bounding box is straightforward. Load W × H greyscale image into an array, optionally threshold it, find W column sums along one axis, and H row sums along the other axis. Walk along each of those 1-D vectors from both ends, till you encounter a non-zero sum. Now you have a bbox, which you can illustrate with green lines if desired.

Robust detection of circles is slightly more involved, typically relying on a Hough transform. OpenCV offers a popular implementation that comes with convenient python bindings, and the documentation even shows example code for finding radii and centers.

1
  • Thank you for the wonderful guideline! Will be back to upvote when my account allows. Commented Feb 11 at 14:34

Not the answer you're looking for? Browse other questions tagged or ask your own question.