Image Processing: Algorithm Improvement for 'Coca-Cola Can' Recognition

One of the most interesting projects I've worked on in the past couple of years was about image processing. The goal was to develop a system able to recognize Coca-Cola 'cans' (note that I'm stressing the word 'cans'; you'll see why in a minute). You can see a sample below, with the can recognized in the green rectangle with scale and rotation.

The background could be very noisy.
The can could have any scale or rotation or even orientation (within reasonable limits).
The image could have some degree of fuzziness (contours might not be entirely straight).
There could be Coca-Cola bottles in the image, and the algorithm should only detect the can!
The brightness of the image could vary a lot (so you can't rely "too much" on color detection).
The can could be partly hidden on the sides or the middle and possibly partly hidden behind a bottle.
There could be no can at all in the image, in which case you had to find nothing and write a message saying so.

Language: Done in C++ using the OpenCV library.

Changing the color domain from RGB to HSV and filtering on "red" hue, with saturation above a certain threshold to avoid orange-like colors, and filtering out low values to avoid dark tones. The end result was a binary black-and-white image, where the white pixels are the ones that pass these thresholds. Obviously there is still a lot of crap in the image, but this reduces the number of dimensions you have to work with.
Noise filtering using a median filter (taking the median pixel value of all neighbors and replacing the pixel with this value).
Using the Canny edge detection filter to get the contours of all items after the 2 preceding steps.
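To make the first two steps concrete, here is a minimal, library-free C++ sketch. It is not the original implementation: the `Hsv` struct, the function names `isCanRed` and `medianFilter3x3`, and all threshold values are assumptions made up for the example. Hue follows OpenCV's 8-bit convention (0 to 179), where red wraps around both ends of the range. With OpenCV itself these steps would normally go through `cv::cvtColor`, `cv::inRange`, `cv::medianBlur` and `cv::Canny`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One HSV pixel in OpenCV's 8-bit convention: H in [0, 179], S and V in [0, 255].
struct Hsv {
    std::uint8_t h, s, v;
};

// Hypothetical thresholds; real values would have to be tuned on the test images.
constexpr int kRedHueMargin = 10;    // red sits at both ends of the hue range
constexpr int kMinSaturation = 120;  // saturation floor (to avoid orange-like colors)
constexpr int kMinValue = 60;        // value floor (to avoid dark tones)

// Returns true when the pixel passes the "red" filter described above.
bool isCanRed(const Hsv& p) {
    const bool redHue = p.h <= kRedHueMargin || p.h >= 179 - kRedHueMargin;
    return redHue && p.s >= kMinSaturation && p.v >= kMinValue;
}

// 3x3 median filter on a row-major binary mask (0 or 255).
// Border pixels are copied unchanged to keep the sketch short.
std::vector<std::uint8_t> medianFilter3x3(const std::vector<std::uint8_t>& mask,
                                          int width, int height) {
    assert(mask.size() == static_cast<std::size_t>(width) * height);
    std::vector<std::uint8_t> out(mask);
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            std::array<std::uint8_t, 9> window{};
            int k = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    window[k++] = mask[(y + dy) * width + (x + dx)];
            // The median of 9 values is the 5th smallest one.
            std::nth_element(window.begin(), window.begin() + 4, window.end());
            out[y * width + x] = window[4];
        }
    }
    return out;
}
```

Applying `isCanRed` to every pixel gives the binary mask, and `medianFilter3x3` then removes isolated specks before edge detection.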

Algorithm: The algorithm I chose for this task is taken from this awesome book on feature extraction and is called the Generalized Hough Transform (pretty different from the regular Hough Transform). It basically says a few things:

In the end, you end up with a heat map of the votes. For example, here all the pixels of the contour of the can will vote for its gravitational center, so you'll have a lot of votes in the same pixel corresponding to the center, and you will see a peak in the heat map as below:
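Locating that peak can be sketched in a few lines of plain C++. The `Peak` struct, the `findPeak` name and the `minVotes` threshold are assumptions for illustration, not part of the original code. The threshold is one possible way to cover the "no can at all" case, where the program should report that nothing was found.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Peak {
    bool found;  // false when no cell reaches the vote threshold
    int x, y;    // position of the strongest cell
    int votes;   // number of votes in that cell
};

// Scans a row-major heat map and returns its strongest cell,
// provided that cell holds at least minVotes votes.
Peak findPeak(const std::vector<int>& heatMap, int width, int height, int minVotes) {
    assert(heatMap.size() == static_cast<std::size_t>(width) * height);
    Peak best{false, -1, -1, 0};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const int v = heatMap[y * width + x];
            if (v >= minVotes && v > best.votes) {
                best = Peak{true, x, y, v};
            }
        }
    }
    return best;
}
```

If `found` is false, the caller can print the "no can in the image" message instead of drawing a rectangle.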

It is extremely slow! I can't stress this enough. Almost a full day was needed to process the 30 test images, obviously because I had a very high scaling factor for rotation and translation, since some of the cans were very small.
It was completely lost when bottles were in the image, and for some reason almost always found the bottle instead of the can (perhaps because bottles were bigger, thus had more pixels, thus more votes).
Fuzzy images were also no good, since the votes ended up in pixels at random locations around the center, thus ending with a very noisy heat map.
Invariance in translation and rotation was achieved, but not in orientation, meaning that a can that was not directly facing the camera objective wasn't recognized.
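A back-of-the-envelope count helps explain the first issue. Every edge pixel casts at least one vote for each (scale, rotation) combination that is tested, so the work grows with the product of the three quantities. The function name and the numbers in the usage note are hypothetical and only meant to show the order of magnitude.

```cpp
#include <cstdint>

// Lower bound on the number of votes cast for one image:
// one vote per edge pixel for every (scale, rotation) combination tested.
std::uint64_t minVoteCount(std::uint64_t edgePixels,
                           std::uint64_t scaleSteps,
                           std::uint64_t rotationSteps) {
    return edgePixels * scaleSteps * rotationSteps;
}
```

With, say, 20,000 edge pixels, 50 scale steps and 360 rotation steps, `minVoteCount(20000, 50, 360)` already gives 360,000,000 votes for a single image.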

Can you help me improve my specific algorithm, using exclusively OpenCV features, to resolve the four specific issues mentioned?

I hope some people will also learn something out of it; after all, I think not only people who ask questions should learn. :)

Spelling, grammar, punctuation, removed superfluous sentences.

edit approved Mar 16, 2016 at 8:48

Filip Allberg

One of the most interesting projects I've worked on in the past couple of years was a project about image processing. The goal was to develop a system to be able to recognize Coca-Cola cans (note that I'm stressing the word cans, you'll see why in a minute). You can see a sample below, with the can recognized in the green rectangle with scale and rotation.

Some constraints on the project:

The background could be very noisy.
The can could have any scale or rotation or even orientation (within reasonable limits).
The image could have some degree of fuzziness (contours might not be entirely straight).
There could be Coca-Cola bottles in the image, and the algorithm should only detect the can!
The brightness of the image could vary a lot (so you can't rely "too much" on color detection).
The can could be partly hidden on the sides or the middle and possibly partly hidden behind a bottle.
There could be no cans at all in the image, in which case you had to find nothing and write a message saying so.

I did this project a while ago, had a lot of fun doing it, and ended up with a decent implementation. Here are some details about it:

Pre-processing: For the image pre-processing, i.e. transforming the image into a more raw form to give to the algorithm, I used 2 methods:

You can describe an object in space without knowing its analytical equation (which is the case here).
It is resistant to image deformations such as scaling and rotation, as it will basically test your image for every combination of scale factor and rotation factor.
It uses a base model (a template) that the algorithm will "learn".
Each pixel remaining in the contour image will vote for another pixel which will supposedly be the center (in terms of gravity) of your object, based on what it learned from the model.
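The learning and voting steps described above can be sketched as follows in plain C++. This is a deliberately simplified version and not the original implementation. The full Generalized Hough Transform indexes the learned displacements by gradient direction, while here every edge pixel votes with every learned displacement. The names `Point`, `learnModel` and `castVotes` are assumptions for the example, and the centroid stands in for the "center (in terms of gravity)".

```cpp
#include <cmath>
#include <vector>

struct Point {
    int x, y;
};

// "Learning" the template: for each contour pixel, store the displacement
// from that pixel to the centroid of the template contour.
std::vector<Point> learnModel(const std::vector<Point>& templateContour) {
    std::vector<Point> model;
    if (templateContour.empty()) return model;
    long sumX = 0, sumY = 0;
    for (const Point& p : templateContour) {
        sumX += p.x;
        sumY += p.y;
    }
    const double n = static_cast<double>(templateContour.size());
    const int cx = static_cast<int>(std::lround(sumX / n));
    const int cy = static_cast<int>(std::lround(sumY / n));
    for (const Point& p : templateContour) {
        model.push_back(Point{cx - p.x, cy - p.y});
    }
    return model;
}

// Voting for a single (scale, rotation) hypothesis: each edge pixel votes for
// every center position that the scaled and rotated model would place it at.
std::vector<int> castVotes(const std::vector<Point>& edges,
                           const std::vector<Point>& model,
                           double scale, double angleRad,
                           int width, int height) {
    std::vector<int> accumulator(static_cast<std::size_t>(width) * height, 0);
    const double c = std::cos(angleRad);
    const double s = std::sin(angleRad);
    for (const Point& e : edges) {
        for (const Point& d : model) {
            const int cx = e.x + static_cast<int>(std::lround(scale * (c * d.x - s * d.y)));
            const int cy = e.y + static_cast<int>(std::lround(scale * (s * d.x + c * d.y)));
            if (cx >= 0 && cx < width && cy >= 0 && cy < height) {
                ++accumulator[cy * width + cx];
            }
        }
    }
    return accumulator;
}
```

Testing "every combination of scale factor and rotation factor" then means calling `castVotes` once per combination and keeping the accumulator with the strongest peak.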

Can you help me improve my specific algorithm, using exclusively OpenCV features, to resolve the four specific issues mentioned?

typo fix

edited May 24, 2015 at 19:03

Emil Laine

edited tags

edited Jan 3, 2013 at 19:54

Yahel

Notice removed Reward existing answer by Charles Menguy

occurred Apr 28, 2012 at 18:23

Bounty Ended with stacker's answer chosen by Charles Menguy

occurred Apr 28, 2012 at 18:23

Notice added Reward existing answer by Charles Menguy

occurred Apr 22, 2012 at 16:27

Bounty Started worth 100 reputation by Charles Menguy

occurred Apr 22, 2012 at 16:27

Removed useless comments.

edited Apr 16, 2012 at 20:32

Charles Menguy

Post Reopened by Shog9

occurred Apr 16, 2012 at 18:26

Migration Rejected

occurred Apr 16, 2012 at 18:26

Post Unlocked by CommunityBot

occurred Apr 16, 2012 at 18:26

Post Migrated Away to dsp.stackexchange.com by Bart, Andrey Kamaev, F'x, Dennis, casperOne

occurred Apr 16, 2012 at 14:45

Post Locked by CommunityBot

occurred Apr 16, 2012 at 14:45

Post Closed as "off topic" by Bart, Andrey Kamaev, F'x, Dennis, casperOne

occurred Apr 16, 2012 at 14:45

Reduced scope.

edited Apr 16, 2012 at 12:26

Charles Menguy

deleted 1 characters in body

edited Apr 16, 2012 at 4:40

littleadv

asked Apr 16, 2012 at 4:23

Charles Menguy
