103
$\begingroup$

How can I detect and peel the label from the jar below (POV, cylinder radius, jar contents are all unknown)

enter image description here

to get something like this, which is the original label before it was stuck on the jar?

enter image description here

$\endgroup$
16
  • $\begingroup$ Do you have any prior knowledge of the label ? $\endgroup$ Commented May 17, 2012 at 14:15
  • 11
    $\begingroup$ @acl I tried eating the jar contents. Regretfully I own only a photograph, so it wasn't very pleasant. $\endgroup$ Commented May 17, 2012 at 14:16
  • 2
    $\begingroup$ @image_doctor Identifying the label is not the main task (although it could be very nice). I can manually pre-process the image and change the label color to something easy to extract. The geometric transform is the main issue. $\endgroup$ Commented May 17, 2012 at 14:27
  • 3
    $\begingroup$ @Mr. Just image processing. I need to be able to print a few of these labels (a classical marmalade counterfeiting organization contracted me). No need for ultimate accuracy. $\endgroup$ Commented May 17, 2012 at 14:47
  • 1
    $\begingroup$ You had me concerned for a moment before I remembered your singular sense of humor. :-) $\endgroup$
    – Mr.Wizard
    Commented May 17, 2012 at 14:49

4 Answers 4

97
$\begingroup$

This answer evolved over time and got quite long in the process. I've created a cleaned-up, restructured version as an answer to a very similar question on dsp.stackexchange.

Here's my quick&dirty solution. It's a bit similar to @azdahak's answer, but it uses an approximate mapping instead of cylindrical coordinates. On the other hand, there are no manually adjusted control parameters - the mapping coefficients are all determined automatically:

The label is bright in front of a dark background, so I can find it easily using binarization:

src = Import["https://i.sstatic.net/rfNu7.png"];
binary = FillingTransform[DeleteBorderComponents[Binarize[src]]]

binarized image

I simply pick the largest connected component and assume that's the label:

labelMask = Image[SortBy[ComponentMeasurements[binary, {"Area", "Mask"}][[All, 2]], First][[-1, 2]]]

largest component

Next step: find the top/bottom/left/right borders using simple derivative convolution masks:

topBorder = DeleteSmallComponents[ImageConvolve[labelMask, {{1}, {-1}}]];
bottomBorder = DeleteSmallComponents[ImageConvolve[labelMask, {{-1}, {1}}]];
leftBorder = DeleteSmallComponents[ImageConvolve[labelMask, {{1, -1}}]];
rightBorder = DeleteSmallComponents[ImageConvolve[labelMask, {{-1, 1}}]];

enter image description here

This is a little helper function that finds all white pixels in one of these four images and converts the indices to coordinates (Position returns indices, and indices are 1-based {y,x}-tuples, where y=1 is at the top of the image. But all the image processing functions expect coordinates, which are 0-based {x,y}-tuples, where y=0 is the bottom of the image):

{w, h} = ImageDimensions[topBorder];
maskToPoints = Function[mask, {#[[2]]-1, h - #[[1]]+1} & /@ Position[ImageData[mask], 1.]];

Now I have four separate lists of coordinates of the top, bottom, left, right borders of the label. I define a mapping from image coordinates to cylinder coordinates:

Clear[mapping];
mapping[{x_, y_}] := {c1 + c2*x + c3*y + c4*x*y, c5 + c6*y + c7*x + c8*x^2}

This mapping is obviously only a crude approximation to cylinder coordinates. But it's very simple to optimize the coefficients c1..c8:

minimize =
  Flatten[{
    (mapping[#][[1]])^2 & /@ maskToPoints[leftBorder],
    (mapping[#][[1]] - 1)^2 & /@ maskToPoints[rightBorder],
    (mapping[#][[2]] - 1)^2 & /@ maskToPoints[topBorder],
    (mapping[#][[2]])^2 & /@ maskToPoints[bottomBorder]
    }];
solution = NMinimize[Total[minimize], {c1, c2, c3, c4, c5, c6, c7, c8}][[2]]

This minimizes the mapping coefficients, so the points on the left border are mapped to {0, [anything]}, the points on the top border are mapped to {[anything], 1} and so on.

The actual mapping looks like this:

enter image description here

Show[src,
 ContourPlot[mapping[{x, y}][[1]] /. solution, {x, 0, w}, {y, 0, h}, 
  ContourShading -> None, ContourStyle -> Red, 
  Contours -> Range[0, 1, 0.1], 
  RegionFunction -> Function[{x, y}, 0 <= (mapping[{x, y}][[2]] /. solution) <= 1]],
 ContourPlot[mapping[{x, y}][[2]] /. solution, {x, 0, w}, {y, 0, h}, 
  ContourShading -> None, ContourStyle -> Red, 
  Contours -> Range[0, 1, 0.2],
  RegionFunction -> Function[{x, y}, 0 <= (mapping[{x, y}][[1]] /. solution) <= 1]]]

Now I can pass the mapping directly to ImageForwardTransformation

ImageForwardTransformation[src, mapping[#] /. solution &, {400, 300}, DataRange -> Full, PlotRange -> {{0, 1}, {0, 1}}]

enter image description here

The artifacts in the image are already present in the source image. Do you have a high-res version of this image? The distortion on the left side is due to the incorrect mapping. This could probably be reduced by using an improved mapping function, but I can't think of one that's better and still simple enough for minimization right now.

ADD:

I've tried the same algorithm on the high-res image you linked to in the comment, result looks like this:

enter image description here

enter image description here

I had to make minor changes to the label-detection part (DeleteBorderComponents first, then FillingTransform) and I've added extra terms to the mapping formula for the perspective (that wasn't noticeable in the low-res image). At the borders you can see that the 2nd order approximation doesn't fit perfectly, but this might be good enough.

And you might want to invert the mapping function symbolically and use ImageTransformation instead of ImageForwardTransformation, because this is really really slow.

ADD 2:

I think I've found a mapping that eliminates the cylindrical distortion (more or less, at least):

arcSinSeries = Normal[Series[ArcSin[α], {α, 0, 10}]]
Clear[mapping];
mapping[{x_, y_}] := 
   {
    c1 + c2*(arcSinSeries /. α -> (x - cx)/r) + c3*y + c4*x*y, 
    top + y*height + tilt1*Sqrt[Clip[r^2 - (x - cx)^2, {0.01, ∞}]] + tilt2*y*Sqrt[Clip[r^2 - (x - cx)^2, {0.01, ∞}]]
   }

This is a real cylindrical mapping. I used the Taylor series to approximate the arc sine, because I couldn't get the optimization working with ArcSin directly. The Clip calls are my ad-hoc attempt to prevent complex numbers during the optimization. Also, I couldn't get NMinimize to optimize the coefficients, but FindMinimium will work just fine if I give it good start values. And I can estimate good start values from the image, so it should still work for any image (I hope):

leftMean = Mean[maskToPoints[leftBorder]][[1]];
rightMean = Mean[maskToPoints[rightBorder]][[1]];
topMean = Mean[maskToPoints[topBorder]][[2]];
bottomMean = Mean[maskToPoints[bottomBorder]][[2]];
minimize =
  Flatten[{
    (mapping[#][[1]])^2 & /@ maskToPoints[leftBorder],
    (mapping[#][[1]] - 1)^2 & /@ maskToPoints[rightBorder],
    (mapping[#][[2]] - 1)^2 & /@ maskToPoints[topBorder],
    (mapping[#][[2]])^2 & /@ maskToPoints[bottomBorder]
    }];
solution = 
 FindMinimum[
   Total[minimize], 
    {{c1, 0}, {c2, rightMean - leftMean}, {c3, 0}, {c4, 0}, 
     {cx, (leftMean + rightMean)/2}, 
     {top, topMean}, 
     {r, rightMean - leftMean}, 
     {height, bottomMean - topMean}, 
     {tilt1, 0}, {tilt2, 0}}][[2]]

Resulting mapping:

enter image description here

Unrolled image:

enter image description here

The borders now fit the label outline quite well. The characters all seem to have the same width, so I think there's not much distortion, either. The solution of the optimization can also be checked directly: The optimization tries to estimate the cylinder radius r and the x-coordinate of the cylinder center cx, and the estimated values only a few pixels off the real positions in the image.

ADD 3:

I've tried the algorithm on a few images I've found using google image search. No manual interaction except sometimes cropping. The results look promising:

enter image description here

enter image description here

enter image description here

enter image description here

enter image description here

enter image description here

enter image description here

enter image description here

As expected, the label detection is the least stable step. (Hence the cropping.) If the user marked points inside the label and outside the label, a watershed-based segmentation will probably give better results.

I'm not sure if the mapping optimization is always numerically stable. But it worked for every image I've tried as long as the label detection worked. The original approximate mapping with the 2nd order terms is probably more stable than the improved cylindrical mapping, so it could be used as a "fallback".

For example, in the 4. sample, the radius can not be estimated from the curvature of the top/bottom border (because there is almost no curvature), so the resulting image is distorted. In this case, it might be better to use the "simpler" mapping, or have the user select the left/right borders of the glass (not the label) manually, and set the center/radius explicitly, instead of estimating it by optimizing the mapping coefficients.

ADD 4:

@Szabolcs has written interactive code that can un-distort these images.

My alternative suggestion to improve this interactively would be to let the user select the left&right borders of the image, for example using a locator pane:

LocatorPane[Dynamic[{{xLeft, y1}, {xRight, y2}}], 
 Dynamic[Show[src, 
   Graphics[{Red, Line[{{xLeft, 0}, {xLeft, h}}], 
     Line[{{xRight, 0}, {xRight, h}}]}]]]]

LocatorPane

Then I can explicitly calculate r and cx instead of optimizing for them:

manualAdjustments = {cx -> (xLeft + xRight)/2, r -> (xRight - xLeft)/2};
solution = 
  FindMinimum[
   Total[minimize /. manualAdjustments], 
    {{c1, 0}, {c2, rightMean - leftMean}, {c3, 0}, {c4, 0}, 
     {top, topMean}, 
     {height, bottomMean - topMean}, 
     {tilt1, 0}, {tilt2, 0}}][[2]]
solution = Join[solution, manualAdjustments]

Using this solution, I get almost distortion-free results:

(Almost) no distortion

enter image description here

$\endgroup$
18
  • 1
    $\begingroup$ That's very nice. +1 $\endgroup$
    – azdahak
    Commented May 17, 2012 at 20:47
  • $\begingroup$ @nikie Thanks for your answer, I'll experiment a bit with the code. As for better images, there are relatively easy to find a few in the Internet. For example: kitchengardenpreserves.co.uk/newsb/wp-content/uploads/2011/12/… $\endgroup$ Commented May 17, 2012 at 21:25
  • $\begingroup$ good idea with using the known fact that the label is a rectangle... +1 $\endgroup$
    – acl
    Commented May 17, 2012 at 21:58
  • 1
    $\begingroup$ Could you perhaps summarize your answer in a new answer to this question and also link back here? Then I could remove my answer. $\endgroup$
    – Szabolcs
    Commented May 18, 2012 at 13:49
  • 1
    $\begingroup$ @Ita: No, that was an ad-hoc invention. The idea was: The angle on the cylinder should be ArcSin[x*c1+y*c2+c3] for some constants c1..c3, if we ignore perspective distortion. But fitting ArcSin didn't work well with FindMinimum, so I used a Taylor series that's close enough in the angle range I'm interested in, and real for all x and y. $\endgroup$ Commented Jun 4, 2018 at 17:53
15
$\begingroup$

Here's my stab at it, using a cylindrical projection, and TextureCoordinateFunction with a fitting parameter. Replace IMG in the code with the actual photo. The last command is a manual crop.

result=With[{para = 1.69}, 
ParametricPlot3D[{Cos[u], Sin[u], v}, {u, 0, Pi}, {v, 0, Pi}, 
PlotStyle -> Texture[ImageReflect[IMG, Left -> Right]], 
Mesh -> None, PlotRange -> All, ViewPoint -> Front, 
Lighting -> "Neutral", 
TextureCoordinateFunction -> ({#4, para Tan[#5]} &), 
AspectRatio -> para]]

Show[ImageTrim[Rasterize[result], {{45, 75}, {232, 165}}],AspectRatio->1]

enter image description here

Here's a modified version of the above using a few more control parameters which gives a projection on to a more general elliptical cylinder.It also corrects the lighting.

result = ParametricPlot3D[{Cos[u], .21 Sin[0.9 u], 
v + .08 Sin[u - .05]}, {u, 0, \[Pi]}, {v, 0, 1}, 
PlotStyle -> Texture[ImageReflect[IMG, Left -> Right]], 
Mesh -> None, PlotRange -> All, ViewPoint -> Front, 
Lighting -> {{"Ambient", White}}];

Show[ImageTrim[Rasterize[result], {{50, 53}, {320, 120}}], 
AspectRatio -> 1, ImageSize -> 200]

enter image description here

$\endgroup$
4
  • $\begingroup$ I think the issue is that the photo was taken from above, and the vertical axis is offset from the picture by a small angle. If you can take that into account, possibly by examining the ellipse made by the top, then I'd think you could get a better projection. $\endgroup$
    – rcollyer
    Commented May 17, 2012 at 16:46
  • $\begingroup$ Right. I made a modification above which uses a more general projection. Of course, it's still all hand-tuned. $\endgroup$
    – azdahak
    Commented May 17, 2012 at 19:04
  • $\begingroup$ we could try to combine this with minimising the mismatch between corresponding points (found as shown in my answer). eg, by producing the plots you have with the viewpoint etc as input, rasterizing the output, locating corresponding pts etc $\endgroup$
    – acl
    Commented May 17, 2012 at 19:06
  • $\begingroup$ That's much better, there's still some curvature and distortion, but that was present already in the image. So, +1, for that. The values for the cylinder look hand tuned, can that be automated? $\endgroup$
    – rcollyer
    Commented May 17, 2012 at 19:09
10
$\begingroup$

@nikie gave a very nice answer. This is a complement to it.

One remaining challenge is compensating for the distortion close to the left and right edges of the image, visible for example here (image taken from nikie's post):

Mathematica graphics

The magnitude of the distortion cannot be estimated in the general case without having some information about what's on the label. If the photo of the label is taken directly from the front, it will always appear as a rectangle. Therefore it's impossible to tell what angular region of the cylinder (jar) the labels takes up simply by looking at its photo. It could be half a jar (180°) or it could be negligibly small (e.g. 30°).

One solution could be letting the user compensate for the distortion manually (as a person can judge the magnitude of the distortion from e.g. the text on the label). We could use a Manipulate. The challenge in implementing this was making the Manipulate responsive enough for practical use. ImageTransformation tends to be too slow, so I used ParametricPlot instead to get an approximation for the angle $2 \times \alpha$ taken up by the label.

img = Import["https://i.sstatic.net/9axDH.png"]

Manipulate[
 ParametricPlot[{ArcSin[u], v}, {u, -Sin[α], Sin[α]}, {v, 0, 1}, 
    PlotStyle -> {Opacity[1], Texture[img]},
    Mesh -> mesh, 
    AspectRatio -> Divide @@ Reverse@ImageDimensions[img], 
    Frame -> False, Axes -> False, BoundaryStyle -> None, 
    PlotPoints -> {ControlActive[30, 300], 2}, Exclusions -> None],
 {α, 0.1, Pi/2}, {{mesh, False}, {True, False}}]

Mathematica graphics

Note: I assumed parallel projection to keep things simple.

After finding an approximation for $\alpha$ by dragging the slider and looking at the output, we can compute a better quality "undistorted" image using ImageTransformation:

α = 1.235;

ImageTransformation[img, 
 Function[p, {(Sin[Rescale[p[[1]], {0, 1}, α {-1, 1}]] + Sin[α])/(2 Sin[α]), p[[2]]}]]

Mathematica graphics

$\endgroup$
9
$\begingroup$

this approach looks promising:

labelphoto = Import["/Users/acl/Desktop/labelimage.png"]
label = Import["/Users/acl/Desktop/FMALS.gif"]

Mathematica graphics

and then locate corresponding points as follows:

images = {label, labelphoto};
matches = ImageCorrespondingPoints @@ images
MapThread[
 Show[#1, Graphics[{Red, 
     MapIndexed[Inset[#2[[1]], #1] & , #2]}]] &, {images, matches}]

Mathematica graphics

So now all we need is a routine to do arbitrary rotations of an image projected onto the surface of a cylinder, then optimize matching of points with respect to cylinder radius, viewpoint, rotation etc. (and it would be nice to have more than 3 points, but that would probably be the case for larger images).

As an aside, it sounds like FindGeometricTransform should do what we want, but it's not general enough as it is:

timage = ImageTransformation[label, Sqrt];
images = {label, timage};
matches = ImageCorrespondingPoints @@ images;
MapThread[
 Show[#1, Graphics[{Red, 
     MapIndexed[Inset[#2[[1]], #1] & , #2]}]] &, {images, matches}]
{terr, t} = FindGeometricTransform[label, timage];
ImagePerspectiveTransformation[ImageTransformation[label, Sqrt], t, 
 DataRange -> Full]

Mathematica graphics

So it'll take some more work.

$\endgroup$
8
  • $\begingroup$ I'm probably missing something built-in though... anyway I'm out of time for now $\endgroup$
    – acl
    Commented May 17, 2012 at 14:56
  • 3
    $\begingroup$ But what if you don't have the label? I think belisarius wanted a generic way to "peel" the label from the can (without a template) $\endgroup$
    – rm -rf
    Commented May 17, 2012 at 15:13
  • $\begingroup$ @R.M Yes, that is the case $\endgroup$ Commented May 17, 2012 at 15:16
  • $\begingroup$ @R.M I guess my idea was to find out if, given a shape (eg a rectangle), you can obtain a transformation to map a feature of an image as closely as possible to this shape (the rectangle). as a first step, this should better work if the "shape" is the actual image... $\endgroup$
    – acl
    Commented May 17, 2012 at 15:16
  • $\begingroup$ @acl but the rectangle aspect ratio is not known ... $\endgroup$ Commented May 17, 2012 at 15:20

Not the answer you're looking for? Browse other questions tagged or ask your own question.