
Motivation: Last October 7 there was a presidential election in Venezuela. Although the opposition saw an unprecedented increase in its votes, the government's votes increased even more, resulting in the current president being re-elected. The votes were counted by computers that are not trustworthy because of how they have behaved in the past. Each voting machine printed a voting certificate with the results. About 90% of these certificates were collected by the opposition and are available to anyone at http://hayuncamino.com/actas/

At each voting table there was a paper notebook where each voter put their signature and fingerprint. According to the law, the total number of votes from this notebook was supposed to be compared with the votes reported by the machine. The results certificate provides a space where this number must be handwritten. Unfortunately, it seems that at a very large number of voting tables the law was broken, because the space for this verification is empty.

Using Mathematica's image processing capabilities, I intend to find out at which voting tables the verification was done and to compare the results of that subset with its complement.

The original question:

I need to process a large number of images for a non-profit organization report. The images contain a grid with borders and cells. Cells B2 and C2 (spreadsheet coordinates) may contain handwriting or may be empty, and that is what needs to be detected.

Here is an example of a filled form:

[image: filled form]

And this is an example of an empty form:

[image: empty form]

My plan is to detect the coordinates of the following points:

[image: the grid with the target corner points marked in red]

and then compute the total amount of black pixels in the area defined by them.
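For reference, the counting step I have in mind afterwards is just a crop plus a pixel sum. A minimal sketch, assuming pt1 and pt2 are two opposite corners found for the target cell and fig is the imported certificate image (both names are placeholders):

(* Hypothetical sketch: count the dark pixels inside the rectangle spanned by pt1 and pt2. *)
cell = ImageTrim[fig, {pt1, pt2}];                               (* crop to the cell *)
blackPixels = Total[ImageData[ColorNegate@Binarize[cell]], 2]    (* black pixels become 1 after negation *)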

So my question is: What strategy would you recommend to reliably detect the location of those points indicated in red?

I have already tried using ImageLines, Radon, and FindGeometricTransform without much success. I think that the best approach is not to look for independent lines but instead look for the grid as a whole.

This is what I am trying to do:

figWithoutSideBorders = 
  ColorNegate @
    ImageAdd[fig, 
             ColorNegate @ Erosion[#, 3] & @ MeanFilter[#, 1] & @ MaxDetect[fig, 0.95]
    ]

[image: figWithoutSideBorders result]

I carefully crafted this matrix so that it has the same proportions as the target grid:

formMatrix = SparseArray[{Band[{ 1, 1 }, Automatic, {0,1}] -> 1,
                          Band[{15, 1 }, Automatic, {0,1}] -> 1,
                          Band[{29, 1 }, Automatic, {0,1}] -> 1,
                          Band[{52, 1 }, Automatic, {0,1}] -> 1,
                          Band[{66, 1 }, Automatic, {0,1}] -> 1,
                          Band[{ 1, 1 }, Automatic, {1,0}] -> 1,
                          Band[{ 1,105}, Automatic, {1,0}] -> 1,
                          Band[{ 1,146}, Automatic, {1,0}] -> 1,
                          Band[{ 1,265}, Automatic, {1,0}] -> 1},
                          {66,265}];
formFigure = ColorNegate @ ArrayPlot[formMatrix, AspectRatio -> Automatic, Frame -> False]

[image: rendered formFigure template]

But when I try to use FindGeometricTransform, it fails. Maybe it does not work with hollow objects?

[image: output of the failed FindGeometricTransform attempt]

As a last resort, I am thinking about computing horizontal and vertical histograms and looking for proportionally spaced peaks (a rough sketch of what I mean follows), but I want to ask the community before I over-engineer a solution. Thanks in advance.
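Here is that rough sketch, assuming fig is the imported certificate image (illustration only, not a finished solution):

(* Projection-histogram sketch: sum the dark pixels per row and per column;
   the grid lines should show up as proportionally spaced peaks. *)
bin = ImageData@ColorNegate@Binarize[fig];
rowProfile = Total /@ bin;                (* one sum per image row *)
colProfile = Total /@ Transpose[bin];     (* one sum per image column *)
ListLinePlot[{rowProfile, colProfile}]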

UPDATE 1: @nikie's answer is certainly very useful and I am thankful for it. My only concern is that this method looks for any table instead of looking specifically for a 4x3 table with row heights of 21%, 21%, 36%, 21% and column widths of 40%, 15% and 45%. The fragility of the method is exposed by the other provided sample image, where a vertical line that is not part of the table is mistaken for an additional column:

[image: misrecognized table with a spurious extra column]

UPDATE 2: As suggested by @belisarius I have added some context / motivation for this question.

UPDATE 3: I have now finished the processing. Only 5.7% of the voting certificates were not blank in the target total-votes verification area. About 99% of the voting certificates were processed automatically. I have developed a set of functions that could be useful to other people doing similar tasks (even in different areas), so I plan to write an answer to share them. Look also for a torrent file in the comments area.

Comments:

  • Syntax error ... "LA Parroquia" not "EL Parroquia" – Commented Oct 31, 2012 at 4:32
  • Is it flatbed scanned or photographed? Are all grids expected to have the same size? – Commented Oct 31, 2012 at 4:34
  • A variant of this answer should work... the transformation should be much simpler than the cylindrical one in that answer. – rm -rf, Oct 31, 2012 at 4:41
  • That context is very interesting. Consider including that kind of info in your questions and your posts will get much more attention (nikie's answer is outstanding, BTW) – Commented Oct 31, 2012 at 12:49
  • Is there an easy way to download a whole bunch of original images for testing purposes? – wxffles, Oct 31, 2012 at 20:45

2 Answers


The grid line detection from this answer works almost out of the box.

First, I adjust the brightness of the image for easier binarization:

src = ColorConvert[Import["https://i.sstatic.net/CmKLx.png"], 
   "Grayscale"];
white = Closing[src, DiskMatrix[5]];
srcAdjusted = Image[ImageData[src]/ImageData[white]]

[image: brightness-adjusted image]

Next I find the largest connected component (largest convex hull area), which should be the grid you're looking for:

components = 
  ComponentMeasurements[
    ColorNegate@Binarize[srcAdjusted], {"ConvexArea", "Mask"}][[All, 
    2]];
largestComponent = Image[SortBy[components, First][[-1, 2]]]

[image: largest connected component]

I create a filled mask from that, so I can ignore the background in the image:

mask = FillingTransform[Closing[largestComponent, 2]]

[image: filled mask]

Next step: detect the grid lines. Since they are horizontal/vertical thin lines, I can just use a 2nd derivative filter:

lY = ImageMultiply[
   MorphologicalBinarize[
    GaussianFilter[srcAdjusted, 3, {2, 0}], {0.02, 0.05}], mask];
lX = ImageMultiply[
   MorphologicalBinarize[
    GaussianFilter[srcAdjusted, 3, {0, 2}], {0.02, 0.05}], mask];  

The advantage of a 2nd derivative filter here is that it generates a peak at the center of the line and a negative response above and below the line. So it's very easy to binarize. The two result images look like this:

[image: binarized responses for horizontal and vertical lines]
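To see the effect in isolation, here is a small synthetic sketch (not part of the processing chain above): a dark horizontal line on a bright background, run through the same row-wise second-derivative filter:

(* Synthetic sketch: the second-derivative response of a dark horizontal line peaks
   at the line center and goes negative just above and below it. *)
lineData = ConstantArray[1., {50, 50}];
lineData[[24 ;; 26, All]] = 0.;                           (* a 3-pixel-wide dark line *)
response = GaussianFilter[Image[lineData], 3, {2, 0}];    (* 2nd derivative along rows *)
ListLinePlot[ImageData[response][[All, 25]]]              (* vertical profile through the line *)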

Now I can again use connected component analysis on these and select components with a caliper length > 100 pixels (the grid lines):

verticalGridLineMasks = 
  SortBy[ComponentMeasurements[
      lX, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All, 
      2]], #[[2, 1]] &][[All, 3]];
horizontalGridLineMasks = 
  SortBy[ComponentMeasurements[
      lY, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All, 
      2]], #[[2, 2]] &][[All, 3]];

The intersections between these lines are the grid locations:

centerOfGravity[l_] := 
 ComponentMeasurements[Image[l], "Centroid"][[1, 2]]
gridCenters = 
  Table[centerOfGravity[
    ImageData[Dilation[Image[h], DiskMatrix[2]]]*
     ImageData[Dilation[Image[v], DiskMatrix[2]]]], {h, 
    horizontalGridLineMasks}, {v, verticalGridLineMasks}];

Now I have the grid locations. The rest of the linked answer won't work here, because it assumes a 9x9 regular grid.

Show[src, 
 Graphics[{Red, 
   MapIndexed[{Point[#1], Text[#2, #1, {1, 1}]} &, 
    gridCenters, {2}]}]]

[image: detected grid intersections, indexed and overlaid on the source image]

Note that (if all the grid lines were detected) the points are already in the right order. If you're interested in grid cell 3/3, you can just use the intersections from gridCenters[[3,3]] to gridCenters[[4,4]] as its corners:

tr = Last@
   FindGeometricTransform[
    Extract[gridCenters, {{3, 3}, {4, 3}, {3, 4}}], {{0, 0}, {0, 
      1}, {1, 0}}] ;
ImageTransformation[src, tr, {300, 50}, DataRange -> Full, 
 PlotRange -> {{0, 1}, {0, 1}}]

[image: straightened crop of the selected cell]

ADD: Response to updated question

UPDATE 1: @nikie's answer is certainly very useful and I am thankful for it. My only concern is that this method looks for any table instead of looking specifically for a 4x3 table with row heights ...

The algorithm I described above was meant as a proof-of-concept prototype, not an industrial-strength, fully polished solution (where would be the fun in that?). There are a few obvious ways to improve it:

  • instead of selecting the connected component with the largest convex area, you could add more filter criteria: caliper length, caliper width, the lengths of the semiaxes of the best-fit ellipse, shape characteristics like eccentricity, circularity, rectangularity (see the sketch after this list). That should make the mask detection a lot more stable. But you'll have to find the right thresholds empirically, using (a lot) more than two samples
  • if the mask that is found contains other objects (e.g. lines running through the table), you can filter them away using morphological operations.
  • you could simply skip the gridline-search, and use the corners of the mask to calculate the geometric transformation, since you already know where the cells are in relation to the grid outline
  • even simpler: maybe you can just use the centroid and orientation found by ComponentMeasurements for the geometric transformation, without using the mask and grid lines at all.
  • you could select only the grid lines that are roughly in the positions you expect them, in relation to the full rectangle.
  • you could filter out grid lines that leave the mask area
  • you could only select grid lines that have the right caliper length

These are just a few ideas off the top of my head. Since you already have the position of the table (either from the mask or from the centroid and orientation properties returned by ComponentMeasurements) and the grid lines, implementing these ideas should be mostly straightforward. But there's no way to tell which of them work, and how well, without implementing them and testing them on a large range of sample images. (At least, I know of no way.)
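For example, a minimal sketch of the first idea above; the numeric thresholds are made-up placeholders and would have to be tuned on real samples:

(* Sketch only: filter candidate components by several shape measurements instead of
   convex area alone. The thresholds below are placeholders, not calibrated values. *)
candidates = ComponentMeasurements[
    ColorNegate@Binarize[srcAdjusted],
    {"Rectangularity", "CaliperLength", "CaliperWidth", "Mask"},
    #1 > 0.8 && #2 > 400 && #3 > 100 &][[All, 2]];

This plays the same role as the largest-convex-area selection above, just with more constraints.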

Comments:

  • Thank you so much for your answer! Please see my comments on the updated question. – Commented Oct 31, 2012 at 16:19
  • Wow, this caused me to upvote something on StackOverflow! The first time in months! – rcollyer, Oct 31, 2012 at 18:15
  • And I registered in dsp.stackexchange.com to upvote this – Commented Oct 31, 2012 at 20:02
  • A training set of maybe hundred inputs and expected outputs (probably in a form of expected coordinates), acceptance function and optimization of parameters to maximize amount of acceptable answers is certainly better than twiddling with individual inputs to get it right. Probably odd results can also be recognized by some simple image-recognition algorithms, and forwarded for manual analysis. (Like think if a form is accidentally scanned upside-down...) Using fixed "NRO." and "Letras" fields as anchors might also be beneficial. – kirma, Nov 1, 2012 at 9:01
  • I would like to thank you again for the updated suggestions in your answer. BTW, I fixed the problem of the misinterpreted additional column by using the "BorderComponents" -> False in ComponentMeasurements. – Commented Nov 1, 2012 at 15:05

Below are some techniques that, together with @nikie's answer, give you a powerful way of detecting specific table grids.

The Rubber Band Algorithm

The 3 columns to be detected must be very close to 40%, 15% and 45% of the total table width. Similarly, the row heights have proportions to follow. So the first problem to solve is how to identify a sequence that matches a pattern of proportions.

I first looked for a Mathematica built-in function, as this is a very generic task. The closest I could find is the SequenceAlignment function, but it works only on strings. So I developed my own algorithm, which is explained in this video.

This is how I implemented it:

rubberBandComparePair[wanted_List, suspectsSubset_List] :=
  Module[{rubberBand, reducedSuspects},
    rubberBand = Rescale[wanted, 
                         { Min@wanted        , Max@wanted         },
                         { Min@suspectsSubset, Max@suspectsSubset }];

    reducedSuspects = If[Length @ rubberBand < Length @ suspectsSubset, 
                         Flatten[Nearest[suspectsSubset, #] & /@ rubberBand],
                         suspectsSubset];

    {EuclideanDistance[rubberBand, reducedSuspects] /
     (Max@reducedSuspects - Min@reducedSuspects),
     reducedSuspects}
  ] /; Length@wanted <= Length@suspectsSubset


rubberBandCompare[wanted_List, suspects_List] := 
  Module[{w, k = Length @ wanted, costs, sortedSuspects},
    sortedSuspects = Union @ Sort @ suspects;
    w = Length @ sortedSuspects;
    costs = Flatten[#, 1] & @ 
      Table[rubberBandComparePair[wanted, sortedSuspects[[start ;; end]]],
            {start, 1, w - k + 1},
            {end, start + k - 1, w}];
    SortBy[costs, First][[1]]
  ] /; Length @ Union @ suspects >= Length @ wanted

rubberBandCompare[wanted_List, suspects_List] := 
{∞, {}} /; Length[Union@suspects] < Length[wanted]

For example if we are looking for a sequence proportional to {1,2,4} in the list {9, 10, 15, 21, 40, 55}:

In[1] = rubberBandCompare[{1, 2, 4}, {9, 10, 15, 21, 40, 55}]
Out[1] = {1/30, {10, 21, 40}}

and we have detected that {10,21,40} matches {1,2,4} with an error of 1/30. Notice that this error is a relative error that does not change with scale:

In[2] = rubberBandCompare[{1, 2, 4}, {90, 100, 150, 210, 400, 550}]
Out[2] = {1/30, {100, 210, 400}}

Detecting horizontal and vertical lines

To detect lines we can use ImageLines[] or the lower-level Radon[]. The advantage of ImageLines is that it is fast and simple, but it always looks for lines in all directions, whereas Radon lets you look for lines in a specific direction. I implemented solutions using both, but I'll explain here only the ImageLines[] solution, which worked well in a large number of cases.

ImageLines returns a list of lines where each line is defined by 2 points. So we first write this very simple function to calculate the angle of a line:

lineAngle::usage = "lineAngle[{{x1,y1},{x2,y2}}] returns the angle of the line that goes
                    through points {x1,y1} and {x2,y2}.";

lineAngle[{{ x_?NumericQ, y_?NumericQ},{x_          , y_          }}]:=Indeterminate
lineAngle[{{ x_?NumericQ,y1_?NumericQ},{x_          , y2_?NumericQ}}]:=Pi/2
lineAngle[{{x1_?NumericQ,y1_?NumericQ},{x2_?NumericQ, y2_?NumericQ}}]:=ArcTan[(y2-y1)/(x2-x1)]

When we call ImageLines later on, it will return a list of lines, so we need a function to select the ones that are close to the desired direction. For this I wrote this function:

selectLinesNearAngle::usage = 
    "selectLinesNearAngle[{{{x1,y1},{x2,y2}},...}, angle, tolerance] \
     selects the lines that have an inclination of angle +/- tolerance. \
     Each line is defined by a pair of points.";

selectLinesNearAngle[lines_List, angle_?NumericQ, angularTolerance:(_?NumericQ): 4°] :=
  Select[
    lines, 
    Or @@ Thread[Abs[lineAngle[#] - angle + {-Pi, 0, Pi}] < angularTolerance] &]

Now we are ready to write our modified version of ImageLines for horizontal or vertical lines:

Options[angularImageLines] = {"Debug" -> False};

angularImageLines[img_Image, α_, OptionsPattern[]]:=
  Module[
    {lines, selectedLines, binarizedImage},
    binarizedImage = Binarize@GaussianFilter[img, 3, Switch[α, 0, {2,0}, Pi/2, {0,2}]];
    lines = ImageLines[binarizedImage];
    selectedLines = selectLinesNearAngle[lines,α];
    If[OptionValue["Debug"],
       Print[Show[binarizedImage, Epilog -> {Green, Line /@ selectedLines}]]];
    selectedLines
  ]/; α==0 || α==Pi/2

Notice that binarizedImage uses GaussianFilter, as @nikie recommends. This is an example of how it works:
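A call of roughly this form produces a debug overlay like the one shown below (img stands for one certificate image; the angle Pi/2 is just an example):

(* "Debug" -> True prints the binarized image with the selected lines overlaid in green. *)
angularImageLines[img, Pi/2, "Debug" -> True]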

[image: debug output of angularImageLines with the detected lines overlaid]

Searching for the lines that match the wanted proportions

It is now time to use rubberBandCompare and angularImageLines together:

matchedLines[wanted_List, g_Image, α_] := 
  Module[
    {candidateSequence, suspects, error, bestMatch, lines}, 
    lines = angularImageLines[g, α];
    suspects = Sort[Switch[α, Pi/2, First, 0, Last] /@ ((#1[[1]] + #1[[2]])/2 & ) /@ lines]; 
    candidateSequence = rubberBandCompare[wanted, suspects]; 
    If[Head[candidateSequence] === rubberBandCompare, Return[$Failed]]; 
    {error, bestMatch} = candidateSequence;  
    If[α == Pi/2 && error > 0.01176495, Return[$Failed]];
    If[α == 0    && error > 0.02994115, Return[$Failed]];
    (Cases[lines, {{x1_, y1_}, {x2_, y2_}} /; 
     Switch[α, Pi/2, (x1+x2)/2, 0, (y1+y2)/2] == #1, 1, 1][[1]] & ) /@ candidateSequence[[2]]
  ]

Notice that the error needs to be lower than a calibration constant for the solution to be accepted. These constants are found by processing a set of sample files and making a histogram of the errors.
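A hedged sketch of how such a histogram could be produced (sampleFiles is a hypothetical list of certificate image paths, and uniformizeBackground is the helper defined further below):

(* Sketch: collect the rubber-band error of the vertical-line match over many sample
   certificates and inspect its distribution to choose the acceptance threshold. *)
wantedX = {16.5, 224.5, 302.5, 535.5};   (* the column pattern used in gridCenters below *)
columnErrors = Table[
   Module[{g, lines, suspects},
     g = uniformizeBackground@ColorConvert[Import[f], "Grayscale"];
     lines = angularImageLines[g, Pi/2];
     suspects = Sort[First[(#[[1]] + #[[2]])/2] & /@ lines];
     First@rubberBandCompare[wantedX, suspects]],
   {f, sampleFiles}];
Histogram[Select[columnErrors, NumericQ]]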

This is an example of matchedLines in action:

[image: lines matched to the wanted proportions]

Line Intersections

This function finds the intersection of two lines. It was found in the PlaneGeometry.m package by Eric Weisstein (see http://mathworld.wolfram.com/Line-LineIntersection.html):

Intersections[Line[{{x1_, y1_}, {x2_, y2_}}], 
              Line[{{x3_, y3_}, {x4_, y4_}}]] := 
  Module[
    {d   = (x1-x2) * (y3-y4) - (x3-x4) * (y1-y2), 
     d12 = Det[{{x1, y1}, {x2, y2}}], 
     d34 = Det[{{x3, y3}, {x4, y4}}]}, 
    If[NumericQ[d] && d == 0., 
       PointAtInfinity,
       {Det[ {{d12, x1-x2}, {d34, x3-x4}} ]/d, 
        Det[ {{d12, y1-y2}, {d34, y3-y4}} ]/d}]
  ]
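A quick sanity check: the two diagonals of the unit square meet at its center.

Intersections[Line[{{0, 0}, {1, 1}}], Line[{{0, 1}, {1, 0}}]]
(* {1/2, 1/2} *)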

Uniformize Background

This technique is explained by @nikie here. I just added the /. 0. -> 0.0001 replacement in order to avoid division by zero when large pure-black areas are present:

uniformizeBackground[g_Image] := 
  Image[ImageData[g]/(ImageData[Closing[g, DiskMatrix[5]]] /. 0. -> 0.0001)]
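A quick synthetic check of that zero guard (illustration only): a pure-black patch larger than DiskMatrix[5] survives the closing, so without the substitution the division would produce Indeterminate values there:

(* Synthetic check: the 20x20 black patch remains black after Closing, and the
   0. -> 0.0001 substitution prevents division by zero inside it. *)
data = ConstantArray[0.9, {60, 60}];
data[[1 ;; 20, 1 ;; 20]] = 0.;
uniformizeBackground[Image[data]]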

Grid Centers

gridCenters[g_Image] := 
  Module[
    {gAdjusted, hLines, vLines, 
     wantedX = {16.5, 224.5, 302.5, 535.5}, 
     wantedY = {22.5,  50.5,  97.5, 125.5, 154.5}}, 

    gAdjusted = uniformizeBackground[g];
    {hLines, vLines} = {{wantedY, 0}, {wantedX, Pi/2}} /. 
                {wanted_List, (α_)?NumericQ} :> matchedLines[wanted, gAdjusted, α]; 
    If[hLines === $Failed || vLines === $Failed, Return[$Failed]]; 
    Outer[Intersections, Line /@ hLines, Line /@ vLines]
  ]

The wantedX and wantedY values were found from a sample image using the Get Coordinates tool in the Drawing Tools palette. Once we have the grid centers, we can use the same techniques that @nikie used to identify and straighten any of the cells.
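Putting it all together, a hypothetical end-to-end call on one certificate looks like this ("certificate.png" is a placeholder file name):

(* Hypothetical usage: load a certificate, find the grid intersections, and overlay
   them on the image; files where the match is rejected return $Failed. *)
g = ColorConvert[Import["certificate.png"], "Grayscale"];
centers = gridCenters[g];
If[centers =!= $Failed,
  Show[g, Graphics[{Red, PointSize[Medium], Point[Flatten[centers, 1]]}]]]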

This is an example:

[images: example run on a very light table, showing the detected gridCenters]

A final comment

My main contribution in this answer is the rubberBandCompare function, which may be useful to other people in other areas. If it has already been invented somewhere else, please let me know.
