10

I am working on a Python program that needs to take a small image, determine whether it exists inside a larger image, and if so, report its location; if not, report that. (In my case, the large image will be a screenshot, and the small image is one that may or may not be on the screen, in an HTML5 canvas.) Looking online, I found out about template matching in OpenCV, which has excellent Python bindings. I tried the following, based on very similar code I found online, using NumPy as well:

import cv2
import numpy as np
image = cv2.imread("screenshot.png")
template = cv2.imread("button.png")
result = cv2.matchTemplate(image,template,cv2.TM_CCOEFF_NORMED)
StartButtonLocation = np.unravel_index(result.argmax(),result.shape)

This doesn't do what I need, because it ALWAYS returns a point in the larger image: the point where the match is closest, no matter how terrible a match it is. I want something that finds an exact, pixel-for-pixel match of the smaller image in the larger image and, if none exists, raises an exception, returns False, or something like that. It also needs to be fairly quick. Does anyone have a good idea about how to do this?

2
  • 1
    A quick question: can you assume that your small image will always appear in the large image at its original size and with exactly its original values? Or do you need to deal with variable-size small images that might be interpolated, and handle illumination variations? I mean, you mention an exact match; is it really exact? Commented Apr 16, 2015 at 7:19
  • 2
    Are you guaranteeing ALWAYS to use PNG format? I ask because JPEGs undergo quantisation and lossy compression, and things that are apparently identical can differ in their internal representation. Commented Apr 16, 2015 at 10:33

4 Answers

24

I will propose an answer that is fast and works perfectly if you are looking for an exact match in both size and pixel values.

The idea is to do a brute-force search for the wanted h x w template in a larger H x W image. The brute-force approach consists of looking at every possible h x w window over the image and checking for pixel-by-pixel correspondence with the template. This is very computationally expensive, but it can be accelerated.

im = np.atleast_3d(im)
H, W, D = im.shape[:3]
h, w = tpl.shape[:2]
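For reference, the unaccelerated brute-force search described above can be sketched as follows. This is only an illustrative baseline; `find_image_bruteforce` is a hypothetical name, not part of the answer, and returning None for "not found" is an assumption.

```python
import numpy as np


def find_image_bruteforce(im, tpl):
    """Return (row, col) of the first exact match of tpl in im, or None."""
    H, W = im.shape[:2]
    h, w = tpl.shape[:2]
    # Visit every position where an h x w window still fits inside the image
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            # Compare the window with the template pixel by pixel
            if np.array_equal(im[y:y + h, x:x + w], tpl):
                return (y, x)
    return None
```

Every window costs a full h x w comparison here, which is the expense the rest of the answer avoids.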

Using integral images, one can very quickly calculate the sum inside an h x w window starting at every pixel. An integral image is a summed-area table (a cumulatively summed array), which can be calculated with numpy very quickly as:

sat = im.cumsum(1).cumsum(0)

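and it has really nice properties, such as computing the sum of all the values within a window from only 4 table lookups:

[image from Wikipedia: a summed-area table with the corner points A (top-left), B (top-right), C (bottom-left) and D (bottom-right) of a rectangle; the sum inside the rectangle is D - B - C + A]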
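As a small worked example of that property, the sketch below computes one window sum from four corner lookups on a tiny array. The helper `window_sum` is hypothetical and written for clarity rather than speed; it uses inclusive corner indices and skips the lookups that would fall outside the table.

```python
import numpy as np

im = np.arange(1, 17).reshape(4, 4)   # values 1..16
sat = im.cumsum(1).cumsum(0)          # sat[i, j] == im[:i + 1, :j + 1].sum()


def window_sum(sat, top, left, h, w):
    """Sum of im[top:top + h, left:left + w] using corner lookups in sat."""
    bottom, right = top + h - 1, left + w - 1
    total = sat[bottom, right]                 # D
    if top > 0:
        total -= sat[top - 1, right]           # B
    if left > 0:
        total -= sat[bottom, left - 1]         # C
    if top > 0 and left > 0:
        total += sat[top - 1, left - 1]        # A
    return total


print(window_sum(sat, 1, 1, 2, 2))  # 34, i.e. 6 + 7 + 10 + 11
```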

Thus, by calculating the sum of the template and comparing it with the sums of the h x w windows over the integral image, it is easy to find a list of "possible windows" where the sum of the values inside equals the sum of the values in the template (a quick filter).

iA, iB, iC, iD = sat[:-h, :-w], sat[:-h, w:], sat[h:, :-w], sat[h:, w:]
lookup = iD - iB - iC + iA

The above is a numpy vectorization of the operation shown in the image, applied to all the possible h x w rectangles over the image (thus, really quick).

This greatly reduces the number of possible windows (to 2 in one of my tests). The last step is to check for exact matches with the template:

possible_match = np.where(np.logical_and.reduce([lookup[..., i] == tplsum[i] for i in range(D)]))
for y, x in zip(*possible_match):
    if np.all(im[y+1:y+h+1, x+1:x+w+1] == tpl):
        return (y+1, x+1)

Note that here the y and x coordinates correspond to the A point in the image, which is the row and column just before the template; that is why 1 is added to both.

Putting it all together:

import numpy as np

def find_image(im, tpl):
    im = np.atleast_3d(im)
    tpl = np.atleast_3d(tpl)
    H, W, D = im.shape[:3]
    h, w = tpl.shape[:2]

    # Integral image and template sum per channel
    sat = im.cumsum(1).cumsum(0)
    tplsum = np.array([tpl[:, :, i].sum() for i in range(D)])

    # Calculate lookup table for all the possible windows.
    # lookup[y, x] is the sum of the window whose top-left pixel is (y+1, x+1),
    # so windows touching row 0 or column 0 are not covered.
    iA, iB, iC, iD = sat[:-h, :-w], sat[:-h, w:], sat[h:, :-w], sat[h:, w:]
    lookup = iD - iB - iC + iA
    # Possible matches: windows whose per-channel sums equal the template's
    possible_match = np.where(np.logical_and.reduce([lookup[..., i] == tplsum[i] for i in range(D)]))

    # Find exact match among the candidates
    for y, x in zip(*possible_match):
        if np.all(im[y+1:y+h+1, x+1:x+w+1] == tpl):
            return (y+1, x+1)

    raise Exception("Image not found")

It works with both grayscale and color images and runs in 7ms for a 303x384 color image with a 50x50 template.
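One limitation of the function above: because each candidate (y, x) is the A point one row and one column before the window, a template that touches the top row or the left column of the image is never examined. Below is a minimal sketch of a variant that covers those positions, assuming it is acceptable to prepend a zero row and column to the summed-area table and to return None instead of raising. `find_image_padded` is a hypothetical name, not part of the original answer.

```python
import numpy as np


def find_image_padded(im, tpl):
    """Return (row, col) of an exact match of tpl in im, or None."""
    im = np.atleast_3d(im)
    tpl = np.atleast_3d(tpl)
    H, W, D = im.shape
    h, w = tpl.shape[:2]
    if h > H or w > W or tpl.shape[2] != D:
        return None

    # Summed-area table with an extra zero row and column at the top/left,
    # so that sat[i, j] is the sum of im[:i, :j].
    sat = np.zeros((H + 1, W + 1, D), dtype=np.int64)
    sat[1:, 1:] = im.cumsum(0, dtype=np.int64).cumsum(1)
    tplsum = tpl.sum(axis=(0, 1), dtype=np.int64)

    # lookup[y, x] is the per-channel sum of the window im[y:y+h, x:x+w]
    lookup = sat[h:, w:] - sat[:-h, w:] - sat[h:, :-w] + sat[:-h, :-w]

    # Candidates: windows whose sums equal the template's in every channel
    for y, x in np.argwhere((lookup == tplsum).all(axis=-1)):
        if np.array_equal(im[y:y + h, x:x + w], tpl):
            return (int(y), int(x))
    return None
```

With the padded table the candidate index is the window's own top-left pixel, so no +1 offset is needed when slicing or returning.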

A practical example:

>>> from skimage import data
>>> from skimage.color import gray2rgb
>>> im = gray2rgb(data.coins())
>>> tpl = im[170:220, 75:130].copy()

>>> y, x = find_image(im, tpl)
>>> y, x
(170, 75)

And to illustrate the result:

[image]

Left: the original image; right: the template. And here is the exact match:

>>> import matplotlib.pyplot as plt
>>> from matplotlib.patches import Rectangle
>>> fig, ax = plt.subplots()
>>> ax.imshow(im)
>>> rect = Rectangle((x, y), tpl.shape[1], tpl.shape[0], edgecolor='r', facecolor='none')
>>> ax.add_patch(rect)

[image: the original image with the matched region outlined in red]

And last, an example of the possible_match candidates for the test:

[image]

The sums over the two windows in the image are the same, but the last step of the function filters out the one that doesn't exactly match the template.
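To make the two-stage filtering concrete, here is a tiny hand-made example (the arrays are invented for illustration, not taken from the coins test) in which two windows share the template's sum but only one matches pixel for pixel:

```python
import numpy as np

im = np.array([[1, 2, 4, 3],
               [3, 4, 2, 1]])
tpl = np.array([[1, 2],
                [3, 4]])
h, w = tpl.shape

# Sum of every h x w window, indexed by the window's top-left corner
window_sums = np.array([[im[y:y + h, x:x + w].sum()
                         for x in range(im.shape[1] - w + 1)]
                        for y in range(im.shape[0] - h + 1)])

# Stage 1: keep the windows whose sum equals the template's sum
candidates = [(int(y), int(x))
              for y, x in np.argwhere(window_sums == tpl.sum())]

# Stage 2: keep only the candidates that match pixel for pixel
exact = [(y, x) for y, x in candidates
         if np.array_equal(im[y:y + h, x:x + w], tpl)]

print(candidates)  # [(0, 0), (0, 2)]
print(exact)       # [(0, 0)]
```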

6
  • This was EXACTLY what I needed. It does indeed work perfectly, and it's very quick. Thank you very much!
    – clj
    Commented Apr 20, 2015 at 2:47
  • But in this case, the two images have the same source, so the pixel values will be the same. What if I get the image in "tpl" from a different source? Will it work in that case too?
    – Neelesh
    Commented Feb 14, 2017 at 7:13
  • @Neelesh If it is from a different source, the problem is different. You are looking for template matching, and other solutions exist (e.g. normalized cross-correlation). However, the code above could be slightly modified to return the patch with the smallest sum difference rather than an exact sum. Hope it helps! Commented Feb 14, 2017 at 10:41
  • 2
    np.logical_and(*[...]) - are you expecting np.logical_and to take an arbitrary number of things to AND together? NumPy ufuncs don't work that way. You're actually specifying the third list element as an array to place the output in, not as another array to AND together. The ufunc reduce method might help: np.logical_and.reduce([...]) (no *) instead of np.logical_and(*[...]). Commented Mar 6, 2017 at 21:36
  • 1
    @Hat you could use padding for that without any need of cv2: img = np.pad(img, ((1,0), (1,0), (0,0)), mode='constant') (syntax might be a little off, I'm on my phone and can't check it). But thanks for letting me know! It is a quick and nice fix :) Commented Jul 25, 2018 at 15:03
2

Since you are happy with OpenCV, I would suggest that you start with what you have already done and get the best match. Once you have the location of the best match, you can check that it is actually a good match.

Checking that it is a good match should be as easy as extracting the matching image and comparing it to the template. To extract the image you might need to use cv2.minMaxLoc(result) and process the output. The extraction method seems to depend on the method used for comparing the images and is done with examples here.

Once you have extracted the image, you should be able to compare them using numpy.allclose or some other method.

1

I tried to use this last script to find an image embedded in the pictures of a directory, but it doesn't work. Here is what I did:

import cv2
import numpy as np
import os
import glob

pic2 = "/home/tse/Images/pictures/20/redu.png"
path = "/home/tse/Images/pictures/20/*.png"
for pic1 in glob.glob(path):
    def find_image(pic1, pic2):
        dim1_ori = pic1.shape[0]
        dim2_ori = pic1.shape[1]
        dim1_emb = pic2.shape[0]
        dim2_emb = pic2.shape[1]

        v1_emb = pic2[0, 0]
        v2_emb = pic2[0, dim2_emb - 1]
        v3_emb = pic2[dim1_emb - 1, dim2_emb - 1]
        v4_emb = pic2[dim1_emb - 1, 0]

        mask = (pic1 == v1_emb).all(-1)
        found = 0

        if np.sum(mask) > 0: # Check if a pixel identical to v1_emb
            result = np.argwhere(mask)
            mask = (result[:, 0] <= dim1_ori - dim1_emb) & (result[:, 1] <= dim2_ori - dim2_emb)

            if np.sum(mask) > 0: # Check if the pixel induce a rectangl
                result = result[mask] + [0, dim2_emb - 1]
                mask = [(pic1[tuple(coor)] == v2_emb).all(-1) for coor in result]

                if np.sum(mask) > 0: # Check if a pixel identical to v2_emb
                    result = result[mask] + [dim1_emb-1, 0]
                    mask = [(pic1[tuple(coor)] == v3_emb).all(-1) for coor in result]

                    if np.sum(mask) > 0: # Check if a pixel identical to v3_emb
                        result = result[mask] - [0, dim2_emb - 1]
                        mask = [(pic1[tuple(coor)] == v4_emb).all(-1) for coor in result]

                        if np.sum(mask) > 0: # Check if a pixel identical to v4_emb
                            result = result[mask]
                            result[:, 0] = result[:, 0] - (dim1_emb - 1)
                            result = np.c_[result, result[:, 0] + dim1_emb, result[:, 1] + dim2_emb]

                            for coor in result: # Check if the induced rectangle is indentical to the embedding
                                induced_rectangle = pic1[coor[0]:coor[2], coor[1]:coor[3]]
                                if np.array_equal(induced_rectangle, pic2):
                                    found = 1
                                    break
        if found == 0:
            return('No image found')
            print("Not found")
        else:
            return('Image found')
            print("Found")
0

This is a refinement of @Imanol Luengo's function. To reduce computation, we first filter the pixels identical to the top-left corner of the template. Then we check only the rectangles induced by these pixels, comparing the other three corners before comparing the full rectangle.

import numpy as np

def find_image(pic1, pic2): # pic1 is the original, while pic2 is the embedding

    dim1_ori = pic1.shape[0]
    dim2_ori = pic1.shape[1]

    dim1_emb = pic2.shape[0]
    dim2_emb = pic2.shape[1]

    v1_emb = pic2[0, 0]
    v2_emb = pic2[0, dim2_emb - 1]
    v3_emb = pic2[dim1_emb - 1, dim2_emb - 1]
    v4_emb = pic2[dim1_emb - 1, 0]

    mask = (pic1 == v1_emb).all(-1)
    found = 0

    if np.sum(mask) > 0: # Check if a pixel identical to v1_emb
        result = np.argwhere(mask)
        mask = (result[:, 0] <= dim1_ori - dim1_emb) & (result[:, 1] <= dim2_ori - dim2_emb)

        if np.sum(mask) > 0: # Check if the pixel induce a rectangle
            result = result[mask] + [0, dim2_emb - 1]
            mask = [(pic1[tuple(coor)] == v2_emb).all(-1) for coor in result]

            if np.sum(mask) > 0: # Check if a pixel identical to v2_emb
                result = result[mask] + [dim1_emb-1, 0]
                mask = [(pic1[tuple(coor)] == v3_emb).all(-1) for coor in result]

                if np.sum(mask) > 0: # Check if a pixel identical to v3_emb
                    result = result[mask] - [0, dim2_emb - 1]
                    mask = [(pic1[tuple(coor)] == v4_emb).all(-1) for coor in result]

                    if np.sum(mask) > 0: # Check if a pixel identical to v4_emb
                        result = result[mask]
                        result[:, 0] = result[:, 0] - (dim1_emb - 1)
                        result = np.c_[result, result[:, 0] + dim1_emb, result[:, 1] + dim2_emb]

                        for coor in result: # Check if the induced rectangle is identical to the embedding
                            induced_rectangle = pic1[coor[0]:coor[2], coor[1]:coor[3]]
                            if np.array_equal(induced_rectangle, pic2):
                                found = 1
                                break
    if found == 0:
        return('No image found')
    else:
        return('Image found')
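Since the question asks for the location of the match, here is a compact sketch of the same corner-filter idea that returns the coordinates instead of a message. It is an illustrative variant, not the function above: `locate_by_corner` is a hypothetical name, it tests only the top-left corner before the full comparison, and it is written to accept both grayscale and color arrays.

```python
import numpy as np


def locate_by_corner(image, template):
    """Return (row, col) of the first exact match of template, or None."""
    H, W = image.shape[:2]
    h, w = template.shape[:2]
    if h > H or w > W or image.shape[2:] != template.shape[2:]:
        return None

    # Only positions where the template still fits inside the image
    region = image[:H - h + 1, :W - w + 1]
    # Pixels equal to the template's top-left pixel (all channels for color)
    mask = region == template[0, 0]
    if mask.ndim == 3:
        mask = mask.all(axis=-1)

    # Full comparison only at the surviving candidate positions
    for y, x in np.argwhere(mask):
        if np.array_equal(image[y:y + h, x:x + w], template):
            return (int(y), int(x))
    return None
```

The corner filter is most effective on images with many distinct pixel values; on large flat-colored regions, such as parts of a screenshot, many candidates survive and each one costs a full comparison.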
