I have the following function to pre-process an image for Tesseract OCR. In most of the image the text is white, but there can be green, red and purple text too. I want to be able to read all of it, but when I apply thresholding during pre-processing the red text disappears. Is there a way to avoid this? It doesn't happen with the green text unless it's dark green.

import cv2
import numpy


def pre_process_img(img):
    # img is a PIL.Image.Image in RGB order
    open_cv_image = numpy.array(img)
    # Convert RGB to BGR
    open_cv_image = open_cv_image[:, :, ::-1].copy()

    # Use the BGR array so COLOR_BGR2GRAY applies the right channel weights
    img_gray = cv2.cvtColor(open_cv_image, cv2.COLOR_BGR2GRAY)
    # Upscale 3x
    img_gray = cv2.resize(img_gray, None, fx=3, fy=3, interpolation=cv2.INTER_CUBIC)
    # Invert so that bright text becomes dark
    img_inverted = 255 - img_gray
    ret, thresh1 = cv2.threshold(img_inverted, 127, 255, cv2.THRESH_BINARY)
    # [DEBUG] show pre processed image
    # cv2.imshow("inverted", thresh1)
    # cv2.waitKey(0)
    return thresh1

In this function img is a PIL.Image.Image. I convert it to an OpenCV image and apply preprocessing (converting to greyscale, resizing, inverting and binary thresholding). With psm 11 on Tesseract this has given a good enough result.

By the way, if you have any suggestions to improve my pre_process_img function, I'm open to them. I'm new to OpenCV and just stuck with whatever gave me the best result out of everything I tried.

This is my image here

  • just use a different threshold. you may have to treat different parts of the image differently. Commented Jan 10, 2022 at 1:01

1 Answer

Convert the image from BGR to the HSV colorspace in Python/OpenCV, then simply threshold the value channel. Here is the value channel; you will see that all of the text is white (in this case).

[image: value channel of the input image]

  • using HSV's Value channel is roughly equivalent to converting to grayscale (cvtColor), except that the HSV calculation treats all primary colors as equally bright, which gives the red text an advantage. one could also use img.max(axis=2), which is the same calculation used for the V channel when converting RGB to HSV. Commented Jan 10, 2022 at 11:40
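
To make the comparison in this comment concrete, here is a small NumPy-only sketch. The colour table is made up for illustration, and the weights 0.299, 0.587 and 0.114 are the standard luma weights that OpenCV documents for its RGB/BGR to gray conversion; OpenCV's own rounding may differ by one level.

```python
import numpy as np

# Illustrative text colours as RGB triples; the names are only labels for this demo
colors = {
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "purple": (128, 0, 128),
}
rgb = np.array(list(colors.values()), dtype=np.uint8).reshape(1, -1, 3)

# Weighted grayscale: red and blue contribute far less than green
gray = np.rint(rgb @ np.array([0.299, 0.587, 0.114])).astype(np.uint8)

# HSV value channel: the per-pixel maximum over the colour channels
value = rgb.max(axis=2)

for name, g, v in zip(colors, gray[0], value[0]):
    print(f"{name:>6}: gray={int(g):3d} value={int(v):3d}")
```

Pure red comes out at about 76 in grayscale but 255 in the value channel, which is why a fixed threshold of 127 on the grayscale image drops the red text while a threshold on the value channel keeps it.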
