Take a screenshot and use OCR on it

Question

I know that OCR with Python has already been discussed many times. However, I didn't find anything that seems to help me except the question "Python Tesseract OCR question", and it didn't solve my problem.

I need to make a little script to capture the text inside an opened window (of a text editor).

So it should:

Take a screenshot
Find the position of the text editor window and slice the screenshot (dunno if this passage is needed)
Convert it to grayscale and pass it to tesseract

I'm kind of a newbie to Python and I don't know whether this is feasible.

Thanks in advance for any hints.

Giorgio

But you already have the text! It's right there in the text editor! Why in the world would you go to such lengths to get text that you already have? — kindall, Commented Feb 10, 2012 at 19:04
It's an example, it could be a text editor or another program, I mean I have neat text. — Giorgio, Commented Feb 10, 2012 at 19:56

Community · Accepted Answer · 2017-05-23 10:34:38Z

This is certainly possible, but generally unreasonable; there are better ways. Say you are parsing a web page: you could grab the HTML text directly without running it through OCR, or, if you want to read the text in an image, you could fetch the HTML with urllib2, parse it, select the image, and download it directly to a file. There are many HTML parsers for Python that you can use as well. Converting to greyscale is simple with PIL or ImageMagick. From there, you can run the image through an OCR program, or do it within the script with a Python wrapper like python-tesseract.
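As a rough illustration of the "grab the HTML text directly" route, here is a minimal sketch using only the standard library's html.parser (Python 3). The class name TextAndImages and the helper extract are made up for this example, and it assumes you already have the page source as a string; fetching it is left out.

```python
from html.parser import HTMLParser


class TextAndImages(HTMLParser):
    """Collect visible text and <img> sources (hypothetical helper class)."""

    SKIP = {"script", "style"}  # tags whose contents are not visible text

    def __init__(self):
        super().__init__()
        self.text = []
        self.images = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <script>/<style>.
        if not self._skip_depth and data.strip():
            self.text.append(data.strip())


def extract(html):
    """Return (visible_text, list_of_image_urls) for an HTML string."""
    parser = TextAndImages()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text), parser.images
```

Calling extract(page_source) gives you the text without any OCR, plus the image URLs you could download and feed to an OCR program if the text really only exists as pixels.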

Alternatively, if you insist on taking a screenshot, something like this should be useful. I still hold that there are almost always better ways, but this should get you started if you want to try it out.

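# Requires PyGTK (the GTK 2 bindings for Python 2) on an X11 desktop.
import gtk.gdk

# The root window covers the whole screen.
w = gtk.gdk.get_default_root_window()
sz = w.get_size()
print("The size of the window is %d x %d" % sz)

# Allocate an 8-bit RGB pixbuf (no alpha) and copy the screen contents into it.
pb = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB, False, 8, sz[0], sz[1])
pb = pb.get_from_drawable(w, w.get_colormap(), 0, 0, 0, 0, sz[0], sz[1])
if pb is not None:
    pb.save("screenshot.png", "png")
    print("Screenshot saved to screenshot.png.")
else:
    print("Unable to get the screenshot.")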

This was taken from the question "Take a screenshot via a python script. [Linux]".
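Once screenshot.png exists, the remaining steps from the question (slice out the window, convert to greyscale, run OCR) could look roughly like the sketch below. It assumes Python 3 with Pillow and the pytesseract wrapper installed, plus the tesseract binary itself; pytesseract is a different wrapper from the python-tesseract mentioned above, swapped in here only for illustration. The function names clamp_box and ocr_region are made up for this example, and you still have to find the window coordinates yourself.

```python
def clamp_box(box, size):
    """Clip a (left, top, right, bottom) box to an image of (width, height)."""
    left, top, right, bottom = box
    width, height = size
    left = max(0, min(left, width))
    top = max(0, min(top, height))
    right = max(left, min(right, width))
    bottom = max(top, min(bottom, height))
    return (left, top, right, bottom)


def ocr_region(path, box):
    """Crop the image at `path` to `box`, convert to greyscale, return OCR text."""
    # Third-party imports are kept local so clamp_box works without them.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        region = img.crop(clamp_box(box, img.size))
        gray = region.convert("L")  # "L" is 8-bit greyscale
        return pytesseract.image_to_string(gray)
```

For example, ocr_region("screenshot.png", (0, 0, 800, 600)) would OCR the top-left 800 x 600 pixels; those coordinates are placeholders for wherever your window actually is.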

What if one wants to take a screenshot of a selected area? I mean selecting some particular area by click-and-drag of mouse cursor. — skt7, Commented Apr 13, 2018 at 18:40
If your platform supports Bash you can try askubuntu.com/questions/280475/… (tested on Ubuntu and OSX, although a bit glitchy at times). I admit I'll like a Python script I can use on all platforms though (or at least a script that is mostly python, and just delegates area selection to OS-specific commands). — hsandt, Commented Dec 11, 2019 at 20:55
