Detect Areas of Text in Screenshot

Question

I'm working on a project to improve Wine's ability to automatically test software packages. What I want to do now is detect text in a screen capture of the current window. I can then parse all of the text and use AutoHotkey to click the mouse at the coordinates of the text I want.

For example, in Firefox, I might want to test several things, the first being opening the preferences. I would need to parse the screenshot of Firefox and detect all of the separate locations of text. I can then run these separate images of text through tesseract-ocr and find which one says "Edit". I then repeat the process for "Preferences".

I've tried to find a solution but so far haven't found anything. I'd prefer a solution that uses Python or has Python bindings, as that's what I've been programming in so far.

Don't you need some kind of optical character recognition solution along the way in order to parse the text correctly? In other words, how are you going to get the text from the image? - reckoner, commented Mar 31, 2011 at 15:56

Giuseppe Cardone · Accepted Answer · 2011-03-23 13:38:05Z

A possible starting point is Project SIKULI, a tool for automating GUI testing. It is written in Java, but it includes a scripting environment based on Jython, so adapting it to support Python scripts may not be too difficult.
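For the pure-Python workflow described in the question (OCR the screenshot, find the word, click its coordinates), a rough sketch is shown here. It is not part of SIKULI. It assumes the third-party `pytesseract` and Pillow packages plus a Tesseract install recent enough to report word bounding boxes; the helper names `find_text_centers` and `locate_on_screenshot` are hypothetical.

```python
def find_text_centers(ocr_data, target):
    """Return the (x, y) center of every OCR'd word equal to `target`.

    `ocr_data` is assumed to be a dict of parallel lists with the keys
    "text", "left", "top", "width" and "height", which is the shape that
    pytesseract.image_to_data returns with Output.DICT.
    """
    centers = []
    for i, word in enumerate(ocr_data["text"]):
        # Case-insensitive match; OCR output often carries stray whitespace.
        if word.strip().lower() == target.lower():
            x = ocr_data["left"][i] + ocr_data["width"][i] // 2
            y = ocr_data["top"][i] + ocr_data["height"][i] // 2
            centers.append((x, y))
    return centers


def locate_on_screenshot(image_path, target):
    """OCR the image at `image_path` and return centers of words matching `target`."""
    # Imported lazily so find_text_centers can be used without Tesseract installed.
    from PIL import Image
    import pytesseract

    data = pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )
    return find_text_centers(data, target)
```

The returned coordinates are relative to the screenshot, so the window's on-screen offset would have to be added before handing them to a click tool such as AutoHotkey. OCR on small GUI fonts can be unreliable, so scaling the screenshot up before recognition may help.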
