1

I'm working on a project to increase the ability for wine to automatically test software packages. What I'm looking to do now is detect text in the screen capture of the current window. I can then parse all of the text and use autohotkey to give a mouse click on the coordinates of the text I want.

For example, in firefox, I might want to test different things, the first open being opening preferences. I would then need to parse the screenshot of firefox, detect all of the separate locations of text. I can then run these separate images of text into tesseract-ocr and detect which one, says "Edit". I then redo this again for "preferences".

I've tried to find a solution but so far can't find anything. I'd prefer a solution that uses python or has python binds as thats what I've been programing in so far.

1
  • don't you need some kind of optical character recognition solution along the way in order to parse the text correctly? In other words, how are you going to get the text from the image?
    – reckoner
    Commented Mar 31, 2011 at 15:56

1 Answer 1

1

A possible starting point is Project SIKULI. It is a tool to automate GUI testing. It is written in Java, nonetheless it includes a scripting environment based on Jython, hence modifying it to support python script may be not too difficult.

Not the answer you're looking for? Browse other questions tagged or ask your own question.