3

So my current personal project is to be able to automatically grab screenshots out of a game, OCR the text, and count the number of occurrences of given words.

Having spent all evening looking around at different OCR solutions, I've come to realize that the majority of OCR packages out there are designed for scanned text. If there are any packages that can read screen text reliably, they're well outside this hobbyist's budget.

I've been reading through some other questions, and the closest I found was OCR engines designed for screen-reading.

It seems to me that reading rendered text should be much easier than reading printed and scanned text. Lines are always straight, and any given letter will always appear with the exact same pixel representation (mostly, anyways). Also, why not use the actual font file (if you have it) as a cheat sheet for recognizing characters? We might actually reach 100% accuracy with a system like this.

Assuming you have the font file for a cheat sheet and your source image is perfectly square and has no noise, how would you go about recognizing characters from the screen?
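
To make the question concrete, here is a minimal sketch of the "cheat sheet" idea under those same assumptions (no noise, known font, a single line of text already cropped out). The 3x5 bitmaps in `GLYPHS` are hypothetical stand-ins for glyphs rendered from the real font file, and `render`, `match_at` and `read_line` are illustrative names, not part of any existing package.

```python
# Hypothetical 3x5 bitmaps standing in for glyphs rendered from the real font.
GLYPHS = {
    "A": ("010", "101", "111", "101", "101"),
    "B": ("110", "101", "110", "101", "110"),
    "C": ("011", "100", "100", "100", "011"),
}

def render(text, gap=1):
    """Draw text as rows of '0'/'1' pixels (stands in for a clean screenshot)."""
    rows = [""] * 5
    for i, ch in enumerate(text):
        for r in range(5):
            rows[r] += ("0" * gap if i else "") + GLYPHS[ch][r]
    return rows

def match_at(image, x, glyph):
    """True if the glyph equals the image pixels starting at column x, exactly."""
    w = len(glyph[0])
    return all(image[r][x:x + w] == glyph[r] for r in range(len(glyph)))

def read_line(image):
    """Scan one text line left to right, emitting every exact glyph match."""
    out = []
    x = 0
    width = len(image[0])
    while x < width:
        for ch, glyph in GLYPHS.items():
            if match_at(image, x, glyph):
                out.append(ch)
                x += len(glyph[0])  # jump past the recognized glyph
                break
        else:
            x += 1  # nothing matched here: skip one column (gap or clutter)
    return "".join(out)

print(read_line(render("CAB")))
```

Exact comparison like this only holds if the game draws text without anti-aliasing or sub-pixel positioning; otherwise the equality test would have to become a tolerance test.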

(Problems I can foresee are UI lines and images that could confuse any crude attempt at pixel-guessing.)

If you already know of a free/open-source OCR package designed for screen-reading, please let me know. I kind of doubt that's going to show up though, as no other askers seem to have gotten a lead either.

A Python interface is preferred, but beggars can't be choosers.

EDIT:
To clarify, I'm looking for design suggestions for an OCR solution that is specifically designed to read text from screenshots. Popular tools like tesseract (mentioned in the question I linked) are hard to use at best because they are not designed for this kind of source file.

4
  • 2
    My boss once coined a term I like for this -- Obvious Character Recognition.
    – MK.
    Commented Dec 27, 2010 at 5:50
  • Hah! I like that term, especially because it applies. It's a shame that it collides with the other acronym, or I'd use it for this. Commented Dec 27, 2010 at 5:54
  • Hi @Hovis, did you get this working? Do you have a link to your open source project? Commented Apr 11, 2014 at 17:23
  • Nope, I never got around to it. Commented Apr 18, 2014 at 16:51

3 Answers

2

So I've been thinking about it and I feel that the best approach will be to count the number of pixels in each blob/glyph/character. This should really cut down on the number of tests I need to do to differentiate between glyphs.
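
A rough sketch of how that pixel-count filter might look, assuming each blob has already been cut out as a small binary bitmap. The glyph bitmaps and the function names (`pixel_count`, `build_index`, `classify`) are made up for illustration. Note that two glyphs can share a count, so a full pixel comparison is still needed among the few remaining candidates.

```python
from collections import defaultdict

# Hypothetical 3x5 reference bitmaps for one font face, size and weight.
GLYPHS = {
    "A": ("010", "101", "111", "101", "101"),
    "B": ("110", "101", "110", "101", "110"),
    "C": ("011", "100", "100", "100", "011"),
    "I": ("111", "010", "010", "010", "111"),
}

def pixel_count(bitmap):
    """Number of set pixels in a blob."""
    return sum(row.count("1") for row in bitmap)

def build_index(glyphs):
    """Group glyph names by pixel count so a lookup narrows the candidates."""
    index = defaultdict(list)
    for ch, bitmap in glyphs.items():
        index[pixel_count(bitmap)].append(ch)
    return dict(index)

def classify(blob, glyphs, index):
    """Return the glyph whose bitmap equals the blob, or None if unknown."""
    for ch in index.get(pixel_count(blob), []):
        if glyphs[ch] == blob:  # full comparison only among same-count glyphs
            return ch
    return None

INDEX = build_index(GLYPHS)
print(INDEX)
print(classify(GLYPHS["B"], GLYPHS, INDEX))
```

Characters made of more than one connected blob (the dot on an "i", for example) would need to be merged before counting, which this sketch does not attempt.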

Regretfully, I'll have to be very specific about fonts. The software will only be able to recognize fonts at the right dpi, for the right font face and weight, etc.

It isn't ideal, and I'd still like to see someone who knows more about this stuff design OCR for rendered text; but it will work for my limited case.

1

If your goal is to count occurrences of certain events in a game, OCR is really not the right way to go about it. That said, if you are determined to use OCR, then tesseract-OCR is a well-known open-source package for performing optical character recognition. I'm not really sure what you are getting at with respect to scanned vs. rendered text, but tesseract will probably do as good a job as any open-source package that is available. OCR is still a tricky art, so I wouldn't expect 100% accuracy.

1
  • I've been trying to use tesseract all morning, and it's a no-go. It suffers from the same problem of being designed for "large" scanned text (i.e., high-DPI but probably messy text). Commented Dec 27, 2010 at 5:51
0

This isn't exactly what you want, but you may want to look at Sikuli.

3
  • Hmm, this looks really cool. It really isn't what I'm after but I'll probably end up playing with it. Thanks! Commented Dec 29, 2010 at 19:24
  • For future reference, that bit.ly link actually goes to sikuli.org. Why use a link shortener in the first place?
    – SilverWolf
    Commented Apr 6, 2019 at 21:58
  • 2
    @SilverWolf Who knows, I wrote this answer almost a decade ago! Commented Apr 7, 2019 at 2:56
