I've used tesseract for OCR a few times in the past successfully. This is on macOS; it was installed with Homebrew.
Today, I ran:
WGroleau@MBP ~ % brew upgrade # to make sure everything is the latest and then …
WGroleau@MBP ~ % tesseract ~/Downloads/temp.jpg stdout -l chi_sim
福佳生活饶
The last (fifth) character was incorrect, so I made a minor graphic edit to that character and ran the same command. No output, no diagnostics. I ran it in verbose mode: still no diagnostics, only a list of the libraries it used.
I then cropped the edited character out of the file and tried again. Still no output, no diagnostics.
What do I do next?
Here's the file after edit but before cropping:
Update: If I tell it to use "Legacy engine only," I get:
Error: Tesseract (legacy) engine requested, but components are not present in /usr/local/share/tessdata/chi_sim.traineddata!!
Failed loading language 'chi_sim'
Tesseract couldn't load any languages!
Could not initialize tesseract.
temp.jpg
What if you edit it in a different program? Or even re-save the current file using a different program? I mean, it looks like the first program saved the JPEG in a form that triggers some bug(?) in tesseract. If another program manages to produce a "more compatible" JPEG, then the simplest workaround would be to use it instead of the first program when editing for tesseract.
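A rough sketch of the re-save idea, in case it helps. It assumes the macOS built-in sips tool as the "different program" (my pick, not something from the question) and re-encodes to PNG rather than JPEG to take the original encoder out of the picture. The helper names png_path and resave_and_ocr are made up for illustration; the tesseract arguments are the ones from the question.

```shell
#!/bin/sh
# Hypothetical helper: derive a .png path next to the original image
# by swapping the file extension.
png_path() {
    printf '%s\n' "${1%.*}.png"
}

# Hypothetical helper: re-encode the image with sips, then OCR the copy.
# Assumes sips (ships with macOS) and tesseract are both on PATH.
resave_and_ocr() {
    src=$1
    if [ ! -f "$src" ]; then
        echo "no such file: $src" >&2
        return 1
    fi
    out=$(png_path "$src")
    # Re-save through a different encoder; the original file is left untouched.
    sips -s format png "$src" --out "$out" >/dev/null || return 1
    # Same invocation as in the question, pointed at the re-saved copy.
    tesseract "$out" stdout -l chi_sim
}
```

Usage would be something like resave_and_ocr ~/Downloads/temp.jpg. If the PNG copy produces output where the edited JPEG did not, that would point at how the file was saved rather than at the image content itself.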