Questions tagged [tesseract]
Tesseract is an OCR (optical character recognition) engine
15
questions
0
votes
0
answers
52
views
What happened to Tesseract's "Math / equation detection module"?
I was able to get Tesseract to run via a Python script on my Windows machine to turn non-searchable PDFs into searchable ones. When downloading Tesseract onto Windows, it asked me which languages I ...
2
votes
0
answers
41
views
OCR high res images & combine OCR data later, after image compression?
I have a large number of .tif files coming out of ScanTailor. Is there a way that I might OCR those .tif files with tesseract, holding the OCR data separate from the images; then compress the images, and ...
0
votes
1
answer
408
views
Best command-line OCR software for recognizing typed text over colorful background
I need to extract text from images like the one below:
As you can see, the text is typed, not handwritten. Moreover, the background is colorful.
I've tried Tesseract OCR, and while it works some of ...
0
votes
1
answer
122
views
Tesseract doesn't accept process substitution
I'm making a quick script that is supposed to use an OCR tool (tesseract) on an image in the clipboard to convert it to text and output it. It looks like this:
#!/bin/sh
temp="$(mktemp tmpXXX.png)"
...
0
votes
1
answer
118
views
Scripting tesseract for file manager context menu
File manager context menu scripts sometimes do the job far quicker than using a GUI utility. So I've been using dozens of simple and more complex scripts for a long time in file managers Dolphin, ...
1
vote
0
answers
395
views
Using tesseract for character recognition, result is not as expected (much worse). How to get better?
I wanted to add the output of a Linux boot to my question and decided to try optical character recognition, thinking that now in 2022 there should surely be decent open source options (have not tried OCR ...
2
votes
0
answers
96
views
Is there software to manually OCR / teach OCR for handwritten (non-English) texts?
I have a problem that Tesseract, ABBYY FineReader, etc. can't solve: they can't recognize handwritten Russian, for example.
So I'm looking for
OCR software for such things
or a way to manually OCR my PDFs (...
0
votes
1
answer
202
views
How do you save the text in the terminal to various text formats?
I'm playing around a bit with OCR software, in particular I'm spending a bit of time with tesseract. I got it to where I can load an image and get tesseract to rip the text from the image, in Linux ...
0
votes
1
answer
1k
views
Install tesseract offline in RHEL
I have a RHEL-based server that does not connect to the internet. I need to install Tesseract >4.0 on this server. Therefore, my option was to download RPM packages from another and move them to ...
3
votes
0
answers
317
views
Debian Buster: Tesseract not supporting URL as argument
I'm trying to parse text from a hosted image, but it looks like I've misconfigured Tesseract.
I'm using Debian Buster; tesseract-ocr, libtesseract-dev and a Ruby wrapper are installed.
# $ ...
1
vote
0
answers
48
views
script run via keyboard binding does not write to file
The following bash script interprets text in an image file and writes it to a .txt file.
#!/usr/bin/env bash
LD_LIBRARY_PATH="/usr/local/lib"
export LD_LIBRARY_PATH
/usr/local/bin/tesseract /home/martin/...
10
votes
2
answers
13k
views
Tesseract: High CPU Usage and slow speed, only when running multiple processes in parallel
Problem
pytesseract.image_to_string() takes too much time when I run the script through supervisord, but executes almost instantaneously when run directly in the shell (on the same server and ...
0
votes
1
answer
260
views
Leptonica compilation error
Trying to install leptonica v1.78 on Ubuntu 16, but it's not working for some reason. After running ./configure and make, I keep getting this error:
make[2]: Entering directory '/home/user/Documents/...
5
votes
1
answer
2k
views
tesseract: is it possible to change font output in OCRed pdf?
Following up on "how to OCR a pdf file and get the text stored within pdf?", I have successfully produced OCRed pdf pages.
In Evince, however, the letters are not shown; by this I mean that I cannot see ...
2
votes
1
answer
699
views
Where can I get Tesseract binaries for Debian 6 64-bit?
I used apt-get to install Tesseract but it's not really working. Maybe I could just download binaries somewhere, put them in a directory and use them that way?
What's wrong with my Tesseract now:
tesseract --help
...