To streamline my scanning process, I want to write a script that scans and applies OCR in one step. However, my bash skills are rather poor, so I would be very thankful for a bit of help. Here is my attempt:

#!/bin/bash

mydate="$(date +"%Y%m%d-%H%M%S")"
image="$(scanimage --device "brother4:net1;dev0" --progress --verbose --resolution=600 -l 0 -t 0 -x 210 -y 297 --format=pdf)"
ocrmypdf --deskew "$image" "$mydate".pdf

The following command works well, but does not create a date-specific filename:

scanimage --device "brother4:net1;dev0" --progress --verbose --resolution=600 -l 0 -t 0 -x 210 -y 297 --format=pdf > scan.pdf && ocrmypdf --deskew scan.pdf scan.pdf

Since the OCR process takes some time, the filename containing the time (down to the second) has to be fixed at scan time and then applied to the final file. Alternatively, maybe it is possible (I could not find out how) to pipe the scan to ocrmypdf without naming it, and then save the result under a name containing the date and time.

2 Answers

You can create a temporary directory and save the file therein. mktemp has been designed to give you a unique file/directory name and suits this purpose.

tmpdir=$(mktemp -d OcrTmpDirXXXXXXXXX)

scanimage args >"$tmpdir/in.pdf"
ocrmypdf args "$tmpdir/in.pdf" "$tmpdir/out.pdf"

printf 'See "%s" for result\n' "$tmpdir"
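Putting the pieces together, here is a minimal sketch of a complete script. It is an illustration under assumptions, not a tested setup: scan_and_ocr is a hypothetical helper name, the scanner options are copied from the question (minus --progress and --verbose), and the timestamp is taken before the scan starts so the slow OCR step cannot shift it.

```shell
#!/bin/bash
# Sketch: scan into a temporary directory, OCR the result, and save it
# under a name fixed at scan time. scan_and_ocr is a hypothetical helper.

scan_and_ocr() {
    local outdir=${1:-.}          # where the final PDF goes (default: current directory)
    local mydate tmpdir
    mydate=$(date +%Y%m%d-%H%M%S) || return   # fixed before scanning starts
    tmpdir=$(mktemp -d) || return             # unique scratch directory

    if scanimage --device "brother4:net1;dev0" --resolution=600 \
            -l 0 -t 0 -x 210 -y 297 --format=pdf > "$tmpdir/in.pdf" &&
       ocrmypdf --deskew "$tmpdir/in.pdf" "$outdir/$mydate.pdf"
    then
        rm -rf -- "$tmpdir"
        printf '%s\n' "$outdir/$mydate.pdf"   # report the final file name
    else
        rm -rf -- "$tmpdir"                   # clean up on failure too
        return 1
    fi
}
```

Calling scan_and_ocr ~/scans at the end of the script would then run one scan and print the path of the finished file.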

The problem with

image="$(scanimage --device "brother4:net1;dev0" --progress --verbose --resolution=600 -l 0 -t 0 -x 210 -y 297 --format=pdf)"

is that the $image variable contains the binary contents of the pdf, not a filename.
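The difference can be seen with a small stand-in for the scanner. This is only a sketch: fake_scan is a made-up function that writes a few bytes, including a NUL byte, to standard output the way scanimage writes a PDF.

```shell
# fake_scan is a hypothetical stand-in for scanimage.
fake_scan() { printf 'PDF\0DATA'; }

tmpfile=$(mktemp)

captured=$(fake_scan)    # the variable holds the output itself, not a file name;
                         # a shell variable cannot store the NUL byte
fake_scan > "$tmpfile"   # a redirection stores every byte in a file

printf 'variable: %d characters\n' "${#captured}"
printf 'file:     %d bytes\n' "$(wc -c < "$tmpfile")"
```

The variable ends up shorter than the file, so besides not being a filename, the captured PDF data would also be damaged.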


One very tricky way to do this would be with a process substitution

ocrmypdf --deskew <(
    scanimage --device "brother4:net1;dev0" --progress --verbose --resolution=600 -l 0 -t 0 -x 210 -y 297 --format=pdf
) "$mydate".pdf

I don't promise that ocrmypdf will accept it.


If ocrmypdf accepts the filename "-" to mean "standard input", then you can do:

scanimage --device "brother4:net1;dev0" --progress --verbose --resolution=600 -l 0 -t 0 -x 210 -y 297 --format=pdf \
| ocrmypdf --deskew - "$mydate".pdf

Or maybe you need a double hyphen first; I don't know how this tool works (check the man page):

... \
| ocrmypdf --deskew -- - "$mydate".pdf
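For completeness, a sketch of the pipe variant as a whole script. It rests on the same unverified assumption that ocrmypdf reads the input PDF from standard input when given "-"; scan_pipe_ocr is a hypothetical helper name. Because mydate is assigned before the pipeline starts, the filename reflects the scan time no matter how long the OCR takes.

```shell
#!/bin/bash
# Sketch only: assumes ocrmypdf treats "-" as "read the input from standard input".
# scan_pipe_ocr is a hypothetical helper name.

scan_pipe_ocr() {
    local outdir=${1:-.}
    local mydate
    # Fixed here, before the scan starts; the OCR step cannot change it.
    mydate=$(date +%Y%m%d-%H%M%S) || return

    scanimage --device "brother4:net1;dev0" --resolution=600 \
        -l 0 -t 0 -x 210 -y 297 --format=pdf |
    ocrmypdf --deskew - "$outdir/$mydate.pdf" || return

    printf '%s\n' "$outdir/$mydate.pdf"   # report the final file name
}
```

Note that only the exit status of the last command in the pipeline is checked here, so a failed scan may go unnoticed unless ocrmypdf rejects the incomplete input.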