I have a few files in .djvu format. They are small, but unfortunately my e-reader can't display them. I use DjvuToy to convert them to PDF, which keeps the size small; but when I then use ABBYY FineReader to run OCR and save, the size increases about eightfold (this only happens when the file includes color or grayscale images). So I figure it should be possible to take the text layer from the second file and add it to the first, so that I get both the small size and the OCR. How can I do that?

Note: The original DjVu file does not have a text layer, although it would be nice to know how to convert from DjVu to PDF including the text directly.

  • Actually, it converts it to XML, block by block, with coordinates that effectively map to the locations of the words/letters.
    – Dave
    Commented Oct 4, 2012 at 14:05

1 Answer

Ghostscript can be used directly to rewrite the OCR'd PDF, here converting its colors to grayscale, while preserving the text layer:

gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dColorConversionStrategy=/Gray -dProcessColorModel=/DeviceGray -sOutputFile=output.pdf input.pdf
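
To check whether this actually helps with a given file, the command can be wrapped in a small script that also reports file sizes. This is only a sketch: the function names `to_gray` and `size_of` and the `GS` variable are made up for illustration, and how much the size shrinks (if at all) depends on the images in the file.

```shell
#!/bin/sh
# Sketch only: to_gray, size_of and GS are hypothetical names, not part of Ghostscript.

# Ghostscript binary to use; override with e.g. GS=/usr/local/bin/gs
GS=${GS:-gs}

# Rewrite a PDF with colors converted to grayscale (same flags as the command above).
# $1 = input PDF, $2 = output PDF
to_gray() {
    "$GS" -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
        -dColorConversionStrategy=/Gray -dProcessColorModel=/DeviceGray \
        -sOutputFile="$2" "$1"
}

# Print the size of a file in bytes.
size_of() {
    wc -c < "$1" | tr -d ' '
}
```

After sourcing the script, something like `to_gray input.pdf output.pdf && echo "$(size_of input.pdf) -> $(size_of output.pdf) bytes"` shows the before and after sizes, so the result can be compared against the original DjvuToy output.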

From here.
