
I want to merge two PDF documents with Python (prepend a pre-made cover sheet to an existing document) and present the result to a browser. I'm currently using the PyPDF2 library, which can perform the merge easily enough, but the PdfFileWriter class's write() method only seems to support writing to a file object (one that supports write() and tell()). In this case there is no reason to touch the filesystem: the merged PDF is already in memory, and I just want to send a Content-type header followed by the document to STDOUT (the browser, via CGI). Is there a Python library better suited to writing a document to STDOUT than PyPDF2? Alternatively, is there a way to pass STDOUT to PdfFileWriter's write() method so that it looks like a file handle to write()?

Letting write() write the document to the filesystem and then opening the resulting file and sending it to the browser works, but is not an option in this case (aside from being terribly inelegant).

solution

Using mgilson's advice, this is how I got it to work in Python 2.7:

#!/usr/bin/python
import cStringIO
import sys
from PyPDF2 import PdfFileMerger

merger = PdfFileMerger()

###
# Actual PDF open/merge code goes here
###

output = cStringIO.StringIO()
merger.write(output)
print("Content-type: application/pdf\n")
sys.stdout.write(output.getvalue())
output.close()

  • I'm guessing you tried passing in sys.stdout? That might not support .tell ... If that's the case, you could just write to a cStringIO.StringIO (or io.BytesIO) instance and then write the contents from that to stdout ...
    – mgilson
    Commented Jun 24, 2014 at 13:34
  • This worked perfectly. I used PdfFileMerger to write to a cStringIO object, then called sys.stdout.write() on that object's .getvalue(). Thanks!
    – Riblet
    Commented Jun 24, 2014 at 16:49
  • I suppose I'll post as an answer so we can close this one out...
    – mgilson
    Commented Jun 24, 2014 at 17:41

1 Answer

Python supports "in-memory" file objects via cStringIO.StringIO (or io.BytesIO, depending on Python version). In your case, you could create an instance of one of those classes, pass it to the method that expects a file, and then call .getvalue() to get the contents back as a string (Python 2) or as bytes (Python 3). On Python 2, once you have the contents as a string, you can simply print them or use sys.stdout.write to write the string to standard output. On Python 3, sys.stdout is a text stream, so write the bytes to sys.stdout.buffer instead.
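For Python 3, the same idea might look roughly like the sketch below. The helper names render_to_bytes and send_pdf and the FakeWriter class are made up for illustration; FakeWriter only stands in for a PyPDF2 object with a write(stream) method (such as the PdfFileMerger from the question), so the snippet runs without PyPDF2 installed. The Content-Length header is an optional extra and not something the question asked for.

```python
import io
import sys


def render_to_bytes(writer):
    """Let `writer` write into an in-memory buffer and return the raw bytes.

    `writer` is assumed to be any object with a write(stream) method,
    e.g. a PyPDF2 merger or writer instance.
    """
    buffer = io.BytesIO()  # in-memory binary file; supports write() and tell()
    writer.write(buffer)
    return buffer.getvalue()


def send_pdf(payload, out=None):
    """Write a minimal CGI response (headers, blank line, body) to `out`."""
    if out is None:
        # sys.stdout is a text stream on Python 3; .buffer is its binary layer.
        out = sys.stdout.buffer
    out.write(b"Content-Type: application/pdf\r\n")
    out.write(b"Content-Length: " + str(len(payload)).encode("ascii") + b"\r\n")
    out.write(b"\r\n")  # blank line that ends the header section
    out.write(payload)
    out.flush()


class FakeWriter:
    """Hypothetical stand-in for a PDF merger, used only to make this runnable."""

    def write(self, stream):
        stream.write(b"%PDF-1.4\n%%EOF\n")
```

In a CGI script you would call something like send_pdf(render_to_bytes(merger)) after the merge code, with merger being your own PyPDF2 object in place of FakeWriter.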

  • 2023 / Python 3 update: io.BytesIO should be used :-) Commented Feb 11, 2023 at 15:04
