3

I'm trying to redact an existing PDF that contains minor personal secrets (e.g., ISP account number, home address; this is not national security). I used Foxit PDF Reader and drew solid rectangles over the secrets. Then I printed the document to a new PDF via the Microsoft Print to PDF printer driver.

The new PDF seems to have non-interactive blank areas (whereas the original has moveable rectangles). So I think the print driver has "flattened" the document and therefore correctly redacted the information. Is there a way to validate this? Does this seem like a decent practice for (free) redaction of PDF documents?

1 Answer

2

I would print a copy of the document on your own printer, at home, on your own network. Then cover the information with a black Sharpie, or simply cut it out of the printed copy entirely.

Then rescan the page as a PDF directly from your printer to your PC. Save as…(xxxxxx*)redacted.PDF and you're good. Not only do you keep the original file intact, you can also be sure the redaction is correct, with no room for error and no possibility of someone somehow "undoing" any computer-assisted redaction you may have performed.
The redacted PDF you submit will also have brand-new metadata, which makes it easy to tell whether the document was modified further after you sent it.

5
  • Good answer: as @Miles states, PII could be hidden in a digitally-redacted PDF, though not obviously visible. Alternatives are 1) Take screenshots of each page, redact in an image editor such as Paint or IrfanView, and combine into a PDF, or 2) Use OCR to capture the text of the PDF, redact, and create a new document. Commented Jul 14, 2021 at 4:24
  • Thanks for a good answer @miles! I recognize that a perfect redaction would involve printouts, Sharpies, and scanning (or, as @drmoishe pippik notes, screenshots). But I'm still curious how to verify whether the flattened, printed PDF still contains the underlying text. Any thoughts? (I recognize that document metadata wouldn't be removed, which in this case is fine because I'm not trying to redact my identity, just specific content.) Commented Jul 14, 2021 at 16:07
  • @stevemidgley, if the PDF document is not encrypted, it can be opened in a text editor or hex viewer. Some PDFs are compressed; their streams would need to be decompressed first. Then you could search for the PII, e.g. ID numbers. Commented Jul 14, 2021 at 20:55
  • @DrMoishePippik - thanks, I would have thought that too. I looked at the original PDF and the flattened PDF in a hex viewer, and there's no readable text in either. It seems to be some kind of JSON (?) structure that is processed to get the data out of it? prnt.sc/1bhs2gz So I don't know how to evaluate either document in terms of figuring out what is or isn't in the file. Commented Jul 15, 2021 at 19:04
  • That screenshot shows "FlateDecode", which means the stream is compressed. See prepressure.com/library/compression-algorithm/flate-deflate Commented Jul 15, 2021 at 19:14
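
The check described in the comments (decompress the FlateDecode streams, then search for the PII) can be sketched in Python with only the standard library. This is a rough sketch under stated assumptions, not a definitive verifier: the names `inflate` and `find_secret` and the file name `redacted.pdf` are made up for illustration, and it only handles unencrypted PDFs whose streams use plain Flate compression. A hit proves the text survived; a miss does not prove it is gone, because PDF text can be hex-encoded, split into fragments, or written with a custom font encoding.

```python
import re
import zlib
from typing import List, Optional

# Captures the raw bytes between each "stream" keyword and the next "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate(raw: bytes) -> Optional[bytes]:
    """Try to zlib-decompress a stream body; return None if it is not Flate data."""
    try:
        # decompressobj() tolerates trailing bytes such as the EOL before "endstream".
        return zlib.decompressobj().decompress(raw)
    except zlib.error:
        return None


def find_secret(pdf_bytes: bytes, secret: str) -> List[str]:
    """Return the places (raw file or stream index) where the secret appears literally."""
    needle = secret.encode("latin-1")
    hits = []
    if needle in pdf_bytes:
        hits.append("raw file")
    for i, match in enumerate(STREAM_RE.finditer(pdf_bytes)):
        data = inflate(match.group(1))
        if data is not None and needle in data:
            hits.append(f"stream {i}")
    return hits


# Hypothetical usage (file name and account number are placeholders):
#   with open("redacted.pdf", "rb") as f:
#       print(find_secret(f.read(), "12345678"))
```

Running it against both the original and the printed copy gives a useful comparison: if the original reports hits and the flattened copy reports none, the literal text at least did not carry over.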
