14

How can I edit the Table of Contents of a PDF file on Linux? I tried pdfedit but I can't find where the content table list is stored.

2
  • 2
    @new123456: had nothing about ToC editing
    – koniu
    Commented Jul 20, 2015 at 22:15
  • 1
    Perhaps PDFtk can be used to construct an outline (Table of Contents)? See step 3 of superuser.com/a/915399
    – Lekensteyn
    Commented May 29, 2016 at 10:03
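
For the PDFtk route suggested in the comment above, the outline is described by plain-text bookmark records. The following is a minimal sketch, not a definitive implementation: the function name `to_pdftk_bookmarks` and the sample outline are made up for illustration, and it assumes the record layout that `pdftk dump_data` prints (`BookmarkBegin`, `BookmarkTitle`, `BookmarkLevel`, `BookmarkPageNumber`).

```python
def to_pdftk_bookmarks(entries):
    """Render (level, title, page) tuples as pdftk-style bookmark records."""
    lines = []
    for level, title, page in entries:
        lines += [
            "BookmarkBegin",
            f"BookmarkTitle: {title}",
            f"BookmarkLevel: {level}",
            f"BookmarkPageNumber: {page}",
        ]
    return "\n".join(lines) + "\n"


# Hypothetical outline: level 1 is a chapter, level 2 a section inside it.
outline = [
    (1, "Introduction", 1),
    (1, "Methods", 5),
    (2, "Data collection", 6),
]

if __name__ == "__main__":
    print(to_pdftk_bookmarks(outline), end="")
```

Saved to a file, the output could then be applied with something like `pdftk in.pdf update_info bookmarks.txt output out.pdf` (file names are placeholders, and whether bookmarks are accepted this way may depend on the PDFtk version).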

6 Answers

5

A very nice alternative is HandyOutliner, which works with both PDF and DjVu files and provides very good functionality for editing the table of contents. It runs on GNU/Linux with Mono.

Additionally, there is a very handy Python script called document-contents-extractor that extracts the contents from PDF or DjVu files. It can be installed with pip (for me, on Fedora: pip3 install --user document-contents-extractor). It requires some additional dependencies, which are listed in the instructions here.

EDIT

Actually, the best tool for adding a TOC to a PDF is Emacs using the doc-toc package. Using it requires only minimal knowledge of Emacs (if you know Vim keybindings already, then use Spacemacs, with the toc layer). 1

As mentioned by Sam Liao, the best way to add a TOC to digitally produced (i.e. non-scanned) documents is pdf.tocgen. It is a very powerful tool, and the Emacs doc-toc package makes it even easier to use.

1 On Windows, you'd probably prefer using Emacs via WSL, as otherwise it is not easy to set up doc-toc (on GNU/Linux or OS X it's easy).

END EDIT

2
  • 1
    +1 Just tried HandyOutliner. It worked.
    – Kitswas
    Commented Jul 25, 2023 at 16:10
  • To edit the TOC in HandyOutliner, you need to add "bookmarks" first, then press "Write to Outline", the second button on top.
    – Unknow0059
    Commented Feb 7 at 1:44
1

I use two programs, PdfMod and JPdfBookmarks (see also this SourceForge page and the manual).

I found JPdfBookmarks to be superior: for example, one can easily change the level of a nested bookmark or swap two bookmarks, which I was not able to do with PdfMod.

1

You can use pdf.tocgen to edit the TOC:

  1. Use pdftocio to extract the TOC from the PDF.
  2. Edit the TOC.
  3. Write it back to the PDF with the pdftocio command.

Furthermore, if the PDF has no TOC, pdf.tocgen can construct one in several ways:

  • Manually write a TOC file and write it to the PDF (useful when the TOC cannot be detected automatically, for example when every page of the PDF is a scanned image).
  • Use the tools in pdf.tocgen to construct the TOC automatically, based on the different styles each TOC level uses. A small script can automate this process once you understand how pdf.tocgen works.
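
When a TOC file is written by hand for a scanned book, the page numbers printed in the book often differ from the PDF page index by a fixed amount. Here is a minimal sketch of the editing step that shifts every page number by an offset. It assumes a TOC text format of one entry per line, a quoted title followed by a page number, with indentation marking nesting; that format, the function name `shift_pages`, and the sample entries are assumptions, so check the tool's documentation for the exact syntax it expects.

```python
import re

# Assumed line shape: optional indentation, a quoted title, a page number,
# and possibly extra fields after the page number, which are preserved.
LINE = re.compile(r'^(?P<indent>\s*)"(?P<title>.*)" (?P<page>\d+)(?P<rest>.*)$')


def shift_pages(toc_text, offset):
    """Add `offset` to every page number, keeping titles and nesting intact."""
    out = []
    for line in toc_text.splitlines():
        m = LINE.match(line)
        if m is None:
            out.append(line)  # leave blank or unrecognised lines untouched
            continue
        indent, title, rest = m.group("indent"), m.group("title"), m.group("rest")
        page = int(m.group("page")) + offset
        out.append(f'{indent}"{title}" {page}{rest}')
    return "\n".join(out) + "\n"


# Hypothetical hand-typed TOC using the book's printed page numbers.
sample = '"Preface" 1\n    "Plan of the Book" 3\n'

if __name__ == "__main__":
    # Assume the printed page 1 is the 11th page of the PDF file.
    print(shift_pages(sample, 10), end="")
```

The shifted text would then be written back to the PDF as in step 3 above.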
-1

HandyOutliner is a great tool. I used it on Windows 11. Just remember to press "write outline" to save your work: the save function seems to be broken, but "write outline" works every time. Your file should not be open in another program. To work on it in two programs at the same time, make a copy of the file, open the copy, and write to the original file.

2
  • As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.
    – Community Bot
    Commented Jun 30, 2023 at 13:26
  • 1
    It sounds very dicey to me.
    Commented Jun 30, 2023 at 20:28
-2

SIMPLE to edit.
To edit the page number when a table-of-contents entry leads to the wrong page: open the edit tool in the PDF and right-click the line you wish to edit. A menu will open; if there is a link there, it will offer an edit-link option. Click on it and the link properties open. Select the Actions tab, where you can edit the page number. Keep in mind that the page labels may not match the page numbers if page 1 starts somewhere other than the cover page.

If you only have text and no page link: open the edit tool in the PDF. Select "Link" > "Add/Edit Web or Document Link" in the menu. Use the crosshair to select the area of text where you want to put the link; a "Create Link" dialog should appear once you have drawn the box. Choose the link action "Go to a page view" and hit "Next". A "Create Go to View" box should pop up. Scroll to the page you wish the link to go to and draw a box around the target area (full page or section). Choose the "Set Link" button. Close the edit tool and try the link.

I find that if I set up the bookmarks myself using the formula with switches in Word, rather than trusting the automatic bookmarking, I have better control over the results of the conversion. I also make sure I export from Word using Export > Create PDF/XPS Document, which replicates the Word document with fewer conversion issues.

1
  • This answer is missing crucial information: what software do these steps apply to? Is this for Word? If so, it applies only to PDFs created from Word documents, which is not the topic of the question.
    – outis
    Commented Aug 2, 2023 at 4:51
-8

PDF is an image format. There is no storage of the contents of the table, only a "picture" of it. It can only be edited if the PDF's OCR can read the table as text, which is unlikely. You will need to use another application to create the table and then convert it to PDF.

5
  • 1
    Not true. Check wikipedia en.wikipedia.org/wiki/Portable_Document_Format . What I want is to change the logical structure of the document.
    – fakedrake
    Commented Jul 17, 2011 at 9:04
  • 5
    PDF is not an image format. It's much more akin to HTML than to something like JPEG.
    – new123456
    Commented Jul 17, 2011 at 15:15
  • Sorry. That is incorrect. There is no text or document coding in PDF "documents". The text is "read" by built-in optical character recognition software, just as text is read from any other image. Though it is far more complex in structure than, say, a jpeg, what you are looking at when you open a PDF is an image of a document. They are not really "documents" at all which is why they can't be directly converted to a document format, like .doc. They contain no document format information to convert.
    – Abraxas
    Commented Jul 23, 2011 at 9:36
  • 1
    Yes, as the article in Wikipedia points out, PDFs are complex. But they describe the way text is rendered, too, and notice that they use the word "drawn", unlike in other documents: "A text element specifies that characters should be drawn at certain positions."
    – Abraxas
    Commented Jul 23, 2011 at 9:47
  • 3
    This answer and your comment above are very inaccurate. The text is not OCRed; it is actually contained in the PDF file itself. When you open the file with a text editor, you can see commands like /Length and stream. Objects with /FlateDecode can then be decompressed using zlib-flate -uncompress, which will show the text. I could, for example, recognize "Introduction" in [-1125(In)31(tro)-31(duction)] (generated by pdflatex).
    – Lekensteyn
    Commented May 29, 2016 at 9:29
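
The point made in the comment above can be illustrated with a small sketch. It does not read a real PDF: it compresses a hand-written stand-in for a content stream with zlib, then inflates it again and joins the text fragments of the TJ array quoted in the comment. The function name `text_from_tj` and the surrounding operators in the sample stream are assumptions for illustration, and escaped parentheses inside strings are not handled.

```python
import re
import zlib

# Stand-in for a Flate-compressed PDF content stream; the TJ array is the
# one quoted in the comment, the other operators are illustrative.
content = b"BT /F1 12 Tf [-1125(In)31(tro)-31(duction)] TJ ET"
compressed = zlib.compress(content)


def text_from_tj(stream_bytes):
    """Inflate a zlib-compressed stream and join the strings in its TJ arrays."""
    raw = zlib.decompress(stream_bytes)
    pieces = []
    for array in re.findall(rb"\[(.*?)\]\s*TJ", raw, flags=re.S):
        # Parenthesised runs are text; the numbers between them adjust spacing.
        pieces.append(b"".join(re.findall(rb"\(([^)]*)\)", array)))
    return b" ".join(pieces).decode("latin-1")


if __name__ == "__main__":
    print(text_from_tj(compressed))
```

The text is stored as character data plus positioning numbers, so no OCR is involved in recovering it.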
