Blog by Sumana Harihareswara, Changeset founder
PDFtk, qpdf, And Dealing With Password-Protected PDFs
I slice, dice, and transform documents often enough that I rely frequently on pandoc and pdftk. I often use pandoc to turn Markdown, HTML, wiki syntax, reStructuredText, etc. into each other or into LibreOffice, MS Office, etc. formats. And I use pdftk (pdftk-java, technically) to say, "turn pages 1 & 5 from this PDF into a new one" or "concatenate these 4 PDFs into a new one".
Recently I wanted to take a password-protected file, select several page ranges from it, and emit a new non-protected file. I ran into a problem while trying to do this:
Error: Invalid PDF: unknown.encryption.type.r
Error: Failed to open input PDF file:
Errors encountered. No output created.
Done. Input errors, so no output created.
I looked around and found this GitLab issue -- more recent PDFs are often protected with AES-256 encryption, which pdftk doesn't yet support. Thanks to ergo mesh in that thread, who pointed to a solution: qpdf. I am more familiar with pdftk's syntax, so I did it in two steps:
qpdf redacted-name-of-file.pdf --replace-input --password=SECRETPASSWORD --decrypt # note to self: remove this entry from history later
pdftk redacted-name-of-file.pdf cat 3-8 10-16 output /tmp/redacted-file-unprotected-edited.pdf
but I'm likely to learn qpdf's page selection syntax soon and switch to it entirely, in which case I could have done this in one step. I also could have saved the password in a file and then read it into qpdf instead of typing it on the command line, which I'll probably start doing. But here's how I remove a specific item from my bash history:
history -a # append recent history
history # print out history lines, so user can look for the offset number of the relevant line, e.g., 301
history -d 301 # now it's gone from the current history list!
history -w # actually seal the deal & write the deletion to the history file
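As for the one-step, password-in-a-file approach mentioned above, here is a rough sketch of what it might look like. It assumes qpdf 10.2 or later (where, as far as I know, the --password-file option was introduced), and the function name decrypt_pages and the password file path in the usage line are made up for illustration. It hasn't been run against the original file, so treat it as a starting point rather than a tested recipe.

```shell
# Sketch: decrypt a PDF and select page ranges in a single qpdf call.
# Assumption: qpdf 10.2+ (for --password-file). decrypt_pages is a hypothetical helper name.
decrypt_pages() {
    input=$1     # password-protected PDF
    pwfile=$2    # file whose first line is the password
    ranges=$3    # qpdf page ranges, comma-separated, e.g. 3-8,10-16
    output=$4    # new, unprotected PDF to create

    # "." after --pages refers to the primary input file; "--" ends the page list.
    # --decrypt keeps qpdf from carrying the input's encryption over to the output.
    qpdf --password-file="$pwfile" --decrypt "$input" \
        --pages . "$ranges" -- "$output"
}
```

Usage would be something like: decrypt_pages redacted-name-of-file.pdf ~/.pdf-password 3-8,10-16 /tmp/redacted-file-unprotected-edited.pdf. Since the password never appears on the command line, there is nothing to scrub from the shell history afterward, though the password file itself should probably be readable only by you (chmod 600) and deleted once it's no longer needed.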
Hope this helps! Also, the qpdf documentation on how PDF encryption/password protection works is pretty enlightening.
06 May 2022, 18:30