Extracting images from PDF (free, using command line)

Posted on Wed 22 May 2013 in WebDev & Code

This is a day when I love computers. I have a multipage PDF and I need to extract the images from it.

Option 1: Open the PDF on screen, capture each section, save each file.
No thanks, that'll take far too long and lose quality (which already isn't too great).

Option 2: Open the PDF using Adobe Illustrator. Select each image, copy/crop/save as, etc.
No thanks, almost as bad as option 1.

Option 3: Google for an answer to "Extract images from PDF". Discover all the top results are for paid applications.
No thanks. I don't mind paying for applications, but this is (probably) a one-off job and I feel sure someone must have written a script to extract all the images from a PDF.

Option 4: Try PDFtk, a PDF toolkit that takes instructions by command line.
Almost there. It can do all sorts of things to PDFs, but extracting the image objects appears not to be one of them.

Option 5: Re-discover The Unarchiver
It works! It really was that simple. The Unarchiver treats a PDF file as if it were a compressed archive. Select the PDF, tell it to extract all. Voila! 652 TIFF images from 44 pages of PDF. 20 minutes to find the solution, maybe 2 seconds for The Unarchiver to run (oh, and I already had it on my Mac, probably from having to extract a less common archive format).
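
If you'd rather script this than click through the app, The Unarchiver has a command-line sibling called unar. Below is a rough Python sketch of how that might look. I'm assuming unar is installed and on your PATH, and that it handles PDFs the same way the desktop app did for me (I haven't verified that for every version). The function names and the "extracted" folder are my own inventions for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_unar_command(pdf_path, out_dir):
    """Return the argument list for unar; -o sets the output directory."""
    return ["unar", "-o", str(out_dir), str(pdf_path)]


def extract_images(pdf_path, out_dir="extracted"):
    """Run unar on the PDF and return every file it wrote, sorted by path.

    Assumption: unar unpacks the PDF's embedded images like the GUI app does.
    """
    if shutil.which("unar") is None:
        raise RuntimeError("unar not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if unar exits with an error
    subprocess.run(build_unar_command(pdf_path, out_dir), check=True)
    return sorted(p for p in Path(out_dir).rglob("*") if p.is_file())
```

Calling extract_images("brochure.pdf") (a made-up file name) should leave the images in the "extracted" folder and hand back their paths, so you can rename or convert them in the same script.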

One last note: it may be that Ghostscript could also do this task. That would have been my option 6...