http://www.csun.edu/helpdesk/linux.htm
Ever wanted to extract the text out of a Portable Document Format (.pdf) file (also known as an Adobe Acrobat file)? You can use Adobe's utilities for this purpose, but chances are your system already contains a neat little utility that can do the job too. It's called pdftotext. It takes the name of the output file as its second argument, so no shell redirection is needed. The following command extracts the text from report.pdf and writes it to a file named report.txt:
pdftotext report.pdf report.txt
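If you have a whole directory of PDF files, the same command can be wrapped in a small loop. The following is only a sketch: the function names txt_name_for and convert_all are made up for this example, and it assumes pdftotext is installed and that the files end in a lowercase .pdf suffix.

```shell
#!/bin/sh
# Hypothetical helper: derive the text filename for a PDF.
# report.pdf -> report.txt
txt_name_for() {
    printf '%s\n' "${1%.pdf}.txt"
}

# Hypothetical helper: convert every PDF in a directory
# (defaults to the current directory) with pdftotext.
convert_all() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        # If the glob matched nothing, "$pdf" is the literal pattern; skip it.
        [ -e "$pdf" ] || continue
        pdftotext "$pdf" "$(txt_name_for "$pdf")"
    done
}
```

After loading these definitions into your shell, running convert_all ~/docs would write a .txt file next to each .pdf file in ~/docs.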
Like to extract the graphics? The pdfimages command works the same kind of magic for the pictures in the file; it writes each of them to a file that's named with a root filename, an automatically appended number, and a suffix that's appropriate for the type of file that's written (by default, Portable Pixmaps or Portable Bitmaps). The following command extracts the images from report.pdf:
pdfimages report.pdf report
The extracted images are named report-000.ppm, report-001.ppm, etc. (the number of digits can vary between versions of the tool).
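Because a single PDF can contain many pictures, it is often tidier to send them to a directory of their own. The root filename may include a directory path, so a sketch along these lines should work; the names image_root_for and extract_images, and the default directory name images, are assumptions made for this example, not part of pdfimages itself.

```shell
#!/bin/sh
# Hypothetical helper: build the image root for a PDF inside an
# output directory. image_root_for report.pdf images -> images/report
image_root_for() {
    base=$(basename "$1" .pdf)
    printf '%s/%s\n' "$2" "$base"
}

# Hypothetical helper: extract the images from one PDF into a
# directory (defaults to ./images), creating the directory if needed.
extract_images() {
    pdf=$1
    outdir=${2:-images}
    mkdir -p "$outdir" || return 1
    pdfimages "$pdf" "$(image_root_for "$pdf" "$outdir")"
}
```

With these definitions loaded, extract_images report.pdf pictures would place the extracted files under the pictures directory, each starting with the root name report.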