! Generated automatically by mantohlp
pdftotext - Portable Document Format (PDF) to text converter
pdftotext [options] [PDF-file [text-file]]
Pdftotext converts Portable Document Format (PDF) files to
plain text.
Pdftotext reads the PDF file, PDF-file, and writes a text
file, text-file. If text-file is not specified, pdftotext
converts file.pdf to file.txt. If text-file is '-', the
text is sent to stdout.
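
The default naming rule above can be sketched in Python. The helper
names default_text_path and build_argv are hypothetical; they only
assemble an argument list that follows the synopsis and are not part
of pdftotext. The sketch assumes that "file.pdf to file.txt" means
swapping the final extension.

```python
from pathlib import PurePath
from typing import List, Optional


def default_text_path(pdf_path: str) -> str:
    """Mirror the documented default: file.pdf becomes file.txt.

    Assumption: only the final extension is swapped. How pdftotext
    treats names without a .pdf extension is not covered here.
    """
    return str(PurePath(pdf_path).with_suffix(".txt"))


def build_argv(pdf_path: str, text_path: Optional[str] = None) -> List[str]:
    """Assemble `pdftotext PDF-file [text-file]` as an argument list.

    Pass text_path="-" to have the text sent to stdout.
    """
    argv = ["pdftotext", pdf_path]
    if text_path is not None:
        argv.append(text_path)
    return argv
```

On a system where pdftotext is installed, the resulting list could be
handed to subprocess.run(argv, check=True).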
Pdftotext reads a configuration file at startup. It first
tries to find the user's private config file, ~/.xpdfrc.
If that doesn't exist, it looks for a system-wide config
file, typically /usr/local/etc/xpdfrc (but this location
can be changed when pdftotext is built). See the
xpdfrc(5) man page for details.
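
The lookup order described above (user file first, then the
system-wide file) can be illustrated with a small Python sketch. The
function name find_xpdfrc is hypothetical, and the system path is
only the typical location mentioned in the text; a given build may
use a different one.

```python
import os
from typing import Optional

# Assumption: the typical system-wide location named in the text.
SYSTEM_XPDFRC = "/usr/local/etc/xpdfrc"


def find_xpdfrc(home: str, system_path: str = SYSTEM_XPDFRC) -> Optional[str]:
    """Return the first config file that exists, or None.

    The user's private file (<home>/.xpdfrc) is tried first; the
    system-wide file is consulted only if the private one is missing.
    """
    user_path = os.path.join(home, ".xpdfrc")
    for candidate in (user_path, system_path):
        if os.path.isfile(candidate):
            return candidate
    return None
```

A caller might use find_xpdfrc(os.path.expanduser("~")) to see which
file would be picked up on the current machine.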
Many of the following options can be set with configuration
file commands. These are listed in square brackets with the
description of the corresponding command line option.
Specifies the first page to convert.
Specifies the last page to convert.
Maintain (as best as possible) the original physical
layout of the text. The default is to 'undo' physical
layout (columns, hyphenation, etc.) and output the text
in reading order.
-raw Keep the text in content stream order. This is a
hack which often "undoes" column formatting, etc.
Use of raw mode is no longer recommended.
Generate a simple HTML file, including the meta
information. This simply wraps the text in <pre>
and </pre> and prepends the meta headers.
Sets the encoding to use for text output. The
encoding-name must be defined with the unicodeMap
command (see xpdfrc(5)). The encoding name is
case-sensitive. This defaults to "Latin1" (which
is a built-in encoding). [config file: textEncod-
Sets the end-of-line convention to use for text
output. [config file: textEOL]
Don't insert page breaks (form feed characters)
between pages. [config file: textPageBreaks]
Specify the owner password for the PDF file. Providing
this will bypass all security restrictions.
Specify the user password for the PDF file.
-q Don't print any messages or errors. [config file:
Read config-file in place of ~/.xpdfrc or the system-wide
config file.
-v Print copyright and version information.
-h Print usage information. (-help and --help are
equivalent.)
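
As a rough illustration of how these options combine, the Python
sketch below assembles an option list. The function build_options is
hypothetical. Only -q appears by name in the text above; the
spellings -f, -l, -layout, -enc and -nopgbrk are assumed from the
usual pdftotext command line and should be checked against the
output of pdftotext -h on the installed version.

```python
from typing import List, Optional


def build_options(first_page: Optional[int] = None,
                  last_page: Optional[int] = None,
                  layout: bool = False,
                  encoding: Optional[str] = None,
                  page_breaks: bool = True,
                  quiet: bool = False) -> List[str]:
    """Translate keyword settings into command-line options.

    Assumed spellings (not listed in the text above): -f, -l,
    -layout, -enc, -nopgbrk. Verify them with `pdftotext -h`.
    """
    opts: List[str] = []
    if first_page is not None:
        opts += ["-f", str(first_page)]   # first page to convert
    if last_page is not None:
        opts += ["-l", str(last_page)]    # last page to convert
    if layout:
        opts.append("-layout")            # keep physical layout
    if encoding is not None:
        opts += ["-enc", encoding]        # case-sensitive encoding name
    if not page_breaks:
        opts.append("-nopgbrk")           # no form feeds between pages
    if quiet:
        opts.append("-q")                 # suppress messages and errors
    return opts
```

For example, build_options(first_page=2, last_page=5, layout=True)
yields the options for converting pages 2 through 5 while keeping
the physical layout; the options precede the PDF-file argument on
the command line.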
Some PDF files contain fonts whose encodings have been
mangled beyond recognition. There is no way (short of
OCR) to extract text from these files.
The Xpdf tools use the following exit codes:
1 Error opening a PDF file.
2 Error opening an output file.
3 Error related to PDF permissions.
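
A calling script might turn these codes into messages along the
following lines. This is a minimal Python sketch: describe_exit is a
hypothetical helper, the table holds only the codes listed above,
and treating 0 as success is an assumption based on the usual
process-status convention.

```python
from typing import Dict

# Only the exit codes listed in the text above.
EXIT_MESSAGES: Dict[int, str] = {
    1: "Error opening a PDF file.",
    2: "Error opening an output file.",
    3: "Error related to PDF permissions.",
}


def describe_exit(code: int) -> str:
    """Map an exit status to a human-readable message."""
    if code == 0:
        # Assumption: zero means success, as is conventional.
        return "success"
    return EXIT_MESSAGES.get(code, f"unlisted exit code {code}")
```

The code value would typically come from the returncode attribute of
the object returned by subprocess.run.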
The pdftotext software and documentation are copyright
1996-2007 Glyph & Cog, LLC.
xpdf(1), pdftops(1), pdfinfo(1), pdffonts(1), pdftoppm(1),
pdfimages(1), xpdfrc(5)
http://www.foolabs.com/xpdf/