============================
 Docutils_ Code Introduction
============================

With text borrowed from throughout the Docutils docstrings.

:Contact: johnmulder@gmail.com
:Copyright: This document has been placed in the public domain.
:Abstract: This is the introduction to the Docutils source code.
:Prerequisites: You will need some basic Python_ knowledge, as
    well as some understanding of ReStructuredText_.

.. _Docutils: http://docutils.sourceforge.net/
.. _Python: http://www.python.org
.. _ReStructuredText:
    http://docutils.sourceforge.net/docs/user/rst/quickstart.html

Obtaining the Docutils Code
===========================

The latest snapshot of the Docutils code is available from SourceForge as a
tarball_.

Alternatively, you can get direct access to the Subversion server as described
on the Docutils site in the `repository instructions`_.

.. _tarball: http://docutils.sourceforge.net/docutils-snapshot.tgz

.. _repository instructions:
    http://docutils.sourceforge.net/docs/dev/repository.html

Docutils Flow of Execution
==========================

The flow of a document through a Docutils utility starts with a
`Publisher` object from `docutils/core.py`. The publisher is used
as follows:

1. Instantiate the publisher, which in turn instantiates the
   following:

   a. The document tree (`docutils.nodes` objects).
   b. A `docutils.readers.Reader` instance.
   c. A `docutils.parsers.Parser` instance.
   d. A `docutils.writers.Writer` instance.

2. If reader, parser, or writer objects are not passed to
   the publisher, check whether component names were passed in
   and use those instead. If neither is passed in, use defaults.

3. Process settings: ???

B. Call the read function of the Reader

   i. Scan input text from a file, string, or pre-processed
      document tree. Uses a subclass of `Input` in:

   ii. Parse text into a document tree. The parser chosen
       depends on the document format of the input. Uses

   iii. Return a document tree to the Publisher. The tree
        is made up of nodes from:

C. Call the apply transforms function of the Transformer
   in: `docutils/transforms/__init__.py`

   Apply transforms to the document tree as determined by the
   reader and writer. Uses transforms in:
   `docutils/transforms/`

D. Call the write function of the Writer in: `docutils/writers/`

   a. Takes the document tree as input.
   b. Instantiates a subclass of `docutils.nodes.NodeVisitor`, which
      traverses the doctree using the `Node.walkabout()` method in:
      `docutils/nodes.py`

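
The read, transform, and write steps above can be sketched with two
convenience functions from ``docutils.core``. This is a minimal sketch
rather than the only way to drive the publisher: it assumes that
``publish_doctree()`` covers the read and transform steps and that
``publish_from_doctree()`` covers the write step. The sample source text
and the ``output_encoding`` override are chosen purely for illustration.

.. code:: python

    from docutils.core import publish_doctree, publish_from_doctree
    from docutils.writers import pseudoxml

    SOURCE = "Hello *world*.\n"  # illustrative input, not from this document

    # Steps B and C: the Reader scans and parses the input, and the
    # registered transforms are applied to the resulting document tree.
    doctree = publish_doctree(SOURCE)

    # Step D: the Writer walks the document tree and produces output.
    # The "unicode" pseudo-encoding asks for a str instead of bytes.
    output = publish_from_doctree(
        doctree,
        writer=pseudoxml.Writer(),
        settings_overrides={"output_encoding": "unicode"},
    )
    print(output)

Keeping the document tree as an intermediate value makes it possible to
inspect or modify it between reading and writing.
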
Organization of the Docutils Code
=================================

Within the docutils directory, the package for docutils is in a
subdirectory also called docutils. This contains both
modules and subpackages.

The ``__init__`` module contains base classes and
functions that other modules throughout the docutils
package inherit or use.

The core module contains the `Publisher` object.

Calling the ``publish_*`` convenience functions (or instantiating a
`Publisher` object) with component names will result in default
behavior. For custom behavior (setting component options), create
custom component objects first, and pass *them* to
``publish_*``/`Publisher`. See `The Docutils Publisher`_.

.. _The Docutils Publisher: http://docutils.sf.net/docs/api/publisher.html

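
The difference between passing a component name and passing a component
object might look like the sketch below. It assumes the ``writer_name``
and ``writer`` keyword arguments of ``publish_string()`` as found in
long-standing Docutils releases (newer releases may prefer passing the
name through ``writer``). The writer object here is left unconfigured;
a customized or subclassed writer would be passed in the same way.

.. code:: python

    from docutils.core import publish_string
    from docutils.writers import pseudoxml

    SOURCE = "A *short* paragraph.\n"  # illustrative input
    OVERRIDES = {"output_encoding": "unicode"}  # return str, not bytes

    # Default behavior: select the writer by name.
    by_name = publish_string(
        SOURCE, writer_name="pseudoxml", settings_overrides=OVERRIDES
    )

    # Custom behavior: create the component object first, then pass it in.
    writer = pseudoxml.Writer()
    by_object = publish_string(
        SOURCE, writer=writer, settings_overrides=OVERRIDES
    )
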
Command-line and common processing for Docutils front-end tools.
Includes classes which parse options and functions for processing those options.

I/O classes provide a uniform API for low-level input and output. Subclasses
exist for a variety of input/output mechanisms.

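
As a small illustration of the uniform I/O API, the sketch below reads
from and writes to in-memory strings. It assumes the ``StringInput`` and
``StringOutput`` classes of ``docutils.io``; the sample text is made up
for the example.

.. code:: python

    from docutils import io

    # Read from an in-memory string through the Input API.
    source = io.StringInput(source="Some *reStructuredText*.\n")
    text = source.read()

    # Write to an in-memory string through the Output API.
    # The "unicode" pseudo-encoding keeps the result as str.
    destination = io.StringOutput(encoding="unicode")
    destination.write(text.upper())
    result = destination.destination
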
Docutils document tree element class library.

Classes in CamelCase are abstract base classes or auxiliary classes. The one
exception is `Text`, for a text (PCDATA) node; uppercase is used to
differentiate from element classes. Classes in lower_case_with_underscores
are element classes, matching the XML element generic identifiers in the DTD_.

The position of each node (the level at which it can occur) is significant and
is represented by abstract base classes (`Root`, `Structural`, `Body`,
`Inline`, etc.). Certain transformations will be easier because we can use
``isinstance(node, base_class)`` to determine the position of the node in the
hierarchy.

.. _DTD: http://docutils.sourceforge.net/docs/ref/docutils.dtd

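
The ``isinstance`` check described above might be used as in the
following sketch. The ``position()`` helper is hypothetical and written
only for this example; the node classes and base classes are those of
``docutils.nodes``.

.. code:: python

    from docutils import nodes

    def position(node):
        """Return the name of the first matching position base class."""
        for base in (nodes.Root, nodes.Structural, nodes.Body, nodes.Inline):
            if isinstance(node, base):
                return base.__name__
        return "other"

    # Build a tiny tree by hand: section > paragraph > emphasis.
    emphasis = nodes.emphasis(text="emphasized")
    paragraph = nodes.paragraph(text="Some ")
    paragraph += emphasis
    section = nodes.section()
    section += paragraph

    for node in (section, paragraph, emphasis):
        print(node.tagname, position(node))
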
A finite state machine specialized for regular-expression-based text
filters. This module is used by the reST parser, but is designed to
be of general utility.

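
To give a feel for the idea, here is a toy line-at-a-time state machine
driven by regular expressions. It is a standalone illustration only and
does not use or reproduce the ``docutils.statemachine`` API; the
``TRANSITIONS`` table, the state names, and the ``run()`` function are
all hypothetical.

.. code:: python

    import re

    # Hypothetical table: state -> [(pattern, transition name, next state)].
    # The first pattern that matches the current line wins.
    TRANSITIONS = {
        "body": [
            (re.compile(r"\s*$"), "blank", "body"),
            (re.compile(r"[-*+] "), "bullet", "list"),
            (re.compile(r"."), "text", "body"),
        ],
        "list": [
            (re.compile(r"[-*+] "), "bullet", "list"),
            (re.compile(r"\s*$"), "blank", "body"),
            (re.compile(r"."), "text", "body"),
        ],
    }

    def run(lines, state="body"):
        """Record (state, transition name) for each input line."""
        results = []
        for line in lines:
            for pattern, name, next_state in TRANSITIONS[state]:
                if pattern.match(line):
                    results.append((state, name))
                    state = next_state
                    break
        return results

    events = run(["Intro", "", "- one", "- two", "", "Outro"])
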
`schemes` is a dictionary with lowercase URI addressing schemes as
keys and descriptions as values.

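
A lookup against this dictionary could be sketched as follows. The
``has_known_scheme()`` helper is hypothetical, and the use of
``urllib.parse.urlsplit`` to extract the scheme is a choice made for
this example.

.. code:: python

    from urllib.parse import urlsplit

    from docutils.urischemes import schemes

    def has_known_scheme(uri):
        """Return True if the URI's scheme is a key of `schemes`."""
        return urlsplit(uri).scheme.lower() in schemes

    print(has_known_scheme("http://docutils.sourceforge.net/"))
    print(has_known_scheme("notascheme:example"))
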
Miscellaneous utilities for the documentation utilities.

Contains practical examples of Docutils client code.

Subpackages in Docutils
=======================

This package contains modules for language-dependent features of Docutils.

This package contains Docutils parser modules.

:null.py: A module containing a parser which does nothing. This is used
    when transforming from a pickled document tree to any form.

:rst: A subpackage containing the parser for reStructuredText. The
    reStructuredText parser is implemented as a state machine, examining
    its input one line at a time. To understand how the parser works,
    please first become familiar with the `docutils.statemachine` module,
    then see the `states` module.

This package contains Docutils Reader modules. Each reader module or
package must export a subclass also called 'Reader'. The three steps
of a Reader's responsibility are defined: `scan()`, `parse()`, and
`transform()`. Call `read()` to process a document.

This package contains modules for standard tree transforms available
to Docutils components. Tree transforms serve a variety of purposes:

- To tie up certain syntax-specific "loose ends" that remain after the
  initial parsing of the input plaintext. These transforms are used to
  supplement a limited syntax.

- To automate the internal linking of the document tree (hyperlink
  references, footnote references, etc.).

- To extract useful information from the document tree. These
  transforms may be used to construct (for example) indexes and tables
  of contents.

Each transform is an optional step that a Docutils component may
choose to perform on the parsed document.

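
A custom transform might look like the sketch below. The
``MarkLeadParagraph`` class, its priority value, and the sample text
are hypothetical; the sketch assumes the ``Transform`` base class from
``docutils.transforms`` and the ``add_transform()`` and
``apply_transforms()`` methods of the document's transformer.

.. code:: python

    from docutils import nodes
    from docutils.core import publish_doctree
    from docutils.transforms import Transform

    class MarkLeadParagraph(Transform):
        """Hypothetical transform: tag the first paragraph as "lead"."""

        default_priority = 900  # assumed value, chosen for illustration

        def apply(self):
            # Only top-level paragraphs are considered in this sketch.
            for node in self.document.children:
                if isinstance(node, nodes.paragraph):
                    node["classes"].append("lead")
                    break

    doctree = publish_doctree("First paragraph.\n\nSecond paragraph.\n")

    # Register the transform with the document's transformer and run it.
    doctree.transformer.add_transform(MarkLeadParagraph)
    doctree.transformer.apply_transforms()

    paragraphs = [n for n in doctree.children
                  if isinstance(n, nodes.paragraph)]

In normal use a reader or writer would list its transforms itself;
adding one by hand after parsing is done here only to keep the example
short.
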
This package contains Docutils Writer modules.

Each writer module or package must export a subclass also called
'Writer'. Each writer must support all standard node types listed in
`docutils.nodes.node_class_names`. The `write()` method is the main
entry point.

In the subpackages, each writer is implemented in the subpackage's
`__init__.py` file.

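
The following sketch shows roughly how a writer and its node visitor
fit together. ``PlainTextVisitor`` and ``PlainTextWriter`` are
hypothetical names invented for this example, and the writer handles
only text nodes, so it does not meet the requirement to support all
standard node types in any meaningful way. It assumes that
``Writer.write()`` calls ``translate()``, which is expected to set
``self.output``.

.. code:: python

    from docutils import nodes, writers
    from docutils.core import publish_string

    class PlainTextVisitor(nodes.SparseNodeVisitor):
        """Collect the text of every Text node; ignore other node types."""

        def __init__(self, document):
            super().__init__(document)
            self.chunks = []

        def visit_Text(self, node):
            self.chunks.append(node.astext())

    class PlainTextWriter(writers.Writer):
        """Hypothetical Writer that emits only the document's text."""

        def translate(self):
            # Walk the tree, calling visit_*/depart_* for each node.
            visitor = PlainTextVisitor(self.document)
            self.document.walkabout(visitor)
            self.output = "".join(visitor.chunks)

    output = publish_string(
        "Hello *world*.\n",
        writer=PlainTextWriter(),
        settings_overrides={"output_encoding": "unicode"},
    )
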
:docutils_xml: Simple internal document tree Writer, writes Docutils XML.

:null: A do-nothing Writer.

:pseudoxml: Simple internal document tree Writer, writes indented pseudo-XML.

Subpackages in Writer:

:html4css1: Simple HyperText Markup Language document tree
    Writer. The output conforms to the XHTML version
    1.0 Transitional DTD (*almost* strict). The output
    contains a minimum of formatting information. The
    cascading style sheet "html4css1.css" is required
    for proper viewing with a modern graphical browser.

:latex2e: LaTeX2e document tree Writer.

:newlatex2e: LaTeX2e document tree Writer.

:pep_html: PEP HTML Writer.

:s5_html: S5/HTML Slideshow Writer.

.. |---| unicode:: 8212 .. em-dash