.. -*- coding: utf-8 -*-

===========================================
 Docutils FAQ (Frequently Asked Questions)
===========================================

:Web site: http://docutils.sourceforge.net/
:Copyright: This document has been placed in the public domain.
12 .. Please note that until there's a Q&A-specific construct available,
13 this FAQ will use section titles for questions. Therefore
14 questions must fit on one line. The title may be a summary of the
15 question, with the full question in the section body.
22 This is a work in progress. Please feel free to ask questions and/or
23 provide answers; send email to the `Docutils-users`_ mailing list.
24 Project members should feel free to edit the source text file
28 .. _Docutils-users: docs/user/mailing-lists.html#docutils-users
37 Docutils_ is a system for processing plaintext documentation into
38 useful formats, such as HTML, XML, and LaTeX. It supports multiple
39 types of input, such as standalone files (implemented), inline
40 documentation from Python modules and packages (under development),
41 `PEPs (Python Enhancement Proposals)`_ (implemented), and others as
44 For an overview of the Docutils project implementation, see `PEP
45 258`_, "Docutils Design Specification".
47 Docutils is implemented in Python_.
49 .. _Docutils: http://docutils.sourceforge.net/
50 .. _PEPs (Python Enhancement Proposals):
51 http://www.python.org/peps/pep-0012.html
52 .. _PEP 258: http://www.python.org/peps/pep-0258.html
53 .. _Python: http://www.python.org/
56 Why is it called "Docutils"?
57 ----------------------------
59 Docutils is short for "Python Documentation Utilities". The name
60 "Docutils" was inspired by "Distutils", the Python Distribution
61 Utilities architected by Greg Ward, a component of Python's standard
64 The earliest known use of the term "docutils" in a Python context was
65 a `fleeting reference`__ in a message by Fred Drake on 1999-12-02 in
66 the Python Doc-SIG mailing list. It was suggested `as a project
67 name`__ on 2000-11-27 on Doc-SIG, again by Fred Drake, in response to
68 a question from Tony "Tibs" Ibbs: "What do we want to *call* this
69 thing?". This was shortly after David Goodger first `announced
70 reStructuredText`__ on Doc-SIG.
72 Tibs used the name "Docutils" for `his effort`__ "to document what the
73 Python docutils package should support, with a particular emphasis on
74 documentation strings". Tibs joined the current project (and its
75 predecessors) and graciously donated the name.
77 For more history of reStructuredText and the Docutils project, see `An
78 Introduction to reStructuredText`_.
80 Please note that the name is "Docutils", not "DocUtils" or "Doc-Utils"
81 or any other variation.
83 .. _An Introduction to reStructuredText:
84 http://docutils.sourceforge.net/docs/ref/rst/introduction.html
85 __ http://mail.python.org/pipermail/doc-sig/1999-December/000878.html
86 __ http://mail.python.org/pipermail/doc-sig/2000-November/001252.html
87 __ http://mail.python.org/pipermail/doc-sig/2000-November/001239.html
88 __ http://homepage.ntlworld.com/tibsnjoan/docutils/STpy.html
91 Is there a GUI authoring environment for Docutils?
92 --------------------------------------------------
94 DocFactory_ is under development. It uses wxPython and looks very
98 http://docutils.sf.net/sandbox/gschwant/docfactory/doc/
101 What is the status of the Docutils project?
102 -------------------------------------------
104 Although useful and relatively stable, Docutils is experimental code,
105 with APIs and architecture subject to change.
107 Our highest priority is to fix bugs as they are reported. So the
108 latest code from the repository_ (or the snapshots_) is almost always
109 the most stable (bug-free) as well as the most featureful.
112 What is the Docutils project release policy?
113 --------------------------------------------
115 It's "release early & often". We also have automatically-generated
116 snapshots_ which always contain the latest code from the repository_.
117 As the project matures, we may formalize on a
118 stable/development-branch scheme, but we're not using anything like
121 .. _repository: docs/dev/repository.html
122 .. _snapshots: http://docutils.sourceforge.net/#download
128 What is reStructuredText?
129 -------------------------
131 reStructuredText_ is an easy-to-read, what-you-see-is-what-you-get
132 plaintext markup syntax and parser system. The reStructuredText
133 parser is a component of Docutils_. reStructuredText is a revision
134 and reinterpretation of the StructuredText_ and Setext_ lightweight
137 If you are reading this on the web, you can see for yourself. `The
138 source for this FAQ <FAQ.txt>`_ is written in reStructuredText; open
139 it in another window and compare them side by side.
141 `A ReStructuredText Primer`_ and the `Quick reStructuredText`_ user
reference are good places to start. The `reStructuredText Markup
143 Specification`_ is a detailed technical specification.
145 .. _A ReStructuredText Primer:
146 http://docutils.sourceforge.net/docs/user/rst/quickstart.html
147 .. _Quick reStructuredText:
148 http://docutils.sourceforge.net/docs/user/rst/quickref.html
149 .. _reStructuredText Markup Specification:
150 http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html
151 .. _reStructuredText: http://docutils.sourceforge.net/rst.html
153 http://dev.zope.org/Members/jim/StructuredTextWiki/FrontPage/
154 .. _Setext: http://docutils.sourceforge.net/mirror/setext.html
157 Why is it called "reStructuredText"?
158 ------------------------------------
160 The name came from a combination of "StructuredText", one of
161 reStructuredText's predecessors, with "re": "revised", "reworked", and
162 "reinterpreted", and as in the ``re.py`` regular expression module.
163 For a detailed history of reStructuredText and the Docutils project,
164 see `An Introduction to reStructuredText`_.
167 What's the standard abbreviation for "reStructuredText"?
168 --------------------------------------------------------
170 "RST" and "ReST" (or "reST") are both acceptable. Care should be
171 taken with capitalization, to avoid confusion with "REST__", an
172 acronym for "Representational State Transfer".
The abbreviations "reSTX" and "rSTX"/"rstx" should **not** be used;
they overemphasize reStructuredText's predecessor, Zope's
StructuredText.
178 __ http://en.wikipedia.org/wiki/Representational_State_Transfer
181 What's the standard filename extension for a reStructuredText file?
182 -------------------------------------------------------------------
184 It's ".txt". Some people would like to use ".rest" or ".rst" or
185 ".restx", but why bother? ReStructuredText source files are meant to
186 be readable as plaintext, and most operating systems already associate
187 ".txt" with text files. Using a specialized filename extension would
188 require that users alter their OS settings, which is something that
189 many users will not be willing or able to do.
192 Are there any reStructuredText editor extensions?
193 -------------------------------------------------
195 See `Editor Support for reStructuredText`__.
197 __ http://docutils.sf.net/tools/editors/README.html
200 How can I indicate the document title? Subtitle?
201 -------------------------------------------------
203 A uniquely-adorned section title at the beginning of a document is
204 treated specially, as the document title. Similarly, a
205 uniquely-adorned section title immediately after the document title
206 becomes the document subtitle. For example::
208 This is the Document Title
209 ==========================
211 This is the Document Subtitle
212 -----------------------------
214 Here's an ordinary paragraph.
218 Here's an ordinary paragraph.
220 This is *not* a Document Title
221 ==============================
223 The "ordinary paragraph" above the section title
224 prevents it from becoming the document title.
226 Another counterexample::
228 This is not the Document Title, because...
229 ===========================================
231 Here's an ordinary paragraph.
233 ... the title adornment is not unique
234 =====================================
236 Another ordinary paragraph.
239 How can I represent esoteric characters (e.g. character entities) in a document?
240 --------------------------------------------------------------------------------
For example, say you want an em-dash (XML character entity &mdash;,
Unicode character U+2014) in your document: use a real em-dash.
Insert concrete characters (e.g. type a *real* em-dash) into your
input file, using whatever encoding suits your application, and tell
Docutils the input encoding. Docutils uses Unicode internally, so the
em-dash character is a real em-dash internally.
249 Emacs users should refer to the `Emacs Support for reStructuredText`__
250 document. Tips for other editors are welcome.
252 __ http://docutils.sourceforge.net/tools/editors/emacs/README.html
ReStructuredText has no character entity subsystem; it doesn't know
anything about XML charents. To Docutils, "&mdash;" in input text is
7 discrete characters; no interpretation happens. When writing HTML,
the "&" is converted to "&amp;", so in the raw output you'd see
"&amp;mdash;". There's no difference in interpretation for text
inside or outside inline literals or literal blocks -- there's no
character entity interpretation in either case.
262 If you can't use a Unicode-compatible encoding and must rely on 7-bit
263 ASCII, there is a workaround. New in Docutils 0.3.10 is a set of
264 `Standard Substitution Definition Sets`_, which provide equivalents of
265 XML & HTML character entity sets as substitution definitions. For
266 example, the Japanese yen currency symbol can be used as follows::
268 .. include:: <xhtml1-lat1.txt>
270 |yen| 600 for a complete meal? That's cheap!
272 For earlier versions of Docutils, equivalent files containing
273 character entity set substitution definitions using the "unicode_"
274 directive `are available`_. Please read the `description and
275 instructions`_ for use. Thanks to David Priest for the original idea.
If you insist on using XML-style charents, you'll have to implement a
pre-processing system to convert to UTF-8 or something. That
introduces complications though; you can no longer *write* about
charents naturally; instead of writing "&mdash;" you'd have to write
"&amp;mdash;".
For the common case of long dashes, you might also want to insert the
following substitution definitions into your document (both of them are
using the "unicode_" directive)::

    .. |--| unicode:: U+2013   .. en dash
    .. |---| unicode:: U+2014  .. em dash, trimming surrounding whitespace
       :trim:

.. |--| unicode:: U+2013   .. en dash
.. |---| unicode:: U+2014  .. em dash, trimming surrounding whitespace
   :trim:

295 Now you can write dashes using pure ASCII: "``foo |--| bar; foo |---|
296 bar``", rendered as "foo |--| bar; foo |---| bar". (Note that Mozilla
297 and Firefox may render this incorrectly.) The ``:trim:`` option for
298 the em dash is necessary because you cannot write "``foo|---|bar``";
299 thus you need to add spaces ("``foo |---| bar``") and advise the
300 reStructuredText parser to trim the spaces.
302 .. _Standard Substitution Definition Sets:
303 http://docutils.sf.net/docs/ref/rst/substitutions.html
305 http://docutils.sf.net/docs/ref/rst/directives.html#unicode-character-codes
306 .. _are available: http://docutils.sourceforge.net/tmp/charents/
307 .. _tarball: http://docutils.sourceforge.net/tmp/charents.tgz
308 .. _description and instructions:
309 http://docutils.sourceforge.net/tmp/charents/README.html
310 .. _to-do list: http://docutils.sourceforge.net/docs/dev/todo.html
313 How can I generate backticks using a Scandinavian keyboard?
314 -----------------------------------------------------------
316 The use of backticks in reStructuredText is a bit awkward with
317 Scandinavian keyboards, where the backtick is a "dead" key. To get
318 one ` character one must press SHIFT-` + SPACE.
320 Unfortunately, with all the variations out there, there's no way to
321 please everyone. For Scandinavian programmers and technical writers,
322 this is not limited to reStructuredText but affects many languages and
325 Possible solutions include
327 * If you have to input a lot of backticks, simply type one in the
328 normal/awkward way, select it, copy and then paste the rest (CTRL-V
329 is a lot faster than SHIFT-` + SPACE).
331 * Use keyboard macros.
333 * Remap the keyboard. The Scandinavian keyboard layout is awkward for
334 other programming/technical characters too; for example, []{}
335 etc. are a bit awkward compared to US keyboards.
337 According to Axel Kollmorgen,
339 Under Windows, you can use the `Microsoft Keyboard Layout Creator
340 <http://www.microsoft.com/globaldev/tools/msklc.mspx>`__ to easily
341 map the backtick key to a real backtick (no dead key). took me
342 five minutes to load my default (german) keyboard layout, untick
343 "Dead Key?" from the backtick key properties ("in all shift
344 states"), "build dll and setup package", install the generated
345 .msi, and add my custom keyboard layout via Control Panel >
346 Regional and Language Options > Languages > Details > Add
347 Keyboard layout (and setting it as default "when you start your
350 * Use a virtual/screen keyboard or character palette, such as:
352 - `Web-based keyboards <http://keyboard.lab.co.il/>`__ (IE only
354 - Windows: `Click-N-Type <http://www.lakefolks.org/cnt/>`__.
355 - Mac OS X: the Character Palette can store a set of favorite
356 characters for easy input. Open System Preferences,
357 International, Input Menu tab, enable "Show input menu in menu
358 bar", and be sure that Character Palette is enabled in the list.
360 If anyone knows of other/better solutions, please `let us know`_.
363 Are there any tools for HTML/XML-to-reStructuredText? (Round-tripping)
364 -----------------------------------------------------------------------
366 People have tossed the idea around, and some implementations of
367 reStructuredText-generating tools can be found in the `Docutils Link
370 There's no reason why reStructuredText should not be round-trippable
371 to/from XML; any technicalities which prevent round-tripping would be
372 considered bugs. Whitespace would not be identical, but paragraphs
373 shouldn't suffer. The tricky parts would be the smaller details, like
374 links and IDs and other bookkeeping.
376 For HTML, true round-tripping may not be possible. Even adding lots
377 of extra "class" attributes may not be enough. A "simple HTML" to RST
378 filter is possible -- for some definition of "simple HTML" -- but HTML
379 is used as dumb formatting so much that such a filter may not be
380 particularly useful. An 80/20 approach should work though: build a
381 tool that does 80% of the work automatically, leaving the other 20%
384 .. _Docutils Link List: docs/user/links.html
387 Are there any Wikis that use reStructuredText syntax?
388 -----------------------------------------------------
390 There are several, with various degrees of completeness. With no
391 implied endorsement or recommendation, and in no particular order:
393 * `Webware for Python wiki
394 <http://wiki.webwareforpython.org/thiswiki.html>`__
395 * `Ian Bicking's experimental code
396 <http://docutils.sf.net/sandbox/ianb/wiki/Wiki.py>`__
397 * `MoinMoin <http://moinmoin.wikiwikiweb.de/>`__ has some support;
398 `here's a sample <http://moinmoin.wikiwikiweb.de/RestSample>`__
399 * Zope-based `Zwiki <http://zwiki.org/>`__
400 * Zope3-based Zwiki (in the Zope 3 source tree as ``zope.products.zwiki``)
401 * `StikiWiki <http://mithrandr.moria.org/code/stikiwiki/>`__
402 * `Trac <http://projects.edgewall.com/trac/>`__ `supports using reStructuredText
403 <http://projects.edgewall.com/trac/wiki/WikiRestructuredText>`__ as an
404 alternative to wiki markup. This includes support for `TracLinks
405 <http://projects.edgewall.com/trac/wiki/TracLinks>`__ from within RST
406 text via a custom RST reference-directive or, even easier, an interpreted text
408 * `Vogontia <http://www.ososo.de/vogontia/>`__, a Wiki-like FAQ system
410 Please `let us know`_ of any other reStructuredText Wikis.
412 The example application for the `Web Framework Shootout
413 <http://colorstudy.com/docs/shootout.html>`__ article is a Wiki using
417 Are there any Weblog (Blog) projects that use reStructuredText syntax?
418 ----------------------------------------------------------------------
420 With no implied endorsement or recommendation, and in no particular
423 * `Firedrop <http://www.voidspace.org.uk/python/firedrop2/>`__
424 * `Python Desktop Server <http://pyds.muensterland.org/>`__
425 * `PyBloxsom <http://roughingit.subtlehints.net/pyblosxom/>`__
426 * `Lino WebMan <http://lino.sourceforge.net/webman.html>`__
428 Please `let us know`_ of any other reStructuredText Blogs.
431 Can lists be indented without generating block quotes?
432 ------------------------------------------------------
434 Some people like to write lists with indentation, without intending a
435 block quote context, like this::
442 There has been a lot of discussion about this, but there are some
443 issues that would need to be resolved before it could be implemented.
444 There is a summary of the issues and pointers to the discussions in
447 __ http://docutils.sourceforge.net/docs/dev/todo.html#indented-lists
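
In the meantime, a list is only kept out of a block quote if it starts
at the same indentation level as the surrounding paragraphs. A minimal
sketch of the two cases (the item text is made up for illustration)::

    A paragraph introducing a list.

    * this list is flush with the paragraph,
    * so it stays an ordinary bullet list.

    Another paragraph introducing a list.

      * this list is indented relative to the paragraph,
      * so it ends up inside a block quote.
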
450 Could the requirement for blank lines around lists be relaxed?
451 --------------------------------------------------------------
In reStructuredText, it would be impossible to unambiguously mark up
456 and parse lists without blank lines before and after. Deeply nested
457 lists may look ugly with so many blank lines, but it's a price we pay
458 for unambiguous markup. Some other plaintext markup systems do not
459 require blank lines in nested lists, but they have to compromise
460 somehow, either accepting ambiguity or requiring extra complexity.
461 For example, `Epytext <http://epydoc.sf.net/epytext.html#list>`__ does
462 not require blank lines around lists, but it does require that lists
463 be indented and that ambiguous cases be escaped.
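
As an illustration of the rule, here is a nested list with the blank
lines that reStructuredText requires (a minimal sketch; the item text
is made up)::

    * Item 1

      * Nested item 1.1
      * Nested item 1.2

    * Item 2

Blank lines between items of the same list remain optional; they are
only required before and after each list, nested lists included.
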
466 How can I include mathematical equations in documents?
467 ------------------------------------------------------
469 There is no elegant built-in way, yet. There are several ideas, but
470 no obvious winner. This issue requires a champion to solve the
471 technical and aesthetic issues and implement a generic solution.
472 Here's the `to-do list entry`__.
474 __ http://docutils.sourceforge.net/docs/dev/todo.html#math-markup
476 There are several quick & dirty ways to include equations in documents.
477 They all presently use LaTeX syntax or dialects of it.
* For LaTeX output, nothing beats raw LaTeX math. A simple way is to
  use the `raw directive`_::

      .. raw:: latex

          \[ x^3 + 3x^2a + 3xa^2 + a^3, \]

  For inline math you could use substitutions of the raw directive but
  the recently added `raw role`_ is more convenient. You must define a
  custom role based on it once in your document::

      .. role:: raw-latex(raw)
         :format: latex

  and then you can just use the new role in your text::

      the binomial expansion of :raw-latex:`$(x+a)^3$` is

497 .. _raw directive: http://docutils.sourceforge.net/docs/ref/rst/
498 directives.html#raw-data-pass-through
499 .. _raw role: http://docutils.sourceforge.net/docs/ref/rst/roles.html#raw
501 * Jens Jørgen Mortensen has implemented a "latex-math" role and
502 directive, available from `his sandbox`__.
504 __ http://docutils.sourceforge.net/sandbox/jensj/latex_math/
* For HTML the "Right" W3C-standard way to include math is MathML_.
  Unfortunately its rendering is still quite broken (or absent) on many
  browsers but it's getting better. Another serious problem is that raw
  MathML is *really* painful for humans to type or read, so embedding it
  in a reST document with the raw directive would defeat the goals of
  readability and editability of reST (see an `example of raw MathML
  <http://sf.net/mailarchive/message.php?msg_id=2177102>`__).
514 A much less painful way to generate HTML+MathML is to use itex2mml_ to
515 convert a dialect of LaTeX syntax to presentation MathML. Here is an
516 example of potential `itex math markup
517 <http://article.gmane.org/gmane.text.docutils.user/118>`__. The
518 simplest way to use it is to add ``html`` to the format lists for the
519 raw directive/role and postprocess the resulting document with
520 itex2mml. This way you can *generate LaTeX and HTML+MathML from the
521 same source*, but you must limit yourself to the intersection of LaTeX
522 and itex markups for this to work. Alan G. Isaac wrote a detailed
523 HOWTO_ for this approach.
525 .. _MathML: http://www.w3.org/Math/
526 .. _itex2mml: http://pear.math.pitt.edu/mathzilla/itex2mml.html
527 .. _HOWTO: http://www.american.edu/econ/itex2mml/mathhack.rst
* The other way to add math to HTML is to use images of the equations,
  typically generated by TeX. This is inferior to MathML in the long
  term but is perhaps more accessible nowadays.

  Of course, doing it by hand is too much work. Beni Cherniavsky has
  written some `preprocessing scripts`__ for converting the
  ``texmath`` role/directive into images for HTML output and raw
  directives/substitutions for LaTeX output. This way you can *generate
  LaTeX and HTML+images from the same source*. `Instructions here`__.

539 __ http://docutils.sourceforge.net/sandbox/cben/rolehack/
540 __ http://docutils.sourceforge.net/sandbox/cben/rolehack/README.html
543 Is nested inline markup possible?
544 ---------------------------------
546 Not currently, no. It's on the `to-do list`__ (`details here`__), and
547 hopefully will be part of the reStructuredText parser soon. At that
548 time, markup like this will become possible::
550 Here is some *emphasized text containing a `hyperlink`_ and
551 ``inline literals``*.
553 __ http://docutils.sf.net/docs/dev/todo.html#nested-inline-markup
554 __ http://docutils.sf.net/docs/dev/rst/alternatives.html#nested-inline-markup
556 There are workarounds, but they are either convoluted or ugly or both.
557 They are not recommended.
559 * Inline markup can be combined with hyperlinks using `substitution
560 definitions`__ and references__ with the `"replace" directive`__.
563 Here is an |emphasized hyperlink|_.
565 .. |emphasized hyperlink| replace:: *emphasized hyperlink*
566 .. _emphasized hyperlink: http://example.org
568 It is not possible for just a portion of the replacement text to be
569 a hyperlink; it's the entire replacement text or nothing.
571 __ http://docutils.sf.net/docs/ref/rst/restructuredtext.html#substitution-definitions
572 __ http://docutils.sf.net/docs/ref/rst/restructuredtext.html#substitution-references
573 __ http://docutils.sf.net/docs/ref/rst/directives.html#replace
575 * The `"raw" directive`__ can be used to insert raw HTML into HTML
578 Here is some |stuff|.
580 .. |stuff| raw:: html
582 <em>emphasized text containing a
583 <a href="http://example.org">hyperlink</a> and
584 <tt>inline literals</tt></em>
586 Raw LaTeX is supported for LaTeX output, etc.
588 __ http://docutils.sf.net/docs/ref/rst/directives.html#raw
591 How to indicate a line break or a significant newline?
592 ------------------------------------------------------
594 `Line blocks`__ are designed for address blocks, verse, and other
595 cases where line breaks are significant and must be preserved. Unlike
596 literal blocks, the typeface is not changed, and inline markup is
597 recognized. For example::
599 | A one, two, a one two three four
601 | Half a bee, philosophically,
602 | must, *ipso facto*, half not be.
603 | But half the bee has got to be,
604 | *vis a vis* its entity. D'you see?
606 | But can a bee be said to be
607 | or not to be an entire bee,
608 | when half the bee is not a bee,
609 | due to some ancient injury?
613 __ http://docutils.sf.net/docs/ref/rst/restructuredtext.html#line-blocks
615 Here's a workaround for manually inserting explicit line breaks in
622 I want to break this line here: |br| this is after the break.
624 If the extra whitespace bothers you, |br|\ backslash-escape it.
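
The ``|br|`` substitution used above must be defined somewhere in the
document. A minimal sketch of such a definition, assuming HTML output
and the "raw" directive (the name ``br`` is only a convention)::

    .. |br| raw:: html

       <br />
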
627 A URL containing asterisks doesn't work. What to do?
628 -----------------------------------------------------
630 Asterisks are valid URL characters (see :RFC:`2396`), sometimes used
631 in URLs. For example::
633 http://cvs.example.org/viewcvs.py/*checkout*/module/file
635 Unfortunately, the parser thinks the asterisks are indicating
636 emphasis. The slashes serve as delineating punctuation, allowing the
637 asterisks to be recognized as markup. The example above is separated
638 by the parser into a truncated URL, an emphasized word, and some
641 http://cvs.example.org/viewcvs.py/
645 To turn off markup recognition, use a backslash to escape at least the
646 first asterisk, like this::
648 http://cvs.example.org/viewcvs.py/\*checkout*/module/file
650 Escaping the second asterisk doesn't hurt, but it isn't necessary.
653 How can I make a literal block with *some* formatting?
654 ------------------------------------------------------
656 Use the `parsed-literal`_ directive.
658 .. _parsed-literal: docs/ref/rst/directives.html#parsed-literal
Scenario: a document contains some source code, which calls for a
literal block to preserve linebreaks and whitespace. But part of the
source code should be formatted, for example as emphasis or as a
hyperlink. This calls for a *parsed* literal block::

    .. parsed-literal::

       print "Hello world!"  # *tricky* code [1]_

The emphasis (``*tricky*``) and footnote reference (``[1]_``) will be
parsed.

673 Can reStructuredText be used for web or generic templating?
674 -----------------------------------------------------------
676 Docutils and reStructuredText can be used with or as a component of a
677 templating system, but they do not themselves include templating
678 functionality. Templating should simply be left to dedicated
679 templating systems. Users can choose a templating system to apply to
680 their reStructuredText documents as best serves their interests.
682 There are many good templating systems for Python (ht2html_, YAPTU_,
683 Quixote_'s PTL, Cheetah_, etc.; see this non-exhaustive list of `some
684 other templating systems`_), and many more for other languages, each
685 with different approaches. We invite you to try several and find one
686 you like. If you adapt it to use Docutils/reStructuredText, please
687 consider contributing the code to Docutils or `let us know`_ and we'll
690 One reST-specific web templating system is `rest2web
691 <http://www.voidspace.org.uk/python/rest2web>`_, a tool for
692 automatically building websites, or parts of websites.
694 .. _ht2html: http://ht2html.sourceforge.net/
696 http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52305
697 .. _Quixote: http://www.mems-exchange.org/software/quixote/
698 .. _Cheetah: http://www.cheetahtemplate.org/
699 .. _some other templating systems:
700 http://webware.sourceforge.net/Papers/Templates/
706 What is the status of the HTML Writer?
707 --------------------------------------
709 The HTML Writer module, ``docutils/writers/html4css1.py``, is a
710 proof-of-concept reference implementation. While it is a complete
711 implementation, some aspects of the HTML it produces may be
712 incompatible with older browsers or specialized applications (such as
713 web templating). Alternate implementations are welcome.
716 What kind of HTML does it produce?
717 ----------------------------------
719 It produces XHTML compatible with the `XHTML 1.0`_ specification. A
720 cascading stylesheet is required for proper viewing with a modern
721 graphical browser. Correct rendering of the HTML produced depends on
the CSS support of the browser. A general-purpose stylesheet,
``html4css1.css``, is provided with the code, and is embedded by
724 default. It is installed in the "writers/support/" subdirectory
725 within the Docutils package. Use the ``--help`` command-line option
726 to see the specific location on your machine.
728 .. _XHTML 1.0: http://www.w3.org/TR/xhtml1/
731 What browsers are supported?
732 ----------------------------
734 No specific browser is targeted; all modern graphical browsers should
735 work. Some older browsers, text-only browsers, and browsers without
736 full CSS support are known to produce inferior results. Firefox,
737 Safari, Mozilla (version 1.0 and up), and MS Internet Explorer
738 (version 5.0 and up) are known to give good results. Reports of
739 experiences with other browsers are welcome.
742 Unexpected results from tools/rst2html.py: H1, H1 instead of H1, H2. Why?
743 --------------------------------------------------------------------------
745 Here's the question in full:
752 All my life, I wanted to be H1.
757 But along came H1, and so shouldn't I be H2?
763 Yeah, imagine me, I'm stuck at H3! No?!?
765 When I run it through tools/rst2html.py, I get unexpected results
766 (below). I was expecting H1, H2, then H3; instead, I get H1, H1,
773 <title>Heading 1</title>
776 <div class="document" id="heading-1">
777 <h1 class="title">Heading 1</h1> <-- first H1
778 <p>All my life, I wanted to be H1.</p>
779 <div class="section" id="heading-1-1">
780 <h1><a name="heading-1-1">Heading 1.1</a></h1> <-- H1
781 <p>But along came H1, and so now I must be H2.</p>
782 <div class="section" id="heading-1-1-1">
783 <h2><a name="heading-1-1-1">Heading 1.1.1</a></h2>
784 <p>Yeah, imagine me, I'm stuck at H3!</p>
789 Check the "class" attribute on the H1 tags, and you will see a
790 difference. The first H1 is actually ``<h1 class="title">``; this is
791 the document title, and the default stylesheet renders it centered.
792 There can also be an ``<h2 class="subtitle">`` for the document
795 If there's only one highest-level section title at the beginning of a
796 document, it is treated specially, as the document title. (Similarly, a
797 lone second-highest-level section title may become the document
798 subtitle.) See `How can I indicate the document title? Subtitle?`_ for
799 details. Rather than use a plain H1 for the document title, we use ``<h1
800 class="title">`` so that we can use H1 again within the document. Why
801 do we do this? HTML only has H1-H6, so by making H1 do double duty, we
802 effectively reserve these tags to provide 6 levels of heading beyond the
803 single document title.
HTML is being used here as dumb formatting, for nothing but final display.
806 A stylesheet *is required*, and one is provided; see `What kind of
807 HTML does it produce?`_ above. Of course, you're welcome to roll your
808 own. The default stylesheet provides rules to format ``<h1
809 class="title">`` and ``<h2 class="subtitle">`` differently from
810 ordinary ``<h1>`` and ``<h2>``::
818 If you don't want the top section heading to be interpreted as a
819 title at all, disable the `doctitle_xform`_ setting
820 (``--no-doc-title`` option). This will interpret your document
821 differently from the standard settings, which might not be a good
822 idea. If you don't like the reuse of the H1 in the HTML output, you
823 can tweak the `initial_header_level`_ setting
824 (``--initial-header-level`` option) -- but unless you match its value
825 to your specific document, you might end up with bad HTML (e.g. H3
829 http://docutils.sourceforge.net/docs/user/config.html#doctitle-xform
830 .. _initial_header_level:
831 http://docutils.sourceforge.net/docs/user/config.html#initial-header-level
833 (Thanks to Mark McEahern for the question and much of the answer.)
836 Why do enumerated lists only use numbers (no letters or roman numerals)?
837 ------------------------------------------------------------------------
839 The rendering of enumerators (the numbers or letters acting as list
840 markers) is completely governed by the stylesheet, so either the
browser can't find the stylesheet (try using the ``--embed-stylesheet``
842 option), or the browser can't understand it (try a recent Firefox,
843 Mozilla, Konqueror, Opera, Safari, or even MSIE).
846 There appear to be garbage characters in the HTML. What's up?
847 --------------------------------------------------------------
849 What you're seeing is most probably not garbage, but the result of a
850 mismatch between the actual encoding of the HTML output and the
851 encoding your browser is expecting. Your browser is misinterpreting
852 the HTML data, which is encoded text. A discussion of text encodings
is beyond the scope of this FAQ; see one or more of these documents:
856 * `UTF-8 and Unicode FAQ for Unix/Linux
857 <http://www.cl.cam.ac.uk/~mgk25/unicode.html>`_
859 * Chapters 3 and 4 of `Introduction to i18n [Internationalization]
860 <http://www.debian.org/doc/manuals/intro-i18n/>`_
862 * `Python Unicode Tutorial
863 <http://www.reportlab.com/i18n/python_unicode_tutorial.html>`_
865 * `Python Unicode Objects: Some Observations on Working With Non-ASCII
866 Character Sets <http://effbot.org/zone/unicode-objects.htm>`_
The most common case involves the default output encoding (UTF-8)
combined with either numbered sections (via the "sectnum_" directive)
or symbol footnotes. Three non-breaking spaces are inserted in each
numbered section title, between the generated number and the title
text. Most footnote symbols are not available in ASCII, nor are
non-breaking spaces. When encoded as UTF-8 and viewed with ASCII-only
tools, these characters appear as multi-character garbage.
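The effect can be reproduced in plain Python, independent of Docutils.
This is only a sketch: the sample title is made up, and Latin-1 is
assumed as the encoding that the viewer wrongly applies.

```python
NBSP = "\u00a0"  # non-breaking space, not representable in ASCII

# A numbered section title as described above: number, three NBSPs, title text.
title = "1" + NBSP * 3 + "Introduction"

utf8_bytes = title.encode("utf-8")      # what the UTF-8 output file contains
misread = utf8_bytes.decode("latin-1")  # what a viewer assuming Latin-1 displays

print(repr(title))
print(repr(misread))
```

Each non-breaking space occupies two bytes in UTF-8, so the misreading
viewer shows two characters ("Â" followed by a non-breaking space) where
one was intended.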
You may have a decoding problem in your browser (or editor, etc.).
The encoding of the output is set to "utf-8", but your browser isn't
878 recognizing that. You can either try to fix your browser (enable
879 "UTF-8 character set", sometimes called "Unicode"), or choose a
880 different encoding for the HTML output. You can also try
881 ``--output-encoding=ascii:xmlcharrefreplace`` for HTML (not applicable
882 to non-XMLish outputs).
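The ``xmlcharrefreplace`` part of that option shares its name with one
of the error handlers in Python's codecs. A minimal sketch of what that
handler does, using plain Python rather than Docutils and a made-up
sample string:

```python
# Sample text with a non-breaking space (U+00A0) and a dagger (U+2020).
text = "1\u00a0Intro \u2020"

# Characters that ASCII cannot represent become numeric character references.
ascii_bytes = text.encode("ascii", errors="xmlcharrefreplace")

print(ascii_bytes)
```

The result is pure ASCII, so it displays the same regardless of the
encoding the browser assumes, as long as the output format understands
character references.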
884 If you're generating document fragments, the "Content-Type" metadata
885 (between the HTML ``<head>`` and ``</head>`` tags) must agree with the
886 encoding of the rest of the document. For UTF-8, it should be::
888 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
890 Also, Docutils normally generates an XML declaration as the first line
891 of the output. It must also match the document encoding. For UTF-8::
893 <?xml version="1.0" encoding="utf-8" ?>
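When assembling fragments by hand, a quick consistency check can catch
a mismatch early. The sketch below is one possible approach, not part
of Docutils; the function names are hypothetical and the regular
expressions only cover declarations written like the two shown above.

```python
import re


def declared_encodings(html):
    """Return (XML declaration encoding, meta charset); None where absent."""
    xml = re.search(r'<\?xml[^>]*\bencoding="([^"]+)"', html)
    meta = re.search(r'<meta[^>]*\bcharset=([-\w]+)', html, flags=re.IGNORECASE)
    return (xml.group(1).lower() if xml else None,
            meta.group(1).lower() if meta else None)


def declarations_agree(html, actual_encoding):
    """True if every declaration that is present names the actual encoding."""
    wanted = actual_encoding.lower()
    return all(found in (None, wanted) for found in declared_encodings(html))
```

The comparison is a simple case-insensitive string match, so aliases
such as "utf8" and "utf-8" are treated as different names.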
.. _sectnum:
   http://docutils.sourceforge.net/docs/ref/rst/directives.html#sectnum
899 How can I retrieve the body of the HTML document?
900 -------------------------------------------------
(This is usually needed when using Docutils in conjunction with a
templating system.)
905 You can use the `docutils.core.publish_parts()`_ function, which
906 returns a dictionary containing an 'html_body_' entry.
.. _docutils.core.publish_parts():
   docs/api/publisher.html#publish-parts
.. _html_body:
   docs/api/publisher.html#html-body
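A minimal sketch of this approach, assuming Docutils is installed. The
page template and the names ``PAGE``, ``rst_to_html_body`` and
``render_page`` are hypothetical stand-ins for whatever the surrounding
application provides:

```python
from string import Template

# Hypothetical page template; a real application would supply its own.
PAGE = Template("<html>\n<head><title>$title</title></head>\n"
                "<body>\n$body</body>\n</html>\n")


def rst_to_html_body(rst_source):
    """Return only the HTML body fragment for a reStructuredText string."""
    # Imported lazily so that render_page() works without Docutils.
    from docutils.core import publish_parts

    parts = publish_parts(source=rst_source, writer_name="html")
    return parts["html_body"]


def render_page(title, body_html):
    """Place an already rendered HTML fragment into the page template."""
    return PAGE.substitute(title=title, body=body_html)
```

A caller would then combine the two, for example
``render_page("Demo", rst_to_html_body("Some *emphasized* text."))``.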
914 Why is the Docutils XHTML served as "Content-type: text/html"?
915 --------------------------------------------------------------
919 Docutils' HTML output looks like XHTML and is advertised as such::
921 <?xml version="1.0" encoding="utf-8" ?>
922 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
925 But this is followed by::
927 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
929 Shouldn't this be "application/xhtml+xml" instead of "text/html"?
931 In a perfect web, the Docutils XHTML output would be 100% strict
932 XHTML. But it's not a perfect web, and a major source of imperfection
is Internet Explorer. Despite its drawbacks, IE still represents the
934 majority of web browsers, and cannot be ignored.
Short answer: if we didn't serve XHTML as "text/html" (which is a
perfectly valid thing to do), it couldn't be viewed in Internet
Explorer.
940 Long answer: see the `"Criticisms of Internet Explorer" Wikipedia
941 entry <http://en.wikipedia.org/wiki/Criticisms_of_Internet_Explorer#XHTML>`__.
943 However, there's also `Sending XHTML as text/html Considered
944 Harmful`__. What to do, what to do? We're damned no matter what we
945 do. So we'll continue to do the practical instead of the pure:
946 support the browsers that are actually out there, and not fight for
947 strict standards compliance.
949 __ http://hixie.ch/advocacy/xhtml
951 (Thanks to Martin F. Krafft, Robert Kern, Michael Foord, and Alan
958 Can I use Docutils for Python auto-documentation?
959 -------------------------------------------------
961 Yes, in conjunction with other projects.
Docstring extraction functionality within Docutils is still under
development. A mostly complete source code parsing module exists in
docutils/readers/python/moduleparser.py. We do plan to finish it
966 eventually. Ian Bicking wrote an initial front end for the
967 moduleparser.py module, in sandbox/ianb/extractor/extractor.py. Ian
968 also did some work on the Python Source Reader
969 (docutils.readers.python) component at PyCon DC 2004.
971 Version 2.0 of Ed Loper's `Epydoc <http://epydoc.sourceforge.net/>`_
972 supports reStructuredText-format docstrings for HTML output. Docutils
973 0.3 or newer is required. Development of a Docutils-specific
974 auto-documentation tool will continue. Epydoc works by importing
975 Python modules to be documented, whereas the Docutils-specific tool,
976 described above, will parse modules without importing them (as with
`HappyDoc <http://happydoc.sourceforge.net/>`_, which doesn't support
reStructuredText).
980 The advantages of parsing over importing are security and flexibility;
981 the disadvantage is complexity/difficulty.
983 * Security: untrusted code that shouldn't be executed can be parsed;
984 importing a module executes its top-level code.
985 * Flexibility: comments and unofficial docstrings (those not supported
986 by Python syntax) can only be processed by parsing.
987 * Complexity/difficulty: it's a lot harder to parse and analyze a
988 module than it is to ``import`` and analyze one.
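The parsing approach can be sketched with Python's standard ``ast``
module. This is not the Docutils ``moduleparser.py`` code, only an
illustration of extracting docstrings without executing the module; the
sample source and the ``extract_docstrings`` name are made up.

```python
import ast

SOURCE = '''
"""Module docstring."""

print("side effect")  # never runs: the source is parsed, not imported


def greet(name):
    """Return a greeting."""
    return "Hello, " + name
'''


def extract_docstrings(source):
    """Map '<module>' and each function or class name to its docstring."""
    tree = ast.parse(source)
    docs = {"<module>": ast.get_docstring(tree)}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            docs[node.name] = ast.get_docstring(node)
    return docs


print(extract_docstrings(SOURCE))
```

Because the source is never executed, the ``print`` call inside
``SOURCE`` has no effect. Comments and "unofficial" docstrings are not
part of the syntax tree, which hints at the extra work a full parser
has to do.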
990 For more details, please see "Docstring Extraction Rules" in `PEP
991 258`_, item 3 ("How").
997 Is the Docutils document model based on any existing XML models?
998 ----------------------------------------------------------------
1000 Not directly, no. It borrows bits from DocBook, HTML, and others. I
1001 (David Goodger) have designed several document models over the years,
1002 and have my own biases. The Docutils document model is designed for
1003 simplicity and extensibility, and has been influenced by the needs of
1004 the reStructuredText markup.