   1 ===========================================
   2  Docutils FAQ (Frequently Asked Questions)
   3 ===========================================
   4
   5 :Date: $Date$
   6 :Revision: $Revision$
   7 :Web site: http://docutils.sourceforge.net/
   8 :Copyright: This document has been placed in the public domain.
   9
  10 .. Please note that until there's a Q&A-specific construct available,
  11    this FAQ will use section titles for questions.  Therefore
  12    questions must fit on one line.  The title may be a summary of the
  13    question, with the full question in the section body.
  14
  15
  16 .. contents::
  17 .. sectnum::
  18
  19
  20 This is a work in progress.  Please feel free to ask questions and/or
  21 provide answers; send email to the `Docutils-users`_ mailing list.
  22 Project members should feel free to edit the source text file
  23 directly.
  24
  25 .. _let us know:
  26 .. _Docutils-users: docs/user/mailing-lists.html#docutils-users
  27
  28
  29 Docutils
  30 ========
  31
  32 What is Docutils?
  33 -----------------
  34
  35 Docutils_ is a system for processing plaintext documentation into
  36 useful formats, such as HTML, XML, and LaTeX.  It supports multiple
  37 types of input, such as standalone files (implemented), inline
  38 documentation from Python modules and packages (under development),
  39 `PEPs (Python Enhancement Proposals)`_ (implemented), and others as
  40 discovered.
  41
  42 For an overview of the Docutils project implementation, see `PEP
  43 258`_, "Docutils Design Specification".
  44
  45 Docutils is implemented in Python_.
  46
  47 .. _Docutils: http://docutils.sourceforge.net/
  48 .. _PEPs (Python Enhancement Proposals):
  49    http://www.python.org/peps/pep-0012.html
  50 .. _PEP 258: http://www.python.org/peps/pep-0258.html
  51 .. _Python: http://www.python.org/
  52
  53
  54 Why is it called "Docutils"?
  55 ----------------------------
  56
  57 Docutils is short for "Python Documentation Utilities".  The name
  58 "Docutils" was inspired by "Distutils", the Python Distribution
  59 Utilities architected by Greg Ward, a component of Python's standard
  60 library.
  61
  62 The earliest known use of the term "docutils" in a Python context was
  63 a `fleeting reference`__ in a message by Fred Drake on 1999-12-02 in
  64 the Python Doc-SIG mailing list.  It was suggested `as a project
  65 name`__ on 2000-11-27 on Doc-SIG, again by Fred Drake, in response to
  66 a question from Tony "Tibs" Ibbs: "What do we want to *call* this
  67 thing?".  This was shortly after David Goodger first `announced
  68 reStructuredText`__ on Doc-SIG.
  69
  70 Tibs used the name "Docutils" for `his effort`__ "to document what the
  71 Python docutils package should support, with a particular emphasis on
  72 documentation strings".  Tibs joined the current project (and its
  73 predecessors) and graciously donated the name.
  74
  75 For more history of reStructuredText and the Docutils project, see `An
  76 Introduction to reStructuredText`_.
  77
  78 Please note that the name is "Docutils", not "DocUtils" or "Doc-Utils"
  79 or any other variation.
  80
  81 .. _An Introduction to reStructuredText:
  82    http://docutils.sourceforge.net/docs/ref/rst/introduction.html
  83 __ http://mail.python.org/pipermail/doc-sig/1999-December/000878.html
  84 __ http://mail.python.org/pipermail/doc-sig/2000-November/001252.html
  85 __ http://mail.python.org/pipermail/doc-sig/2000-November/001239.html
  86 __ http://homepage.ntlworld.com/tibsnjoan/docutils/STpy.html
  87
  88
  89 Is there a GUI authoring environment for Docutils?
  90 --------------------------------------------------
  91
  92 DocFactory_ is under development.  It uses wxPython and looks very
  93 promising.
  94
  95 .. _DocFactory:
  96    http://docutils.sf.net/sandbox/gschwant/docfactory/doc/
  97
  98
  99 What is the status of the Docutils project?
 100 -------------------------------------------
 101
 102 Although useful and relatively stable, Docutils is experimental code,
 103 with APIs and architecture subject to change.
 104
 105 Our highest priority is to fix bugs as they are reported.  So the
 106 latest code from the repository_ (or the snapshots_) is almost always
 107 the most stable (bug-free) as well as the most featureful.
 108
 109
 110 What is the Docutils project release policy?
 111 --------------------------------------------
 112
 113 It's "release early & often".  We also have automatically-generated
 114 snapshots_ which always contain the latest code from the repository_.
 115 As the project matures, we may adopt a formal
 116 stable/development-branch scheme, but we're not using anything like
 117 that yet.
 118
 119 .. _repository: docs/dev/repository.html
 120 .. _snapshots: http://docutils.sourceforge.net/#download
 121
 122
 123 reStructuredText
 124 ================
 125
 126 What is reStructuredText?
 127 -------------------------
 128
 129 reStructuredText_ is an easy-to-read, what-you-see-is-what-you-get
 130 plaintext markup syntax and parser system.  The reStructuredText
 131 parser is a component of Docutils_.  reStructuredText is a revision
 132 and reinterpretation of the StructuredText_ and Setext_ lightweight
 133 markup systems.
 134
 135 If you are reading this on the web, you can see for yourself.  `The
 136 source for this FAQ <FAQ.txt>`_ is written in reStructuredText; open
 137 it in another window and compare them side by side.
 138
 139 `A ReStructuredText Primer`_ and the `Quick reStructuredText`_ user
 140 reference are good places to start.  The `reStructuredText Markup
 141 Specification`_ is a detailed technical specification.
 142
 143 .. _A ReStructuredText Primer:
 144    http://docutils.sourceforge.net/docs/user/rst/quickstart.html
 145 .. _Quick reStructuredText:
 146    http://docutils.sourceforge.net/docs/user/rst/quickref.html
 147 .. _reStructuredText Markup Specification:
 148    http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html
 149 .. _reStructuredText: http://docutils.sourceforge.net/rst.html
 150 .. _StructuredText:
 151    http://dev.zope.org/Members/jim/StructuredTextWiki/FrontPage/
 152 .. _Setext: http://docutils.sourceforge.net/mirror/setext.html
 153
 154
 155 Why is it called "reStructuredText"?
 156 ------------------------------------
 157
 158 The name came from a combination of "StructuredText", one of
 159 reStructuredText's predecessors, with "re": "revised", "reworked", and
 160 "reinterpreted", and as in the ``re.py`` regular expression module.
 161 For a detailed history of reStructuredText and the Docutils project,
 162 see `An Introduction to reStructuredText`_.
 163
 164
 165 What's the standard abbreviation for "reStructuredText"?
 166 --------------------------------------------------------
 167
 168 "RST" and "ReST" (or "reST") are both acceptable.  Care should be
 169 taken with capitalization, to avoid confusion with "REST__", an
 170 acronym for "Representational State Transfer".
 171
 172 The abbreviations "reSTX" and "rSTX"/"rstx" should **not** be used;
 173 they overemphasize reStructuredText's predecessor, Zope's
 174 StructuredText.
 175
 176 __ http://en.wikipedia.org/wiki/Representational_State_Transfer
 177
 178
 179 What's the standard filename extension for a reStructuredText file?
 180 -------------------------------------------------------------------
 181
 182 It's ".txt".  Some people would like to use ".rest" or ".rst" or
 183 ".restx", but why bother?  ReStructuredText source files are meant to
 184 be readable as plaintext, and most operating systems already associate
 185 ".txt" with text files.  Using a specialized filename extension would
 186 require that users alter their OS settings, which is something that
 187 many users will not be willing or able to do.
 188
 189
 190 Are there any reStructuredText editor extensions?
 191 -------------------------------------------------
 192
 193 See `Editor Support for reStructuredText`__.
 194
 195 __ http://docutils.sf.net/tools/editors/README.html
 196
 197
 198 How can I indicate the document title?  Subtitle?
 199 -------------------------------------------------
 200
 201 A uniquely-adorned section title at the beginning of a document is
 202 treated specially, as the document title.  Similarly, a
 203 uniquely-adorned section title immediately after the document title
 204 becomes the document subtitle.  For example::
 205
 206     This is the Document Title
 207     ==========================
 208
 209     This is the Document Subtitle
 210     -----------------------------
 211
 212     Here's an ordinary paragraph.
 213
 214 Counterexample::
 215
 216     Here's an ordinary paragraph.
 217
 218     This is *not* a Document Title
 219     ==============================
 220
 221     The "ordinary paragraph" above the section title
 222     prevents it from becoming the document title.
 223
 224 Another counterexample::
 225
 226     This is not the Document Title,  because...
 227     ===========================================
 228
 229     Here's an ordinary paragraph.
 230
 231     ... the title adornment is not unique
 232     =====================================
 233
 234     Another ordinary paragraph.
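
When titles are generated by a script, the adornment line can be produced
mechanically.  The following is a minimal Python sketch, not part of
Docutils; the helper name ``adorn_title`` is hypothetical.  It only writes
the markup.  Whether a title is promoted to document title still depends on
its position and on the uniqueness of its adornment, as described above.

```python
def adorn_title(title, char="="):
    """Return a section title followed by an underline of equal length."""
    if len(char) != 1:
        raise ValueError("adornment must be a single character")
    # The underline is built to match the length of the title text.
    return f"{title}\n{char * len(title)}"


if __name__ == "__main__":
    print(adorn_title("This is the Document Title", "="))
    print()
    print(adorn_title("This is the Document Subtitle", "-"))
```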
 235
 236
 237 How can I represent esoteric characters (e.g. character entities) in a document?
 238 --------------------------------------------------------------------------------
 239
 240 For example, say you want an em-dash (XML character entity &mdash;,
 241 Unicode character U+2014) in your document: use a real em-dash.
 242 Insert concrete characters (e.g. type a *real* em-dash) into your
 243 input file, using whatever encoding suits your application, and tell
 244 Docutils the input encoding.  Docutils uses Unicode internally, so the
 245 em-dash character is a real em-dash internally.
 246
 247 Emacs users should refer to the `Emacs Support for reStructuredText`__
 248 document.  Tips for other editors are welcome.
 249
 250 __ http://docutils.sourceforge.net/tools/editors/emacs/README.html
 251
 252 ReStructuredText has no character entity subsystem; it doesn't know
 253 anything about XML charents.  To Docutils, "&mdash;" in input text is
 254 7 discrete characters; no interpretation happens.  When writing HTML,
 255 the "&" is converted to "&amp;", so in the raw output you'd see
 256 "&amp;mdash;".  There's no difference in interpretation for text
 257 inside or outside inline literals or literal blocks -- there's no
 258 character entity interpretation in either case.
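
The effect can be imitated in a few lines of Python.  This sketch uses the
standard library's ``html.escape`` as a stand-in for the escaping done when
writing HTML; that substitution is an assumption made for illustration, as
Docutils has its own escaping code.

```python
import html

source_text = "&mdash;"

# To the parser this is just seven ordinary characters.
assert len(source_text) == 7

# Escaping for HTML output turns the ampersand into "&amp;", so a
# browser displays the original seven characters verbatim.
escaped = html.escape(source_text)
print(escaped)
```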
 259
 260 If you can't use a Unicode-compatible encoding and must rely on 7-bit
 261 ASCII, there is a workaround.  New in Docutils 0.3.10 is a set of
 262 `Standard Substitution Definition Sets`_, which provide equivalents of
 263 XML & HTML character entity sets as substitution definitions.  For
 264 example, the Japanese yen currency symbol can be used as follows::
 265
 266     .. include:: <xhtml1-lat1.txt>
 267
 268     |yen| 600 for a complete meal?  That's cheap!
 269
 270 For earlier versions of Docutils, equivalent files containing
 271 character entity set substitution definitions using the "unicode_"
 272 directive `are available`_.  Please read the `description and
 273 instructions`_ for use.  Thanks to David Priest for the original idea.
 274
 275 If you insist on using XML-style charents, you'll have to implement a
 276 pre-processing system to convert to UTF-8 or something.  That
 277 introduces complications though; you can no longer *write* about
 278 charents naturally; instead of writing "&mdash;" you'd have to write
 279 "&amp;mdash;".
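
Such a pre-processing step might look like the following Python sketch.  It
is not part of Docutils; the function name ``expand_charents`` is
hypothetical, and the standard library's ``html.unescape`` (which follows
HTML5 rules for character references) is assumed to be an acceptable
converter.  The result would then be saved in a Unicode encoding such as
UTF-8 and passed to Docutils as usual.

```python
import html


def expand_charents(text):
    """Replace HTML/XML character references with the characters they name."""
    return html.unescape(text)


if __name__ == "__main__":
    raw = "Wait &mdash; to write about the entity itself, use &amp;mdash;."
    print(expand_charents(raw))
```

The second reference in the sample shows the complication mentioned above:
the entity name survives only because its ampersand was written as
"&amp;amp;".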
 280
 281 For the common case of long dashes, you might also want to insert the
 282 following substitution definitions into your document (both of them
 283 use the "unicode_" directive)::
 284
 285     .. |--| unicode:: U+2013   .. en dash
 286     .. |---| unicode:: U+2014  .. em dash, trimming surrounding whitespace
 287        :trim:
 288
 289 .. |--| unicode:: U+2013   .. en dash
 290 .. |---| unicode:: U+2014  .. em dash, trimming surrounding whitespace
 291    :trim:
 292
 293 Now you can write dashes using pure ASCII: "``foo |--| bar; foo |---|
 294 bar``", rendered as "foo |--| bar; foo |---| bar".  (Note that Mozilla
 295 and Firefox may render this incorrectly.)  The ``:trim:`` option for
 296 the em dash is necessary because you cannot write "``foo|---|bar``";
 297 thus you need to add spaces ("``foo |---| bar``") and advise the
 298 reStructuredText parser to trim the spaces.
 299
 300 .. _Standard Substitution Definition Sets:
 301    http://docutils.sf.net/docs/ref/rst/substitutions.html
 302 .. _unicode:
 303    http://docutils.sf.net/docs/ref/rst/directives.html#unicode-character-codes
 304 .. _are available: http://docutils.sourceforge.net/tmp/charents/
 305 .. _tarball: http://docutils.sourceforge.net/tmp/charents.tgz
 306 .. _description and instructions:
 307    http://docutils.sourceforge.net/tmp/charents/README.html
 308 .. _to-do list: http://docutils.sourceforge.net/docs/dev/todo.html
 309
 310
 311 How can I generate backticks using a Scandinavian keyboard?
 312 -----------------------------------------------------------
 313
 314 The use of backticks in reStructuredText is a bit awkward with
 315 Scandinavian keyboards, where the backtick is a "dead" key.  To get
 316 one ` character one must press SHIFT-` + SPACE.
 317
 318 Unfortunately, with all the variations out there, there's no way to
 319 please everyone.  For Scandinavian programmers and technical writers,
 320 this is not limited to reStructuredText but affects many languages and
 321 environments.
 322
 323 Possible solutions include:
 324
 325 * If you have to input a lot of backticks, simply type one in the
 326   normal/awkward way, select it, copy and then paste the rest (CTRL-V
 327   is a lot faster than SHIFT-` + SPACE).
 328
 329 * Use keyboard macros.
 330
 331 * Remap the keyboard.  The Scandinavian keyboard layout is awkward for
 332   other programming/technical characters too; for example, []{}
 333   etc. are a bit awkward compared to US keyboards.
 334
 335   According to Axel Kollmorgen,
 336
 337       Under Windows, you can use the `Microsoft Keyboard Layout Creator
 338       <http://www.microsoft.com/globaldev/tools/msklc.mspx>`__ to easily
 339       map the backtick key to a real backtick (no dead key). took me
 340       five minutes to load my default (german) keyboard layout, untick
 341       "Dead Key?" from the backtick key properties ("in all shift
 342       states"), "build dll and setup package", install the generated
 343       .msi, and add my custom keyboard layout via Control Panel >
 344       Regional and Language Options > Languages > Details > Add
 345       Keyboard layout (and setting it as default "when you start your
 346       computer").
 347
 348 * Use a virtual/screen keyboard or character palette, such as:
 349
 350   - `Web-based keyboards <http://keyboard.lab.co.il/>`__ (IE only
 351     unfortunately).
 352   - Windows: `Click-N-Type <http://www.lakefolks.org/cnt/>`__.
 353   - Mac OS X: the Character Palette can store a set of favorite
 354     characters for easy input.  Open System Preferences,
 355     International, Input Menu tab, enable "Show input menu in menu
 356     bar", and be sure that Character Palette is enabled in the list.
 357
 358 If anyone knows of other/better solutions, please `let us know`_.
 359
 360
 361 Are there any tools for HTML/XML-to-reStructuredText?  (Round-tripping)
 362 -----------------------------------------------------------------------
 363
 364 People have tossed the idea around, and some implementations of
 365 reStructuredText-generating tools can be found in the `Docutils Link
 366 List`_.
 367
 368 There's no reason why reStructuredText should not be round-trippable
 369 to/from XML; any technicalities which prevent round-tripping would be
 370 considered bugs.  Whitespace would not be identical, but paragraphs
 371 shouldn't suffer.  The tricky parts would be the smaller details, like
 372 links and IDs and other bookkeeping.
 373
 374 For HTML, true round-tripping may not be possible.  Even adding lots
 375 of extra "class" attributes may not be enough.  A "simple HTML" to RST
 376 filter is possible -- for some definition of "simple HTML" -- but HTML
 377 is used as dumb formatting so much that such a filter may not be
 378 particularly useful.  An 80/20 approach should work though: build a
 379 tool that does 80% of the work automatically, leaving the other 20%
 380 for manual tweaks.
 381
 382 .. _Docutils Link List: docs/user/links.html
 383
 384
 385 Are there any Wikis that use reStructuredText syntax?
 386 -----------------------------------------------------
 387
 388 There are several, with various degrees of completeness.  With no
 389 implied endorsement or recommendation, and in no particular order:
 390
 391 * `Webware for Python wiki
 392   <http://wiki.webwareforpython.org/thiswiki.html>`__
 393 * `Ian Bicking's experimental code
 394   <http://docutils.sf.net/sandbox/ianb/wiki/Wiki.py>`__
 395 * `MoinMoin <http://moinmoin.wikiwikiweb.de/>`__ has some support;
 396   `here's a sample <http://moinmoin.wikiwikiweb.de/RestSample>`__
 397 * Zope-based `Zwiki <http://zwiki.org/>`__
 398 * Zope3-based Zwiki (in the Zope 3 source tree as ``zope.products.zwiki``)
 399 * `StikiWiki <http://mithrandr.moria.org/code/stikiwiki/>`__
 400 * `Trac <http://projects.edgewall.com/trac/>`__ `supports using reStructuredText
 401   <http://projects.edgewall.com/trac/wiki/WikiRestructuredText>`__ as an
 402   alternative to wiki markup. This includes support for `TracLinks
 403   <http://projects.edgewall.com/trac/wiki/TracLinks>`__ from within RST
 404   text via a custom RST reference-directive or, even easier, an interpreted text
 405   role 'trac'
 406 * `Vogontia <http://www.ososo.de/vogontia/>`__, a Wiki-like FAQ system
 407
 408 Please `let us know`_ of any other reStructuredText Wikis.
 409
 410 The example application for the `Web Framework Shootout
 411 <http://colorstudy.com/docs/shootout.html>`__ article is a Wiki using
 412 reStructuredText.
 413
 414
 415 Are there any Weblog (Blog) projects that use reStructuredText syntax?
 416 ----------------------------------------------------------------------
 417
 418 With no implied endorsement or recommendation, and in no particular
 419 order:
 420
 421 * `Firedrop <http://www.voidspace.org.uk/python/firedrop2/>`__
 422 * `Python Desktop Server <http://pyds.muensterland.org/>`__
 423 * `PyBlosxom <http://roughingit.subtlehints.net/pyblosxom/>`__
 424 * `Lino WebMan <http://lino.sourceforge.net/webman.html>`__
 425
 426 Please `let us know`_ of any other reStructuredText Blogs.
 427
 428
 429 Can lists be indented without generating block quotes?
 430 ------------------------------------------------------
 431
 432 Some people like to write lists with indentation, without intending a
 433 block quote context, like this::
 434
 435     paragraph
 436
 437       * list item 1
 438       * list item 2
 439
 440 There has been a lot of discussion about this, but there are some
 441 issues that would need to be resolved before it could be implemented.
 442 There is a summary of the issues and pointers to the discussions in
 443 `the to-do list`__.
 444
 445 __ http://docutils.sourceforge.net/docs/dev/todo.html#indented-lists
 446
 447
 448 Could the requirement for blank lines around lists be relaxed?
 449 --------------------------------------------------------------
 450
 451 Short answer: no.
 452
 453 In reStructuredText, it would be impossible to unambiguously mark up
 454 and parse lists without blank lines before and after.  Deeply nested
 455 lists may look ugly with so many blank lines, but it's a price we pay
 456 for unambiguous markup.  Some other plaintext markup systems do not
 457 require blank lines in nested lists, but they have to compromise
 458 somehow, either accepting ambiguity or requiring extra complexity.
 459 For example, `Epytext <http://epydoc.sf.net/epytext.html#list>`__ does
 460 not require blank lines around lists, but it does require that lists
 461 be indented and that ambiguous cases be escaped.
 462
 463
 464 How can I include mathematical equations in documents?
 465 ------------------------------------------------------
 466
 467 There is no elegant built-in way, yet.  There are several ideas, but
 468 no obvious winner.  This issue requires a champion to solve the
 469 technical and aesthetic issues and implement a generic solution.
 470 Here's the `to-do list entry`__.
 471
 472 __ http://docutils.sourceforge.net/docs/dev/todo.html#math-markup
 473
 474 There are several quick & dirty ways to include equations in documents.
 475 They all presently use LaTeX syntax or dialects of it.
 476
 477 * For LaTeX output, nothing beats raw LaTeX math.  A simple way is to
 478   use the `raw directive`_::
 479
 480       .. raw:: latex
 481
 482           \[ x^3 + 3x^2a + 3xa^2 + a^3, \]
 483
 484   For inline math you could use substitutions of the raw directive but
 485   the recently added `raw role`_ is more convenient.  You must define a
 486   custom role based on it once in your document::
 487
 488       .. role:: raw-latex(raw)
 489           :format: latex
 490
 491   and then you can just use the new role in your text::
 492
 493       the binomial expansion of :raw-latex:`$(x+a)^3$` is
 494
 495   .. _raw directive: http://docutils.sourceforge.net/docs/ref/rst/
 496                      directives.html#raw-data-pass-through
 497   .. _raw role: http://docutils.sourceforge.net/docs/ref/rst/roles.html#raw
 498
 499 * For HTML the "Right" W3C-standard way to include math is MathML_.
 500   Unfortunately its rendering is still quite broken (or absent) in many
 501   browsers, but it's getting better.  Another serious problem is that
 502   raw MathML is *really* painful for humans to type or read, so
 503   embedding it in a reST document with the raw directive would defeat
 504   the goals of readability and editability of reST (see an `example of
 505   raw MathML <http://sf.net/mailarchive/message.php?msg_id=2177102>`__).
 506
 507   A much less painful way to generate HTML+MathML is to use itex2mml_ to
 508   convert a dialect of LaTeX syntax to presentation MathML.  Here is an
 509   example of potential `itex math markup
 510   <http://article.gmane.org/gmane.text.docutils.user/118>`__.  The
 511   simplest way to use it is to add ``html`` to the format lists for the
 512   raw directive/role and postprocess the resulting document with
 513   itex2mml.  This way you can *generate LaTeX and HTML+MathML from the
 514   same source*, but you must limit yourself to the intersection of LaTeX
 515   and itex markups for this to work.  Alan G. Isaac wrote a detailed
 516   HOWTO_ for this approach.
 517
 518   .. _MathML: http://www.w3.org/Math/
 519   .. _itex2mml: http://pear.math.pitt.edu/mathzilla/itex2mml.html
 520   .. _HOWTO: http://www.american.edu/econ/itex2mml/mathhack.rst
 521
 522   * The other way to add math to HTML is to use images of the equations,
 523     typically generated by TeX.  This is inferior to MathML in the long
 524     term but is perhaps more accessible nowadays.
 525
 526     Of course, doing it by hand is too much work.  Beni Cherniavsky has
 527     written some `preprocessing scripts`__ for converting the
 528     ``texmath`` role/directive into images for HTML output and raw
 529     directives/substitution for LaTeX output.  This way you can *generate
 530     LaTeX and HTML+images from the same source*.  `Instructions here`__.
 531
 532     __ http://docutils.sourceforge.net/sandbox/cben/rolehack/
 533     __ http://docutils.sourceforge.net/sandbox/cben/rolehack/README.html
 534
 535
 536 Is nested inline markup possible?
 537 ---------------------------------
 538
 539 Not currently, no.  It's on the `to-do list`__ (`details here`__), and
 540 hopefully will be part of the reStructuredText parser soon.  At that
 541 time, markup like this will become possible::
 542
 543     Here is some *emphasized text containing a `hyperlink`_ and
 544     ``inline literals``*.
 545
 546 __ http://docutils.sf.net/docs/dev/todo.html#nested-inline-markup
 547 __ http://docutils.sf.net/docs/dev/rst/alternatives.html#nested-inline-markup
 548
 549 There are workarounds, but they are either convoluted or ugly or both.
 550 They are not recommended.
 551
 552 * Inline markup can be combined with hyperlinks using `substitution
 553   definitions`__ and references__ with the `"replace" directive`__.
 554   For example::
 555
 556       Here is an |emphasized hyperlink|_.
 557
 558       .. |emphasized hyperlink| replace:: *emphasized hyperlink*
 559       .. _emphasized hyperlink: http://example.org
 560
 561   It is not possible for just a portion of the replacement text to be
 562   a hyperlink; it's the entire replacement text or nothing.
 563
 564   __ http://docutils.sf.net/docs/ref/rst/restructuredtext.html#substitution-definitions
 565   __ http://docutils.sf.net/docs/ref/rst/restructuredtext.html#substitution-references
 566   __ http://docutils.sf.net/docs/ref/rst/directives.html#replace
 567
 568 * The `"raw" directive`__ can be used to insert raw HTML into HTML
 569   output::
 570
 571       Here is some |stuff|.
 572
 573       .. |stuff| raw:: html
 574
 575          <em>emphasized text containing a
 576          <a href="http://example.org">hyperlink</a> and
 577          <tt>inline literals</tt></em>
 578
 579   Raw LaTeX is supported for LaTeX output, etc.
 580
 581   __ http://docutils.sf.net/docs/ref/rst/directives.html#raw
 582
 583
 584 How to indicate a line break or a significant newline?
 585 ------------------------------------------------------
 586
 587 `Line blocks`__ are designed for address blocks, verse, and other
 588 cases where line breaks are significant and must be preserved.  Unlike
 589 literal blocks, the typeface is not changed, and inline markup is
 590 recognized.  For example::
 591
 592     | A one, two, a one two three four
 593     |
 594     | Half a bee, philosophically,
 595     |     must, *ipso facto*, half not be.
 596     | But half the bee has got to be,
 597     |     *vis a vis* its entity.  D'you see?
 598     |
 599     | But can a bee be said to be
 600     |     or not to be an entire bee,
 601     |         when half the bee is not a bee,
 602     |             due to some ancient injury?
 603     |
 604     | Singing...
 605
 606 __ http://docutils.sf.net/docs/ref/rst/restructuredtext.html#line-blocks
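
If the text already exists as plain lines (an address taken from a
database, say), the line block markup can be generated.  This is a minimal
Python sketch under that assumption; the helper name ``to_line_block`` is
hypothetical and not part of Docutils.

```python
def to_line_block(text):
    """Prefix every line with "| " so that line breaks are preserved."""
    lines = []
    for line in text.splitlines():
        # A bare "|" stands for an empty line; leading spaces in
        # non-empty lines are kept so that indentation carries over.
        lines.append("| " + line.rstrip() if line.strip() else "|")
    return "\n".join(lines)


if __name__ == "__main__":
    verse = "Half a bee, philosophically,\n    must, *ipso facto*, half not be."
    print(to_line_block(verse))
```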
 607
 608 Here's a workaround for manually inserting explicit line breaks in
 609 HTML output::
 610
 611     .. |br| raw:: html
 612
 613        <br />
 614
 615     I want to break this line here: |br| this is after the break.
 616
 617     If the extra whitespace bothers you, |br|\ backslash-escape it.
 618
 619
 620 A URL containing asterisks doesn't work.  What to do?
 621 -----------------------------------------------------
 622
 623 Asterisks are valid URL characters (see :RFC:`2396`), sometimes used
 624 in URLs.  For example::
 625
 626     http://cvs.example.org/viewcvs.py/*checkout*/module/file
 627
 628 Unfortunately, the parser thinks the asterisks are indicating
 629 emphasis.  The slashes serve as delineating punctuation, allowing the
 630 asterisks to be recognized as markup.  The example above is separated
 631 by the parser into a truncated URL, an emphasized word, and some
 632 regular text::
 633
 634     http://cvs.example.org/viewcvs.py/
 635     *checkout*
 636     /module/file
 637
 638 To turn off markup recognition, use a backslash to escape at least the
 639 first asterisk, like this::
 640
 641     http://cvs.example.org/viewcvs.py/\*checkout*/module/file
 642
 643 Escaping the second asterisk doesn't hurt, but it isn't necessary.
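
When URLs are inserted into a document by a script, the escaping can be
automated.  A minimal Python sketch follows; the helper name
``escape_asterisks`` is hypothetical.  It escapes every asterisk, which is
more than strictly necessary but harmless.

```python
def escape_asterisks(url):
    """Backslash-escape every asterisk so the parser sees no emphasis."""
    return url.replace("*", "\\*")


if __name__ == "__main__":
    print(escape_asterisks(
        "http://cvs.example.org/viewcvs.py/*checkout*/module/file"))
```

The function should be applied only once to a given URL; running it over
already-escaped text would double the backslashes.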
 644
 645
 646 How can I make a literal block with *some* formatting?
 647 ------------------------------------------------------
 648
 649 Use the `parsed-literal`_ directive.
 650
 651 .. _parsed-literal: docs/ref/rst/directives.html#parsed-literal
 652
 653 Scenario: a document contains some source code, which calls for a
 654 literal block to preserve linebreaks and whitespace.  But part of the
 655 source code should be formatted, for example as emphasis or as a
 656 hyperlink.  This calls for a *parsed* literal block::
 657
 658     .. parsed-literal::
 659
 660        print "Hello world!"  # *tricky* code [1]_
 661
 662 The emphasis (``*tricky*``) and footnote reference (``[1]_``) will be
 663 parsed.
 664
 665
 666 Can reStructuredText be used for web or generic templating?
 667 -----------------------------------------------------------
 668
 669 Docutils and reStructuredText can be used with or as a component of a
 670 templating system, but they do not themselves include templating
 671 functionality.  Templating should simply be left to dedicated
 672 templating systems.  Users can choose a templating system to apply to
 673 their reStructuredText documents as best serves their interests.
 674
 675 There are many good templating systems for Python (ht2html_, YAPTU_,
 676 Quixote_'s PTL, Cheetah_, etc.; see this non-exhaustive list of `some
 677 other templating systems`_), and many more for other languages, each
 678 with different approaches.  We invite you to try several and find one
 679 you like.  If you adapt it to use Docutils/reStructuredText, please
 680 consider contributing the code to Docutils or `let us know`_ and we'll
 681 keep a list here.
 682
 683 One reST-specific web templating system is `rest2web
 684 <http://www.voidspace.org.uk/python/rest2web>`_, a tool for
 685 automatically building websites, or parts of websites.
 686
 687 .. _ht2html: http://ht2html.sourceforge.net/
 688 .. _YAPTU:
 689    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52305
 690 .. _Quixote: http://www.mems-exchange.org/software/quixote/
 691 .. _Cheetah: http://www.cheetahtemplate.org/
 692 .. _some other templating systems:
 693    http://webware.sourceforge.net/Papers/Templates/
 694
 695
 696 HTML Writer
 697 ===========
 698
 699 What is the status of the HTML Writer?
 700 --------------------------------------
 701
 702 The HTML Writer module, ``docutils/writers/html4css1.py``, is a
 703 proof-of-concept reference implementation.  While it is a complete
 704 implementation, some aspects of the HTML it produces may be
 705 incompatible with older browsers or specialized applications (such as
 706 web templating).  Alternate implementations are welcome.
 707
 708
 709 What kind of HTML does it produce?
 710 ----------------------------------
 711
 712 It produces XHTML compatible with the `XHTML 1.0`_ specification.  A
 713 cascading stylesheet (provided as "tools/stylesheets/default.css") is
 714 required for proper viewing with a modern graphical browser.  Correct
 715 rendering of the HTML produced depends on the CSS support of the
 716 browser.
 717
 718 .. _XHTML 1.0: http://www.w3.org/TR/xhtml1/
 719
 720
 721 What browsers are supported?
 722 ----------------------------
 723
 724 No specific browser is targeted; all modern graphical browsers should
 725 work.  Some older browsers, text-only browsers, and browsers without
 726 full CSS support are known to produce inferior results.  Firefox,
 727 Safari, Mozilla (version 1.0 and up), and MS Internet Explorer
 728 (version 5.0 and up) are known to give good results.  Reports of
 729 experiences with other browsers are welcome.
 730
 731
 732 Unexpected results from tools/rst2html.py: H1, H1 instead of H1, H2.  Why?
 733 --------------------------------------------------------------------------
 734
 735 Here's the question in full:
 736
 737     I have this text::
 738
 739         Heading 1
 740         =========
 741
 742         All my life, I wanted to be H1.
 743
 744         Heading 1.1
 745         -----------
 746
 747         But along came H1, and so shouldn't I be H2?
 748         No!  I'm H1!
 749
 750         Heading 1.1.1
 751         *************
 752
 753         Yeah, imagine me, I'm stuck at H3!  No?!?
 754
 755     When I run it through tools/rst2html.py, I get unexpected results
 756     (below).  I was expecting H1, H2, then H3; instead, I get H1, H1,
 757     H2::
 758
 759         ...
 760         <html lang="en">
 761         <head>
 762         ...
 763         <title>Heading 1</title>
 764         <link rel="stylesheet" href="default.css" type="text/css" />
 765         </head>
 766         <body>
 767         <div class="document" id="heading-1">
 768         <h1 class="title">Heading 1</h1>                <-- first H1
 769         <p>All my life, I wanted to be H1.</p>
 770         <div class="section" id="heading-1-1">
 771         <h1><a name="heading-1-1">Heading 1.1</a></h1>        <-- H1
 772         <p>But along came H1, and so now I must be H2.</p>
 773         <div class="section" id="heading-1-1-1">
 774         <h2><a name="heading-1-1-1">Heading 1.1.1</a></h2>
 775         <p>Yeah, imagine me, I'm stuck at H3!</p>
 776         ...
 777
 778     What gives?
 779
 780 Check the "class" attribute on the H1 tags, and you will see a
 781 difference.  The first H1 is actually ``<h1 class="title">``; this is
 782 the document title, and the default stylesheet renders it centered.
 783 There can also be an ``<h2 class="subtitle">`` for the document
 784 subtitle.
 785
 786 If there's only one highest-level section title at the beginning of a
 787 document, it is treated specially, as the document title.  (Similarly, a
 788 lone second-highest-level section title may become the document
 789 subtitle.)  See `How can I indicate the document title?  Subtitle?`_ for
 790 details.  Rather than use a plain H1 for the document title, we use ``<h1
 791 class="title">`` so that we can use H1 again within the document.  Why
 792 do we do this?  HTML only has H1-H6, so by making H1 do double duty, we
 793 effectively reserve these tags to provide 6 levels of heading beyond the
 794 single document title.
 795
 796 Here HTML serves purely as a presentation format for final display.
 797 A stylesheet *is required*, and one is provided:
 798 ``tools/stylesheets/default.css``.  Of course, you're welcome to roll
 799 your own.  The default stylesheet provides rules to format ``<h1
 800 class="title">`` and ``<h2 class="subtitle">`` differently from
 801 ordinary ``<h1>`` and ``<h2>``::
 802
 803     h1.title {
 804       text-align: center }
 805
 806     h2.subtitle {
 807       text-align: center }
 808
 809 If you don't want the top section heading to be interpreted as a
 810 title at all, disable the `doctitle_xform`_ setting
 811 (``--no-doc-title`` option).  This will interpret your document
 812 differently from the standard settings, which might not be a good
 813 idea.  If you don't like the reuse of the H1 in the HTML output, you
 814 can tweak the `initial_header_level`_ setting
 815 (``--initial-header-level`` option) -- but unless you match its value
 816 to your specific document, you might end up with bad HTML (e.g. H3
 817 without H2).
 818
 819 .. _doctitle_xform:
 820    http://docutils.sourceforge.net/docs/user/config.html#doctitle-xform
 821 .. _initial_header_level:
 822    http://docutils.sourceforge.net/docs/user/config.html#initial-header-level
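
The effect of these two settings can be checked from Python.  The
following is a minimal sketch, assuming Docutils is installed: it passes
``doctitle_xform`` and ``initial_header_level`` through the
``settings_overrides`` argument of ``docutils.core.publish_parts()``.
The helper name ``render_body`` and the sample text are illustrative
only.

```python
SAMPLE = """\
Heading 1
=========

All my life, I wanted to be H1.

Heading 1.1
-----------

But along came H1.
"""


def render_body(source, **overrides):
    """Return the 'html_body' part for the reStructuredText `source`.

    Keyword arguments are passed to Docutils as settings overrides.
    """
    # Imported here so the sketch only needs Docutils when it is called.
    from docutils.core import publish_parts

    parts = publish_parts(source=source, writer_name="html",
                          settings_overrides=overrides)
    return parts["html_body"]
```

Calling ``render_body(SAMPLE)`` should give a body that contains ``<h1
class="title">``, while ``render_body(SAMPLE, doctitle_xform=False,
initial_header_level=2)`` should give one with no H1 at all, because
section headings then start at H2.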
 823
 824 (Thanks to Mark McEahern for the question and much of the answer.)
 825
 826
 827 Why do enumerated lists only use numbers (no letters or roman numerals)?
 828 ------------------------------------------------------------------------
 829
 830 The rendering of enumerators (the numbers or letters acting as list
 831 markers) is governed entirely by the stylesheet.  If only numbers
 832 appear, either the browser can't find the stylesheet (try using the
 833 "--embed-stylesheet" option), or the browser can't understand it (try
 834 a recent Firefox, Mozilla, Konqueror, Opera, Safari, or even MSIE).
 835
 836
 837 There appear to be garbage characters in the HTML.  What's up?
 838 --------------------------------------------------------------
 839
 840 What you're seeing is most probably not garbage, but the result of a
 841 mismatch between the actual encoding of the HTML output and the
 842 encoding your browser is expecting.  Your browser is misinterpreting
 843 the HTML data, which is encoded text.  A discussion of text encodings
 844 is beyond the scope of this FAQ; see one or more of these documents
 845 for more info:
 846
 847 * `UTF-8 and Unicode FAQ for Unix/Linux
 848   <http://www.cl.cam.ac.uk/~mgk25/unicode.html>`_
 849
 850 * Chapters 3 and 4 of `Introduction to i18n [Internationalization]
 851   <http://www.debian.org/doc/manuals/intro-i18n/>`_
 852
 853 * `Python Unicode Tutorial
 854   <http://www.reportlab.com/i18n/python_unicode_tutorial.html>`_
 855
 856 * `Python Unicode Objects: Some Observations on Working With Non-ASCII
 857   Character Sets <http://effbot.org/zone/unicode-objects.htm>`_
 858
 859 The most common case arises with the default output encoding (UTF-8)
 860 when numbered sections (via the "sectnum_" directive) or symbol
 861 footnotes are used.  Three non-breaking spaces are inserted in each
 862 numbered section title, between the generated number and the title
 863 text.  Most footnote symbols are not available in ASCII, nor are
 864 non-breaking spaces.  When encoded with UTF-8 and viewed with ASCII
 865 tools, these characters appear as multi-character garbage.
 866
 867 You may have a decoding problem in your browser (or editor, etc.).
 868 The encoding of the output is set to "utf-8", but your browser isn't
 869 recognizing that.  You can either try to fix your browser (enable
 870 "UTF-8 character set", sometimes called "Unicode"), or choose a
 871 different encoding for the HTML output.  You can also try
 872 ``--output-encoding=ascii:xmlcharrefreplace`` for HTML (not applicable
 873 to non-XMLish outputs).
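
The mismatch can be reproduced with plain Python strings, without
Docutils.  This is a sketch under assumptions: the title text is made
up, and Latin-1 stands in for whatever wrong encoding a viewer might
assume.

```python
# A numbered section title as described above: the number, three
# non-breaking spaces (U+00A0), then the title text.
title = "1\u00a0\u00a0\u00a0Introduction"

utf8_bytes = title.encode("utf-8")

# A viewer that wrongly assumes Latin-1 shows two characters for every
# non-breaking space, because each one is two bytes (0xC2 0xA0) in UTF-8.
misread = utf8_bytes.decode("latin-1")

# The same idea as the "ascii:xmlcharrefreplace" output encoding:
# characters outside ASCII become numeric character references.
ascii_bytes = title.encode("ascii", errors="xmlcharrefreplace")

print(repr(misread))
print(ascii_bytes)  # b'1&#160;&#160;&#160;Introduction'
```

The misread string contains a stray "Â" before each non-breaking space,
which is the typical "garbage" in question.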
 874
 875 If you're generating document fragments, the "Content-Type" metadata
 876 (between the HTML ``<head>`` and ``</head>`` tags) must agree with the
 877 encoding of the rest of the document.  For UTF-8, it should be::
 878
 879     <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
 880
 881 Also, Docutils normally generates an XML declaration as the first line
 882 of the output.  It must also match the document encoding.  For UTF-8::
 883
 884     <?xml version="1.0" encoding="utf-8" ?>
 885
 886 .. _sectnum:
 887    http://docutils.sourceforge.net/docs/ref/rst/directives.html#sectnum
 888
 889
 890 How can I retrieve the body of the HTML document?
 891 -------------------------------------------------
 892
 893 (This is usually needed when using Docutils in conjunction with a
 894 templating system.)
 895
 896 You can use the `docutils.core.publish_parts()`_ function, which
 897 returns a dictionary containing an 'html_body_' entry.
 898
 899 .. _docutils.core.publish_parts():
 900    docs/api/publisher.html#publish-parts
 901 .. _html_body:
 902    docs/api/publisher.html#html-body
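
As a minimal sketch, assuming Docutils is installed, the body can be
pulled out and dropped into a template like this.  The helper names and
the ``{body}`` placeholder convention are illustrative and not part of
Docutils; a real templating system would supply its own mechanism.

```python
def rst_to_html_body(source):
    """Return only the body markup for a reStructuredText string."""
    # Imported here so the sketch only needs Docutils when it is called.
    from docutils.core import publish_parts

    parts = publish_parts(source=source, writer_name="html")
    # 'html_body' leaves out the <html>, <head>, and <body> tags, so the
    # surrounding page skeleton can come from the template instead.
    return parts["html_body"]


def fill_template(template, source):
    """Insert the rendered body into a template with a {body} slot."""
    return template.format(body=rst_to_html_body(source))
```

For example, ``fill_template("<div id='content'>{body}</div>", "Hello
*world*.")`` should return the rendered paragraph wrapped in the
``div``.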
 903
 904
 905 Why is the Docutils XHTML served as "Content-type: text/html"?
 906 --------------------------------------------------------------
 907
 908 Full question:
 909
 910     Docutils' HTML output looks like XHTML and is advertised as such::
 911
 912       <?xml version="1.0" encoding="utf-8" ?>
 913       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
 914        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 915
 916     But this is followed by::
 917
 918       <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
 919
 920     Shouldn't this be "application/xhtml+xml" instead of "text/html"?
 921
 922 In a perfect web, the Docutils XHTML output would be 100% strict
 923 XHTML.  But it's not a perfect web, and a major source of imperfection
 924 is Internet Explorer.  Despite its drawbacks, IE still represents the
 925 majority of web browsers, and cannot be ignored.
 926
 927 Short answer: if we didn't serve XHTML as "text/html" (which is a
 928 perfectly valid thing to do), it couldn't be viewed in Internet
 929 Explorer.
 930
 931 Long answer: see the `"Criticisms of Internet Explorer" Wikipedia
 932 entry <http://en.wikipedia.org/wiki/Criticisms_of_Internet_Explorer#XHTML>`__.
 933
 934 However, there's also `Sending XHTML as text/html Considered
 935 Harmful`__.  What to do, what to do?  We're damned no matter what we
 936 do.  So we'll continue to do the practical instead of the pure:
 937 support the browsers that are actually out there, and not fight for
 938 strict standards compliance.
 939
 940 __ http://hixie.ch/advocacy/xhtml
 941
 942 (Thanks to Martin F. Krafft, Robert Kern, Michael Foord, and Alan
 943 G. Isaac.)
 944
 945
 946 Python Source Reader
 947 ====================
 948
 949 Can I use Docutils for Python auto-documentation?
 950 -------------------------------------------------
 951
 952 Yes, in conjunction with other projects.
 953
 954 Docstring extraction functionality from within Docutils is still under
 955 development.  There is most of a source code parsing module in
 956 docutils/readers/python/moduleparser.py.  We do plan to finish it
 957 eventually.  Ian Bicking wrote an initial front end for the
 958 moduleparser.py module, in sandbox/ianb/extractor/extractor.py.  Ian
 959 also did some work on the Python Source Reader
 960 (docutils.readers.python) component at PyCon DC 2004.
 961
 962 Version 2.0 of Ed Loper's `Epydoc <http://epydoc.sourceforge.net/>`_
 963 supports reStructuredText-format docstrings for HTML output.  Docutils
 964 0.3 or newer is required.  Development of a Docutils-specific
 965 auto-documentation tool will continue.  Epydoc works by importing
 966 Python modules to be documented, whereas the Docutils-specific tool,
 967 described above, will parse modules without importing them (as with
 968 `HappyDoc <http://happydoc.sourceforge.net/>`_, which doesn't support
 969 reStructuredText).
 970
 971 The advantages of parsing over importing are security and flexibility;
 972 the disadvantage is complexity/difficulty.
 973
 974 * Security: untrusted code that shouldn't be executed can be parsed;
 975   importing a module executes its top-level code.
 976 * Flexibility: comments and unofficial docstrings (those not supported
 977   by Python syntax) can only be processed by parsing.
 978 * Complexity/difficulty: it's a lot harder to parse and analyze a
 979   module than it is to ``import`` and analyze one.
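
The security point can be illustrated with Python's standard ``ast``
module.  This is only a sketch of the parse-without-import idea, not the
Docutils ``moduleparser.py`` code, and it handles official docstrings
only; the function name and the sample module are made up.

```python
import ast


def extract_docstrings(source):
    """Map names to docstrings by parsing `source`, never executing it."""
    tree = ast.parse(source)
    docs = {"<module>": ast.get_docstring(tree)}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.ClassDef)):
            docs[node.name] = ast.get_docstring(node)
    return docs


# Importing this module would stop the interpreter at the "raise" line;
# parsing it is harmless because no top-level code runs.
SAMPLE = '''
"""Module docstring."""
raise SystemExit("top-level code runs on import")

def greet():
    """Say hello."""
'''

docs = extract_docstrings(SAMPLE)
print(docs)
```

Names are not qualified here, so a method and a function with the same
name would collide; a real tool would track the enclosing scope.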
 980
 981 For more details, please see "Docstring Extraction Rules" in `PEP
 982 258`_, item 3 ("How").
 983
 984
 985 Miscellaneous
 986 =============
 987
 988 Is the Docutils document model based on any existing XML models?
 989 ----------------------------------------------------------------
 990
 991 Not directly, no.  It borrows bits from DocBook, HTML, and others.  I
 992 (David Goodger) have designed several document models over the years,
 993 and have my own biases.  The Docutils document model is designed for
 994 simplicity and extensibility, and has been influenced by the needs of
 995 the reStructuredText markup.