FAQ.txt

   1 .. -*- coding: utf-8 -*-
   2
   3 ===========================================
   4  Docutils FAQ (Frequently Asked Questions)
   5 ===========================================
   6
   7 :Date: $Date$
   8 :Revision: $Revision$
   9 :Web site: http://docutils.sourceforge.net/
  10 :Copyright: This document has been placed in the public domain.
  11
  12 .. Please note that until there's a Q&A-specific construct available,
  13    this FAQ will use section titles for questions.  Therefore
  14    questions must fit on one line.  The title may be a summary of the
  15    question, with the full question in the section body.
  16
  17
  18 .. contents::
  19 .. sectnum::
  20
  21
  22 This is a work in progress.  Please feel free to ask questions and/or
  23 provide answers; send email to the `Docutils-users`_ mailing list.
  24 Project members should feel free to edit the source text file
  25 directly.
  26
  27 .. _let us know:
  28 .. _Docutils-users: docs/user/mailing-lists.html#docutils-users
  29
  30
  31 Docutils
  32 ========
  33
  34 What is Docutils?
  35 -----------------
  36
  37 Docutils_ is a system for processing plaintext documentation into
  38 useful formats, such as HTML, XML, and LaTeX.  It supports multiple
  39 types of input, such as standalone files (implemented), inline
  40 documentation from Python modules and packages (under development),
  41 `PEPs (Python Enhancement Proposals)`_ (implemented), and others as
  42 discovered.
  43
  44 For an overview of the Docutils project implementation, see `PEP
  45 258`_, "Docutils Design Specification".
  46
  47 Docutils is implemented in Python_.
  48
  49 .. _Docutils: http://docutils.sourceforge.net/
  50 .. _PEPs (Python Enhancement Proposals):
  51    http://www.python.org/peps/pep-0012.html
  52 .. _PEP 258: http://www.python.org/peps/pep-0258.html
  53 .. _Python: http://www.python.org/
  54
  55
  56 Why is it called "Docutils"?
  57 ----------------------------
  58
  59 Docutils is short for "Python Documentation Utilities".  The name
  60 "Docutils" was inspired by "Distutils", the Python Distribution
  61 Utilities architected by Greg Ward, a component of Python's standard
  62 library.
  63
  64 The earliest known use of the term "docutils" in a Python context was
  65 a `fleeting reference`__ in a message by Fred Drake on 1999-12-02 in
  66 the Python Doc-SIG mailing list.  It was suggested `as a project
  67 name`__ on 2000-11-27 on Doc-SIG, again by Fred Drake, in response to
  68 a question from Tony "Tibs" Ibbs: "What do we want to *call* this
  69 thing?".  This was shortly after David Goodger first `announced
  70 reStructuredText`__ on Doc-SIG.
  71
  72 Tibs used the name "Docutils" for `his effort`__ "to document what the
  73 Python docutils package should support, with a particular emphasis on
  74 documentation strings".  Tibs joined the current project (and its
  75 predecessors) and graciously donated the name.
  76
  77 For more history of reStructuredText and the Docutils project, see `An
  78 Introduction to reStructuredText`_.
  79
  80 Please note that the name is "Docutils", not "DocUtils" or "Doc-Utils"
  81 or any other variation.
  82
  83 .. _An Introduction to reStructuredText:
  84    http://docutils.sourceforge.net/docs/ref/rst/introduction.html
  85 __ http://mail.python.org/pipermail/doc-sig/1999-December/000878.html
  86 __ http://mail.python.org/pipermail/doc-sig/2000-November/001252.html
  87 __ http://mail.python.org/pipermail/doc-sig/2000-November/001239.html
  88 __ http://homepage.ntlworld.com/tibsnjoan/docutils/STpy.html
  89
  90
  91 Is there a GUI authoring environment for Docutils?
  92 --------------------------------------------------
  93
  94 DocFactory_ is under development.  It uses wxPython and looks very
  95 promising.
  96
  97 .. _DocFactory:
  98    http://docutils.sf.net/sandbox/gschwant/docfactory/doc/
  99
 100
 101 What is the status of the Docutils project?
 102 -------------------------------------------
 103
 104 Although useful and relatively stable, Docutils is experimental code,
 105 with APIs and architecture subject to change.
 106
 107 Our highest priority is to fix bugs as they are reported.  So the
 108 latest code from the repository_ (or the snapshots_) is almost always
 109 the most stable (bug-free) as well as the most featureful.
 110
 111
 112 What is the Docutils project release policy?
 113 --------------------------------------------
 114
 115 It's "release early & often".  We also have automatically-generated
 116 snapshots_ which always contain the latest code from the repository_.
 117 As the project matures, we may formalize a
 118 stable/development-branch scheme, but we're not using anything like
 119 that yet.
 120
 121 .. _repository: docs/dev/repository.html
 122 .. _snapshots: http://docutils.sourceforge.net/#download
 123
 124
 125 reStructuredText
 126 ================
 127
 128 What is reStructuredText?
 129 -------------------------
 130
 131 reStructuredText_ is an easy-to-read, what-you-see-is-what-you-get
 132 plaintext markup syntax and parser system.  The reStructuredText
 133 parser is a component of Docutils_.  reStructuredText is a revision
 134 and reinterpretation of the StructuredText_ and Setext_ lightweight
 135 markup systems.
 136
 137 If you are reading this on the web, you can see for yourself.  `The
 138 source for this FAQ <FAQ.txt>`_ is written in reStructuredText; open
 139 it in another window and compare them side by side.
 140
 141 `A ReStructuredText Primer`_ and the `Quick reStructuredText`_ user
 142 reference are good places to start.  The `reStructuredText Markup
 143 Specification`_ is a detailed technical specification.
 144
 145 .. _A ReStructuredText Primer:
 146    http://docutils.sourceforge.net/docs/user/rst/quickstart.html
 147 .. _Quick reStructuredText:
 148    http://docutils.sourceforge.net/docs/user/rst/quickref.html
 149 .. _reStructuredText Markup Specification:
 150    http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html
 151 .. _reStructuredText: http://docutils.sourceforge.net/rst.html
 152 .. _StructuredText:
 153    http://dev.zope.org/Members/jim/StructuredTextWiki/FrontPage/
 154 .. _Setext: http://docutils.sourceforge.net/mirror/setext.html
 155
 156
 157 Why is it called "reStructuredText"?
 158 ------------------------------------
 159
 160 The name came from a combination of "StructuredText", one of
 161 reStructuredText's predecessors, with "re": "revised", "reworked", and
 162 "reinterpreted", and as in the ``re.py`` regular expression module.
 163 For a detailed history of reStructuredText and the Docutils project,
 164 see `An Introduction to reStructuredText`_.
 165
 166
 167 What's the standard abbreviation for "reStructuredText"?
 168 --------------------------------------------------------
 169
 170 "RST" and "ReST" (or "reST") are both acceptable.  Care should be
 171 taken with capitalization, to avoid confusion with "REST__", an
 172 acronym for "Representational State Transfer".
 173
 174 The abbreviations "reSTX" and "rSTX"/"rstx" should **not** be used;
 175 they overemphasize reStructuredText's predecessor, Zope's
 176 StructuredText.
 177
 178 __ http://en.wikipedia.org/wiki/Representational_State_Transfer
 179
 180
 181 What's the standard filename extension for a reStructuredText file?
 182 -------------------------------------------------------------------
 183
 184 It's ".txt".  Some people would like to use ".rest" or ".rst" or
 185 ".restx", but why bother?  ReStructuredText source files are meant to
 186 be readable as plaintext, and most operating systems already associate
 187 ".txt" with text files.  Using a specialized filename extension would
 188 require that users alter their OS settings, which is something that
 189 many users will not be willing or able to do.
 190
 191
 192 Are there any reStructuredText editor extensions?
 193 -------------------------------------------------
 194
 195 See `Editor Support for reStructuredText`__.
 196
 197 __ http://docutils.sf.net/tools/editors/README.html
 198
 199
 200 How can I indicate the document title?  Subtitle?
 201 -------------------------------------------------
 202
 203 A uniquely-adorned section title at the beginning of a document is
 204 treated specially, as the document title.  Similarly, a
 205 uniquely-adorned section title immediately after the document title
 206 becomes the document subtitle.  For example::
 207
 208     This is the Document Title
 209     ==========================
 210
 211     This is the Document Subtitle
 212     -----------------------------
 213
 214     Here's an ordinary paragraph.
 215
 216 Counterexample::
 217
 218     Here's an ordinary paragraph.
 219
 220     This is *not* a Document Title
 221     ==============================
 222
 223     The "ordinary paragraph" above the section title
 224     prevents it from becoming the document title.
 225
 226 Another counterexample::
 227
 228     This is not the Document Title,  because...
 229     ===========================================
 230
 231     Here's an ordinary paragraph.
 232
 233     ... the title adornment is not unique
 234     =====================================
 235
 236     Another ordinary paragraph.
 237
 238
 239 How can I represent esoteric characters (e.g. character entities) in a document?
 240 --------------------------------------------------------------------------------
 241
 242 For example, say you want an em-dash (XML character entity &mdash;,
 243 Unicode character U+2014) in your document: use a real em-dash.
 244 Insert concrete characters (e.g. type a *real* em-dash) into your
 245 input file, using whatever encoding suits your application, and tell
 246 Docutils the input encoding.  Docutils uses Unicode internally, so the
 247 em-dash character is a real em-dash internally.
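
     One way to keep the encoding visible in the file itself is a coding
     declaration in a comment on the first line, as the source of this FAQ
     does.  The sketch below assumes a UTF-8 encoded file.  Whether
     Docutils itself honors the declaration depends on the version (an
     assumption here), so passing the input encoding explicitly remains
     the safe route::

         .. -*- coding: utf-8 -*-

         Lunch — if the word applies — was served at noon.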
 248
 249 Emacs users should refer to the `Emacs Support for reStructuredText`__
 250 document.  Tips for other editors are welcome.
 251
 252 __ http://docutils.sourceforge.net/tools/editors/emacs/README.html
 253
 254 ReStructuredText has no character entity subsystem; it doesn't know
 255 anything about XML charents.  To Docutils, "&mdash;" in input text is
 256 7 discrete characters; no interpretation happens.  When writing HTML,
 257 the "&" is converted to "&amp;", so in the raw output you'd see
 258 "&amp;mdash;".  There's no difference in interpretation for text
 259 inside or outside inline literals or literal blocks -- there's no
 260 character entity interpretation in either case.
 261
 262 If you can't use a Unicode-compatible encoding and must rely on 7-bit
 263 ASCII, there is a workaround.  New in Docutils 0.3.10 is a set of
 264 `Standard Substitution Definition Sets`_, which provide equivalents of
 265 XML & HTML character entity sets as substitution definitions.  For
 266 example, the Japanese yen currency symbol can be used as follows::
 267
 268     .. include:: <xhtml1-lat1.txt>
 269
 270     |yen| 600 for a complete meal?  That's cheap!
 271
 272 For earlier versions of Docutils, equivalent files containing
 273 character entity set substitution definitions using the "unicode_"
 274 directive `are available`_.  Please read the `description and
 275 instructions`_ for use.  Thanks to David Priest for the original idea.
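
     A single character can also be defined by hand with the "unicode_"
     directive.  This is a minimal sketch, not a copy of the distributed
     files; the substitution name ``|yen|`` mirrors the example above,
     and U+00A5 is the Unicode code point of the yen sign::

         .. |yen| unicode:: U+00A5 .. yen sign

         |yen| 600 for a complete meal?  That's cheap!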
 276
 277 If you insist on using XML-style charents, you'll have to implement a
 278 pre-processing system to convert to UTF-8 or something.  That
 279 introduces complications though; you can no longer *write* about
 280 charents naturally; instead of writing "&mdash;" you'd have to write
 281 "&amp;mdash;".
 282
 283 For the common case of long dashes, you might also want to insert the
 284 following substitution definitions into your document (both of them are
 285 using the "unicode_" directive)::
 286
 287     .. |--| unicode:: U+2013   .. en dash
 288     .. |---| unicode:: U+2014  .. em dash, trimming surrounding whitespace
 289        :trim:
 290
 291 .. |--| unicode:: U+2013   .. en dash
 292 .. |---| unicode:: U+2014  .. em dash, trimming surrounding whitespace
 293    :trim:
 294
 295 Now you can write dashes using pure ASCII: "``foo |--| bar; foo |---|
 296 bar``", rendered as "foo |--| bar; foo |---| bar".  (Note that Mozilla
 297 and Firefox may render this incorrectly.)  The ``:trim:`` option for
 298 the em dash is necessary because you cannot write "``foo|---|bar``";
 299 thus you need to add spaces ("``foo |---| bar``") and advise the
 300 reStructuredText parser to trim the spaces.
 301
 302 .. _Standard Substitution Definition Sets:
 303    http://docutils.sf.net/docs/ref/rst/substitutions.html
 304 .. _unicode:
 305    http://docutils.sf.net/docs/ref/rst/directives.html#unicode-character-codes
 306 .. _are available: http://docutils.sourceforge.net/tmp/charents/
 307 .. _tarball: http://docutils.sourceforge.net/tmp/charents.tgz
 308 .. _description and instructions:
 309    http://docutils.sourceforge.net/tmp/charents/README.html
 310 .. _to-do list: http://docutils.sourceforge.net/docs/dev/todo.html
 311
 312
 313 How can I generate backticks using a Scandinavian keyboard?
 314 -----------------------------------------------------------
 315
 316 The use of backticks in reStructuredText is a bit awkward with
 317 Scandinavian keyboards, where the backtick is a "dead" key.  To get
 318 one ` character one must press SHIFT-` + SPACE.
 319
 320 Unfortunately, with all the variations out there, there's no way to
 321 please everyone.  For Scandinavian programmers and technical writers,
 322 this is not limited to reStructuredText but affects many languages and
 323 environments.
 324
 325 Possible solutions include:
 326
 327 * If you have to input a lot of backticks, simply type one in the
 328   normal/awkward way, select it, copy and then paste the rest (CTRL-V
 329   is a lot faster than SHIFT-` + SPACE).
 330
 331 * Use keyboard macros.
 332
 333 * Remap the keyboard.  The Scandinavian keyboard layout is awkward for
 334   other programming/technical characters too; for example, []{}
 335   etc. are a bit awkward compared to US keyboards.
 336
 337   According to Axel Kollmorgen,
 338
 339       Under Windows, you can use the `Microsoft Keyboard Layout Creator
 340       <http://www.microsoft.com/globaldev/tools/msklc.mspx>`__ to easily
 341       map the backtick key to a real backtick (no dead key). took me
 342       five minutes to load my default (german) keyboard layout, untick
 343       "Dead Key?" from the backtick key properties ("in all shift
 344       states"), "build dll and setup package", install the generated
 345       .msi, and add my custom keyboard layout via Control Panel >
 346       Regional and Language Options > Languages > Details > Add
 347       Keyboard layout (and setting it as default "when you start your
 348       computer").
 349
 350 * Use a virtual/screen keyboard or character palette, such as:
 351
 352   - `Web-based keyboards <http://keyboard.lab.co.il/>`__ (IE only,
 353     unfortunately).
 354   - Windows: `Click-N-Type <http://www.lakefolks.org/cnt/>`__.
 355   - Mac OS X: the Character Palette can store a set of favorite
 356     characters for easy input.  Open System Preferences,
 357     International, Input Menu tab, enable "Show input menu in menu
 358     bar", and be sure that Character Palette is enabled in the list.
 359
 360 If anyone knows of other/better solutions, please `let us know`_.
 361
 362
 363 Are there any tools for HTML/XML-to-reStructuredText?  (Round-tripping)
 364 -----------------------------------------------------------------------
 365
 366 People have tossed the idea around, and some implementations of
 367 reStructuredText-generating tools can be found in the `Docutils Link
 368 List`_.
 369
 370 There's no reason why reStructuredText should not be round-trippable
 371 to/from XML; any technicalities which prevent round-tripping would be
 372 considered bugs.  Whitespace would not be identical, but paragraphs
 373 shouldn't suffer.  The tricky parts would be the smaller details, like
 374 links and IDs and other bookkeeping.
 375
 376 For HTML, true round-tripping may not be possible.  Even adding lots
 377 of extra "class" attributes may not be enough.  A "simple HTML" to RST
 378 filter is possible -- for some definition of "simple HTML" -- but HTML
 379 is used as dumb formatting so much that such a filter may not be
 380 particularly useful.  An 80/20 approach should work though: build a
 381 tool that does 80% of the work automatically, leaving the other 20%
 382 for manual tweaks.
 383
 384 .. _Docutils Link List: docs/user/links.html
 385
 386
 387 Are there any Wikis that use reStructuredText syntax?
 388 -----------------------------------------------------
 389
 390 There are several, with various degrees of completeness.  With no
 391 implied endorsement or recommendation, and in no particular order:
 392
 393 * `Webware for Python wiki
 394   <http://wiki.webwareforpython.org/thiswiki.html>`__
 395 * `Ian Bicking's experimental code
 396   <http://docutils.sf.net/sandbox/ianb/wiki/Wiki.py>`__
 397 * `MoinMoin <http://moinmoin.wikiwikiweb.de/>`__ has some support;
 398   `here's a sample <http://moinmoin.wikiwikiweb.de/RestSample>`__
 399 * Zope-based `Zwiki <http://zwiki.org/>`__
 400 * Zope3-based Zwiki (in the Zope 3 source tree as ``zope.products.zwiki``)
 401 * `StikiWiki <http://mithrandr.moria.org/code/stikiwiki/>`__
 402 * `Trac <http://projects.edgewall.com/trac/>`__ `supports using reStructuredText
 403   <http://projects.edgewall.com/trac/wiki/WikiRestructuredText>`__ as an
 404   alternative to wiki markup.  This includes support for `TracLinks
 405   <http://projects.edgewall.com/trac/wiki/TracLinks>`__ from within RST
 406   text via a custom RST reference directive or, even easier, an
 407   interpreted text role, "trac".
 408 * `Vogontia <http://www.ososo.de/vogontia/>`__, a Wiki-like FAQ system
 409
 410 Please `let us know`_ of any other reStructuredText Wikis.
 411
 412 The example application for the `Web Framework Shootout
 413 <http://colorstudy.com/docs/shootout.html>`__ article is a Wiki using
 414 reStructuredText.
 415
 416
 417 Are there any Weblog (Blog) projects that use reStructuredText syntax?
 418 ----------------------------------------------------------------------
 419
 420 With no implied endorsement or recommendation, and in no particular
 421 order:
 422
 423 * `Firedrop <http://www.voidspace.org.uk/python/firedrop2/>`__
 424 * `Python Desktop Server <http://pyds.muensterland.org/>`__
 425 * `PyBlosxom <http://roughingit.subtlehints.net/pyblosxom/>`__
 426 * `Lino WebMan <http://lino.sourceforge.net/webman.html>`__
 427
 428 Please `let us know`_ of any other reStructuredText Blogs.
 429
 430
 431 Can lists be indented without generating block quotes?
 432 ------------------------------------------------------
 433
 434 Some people like to write lists with indentation, without intending a
 435 block quote context, like this::
 436
 437     paragraph
 438
 439       * list item 1
 440       * list item 2
 441
 442 There has been a lot of discussion about this, but there are some
 443 issues that would need to be resolved before it could be implemented.
 444 There is a summary of the issues and pointers to the discussions in
 445 `the to-do list`__.
 446
 447 __ http://docutils.sourceforge.net/docs/dev/todo.html#indented-lists
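
     Until then, the list can be written flush with the surrounding
     paragraph, which yields a plain bullet list without a block quote
     (a minimal sketch)::

         paragraph

         * list item 1
         * list item 2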
 448
 449
 450 Could the requirement for blank lines around lists be relaxed?
 451 --------------------------------------------------------------
 452
 453 Short answer: no.
 454
 455 In reStructuredText, it would be impossible to unambiguously mark up
 456 and parse lists without blank lines before and after.  Deeply nested
 457 lists may look ugly with so many blank lines, but it's a price we pay
 458 for unambiguous markup.  Some other plaintext markup systems do not
 459 require blank lines in nested lists, but they have to compromise
 460 somehow, either accepting ambiguity or requiring extra complexity.
 461 For example, `Epytext <http://epydoc.sf.net/epytext.html#list>`__ does
 462 not require blank lines around lists, but it does require that lists
 463 be indented and that ambiguous cases be escaped.
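
     As an illustration (a sketch, not taken from the specification), a
     nested list needs blank lines where it begins and ends, while blank
     lines between sibling items remain optional::

         * item 1

           * nested item 1.1
           * nested item 1.2

         * item 2

     The ambiguity shows in wrapped prose such as the following, where
     the second line starts with a hyphen but is meant as part of the
     paragraph.  Because no blank line precedes it, the parser does not
     start a list there::

         The difference is 10
         - 3, which leaves 7.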
 464
 465
 466 How can I include mathematical equations in documents?
 467 ------------------------------------------------------
 468
 469 There is no elegant built-in way, yet.  There are several ideas, but
 470 no obvious winner.  This issue requires a champion to solve the
 471 technical and aesthetic issues and implement a generic solution.
 472 Here's the `to-do list entry`__.
 473
 474 __ http://docutils.sourceforge.net/docs/dev/todo.html#math-markup
 475
 476 There are several quick & dirty ways to include equations in documents.
 477 They all presently use LaTeX syntax or dialects of it.
 478
 479 * For LaTeX output, nothing beats raw LaTeX math.  A simple way is to
 480   use the `raw directive`_::
 481
 482       .. raw:: latex
 483
 484           \[ x^3 + 3x^2a + 3xa^2 + a^3, \]
 485
 486   For inline math you could use substitutions of the raw directive but
 487   the recently added `raw role`_ is more convenient.  You must define a
 488   custom role based on it once in your document::
 489
 490       .. role:: raw-latex(raw)
 491           :format: latex
 492
 493   and then you can just use the new role in your text::
 494
 495       the binomial expansion of :raw-latex:`$(x+a)^3$` is
 496
 497   .. _raw directive: http://docutils.sourceforge.net/docs/ref/rst/
 498                      directives.html#raw-data-pass-through
 499   .. _raw role: http://docutils.sourceforge.net/docs/ref/rst/roles.html#raw
 500
 501 * Jens Jørgen Mortensen has implemented a "latex-math" role and
 502   directive, available from `his sandbox`__.
 503
 504   __ http://docutils.sourceforge.net/sandbox/jensj/latex_math/
 505
 506 * For HTML, the "Right" W3C-standard way to include math is MathML_.
 507   Unfortunately, its rendering is still quite broken (or absent) in many
 508   browsers, but it's getting better.  Another serious problem is that
 509   writing or reading raw MathML is *really* painful for humans, so
 510   embedding it in a reST document with the raw directive would defeat
 511   reST's goals of readability and editability (see an `example of raw
 512   MathML <http://sf.net/mailarchive/message.php?msg_id=2177102>`__).
 513
 514   A much less painful way to generate HTML+MathML is to use itex2mml_ to
 515   convert a dialect of LaTeX syntax to presentation MathML.  Here is an
 516   example of potential `itex math markup
 517   <http://article.gmane.org/gmane.text.docutils.user/118>`__.  The
 518   simplest way to use it is to add ``html`` to the format lists for the
 519   raw directive/role and postprocess the resulting document with
 520   itex2mml.  This way you can *generate LaTeX and HTML+MathML from the
 521   same source*, but you must limit yourself to the intersection of LaTeX
 522   and itex markups for this to work.  Alan G. Isaac wrote a detailed
 523   HOWTO_ for this approach.
 524
 525   .. _MathML: http://www.w3.org/Math/
 526   .. _itex2mml: http://pear.math.pitt.edu/mathzilla/itex2mml.html
 527   .. _HOWTO: http://www.american.edu/econ/itex2mml/mathhack.rst
 528
 529 * The other way to add math to HTML is to use images of the equations,
 530   typically generated by TeX.  This is inferior to MathML in the long
 531   term but is perhaps more accessible nowadays.
 532
 533   Of course, doing it by hand is too much work.  Beni Cherniavsky has
 534   written some `preprocessing scripts`__ for converting the
 535   ``texmath`` role/directive into images for HTML output and raw
 536   directives/substitutions for LaTeX output.  This way you can *generate
 537   LaTeX and HTML+images from the same source*.  `Instructions here`__.
 538
 539   __ http://docutils.sourceforge.net/sandbox/cben/rolehack/
 540   __ http://docutils.sourceforge.net/sandbox/cben/rolehack/README.html
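
     The substitution approach mentioned in the first item above can be
     sketched as follows.  The substitution name ``|binomial|`` is made
     up for illustration::

         .. |binomial| raw:: latex

            $(x+a)^3$

         The binomial expansion of |binomial| is ...

     For the itex2mml route, the format list simply names both output
     formats.  This sketch uses a hypothetical role name, ``raw-math``,
     and assumes the HTML output is postprocessed with itex2mml
     afterwards and that the markup stays within what both LaTeX and
     itex accept::

         .. role:: raw-math(raw)
            :format: latex html

         .. raw:: latex html

            \[ x^3 + 3x^2a + 3xa^2 + a^3 \]

         The binomial expansion of :raw-math:`$(x+a)^3$` is shown above.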
 541
 542
 543 Is nested inline markup possible?
 544 ---------------------------------
 545
 546 Not currently, no.  It's on the `to-do list`__ (`details here`__), and
 547 hopefully will be part of the reStructuredText parser soon.  At that
 548 time, markup like this will become possible::
 549
 550     Here is some *emphasized text containing a `hyperlink`_ and
 551     ``inline literals``*.
 552
 553 __ http://docutils.sf.net/docs/dev/todo.html#nested-inline-markup
 554 __ http://docutils.sf.net/docs/dev/rst/alternatives.html#nested-inline-markup
 555
 556 There are workarounds, but they are either convoluted or ugly or both.
 557 They are not recommended.
 558
 559 * Inline markup can be combined with hyperlinks using `substitution
 560   definitions`__ and references__ with the `"replace" directive`__.
 561   For example::
 562
 563       Here is an |emphasized hyperlink|_.
 564
 565       .. |emphasized hyperlink| replace:: *emphasized hyperlink*
 566       .. _emphasized hyperlink: http://example.org
 567
 568   It is not possible for just a portion of the replacement text to be
 569   a hyperlink; it's the entire replacement text or nothing.
 570
 571   __ http://docutils.sf.net/docs/ref/rst/restructuredtext.html#substitution-definitions
 572   __ http://docutils.sf.net/docs/ref/rst/restructuredtext.html#substitution-references
 573   __ http://docutils.sf.net/docs/ref/rst/directives.html#replace
 574
 575 * The `"raw" directive`__ can be used to insert raw HTML into HTML
 576   output::
 577
 578       Here is some |stuff|.
 579
 580       .. |stuff| raw:: html
 581
 582          <em>emphasized text containing a
 583          <a href="http://example.org">hyperlink</a> and
 584          <tt>inline literals</tt></em>
 585
 586   Raw LaTeX is supported for LaTeX output, etc.
 587
 588   __ http://docutils.sf.net/docs/ref/rst/directives.html#raw
 589
 590
 591 How to indicate a line break or a significant newline?
 592 ------------------------------------------------------
 593
 594 `Line blocks`__ are designed for address blocks, verse, and other
 595 cases where line breaks are significant and must be preserved.  Unlike
 596 literal blocks, the typeface is not changed, and inline markup is
 597 recognized.  For example::
 598
 599     | A one, two, a one two three four
 600     |
 601     | Half a bee, philosophically,
 602     |     must, *ipso facto*, half not be.
 603     | But half the bee has got to be,
 604     |     *vis a vis* its entity.  D'you see?
 605     |
 606     | But can a bee be said to be
 607     |     or not to be an entire bee,
 608     |         when half the bee is not a bee,
 609     |             due to some ancient injury?
 610     |
 611     | Singing...
 612
 613 __ http://docutils.sf.net/docs/ref/rst/restructuredtext.html#line-blocks
 614
 615 Here's a workaround for manually inserting explicit line breaks in
 616 HTML output::
 617
 618     .. |br| raw:: html
 619
 620        <br />
 621
 622     I want to break this line here: |br| this is after the break.
 623
 624     If the extra whitespace bothers you, |br|\ backslash-escape it.
 625
 626
 627 A URL containing asterisks doesn't work.  What to do?
 628 -----------------------------------------------------
 629
 630 Asterisks are valid URL characters (see :RFC:`2396`), sometimes used
 631 in URLs.  For example::
 632
 633     http://cvs.example.org/viewcvs.py/*checkout*/module/file
 634
 635 Unfortunately, the parser thinks the asterisks are indicating
 636 emphasis.  The slashes serve as delineating punctuation, allowing the
 637 asterisks to be recognized as markup.  The example above is separated
 638 by the parser into a truncated URL, an emphasized word, and some
 639 regular text::
 640
 641     http://cvs.example.org/viewcvs.py/
 642     *checkout*
 643     /module/file
 644
 645 To turn off markup recognition, use a backslash to escape at least the
 646 first asterisk, like this::
 647
 648     http://cvs.example.org/viewcvs.py/\*checkout*/module/file
 649
 650 Escaping the second asterisk doesn't hurt, but it isn't necessary.
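
     Another option, sketched here under the assumption that a named
     link is acceptable in place of a bare URL, is to give the URL as an
     embedded URI in a hyperlink reference.  Inline markup is not nested,
     so the asterisks inside the reference are not read as emphasis.  The
     link text "checked-out file" is made up for illustration::

         `checked-out file
         <http://cvs.example.org/viewcvs.py/*checkout*/module/file>`_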
 651
 652
 653 How can I make a literal block with *some* formatting?
 654 ------------------------------------------------------
 655
 656 Use the `parsed-literal`_ directive.
 657
 658 .. _parsed-literal: docs/ref/rst/directives.html#parsed-literal
 659
 660 Scenario: a document contains some source code, which calls for a
 661 literal block to preserve linebreaks and whitespace.  But part of the
 662 source code should be formatted, for example as emphasis or as a
 663 hyperlink.  This calls for a *parsed* literal block::
 664
 665     .. parsed-literal::
 666
 667        print("Hello world!")  # *tricky* code [1]_
 668
 669 The emphasis (``*tricky*``) and footnote reference (``[1]_``) will be
 670 parsed.
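
     A slightly fuller sketch also supplies the footnote and adds a
     hyperlink.  The function name ``add`` and the target name and URL
     are hypothetical::

         .. parsed-literal::

            total = add(2, 3)  # *tricky* code [1]_, see the `add docs`_

         .. [1] The footnote text goes here.
         .. _add docs: http://example.org/add.html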
 671
 672
 673 Can reStructuredText be used for web or generic templating?
 674 -----------------------------------------------------------
 675
 676 Docutils and reStructuredText can be used with or as a component of a
 677 templating system, but they do not themselves include templating
 678 functionality.  Templating should simply be left to dedicated
 679 templating systems.  Users can choose a templating system to apply to
 680 their reStructuredText documents as best serves their interests.
 681
 682 There are many good templating systems for Python (ht2html_, YAPTU_,
 683 Quixote_'s PTL, Cheetah_, etc.; see this non-exhaustive list of `some
 684 other templating systems`_), and many more for other languages, each
 685 with different approaches.  We invite you to try several and find one
 686 you like.  If you adapt it to use Docutils/reStructuredText, please
 687 consider contributing the code to Docutils or `let us know`_ and we'll
 688 keep a list here.
 689
 690 One reST-specific web templating system is `rest2web
 691 <http://www.voidspace.org.uk/python/rest2web>`_, a tool for
 692 automatically building websites, or parts of websites.
 693
 694 .. _ht2html: http://ht2html.sourceforge.net/
 695 .. _YAPTU:
 696    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52305
 697 .. _Quixote: http://www.mems-exchange.org/software/quixote/
 698 .. _Cheetah: http://www.cheetahtemplate.org/
 699 .. _some other templating systems:
 700    http://webware.sourceforge.net/Papers/Templates/
 701
 702
 703 HTML Writer
 704 ===========
 705
 706 What is the status of the HTML Writer?
 707 --------------------------------------
 708
 709 The HTML Writer module, ``docutils/writers/html4css1.py``, is a
 710 proof-of-concept reference implementation.  While it is a complete
 711 implementation, some aspects of the HTML it produces may be
 712 incompatible with older browsers or specialized applications (such as
 713 web templating).  Alternate implementations are welcome.
 714
 715
 716 What kind of HTML does it produce?
 717 ----------------------------------
 718
 719 It produces XHTML compatible with the `XHTML 1.0`_ specification.  A
 720 cascading stylesheet is required for proper viewing with a modern
 721 graphical browser.  Correct rendering of the HTML produced depends on
 722 the CSS support of the browser.  A general-purpose stylesheet,
 723 ``html4css1.css``, is provided with the code and is embedded by
 724 default.  It is installed in the "writers/support/" subdirectory
 725 within the Docutils package.  Use the ``--help`` command-line option
 726 to see the specific location on your machine.
 727
 728 .. _XHTML 1.0: http://www.w3.org/TR/xhtml1/
 729
 730
 731 What browsers are supported?
 732 ----------------------------
 733
 734 No specific browser is targeted; all modern graphical browsers should
 735 work.  Some older browsers, text-only browsers, and browsers without
 736 full CSS support are known to produce inferior results.  Firefox,
 737 Safari, Mozilla (version 1.0 and up), and MS Internet Explorer
 738 (version 5.0 and up) are known to give good results.  Reports of
 739 experiences with other browsers are welcome.
 740
 741
 742 Unexpected results from tools/rst2html.py: H1, H1 instead of H1, H2.  Why?
 743 --------------------------------------------------------------------------
 744
 745 Here's the question in full:
 746
 747     I have this text::
 748
 749         Heading 1
 750         =========
 751
 752         All my life, I wanted to be H1.
 753
 754         Heading 1.1
 755         -----------
 756
 757         But along came H1, and so shouldn't I be H2?
 758         No!  I'm H1!
 759
 760         Heading 1.1.1
 761         *************
 762
 763         Yeah, imagine me, I'm stuck at H3!  No?!?
 764
 765     When I run it through tools/rst2html.py, I get unexpected results
 766     (below).  I was expecting H1, H2, then H3; instead, I get H1, H1,
 767     H2::
 768
 769         ...
 770         <html lang="en">
 771         <head>
 772         ...
 773         <title>Heading 1</title>
 774         </head>
 775         <body>
 776         <div class="document" id="heading-1">
 777         <h1 class="title">Heading 1</h1>                <-- first H1
 778         <p>All my life, I wanted to be H1.</p>
 779         <div class="section" id="heading-1-1">
 780         <h1><a name="heading-1-1">Heading 1.1</a></h1>        <-- H1
 781         <p>But along came H1, and so now I must be H2.</p>
 782         <div class="section" id="heading-1-1-1">
 783         <h2><a name="heading-1-1-1">Heading 1.1.1</a></h2>
 784         <p>Yeah, imagine me, I'm stuck at H3!</p>
 785         ...
 786
 787     What gives?
 788
 789 Check the "class" attribute on the H1 tags, and you will see a
 790 difference.  The first H1 is actually ``<h1 class="title">``; this is
 791 the document title, and the default stylesheet renders it centered.
 792 There can also be an ``<h2 class="subtitle">`` for the document
 793 subtitle.
 794
 795 If there's only one highest-level section title at the beginning of a
 796 document, it is treated specially, as the document title.  (Similarly, a
 797 lone second-highest-level section title may become the document
 798 subtitle.)  See `How can I indicate the document title?  Subtitle?`_ for
 799 details.  Rather than use a plain H1 for the document title, we use ``<h1
 800 class="title">`` so that we can use H1 again within the document.  Why
 801 do we do this?  HTML only has H1-H6, so by making H1 do double duty, we
 802 effectively reserve these tags to provide 6 levels of heading beyond the
 803 single document title.
 804
 805 HTML is used here purely as a presentation format for final display.
 806 A stylesheet *is required*, and one is provided; see `What kind of
 807 HTML does it produce?`_ above.  Of course, you're welcome to roll your
 808 own.  The default stylesheet provides rules to format ``<h1
 809 class="title">`` and ``<h2 class="subtitle">`` differently from
 810 ordinary ``<h1>`` and ``<h2>``::
 811
 812     h1.title {
 813       text-align: center }
 814
 815     h2.subtitle {
 816       text-align: center }
 817
 818 If you don't want the top section heading to be interpreted as a
 819 title at all, disable the `doctitle_xform`_ setting
 820 (``--no-doc-title`` option).  This will interpret your document
 821 differently from the standard settings, which might not be a good
 822 idea.  If you don't like the reuse of the H1 in the HTML output, you
 823 can tweak the `initial_header_level`_ setting
 824 (``--initial-header-level`` option) -- but unless you match its value
 825 to your specific document, you might end up with bad HTML (e.g. H3
 826 without H2).
 827
 828 .. _doctitle_xform:
 829    http://docutils.sourceforge.net/docs/user/config.html#doctitle-xform
 830 .. _initial_header_level:
 831    http://docutils.sourceforge.net/docs/user/config.html#initial-header-level
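
Both settings can also be supplied from Python.  The following is a
minimal sketch, assuming a recent Docutils installation in which
``docutils.core.publish_parts()`` accepts a ``settings_overrides``
dictionary with the keys ``doctitle_xform`` and
``initial_header_level``; the sample document and variable names are
illustrative only.

```python
from docutils.core import publish_parts

SOURCE = """\
Heading 1
=========

All my life, I wanted to be H1.

Heading 1.1
-----------

But along came H1.
"""

# Default settings: the lone top-level section title becomes the
# document title, rendered as <h1 class="title">.
default_body = publish_parts(
    source=SOURCE, writer_name="html4css1")["html_body"]

# Assumed overrides: keep the first heading as an ordinary section
# title and start section headings at <h2> instead of <h1>.
shifted_body = publish_parts(
    source=SOURCE,
    writer_name="html4css1",
    settings_overrides={
        "doctitle_xform": False,
        "initial_header_level": 2,
    },
)["html_body"]
```

With these overrides the two headings should come out as H2 and H3.
As noted above, a value that does not match the document can still
leave gaps in the heading hierarchy.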
 832
 833 (Thanks to Mark McEahern for the question and much of the answer.)
 834
 835
 836 Why do enumerated lists only use numbers (no letters or roman numerals)?
 837 ------------------------------------------------------------------------
 838
 839 The rendering of enumerators (the numbers or letters acting as list
 840 markers) is governed entirely by the stylesheet.  If you see only
 841 numbers, either the browser can't find the stylesheet (try the
 842 ``--embed-stylesheet`` option), or it can't interpret it (try a recent
 843 Firefox, Mozilla, Konqueror, Opera, Safari, or even MSIE).
 844
 845
 846 There appear to be garbage characters in the HTML.  What's up?
 847 --------------------------------------------------------------
 848
 849 What you're seeing is most probably not garbage, but the result of a
 850 mismatch between the actual encoding of the HTML output and the
 851 encoding your browser is expecting.  Your browser is misinterpreting
 852 the HTML data, which is encoded text.  A discussion of text encodings
 853 is beyond the scope of this FAQ; see one or more of these documents
 854 for more info:
 855
 856 * `UTF-8 and Unicode FAQ for Unix/Linux
 857   <http://www.cl.cam.ac.uk/~mgk25/unicode.html>`_
 858
 859 * Chapters 3 and 4 of `Introduction to i18n [Internationalization]
 860   <http://www.debian.org/doc/manuals/intro-i18n/>`_
 861
 862 * `Python Unicode Tutorial
 863   <http://www.reportlab.com/i18n/python_unicode_tutorial.html>`_
 864
 865 * `Python Unicode Objects: Some Observations on Working With Non-ASCII
 866   Character Sets <http://effbot.org/zone/unicode-objects.htm>`_
 867
 868 This most commonly happens with the default output encoding (UTF-8)
 869 when the document uses numbered sections (via the "sectnum_"
 870 directive) or symbol footnotes.  Three non-breaking spaces are inserted
 871 in each numbered section title, between the generated number and the
 872 title text.  Most footnote symbols are not available in ASCII, nor are
 873 non-breaking spaces.  When encoded with UTF-8 and viewed with ordinary
 874 ASCII tools, these characters will appear to be multi-character garbage.
 875
 876 You may have a decoding problem in your browser (or editor, etc.).
 877 The encoding of the output is set to "utf-8", but your browser isn't
 878 recognizing that.  You can either try to fix your browser (enable
 879 "UTF-8 character set", sometimes called "Unicode"), or choose a
 880 different encoding for the HTML output.  You can also try
 881 ``--output-encoding=ascii:xmlcharrefreplace`` for HTML (not applicable
 882 to non-XMLish outputs).
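
The same choice can be made when calling Docutils from Python.  This is
a rough sketch, assuming that ``docutils.core.publish_string()`` takes
the encoding and its error handler as two separate settings
(``output_encoding`` and ``output_encoding_error_handler``) rather than
the combined ``ascii:xmlcharrefreplace`` form used on the command line.
The sample text is made up for illustration.

```python
from docutils.core import publish_string

SOURCE = "A café with a non-ASCII character.\n"

# Rough equivalent of --output-encoding=ascii:xmlcharrefreplace,
# assuming the encoding and its error handler are given as two
# separate settings when called from Python.
html = publish_string(
    source=SOURCE,
    writer_name="html4css1",
    settings_overrides={
        "output_encoding": "ascii",
        "output_encoding_error_handler": "xmlcharrefreplace",
    },
)
```

Characters that ASCII cannot represent should then appear as numeric
character references, which browsers display correctly regardless of
the encoding they assume.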
 883
 884 If you're generating document fragments, the "Content-Type" metadata
 885 (between the HTML ``<head>`` and ``</head>`` tags) must agree with the
 886 encoding of the rest of the document.  For UTF-8, it should be::
 887
 888     <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
 889
 890 Also, Docutils normally generates an XML declaration as the first line
 891 of the output.  It must also match the document encoding.  For UTF-8::
 892
 893     <?xml version="1.0" encoding="utf-8" ?>
 894
 895 .. _sectnum:
 896    http://docutils.sourceforge.net/docs/ref/rst/directives.html#sectnum
 897
 898
 899 How can I retrieve the body of the HTML document?
 900 -------------------------------------------------
 901
 902 (This is usually needed when using Docutils in conjunction with a
 903 templating system.)
 904
 905 You can use the `docutils.core.publish_parts()`_ function, which
 906 returns a dictionary containing an 'html_body_' entry.
 907
 908 .. _docutils.core.publish_parts():
 909    docs/api/publisher.html#publish-parts
 910 .. _html_body:
 911    docs/api/publisher.html#html-body
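
As a minimal sketch of that approach, assuming Docutils is importable
and the ``html4css1`` writer is selected by name: the ``TEMPLATE``
string below is a hypothetical stand-in for a real templating system.

```python
from docutils.core import publish_parts

SOURCE = """\
Hello
=====

Some *emphasized* text.
"""

# Hypothetical page template standing in for a real templating system.
TEMPLATE = '<html><body><div id="content">{body}</div></body></html>'

parts = publish_parts(source=SOURCE, writer_name="html4css1")

# 'html_body' holds the rendered document without the surrounding
# <html>, <head>, and <body> elements.
body = parts["html_body"]
page = TEMPLATE.format(body=body)
```

Because the fragment has no ``<head>`` of its own, the enclosing page
is responsible for the stylesheet and the encoding declaration.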
 912
 913
 914 Why is the Docutils XHTML served as "Content-type: text/html"?
 915 --------------------------------------------------------------
 916
 917 Full question:
 918
 919     Docutils' HTML output looks like XHTML and is advertised as such::
 920
 921       <?xml version="1.0" encoding="utf-8" ?>
 922       <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
 923        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 924
 925     But this is followed by::
 926
 927       <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
 928
 929     Shouldn't this be "application/xhtml+xml" instead of "text/html"?
 930
 931 In a perfect web, the Docutils XHTML output would be 100% strict
 932 XHTML.  But it's not a perfect web, and a major source of imperfection
 933 is Internet Explorer.  Despite its drawbacks, IE still represents the
 934 majority of web browsers, and cannot be ignored.
 935
 936 Short answer: if we didn't serve XHTML as "text/html" (which is a
 937 perfectly valid thing to do), it couldn't be viewed in Internet
 938 Explorer.
 939
 940 Long answer: see the `"Criticisms of Internet Explorer" Wikipedia
 941 entry <http://en.wikipedia.org/wiki/Criticisms_of_Internet_Explorer#XHTML>`__.
 942
 943 However, there's also `Sending XHTML as text/html Considered
 944 Harmful`__.  What to do, what to do?  We're damned no matter what we
 945 do.  So we'll continue to do the practical instead of the pure:
 946 support the browsers that are actually out there, and not fight for
 947 strict standards compliance.
 948
 949 __ http://hixie.ch/advocacy/xhtml
 950
 951 (Thanks to Martin F. Krafft, Robert Kern, Michael Foord, and Alan
 952 G. Isaac.)
 953
 954
 955 Python Source Reader
 956 ====================
 957
 958 Can I use Docutils for Python auto-documentation?
 959 -------------------------------------------------
 960
 961 Yes, in conjunction with other projects.
 962
 963 Docstring extraction functionality from within Docutils is still under
 964 development.  A mostly complete source code parsing module exists in
 965 ``docutils/readers/python/moduleparser.py``.  We do plan to finish it
 966 eventually.  Ian Bicking wrote an initial front end for the
 967 moduleparser.py module, in sandbox/ianb/extractor/extractor.py.  Ian
 968 also did some work on the Python Source Reader
 969 (docutils.readers.python) component at PyCon DC 2004.
 970
 971 Version 2.0 of Ed Loper's `Epydoc <http://epydoc.sourceforge.net/>`_
 972 supports reStructuredText-format docstrings for HTML output.  Docutils
 973 0.3 or newer is required.  Development of a Docutils-specific
 974 auto-documentation tool will continue.  Epydoc works by importing
 975 Python modules to be documented, whereas the Docutils-specific tool,
 976 described above, will parse modules without importing them (as with
 977 `HappyDoc <http://happydoc.sourceforge.net/>`_, which doesn't support
 978 reStructuredText).
 979
 980 The advantages of parsing over importing are security and flexibility;
 981 the disadvantage is complexity/difficulty.
 982
 983 * Security: untrusted code that shouldn't be executed can be parsed;
 984   importing a module executes its top-level code.
 985 * Flexibility: comments and unofficial docstrings (those not supported
 986   by Python syntax) can only be processed by parsing.
 987 * Complexity/difficulty: it's a lot harder to parse and analyze a
 988   module than it is to ``import`` and analyze one.
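
To illustrate the parsing approach in general terms, here is a minimal
sketch that uses Python's standard ``ast`` module.  It does not use
Docutils' own ``moduleparser.py``, whose interface is not described
here.  The function name ``extract_docstrings`` and the sample source
are hypothetical.  The module text is parsed but never executed, so the
``raise`` statement in it is harmless.

```python
import ast


def extract_docstrings(source):
    """Map each module, class, and function name to its docstring."""
    tree = ast.parse(source)  # parses the text; runs none of it
    docs = {"<module>": ast.get_docstring(tree)}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.ClassDef)):
            docs[node.name] = ast.get_docstring(node)
    return docs


# Sample "untrusted" module: importing it would raise immediately.
UNTRUSTED = '''\
"""Module docstring."""

raise RuntimeError("top-level code runs on import")


def greet(name):
    """Return a greeting for *name*."""
    return "Hello, " + name
'''

docs = extract_docstrings(UNTRUSTED)
```

Comments and attribute docstrings are not captured by this sketch;
handling them is part of what makes a full parser harder to write.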
 989
 990 For more details, please see "Docstring Extraction Rules" in `PEP
 991 258`_, item 3 ("How").
 992
 993
 994 Miscellaneous
 995 =============
 996
 997 Is the Docutils document model based on any existing XML models?
 998 ----------------------------------------------------------------
 999
1000 Not directly, no.  It borrows bits from DocBook, HTML, and others.  I
1001 (David Goodger) have designed several document models over the years,
1002 and have my own biases.  The Docutils document model is designed for
1003 simplicity and extensibility, and has been influenced by the needs of
1004 the reStructuredText markup.