.. readme.txt: introduction to the html4strict writer -*- rst-mode -*-

=====================================================
=====================================================

:Abstract: An HTML writer, generating `XHTML 1.1` for styling

Docutils' default HTML writer, ``docutils.writers.html4css1``, generates
output that conforms to the HTML 4.01 Transitional DTD and to the
Extensible HTML version 1.0 Transitional DTD (*almost* strict).

*Almost*, as it contains some deprecated constructs and "a minimum of
formatting information" in order to ensure correct display with deficient
but (at the time of creation) widespread browsers (mainly IE6).

A new HTML5 writer with most features of this writer may become part of

Goals of the `xhtml11 writer`:

* Strict standards compliance.

* Generate good-looking, readable, and accessible documents.

* Clear distinction of content and layout:

  + Clean HTML output without "hard-coded" visual markup,

  + extended configurability by CSS style sheets.

* `Graceful Degradation
  <http://www.anybrowser.org/campaign/abdesign.html#degradability>`__

* Best viewed with any (CSS2-conforming) HTML browser. [#]_

* Support scientific documents (numbering tables and figures, formal
  tables, ...). Cf. [markschenk]_.

.. [#] Tested with Firefox_, Midori_, Konqueror_ and Opera_. As Safari
   and Google Chrome use the same rendering engine as Midori and
   Konqueror (WebKit), they should work fine as well.

.. _firefox: http://www.mozilla.com
.. _opera: http://www.opera.com
.. _midori: http://www.twotoasts.de/index.php?/pages/midori_summary.html
.. _konqueror: http://konqueror.kde.org/

This writer is for you if you

* care much about standards compliance,

* care less about the rendering in non-compliant browsers,

* want extended CSS configurability.

The `<rst2xhtml.py>`_ front end reads standalone reStructuredText
source files and produces clean `XHTML 1.1`_
output. A CSS 2 stylesheet is required for proper rendering; a complete
sample stylesheet is provided.

:Parser: reStructuredText
:Writer: xhtml (xhtml11)

The front end can be called from the command line (when it is installed in

  rst2xhtml.py [options] [<source> [<destination>]]

The full usage text can be obtained with the ``--help`` option.

The front end `rst2xhtml.py`_ is also an example of programmatic use.

The writer module subclasses the ``html_plain.Writer`` and
``html_plain.HTMLTranslator`` classes and adds compatibility with the strict
requirements of `XHTML 1.1`_:

* There is no attribute "lang" (only "xml:lang").

* Enumerated lists don't support the 'start' attribute.

  The style sheet xhtml11.css_ adds support for a "start" value for
  enumerated lists via a CSS counter. This also allows nested enumeration.

* ``<sup>`` and ``<sub>`` tags are not allowed in preformatted blocks

The `math-output` configuration setting defaults to "MathML".

The `xhtml11.css <xhtml11/xhtml11.css>`_ style sheet extends the standard
layout for CSS2-conforming HTML browsers.

Changes to the html4css1 writer
-------------------------------

+ The output conforms to the XHTML version 1.1 DTD::

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
      "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN"
      "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd">

Docinfo and field lists based on definition lists (instead of tables)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+ Reduced loading time for documents with long field lists.

+ Enables CSS styling for:

  - label width (obsoleting the ``--field-name-limit`` option)
  - handling of long labels: truncate, wrap, ...
  - label separator (default: ':')
  - compact vs. open list

Class arguments for docinfo items
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Items in the docinfo list are passed class arguments specifying
their type to enable customizing the docinfo layout.

The default style sheet contains example definitions: author and date
are typeset centered and without label, if they occur as first docinfo

Footnotes and citations
~~~~~~~~~~~~~~~~~~~~~~~

+ Typeset as CSS-styled definition lists.

+ Collect adjacent footnotes/citations in one list.

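Collecting adjacent footnotes or citations in one list amounts to grouping
consecutive nodes of the same kind. The following is a minimal sketch of
that idea in plain Python, not the writer's actual code: the
``(kind, label, text)`` tuples, the function names and the class names in
the generated markup are assumptions made for this example.

.. code:: python

   from html import escape
   from itertools import groupby


   def group_adjacent(nodes):
       """Group runs of adjacent nodes that share the same kind.

       ``nodes`` is a list of (kind, label, text) tuples, a hypothetical
       stand-in for the document tree. Returns (kind, items) pairs where
       ``items`` is a list of (label, text) tuples.
       """
       return [(kind, [(label, text) for _, label, text in run])
               for kind, run in groupby(nodes, key=lambda node: node[0])]


   def render(nodes):
       """Emit one <dl> per run of footnotes/citations, <p> for the rest."""
       out = []
       for kind, items in group_adjacent(nodes):
           if kind in ("footnote", "citation"):
               out.append('<dl class="%s">' % kind)
               for label, text in items:
                   out.append("<dt>%s</dt><dd>%s</dd>"
                              % (escape(label), escape(text)))
               out.append("</dl>")
           else:
               out.extend("<p>%s</p>" % escape(text) for _, text in items)
       return "\n".join(out)

Two footnotes that follow each other directly end up in the same ``<dl>``;
a footnote separated from them by a paragraph starts a new one.
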
Counter for enumerated lists
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A CSS counter for `enumerated lists`_ replaces the deprecated "start"

.. _enumerated lists:
   ../../docutils/docs/ref/rst/restructuredtext.html#enumerated-lists

+ Enables CSS styling for:

  - label style (including nested numbers),

The complicated part was to find a correct CSS rule set that replicates the
standard behaviour with hanging indent (list-style: "outside"). There is a
`W3C example`_ to number nested list items; however, the result is similar
to 'list-style: inside': subsequent lines start below the label instead of a

Most Internet resources come to the conclusion that "there’s no
straightforward replacement mechanism" [tekkie]_, "the solution is
buried so deep in CSS2 that there's no point in trying to do it in CSS
for the foreseeable future" [webjunction]_, or "the main point to note
is that there is no direct mapping from the previous behaviour to CSS"
[codelair]_. `Taming Lists`_ did give valuable advice but no working

The common advice is "Use 'HTML 4.01 Transitional' and keep the START
attribute" [highdots]_, especially since "There are arguments over
whether the start attribute is presentational or not, and indeed HTML5
has declared that it is no longer deprecated in the current working
drafts (at the time of writing)" [dev.opera]_.

However, a reasonable replacement of 'outside'-styled ordered lists
with CSS is possible:

* The ordered list defines/resets the counter, the automatic numbering

    list-style-type: none ! important;

* The label is defined as a "before" pseudo-element. The content consists
  of the counter and a separator (by default a trailing dot)::

    counter-increment: item;
    content: counter(item) ".";

* The label is right-aligned in a box. Both the label and the list
  content (which Docutils puts in a paragraph node) must be displayed
  as "inline-block" so that they line up::

    display: inline-block;

    padding-right: 0.5em;

    ol > li > p { display: inline-block; }

  However, subsequent paragraphs are to be set as nested block

* The hanging indent is realized via a negative "text-indent"
  which must be reset for the list content to prevent over-striking::

    ol > li { text-indent: -2.5em; }
    ol > li > p { text-indent: 0em; }

The resulting list can be customized to a large extent:

* Different label types and separators are possible, e.g.::

    ol.lowerroman > li:before {
      content: "(" counter(item, lower-roman) ")";
    }

* nested counters (1, 1.1, 1.1.1, etc)::

    ol.nested > li:before {
      content: counters(item, ".") ". ";
    }

* chapter/section prefix, continued lists, ...

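To see which labels a rule like ``counters(item, ".") ". "`` produces, the
counter behaviour can be imitated in a few lines of Python. This is only an
illustration of the numbering scheme, assuming that every list resets the
``item`` counter and every list item increments it. The nested-list
representation and the name ``nested_labels`` are made up for this example.

.. code:: python

   def nested_labels(items, prefix=()):
       """Return (label, text) pairs for a nested list.

       ``items`` is a list whose entries are either a string (a plain
       list item) or a (string, sublist) tuple (an item with a nested
       list). Each nesting level starts a fresh counter, as
       ``counter-reset`` does; each item increments it, as
       ``counter-increment`` does.
       """
       result = []
       for number, item in enumerate(items, start=1):
           text, children = item if isinstance(item, tuple) else (item, [])
           path = prefix + (number,)
           # counters(item, ".") joins all active counters with "."
           result.append((".".join(map(str, path)) + ". ", text))
           result.extend(nested_labels(children, path))
       return result

For the list ``["a", ("b", ["c"]), "d"]`` this yields the labels ``1.``,
``2.``, ``2.1.`` and ``3.`` (each followed by a space).
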
.. _W3C example: http://www.w3.org/TR/CSS2/generate.html#counters
.. _taming lists: http://www.alistapart.com/articles/taminglists/

Inline literal role with ``pre-wrap``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In contrast to the html4css1 writer, runs of whitespace are not
replaced by ``&nbsp;`` entities (cf. bug #1938891).

Whitespace handling and wrapping are configured with the CSS
property ``white-space: pre-wrap``:

  Whitespace is preserved by the browser. Text will wrap when
  necessary, and on line breaks.

However, most browsers wrap on non-word chars, too, if set to wrap
at white-space. Text like "--an-option" or the regular expression
``[+]?(\d+(\.\d*)?|\.\d+)`` may be broken at the wrong place!
The setting ``white-space: pre;`` prevents this, but also
prevents wrapping at white space, contrary to the specification__.

In order to allow line-wrap at whitespace only,
words-with-non-word-chars are wrapped in ``<span>`` elements with
class "pre".

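The wrapping step can be sketched as follows. This is an illustration under
assumptions rather than the writer's code: the regular expression that
decides what counts as a word with non-word characters and the function
name ``wrap_literal`` are chosen for the example.

.. code:: python

   import re
   from html import escape

   # Assumption for this sketch: a token needs protection if it contains
   # a character that is neither a word character nor whitespace.
   _NON_WORD = re.compile(r"[^\w\s]")


   def wrap_literal(text):
       """Wrap whitespace-delimited tokens that contain non-word
       characters in <span class="pre">; whitespace runs are kept."""
       parts = []
       # The capturing group makes re.split() return the whitespace
       # runs as separate list entries, so they can be copied verbatim.
       for token in re.split(r"(\s+)", text):
           if token and not token.isspace() and _NON_WORD.search(token):
               parts.append('<span class="pre">%s</span>' % escape(token))
           else:
               parts.append(escape(token))
       return "".join(parts)

With a style rule that forbids breaks inside the spans, the browser should
then only be able to wrap at the whitespace left between them.
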
+ White-space handling in inline literals configurable with the CSS
  stylesheet. Possible values: ``normal, nowrap, pre, pre-wrap,

__ http://docutils.sf.net/docs/ref/rst/restructuredtext.html#inline-literals

Table styling with CSS
~~~~~~~~~~~~~~~~~~~~~~

+ No hard-coded border setting.

+ Pre-defined table styles selected by class arguments "borderless"
  and "booktabs" matching the interpretation in the latex2e writer.

SimpleListChecker also checks field-lists and docinfo
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Unified test if a list is compactable:

+ also works for nesting field-list in enumeration/bullet-list and

+ also test docinfo, as a field may contain more than one paragraph

Docutils-generated section numbers in a <span>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Instead of hard-coded formatting with trailing ``&nbsp;`` entities,
section numbers in section headings and the toc are placed in spans
with ``class='sectnum'`` and separated from the heading by a CSS rule.

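The resulting markup might look like the output of the following sketch. The
helper name ``heading`` and the choice of heading element are hypothetical;
only the ``sectnum`` class is taken from the description above.

.. code:: python

   from html import escape


   def heading(level, number, title):
       """Return a heading with the section number in a 'sectnum' span.

       No padding entities follow the number; the gap between number
       and title is left to a style sheet rule for the class.
       """
       return '<h%d><span class="sectnum">%s</span>%s</h%d>' % (
           level, escape(number), escape(title), level)

Calling ``heading(2, "1.2", "Usage")`` gives
``<h2><span class="sectnum">1.2</span>Usage</h2>``.
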
Omit redundant class arguments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Do not mark the first child with 'class="first"' and the last
child with 'class="last"' in definitions, table cells, field
bodies, option descriptions, and list items. Use the
``:first-child`` and ``:last-child`` selectors instead.

Language attribute name changed to 'xml:lang'
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The name of the language attribute changed from 'lang' in XHTML 1.0 to
'xml:lang' in XHTML 1.1. Documents using 'lang' do not validate.

The HTML4CSS1 writer does this to "produce visually compact lists
(less vertical whitespace)". This writer relies on CSS rules
for "visual compactness".

* The first list in the test `2.3. Enumerated Lists` should be

* Hanging indent for numbered section headings and ToC entries.

* Search stylesheets along standard path if enclosed in <>
  (like the RST syntax for include files).

* Validate output with "critical" cases not covered by
  the functional test (e.g. headings with level > 6).

* Move widely supported constructs to the html4css1 writer.

* Number sections with CSS if sectnum_xform is False.

* Footnotes and Citations (for footnotes see
  http://www.archiva.net/footnote/index.htm and
  http://www.xmlplease.com/footnotes

`Inside A Docutils Command-Line Front-End Tool
<http://docutils.sourceforge.net/docs/api/cmdline-tool.html>`_

`API Reference Material for Client-Developers
<http://docutils.sf.net/docs/index.html#api-api-reference-material-for-client-developers>`_

http://ilovetypography.com/2008/02/28/a-guide-to-web-typography/

http://webtypography.net/toc/

.. [tekkie]
   http://tekkie.flashbit.net/css/replacement-for-deprecated-ol-li-start-value-html-attributes,

.. [webjunction]
   http://lists.webjunction.org/wjlists/web4lib/2001-September/026413.html,

.. [codelair] http://www.doheth.co.uk/codelair/html-css/deprecated#start,

.. [highdots]
   http://www.highdots.com/forums/cascading-style-sheets/using-css-set-start-number-262555.html,

.. [dev.opera]
   http://dev.opera.com/articles/view/automatic-numbering-with-css-counters/,

.. [markschenk] `Publishing scientific documents with XHTML and CSS
   <http://www.markschenk.com/cssexp/publication/article.xml>`__

is a similar sandbox project, an HTML writer producing XHTML that
contains enough formatting information to be viewed without a
cascading style sheet by a lightweight HTML browser
(e.g. `Dillo <http://www.dillo.org>`__ or the console browser
`elinks <http://elinks.cz>`__).

`XHTML™ 1.1 - Module-based XHTML - Second Edition`,
W3C Recommendation, 23 November 2010.
http://www.w3.org/TR/xhtml11/