<appendix id="app-xml">
<?dbhtml filename="appb.html"?>
<appendixinfo>
<pubdate>$Date: 2001-08-02 18:27:50 +0800 (Thu, 02 Aug 2001) $</pubdate>
<releaseinfo>$Revision: 546 $</releaseinfo>
</appendixinfo>
<title>DocBook and &XML;</title>
<para>
<indexterm id="xmldocbookappb" class='startofrange'><primary>DocBook DTD</primary>
  <secondary>XML</secondary></indexterm>
<indexterm id="docbookxmlappa" class="startofrange"><primary>XML</primary>
  <secondary>DocBook and</secondary></indexterm>
<indexterm><primary>SGML</primary>
  <secondary>XML and</secondary></indexterm>
<indexterm><primary>XML</primary>
  <secondary>SGML, processing</secondary></indexterm>

&XML;, the <ulink url="http://www.w3.org/TR/REC-xml">Extensible
Markup Language</ulink>, is a simple dialect of &SGML;. In the words of the
&XML; specification, &ldquo;the goal [of &XML;] is to enable generic &SGML; to be
served, received, and processed on the Web in the way that is now possible
with &HTML;.&rdquo;</para>
<para>&XML; raises two issues with respect to DocBook:<itemizedlist>
<listitem><para>Are DocBook &SGML; instances valid &XML; instances?</para>
</listitem>
<listitem><para>Can the DocBook &DTD; be made into a valid &XML; &DTD;?</para>
</listitem>
</itemizedlist></para>
<para>If you have an existing &SGML; system, and your primary goal is
to serve DocBook documents over the Web as &XML;, only the first of
these issues is relevant.  As the popularity of &XML; grows, we will
see more and more &XML;-aware tools that don't implement full
<acronym>ISO</acronym> 8879 &SGML;. If your goal is to author DocBook
documents with one of this new generation of tools, you will only be
able to achieve validity with an &XML; DocBook &DTD;.</para>
<para>
<indexterm><primary>OASIS</primary>
  <secondary>XML DocBook version</secondary></indexterm>

Although not yet officially adopted by the <acronym>OASIS</acronym> DocBook Technical
Committee, an &XML; version of DocBook is available now and
provided on the <acronym>CD-ROM</acronym>.
</para>
<sect1>
<title>DocBook Instances as &XML;</title>
<para>
<indexterm><primary>DocBook DTD</primary>
  <secondary>instances, converting to XML</secondary></indexterm>
<indexterm><primary>XML</primary>
  <secondary>DocBook instances, converting to</secondary></indexterm>

Most DocBook documents can be made into well-formed &XML; documents very
easily. With few exceptions, valid DocBook &SGML; instances are also well-formed
&XML; instances. The following areas may need to be addressed.</para>

<sect2><title>System Identifiers</title>
<para>
<indexterm><primary>system identifiers</primary>
  <secondary>SGML</secondary></indexterm>
<indexterm><primary>public identifiers</primary>
  <secondary>SGML</secondary></indexterm>
<indexterm><primary>parameter entities</primary>
  <secondary>SGML declarations</secondary></indexterm>
<indexterm><primary>declarations</primary>
  <secondary>document type and parameter entity (SGML)</secondary></indexterm>

It is common for &SGML; instances to use only a public identifier in document
type and parameter entity declarations:</para>
<programlisting>&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;para>
This &lt;emphasis>paragraph&lt;/emphasis> is important.
&lt;/para>
&lt;/chapter></programlisting>
<para>&XML; requires a system identifier:
<programlisting>&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;para>
This &lt;emphasis>paragraph&lt;/emphasis> is important.
&lt;/para>
&lt;/chapter></programlisting></para>
<para>
<indexterm><primary>catalog files</primary>
  <secondary>system identifiers, resolving</secondary></indexterm>
<indexterm><primary>URN</primary>
  <secondary>XML system identifiers, future</secondary></indexterm>
<indexterm><primary>public identifiers</primary>
  <secondary>system identifiers, overriding</secondary></indexterm>

If you're used to using catalog files to resolve public identifiers,
you may be dismayed to learn that system identifiers are required. Because most
tools favor system identifiers over public identifiers, the portability
gained by using catalog files seems to have been lost. In the
long run, it will be regained because &XML; system identifiers can be
<acronym>URN</acronym>s, which will have a resolution scheme like catalogs, but what about the
short run?</para>
<para>Luckily, there are a couple of options. First, you can tell your tools to use the public identifiers even
though system identifiers are present. Simply add:</para>
<screen>OVERRIDE YES</screen>
<para>
<indexterm><primary>system identifiers</primary>
  <secondary>remapping with SYSTEM catalog directive</secondary></indexterm>

to your catalog files. Alternatively, you can remap system identifiers
with the <literal>SYSTEM</literal> catalog directive. If you are faced with
documents that don't use public identifiers at all, this is probably your
only option.
</para>
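<para>As a rough sketch of both approaches, a catalog file might look like
the following. The <literal>OVERRIDE</literal>, <literal>PUBLIC</literal>, and
<literal>SYSTEM</literal> directives are standard catalog syntax; the local
paths under <filename>/usr/local/share/sgml</filename> are assumptions made
for illustration and must be adjusted to match your installation.</para>
<programlisting>  -- Prefer public identifiers even when a system identifier is present --
OVERRIDE YES

  -- Map the public identifier to a local copy of the DTD (hypothetical path) --
PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
       "/usr/local/share/sgml/docbook/xml/4.1.2/docbookx.dtd"

  -- Remap a system identifier, for documents that have no public identifier --
SYSTEM "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"
       "/usr/local/share/sgml/docbook/xml/4.1.2/docbookx.dtd"</programlisting>
<para>Whether a given tool honors <literal>OVERRIDE</literal> depends on its
catalog support, so it is worth checking the behavior with a document that
carries both identifiers.</para>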
</sect2>

<sect2><title>Minimization</title>
<para>
<indexterm><primary>markup</primary>
  <secondary>minimization</secondary>
    <tertiary>SGML/XML conversion problems</tertiary></indexterm>
<indexterm><primary>minimization</primary>
  <secondary>markup</secondary>
    <tertiary>SGML/XML conversion problems</tertiary></indexterm>

If you have used &SGML; minimization features in your instances:</para>

<programlisting>&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
&lt;chapter id=<co id="xml-attrquote"/>chap1>&lt;title>Chapter Title&lt;/title>
&lt;para>
This &lt;emphasis>paragraph<co id="xml-endtag"/>&lt;/&gt; is important.
&lt;/para>
&lt;/chapter></programlisting>

<para>they will not be well-formed &XML; instances. In particular, &XML;<calloutlist>
<callout arearefs="xml-attrquote"><para>

<indexterm><primary>quotes</primary>
  <secondary>attribute values</secondary></indexterm>
<indexterm><primary>attributes</primary>
  <secondary>values</secondary>
    <tertiary>quoting</tertiary></indexterm>

Requires that all attribute values
be quoted.</para>
</callout>
<callout arearefs="xml-endtag"><para>Does not allow short tag minimization.
</para>
</callout></calloutlist>
&XML; also forbids tag omission, along with perhaps
a half dozen more exotic
forms of minimization that you may have used. None of them is allowed. The
easiest way to remove these minimizations is probably with a tool like
<command>sgmlnorm</command> (included in the <acronym>SP</acronym> and Jade distributions, on
the <link linkend="app-cdrom"><acronym>CD-ROM</acronym></link>).</para>
<para>The result will be something like this:</para>
<programlisting>&lt;?xml version='1.0'?>
&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
&lt;chapter id="chap1">&lt;title>Chapter Title&lt;/title>
&lt;para>
This &lt;emphasis>paragraph&lt;/emphasis> is important.
&lt;/para>
&lt;/chapter></programlisting>
</sect2>

<sect2><title>Attribute Default Values</title>
<para>
<indexterm><primary>attributes</primary>
  <secondary>default values</secondary></indexterm>

Correct processing of this document may require access to the default
attribute values:</para>
<programlisting>&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;para>
Write to us at:
&lt;address<co id="xml-defattr"/>>
90 Sherman Street
Cambridge, MA 02140
&lt;/address>
&lt;/para>
&lt;/chapter></programlisting>
<calloutlist>
<callout arearefs="xml-defattr"><para><sgmltag>Address</sgmltag> uses an
attribute, supplied here by its default value, to express that its content
is line-specific.</para>
</callout></calloutlist>
<para>Some &XML; processing environments are going to ignore the doctype declaration
in your document, even if it's present. This is relevant when your instance
uses elements that have attributes with default values. The default values
are expressed in the &DTD;, but may not be expressed in your instance. In the
case of DocBook, there are relatively few of these, and your stylesheet can
probably be constructed to do the right thing in either case. (It essentially
treats the attributes as if they had implied values.)</para>
<para>The result will be something like this:</para>
<programlisting>&lt;?xml version='1.0'?>
&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;para>
Write to us at:
&lt;address format="linespecific">
90 Sherman Street
Cambridge, MA 02140
&lt;/address>
&lt;/para>
&lt;/chapter></programlisting>
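<para>If the instance cannot be changed, a stylesheet can supply the default
itself. The fragment below is a minimal sketch that assumes an
<acronym>XSLT</acronym> 1.0 stylesheet producing &HTML;; the choice of
<sgmltag>pre</sgmltag> as the output element is an illustration only, not
what any particular DocBook stylesheet does.</para>
<programlisting>&lt;xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  &lt;!-- Match address whether format is absent (the DTD default was never
       applied) or explicitly set to linespecific. -->
  &lt;xsl:template match="address[not(@format) or @format='linespecific']">
    &lt;!-- Hypothetical output: keep line breaks with an HTML pre element. -->
    &lt;pre>&lt;xsl:apply-templates/>&lt;/pre>
  &lt;/xsl:template>

&lt;/xsl:stylesheet></programlisting>
<para>Because the match pattern accepts both cases, the same stylesheet works
whether or not the processor read the &DTD;.</para>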
</sect2>

<sect2><title>Character and <literal>SDATA</literal> Entities</title>
<programlisting>&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;para>
This book was published by O'Reilly<co id="xml-sdata"/>&amp;trade;.
&lt;/para>
&lt;/chapter></programlisting>
<calloutlist>
<callout arearefs="xml-sdata"><para>
<indexterm><primary>characters</primary>
  <secondary>entities</secondary></indexterm>
<indexterm><primary>SDATA entities</primary></indexterm>
<indexterm><primary>entities</primary>
  <secondary>characters</secondary></indexterm>
<indexterm><primary>entities</primary>
  <secondary>SDATA</secondary></indexterm>
<indexterm><primary>XML</primary>
  <secondary>SDATA entities, not allowing</secondary></indexterm>
<indexterm><primary>ISO standards</primary>
  <secondary>entity sets</secondary>
    <tertiary>SDATA entities, problems with (XML)</tertiary></indexterm>
<indexterm><primary>Unicode character set</primary>
  <secondary>ISO standard entity sets and</secondary></indexterm>

The DocBook &DTD; defines all of the standard <acronym>ISO</acronym>
entities automatically, but the <acronym>ISO</acronym> definitions use
<literal>SDATA</literal>, which is not allowed in &XML;. Eventually,
<acronym>ISO</acronym> (or someone else) will release official
<acronym>ISO</acronym> standard entity sets that make reference to the
appropriate Unicode character for each entity. Until then, the &XML;
version of DocBook is
distributed with an unofficial set.</para>
<para>
<indexterm><primary>internal subset</primary>
  <secondary>entity declarations</secondary></indexterm>
<indexterm><primary>external subset</primary>
  <secondary>entity declarations (SGML/XML conversion)</secondary></indexterm>

If you use entities in your document, it may be wise to put declarations
for them in the internal subset of each instance, because some
&XML; browsers are going to parse the internal subset but not the external subset.
If the entity declarations are in your &DTD;, and the browser does not parse
the external subset, the browser won't know how to display the entities in
your document.</para>
</callout></calloutlist>
<para>The result will be something like this:</para>
<programlisting>&lt;?xml version='1.0'?>
&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [
&lt;!ENTITY trade "&amp;#x2122;">
]>
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;para>
This book was published by O'Reilly&amp;trade;.
&lt;/para>
&lt;/chapter></programlisting>
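<para>The same technique covers any other <acronym>ISO</acronym> entity
names an instance uses. The sketch below declares a few more in the internal
subset. The code points are the Unicode values for these characters; which
entities a given document actually needs is an assumption.</para>
<programlisting>&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [
&lt;!ENTITY trade "&amp;#x2122;">  &lt;!-- trade mark sign -->
&lt;!ENTITY copy  "&amp;#xA9;">    &lt;!-- copyright sign -->
&lt;!ENTITY mdash "&amp;#x2014;">  &lt;!-- em dash -->
&lt;!ENTITY nbsp  "&amp;#xA0;">    &lt;!-- no-break space -->
]></programlisting>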
</sect2>

<sect2><title>Case-Sensitivity</title>
<programlisting><co id="xml-nmcasekey"/>&lt;!DocType Book PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
<co id="xml-namecase1"/>&lt;book>&lt;title>Book Title&lt;/title>
&lt;chapter>&lt;title>Chapter Title<co id="xml-namecase2"/>&lt;/Title>
&lt;para>
Paragraph test.
&lt;/para>
<co id="xml-wf1"/>&lt;PARA>
A second paragraph.
&lt;/PARA>
&lt;/chapter>
&lt;/book></programlisting>
<para>
<indexterm><primary>case sensitivity</primary>
  <secondary>DocBook SGML declaration</secondary></indexterm>
<indexterm><primary>elements</primary>
  <secondary>case sensitivity (DocBook)</secondary></indexterm>
<indexterm><primary>attributes</primary>
  <secondary>case sensitivity (DocBook)</secondary></indexterm>
<indexterm><primary>XML</primary>
  <secondary>case sensitivity</secondary></indexterm>

With the standard DocBook &SGML; declaration, DocBook instances are not
case-sensitive with respect to element and attribute names. &XML; is always
case-sensitive. As long as you have used the same case consistently, your
&XML; instances will be well-formed, but it may still be advantageous to do some
case-folding because it will simplify the construction of stylesheets.</para>
<calloutlist>
<callout arearefs="xml-nmcasekey"><para>Keywords in &XML; are case-sensitive,
and must be in uppercase.
<indexterm><primary>keywords</primary>
  <secondary>case sensitivity, XML</secondary></indexterm>
</para>
</callout>
<callout arearefs="xml-namecase1"><para>The name declared in the document
type declaration, like all other names, is case-sensitive.
<indexterm><primary>names</primary>
  <secondary>case sensitivity</secondary></indexterm>

</para>
</callout>
<callout arearefs="xml-namecase2"><para>Start and end tags must use the same
case.
<indexterm><primary>start tags</primary>
  <secondary>case sensitivity</secondary></indexterm>
<indexterm><primary>end tags</primary>
  <secondary>case sensitivity</secondary></indexterm>
</para>
</callout>
<callout arearefs="xml-wf1"><para>In &XML;, <sgmltag>para</sgmltag> is not the
same as <sgmltag>PARA</sgmltag>. Note that this is a validity error (against
the &XML; version of DocBook), but it is not an &XML; well-formedness error. The use of
<sgmltag>para</sgmltag> and <sgmltag>PARA</sgmltag> as distinct names is as legitimate
as using <sgmltag>foo</sgmltag> and <sgmltag>bar</sgmltag>, as long as they
are properly nested.
<indexterm><primary>Para element</primary>
  <secondary>PARA vs. (XML)</secondary></indexterm>
</para>
</callout></calloutlist>
<para>The result will be something like this:</para>
<programlisting>&lt;?xml version='1.0'?>
&lt;!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
&lt;book>&lt;title>Book Title&lt;/title>
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;para>
Paragraph test.
&lt;/para>
&lt;para>
A second paragraph.
&lt;/para>
&lt;/chapter>
&lt;/book></programlisting>
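<para>Once an instance is well-formed, the case-folding itself can be
automated. The following is a minimal sketch, assuming an
<acronym>XSLT</acronym> 1.0 processor is available; it is one possible
approach, not a tool shipped with DocBook. It lowercases element and
attribute names and leaves attribute values and text untouched. It cannot
repair mismatched tags such as
<literal>&lt;title>...&lt;/Title></literal>, which must be fixed before the
document will parse, and as written it drops comments and processing
instructions and does not write a document type declaration.</para>
<programlisting>&lt;xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  &lt;xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  &lt;xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>

  &lt;!-- Rebuild every element under a lowercase name. -->
  &lt;xsl:template match="*">
    &lt;xsl:element name="{translate(name(), $upper, $lower)}">
      &lt;!-- Copy each attribute under a lowercase name, value unchanged. -->
      &lt;xsl:for-each select="@*">
        &lt;xsl:attribute name="{translate(name(), $upper, $lower)}">
          &lt;xsl:value-of select="."/>
        &lt;/xsl:attribute>
      &lt;/xsl:for-each>
      &lt;xsl:apply-templates/>
    &lt;/xsl:element>
  &lt;/xsl:template>

&lt;/xsl:stylesheet></programlisting>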
</sect2>

<sect2><title>No #CONREF Attributes</title>
<programlisting>&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;indexterm id="idx-bor">&lt;primary>Something&lt;/primary>&lt;/indexterm><co
id="xml-conrefstart"/>
&lt;para>
Paragraph test.
&lt;/para>
&lt;indexterm startref="idx-bor"><co id="xml-conref"/>
&lt;/chapter></programlisting>
<para>
<indexterm><primary>#CONREF attributes</primary></indexterm>
<indexterm><primary>Startref attribute</primary></indexterm>
<indexterm><primary>IndexTerm element</primary></indexterm>
<indexterm><primary>OtherTerm attribute</primary></indexterm>
<indexterm><primary>GlossSee element</primary></indexterm>
<indexterm><primary>GlossSeeAlso element</primary></indexterm>
<indexterm><primary>empty tags</primary>
  <secondary>#CONREF attributes</secondary></indexterm>

The <sgmltag class="attribute">StartRef</sgmltag> attribute on
<sgmltag>indexterm</sgmltag> and the <sgmltag class="attribute">OtherTerm</sgmltag>
attribute on <sgmltag>GlossSee</sgmltag> and <sgmltag>GlossSeeAlso</sgmltag>
are <literal>#CONREF</literal> attributes.</para>
<para>In &SGML; terms, this means that when one of these attributes is specified, the
element must be empty (it has no content and no end tag), and its content
is taken to be the same as the content of the element pointed to by
the attribute. <calloutlist>
<callout arearefs="xml-conrefstart xml-conref"><para>If you
have used these attributes, your instance will contain both empty and non-empty
versions of these tags.</para>
</callout></calloutlist></para>
<para>Your best bet is to transform the <literal>#CONREF</literal>
version into an empty tag and let your stylesheet deal with it appropriately.
</para>
<para>The result will be something like this:</para>
<programlisting>&lt;?xml version='1.0'?>
&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;indexterm id="idx-bor">&lt;primary>Something&lt;/primary>&lt;/indexterm>
&lt;para>
Paragraph test.
&lt;/para>
&lt;indexterm startref="idx-bor"/>
&lt;/chapter></programlisting>
</sect2>

<sect2><title>Only Explicit CDATA-Marked Sections Are Allowed</title>
<programlisting>&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN" [
&lt;!ENTITY % draft "IGNORE">
&lt;!ENTITY % sourcecode "CDATA">
]>
&lt;chapter>&lt;title>Chapter Title&lt;/title>
<co id="xml-ignore"/>&lt;![ %draft; [
&lt;para>
Draft paragraph.
&lt;/para>
]]&#62;
&lt;para>
The following code is totally out of context:
&lt;programlisting>
&lt;![ <co id="xml-cdata"/>%sourcecode; [
if (x &lt; 3) {
  y = 3;
}
]]&#62;
&lt;/programlisting>
&lt;/para>
&lt;/chapter></programlisting>
<calloutlist>
<callout arearefs="xml-ignore xml-cdata"><para>
<indexterm><primary>parameter entities</primary>
  <secondary>XML document body</secondary></indexterm>
<indexterm><primary>XML</primary>
  <secondary>parameter entities</secondary></indexterm>
<indexterm><primary>internal subset</primary>
  <secondary>parameter entities (XML)</secondary></indexterm>

Parameter entities are not
allowed in the body of &XML; documents (they are allowed in the internal subset).
</para>
</callout>
<callout arearefs="xml-ignore"><para>&XML; instances cannot contain
<literal>IGNORE</literal>, <literal>INCLUDE</literal>, <literal>TEMP</literal>, or
<literal>RCDATA</literal> marked sections.
<indexterm><primary>marked sections</primary>
  <secondary>XML, restrictions</secondary></indexterm>
<indexterm><primary>IGNORE keyword (marked section)</primary></indexterm>
<indexterm><primary>INCLUDE keyword (marked section)</primary>
  <secondary>XML, not allowing</secondary></indexterm>
<indexterm><primary>TEMP marked section (XML)</primary></indexterm>
<indexterm><primary>RCDATA</primary></indexterm>
</para>
</callout>
<callout arearefs="xml-cdata"><para><literal>CDATA</literal> marked sections
must use the &ldquo;<literal>CDATA</literal>&rdquo; keyword literally because
parameter entities are not allowed.
<indexterm><primary>CDATA</primary>
  <secondary>marked sections</secondary></indexterm>
</para>
</callout></calloutlist>
<para>The result will be something like this:</para>
<programlisting>&lt;?xml version='1.0'?>
&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;para>
The following code is totally out of context:
&lt;programlisting>
&lt;![CDATA[
if (x &lt; 3) {
  y = 3;
}
]]&#62;
&lt;/programlisting>
&lt;/para>
&lt;/chapter></programlisting>
</sect2>

<sect2><title>No SUBDOC or CDATA External Entities</title>
<programlisting>&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook V3.1//EN" [
&lt;!ENTITY sourcecode SYSTEM "program.c" CDATA>
]>
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;para>
The following code is totally out of context:
&lt;programlisting>
&amp;sourcecode;
&lt;/programlisting>
&lt;/para>
&lt;/chapter></programlisting>
<para>
<indexterm><primary>external general entities</primary>
  <secondary>XML restrictions</secondary></indexterm>
<indexterm><primary>XML</primary>
  <secondary>external entities, restrictions</secondary></indexterm>
<indexterm><primary>CDATA</primary>
  <secondary>XML instances, restrictions</secondary></indexterm>
&XML; instances cannot use <literal>CDATA</literal> or
<literal>SUBDOC</literal> external entities. One option for integrating external
<literal>CDATA</literal> content into a document is to employ a pre-processing pass
that inserts the content inline, wrapped in a <literal>CDATA</literal> marked
section.</para>
<para>
<indexterm><primary>SUBDOC entities</primary></indexterm>
<indexterm><primary>namespaces</primary></indexterm>

<literal>SUBDOC</literal> entities may be more problematic. If you do
not require validation, it may be sufficient to simply put them inline. &XML;
namespaces may offer another possible solution.</para>
<para>The result will be something like this:</para>
<programlisting>&lt;?xml version='1.0'?>
&lt;!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
                  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
&lt;chapter>&lt;title>Chapter Title&lt;/title>
&lt;para>
The following code is totally out of context:
&lt;programlisting>
&lt;![CDATA[
int main () {
}
]]&#62;
&lt;/programlisting>
&lt;/para>
&lt;/chapter></programlisting>
</sect2>

<sect2><title>No Data Attributes on Notations</title>
<para>They're not allowed in &XML;, so don't add any.
<indexterm><primary>data attributes, notations (XML prohibiting)</primary></indexterm>

</para>
</sect2>

<sect2><title>No Attribute Value Specifications on<?lb?>Entity Declarations</title>
<para>
<indexterm><primary>attributes</primary>
  <secondary>values</secondary>
    <tertiary>specifying (entity declarations)</tertiary></indexterm>
<indexterm><primary>declarations</primary>
  <secondary>entities</secondary>
    <tertiary>attribute values, prohibiting (XML)</tertiary></indexterm>
<indexterm><primary>entities</primary>
  <secondary>declarations, attribute values (XML)</secondary></indexterm>

They're not allowed in &XML;, so don't add any.</para>
</sect2>
</sect1>
<sect1 id="s-docbookxml">
<title>The DocBook &DTD; as &XML;</title>
<indexterm><primary>DocBook DTD</primary>
  <secondary>XML</secondary>
    <tertiary>converting to</tertiary></indexterm>
<indexterm><primary>XML</primary>
  <secondary>DocBook DTD, converting to</secondary></indexterm>

<para>Converting the DocBook &DTD; to &XML; is much more challenging
than converting the instances. It is probably not possible to
construct an &XML; &DTD; that matches the validation power
of the DocBook &SGML; &DTD;. The list below identifies most of the issues that
must be addressed, and describes how the DocBook &XML; &DTD; deals with
them:</para>

<variablelist>
<varlistentry><term>Comments are not allowed inside markup declarations</term>
<listitem>
<para>
<indexterm><primary>comments</primary>
  <secondary>markup declarations (DocBook XML)</secondary></indexterm>
<indexterm><primary>declarations</primary>
  <secondary>comment declarations</secondary></indexterm>

Most of them have been moved to comment declarations preceding the markup
declaration that used to contain them. A few small, inline comments that seemed
like they would be out of context if moved before the declaration were simply
deleted.</para>
</listitem>
</varlistentry>
<varlistentry><term>Name groups are not allowed in element or attribute list
declarations</term>
<listitem>
<para>
<indexterm><primary>name groups (DocBook XML)</primary></indexterm>
<indexterm><primary>elements</primary>
  <secondary>declarations</secondary>
    <tertiary>name groups, prohibiting</tertiary></indexterm>
<indexterm><primary>attributes</primary>
  <secondary>declarations</secondary>
    <tertiary>name groups, prohibiting</tertiary></indexterm>

The small number of places in which DocBook uses name groups have
been expanded.</para>
<para>There's one downside: DocBook uses <literal>%admon.class;</literal> in a name
group to define the content model and attribute lists for elements in the
admonitions class. In DocBook &XML;, this convenience cannot be expressed. If additional
admonitions are added, the element and attribute list declarations will have
to be copied for them.</para>
</listitem>
</varlistentry>
<varlistentry><term>No <literal>CDATA</literal> or <literal>RCDATA</literal>
declared content</term>
<listitem>
<para>
<indexterm><primary>CDATA</primary>
  <secondary>declared content, prohibiting</secondary></indexterm>
<indexterm><primary>RCDATA</primary></indexterm>

<sgmltag>Graphic</sgmltag> and <sgmltag>InlineGraphic</sgmltag> have
been made <literal>EMPTY</literal>. The content model for
<sgmltag>SynopFragmentRef</sgmltag>, the only <literal>RCDATA</literal> element in DocBook, has been
changed to <literal>(arg | group)+</literal>.</para>
</listitem>
</varlistentry>
<varlistentry><term>No exclusions or inclusions on element declarations</term>
<listitem>
<para>
<indexterm><primary>inclusions</primary>
  <secondary>element declarations, prohibiting (DocBook XML)</secondary></indexterm>
<indexterm><primary>exclusions</primary>
  <secondary>element declarations, prohibiting (DocBook XML)</secondary></indexterm>

They had to be removed.</para>
<para>
<indexterm><primary>exclusions</primary>
  <secondary>DocBook, uses</secondary></indexterm>

In DocBook, exclusions are used to exclude the following:<itemizedlist>
<listitem><para>Ubiquitous elements (<sgmltag>indexterm</sgmltag>
and <sgmltag>BeginPage</sgmltag>) from a number of contexts in which they
should not occur (such as metadata, for example).</para>
</listitem>
<listitem><para>
<indexterm><primary>formal objects, exclusions (DocBook)</primary></indexterm>

Formal objects from <sgmltag>Highlights</sgmltag>,
<sgmltag>Example</sgmltag>s, <sgmltag>Figure</sgmltag>s and <sgmltag>LegalNotice</sgmltag>s.
</para>
</listitem>
<listitem><para>
<indexterm><primary>tables</primary>
  <secondary>exclusions (DocBook)</secondary></indexterm>
<indexterm><primary>InformalTable element</primary>
  <secondary>excluding from tables</secondary></indexterm>

Formal objects and <sgmltag>InformalTable</sgmltag>s
from tables.</para>
</listitem>
<listitem><para>
<indexterm><primary>footnotes, exclusions (DocBook)</primary></indexterm>
<indexterm><primary>block elements</primary>
  <secondary>excluding from footnotes</secondary></indexterm>

Block elements and <sgmltag>Footnote</sgmltag>s
from <sgmltag>Footnote</sgmltag>s.</para>
</listitem>
<listitem><para>Admonitions, <sgmltag>EntryTbl</sgmltag>s, and
<sgmltag>Acronym</sgmltag>s from themselves.
<indexterm><primary>admonitions</primary>
  <secondary>exclusions (DocBook)</secondary></indexterm>
<indexterm><primary>acronyms (DocBook XML)</primary></indexterm>
</para>
</listitem>
</itemizedlist></para>
<para>Removing these exclusions from DocBook &XML; means that it is now valid, in
the &XML; sense, to do some things that don't make a lot of sense (like put
a <sgmltag>Footnote</sgmltag> in a <sgmltag>Footnote</sgmltag>). Be careful.
</para>
<para>
<indexterm><primary>inclusions</primary>
  <secondary>DocBook, uses</secondary></indexterm>
<indexterm><primary>IndexTerm element</primary>
  <secondary>inclusions, DocBook</secondary></indexterm>
<indexterm><primary>BeginPage element (DocBook inclusions)</primary></indexterm>
<indexterm><primary>parameter entities</primary>
  <secondary>DbXML, ubiquitous element inclusions</secondary></indexterm>
<indexterm><primary>#PCDATA keyword</primary>
  <secondary>DbXML, ubiquitous elements</secondary></indexterm>

Inclusions in DocBook are used to add the ubiquitous elements
(<sgmltag>indexterm</sgmltag> and <sgmltag>BeginPage</sgmltag>) unconditionally to a
large number of contexts. In order to make these elements available in
DocBook &XML;,
they have been added to most of the parameter entities that include
<literal>#PCDATA</literal>. If new locations are discovered where these elements are desired, DocBook &XML;
will be updated.</para>
</listitem>
</varlistentry>
<varlistentry><term>Elements with mixed content must have
<literal>#PCDATA</literal> first.</term>
<listitem>
<para>
<indexterm><primary>elements</primary>
  <secondary>mixed content (DocBook XML)</secondary></indexterm>
<indexterm><primary>content models</primary>
  <secondary>elements, updating (DocBook XML)</secondary></indexterm>

The content models of many elements have been updated to make them a
repeatable OR group beginning with <literal>#PCDATA</literal>.</para>
</listitem>
</varlistentry>
<varlistentry><term>Many declared attribute types (<literal>NAME</literal>,
<literal>NUMBER</literal>, <literal>NUTOKEN</literal>, and so on) are not allowed</term>
<listitem>
<para>
<indexterm><primary>attributes</primary>
  <secondary>declared types, prohibiting (DocBook XML)</secondary></indexterm>
<indexterm><primary>NMTOKEN(S) attribute</primary>
  <secondary>DbXML</secondary></indexterm>
<indexterm><primary>CDATA</primary>
  <secondary>DbXML</secondary></indexterm>

They have all been replaced by <literal>NMTOKEN</literal> or
<literal>CDATA</literal>.</para>
</listitem>
</varlistentry>
<varlistentry><term>No <literal>#CONREF</literal> attributes allowed.</term>
<listitem>
<para>
<indexterm><primary>#CONREF attributes</primary>
  <secondary>DbXML, prohibiting</secondary></indexterm>
<indexterm><primary>#IMPLIED attribute (DocBook XML)</primary></indexterm>
<indexterm><primary>GlossSee element</primary>
  <secondary>DbXML</secondary></indexterm>
<indexterm><primary>GlossSeeAlso element</primary>
  <secondary>DbXML</secondary></indexterm>
<indexterm><primary>IndexTerm element</primary>
  <secondary>empty (DocBook XML)</secondary></indexterm>

The <literal>#CONREF</literal> attributes on <sgmltag>indexterm</sgmltag>,
<sgmltag>GlossSee</sgmltag>, and <sgmltag>GlossSeeAlso</sgmltag> were changed to
<literal>#IMPLIED</literal>. The content model of <sgmltag>indexterm</sgmltag> was
modified so that it can be empty.</para>
</listitem>
</varlistentry>
<varlistentry><term>Attribute default values must be quoted.</term>
<listitem>
<para>
<indexterm><primary>quotes</primary>
  <secondary>attribute values</secondary>
    <tertiary>DbXML</tertiary></indexterm>
<indexterm><primary>attributes</primary>
  <secondary>values</secondary>
    <tertiary>quoting</tertiary></indexterm>

Quotes were added wherever necessary.
<indexterm startref="docbookxmlappa" class="endofrange"/>
<indexterm startref="xmldocbookappb" class="endofrange"/>
</para>
</listitem>
</varlistentry>
</variablelist>
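<para>Several of the changes listed above can be seen side by side in a small
example. The declarations below are simplified and written for illustration;
they are not copied from either version of the DocBook &DTD;, and the
<sgmltag class="attribute">role</sgmltag> and
<sgmltag class="attribute">status</sgmltag> attributes in particular are
hypothetical. In &SGML; form, they use a name group, an exclusion, a mixed
content model that does not begin with <literal>#PCDATA</literal>, a
<literal>NAME</literal> attribute, an unquoted default, and a
<literal>#CONREF</literal> attribute:</para>
<programlisting>&lt;!ELEMENT (note|tip) - - (title?, para+) -(note|tip)>
&lt;!ELEMENT para       - - (emphasis|#PCDATA)+>
&lt;!ATTLIST (note|tip)
        role      NAME           #IMPLIED
        status    (draft|final)  draft>
&lt;!ATTLIST indexterm
        startref  IDREF          #CONREF></programlisting>
<para>A roughly equivalent set of &XML; declarations repeats the element and
attribute list declarations for each name, drops the minimization flags and
the exclusion, and adjusts the content model, attribute types, and
defaults:</para>
<programlisting>&lt;!ELEMENT note (title?, para+)>
&lt;!ELEMENT tip  (title?, para+)>
&lt;!ELEMENT para (#PCDATA|emphasis)*>
&lt;!ATTLIST note
        role      NMTOKEN        #IMPLIED
        status    (draft|final)  "draft">
&lt;!ATTLIST tip
        role      NMTOKEN        #IMPLIED
        status    (draft|final)  "draft">
&lt;!ATTLIST indexterm
        startref  IDREF          #IMPLIED></programlisting>
<para>Note what is lost: nothing in the &XML; declarations prevents a
<sgmltag>note</sgmltag> from appearing, through some other element, inside
another <sgmltag>note</sgmltag>.</para>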
</sect1>
</appendix>

<!--
Local Variables:
mode:sgml
sgml-parent-document: ("book.sgm" "appendix")
End:
-->