<!--
   DTD for the format specification files
   Sergio Ortiz 2005.05.13
-->
<!ELEMENT format (options,rules)>
<!ATTLIST format name CDATA #REQUIRED>
<!--
   'format' is the root element containing the whole format specification
   file. The attribute 'name' specifies the name of the format.
-->
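<!--
   Illustrative sketch only (not taken from a real specification file):
   a document skeleton that follows the content model of 'format'.
   The name "example" is hypothetical, and the ellipses stand for the
   child elements declared in the rest of this DTD.

   <format name="example">
     <options> ... </options>
     <rules> ... </rules>
   </format>
-->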
<!ELEMENT options (largeblocks,input,output,tag-name,escape-chars,space-chars,case-sensitive)>
<!--
   General options of the format
-->
<!ELEMENT largeblocks EMPTY>
<!ATTLIST largeblocks size CDATA #REQUIRED>
<!--
   The attribute size is used to define the maximal size in bytes of
-->
<!ELEMENT input EMPTY>
<!ATTLIST input zip-path CDATA #IMPLIED>
<!ATTLIST input encoding CDATA #REQUIRED>
<!--
   Reserved for future extensions
-->
<!ELEMENT output EMPTY>
<!ATTLIST output zip-path CDATA #IMPLIED>
<!ATTLIST output encoding CDATA #REQUIRED>
<!--
   Reserved for future extensions
-->
<!ELEMENT tag-name EMPTY>
<!ATTLIST tag-name regexp CDATA #REQUIRED>
<!--
   The attribute regexp defines (with a _flex_ regular expression) how
   to extract the tag name from a whole tag. '\'
-->
<!ELEMENT escape-chars EMPTY>
<!ATTLIST escape-chars regexp CDATA #REQUIRED>
<!--
   The attribute regexp defines (with a _flex_ regular expression) the
   set of characters to be escaped with a preceding backslash '\'
-->
<!ELEMENT space-chars EMPTY>
<!ATTLIST space-chars regexp CDATA #REQUIRED>
<!--
   The attribute regexp defines the space characters with a _flex_
   regular expression
-->
<!ELEMENT case-sensitive EMPTY>
<!ATTLIST case-sensitive value (yes|no) #REQUIRED>
<!--
   The attribute 'value' is set to 'yes' if case is relevant in the
   specification of the format. Otherwise it is set to 'no'.
-->
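<!--
   Illustrative sketch only: an 'options' block with every child in the
   order required by the content model above. All attribute values
   (the size, the encodings and the three regular expressions) are
   hypothetical and are not taken from any real specification file.

   <options>
     <largeblocks size="8192"/>
     <input encoding="UTF-8"/>
     <output encoding="UTF-8"/>
     <tag-name regexp="[a-z]+"/>
     <escape-chars regexp="[\[\]\\]"/>
     <space-chars regexp="[ \t\n\r]"/>
     <case-sensitive value="no"/>
   </options>

   'zip-path' is omitted on 'input' and 'output' because it is #IMPLIED.
-->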
<!ELEMENT rules (format-rule|replacement-rule)+>
<!--
   Groups the rules for processing format and the rules for replacing
   expressions with characters that are part of the text
-->
<!ELEMENT format-rule (tag|(begin,end))>
<!ATTLIST format-rule type (comment|empty|open|close) #IMPLIED>
<!ATTLIST format-rule eos (yes|no) #IMPLIED>
<!ATTLIST format-rule priority CDATA #REQUIRED>
<!--
   Format rule parent element. It may include either a 'tag' element or
   a pair of elements 'begin', 'end'. In the first case, the matched
   element is considered to be part of the format. In the second case,
   the 'begin' and 'end' elements are considered to enclose a block of
   format. The attribute 'eos' (end of sentence) is set to 'yes' if the
   rule defines a dot in the text being processed ('no' by default).
   The attribute 'priority' marks the order of precedence of the rule.
-->
<!ELEMENT tag EMPTY>
<!ATTLIST tag regexp CDATA #REQUIRED>
<!--
   Defines an element that is part of the format by the pattern
   specified as the value of the regexp attribute
-->
<!ELEMENT begin EMPTY>
<!ATTLIST begin regexp CDATA #REQUIRED>
<!--
   The attribute 'regexp' is the regular expression that detects the
   beginning delimiter of a block of format
-->
<!ELEMENT end EMPTY>
<!ATTLIST end regexp CDATA #REQUIRED>
<!--
   The attribute 'regexp' is the regular expression that detects the
   ending delimiter of a block of format
-->
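<!--
   Illustrative sketch only: the two shapes a 'format-rule' may take.
   The regular expressions, priorities and 'type' values are
   hypothetical, and 'tag' and 'end' are assumed to be empty elements
   like 'begin'.

   A single 'tag' that is itself part of the format (here a run of
   blank lines, also marked as an end of sentence):

   <format-rule type="empty" eos="yes" priority="1">
     <tag regexp="\n\n+"/>
   </format-rule>

   A 'begin'/'end' pair that encloses a block of format (here anything
   between curly braces):

   <format-rule type="comment" eos="no" priority="2">
     <begin regexp="\{"/>
     <end regexp="\}"/>
   </format-rule>
-->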
<!ELEMENT replacement-rule (replace+)>
<!ATTLIST replacement-rule regexp CDATA #REQUIRED>
<!--
   Root element for a replacement rule. The attribute 'regexp' is the
   general expression to detect the elements to replace
-->
<!ELEMENT replace EMPTY>
<!ATTLIST replace source CDATA #REQUIRED>
<!ATTLIST replace target CDATA #REQUIRED>
<!ATTLIST replace prefer (yes|no) #IMPLIED>
<!--
   Replacement rule. The 'source' is a string of one or more characters.
   The 'target' MUST be a single character. The 'prefer' attribute, when
   set to 'yes' defines the preferred reverse translation of the
-->
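<!--
   Illustrative sketch only: a 'replacement-rule' that maps two
   character entities to the single characters they stand for. The
   regular expression and the source/target pairs are hypothetical.
   The text is shown as it would be typed in the XML file, so
   source="&amp;lt;" denotes the four-character string &lt; and
   target="&lt;" denotes the single character <.

   <replacement-rule regexp="&amp;[a-z]+;">
     <replace source="&amp;lt;" target="&lt;" prefer="yes"/>
     <replace source="&amp;gt;" target="&gt;" prefer="yes"/>
   </replacement-rule>
-->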