<!-- SCCS keyword
#pragma ident "%Z%%M% %I% %E% SMI"
-->
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
<HTML>
<HEAD>
<TITLE>SP - SGML declaration</TITLE>
</HEAD>
<BODY>
<H1>Handling of the SGML declaration in SP</H1>
<H2>Default SGML declaration</H2>
<P>
If the SGML declaration is omitted
and there is no applicable
<A HREF="catalog.htm#sgmldecl"><SAMP>SGMLDECL</SAMP></A>
entry in a catalog,
the following declaration will be implied:
<PRE>
&lt;!SGML "ISO 8879:1986"
  CHARSET
    BASESET  "ISO 646-1983//CHARSET
              International Reference Version (IRV)//ESC 2/5 4/0"
    DESCSET    0  9 UNUSED
               9  2  9
              11  2 UNUSED
              13  1 13
              14 18 UNUSED
              32 95 32
             127  1 UNUSED
  CAPACITY PUBLIC "ISO 8879:1986//CAPACITY Reference//EN"
  SCOPE    DOCUMENT
  SYNTAX
    SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
                      18 19 20 21 22 23 24 25 26 27 28 29 30 31 127 255
    BASESET  "ISO 646-1983//CHARSET International Reference Version
              (IRV)//ESC 2/5 4/0"
    DESCSET  0 128 0
    FUNCTION RE          13
             RS          10
             SPACE       32
             TAB SEPCHAR  9
    NAMING   LCNMSTRT ""
             UCNMSTRT ""
             LCNMCHAR "-."
             UCNMCHAR "-."
             NAMECASE GENERAL YES
                      ENTITY  NO
    DELIM    GENERAL  SGMLREF
             SHORTREF SGMLREF
    NAMES    SGMLREF
    QUANTITY SGMLREF
             ATTCNT   99999999
             ATTSPLEN 99999999
             DTEMPLEN 24000
             ENTLVL   99999999
             GRPCNT   99999999
             GRPGTCNT 99999999
             GRPLVL   99999999
             LITLEN   24000
             NAMELEN  99999999
             PILEN    24000
             TAGLEN   99999999
             TAGLVL   99999999
  FEATURES
    MINIMIZE DATATAG  NO
             OMITTAG  YES
             RANK     YES
             SHORTTAG YES
    LINK     SIMPLE   YES 1000
             IMPLICIT YES
             EXPLICIT YES 1
    OTHER    CONCUR   NO
             SUBDOC   YES 99999999
             FORMAL   YES
  APPINFO NONE>
</PRE>
<P>
with the exception that all characters that are neither significant
nor shunned will be assigned to DATACHAR.
<H2>Character sets</H2>
<P>
A character in a base character set is described either by giving its
number in a universal character set, or by specifying a minimum
literal. The choice of universal character set is subject to three
constraints: characters that are significant in the SGML reference concrete
syntax must be in the universal character set and must have the same
number there as in ISO 646; each
character in the character set must be represented by exactly one
number; and character numbers in the ranges 0 to 31 and 127 to 159 are
control characters (for the purpose of enforcing SHUNCHAR CONTROLS).
It is recommended that ISO 10646 (Unicode) be used as the universal
character set, except in environments where the normal document
character sets are large character sets that cannot be compactly
described in terms of ISO 10646.
The public identifier of a base character set can be associated
with an entity that describes it by using a
<SAMP>PUBLIC</SAMP>
entry in the catalog entry file.
The entity must be a fragment
of an SGML declaration
consisting of the
portion of a character set description
that follows the DESCSET keyword;
that is, it must be a sequence of character descriptions,
where each character description specifies a described character
number, the number of characters, and
either a character number in the universal character set, a minimum literal,
or the keyword
<SAMP>UNUSED</SAMP>.
Character numbers in the universal character set can be as big as
99999999.
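<P>
As a minimal sketch of how such a sequence of character descriptions can be
read, the following Python expands each description into a per-character
mapping, using the DESCSET of the default declaration as data. This is an
illustration only, not SP's implementation: the name
<SAMP>expand_descset</SAMP> and the tuple representation are assumptions.

```python
UNUSED = "UNUSED"

# (described character number, number of characters, target), copied from the
# document character set of the default declaration. The tuple layout is an
# assumption made for this sketch.
DEFAULT_DESCSET = [
    (0, 9, UNUSED),
    (9, 2, 9),
    (11, 2, UNUSED),
    (13, 1, 13),
    (14, 18, UNUSED),
    (32, 95, 32),
    (127, 1, UNUSED),
]


def expand_descset(descriptions):
    """Map each described character number to its target.

    An integer target is a number in the base (universal) character set;
    consecutive described characters map to consecutive base numbers.
    Any other target (UNUSED or a minimum literal) is stored unchanged.
    """
    mapping = {}
    for start, count, target in descriptions:
        for offset in range(count):
            if isinstance(target, int):
                mapping[start + offset] = target + offset
            else:
                mapping[start + offset] = target
    return mapping


mapping = expand_descset(DEFAULT_DESCSET)
used = sorted(n for n, t in mapping.items() if t != UNUSED)
```

Here <SAMP>used</SAMP> holds the character numbers that the default
declaration actually describes: 9, 10, 13 and 32 to 126.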
<P>
In addition, SP has built-in knowledge of a few character sets.
These are identified using the designating sequence in the
public identifier. The following designating sequences are
recognized:
<DL>
<DT>
<SAMP>ESC 2/5 4/0</SAMP>
<DD>
The full set of ISO 646 IRV.
This is not a registered character set,
but is recommended by ISO 8879 (clause 10.2.2.4).
<DT>
<SAMP>ESC 2/8 4/0</SAMP>
<DD>
G0 set of ISO 646 IRV,
ISO Registration Number 2.
<DT>
<SAMP>ESC 2/8 4/2</SAMP>
<DD>
G0 set of ASCII,
ISO Registration Number 6.
<DT>
<SAMP>ESC 2/1 4/0</SAMP>
<DD>
C0 set of ISO 646,
ISO Registration Number 1.
</DL>
<P>
All the above character sets will be treated as mapping character numbers
0 to 127 inclusive as in ISO 646.
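<P>
The lookup described above might be sketched in Python as follows. This is
not SP's code: the function names are hypothetical, and the sketch assumes
that the designating sequence is the last <SAMP>//</SAMP>-separated field of
the public identifier, as it is in the BASESET of the default declaration.

```python
# Designating sequences listed above, with the descriptions given in the text.
KNOWN_SEQUENCES = {
    "ESC 2/5 4/0": "full set of ISO 646 IRV",
    "ESC 2/8 4/0": "G0 set of ISO 646 IRV (ISO Registration Number 2)",
    "ESC 2/8 4/2": "G0 set of ASCII (ISO Registration Number 6)",
    "ESC 2/1 4/0": "C0 set of ISO 646 (ISO Registration Number 1)",
}


def designating_sequence(public_id):
    """Return the last //-separated field with whitespace runs collapsed.

    Assumption: the designating sequence occupies the final field.
    """
    last_field = public_id.split("//")[-1]
    return " ".join(last_field.split())


def is_known_charset(public_id):
    """True if the identifier carries one of the sequences listed above."""
    return designating_sequence(public_id) in KNOWN_SEQUENCES


default_baseset = (
    "ISO 646-1983//CHARSET\n"
    "International Reference Version (IRV)//ESC 2/5 4/0"
)
known = is_known_charset(default_baseset)
```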
<P>
It is not necessary for every character set used in the SGML
declaration to be known to SP,
provided that characters in the document character set that are
significant both in the reference concrete syntax and in the described
concrete syntax are described using known base character sets and that
characters that are significant in the described concrete syntax are
described using the same base character sets or the same minimum
literals in both the document character set description and the syntax
reference character set description.
<H2>Concrete syntaxes</H2>
<P>
The public identifier for a public concrete syntax can be associated
with an entity that describes it by using a
<SAMP>PUBLIC</SAMP>
entry in the catalog entry file.
The entity must be a fragment of an SGML declaration
consisting of a concrete syntax description
starting with the
<SAMP>SHUNCHAR</SAMP>
keyword
as in an SGML declaration.
The entity can also make use of the following extensions:
<UL>
<LI>
An
<I>added function</I>
can be expressed as a parameter literal
instead of a name.
<LI>
The replacement for a reference reserved name
can be expressed as a parameter literal instead of a name.
<LI>
The
<SAMP>LCNMSTRT</SAMP>,
<SAMP>UCNMSTRT</SAMP>,
<SAMP>LCNMCHAR</SAMP>
and
<SAMP>UCNMCHAR</SAMP>
keywords may each be followed by more than one parameter literal. A
sequence of parameter literals has the same meaning as a single
parameter literal whose content is the concatenation of the content of
each of the literals in the sequence. This extension is useful
because of the restriction on the length of a parameter literal in the
SGML declaration to 240 characters.
<LI>
The total number of characters specified for
<SAMP>UCNMCHAR</SAMP>
and
<SAMP>UCNMSTRT</SAMP>
may exceed the total number of characters specified for
<SAMP>LCNMCHAR</SAMP>
and
<SAMP>LCNMSTRT</SAMP>
respectively.
Each character in
<SAMP>UCNMCHAR</SAMP>
or
<SAMP>UCNMSTRT</SAMP>
which does not have a corresponding character in the same position in
<SAMP>LCNMCHAR</SAMP>
or
<SAMP>LCNMSTRT</SAMP>
is simply assigned to <SAMP>UCNMCHAR</SAMP> or <SAMP>UCNMSTRT</SAMP>
without making it the upper-case form of any character.
<LI>
A parameter following any of the
<SAMP>LCNMSTRT</SAMP>,
<SAMP>UCNMSTRT</SAMP>,
<SAMP>LCNMCHAR</SAMP>
and
<SAMP>UCNMCHAR</SAMP>
keywords may be followed by
the name token <SAMP>...</SAMP>
(three periods) and another parameter literal.
This has the same meaning as the two parameter literals
with a parameter literal in between
containing, in order, each character whose number
is greater than the number of the last character in
the first parameter literal and less than the
number of the first character in the second
parameter literal.
A parameter literal must contain at least one character for each
<SAMP>...</SAMP>
to which it is adjacent.
<LI>
A number may be used as a parameter following the
<SAMP>LCNMSTRT</SAMP>,
<SAMP>UCNMSTRT</SAMP>,
<SAMP>LCNMCHAR</SAMP>
and
<SAMP>UCNMCHAR</SAMP>
keywords or as a delimiter in the
<SAMP>DELIM</SAMP>
section with the same meaning as a parameter literal
containing just a numeric character reference with that number.
<LI>
The parameters following the
<SAMP>LCNMSTRT</SAMP>,
<SAMP>UCNMSTRT</SAMP>,
<SAMP>LCNMCHAR</SAMP>
and
<SAMP>UCNMCHAR</SAMP>
keywords may be omitted.
This has the same meaning as specifying
an empty parameter literal.
<LI>
Within the specification of the short reference delimiters,
a parameter literal containing exactly one character
may be followed by the name token <SAMP>...</SAMP>
and another parameter literal containing exactly one character.
This has the same meaning as a sequence of parameter literals,
one for each character number that is greater than or equal
to the number of the character in the first parameter literal
and less than or equal to the number of the character in the
second parameter literal.
</UL>
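<P>
The naming and short reference extensions in the list above can be
illustrated with a short Python sketch. It is an assumption-laden model
rather than SP's parser: parameters are given as a Python list in which a
string stands for a parameter literal, an integer for a number, and Python's
<SAMP>...</SAMP> (Ellipsis) object for the <SAMP>...</SAMP> name token.
Python code points stand in for character numbers, and all three function
names are hypothetical.

```python
def expand_naming(params):
    """Return the single literal that a naming parameter sequence stands for."""
    # A number means a literal containing just the character with that number.
    items = [chr(p) if isinstance(p, int) else p for p in params]
    result = []
    for i, item in enumerate(items):
        if item is not Ellipsis:
            result.append(item)  # literals are simply concatenated
            continue
        before = items[i - 1] if i > 0 else None
        after = items[i + 1] if i + 1 < len(items) else None
        if not (isinstance(before, str) and before
                and isinstance(after, str) and after):
            raise ValueError("'...' needs a non-empty literal on each side")
        # Characters strictly between the last character of the preceding
        # literal and the first character of the following one.
        low, high = ord(before[-1]), ord(after[0])
        result.append("".join(chr(n) for n in range(low + 1, high)))
    return "".join(result)


def pair_case(lower, upper):
    """Pair characters by position; surplus upper characters stay unpaired."""
    upper_of = dict(zip(lower, upper))
    extras = upper[len(lower):]
    return upper_of, extras


def expand_shortref_range(first, last):
    """Short reference form: one literal per character number, inclusive."""
    if len(first) != 1 or len(last) != 1:
        raise ValueError("each literal must contain exactly one character")
    return [chr(n) for n in range(ord(first), ord(last) + 1)]


letters = expand_naming(["a", ..., "e"])   # range between two literals
name_chars = expand_naming([45, "."])      # number followed by a literal
nothing = expand_naming([])                # omitted parameters
```

Note the difference between the two range forms: in the naming keywords the
range fills in only the characters strictly between the neighbouring
literals, whereas in the short reference delimiters the range includes both
end characters.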
<H2>Capacity sets</H2>
<P>
The public identifier for a public capacity set can be associated
with an entity that describes it by using a
<SAMP>PUBLIC</SAMP>
entry in the catalog entry file.
The entity must be a fragment of an SGML declaration
consisting of a sequence of capacity names and numbers.
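<P>
As a rough illustration of what such an entity contains, the Python sketch
below reads a whitespace-separated sequence of capacity names and numbers
into a dictionary. The function name and the numeric values are made up for
the example, and the sketch ignores anything else an SGML declaration
fragment might contain, such as comments.

```python
def parse_capacity_set(text):
    """Parse alternating capacity names and numbers into a dict."""
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating capacity names and numbers")
    capacities = {}
    for name, number in zip(tokens[0::2], tokens[1::2]):
        if not number.isdigit():
            raise ValueError(f"not a number: {number!r}")
        capacities[name.upper()] = int(number)
    return capacities


# Illustrative values only; TOTALCAP and ENTCAP are capacity names from ISO 8879.
example = parse_capacity_set("TOTALCAP 200000\nENTCAP 35000")
```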
<ADDRESS>
James Clark<BR>
jjc@jclark.com
</ADDRESS>
</BODY>
</HTML>