1 <?xml version="1.0" ?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml">
4 <head>
5 <title>Unicode::Collate - Unicode Collation Algorithm</title>
6 <meta http-equiv="content-type" content="text/html; charset=utf-8" />
7 <link rev="made" href="mailto:" />
8 </head>
10 <body style="background-color: white">
11 <table border="0" width="100%" cellspacing="0" cellpadding="3">
12 <tr><td class="block" style="background-color: #cccccc" valign="middle">
13 <big><strong><span class="block">&nbsp;Unicode::Collate - Unicode Collation Algorithm</span></strong></big>
14 </td></tr>
15 </table>
17 <p><a name="__index__"></a></p>
18 <!-- INDEX BEGIN -->
20 <ul>
22 <li><a href="#name">NAME</a></li>
23 <li><a href="#synopsis">SYNOPSIS</a></li>
24 <li><a href="#description">DESCRIPTION</a></li>
25 <ul>
27 <li><a href="#constructor_and_tailoring">Constructor and Tailoring</a></li>
28 <li><a href="#methods_for_collation">Methods for Collation</a></li>
29 <li><a href="#methods_for_searching">Methods for Searching</a></li>
30 <li><a href="#other_methods">Other Methods</a></li>
31 </ul>
33 <li><a href="#export">EXPORT</a></li>
34 <li><a href="#install">INSTALL</a></li>
35 <li><a href="#caveats">CAVEATS</a></li>
36 <li><a href="#author__copyright_and_license">AUTHOR, COPYRIGHT AND LICENSE</a></li>
37 <li><a href="#see_also">SEE ALSO</a></li>
38 </ul>
39 <!-- INDEX END -->
41 <hr />
42 <p>
43 </p>
44 <h1><a name="name">NAME</a></h1>
45 <p>Unicode::Collate - Unicode Collation Algorithm</p>
46 <p>
47 </p>
48 <hr />
49 <h1><a name="synopsis">SYNOPSIS</a></h1>
50 <pre>
51 use Unicode::Collate;</pre>
52 <pre>
53 #construct
54 $Collator = Unicode::Collate-&gt;new(%tailoring);</pre>
55 <pre>
56 #sort
57 @sorted = $Collator-&gt;sort(@not_sorted);</pre>
58 <pre>
59 #compare
60 $result = $Collator-&gt;cmp($a, $b); # returns 1, 0, or -1.</pre>
61 <pre>
62 # If %tailoring is false (i.e. empty),
63 # $Collator should do the default collation.</pre>
64 <p>
65 </p>
66 <hr />
67 <h1><a name="description">DESCRIPTION</a></h1>
68 <p>This module is an implementation of Unicode Technical Standard #10
69 (a.k.a. UTS #10) - Unicode Collation Algorithm (a.k.a. UCA).</p>
70 <p>
71 </p>
72 <h2><a name="constructor_and_tailoring">Constructor and Tailoring</a></h2>
73 <p>The <code>new</code> method returns a collator object.</p>
74 <pre>
75 $Collator = Unicode::Collate-&gt;new(
76 UCA_Version =&gt; $UCA_Version,
77 alternate =&gt; $alternate, # deprecated: use of 'variable' is recommended.
78 backwards =&gt; $levelNumber, # or \@levelNumbers
79 entry =&gt; $element,
80 hangul_terminator =&gt; $term_primary_weight,
81 ignoreName =&gt; qr/$ignoreName/,
82 ignoreChar =&gt; qr/$ignoreChar/,
83 katakana_before_hiragana =&gt; $bool,
84 level =&gt; $collationLevel,
85 normalization =&gt; $normalization_form,
86 overrideCJK =&gt; \&amp;overrideCJK,
87 overrideHangul =&gt; \&amp;overrideHangul,
88 preprocess =&gt; \&amp;preprocess,
89 rearrange =&gt; \@charList,
90 table =&gt; $filename,
91 undefName =&gt; qr/$undefName/,
92 undefChar =&gt; qr/$undefChar/,
93 upper_before_lower =&gt; $bool,
94 variable =&gt; $variable,
95 );</pre>
96 <dl>
97 <dt><strong><a name="item_uca_version">UCA_Version</a></strong>
99 <dd>
<p>If a UCA tracking version number is given,
the behavior of that tracking version is emulated when collating.
If omitted, the return value of <a href="#item_uca_version"><code>UCA_Version()</code></a> is used.
<a href="#item_uca_version"><code>UCA_Version()</code></a> should return the latest tracking version supported.</p>
104 </dd>
105 <dd>
<p>Supported tracking versions: 8, 9, 11, and 14.</p>
107 </dd>
108 <dd>
109 <pre>
 UCA       Unicode Standard         DUCET (@version)
 ---------------------------------------------------
  8              3.1                3.0.1 (3.0.1d9)
  9     3.1 with Corrigendum 3      3.1.1 (3.1.1)
 11              4.0                4.0.0 (4.0.0)
 14             4.1.0               4.1.0 (4.1.0)</pre>
116 </dd>
117 <dd>
118 <p>Note: Recent UTS #10 renames ``Tracking Version'' to ``Revision.''</p>
119 </dd>
120 </li>
121 <dt><strong><a name="item_alternate">alternate</a></strong>
123 <dd>
124 <p>-- see 3.2.2 Alternate Weighting, version 8 of UTS #10</p>
125 </dd>
126 <dd>
127 <p>For backward compatibility, <a href="#item_alternate"><code>alternate</code></a> (old name) can be used
128 as an alias for <a href="#item_variable"><code>variable</code></a>.</p>
129 </dd>
130 </li>
131 <dt><strong><a name="item_backwards">backwards</a></strong>
133 <dd>
134 <p>-- see 3.1.2 French Accents, UTS #10.</p>
135 </dd>
136 <dd>
137 <pre>
138 backwards =&gt; $levelNumber or \@levelNumbers</pre>
139 </dd>
140 <dd>
<p>Weights are compared in reverse order at the given level(s); e.g. level 2 (diacritic ordering) in French.
If omitted, all levels are compared forwards.</p>
143 </dd>
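<dd>
<p>The effect of <code>backwards =&gt; 2</code> can be seen with the classic French accent example.
The sketch below assumes the default collation element table is installed;
the variable names are illustrative only, and the accented letters are written as <code>\x</code> escapes so the source stays ASCII.</p>
</dd>
<dd>
```perl
use strict;
use warnings;
use Unicode::Collate;

# cote, coté, côte, côté in scrambled order
my @words = ("c\x{F4}t\x{E9}", "cot\x{E9}", "c\x{F4}te", "cote");

my $forward = Unicode::Collate->new();                # all levels forwards
my $french  = Unicode::Collate->new(backwards => 2);  # level 2 backwards

# Forwards: the first accent difference from the left decides.
my @forward_order = $forward->sort(@words);  # cote, coté, côte, côté

# Backwards at level 2: the last accent difference decides.
my @french_order  = $french->sort(@words);   # cote, côte, coté, côté
```
</dd>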
144 </li>
145 <dt><strong><a name="item_entry">entry</a></strong>
147 <dd>
148 <p>-- see 3.1 Linguistic Features; 3.2.1 File Format, UTS #10.</p>
149 </dd>
150 <dd>
<p>If the same character (or sequence of characters) already exists
in the collation element table loaded through <a href="#item_table"><code>table</code></a>,
its mapping to collation elements is overridden.
If it does not exist, the mapping is added.</p>
155 </dd>
156 <dd>
157 <pre>
158 entry =&gt; &lt;&lt;'ENTRY', # for DUCET v4.0.0 (allkeys-4.0.0.txt)
159 0063 0068 ; [.0E6A.0020.0002.0063] # ch
160 0043 0068 ; [.0E6A.0020.0007.0043] # Ch
161 0043 0048 ; [.0E6A.0020.0008.0043] # CH
162 006C 006C ; [.0F4C.0020.0002.006C] # ll
163 004C 006C ; [.0F4C.0020.0007.004C] # Ll
164 004C 004C ; [.0F4C.0020.0008.004C] # LL
165 00F1 ; [.0F7B.0020.0002.00F1] # n-tilde
166 006E 0303 ; [.0F7B.0020.0002.00F1] # n-tilde
167 00D1 ; [.0F7B.0020.0008.00D1] # N-tilde
168 004E 0303 ; [.0F7B.0020.0008.00D1] # N-tilde
169 ENTRY</pre>
170 </dd>
171 <dd>
172 <pre>
173 entry =&gt; &lt;&lt;'ENTRY', # for DUCET v4.0.0 (allkeys-4.0.0.txt)
174 00E6 ; [.0E33.0020.0002.00E6][.0E8B.0020.0002.00E6] # ae ligature as &lt;a&gt;&lt;e&gt;
175 00C6 ; [.0E33.0020.0008.00C6][.0E8B.0020.0008.00C6] # AE ligature as &lt;A&gt;&lt;E&gt;
176 ENTRY</pre>
177 </dd>
178 <dd>
<p><strong>NOTE:</strong> The code point in the UCA file format (before <code>';'</code>)
<strong>must</strong> be a Unicode code point (written in hexadecimal),
not a native code point.
So <code>0063</code> must always denote <code>U+0063</code>,
not the character <code>&quot;\x63&quot;</code>.</p>
184 </dd>
185 <dd>
<p>Weighting may vary depending on the collation element table,
so ensure that the weights defined in <a href="#item_entry"><code>entry</code></a> are consistent with
those in the collation element table loaded via <a href="#item_table"><code>table</code></a>.</p>
189 </dd>
190 <dd>
<p>In DUCET v4.0.0, the primary weight of <code>C</code> is <code>0E60</code>
and that of <code>D</code> is <code>0E6D</code>. So setting the primary weight of <code>CH</code> to <code>0E6A</code>
(a value between <code>0E60</code> and <code>0E6D</code>)
yields the ordering <code>C &lt; CH &lt; D</code>.
Strictly speaking, DUCET already has some characters between <code>C</code> and <code>D</code>:
<code>small capital C</code> (<code>U+1D04</code>) with primary weight <code>0E64</code>,
<code>c-hook/C-hook</code> (<code>U+0188/U+0187</code>) with <code>0E65</code>,
and <code>c-curl</code> (<code>U+0255</code>) with <code>0E69</code>.
The primary weight <code>0E6A</code> therefore places <code>CH</code>
between <code>c-curl</code> and <code>D</code>.</p>
201 </dd>
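<dd>
<p>As a minimal sketch of how such a contraction changes ordering, the following uses a three-entry table
given entirely through <code>entry</code> (with <code>table =&gt; undef</code>), so it does not depend on any DUCET file.
The weights are borrowed from the discussion above; the variable names are illustrative only.</p>
</dd>
<dd>
```perl
use strict;
use warnings;
use Unicode::Collate;

# Only c, the contraction ch, and d are defined; no table file is read.
my $entries = <<'ENTRIES';
0063      ; [.0E60.0020.0002.0063] # LATIN SMALL LETTER C
0063 0068 ; [.0E6A.0020.0002.0063] # ch (treated as one collation element)
0064      ; [.0E6D.0020.0002.0064] # LATIN SMALL LETTER D
ENTRIES

my $tailored = Unicode::Collate->new(table => undef, entry => $entries);

# "ch" sorts as a single letter between c and d, so it follows "cd".
my @sorted = $tailored->sort(qw(d ch cd c));  # c, cd, ch, d
```
</dd>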
202 </li>
203 <dt><strong><a name="item_hangul_terminator">hangul_terminator</a></strong>
205 <dd>
206 <p>-- see 7.1.4 Trailing Weights, UTS #10.</p>
207 </dd>
208 <dd>
<p>If a true value is given (non-zero; it should be positive),
it is added as a terminator primary weight to the end of
every standard Hangul syllable. Secondary and higher weights
for the terminator are set to zero.
If the value is false or the <a href="#item_hangul_terminator"><code>hangul_terminator</code></a> key does not exist,
no terminator weights are inserted.</p>
215 </dd>
216 <dd>
217 <p>Boundaries of Hangul syllables are determined
218 according to conjoining Jamo behavior in <em>the Unicode Standard</em>
219 and <em>HangulSyllableType.txt</em>.</p>
220 </dd>
221 <dd>
222 <p><strong>Implementation Note:</strong>
(1) For an expansion mapping (a Unicode character mapped
to a sequence of collation elements), a terminator is not added
between collation elements, even if a Hangul syllable boundary exists there.
A terminator is added only at the position following
the last collation element.</p>
228 </dd>
229 <dd>
230 <p>(2) Non-conjoining Hangul letters
231 (Compatibility Jamo, halfwidth Jamo, and enclosed letters) are not
232 automatically terminated with a terminator primary weight.
233 These characters may need terminator included in a collation element
234 table beforehand.</p>
235 </dd>
236 </li>
237 <dt><strong><a name="item_ignorechar">ignoreChar</a></strong>
239 <dt><strong><a name="item_ignorename">ignoreName</a></strong>
241 <dd>
242 <p>-- see 3.2.2 Variable Weighting, UTS #10.</p>
243 </dd>
244 <dd>
<p>Makes the entry in the table completely ignorable;
i.e. as if the weights were zero at all levels.</p>
247 </dd>
248 <dd>
249 <p>Through <a href="#item_ignorechar"><code>ignoreChar</code></a>, any character matching <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_qr_"><code>qr/$ignoreChar/</code></a>
250 will be ignored. Through <a href="#item_ignorename"><code>ignoreName</code></a>, any character whose name
251 (given in the <a href="#item_table"><code>table</code></a> file as a comment) matches <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_qr_"><code>qr/$ignoreName/</code></a>
252 will be ignored.</p>
253 </dd>
254 <dd>
255 <p>E.g. when 'a' and 'e' are ignorable,
256 'element' is equal to 'lament' (or 'lmnt').</p>
257 </dd>
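<dd>
<p>The 'element'/'lament' case can be checked with a small sketch.
To stay self-contained it assumes a hand-written six-letter table passed through <code>entry</code>
(the weights are made up for illustration and are not DUCET values); the variable names are illustrative only.</p>
</dd>
<dd>
```perl
use strict;
use warnings;
use Unicode::Collate;

# Illustrative weights only; not taken from any DUCET file.
my $entries = <<'ENTRIES';
0061 ; [.0101.0020.0002.0061] # LATIN SMALL LETTER A
0065 ; [.0102.0020.0002.0065] # LATIN SMALL LETTER E
006C ; [.0103.0020.0002.006C] # LATIN SMALL LETTER L
006D ; [.0104.0020.0002.006D] # LATIN SMALL LETTER M
006E ; [.0105.0020.0002.006E] # LATIN SMALL LETTER N
0074 ; [.0106.0020.0002.0074] # LATIN SMALL LETTER T
ENTRIES

my $plain    = Unicode::Collate->new(table => undef, entry => $entries);
my $ignoring = Unicode::Collate->new(
    table      => undef,
    entry      => $entries,
    ignoreChar => qr/^[ae]$/,   # make 'a' and 'e' completely ignorable
);

my $eq_plain    = $plain->eq('element', 'lament')    ? 1 : 0;  # 0
my $eq_ignoring = $ignoring->eq('element', 'lament') ? 1 : 0;  # 1 (both act like 'lmnt')
```
</dd>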
258 </li>
259 <dt><strong><a name="item_katakana_before_hiragana">katakana_before_hiragana</a></strong>
261 <dd>
262 <p>-- see 7.3.1 Tertiary Weight Table, UTS #10.</p>
263 </dd>
264 <dd>
265 <p>By default, hiragana is before katakana.
266 If the parameter is made true, this is reversed.</p>
267 </dd>
268 <dd>
<p><strong>NOTE</strong>: This parameter simplemindedly assumes that any hiragana/katakana
distinctions occur at level 3, and that their weights at level 3 are the
same as those mentioned in 7.3.1, UTS #10.
If you define collation elements that violate this requirement,
this parameter does not work correctly.</p>
274 </dd>
275 </li>
276 <dt><strong><a name="item_level">level</a></strong>
278 <dd>
279 <p>-- see 4.3 Form Sort Key, UTS #10.</p>
280 </dd>
281 <dd>
<p>Sets the maximum level.
Levels higher than the specified one are ignored.</p>
284 </dd>
285 <dd>
286 <pre>
287 Level 1: alphabetic ordering
288 Level 2: diacritic ordering
289 Level 3: case ordering
290 Level 4: tie-breaking (e.g. in the case when variable is 'shifted')</pre>
291 </dd>
292 <dd>
293 <pre>
294 ex.level =&gt; 2,</pre>
295 </dd>
296 <dd>
297 <p>If omitted, the maximum is the 4th.</p>
298 </dd>
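<dd>
<p>A short sketch of what each maximum level distinguishes, assuming the default collation element table is installed.
The variable names are illustrative only.</p>
</dd>
<dd>
```perl
use strict;
use warnings;
use Unicode::Collate;

my $level1 = Unicode::Collate->new(level => 1);  # letters only
my $level2 = Unicode::Collate->new(level => 2);  # letters + diacritics
my $level3 = Unicode::Collate->new(level => 3);  # letters + diacritics + case

my $accented = "r\x{E9}sum\x{E9}";   # "résumé" written with \x escapes

# Accents are a level 2 difference.
my $l1_accent = $level1->eq("resume", $accented) ? 1 : 0;  # 1
my $l2_accent = $level2->eq("resume", $accented) ? 1 : 0;  # 0

# Case is a level 3 difference.
my $l2_case = $level2->eq("perl", "PERL") ? 1 : 0;  # 1
my $l3_case = $level3->eq("perl", "PERL") ? 1 : 0;  # 0
```
</dd>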
299 </li>
300 <dt><strong><a name="item_normalization">normalization</a></strong>
302 <dd>
303 <p>-- see 4.1 Normalize, UTS #10.</p>
304 </dd>
305 <dd>
306 <p>If specified, strings are normalized before preparation of sort keys
307 (the normalization is executed after preprocess).</p>
308 </dd>
309 <dd>
<p>A form name that <code>Unicode::Normalize::normalize()</code> accepts is applied
as <code>$normalization_form</code>.
Acceptable names include <code>'NFD'</code>, <code>'NFC'</code>, <code>'NFKD'</code>, and <code>'NFKC'</code>.
See <code>Unicode::Normalize::normalize()</code> for details.
If omitted, <code>'NFD'</code> is used.</p>
315 </dd>
316 <dd>
317 <p><a href="#item_normalization"><code>normalization</code></a> is performed after <a href="#item_preprocess"><code>preprocess</code></a> (if defined).</p>
318 </dd>
319 <dd>
320 <p>Furthermore, special values, <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_undef"><code>undef</code></a> and <code>&quot;prenormalized&quot;</code>, can be used,
321 though they are not concerned with <code>Unicode::Normalize::normalize()</code>.</p>
322 </dd>
323 <dd>
<p>If <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_undef"><code>undef</code></a> (not the string <code>&quot;undef&quot;</code>) is passed explicitly
as the value for this key,
no normalization is carried out (this may make tailoring easier
if normalization is not desired). Under <code>(normalization =&gt; undef)</code>,
only contiguous contractions are resolved;
e.g. even if <code>A-ring</code> (and <code>A-ring-cedilla</code>) is ordered after <code>Z</code>,
<code>A-cedilla-ring</code> would be primary equal to <a href="file://C|\msysgit\mingw\html/pod/perlguts.html#item_a"><code>A</code></a>.
In this respect,
<code>(normalization =&gt; undef, preprocess =&gt; sub { NFD(shift) })</code>
<strong>is not</strong> equivalent to <code>(normalization =&gt; 'NFD')</code>.</p>
334 </dd>
335 <dd>
<p>In the case of <code>(normalization =&gt; &quot;prenormalized&quot;)</code>,
no normalization is performed, but
non-contiguous contractions with combining characters are resolved.
Therefore
<code>(normalization =&gt; 'prenormalized', preprocess =&gt; sub { NFD(shift) })</code>
<strong>is</strong> equivalent to <code>(normalization =&gt; 'NFD')</code>.
If the source strings are already normalized,
<code>(normalization =&gt; 'prenormalized')</code> may save the time spent on normalization.</p>
344 </dd>
345 <dd>
<p>Except under <code>(normalization =&gt; undef)</code>,
<strong>Unicode::Normalize</strong> is required (see also <strong>CAVEATS</strong>).</p>
348 </dd>
349 </li>
350 <dt><strong><a name="item_overridecjk">overrideCJK</a></strong>
352 <dd>
353 <p>-- see 7.1 Derived Collation Elements, UTS #10.</p>
354 </dd>
355 <dd>
<p>By default, CJK Unified Ideographs are ordered in Unicode codepoint order,
except that <code>CJK Unified Ideographs</code> (if <a href="#item_uca_version"><code>UCA_Version</code></a> is 8 to 11, the range is
<code>U+4E00..U+9FA5</code>; if <a href="#item_uca_version"><code>UCA_Version</code></a> is 14, the range is <code>U+4E00..U+9FBB</code>)
are ordered before <code>CJK Unified Ideographs Extension</code> (ranges
<code>U+3400..U+4DB5</code> and <code>U+20000..U+2A6D6</code>).</p>
361 </dd>
362 <dd>
<p>Through <a href="#item_overridecjk"><code>overrideCJK</code></a>, the ordering of CJK Unified Ideographs can be overridden.</p>
364 </dd>
365 <dd>
366 <p>ex. CJK Unified Ideographs in the JIS code point order.</p>
367 </dd>
368 <dd>
369 <pre>
  overrideCJK =&gt; sub {
      my $u = shift;             # get a Unicode codepoint
      my $b = pack('n', $u);     # to UTF-16BE
      my $s = your_unicode_to_sjis_converter($b); # convert
      my $n = unpack('n', $s);   # convert sjis to short
      [ $n, 0x20, 0x2, $u ];     # return the collation element
  },</pre>
377 </dd>
378 <dd>
379 <p>ex. ignores all CJK Unified Ideographs.</p>
380 </dd>
381 <dd>
382 <pre>
383 overrideCJK =&gt; sub {()}, # CODEREF returning empty list</pre>
384 </dd>
385 <dd>
386 <pre>
387 # where -&gt;eq(&quot;Pe\x{4E00}rl&quot;, &quot;Perl&quot;) is true
388 # as U+4E00 is a CJK Unified Ideograph and to be ignorable.</pre>
389 </dd>
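<dd>
<p>The ignore-all case can be written out as a complete script.
This is a sketch assuming the default collation element table is installed; the variable names are illustrative only.</p>
</dd>
<dd>
```perl
use strict;
use warnings;
use Unicode::Collate;

my $default = Unicode::Collate->new();
my $no_cjk  = Unicode::Collate->new(
    overrideCJK => sub { () },   # no collation element: ideographs become ignorable
);

my $with_ideograph = "Pe\x{4E00}rl";   # U+4E00 is a CJK Unified Ideograph

my $eq_default = $default->eq($with_ideograph, "Perl") ? 1 : 0;  # 0
my $eq_no_cjk  = $no_cjk->eq($with_ideograph, "Perl")  ? 1 : 0;  # 1
```
</dd>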
390 <dd>
391 <p>If <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_undef"><code>undef</code></a> is passed explicitly as the value for this key,
392 weights for CJK Unified Ideographs are treated as undefined.
393 But assignment of weight for CJK Unified Ideographs
394 in table or <a href="#item_entry"><code>entry</code></a> is still valid.</p>
395 </dd>
396 </li>
397 <dt><strong><a name="item_overridehangul">overrideHangul</a></strong>
399 <dd>
400 <p>-- see 7.1 Derived Collation Elements, UTS #10.</p>
401 </dd>
402 <dd>
<p>By default, Hangul Syllables are decomposed into Hangul Jamo,
even under <code>(normalization =&gt; undef)</code>.
But the mapping of Hangul Syllables may be overridden.</p>
406 </dd>
407 <dd>
408 <p>This parameter works like <a href="#item_overridecjk"><code>overrideCJK</code></a>, so see there for examples.</p>
409 </dd>
410 <dd>
411 <p>If you want to override the mapping of Hangul Syllables,
412 NFD, NFKD, and FCD are not appropriate,
413 since they will decompose Hangul Syllables before overriding.</p>
414 </dd>
415 <dd>
416 <p>If <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_undef"><code>undef</code></a> is passed explicitly as the value for this key,
417 weight for Hangul Syllables is treated as undefined
418 without decomposition into Hangul Jamo.
419 But definition of weight for Hangul Syllables
420 in table or <a href="#item_entry"><code>entry</code></a> is still valid.</p>
421 </dd>
422 </li>
423 <dt><strong><a name="item_preprocess">preprocess</a></strong>
425 <dd>
426 <p>-- see 5.1 Preprocessing, UTS #10.</p>
427 </dd>
428 <dd>
<p>If specified, the coderef is called on each string to preprocess it
before the formation of sort keys.</p>
431 </dd>
432 <dd>
<p>ex. dropping English articles, such as ``a'' or ``the''.
Then, ``the pen'' sorts before ``a pencil''.</p>
435 </dd>
436 <dd>
437 <pre>
  preprocess =&gt; sub {
      my $str = shift;
      $str =~ s/\b(?:an?|the)\s+//gi;
      return $str;
  },</pre>
443 </dd>
444 <dd>
445 <p><a href="#item_preprocess"><code>preprocess</code></a> is performed before <a href="#item_normalization"><code>normalization</code></a> (if defined).</p>
446 </dd>
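<dd>
<p>A runnable sketch of the article-dropping example, assuming the default collation element table is installed.
It compares the order with and without <code>preprocess</code>; the variable names are illustrative only.</p>
</dd>
<dd>
```perl
use strict;
use warnings;
use Unicode::Collate;

my $plain = Unicode::Collate->new();
my $no_articles = Unicode::Collate->new(
    preprocess => sub {
        my $str = shift;
        $str =~ s/\b(?:an?|the)\s+//gi;   # drop "a", "an", "the"
        return $str;
    },
);

my @titles = ("the pen", "a pencil");

# Keys are built from the preprocessed text, but the original strings are returned.
my @plain_order   = $plain->sort(@titles);        # "a pencil", "the pen"
my @article_order = $no_articles->sort(@titles);  # "the pen", "a pencil"
```
</dd>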
447 </li>
448 <dt><strong><a name="item_rearrange">rearrange</a></strong>
450 <dd>
451 <p>-- see 3.1.3 Rearrangement, UTS #10.</p>
452 </dd>
453 <dd>
<p>Characters that are not coded in logical order and are to be rearranged.
If <a href="#item_uca_version"><code>UCA_Version</code></a> is 11 or lower, the default is:</p>
456 </dd>
457 <dd>
458 <pre>
459 rearrange =&gt; [ 0x0E40..0x0E44, 0x0EC0..0x0EC4 ],</pre>
460 </dd>
461 <dd>
462 <p>If you want to disallow any rearrangement, pass <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_undef"><code>undef</code></a> or <code>[]</code>
463 (a reference to empty list) as the value for this key.</p>
464 </dd>
465 <dd>
466 <p>If <a href="#item_uca_version"><code>UCA_Version</code></a> is equal to 14, default is <code>[]</code> (i.e. no rearrangement).</p>
467 </dd>
468 <dd>
<p><strong>According to version 9 of UCA, this parameter shall not be used;
but no warning is issued at present.</strong></p>
471 </dd>
472 </li>
473 <dt><strong><a name="item_table">table</a></strong>
475 <dd>
476 <p>-- see 3.2 Default Unicode Collation Element Table, UTS #10.</p>
477 </dd>
478 <dd>
479 <p>You can use another collation element table if desired.</p>
480 </dd>
481 <dd>
<p>The table file should be located in the <em>Unicode/Collate</em> directory
on <a href="file://C|\msysgit\mingw\html/pod/perlvar.html#item__inc"><code>@INC</code></a>. For example, if the filename is <em>Foo.txt</em>,
the table file is searched for as <em>Unicode/Collate/Foo.txt</em> in <a href="file://C|\msysgit\mingw\html/pod/perlvar.html#item__inc"><code>@INC</code></a>.</p>
485 </dd>
486 <dd>
<p>By default, <em>allkeys.txt</em> (the filename of DUCET) is used.
If you prepare your own table file, a name other than <em>allkeys.txt</em>
is advisable to avoid a name conflict.</p>
490 </dd>
491 <dd>
492 <p>If <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_undef"><code>undef</code></a> is passed explicitly as the value for this key,
493 no file is read (but you can define collation elements via <a href="#item_entry"><code>entry</code></a>).</p>
494 </dd>
495 <dd>
<p>A typical way to define a collation element table
without any table file:</p>
498 </dd>
499 <dd>
500 <pre>
501 $onlyABC = Unicode::Collate-&gt;new(
502 table =&gt; undef,
503 entry =&gt; &lt;&lt; 'ENTRIES',
504 0061 ; [.0101.0020.0002.0061] # LATIN SMALL LETTER A
505 0041 ; [.0101.0020.0008.0041] # LATIN CAPITAL LETTER A
506 0062 ; [.0102.0020.0002.0062] # LATIN SMALL LETTER B
507 0042 ; [.0102.0020.0008.0042] # LATIN CAPITAL LETTER B
508 0063 ; [.0103.0020.0002.0063] # LATIN SMALL LETTER C
509 0043 ; [.0103.0020.0008.0043] # LATIN CAPITAL LETTER C
510 ENTRIES
511 );</pre>
512 </dd>
513 <dd>
514 <p>If <a href="#item_ignorename"><code>ignoreName</code></a> or <a href="#item_undefname"><code>undefName</code></a> is used, character names should be
515 specified as a comment (following <code>#</code>) on each line.</p>
516 </dd>
517 </li>
518 <dt><strong><a name="item_undefchar">undefChar</a></strong>
520 <dt><strong><a name="item_undefname">undefName</a></strong>
522 <dd>
523 <p>-- see 6.3.4 Reducing the Repertoire, UTS #10.</p>
524 </dd>
525 <dd>
<p>Undefines the collation element as if it were unassigned in the table.
This reduces the size of the table.
If an unassigned character appears in the string to be collated,
the sort key is made from its codepoint
as a single-character collation element,
which is greater than any assigned collation element
(unassigned characters are ordered among themselves by codepoint).
However, it may be better to ignore characters
that are unfamiliar to you and probably never used.</p>
535 </dd>
536 <dd>
537 <p>Through <a href="#item_undefchar"><code>undefChar</code></a>, any character matching <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_qr_"><code>qr/$undefChar/</code></a>
538 will be undefined. Through <a href="#item_undefname"><code>undefName</code></a>, any character whose name
539 (given in the <a href="#item_table"><code>table</code></a> file as a comment) matches <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_qr_"><code>qr/$undefName/</code></a>
540 will be undefined.</p>
541 </dd>
542 <dd>
<p>ex. Collation weights for beyond-BMP characters are not stored in the object:</p>
544 </dd>
545 <dd>
546 <pre>
547 undefChar =&gt; qr/[^\0-\x{fffd}]/,</pre>
548 </dd>
549 </li>
550 <dt><strong><a name="item_upper_before_lower">upper_before_lower</a></strong>
552 <dd>
553 <p>-- see 6.6 Case Comparisons, UTS #10.</p>
554 </dd>
555 <dd>
556 <p>By default, lowercase is before uppercase.
557 If the parameter is made true, this is reversed.</p>
558 </dd>
559 <dd>
<p><strong>NOTE</strong>: This parameter simplemindedly assumes that any lowercase/uppercase
distinctions occur at level 3, and that their weights at level 3 are the
same as those mentioned in 7.3.1, UTS #10.
If you define collation elements that differ from this requirement,
this parameter does not work correctly.</p>
565 </dd>
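<dd>
<p>A minimal sketch of the reversal, assuming the default collation element table is installed.
The variable names are illustrative only.</p>
</dd>
<dd>
```perl
use strict;
use warnings;
use Unicode::Collate;

my $lower_first = Unicode::Collate->new();                         # default
my $upper_first = Unicode::Collate->new(upper_before_lower => 1);

# The two words differ only in case, which is decided at level 3.
my @default_order  = $lower_first->sort(qw(Perl perl));  # perl, Perl
my @reversed_order = $upper_first->sort(qw(perl Perl));  # Perl, perl
```
</dd>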
566 </li>
567 <dt><strong><a name="item_variable">variable</a></strong>
569 <dd>
570 <p>-- see 3.2.2 Variable Weighting, UTS #10.</p>
571 </dd>
572 <dd>
<p>This key selects how variable collation elements are weighted;
these are the elements marked with an ASTERISK in the table
(NOTE: Many punctuation marks and symbols are variable in <em>allkeys.txt</em>).</p>
576 </dd>
577 <dd>
578 <pre>
579 variable =&gt; 'blanked', 'non-ignorable', 'shifted', or 'shift-trimmed'.</pre>
580 </dd>
581 <dd>
582 <p>These names are case-insensitive.
583 By default (if specification is omitted), 'shifted' is adopted.</p>
584 </dd>
585 <dd>
586 <pre>
587 'Blanked' Variable elements are made ignorable at levels 1 through 3;
588 considered at the 4th level.</pre>
589 </dd>
590 <dd>
591 <pre>
592 'Non-Ignorable' Variable elements are not reset to ignorable.</pre>
593 </dd>
594 <dd>
595 <pre>
 'Shifted'        Variable elements are made ignorable at levels 1 through 3;
                  their level 4 weight is replaced by the old level 1 weight.
                  Level 4 weight for Non-Variable elements is 0xFFFF.</pre>
599 </dd>
600 <dd>
601 <pre>
602 'Shift-Trimmed' Same as 'shifted', but all FFFF's at the 4th level
603 are trimmed.</pre>
604 </dd>
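<dd>
<p>The difference between the options shows up with a hyphen, which is a variable element in DUCET.
The following sketch assumes the default collation element table is installed; the variable names are illustrative only.</p>
</dd>
<dd>
```perl
use strict;
use warnings;
use Unicode::Collate;

my $shifted_l3 = Unicode::Collate->new(variable => 'shifted',       level => 3);
my $non_ign_l3 = Unicode::Collate->new(variable => 'non-ignorable', level => 3);
my $shifted_l4 = Unicode::Collate->new(variable => 'shifted');     # level 4 by default

# 'shifted': the hyphen is ignorable at levels 1 through 3 ...
my $eq_shifted_l3 = $shifted_l3->eq("de-luge", "deluge") ? 1 : 0;  # 1
# ... but still breaks the tie at level 4.
my $eq_shifted_l4 = $shifted_l4->eq("de-luge", "deluge") ? 1 : 0;  # 0
# 'non-ignorable': the hyphen keeps its primary weight.
my $eq_non_ign_l3 = $non_ign_l3->eq("de-luge", "deluge") ? 1 : 0;  # 0
```
</dd>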
605 </li>
606 </dl>
608 </p>
609 <h2><a name="methods_for_collation">Methods for Collation</a></h2>
610 <dl>
611 <dt><strong><a name="item_sort"><code>@sorted = $Collator-&gt;sort(@not_sorted)</code></a></strong>
613 <dd>
614 <p>Sorts a list of strings.</p>
615 </dd>
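<dd>
<p>For comparison with Perl's built-in <code>sort</code>, a minimal sketch assuming the default collation element table is installed
(the variable names are illustrative only):</p>
</dd>
<dd>
```perl
use strict;
use warnings;
use Unicode::Collate;

my $Collator = Unicode::Collate->new();
my @words = qw(banana Cherry apple);

# Built-in sort compares code points, so uppercase 'C' comes first.
my @codepoint_order = sort @words;              # Cherry, apple, banana

# The collator orders alphabetically, regardless of case.
my @uca_order = $Collator->sort(@words);        # apple, banana, Cherry
```
</dd>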
616 </li>
617 <dt><strong><a name="item_cmp"><code>$result = $Collator-&gt;cmp($a, $b)</code></a></strong>
619 <dd>
<p>Returns 1 (when <a href="file://C|\msysgit\mingw\html/pod/perlvar.html#item__a"><code>$a</code></a> is greater than <a href="file://C|\msysgit\mingw\html/pod/perlvar.html#item__b"><code>$b</code></a>)
or 0 (when <a href="file://C|\msysgit\mingw\html/pod/perlvar.html#item__a"><code>$a</code></a> is equal to <a href="file://C|\msysgit\mingw\html/pod/perlvar.html#item__b"><code>$b</code></a>)
or -1 (when <a href="file://C|\msysgit\mingw\html/pod/perlvar.html#item__a"><code>$a</code></a> is less than <a href="file://C|\msysgit\mingw\html/pod/perlvar.html#item__b"><code>$b</code></a>).</p>
623 </dd>
624 </li>
625 <dt><strong><a name="item_eq"><code>$result = $Collator-&gt;eq($a, $b)</code></a></strong>
627 <dt><strong><a name="item_ne"><code>$result = $Collator-&gt;ne($a, $b)</code></a></strong>
629 <dt><strong><a name="item_lt"><code>$result = $Collator-&gt;lt($a, $b)</code></a></strong>
631 <dt><strong><a name="item_le"><code>$result = $Collator-&gt;le($a, $b)</code></a></strong>
633 <dt><strong><a name="item_gt"><code>$result = $Collator-&gt;gt($a, $b)</code></a></strong>
635 <dt><strong><a name="item_ge"><code>$result = $Collator-&gt;ge($a, $b)</code></a></strong>
637 <dd>
<p>They work like the Perl operators of the same name.</p>
639 </dd>
640 <dd>
641 <pre>
642 eq : whether $a is equal to $b.
643 ne : whether $a is not equal to $b.
lt : whether $a is less than $b.
le : whether $a is less than $b or equal to $b.
646 gt : whether $a is greater than $b.
647 ge : whether $a is greater than $b or equal to $b.</pre>
648 </dd>
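<dd>
<p>A short sketch contrasting these methods with Perl's string operators,
assuming the default collation element table is installed (the variable names are illustrative only):</p>
</dd>
<dd>
```perl
use strict;
use warnings;
use Unicode::Collate;

my $Collator = Unicode::Collate->new();

# Perl's lt compares code points: 'a' (0x61) is greater than 'B' (0x42).
my $builtin_lt = ("apple" lt "Banana") ? 1 : 0;             # 0

# The collator compares alphabetically first.
my $uca_lt  = $Collator->lt("apple", "Banana") ? 1 : 0;     # 1
my $uca_cmp = $Collator->cmp("apple", "Banana");            # -1
my $same    = $Collator->cmp("Perl", "Perl");               # 0
```
</dd>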
649 </li>
650 <dt><strong><a name="item_getsortkey"><code>$sortKey = $Collator-&gt;getSortKey($string)</code></a></strong>
652 <dd>
653 <p>-- see 4.3 Form Sort Key, UTS #10.</p>
654 </dd>
655 <dd>
656 <p>Returns a sort key.</p>
657 </dd>
658 <dd>
<p>Comparing two sort keys with a binary comparison
gives the same result as comparing the original strings using UCA.</p>
661 </dd>
662 <dd>
663 <pre>
664 $Collator-&gt;getSortKey($a) cmp $Collator-&gt;getSortKey($b)</pre>
665 </dd>
666 <dd>
667 <pre>
668 is equivalent to</pre>
669 </dd>
670 <dd>
671 <pre>
672 $Collator-&gt;cmp($a, $b)</pre>
673 </dd>
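<dd>
<p>One use of this equivalence is to compute each key once and reuse it in a plain <code>sort</code> block.
The sketch below assumes the default collation element table is installed; the hash <code>%key_for</code> and the other names are illustrative only.</p>
</dd>
<dd>
```perl
use strict;
use warnings;
use Unicode::Collate;

my $Collator = Unicode::Collate->new();
my @names = qw(delta Alpha charlie bravo);

# Compute each sort key once ...
my %key_for = map { $_ => $Collator->getSortKey($_) } @names;

# ... then sort with a plain binary comparison of the keys.
my @by_key = sort { $key_for{$a} cmp $key_for{$b} } @names;

# The result agrees with the collator's own sort method.
my @by_method = $Collator->sort(@names);
```
</dd>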
674 </li>
675 <dt><strong><a name="item_viewsortkey"><code>$sortKeyForm = $Collator-&gt;viewSortKey($string)</code></a></strong>
677 <dd>
<p>Returns the sort key of <code>$string</code> in a human-readable form.
If <a href="#item_uca_version"><code>UCA_Version</code></a> is 8, the output is slightly different.</p>
680 </dd>
681 <dd>
682 <pre>
683 use Unicode::Collate;
684 my $c = Unicode::Collate-&gt;new();
685 print $c-&gt;viewSortKey(&quot;Perl&quot;),&quot;\n&quot;;</pre>
686 </dd>
687 <dd>
688 <pre>
689 # output:
690 # [0B67 0A65 0B7F 0B03 | 0020 0020 0020 0020 | 0008 0002 0002 0002 | FFFF FFFF FFFF FFFF]
691 # Level 1 Level 2 Level 3 Level 4</pre>
692 </dd>
693 </li>
694 </dl>
696 </p>
697 <h2><a name="methods_for_searching">Methods for Searching</a></h2>
<p><strong>DISCLAIMER:</strong> If the <a href="#item_preprocess"><code>preprocess</code></a> or <a href="#item_normalization"><code>normalization</code></a> parameter is true
for <code>$Collator</code>, calling these methods (<a href="#item_index"><code>index</code></a>, <a href="#item_match"><code>match</code></a>, <a href="#item_gmatch"><code>gmatch</code></a>,
<a href="#item_subst"><code>subst</code></a>, <a href="#item_gsubst"><code>gsubst</code></a>) croaks,
because the position and the length might differ
from those in the specified string.
(The <a href="#item_rearrange"><code>rearrange</code></a> and <a href="#item_hangul_terminator"><code>hangul_terminator</code></a> parameters are ignored.)</p>
<p>The <a href="#item_match"><code>match</code></a>, <a href="#item_gmatch"><code>gmatch</code></a>, <a href="#item_subst"><code>subst</code></a>, <a href="#item_gsubst"><code>gsubst</code></a> methods work
like <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_m_"><code>m//</code></a>, <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_m_"><code>m//g</code></a>, <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_s_"><code>s///</code></a>, <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_s_"><code>s///g</code></a>, respectively,
except that they take only a literal substring, not a pattern.</p>
707 <dl>
708 <dt><strong><a name="item_index"><code>$position = $Collator-&gt;index($string, $substring[, $position])</code></a></strong>
710 <dt><strong><code>($position, $length) = $Collator-&gt;index($string, $substring[, $position])</code></strong>
712 <dd>
713 <p>If <code>$substring</code> matches a part of <code>$string</code>, returns
714 the position of the first occurrence of the matching part in scalar context;
715 in list context, returns a two-element list of
716 the position and the length of the matching part.</p>
717 </dd>
718 <dd>
719 <p>If <code>$substring</code> does not match any part of <code>$string</code>,
720 returns <code>-1</code> in scalar context and
721 an empty list in list context.</p>
722 </dd>
723 <dd>
724 <p>e.g. you say</p>
725 </dd>
726 <dd>
727 <pre>
  my $Collator = Unicode::Collate-&gt;new( normalization =&gt; undef, level =&gt; 1 );
                                     # (normalization =&gt; undef) is REQUIRED.
  my $str = &quot;Ich muß studieren Perl.&quot;;
  my $sub = &quot;MÜSS&quot;;
  my $match;
  if (my($pos,$len) = $Collator-&gt;index($str, $sub)) {
      $match = substr($str, $pos, $len);
  }</pre>
736 </dd>
737 <dd>
<p>and get <code>&quot;muß&quot;</code> in <code>$match</code> since <code>&quot;muß&quot;</code>
is primary equal to <code>&quot;MÜSS&quot;</code>.</p>
740 </dd>
741 </li>
742 <dt><strong><a name="item_match"><code>$match_ref = $Collator-&gt;match($string, $substring)</code></a></strong>
744 <dt><strong><code>($match) = $Collator-&gt;match($string, $substring)</code></strong>
746 <dd>
747 <p>If <code>$substring</code> matches a part of <code>$string</code>, in scalar context, returns
748 <strong>a reference to</strong> the first occurrence of the matching part
749 (<code>$match_ref</code> is always true if matches,
750 since every reference is <strong>true</strong>);
751 in list context, returns the first occurrence of the matching part.</p>
752 </dd>
753 <dd>
754 <p>If <code>$substring</code> does not match any part of <code>$string</code>,
755 returns <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_undef"><code>undef</code></a> in scalar context and
756 an empty list in list context.</p>
757 </dd>
758 <dd>
759 <p>e.g.</p>
760 </dd>
761 <dd>
762 <pre>
  if ($match_ref = $Collator-&gt;match($str, $sub)) { # scalar context
      print &quot;matches [$$match_ref].\n&quot;;
  } else {
      print &quot;doesn't match.\n&quot;;
  }</pre>
768 </dd>
769 <dd>
770 <pre>
771 or</pre>
772 </dd>
773 <dd>
774 <pre>
  if (($match) = $Collator-&gt;match($str, $sub)) { # list context
      print &quot;matches [$match].\n&quot;;
  } else {
      print &quot;doesn't match.\n&quot;;
  }</pre>
780 </dd>
781 </li>
782 <dt><strong><a name="item_gmatch"><code>@match = $Collator-&gt;gmatch($string, $substring)</code></a></strong>
784 <dd>
785 <p>If <code>$substring</code> matches a part of <code>$string</code>, returns
786 all the matching parts (or matching count in scalar context).</p>
787 </dd>
788 <dd>
789 <p>If <code>$substring</code> does not match any part of <code>$string</code>,
790 returns an empty list.</p>
791 </dd>
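<dd>
<p>A minimal sketch of both calling contexts. The sample text is hypothetical, and the
collator is built with <code>normalization =&gt; undef</code> and <code>level =&gt; 1</code>,
mirroring the <code>gsubst</code> example in this section, so that case differences are ignored.</p>
</dd>
<dd>

```perl
use strict;
use warnings;
use Unicode::Collate;

# normalization => undef is used here as in the gsubst example;
# level => 1 compares at the primary level only, so case is ignored.
my $Collator = Unicode::Collate->new(normalization => undef, level => 1);

my $text = "Camel donkey camel CAMEL horse";   # hypothetical sample text

my @found = $Collator->gmatch($text, "camel"); # list context: the matching parts
my $count = $Collator->gmatch($text, "camel"); # scalar context: the match count

print "found: @found\n";
print "count: $count\n";
```

</dd>
<dd>
<p>With this input, the list should hold the three spellings of the word as they appear
in <code>$text</code>, and the scalar call should report three matches.</p>
</dd>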
792 </li>
793 <dt><strong><a name="item_subst"><code>$count = $Collator-&gt;subst($string, $substring, $replacement)</code></a></strong>
795 <dd>
796 <p>If <code>$substring</code> matches a part of <code>$string</code>,
797 the first occurrence of the matching part is replaced by <code>$replacement</code>
798 (<code>$string</code> is modified) and <code>$count</code> (always equal to <code>1</code>) is returned.</p>
799 </dd>
800 <dd>
801 <p><code>$replacement</code> can be a <code>CODEREF</code>,
802 taking the matching part as an argument,
803 and returning a string to replace the matching part
804 (a bit similar to <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_s_"><code>s/(..)/$coderef-&gt;($1)/e</code></a>).</p>
805 </dd>
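<dd>
<p>As an illustration, the sketch below replaces only the first match, using a
<code>CODEREF</code> replacement. The sample string and the bracket-and-uppercase
callback are made up for this example; the collator settings follow the <code>gsubst</code>
example in this section.</p>
</dd>
<dd>

```perl
use strict;
use warnings;
use Unicode::Collate;

# Primary-level matching without normalization, as in the gsubst example.
my $Collator = Unicode::Collate->new(normalization => undef, level => 1);

my $str = "Camel donkey camel";   # hypothetical sample text

# The callback receives the matching part and returns its replacement.
my $count = $Collator->subst($str, "camel", sub { "[" . uc($_[0]) . "]" });

print "$count: $str\n";   # only the first "Camel" is rewritten
```

</dd>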
806 </li>
807 <dt><strong><a name="item_gsubst"><code>$count = $Collator-&gt;gsubst($string, $substring, $replacement)</code></a></strong>
809 <dd>
810 <p>If <code>$substring</code> matches a part of <code>$string</code>,
811 all the occurrences of the matching part are replaced by <code>$replacement</code>
812 (<code>$string</code> is modified) and <code>$count</code> is returned.</p>
813 </dd>
814 <dd>
815 <p><code>$replacement</code> can be a <code>CODEREF</code>,
816 taking the matching part as an argument,
817 and returning a string to replace the matching part
818 (a bit similar to <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_s_"><code>s/(..)/$coderef-&gt;($1)/eg</code></a>).</p>
819 </dd>
820 <dd>
821 <p>e.g.</p>
822 </dd>
823 <dd>
824 <pre>
825 my $Collator = Unicode::Collate-&gt;new( normalization =&gt; undef, level =&gt; 1 );
826 # (normalization =&gt; undef) is REQUIRED.
827 my $str = &quot;Camel donkey zebra came\x{301}l CAMEL horse cAm\0E\0L...&quot;;
828 $Collator-&gt;gsubst($str, &quot;camel&quot;, sub { &quot;&lt;b&gt;$_[0]&lt;/b&gt;&quot; });</pre>
829 </dd>
830 <dd>
831 <pre>
832 # now $str is &quot;&lt;b&gt;Camel&lt;/b&gt; donkey zebra &lt;b&gt;came\x{301}l&lt;/b&gt; &lt;b&gt;CAMEL&lt;/b&gt; horse &lt;b&gt;cAm\0E\0L&lt;/b&gt;...&quot;;
833 # i.e., all the camels are made bold-faced.</pre>
834 </dd>
835 </li>
836 </dl>
838 </p>
839 <h2><a name="other_methods">Other Methods</a></h2>
840 <dl>
841 <dt><strong><a name="item_change"><code>%old_tailoring = $Collator-&gt;change(%new_tailoring)</code></a></strong>
843 <dd>
844 <p>Changes the values of the specified keys and returns the previous values of those keys.</p>
845 </dd>
846 <dd>
847 <pre>
848 $Collator = Unicode::Collate-&gt;new(level =&gt; 4);</pre>
849 </dd>
850 <dd>
851 <pre>
852 $Collator-&gt;eq(&quot;perl&quot;, &quot;PERL&quot;); # false</pre>
853 </dd>
854 <dd>
855 <pre>
856 %old = $Collator-&gt;change(level =&gt; 2); # returns (level =&gt; 4).</pre>
857 </dd>
858 <dd>
859 <pre>
860 $Collator-&gt;eq(&quot;perl&quot;, &quot;PERL&quot;); # true</pre>
861 </dd>
862 <dd>
863 <pre>
864 $Collator-&gt;change(%old); # returns (level =&gt; 2).</pre>
865 </dd>
866 <dd>
867 <pre>
868 $Collator-&gt;eq(&quot;perl&quot;, &quot;PERL&quot;); # false</pre>
869 </dd>
870 <dd>
871 <p>Not every <code>(key,value)</code> pair may be changed.
872 See also <code>@Unicode::Collate::ChangeOK</code> and <code>@Unicode::Collate::ChangeNG</code>.</p>
873 </dd>
874 <dd>
875 <p>In scalar context, returns the modified collator
876 (but it is <strong>not</strong> a clone from the original).</p>
877 </dd>
878 <dd>
879 <pre>
880 $Collator-&gt;change(level =&gt; 2)-&gt;eq(&quot;perl&quot;, &quot;PERL&quot;); # true</pre>
881 </dd>
882 <dd>
883 <pre>
884 $Collator-&gt;eq(&quot;perl&quot;, &quot;PERL&quot;); # true; now max level is 2nd.</pre>
885 </dd>
886 <dd>
887 <pre>
888 $Collator-&gt;change(level =&gt; 4)-&gt;eq(&quot;perl&quot;, &quot;PERL&quot;); # false</pre>
889 </dd>
890 </li>
891 <dt><strong><a name="item_version"><code>$version = $Collator-&gt;version()</code></a></strong>
893 <dd>
894 <p>Returns the version number (a string) of the Unicode Standard
895 which the <a href="#item_table"><code>table</code></a> file used by the collator object is based on.
896 If the table does not include a version line (starting with <code>@version</code>),
897 returns <code>&quot;unknown&quot;</code>.</p>
898 </dd>
899 </li>
900 <dt><strong><code>UCA_Version()</code></strong>
902 <dd>
903 <p>Returns the tracking version number of UTS #10 this module consults.</p>
904 </dd>
905 </li>
906 <dt><strong><a name="item_base_unicode_version"><code>Base_Unicode_Version()</code></a></strong>
908 <dd>
909 <p>Returns the version number of UTS #10 this module consults.</p>
910 </dd>
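<dd>
<p>The three version queries above can be printed side by side, as in the sketch below.
Because nothing is exported, the two functions are called here by their fully qualified
names, which is an assumption about calling style rather than something this page prescribes.
The values printed depend on the installed module and table file.</p>
</dd>
<dd>

```perl
use strict;
use warnings;
use Unicode::Collate;

my $Collator = Unicode::Collate->new();

# Version recorded in the table file used by this collator, or "unknown".
my $table_version = $Collator->version();

# Module-level version information, called with fully qualified names.
my $uca_version  = Unicode::Collate::UCA_Version();
my $base_version = Unicode::Collate::Base_Unicode_Version();

print "table version:        $table_version\n";
print "UCA tracking version: $uca_version\n";
print "base version:         $base_version\n";
```

</dd>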
911 </li>
912 </dl>
914 </p>
915 <hr />
916 <h1><a name="export">EXPORT</a></h1>
917 <p>No method will be exported.</p>
919 </p>
920 <hr />
921 <h1><a name="install">INSTALL</a></h1>
922 <p>Though this module can be used without any <a href="#item_table"><code>table</code></a> file,
923 to use this module easily, it is recommended to install a table file
924 in the UCA format, by copying it under the directory
925 &lt;a place in @INC&gt;/Unicode/Collate.</p>
926 <p>The preferred one is &quot;The Default Unicode Collation Element Table&quot;
927 (aka DUCET), available from the Unicode Consortium's website:</p>
928 <pre>
929 <a href="http://www.unicode.org/Public/UCA/">http://www.unicode.org/Public/UCA/</a></pre>
930 <pre>
931 <a href="http://www.unicode.org/Public/UCA/latest/allkeys.txt">http://www.unicode.org/Public/UCA/latest/allkeys.txt</a> (latest version)</pre>
932 <p>If DUCET is not installed, it is recommended to copy the file
933 from <a href="http://www.unicode.org/Public/UCA/latest/allkeys.txt">http://www.unicode.org/Public/UCA/latest/allkeys.txt</a>
934 to &lt;a place in @INC&gt;/Unicode/Collate/allkeys.txt
935 manually.</p>
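<p>To check whether a table file is already in place, a small script along the following
lines can look under each directory in <code>@INC</code>. This is only a sketch: it tests
for the file name <em>allkeys.txt</em> mentioned above and does not validate the file's contents.</p>

```perl
use strict;
use warnings;
use File::Spec;

# Build the expected path under every plain directory in @INC
# (code-reference hooks in @INC are skipped) and keep the ones that exist.
my @tables = grep { -f $_ }
             map  { File::Spec->catfile($_, 'Unicode', 'Collate', 'allkeys.txt') }
             grep { !ref $_ } @INC;

if (@tables) {
    print "found: $_\n" for @tables;
} else {
    print "allkeys.txt was not found under any directory in \@INC\n";
}
```
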
937 </p>
938 <hr />
939 <h1><a name="caveats">CAVEATS</a></h1>
940 <dl>
941 <dt><strong><a name="item_normalization">Normalization</a></strong>
943 <dd>
944 <p>Use of the <a href="#item_normalization"><code>normalization</code></a> parameter requires the <strong>Unicode::Normalize</strong>
945 module (see <a href="file://C|\msysgit\mingw\html/lib/Unicode/Normalize.html">the Unicode::Normalize manpage</a>).</p>
946 </dd>
947 <dd>
948 <p>If you do not need it (say, when you need not
949 handle any combining characters),
950 assign <code>normalization =&gt; undef</code> explicitly.</p>
951 </dd>
952 <dd>
953 <p>-- see 6.5 Avoiding Normalization, UTS #10.</p>
954 </dd>
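<dd>
<p>A minimal sketch of a collator built without normalization. The word list is hypothetical
and contains no combining characters, which is the situation described above.</p>
</dd>
<dd>

```perl
use strict;
use warnings;
use Unicode::Collate;

# Explicitly turn normalization off.
my $Collator = Unicode::Collate->new(normalization => undef);

my @sorted = $Collator->sort(qw(cherry Apple banana));   # hypothetical input

print "@sorted\n";
```

</dd>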
955 </li>
956 <dt><strong><a name="item_conformance_test">Conformance Test</a></strong>
958 <dd>
959 <p>The Conformance Test for the UCA is available
960 under <a href="http://www.unicode.org/Public/UCA/">http://www.unicode.org/Public/UCA/</a>.</p>
961 </dd>
962 <dd>
963 <p>For <em>CollationTest_SHIFTED.txt</em>,
964 a collator via <code>Unicode::Collate-&gt;new( )</code> should be used;
965 for <em>CollationTest_NON_IGNORABLE.txt</em>, a collator via
966 <code>Unicode::Collate-&gt;new(variable =&gt; &quot;non-ignorable&quot;, level =&gt; 3)</code>.</p>
967 </dd>
968 <dd>
969 <p><strong>Unicode::Normalize is required to try The Conformance Test.</strong></p>
970 </dd>
971 </li>
972 </dl>
974 </p>
975 <hr />
976 <h1><a name="author__copyright_and_license">AUTHOR, COPYRIGHT AND LICENSE</a></h1>
977 <p>The Unicode::Collate module for perl was written by SADAHIRO Tomoyuki,
978 &lt;<a href="mailto:SADAHIRO@cpan.org">SADAHIRO@cpan.org</a>&gt;. This module is Copyright(C) 2001-2005,
979 SADAHIRO Tomoyuki. Japan. All rights reserved.</p>
980 <p>This module is free software; you can redistribute it and/or
981 modify it under the same terms as Perl itself.</p>
982 <p>The file Unicode/Collate/allkeys.txt was copied directly
983 from <a href="http://www.unicode.org/Public/UCA/4.1.0/allkeys.txt">http://www.unicode.org/Public/UCA/4.1.0/allkeys.txt</a>.
984 This file is Copyright (c) 1991-2005 Unicode, Inc. All rights reserved.
985 Distributed under the Terms of Use in <a href="http://www.unicode.org/copyright.html">http://www.unicode.org/copyright.html</a>.</p>
987 </p>
988 <hr />
989 <h1><a name="see_also">SEE ALSO</a></h1>
990 <dl>
991 <dt><strong><a name="item_unicode_collation_algorithm__2d_uts__2310">Unicode Collation Algorithm - UTS #10</a></strong>
993 <dd>
994 <p><a href="http://www.unicode.org/reports/tr10/">http://www.unicode.org/reports/tr10/</a></p>
995 </dd>
996 </li>
997 <dt><strong><a name="item_table">The Default Unicode Collation Element Table (DUCET)</a></strong>
999 <dd>
1000 <p><a href="http://www.unicode.org/Public/UCA/latest/allkeys.txt">http://www.unicode.org/Public/UCA/latest/allkeys.txt</a></p>
1001 </dd>
1002 </li>
1003 <dt><strong><a name="item_the_conformance_test_for_the_uca">The conformance test for the UCA</a></strong>
1005 <dd>
1006 <p><a href="http://www.unicode.org/Public/UCA/latest/CollationTest.html">http://www.unicode.org/Public/UCA/latest/CollationTest.html</a></p>
1007 </dd>
1008 <dd>
1009 <p><a href="http://www.unicode.org/Public/UCA/latest/CollationTest.zip">http://www.unicode.org/Public/UCA/latest/CollationTest.zip</a></p>
1010 </dd>
1011 </li>
1012 <dt><strong><a name="item_hangul_syllable_type">Hangul Syllable Type</a></strong>
1014 <dd>
1015 <p><a href="http://www.unicode.org/Public/UNIDATA/HangulSyllableType.txt">http://www.unicode.org/Public/UNIDATA/HangulSyllableType.txt</a></p>
1016 </dd>
1017 </li>
1018 <dt><strong><a name="item_unicode_normalization_forms__2d_uax__2315">Unicode Normalization Forms - UAX #15</a></strong>
1020 <dd>
1021 <p><a href="http://www.unicode.org/reports/tr15/">http://www.unicode.org/reports/tr15/</a></p>
1022 </dd>
1023 </li>
1024 </dl>
1031 </body>
1033 </html>