1 2010-02-18 Gabriel Burt <gabriel.burt@gmail.com>
3 * Normalization.cs: Implement algorithmic Hangul decomposition; Calling
4 string.Normalize on Korean characters now works properly (bnc#480152).
5 This reduces the number of errors in 'make test' from 27k to 4.8k.
7 * StringNormalizationTestSource.cs:
8 * Makefile: Use the local, working copy of Normalization etc., so as to make
9 modifying Normalization.cs and then testing your changes with 'make test'
10 possible. Also, fix building/running of tests, patch by Alexander
13 2009-09-18 Atsushi Enomoto <atsushi@ximian.com>
15 * Normalization.cs : Handle blocked characters which are not
16 immediately next to the primary composite character. This fixes
17 some Arabic string sequence normalization.
18 * Makefile : fix test build.
20 2009-09-17 Atsushi Enomoto <atsushi@ximian.com>
22 * Normalization.cs : some renaming for disambiguation.
23 * NormalizationTableUtil.cs : fix some wrong ranges in
24 mapIdxToComposite. This fixes some Arabic normalization (and more).
25 * normalization-notes.txt : added some notes on the implementation.
27 2008-06-19 Atsushi Enomoto <atsushi@ximian.com>
30 - reverted the previous index calculation change. It was correctly
31 implemented and I rather broke it.
32 - fix index calculation on combining.
33 - NFKD was incorrectly directed to combining path. It should not.
34 - Simplify quick check.
36 2008-06-15 Atsushi Enomoto <atsushi@ximian.com>
38 * Normalization.cs : For NFC and NFKC, IsNormalized() was not working
39 enough to check composed characters. It's not possible without
40 the actual composition, so just call Normalize() and compare them.
41 In Normalize() mapping helper didn't pick correct map index since
42 the table for index stores index for "uncompressed" numbers.
43 * NormalizationTableUtil.cs : updated to the latest UCD.
44 * Makefile : to build test, source file must be downloaded too.
46 2008-11-05 Atsushi Enomoto <atsushi@ximian.com>
48 * ucd.cs : Write type for *_count. Add notice to not edit
49 unicode-data.h directly.
51 2008-11-04 Atsushi Enomoto <atsushi@ximian.com>
53 * ucd.cs : new code to generate unicode table for eglib.
55 2008-07-04 Andreas Nahr <ClassDevelopment@A-SoftTech.com>
57 * SortKey: Fix parameter names, add attribute, small formatting
59 2008-06-27 Rodrigo Kumpera <rkumpera@novell.com>
61 * CodePointIndexer.cs : Make TableRange a struct instead
62 of a class so we save 2 memory ops per ToIndex loop.
64 2008-04-02 Atsushi Enomoto <atsushi@ximian.com>
66 * SortKey.cs : check null arguments. Fixed bug #376171.
68 2007-07-20 Atsushi Enomoto <atsushi@ximian.com>
70 * create-mscompat-collation-table.cs : I wonder how long its build
73 2007-03-06 Atsushi Enomoto <atsushi@ximian.com>
75 * SimpleCollator.cs : disable QuickCheckPossible(), which is
76 inaccurate and inefficient. Fixed bug #79714.
78 2007-02-15 Atsushi Enomoto <atsushi@ximian.com>
80 * SimpleCollator.cs : character filtering is needed for
81 OrdinalIgnoreCase in 2.0 profile. Fixed bug #80865.
83 2007-01-25 Atsushi Enomoto <atsushi@ximian.com>
85 * SimpleCollator.cs : GetTailContraction() was broken to pick correct
86 contraction/special sortkey out and thus LastIndexOf() failed when
87 it is involved. Fixed bug #80612.
89 2007-01-22 Atsushi Enomoto <atsushi@ximian.com>
91 * SimpleCollator.cs : for non-StringSort comparison, level5 (- and ')
92 should be still skipped after initial level5 check is done (while
93 they were simply treated as a normal character). Fixed bug #78748.
94 * SortKeyBuffer.cs : Fixed NRE in french sort.
96 2006-12-25 Atsushi Enomoto <atsushi@ximian.com>
98 * SimpleCollator.cs : added IndexOf() implementation for Ordinal
99 and OrdinalIgnoreCase, though Ordinal version is not used (since
100 it is slower than icall).
102 2006-05-30 Miguel de Icaza <miguel@novell.com>
104 * MSCompatUnicodeTable.cs: Remove the fixed loading and compute it
105 just when we actually consume it. This only fixes the
108 2006-04-14 Atsushi Enomoto <atsushi@ximian.com>
110 * README: removed obsolete info.
111 * Normalization.cs : canonical reordering should participate in the
112 decomposition step. In reordering, string append was incomplete.
113 Combining class check is required in NFD check. Icall is written
116 2005-12-07 Zoltan Varga <vargaz@gmail.com>
118 * SimpleCollator.cs: Fix a warning.
120 2005-11-30 Sebastien Pouliot <sebastien@ximian.com>
122 * SimpleCollator.cs: Fix CAS support. The static ctor/var try to get
123 the environment variable MUCH too soon (i.e. the security manager
126 2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
128 * SimpleCollator.cs : direct fast-path optimization for IndexOf().
130 2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
132 * SimpleCollator.cs :
133 - CompareQuick(): added immediateBreakup to avoid extraneous sortkey
135 - QuickCheckPossible(): index used for s1 was incorrect.
137 2005-11-29 Atsushi Enomoto <atsushi@ximian.com>
139 * SimpleCollator.cs : added another quick check for CompareInternal()
140 that does almost ordinal comparison for quick-checkable strings.
141 (It affects Compare(), IndexOf(), IsSuffix() etc. as well.)
143 2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
145 * MSCompatUnicodeTable.cs : (IsIgnorable) \0 is not ignorable.
148 2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
150 * SimpleCollator.cs :
151 Created another struct to reduce method arguments. Created another
152 set of flags that keeps "once-matched" state (counterpart of
153 checkedFlags, now neverMatchFlags).
155 2005-11-14 Atsushi Enomoto <atsushi@ximian.com>
157 * SimpleCollator.cs :
158 - Added CompareOrdinalIgnoreCase() for NET_2_0 RTM.
159 - Reduced extra parameter from LastIndexOfSortKey().
160 - LastIndexOf() should use GetTailContraction for the source string.
161 And then, target could match in the middle of the possible
162 "replacement contraction" of the source string, so use
163 LastIndexOfSortKey() to catch them.
164 - Fixed GetTailContraction() that caused index out of range.
166 2005-11-11 Atsushi Enomoto <atsushi@ximian.com>
168 * Makefile : Now use MONO_DISABLE_MANAGED_COLLATION.
169 * SortKey.cs : some members are virtual.
171 2005-10-14 Atsushi Enomoto <atsushi@ximian.com>
173 * SimpleCollator.cs : modified to use stackalloc for byte array.
175 2005-09-27 Atsushi Enomoto <atsushi@ximian.com>
177 * SimpleCollator.cs : in CompareInternal(), there was a possibility of
178 infinite loop. Fixed bug #76243.
180 2005-09-20 Atsushi Enomoto <atsushi@ximian.com>
182 * SimpleCollator.cs : In IsPrefix/IsSuffix, if target is an empty string,
183 immediately return true.
185 2005-09-09 Atsushi Enomoto <atsushi@ximian.com>
187 * SimpleCollator.cs : IsSuffix() optimization logic was buggy, so just
188 use pretty simple way with LastIndexOf() (no significant perf.
191 2005-09-01 Atsushi Enomoto <atsushi@ximian.com>
193 * README, Collation-notes.txt, CollationDataStructures.txt :
194 removed obsolete info and added some notes.
196 2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
198 * Normalization.cs : remove warned code.
199 * managed-collation.patch : now it's not required anymore.
201 2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
203 * MSCompatUnicodeTable.cs : added IsSortable(string).
205 2005-08-10 Atsushi Enomoto <atsushi@ximian.com>
207 * SimpleCollator.cs : Now all collator methods are thread safe.
209 All instance non-readonly fields turned into arguments of every
210 method that uses those fields.
211 (Sadly it is the end of no-memory-cost collator era. mcs bootstrap
212 now needs +100KB memory consumption.)
214 2005-08-09 Atsushi Enomoto <atsushi@ximian.com>
216 * SimpleCollator.cs : made "checkedFlags" nullable and made it
217 an argument of every index method (to make it thread safe).
219 2005-08-09 Atsushi Enomoto <atsushi@ximian.com>
222 MSCompatUnicodeTable.cs :
223 - Now IsIgnorable() is aggregated to be one invocation to check
224 completely ignorable, nonspacing and symbols.
225 - Introduced "already checked" flags for IndexOf() and LastIndexOf()
226 to skip sortkey binary check on the same characters. Significant
227 perf. improvement for such case as IndexOf("AABCBABC...Z",'Z').
229 2005-08-08 Gert Driesen <drieseng@users.sourceforge.net>
231 * SortKey.cs: Marked Serializable to match MS.NET.
233 2005-08-08 Atsushi Enomoto <atsushi@ximian.com>
235 * create-mscompat-collation-table.cs,
236 Makefile : changed resources output directory.
238 2005-08-04 Atsushi Enomoto <atsushi@ximian.com>
240 * create-normalization-tests.cs,
241 StringNormalizationTestSource.cs : new files for Unicode
242 Normalization test generator.
243 * Makefile : added support for above.
245 2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
247 * NormalizationTableUtil.cs : oops, it does not compile.
248 * managed-collation.patch : I guess having managed resource would be
249 better for collation. At least current code has such #define so
250 Makefile should be in sync with it.
252 2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
254 * create-normalization-source.cs : Fixed CharMapComparer which
255 incorrectly returned 0 when the second arg is shorter. Reduced
256 extraneous helperIndex map. Other minor fixes and code removal.
257 * Normalization.cs : several fixes to support blocked combine handling.
258 * NormalizationTableUtil.cs : tiny member renaming.
260 2005-08-03 Atsushi Enomoto <atsushi@ximian.com>
262 * create-normalization-source.cs,
263 NormalizationTableUtil.cs,
264 Normalization.cs : several bugfixes on index miscomputation.
265 Renamed using aliases (csc will bork). Primary combine safety is now
266 computed during UnicodeData.txt parse.
267 Maximum NFKD length was 18, not 4 (U+FDFA).
269 2005-08-02 Atsushi Enomoto <atsushi@ximian.com>
271 * managed-collation.patch : added Normalization support.
272 * managed-collation-icall.patch : added, including normalization stuff.
274 BTW when will collation code be checked in?
276 2005-08-02 Atsushi Enomoto <atsushi@ximian.com>
278 * create-normalization-source.cs : Unified three normalization source
279 generators, to compute IsUnsafe flag. Fixed helperIndex array type
281 * create-char-mapping-source.cs,
282 create-combining-class-source.cs : thus removed.
283 * Makefile : thus modified for the above integration.
284 * NormalizationTableUtil.cs : Extended to contain IsUnsafe flag.
285 * Normalization.cs : Several fixes to make Normalize() actually work.
287 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
289 * create-normalization-source.cs,
291 create-char-mapping-source.cs,
292 create-combining-class-source.cs,
293 Makefile : converted managed array to pointers (like collation stuff).
295 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
297 * NormalizationTableUtil.cs : further table range optimization.
298 * create-normalization-source.cs,
299 create-char-mapping-source.cs,
300 create-combining-class-source.cs : added C header output support.
302 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
304 * create-normalization-source.cs, Normalization.cs :
305 Now property size is < 256, so directly embed value in "props" array.
306 Add QuickCheck(c,checkType) and remove IsNFD/C/KD/KC and delegates.
308 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
310 * create-combining-class-source.cs,
311 create-char-mapping-source.cs,
312 create-normalization-source.cs,
313 NormalizationTableUtil.cs,
314 Normalization.cs : String.Normalize() does not handle surrogate
315 characters. Mapping information in DerivedNormalizationProps.txt
316 is not used in the code (that from UnicodeData.txt is used).
317 Hangul syllables are computed instead of embedded in the tables.
318 * managed-collation.patch : removed IntPtrStream and Makefile patches.
320 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
322 * MSCompatUnicodeTable.cs : IsSortable() was broken.
324 2005-07-29 Atsushi Enomoto <atsushi@ximian.com>
326 * MSCompatUnicodeTable.cs : added helper for CompareInfo.IsSortable().
328 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
330 * create-tailoring.cfg : added for convenience of contraction check.
332 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
334 * create-normalization-source.cs,
337 create-mscompat-collation-table.cs,
338 MSCompatUnicodeTableUtil.cs,
340 create-collation-element-table.cs,
341 MSCompatUnicodeTable.cs,
343 create-combining-class-source.cs : added copyright lines.
345 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
347 * MSCompatUnicodeTable.cs : removed extraneous definition.
349 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
351 * create-mscompat-collation-table.cs,
352 MSCompatUnicodeTable.cs : full C header support, finally.
354 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
357 NormalizationTableUtil.cs,
358 create-char-mapping-source.cs : more aggressive data compression.
359 It now ignores characters that are >= U+10000.
361 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
364 Normalization.template,
365 Normalization.cs : renamed existing file.
367 2005-07-28 Atsushi Enomoto <atsushi@ximian.com>
369 * NormalizationTableUtil.cs,
370 Normalization.template,
371 create-combining-class-source.cs : GetCombiningClass is now
372 implemented as indexer based array.
373 * Makefile : renamed output filename.
374 * create-mscompat-collation-table.cs : removed comments that do not
376 * create-tailoring.cs : use utf-8 output (and fixed filename).
378 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
380 * create-mscompat-collation-table.cs : hacked safer IPA extensions.
381 * Collation-notes.txt : status of sortkey table.
383 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
385 * create-mscompat-collation-table.cs : some Greek mapping fix.
387 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
389 * create-mscompat-collation-table.cs : diacritical weight is not
390 treated correctly when they are picked from letter names, as flags.
392 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
394 * create-mscompat-collation-table.cs : fixed culture-dependent
395 nonspacing mark weight.
397 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
399 * create-mscompat-collation-table.cs : some Hebrew case letter fixes.
400 Some diacritical fixes on symbols.
402 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
404 * create-mscompat-collation-table.cs : Fixed level 3 weight of
405 Arabic presentation forms.
407 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
409 * create-mscompat-collation-table.cs : Fixed some diacritical weight
410 of Arabic presentation forms.
412 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
414 * SimpleCollator.cs : more status updates. It's almost complete,
415 except for sortkey values.
417 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
419 * SimpleCollator.cs : similar optimization also for LastIndexOf().
421 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
423 * SimpleCollator.cs : the previous patch was missing IgnoreNonSpace
426 2005-07-27 Atsushi Enomoto <atsushi@ximian.com>
428 * SimpleCollator.cs : reduced extra sortkey value computation in
429 MatchesForward(). It makes IndexOf() roughly 30% faster.
431 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
433 * SortKey.cs : GetHashCode() returns a value based on its byte data.
436 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
438 * SimpleCollator.cs : consider extractions in invariant culture.
440 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
442 * SimpleCollator.cs : (unsafeFlags) be compact ;-)
444 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
446 * SimpleCollator.cs : When the tail of the target does not match more
447 than 3 times, then IsSuffix() will never be true (3 is the max
448 length of an expansion; \uFB03 -> ffi). It brings significant
449 performance boost when "source" string is very long.
450 * MSCompatUnicodeTable.cs : added MaxExpansionLength constant.
451 Reordered code lines.
453 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
455 * Collation-notes.txt : updated implementation status.
457 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
459 * SimpleCollator.cs : Implemented quick codepoint comparison in
460 Compare(). Comparison became 125x faster.
461 * mono-tailoring-source.txt : added tiny comment.
463 2005-07-26 Atsushi Enomoto <atsushi@ximian.com>
465 * mono-tailoring-source.txt : Added all single sortkey remapping to
466 all cultures (still need to fill contractions and annotate possible
467 buggy mapping referencing to CLDR).
468 * SimpleCollator.cs : removed unused code.
469 * MSCompatUnicodeTable.cs : tiny cast removal.
471 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
474 create-mscompat-collation-table.cs,
475 MSCompatUnicodeTableUtil.cs,
476 MSCompatUnicodeTable.cs : Now CJK mapping data is stored as byte
477 arrays. Thus SimpleCollator does not need to use bitwise and shift
478 operations to get sortkey value and they could be managed resources.
480 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
482 * create-mscompat-collation-table.cs,
483 MSCompatUnicodeTable.cs,
484 MSCompatUnicodeTableUtil.cs : From the result of sortkey comparison
485 between None and IgnoreWidth, width compat table could be computed
486 in somewhat simple way. So removed that table and all related code.
487 Increased the collation resource version.
489 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
491 * create-mscompat-collation-table.cs : Added C header output support.
493 2005-07-25 Atsushi Enomoto <atsushi@ximian.com>
495 * create-mscompat-collation-table.cs : FillLetterNFKD() could also be
496 applied to Cyrillic letters. Saved some of them.
498 2005-07-24 Atsushi Enomoto <atsushi@ximian.com>
500 * MSCompatUnicodeTable.cs : oh, ok, so we already have
501 GetManifestResourceInternal() ;-)
502 * managed-collation.patch : in Assembly.cs made that method internal.
504 2005-07-24 Atsushi Enomoto <atsushi@ximian.com>
506 * MSCompatUnicodeTable.cs : the pointer based icall code could be
507 also applicable for USE_MANAGED_RESOURCE mode.
509 2005-07-23 Atsushi Enomoto <atsushi@ximian.com>
511 * MSCompatUnicodeTable.cs : added icall support code (not enabled
512 unless the first line is commented out).
514 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
516 * create-mscompat-collation-table.cs,
517 MSCompatUnicodeTableUtil.cs,
518 MSCompatUnicodeTable.cs : Added resource version output (and ignore
519 in case of version mismatch). Removed obsolete, commented out code.
521 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
524 MSCompatUnicodeTable.cs,
525 create-mscompat-collation-table.cs : Now they use unmanaged pointers
526 instead of managed arrays.
527 * managed-collation.patch : Now it contains patch for IntPtrStream.cs
528 and Assembly.cs as well.
530 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
532 * MSCompatUnicodeTable.cs,
533 SimpleCollator.cs : Moved tailoring support classes to
534 MSCompatUnicodeTable.cs and drawn out from SimpleCollator.
535 Now that cjk and tailoring support are filled inside
536 MSCompatUnicodeTable, no managed array is exposed.
538 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
540 * create-mscompat-collation-table.cs,
542 MSCompatUnicodeTable.cs : Now it's not exposing collation table
543 internals as managed arrays (to switch to unmanaged pointers).
545 2005-07-22 Atsushi Enomoto <atsushi@ximian.com>
547 * create-mscompat-collation-table.cs : tiny nonspacing mark fix.
549 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
551 * create-mscompat-collation-table.cs : Fixed most of Greek mappings.
552 * MSCompatUnicodeTable.cs : don't lock string.
554 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
556 * create-mscompat-collation-table.cs : More Cyrillic diacritical fixes.
558 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
560 * create-mscompat-collation-table.cs : More Latin diacritical fixes.
562 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
564 * create-mscompat-collation-table.cs : There were still missing
565 math symbol mappings. Added several hacky diacritical weight for
568 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
570 * create-mscompat-collation-table.cs : fixed a few diacritical weight
571 on Cyrillic characters. Fixed ParseTailoringSource() to handle
572 non-heading escape sequence (\uXXXX) as expected.
574 2005-07-21 Atsushi Enomoto <atsushi@ximian.com>
576 * create-mscompat-collation-table.cs,
577 MSCompatUnicodeTableUtil.cs,
578 MSCompatUnicodeTable.cs : added more aggressive index limits for
579 table optimization at data size, in cost of speed.
581 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
583 * create-mscompat-collation-table.cs : fixed Arabic tertiary weight.
585 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
587 * create-mscompat-collation-table.cs : Mapping for hyphens and
588 punctuation are kinda finished. Rewrote batch mapping method to
589 collect all NFKD. Required modification on mapping is done.
591 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
593 * create-mscompat-collation-table.cs : minor mapping fixes on accent
594 marks and punctuations.
596 2005-07-20 Atsushi Enomoto <atsushi@ximian.com>
598 * create-mscompat-collation-table.cs : Fixed some MathSymbol mapping
599 and Box drawing mapping.
601 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
603 * create-mscompat-collation-table.cs : Fixed almost all numbers.
605 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
607 * create-mscompat-collation-table.cs : Symbol mappings are almost done.
608 Removed hack that gave dummy mappings to blank symbols.
610 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
612 * create-mscompat-collation-table.cs : more fixes on arrows. Fix on box
613 drawings. Some code refactoring to eliminate hack.
615 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
617 * create-mscompat-collation-table.cs : Fixed some secondary weight
618 in Devanagari and arrows.
620 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
622 * create-mscompat-collation-table.cs : a set of tiny mapping fixes.
624 2005-07-19 Atsushi Enomoto <atsushi@ximian.com>
626 * create-mscompat-collation-table.cs : some diacritical fixes for
627 Latin. Added batch mapping method that considers computed
628 diacritical weight (for numbers).
630 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
632 * managed-collation.patch : forgot to add System.String patch.
634 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
636 * MSCompatUnicodeTable.cs : added resource existence check (required
637 for mscorlib transient time from the one without resources to the
640 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
642 * create-mscompat-collation-table.cs : fixed punctuations and hyphen
643 (shift) primary weight.
645 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
647 * create-mscompat-collation-table.cs : more nonspacing mark fixes.
648 Some non-basic Cyrillic diacritical weight fixes.
650 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
652 * create-mscompat-collation-table.cs : some Gurmukhi fixes on level 1
653 and level 3. Tiny Hangul weight fixes.
654 * MSCompatUnicodeTable.cs : U+30F5 and U+30F6 are small Japanese.
656 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
658 * create-mscompat-collation-table.cs : some normal characters which have
659 "narrow" NFKD mapping are regarded as "wide" and thus level 3 weight
660 values were different. Handle U+30FB as category A.
661 * MSCompatUnicodeTable.cs : U+30FB does not have special weight.
663 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
665 * create-mscompat-collation-table.cs : more diacritical weight fixes.
666 Removed some unused code.
668 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
670 * create-mscompat-collation-table.cs : Fixed some Thai and Arabic
673 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
675 * create-mscompat-collation-table.cs : Fixed Syriac nonspacing marks.
677 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
679 * create-mscompat-collation-table.cs : Fixed nonspacing marks in
680 Malayalam, Thai and Lao. Removed extraneous hack.
682 2005-07-15 Atsushi Enomoto <atsushi@ximian.com>
684 * SimpleCollator.cs : rewrote LastIndexOf() to handle source extenders.
685 Some refactoring on IndexOf() code. Removed unused Matches().
686 * Collation-notes.txt : some methods needed to be reimplemented, so
687 rewrote the description.
689 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
691 * SimpleCollator.cs : rewrote IsSuffix() to use CompareInternal().
692 Thus supported extenders in IsSuffix().
694 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
696 * SimpleCollator.cs : more IsSuffix() simplification, but it will be
697 stopped here since it cannot handle extenders (implementing new
700 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
702 * SimpleCollator.cs : simplified IsSuffix() code.
704 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
706 * SimpleCollator.cs : Fixed IndexOf() and LastIndexOf() to search the
707 entire replacement string if char target was an expansion.
708 IsSuffix() was using a method for IsPrefix() which was incorrect.
709 Removed old IsPrefix() code.
711 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
713 * SimpleCollator.cs : IndexOf() was incorrectly sharing the same
714 byte[] field in different areas of code. Now extenders in both
715 source and target really work in IndexOf().
717 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
719 * create-mscompat-collation-table.cs : fixed U+FF9F diacritical weight.
720 * SimpleCollator.cs : handle U+FF9E and U+FF9F as extenders.
722 2005-07-14 Atsushi Enomoto <atsushi@ximian.com>
724 * SimpleCollator.cs : Now FilterExtender() handles all extender
725 support. IndexOf() and LastIndexOf() now support extenders.
726 IndexOf() and LastIndexOf() did not proceed contraction source
727 length as expected. Tiny refactoring on private IsPrefix() to take
730 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
732 * SimpleCollator.cs : when restoring from expansion, go back to the
733 top of the loop (to avoid index out of range).
734 Now IsPrefix() is implemented to reuse Compare() and thus it now
735 supports extender as well.
736 * Collation-notes.txt : status update. Deleted optimization part in
737 status section (it is duplicate).
739 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
741 * SimpleCollator.cs : some code reordering.
742 * create-mscompat-collation-table.cs : it was still missing U+3094.
744 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
746 * SimpleCollator.cs : Compare() now supports extender (e.g. U+30FC).
748 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
750 * SimpleCollator.cs : In GetSortKey(), don't update previousChar when
751 it is not primary (e.g. don't "extend" diacritical mark).
753 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
755 * managed-collation.patch : CompareInfo.Compare() should consider
756 the possibilities that non-empty string might be actually empty
757 in culture-sensitive context.
759 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
761 * SimpleCollator.cs : IndexOf() and LastIndexOf() return start when
762 target is "empty" (in culture-sensitive context).
764 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
766 * SimpleCollator.cs : In IndexOf() and LastIndexOf(), skip ignorable
767 characters in target string.
769 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
771 * SimpleCollator.cs : When IgnoreWidth is specified, all Kana
772 characters are regarded as half-width.
773 Even though IgnoreWidth is specified, it should not ignore case.
774 For special weight comparison, the default values (E4) are bigger
775 than non-default values.
776 * SortKeyBuffer.cs : It should save LCID and original string.
777 * create-mscompat-collation-table.cs : For Japanese half-width kana,
778 it should not be counted in widthCompat map since IgnoreWidth does
779 not really ignore those differences.
781 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
783 * create-mscompat-collation-table.cs : Fixed missing Japanese bits.
785 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
787 * create-mscompat-collation-table.cs :
788 tiny diacritical weight fix for U+20D0-U+20E1.
790 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
792 * create-mscompat-collation-table.cs : ja CJK ideograph got completed.
794 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
796 * create-mscompat-collation-table.cs : Fixed CJK custom Japanese
797 mapping. It (maybe as well as other CJK tables) mixes NFKD. For
798 Japanese, modified NFKD table (because of Windows lame design).
800 2005-07-13 Atsushi Enomoto <atsushi@ximian.com>
802 * Makefile : added MONO_USE_MANAGED_COLLATION=no almost everywhere.
803 * MSCompatUnicodeTable.cs : FillCJK() was not invoked. Now it is
804 invoked at any time it is required.
805 * SimpleCollator.cs : call FillCJK() above in .ctor().
806 * MSCompatUnicodeTableUtil.cs : CJK range was wider.
807 * create-mscompat-collation-table.cs : CJK binary was missing the
808 length. CJK remapping is being moved to ModifyUnidata().
809 For cjk-ja mapping, we have to consider compat characters to be
810 added to the map, besides the raw UCA table.
812 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
814 * SortKeyBuffer.cs : Fixed shift level computation to match w/ Windows.
816 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
818 * SimpleCollator.cs : fixed LastIndexOf() to handle _target's_
819 contraction as expected. Fixed Compare() to save s2's contraction
821 * TestDriver.cs : added LastIndexOf() tester w/ indexes.
823 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
825 * managed-collation.patch : Fixed IsPrefix() and IsSuffix(). They
826 incorrectly use Compare().
827 * TestDriver.cs : more moved to nunit tests.
829 2005-07-12 Atsushi Enomoto <atsushi@ximian.com>
831 * SimpleCollator.cs : several fixes on Compare().
832 - Ignorable characters are skipped at the top of the loop.
833 - IgnoreNonSpace is checked to avoid extraneous level 2 comparison.
834 - In such case that s1 index is increased while s2 contraction is
835 replaced, s1 is inconsistently proceeded (bug).
836 - IsIgnorable() now also checks IgnoreNonSpace.
837 - Fixed FilterOptions() that does not work for IgnoreWidth at all.
838 * TestDriver.cs : now some are moved to nunit tests.
839 * Collation-notes.txt : minor todo update.
841 2005-07-11 Atsushi Enomoto <atsushi@ximian.com>
843 * SimpleCollator.cs : Compare() was ignoring such case that both
844 entire strings have '-' to be compared.
845 * Collation-notes.txt : more status updates.
846 * TestDriver.cs : added '-' use cases.
848 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
850 * SimpleCollator.cs : to be same as other buggy part, it now handles
851 U+3005, U+3031 and U+3032 as buggy as Windows. It just repeats
853 Fixed GetSortKey(): if the repeater is U+3005, second weight is 5.
854 * create-mscompat-collation-table.cs : dummy values for extenders.
856 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
858 * SimpleCollator.cs : Special weight fixes on GetSortKey(). Dash type
859 should be computed from ExtenderType, and voice mark weight should
861 * MSCompatUnicodeTable.cs : added tiny comment.
863 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
865 * SortKey.cs : It borked when MONO_USE_MANAGED_COLLATION is not yes.
866 * SimpleCollator.cs : support for extender (U+309D etc.).
868 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
870 * create-mscompat-collation-table.cs : some punct/symbols fix.
871 * managed-collation.patch : new (and temporary) file to support
872 managed collation in mscorlib.
873 * README : described how to use managed collation.
875 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
877 * create-mscompat-collation-table.cs : Further Cyrillic fixes. Handle
878 U+482-4C8 (though needs diacritical fixes).
879 * MSCompatUnicodeTable.cs : tiny comment for alternative impl.
881 2005-07-08 Atsushi Enomoto <atsushi@ximian.com>
883 * create-mscompat-collation-table.cs : Reimplemented Cyrillic weight
884 computation code, since it looks like the same way as Latin letters
885 have. Thus removed all other approaches (UCA, by letter name).
887 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
889 * create-mscompat-collation-table.cs : diacritical fix for "double-
890 struck". Syriac nonspacing fixes.
892 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
894 * create-mscompat-collation-table.cs : more math symbol weight fixes.
896 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
898 * create-mscompat-collation-table.cs : fixed Hebrew character sortkeys.
900 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
902 * create-mscompat-collation-table.cs : math symbols U+25A0-U+2600 are
903 implemented (no stub). Some other fixes on category 8-A.
905 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
907 * create-mscompat-collation-table.cs : some minor fixes on Arabic,
908 Korean and Japanese sortkey weights.
910 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
912 * create-mscompat-collation-table.cs : More diacritical fixes.
913 Georgian characters do not have level 2 weights but level 3.
915 2005-07-07 Atsushi Enomoto <atsushi@ximian.com>
917 * create-mscompat-collation-table.cs : Roman numeral characters
918 have diacritical weight. quick hack for control signs (U+2400..)
921 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
923 * create-mscompat-collation-table.cs : improving Latin mappings.
924 Setting non-ASCII Latin characters' primary weight between those
925 ASCII characters, and setting diacritical weight (hacky).
926 * MSCompatUnicodeTable.cs :
927 Kanatype check: fixed (voice marks) and improved (comparison order).
929 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
931 * create-mscompat-collation-table.cs : more diacritical fixes.
932 primary weight fixes on punctuations in category 07.
934 2005-07-06 Atsushi Enomoto <atsushi@ximian.com>
936 * create-mscompat-collation-table.cs : several diacritical fixes.
937 * TestDriver.cs : sortkey dumper should use StringSort.
939 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
941 * SimpleCollator.cs : fixed incorrect indexer setup. Optimized
942 GetContraction() call a bit.
944 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
946 * create-mscompat-collation-table.cs : fixed incorrect level 2
948 * MSCompatUnicodeTable.cs : remove debug line.
950 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
952 * MSCompatUnicodeTableUtil.cs,
953 MSCompatUnicodeTable.cs,
955 create-mscompat-collation-table.cs : made some members internal and
956 accessible from other classes. Many indexes could be 0 by default.
957 * SimpleCollator.cs : optimizations. avoid method call.
959 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
961 * Collation-notes.txt : more updates.
962 * SimpleCollator.cs : Added quick check for Ordinal comparison.
963 Fixed special weight comparison. It cannot be customizable in the
964 implementation (and it won't be harmful).
965 * mono-tailoring-source.txt : thus updated comment.
967 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
969 * SimpleCollator.cs : Compare() was missing French sort support.
970 * TestDriver.cs : added example case.
972 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
974 * Collation-notes.txt : updated status. Eliminated descriptions on
975 "iterator" (I avoided it for performance concern). Fixed misc.
976 incorrect descriptions.
978 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
980 * Collator.cs : Now that SimpleCollator became feature complete, it is
983 2005-07-05 Atsushi Enomoto <atsushi@ximian.com>
985 * SimpleCollator.cs : implemented decent Compare() that immediately
986 stops at first primary difference.
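Note: a minimal Python sketch of the "stop at first primary difference" idea, not the SimpleCollator code itself. It assumes each character has already been mapped to a hypothetical (primary, secondary, tertiary) weight tuple, and that a length difference counts as a primary difference. Ignorable characters, contractions and CompareOptions are left out. Lower-level differences are only remembered and are used when no primary difference exists.

```python
from typing import List, Tuple

# Hypothetical collation element: (primary, secondary, tertiary) weights.
Element = Tuple[int, int, int]


def sign(x: int) -> int:
    """Map any integer to -1, 0 or 1."""
    return (x > 0) - (x < 0)


def compare(a: List[Element], b: List[Element]) -> int:
    """Return -1, 0 or 1; a primary difference ends the scan at once."""
    secondary = 0  # first secondary difference seen so far
    tertiary = 0   # first tertiary difference seen so far
    for (p1, s1, t1), (p2, s2, t2) in zip(a, b):
        if p1 != p2:
            # Primary weights decide immediately, whatever was seen before.
            return sign(p1 - p2)
        if secondary == 0:
            secondary = sign(s1 - s2)
        if tertiary == 0:
            tertiary = sign(t1 - t2)
    if len(a) != len(b):
        # Assumption: the shorter sequence sorts first at the primary level.
        return sign(len(a) - len(b))
    # No primary difference: secondary wins over tertiary.
    return secondary or tertiary
```

In this sketch a later primary difference outranks an earlier secondary one, which is why the scan cannot return on the first lower-level difference it meets.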
988 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
990 * SimpleCollator.cs : indexers might return -1.
992 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
994 * SimpleCollator.cs : IsPrefix() and IsSuffix() optimization code was
995 buggy (length check for source was missing).
997 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
999 * create-mscompat-collation-table.cs : Fixed tailoring table output
1000 to be in correct and countable order. Now if tailoring alias was not
1001 found, just stop the build.
1002 * MSCompatUnicodeTable.cs : several build fixes. Now it works to read
1004 * mono-tailoring-source.txt : commented out CJK aliases that miss
1006 * Makefile : needed further filename fixes.
1008 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
1010 * MSCompatUnicodeTable.cs : renamed from MSCompatUnicodeTable.template
1011 (now it is working as a standalone file).
1012 * Makefile : renamed generated file as MSCompatUnicodeTableGenerated.cs
1013 (the generator now creates both binary resources and C# source).
1015 2005-07-04 Atsushi Enomoto <atsushi@ximian.com>
1017 * create-mscompat-collation-table.cs : Now it generates binary
1018 resources (to parent directory).
1019 * MSCompatUnicodeTable.template : added conditional code that fills
1020 collation tables from manifest resources.
1021 * Makefile : remove collation table binaries as well on "make clean".
1022 Removed extraneous dependency.
1024 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
1026 * MSCompatUnicodeTable.template,
1027 SimpleCollator.cs : removed extraneous GetExpansion().
1029 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
1031 * SimpleCollator.cs : IsSuffix() also supports contractions.
1032 * TestDriver.cs : IsSuffix() example contraction cases.
1034 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
1036 * SimpleCollator.cs : reverted IsSuffix() to return bool (to match w/
1037 what current IsPrefix() does). For expansion of target, IsPrefix()
1038 should check the no-match case that expansion is longer than input.
1039 Some refactoring of IsPrefix().
1040 Added GetContractionTal() for IsSuffix() (not used yet).
1042 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
1044 * TestDriver.cs : added IsPrefix() expansion cases.
1045 * SimpleCollator.cs : IsPrefix() now supports contractions (with much
1046 of complexity), and it now returns bool again.
1047 IndexOf() for replacement should make use of IndexOfPrimitiveChar()
1048 since expansions won't be expanded recursively.
1050 2005-07-01 Atsushi Enomoto <atsushi@ximian.com>
1052 * SimpleCollator.cs : commonized character comparison in IsPrefix()
1053 and IsSuffix(). csc compile fix.
1054 * CompareInfoImpl.cs : deleted.
1056 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
1058 * TestDriver.cs : added SimpleCollator.ctor() sanity check.
1059 Added replacement contraction example.
1060 * SimpleCollator.cs : Now IndexOf() and LastIndexOf() support
1061 contraction in source string. Extracted matching code to Matches().
1062 Replacement contraction was including extraneous '\x0'.
1064 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
1066 * Collation-notes.txt : updated status.
1067 * CollationDataStructures.txt : tiny fixes.
1068 * SimpleCollator.cs :
1069 Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
1070 namespace Util and csc borked).
1071 GetContraction was incorrectly returning first item.
1072 Private IsPrefix() now returns int (but it might not be in real use).
1073 Extracted simple char comparison to CompareCharSimple().
1074 IndexOf() and LastIndexOf() now fully handle contractions (both
1075 binary key and string replacement) in "target" (for "s" not yet).
1076 * TestDriver.cs : be more verbose.
1077 * mono-tailoring-source.txt : added comment.
1078 * MSCompatUnicodeTable.template :
1079 Renamed alias Util to UUtil (MS sys.enterprisesvc has sucky global
1081 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
1083 * create-mscompat-collation-table.cs : compute COMBINING blah marks as
1084 well as those characters WITH blah.
1085 * TestDriver.cs : added combining sortkey cases.
1087 2005-06-30 Atsushi Enomoto <atsushi@ximian.com>
1089 * mono-tailoring-source.txt : fixed description on '*' in sortkeys.
1090 * SimpleCollator.cs : Now it fully uses tailoring info. Fixed
1091 contraction search that worked only when string is contraction.
1092 Removed commented code. Minor refactoring.
1093 * TestDriver.cs : added example that uses "ZS" in Hungarian sorting.
1095 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1097 * create-mscompat-collation-table.cs,
1098 * mono-tailoring-source.txt : removed extraneous level 4 sortkey
1099 which cannot be supported.
1100 * SimpleCollator.cs : added GetContraction() and used in some places.
1101 Now CompareOptions is set only once. Reordered some code (e.g.
1102 ignorable check -> get compat char -> compare).
1104 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1106 * SimpleCollator.cs : sort tailoring tables before actual usage.
1107 Support diacritical remappings (it is customized collation rule
1108 which does not exist in UCA).
1110 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1112 * SimpleCollator.cs : build culture specific tailoring table from
1113 TailoringInfo and unified data array.
1114 * create-mscompat-collation-table.cs : Added null termination to
1115 sortkey map tailorings (mostly to save my eyes).
1116 * MSCompatUnicodeTable.template : added public TailoringValues.
1118 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1120 * SortKeyBuffer.cs : handle special weight (category 06) characters.
1121 * Collation-notes.txt : Updated description on special weight (it was
1123 * TestDriver.cs : added special weight cases.
1125 2005-06-29 Atsushi Enomoto <atsushi@ximian.com>
1127 * MSCompatUnicodeTable.template : added GetTailoringInfo().
1128 * SimpleCollator.cs : Now tailoring information is acquired and used.
1129 (FrenchSort is supported but Compare() won't work expectedly since
1130 the table is still incomplete for those diacritical marks).
1131 * SortKeyBuffer.cs : On reversing diacritical weights, it should
1132 ignore zeros. Reset() should reset frenchSorted flag.
1134 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1136 * create-mscompat-collation-table.cs : Further fixes on Jamo,
1137 diacritical weights by character name, and *Numbers primary weights.
1139 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1141 * create-mscompat-collation-table.cs : More fix on Devanagari,
1142 Gujarati, Oriya, Tamil and Lao sortkeys.
1144 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1146 * create-mscompat-collation-table.cs : Fixed Georgian, Thai, Gurmukhi
1149 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1151 * create-mscompat-collation-table.cs : Fixed Thai character primary
1152 and secondary values. Fixed Thaana letters. Added more LAMESPEC
1153 CJK compat. Fixed some circled CJK secondary weight.
1154 Hacked some nonspacing mark sortkey value adjustment.
1156 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1158 * create-mscompat-collation-table.cs : CP932.TXT was not parsed as
1159 expected. JIS ordering was incorrect. The offset for OtherNumbers that
1160 represent values of 10 or more was computed incorrectly. Some Hangul
1161 compat characters have a different offset.
1163 2005-06-28 Atsushi Enomoto <atsushi@ximian.com>
1165 * create-mscompat-collation-table.cs : Fixed 0x8 category characters.
1166 Added hack for need-to-be-fixed characters to fall into 0xA category.
1167 * create-collation-element-table.cs : previous checkin seems to have failed :(
1168 * README: updated a bit.
1170 2005-06-24 Atsushi Enomoto <atsushi@ximian.com>
1172 * CodePointIndexer.cs :
1173 removed extraneous switch (I could use empty array for that need).
1174 * CollationElementTableUtil.cs : primary weight type became ushort.
1175 * create-collation-element-table.cs : several bugfixes.
1176 collElem should be int. It was skipping most of entries because of
1177 incorrect string tokenization.
1179 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1181 * create-mscompat-collation-table.cs : handle some Jamo NFKD.
1183 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1185 * SimpleCollator.cs : forgot to commit in the last checkin.
1186 * create-mscompat-collation-table.cs : fixed arabic shift weight chars.
1187 * TestDriver.cs : switch table dumper and collator testing.
1188 * SortKey.cs : for now comment out internal indexes (not in use).
1190 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1192 * MSCompatUnicodeTable.template,
1193 SimpleCollator.cs : support for culture dependent CJK table.
1195 2005-06-23 Atsushi Enomoto <atsushi@ximian.com>
1197 * create-mscompat-collation-table.cs,
1198 MSCompatUnicodeTableUtil.cs : make CJK table more compact.
1200 2005-06-22 Atsushi Enomoto <atsushi@ximian.com>
1202 * SimpleCollator.cs : Fixed stupid index search when start != 0.
1204 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1206 * SimpleCollator.cs : fixed my misunderstanding on LastIndexOf(). It
1207 now starts from "start" and proceeds backward by "length".
1208 * TestDriver.cs : fix warning.
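Note: the "starts from start and proceeds backward by length" window can be sketched in Python as below. This is an illustration under assumptions, not the SimpleCollator code: it uses plain ordinal matching (no collation), and last_index_of is a hypothetical helper name. The window covers positions start - length + 1 through start, and a match has to lie entirely inside it.

```python
def last_index_of(source: str, target: str, start: int, length: int) -> int:
    """Search backward from `start` over `length` character positions.

    Returns the highest index of `target` inside the window, or -1.
    """
    lowest = start - length + 1  # leftmost position that may be examined
    if start < 0 or start >= len(source) or length < 0 or lowest < 0:
        raise ValueError("search window is outside the source string")
    # str.rfind(sub, lo, hi) only reports matches contained in source[lo:hi].
    return source.rfind(target, lowest, start + 1)
```

For example, last_index_of("abcabc", "bc", 4, 5) examines "abcab" only, so the match at index 4 is out of reach and index 1 is returned.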
1210 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1212 * TestDriver.cs : more tests.
1213 * SimpleCollator.cs : LastIndexOf() is not setting search length
1214 on iteration. Quick workaround for String.LastIndexOf() bug (maybe).
1216 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1218 * create-normalization-source.cs : output propValue as uint.
1220 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1222 * SortKey.cs : Now it is System.Globalization.SortKey.
1223 To replace existing implementation, it now requires lcid and
1224 CompareOptions. Added required members.
1225 * SortKeyBuffer.cs : thus .ctor() requires LCID.
1226 * SimpleCollator.cs : made required changes above.
1228 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1230 * CodePointIndexer.cs : added CompressArray(). Now it requires two more
1231 parameters for default index and codepoint.
1232 * CollationElementTableUtil.cs,
1233 NormalizationTableUtil.cs : required changes wrt above change.
1234 * MSCompatUnicodeTableUtil.cs : added for several codepoint indexers.
1235 * MSCompatUnicodeTable.template : Now it uses codepoint indexer.
1236 * create-mscompat-collation-table.cs : Now it outputs compressed array.
1237 * Makefile : now collation requires MSCompatUnicodeTableUtil.cs
1239 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1241 * SimpleCollator.cs :
1242 Implemented IsSuffix() and LastIndexOf().
1243 Several fixes on index > 0 cases.
1244 * TestDriver.cs : sample IsSuffix() and LastIndexOf() usage and more.
1246 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1248 * Collation-notes.txt : updated (status, impl. classes).
1249 * MSCompatUnicodeTable.cs : Korean Jamo are not really expansions.
1251 2005-06-21 Atsushi Enomoto <atsushi@ximian.com>
1253 * SimpleCollator.cs : implemented IndexOf(string,string,CompareOptions)
1254 and IsPrefix(). Tiny code refactoring.
1255 * TestDriver.cs : sample IsPrefix() and IndexOf() usage.
1256 * MSCompatUnicodeTable.cs : tiny refactoring for CodePointIndexer use.
1258 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
1260 * SimpleCollator.cs :
1261 IndexOf(string, char, CompareOptions) implementation.
1262 * TestDriver.cs : sample IndexOf() usage.
1264 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
1266 * create-mscompat-collation-table.cs : was missing most important
1267 kind of blocks - equivalent expansions (e.g. invariant mappings).
1268 More readable mappings.
1270 2005-06-20 Atsushi Enomoto <atsushi@ximian.com>
1272 * mono-tailoring-source.txt : new file. It describes tailoring
1273 information. Basically examined under .NET 1.x.
1274 * create-mscompat-collation-table.cs : consume the file above.
1275 * MSCompatUnicodeTable.template : now tailorings is not a stub.
1276 * CollationDataStructures.txt : minor fixes.
1278 SimpleCollator.cs : added FrenchSort support.
1279 * Collation-notes.txt : added description on Latin primary weights.
1280 * ldml-limited.rng : added note.
1281 * create-tailorings.cs : added note. more serialization (but won't be
1284 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
1286 * SortKeyBuffer.cs : non-primary character is added to previous
1288 * TestDriver.cs : added example case of above.
1290 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
1292 * SimpleCollator.cs : IgnoreSymbols support.
1293 * TestDriver.cs : compilation fix. IgnoreSymbols example.
1294 * create-mscompat-collation-table.cs : more Hangul fixes.
1296 2005-06-17 Atsushi Enomoto <atsushi@ximian.com>
1298 * create-mscompat-collation-table.cs : more Hangul fixes.
1299 * SortKey.cs : it will replace sys.globalization.SortKey. It has
1300 some internal members.
1301 * SortKeyBuffer.cs : now it uses SortKey instead of byte[].
1302 * SimpleCollator.cs : CompareOptions support. However I don't think
1303 it will be developed anymore since SortKey never enables IndexOf().
1304 * TestDriver.cs : a few CompareOptions cases.
1306 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1308 * SimpleCollator.cs : simple collator implementation that just will
1309 use GetSortKey() for all its basis.
1310 * TestDriver.cs : sample code that uses this collator set.
1311 * MSCompatUnicodeTable.template : removed test driver from here.
1313 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1315 * create-mscompat-collation-table.cs : Hangul fixes.
1316 Now fewer than 300 characters lack sortkey weights.
1317 * MSCompatUnicodeTable.template : added FIXME info for Hangul Jamo.
1319 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1321 * create-mscompat-collation-table.cs : Added control picture mappings.
1322 Minor primary weight fixes.
1324 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1326 * create-mscompat-collation-table.cs : Added mappings for box
1327 drawings and blocks.
1329 2005-06-16 Atsushi Enomoto <atsushi@ximian.com>
1331 * create-mscompat-collation-table.cs : Added mappings for arrows.
1333 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1335 * create-mscompat-collation-table.cs : added support for letterlike
1336 characters and squared CJK compatibility characters, ordered by
1337 character names (0x0E category).
1338 * Collation-notes.txt : added description on that.
1340 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1342 * MSCompatUnicodeTable.template : Now expansions are simulated.
1343 * create-mscompat-collation-table.cs : filled Korean number level2.
1344 Reordered some code blocks to fill correct diacritical differences.
1345 * Collation-notes.txt : some corrections and minor additions.
1347 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1349 * MSCompatUnicodeTable.template :
1350 Now dumper test driver uses SortKeyBuffer for dogfooding.
1351 * create-mscompat-collation-table.cs : some diacritical level fixes
1352 (with non-working extra latin check).
1353 * SortKeyBuffer.cs : several fixes to get working as a practical code.
1354 * Collator.cs : make it compilable, leaving things as NotImplemented.
1356 2005-06-15 Atsushi Enomoto <atsushi@ximian.com>
1358 * create-mscompat-collation-table.cs : some fixes on primary category
1359 07 (miscellaneous symbols and punctuations).
1361 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
1363 * create-mscompat-collation-table.cs : more mapping fix on numbers,
1364 letters, variable weight characters, circled Japanese and CJK.
1365 * MSCompatUnicodeTable.template : fixed HasSpecialWeight() to be more
1366 inclusive. Simplified dumper code.
1368 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
1370 * create-mscompat-collation-table.cs : finished Hangul (both Jamo
1371 and Syllables). sortkey dumper diff lines became 8000 from 30000.
1373 2005-06-14 Atsushi Enomoto <atsushi@ximian.com>
1375 * create-mscompat-collation-table.cs : added some nonspacing marks in
1376 either correct or hacky way.
1378 2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
1380 * create-mscompat-collation-table.cs : several improvements. Japanese
1381 Kana support, Hebrew accents, Bengali nonspacing marks, sorting of
1382 numeric characters, diacritically decorated latin alphabets. Fixed
1383 some diacritical weights detection.
1384 * MSCompatUnicodeTable.cs : tiny Japanese fix. Handle nonspacing
1385 marks' primary weight as empty.
1386 * Collation-notes.txt : some updates.
1388 2005-06-13 Atsushi Enomoto <atsushi@ximian.com>
1390 * create-mscompat-collation-table.cs : don't process nonexact NFKD
1391 mapping as equivalent, however store CJK extensions into NFKD map
1392 even if one does not strictly match.
1393 Now am going to fill Hangul into tables (unlike UCA it does not look
1394 possible to calculate sortkey value).
1395 Fixed Cyrillic and Georgian UCA based orderings.
1396 * MSCompatUnicodeTable.template : added CJK extension sortkey
1399 2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
1401 * create-mscompat-collation-table.cs : Fixed latin alphabet support.
1402 Added latin with diacritical and CJK extension.
1403 * MSCompatUnicodeTable.cs : modified dumper code a bit (for my purpose).
1405 2005-06-10 Atsushi Enomoto <atsushi@ximian.com>
1407 * create-mscompat-collation-table.cs : now parses DerivedAge.txt (right
1408 now not used though). Filled CJK ideograph, still not perfect.
1409 Fixed number primary keys. NFKD numbers and CJK ideographs are now
1410 considered, including brackets elimination.
1411 * Makefile : now it downloads DerivedAge.txt.
1412 * MSCompatUnicodeTable.template : added dummy code dumper. It computes
1413 PrivateUse, Surrogate and Hangul Syllables.
1414 * Collation-notes.txt : Noted that Hangul Syllables need more love.
1416 2005-06-09 Atsushi Enomoto <atsushi@ximian.com>
1418 * create-tailorings.cs : added configuration support. sort them.
1419 I wonder if it is really usable. Having own format might be better.
1420 * create-mscompat-collation-table.cs : fixing some sortkey numbers,
1421 making closer to windows. Now it handles NFKD in some places.
1422 * MSCompatUnicodeTable.template : Added dummy sortkey dumper driver.
1423 * CollationDataStructures.txt : added description on tailoring
1424 fields, though they are subject to change.
1426 2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
1428 * create-tailorings.cs, ldml-limited.rng : new file.
1429 * LdmlReader.cs : removed old file.
1431 2005-06-07 Atsushi Enomoto <atsushi@ximian.com>
1433 * SortKeyBuffer.cs : split from Collator.cs. Now it considers
1434 practical use, reflecting updated sortkey constant design.
1435 Especially level 4 weight is split to 4 arrays that are merged in
1436 the last stage of GetSortKey().
1437 * Collator.cs : thus SortKeyBuffer is removed from here.
1438 Additionally, removed some extraneous bits in other classes.
1439 * Collation-notes.txt : Some editorial fixes. Added information on
1440 Korean matter (how to compute Hangul Syllables / Hangul Jamo cannot
1441 be stored in simple byte arrays).
1442 * CodePointIndexer.cs,
1443 create-collation-element-table.cs,
1444 CollationElementTable.template,
1445 NormalizationTableUtil.cs : short CodePointIndexer method names.
1446 * create-mscompat-collation-table.cs : Additional info on why some
1447 meaningful characters are ignored in Windows (Unicode version
1448 difference). Removed U+070F from special check (was extraneous).
1450 2005-06-06 Atsushi Enomoto <atsushi@ximian.com>
1452 * MSCompatUnicodeTable.template:
1453 Moved body implementation to table creator and put those bool
1454 results into an array.
1455 * create-mscompat-collation-table.cs :
1456 So imported those methods. Modified array output to emit "0x"
1457 only for more than 9.
1458 * create-normalization-source.cs : ditto on "0x" output matter.
1459 * CollationDataStructures.txt : so now it holds ignorableFlags.
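Note: the '"0x" only for more than 9' output rule might look like the Python sketch below. The helper names, the uppercase hex digits and the ", " separator are assumptions made for illustration; the generators themselves are C# and may format differently. Values up to 9 read the same in decimal and hex, so the prefix adds nothing there.

```python
def format_element(value: int) -> str:
    """Format one non-negative table value for a generated array literal."""
    # 0..9 are printed bare; anything larger gets an explicit hex prefix.
    return str(value) if value <= 9 else "0x{:X}".format(value)


def format_array(values) -> str:
    """Join formatted values the way an array initializer body would list them."""
    return ", ".join(format_element(v) for v in values)
```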
1461 2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
1463 * Collation-notes.txt, CollationDataStructures.txt :
1464 separate document for data structure design.
1466 2005-06-03 Atsushi Enomoto <atsushi@ximian.com>
1468 * create-mscompat-collation-table.cs : added culture-dependent CJK
1469 table creation. It uses CLDR as its basis. (Culture independent CJK
1471 * Makefile : added CLDR archive downloading support.
1472 * MSCompatUnicodeTable.template : tiny renamings.
1473 * Collation-notes.txt : additional CJK info.
1475 2005-06-02 Atsushi Enomoto <atsushi@ximian.com>
1477 * Collation-notes.txt, create-mscompat-collation-table.cs :
1478 added secondary weight support for BlahNumber characters.
1480 2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
1482 * downloaded : added directory. All downloaded files are stored here.
1483 * Makefile : use "downloaded" directory.
1484 Added more auto-download stuff.
1485 * create-mscompat-collation-table.cs :
1486 Added Japanese square kana support.
1488 2005-06-01 Atsushi Enomoto <atsushi@ximian.com>
1490 * Collation-notes.txt : added Estrangela (ancient Syriac) and Thaana.
1491 * create-mscompat-collation-table.cs : added support for Arabic abjad,
1492 Estrangela and Thaana.
1493 * MSCompatUnicodeTable.template : removed BOM.
1495 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1497 * Collation-notes.txt : wrong comment cleanup and spelling fixes.
1498 * create-mscompat-collation-table.cs : added diacritic support for
1499 Latin letters (as long as covered in primary weight).
1501 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1503 * Makefile : minor fixes. Added warning lines to generated sources.
1505 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1507 * create-char-mapping-source.cs :
1508 Removed ToWidthInsensitive() generation.
1510 2005-05-31 Atsushi Enomoto <atsushi@ximian.com>
1512 * create-mscompat-collation-table.cs : Now it dumps level1 to 3 values.
1513 ToWidthInsensitive() is implemented here, using an array (which is
1514 to be optimized using CodePointIndexer).
1515 * MSCompatUnicodeTable.cs : renamed as MSCompatUnicodeTable.template
1516 * MSCompatUnicodeTable.template : now it is used to generate
1517 MSCompatUnicodeTable.cs which got ready to be used.
1518 * Makefile : added MSCompatUnicodeTable.cs build support. Now it
1519 supports "make normalization" and "make collation".
1521 2005-05-30 Atsushi Enomoto <atsushi@ximian.com>
1523 * Collation-notes.txt : Description on ICU is very incorrect. Now it
1524 became more rational and sane.
1525 * create-mscompat-collation-table.cs : fixed some indexes.
1526 * Makefile : added "mstablegen" target.
1527 * MSCompatUnicodeTable.cs : removed GetPrimaryWeight(). Minor fix.
1529 2005-05-26 Atsushi Enomoto <atsushi@ximian.com>
1531 * Collation-notes.txt : more analysis on "letters".
1532 * create-mscompat-collation-table.cs : more proof of concepts.
1534 2005-05-25 Atsushi Enomoto <atsushi@ximian.com>
1536 * Collation-notes.txt : more info. Started letter sortkey analysis
1537 (some of other stuff are really non-understandable right now.)
1538 * create-mscompat-collation-table.cs : table generator proof-of-
1539 concept source (not compilable).
1540 * MSCompatUnicodeTable.cs : moved some code to the new source.
1543 2005-05-20 Atsushi Enomoto <atsushi@ximian.com>
1545 * Collation-notes.txt : started level 2 weight analysis.
1547 2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
1549 * Collation-notes.txt : Additional information on how to create
1551 * MSCompatUnicodeTable.cs : implemented part of GetLevel3Weight().
1553 2005-05-19 Atsushi Enomoto <atsushi@ximian.com>
1555 * Collation-notes.txt : More case weight (level 3) analysis. I'm
1556 likely to just write table generator.
1558 2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
1560 * MSCompatUnicodeTable.cs : part of level 4 weight implementation.
1562 2005-05-18 Atsushi Enomoto <atsushi@ximian.com>
1564 * Collation-notes.txt :
1566 Revised comparison methods; backward iteration is possible.
1567 More on char-by-char comparison.
1568 Level 4 comparison is actually a bit more complex.
1570 * Collator.cs : some conceptual updates wrt above.
1572 2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
1574 * Collation-notes.txt : Japanese voice mark is level 2, and Hangul
1575 properties are level 3.
1577 2005-05-17 Atsushi Enomoto <atsushi@ximian.com>
1579 * Collation-notes.txt : Make it more readable. More analysis on
1580 level 3 and 4 sortkey structures.
1581 * Collator.cs : some compilation fixes (not compilable yet).
1583 2005-05-16 Atsushi Enomoto <atsushi@ximian.com>
1585 * Collation-notes.txt : Analysis on variable-weighting (level 5)
1587 * Collator.cs : updated corresponding part of level 5, and more.
1589 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1591 * Collation-notes.txt : more updates.
1592 * Collator.cs : rewrote from scratch. Some rough sketch for sortkey
1593 buffer, character iterator and collator methods. Not compiling.
1595 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1597 * Collator.cs : Am going to replace it with new one. No need for
1598 CompareOptions-dependent Comparer.
1600 2005-05-13 Atsushi Enomoto <atsushi@ximian.com>
1602 * Collation-notes.txt : There seems a bit more complexity.
1604 2005-05-10 Atsushi Enomoto <atsushi@ximian.com>
1606 * Collation-notes.txt : more updates, being close to write sortkey
1609 2005-05-09 Atsushi Enomoto <atsushi@ximian.com>
1611 * CompareInfoImpl.cs, Collator.cs : conceptual update
1612 * Collation-notes.txt : some corrections and additions.
1613 * Makefile : added LDML input (but it won't be used at all).
1615 2005-04-28 Atsushi Enomoto <atsushi@ximian.com>
1617 * Collation-notes.txt : more updates.
1619 2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
1621 * Collation-notes.txt : more updates.
1623 2005-04-26 Atsushi Enomoto <atsushi@ximian.com>
1625 * Collation-notes.txt : some updates.
1626 * create-char-mapping-source.cs : superscripts and subscripts are also
1627 ignored in IgnoreWidth comparison.
1628 * Makefile : tiny touch fix.
1630 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1632 * CompareInfoImpl.cs, Collator.cs : conceptual stuff (not working).
1634 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1636 * create-char-mapping-source.cs : Now it generates
1637 ToWidthInsensitive() from combining category <wide> and <narrow>.
1638 * MSCompatUnicodeTable.cs : added ToKanaTypeInsensitive() and
1639 ToWidthInsensitive() for IgnoreKanaType and IgnoreWidth.
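Note: in UnicodeData.txt the <wide> and <narrow> tags appear in the decomposition mapping field. A width-insensitive mapping built from them can be sketched in Python with the standard unicodedata module. This is only an illustration of the idea: to_width_insensitive is a hypothetical name, and the generated C# table may cover more cases than these two tags.

```python
import unicodedata


def to_width_insensitive(ch: str) -> str:
    """Map a fullwidth or halfwidth character to its ordinary counterpart."""
    # Example field values: "<wide> 0041" for U+FF21, "<narrow> 30A2" for U+FF71.
    fields = unicodedata.decomposition(ch).split()
    if len(fields) == 2 and fields[0] in ("<wide>", "<narrow>"):
        return chr(int(fields[1], 16))
    # Characters without such a mapping are returned unchanged.
    return ch
```

With this mapping applied to both operands, U+FF21 (fullwidth A) and "A" compare as the same character, which is the effect IgnoreWidth asks for.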
1641 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1643 * README, LdmlReader.cs, DataStructures.txt : new files.
1645 2005-04-25 Atsushi Enomoto <atsushi@ximian.com>
1647 * CodePointIndexer.cs,
1648 Collation-notes.txt,
1649 CollationElementTable.template,
1650 CollationElementTableUtil.cs,
1651 create-char-mapping-source.cs,
1652 create-collation-element-table.cs,
1653 create-combining-class-source.cs,
1654 create-normalization-source.cs,
1656 MSCompatUnicodeTable.cs,
1657 Normalization.template,
1658 NormalizationTableUtil.cs : initial checkin (to private branch).