draft-zeilenga-ldapbis-strmatch-02.txt

   1
   2
   3
   4
   5
   6
   7 Internet-Draft                           Editor: Kurt D. Zeilenga
   8 Intended Category: Informational                 OpenLDAP Foundation
   9 Expires in six months                            3 March 2003
  10
  11
  12
  13             Internationalized String Matching Rules for X.500
  14                  <draft-zeilenga-ldapbis-strmatch-02.txt>
  15
  16
  17 Status of this Memo
  18
  19   This document is an Internet-Draft and is in full conformance with all
  20   provisions of Section 10 of RFC2026.
  21
  22   This document is intended to be submitted to the ITU for publication
  23   as an amendment to X.520 and published as an Informational RFC.
  24   Distribution of this memo is unlimited.  Technical discussion of this
  25   document will take place on the IETF LDAP Revision Working Group
  26   mailing list <ietf-ldapbis@openldap.org>.  Please send editorial
  27   comments directly to the author <Kurt@OpenLDAP.org>.
  28
  29   Internet-Drafts are working documents of the Internet Engineering Task
  30   Force (IETF), its areas, and its working groups.  Note that other
  31   groups may also distribute working documents as Internet-Drafts.
  32   Internet-Drafts are draft documents valid for a maximum of six months
  33   and may be updated, replaced, or obsoleted by other documents at any
  34   time.  It is inappropriate to use Internet-Drafts as reference
  35   material or to cite them other than as ``work in progress.''
  36
  37   The list of current Internet-Drafts can be accessed at
  38   <http://www.ietf.org/ietf/1id-abstracts.txt>. The list of
  39   Internet-Draft Shadow Directories can be accessed at
  40   <http://www.ietf.org/shadow.html>.
  41
  42   Copyright 2003, The Internet Society.  All Rights Reserved.
  43
  44   Please see the Copyright section near the end of this document for
  45   more information.
  46
  47
  48 Abstract
  49
  50   The existing X.500 Directory Service technical specifications do not
  51   precisely define how string matching is to be performed.  This has
  52   led to a number of interoperability problems.  This document provides
  53   string preparation profiles for standard syntaxes and matching rules
  54   defined in X.520.
  55
  56
  57
  58 Zeilenga               X.500 Intl. String Matching              [Page 1]
  59 \f
  60 Internet-Draft     draft-zeilenga-ldapbis-strmatch-02       3 March 2003
  61
  62
  63   This document is intended to be submitted to the ITU-T for publication
  64   as an amendment to X.520 and published as an Informational RFC.
  65
  66
  67 Conventions
  68
  69   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
  70   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
  71   document are to be interpreted as described in BCP 14 [RFC2119].
  72
  73   Character names in this document use the notation for code points and
  74   names from the Unicode Standard [UNICODE] and ISO/IEC 10646-1
  75   [ISO10646].  For example, the letter "a" may be represented as either
  76   <U+0061> or <LATIN SMALL LETTER A>.  In the lists of mappings and the
  77   prohibited characters, the "U+" is left off to make the lists easier
  78   to read.  The comments for character ranges are shown in square
  79   brackets (such as "[CONTROL CHARACTERS]") and do not come from the
  80   standards.
  81
  82   Note: a glossary of terms used in Unicode and ISO/IEC 10646 can be
  83   found in [GLOSSARY].  Information on the ISO/IEC 10646/Unicode
  84   character encoding model can be found in [UTR17].
  85
  86
  87
  88 1. Introduction
  89
  90 1.1. Background
  91
  92   An X.500 matching rule [X.501] defines an algorithm for determining
  93   whether a presented value matches an attribute value in accordance
  94   with the criteria defined for the rule.  The proposition may be
  95   evaluated to True, False, or Undefined.
  96
  97       True      - the attribute contains a matching value,
  98
  99       False     - the attribute contains no matching value,
 100
 101       Undefined - it cannot be determined whether the attribute contains
 102                   a matching value or not.
 103
 104   For instance, the caseIgnoreMatch matching rule may be used to compare
 105   whether the commonName attribute contains a particular value without
 106   regard for case and insignificant spaces.
 107
 108
 109 1.2. X.500 String Matching Rules
 110
 119   "X.520: Selected attribute types" [X.520] provides (amongst other
 120   things) value syntaxes and matching rules for comparing values
 121   commonly used in the Directory [X.500].  These specifications are
 122   inadequate for strings composed of characters from the Universal
 123   Character Set (UCS) [ISO10646], a superset of Unicode [UNICODE].
 124
  125   The caseIgnoreMatch matching rule, for example, is simply defined as
 126   being a case insensitive comparison where insignificant spaces are
 127   ignored.  For printableString, there is only one space character and
 128   case mapping is bijective, hence this definition is sufficient.
 129   However, for UCS-based string types such as universalString, this is
 130   not sufficient.  For example, a case insensitive matching
 131   implementation which folded lower case characters to upper case would
  132   yield different results than an implementation which used
 133   upper case to lower case folding.  Or one implementation may view
 134   space as referring to only SPACE (U+0020), a second implementation may
 135   view any character with the space separator (Zs) property as a space,
 136   and another implementation may view any character with the whitespace
 137   (WS) category as a space.
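
        As an informal illustration (not part of X.520), the Python
        sketch below shows both ambiguities.  It uses only built-in str
        methods and the standard unicodedata module; the function names
        are hypothetical, and str.isspace() merely stands in for a
        broader whitespace notion.

```python
import unicodedata

def match_fold_upper(a: str, b: str) -> bool:
    # Reading 1: fold both strings to upper case, then compare.
    return a.upper() == b.upper()

def match_fold_lower(a: str, b: str) -> bool:
    # Reading 2: fold both strings to lower case, then compare.
    return a.lower() == b.lower()

def is_space_u0020(ch: str) -> bool:
    # Reading A: only SPACE (U+0020) counts as a space.
    return ch == "\u0020"

def is_space_zs(ch: str) -> bool:
    # Reading B: any character in general category Zs counts.
    return unicodedata.category(ch) == "Zs"

def is_space_whitespace(ch: str) -> bool:
    # Reading C: anything str.isspace() accepts counts.
    return ch.isspace()

# LATIN SMALL LETTER SHARP S (U+00DF) upper-cases to "SS" but is left
# unchanged by lower(), so the two folding directions disagree.
sharp_s_pair = ("stra\u00dfe", "strasse")
print(match_fold_upper(*sharp_s_pair), match_fold_lower(*sharp_s_pair))

# NO-BREAK SPACE (U+00A0) is Zs but is not U+0020; CHARACTER TABULATION
# (U+0009) is whitespace to str.isspace() but is not Zs.
print(is_space_zs("\u00a0"), is_space_u0020("\u00a0"))
print(is_space_whitespace("\t"), is_space_zs("\t"))
```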
 138
 139   The lack of precise specification for string matching has led to
 140   significant interoperability problems.  When used in certificate chain
 141   validation, security vulnerabilities can arise.  To address these
 142   problems, this document updates X.520 [X.520] with a detailed
 143   specification of string syntax and matching rule requirements.
 144
 145
 146 1.3. Relationship to "stringprep"
 147
 148   The matching rule algorithms described in this document are based upon
 149   the "stringprep" approach [RFC3454].  In "stringprep", presented and
  150   stored values are first prepared for comparison so that a
 151   character-by-character comparison yields the "correct" result.
 152
 153   The algorithm used here is a refinement of the "stringprep" [RFC3454]
 154   approach.  The algorithm involves two additional preparation steps.
 155
 156   a) prior to applying the Unicode string preparation steps outlined in
 157      "stringprep", the string is transcoded to Unicode;
 158
 159   b) after applying the Unicode string preparation steps outlined in
 160      "stringprep", characters insignificant to the matching rules are
 161      removed.
 162
 163   Hence, preparation of strings for X.500 matching involves the
 164   following steps:
 165
 166       1) Transcode
 175       2) Map
 176       3) Normalize
 177       4) Prohibit
 178       5) Check Bidi (Bidirectional)
 179       6) Insignificant Character Removal
 180
 181   These steps are described in Section 3.  Section 2 details design
 182   considerations.
 183
 184
 185 1.4. Relationship to X.500
 186
 187   This document updates X.520 [X.520] with additional normative and
 188   informative information.  Sections 3, 4, and 5 are normative parts of
 189   this update.  Other sections are informative.
 190
 191   Section 3 provides a specification for X.500 string preparation.  It
 192   is intended to be added as a new section in X.520.
 193
 194   Section 4 replaces section 6.1 of X.520 [X.520].  It updates select
 195   string matching rules.
 196
 197   Section 5 replaces portions of section 6.2 of X.520 [X.520].  It
 198   updates select syntax-based matching rules.
 199
 200
 201 2. Design Considerations
 202
 203   The X.500 string matching rule specification provided in Section 3 is
 204   designed to leverage the "stringprep" framework [RFC3454] for
  205   comparing strings.  As noted above, transcoding and insignificant
  206   character removal steps have been added.
 207
 208   This section describes the rationale for these and other design
 209   decisions.
 210
 211
 212 2.1. Transcode
 213
  214   In the past, transcoding occurred only when the input strings were
  215   not all encoded in the same character set.  If all were encoded in
  216   the same character set, no transcoding was performed.  Otherwise,
  217   all of the strings were transcoded to one of the character sets
  218   used.
  219
  220   As mappings between character sets, such as T.61 and UCS, are not
  221   bijective, this specification requires transcoding of all strings
  222   to a common character set.  UCS was the logical choice as all
 231   other character sets (used in X.500) can be transcoded to it without
 232   information loss.  None of the other character sets (used in X.500)
 233   offer this property.
 234
 235
 236 2.2. Map
 237
 238   Code points which have no semantic meaning in normal text are mapped
 239   to nothing.  Code points which are semantically equivalent in normal
 240   text are mapped to a single code point.
 241
 242   "Normal text", in this context, is viewed as text commonly held in
 243   attributes of Directory String syntax, such as identifiers, common
 244   names, and short descriptive text.
 245
 246
 247 2.3. Normalize
 248
 249   Normalization is performed to ensure that comparison is always done
 250   between canonical-equivalent strings.  As directory strings are often
 251   used as identifiers, we selected Form KC (compatibility composed) as
 252   it allows a greater number of strings to be treated as equivalent.
 253
 254   Unfortunately, this choice is not best for all applications.
 255   Additional matching rules which use different string preparation
 256   algorithms may be introduced in the future to better support these
  257   applications.  In particular, matching rules which use Form C
  258   (composed) normalization instead of Form KC would also be generally
  259   useful and may be desirable additions to X.500.
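
        The practical difference between the two forms can be sketched
        with Python's standard unicodedata module.  The helper name
        equal_under is hypothetical, and the module uses whatever
        Unicode version ships with the interpreter rather than Unicode
        3.2, so this is an illustration only.

```python
import unicodedata

def equal_under(form: str, a: str, b: str) -> bool:
    # Compare two strings after normalizing both to the given form.
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# "e" + COMBINING ACUTE ACCENT versus precomposed U+00E9:
# canonically equivalent, so both forms treat them as equal.
print(equal_under("NFC", "e\u0301", "\u00e9"))
print(equal_under("NFKC", "e\u0301", "\u00e9"))

# LATIN SMALL LIGATURE FI (U+FB01) versus "fi": only a compatibility
# equivalence, so Form KC matches them while Form C does not.
print(equal_under("NFC", "\ufb01le", "file"))
print(equal_under("NFKC", "\ufb01le", "file"))
```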
 261
 262
 263 2.4. Prohibit
 264
 265   TBD
 266
 267
 268 2.5. Check bidi
 269
 270   TBD
 271
 272
 273 2.6. Insignificant Character Removal
 274
 275   This step is used to remove insignificant characters from the string.
 276   Unlike the map step, which supports mapping of characters to nothing,
 277   this step allows removal of characters based upon their location in
 278   the string, surrounding characters in the string, and other factors.
 279
 280
 287 3. String Preparation
 288
 289   The following six-step process SHALL be applied to each presented and
 290   attribute value in preparation for string match rule evaluation.
 291
 292       1) Transcode
 293       2) Map
 294       3) Normalize
 295       4) Prohibit
 296       5) Check bidi
 297       6) Insignificant Character Removal
 298
  299   Failure in any step causes the assertion to be evaluated to Undefined.
 300
 301   The character repertoire of this process is Unicode 3.2 [UNICODE].
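
        One way to picture the process is as a chain of functions in
        which any failure short-circuits to Undefined.  The following
        Python sketch is an assumption about structure only: the names
        prepare and StepFailed are hypothetical, None stands in for
        Undefined, and the two steps shown are placeholders rather than
        the six steps defined in this section.

```python
from typing import Callable, Iterable, Optional

class StepFailed(Exception):
    """Raised by a preparation step that cannot produce an output string."""

Step = Callable[[str], str]

def prepare(value: str, steps: Iterable[Step]) -> Optional[str]:
    """Apply each step in order; return None (Undefined) if any step fails."""
    try:
        for step in steps:
            value = step(value)
    except StepFailed:
        return None
    return value

# Placeholder steps, for illustration only.
def identity(s: str) -> str:
    return s

def reject_empty(s: str) -> str:
    if not s:
        raise StepFailed("empty string")
    return s

print(prepare("Dundee", [identity, reject_empty]))
print(prepare("", [identity, reject_empty]))
```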
 302
 303
 304 3.1. Transcode
 305
 306   Each non-Unicode string value is transcoded to Unicode.
 307
 308   TeletexString values are transcoded to Unicode as described in
 309   [T61-UCS].
 310
  311   PrintableString values are transcoded directly to Unicode.
 312
 313   UniversalString, UTF8String, and bmpString values need not be
 314   transcoded as they are Unicode-based strings (in the case of
 315   bmpString, restricted to a subset of Unicode).
 316
 317   If the implementation is unable or unwilling to perform the
 318   transcoding as described above, or the transcoding fails, this step
 319   fails and the assertion is evaluated to Undefined.
 320
 321   The transcoded string is the output string.
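
        A minimal Python sketch of this step follows.  It assumes the
        value arrives as the content octets of the ASN.1 string and
        that BMPString and UniversalString use big-endian 2-octet and
        4-octet code units; the names transcode and TranscodeError are
        hypothetical.  TeletexString is rejected because [T61-UCS] is
        not yet specified, which this step permits an unwilling
        implementation to do.

```python
class TranscodeError(ValueError):
    """The step failed; the assertion evaluates to Undefined."""

# Assumed octet encodings for each string type (not stated in this text).
_CODECS = {
    "PrintableString": "ascii",      # does not check the restricted repertoire
    "UTF8String": "utf-8",
    "BMPString": "utf-16-be",
    "UniversalString": "utf-32-be",
}

def transcode(string_type: str, raw: bytes) -> str:
    if string_type == "TeletexString":
        raise TranscodeError("T.61 mapping not available ([T61-UCS] is TBD)")
    try:
        codec = _CODECS[string_type]
    except KeyError:
        raise TranscodeError(f"unsupported string type: {string_type}") from None
    try:
        return raw.decode(codec)
    except UnicodeDecodeError as exc:
        raise TranscodeError(str(exc)) from exc

print(transcode("PrintableString", b"Dundee"))
print(transcode("BMPString", b"\x00D\x00u"))
```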
 322
 323
 324 3.2. Map
 325
 326   SOFT HYPHEN (U+00AD) and MONGOLIAN TODO SOFT HYPHEN (U+1806) code
 327   points are mapped to nothing.  COMBINING GRAPHEME JOINER (U+034F) and
  328   VARIATION SELECTORs (U+180B-180D,FE00-FE0F) code points are also
 329   mapped to nothing.  The OBJECT REPLACEMENT CHARACTER (U+FFFC) is
 330   mapped to nothing.
 331
 332   CHARACTER TABULATION (U+0009), LINE FEED (LF) (U+000A), LINE
 333   TABULATION (U+000B), FORM FEED (FF) (U+000C), CARRIAGE RETURN (CR)
 334   (U+000D), and NEXT LINE (NEL) (U+0085) are mapped to SPACE (U+0020).
 335
 343   All other control code points (e.g., Cc) or code points with a control
 344   function (e.g., Cf) are mapped to nothing.
 345
 346   ZERO WIDTH SPACE (U+200B) is mapped to nothing.  All other code points
 347   with Separator (space, line, or paragraph) property (e.g, Zs, Zl, or
 348   Zp) are mapped to SPACE (U+0020).
 349
 350   For case ignore, numeric, and stored prefix string matching rules,
 351   characters are case folded per B.2 of [RFC3454].
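
        The mapping rules above can be sketched in Python as follows.
        The function name map_step is hypothetical.  General categories
        are taken from unicodedata.ucd_3_2_0, the Unicode 3.2 database
        in the standard library.  Case folding uses str.casefold(),
        which is only an approximation of table B.2 of [RFC3454] and is
        labelled as such in the code.

```python
from unicodedata import ucd_3_2_0

# Code points explicitly mapped to nothing by this section.
MAP_TO_NOTHING = (
    {0x00AD, 0x1806, 0x034F, 0xFFFC, 0x200B}
    | set(range(0x180B, 0x180E))      # U+180B-180D
    | set(range(0xFE00, 0xFE10))      # U+FE00-FE0F
)
# Control characters explicitly mapped to SPACE.
MAP_TO_SPACE = {0x0009, 0x000A, 0x000B, 0x000C, 0x000D, 0x0085}

def map_step(s: str, case_fold: bool = False) -> str:
    out = []
    for ch in s:
        cp = ord(ch)
        cat = ucd_3_2_0.category(ch)
        if cp in MAP_TO_SPACE:
            out.append(" ")
        elif cp in MAP_TO_NOTHING or cat in ("Cc", "Cf"):
            continue                  # mapped to nothing
        elif cat in ("Zs", "Zl", "Zp"):
            out.append(" ")
        else:
            out.append(ch)
    mapped = "".join(out)
    # Approximation: casefold() is not identical to RFC 3454 table B.2.
    return mapped.casefold() if case_fold else mapped

print(repr(map_step("a\u00adb\tc\u00a0d")))
print(repr(map_step("Dundee", case_fold=True)))
```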
 352
 353
 354 3.3. Normalize
 355
 356   The input string is be normalized to Unicode Form KC (compatibility
 357   composed) as described in [UAX15].
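
        In Python this step might be a one-line wrapper, assuming the
        Unicode 3.2 database exposed by the standard library as
        unicodedata.ucd_3_2_0 is an acceptable stand-in for [UAX15] at
        that version.  The name normalize_step is hypothetical.

```python
from unicodedata import ucd_3_2_0

def normalize_step(s: str) -> str:
    # Form KC, computed against the Unicode 3.2 character database.
    return ucd_3_2_0.normalize("NFKC", s)

# "e" + COMBINING ACUTE ACCENT composes; FULLWIDTH LATIN CAPITAL LETTER A
# (U+FF21) is replaced by its compatibility equivalent.
print(normalize_step("e\u0301") == "\u00e9")
print(normalize_step("\uff21") == "A")
```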
 358
 359
 360 3.4. Prohibit
 361
 362   All Unassigned, Private Use, and non-character code points are
 363   prohibited.  Surrogate codes (U+D800-DFFFF) are prohibited.
 364
 365   The REPLACEMENT CHARACTER (U+FFFD) code point is prohibited.
 366
 367   The first code point of a string is probibited from being a combining
 368   character.
 369
 370   Empty strings are prohibited.
 371
 372   The step fails and the assertion is evaluated to Undefined if the
 373   input string contains any prohibited code point.  The output string is
 374   the input string.
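
        A sketch of these checks in Python follows.  The names
        prohibit_step and Prohibited are hypothetical.  Categories come
        from unicodedata.ucd_3_2_0, so "Unassigned" means unassigned in
        Unicode 3.2, and "combining character" is approximated here as
        any general category beginning with "M", which is an
        assumption.

```python
from unicodedata import ucd_3_2_0

class Prohibited(ValueError):
    """The step failed; the assertion evaluates to Undefined."""

def _is_noncharacter(cp: int) -> bool:
    # U+FDD0-FDEF, plus the last two code points of every plane.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

def prohibit_step(s: str) -> str:
    if not s:
        raise Prohibited("empty string")
    if ucd_3_2_0.category(s[0]).startswith("M"):
        raise Prohibited("string begins with a combining character")
    for ch in s:
        cp = ord(ch)
        cat = ucd_3_2_0.category(ch)
        # Cn = unassigned, Co = private use, Cs = surrogate.
        if cat in ("Cn", "Co", "Cs") or _is_noncharacter(cp) or cp == 0xFFFD:
            raise Prohibited(f"prohibited code point U+{cp:04X}")
    return s  # the output string is the input string

print(prohibit_step("Dundee"))
```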
 375
 376
 377 3.5. Check bidi
 378
 379   There are no bidirectional restrictions.  The output string is the
 380   input string.
 381
 382
 383 3.6. Insignificant Character Removal
 384
 385   In this step, characters insignificant to the matching rule are to be
 386   removed.  The characters to be removed differ from matching rule to
 387   matching rule.
 388
 389   Section 3.6.1 applies to case ignore and exact string matching.
 390   Section 3.6.2 applies to numericString matching.
  399   Section 3.6.3 applies to telephoneNumber matching.
 400
 401
 402 3.6.1. Insignificant Space Removal
 403
 404   For the purposes of this section, a space is defined to be the SPACE
 405   (U+0020) code point followed by no combining marks.
 406
 407          NOTE - The previous steps ensure that the string cannot contain
 408          any code points in the separator class, other than SPACE
 409          (U+0020).
 410
 411   The following spaces are regarded as not significant and are to be
 412   removed:
 413     - leading spaces (i.e. those preceding the first character that is
 414       not a space);
 415     - trailing spaces (i.e. those following the last character that is
 416       not a space);
 417     - multiple consecutive spaces (these are taken as equivalent to a
 418       single space character).
 419
 420   (A string consisting entirely of spaces is equivalent to a string
 421   containing exactly one space.)
 422
 423   For example, removal of spaces from the Form KC string:
 424       "<SPACE><SPACE>foo<SPACE><SPACE>bar<SPACE><SPACE>" would result in
 425   the output string:
 426       "<SPACE>foo<SPACE>bar<SPACE>".
 427
 428   and the Form KC string:
 429       "<SPACE><SPACE><SPACE>" would result in the output string:
 430       "<SPACE>".
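
        The sketch below reproduces the two worked examples above in
        Python.  It follows the examples, in which each run of spaces,
        including a leading or trailing run, collapses to a single
        space; the bulleted text could also be read as removing leading
        and trailing spaces entirely, so treat this as one
        interpretation.  The function name is hypothetical, and
        "combining mark" is approximated as any general category
        beginning with "M".

```python
import unicodedata

def remove_insignificant_spaces(s: str) -> str:
    out = []
    prev_was_space = False
    for i, ch in enumerate(s):
        nxt = s[i + 1] if i + 1 < len(s) else ""
        followed_by_mark = bool(nxt) and unicodedata.category(nxt).startswith("M")
        # A "space" is U+0020 that is not followed by a combining mark.
        is_space = ch == "\u0020" and not followed_by_mark
        if is_space and prev_was_space:
            continue                  # drop repeats within a run of spaces
        out.append(ch)
        prev_was_space = is_space
    return "".join(out)

print(repr(remove_insignificant_spaces("  foo  bar  ")))
print(repr(remove_insignificant_spaces("   ")))
```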
 431
 432
 433 3.6.2. NumericString Insignificant Character Removal
 434
 435   For the purposes of this section, a space is defined to be the SPACE
 436   (U+0020) code point followed by no combining marks.
 437
 438   All spaces are regarded as not significant and are to be removed.
 439
 440   For example, removal of spaces from the Form KC string:
 441       "<SPACE><SPACE>123<SPACE><SPACE>456<SPACE><SPACE>" would result in
 442   the output string:
 443       "123456".
 444
 445   and the Form KC string:
 446       "<SPACE><SPACE><SPACE>" would result in an empty output string.
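
        The numeric case is simpler, since every space is dropped.  A
        Python sketch, with a hypothetical function name and the same
        approximation of "combining mark" as any general category
        beginning with "M":

```python
import unicodedata

def remove_numeric_insignificant(s: str) -> str:
    out = []
    for i, ch in enumerate(s):
        nxt = s[i + 1] if i + 1 < len(s) else ""
        followed_by_mark = bool(nxt) and unicodedata.category(nxt).startswith("M")
        if ch == "\u0020" and not followed_by_mark:
            continue                  # every space is insignificant
        out.append(ch)
    return "".join(out)

print(repr(remove_numeric_insignificant("  123  456  ")))
print(repr(remove_numeric_insignificant("   ")))
```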
 447
 448
 455 3.6.3. TelephoneNumber Insignificant Character Removal
 456
 457   For the purposes of this section, a hyphen is defined to be
 458   HYPHEN-MINUS (U+002D), ARMENIAN HYPHEN (U+058A), HYPHEN (U+2010),
 459   NON-BREAKING HYPHEN (U+2011), MINUS SIGN (U+2212), SMALL HYPHEN-MINUS
 460   (U+FE63), or FULLWIDTH HYPHEN-MINUS (U+FF0D) code point followed by no
 461   combining marks and a space is defined to be the SPACE (U+0020) code
 462   point followed by no combining marks.
 463
 464   All hyphens and spaces are regarded as not significant and are to be
 465   removed.
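
        A Python sketch of this removal, using the hyphen code points
        listed above.  The function name and the sample numbers are
        hypothetical, and "combining mark" is again approximated as any
        general category beginning with "M".

```python
import unicodedata

# Hyphen code points enumerated in this section.
HYPHENS = frozenset("\u002d\u058a\u2010\u2011\u2212\ufe63\uff0d")
SPACE = "\u0020"

def remove_telephone_insignificant(s: str) -> str:
    out = []
    for i, ch in enumerate(s):
        nxt = s[i + 1] if i + 1 < len(s) else ""
        followed_by_mark = bool(nxt) and unicodedata.category(nxt).startswith("M")
        if (ch == SPACE or ch in HYPHENS) and not followed_by_mark:
            continue                  # hyphens and spaces are insignificant
        out.append(ch)
    return "".join(out)

print(remove_telephone_insignificant("+1 408 555-1234"))
print(remove_telephone_insignificant("555\u20101234"))
```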
 466
 467
 468 4. String Matching Rules
 469
 470   In the matching rules specified in this section, all presented and
  471   stored string values are to be prepared for matching as described in
 472   Section 3.  String preparation produces strings suitable for
 473   character-by-character matching.
 474
 475
 476 4.1. Case Exact / Ignore Match
 477
 478   The Case Exact Match rule compares for equality a presented string
 479   with an attribute value of type DirectoryString or one of the data
 480   types appearing in the choice type DirectoryString, e.g. UTF8String,
  481   without regard to insignificant spaces (3.6.1).
 482
 483       caseExactMatch MATCHING-RULE ::= {
 484           SYNTAX DirectoryString {ub-match}
 485           ID id-mr-caseExactMatch }
 486
 487   The Case Ignore Match rule compares for equality a presented string
 488   with an attribute value of type DirectoryString or one of the data
 489   types appearing in the choice type DirectoryString, e.g. UTF8String,
 490   without regard to the case (upper or lower) of the strings (e.g.
  491   "Dundee" and "DUNDEE" match) and insignificant spaces (3.6.1).  The
 492   rule is identical to the caseExactMatch rule except upper case
 493   characters are folded to lower case during string preparation as
 494   discussed in 3.2.
 495
 496       caseIgnoreMatch MATCHING-RULE ::= {
 497           SYNTAX DirectoryString {ub-match}
 498           ID id-mr-caseIgnoreMatch }
 499
 500   Both rules return TRUE if the prepared strings are the same length and
 501   corresponding characters are identical.
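
        Once both values are prepared, the comparison itself is plain
        string equality.  In the Python sketch below, None stands in
        for a value whose preparation failed, so that the result is
        three-valued (True, False, or None for Undefined).  The
        function name is hypothetical and the inputs are assumed to be
        already prepared per Section 3.

```python
from typing import Optional

def equality_match(prepared_presented: Optional[str],
                   prepared_stored: Optional[str]) -> Optional[bool]:
    # Preparation failure on either side makes the assertion Undefined.
    if prepared_presented is None or prepared_stored is None:
        return None
    # str equality: same length and identical corresponding code points.
    return prepared_presented == prepared_stored

print(equality_match("dundee", "dundee"))
print(equality_match("dundee", "dundee city"))
print(equality_match(None, "dundee"))
```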
 502
 503
 511 4.2. Case Exact / Ignore Ordering Match
 512
 513   The Case Exact Ordering Match rule compares the collation order of a
 514   presented string with an attribute value of type DirectoryString or
 515   one of the data types appearing in the choice type DirectoryString,
  516   e.g. UTF8String, without regard to insignificant spaces (3.6.1).
 517
 518       caseExactOrderingMatch MATCHING-RULE ::= {
 519           SYNTAX DirectoryString {ub-match}
 520           ID id-mr-caseExactOrderingMatch }
 521
 522   The Case Ignore Ordering Match rule compares the collation order of a
  523   presented string with an attribute value of type DirectoryString or
  524   one of the data types appearing in the choice type DirectoryString,
  525   e.g. UTF8String, without regard to the case (upper or lower) of the
  526   strings and insignificant spaces (3.6.1).  The rule is identical to the
 527   caseExactOrderingMatch rule except upper case characters are folded to
 528   lower case during string preparation as discussed in 3.2.
 529
 530       caseIgnoreOrderingMatch MATCHING-RULE ::= {
 531           SYNTAX DirectoryString {ub-match}
 532           ID id-mr-caseIgnoreOrderingMatch }
 533
 534   Both rules return TRUE if the attribute value is "less" or appears
 535   earlier than the presented value, when the prepared strings are
 536   compared using Unicode code point collation order.
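
        In Python, the built-in str comparison already orders strings
        by code point, so the rule reduces to a single operator.  The
        function name is hypothetical.  The second half of the sketch
        shows why the choice of code point order matters: comparing
        UTF-16 code units instead would reverse the order of a
        high-BMP character and a supplementary-plane character.

```python
def ordering_match(prepared_attribute: str, prepared_presented: str) -> bool:
    # True when the attribute value sorts before the presented value.
    return prepared_attribute < prepared_presented

bmp_high = "\uff5e"          # FULLWIDTH TILDE, a BMP code point
supplementary = "\U00010000" # first code point beyond the BMP

# Code point order: U+FF5E sorts before U+10000.
print(ordering_match(bmp_high, supplementary))

# UTF-16 code unit order would disagree, because U+10000 is encoded
# with surrogates D800 DC00, which sort below FF5E.
print(bmp_high.encode("utf-16-be") < supplementary.encode("utf-16-be"))
```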
 537
 538
 539 4.3. Case Exact / Ignore Substrings Match
 540
 541   The Case Exact Substrings Match rule determines whether a presented
 542   value is a substring of an attribute value of type DirectoryString or
 543   one of the data types appearing in the choice type DirectoryString,
  544   e.g. UTF8String, without regard to insignificant spaces (3.6.1).
 545
 546       caseExactSubstringsMatch MATCHING-RULE ::= {
 547           SYNTAX SubstringAssertion
 548           ID id-mr-caseExactSubstringsMatch }
 549       SubstringAssertion ::= SEQUENCE OF CHOICE {
 550           initial [0] DirectoryString {ub-match},
 551           any [1] DirectoryString {ub-match},
 552           final [2] DirectoryString {ub-match},
  553           control Attribute }
 554       -- Used to specify interpretation of the following items
 555       -- at most one initial and one final component
 556
 557   The Case Ignore Substrings Match rule determines whether a presented
 558   value is a substring of an attribute value of type DirectoryString or
 567   one of the data types appearing in the choice type DirectoryString,
 568   e.g. UTF8String, without regard to the case (upper or lower) of the
  569   strings and insignificant spaces (3.6.1).  The rule is identical to
 570   the caseExactSubstringsMatch rule except upper case characters are
 571   folded to lower case during string preparation as discussed in 3.2.
 572
 573       caseIgnoreSubstringsMatch MATCHING-RULE ::= {
 574           SYNTAX SubstringAssertion
 575           ID id-mr-caseIgnoreSubstringsMatch }
 576
 577   Both rules return TRUE if there is a partitioning of the prepared
 578   attribute value (into portions) such that:
 579     - the specified substrings (initial, any, final) match different
 580       portions of the value in the order of the strings sequence.
 581     - initial, if present, matches the first portion of the value;
 582     - any, if present, matches some arbitrary portion of the value;
 583     - final, if present, matches the last portion of the value.
 584     - control is not used for the caseExactSubstringsMatch,
 585       caseIgnoreSubstringsMatch, telephoneNumberSubstringsMatch, or any
 586       other form of substring match for which only initial, any, or
 587       final elements are used in the matching algorithm; if a control
 588       element is encountered, it is ignored.  The control element is
 589       only used for matching rules that explicitly specify its use in
 590       the matching algorithm. Such a matching rule may also redefine the
 591       semantics of the initial, any and final substrings.
 592         NOTE - The generalWordMatch matching rule is an example of such
 593                a matching rule.
 594
 595   There shall be at most one initial, and at most one final in the
 596   SubstringAssertion.  If initial is present, it shall be the first
 597   element.  If final is present, it shall be the last element. There
 598   shall be zero or more any elements.
 599
 600   For a component of substrings to match a portion of the attribute
 601   value, corresponding characters must be identical (including all
 602   combining characters in the combining character sequences).
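
        The partitioning described above can be sketched in Python as a
        left-to-right scan.  The function and parameter names are
        hypothetical, the inputs are assumed to be already prepared,
        and control elements are assumed to have been discarded by the
        caller.  Taking the leftmost occurrence of each "any" component
        is a design choice that leaves the most room for later
        components.

```python
from typing import Optional, Sequence

def substrings_match(value: str,
                     initial: Optional[str] = None,
                     any_: Sequence[str] = (),
                     final: Optional[str] = None) -> bool:
    pos = 0
    end = len(value)
    if initial is not None:
        if not value.startswith(initial):
            return False
        pos = len(initial)
    if final is not None:
        # The final portion must not overlap the initial portion.
        if end - pos < len(final) or not value.endswith(final):
            return False
        end -= len(final)
    for part in any_:
        # Each component must lie wholly within value[pos:end], in order.
        idx = value.find(part, pos, end)
        if idx < 0:
            return False
        pos = idx + len(part)
    return True

print(substrings_match("foo bar baz", initial="foo", any_=["bar"], final="baz"))
print(substrings_match("foo bar baz", any_=["baz", "bar"]))
```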
 603
 604
 605 4.4. Numeric String Match
 606
 607   The Numeric String Match rule compares for equality a presented
 608   numeric string with an attribute value of type NumericString.
 609
 610       numericStringMatch MATCHING-RULE ::= {
 611           SYNTAX NumericString
 612           ID id-mr-numericStringMatch }
 613
 614   The rule is identical to the caseIgnoreMatch rule (case is irrelevant
 623   as characters are numeric) except that all space characters are
 624   removed during string preparation as detailed in Section 3.6.2.
 625
 626
 627 4.5. Numeric String Ordering Match
 628
 629   The Numeric String Ordering Match rule compares the collation order of
 630   a presented string with an attribute value of type NumericString.
 631
 632       numericStringOrderingMatch MATCHING-RULE ::= {
 633           SYNTAX NumericString
 634           ID id-mr-numericStringOrderingMatch }
 635
 636   The rule is identical to the caseIgnoreOrderingMatch rule (case is
 637   irrelevant as characters are numeric) except that all space characters
  638   are removed during string preparation as detailed in Section 3.6.2.
 639
 640
 641 4.6. Numeric String Substrings Match
 642
 643   The Numeric String Substrings Match rule determines whether a
 644   presented value is a substring of an attribute value of type
 645   NumericString.
 646
 647       numericStringSubstringsMatch MATCHING-RULE ::= {
 648           SYNTAX SubstringAssertion
 649           ID id-mr-numericStringSubstringsMatch }
 650
 651   The rule is identical to the caseIgnoreSubstringsMatch rule (case is
 652   irrelevant as characters are numeric) except that all space characters
  653   are removed during string preparation as detailed in Section 3.6.2.
 654
 655
 656 4.7. Case Ignore List Match
 657
 658   The Case Ignore List Match rule compares for equality a presented
 659   sequence of strings with an attribute value which is a sequence of
 660   DirectoryStrings, without regard to the case (upper or lower) of the
 661   strings and insignificant spaces (3.6.1).
 662
 663       caseIgnoreListMatch MATCHING-RULE ::= {
 664           SYNTAX CaseIgnoreList
 665           ID id-mr-caseIgnoreListMatch }
 666       CaseIgnoreList ::= SEQUENCE OF DirectoryString {ub-match}
 667
 668   The rule returns TRUE if and only if the number of strings in each is
 669   the same, and corresponding strings match. The latter matching is as
 670   for the caseIgnoreMatch matching rule.
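
        As a small Python sketch, assuming each element of both
        sequences has already been prepared individually per Section 3
        (the function name is hypothetical):

```python
from typing import Sequence

def list_match(presented: Sequence[str], stored: Sequence[str]) -> bool:
    # Same number of strings, and each corresponding pair is equal.
    return len(presented) == len(stored) and all(
        p == s for p, s in zip(presented, stored)
    )

print(list_match(["10 high st", "dundee"], ["10 high st", "dundee"]))
print(list_match(["10 high st"], ["10 high st", "dundee"]))
```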
 671
 672
 679 4.8. Case Ignore List Substrings Match
 680
  681   The Case Ignore List Substrings Match rule compares a presented
  682   substring with an attribute value which is a sequence of
  683   DirectoryStrings, but without regard for the case (upper or lower) of
  684   the strings and insignificant spaces (3.6.1).
 685
 686       caseIgnoreListSubstringsMatch MATCHING-RULE ::= {
 687           SYNTAX SubstringAssertion
 688           ID id-mr-caseIgnoreListSubstringsMatch }
 689
 690   A presented value matches a stored value if and only if the presented
 691   value matches the string formed by concatenating the strings of the
 692   stored value. This matching is done according to the
 693   caseIgnoreSubstringsMatch rule; however, none of the initial, any, or
 694   final values of the presented value are considered to match a
 695   substring of the concatenated string which spans more than one of the
 696   strings of the stored value.
 697
 698
 699 4.9. Stored Prefix Match
 700
 701   The Stored Prefix Match rule determines whether an attribute value,
 702   whose syntax is DirectoryString, is a prefix (i.e.  initial substring)
 703   of the presented value, without regard to the case (upper or lower) of
  704   the strings and insignificant spaces (3.6.1).
 705
 706              NOTE - It can be used, for example, to compare values in
 707              the Directory which are telephone area codes with a value
 708              which is a purported telephone number.
 709
 710       storedPrefixMatch MATCHING-RULE ::= {
 711           SYNTAX DirectoryString {ub-match}
 712           ID id-mr-storedPrefixMatch }
 713
 714   The rule returns TRUE if the attribute value is an initial substring
 715   of the presented value with corresponding characters identical except
 716   with regard to case.
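
        Note that the roles are reversed relative to an ordinary
        prefix search: the stored value is the prefix and the presented
        value is the longer string.  A Python sketch, with a
        hypothetical function name and made-up sample values, assuming
        both inputs are already prepared (and therefore case folded):

```python
def stored_prefix_match(prepared_attribute: str, prepared_presented: str) -> bool:
    # The stored (attribute) value must be an initial substring of the
    # presented value.
    return prepared_presented.startswith(prepared_attribute)

# A stored area code compared against a purported telephone number.
print(stored_prefix_match("0131", "01315551234"))
print(stored_prefix_match("0141", "01315551234"))
```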
 717
 718
 719 5. Other changes to X.520
 720
 721   This document makes the following changes to X.520:
 722
 723   The section 6.2.8 (Telephone Number Match) sentence:
 724       The rules for matching are identical to those for caseIgnoreMatch,
 725       except that all space and "-" characters are skipped during the
 726       comparison.
 727
 735   is replaced with:
 736       The rules for matching are identical to those for caseIgnoreMatch,
  737       except that all hyphens and spaces are insignificant (3.6.3) and
 738       removed during the insignificant character removal step.
 739
 740   The section 6.2.9 (Telephone Number Substrings Match) sentence:
 741       The rules for matching are identical to those for
 742       caseExactSubstringsMatch, except that all space and "-" characters
 743       are skipped during the comparison.
 744
 745   is replaced with:
 746       The rules for matching are identical to those for
 747       caseExactSubstringsMatch, except that all hyphens and spaces are
  748       insignificant (3.6.3) and removed during the insignificant
 749       character removal step.
 750
 751
 752 6. Security Considerations
 753
 754   See [RFC3454].
 755
 756
 757 7. Acknowledgments
 758
 759   The approach used in this document is based upon design principles and
 760   algorithms described in "Preparation of Internationalized Strings
 761   ('stringprep')" [RFC3454] by Paul Hoffman and Marc Blanchet.  Some
 762   additional guidance was drawn from Unicode Technical Standards,
 763   Technical Reports, and Notes.
 764
 765   Sections 3.3 and 4 of this document are derived from Section 6.1 of
 766   [X.520].   Additionally, some text was borrowed from [RFC3454].
 767
 768   This document is the product of IETF and ITU-T collaboration [IETF-
 769   ITU].
 770
 771
 772 8. Editor's Address
 773
 774   Kurt Zeilenga
 775   E-mail: <kurt@openldap.org>
 776
 777
 778 9. References
 779
 780 9.1. Normative References
 781
 782   [RFC2119]  S. Bradner, "Key words for use in RFCs to Indicate
 791              Requirement Levels", BCP 14 (also RFC 2119), March 1997.
 792
 793   [RFC3454]  P. Hoffman, M. Blanchet, "Preparation of Internationalized
 794              Strings ('stringprep')", RFC 3454, December 2002.
 795
 796   [X.501]    International Telecommunication Union, "The Directory: The
 797              Models", X.501, 2000.
 798
 799   [X.520]    International Telecommunication Union, "The Directory:
 800              Selected Attribute Types", X.520, 2000.
 801
 802   [ISO10646]   Universal Multiple-Octet Coded Character Set (UCS) -
 803              Architecture and Basic Multilingual Plane, ISO/IEC 10646-1
 804              : 1993.
 805
 806   [UNICODE]  The Unicode Consortium, "The Unicode Standard, Version
 807              3.2.0" is defined by "The Unicode Standard, Version 3.0"
 808              (Reading, MA, Addison-Wesley, 2000. ISBN 0-201-61633-5), as
 809              amended by the "Unicode Standard Annex #27: Unicode 3.1"
 810              (http://www.unicode.org/reports/tr27/) and by the "Unicode
 811              Standard Annex #28: Unicode 3.2"
 812              (http://www.unicode.org/reports/tr28/).
 813
 814   [UAX15]    M. Davis, M. Duerst, "Unicode Standard Annex #15: Unicode
 815              Normalization Forms, Version 3.2.0".
 816              <http://www.unicode.org/unicode/reports/tr15/tr15-22.html>,
 817              March 2002.
 818
 819   [T61-UCS]  TBD
 820
 821
 822 9.2. Informative References
 823
 824   [X.500]    International Telecommunication Union, "The Directory:
 825              Overview of Concepts, Models and Service", X.500, 2000.
 826
 827   [IETF-ITU] G. Fishman, S. Bradner, "Internet Engineering Task Force
 828              and International Telecommunication Union -
 829              Telecommunications Standardization Sector Collaboration
 830              Guidelines", TSAG A-Series Supplement 3, November 2001
 831              (also RFC 3356, published August 2002).
 832
 833   [GLOSSARY] The Unicode Consortium, "Unicode Glossary",
 834              <http://www.unicode.org/glossary/>.
 835
 836   [UTR17]    K. Whistler, M. Davis, "Unicode Technical Report #17,
 837              Character Encoding Model", UTR17,
 838              <http://www.unicode.org/unicode/reports/tr17/>, August
 847              2000.
 848
 849
 850
 851 Copyright 2003, The Internet Society.  All Rights Reserved.
 852
 853   This document and translations of it may be copied and furnished to
 854   others, and derivative works that comment on or otherwise explain it
 855   or assist in its implementation may be prepared, copied, published and
 856   distributed, in whole or in part, without restriction of any kind,
 857   provided that the above copyright notice and this paragraph are
 858   included on all such copies and derivative works.  However, this
 859   document itself may not be modified in any way, such as by removing
 860   the copyright notice or references to the Internet Society or other
 861   Internet organizations, except as needed for the  purpose of
 862   developing Internet standards in which case the procedures for
 863   copyrights defined in the Internet Standards process must be followed,
 864   or as required to translate it into languages other than English.
 865
 866   The limited permissions granted above are perpetual and will not be
 867   revoked by the Internet Society or its successors or assigns.
 868
 869   This document and the information contained herein is provided on an
 870   "AS IS" basis and THE AUTHORS, THE INTERNET SOCIETY, AND THE INTERNET
 871   ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED,
 872   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
 873   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
 874   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
 875