From cad1d95541375d45ed1dd14e1c5685fce88a09e4 Mon Sep 17 00:00:00 2001
From: Simon Josefsson <simon@josefsson.org>
Date: Wed, 28 May 2003 13:40:20 +0000
Subject: [PATCH] Add.

---
 draft-ietf-ldapbis-strprep-00.txt | 955 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 955 insertions(+)
 create mode 100644 draft-ietf-ldapbis-strprep-00.txt
diff --git a/draft-ietf-ldapbis-strprep-00.txt b/draft-ietf-ldapbis-strprep-00.txt
new file mode 100644
index 0000000..211bbd6
--- /dev/null
+++ b/draft-ietf-ldapbis-strprep-00.txt
@@ -0,0 +1,955 @@
+
+
+
+
+
+
+Internet-Draft                                      Kurt D. Zeilenga
+Intended Category: Standard Track                OpenLDAP Foundation
+Expires in six months                                    26 May 2003
+
+
+
+                LDAP: Internationalized String Preparation
+                   <draft-ietf-ldapbis-strprep-00.txt>
+
+
+Status of this Memo
+
+  This document is an Internet-Draft and is in full conformance with all
+  provisions of Section 10 of RFC2026.
+
+  Distribution of this memo is unlimited.  Technical discussion of this
+  document will take place on the IETF LDAP Revision Working Group
+  mailing list <ietf-ldapbis@openldap.org>.  Please send editorial
+  comments directly to the author <Kurt@OpenLDAP.org>.
+
+  Internet-Drafts are working documents of the Internet Engineering Task
+  Force (IETF), its areas, and its working groups.  Note that other
+  groups may also distribute working documents as Internet-Drafts.
+  Internet-Drafts are draft documents valid for a maximum of six months
+  and may be updated, replaced, or obsoleted by other documents at any
+  time.  It is inappropriate to use Internet-Drafts as reference
+  material or to cite them other than as ``work in progress.''
+
+  The list of current Internet-Drafts can be accessed at
+  <http://www.ietf.org/ietf/1id-abstracts.txt>. The list of
+  Internet-Draft Shadow Directories can be accessed at
+  <http://www.ietf.org/shadow.html>.
+
+  Copyright (C) The Internet Society (2003).  All Rights Reserved.
+
+  Please see the Full Copyright section near the end of this document
+  for more information.
+
+
+Abstract
+
+  The previous Lightweight Directory Access Protocol (LDAP) technical
+  specifications did not precisely define how string matching is to be
+  performed.  This lead to a number of usability and interoperability
+  problems.  This document defines string preparation algorithms for
+  matching rules defined for use in LDAP.
+
+
+
+
+
+Zeilenga                        LDAPprep                        [Page 1]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+Conventions
+
+  The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+  "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+  document are to be interpreted as described in BCP 14 [RFC2119].
+
+  Character names in this document use the notation for code points and
+  names from the Unicode Standard [Unicode].  For example, the letter
+  "a" may be represented as either <U+0061> or <LATIN SMALL LETTER A>.
+  In the lists of mappings and the prohibited characters, the "U+" is
+  left off to make the lists easier to read.  The comments for character
+  ranges are shown in square brackets (such as "[CONTROL CHARACTERS]")
+  and do not come from the standard.
+
+  Note: a glossary of terms used in Unicode can be found in [Glossary].
+  Information on the Unicode character encoding model can be found in
+  [CharModel].
+
+
+1. Introduction
+
+1.1. Background
+
+  A Lightweight Directory Access Protocol (LDAP) [Roadmap] matching rule
+  [Syntaxes] defines an algorithm for determining whether a presented
+  value matches an attribute value in accordance with the criteria
+  defined for the rule.  The proposition may be evaluated to True,
+  False, or Undefined.
+
+      True      - the attribute contains a matching value,
+
+      False     - the attribute contains no matching value,
+
+      Undefined - it cannot be determined whether the attribute contains
+                  a matching value or not.
+
+  For instance, the caseIgnoreMatch matching rule may be used to compare
+  whether the commonName attribute contains a particular value without
+  regard for case and insignificant spaces.
+
+
+1.2. X.500 String Matching Rules
+
+  "X.520: Selected attribute types" [X.520] provides (amongst other
+  things) value syntaxes and matching rules for comparing values
+  commonly used in the Directory.  These specifications are inadequate
+  for strings composed of characters from the Universal Character Set
+  (UCS) [ISO10646], a superset of Unicode [Unicode].
+
+
+
+Zeilenga                        LDAPprep                        [Page 2]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+  The caseIgnoreMatch matching rule [X.520], for example, is simply
+  defined as being a case insensitive comparison where insignificant
+  spaces are ignored.  For printableString, there is only one space
+  character and case mapping is bijective, hence this definition is
+  sufficient.  However, for UCS-based string types such as
+  universalString, this is not sufficient.  For example, a case
+  insensitive matching implementation which folded lower case characters
+  to upper case would yield different different results than an
+  implementation which used upper case to lower case folding.  Or one
+  implementation may view space as referring to only SPACE (U+0020), a
+  second implementation may view any character with the space separator
+  (Zs) property as a space, and another implementation may view any
+  character with the whitespace (WS) category as a space.
+
+  The lack of precise specification for string matching has led to
+  significant interoperability problems.  When used in certificate chain
+  validation, security vulnerabilities can arise.  To address these
+  problems, this document defines precise algorithms for preparing
+  strings for matching.
+
+
+1.3. Relationship to "stringprep"
+
+  The string preparation algorithms described in this document are based
+  upon the "stringprep" approach [RFC3454].  In "stringprep", presented
+  and stored values are first prepared for comparison and so that a
+  character-by-character comparison yields the "correct" result.
+
+  The approach used here is a refinement of the "stringprep" [RFC3454]
+  approach.  Each algorithm involves two additional preparation steps.
+
+  a) prior to applying the Unicode string preparation steps outlined in
+     "stringprep", the string is transcoded to Unicode;
+
+  b) after applying the Unicode string preparation steps outlined in
+     "stringprep", characters insignificant to the matching rules are
+     removed.
+
+  Hence, preparation of strings for X.500 matching involves the
+  following steps:
+
+      1) Transcode
+      2) Map
+      3) Normalize
+      4) Prohibit
+      5) Check Bidi (Bidirectional)
+      6) Insignificant Character Removal
+
+
+
+
+Zeilenga                        LDAPprep                        [Page 3]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+  These steps are described in Section 2.
+
+
+1.4. Relationship to the LDAP Technical Specification
+
+  This document is a integral part of the LDAP technical specification
+  [Roadmap] which obsoletes the previously defined LDAP technical
+  specification [RFC3377] in its entirety.
+
+  This document details new LDAP internationalized string preparation
+  algorithms used by [Syntaxes] and possible other technical
+  specifications defining LDAP syntaxes and/or matching rules.
+
+
+1.5. Relationship to X.500
+
+  LDAP is defined [Roadmap] in X.500 terms as an X.500 access mechanism.
+  As such, there is a strong desire for alignment between LDAP and X.500
+  syntax and semantics.  The string preparation algorithms described in
+  this document are based upon "Internationalized String Matching Rules
+  for X.500" [XMATCH] proposal to ITU/ISO Joint Study Group 2.
+
+
+2. String Preparation
+
+  The following six-step process SHALL be applied to each presented and
+  attribute value in preparation for string match rule evaluation.
+
+      1) Transcode
+      2) Map
+      3) Normalize
+      4) Prohibit
+      5) Check bidi
+      6) Insignificant Character Removal
+
+  Failure in any step is be cause the assertion to be Undefined.
+
+  The character repertoire of this process is Unicode 3.2 [Unicode].
+
+
+2.1. Transcode
+
+  Each non-Unicode string value is transcoded to Unicode.
+
+  TeletexString [X.680][T.61] values are transcoded to Unicode as
+  described in Appendix A.
+
+  PrintableString [X.680] value are transcoded directly to Unicode.
+
+
+
+Zeilenga                        LDAPprep                        [Page 4]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+  UniversalString, UTF8String, and bmpString [X.680] values need not be
+  transcoded as they are Unicode-based strings (in the case of
+  bmpString, a subset of Unicode).
+
+  If the implementation is unable or unwilling to perform the
+  transcoding as described above, or the transcoding fails, this step
+  fails and the assertion is evaluated to Undefined.
+
+  The transcoded string is the output string.
+
+
+2.2. Map
+
+  SOFT HYPHEN (U+00AD) and MONGOLIAN TODO SOFT HYPHEN (U+1806) code
+  points are mapped to nothing.  COMBINING GRAPHEME JOINER (U+034F) and
+  VARIATION SELECTORs (U+180B-180D,FF00-FE0F) code points are also
+  mapped to nothing.  The OBJECT REPLACEMENT CHARACTER (U+FFFC) is
+  mapped to nothing.
+
+  CHARACTER TABULATION (U+0009), LINE FEED (LF) (U+000A), LINE
+  TABULATION (U+000B), FORM FEED (FF) (U+000C), CARRIAGE RETURN (CR)
+  (U+000D), and NEXT LINE (NEL) (U+0085) are mapped to SPACE (U+0020).
+
+  All other control code points (e.g., Cc) or code points with a control
+  function (e.g., Cf) are mapped to nothing.
+
+  ZERO WIDTH SPACE (U+200B) is mapped to nothing.  All other code points
+  with Separator (space, line, or paragraph) property (e.g, Zs, Zl, or
+  Zp) are mapped to SPACE (U+0020).
+
+  For case ignore, numeric, and stored prefix string matching rules,
+  characters are case folded per B.2 of [RFC3454].
+
+
+2.3. Normalize
+
+  The input string is be normalized to Unicode Form KC (compatibility
+  composed) as described in [UAX15].
+
+
+2.4. Prohibit
+
+  All Unassigned, Private Use, and non-character code points are
+  prohibited.  Surrogate codes (U+D800-DFFFF) are prohibited.
+
+  The REPLACEMENT CHARACTER (U+FFFD) code point is prohibited.
+
+  The first code point of a string is prohibited from being a combining
+
+
+
+Zeilenga                        LDAPprep                        [Page 5]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+  character.
+
+  Empty strings are prohibited.
+
+  The step fails and the assertion is evaluated to Undefined if the
+  input string contains any prohibited code point.  The output string is
+  the input string.
+
+
+2.5. Check bidi
+
+  There are no bidirectional restrictions.  The output string is the
+  input string.
+
+
+2.5. Insignificant Character Removal
+
+  In this step, characters insignificant to the matching rule are to be
+  removed.  The characters to be removed differ from matching rule to
+  matching rule.
+
+  Section 2.6.1 applies to case ignore and exact string matching.
+  Section 2.6.2 applies to numericString matching.
+  Section 2.6.3 applies to telephoneNumber matching
+
+
+2.6.1. Insignificant Space Removal
+
+  For the purposes of this section, a space is defined to be the SPACE
+  (U+0020) code point followed by no combining marks.
+
+  NOTE - The previous steps ensure that the string cannot contain any
+         code points in the separator class, other than SPACE (U+0020).
+
+  The following spaces are regarded as not significant and are to be
+  removed:
+    - leading spaces (i.e. those preceding the first character that is
+      not a space);
+    - trailing spaces (i.e. those following the last character that is
+      not a space);
+    - multiple consecutive spaces (these are taken as equivalent to a
+      single space character).
+
+  A string consisting entirely of spaces is equivalent to a string
+  containing exactly one space.
+
+  For example, removal of spaces from the Form KC string:
+      "<SPACE><SPACE>foo<SPACE><SPACE>bar<SPACE><SPACE>"
+
+
+
+Zeilenga                        LDAPprep                        [Page 6]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+  would result in the output string:
+      "foo<SPACE>bar"
+  and the Form KC string:
+      "<SPACE><SPACE><SPACE>"
+  would result in the output string:
+      "<SPACE>".
+
+
+2.6.2. numericString Insignificant Character Removal
+
+  For the purposes of this section, a space is defined to be the SPACE
+  (U+0020) code point followed by no combining marks.
+
+  All spaces are regarded as not significant and are to be removed.
+
+  For example, removal of spaces from the Form KC string:
+      "<SPACE><SPACE>123<SPACE><SPACE>456<SPACE><SPACE>" would result in
+  the output string:
+      "123456"
+  and the Form KC string:
+      "<SPACE><SPACE><SPACE>"
+  would result in an empty output string.
+
+
+2.6.3. telephoneNumber Insignificant Character Removal
+
+  For the purposes of this section, a hyphen is defined to be
+  HYPHEN-MINUS (U+002D), ARMENIAN HYPHEN (U+058A), HYPHEN (U+2010),
+  NON-BREAKING HYPHEN (U+2011), MINUS SIGN (U+2212), SMALL HYPHEN-MINUS
+  (U+FE63), or FULLWIDTH HYPHEN-MINUS (U+FF0D) code point followed by no
+  combining marks and a space is defined to be the SPACE (U+0020) code
+  point followed by no combining marks.
+
+  All hyphens and spaces are regarded as not significant and are to be
+  removed.
+
+
+3. Security Considerations
+
+  "Preparation for International Strings ('stringprep')" [RFC3454]
+  security considerations generally apply to the algorithms described
+  here.
+
+
+4. Contributors
+
+  Appendix A and B of this document were authored by Howard Chu
+  <hyc@symas.com> of Symas Corporation (based upon information provided
+
+
+
+Zeilenga                        LDAPprep                        [Page 7]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+  in RFC 1345).
+
+
+5. Acknowledgments
+
+  The approach used in this document is based upon design principles and
+  algorithms described in "Preparation of Internationalized Strings
+  ('stringprep')" [RFC3454] by Paul Hoffman and Marc Blanchet.  Some
+  additional guidance was drawn from Unicode Technical Standards,
+  Technical Reports, and Notes.
+
+
+6. Author's Address
+
+  Kurt Zeilenga
+  E-mail: <kurt@openldap.org>
+
+
+7. References
+
+7.1. Normative References
+
+  [RFC2119]     Bradner, S., "Key words for use in RFCs to Indicate
+                Requirement Levels", BCP 14 (also RFC 2119), March 1997.
+
+  [RFC3454]     Hoffman, P. and M. Blanchet, "Preparation of
+                Internationalized Strings ('stringprep')", RFC 3454,
+                December 2002.
+
+  [Roadmap]     Zeilenga, K. (editor), "LDAP: Technical Specification
+                Road Map", draft-ietf-ldapbis-roadmap-xx.txt, a work in
+                progress.
+
+  [Syntaxes]    Legg, S. (editor), "LDAP: Syntaxes and Matching Rules",
+                draft-ietf-ldapbis-syntaxes-xx.txt, a work in progress.
+
+  [ISO10646]    International Organization for Standardization,
+                "Universal Multiple-Octet Coded Character Set (UCS) -
+                Architecture and Basic Multilingual Plane", ISO/IEC
+                10646-1 : 1993.
+
+  [Unicode]     The Unicode Consortium, "The Unicode Standard, Version
+                3.2.0" is defined by "The Unicode Standard, Version 3.0"
+                (Reading, MA, Addison-Wesley, 2000. ISBN 0-201-61633-5),
+                as amended by the "Unicode Standard Annex #27: Unicode
+                3.1" (http://www.unicode.org/reports/tr27/) and by the
+                "Unicode Standard Annex #28: Unicode 3.2"
+                (http://www.unicode.org/reports/tr28/).
+
+
+
+Zeilenga                        LDAPprep                        [Page 8]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+  [UAX15]       Davis, M. and M. Duerst, "Unicode Standard Annex #15:
+                Unicode Normalization Forms, Version 3.2.0".
+                <http://www.unicode.org/unicode/reports/tr15/tr15-22.html>,
+                March 2002.
+
+  [X.680]       International Telecommunication Union -
+                Telecommunication Standardization Sector, "Abstract
+                Syntax Notation One (ASN.1) - Specification of Basic
+                Notation", X.680(1997) (also ISO/IEC 8824-1:1998).
+
+  [T.61]        CCITT (now ITU), "Character Repertoire and Coded
+                Character Sets for the International Teletex Service",
+                T.61, 1988.
+
+7.2. Informative References
+
+  [X.500]       International Telecommunication Union -
+                Telecommunication Standardization Sector, "The Directory
+                -- Overview of concepts, models and services,"
+                X.500(1993) (also ISO/IEC 9594-1:1994).
+
+  [X.501]       International Telecommunication Union -
+                Telecommunication Standardization Sector, "The Directory
+                -- Models," X.501(1993) (also ISO/IEC 9594-2:1994).
+
+  [X.520]       International Telecommunication Union -
+                Telecommunication Standardization Sector, "The
+                Directory: Selected Attribute Types", X.520(1993) (also
+                ISO/IEC 9594-6:1994).
+
+  [Glossary]    The Unicode Consortium, "Unicode Glossary",
+                <http://www.unicode.org/glossary/>.
+
+  [CharModel]   Whistler, K. and M. Davis, "Unicode Technical Report
+                #17, Character Encoding Model", UTR17,
+                <http://www.unicode.org/unicode/reports/tr17/>, August
+                2000.
+
+  [XMATCH]      Zeilenga, K., "Internationalized String Matching Rules
+                for X.500", draft-zeilenga-ldapbis-strmatch-xx.txt a
+                work in progress.
+
+  [RFC1345]     Simonsen, K., "Character Mnemonics & Character Sets",
+                RFC 1345, June 1992.
+
+
+Appendix A. Teletex (T.61) to Unicode
+
+
+
+
+Zeilenga                        LDAPprep                        [Page 9]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+  This appendix defines an algorithm for transcoding [T.61] characters
+  to [Unicode] characters for use in string preparation for LDAP
+  matching rules.  This appendix is a normative.
+
+  The transcoding algorithm is derived from the T.61-8bit definition
+  provided in [RFC1345].  With a few exceptions, the T.61 character
+  codes from x00 to x7f are equivalent to the corresponding [Unicode]
+  code points, and their values are left unchanged by this algorithm.
+  E.g. the T.61 code x20 is identical to (U+0020).  The exceptions are
+  for these T.61 codes that are undefined: x23, x24, x5c, x5e, x60, x7b,
+  x7d, and x7e.
+
+  The codes from x80 to x9f are also equivalent to the corresponding
+  Unicode code points.  This is specified for completeness only, as
+  these codes are control characters, and will be mapped to nothing in
+  the LDAP String Preparation Mapping step.
+
+  The remaining T.61 codes are mapped below in Table A.1.  Table
+  positions marked "??" are undefined.
+
+  Input strings containing undefined T.61 codes SHALL produce an
+  Undefined matching result. For diagnostic purposes, this algorithm
+  does not fail for undefined input codes.  Instead, undefined codes in
+  the input are mapped to the Unicode REPLACEMENT CHARACTER (U+FFFD).
+  As the LDAP String Preparation Probhibit step disallows the
+  REPLACEMENT CHARACTER from appearing in its output, this transcoding
+  yields the desired effect.
+
+  Note: RFC 1345 listed the non-spacing accent codepoints as residing in
+        the range starting at (U+E000).  In the current Unicode
+        standard, the (U+E000) range is reserved for Private Use, and
+        the non-spacing accents are in the range starting at (U+0300).
+        The tables here use the (U+0300) range for these accents.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   a0| 00a0 | 00a1 | 00a2 | 00a3 | 0024 | 00a5 | 0023 | 00a7 |
+   a8| 00a8 |  ??  |  ??  | 00ab |  ??  |  ??  |  ??  |  ??  |
+   b0| 00b0 | 00b1 | 00b2 | 00b3 | 00d7 | 00b5 | 00b6 | 00b7 |
+   b8| 00f7 |  ??  |  ??  | 00bb | 00bc | 00bd | 00be | 00bf |
+   c0|  ??  | 0300 | 0301 | 0302 | 0303 | 0304 | 0306 | 0307 |
+   c8| 0308 |  ??  | 030a | 0327 | 0332 | 030b | 0328 | 030c |
+   d0|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   d8|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   e0| 2126 | 00c6 | 00d0 | 00aa |  ??  | 0126 | 0132 | 013f |
+   e8| 0141 | 00d8 | 0152 | 00ba | 00de | 0166 | 014a | 0149 |
+   f0| 0138 | 00e6 | 0111 | 00f0 | 0127 | 0131 | 0133 | 0140 |
+   f8| 0142 | 00f8 | 0153 | 00df | 00fe | 0167 | 014b |  ??  |
+
+
+
+Zeilenga                        LDAPprep                       [Page 10]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+   --+------+------+------+------+------+------+------+------+
+            Table A.1:  Mapping of 8-bit T.61 codes to Unicode
+
+  T.61 also defines a number of accented characters that are formed by
+  combining an accent prefix followed by a base character.  These
+  prefixes are in the code range xc1 to xcf. If a prefix character
+  appears at the end of a string, the result is undefined.  Otherwise
+  these sequences are mapped to Unicode by substituting the
+  corresponding non-spacing accent code (as listed in Table A.1) for the
+  accent prefix, and exchanging the order so that the base character
+  precedes the accent.
+
+
+Appendix B. Additional Teletex (T.61) to Unicode Tables
+
+  All of the accented characters in T.61 have a corresponding code point
+  in Unicode.  For the sake of completeness, the combined character
+  codes are presented in the following tables.  This is informational
+  only; for matching purposes it is sufficient to map the non-spacing
+  accent and exchange the order of the character pair as specified in
+  Appendix A.
+
+
+B.1. Combinations with SPACE
+
+  Accents may be combined with a <SPACE> to generate the accent by
+  itself.  For each accent code, the result of combining with <SPACE> is
+  listed in Table B.1.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   c0|  ??  | 0060 | 00b4 | 005e | 007e | 00af | 02d8 | 02d9 |
+   c8| 00a8 |  ??  | 02da | 00b8 |  ??  | 02dd | 02db | 02c7 |
+   --+------+------+------+------+------+------+------+------+
+       Table B.1:  Mapping of T.61 Accents with <SPACE> to Unicode
+
+
+B.2. Combinations for xc1: (Grave accent)
+
+  T.61 has predefined characters for combinations with A, E, I, O, and
+  U.  Unicode also defines combinations for N, W, and Y.  All of these
+  combinations are present in Table B.2.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  | 00c0 |  ??  |  ??  |  ??  | 00c8 |  ??  |  ??  |
+   48|  ??  | 00cc |  ??  |  ??  |  ??  |  ??  | 01f8 | 00d2 |
+   50|  ??  |  ??  |  ??  |  ??  |  ??  | 00d9 |  ??  | 1e80 |
+
+
+
+Zeilenga                        LDAPprep                       [Page 11]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+   58|  ??  | 1ef2 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  | 00e0 |  ??  |  ??  |  ??  | 00e8 |  ??  |  ??  |
+   68|  ??  | 00ec |  ??  |  ??  |  ??  |  ??  | 01f9 | 00f2 |
+   70|  ??  |  ??  |  ??  |  ??  |  ??  | 00f9 |  ??  | 1e81 |
+   78|  ??  | 1ef3 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+           Table B.2: Mapping of T.61 Grave Accent Combinations
+
+
+B.3. Combinations for xc2: (Acute accent)
+
+  T.61 has predefined characters for combinations with A, E, I, O, U, Y,
+  C, L, N, R, S, and Z.  Unicode also defines G, K, M, P, and W.  All of
+  these combinations are present in Table B.3.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  | 00c1 |  ??  | 0106 |  ??  | 00c9 |  ??  | 01f4 |
+   48|  ??  | 00cd |  ??  | 1e30 | 0139 | 1e3e | 0143 | 00d3 |
+   50| 1e54 |  ??  | 0154 | 015a |  ??  | 00da |  ??  | 1e82 |
+   58|  ??  | 00dd | 0179 |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  | 00e1 |  ??  | 0107 |  ??  | 00e9 |  ??  | 01f5 |
+   68|  ??  | 00ed |  ??  | 1e31 | 013a | 1e3f | 0144 | 00f3 |
+   70| 1e55 |  ??  | 0155 | 015b |  ??  | 00fa |  ??  | 1e83 |
+   78|  ??  | 00fd | 017a |  ??  |  ??  |  ??  |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+           Table B.3: Mapping of T.61 Acute Accent Combinations
+
+
+B.4. Combinations for xc3: (Circumflex)
+
+  T.61 has predefined characters for combinations with A, E, I, O, U, Y,
+  C, G, H, J, S, and W.  Unicode also defines the combination for Z.
+  All of these combinations are present in Table B.4.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  | 00c2 |  ??  | 0108 |  ??  | 00ca |  ??  | 011c |
+   48| 0124 | 00ce | 0134 |  ??  |  ??  |  ??  |  ??  | 00d4 |
+   50|  ??  |  ??  |  ??  | 015c |  ??  | 00db |  ??  | 0174 |
+   58|  ??  | 0176 | 1e90 |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  | 00e2 |  ??  | 0109 |  ??  | 00ea |  ??  | 011d |
+   68| 0125 | 00ee | 0135 |  ??  |  ??  |  ??  |  ??  | 00f4 |
+   70|  ??  |  ??  |  ??  | 015d |  ??  | 00fb |  ??  | 0175 |
+   78|  ??  | 0177 | 1e91 |  ??  |  ??  |  ??  |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+        Table B.4: Mapping of T.61 Circumflex Accent Combinations
+
+
+
+
+Zeilenga                        LDAPprep                       [Page 12]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+B.5. Combinations for xc4: (Tilde)
+
+  T.61 has predefined characters for combinations with A, I, O, U, and
+  N.  Unicode also defines E, V, and Y.  All of these combinations are
+  present in Table B.5.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  | 00c3 |  ??  |  ??  |  ??  | 1ebc |  ??  |  ??  |
+   48|  ??  | 0128 |  ??  |  ??  |  ??  |  ??  | 00d1 | 00d5 |
+   50|  ??  |  ??  |  ??  |  ??  |  ??  | 0168 | 1e7c |  ??  |
+   58|  ??  | 1ef8 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  | 00e3 |  ??  |  ??  |  ??  | 1ebd |  ??  |  ??  |
+   68|  ??  | 0129 |  ??  |  ??  |  ??  |  ??  | 00f1 | 00f5 |
+   70|  ??  |  ??  |  ??  |  ??  |  ??  | 0169 | 1e7d |  ??  |
+   78|  ??  | 1ef9 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+           Table B.5: Mapping of T.61 Tilde Accent Combinations
+
+
+B.6. Combinations for xc5: (Macron)
+
+  T.61 has predefined characters for combinations with A, E, I, O, and
+  U.  Unicode also defines Y, G, and AE.  All of these combinations are
+  present in Table B.6.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  | 0100 |  ??  |  ??  |  ??  | 0112 |  ??  | 1e20 |
+   48|  ??  | 012a |  ??  |  ??  |  ??  |  ??  |  ??  | 014c |
+   50|  ??  |  ??  |  ??  |  ??  |  ??  | 016a |  ??  |  ??  |
+   58|  ??  | 0232 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  | 0101 |  ??  |  ??  |  ??  | 0113 |  ??  | 1e21 |
+   68|  ??  | 012b |  ??  |  ??  |  ??  |  ??  |  ??  | 014d |
+   70|  ??  |  ??  |  ??  |  ??  |  ??  | 016b |  ??  |  ??  |
+   78|  ??  | 0233 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   e0|  ??  | 01e2 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   f0|  ??  | 01e3 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+          Table B.6: Mapping of T.61 Macron Accent Combinations
+
+
+B.7. Combinations for xc6: (Breve)
+
+  T.61 has predefined characters for combinations with A, U, and G.
+  Unicode also defines E, I, and O.  All of these combinations are
+  present in Table B.7.
+
+
+
+
+Zeilenga                        LDAPprep                       [Page 13]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  | 0102 |  ??  |  ??  |  ??  | 0114 |  ??  | 011e |
+   48|  ??  | 012c |  ??  |  ??  |  ??  |  ??  |  ??  | 014e |
+   50|  ??  |  ??  |  ??  |  ??  |  ??  | 016c |  ??  |  ??  |
+   58|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  | 0103 |  ??  |  ??  |  ??  | 0115 |  ??  | 011f |
+   68|  ??  | 012d |  ??  |  ??  |  ??  |  ??  | 00f1 | 014f |
+   70|  ??  |  ??  |  ??  |  ??  |  ??  | 016d |  ??  |  ??  |
+   78|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+           Table B.7: Mapping of T.61 Breve Accent Combinations
+
+
+B.8. Combinations for xc7: (Dot Above)
+
+  T.61 has predefined characters for C, E, G, I, and Z.  Unicode also
+  defines A, O, B, D, F, H, M, N, P, R, S, T, W, X, and Y.  All of these
+  combinations are present in Table B.8.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  | 0226 | 1e02 | 010a | 1e0a | 0116 | 1e1e | 0120 |
+   48| 1e22 | 0130 |  ??  |  ??  |  ??  | 1e40 | 1e44 | 022e |
+   50| 1e56 |  ??  | 1e58 | 1e60 | 1e6a |  ??  |  ??  | 1e86 |
+   58| 1e8a | 1e8e | 017b |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  | 0227 | 1e03 | 010b | 1e0b | 0117 | 1e1f | 0121 |
+   68| 1e23 |  ??  |  ??  |  ??  |  ??  | 1e41 | 1e45 | 022f |
+   70| 1e57 |  ??  | 1e59 | 1e61 | 1e6b |  ??  |  ??  | 1e87 |
+   78| 1e8b | 1e8f | 017c |  ??  |  ??  |  ??  |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+         Table B.8: Mapping of T.61 Dot Above Accent Combinations
+
+
+B.9. Combinations for xc8: (Diaeresis)
+
+  T.61 has predefined characters for A, E, I, O, U, and Y.  Unicode also
+  defines H, W, X, and t.  All of these combinations are present in
+  Table B.9.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  | 00c4 |  ??  |  ??  |  ??  | 00cb |  ??  |  ??  |
+   48| 1e26 | 00cf |  ??  |  ??  |  ??  |  ??  |  ??  | 00d6 |
+   50|  ??  |  ??  |  ??  |  ??  |  ??  | 00dc |  ??  | 1e84 |
+   58| 1e8c | 0178 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  | 00e4 |  ??  |  ??  |  ??  | 00eb |  ??  |  ??  |
+   68| 1e27 | 00ef |  ??  |  ??  |  ??  |  ??  |  ??  | 00f6 |
+
+
+
+Zeilenga                        LDAPprep                       [Page 14]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+   70|  ??  |  ??  |  ??  |  ??  | 1e97 | 00fc |  ??  | 1e85 |
+   78| 1e8d | 00ff |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+         Table B.8: Mapping of T.61 Diaeresis Accent Combinations
+
+
+B.10. Combinations for xca: (Ring Above)
+
+  T.61 has predefined characters for A, and U.  Unicode also defines w
+  and y.  All of these combinations are present in Table B.10.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  | 00c5 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   48|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   50|  ??  |  ??  |  ??  |  ??  |  ??  | 016e |  ??  |  ??  |
+   58|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  | 00e5 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   68|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   70|  ??  |  ??  |  ??  |  ??  |  ??  | 016f |  ??  | 1e98 |
+   78|  ??  | 1e99 |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+        Table B.10: Mapping of T.61 Ring Above Accent Combinations
+
+
+B.11. Combinations for xcb: (Cedilla)
+
+  T.61 has predefined characters for C, G, K, L, N, R, S, and T.
+  Unicode also defines E, D, and H.  All of these combinations are
+  present in Table B.11.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  |  ??  |  ??  | 00c7 | 1e10 | 0228 |  ??  | 0122 |
+   48| 1e28 |  ??  |  ??  | 0136 | 013b |  ??  | 0145 |  ??  |
+   50|  ??  |  ??  | 0156 | 015e | 0162 |  ??  |  ??  |  ??  |
+   58|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  |  ??  |  ??  | 00e7 | 1e11 | 0229 |  ??  | 0123 |
+   68| 1e29 |  ??  |  ??  | 0137 | 013c |  ??  | 0146 |  ??  |
+   70|  ??  |  ??  | 0157 | 015f | 0163 |  ??  |  ??  |  ??  |
+   78|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+         Table B.11: Mapping of T.61 Cedilla Accent Combinations
+
+
+B.12. Combinations for xcd: (Double Acute Accent)
+
+  T.61 has predefined characters for O, and U.  These combinations are
+
+
+
+Zeilenga                        LDAPprep                       [Page 15]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+  present in Table B.12.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   48|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  | 0150 |
+   50|  ??  |  ??  |  ??  |  ??  |  ??  | 0170 |  ??  |  ??  |
+   68|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  | 0151 |
+   70|  ??  |  ??  |  ??  |  ??  |  ??  | 0171 |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+       Table B.12: Mapping of T.61 Double Acute Accent Combinations
+
+
+B.13. Combinations for xce: (Ogonek)
+
+  T.61 has predefined characters for A, E, I, and U.  Unicode also
+  defines the combination for O.  All of these combinations are present
+  in Table B.13.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  | 0104 |  ??  |  ??  |  ??  | 0118 |  ??  |  ??  |
+   48|  ??  | 012e |  ??  |  ??  |  ??  |  ??  |  ??  | 01ea |
+   50|  ??  |  ??  |  ??  |  ??  |  ??  | 0172 |  ??  |  ??  |
+   58|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  | 0105 |  ??  |  ??  |  ??  | 0119 |  ??  |  ??  |
+   68|  ??  | 012f |  ??  |  ??  |  ??  |  ??  |  ??  | 01eb |
+   70|  ??  |  ??  |  ??  |  ??  |  ??  | 0173 |  ??  |  ??  |
+   78|  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |  ??  |
+   --+------+------+------+------+------+------+------+------+
+          Table B.13: Mapping of T.61 Ogonek Accent Combinations
+
+
+B.14. Combinations for xcf: (Caron)
+
+  T.61 has predefined characters for C, D, E, L, N, R, S, T, and Z.
+  Unicode also defines A, I, O, U, G, H, j,and K.  All of these
+  combinations are present in Table B.14.
+
+     |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |
+   --+------+------+------+------+------+------+------+------+
+   40|  ??  | 01cd |  ??  | 010c | 010e | 011a |  ??  | 01e6 |
+   48| 021e | 01cf |  ??  | 01e8 | 013d |  ??  | 0147 | 01d1 |
+   50|  ??  |  ??  | 0158 | 0160 | 0164 | 01d3 |  ??  |  ??  |
+   58|  ??  |  ??  | 017d |  ??  |  ??  |  ??  |  ??  |  ??  |
+   60|  ??  | 01ce |  ??  | 010d | 010f | 011b |  ??  | 01e7 |
+   68| 021f | 01d0 | 01f0 | 01e9 | 013e |  ??  | 0148 | 01d2 |
+   70|  ??  |  ??  | 0159 | 0161 | 0165 | 01d4 |  ??  |  ??  |
+   78|  ??  |  ??  | 017e |  ??  |  ??  |  ??  |  ??  |  ??  |
+
+
+
+Zeilenga                        LDAPprep                       [Page 16]
+
+Internet-Draft        draft-ietf-ldapbis-strprep-00          26 May 2003
+
+
+   --+------+------+------+------+------+------+------+------+
+          Table B.14: Mapping of T.61 Caron Accent Combinations
+
+
+
+
+Intellectual Property Rights
+
+  The IETF takes no position regarding the validity or scope of any
+  intellectual property or other rights that might be claimed to pertain
+  to the implementation or use of the technology described in this
+  document or the extent to which any license under such rights might or
+  might not be available; neither does it represent that it has made any
+  effort to identify any such rights.  Information on the IETF's
+  procedures with respect to rights in standards-track and
+  standards-related documentation can be found in BCP-11.  Copies of
+  claims of rights made available for publication and any assurances of
+  licenses to be made available, or the result of an attempt made to
+  obtain a general license or permission for the use of such proprietary
+  rights by implementors or users of this specification can be obtained
+  from the IETF Secretariat.
+
+  The IETF invites any interested party to bring to its attention any
+  copyrights, patents or patent applications, or other proprietary
+  rights which may cover technology that may be required to practice
+  this standard.  Please address the information to the IETF Executive
+  Director.
+
+
+
+Full Copyright
+
+  Copyright (C) The Internet Society (2003). All Rights Reserved.
+
+  This document and translations of it may be copied and furnished to
+  others, and derivative works that comment on or otherwise explain it
+  or assist in its implmentation may be prepared, copied, published and
+  distributed, in whole or in part, without restriction of any kind,
+  provided that the above copyright notice and this paragraph are
+  included on all such copies and derivative works.  However, this
+  document itself may not be modified in any way, such as by removing
+  the copyright notice or references to the Internet Society or other
+  Internet organizations, except as needed for the  purpose of
+  developing Internet standards in which case the procedures for
+  copyrights defined in the Internet Standards process must be followed,
+  or as required to translate it into languages other than English.
+
+
+
+
+
+Zeilenga                        LDAPprep                       [Page 17]
+
-- 
2.11.4.GIT