# $NetBSD: src/share/i18n/csmapper/GB/GB2312%UCS.src,v 1.2 2003/07/12 16:11:08 tshiozak Exp $
# $DragonFly: src/share/i18n/csmapper/GB/GB2312%UCS.src,v 1.1 2005/03/10 16:19:35 joerg Exp $
SRC_ZONE 0x21-0x7E / 0x21-0x7E / 8
# This mapping data is made from the mapping data provided by Unicode, Inc.
#
# Name: GB2312-80 to Unicode table (complete, hex format)
# Unicode version: 3.0
# Table format: Format A
# Date: 1999 October 8
#
# Copyright (c) 1991-1999 Unicode, Inc. All Rights reserved.
#
# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).
# No claims are made as to fitness for any particular purpose. No
# warranties of any kind are expressed or implied. The recipient
# agrees to determine applicability of information provided. If this
# file has been provided on optical media by Unicode, Inc., the sole
# remedy for any claim will be exchange of defective media within 90
# days of receipt.
#
# Unicode, Inc. hereby grants the right to freely use the information
# supplied in this file in the creation of products supporting the
# Unicode Standard, and to make copies of this file in any form for
# internal or external distribution as long as this notice remains
# attached.
# This table contains one set of mappings from GB2312-80 into Unicode.
# Note that these data are *possible* mappings only and may not be the
# same as those used by actual products, nor may they be the best suited
# for all uses. For more information on the mappings between various code
# pages incorporating the repertoire of GB2312-80 and Unicode, consult the
# VENDORS mapping data. Normative information on the mapping between
# GB2312-80 and Unicode may be found in the Unihan.txt file in the
# latest Unicode Character Database.
#
# If you have carefully considered the fact that the mappings in
# this table are only one possible set of mappings between GB2312-80 and
# Unicode and have no normative status, but still feel that you
# have located an error in the table that requires fixing, you may
# report any such error to errata@unicode.org.
# Format: Three tab-separated columns
#   Column #1 is the GB2312 code (in hex as 0xXXXX)
#   Column #2 is the Unicode code point (in hex as 0xXXXX)
#   Column #3 is the Unicode name (follows a comment sign, '#')
#     The official names for Unicode characters U+4E00
#     to U+9FA5, inclusive, are "CJK UNIFIED IDEOGRAPH-XXXX",
#     where XXXX is the code point. Including all these
#     names in this file increases its size substantially
#     and needlessly. The token "<CJK>" is used for the
#     name of these characters. If necessary, it can be
#     expanded algorithmically by a parser or editor.
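#
# The three-column format and the "<CJK>" expansion described above can
# be sketched as follows. This is a minimal illustration, not part of the
# Unicode data or of csmapper: the function names parse_mapping_line and
# expand_name are hypothetical, and the sketch assumes the tab-separated
# layout described above rather than the csmapper map syntax.

```python
# Hypothetical helpers for the three-column table format described above.

CJK_FIRST = 0x4E00  # first code point named "CJK UNIFIED IDEOGRAPH-XXXX"
CJK_LAST = 0x9FA5   # last code point in the range stated by the header


def expand_name(ucs, name):
    """Expand the "<CJK>" token into the full algorithmic character name."""
    if name == "<CJK>" and CJK_FIRST <= ucs <= CJK_LAST:
        return "CJK UNIFIED IDEOGRAPH-%04X" % ucs
    return name


def parse_mapping_line(line):
    """Parse one table line into (gb2312, unicode, name).

    Returns None for blank lines and for lines that are entirely comments.
    """
    line = line.rstrip("\r\n")
    if not line.strip() or line.lstrip().startswith("#"):
        return None
    fields = line.split("\t", 2)
    if len(fields) != 3:
        raise ValueError("expected three tab-separated columns: %r" % line)
    gb = int(fields[0], 16)    # column 1: GB2312 code, written as 0xXXXX
    ucs = int(fields[1], 16)   # column 2: Unicode code point, written as 0xXXXX
    name = fields[2].lstrip("#").strip()  # column 3: name after the '#' sign
    return gb, ucs, expand_name(ucs, name)


if __name__ == "__main__":
    # Illustrative input lines in the format described above.
    for sample in ("0x2121\t0x3000\t# IDEOGRAPHIC SPACE",
                   "0x3021\t0x554A\t# <CJK>"):
        print(parse_mapping_line(sample))
```

# Keeping the name expansion separate from the line parser lets a tool
# store the compact "<CJK>" token and expand it only when displaying.
#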
# The entries are in GB2312 order.
#
# The following algorithms can be used to change the hex form
# of GB2312 to other standard forms:
#
#   To change hex to EUC form, add 0x8080.
#   To change hex to kuten form, first subtract 0x2020. Then
#     the high and low bytes, each read as a decimal number,
#     correspond to the ku and ten of the kuten form.
#     For example, 0x2121 -> 0x0101 -> 0101;
#     0x777E -> 0x575E -> 8794
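#
# The two conversions above can be written out as a short sketch. The
# function names are hypothetical, and the range check assumes that both
# bytes of a valid code lie in 0x21-0x7E, as the SRC_ZONE line suggests.

```python
# Hypothetical helpers for the hex-form conversions described above.


def _check_gb2312(code):
    """Reject codes whose high or low byte is outside 0x21-0x7E (assumed range)."""
    high, low = code >> 8, code & 0xFF
    if not (0x21 <= high <= 0x7E and 0x21 <= low <= 0x7E):
        raise ValueError("not a GB2312 hex code: 0x%04X" % code)


def gb_to_euc(code):
    """Hex form to EUC form: add 0x8080 (sets the top bit of both bytes)."""
    _check_gb2312(code)
    return code + 0x8080


def gb_to_kuten(code):
    """Hex form to kuten: subtract 0x2020, then split into (ku, ten)."""
    _check_gb2312(code)
    shifted = code - 0x2020
    return shifted >> 8, shifted & 0xFF


def format_kuten(code):
    """Kuten as a four-digit decimal string, two digits each for ku and ten."""
    ku, ten = gb_to_kuten(code)
    return "%02d%02d" % (ku, ten)


if __name__ == "__main__":
    print("0x%04X" % gb_to_euc(0x2121))  # 0xA1A1
    print(format_kuten(0x2121))          # 0101
    print(format_kuten(0x777E))          # 8794
```

# The last two calls reproduce the worked examples given above.
#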
# Version 1.0 updates version 0.0d2 by correcting the mapping for 0x212C
# from U+2225 to U+2016.