doc/libidn.texi

   1 \input texinfo   @c -*-texinfo-*-
   2 @c This file is part of the GNU Libidn Manual.
   3 @c Copyright (C) 2002, 2003 Simon Josefsson
   4 @c See below for copying conditions.
   5
   6 @setfilename libidn.info
   7 @include version.texi
   8 @settitle GNU Libidn @value{VERSION}
   9
  10 @syncodeindex pg cp
  11
  12 @copying
  13 This manual is for GNU Libidn version @value{VERSION},
  14 @value{UPDATED}.
  15
  16 Copyright @copyright{} 2002, 2003 Simon Josefsson.
  17
  18 @quotation
  19 Permission is granted to copy, distribute and/or modify this document
  20 under the terms of the GNU Free Documentation License, Version 1.1 or
  21 any later version published by the Free Software Foundation; with no
  22 Invariant Sections, with the Front-Cover Texts being ``A GNU Manual,''
  23 and with the Back-Cover Texts as in (a) below.  A copy of the
  24 license is included in the section entitled ``GNU Free Documentation
  25 License.''
  26
  27 (a) The FSF's Back-Cover Text is: ``You have freedom to copy and modify
  28 this GNU Manual, like GNU software.  Copies published by the Free
  29 Software Foundation raise funds for GNU development.''
  30 @end quotation
  31 @end copying
  32
  33 @dircategory GNU Libraries
  34 @direntry
  35 * libidn: (libidn).     Internationalized string processing library.
  36 @end direntry
  37
  38 @dircategory GNU utilities
  39 @direntry
  40 * idn: (libidn)Invoking idn.            Command line interface to GNU Libidn.
  41 @end direntry
  42
  43 @dircategory Emacs
  44 @direntry
  45 * IDN Library: (libidn)Emacs API.       Emacs API for IDN functions.
  46 @end direntry
  47
  48 @titlepage
  49 @title GNU Libidn
  50 @subtitle for version @value{VERSION}, @value{UPDATED}
  51 @author Simon Josefsson (@email{bug-libidn@@gnu.org})
  52 @page
  53 @vskip 0pt plus 1filll
  54 @insertcopying
  55 @end titlepage
  56
  57 @contents
  58
  59 @ifnottex
  60 @node Top
  61 @top GNU Libidn
  62
  63 @insertcopying
  64 @end ifnottex
  65
  66 @menu
  67 * Introduction::                How to use this manual.
  68 * Preparation::                 What you should do before using the library.
  69 * Stringprep Functions::        Stringprep functions.
  70 * Punycode Functions::          Punycode functions.
  71 * IDNA Functions::              IDNA functions.
  72 * Examples::                    Demonstrate how to use the library.
  73 * Invoking idn::                Command line interface to the library.
  74 * Emacs API::                   Emacs Lisp API for Libidn.
  75 * Acknowledgements::            Whom to blame.
  76
  77 Indices
  78
  79 * Concept Index::
  80 * Function and Variable Index::
  81
  82 Appendices
  83
  84 * Library Copying::             How you can copy and share GNU Libidn.
  85 * Copying This Manual::         How you can copy and share this manual.
  86
  87 @end menu
  88
  89
  90 @node Introduction
  91 @chapter Introduction
  92
  93 GNU Libidn is an implementation of the Stringprep, Punycode and IDNA
  94 specifications defined by the IETF Internationalized Domain Names
  95 (IDN) working group, used for internationalized domain names.  The
  96 package is available under the GNU Lesser General Public License.
  97
  98 The library contains a generic Stringprep implementation that does
  99 Unicode 3.2 NFKC normalization, mapping and prohibition of
 100 characters, and bidirectional character handling.  Profiles for iSCSI,
 101 Kerberos 5, Nameprep, SASL and XMPP are included.  Punycode and ASCII
 102 Compatible Encoding (ACE) via IDNA are supported.
 103
 104 The Stringprep API consists of two main functions, one for converting
 105 data from the system's native representation into UTF-8, and one
 106 function to perform the Stringprep processing.  Adding a new
 107 Stringprep profile for your application within the API is
 108 straightforward.  The Punycode API consists of one encoding function
 109 and one decoding function.  The IDNA API consists of the ToASCII and
 110 ToUnicode functions, as well as a high-level interface for converting
 111 entire domain names to and from the ACE encoded form.
 112
 113 The library is used by, e.g., GNU SASL and Shishi to process user
 114 names and passwords.  Libidn can be built into GNU Libc to enable a
 115 new system-wide getaddrinfo() flag for IDN processing.
 116
 117 Libidn is developed for the GNU/Linux system, but runs on over 20 Unix
 118 platforms (including Solaris, IRIX, AIX, and Tru64) and Windows.
 119 Libidn is written in C, and parts of the API are accessible from C,
 120 C++, Emacs Lisp, Python and Java.
 121
 122 @menu
 123 * Getting Started::
 124 * Features::
 125 * Supported Platforms::
 126 * Bug Reports::
 127 @end menu
 128
 129 @node Getting Started
 130 @section Getting Started
 131
 132 This manual documents the library programming interface.  All
 133 functions and data types provided by the library are explained.
 134
 135 The reader is assumed to possess basic familiarity with
 136 internationalization concepts and network programming in C or C++.
 137
 138 This manual can be used in several ways.  If read from the beginning
 139 to the end, it gives a good introduction into the library and how it
 140 can be used in an application.  Forward references are included where
 141 necessary.  Later on, the manual can be used as a reference manual to
 142 get just the information needed about any particular interface of the
 143 library.  Experienced programmers might want to start looking at the
 144 examples at the end of the manual (@pxref{Examples}), and then only
 145 read up those parts of the interface which are unclear.
 146
 147 @node Features
 148 @section Features
 149
 150 This library might have a couple of advantages over other libraries
 151 doing a similar job.
 152
 153 @table @asis
 154 @item It's Free Software
 155 Anybody can use, modify, and redistribute it under the terms of the
 156 GNU Lesser General Public License.
 157
 158 @item It's thread-safe
 159 No global state is kept in the library.
 160
 161 @item It's portable
 162 It should work on all Unix-like operating systems, and on Windows.
 163
 164 @end table
 165
 166 @node Supported Platforms
 167 @section Supported Platforms
 168
 169 Libidn has at some point in time been tested on the following
 170 platforms.
 171
 172 @enumerate
 173
 174 @item Debian GNU/Linux 3.0 (Woody)
 175 @cindex Debian
 176
 177 GCC 2.95.4 and GNU Make. This is the main development platform.
 178 @code{alphaev67-unknown-linux-gnu}, @code{alphaev6-unknown-linux-gnu},
 179 @code{arm-unknown-linux-gnu}, @code{hppa-unknown-linux-gnu},
 180 @code{hppa64-unknown-linux-gnu}, @code{i686-pc-linux-gnu},
 181 @code{ia64-unknown-linux-gnu}, @code{m68k-unknown-linux-gnu},
 182 @code{mips-unknown-linux-gnu}, @code{mipsel-unknown-linux-gnu},
 183 @code{powerpc-unknown-linux-gnu}, @code{s390-ibm-linux-gnu},
 184 @code{sparc-unknown-linux-gnu}.
 185
 186 @item Debian GNU/Linux 2.1
 187 @cindex Debian
 188
 189 GCC 2.95.1 and GNU Make. @code{armv4l-unknown-linux-gnu}.
 190
 191 @item Tru64 UNIX
 192 @cindex Tru64
 193
 194 Tru64 UNIX C compiler and Tru64 Make. @code{alphaev67-dec-osf5.1},
 195 @code{alphaev68-dec-osf5.1}.
 196
 197 @item SuSE Linux 7.1
 198 @cindex SuSE
 199
 200 GCC 2.96 and GNU Make. @code{alphaev6-unknown-linux-gnu},
 201 @code{alphaev67-unknown-linux-gnu}.
 202
 203 @item SuSE Linux 7.2a
 204 @cindex SuSE Linux
 205
 206 GCC 3.0 and GNU Make. @code{ia64-unknown-linux-gnu}.
 207
 208 @item RedHat Linux 7.2
 209 @cindex RedHat
 210
 211 GCC 2.96 and GNU Make. @code{alphaev6-unknown-linux-gnu},
 212 @code{alphaev67-unknown-linux-gnu}, @code{ia64-unknown-linux-gnu}.
 213
 214 @item RedHat Linux 8.0
 215 @cindex RedHat
 216
 217 GCC 3.2 and GNU Make. @code{i686-pc-linux-gnu}.
 218
 219 @item RedHat Advanced Server 2.1
 220 @cindex RedHat Advanced Server
 221
 222 GCC 2.96 and GNU Make. @code{i686-pc-linux-gnu}.
 223
 224 @item Slackware Linux 8.0.01
 225 @cindex Slackware
 226
 227 GCC 2.95.3 and GNU Make. @code{i686-pc-linux-gnu}.
 228
 229 @item Mandrake Linux 9.0
 230 @cindex Mandrake
 231
 232 GCC 3.2 and GNU Make. @code{i686-pc-linux-gnu}.
 233
 234 @item IRIX 6.5
 235 @cindex IRIX
 236
 237 MIPS C compiler, IRIX Make. @code{mips-sgi-irix6.5}.
 238
 239 @item AIX 4.3.2
 240 @cindex AIX
 241
 242 IBM C for AIX compiler, AIX Make.  @code{rs6000-ibm-aix4.3.2.0}.
 243
 244 @item Microsoft Windows 2000 (Cygwin)
 245 @cindex Windows
 246
 247 GCC 3.2, GNU make. @code{i686-pc-cygwin}.
 248
 249 @item HP-UX 11
 250 @cindex HP-UX
 251
 252 HP-UX C compiler and HP Make. @code{ia64-hp-hpux11.22},
 253 @code{hppa2.0w-hp-hpux11.11}.
 254
 255 @item SUN Solaris 2.8
 256 @cindex Solaris
 257
 258 Sun WorkShop Compiler C 6.0 and SUN Make. @code{sparc-sun-solaris2.8}.
 259
 260 @item NetBSD 1.6
 261 @cindex NetBSD
 262
 263 GCC 2.95.3 and GNU Make. @code{alpha-unknown-netbsd1.6},
 264 @code{i386-unknown-netbsdelf1.6}.
 265
 266 @item OpenBSD 3.1 and 3.2
 267 @cindex OpenBSD
 268
 269 GCC 2.95.3 and GNU Make. @code{alpha-unknown-openbsd3.1},
 270 @code{i386-unknown-openbsd3.1}.
 271
 272 @item FreeBSD 4.7
 273 @cindex FreeBSD
 274
 275 GCC 2.95.4 and GNU Make. @code{alpha-unknown-freebsd4.7},
 276 @code{i386-unknown-freebsd4.7}.
 277
 278 @end enumerate
 279
 280 If you use Libidn on, or port Libidn to, a new platform please report
 281 it to the author.
 282
 283 @node Bug Reports
 284 @section Bug Reports
 285 @cindex Reporting Bugs
 286
 287 If you think you have found a bug in Libidn, please investigate it and
 288 report it.
 289
 290 @itemize @bullet
 291
 292 @item Please make sure that the bug is really in Libidn, and
 293 preferably also check that it hasn't already been fixed in the latest
 294 version.
 295
 296 @item You have to send us a test case that makes it possible for us to
 297 reproduce the bug.
 298
 299 @item You also have to explain what is wrong: whether you get a crash, or
 300 whether the printed results are incorrect and, if so, in what way.
 301 Make sure that the bug report includes all information you would need
 302 to fix this kind of bug for someone else.
 303
 304 @end itemize
 305
 306 Please make an effort to produce a self-contained report, with
 307 something definite that can be tested or debugged.  Vague queries or
 308 piecemeal messages are difficult to act on and don't help the
 309 development effort.
 310
 311 If your bug report is good, we will do our best to help you to get a
 312 corrected version of the software; if the bug report is poor, we won't
 313 do anything about it (apart from asking you to send better bug
 314 reports).
 315
 316 If you think something in this manual is unclear, or downright
 317 incorrect, or if the language needs to be improved, please also send a
 318 note.
 319
 320 Send your bug report to:
 321
 322 @center @samp{bug-libidn@@gnu.org}
 323
 324
 325 @c **********************************************************
 326 @c *******************  Preparation  ************************
 327 @c **********************************************************
 328 @node Preparation
 329 @chapter Preparation
 330
 331 To use `Libidn', you have to perform some changes to your sources and
 332 the build system.  The necessary changes are small and explained in
 333 the following sections.  At the end of this chapter, it is described
 334 how the library is initialized, and how the requirements of the
 335 library are verified.
 336
 337 A faster way to find out how to adapt your application for use with
 338 `Libidn' may be to look at the examples at the end of this manual
 339 (@pxref{Examples}).
 340
 341 @menu
 342 * Header::
 343 * Initialization::
 344 * Version Check::
 345 * Building the source::
 346 @end menu
 347
 348 @node Header
 349 @section Header
 350
 351 The library contains a few independent parts, and each part exports its
 352 interfaces (data types and functions) in a header file.  You must
 353 include the appropriate header files in all programs using the
 354 library, either directly or through some other header file, like this:
 355
 356 @example
 357 #include <stringprep.h>
 358 @end example
 359
 360 The header files and the functions they define are categorized as
 361 follows:
 362
 363 @table @asis
 364 @item stringprep.h
 365
 366 The low-level stringprep API entry point.  For IDN applications, this
 367 is usually invoked via IDNA. Some applications, specifically non-IDN
 368 ones, may want to prepare strings directly though, and should include
 369 this header file.
 370
 371 The name space of the stringprep part of Libidn is @code{stringprep*}
 372 for function names, @code{Stringprep*} for data types and
 373 @code{STRINGPREP_*} for other symbols.  In addition the same name
 374 prefixes with one prepended underscore are reserved for internal use
 375 and should never be used by an application.
 376
 377 @item punycode.h
 378
 379 The entry point to Punycode encoding and decoding functions.  Normally
 380 punycode is used via the idna.h interface, but some applications may
 381 want to perform raw punycode operations.
 382
 383 The name space of the punycode part of Libidn is @code{punycode_*} for
 384 function names, @code{Punycode*} for data types and @code{PUNYCODE_*}
 385 for other symbols.  In addition the same name prefixes with one
 386 prepended underscore are reserved for internal use and should never be
 387 used by an application.
 388
 389 @item idna.h
 390
 391 The entry point to the IDNA functions.  This is the normal entry point
 392 for applications that need IDN functionality.
 393
 394 The name space of the IDNA part of Libidn is @code{idna_*} for
 395 function names, @code{Idna*} for data types and @code{IDNA_*} for
 396 other symbols.  In addition the same name prefixes with one prepended
 397 underscore are reserved for internal use and should never be used by
 398 an application.
 399
 400 @end table
 401
 402 @node Initialization
 403 @section Initialization
 404
 405 Libidn is stateless and does not need any initialization.
 406
 407 @node Version Check
 408 @section Version Check
 409
 410 It is often desirable to check that the version of `Libidn' used is
 411 indeed one which fits all requirements.  Even with binary
 412 compatibility, new features may have been introduced, but due to a problem
 413 with the dynamic linker an old version may actually be used.  So you may
 414 want to check that the version is okay right after program startup.
 415
 416 @deftypefun {const char *} stringprep_check_version (const char * @var{req_version})
 417
 418 @var{req_version}:  Required version number, or NULL.
 419
 420 Check that the version of the library is at minimum the requested one
 421 and return the version string; return NULL if the condition is not
 422 satisfied.  If a NULL is passed to this function, no check is done,
 423 but the version string is simply returned.
 424
 425 See @var{STRINGPREP_VERSION} for a suitable @code{req_version} string.
 426
 427 Return value: Version string of the run-time library, or NULL if the
 428 run-time library does not meet the required version number.
 429
 430 @end deftypefun
 431
 432 The normal way to use the function is to put something similar to the
 433 following first in your @code{main()}:
 434
 435 @example
 436   if (!stringprep_check_version (STRINGPREP_VERSION))
 437     @{
 438       printf ("stringprep_check_version() failed:\n"
 439               "Header file incompatible with shared library.\n");
 440       exit(1);
 441     @}
 442 @end example
 443
 444 @node Building the source
 445 @section Building the source
 446 @cindex Compiling your application
 447
 448 If you want to compile a source file including e.g. the `idna.h' header
 449 file, you must make sure that the compiler can find it in the
 450 directory hierarchy.  This is accomplished by adding the path to the
 451 directory in which the header file is located to the compiler's include
 452 file search path (via the @option{-I} option).
 453
 454 However, the path to the include file is determined at the time the
 455 source is configured.  To solve this problem, `Libidn' uses the
 456 external package @command{pkg-config} that knows the path to the
 457 include file and other configuration options.  The options that need
 458 to be added to the compiler invocation at compile time are output by
 459 the @option{--cflags} option to @command{pkg-config libidn}.  The
 460 following example shows how it can be used at the command line:
 461
 462 @example
 463 gcc -c foo.c `pkg-config libidn --cflags`
 464 @end example
 465
 466 Adding the output of @samp{pkg-config libidn --cflags} to the
 467 compiler's command line will ensure that the compiler can find e.g. the
 468 idna.h header file.
 469
 470 A similar problem occurs when linking the program with the library.
 471 Again, the compiler has to find the library files.  For this to work,
 472 the path to the library files has to be added to the library search
 473 path (via the @option{-L} option).  For this, the option
 474 @option{--libs} to @command{pkg-config libidn} can be used.  For
 475 convenience, this option also outputs all other options that are
 476 required to link the program with the `libidn' library.  The example
 477 shows how to link @file{foo.o} with the `libidn' library to a program
 478 @command{foo}.
 479
 480 @example
 481 gcc -o foo foo.o `pkg-config libidn --libs`
 482 @end example
 483
 484 Of course you can also combine both examples to a single command by
 485 specifying both options to @command{pkg-config}:
 486
 487 @example
 488 gcc -o foo foo.c `pkg-config libidn --cflags --libs`
 489 @end example
 490
 491 @c **********************************************************
 492 @c ******************  Stringprep Functions *****************
 493 @c **********************************************************
 494 @node Stringprep Functions
 495 @chapter Stringprep Functions
 496 @cindex Stringprep Functions
 497
 498 Stringprep describes a framework for preparing Unicode text strings in
 499 order to increase the likelihood that string input and string
 500 comparison work in ways that make sense for typical users throughout
 501 the world. The stringprep protocol is useful for protocol identifier
 502 values, company and personal names, internationalized domain names,
 503 and other text strings.
 504
 505 @defcv {Enumerated type} Stringprep_profile_flags STRINGPREP_NO_NFKC
 506 STRINGPREP_NO_NFKC disables the NFKC normalization, as well as
 507 selecting the non-NFKC case folding tables.  Usually the profile
 508 specifies BIDI and NFKC settings.
 509 @end defcv
 510
 511 @defcv {Enumerated type} Stringprep_profile_flags STRINGPREP_NO_BIDI
 512 STRINGPREP_NO_BIDI disables the BIDI step.  Usually the profile
 513 specifies BIDI and NFKC settings.
 514 @end defcv
 515
 516 @defcv {Enumerated type} Stringprep_profile_flags STRINGPREP_NO_UNASSIGNED
 517 STRINGPREP_NO_UNASSIGNED causes stringprep() to abort with an error if
 518 the string contains characters that are unassigned according to the profile.
 519 @end defcv
 520
 521 @deftypefun {int} stringprep (char * @var{in}, size_t @var{maxlen}, int @var{flags}, Stringprep_profile * @var{profile})
 522
 523 @var{in}:  input/output array with string to prepare.
 524
 525 @var{maxlen}:  maximum length of input/output array.
 526
 527 @var{flags}:  optional stringprep profile flags.
 528
 529 @var{profile}:  pointer to stringprep profile to use.
 530
 531 Prepare the input UTF-8 string according to the stringprep profile.
 532 Normally application programmers use stringprep profile macros such
 533 as @code{stringprep_nameprep()}, @code{stringprep_kerberos5()} etc instead of
 534 calling this function directly.
 535
 536 Since the stringprep operation can expand the string, @code{maxlen}
 537 indicates how large the buffer holding the string is.  The @code{flags}
 538 are one of Stringprep_profile_flags, or 0.  The profile indicates
 539 processing details, see the profile header files, such as
 540 stringprep_generic.h and stringprep_nameprep.h for two examples.
 541 Your application can define new profiles, possibly re-using the
 542 generic stringprep tables that always will be part of the library.
 543 Note that you must convert strings entered in the system's locale
 544 into UTF-8 before using this function.
 545
 546  Returns 0 iff successful, or an error code.
 547
 548 @end deftypefun
 549
 550 @deftypefun {int} stringprep_profile (char * @var{in}, char ** @var{out}, char * @var{profile}, int @var{flags})
 551
 552 @var{in}:  input/output array with string to prepare.
 553
 554 @var{out}:  output variable with newly allocated string.
 555
 556 @var{profile}:  name of stringprep profile to use.
 557
 558 @var{flags}:  optional stringprep profile flags.
 559
 560 Prepare the input UTF-8 string according to the stringprep profile.
 561 Normally application programmers use stringprep profile macros such
 562 as @code{stringprep_nameprep()}, @code{stringprep_kerberos5()} etc instead of
 563 calling this function directly.
 564
 565 Note that you must convert strings entered in the system's locale
 566 into UTF-8 before using this function.
 567
 568 The output @code{out} variable must be deallocated by the caller.
 569
 570  Returns 0 iff successful, or an error code.
 571
 572 @end deftypefun
 573
 574 @deftypefun {uint32_t} stringprep_utf8_to_unichar (const char * @var{p})
 575
 576 @var{p}:  a pointer to Unicode character encoded as UTF-8
 577
 578 Converts a sequence of bytes encoded as UTF-8 to a Unicode character.
 579 If @code{p} does not point to a valid UTF-8 encoded character, results are
 580 undefined.
 581
 582 Return value: the resulting character.
 583
 584 @end deftypefun
 585
 586 @deftypefun {int} stringprep_unichar_to_utf8 (uint32_t @var{c}, char * @var{outbuf})
 587
 588 @var{c}:  an ISO 10646 character code
 589
 590 @var{outbuf}:  output buffer, must have at least 6 bytes of space.
 591 If @var{NULL}, the length will be computed and returned
 592 and nothing will be written to @code{outbuf}.
 593
 594 Converts a single character to UTF-8.
 595
 596 Return value: number of bytes written.
 597
 598 @end deftypefun
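The byte layout behind a single-character UTF-8 conversion can be sketched as follows.  This is a minimal stand-alone illustration, not the library's implementation: the name @code{sketch_unichar_to_utf8} is hypothetical, and the sketch assumes the original UTF-8 definition with sequences of up to 6 bytes, which matches the 6-byte buffer requirement stated above.  Like the documented function, it only computes the length when the output buffer is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: encode one code point as
   UTF-8.  Returns the number of bytes the encoding needs; writes them
   to outbuf unless outbuf is NULL. */
int sketch_unichar_to_utf8(uint32_t c, char *outbuf)
{
  int len, i;
  unsigned char first;   /* marker bits of the lead byte */

  assert(c <= 0x7FFFFFFF);   /* largest value a 6-byte sequence can hold */

  if (c < 0x80)           { first = 0x00; len = 1; }
  else if (c < 0x800)     { first = 0xC0; len = 2; }
  else if (c < 0x10000)   { first = 0xE0; len = 3; }
  else if (c < 0x200000)  { first = 0xF0; len = 4; }
  else if (c < 0x4000000) { first = 0xF8; len = 5; }
  else                    { first = 0xFC; len = 6; }

  if (outbuf != NULL)
    {
      /* Fill continuation bytes from the end, 6 payload bits each. */
      for (i = len - 1; i > 0; i--)
        {
          outbuf[i] = (char) ((c & 0x3F) | 0x80);
          c >>= 6;
        }
      /* The remaining high bits go into the lead byte. */
      outbuf[0] = (char) (c | first);
    }
  return len;
}
```

Calling the sketch with U+00E9 and a 6-byte buffer yields the two bytes 0xC3 0xA9; calling it with a NULL buffer returns 2 without writing anything.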
 599
 600 @deftypefun {uint32_t *} stringprep_utf8_to_ucs4 (const char * @var{str}, ssize_t @var{len}, size_t * @var{items_written})
 601
 602 @var{str}:  a UTF-8 encoded string
 603
 604 @var{len}:  the maximum length of @code{str} to use. If @code{len} < 0, then
 605 the string is nul-terminated.
 606
 607 @var{items_written}:  location to store the number of characters in the
 608 result, or @var{NULL}.
 609
 610 Convert a string from UTF-8 to a 32-bit fixed width
 611 representation as UCS-4, assuming valid UTF-8 input.
 612 This function does no error checking on the input.
 613
 614 Return value: a pointer to a newly allocated UCS-4 string.
 615 This value must be freed with @code{free()}.
 616
 617 @end deftypefun
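The decoding direction can be sketched in the same way.  The code below is an illustration under stated assumptions, not the library's code: @code{sketch_utf8_to_ucs4} and @code{sketch_utf8_seq_len} are hypothetical names, the input is assumed to be a valid nul-terminated UTF-8 string (as the function above also assumes), and the length argument of the real function is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: length of the UTF-8 sequence introduced by lead
   byte b.  Assumes b really is a lead byte. */
static int sketch_utf8_seq_len(unsigned char b)
{
  if (b < 0x80) return 1;
  if ((b & 0xE0) == 0xC0) return 2;
  if ((b & 0xF0) == 0xE0) return 3;
  if ((b & 0xF8) == 0xF0) return 4;
  if ((b & 0xFC) == 0xF8) return 5;
  return 6;
}

/* Hypothetical helper: decode a nul-terminated UTF-8 string into a newly
   allocated, zero-terminated array of code points.  The caller frees the
   result with free().  Returns NULL if memory runs out. */
uint32_t *sketch_utf8_to_ucs4(const char *str, size_t *items_written)
{
  const unsigned char *p = (const unsigned char *) str;
  size_t n = 0;
  uint32_t *out;

  assert(str != NULL);

  /* A UTF-8 string never holds more code points than bytes. */
  out = malloc((strlen(str) + 1) * sizeof *out);
  if (out == NULL)
    return NULL;

  while (*p != 0)
    {
      int len = sketch_utf8_seq_len(*p);
      int i;
      /* Payload bits of the lead byte: 7 for ASCII, else 7 - len. */
      uint32_t c = (len == 1) ? *p : (uint32_t) (*p & (0xFF >> (len + 1)));

      /* Each continuation byte adds 6 bits; stop early at the
         terminator so truncated input cannot be read past its end. */
      for (i = 1; i < len && p[i] != 0; i++)
        c = (c << 6) | (uint32_t) (p[i] & 0x3F);

      out[n++] = c;
      p += i;
    }

  out[n] = 0;
  if (items_written != NULL)
    *items_written = n;
  return out;
}
```

Allocating one array slot per input byte wastes some memory for non-ASCII text, but it avoids a separate counting pass and keeps the sketch short.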
 618
 619 @deftypefun {char *} stringprep_ucs4_to_utf8 (const uint32_t * @var{str}, ssize_t @var{len}, size_t * @var{items_read}, size_t * @var{items_written})
 620
 621 @var{str}:  a UCS-4 encoded string
 622
 623 @var{len}:  the maximum length of @code{str} to use. If @code{len} < 0, then
 624 the string is terminated with a 0 character.
 625
 626 @var{items_read}:  location to store number of characters read, or @var{NULL}.
 627
 628 @var{items_written}:  location to store number of bytes written or @var{NULL}.
 629 The value here stored does not include the trailing 0
 630 byte.
 631
 632 Convert a string from a 32-bit fixed width representation as UCS-4
 633 to UTF-8. The result will be terminated with a 0 byte.
 634
 635 Return value: a pointer to a newly allocated UTF-8 string.
 636 This value must be freed with @code{free()}. If an
 637 error occurs, @var{NULL} will be returned and
 638 @code{error} set.
 639
 640 @end deftypefun
 641
 642 @deftypefun {char *} stringprep_utf8_nfkc_normalize (const char * @var{str}, ssize_t @var{len})
 643
 644 @var{str}:  a UTF-8 encoded string.
 645
 646 @var{len}:  length of @code{str}, in bytes, or -1 if @code{str} is nul-terminated.
 647
 648 Converts a string into canonical form, standardizing
 649 such issues as whether a character with an accent
 650 is represented as a base character and combining
 651 accent or as a single precomposed character. You
 652 should generally call this function before
 653 comparing two Unicode strings.
 654
 655 The normalization mode is NFKC (ALL COMPOSE).  It standardizes
 656 differences that do not affect the text content, such as the
 657 above-mentioned accent representation. It standardizes the
 658 "compatibility" characters in Unicode, such as SUPERSCRIPT THREE to
 659 the standard forms (in this case DIGIT THREE). Formatting
 660 information may be lost but for most text operations such
 661 characters should be considered the same. It returns a result with
 662 composed forms rather than a maximally decomposed form.
 663
 664 Return value: a newly allocated string, that is the
 665 NFKC normalized form of @code{str}.
 666
 667 @end deftypefun
 668
 669 @deftypefun {uint32_t *} stringprep_ucs4_nfkc_normalize (uint32_t * @var{str}, ssize_t @var{len})
 670
 671 @var{str}:  a Unicode string.
 672
 673 @var{len}:  length of @code{str} array, or -1 if @code{str} is nul-terminated.
 674
 675 Converts UCS4 string into UTF-8 and runs
 676 @code{stringprep_utf8_nfkc_normalize()}.
 677
 678 Return value: a newly allocated Unicode string, that is the NFKC
 679 normalized form of @code{str}.
 680
 681 @end deftypefun
 682
 683 @deftypefun {const char *} stringprep_locale_charset ( @var{void})
 684
 685
 686  Return the character set used by the system locale.
 687 It will never return NULL, but use "ASCII" as a fallback.
 688
 689 @end deftypefun
 690
 691 @deftypefun {char *} stringprep_convert (const char * @var{str}, const char * @var{to_codeset}, const char * @var{from_codeset})
 692
 693 @var{str}:  input zero-terminated string.
 694
 695 @var{to_codeset}:  name of destination character set.
 696
 697 @var{from_codeset}:  name of origin character set, as used by @code{str}.
 698
 699 Convert the string from one character set to another using the
 700 system's @code{iconv()} function.
 701
 702  Returns newly allocated zero-terminated string which
 703 is @code{str} transcoded into to_codeset.
 704
 705 @end deftypefun
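A character set conversion built on the system's @code{iconv()} might look roughly like the sketch below.  It is an assumption-laden illustration rather than the library's code: @code{sketch_convert} is a hypothetical name, the output buffer is sized with a simple worst-case guess of six output bytes per input byte, and a single trailing zero byte is written, so the sketch is only suitable for byte-oriented target encodings such as UTF-8.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a nul-terminated string between character
   sets with iconv().  Returns a newly allocated string that the caller
   frees with free(), or NULL on any failure. */
char *sketch_convert(const char *str, const char *to_codeset,
                     const char *from_codeset)
{
  iconv_t cd;
  size_t inleft, outleft, outsize;
  char *in, *out, *outp;

  assert(str != NULL);

  cd = iconv_open(to_codeset, from_codeset);
  if (cd == (iconv_t) -1)
    return NULL;              /* conversion not supported */

  inleft = strlen(str);
  outsize = inleft * 6 + 1;   /* assumed worst case, plus terminator */
  out = malloc(outsize);
  if (out == NULL)
    {
      iconv_close(cd);
      return NULL;
    }

  in = (char *) str;          /* iconv() does not modify the input */
  outp = out;
  outleft = outsize - 1;

  /* Convert the text, then flush any pending shift state. */
  if (iconv(cd, &in, &inleft, &outp, &outleft) == (size_t) -1
      || iconv(cd, NULL, NULL, &outp, &outleft) == (size_t) -1)
    {
      free(out);
      iconv_close(cd);
      return NULL;
    }

  *outp = '\0';
  iconv_close(cd);
  return out;
}
```

For example, converting the ISO-8859-1 string "caf\xE9" to UTF-8 with this sketch should give the five bytes "caf\xC3\xA9", provided the system's iconv knows both character set names.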
 706
 707 @deftypefun {char *} stringprep_locale_to_utf8 (const char * @var{str})
 708
 709 @var{str}:  input zero terminated string.
 710
 711 Convert string encoded in the locale's character set into UTF-8 by
 712 using @code{stringprep_convert()}.
 713
 714  Returns newly allocated zero-terminated string which
 715 is @code{str} transcoded into UTF-8.
 716
 717 @end deftypefun
 718
 719 @deftypefun {char *} stringprep_utf8_to_locale (const char * @var{str})
 720
 721 @var{str}:  input zero terminated string.
 722
 723 Convert string encoded in UTF-8 into the locale's character set by
 724 using @code{stringprep_convert()}.
 725
 726  Returns newly allocated zero-terminated string which
 727 is @code{str} transcoded into the locale's character set.
 728
 729 @end deftypefun
 730
 731 @deftypefun {int} stringprep_nameprep_no_unassigned (char * @var{in}, int @var{maxlen})
 732
 733 @var{in}:  input/output array with string to prepare.
 734
 735 @var{maxlen}:  maximum length of input/output array.
 736
 737 Prepare the input UTF-8 string according to the nameprep profile.
 738 The AllowUnassigned flag is false; use @code{stringprep_nameprep()} for
 739 true AllowUnassigned.  Returns 0 iff successful, or an error code.
 740
 741 @end deftypefun
 742
 743 @deftypefun {int} stringprep_iscsi (char * @var{in}, int @var{maxlen})
 744
 745 @var{in}:  input/output array with string to prepare.
 746
 747 @var{maxlen}:  maximum length of input/output array.
 748
 749 Prepare the input UTF-8 string according to the draft iSCSI
 750 stringprep profile.  Returns 0 iff successful, or an error code.
 751
 752 @end deftypefun
 753
 754 @deftypefun {int} stringprep_kerberos5 (char * @var{in}, int @var{maxlen})
 755
 756 @var{in}:  input/output array with string to prepare.
 757
 758 @var{maxlen}:  maximum length of input/output array.
 759
 760 Prepare the input UTF-8 string according to the draft Kerberos5
 761 stringprep profile.  Returns 0 iff successful, or an error code.
 762
 763 @end deftypefun
 764
 765 @deftypefun {int} stringprep_plain (char * @var{in}, int @var{maxlen})
 766
 767 @var{in}:  input/output array with string to prepare.
 768
 769 @var{maxlen}:  maximum length of input/output array.
 770
 771 Prepare the input UTF-8 string according to the draft SASL
 772 ANONYMOUS profile.  Returns 0 iff successful, or an error code.
 773
 774 @end deftypefun
 775
 776 @deftypefun {int} stringprep_xmpp_nodeprep (char * @var{in}, int @var{maxlen})
 777
 778 @var{in}:  input/output array with string to prepare.
 779
 780 @var{maxlen}:  maximum length of input/output array.
 781
 782 Prepare the input UTF-8 string according to the draft XMPP node
 783 identifier profile.  Returns 0 iff successful, or an error code.
 784
 785 @end deftypefun
 786
 787 @deftypefun {int} stringprep_xmpp_resourceprep (char * @var{in}, int @var{maxlen})
 788
 789 @var{in}:  input/output array with string to prepare.
 790
 791 @var{maxlen}:  maximum length of input/output array.
 792
 793 Prepare the input UTF-8 string according to the draft XMPP resource
 794 identifier profile.  Returns 0 iff successful, or an error code.
 795
 796 @end deftypefun
 797
 798 @deftypefun {int} stringprep_generic (char * @var{in}, int @var{maxlen})
 799
 800 @var{in}:  input/output array with string to prepare.
 801
 802 @var{maxlen}:  maximum length of input/output array.
 803
 804 Prepare the input UTF-8 string according to a hypothetical "generic"
 805 stringprep profile. This is mostly used for debugging or when
 806 constructing new stringprep profiles. Returns 0 iff successful, or
 807 an error code.
 808
 809 @end deftypefun
 810
 811 @c **********************************************************
 812 @c *******************  Punycode Functions ******************
 813 @c **********************************************************
 814 @node Punycode Functions
 815 @chapter Punycode Functions
 816 @cindex Punycode Functions
 817
 818 Punycode is a simple and efficient transfer encoding syntax designed
 819 for use with Internationalized Domain Names in Applications. It
 820 uniquely and reversibly transforms a Unicode string into an ASCII
 821 string. ASCII characters in the Unicode string are represented
 822 literally, and non-ASCII characters are represented by ASCII
 823 characters that are allowed in host name labels (letters, digits, and
 824 hyphens). The Punycode specification defines a general algorithm called
 825 Bootstring that allows a string of basic code points to uniquely represent
 826 any string of code points drawn from a larger set. Punycode is an instance
 827 of Bootstring that uses particular parameter values, specified by that
 828 document, that are appropriate for IDNA.
 829
 830 @deftypefun {enum punycode_status} punycode_encode (size_t @var{input_length}, const uint32_t @var{input[]}, const unsigned char @var{case_flags[]}, size_t * @var{output_length}, char @var{output[]})
 831
 832 @var{input_length}:  The input_length is the number of code points in the input.
 833
 834 @var{output_length}:  The output_length is an in/out argument: the caller
 835 passes in the maximum number of code points that it
 836 can receive, and on successful return it will
 837 contain the number of code points actually output.
 838
 839 Converts Unicode to Punycode.
 840
 841  The return value can be any of the punycode_status
 842 values defined above except punycode_bad_input; if
 843 not punycode_success, then output_length and output
 844 might contain garbage.
 845
 846 @end deftypefun
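To make the encoding step more concrete, here is a compact sketch of the Bootstring encoder with the Punycode parameters from RFC 3492 (base 36, tmin 1, tmax 26, skew 38, damp 700, initial bias 72, initial n 128).  It is not the library's @code{punycode_encode()}: the function names and the simplified signature are hypothetical, case flags are ignored, and the arithmetic overflow checks that a robust implementation needs are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700,
       INITIAL_BIAS = 72, INITIAL_N = 128 };

/* Map a digit value 0..35 to 'a'..'z', '0'..'9'. */
static char sketch_digit(uint32_t d)
{
  return (char) (d < 26 ? 'a' + d : '0' + (d - 26));
}

/* Bias adaptation as described in RFC 3492, section 6.1. */
static uint32_t sketch_adapt(uint32_t delta, uint32_t numpoints, int first)
{
  uint32_t k = 0;
  delta = first ? delta / DAMP : delta / 2;
  delta += delta / numpoints;
  while (delta > ((BASE - TMIN) * TMAX) / 2)
    {
      delta /= BASE - TMIN;
      k += BASE;
    }
  return k + (BASE - TMIN + 1) * delta / (delta + SKEW);
}

/* Hypothetical simplified encoder: writes a nul-terminated result to out
   (capacity outmax bytes).  Returns 0 on success, -1 if out is too
   small.  No overflow checks: a sketch, not production code. */
int sketch_punycode_encode(size_t inlen, const uint32_t in[],
                           size_t outmax, char out[])
{
  uint32_t n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
  size_t h, b, j, o = 0;

  assert(outmax > 0);

  /* Copy the basic (ASCII) code points literally. */
  for (j = 0; j < inlen; j++)
    if (in[j] < 0x80)
      {
        if (o + 1 >= outmax) return -1;
        out[o++] = (char) in[j];
      }
  h = b = o;
  if (b > 0)
    {
      if (o + 1 >= outmax) return -1;
      out[o++] = '-';   /* delimiter between basic and encoded part */
    }

  while (h < inlen)
    {
      /* Find the smallest code point not yet handled. */
      uint32_t m = UINT32_MAX;
      for (j = 0; j < inlen; j++)
        if (in[j] >= n && in[j] < m)
          m = in[j];
      delta += (m - n) * (uint32_t) (h + 1);
      n = m;

      for (j = 0; j < inlen; j++)
        {
          if (in[j] < n)
            delta++;
          if (in[j] == n)
            {
              /* Emit delta as a generalized variable-length integer. */
              uint32_t q = delta, k, t;
              for (k = BASE; ; k += BASE)
                {
                  t = k <= bias ? TMIN : k >= bias + TMAX ? TMAX : k - bias;
                  if (q < t) break;
                  if (o + 1 >= outmax) return -1;
                  out[o++] = sketch_digit(t + (q - t) % (BASE - t));
                  q = (q - t) / (BASE - t);
                }
              if (o + 1 >= outmax) return -1;
              out[o++] = sketch_digit(q);
              bias = sketch_adapt(delta, (uint32_t) (h + 1), h == b);
              delta = 0;
              h++;
            }
        }
      delta++;
      n++;
    }
  out[o] = '\0';
  return 0;
}
```

Feeding the six code points of the German word for "books" (b, U+00FC, c, h, e, r) into the sketch produces the string "bcher-kva": the ASCII letters are copied first, followed by the delimiter and the encoded position and value of U+00FC.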
 847
 848 @deftypefun {enum punycode_status} punycode_decode (size_t @var{input_length}, const char @var{input[]}, size_t * @var{output_length}, uint32_t @var{output[]}, unsigned char @var{case_flags[]})
 849
 850 @var{input_length}:  The input_length is the number of code points in the input.
 851
 852 @var{output_length}:  The output_length is an in/out argument: the caller
 853 passes in the maximum number of code points that it
 854 can receive, and on successful return it will
 855 contain the actual number of code points output.
 856
 857 Converts Punycode to Unicode.
 858
 859  The return value can be any of the punycode_status
 860 values defined above; if not punycode_success, then
 861 output_length, output, and case_flags might contain
 862 garbage.  On success, the decoder will never need to
 863 write an output_length greater than input_length,
 864 because of how the encoding is defined.
 865
 866 @end deftypefun
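
The decoding direction, including the in/out use of the output length,
can be sketched in the same way.  This is an illustration only, not the
Libidn implementation: @code{sketch_punycode_decode} and its single
error code are assumptions made for this example, and case flags are
not handled.

@verbatim
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bootstring parameter values that define Punycode (RFC 3492). */
enum { BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700,
       INITIAL_BIAS = 72, INITIAL_N = 0x80 };

/* Map 'a'..'z' and 'A'..'Z' to 0..25, '0'..'9' to 26..35;
   return BASE for anything else. */
static uint32_t decode_digit(char c)
{
    if (c >= '0' && c <= '9') return (uint32_t) (c - '0') + 26;
    if (c >= 'a' && c <= 'z') return (uint32_t) (c - 'a');
    if (c >= 'A' && c <= 'Z') return (uint32_t) (c - 'A');
    return BASE;
}

/* Bias adaptation, run after each decoded delta. */
static uint32_t adapt(uint32_t delta, uint32_t numpoints, int firsttime)
{
    uint32_t k = 0;

    delta = firsttime ? delta / DAMP : delta / 2;
    delta += delta / numpoints;
    while (delta > ((BASE - TMIN) * TMAX) / 2) {
        delta /= BASE - TMIN;
        k += BASE;
    }
    return k + (BASE - TMIN + 1) * delta / (delta + SKEW);
}

/* Hypothetical helper, not part of Libidn.  On entry *outlen is the
   capacity of out in code points; on success it is set to the number
   of code points written.  Returns 0 on success, -1 on any error. */
int sketch_punycode_decode(const char *in, size_t inlen,
                           uint32_t *out, size_t *outlen)
{
    uint32_t n = INITIAL_N, i = 0, bias = INITIAL_BIAS;
    uint32_t count = 0, oldi, w, k, digit, t;
    size_t b = 0, j, pos, max = *outlen;

    /* Everything before the last delimiter is literal ASCII. */
    for (j = 0; j < inlen; j++)
        if (in[j] == '-') b = j;
    if (b > max) return -1;
    for (j = 0; j < b; j++) {
        if ((unsigned char) in[j] >= 0x80) return -1;
        out[count++] = (unsigned char) in[j];
    }

    for (pos = b > 0 ? b + 1 : 0; pos < inlen; ) {
        /* Decode one generalized variable-length integer into i. */
        for (oldi = i, w = 1, k = BASE; ; k += BASE) {
            if (pos >= inlen) return -1;
            digit = decode_digit(in[pos++]);
            if (digit >= BASE) return -1;
            if (digit > (UINT32_MAX - i) / w) return -1;
            i += digit * w;
            t = k <= bias ? TMIN : k >= bias + TMAX ? TMAX : k - bias;
            if (digit < t) break;
            if (w > UINT32_MAX / (BASE - t)) return -1;
            w *= BASE - t;
        }
        bias = adapt(i - oldi, count + 1, oldi == 0);

        /* i wraps around count+1 positions; each wrap increments n. */
        if (i / (count + 1) > UINT32_MAX - n) return -1;
        n += i / (count + 1);
        i %= count + 1;

        /* Insert code point n at position i. */
        if (count >= max) return -1;
        memmove(out + i + 1, out + i, (count - i) * sizeof *out);
        out[i++] = n;
        count++;
    }
    *outlen = count;
    return 0;
}
```
@end verbatim

Decoding the nine characters of @samp{bcher-kva} with this sketch
yields six code points, which illustrates why an output array as long
as the input is always large enough.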
 867
 868 @c **********************************************************
 869 @c ********************* IDNA Functions *********************
 870 @c **********************************************************
 871 @node IDNA Functions
 872 @chapter IDNA Functions
 873 @cindex IDNA Functions
 874
Before IDNA, there was no standard method for domain names to use
characters outside the ASCII repertoire.  The IDNA specification
(RFC 3490) defines internationalized domain names (IDNs) and a
mechanism called IDNA for handling them in a standard fashion.  IDNs
use characters drawn from a large repertoire (Unicode), but IDNA
allows the non-ASCII characters to be represented using only the ASCII
characters already allowed in so-called host names today.  This
backward-compatible representation is required in existing protocols
like DNS, so that IDNs can be introduced with no changes to the
existing infrastructure.  IDNA is only meant for processing domain
names, not free text.
 885
 886 @deftypefun {int} idna_to_ascii_4i (const uint32_t * @var{in}, size_t @var{inlen}, char * @var{out}, int @var{flags})
 887
@var{in}:  input array with Unicode code points.

@var{inlen}:  length of input array with Unicode code points.

@var{out}:  zero-terminated output string that must have room for at
least 63 characters plus the terminating zero.
 894
 895 @var{flags}:  IDNA flags, e.g. IDNA_ALLOW_UNASSIGNED or IDNA_USE_STD3_ASCII_RULES.
 896
 897 The ToASCII operation takes a sequence of Unicode code points that make
 898 up one label and transforms it into a sequence of code points in the
 899 ASCII range (0..7F). If ToASCII succeeds, the original sequence and the
 900 resulting sequence are equivalent labels.
 901
 902 It is important to note that the ToASCII operation can fail. ToASCII
 903 fails if any step of it fails. If any step of the ToASCII operation
 904 fails on any label in a domain name, that domain name MUST NOT be used
as an internationalized domain name. The method for dealing with this
 906 failure is application-specific.
 907
 908 The inputs to ToASCII are a sequence of code points, the AllowUnassigned
 909 flag, and the UseSTD3ASCIIRules flag. The output of ToASCII is either a
 910 sequence of ASCII code points or a failure condition.
 911
 912 ToASCII never alters a sequence of code points that are all in the ASCII
 913 range to begin with (although it could fail). Applying the ToASCII
 914 operation multiple times has exactly the same effect as applying it just
 915 once.
 916
 917  Returns 0 on success, or an error code.
 918
 919 @end deftypefun
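
To make the ASCII-level checks concrete, here is a minimal sketch of
what the UseSTD3ASCIIRules flag adds for a label that is already
ASCII, assuming the rules of RFC 3490: only letters, digits and
hyphens, no hyphen at either end, and a length between 1 and 63
characters.  The function @code{sketch_std3_label_ok} is a
hypothetical helper written for this example, not part of Libidn, and
it does not perform Nameprep or Punycode processing.

@verbatim
```c
#include <stddef.h>

/* Hypothetical helper, not part of Libidn.  Returns 1 if the ASCII
   label passes the checks sketched here, 0 otherwise. */
int sketch_std3_label_ok(const char *label, size_t len)
{
    size_t i;

    /* A label holds between 1 and 63 characters. */
    if (len < 1 || len > 63)
        return 0;

    /* No leading or trailing hyphen. */
    if (label[0] == '-' || label[len - 1] == '-')
        return 0;

    /* Only letters, digits and hyphens are allowed. */
    for (i = 0; i < len; i++) {
        char c = label[i];
        int ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
              || (c >= '0' && c <= '9') || c == '-';
        if (!ok)
            return 0;
    }
    return 1;
}
```
@end verbatim

With this sketch, @samp{example} and @samp{xn--bcher-kva} are
accepted, while @samp{-bad}, @samp{under_score} and any label longer
than 63 characters are rejected.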
 920
 921 @deftypefun {int} idna_to_ascii (const unsigned long * @var{in}, size_t @var{inlen}, char * @var{out}, int @var{allowunassigned}, int @var{usestd3asciirules})
 922
@var{in}:  input array with Unicode code points.

@var{inlen}:  length of input array with Unicode code points.

@var{out}:  zero-terminated output string that must have room for at
least 63 characters plus the terminating zero.
 929
 930 @var{allowunassigned}:  whether to allow unassigned code points.
 931
 932 @var{usestd3asciirules}:  whether to check input for STD3 compliance.
 933
 934 The ToASCII operation takes a sequence of Unicode code points that make
 935 up one label and transforms it into a sequence of code points in the
 936 ASCII range (0..7F). If ToASCII succeeds, the original sequence and the
 937 resulting sequence are equivalent labels.
 938
 939 It is important to note that the ToASCII operation can fail. ToASCII
 940 fails if any step of it fails. If any step of the ToASCII operation
 941 fails on any label in a domain name, that domain name MUST NOT be used
as an internationalized domain name. The method for dealing with this
 943 failure is application-specific.
 944
 945 The inputs to ToASCII are a sequence of code points, the AllowUnassigned
 946 flag, and the UseSTD3ASCIIRules flag. The output of ToASCII is either a
 947 sequence of ASCII code points or a failure condition.
 948
 949 ToASCII never alters a sequence of code points that are all in the ASCII
 950 range to begin with (although it could fail). Applying the ToASCII
 951 operation multiple times has exactly the same effect as applying it just
 952 once.
 953
 954  Returns 0 on success, or an error code.
 955
 956 @end deftypefun
 957
 958 @deftypefun {int} idna_to_unicode (const unsigned long * @var{in}, size_t @var{inlen}, unsigned long * @var{out}, size_t * @var{outlen}, int @var{allowunassigned}, int @var{usestd3asciirules})
 959
@var{in}:  input array with Unicode code points.

@var{inlen}:  length of input array with Unicode code points.

@var{out}:  output array with Unicode code points.

@var{outlen}:  on input, maximum size of output array with Unicode code points,
on exit, actual size of output array with Unicode code points.
 968
 969 @var{allowunassigned}:  whether to allow unassigned code points.
 970
 971 @var{usestd3asciirules}:  whether to check input for STD3 compliance.
 972
 973 The ToUnicode operation takes a sequence of Unicode code points
 974 that make up one label and returns a sequence of Unicode code
 975 points. If the input sequence is a label in ACE form, then the
 976 result is an equivalent internationalized label that is not in ACE
 977 form, otherwise the original sequence is returned unaltered.
 978
 979 ToUnicode never fails. If any step fails, then the original input
 980 sequence is returned immediately in that step.
 981
 982 The ToUnicode output never contains more code points than its
 983 input.  Note that the number of octets needed to represent a
 984 sequence of code points depends on the particular character
 985 encoding used.
 986
 987 The inputs to ToUnicode are a sequence of code points, the
 988 AllowUnassigned flag, and the UseSTD3ASCIIRules flag. The output of
 989 ToUnicode is always a sequence of Unicode code points.
 990
Returns an error condition, but it must only be used for debugging
purposes.  The output buffer is always guaranteed to contain the
correct data according to the specification (except when memory
allocation fails).  This means that you normally ignore the return
code from this function, as acting on it means breaking the
standard.
 998
 999 @end deftypefun
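
The ``never fails'' contract can be pictured as a wrapper that falls
back to the original input whenever the conversion step reports a
problem.  The sketch below is only an illustration of that contract,
not the Libidn code: @code{sketch_to_unicode}, the
@code{sketch_step_fn} callback type and the always-failing stand-in
step are all hypothetical names introduced for this example, and the
input and output arrays are assumed not to overlap.

@verbatim
```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical conversion step: returns 0 on success, nonzero on
   failure, and may leave partial data in out when it fails. */
typedef int (*sketch_step_fn) (const uint32_t *in, size_t inlen,
                               uint32_t *out, size_t *outlen);

/* Stand-in step that always fails after scribbling on the output. */
int sketch_failing_step(const uint32_t *in, size_t inlen,
                        uint32_t *out, size_t *outlen)
{
    (void) in;
    if (inlen > 0 && *outlen > 0)
        out[0] = 0xFFFD;
    *outlen = 0;
    return 1;
}

/* Hypothetical wrapper.  On entry *outlen is the capacity of out.
   The output always holds a usable label on return, unless the
   buffer is too small to hold even a copy of the input. */
int sketch_to_unicode(const uint32_t *in, size_t inlen,
                      uint32_t *out, size_t *outlen,
                      sketch_step_fn step)
{
    size_t n = *outlen;
    int rc;

    if (*outlen < inlen)
        return -1;              /* no room even for the fallback */

    rc = step(in, inlen, out, &n);
    if (rc == 0 && n <= inlen) {
        *outlen = n;            /* conversion worked */
        return 0;
    }

    /* Any failure: hand back the original sequence unaltered. */
    memcpy(out, in, inlen * sizeof *in);
    *outlen = inlen;
    return rc != 0 ? rc : -1;   /* for debugging only */
}
```
@end verbatim

A caller of such a wrapper uses the output array regardless of the
return value, which is the usage pattern described above for
@code{idna_to_unicode}.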
1000
1001 @deftypefun {int} idna_to_ascii_from_ucs4 (const unsigned long * @var{input}, char ** @var{output}, int @var{allowunassigned}, int @var{usestd3asciirules})
1002
@var{input}:  zero-terminated input Unicode string.
1004
1005 @var{output}:  pointer to newly allocated output string.
1006
1007 @var{allowunassigned}:  whether to allow unassigned code points.
1008
1009 @var{usestd3asciirules}:  whether to check input for STD3 compliance.
1010
Converts a UCS-4 domain name to an ASCII string.  The domain name may
contain several labels, separated by dots.  The output buffer must
be deallocated by the caller.

Returns @code{IDNA_SUCCESS} on success, or an error code.
1016
1017 @end deftypefun
1018
1019 @deftypefun {int} idna_to_ascii_from_utf8 (const char * @var{input}, char ** @var{output}, int @var{allowunassigned}, int @var{usestd3asciirules})
1020
@var{input}:  zero-terminated input UTF-8 string.
1022
1023 @var{output}:  pointer to newly allocated output string.
1024
1025 @var{allowunassigned}:  whether to allow unassigned code points.
1026
1027 @var{usestd3asciirules}:  whether to check input for STD3 compliance.
1028
Converts a UTF-8 domain name to an ASCII string.  The domain name may
contain several labels, separated by dots.  The output buffer must
be deallocated by the caller.

Returns @code{IDNA_SUCCESS} on success, or an error code.
1034
1035 @end deftypefun
1036
1037 @deftypefun {int} idna_to_ascii_from_locale (const char * @var{input}, char ** @var{output}, int @var{allowunassigned}, int @var{usestd3asciirules})
1038
@var{input}:  zero-terminated input string encoded in the current
locale's character set.
1040
1041 @var{output}:  pointer to newly allocated output string.
1042
1043 @var{allowunassigned}:  whether to allow unassigned code points.
1044
1045 @var{usestd3asciirules}:  whether to check input for STD3 compliance.
1046
Converts a domain name in the locale's encoding to an ASCII string.
The domain name may contain several labels, separated by dots.  The
output buffer must be deallocated by the caller.

Returns @code{IDNA_SUCCESS} on success, or an error code.
1052
1053 @end deftypefun
1054
1055 @deftypefun {int} idna_to_unicode_ucs4_from_ucs4 (const unsigned long * @var{input}, unsigned long ** @var{output}, int @var{allowunassigned}, int @var{usestd3asciirules})
1056
1057 @var{input}:  zero-terminated Unicode string.
1058
1059 @var{output}:  pointer to newly allocated output Unicode string.
1060
1061 @var{allowunassigned}:  whether to allow unassigned code points.
1062
1063 @var{usestd3asciirules}:  whether to check input for STD3 compliance.
1064
Converts a possibly ACE-encoded domain name in UCS-4 format into a
UCS-4 string.  The domain name may contain several labels,
separated by dots.  The output buffer must be deallocated by the
caller.

Returns @code{IDNA_SUCCESS} on success, or an error code.
1071
1072 @end deftypefun
1073
1074 @deftypefun {int} idna_to_unicode_ucs4_from_utf8 (const char * @var{input}, unsigned long ** @var{output}, int @var{allowunassigned}, int @var{usestd3asciirules})
1075
1076 @var{input}:  zero-terminated UTF-8 string.
1077
1078 @var{output}:  pointer to newly allocated output Unicode string.
1079
1080 @var{allowunassigned}:  whether to allow unassigned code points.
1081
1082 @var{usestd3asciirules}:  whether to check input for STD3 compliance.
1083
Converts a possibly ACE-encoded domain name in UTF-8 format into a
UCS-4 string.  The domain name may contain several labels,
separated by dots.  The output buffer must be deallocated by the
caller.

Returns @code{IDNA_SUCCESS} on success, or an error code.
1090
1091 @end deftypefun
1092
1093 @deftypefun {int} idna_to_unicode_utf8_from_utf8 (const char * @var{input}, char ** @var{output}, int @var{allowunassigned}, int @var{usestd3asciirules})
1094
1095 @var{input}:  zero-terminated UTF-8 string.
1096
1097 @var{output}:  pointer to newly allocated output UTF-8 string.
1098
1099 @var{allowunassigned}:  whether to allow unassigned code points.
1100
1101 @var{usestd3asciirules}:  whether to check input for STD3 compliance.
1102
Converts a possibly ACE-encoded domain name in UTF-8 format into a
UTF-8 string.  The domain name may contain several labels,
separated by dots.  The output buffer must be deallocated by the
caller.

Returns @code{IDNA_SUCCESS} on success, or an error code.
1109
1110 @end deftypefun
1111
1112 @deftypefun {int} idna_to_unicode_locale_from_utf8 (const char * @var{input}, char ** @var{output}, int @var{allowunassigned}, int @var{usestd3asciirules})
1113
1114 @var{input}:  zero-terminated UTF-8 string.
1115
1116 @var{output}:  pointer to newly allocated output string encoded in the
1117 current locale's character set.
1118
1119 @var{allowunassigned}:  whether to allow unassigned code points.
1120
1121 @var{usestd3asciirules}:  whether to check input for STD3 compliance.
1122
Converts a possibly ACE-encoded domain name in UTF-8 format into a
string encoded in the current locale's character set.  The domain
name may contain several labels, separated by dots.  The output
buffer must be deallocated by the caller.

Returns @code{IDNA_SUCCESS} on success, or an error code.
1129
1130 @end deftypefun
1131
1132 @deftypefun {int} idna_to_unicode_locale_from_locale (const char * @var{input}, char ** @var{output}, int @var{allowunassigned}, int @var{usestd3asciirules})
1133
1134 @var{input}:  zero-terminated string encoded in the current locale's
1135 character set.
1136
1137 @var{output}:  pointer to newly allocated output string encoded in the
1138 current locale's character set.
1139
1140 @var{allowunassigned}:  whether to allow unassigned code points.
1141
1142 @var{usestd3asciirules}:  whether to check input for STD3 compliance.
1143
Converts a possibly ACE-encoded domain name in the locale's character
set into a string encoded in the current locale's character set.
The domain name may contain several labels, separated by dots.  The
output buffer must be deallocated by the caller.

Returns @code{IDNA_SUCCESS} on success, or an error code.
1150
1151 @end deftypefun
1152
1153 @c **********************************************************
1154 @c ***********************  Examples  ***********************
1155 @c **********************************************************
1156 @node Examples
1157 @chapter Examples
1158 @cindex Examples
1159
This chapter contains example code that illustrates how Libidn can
be used when writing your own application.
1162
1163 @menu
1164 * Example 1::           Example using stringprep.
1165 * Example 2::           Example using punycode.
1166 * Example 3::           Example using IDNA ToASCII.
1167 * Example 4::           Example using IDNA ToUnicode.
1168 @end menu
1169
1170 @node Example 1
1171 @section Example 1
1172
1173 This example demonstrates how the stringprep functions are used.
1174
1175 @example
1176 @include example.c.texi
1177 @end example
1178
1179
1180 @node Example 2
1181 @section Example 2
1182
1183 This example demonstrates how the punycode functions are used.
1184
1185 @example
1186 @include example2.c.texi
1187 @end example
1188
1189
1190 @node Example 3
1191 @section Example 3
1192
1193 This example demonstrates how the library is used to convert
1194 internationalized domain names into ASCII compatible names.
1195
1196 @example
1197 @include example3.c.texi
1198 @end example
1199
1200
1201 @node Example 4
1202 @section Example 4
1203
1204 This example demonstrates how the library is used to convert ASCII
1205 compatible names to internationalized domain names.
1206
1207 @example
1208 @include example4.c.texi
1209 @end example
1210
1211 @c **********************************************************
1212 @c *********************  Invoking idn  *********************
1213 @c **********************************************************
1214 @node Invoking idn
1215 @chapter Invoking idn
1216
1217 @pindex idn
1218 @cindex invoking @command{idn}
1219 @cindex command line
1220
1221 @majorheading Name
1222
1223 GNU Libidn (idn) -- Internationalized Domain Names command line tool
1224
1225 @majorheading Description
@code{idn} is a utility that is part of GNU Libidn.  It allows
preparation of strings, encoding and decoding of Punycode data, and
IDNA ToASCII/ToUnicode operations to be performed on the command
line, without the need to write a program that uses Libidn.

Data is read, line by line, from the standard input, one of the
operations indicated by the command parameters is performed, and the
output is printed to standard output.  If any errors are encountered,
the execution of the application is aborted.
1235
1236 @majorheading Options
@code{idn} recognizes these options:
1238
1239 @verbatim
1240        -h  --help
1241               Print help and exit
1242
1243        -V  --version
1244               Print version and exit
1245
       -s  --stringprep
1247               Prepare string according to nameprep profile
1248
1249        -e  --punycode-encode
1250               Encode UTF-8 to Punycode
1251
1252        -d  --punycode-decode
1253               Decode Punycode to UTF-8
1254
1255        -a  --idna-to-ascii
1256               Convert UTF-8 to ACE according to IDNA
1257
1258        -u  --idna-to-unicode
1259               Convert ACE to UTF-8 according to IDNA
1260
1261        --allow-unassigned
1262               Toggle IDNA AllowUnassigned flag (default=off)
1263
1264        --usestd3asciirules
1265               Toggle IDNA UseSTD3ASCIIRules flag (default=off)
1266
1267        -pSTRING   --profile=STRING
1268               Use specified stringprep profile instead
1269
1270               Valid stringprep profiles are 'generic', 'Nameprep',
1271               'KRBprep', 'Nodeprep', 'Resourceprep', 'plain',
1272               'SASLprep', and 'ISCSIprep'.
1273
1274        --debug
1275               Print debugging information (default=off)
1276
1277        --quiet
1278               Don't print the welcome greeting (default=off)
1279 @end verbatim
1280
1281 @majorheading Environment Variables
1282
The @env{CHARSET} environment variable can be used to override the
character set used for decoding incoming data on the standard input
and for encoding data to the standard output.  If your system is set
up correctly, the application will guess which character set is used
automatically.  Example usage:
1288
1289 @verbatim
1290 $ CHARSET=ISO-8859-1 idn --punycode-encode
1291 ...
1292 @end verbatim
1293
1294 @node Emacs API
1295 @chapter Emacs API
1296
Included in Libidn are @file{punycode.el} and @file{idna.el}, which
provide an Emacs Lisp API to (a limited set of) the Libidn API.  This
chapter describes the API.
1300
1301 @defvar punycode-program
1302 Name of the GNU Libidn @file{idn} application.  The default is
1303 @samp{idn}.  This variable can be customized.
1304 @end defvar
1305
1306 @defvar punycode-environment
1307 List of environment variable definitions prepended to
1308 @samp{process-environment}.  The default is @samp{("CHARSET=UTF-8")}.
1309 This variable can be customized.
1310 @end defvar
1311
1312 @defvar punycode-encode-parameters
1313 List of parameters passed to @var{punycode-program} to invoke punycode
1314 encoding mode.  The default is @samp{("--quiet" "--punycode-encode")}.
1315 This variable can be customized.
1316 @end defvar
1317
1318 @defvar punycode-decode-parameters
1319 Parameters passed to @var{punycode-program} to invoke punycode
1320 decoding mode.  The default is @samp{("--quiet" "--punycode-decode")}.
1321 This variable can be customized.
1322 @end defvar
1323
1324 @defun punycode-encode string
1325 Returns a Punycode encoding of the @var{string}, after converting the
1326 input into UTF-8.
1327 @end defun
1328
1329 @defun punycode-decode string
Returns a possibly multibyte string which is the decoding of the
Punycode-encoded @var{string}.
1332 @end defun
1333
1334 @defvar idna-program
1335 Name of the GNU Libidn @file{idn} application.  The default is
1336 @samp{idn}.  This variable can be customized.
1337 @end defvar
1338
1339 @defvar idna-environment
1340 List of environment variable definitions prepended to
1341 @samp{process-environment}.  The default is @samp{("CHARSET=UTF-8")}.
1342 This variable can be customized.
1343 @end defvar
1344
1345 @defvar idna-to-ascii-parameters
1346 List of parameters passed to @var{idna-program} to invoke IDNA ToASCII
1347 mode.  The default is @samp{("--quiet" "--idna-to-ascii")}.  This
1348 variable can be customized.
1349 @end defvar
1350
1351 @defvar idna-to-unicode-parameters
Parameters passed to @var{idna-program} to invoke IDNA ToUnicode mode.
1353 The default is @samp{("--quiet" "--idna-to-unicode")}.  This variable
1354 can be customized.
1355 @end defvar
1356
1357 @defun idna-to-ascii string
1358 Returns an ASCII Compatible Encoding (ACE) of the string computed by
1359 the IDNA ToASCII operation on the input @var{string}, after converting
1360 the input to UTF-8.
1361 @end defun
1362
1363 @defun idna-to-unicode string
1364 Returns a possibly multibyte string which is the output of the IDNA
1365 ToUnicode operation computed on the input @var{string}.
1366 @end defun
1367
1368 @c **********************************************************
1369 @c *******************  Acknowledgements  *******************
1370 @c **********************************************************
1371 @node Acknowledgements
1372 @chapter Acknowledgements
1373
1374 The punycode code was taken from the IETF IDN Punycode specification,
1375 by Adam M. Costello.
1376
Some functions (see @file{nfkc.c} and @file{toutf8.c}) have been
borrowed from GLib, downloaded from www.gtk.org.
1379
Several people reported bugs, sent patches or suggested improvements;
see the file @file{THANKS}.
1382
1383 @node Concept Index
1384 @unnumbered Concept Index
1385
1386 @printindex cp
1387
1388 @node Function and Variable Index
1389 @unnumbered Function and Variable Index
1390
1391 @printindex fn
1392
1393 @include lgpl.texi
1394
1395 @node Copying This Manual
1396 @appendix Copying This Manual
1397
1398 @menu
1399 * GNU Free Documentation License::  License for copying this manual.
1400 @end menu
1401
1402 @include fdl.texi
1403
1404 @bye