manual/locale.texi

   1 @node Locales, Message Translation, Character Set Handling, Top
   2 @c %MENU% The country and language can affect the behavior of library functions
   3 @chapter Locales and Internationalization
   4
   5 Different countries and cultures have varying conventions for how to
   6 communicate.  These conventions range from very simple ones, such as the
   7 format for representing dates and times, to very complex ones, such as
   8 the language spoken.
   9
  10 @cindex internationalization
  11 @cindex locales
  12 @dfn{Internationalization} of software means programming it to be able
  13 to adapt to the user's favorite conventions.  In @w{ISO C},
  14 internationalization works by means of @dfn{locales}.  Each locale
  15 specifies a collection of conventions, one convention for each purpose.
  16 The user chooses a set of conventions by specifying a locale (via
  17 environment variables).
  18
  19 All programs inherit the chosen locale as part of their environment.
  20 Provided the programs are written to obey the choice of locale, they
  21 will follow the conventions preferred by the user.
  22
  23 @menu
  24 * Effects of Locale::           Actions affected by the choice of
  25                                  locale.
  26 * Choosing Locale::             How the user specifies a locale.
  27 * Locale Categories::           Different purposes for which you can
  28                                  select a locale.
  29 * Setting the Locale::          How a program specifies the locale
  30                                  with library functions.
  31 * Standard Locales::            Locale names available on all systems.
  32 * Locale Names::                Format of system-specific locale names.
  33 * Locale Information::          How to access the information for the locale.
  34 * Formatting Numbers::          A dedicated function to format numbers.
  35 * Yes-or-No Questions::         Check a Response against the locale.
  36 @end menu
  37
  38 @node Effects of Locale, Choosing Locale,  , Locales
  39 @section What Effects a Locale Has
  40
  41 Each locale specifies conventions for several purposes, including the
  42 following:
  43
  44 @itemize @bullet
  45 @item
  46 What multibyte character sequences are valid, and how they are
  47 interpreted (@pxref{Character Set Handling}).
  48
  49 @item
  50 Classification of which characters in the local character set are
  51 considered alphabetic, and upper- and lower-case conversion conventions
  52 (@pxref{Character Handling}).
  53
  54 @item
  55 The collating sequence for the local language and character set
  56 (@pxref{Collation Functions}).
  57
  58 @item
  59 Formatting of numbers and currency amounts (@pxref{General Numeric}).
  60
  61 @item
  62 Formatting of dates and times (@pxref{Formatting Calendar Time}).
  63
  64 @item
  65 What language to use for output, including error messages
  66 (@pxref{Message Translation}).
  67
  68 @item
  69 What language to use for user answers to yes-or-no questions
  70 (@pxref{Yes-or-No Questions}).
  71
  72 @item
  73 What language to use for more complex user input.
  74 (The C library doesn't yet help you implement this.)
  75 @end itemize
  76
  77 Some aspects of adapting to the specified locale are handled
  78 automatically by the library subroutines.  For example, all your program
  79 needs to do in order to use the collating sequence of the chosen locale
  80 is to use @code{strcoll} or @code{strxfrm} to compare strings.
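
As a minimal sketch of this, the following sorts an array of strings with @code{qsort}, using @code{strcoll} as the comparison.  The helper names @code{compare_by_locale} and @code{sort_strings_by_locale} are made up for illustration and are not part of the C library.  The result depends on the locale selected for @code{LC_COLLATE} at the time of the call.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: orders two strings according to the
   LC_COLLATE category of the current locale.  */
static int
compare_by_locale (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcoll (*sa, *sb);
}

/* Sort COUNT strings in place using the locale's collating sequence.
   (Hypothetical helper name, for illustration only.)  */
void
sort_strings_by_locale (const char **strings, size_t count)
{
  qsort (strings, count, sizeof *strings, compare_by_locale);
}
```

If the same strings are compared many times, transforming them once with @code{strxfrm} and comparing the results with @code{strcmp} may be cheaper.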
  81
  82 Other aspects of locales are beyond the comprehension of the library.
  83 For example, the library can't automatically translate your program's
  84 output messages into other languages.  The only way you can support
  85 output in the user's favorite language is to program this more or less
  86 by hand.  The C library provides functions to handle translations for
  87 multiple languages easily.
  88
  89 This chapter discusses the mechanism by which you can modify the current
  90 locale.  The effects of the current locale on specific library functions
  91 are discussed in more detail in the descriptions of those functions.
  92
  93 @node Choosing Locale, Locale Categories, Effects of Locale, Locales
  94 @section Choosing a Locale
  95
  96 The simplest way for the user to choose a locale is to set the
  97 environment variable @code{LANG}.  This specifies a single locale to use
  98 for all purposes.  For example, a user could specify a hypothetical
  99 locale named @samp{espana-castellano} to use the standard conventions of
 100 most of Spain.
 101
The set of locales supported depends on the operating system you are
using, and so do their names, except that the standard locale, called
@samp{C} or @samp{POSIX}, always exists.  @xref{Locale Names}.
 105
 106 In order to force the system to always use the default locale, the
 107 user can set the @code{LC_ALL} environment variable to @samp{C}.
 108
 109 @cindex combining locales
 110 A user also has the option of specifying different locales for
 111 different purposes---in effect, choosing a mixture of multiple
 112 locales.  @xref{Locale Categories}.
 113
 114 For example, the user might specify the locale @samp{espana-castellano}
 115 for most purposes, but specify the locale @samp{usa-english} for
 116 currency formatting.  This might make sense if the user is a
 117 Spanish-speaking American, working in Spanish, but representing monetary
 118 amounts in US dollars.
 119
 120 Note that both locales @samp{espana-castellano} and @samp{usa-english},
 121 like all locales, would include conventions for all of the purposes to
 122 which locales apply.  However, the user can choose to use each locale
 123 for a particular subset of those purposes.
 124
 125 @node Locale Categories, Setting the Locale, Choosing Locale, Locales
 126 @section Locale Categories
 127 @cindex categories for locales
 128 @cindex locale categories
 129
 130 The purposes that locales serve are grouped into @dfn{categories}, so
 131 that a user or a program can choose the locale for each category
 132 independently.  Here is a table of categories; each name is both an
 133 environment variable that a user can set, and a macro name that you can
 134 use as the first argument to @code{setlocale}.
 135
The contents of the environment variable (or the string in the second
argument to @code{setlocale}) have to be a valid locale name.
@xref{Locale Names}.
 139
 140 @vtable @code
 141 @comment locale.h
 142 @comment ISO
 143 @item LC_COLLATE
 144 This category applies to collation of strings (functions @code{strcoll}
 145 and @code{strxfrm}); see @ref{Collation Functions}.
 146
 147 @comment locale.h
 148 @comment ISO
 149 @item LC_CTYPE
 150 This category applies to classification and conversion of characters,
 151 and to multibyte and wide characters;
 152 see @ref{Character Handling}, and @ref{Character Set Handling}.
 153
 154 @comment locale.h
 155 @comment ISO
 156 @item LC_MONETARY
 157 This category applies to formatting monetary values; see @ref{General Numeric}.
 158
 159 @comment locale.h
 160 @comment ISO
 161 @item LC_NUMERIC
 162 This category applies to formatting numeric values that are not
 163 monetary; see @ref{General Numeric}.
 164
 165 @comment locale.h
 166 @comment ISO
 167 @item LC_TIME
 168 This category applies to formatting date and time values; see
 169 @ref{Formatting Calendar Time}.
 170
 171 @comment locale.h
 172 @comment XOPEN
 173 @item LC_MESSAGES
 174 This category applies to selecting the language used in the user
 175 interface for message translation (@pxref{The Uniforum approach};
 176 @pxref{Message catalogs a la X/Open})  and contains regular expressions
 177 for affirmative and negative responses.
 178
 179 @comment locale.h
 180 @comment ISO
 181 @item LC_ALL
This is not a category; it is only a macro that you can use
with @code{setlocale} to set a single locale for all purposes.  Setting
this environment variable overrides all selections made by the other
@code{LC_*} variables or @code{LANG}.
 186
 187 @comment locale.h
 188 @comment ISO
 189 @item LANG
 190 If this environment variable is defined, its value specifies the locale
 191 to use for all purposes except as overridden by the variables above.
 192 @end vtable
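
The precedence among these variables can be sketched as follows: @code{LC_ALL} first, then the variable named after the category, then @code{LANG}.  The function name @code{locale_name_from_environment} is hypothetical.  Treating an empty value as unset and falling back to @samp{C} when nothing is set are assumptions of this sketch.  The code only illustrates the lookup order; it is not how @code{setlocale} is implemented.

```c
#include <stdlib.h>

/* Return the locale name that the environment selects for the
   category whose variable is CATEGORY_VAR (for example "LC_TIME").
   Assumption: empty values count as unset, and "C" is the fallback.  */
const char *
locale_name_from_environment (const char *category_var)
{
  const char *candidates[3];
  candidates[0] = getenv ("LC_ALL");      /* overrides everything */
  candidates[1] = getenv (category_var);  /* the specific category */
  candidates[2] = getenv ("LANG");        /* general default */

  for (int i = 0; i < 3; i++)
    if (candidates[i] != NULL && candidates[i][0] != '\0')
      return candidates[i];
  return "C";
}
```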
 193
 194 @vindex LANGUAGE
When the message translation functions were developed, the
functionality provided by the variables above was felt to be
insufficient.  For example, it should be possible to specify more than
one locale name.  Take a Swedish user who speaks German better than
English, and a program whose messages are output in English by default.
It should be possible to specify that the first choice of language is
Swedish, the second German, and, if that also fails, English.  This is
possible with the variable @code{LANGUAGE}.  For further description of
this GNU extension see @ref{Using gettextized software}.
 204
 205 @node Setting the Locale, Standard Locales, Locale Categories, Locales
 206 @section How Programs Set the Locale
 207
 208 A C program inherits its locale environment variables when it starts up.
 209 This happens automatically.  However, these variables do not
 210 automatically control the locale used by the library functions, because
 211 @w{ISO C} says that all programs start by default in the standard @samp{C}
 212 locale.  To use the locales specified by the environment, you must call
 213 @code{setlocale}.  Call it as follows:
 214
 215 @smallexample
 216 setlocale (LC_ALL, "");
 217 @end smallexample
 218
 219 @noindent
to select a locale based on the user's choice of the appropriate
environment variables.
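
A program might wrap this call in a small startup routine that also checks the result.  The following is only a sketch; the name @code{adopt_user_locale} is hypothetical, and printing a warning is just one possible reaction.  It relies on the documented behavior that a failed call returns a null pointer and leaves the current locale unchanged.

```c
#include <locale.h>
#include <stdio.h>

/* Adopt the locale selected by the environment and return the name of
   the locale in effect afterwards.  (Hypothetical helper.)  */
const char *
adopt_user_locale (void)
{
  const char *name = setlocale (LC_ALL, "");
  if (name == NULL)
    {
      /* The environment names a locale that cannot be used.  The
         previous locale is still in effect, so report and query it.  */
      fputs ("warning: requested locale is not available\n", stderr);
      name = setlocale (LC_ALL, NULL);
    }
  return name;
}
```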
 222
 223 @cindex changing the locale
 224 @cindex locale, changing
 225 You can also use @code{setlocale} to specify a particular locale, for
 226 general use or for a specific category.
 227
 228 @pindex locale.h
 229 The symbols in this section are defined in the header file @file{locale.h}.
 230
 231 @comment locale.h
 232 @comment ISO
 233 @deftypefun {char *} setlocale (int @var{category}, const char *@var{locale})
 234 @safety{@prelim{}@mtunsafe{@mtasuconst{:@mtslocale{}} @mtsenv{}}@asunsafe{@asuinit{} @asulock{} @ascuheap{} @asucorrupt{}}@acunsafe{@acuinit{} @acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
 235 @c Uses of the global locale object are unguarded in functions that
 236 @c ought to be MT-Safe, so we're ruling out the use of this function
 237 @c once threads are started.  It takes a write lock itself, but it may
 238 @c return a pointer loaded from the global locale object after releasing
 239 @c the lock, or before taking it.
 240 @c setlocale @mtasuconst:@mtslocale @mtsenv @asuinit @ascuheap @asulock @asucorrupt @acucorrupt @acsmem @acsfd @aculock
 241 @c  libc_rwlock_wrlock @asulock @aculock
 242 @c  libc_rwlock_unlock @aculock
 243 @c  getenv LOCPATH @mtsenv
 244 @c  malloc @ascuheap @acsmem
 245 @c  free @ascuheap @acsmem
 246 @c  new_composite_name ok
 247 @c  setdata ok
 248 @c  setname ok
 249 @c  _nl_find_locale @mtsenv @asuinit @ascuheap @asulock @asucorrupt @acucorrupt @acsmem @acsfd @aculock
 250 @c   getenv LC_ALL and LANG @mtsenv
 251 @c   _nl_load_locale_from_archive @ascuheap @acucorrupt @acsmem @acsfd
 252 @c    sysconf _SC_PAGE_SIZE ok
 253 @c    _nl_normalize_codeset @ascuheap @acsmem
 254 @c     isalnum_l ok (C locale)
 255 @c     isdigit_l ok (C locale)
 256 @c     malloc @ascuheap @acsmem
 257 @c     tolower_l ok (C locale)
 258 @c    open_not_cancel_2 @acsfd
 259 @c    fxstat64 ok
 260 @c    close_not_cancel_no_status ok
 261 @c    __mmap64 @acsmem
 262 @c    calculate_head_size ok
 263 @c    __munmap ok
 264 @c    compute_hashval ok
 265 @c    qsort dup @acucorrupt
 266 @c     rangecmp ok
 267 @c    malloc @ascuheap @acsmem
 268 @c    strdup @ascuheap @acsmem
 269 @c    _nl_intern_locale_data @ascuheap @acsmem
 270 @c     malloc @ascuheap @acsmem
 271 @c     free @ascuheap @acsmem
 272 @c   _nl_expand_alias @ascuheap @asulock @acsmem @acsfd @aculock
 273 @c    libc_lock_lock @asulock @aculock
 274 @c    bsearch ok
 275 @c     alias_compare ok
 276 @c      strcasecmp ok
 277 @c    read_alias_file @ascuheap @asulock @acsmem @acsfd @aculock
 278 @c     fopen @ascuheap @asulock @acsmem @acsfd @aculock
 279 @c     fsetlocking ok
 280 @c     feof_unlocked ok
 281 @c     fgets_unlocked ok
 282 @c     isspace ok (locale mutex is locked)
 283 @c     extend_alias_table @ascuheap @acsmem
 284 @c      realloc @ascuheap @acsmem
 285 @c     realloc @ascuheap @acsmem
 286 @c     fclose @ascuheap @asulock @acsmem @acsfd @aculock
 287 @c     qsort @ascuheap @acsmem
 288 @c      alias_compare dup
 289 @c    libc_lock_unlock @aculock
 290 @c   _nl_explode_name @ascuheap @acsmem
 291 @c    _nl_find_language ok
 292 @c    _nl_normalize_codeset dup @ascuheap @acsmem
 293 @c   _nl_make_l10nflist @ascuheap @acsmem
 294 @c    malloc @ascuheap @acsmem
 295 @c    free @ascuheap @acsmem
 296 @c    __argz_stringify ok
 297 @c    __argz_count ok
 298 @c    __argz_next ok
 299 @c   _nl_load_locale @ascuheap @acsmem @acsfd
 300 @c    open_not_cancel_2 @acsfd
 301 @c    __fxstat64 ok
 302 @c    close_not_cancel_no_status ok
 303 @c    mmap @acsmem
 304 @c    malloc @ascuheap @acsmem
 305 @c    read_not_cancel ok
 306 @c    free @ascuheap @acsmem
 307 @c    _nl_intern_locale_data dup @ascuheap @acsmem
 308 @c    munmap ok
 309 @c   __gconv_compare_alias @asuinit @ascuheap @asucorrupt @asulock @acsmem@acucorrupt @acsfd @aculock
 310 @c    __gconv_read_conf @asuinit @ascuheap @asucorrupt @asulock @acsmem@acucorrupt @acsfd @aculock
 311 @c     (libc_once-initializes gconv_cache and gconv_path_envvar; they're
 312 @c      never modified afterwards)
 313 @c     __gconv_load_cache @ascuheap @acsmem @acsfd
 314 @c      getenv GCONV_PATH @mtsenv
 315 @c      open_not_cancel @acsfd
 316 @c      __fxstat64 ok
 317 @c      close_not_cancel_no_status ok
 318 @c      mmap @acsmem
 319 @c      malloc @ascuheap @acsmem
 320 @c      __read ok
 321 @c      free @ascuheap @acsmem
 322 @c      munmap ok
 323 @c     __gconv_get_path @asulock @ascuheap @aculock @acsmem @acsfd
 324 @c      getcwd @ascuheap @acsmem @acsfd
 325 @c      libc_lock_lock @asulock @aculock
 326 @c      malloc @ascuheap @acsmem
 327 @c      strtok_r ok
 328 @c      libc_lock_unlock @aculock
 329 @c     read_conf_file @ascuheap @asucorrupt @asulock @acsmem @acucorrupt @acsfd @aculock
 330 @c      fopen @ascuheap @asulock @acsmem @acsfd @aculock
 331 @c      fsetlocking ok
 332 @c      feof_unlocked ok
 333 @c      getdelim @ascuheap @asucorrupt @acsmem @acucorrupt
 334 @c      isspace_l ok (C locale)
 335 @c      add_alias
 336 @c       isspace_l ok (C locale)
 337 @c       toupper_l ok (C locale)
 338 @c       add_alias2 dup @ascuheap @acucorrupt @acsmem
 339 @c      add_module @ascuheap @acsmem
 340 @c       isspace_l ok (C locale)
 341 @c       toupper_l ok (C locale)
 342 @c       strtol ok (@mtslocale but we hold the locale lock)
 343 @c       tfind __gconv_alias_db ok
 344 @c        __gconv_alias_compare dup ok
 345 @c       calloc @ascuheap @acsmem
 346 @c       insert_module dup @ascuheap
 347 @c     __tfind ok (because the tree is read only by then)
 348 @c      __gconv_alias_compare dup ok
 349 @c     insert_module @ascuheap
 350 @c      free @ascuheap
 351 @c     add_alias2 @ascuheap @acucorrupt @acsmem
 352 @c      detect_conflict ok, reads __gconv_modules_db
 353 @c      malloc @ascuheap @acsmem
 354 @c      tsearch __gconv_alias_db @ascuheap @acucorrupt @acsmem [exclusive tree, no @mtsrace]
 355 @c       __gconv_alias_compare ok
 356 @c      free @ascuheap
 357 @c    __gconv_compare_alias_cache ok
 358 @c     find_module_idx ok
 359 @c    do_lookup_alias ok
 360 @c     __tfind ok (because the tree is read only by then)
 361 @c      __gconv_alias_compare ok
 362 @c   strndup @ascuheap @acsmem
 363 @c   strcasecmp_l ok (C locale)
 364 The function @code{setlocale} sets the current locale for category
 365 @var{category} to @var{locale}.
 366
 367 If @var{category} is @code{LC_ALL}, this specifies the locale for all
 368 purposes.  The other possible values of @var{category} specify a
 369 single purpose (@pxref{Locale Categories}).
 370
 371 You can also use this function to find out the current locale by passing
 372 a null pointer as the @var{locale} argument.  In this case,
 373 @code{setlocale} returns a string that is the name of the locale
 374 currently selected for category @var{category}.
 375
 376 The string returned by @code{setlocale} can be overwritten by subsequent
 377 calls, so you should make a copy of the string (@pxref{Copying Strings
 378 and Arrays}) if you want to save it past any further calls to
 379 @code{setlocale}.  (The standard library is guaranteed never to call
 380 @code{setlocale} itself.)
 381
You should not modify the string returned by @code{setlocale}.  It might
be the same string that was passed as an argument in a previous call to
@code{setlocale}.  If you later pass the returned string back as the
@var{locale} argument, @var{category} must be the same as in the call
that returned the string.
 387
 388 When you read the current locale for category @code{LC_ALL}, the value
 389 encodes the entire combination of selected locales for all categories.
 390 If you specify the same ``locale name'' with @code{LC_ALL} in a
 391 subsequent call to @code{setlocale}, it restores the same combination
 392 of locale selections.
 393
 394 To be sure you can use the returned string encoding the currently selected
 395 locale at a later time, you must make a copy of the string.  It is not
 396 guaranteed that the returned pointer remains valid over time.
 397
 398 When the @var{locale} argument is not a null pointer, the string returned
 399 by @code{setlocale} reflects the newly-modified locale.
 400
 401 If you specify an empty string for @var{locale}, this means to read the
 402 appropriate environment variable and use its value to select the locale
 403 for @var{category}.
 404
 405 If a nonempty string is given for @var{locale}, then the locale of that
 406 name is used if possible.
 407
 408 The effective locale name (either the second argument to
 409 @code{setlocale}, or if the argument is an empty string, the name
 410 obtained from the process environment) must be a valid locale name.
 411 @xref{Locale Names}.
 412
 413 If you specify an invalid locale name, @code{setlocale} returns a null
 414 pointer and leaves the current locale unchanged.
 415 @end deftypefun
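
The query form and the advice to copy the result can be combined in a small helper.  This is a sketch only; @code{copy_locale_name} is a hypothetical name, and the fixed-size caller buffer is a design choice made here to avoid dynamic allocation.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Copy the name of the locale currently selected for CATEGORY into
   BUF (capacity SIZE).  Returns 0 on success, or -1 if the name does
   not fit.  (Hypothetical helper, not part of the C library.)  */
int
copy_locale_name (int category, char *buf, size_t size)
{
  /* A null LOCALE argument queries without changing anything.  */
  const char *name = setlocale (category, NULL);
  if (name == NULL || strlen (name) >= size)
    return -1;
  strcpy (buf, name);
  return 0;
}
```

Because the name is copied out immediately, later calls to @code{setlocale} cannot invalidate it.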
 416
 417 Here is an example showing how you might use @code{setlocale} to
 418 temporarily switch to a new locale.
 419
 420 @smallexample
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void
with_other_locale (const char *new_locale,
                   void (*subroutine) (int),
                   int argument)
@{
  char *old_locale, *saved_locale;

  /* @r{Get the name of the current locale.}  */
  old_locale = setlocale (LC_ALL, NULL);

  /* @r{Copy the name so it won't be clobbered by @code{setlocale}.} */
  saved_locale = strdup (old_locale);
  if (saved_locale == NULL)
    @{
      fputs ("Out of memory\n", stderr);
      exit (EXIT_FAILURE);
    @}

  /* @r{Now change the locale and do some stuff with it.} */
  setlocale (LC_ALL, new_locale);
  (*subroutine) (argument);

  /* @r{Restore the original locale.} */
  setlocale (LC_ALL, saved_locale);
  free (saved_locale);
@}
 449 @end smallexample
 450
 451 @strong{Portability Note:} Some @w{ISO C} systems may define additional
 452 locale categories, and future versions of the library will do so.  For
 453 portability, assume that any symbol beginning with @samp{LC_} might be
 454 defined in @file{locale.h}.
 455
 456 @node Standard Locales, Locale Names, Setting the Locale, Locales
 457 @section Standard Locales
 458
 459 The only locale names you can count on finding on all operating systems
 460 are these three standard ones:
 461
 462 @table @code
 463 @item "C"
 464 This is the standard C locale.  The attributes and behavior it provides
 465 are specified in the @w{ISO C} standard.  When your program starts up, it
 466 initially uses this locale by default.
 467
 468 @item "POSIX"
 469 This is the standard POSIX locale.  Currently, it is an alias for the
 470 standard C locale.
 471
 472 @item ""
 473 The empty name says to select a locale based on environment variables.
 474 @xref{Locale Categories}.
 475 @end table
 476
 477 Defining and installing named locales is normally a responsibility of
 478 the system administrator at your site (or the person who installed
 479 @theglibc{}).  It is also possible for the user to create private
 480 locales.  All this will be discussed later when describing the tool to
 481 do so.
 482 @comment (@pxref{Building Locale Files}).
 483
 484 If your program needs to use something other than the @samp{C} locale,
 485 it will be more portable if you use whatever locale the user specifies
 486 with the environment, rather than trying to specify some non-standard
 487 locale explicitly by name.  Remember, different machines might have
 488 different sets of locales installed.
 489
 490 @node Locale Names, Locale Information, Standard Locales, Locales
 491 @section Locale Names
 492
 493 The following command prints a list of locales supported by the
 494 system:
 495
 496 @pindex locale
 497 @smallexample
 498   locale -a
 499 @end smallexample
 500
 501 @strong{Portability Note:} With the notable exception of the standard
 502 locale names @samp{C} and @samp{POSIX}, locale names are
 503 system-specific.
 504
 505 Most locale names follow XPG syntax and consist of up to four parts:
 506
 507 @smallexample
 508 @var{language}[_@var{territory}[.@var{codeset}]][@@@var{modifier}]
 509 @end smallexample
 510
All parts except the first may be omitted.  If the fully specified
locale is not found, less specific ones are looked for.  The various
parts will be stripped off, in the following order:
 514
 515 @enumerate
 516 @item
 517 codeset
 518 @item
 519 normalized codeset
 520 @item
 521 territory
 522 @item
 523 modifier
 524 @end enumerate
 525
 526 For example, the locale name @samp{de_AT.iso885915@@euro} denotes a
 527 German-language locale for use in Austria, using the ISO-8859-15
 528 (Latin-9) character set, and with the Euro as the currency symbol.
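
To make the syntax concrete, here is a sketch of a function that splits such a name into its parts.  The type @code{struct locale_parts} and the function @code{split_locale_name} are hypothetical and exist only for this illustration; the library does not export such an interface.  The fixed field size of 32 bytes is an arbitrary assumption, and the parser is slightly more lenient than the syntax above, since it also accepts a codeset without a territory.

```c
#include <stddef.h>
#include <string.h>

/* Parts of an XPG-style locale name.  Missing parts are empty
   strings.  (Hypothetical type, not part of the C library.)  */
struct locale_parts
{
  char language[32];
  char territory[32];
  char codeset[32];
  char modifier[32];
};

/* Copy LEN bytes of SRC into DST (capacity CAP), truncating if
   necessary, and terminate the result.  */
static void
copy_part (char *dst, size_t cap, const char *src, size_t len)
{
  if (len >= cap)
    len = cap - 1;
  memcpy (dst, src, len);
  dst[len] = '\0';
}

/* Split NAME of the form language[_territory[.codeset]][@modifier].  */
void
split_locale_name (const char *name, struct locale_parts *out)
{
  const char *p = name;
  size_t len;

  /* language: everything up to the first '_', '.' or '@'.  */
  len = strcspn (p, "_.@");
  copy_part (out->language, sizeof out->language, p, len);
  p += len;

  /* territory: optional, introduced by '_'.  */
  len = 0;
  if (*p == '_')
    {
      p++;
      len = strcspn (p, ".@");
    }
  copy_part (out->territory, sizeof out->territory, p, len);
  p += len;

  /* codeset: optional, introduced by '.'.  */
  len = 0;
  if (*p == '.')
    {
      p++;
      len = strcspn (p, "@");
    }
  copy_part (out->codeset, sizeof out->codeset, p, len);
  p += len;

  /* modifier: optional, introduced by '@', runs to the end.  */
  len = 0;
  if (*p == '@')
    {
      p++;
      len = strlen (p);
    }
  copy_part (out->modifier, sizeof out->modifier, p, len);
}
```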
 529
In addition to locale names which follow XPG syntax, systems may
provide aliases such as @samp{german}.  Neither kind of name may
contain the slash character @samp{/}.
 533
 534 If the locale name starts with a slash @samp{/}, it is treated as a
 535 path relative to the configured locale directories; see @code{LOCPATH}
 536 below.  The specified path must not contain a component @samp{..}, or
 537 the name is invalid, and @code{setlocale} will fail.
 538
 539 @strong{Portability Note:} POSIX suggests that if a locale name starts
 540 with a slash @samp{/}, it is resolved as an absolute path.  However,
 541 @theglibc{} treats it as a relative path under the directories listed
 542 in @code{LOCPATH} (or the default locale directory if @code{LOCPATH}
 543 is unset).
 544
 545 Locale names which are longer than an implementation-defined limit are
 546 invalid and cause @code{setlocale} to fail.
 547
 548 As a special case, locale names used with @code{LC_ALL} can combine
 549 several locales, reflecting different locale settings for different
 550 categories.  For example, you might want to use a U.S. locale with ISO
 551 A4 paper format, so you set @code{LANG} to @samp{en_US.UTF-8}, and
 552 @code{LC_PAPER} to @samp{de_DE.UTF-8}.  In this case, the
 553 @code{LC_ALL}-style combined locale name is
 554
 555 @smallexample
 556 LC_CTYPE=en_US.UTF-8;LC_TIME=en_US.UTF-8;LC_PAPER=de_DE.UTF-8;@dots{}
 557 @end smallexample
 558
 559 followed by other category settings not shown here.
 560
 561 @vindex LOCPATH
 562 The path used for finding locale data can be set using the
 563 @code{LOCPATH} environment variable.  This variable lists the
 564 directories in which to search for locale definitions, separated by a
 565 colon @samp{:}.
 566
 567 The default path for finding locale data is system specific.  A typical
 568 value for the @code{LOCPATH} default is:
 569
 570 @smallexample
 571 /usr/share/locale
 572 @end smallexample
 573
 574 The value of @code{LOCPATH} is ignored by privileged programs for
 575 security reasons, and only the default directory is used.
 576
 577 @node Locale Information, Formatting Numbers, Locale Names, Locales
 578 @section Accessing Locale Information
 579
 580 There are several ways to access locale information.  The simplest
 581 way is to let the C library itself do the work.  Several of the
 582 functions in this library implicitly access the locale data, and use
 583 what information is provided by the currently selected locale.  This is
 584 how the locale model is meant to work normally.
 585
As an example take the @code{strftime} function, which is meant to nicely
format date and time information (@pxref{Formatting Calendar Time}).
Part of the standard information contained in the @code{LC_TIME}
category is the names of the months and of the days of the week.
Instead of requiring the programmer to provide the translations, the
@code{strftime} function does this all by itself.  @code{%A}
in the format string is replaced by the appropriate weekday
name of the locale currently selected by @code{LC_TIME}.  This is an
easy example, and wherever possible functions do things automatically
in this way.
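
A minimal sketch of this implicit use follows.  The wrapper name @code{weekday_name} is hypothetical; the only library call is @code{strftime} with the @code{%A} conversion.

```c
#include <stddef.h>
#include <time.h>

/* Store in BUF the full weekday name for TM, as spelled by the locale
   currently selected for LC_TIME.  Returns the number of bytes
   written, or 0 if BUF is too small.  (Hypothetical helper.)  */
size_t
weekday_name (char *buf, size_t size, const struct tm *tm)
{
  return strftime (buf, size, "%A", tm);
}
```

The caller never handles a translation table; switching the @code{LC_TIME} locale with @code{setlocale} is enough to change the spelling of the result.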
 596
But quite often there is simply no function to perform the task, or it
is not possible to do the work automatically.  For these cases it is
necessary to access the information in the locale directly.  To do this
the C library provides two functions: @code{localeconv} and
@code{nl_langinfo}.  The former is part of @w{ISO C} and therefore
portable, but has a brain-damaged interface.  The latter is part of the
Unix interface and is portable insofar as the system follows the Unix
standards.
 605
 606 @menu
 607 * The Lame Way to Locale Data::   ISO C's @code{localeconv}.
 608 * The Elegant and Fast Way::      X/Open's @code{nl_langinfo}.
 609 @end menu
 610
 611 @node The Lame Way to Locale Data, The Elegant and Fast Way, ,Locale Information
 612 @subsection @code{localeconv}: It is portable but @dots{}
 613
 614 Together with the @code{setlocale} function the @w{ISO C} people
 615 invented the @code{localeconv} function.  It is a masterpiece of poor
 616 design.  It is expensive to use, not extensible, and not generally
 617 usable as it provides access to only @code{LC_MONETARY} and
 618 @code{LC_NUMERIC} related information.  Nevertheless, if it is
 619 applicable to a given situation it should be used since it is very
 620 portable.  The function @code{strfmon} formats monetary amounts
 621 according to the selected locale using this information.
 622 @pindex locale.h
 623 @cindex monetary value formatting
 624 @cindex numeric value formatting
 625
 626 @comment locale.h
 627 @comment ISO
 628 @deftypefun {struct lconv *} localeconv (void)
 629 @safety{@prelim{}@mtunsafe{@mtasurace{:localeconv} @mtslocale{}}@asunsafe{}@acsafe{}}
 630 @c This function reads from multiple components of the locale object,
 631 @c without synchronization, while writing to the static buffer it uses
 632 @c as the return value.
 633 The @code{localeconv} function returns a pointer to a structure whose
 634 components contain information about how numeric and monetary values
 635 should be formatted in the current locale.
 636
 637 You should not modify the structure or its contents.  The structure might
 638 be overwritten by subsequent calls to @code{localeconv}, or by calls to
 639 @code{setlocale}, but no other function in the library overwrites this
 640 value.
 641 @end deftypefun
 642
 643 @comment locale.h
 644 @comment ISO
 645 @deftp {Data Type} {struct lconv}
 646 @code{localeconv}'s return value is of this data type.  Its elements are
 647 described in the following subsections.
 648 @end deftp
 649
 650 If a member of the structure @code{struct lconv} has type @code{char},
 651 and the value is @code{CHAR_MAX}, it means that the current locale has
 652 no value for that parameter.
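
A caller therefore has to test @code{char} members against @code{CHAR_MAX} before using them.  The sketch below does this for the @code{frac_digits} member, which is described in the following subsections.  The function name @code{monetary_frac_digits} and the idea of a caller-supplied fallback are assumptions of this example.

```c
#include <limits.h>
#include <locale.h>

/* Return the number of fractional digits to print for a monetary
   value in local format, or FALLBACK when the current locale leaves
   the value unspecified.  (Hypothetical helper.)  */
int
monetary_frac_digits (int fallback)
{
  const struct lconv *lc = localeconv ();
  return lc->frac_digits == CHAR_MAX ? fallback : lc->frac_digits;
}
```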
 653
 654 @menu
 655 * General Numeric::             Parameters for formatting numbers and
 656                                  currency amounts.
 657 * Currency Symbol::             How to print the symbol that identifies an
 658                                  amount of money (e.g. @samp{$}).
 659 * Sign of Money Amount::        How to print the (positive or negative) sign
 660                                  for a monetary amount, if one exists.
 661 @end menu
 662
 663 @node General Numeric, Currency Symbol, , The Lame Way to Locale Data
 664 @subsubsection Generic Numeric Formatting Parameters
 665
 666 These are the standard members of @code{struct lconv}; there may be
 667 others.
 668
 669 @table @code
 670 @item char *decimal_point
 671 @itemx char *mon_decimal_point
 672 These are the decimal-point separators used in formatting non-monetary
 673 and monetary quantities, respectively.  In the @samp{C} locale, the
 674 value of @code{decimal_point} is @code{"."}, and the value of
 675 @code{mon_decimal_point} is @code{""}.
 676 @cindex decimal-point separator
 677
 678 @item char *thousands_sep
 679 @itemx char *mon_thousands_sep
 680 These are the separators used to delimit groups of digits to the left of
 681 the decimal point in formatting non-monetary and monetary quantities,
 682 respectively.  In the @samp{C} locale, both members have a value of
 683 @code{""} (the empty string).
 684
 685 @item char *grouping
 686 @itemx char *mon_grouping
 687 These are strings that specify how to group the digits to the left of
 688 the decimal point.  @code{grouping} applies to non-monetary quantities
 689 and @code{mon_grouping} applies to monetary quantities.  Use either
 690 @code{thousands_sep} or @code{mon_thousands_sep} to separate the digit
 691 groups.
 692 @cindex grouping of digits
 693
Each member of these strings is to be interpreted as an integer value of
type @code{char}.  Successive numbers (from left to right) give the
sizes of successive groups (from right to left, starting at the decimal
point).  The last member is either @code{0}, in which case the previous
member is used over and over again for all the remaining groups, or
@code{CHAR_MAX}, in which case there is no more grouping; put
another way, any remaining digits form one large group without
separators.
 702
 703 For example, if @code{grouping} is @code{"\04\03\02"}, the correct
 704 grouping for the number @code{123456787654321} is @samp{12}, @samp{34},
 705 @samp{56}, @samp{78}, @samp{765}, @samp{4321}.  This uses a group of 4
 706 digits at the end, preceded by a group of 3 digits, preceded by groups
 707 of 2 digits (as many as needed).  With a separator of @samp{,}, the
 708 number would be printed as @samp{12,34,56,78,765,4321}.
 709
 710 A value of @code{"\03"} indicates repeated groups of three digits, as
 711 normally used in the U.S.
 712
 713 In the standard @samp{C} locale, both @code{grouping} and
 714 @code{mon_grouping} have a value of @code{""}.  This value specifies no
 715 grouping at all.
 716
 717 @item char int_frac_digits
 718 @itemx char frac_digits
 719 These are small integers indicating how many fractional digits (to the
 720 right of the decimal point) should be displayed in a monetary value in
 721 international and local formats, respectively.  (Most often, both
 722 members have the same value.)
 723
 724 In the standard @samp{C} locale, both of these members have the value
 725 @code{CHAR_MAX}, meaning ``unspecified''.  The ISO standard doesn't say
 726 what to do when you find this value; we recommend printing no
 727 fractional digits.  (This locale also specifies the empty string for
 728 @code{mon_decimal_point}, so printing any fractional digits would be
 729 confusing!)
 730 @end table
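
The rules for @code{grouping} and @code{mon_grouping} can be sketched
in C.  This is only an illustration under stated assumptions: the helper
name @code{group_digits} and its parameters are hypothetical and not
part of the library, and the caller is assumed to pass a string of
decimal digits together with a grouping string and a separator such as
those found in @code{struct lconv}.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a library function: copy DIGITS to OUT,
   inserting SEP between the groups described by GROUPING.
   Returns 0 on success, or -1 if OUT is too small.  */
static int
group_digits (const char *digits, const char *grouping, const char *sep,
              char *out, size_t outsize)
{
  size_t ndigits = strlen (digits);
  size_t seplen = strlen (sep);
  size_t cuts[64];              /* Indices where a separator is needed.  */
  size_t ncuts = 0;
  size_t pos = ndigits;         /* Walk from the decimal point leftwards.  */
  size_t size = 0;              /* Size of the current group.  */
  const char *g = grouping;

  if (outsize == 0)
    return -1;

  for (;;)
    {
      if (*g == CHAR_MAX)
        break;                  /* No more grouping.  */
      if (*g != 0)
        size = (unsigned char) *g++;
      /* Otherwise the terminating 0 was reached: reuse the last size.  */
      if (size == 0 || pos <= size)
        break;                  /* Empty grouping, or no digits left.  */
      pos -= size;
      if (ncuts == sizeof cuts / sizeof cuts[0])
        return -1;
      cuts[ncuts++] = pos;
    }

  /* The cuts were collected right to left, so consume them backwards.  */
  size_t o = 0;
  size_t k = ncuts;
  for (size_t i = 0; i < ndigits; i++)
    {
      if (k > 0 && cuts[k - 1] == i)
        {
          if (o + seplen >= outsize)
            return -1;
          memcpy (out + o, sep, seplen);
          o += seplen;
          k--;
        }
      if (o + 1 >= outsize)
        return -1;
      out[o++] = digits[i];
    }
  out[o] = '\0';
  return 0;
}
```

With a grouping of @code{"\04\03\02"} and a separator of @samp{,}, the
digits @code{123456787654321} come out as
@samp{12,34,56,78,765,4321}, matching the example given for
@code{grouping}.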
 731
 732 @node Currency Symbol, Sign of Money Amount, General Numeric, The Lame Way to Locale Data
 733 @subsubsection Printing the Currency Symbol
 734 @cindex currency symbols
 735
 736 These members of the @code{struct lconv} structure specify how to print
 737 the symbol to identify a monetary value---the international analog of
 738 @samp{$} for US dollars.
 739
 740 Each country has two standard currency symbols.  The @dfn{local currency
 741 symbol} is used commonly within the country, while the
 742 @dfn{international currency symbol} is used internationally to refer to
 743 that country's currency when it is necessary to indicate the country
 744 unambiguously.
 745
 746 For example, many countries use the dollar as their monetary unit, and
 747 when dealing with international currencies it's important to specify
 748 that one is dealing with (say) Canadian dollars instead of U.S. dollars
 749 or Australian dollars.  But when the context is known to be Canada,
 750 there is no need to make this explicit---dollar amounts are implicitly
 751 assumed to be in Canadian dollars.
 752
 753 @table @code
 754 @item char *currency_symbol
 755 The local currency symbol for the selected locale.
 756
 757 In the standard @samp{C} locale, this member has a value of @code{""}
 758 (the empty string), meaning ``unspecified''.  The ISO standard doesn't
 759 say what to do when you find this value; we recommend you simply print
 760 the empty string as you would print any other string pointed to by this
 761 variable.
 762
 763 @item char *int_curr_symbol
 764 The international currency symbol for the selected locale.
 765
 766 The value of @code{int_curr_symbol} should normally consist of a
 767 three-letter abbreviation determined by the international standard
 768 @cite{ISO 4217 Codes for the Representation of Currency and Funds},
 769 followed by a one-character separator (often a space).
 770
 771 In the standard @samp{C} locale, this member has a value of @code{""}
 772 (the empty string), meaning ``unspecified''.  We recommend you simply print
 773 the empty string as you would print any other string pointed to by this
 774 variable.
 775
 776 @item char p_cs_precedes
 777 @itemx char n_cs_precedes
 778 @itemx char int_p_cs_precedes
 779 @itemx char int_n_cs_precedes
 780 These members are @code{1} if the @code{currency_symbol} or
 781 @code{int_curr_symbol} strings should precede the value of a monetary
 782 amount, or @code{0} if the strings should follow the value.  The
 783 @code{p_cs_precedes} and @code{int_p_cs_precedes} members apply to
 784 positive amounts (or zero), and the @code{n_cs_precedes} and
 785 @code{int_n_cs_precedes} members apply to negative amounts.
 786
 787 In the standard @samp{C} locale, all of these members have a value of
 788 @code{CHAR_MAX}, meaning ``unspecified''.  The ISO standard doesn't say
 789 what to do when you find this value.  We recommend printing the
 790 currency symbol before the amount, which is right for most countries.
 791 In other words, treat all nonzero values alike in these members.
 792
 793 The members with the @code{int_} prefix apply to the
 794 @code{int_curr_symbol} while the other two apply to
 795 @code{currency_symbol}.
 796
 797 @item char p_sep_by_space
 798 @itemx char n_sep_by_space
 799 @itemx char int_p_sep_by_space
 800 @itemx char int_n_sep_by_space
 801 These members are @code{1} if a space should appear between the
 802 @code{currency_symbol} or @code{int_curr_symbol} strings and the
 803 amount, or @code{0} if no space should appear.  The
 804 @code{p_sep_by_space} and @code{int_p_sep_by_space} members apply to
 805 positive amounts (or zero), and the @code{n_sep_by_space} and
 806 @code{int_n_sep_by_space} members apply to negative amounts.
 807
 808 In the standard @samp{C} locale, all of these members have a value of
 809 @code{CHAR_MAX}, meaning ``unspecified''.  The ISO standard doesn't say
 810 what you should do when you find this value; we suggest you treat it as
 811 1 (print a space).  In other words, treat all nonzero values alike in
 812 these members.
 813
 814 The members with the @code{int_} prefix apply to the
 815 @code{int_curr_symbol} while the other two apply to
@code{currency_symbol}.  The @code{int_curr_symbol} needs special
care, though.  Since all valid values end with a separator character
(usually a space), one either prints this separator (if the currency
symbol precedes the amount and must be separated from it) or has to
avoid printing it at all (especially when the symbol comes at the end
of the output).
 822 @end table
 823
 824 @node Sign of Money Amount, , Currency Symbol, The Lame Way to Locale Data
 825 @subsubsection Printing the Sign of a Monetary Amount
 826
 827 These members of the @code{struct lconv} structure specify how to print
 828 the sign (if any) of a monetary value.
 829
 830 @table @code
 831 @item char *positive_sign
 832 @itemx char *negative_sign
 833 These are strings used to indicate positive (or zero) and negative
 834 monetary quantities, respectively.
 835
 836 In the standard @samp{C} locale, both of these members have a value of
 837 @code{""} (the empty string), meaning ``unspecified''.
 838
 839 The ISO standard doesn't say what to do when you find this value; we
 840 recommend printing @code{positive_sign} as you find it, even if it is
 841 empty.  For a negative value, print @code{negative_sign} as you find it
 842 unless both it and @code{positive_sign} are empty, in which case print
 843 @samp{-} instead.  (Failing to indicate the sign at all seems rather
 844 unreasonable.)
 845
 846 @item char p_sign_posn
 847 @itemx char n_sign_posn
 848 @itemx char int_p_sign_posn
 849 @itemx char int_n_sign_posn
 850 These members are small integers that indicate how to
 851 position the sign for nonnegative and negative monetary quantities,
 852 respectively.  (The string used for the sign is what was specified with
 853 @code{positive_sign} or @code{negative_sign}.)  The possible values are
 854 as follows:
 855
 856 @table @code
 857 @item 0
 858 The currency symbol and quantity should be surrounded by parentheses.
 859
 860 @item 1
 861 Print the sign string before the quantity and currency symbol.
 862
 863 @item 2
 864 Print the sign string after the quantity and currency symbol.
 865
 866 @item 3
 867 Print the sign string right before the currency symbol.
 868
 869 @item 4
 870 Print the sign string right after the currency symbol.
 871
 872 @item CHAR_MAX
``Unspecified''.  All four members have this value in the standard
@samp{C} locale.
 875 @end table
 876
 877 The ISO standard doesn't say what you should do when the value is
 878 @code{CHAR_MAX}.  We recommend you print the sign after the currency
 879 symbol.
 880
 881 The members with the @code{int_} prefix apply to the
 882 @code{int_curr_symbol} while the other two apply to
 883 @code{currency_symbol}.
 884 @end table
 885
 886 @node The Elegant and Fast Way, , The Lame Way to Locale Data, Locale Information
 887 @subsection Pinpoint Access to Locale Data
 888
 889 When writing the X/Open Portability Guide the authors realized that the
 890 @code{localeconv} function is not enough to provide reasonable access to
 891 locale information.  The information which was meant to be available
 892 in the locale (as later specified in the POSIX.1 standard) requires more
 893 ways to access it.  Therefore the @code{nl_langinfo} function
 894 was introduced.
 895
 896 @comment langinfo.h
 897 @comment XOPEN
 898 @deftypefun {char *} nl_langinfo (nl_item @var{item})
 899 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 900 @c It calls _nl_langinfo_l with the current locale, which returns a
 901 @c pointer into constant strings defined in locale data structures.
 902 The @code{nl_langinfo} function can be used to access individual
 903 elements of the locale categories.  Unlike the @code{localeconv}
 904 function, which returns all the information, @code{nl_langinfo}
 905 lets the caller select what information it requires.  This is very
 906 fast and it is not a problem to call this function multiple times.
 907
 908 A second advantage is that in addition to the numeric and monetary
 909 formatting information, information from the
 910 @code{LC_TIME} and @code{LC_MESSAGES} categories is available.
 911
 912 @pindex langinfo.h
The type @code{nl_item} is defined in @file{nl_types.h}.  The argument
 914 @var{item} is a numeric value defined in the header @file{langinfo.h}.
 915 The X/Open standard defines the following values:
 916
 917 @vtable @code
 918 @item CODESET
 919 @code{nl_langinfo} returns a string with the name of the coded character
 920 set used in the selected locale.
 921
 922 @item ABDAY_1
 923 @itemx ABDAY_2
 924 @itemx ABDAY_3
 925 @itemx ABDAY_4
 926 @itemx ABDAY_5
 927 @itemx ABDAY_6
 928 @itemx ABDAY_7
 929 @code{nl_langinfo} returns the abbreviated weekday name.  @code{ABDAY_1}
 930 corresponds to Sunday.
 931 @item DAY_1
 932 @itemx DAY_2
 933 @itemx DAY_3
 934 @itemx DAY_4
 935 @itemx DAY_5
 936 @itemx DAY_6
 937 @itemx DAY_7
 938 Similar to @code{ABDAY_1} etc., but here the return value is the
 939 unabbreviated weekday name.
 940 @item ABMON_1
 941 @itemx ABMON_2
 942 @itemx ABMON_3
 943 @itemx ABMON_4
 944 @itemx ABMON_5
 945 @itemx ABMON_6
 946 @itemx ABMON_7
 947 @itemx ABMON_8
 948 @itemx ABMON_9
 949 @itemx ABMON_10
 950 @itemx ABMON_11
 951 @itemx ABMON_12
The return value is the abbreviated name of the month.  @code{ABMON_1}
 953 corresponds to January.
 954 @item MON_1
 955 @itemx MON_2
 956 @itemx MON_3
 957 @itemx MON_4
 958 @itemx MON_5
 959 @itemx MON_6
 960 @itemx MON_7
 961 @itemx MON_8
 962 @itemx MON_9
 963 @itemx MON_10
 964 @itemx MON_11
 965 @itemx MON_12
 966 Similar to @code{ABMON_1} etc., but here the month names are not abbreviated.
 967 Here the first value @code{MON_1} also corresponds to January.
 968 @item AM_STR
 969 @itemx PM_STR
 970 The return values are strings which can be used in the representation of time
 971 as an hour from 1 to 12 plus an am/pm specifier.
 972
 973 Note that in locales which do not use this time representation
 974 these strings might be empty, in which case the am/pm format
 975 cannot be used at all.
 976 @item D_T_FMT
 977 The return value can be used as a format string for @code{strftime} to
 978 represent time and date in a locale-specific way.
 979 @item D_FMT
 980 The return value can be used as a format string for @code{strftime} to
 981 represent a date in a locale-specific way.
 982 @item T_FMT
 983 The return value can be used as a format string for @code{strftime} to
 984 represent time in a locale-specific way.
 985 @item T_FMT_AMPM
 986 The return value can be used as a format string for @code{strftime} to
 987 represent time in the am/pm format.
 988
 989 Note that if the am/pm format does not make any sense for the
 990 selected locale, the return value might be the same as the one for
 991 @code{T_FMT}.
 992 @item ERA
 993 The return value represents the era used in the current locale.
 994
 995 Most locales do not define this value.  An example of a locale which
 996 does define this value is the Japanese one.  In Japan, the traditional
 997 representation of dates includes the name of the era corresponding to
 998 the then-emperor's reign.
 999
1000 Normally it should not be necessary to use this value directly.
1001 Specifying the @code{E} modifier in their format strings causes the
1002 @code{strftime} functions to use this information.  The format of the
1003 returned string is not specified, and therefore you should not assume
1004 knowledge of it on different systems.
1005 @item ERA_YEAR
1006 The return value gives the year in the relevant era of the locale.
1007 As for @code{ERA} it should not be necessary to use this value directly.
1008 @item ERA_D_T_FMT
1009 This return value can be used as a format string for @code{strftime} to
1010 represent dates and times in a locale-specific era-based way.
1011 @item ERA_D_FMT
1012 This return value can be used as a format string for @code{strftime} to
1013 represent a date in a locale-specific era-based way.
1014 @item ERA_T_FMT
1015 This return value can be used as a format string for @code{strftime} to
1016 represent time in a locale-specific era-based way.
1017 @item ALT_DIGITS
1018 The return value is a representation of up to @math{100} values used to
1019 represent the values @math{0} to @math{99}.  As for @code{ERA} this
1020 value is not intended to be used directly, but instead indirectly
1021 through the @code{strftime} function.  When the modifier @code{O} is
1022 used in a format which would otherwise use numerals to represent hours,
1023 minutes, seconds, weekdays, months, or weeks, the appropriate value for
1024 the locale is used instead.
1025 @item INT_CURR_SYMBOL
1026 The same as the value returned by @code{localeconv} in the
1027 @code{int_curr_symbol} element of the @code{struct lconv}.
1028 @item CURRENCY_SYMBOL
1029 @itemx CRNCYSTR
1030 The same as the value returned by @code{localeconv} in the
1031 @code{currency_symbol} element of the @code{struct lconv}.
1032
1033 @code{CRNCYSTR} is a deprecated alias still required by Unix98.
1034 @item MON_DECIMAL_POINT
1035 The same as the value returned by @code{localeconv} in the
1036 @code{mon_decimal_point} element of the @code{struct lconv}.
1037 @item MON_THOUSANDS_SEP
1038 The same as the value returned by @code{localeconv} in the
1039 @code{mon_thousands_sep} element of the @code{struct lconv}.
1040 @item MON_GROUPING
1041 The same as the value returned by @code{localeconv} in the
1042 @code{mon_grouping} element of the @code{struct lconv}.
1043 @item POSITIVE_SIGN
1044 The same as the value returned by @code{localeconv} in the
1045 @code{positive_sign} element of the @code{struct lconv}.
1046 @item NEGATIVE_SIGN
1047 The same as the value returned by @code{localeconv} in the
1048 @code{negative_sign} element of the @code{struct lconv}.
1049 @item INT_FRAC_DIGITS
1050 The same as the value returned by @code{localeconv} in the
1051 @code{int_frac_digits} element of the @code{struct lconv}.
1052 @item FRAC_DIGITS
1053 The same as the value returned by @code{localeconv} in the
1054 @code{frac_digits} element of the @code{struct lconv}.
1055 @item P_CS_PRECEDES
1056 The same as the value returned by @code{localeconv} in the
1057 @code{p_cs_precedes} element of the @code{struct lconv}.
1058 @item P_SEP_BY_SPACE
1059 The same as the value returned by @code{localeconv} in the
1060 @code{p_sep_by_space} element of the @code{struct lconv}.
1061 @item N_CS_PRECEDES
1062 The same as the value returned by @code{localeconv} in the
1063 @code{n_cs_precedes} element of the @code{struct lconv}.
1064 @item N_SEP_BY_SPACE
1065 The same as the value returned by @code{localeconv} in the
1066 @code{n_sep_by_space} element of the @code{struct lconv}.
1067 @item P_SIGN_POSN
1068 The same as the value returned by @code{localeconv} in the
1069 @code{p_sign_posn} element of the @code{struct lconv}.
1070 @item N_SIGN_POSN
1071 The same as the value returned by @code{localeconv} in the
1072 @code{n_sign_posn} element of the @code{struct lconv}.
1073
1074 @item INT_P_CS_PRECEDES
1075 The same as the value returned by @code{localeconv} in the
1076 @code{int_p_cs_precedes} element of the @code{struct lconv}.
1077 @item INT_P_SEP_BY_SPACE
1078 The same as the value returned by @code{localeconv} in the
1079 @code{int_p_sep_by_space} element of the @code{struct lconv}.
1080 @item INT_N_CS_PRECEDES
1081 The same as the value returned by @code{localeconv} in the
1082 @code{int_n_cs_precedes} element of the @code{struct lconv}.
1083 @item INT_N_SEP_BY_SPACE
1084 The same as the value returned by @code{localeconv} in the
1085 @code{int_n_sep_by_space} element of the @code{struct lconv}.
1086 @item INT_P_SIGN_POSN
1087 The same as the value returned by @code{localeconv} in the
1088 @code{int_p_sign_posn} element of the @code{struct lconv}.
1089 @item INT_N_SIGN_POSN
1090 The same as the value returned by @code{localeconv} in the
1091 @code{int_n_sign_posn} element of the @code{struct lconv}.
1092
1093 @item DECIMAL_POINT
1094 @itemx RADIXCHAR
1095 The same as the value returned by @code{localeconv} in the
1096 @code{decimal_point} element of the @code{struct lconv}.
1097
1098 The name @code{RADIXCHAR} is a deprecated alias still used in Unix98.
1099 @item THOUSANDS_SEP
1100 @itemx THOUSEP
1101 The same as the value returned by @code{localeconv} in the
1102 @code{thousands_sep} element of the @code{struct lconv}.
1103
1104 The name @code{THOUSEP} is a deprecated alias still used in Unix98.
1105 @item GROUPING
1106 The same as the value returned by @code{localeconv} in the
1107 @code{grouping} element of the @code{struct lconv}.
1108 @item YESEXPR
The return value is a regular expression which can be used with the
@code{regcomp} and @code{regexec} functions to recognize a positive
response to a yes/no question.  @Theglibc{} provides the @code{rpmatch}
function for easier handling in applications.
@item NOEXPR
The return value is a regular expression which can be used with the
@code{regcomp} and @code{regexec} functions to recognize a negative
response to a yes/no question.
1117 @item YESSTR
1118 The return value is a locale-specific translation of the positive response
1119 to a yes/no question.
1120
Using this value is deprecated since it is a very special case of
message translation, and is better handled by the message
translation functions (@pxref{Message Translation}).
1127 @item NOSTR
The return value is a locale-specific translation of the negative response
to a yes/no question.  Like @code{YESSTR}, this symbol is deprecated;
use the message translation functions instead (@pxref{Message Translation}).
1133 @end vtable
1134
The file @file{langinfo.h} defines many more symbols, but none of them
are official.  Using them is not portable, and the format of the
return values might change.  Therefore we recommend that you not use
them.
1139
1140 Note that the return value for any valid argument can be used
1141 in all situations (with the possible exception of the am/pm time formatting
1142 codes).  If the user has not selected any locale for the
1143 appropriate category, @code{nl_langinfo} returns the information from the
1144 @code{"C"} locale.  It is therefore possible to use this function as
1145 shown in the example below.
1146
1147 If the argument @var{item} is not valid, a pointer to an empty string is
1148 returned.
1149 @end deftypefun
1150
1151 An example of @code{nl_langinfo} usage is a function which has to
1152 print a given date and time in a locale-specific way.  At first one
1153 might think that, since @code{strftime} internally uses the locale
1154 information, writing something like the following is enough:
1155
1156 @smallexample
1157 size_t
1158 i18n_time_n_data (char *s, size_t len, const struct tm *tp)
1159 @{
1160   return strftime (s, len, "%X %D", tp);
1161 @}
1162 @end smallexample
1163
1164 The format contains no weekday or month names and therefore is
1165 internationally usable.  Wrong!  The output produced is something like
1166 @code{"hh:mm:ss MM/DD/YY"}.  This format is only recognizable in the
1167 USA.  Other countries use different formats.  Therefore the function
1168 should be rewritten like this:
1169
1170 @smallexample
1171 size_t
1172 i18n_time_n_data (char *s, size_t len, const struct tm *tp)
1173 @{
1174   return strftime (s, len, nl_langinfo (D_T_FMT), tp);
1175 @}
1176 @end smallexample
1177
1178 Now it uses the date and time format of the locale
1179 selected when the program runs.  If the user selects the locale
1180 correctly there should never be a misunderstanding over the time and
1181 date format.
1182
1183 @node Formatting Numbers, Yes-or-No Questions, Locale Information, Locales
1184 @section A dedicated function to format numbers
1185
1186 We have seen that the structure returned by @code{localeconv} as well as
1187 the values given to @code{nl_langinfo} allow you to retrieve the various
1188 pieces of locale-specific information to format numbers and monetary
1189 amounts.  We have also seen that the underlying rules are quite complex.
1190
1191 Therefore the X/Open standards introduce a function which uses such
1192 locale information, making it easier for the user to format
1193 numbers according to these rules.
1194
1195 @deftypefun ssize_t strfmon (char *@var{s}, size_t @var{maxsize}, const char *@var{format}, @dots{})
1196 @safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
1197 @c It (and strfmon_l) both call vstrfmon_l, which, besides accessing the
1198 @c locale object passed to it, accesses the active locale through
1199 @c isdigit (but to_digit assumes ASCII digits only).  It may call
1200 @c __printf_fp (@mtslocale @ascuheap @acsmem) and guess_grouping (safe).
1201 The @code{strfmon} function is similar to the @code{strftime} function
1202 in that it takes a buffer, its size, a format string,
1203 and values to write into the buffer as text in a form specified
1204 by the format string.  Like @code{strftime}, the function
1205 also returns the number of bytes written into the buffer.
1206
1207 There are two differences: @code{strfmon} can take more than one
1208 argument, and, of course, the format specification is different.  Like
1209 @code{strftime}, the format string consists of normal text, which is
1210 output as is, and format specifiers, which are indicated by a @samp{%}.
1211 Immediately after the @samp{%}, you can optionally specify various flags
1212 and formatting information before the main formatting character, in a
1213 similar way to @code{printf}:
1214
1215 @itemize @bullet
1216 @item
1217 Immediately following the @samp{%} there can be one or more of the
1218 following flags:
1219 @table @asis
1220 @item @samp{=@var{f}}
The single byte character @var{f} is used for this field as the numeric
fill character.  By default this character is a space character.
Filling with this character is only performed if a left precision
is specified.  It is not used to pad the output to the given field width.
1225 @item @samp{^}
1226 The number is printed without grouping the digits according to the rules
1227 of the current locale.  By default grouping is enabled.
1228 @item @samp{+}, @samp{(}
At most one of these flags can be used.  They select how the sign of
a currency amount is represented.  By default, and if
@samp{+} is given, the locale equivalent of @math{+}/@math{-} is used.  If
1232 @samp{(} is given, negative amounts are enclosed in parentheses.  The
1233 exact format is determined by the values of the @code{LC_MONETARY}
1234 category of the locale selected at program runtime.
1235 @item @samp{!}
1236 The output will not contain the currency symbol.
1237 @item @samp{-}
1238 The output will be formatted left-justified instead of right-justified if
1239 it does not fill the entire field width.
1240 @end table
1241 @end itemize
1242
1243 The next part of the specification is an optional field width.  If no
1244 width is specified @math{0} is taken.  During output, the function first
1245 determines how much space is required.  If it requires at least as many
1246 characters as given by the field width, it is output using as much space
1247 as necessary.  Otherwise, it is extended to use the full width by
1248 filling with the space character.  The presence or absence of the
1249 @samp{-} flag determines the side at which such padding occurs.  If
1250 present, the spaces are added at the right making the output
1251 left-justified, and vice versa.
1252
1253 So far the format looks familiar, being similar to the @code{printf} and
1254 @code{strftime} formats.  However, the next two optional fields
1255 introduce something new.  The first one is a @samp{#} character followed
1256 by a decimal digit string.  The value of the digit string specifies the
1257 number of @emph{digit} positions to the left of the decimal point (or
1258 equivalent).  This does @emph{not} include the grouping character when
1259 the @samp{^} flag is not given.  If the space needed to print the number
1260 does not fill the whole width, the field is padded at the left side with
1261 the fill character, which can be selected using the @samp{=} flag and by
default is a space.  For example, if the left precision is selected as 6,
the number is @math{123}, and the fill character is @samp{*}, the result
will be @samp{***123}.
1265
1266 The second optional field starts with a @samp{.} (period) and consists
1267 of another decimal digit string.  Its value describes the number of
characters printed after the decimal point.  The default is selected
from the current locale (@code{frac_digits}, @code{int_frac_digits};
@pxref{General Numeric}).  If the exact representation needs more digits
than given by this right precision, the displayed value is rounded.  If the
number of fractional digits is selected to be zero, no decimal point is
printed.
1274
1275 As a GNU extension, the @code{strfmon} implementation in @theglibc{}
1276 allows an optional @samp{L} next as a format modifier.  If this modifier
1277 is given, the argument is expected to be a @code{long double} instead of
1278 a @code{double} value.
1279
1280 Finally, the last component is a format specifier.  There are three
1281 specifiers defined:
1282
1283 @table @asis
1284 @item @samp{i}
1285 Use the locale's rules for formatting an international currency value.
1286 @item @samp{n}
1287 Use the locale's rules for formatting a national currency value.
1288 @item @samp{%}
Place a @samp{%} in the output.  No flag, width specifier, or modifier
may be given; only @samp{%%} is allowed.
1291 @end table
1292
1293 As for @code{printf}, the function reads the format string
1294 from left to right and uses the values passed to the function following
1295 the format string.  The values are expected to be either of type
1296 @code{double} or @code{long double}, depending on the presence of the
1297 modifier @samp{L}.  The result is stored in the buffer pointed to by
1298 @var{s}.  At most @var{maxsize} characters are stored.
1299
The return value of the function is the number of characters stored in
@var{s}, not including the terminating null byte.  If the number of
characters stored would exceed @var{maxsize}, the function returns
@math{-1} and the content of the buffer @var{s} is unspecified.  In this
case @code{errno} is set to @code{E2BIG}.
1305 @end deftypefun
1306
1307 A few examples should make clear how the function works.  It is
1308 assumed that all the following pieces of code are executed in a program
1309 which uses the USA locale (@code{en_US}).  The simplest
1310 form of the format is this:
1311
1312 @smallexample
1313 strfmon (buf, 100, "@@%n@@%n@@%n@@", 123.45, -567.89, 12345.678);
1314 @end smallexample
1315
1316 @noindent
1317 The output produced is
1318 @smallexample
1319 "@@$123.45@@-$567.89@@$12,345.68@@"
1320 @end smallexample
1321
1322 We can notice several things here.  First, the widths of the output
1323 numbers are different.  We have not specified a width in the format
1324 string, and so this is no wonder.  Second, the third number is printed
1325 using thousands separators.  The thousands separator for the
1326 @code{en_US} locale is a comma.  The number is also rounded.
1327 @math{.678} is rounded to @math{.68} since the format does not specify a
1328 precision and the default value in the locale is @math{2}.  Finally,
note that the national currency symbol is printed since @samp{%n} was
used, not @samp{%i}.  The next example shows how we can align the output.
1331
1332 @smallexample
1333 strfmon (buf, 100, "@@%=*11n@@%=*11n@@%=*11n@@", 123.45, -567.89, 12345.678);
1334 @end smallexample
1335
1336 @noindent
1337 The output this time is:
1338
1339 @smallexample
1340 "@@    $123.45@@   -$567.89@@ $12,345.68@@"
1341 @end smallexample
1342
1343 Two things stand out.  Firstly, all fields have the same width (eleven
1344 characters) since this is the width given in the format and since no
1345 number required more characters to be printed.  The second important
1346 point is that the fill character is not used.  This is correct since the
1347 white space was not used to achieve a precision given by a @samp{#}
1348 modifier, but instead to fill to the given width.  The difference
becomes obvious if we now add a left precision with the @samp{#} modifier.
1350
1351 @smallexample
1352 strfmon (buf, 100, "@@%=*11#5n@@%=*11#5n@@%=*11#5n@@",
1353          123.45, -567.89, 12345.678);
1354 @end smallexample
1355
1356 @noindent
1357 The output is
1358
@smallexample
"@@ $***123.45@@-$***567.89@@ $12,345.68@@"
@end smallexample
1362
1363 Here we can see that all the currency symbols are now aligned, and that
1364 the space between the currency sign and the number is filled with the
selected fill character.  Note that although the left precision is selected
to be @math{5} and @math{123.45} has three digits left of the decimal point,
the space is filled with three asterisks.  This is correct since, as
explained above, the left precision does not include the positions used to
store thousands separators.  One last example should explain the remaining
1370 functionality.
1371
1372 @smallexample
1373 strfmon (buf, 100, "@@%=0(16#5.3i@@%=0(16#5.3i@@%=0(16#5.3i@@",
1374          123.45, -567.89, 12345.678);
1375 @end smallexample
1376
1377 @noindent
1378 This rather complex format string produces the following output:
1379
@smallexample
"@@ USD 000123.450 @@(USD 000567.890)@@ USD 12,345.678 @@"
@end smallexample
1383
1384 The most noticeable change is the alternative way of representing
1385 negative numbers.  In financial circles this is often done using
1386 parentheses, and this is what the @samp{(} flag selected.  The fill
1387 character is now @samp{0}.  Note that this @samp{0} character is not
1388 regarded as a numeric zero, and therefore the first and second numbers
1389 are not printed using a thousands separator.  Since we used the format
1390 specifier @samp{i} instead of @samp{n}, the international form of the
currency symbol is used.  This is a four-character string, in this case
1392 @code{"USD "}.  The last point is that since the precision right of the
1393 decimal point is selected to be three, the first and second numbers are
1394 printed with an extra zero at the end and the third number is printed
1395 without rounding.
1396
@node Yes-or-No Questions, , Formatting Numbers, Locales
1398 @section Yes-or-No Questions
1399
Some non-GUI programs ask a yes-or-no question.  If the messages
(especially the questions) are translated into foreign languages, be
sure that you localize the answers too.  It would be a very bad habit to
ask a question in one language and request the answer in another, often
English.
1405
1406 @Theglibc{} contains @code{rpmatch} to give applications easy
1407 access to the corresponding locale definitions.
1408
1409 @comment GNU
1410 @comment stdlib.h
1411 @deftypefun int rpmatch (const char *@var{response})
1412 @safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@asucorrupt{} @ascuheap{} @asulock{} @ascudlopen{}}@acunsafe{@acucorrupt{} @aculock{} @acsmem{} @acsfd{}}}
1413 @c Calls nl_langinfo with YESEXPR and NOEXPR, triggering @mtslocale but
1414 @c it's regcomp and regexec that bring in all of the safety issues.
1415 @c regfree is also called, but it doesn't introduce any further issues.
The function @code{rpmatch} checks whether the string in @var{response}
is a valid yes-or-no answer and, if so, which one.  The
1418 check uses the @code{YESEXPR} and @code{NOEXPR} data in the
1419 @code{LC_MESSAGES} category of the currently selected locale.  The
1420 return value is as follows:
1421
1422 @table @code
1423 @item 1
1424 The user entered an affirmative answer.
1425
1426 @item 0
1427 The user entered a negative answer.
1428
1429 @item -1
1430 The answer matched neither the @code{YESEXPR} nor the @code{NOEXPR}
1431 regular expression.
1432 @end table
1433
This function is not standardized, but besides @theglibc{} it is
available at least in the IBM AIX library.
1436 @end deftypefun
1437
1438 @noindent
1439 This function would normally be used like this:
1440
1441 @smallexample
1442   @dots{}
1443   /* @r{Use a safe default.}  */
1444   _Bool doit = false;
1445
1446   fputs (gettext ("Do you really want to do this? "), stdout);
1447   fflush (stdout);
1448   /* @r{Prepare the @code{getline} call.}  */
1449   line = NULL;
1450   len = 0;
1451   while (getline (&line, &len, stdin) >= 0)
1452     @{
1453       /* @r{Check the response.}  */
1454       int res = rpmatch (line);
1455       if (res >= 0)
1456         @{
1457           /* @r{We got a definitive answer.}  */
1458           if (res > 0)
1459             doit = true;
1460           break;
1461         @}
1462     @}
1463   /* @r{Free what @code{getline} allocated.}  */
1464   free (line);
1465 @end smallexample
1466
1467 Note that the loop continues until a read error is detected or until a
1468 definitive (positive or negative) answer is read.
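
The three possible return values of @code{rpmatch} can also be handled
in one place.  The following is a minimal sketch, assuming a
hypothetical helper named @code{classify_response}; the feature test
macro is defined because @code{rpmatch} is not a standard function.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1      /* rpmatch is a non-standard extension.  */
#endif
#include <stdlib.h>

/* Hypothetical helper, not a library function: map the result of
   rpmatch to a fixed label.  */
static const char *
classify_response (const char *response)
{
  switch (rpmatch (response))
    {
    case 1:
      return "yes";             /* Matched YESEXPR.  */
    case 0:
      return "no";              /* Matched NOEXPR.  */
    default:
      return "unclear";         /* Matched neither expression.  */
    }
}
```

In the @samp{C} locale of @theglibc{}, a response such as @samp{yes} is
classified as affirmative and @samp{No} as negative, while
@samp{maybe} matches neither expression.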