manual/ctype.texi

   1 @node Character Handling, String and Array Utilities, Memory, Top
   2 @c %MENU% Character testing and conversion functions
   3 @chapter Character Handling
   4
   5 Programs that work with characters and strings often need to classify a
   6 character---is it alphabetic, is it a digit, is it whitespace, and so
   7 on---and perform case conversion operations on characters.  The
   8 functions in the header file @file{ctype.h} are provided for this
   9 purpose.
  10 @pindex ctype.h
  11
  12 Since the choice of locale and character set can alter the
  13 classifications of particular character codes, all of these functions
  14 are affected by the current locale.  (More precisely, they are affected
  15 by the locale currently selected for character classification---the
  16 @code{LC_CTYPE} category; see @ref{Locale Categories}.)
  17
  18 The @w{ISO C} standard specifies two different sets of functions.  One
  19 set works on @code{char} type characters, the other on
  20 @code{wchar_t} wide characters (@pxref{Extended Char Intro}).
  21
  22 @menu
  23 * Classification of Characters::       Testing whether characters are
  24                                         letters, digits, punctuation, etc.
  25
  26 * Case Conversion::                    Case mapping, and the like.
  27 * Classification of Wide Characters::  Character class determination for
  28                                         wide characters.
  29 * Using Wide Char Classes::            Notes on using the wide character
  30                                         classes.
  31 * Wide Character Case Conversion::     Mapping of wide characters.
  32 @end menu
  33
  34 @node Classification of Characters, Case Conversion,  , Character Handling
  35 @section Classification of Characters
  36 @cindex character testing
  37 @cindex classification of characters
  38 @cindex predicates on characters
  39 @cindex character predicates
  40
  41 This section explains the library functions for classifying characters.
  42 For example, @code{isalpha} is the function to test for an alphabetic
  43 character.  It takes one argument, the character to test, and returns a
  44 nonzero integer if the character is alphabetic, and zero otherwise.  You
  45 would use it like this:
  46
  47 @smallexample
  48 if (isalpha (c))
  49   printf ("The character `%c' is alphabetic.\n", c);
  50 @end smallexample
  51
  52 Each of the functions in this section tests for membership in a
  53 particular class of characters; each has a name starting with @samp{is}.
  54 Each of them takes one argument, which is a character to test, and
  55 returns an @code{int} which is treated as a boolean value.  The
  56 character argument is passed as an @code{int}, and it may be the
  57 constant value @code{EOF} instead of a real character.
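
ISO C requires the argument to be either @code{EOF} or a value representable as an @code{unsigned char}.  On systems where plain @code{char} is signed, a character read from a string should therefore be converted before the call.  The following is a minimal sketch of that idiom; @code{count_alpha} is a hypothetical helper, not a library function.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the alphabetic characters in S.
   Each char is converted to unsigned char first, so the value
   passed to isalpha is never negative.  */
size_t
count_alpha (const char *s)
{
  size_t n = 0;

  for (; *s != '\0'; s++)
    if (isalpha ((unsigned char) *s))
      n++;
  return n;
}
```

The result depends on the locale selected for @code{LC_CTYPE}; the sketch does not change the locale itself.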
  58
  59 The attributes of any given character can vary between locales.
  60 @xref{Locales}, for more information on locales.@refill
  61
  62 These functions are declared in the header file @file{ctype.h}.
  63 @pindex ctype.h
  64
  65 @cindex lower-case character
  66 @deftypefun int islower (int @var{c})
  67 @standards{ISO, ctype.h}
  68 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
  69 @c The is* macros call __ctype_b_loc to get the ctype array from the
  70 @c current locale, and then index it by c.  __ctype_b_loc reads from
  71 @c thread-local memory the (indirect) pointer to the ctype array, which
  72 @c may involve one word access to the global locale object, if that's
  73 @c the active locale for the thread, and the array, being part of the
  74 @c locale data, is undeletable, so there's no thread-safety issue.  We
  75 @c might want to mark these with @mtslocale to flag to callers that
  76 @c changing locales might affect them, even if not these simpler
  77 @c functions.
  78 Returns true if @var{c} is a lower-case letter.  The letter need not be
  79 from the Latin alphabet; any representable alphabet is valid.
  80 @end deftypefun
  81
  82 @cindex upper-case character
  83 @deftypefun int isupper (int @var{c})
  84 @standards{ISO, ctype.h}
  85 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
  86 Returns true if @var{c} is an upper-case letter.  The letter need not be
  87 from the Latin alphabet; any representable alphabet is valid.
  88 @end deftypefun
  89
  90 @cindex alphabetic character
  91 @deftypefun int isalpha (int @var{c})
  92 @standards{ISO, ctype.h}
  93 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
  94 Returns true if @var{c} is an alphabetic character (a letter).  If
  95 @code{islower} or @code{isupper} is true of a character, then
  96 @code{isalpha} is also true.
  97
  98 In some locales, there may be additional characters for which
  99 @code{isalpha} is true---letters which are neither upper case nor lower
 100 case.  But in the standard @code{"C"} locale, there are no such
 101 additional characters.
 102 @end deftypefun
 103
 104 @cindex digit character
 105 @cindex decimal digit character
 106 @deftypefun int isdigit (int @var{c})
 107 @standards{ISO, ctype.h}
 108 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 109 Returns true if @var{c} is a decimal digit (@samp{0} through @samp{9}).
 110 @end deftypefun
 111
 112 @cindex alphanumeric character
 113 @deftypefun int isalnum (int @var{c})
 114 @standards{ISO, ctype.h}
 115 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 116 Returns true if @var{c} is an alphanumeric character (a letter or
 117 digit); in other words, if either @code{isalpha} or @code{isdigit} is
 118 true of a character, then @code{isalnum} is also true.
 119 @end deftypefun
 120
 121 @cindex hexadecimal digit character
 122 @deftypefun int isxdigit (int @var{c})
 123 @standards{ISO, ctype.h}
 124 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 125 Returns true if @var{c} is a hexadecimal digit.
 126 Hexadecimal digits include the normal decimal digits @samp{0} through
 127 @samp{9} and the letters @samp{A} through @samp{F} and
 128 @samp{a} through @samp{f}.
 129 @end deftypefun
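
As an illustration of combining @code{isxdigit} with @code{isdigit}, the sketch below converts one hexadecimal digit to its numeric value.  The name @code{hex_digit_value} is hypothetical, and the caller is assumed to pass either @code{EOF} or an @code{unsigned char} value.

```c
#include <ctype.h>

/* Hypothetical helper: return the numeric value (0 to 15) of the
   hexadecimal digit C, or -1 if C is not a hexadecimal digit.  */
int
hex_digit_value (int c)
{
  if (!isxdigit (c))
    return -1;
  if (isdigit (c))
    return c - '0';   /* Decimal digits are contiguous in ISO C.  */

  /* ISO C does not guarantee that the letters A to F are contiguous,
     so look the letter up instead of doing arithmetic on it.  */
  switch (tolower (c))
    {
    case 'a': return 10;
    case 'b': return 11;
    case 'c': return 12;
    case 'd': return 13;
    case 'e': return 14;
    case 'f': return 15;
    default:  return -1;
    }
}
```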
 130
 131 @cindex punctuation character
 132 @deftypefun int ispunct (int @var{c})
 133 @standards{ISO, ctype.h}
 134 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 135 Returns true if @var{c} is a punctuation character.
 136 This means any printing character that is not alphanumeric or a space
 137 character.
 138 @end deftypefun
 139
 140 @cindex whitespace character
 141 @deftypefun int isspace (int @var{c})
 142 @standards{ISO, ctype.h}
 143 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 144 Returns true if @var{c} is a @dfn{whitespace} character.  In the standard
 145 @code{"C"} locale, @code{isspace} returns true for only the standard
 146 whitespace characters:
 147
 148 @table @code
 149 @item ' '
 150 space
 151
 152 @item '\f'
 153 formfeed
 154
 155 @item '\n'
 156 newline
 157
 158 @item '\r'
 159 carriage return
 160
 161 @item '\t'
 162 horizontal tab
 163
 164 @item '\v'
 165 vertical tab
 166 @end table
 167 @end deftypefun
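
A common use of @code{isspace} is skipping leading whitespace in a string.  The following sketch shows one way to do this; @code{skip_whitespace} is a hypothetical name chosen for the example.

```c
#include <ctype.h>

/* Hypothetical helper: return a pointer to the first character of S
   that is not whitespace.  The loop stops at the terminating null
   byte, because isspace is false for '\0'.  */
const char *
skip_whitespace (const char *s)
{
  while (isspace ((unsigned char) *s))
    s++;
  return s;
}
```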
 168
 169 @cindex blank character
 170 @deftypefun int isblank (int @var{c})
 171 @standards{ISO, ctype.h}
 172 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 173 Returns true if @var{c} is a blank character; that is, a space or a tab.
 174 This function was originally a GNU extension, but was added in @w{ISO C99}.
 175 @end deftypefun
 176
 177 @cindex graphic character
 178 @deftypefun int isgraph (int @var{c})
 179 @standards{ISO, ctype.h}
 180 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 181 Returns true if @var{c} is a graphic character; that is, a character
 182 that has a glyph associated with it.  The whitespace characters are not
 183 considered graphic.
 184 @end deftypefun
 185
 186 @cindex printing character
 187 @deftypefun int isprint (int @var{c})
 188 @standards{ISO, ctype.h}
 189 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 190 Returns true if @var{c} is a printing character.  Printing characters
 191 include all the graphic characters, plus the space (@samp{ }) character.
 192 @end deftypefun
 193
 194 @cindex control character
 195 @deftypefun int iscntrl (int @var{c})
 196 @standards{ISO, ctype.h}
 197 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 198 Returns true if @var{c} is a control character (that is, a character that
 199 is not a printing character).
 200 @end deftypefun
 201
 202 @cindex ASCII character
 203 @deftypefun int isascii (int @var{c})
 204 @standards{SVID, ctype.h}
 205 @standards{BSD, ctype.h}
 206 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 207 Returns true if @var{c} is a 7-bit @code{unsigned char} value that fits
 208 into the US/UK ASCII character set.  This function is a BSD extension
 209 and is also an SVID extension.
 210 @end deftypefun
 211
 212 @node Case Conversion, Classification of Wide Characters, Classification of Characters, Character Handling
 213 @section Case Conversion
 214 @cindex character case conversion
 215 @cindex case conversion of characters
 216 @cindex converting case of characters
 217
 218 This section explains the library functions for performing conversions
 219 such as case mappings on characters.  For example, @code{toupper}
 220 converts any character to upper case if possible.  If the character
 221 can't be converted, @code{toupper} returns it unchanged.
 222
 223 These functions take one argument of type @code{int}, which is the
 224 character to convert, and return the converted character as an
 225 @code{int}.  If the conversion is not applicable to the argument given,
 226 the argument is returned unchanged.
 227
 228 @strong{Compatibility Note:} In pre-@w{ISO C} dialects, instead of
 229 returning the argument unchanged, these functions may fail when the
 230 argument is not suitable for the conversion.  Thus for portability, you
 231 may need to write @code{islower(c) ? toupper(c) : c} rather than just
 232 @code{toupper(c)}.
 233
 234 These functions are declared in the header file @file{ctype.h}.
 235 @pindex ctype.h
 236
 237 @deftypefun int tolower (int @var{c})
 238 @standards{ISO, ctype.h}
 239 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 240 @c The to* macros/functions call different functions that use different
 241 @c arrays than those of __ctype_b_loc, but the access patterns and
 242 @c thus safety guarantees are the same.
 243 If @var{c} is an upper-case letter, @code{tolower} returns the corresponding
 244 lower-case letter.  If @var{c} is not an upper-case letter,
 245 @var{c} is returned unchanged.
 246 @end deftypefun
 247
 248 @deftypefun int toupper (int @var{c})
 249 @standards{ISO, ctype.h}
 250 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 251 If @var{c} is a lower-case letter, @code{toupper} returns the corresponding
 252 upper-case letter.  Otherwise @var{c} is returned unchanged.
 253 @end deftypefun
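
The sketch below applies @code{toupper} to every character of a string.  It is only an illustration: @code{upcase_in_place} is a hypothetical helper, and it handles single-byte characters only.

```c
#include <ctype.h>

/* Hypothetical helper: convert the string S to upper case in place.
   Characters that have no upper-case counterpart are left unchanged,
   because toupper returns such arguments as they are.  */
void
upcase_in_place (char *s)
{
  for (; *s != '\0'; s++)
    *s = (char) toupper ((unsigned char) *s);
}
```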
 254
 255 @deftypefun int toascii (int @var{c})
 256 @standards{SVID, ctype.h}
 257 @standards{BSD, ctype.h}
 258 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 259 This function converts @var{c} to a 7-bit @code{unsigned char} value
 260 that fits into the US/UK ASCII character set, by clearing the high-order
 261 bits.  This function is a BSD extension and is also an SVID extension.
 262 @end deftypefun
 263
 264 @deftypefun int _tolower (int @var{c})
 265 @standards{SVID, ctype.h}
 266 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 267 This is identical to @code{tolower}, and is provided for compatibility
 268 with the SVID.  @xref{SVID}.@refill
 269 @end deftypefun
 270
 271 @deftypefun int _toupper (int @var{c})
 272 @standards{SVID, ctype.h}
 273 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 274 This is identical to @code{toupper}, and is provided for compatibility
 275 with the SVID.
 276 @end deftypefun
 277
 278
 279 @node Classification of Wide Characters, Using Wide Char Classes, Case Conversion, Character Handling
 280 @section Character class determination for wide characters
 281
 282 @w{Amendment 1} to @w{ISO C90} defines functions to classify wide
 283 characters.  Although the original @w{ISO C90} standard already defined
 284 the type @code{wchar_t}, no functions operating on it were defined.
 285
 286 The design of the classification functions for wide characters is
 287 more general.  It allows extensions to the set of available
 288 classifications, beyond those which are always available.  The POSIX
 289 standard specifies how extensions can be made, and this is already
 290 implemented in the @glibcadj{} implementation of the @code{localedef}
 291 program.
 292
 293 The character class functions are normally implemented with bitsets,
 294 with a bitset per character.  For a given character, the appropriate
 295 bitset is read from a table and a test is performed as to whether a
 296 certain bit is set.  Which bit is tested for is determined by the
 297 class.
 298
 299 For the wide character classification functions this is made visible.
 300 A classification type is defined, along with a function to retrieve a
 301 value of this type for a given class and a function to test whether a
 302 given character is in this class, using the classification value.  On top of
 303 this the normal character classification functions as used for
 304 @code{char} objects can be defined.
 305
 306 @deftp {Data Type} wctype_t
 307 @standards{ISO, wctype.h}
 308 The @code{wctype_t} can hold a value which represents a character class.
 309 The only defined way to generate such a value is by using the
 310 @code{wctype} function.
 311
 312 @pindex wctype.h
 313 This type is defined in @file{wctype.h}.
 314 @end deftp
 315
 316 @deftypefun wctype_t wctype (const char *@var{property})
 317 @standards{ISO, wctype.h}
 318 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 319 @c Although the source code of wctype contains multiple references to
 320 @c the locale, that could each reference different locale_data objects
 321 @c should the global locale object change while active, the compiler can
 322 @c and does combine them all into a single dereference that resolves
 323 @c once to the LCTYPE locale object used throughout the function, so it
 324 @c is safe in (optimized) practice, if not in theory, even when the
 325 @c locale changes.  Ideally we'd explicitly save the resolved
 326 @c locale_data object to make it visibly safe instead of safe only under
 327 @c compiler optimizations, but given the decision that setlocale is
 328 @c MT-Unsafe, all this would afford us would be the ability to not mark
 329 @c this function with @mtslocale.
 330 @code{wctype} returns a value representing a class of wide
 331 characters which is identified by the string @var{property}.  In addition
 332 to some standard properties, each locale can define its own.  In case
 333 no property with the given name is known for the current locale
 334 selected for the @code{LC_CTYPE} category, the function returns zero.
 335
 336 @noindent
 337 The properties known in every locale are:
 338
 339 @multitable @columnfractions .25 .25 .25 .25
 340 @item
 341 @code{"alnum"} @tab @code{"alpha"} @tab @code{"cntrl"} @tab @code{"digit"}
 342 @item
 343 @code{"graph"} @tab @code{"lower"} @tab @code{"print"} @tab @code{"punct"}
 344 @item
 345 @code{"space"} @tab @code{"upper"} @tab @code{"xdigit"}
 346 @end multitable
 347
 348 @pindex wctype.h
 349 This function is declared in @file{wctype.h}.
 350 @end deftypefun
 351
 352 To test whether a character belongs to one of the non-standard classes,
 353 the @w{ISO C} standard defines a completely new function.
 354
 355 @deftypefun int iswctype (wint_t @var{wc}, wctype_t @var{desc})
 356 @standards{ISO, wctype.h}
 357 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 358 @c The compressed lookup table returned by wctype is read-only.
 359 This function returns a nonzero value if @var{wc} is in the character
 360 class specified by @var{desc}.  @var{desc} must have been returned
 361 by a previous successful call to @code{wctype}.
 362
 363 @pindex wctype.h
 364 This function is declared in @file{wctype.h}.
 365 @end deftypefun
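
The two functions are meant to be used together: @code{wctype} looks up the class once, and @code{iswctype} tests individual characters against it.  A minimal sketch follows; @code{count_in_class} is a hypothetical helper.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the wide characters in S that belong to
   the class named PROPERTY in the current locale.  Returns 0 if no
   class with that name is known.  */
size_t
count_in_class (const wchar_t *s, const char *property)
{
  wctype_t desc = wctype (property);
  size_t n = 0;

  if (desc == (wctype_t) 0)
    return 0;           /* Unknown class; iswctype must not be called.  */
  for (; *s != L'\0'; s++)
    if (iswctype ((wint_t) *s, desc))
      n++;
  return n;
}
```

A call such as @code{count_in_class (text, "alpha")} then works for the standard classes as well as for any class the current locale defines.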
 366
 367 To make it easier to use the commonly-used classification functions,
 368 they are defined in the C library.  There is no need to use
 369 @code{wctype} if the property string is one of the known character
 370 classes.  In some situations it is desirable to construct the property
 371 strings, and then it is important that @code{wctype} can also handle the
 372 standard classes.
 373
 374 @cindex alphanumeric character
 375 @deftypefun int iswalnum (wint_t @var{wc})
 376 @standards{ISO, wctype.h}
 377 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 378 @c The implicit wctype call in the isw* functions is actually an
 379 @c optimized version because the category has a known offset, but the
 380 @c wctype is equally safe when optimized, unsafe with changing locales
 381 @c if not optimized (thus @mtslocale).  Since it's not a macro, we
 382 @c always optimize, and the locale can't change in any MT-Safe way, it's
 383 @c fine.  The test whether wc is ASCII to use the non-wide is*
 384 @c macro/function doesn't bring any other safety issues: the test does
 385 @c not depend on the locale, and each path after the decision resolves
 386 @c the locale object only once.
 387 This function returns a nonzero value if @var{wc} is an alphanumeric
 388 character (a letter or digit); in other words, if either @code{iswalpha}
 389 or @code{iswdigit} is true of a character, then @code{iswalnum} is also
 390 true.
 391
 392 @noindent
 393 This function can be implemented using
 394
 395 @smallexample
 396 iswctype (wc, wctype ("alnum"))
 397 @end smallexample
 398
 399 @pindex wctype.h
 400 It is declared in @file{wctype.h}.
 401 @end deftypefun
 402
 403 @cindex alphabetic character
 404 @deftypefun int iswalpha (wint_t @var{wc})
 405 @standards{ISO, wctype.h}
 406 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 407 Returns true if @var{wc} is an alphabetic character (a letter).  If
 408 @code{iswlower} or @code{iswupper} is true of a character, then
 409 @code{iswalpha} is also true.
 410
 411 In some locales, there may be additional characters for which
 412 @code{iswalpha} is true---letters which are neither upper case nor lower
 413 case.  But in the standard @code{"C"} locale, there are no such
 414 additional characters.
 415
 416 @noindent
 417 This function can be implemented using
 418
 419 @smallexample
 420 iswctype (wc, wctype ("alpha"))
 421 @end smallexample
 422
 423 @pindex wctype.h
 424 It is declared in @file{wctype.h}.
 425 @end deftypefun
 426
 427 @cindex control character
 428 @deftypefun int iswcntrl (wint_t @var{wc})
 429 @standards{ISO, wctype.h}
 430 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 431 Returns true if @var{wc} is a control character (that is, a character that
 432 is not a printing character).
 433
 434 @noindent
 435 This function can be implemented using
 436
 437 @smallexample
 438 iswctype (wc, wctype ("cntrl"))
 439 @end smallexample
 440
 441 @pindex wctype.h
 442 It is declared in @file{wctype.h}.
 443 @end deftypefun
 444
 445 @cindex digit character
 446 @deftypefun int iswdigit (wint_t @var{wc})
 447 @standards{ISO, wctype.h}
 448 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 449 Returns true if @var{wc} is a digit (e.g., @samp{0} through @samp{9}).
 450 Please note that this function returns a nonzero value not only for
 451 @emph{decimal} digits, but for all kinds of digits.  A consequence is
 452 that code like the following will @strong{not} work unconditionally for
 453 wide characters:
 454
 455 @smallexample
 456 n = 0;
 457 while (iswdigit (*wc))
 458   @{
 459     n *= 10;
 460     n += *wc++ - L'0';
 461   @}
 462 @end smallexample
 463
 464 @noindent
 465 This function can be implemented using
 466
 467 @smallexample
 468 iswctype (wc, wctype ("digit"))
 469 @end smallexample
 470
 471 @pindex wctype.h
 472 It is declared in @file{wctype.h}.
 473 @end deftypefun
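
One way to avoid the problem described for @code{iswdigit} is to test explicitly for the range @code{L'0'} through @code{L'9'} when a numeric value is computed.  The sketch below assumes that these ten wide characters have consecutive values, which holds when @code{wchar_t} values are ISO 10646 code points; @code{parse_decimal} is a hypothetical helper.

```c
#include <wchar.h>

/* Hypothetical helper: convert the leading decimal digits of S to a
   number.  Conversion stops at the first character outside the range
   L'0' to L'9'.  Overflow is not checked in this sketch.  */
unsigned long
parse_decimal (const wchar_t *s)
{
  unsigned long n = 0;

  while (*s >= L'0' && *s <= L'9')
    {
      n = n * 10 + (unsigned long) (*s - L'0');
      s++;
    }
  return n;
}
```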
 474
 475 @cindex graphic character
 476 @deftypefun int iswgraph (wint_t @var{wc})
 477 @standards{ISO, wctype.h}
 478 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 479 Returns true if @var{wc} is a graphic character; that is, a character
 480 that has a glyph associated with it.  The whitespace characters are not
 481 considered graphic.
 482
 483 @noindent
 484 This function can be implemented using
 485
 486 @smallexample
 487 iswctype (wc, wctype ("graph"))
 488 @end smallexample
 489
 490 @pindex wctype.h
 491 It is declared in @file{wctype.h}.
 492 @end deftypefun
 493
 494 @cindex lower-case character
 495 @deftypefun int iswlower (wint_t @var{wc})
 496 @standards{ISO, wctype.h}
 497 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 498 Returns true if @var{wc} is a lower-case letter.  The letter need not be
 499 from the Latin alphabet; any representable alphabet is valid.
 500
 501 @noindent
 502 This function can be implemented using
 503
 504 @smallexample
 505 iswctype (wc, wctype ("lower"))
 506 @end smallexample
 507
 508 @pindex wctype.h
 509 It is declared in @file{wctype.h}.
 510 @end deftypefun
 511
 512 @cindex printing character
 513 @deftypefun int iswprint (wint_t @var{wc})
 514 @standards{ISO, wctype.h}
 515 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 516 Returns true if @var{wc} is a printing character.  Printing characters
 517 include all the graphic characters, plus the space (@samp{ }) character.
 518
 519 @noindent
 520 This function can be implemented using
 521
 522 @smallexample
 523 iswctype (wc, wctype ("print"))
 524 @end smallexample
 525
 526 @pindex wctype.h
 527 It is declared in @file{wctype.h}.
 528 @end deftypefun
 529
 530 @cindex punctuation character
 531 @deftypefun int iswpunct (wint_t @var{wc})
 532 @standards{ISO, wctype.h}
 533 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 534 Returns true if @var{wc} is a punctuation character.
 535 This means any printing character that is not alphanumeric or a space
 536 character.
 537
 538 @noindent
 539 This function can be implemented using
 540
 541 @smallexample
 542 iswctype (wc, wctype ("punct"))
 543 @end smallexample
 544
 545 @pindex wctype.h
 546 It is declared in @file{wctype.h}.
 547 @end deftypefun
 548
 549 @cindex whitespace character
 550 @deftypefun int iswspace (wint_t @var{wc})
 551 @standards{ISO, wctype.h}
 552 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 553 Returns true if @var{wc} is a @dfn{whitespace} character.  In the standard
 554 @code{"C"} locale, @code{iswspace} returns true for only the standard
 555 whitespace characters:
 556
 557 @table @code
 558 @item L' '
 559 space
 560
 561 @item L'\f'
 562 formfeed
 563
 564 @item L'\n'
 565 newline
 566
 567 @item L'\r'
 568 carriage return
 569
 570 @item L'\t'
 571 horizontal tab
 572
 573 @item L'\v'
 574 vertical tab
 575 @end table
 576
 577 @noindent
 578 This function can be implemented using
 579
 580 @smallexample
 581 iswctype (wc, wctype ("space"))
 582 @end smallexample
 583
 584 @pindex wctype.h
 585 It is declared in @file{wctype.h}.
 586 @end deftypefun
 587
 588 @cindex upper-case character
 589 @deftypefun int iswupper (wint_t @var{wc})
 590 @standards{ISO, wctype.h}
 591 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 592 Returns true if @var{wc} is an upper-case letter.  The letter need not be
 593 from the Latin alphabet; any representable alphabet is valid.
 594
 595 @noindent
 596 This function can be implemented using
 597
 598 @smallexample
 599 iswctype (wc, wctype ("upper"))
 600 @end smallexample
 601
 602 @pindex wctype.h
 603 It is declared in @file{wctype.h}.
 604 @end deftypefun
 605
 606 @cindex hexadecimal digit character
 607 @deftypefun int iswxdigit (wint_t @var{wc})
 608 @standards{ISO, wctype.h}
 609 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 610 Returns true if @var{wc} is a hexadecimal digit.
 611 Hexadecimal digits include the normal decimal digits @samp{0} through
 612 @samp{9} and the letters @samp{A} through @samp{F} and
 613 @samp{a} through @samp{f}.
 614
 615 @noindent
 616 This function can be implemented using
 617
 618 @smallexample
 619 iswctype (wc, wctype ("xdigit"))
 620 @end smallexample
 621
 622 @pindex wctype.h
 623 It is declared in @file{wctype.h}.
 624 @end deftypefun
 625
 626 @Theglibc{} also provides a function which was not defined in
 627 @w{Amendment 1} to @w{ISO C90} but which is available as a version for
 628 single byte characters as well.
 629
 630 @cindex blank character
 631 @deftypefun int iswblank (wint_t @var{wc})
 632 @standards{ISO, wctype.h}
 633 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 634 Returns true if @var{wc} is a blank character; that is, a space or a tab.
 635 This function was originally a GNU extension, but was added in @w{ISO C99}.
 636 It is declared in @file{wctype.h}.
 637 @end deftypefun
 638
 639 @node Using Wide Char Classes, Wide Character Case Conversion, Classification of Wide Characters, Character Handling
 640 @section Notes on using the wide character classes
 641
 642 The first note is probably not astonishing but still occasionally a
 643 cause of problems.  The @code{isw@var{XXX}} functions can be implemented
 644 using macros and in fact, @theglibc{} does this.  They are still
 645 available as real functions but when the @file{wctype.h} header is
 646 included the macros will be used.  This is the same as the
 647 @code{char} type versions of these functions.
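
Whether or not a macro is defined, the real function can still be reached.  A function-like macro is expanded only when its name is directly followed by an opening parenthesis, so a bare name (for instance when taking the function's address) or a parenthesized name refers to the function itself.  The names @code{count_matching}, @code{count_upper} and @code{is_wide_space} in this sketch are hypothetical.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the characters in S for which PRED
   returns a nonzero value.  */
size_t
count_matching (const wchar_t *s, int (*pred) (wint_t))
{
  size_t n = 0;

  for (; *s != L'\0'; s++)
    if (pred ((wint_t) *s))
      n++;
  return n;
}

/* The bare name iswupper is not followed by a parenthesis, so it
   designates the real function even if a macro exists.  */
size_t
count_upper (const wchar_t *s)
{
  return count_matching (s, iswupper);
}

/* Parenthesizing the name also suppresses macro expansion.  */
int
is_wide_space (wint_t wc)
{
  return (iswspace) (wc) != 0;
}
```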
 648
 649 The second note covers something new.  It can be best illustrated by a
 650 (real-world) example.  The first piece of code is an excerpt from the
 651 original code.  It is truncated a bit but the intention should be clear.
 652
 653 @smallexample
 654 int
 655 is_in_class (int c, const char *class)
 656 @{
 657   if (strcmp (class, "alnum") == 0)
 658     return isalnum (c);
 659   if (strcmp (class, "alpha") == 0)
 660     return isalpha (c);
 661   if (strcmp (class, "cntrl") == 0)
 662     return iscntrl (c);
 663   @dots{}
 664   return 0;
 665 @}
 666 @end smallexample
 667
 668 Now, with the @code{wctype} and @code{iswctype} functions you can avoid the
 669 @code{if} cascades, but rewriting the code as follows is wrong:
 670
 671 @smallexample
 672 int
 673 is_in_class (int c, const char *class)
 674 @{
 675   wctype_t desc = wctype (class);
 676   return desc ? iswctype ((wint_t) c, desc) : 0;
 677 @}
 678 @end smallexample
 679
 680 The problem is that it is not guaranteed that the wide character
 681 representation of a single-byte character can be found using casting.
 682 In fact, usually this fails miserably.  The correct solution to this
 683 problem is to write the code as follows:
 684
 685 @smallexample
 686 int
 687 is_in_class (int c, const char *class)
 688 @{
 689   wctype_t desc = wctype (class);
 690   return desc ? iswctype (btowc (c), desc) : 0;
 691 @}
 692 @end smallexample
 693
 694 @xref{Converting a Character}, for more information on @code{btowc}.
 695 Note that this change probably does not improve the performance
 696 of the program a lot since the @code{wctype} function still has to make
 697 the string comparisons.  It gets really interesting if the
 698 @code{is_in_class} function is called more than once for the
 699 same class name.  In this case the variable @var{desc} could be computed
 700 once and reused for all the calls.  Therefore the above form of the
 701 function is probably not the final one.
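
The reuse of the descriptor can be sketched as follows: @code{wctype} is called once, and the resulting value is then used for every character of a string.  The function @code{count_bytes_in_class} is a hypothetical example, not part of the library, and it treats its input as a sequence of single-byte characters.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the single-byte characters in S that
   belong to the class named CLASS_NAME.  The descriptor is looked up
   once, outside the loop.  */
size_t
count_bytes_in_class (const char *s, const char *class_name)
{
  wctype_t desc = wctype (class_name);
  size_t n = 0;

  if (desc == (wctype_t) 0)
    return 0;
  for (; *s != '\0'; s++)
    {
      /* Convert with btowc rather than with a cast.  */
      wint_t wc = btowc ((unsigned char) *s);

      if (wc != WEOF && iswctype (wc, desc))
        n++;
    }
  return n;
}
```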
 702
 703
 704 @node Wide Character Case Conversion, , Using Wide Char Classes, Character Handling
 705 @section Mapping of wide characters
 706
 707 The mapping functions are also generalized by the @w{ISO C}
 708 standard.  Instead of just allowing the two standard mappings, a
 709 locale can contain others.  Again, the @code{localedef} program
 710 already supports generating such locale data files.
 711
 712 @deftp {Data Type} wctrans_t
 713 @standards{ISO, wctype.h}
 714 This data type is defined as a scalar type which can hold a value
 715 representing the locale-dependent character mapping.  There is no way to
 716 construct such a value apart from using the return value of the
 717 @code{wctrans} function.
 718
 719 @pindex wctype.h
 720 @noindent
 721 This type is defined in @file{wctype.h}.
 722 @end deftp
 723
 724 @deftypefun wctrans_t wctrans (const char *@var{property})
 725 @standards{ISO, wctype.h}
 726 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 727 @c Similar implementation, same caveats as wctype.
 728 The @code{wctrans} function has to be used to find out whether a named
 729 mapping is defined in the current locale selected for the
 730 @code{LC_CTYPE} category.  If the returned value is nonzero, you can use
 731 it afterwards in calls to @code{towctrans}.  If the return value is
 732 zero, no such mapping is known in the current locale.
 733
 734 Besides locale-specific mappings there are two mappings which are
 735 guaranteed to be available in every locale:
 736
 737 @multitable @columnfractions .5 .5
 738 @item
 739 @code{"tolower"} @tab @code{"toupper"}
 740 @end multitable
 741
 742 @pindex wctype.h
 743 @noindent
 744 This function is declared in @file{wctype.h}.
 745 @end deftypefun
 746
 747 @deftypefun wint_t towctrans (wint_t @var{wc}, wctrans_t @var{desc})
 748 @standards{ISO, wctype.h}
 749 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 750 @c Same caveats as iswctype.
 751 @code{towctrans} maps the input character @var{wc}
 752 according to the rules of the mapping for which @var{desc} is a
 753 descriptor, and returns the value it finds.  @var{desc} must be
 754 obtained by a successful call to @code{wctrans}.
 755
 756 @pindex wctype.h
 757 @noindent
 758 This function is declared in @file{wctype.h}.
 759 @end deftypefun
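
As with the classification functions, the descriptor is obtained once with @code{wctrans} and then passed to @code{towctrans} for each character.  The following is a minimal sketch under that assumption; @code{map_wide_string} is a hypothetical helper.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: apply the mapping named MAPPING to every
   character of the wide string S, in place.  Returns 0 on success
   and -1 if the mapping is not known in the current locale.  */
int
map_wide_string (wchar_t *s, const char *mapping)
{
  wctrans_t desc = wctrans (mapping);

  if (desc == (wctrans_t) 0)
    return -1;
  for (; *s != L'\0'; s++)
    *s = (wchar_t) towctrans ((wint_t) *s, desc);
  return 0;
}
```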
 760
 761 For the generally available mappings, the @w{ISO C} standard defines
 762 convenient shortcuts so that it is not necessary to call @code{wctrans}
 763 for them.
 764
 765 @deftypefun wint_t towlower (wint_t @var{wc})
 766 @standards{ISO, wctype.h}
 767 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 768 @c Same caveats as iswalnum, just using a wctrans rather than a wctype
 769 @c table.
 770 If @var{wc} is an upper-case letter, @code{towlower} returns the corresponding
 771 lower-case letter.  If @var{wc} is not an upper-case letter,
 772 @var{wc} is returned unchanged.
 773
 774 @noindent
 775 @code{towlower} can be implemented using
 776
 777 @smallexample
 778 towctrans (wc, wctrans ("tolower"))
 779 @end smallexample
 780
 781 @pindex wctype.h
 782 @noindent
 783 This function is declared in @file{wctype.h}.
 784 @end deftypefun
 785
 786 @deftypefun wint_t towupper (wint_t @var{wc})
 787 @standards{ISO, wctype.h}
 788 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
 789 If @var{wc} is a lower-case letter, @code{towupper} returns the corresponding
 790 upper-case letter.  Otherwise @var{wc} is returned unchanged.
 791
 792 @noindent
 793 @code{towupper} can be implemented using
 794
 795 @smallexample
 796 towctrans (wc, wctrans ("toupper"))
 797 @end smallexample
 798
 799 @pindex wctype.h
 800 @noindent
 801 This function is declared in @file{wctype.h}.
 802 @end deftypefun
 803
 804 The same warnings given in the last section for the use of the wide
 805 character classification functions apply here.  It is not possible to
 806 simply cast a @code{char} type value to a @code{wint_t} and use it as an
 807 argument to @code{towctrans} calls.
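
A conversion that respects this warning can be sketched with @code{btowc} and @code{wctob}, which translate between a single-byte character and its wide character representation.  The helper name @code{byte_toupper_via_wide} is hypothetical, and the argument is assumed to be @code{EOF} or an @code{unsigned char} value.

```c
#include <stdio.h>   /* EOF */
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: map the single-byte character C to upper case
   through the wide character functions.  C is returned unchanged if
   it has no wide representation, or if the mapped character cannot
   be represented as a single byte.  */
int
byte_toupper_via_wide (int c)
{
  wctrans_t desc = wctrans ("toupper");
  wint_t wc = btowc (c);   /* Not a cast.  */
  int result;

  if (desc == (wctrans_t) 0 || wc == WEOF)
    return c;
  result = wctob (towctrans (wc, desc));
  return result == EOF ? c : result;
}
```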