manual/arith.texi

   1 @node Arithmetic, Date and Time, Mathematics, Top
   2 @c %MENU% Low level arithmetic functions
   3 @chapter Arithmetic Functions
   4
   5 This chapter contains information about functions for doing basic
   6 arithmetic operations, such as splitting a float into its integer and
   7 fractional parts or retrieving the imaginary part of a complex value.
   8 These functions are declared in the header files @file{math.h} and
   9 @file{complex.h}.
  10
  11 @menu
  12 * Integers::                    Basic integer types and concepts
  13 * Integer Division::            Integer division with guaranteed rounding.
  14 * Floating Point Numbers::      Basic concepts.  IEEE 754.
  15 * Floating Point Classes::      The five kinds of floating-point number.
  16 * Floating Point Errors::       When something goes wrong in a calculation.
  17 * Rounding::                    Controlling how results are rounded.
  18 * Control Functions::           Saving and restoring the FPU's state.
  19 * Arithmetic Functions::        Fundamental operations provided by the library.
  20 * Complex Numbers::             The types.  Writing complex constants.
  21 * Operations on Complex::       Projection, conjugation, decomposition.
  22 * Parsing of Numbers::          Converting strings to numbers.
  23 * Printing of Floats::          Converting floating-point numbers to strings.
  24 * System V Number Conversion::  An archaic way to convert numbers to strings.
  25 @end menu
  26
  27 @node Integers
  28 @section Integers
  29 @cindex integer
  30
  31 The C language defines several integer data types: integer, short integer,
  32 long integer, and character, all in both signed and unsigned varieties.
  33 The GNU C compiler extends the language to contain long long integers
  34 as well.
  35 @cindex signedness
  36
  37 The C integer types were intended to allow code to be portable among
  38 machines with different inherent data sizes (word sizes), so each type
  39 may have different ranges on different machines.  The problem with
  40 this is that a program often needs to be written for a particular range
  41 of integers, and sometimes must be written for a particular size of
  42 storage, regardless of what machine the program runs on.
  43
  44 To address this problem, @theglibc{} contains C type definitions
  45 you can use to declare integers that meet your exact needs.  Because the
  46 @glibcadj{} header files are customized to a specific machine, your
  47 program source code doesn't have to be.
  48
  49 These @code{typedef}s are in @file{stdint.h}.
  50 @pindex stdint.h
  51
  52 If you require that an integer be represented in exactly N bits, use one
  53 of the following types, with the obvious mapping to bit size and signedness:
  54
  55 @itemize @bullet
  56 @item int8_t
  57 @item int16_t
  58 @item int32_t
  59 @item int64_t
  60 @item uint8_t
  61 @item uint16_t
  62 @item uint32_t
  63 @item uint64_t
  64 @end itemize
  65
If your C compiler and target machine do not allow integers of a certain
size, the corresponding type listed above does not exist.
  68
  69 If you don't need a specific storage size, but want the smallest data
  70 structure with @emph{at least} N bits, use one of these:
  71
  72 @itemize @bullet
  73 @item int_least8_t
  74 @item int_least16_t
  75 @item int_least32_t
  76 @item int_least64_t
  77 @item uint_least8_t
  78 @item uint_least16_t
  79 @item uint_least32_t
  80 @item uint_least64_t
  81 @end itemize
  82
  83 If you don't need a specific storage size, but want the data structure
  84 that allows the fastest access while having at least N bits (and
  85 among data structures with the same access speed, the smallest one), use
  86 one of these:
  87
  88 @itemize @bullet
  89 @item int_fast8_t
  90 @item int_fast16_t
  91 @item int_fast32_t
  92 @item int_fast64_t
  93 @item uint_fast8_t
  94 @item uint_fast16_t
  95 @item uint_fast32_t
  96 @item uint_fast64_t
  97 @end itemize
  98
  99 If you want an integer with the widest range possible on the platform on
 100 which it is being used, use one of the following.  If you use these,
 101 you should write code that takes into account the variable size and range
 102 of the integer.
 103
 104 @itemize @bullet
 105 @item intmax_t
 106 @item uintmax_t
 107 @end itemize
 108
 109 @Theglibc{} also provides macros that tell you the maximum and
 110 minimum possible values for each integer data type.  The macro names
 111 follow these examples: @code{INT32_MAX}, @code{UINT8_MAX},
 112 @code{INT_FAST32_MIN}, @code{INT_LEAST64_MIN}, @code{UINTMAX_MAX},
 113 @code{INTMAX_MAX}, @code{INTMAX_MIN}.  Note that there are no macros for
unsigned integer minima; these are always zero.  Similarly, there
are macros such as @code{INTMAX_WIDTH} for the width of these types.
The width macros come from TS 18661-1:2014.
 117 @cindex maximum possible integer
 118 @cindex minimum possible integer
 119
There are similar macros for use with C's built-in integer types, which
should come with your C compiler.  These are described in @ref{Data Type
Measurements}.
 123
Don't forget you can use the C @code{sizeof} operator with any of these
data types to get the number of bytes of storage each uses.
 126
 127
 128 @node Integer Division
 129 @section Integer Division
 130 @cindex integer division functions
 131
This section describes functions for performing integer division.  These
functions are redundant when GNU CC is used, because in GNU C the
@samp{/} operator always rounds towards zero, as @w{ISO C99} and later
standards also require.  But in older C implementations, @samp{/} may
round differently with negative arguments.  @code{div} and @code{ldiv}
are useful because they specify how to round the quotient: towards
zero.  The remainder has the same sign as the numerator.
 139
 140 These functions are specified to return a result @var{r} such that the value
 141 @code{@var{r}.quot*@var{denominator} + @var{r}.rem} equals
 142 @var{numerator}.
 143
 144 @pindex stdlib.h
 145 To use these facilities, you should include the header file
 146 @file{stdlib.h} in your program.
 147
 148 @comment stdlib.h
 149 @comment ISO
 150 @deftp {Data Type} div_t
 151 This is a structure type used to hold the result returned by the @code{div}
 152 function.  It has the following members:
 153
 154 @table @code
 155 @item int quot
 156 The quotient from the division.
 157
 158 @item int rem
 159 The remainder from the division.
 160 @end table
 161 @end deftp
 162
 163 @comment stdlib.h
 164 @comment ISO
 165 @deftypefun div_t div (int @var{numerator}, int @var{denominator})
 166 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 167 @c Functions in this section are pure, and thus safe.
 168 The function @code{div} computes the quotient and remainder from
 169 the division of @var{numerator} by @var{denominator}, returning the
 170 result in a structure of type @code{div_t}.
 171
 172 If the result cannot be represented (as in a division by zero), the
 173 behavior is undefined.
 174
 175 Here is an example, albeit not a very useful one.
 176
 177 @smallexample
 178 div_t result;
 179 result = div (20, -6);
 180 @end smallexample
 181
 182 @noindent
 183 Now @code{result.quot} is @code{-3} and @code{result.rem} is @code{2}.
 184 @end deftypefun
 185
 186 @comment stdlib.h
 187 @comment ISO
 188 @deftp {Data Type} ldiv_t
 189 This is a structure type used to hold the result returned by the @code{ldiv}
 190 function.  It has the following members:
 191
 192 @table @code
 193 @item long int quot
 194 The quotient from the division.
 195
 196 @item long int rem
 197 The remainder from the division.
 198 @end table
 199
 200 (This is identical to @code{div_t} except that the components are of
 201 type @code{long int} rather than @code{int}.)
 202 @end deftp
 203
 204 @comment stdlib.h
 205 @comment ISO
 206 @deftypefun ldiv_t ldiv (long int @var{numerator}, long int @var{denominator})
 207 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 208 The @code{ldiv} function is similar to @code{div}, except that the
 209 arguments are of type @code{long int} and the result is returned as a
 210 structure of type @code{ldiv_t}.
 211 @end deftypefun
 212
 213 @comment stdlib.h
 214 @comment ISO
 215 @deftp {Data Type} lldiv_t
 216 This is a structure type used to hold the result returned by the @code{lldiv}
 217 function.  It has the following members:
 218
 219 @table @code
 220 @item long long int quot
 221 The quotient from the division.
 222
 223 @item long long int rem
 224 The remainder from the division.
 225 @end table
 226
 227 (This is identical to @code{div_t} except that the components are of
 228 type @code{long long int} rather than @code{int}.)
 229 @end deftp
 230
 231 @comment stdlib.h
 232 @comment ISO
 233 @deftypefun lldiv_t lldiv (long long int @var{numerator}, long long int @var{denominator})
 234 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 235 The @code{lldiv} function is like the @code{div} function, but the
 236 arguments are of type @code{long long int} and the result is returned as
 237 a structure of type @code{lldiv_t}.
 238
 239 The @code{lldiv} function was added in @w{ISO C99}.
 240 @end deftypefun
 241
 242 @comment inttypes.h
 243 @comment ISO
 244 @deftp {Data Type} imaxdiv_t
 245 This is a structure type used to hold the result returned by the @code{imaxdiv}
 246 function.  It has the following members:
 247
 248 @table @code
 249 @item intmax_t quot
 250 The quotient from the division.
 251
 252 @item intmax_t rem
 253 The remainder from the division.
 254 @end table
 255
 256 (This is identical to @code{div_t} except that the components are of
 257 type @code{intmax_t} rather than @code{int}.)
 258
 259 See @ref{Integers} for a description of the @code{intmax_t} type.
 260
 261 @end deftp
 262
 263 @comment inttypes.h
 264 @comment ISO
 265 @deftypefun imaxdiv_t imaxdiv (intmax_t @var{numerator}, intmax_t @var{denominator})
 266 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 267 The @code{imaxdiv} function is like the @code{div} function, but the
 268 arguments are of type @code{intmax_t} and the result is returned as
 269 a structure of type @code{imaxdiv_t}.
 270
 271 See @ref{Integers} for a description of the @code{intmax_t} type.
 272
 273 The @code{imaxdiv} function was added in @w{ISO C99}.
 274 @end deftypefun
 275
 276
 277 @node Floating Point Numbers
 278 @section Floating Point Numbers
 279 @cindex floating point
 280 @cindex IEEE 754
 281 @cindex IEEE floating point
 282
 283 Most computer hardware has support for two different kinds of numbers:
 284 integers (@math{@dots{}-3, -2, -1, 0, 1, 2, 3@dots{}}) and
 285 floating-point numbers.  Floating-point numbers have three parts: the
 286 @dfn{mantissa}, the @dfn{exponent}, and the @dfn{sign bit}.  The real
 287 number represented by a floating-point value is given by
 288 @tex
 289 $(s \mathrel? -1 \mathrel: 1) \cdot 2^e \cdot M$
 290 @end tex
 291 @ifnottex
 292 @math{(s ? -1 : 1) @mul{} 2^e @mul{} M}
 293 @end ifnottex
 294 where @math{s} is the sign bit, @math{e} the exponent, and @math{M}
 295 the mantissa.  @xref{Floating Point Concepts}, for details.  (It is
 296 possible to have a different @dfn{base} for the exponent, but all modern
 297 hardware uses @math{2}.)
 298
 299 Floating-point numbers can represent a finite subset of the real
 300 numbers.  While this subset is large enough for most purposes, it is
 301 important to remember that the only reals that can be represented
 302 exactly are rational numbers that have a terminating binary expansion
 303 shorter than the width of the mantissa.  Even simple fractions such as
 304 @math{1/5} can only be approximated by floating point.
 305
 306 Mathematical operations and functions frequently need to produce values
 307 that are not representable.  Often these values can be approximated
 308 closely enough for practical purposes, but sometimes they can't.
 309 Historically there was no way to tell when the results of a calculation
 310 were inaccurate.  Modern computers implement the @w{IEEE 754} standard
 311 for numerical computations, which defines a framework for indicating to
 312 the program when the results of calculation are not trustworthy.  This
 313 framework consists of a set of @dfn{exceptions} that indicate why a
 314 result could not be represented, and the special values @dfn{infinity}
 315 and @dfn{not a number} (NaN).
 316
 317 @node Floating Point Classes
 318 @section Floating-Point Number Classification Functions
 319 @cindex floating-point classes
 320 @cindex classes, floating-point
 321 @pindex math.h
 322
 323 @w{ISO C99} defines macros that let you determine what sort of
 324 floating-point number a variable holds.
 325
 326 @comment math.h
 327 @comment ISO
 328 @deftypefn {Macro} int fpclassify (@emph{float-type} @var{x})
 329 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 330 This is a generic macro which works on all floating-point types and
 331 which returns a value of type @code{int}.  The possible values are:
 332
 333 @vtable @code
 334 @item FP_NAN
 335 The floating-point number @var{x} is ``Not a Number'' (@pxref{Infinity
 336 and NaN})
 337 @item FP_INFINITE
 338 The value of @var{x} is either plus or minus infinity (@pxref{Infinity
 339 and NaN})
 340 @item FP_ZERO
 341 The value of @var{x} is zero.  In floating-point formats like @w{IEEE
 342 754}, where zero can be signed, this value is also returned if
 343 @var{x} is negative zero.
 344 @item FP_SUBNORMAL
 345 Numbers whose absolute value is too small to be represented in the
 346 normal format are represented in an alternate, @dfn{denormalized} format
 347 (@pxref{Floating Point Concepts}).  This format is less precise but can
 348 represent values closer to zero.  @code{fpclassify} returns this value
 349 for values of @var{x} in this alternate format.
 350 @item FP_NORMAL
 351 This value is returned for all other values of @var{x}.  It indicates
 352 that there is nothing special about the number.
 353 @end vtable
 354
 355 @end deftypefn
 356
 357 @code{fpclassify} is most useful if more than one property of a number
 358 must be tested.  There are more specific macros which only test one
 359 property at a time.  Generally these macros execute faster than
 360 @code{fpclassify}, since there is special hardware support for them.
 361 You should therefore use the specific macros whenever possible.
 362
 363 @comment math.h
 364 @comment ISO
 365 @deftypefn {Macro} int iscanonical (@emph{float-type} @var{x})
 366 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 367 In some floating-point formats, some values have canonical (preferred)
 368 and noncanonical encodings (for IEEE interchange binary formats, all
 369 encodings are canonical).  This macro returns a nonzero value if
 370 @var{x} has a canonical encoding.  It is from TS 18661-1:2014.
 371
 372 Note that some formats have multiple encodings of a value which are
 373 all equally canonical; @code{iscanonical} returns a nonzero value for
 374 all such encodings.  Also, formats may have encodings that do not
 375 correspond to any valid value of the type.  In ISO C terms these are
 376 @dfn{trap representations}; in @theglibc{}, @code{iscanonical} returns
 377 zero for such encodings.
 378 @end deftypefn
 379
 380 @comment math.h
 381 @comment ISO
 382 @deftypefn {Macro} int isfinite (@emph{float-type} @var{x})
 383 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 384 This macro returns a nonzero value if @var{x} is finite: not plus or
 385 minus infinity, and not NaN.  It is equivalent to
 386
 387 @smallexample
 388 (fpclassify (x) != FP_NAN && fpclassify (x) != FP_INFINITE)
 389 @end smallexample
 390
 391 @code{isfinite} is implemented as a macro which accepts any
 392 floating-point type.
 393 @end deftypefn
 394
 395 @comment math.h
 396 @comment ISO
 397 @deftypefn {Macro} int isnormal (@emph{float-type} @var{x})
 398 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 399 This macro returns a nonzero value if @var{x} is finite and normalized.
 400 It is equivalent to
 401
 402 @smallexample
 403 (fpclassify (x) == FP_NORMAL)
 404 @end smallexample
 405 @end deftypefn
 406
 407 @comment math.h
 408 @comment ISO
 409 @deftypefn {Macro} int isnan (@emph{float-type} @var{x})
 410 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 411 This macro returns a nonzero value if @var{x} is NaN.  It is equivalent
 412 to
 413
 414 @smallexample
 415 (fpclassify (x) == FP_NAN)
 416 @end smallexample
 417 @end deftypefn
 418
 419 @comment math.h
 420 @comment ISO
 421 @deftypefn {Macro} int issignaling (@emph{float-type} @var{x})
 422 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 423 This macro returns a nonzero value if @var{x} is a signaling NaN
 424 (sNaN).  It is from TS 18661-1:2014.
 425 @end deftypefn
 426
 427 @comment math.h
 428 @comment ISO
 429 @deftypefn {Macro} int issubnormal (@emph{float-type} @var{x})
 430 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 431 This macro returns a nonzero value if @var{x} is subnormal.  It is
 432 from TS 18661-1:2014.
 433 @end deftypefn
 434
 435 @comment math.h
 436 @comment ISO
 437 @deftypefn {Macro} int iszero (@emph{float-type} @var{x})
 438 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 439 This macro returns a nonzero value if @var{x} is zero.  It is from TS
 440 18661-1:2014.
 441 @end deftypefn
 442
 443 Another set of floating-point classification functions was provided by
 444 BSD.  @Theglibc{} also supports these functions; however, we
 445 recommend that you use the ISO C99 macros in new code.  Those are standard
 446 and will be available more widely.  Also, since they are macros, you do
 447 not have to worry about the type of their argument.
 448
 449 @comment math.h
 450 @comment BSD
 451 @deftypefun int isinf (double @var{x})
 452 @comment math.h
 453 @comment BSD
 454 @deftypefunx int isinff (float @var{x})
 455 @comment math.h
 456 @comment BSD
 457 @deftypefunx int isinfl (long double @var{x})
 458 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 459 This function returns @code{-1} if @var{x} represents negative infinity,
 460 @code{1} if @var{x} represents positive infinity, and @code{0} otherwise.
 461 @end deftypefun
 462
 463 @comment math.h
 464 @comment BSD
 465 @deftypefun int isnan (double @var{x})
 466 @comment math.h
 467 @comment BSD
 468 @deftypefunx int isnanf (float @var{x})
 469 @comment math.h
 470 @comment BSD
 471 @deftypefunx int isnanl (long double @var{x})
 472 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 473 This function returns a nonzero value if @var{x} is a ``not a number''
 474 value, and zero otherwise.
 475
 476 @strong{NB:} The @code{isnan} macro defined by @w{ISO C99} overrides
 477 the BSD function.  This is normally not a problem, because the two
 478 routines behave identically.  However, if you really need to get the BSD
 479 function for some reason, you can write
 480
 481 @smallexample
 482 (isnan) (x)
 483 @end smallexample
 484 @end deftypefun
 485
 486 @comment math.h
 487 @comment BSD
 488 @deftypefun int finite (double @var{x})
 489 @comment math.h
 490 @comment BSD
 491 @deftypefunx int finitef (float @var{x})
 492 @comment math.h
 493 @comment BSD
 494 @deftypefunx int finitel (long double @var{x})
 495 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function returns a nonzero value if @var{x} is neither infinite nor
a ``not a number'' value, and zero otherwise.
 498 @end deftypefun
 499
 500 @strong{Portability Note:} The functions listed in this section are BSD
 501 extensions.
 502
 503
 504 @node Floating Point Errors
 505 @section Errors in Floating-Point Calculations
 506
 507 @menu
 508 * FP Exceptions::               IEEE 754 math exceptions and how to detect them.
 509 * Infinity and NaN::            Special values returned by calculations.
 510 * Status bit operations::       Checking for exceptions after the fact.
 511 * Math Error Reporting::        How the math functions report errors.
 512 @end menu
 513
 514 @node FP Exceptions
 515 @subsection FP Exceptions
 516 @cindex exception
 517 @cindex signal
 518 @cindex zero divide
 519 @cindex division by zero
 520 @cindex inexact exception
 521 @cindex invalid exception
 522 @cindex overflow exception
 523 @cindex underflow exception
 524
 525 The @w{IEEE 754} standard defines five @dfn{exceptions} that can occur
 526 during a calculation.  Each corresponds to a particular sort of error,
 527 such as overflow.
 528
 529 When exceptions occur (when exceptions are @dfn{raised}, in the language
 530 of the standard), one of two things can happen.  By default the
 531 exception is simply noted in the floating-point @dfn{status word}, and
 532 the program continues as if nothing had happened.  The operation
 533 produces a default value, which depends on the exception (see the table
 534 below).  Your program can check the status word to find out which
 535 exceptions happened.
 536
 537 Alternatively, you can enable @dfn{traps} for exceptions.  In that case,
 538 when an exception is raised, your program will receive the @code{SIGFPE}
 539 signal.  The default action for this signal is to terminate the
 540 program.  @xref{Signal Handling}, for how you can change the effect of
 541 the signal.
 542
 543 @findex matherr
 544 In the System V math library, the user-defined function @code{matherr}
 545 is called when certain exceptions occur inside math library functions.
 546 However, the Unix98 standard deprecates this interface.  We support it
 547 for historical compatibility, but recommend that you do not use it in
 548 new programs.  When this interface is used, exceptions may not be
 549 raised.
 550
 551 @noindent
 552 The exceptions defined in @w{IEEE 754} are:
 553
 554 @table @samp
 555 @item Invalid Operation
 556 This exception is raised if the given operands are invalid for the
 557 operation to be performed.  Examples are
 558 (see @w{IEEE 754}, @w{section 7}):
 559 @enumerate
 560 @item
 561 Addition or subtraction: @math{@infinity{} - @infinity{}}.  (But
 562 @math{@infinity{} + @infinity{} = @infinity{}}).
 563 @item
 564 Multiplication: @math{0 @mul{} @infinity{}}.
 565 @item
 566 Division: @math{0/0} or @math{@infinity{}/@infinity{}}.
 567 @item
 568 Remainder: @math{x} REM @math{y}, where @math{y} is zero or @math{x} is
 569 infinite.
 570 @item
 571 Square root if the operand is less than zero.  More generally, any
 572 mathematical function evaluated outside its domain produces this
 573 exception.
 574 @item
 575 Conversion of a floating-point number to an integer or decimal
 576 string, when the number cannot be represented in the target format (due
 577 to overflow, infinity, or NaN).
 578 @item
 579 Conversion of an unrecognizable input string.
 580 @item
 581 Comparison via predicates involving @math{<} or @math{>}, when one or
 582 other of the operands is NaN.  You can prevent this exception by using
 583 the unordered comparison functions instead; see @ref{FP Comparison Functions}.
 584 @end enumerate
 585
 586 If the exception does not trap, the result of the operation is NaN.
 587
 588 @item Division by Zero
 589 This exception is raised when a finite nonzero number is divided
 590 by zero.  If no trap occurs the result is either @math{+@infinity{}} or
 591 @math{-@infinity{}}, depending on the signs of the operands.
 592
 593 @item Overflow
 594 This exception is raised whenever the result cannot be represented
 595 as a finite value in the precision format of the destination.  If no trap
 596 occurs the result depends on the sign of the intermediate result and the
 597 current rounding mode (@w{IEEE 754}, @w{section 7.3}):
 598 @enumerate
 599 @item
 600 Round to nearest carries all overflows to @math{@infinity{}}
 601 with the sign of the intermediate result.
 602 @item
 603 Round toward @math{0} carries all overflows to the largest representable
 604 finite number with the sign of the intermediate result.
 605 @item
 606 Round toward @math{-@infinity{}} carries positive overflows to the
 607 largest representable finite number and negative overflows to
 608 @math{-@infinity{}}.
 609
 610 @item
 611 Round toward @math{@infinity{}} carries negative overflows to the
 612 most negative representable finite number and positive overflows
 613 to @math{@infinity{}}.
 614 @end enumerate
 615
 616 Whenever the overflow exception is raised, the inexact exception is also
 617 raised.
 618
 619 @item Underflow
 620 The underflow exception is raised when an intermediate result is too
 621 small to be calculated accurately, or if the operation's result rounded
 622 to the destination precision is too small to be normalized.
 623
 624 When no trap is installed for the underflow exception, underflow is
 625 signaled (via the underflow flag) only when both tininess and loss of
 626 accuracy have been detected.  If no trap handler is installed the
 627 operation continues with an imprecise small value, or zero if the
 628 destination precision cannot hold the small exact result.
 629
 630 @item Inexact
This exception is signaled if a rounded result is not exact (such as
 632 when calculating the square root of two) or a result overflows without
 633 an overflow trap.
 634 @end table
 635
 636 @node Infinity and NaN
 637 @subsection Infinity and NaN
 638 @cindex infinity
 639 @cindex not a number
 640 @cindex NaN
 641
 642 @w{IEEE 754} floating point numbers can represent positive or negative
 643 infinity, and @dfn{NaN} (not a number).  These three values arise from
 644 calculations whose result is undefined or cannot be represented
 645 accurately.  You can also deliberately set a floating-point variable to
 646 any of them, which is sometimes useful.  Some examples of calculations
 647 that produce infinity or NaN:
 648
 649 @ifnottex
 650 @smallexample
 651 @math{1/0 = @infinity{}}
 652 @math{log (0) = -@infinity{}}
 653 @math{sqrt (-1) = NaN}
 654 @end smallexample
 655 @end ifnottex
 656 @tex
 657 $${1\over0} = \infty$$
 658 $$\log 0 = -\infty$$
 659 $$\sqrt{-1} = \hbox{NaN}$$
 660 @end tex
 661
 662 When a calculation produces any of these values, an exception also
 663 occurs; see @ref{FP Exceptions}.
 664
 665 The basic operations and math functions all accept infinity and NaN and
 666 produce sensible output.  Infinities propagate through calculations as
 667 one would expect: for example, @math{2 + @infinity{} = @infinity{}},
 668 @math{4/@infinity{} = 0}, atan @math{(@infinity{}) = @pi{}/2}.  NaN, on
 669 the other hand, infects any calculation that involves it.  Unless the
 670 calculation would produce the same result no matter what real value
 671 replaced NaN, the result is NaN.
 672
 673 In comparison operations, positive infinity is larger than all values
 674 except itself and NaN, and negative infinity is smaller than all values
 675 except itself and NaN.  NaN is @dfn{unordered}: it is not equal to,
 676 greater than, or less than anything, @emph{including itself}. @code{x ==
 677 x} is false if the value of @code{x} is NaN.  You can use this to test
 678 whether a value is NaN or not, but the recommended way to test for NaN
 679 is with the @code{isnan} function (@pxref{Floating Point Classes}).  In
 680 addition, @code{<}, @code{>}, @code{<=}, and @code{>=} will raise an
 681 exception when applied to NaNs.
 682
 683 @file{math.h} defines macros that allow you to explicitly set a variable
 684 to infinity or NaN.
 685
 686 @comment math.h
 687 @comment ISO
 688 @deftypevr Macro float INFINITY
 689 An expression representing positive infinity.  It is equal to the value
produced by mathematical operations like @code{1.0 / 0.0}.
 691 @code{-INFINITY} represents negative infinity.
 692
 693 You can test whether a floating-point value is infinite by comparing it
 694 to this macro.  However, this is not recommended; you should use the
 695 @code{isfinite} macro instead.  @xref{Floating Point Classes}.
 696
 697 This macro was introduced in the @w{ISO C99} standard.
 698 @end deftypevr
 699
 700 @comment math.h
 701 @comment GNU
 702 @deftypevr Macro float NAN
 703 An expression representing a value which is ``not a number''.  This
 704 macro is a GNU extension, available only on machines that support the
 705 ``not a number'' value---that is to say, on all machines that support
 706 IEEE floating point.
 707
 708 You can use @samp{#ifdef NAN} to test whether the machine supports
 709 NaN.  (Of course, you must arrange for GNU extensions to be visible,
 710 such as by defining @code{_GNU_SOURCE}, and then you must include
 711 @file{math.h}.)
 712 @end deftypevr
 713
 714 @comment math.h
 715 @comment ISO
 716 @deftypevr Macro float SNANF
 717 @deftypevrx Macro double SNAN
 718 @deftypevrx Macro {long double} SNANL
 719 These macros, defined by TS 18661-1:2014, are constant expressions for
 720 signaling NaNs.
 721 @end deftypevr
 722
 723 @comment fenv.h
 724 @comment ISO
 725 @deftypevr Macro int FE_SNANS_ALWAYS_SIGNAL
 726 This macro, defined by TS 18661-1:2014, is defined to @code{1} in
 727 @file{fenv.h} to indicate that functions and operations with signaling
 728 NaN inputs and floating-point results always raise the invalid
 729 exception and return a quiet NaN, even in cases (such as @code{fmax},
 730 @code{hypot} and @code{pow}) where a quiet NaN input can produce a
 731 non-NaN result.  Because some compiler optimizations may not handle
 732 signaling NaNs correctly, this macro is only defined if compiler
 733 support for signaling NaNs is enabled.  That support can be enabled
 734 with the GCC option @option{-fsignaling-nans}.
 735 @end deftypevr
 736
 737 @w{IEEE 754} also allows for another unusual value: negative zero.  This
 738 value is produced when you divide a positive number by negative
 739 infinity, or when a negative result is smaller than the limits of
 740 representation.
 741
 742 @node Status bit operations
 743 @subsection Examining the FPU status word
 744
 745 @w{ISO C99} defines functions to query and manipulate the
 746 floating-point status word.  You can use these functions to check for
 747 untrapped exceptions when it's convenient, rather than worrying about
 748 them in the middle of a calculation.
 749
 750 These constants represent the various @w{IEEE 754} exceptions.  Not all
 751 FPUs report all the different exceptions.  Each constant is defined if
 752 and only if the FPU you are compiling for supports that exception, so
 753 you can test for FPU support with @samp{#ifdef}.  They are defined in
 754 @file{fenv.h}.
 755
 756 @vtable @code
 757 @comment fenv.h
 758 @comment ISO
 759 @item FE_INEXACT
 760  The inexact exception.
 761 @comment fenv.h
 762 @comment ISO
 763 @item FE_DIVBYZERO
 764  The divide by zero exception.
 765 @comment fenv.h
 766 @comment ISO
 767 @item FE_UNDERFLOW
 768  The underflow exception.
 769 @comment fenv.h
 770 @comment ISO
 771 @item FE_OVERFLOW
 772  The overflow exception.
 773 @comment fenv.h
 774 @comment ISO
 775 @item FE_INVALID
 776  The invalid exception.
 777 @end vtable
 778
 779 The macro @code{FE_ALL_EXCEPT} is the bitwise OR of all exception macros
 780 which are supported by the FP implementation.
 781
 782 These functions allow you to clear exception flags, test for exceptions,
 783 and save and restore the set of exceptions flagged.
 784
 785 @comment fenv.h
 786 @comment ISO
 787 @deftypefun int feclearexcept (int @var{excepts})
 788 @safety{@prelim{}@mtsafe{}@assafe{@assposix{}}@acsafe{@acsposix{}}}
 789 @c The other functions in this section that modify FP status register
 790 @c mostly do so with non-atomic load-modify-store sequences, but since
 791 @c the register is thread-specific, this should be fine, and safe for
 792 @c cancellation.  As long as the FP environment is restored before the
 793 @c signal handler returns control to the interrupted thread (like any
 794 @c kernel should do), the functions are also safe for use in signal
 795 @c handlers.
 796 This function clears all of the supported exception flags indicated by
 797 @var{excepts}.
 798
 799 The function returns zero in case the operation was successful, a
 800 non-zero value otherwise.
 801 @end deftypefun
 802
 803 @comment fenv.h
 804 @comment ISO
 805 @deftypefun int feraiseexcept (int @var{excepts})
 806 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This function raises the supported exceptions indicated by
@var{excepts}.  If more than one exception bit in @var{excepts} is set,
the order in which the exceptions are raised is unspecified, except that
overflow (@code{FE_OVERFLOW}) or underflow (@code{FE_UNDERFLOW}) is
raised before inexact (@code{FE_INEXACT}).  Whether raising overflow or
underflow also raises the inexact exception is implementation-dependent.
 814
 815 The function returns zero in case the operation was successful, a
 816 non-zero value otherwise.
 817 @end deftypefun
 818
 819 @comment fenv.h
 820 @comment ISO
 821 @deftypefun int fesetexcept (int @var{excepts})
 822 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 823 This function sets the supported exception flags indicated by
 824 @var{excepts}, like @code{feraiseexcept}, but without causing enabled
 825 traps to be taken.  @code{fesetexcept} is from TS 18661-1:2014.
 826
 827 The function returns zero in case the operation was successful, a
 828 non-zero value otherwise.
 829 @end deftypefun
 830
 831 @comment fenv.h
 832 @comment ISO
 833 @deftypefun int fetestexcept (int @var{excepts})
 834 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
Test whether the exception flags indicated by the parameter @var{excepts}
are currently set.  If any of them are, a nonzero value is returned
which specifies which exceptions are set.  Otherwise the result is zero.
 838 @end deftypefun
 839
 840 To understand these functions, imagine that the status word is an
 841 integer variable named @var{status}.  @code{feclearexcept} is then
 842 equivalent to @samp{status &= ~excepts} and @code{fetestexcept} is
 843 equivalent to @samp{(status & excepts)}.  The actual implementation may
 844 be very different, of course.
 845
 846 Exception flags are only cleared when the program explicitly requests it,
 847 by calling @code{feclearexcept}.  If you want to check for exceptions
 848 from a set of calculations, you should clear all the flags first.  Here
 849 is a simple example of the way to use @code{fetestexcept}:
 850
 851 @smallexample
 852 @{
 853   double f;
 854   int raised;
 855   feclearexcept (FE_ALL_EXCEPT);
 856   f = compute ();
 857   raised = fetestexcept (FE_OVERFLOW | FE_INVALID);
 858   if (raised & FE_OVERFLOW) @{ /* @dots{} */ @}
 859   if (raised & FE_INVALID) @{ /* @dots{} */ @}
 860   /* @dots{} */
 861 @}
 862 @end smallexample
 863
Apart from @code{feraiseexcept} and @code{fesetexcept}, there is no way
to set bits in the status word directly.  You can, however, save the
state of the exception flags and restore it later.  This is done with the
following functions:
 867
 868 @comment fenv.h
 869 @comment ISO
 870 @deftypefun int fegetexceptflag (fexcept_t *@var{flagp}, int @var{excepts})
 871 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 872 This function stores in the variable pointed to by @var{flagp} an
 873 implementation-defined value representing the current setting of the
 874 exception flags indicated by @var{excepts}.
 875
 876 The function returns zero in case the operation was successful, a
 877 non-zero value otherwise.
 878 @end deftypefun
 879
 880 @comment fenv.h
 881 @comment ISO
 882 @deftypefun int fesetexceptflag (const fexcept_t *@var{flagp}, int @var{excepts})
 883 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 884 This function restores the flags for the exceptions indicated by
 885 @var{excepts} to the values stored in the variable pointed to by
 886 @var{flagp}.
 887
 888 The function returns zero in case the operation was successful, a
 889 non-zero value otherwise.
 890 @end deftypefun
 891
 892 Note that the value stored in @code{fexcept_t} bears no resemblance to
 893 the bit mask returned by @code{fetestexcept}.  The type may not even be
 894 an integer.  Do not attempt to modify an @code{fexcept_t} variable.
 895
 896 @comment fenv.h
 897 @comment ISO
 898 @deftypefun int fetestexceptflag (const fexcept_t *@var{flagp}, int @var{excepts})
 899 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 900 Test whether the exception flags indicated by the parameter
 901 @var{excepts} are set in the variable pointed to by @var{flagp}.  If
 902 any of them are, a nonzero value is returned which specifies which
 903 exceptions are set.  Otherwise the result is zero.
 904 @code{fetestexceptflag} is from TS 18661-1:2014.
 905 @end deftypefun
 906
 907 @node Math Error Reporting
 908 @subsection Error Reporting by Mathematical Functions
 909 @cindex errors, mathematical
 910 @cindex domain error
 911 @cindex range error
 912
 913 Many of the math functions are defined only over a subset of the real or
 914 complex numbers.  Even if they are mathematically defined, their result
 915 may be larger or smaller than the range representable by their return
 916 type without loss of accuracy.  These are known as @dfn{domain errors},
 917 @dfn{overflows}, and
 918 @dfn{underflows}, respectively.  Math functions do several things when
 919 one of these errors occurs.  In this manual we will refer to the
 920 complete response as @dfn{signalling} a domain error, overflow, or
 921 underflow.
 922
When a math function suffers a domain error, it raises the invalid
exception and returns NaN.  It also sets @code{errno} to @code{EDOM};
this is for compatibility with old systems that do not support @w{IEEE
754} exception handling.  Likewise, when overflow occurs, math
functions raise the overflow exception and, in the default rounding
mode, return @math{@infinity{}} or @math{-@infinity{}} as appropriate
(in other rounding modes, the largest finite value of the appropriate
sign is returned when appropriate for that rounding mode).  They also
set @code{errno} to @code{ERANGE} if returning @math{@infinity{}} or
@math{-@infinity{}}; @code{errno} may or may not be set to
@code{ERANGE} when a finite value is returned on overflow.  When
underflow occurs, the underflow exception is raised, and zero
(appropriately signed) or a subnormal value, as appropriate for the
mathematical result of the function and the rounding mode, is
returned.  @code{errno} may be set to @code{ERANGE}, but this is not
guaranteed; it is intended that @theglibc{} should set it when the
underflow is to an appropriately signed zero, but not necessarily for
other underflows.
 941
 942 When a math function has an argument that is a signaling NaN,
 943 @theglibc{} does not consider this a domain error, so @code{errno} is
 944 unchanged, but the invalid exception is still raised (except for a few
 945 functions that are specified to handle signaling NaNs differently).
 946
 947 Some of the math functions are defined mathematically to result in a
 948 complex value over parts of their domains.  The most familiar example of
 949 this is taking the square root of a negative number.  The complex math
 950 functions, such as @code{csqrt}, will return the appropriate complex value
 951 in this case.  The real-valued functions, such as @code{sqrt}, will
 952 signal a domain error.
 953
 954 Some older hardware does not support infinities.  On that hardware,
 955 overflows instead return a particular very large number (usually the
 956 largest representable number).  @file{math.h} defines macros you can use
 957 to test for overflow on both old and new hardware.
 958
 959 @comment math.h
 960 @comment ISO
 961 @deftypevr Macro double HUGE_VAL
 962 @comment math.h
 963 @comment ISO
 964 @deftypevrx Macro float HUGE_VALF
 965 @comment math.h
 966 @comment ISO
 967 @deftypevrx Macro {long double} HUGE_VALL
 968 An expression representing a particular very large number.  On machines
 969 that use @w{IEEE 754} floating point format, @code{HUGE_VAL} is infinity.
 970 On other machines, it's typically the largest positive number that can
 971 be represented.
 972
 973 Mathematical functions return the appropriately typed version of
 974 @code{HUGE_VAL} or @code{@minus{}HUGE_VAL} when the result is too large
 975 to be represented.
 976 @end deftypevr
 977
 978 @node Rounding
 979 @section Rounding Modes
 980
 981 Floating-point calculations are carried out internally with extra
 982 precision, and then rounded to fit into the destination type.  This
 983 ensures that results are as precise as the input data.  @w{IEEE 754}
 984 defines four possible rounding modes:
 985
 986 @table @asis
 987 @item Round to nearest.
This is the default mode.  It should be used unless there is a specific
need for one of the others.  In this mode results are rounded to the
nearest representable value.  If the result is midway between two
representable values, the even one is chosen.  @dfn{Even} here
means the lowest-order bit is zero.  This rounding mode prevents
statistical bias and keeps the error of each individual operation as
small as possible: a normalized result differs from the exact value by
at most half a unit in the last place, that is, by a relative error of
at most half of @code{FLT_EPSILON} for @code{float}.
 995
 996 @c @item Round toward @math{+@infinity{}}
 997 @item Round toward plus Infinity.
All results are rounded to the smallest representable value
which is greater than or equal to the result.
1000
1001 @c @item Round toward @math{-@infinity{}}
1002 @item Round toward minus Infinity.
All results are rounded to the largest representable value which is less
than or equal to the result.
1005
1006 @item Round toward zero.
All results are rounded to the largest representable value whose
magnitude is less than or equal to that of the result.  In other words,
if the result is negative it is rounded up; if it is positive, it is
rounded down.
1011 @end table
1012
1013 @noindent
1014 @file{fenv.h} defines constants which you can use to refer to the
1015 various rounding modes.  Each one will be defined if and only if the FPU
1016 supports the corresponding rounding mode.
1017
1018 @vtable @code
1019 @comment fenv.h
1020 @comment ISO
1021 @item FE_TONEAREST
1022 Round to nearest.
1023
1024 @comment fenv.h
1025 @comment ISO
1026 @item FE_UPWARD
1027 Round toward @math{+@infinity{}}.
1028
1029 @comment fenv.h
1030 @comment ISO
1031 @item FE_DOWNWARD
1032 Round toward @math{-@infinity{}}.
1033
1034 @comment fenv.h
1035 @comment ISO
1036 @item FE_TOWARDZERO
1037 Round toward zero.
1038 @end vtable
1039
1040 Underflow is an unusual case.  Normally, @w{IEEE 754} floating point
1041 numbers are always normalized (@pxref{Floating Point Concepts}).
Numbers smaller in magnitude than @math{2^r} (where @math{r} is the minimum
exponent, @code{FLT_MIN_EXP-1} for @code{float}) cannot be represented as
1044 normalized numbers.  Rounding all such numbers to zero or @math{2^r}
1045 would cause some algorithms to fail at 0.  Therefore, they are left in
denormalized form.  That produces loss of precision, since some bits of
the mantissa are taken up by leading zeros.
1048
1049 If a result is too small to be represented as a denormalized number, it
1050 is rounded to zero.  However, the sign of the result is preserved; if
1051 the calculation was negative, the result is @dfn{negative zero}.
1052 Negative zero can also result from some operations on infinity, such as
1053 @math{4/-@infinity{}}.
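
Because negative zero compares equal to positive zero, an ordinary
comparison cannot tell them apart.  The following sketch shows one way to
recognize it using the @code{signbit} macro from @file{math.h}; the helper
name @code{is_negative_zero} is hypothetical.

```c
#include <math.h>

/* Hypothetical helper (not a library function): return nonzero if x is
   negative zero.  The comparison alone is not enough, because
   -0.0 == 0.0 is true; signbit inspects the sign directly.  */
int
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x) != 0;
}
```

For instance, dividing 4 by negative infinity at run time should yield a
value for which this helper returns nonzero.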
1054
1055 At any time, one of the above four rounding modes is selected.  You can
1056 find out which one with this function:
1057
1058 @comment fenv.h
1059 @comment ISO
1060 @deftypefun int fegetround (void)
1061 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1062 Returns the currently selected rounding mode, represented by one of the
1063 values of the defined rounding mode macros.
1064 @end deftypefun
1065
1066 @noindent
1067 To change the rounding mode, use this function:
1068
1069 @comment fenv.h
1070 @comment ISO
1071 @deftypefun int fesetround (int @var{round})
1072 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1073 Changes the currently selected rounding mode to @var{round}.  If
1074 @var{round} does not correspond to one of the supported rounding modes
1075 nothing is changed.  @code{fesetround} returns zero if it changed the
1076 rounding mode, or a nonzero value if the mode is not supported.
1077 @end deftypefun
1078
1079 You should avoid changing the rounding mode if possible.  It can be an
1080 expensive operation; also, some hardware requires you to compile your
1081 program differently for it to work.  The resulting code may run slower.
1082 See your compiler documentation for details.
1083 @c This section used to claim that functions existed to round one number
1084 @c in a specific fashion.  I can't find any functions in the library
1085 @c that do that. -zw
1086
1087 @node Control Functions
1088 @section Floating-Point Control Functions
1089
1090 @w{IEEE 754} floating-point implementations allow the programmer to
1091 decide whether traps will occur for each of the exceptions, by setting
1092 bits in the @dfn{control word}.  In C, traps result in the program
1093 receiving the @code{SIGFPE} signal; see @ref{Signal Handling}.
1094
1095 @strong{NB:} @w{IEEE 754} says that trap handlers are given details of
1096 the exceptional situation, and can set the result value.  C signals do
1097 not provide any mechanism to pass this information back and forth.
1098 Trapping exceptions in C is therefore not very useful.
1099
1100 It is sometimes necessary to save the state of the floating-point unit
1101 while you perform some calculation.  The library provides functions
1102 which save and restore the exception flags, the set of exceptions that
1103 generate traps, and the rounding mode.  This information is known as the
1104 @dfn{floating-point environment}.
1105
1106 The functions to save and restore the floating-point environment all use
1107 a variable of type @code{fenv_t} to store information.  This type is
1108 defined in @file{fenv.h}.  Its size and contents are
1109 implementation-defined.  You should not attempt to manipulate a variable
1110 of this type directly.
1111
1112 To save the state of the FPU, use one of these functions:
1113
1114 @comment fenv.h
1115 @comment ISO
1116 @deftypefun int fegetenv (fenv_t *@var{envp})
1117 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1118 Store the floating-point environment in the variable pointed to by
1119 @var{envp}.
1120
1121 The function returns zero in case the operation was successful, a
1122 non-zero value otherwise.
1123 @end deftypefun
1124
1125 @comment fenv.h
1126 @comment ISO
1127 @deftypefun int feholdexcept (fenv_t *@var{envp})
1128 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1129 Store the current floating-point environment in the object pointed to by
1130 @var{envp}.  Then clear all exception flags, and set the FPU to trap no
1131 exceptions.  Not all FPUs support trapping no exceptions; if
@code{feholdexcept} cannot set this mode, it returns a nonzero value.  If it
1133 succeeds, it returns zero.
1134 @end deftypefun
1135
1136 The functions which restore the floating-point environment can take these
1137 kinds of arguments:
1138
1139 @itemize @bullet
1140 @item
1141 Pointers to @code{fenv_t} objects, which were initialized previously by a
1142 call to @code{fegetenv} or @code{feholdexcept}.
1143 @item
1144 @vindex FE_DFL_ENV
1145 The special macro @code{FE_DFL_ENV} which represents the floating-point
1146 environment as it was available at program start.
1147 @item
Implementation-defined macros with names starting with @code{FE_} and
having type @code{const fenv_t *}.
1150
1151 @vindex FE_NOMASK_ENV
1152 If possible, @theglibc{} defines a macro @code{FE_NOMASK_ENV}
1153 which represents an environment where every exception raised causes a
1154 trap to occur.  You can test for this macro using @code{#ifdef}.  It is
1155 only defined if @code{_GNU_SOURCE} is defined.
1156
1157 Some platforms might define other predefined environments.
1158 @end itemize
1159
1160 @noindent
1161 To set the floating-point environment, you can use either of these
1162 functions:
1163
1164 @comment fenv.h
1165 @comment ISO
1166 @deftypefun int fesetenv (const fenv_t *@var{envp})
1167 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1168 Set the floating-point environment to that described by @var{envp}.
1169
1170 The function returns zero in case the operation was successful, a
1171 non-zero value otherwise.
1172 @end deftypefun
1173
1174 @comment fenv.h
1175 @comment ISO
1176 @deftypefun int feupdateenv (const fenv_t *@var{envp})
1177 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1178 Like @code{fesetenv}, this function sets the floating-point environment
1179 to that described by @var{envp}.  However, if any exceptions were
1180 flagged in the status word before @code{feupdateenv} was called, they
1181 remain flagged after the call.  In other words, after @code{feupdateenv}
1182 is called, the status word is the bitwise OR of the previous status word
1183 and the one saved in @var{envp}.
1184
1185 The function returns zero in case the operation was successful, a
1186 non-zero value otherwise.
1187 @end deftypefun
1188
1189 @noindent
1190 TS 18661-1:2014 defines additional functions to save and restore
1191 floating-point control modes (such as the rounding mode and whether
1192 traps are enabled) while leaving other status (such as raised flags)
1193 unchanged.
1194
1195 @vindex FE_DFL_MODE
1196 The special macro @code{FE_DFL_MODE} may be passed to
1197 @code{fesetmode}.  It represents the floating-point control modes at
1198 program start.
1199
1200 @comment fenv.h
1201 @comment ISO
1202 @deftypefun int fegetmode (femode_t *@var{modep})
1203 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1204 Store the floating-point control modes in the variable pointed to by
1205 @var{modep}.
1206
1207 The function returns zero in case the operation was successful, a
1208 non-zero value otherwise.
1209 @end deftypefun
1210
1211 @comment fenv.h
1212 @comment ISO
1213 @deftypefun int fesetmode (const femode_t *@var{modep})
1214 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1215 Set the floating-point control modes to those described by
1216 @var{modep}.
1217
1218 The function returns zero in case the operation was successful, a
1219 non-zero value otherwise.
1220 @end deftypefun
1221
1222 @noindent
To control, for individual exceptions, whether raising them causes a trap
to occur, you can use the following three functions.
1225
1226 @strong{Portability Note:} These functions are all GNU extensions.
1227
1228 @comment fenv.h
1229 @comment GNU
1230 @deftypefun int feenableexcept (int @var{excepts})
1231 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1232 This function enables traps for each of the exceptions as indicated by
1233 the parameter @var{excepts}.  The individual exceptions are described in
@ref{Status bit operations}.  Only the specified exceptions are
enabled; the status of the other exceptions is not changed.

The function returns the previously enabled exceptions if the
operation was successful, @code{-1} otherwise.
1239 @end deftypefun
1240
1241 @comment fenv.h
1242 @comment GNU
1243 @deftypefun int fedisableexcept (int @var{excepts})
1244 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1245 This function disables traps for each of the exceptions as indicated by
1246 the parameter @var{excepts}.  The individual exceptions are described in
@ref{Status bit operations}.  Only the specified exceptions are
disabled; the status of the other exceptions is not changed.

The function returns the previously enabled exceptions if the
operation was successful, @code{-1} otherwise.
1252 @end deftypefun
1253
1254 @comment fenv.h
1255 @comment GNU
1256 @deftypefun int fegetexcept (void)
1257 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1258 The function returns a bitmask of all currently enabled exceptions.  It
1259 returns @code{-1} in case of failure.
1260 @end deftypefun
1261
1262 @node Arithmetic Functions
1263 @section Arithmetic Functions
1264
1265 The C library provides functions to do basic operations on
1266 floating-point numbers.  These include absolute value, maximum and minimum,
1267 normalization, bit twiddling, rounding, and a few others.
1268
1269 @menu
1270 * Absolute Value::              Absolute values of integers and floats.
1271 * Normalization Functions::     Extracting exponents and putting them back.
1272 * Rounding Functions::          Rounding floats to integers.
1273 * Remainder Functions::         Remainders on division, precisely defined.
1274 * FP Bit Twiddling::            Sign bit adjustment.  Adding epsilon.
1275 * FP Comparison Functions::     Comparisons without risk of exceptions.
1276 * Misc FP Arithmetic::          Max, min, positive difference, multiply-add.
1277 @end menu
1278
1279 @node Absolute Value
1280 @subsection Absolute Value
1281 @cindex absolute value functions
1282
1283 These functions are provided for obtaining the @dfn{absolute value} (or
1284 @dfn{magnitude}) of a number.  The absolute value of a real number
1285 @var{x} is @var{x} if @var{x} is positive, @minus{}@var{x} if @var{x} is
1286 negative.  For a complex number @var{z}, whose real part is @var{x} and
1287 whose imaginary part is @var{y}, the absolute value is @w{@code{sqrt
1288 (@var{x}*@var{x} + @var{y}*@var{y})}}.
1289
1290 @pindex math.h
1291 @pindex stdlib.h
1292 Prototypes for @code{abs}, @code{labs} and @code{llabs} are in @file{stdlib.h};
1293 @code{imaxabs} is declared in @file{inttypes.h};
1294 @code{fabs}, @code{fabsf} and @code{fabsl} are declared in @file{math.h}.
1295 @code{cabs}, @code{cabsf} and @code{cabsl} are declared in @file{complex.h}.
1296
1297 @comment stdlib.h
1298 @comment ISO
1299 @deftypefun int abs (int @var{number})
1300 @comment stdlib.h
1301 @comment ISO
1302 @deftypefunx {long int} labs (long int @var{number})
1303 @comment stdlib.h
1304 @comment ISO
1305 @deftypefunx {long long int} llabs (long long int @var{number})
1306 @comment inttypes.h
1307 @comment ISO
1308 @deftypefunx intmax_t imaxabs (intmax_t @var{number})
1309 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1310 These functions return the absolute value of @var{number}.
1311
1312 Most computers use a two's complement integer representation, in which
1313 the absolute value of @code{INT_MIN} (the smallest possible @code{int})
1314 cannot be represented; thus, @w{@code{abs (INT_MIN)}} is not defined.
1315
@code{llabs} and @code{imaxabs} are new to @w{ISO C99}.
1317
1318 See @ref{Integers} for a description of the @code{intmax_t} type.
1319
1320 @end deftypefun
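
Since @w{@code{abs (INT_MIN)}} is not defined on two's complement
machines, code that may receive arbitrary input can guard the call.  The
sketch below shows two possible approaches; both helper names,
@code{checked_abs} and @code{unsigned_magnitude}, are hypothetical.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not a library function): store the absolute
   value of n in *result and return 0, or return -1 if the absolute
   value cannot be represented as an int.  */
int
checked_abs (int n, int *result)
{
  if (n < -INT_MAX)       /* Only true for INT_MIN on two's complement.  */
    return -1;
  *result = abs (n);
  return 0;
}

/* Hypothetical helper: return the magnitude of n as an unsigned int,
   which can hold the magnitude of every int, including INT_MIN.
   Unsigned arithmetic wraps around, so the negation is well defined.  */
unsigned int
unsigned_magnitude (int n)
{
  return n < 0 ? 0u - (unsigned int) n : (unsigned int) n;
}
```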
1321
1322 @comment math.h
1323 @comment ISO
1324 @deftypefun double fabs (double @var{number})
1325 @comment math.h
1326 @comment ISO
1327 @deftypefunx float fabsf (float @var{number})
1328 @comment math.h
1329 @comment ISO
1330 @deftypefunx {long double} fabsl (long double @var{number})
1331 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1332 This function returns the absolute value of the floating-point number
1333 @var{number}.
1334 @end deftypefun
1335
1336 @comment complex.h
1337 @comment ISO
1338 @deftypefun double cabs (complex double @var{z})
1339 @comment complex.h
1340 @comment ISO
1341 @deftypefunx float cabsf (complex float @var{z})
1342 @comment complex.h
1343 @comment ISO
1344 @deftypefunx {long double} cabsl (complex long double @var{z})
1345 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
These functions return the absolute value of the complex number @var{z}
1347 (@pxref{Complex Numbers}).  The absolute value of a complex number is:
1348
1349 @smallexample
1350 sqrt (creal (@var{z}) * creal (@var{z}) + cimag (@var{z}) * cimag (@var{z}))
1351 @end smallexample
1352
These functions should always be used instead of the direct formula
because they take special care to avoid losing precision.  They may also
take advantage of hardware support for this operation.  See @code{hypot}
in @ref{Exponents and Logarithms}.
1357 @end deftypefun
1358
1359 @node Normalization Functions
1360 @subsection Normalization Functions
1361 @cindex normalization functions (floating-point)
1362
1363 The functions described in this section are primarily provided as a way
1364 to efficiently perform certain low-level manipulations on floating point
1365 numbers that are represented internally using a binary radix;
1366 see @ref{Floating Point Concepts}.  These functions are required to
1367 have equivalent behavior even if the representation does not use a radix
1368 of 2, but of course they are unlikely to be particularly efficient in
1369 those cases.
1370
1371 @pindex math.h
1372 All these functions are declared in @file{math.h}.
1373
1374 @comment math.h
1375 @comment ISO
1376 @deftypefun double frexp (double @var{value}, int *@var{exponent})
1377 @comment math.h
1378 @comment ISO
1379 @deftypefunx float frexpf (float @var{value}, int *@var{exponent})
1380 @comment math.h
1381 @comment ISO
1382 @deftypefunx {long double} frexpl (long double @var{value}, int *@var{exponent})
1383 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1384 These functions are used to split the number @var{value}
1385 into a normalized fraction and an exponent.
1386
1387 If the argument @var{value} is not zero, the return value is @var{value}
1388 times a power of two, and its magnitude is always in the range 1/2
1389 (inclusive) to 1 (exclusive).  The corresponding exponent is stored in
1390 @code{*@var{exponent}}; the return value multiplied by 2 raised to this
1391 exponent equals the original number @var{value}.
1392
1393 For example, @code{frexp (12.8, &exponent)} returns @code{0.8} and
1394 stores @code{4} in @code{exponent}.
1395
1396 If @var{value} is zero, then the return value is zero and
1397 zero is stored in @code{*@var{exponent}}.
1398 @end deftypefun
1399
1400 @comment math.h
1401 @comment ISO
1402 @deftypefun double ldexp (double @var{value}, int @var{exponent})
1403 @comment math.h
1404 @comment ISO
1405 @deftypefunx float ldexpf (float @var{value}, int @var{exponent})
1406 @comment math.h
1407 @comment ISO
1408 @deftypefunx {long double} ldexpl (long double @var{value}, int @var{exponent})
1409 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1410 These functions return the result of multiplying the floating-point
number @var{value} by 2 raised to the power @var{exponent}.  (They can
1412 be used to reassemble floating-point numbers that were taken apart
1413 by @code{frexp}.)
1414
1415 For example, @code{ldexp (0.8, 4)} returns @code{12.8}.
1416 @end deftypefun
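
The relationship between @code{frexp} and @code{ldexp} can be sketched as
a round trip.  The helper name @code{split_and_rebuild} is hypothetical.
Scaling by a power of two is exact as long as no overflow or underflow
occurs, so the rebuilt value is expected to equal the original.

```c
#include <math.h>

/* Hypothetical helper (not a library function): take value apart with
   frexp, store the pieces in *fraction and *exponent, and return the
   number reassembled from them with ldexp.  */
double
split_and_rebuild (double value, double *fraction, int *exponent)
{
  *fraction = frexp (value, exponent);
  return ldexp (*fraction, *exponent);
}
```

For a @var{value} of @code{12.8}, the stored exponent is @code{4} and the
stored fraction lies between 1/2 (inclusive) and 1 (exclusive), matching
the description of @code{frexp} above.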
1417
1418 The following functions, which come from BSD, provide facilities
1419 equivalent to those of @code{ldexp} and @code{frexp}.  See also the
1420 @w{ISO C} function @code{logb} which originally also appeared in BSD.
1421
1422 @comment math.h
1423 @comment BSD
1424 @deftypefun double scalb (double @var{value}, double @var{exponent})
1425 @comment math.h
1426 @comment BSD
1427 @deftypefunx float scalbf (float @var{value}, float @var{exponent})
1428 @comment math.h
1429 @comment BSD
1430 @deftypefunx {long double} scalbl (long double @var{value}, long double @var{exponent})
1431 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1432 The @code{scalb} function is the BSD name for @code{ldexp}.
1433 @end deftypefun
1434
1435 @comment math.h
1436 @comment BSD
1437 @deftypefun double scalbn (double @var{x}, int @var{n})
1438 @comment math.h
1439 @comment BSD
1440 @deftypefunx float scalbnf (float @var{x}, int @var{n})
1441 @comment math.h
1442 @comment BSD
1443 @deftypefunx {long double} scalbnl (long double @var{x}, int @var{n})
1444 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1445 @code{scalbn} is identical to @code{scalb}, except that the exponent
1446 @var{n} is an @code{int} instead of a floating-point number.
1447 @end deftypefun
1448
1449 @comment math.h
1450 @comment BSD
1451 @deftypefun double scalbln (double @var{x}, long int @var{n})
1452 @comment math.h
1453 @comment BSD
1454 @deftypefunx float scalblnf (float @var{x}, long int @var{n})
1455 @comment math.h
1456 @comment BSD
1457 @deftypefunx {long double} scalblnl (long double @var{x}, long int @var{n})
1458 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1459 @code{scalbln} is identical to @code{scalb}, except that the exponent
1460 @var{n} is a @code{long int} instead of a floating-point number.
1461 @end deftypefun
1462
1463 @comment math.h
1464 @comment BSD
1465 @deftypefun double significand (double @var{x})
1466 @comment math.h
1467 @comment BSD
1468 @deftypefunx float significandf (float @var{x})
1469 @comment math.h
1470 @comment BSD
1471 @deftypefunx {long double} significandl (long double @var{x})
1472 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1473 @code{significand} returns the mantissa of @var{x} scaled to the range
1474 @math{[1, 2)}.
1475 It is equivalent to @w{@code{scalb (@var{x}, (double) -ilogb (@var{x}))}}.
1476
1477 This function exists mainly for use in certain standardized tests
1478 of @w{IEEE 754} conformance.
1479 @end deftypefun
1480
1481 @node Rounding Functions
1482 @subsection Rounding Functions
1483 @cindex converting floats to integers
1484
1485 @pindex math.h
1486 The functions listed here perform operations such as rounding and
1487 truncation of floating-point values.  Some of these functions convert
1488 floating point numbers to integer values.  They are all declared in
1489 @file{math.h}.
1490
1491 You can also convert floating-point numbers to integers simply by
1492 casting them to @code{int}.  This discards the fractional part,
1493 effectively rounding towards zero.  However, this only works if the
1494 result can actually be represented as an @code{int}---for very large
1495 numbers, this is impossible.  The functions listed here return the
1496 result as a @code{double} instead to get around this problem.
1497
1498 @comment math.h
1499 @comment ISO
1500 @deftypefun double ceil (double @var{x})
1501 @comment math.h
1502 @comment ISO
1503 @deftypefunx float ceilf (float @var{x})
1504 @comment math.h
1505 @comment ISO
1506 @deftypefunx {long double} ceill (long double @var{x})
1507 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1508 These functions round @var{x} upwards to the nearest integer,
1509 returning that value as a @code{double}.  Thus, @code{ceil (1.5)}
1510 is @code{2.0}.
1511 @end deftypefun
1512
1513 @comment math.h
1514 @comment ISO
1515 @deftypefun double floor (double @var{x})
1516 @comment math.h
1517 @comment ISO
1518 @deftypefunx float floorf (float @var{x})
1519 @comment math.h
1520 @comment ISO
1521 @deftypefunx {long double} floorl (long double @var{x})
1522 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1523 These functions round @var{x} downwards to the nearest
1524 integer, returning that value as a @code{double}.  Thus, @code{floor
1525 (1.5)} is @code{1.0} and @code{floor (-1.5)} is @code{-2.0}.
1526 @end deftypefun
1527
1528 @comment math.h
1529 @comment ISO
1530 @deftypefun double trunc (double @var{x})
1531 @comment math.h
1532 @comment ISO
1533 @deftypefunx float truncf (float @var{x})
1534 @comment math.h
1535 @comment ISO
1536 @deftypefunx {long double} truncl (long double @var{x})
1537 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1538 The @code{trunc} functions round @var{x} towards zero to the nearest
1539 integer (returned in floating-point format).  Thus, @code{trunc (1.5)}
1540 is @code{1.0} and @code{trunc (-1.5)} is @code{-1.0}.
1541 @end deftypefun
1542
1543 @comment math.h
1544 @comment ISO
1545 @deftypefun double rint (double @var{x})
1546 @comment math.h
1547 @comment ISO
1548 @deftypefunx float rintf (float @var{x})
1549 @comment math.h
1550 @comment ISO
1551 @deftypefunx {long double} rintl (long double @var{x})
1552 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1553 These functions round @var{x} to an integer value according to the
1554 current rounding mode.  @xref{Floating Point Parameters}, for
1555 information about the various rounding modes.  The default
1556 rounding mode is to round to the nearest integer; some machines
1557 support other modes, but round-to-nearest is always used unless
1558 you explicitly select another.
1559
1560 If @var{x} was not initially an integer, these functions raise the
1561 inexact exception.
1562 @end deftypefun
1563
1564 @comment math.h
1565 @comment ISO
1566 @deftypefun double nearbyint (double @var{x})
1567 @comment math.h
1568 @comment ISO
1569 @deftypefunx float nearbyintf (float @var{x})
1570 @comment math.h
1571 @comment ISO
1572 @deftypefunx {long double} nearbyintl (long double @var{x})
1573 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1574 These functions return the same value as the @code{rint} functions, but
1575 do not raise the inexact exception if @var{x} is not an integer.
1576 @end deftypefun
1577
1578 @comment math.h
1579 @comment ISO
1580 @deftypefun double round (double @var{x})
1581 @comment math.h
1582 @comment ISO
1583 @deftypefunx float roundf (float @var{x})
1584 @comment math.h
1585 @comment ISO
1586 @deftypefunx {long double} roundl (long double @var{x})
1587 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
These functions are similar to @code{rint}, but they always round to
the nearest integer, with halfway cases rounded away from zero,
regardless of the current rounding mode.
1591 @end deftypefun
1592
1593 @comment math.h
1594 @comment ISO
1595 @deftypefun double roundeven (double @var{x})
1596 @comment math.h
1597 @comment ISO
1598 @deftypefunx float roundevenf (float @var{x})
1599 @comment math.h
1600 @comment ISO
1601 @deftypefunx {long double} roundevenl (long double @var{x})
1602 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1603 These functions, from TS 18661-1:2014, are similar to @code{round},
1604 but they round halfway cases to even instead of away from zero.
1605 @end deftypefun
1606
1607 @comment math.h
1608 @comment ISO
1609 @deftypefun {long int} lrint (double @var{x})
1610 @comment math.h
1611 @comment ISO
1612 @deftypefunx {long int} lrintf (float @var{x})
1613 @comment math.h
1614 @comment ISO
1615 @deftypefunx {long int} lrintl (long double @var{x})
1616 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1617 These functions are just like @code{rint}, but they return a
1618 @code{long int} instead of a floating-point number.
1619 @end deftypefun
1620
1621 @comment math.h
1622 @comment ISO
1623 @deftypefun {long long int} llrint (double @var{x})
1624 @comment math.h
1625 @comment ISO
1626 @deftypefunx {long long int} llrintf (float @var{x})
1627 @comment math.h
1628 @comment ISO
1629 @deftypefunx {long long int} llrintl (long double @var{x})
1630 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1631 These functions are just like @code{rint}, but they return a
1632 @code{long long int} instead of a floating-point number.
1633 @end deftypefun
1634
1635 @comment math.h
1636 @comment ISO
1637 @deftypefun {long int} lround (double @var{x})
1638 @comment math.h
1639 @comment ISO
1640 @deftypefunx {long int} lroundf (float @var{x})
1641 @comment math.h
1642 @comment ISO
1643 @deftypefunx {long int} lroundl (long double @var{x})
1644 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1645 These functions are just like @code{round}, but they return a
1646 @code{long int} instead of a floating-point number.
1647 @end deftypefun
1648
1649 @comment math.h
1650 @comment ISO
1651 @deftypefun {long long int} llround (double @var{x})
1652 @comment math.h
1653 @comment ISO
1654 @deftypefunx {long long int} llroundf (float @var{x})
1655 @comment math.h
1656 @comment ISO
1657 @deftypefunx {long long int} llroundl (long double @var{x})
1658 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1659 These functions are just like @code{round}, but they return a
1660 @code{long long int} instead of a floating-point number.
1661 @end deftypefun
1662
1663
1664 @comment math.h
1665 @comment ISO
1666 @deftypefun double modf (double @var{value}, double *@var{integer-part})
1667 @comment math.h
1668 @comment ISO
1669 @deftypefunx float modff (float @var{value}, float *@var{integer-part})
1670 @comment math.h
1671 @comment ISO
1672 @deftypefunx {long double} modfl (long double @var{value}, long double *@var{integer-part})
1673 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1674 These functions break the argument @var{value} into an integer part and a
1675 fractional part (between @code{-1} and @code{1}, exclusive).  Their sum
1676 equals @var{value}.  Each of the parts has the same sign as @var{value},
1677 and the integer part is always rounded toward zero.
1678
1679 @code{modf} stores the integer part in @code{*@var{integer-part}}, and
1680 returns the fractional part.  For example, @code{modf (2.5, &intpart)}
1681 returns @code{0.5} and stores @code{2.0} into @code{intpart}.
1682 @end deftypefun
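
As a small illustration of the sign rule for @code{modf}, the sketch
below splits a possibly negative number of hours into whole hours and
minutes.  The function @code{split_hours} is a made-up name used only
here.

```c
#include <math.h>

/* Hypothetical helper: split HOURS into whole hours and minutes.
   Both results carry the sign of HOURS, because modf gives both
   parts the sign of its argument.  */
void
split_hours (double hours, double *whole, double *minutes)
{
  double frac = modf (hours, whole);

  *minutes = frac * 60.0;
}
```

Calling it with @code{-3.75} stores @code{-3.0} in @code{*whole} and
@code{-45.0} in @code{*minutes}.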
1683
1684 @node Remainder Functions
1685 @subsection Remainder Functions
1686
1687 The functions in this section compute the remainder on division of two
1688 floating-point numbers.  Each is a little different; pick the one that
1689 suits your problem.
1690
1691 @comment math.h
1692 @comment ISO
1693 @deftypefun double fmod (double @var{numerator}, double @var{denominator})
1694 @comment math.h
1695 @comment ISO
1696 @deftypefunx float fmodf (float @var{numerator}, float @var{denominator})
1697 @comment math.h
1698 @comment ISO
1699 @deftypefunx {long double} fmodl (long double @var{numerator}, long double @var{denominator})
1700 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1701 These functions compute the remainder from the division of
1702 @var{numerator} by @var{denominator}.  Specifically, the return value is
1703 @code{@var{numerator} - @w{@var{n} * @var{denominator}}}, where @var{n}
1704 is the quotient of @var{numerator} divided by @var{denominator}, rounded
1705 towards zero to an integer.  Thus, @w{@code{fmod (6.5, 2.3)}} returns
1706 @code{1.9}, which is @code{6.5} minus @code{4.6}.
1707
1708 The result has the same sign as the @var{numerator} and has magnitude
1709 less than the magnitude of the @var{denominator}.
1710
1711 If @var{denominator} is zero, @code{fmod} signals a domain error.
1712 @end deftypefun
1713
1714 @comment math.h
1715 @comment BSD
1716 @deftypefun double drem (double @var{numerator}, double @var{denominator})
1717 @comment math.h
1718 @comment BSD
1719 @deftypefunx float dremf (float @var{numerator}, float @var{denominator})
1720 @comment math.h
1721 @comment BSD
1722 @deftypefunx {long double} dreml (long double @var{numerator}, long double @var{denominator})
1723 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1724 These functions are like @code{fmod} except that they round the
1725 internal quotient @var{n} to the nearest integer instead of towards zero
1726 to an integer.  For example, @code{drem (6.5, 2.3)} returns @code{-0.4},
1727 which is @code{6.5} minus @code{6.9}.
1728
1729 The absolute value of the result is less than or equal to half the
1730 absolute value of the @var{denominator}.  The difference between
1731 @code{fmod (@var{numerator}, @var{denominator})} and @code{drem
1732 (@var{numerator}, @var{denominator})} is always either
1733 @var{denominator}, minus @var{denominator}, or zero.
1734
1735 If @var{denominator} is zero, @code{drem} signals a domain error.
1736 @end deftypefun
1737
1738 @comment math.h
1739 @comment BSD
1740 @deftypefun double remainder (double @var{numerator}, double @var{denominator})
1741 @comment math.h
1742 @comment BSD
1743 @deftypefunx float remainderf (float @var{numerator}, float @var{denominator})
1744 @comment math.h
1745 @comment BSD
1746 @deftypefunx {long double} remainderl (long double @var{numerator}, long double @var{denominator})
1747 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
These functions are other names for the @code{drem} functions.
1749 @end deftypefun
1750
1751 @node FP Bit Twiddling
1752 @subsection Setting and modifying single bits of FP values
1753 @cindex FP arithmetic
1754
1755 There are some operations that are too complicated or expensive to
1756 perform by hand on floating-point numbers.  @w{ISO C99} defines
1757 functions to do these operations, which mostly involve changing single
1758 bits.
1759
1760 @comment math.h
1761 @comment ISO
1762 @deftypefun double copysign (double @var{x}, double @var{y})
1763 @comment math.h
1764 @comment ISO
1765 @deftypefunx float copysignf (float @var{x}, float @var{y})
1766 @comment math.h
1767 @comment ISO
1768 @deftypefunx {long double} copysignl (long double @var{x}, long double @var{y})
1769 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
These functions return @var{x} but with the sign of @var{y}.  They work
even if @var{x} or @var{y} is NaN or zero.  Both of these can carry a
sign (although not all implementations support it), and this is one of
the few operations that can tell the difference.
1774
1775 @code{copysign} never raises an exception.
1776 @c except signalling NaNs
1777
1778 This function is defined in @w{IEC 559} (and the appendix with
1779 recommended functions in @w{IEEE 754}/@w{IEEE 854}).
1780 @end deftypefun
1781
1782 @comment math.h
1783 @comment ISO
1784 @deftypefun int signbit (@emph{float-type} @var{x})
1785 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1786 @code{signbit} is a generic macro which can work on all floating-point
1787 types.  It returns a nonzero value if the value of @var{x} has its sign
1788 bit set.
1789
1790 This is not the same as @code{x < 0.0}, because @w{IEEE 754} floating
1791 point allows zero to be signed.  The comparison @code{-0.0 < 0.0} is
1792 false, but @code{signbit (-0.0)} will return a nonzero value.
1793 @end deftypefun
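
The following sketch shows how @code{signbit} and @code{copysign} can
tell a negative zero from a positive one, which an ordinary comparison
cannot do.  It assumes a platform with signed zeros; the helper names
are made up for this example.

```c
#include <math.h>

/* Hypothetical helper: nonzero if X is negative zero.  */
int
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x) != 0;
}

/* Hypothetical helper: -1.0 or 1.0 according to the sign bit of X,
   even when X is a zero.  */
double
sign_of (double x)
{
  return copysign (1.0, x);
}
```

With these definitions @code{is_negative_zero (-0.0)} is nonzero and
@code{sign_of (-0.0)} is @code{-1.0}, although @code{-0.0 < 0.0} is
false.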
1794
1795 @comment math.h
1796 @comment ISO
1797 @deftypefun double nextafter (double @var{x}, double @var{y})
1798 @comment math.h
1799 @comment ISO
1800 @deftypefunx float nextafterf (float @var{x}, float @var{y})
1801 @comment math.h
1802 @comment ISO
1803 @deftypefunx {long double} nextafterl (long double @var{x}, long double @var{y})
1804 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1805 The @code{nextafter} function returns the next representable neighbor of
1806 @var{x} in the direction towards @var{y}.  The size of the step between
1807 @var{x} and the result depends on the type of the result.  If
1808 @math{@var{x} = @var{y}} the function simply returns @var{y}.  If either
1809 value is @code{NaN}, @code{NaN} is returned.  Otherwise
1810 a value corresponding to the value of the least significant bit in the
1811 mantissa is added or subtracted, depending on the direction.
1812 @code{nextafter} will signal overflow or underflow if the result goes
1813 outside of the range of normalized numbers.
1814
1815 This function is defined in @w{IEC 559} (and the appendix with
1816 recommended functions in @w{IEEE 754}/@w{IEEE 854}).
1817 @end deftypefun
1818
1819 @comment math.h
1820 @comment ISO
1821 @deftypefun double nexttoward (double @var{x}, long double @var{y})
1822 @comment math.h
1823 @comment ISO
1824 @deftypefunx float nexttowardf (float @var{x}, long double @var{y})
1825 @comment math.h
1826 @comment ISO
1827 @deftypefunx {long double} nexttowardl (long double @var{x}, long double @var{y})
1828 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1829 These functions are identical to the corresponding versions of
1830 @code{nextafter} except that their second argument is a @code{long
1831 double}.
1832 @end deftypefun
1833
1834 @comment math.h
1835 @comment ISO
1836 @deftypefun double nextup (double @var{x})
1837 @comment math.h
1838 @comment ISO
1839 @deftypefunx float nextupf (float @var{x})
1840 @comment math.h
1841 @comment ISO
1842 @deftypefunx {long double} nextupl (long double @var{x})
1843 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{nextup} function returns the next representable neighbor of @var{x}
in the direction of positive infinity.  If @var{x} is the negative subnormal
number of smallest magnitude in the type of @var{x}, the function returns
@code{-0}.  If @math{@var{x} = @code{0}} the function returns the smallest
positive subnormal number in the type of @var{x}.  If @var{x} is NaN, NaN
is returned.
1849 If @var{x} is @math{+@infinity{}}, @math{+@infinity{}} is returned.
1850 @code{nextup} is from TS 18661-1:2014.
1851 @code{nextup} never raises an exception except for signaling NaNs.
1852 @end deftypefun
1853
1854 @comment math.h
1855 @comment ISO
1856 @deftypefun double nextdown (double @var{x})
1857 @comment math.h
1858 @comment ISO
1859 @deftypefunx float nextdownf (float @var{x})
1860 @comment math.h
1861 @comment ISO
1862 @deftypefunx {long double} nextdownl (long double @var{x})
1863 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
The @code{nextdown} function returns the next representable neighbor of @var{x}
in the direction of negative infinity.  If @var{x} is the smallest positive
subnormal number in the type of @var{x}, the function returns @code{+0}.  If
@math{@var{x} = @code{0}} the function returns the negative subnormal number
of smallest magnitude in the type of @var{x}.  If @var{x} is NaN, NaN is
returned.
1869 If @var{x} is @math{-@infinity{}}, @math{-@infinity{}} is returned.
1870 @code{nextdown} is from TS 18661-1:2014.
1871 @code{nextdown} never raises an exception except for signaling NaNs.
1872 @end deftypefun
1873
1874 @cindex NaN
1875 @comment math.h
1876 @comment ISO
1877 @deftypefun double nan (const char *@var{tagp})
1878 @comment math.h
1879 @comment ISO
1880 @deftypefunx float nanf (const char *@var{tagp})
1881 @comment math.h
1882 @comment ISO
1883 @deftypefunx {long double} nanl (const char *@var{tagp})
1884 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
1885 @c The unsafe-but-ruled-safe locale use comes from strtod.
1886 The @code{nan} function returns a representation of NaN, provided that
1887 NaN is supported by the target platform.
@code{nan ("@var{n-char-sequence}")} is equivalent to
@code{strtod ("NAN(@var{n-char-sequence})", NULL)}.
1890
1891 The argument @var{tagp} is used in an unspecified manner.  On @w{IEEE
1892 754} systems, there are many representations of NaN, and @var{tagp}
1893 selects one.  On other systems it may do nothing.
1894 @end deftypefun
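
A typical use of @code{nan} is to produce a marker for missing data.
The sketch below assumes the platform supports NaN; the function names
are invented for this example.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper: a NaN used to mark a missing measurement.  */
double
missing_value (void)
{
  return nan ("");
}

/* Hypothetical helper: a NaN obtained through the strtod form
   described above, with an empty character sequence.  */
double
missing_value_strtod (void)
{
  return strtod ("NAN()", NULL);
}
```

Either result satisfies @code{isnan} and compares unequal to every
value, including itself.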
1895
1896 @comment math.h
1897 @comment ISO
1898 @deftypefun int canonicalize (double *@var{cx}, const double *@var{x})
1899 @comment math.h
1900 @comment ISO
1901 @deftypefunx int canonicalizef (float *@var{cx}, const float *@var{x})
1902 @comment math.h
1903 @comment ISO
1904 @deftypefunx int canonicalizel (long double *@var{cx}, const long double *@var{x})
1905 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1906 In some floating-point formats, some values have canonical (preferred)
1907 and noncanonical encodings (for IEEE interchange binary formats, all
1908 encodings are canonical).  These functions, defined by TS
1909 18661-1:2014, attempt to produce a canonical version of the
1910 floating-point value pointed to by @var{x}; if that value is a
1911 signaling NaN, they raise the invalid exception and produce a quiet
1912 NaN.  If a canonical value is produced, it is stored in the object
1913 pointed to by @var{cx}, and these functions return zero.  Otherwise
1914 (if a canonical value could not be produced because the object pointed
1915 to by @var{x} is not a valid representation of any floating-point
1916 value), the object pointed to by @var{cx} is unchanged and a nonzero
1917 value is returned.
1918
1919 Note that some formats have multiple encodings of a value which are
1920 all equally canonical; when such an encoding is used as an input to
1921 this function, any such encoding of the same value (or of the
1922 corresponding quiet NaN, if that value is a signaling NaN) may be
1923 produced as output.
1924 @end deftypefun
1925
1926 @comment math.h
1927 @comment ISO
1928 @deftypefun double getpayload (const double *@var{x})
1929 @comment math.h
1930 @comment ISO
1931 @deftypefunx float getpayloadf (const float *@var{x})
1932 @comment math.h
1933 @comment ISO
1934 @deftypefunx {long double} getpayloadl (const long double *@var{x})
1935 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1936 IEEE 754 defines the @dfn{payload} of a NaN to be an integer value
1937 encoded in the representation of the NaN.  Payloads are typically
1938 propagated from NaN inputs to the result of a floating-point
1939 operation.  These functions, defined by TS 18661-1:2014, return the
1940 payload of the NaN pointed to by @var{x} (returned as a positive
1941 integer, or positive zero, represented as a floating-point number); if
1942 @var{x} is not a NaN, they return an unspecified value.  They raise no
1943 floating-point exceptions even for signaling NaNs.
1944 @end deftypefun
1945
1946 @comment math.h
1947 @comment ISO
1948 @deftypefun int setpayload (double *@var{x}, double @var{payload})
1949 @comment math.h
1950 @comment ISO
1951 @deftypefunx int setpayloadf (float *@var{x}, float @var{payload})
1952 @comment math.h
1953 @comment ISO
1954 @deftypefunx int setpayloadl (long double *@var{x}, long double @var{payload})
1955 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1956 These functions, defined by TS 18661-1:2014, set the object pointed to
1957 by @var{x} to a quiet NaN with payload @var{payload} and a zero sign
1958 bit and return zero.  If @var{payload} is not a positive-signed
1959 integer that is a valid payload for a quiet NaN of the given type, the
1960 object pointed to by @var{x} is set to positive zero and a nonzero
1961 value is returned.  They raise no floating-point exceptions.
1962 @end deftypefun
1963
1964 @comment math.h
1965 @comment ISO
1966 @deftypefun int setpayloadsig (double *@var{x}, double @var{payload})
1967 @comment math.h
1968 @comment ISO
1969 @deftypefunx int setpayloadsigf (float *@var{x}, float @var{payload})
1970 @comment math.h
1971 @comment ISO
1972 @deftypefunx int setpayloadsigl (long double *@var{x}, long double @var{payload})
1973 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
1974 These functions, defined by TS 18661-1:2014, set the object pointed to
1975 by @var{x} to a signaling NaN with payload @var{payload} and a zero
1976 sign bit and return zero.  If @var{payload} is not a positive-signed
1977 integer that is a valid payload for a signaling NaN of the given type,
1978 the object pointed to by @var{x} is set to positive zero and a nonzero
1979 value is returned.  They raise no floating-point exceptions.
1980 @end deftypefun
1981
1982 @node FP Comparison Functions
1983 @subsection Floating-Point Comparison Functions
1984 @cindex unordered comparison
1985
The standard C relational operators (@code{<}, @code{<=}, @code{>}, and
@code{>=}) raise the invalid exception when either operand is NaN.  For
example,
1988
1989 @smallexample
1990 int v = a < 1.0;
1991 @end smallexample
1992
1993 @noindent
1994 will raise an exception if @var{a} is NaN.  (This does @emph{not}
1995 happen with @code{==} and @code{!=}; those merely return false and true,
1996 respectively, when NaN is examined.)  Frequently this exception is
1997 undesirable.  @w{ISO C99} therefore defines comparison functions that
1998 do not raise exceptions when NaN is examined.  All of the functions are
1999 implemented as macros which allow their arguments to be of any
2000 floating-point type.  The macros are guaranteed to evaluate their
2001 arguments only once.  TS 18661-1:2014 adds such a macro for an
2002 equality comparison that @emph{does} raise an exception for a NaN
2003 argument; it also adds functions that provide a total ordering on all
2004 floating-point values, including NaNs, without raising any exceptions
2005 even for signaling NaNs.
2006
2007 @comment math.h
2008 @comment ISO
2009 @deftypefn Macro int isgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
2010 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2011 This macro determines whether the argument @var{x} is greater than
2012 @var{y}.  It is equivalent to @code{(@var{x}) > (@var{y})}, but no
2013 exception is raised if @var{x} or @var{y} are NaN.
2014 @end deftypefn
2015
2016 @comment math.h
2017 @comment ISO
2018 @deftypefn Macro int isgreaterequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
2019 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2020 This macro determines whether the argument @var{x} is greater than or
2021 equal to @var{y}.  It is equivalent to @code{(@var{x}) >= (@var{y})}, but no
2022 exception is raised if @var{x} or @var{y} are NaN.
2023 @end deftypefn
2024
2025 @comment math.h
2026 @comment ISO
2027 @deftypefn Macro int isless (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
2028 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2029 This macro determines whether the argument @var{x} is less than @var{y}.
2030 It is equivalent to @code{(@var{x}) < (@var{y})}, but no exception is
2031 raised if @var{x} or @var{y} are NaN.
2032 @end deftypefn
2033
2034 @comment math.h
2035 @comment ISO
2036 @deftypefn Macro int islessequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
2037 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2038 This macro determines whether the argument @var{x} is less than or equal
2039 to @var{y}.  It is equivalent to @code{(@var{x}) <= (@var{y})}, but no
2040 exception is raised if @var{x} or @var{y} are NaN.
2041 @end deftypefn
2042
2043 @comment math.h
2044 @comment ISO
2045 @deftypefn Macro int islessgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
2046 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2047 This macro determines whether the argument @var{x} is less or greater
2048 than @var{y}.  It is equivalent to @code{(@var{x}) < (@var{y}) ||
2049 (@var{x}) > (@var{y})} (although it only evaluates @var{x} and @var{y}
2050 once), but no exception is raised if @var{x} or @var{y} are NaN.
2051
2052 This macro is not equivalent to @code{@var{x} != @var{y}}, because that
2053 expression is true if @var{x} or @var{y} are NaN.
2054 @end deftypefn
2055
2056 @comment math.h
2057 @comment ISO
2058 @deftypefn Macro int isunordered (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
2059 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2060 This macro determines whether its arguments are unordered.  In other
2061 words, it is true if @var{x} or @var{y} are NaN, and false otherwise.
2062 @end deftypefn
2063
2064 @comment math.h
2065 @comment ISO
2066 @deftypefn Macro int iseqsig (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
2067 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
This macro determines whether its arguments are equal.  It is
equivalent to @code{(@var{x}) == (@var{y})}, but it raises the invalid
exception and sets @code{errno} to @code{EDOM} if either argument is a
NaN.
2072 @end deftypefn
2073
2074 @comment math.h
2075 @comment ISO
2076 @deftypefun int totalorder (double @var{x}, double @var{y})
2077 @comment ISO
2078 @deftypefunx int totalorderf (float @var{x}, float @var{y})
2079 @comment ISO
2080 @deftypefunx int totalorderl (long double @var{x}, long double @var{y})
2081 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2082 These functions determine whether the total order relationship,
2083 defined in IEEE 754-2008, is true for @var{x} and @var{y}, returning
2084 nonzero if it is true and zero if it is false.  No exceptions are
2085 raised even for signaling NaNs.  The relationship is true if they are
2086 the same floating-point value (including sign for zero and NaNs, and
2087 payload for NaNs), or if @var{x} comes before @var{y} in the following
2088 order: negative quiet NaNs, in order of decreasing payload; negative
2089 signaling NaNs, in order of decreasing payload; negative infinity;
2090 finite numbers, in ascending order, with negative zero before positive
2091 zero; positive infinity; positive signaling NaNs, in order of
2092 increasing payload; positive quiet NaNs, in order of increasing
2093 payload.
2094 @end deftypefun
2095
2096 @comment math.h
2097 @comment ISO
2098 @deftypefun int totalordermag (double @var{x}, double @var{y})
2099 @comment ISO
2100 @deftypefunx int totalordermagf (float @var{x}, float @var{y})
2101 @comment ISO
2102 @deftypefunx int totalordermagl (long double @var{x}, long double @var{y})
2103 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2104 These functions determine whether the total order relationship,
2105 defined in IEEE 754-2008, is true for the absolute values of @var{x}
2106 and @var{y}, returning nonzero if it is true and zero if it is false.
2107 No exceptions are raised even for signaling NaNs.
2108 @end deftypefun
2109
Not all machines provide hardware support for these operations.  On
machines that don't, the macros can be very slow.  Therefore, you should
not use these macros when NaN is not a concern.

@strong{NB:} There are no macros @code{isequal} or @code{isunequal}.
They are unnecessary, because the @code{==} and @code{!=} operators do
@emph{not} raise an exception if one or both of the operands are NaN.
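
As a sketch of how the quiet comparison macros fit together, the
function below classifies a pair of values without raising an exception
for a quiet NaN operand.  The type @code{enum order} and the function
@code{compare_quiet} are hypothetical names chosen for this example.

```c
#include <math.h>

/* Hypothetical result codes for a comparison that also reports the
   unordered case.  */
enum order
{
  ORDER_LESS,
  ORDER_EQUAL,
  ORDER_GREATER,
  ORDER_UNORDERED
};

/* Hypothetical helper: classify A relative to B using only the
   comparison macros and no relational operators.  */
enum order
compare_quiet (double a, double b)
{
  if (isunordered (a, b))
    return ORDER_UNORDERED;
  if (isless (a, b))
    return ORDER_LESS;
  if (isgreater (a, b))
    return ORDER_GREATER;
  return ORDER_EQUAL;
}
```

Note that negative and positive zero compare as equal here, just as
they do with @code{==}.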
2117
2118 @node Misc FP Arithmetic
2119 @subsection Miscellaneous FP arithmetic functions
2120 @cindex minimum
2121 @cindex maximum
2122 @cindex positive difference
2123 @cindex multiply-add
2124
2125 The functions in this section perform miscellaneous but common
2126 operations that are awkward to express with C operators.  On some
2127 processors these functions can use special machine instructions to
2128 perform these operations faster than the equivalent C code.
2129
2130 @comment math.h
2131 @comment ISO
2132 @deftypefun double fmin (double @var{x}, double @var{y})
2133 @comment math.h
2134 @comment ISO
2135 @deftypefunx float fminf (float @var{x}, float @var{y})
2136 @comment math.h
2137 @comment ISO
2138 @deftypefunx {long double} fminl (long double @var{x}, long double @var{y})
2139 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2140 The @code{fmin} function returns the lesser of the two values @var{x}
2141 and @var{y}.  It is similar to the expression
2142 @smallexample
2143 ((x) < (y) ? (x) : (y))
2144 @end smallexample
2145 except that @var{x} and @var{y} are only evaluated once.
2146
2147 If an argument is NaN, the other argument is returned.  If both arguments
2148 are NaN, NaN is returned.
2149 @end deftypefun
2150
2151 @comment math.h
2152 @comment ISO
2153 @deftypefun double fmax (double @var{x}, double @var{y})
2154 @comment math.h
2155 @comment ISO
2156 @deftypefunx float fmaxf (float @var{x}, float @var{y})
2157 @comment math.h
2158 @comment ISO
2159 @deftypefunx {long double} fmaxl (long double @var{x}, long double @var{y})
2160 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2161 The @code{fmax} function returns the greater of the two values @var{x}
2162 and @var{y}.
2163
2164 If an argument is NaN, the other argument is returned.  If both arguments
2165 are NaN, NaN is returned.
2166 @end deftypefun
2167
2168 @comment math.h
2169 @comment ISO
2170 @deftypefun double fminmag (double @var{x}, double @var{y})
2171 @comment math.h
2172 @comment ISO
2173 @deftypefunx float fminmagf (float @var{x}, float @var{y})
2174 @comment math.h
2175 @comment ISO
2176 @deftypefunx {long double} fminmagl (long double @var{x}, long double @var{y})
2177 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2178 These functions, from TS 18661-1:2014, return whichever of the two
2179 values @var{x} and @var{y} has the smaller absolute value.  If both
2180 have the same absolute value, or either is NaN, they behave the same
2181 as the @code{fmin} functions.
2182 @end deftypefun
2183
2184 @comment math.h
2185 @comment ISO
2186 @deftypefun double fmaxmag (double @var{x}, double @var{y})
2187 @comment math.h
2188 @comment ISO
2189 @deftypefunx float fmaxmagf (float @var{x}, float @var{y})
2190 @comment math.h
2191 @comment ISO
2192 @deftypefunx {long double} fmaxmagl (long double @var{x}, long double @var{y})
2193 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2194 These functions, from TS 18661-1:2014, return whichever of the two
2195 values @var{x} and @var{y} has the greater absolute value.  If both
2196 have the same absolute value, or either is NaN, they behave the same
2197 as the @code{fmax} functions.
2198 @end deftypefun
2199
2200 @comment math.h
2201 @comment ISO
2202 @deftypefun double fdim (double @var{x}, double @var{y})
2203 @comment math.h
2204 @comment ISO
2205 @deftypefunx float fdimf (float @var{x}, float @var{y})
2206 @comment math.h
2207 @comment ISO
2208 @deftypefunx {long double} fdiml (long double @var{x}, long double @var{y})
2209 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2210 The @code{fdim} function returns the positive difference between
2211 @var{x} and @var{y}.  The positive difference is @math{@var{x} -
2212 @var{y}} if @var{x} is greater than @var{y}, and @math{0} otherwise.
2213
2214 If @var{x}, @var{y}, or both are NaN, NaN is returned.
2215 @end deftypefun
2216
2217 @comment math.h
2218 @comment ISO
2219 @deftypefun double fma (double @var{x}, double @var{y}, double @var{z})
2220 @comment math.h
2221 @comment ISO
2222 @deftypefunx float fmaf (float @var{x}, float @var{y}, float @var{z})
2223 @comment math.h
2224 @comment ISO
2225 @deftypefunx {long double} fmal (long double @var{x}, long double @var{y}, long double @var{z})
2226 @cindex butterfly
2227 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2228 The @code{fma} function performs floating-point multiply-add.  This is
2229 the operation @math{(@var{x} @mul{} @var{y}) + @var{z}}, but the
2230 intermediate result is not rounded to the destination type.  This can
2231 sometimes improve the precision of a calculation.
2232
2233 This function was introduced because some processors have a special
2234 instruction to perform multiply-add.  The C compiler cannot use it
2235 directly, because the expression @samp{x*y + z} is defined to round the
2236 intermediate result.  @code{fma} lets you choose when you want to round
2237 only once.
2238
2239 @vindex FP_FAST_FMA
2240 On processors which do not implement multiply-add in hardware,
2241 @code{fma} can be very slow since it must avoid intermediate rounding.
2242 @file{math.h} defines the symbols @code{FP_FAST_FMA},
2243 @code{FP_FAST_FMAF}, and @code{FP_FAST_FMAL} when the corresponding
2244 version of @code{fma} is no slower than the expression @samp{x*y + z}.
2245 In @theglibc{}, this always means the operation is implemented in
2246 hardware.
2247 @end deftypefun
2248
2249 @node Complex Numbers
2250 @section Complex Numbers
2251 @pindex complex.h
2252 @cindex complex numbers
2253
@w{ISO C99} introduces support for complex numbers in C.  This is done
with a new type specifier, @code{_Complex}.  The header @file{complex.h}
defines the macro @code{complex} as a more readable spelling of it, so
@code{complex} is available only if @file{complex.h} has been included.
There are three complex types, corresponding to the three real types:
@code{float complex}, @code{double complex}, and @code{long double complex}.
2259
2260 To construct complex numbers you need a way to indicate the imaginary
2261 part of a number.  There is no standard notation for an imaginary
2262 floating point constant.  Instead, @file{complex.h} defines two macros
2263 that can be used to create complex numbers.
2264
2265 @deftypevr Macro {const float complex} _Complex_I
2266 This macro is a representation of the complex number ``@math{0+1i}''.
2267 Multiplying a real floating-point value by @code{_Complex_I} gives a
2268 complex number whose value is purely imaginary.  You can use this to
2269 construct complex constants:
2270
2271 @smallexample
2272 @math{3.0 + 4.0i} = @code{3.0 + 4.0 * _Complex_I}
2273 @end smallexample
2274
2275 Note that @code{_Complex_I * _Complex_I} has the value @code{-1}, but
2276 the type of that value is @code{complex}.
2277 @end deftypevr
2278
2279 @c Put this back in when gcc supports _Imaginary_I.  It's too confusing.
2280 @ignore
2281 @noindent
Without an optimizing compiler this is more expensive than using
@code{_Imaginary_I}, but it is better than nothing.  You can avoid all
the hassle by using the @code{I} macro below, if the name is not a
problem.
2286
2287 @deftypevr Macro {const float imaginary} _Imaginary_I
2288 This macro is a representation of the value ``@math{1i}''.  I.e., it is
2289 the value for which
2290
2291 @smallexample
2292 _Imaginary_I * _Imaginary_I = -1
2293 @end smallexample
2294
2295 @noindent
2296 The result is not of type @code{float imaginary} but instead @code{float}.
One can use it to easily construct complex numbers, as in
2298
2299 @smallexample
2300 3.0 - _Imaginary_I * 4.0
2301 @end smallexample
2302
2303 @noindent
which results in the complex number with a real part of 3.0 and an
imaginary part of -4.0.
2306 @end deftypevr
2307 @end ignore
2308
2309 @noindent
2310 @code{_Complex_I} is a bit of a mouthful.  @file{complex.h} also defines
2311 a shorter name for the same constant.
2312
2313 @deftypevr Macro {const float complex} I
2314 This macro has exactly the same value as @code{_Complex_I}.  Most of the
2315 time it is preferable.  However, it causes problems if you want to use
2316 the identifier @code{I} for something else.  You can safely write
2317
2318 @smallexample
2319 #include <complex.h>
2320 #undef I
2321 @end smallexample
2322
2323 @noindent
2324 if you need @code{I} for your own purposes.  (In that case we recommend
2325 you also define some other short name for @code{_Complex_I}, such as
2326 @code{J}.)
2327
2328 @ignore
2329 If the implementation does not support the @code{imaginary} types
2330 @code{I} is defined as @code{_Complex_I} which is the second best
2331 solution.  It still can be used in the same way but requires a most
2332 clever compiler to get the same results.
2333 @end ignore
2334 @end deftypevr
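
As a minimal sketch of the constructions described above, the following
code builds the constant @math{3+4i} first with @code{I} and then, after
@code{I} has been undefined, with a substitute name.  The function names
@code{make_with_I} and @code{make_with_J} and the macro @code{J} are
illustrative assumptions, not part of @file{complex.h}.

```c
#include <complex.h>

/* Build 3 + 4i using the short macro I.  */
double complex
make_with_I (void)
{
  return 3.0 + 4.0 * I;
}

/* From here on, I is free for other uses.  J is a hypothetical
   replacement name, as suggested in the text.  */
#undef I
#define J _Complex_I

/* Build the same value while using I as an ordinary variable.  */
double complex
make_with_J (void)
{
  int I = 4;
  return 3.0 + I * J;
}
```

Both functions return the same @code{double complex} value; only the
spelling of the imaginary unit differs.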
2335
2336 @node Operations on Complex
2337 @section Projections, Conjugates, and Decomposing of Complex Numbers
2338 @cindex project complex numbers
2339 @cindex conjugate complex numbers
2340 @cindex decompose complex numbers
2341 @pindex complex.h
2342
2343 @w{ISO C99} also defines functions that perform basic operations on
2344 complex numbers, such as decomposition and conjugation.  The prototypes
2345 for all these functions are in @file{complex.h}.  All functions are
2346 available in three variants, one for each of the three complex types.
2347
2348 @comment complex.h
2349 @comment ISO
2350 @deftypefun double creal (complex double @var{z})
2351 @comment complex.h
2352 @comment ISO
2353 @deftypefunx float crealf (complex float @var{z})
2354 @comment complex.h
2355 @comment ISO
2356 @deftypefunx {long double} creall (complex long double @var{z})
2357 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2358 These functions return the real part of the complex number @var{z}.
2359 @end deftypefun
2360
2361 @comment complex.h
2362 @comment ISO
2363 @deftypefun double cimag (complex double @var{z})
2364 @comment complex.h
2365 @comment ISO
2366 @deftypefunx float cimagf (complex float @var{z})
2367 @comment complex.h
2368 @comment ISO
2369 @deftypefunx {long double} cimagl (complex long double @var{z})
2370 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2371 These functions return the imaginary part of the complex number @var{z}.
2372 @end deftypefun
2373
2374 @comment complex.h
2375 @comment ISO
2376 @deftypefun {complex double} conj (complex double @var{z})
2377 @comment complex.h
2378 @comment ISO
2379 @deftypefunx {complex float} conjf (complex float @var{z})
2380 @comment complex.h
2381 @comment ISO
2382 @deftypefunx {complex long double} conjl (complex long double @var{z})
2383 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2384 These functions return the conjugate value of the complex number
2385 @var{z}.  The conjugate of a complex number has the same real part and a
2386 negated imaginary part.  In other words, @samp{conj(a + bi) = a + -bi}.
2387 @end deftypefun
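
The following sketch shows @code{creal}, @code{cimag}, and @code{conj}
working together.  The helper names @code{squared_modulus} and
@code{swap_parts} are hypothetical and chosen only for illustration.

```c
#include <complex.h>

/* Return |z|^2 computed as z * conj (z).  The imaginary part of that
   product is zero, so only the real part is kept.  */
double
squared_modulus (double complex z)
{
  return creal (z * conj (z));
}

/* Swap the real and imaginary parts: a + bi becomes b + ai.  */
double complex
swap_parts (double complex z)
{
  return cimag (z) + creal (z) * I;
}
```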
2388
2389 @comment complex.h
2390 @comment ISO
2391 @deftypefun double carg (complex double @var{z})
2392 @comment complex.h
2393 @comment ISO
2394 @deftypefunx float cargf (complex float @var{z})
2395 @comment complex.h
2396 @comment ISO
2397 @deftypefunx {long double} cargl (complex long double @var{z})
2398 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2399 These functions return the argument of the complex number @var{z}.
2400 The argument of a complex number is the angle in the complex plane
2401 between the positive real axis and a line passing through zero and the
2402 number.  This angle is measured in the usual fashion and ranges from
2403 @math{-@pi{}} to @math{@pi{}}.
2404
2405 @code{carg} has a branch cut along the negative real axis.
2406 @end deftypefun
2407
2408 @comment complex.h
2409 @comment ISO
2410 @deftypefun {complex double} cproj (complex double @var{z})
2411 @comment complex.h
2412 @comment ISO
2413 @deftypefunx {complex float} cprojf (complex float @var{z})
2414 @comment complex.h
2415 @comment ISO
2416 @deftypefunx {complex long double} cprojl (complex long double @var{z})
2417 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
2418 These functions return the projection of the complex value @var{z} onto
2419 the Riemann sphere.  Values with an infinite imaginary part are projected
2420 to positive infinity on the real axis, even if the real part is NaN.  If
2421 the real part is infinite, the result is equivalent to
2422
2423 @smallexample
2424 INFINITY + I * copysign (0.0, cimag (z))
2425 @end smallexample
2426 @end deftypefun
2427
2428 @node Parsing of Numbers
2429 @section Parsing of Numbers
2430 @cindex parsing numbers (in formatted input)
2431 @cindex converting strings to numbers
2432 @cindex number syntax, parsing
2433 @cindex syntax, for reading numbers
2434
2435 This section describes functions for ``reading'' integer and
2436 floating-point numbers from a string.  It may be more convenient in some
2437 cases to use @code{sscanf} or one of the related functions; see
2438 @ref{Formatted Input}.  But often you can make a program more robust by
2439 finding the tokens in the string by hand, then converting the numbers
2440 one by one.
2441
2442 @menu
2443 * Parsing of Integers::         Functions for conversion of integer values.
2444 * Parsing of Floats::           Functions for conversion of floating-point
2445                                  values.
2446 @end menu
2447
2448 @node Parsing of Integers
2449 @subsection Parsing of Integers
2450
2451 @pindex stdlib.h
2452 @pindex wchar.h
2453 The @samp{str} functions are declared in @file{stdlib.h} and those
2454 beginning with @samp{wcs} are declared in @file{wchar.h}.  One might
2455 wonder about the use of @code{restrict} in the prototypes of the
2456 functions in this section.  It is seemingly useless but the @w{ISO C}
2457 standard uses it (for the functions defined there) so we have to do it
2458 as well.
2459
2460 @comment stdlib.h
2461 @comment ISO
2462 @deftypefun {long int} strtol (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
2463 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2464 @c strtol uses the thread-local pointer to the locale in effect, and
2465 @c strtol_l loads the LC_NUMERIC locale data from it early on and once,
2466 @c but if the locale is the global locale, and another thread calls
2467 @c setlocale in a way that modifies the pointer to the LC_CTYPE locale
2468 @c category, the behavior of e.g. IS*, TOUPPER will vary throughout the
2469 @c execution of the function, because they re-read the locale data from
2470 @c the given locale pointer.  We solved this by documenting setlocale as
2471 @c MT-Unsafe.
2472 The @code{strtol} (``string-to-long'') function converts the initial
2473 part of @var{string} to a signed integer, which is returned as a value
2474 of type @code{long int}.
2475
2476 This function attempts to decompose @var{string} as follows:
2477
2478 @itemize @bullet
2479 @item
2480 A (possibly empty) sequence of whitespace characters.  Which characters
2481 are whitespace is determined by the @code{isspace} function
2482 (@pxref{Classification of Characters}).  These are discarded.
2483
2484 @item
2485 An optional plus or minus sign (@samp{+} or @samp{-}).
2486
2487 @item
2488 A nonempty sequence of digits in the radix specified by @var{base}.
2489
2490 If @var{base} is zero, decimal radix is assumed unless the series of
2491 digits begins with @samp{0} (specifying octal radix), or @samp{0x} or
2492 @samp{0X} (specifying hexadecimal radix); in other words, the same
2493 syntax used for integer constants in C.
2494
2495 Otherwise @var{base} must have a value between @code{2} and @code{36}.
2496 If @var{base} is @code{16}, the digits may optionally be preceded by
@samp{0x} or @samp{0X}.  If @var{base} has no legal value, the value returned
is @code{0l} and the global variable @code{errno} is set to @code{EINVAL}.
2499
2500 @item
2501 Any remaining characters in the string.  If @var{tailptr} is not a null
2502 pointer, @code{strtol} stores a pointer to this tail in
2503 @code{*@var{tailptr}}.
2504 @end itemize
2505
2506 If the string is empty, contains only whitespace, or does not contain an
2507 initial substring that has the expected syntax for an integer in the
2508 specified @var{base}, no conversion is performed.  In this case,
2509 @code{strtol} returns a value of zero and the value stored in
2510 @code{*@var{tailptr}} is the value of @var{string}.
2511
2512 In a locale other than the standard @code{"C"} locale, this function
2513 may recognize additional implementation-dependent syntax.
2514
2515 If the string has valid syntax for an integer but the value is not
2516 representable because of overflow, @code{strtol} returns either
2517 @code{LONG_MAX} or @code{LONG_MIN} (@pxref{Range of Type}), as
2518 appropriate for the sign of the value.  It also sets @code{errno}
2519 to @code{ERANGE} to indicate there was overflow.
2520
2521 You should not check for errors by examining the return value of
2522 @code{strtol}, because the string might be a valid representation of
2523 @code{0l}, @code{LONG_MAX}, or @code{LONG_MIN}.  Instead, check whether
@code{*@var{tailptr}} points to what you expect after the number
(e.g., @code{'\0'} if the string should end after the number).  You also
need to clear @code{errno} before the call and check it afterward, in
case there was overflow.
2528
2529 There is an example at the end of this section.
2530 @end deftypefun
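
The error-checking advice for @code{strtol} can be collected into one
small wrapper.  This is a sketch under stated assumptions: the name
@code{parse_long} and its return convention are hypothetical, and the
policy of rejecting trailing characters is a design choice that suits
input where the whole string must be a number.

```c
#include <errno.h>
#include <stdlib.h>

/* Convert all of TEXT to a long int in the given BASE.  Return 0 and
   store the value in *RESULT on success.  Return -1 if BASE is
   invalid, the value overflows, no digits are found, or characters
   remain after the number.  */
int
parse_long (const char *text, int base, long int *result)
{
  char *tail;
  long int value;

  errno = 0;
  value = strtol (text, &tail, base);
  /* Check errno first: ERANGE means overflow, EINVAL a bad base.  */
  if (errno != 0)
    return -1;
  /* tail == text means no conversion was performed.  */
  if (tail == text || *tail != '\0')
    return -1;
  *result = value;
  return 0;
}
```

The return value of @code{strtol} is never used to detect an error;
only @code{errno} and the tail pointer are examined.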
2531
2532 @comment wchar.h
2533 @comment ISO
2534 @deftypefun {long int} wcstol (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
2535 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2536 The @code{wcstol} function is equivalent to the @code{strtol} function
2537 in nearly all aspects but handles wide character strings.
2538
2539 The @code{wcstol} function was introduced in @w{Amendment 1} of @w{ISO C90}.
2540 @end deftypefun
2541
2542 @comment stdlib.h
2543 @comment ISO
@deftypefun {unsigned long int} strtoul (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
2545 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2546 The @code{strtoul} (``string-to-unsigned-long'') function is like
2547 @code{strtol} except it converts to an @code{unsigned long int} value.
2548 The syntax is the same as described above for @code{strtol}.  The value
2549 returned on overflow is @code{ULONG_MAX} (@pxref{Range of Type}).
2550
2551 If @var{string} depicts a negative number, @code{strtoul} acts the same
as @code{strtol} but casts the result to an unsigned integer.  That means
2553 for example that @code{strtoul} on @code{"-1"} returns @code{ULONG_MAX}
2554 and an input more negative than @code{LONG_MIN} returns
2555 (@code{ULONG_MAX} + 1) / 2.
2556
@code{strtoul} sets @code{errno} to @code{EINVAL} if @var{base} is out of
2558 range, or @code{ERANGE} on overflow.
2559 @end deftypefun
2560
2561 @comment wchar.h
2562 @comment ISO
2563 @deftypefun {unsigned long int} wcstoul (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
2564 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2565 The @code{wcstoul} function is equivalent to the @code{strtoul} function
2566 in nearly all aspects but handles wide character strings.
2567
2568 The @code{wcstoul} function was introduced in @w{Amendment 1} of @w{ISO C90}.
2569 @end deftypefun
2570
2571 @comment stdlib.h
2572 @comment ISO
2573 @deftypefun {long long int} strtoll (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
2574 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2575 The @code{strtoll} function is like @code{strtol} except that it returns
2576 a @code{long long int} value, and accepts numbers with a correspondingly
2577 larger range.
2578
2579 If the string has valid syntax for an integer but the value is not
2580 representable because of overflow, @code{strtoll} returns either
2581 @code{LLONG_MAX} or @code{LLONG_MIN} (@pxref{Range of Type}), as
2582 appropriate for the sign of the value.  It also sets @code{errno} to
2583 @code{ERANGE} to indicate there was overflow.
2584
2585 The @code{strtoll} function was introduced in @w{ISO C99}.
2586 @end deftypefun
2587
2588 @comment wchar.h
2589 @comment ISO
2590 @deftypefun {long long int} wcstoll (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
2591 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2592 The @code{wcstoll} function is equivalent to the @code{strtoll} function
2593 in nearly all aspects but handles wide character strings.
2594
The @code{wcstoll} function was introduced in @w{ISO C99}.
2596 @end deftypefun
2597
2598 @comment stdlib.h
2599 @comment BSD
2600 @deftypefun {long long int} strtoq (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
2601 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2602 @code{strtoq} (``string-to-quad-word'') is the BSD name for @code{strtoll}.
2603 @end deftypefun
2604
2605 @comment wchar.h
2606 @comment GNU
2607 @deftypefun {long long int} wcstoq (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
2608 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2609 The @code{wcstoq} function is equivalent to the @code{strtoq} function
2610 in nearly all aspects but handles wide character strings.
2611
2612 The @code{wcstoq} function is a GNU extension.
2613 @end deftypefun
2614
2615 @comment stdlib.h
2616 @comment ISO
2617 @deftypefun {unsigned long long int} strtoull (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
2618 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2619 The @code{strtoull} function is related to @code{strtoll} the same way
2620 @code{strtoul} is related to @code{strtol}.
2621
2622 The @code{strtoull} function was introduced in @w{ISO C99}.
2623 @end deftypefun
2624
2625 @comment wchar.h
2626 @comment ISO
2627 @deftypefun {unsigned long long int} wcstoull (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
2628 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2629 The @code{wcstoull} function is equivalent to the @code{strtoull} function
2630 in nearly all aspects but handles wide character strings.
2631
The @code{wcstoull} function was introduced in @w{ISO C99}.
2633 @end deftypefun
2634
2635 @comment stdlib.h
2636 @comment BSD
2637 @deftypefun {unsigned long long int} strtouq (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
2638 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2639 @code{strtouq} is the BSD name for @code{strtoull}.
2640 @end deftypefun
2641
2642 @comment wchar.h
2643 @comment GNU
2644 @deftypefun {unsigned long long int} wcstouq (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
2645 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2646 The @code{wcstouq} function is equivalent to the @code{strtouq} function
2647 in nearly all aspects but handles wide character strings.
2648
2649 The @code{wcstouq} function is a GNU extension.
2650 @end deftypefun
2651
2652 @comment inttypes.h
2653 @comment ISO
2654 @deftypefun intmax_t strtoimax (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
2655 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2656 The @code{strtoimax} function is like @code{strtol} except that it returns
an @code{intmax_t} value, and accepts numbers of a corresponding range.
2658
2659 If the string has valid syntax for an integer but the value is not
2660 representable because of overflow, @code{strtoimax} returns either
2661 @code{INTMAX_MAX} or @code{INTMAX_MIN} (@pxref{Integers}), as
2662 appropriate for the sign of the value.  It also sets @code{errno} to
2663 @code{ERANGE} to indicate there was overflow.
2664
2665 See @ref{Integers} for a description of the @code{intmax_t} type.  The
2666 @code{strtoimax} function was introduced in @w{ISO C99}.
2667 @end deftypefun
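
One use of @code{strtoimax} is to parse into the widest type first and
then range-check for a narrower fixed-width type.  The sketch below
does this for @code{int32_t}; the name @code{parse_int32} is
hypothetical.

```c
#include <errno.h>
#include <inttypes.h>

/* Parse all of TEXT as a decimal int32_t.  Return 0 on success and
   -1 if the text is not a number or does not fit in 32 bits.  */
int
parse_int32 (const char *text, int32_t *result)
{
  char *tail;
  intmax_t value;

  errno = 0;
  value = strtoimax (text, &tail, 10);
  if (errno != 0 || tail == text || *tail != '\0')
    return -1;
  /* The value fits in intmax_t; now check the narrower range.  */
  if (value < INT32_MIN || value > INT32_MAX)
    return -1;
  *result = (int32_t) value;
  return 0;
}
```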
2668
2669 @comment wchar.h
2670 @comment ISO
2671 @deftypefun intmax_t wcstoimax (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
2672 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2673 The @code{wcstoimax} function is equivalent to the @code{strtoimax} function
2674 in nearly all aspects but handles wide character strings.
2675
2676 The @code{wcstoimax} function was introduced in @w{ISO C99}.
2677 @end deftypefun
2678
2679 @comment inttypes.h
2680 @comment ISO
2681 @deftypefun uintmax_t strtoumax (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base})
2682 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2683 The @code{strtoumax} function is related to @code{strtoimax}
2684 the same way that @code{strtoul} is related to @code{strtol}.
2685
See @ref{Integers} for a description of the @code{uintmax_t} type.  The
2687 @code{strtoumax} function was introduced in @w{ISO C99}.
2688 @end deftypefun
2689
2690 @comment wchar.h
2691 @comment ISO
2692 @deftypefun uintmax_t wcstoumax (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base})
2693 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2694 The @code{wcstoumax} function is equivalent to the @code{strtoumax} function
2695 in nearly all aspects but handles wide character strings.
2696
2697 The @code{wcstoumax} function was introduced in @w{ISO C99}.
2698 @end deftypefun
2699
2700 @comment stdlib.h
2701 @comment ISO
2702 @deftypefun {long int} atol (const char *@var{string})
2703 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2704 This function is similar to the @code{strtol} function with a @var{base}
2705 argument of @code{10}, except that it need not detect overflow errors.
2706 The @code{atol} function is provided mostly for compatibility with
2707 existing code; using @code{strtol} is more robust.
2708 @end deftypefun
2709
2710 @comment stdlib.h
2711 @comment ISO
2712 @deftypefun int atoi (const char *@var{string})
2713 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2714 This function is like @code{atol}, except that it returns an @code{int}.
2715 The @code{atoi} function is also considered obsolete; use @code{strtol}
2716 instead.
2717 @end deftypefun
2718
2719 @comment stdlib.h
2720 @comment ISO
2721 @deftypefun {long long int} atoll (const char *@var{string})
2722 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2723 This function is similar to @code{atol}, except it returns a @code{long
2724 long int}.
2725
2726 The @code{atoll} function was introduced in @w{ISO C99}.  It too is
2727 obsolete (despite having just been added); use @code{strtoll} instead.
2728 @end deftypefun
2729
None of the functions mentioned in this section so far handle the
alternative representations of characters described in the locale
data.  Some locales specify a thousands separator and how it is to be
used, which can make large numbers more readable.  To read such
numbers, use the @code{scanf} functions with the @samp{'} flag.
2736
2737 Here is a function which parses a string as a sequence of integers and
2738 returns the sum of them:
2739
2740 @smallexample
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

long int
sum_ints_from_string (const char *string)
@{
  long int sum = 0;

  while (1) @{
    char *tail;
    long int next;

    /* @r{Skip whitespace by hand, to detect the end.}  */
    while (isspace ((unsigned char) *string)) string++;
    if (*string == 0)
      break;

    /* @r{There is more nonwhitespace,}  */
    /* @r{so it ought to be another number.}  */
    errno = 0;
    /* @r{Parse it.}  */
    next = strtol (string, &tail, 0);
    /* @r{Stop if nothing was converted, to avoid looping forever.}  */
    if (tail == string)
      break;
    /* @r{Add it in, if not overflow.}  */
    if (errno)
      printf ("Overflow\n");
    else
      sum += next;
    /* @r{Advance past it.}  */
    string = tail;
  @}

  return sum;
@}
2771 @end smallexample
2772
2773 @node Parsing of Floats
2774 @subsection Parsing of Floats
2775
2776 @pindex stdlib.h
2777 The @samp{str} functions are declared in @file{stdlib.h} and those
2778 beginning with @samp{wcs} are declared in @file{wchar.h}.  One might
2779 wonder about the use of @code{restrict} in the prototypes of the
2780 functions in this section.  It is seemingly useless but the @w{ISO C}
2781 standard uses it (for the functions defined there) so we have to do it
2782 as well.
2783
2784 @comment stdlib.h
2785 @comment ISO
2786 @deftypefun double strtod (const char *restrict @var{string}, char **restrict @var{tailptr})
2787 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2788 @c Besides the unsafe-but-ruled-safe locale uses, this uses a lot of
2789 @c mpn, but it's all safe.
2790 @c
2791 @c round_and_return
2792 @c   get_rounding_mode ok
2793 @c   mpn_add_1 ok
2794 @c   mpn_rshift ok
2795 @c   MPN_ZERO ok
2796 @c   MPN2FLOAT -> mpn_construct_(float|double|long_double) ok
2797 @c str_to_mpn
2798 @c   mpn_mul_1 -> umul_ppmm ok
2799 @c   mpn_add_1 ok
2800 @c mpn_lshift_1 -> mpn_lshift ok
2801 @c STRTOF_INTERNAL
2802 @c   MPN_VAR ok
2803 @c   SET_MANTISSA ok
2804 @c   STRNCASECMP ok, wide and narrow
2805 @c   round_and_return ok
2806 @c   mpn_mul ok
2807 @c     mpn_addmul_1 ok
2808 @c     ... mpn_sub
2809 @c   mpn_lshift ok
2810 @c   udiv_qrnnd ok
2811 @c   count_leading_zeros ok
2812 @c   add_ssaaaa ok
2813 @c   sub_ddmmss ok
2814 @c   umul_ppmm ok
2815 @c   mpn_submul_1 ok
2816 The @code{strtod} (``string-to-double'') function converts the initial
2817 part of @var{string} to a floating-point number, which is returned as a
2818 value of type @code{double}.
2819
2820 This function attempts to decompose @var{string} as follows:
2821
2822 @itemize @bullet
2823 @item
2824 A (possibly empty) sequence of whitespace characters.  Which characters
2825 are whitespace is determined by the @code{isspace} function
2826 (@pxref{Classification of Characters}).  These are discarded.
2827
2828 @item
2829 An optional plus or minus sign (@samp{+} or @samp{-}).
2830
2831 @item A floating point number in decimal or hexadecimal format.  The
2832 decimal format is:
2833 @itemize @minus
2834
2835 @item
2836 A nonempty sequence of digits optionally containing a decimal-point
2837 character---normally @samp{.}, but it depends on the locale
2838 (@pxref{General Numeric}).
2839
2840 @item
2841 An optional exponent part, consisting of a character @samp{e} or
2842 @samp{E}, an optional sign, and a sequence of digits.
2843
2844 @end itemize
2845
2846 The hexadecimal format is as follows:
2847 @itemize @minus
2848
2849 @item
@samp{0x} or @samp{0X} followed by a nonempty sequence of hexadecimal
digits optionally containing a decimal-point character---normally
@samp{.}, but it depends on the locale (@pxref{General Numeric}).
2853
2854 @item
2855 An optional binary-exponent part, consisting of a character @samp{p} or
2856 @samp{P}, an optional sign, and a sequence of digits.
2857
2858 @end itemize
2859
2860 @item
2861 Any remaining characters in the string.  If @var{tailptr} is not a null
2862 pointer, a pointer to this tail of the string is stored in
2863 @code{*@var{tailptr}}.
2864 @end itemize
2865
2866 If the string is empty, contains only whitespace, or does not contain an
2867 initial substring that has the expected syntax for a floating-point
2868 number, no conversion is performed.  In this case, @code{strtod} returns
2869 a value of zero and the value returned in @code{*@var{tailptr}} is the
2870 value of @var{string}.
2871
2872 In a locale other than the standard @code{"C"} or @code{"POSIX"} locales,
2873 this function may recognize additional locale-dependent syntax.
2874
2875 If the string has valid syntax for a floating-point number but the value
2876 is outside the range of a @code{double}, @code{strtod} will signal
2877 overflow or underflow as described in @ref{Math Error Reporting}.
2878
2879 @code{strtod} recognizes four special input strings.  The strings
2880 @code{"inf"} and @code{"infinity"} are converted to @math{@infinity{}},
2881 or to the largest representable value if the floating-point format
2882 doesn't support infinities.  You can prepend a @code{"+"} or @code{"-"}
2883 to specify the sign.  Case is ignored when scanning these strings.
2884
2885 The strings @code{"nan"} and @code{"nan(@var{chars@dots{}})"} are converted
2886 to NaN.  Again, case is ignored.  If @var{chars@dots{}} are provided, they
2887 are used in some unspecified fashion to select a particular
2888 representation of NaN (there can be several).
2889
2890 Since zero is a valid result as well as the value returned on error, you
2891 should check for errors in the same way as for @code{strtol}, by
examining @code{errno} and @var{tailptr}.
2893 @end deftypefun
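
The checks recommended for @code{strtod} can be sketched as follows.
The name @code{parse_double} is hypothetical, and treating underflow
as an error (along with overflow) is an assumption of this example;
some programs may prefer to accept the underflowed result.

```c
#include <errno.h>
#include <stdlib.h>

/* Convert all of TEXT to a double.  Return 0 and store the value in
   *RESULT on success.  Return -1 if no number is found, characters
   remain after it, or the value is out of range.  */
int
parse_double (const char *text, double *result)
{
  char *tail;
  double value;

  errno = 0;
  value = strtod (text, &tail);
  /* tail == text means no conversion was performed.  */
  if (tail == text || *tail != '\0')
    return -1;
  /* ERANGE reports overflow or underflow.  */
  if (errno == ERANGE)
    return -1;
  *result = value;
  return 0;
}
```

In the standard @code{"C"} locale this accepts decimal input such as
@code{"1.5e3"}, hexadecimal input such as @code{"0x1.8p1"}, and the
special strings @code{"inf"} and @code{"nan"}.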
2894
2895 @comment stdlib.h
2896 @comment ISO
2897 @deftypefun float strtof (const char *@var{string}, char **@var{tailptr})
2898 @comment stdlib.h
2899 @comment ISO
2900 @deftypefunx {long double} strtold (const char *@var{string}, char **@var{tailptr})
2901 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2902 These functions are analogous to @code{strtod}, but return @code{float}
2903 and @code{long double} values respectively.  They report errors in the
2904 same way as @code{strtod}.  @code{strtof} can be substantially faster
2905 than @code{strtod}, but has less precision; conversely, @code{strtold}
2906 can be much slower but has more precision (on systems where @code{long
2907 double} is a separate type).
2908
These functions were GNU extensions before being added in @w{ISO C99}.
2910 @end deftypefun
2911
2912 @comment wchar.h
2913 @comment ISO
2914 @deftypefun double wcstod (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr})
@comment wchar.h
@comment ISO
@deftypefunx float wcstof (const wchar_t *@var{string}, wchar_t **@var{tailptr})
@comment wchar.h
2919 @comment ISO
2920 @deftypefunx {long double} wcstold (const wchar_t *@var{string}, wchar_t **@var{tailptr})
2921 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
The @code{wcstod}, @code{wcstof}, and @code{wcstold} functions are
equivalent in nearly all aspects to the @code{strtod}, @code{strtof}, and
@code{strtold} functions, but they handle wide character strings.
2925
2926 The @code{wcstod} function was introduced in @w{Amendment 1} of @w{ISO
2927 C90}.  The @code{wcstof} and @code{wcstold} functions were introduced in
2928 @w{ISO C99}.
2929 @end deftypefun
2930
2931 @comment stdlib.h
2932 @comment ISO
2933 @deftypefun double atof (const char *@var{string})
2934 @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}}
2935 This function is similar to the @code{strtod} function, except that it
2936 need not detect overflow and underflow errors.  The @code{atof} function
2937 is provided mostly for compatibility with existing code; using
2938 @code{strtod} is more robust.
2939 @end deftypefun
2940
2941 @Theglibc{} also provides @samp{_l} versions of these functions,
2942 which take an additional argument, the locale to use in conversion.
2943
2944 See also @ref{Parsing of Integers}.
2945
2946 @node Printing of Floats
2947 @section Printing of Floats
2948
2949 @pindex stdlib.h
2950 The @samp{strfrom} functions are declared in @file{stdlib.h}.
2951
2952 @comment stdlib.h
2953 @comment ISO/IEC TS 18661-1
2954 @deftypefun int strfromd (char *restrict @var{string}, size_t @var{size}, const char *restrict @var{format}, double @var{value})
2955 @deftypefunx int strfromf (char *restrict @var{string}, size_t @var{size}, const char *restrict @var{format}, float @var{value})
2956 @deftypefunx int strfroml (char *restrict @var{string}, size_t @var{size}, const char *restrict @var{format}, long double @var{value})
2957 @safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
2958 @comment these functions depend on __printf_fp and __printf_fphex, which are
2959 @comment AS-unsafe (ascuheap) and AC-unsafe (acsmem).
2960 The functions @code{strfromd} (``string-from-double''), @code{strfromf}
2961 (``string-from-float''), and @code{strfroml} (``string-from-long-double'')
convert the floating-point number @var{value} to a string of characters and
store them into the area pointed to by @var{string}.  The conversion
2964 writes at most @var{size} characters and respects the format specified by
2965 @var{format}.
2966
2967 The format string must start with the character @samp{%}.  An optional
2968 precision follows, which starts with a period, @samp{.}, and may be
2969 followed by a decimal integer, representing the precision.  If a decimal
2970 integer is not specified after the period, the precision is taken to be
2971 zero.  The character @samp{*} is not allowed.  Finally, the format string
2972 ends with one of the following conversion specifiers: @samp{a}, @samp{A},
2973 @samp{e}, @samp{E}, @samp{f}, @samp{F}, @samp{g} or @samp{G} (@pxref{Table
2974 of Output Conversions}).  Invalid format strings result in undefined
2975 behavior.
2976
2977 These functions return the number of characters that would have been
2978 written to @var{string} had @var{size} been sufficiently large, not
2979 counting the terminating null character.  Thus, the null-terminated output
2980 has been completely written if and only if the returned value is less than
2981 @var{size}.
2982
2983 These functions were introduced by ISO/IEC TS 18661-1.
2984 @end deftypefun
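
The return-value rule can be used to detect truncation, as in this
sketch.  The name @code{format_fixed3} is hypothetical, and defining
@code{__STDC_WANT_IEC_60559_BFP_EXT__} before including
@file{stdlib.h} is an assumption about what is needed to make the
declaration visible; other feature test macros may serve as well.

```c
/* Assumption: request the TS 18661-1 declarations from stdlib.h.  */
#define __STDC_WANT_IEC_60559_BFP_EXT__ 1
#include <stdlib.h>

/* Format VALUE with three digits after the decimal point into BUF,
   which holds SIZE bytes.  Return 0 if the whole result fit and -1
   if it was truncated.  */
int
format_fixed3 (char *buf, size_t size, double value)
{
  int needed = strfromd (buf, size, "%.3f", value);

  /* The output is complete only if the count, which excludes the
     terminating null character, is less than SIZE.  */
  return (needed >= 0 && (size_t) needed < size) ? 0 : -1;
}
```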
2985
2986 @node System V Number Conversion
2987 @section Old-fashioned System V number-to-string functions
2988
2989 The old @w{System V} C library provided three functions to convert
2990 numbers to strings, with unusual and hard-to-use semantics.  @Theglibc{}
2991 also provides these functions and some natural extensions.
2992
2993 These functions are only available in @theglibc{} and on systems descended
2994 from AT&T Unix.  Therefore, unless these functions do precisely what you
2995 need, it is better to use @code{sprintf}, which is standard.
2996
2997 All these functions are defined in @file{stdlib.h}.
2998
2999 @comment stdlib.h
3000 @comment SVID, Unix98
3001 @deftypefun {char *} ecvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
3002 @safety{@prelim{}@mtunsafe{@mtasurace{:ecvt}}@asunsafe{}@acsafe{}}
3003 The function @code{ecvt} converts the floating-point number @var{value}
3004 to a string with at most @var{ndigit} decimal digits.  The
3005 returned string contains no decimal point or sign.  The first digit of
3006 the string is non-zero (unless @var{value} is actually zero) and the
3007 last digit is rounded to nearest.  @code{*@var{decpt}} is set to the
3008 index in the string of the first digit after the decimal point.
3009 @code{*@var{neg}} is set to a nonzero value if @var{value} is negative,
3010 zero otherwise.
3011
If @var{ndigit} decimal digits would exceed the precision of a
@code{double}, @var{ndigit} is reduced to a system-specific value.
3014
3015 The returned string is statically allocated and overwritten by each call
3016 to @code{ecvt}.
3017
3018 If @var{value} is zero, it is implementation defined whether
3019 @code{*@var{decpt}} is @code{0} or @code{1}.
3020
3021 For example: @code{ecvt (12.3, 5, &d, &n)} returns @code{"12300"}
3022 and sets @var{d} to @code{2} and @var{n} to @code{0}.
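
The three results of @code{ecvt} (the digit string, @code{*@var{decpt}}
and @code{*@var{neg}}) have to be recombined by the caller.  The
following is only a minimal sketch of one way to do that;
@code{ecvt_to_scientific} is a hypothetical helper, not a library
function, and the @code{_DEFAULT_SOURCE} definition assumes @theglibc{}.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1      /* asks glibc's stdlib.h to declare ecvt */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of any library: write VALUE into OUT
   as "[-]0.DIGITSe<decpt>" using the three pieces that ecvt reports.
   Returns what snprintf returns.  */
static int
ecvt_to_scientific (double value, int ndigit, char *out, size_t size)
{
  int decpt, neg;
  /* The digits live in a static buffer, so use them before calling
     ecvt again.  */
  const char *digits = ecvt (value, ndigit, &decpt, &neg);

  return snprintf (out, size, "%s0.%se%d", neg ? "-" : "", digits, decpt);
}
```

With the arguments @code{12.3} and @code{5} this sketch produces
@code{"0.12300e2"}, that is, the digits @code{12300} with the decimal
point placed two digits in.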
3023 @end deftypefun
3024
3025 @comment stdlib.h
3026 @comment SVID, Unix98
3027 @deftypefun {char *} fcvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
3028 @safety{@prelim{}@mtunsafe{@mtasurace{:fcvt}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
The function @code{fcvt} is like @code{ecvt}, but @var{ndigit} specifies
the number of digits after the decimal point.  If @var{ndigit} is less
than zero, @var{value} is rounded to the @math{-@var{ndigit}+1}'th place to the
left of the decimal point.  For example, if @var{ndigit} is @code{-1},
@var{value} will be rounded to the nearest 10.  If @var{ndigit} is
negative and its magnitude is larger than the number of digits to the
left of the decimal point in @var{value}, @var{value} will be rounded to
one significant digit.
3036
If @var{ndigit} decimal digits would exceed the precision of a
@code{double}, @var{ndigit} is reduced to a system-specific value.
3039
3040 The returned string is statically allocated and overwritten by each call
3041 to @code{fcvt}.
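
Because @var{ndigit} counts digits after the decimal point, the string
returned by @code{fcvt} can be split at @code{*@var{decpt}} to recover a
fixed-point representation.  The following sketch illustrates this for
the simple case only; @code{fcvt_to_fixed} is a hypothetical helper, and
it deliberately gives up when @code{*@var{decpt}} does not fall inside
the digit string.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1      /* asks glibc's stdlib.h to declare fcvt */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: rebuild "[-]III.FFF" from the digit string
   returned by fcvt.  Sketch only: it handles just the simple case
   0 < decpt <= strlen (digits) and returns -1 otherwise.  */
static int
fcvt_to_fixed (double value, int ndigit, char *out, size_t size)
{
  int decpt, neg;
  const char *digits = fcvt (value, ndigit, &decpt, &neg);

  if (decpt <= 0 || (size_t) decpt > strlen (digits))
    return -1;
  /* "%.*s" copies the first DECPT digits; the rest follow the point.  */
  return snprintf (out, size, "%s%.*s.%s", neg ? "-" : "",
                   decpt, digits, digits + decpt);
}
```

For @code{12.3} with an @var{ndigit} of @code{2}, @code{fcvt} yields the
digits @code{1230} with @code{*@var{decpt}} set to @code{2}, so the
helper writes @code{"12.30"}.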
3042 @end deftypefun
3043
3044 @comment stdlib.h
3045 @comment SVID, Unix98
3046 @deftypefun {char *} gcvt (double @var{value}, int @var{ndigit}, char *@var{buf})
3047 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
3048 @c gcvt calls sprintf, that ultimately calls vfprintf, which malloc()s
3049 @c args_value if it's too large, but gcvt never exercises this path.
@code{gcvt} is functionally equivalent to @samp{sprintf(buf, "%.*g",
ndigit, value)}.  It is provided only for compatibility's sake.  It
returns @var{buf}.
3053
If @var{ndigit} decimal digits would exceed the precision of a
@code{double}, @var{ndigit} is reduced to a system-specific value.
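
The relationship to @code{printf}-style formatting can be checked
directly.  This sketch assumes that @var{ndigit} acts as the precision
of a @samp{%g} conversion (the @code{"%.*g"} form) and that
@var{ndigit} is small enough for no system-specific limit to apply;
@code{gcvt_agrees_with_printf} is a hypothetical name used only for
illustration.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1      /* asks glibc's stdlib.h to declare gcvt */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical check: nonzero if gcvt and the "%.*g" conversion agree
   for VALUE.  Assumes 0 < ndigit <= 15, so that no system-specific
   limit on NDIGIT comes into play and 64 bytes are ample.  */
static int
gcvt_agrees_with_printf (double value, int ndigit)
{
  char from_gcvt[64], from_printf[64];

  gcvt (value, ndigit, from_gcvt);   /* no size argument: caller sizes BUF */
  snprintf (from_printf, sizeof from_printf, "%.*g", ndigit, value);
  return strcmp (from_gcvt, from_printf) == 0;
}
```

Note that @code{gcvt} takes no buffer length, so the caller is
responsible for making @var{buf} large enough; @code{snprintf} avoids
that hazard.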
3056 @end deftypefun
3057
3058 As extensions, @theglibc{} provides versions of these three
3059 functions that take @code{long double} arguments.
3060
3061 @comment stdlib.h
3062 @comment GNU
3063 @deftypefun {char *} qecvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
3064 @safety{@prelim{}@mtunsafe{@mtasurace{:qecvt}}@asunsafe{}@acsafe{}}
3065 This function is equivalent to @code{ecvt} except that it takes a
3066 @code{long double} for the first parameter and that @var{ndigit} is
3067 restricted by the precision of a @code{long double}.
3068 @end deftypefun
3069
3070 @comment stdlib.h
3071 @comment GNU
3072 @deftypefun {char *} qfcvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
3073 @safety{@prelim{}@mtunsafe{@mtasurace{:qfcvt}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}}
3074 This function is equivalent to @code{fcvt} except that it
3075 takes a @code{long double} for the first parameter and that @var{ndigit} is
3076 restricted by the precision of a @code{long double}.
3077 @end deftypefun
3078
3079 @comment stdlib.h
3080 @comment GNU
3081 @deftypefun {char *} qgcvt (long double @var{value}, int @var{ndigit}, char *@var{buf})
3082 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
3083 This function is equivalent to @code{gcvt} except that it takes a
3084 @code{long double} for the first parameter and that @var{ndigit} is
3085 restricted by the precision of a @code{long double}.
3086 @end deftypefun
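
The @code{long double} variants are called exactly like their
@code{double} counterparts.  The following sketch prints all three
conversions of one value; @code{print_long_double_forms} is a
hypothetical helper, and the 64-byte buffer assumes a small
@var{ndigit}.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1      /* asks glibc's stdlib.h to declare q*cvt */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: print the three long double conversions of
   VALUE.  */
static void
print_long_double_forms (long double value, int ndigit)
{
  int decpt, neg;
  char buf[64];
  const char *digits;

  /* Convert first and print afterwards, so that DECPT and NEG are
     already set when printf reads them.  */
  digits = qecvt (value, ndigit, &decpt, &neg);
  printf ("qecvt: %s decpt=%d neg=%d\n", digits, decpt, neg);

  digits = qfcvt (value, ndigit, &decpt, &neg);
  printf ("qfcvt: %s decpt=%d neg=%d\n", digits, decpt, neg);

  printf ("qgcvt: %s\n", qgcvt (value, ndigit, buf));
}
```

Calling the conversion inside the @code{printf} argument list instead
would be a mistake, because the order in which function arguments are
evaluated is unspecified in C.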
3087
3088
3089 @cindex gcvt_r
3090 The @code{ecvt} and @code{fcvt} functions, and their @code{long double}
3091 equivalents, all return a string located in a static buffer which is
3092 overwritten by the next call to the function.  @Theglibc{}
3093 provides another set of extended functions which write the converted
3094 string into a user-supplied buffer.  These have the conventional
3095 @code{_r} suffix.
3096
3097 @code{gcvt_r} is not necessary, because @code{gcvt} already uses a
3098 user-supplied buffer.
3099
3100 @comment stdlib.h
3101 @comment GNU
3102 @deftypefun int ecvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
3103 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
3104 The @code{ecvt_r} function is the same as @code{ecvt}, except
3105 that it places its result into the user-specified buffer pointed to by
3106 @var{buf}, with length @var{len}.  The return value is @code{-1} in
3107 case of an error and zero otherwise.
3108
3109 This function is a GNU extension.
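
Since each call writes into its own buffer, two conversions can be held
at the same time, which the static buffer of @code{ecvt} does not
allow.  A minimal sketch, assuming @theglibc{};
@code{same_leading_digits} is a hypothetical helper and the buffer size
of 32 is an assumption that is ample for a @code{double}:

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1      /* asks glibc's stdlib.h to declare ecvt_r */
#endif
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return 1 if A and B agree in sign, decimal
   point position and their first NDIGIT significant digits, 0 if they
   differ, and -1 if either conversion reports an error.  */
static int
same_leading_digits (double a, double b, int ndigit)
{
  char digits_a[32], digits_b[32];
  int decpt_a, decpt_b, neg_a, neg_b;

  /* Each result goes into its own caller-supplied buffer.  */
  if (ecvt_r (a, ndigit, &decpt_a, &neg_a, digits_a, sizeof digits_a) != 0
      || ecvt_r (b, ndigit, &decpt_b, &neg_b, digits_b, sizeof digits_b) != 0)
    return -1;

  return decpt_a == decpt_b && !neg_a == !neg_b
         && strcmp (digits_a, digits_b) == 0;
}
```

Checking the return value before reading the buffers matters here,
because the contents of @var{buf} should not be relied upon after a
failed call.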
3110 @end deftypefun
3111
3112 @comment stdlib.h
@comment GNU
3114 @deftypefun int fcvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
3115 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
3116 The @code{fcvt_r} function is the same as @code{fcvt}, except that it
3117 places its result into the user-specified buffer pointed to by
3118 @var{buf}, with length @var{len}.  The return value is @code{-1} in
3119 case of an error and zero otherwise.
3120
3121 This function is a GNU extension.
3122 @end deftypefun
3123
3124 @comment stdlib.h
3125 @comment GNU
3126 @deftypefun int qecvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
3127 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
3128 The @code{qecvt_r} function is the same as @code{qecvt}, except
3129 that it places its result into the user-specified buffer pointed to by
3130 @var{buf}, with length @var{len}.  The return value is @code{-1} in
3131 case of an error and zero otherwise.
3132
3133 This function is a GNU extension.
3134 @end deftypefun
3135
3136 @comment stdlib.h
3137 @comment GNU
3138 @deftypefun int qfcvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
3139 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
3140 The @code{qfcvt_r} function is the same as @code{qfcvt}, except
3141 that it places its result into the user-specified buffer pointed to by
3142 @var{buf}, with length @var{len}.  The return value is @code{-1} in
3143 case of an error and zero otherwise.
3144
3145 This function is a GNU extension.
3146 @end deftypefun