docs/pdds/draft/pdd14_numbers.pod

   1 # Copyright (C) 2001-2009, Parrot Foundation.
   2 # $Id$
   3
   4 =head1 [DRAFT] PDD 14: Numbers
   5
   6 =head2 Abstract
   7
   8 This PDD describes Parrot's numeric data types.
   9
  10 =head2 Version
  11
  12 $Revision$
  13
  14 =head2 Description
  15
  16 This PDD details the basic numeric datatypes that the Parrot core knows how to
  17 deal with, including the core numeric PMCs.
  18
  19 =head3 Integer data types
  20
  21 Parrot provides a native integer data type, generally known as an "Int". The
  22 size of the integer is chosen at Parrot configuration time, the same size as
  23 platform-native integers. In C, the typedefs C<INTVAL> and C<UINTVAL> are
  24 native signed and unsigned integers respectively. The semantics of native
  25 integer data types are the same as the semantics of their C equivalents.
  26
  27 Integer data types have a dedicated register set. In PIR, the C<I> register
  28 variables (C<$I0>, etc.) and C<.param>s or C<.local>s declared with the C<int>
  29 type are native integers. Native unsigned integers are not accessible directly
  30 in PIR. Many opcodes or vtable functions are defined with variants that take
  31 native integer arguments. When passed to a subroutine or method call, a native
  32 integer may be autoboxed as an C<Integer> PMC, or as an HLL type mapped to
  33 C<Integer>.
  34
  35 =head3 Floating-point data types
  36
  37 Parrot provides a native floating-point data type, generally known as a "Num".
  38 The size of the float is chosen at Parrot configuration time, the same size as
  39 platform-native floats. In C, the typedef C<FLOATVAL> is a native float data
  40 type. The semantics of the native float data type are the same as the
  41 semantics of the C equivalent.
  42
  43 Float data types have a dedicated register set. In PIR, the C<N> register
  44 variables (C<$N0>, etc.) and C<.param>s or C<.local>s declared with the C<num>
  45 type are native floats. Many opcodes or vtable functions are defined with
  46 variants that take native float arguments. When passed to a subroutine or
  47 method call, a native float may be autoboxed as a C<Float> PMC, or as an HLL
  48 type mapped to C<Float>.
  49
  50 =head3 Integer PMC
  51
  52 The C<Integer> PMC is a high-level integer type, providing the features of a
  53 integer data type appropriate for use in a high-level language. Some languages
  54 may be able to use Parrot's C<Integer> directly as their integer data type.
  55 Others may subclass C<Integer> to add their own functionality, and others may
  56 implement their own high-level integer data type.
  57
  58 The C<Integer> PMC has a single attribute, the integer value.
  59
  60 =head4 Integer Vtable Functions
  61
  62 =over 4
  63
  64 =item C<init()>
  65
  66 Initializes the C<Integer> to 0.
  67
  68 =item C<set_pmc(PMC *value)> and C<set_integer_same(PMC *value)>
  69
  70 Sets the C<Integer> to the integer value of the PMC argument.
  71
  72 =item C<set_integer_native(INTVAL value)>
  73
  74 Set the C<Integer> to the passed-in integer value.
  75
  76 =item C<set_number_native(FLOATVAL value)>, C<set_bool(INTVAL value)>,
  77       C<set_bigint_int(INTVAL value)>, C<set_string_native(STRING *value)>
  78
  79 Morphs the C<Integer> PMC to a C<Float>, C<Boolean>, C<BigInt>, or C<String>
  80 PMC, and sets the value from the passed in value.
  81
  82 {{NOTE: the morphing behavior is currently under consideration and may be
  83 rejected.}}
  84
  85 =item C<get_integer()>
  86
  87 Retrieves the integer value of the C<Integer>.
  88
  89 =item C<get_bool()>
  90
  91 Returns the boolean value of the C<Integer> (false if 0, true otherwise).
  92
  93 =item C<get_number()>
  94
  95 Returns the integer value of the C<Integer> as a floating-point number.
  96
  97 =item C<get_bigint()>
  98
  99 Returns the integer value of the C<Integer> in a new C<BigInt> PMC.
 100
 101 {{ NOTE: this vtable entry may be deprecated }}
 102
 103 =item C<get_string()> and C<get_repr()>
 104
 105 Returns the integer value of the C<Integer> as a string.
 106
 107 =item C<[add|subtract|multiply|divide|floor_divide|modulus|pow]_int(INTVAL b,
 108       PMC *dest)>
 109
 110 Adds/subtracts/multiplies/divides/moduluses/exponents an integer value with
 111 the C<Integer> PMC, and returns the result as a new PMC.  (The C<dest>
 112 parameter is unused). Overflow of the native integer storage auto-promotes the
 113 result PMC to a C<BigInt>.  Note that these are multidispatched.
 114
 115 =item C<i_[add|subtract|multiply|divide|floor_divide|modulus|pow]_int(INTVAL
 116         b)>
 117
 118 Adds/subtracts/multiplies/divides/moduluses/exponents an integer value with
 119 the C<Integer> PMC, and sets the C<Integer> to the resulting value. Overflow
 120 of the native integer storage auto-promotes the C<Integer> to a C<BigInt>.
 121 Note that these are multidispatched.
 122
 123 {{NOTE: there is some discussion of having this promotion of storage happen
 124 purely internally (perhaps by swapping vtables), rather than converting to a
 125 different PMC type.}}
 126
 127 =item C<i_[add|subtract|multiply|divide|floor_divide|modulus|pow]_float(INTVAL
 128        b)>
 129
 130 Add/subtract/multiply/divide/modulus/exponent an integer value with the the
 131 C<Integer> PMC, and set the C<Integer> to the resulting value, morphing it to
 132 a C<Float>.  Note that these are multidispatched.
 133
 134 =item C<increment()>
 135
 136 Adds 1 to the value of the integer.  This may autopromote the PMC to a
 137 C<BigInt>.
 138
 139 =item C<decrement()>
 140
 141 Subtracts 1 from the value of the integer.  This may autopromote the PMC to a
 142 C<BigInt>.
 143
 144 =item C<absolute()>
 145
 146 Returns an C<Integer> PMC set to the absolute value of the current C<Integer>.
 147
 148 =item C<i_absolute()>
 149
 150 Sets the C<Integer> to the absolute value of itself.
 151
 152 =item C<freeze()>
 153
 154 Freezes the C<Integer> PMC for storage.
 155
 156 =item C<thaw()>
 157
 158 Thaws the C<Integer> PMC from storage.
 159
 160
 161 =back
 162
 163 =head4 Integer Multis
 164
 165 Many of the math vtable functions are defined as multiple dispatch functions.
 166
 167 =over 4
 168
 169 =item C<[add|subtract|multiply|divide|floor_divide|modulus|pow](PMC *value,
 170       PMC *dest)>
 171
 172 Performs the addition/subtraction/multiplication/division/modulus/exponent
 173 operation, and returns a new PMC containing the resulting value. Multiple
 174 dispatch variants are defined for C<Integer>, C<Complex>, C<BigInt>,
 175 C<String>, and C<DEFAULT>.
 176
 177 Overflow of the native integer storage auto-promotes the result PMC to a
 178 C<BigInt>.
 179
 180 =item C<i_[add|subtract|multiply|divide|floor_divide|modulus|pow](PMC *value)>
 181
 182 Performs the addition/subtraction/multiplication/division/modulus/exponent
 183 operation, morphing the C<Integer> to the passed in type, and setting it to
 184 the result. Multiple dispatch variants are defined for C<Integer>, C<Complex>,
 185 C<BigInt>, and C<DEFAULT>.
 186
 187 Overflow of the native integer storage auto-promotes the C<Integer> to a
 188 C<BigInt>.
 189
 190 =item C<is_equal(PMC *value)>
 191
 192 Compares the C<Integer> to the passed in PMC, returning true (1) if they are
 193 equal, and false (0) otherwise. Multiple dispatch variants are defined for
 194 C<BigInt> and C<DEFAULT>. {{NOTE: Presumably the C<String>, C<Integer>, and
 195 C<Float> cases are all covered by C<DEFAULT>.}}
 196
 197 =item C<cmp(PMC *value)>
 198
 199 Compares the C<Integer> to the passed in PMC, returning 1 if C<Integer> is
 200 greater, -1 if the PMC is greater, and 0 if they are equal. Multiple dispatch
 201 variants are defined for C<String>, C<Float>, and C<DEFAULT>. {{NOTE:
 202 Presumably the C<Integer> and C<BigInt> cases are covered by C<DEFAULT>.}}
 203
 204 =item C<cmp_num(PMC *value)>
 205
 206 Compares the C<Integer> to the passed in PMC, returning 1 if C<Integer> is
 207 greater, -1 if the PMC is greater, and 0 if they are equal. Multiple dispatch
 208 variants are defined for C<String>, C<Float>, and C<DEFAULT>. {{NOTE:
 209 Presumably the C<Integer> and C<BigInt> cases are covered by C<DEFAULT>.}}
 210
 211 =back
 212
 213 =head4 Integer Methods
 214
 215 =over 4
 216
 217 =item C<get_as_base(INTVAL base)>
 218
 219 Converts the decimal integer to another base (anything from base 2 to base
 220 36), returning the result as a STRING.
 221
 222 =back
 223
 224 =head3 Float PMC
 225
 226 =head3 BigInt PMC
 227
 228 The bigint library provides Parrot with both a collection of (nearly)
 229 infinite precision numeric types and an implementation of an extended decimal
 230 arithmetic (EDA).
 231
 232 =head3 Why decimal arithmetic?
 233
 234 There are benefits in using the big number library to provide both values of
 235 effectively unlimited precision and a defined arithmetic, complete with
 236 rounding and exceptional conditions, for values which are otherwise easily
 237 represented using standard low-level types.  Both require the same range of
 238 operations but differ in the environment under which those operations occur.
 239 The effort required to produce a library which implements a decimal arithmetic
 240 is not much greater than that needed to provide a base-2 big number library.
 241 There is a trade-off in both space and speed, but given the nature of dynamic
 242 languages, this should not present too great a burden.
 243
 244 =head3 Numeric types provided
 245
 246 The bignumber library provides the following data types to Parrot:
 247
 248 =over 4
 249
 250 =item Big integers (BigInt)
 251
 252 Whole numbers with no limits on their size.
 253
 254 =item Big floats (BigNum)
 255
 256 Numbers with decimal fractional parts, again with no limit on size.
 257
 258 =item Big floats with fixed fractional parts
 259
 260 Numbers with a fixed maximum number of digits in their fractional part, again
 261 with no limit on size;. i.e BigRat.
 262
 263 =back
 264
 265 The library implements these different forms of numbers using the same
 266 internal representation, and differentiates between them only when performing
 267 rounding operations.  A number has the following abstract form:
 268
 269  [ sign, string of digits, exponent ]
 270
 271 If sign is zero, the number is positive. If equal to one, the number is
 272 negative.  The number has the value:
 273
 274  sign, string of digits * 10 ** exponent
 275
 276 A big integer must always have a non-negative exponent. A big float may have
 277 any exponent, and a float with a fixed fractional part will have an exponent
 278 greater than a given (negative) number.  These limits are not attached to a
 279 numeric value, but instead are enforced by giving any operation involving the
 280 numbers a I<context>.
 281
 282 In general, Parrot functions will not need to care about what the bignum
 283 objects are or do. They should merely be used as arguments to big number
 284 functions. The objects will be managed by Parrot's garbage collection in a
 285 similar manner to strings.
 286
 287 =head3 Special Values
 288
 289 Additionally the library provides special values which represent the result of
 290 otherwise undefined operations (division by zero, for instance).  Positive and
 291 negative infinity (C<Inf> or C<+Inf> and C<-Inf>, respectively) and both quiet
 292 and signalling Not a Number (C<NaN>) are available.  In general, the result of
 293 an operation with at least one argument which is C<NaN> will be C<NaN>. If the
 294 argument is a signalling C<NaN>, an exception will also be raised.  See the
 295 EDA for full details.
 296
 297 =head3 Context
 298
 299 All operations occur within a defined context.  This tells the operations how
 300 they should treat their arguments, what sort of rounding to perform, and what
 301 to do if rounding loses information.
 302
 303 The context provides the environment in which an operation occurs, in
 304 particular the following options are available:
 305
 306 =over 4
 307
 308 =item precision
 309
 310 A positive I<precision> requires the use of big floats. These cannot have more
 311 than I<precision> digits in their coefficient before or after any operation.
 312 Arguments to operations with more than I<precision> digits will be truncated
 313 and rounded appropriately.  Results of operations will not have more than
 314 I<precision> digits in their coefficients, with any extra digits accumulated
 315 during the calculation of the operation being truncated and rounded as
 316 required.
 317
 318 A I<precision> of zero requires the use of integer operations.  Arguments to
 319 operations are rounded so that they have no fractional part, and the result of
 320 all operations will be rounded to be integers.
 321
 322 A negative value of I<precision> requires the use of a fixed number of
 323 fractional digits, with arguments and results being truncated after those
 324 digits.
 325
 326 With non-positive values of I<precision>, the total number of digits in the
 327 coefficient is limited only by available memory.
 328
 329 =item rounding
 330
 331 The rounding part of the context defines the rounding algorithm to apply when
 332 truncating digits from a number's coefficient. The available rounding forms
 333 are outlined below.
 334
 335 =item traps and flags
 336
 337 The I<traps> part of the context defines how the library raises exceptions.
 338 Seven distinct classes of error can occur. If the corresponding trap is set
 339 (enabled), the library raises an exception.  Otherwise, execution continues
 340 with the exception class recorded in flags.  For more details, see the
 341 extended decimal arithmetic standard.
 342
 343 =back
 344
 345 The current I<context> determines the numeric type during a particular
 346 operation. This makes it easy to upgrade from one numeric form to another and
 347 also allows for considerable code-reuse within the library.
 348
 349 =head3 Exception Classes
 350
 351 The following exception classes are available:
 352
 353 =over 4
 354
 355 =item Lost Digits
 356
 357 Non-zero digits have been removed from an argument to a function during
 358 rounding before the operation.
 359
 360 =item Division By Zero
 361
 362 Division by zero was attempted.
 363
 364 =item Inexact
 365
 366 Because arguments were rounded, or because the result of an operation has lost
 367 significant digits, the result is inexact.
 368
 369 =item Invalid Operation
 370
 371 An invalid operation was attempted, for instance when C<NaN> is present as an
 372 argument to a function.  This also covers recoverable errors such as 0/0,
 373 which signals Invalid Operation and can return C<NaN>.
 374
 375 =item Overflow
 376
 377 The exponent of a number has overflowed.
 378
 379 =item Rounded
 380
 381 An argument has been rounded.
 382
 383 =item Underflow
 384
 385 The exponent of a number has underflowed.
 386
 387 =back
 388
 389 =head3 Rounding
 390
 391 The rounding part of the context defines the rounding algorithm to used.  The
 392 following contexts are available (examples assume a precision of 5):
 393
 394 =over 4
 395
 396 =item Round down
 397
 398 Any unwanted digits are simply truncated from the coefficient.  This rounds
 399 towards zero.
 400
 401  [0, 1234567, 10] => [0, 12345, 12]
 402
 403 =item Round half up
 404
 405 The first lost digit is examined. If this is in the range 0-4, the coefficient
 406 is truncated directly. If in the range 5-9, one is added to the final digit of
 407 the coefficient.  If this leads to a coefficient with more than I<precision>
 408 digits, the number is rounded again, removing the trailing zero.  This is
 409 essentially rounding to nearest.
 410
 411  [0, 1234567, 10] => [0, 12346, 12]
 412  [0, 1234549, 10] => [0, 12345, 12]
 413  [0, 9999950, 10] => [0, 10000, 13]
 414
 415 =item Round half even
 416
 417 The first lost digit is examined. If it lies in the range 0-4, the coefficient
 418 is truncated directly. If in the range 6-9, the coefficient is rounded up.  If
 419 the first lost digit is equal to 5 and the remaining lost digits in the
 420 coefficient are non-zero, the number is also rounded up.  If the lost digits
 421 are equal to exactly half, the number is rounded up if the least significant
 422 retained digit is odd, and rounded down if it is even.
 423
 424 =item Round Floor
 425
 426 If the digits to be discarded are non zero and the number is negative, the
 427 coefficient is rounded up, otherwise it remains the same.
 428
 429 This is rounding towards C<-Inf>.
 430
 431 =item Round Ceiling
 432
 433 If the digits to be discarded are non zero, and the number is positive, the
 434 coefficient is rounded up, otherwise it remains the same.
 435
 436 This is rounding towards C<Inf>.
 437
 438 =back
 439
 440 =head3 Operations
 441
 442 The library provides the following operations. They function exactly as those
 443 described in the Standard Decimal Arithmetic (SDA), with some extension to
 444 cope with integer and fixed fractional part numbers.  Only the deviations are
 445 outlined here.
 446
 447 In all cases, the sequence of rounding and promotion to zero outlined by the
 448 SDA are followed, even where the context implies integer operations.
 449
 450 =over 4
 451
 452 =item Addition, Subtraction
 453
 454 =item Multiplication
 455
 456 =item Division
 457
 458 Under integer conditions, division halts once the first fractional digit is
 459 calculated, with the result rounded to an integer and returned.  Under
 460 fixed-fraction conditions, one more digit than needed is calculated, with the
 461 coefficient then rounded and returned.
 462
 463 If a floating point value is required, or if inexact division by a very small
 464 number is attempted, it may be wise to follow big float arithmetic to limit
 465 the number of digits returned.  It is safe to chose a precision at least as
 466 large as the largest number of digits of either argument to the division
 467 function.
 468
 469 =item Integer division, Remainder
 470
 471 For both integer and fixed-fraction numbers, the result returned by the
 472 remainder function will be an integer or fixed-fraction number. The result of
 473 integer division will be an integer.
 474
 475 =item Rounding
 476
 477 =item Plus / Minus
 478
 479 =item Comparison
 480
 481 Comparison returns a big number which is equal to 1, 0, or -1 if the first
 482 argument is larger, equal to, or smaller than the second.  An alternate form
 483 returns an INTVAL.
 484
 485 =item Rescale
 486
 487 =item Power
 488
 489 =item Square Root
 490
 491 =back
 492
 493 =head3 Conversion to and from strings
 494
 495 A one to one conversion between the abstract representation above and a string
 496 is provided by the library, and acts as defined by the standard decimal
 497 arithmetic.  Other conversation operations may also be implemented; these may
 498 not provide one to one mapping.
 499
 500 A pedantic error checking conversion is available within the library, but only
 501 works with native strings.  Versions which work with Parrot STRINGs will also
 502 be provided, although in a separate file to the rest of the library.  (They
 503 will share a common private header file).
 504
 505 =head2 Implementation
 506
 507 Functions are provided which implement the arithmetic, conversion, creation
 508 and destruction of big numbers by dealing with otherwise opaque big number
 509 objects.
 510
 511 =head3 Big number representation
 512
 513 A big number is represented by the following structure, capable of being
 514 allocated, tracked, and destroyed by the Parrot garbage collection system.
 515
 516  typedef struct {
 517     BN_NIB *buffer; /* string of nibbles */
 518     UINTVAL nibs;   /* nibs allocated, in sizeof(BN_NIB) */
 519     UINTVAL flags;  /* private flags store: 001 Inf,  010 qNAN, 110 sNAN */
 520     INTVAL  digits; /* digits used */
 521     INTVAL  expn;   /* exponent of number */
 522     int     sign;   /* sign of number, 0 => positive or zero, 1 => negative */
 523  } parrot_bignum_t;
 524
 525 Within the library, individual decimal digits can be accessed using macros.
 526 Outside the library, access must be made via exported functions.  BN_NIB is
 527 likely to be a UINTVAL, but this is not essential.
 528
 529 Special values are represented by setting I<digits> to zero and setting
 530 appropriate private I<flags>, using internal macros.  Infinity has one flag
 531 field, NaN another flag field, and sNaN a third.  In general the flags should
 532 not be examined directly, even within the module.
 533
 534 =head3 Context
 535
 536  typedef struct {
 537     INTVAL        precision;  /* number of digs to retain */
 538     BN_ROUNDING   rounding;   /* rounding type to perform */
 539     BOOLVAL       extended;   /* do we use extended or base semantics? */
 540     unsigned char flags;      /* records possible errors */
 541     unsigned char traps;      /* throw errors or not? */
 542  } parrot_bignum_context;
 543
 544 I<BN_ROUNDING> is an enumeration of the possible rounding types as described
 545 earlier.  I<traps> is a bitmask of exception traps. 0 implies that a trap is
 546 disabled and 1 implies it is enabled.  I<flags> is a bitmask which records
 547 exceptional conditions and has the same fields at I<flags>.
 548
 549 Language level types should implement big floats using a global floating point
 550 context available in an interpreter structure (and accessible).  Big integers
 551 and fixed-fraction number are provided by creating a context with an
 552 appropriate precision whenever a call into the library is made.
 553
 554 =head3 Exceptional Conditions
 555
 556 When the module raises an exceptional condition, control passes to
 557 C<BN_nonfatal()>. this examines the error which has occurred and the current
 558 context to determine which class of error has occurred. If the corresponding
 559 trap handler is not enabled, the context's flags are updated and control is
 560 returned to the bignumber library. Otherwise the exception becomes fatal.  How
 561 this mechanism interacts with Parrot's own is yet to be decided.
 562
 563 The possible exceptions are detailed in the extended decimal arithmetic.
 564
 565 =head2 Tests
 566
 567 The Standard Decimal Arithmetic provides a collection of tests for both its
 568 base and extended behavior.
 569
 570 =head2 TODO
 571
 572 Fill in the remaining functions from the EDA, verify that the test suite still
 573 passes, integrate the library into the rest of Parrot, provide PMC types and
 574 suitable opcodes.  Conversion to and from Parrot strings, conversion to and
 575 from floating point types, sprintf output of bignumbers.
 576
 577 =head2 Attachments
 578
 579 =head2 Footnotes
 580
 581 =head2 References
 582
 583 IBM's Standard Decimal Arithmetic, with tests
 584 (L<http://speleotrove.com/decimal/>)
 585
 586 The Perl modules Math::BigInt and Math::BigFloat.
 587
 588 Alex Gough's suggestions for bigint/bignum implementation.
 589
 590 GNU gmp. That's we currently use: mpz and mpf.
 591
 592 =cut
 593
 594 __END__
 595 Local Variables:
 596   fill-column:78
 597 End:
 598 vim: expandtab tw=78 shiftwidth=4: