Network Working Group                                     M. Eisler, Ed.
Request for Comments: 4506                       Network Appliance, Inc.
Category: Standards Track

             XDR: External Data Representation Standard

Status of This Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2006).

Abstract

This document describes the External Data Representation Standard
(XDR) protocol as it is currently deployed and accepted. This
document obsoletes RFC 1832.

Eisler                      Standards Track                     [Page 1]

RFC 4506       XDR: External Data Representation Standard       May 2006

Table of Contents

   1. Introduction
   2. Changes from RFC 1832
   3. Basic Block Size
   4. XDR Data Types
      4.1. Integer
      4.2. Unsigned Integer
      4.3. Enumeration
      4.4. Boolean
      4.5. Hyper Integer and Unsigned Hyper Integer
      4.6. Floating-Point
      4.7. Double-Precision Floating-Point
      4.8. Quadruple-Precision Floating-Point
      4.9. Fixed-Length Opaque Data
      4.10. Variable-Length Opaque Data
      4.11. String
      4.12. Fixed-Length Array
      4.13. Variable-Length Array
      4.14. Structure
      4.15. Discriminated Union
      4.16. Void
      4.17. Constant
      4.18. Typedef
      4.19. Optional-Data
      4.20. Areas for Future Enhancement
   5. Discussion
   6. The XDR Language Specification
      6.1. Notational Conventions
      6.2. Lexical Notes
      6.3. Syntax Information
      6.4. Syntax Notes
   7. An Example of an XDR Data Description
   8. Security Considerations
   9. IANA Considerations
   10. Trademarks and Owners
   11. ANSI/IEEE Standard 754-1985
   12. Normative References
   13. Informative References
   14. Acknowledgements

1. Introduction

XDR is a standard for the description and encoding of data. It is
useful for transferring data between different computer
architectures, and it has been used to communicate data between such
diverse machines as the SUN WORKSTATION*, VAX*, IBM-PC*, and Cray*.
XDR fits into the ISO presentation layer and is roughly analogous in
purpose to X.409, ISO Abstract Syntax Notation. The major difference
between these two is that XDR uses implicit typing, while X.409 uses
explicit typing.

XDR uses a language to describe data formats. The language can be
used only to describe data; it is not a programming language. This
language allows one to describe intricate data formats in a concise
manner. The alternative of using graphical representations (itself
an informal language) quickly becomes incomprehensible when faced
with complexity. The XDR language itself is similar to the C
language [KERN], just as Courier [COUR] is similar to Mesa.
Protocols such as ONC RPC (Remote Procedure Call) and the NFS*
(Network File System) use XDR to describe the format of their data.

The XDR standard makes the following assumption: that bytes (or
octets) are portable, where a byte is defined as 8 bits of data. A
given hardware device should encode the bytes onto the various media
in such a way that other hardware devices may decode the bytes
without loss of meaning. For example, the Ethernet* standard
suggests that bytes be encoded in "little-endian" style [COHE], or
least significant bit first.

2. Changes from RFC 1832

This document makes no technical changes to RFC 1832 and is published
for the purposes of noting IANA considerations, augmenting security
considerations, and distinguishing normative from informative
references.

3. Basic Block Size

The representation of all items requires a multiple of four bytes (or
32 bits) of data. The bytes are numbered 0 through n-1. The bytes
are read or written to some byte stream such that byte m always
precedes byte m+1. If the n bytes needed to contain the data are not
a multiple of four, then the n bytes are followed by enough (0 to 3)
residual zero bytes, r, to make the total byte count a multiple of 4.

We include the familiar graphic box notation for illustration and
comparison. In most illustrations, each box (delimited by a plus
sign at the 4 corners and vertical bars and dashes) depicts a byte.
Ellipses (...) between boxes show zero or more additional bytes where
required.

      +--------+--------+...+--------+--------+...+--------+
      | byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |   BLOCK
      +--------+--------+...+--------+--------+...+--------+
      |<-----------n bytes---------->|<------r bytes------>|
      |<-----------n+r (where (n+r) mod 4 = 0)>----------->|
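
The padding rule above can be sketched in a few lines of Python. This is an illustration only and not part of the standard; the function name xdr_pad is an assumption of this example.

```python
def xdr_pad(data: bytes) -> bytes:
    """Pad n data bytes with r zero bytes so that (n + r) mod 4 == 0."""
    r = (4 - len(data) % 4) % 4  # r is always in the range 0..3
    return data + b"\x00" * r


# Five data bytes need three residual zero bytes to fill two blocks.
padded = xdr_pad(b"abcde")
```

Data whose length is already a multiple of four is returned unchanged, since r is then zero.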

4. XDR Data Types

Each of the sections that follow describes a data type defined in the
XDR standard, shows how it is declared in the language, and includes
a graphic illustration of its encoding.

For each data type in the language we show a general paradigm
declaration. Note that angle brackets (< and >) denote
variable-length sequences of data and that square brackets ([ and ])
denote fixed-length sequences of data. "n", "m", and "r" denote
integers. For the full language specification and more formal
definitions of terms such as "identifier" and "declaration", refer to
Section 6, "The XDR Language Specification".

For some data types, more specific examples are included. A more
extensive example of a data description is in Section 7, "An Example
of an XDR Data Description".

4.1. Integer

An XDR signed integer is a 32-bit datum that encodes an integer in
the range [-2147483648,2147483647]. The integer is represented in
two's complement notation. The most and least significant bytes are
0 and 3, respectively. Integers are declared as follows:

      int identifier;

        (MSB)                   (LSB)
      +-------+-------+-------+-------+
      |byte 0 |byte 1 |byte 2 |byte 3 |                      INTEGER
      +-------+-------+-------+-------+
      <------------32 bits------------>
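
As an illustration, the integer encoding can be sketched with Python's struct module, where the format ">i" means a big-endian, two's complement 32-bit value. The helper names encode_int and decode_int are assumptions of this example, not names defined by the standard.

```python
import struct

INT_MIN, INT_MAX = -2147483648, 2147483647


def encode_int(value: int) -> bytes:
    """Encode a signed 32-bit integer, most significant byte first."""
    if not INT_MIN <= value <= INT_MAX:
        raise ValueError("value does not fit in an XDR int")
    return struct.pack(">i", value)


def decode_int(data: bytes) -> int:
    """Decode the first four bytes of data as an XDR int."""
    return struct.unpack(">i", data[:4])[0]
```

For instance, encode_int(1) places the value in byte 3, the least significant byte, and leaves bytes 0 through 2 zero.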

4.2. Unsigned Integer

An XDR unsigned integer is a 32-bit datum that encodes a non-negative
integer in the range [0,4294967295]. It is represented by an
unsigned binary number whose most and least significant bytes are 0
and 3, respectively. An unsigned integer is declared as follows:

      unsigned int identifier;

        (MSB)                   (LSB)
      +-------+-------+-------+-------+
      |byte 0 |byte 1 |byte 2 |byte 3 |             UNSIGNED INTEGER
      +-------+-------+-------+-------+
      <------------32 bits------------>
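
Because XDR carries no type information on the wire, the same four bytes can be read as either a signed or an unsigned integer; only the data description tells the decoder which one is meant. The following Python sketch illustrates this. The helper name encode_uint is an assumption of this example.

```python
import struct


def encode_uint(value: int) -> bytes:
    """Encode an unsigned 32-bit integer, most significant byte first."""
    if not 0 <= value <= 4294967295:
        raise ValueError("value does not fit in an XDR unsigned int")
    return struct.pack(">I", value)


raw = encode_uint(4294967295)
as_unsigned = struct.unpack(">I", raw)[0]  # read back as unsigned int
as_signed = struct.unpack(">i", raw)[0]    # same bytes read as int
```

Here raw holds four 0xFF bytes, which read back as 4294967295 when unsigned and as -1 when signed.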

4.3. Enumeration

Enumerations have the same representation as signed integers.
Enumerations are handy for describing subsets of the integers.
Enumerated data is declared as follows:

      enum { name-identifier = constant, ... } identifier;

For example, the three colors red, yellow, and blue could be
described by an enumerated type:

      enum { RED = 2, YELLOW = 3, BLUE = 5 } colors;

It is an error to encode as an enum any integer other than those that
have been given assignments in the enum declaration.
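
One way to enforce the rule above is to validate the value against the declared set before encoding it. The Python sketch below mirrors the "colors" example; the class name Color and the function encode_color are assumptions of this illustration.

```python
import enum
import struct


class Color(enum.IntEnum):
    RED = 2
    YELLOW = 3
    BLUE = 5


def encode_color(value: int) -> bytes:
    """Encode value as an enum, rejecting integers with no assignment."""
    member = Color(value)  # raises ValueError for unassigned integers
    return struct.pack(">i", member)
```

With these definitions, encode_color(3) succeeds, while encode_color(4) raises ValueError because 4 has no assignment in the declaration.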

4.4. Boolean

Booleans are important enough and occur frequently enough to warrant
their own explicit type in the standard. Booleans are declared as
follows:

      bool identifier;

This is equivalent to:

      enum { FALSE = 0, TRUE = 1 } identifier;

4.5. Hyper Integer and Unsigned Hyper Integer

The standard also defines 64-bit (8-byte) numbers called hyper
integers and unsigned hyper integers. Their representations are the
obvious extensions of integer and unsigned integer defined above.
They are represented in two's complement notation. The most and
least significant bytes are 0 and 7, respectively. Their
declarations:

      hyper identifier;
      unsigned hyper identifier;

        (MSB)                                                   (LSB)
      +-------+-------+-------+-------+-------+-------+-------+-------+
      |byte 0 |byte 1 |byte 2 |byte 3 |byte 4 |byte 5 |byte 6 |byte 7 |
      +-------+-------+-------+-------+-------+-------+-------+-------+
      <----------------------------64 bits---------------------------->
                                                 HYPER INTEGER
                                                 UNSIGNED HYPER INTEGER
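
The 64-bit types extend the 32-bit encodings directly. As a sketch, Python's struct formats ">q" and ">Q" produce the signed and unsigned 8-byte big-endian layouts; the helper names below are assumptions of this example.

```python
import struct


def encode_hyper(value: int) -> bytes:
    """Encode a signed 64-bit integer; struct.error if out of range."""
    return struct.pack(">q", value)


def encode_unsigned_hyper(value: int) -> bytes:
    """Encode an unsigned 64-bit integer; struct.error if out of range."""
    return struct.pack(">Q", value)
```

As with the 32-bit integer, byte 0 is the most significant byte and byte 7 the least significant.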

4.6. Floating-Point

The standard defines the floating-point data type "float" (32 bits or
4 bytes). The encoding used is the IEEE standard for normalized
single-precision floating-point numbers [IEEE]. The following three
fields describe the single-precision floating-point number:

   S: The sign of the number. Values 0 and 1 represent positive and
      negative, respectively. One bit.

   E: The exponent of the number, base 2. 8 bits are devoted to this
      field. The exponent is biased by 127.

   F: The fractional part of the number's mantissa, base 2. 23 bits
      are devoted to this field.

Therefore, the floating-point number is described by:

      (-1)**S * 2**(E-Bias) * 1.F

It is declared as follows:

      float identifier;

      +-------+-------+-------+-------+
      |byte 0 |byte 1 |byte 2 |byte 3 |              SINGLE-PRECISION
      S|   E   |           F          |         FLOATING-POINT NUMBER
      +-------+-------+-------+-------+
      1|<- 8 ->|<-------23 bits------>|
      <------------32 bits------------>

Just as the most and least significant bytes of a number are 0 and 3,
the most and least significant bits of a single-precision
floating-point number are 0 and 31. The beginning bit (and most
significant bit) offsets of S, E, and F are 0, 1, and 9,
respectively. Note that these numbers refer to the mathematical
positions of the bits, and NOT to their actual physical locations
(which vary from medium to medium).

The IEEE specifications should be consulted concerning the encoding
for signed zero, signed infinity (overflow), and denormalized numbers
(underflow) [IEEE]. According to IEEE specifications, the "NaN" (not
a number) is system dependent and should not be interpreted within
XDR as anything other than "NaN".
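
To make the S, E, and F fields concrete, the Python sketch below splits an encoded float into its three fields and evaluates the formula given above. It covers normalized numbers only; the names float_fields and from_fields are assumptions of this example.

```python
import struct

FLOAT_BIAS = 127


def float_fields(value: float) -> tuple:
    """Return (S, E, F) for the single-precision encoding of value."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    s = bits >> 31           # bit 0, the most significant bit
    e = (bits >> 23) & 0xFF  # bits 1 through 8
    f = bits & 0x7FFFFF      # bits 9 through 31
    return s, e, f


def from_fields(s: int, e: int, f: int) -> float:
    """Evaluate (-1)**S * 2**(E-Bias) * 1.F for a normalized number."""
    return (-1) ** s * 2.0 ** (e - FLOAT_BIAS) * (1 + f / 2 ** 23)
```

For the value -1.5, the fields are S = 1, E = 127, and F = 0x400000 (only the first fraction bit set), and from_fields(1, 127, 0x400000) gives back -1.5.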

4.7. Double-Precision Floating-Point

The standard defines the encoding for the double-precision
floating-point data type "double" (64 bits or 8 bytes). The encoding
used is the IEEE standard for normalized double-precision
floating-point numbers [IEEE]. The standard encodes the following
three fields, which describe the double-precision floating-point
number:

   S: The sign of the number. Values 0 and 1 represent positive and
      negative, respectively. One bit.

   E: The exponent of the number, base 2. 11 bits are devoted to
      this field. The exponent is biased by 1023.

   F: The fractional part of the number's mantissa, base 2. 52 bits
      are devoted to this field.

Therefore, the floating-point number is described by:

      (-1)**S * 2**(E-Bias) * 1.F

It is declared as follows:

      double identifier;

      +------+------+------+------+------+------+------+------+
      |byte 0|byte 1|byte 2|byte 3|byte 4|byte 5|byte 6|byte 7|
      S|    E   |                    F                        |
      +------+------+------+------+------+------+------+------+
      1|<--11-->|<-----------------52 bits------------------->|
      <-----------------------64 bits------------------------->
                                     DOUBLE-PRECISION FLOATING-POINT

Just as the most and least significant bytes of a number are 0 and 3,
the most and least significant bits of a double-precision
floating-point number are 0 and 63. The beginning bit (and most
significant bit) offsets of S, E, and F are 0, 1, and 12,
respectively. Note that these numbers refer to the mathematical
positions of the bits, and NOT to their actual physical locations
(which vary from medium to medium).

The IEEE specifications should be consulted concerning the encoding
for signed zero, signed infinity (overflow), and denormalized numbers
(underflow) [IEEE]. According to IEEE specifications, the "NaN" (not
a number) is system dependent and should not be interpreted within
XDR as anything other than "NaN".

4.8. Quadruple-Precision Floating-Point

The standard defines the encoding for the quadruple-precision
floating-point data type "quadruple" (128 bits or 16 bytes). The
encoding used is designed to be a simple analog of the encoding used
for single- and double-precision floating-point numbers using one
form of IEEE double extended precision. The standard encodes the
following three fields, which describe the quadruple-precision
floating-point number:

   S: The sign of the number. Values 0 and 1 represent positive and
      negative, respectively. One bit.

   E: The exponent of the number, base 2. 15 bits are devoted to
      this field. The exponent is biased by 16383.

   F: The fractional part of the number's mantissa, base 2. 112 bits
      are devoted to this field.

Therefore, the floating-point number is described by:

      (-1)**S * 2**(E-Bias) * 1.F

It is declared as follows:

      quadruple identifier;

      +------+------+------+------+------+------+-...--+------+
      |byte 0|byte 1|byte 2|byte 3|byte 4|byte 5| ...  |byte15|
      S|    E       |                  F                      |
      +------+------+------+------+------+------+-...--+------+
      1|<----15---->|<-------------112 bits------------------>|
      <-----------------------128 bits------------------------>
                                  QUADRUPLE-PRECISION FLOATING-POINT

Just as the most and least significant bytes of a number are 0 and 3,
the most and least significant bits of a quadruple-precision
floating-point number are 0 and 127. The beginning bit (and most
significant bit) offsets of S, E, and F are 0, 1, and 16,
respectively. Note that these numbers refer to the mathematical
positions of the bits, and NOT to their actual physical locations
(which vary from medium to medium).

The encoding for signed zero, signed infinity (overflow), and
denormalized numbers are analogs of the corresponding encodings for
single and double-precision floating-point numbers [SPAR], [HPRE].
The "NaN" encoding as it applies to quadruple-precision
floating-point numbers is system dependent and should not be
interpreted within XDR as anything other than "NaN".
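
Many languages have no native 128-bit floating-point type, so an implementation may need to assemble the 16 bytes from the S, E, and F fields directly. The Python sketch below does this with integer arithmetic. It only packs the fields, performs no rounding or conversion from other formats, and the name encode_quad_fields is an assumption of this example.

```python
QUAD_BIAS = 16383


def encode_quad_fields(s: int, e: int, f: int) -> bytes:
    """Pack a 1-bit S, 15-bit E, and 112-bit F into 16 big-endian bytes."""
    if s not in (0, 1) or not 0 <= e < 2 ** 15 or not 0 <= f < 2 ** 112:
        raise ValueError("field out of range")
    bits = (s << 127) | (e << 112) | f  # S at bit 0, E at 1, F at 16
    return bits.to_bytes(16, "big")


# 1.0 is (-1)**0 * 2**(16383 - 16383) * 1.0, so S = 0, E = bias, F = 0.
one = encode_quad_fields(0, QUAD_BIAS, 0)
```

The resulting encoding of 1.0 is the two bytes 0x3F 0xFF followed by fourteen zero bytes.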

4.9. Fixed-Length Opaque Data

At times, fixed-length uninterpreted data needs to be passed among
machines. This data is called "opaque" and is declared as follows:

      opaque identifier[n];

where the constant n is the (static) number of bytes necessary to
contain the opaque data. If n is not a multiple of four, then the n
bytes are followed by enough (0 to 3) residual zero bytes, r, to make
the total byte count of the opaque object a multiple of four.

          0        1     ...
      +--------+--------+...+--------+--------+...+--------+
      | byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |
      +--------+--------+...+--------+--------+...+--------+
      |<-----------n bytes---------->|<------r bytes------>|
      |<-----------n+r (where (n+r) mod 4 = 0)------------>|
                                                   FIXED-LENGTH OPAQUE
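
Since n is fixed by the declaration, no length is transmitted; the encoder only checks the size and pads. A minimal Python sketch follows, in which the function name encode_fixed_opaque is an assumption of this example.

```python
def encode_fixed_opaque(data: bytes, n: int) -> bytes:
    """Encode exactly n opaque bytes, padded to a multiple of four."""
    if len(data) != n:
        raise ValueError("opaque data must be exactly n bytes long")
    return data + b"\x00" * ((4 - n % 4) % 4)
```

For a declaration such as "opaque cookie[3];" (a hypothetical name), three data bytes are followed by one zero byte.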

4.10. Variable-Length Opaque Data

The standard also provides for variable-length (counted) opaque data.
A sequence of n (numbered 0 through n-1) arbitrary bytes is encoded
as the number n, itself encoded as an unsigned integer (as described
above), followed by the n bytes of the sequence.

Byte m of the sequence always precedes byte m+1 of the sequence, and
byte 0 of the sequence always follows the sequence's length (count).
If n is not a multiple of four, then the n bytes are followed by
enough (0 to 3) residual zero bytes, r, to make the total byte count
a multiple of four. Variable-length opaque data is declared in the
following way:

      opaque identifier<m>;
   or
      opaque identifier<>;

The constant m denotes an upper bound of the number of bytes that the
sequence may contain. If m is not specified, as in the second
declaration, it is assumed to be (2**32) - 1, the maximum length.
The constant m would normally be found in a protocol specification.
For example, a filing protocol may state that the maximum data
transfer size is 8192 bytes, as follows:

      opaque filedata<8192>;

         0     1     2     3     4     5   ...
      +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
      |        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
      +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
      |<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
                              |<----n+r (where (n+r) mod 4 = 0)---->|
                                               VARIABLE-LENGTH OPAQUE

It is an error to encode a length greater than the maximum described
in the specification.
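
The counted encoding can be sketched as a length prefix, the data, and the padding. In the Python illustration below, encode_opaque and decode_opaque are assumed names; decode_opaque also returns the offset of the next item so that a caller can continue reading the stream.

```python
import struct

MAX_LENGTH = 2 ** 32 - 1


def encode_opaque(data: bytes, m: int = MAX_LENGTH) -> bytes:
    """Encode counted opaque data with declared upper bound m."""
    n = len(data)
    if n > m:
        raise ValueError("length exceeds the declared maximum")
    return struct.pack(">I", n) + data + b"\x00" * ((4 - n % 4) % 4)


def decode_opaque(buf: bytes, m: int = MAX_LENGTH) -> tuple:
    """Return (data, offset of the next item) from the start of buf."""
    (n,) = struct.unpack(">I", buf[:4])
    if n > m:
        raise ValueError("length exceeds the declared maximum")
    end = 4 + n
    r = (4 - n % 4) % 4
    if len(buf) < end + r:
        raise ValueError("buffer is truncated")
    return buf[4:end], end + r
```

Using the "filedata" declaration above, encode_opaque(b"abcde", 8192) produces a 12-byte encoding: the length 5, the five data bytes, and three zero bytes.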

4.11. String

The standard defines a string of n (numbered 0 through n-1) ASCII
bytes to be the number n encoded as an unsigned integer (as described
above), and followed by the n bytes of the string. Byte m of the
string always precedes byte m+1 of the string, and byte 0 of the
string always follows the string's length. If n is not a multiple of
four, then the n bytes are followed by enough (0 to 3) residual zero
bytes, r, to make the total byte count a multiple of four. Counted
byte strings are declared as follows:

      string object<m>;
   or
      string object<>;

The constant m denotes an upper bound of the number of bytes that a
string may contain. If m is not specified, as in the second
declaration, it is assumed to be (2**32) - 1, the maximum length.
The constant m would normally be found in a protocol specification.
For example, a filing protocol may state that a file name can be no
longer than 255 bytes, as follows:

      string filename<255>;

         0     1     2     3     4     5   ...
      +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
      |        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
      +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
      |<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
                              |<----n+r (where (n+r) mod 4 = 0)---->|
                                                                STRING

It is an error to encode a length greater than the maximum described
in the specification.
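
A string is encoded like counted opaque data, with the added constraint that the bytes are ASCII. A Python sketch for the "filename" declaration follows; the function name encode_string is an assumption of this example.

```python
import struct


def encode_string(text: str, m: int = 2 ** 32 - 1) -> bytes:
    """Encode an ASCII string with declared upper bound m."""
    data = text.encode("ascii")  # UnicodeEncodeError for non-ASCII text
    n = len(data)
    if n > m:
        raise ValueError("length exceeds the declared maximum")
    return struct.pack(">I", n) + data + b"\x00" * ((4 - n % 4) % 4)


encoded = encode_string("sillyprog", 255)
```

The nine-byte name "sillyprog" (a made-up file name) takes 16 bytes on the wire: four for the length, nine for the text, and three of padding.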

4.12. Fixed-Length Array

Declarations for fixed-length arrays of homogeneous elements are in
the following form:

      type-name identifier[n];

Fixed-length arrays of elements numbered 0 through n-1 are encoded by
individually encoding the elements of the array in their natural
order, 0 through n-1. Each element's size is a multiple of four
bytes. Though all elements are of the same type, the elements may
have different sizes. For example, in a fixed-length array of
strings, all elements are of type "string", yet each element will
vary in its length.

      +---+---+---+---+---+---+---+---+...+---+---+---+---+
      |   element 0   |   element 1   |...|  element n-1  |
      +---+---+---+---+---+---+---+---+...+---+---+---+---+
      |<--------------------n elements------------------->|
                                            FIXED-LENGTH ARRAY
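
The point that elements of one type may differ in encoded size can be seen in a short Python sketch. The helpers encode_string and encode_fixed_array are assumptions of this example; no element count is written, because n is fixed by the declaration.

```python
import struct


def encode_string(text: str) -> bytes:
    """Encode an ASCII string as length, bytes, and zero padding."""
    data = text.encode("ascii")
    pad = b"\x00" * ((4 - len(data) % 4) % 4)
    return struct.pack(">I", len(data)) + data + pad


def encode_fixed_array(items, n, encode_item) -> bytes:
    """Encode exactly n elements in order, with no count prefix."""
    if len(items) != n:
        raise ValueError("a fixed-length array needs exactly n elements")
    return b"".join(encode_item(item) for item in items)


names = encode_fixed_array(["a", "hello"], 2, encode_string)
```

Here the element "a" occupies 8 bytes and "hello" occupies 12, so the two-element array is 20 bytes long.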

4.13. Variable-Length Array

Counted arrays provide the ability to encode variable-length arrays
of homogeneous elements. The array is encoded as the element count n
(an unsigned integer) followed by the encoding of each of the array's
elements, starting with element 0 and progressing through element
n-1. The declaration for variable-length arrays follows this form:

      type-name identifier<m>;
   or
      type-name identifier<>;

The constant m specifies the maximum acceptable element count of an
array; if m is not specified, as in the second declaration, it is
assumed to be (2**32) - 1.

        0  1  2  3
      +--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
      |     n     | element 0 | element 1 |...|element n-1|
      +--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
      |<-4 bytes->|<--------------n elements------------->|
                                                  COUNTED ARRAY

It is an error to encode a value of n that is greater than the
maximum described in the specification.
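
A counted array differs from a fixed-length array only in the leading element count. The Python sketch below encodes a hypothetical declaration "int values<10>;"; the helper names are assumptions of this example.

```python
import struct


def encode_int(value: int) -> bytes:
    """Encode a signed 32-bit integer, most significant byte first."""
    return struct.pack(">i", value)


def encode_array(items, m, encode_item) -> bytes:
    """Encode the element count n, then elements 0 through n-1."""
    n = len(items)
    if n > m:
        raise ValueError("element count exceeds the declared maximum")
    return struct.pack(">I", n) + b"".join(encode_item(i) for i in items)
```

Note that the count is the number of elements, not the number of bytes: encode_array([1, 2], 10, encode_int) starts with the count 2 and is 12 bytes long.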

4.14. Structure

Structures are declared as follows:

      struct {
         component-declaration-A;
         component-declaration-B;
         ...
      } identifier;

The components of the structure are encoded in the order of their
declaration in the structure. Each component's size is a multiple of
four bytes, though the components may be different sizes.

      +-------------+-------------+...
      | component A | component B |...                      STRUCTURE
      +-------------+-------------+...
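
As an illustration, consider a hypothetical structure "struct { int id; string name<32>; } user;" (not taken from the standard). Encoding it amounts to concatenating the encodings of its components in declaration order. The Python names below are assumptions of this sketch.

```python
import struct
from dataclasses import dataclass


@dataclass
class User:
    id: int    # declared as: int id;
    name: str  # declared as: string name<32>;


def encode_user(user: User) -> bytes:
    """Encode the components in the order of their declaration."""
    name = user.name.encode("ascii")
    if len(name) > 32:
        raise ValueError("name exceeds the declared maximum")
    pad = b"\x00" * ((4 - len(name) % 4) % 4)
    return struct.pack(">i", user.id) + struct.pack(">I", len(name)) + name + pad
```

No structure-level header or terminator is written; the decoder relies on the same declaration to know where each component begins.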

4.15. Discriminated Union

A discriminated union is a type composed of a discriminant followed
by a type selected from a set of prearranged types according to the
value of the discriminant. The type of discriminant is either "int",
"unsigned int", or an enumerated type, such as "bool". The component
types are called "arms" of the union and are preceded by the value of
the discriminant that implies their encoding. Discriminated unions
are declared as follows:

      union switch (discriminant-declaration) {
      case discriminant-value-A:
         arm-declaration-A;
      case discriminant-value-B:
         arm-declaration-B;
      ...
      default: default-declaration;
      } identifier;

Each "case" keyword is followed by a legal value of the discriminant.
The default arm is optional. If it is not specified, then a valid
encoding of the union cannot take on unspecified discriminant values.
The size of the implied arm is always a multiple of four bytes.

The discriminated union is encoded as its discriminant followed by
the encoding of the implied arm.

        0   1   2   3
      +---+---+---+---+---+---+---+---+
      |  discriminant |  implied arm  |          DISCRIMINATED UNION
      +---+---+---+---+---+---+---+---+
      |<---4 bytes--->|
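
The Python sketch below encodes a hypothetical union, invented for this example, with an "int kind" discriminant: case 0 selects "int number", case 1 selects "string text<>", and the default arm is void. The function name encode_value is likewise an assumption.

```python
import struct


def encode_value(kind: int, arm=None) -> bytes:
    """Encode the discriminant, then the arm that it implies."""
    head = struct.pack(">i", kind)
    if kind == 0:  # case 0: int number;
        return head + struct.pack(">i", arm)
    if kind == 1:  # case 1: string text<>;
        data = arm.encode("ascii")
        pad = b"\x00" * ((4 - len(data) % 4) % 4)
        return head + struct.pack(">I", len(data)) + data + pad
    return head    # default: void; the arm occupies zero bytes
```

Because the default arm is void, any other discriminant value encodes as the four discriminant bytes alone.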

4.16. Void

An XDR void is a 0-byte quantity. Voids are useful for describing
operations that take no data as input or no data as output. They are
also useful in unions, where some arms may contain data and others do
not. The declaration is simply as follows:

      void;

Voids are illustrated as follows:

        ++
        ||                                                     VOID
        ++
      --><-- 0 bytes

4.17. Constant

The data declaration for a constant follows this form:

      const name-identifier = n;

"const" is used to define a symbolic name for a constant; it does not
declare any data. The symbolic constant may be used anywhere a
regular constant may be used. For example, the following defines a
symbolic constant DOZEN, equal to 12.

      const DOZEN = 12;

4.18. Typedef

"typedef" does not declare any data either, but serves to define new
identifiers for declaring data. The syntax is:

      typedef declaration;

The new type name is actually the variable name in the declaration
part of the typedef. For example, the following defines a new type
called "eggbox" using an existing type called "egg":

      typedef egg eggbox[DOZEN];

Variables declared using the new type name have the same type as the
new type name would have in the typedef, if it were considered a
variable. For example, the following two declarations are equivalent
in declaring the variable "fresheggs":

      eggbox  fresheggs;
      egg     fresheggs[DOZEN];

When a typedef involves a struct, enum, or union definition, there is
another (preferred) syntax that may be used to define the same type.
In general, a typedef of the following form:

      typedef <<struct, union, or enum definition>> identifier;

may be converted to the alternative form by removing the "typedef"
part and placing the identifier after the "struct", "union", or
"enum" keyword, instead of at the end. For example, here are the two
ways to define the type "bool":

      typedef enum {    /* using typedef */
         FALSE = 0,
         TRUE = 1
      } bool;

      enum bool {       /* preferred alternative */
         FALSE = 0,
         TRUE = 1
      };

This syntax is preferred because one does not have to wait until the
end of a declaration to figure out the name of the new type.

4.19. Optional-Data

Optional-data is one kind of union that occurs so frequently that we
give it a special syntax of its own for declaring it. It is declared
as follows:

      type-name *identifier;

This is equivalent to the following union:

      union switch (bool opted) {
      case TRUE:
         type-name element;
      case FALSE:
         void;
      } identifier;

It is also equivalent to the following variable-length array
declaration, since the boolean "opted" can be interpreted as the
length of the array:

      type-name identifier<1>;

Optional-data is not so interesting in itself, but it is very useful
for describing recursive data-structures such as linked-lists and
trees. For example, the following defines a type "stringlist" that
encodes lists of zero or more arbitrary length strings:

      struct stringentry {
         string item<>;
         stringentry *next;
      };

      typedef stringentry *stringlist;

It could have been equivalently declared as the following union:

      union stringlist switch (bool opted) {
      case TRUE:
         struct {
            string item<>;
            stringlist next;
         } element;
      case FALSE:
         void;
      };

or as a variable-length array:

      struct stringentry {
         string item<>;
         stringentry next<1>;
      };

      typedef stringentry stringlist<1>;

Both of these declarations obscure the intention of the stringlist
type, so the optional-data declaration is preferred over both of
them. The optional-data type also has a close correlation to how
recursive data structures are represented in high-level languages
such as Pascal or C by use of pointers. In fact, the syntax is the
same as that of the C language for pointers.
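
On the wire, a "stringlist" value is therefore a run of entries, each preceded by a boolean TRUE, and closed by a boolean FALSE where the next entry would be. The Python sketch below shows this; the helper names are assumptions of this example.

```python
import struct

TRUE = struct.pack(">i", 1)
FALSE = struct.pack(">i", 0)


def encode_string(text: str) -> bytes:
    """Encode an ASCII string as length, bytes, and zero padding."""
    data = text.encode("ascii")
    pad = b"\x00" * ((4 - len(data) % 4) % 4)
    return struct.pack(">I", len(data)) + data + pad


def encode_stringlist(items) -> bytes:
    """Encode each entry as TRUE + item; FALSE marks the end of the list."""
    out = b""
    for item in items:
        out += TRUE + encode_string(item)
    return out + FALSE
```

An empty list is just the four bytes of FALSE, which matches the variable-length array view in which the count is 0.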

4.20. Areas for Future Enhancement

The XDR standard lacks representations for bit fields and bitmaps,
since the standard is based on bytes. Also missing are packed (or
binary-coded) decimals.

The intent of the XDR standard was not to describe every kind of data
that people have ever sent or will ever want to send from machine to
machine. Rather, it only describes the most commonly used data-types
of high-level languages such as Pascal or C so that applications
written in these languages will be able to communicate easily over
some medium.

One could imagine extensions to XDR that would let it describe almost
any existing protocol, such as TCP. The minimum necessary for this
is support for different block sizes and byte-orders. The XDR
discussed here could then be considered the 4-byte big-endian member
of a larger XDR family.

5. Discussion

(1) Why use a language for describing data? What's wrong with
    diagrams?

There are many advantages in using a data-description language such
as XDR versus using diagrams. Languages are more formal than
diagrams and lead to less ambiguous descriptions of data. Languages
are also easier to understand and allow one to think of other issues
instead of the low-level details of bit encoding. Also, there is a
close analogy between the types of XDR and a high-level language such
as C or Pascal. This makes the implementation of XDR encoding and
decoding modules an easier task. Finally, the language specification
itself is an ASCII string that can be passed from machine to machine
to perform on-the-fly data interpretation.

(2) Why is there only one byte-order for an XDR unit?

Supporting two byte-orderings requires a higher-level protocol for
determining in which byte-order the data is encoded. Since XDR is
not a protocol, this can't be done. The advantage of this, though,
is that data in XDR format can be written to a magnetic tape, for
example, and any machine will be able to interpret it, since no
higher-level protocol is necessary for determining the byte-order.

(3) Why is the XDR byte-order big-endian instead of little-endian?
    Isn't this unfair to little-endian machines such as the VAX(r),
    which has to convert from one form to the other?

Yes, it is unfair, but having only one byte-order means you have to
be unfair to somebody. Many architectures, such as the Motorola
68000* and IBM 370*, support the big-endian byte-order.

(4) Why is the XDR unit four bytes wide?

There is a tradeoff in choosing the XDR unit size. Choosing a small
size, such as two, makes the encoded data small, but causes alignment
problems for machines that aren't aligned on these boundaries. A
large size, such as eight, means the data will be aligned on
virtually every machine, but causes the encoded data to grow too big.
We chose four as a compromise. Four is big enough to support most
architectures efficiently, except for rare machines such as the
eight-byte-aligned Cray*. Four is also small enough to keep the
encoded data restricted to a reasonable size.

(5) Why must variable-length data be padded with zeros?

It is desirable that the same data encode into the same thing on all
machines, so that encoded data can be meaningfully compared or
checksummed. Forcing the padded bytes to be zero ensures this.

(6) Why is there no explicit data-typing?

Data-typing has a relatively high cost for what small advantages it
may have. One cost is the expansion of data due to the inserted type
fields. Another is the added cost of interpreting these type fields
and acting accordingly. And most protocols already know what type
they expect, so data-typing supplies only redundant information.
However, one can still get the benefits of data-typing using XDR.
One way is to encode two things: first, a string that is the XDR data
description of the encoded data, and then the encoded data itself.
Another way is to assign a value to all the types in XDR, and then
define a universal type that takes this value as its discriminant and
for each value, describes the corresponding data type.
939 6. The XDR Language Specification
941 6.1. Notational Conventions
943 This specification uses an extended Back-Naur Form notation for
944 describing the XDR language. Here is a brief description of the
947 (1) The characters '|', '(', ')', '[', ']', '"', and '*' are special.
948 (2) Terminal symbols are strings of any characters surrounded by
949 double quotes. (3) Non-terminal symbols are strings of non-special
950 characters. (4) Alternative items are separated by a vertical bar
954 Eisler Standards Track [Page 17]
956 RFC 4506 XDR: External Data Representation Standard May 2006
959 ("|"). (5) Optional items are enclosed in brackets. (6) Items are
960 grouped together by enclosing them in parentheses. (7) A '*'
961 following an item means 0 or more occurrences of that item.
963 For example, consider the following pattern:
965 "a " "very" (", " "very")* [" cold " "and "] " rainy " ("day" | "night")
968 An infinite number of strings match this pattern. A few of them are:
971 "a very, very rainy day"
972 "a very cold and rainy day"
973 "a very, very, very cold and rainy night"
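As a rough cross-check, the pattern can be transcribed into a regular expression. This sketch makes two assumptions: that the pattern ends with a choice between "day" and "night", as the sample strings suggest, and that runs of white space between words are treated loosely. The name PATTERN and the helper matches are illustrative.

```python
import re

PATTERN = re.compile(
    r"a\s+very"             # "a " "very"
    r"(?:,\s+very)*"        # (", " "very")*    zero or more occurrences
    r"(?:\s+cold\s+and)?"   # [" cold " "and "] optional item
    r"\s+rainy\s+"          # " rainy "
    r"(?:day|night)"        # assumed final alternative
)

def matches(text: str) -> bool:
    """True if the whole string is generated by the pattern."""
    return PATTERN.fullmatch(text) is not None

for sample in ("a very, very rainy day",
               "a very cold and rainy day",
               "a very, very, very cold and rainy night",
               "a rainy day"):
    print(sample, "->", matches(sample))
```

The last sample is rejected because "very" is not optional in the pattern.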
977 (1) Comments begin with '/*' and terminate with '*/'. (2) White
978 space serves to separate items and is otherwise ignored. (3) An
979 identifier is a letter followed by an optional sequence of letters,
980 digits, or underbar ('_'). The case of identifiers is not ignored.
981 (4) A decimal constant expresses a number in base 10 and is a
982 sequence of one or more decimal digits, where the first digit is not
983 a zero, and is optionally preceded by a minus-sign ('-'). (5) A
984 hexadecimal constant expresses a number in base 16, and must be
985 preceded by '0x', followed by one or more hexadecimal digits ('A', 'B',
986 'C', 'D', 'E', 'F', 'a', 'b', 'c', 'd', 'e', 'f', '0', '1', '2', '3',
987 '4', '5', '6', '7', '8', '9'). (6) An octal constant expresses a
988 number in base 8, always leads with digit 0, and is a sequence of one
989 or more octal digits ('0', '1', '2', '3', '4', '5', '6', '7').
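The lexical rules for identifiers and constants can be sketched as a small classifier. The names LEXEMES, classify, and constant_value are illustrative. One reading is assumed here: a lone "0" is taken to be an octal constant, since it leads with the digit 0 and consists of one octal digit.

```python
import re

# One regular expression per lexical class, tried in this order.
LEXEMES = [
    ("hexadecimal", re.compile(r"0x[0-9A-Fa-f]+")),
    ("octal",       re.compile(r"0[0-7]*")),
    ("decimal",     re.compile(r"-?[1-9][0-9]*")),
    ("identifier",  re.compile(r"[A-Za-z][A-Za-z0-9_]*")),
]

def classify(token: str):
    """Return the lexical class of a token, or None if it fits no class."""
    for name, pattern in LEXEMES:
        if pattern.fullmatch(token):
            return name
    return None

def constant_value(token: str) -> int:
    """Return the numeric value of a decimal, hexadecimal, or octal constant."""
    bases = {"decimal": 10, "hexadecimal": 16, "octal": 8}
    kind = classify(token)
    if kind not in bases:
        raise ValueError("not a constant: " + token)
    return int(token, bases[kind])

for token in ("0x1F", "017", "-42", "file_name2", "08", "_x"):
    print(token, "->", classify(token))
```

Tokens such as "08" and "_x" fit none of the classes: '8' is not an octal digit, and an identifier must begin with a letter.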
991 6.3. Syntax Information
994 type-specifier identifier
995 | type-specifier identifier "[" value "]"
996 | type-specifier identifier "<" [ value ] ">"
997 | "opaque" identifier "[" value "]"
998 | "opaque" identifier "<" [ value ] ">"
999 | "string" identifier "<" [ value ] ">"
1000 | type-specifier "*" identifier
1016 decimal-constant | hexadecimal-constant | octal-constant
1019 [ "unsigned" ] "int"
1020 | [ "unsigned" ] "hyper"
1035 ( identifier "=" value )
1036 ( "," identifier "=" value )*
1040 "struct" struct-body
1045 ( declaration ";" )*
1052 "switch" "(" declaration ")" "{"
1055 [ "default" ":" declaration ";" ]
1060 ( "case" value ":") *
1072 "const" identifier "=" constant ";"
1075 "typedef" declaration ";"
1076 | "enum" identifier enum-body ";"
1077 | "struct" identifier struct-body ";"
1078 | "union" identifier union-body ";"
1089 (1) The following are keywords and cannot be used as identifiers:
1090 "bool", "case", "const", "default", "double", "quadruple", "enum",
1091 "float", "hyper", "int", "opaque", "string", "struct", "switch",
1092 "typedef", "union", "unsigned", and "void".
1094 (2) Only unsigned constants may be used as size specifications for
1095 arrays. If an identifier is used, it must have been declared
1096 previously as an unsigned constant in a "const" definition.
1098 (3) Constant and type identifiers within the scope of a specification
1099 are in the same name space and must be declared uniquely within this scope.
1102 (4) Similarly, variable names must be unique within the scope of
1103 struct and union declarations. Nested struct and union declarations create new scopes.
1106 (5) The discriminant of a union must be of a type that evaluates to
1107 an integer. That is, "int", "unsigned int", "bool", an enumerated
1108 type, or any typedefed type that evaluates to one of these is legal.
1109 Also, the case values must be one of the legal values of the
1110 discriminant. Finally, a case value may not be specified more than
1111 once within the scope of a union declaration.
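The two constraints on case values in note (5) can be sketched as a simple check that a description compiler might apply. The function name check_union_cases and its arguments are hypothetical; the set of legal values would come from the discriminant's declared type.

```python
def check_union_cases(legal_values, case_values):
    """Return a list of problems found in a union's case values."""
    problems = []
    seen = set()
    for value in case_values:
        if value not in legal_values:
            problems.append(f"{value} is not a legal discriminant value")
        if value in seen:
            problems.append(f"{value} is specified more than once")
        seen.add(value)
    return problems

# A discriminant of an enumerated type with the values 0, 1, and 2.
print(check_union_cases({0, 1, 2}, [0, 1, 2]))
print(check_union_cases({0, 1, 2}, [0, 1, 1, 5]))
```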
1127 7. An Example of an XDR Data Description
1129 Here is a short XDR data description of a thing called a "file",
1130 which might be used to transfer files from one machine to another.
   const MAXUSERNAME = 32;     /* max length of a user name */
   const MAXFILELEN = 65535;   /* max length of a file      */
   const MAXNAMELEN = 255;     /* max length of a file name */

   enum filekind {
      TEXT = 0,       /* ascii data */
      DATA = 1,       /* raw data   */
      EXEC = 2        /* executable */
   };

   /*
    * File information, per kind of file:
    */
   union filetype switch (filekind kind) {
   case TEXT:
      void;                           /* no extra information */
   case DATA:
      string creator<MAXNAMELEN>;     /* data creator         */
   case EXEC:
      string interpretor<MAXNAMELEN>; /* program interpretor  */
   };

   struct file {
      string filename<MAXNAMELEN>; /* name of file    */
      filetype type;               /* info about file */
      string owner<MAXUSERNAME>;   /* owner of file   */
      opaque data<MAXFILELEN>;     /* file data       */
   };
1167 Suppose now that there is a user named "john" who wants to store his
1168 lisp program "sillyprog" that contains just the data "(quit)". His
1169 file would be encoded as follows:
1183 OFFSET  HEX BYTES    ASCII  COMMENTS
1184 ------  ---------    -----  --------
1185      0  00 00 00 09  ....   -- length of filename = 9
1186      4  73 69 6c 6c  sill   -- filename characters
1187      8  79 70 72 6f  ypro   -- ... and more characters ...
1188     12  67 00 00 00  g...   -- ... and 3 zero-bytes of fill
1189     16  00 00 00 02  ....   -- filekind is EXEC = 2
1190     20  00 00 00 04  ....   -- length of interpretor = 4
1191     24  6c 69 73 70  lisp   -- interpretor characters
1192     28  00 00 00 04  ....   -- length of owner = 4
1193     32  6a 6f 68 6e  john   -- owner characters
1194     36  00 00 00 06  ....   -- length of file data = 6
1195     40  28 71 75 69  (qui   -- file data bytes ...
1196     44  74 29 00 00  t)..   -- ... and 2 zero-bytes of fill
1198 8. Security Considerations
1200 XDR is a data description language, not a protocol, and hence it does
1201 not inherently give rise to any particular security considerations.
1202 Protocols that carry XDR-formatted data, such as NFSv4, are
1203 responsible for providing any necessary security services to secure
1204 the data they transport.
1206 Care must be taken to properly encode and decode data to avoid
1207 attacks. Known and avoidable risks include:
1209 * Buffer overflow attacks. Where feasible, protocols should be
1210 defined with explicit limits (via the "<" [ value ] ">" notation
1211 instead of "<" ">") on elements with variable-length data types.
1212 Regardless of the feasibility of an explicit limit on the
1213 variable length of an element of a given protocol, decoders need
1214 to ensure the incoming size does not exceed the length of any
1215 provisioned receiver buffers.
1217 * Nul octets embedded in an encoded value of type string. If the
1218 decoder's native string format uses nul-terminated strings, then
1219 the apparent size of the decoded object will be less than the
1220 amount of memory allocated for the string. Some memory
1221 deallocation interfaces take a size argument. The caller of the
1222 deallocation interface would likely determine the size of the
1223 string by counting to the location of the nul octet and adding
1224 one. This discrepancy can cause memory leakage (because less
1225 memory is actually returned to the free pool than allocated),
1226 leading to system failure and a denial of service attack.
1228 * Decoding of characters in strings that are legal ASCII
1229 characters but nonetheless are illegal for the intended
1230 application. For example, some operating systems treat the '/'
1239 character as a component separator in path names. For a
1240 protocol that encodes a string in the argument to a file
1241 creation operation, the decoder needs to ensure that '/' is not
1242 inside the component name. Otherwise, a file with an illegal
1243 '/' in its name will be created, making it difficult to remove;
1244 this is therefore a denial of service attack.
1246 * Denial of service caused by recursive decoder or encoder
1247 subroutines. A recursive decoder or encoder might process data
1248 that has a structured type with a member of type optional data
1249 that directly or indirectly refers to the structured type (i.e.,
1250 a linked list). For example,
   struct m {
     int x;
     struct m *next;
   };
1257 An encoder or decoder subroutine might be written to recursively
1258 call itself each time another element of type "struct m" is
1259 found. An attacker could construct a long linked list of
1260 "struct m" elements in the request or response, which then
1261 causes a stack overflow on the decoder or encoder. Decoders and
1262 encoders should be written non-recursively or impose a limit on list length.
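Several of the precautions above can be sketched in a defensive decoder. This is an illustration under stated assumptions, not a reference implementation: the limit MAX_ELEMENTS, the exception XdrError, and all function names are hypothetical, and the list element is assumed to be a "struct m" holding one int followed by optional data of the same type (encoded as a boolean and, if true, the next element).

```python
import struct

MAX_ELEMENTS = 1024          # hypothetical limit on linked-list length

class XdrError(ValueError):
    pass

def take(buf: bytes, pos: int, n: int):
    """Return n bytes starting at pos, refusing to read past the buffer."""
    if pos + n > len(buf):
        raise XdrError("truncated input")
    return buf[pos:pos + n], pos + n

def decode_opaque(buf: bytes, pos: int, limit: int):
    """Decode variable-length data, checking the size before using it."""
    raw, pos = take(buf, pos, 4)
    (length,) = struct.unpack(">I", raw)
    if length > limit:
        raise XdrError("length exceeds receiver limit")
    data, pos = take(buf, pos, length)
    fill, pos = take(buf, pos, -length % 4)
    if fill != b"\x00" * len(fill):
        raise XdrError("non-zero fill bytes")
    return data, pos

def check_component(name: bytes) -> None:
    """Reject nul octets and '/' in a decoded path component."""
    if b"\x00" in name or b"/" in name:
        raise XdrError("illegal octet in component name")

def decode_list(buf: bytes, pos: int = 0):
    """Decode the linked list iteratively, so stack depth stays constant."""
    values = []
    while True:
        raw, pos = take(buf, pos, 4)
        (present,) = struct.unpack(">I", raw)
        if present == 0:
            return values, pos
        if present != 1:
            raise XdrError("bad boolean")
        if len(values) >= MAX_ELEMENTS:
            raise XdrError("list too long")
        raw, pos = take(buf, pos, 4)
        values.append(struct.unpack(">i", raw)[0])

wire = struct.pack(">IiIiI", 1, 10, 1, 20, 0)
print(decode_list(wire))
```

A loop with an explicit element limit bounds both stack use and memory use, whatever list length the sender chooses.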
1265 9. IANA Considerations
1267 It is possible, if not likely, that new data types will be added to
1268 XDR in the future. The process for adding new types is via a
1269 standards track RFC and not registration of new types with IANA.
1270 Standards track RFCs that update or replace this document should be
1271 documented as such in the RFC Editor's database of RFCs.
1273 10. Trademarks and Owners
1275 SUN WORKSTATION   Sun Microsystems, Inc.
1276 VAX               Hewlett-Packard Company
1277 IBM-PC            International Business Machines Corporation
1279 NFS               Sun Microsystems, Inc.
1280 Ethernet          Xerox Corporation
1281 Motorola 68000    Motorola, Inc.
1282 IBM 370           International Business Machines Corporation
1295 11. ANSI/IEEE Standard 754-1985
1297 The definition of NaNs, signed zero and infinity, and denormalized
1298 numbers from [IEEE] is reproduced here for convenience. The
1299 definitions for quadruple-precision floating point numbers are
1300 analogs of those for single and double-precision floating point
1301 numbers and are defined in [IEEE].
1303 In the following, 'S' stands for the sign bit, 'E' for the exponent,
1304 and 'F' for the fractional part. The symbol 'u' stands for an
1305 undefined bit (0 or 1).
1307 For single-precision floating point numbers:
1309 Type                S (1 bit)  E (8 bits)   F (23 bits)
1310 ----                ---------  ----------   -----------
1311 signalling NaN      u          255 (max)    .0uuuuu---u (with at least one 1 bit)
1314 quiet NaN           u          255 (max)    .1uuuuu---u
1316 negative infinity   1          255 (max)    .000000---0
1318 positive infinity   0          255 (max)    .000000---0
1320 negative zero       1          0            .000000---0
1322 positive zero       0          0            .000000---0
1324 For double-precision floating point numbers:
1326 Type                S (1 bit)  E (11 bits)  F (52 bits)
1327 ----                ---------  -----------  -----------
1328 signalling NaN      u          2047 (max)   .0uuuuu---u (with at least one 1 bit)
1331 quiet NaN           u          2047 (max)   .1uuuuu---u
1333 negative infinity   1          2047 (max)   .000000---0
1335 positive infinity   0          2047 (max)   .000000---0
1337 negative zero       1          0            .000000---0
1339 positive zero       0          0            .000000---0
1351 For quadruple-precision floating point numbers:
1353 Type                S (1 bit)  E (15 bits)  F (112 bits)
1354 ----                ---------  -----------  ------------
1355 signalling NaN      u          32767 (max)  .0uuuuu---u (with at least one 1 bit)
1358 quiet NaN           u          32767 (max)  .1uuuuu---u
1360 negative infinity   1          32767 (max)  .000000---0
1362 positive infinity   0          32767 (max)  .000000---0
1364 negative zero       1          0            .000000---0
1366 positive zero       0          0            .000000---0
1368 Subnormal numbers are represented as follows:
1370 Precision   Exponent   Value
1371 ---------   --------   -----
1372 Single      0          (-1)**S * 2**(-126) * 0.F
1374 Double      0          (-1)**S * 2**(-1022) * 0.F
1376 Quadruple   0          (-1)**S * 2**(-16382) * 0.F
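The single-precision tables can be exercised with a short sketch that splits a 32-bit pattern into S, E, and F and classifies it. The function names classify_single and subnormal_value are illustrative. A fraction of all zeros with the maximum exponent is classified as an infinity, so a signalling NaN here always has at least one fraction bit set.

```python
import struct

def classify_single(bits: int) -> str:
    """Classify a 32-bit pattern using the single-precision table."""
    s = bits >> 31                # sign bit
    e = (bits >> 23) & 0xFF       # 8-bit exponent
    f = bits & 0x7FFFFF           # 23-bit fraction
    if e == 255:
        if f == 0:
            return "negative infinity" if s else "positive infinity"
        return "quiet NaN" if f >> 22 else "signalling NaN"
    if e == 0:
        if f == 0:
            return "negative zero" if s else "positive zero"
        return "subnormal"
    return "normal"

def subnormal_value(bits: int) -> float:
    """Evaluate (-1)**S * 2**(-126) * 0.F for a pattern whose exponent is 0."""
    s = bits >> 31
    f = bits & 0x7FFFFF
    return (-1) ** s * 2.0 ** -126 * (f / 2 ** 23)

def as_float(bits: int) -> float:
    """Reinterpret the pattern as a big-endian IEEE single-precision number."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

for pattern in (0x7F800000, 0xFF800000, 0x7FC00000, 0x7F800001,
                0x80000000, 0x00000001):
    print(f"{pattern:08x}", classify_single(pattern))
print(subnormal_value(0x00000001) == as_float(0x00000001))
```

The final line compares the formula from the subnormal table with the value produced by the platform's own IEEE conversion.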
1378 12. Normative References
1380 [IEEE] "IEEE Standard for Binary Floating-Point Arithmetic",
1381 ANSI/IEEE Standard 754-1985, Institute of Electrical and
1382 Electronics Engineers, August 1985.
1384 13. Informative References
1386 [KERN] Brian W. Kernighan & Dennis M. Ritchie, "The C Programming
1387 Language", Bell Laboratories, Murray Hill, New Jersey, 1978.
1389 [COHE] Danny Cohen, "On Holy Wars and a Plea for Peace", IEEE
1390 Computer, October 1981.
1392 [COUR] "Courier: The Remote Procedure Call Protocol", XEROX
1393 Corporation, XSIS 038112, December 1981.
1395 [SPAR] "The SPARC Architecture Manual: Version 8", Prentice Hall,
1398 [HPRE] "HP Precision Architecture Handbook", June 1987, 5954-9906.
1407 14. Acknowledgements
1409 Bob Lyon was Sun's visible force behind ONC RPC in the 1980s. Sun
1410 Microsystems, Inc., is listed as the author of RFC 1014. Raj
1411 Srinivasan and the rest of the old ONC RPC working group edited RFC
1412 1014 into RFC 1832, from which this document is derived. Mike Eisler
1413 and Bill Janssen submitted the implementation reports for this
1414 standard. Kevin Coffman, Benny Halevy, and Jon Peterson reviewed
1415 this document and gave feedback. Peter Astrand and Bryan Olson
1416 pointed out several errors in RFC 1832 which are corrected in this document.
1422 5765 Chase Point Circle
1423 Colorado Springs, CO 80919
1427 EMail: email2mre-rfc4506@yahoo.com
1429 Please address comments to: nfsv4@ietf.org
1463 Full Copyright Statement
1465 Copyright (C) The Internet Society (2006).
1467 This document is subject to the rights, licenses and restrictions
1468 contained in BCP 78, and except as set forth therein, the authors
1469 retain all their rights.
1471 This document and the information contained herein are provided on an
1472 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1473 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1474 ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1475 INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1476 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1477 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
1479 Intellectual Property
1481 The IETF takes no position regarding the validity or scope of any
1482 Intellectual Property Rights or other rights that might be claimed to
1483 pertain to the implementation or use of the technology described in
1484 this document or the extent to which any license under such rights
1485 might or might not be available; nor does it represent that it has
1486 made any independent effort to identify any such rights. Information
1487 on the procedures with respect to rights in RFC documents can be
1488 found in BCP 78 and BCP 79.
1490 Copies of IPR disclosures made to the IETF Secretariat and any
1491 assurances of licenses to be made available, or the result of an
1492 attempt made to obtain a general license or permission for the use of
1493 such proprietary rights by implementers or users of this
1494 specification can be obtained from the IETF on-line IPR repository at
1495 http://www.ietf.org/ipr.
1497 The IETF invites any interested party to bring to its attention any
1498 copyrights, patents or patent applications, or other proprietary
1499 rights that may cover technology that may be required to implement
1500 this standard. Please address the information to the IETF at
1505 Funding for the RFC Editor function is provided by the IETF
1506 Administrative Support Activity (IASA).