doc/src/sgml/datatype.sgml

   1 <!-- $PostgreSQL$ -->
   2
   3  <chapter id="datatype">
   4   <title id="datatype-title">Data Types</title>
   5
   6   <indexterm zone="datatype">
   7    <primary>data type</primary>
   8   </indexterm>
   9
  10   <indexterm>
  11    <primary>type</primary>
  12    <see>data type</see>
  13   </indexterm>
  14
  15   <para>
  16    <productname>PostgreSQL</productname> has a rich set of native data
  17    types available to users.  Users can add new types to
  18    <productname>PostgreSQL</productname> using the <xref
  19    linkend="sql-createtype" endterm="sql-createtype-title"> command.
  20   </para>
  21
  22   <para>
  23    <xref linkend="datatype-table"> shows all the built-in general-purpose data
  24    types. Most of the alternative names listed in the
  25    <quote>Aliases</quote> column are the names used internally by
  26    <productname>PostgreSQL</productname> for historical reasons.  In
  27    addition, some internally used or deprecated types are available,
  28    but are not listed here.
  29   </para>
  30
  31    <table id="datatype-table">
  32     <title>Data Types</title>
  33     <tgroup cols="3">
  34      <thead>
  35       <row>
  36        <entry>Name</entry>
  37        <entry>Aliases</entry>
  38        <entry>Description</entry>
  39       </row>
  40      </thead>
  41
  42      <tbody>
  43       <row>
  44        <entry><type>bigint</type></entry>
  45        <entry><type>int8</type></entry>
  46        <entry>signed eight-byte integer</entry>
  47       </row>
  48
  49       <row>
  50        <entry><type>bigserial</type></entry>
  51        <entry><type>serial8</type></entry>
  52        <entry>autoincrementing eight-byte integer</entry>
  53       </row>
  54
  55       <row>
  56        <entry><type>bit [ (<replaceable>n</replaceable>) ]</type></entry>
  57        <entry></entry>
  58        <entry>fixed-length bit string</entry>
  59       </row>
  60
  61       <row>
  62        <entry><type>bit varying [ (<replaceable>n</replaceable>) ]</type></entry>
  63        <entry><type>varbit</type></entry>
  64        <entry>variable-length bit string</entry>
  65       </row>
  66
  67       <row>
  68        <entry><type>boolean</type></entry>
  69        <entry><type>bool</type></entry>
  70        <entry>logical Boolean (true/false)</entry>
  71       </row>
  72
  73       <row>
  74        <entry><type>box</type></entry>
  75        <entry></entry>
  76        <entry>rectangular box on a plane</entry>
  77       </row>
  78
  79       <row>
  80        <entry><type>bytea</type></entry>
  81        <entry></entry>
  82        <entry>binary data (<quote>byte array</>)</entry>
  83       </row>
  84
  85       <row>
  86        <entry><type>character varying [ (<replaceable>n</replaceable>) ]</type></entry>
  87        <entry><type>varchar [ (<replaceable>n</replaceable>) ]</type></entry>
  88        <entry>variable-length character string</entry>
  89       </row>
  90
  91       <row>
  92        <entry><type>character [ (<replaceable>n</replaceable>) ]</type></entry>
  93        <entry><type>char [ (<replaceable>n</replaceable>) ]</type></entry>
  94        <entry>fixed-length character string</entry>
  95       </row>
  96
  97       <row>
  98        <entry><type>cidr</type></entry>
  99        <entry></entry>
 100        <entry>IPv4 or IPv6 network address</entry>
 101       </row>
 102
 103       <row>
 104        <entry><type>circle</type></entry>
 105        <entry></entry>
 106        <entry>circle on a plane</entry>
 107       </row>
 108
 109       <row>
 110        <entry><type>date</type></entry>
 111        <entry></entry>
 112        <entry>calendar date (year, month, day)</entry>
 113       </row>
 114
 115       <row>
 116        <entry><type>double precision</type></entry>
 117        <entry><type>float8</type></entry>
 118        <entry>double precision floating-point number (8 bytes)</entry>
 119       </row>
 120
 121       <row>
 122        <entry><type>inet</type></entry>
 123        <entry></entry>
 124        <entry>IPv4 or IPv6 host address</entry>
 125       </row>
 126
 127       <row>
 128        <entry><type>integer</type></entry>
 129        <entry><type>int</type>, <type>int4</type></entry>
 130        <entry>signed four-byte integer</entry>
 131       </row>
 132
 133       <row>
 134        <entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
 135        <entry></entry>
 136        <entry>time span</entry>
 137       </row>
 138
 139       <row>
 140        <entry><type>line</type></entry>
 141        <entry></entry>
 142        <entry>infinite line on a plane</entry>
 143       </row>
 144
 145       <row>
 146        <entry><type>lseg</type></entry>
 147        <entry></entry>
 148        <entry>line segment on a plane</entry>
 149       </row>
 150
 151       <row>
 152        <entry><type>macaddr</type></entry>
 153        <entry></entry>
 154        <entry>MAC (Media Access Control) address</entry>
 155       </row>
 156
 157       <row>
 158        <entry><type>money</type></entry>
 159        <entry></entry>
 160        <entry>currency amount</entry>
 161       </row>
 162
 163       <row>
 164        <entry><type>numeric [ (<replaceable>p</replaceable>,
 165          <replaceable>s</replaceable>) ]</type></entry>
 166        <entry><type>decimal [ (<replaceable>p</replaceable>,
 167          <replaceable>s</replaceable>) ]</type></entry>
 168        <entry>exact numeric of selectable precision</entry>
 169       </row>
 170
 171       <row>
 172        <entry><type>path</type></entry>
 173        <entry></entry>
 174        <entry>geometric path on a plane</entry>
 175       </row>
 176
 177       <row>
 178        <entry><type>point</type></entry>
 179        <entry></entry>
 180        <entry>geometric point on a plane</entry>
 181       </row>
 182
 183       <row>
 184        <entry><type>polygon</type></entry>
 185        <entry></entry>
 186        <entry>closed geometric path on a plane</entry>
 187       </row>
 188
 189       <row>
 190        <entry><type>real</type></entry>
 191        <entry><type>float4</type></entry>
 192        <entry>single precision floating-point number (4 bytes)</entry>
 193       </row>
 194
 195       <row>
 196        <entry><type>smallint</type></entry>
 197        <entry><type>int2</type></entry>
 198        <entry>signed two-byte integer</entry>
 199       </row>
 200
 201       <row>
 202        <entry><type>serial</type></entry>
 203        <entry><type>serial4</type></entry>
 204        <entry>autoincrementing four-byte integer</entry>
 205       </row>
 206
 207       <row>
 208        <entry><type>text</type></entry>
 209        <entry></entry>
 210        <entry>variable-length character string</entry>
 211       </row>
 212
 213       <row>
 214        <entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
 215        <entry></entry>
 216        <entry>time of day (no time zone)</entry>
 217       </row>
 218
 219       <row>
 220        <entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
 221        <entry><type>timetz</type></entry>
 222        <entry>time of day, including time zone</entry>
 223       </row>
 224
 225       <row>
 226        <entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
 227        <entry></entry>
 228        <entry>date and time (no time zone)</entry>
 229       </row>
 230
 231       <row>
 232        <entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
 233        <entry><type>timestamptz</type></entry>
 234        <entry>date and time, including time zone</entry>
 235       </row>
 236
 237       <row>
 238        <entry><type>tsquery</type></entry>
 239        <entry></entry>
 240        <entry>text search query</entry>
 241       </row>
 242
 243       <row>
 244        <entry><type>tsvector</type></entry>
 245        <entry></entry>
 246        <entry>text search document</entry>
 247       </row>
 248
 249       <row>
 250        <entry><type>txid_snapshot</type></entry>
 251        <entry></entry>
 252        <entry>user-level transaction ID snapshot</entry>
 253       </row>
 254
 255       <row>
 256        <entry><type>uuid</type></entry>
 257        <entry></entry>
 258        <entry>universally unique identifier</entry>
 259       </row>
 260
 261       <row>
 262        <entry><type>xml</type></entry>
 263        <entry></entry>
 264        <entry>XML data</entry>
 265       </row>
 266      </tbody>
 267     </tgroup>
 268    </table>
 269
 270   <note>
 271    <title>Compatibility</title>
 272    <para>
 273     The following types (or spellings thereof) are specified by
 274     <acronym>SQL</acronym>: <type>bigint</type>, <type>bit</type>, <type>bit
 275     varying</type>, <type>boolean</type>, <type>char</type>,
 276     <type>character varying</type>, <type>character</type>,
 277     <type>varchar</type>, <type>date</type>, <type>double
 278     precision</type>, <type>integer</type>, <type>interval</type>,
 279     <type>numeric</type>, <type>decimal</type>, <type>real</type>,
 280     <type>smallint</type>, <type>time</type> (with or without time zone),
 281     <type>timestamp</type> (with or without time zone),
 282     <type>xml</type>.
 283    </para>
 284   </note>
 285
 286   <para>
 287    Each data type has an external representation determined by its input
 288    and output functions.  Many of the built-in types have
 289    obvious external formats.  However, several types are either unique
 290    to <productname>PostgreSQL</productname>, such as geometric
 291    paths, or have several possible formats, such as the date
 292    and time types.
 293    Some of the input and output functions are not invertible, i.e.,
 294    the result of an output function might lose accuracy when compared to
 295    the original input.
 296   </para>
 297
 298   <sect1 id="datatype-numeric">
 299    <title>Numeric Types</title>
 300
 301    <indexterm zone="datatype-numeric">
 302     <primary>data type</primary>
 303     <secondary>numeric</secondary>
 304    </indexterm>
 305
 306    <para>
 307     Numeric types consist of two-, four-, and eight-byte integers,
 308     four- and eight-byte floating-point numbers, and selectable-precision
 309     decimals.  <xref linkend="datatype-numeric-table"> lists the
 310     available types.
 311    </para>
 312
 313     <table id="datatype-numeric-table">
 314      <title>Numeric Types</title>
 315      <tgroup cols="4">
 316       <thead>
 317        <row>
 318         <entry>Name</entry>
 319         <entry>Storage Size</entry>
 320         <entry>Description</entry>
 321         <entry>Range</entry>
 322        </row>
 323       </thead>
 324
 325       <tbody>
 326        <row>
 327         <entry><type>smallint</></entry>
 328         <entry>2 bytes</entry>
 329         <entry>small-range integer</entry>
 330         <entry>-32768 to +32767</entry>
 331        </row>
 332        <row>
 333         <entry><type>integer</></entry>
 334         <entry>4 bytes</entry>
 335         <entry>typical choice for integer</entry>
 336         <entry>-2147483648 to +2147483647</entry>
 337        </row>
 338        <row>
 339         <entry><type>bigint</></entry>
 340         <entry>8 bytes</entry>
 341         <entry>large-range integer</entry>
 342         <entry>-9223372036854775808 to 9223372036854775807</entry>
 343        </row>
 344
 345        <row>
 346         <entry><type>decimal</></entry>
 347         <entry>variable</entry>
 348         <entry>user-specified precision, exact</entry>
 349         <entry>no limit</entry>
 350        </row>
 351        <row>
 352         <entry><type>numeric</></entry>
 353         <entry>variable</entry>
 354         <entry>user-specified precision, exact</entry>
 355         <entry>no limit</entry>
 356        </row>
 357
 358        <row>
 359         <entry><type>real</></entry>
 360         <entry>4 bytes</entry>
 361         <entry>variable-precision, inexact</entry>
 362         <entry>6 decimal digits precision</entry>
 363        </row>
 364        <row>
 365         <entry><type>double precision</></entry>
 366         <entry>8 bytes</entry>
 367         <entry>variable-precision, inexact</entry>
 368         <entry>15 decimal digits precision</entry>
 369        </row>
 370
 371        <row>
 372         <entry><type>serial</></entry>
 373         <entry>4 bytes</entry>
 374         <entry>autoincrementing integer</entry>
 375         <entry>1 to 2147483647</entry>
 376        </row>
 377
 378        <row>
 379         <entry><type>bigserial</type></entry>
 380         <entry>8 bytes</entry>
 381         <entry>large autoincrementing integer</entry>
 382         <entry>1 to 9223372036854775807</entry>
 383        </row>
 384       </tbody>
 385      </tgroup>
 386     </table>
 387
 388    <para>
 389     The syntax of constants for the numeric types is described in
 390     <xref linkend="sql-syntax-constants">.  The numeric types have a
 391     full set of corresponding arithmetic operators and
 392     functions. Refer to <xref linkend="functions"> for more
 393     information.  The following sections describe the types in detail.
 394    </para>
 395
 396    <sect2 id="datatype-int">
 397     <title>Integer Types</title>
 398
 399     <indexterm zone="datatype-int">
 400      <primary>integer</primary>
 401     </indexterm>
 402
 403     <indexterm zone="datatype-int">
 404      <primary>smallint</primary>
 405     </indexterm>
 406
 407     <indexterm zone="datatype-int">
 408      <primary>bigint</primary>
 409     </indexterm>
 410
 411     <indexterm>
 412      <primary>int4</primary>
 413      <see>integer</see>
 414     </indexterm>
 415
 416     <indexterm>
 417      <primary>int2</primary>
 418      <see>smallint</see>
 419     </indexterm>
 420
 421     <indexterm>
 422      <primary>int8</primary>
 423      <see>bigint</see>
 424     </indexterm>
 425
 426     <para>
 427      The types <type>smallint</type>, <type>integer</type>, and
 428      <type>bigint</type> store whole numbers, that is, numbers without
 429      fractional components, of various ranges.  Attempts to store
 430      values outside of the allowed range will result in an error.
 431     </para>
 432
 433     <para>
 434      The type <type>integer</type> is the common choice, as it offers
 435      the best balance between range, storage size, and performance.
 436      The <type>smallint</type> type is generally only used if disk
 437      space is at a premium.  The <type>bigint</type> type should only
 438      be used if the <type>integer</type> range is insufficient,
 439      because the latter is definitely faster.
 440     </para>
 441
 442     <para>
 443      On very minimal operating systems the <type>bigint</type> type
 444      might not function correctly, because it relies on compiler support
 445      for eight-byte integers.  On such machines, <type>bigint</type>
 446      acts the same as <type>integer</type>, but still takes up eight
 447      bytes of storage.  (We are not aware of any modern
 448      platform where this is the case.)
 449     </para>
 450
 451     <para>
 452      <acronym>SQL</acronym> only specifies the integer types
 453      <type>integer</type> (or <type>int</type>),
 454      <type>smallint</type>, and <type>bigint</type>.  The
 455      type names <type>int2</type>, <type>int4</type>, and
 456      <type>int8</type> are extensions, which are also used by some
 457      other <acronym>SQL</acronym> database systems.
 458     </para>
 459
 460    </sect2>
 461
 462    <sect2 id="datatype-numeric-decimal">
 463     <title>Arbitrary Precision Numbers</title>
 464
 465     <indexterm>
 466      <primary>numeric (data type)</primary>
 467     </indexterm>
 468
 469    <indexterm>
 470     <primary>arbitrary precision numbers</primary>
 471    </indexterm>
 472
 473     <indexterm>
 474      <primary>decimal</primary>
 475      <see>numeric</see>
 476     </indexterm>
 477
 478     <para>
 479      The type <type>numeric</type> can store numbers with up to 1000
 480      digits of precision and perform calculations exactly. It is
 481      especially recommended for storing monetary amounts and other
 482      quantities where exactness is required. However, arithmetic on
 483      <type>numeric</type> values is very slow compared to the integer
 484      types, or to the floating-point types described in the next section.
 485     </para>
 486
 487     <para>
 488      We use the following terms below:  The
 489      <firstterm>scale</firstterm> of a <type>numeric</type> is the
 490      count of decimal digits in the fractional part, to the right of
 491      the decimal point.  The <firstterm>precision</firstterm> of a
 492      <type>numeric</type> is the total count of significant digits in
 493      the whole number, that is, the number of digits to both sides of
 494      the decimal point.  So the number 23.5141 has a precision of 6
 495      and a scale of 4.  Integers can be considered to have a scale of
 496      zero.
 497     </para>
 498
 499     <para>
 500      Both the maximum precision and the maximum scale of a
 501      <type>numeric</type> column can be
 502      configured.  To declare a column of type <type>numeric</type> use
 503      the syntax:
 504 <programlisting>
 505 NUMERIC(<replaceable>precision</replaceable>, <replaceable>scale</replaceable>)
 506 </programlisting>
 507      The precision must be positive, the scale zero or positive.
 508      Alternatively:
 509 <programlisting>
 510 NUMERIC(<replaceable>precision</replaceable>)
 511 </programlisting>
 512      selects a scale of 0.  Specifying:
 513 <programlisting>
 514 NUMERIC
 515 </programlisting>
 516      without any precision or scale creates a column in which numeric
 517      values of any precision and scale can be stored, up to the
 518      implementation limit on precision.  A column of this kind will
 519      not coerce input values to any particular scale, whereas
 520      <type>numeric</type> columns with a declared scale will coerce
 521      input values to that scale.  (The <acronym>SQL</acronym> standard
 522      requires a default scale of 0, i.e., coercion to integer
 523      precision.  We find this a bit useless.  If you're concerned
 524      about portability, always specify the precision and scale
 525      explicitly.)
 526     </para>
 527
 528     <para>
 529      If the scale of a value to be stored is greater than the declared
 530      scale of the column, the system will round the value to the specified
 531      number of fractional digits.  Then, if the number of digits to the
 532      left of the decimal point exceeds the declared precision minus the
 533      declared scale, an error is raised.
 534     </para>
 535
 536     <para>
 537      Numeric values are physically stored without any extra leading or
 538      trailing zeroes.  Thus, the declared precision and scale of a column
 539      are maximums, not fixed allocations.  (In this sense the <type>numeric</>
 540      type is more akin to <type>varchar(<replaceable>n</>)</type>
 541      than to <type>char(<replaceable>n</>)</type>.)  The actual storage
 542      requirement is two bytes for each group of four decimal digits,
 543      plus five to eight bytes overhead.
 544     </para>
 545
 546     <indexterm>
 547      <primary>NaN</primary>
 548      <see>not a number</see>
 549    </indexterm>
 550
 551     <indexterm>
 552      <primary>not a number</primary>
 553      <secondary>numeric (data type)</secondary>
 554     </indexterm>
 555
 556     <para>
 557      In addition to ordinary numeric values, the <type>numeric</type>
 558      type allows the special value <literal>NaN</>, meaning
 559      <quote>not-a-number</quote>.  Any operation on <literal>NaN</>
 560      yields another <literal>NaN</>.  When writing this value
 561      as a constant in an SQL command, you must put quotes around it,
 562      for example <literal>UPDATE table SET x = 'NaN'</>.  On input,
 563      the string <literal>NaN</> is recognized in a case-insensitive manner.
 564     </para>
 565
 566     <note>
 567      <para>
 568       In most implementations of the <quote>not-a-number</> concept,
 569       <literal>NaN</> is not considered equal to any other numeric
 570       value (including <literal>NaN</>).  In order to allow
 571       <type>numeric</> values to be sorted and used in tree-based
 572       indexes, <productname>PostgreSQL</> treats <literal>NaN</>
 573       values as equal, and greater than all non-<literal>NaN</>
 574       values.
 575      </para>
 576     </note>
 577
 578     <para>
 579      The types <type>decimal</type> and <type>numeric</type> are
 580      equivalent.  Both types are part of the <acronym>SQL</acronym>
 581      standard.
 582     </para>
 583    </sect2>
 584
 585
 586    <sect2 id="datatype-float">
 587     <title>Floating-Point Types</title>
 588
 589     <indexterm zone="datatype-float">
 590      <primary>real</primary>
 591     </indexterm>
 592
 593     <indexterm zone="datatype-float">
 594      <primary>double precision</primary>
 595     </indexterm>
 596
 597     <indexterm>
 598      <primary>float4</primary>
 599      <see>real</see>
 600     </indexterm>
 601
 602     <indexterm>
 603      <primary>float8</primary>
 604      <see>double precision</see>
 605     </indexterm>
 606
 607     <indexterm zone="datatype-float">
 608      <primary>floating point</primary>
 609     </indexterm>
 610
 611     <para>
 612      The data types <type>real</type> and <type>double
 613      precision</type> are inexact, variable-precision numeric types.
 614      In practice, these types are usually implementations of
 615      <acronym>IEEE</acronym> Standard 754 for Binary Floating-Point
 616      Arithmetic (single and double precision, respectively), to the
 617      extent that the underlying processor, operating system, and
 618      compiler support it.
 619     </para>
 620
 621     <para>
 622      Inexact means that some values cannot be converted exactly to the
 623      internal format and are stored as approximations, so that storing
 624      and retrieving a value might show slight discrepancies.
 625      Managing these errors and how they propagate through calculations
 626      is the subject of an entire branch of mathematics and computer
 627      science and will not be discussed here, except for the
 628      following points:
 629      <itemizedlist>
 630       <listitem>
 631        <para>
 632         If you require exact storage and calculations (such as for
 633         monetary amounts), use the <type>numeric</type> type instead.
 634        </para>
 635       </listitem>
 636
 637       <listitem>
 638        <para>
 639         If you want to do complicated calculations with these types
 640         for anything important, especially if you rely on certain
 641         behavior in boundary cases (infinity, underflow), you should
 642         evaluate the implementation carefully.
 643        </para>
 644       </listitem>
 645
 646       <listitem>
 647        <para>
 648         Comparing two floating-point values for equality might not
 649         always work as expected.
 650        </para>
 651       </listitem>
 652      </itemizedlist>
 653     </para>
 654
 655     <para>
 656      On most platforms, the <type>real</type> type has a range of at least
 657      1E-37 to 1E+37 with a precision of at least 6 decimal digits.  The
 658      <type>double precision</type> type typically has a range of around
 659      1E-307 to 1E+308 with a precision of at least 15 digits.  Values that
 660      are too large or too small will cause an error.  Rounding might
 661      take place if the precision of an input number is too high.
 662      Numbers too close to zero that are not representable as distinct
 663      from zero will cause an underflow error.
 664     </para>
 665
 666     <indexterm>
 667      <primary>not a number</primary>
 668      <secondary>double precision</secondary>
 669     </indexterm>
 670
 671     <para>
 672      In addition to ordinary numeric values, the floating-point types
 673      have several special values:
 674 <literallayout>
 675 <literal>Infinity</literal>
 676 <literal>-Infinity</literal>
 677 <literal>NaN</literal>
 678 </literallayout>
 679      These represent the IEEE 754 special values
 680      <quote>infinity</quote>, <quote>negative infinity</quote>, and
 681      <quote>not-a-number</quote>, respectively.  (On a machine whose
 682      floating-point arithmetic does not follow IEEE 754, these values
 683      will probably not work as expected.)  When writing these values
 684      as constants in an SQL command, you must put quotes around them,
 685      for example <literal>UPDATE table SET x = 'Infinity'</>.  On input,
 686      these strings are recognized in a case-insensitive manner.
 687     </para>
 688
 689     <note>
 690      <para>
 691       IEEE754 specifies that <literal>NaN</> should not compare equal
 692       to any other floating-point value (including <literal>NaN</>).
 693       In order to allow floating-point values to be sorted and used
 694       in tree-based indexes, <productname>PostgreSQL</> treats
 695       <literal>NaN</> values as equal, and greater than all
 696       non-<literal>NaN</> values.
 697      </para>
 698     </note>
 699
 700     <para>
 701      <productname>PostgreSQL</productname> also supports the SQL-standard
 702      notations <type>float</type> and
 703      <type>float(<replaceable>p</replaceable>)</type> for specifying
 704      inexact numeric types.  Here, <replaceable>p</replaceable> specifies
 705      the minimum acceptable precision in <emphasis>binary</> digits.
 706      <productname>PostgreSQL</productname> accepts
 707      <type>float(1)</type> to <type>float(24)</type> as selecting the
 708      <type>real</type> type, while
 709      <type>float(25)</type> to <type>float(53)</type> select
 710      <type>double precision</type>.  Values of <replaceable>p</replaceable>
 711      outside the allowed range draw an error.
 712      <type>float</type> with no precision specified is taken to mean
 713      <type>double precision</type>.
 714     </para>
 715
 716     <note>
 717      <para>
 718       Prior to <productname>PostgreSQL</productname> 7.4, the precision in
 719       <type>float(<replaceable>p</replaceable>)</type> was taken to mean
 720       so many <emphasis>decimal</> digits.  This has been corrected to match the SQL
 721       standard, which specifies that the precision is measured in binary
 722       digits.  The assumption that <type>real</type> and
 723       <type>double precision</type> have exactly 24 and 53 bits in the
 724       mantissa respectively is correct for IEEE-standard floating point
 725       implementations.  On non-IEEE platforms it might be off a little, but
 726       for simplicity the same ranges of <replaceable>p</replaceable> are used
 727       on all platforms.
 728      </para>
 729     </note>
 730
 731    </sect2>
 732
 733    <sect2 id="datatype-serial">
 734     <title>Serial Types</title>
 735
 736     <indexterm zone="datatype-serial">
 737      <primary>serial</primary>
 738     </indexterm>
 739
 740     <indexterm zone="datatype-serial">
 741      <primary>bigserial</primary>
 742     </indexterm>
 743
 744     <indexterm zone="datatype-serial">
 745      <primary>serial4</primary>
 746     </indexterm>
 747
 748     <indexterm zone="datatype-serial">
 749      <primary>serial8</primary>
 750     </indexterm>
 751
 752     <indexterm>
 753      <primary>auto-increment</primary>
 754      <see>serial</see>
 755     </indexterm>
 756
 757     <indexterm>
 758      <primary>sequence</primary>
 759      <secondary>and serial type</secondary>
 760     </indexterm>
 761
 762     <para>
 763      The data types <type>serial</type> and <type>bigserial</type>
 764      are not true types, but merely
 765      a notational convenience for creating unique identifier columns
 766      (similar to the <literal>AUTO_INCREMENT</literal> property
 767      supported by some other databases). In the current
 768      implementation, specifying:
 769
 770 <programlisting>
 771 CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
 772     <replaceable class="parameter">colname</replaceable> SERIAL
 773 );
 774 </programlisting>
 775
 776      is equivalent to specifying:
 777
 778 <programlisting>
 779 CREATE SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq;
 780 CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
 781     <replaceable class="parameter">colname</replaceable> integer NOT NULL DEFAULT nextval('<replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq')
 782 );
 783 ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq OWNED BY <replaceable class="parameter">tablename</replaceable>.<replaceable class="parameter">colname</replaceable>;
 784 </programlisting>
 785
 786      Thus, we have created an integer column and arranged for its default
 787      values to be assigned from a sequence generator.  A <literal>NOT NULL</>
 788      constraint is applied to ensure that a null value cannot be
 789      inserted.  (In most cases you would also want to attach a
 790      <literal>UNIQUE</> or <literal>PRIMARY KEY</> constraint to prevent
 791      duplicate values from being inserted by accident, but this is
 792      not automatic.)  Lastly, the sequence is marked as <quote>owned by</>
 793      the column, so that it will be dropped if the column or table is dropped.
 794     </para>
 795
 796     <note>
 797      <para>
 798       Prior to <productname>PostgreSQL</productname> 7.3, <type>serial</type>
 799       implied <literal>UNIQUE</literal>.  This is no longer automatic.  If
 800       you wish a serial column to have a unique constraint or be a
 801       primary key, it must now be specified, just like
 802       any other data type.
 803      </para>
 804     </note>
 805
 806     <para>
 807      To insert the next value of the sequence into the <type>serial</type>
 808      column, specify that the <type>serial</type>
 809      column should be assigned its default value. This can be done
 810      either by excluding the column from the list of columns in
 811      the <command>INSERT</command> statement, or through the use of
 812      the <literal>DEFAULT</literal> key word.
 813     </para>
 814
 815     <para>
 816      The type names <type>serial</type> and <type>serial4</type> are
 817      equivalent: both create <type>integer</type> columns.  The type
 818      names <type>bigserial</type> and <type>serial8</type> work
 819      the same way, except that they create a <type>bigint</type>
 820      column.  <type>bigserial</type> should be used if you anticipate
 821      the use of more than 2<superscript>31</> identifiers over the
 822      lifetime of the table.
 823     </para>
 824
 825     <para>
 826      The sequence created for a <type>serial</type> column is
 827      automatically dropped when the owning column is dropped.
 828      You can drop the sequence without dropping the column, but this
 829      will force removal of the column default expression.
 830     </para>
 831    </sect2>
 832   </sect1>
 833
 834   <sect1 id="datatype-money">
 835    <title>Monetary Types</title>
 836
 837    <para>
 838     The <type>money</type> type stores a currency amount with a fixed
 839     fractional precision; see <xref
 840     linkend="datatype-money-table">.  The fractional precision is
 841     determined by the database's <xref linkend="guc-lc-monetary"> setting.
 842     Input is accepted in a variety of formats, including integer and
 843     floating-point literals, as well as typical
 844     currency formatting, such as <literal>'$1,000.00'</literal>.
 845     Output is generally in the latter form but depends on the locale.
 846     Non-quoted numeric values can be converted to <type>money</type> by
 847     casting the numeric value to <type>text</type> and then
 848     <type>money</type>, for example:
 849 <programlisting>
 850 SELECT 1234::text::money;
 851 </programlisting>
 852     There is no simple way of doing the reverse in a locale-independent
 853     manner, namely casting a <type>money</type> value to a numeric type.
 854     If you know the currency symbol and thousands separator you can use
 855     <function>regexp_replace()</>:
 856 <programlisting>
 857 SELECT regexp_replace('52093.89'::money::text, '[$,]', '', 'g')::numeric;
 858 </programlisting>
 859
 860    </para>
 861
 862    <para>
 863     Since the output of this data type is locale-sensitive, it might not
 864     work to load <type>money</> data into a database that has a different
 865     setting of <varname>lc_monetary</>.  To avoid problems, before
 866     restoring a dump into a new database make sure <varname>lc_monetary</> has the same or
 867     equivalent value as in the database that was dumped.
 868    </para>
 869
 870     <table id="datatype-money-table">
 871      <title>Monetary Types</title>
 872      <tgroup cols="4">
 873       <thead>
 874        <row>
 875         <entry>Name</entry>
 876         <entry>Storage Size</entry>
 877         <entry>Description</entry>
 878         <entry>Range</entry>
 879        </row>
 880       </thead>
 881       <tbody>
 882        <row>
 883         <entry>money</entry>
 884         <entry>8 bytes</entry>
 885         <entry>currency amount</entry>
 886         <entry>-92233720368547758.08 to +92233720368547758.07</entry>
 887        </row>
 888       </tbody>
 889      </tgroup>
 890     </table>
 891   </sect1>
 892
 893
 894   <sect1 id="datatype-character">
 895    <title>Character Types</title>
 896
 897    <indexterm zone="datatype-character">
 898     <primary>character string</primary>
 899     <secondary>data types</secondary>
 900    </indexterm>
 901
 902    <indexterm>
 903     <primary>string</primary>
 904     <see>character string</see>
 905    </indexterm>
 906
 907    <indexterm zone="datatype-character">
 908     <primary>character</primary>
 909    </indexterm>
 910
 911    <indexterm zone="datatype-character">
 912     <primary>character varying</primary>
 913    </indexterm>
 914
 915    <indexterm zone="datatype-character">
 916     <primary>text</primary>
 917    </indexterm>
 918
 919    <indexterm zone="datatype-character">
 920     <primary>char</primary>
 921    </indexterm>
 922
 923    <indexterm zone="datatype-character">
 924     <primary>varchar</primary>
 925    </indexterm>
 926
 927     <table id="datatype-character-table">
 928      <title>Character Types</title>
 929      <tgroup cols="2">
 930       <thead>
 931        <row>
 932         <entry>Name</entry>
 933         <entry>Description</entry>
 934        </row>
 935       </thead>
 936       <tbody>
 937        <row>
 938         <entry><type>character varying(<replaceable>n</>)</type>, <type>varchar(<replaceable>n</>)</type></entry>
 939         <entry>variable-length with limit</entry>
 940        </row>
 941        <row>
 942         <entry><type>character(<replaceable>n</>)</type>, <type>char(<replaceable>n</>)</type></entry>
 943         <entry>fixed-length, blank padded</entry>
 944        </row>
 945        <row>
 946         <entry><type>text</type></entry>
 947         <entry>variable unlimited length</entry>
 948        </row>
 949      </tbody>
 950      </tgroup>
 951     </table>
 952
 953    <para>
 954     <xref linkend="datatype-character-table"> shows the
 955     general-purpose character types available in
 956     <productname>PostgreSQL</productname>.
 957    </para>
 958
 959    <para>
 960     <acronym>SQL</acronym> defines two primary character types:
 961     <type>character varying(<replaceable>n</>)</type> and
 962     <type>character(<replaceable>n</>)</type>, where <replaceable>n</>
 963     is a positive integer.  Both of these types can store strings up to
 964     <replaceable>n</> characters (not bytes) in length.  An attempt to store a
 965     longer string into a column of these types will result in an
 966     error, unless the excess characters are all spaces, in which case
 967     the string will be truncated to the maximum length. (This somewhat
 968     bizarre exception is required by the <acronym>SQL</acronym>
 969     standard.) If the string to be stored is shorter than the declared
 970     length, values of type <type>character</type> will be space-padded;
 971     values of type <type>character varying</type> will simply store the
 972     shorter
 973     string.
 974    </para>
 975
 976    <para>
 977     If one explicitly casts a value to <type>character
 978     varying(<replaceable>n</>)</type> or
 979     <type>character(<replaceable>n</>)</type>, then an over-length
 980     value will be truncated to <replaceable>n</> characters without
 981     raising an error. (This too is required by the
 982     <acronym>SQL</acronym> standard.)
 983    </para>
 984
 985    <para>
 986     The notations <type>varchar(<replaceable>n</>)</type> and
 987     <type>char(<replaceable>n</>)</type> are aliases for <type>character
 988     varying(<replaceable>n</>)</type> and
 989     <type>character(<replaceable>n</>)</type>, respectively.
 990     <type>character</type> without length specifier is equivalent to
 991     <type>character(1)</type>. If <type>character varying</type> is used
 992     without length specifier, the type accepts strings of any size. The
 993     latter is a <productname>PostgreSQL</> extension.
 994    </para>
 995
 996    <para>
 997     In addition, <productname>PostgreSQL</productname> provides the
 998     <type>text</type> type, which stores strings of any length.
 999     Although the type <type>text</type> is not in the
1000     <acronym>SQL</acronym> standard, several other SQL database
1001     management systems have it as well.
1002    </para>
1003
1004    <para>
1005     Values of type <type>character</type> are physically padded
1006     with spaces to the specified width <replaceable>n</>, and are
1007     stored and displayed that way.  However, the padding spaces are
1008     treated as semantically insignificant.  Trailing spaces are
1009     disregarded when comparing two values of type <type>character</type>,
1010     and they will be removed when converting a <type>character</type> value
1011     to one of the other string types.  Note that trailing spaces
1012     <emphasis>are</> semantically significant in
1013     <type>character varying</type> and <type>text</type> values.
1014    </para>
1015
1016    <para>
1017     The storage requirement for a short string (up to 126 bytes) is 1 byte
1018     plus the actual string, which includes the space padding in the case of
1019     <type>character</type>.  Longer strings have 4 bytes of overhead instead
1020     of 1.  Long strings are compressed by the system automatically, so
1021     the physical requirement on disk might be less. Very long values are also
1022     stored in background tables so that they do not interfere with rapid
1023     access to shorter column values. In any case, the longest
1024     possible character string that can be stored is about 1 GB. (The
1025     maximum value that will be allowed for <replaceable>n</> in the data
1026     type declaration is less than that. It wouldn't be useful to
1027     change this because with multibyte character encodings the number of
1028     characters and bytes can be quite different. If you desire to
1029     store long strings with no specific upper limit, use
1030     <type>text</type> or <type>character varying</type> without a length
1031     specifier, rather than making up an arbitrary length limit.)
1032    </para>
1033
1034    <tip>
1035     <para>
1036      There is no performance difference among these three types,
1037      apart from increased storage space when using the blank-padded
1038      type, and a few extra CPU cycles to check the length when storing into
1039      a length-constrained column.  While
1040      <type>character(<replaceable>n</>)</type> has performance
1041      advantages in some other database systems, there is no such advantage in
1042      <productname>PostgreSQL</productname>; in fact
1043      <type>character(<replaceable>n</>)</type> is usually the slowest of
1044      the three because of its additional storage costs.  In most situations
1045      <type>text</type> or <type>character varying</type> should be used
1046      instead.
1047     </para>
1048    </tip>
1049
1050    <para>
1051     Refer to <xref linkend="sql-syntax-strings"> for information about
1052     the syntax of string literals, and to <xref linkend="functions">
1053     for information about available operators and functions. The
1054     database character set determines the character set used to store
1055     textual values; for more information on character set support,
1056     refer to <xref linkend="multibyte">.
1057    </para>
1058
1059    <example>
1060     <title>Using the character types</title>
1061
1062 <programlisting>
1063 CREATE TABLE test1 (a character(4));
1064 INSERT INTO test1 VALUES ('ok');
1065 SELECT a, char_length(a) FROM test1; -- <co id="co.datatype-char">
1066 <computeroutput>
1067   a   | char_length
1068 ------+-------------
1069  ok   |           2
1070 </computeroutput>
1071
1072 CREATE TABLE test2 (b varchar(5));
1073 INSERT INTO test2 VALUES ('ok');
1074 INSERT INTO test2 VALUES ('good      ');
1075 INSERT INTO test2 VALUES ('too long');
1076 <computeroutput>ERROR:  value too long for type character varying(5)</computeroutput>
1077 INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
1078 SELECT b, char_length(b) FROM test2;
1079 <computeroutput>
1080    b   | char_length
1081 -------+-------------
1082  ok    |           2
1083  good  |           5
1084  too l |           5
1085 </computeroutput>
1086 </programlisting>
1087     <calloutlist>
1088      <callout arearefs="co.datatype-char">
1089       <para>
1090        The <function>char_length</function> function is discussed in
1091        <xref linkend="functions-string">.
1092       </para>
1093      </callout>
1094     </calloutlist>
1095    </example>
1096
1097    <para>
1098     There are two other fixed-length character types in
1099     <productname>PostgreSQL</productname>, shown in <xref
1100     linkend="datatype-character-special-table">. The <type>name</type>
1101     type exists <emphasis>only</emphasis> for the storage of identifiers
1102     in the internal system catalogs and is not intended for use by the general user. Its
1103     length is currently defined as 64 bytes (63 usable characters plus
1104     terminator) but should be referenced using the constant
1105     <symbol>NAMEDATALEN</symbol> in <literal>C</> source code.
1106     The length is set at compile time (and
1107     is therefore adjustable for special uses); the default maximum
1108     length might change in a future release. The type <type>"char"</type>
1109     (note the quotes) is different from <type>char(1)</type> in that it
1110     only uses one byte of storage. It is internally used in the system
1111     catalogs as a simplistic enumeration type.
1112    </para>
1113
1114     <table id="datatype-character-special-table">
1115      <title>Special Character Types</title>
1116      <tgroup cols="3">
1117       <thead>
1118        <row>
1119         <entry>Name</entry>
1120         <entry>Storage Size</entry>
1121         <entry>Description</entry>
1122        </row>
1123       </thead>
1124       <tbody>
1125        <row>
1126         <entry><type>"char"</type></entry>
1127         <entry>1 byte</entry>
1128         <entry>single-byte internal type</entry>
1129        </row>
1130        <row>
1131         <entry><type>name</type></entry>
1132         <entry>64 bytes</entry>
1133         <entry>internal type for object names</entry>
1134        </row>
1135       </tbody>
1136      </tgroup>
1137     </table>
1138
1139   </sect1>
1140
1141  <sect1 id="datatype-binary">
1142   <title>Binary Data Types</title>
1143
1144   <indexterm zone="datatype-binary">
1145    <primary>binary data</primary>
1146   </indexterm>
1147
1148   <indexterm zone="datatype-binary">
1149    <primary>bytea</primary>
1150   </indexterm>
1151
1152    <para>
1153     The <type>bytea</type> data type allows storage of binary strings;
1154     see <xref linkend="datatype-binary-table">.
1155    </para>
1156
1157    <table id="datatype-binary-table">
1158     <title>Binary Data Types</title>
1159     <tgroup cols="3">
1160      <thead>
1161       <row>
1162        <entry>Name</entry>
1163        <entry>Storage Size</entry>
1164        <entry>Description</entry>
1165       </row>
1166      </thead>
1167      <tbody>
1168       <row>
1169        <entry><type>bytea</type></entry>
1170        <entry>1 or 4 bytes plus the actual binary string</entry>
1171        <entry>variable-length binary string</entry>
1172       </row>
1173      </tbody>
1174     </tgroup>
1175    </table>
1176
1177    <para>
1178     A binary string is a sequence of octets (or bytes).  Binary
1179     strings are distinguished from character strings in two
1180     ways: First, binary strings specifically allow storing
1181     octets of value zero and other <quote>non-printable</quote>
1182     octets (usually, octets outside the range 32 to 126).
1183     Character strings disallow zero octets, and also disallow any
1184     other octet values and sequences of octet values that are invalid
1185     according to the database's selected character set encoding.
1186     Second, operations on binary strings process the actual bytes,
1187     whereas the processing of character strings depends on locale settings.
1188     In short, binary strings are appropriate for storing data that the
1189     programmer thinks of as <quote>raw bytes</>, whereas character
1190     strings are appropriate for storing text.
1191    </para>
1192
1193    <para>
1194     When entering <type>bytea</type> values, octets of certain
1195     values <emphasis>must</emphasis> be escaped (but all octet
1196     values <emphasis>can</emphasis> be escaped) when used as part
1197     of a string literal in an <acronym>SQL</acronym> statement. In
1198     general, to escape an octet, convert it into its three-digit
1199     octal value and precede it
1200     by two backslashes.  <xref linkend="datatype-binary-sqlesc">
1201     shows the characters that must be escaped, and gives the alternative
1202     escape sequences where applicable.
1203    </para>
1204
1205    <table id="datatype-binary-sqlesc">
1206     <title><type>bytea</> Literal Escaped Octets</title>
1207     <tgroup cols="5">
1208      <thead>
1209       <row>
1210        <entry>Decimal Octet Value</entry>
1211        <entry>Description</entry>
1212        <entry>Escaped Input Representation</entry>
1213        <entry>Example</entry>
1214        <entry>Output Representation</entry>
1215       </row>
1216      </thead>
1217
1218      <tbody>
1219       <row>
1220        <entry>0</entry>
1221        <entry>zero octet</entry>
1222        <entry><literal>E'\\000'</literal></entry>
1223        <entry><literal>SELECT E'\\000'::bytea;</literal></entry>
1224        <entry><literal>\000</literal></entry>
1225       </row>
1226
1227       <row>
1228        <entry>39</entry>
1229        <entry>single quote</entry>
1230        <entry><literal>''''</literal> or <literal>E'\\047'</literal></entry>
1231        <entry><literal>SELECT E'\''::bytea;</literal></entry>
1232        <entry><literal>'</literal></entry>
1233       </row>
1234
1235       <row>
1236        <entry>92</entry>
1237        <entry>backslash</entry>
1238        <entry><literal>E'\\\\'</literal> or <literal>E'\\134'</literal></entry>
1239        <entry><literal>SELECT E'\\\\'::bytea;</literal></entry>
1240        <entry><literal>\\</literal></entry>
1241       </row>
1242
1243       <row>
1244        <entry>0 to 31 and 127 to 255</entry>
1245        <entry><quote>non-printable</quote> octets</entry>
1246        <entry><literal>E'\\<replaceable>xxx'</></literal> (octal value)</entry>
1247        <entry><literal>SELECT E'\\001'::bytea;</literal></entry>
1248        <entry><literal>\001</literal></entry>
1249       </row>
1250
1251      </tbody>
1252     </tgroup>
1253    </table>
1254
1255    <para>
1256     The requirement to escape <emphasis>non-printable</emphasis> octets
1257     varies depending on locale settings. In some instances you can get away
1258     with leaving them unescaped. Note that the result in each of the examples
1259     in <xref linkend="datatype-binary-sqlesc"> was exactly one octet in
1260     length, even though the output representation is sometimes
1261     more than one character.
1262    </para>
1263
1264    <para>
1265     The reason multiple backslashes are required, as shown
1266     in <xref linkend="datatype-binary-sqlesc">, is that an input
1267     string written as a string literal must pass through two parse
1268     phases in the <productname>PostgreSQL</productname> server.
1269     The first backslash of each pair is interpreted as an escape
1270     character by the string-literal parser (assuming escape string
1271     syntax is used) and is therefore consumed, leaving the second backslash of the
1272     pair.  (Dollar-quoted strings can be used to avoid this level
1273     of escaping.)  The remaining backslash is then recognized by the
1274     <type>bytea</type> input function as starting either a three
1275     digit octal value or escaping another backslash.  For example,
1276     a string literal passed to the server as <literal>E'\\001'</literal>
1277     becomes <literal>\001</literal> after passing through the
1278     escape string parser. The <literal>\001</literal> is then sent
1279     to the <type>bytea</type> input function, where it is converted
1280     to a single octet with a decimal value of 1.  Note that the
1281     single-quote character is not treated specially by <type>bytea</type>,
1282     so it follows the normal rules for string literals.  (See also
1283     <xref linkend="sql-syntax-strings">.)
1284    </para>
1285
1286    <para>
1287     <type>Bytea</type> octets are sometimes escaped when output. In general, each
1288     <quote>non-printable</quote> octet is converted into
1289     its equivalent three-digit octal value and preceded by one backslash.
1290     Most <quote>printable</quote> octets are represented by their standard
1291     representation in the client character set. The octet with decimal
1292     value 92 (backslash) is doubled in the output.
1293     Details are in <xref linkend="datatype-binary-resesc">.
1294    </para>
1295
1296    <table id="datatype-binary-resesc">
1297     <title><type>bytea</> Output Escaped Octets</title>
1298     <tgroup cols="5">
1299      <thead>
1300       <row>
1301        <entry>Decimal Octet Value</entry>
1302        <entry>Description</entry>
1303        <entry>Escaped Output Representation</entry>
1304        <entry>Example</entry>
1305        <entry>Output Result</entry>
1306       </row>
1307      </thead>
1308
1309      <tbody>
1310
1311       <row>
1312        <entry>92</entry>
1313        <entry>backslash</entry>
1314        <entry><literal>\\</literal></entry>
1315        <entry><literal>SELECT E'\\134'::bytea;</literal></entry>
1316        <entry><literal>\\</literal></entry>
1317       </row>
1318
1319       <row>
1320        <entry>0 to 31 and 127 to 255</entry>
1321        <entry><quote>non-printable</quote> octets</entry>
1322        <entry><literal>\<replaceable>xxx</></literal> (octal value)</entry>
1323        <entry><literal>SELECT E'\\001'::bytea;</literal></entry>
1324        <entry><literal>\001</literal></entry>
1325       </row>
1326
1327       <row>
1328        <entry>32 to 126</entry>
1329        <entry><quote>printable</quote> octets</entry>
1330        <entry>client character set representation</entry>
1331        <entry><literal>SELECT E'\\176'::bytea;</literal></entry>
1332        <entry><literal>~</literal></entry>
1333       </row>
1334
1335      </tbody>
1336     </tgroup>
1337    </table>
1338
1339    <para>
1340     Depending on the front end to <productname>PostgreSQL</> you use,
1341     you might have additional work to do in terms of escaping and
1342     unescaping <type>bytea</type> strings. For example, you might also
1343     have to escape line feeds and carriage returns if your interface
1344     automatically translates these.
1345    </para>
1346
1347    <para>
1348         The <acronym>SQL</acronym> standard defines a different binary
1349         string type, called <type>BLOB</type> or <type>BINARY LARGE
1350         OBJECT</type>.  The input format is different from
1351         <type>bytea</type>, but the provided functions and operators are
1352         mostly the same.
1353    </para>
1354  </sect1>
1355
1356
1357   <sect1 id="datatype-datetime">
1358    <title>Date/Time Types</title>
1359
1360    <indexterm zone="datatype-datetime">
1361     <primary>date</primary>
1362    </indexterm>
1363    <indexterm zone="datatype-datetime">
1364     <primary>time</primary>
1365    </indexterm>
1366    <indexterm zone="datatype-datetime">
1367     <primary>time without time zone</primary>
1368    </indexterm>
1369    <indexterm zone="datatype-datetime">
1370     <primary>time with time zone</primary>
1371    </indexterm>
1372    <indexterm zone="datatype-datetime">
1373     <primary>timestamp</primary>
1374    </indexterm>
1375    <indexterm zone="datatype-datetime">
1376     <primary>timestamp with time zone</primary>
1377    </indexterm>
1378    <indexterm zone="datatype-datetime">
1379     <primary>timestamp without time zone</primary>
1380    </indexterm>
1381    <indexterm zone="datatype-datetime">
1382     <primary>interval</primary>
1383    </indexterm>
1384    <indexterm zone="datatype-datetime">
1385     <primary>time span</primary>
1386    </indexterm>
1387
1388    <para>
1389     <productname>PostgreSQL</productname> supports the full set of
1390     <acronym>SQL</acronym> date and time types, shown in <xref
1391     linkend="datatype-datetime-table">.  The operations available
1392     on these data types are described in
1393     <xref linkend="functions-datetime">.
1394    </para>
1395
1396     <table id="datatype-datetime-table">
1397      <title>Date/Time Types</title>
1398      <tgroup cols="6">
1399       <thead>
1400        <row>
1401         <entry>Name</entry>
1402         <entry>Storage Size</entry>
1403         <entry>Description</entry>
1404         <entry>Low Value</entry>
1405         <entry>High Value</entry>
1406         <entry>Resolution</entry>
1407        </row>
1408       </thead>
1409       <tbody>
1410        <row>
1411         <entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
1412         <entry>8 bytes</entry>
1413         <entry>both date and time (no time zone)</entry>
1414         <entry>4713 BC</entry>
1415         <entry>294276 AD</entry>
1416         <entry>1 microsecond / 14 digits</entry>
1417        </row>
1418        <row>
1419         <entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
1420         <entry>8 bytes</entry>
1421         <entry>both date and time, with time zone</entry>
1422         <entry>4713 BC</entry>
1423         <entry>294276 AD</entry>
1424         <entry>1 microsecond / 14 digits</entry>
1425        </row>
1426        <row>
1427         <entry><type>date</type></entry>
1428         <entry>4 bytes</entry>
1429         <entry>date (no time of day)</entry>
1430         <entry>4713 BC</entry>
1431         <entry>5874897 AD</entry>
1432         <entry>1 day</entry>
1433        </row>
1434        <row>
1435         <entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
1436         <entry>8 bytes</entry>
1437         <entry>time of day (no date)</entry>
1438         <entry>00:00:00</entry>
1439         <entry>24:00:00</entry>
1440         <entry>1 microsecond / 14 digits</entry>
1441        </row>
1442        <row>
1443         <entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
1444         <entry>12 bytes</entry>
1445         <entry>times of day only, with time zone</entry>
1446         <entry>00:00:00+1459</entry>
1447         <entry>24:00:00-1459</entry>
1448         <entry>1 microsecond / 14 digits</entry>
1449        </row>
1450        <row>
1451         <entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
1452         <entry>12 bytes</entry>
1453         <entry>time interval</entry>
1454         <entry>-178000000 years</entry>
1455         <entry>178000000 years</entry>
1456         <entry>1 microsecond / 14 digits</entry>
1457        </row>
1458       </tbody>
1459      </tgroup>
1460     </table>
1461
1462    <note>
1463     <para>
1464      Prior to <productname>PostgreSQL</productname> 7.3, writing just
1465      <type>timestamp</type> was equivalent to <type>timestamp with
1466      time zone</type>.  This was changed for SQL compliance.
1467     </para>
1468    </note>
1469
1470    <para>
1471     <type>time</type>, <type>timestamp</type>, and
1472     <type>interval</type> accept an optional precision value
1473     <replaceable>p</replaceable> which specifies the number of
1474     fractional digits retained in the seconds field. By default, there
1475     is no explicit bound on precision.  The allowed range of
1476     <replaceable>p</replaceable> is from 0 to 6 for the
1477     <type>timestamp</type> and <type>interval</type> types.
1478    </para>
1479
1480    <note>
1481    <para>
1482     When <type>timestamp</> values are stored as eight-byte integers
1483     (currently the default), microsecond precision is available over
1484     the full range of values. When <type>timestamp</> values are
1485     stored as double precision floating-point numbers instead (a
1486     deprecated compile-time option), the effective limit of precision
1487     might be less than 6. <type>timestamp</type> values are stored as
1488     seconds before or after midnight 2000-01-01.  When
1489     <type>timestamp</type> values are implemented using floating-point
1490     numbers, microsecond precision is achieved for dates within a few
1491     years of 2000-01-01, but the precision degrades for dates further
1492     away. Note that using floating-point datetimes allows a larger
1493     range of <type>timestamp</type> values to be represented than
1494     shown above: from 4713 BC up to 5874897 AD.
1495    </para>
1496
1497    <para>
1498     The same compile-time option also determines whether
1499     <type>time</type> and <type>interval</type> values are stored as
1500     floating-point numbers or eight-byte integers.  In the
1501     floating-point case, large <type>interval</type> values degrade in
1502     precision as the size of the interval increases.
1503    </para>
1504    </note>
1505
1506    <para>
1507     For the <type>time</type> types, the allowed range of
1508     <replaceable>p</replaceable> is from 0 to 6 when eight-byte integer
1509     storage is used, or from 0 to 10 when floating-point storage is used.
1510    </para>
1511
1512    <para>
1513     The <type>interval</type> type has an additional option, which is
1514     to restrict the set of stored fields by writing one of these phrases:
1515 <programlisting>
1516     YEAR
1517     MONTH
1518     DAY
1519     HOUR
1520     MINUTE
1521     SECOND
1522     YEAR TO MONTH
1523     DAY TO HOUR
1524     DAY TO MINUTE
1525     DAY TO SECOND
1526     HOUR TO MINUTE
1527     HOUR TO SECOND
1528     MINUTE TO SECOND
1529 </programlisting>
1530     Note that if both <replaceable>fields</replaceable> and
1531     <replaceable>p</replaceable> are specified, the
1532     <replaceable>fields</replaceable> must include <literal>SECOND</>,
1533     since the precision applies only to the seconds.
1534    </para>
1535
1536    <para>
1537     The type <type>time with time zone</type> is defined by the SQL
1538     standard, but the definition exhibits properties which lead to
1539     questionable usefulness. In most cases, a combination of
1540     <type>date</type>, <type>time</type>, <type>timestamp without time
1541     zone</type>, and <type>timestamp with time zone</type> should
1542     provide a complete range of date/time functionality required by
1543     any application.
1544    </para>
1545
1546    <para>
1547     The types <type>abstime</type>
1548     and <type>reltime</type> are lower precision types which are used internally.
1549     You are discouraged from using these types in
1550     applications;  these internal types
1551     might disappear in a future release.
1552    </para>
1553
1554    <sect2 id="datatype-datetime-input">
1555     <title>Date/Time Input</title>
1556
1557     <para>
1558      Date and time input is accepted in almost any reasonable format, including
1559      ISO 8601, <acronym>SQL</acronym>-compatible,
1560      traditional <productname>POSTGRES</productname>, and others.
1561      For some formats, ordering of day, month, and year in date input is
1562      ambiguous and there is support for specifying the expected
1563      ordering of these fields.  Set the <xref linkend="guc-datestyle"> parameter
1564      to <literal>MDY</> to select month-day-year interpretation,
1565      <literal>DMY</> to select day-month-year interpretation, or
1566      <literal>YMD</> to select year-month-day interpretation.
1567     </para>
1568
1569     <para>
1570      <productname>PostgreSQL</productname> is more flexible in
1571      handling date/time input than the
1572      <acronym>SQL</acronym> standard requires.
1573      See <xref linkend="datetime-appendix">
1574      for the exact parsing rules of date/time input and for the
1575      recognized text fields including months, days of the week, and
1576      time zones.
1577     </para>
1578
1579     <para>
1580      Remember that any date or time literal input needs to be enclosed
1581      in single quotes, like text strings.  Refer to
1582      <xref linkend="sql-syntax-constants-generic"> for more
1583      information.
1584      <acronym>SQL</acronym> requires the following syntax
1585 <synopsis>
1586 <replaceable>type</replaceable> [ (<replaceable>p</replaceable>) ] '<replaceable>value</replaceable>'
1587 </synopsis>
1588      where <replaceable>p</replaceable> is an optional precision
1589      specification giving the number of
1590      fractional digits in the seconds field. Precision can be
1591      specified for <type>time</type>, <type>timestamp</type>, and
1592      <type>interval</type> types.  The allowed values are mentioned
1593      above.  If no precision is specified in a constant specification,
1594      it defaults to the precision of the literal value.
1595     </para>
1596
1597     <sect3>
1598     <title>Dates</title>
1599
1600     <indexterm>
1601      <primary>date</primary>
1602     </indexterm>
1603
1604     <para>
1605      <xref linkend="datatype-datetime-date-table"> shows some possible
1606      inputs for the <type>date</type> type.
1607     </para>
1608
1609      <table id="datatype-datetime-date-table">
1610       <title>Date Input</title>
1611       <tgroup cols="2">
1612        <thead>
1613         <row>
1614          <entry>Example</entry>
1615          <entry>Description</entry>
1616         </row>
1617        </thead>
1618        <tbody>
1619         <row>
1620          <entry>1999-01-08</entry>
1621          <entry>ISO 8601; January 8 in any mode
1622          (recommended format)</entry>
1623         </row>
1624         <row>
1625          <entry>January 8, 1999</entry>
1626          <entry>unambiguous in any <varname>datestyle</varname> input mode</entry>
1627         </row>
1628         <row>
1629          <entry>1/8/1999</entry>
1630          <entry>January 8 in <literal>MDY</> mode;
1631           August 1 in <literal>DMY</> mode</entry>
1632         </row>
1633         <row>
1634          <entry>1/18/1999</entry>
1635          <entry>January 18 in <literal>MDY</> mode;
1636           rejected in other modes</entry>
1637         </row>
1638         <row>
1639          <entry>01/02/03</entry>
1640          <entry>January 2, 2003 in <literal>MDY</> mode;
1641           February 1, 2003 in <literal>DMY</> mode;
1642           February 3, 2001 in <literal>YMD</> mode
1643          </entry>
1644         </row>
1645         <row>
1646          <entry>1999-Jan-08</entry>
1647          <entry>January 8 in any mode</entry>
1648         </row>
1649         <row>
1650          <entry>Jan-08-1999</entry>
1651          <entry>January 8 in any mode</entry>
1652         </row>
1653         <row>
1654          <entry>08-Jan-1999</entry>
1655          <entry>January 8 in any mode</entry>
1656         </row>
1657         <row>
1658          <entry>99-Jan-08</entry>
1659          <entry>January 8 in <literal>YMD</> mode, else error</entry>
1660         </row>
1661         <row>
1662          <entry>08-Jan-99</entry>
1663          <entry>January 8, except error in <literal>YMD</> mode</entry>
1664         </row>
1665         <row>
1666          <entry>Jan-08-99</entry>
1667          <entry>January 8, except error in <literal>YMD</> mode</entry>
1668         </row>
1669         <row>
1670          <entry>19990108</entry>
1671          <entry>ISO 8601; January 8, 1999 in any mode</entry>
1672         </row>
1673         <row>
1674          <entry>990108</entry>
1675          <entry>ISO 8601; January 8, 1999 in any mode</entry>
1676         </row>
1677         <row>
1678          <entry>1999.008</entry>
1679          <entry>year and day of year</entry>
1680         </row>
1681         <row>
1682          <entry>J2451187</entry>
1683          <entry>Julian day</entry>
1684         </row>
1685         <row>
1686          <entry>January 8, 99 BC</entry>
1687          <entry>year 99 BC</entry>
1688         </row>
1689        </tbody>
1690       </tgroup>
1691      </table>
1692     </sect3>
1693
1694     <sect3>
1695      <title>Times</title>
1696
1697      <indexterm>
1698       <primary>time</primary>
1699      </indexterm>
1700      <indexterm>
1701       <primary>time without time zone</primary>
1702      </indexterm>
1703      <indexterm>
1704       <primary>time with time zone</primary>
1705      </indexterm>
1706
1707      <para>
1708       The time-of-day types are <type>time [
1709       (<replaceable>p</replaceable>) ] without time zone</type> and
1710       <type>time [ (<replaceable>p</replaceable>) ] with time
1711       zone</type>.  <type>time</type> alone is equivalent to
1712       <type>time without time zone</type>.
1713      </para>
1714
1715      <para>
1716       Valid input for these types consists of a time of day followed
1717       by an optional time zone. (See <xref
1718       linkend="datatype-datetime-time-table">
1719       and <xref linkend="datatype-timezone-table">.)  If a time zone is
1720       specified in the input for <type>time without time zone</type>,
1721       it is silently ignored. You can also specify a date but it will
1722       be ignored, except when you use a time zone name that involves a
1723       daylight-savings rule, such as
1724       <literal>America/New_York</literal>. In this case specifying the date
1725       is required in order to determine whether standard or daylight-savings
1726       time applies.  The appropriate time zone offset is recorded in the
1727       <type>time with time zone</type> value.
1728      </para>
1729
1730       <table id="datatype-datetime-time-table">
1731        <title>Time Input</title>
1732        <tgroup cols="2">
1733         <thead>
1734          <row>
1735           <entry>Example</entry>
1736           <entry>Description</entry>
1737          </row>
1738         </thead>
1739         <tbody>
1740          <row>
1741           <entry><literal>04:05:06.789</literal></entry>
1742           <entry>ISO 8601</entry>
1743          </row>
1744          <row>
1745           <entry><literal>04:05:06</literal></entry>
1746           <entry>ISO 8601</entry>
1747          </row>
1748          <row>
1749           <entry><literal>04:05</literal></entry>
1750           <entry>ISO 8601</entry>
1751          </row>
1752          <row>
1753           <entry><literal>040506</literal></entry>
1754           <entry>ISO 8601</entry>
1755          </row>
1756          <row>
1757           <entry><literal>04:05 AM</literal></entry>
1758           <entry>same as 04:05; AM does not affect value</entry>
1759          </row>
1760          <row>
1761           <entry><literal>04:05 PM</literal></entry>
1762           <entry>same as 16:05; input hour must be &lt;= 12</entry>
1763          </row>
1764          <row>
1765           <entry><literal>04:05:06.789-8</literal></entry>
1766           <entry>ISO 8601</entry>
1767          </row>
1768          <row>
1769           <entry><literal>04:05:06-08:00</literal></entry>
1770           <entry>ISO 8601</entry>
1771          </row>
1772          <row>
1773           <entry><literal>04:05-08:00</literal></entry>
1774           <entry>ISO 8601</entry>
1775          </row>
1776          <row>
1777           <entry><literal>040506-08</literal></entry>
1778           <entry>ISO 8601</entry>
1779          </row>
1780          <row>
1781           <entry><literal>04:05:06 PST</literal></entry>
1782           <entry>time zone specified by abbreviation</entry>
1783          </row>
1784          <row>
1785           <entry><literal>2003-04-12 04:05:06 America/New_York</literal></entry>
1786           <entry>time zone specified by full name</entry>
1787          </row>
1788         </tbody>
1789        </tgroup>
1790       </table>
1791
1792       <table tocentry="1" id="datatype-timezone-table">
1793        <title>Time Zone Input</title>
1794        <tgroup cols="2">
1795         <thead>
1796          <row>
1797           <entry>Example</entry>
1798           <entry>Description</entry>
1799          </row>
1800         </thead>
1801         <tbody>
1802          <row>
1803           <entry><literal>PST</literal></entry>
1804           <entry>Abbreviation (for Pacific Standard Time)</entry>
1805          </row>
1806          <row>
1807           <entry><literal>America/New_York</literal></entry>
1808           <entry>Full time zone name</entry>
1809          </row>
1810          <row>
1811           <entry><literal>PST8PDT</literal></entry>
1812           <entry>POSIX-style time zone specification</entry>
1813          </row>
1814          <row>
1815           <entry><literal>-8:00</literal></entry>
1816           <entry>ISO-8601 offset for PST</entry>
1817          </row>
1818          <row>
1819           <entry><literal>-800</literal></entry>
1820           <entry>ISO-8601 offset for PST</entry>
1821          </row>
1822          <row>
1823           <entry><literal>-8</literal></entry>
1824           <entry>ISO-8601 offset for PST</entry>
1825          </row>
1826          <row>
1827           <entry><literal>zulu</literal></entry>
1828           <entry>Military abbreviation for UTC</entry>
1829          </row>
1830          <row>
1831           <entry><literal>z</literal></entry>
1832           <entry>Short form of <literal>zulu</literal></entry>
1833          </row>
1834         </tbody>
1835        </tgroup>
1836       </table>
1837
1838      <para>
1839      Refer to <xref linkend="datatype-timezones"> for more information on how
1840      to specify time zones.
1841     </para>
1842     </sect3>
1843
1844     <sect3>
1845     <title>Time Stamps</title>
1846
1847     <indexterm>
1848      <primary>timestamp</primary>
1849     </indexterm>
1850
1851     <indexterm>
1852      <primary>timestamp with time zone</primary>
1853     </indexterm>
1854
1855     <indexterm>
1856      <primary>timestamp without time zone</primary>
1857     </indexterm>
1858
1859      <para>
1860       Valid input for the time stamp types consists of the concatenation
1861       of a date and a time, followed by an optional time zone,
1862       followed by an optional <literal>AD</literal> or <literal>BC</literal>.
1863       (Alternatively, <literal>AD</literal>/<literal>BC</literal> can appear
1864       before the time zone, but this is not the preferred ordering.)
1865       Thus:
1866
1867 <programlisting>
1868 1999-01-08 04:05:06
1869 </programlisting>
1870       and:
1871 <programlisting>
1872 1999-01-08 04:05:06 -8:00
1873 </programlisting>
1874
1875       are valid values, which follow the <acronym>ISO</acronym> 8601
1876       standard.  In addition, the common format:
1877 <programlisting>
1878 January 8 04:05:06 1999 PST
1879 </programlisting>
1880       is supported.
1881      </para>
1882
1883      <para>
1884       The <acronym>SQL</acronym> standard differentiates
1885       <type>timestamp without time zone</type>
1886       and <type>timestamp with time zone</type> literals by the presence of a
1887       <quote>+</quote> or <quote>-</quote> symbol and time zone offset after
1888       the time.  Hence, according to the standard,
1889
1890       <programlisting>TIMESTAMP '2004-10-19 10:23:54'</programlisting>
1891
1892       is a <type>timestamp without time zone</type>, while
1893
1894       <programlisting>TIMESTAMP '2004-10-19 10:23:54+02'</programlisting>
1895
1896       is a <type>timestamp with time zone</type>.
1897       <productname>PostgreSQL</productname> never examines the content of a
1898       literal string before determining its type, and therefore will treat
1899       both of the above as <type>timestamp without time zone</type>.  To
1900       ensure that a literal is treated as <type>timestamp with time
1901       zone</type>, give it the correct explicit type:
1902
1903       <programlisting>TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'</programlisting>
1904
1905       In a literal that has been determined to be <type>timestamp without time
1906       zone</type>, <productname>PostgreSQL</productname> will silently ignore
1907       any time zone indication.
1908       That is, the resulting value is derived from the date/time
1909       fields in the input value, and is not adjusted for time zone.
1910      </para>
1911
1912      <para>
1913       For <type>timestamp with time zone</type>, the internally stored
1914       value is always in UTC (Universal
1915       Coordinated Time, traditionally known as Greenwich Mean Time,
1916       <acronym>GMT</>).  An input value that has an explicit
1917       time zone specified is converted to UTC using the appropriate offset
1918       for that time zone.  If no time zone is stated in the input string,
1919       then it is assumed to be in the time zone indicated by the system's
1920       <xref linkend="guc-timezone"> parameter, and is converted to UTC using the
1921       offset for the <varname>timezone</> zone.
1922      </para>
1923
1924      <para>
1925       When a <type>timestamp with time
1926       zone</type> value is output, it is always converted from UTC to the
1927       current <varname>timezone</> zone, and displayed as local time in that
1928       zone.  To see the time in another time zone, either change
1929       <varname>timezone</> or use the <literal>AT TIME ZONE</> construct
1930       (see <xref linkend="functions-datetime-zoneconvert">).
1931      </para>
1932
1933      <para>
1934       Conversions between <type>timestamp without time zone</type> and
1935       <type>timestamp with time zone</type> normally assume that the
1936       <type>timestamp without time zone</type> value should be taken or given
1937       as <varname>timezone</> local time.  A different time zone can
1938       be specified for the conversion using <literal>AT TIME ZONE</>.
1939      </para>
1940     </sect3>
1941
1942     <sect3>
1943      <title>Special Values</title>
1944
1945      <indexterm>
1946       <primary>time</primary>
1947       <secondary>constants</secondary>
1948      </indexterm>
1949
1950      <indexterm>
1951       <primary>date</primary>
1952       <secondary>constants</secondary>
1953      </indexterm>
1954
1955      <para>
1956       <productname>PostgreSQL</productname> supports several
1957       special date/time input values for convenience, as shown in <xref
1958       linkend="datatype-datetime-special-table">.  The values
1959       <literal>infinity</literal> and <literal>-infinity</literal>
1960       are specially represented inside the system and will be displayed
1961       unchanged; but the others are simply notational shorthands
1962       that will be converted to ordinary date/time values when read.
1963       (In particular, <literal>now</> and related strings are converted
1964       to a specific time value as soon as they are read.)
1965       All of these values need to be enclosed in single quotes when used
1966       as constants in SQL commands.
1967      </para>
1968
1969       <table id="datatype-datetime-special-table">
1970        <title>Special Date/Time Inputs</title>
1971        <tgroup cols="3">
1972         <thead>
1973          <row>
1974           <entry>Input String</entry>
1975           <entry>Valid Types</entry>
1976           <entry>Description</entry>
1977          </row>
1978         </thead>
1979         <tbody>
1980          <row>
1981           <entry><literal>epoch</literal></entry>
1982           <entry><type>date</type>, <type>timestamp</type></entry>
1983           <entry>1970-01-01 00:00:00+00 (Unix system time zero)</entry>
1984          </row>
1985          <row>
1986           <entry><literal>infinity</literal></entry>
1987           <entry><type>date</type>, <type>timestamp</type></entry>
1988           <entry>later than all other time stamps</entry>
1989          </row>
1990          <row>
1991           <entry><literal>-infinity</literal></entry>
1992           <entry><type>date</type>, <type>timestamp</type></entry>
1993           <entry>earlier than all other time stamps</entry>
1994          </row>
1995          <row>
1996           <entry><literal>now</literal></entry>
1997           <entry><type>date</type>, <type>time</type>, <type>timestamp</type></entry>
1998           <entry>current transaction's start time</entry>
1999          </row>
2000          <row>
2001           <entry><literal>today</literal></entry>
2002           <entry><type>date</type>, <type>timestamp</type></entry>
2003           <entry>midnight today</entry>
2004          </row>
2005          <row>
2006           <entry><literal>tomorrow</literal></entry>
2007           <entry><type>date</type>, <type>timestamp</type></entry>
2008           <entry>midnight tomorrow</entry>
2009          </row>
2010          <row>
2011           <entry><literal>yesterday</literal></entry>
2012           <entry><type>date</type>, <type>timestamp</type></entry>
2013           <entry>midnight yesterday</entry>
2014          </row>
2015          <row>
2016           <entry><literal>allballs</literal></entry>
2017           <entry><type>time</type></entry>
2018           <entry>00:00:00.00 UTC</entry>
2019          </row>
2020         </tbody>
2021        </tgroup>
2022       </table>
2023
2024      <para>
2025       The following <acronym>SQL</acronym>-compatible functions can also
2026       be used to obtain the current time value for the corresponding data
2027       type:
2028       <literal>CURRENT_DATE</literal>, <literal>CURRENT_TIME</literal>,
2029       <literal>CURRENT_TIMESTAMP</literal>, <literal>LOCALTIME</literal>,
2030       <literal>LOCALTIMESTAMP</literal>.  The latter four accept an
2031       optional subsecond precision specification.  (See <xref
2032       linkend="functions-datetime-current">.)  Note that these are
2033       SQL functions and are <emphasis>not</> recognized in data input strings.
2034      </para>
2035
2036     </sect3>
2037    </sect2>
2038
2039    <sect2 id="datatype-datetime-output">
2040     <title>Date/Time Output</title>
2041
2042     <indexterm>
2043      <primary>date</primary>
2044      <secondary>output format</secondary>
2045      <seealso>formatting</seealso>
2046     </indexterm>
2047
2048     <indexterm>
2049      <primary>time</primary>
2050      <secondary>output format</secondary>
2051      <seealso>formatting</seealso>
2052     </indexterm>
2053
2054     <para>
2055      The output format of the date/time types can be set to one of the four
2056      styles ISO 8601,
2057      <acronym>SQL</acronym> (Ingres), traditional <productname>POSTGRES</>
2058      (Unix <application>date</> format), or
2059      German.  The default
2060      is the <acronym>ISO</acronym> format.  (The
2061      <acronym>SQL</acronym> standard requires the use of the ISO 8601
2062      format.  The name of the <quote>SQL</quote> output format is a
2063      historical accident.)  <xref
2064      linkend="datatype-datetime-output-table"> shows examples of each
2065      output style.  The output of the <type>date</type> and
2066      <type>time</type> types is of course only the date or time part
2067      in accordance with the given examples.
2068     </para>
2069
2070      <table id="datatype-datetime-output-table">
2071       <title>Date/Time Output Styles</title>
2072       <tgroup cols="3">
2073        <thead>
2074         <row>
2075          <entry>Style Specification</entry>
2076          <entry>Description</entry>
2077          <entry>Example</entry>
2078         </row>
2079        </thead>
2080        <tbody>
2081         <row>
2082          <entry>ISO</entry>
2083          <entry>ISO 8601/SQL standard</entry>
2084          <entry>1997-12-17 07:37:16-08</entry>
2085         </row>
2086         <row>
2087          <entry>SQL</entry>
2088          <entry>traditional style</entry>
2089          <entry>12/17/1997 07:37:16.00 PST</entry>
2090         </row>
2091         <row>
2092          <entry>POSTGRES</entry>
2093          <entry>original style</entry>
2094          <entry>Wed Dec 17 07:37:16 1997 PST</entry>
2095         </row>
2096         <row>
2097          <entry>German</entry>
2098          <entry>regional style</entry>
2099          <entry>17.12.1997 07:37:16.00 PST</entry>
2100         </row>
2101        </tbody>
2102       </tgroup>
2103      </table>
2104
2105     <para>
2106      In the <acronym>SQL</acronym> and POSTGRES styles, day appears before
2107      month if DMY field ordering has been specified, otherwise month appears
2108      before day.
2109      (See <xref linkend="datatype-datetime-input">
2110      for how this setting also affects interpretation of input values.)
2111      <xref linkend="datatype-datetime-output2-table"> shows an
2112      example.
2113     </para>
2114
2115      <table id="datatype-datetime-output2-table">
2116       <title>Date Order Conventions</title>
2117       <tgroup cols="3">
2118        <thead>
2119         <row>
2120          <entry><varname>datestyle</varname> Setting</entry>
2121          <entry>Input Ordering</entry>
2122          <entry>Example Output</entry>
2123         </row>
2124        </thead>
2125        <tbody>
2126         <row>
2127          <entry><literal>SQL, DMY</></entry>
2128          <entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>
2129          <entry>17/12/1997 15:37:16.00 CET</entry>
2130         </row>
2131         <row>
2132          <entry><literal>SQL, MDY</></entry>
2133          <entry><replaceable>month</replaceable>/<replaceable>day</replaceable>/<replaceable>year</replaceable></entry>
2134          <entry>12/17/1997 07:37:16.00 PST</entry>
2135         </row>
2136         <row>
2137          <entry><literal>Postgres, DMY</></entry>
2138          <entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>
2139          <entry>Wed 17 Dec 07:37:16 1997 PST</entry>
2140         </row>
2141        </tbody>
2142       </tgroup>
2143      </table>
2144
2145     <para>
2146      The date/time styles can be selected by the user using the
2147      <command>SET datestyle</command> command, the <xref
2148      linkend="guc-datestyle"> parameter in the
2149      <filename>postgresql.conf</filename> configuration file, or the
2150      <envar>PGDATESTYLE</envar> environment variable on the server or
2151      client.  The formatting function <function>to_char</function>
2152      (see <xref linkend="functions-formatting">) is also available as
2153      a more flexible way to format date/time output.
2154     </para>
2155    </sect2>
2156
2157    <sect2 id="datatype-timezones">
2158     <title>Time Zones</title>
2159
2160     <indexterm zone="datatype-timezones">
2161      <primary>time zone</primary>
2162     </indexterm>
2163
2164    <para>
2165     Time zones, and time-zone conventions, are influenced by
2166     political decisions, not just earth geometry. Time zones around the
2167     world became somewhat standardized during the 1900's,
2168     but continue to be prone to arbitrary changes, particularly with
2169     respect to daylight-savings rules.
2170     <productname>PostgreSQL</productname> uses the widely-used
2171     <literal>zoneinfo</> time zone database for information about
2172     historical time zone rules.  For times in the future, the assumption
2173     is that the latest known rules for a given time zone will
2174     continue to be observed indefinitely far into the future.
2175    </para>
2176
2177     <para>
2178      <productname>PostgreSQL</productname> endeavors to be compatible with
2179      the <acronym>SQL</acronym> standard definitions for typical usage.
2180      However, the <acronym>SQL</acronym> standard has an odd mix of date and
2181      time types and capabilities. Two obvious problems are:
2182
2183      <itemizedlist>
2184       <listitem>
2185        <para>
2186         Although the <type>date</type> type
2187         cannot have an associated time zone, the
2188         <type>time</type> type can.
2189         Time zones in the real world have little meaning unless
2190         associated with a date as well as a time,
2191         since the offset can vary through the year with daylight-saving
2192         time boundaries.
2193        </para>
2194       </listitem>
2195
2196       <listitem>
2197        <para>
2198         The default time zone is specified as a constant numeric offset
2199         from <acronym>UTC</>. It is therefore impossible to adapt to
2200         daylight-saving time when doing date/time arithmetic across
2201         <acronym>DST</acronym> boundaries.
2202        </para>
2203       </listitem>
2204
2205      </itemizedlist>
2206     </para>
2207
2208     <para>
2209      To address these difficulties, we recommend using date/time types
2210      that contain both date and time when using time zones. We
2211      do <emphasis>not</> recommend using the type <type>time with
2212      time zone</type> (though it is supported by
2213      <productname>PostgreSQL</productname> for legacy applications and
2214      for compliance with the <acronym>SQL</acronym> standard).
2215      <productname>PostgreSQL</productname> assumes
2216      your local time zone for any type containing only date or time.
2217     </para>
2218
2219     <para>
2220      All timezone-aware dates and times are stored internally in
2221      <acronym>UTC</acronym>.  They are converted to local time
2222      in the zone specified by the <xref linkend="guc-timezone"> configuration
2223      parameter before being displayed to the client.
2224     </para>
2225
2226     <para>
2227      <productname>PostgreSQL</productname> allows you to specify time zones in
2228      three different forms:
2229      <itemizedlist>
2230       <listitem>
2231        <para>
2232         A full time zone name, for example <literal>America/New_York</>.
2233         The recognized time zone names are listed in the
2234         <literal>pg_timezone_names</literal> view (see <xref
2235         linkend="view-pg-timezone-names">).
2236         <productname>PostgreSQL</productname> uses the widely-used
2237         <literal>zoneinfo</> time zone data for this purpose, so the same
2238         names are also recognized by much other software.
2239        </para>
2240       </listitem>
2241       <listitem>
2242        <para>
2243         A time zone abbreviation, for example <literal>PST</>.  Such a
2244         specification merely defines a particular offset from UTC, in
2245         contrast to full time zone names which can imply a set of daylight
2246         savings transition-date rules as well.  The recognized abbreviations
2247         are listed in the <literal>pg_timezone_abbrevs</> view (see <xref
2248         linkend="view-pg-timezone-abbrevs">).  You cannot set the
2249         configuration parameters <xref linkend="guc-timezone"> or
2250         <xref linkend="guc-log-timezone"> to a time
2251         zone abbreviation, but you can use abbreviations in
2252         date/time input values and with the <literal>AT TIME ZONE</>
2253         operator.
2254        </para>
2255       </listitem>
2256       <listitem>
2257        <para>
2258         In addition to the timezone names and abbreviations,
2259         <productname>PostgreSQL</productname> will accept POSIX-style time zone
2260         specifications of the form <replaceable>STD</><replaceable>offset</> or
2261         <replaceable>STD</><replaceable>offset</><replaceable>DST</>, where
2262         <replaceable>STD</> is a zone abbreviation, <replaceable>offset</> is a
2263         numeric offset in hours west from UTC, and <replaceable>DST</> is an
2264         optional daylight-savings zone abbreviation, assumed to stand for one
2265         hour ahead of the given offset. For example, if <literal>EST5EDT</>
2266         were not already a recognized zone name, it would be accepted and would
2267         be functionally equivalent to United States East Coast time.  When a
2268         daylight-savings zone name is present, it is assumed to be used
2269         according to the same daylight-savings transition rules used in the
2270         <literal>zoneinfo</> time zone database's <filename>posixrules</> entry.
2271         In a standard <productname>PostgreSQL</productname> installation,
2272         <filename>posixrules</> is the same as <literal>US/Eastern</>, so
2273         that POSIX-style time zone specifications follow USA daylight-savings
2274         rules.  If needed, you can adjust this behavior by replacing the
2275         <filename>posixrules</> file.
2276        </para>
2277       </listitem>
2278      </itemizedlist>
2279
2280      In short, this is the difference between abbreviations
2281      and full names: abbreviations always represent a fixed offset from
2282      UTC, whereas most of the full names imply a local daylight-savings time
2283      rule, and so have two possible UTC offsets.
2284     </para>
2285
2286     <para>
2287      One should be wary that the POSIX-style time zone feature can
2288      lead to silently accepting bogus input, since there is no check on the
2289      reasonableness of the zone abbreviations.  For example, <literal>SET
2290      TIMEZONE TO FOOBAR0</> will work, leaving the system effectively using
2291      a rather peculiar abbreviation for UTC.
2292      Another issue to keep in mind is that in POSIX time zone names,
2293      positive offsets are used for locations <emphasis>west</> of Greenwich.
2294      Everywhere else, <productname>PostgreSQL</productname> follows the
2295      ISO-8601 convention that positive timezone offsets are <emphasis>east</>
2296      of Greenwich.
2297     </para>
2298
2299     <para>
2300      In all cases, timezone names are recognized case-insensitively.
2301      (This is a change from <productname>PostgreSQL</productname> versions
2302      prior to 8.2, which were case-sensitive in some contexts but not others.)
2303     </para>
2304
2305     <para>
2306      Neither full names nor abbreviations are hard-wired into the server;
2307      they are obtained from configuration files stored under
2308      <filename>.../share/timezone/</> and <filename>.../share/timezonesets/</>
2309      of the installation directory
2310      (see <xref linkend="datetime-config-files">).
2311     </para>
2312
2313     <para>
2314      The <xref linkend="guc-timezone"> configuration parameter can
2315      be set in the file <filename>postgresql.conf</>, or in any of the
2316      other standard ways described in <xref linkend="runtime-config">.
2317      There are also several special ways to set it:
2318
2319      <itemizedlist>
2320       <listitem>
2321        <para>
2322         If <varname>timezone</> is not specified in
2323         <filename>postgresql.conf</> or as a server command-line option,
2324         the server attempts to use the value of the <envar>TZ</envar>
2325         environment variable as the default time zone.  If <envar>TZ</envar>
2326         is not defined or is not any of the time zone names known to
2327         <productname>PostgreSQL</productname>, the server attempts to
2328         determine the operating system's default time zone by checking the
2329         behavior of the C library function <literal>localtime()</>.  The
2330         default time zone is selected as the closest match among
2331         <productname>PostgreSQL</productname>'s known time zones.
2332         (These rules are also used to choose the default value of
2333         <xref linkend="guc-log-timezone">, if not specified.)
2334        </para>
2335       </listitem>
2336
2337       <listitem>
2338        <para>
2339         The <acronym>SQL</acronym> command <command>SET TIME ZONE</command>
2340         sets the time zone for the session.  This is an alternative spelling
2341         of <command>SET TIMEZONE TO</> with a more SQL-spec-compatible syntax.
2342        </para>
2343       </listitem>
2344
2345       <listitem>
2346        <para>
2347         The <envar>PGTZ</envar> environment variable is used by
2348         <application>libpq</application> clients
2349         to send a <command>SET TIME ZONE</command>
2350         command to the server upon connection.
2351        </para>
2352       </listitem>
2353      </itemizedlist>
2354     </para>
2355    </sect2>
2356
2357    <sect2 id="datatype-interval-input">
2358     <title>Interval Input</title>
2359
2360     <indexterm>
2361      <primary>interval</primary>
2362     </indexterm>
2363
2364      <para>
2365       <type>interval</type> values can be written using the following
2366       verbose syntax:
2367
2368 <synopsis>
2369 <optional>@</> <replaceable>quantity</> <replaceable>unit</> <optional><replaceable>quantity</> <replaceable>unit</>...</> <optional><replaceable>direction</></optional>
2370 </synopsis>
2371
2372      where <replaceable>quantity</> is a number (possibly signed);
2373      <replaceable>unit</> is <literal>microsecond</literal>,
2374      <literal>millisecond</literal>, <literal>second</literal>,
2375      <literal>minute</literal>, <literal>hour</literal>, <literal>day</literal>,
2376      <literal>week</literal>, <literal>month</literal>, <literal>year</literal>,
2377      <literal>decade</literal>, <literal>century</literal>, <literal>millennium</literal>,
2378      or abbreviations or plurals of these units;
2379      <replaceable>direction</> can be <literal>ago</literal> or
2380      empty.  The at sign (<literal>@</>) is optional noise.  The amounts
2381      of the different units are implicitly added with appropriate
2382      sign accounting.  <literal>ago</literal> negates all the fields.
2383      This syntax is also used for interval output, if
2384      <xref linkend="guc-intervalstyle"> is set to
2385      <literal>postgres_verbose</>.
2386     </para>
2387
2388     <para>
2389      Quantities of days, hours, minutes, and seconds can be specified without
2390      explicit unit markings.  For example, <literal>'1 12:59:10'</> is read
2391      the same as <literal>'1 day 12 hours 59 min 10 sec'</>.  Also,
2392      a combination of years and months can be specified with a dash;
2393      for example <literal>'200-10'</> is read the same as <literal>'200 years
2394      10 months'</>.  (These shorter forms are in fact the only ones allowed
2395      by the <acronym>SQL</acronym> standard, and are used for output when
2396      <varname>IntervalStyle</> is set to <literal>sql_standard</literal>.)
2397     </para>
2398
2399     <para>
2400      Interval values can also be written as ISO 8601 time intervals, using
2401      either the <quote>format with designators</> of the standard's section
2402      4.4.3.2 or the <quote>alternative format</> of section 4.4.3.3.  The
2403      format with designators looks like this:
2404 <synopsis>
2405 P <replaceable>quantity</> <replaceable>unit</> <optional> <replaceable>quantity</> <replaceable>unit</> ...</optional> <optional> T <optional> <replaceable>quantity</> <replaceable>unit</> ...</optional></optional>
2406 </synopsis>
2407       The string must start with a <literal>P</>, and may include a
2408       <literal>T</> that introduces the time-of-day units.  The
2409       available unit abbreviations are given in <xref
2410       linkend="datatype-interval-iso8601-units">.  Units may be
2411       omitted, and may be specified in any order, but units smaller than
2412       a day must appear after <literal>T</>.  In particular, the meaning of
2413       <literal>M</> depends on whether it is before or after
2414       <literal>T</>.
2415      </para>
2416
2417      <table id="datatype-interval-iso8601-units">
2418       <title>ISO 8601 interval unit abbreviations</title>
2419      <tgroup cols="2">
2420        <thead>
2421         <row>
2422          <entry>Abbreviation</entry>
2423          <entry>Meaning</entry>
2424         </row>
2425        </thead>
2426        <tbody>
2427         <row>
2428          <entry>Y</entry>
2429          <entry>Years</entry>
2430         </row>
2431         <row>
2432          <entry>M</entry>
2433          <entry>Months (in the date part)</entry>
2434         </row>
2435         <row>
2436          <entry>W</entry>
2437          <entry>Weeks</entry>
2438         </row>
2439         <row>
2440          <entry>D</entry>
2441          <entry>Days</entry>
2442         </row>
2443         <row>
2444          <entry>H</entry>
2445          <entry>Hours</entry>
2446         </row>
2447         <row>
2448          <entry>M</entry>
2449          <entry>Minutes (in the time part)</entry>
2450         </row>
2451         <row>
2452          <entry>S</entry>
2453          <entry>Seconds</entry>
2454         </row>
2455        </tbody>
2456       </tgroup>
2457      </table>
2458
2459      <para>
2460       In the alternative format:
2461 <synopsis>
2462 P <optional> <replaceable>years</>-<replaceable>months</>-<replaceable>days</> </optional> <optional> T <replaceable>hours</>:<replaceable>minutes</>:<replaceable>seconds</> </optional>
2463 </synopsis>
2464       the string must begin with <literal>P</literal>, and a
2465       <literal>T</> separates the date and time parts of the interval.
2466       The values are given as numbers similar to ISO 8601 dates.
2467     </para>
2468
2469     <para>
2470      When writing an interval constant with a <replaceable>fields</>
2471      specification, or when assigning a string to an interval column that was
2472      defined with a <replaceable>fields</> specification, the interpretation of
2473      unmarked quantities depends on the <replaceable>fields</>.  For
2474      example <literal>INTERVAL '1' YEAR</> is read as 1 year, whereas
2475      <literal>INTERVAL '1'</> means 1 second.  Also, field values
2476      <quote>to the right</> of the least significant field allowed by the
2477      <replaceable>fields</> specification are silently discarded.  For
2478      example, writing <literal>INTERVAL '1 day 2:03:04' HOUR TO MINUTE</>
2479      results in dropping the seconds field, but not the day field.
2480     </para>
2481
2482     <para>
2483      According to the <acronym>SQL</> standard all fields of an interval
2484      value must have the same sign, so a leading negative sign applies to all
2485      fields; for example the negative sign in the interval literal
2486      <literal>'-1 2:03:04'</> applies to both the days and hour/minute/second
2487      parts.  <productname>PostgreSQL</> allows the fields to have different
2488      signs, and traditionally treats each field in the textual representation
2489      as independently signed, so that the hour/minute/second part is
2490      considered positive in this example.  If <varname>IntervalStyle</> is
2491      set to <literal>sql_standard</literal> then a leading sign is considered
2492      to apply to all fields (but only if no additional signs appear).
2493      Otherwise the traditional <productname>PostgreSQL</> interpretation is
2494      used.  To avoid ambiguity, it's recommended to attach an explicit sign
2495      to each field if any field is negative.
2496     </para>
2497
2498     <para>
2499      Internally <type>interval</> values are stored as months, days,
2500      and seconds. This is done because the number of days in a month
2501      varies, and a day can have 23 or 25 hours if a daylight savings
2502      time adjustment is involved.  The months and days fields are integers
2503      while the seconds field can store fractions.  Because intervals are
2504      usually created from constant strings or <type>timestamp</> subtraction,
2505      this storage method works well in most cases. Functions
2506      <function>justify_days</> and <function>justify_hours</> are
2507      available for adjusting days and hours that overflow their normal
2508      ranges.
2509     </para>
2510
2511     <para>
2512      In the verbose input format, and in some fields of the more compact
2513      input formats, field values can have fractional parts; for example
2514      <literal>'1.5 week'</> or <literal>'01:02:03.45'</>.  Such input is
2515      converted to the appropriate number of months, days, and seconds
2516      for storage.  When this would result in a fractional number of
2517      months or days, the fraction is added to the lower-order fields
2518      using the conversion factors 1 month = 30 days and 1 day = 24 hours.
2519      For example, <literal>'1.5 month'</> becomes 1 month and 15 days.
2520      Only seconds will ever be shown as fractional on output.
2521     </para>
2522
2523     <para>
2524      <xref linkend="datatype-interval-input-examples"> shows some examples
2525      of valid <type>interval</> input.
2526     </para>
2527
2528      <table id="datatype-interval-input-examples">
2529       <title>Interval Input</title>
2530       <tgroup cols="2">
2531        <thead>
2532         <row>
2533          <entry>Example</entry>
2534          <entry>Description</entry>
2535         </row>
2536        </thead>
2537        <tbody>
2538         <row>
2539          <entry>1-2</entry>
2540          <entry>SQL standard format: 1 year 2 months</entry>
2541         </row>
2542         <row>
2543          <entry>3 4:05:06</entry>
2544          <entry>SQL standard format: 3 days 4 hours 5 minutes 6 seconds</entry>
2545         </row>
2546         <row>
2547          <entry>1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry>
2548          <entry>Traditional Postgres format: 1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry>
2549         </row>
2550         <row>
2551          <entry>P1Y2M3DT4H5M6S</entry>
2552          <entry>ISO 8601 <quote>format with designators</>: same meaning as above</entry>
2553         </row>
2554         <row>
2555          <entry>P0001-02-03T04:05:06</entry>
2556          <entry>ISO 8601 <quote>alternative format</>: same meaning as above</entry>
2557         </row>
2558        </tbody>
2559       </tgroup>
2560      </table>
2561
2562    </sect2>
2563
2564    <sect2 id="datatype-interval-output">
2565     <title>Interval Output</title>
2566
2567     <indexterm>
2568      <primary>interval</primary>
2569      <secondary>output format</secondary>
2570      <seealso>formatting</seealso>
2571     </indexterm>
2572
2573     <para>
2574      The output format of the interval type can be set to one of the
2575      four styles <literal>sql_standard</>, <literal>postgres</>,
2576      <literal>postgres_verbose</>, or <literal>iso_8601</>,
2577      using the command <literal>SET intervalstyle</literal>.
2578      The default is the <literal>postgres</> format.
2579      <xref linkend="interval-style-output-table"> shows examples of each
2580      output style.
2581     </para>
2582
2583     <para>
2584      The <literal>sql_standard</> style produces output that conforms to
2585      the SQL standard's specification for interval literal strings, if
2586      the interval value meets the standard's restrictions (either year-month
2587      only or day-time only, with no mixing of positive
2588      and negative components).  Otherwise the output looks like a standard
2589      year-month literal string followed by a day-time literal string,
2590      with explicit signs added to disambiguate mixed-sign intervals.
2591     </para>
2592
2593     <para>
2594      The output of the <literal>postgres</> style matches the output of
2595      <productname>PostgreSQL</> releases prior to 8.4 when the
2596      <xref linkend="guc-datestyle"> parameter was set to <literal>ISO</>.
2597     </para>
2598
2599     <para>
2600      The output of the <literal>postgres_verbose</> style matches the output of
2601      <productname>PostgreSQL</> releases prior to 8.4 when the
2602      <varname>DateStyle</> parameter was set to non-<literal>ISO</> output.
2603     </para>
2604
2605     <para>
2606      The output of the <literal>iso_8601</> style matches the <quote>format
2607      with designators</> described in section 4.4.3.2 of the
2608      ISO 8601 standard.
2609     </para>
2610
2611      <table id="interval-style-output-table">
2612        <title>Interval Output Style Examples</title>
2613        <tgroup cols="4">
2614         <thead>
2615          <row>
2616           <entry>Style Specification</entry>
2617           <entry>Year-Month Interval</entry>
2618           <entry>Day-Time Interval</entry>
2619           <entry>Mixed Interval</entry>
2620          </row>
2621         </thead>
2622         <tbody>
2623          <row>
2624           <entry><literal>sql_standard</></entry>
2625           <entry>1-2</entry>
2626           <entry>3 4:05:06</entry>
2627           <entry>-1-2 +3 -4:05:06</entry>
2628          </row>
2629          <row>
2630           <entry><literal>postgres</></entry>
2631           <entry>1 year 2 mons</entry>
2632           <entry>3 days 04:05:06</entry>
2633           <entry>-1 year -2 mons +3 days -04:05:06</entry>
2634          </row>
2635          <row>
2636           <entry><literal>postgres_verbose</></entry>
2637           <entry>@ 1 year 2 mons</entry>
2638           <entry>@ 3 days 4 hours 5 mins 6 secs</entry>
2639           <entry>@ 1 year 2 mons -3 days 4 hours 5 mins 6 secs ago</entry>
2640          </row>
2641          <row>
2642           <entry><literal>iso_8601</></entry>
2643           <entry>P1Y2M</entry>
2644           <entry>P3DT4H5M6S</entry>
2645           <entry>P-1Y-2M3DT-4H-5M-6S</entry>
2646          </row>
2647         </tbody>
2648        </tgroup>
2649     </table>
2650
2651    </sect2>
2652
2653    <sect2 id="datatype-datetime-internals">
2654     <title>Internals</title>
2655
2656     <para>
2657      <productname>PostgreSQL</productname> uses Julian dates
2658      for all date/time calculations. This has the useful property of correctly
2659      calculating dates from 4713 BC
2660      to far into the future, using the assumption that the length of the
2661      year is 365.2425 days.
2662     </para>
2663
2664     <para>
2665      Date conventions before the 19th century make for interesting reading,
2666      but are not consistent enough to warrant coding into a date/time handler.
2667     </para>
2668    </sect2>
2669
2670   </sect1>
2671
2672   <sect1 id="datatype-boolean">
2673    <title>Boolean Type</title>
2674
2675    <indexterm zone="datatype-boolean">
2676     <primary>Boolean</primary>
2677     <secondary>data type</secondary>
2678    </indexterm>
2679
2680    <indexterm zone="datatype-boolean">
2681     <primary>true</primary>
2682    </indexterm>
2683
2684    <indexterm zone="datatype-boolean">
2685     <primary>false</primary>
2686    </indexterm>
2687
2688    <para>
2689     <productname>PostgreSQL</productname> provides the
2690     standard <acronym>SQL</acronym> type <type>boolean</type>.
2691     <type>boolean</type> can have one of only two states:
2692     <quote>true</quote> or <quote>false</quote>.  A third state,
2693     <quote>unknown</quote>, is represented by the
2694     <acronym>SQL</acronym> null value.
2695    </para>
2696
2697    <para>
2698     Valid literal values for the <quote>true</quote> state are:
2699     <simplelist>
2700      <member><literal>TRUE</literal></member>
2701      <member><literal>'t'</literal></member>
2702      <member><literal>'true'</literal></member>
2703      <member><literal>'y'</literal></member>
2704      <member><literal>'yes'</literal></member>
2705      <member><literal>'on'</literal></member>
2706      <member><literal>'1'</literal></member>
2707     </simplelist>
2708     For the <quote>false</quote> state, the following values can be
2709     used:
2710     <simplelist>
2711      <member><literal>FALSE</literal></member>
2712      <member><literal>'f'</literal></member>
2713      <member><literal>'false'</literal></member>
2714      <member><literal>'n'</literal></member>
2715      <member><literal>'no'</literal></member>
2716      <member><literal>'off'</literal></member>
2717      <member><literal>'0'</literal></member>
2718     </simplelist>
2719     Leading or trailing whitespace is ignored, and case does not matter.
2720     The key words
2721     <literal>TRUE</literal> and <literal>FALSE</literal> are the preferred
2722     (<acronym>SQL</acronym>-compliant) usage.
2723    </para>
2724
2725    <example id="datatype-boolean-example">
2726     <title>Using the <type>boolean</type> type</title>
2727
2728 <programlisting>
2729 CREATE TABLE test1 (a boolean, b text);
2730 INSERT INTO test1 VALUES (TRUE, 'sic est');
2731 INSERT INTO test1 VALUES (FALSE, 'non est');
2732 SELECT * FROM test1;
2733  a |    b
2734 ---+---------
2735  t | sic est
2736  f | non est
2737
2738 SELECT * FROM test1 WHERE a;
2739  a |    b
2740 ---+---------
2741  t | sic est
2742 </programlisting>
2743    </example>
2744
2745    <para>
2746     <xref linkend="datatype-boolean-example"> shows that
2747     <type>boolean</type> values are output using the letters
2748     <literal>t</literal> and <literal>f</literal>.
2749    </para>
2750
2751    <para>
2752     <type>boolean</type> uses 1 byte of storage.
2753    </para>
2754   </sect1>
2755
2756   <sect1 id="datatype-enum">
2757    <title>Enumerated Types</title>
2758
2759    <indexterm zone="datatype-enum">
2760     <primary>data type</primary>
2761     <secondary>enumerated (enum)</secondary>
2762    </indexterm>
2763
2764    <indexterm zone="datatype-enum">
2765     <primary>enumerated types</primary>
2766    </indexterm>
2767
2768    <para>
2769     Enumerated (enum) types are data types that
2770     comprise a static, ordered set of values.
2771     They are equivalent to the <type>enum</type>
2772     types supported in a number of programming languages. An example of an enum
2773     type might be the days of the week, or a set of status values for
2774     a piece of data.
2775    </para>
2776
2777    <sect2>
2778     <title>Declaration of Enumerated Types</title>
2779
2780     <para>
2781      Enum types are created using the <xref
2782      linkend="sql-createtype" endterm="sql-createtype-title"> command,
2783      for example:
2784
2785 <programlisting>
2786 CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
2787 </programlisting>
2788
2789      Once created, the enum type can be used in table and function
2790      definitions much like any other type:
2791     </para>
2792
2793     <example>
2794      <title>Basic Enum Usage</title>
2795 <programlisting>
2796 CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
2797 CREATE TABLE person (
2798     name text,
2799     current_mood mood
2800 );
2801 INSERT INTO person VALUES ('Moe', 'happy');
2802 SELECT * FROM person WHERE current_mood = 'happy';
2803  name | current_mood
2804 ------+--------------
2805  Moe  | happy
2806 (1 row)
2807 </programlisting>
2808      </example>
2809     </sect2>
2810
2811     <sect2>
2812      <title>Ordering</title>
2813
2814      <para>
2815       The ordering of the values in an enum type is the
2816       order in which the values were listed when the type was created.
2817       All standard comparison operators and related
2818       aggregate functions are supported for enums.  For example:
2819      </para>
2820
2821      <example>
2822       <title>Enum Ordering</title>
2823 <programlisting>
2824 INSERT INTO person VALUES ('Larry', 'sad');
2825 INSERT INTO person VALUES ('Curly', 'ok');
2826 SELECT * FROM person WHERE current_mood > 'sad';
2827  name  | current_mood
2828 -------+--------------
2829  Moe   | happy
2830  Curly | ok
2831 (2 rows)
2832
2833 SELECT * FROM person WHERE current_mood > 'sad' ORDER BY current_mood;
2834  name  | current_mood
2835 -------+--------------
2836  Curly | ok
2837  Moe   | happy
2838 (2 rows)
2839
2840 SELECT name
2841 FROM person
2842 WHERE current_mood = (SELECT MIN(current_mood) FROM person);
2843  name
2844 -------
2845  Larry
2846 (1 row)
2847 </programlisting>
2848     </example>
2849    </sect2>
2850
2851    <sect2>
2852     <title>Type Safety</title>
2853
2854     <para>
2855      Each enumerated data type is separate and cannot
2856      be compared with other enumerated types.
2857     </para>
2858
2859     <example>
2860      <title>Lack of Casting</title>
2861 <programlisting>
2862 CREATE TYPE happiness AS ENUM ('happy', 'very happy', 'ecstatic');
2863 CREATE TABLE holidays (
2864     num_weeks integer,
2865     happiness happiness
2866 );
2867 INSERT INTO holidays(num_weeks,happiness) VALUES (4, 'happy');
2868 INSERT INTO holidays(num_weeks,happiness) VALUES (6, 'very happy');
2869 INSERT INTO holidays(num_weeks,happiness) VALUES (8, 'ecstatic');
2870 INSERT INTO holidays(num_weeks,happiness) VALUES (2, 'sad');
2871 ERROR:  invalid input value for enum happiness: "sad"
2872 SELECT person.name, holidays.num_weeks FROM person, holidays
2873   WHERE person.current_mood = holidays.happiness;
2874 ERROR:  operator does not exist: mood = happiness
2875 </programlisting>
2876     </example>
2877
2878     <para>
2879      If you really need to do something like that, you can either
2880      write a custom operator or add explicit casts to your query:
2881     </para>
2882
2883     <example>
2884      <title>Comparing Different Enums by Casting to Text</title>
2885 <programlisting>
2886 SELECT person.name, holidays.num_weeks FROM person, holidays
2887   WHERE person.current_mood::text = holidays.happiness::text;
2888  name | num_weeks
2889 ------+-----------
2890  Moe  |         4
2891 (1 row)
2892
2893 </programlisting>
2894     </example>
2895    </sect2>
2896
2897    <sect2>
2898     <title>Implementation Details</title>
2899
2900     <para>
2901      An enum value occupies four bytes on disk.  The length of an enum
2902      value's textual label is limited by the <symbol>NAMEDATALEN</symbol>
2903      setting compiled into <productname>PostgreSQL</productname>; in standard
2904      builds this means at most 63 bytes.
2905     </para>
2906
2907     <para>
2908      Enum labels are case sensitive, so
2909      <type>'happy'</type> is not the same as <type>'HAPPY'</type>.
2910      White space in the labels is significant too.
2911     </para>
2912
2913     <para>
2914      The translations from internal enum values to textual labels are
2915      kept in the system catalog
2916      <link linkend="catalog-pg-enum"><structname>pg_enum</structname></link>.
2917      Querying this catalog directly can be useful.
2918     </para>
2919
2920    </sect2>
2921   </sect1>
2922
2923   <sect1 id="datatype-geometric">
2924    <title>Geometric Types</title>
2925
2926    <para>
2927     Geometric data types represent two-dimensional spatial
2928     objects. <xref linkend="datatype-geo-table"> shows the geometric
2929     types available in <productname>PostgreSQL</productname>.  The
2930     most fundamental type, the point, forms the basis for all of the
2931     other types.
2932    </para>
2933
2934     <table id="datatype-geo-table">
2935      <title>Geometric Types</title>
2936      <tgroup cols="4">
2937       <thead>
2938        <row>
2939         <entry>Name</entry>
2940         <entry>Storage Size</entry>
2941         <entry>Representation</entry>
2942         <entry>Description</entry>
2943        </row>
2944       </thead>
2945       <tbody>
2946        <row>
2947         <entry><type>point</type></entry>
2948         <entry>16 bytes</entry>
2949         <entry>Point on a plane</entry>
2950         <entry>(x,y)</entry>
2951        </row>
2952        <row>
2953         <entry><type>line</type></entry>
2954         <entry>32 bytes</entry>
2955         <entry>Infinite line (not fully implemented)</entry>
2956         <entry>((x1,y1),(x2,y2))</entry>
2957        </row>
2958        <row>
2959         <entry><type>lseg</type></entry>
2960         <entry>32 bytes</entry>
2961         <entry>Finite line segment</entry>
2962         <entry>((x1,y1),(x2,y2))</entry>
2963        </row>
2964        <row>
2965         <entry><type>box</type></entry>
2966         <entry>32 bytes</entry>
2967         <entry>Rectangular box</entry>
2968         <entry>((x1,y1),(x2,y2))</entry>
2969        </row>
2970        <row>
2971         <entry><type>path</type></entry>
2972         <entry>16+16n bytes</entry>
2973         <entry>Closed path (similar to polygon)</entry>
2974         <entry>((x1,y1),...)</entry>
2975        </row>
2976        <row>
2977         <entry><type>path</type></entry>
2978         <entry>16+16n bytes</entry>
2979         <entry>Open path</entry>
2980         <entry>[(x1,y1),...]</entry>
2981        </row>
2982        <row>
2983         <entry><type>polygon</type></entry>
2984         <entry>40+16n bytes</entry>
2985         <entry>Polygon (similar to closed path)</entry>
2986         <entry>((x1,y1),...)</entry>
2987        </row>
2988        <row>
2989         <entry><type>circle</type></entry>
2990         <entry>24 bytes</entry>
2991         <entry>Circle</entry>
2992         <entry>&lt;(x,y),r&gt; (center point and radius)</entry>
2993        </row>
2994       </tbody>
2995      </tgroup>
2996     </table>
2997
2998    <para>
2999     A rich set of functions and operators is available to perform various geometric
3000     operations such as scaling, translation, rotation, and determining
3001     intersections.  They are explained in <xref linkend="functions-geometry">.
3002    </para>
3003
3004    <sect2>
3005     <title>Points</title>
3006
3007     <indexterm>
3008      <primary>point</primary>
3009     </indexterm>
3010
3011     <para>
3012      Points are the fundamental two-dimensional building block for geometric types.
3013      Values of type <type>point</type> are specified using the following syntax:
3014
3015 <synopsis>
3016 ( <replaceable>x</replaceable> , <replaceable>y</replaceable> )
3017   <replaceable>x</replaceable> , <replaceable>y</replaceable>
3018 </synopsis>
3019
3020      where <replaceable>x</> and <replaceable>y</> are the respective
3021      coordinates, as floating-point numbers.
3022     </para>
3023    </sect2>
3024
3025    <sect2>
3026     <title>Line Segments</title>
3027
3028     <indexterm>
3029      <primary>lseg</primary>
3030     </indexterm>
3031
3032     <indexterm>
3033      <primary>line segment</primary>
3034     </indexterm>
3035
3036     <para>
3037      Line segments (<type>lseg</type>) are represented by pairs of points.
3038      Values of type <type>lseg</type> are specified using the following syntax:
3039
3040 <synopsis>
3041 ( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
3042   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )
3043     <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   ,   <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
3044 </synopsis>
3045
3046      where
3047      <literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
3048      and
3049      <literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
3050      are the end points of the line segment.
3051     </para>
3052    </sect2>
3053
3054    <sect2>
3055     <title>Boxes</title>
3056
3057     <indexterm>
3058      <primary>box (data type)</primary>
3059     </indexterm>
3060
3061     <indexterm>
3062      <primary>rectangle</primary>
3063     </indexterm>
3064
3065     <para>
3066      Boxes are represented by pairs of points that are opposite
3067      corners of the box.
3068      Values of type <type>box</type> are specified using the following syntax:
3069
3070 <synopsis>
3071 ( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
3072   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )
3073     <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   ,   <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
3074 </synopsis>
3075
3076      where
3077      <literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
3078      and
3079      <literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
3080      are any two opposite corners of the box.
3081     </para>
3082
3083     <para>
3084      Boxes are output using the first syntax.
3085      Any two opposite corners can be supplied on input, but the values
3086      will be reordered as needed to store the
3087      upper right and lower left corners.
3088     </para>
3089    </sect2>
3090
3091    <sect2>
3092     <title>Paths</title>
3093
3094     <indexterm>
3095      <primary>path (data type)</primary>
3096     </indexterm>
3097
3098     <para>
3099      Paths are represented by lists of connected points. Paths can be
3100      <firstterm>open</firstterm>, where
3101      the first and last points in the list are considered not connected, or
3102      <firstterm>closed</firstterm>,
3103      where the first and last points are considered connected.
3104     </para>
3105
3106     <para>
3107      Values of type <type>path</type> are specified using the following syntax:
3108
3109 <synopsis>
3110 ( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
3111 [ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) ]
3112   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3113   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3114     <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
3115 </synopsis>
3116
3117      where the points are the end points of the line segments
3118      comprising the path.  Square brackets (<literal>[]</>) indicate
3119      an open path, while parentheses (<literal>()</>) indicate a
3120      closed path.
3121     </para>
3122
3123     <para>
3124      Paths are output using the first or second syntax, as appropriate.
3125     </para>
3126    </sect2>
3127
3128    <sect2>
3129     <title>Polygons</title>
3130
3131     <indexterm>
3132      <primary>polygon</primary>
3133     </indexterm>
3134
3135     <para>
3136      Polygons are represented by lists of points (the vertexes of the
3137      polygon). Polygons are very similar to closed paths, but are
3138      stored differently
3139      and have their own set of support routines.
3140     </para>
3141
3142     <para>
3143      Values of type <type>polygon</type> are specified using the following syntax:
3144
3145 <synopsis>
3146 ( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
3147   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3148   ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )
3149     <replaceable>x1</replaceable> , <replaceable>y1</replaceable>   , ... ,   <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
3150 </synopsis>
3151
3152      where the points are the end points of the line segments
3153      comprising the boundary of the polygon.
3154     </para>
3155
3156     <para>
3157      Polygons are output using the first syntax.
3158     </para>
3159    </sect2>
3160
3161    <sect2>
3162     <title>Circles</title>
3163
3164     <indexterm>
3165      <primary>circle</primary>
3166     </indexterm>
3167
3168     <para>
3169      Circles are represented by a center point and radius.
3170      Values of type <type>circle</type> are specified using the following syntax:
3171
3172 <synopsis>
3173 &lt; ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> &gt;
3174 ( ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> )
3175   ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable>
3176     <replaceable>x</replaceable> , <replaceable>y</replaceable>   , <replaceable>r</replaceable>
3177 </synopsis>
3178
3179      where
3180      <literal>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</literal>
3181      is the center point and <replaceable>r</replaceable> is the radius of the circle.
3182     </para>
3183
3184     <para>
3185      Circles are output using the first syntax.
3186     </para>
3187    </sect2>
3188
3189   </sect1>
3190
3191   <sect1 id="datatype-net-types">
3192    <title>Network Address Types</title>
3193
3194    <indexterm zone="datatype-net-types">
3195     <primary>network</primary>
3196     <secondary>data types</secondary>
3197    </indexterm>
3198
3199    <para>
3200     <productname>PostgreSQL</> offers data types to store IPv4, IPv6, and MAC
3201     addresses, as shown in <xref linkend="datatype-net-types-table">.  It
3202     is better to use these types instead of plain text types to store
3203     network addresses, because
3204     these types offer input error checking and specialized
3205     operators and functions (see <xref linkend="functions-net">).
3206    </para>
3207
3208     <table tocentry="1" id="datatype-net-types-table">
3209      <title>Network Address Types</title>
3210      <tgroup cols="3">
3211       <thead>
3212        <row>
3213         <entry>Name</entry>
3214         <entry>Storage Size</entry>
3215         <entry>Description</entry>
3216        </row>
3217       </thead>
3218       <tbody>
3219
3220        <row>
3221         <entry><type>cidr</type></entry>
3222         <entry>7 or 19 bytes</entry>
3223         <entry>IPv4 and IPv6 networks</entry>
3224        </row>
3225
3226        <row>
3227         <entry><type>inet</type></entry>
3228         <entry>7 or 19 bytes</entry>
3229         <entry>IPv4 and IPv6 hosts and networks</entry>
3230        </row>
3231
3232        <row>
3233         <entry><type>macaddr</type></entry>
3234         <entry>6 bytes</entry>
3235         <entry>MAC addresses</entry>
3236        </row>
3237
3238       </tbody>
3239      </tgroup>
3240     </table>
3241
3242    <para>
3243     When sorting <type>inet</type> or <type>cidr</type> data types,
3244     IPv4 addresses will always sort before IPv6 addresses, including
3245     IPv4 addresses encapsulated or mapped to IPv6 addresses, such as
3246     ::10.2.3.4 or ::ffff:10.4.3.2.
3247    </para>
3248
3249
3250    <sect2 id="datatype-inet">
3251     <title><type>inet</type></title>
3252
3253     <indexterm>
3254      <primary>inet (data type)</primary>
3255     </indexterm>
3256
3257     <para>
3258      The <type>inet</type> type holds an IPv4 or IPv6 host address, and
3259      optionally its subnet, all in one field.
3260      The subnet is represented by the number of network address bits
3261      present in the host address (the
3262      <quote>netmask</quote>).  If the netmask is 32 and the address is IPv4,
3263      then the value does not indicate a subnet, only a single host.
3264      In IPv6, the address length is 128 bits, so 128 bits specify a
3265      unique host address.  Note that if you
3266      want to accept only networks, you should use the
3267      <type>cidr</type> type rather than <type>inet</type>.
3268     </para>
3269
3270     <para>
3271       The input format for this type is
3272       <replaceable class="parameter">address/y</replaceable>
3273       where
3274       <replaceable class="parameter">address</replaceable>
3275       is an IPv4 or IPv6 address and
3276       <replaceable class="parameter">y</replaceable>
3277       is the number of bits in the netmask.  If the
3278       <replaceable class="parameter">/y</replaceable>
3279       portion is missing, the
3280       netmask is 32 for IPv4 and 128 for IPv6, so the value represents
3281       just a single host.  On display, the
3282       <replaceable class="parameter">/y</replaceable>
3283       portion is suppressed if the netmask specifies a single host.
3284     </para>
3285    </sect2>
3286
3287    <sect2 id="datatype-cidr">
3288     <title><type>cidr</></title>
3289
3290     <indexterm>
3291      <primary>cidr</primary>
3292     </indexterm>
3293
3294     <para>
3295      The <type>cidr</type> type holds an IPv4 or IPv6 network specification.
3296      Input and output formats follow Classless Internet Domain Routing
3297      conventions.
3298      The format for specifying networks is <replaceable
3299      class="parameter">address/y</> where <replaceable
3300      class="parameter">address</> is the network represented as an
3301      IPv4 or IPv6 address, and <replaceable
3302      class="parameter">y</> is the number of bits in the netmask.  If
3303      <replaceable class="parameter">y</> is omitted, it is calculated
3304      using assumptions from the older classful network numbering system, except
3305      it will be at least large enough to include all of the octets
3306      written in the input.  It is an error to specify a network address
3307      that has bits set to the right of the specified netmask.
3308     </para>
3309
3310     <para>
3311      <xref linkend="datatype-net-cidr-table"> shows some examples.
3312     </para>
3313
3314      <table id="datatype-net-cidr-table">
3315       <title><type>cidr</> Type Input Examples</title>
3316       <tgroup cols="3">
3317        <thead>
3318         <row>
3319          <entry><type>cidr</type> Input</entry>
3320          <entry><type>cidr</type> Output</entry>
3321          <entry><literal><function>abbrev</function>(<type>cidr</type>)</literal></entry>
3322         </row>
3323        </thead>
3324        <tbody>
3325         <row>
3326          <entry>192.168.100.128/25</entry>
3327          <entry>192.168.100.128/25</entry>
3328          <entry>192.168.100.128/25</entry>
3329         </row>
3330         <row>
3331          <entry>192.168/24</entry>
3332          <entry>192.168.0.0/24</entry>
3333          <entry>192.168.0/24</entry>
3334         </row>
3335         <row>
3336          <entry>192.168/25</entry>
3337          <entry>192.168.0.0/25</entry>
3338          <entry>192.168.0.0/25</entry>
3339         </row>
3340         <row>
3341          <entry>192.168.1</entry>
3342          <entry>192.168.1.0/24</entry>
3343          <entry>192.168.1/24</entry>
3344         </row>
3345         <row>
3346          <entry>192.168</entry>
3347          <entry>192.168.0.0/24</entry>
3348          <entry>192.168.0/24</entry>
3349         </row>
3350         <row>
3351          <entry>128.1</entry>
3352          <entry>128.1.0.0/16</entry>
3353          <entry>128.1/16</entry>
3354         </row>
3355         <row>
3356          <entry>128</entry>
3357          <entry>128.0.0.0/16</entry>
3358          <entry>128.0/16</entry>
3359         </row>
3360         <row>
3361          <entry>128.1.2</entry>
3362          <entry>128.1.2.0/24</entry>
3363          <entry>128.1.2/24</entry>
3364         </row>
3365         <row>
3366          <entry>10.1.2</entry>
3367          <entry>10.1.2.0/24</entry>
3368          <entry>10.1.2/24</entry>
3369         </row>
3370         <row>
3371          <entry>10.1</entry>
3372          <entry>10.1.0.0/16</entry>
3373          <entry>10.1/16</entry>
3374         </row>
3375         <row>
3376          <entry>10</entry>
3377          <entry>10.0.0.0/8</entry>
3378          <entry>10/8</entry>
3379         </row>
3380         <row>
3381          <entry>10.1.2.3/32</entry>
3382          <entry>10.1.2.3/32</entry>
3383          <entry>10.1.2.3/32</entry>
3384         </row>
3385         <row>
3386          <entry>2001:4f8:3:ba::/64</entry>
3387          <entry>2001:4f8:3:ba::/64</entry>
3388          <entry>2001:4f8:3:ba::/64</entry>
3389         </row>
3390         <row>
3391          <entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128</entry>
3392          <entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1/128</entry>
3393          <entry>2001:4f8:3:ba:2e0:81ff:fe22:d1f1</entry>
3394         </row>
3395         <row>
3396          <entry>::ffff:1.2.3.0/120</entry>
3397          <entry>::ffff:1.2.3.0/120</entry>
3398          <entry>::ffff:1.2.3/120</entry>
3399         </row>
3400         <row>
3401          <entry>::ffff:1.2.3.0/128</entry>
3402          <entry>::ffff:1.2.3.0/128</entry>
3403          <entry>::ffff:1.2.3.0/128</entry>
3404         </row>
3405        </tbody>
3406       </tgroup>
3407      </table>
3408    </sect2>
3409
3410    <sect2 id="datatype-inet-vs-cidr">
3411     <title><type>inet</type> vs. <type>cidr</type></title>
3412
3413     <para>
3414     The essential difference between <type>inet</type> and <type>cidr</type>
3415     data types is that <type>inet</type> accepts values with nonzero bits to
3416     the right of the netmask, whereas <type>cidr</type> does not.
3417     </para>
3418
3419       <tip>
3420         <para>
3421         If you do not like the output format for <type>inet</type> or
3422         <type>cidr</type> values, try the functions <function>host</>,
3423         <function>text</>, and <function>abbrev</>.
3424         </para>
3425       </tip>
3426    </sect2>
3427
3428    <sect2 id="datatype-macaddr">
3429     <title><type>macaddr</></>
3430
3431     <indexterm>
3432      <primary>macaddr (data type)</primary>
3433     </indexterm>
3434
3435     <indexterm>
3436      <primary>MAC address</primary>
3437      <see>macaddr</see>
3438     </indexterm>
3439
3440     <para>
3441      The <type>macaddr</> type stores MAC addresses, known for example
3442      from Ethernet card hardware addresses (although MAC addresses are
3443      used for other purposes as well).  Input is accepted in the
3444      following formats:
3445
3446      <simplelist>
3447       <member><literal>'08:00:2b:01:02:03'</></member>
3448       <member><literal>'08-00-2b-01-02-03'</></member>
3449       <member><literal>'08002b:010203'</></member>
3450       <member><literal>'08002b-010203'</></member>
3451       <member><literal>'0800.2b01.0203'</></member>
3452       <member><literal>'08002b010203'</></member>
3453      </simplelist>
3454
3455      These examples would all specify the same address.  Upper and
3456      lower case is accepted for the digits
3457      <literal>a</> through <literal>f</>.  Output is always in the
3458      first of the forms shown.
3459     </para>
3460
3461     <para>
3462      IEEE Std 802-2001 specifies the second shown form (with hyphens)
3463      as the canonical form for MAC addresses, and specifies the first
3464      form (with colons) as the bit-reversed notation, so that
3465      08-00-2b-01-02-03 = 01:00:4D:08:04:0C.  This convention is widely
3466      ignored nowadays, and it is only relevant for obsolete network
3467      protocols (such as Token Ring).  PostgreSQL makes no provisions
3468      for bit reversal, and all accepted formats use the canonical LSB
3469      order.
3470     </para>
3471
3472     <para>
3473      The remaining four input formats are not part of any standard.
3474     </para>
3475    </sect2>
3476
3477   </sect1>
3478
3479   <sect1 id="datatype-bit">
3480    <title>Bit String Types</title>
3481
3482    <indexterm zone="datatype-bit">
3483     <primary>bit string</primary>
3484     <secondary>data type</secondary>
3485    </indexterm>
3486
3487    <para>
3488     Bit strings are strings of 1's and 0's.  They can be used to store
3489     or visualize bit masks.  There are two SQL bit types:
3490     <type>bit(<replaceable>n</replaceable>)</type> and <type>bit
3491     varying(<replaceable>n</replaceable>)</type>, where
3492     <replaceable>n</replaceable> is a positive integer.
3493    </para>
3494
3495    <para>
3496     <type>bit</type> type data must match the length
3497     <replaceable>n</replaceable> exactly; it is an error to attempt to
3498     store shorter or longer bit strings.  <type>bit varying</type> data is
3499     of variable length up to the maximum length
3500     <replaceable>n</replaceable>; longer strings will be rejected.
3501     Writing <type>bit</type> without a length is equivalent to
3502     <literal>bit(1)</literal>, while <type>bit varying</type> without a length
3503     specification means unlimited length.
3504    </para>
3505
3506    <note>
3507     <para>
3508      If one explicitly casts a bit-string value to
3509      <type>bit(<replaceable>n</>)</type>, it will be truncated or
3510      zero-padded on the right to be exactly <replaceable>n</> bits,
3511      without raising an error.  Similarly,
3512      if one explicitly casts a bit-string value to
3513      <type>bit varying(<replaceable>n</>)</type>, it will be truncated
3514      on the right if it is more than <replaceable>n</> bits.
3515     </para>
3516    </note>
3517
3518    <para>
3519     Refer to <xref
3520     linkend="sql-syntax-bit-strings"> for information about the syntax
3521     of bit string constants.  Bit-logical operators and string
3522     manipulation functions are available; see <xref
3523     linkend="functions-bitstring">.
3524    </para>
3525
3526    <example>
3527     <title>Using the bit string types</title>
3528
3529 <programlisting>
3530 CREATE TABLE test (a BIT(3), b BIT VARYING(5));
3531 INSERT INTO test VALUES (B'101', B'00');
3532 INSERT INTO test VALUES (B'10', B'101');
3533 <computeroutput>
3534 ERROR:  bit string length 2 does not match type bit(3)
3535 </computeroutput>
3536 INSERT INTO test VALUES (B'10'::bit(3), B'101');
3537 SELECT * FROM test;
3538 <computeroutput>
3539   a  |  b
3540 -----+-----
3541  101 | 00
3542  100 | 101
3543 </computeroutput>
3544 </programlisting>
3545    </example>
3546
3547    <para>
3548     A bit string value requires 1 byte for each group of 8 bits, plus
3549     5 or 8 bytes overhead depending on the length of the string
3550     (but long values may be compressed or moved out-of-line, as explained
3551     in <xref linkend="datatype-character"> for character strings).
3552    </para>
3553   </sect1>
3554
3555   <sect1 id="datatype-textsearch">
3556    <title>Text Search Types</title>
3557
3558    <indexterm zone="datatype-textsearch">
3559     <primary>full text search</primary>
3560     <secondary>data types</secondary>
3561    </indexterm>
3562
3563    <indexterm zone="datatype-textsearch">
3564     <primary>text search</primary>
3565     <secondary>data types</secondary>
3566    </indexterm>
3567
3568    <para>
3569     <productname>PostgreSQL</productname> provides two data types that
3570     are designed to support full text search, which is the activity of
3571     searching through a collection of natural-language <firstterm>documents</>
3572     to locate those that best match a <firstterm>query</>.
3573     The <type>tsvector</type> type represents a document in a form optimized
3574     for text search; the <type>tsquery</type> type similarly represents
3575     a text query.
3576     <xref linkend="textsearch"> provides a detailed explanation of this
3577     facility, and <xref linkend="functions-textsearch"> summarizes the
3578     related functions and operators.
3579    </para>
3580
3581    <sect2 id="datatype-tsvector">
3582     <title><type>tsvector</type></title>
3583
3584     <indexterm>
3585      <primary>tsvector (data type)</primary>
3586     </indexterm>
3587
3588     <para>
3589      A <type>tsvector</type> value is a sorted list of distinct
3590      <firstterm>lexemes</>, which are words that have been
3591      <firstterm>normalized</> to merge different variants of the same word
3592      (see <xref linkend="textsearch"> for details).  Sorting and
3593      duplicate-elimination are done automatically during input, as shown in
3594      this example:
3595
3596 <programlisting>
3597 SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector;
3598                       tsvector
3599 ----------------------------------------------------
3600  'a' 'and' 'ate' 'cat' 'fat' 'mat' 'on' 'rat' 'sat'
3601 </programlisting>
3602
3603      To represent
3604      lexemes containing whitespace or punctuation, surround them with quotes:
3605
3606 <programlisting>
3607 SELECT $$the lexeme '    ' contains spaces$$::tsvector;
3608                  tsvector
3609 -------------------------------------------
3610  '    ' 'contains' 'lexeme' 'spaces' 'the'
3611 </programlisting>
3612
3613      (We use dollar-quoted string literals in this example and the next one
3614      to avoid the confusion of having to double quote marks within the
3615      literals.)  Embedded quotes and backslashes must be doubled:
3616
3617 <programlisting>
3618 SELECT $$the lexeme 'Joe''s' contains a quote$$::tsvector;
3619                     tsvector
3620 ------------------------------------------------
3621  'Joe''s' 'a' 'contains' 'lexeme' 'quote' 'the'
3622 </programlisting>
3623
3624      Optionally, integer <firstterm>positions</>
3625      can be attached to lexemes:
3626
3627 <programlisting>
3628 SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector;
3629                                   tsvector
3630 -------------------------------------------------------------------------------
3631  'a':1,6,10 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'on':5 'rat':12 'sat':4
3632 </programlisting>
3633
3634      A position normally indicates the source word's location in the
3635      document.  Positional information can be used for
3636      <firstterm>proximity ranking</firstterm>.  Position values can
3637      range from 1 to 16383; larger numbers are silently set to 16383.
3638      Duplicate positions for the same lexeme are discarded.
3639     </para>
3640
3641     <para>
3642      Lexemes that have positions can further be labeled with a
3643      <firstterm>weight</>, which can be <literal>A</literal>,
3644      <literal>B</literal>, <literal>C</literal>, or <literal>D</literal>.
3645      <literal>D</literal> is the default and hence is not shown on output:
3646
3647 <programlisting>
3648 SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;
3649           tsvector
3650 ----------------------------
3651  'a':1A 'cat':5 'fat':2B,4C
3652 </programlisting>
3653
3654      Weights are typically used to reflect document structure, for example
3655      by marking title words differently from body words.  Text search
3656      ranking functions can assign different priorities to the different
3657      weight markers.
3658     </para>
3659
3660     <para>
3661      It is important to understand that the
3662      <type>tsvector</type> type itself does not perform any normalization;
3663      it assumes the words it is given are normalized appropriately
3664      for the application.  For example,
3665
3666 <programlisting>
3667 select 'The Fat Rats'::tsvector;
3668       tsvector
3669 --------------------
3670  'Fat' 'Rats' 'The'
3671 </programlisting>
3672
3673      For most English-text-searching applications the above words would
3674      be considered non-normalized, but <type>tsvector</type> doesn't care.
3675      Raw document text should usually be passed through
3676      <function>to_tsvector</> to normalize the words appropriately
3677      for searching:
3678
3679 <programlisting>
3680 SELECT to_tsvector('english', 'The Fat Rats');
3681    to_tsvector
3682 -----------------
3683  'fat':2 'rat':3
3684 </programlisting>
3685
3686      Again, see <xref linkend="textsearch"> for more detail.
3687     </para>
3688
3689    </sect2>
3690
3691    <sect2 id="datatype-tsquery">
3692     <title><type>tsquery</type></title>
3693
3694     <indexterm>
3695      <primary>tsquery (data type)</primary>
3696     </indexterm>
3697
3698     <para>
3699      A <type>tsquery</type> value stores lexemes that are to be
3700      searched for, and combines them honoring the boolean operators
3701      <literal>&amp;</literal> (AND), <literal>|</literal> (OR), and
3702      <literal>!</> (NOT).  Parentheses can be used to enforce grouping
3703      of the operators:
3704
3705 <programlisting>
3706  SELECT 'fat &amp; rat'::tsquery;
3707     tsquery
3708 ---------------
3709  'fat' &amp; 'rat'
3710
3711 SELECT 'fat &amp; (rat | cat)'::tsquery;
3712           tsquery
3713 ---------------------------
3714  'fat' &amp; ( 'rat' | 'cat' )
3715
3716 SELECT 'fat &amp; rat &amp; ! cat'::tsquery;
3717         tsquery
3718 ------------------------
3719  'fat' &amp; 'rat' &amp; !'cat'
3720 </programlisting>
3721
3722      In the absence of parentheses, <literal>!</> (NOT) binds most tightly,
3723      and <literal>&amp;</literal> (AND) binds more tightly than
3724      <literal>|</literal> (OR).
3725     </para>
3726
3727     <para>
3728      Optionally, lexemes in a <type>tsquery</type> can be labeled with
3729      one or more weight letters, which restricts them to match only
3730      <type>tsvector</> lexemes with matching weights:
3731
3732 <programlisting>
3733 SELECT 'fat:ab &amp; cat'::tsquery;
3734     tsquery
3735 ------------------
3736  'fat':AB &amp; 'cat'
3737 </programlisting>
3738     </para>
3739
3740     <para>
3741      Also, lexemes in a <type>tsquery</type> can be labeled with <literal>*</>
3742      to specify prefix matching:
3743 <programlisting>
3744 SELECT 'super:*'::tsquery;
3745   tsquery
3746 -----------
3747  'super':*
3748 </programlisting>
3749      This query will match any word in a <type>tsvector</> that begins
3750      with <quote>super</>.
3751     </para>
3752
3753     <para>
3754      Quoting rules for lexemes are the same as described previously for
3755      lexemes in <type>tsvector</>; and, as with <type>tsvector</>,
3756      any required normalization of words must be done before converting
3757      to the <type>tsquery</> type.  The <function>to_tsquery</>
3758      function is convenient for performing such normalization:
3759
3760 <programlisting>
3761 SELECT to_tsquery('Fat:ab &amp; Cats');
3762     to_tsquery
3763 ------------------
3764  'fat':AB &amp; 'cat'
3765 </programlisting>
3766     </para>
3767
3768    </sect2>
3769
3770   </sect1>
3771
3772   <sect1 id="datatype-uuid">
3773    <title><acronym>UUID</acronym> Type</title>
3774
3775    <indexterm zone="datatype-uuid">
3776     <primary>UUID</primary>
3777    </indexterm>
3778
3779    <para>
3780     The data type <type>uuid</type> stores Universally Unique Identifiers
3781     (UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards.
3782     (Some systems refer to this data type as a globally unique identifier, or
3783     GUID,<indexterm><primary>GUID</primary></indexterm> instead.)  This
3784     identifier is a 128-bit quantity that is generated by an algorithm chosen
3785     to make it very unlikely that the same identifier will be generated by
3786     anyone else in the known universe using the same algorithm.  Therefore,
3787     for distributed systems, these identifiers provide a better uniqueness
3788     guarantee than sequence generators, which
3789     are only unique within a single database.
3790    </para>
3791
3792    <para>
3793     A UUID is written as a sequence of lower-case hexadecimal digits,
3794     in several groups separated by hyphens, specifically a group of 8
3795     digits followed by three groups of 4 digits followed by a group of
3796     12 digits, for a total of 32 digits representing the 128 bits.  An
3797     example of a UUID in this standard form is:
3798 <programlisting>
3799 a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11
3800 </programlisting>
3801     <productname>PostgreSQL</productname> also accepts the following
3802     alternative forms for input:
3803     use of upper-case digits, the standard format surrounded by
3804     braces, omitting some or all hyphens, adding a hyphen after any
3805     group of four digits.  Examples are:
3806 <programlisting>
3807 A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11
3808 {a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}
3809 a0eebc999c0b4ef8bb6d6bb9bd380a11
3810 a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
3811 {a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}
3812 </programlisting>
3813     Output is always in the standard form.
3814    </para>
3815
3816    <para>
3817     <productname>PostgreSQL</productname> provides storage and comparison
3818     functions for UUIDs, but the core database does not include any
3819     function for generating UUIDs, because no single algorithm is well
3820     suited for every application.  The contrib module
3821     <filename>contrib/uuid-ossp</filename> provides functions that implement
3822     several standard algorithms.
3823     Alternatively, UUIDs could be generated by client applications or
3824     other libraries invoked through a server-side function.
3825    </para>
3826   </sect1>
3827
3828   <sect1 id="datatype-xml">
3829    <title><acronym>XML</> Type</title>
3830
3831    <indexterm zone="datatype-xml">
3832     <primary>XML</primary>
3833    </indexterm>
3834
3835    <para>
3836     The <type>xml</type> data type can be used to store XML data.  Its
3837     advantage over storing XML data in a <type>text</type> field is that it
3838     checks the input values for well-formedness, and there are support
3839     functions to perform type-safe operations on it; see <xref
3840     linkend="functions-xml">.  Use of this data type requires the
3841     installation to have been built with <command>configure
3842     --with-libxml</>.
3843    </para>
3844
3845    <para>
3846     The <type>xml</type> type can store well-formed
3847     <quote>documents</quote>, as defined by the XML standard, as well
3848     as <quote>content</quote> fragments, which are defined by the
3849     production <literal>XMLDecl? content</literal> in the XML
3850     standard.  Roughly, this means that content fragments can have
3851     more than one top-level element or character node.  The expression
3852     <literal><replaceable>xmlvalue</replaceable> IS DOCUMENT</literal>
3853     can be used to evaluate whether a particular <type>xml</type>
3854     value is a full document or only a content fragment.
3855    </para>
3856
3857    <sect2>
3858     <title>Creating XML Values</title>
3859    <para>
3860     To produce a value of type <type>xml</type> from character data,
3861     use the function
3862     <function>xmlparse</function>:<indexterm><primary>xmlparse</primary></indexterm>
3863 <synopsis>
3864 XMLPARSE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable>)
3865 </synopsis>
3866     Examples:
3867 <programlisting><![CDATA[
3868 XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')
3869 XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')
3870 ]]></programlisting>
3871     While this is the only way to convert character strings into XML
3872     values according to the SQL standard, the PostgreSQL-specific
3873     syntaxes:
3874 <programlisting><![CDATA[
3875 xml '<foo>bar</foo>'
3876 '<foo>bar</foo>'::xml
3877 ]]></programlisting>
3878     can also be used.
3879    </para>
3880
3881    <para>
3882     The <type>xml</type> type does not validate input values
3883     against a document type declaration
3884     (DTD),<indexterm><primary>DTD</primary></indexterm>
3885     even when the input value specifies a DTD.
3886    </para>
3887
3888    <para>
3889     The inverse operation, producing a character string value from
3890     <type>xml</type>, uses the function
3891     <function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm>
3892 <synopsis>
3893 XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> )
3894 </synopsis>
3895     <replaceable>type</replaceable> can be
3896     <type>character</type>, <type>character varying</type>, or
3897     <type>text</type> (or an alias for one of those).  Again, according
3898     to the SQL standard, this is the only way to convert between type
3899     <type>xml</type> and character types, but PostgreSQL also allows
3900     you to simply cast the value.
3901    </para>
3902
3903    <para>
3904     When a character string value is cast to or from type
3905     <type>xml</type> without going through <type>XMLPARSE</type> or
3906     <type>XMLSERIALIZE</type>, respectively, the choice of
3907     <literal>DOCUMENT</literal> versus <literal>CONTENT</literal> is
3908     determined by the <quote>XML option</quote>
3909     <indexterm><primary>XML option</primary></indexterm>
3910     session configuration parameter, which can be set using the
3911     standard command:
3912 <synopsis>
3913 SET XML OPTION { DOCUMENT | CONTENT };
3914 </synopsis>
3915     or the more PostgreSQL-like syntax
3916 <synopsis>
3917 SET xmloption TO { DOCUMENT | CONTENT };
3918 </synopsis>
3919     The default is <literal>CONTENT</literal>, so all forms of XML
3920     data are allowed.
3921    </para>
3922    </sect2>
3923
3924    <sect2>
3925     <title>Encoding Handling</title>
3926    <para>
3927     Care must be taken when dealing with multiple character encodings
3928     on the client, server, and in the XML data passed through them.
3929     When using the text mode to pass queries to the server and query
3930     results to the client (which is the normal mode), PostgreSQL
3931     converts all character data passed between the client and the
3932     server and vice versa to the character encoding of the respective
3933     end; see <xref linkend="multibyte">.  This includes string
3934     representations of XML values, such as in the above examples.
3935     This would ordinarily mean that encoding declarations contained in
3936     XML data can become invalid as the character data is converted
3937     to other encodings while travelling between client and server,
3938     because the embedded encoding declaration is not changed.  To cope
3939     with this behavior, encoding declarations contained in
3940     character strings presented for input to the <type>xml</type> type
3941     are <emphasis>ignored</emphasis>, and content is assumed
3942     to be in the current server encoding.  Consequently, for correct
3943     processing, character strings of XML data must be sent
3944     from the client in the current client encoding.  It is the
3945     responsibility of the client to either convert documents to the
3946     current client encoding before sending them to the server, or to
3947     adjust the client encoding appropriately.  On output, values of
3948     type <type>xml</type> will not have an encoding declaration, and
3949     clients should assume all data is in the current client
3950     encoding.
3951    </para>
3952
3953    <para>
3954     When using binary mode to pass query parameters to the server
3955     and query results back to the client, no character set conversion
3956     is performed, so the situation is different.  In this case, an
3957     encoding declaration in the XML data will be observed, and if it
3958     is absent, the data will be assumed to be in UTF-8 (as required by
3959     the XML standard; note that PostgreSQL does not support UTF-16).
3960     On output, data will have an encoding declaration
3961     specifying the client encoding, unless the client encoding is
3962     UTF-8, in which case it will be omitted.
3963    </para>
3964
3965    <para>
3966     Needless to say, processing XML data with PostgreSQL will be less
3967     error-prone and more efficient if the XML data encoding, client encoding,
3968     and server encoding are the same.  Since XML data is internally
3969     processed in UTF-8, computations will be most efficient if the
3970     server encoding is also UTF-8.
3971    </para>
3972
3973    <caution>
3974     <para>
3975      Some XML-related functions may not work at all on non-ASCII data
3976      when the server encoding is not UTF-8.  This is known to be an
3977      issue for <function>xpath()</> in particular.
3978     </para>
3979    </caution>
3980    </sect2>
3981
3982    <sect2>
3983    <title>Accessing XML Values</title>
3984
3985    <para>
3986     The <type>xml</type> data type is unusual in that it does not
3987     provide any comparison operators.  This is because there is no
3988     well-defined and universally useful comparison algorithm for XML
3989     data.  One consequence of this is that you cannot retrieve rows by
3990     comparing an <type>xml</type> column against a search value.  XML
3991     values should therefore typically be accompanied by a separate key
3992     field such as an ID.  An alternative solution for comparing XML
3993     values is to convert them to character strings first, but note
3994     that character string comparison has little to do with a useful
3995     XML comparison method.
3996    </para>
3997
3998    <para>
3999     Since there are no comparison operators for the <type>xml</type>
4000     data type, it is not possible to create an index directly on a
4001     column of this type.  If speedy searches in XML data are desired,
4002     possible workarounds include casting the expression to a
4003     character string type and indexing that, or indexing an XPath
4004     expression.  Of course, the actual query would have to be adjusted
4005     to search by the indexed expression.
4006    </para>
4007
4008    <para>
4009     The text-search functionality in PostgreSQL can also be used to speed
4010     up full-document searches of XML data.  The necessary
4011     preprocessing support is, however, not yet available in the PostgreSQL
4012     distribution.
4013    </para>
4014    </sect2>
4015   </sect1>
4016
4017   &array;
4018
4019   &rowtypes;
4020
4021   <sect1 id="datatype-oid">
4022    <title>Object Identifier Types</title>
4023
4024    <indexterm zone="datatype-oid">
4025     <primary>object identifier</primary>
4026     <secondary>data type</secondary>
4027    </indexterm>
4028
4029    <indexterm zone="datatype-oid">
4030     <primary>oid</primary>
4031    </indexterm>
4032
4033    <indexterm zone="datatype-oid">
4034     <primary>regproc</primary>
4035    </indexterm>
4036
4037    <indexterm zone="datatype-oid">
4038     <primary>regprocedure</primary>
4039    </indexterm>
4040
4041    <indexterm zone="datatype-oid">
4042     <primary>regoper</primary>
4043    </indexterm>
4044
4045    <indexterm zone="datatype-oid">
4046     <primary>regoperator</primary>
4047    </indexterm>
4048
4049    <indexterm zone="datatype-oid">
4050     <primary>regclass</primary>
4051    </indexterm>
4052
4053    <indexterm zone="datatype-oid">
4054     <primary>regtype</primary>
4055    </indexterm>
4056
4057    <indexterm zone="datatype-oid">
4058     <primary>regconfig</primary>
4059    </indexterm>
4060
4061    <indexterm zone="datatype-oid">
4062     <primary>regdictionary</primary>
4063    </indexterm>
4064
4065    <indexterm zone="datatype-oid">
4066     <primary>xid</primary>
4067    </indexterm>
4068
4069    <indexterm zone="datatype-oid">
4070     <primary>cid</primary>
4071    </indexterm>
4072
4073    <indexterm zone="datatype-oid">
4074     <primary>tid</primary>
4075    </indexterm>
4076
4077    <para>
4078     Object identifiers (OIDs) are used internally by
4079     <productname>PostgreSQL</productname> as primary keys for various
4080     system tables.  OIDs are not added to user-created tables, unless
4081     <literal>WITH OIDS</literal> is specified when the table is
4082     created, or the <xref linkend="guc-default-with-oids">
4083     configuration variable is enabled.  Type <type>oid</> represents
4084     an object identifier.  There are also several alias types for
4085     <type>oid</>: <type>regproc</>, <type>regprocedure</>,
4086     <type>regoper</>, <type>regoperator</>, <type>regclass</>,
4087     <type>regtype</>, <type>regconfig</>, and <type>regdictionary</>.
4088     <xref linkend="datatype-oid-table"> shows an overview.
4089    </para>
4090
4091    <para>
4092     The <type>oid</> type is currently implemented as an unsigned
4093     four-byte integer.  Therefore, it is not large enough to provide
4094     database-wide uniqueness in large databases, or even in large
4095     individual tables.  So, using a user-created table's OID column as
4096     a primary key is discouraged.  OIDs are best used only for
4097     references to system tables.
4098    </para>
4099
4100    <para>
4101     The <type>oid</> type itself has few operations beyond comparison.
4102     It can be cast to integer, however, and then manipulated using the
4103     standard integer operators.  (Beware of possible
4104     signed-versus-unsigned confusion if you do this.)
4105    </para>
4106
4107    <para>
4108     The OID alias types have no operations of their own except
4109     for specialized input and output routines.  These routines are able
4110     to accept and display symbolic names for system objects, rather than
4111     the raw numeric value that type <type>oid</> would use.  The alias
4112     types allow simplified lookup of OID values for objects.  For example,
4113     to examine the <structname>pg_attribute</> rows related to a table
4114     <literal>mytable</>, one could write:
4115 <programlisting>
4116 SELECT * FROM pg_attribute WHERE attrelid = 'mytable'::regclass;
4117 </programlisting>
4118     rather than:
4119 <programlisting>
4120 SELECT * FROM pg_attribute
4121   WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'mytable');
4122 </programlisting>
4123     While that doesn't look all that bad by itself, it's still oversimplified.
4124     A far more complicated sub-select would be needed to
4125     select the right OID if there are multiple tables named
4126     <literal>mytable</> in different schemas.
4127     The <type>regclass</> input converter handles the table lookup according
4128     to the schema path setting, and so it does the <quote>right thing</>
4129     automatically.  Similarly, casting a table's OID to
4130     <type>regclass</> is handy for symbolic display of a numeric OID.
4131    </para>
4132
4133     <table id="datatype-oid-table">
4134      <title>Object Identifier Types</title>
4135      <tgroup cols="4">
4136       <thead>
4137        <row>
4138         <entry>Name</entry>
4139         <entry>References</entry>
4140         <entry>Description</entry>
4141         <entry>Value Example</entry>
4142        </row>
4143       </thead>
4144
4145       <tbody>
4146
4147        <row>
4148         <entry><type>oid</></entry>
4149         <entry>any</entry>
4150         <entry>numeric object identifier</entry>
4151         <entry><literal>564182</></entry>
4152        </row>
4153
4154        <row>
4155         <entry><type>regproc</></entry>
4156         <entry><structname>pg_proc</></entry>
4157         <entry>function name</entry>
4158         <entry><literal>sum</></entry>
4159        </row>
4160
4161        <row>
4162         <entry><type>regprocedure</></entry>
4163         <entry><structname>pg_proc</></entry>
4164         <entry>function with argument types</entry>
4165         <entry><literal>sum(int4)</></entry>
4166        </row>
4167
4168        <row>
4169         <entry><type>regoper</></entry>
4170         <entry><structname>pg_operator</></entry>
4171         <entry>operator name</entry>
4172         <entry><literal>+</></entry>
4173        </row>
4174
4175        <row>
4176         <entry><type>regoperator</></entry>
4177         <entry><structname>pg_operator</></entry>
4178         <entry>operator with argument types</entry>
4179         <entry><literal>*(integer,integer)</> or <literal>-(NONE,integer)</></entry>
4180        </row>
4181
4182        <row>
4183         <entry><type>regclass</></entry>
4184         <entry><structname>pg_class</></entry>
4185         <entry>relation name</entry>
4186         <entry><literal>pg_type</></entry>
4187        </row>
4188
4189        <row>
4190         <entry><type>regtype</></entry>
4191         <entry><structname>pg_type</></entry>
4192         <entry>data type name</entry>
4193         <entry><literal>integer</></entry>
4194        </row>
4195
4196        <row>
4197         <entry><type>regconfig</></entry>
4198         <entry><structname>pg_ts_config</></entry>
4199         <entry>text search configuration</entry>
4200         <entry><literal>english</></entry>
4201        </row>
4202
4203        <row>
4204         <entry><type>regdictionary</></entry>
4205         <entry><structname>pg_ts_dict</></entry>
4206         <entry>text search dictionary</entry>
4207         <entry><literal>simple</></entry>
4208        </row>
4209       </tbody>
4210      </tgroup>
4211     </table>
4212
4213    <para>
4214     All of the OID alias types accept schema-qualified names, and will
4215     display schema-qualified names on output if the object would not
4216     be found in the current search path without being qualified.
4217     The <type>regproc</> and <type>regoper</> alias types will only
4218     accept input names that are unique (not overloaded), so they are
4219     of limited use; for most uses <type>regprocedure</> or
4220     <type>regoperator</> are more appropriate.  For <type>regoperator</>,
4221     unary operators are identified by writing <literal>NONE</> for the unused
4222     operand.
4223    </para>
4224
4225    <para>
4226     An additional property of the OID alias types is the creation of
4227     dependencies.  If a
4228     constant of one of these types appears in a stored expression
4229     (such as a column default expression or view), it creates a dependency
4230     on the referenced object.  For example, if a column has a default
4231     expression <literal>nextval('my_seq'::regclass)</>,
4232     <productname>PostgreSQL</productname>
4233     understands that the default expression depends on the sequence
4234     <literal>my_seq</>; the system will not let the sequence be dropped
4235     without first removing the default expression.
4236    </para>
4237
4238    <para>
4239     Another identifier type used by the system is <type>xid</>, or transaction
4240     (abbreviated <abbrev>xact</>) identifier.  This is the data type of the system columns
4241     <structfield>xmin</> and <structfield>xmax</>.  Transaction identifiers are 32-bit quantities.
4242    </para>
4243
4244    <para>
4245     A third identifier type used by the system is <type>cid</>, or
4246     command identifier.  This is the data type of the system columns
4247     <structfield>cmin</> and <structfield>cmax</>. Command identifiers are also 32-bit quantities.
4248    </para>
4249
4250    <para>
4251     A final identifier type used by the system is <type>tid</>, or tuple
4252     identifier (row identifier).  This is the data type of the system column
4253     <structfield>ctid</>.  A tuple ID is a pair
4254     (block number, tuple index within block) that identifies the
4255     physical location of the row within its table.
4256    </para>
4257
4258    <para>
4259     (The system columns are further explained in <xref
4260     linkend="ddl-system-columns">.)
4261    </para>
4262   </sect1>
4263
4264   <sect1 id="datatype-pseudo">
4265    <title>Pseudo-Types</title>
4266
4267    <indexterm zone="datatype-pseudo">
4268     <primary>record</primary>
4269    </indexterm>
4270
4271    <indexterm zone="datatype-pseudo">
4272     <primary>any</primary>
4273    </indexterm>
4274
4275    <indexterm zone="datatype-pseudo">
4276     <primary>anyelement</primary>
4277    </indexterm>
4278
4279    <indexterm zone="datatype-pseudo">
4280     <primary>anyarray</primary>
4281    </indexterm>
4282
4283    <indexterm zone="datatype-pseudo">
4284     <primary>anynonarray</primary>
4285    </indexterm>
4286
4287    <indexterm zone="datatype-pseudo">
4288     <primary>anyenum</primary>
4289    </indexterm>
4290
4291    <indexterm zone="datatype-pseudo">
4292     <primary>void</primary>
4293    </indexterm>
4294
4295    <indexterm zone="datatype-pseudo">
4296     <primary>trigger</primary>
4297    </indexterm>
4298
4299    <indexterm zone="datatype-pseudo">
4300     <primary>language_handler</primary>
4301    </indexterm>
4302
4303    <indexterm zone="datatype-pseudo">
4304     <primary>cstring</primary>
4305    </indexterm>
4306
4307    <indexterm zone="datatype-pseudo">
4308     <primary>internal</primary>
4309    </indexterm>
4310
4311    <indexterm zone="datatype-pseudo">
4312     <primary>opaque</primary>
4313    </indexterm>
4314
4315    <para>
4316     The <productname>PostgreSQL</productname> type system contains a
4317     number of special-purpose entries that are collectively called
4318     <firstterm>pseudo-types</>.  A pseudo-type cannot be used as a
4319     column data type, but it can be used to declare a function's
4320     argument or result type.  Each of the available pseudo-types is
4321     useful in situations where a function's behavior does not
4322     correspond to simply taking or returning a value of a specific
4323     <acronym>SQL</acronym> data type.  <xref
4324     linkend="datatype-pseudotypes-table"> lists the existing
4325     pseudo-types.
4326    </para>
4327
4328     <table id="datatype-pseudotypes-table">
4329      <title>Pseudo-Types</title>
4330      <tgroup cols="2">
4331       <thead>
4332        <row>
4333         <entry>Name</entry>
4334         <entry>Description</entry>
4335        </row>
4336       </thead>
4337
4338       <tbody>
4339        <row>
4340         <entry><type>any</></entry>
4341         <entry>Indicates that a function accepts any input data type.</entry>
4342        </row>
4343
4344        <row>
4345         <entry><type>anyarray</></entry>
4346         <entry>Indicates that a function accepts any array data type
4347         (see <xref linkend="extend-types-polymorphic">).</entry>
4348        </row>
4349
4350        <row>
4351         <entry><type>anyelement</></entry>
4352         <entry>Indicates that a function accepts any data type
4353         (see <xref linkend="extend-types-polymorphic">).</entry>
4354        </row>
4355
4356        <row>
4357         <entry><type>anyenum</></entry>
4358         <entry>Indicates that a function accepts any enum data type
4359         (see <xref linkend="extend-types-polymorphic"> and
4360         <xref linkend="datatype-enum">).</entry>
4361        </row>
4362
4363        <row>
4364         <entry><type>anynonarray</></entry>
4365         <entry>Indicates that a function accepts any non-array data type
4366         (see <xref linkend="extend-types-polymorphic">).</entry>
4367        </row>
4368
4369        <row>
4370         <entry><type>cstring</></entry>
4371         <entry>Indicates that a function accepts or returns a null-terminated C string.</entry>
4372        </row>
4373
4374        <row>
4375         <entry><type>internal</></entry>
4376         <entry>Indicates that a function accepts or returns a server-internal
4377         data type.</entry>
4378        </row>
4379
4380        <row>
4381         <entry><type>language_handler</></entry>
4382         <entry>A procedural language call handler is declared to return <type>language_handler</>.</entry>
4383        </row>
4384
4385        <row>
4386         <entry><type>record</></entry>
4387         <entry>Identifies a function returning an unspecified row type.</entry>
4388        </row>
4389
4390        <row>
4391         <entry><type>trigger</></entry>
4392         <entry>A trigger function is declared to return <type>trigger.</></entry>
4393        </row>
4394
4395        <row>
4396         <entry><type>void</></entry>
4397         <entry>Indicates that a function returns no value.</entry>
4398        </row>
4399
4400        <row>
4401         <entry><type>opaque</></entry>
4402         <entry>An obsolete type name that formerly served all the above purposes.</entry>
4403        </row>
4404       </tbody>
4405      </tgroup>
4406     </table>
4407
4408    <para>
4409     Functions coded in C (whether built-in or dynamically loaded) can be
4410     declared to accept or return any of these pseudo data types.  It is up to
4411     the function author to ensure that the function will behave safely
4412     when a pseudo-type is used as an argument type.
4413    </para>
4414
4415    <para>
4416     Functions coded in procedural languages can use pseudo-types only as
4417     allowed by their implementation languages.  At present the procedural
4418     languages all forbid use of a pseudo-type as argument type, and allow
4419     only <type>void</> and <type>record</> as a result type (plus
4420     <type>trigger</> when the function is used as a trigger).  Some also
4421     support polymorphic functions using the types <type>anyarray</>,
4422     <type>anyelement</>, <type>anyenum</>, and <type>anynonarray</>.
4423    </para>
4424
4425    <para>
4426     The <type>internal</> pseudo-type is used to declare functions
4427     that are meant only to be called internally by the database
4428     system, and not by direct invocation in an <acronym>SQL</acronym>
4429     query.  If a function has at least one <type>internal</>-type
4430     argument then it cannot be called from <acronym>SQL</acronym>.  To
4431     preserve the type safety of this restriction it is important to
4432     follow this coding rule: do not create any function that is
4433     declared to return <type>internal</> unless it has at least one
4434     <type>internal</> argument.
4435    </para>
4436
4437   </sect1>
4438
4439  </chapter>