doc/src/sgml/ddl.sgml

   1 <!-- doc/src/sgml/ddl.sgml -->
   2
   3 <chapter id="ddl">
   4  <title>Data Definition</title>
   5
   6  <para>
   7   This chapter covers how one creates the database structures that
   8   will hold one's data.  In a relational database, the raw data is
   9   stored in tables, so the majority of this chapter is devoted to
  10   explaining how tables are created and modified and what features are
  11   available to control what data is stored in the tables.
  12   Subsequently, we discuss how tables can be organized into
  13   schemas, and how privileges can be assigned to tables.  Finally,
  14   we will briefly look at other features that affect the data storage,
  15   such as inheritance, table partitioning, views, functions, and
  16   triggers.
  17  </para>
  18
  19  <sect1 id="ddl-basics">
  20   <title>Table Basics</title>
  21
  22   <indexterm zone="ddl-basics">
  23    <primary>table</primary>
  24   </indexterm>
  25
  26   <indexterm>
  27    <primary>row</primary>
  28   </indexterm>
  29
  30   <indexterm>
  31    <primary>column</primary>
  32   </indexterm>
  33
  34   <para>
  35    A table in a relational database is much like a table on paper: It
  36    consists of rows and columns.  The number and order of the columns
  37    is fixed, and each column has a name.  The number of rows is
  38    variable &mdash; it reflects how much data is stored at a given moment.
  39    SQL does not make any guarantees about the order of the rows in a
  40    table.  When a table is read, the rows will appear in an unspecified order,
  41    unless sorting is explicitly requested.  This is covered in <xref
  42    linkend="queries"/>.  Furthermore, SQL does not assign unique
  43    identifiers to rows, so it is possible to have several completely
  44    identical rows in a table.  This is a consequence of the
  45    mathematical model that underlies SQL but is usually not desirable.
  46    Later in this chapter we will see how to deal with this issue.
  47   </para>
  48
  49   <para>
  50    Each column has a data type.  The data type constrains the set of
  51    possible values that can be assigned to a column and assigns
  52    semantics to the data stored in the column so that it can be used
  53    for computations.  For instance, a column declared to be of a
  54    numerical type will not accept arbitrary text strings, and the data
  55    stored in such a column can be used for mathematical computations.
  56    By contrast, a column declared to be of a character string type
  57    will accept almost any kind of data but it does not lend itself to
  58    mathematical calculations, although other operations such as string
  59    concatenation are available.
  60   </para>
  61
  62   <para>
  63    <productname>PostgreSQL</productname> includes a sizable set of
  64    built-in data types that fit many applications.  Users can also
  65    define their own data types.  Most built-in data types have obvious
  66    names and semantics, so we defer a detailed explanation to <xref
  67    linkend="datatype"/>.  Some of the frequently used data types are
  68    <type>integer</type> for whole numbers, <type>numeric</type> for
  69    possibly fractional numbers, <type>text</type> for character
  70    strings, <type>date</type> for dates, <type>time</type> for
  71    time-of-day values, and <type>timestamp</type> for values
  72    containing both date and time.
  73   </para>
  74
  75   <indexterm>
  76    <primary>table</primary>
  77    <secondary>creating</secondary>
  78   </indexterm>
  79
  80   <para>
  81    To create a table, you use the aptly named <xref
  82    linkend="sql-createtable"/> command.
  83    In this command you specify at least a name for the new table, the
  84    names of the columns and the data type of each column.  For
  85    example:
  86 <programlisting>
  87 CREATE TABLE my_first_table (
  88     first_column text,
  89     second_column integer
  90 );
  91 </programlisting>
  92    This creates a table named <literal>my_first_table</literal> with
  93    two columns.  The first column is named
  94    <literal>first_column</literal> and has a data type of
  95    <type>text</type>; the second column has the name
  96    <literal>second_column</literal> and the type <type>integer</type>.
  97    The table and column names follow the identifier syntax explained
  98    in <xref linkend="sql-syntax-identifiers"/>.  The type names are
  99    usually also identifiers, but there are some exceptions.  Note that the
 100    column list is comma-separated and surrounded by parentheses.
 101   </para>
 102
 103   <para>
 104    Of course, the previous example was heavily contrived.  Normally,
 105    you would give names to your tables and columns that convey what
 106    kind of data they store.  So let's look at a more realistic
 107    example:
 108 <programlisting>
 109 CREATE TABLE products (
 110     product_no integer,
 111     name text,
 112     price numeric
 113 );
 114 </programlisting>
 115    (The <type>numeric</type> type can store fractional components, as
 116    would be typical of monetary amounts.)
 117   </para>
 118
 119   <tip>
 120    <para>
 121     When you create many interrelated tables it is wise to choose a
 122     consistent naming pattern for the tables and columns.  For
 123     instance, there is a choice of using singular or plural nouns for
 124     table names, both of which are favored by some theorist or other.
 125    </para>
 126   </tip>
 127
 128   <para>
 129    There is a limit on how many columns a table can contain.
 130    Depending on the column types, it is between 250 and 1600.
 131    However, defining a table with anywhere near this many columns is
 132    highly unusual and often a questionable design.
 133   </para>
 134
 135   <indexterm>
 136    <primary>table</primary>
 137    <secondary>removing</secondary>
 138   </indexterm>
 139
 140   <para>
 141    If you no longer need a table, you can remove it using the <xref
 142    linkend="sql-droptable"/> command.
 143    For example:
 144 <programlisting>
 145 DROP TABLE my_first_table;
 146 DROP TABLE products;
 147 </programlisting>
 148    Attempting to drop a table that does not exist is an error.
 149    Nevertheless, it is common in SQL script files to unconditionally
 150    try to drop each table before creating it, ignoring any error
 151    messages, so that the script works whether or not the table exists.
 152    (If you like, you can use the <literal>DROP TABLE IF EXISTS</literal> variant
 153    to avoid the error messages, but this is not standard SQL.)
 154   </para>
 155
 156   <para>
 157    If you need to modify a table that already exists, see <xref
 158    linkend="ddl-alter"/> later in this chapter.
 159   </para>
 160
 161   <para>
 162    With the tools discussed so far you can create fully functional
 163    tables.  The remainder of this chapter is concerned with adding
 164    features to the table definition to ensure data integrity,
 165    security, or convenience.  If you are eager to fill your tables with
 166    data now you can skip ahead to <xref linkend="dml"/> and read the
 167    rest of this chapter later.
 168   </para>
 169  </sect1>
 170
 171  <sect1 id="ddl-default">
 172   <title>Default Values</title>
 173
 174   <indexterm zone="ddl-default">
 175    <primary>default value</primary>
 176   </indexterm>
 177
 178   <para>
 179    A column can be assigned a default value.  When a new row is
 180    created and no values are specified for some of the columns, those
 181    columns will be filled with their respective default values.  A
 182    data manipulation command can also request explicitly that a column
 183    be set to its default value, without having to know what that value is.
 184    (Details about data manipulation commands are in <xref linkend="dml"/>.)
 185   </para>
 186
 187   <para>
 188    <indexterm><primary>null value</primary><secondary>default value</secondary></indexterm>
 189    If no default value is declared explicitly, the default value is the
 190    null value.  This usually makes sense because a null value can
 191    be considered to represent unknown data.
 192   </para>
 193
 194   <para>
 195    In a table definition, default values are listed after the column
 196    data type.  For example:
 197 <programlisting>
 198 CREATE TABLE products (
 199     product_no integer,
 200     name text,
 201     price numeric <emphasis>DEFAULT 9.99</emphasis>
 202 );
 203 </programlisting>
 204   </para>
 205
 206   <para>
 207    The default value can be an expression, which will be
 208    evaluated whenever the default value is inserted
 209    (<emphasis>not</emphasis> when the table is created).  A common example
 210    is for a <type>timestamp</type> column to have a default of <literal>CURRENT_TIMESTAMP</literal>,
 211    so that it gets set to the time of row insertion.  Another common
 212    example is generating a <quote>serial number</quote> for each row.
 213    In <productname>PostgreSQL</productname> this is typically done by
 214    something like:
 215 <programlisting>
 216 CREATE TABLE products (
 217     product_no integer <emphasis>DEFAULT nextval('products_product_no_seq')</emphasis>,
 218     ...
 219 );
 220 </programlisting>
 221    where the <literal>nextval()</literal> function supplies successive values
 222    from a <firstterm>sequence object</firstterm> (see <xref
 223    linkend="functions-sequence"/>). This arrangement is sufficiently common
 224    that there's a special shorthand for it:
 225 <programlisting>
 226 CREATE TABLE products (
 227     product_no <emphasis>SERIAL</emphasis>,
 228     ...
 229 );
 230 </programlisting>
 231    The <literal>SERIAL</literal> shorthand is discussed further in <xref
 232    linkend="datatype-serial"/>.
 233   </para>
 234  </sect1>
 235
 236  <sect1 id="ddl-identity-columns">
 237   <title>Identity Columns</title>
 238
 239   <indexterm zone="ddl-identity-columns">
 240    <primary>identity column</primary>
 241   </indexterm>
 242
 243   <para>
 244    An identity column is a special column that is generated automatically from
 245    an implicit sequence.  It can be used to generate key values.
 246   </para>
 247
 248   <para>
 249    To create an identity column, use the <literal>GENERATED ...
 250    AS IDENTITY</literal> clause in <command>CREATE TABLE</command>, for example:
 251 <programlisting>
 252 CREATE TABLE people (
 253     id bigint <emphasis>GENERATED ALWAYS AS IDENTITY</emphasis>,
 254     ...,
 255 );
 256 </programlisting>
 257    or alternatively
 258 <programlisting>
 259 CREATE TABLE people (
 260     id bigint <emphasis>GENERATED BY DEFAULT IDENTITY</emphasis>,
 261     ...,
 262 );
 263 </programlisting>
 264    See <xref linkend="sql-createtable"/> for more details.
 265   </para>
 266
 267   <para>
 268    If an <command>INSERT</command> command is executed on the table with the
 269    identity column and no value is explicitly specified for the identity
 270    column, then a value generated by the implicit sequence is inserted.  For
 271    example, with the above definitions and assuming additional appropriate
 272    columns, writing
 273 <programlisting>
 274 INSERT INTO people (name, address) VALUE ('A', 'foo');
 275 INSERT INTO people (name, address) VALUE ('B', 'bar');
 276 </programlisting>
 277    would generate values for the <literal>id</literal> column starting at 1
 278    and result in the following table data:
 279 <screen>
 280  id | name | address
 281 ----+------+---------
 282   1 | A    | foo
 283   2 | B    | bar
 284 </screen>
 285    Alternatively, the keyword <literal>DEFAULT</literal> can be specified in
 286    place of a value to explicitly request the sequence-generated value, like
 287 <programlisting>
 288 INSERT INTO people (id, name, address) VALUE (<emphasis>DEFAULT</emphasis>, 'C', 'baz');
 289 </programlisting>
 290    Similarly, the keyword <literal>DEFAULT</literal> can be used in
 291    <command>UPDATE</command> commands.
 292   </para>
 293
 294   <para>
 295    Thus, in many ways, an identity column behaves like a column with a default
 296    value.
 297   </para>
 298
 299   <para>
 300    The clauses <literal>ALWAYS</literal> and <literal>BY DEFAULT</literal> in
 301    the column definition determine how explicitly user-specified values are
 302    handled in <command>INSERT</command> and <command>UPDATE</command>
 303    commands.  In an <command>INSERT</command> command, if
 304    <literal>ALWAYS</literal> is selected, a user-specified value is only
 305    accepted if the <command>INSERT</command> statement specifies
 306    <literal>OVERRIDING SYSTEM VALUE</literal>.  If <literal>BY
 307    DEFAULT</literal> is selected, then the user-specified value takes
 308    precedence.  Thus, using <literal>BY DEFAULT</literal> results in a
 309    behavior more similar to default values, where the default value can be
 310    overridden by an explicit value, whereas <literal>ALWAYS</literal> provides
 311    some more protection against accidentally inserting an explicit value.
 312   </para>
 313
 314   <para>
 315    The data type of an identity column must be one of the data types supported
 316    by sequences.  (See <xref linkend="sql-createsequence"/>.)  The properties
 317    of the associated sequence may be specified when creating an identity
 318    column (see <xref linkend="sql-createtable"/>) or changed afterwards (see
 319    <xref linkend="sql-altertable"/>).
 320   </para>
 321
 322   <para>
 323    An identity column is automatically marked as <literal>NOT NULL</literal>.
 324    An identity column, however, does not guarantee uniqueness.  (A sequence
 325    normally returns unique values, but a sequence could be reset, or values
 326    could be inserted manually into the identity column, as discussed above.)
 327    Uniqueness would need to be enforced using a <literal>PRIMARY KEY</literal>
 328    or <literal>UNIQUE</literal> constraint.
 329   </para>
 330
 331   <para>
 332    In table inheritance hierarchies, identity columns and their properties in
 333    a child table are independent of those in its parent tables.  A child table
 334    does not inherit identity columns or their properties automatically from
 335    the parent. During <command>INSERT</command> or <command>UPDATE</command>,
 336    a column is treated as an identity column if that column is an identity
 337    column in the table named in the statement, and the corresponding identity
 338    properties are applied.
 339   </para>
 340
 341   <para>
 342    Partitions inherit identity columns from the partitioned table.  They
 343    cannot have their own identity columns.  The properties of a given identity
 344    column are consistent across all the partitions in the partition hierarchy.
 345   </para>
 346  </sect1>
 347
 348  <sect1 id="ddl-generated-columns">
 349   <title>Generated Columns</title>
 350
 351   <indexterm zone="ddl-generated-columns">
 352    <primary>generated column</primary>
 353   </indexterm>
 354
 355   <para>
 356    A generated column is a special column that is always computed from other
 357    columns.  Thus, it is for columns what a view is for tables.  There are two
 358    kinds of generated columns: stored and virtual.  A stored generated column
 359    is computed when it is written (inserted or updated) and occupies storage
 360    as if it were a normal column.  A virtual generated column occupies no
 361    storage and is computed when it is read.  Thus, a virtual generated column
 362    is similar to a view and a stored generated column is similar to a
 363    materialized view (except that it is always updated automatically).
 364    <productname>PostgreSQL</productname> currently implements only stored generated columns.
 365   </para>
 366
 367   <para>
 368    To create a generated column, use the <literal>GENERATED ALWAYS
 369    AS</literal> clause in <command>CREATE TABLE</command>, for example:
 370 <programlisting>
 371 CREATE TABLE people (
 372     ...,
 373     height_cm numeric,
 374     height_in numeric <emphasis>GENERATED ALWAYS AS (height_cm / 2.54) STORED</emphasis>
 375 );
 376 </programlisting>
 377    The keyword <literal>STORED</literal> must be specified to choose the
 378    stored kind of generated column.  See <xref linkend="sql-createtable"/> for
 379    more details.
 380   </para>
 381
 382   <para>
 383    A generated column cannot be written to directly.  In
 384    <command>INSERT</command> or <command>UPDATE</command> commands, a value
 385    cannot be specified for a generated column, but the keyword
 386    <literal>DEFAULT</literal> may be specified.
 387   </para>
 388
 389   <para>
 390    Consider the differences between a column with a default and a generated
 391    column.  The column default is evaluated once when the row is first
 392    inserted if no other value was provided; a generated column is updated
 393    whenever the row changes and cannot be overridden.  A column default may
 394    not refer to other columns of the table; a generation expression would
 395    normally do so.  A column default can use volatile functions, for example
 396    <literal>random()</literal> or functions referring to the current time;
 397    this is not allowed for generated columns.
 398   </para>
 399
 400   <para>
 401    Several restrictions apply to the definition of generated columns and
 402    tables involving generated columns:
 403
 404    <itemizedlist>
 405     <listitem>
 406      <para>
 407       The generation expression can only use immutable functions and cannot
 408       use subqueries or reference anything other than the current row in any
 409       way.
 410      </para>
 411     </listitem>
 412     <listitem>
 413      <para>
 414       A generation expression cannot reference another generated column.
 415      </para>
 416     </listitem>
 417     <listitem>
 418      <para>
 419       A generation expression cannot reference a system column, except
 420       <varname>tableoid</varname>.
 421      </para>
 422     </listitem>
 423     <listitem>
 424      <para>
 425       A generated column cannot have a column default or an identity definition.
 426      </para>
 427     </listitem>
 428     <listitem>
 429      <para>
 430       A generated column cannot be part of a partition key.
 431      </para>
 432     </listitem>
 433     <listitem>
 434      <para>
 435       Foreign tables can have generated columns.  See <xref
 436       linkend="sql-createforeigntable"/> for details.
 437      </para>
 438     </listitem>
 439     <listitem>
 440      <para>For inheritance and partitioning:</para>
 441      <itemizedlist>
 442       <listitem>
 443        <para>
 444         If a parent column is a generated column, its child column must also
 445         be a generated column; however, the child column can have a
 446         different generation expression.  The generation expression that is
 447         actually applied during insert or update of a row is the one
 448         associated with the table that the row is physically in.
 449         (This is unlike the behavior for column defaults: for those, the
 450         default value associated with the table named in the query applies.)
 451        </para>
 452       </listitem>
 453       <listitem>
 454        <para>
 455         If a parent column is not a generated column, its child column must
 456         not be generated either.
 457        </para>
 458       </listitem>
 459       <listitem>
 460        <para>
 461         For inherited tables, if you write a child column definition without
 462         any <literal>GENERATED</literal> clause in <command>CREATE TABLE
 463         ... INHERITS</command>, then its <literal>GENERATED</literal> clause
 464         will automatically be copied from the parent.  <command>ALTER TABLE
 465         ... INHERIT</command> will insist that parent and child columns
 466         already match as to generation status, but it will not require their
 467         generation expressions to match.
 468        </para>
 469       </listitem>
 470       <listitem>
 471        <para>
 472         Similarly for partitioned tables, if you write a child column
 473         definition without any <literal>GENERATED</literal> clause
 474         in <command>CREATE TABLE ... PARTITION OF</command>, then
 475         its <literal>GENERATED</literal> clause will automatically be copied
 476         from the parent.  <command>ALTER TABLE ... ATTACH PARTITION</command>
 477         will insist that parent and child columns already match as to
 478         generation status, but it will not require their generation
 479         expressions to match.
 480        </para>
 481       </listitem>
 482       <listitem>
 483        <para>
 484         In case of multiple inheritance, if one parent column is a generated
 485         column, then all parent columns must be generated columns.  If they
 486         do not all have the same generation expression, then the desired
 487         expression for the child must be specified explicitly.
 488        </para>
 489       </listitem>
 490      </itemizedlist>
 491     </listitem>
 492    </itemizedlist>
 493   </para>
 494
 495   <para>
 496    Additional considerations apply to the use of generated columns.
 497    <itemizedlist>
 498     <listitem>
 499      <para>
 500       Generated columns maintain access privileges separately from their
 501       underlying base columns.  So, it is possible to arrange it so that a
 502       particular role can read from a generated column but not from the
 503       underlying base columns.
 504      </para>
 505     </listitem>
 506     <listitem>
 507      <para>
 508       Generated columns are, conceptually, updated after
 509       <literal>BEFORE</literal> triggers have run.  Therefore, changes made to
 510       base columns in a <literal>BEFORE</literal> trigger will be reflected in
 511       generated columns.  But conversely, it is not allowed to access
 512       generated columns in <literal>BEFORE</literal> triggers.
 513      </para>
 514     </listitem>
 515    </itemizedlist>
 516   </para>
 517  </sect1>
 518
 519  <sect1 id="ddl-constraints">
 520   <title>Constraints</title>
 521
 522   <indexterm zone="ddl-constraints">
 523    <primary>constraint</primary>
 524   </indexterm>
 525
 526   <para>
 527    Data types are a way to limit the kind of data that can be stored
 528    in a table.  For many applications, however, the constraint they
 529    provide is too coarse.  For example, a column containing a product
 530    price should probably only accept positive values.  But there is no
 531    standard data type that accepts only positive numbers.  Another issue is
 532    that you might want to constrain column data with respect to other
 533    columns or rows.  For example, in a table containing product
 534    information, there should be only one row for each product number.
 535   </para>
 536
 537   <para>
 538    To that end, SQL allows you to define constraints on columns and
 539    tables.  Constraints give you as much control over the data in your
 540    tables as you wish.  If a user attempts to store data in a column
 541    that would violate a constraint, an error is raised.  This applies
 542    even if the value came from the default value definition.
 543   </para>
 544
 545   <sect2 id="ddl-constraints-check-constraints">
 546    <title>Check Constraints</title>
 547
 548    <indexterm>
 549     <primary>check constraint</primary>
 550    </indexterm>
 551
 552    <indexterm>
 553     <primary>constraint</primary>
 554     <secondary>check</secondary>
 555    </indexterm>
 556
 557    <para>
 558     A check constraint is the most generic constraint type.  It allows
 559     you to specify that the value in a certain column must satisfy a
 560     Boolean (truth-value) expression.  For instance, to require positive
 561     product prices, you could use:
 562 <programlisting>
 563 CREATE TABLE products (
 564     product_no integer,
 565     name text,
 566     price numeric <emphasis>CHECK (price &gt; 0)</emphasis>
 567 );
 568 </programlisting>
 569    </para>
 570
 571    <para>
 572     As you see, the constraint definition comes after the data type,
 573     just like default value definitions.  Default values and
 574     constraints can be listed in any order.  A check constraint
 575     consists of the key word <literal>CHECK</literal> followed by an
 576     expression in parentheses.  The check constraint expression should
 577     involve the column thus constrained, otherwise the constraint
 578     would not make too much sense.
 579    </para>
 580
 581    <indexterm>
 582     <primary>constraint</primary>
 583     <secondary>name</secondary>
 584    </indexterm>
 585
 586    <para>
 587     You can also give the constraint a separate name.  This clarifies
 588     error messages and allows you to refer to the constraint when you
 589     need to change it.  The syntax is:
 590 <programlisting>
 591 CREATE TABLE products (
 592     product_no integer,
 593     name text,
 594     price numeric <emphasis>CONSTRAINT positive_price</emphasis> CHECK (price &gt; 0)
 595 );
 596 </programlisting>
 597     So, to specify a named constraint, use the key word
 598     <literal>CONSTRAINT</literal> followed by an identifier followed
 599     by the constraint definition.  (If you don't specify a constraint
 600     name in this way, the system chooses a name for you.)
 601    </para>
 602
 603    <para>
 604     A check constraint can also refer to several columns.  Say you
 605     store a regular price and a discounted price, and you want to
 606     ensure that the discounted price is lower than the regular price:
 607 <programlisting>
 608 CREATE TABLE products (
 609     product_no integer,
 610     name text,
 611     price numeric CHECK (price &gt; 0),
 612     discounted_price numeric CHECK (discounted_price &gt; 0),
 613     <emphasis>CHECK (price &gt; discounted_price)</emphasis>
 614 );
 615 </programlisting>
 616    </para>
 617
 618    <para>
 619     The first two constraints should look familiar.  The third one
 620     uses a new syntax.  It is not attached to a particular column,
 621     instead it appears as a separate item in the comma-separated
 622     column list.  Column definitions and these constraint
 623     definitions can be listed in mixed order.
 624    </para>
 625
 626    <para>
 627     We say that the first two constraints are column constraints, whereas the
 628     third one is a table constraint because it is written separately
 629     from any one column definition.  Column constraints can also be
 630     written as table constraints, while the reverse is not necessarily
 631     possible, since a column constraint is supposed to refer to only the
 632     column it is attached to.  (<productname>PostgreSQL</productname> doesn't
 633     enforce that rule, but you should follow it if you want your table
 634     definitions to work with other database systems.)  The above example could
 635     also be written as:
 636 <programlisting>
 637 CREATE TABLE products (
 638     product_no integer,
 639     name text,
 640     price numeric,
 641     CHECK (price &gt; 0),
 642     discounted_price numeric,
 643     CHECK (discounted_price &gt; 0),
 644     CHECK (price &gt; discounted_price)
 645 );
 646 </programlisting>
 647     or even:
 648 <programlisting>
 649 CREATE TABLE products (
 650     product_no integer,
 651     name text,
 652     price numeric CHECK (price &gt; 0),
 653     discounted_price numeric,
 654     CHECK (discounted_price &gt; 0 AND price &gt; discounted_price)
 655 );
 656 </programlisting>
 657     It's a matter of taste.
 658    </para>
 659
 660    <para>
 661     Names can be assigned to table constraints in the same way as
 662     column constraints:
 663 <programlisting>
 664 CREATE TABLE products (
 665     product_no integer,
 666     name text,
 667     price numeric,
 668     CHECK (price &gt; 0),
 669     discounted_price numeric,
 670     CHECK (discounted_price &gt; 0),
 671     <emphasis>CONSTRAINT valid_discount</emphasis> CHECK (price &gt; discounted_price)
 672 );
 673 </programlisting>
 674    </para>
 675
 676    <indexterm>
 677     <primary>null value</primary>
 678     <secondary sortas="check constraints">with check constraints</secondary>
 679    </indexterm>
 680
 681    <para>
 682     It should be noted that a check constraint is satisfied if the
 683     check expression evaluates to true or the null value.  Since most
 684     expressions will evaluate to the null value if any operand is null,
 685     they will not prevent null values in the constrained columns.  To
 686     ensure that a column does not contain null values, the not-null
 687     constraint described in the next section can be used.
 688    </para>
 689
 690    <note>
 691     <para>
 692      <productname>PostgreSQL</productname> does not support
 693      <literal>CHECK</literal> constraints that reference table data other than
 694      the new or updated row being checked.  While a <literal>CHECK</literal>
 695      constraint that violates this rule may appear to work in simple
 696      tests, it cannot guarantee that the database will not reach a state
 697      in which the constraint condition is false (due to subsequent changes
 698      of the other row(s) involved).  This would cause a database dump and
 699      restore to fail.  The restore could fail even when the complete
 700      database state is consistent with the constraint, due to rows not
 701      being loaded in an order that will satisfy the constraint.  If
 702      possible, use <literal>UNIQUE</literal>, <literal>EXCLUDE</literal>,
 703      or <literal>FOREIGN KEY</literal> constraints to express
 704      cross-row and cross-table restrictions.
 705     </para>
 706
 707     <para>
 708      If what you desire is a one-time check against other rows at row
 709      insertion, rather than a continuously-maintained consistency
 710      guarantee, a custom <link linkend="triggers">trigger</link> can be used
 711      to implement that.  (This approach avoids the dump/restore problem because
 712      <application>pg_dump</application> does not reinstall triggers until after
 713      restoring data, so that the check will not be enforced during a
 714      dump/restore.)
 715     </para>
 716    </note>
 717
 718    <note>
 719     <para>
 720      <productname>PostgreSQL</productname> assumes that
 721      <literal>CHECK</literal> constraints' conditions are immutable, that
 722      is, they will always give the same result for the same input row.
 723      This assumption is what justifies examining <literal>CHECK</literal>
 724      constraints only when rows are inserted or updated, and not at other
 725      times.  (The warning above about not referencing other table data is
 726      really a special case of this restriction.)
 727     </para>
 728
 729     <para>
 730      An example of a common way to break this assumption is to reference a
 731      user-defined function in a <literal>CHECK</literal> expression, and
 732      then change the behavior of that
 733      function.  <productname>PostgreSQL</productname> does not disallow
 734      that, but it will not notice if there are rows in the table that now
 735      violate the <literal>CHECK</literal> constraint. That would cause a
 736      subsequent database dump and restore to fail.
 737      The recommended way to handle such a change is to drop the constraint
 738      (using <command>ALTER TABLE</command>), adjust the function definition,
 739      and re-add the constraint, thereby rechecking it against all table rows.
 740     </para>
 741    </note>
 742   </sect2>
 743
 744   <sect2 id="ddl-constraints-not-null">
 745    <title>Not-Null Constraints</title>
 746
 747    <indexterm>
 748     <primary>not-null constraint</primary>
 749    </indexterm>
 750
 751    <indexterm>
 752     <primary>constraint</primary>
 753     <secondary>NOT NULL</secondary>
 754    </indexterm>
 755
 756    <para>
 757     A not-null constraint simply specifies that a column must not
 758     assume the null value.  A syntax example:
 759 <programlisting>
 760 CREATE TABLE products (
 761     product_no integer <emphasis>NOT NULL</emphasis>,
 762     name text <emphasis>NOT NULL</emphasis>,
 763     price numeric
 764 );
 765 </programlisting>
 766     An explicit constraint name can also be specified, for example:
 767 <programlisting>
 768 CREATE TABLE products (
 769     product_no integer NOT NULL,
 770     name text <emphasis>CONSTRAINT products_name_not_null</emphasis> NOT NULL,
 771     price numeric
 772 );
 773 </programlisting>
 774    </para>
 775
 776    <para>
 777     A not-null constraint is usually written as a column constraint.  The
 778     syntax for writing it as a table constraint is
 779 <programlisting>
 780 CREATE TABLE products (
 781     product_no integer,
 782     name text,
 783     price numeric,
 784     <emphasis>NOT NULL product_no</emphasis>,
 785     <emphasis>NOT NULL name</emphasis>
 786 );
 787 </programlisting>
 788     But this syntax is not standard and mainly intended for use by
 789     <application>pg_dump</application>.
 790    </para>
 791
 792    <para>
 793     A not-null constraint is functionally equivalent to creating a check
 794     constraint <literal>CHECK (<replaceable>column_name</replaceable>
 795     IS NOT NULL)</literal>, but in
 796     <productname>PostgreSQL</productname> creating an explicit
 797     not-null constraint is more efficient.
 798    </para>
 799
 800    <para>
 801     Of course, a column can have more than one constraint.  Just write
 802     the constraints one after another:
 803 <programlisting>
 804 CREATE TABLE products (
 805     product_no integer NOT NULL,
 806     name text NOT NULL,
 807     price numeric NOT NULL CHECK (price &gt; 0)
 808 );
 809 </programlisting>
 810     The order doesn't matter.  It does not necessarily determine in which
 811     order the constraints are checked.
 812    </para>
 813
 814    <para>
 815     However, a column can have at most one explicit not-null constraint.
 816    </para>
 817
 818    <para>
 819     The <literal>NOT NULL</literal> constraint has an inverse: the
 820     <literal>NULL</literal> constraint.  This does not mean that the
 821     column must be null, which would surely be useless.  Instead, this
 822     simply selects the default behavior that the column might be null.
 823     The <literal>NULL</literal> constraint is not present in the SQL
 824     standard and should not be used in portable applications.  (It was
 825     only added to <productname>PostgreSQL</productname> to be
 826     compatible with some other database systems.)  Some users, however,
 827     like it because it makes it easy to toggle the constraint in a
 828     script file.  For example, you could start with:
 829 <programlisting>
 830 CREATE TABLE products (
 831     product_no integer NULL,
 832     name text NULL,
 833     price numeric NULL
 834 );
 835 </programlisting>
 836     and then insert the <literal>NOT</literal> key word where desired.
 837    </para>
 838
 839    <tip>
 840     <para>
 841      In most database designs the majority of columns should be marked
 842      not null.
 843     </para>
 844    </tip>
 845   </sect2>
 846
 847   <sect2 id="ddl-constraints-unique-constraints">
 848    <title>Unique Constraints</title>
 849
 850    <indexterm>
 851     <primary>unique constraint</primary>
 852    </indexterm>
 853
 854    <indexterm>
 855     <primary>constraint</primary>
 856     <secondary>unique</secondary>
 857    </indexterm>
 858
 859    <para>
 860     Unique constraints ensure that the data contained in a column, or a
 861     group of columns, is unique among all the rows in the
 862     table.  The syntax is:
 863 <programlisting>
 864 CREATE TABLE products (
 865     product_no integer <emphasis>UNIQUE</emphasis>,
 866     name text,
 867     price numeric
 868 );
 869 </programlisting>
 870     when written as a column constraint, and:
 871 <programlisting>
 872 CREATE TABLE products (
 873     product_no integer,
 874     name text,
 875     price numeric,
 876     <emphasis>UNIQUE (product_no)</emphasis>
 877 );
 878 </programlisting>
 879     when written as a table constraint.
 880    </para>
 881
 882    <para>
 883     To define a unique constraint for a group of columns, write it as a
 884     table constraint with the column names separated by commas:
 885 <programlisting>
 886 CREATE TABLE example (
 887     a integer,
 888     b integer,
 889     c integer,
 890     <emphasis>UNIQUE (a, c)</emphasis>
 891 );
 892 </programlisting>
 893     This specifies that the combination of values in the indicated columns
 894     is unique across the whole table, though any one of the columns
 895     need not be (and ordinarily isn't) unique.
 896    </para>
 897
 898    <para>
 899     You can assign your own name for a unique constraint, in the usual way:
 900 <programlisting>
 901 CREATE TABLE products (
 902     product_no integer <emphasis>CONSTRAINT must_be_different</emphasis> UNIQUE,
 903     name text,
 904     price numeric
 905 );
 906 </programlisting>
 907    </para>
 908
 909    <para>
 910     Adding a unique constraint will automatically create a unique B-tree
 911     index on the column or group of columns listed in the constraint.
 912     A uniqueness restriction covering only some rows cannot be written as
 913     a unique constraint, but it is possible to enforce such a restriction by
 914     creating a unique <link linkend="indexes-partial">partial index</link>.
 915    </para>
 916
 917    <indexterm>
 918     <primary>null value</primary>
 919     <secondary sortas="unique constraints">with unique constraints</secondary>
 920    </indexterm>
 921
 922    <para>
 923     In general, a unique constraint is violated if there is more than
 924     one row in the table where the values of all of the
 925     columns included in the constraint are equal.
 926     By default, two null values are not considered equal in this
 927     comparison.  That means even in the presence of a
 928     unique constraint it is possible to store duplicate
 929     rows that contain a null value in at least one of the constrained
 930     columns.  This behavior can be changed by adding the clause <literal>NULLS
 931     NOT DISTINCT</literal>, like
 932 <programlisting>
 933 CREATE TABLE products (
 934     product_no integer UNIQUE <emphasis>NULLS NOT DISTINCT</emphasis>,
 935     name text,
 936     price numeric
 937 );
 938 </programlisting>
 939     or
 940 <programlisting>
 941 CREATE TABLE products (
 942     product_no integer,
 943     name text,
 944     price numeric,
 945     UNIQUE <emphasis>NULLS NOT DISTINCT</emphasis> (product_no)
 946 );
 947 </programlisting>
 948     The default behavior can be specified explicitly using <literal>NULLS
 949     DISTINCT</literal>.  The default null treatment in unique constraints is
 950     implementation-defined according to the SQL standard, and other
 951     implementations have a different behavior.  So be careful when developing
 952     applications that are intended to be portable.
 953    </para>
 954   </sect2>
 955
 956   <sect2 id="ddl-constraints-primary-keys">
 957    <title>Primary Keys</title>
 958
 959    <indexterm>
 960     <primary>primary key</primary>
 961    </indexterm>
 962
 963    <indexterm>
 964     <primary>constraint</primary>
 965     <secondary>primary key</secondary>
 966    </indexterm>
 967
 968    <para>
 969     A primary key constraint indicates that a column, or group of columns,
 970     can be used as a unique identifier for rows in the table.  This
 971     requires that the values be both unique and not null.  So, the following
 972     two table definitions accept the same data:
 973 <programlisting>
 974 CREATE TABLE products (
 975     product_no integer UNIQUE NOT NULL,
 976     name text,
 977     price numeric
 978 );
 979 </programlisting>
 980
 981 <programlisting>
 982 CREATE TABLE products (
 983     product_no integer <emphasis>PRIMARY KEY</emphasis>,
 984     name text,
 985     price numeric
 986 );
 987 </programlisting>
 988    </para>
 989
 990    <para>
 991     Primary keys can span more than one column; the syntax
 992     is similar to unique constraints:
 993 <programlisting>
 994 CREATE TABLE example (
 995     a integer,
 996     b integer,
 997     c integer,
 998     <emphasis>PRIMARY KEY (a, c)</emphasis>
 999 );
1000 </programlisting>
1001    </para>
1002
1003    <para>
1004     Adding a primary key will automatically create a unique B-tree index
1005     on the column or group of columns listed in the primary key, and will
1006     force the column(s) to be marked <literal>NOT NULL</literal>.
1007    </para>
1008
1009    <para>
1010     A table can have at most one primary key.  (There can be any number
1011     of unique constraints, which combined with not-null constraints are functionally almost the
1012     same thing, but only one can be identified as the primary key.)
1013     Relational database theory
1014     dictates that every table must have a primary key.  This rule is
1015     not enforced by <productname>PostgreSQL</productname>, but it is
1016     usually best to follow it.
1017    </para>
1018
1019    <para>
1020     Primary keys are useful both for
1021     documentation purposes and for client applications.  For example,
1022     a GUI application that allows modifying row values probably needs
1023     to know the primary key of a table to be able to identify rows
1024     uniquely.  There are also various ways in which the database system
1025     makes use of a primary key if one has been declared; for example,
1026     the primary key defines the default target column(s) for foreign keys
1027     referencing its table.
1028    </para>
1029   </sect2>
1030
1031   <sect2 id="ddl-constraints-fk">
1032    <title>Foreign Keys</title>
1033
1034    <indexterm>
1035     <primary>foreign key</primary>
1036    </indexterm>
1037
1038    <indexterm>
1039     <primary>constraint</primary>
1040     <secondary>foreign key</secondary>
1041    </indexterm>
1042
1043    <indexterm>
1044     <primary>referential integrity</primary>
1045    </indexterm>
1046
1047    <para>
1048     A foreign key constraint specifies that the values in a column (or
1049     a group of columns) must match the values appearing in some row
1050     of another table.
1051     We say this maintains the <firstterm>referential
1052     integrity</firstterm> between two related tables.
1053    </para>
1054
1055    <para>
1056     Say you have the product table that we have used several times already:
1057 <programlisting>
1058 CREATE TABLE products (
1059     product_no integer PRIMARY KEY,
1060     name text,
1061     price numeric
1062 );
1063 </programlisting>
1064     Let's also assume you have a table storing orders of those
1065     products.  We want to ensure that the orders table only contains
1066     orders of products that actually exist.  So we define a foreign
1067     key constraint in the orders table that references the products
1068     table:
1069 <programlisting>
1070 CREATE TABLE orders (
1071     order_id integer PRIMARY KEY,
1072     product_no integer <emphasis>REFERENCES products (product_no)</emphasis>,
1073     quantity integer
1074 );
1075 </programlisting>
1076     Now it is impossible to create orders with non-NULL
1077     <structfield>product_no</structfield> entries that do not appear in the
1078     products table.
1079    </para>
1080
1081    <para>
1082     We say that in this situation the orders table is the
1083     <firstterm>referencing</firstterm> table and the products table is
1084     the <firstterm>referenced</firstterm> table.  Similarly, there are
1085     referencing and referenced columns.
1086    </para>
1087
1088    <para>
1089     You can also shorten the above command to:
1090 <programlisting>
1091 CREATE TABLE orders (
1092     order_id integer PRIMARY KEY,
1093     product_no integer <emphasis>REFERENCES products</emphasis>,
1094     quantity integer
1095 );
1096 </programlisting>
1097     because in absence of a column list the primary key of the
1098     referenced table is used as the referenced column(s).
1099    </para>
1100
1101    <para>
1102     You can assign your own name for a foreign key constraint,
1103     in the usual way.
1104    </para>
1105
1106    <para>
1107     A foreign key can also constrain and reference a group of columns.
1108     As usual, it then needs to be written in table constraint form.
1109     Here is a contrived syntax example:
1110 <programlisting>
1111 CREATE TABLE t1 (
1112   a integer PRIMARY KEY,
1113   b integer,
1114   c integer,
1115   <emphasis>FOREIGN KEY (b, c) REFERENCES other_table (c1, c2)</emphasis>
1116 );
1117 </programlisting>
1118     Of course, the number and type of the constrained columns need to
1119     match the number and type of the referenced columns.
1120    </para>
1121
1122    <indexterm>
1123     <primary>foreign key</primary>
1124     <secondary>self-referential</secondary>
1125    </indexterm>
1126
1127    <para>
1128     Sometimes it is useful for the <quote>other table</quote> of a
1129     foreign key constraint to be the same table; this is called
1130     a <firstterm>self-referential</firstterm> foreign key.  For
1131     example, if you want rows of a table to represent nodes of a tree
1132     structure, you could write
1133 <programlisting>
1134 CREATE TABLE tree (
1135     node_id integer PRIMARY KEY,
1136     parent_id integer REFERENCES tree,
1137     name text,
1138     ...
1139 );
1140 </programlisting>
1141     A top-level node would have NULL <structfield>parent_id</structfield>,
1142     while non-NULL <structfield>parent_id</structfield> entries would be
1143     constrained to reference valid rows of the table.
1144    </para>
1145
1146    <para>
1147     A table can have more than one foreign key constraint.  This is
1148     used to implement many-to-many relationships between tables.  Say
1149     you have tables about products and orders, but now you want to
1150     allow one order to contain possibly many products (which the
1151     structure above did not allow).  You could use this table structure:
1152 <programlisting>
1153 CREATE TABLE products (
1154     product_no integer PRIMARY KEY,
1155     name text,
1156     price numeric
1157 );
1158
1159 CREATE TABLE orders (
1160     order_id integer PRIMARY KEY,
1161     shipping_address text,
1162     ...
1163 );
1164
1165 CREATE TABLE order_items (
1166     product_no integer REFERENCES products,
1167     order_id integer REFERENCES orders,
1168     quantity integer,
1169     PRIMARY KEY (product_no, order_id)
1170 );
1171 </programlisting>
1172     Notice that the primary key overlaps with the foreign keys in
1173     the last table.
1174    </para>
1175
1176    <indexterm>
1177     <primary>CASCADE</primary>
1178     <secondary>foreign key action</secondary>
1179    </indexterm>
1180
1181    <indexterm>
1182     <primary>RESTRICT</primary>
1183     <secondary>foreign key action</secondary>
1184    </indexterm>
1185
1186    <para>
1187     We know that the foreign keys disallow creation of orders that
1188     do not relate to any products.  But what if a product is removed
1189     after an order is created that references it?  SQL allows you to
1190     handle that as well.  Intuitively, we have a few options:
1191     <itemizedlist spacing="compact">
1192      <listitem><para>Disallow deleting a referenced product</para></listitem>
1193      <listitem><para>Delete the orders as well</para></listitem>
1194      <listitem><para>Something else?</para></listitem>
1195     </itemizedlist>
1196    </para>
1197
1198    <para>
1199     To illustrate this, let's implement the following policy on the
1200     many-to-many relationship example above: when someone wants to
1201     remove a product that is still referenced by an order (via
1202     <literal>order_items</literal>), we disallow it.  If someone
1203     removes an order, the order items are removed as well:
1204 <programlisting>
1205 CREATE TABLE products (
1206     product_no integer PRIMARY KEY,
1207     name text,
1208     price numeric
1209 );
1210
1211 CREATE TABLE orders (
1212     order_id integer PRIMARY KEY,
1213     shipping_address text,
1214     ...
1215 );
1216
1217 CREATE TABLE order_items (
1218     product_no integer REFERENCES products <emphasis>ON DELETE RESTRICT</emphasis>,
1219     order_id integer REFERENCES orders <emphasis>ON DELETE CASCADE</emphasis>,
1220     quantity integer,
1221     PRIMARY KEY (product_no, order_id)
1222 );
1223 </programlisting>
1224    </para>
1225
1226    <para>
1227     Restricting and cascading deletes are the two most common options.
1228     <literal>RESTRICT</literal> prevents deletion of a
1229     referenced row. <literal>NO ACTION</literal> means that if any
1230     referencing rows still exist when the constraint is checked, an error
1231     is raised; this is the default behavior if you do not specify anything.
1232     (The essential difference between these two choices is that
1233     <literal>NO ACTION</literal> allows the check to be deferred until
1234     later in the transaction, whereas <literal>RESTRICT</literal> does not.)
1235     <literal>CASCADE</literal> specifies that when a referenced row is deleted,
1236     row(s) referencing it should be automatically deleted as well.
1237     There are two other options:
1238     <literal>SET NULL</literal> and <literal>SET DEFAULT</literal>.
1239     These cause the referencing column(s) in the referencing row(s)
1240     to be set to nulls or their default
1241     values, respectively, when the referenced row is deleted.
1242     Note that these do not excuse you from observing any constraints.
1243     For example, if an action specifies <literal>SET DEFAULT</literal>
1244     but the default value would not satisfy the foreign key constraint, the
1245     operation will fail.
1246    </para>
1247
1248    <para>
1249     The appropriate choice of <literal>ON DELETE</literal> action depends on
1250     what kinds of objects the related tables represent.  When the referencing
1251     table represents something that is a component of what is represented by
1252     the referenced table and cannot exist independently, then
1253     <literal>CASCADE</literal> could be appropriate.  If the two tables
1254     represent independent objects, then <literal>RESTRICT</literal> or
1255     <literal>NO ACTION</literal> is more appropriate; an application that
1256     actually wants to delete both objects would then have to be explicit about
1257     this and run two delete commands.  In the above example, order items are
1258     part of an order, and it is convenient if they are deleted automatically
1259     if an order is deleted.  But products and orders are different things, and
1260     so making a deletion of a product automatically cause the deletion of some
1261     order items could be considered problematic.  The actions <literal>SET
1262     NULL</literal> or <literal>SET DEFAULT</literal> can be appropriate if a
1263     foreign-key relationship represents optional information.  For example, if
1264     the products table contained a reference to a product manager, and the
1265     product manager entry gets deleted, then setting the product's product
1266     manager to null or a default might be useful.
1267    </para>
1268
1269    <para>
1270     The actions <literal>SET NULL</literal> and <literal>SET DEFAULT</literal>
1271     can take a column list to specify which columns to set.  Normally, all
1272     columns of the foreign-key constraint are set; setting only a subset is
1273     useful in some special cases.  Consider the following example:
1274 <programlisting>
1275 CREATE TABLE tenants (
1276     tenant_id integer PRIMARY KEY
1277 );
1278
1279 CREATE TABLE users (
1280     tenant_id integer REFERENCES tenants ON DELETE CASCADE,
1281     user_id integer NOT NULL,
1282     PRIMARY KEY (tenant_id, user_id)
1283 );
1284
1285 CREATE TABLE posts (
1286     tenant_id integer REFERENCES tenants ON DELETE CASCADE,
1287     post_id integer NOT NULL,
1288     author_id integer,
1289     PRIMARY KEY (tenant_id, post_id),
1290     FOREIGN KEY (tenant_id, author_id) REFERENCES users ON DELETE SET NULL <emphasis>(author_id)</emphasis>
1291 );
1292 </programlisting>
1293     Without the specification of the column, the foreign key would also set
1294     the column <literal>tenant_id</literal> to null, but that column is still
1295     required as part of the primary key.
1296    </para>
1297
1298    <para>
1299     Analogous to <literal>ON DELETE</literal> there is also
1300     <literal>ON UPDATE</literal> which is invoked when a referenced
1301     column is changed (updated).  The possible actions are the same,
1302     except that column lists cannot be specified for <literal>SET
1303     NULL</literal> and <literal>SET DEFAULT</literal>.
1304     In this case, <literal>CASCADE</literal> means that the updated values of the
1305     referenced column(s) should be copied into the referencing row(s).
1306    </para>
1307
1308    <para>
1309     Normally, a referencing row need not satisfy the foreign key constraint
1310     if any of its referencing columns are null.  If <literal>MATCH FULL</literal>
1311     is added to the foreign key declaration, a referencing row escapes
1312     satisfying the constraint only if all its referencing columns are null
1313     (so a mix of null and non-null values is guaranteed to fail a
1314     <literal>MATCH FULL</literal> constraint).  If you don't want referencing rows
1315     to be able to avoid satisfying the foreign key constraint, declare the
1316     referencing column(s) as <literal>NOT NULL</literal>.
1317    </para>
1318
1319    <para>
1320     A foreign key must reference columns that either are a primary key or
1321     form a unique constraint, or are columns from a non-partial unique index.
1322     This means that the referenced columns always have an index to allow
1323     efficient lookups on whether a referencing row has a match.  Since a
1324     <command>DELETE</command> of a row from the referenced table or an
1325     <command>UPDATE</command> of a referenced column will require a scan of
1326     the referencing table for rows matching the old value, it is often a good
1327     idea to index the referencing columns too.  Because this is not always
1328     needed, and there are many choices available on how to index, the
1329     declaration of a foreign key constraint does not automatically create an
1330     index on the referencing columns.
1331    </para>
1332
1333    <para>
1334     More information about updating and deleting data is in <xref
1335     linkend="dml"/>.  Also see the description of foreign key constraint
1336     syntax in the reference documentation for
1337     <xref linkend="sql-createtable"/>.
1338    </para>
1339   </sect2>
1340
1341   <sect2 id="ddl-constraints-exclusion">
1342    <title>Exclusion Constraints</title>
1343
1344    <indexterm>
1345     <primary>exclusion constraint</primary>
1346    </indexterm>
1347
1348    <indexterm>
1349     <primary>constraint</primary>
1350     <secondary>exclusion</secondary>
1351    </indexterm>
1352
1353    <para>
1354     Exclusion constraints ensure that if any two rows are compared on
1355     the specified columns or expressions using the specified operators,
1356     at least one of these operator comparisons will return false or null.
1357     The syntax is:
1358 <programlisting>
1359 CREATE TABLE circles (
1360     c circle,
1361     EXCLUDE USING gist (c WITH &amp;&amp;)
1362 );
1363 </programlisting>
1364    </para>
1365
1366    <para>
1367     See also <link linkend="sql-createtable-exclude"><command>CREATE
1368     TABLE ... CONSTRAINT ... EXCLUDE</command></link> for details.
1369    </para>
1370
1371    <para>
1372     Adding an exclusion constraint will automatically create an index
1373     of the type specified in the constraint declaration.
1374    </para>
1375   </sect2>
1376  </sect1>
1377
1378  <sect1 id="ddl-system-columns">
1379   <title>System Columns</title>
1380
1381   <para>
1382    Every table has several <firstterm>system columns</firstterm> that are
1383    implicitly defined by the system.  Therefore, these names cannot be
1384    used as names of user-defined columns.  (Note that these
1385    restrictions are separate from whether the name is a key word or
1386    not; quoting a name will not allow you to escape these
1387    restrictions.)  You do not really need to be concerned about these
1388    columns; just know they exist.
1389   </para>
1390
1391   <indexterm>
1392    <primary>column</primary>
1393    <secondary>system column</secondary>
1394   </indexterm>
1395
1396   <variablelist>
1397    <varlistentry id="ddl-system-columns-tableoid">
1398     <term><structfield>tableoid</structfield></term>
1399     <listitem>
1400      <indexterm>
1401       <primary>tableoid</primary>
1402      </indexterm>
1403
1404      <para>
1405       The OID of the table containing this row.  This column is
1406       particularly handy for queries that select from partitioned
1407       tables (see <xref linkend="ddl-partitioning"/>) or inheritance
1408       hierarchies (see <xref linkend="ddl-inherit"/>), since without it,
1409       it's difficult to tell which individual table a row came from.  The
1410       <structfield>tableoid</structfield> can be joined against the
1411       <structfield>oid</structfield> column of
1412       <structname>pg_class</structname> to obtain the table name.
1413      </para>
1414     </listitem>
1415    </varlistentry>
1416
1417    <varlistentry id="ddl-system-columns-xmin">
1418     <term><structfield>xmin</structfield></term>
1419     <listitem>
1420      <indexterm>
1421       <primary>xmin</primary>
1422      </indexterm>
1423
1424      <para>
1425       The identity (transaction ID) of the inserting transaction for
1426       this row version.  (A row version is an individual state of a
1427       row; each update of a row creates a new row version for the same
1428       logical row.)
1429      </para>
1430     </listitem>
1431    </varlistentry>
1432
1433    <varlistentry id="ddl-system-columns-cmin">
1434     <term><structfield>cmin</structfield></term>
1435     <listitem>
1436      <indexterm>
1437       <primary>cmin</primary>
1438      </indexterm>
1439
1440      <para>
1441       The command identifier (starting at zero) within the inserting
1442       transaction.
1443      </para>
1444     </listitem>
1445    </varlistentry>
1446
1447    <varlistentry id="ddl-system-columns-xmax">
1448     <term><structfield>xmax</structfield></term>
1449     <listitem>
1450      <indexterm>
1451       <primary>xmax</primary>
1452      </indexterm>
1453
1454      <para>
1455       The identity (transaction ID) of the deleting transaction, or
1456       zero for an undeleted row version.  It is possible for this column to
1457       be nonzero in a visible row version. That usually indicates that the
1458       deleting transaction hasn't committed yet, or that an attempted
1459       deletion was rolled back.
1460      </para>
1461     </listitem>
1462    </varlistentry>
1463
1464    <varlistentry id="ddl-system-columns-cmax">
1465     <term><structfield>cmax</structfield></term>
1466     <listitem>
1467      <indexterm>
1468       <primary>cmax</primary>
1469      </indexterm>
1470
1471      <para>
1472       The command identifier within the deleting transaction, or zero.
1473      </para>
1474     </listitem>
1475    </varlistentry>
1476
1477    <varlistentry id="ddl-system-columns-ctid">
1478     <term><structfield>ctid</structfield></term>
1479     <listitem>
1480      <indexterm>
1481       <primary>ctid</primary>
1482      </indexterm>
1483
1484      <para>
1485       The physical location of the row version within its table.  Note that
1486       although the <structfield>ctid</structfield> can be used to
1487       locate the row version very quickly, a row's
1488       <structfield>ctid</structfield> will change if it is
1489       updated or moved by <command>VACUUM FULL</command>.  Therefore
1490       <structfield>ctid</structfield> is useless as a long-term row
1491       identifier.  A primary key should be used to identify logical rows.
1492      </para>
1493     </listitem>
1494    </varlistentry>
1495   </variablelist>
1496
1497    <para>
1498     Transaction identifiers are also 32-bit quantities.  In a
1499     long-lived database it is possible for transaction IDs to wrap
1500     around.  This is not a fatal problem given appropriate maintenance
1501     procedures; see <xref linkend="maintenance"/> for details.  It is
1502     unwise, however, to depend on the uniqueness of transaction IDs
1503     over the long term (more than one billion transactions).
1504    </para>
1505
1506    <para>
1507     Command identifiers are also 32-bit quantities.  This creates a hard limit
1508     of 2<superscript>32</superscript> (4 billion) <acronym>SQL</acronym> commands
1509     within a single transaction.  In practice this limit is not a
1510     problem &mdash; note that the limit is on the number of
1511     <acronym>SQL</acronym> commands, not the number of rows processed.
1512     Also, only commands that actually modify the database contents will
1513     consume a command identifier.
1514    </para>
1515  </sect1>
1516
1517  <sect1 id="ddl-alter">
1518   <title>Modifying Tables</title>
1519
1520   <indexterm zone="ddl-alter">
1521    <primary>table</primary>
1522    <secondary>modifying</secondary>
1523   </indexterm>
1524
1525   <para>
1526    When you create a table and you realize that you made a mistake, or
1527    the requirements of the application change, you can drop the
1528    table and create it again.  But this is not a convenient option if
1529    the table is already filled with data, or if the table is
1530    referenced by other database objects (for instance a foreign key
1531    constraint).  Therefore <productname>PostgreSQL</productname>
1532    provides a family of commands to make modifications to existing
1533    tables.  Note that this is conceptually distinct from altering
1534    the data contained in the table: here we are interested in altering
1535    the definition, or structure, of the table.
1536   </para>
1537
1538   <para>
1539    You can:
1540    <itemizedlist spacing="compact">
1541     <listitem>
1542      <para>Add columns</para>
1543     </listitem>
1544     <listitem>
1545      <para>Remove columns</para>
1546     </listitem>
1547     <listitem>
1548      <para>Add constraints</para>
1549     </listitem>
1550     <listitem>
1551      <para>Remove constraints</para>
1552     </listitem>
1553     <listitem>
1554      <para>Change default values</para>
1555     </listitem>
1556     <listitem>
1557      <para>Change column data types</para>
1558     </listitem>
1559     <listitem>
1560      <para>Rename columns</para>
1561     </listitem>
1562     <listitem>
1563      <para>Rename tables</para>
1564     </listitem>
1565    </itemizedlist>
1566
1567    All these actions are performed using the
1568    <xref linkend="sql-altertable"/>
1569    command, whose reference page contains details beyond those given
1570    here.
1571   </para>
1572
1573   <sect2 id="ddl-alter-adding-a-column">
1574    <title>Adding a Column</title>
1575
1576    <indexterm>
1577     <primary>column</primary>
1578     <secondary>adding</secondary>
1579    </indexterm>
1580
1581    <para>
1582     To add a column, use a command like:
1583 <programlisting>
1584 ALTER TABLE products ADD COLUMN description text;
1585 </programlisting>
1586     The new column is initially filled with whatever default
1587     value is given (null if you don't specify a <literal>DEFAULT</literal> clause).
1588    </para>
1589
1590    <tip>
1591     <para>
1592      From <productname>PostgreSQL</productname> 11, adding a column with
1593      a constant default value no longer means that each row of the table
1594      needs to be updated when the <command>ALTER TABLE</command> statement
1595      is executed. Instead, the default value will be returned the next time
1596      the row is accessed, and applied when the table is rewritten, making
1597      the <command>ALTER TABLE</command> very fast even on large tables.
1598     </para>
1599
1600     <para>
1601      However, if the default value is volatile (e.g.,
1602      <function>clock_timestamp()</function>)
1603      each row will need to be updated with the value calculated at the time
1604      <command>ALTER TABLE</command> is executed. To avoid a potentially
1605      lengthy update operation, particularly if you intend to fill the column
1606      with mostly nondefault values anyway, it may be preferable to add the
1607      column with no default, insert the correct values using
1608      <command>UPDATE</command>, and then add any desired default as described
1609      below.
1610     </para>
1611    </tip>
1612
1613    <para>
1614     You can also define constraints on the column at the same time,
1615     using the usual syntax:
1616 <programlisting>
1617 ALTER TABLE products ADD COLUMN description text CHECK (description &lt;&gt; '');
1618 </programlisting>
1619     In fact all the options that can be applied to a column description
1620     in <command>CREATE TABLE</command> can be used here.  Keep in mind however
1621     that the default value must satisfy the given constraints, or the
1622     <literal>ADD</literal> will fail.  Alternatively, you can add
1623     constraints later (see below) after you've filled in the new column
1624     correctly.
1625    </para>
1626
1627   </sect2>
1628
1629   <sect2 id="ddl-alter-removing-a-column">
1630    <title>Removing a Column</title>
1631
1632    <indexterm>
1633     <primary>column</primary>
1634     <secondary>removing</secondary>
1635    </indexterm>
1636
1637    <para>
1638     To remove a column, use a command like:
1639 <programlisting>
1640 ALTER TABLE products DROP COLUMN description;
1641 </programlisting>
1642     Whatever data was in the column disappears.  Table constraints involving
1643     the column are dropped, too.  However, if the column is referenced by a
1644     foreign key constraint of another table,
1645     <productname>PostgreSQL</productname> will not silently drop that
1646     constraint.  You can authorize dropping everything that depends on
1647     the column by adding <literal>CASCADE</literal>:
1648 <programlisting>
1649 ALTER TABLE products DROP COLUMN description CASCADE;
1650 </programlisting>
1651     See <xref linkend="ddl-depend"/> for a description of the general
1652     mechanism behind this.
1653    </para>
1654   </sect2>
1655
1656   <sect2 id="ddl-alter-adding-a-constraint">
1657    <title>Adding a Constraint</title>
1658
1659    <indexterm>
1660     <primary>constraint</primary>
1661     <secondary>adding</secondary>
1662    </indexterm>
1663
1664    <para>
1665     To add a constraint, the table constraint syntax is used.  For example:
1666 <programlisting>
1667 ALTER TABLE products ADD CHECK (name &lt;&gt; '');
1668 ALTER TABLE products ADD CONSTRAINT some_name UNIQUE (product_no);
1669 ALTER TABLE products ADD FOREIGN KEY (product_group_id) REFERENCES product_groups;
1670 </programlisting>
1671    </para>
1672
1673    <para>
1674     To add a not-null constraint, which is normally not written as a table
1675     constraint, this special syntax is available:
1676 <programlisting>
1677 ALTER TABLE products ALTER COLUMN product_no SET NOT NULL;
1678 </programlisting>
1679     This command silently does nothing if the column already has a
1680     not-null constraint.
1681    </para>
1682
1683    <para>
1684     The constraint will be checked immediately, so the table data must
1685     satisfy the constraint before it can be added.
1686    </para>
1687   </sect2>
1688
1689   <sect2 id="ddl-alter-removing-a-constraint">
1690    <title>Removing a Constraint</title>
1691
1692    <indexterm>
1693     <primary>constraint</primary>
1694     <secondary>removing</secondary>
1695    </indexterm>
1696
1697    <para>
1698     To remove a constraint you need to know its name.  If you gave it
1699     a name then that's easy.  Otherwise the system assigned a
1700     generated name, which you need to find out.  The
1701     <application>psql</application> command <literal>\d
1702     <replaceable>tablename</replaceable></literal> can be helpful
1703     here; other interfaces might also provide a way to inspect table
1704     details.  Then the command is:
1705 <programlisting>
1706 ALTER TABLE products DROP CONSTRAINT some_name;
1707 </programlisting>
1708     (If you are dealing with a generated constraint name like <literal>$2</literal>,
1709     don't forget that you'll need to double-quote it to make it a valid
1710     identifier.)
1711    </para>
1712
1713    <para>
1714     As with dropping a column, you need to add <literal>CASCADE</literal> if you
1715     want to drop a constraint that something else depends on.  An example
1716     is that a foreign key constraint depends on a unique or primary key
1717     constraint on the referenced column(s).
1718    </para>
1719
1720    <para>
1721     Simplified syntax is available to drop a not-null constraint:
1722 <programlisting>
1723 ALTER TABLE products ALTER COLUMN product_no DROP NOT NULL;
1724 </programlisting>
1725     This mirrors the <literal>SET NOT NULL</literal> syntax for adding a
1726     not-null constraint.  This command will silently do nothing if the column
1727     does not have a not-null constraint.  (Recall that a column can have at
1728     most one not-null constraint, so it is never ambiguous which constraint
1729     this command acts on.)
1730    </para>
1731   </sect2>
1732
1733   <sect2 id="ddl-alter-column-default">
1734    <title>Changing a Column's Default Value</title>
1735
1736    <indexterm>
1737     <primary>default value</primary>
1738     <secondary>changing</secondary>
1739    </indexterm>
1740
1741    <para>
1742     To set a new default for a column, use a command like:
1743 <programlisting>
1744 ALTER TABLE products ALTER COLUMN price SET DEFAULT 7.77;
1745 </programlisting>
1746     Note that this doesn't affect any existing rows in the table, it
1747     just changes the default for future <command>INSERT</command> commands.
1748    </para>
1749
1750    <para>
1751     To remove any default value, use:
1752 <programlisting>
1753 ALTER TABLE products ALTER COLUMN price DROP DEFAULT;
1754 </programlisting>
1755     This is effectively the same as setting the default to null.
1756     As a consequence, it is not an error
1757     to drop a default where one hadn't been defined, because the
1758     default is implicitly the null value.
1759    </para>
1760   </sect2>
1761
1762   <sect2 id="ddl-alter-column-type">
1763    <title>Changing a Column's Data Type</title>
1764
1765    <indexterm>
1766     <primary>column data type</primary>
1767     <secondary>changing</secondary>
1768    </indexterm>
1769
1770    <para>
1771     To convert a column to a different data type, use a command like:
1772 <programlisting>
1773 ALTER TABLE products ALTER COLUMN price TYPE numeric(10,2);
1774 </programlisting>
1775     This will succeed only if each existing entry in the column can be
1776     converted to the new type by an implicit cast.  If a more complex
1777     conversion is needed, you can add a <literal>USING</literal> clause that
1778     specifies how to compute the new values from the old.
1779    </para>
1780
1781    <para>
1782     <productname>PostgreSQL</productname> will attempt to convert the column's
1783     default value (if any) to the new type, as well as any constraints
1784     that involve the column.  But these conversions might fail, or might
1785     produce surprising results.  It's often best to drop any constraints
1786     on the column before altering its type, and then add back suitably
1787     modified constraints afterwards.
1788    </para>
1789   </sect2>
1790
1791   <sect2 id="ddl-alter-renaming-column">
1792    <title>Renaming a Column</title>
1793
1794    <indexterm>
1795     <primary>column</primary>
1796     <secondary>renaming</secondary>
1797    </indexterm>
1798
1799    <para>
1800     To rename a column:
1801 <programlisting>
1802 ALTER TABLE products RENAME COLUMN product_no TO product_number;
1803 </programlisting>
1804    </para>
1805   </sect2>
1806
1807   <sect2 id="ddl-alter-renaming-table">
1808    <title>Renaming a Table</title>
1809
1810    <indexterm>
1811     <primary>table</primary>
1812     <secondary>renaming</secondary>
1813    </indexterm>
1814
1815    <para>
1816     To rename a table:
1817 <programlisting>
1818 ALTER TABLE products RENAME TO items;
1819 </programlisting>
1820    </para>
1821   </sect2>
1822  </sect1>
1823
1824  <sect1 id="ddl-priv">
1825   <title>Privileges</title>
1826
1827   <indexterm zone="ddl-priv">
1828    <primary>privilege</primary>
1829   </indexterm>
1830
1831   <indexterm>
1832    <primary>permission</primary>
1833    <see>privilege</see>
1834   </indexterm>
1835
1836   <indexterm zone="ddl-priv">
1837    <primary>owner</primary>
1838   </indexterm>
1839
1840   <indexterm zone="ddl-priv">
1841    <primary>GRANT</primary>
1842   </indexterm>
1843
1844   <indexterm zone="ddl-priv">
1845    <primary>REVOKE</primary>
1846   </indexterm>
1847
1848   <indexterm zone="ddl-priv">
1849    <primary>ACL</primary>
1850   </indexterm>
1851
1852   <indexterm zone="ddl-priv-default">
1853    <primary>privilege</primary>
1854    <secondary>default</secondary>
1855   </indexterm>
1856
1857   <para>
1858    When an object is created, it is assigned an owner. The
1859    owner is normally the role that executed the creation statement.
1860    For most kinds of objects, the initial state is that only the owner
1861    (or a superuser) can do anything with the object. To allow
1862    other roles to use it, <firstterm>privileges</firstterm> must be
1863    granted.
1864   </para>
1865
1866   <para>
1867    There are different kinds of privileges: <literal>SELECT</literal>,
1868    <literal>INSERT</literal>, <literal>UPDATE</literal>, <literal>DELETE</literal>,
1869    <literal>TRUNCATE</literal>, <literal>REFERENCES</literal>, <literal>TRIGGER</literal>,
1870    <literal>CREATE</literal>, <literal>CONNECT</literal>, <literal>TEMPORARY</literal>,
1871    <literal>EXECUTE</literal>, <literal>USAGE</literal>, <literal>SET</literal>,
1872    <literal>ALTER SYSTEM</literal>, and <literal>MAINTAIN</literal>.
1873    The privileges applicable to a particular
1874    object vary depending on the object's type (table, function, etc.).
1875    More detail about the meanings of these privileges appears below.
1876    The following sections and chapters will also show you how
1877    these privileges are used.
1878   </para>
1879
1880   <para>
1881    The right to modify or destroy an object is inherent in being the
1882    object's owner, and cannot be granted or revoked in itself.
1883    (However, like all privileges, that right can be inherited by
1884    members of the owning role; see <xref linkend="role-membership"/>.)
1885   </para>
1886
1887   <para>
1888    An object can be assigned to a new owner with an <command>ALTER</command>
1889    command of the appropriate kind for the object, for example
1890 <programlisting>
1891 ALTER TABLE <replaceable>table_name</replaceable> OWNER TO <replaceable>new_owner</replaceable>;
1892 </programlisting>
1893    Superusers can always do this; ordinary roles can only do it if they are
1894    both the current owner of the object (or inherit the privileges of the
1895    owning role) and able to <literal>SET ROLE</literal> to the new owning role.
1896   </para>
1897
1898   <para>
1899    To assign privileges, the <xref linkend="sql-grant"/> command is
1900    used. For example, if <literal>joe</literal> is an existing role, and
1901    <literal>accounts</literal> is an existing table, the privilege to
1902    update the table can be granted with:
1903 <programlisting>
1904 GRANT UPDATE ON accounts TO joe;
1905 </programlisting>
1906    Writing <literal>ALL</literal> in place of a specific privilege grants all
1907    privileges that are relevant for the object type.
1908   </para>
1909
1910   <para>
1911    The special <quote>role</quote> name <literal>PUBLIC</literal> can
1912    be used to grant a privilege to every role on the system.  Also,
1913    <quote>group</quote> roles can be set up to help manage privileges when
1914    there are many users of a database &mdash; for details see
1915    <xref linkend="user-manag"/>.
1916   </para>
1917
1918   <para>
1919    To revoke a previously-granted privilege, use the fittingly named
1920    <xref linkend="sql-revoke"/> command:
1921 <programlisting>
1922 REVOKE ALL ON accounts FROM PUBLIC;
1923 </programlisting>
1924   </para>
1925
1926   <para>
1927    Ordinarily, only the object's owner (or a superuser) can grant or
1928    revoke privileges on an object.  However, it is possible to grant a
1929    privilege <quote>with grant option</quote>, which gives the recipient
1930    the right to grant it in turn to others.  If the grant option is
1931    subsequently revoked then all who received the privilege from that
1932    recipient (directly or through a chain of grants) will lose the
1933    privilege.  For details see the <xref linkend="sql-grant"/> and
1934    <xref linkend="sql-revoke"/> reference pages.
1935   </para>
1936
1937   <para>
1938    An object's owner can choose to revoke their own ordinary privileges,
1939    for example to make a table read-only for themselves as well as others.
1940    But owners are always treated as holding all grant options, so they
1941    can always re-grant their own privileges.
1942   </para>
1943
1944   <para>
1945    The available privileges are:
1946
1947    <variablelist>
1948     <varlistentry id="ddl-priv-select">
1949      <term><literal>SELECT</literal></term>
1950      <listitem>
1951       <para>
1952        Allows <command>SELECT</command> from
1953        any column, or specific column(s), of a table, view, materialized
1954        view, or other table-like object.
1955        Also allows use of <command>COPY TO</command>.
1956        This privilege is also needed to reference existing column values in
1957        <command>UPDATE</command>, <command>DELETE</command>,
1958        or <command>MERGE</command>.
1959        For sequences, this privilege also allows use of the
1960        <function>currval</function> function.
1961        For large objects, this privilege allows the object to be read.
1962       </para>
1963      </listitem>
1964     </varlistentry>
1965
1966     <varlistentry id="ddl-priv-insert">
1967      <term><literal>INSERT</literal></term>
1968      <listitem>
1969       <para>
1970        Allows <command>INSERT</command> of a new row into a table, view,
1971        etc.  Can be granted on specific column(s), in which case
1972        only those columns may be assigned to in the <command>INSERT</command>
1973        command (other columns will therefore receive default values).
1974        Also allows use of <command>COPY FROM</command>.
1975       </para>
1976      </listitem>
1977     </varlistentry>
1978
1979     <varlistentry id="ddl-priv-update">
1980      <term><literal>UPDATE</literal></term>
1981      <listitem>
1982       <para>
1983        Allows <command>UPDATE</command> of any
1984        column, or specific column(s), of a table, view, etc.
1985        (In practice, any nontrivial <command>UPDATE</command> command will
1986        require <literal>SELECT</literal> privilege as well, since it must
1987        reference table columns to determine which rows to update, and/or to
1988        compute new values for columns.)
1989        <literal>SELECT ... FOR UPDATE</literal>
1990        and <literal>SELECT ... FOR SHARE</literal>
1991        also require this privilege on at least one column, in addition to the
1992        <literal>SELECT</literal> privilege.  For sequences, this
1993        privilege allows use of the <function>nextval</function> and
1994        <function>setval</function> functions.
1995        For large objects, this privilege allows writing or truncating the
1996        object.
1997       </para>
1998      </listitem>
1999     </varlistentry>
2000
2001     <varlistentry id="ddl-priv-delete">
2002      <term><literal>DELETE</literal></term>
2003      <listitem>
2004       <para>
2005        Allows <command>DELETE</command> of a row from a table, view, etc.
2006        (In practice, any nontrivial <command>DELETE</command> command will
2007        require <literal>SELECT</literal> privilege as well, since it must
2008        reference table columns to determine which rows to delete.)
2009       </para>
2010      </listitem>
2011     </varlistentry>
2012
2013     <varlistentry id="ddl-priv-truncate">
2014      <term><literal>TRUNCATE</literal></term>
2015      <listitem>
2016       <para>
2017        Allows <command>TRUNCATE</command> on a table.
2018       </para>
2019      </listitem>
2020     </varlistentry>
2021
2022     <varlistentry id="ddl-priv-references">
2023      <term><literal>REFERENCES</literal></term>
2024      <listitem>
2025       <para>
2026        Allows creation of a foreign key constraint referencing a
2027        table, or specific column(s) of a table.
2028       </para>
2029      </listitem>
2030     </varlistentry>
2031
2032     <varlistentry id="ddl-priv-trigger">
2033      <term><literal>TRIGGER</literal></term>
2034      <listitem>
2035       <para>
2036        Allows creation of a trigger on a table, view, etc.
2037       </para>
2038      </listitem>
2039     </varlistentry>
2040
2041     <varlistentry id="ddl-priv-create">
2042      <term><literal>CREATE</literal></term>
2043      <listitem>
2044       <para>
2045        For databases, allows new schemas and publications to be created within
2046        the database, and allows trusted extensions to be installed within
2047        the database.
2048       </para>
2049       <para>
2050        For schemas, allows new objects to be created within the schema.
2051        To rename an existing object, you must own the
2052        object <emphasis>and</emphasis> have this privilege for the containing
2053        schema.
2054       </para>
2055       <para>
2056        For tablespaces, allows tables, indexes, and temporary files to be
2057        created within the tablespace, and allows databases to be created that
2058        have the tablespace as their default tablespace.
2059       </para>
2060       <para>
2061        Note that revoking this privilege will not alter the existence or
2062        location of existing objects.
2063       </para>
2064      </listitem>
2065     </varlistentry>
2066
2067     <varlistentry id="ddl-priv-connect">
2068      <term><literal>CONNECT</literal></term>
2069      <listitem>
2070       <para>
2071        Allows the grantee to connect to the database.  This
2072        privilege is checked at connection startup (in addition to checking
2073        any restrictions imposed by <filename>pg_hba.conf</filename>).
2074       </para>
2075      </listitem>
2076     </varlistentry>
2077
2078     <varlistentry id="ddl-priv-temporary">
2079      <term><literal>TEMPORARY</literal></term>
2080      <listitem>
2081       <para>
2082        Allows temporary tables to be created while using the database.
2083       </para>
2084      </listitem>
2085     </varlistentry>
2086
2087     <varlistentry id="ddl-priv-execute">
2088      <term><literal>EXECUTE</literal></term>
2089      <listitem>
2090       <para>
2091        Allows calling a function or procedure, including use of
2092        any operators that are implemented on top of the function.  This is the
2093        only type of privilege that is applicable to functions and procedures.
2094       </para>
2095      </listitem>
2096     </varlistentry>
2097
2098     <varlistentry id="ddl-priv-usage">
2099      <term><literal>USAGE</literal></term>
2100      <listitem>
2101       <para>
2102        For procedural languages, allows use of the language for
2103        the creation of functions in that language.  This is the only type
2104        of privilege that is applicable to procedural languages.
2105       </para>
2106       <para>
2107        For schemas, allows access to objects contained in the
2108        schema (assuming that the objects' own privilege requirements are
2109        also met).  Essentially this allows the grantee to <quote>look up</quote>
2110        objects within the schema.  Without this permission, it is still
2111        possible to see the object names, e.g., by querying system catalogs.
2112        Also, after revoking this permission, existing sessions might have
2113        statements that have previously performed this lookup, so this is not
2114        a completely secure way to prevent object access.
2115       </para>
2116       <para>
2117        For sequences, allows use of the
2118        <function>currval</function> and <function>nextval</function> functions.
2119       </para>
2120       <para>
2121        For types and domains, allows use of the type or domain in the
2122        creation of tables, functions, and other schema objects.  (Note that
2123        this privilege does not control all <quote>usage</quote> of the
2124        type, such as values of the type appearing in queries.  It only
2125        prevents objects from being created that depend on the type.  The
2126        main purpose of this privilege is controlling which users can create
2127        dependencies on a type, which could prevent the owner from changing
2128        the type later.)
2129       </para>
2130       <para>
2131        For foreign-data wrappers, allows creation of new servers using the
2132        foreign-data wrapper.
2133       </para>
2134       <para>
2135        For foreign servers, allows creation of foreign tables using the
2136        server.  Grantees may also create, alter, or drop their own user
2137        mappings associated with that server.
2138       </para>
2139      </listitem>
2140     </varlistentry>
2141
2142     <varlistentry id="ddl-priv-set">
2143      <term><literal>SET</literal></term>
2144      <listitem>
2145       <para>
2146        Allows a server configuration parameter to be set to a new value
2147        within the current session.  (While this privilege can be granted
2148        on any parameter, it is meaningless except for parameters that would
2149        normally require superuser privilege to set.)
2150       </para>
2151      </listitem>
2152     </varlistentry>
2153
2154     <varlistentry id="ddl-priv-alter-system">
2155      <term><literal>ALTER SYSTEM</literal></term>
2156      <listitem>
2157       <para>
2158        Allows a server configuration parameter to be configured to a new
2159        value using the <xref linkend="sql-altersystem"/> command.
2160       </para>
2161      </listitem>
2162     </varlistentry>
2163
2164    <varlistentry id="ddl-priv-maintain">
2165     <term><literal>MAINTAIN</literal></term>
2166     <listitem>
2167      <para>
2168       Allows <command>VACUUM</command>, <command>ANALYZE</command>,
2169       <command>CLUSTER</command>, <command>REFRESH MATERIALIZED VIEW</command>,
2170       <command>REINDEX</command>, and <command>LOCK TABLE</command> on a
2171       relation.
2172      </para>
2173     </listitem>
2174    </varlistentry>
2175   </variablelist>
2176
2177    The privileges required by other commands are listed on the
2178    reference page of the respective command.
2179   </para>
2180
2181   <para id="ddl-priv-default">
2182    PostgreSQL grants privileges on some types of objects to
2183    <literal>PUBLIC</literal> by default when the objects are created.
2184    No privileges are granted to <literal>PUBLIC</literal> by default on
2185    tables,
2186    table columns,
2187    sequences,
2188    foreign data wrappers,
2189    foreign servers,
2190    large objects,
2191    schemas,
2192    tablespaces,
2193    or configuration parameters.
2194    For other types of objects, the default privileges
2195    granted to <literal>PUBLIC</literal> are as follows:
2196    <literal>CONNECT</literal> and <literal>TEMPORARY</literal> (create
2197    temporary tables) privileges for databases;
2198    <literal>EXECUTE</literal> privilege for functions and procedures; and
2199    <literal>USAGE</literal> privilege for languages and data types
2200    (including domains).
2201    The object owner can, of course, <command>REVOKE</command>
2202    both default and expressly granted privileges. (For maximum
2203    security, issue the <command>REVOKE</command> in the same transaction that
2204    creates the object; then there is no window in which another user
2205    can use the object.)
2206    Also, these default privilege settings can be overridden using the
2207    <xref linkend="sql-alterdefaultprivileges"/> command.
2208   </para>
2209
2210   <para>
2211    <xref linkend="privilege-abbrevs-table"/> shows the one-letter
2212    abbreviations that are used for these privilege types in
2213    <firstterm>ACL</firstterm> (Access Control List) values.
2214    You will see these letters in the output of the <xref linkend="app-psql"/>
2215    commands listed below, or when looking at ACL columns of system catalogs.
2216   </para>
2217
2218   <table id="privilege-abbrevs-table">
2219    <title>ACL Privilege Abbreviations</title>
2220    <tgroup cols="3">
2221     <colspec colname="col1" colwidth="1*"/>
2222     <colspec colname="col2" colwidth="1*"/>
2223     <colspec colname="col3" colwidth="2*"/>
2224     <thead>
2225      <row>
2226       <entry>Privilege</entry>
2227       <entry>Abbreviation</entry>
2228       <entry>Applicable Object Types</entry>
2229      </row>
2230     </thead>
2231     <tbody>
2232      <row>
2233       <entry><literal>SELECT</literal></entry>
2234       <entry><literal>r</literal> (<quote>read</quote>)</entry>
2235       <entry>
2236        <literal>LARGE OBJECT</literal>,
2237        <literal>SEQUENCE</literal>,
2238        <literal>TABLE</literal> (and table-like objects),
2239        table column
2240       </entry>
2241      </row>
2242      <row>
2243       <entry><literal>INSERT</literal></entry>
2244       <entry><literal>a</literal> (<quote>append</quote>)</entry>
2245       <entry><literal>TABLE</literal>, table column</entry>
2246      </row>
2247      <row>
2248       <entry><literal>UPDATE</literal></entry>
2249       <entry><literal>w</literal> (<quote>write</quote>)</entry>
2250       <entry>
2251        <literal>LARGE OBJECT</literal>,
2252        <literal>SEQUENCE</literal>,
2253        <literal>TABLE</literal>,
2254        table column
2255       </entry>
2256      </row>
2257      <row>
2258       <entry><literal>DELETE</literal></entry>
2259       <entry><literal>d</literal></entry>
2260       <entry><literal>TABLE</literal></entry>
2261      </row>
2262      <row>
2263       <entry><literal>TRUNCATE</literal></entry>
2264       <entry><literal>D</literal></entry>
2265       <entry><literal>TABLE</literal></entry>
2266      </row>
2267      <row>
2268       <entry><literal>REFERENCES</literal></entry>
2269       <entry><literal>x</literal></entry>
2270       <entry><literal>TABLE</literal>, table column</entry>
2271      </row>
2272      <row>
2273       <entry><literal>TRIGGER</literal></entry>
2274       <entry><literal>t</literal></entry>
2275       <entry><literal>TABLE</literal></entry>
2276      </row>
2277      <row>
2278       <entry><literal>CREATE</literal></entry>
2279       <entry><literal>C</literal></entry>
2280       <entry>
2281        <literal>DATABASE</literal>,
2282        <literal>SCHEMA</literal>,
2283        <literal>TABLESPACE</literal>
2284       </entry>
2285      </row>
2286      <row>
2287       <entry><literal>CONNECT</literal></entry>
2288       <entry><literal>c</literal></entry>
2289       <entry><literal>DATABASE</literal></entry>
2290      </row>
2291      <row>
2292       <entry><literal>TEMPORARY</literal></entry>
2293       <entry><literal>T</literal></entry>
2294       <entry><literal>DATABASE</literal></entry>
2295      </row>
2296      <row>
2297       <entry><literal>EXECUTE</literal></entry>
2298       <entry><literal>X</literal></entry>
2299       <entry><literal>FUNCTION</literal>, <literal>PROCEDURE</literal></entry>
2300      </row>
2301      <row>
2302       <entry><literal>USAGE</literal></entry>
2303       <entry><literal>U</literal></entry>
2304       <entry>
2305        <literal>DOMAIN</literal>,
2306        <literal>FOREIGN DATA WRAPPER</literal>,
2307        <literal>FOREIGN SERVER</literal>,
2308        <literal>LANGUAGE</literal>,
2309        <literal>SCHEMA</literal>,
2310        <literal>SEQUENCE</literal>,
2311        <literal>TYPE</literal>
2312       </entry>
2313      </row>
2314      <row>
2315       <entry><literal>SET</literal></entry>
2316       <entry><literal>s</literal></entry>
2317       <entry><literal>PARAMETER</literal></entry>
2318      </row>
2319      <row>
2320       <entry><literal>ALTER SYSTEM</literal></entry>
2321       <entry><literal>A</literal></entry>
2322       <entry><literal>PARAMETER</literal></entry>
2323      </row>
2324      <row>
2325       <entry><literal>MAINTAIN</literal></entry>
2326       <entry><literal>m</literal></entry>
2327       <entry><literal>TABLE</literal></entry>
2328      </row>
2329      </tbody>
2330    </tgroup>
2331   </table>
2332
2333   <para>
2334    <xref linkend="privileges-summary-table"/> summarizes the privileges
2335    available for each type of SQL object, using the abbreviations shown
2336    above.
2337    It also shows the <application>psql</application> command
2338    that can be used to examine privilege settings for each object type.
2339   </para>
2340
2341   <table id="privileges-summary-table">
2342    <title>Summary of Access Privileges</title>
2343    <tgroup cols="4">
2344     <colspec colname="col1" colwidth="2*"/>
2345     <colspec colname="col2" colwidth="1*"/>
2346     <colspec colname="col3" colwidth="1*"/>
2347     <colspec colname="col4" colwidth="1*"/>
2348     <thead>
2349      <row>
2350       <entry>Object Type</entry>
2351       <entry>All Privileges</entry>
2352       <entry>Default <literal>PUBLIC</literal> Privileges</entry>
2353       <entry><application>psql</application> Command</entry>
2354      </row>
2355     </thead>
2356     <tbody>
2357      <row>
2358       <entry><literal>DATABASE</literal></entry>
2359       <entry><literal>CTc</literal></entry>
2360       <entry><literal>Tc</literal></entry>
2361       <entry><literal>\l</literal></entry>
2362      </row>
2363      <row>
2364       <entry><literal>DOMAIN</literal></entry>
2365       <entry><literal>U</literal></entry>
2366       <entry><literal>U</literal></entry>
2367       <entry><literal>\dD+</literal></entry>
2368      </row>
2369      <row>
2370       <entry><literal>FUNCTION</literal> or <literal>PROCEDURE</literal></entry>
2371       <entry><literal>X</literal></entry>
2372       <entry><literal>X</literal></entry>
2373       <entry><literal>\df+</literal></entry>
2374      </row>
2375      <row>
2376       <entry><literal>FOREIGN DATA WRAPPER</literal></entry>
2377       <entry><literal>U</literal></entry>
2378       <entry>none</entry>
2379       <entry><literal>\dew+</literal></entry>
2380      </row>
2381      <row>
2382       <entry><literal>FOREIGN SERVER</literal></entry>
2383       <entry><literal>U</literal></entry>
2384       <entry>none</entry>
2385       <entry><literal>\des+</literal></entry>
2386      </row>
2387      <row>
2388       <entry><literal>LANGUAGE</literal></entry>
2389       <entry><literal>U</literal></entry>
2390       <entry><literal>U</literal></entry>
2391       <entry><literal>\dL+</literal></entry>
2392      </row>
2393      <row>
2394       <entry><literal>LARGE OBJECT</literal></entry>
2395       <entry><literal>rw</literal></entry>
2396       <entry>none</entry>
2397       <entry><literal>\dl+</literal></entry>
2398      </row>
2399      <row>
2400       <entry><literal>PARAMETER</literal></entry>
2401       <entry><literal>sA</literal></entry>
2402       <entry>none</entry>
2403       <entry><literal>\dconfig+</literal></entry>
2404      </row>
2405      <row>
2406       <entry><literal>SCHEMA</literal></entry>
2407       <entry><literal>UC</literal></entry>
2408       <entry>none</entry>
2409       <entry><literal>\dn+</literal></entry>
2410      </row>
2411      <row>
2412       <entry><literal>SEQUENCE</literal></entry>
2413       <entry><literal>rwU</literal></entry>
2414       <entry>none</entry>
2415       <entry><literal>\dp</literal></entry>
2416      </row>
2417      <row>
2418       <entry><literal>TABLE</literal> (and table-like objects)</entry>
2419       <entry><literal>arwdDxtm</literal></entry>
2420       <entry>none</entry>
2421       <entry><literal>\dp</literal></entry>
2422      </row>
2423      <row>
2424       <entry>Table column</entry>
2425       <entry><literal>arwx</literal></entry>
2426       <entry>none</entry>
2427       <entry><literal>\dp</literal></entry>
2428      </row>
2429      <row>
2430       <entry><literal>TABLESPACE</literal></entry>
2431       <entry><literal>C</literal></entry>
2432       <entry>none</entry>
2433       <entry><literal>\db+</literal></entry>
2434      </row>
2435      <row>
2436       <entry><literal>TYPE</literal></entry>
2437       <entry><literal>U</literal></entry>
2438       <entry><literal>U</literal></entry>
2439       <entry><literal>\dT+</literal></entry>
2440      </row>
2441     </tbody>
2442    </tgroup>
2443   </table>
2444
2445   <para>
2446    <indexterm>
2447     <primary><type>aclitem</type></primary>
2448    </indexterm>
2449    The privileges that have been granted for a particular object are
2450    displayed as a list of <type>aclitem</type> entries, each having the
2451    format:
2452 <synopsis>
2453 <replaceable>grantee</replaceable><literal>=</literal><replaceable>privilege-abbreviation</replaceable><optional><literal>*</literal></optional>...<literal>/</literal><replaceable>grantor</replaceable>
2454 </synopsis>
2455    Each <type>aclitem</type> lists all the permissions of one grantee that
2456    have been granted by a particular grantor.  Specific privileges are
2457    represented by one-letter abbreviations from
2458    <xref linkend="privilege-abbrevs-table"/>, with <literal>*</literal>
2459    appended if the privilege was granted with grant option.  For example,
2460    <literal>calvin=r*w/hobbes</literal> specifies that the role
2461    <literal>calvin</literal> has the privilege
2462    <literal>SELECT</literal> (<literal>r</literal>) with grant option
2463    (<literal>*</literal>) as well as the non-grantable
2464    privilege <literal>UPDATE</literal> (<literal>w</literal>), both granted
2465    by the role <literal>hobbes</literal>.  If <literal>calvin</literal>
2466    also has some privileges on the same object granted by a different
2467    grantor, those would appear as a separate <type>aclitem</type> entry.
2468    An empty grantee field in an <type>aclitem</type> stands
2469    for <literal>PUBLIC</literal>.
2470   </para>
2471
2472   <para>
2473    As an example, suppose that user <literal>miriam</literal> creates
2474    table <literal>mytable</literal> and does:
2475 <programlisting>
2476 GRANT SELECT ON mytable TO PUBLIC;
2477 GRANT SELECT, UPDATE, INSERT ON mytable TO admin;
2478 GRANT SELECT (col1), UPDATE (col1) ON mytable TO miriam_rw;
2479 </programlisting>
2480    Then <application>psql</application>'s <literal>\dp</literal> command
2481    would show:
2482 <programlisting>
2483 =&gt; \dp mytable
2484                                   Access privileges
2485  Schema |  Name   | Type  |   Access privileges    |   Column privileges   | Policies
2486 --------+---------+-------+------------------------+-----------------------+----------
2487  public | mytable | table | miriam=arwdDxtm/miriam+| col1:                +|
2488         |         |       | =r/miriam             +|   miriam_rw=rw/miriam |
2489         |         |       | admin=arw/miriam       |                       |
2490 (1 row)
2491 </programlisting>
2492   </para>
2493
2494   <para>
2495    If the <quote>Access privileges</quote> column is empty for a given
2496    object, it means the object has default privileges (that is, its
2497    privileges entry in the relevant system catalog is null).  Default
2498    privileges always include all privileges for the owner, and can include
2499    some privileges for <literal>PUBLIC</literal> depending on the object
2500    type, as explained above.  The first <command>GRANT</command>
2501    or <command>REVOKE</command> on an object will instantiate the default
2502    privileges (producing, for
2503    example, <literal>miriam=arwdDxt/miriam</literal>) and then modify them
2504    per the specified request.  Similarly, entries are shown in <quote>Column
2505    privileges</quote> only for columns with nondefault privileges.
2506    (Note: for this purpose, <quote>default privileges</quote> always means
2507    the built-in default privileges for the object's type.  An object whose
2508    privileges have been affected by an <command>ALTER DEFAULT
2509    PRIVILEGES</command> command will always be shown with an explicit
2510    privilege entry that includes the effects of
2511    the <command>ALTER</command>.)
2512   </para>
2513
2514   <para>
2515    Notice that the owner's implicit grant options are not marked in the
2516    access privileges display.  A <literal>*</literal> will appear only when
2517    grant options have been explicitly granted to someone.
2518   </para>
2519
2520   <para>
2521    The <quote>Access privileges</quote> column
2522    shows <literal>(none)</literal> when the object's privileges entry is
2523    non-null but empty.  This means that no privileges are granted at all,
2524    even to the object's owner &mdash; a rare situation.  (The owner still
2525    has implicit grant options in this case, and so could re-grant her own
2526    privileges; but she has none at the moment.)
2527   </para>
2528  </sect1>
2529
2530  <sect1 id="ddl-rowsecurity">
2531   <title>Row Security Policies</title>
2532
2533   <indexterm zone="ddl-rowsecurity">
2534    <primary>row-level security</primary>
2535   </indexterm>
2536
2537   <indexterm zone="ddl-rowsecurity">
2538    <primary>policy</primary>
2539   </indexterm>
2540
2541   <para>
2542    In addition to the SQL-standard <link linkend="ddl-priv">privilege
2543    system</link> available through <xref linkend="sql-grant"/>,
2544    tables can have <firstterm>row security policies</firstterm> that restrict,
2545    on a per-user basis, which rows can be returned by normal queries
2546    or inserted, updated, or deleted by data modification commands.
2547    This feature is also known as <firstterm>Row-Level Security</firstterm>.
2548    By default, tables do not have any policies, so that if a user has
2549    access privileges to a table according to the SQL privilege system,
2550    all rows within it are equally available for querying or updating.
2551   </para>
2552
2553   <para>
2554    When row security is enabled on a table (with
2555    <link linkend="sql-altertable">ALTER TABLE ... ENABLE ROW LEVEL
2556    SECURITY</link>), all normal access to the table for selecting rows or
2557    modifying rows must be allowed by a row security policy.  (However, the
2558    table's owner is typically not subject to row security policies.)  If no
2559    policy exists for the table, a default-deny policy is used, meaning that
2560    no rows are visible or can be modified.  Operations that apply to the
2561    whole table, such as <command>TRUNCATE</command> and <literal>REFERENCES</literal>,
2562    are not subject to row security.
2563   </para>
2564
2565   <para>
2566    Row security policies can be specific to commands, or to roles, or to
2567    both.  A policy can be specified to apply to <literal>ALL</literal>
2568    commands, or to <literal>SELECT</literal>, <literal>INSERT</literal>, <literal>UPDATE</literal>,
2569    or <literal>DELETE</literal>.  Multiple roles can be assigned to a given
2570    policy, and normal role membership and inheritance rules apply.
2571   </para>
2572
2573   <para>
2574    To specify which rows are visible or modifiable according to a policy,
2575    an expression is required that returns a Boolean result.  This
2576    expression will be evaluated for each row prior to any conditions or
2577    functions coming from the user's query.  (The only exceptions to this
2578    rule are <literal>leakproof</literal> functions, which are guaranteed to
2579    not leak information; the optimizer may choose to apply such functions
2580    ahead of the row-security check.)  Rows for which the expression does
2581    not return <literal>true</literal> will not be processed.  Separate expressions
2582    may be specified to provide independent control over the rows which are
2583    visible and the rows which are allowed to be modified.  Policy
2584    expressions are run as part of the query and with the privileges of the
2585    user running the query, although security-definer functions can be used
2586    to access data not available to the calling user.
2587   </para>
2588
2589   <para>
2590    Superusers and roles with the <literal>BYPASSRLS</literal> attribute always
2591    bypass the row security system when accessing a table.  Table owners
2592    normally bypass row security as well, though a table owner can choose to
2593    be subject to row security with <link linkend="sql-altertable">ALTER
2594    TABLE ... FORCE ROW LEVEL SECURITY</link>.
2595   </para>
2596
2597   <para>
2598    Enabling and disabling row security, as well as adding policies to a
2599    table, is always the privilege of the table owner only.
2600   </para>
2601
2602   <para>
2603    Policies are created using the <xref linkend="sql-createpolicy"/>
2604    command, altered using the <xref linkend="sql-alterpolicy"/> command,
2605    and dropped using the <xref linkend="sql-droppolicy"/> command.  To
2606    enable and disable row security for a given table, use the
2607    <xref linkend="sql-altertable"/> command.
2608   </para>
2609
2610   <para>
2611    Each policy has a name and multiple policies can be defined for a
2612    table.  As policies are table-specific, each policy for a table must
2613    have a unique name.  Different tables may have policies with the
2614    same name.
2615   </para>
2616
2617   <para>
2618    When multiple policies apply to a given query, they are combined using
2619    either <literal>OR</literal> (for permissive policies, which are the
2620    default) or using <literal>AND</literal> (for restrictive policies).
2621    This is similar to the rule that a given role has the privileges
2622    of all roles that they are a member of.  Permissive vs. restrictive
2623    policies are discussed further below.
2624   </para>
2625
2626   <para>
2627    As a simple example, here is how to create a policy on
2628    the <literal>account</literal> relation to allow only members of
2629    the <literal>managers</literal> role to access rows, and only rows of their
2630    accounts:
2631   </para>
2632
2633 <programlisting>
2634 CREATE TABLE accounts (manager text, company text, contact_email text);
2635
2636 ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;
2637
2638 CREATE POLICY account_managers ON accounts TO managers
2639     USING (manager = current_user);
2640 </programlisting>
2641
2642   <para>
2643    The policy above implicitly provides a <literal>WITH CHECK</literal>
2644    clause identical to its <literal>USING</literal> clause, so that the
2645    constraint applies both to rows selected by a command (so a manager
2646    cannot <command>SELECT</command>, <command>UPDATE</command>,
2647    or <command>DELETE</command> existing rows belonging to a different
2648    manager) and to rows modified by a command (so rows belonging to a
2649    different manager cannot be created via <command>INSERT</command>
2650    or <command>UPDATE</command>).
2651   </para>
2652
2653   <para>
2654    If no role is specified, or the special user name
2655    <literal>PUBLIC</literal> is used, then the policy applies to all
2656    users on the system.  To allow all users to access only their own row in
2657    a <literal>users</literal> table, a simple policy can be used:
2658   </para>
2659
2660 <programlisting>
2661 CREATE POLICY user_policy ON users
2662     USING (user_name = current_user);
2663 </programlisting>
2664
2665   <para>
2666    This works similarly to the previous example.
2667   </para>
2668
2669   <para>
2670    To use a different policy for rows that are being added to the table
2671    compared to those rows that are visible, multiple policies can be
2672    combined.  This pair of policies would allow all users to view all rows
2673    in the <literal>users</literal> table, but only modify their own:
2674   </para>
2675
2676 <programlisting>
2677 CREATE POLICY user_sel_policy ON users
2678     FOR SELECT
2679     USING (true);
2680 CREATE POLICY user_mod_policy ON users
2681     USING (user_name = current_user);
2682 </programlisting>
2683
2684   <para>
2685    In a <command>SELECT</command> command, these two policies are combined
2686    using <literal>OR</literal>, with the net effect being that all rows
2687    can be selected.  In other command types, only the second policy applies,
2688    so that the effects are the same as before.
2689   </para>
2690
2691   <para>
2692    Row security can also be disabled with the <command>ALTER TABLE</command>
2693    command.  Disabling row security does not remove any policies that are
2694    defined on the table; they are simply ignored.  Then all rows in the
2695    table are visible and modifiable, subject to the standard SQL privileges
2696    system.
2697   </para>
2698
2699   <para>
2700    Below is a larger example of how this feature can be used in production
2701    environments.  The table <literal>passwd</literal> emulates a Unix password
2702    file:
2703   </para>
2704
2705 <programlisting>
2706 -- Simple passwd-file based example
2707 CREATE TABLE passwd (
2708   user_name             text UNIQUE NOT NULL,
2709   pwhash                text,
2710   uid                   int  PRIMARY KEY,
2711   gid                   int  NOT NULL,
2712   real_name             text NOT NULL,
2713   home_phone            text,
2714   extra_info            text,
2715   home_dir              text NOT NULL,
2716   shell                 text NOT NULL
2717 );
2718
2719 CREATE ROLE admin;  -- Administrator
2720 CREATE ROLE bob;    -- Normal user
2721 CREATE ROLE alice;  -- Normal user
2722
2723 -- Populate the table
2724 INSERT INTO passwd VALUES
2725   ('admin','xxx',0,0,'Admin','111-222-3333',null,'/root','/bin/dash');
2726 INSERT INTO passwd VALUES
2727   ('bob','xxx',1,1,'Bob','123-456-7890',null,'/home/bob','/bin/zsh');
2728 INSERT INTO passwd VALUES
2729   ('alice','xxx',2,1,'Alice','098-765-4321',null,'/home/alice','/bin/zsh');
2730
2731 -- Be sure to enable row-level security on the table
2732 ALTER TABLE passwd ENABLE ROW LEVEL SECURITY;
2733
2734 -- Create policies
2735 -- Administrator can see all rows and add any rows
2736 CREATE POLICY admin_all ON passwd TO admin USING (true) WITH CHECK (true);
2737 -- Normal users can view all rows
2738 CREATE POLICY all_view ON passwd FOR SELECT USING (true);
2739 -- Normal users can update their own records, but
2740 -- limit which shells a normal user is allowed to set
2741 CREATE POLICY user_mod ON passwd FOR UPDATE
2742   USING (current_user = user_name)
2743   WITH CHECK (
2744     current_user = user_name AND
2745     shell IN ('/bin/bash','/bin/sh','/bin/dash','/bin/zsh','/bin/tcsh')
2746   );
2747
2748 -- Allow admin all normal rights
2749 GRANT SELECT, INSERT, UPDATE, DELETE ON passwd TO admin;
2750 -- Users only get select access on public columns
2751 GRANT SELECT
2752   (user_name, uid, gid, real_name, home_phone, extra_info, home_dir, shell)
2753   ON passwd TO public;
2754 -- Allow users to update certain columns
2755 GRANT UPDATE
2756   (pwhash, real_name, home_phone, extra_info, shell)
2757   ON passwd TO public;
2758 </programlisting>
2759
2760   <para>
2761    As with any security settings, it's important to test and ensure that
2762    the system is behaving as expected.  Using the example above, this
2763    demonstrates that the permission system is working properly.
2764   </para>
2765
2766 <programlisting>
2767 -- admin can view all rows and fields
2768 postgres=&gt; set role admin;
2769 SET
2770 postgres=&gt; table passwd;
2771  user_name | pwhash | uid | gid | real_name |  home_phone  | extra_info | home_dir    |   shell
2772 -----------+--------+-----+-----+-----------+--------------+------------+-------------+-----------
2773  admin     | xxx    |   0 |   0 | Admin     | 111-222-3333 |            | /root       | /bin/dash
2774  bob       | xxx    |   1 |   1 | Bob       | 123-456-7890 |            | /home/bob   | /bin/zsh
2775  alice     | xxx    |   2 |   1 | Alice     | 098-765-4321 |            | /home/alice | /bin/zsh
2776 (3 rows)
2777
2778 -- Test what Alice is able to do
2779 postgres=&gt; set role alice;
2780 SET
2781 postgres=&gt; table passwd;
2782 ERROR:  permission denied for table passwd
2783 postgres=&gt; select user_name,real_name,home_phone,extra_info,home_dir,shell from passwd;
2784  user_name | real_name |  home_phone  | extra_info | home_dir    |   shell
2785 -----------+-----------+--------------+------------+-------------+-----------
2786  admin     | Admin     | 111-222-3333 |            | /root       | /bin/dash
2787  bob       | Bob       | 123-456-7890 |            | /home/bob   | /bin/zsh
2788  alice     | Alice     | 098-765-4321 |            | /home/alice | /bin/zsh
2789 (3 rows)
2790
2791 postgres=&gt; update passwd set user_name = 'joe';
2792 ERROR:  permission denied for table passwd
2793 -- Alice is allowed to change her own real_name, but no others
2794 postgres=&gt; update passwd set real_name = 'Alice Doe';
2795 UPDATE 1
2796 postgres=&gt; update passwd set real_name = 'John Doe' where user_name = 'admin';
2797 UPDATE 0
2798 postgres=&gt; update passwd set shell = '/bin/xx';
2799 ERROR:  new row violates WITH CHECK OPTION for "passwd"
2800 postgres=&gt; delete from passwd;
2801 ERROR:  permission denied for table passwd
2802 postgres=&gt; insert into passwd (user_name) values ('xxx');
2803 ERROR:  permission denied for table passwd
2804 -- Alice can change her own password; RLS silently prevents updating other rows
2805 postgres=&gt; update passwd set pwhash = 'abc';
2806 UPDATE 1
2807 </programlisting>
2808
2809   <para>
2810    All of the policies constructed thus far have been permissive policies,
2811    meaning that when multiple policies are applied they are combined using
2812    the <quote>OR</quote> Boolean operator.  While permissive policies can be constructed
2813    to only allow access to rows in the intended cases, it can be simpler to
2814    combine permissive policies with restrictive policies (which the records
2815    must pass and which are combined using the <quote>AND</quote> Boolean operator).
2816    Building on the example above, we add a restrictive policy to require
2817    the administrator to be connected over a local Unix socket to access the
2818    records of the <literal>passwd</literal> table:
2819   </para>
2820
2821 <programlisting>
2822 CREATE POLICY admin_local_only ON passwd AS RESTRICTIVE TO admin
2823     USING (pg_catalog.inet_client_addr() IS NULL);
2824 </programlisting>
2825
2826   <para>
2827    We can then see that an administrator connecting over a network will not
2828    see any records, due to the restrictive policy:
2829   </para>
2830
2831 <programlisting>
2832 =&gt; SELECT current_user;
2833  current_user
2834 --------------
2835  admin
2836 (1 row)
2837
2838 =&gt; select inet_client_addr();
2839  inet_client_addr
2840 ------------------
2841  127.0.0.1
2842 (1 row)
2843
2844 =&gt; TABLE passwd;
2845  user_name | pwhash | uid | gid | real_name | home_phone | extra_info | home_dir | shell
2846 -----------+--------+-----+-----+-----------+------------+------------+----------+-------
2847 (0 rows)
2848
2849 =&gt; UPDATE passwd set pwhash = NULL;
2850 UPDATE 0
2851 </programlisting>
2852
2853   <para>
2854    Referential integrity checks, such as unique or primary key constraints
2855    and foreign key references, always bypass row security to ensure that
2856    data integrity is maintained.  Care must be taken when developing
2857    schemas and row level policies to avoid <quote>covert channel</quote> leaks of
2858    information through such referential integrity checks.
2859   </para>
2860
2861   <para>
2862    In some contexts it is important to be sure that row security is
2863    not being applied.  For example, when taking a backup, it could be
2864    disastrous if row security silently caused some rows to be omitted
2865    from the backup.  In such a situation, you can set the
2866    <xref linkend="guc-row-security"/> configuration parameter
2867    to <literal>off</literal>.  This does not in itself bypass row security;
2868    what it does is throw an error if any query's results would get filtered
2869    by a policy.  The reason for the error can then be investigated and
2870    fixed.
2871   </para>
2872
2873   <para>
2874    In the examples above, the policy expressions consider only the current
2875    values in the row to be accessed or updated.  This is the simplest and
2876    best-performing case; when possible, it's best to design row security
2877    applications to work this way.  If it is necessary to consult other rows
2878    or other tables to make a policy decision, that can be accomplished using
2879    sub-<command>SELECT</command>s, or functions that contain <command>SELECT</command>s,
2880    in the policy expressions.  Be aware however that such accesses can
2881    create race conditions that could allow information leakage if care is
2882    not taken.  As an example, consider the following table design:
2883   </para>
2884
2885 <programlisting>
2886 -- definition of privilege groups
2887 CREATE TABLE groups (group_id int PRIMARY KEY,
2888                      group_name text NOT NULL);
2889
2890 INSERT INTO groups VALUES
2891   (1, 'low'),
2892   (2, 'medium'),
2893   (5, 'high');
2894
2895 GRANT ALL ON groups TO alice;  -- alice is the administrator
2896 GRANT SELECT ON groups TO public;
2897
2898 -- definition of users' privilege levels
2899 CREATE TABLE users (user_name text PRIMARY KEY,
2900                     group_id int NOT NULL REFERENCES groups);
2901
2902 INSERT INTO users VALUES
2903   ('alice', 5),
2904   ('bob', 2),
2905   ('mallory', 2);
2906
2907 GRANT ALL ON users TO alice;
2908 GRANT SELECT ON users TO public;
2909
2910 -- table holding the information to be protected
2911 CREATE TABLE information (info text,
2912                           group_id int NOT NULL REFERENCES groups);
2913
2914 INSERT INTO information VALUES
2915   ('barely secret', 1),
2916   ('slightly secret', 2),
2917   ('very secret', 5);
2918
2919 ALTER TABLE information ENABLE ROW LEVEL SECURITY;
2920
2921 -- a row should be visible to/updatable by users whose security group_id is
2922 -- greater than or equal to the row's group_id
2923 CREATE POLICY fp_s ON information FOR SELECT
2924   USING (group_id &lt;= (SELECT group_id FROM users WHERE user_name = current_user));
2925 CREATE POLICY fp_u ON information FOR UPDATE
2926   USING (group_id &lt;= (SELECT group_id FROM users WHERE user_name = current_user));
2927
2928 -- we rely only on RLS to protect the information table
2929 GRANT ALL ON information TO public;
2930 </programlisting>
2931
2932   <para>
2933    Now suppose that <literal>alice</literal> wishes to change the <quote>slightly
2934    secret</quote> information, but decides that <literal>mallory</literal> should not
2935    be trusted with the new content of that row, so she does:
2936   </para>
2937
2938 <programlisting>
2939 BEGIN;
2940 UPDATE users SET group_id = 1 WHERE user_name = 'mallory';
2941 UPDATE information SET info = 'secret from mallory' WHERE group_id = 2;
2942 COMMIT;
2943 </programlisting>
2944
2945   <para>
2946    That looks safe; there is no window wherein <literal>mallory</literal> should be
2947    able to see the <quote>secret from mallory</quote> string.  However, there is
2948    a race condition here.  If <literal>mallory</literal> is concurrently doing,
2949    say,
2950 <programlisting>
2951 SELECT * FROM information WHERE group_id = 2 FOR UPDATE;
2952 </programlisting>
2953    and her transaction is in <literal>READ COMMITTED</literal> mode, it is possible
2954    for her to see <quote>secret from mallory</quote>.  That happens if her
2955    transaction reaches the <structname>information</structname> row just
2956    after <literal>alice</literal>'s does.  It blocks waiting
2957    for <literal>alice</literal>'s transaction to commit, then fetches the updated
2958    row contents thanks to the <literal>FOR UPDATE</literal> clause.  However, it
2959    does <emphasis>not</emphasis> fetch an updated row for the
2960    implicit <command>SELECT</command> from <structname>users</structname>, because that
2961    sub-<command>SELECT</command> did not have <literal>FOR UPDATE</literal>; instead
2962    the <structname>users</structname> row is read with the snapshot taken at the start
2963    of the query.  Therefore, the policy expression tests the old value
2964    of <literal>mallory</literal>'s privilege level and allows her to see the
2965    updated row.
2966   </para>
2967
2968   <para>
2969    There are several ways around this problem.  One simple answer is to use
2970    <literal>SELECT ... FOR SHARE</literal> in sub-<command>SELECT</command>s in row
2971    security policies.  However, that requires granting <literal>UPDATE</literal>
2972    privilege on the referenced table (here <structname>users</structname>) to the
2973    affected users, which might be undesirable.  (But another row security
2974    policy could be applied to prevent them from actually exercising that
2975    privilege; or the sub-<command>SELECT</command> could be embedded into a security
2976    definer function.)  Also, heavy concurrent use of row share locks on the
2977    referenced table could pose a performance problem, especially if updates
2978    of it are frequent.  Another solution, practical if updates of the
2979    referenced table are infrequent, is to take an
2980    <literal>ACCESS EXCLUSIVE</literal> lock on the
2981    referenced table when updating it, so that no concurrent transactions
2982    could be examining old row values.  Or one could just wait for all
2983    concurrent transactions to end after committing an update of the
2984    referenced table and before making changes that rely on the new security
2985    situation.
2986   </para>
2987
2988   <para>
2989    For additional details see <xref linkend="sql-createpolicy"/>
2990    and <xref linkend="sql-altertable"/>.
2991   </para>
2992
2993  </sect1>
2994
2995  <sect1 id="ddl-schemas">
2996   <title>Schemas</title>
2997
2998   <indexterm zone="ddl-schemas">
2999    <primary>schema</primary>
3000   </indexterm>
3001
3002   <para>
3003    A <productname>PostgreSQL</productname> database cluster contains
3004    one or more named databases.  Roles and a few other object types are
3005    shared across the entire cluster.  A client connection to the server
3006    can only access data in a single database, the one specified in the
3007    connection request.
3008   </para>
3009
3010   <note>
3011    <para>
3012     Users of a cluster do not necessarily have the privilege to access every
3013     database in the cluster.  Sharing of role names means that there
3014     cannot be different roles named, say, <literal>joe</literal> in two databases
3015     in the same cluster; but the system can be configured to allow
3016     <literal>joe</literal> access to only some of the databases.
3017    </para>
3018   </note>
3019
3020   <para>
3021    A database contains one or more named <firstterm>schemas</firstterm>, which
3022    in turn contain tables.  Schemas also contain other kinds of named
3023    objects, including data types, functions, and operators.  Within one
3024    schema, two objects of the same type cannot have the same name.
3025    Furthermore, tables, sequences, indexes, views, materialized views, and
3026    foreign tables share the same namespace, so that, for example, an index and
3027    a table must have different names if they are in the same schema.  The same
3028    object name can be used in different schemas without conflict; for
3029    example, both <literal>schema1</literal> and <literal>myschema</literal> can
3030    contain tables named <literal>mytable</literal>.  Unlike databases,
3031    schemas are not rigidly separated: a user can access objects in any
3032    of the schemas in the database they are connected to, if they have
3033    privileges to do so.
3034   </para>
3035
3036   <para>
3037    There are several reasons why one might want to use schemas:
3038
3039    <itemizedlist>
3040     <listitem>
3041      <para>
3042       To allow many users to use one database without interfering with
3043       each other.
3044      </para>
3045     </listitem>
3046
3047     <listitem>
3048      <para>
3049       To organize database objects into logical groups to make them
3050       more manageable.
3051      </para>
3052     </listitem>
3053
3054     <listitem>
3055      <para>
3056       Third-party applications can be put into separate schemas so
3057       they do not collide with the names of other objects.
3058      </para>
3059     </listitem>
3060    </itemizedlist>
3061
3062    Schemas are analogous to directories at the operating system level,
3063    except that schemas cannot be nested.
3064   </para>
3065
3066   <sect2 id="ddl-schemas-create">
3067    <title>Creating a Schema</title>
3068
3069    <indexterm zone="ddl-schemas-create">
3070     <primary>schema</primary>
3071     <secondary>creating</secondary>
3072    </indexterm>
3073
3074    <para>
3075     To create a schema, use the <xref linkend="sql-createschema"/>
3076     command.  Give the schema a name
3077     of your choice.  For example:
3078 <programlisting>
3079 CREATE SCHEMA myschema;
3080 </programlisting>
3081    </para>
3082
3083    <indexterm>
3084     <primary>qualified name</primary>
3085    </indexterm>
3086
3087    <indexterm>
3088     <primary>name</primary>
3089     <secondary>qualified</secondary>
3090    </indexterm>
3091
3092    <para>
3093     To create or access objects in a schema, write a
3094     <firstterm>qualified name</firstterm> consisting of the schema name and
3095     table name separated by a dot:
3096 <synopsis>
3097 <replaceable>schema</replaceable><literal>.</literal><replaceable>table</replaceable>
3098 </synopsis>
3099     This works anywhere a table name is expected, including the table
3100     modification commands and the data access commands discussed in
3101     the following chapters.
3102     (For brevity we will speak of tables only, but the same ideas apply
3103     to other kinds of named objects, such as types and functions.)
3104    </para>
3105
3106    <para>
3107     Actually, the even more general syntax
3108 <synopsis>
3109 <replaceable>database</replaceable><literal>.</literal><replaceable>schema</replaceable><literal>.</literal><replaceable>table</replaceable>
3110 </synopsis>
3111     can be used too, but at present this is just for pro forma
3112     compliance with the SQL standard.  If you write a database name,
3113     it must be the same as the database you are connected to.
3114    </para>
3115
3116    <para>
3117     So to create a table in the new schema, use:
3118 <programlisting>
3119 CREATE TABLE myschema.mytable (
3120  ...
3121 );
3122 </programlisting>
3123    </para>
3124
3125    <indexterm>
3126     <primary>schema</primary>
3127     <secondary>removing</secondary>
3128    </indexterm>
3129
3130    <para>
3131     To drop a schema if it's empty (all objects in it have been
3132     dropped), use:
3133 <programlisting>
3134 DROP SCHEMA myschema;
3135 </programlisting>
3136     To drop a schema including all contained objects, use:
3137 <programlisting>
3138 DROP SCHEMA myschema CASCADE;
3139 </programlisting>
3140     See <xref linkend="ddl-depend"/> for a description of the general
3141     mechanism behind this.
3142    </para>
3143
3144    <para>
3145     Often you will want to create a schema owned by someone else
3146     (since this is one of the ways to restrict the activities of your
3147     users to well-defined namespaces).  The syntax for that is:
3148 <programlisting>
3149 CREATE SCHEMA <replaceable>schema_name</replaceable> AUTHORIZATION <replaceable>user_name</replaceable>;
3150 </programlisting>
3151     You can even omit the schema name, in which case the schema name
3152     will be the same as the user name.  See <xref
3153     linkend="ddl-schemas-patterns"/> for how this can be useful.
3154    </para>
3155
3156    <para>
3157     Schema names beginning with <literal>pg_</literal> are reserved for
3158     system purposes and cannot be created by users.
3159    </para>
3160   </sect2>
3161
3162   <sect2 id="ddl-schemas-public">
3163    <title>The Public Schema</title>
3164
3165    <indexterm zone="ddl-schemas-public">
3166     <primary>schema</primary>
3167     <secondary>public</secondary>
3168    </indexterm>
3169
3170    <para>
3171     In the previous sections we created tables without specifying any
3172     schema names.  By default such tables (and other objects) are
3173     automatically put into a schema named <quote>public</quote>.  Every new
3174     database contains such a schema.  Thus, the following are equivalent:
3175 <programlisting>
3176 CREATE TABLE products ( ... );
3177 </programlisting>
3178     and:
3179 <programlisting>
3180 CREATE TABLE public.products ( ... );
3181 </programlisting>
3182    </para>
3183   </sect2>
3184
3185   <sect2 id="ddl-schemas-path">
3186    <title>The Schema Search Path</title>
3187
3188    <indexterm>
3189     <primary>search path</primary>
3190    </indexterm>
3191
3192    <indexterm>
3193     <primary>unqualified name</primary>
3194    </indexterm>
3195
3196    <indexterm>
3197     <primary>name</primary>
3198     <secondary>unqualified</secondary>
3199    </indexterm>
3200
3201    <para>
3202     Qualified names are tedious to write, and it's often best not to
3203     wire a particular schema name into applications anyway.  Therefore
3204     tables are often referred to by <firstterm>unqualified names</firstterm>,
3205     which consist of just the table name.  The system determines which table
3206     is meant by following a <firstterm>search path</firstterm>, which is a list
3207     of schemas to look in.  The first matching table in the search path
3208     is taken to be the one wanted.  If there is no match in the search
3209     path, an error is reported, even if matching table names exist
3210     in other schemas in the database.
3211    </para>
3212
3213   <para>
3214     The ability to create like-named objects in different schemas complicates
3215     writing a query that references precisely the same objects every time.  It
3216     also opens up the potential for users to change the behavior of other
3217     users' queries, maliciously or accidentally.  Due to the prevalence of
3218     unqualified names in queries and their use
3219     in <productname>PostgreSQL</productname> internals, adding a schema
3220     to <varname>search_path</varname> effectively trusts all users having
3221     <literal>CREATE</literal> privilege on that schema.  When you run an
3222     ordinary query, a malicious user able to create objects in a schema of
3223     your search path can take control and execute arbitrary SQL functions as
3224     though you executed them.
3225    </para>
3226
3227    <indexterm>
3228     <primary>schema</primary>
3229     <secondary>current</secondary>
3230    </indexterm>
3231
3232    <para>
3233     The first schema named in the search path is called the current schema.
3234     Aside from being the first schema searched, it is also the schema in
3235     which new tables will be created if the <command>CREATE TABLE</command>
3236     command does not specify a schema name.
3237    </para>
3238
3239    <indexterm>
3240     <primary><varname>search_path</varname> configuration parameter</primary>
3241    </indexterm>
3242
3243    <para>
3244     To show the current search path, use the following command:
3245 <programlisting>
3246 SHOW search_path;
3247 </programlisting>
3248     In the default setup this returns:
3249 <screen>
3250  search_path
3251 --------------
3252  "$user", public
3253 </screen>
3254     The first element specifies that a schema with the same name as
3255     the current user is to be searched.  If no such schema exists,
3256     the entry is ignored.  The second element refers to the
3257     public schema that we have seen already.
3258    </para>
3259
3260    <para>
3261     The first schema in the search path that exists is the default
3262     location for creating new objects.  That is the reason that by
3263     default objects are created in the public schema.  When objects
3264     are referenced in any other context without schema qualification
3265     (table modification, data modification, or query commands) the
3266     search path is traversed until a matching object is found.
3267     Therefore, in the default configuration, any unqualified access
3268     again can only refer to the public schema.
3269    </para>
3270
3271    <para>
3272     To put our new schema in the path, we use:
3273 <programlisting>
3274 SET search_path TO myschema,public;
3275 </programlisting>
3276     (We omit the <literal>$user</literal> here because we have no
3277     immediate need for it.)  And then we can access the table without
3278     schema qualification:
3279 <programlisting>
3280 DROP TABLE mytable;
3281 </programlisting>
3282     Also, since <literal>myschema</literal> is the first element in
3283     the path, new objects would by default be created in it.
3284    </para>
3285
3286    <para>
3287     We could also have written:
3288 <programlisting>
3289 SET search_path TO myschema;
3290 </programlisting>
3291     Then we no longer have access to the public schema without
3292     explicit qualification.  There is nothing special about the public
3293     schema except that it exists by default.  It can be dropped, too.
3294    </para>
3295
3296    <para>
3297     See also <xref linkend="functions-info"/> for other ways to manipulate
3298     the schema search path.
3299    </para>
3300
3301    <para>
3302     The search path works in the same way for data type names, function names,
3303     and operator names as it does for table names.  Data type and function
3304     names can be qualified in exactly the same way as table names.  If you
3305     need to write a qualified operator name in an expression, there is a
3306     special provision: you must write
3307 <synopsis>
3308 <literal>OPERATOR(</literal><replaceable>schema</replaceable><literal>.</literal><replaceable>operator</replaceable><literal>)</literal>
3309 </synopsis>
3310     This is needed to avoid syntactic ambiguity.  An example is:
3311 <programlisting>
3312 SELECT 3 OPERATOR(pg_catalog.+) 4;
3313 </programlisting>
3314     In practice one usually relies on the search path for operators,
3315     so as not to have to write anything so ugly as that.
3316    </para>
3317   </sect2>
3318
3319   <sect2 id="ddl-schemas-priv">
3320    <title>Schemas and Privileges</title>
3321
3322    <indexterm zone="ddl-schemas-priv">
3323     <primary>privilege</primary>
3324     <secondary sortas="schemas">for schemas</secondary>
3325    </indexterm>
3326
3327    <para>
3328     By default, users cannot access any objects in schemas they do not
3329     own.  To allow that, the owner of the schema must grant the
3330     <literal>USAGE</literal> privilege on the schema.  By default, everyone
3331     has that privilege on the schema <literal>public</literal>.  To allow
3332     users to make use of the objects in a schema, additional privileges might
3333     need to be granted, as appropriate for the object.
3334    </para>
3335
3336    <para>
3337     A user can also be allowed to create objects in someone else's schema.  To
3338     allow that, the <literal>CREATE</literal> privilege on the schema needs to
3339     be granted.  In databases upgraded from
3340     <productname>PostgreSQL</productname> 14 or earlier, everyone has that
3341     privilege on the schema <literal>public</literal>.
3342     Some <link linkend="ddl-schemas-patterns">usage patterns</link> call for
3343     revoking that privilege:
3344 <programlisting>
3345 REVOKE CREATE ON SCHEMA public FROM PUBLIC;
3346 </programlisting>
3347     (The first <quote>public</quote> is the schema, the second
3348     <quote>public</quote> means <quote>every user</quote>.  In the
3349     first sense it is an identifier, in the second sense it is a
3350     key word, hence the different capitalization; recall the
3351     guidelines from <xref linkend="sql-syntax-identifiers"/>.)
3352    </para>
3353   </sect2>
3354
3355   <sect2 id="ddl-schemas-catalog">
3356    <title>The System Catalog Schema</title>
3357
3358    <indexterm zone="ddl-schemas-catalog">
3359     <primary>system catalog</primary>
3360     <secondary>schema</secondary>
3361    </indexterm>
3362
3363    <para>
3364     In addition to <literal>public</literal> and user-created schemas, each
3365     database contains a <literal>pg_catalog</literal> schema, which contains
3366     the system tables and all the built-in data types, functions, and
3367     operators.  <literal>pg_catalog</literal> is always effectively part of
3368     the search path.  If it is not named explicitly in the path then
3369     it is implicitly searched <emphasis>before</emphasis> searching the path's
3370     schemas.  This ensures that built-in names will always be
3371     findable.  However, you can explicitly place
3372     <literal>pg_catalog</literal> at the end of your search path if you
3373     prefer to have user-defined names override built-in names.
3374    </para>
3375
3376    <para>
3377     Since system table names begin with <literal>pg_</literal>, it is best to
3378     avoid such names to ensure that you won't suffer a conflict if some
3379     future version defines a system table named the same as your
3380     table.  (With the default search path, an unqualified reference to
3381     your table name would then be resolved as the system table instead.)
3382     System tables will continue to follow the convention of having
3383     names beginning with <literal>pg_</literal>, so that they will not
3384     conflict with unqualified user-table names so long as users avoid
3385     the <literal>pg_</literal> prefix.
3386    </para>
3387   </sect2>
3388
3389   <sect2 id="ddl-schemas-patterns">
3390    <title>Usage Patterns</title>
3391
3392    <para>
3393     Schemas can be used to organize your data in many ways.
3394     A <firstterm>secure schema usage pattern</firstterm> prevents untrusted
3395     users from changing the behavior of other users' queries.  When a database
3396     does not use a secure schema usage pattern, users wishing to securely
3397     query that database would take protective action at the beginning of each
3398     session.  Specifically, they would begin each session by
3399     setting <varname>search_path</varname> to the empty string or otherwise
3400     removing schemas that are writable by non-superusers
3401     from <varname>search_path</varname>.  There are a few usage patterns
3402     easily supported by the default configuration:
3403     <itemizedlist>
3404      <listitem>
3405       <para>
3406        Constrain ordinary users to user-private schemas.
3407        To implement this pattern, first ensure that no schemas have
3408        public <literal>CREATE</literal> privileges.  Then, for every user
3409        needing to create non-temporary objects, create a schema with the
3410        same name as that user, for example
3411        <literal>CREATE SCHEMA alice AUTHORIZATION alice</literal>.
3412        (Recall that the default search path starts
3413        with <literal>$user</literal>, which resolves to the user
3414        name. Therefore, if each user has a separate schema, they access
3415        their own schemas by default.)  This pattern is a secure schema
3416        usage pattern unless an untrusted user is the database owner or
3417        has been granted <literal>ADMIN OPTION</literal> on a relevant role,
3418        in which case no secure schema usage pattern exists.
3419       </para>
3420       <!-- A database owner can attack the database's users via "CREATE SCHEMA
3421            trojan; ALTER DATABASE $mydb SET search_path = trojan, public;". -->
3422
3423       <para>
3424        In <productname>PostgreSQL</productname> 15 and later, the default
3425        configuration supports this usage pattern.  In prior versions, or
3426        when using a database that has been upgraded from a prior version,
3427        you will need to remove the public <literal>CREATE</literal>
3428        privilege from the <literal>public</literal> schema (issue
3429        <literal>REVOKE CREATE ON SCHEMA public FROM PUBLIC</literal>).
3430        Then consider auditing the <literal>public</literal> schema for
3431        objects named like objects in schema <literal>pg_catalog</literal>.
3432       </para>
3433       <!-- "DROP SCHEMA public" is inferior to this REVOKE, because pg_dump
3434            doesn't preserve that DROP. -->
3435      </listitem>
3436
3437      <listitem>
3438       <para>
3439        Remove the public schema from the default search path, by modifying
3440        <link linkend="config-setting-configuration-file"><filename>postgresql.conf</filename></link>
3441        or by issuing <literal>ALTER ROLE ALL SET search_path =
3442        "$user"</literal>.  Then, grant privileges to create in the public
3443        schema.  Only qualified names will choose public schema objects.  While
3444        qualified table references are fine, calls to functions in the public
3445        schema <link linkend="typeconv-func">will be unsafe or
3446        unreliable</link>.  If you create functions or extensions in the public
3447        schema, use the first pattern instead.  Otherwise, like the first
3448        pattern, this is secure unless an untrusted user is the database owner
3449        or has been granted <literal>ADMIN OPTION</literal> on a relevant role.
3450       </para>
3451      </listitem>
3452
3453      <listitem>
3454       <para>
3455        Keep the default search path, and grant privileges to create in the
3456        public schema.  All users access the public schema implicitly.  This
3457        simulates the situation where schemas are not available at all, giving
3458        a smooth transition from the non-schema-aware world.  However, this is
3459        never a secure pattern.  It is acceptable only when the database has a
3460        single user or a few mutually-trusting users.  In databases upgraded
3461        from <productname>PostgreSQL</productname> 14 or earlier, this is the
3462        default.
3463       </para>
3464      </listitem>
3465     </itemizedlist>
3466    </para>
3467
3468    <para>
3469     For any pattern, to install shared applications (tables to be used by
3470     everyone, additional functions provided by third parties, etc.), put them
3471     into separate schemas.  Remember to grant appropriate privileges to allow
3472     the other users to access them.  Users can then refer to these additional
3473     objects by qualifying the names with a schema name, or they can put the
3474     additional schemas into their search path, as they choose.
3475    </para>
3476   </sect2>
3477
3478   <sect2 id="ddl-schemas-portability">
3479    <title>Portability</title>
3480
3481    <para>
3482     In the SQL standard, the notion of objects in the same schema
3483     being owned by different users does not exist.  Moreover, some
3484     implementations do not allow you to create schemas that have a
3485     different name than their owner.  In fact, the concepts of schema
3486     and user are nearly equivalent in a database system that
3487     implements only the basic schema support specified in the
3488     standard.  Therefore, many users consider qualified names to
3489     really consist of
3490     <literal><replaceable>user_name</replaceable>.<replaceable>table_name</replaceable></literal>.
3491     This is how <productname>PostgreSQL</productname> will effectively
3492     behave if you create a per-user schema for every user.
3493    </para>
3494
3495    <para>
3496     Also, there is no concept of a <literal>public</literal> schema in the
3497     SQL standard.  For maximum conformance to the standard, you should
3498     not use the <literal>public</literal> schema.
3499    </para>
3500
3501    <para>
3502     Of course, some SQL database systems might not implement schemas
3503     at all, or provide namespace support by allowing (possibly
3504     limited) cross-database access.  If you need to work with those
3505     systems, then maximum portability would be achieved by not using
3506     schemas at all.
3507    </para>
3508   </sect2>
3509  </sect1>
3510
3511  <sect1 id="ddl-inherit">
3512   <title>Inheritance</title>
3513
3514   <indexterm>
3515    <primary>inheritance</primary>
3516   </indexterm>
3517
3518   <indexterm>
3519    <primary>table</primary>
3520    <secondary>inheritance</secondary>
3521   </indexterm>
3522
3523   <para>
3524    <productname>PostgreSQL</productname> implements table inheritance,
3525    which can be a useful tool for database designers.  (SQL:1999 and
3526    later define a type inheritance feature, which differs in many
3527    respects from the features described here.)
3528   </para>
3529
3530   <para>
3531    Let's start with an example: suppose we are trying to build a data
3532    model for cities.  Each state has many cities, but only one
3533    capital. We want to be able to quickly retrieve the capital city
3534    for any particular state. This can be done by creating two tables,
3535    one for state capitals and one for cities that are not
3536    capitals. However, what happens when we want to ask for data about
3537    a city, regardless of whether it is a capital or not? The
3538    inheritance feature can help to resolve this problem. We define the
3539    <structname>capitals</structname> table so that it inherits from
3540    <structname>cities</structname>:
3541
3542 <programlisting>
3543 CREATE TABLE cities (
3544     name            text,
3545     population      float,
3546     elevation       int     -- in feet
3547 );
3548
3549 CREATE TABLE capitals (
3550     state           char(2)
3551 ) INHERITS (cities);
3552 </programlisting>
3553
3554    In this case, the <structname>capitals</structname> table <firstterm>inherits</firstterm>
3555    all the columns of its parent table, <structname>cities</structname>. State
3556    capitals also have an extra column, <structfield>state</structfield>, that shows
3557    their state.
3558   </para>
3559
3560   <para>
3561    In <productname>PostgreSQL</productname>, a table can inherit from
3562    zero or more other tables, and a query can reference either all
3563    rows of a table or all rows of a table plus all of its descendant tables.
3564    The latter behavior is the default.
3565    For example, the following query finds the names of all cities,
3566    including state capitals, that are located at an elevation over
3567    500 feet:
3568
3569 <programlisting>
3570 SELECT name, elevation
3571     FROM cities
3572     WHERE elevation &gt; 500;
3573 </programlisting>
3574
3575    Given the sample data from the <productname>PostgreSQL</productname>
3576    tutorial (see <xref linkend="tutorial-sql-intro"/>), this returns:
3577
3578 <programlisting>
3579    name    | elevation
3580 -----------+-----------
3581  Las Vegas |      2174
3582  Mariposa  |      1953
3583  Madison   |       845
3584 </programlisting>
3585   </para>
3586
3587   <para>
3588    On the other hand, the following query finds all the cities that
3589    are not state capitals and are situated at an elevation over 500 feet:
3590
3591 <programlisting>
3592 SELECT name, elevation
3593     FROM ONLY cities
3594     WHERE elevation &gt; 500;
3595
3596    name    | elevation
3597 -----------+-----------
3598  Las Vegas |      2174
3599  Mariposa  |      1953
3600 </programlisting>
3601   </para>
3602
3603   <para>
3604    Here the <literal>ONLY</literal> keyword indicates that the query
3605    should apply only to <structname>cities</structname>, and not any tables
3606    below <structname>cities</structname> in the inheritance hierarchy.  Many
3607    of the commands that we have already discussed &mdash;
3608    <command>SELECT</command>, <command>UPDATE</command> and
3609    <command>DELETE</command> &mdash; support the
3610    <literal>ONLY</literal> keyword.
3611   </para>
3612
3613   <para>
3614    You can also write the table name with a trailing <literal>*</literal>
3615    to explicitly specify that descendant tables are included:
3616
3617 <programlisting>
3618 SELECT name, elevation
3619     FROM cities*
3620     WHERE elevation &gt; 500;
3621 </programlisting>
3622
3623    Writing <literal>*</literal> is not necessary, since this behavior is always
3624    the default.  However, this syntax is still supported for
3625    compatibility with older releases where the default could be changed.
3626   </para>
3627
3628   <para>
3629    In some cases you might wish to know which table a particular row
3630    originated from. There is a system column called
3631    <structfield>tableoid</structfield> in each table which can tell you the
3632    originating table:
3633
3634 <programlisting>
3635 SELECT c.tableoid, c.name, c.elevation
3636 FROM cities c
3637 WHERE c.elevation &gt; 500;
3638 </programlisting>
3639
3640    which returns:
3641
3642 <programlisting>
3643  tableoid |   name    | elevation
3644 ----------+-----------+-----------
3645    139793 | Las Vegas |      2174
3646    139793 | Mariposa  |      1953
3647    139798 | Madison   |       845
3648 </programlisting>
3649
3650    (If you try to reproduce this example, you will probably get
3651    different numeric OIDs.)  By doing a join with
3652    <structname>pg_class</structname> you can see the actual table names:
3653
3654 <programlisting>
3655 SELECT p.relname, c.name, c.elevation
3656 FROM cities c, pg_class p
3657 WHERE c.elevation &gt; 500 AND c.tableoid = p.oid;
3658 </programlisting>
3659
3660    which returns:
3661
3662 <programlisting>
3663  relname  |   name    | elevation
3664 ----------+-----------+-----------
3665  cities   | Las Vegas |      2174
3666  cities   | Mariposa  |      1953
3667  capitals | Madison   |       845
3668 </programlisting>
3669   </para>
3670
3671   <para>
3672    Another way to get the same effect is to use the <type>regclass</type>
3673    alias type, which will print the table OID symbolically:
3674
3675 <programlisting>
3676 SELECT c.tableoid::regclass, c.name, c.elevation
3677 FROM cities c
3678 WHERE c.elevation &gt; 500;
3679 </programlisting>
3680   </para>
3681
3682   <para>
3683    Inheritance does not automatically propagate data from
3684    <command>INSERT</command> or <command>COPY</command> commands to
3685    other tables in the inheritance hierarchy. In our example, the
3686    following <command>INSERT</command> statement will fail:
3687 <programlisting>
3688 INSERT INTO cities (name, population, elevation, state)
3689 VALUES ('Albany', NULL, NULL, 'NY');
3690 </programlisting>
3691    We might hope that the data would somehow be routed to the
3692    <structname>capitals</structname> table, but this does not happen:
3693    <command>INSERT</command> always inserts into exactly the table
3694    specified.  In some cases it is possible to redirect the insertion
3695    using a rule (see <xref linkend="rules"/>).  However that does not
3696    help for the above case because the <structname>cities</structname> table
3697    does not contain the column <structfield>state</structfield>, and so the
3698    command will be rejected before the rule can be applied.
3699   </para>
3700
3701   <para>
3702    All check constraints and not-null constraints on a parent table are
3703    automatically inherited by its children, unless explicitly specified
3704    otherwise with <literal>NO INHERIT</literal> clauses.  Other types of constraints
3705    (unique, primary key, and foreign key constraints) are not inherited.
3706   </para>
3707
3708   <para>
3709    A table can inherit from more than one parent table, in which case it has
3710    the union of the columns defined by the parent tables.  Any columns
3711    declared in the child table's definition are added to these.  If the
3712    same column name appears in multiple parent tables, or in both a parent
3713    table and the child's definition, then these columns are <quote>merged</quote>
3714    so that there is only one such column in the child table.  To be merged,
3715    columns must have the same data types, else an error is raised.
3716    Inheritable check constraints and not-null constraints are merged in a
3717    similar fashion.  Thus, for example, a merged column will be marked
3718    not-null if any one of the column definitions it came from is marked
3719    not-null.  Check constraints are merged if they have the same name,
3720    and the merge will fail if their conditions are different.
3721   </para>
3722
3723   <para>
3724    Table inheritance is typically established when the child table is
3725    created, using the <literal>INHERITS</literal> clause of the
3726    <link linkend="sql-createtable"><command>CREATE TABLE</command></link>
3727    statement.
3728    Alternatively, a table which is already defined in a compatible way can
3729    have a new parent relationship added, using the <literal>INHERIT</literal>
3730    variant of <link linkend="sql-altertable"><command>ALTER TABLE</command></link>.
3731    To do this the new child table must already include columns with
3732    the same names and types as the columns of the parent. It must also include
3733    check constraints with the same names and check expressions as those of the
3734    parent. Similarly an inheritance link can be removed from a child using the
3735    <literal>NO INHERIT</literal> variant of <command>ALTER TABLE</command>.
3736    Dynamically adding and removing inheritance links like this can be useful
3737    when the inheritance relationship is being used for table
3738    partitioning (see <xref linkend="ddl-partitioning"/>).
3739   </para>
3740
3741   <para>
3742    One convenient way to create a compatible table that will later be made
3743    a new child is to use the <literal>LIKE</literal> clause in <command>CREATE
3744    TABLE</command>. This creates a new table with the same columns as
3745    the source table. If there are any <literal>CHECK</literal>
3746    constraints defined on the source table, the <literal>INCLUDING
3747    CONSTRAINTS</literal> option to <literal>LIKE</literal> should be
3748    specified, as the new child must have constraints matching the parent
3749    to be considered compatible.
3750   </para>
3751
3752   <para>
3753    A parent table cannot be dropped while any of its children remain. Neither
3754    can columns or check constraints of child tables be dropped or altered
3755    if they are inherited
3756    from any parent tables. If you wish to remove a table and all of its
3757    descendants, one easy way is to drop the parent table with the
3758    <literal>CASCADE</literal> option (see <xref linkend="ddl-depend"/>).
3759   </para>
3760
3761   <para>
3762    <command>ALTER TABLE</command> will
3763    propagate any changes in column data definitions and check
3764    constraints down the inheritance hierarchy.  Again, dropping
3765    columns that are depended on by other tables is only possible when using
3766    the <literal>CASCADE</literal> option. <command>ALTER
3767    TABLE</command> follows the same rules for duplicate column merging
3768    and rejection that apply during <command>CREATE TABLE</command>.
3769   </para>
3770
3771   <para>
3772    Inherited queries perform access permission checks on the parent table
3773    only.  Thus, for example, granting <literal>UPDATE</literal> permission on
3774    the <structname>cities</structname> table implies permission to update rows in
3775    the <structname>capitals</structname> table as well, when they are
3776    accessed through <structname>cities</structname>.  This preserves the appearance
3777    that the data is (also) in the parent table.  But
3778    the <structname>capitals</structname> table could not be updated directly
3779    without an additional grant.  In a similar way, the parent table's row
3780    security policies (see <xref linkend="ddl-rowsecurity"/>) are applied to
3781    rows coming from child tables during an inherited query.  A child table's
3782    policies, if any, are applied only when it is the table explicitly named
3783    in the query; and in that case, any policies attached to its parent(s) are
3784    ignored.
3785   </para>
3786
3787   <para>
3788    Foreign tables (see <xref linkend="ddl-foreign-data"/>) can also
3789    be part of inheritance hierarchies, either as parent or child
3790    tables, just as regular tables can be.  If a foreign table is part
3791    of an inheritance hierarchy then any operations not supported by
3792    the foreign table are not supported on the whole hierarchy either.
3793   </para>
3794
3795  <sect2 id="ddl-inherit-caveats">
3796   <title>Caveats</title>
3797
3798   <para>
3799    Note that not all SQL commands are able to work on
3800    inheritance hierarchies.  Commands that are used for data querying,
3801    data modification, or schema modification
3802    (e.g., <literal>SELECT</literal>, <literal>UPDATE</literal>, <literal>DELETE</literal>,
3803    most variants of <literal>ALTER TABLE</literal>, but
3804    not <literal>INSERT</literal> or <literal>ALTER TABLE ...
3805    RENAME</literal>) typically default to including child tables and
3806    support the <literal>ONLY</literal> notation to exclude them.
3807    Commands that do database maintenance and tuning
3808    (e.g., <literal>REINDEX</literal>, <literal>VACUUM</literal>)
3809    typically only work on individual, physical tables and do not
3810    support recursing over inheritance hierarchies.  The respective
3811    behavior of each individual command is documented in its reference
3812    page (<xref linkend="sql-commands"/>).
3813   </para>
3814
3815   <para>
3816    A serious limitation of the inheritance feature is that indexes (including
3817    unique constraints) and foreign key constraints only apply to single
3818    tables, not to their inheritance children. This is true on both the
3819    referencing and referenced sides of a foreign key constraint. Thus,
3820    in the terms of the above example:
3821
3822    <itemizedlist>
3823     <listitem>
3824      <para>
3825       If we declared <structname>cities</structname>.<structfield>name</structfield> to be
3826       <literal>UNIQUE</literal> or a <literal>PRIMARY KEY</literal>, this would not stop the
3827       <structname>capitals</structname> table from having rows with names duplicating
3828       rows in <structname>cities</structname>.  And those duplicate rows would by
3829       default show up in queries from <structname>cities</structname>.  In fact, by
3830       default <structname>capitals</structname> would have no unique constraint at all,
3831       and so could contain multiple rows with the same name.
3832       You could add a unique constraint to <structname>capitals</structname>, but this
3833       would not prevent duplication compared to <structname>cities</structname>.
3834      </para>
3835     </listitem>
3836
3837     <listitem>
3838      <para>
3839       Similarly, if we were to specify that
3840       <structname>cities</structname>.<structfield>name</structfield> <literal>REFERENCES</literal> some
3841       other table, this constraint would not automatically propagate to
3842       <structname>capitals</structname>.  In this case you could work around it by
3843       manually adding the same <literal>REFERENCES</literal> constraint to
3844       <structname>capitals</structname>.
3845      </para>
3846     </listitem>
3847
3848     <listitem>
3849      <para>
3850       Specifying that another table's column <literal>REFERENCES
3851       cities(name)</literal> would allow the other table to contain city names, but
3852       not capital names.  There is no good workaround for this case.
3853      </para>
3854     </listitem>
3855    </itemizedlist>
3856
3857    Some functionality not implemented for inheritance hierarchies is
3858    implemented for declarative partitioning.
3859    Considerable care is needed in deciding whether partitioning with legacy
3860    inheritance is useful for your application.
3861   </para>
3862
3863    </sect2>
3864   </sect1>
3865
3866   <sect1 id="ddl-partitioning">
3867    <title>Table Partitioning</title>
3868
3869    <indexterm>
3870     <primary>partitioning</primary>
3871    </indexterm>
3872
3873    <indexterm>
3874     <primary>table</primary>
3875     <secondary>partitioning</secondary>
3876    </indexterm>
3877
3878    <indexterm>
3879     <primary>partitioned table</primary>
3880    </indexterm>
3881
3882    <para>
3883     <productname>PostgreSQL</productname> supports basic table
3884     partitioning. This section describes why and how to implement
3885     partitioning as part of your database design.
3886    </para>
3887
3888    <sect2 id="ddl-partitioning-overview">
3889      <title>Overview</title>
3890
3891     <para>
3892      Partitioning refers to splitting what is logically one large table into
3893      smaller physical pieces.  Partitioning can provide several benefits:
3894     <itemizedlist>
3895      <listitem>
3896       <para>
3897        Query performance can be improved dramatically in certain situations,
3898        particularly when most of the heavily accessed rows of the table are in a
3899        single partition or a small number of partitions.  Partitioning
3900        effectively substitutes for the upper tree levels of indexes,
3901        making it more likely that the heavily-used parts of the indexes
3902        fit in memory.
3903       </para>
3904      </listitem>
3905
3906      <listitem>
3907       <para>
3908        When queries or updates access a large percentage of a single
3909        partition, performance can be improved by using a
3910        sequential scan of that partition instead of using an
3911        index, which would require random-access reads scattered across the
3912        whole table.
3913       </para>
3914      </listitem>
3915
3916      <listitem>
3917       <para>
3918        Bulk loads and deletes can be accomplished by adding or removing
3919        partitions, if the usage pattern is accounted for in the
3920        partitioning design.  Dropping an individual partition
3921        using <command>DROP TABLE</command>, or doing <command>ALTER TABLE
3922        DETACH PARTITION</command>, is far faster than a bulk
3923        operation.  These commands also entirely avoid the
3924        <command>VACUUM</command> overhead caused by a bulk <command>DELETE</command>.
3925       </para>
3926      </listitem>
3927
3928      <listitem>
3929       <para>
3930        Seldom-used data can be migrated to cheaper and slower storage media.
3931       </para>
3932      </listitem>
3933     </itemizedlist>
3934
3935      These benefits will normally be worthwhile only when a table would
3936      otherwise be very large. The exact point at which a table will
3937      benefit from partitioning depends on the application, although a
3938      rule of thumb is that the size of the table should exceed the physical
3939      memory of the database server.
3940     </para>
3941
3942     <para>
3943      <productname>PostgreSQL</productname> offers built-in support for the
3944      following forms of partitioning:
3945
3946      <variablelist>
3947       <varlistentry id="ddl-partitioning-overview-range">
3948        <term>Range Partitioning</term>
3949
3950        <listitem>
3951         <para>
3952          The table is partitioned into <quote>ranges</quote> defined
3953          by a key column or set of columns, with no overlap between
3954          the ranges of values assigned to different partitions.  For
3955          example, one might partition by date ranges, or by ranges of
3956          identifiers for particular business objects.
3957          Each range's bounds are understood as being inclusive at the
3958          lower end and exclusive at the upper end.  For example, if one
3959          partition's range is from <literal>1</literal>
3960          to <literal>10</literal>, and the next one's range is
3961          from <literal>10</literal> to <literal>20</literal>, then
3962          value <literal>10</literal> belongs to the second partition not
3963          the first.
3964         </para>
3965        </listitem>
3966       </varlistentry>
3967
3968       <varlistentry id="ddl-partitioning-overview-list">
3969        <term>List Partitioning</term>
3970
3971        <listitem>
3972         <para>
3973          The table is partitioned by explicitly listing which key value(s)
3974          appear in each partition.
3975         </para>
3976        </listitem>
3977       </varlistentry>
3978
3979       <varlistentry id="ddl-partitioning-overview-hash">
3980        <term>Hash Partitioning</term>
3981
3982        <listitem>
3983         <para>
3984          The table is partitioned by specifying a modulus and a remainder for
3985          each partition. Each partition will hold the rows for which the hash
3986          value of the partition key divided by the specified modulus will
3987          produce the specified remainder.
3988         </para>
3989        </listitem>
3990       </varlistentry>
3991      </variablelist>
3992
3993      If your application needs to use other forms of partitioning not listed
3994      above, alternative methods such as inheritance and
3995      <literal>UNION ALL</literal> views can be used instead.  Such methods
3996      offer flexibility but do not have some of the performance benefits
3997      of built-in declarative partitioning.
3998     </para>
3999    </sect2>
4000
4001   <sect2 id="ddl-partitioning-declarative">
4002    <title>Declarative Partitioning</title>
4003
4004    <para>
4005     <productname>PostgreSQL</productname> allows you to declare
4006     that a table is divided into partitions.  The table that is divided
4007     is referred to as a <firstterm>partitioned table</firstterm>.  The
4008     declaration includes the <firstterm>partitioning method</firstterm>
4009     as described above, plus a list of columns or expressions to be used
4010     as the <firstterm>partition key</firstterm>.
4011    </para>
4012
4013    <para>
4014     The partitioned table itself is a <quote>virtual</quote> table having
4015     no storage of its own.  Instead, the storage belongs
4016     to <firstterm>partitions</firstterm>, which are otherwise-ordinary
4017     tables associated with the partitioned table.
4018     Each partition stores a subset of the data as defined by its
4019     <firstterm>partition bounds</firstterm>.
4020     All rows inserted into a partitioned table will be routed to the
4021     appropriate one of the partitions based on the values of the partition
4022     key column(s).
4023     Updating the partition key of a row will cause it to be moved into a
4024     different partition if it no longer satisfies the partition bounds
4025     of its original partition.
4026    </para>
4027
4028    <para>
4029     Partitions may themselves be defined as partitioned tables, resulting
4030     in <firstterm>sub-partitioning</firstterm>.  Although all partitions
4031     must have the same columns as their partitioned parent, partitions may
4032     have their
4033     own indexes, constraints and default values, distinct from those of other
4034     partitions.  See <xref linkend="sql-createtable"/> for more details on
4035     creating partitioned tables and partitions.
4036    </para>
4037
4038    <para>
4039     It is not possible to turn a regular table into a partitioned table or
4040     vice versa.  However, it is possible to add an existing regular or
4041     partitioned table as a partition of a partitioned table, or remove a
4042     partition from a partitioned table turning it into a standalone table;
4043     this can simplify and speed up many maintenance processes.
4044     See <xref linkend="sql-altertable"/> to learn more about the
4045     <command>ATTACH PARTITION</command> and <command>DETACH PARTITION</command>
4046     sub-commands.
4047    </para>
4048
4049    <para>
4050     Partitions can also be <link linkend="ddl-foreign-data">foreign
4051     tables</link>, although considerable care is needed because it is then
4052     the user's responsibility that the contents of the foreign table
4053     satisfy the partitioning rule.  There are some other restrictions as
4054     well.  See <xref linkend="sql-createforeigntable"/> for more
4055     information.
4056    </para>
4057
4058    <sect3 id="ddl-partitioning-declarative-example">
4059     <title>Example</title>
4060
4061    <para>
4062     Suppose we are constructing a database for a large ice cream company.
4063     The company measures peak temperatures every day as well as ice cream
4064     sales in each region. Conceptually, we want a table like:
4065
4066 <programlisting>
4067 CREATE TABLE measurement (
4068     city_id         int not null,
4069     logdate         date not null,
4070     peaktemp        int,
4071     unitsales       int
4072 );
4073 </programlisting>
4074
4075     We know that most queries will access just the last week's, month's or
4076     quarter's data, since the main use of this table will be to prepare
4077     online reports for management.  To reduce the amount of old data that
4078     needs to be stored, we decide to keep only the most recent 3 years
4079     worth of data. At the beginning of each month we will remove the oldest
4080     month's data.  In this situation we can use partitioning to help us meet
4081     all of our different requirements for the measurements table.
4082    </para>
4083
4084    <para>
4085     To use declarative partitioning in this case, use the following steps:
4086
4087     <orderedlist spacing="compact">
4088      <listitem>
4089       <para>
4090        Create the <structname>measurement</structname> table as a partitioned
4091        table by specifying the <literal>PARTITION BY</literal> clause, which
4092        includes the partitioning method (<literal>RANGE</literal> in this
4093        case) and the list of column(s) to use as the partition key.
4094
4095 <programlisting>
4096 CREATE TABLE measurement (
4097     city_id         int not null,
4098     logdate         date not null,
4099     peaktemp        int,
4100     unitsales       int
4101 ) PARTITION BY RANGE (logdate);
4102 </programlisting>
4103       </para>
4104      </listitem>
4105
4106      <listitem>
4107       <para>
4108        Create partitions.  Each partition's definition must specify bounds
4109        that correspond to the partitioning method and partition key of the
4110        parent.  Note that specifying bounds such that the new partition's
4111        values would overlap with those in one or more existing partitions will
4112        cause an error.
4113       </para>
4114
4115       <para>
4116        Partitions thus created are in every way normal
4117        <productname>PostgreSQL</productname>
4118        tables (or, possibly, foreign tables).  It is possible to specify a
4119        tablespace and storage parameters for each partition separately.
4120       </para>
4121
4122       <para>
4123        For our example, each partition should hold one month's worth of
4124        data, to match the requirement of deleting one month's data at a
4125        time.  So the commands might look like:
4126
4127 <programlisting>
4128 CREATE TABLE measurement_y2006m02 PARTITION OF measurement
4129     FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
4130
4131 CREATE TABLE measurement_y2006m03 PARTITION OF measurement
4132     FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
4133
4134 ...
4135 CREATE TABLE measurement_y2007m11 PARTITION OF measurement
4136     FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');
4137
4138 CREATE TABLE measurement_y2007m12 PARTITION OF measurement
4139     FOR VALUES FROM ('2007-12-01') TO ('2008-01-01')
4140     TABLESPACE fasttablespace;
4141
4142 CREATE TABLE measurement_y2008m01 PARTITION OF measurement
4143     FOR VALUES FROM ('2008-01-01') TO ('2008-02-01')
4144     WITH (parallel_workers = 4)
4145     TABLESPACE fasttablespace;
4146 </programlisting>
4147
4148        (Recall that adjacent partitions can share a bound value, since
4149        range upper bounds are treated as exclusive bounds.)
4150       </para>
4151
4152       <para>
4153        If you wish to implement sub-partitioning, again specify the
4154        <literal>PARTITION BY</literal> clause in the commands used to create
4155        individual partitions, for example:
4156
4157 <programlisting>
4158 CREATE TABLE measurement_y2006m02 PARTITION OF measurement
4159     FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
4160     PARTITION BY RANGE (peaktemp);
4161 </programlisting>
4162
4163        After creating partitions of <structname>measurement_y2006m02</structname>,
4164        any data inserted into <structname>measurement</structname> that is mapped to
4165        <structname>measurement_y2006m02</structname> (or data that is
4166        directly inserted into <structname>measurement_y2006m02</structname>,
4167        which is allowed provided its partition constraint is satisfied)
4168        will be further redirected to one of its
4169        partitions based on the <structfield>peaktemp</structfield> column.  The partition
4170        key specified may overlap with the parent's partition key, although
4171        care should be taken when specifying the bounds of a sub-partition
4172        such that the set of data it accepts constitutes a subset of what
4173        the partition's own bounds allow; the system does not try to check
4174        whether that's really the case.
4175       </para>
4176
4177       <para>
4178        Inserting data into the parent table that does not map
4179        to one of the existing partitions will cause an error; an appropriate
4180        partition must be added manually.
4181       </para>
4182
4183       <para>
4184        It is not necessary to manually create table constraints describing
4185        the partition boundary conditions for partitions.  Such constraints
4186        will be created automatically.
4187       </para>
4188      </listitem>
4189
4190      <listitem>
4191       <para>
4192        Create an index on the key column(s), as well as any other indexes you
4193        might want, on the partitioned table. (The key index is not strictly
4194        necessary, but in most scenarios it is helpful.)
4195        This automatically creates a matching index on each partition, and
4196        any partitions you create or attach later will also have such an
4197        index.
4198        An index or unique constraint declared on a partitioned table
4199        is <quote>virtual</quote> in the same way that the partitioned table
4200        is: the actual data is in child indexes on the individual partition
4201        tables.
4202
4203 <programlisting>
4204 CREATE INDEX ON measurement (logdate);
4205 </programlisting>
4206       </para>
4207      </listitem>
4208
4209       <listitem>
4210        <para>
4211         Ensure that the <xref linkend="guc-enable-partition-pruning"/>
4212         configuration parameter is not disabled in <filename>postgresql.conf</filename>.
4213         If it is, queries will not be optimized as desired.
4214        </para>
4215       </listitem>
4216     </orderedlist>
4217    </para>
4218
4219    <para>
4220     In the above example we would be creating a new partition each month, so
4221     it might be wise to write a script that generates the required DDL
4222     automatically.
4223    </para>
4224    </sect3>
4225
4226    <sect3 id="ddl-partitioning-declarative-maintenance">
4227     <title>Partition Maintenance</title>
4228
4229     <para>
4230       Normally the set of partitions established when initially defining the
4231       table is not intended to remain static.  It is common to want to
4232       remove partitions holding old data and periodically add new partitions for
4233       new data. One of the most important advantages of partitioning is
4234       precisely that it allows this otherwise painful task to be executed
4235       nearly instantaneously by manipulating the partition structure, rather
4236       than physically moving large amounts of data around.
4237     </para>
4238
4239     <para>
4240      The simplest option for removing old data is to drop the partition that
4241      is no longer necessary:
4242 <programlisting>
4243 DROP TABLE measurement_y2006m02;
4244 </programlisting>
4245      This can very quickly delete millions of records because it doesn't have
4246      to individually delete every record.  Note however that the above command
4247      requires taking an <literal>ACCESS EXCLUSIVE</literal> lock on the parent
4248      table.
4249     </para>
4250
4251    <para>
4252      Another option that is often preferable is to remove the partition from
4253      the partitioned table but retain access to it as a table in its own
4254      right.  This has two forms:
4255
4256 <programlisting>
4257 ALTER TABLE measurement DETACH PARTITION measurement_y2006m02;
4258 ALTER TABLE measurement DETACH PARTITION measurement_y2006m02 CONCURRENTLY;
4259 </programlisting>
4260
4261      These allow further operations to be performed on the data before
4262      it is dropped. For example, this is often a useful time to back up
4263      the data using <command>COPY</command>, <application>pg_dump</application>, or
4264      similar tools. It might also be a useful time to aggregate data
4265      into smaller formats, perform other data manipulations, or run
4266      reports.  The first form of the command requires an
4267      <literal>ACCESS EXCLUSIVE</literal> lock on the parent table.
4268      Adding the <literal>CONCURRENTLY</literal> qualifier as in the second
4269      form allows the detach operation to require only
4270      <literal>SHARE UPDATE EXCLUSIVE</literal> lock on the parent table, but see
4271      <link linkend="sql-altertable-detach-partition"><literal>ALTER TABLE ... DETACH PARTITION</literal></link>
4272      for details on the restrictions.
4273    </para>
4274
4275    <para>
4276      Similarly we can add a new partition to handle new data. We can create an
4277      empty partition in the partitioned table just as the original partitions
4278      were created above:
4279
4280 <programlisting>
4281 CREATE TABLE measurement_y2008m02 PARTITION OF measurement
4282     FOR VALUES FROM ('2008-02-01') TO ('2008-03-01')
4283     TABLESPACE fasttablespace;
4284 </programlisting>
4285
4286      As an alternative to creating a new partition, it is sometimes more
4287      convenient to create a new table separate from the partition structure
4288      and attach it as a partition later.  This allows new data to be loaded,
4289      checked, and transformed prior to it appearing in the partitioned table.
4290      Moreover, the <literal>ATTACH PARTITION</literal> operation requires
4291      only a <literal>SHARE UPDATE EXCLUSIVE</literal> lock on the
4292      partitioned table rather than the <literal>ACCESS EXCLUSIVE</literal>
4293      lock required by <command>CREATE TABLE ... PARTITION OF</command>,
4294      so it is more friendly to concurrent operations on the partitioned table;
4295      see <link linkend="sql-altertable-attach-partition"><literal>ALTER TABLE ... ATTACH PARTITION</literal></link>
4296      for additional details.  The
4297      <link linkend="sql-createtable-parms-like"><literal>CREATE TABLE ... LIKE</literal></link>
4298      option can be helpful to avoid tediously repeating the parent table's
4299      definition; for example:
4300
4301 <programlisting>
4302 CREATE TABLE measurement_y2008m02
4303   (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS)
4304   TABLESPACE fasttablespace;
4305
4306 ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
4307    CHECK ( logdate &gt;= DATE '2008-02-01' AND logdate &lt; DATE '2008-03-01' );
4308
4309 \copy measurement_y2008m02 from 'measurement_y2008m02'
4310 -- possibly some other data preparation work
4311
4312 ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
4313     FOR VALUES FROM ('2008-02-01') TO ('2008-03-01' );
4314 </programlisting>
4315     </para>
4316
4317     <para>
4318      Note that when running the <command>ATTACH PARTITION</command> command,
4319      the table will be scanned to validate the partition constraint while
4320      holding an <literal>ACCESS EXCLUSIVE</literal> lock on that partition.
4321      As shown above, it is recommended to avoid this scan by creating a
4322      <literal>CHECK</literal> constraint matching the expected partition
4323      constraint on the table prior to attaching it.  Once the
4324      <command>ATTACH PARTITION</command> is complete, it is recommended to drop
4325      the now-redundant <literal>CHECK</literal> constraint.
4326      If the table being attached is itself a partitioned table, then each of its
4327      sub-partitions will be recursively locked and scanned until either a
4328      suitable <literal>CHECK</literal> constraint is encountered or the leaf
4329      partitions are reached.
4330     </para>
4331
4332     <para>
4333      Similarly, if the partitioned table has a <literal>DEFAULT</literal>
4334      partition, it is recommended to create a <literal>CHECK</literal>
4335      constraint which excludes the to-be-attached partition's constraint.  If
4336      this is not done, the <literal>DEFAULT</literal> partition will be
4337      scanned to verify that it contains no records which should be located in
4338      the partition being attached.  This operation will be performed whilst
4339      holding an <literal>ACCESS EXCLUSIVE</literal> lock on the <literal>
4340      DEFAULT</literal> partition.  If the <literal>DEFAULT</literal> partition
4341      is itself a partitioned table, then each of its partitions will be
4342      recursively checked in the same way as the table being attached, as
4343      mentioned above.
4344     </para>
4345
4346     <para>
4347      As mentioned earlier, it is possible to create indexes on partitioned
4348      tables so that they are applied automatically to the entire hierarchy.
4349      This can be very convenient as not only will all existing partitions be
4350      indexed, but any future partitions will be as well.  However, one
4351      limitation when creating new indexes on partitioned tables is that it
4352      is not possible to use the <literal>CONCURRENTLY</literal>
4353      qualifier, which could lead to long lock times.  To avoid this, you can
4354      use <command>CREATE INDEX ON ONLY</command> the partitioned table, which
4355      creates the new index marked as invalid, preventing automatic application
4356      to existing partitions.  Instead, indexes can then be created individually
4357      on each partition using <literal>CONCURRENTLY</literal> and
4358      <firstterm>attached</firstterm> to the partitioned index on the parent
4359      using <command>ALTER INDEX ... ATTACH PARTITION</command>.  Once indexes for
4360      all the partitions are attached to the parent index, the parent index will
4361      be marked valid automatically.  Example:
4362 <programlisting>
4363 CREATE INDEX measurement_usls_idx ON ONLY measurement (unitsales);
4364
4365 CREATE INDEX CONCURRENTLY measurement_usls_200602_idx
4366     ON measurement_y2006m02 (unitsales);
4367 ALTER INDEX measurement_usls_idx
4368     ATTACH PARTITION measurement_usls_200602_idx;
4369 ...
4370 </programlisting>
4371
4372      This technique can be used with <literal>UNIQUE</literal> and
4373      <literal>PRIMARY KEY</literal> constraints too; the indexes are created
4374      implicitly when the constraint is created.  Example:
4375 <programlisting>
4376 ALTER TABLE ONLY measurement ADD UNIQUE (city_id, logdate);
4377
4378 ALTER TABLE measurement_y2006m02 ADD UNIQUE (city_id, logdate);
4379 ALTER INDEX measurement_city_id_logdate_key
4380     ATTACH PARTITION measurement_y2006m02_city_id_logdate_key;
4381 ...
4382 </programlisting>
4383     </para>
4384    </sect3>
4385
4386    <sect3 id="ddl-partitioning-declarative-limitations">
4387     <title>Limitations</title>
4388
4389    <para>
4390     The following limitations apply to partitioned tables:
4391     <itemizedlist>
4392      <listitem>
4393       <para>
4394        To create a unique or primary key constraint on a partitioned table,
4395        the partition keys must not include any expressions or function calls
4396        and the constraint's columns must include all of the partition key
4397        columns.  This limitation exists because the individual indexes making
4398        up the constraint can only directly enforce uniqueness within their own
4399        partitions; therefore, the partition structure itself must guarantee
4400        that there are not duplicates in different partitions.
4401       </para>
4402      </listitem>
4403
4404      <listitem>
4405       <para>
4406        Similarly an exclusion constraint must include all the
4407        partition key columns. Furthermore the constraint must compare those
4408        columns for equality (not e.g. <literal>&amp;&amp;</literal>).
4409        Again, this limitation stems from not being able to enforce
4410        cross-partition restrictions. The constraint may include additional
4411        columns that aren't part of the partition key, and it may compare
4412        those with any operators you like.
4413       </para>
4414      </listitem>
4415
4416      <listitem>
4417       <para>
4418        <literal>BEFORE ROW</literal> triggers on <literal>INSERT</literal>
4419        cannot change which partition is the final destination for a new row.
4420       </para>
4421      </listitem>
4422
4423      <listitem>
4424       <para>
4425        Mixing temporary and permanent relations in the same partition tree is
4426        not allowed.  Hence, if the partitioned table is permanent, so must be
4427        its partitions and likewise if the partitioned table is temporary.  When
4428        using temporary relations, all members of the partition tree have to be
4429        from the same session.
4430       </para>
4431      </listitem>
4432     </itemizedlist>
4433     </para>
4434
4435     <para>
4436      Individual partitions are linked to their partitioned table using
4437      inheritance behind-the-scenes.  However, it is not possible to use
4438      all of the generic features of inheritance with declaratively
4439      partitioned tables or their partitions, as discussed below.  Notably,
4440      a partition cannot have any parents other than the partitioned table
4441      it is a partition of, nor can a table inherit from both a partitioned
4442      table and a regular table.  That means partitioned tables and their
4443      partitions never share an inheritance hierarchy with regular tables.
4444     </para>
4445
4446     <para>
4447      Since a partition hierarchy consisting of the partitioned table and its
4448      partitions is still an inheritance hierarchy,
4449      <structfield>tableoid</structfield> and all the normal rules of
4450      inheritance apply as described in <xref linkend="ddl-inherit"/>, with
4451      a few exceptions:
4452
4453      <itemizedlist>
4454       <listitem>
4455        <para>
4456         Partitions cannot have columns that are not present in the parent.  It
4457         is not possible to specify columns when creating partitions with
4458         <command>CREATE TABLE</command>, nor is it possible to add columns to
4459         partitions after-the-fact using <command>ALTER TABLE</command>.
4460         Tables may be added as a partition with <command>ALTER TABLE
4461         ... ATTACH PARTITION</command> only if their columns exactly match
4462         the parent.
4463        </para>
4464       </listitem>
4465
4466       <listitem>
4467        <para>
4468         Both <literal>CHECK</literal> and <literal>NOT NULL</literal>
4469         constraints of a partitioned table are always inherited by all its
4470         partitions.  <literal>CHECK</literal> constraints that are marked
4471         <literal>NO INHERIT</literal> are not allowed to be created on
4472         partitioned tables.
4473         You cannot drop a <literal>NOT NULL</literal> constraint on a
4474         partition's column if the same constraint is present in the parent
4475         table.
4476        </para>
4477       </listitem>
4478
4479       <listitem>
4480        <para>
4481         Using <literal>ONLY</literal> to add or drop a constraint on only
4482         the partitioned table is supported as long as there are no
4483         partitions.  Once partitions exist, using <literal>ONLY</literal>
4484         will result in an error for any constraints other than
4485         <literal>UNIQUE</literal> and <literal>PRIMARY KEY</literal>.
4486         Instead, constraints on the partitions
4487         themselves can be added and (if they are not present in the parent
4488         table) dropped.
4489        </para>
4490       </listitem>
4491
4492       <listitem>
4493        <para>
4494         As a partitioned table does not have any data itself, attempts to use
4495         <command>TRUNCATE</command> <literal>ONLY</literal> on a partitioned
4496         table will always return an error.
4497        </para>
4498       </listitem>
4499      </itemizedlist>
4500     </para>
4501     </sect3>
4502    </sect2>
4503
4504    <sect2 id="ddl-partitioning-using-inheritance">
4505     <title>Partitioning Using Inheritance</title>
4506
4507     <para>
4508      While the built-in declarative partitioning is suitable for most
4509      common use cases, there are some circumstances where a more flexible
4510      approach may be useful.  Partitioning can be implemented using table
4511      inheritance, which allows for several features not supported
4512      by declarative partitioning, such as:
4513
4514      <itemizedlist>
4515       <listitem>
4516        <para>
4517         For declarative partitioning, partitions must have exactly the same set
4518         of columns as the partitioned table, whereas with table inheritance,
4519         child tables may have extra columns not present in the parent.
4520        </para>
4521       </listitem>
4522
4523       <listitem>
4524        <para>
4525         Table inheritance allows for multiple inheritance.
4526        </para>
4527       </listitem>
4528
4529       <listitem>
4530        <para>
4531         Declarative partitioning only supports range, list and hash
4532         partitioning, whereas table inheritance allows data to be divided in a
4533         manner of the user's choosing.  (Note, however, that if constraint
4534         exclusion is unable to prune child tables effectively, query performance
4535         might be poor.)
4536        </para>
4537       </listitem>
4538      </itemizedlist>
4539     </para>
4540
4541     <sect3 id="ddl-partitioning-inheritance-example">
4542      <title>Example</title>
4543
4544      <para>
4545       This example builds a partitioning structure equivalent to the
4546       declarative partitioning example above.  Use
4547       the following steps:
4548
4549       <orderedlist spacing="compact">
4550        <listitem>
4551         <para>
4552          Create the <quote>root</quote> table, from which all of the
4553          <quote>child</quote> tables will inherit.  This table will contain no data.  Do not
4554          define any check constraints on this table, unless you intend them
4555          to be applied equally to all child tables.  There is no point in
4556          defining any indexes or unique constraints on it, either.  For our
4557          example, the root table is the <structname>measurement</structname>
4558          table as originally defined:
4559
4560 <programlisting>
4561 CREATE TABLE measurement (
4562     city_id         int not null,
4563     logdate         date not null,
4564     peaktemp        int,
4565     unitsales       int
4566 );
4567 </programlisting>
4568         </para>
4569        </listitem>
4570
4571        <listitem>
4572         <para>
4573          Create several <quote>child</quote> tables that each inherit from
4574          the root table.  Normally, these tables will not add any columns
4575          to the set inherited from the root.  Just as with declarative
4576          partitioning, these tables are in every way normal
4577          <productname>PostgreSQL</productname> tables (or foreign tables).
4578         </para>
4579
4580         <para>
4581 <programlisting>
4582 CREATE TABLE measurement_y2006m02 () INHERITS (measurement);
4583 CREATE TABLE measurement_y2006m03 () INHERITS (measurement);
4584 ...
4585 CREATE TABLE measurement_y2007m11 () INHERITS (measurement);
4586 CREATE TABLE measurement_y2007m12 () INHERITS (measurement);
4587 CREATE TABLE measurement_y2008m01 () INHERITS (measurement);
4588 </programlisting>
4589         </para>
4590        </listitem>
4591
4592        <listitem>
4593         <para>
4594          Add non-overlapping table constraints to the child tables to
4595          define the allowed key values in each.
4596         </para>
4597
4598         <para>
4599          Typical examples would be:
4600 <programlisting>
4601 CHECK ( x = 1 )
4602 CHECK ( county IN ( 'Oxfordshire', 'Buckinghamshire', 'Warwickshire' ))
4603 CHECK ( outletID &gt;= 100 AND outletID &lt; 200 )
4604 </programlisting>
4605          Ensure that the constraints guarantee that there is no overlap
4606          between the key values permitted in different child tables.  A common
4607          mistake is to set up range constraints like:
4608 <programlisting>
4609 CHECK ( outletID BETWEEN 100 AND 200 )
4610 CHECK ( outletID BETWEEN 200 AND 300 )
4611 </programlisting>
4612          This is wrong since it is not clear which child table the key
4613          value 200 belongs in.
4614          Instead, ranges should be defined in this style:
4615
4616 <programlisting>
4617 CREATE TABLE measurement_y2006m02 (
4618     CHECK ( logdate &gt;= DATE '2006-02-01' AND logdate &lt; DATE '2006-03-01' )
4619 ) INHERITS (measurement);
4620
4621 CREATE TABLE measurement_y2006m03 (
4622     CHECK ( logdate &gt;= DATE '2006-03-01' AND logdate &lt; DATE '2006-04-01' )
4623 ) INHERITS (measurement);
4624
4625 ...
4626 CREATE TABLE measurement_y2007m11 (
4627     CHECK ( logdate &gt;= DATE '2007-11-01' AND logdate &lt; DATE '2007-12-01' )
4628 ) INHERITS (measurement);
4629
4630 CREATE TABLE measurement_y2007m12 (
4631     CHECK ( logdate &gt;= DATE '2007-12-01' AND logdate &lt; DATE '2008-01-01' )
4632 ) INHERITS (measurement);
4633
4634 CREATE TABLE measurement_y2008m01 (
4635     CHECK ( logdate &gt;= DATE '2008-01-01' AND logdate &lt; DATE '2008-02-01' )
4636 ) INHERITS (measurement);
4637 </programlisting>
4638         </para>
4639        </listitem>
4640
4641        <listitem>
4642         <para>
4643          For each child table, create an index on the key column(s),
4644          as well as any other indexes you might want.
4645 <programlisting>
4646 CREATE INDEX measurement_y2006m02_logdate ON measurement_y2006m02 (logdate);
4647 CREATE INDEX measurement_y2006m03_logdate ON measurement_y2006m03 (logdate);
4648 CREATE INDEX measurement_y2007m11_logdate ON measurement_y2007m11 (logdate);
4649 CREATE INDEX measurement_y2007m12_logdate ON measurement_y2007m12 (logdate);
4650 CREATE INDEX measurement_y2008m01_logdate ON measurement_y2008m01 (logdate);
4651 </programlisting>
4652         </para>
4653        </listitem>
4654
4655        <listitem>
4656         <para>
4657          We want our application to be able to say <literal>INSERT INTO
4658          measurement ...</literal> and have the data be redirected into the
4659          appropriate child table.  We can arrange that by attaching
4660          a suitable trigger function to the root table.
4661          If data will be added only to the latest child, we can
4662          use a very simple trigger function:
4663
4664 <programlisting>
4665 CREATE OR REPLACE FUNCTION measurement_insert_trigger()
4666 RETURNS TRIGGER AS $$
4667 BEGIN
4668     INSERT INTO measurement_y2008m01 VALUES (NEW.*);
4669     RETURN NULL;
4670 END;
4671 $$
4672 LANGUAGE plpgsql;
4673 </programlisting>
4674         </para>
4675
4676         <para>
4677          After creating the function, we create a trigger which
4678          calls the trigger function:
4679
4680 <programlisting>
4681 CREATE TRIGGER insert_measurement_trigger
4682     BEFORE INSERT ON measurement
4683     FOR EACH ROW EXECUTE FUNCTION measurement_insert_trigger();
4684 </programlisting>
4685
4686          We must redefine the trigger function each month so that it always
4687          inserts into the current child table.  The trigger definition does
4688          not need to be updated, however.
4689         </para>
4690
4691         <para>
4692          We might want to insert data and have the server automatically
4693          locate the child table into which the row should be added. We
4694          could do this with a more complex trigger function, for example:
4695
4696 <programlisting>
4697 CREATE OR REPLACE FUNCTION measurement_insert_trigger()
4698 RETURNS TRIGGER AS $$
4699 BEGIN
4700     IF ( NEW.logdate &gt;= DATE '2006-02-01' AND
4701          NEW.logdate &lt; DATE '2006-03-01' ) THEN
4702         INSERT INTO measurement_y2006m02 VALUES (NEW.*);
4703     ELSIF ( NEW.logdate &gt;= DATE '2006-03-01' AND
4704             NEW.logdate &lt; DATE '2006-04-01' ) THEN
4705         INSERT INTO measurement_y2006m03 VALUES (NEW.*);
4706     ...
4707     ELSIF ( NEW.logdate &gt;= DATE '2008-01-01' AND
4708             NEW.logdate &lt; DATE '2008-02-01' ) THEN
4709         INSERT INTO measurement_y2008m01 VALUES (NEW.*);
4710     ELSE
4711         RAISE EXCEPTION 'Date out of range.  Fix the measurement_insert_trigger() function!';
4712     END IF;
4713     RETURN NULL;
4714 END;
4715 $$
4716 LANGUAGE plpgsql;
4717 </programlisting>
4718
4719          The trigger definition is the same as before.
4720          Note that each <literal>IF</literal> test must exactly match the
4721          <literal>CHECK</literal> constraint for its child table.
4722         </para>
4723
4724         <para>
4725          While this function is more complex than the single-month case,
4726          it doesn't need to be updated as often, since branches can be
4727          added in advance of being needed.
4728         </para>
4729
4730         <note>
4731          <para>
4732           In practice, it might be best to check the newest child first,
4733           if most inserts go into that child.  For simplicity, we have
4734           shown the trigger's tests in the same order as in other parts
4735           of this example.
4736          </para>
4737         </note>
4738
4739         <para>
4740          A different approach to redirecting inserts into the appropriate
4741          child table is to set up rules, instead of a trigger, on the
4742          root table.  For example:
4743
4744 <programlisting>
4745 CREATE RULE measurement_insert_y2006m02 AS
4746 ON INSERT TO measurement WHERE
4747     ( logdate &gt;= DATE '2006-02-01' AND logdate &lt; DATE '2006-03-01' )
4748 DO INSTEAD
4749     INSERT INTO measurement_y2006m02 VALUES (NEW.*);
4750 ...
4751 CREATE RULE measurement_insert_y2008m01 AS
4752 ON INSERT TO measurement WHERE
4753     ( logdate &gt;= DATE '2008-01-01' AND logdate &lt; DATE '2008-02-01' )
4754 DO INSTEAD
4755     INSERT INTO measurement_y2008m01 VALUES (NEW.*);
4756 </programlisting>
4757
4758          A rule has significantly more overhead than a trigger, but the
4759          overhead is paid once per query rather than once per row, so this
4760          method might be advantageous for bulk-insert situations.  In most
4761          cases, however, the trigger method will offer better performance.
4762         </para>
4763
4764         <para>
4765          Be aware that <command>COPY</command> ignores rules.  If you want to
4766          use <command>COPY</command> to insert data, you'll need to copy into the
4767          correct child table rather than directly into the root. <command>COPY</command>
4768          does fire triggers, so you can use it normally if you use the trigger
4769          approach.
4770         </para>
4771
4772         <para>
4773          Another disadvantage of the rule approach is that there is no simple
4774          way to force an error if the set of rules doesn't cover the insertion
4775          date; the data will silently go into the root table instead.
4776         </para>
4777        </listitem>
4778
4779        <listitem>
4780         <para>
4781          Ensure that the <xref linkend="guc-constraint-exclusion"/>
4782          configuration parameter is not disabled in
4783          <filename>postgresql.conf</filename>; otherwise
4784          child tables may be accessed unnecessarily.
4785         </para>
4786        </listitem>
4787       </orderedlist>
4788      </para>
4789
4790      <para>
4791       As we can see, a complex table hierarchy could require a
4792       substantial amount of DDL.  In the above example we would be creating
4793       a new child table each month, so it might be wise to write a script that
4794       generates the required DDL automatically.
4795      </para>
4796     </sect3>
4797
4798     <sect3 id="ddl-partitioning-inheritance-maintenance">
4799      <title>Maintenance for Inheritance Partitioning</title>
4800      <para>
4801       To remove old data quickly, simply drop the child table that is no longer
4802       necessary:
4803 <programlisting>
4804 DROP TABLE measurement_y2006m02;
4805 </programlisting>
4806      </para>
4807
4808     <para>
4809      To remove the child table from the inheritance hierarchy table but retain access to
4810      it as a table in its own right:
4811
4812 <programlisting>
4813 ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
4814 </programlisting>
4815     </para>
4816
4817     <para>
4818      To add a new child table to handle new data, create an empty child table
4819      just as the original children were created above:
4820
4821 <programlisting>
4822 CREATE TABLE measurement_y2008m02 (
4823     CHECK ( logdate &gt;= DATE '2008-02-01' AND logdate &lt; DATE '2008-03-01' )
4824 ) INHERITS (measurement);
4825 </programlisting>
4826
4827      Alternatively, one may want to create and populate the new child table
4828      before adding it to the table hierarchy.  This could allow data to be
4829      loaded, checked, and transformed before being made visible to queries on
4830      the parent table.
4831
4832 <programlisting>
4833 CREATE TABLE measurement_y2008m02
4834   (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
4835 ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
4836    CHECK ( logdate &gt;= DATE '2008-02-01' AND logdate &lt; DATE '2008-03-01' );
4837 \copy measurement_y2008m02 from 'measurement_y2008m02'
4838 -- possibly some other data preparation work
4839 ALTER TABLE measurement_y2008m02 INHERIT measurement;
4840 </programlisting>
4841     </para>
4842    </sect3>
4843
4844    <sect3 id="ddl-partitioning-inheritance-caveats">
4845     <title>Caveats</title>
4846
4847     <para>
4848      The following caveats apply to partitioning implemented using
4849      inheritance:
4850      <itemizedlist>
4851       <listitem>
4852        <para>
4853         There is no automatic way to verify that all of the
4854         <literal>CHECK</literal> constraints are mutually
4855         exclusive.  It is safer to create code that generates
4856         child tables and creates and/or modifies associated objects than
4857         to write each by hand.
4858        </para>
4859       </listitem>
4860
4861       <listitem>
4862        <para>
4863         Indexes and foreign key constraints apply to single tables and not
4864         to their inheritance children, hence they have some
4865         <link linkend="ddl-inherit-caveats">caveats</link> to be aware of.
4866        </para>
4867       </listitem>
4868
4869       <listitem>
4870        <para>
4871         The schemes shown here assume that the values of a row's key column(s)
4872         never change, or at least do not change enough to require it to move to another partition.
4873         An <command>UPDATE</command> that attempts
4874         to do that will fail because of the <literal>CHECK</literal> constraints.
4875         If you need to handle such cases, you can put suitable update triggers
4876         on the child tables, but it makes management of the structure
4877         much more complicated.
4878        </para>
4879       </listitem>
4880
4881       <listitem>
4882        <para>
4883         If you are using manual <command>VACUUM</command> or
4884         <command>ANALYZE</command> commands, don't forget that
4885         you need to run them on each child table individually. A command like:
4886 <programlisting>
4887 ANALYZE measurement;
4888 </programlisting>
4889         will only process the root table.
4890        </para>
4891       </listitem>
4892
4893       <listitem>
4894        <para>
4895         <command>INSERT</command> statements with <literal>ON CONFLICT</literal>
4896         clauses are unlikely to work as expected, as the <literal>ON CONFLICT</literal>
4897         action is only taken in case of unique violations on the specified
4898         target relation, not its child relations.
4899        </para>
4900       </listitem>
4901
4902       <listitem>
4903        <para>
4904         Triggers or rules will be needed to route rows to the desired
4905         child table, unless the application is explicitly aware of the
4906         partitioning scheme.  Triggers may be complicated to write, and will
4907         be much slower than the tuple routing performed internally by
4908         declarative partitioning.
4909        </para>
4910       </listitem>
4911      </itemizedlist>
4912     </para>
4913    </sect3>
4914   </sect2>
4915
4916   <sect2 id="ddl-partition-pruning">
4917    <title>Partition Pruning</title>
4918
4919    <indexterm>
4920     <primary>partition pruning</primary>
4921    </indexterm>
4922
4923    <para>
4924     <firstterm>Partition pruning</firstterm> is a query optimization technique
4925     that improves performance for declaratively partitioned tables.
4926     As an example:
4927
4928 <programlisting>
4929 SET enable_partition_pruning = on;                 -- the default
4930 SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
4931 </programlisting>
4932
4933     Without partition pruning, the above query would scan each of the
4934     partitions of the <structname>measurement</structname> table. With
4935     partition pruning enabled, the planner will examine the definition
4936     of each partition and prove that the partition need not
4937     be scanned because it could not contain any rows meeting the query's
4938     <literal>WHERE</literal> clause.  When the planner can prove this, it
4939     excludes (<firstterm>prunes</firstterm>) the partition from the query
4940     plan.
4941    </para>
4942
4943    <para>
4944     By using the EXPLAIN command and the <xref
4945     linkend="guc-enable-partition-pruning"/> configuration parameter, it's
4946     possible to show the difference between a plan for which partitions have
4947     been pruned and one for which they have not.  A typical unoptimized
4948     plan for this type of table setup is:
4949 <programlisting>
4950 SET enable_partition_pruning = off;
4951 EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
4952                                     QUERY PLAN
4953 -------------------------------------------------------------------&zwsp;----------------
4954  Aggregate  (cost=188.76..188.77 rows=1 width=8)
4955    -&gt;  Append  (cost=0.00..181.05 rows=3085 width=0)
4956          -&gt;  Seq Scan on measurement_y2006m02  (cost=0.00..33.12 rows=617 width=0)
4957                Filter: (logdate &gt;= '2008-01-01'::date)
4958          -&gt;  Seq Scan on measurement_y2006m03  (cost=0.00..33.12 rows=617 width=0)
4959                Filter: (logdate &gt;= '2008-01-01'::date)
4960 ...
4961          -&gt;  Seq Scan on measurement_y2007m11  (cost=0.00..33.12 rows=617 width=0)
4962                Filter: (logdate &gt;= '2008-01-01'::date)
4963          -&gt;  Seq Scan on measurement_y2007m12  (cost=0.00..33.12 rows=617 width=0)
4964                Filter: (logdate &gt;= '2008-01-01'::date)
4965          -&gt;  Seq Scan on measurement_y2008m01  (cost=0.00..33.12 rows=617 width=0)
4966                Filter: (logdate &gt;= '2008-01-01'::date)
4967 </programlisting>
4968
4969     Some or all of the partitions might use index scans instead of
4970     full-table sequential scans, but the point here is that there
4971     is no need to scan the older partitions at all to answer this query.
4972     When we enable partition pruning, we get a significantly
4973     cheaper plan that will deliver the same answer:
4974 <programlisting>
4975 SET enable_partition_pruning = on;
4976 EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
4977                                     QUERY PLAN
4978 -------------------------------------------------------------------&zwsp;----------------
4979  Aggregate  (cost=37.75..37.76 rows=1 width=8)
4980    -&gt;  Seq Scan on measurement_y2008m01  (cost=0.00..33.12 rows=617 width=0)
4981          Filter: (logdate &gt;= '2008-01-01'::date)
4982 </programlisting>
4983    </para>
4984
4985    <para>
4986     Note that partition pruning is driven only by the constraints defined
4987     implicitly by the partition keys, not by the presence of indexes.
4988     Therefore it isn't necessary to define indexes on the key columns.
4989     Whether an index needs to be created for a given partition depends on
4990     whether you expect that queries that scan the partition will
4991     generally scan a large part of the partition or just a small part.
4992     An index will be helpful in the latter case but not the former.
4993    </para>
4994
4995    <para>
4996     Partition pruning can be performed not only during the planning of a
4997     given query, but also during its execution.  This is useful as it can
4998     allow more partitions to be pruned when clauses contain expressions
4999     whose values are not known at query planning time, for example,
5000     parameters defined in a <command>PREPARE</command> statement, using a
5001     value obtained from a subquery, or using a parameterized value on the
5002     inner side of a nested loop join.  Partition pruning during execution
5003     can be performed at any of the following times:
5004
5005     <itemizedlist>
5006      <listitem>
5007       <para>
5008        During initialization of the query plan.  Partition pruning can be
5009        performed here for parameter values which are known during the
5010        initialization phase of execution.  Partitions which are pruned
5011        during this stage will not show up in the query's
5012        <command>EXPLAIN</command> or <command>EXPLAIN ANALYZE</command>.
5013        It is possible to determine the number of partitions which were
5014        removed during this phase by observing the
5015        <quote>Subplans Removed</quote> property in the
5016        <command>EXPLAIN</command> output.
5017       </para>
5018      </listitem>
5019
5020      <listitem>
5021       <para>
5022        During actual execution of the query plan.  Partition pruning may
5023        also be performed here to remove partitions using values which are
5024        only known during actual query execution.  This includes values
5025        from subqueries and values from execution-time parameters such as
5026        those from parameterized nested loop joins.  Since the value of
5027        these parameters may change many times during the execution of the
5028        query, partition pruning is performed whenever one of the
5029        execution parameters being used by partition pruning changes.
5030        Determining if partitions were pruned during this phase requires
5031        careful inspection of the <literal>loops</literal> property in
5032        the <command>EXPLAIN ANALYZE</command> output.  Subplans
5033        corresponding to different partitions may have different values
5034        for it depending on how many times each of them was pruned during
5035        execution.  Some may be shown as <literal>(never executed)</literal>
5036        if they were pruned every time.
5037       </para>
5038      </listitem>
5039     </itemizedlist>
5040    </para>
5041
5042    <para>
5043     Partition pruning can be disabled using the
5044     <xref linkend="guc-enable-partition-pruning"/> setting.
5045    </para>
5046   </sect2>
5047
5048   <sect2 id="ddl-partitioning-constraint-exclusion">
5049    <title>Partitioning and Constraint Exclusion</title>
5050
5051    <indexterm>
5052     <primary>constraint exclusion</primary>
5053    </indexterm>
5054
5055    <para>
5056     <firstterm>Constraint exclusion</firstterm> is a query optimization
5057     technique similar to partition pruning.  While it is primarily used
5058     for partitioning implemented using the legacy inheritance method, it can be
5059     used for other purposes, including with declarative partitioning.
5060    </para>
5061
5062    <para>
5063     Constraint exclusion works in a very similar way to partition
5064     pruning, except that it uses each table's <literal>CHECK</literal>
5065     constraints &mdash; which gives it its name &mdash; whereas partition
5066     pruning uses the table's partition bounds, which exist only in the
5067     case of declarative partitioning.  Another difference is that
5068     constraint exclusion is only applied at plan time; there is no attempt
5069     to remove partitions at execution time.
5070    </para>
5071
5072    <para>
5073     The fact that constraint exclusion uses <literal>CHECK</literal>
5074     constraints, which makes it slow compared to partition pruning, can
5075     sometimes be used as an advantage: because constraints can be defined
5076     even on declaratively-partitioned tables, in addition to their internal
5077     partition bounds, constraint exclusion may be able
5078     to elide additional partitions from the query plan.
5079    </para>
5080
5081    <para>
5082     The default (and recommended) setting of
5083     <xref linkend="guc-constraint-exclusion"/> is neither
5084     <literal>on</literal> nor <literal>off</literal>, but an intermediate setting
5085     called <literal>partition</literal>, which causes the technique to be
5086     applied only to queries that are likely to be working on inheritance partitioned
5087     tables.  The <literal>on</literal> setting causes the planner to examine
5088     <literal>CHECK</literal> constraints in all queries, even simple ones that
5089     are unlikely to benefit.
5090    </para>
5091
5092    <para>
5093     The following caveats apply to constraint exclusion:
5094
5095    <itemizedlist>
5096     <listitem>
5097      <para>
5098       Constraint exclusion is only applied during query planning, unlike
5099       partition pruning, which can also be applied during query execution.
5100      </para>
5101     </listitem>
5102
5103     <listitem>
5104      <para>
5105       Constraint exclusion only works when the query's <literal>WHERE</literal>
5106       clause contains constants (or externally supplied parameters).
5107       For example, a comparison against a non-immutable function such as
5108       <function>CURRENT_TIMESTAMP</function> cannot be optimized, since the
5109       planner cannot know which child table the function's value might fall
5110       into at run time.
5111      </para>
5112     </listitem>
5113
5114     <listitem>
5115      <para>
5116       Keep the partitioning constraints simple, else the planner may not be
5117       able to prove that child tables might not need to be visited.  Use simple
5118       equality conditions for list partitioning, or simple
5119       range tests for range partitioning, as illustrated in the preceding
5120       examples.  A good rule of thumb is that partitioning constraints should
5121       contain only comparisons of the partitioning column(s) to constants
5122       using B-tree-indexable operators, because only B-tree-indexable
5123       column(s) are allowed in the partition key.
5124      </para>
5125     </listitem>
5126
5127     <listitem>
5128      <para>
5129       All constraints on all children of the parent table are examined
5130       during constraint exclusion, so large numbers of children are likely
5131       to increase query planning time considerably.  So the legacy
5132       inheritance based partitioning will work well with up to perhaps a
5133       hundred child tables; don't try to use many thousands of children.
5134      </para>
5135     </listitem>
5136
5137    </itemizedlist>
5138    </para>
5139   </sect2>
5140
5141   <sect2 id="ddl-partitioning-declarative-best-practices">
5142    <title>Best Practices for Declarative Partitioning</title>
5143
5144    <para>
5145     The choice of how to partition a table should be made carefully, as the
5146     performance of query planning and execution can be negatively affected by
5147     poor design.
5148    </para>
5149
5150    <para>
5151     One of the most critical design decisions will be the column or columns
5152     by which you partition your data.  Often the best choice will be to
5153     partition by the column or set of columns which most commonly appear in
5154     <literal>WHERE</literal> clauses of queries being executed on the
5155     partitioned table.  <literal>WHERE</literal> clauses that are compatible
5156     with the partition bound constraints can be used to prune unneeded
5157     partitions.  However, you may be forced into making other decisions by
5158     requirements for the <literal>PRIMARY KEY</literal> or a
5159     <literal>UNIQUE</literal> constraint.  Removal of unwanted data is also a
5160     factor to consider when planning your partitioning strategy.  An entire
5161     partition can be detached fairly quickly, so it may be beneficial to
5162     design the partition strategy in such a way that all data to be removed
5163     at once is located in a single partition.
5164    </para>
5165
5166    <para>
5167     Choosing the target number of partitions that the table should be divided
5168     into is also a critical decision to make.  Not having enough partitions
5169     may mean that indexes remain too large and that data locality remains poor
5170     which could result in low cache hit ratios.  However, dividing the table
5171     into too many partitions can also cause issues.  Too many partitions can
5172     mean longer query planning times and higher memory consumption during both
5173     query planning and execution, as further described below.
5174     When choosing how to partition your table,
5175     it's also important to consider what changes may occur in the future.  For
5176     example, if you choose to have one partition per customer and you
5177     currently have a small number of large customers, consider the
5178     implications if in several years you instead find yourself with a large
5179     number of small customers.  In this case, it may be better to choose to
5180     partition by <literal>HASH</literal> and choose a reasonable number of
5181     partitions rather than trying to partition by <literal>LIST</literal> and
5182     hoping that the number of customers does not increase beyond what it is
5183     practical to partition the data by.
5184    </para>
5185
5186    <para>
5187     Sub-partitioning can be useful to further divide partitions that are
5188     expected to become larger than other partitions.
5189     Another option is to use range partitioning with multiple columns in
5190     the partition key.
5191     Either of these can easily lead to excessive numbers of partitions,
5192     so restraint is advisable.
5193    </para>
5194
5195    <para>
5196     It is important to consider the overhead of partitioning during
5197     query planning and execution.  The query planner is generally able to
5198     handle partition hierarchies with up to a few thousand partitions fairly
5199     well, provided that typical queries allow the query planner to prune all
5200     but a small number of partitions.  Planning times become longer and memory
5201     consumption becomes higher when more partitions remain after the planner
5202     performs partition pruning.  Another
5203     reason to be concerned about having a large number of partitions is that
5204     the server's memory consumption may grow significantly over
5205     time, especially if many sessions touch large numbers of partitions.
5206     That's because each partition requires its metadata to be loaded into the
5207     local memory of each session that touches it.
5208    </para>
5209
5210    <para>
5211     With data warehouse type workloads, it can make sense to use a larger
5212     number of partitions than with an <acronym>OLTP</acronym> type workload.
5213     Generally, in data warehouses, query planning time is less of a concern as
5214     the majority of processing time is spent during query execution.  With
5215     either of these two types of workload, it is important to make the right
5216     decisions early, as re-partitioning large quantities of data can be
5217     painfully slow.  Simulations of the intended workload are often beneficial
5218     for optimizing the partitioning strategy.  Never just assume that more
5219     partitions are better than fewer partitions, nor vice-versa.
5220    </para>
5221   </sect2>
5222
5223  </sect1>
5224
5225  <sect1 id="ddl-foreign-data">
5226   <title>Foreign Data</title>
5227
5228    <indexterm>
5229     <primary>foreign data</primary>
5230    </indexterm>
5231    <indexterm>
5232     <primary>foreign table</primary>
5233    </indexterm>
5234    <indexterm>
5235     <primary>user mapping</primary>
5236    </indexterm>
5237
5238    <para>
5239     <productname>PostgreSQL</productname> implements portions of the SQL/MED
5240     specification, allowing you to access data that resides outside
5241     PostgreSQL using regular SQL queries.  Such data is referred to as
5242     <firstterm>foreign data</firstterm>.  (Note that this usage is not to be confused
5243     with foreign keys, which are a type of constraint within the database.)
5244    </para>
5245
5246    <para>
5247     Foreign data is accessed with help from a
5248     <firstterm>foreign data wrapper</firstterm>. A foreign data wrapper is a
5249     library that can communicate with an external data source, hiding the
5250     details of connecting to the data source and obtaining data from it.
5251     There are some foreign data wrappers available as <filename>contrib</filename>
5252     modules; see <xref linkend="contrib"/>.  Other kinds of foreign data
5253     wrappers might be found as third party products.  If none of the existing
5254     foreign data wrappers suit your needs, you can write your own; see <xref
5255     linkend="fdwhandler"/>.
5256    </para>
5257
5258    <para>
5259     To access foreign data, you need to create a <firstterm>foreign server</firstterm>
5260     object, which defines how to connect to a particular external data source
5261     according to the set of options used by its supporting foreign data
5262     wrapper. Then you need to create one or more <firstterm>foreign
5263     tables</firstterm>, which define the structure of the remote data. A
5264     foreign table can be used in queries just like a normal table, but a
5265     foreign table has no storage in the PostgreSQL server.  Whenever it is
5266     used, <productname>PostgreSQL</productname> asks the foreign data wrapper
5267     to fetch data from the external source, or transmit data to the external
5268     source in the case of update commands.
5269    </para>
5270
5271    <para>
5272     Accessing remote data may require authenticating to the external
5273     data source.  This information can be provided by a
5274     <firstterm>user mapping</firstterm>, which can provide additional data
5275     such as user names and passwords based
5276     on the current <productname>PostgreSQL</productname> role.
5277    </para>
5278
5279    <para>
5280     For additional information, see
5281     <xref linkend="sql-createforeigndatawrapper"/>,
5282     <xref linkend="sql-createserver"/>,
5283     <xref linkend="sql-createusermapping"/>,
5284     <xref linkend="sql-createforeigntable"/>, and
5285     <xref linkend="sql-importforeignschema"/>.
5286    </para>
5287  </sect1>
5288
5289  <sect1 id="ddl-others">
5290   <title>Other Database Objects</title>
5291
5292   <para>
5293    Tables are the central objects in a relational database structure,
5294    because they hold your data.  But they are not the only objects
5295    that exist in a database.  Many other kinds of objects can be
5296    created to make the use and management of the data more efficient
5297    or convenient.  They are not discussed in this chapter, but we give
5298    you a list here so that you are aware of what is possible:
5299   </para>
5300
5301   <itemizedlist>
5302    <listitem>
5303     <para>
5304      Views
5305     </para>
5306    </listitem>
5307
5308    <listitem>
5309     <para>
5310      Functions, procedures, and operators
5311     </para>
5312    </listitem>
5313
5314    <listitem>
5315     <para>
5316      Data types and domains
5317     </para>
5318    </listitem>
5319
5320    <listitem>
5321     <para>
5322      Triggers and rewrite rules
5323     </para>
5324    </listitem>
5325   </itemizedlist>
5326
5327   <para>
5328    Detailed information on
5329    these topics appears in <xref linkend="server-programming"/>.
5330   </para>
5331  </sect1>
5332
5333  <sect1 id="ddl-depend">
5334   <title>Dependency Tracking</title>
5335
5336   <indexterm zone="ddl-depend">
5337    <primary>CASCADE</primary>
5338    <secondary sortas="DROP">with DROP</secondary>
5339   </indexterm>
5340
5341   <indexterm zone="ddl-depend">
5342    <primary>RESTRICT</primary>
5343    <secondary sortas="DROP">with DROP</secondary>
5344   </indexterm>
5345
5346   <para>
5347    When you create complex database structures involving many tables
5348    with foreign key constraints, views, triggers, functions, etc. you
5349    implicitly create a net of dependencies between the objects.
5350    For instance, a table with a foreign key constraint depends on the
5351    table it references.
5352   </para>
5353
5354   <para>
5355    To ensure the integrity of the entire database structure,
5356    <productname>PostgreSQL</productname> makes sure that you cannot
5357    drop objects that other objects still depend on.  For example,
5358    attempting to drop the products table we considered in <xref
5359    linkend="ddl-constraints-fk"/>, with the orders table depending on
5360    it, would result in an error message like this:
5361 <screen>
5362 DROP TABLE products;
5363
5364 ERROR:  cannot drop table products because other objects depend on it
5365 DETAIL:  constraint orders_product_no_fkey on table orders depends on table products
5366 HINT:  Use DROP ... CASCADE to drop the dependent objects too.
5367 </screen>
5368    The error message contains a useful hint: if you do not want to
5369    bother deleting all the dependent objects individually, you can run:
5370 <screen>
5371 DROP TABLE products CASCADE;
5372 </screen>
5373    and all the dependent objects will be removed, as will any objects
5374    that depend on them, recursively.  In this case, it doesn't remove
5375    the orders table, it only removes the foreign key constraint.
5376    It stops there because nothing depends on the foreign key constraint.
5377    (If you want to check what <command>DROP ... CASCADE</command> will do,
5378    run <command>DROP</command> without <literal>CASCADE</literal> and read the
5379    <literal>DETAIL</literal> output.)
5380   </para>
5381
5382   <para>
5383    Almost all <command>DROP</command> commands in <productname>PostgreSQL</productname> support
5384    specifying <literal>CASCADE</literal>.  Of course, the nature of
5385    the possible dependencies varies with the type of the object.  You
5386    can also write <literal>RESTRICT</literal> instead of
5387    <literal>CASCADE</literal> to get the default behavior, which is to
5388    prevent dropping objects that any other objects depend on.
5389   </para>
5390
5391   <note>
5392    <para>
5393     According to the SQL standard, specifying either
5394     <literal>RESTRICT</literal> or <literal>CASCADE</literal> is
5395     required in a <command>DROP</command> command.  No database system actually
5396     enforces that rule, but whether the default behavior
5397     is <literal>RESTRICT</literal> or <literal>CASCADE</literal> varies
5398     across systems.
5399    </para>
5400   </note>
5401
5402   <para>
5403    If a <command>DROP</command> command lists multiple
5404    objects, <literal>CASCADE</literal> is only required when there are
5405    dependencies outside the specified group.  For example, when saying
5406    <literal>DROP TABLE tab1, tab2</literal> the existence of a foreign
5407    key referencing <literal>tab1</literal> from <literal>tab2</literal> would not mean
5408    that <literal>CASCADE</literal> is needed to succeed.
5409   </para>
5410
5411   <para>
5412    For a user-defined function or procedure whose body is defined as a string
5413    literal, <productname>PostgreSQL</productname> tracks
5414    dependencies associated with the function's externally-visible properties,
5415    such as its argument and result types, but <emphasis>not</emphasis> dependencies
5416    that could only be known by examining the function body.  As an example,
5417    consider this situation:
5418
5419 <programlisting>
5420 CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow',
5421                              'green', 'blue', 'purple');
5422
5423 CREATE TABLE my_colors (color rainbow, note text);
5424
5425 CREATE FUNCTION get_color_note (rainbow) RETURNS text AS
5426   'SELECT note FROM my_colors WHERE color = $1'
5427   LANGUAGE SQL;
5428 </programlisting>
5429
5430    (See <xref linkend="xfunc-sql"/> for an explanation of SQL-language
5431    functions.)  <productname>PostgreSQL</productname> will be aware that
5432    the <function>get_color_note</function> function depends on the <type>rainbow</type>
5433    type: dropping the type would force dropping the function, because its
5434    argument type would no longer be defined.  But <productname>PostgreSQL</productname>
5435    will not consider <function>get_color_note</function> to depend on
5436    the <structname>my_colors</structname> table, and so will not drop the function if
5437    the table is dropped.  While there are disadvantages to this approach,
5438    there are also benefits.  The function is still valid in some sense if the
5439    table is missing, though executing it would cause an error; creating a new
5440    table of the same name would allow the function to work again.
5441   </para>
5442
5443   <para>
5444    On the other hand, for a SQL-language function or procedure whose body
5445    is written in SQL-standard style, the body is parsed at function
5446    definition time and all dependencies recognized by the parser are
5447    stored.  Thus, if we write the function above as
5448
5449 <programlisting>
5450 CREATE FUNCTION get_color_note (rainbow) RETURNS text
5451 BEGIN ATOMIC
5452   SELECT note FROM my_colors WHERE color = $1;
5453 END;
5454 </programlisting>
5455
5456    then the function's dependency on the <structname>my_colors</structname>
5457    table will be known and enforced by <command>DROP</command>.
5458   </para>
5459  </sect1>
5460
5461 </chapter>