doc/src/sgml/queries.sgml

   1 <!-- $PostgreSQL$ -->
   2
   3 <chapter id="queries">
   4  <title>Queries</title>
   5
   6  <indexterm zone="queries">
   7   <primary>query</primary>
   8  </indexterm>
   9
  10  <indexterm zone="queries">
  11   <primary>SELECT</primary>
  12  </indexterm>
  13
  14  <para>
  15   The previous chapters explained how to create tables, how to fill
  16   them with data, and how to manipulate that data.  Now we finally
  17   discuss how to retrieve the data out of the database.
  18  </para>
  19
  20
  21  <sect1 id="queries-overview">
  22   <title>Overview</title>
  23
  24   <para>
  25    The process of retrieving or the command to retrieve data from a
  26    database is called a <firstterm>query</firstterm>.  In SQL the
  27    <xref linkend="sql-select" endterm="sql-select-title"> command is
  28    used to specify queries.  The general syntax of the
  29    <command>SELECT</command> command is
  30 <synopsis>
  31 SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> <optional><replaceable>sort_specification</replaceable></optional>
  32 </synopsis>
  33    The following sections describe the details of the select list, the
  34    table expression, and the sort specification.
  35   </para>
  36
  37   <para>
  38    A simple kind of query has the form:
  39 <programlisting>
  40 SELECT * FROM table1;
  41 </programlisting>
  42   Assuming that there is a table called <literal>table1</literal>,
  43   this command would retrieve all rows and all columns from
  44   <literal>table1</literal>.  (The method of retrieval depends on the
  45   client application.  For example, the
  46   <application>psql</application> program will display an ASCII-art
  47   table on the screen, while client libraries will offer functions to
  48   extract individual values from the query result.)  The select list
  49   specification <literal>*</literal> means all columns that the table
  50   expression happens to provide.  A select list can also select a
  51   subset of the available columns or make calculations using the
  52   columns.  For example, if
  53   <literal>table1</literal> has columns named <literal>a</>,
  54   <literal>b</>, and <literal>c</> (and perhaps others) you can make
  55   the following query:
  56 <programlisting>
  57 SELECT a, b + c FROM table1;
  58 </programlisting>
  59   (assuming that <literal>b</> and <literal>c</> are of a numerical
  60   data type).
  61   See <xref linkend="queries-select-lists"> for more details.
  62  </para>
  63
  64  <para>
  65   <literal>FROM table1</literal> is a particularly simple kind of
  66   table expression: it reads just one table.  In general, table
  67   expressions can be complex constructs of base tables, joins, and
  68   subqueries.  But you can also omit the table expression entirely and
  69   use the <command>SELECT</command> command as a calculator:
  70 <programlisting>
  71 SELECT 3 * 4;
  72 </programlisting>
  73   This is more useful if the expressions in the select list return
  74   varying results.  For example, you could call a function this way:
  75 <programlisting>
  76 SELECT random();
  77 </programlisting>
  78   </para>
  79  </sect1>
  80
  81
  82  <sect1 id="queries-table-expressions">
  83   <title>Table Expressions</title>
  84
  85   <indexterm zone="queries-table-expressions">
  86    <primary>table expression</primary>
  87   </indexterm>
  88
  89   <para>
  90    A <firstterm>table expression</firstterm> computes a table.  The
  91    table expression contains a <literal>FROM</> clause that is
  92    optionally followed by <literal>WHERE</>, <literal>GROUP BY</>, and
  93    <literal>HAVING</> clauses.  Trivial table expressions simply refer
  94    to a table on disk, a so-called base table, but more complex
  95    expressions can be used to modify or combine base tables in various
  96    ways.
  97   </para>
  98
  99   <para>
 100    The optional <literal>WHERE</>, <literal>GROUP BY</>, and
 101    <literal>HAVING</> clauses in the table expression specify a
 102    pipeline of successive transformations performed on the table
 103    derived in the <literal>FROM</> clause.  All these transformations
 104    produce a virtual table that provides the rows that are passed to
 105    the select list to compute the output rows of the query.
 106   </para>
 107
 108   <sect2 id="queries-from">
 109    <title>The <literal>FROM</literal> Clause</title>
 110
 111    <para>
 112     The <xref linkend="sql-from" endterm="sql-from-title"> derives a
 113     table from one or more other tables given in a comma-separated
 114     table reference list.
 115 <synopsis>
 116 FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_reference</replaceable> <optional>, ...</optional></optional>
 117 </synopsis>
 118
 119     A table reference can be a table name (possibly schema-qualified),
 120     or a derived table such as a subquery, a table join, or complex
 121     combinations of these.  If more than one table reference is listed
 122     in the <literal>FROM</> clause they are cross-joined (see below)
 123     to form the intermediate virtual table that can then be subject to
 124     transformations by the <literal>WHERE</>, <literal>GROUP BY</>,
 125     and <literal>HAVING</> clauses and is finally the result of the
 126     overall table expression.
 127    </para>
 128
 129    <indexterm>
 130     <primary>ONLY</primary>
 131    </indexterm>
 132
 133    <para>
 134     When a table reference names a table that is the parent of a
 135     table inheritance hierarchy, the table reference produces rows of
 136     not only that table but all of its descendant tables, unless the
 137     key word <literal>ONLY</> precedes the table name.  However, the
 138     reference produces only the columns that appear in the named table
 139     &mdash; any columns added in subtables are ignored.
 140    </para>
 141
 142    <sect3 id="queries-join">
 143     <title>Joined Tables</title>
 144
 145     <indexterm zone="queries-join">
 146      <primary>join</primary>
 147     </indexterm>
 148
 149     <para>
 150      A joined table is a table derived from two other (real or
 151      derived) tables according to the rules of the particular join
 152      type.  Inner, outer, and cross-joins are available.
 153     </para>
 154
 155     <variablelist>
 156      <title>Join Types</title>
 157
 158      <varlistentry>
 159       <term>Cross join</term>
 160
 161       <indexterm>
 162        <primary>join</primary>
 163        <secondary>cross</secondary>
 164       </indexterm>
 165
 166       <indexterm>
 167        <primary>cross join</primary>
 168       </indexterm>
 169
 170       <listitem>
 171 <synopsis>
 172 <replaceable>T1</replaceable> CROSS JOIN <replaceable>T2</replaceable>
 173 </synopsis>
 174
 175        <para>
 176         For each combination of rows from
 177         <replaceable>T1</replaceable> and
 178         <replaceable>T2</replaceable>, the derived table will contain a
 179         row consisting of all columns in <replaceable>T1</replaceable>
 180         followed by all columns in <replaceable>T2</replaceable>.  If
 181         the tables have N and M rows respectively, the joined
 182         table will have N * M rows.
 183        </para>
 184
 185        <para>
 186         <literal>FROM <replaceable>T1</replaceable> CROSS JOIN
 187         <replaceable>T2</replaceable></literal> is equivalent to
 188         <literal>FROM <replaceable>T1</replaceable>,
 189         <replaceable>T2</replaceable></literal>.  It is also equivalent to
 190         <literal>FROM <replaceable>T1</replaceable> INNER JOIN
 191         <replaceable>T2</replaceable> ON TRUE</literal> (see below).
 192        </para>
 193       </listitem>
 194      </varlistentry>
 195
 196      <varlistentry>
 197       <term>Qualified joins</term>
 198
 199       <indexterm>
 200        <primary>join</primary>
 201        <secondary>outer</secondary>
 202       </indexterm>
 203
 204       <indexterm>
 205        <primary>outer join</primary>
 206       </indexterm>
 207
 208       <listitem>
 209 <synopsis>
 210 <replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> ON <replaceable>boolean_expression</replaceable>
 211 <replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> USING ( <replaceable>join column list</replaceable> )
 212 <replaceable>T1</replaceable> NATURAL { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable>
 213 </synopsis>
 214
 215        <para>
 216         The words <literal>INNER</literal> and
 217         <literal>OUTER</literal> are optional in all forms.
 218         <literal>INNER</literal> is the default;
 219         <literal>LEFT</literal>, <literal>RIGHT</literal>, and
 220         <literal>FULL</literal> imply an outer join.
 221        </para>
 222
 223        <para>
 224         The <firstterm>join condition</firstterm> is specified in the
 225         <literal>ON</> or <literal>USING</> clause, or implicitly by
 226         the word <literal>NATURAL</>.  The join condition determines
 227         which rows from the two source tables are considered to
 228         <quote>match</quote>, as explained in detail below.
 229        </para>
 230
 231        <para>
 232         The <literal>ON</> clause is the most general kind of join
 233         condition: it takes a Boolean value expression of the same
 234         kind as is used in a <literal>WHERE</> clause.  A pair of rows
 235         from <replaceable>T1</> and <replaceable>T2</> match if the
 236         <literal>ON</> expression evaluates to true for them.
 237        </para>
 238
 239        <para>
 240         <literal>USING</> is a shorthand notation: it takes a
 241         comma-separated list of column names, which the joined tables
 242         must have in common, and forms a join condition specifying
 243         equality of each of these pairs of columns.  Furthermore, the
 244         output of a <literal>JOIN USING</> has one column for each of
 245         the equated pairs of input columns, followed by all of the
 246         other columns from each table.  Thus, <literal>USING (a, b,
 247         c)</literal> is equivalent to <literal>ON (t1.a = t2.a AND
 248         t1.b = t2.b AND t1.c = t2.c)</literal> with the exception that
 249         if <literal>ON</> is used there will be two columns
 250         <literal>a</>, <literal>b</>, and <literal>c</> in the result,
 251         whereas with <literal>USING</> there will be only one of each.
 252        </para>
 253
 254        <para>
 255         <indexterm>
 256          <primary>join</primary>
 257          <secondary>natural</secondary>
 258         </indexterm>
 259         <indexterm>
 260          <primary>natural join</primary>
 261         </indexterm>
 262         Finally, <literal>NATURAL</> is a shorthand form of
 263         <literal>USING</>: it forms a <literal>USING</> list
 264         consisting of exactly those column names that appear in both
 265         input tables.  As with <literal>USING</>, these columns appear
 266         only once in the output table.
 267        </para>
 268
 269        <para>
 270         The possible types of qualified join are:
 271
 272        <variablelist>
 273         <varlistentry>
 274          <term><literal>INNER JOIN</></term>
 275
 276          <listitem>
 277           <para>
 278            For each row R1 of T1, the joined table has a row for each
 279            row in T2 that satisfies the join condition with R1.
 280           </para>
 281          </listitem>
 282         </varlistentry>
 283
 284         <varlistentry>
 285          <term><literal>LEFT OUTER JOIN</></term>
 286
 287          <indexterm>
 288           <primary>join</primary>
 289           <secondary>left</secondary>
 290          </indexterm>
 291
 292          <indexterm>
 293           <primary>left join</primary>
 294          </indexterm>
 295
 296          <listitem>
 297           <para>
 298            First, an inner join is performed.  Then, for each row in
 299            T1 that does not satisfy the join condition with any row in
 300            T2, a joined row is added with null values in columns of
 301            T2.  Thus, the joined table unconditionally has at least
 302            one row for each row in T1.
 303           </para>
 304          </listitem>
 305         </varlistentry>
 306
 307         <varlistentry>
 308          <term><literal>RIGHT OUTER JOIN</></term>
 309
 310          <indexterm>
 311           <primary>join</primary>
 312           <secondary>right</secondary>
 313          </indexterm>
 314
 315          <indexterm>
 316           <primary>right join</primary>
 317          </indexterm>
 318
 319          <listitem>
 320           <para>
 321            First, an inner join is performed.  Then, for each row in
 322            T2 that does not satisfy the join condition with any row in
 323            T1, a joined row is added with null values in columns of
 324            T1.  This is the converse of a left join: the result table
 325            will unconditionally have a row for each row in T2.
 326           </para>
 327          </listitem>
 328         </varlistentry>
 329
 330         <varlistentry>
 331          <term><literal>FULL OUTER JOIN</></term>
 332
 333          <listitem>
 334           <para>
 335            First, an inner join is performed.  Then, for each row in
 336            T1 that does not satisfy the join condition with any row in
 337            T2, a joined row is added with null values in columns of
 338            T2.  Also, for each row of T2 that does not satisfy the
 339            join condition with any row in T1, a joined row with null
 340            values in the columns of T1 is added.
 341           </para>
 342          </listitem>
 343         </varlistentry>
 344        </variablelist>
 345        </para>
 346       </listitem>
 347      </varlistentry>
 348     </variablelist>
 349
 350     <para>
 351      Joins of all types can be chained together or nested: either or
 352      both of <replaceable>T1</replaceable> and
 353      <replaceable>T2</replaceable> might be joined tables.  Parentheses
 354      can be used around <literal>JOIN</> clauses to control the join
 355      order.  In the absence of parentheses, <literal>JOIN</> clauses
 356      nest left-to-right.
 357     </para>
 358
 359     <para>
 360      To put this together, assume we have tables <literal>t1</literal>:
 361 <programlisting>
 362  num | name
 363 -----+------
 364    1 | a
 365    2 | b
 366    3 | c
 367 </programlisting>
 368      and <literal>t2</literal>:
 369 <programlisting>
 370  num | value
 371 -----+-------
 372    1 | xxx
 373    3 | yyy
 374    5 | zzz
 375 </programlisting>
 376      then we get the following results for the various joins:
 377 <screen>
 378 <prompt>=&gt;</> <userinput>SELECT * FROM t1 CROSS JOIN t2;</>
 379  num | name | num | value
 380 -----+------+-----+-------
 381    1 | a    |   1 | xxx
 382    1 | a    |   3 | yyy
 383    1 | a    |   5 | zzz
 384    2 | b    |   1 | xxx
 385    2 | b    |   3 | yyy
 386    2 | b    |   5 | zzz
 387    3 | c    |   1 | xxx
 388    3 | c    |   3 | yyy
 389    3 | c    |   5 | zzz
 390 (9 rows)
 391
 392 <prompt>=&gt;</> <userinput>SELECT * FROM t1 INNER JOIN t2 ON t1.num = t2.num;</>
 393  num | name | num | value
 394 -----+------+-----+-------
 395    1 | a    |   1 | xxx
 396    3 | c    |   3 | yyy
 397 (2 rows)
 398
 399 <prompt>=&gt;</> <userinput>SELECT * FROM t1 INNER JOIN t2 USING (num);</>
 400  num | name | value
 401 -----+------+-------
 402    1 | a    | xxx
 403    3 | c    | yyy
 404 (2 rows)
 405
 406 <prompt>=&gt;</> <userinput>SELECT * FROM t1 NATURAL INNER JOIN t2;</>
 407  num | name | value
 408 -----+------+-------
 409    1 | a    | xxx
 410    3 | c    | yyy
 411 (2 rows)
 412
 413 <prompt>=&gt;</> <userinput>SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num;</>
 414  num | name | num | value
 415 -----+------+-----+-------
 416    1 | a    |   1 | xxx
 417    2 | b    |     |
 418    3 | c    |   3 | yyy
 419 (3 rows)
 420
 421 <prompt>=&gt;</> <userinput>SELECT * FROM t1 LEFT JOIN t2 USING (num);</>
 422  num | name | value
 423 -----+------+-------
 424    1 | a    | xxx
 425    2 | b    |
 426    3 | c    | yyy
 427 (3 rows)
 428
 429 <prompt>=&gt;</> <userinput>SELECT * FROM t1 RIGHT JOIN t2 ON t1.num = t2.num;</>
 430  num | name | num | value
 431 -----+------+-----+-------
 432    1 | a    |   1 | xxx
 433    3 | c    |   3 | yyy
 434      |      |   5 | zzz
 435 (3 rows)
 436
 437 <prompt>=&gt;</> <userinput>SELECT * FROM t1 FULL JOIN t2 ON t1.num = t2.num;</>
 438  num | name | num | value
 439 -----+------+-----+-------
 440    1 | a    |   1 | xxx
 441    2 | b    |     |
 442    3 | c    |   3 | yyy
 443      |      |   5 | zzz
 444 (4 rows)
 445 </screen>
 446     </para>
 447
 448     <para>
 449      The join condition specified with <literal>ON</> can also contain
 450      conditions that do not relate directly to the join.  This can
 451      prove useful for some queries but needs to be thought out
 452      carefully.  For example:
 453 <screen>
 454 <prompt>=&gt;</> <userinput>SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx';</>
 455  num | name | num | value
 456 -----+------+-----+-------
 457    1 | a    |   1 | xxx
 458    2 | b    |     |
 459    3 | c    |     |
 460 (3 rows)
 461 </screen>
 462     </para>
 463    </sect3>
 464
 465    <sect3 id="queries-table-aliases">
 466     <title>Table and Column Aliases</title>
 467
 468     <indexterm zone="queries-table-aliases">
 469      <primary>alias</primary>
 470      <secondary>in the FROM clause</secondary>
 471     </indexterm>
 472
 473     <indexterm>
 474      <primary>label</primary>
 475      <see>alias</see>
 476     </indexterm>
 477
 478     <para>
 479      A temporary name can be given to tables and complex table
 480      references to be used for references to the derived table in
 481      the rest of the query.  This is called a <firstterm>table
 482      alias</firstterm>.
 483     </para>
 484
 485     <para>
 486      To create a table alias, write
 487 <synopsis>
 488 FROM <replaceable>table_reference</replaceable> AS <replaceable>alias</replaceable>
 489 </synopsis>
 490      or
 491 <synopsis>
 492 FROM <replaceable>table_reference</replaceable> <replaceable>alias</replaceable>
 493 </synopsis>
 494      The <literal>AS</literal> key word is optional noise.
 495      <replaceable>alias</replaceable> can be any identifier.
 496     </para>
 497
 498     <para>
 499      A typical application of table aliases is to assign short
 500      identifiers to long table names to keep the join clauses
 501      readable.  For example:
 502 <programlisting>
 503 SELECT * FROM some_very_long_table_name s JOIN another_fairly_long_name a ON s.id = a.num;
 504 </programlisting>
 505     </para>
 506
 507     <para>
 508      The alias becomes the new name of the table reference for the
 509      current query &mdash; it is no longer possible to refer to the table
 510      by the original name.  Thus:
 511 <programlisting>
 512 SELECT * FROM my_table AS m WHERE my_table.a &gt; 5;
 513 </programlisting>
 514      is not valid according to the SQL standard.  In
 515      <productname>PostgreSQL</productname> this will draw an error if the
 516      <xref linkend="guc-add-missing-from"> configuration variable is
 517      <literal>off</> (as it is by default).  If it is <literal>on</>,
 518      an implicit table reference will be added to the
 519      <literal>FROM</literal> clause, so the query is processed as if
 520      it were written as:
 521 <programlisting>
 522 SELECT * FROM my_table AS m, my_table AS my_table WHERE my_table.a &gt; 5;
 523 </programlisting>
 524      That will result in a cross join, which is usually not what you want.
 525     </para>
 526
 527     <para>
 528      Table aliases are mainly for notational convenience, but it is
 529      necessary to use them when joining a table to itself, e.g.:
 530 <programlisting>
 531 SELECT * FROM people AS mother JOIN people AS child ON mother.id = child.mother_id;
 532 </programlisting>
 533      Additionally, an alias is required if the table reference is a
 534      subquery (see <xref linkend="queries-subqueries">).
 535     </para>
 536
 537     <para>
 538      Parentheses are used to resolve ambiguities.  In the following example,
 539      the first statement assigns the alias <literal>b</literal> to the second
 540      instance of <literal>my_table</>, but the second statement assigns the
 541      alias to the result of the join:
 542 <programlisting>
 543 SELECT * FROM my_table AS a CROSS JOIN my_table AS b ...
 544 SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ...
 545 </programlisting>
 546     </para>
 547
 548     <para>
 549      Another form of table aliasing gives temporary names to the columns of
 550      the table, as well as the table itself:
 551 <synopsis>
 552 FROM <replaceable>table_reference</replaceable> <optional>AS</optional> <replaceable>alias</replaceable> ( <replaceable>column1</replaceable> <optional>, <replaceable>column2</replaceable> <optional>, ...</optional></optional> )
 553 </synopsis>
 554      If fewer column aliases are specified than the actual table has
 555      columns, the remaining columns are not renamed.  This syntax is
 556      especially useful for self-joins or subqueries.
 557     </para>
 558
 559     <para>
 560      When an alias is applied to the output of a <literal>JOIN</>
 561      clause, using any of these forms, the alias hides the original
 562      names within the <literal>JOIN</>.  For example:
 563 <programlisting>
 564 SELECT a.* FROM my_table AS a JOIN your_table AS b ON ...
 565 </programlisting>
 566      is valid SQL, but:
 567 <programlisting>
 568 SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c
 569 </programlisting>
 570      is not valid: the table alias <literal>a</> is not visible
 571      outside the alias <literal>c</>.
 572     </para>
 573    </sect3>
 574
 575    <sect3 id="queries-subqueries">
 576     <title>Subqueries</title>
 577
 578     <indexterm zone="queries-subqueries">
 579      <primary>subquery</primary>
 580     </indexterm>
 581
 582     <para>
 583      Subqueries specifying a derived table must be enclosed in
 584      parentheses and <emphasis>must</emphasis> be assigned a table
 585      alias name.  (See <xref linkend="queries-table-aliases">.)  For
 586      example:
 587 <programlisting>
 588 FROM (SELECT * FROM table1) AS alias_name
 589 </programlisting>
 590     </para>
 591
 592     <para>
 593      This example is equivalent to <literal>FROM table1 AS
 594      alias_name</literal>.  More interesting cases, which cannot be
 595      reduced to a plain join, arise when the subquery involves
 596      grouping or aggregation.
 597     </para>
 598
 599     <para>
 600      A subquery can also be a <command>VALUES</> list:
 601 <programlisting>
 602 FROM (VALUES ('anne', 'smith'), ('bob', 'jones'), ('joe', 'blow'))
 603      AS names(first, last)
 604 </programlisting>
 605      Again, a table alias is required.  Assigning alias names to the columns
 606      of the <command>VALUES</> list is optional, but is good practice.
 607      For more information see <xref linkend="queries-values">.
 608     </para>
 609    </sect3>
 610
 611    <sect3 id="queries-tablefunctions">
 612     <title>Table Functions</title>
 613
 614     <indexterm zone="queries-tablefunctions"><primary>table function</></>
 615
 616     <indexterm zone="queries-tablefunctions">
 617      <primary>function</>
 618      <secondary>in the FROM clause</>
 619     </indexterm>
 620
 621     <para>
 622      Table functions are functions that produce a set of rows, made up
 623      of either base data types (scalar types) or composite data types
 624      (table rows).  They are used like a table, view, or subquery in
 625      the <literal>FROM</> clause of a query. Columns returned by table
 626      functions can be included in <literal>SELECT</>,
 627      <literal>JOIN</>, or <literal>WHERE</> clauses in the same manner
 628      as a table, view, or subquery column.
 629     </para>
 630
 631     <para>
 632      If a table function returns a base data type, the single result
 633      column is named like the function. If the function returns a
 634      composite type, the result columns get the same names as the
 635      individual attributes of the type.
 636     </para>
 637
 638     <para>
 639      A table function can be aliased in the <literal>FROM</> clause,
 640      but it also can be left unaliased. If a function is used in the
 641      <literal>FROM</> clause with no alias, the function name is used
 642      as the resulting table name.
 643     </para>
 644
 645     <para>
 646      Some examples:
 647 <programlisting>
 648 CREATE TABLE foo (fooid int, foosubid int, fooname text);
 649
 650 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
 651     SELECT * FROM foo WHERE fooid = $1;
 652 $$ LANGUAGE SQL;
 653
 654 SELECT * FROM getfoo(1) AS t1;
 655
 656 SELECT * FROM foo
 657     WHERE foosubid IN (select foosubid from getfoo(foo.fooid) z
 658                            where z.fooid = foo.fooid);
 659
 660 CREATE VIEW vw_getfoo AS SELECT * FROM getfoo(1);
 661
 662 SELECT * FROM vw_getfoo;
 663 </programlisting>
 664     </para>
 665
 666     <para>
 667      In some cases it is useful to define table functions that can
 668      return different column sets depending on how they are invoked.
 669      To support this, the table function can be declared as returning
 670      the pseudotype <type>record</>.  When such a function is used in
 671      a query, the expected row structure must be specified in the
 672      query itself, so that the system can know how to parse and plan
 673      the query.  Consider this example:
 674 <programlisting>
 675 SELECT *
 676     FROM dblink('dbname=mydb', 'select proname, prosrc from pg_proc')
 677       AS t1(proname name, prosrc text)
 678     WHERE proname LIKE 'bytea%';
 679 </programlisting>
 680      The <literal>dblink</> function executes a remote query (see
 681      <filename>contrib/dblink</>).  It is declared to return
 682      <type>record</> since it might be used for any kind of query.
 683      The actual column set must be specified in the calling query so
 684      that the parser knows, for example, what <literal>*</> should
 685      expand to.
 686     </para>
 687    </sect3>
 688   </sect2>
 689
 690   <sect2 id="queries-where">
 691    <title>The <literal>WHERE</literal> Clause</title>
 692
 693    <indexterm zone="queries-where">
 694     <primary>WHERE</primary>
 695    </indexterm>
 696
 697    <para>
 698     The syntax of the <xref linkend="sql-where"
 699     endterm="sql-where-title"> is
 700 <synopsis>
 701 WHERE <replaceable>search_condition</replaceable>
 702 </synopsis>
 703     where <replaceable>search_condition</replaceable> is any value
 704     expression (see <xref linkend="sql-expressions">) that
 705     returns a value of type <type>boolean</type>.
 706    </para>
 707
 708    <para>
 709     After the processing of the <literal>FROM</> clause is done, each
 710     row of the derived virtual table is checked against the search
 711     condition.  If the result of the condition is true, the row is
 712     kept in the output table, otherwise (that is, if the result is
 713     false or null) it is discarded.  The search condition typically
 714     references at least some column of the table generated in the
 715     <literal>FROM</> clause; this is not required, but otherwise the
 716     <literal>WHERE</> clause will be fairly useless.
 717    </para>
 718
 719    <note>
 720     <para>
 721      The join condition of an inner join can be written either in
 722      the <literal>WHERE</> clause or in the <literal>JOIN</> clause.
 723      For example, these table expressions are equivalent:
 724 <programlisting>
 725 FROM a, b WHERE a.id = b.id AND b.val &gt; 5
 726 </programlisting>
 727      and:
 728 <programlisting>
 729 FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val &gt; 5
 730 </programlisting>
 731      or perhaps even:
 732 <programlisting>
 733 FROM a NATURAL JOIN b WHERE b.val &gt; 5
 734 </programlisting>
 735      Which one of these you use is mainly a matter of style.  The
 736      <literal>JOIN</> syntax in the <literal>FROM</> clause is
 737      probably not as portable to other SQL database management systems.  For
 738      outer joins there is no choice in any case: they must be done in
 739      the <literal>FROM</> clause.  An <literal>ON</>/<literal>USING</>
 740      clause of an outer join is <emphasis>not</> equivalent to a
 741      <literal>WHERE</> condition, because it determines the addition
 742      of rows (for unmatched input rows) as well as the removal of rows
 743      from the final result.
 744     </para>
 745    </note>
 746
 747    <para>
 748     Here are some examples of <literal>WHERE</literal> clauses:
 749 <programlisting>
 750 SELECT ... FROM fdt WHERE c1 &gt; 5
 751
 752 SELECT ... FROM fdt WHERE c1 IN (1, 2, 3)
 753
 754 SELECT ... FROM fdt WHERE c1 IN (SELECT c1 FROM t2)
 755
 756 SELECT ... FROM fdt WHERE c1 IN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10)
 757
 758 SELECT ... FROM fdt WHERE c1 BETWEEN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10) AND 100
 759
 760 SELECT ... FROM fdt WHERE EXISTS (SELECT c1 FROM t2 WHERE c2 &gt; fdt.c1)
 761 </programlisting>
 762     <literal>fdt</literal> is the table derived in the
 763     <literal>FROM</> clause. Rows that do not meet the search
 764     condition of the <literal>WHERE</> clause are eliminated from
 765     <literal>fdt</literal>. Notice the use of scalar subqueries as
 766     value expressions.  Just like any other query, the subqueries can
 767     employ complex table expressions.  Notice also how
 768     <literal>fdt</literal> is referenced in the subqueries.
 769     Qualifying <literal>c1</> as <literal>fdt.c1</> is only necessary
 770     if <literal>c1</> is also the name of a column in the derived
 771     input table of the subquery.  But qualifying the column name adds
 772     clarity even when it is not needed.  This example shows how the column
 773     naming scope of an outer query extends into its inner queries.
 774    </para>
 775   </sect2>
 776
 777
 778   <sect2 id="queries-group">
 779    <title>The <literal>GROUP BY</literal> and <literal>HAVING</literal> Clauses</title>
 780
 781    <indexterm zone="queries-group">
 782     <primary>GROUP BY</primary>
 783    </indexterm>
 784
 785    <indexterm zone="queries-group">
 786     <primary>grouping</primary>
 787    </indexterm>
 788
 789    <para>
 790     After passing the <literal>WHERE</> filter, the derived input
 791     table might be subject to grouping, using the <literal>GROUP BY</>
 792     clause, and elimination of group rows using the <literal>HAVING</>
 793     clause.
 794    </para>
 795
 796 <synopsis>
 797 SELECT <replaceable>select_list</replaceable>
 798     FROM ...
 799     <optional>WHERE ...</optional>
 800     GROUP BY <replaceable>grouping_column_reference</replaceable> <optional>, <replaceable>grouping_column_reference</replaceable></optional>...
 801 </synopsis>
 802
 803    <para>
 804     The <xref linkend="sql-groupby" endterm="sql-groupby-title"> is
 805     used to group together those rows in a table that share the same
 806     values in all the columns listed. The order in which the columns
 807     are listed does not matter.  The effect is to combine each set
 808     of rows sharing common values into one group row that is
 809     representative of all rows in the group.  This is done to
 810     eliminate redundancy in the output and/or compute aggregates that
 811     apply to these groups.  For instance:
 812 <screen>
 813 <prompt>=&gt;</> <userinput>SELECT * FROM test1;</>
 814  x | y
 815 ---+---
 816  a | 3
 817  c | 2
 818  b | 5
 819  a | 1
 820 (4 rows)
 821
 822 <prompt>=&gt;</> <userinput>SELECT x FROM test1 GROUP BY x;</>
 823  x
 824 ---
 825  a
 826  b
 827  c
 828 (3 rows)
 829 </screen>
 830    </para>
 831
 832    <para>
 833     In the second query, we could not have written <literal>SELECT *
 834     FROM test1 GROUP BY x</literal>, because there is no single value
 835     for the column <literal>y</> that could be associated with each
 836     group.  The grouped-by columns can be referenced in the select list since
 837     they have a single value in each group.
 838    </para>
 839
 840    <para>
 841     In general, if a table is grouped, columns that are not
 842     used in the grouping cannot be referenced except in aggregate
 843     expressions.  An example with aggregate expressions is:
 844 <screen>
 845 <prompt>=&gt;</> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x;</>
 846  x | sum
 847 ---+-----
 848  a |   4
 849  b |   5
 850  c |   2
 851 (3 rows)
 852 </screen>
 853     Here <literal>sum</literal> is an aggregate function that
 854     computes a single value over the entire group.  More information
 855     about the available aggregate functions can be found in <xref
 856     linkend="functions-aggregate">.
 857    </para>
 858
 859    <tip>
 860     <para>
 861      Grouping without aggregate expressions effectively calculates the
 862      set of distinct values in a column.  This can also be achieved
 863      using the <literal>DISTINCT</> clause (see <xref
 864      linkend="queries-distinct">).
 865     </para>
 866    </tip>
 867
 868    <para>
 869     Here is another example:  it calculates the total sales for each
 870     product (rather than the total sales on all products):
 871 <programlisting>
 872 SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
 873     FROM products p LEFT JOIN sales s USING (product_id)
 874     GROUP BY product_id, p.name, p.price;
 875 </programlisting>
 876     In this example, the columns <literal>product_id</literal>,
 877     <literal>p.name</literal>, and <literal>p.price</literal> must be
 878     in the <literal>GROUP BY</> clause since they are referenced in
 879     the query select list.  (Depending on how exactly the products
 880     table is set up, name and price might be fully dependent on the
 881     product ID, so the additional groupings could theoretically be
 882     unnecessary, but this is not implemented yet.)  The column
 883     <literal>s.units</> does not have to be in the <literal>GROUP
 884     BY</> list since it is only used in an aggregate expression
 885     (<literal>sum(...)</literal>), which represents the sales
 886     of a product.  For each product, the query returns a summary row about
 887     all sales of the product.
 888    </para>
 889
 890    <para>
 891     In strict SQL, <literal>GROUP BY</> can only group by columns of
 892     the source table but <productname>PostgreSQL</productname> extends
 893     this to also allow <literal>GROUP BY</> to group by columns in the
 894     select list.  Grouping by value expressions instead of simple
 895     column names is also allowed.
 896    </para>
 897
 898    <indexterm>
 899     <primary>HAVING</primary>
 900    </indexterm>
 901
 902    <para>
 903     If a table has been grouped using a <literal>GROUP BY</literal>
 904     clause, but then only certain groups are of interest, the
 905     <literal>HAVING</literal> clause can be used, much like a
 906     <literal>WHERE</> clause, to eliminate groups from a grouped
 907     table.  The syntax is:
 908 <synopsis>
 909 SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY ... HAVING <replaceable>boolean_expression</replaceable>
 910 </synopsis>
 911     Expressions in the <literal>HAVING</> clause can refer both to
 912     grouped expressions and to ungrouped expressions (which necessarily
 913     involve an aggregate function).
 914    </para>
 915
 916    <para>
 917     Example:
 918 <screen>
 919 <prompt>=&gt;</> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING sum(y) &gt; 3;</>
 920  x | sum
 921 ---+-----
 922  a |   4
 923  b |   5
 924 (2 rows)
 925
 926 <prompt>=&gt;</> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING x &lt; 'c';</>
 927  x | sum
 928 ---+-----
 929  a |   4
 930  b |   5
 931 (2 rows)
 932 </screen>
 933    </para>
 934
 935    <para>
 936     Again, a more realistic example:
 937 <programlisting>
 938 SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
 939     FROM products p LEFT JOIN sales s USING (product_id)
 940     WHERE s.date &gt; CURRENT_DATE - INTERVAL '4 weeks'
 941     GROUP BY product_id, p.name, p.price, p.cost
 942     HAVING sum(p.price * s.units) &gt; 5000;
 943 </programlisting>
 944     In the example above, the <literal>WHERE</> clause is selecting
 945     rows by a column that is not grouped (the expression is only true for
 946     sales during the last four weeks), while the <literal>HAVING</>
 947     clause restricts the output to groups with total gross sales over
 948     5000.  Note that the aggregate expressions do not necessarily need
 949     to be the same in all parts of the query.
 950    </para>
 951   </sect2>
 952  </sect1>
 953
 954
 955  <sect1 id="queries-select-lists">
 956   <title>Select Lists</title>
 957
 958   <indexterm>
 959    <primary>SELECT</primary>
 960    <secondary>select list</secondary>
 961   </indexterm>
 962
 963   <para>
 964    As shown in the previous section,
 965    the table expression in the <command>SELECT</command> command
 966    constructs an intermediate virtual table by possibly combining
 967    tables, views, eliminating rows, grouping, etc.  This table is
 968    finally passed on to processing by the <firstterm>select list</firstterm>.  The select
 969    list determines which <emphasis>columns</emphasis> of the
 970    intermediate table are actually output.
 971   </para>
 972
 973   <sect2 id="queries-select-list-items">
 974    <title>Select-List Items</title>
 975
 976    <indexterm>
 977     <primary>*</primary>
 978    </indexterm>
 979
 980    <para>
 981     The simplest kind of select list is <literal>*</literal> which
 982     emits all columns that the table expression produces.  Otherwise,
 983     a select list is a comma-separated list of value expressions (as
 984     defined in <xref linkend="sql-expressions">).  For instance, it
 985     could be a list of column names:
 986 <programlisting>
 987 SELECT a, b, c FROM ...
 988 </programlisting>
 989      The columns names <literal>a</>, <literal>b</>, and <literal>c</>
 990      are either the actual names of the columns of tables referenced
 991      in the <literal>FROM</> clause, or the aliases given to them as
 992      explained in <xref linkend="queries-table-aliases">.  The name
 993      space available in the select list is the same as in the
 994      <literal>WHERE</> clause, unless grouping is used, in which case
 995      it is the same as in the <literal>HAVING</> clause.
 996    </para>
 997
 998    <para>
 999     If more than one table has a column of the same name, the table
1000     name must also be given, as in:
1001 <programlisting>
1002 SELECT tbl1.a, tbl2.a, tbl1.b FROM ...
1003 </programlisting>
1004     When working with multiple tables, it can also be useful to ask for
1005     all the columns of a particular table:
1006 <programlisting>
1007 SELECT tbl1.*, tbl2.a FROM ...
1008 </programlisting>
1009     (See also <xref linkend="queries-where">.)
1010    </para>
1011
1012    <para>
1013     If an arbitrary value expression is used in the select list, it
1014     conceptually adds a new virtual column to the returned table.  The
1015     value expression is evaluated once for each result row, with
1016     the row's values substituted for any column references.  But the
1017     expressions in the select list do not have to reference any
1018     columns in the table expression of the <literal>FROM</> clause;
1019     they could be constant arithmetic expressions as well, for
1020     instance.
1021    </para>
1022   </sect2>
1023
1024   <sect2 id="queries-column-labels">
1025    <title>Column Labels</title>
1026
1027    <indexterm zone="queries-column-labels">
1028     <primary>alias</primary>
1029     <secondary>in the select list</secondary>
1030    </indexterm>
1031
1032    <para>
1033     The entries in the select list can be assigned names for further
1034     processing.  The <quote>further processing</quote> in this case is
1035     an optional sort specification and the client application (e.g.,
1036     column headers for display).  For example:
1037 <programlisting>
1038 SELECT a AS value, b + c AS sum FROM ...
1039 </programlisting>
1040    </para>
1041
1042    <para>
1043     If no output column name is specified using <literal>AS</>,
1044     the system assigns a default column name.  For simple column references,
1045     this is the name of the referenced column.  For function
1046     calls, this is the name of the function.  For complex expressions,
1047     the system will generate a generic name.
1048    </para>
1049
1050    <para>
1051     The <literal>AS</> keyword is optional, but only if the new column
1052     name does not match any
1053     <productname>PostgreSQL</productname> keyword (see <xref
1054     linkend="sql-keywords-appendix">).  To avoid an accidental match to
1055     a keyword, you can double-quote the column name.  For example,
1056     <literal>VALUE</> is a keyword, so this does not work:
1057 <programlisting>
1058 SELECT a value, b + c AS sum FROM ...
1059 </programlisting>
1060     but this does:
1061 <programlisting>
1062 SELECT a "value", b + c AS sum FROM ...
1063 </programlisting>
1064     For protection against possible
1065     future keyword additions, it is recommended that you always either
1066     write <literal>AS</literal> or double-quote the output column name.
1067    </para>
1068
1069    <note>
1070     <para>
1071      The naming of output columns here is different from that done in
1072      the <literal>FROM</> clause (see <xref
1073      linkend="queries-table-aliases">).  This pipeline will in fact
1074      allow you to rename the same column twice, but the name chosen in
1075      the select list is the one that will be passed on.
1076     </para>
1077    </note>
1078   </sect2>
1079
1080   <sect2 id="queries-distinct">
1081    <title><literal>DISTINCT</literal></title>
1082
1083    <indexterm zone="queries-distinct">
1084     <primary>DISTINCT</primary>
1085    </indexterm>
1086
1087    <indexterm zone="queries-distinct">
1088     <primary>duplicates</primary>
1089    </indexterm>
1090
1091    <para>
1092     After the select list has been processed, the result table can
1093     optionally be subject to the elimination of duplicate rows.  The
1094     <literal>DISTINCT</literal> key word is written directly after
1095     <literal>SELECT</literal> to specify this:
1096 <synopsis>
1097 SELECT DISTINCT <replaceable>select_list</replaceable> ...
1098 </synopsis>
1099     (Instead of <literal>DISTINCT</> the key word <literal>ALL</literal>
1100     can be used to specify the default behavior of retaining all rows.)
1101    </para>
1102
1103    <para>
1104     <indexterm><primary>null value</><secondary sortas="DISTINCT">in
1105     DISTINCT</></indexterm>
1106     Obviously, two rows are considered distinct if they differ in at
1107     least one column value.  Null values are considered equal in this
1108     comparison.
1109    </para>
1110
1111    <para>
1112     Alternatively, an arbitrary expression can determine what rows are
1113     to be considered distinct:
1114 <synopsis>
1115 SELECT DISTINCT ON (<replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ...</optional>) <replaceable>select_list</replaceable> ...
1116 </synopsis>
1117     Here <replaceable>expression</replaceable> is an arbitrary value
1118     expression that is evaluated for all rows.  A set of rows for
1119     which all the expressions are equal are considered duplicates, and
1120     only the first row of the set is kept in the output.  Note that
1121     the <quote>first row</quote> of a set is unpredictable unless the
1122     query is sorted on enough columns to guarantee a unique ordering
1123     of the rows arriving at the <literal>DISTINCT</> filter.
1124     (<literal>DISTINCT ON</> processing occurs after <literal>ORDER
1125     BY</> sorting.)
1126    </para>
1127
1128    <para>
1129     The <literal>DISTINCT ON</> clause is not part of the SQL standard
1130     and is sometimes considered bad style because of the potentially
1131     indeterminate nature of its results.  With judicious use of
1132     <literal>GROUP BY</> and subqueries in <literal>FROM</> the
1133     construct can be avoided, but it is often the most convenient
1134     alternative.
1135    </para>
1136   </sect2>
1137  </sect1>
1138
1139
1140  <sect1 id="queries-union">
1141   <title>Combining Queries</title>
1142
1143   <indexterm zone="queries-union">
1144    <primary>UNION</primary>
1145   </indexterm>
1146   <indexterm zone="queries-union">
1147    <primary>INTERSECT</primary>
1148   </indexterm>
1149   <indexterm zone="queries-union">
1150    <primary>EXCEPT</primary>
1151   </indexterm>
1152   <indexterm zone="queries-union">
1153    <primary>set union</primary>
1154   </indexterm>
1155   <indexterm zone="queries-union">
1156    <primary>set intersection</primary>
1157   </indexterm>
1158   <indexterm zone="queries-union">
1159    <primary>set difference</primary>
1160   </indexterm>
1161   <indexterm zone="queries-union">
1162    <primary>set operation</primary>
1163   </indexterm>
1164
1165   <para>
1166    The results of two queries can be combined using the set operations
1167    union, intersection, and difference.  The syntax is
1168 <synopsis>
1169 <replaceable>query1</replaceable> UNION <optional>ALL</optional> <replaceable>query2</replaceable>
1170 <replaceable>query1</replaceable> INTERSECT <optional>ALL</optional> <replaceable>query2</replaceable>
1171 <replaceable>query1</replaceable> EXCEPT <optional>ALL</optional> <replaceable>query2</replaceable>
1172 </synopsis>
1173    <replaceable>query1</replaceable> and
1174    <replaceable>query2</replaceable> are queries that can use any of
1175    the features discussed up to this point.  Set operations can also
1176    be nested and chained, for example
1177 <synopsis>
1178 <replaceable>query1</replaceable> UNION <replaceable>query2</replaceable> UNION <replaceable>query3</replaceable>
1179 </synopsis>
1180    which really says
1181 <synopsis>
1182 (<replaceable>query1</replaceable> UNION <replaceable>query2</replaceable>) UNION <replaceable>query3</replaceable>
1183 </synopsis>
1184   </para>
1185
1186   <para>
1187    <literal>UNION</> effectively appends the result of
1188    <replaceable>query2</replaceable> to the result of
1189    <replaceable>query1</replaceable> (although there is no guarantee
1190    that this is the order in which the rows are actually returned).
1191    Furthermore, it eliminates duplicate rows from its result, in the same
1192    way as <literal>DISTINCT</>, unless <literal>UNION ALL</> is used.
1193   </para>
1194
1195   <para>
1196    <literal>INTERSECT</> returns all rows that are both in the result
1197    of <replaceable>query1</replaceable> and in the result of
1198    <replaceable>query2</replaceable>.  Duplicate rows are eliminated
1199    unless <literal>INTERSECT ALL</> is used.
1200   </para>
1201
1202   <para>
1203    <literal>EXCEPT</> returns all rows that are in the result of
1204    <replaceable>query1</replaceable> but not in the result of
1205    <replaceable>query2</replaceable>.  (This is sometimes called the
1206    <firstterm>difference</> between two queries.)  Again, duplicates
1207    are eliminated unless <literal>EXCEPT ALL</> is used.
1208   </para>
1209
1210   <para>
1211    In order to calculate the union, intersection, or difference of two
1212    queries, the two queries must be <quote>union compatible</quote>,
1213    which means that they return the same number of columns and
1214    the corresponding columns have compatible data types, as
1215    described in <xref linkend="typeconv-union-case">.
1216   </para>
1217  </sect1>
1218
1219
1220  <sect1 id="queries-order">
1221   <title>Sorting Rows</title>
1222
1223   <indexterm zone="queries-order">
1224    <primary>sorting</primary>
1225   </indexterm>
1226
1227   <indexterm zone="queries-order">
1228    <primary>ORDER BY</primary>
1229   </indexterm>
1230
1231   <para>
1232    After a query has produced an output table (after the select list
1233    has been processed) it can optionally be sorted.  If sorting is not
1234    chosen, the rows will be returned in an unspecified order.  The actual
1235    order in that case will depend on the scan and join plan types and
1236    the order on disk, but it must not be relied on.  A particular
1237    output ordering can only be guaranteed if the sort step is explicitly
1238    chosen.
1239   </para>
1240
1241   <para>
1242    The <literal>ORDER BY</> clause specifies the sort order:
1243 <synopsis>
1244 SELECT <replaceable>select_list</replaceable>
1245     FROM <replaceable>table_expression</replaceable>
1246     ORDER BY <replaceable>sort_expression1</replaceable> <optional>ASC | DESC</optional> <optional>NULLS { FIRST | LAST }</optional>
1247              <optional>, <replaceable>sort_expression2</replaceable> <optional>ASC | DESC</optional> <optional>NULLS { FIRST | LAST }</optional> ...</optional>
1248 </synopsis>
1249    The sort expression(s) can be any expression that would be valid in the
1250    query's select list.  An example is:
1251 <programlisting>
1252 SELECT a, b FROM table1 ORDER BY a + b, c;
1253 </programlisting>
1254    When more than one expression is specified,
1255    the later values are used to sort rows that are equal according to the
1256    earlier values.  Each expression can be followed by an optional
1257    <literal>ASC</> or <literal>DESC</> keyword to set the sort direction to
1258    ascending or descending.  <literal>ASC</> order is the default.
1259    Ascending order puts smaller values first, where
1260    <quote>smaller</quote> is defined in terms of the
1261    <literal>&lt;</literal> operator.  Similarly, descending order is
1262    determined with the <literal>&gt;</literal> operator.
1263     <footnote>
1264      <para>
1265       Actually, <productname>PostgreSQL</> uses the <firstterm>default B-tree
1266       operator class</> for the expression's data type to determine the sort
1267       ordering for <literal>ASC</> and <literal>DESC</>.  Conventionally,
1268       data types will be set up so that the <literal>&lt;</literal> and
1269       <literal>&gt;</literal> operators correspond to this sort ordering,
1270       but a user-defined data type's designer could choose to do something
1271       different.
1272      </para>
1273     </footnote>
1274   </para>
1275
1276   <para>
1277    The <literal>NULLS FIRST</> and <literal>NULLS LAST</> options can be
1278    used to determine whether nulls appear before or after non-null values
1279    in the sort ordering.  By default, null values sort as if larger than any
1280    non-null value; that is, <literal>NULLS FIRST</> is the default for
1281    <literal>DESC</> order, and <literal>NULLS LAST</> otherwise.
1282   </para>
1283
1284   <para>
1285    Note that the ordering options are considered independently for each
1286    sort column.  For example <literal>ORDER BY x, y DESC</> means
1287    <literal>ORDER BY x ASC, y DESC</>, which is not the same as
1288    <literal>ORDER BY x DESC, y DESC</>.
1289   </para>
1290
1291   <para>
1292    For backwards compatibility with the SQL92 version of the standard,
1293    a <replaceable>sort_expression</> can instead be the name or number
1294    of an output column, as in:
1295 <programlisting>
1296 SELECT a + b AS sum, c FROM table1 ORDER BY sum;
1297 SELECT a, max(b) FROM table1 GROUP BY a ORDER BY 1;
1298 </programlisting>
1299    both of which sort by the first output column.  Note that an output
1300    column name has to stand alone, it's not allowed as part of an expression
1301    &mdash; for example, this is <emphasis>not</> correct:
1302 <programlisting>
1303 SELECT a + b AS sum, c FROM table1 ORDER BY sum + c;          -- wrong
1304 </programlisting>
1305    This restriction is made to reduce ambiguity.  There is still
1306    ambiguity if an <literal>ORDER BY</> item is a simple name that
1307    could match either an output column name or a column from the table
1308    expression.  The output column is used in such cases.  This would
1309    only cause confusion if you use <literal>AS</> to rename an output
1310    column to match some other table column's name.
1311   </para>
1312
1313   <para>
1314    <literal>ORDER BY</> can be applied to the result of a
1315    <literal>UNION</>, <literal>INTERSECT</>, or <literal>EXCEPT</>
1316    combination, but in this case it is only permitted to sort by
1317    output column names or numbers, not by expressions.
1318   </para>
1319  </sect1>
1320
1321
1322  <sect1 id="queries-limit">
1323   <title><literal>LIMIT</literal> and <literal>OFFSET</literal></title>
1324
1325   <indexterm zone="queries-limit">
1326    <primary>LIMIT</primary>
1327   </indexterm>
1328
1329   <indexterm zone="queries-limit">
1330    <primary>OFFSET</primary>
1331   </indexterm>
1332
1333   <para>
1334    <literal>LIMIT</> and <literal>OFFSET</> allow you to retrieve just
1335    a portion of the rows that are generated by the rest of the query:
1336 <synopsis>
1337 SELECT <replaceable>select_list</replaceable>
1338     FROM <replaceable>table_expression</replaceable>
1339     <optional> ORDER BY ... </optional>
1340     <optional> LIMIT { <replaceable>number</replaceable> | ALL } </optional> <optional> OFFSET <replaceable>number</replaceable> </optional>
1341 </synopsis>
1342   </para>
1343
1344   <para>
1345    If a limit count is given, no more than that many rows will be
1346    returned (but possibly less, if the query itself yields less rows).
1347    <literal>LIMIT ALL</> is the same as omitting the <literal>LIMIT</>
1348    clause.
1349   </para>
1350
1351   <para>
1352    <literal>OFFSET</> says to skip that many rows before beginning to
1353    return rows.  <literal>OFFSET 0</> is the same as
1354    omitting the <literal>OFFSET</> clause.  If both <literal>OFFSET</>
1355    and <literal>LIMIT</> appear, then <literal>OFFSET</> rows are
1356    skipped before starting to count the <literal>LIMIT</> rows that
1357    are returned.
1358   </para>
1359
1360   <para>
1361    When using <literal>LIMIT</>, it is important to use an
1362    <literal>ORDER BY</> clause that constrains the result rows into a
1363    unique order.  Otherwise you will get an unpredictable subset of
1364    the query's rows. You might be asking for the tenth through
1365    twentieth rows, but tenth through twentieth in what ordering?  The
1366    ordering is unknown, unless you specified <literal>ORDER BY</>.
1367   </para>
1368
1369   <para>
1370    The query optimizer takes <literal>LIMIT</> into account when
1371    generating a query plan, so you are very likely to get different
1372    plans (yielding different row orders) depending on what you give
1373    for <literal>LIMIT</> and <literal>OFFSET</>.  Thus, using
1374    different <literal>LIMIT</>/<literal>OFFSET</> values to select
1375    different subsets of a query result <emphasis>will give
1376    inconsistent results</emphasis> unless you enforce a predictable
1377    result ordering with <literal>ORDER BY</>.  This is not a bug; it
1378    is an inherent consequence of the fact that SQL does not promise to
1379    deliver the results of a query in any particular order unless
1380    <literal>ORDER BY</> is used to constrain the order.
1381   </para>
1382
1383   <para>
1384    The rows skipped by an <literal>OFFSET</> clause still have to be
1385    computed inside the server; therefore a large <literal>OFFSET</>
1386    might be inefficient.
1387   </para>
1388  </sect1>
1389
1390
1391  <sect1 id="queries-values">
1392   <title><literal>VALUES</literal> Lists</title>
1393
1394   <indexterm zone="queries-values">
1395    <primary>VALUES</primary>
1396   </indexterm>
1397
1398   <para>
1399    <literal>VALUES</> provides a way to generate a <quote>constant table</>
1400    that can be used in a query without having to actually create and populate
1401    a table on-disk.  The syntax is
1402 <synopsis>
1403 VALUES ( <replaceable class="PARAMETER">expression</replaceable> [, ...] ) [, ...]
1404 </synopsis>
1405    Each parenthesized list of expressions generates a row in the table.
1406    The lists must all have the same number of elements (i.e., the number
1407    of columns in the table), and corresponding entries in each list must
1408    have compatible data types.  The actual data type assigned to each column
1409    of the result is determined using the same rules as for <literal>UNION</>
1410    (see <xref linkend="typeconv-union-case">).
1411   </para>
1412
1413   <para>
1414    As an example:
1415 <programlisting>
1416 VALUES (1, 'one'), (2, 'two'), (3, 'three');
1417 </programlisting>
1418
1419    will return a table of two columns and three rows.  It's effectively
1420    equivalent to:
1421 <programlisting>
1422 SELECT 1 AS column1, 'one' AS column2
1423 UNION ALL
1424 SELECT 2, 'two'
1425 UNION ALL
1426 SELECT 3, 'three';
1427 </programlisting>
1428
1429    By default, <productname>PostgreSQL</productname> assigns the names
1430    <literal>column1</>, <literal>column2</>, etc. to the columns of a
1431    <literal>VALUES</> table.  The column names are not specified by the
1432    SQL standard and different database systems do it differently, so
1433    it's usually better to override the default names with a table alias
1434    list.
1435   </para>
1436
1437   <para>
1438    Syntactically, <literal>VALUES</> followed by expression lists is
1439    treated as equivalent to
1440 <synopsis>
1441 SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable>
1442 </synopsis>
1443    and can appear anywhere a <literal>SELECT</> can.  For example, you can
1444    use it as an arm of a <literal>UNION</>, or attach a
1445    <replaceable>sort_specification</replaceable> (<literal>ORDER BY</>,
1446    <literal>LIMIT</>, and/or <literal>OFFSET</>) to it.  <literal>VALUES</>
1447    is most commonly used as the data source in an <command>INSERT</> command,
1448    and next most commonly as a subquery.
1449   </para>
1450
1451   <para>
1452    For more information see <xref linkend="sql-values"
1453    endterm="sql-values-title">.
1454   </para>
1455
1456  </sect1>
1457
1458 </chapter>