Make GROUP BY work properly for datatypes that only support hashing and not
[PostgreSQL.git] / doc / src / sgml / queries.sgml
blob0d237abc7e2ea2e087811bfa05400b1b9e95a290
1 <!-- $PostgreSQL$ -->
3 <chapter id="queries">
4 <title>Queries</title>
6 <indexterm zone="queries">
7 <primary>query</primary>
8 </indexterm>
10 <indexterm zone="queries">
11 <primary>SELECT</primary>
12 </indexterm>
14 <para>
15 The previous chapters explained how to create tables, how to fill
16 them with data, and how to manipulate that data. Now we finally
17 discuss how to retrieve the data out of the database.
18 </para>
21 <sect1 id="queries-overview">
22 <title>Overview</title>
24 <para>
25 The process of retrieving or the command to retrieve data from a
26 database is called a <firstterm>query</firstterm>. In SQL the
27 <xref linkend="sql-select" endterm="sql-select-title"> command is
28 used to specify queries. The general syntax of the
29 <command>SELECT</command> command is
30 <synopsis>
31 SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> <optional><replaceable>sort_specification</replaceable></optional>
32 </synopsis>
33 The following sections describe the details of the select list, the
34 table expression, and the sort specification.
35 </para>
37 <para>
38 A simple kind of query has the form:
39 <programlisting>
40 SELECT * FROM table1;
41 </programlisting>
42 Assuming that there is a table called <literal>table1</literal>,
43 this command would retrieve all rows and all columns from
44 <literal>table1</literal>. (The method of retrieval depends on the
45 client application. For example, the
46 <application>psql</application> program will display an ASCII-art
47 table on the screen, while client libraries will offer functions to
48 extract individual values from the query result.) The select list
49 specification <literal>*</literal> means all columns that the table
50 expression happens to provide. A select list can also select a
51 subset of the available columns or make calculations using the
52 columns. For example, if
53 <literal>table1</literal> has columns named <literal>a</>,
54 <literal>b</>, and <literal>c</> (and perhaps others) you can make
55 the following query:
56 <programlisting>
57 SELECT a, b + c FROM table1;
58 </programlisting>
59 (assuming that <literal>b</> and <literal>c</> are of a numerical
60 data type).
61 See <xref linkend="queries-select-lists"> for more details.
62 </para>
64 <para>
65 <literal>FROM table1</literal> is a particularly simple kind of
66 table expression: it reads just one table. In general, table
67 expressions can be complex constructs of base tables, joins, and
68 subqueries. But you can also omit the table expression entirely and
69 use the <command>SELECT</command> command as a calculator:
70 <programlisting>
71 SELECT 3 * 4;
72 </programlisting>
73 This is more useful if the expressions in the select list return
74 varying results. For example, you could call a function this way:
75 <programlisting>
76 SELECT random();
77 </programlisting>
78 </para>
79 </sect1>
82 <sect1 id="queries-table-expressions">
83 <title>Table Expressions</title>
85 <indexterm zone="queries-table-expressions">
86 <primary>table expression</primary>
87 </indexterm>
89 <para>
90 A <firstterm>table expression</firstterm> computes a table. The
91 table expression contains a <literal>FROM</> clause that is
92 optionally followed by <literal>WHERE</>, <literal>GROUP BY</>, and
93 <literal>HAVING</> clauses. Trivial table expressions simply refer
94 to a table on disk, a so-called base table, but more complex
95 expressions can be used to modify or combine base tables in various
96 ways.
97 </para>
99 <para>
100 The optional <literal>WHERE</>, <literal>GROUP BY</>, and
101 <literal>HAVING</> clauses in the table expression specify a
102 pipeline of successive transformations performed on the table
103 derived in the <literal>FROM</> clause. All these transformations
104 produce a virtual table that provides the rows that are passed to
105 the select list to compute the output rows of the query.
106 </para>
108 <sect2 id="queries-from">
109 <title>The <literal>FROM</literal> Clause</title>
111 <para>
112 The <xref linkend="sql-from" endterm="sql-from-title"> derives a
113 table from one or more other tables given in a comma-separated
114 table reference list.
115 <synopsis>
116 FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_reference</replaceable> <optional>, ...</optional></optional>
117 </synopsis>
119 A table reference can be a table name (possibly schema-qualified),
120 or a derived table such as a subquery, a table join, or complex
121 combinations of these. If more than one table reference is listed
122 in the <literal>FROM</> clause they are cross-joined (see below)
123 to form the intermediate virtual table that can then be subject to
124 transformations by the <literal>WHERE</>, <literal>GROUP BY</>,
125 and <literal>HAVING</> clauses and is finally the result of the
126 overall table expression.
127 </para>
129 <indexterm>
130 <primary>ONLY</primary>
131 </indexterm>
133 <para>
134 When a table reference names a table that is the parent of a
135 table inheritance hierarchy, the table reference produces rows of
136 not only that table but all of its descendant tables, unless the
137 key word <literal>ONLY</> precedes the table name. However, the
138 reference produces only the columns that appear in the named table
139 &mdash; any columns added in subtables are ignored.
140 </para>
142 <sect3 id="queries-join">
143 <title>Joined Tables</title>
145 <indexterm zone="queries-join">
146 <primary>join</primary>
147 </indexterm>
149 <para>
150 A joined table is a table derived from two other (real or
151 derived) tables according to the rules of the particular join
152 type. Inner, outer, and cross-joins are available.
153 </para>
155 <variablelist>
156 <title>Join Types</title>
158 <varlistentry>
159 <term>Cross join</term>
161 <indexterm>
162 <primary>join</primary>
163 <secondary>cross</secondary>
164 </indexterm>
166 <indexterm>
167 <primary>cross join</primary>
168 </indexterm>
170 <listitem>
171 <synopsis>
172 <replaceable>T1</replaceable> CROSS JOIN <replaceable>T2</replaceable>
173 </synopsis>
175 <para>
176 For each combination of rows from
177 <replaceable>T1</replaceable> and
178 <replaceable>T2</replaceable>, the derived table will contain a
179 row consisting of all columns in <replaceable>T1</replaceable>
180 followed by all columns in <replaceable>T2</replaceable>. If
181 the tables have N and M rows respectively, the joined
182 table will have N * M rows.
183 </para>
185 <para>
186 <literal>FROM <replaceable>T1</replaceable> CROSS JOIN
187 <replaceable>T2</replaceable></literal> is equivalent to
188 <literal>FROM <replaceable>T1</replaceable>,
189 <replaceable>T2</replaceable></literal>. It is also equivalent to
190 <literal>FROM <replaceable>T1</replaceable> INNER JOIN
191 <replaceable>T2</replaceable> ON TRUE</literal> (see below).
192 </para>
193 </listitem>
194 </varlistentry>
196 <varlistentry>
197 <term>Qualified joins</term>
199 <indexterm>
200 <primary>join</primary>
201 <secondary>outer</secondary>
202 </indexterm>
204 <indexterm>
205 <primary>outer join</primary>
206 </indexterm>
208 <listitem>
209 <synopsis>
210 <replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> ON <replaceable>boolean_expression</replaceable>
211 <replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> USING ( <replaceable>join column list</replaceable> )
212 <replaceable>T1</replaceable> NATURAL { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable>
213 </synopsis>
215 <para>
216 The words <literal>INNER</literal> and
217 <literal>OUTER</literal> are optional in all forms.
218 <literal>INNER</literal> is the default;
219 <literal>LEFT</literal>, <literal>RIGHT</literal>, and
220 <literal>FULL</literal> imply an outer join.
221 </para>
223 <para>
224 The <firstterm>join condition</firstterm> is specified in the
225 <literal>ON</> or <literal>USING</> clause, or implicitly by
226 the word <literal>NATURAL</>. The join condition determines
227 which rows from the two source tables are considered to
228 <quote>match</quote>, as explained in detail below.
229 </para>
231 <para>
232 The <literal>ON</> clause is the most general kind of join
233 condition: it takes a Boolean value expression of the same
234 kind as is used in a <literal>WHERE</> clause. A pair of rows
235 from <replaceable>T1</> and <replaceable>T2</> match if the
236 <literal>ON</> expression evaluates to true for them.
237 </para>
239 <para>
240 <literal>USING</> is a shorthand notation: it takes a
241 comma-separated list of column names, which the joined tables
242 must have in common, and forms a join condition specifying
243 equality of each of these pairs of columns. Furthermore, the
244 output of a <literal>JOIN USING</> has one column for each of
245 the equated pairs of input columns, followed by all of the
246 other columns from each table. Thus, <literal>USING (a, b,
247 c)</literal> is equivalent to <literal>ON (t1.a = t2.a AND
248 t1.b = t2.b AND t1.c = t2.c)</literal> with the exception that
249 if <literal>ON</> is used there will be two columns
250 <literal>a</>, <literal>b</>, and <literal>c</> in the result,
251 whereas with <literal>USING</> there will be only one of each.
252 </para>
254 <para>
255 <indexterm>
256 <primary>join</primary>
257 <secondary>natural</secondary>
258 </indexterm>
259 <indexterm>
260 <primary>natural join</primary>
261 </indexterm>
262 Finally, <literal>NATURAL</> is a shorthand form of
263 <literal>USING</>: it forms a <literal>USING</> list
264 consisting of exactly those column names that appear in both
265 input tables. As with <literal>USING</>, these columns appear
266 only once in the output table.
267 </para>
269 <para>
270 The possible types of qualified join are:
272 <variablelist>
273 <varlistentry>
274 <term><literal>INNER JOIN</></term>
276 <listitem>
277 <para>
278 For each row R1 of T1, the joined table has a row for each
279 row in T2 that satisfies the join condition with R1.
280 </para>
281 </listitem>
282 </varlistentry>
284 <varlistentry>
285 <term><literal>LEFT OUTER JOIN</></term>
287 <indexterm>
288 <primary>join</primary>
289 <secondary>left</secondary>
290 </indexterm>
292 <indexterm>
293 <primary>left join</primary>
294 </indexterm>
296 <listitem>
297 <para>
298 First, an inner join is performed. Then, for each row in
299 T1 that does not satisfy the join condition with any row in
300 T2, a joined row is added with null values in columns of
301 T2. Thus, the joined table unconditionally has at least
302 one row for each row in T1.
303 </para>
304 </listitem>
305 </varlistentry>
307 <varlistentry>
308 <term><literal>RIGHT OUTER JOIN</></term>
310 <indexterm>
311 <primary>join</primary>
312 <secondary>right</secondary>
313 </indexterm>
315 <indexterm>
316 <primary>right join</primary>
317 </indexterm>
319 <listitem>
320 <para>
321 First, an inner join is performed. Then, for each row in
322 T2 that does not satisfy the join condition with any row in
323 T1, a joined row is added with null values in columns of
324 T1. This is the converse of a left join: the result table
325 will unconditionally have a row for each row in T2.
326 </para>
327 </listitem>
328 </varlistentry>
330 <varlistentry>
331 <term><literal>FULL OUTER JOIN</></term>
333 <listitem>
334 <para>
335 First, an inner join is performed. Then, for each row in
336 T1 that does not satisfy the join condition with any row in
337 T2, a joined row is added with null values in columns of
338 T2. Also, for each row of T2 that does not satisfy the
339 join condition with any row in T1, a joined row with null
340 values in the columns of T1 is added.
341 </para>
342 </listitem>
343 </varlistentry>
344 </variablelist>
345 </para>
346 </listitem>
347 </varlistentry>
348 </variablelist>
350 <para>
351 Joins of all types can be chained together or nested: either or
352 both of <replaceable>T1</replaceable> and
353 <replaceable>T2</replaceable> might be joined tables. Parentheses
354 can be used around <literal>JOIN</> clauses to control the join
355 order. In the absence of parentheses, <literal>JOIN</> clauses
356 nest left-to-right.
357 </para>
359 <para>
360 To put this together, assume we have tables <literal>t1</literal>:
361 <programlisting>
362 num | name
363 -----+------
364 1 | a
365 2 | b
366 3 | c
367 </programlisting>
368 and <literal>t2</literal>:
369 <programlisting>
370 num | value
371 -----+-------
372 1 | xxx
373 3 | yyy
374 5 | zzz
375 </programlisting>
376 then we get the following results for the various joins:
377 <screen>
378 <prompt>=&gt;</> <userinput>SELECT * FROM t1 CROSS JOIN t2;</>
379 num | name | num | value
380 -----+------+-----+-------
381 1 | a | 1 | xxx
382 1 | a | 3 | yyy
383 1 | a | 5 | zzz
384 2 | b | 1 | xxx
385 2 | b | 3 | yyy
386 2 | b | 5 | zzz
387 3 | c | 1 | xxx
388 3 | c | 3 | yyy
389 3 | c | 5 | zzz
390 (9 rows)
392 <prompt>=&gt;</> <userinput>SELECT * FROM t1 INNER JOIN t2 ON t1.num = t2.num;</>
393 num | name | num | value
394 -----+------+-----+-------
395 1 | a | 1 | xxx
396 3 | c | 3 | yyy
397 (2 rows)
399 <prompt>=&gt;</> <userinput>SELECT * FROM t1 INNER JOIN t2 USING (num);</>
400 num | name | value
401 -----+------+-------
402 1 | a | xxx
403 3 | c | yyy
404 (2 rows)
406 <prompt>=&gt;</> <userinput>SELECT * FROM t1 NATURAL INNER JOIN t2;</>
407 num | name | value
408 -----+------+-------
409 1 | a | xxx
410 3 | c | yyy
411 (2 rows)
413 <prompt>=&gt;</> <userinput>SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num;</>
414 num | name | num | value
415 -----+------+-----+-------
416 1 | a | 1 | xxx
417 2 | b | |
418 3 | c | 3 | yyy
419 (3 rows)
421 <prompt>=&gt;</> <userinput>SELECT * FROM t1 LEFT JOIN t2 USING (num);</>
422 num | name | value
423 -----+------+-------
424 1 | a | xxx
425 2 | b |
426 3 | c | yyy
427 (3 rows)
429 <prompt>=&gt;</> <userinput>SELECT * FROM t1 RIGHT JOIN t2 ON t1.num = t2.num;</>
430 num | name | num | value
431 -----+------+-----+-------
432 1 | a | 1 | xxx
433 3 | c | 3 | yyy
434 | | 5 | zzz
435 (3 rows)
437 <prompt>=&gt;</> <userinput>SELECT * FROM t1 FULL JOIN t2 ON t1.num = t2.num;</>
438 num | name | num | value
439 -----+------+-----+-------
440 1 | a | 1 | xxx
441 2 | b | |
442 3 | c | 3 | yyy
443 | | 5 | zzz
444 (4 rows)
445 </screen>
446 </para>
448 <para>
449 The join condition specified with <literal>ON</> can also contain
450 conditions that do not relate directly to the join. This can
451 prove useful for some queries but needs to be thought out
452 carefully. For example:
453 <screen>
454 <prompt>=&gt;</> <userinput>SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx';</>
455 num | name | num | value
456 -----+------+-----+-------
457 1 | a | 1 | xxx
458 2 | b | |
459 3 | c | |
460 (3 rows)
461 </screen>
462 </para>
463 </sect3>
465 <sect3 id="queries-table-aliases">
466 <title>Table and Column Aliases</title>
468 <indexterm zone="queries-table-aliases">
469 <primary>alias</primary>
470 <secondary>in the FROM clause</secondary>
471 </indexterm>
473 <indexterm>
474 <primary>label</primary>
475 <see>alias</see>
476 </indexterm>
478 <para>
479 A temporary name can be given to tables and complex table
480 references to be used for references to the derived table in
481 the rest of the query. This is called a <firstterm>table
482 alias</firstterm>.
483 </para>
485 <para>
486 To create a table alias, write
487 <synopsis>
488 FROM <replaceable>table_reference</replaceable> AS <replaceable>alias</replaceable>
489 </synopsis>
491 <synopsis>
492 FROM <replaceable>table_reference</replaceable> <replaceable>alias</replaceable>
493 </synopsis>
494 The <literal>AS</literal> key word is optional noise.
495 <replaceable>alias</replaceable> can be any identifier.
496 </para>
498 <para>
499 A typical application of table aliases is to assign short
500 identifiers to long table names to keep the join clauses
501 readable. For example:
502 <programlisting>
503 SELECT * FROM some_very_long_table_name s JOIN another_fairly_long_name a ON s.id = a.num;
504 </programlisting>
505 </para>
507 <para>
508 The alias becomes the new name of the table reference for the
509 current query &mdash; it is no longer possible to refer to the table
510 by the original name. Thus:
511 <programlisting>
512 SELECT * FROM my_table AS m WHERE my_table.a &gt; 5;
513 </programlisting>
514 is not valid according to the SQL standard. In
515 <productname>PostgreSQL</productname> this will draw an error if the
516 <xref linkend="guc-add-missing-from"> configuration variable is
517 <literal>off</> (as it is by default). If it is <literal>on</>,
518 an implicit table reference will be added to the
519 <literal>FROM</literal> clause, so the query is processed as if
520 it were written as:
521 <programlisting>
522 SELECT * FROM my_table AS m, my_table AS my_table WHERE my_table.a &gt; 5;
523 </programlisting>
524 That will result in a cross join, which is usually not what you want.
525 </para>
527 <para>
528 Table aliases are mainly for notational convenience, but it is
529 necessary to use them when joining a table to itself, e.g.:
530 <programlisting>
531 SELECT * FROM people AS mother JOIN people AS child ON mother.id = child.mother_id;
532 </programlisting>
533 Additionally, an alias is required if the table reference is a
534 subquery (see <xref linkend="queries-subqueries">).
535 </para>
537 <para>
538 Parentheses are used to resolve ambiguities. In the following example,
539 the first statement assigns the alias <literal>b</literal> to the second
540 instance of <literal>my_table</>, but the second statement assigns the
541 alias to the result of the join:
542 <programlisting>
543 SELECT * FROM my_table AS a CROSS JOIN my_table AS b ...
544 SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ...
545 </programlisting>
546 </para>
548 <para>
549 Another form of table aliasing gives temporary names to the columns of
550 the table, as well as the table itself:
551 <synopsis>
552 FROM <replaceable>table_reference</replaceable> <optional>AS</optional> <replaceable>alias</replaceable> ( <replaceable>column1</replaceable> <optional>, <replaceable>column2</replaceable> <optional>, ...</optional></optional> )
553 </synopsis>
554 If fewer column aliases are specified than the actual table has
555 columns, the remaining columns are not renamed. This syntax is
556 especially useful for self-joins or subqueries.
557 </para>
559 <para>
560 When an alias is applied to the output of a <literal>JOIN</>
561 clause, using any of these forms, the alias hides the original
562 names within the <literal>JOIN</>. For example:
563 <programlisting>
564 SELECT a.* FROM my_table AS a JOIN your_table AS b ON ...
565 </programlisting>
566 is valid SQL, but:
567 <programlisting>
568 SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c
569 </programlisting>
570 is not valid: the table alias <literal>a</> is not visible
571 outside the alias <literal>c</>.
572 </para>
573 </sect3>
575 <sect3 id="queries-subqueries">
576 <title>Subqueries</title>
578 <indexterm zone="queries-subqueries">
579 <primary>subquery</primary>
580 </indexterm>
582 <para>
583 Subqueries specifying a derived table must be enclosed in
584 parentheses and <emphasis>must</emphasis> be assigned a table
585 alias name. (See <xref linkend="queries-table-aliases">.) For
586 example:
587 <programlisting>
588 FROM (SELECT * FROM table1) AS alias_name
589 </programlisting>
590 </para>
592 <para>
593 This example is equivalent to <literal>FROM table1 AS
594 alias_name</literal>. More interesting cases, which cannot be
595 reduced to a plain join, arise when the subquery involves
596 grouping or aggregation.
597 </para>
599 <para>
600 A subquery can also be a <command>VALUES</> list:
601 <programlisting>
602 FROM (VALUES ('anne', 'smith'), ('bob', 'jones'), ('joe', 'blow'))
603 AS names(first, last)
604 </programlisting>
605 Again, a table alias is required. Assigning alias names to the columns
606 of the <command>VALUES</> list is optional, but is good practice.
607 For more information see <xref linkend="queries-values">.
608 </para>
609 </sect3>
611 <sect3 id="queries-tablefunctions">
612 <title>Table Functions</title>
614 <indexterm zone="queries-tablefunctions"><primary>table function</></>
616 <indexterm zone="queries-tablefunctions">
617 <primary>function</>
618 <secondary>in the FROM clause</>
619 </indexterm>
621 <para>
622 Table functions are functions that produce a set of rows, made up
623 of either base data types (scalar types) or composite data types
624 (table rows). They are used like a table, view, or subquery in
625 the <literal>FROM</> clause of a query. Columns returned by table
626 functions can be included in <literal>SELECT</>,
627 <literal>JOIN</>, or <literal>WHERE</> clauses in the same manner
628 as a table, view, or subquery column.
629 </para>
631 <para>
632 If a table function returns a base data type, the single result
633 column is named like the function. If the function returns a
634 composite type, the result columns get the same names as the
635 individual attributes of the type.
636 </para>
638 <para>
639 A table function can be aliased in the <literal>FROM</> clause,
640 but it also can be left unaliased. If a function is used in the
641 <literal>FROM</> clause with no alias, the function name is used
642 as the resulting table name.
643 </para>
645 <para>
646 Some examples:
647 <programlisting>
648 CREATE TABLE foo (fooid int, foosubid int, fooname text);
650 CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
651 SELECT * FROM foo WHERE fooid = $1;
652 $$ LANGUAGE SQL;
654 SELECT * FROM getfoo(1) AS t1;
656 SELECT * FROM foo
657 WHERE foosubid IN (select foosubid from getfoo(foo.fooid) z
658 where z.fooid = foo.fooid);
660 CREATE VIEW vw_getfoo AS SELECT * FROM getfoo(1);
662 SELECT * FROM vw_getfoo;
663 </programlisting>
664 </para>
666 <para>
667 In some cases it is useful to define table functions that can
668 return different column sets depending on how they are invoked.
669 To support this, the table function can be declared as returning
670 the pseudotype <type>record</>. When such a function is used in
671 a query, the expected row structure must be specified in the
672 query itself, so that the system can know how to parse and plan
673 the query. Consider this example:
674 <programlisting>
675 SELECT *
676 FROM dblink('dbname=mydb', 'select proname, prosrc from pg_proc')
677 AS t1(proname name, prosrc text)
678 WHERE proname LIKE 'bytea%';
679 </programlisting>
680 The <literal>dblink</> function executes a remote query (see
681 <filename>contrib/dblink</>). It is declared to return
682 <type>record</> since it might be used for any kind of query.
683 The actual column set must be specified in the calling query so
684 that the parser knows, for example, what <literal>*</> should
685 expand to.
686 </para>
687 </sect3>
688 </sect2>
690 <sect2 id="queries-where">
691 <title>The <literal>WHERE</literal> Clause</title>
693 <indexterm zone="queries-where">
694 <primary>WHERE</primary>
695 </indexterm>
697 <para>
698 The syntax of the <xref linkend="sql-where"
699 endterm="sql-where-title"> is
700 <synopsis>
701 WHERE <replaceable>search_condition</replaceable>
702 </synopsis>
703 where <replaceable>search_condition</replaceable> is any value
704 expression (see <xref linkend="sql-expressions">) that
705 returns a value of type <type>boolean</type>.
706 </para>
708 <para>
709 After the processing of the <literal>FROM</> clause is done, each
710 row of the derived virtual table is checked against the search
711 condition. If the result of the condition is true, the row is
712 kept in the output table, otherwise (that is, if the result is
713 false or null) it is discarded. The search condition typically
714 references at least some column of the table generated in the
715 <literal>FROM</> clause; this is not required, but otherwise the
716 <literal>WHERE</> clause will be fairly useless.
717 </para>
719 <note>
720 <para>
721 The join condition of an inner join can be written either in
722 the <literal>WHERE</> clause or in the <literal>JOIN</> clause.
723 For example, these table expressions are equivalent:
724 <programlisting>
725 FROM a, b WHERE a.id = b.id AND b.val &gt; 5
726 </programlisting>
727 and:
728 <programlisting>
729 FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val &gt; 5
730 </programlisting>
731 or perhaps even:
732 <programlisting>
733 FROM a NATURAL JOIN b WHERE b.val &gt; 5
734 </programlisting>
735 Which one of these you use is mainly a matter of style. The
736 <literal>JOIN</> syntax in the <literal>FROM</> clause is
737 probably not as portable to other SQL database management systems. For
738 outer joins there is no choice in any case: they must be done in
739 the <literal>FROM</> clause. An <literal>ON</>/<literal>USING</>
740 clause of an outer join is <emphasis>not</> equivalent to a
741 <literal>WHERE</> condition, because it determines the addition
742 of rows (for unmatched input rows) as well as the removal of rows
743 from the final result.
744 </para>
745 </note>
747 <para>
748 Here are some examples of <literal>WHERE</literal> clauses:
749 <programlisting>
750 SELECT ... FROM fdt WHERE c1 &gt; 5
752 SELECT ... FROM fdt WHERE c1 IN (1, 2, 3)
754 SELECT ... FROM fdt WHERE c1 IN (SELECT c1 FROM t2)
756 SELECT ... FROM fdt WHERE c1 IN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10)
758 SELECT ... FROM fdt WHERE c1 BETWEEN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10) AND 100
760 SELECT ... FROM fdt WHERE EXISTS (SELECT c1 FROM t2 WHERE c2 &gt; fdt.c1)
761 </programlisting>
762 <literal>fdt</literal> is the table derived in the
763 <literal>FROM</> clause. Rows that do not meet the search
764 condition of the <literal>WHERE</> clause are eliminated from
765 <literal>fdt</literal>. Notice the use of scalar subqueries as
766 value expressions. Just like any other query, the subqueries can
767 employ complex table expressions. Notice also how
768 <literal>fdt</literal> is referenced in the subqueries.
769 Qualifying <literal>c1</> as <literal>fdt.c1</> is only necessary
770 if <literal>c1</> is also the name of a column in the derived
771 input table of the subquery. But qualifying the column name adds
772 clarity even when it is not needed. This example shows how the column
773 naming scope of an outer query extends into its inner queries.
774 </para>
775 </sect2>
778 <sect2 id="queries-group">
779 <title>The <literal>GROUP BY</literal> and <literal>HAVING</literal> Clauses</title>
781 <indexterm zone="queries-group">
782 <primary>GROUP BY</primary>
783 </indexterm>
785 <indexterm zone="queries-group">
786 <primary>grouping</primary>
787 </indexterm>
789 <para>
790 After passing the <literal>WHERE</> filter, the derived input
791 table might be subject to grouping, using the <literal>GROUP BY</>
792 clause, and elimination of group rows using the <literal>HAVING</>
793 clause.
794 </para>
796 <synopsis>
797 SELECT <replaceable>select_list</replaceable>
798 FROM ...
799 <optional>WHERE ...</optional>
800 GROUP BY <replaceable>grouping_column_reference</replaceable> <optional>, <replaceable>grouping_column_reference</replaceable></optional>...
801 </synopsis>
803 <para>
804 The <xref linkend="sql-groupby" endterm="sql-groupby-title"> is
805 used to group together those rows in a table that share the same
806 values in all the columns listed. The order in which the columns
807 are listed does not matter. The effect is to combine each set
808 of rows sharing common values into one group row that is
809 representative of all rows in the group. This is done to
810 eliminate redundancy in the output and/or compute aggregates that
811 apply to these groups. For instance:
812 <screen>
813 <prompt>=&gt;</> <userinput>SELECT * FROM test1;</>
814 x | y
815 ---+---
816 a | 3
817 c | 2
818 b | 5
819 a | 1
820 (4 rows)
822 <prompt>=&gt;</> <userinput>SELECT x FROM test1 GROUP BY x;</>
828 (3 rows)
829 </screen>
830 </para>
832 <para>
833 In the second query, we could not have written <literal>SELECT *
834 FROM test1 GROUP BY x</literal>, because there is no single value
835 for the column <literal>y</> that could be associated with each
836 group. The grouped-by columns can be referenced in the select list since
837 they have a single value in each group.
838 </para>
840 <para>
841 In general, if a table is grouped, columns that are not
842 used in the grouping cannot be referenced except in aggregate
843 expressions. An example with aggregate expressions is:
844 <screen>
845 <prompt>=&gt;</> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x;</>
846 x | sum
847 ---+-----
848 a | 4
849 b | 5
850 c | 2
851 (3 rows)
852 </screen>
853 Here <literal>sum</literal> is an aggregate function that
854 computes a single value over the entire group. More information
855 about the available aggregate functions can be found in <xref
856 linkend="functions-aggregate">.
857 </para>
859 <tip>
860 <para>
861 Grouping without aggregate expressions effectively calculates the
862 set of distinct values in a column. This can also be achieved
863 using the <literal>DISTINCT</> clause (see <xref
864 linkend="queries-distinct">).
865 </para>
866 </tip>
868 <para>
869 Here is another example: it calculates the total sales for each
870 product (rather than the total sales on all products):
871 <programlisting>
872 SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
873 FROM products p LEFT JOIN sales s USING (product_id)
874 GROUP BY product_id, p.name, p.price;
875 </programlisting>
876 In this example, the columns <literal>product_id</literal>,
877 <literal>p.name</literal>, and <literal>p.price</literal> must be
878 in the <literal>GROUP BY</> clause since they are referenced in
879 the query select list. (Depending on how exactly the products
880 table is set up, name and price might be fully dependent on the
881 product ID, so the additional groupings could theoretically be
882 unnecessary, but this is not implemented yet.) The column
883 <literal>s.units</> does not have to be in the <literal>GROUP
884 BY</> list since it is only used in an aggregate expression
885 (<literal>sum(...)</literal>), which represents the sales
886 of a product. For each product, the query returns a summary row about
887 all sales of the product.
888 </para>
890 <para>
891 In strict SQL, <literal>GROUP BY</> can only group by columns of
892 the source table but <productname>PostgreSQL</productname> extends
893 this to also allow <literal>GROUP BY</> to group by columns in the
894 select list. Grouping by value expressions instead of simple
895 column names is also allowed.
896 </para>
898 <indexterm>
899 <primary>HAVING</primary>
900 </indexterm>
902 <para>
903 If a table has been grouped using a <literal>GROUP BY</literal>
904 clause, but then only certain groups are of interest, the
905 <literal>HAVING</literal> clause can be used, much like a
906 <literal>WHERE</> clause, to eliminate groups from a grouped
907 table. The syntax is:
908 <synopsis>
909 SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY ... HAVING <replaceable>boolean_expression</replaceable>
910 </synopsis>
911 Expressions in the <literal>HAVING</> clause can refer both to
912 grouped expressions and to ungrouped expressions (which necessarily
913 involve an aggregate function).
914 </para>
916 <para>
917 Example:
918 <screen>
919 <prompt>=&gt;</> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING sum(y) &gt; 3;</>
920 x | sum
921 ---+-----
922 a | 4
923 b | 5
924 (2 rows)
926 <prompt>=&gt;</> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x HAVING x &lt; 'c';</>
927 x | sum
928 ---+-----
929 a | 4
930 b | 5
931 (2 rows)
932 </screen>
933 </para>
935 <para>
936 Again, a more realistic example:
937 <programlisting>
938 SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
939 FROM products p LEFT JOIN sales s USING (product_id)
940 WHERE s.date &gt; CURRENT_DATE - INTERVAL '4 weeks'
941 GROUP BY product_id, p.name, p.price, p.cost
942 HAVING sum(p.price * s.units) &gt; 5000;
943 </programlisting>
944 In the example above, the <literal>WHERE</> clause is selecting
945 rows by a column that is not grouped (the expression is only true for
946 sales during the last four weeks), while the <literal>HAVING</>
947 clause restricts the output to groups with total gross sales over
948 5000. Note that the aggregate expressions do not necessarily need
949 to be the same in all parts of the query.
950 </para>
951 </sect2>
952 </sect1>
955 <sect1 id="queries-select-lists">
956 <title>Select Lists</title>
958 <indexterm>
959 <primary>SELECT</primary>
960 <secondary>select list</secondary>
961 </indexterm>
963 <para>
964 As shown in the previous section,
965 the table expression in the <command>SELECT</command> command
966 constructs an intermediate virtual table by possibly combining
967 tables, views, eliminating rows, grouping, etc. This table is
968 finally passed on to processing by the <firstterm>select list</firstterm>. The select
969 list determines which <emphasis>columns</emphasis> of the
970 intermediate table are actually output.
971 </para>
973 <sect2 id="queries-select-list-items">
974 <title>Select-List Items</title>
976 <indexterm>
977 <primary>*</primary>
978 </indexterm>
980 <para>
981 The simplest kind of select list is <literal>*</literal> which
982 emits all columns that the table expression produces. Otherwise,
983 a select list is a comma-separated list of value expressions (as
984 defined in <xref linkend="sql-expressions">). For instance, it
985 could be a list of column names:
986 <programlisting>
987 SELECT a, b, c FROM ...
988 </programlisting>
989 The columns names <literal>a</>, <literal>b</>, and <literal>c</>
990 are either the actual names of the columns of tables referenced
991 in the <literal>FROM</> clause, or the aliases given to them as
992 explained in <xref linkend="queries-table-aliases">. The name
993 space available in the select list is the same as in the
994 <literal>WHERE</> clause, unless grouping is used, in which case
995 it is the same as in the <literal>HAVING</> clause.
996 </para>
998 <para>
999 If more than one table has a column of the same name, the table
1000 name must also be given, as in:
1001 <programlisting>
1002 SELECT tbl1.a, tbl2.a, tbl1.b FROM ...
1003 </programlisting>
1004 When working with multiple tables, it can also be useful to ask for
1005 all the columns of a particular table:
1006 <programlisting>
1007 SELECT tbl1.*, tbl2.a FROM ...
1008 </programlisting>
1009 (See also <xref linkend="queries-where">.)
1010 </para>
1012 <para>
1013 If an arbitrary value expression is used in the select list, it
1014 conceptually adds a new virtual column to the returned table. The
1015 value expression is evaluated once for each result row, with
1016 the row's values substituted for any column references. But the
1017 expressions in the select list do not have to reference any
1018 columns in the table expression of the <literal>FROM</> clause;
1019 they could be constant arithmetic expressions as well, for
1020 instance.
1021 </para>
1022 </sect2>
1024 <sect2 id="queries-column-labels">
1025 <title>Column Labels</title>
1027 <indexterm zone="queries-column-labels">
1028 <primary>alias</primary>
1029 <secondary>in the select list</secondary>
1030 </indexterm>
1032 <para>
1033 The entries in the select list can be assigned names for further
1034 processing. The <quote>further processing</quote> in this case is
1035 an optional sort specification and the client application (e.g.,
1036 column headers for display). For example:
1037 <programlisting>
1038 SELECT a AS value, b + c AS sum FROM ...
1039 </programlisting>
1040 </para>
1042 <para>
1043 If no output column name is specified using <literal>AS</>,
1044 the system assigns a default column name. For simple column references,
1045 this is the name of the referenced column. For function
1046 calls, this is the name of the function. For complex expressions,
1047 the system will generate a generic name.
1048 </para>
1050 <para>
1051 The <literal>AS</> keyword is optional, but only if the new column
1052 name does not match any
1053 <productname>PostgreSQL</productname> keyword (see <xref
1054 linkend="sql-keywords-appendix">). To avoid an accidental match to
1055 a keyword, you can double-quote the column name. For example,
1056 <literal>VALUE</> is a keyword, so this does not work:
1057 <programlisting>
1058 SELECT a value, b + c AS sum FROM ...
1059 </programlisting>
1060 but this does:
1061 <programlisting>
1062 SELECT a "value", b + c AS sum FROM ...
1063 </programlisting>
1064 For protection against possible
1065 future keyword additions, it is recommended that you always either
1066 write <literal>AS</literal> or double-quote the output column name.
1067 </para>
1069 <note>
1070 <para>
1071 The naming of output columns here is different from that done in
1072 the <literal>FROM</> clause (see <xref
1073 linkend="queries-table-aliases">). This pipeline will in fact
1074 allow you to rename the same column twice, but the name chosen in
1075 the select list is the one that will be passed on.
1076 </para>
1077 </note>
1078 </sect2>
1080 <sect2 id="queries-distinct">
1081 <title><literal>DISTINCT</literal></title>
1083 <indexterm zone="queries-distinct">
1084 <primary>DISTINCT</primary>
1085 </indexterm>
1087 <indexterm zone="queries-distinct">
1088 <primary>duplicates</primary>
1089 </indexterm>
1091 <para>
1092 After the select list has been processed, the result table can
1093 optionally be subject to the elimination of duplicate rows. The
1094 <literal>DISTINCT</literal> key word is written directly after
1095 <literal>SELECT</literal> to specify this:
1096 <synopsis>
1097 SELECT DISTINCT <replaceable>select_list</replaceable> ...
1098 </synopsis>
1099 (Instead of <literal>DISTINCT</> the key word <literal>ALL</literal>
1100 can be used to specify the default behavior of retaining all rows.)
1101 </para>
1103 <para>
1104 <indexterm><primary>null value</><secondary sortas="DISTINCT">in
1105 DISTINCT</></indexterm>
1106 Obviously, two rows are considered distinct if they differ in at
1107 least one column value. Null values are considered equal in this
1108 comparison.
1109 </para>
1111 <para>
1112 Alternatively, an arbitrary expression can determine what rows are
1113 to be considered distinct:
1114 <synopsis>
1115 SELECT DISTINCT ON (<replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ...</optional>) <replaceable>select_list</replaceable> ...
1116 </synopsis>
1117 Here <replaceable>expression</replaceable> is an arbitrary value
1118 expression that is evaluated for all rows. A set of rows for
1119 which all the expressions are equal are considered duplicates, and
1120 only the first row of the set is kept in the output. Note that
1121 the <quote>first row</quote> of a set is unpredictable unless the
1122 query is sorted on enough columns to guarantee a unique ordering
1123 of the rows arriving at the <literal>DISTINCT</> filter.
1124 (<literal>DISTINCT ON</> processing occurs after <literal>ORDER
1125 BY</> sorting.)
1126 </para>
1128 <para>
1129 The <literal>DISTINCT ON</> clause is not part of the SQL standard
1130 and is sometimes considered bad style because of the potentially
1131 indeterminate nature of its results. With judicious use of
1132 <literal>GROUP BY</> and subqueries in <literal>FROM</> the
1133 construct can be avoided, but it is often the most convenient
1134 alternative.
1135 </para>
1136 </sect2>
1137 </sect1>
1140 <sect1 id="queries-union">
1141 <title>Combining Queries</title>
1143 <indexterm zone="queries-union">
1144 <primary>UNION</primary>
1145 </indexterm>
1146 <indexterm zone="queries-union">
1147 <primary>INTERSECT</primary>
1148 </indexterm>
1149 <indexterm zone="queries-union">
1150 <primary>EXCEPT</primary>
1151 </indexterm>
1152 <indexterm zone="queries-union">
1153 <primary>set union</primary>
1154 </indexterm>
1155 <indexterm zone="queries-union">
1156 <primary>set intersection</primary>
1157 </indexterm>
1158 <indexterm zone="queries-union">
1159 <primary>set difference</primary>
1160 </indexterm>
1161 <indexterm zone="queries-union">
1162 <primary>set operation</primary>
1163 </indexterm>
1165 <para>
1166 The results of two queries can be combined using the set operations
1167 union, intersection, and difference. The syntax is
1168 <synopsis>
1169 <replaceable>query1</replaceable> UNION <optional>ALL</optional> <replaceable>query2</replaceable>
1170 <replaceable>query1</replaceable> INTERSECT <optional>ALL</optional> <replaceable>query2</replaceable>
1171 <replaceable>query1</replaceable> EXCEPT <optional>ALL</optional> <replaceable>query2</replaceable>
1172 </synopsis>
1173 <replaceable>query1</replaceable> and
1174 <replaceable>query2</replaceable> are queries that can use any of
1175 the features discussed up to this point. Set operations can also
1176 be nested and chained, for example
1177 <synopsis>
1178 <replaceable>query1</replaceable> UNION <replaceable>query2</replaceable> UNION <replaceable>query3</replaceable>
1179 </synopsis>
1180 which really says
1181 <synopsis>
1182 (<replaceable>query1</replaceable> UNION <replaceable>query2</replaceable>) UNION <replaceable>query3</replaceable>
1183 </synopsis>
1184 </para>
1186 <para>
1187 <literal>UNION</> effectively appends the result of
1188 <replaceable>query2</replaceable> to the result of
1189 <replaceable>query1</replaceable> (although there is no guarantee
1190 that this is the order in which the rows are actually returned).
1191 Furthermore, it eliminates duplicate rows from its result, in the same
1192 way as <literal>DISTINCT</>, unless <literal>UNION ALL</> is used.
1193 </para>
1195 <para>
1196 <literal>INTERSECT</> returns all rows that are both in the result
1197 of <replaceable>query1</replaceable> and in the result of
1198 <replaceable>query2</replaceable>. Duplicate rows are eliminated
1199 unless <literal>INTERSECT ALL</> is used.
1200 </para>
1202 <para>
1203 <literal>EXCEPT</> returns all rows that are in the result of
1204 <replaceable>query1</replaceable> but not in the result of
1205 <replaceable>query2</replaceable>. (This is sometimes called the
1206 <firstterm>difference</> between two queries.) Again, duplicates
1207 are eliminated unless <literal>EXCEPT ALL</> is used.
1208 </para>
1210 <para>
1211 In order to calculate the union, intersection, or difference of two
1212 queries, the two queries must be <quote>union compatible</quote>,
1213 which means that they return the same number of columns and
1214 the corresponding columns have compatible data types, as
1215 described in <xref linkend="typeconv-union-case">.
1216 </para>
1217 </sect1>
1220 <sect1 id="queries-order">
1221 <title>Sorting Rows</title>
1223 <indexterm zone="queries-order">
1224 <primary>sorting</primary>
1225 </indexterm>
1227 <indexterm zone="queries-order">
1228 <primary>ORDER BY</primary>
1229 </indexterm>
1231 <para>
1232 After a query has produced an output table (after the select list
1233 has been processed) it can optionally be sorted. If sorting is not
1234 chosen, the rows will be returned in an unspecified order. The actual
1235 order in that case will depend on the scan and join plan types and
1236 the order on disk, but it must not be relied on. A particular
1237 output ordering can only be guaranteed if the sort step is explicitly
1238 chosen.
1239 </para>
1241 <para>
1242 The <literal>ORDER BY</> clause specifies the sort order:
1243 <synopsis>
1244 SELECT <replaceable>select_list</replaceable>
1245 FROM <replaceable>table_expression</replaceable>
1246 ORDER BY <replaceable>sort_expression1</replaceable> <optional>ASC | DESC</optional> <optional>NULLS { FIRST | LAST }</optional>
1247 <optional>, <replaceable>sort_expression2</replaceable> <optional>ASC | DESC</optional> <optional>NULLS { FIRST | LAST }</optional> ...</optional>
1248 </synopsis>
1249 The sort expression(s) can be any expression that would be valid in the
1250 query's select list. An example is:
1251 <programlisting>
1252 SELECT a, b FROM table1 ORDER BY a + b, c;
1253 </programlisting>
1254 When more than one expression is specified,
1255 the later values are used to sort rows that are equal according to the
1256 earlier values. Each expression can be followed by an optional
1257 <literal>ASC</> or <literal>DESC</> keyword to set the sort direction to
1258 ascending or descending. <literal>ASC</> order is the default.
1259 Ascending order puts smaller values first, where
1260 <quote>smaller</quote> is defined in terms of the
1261 <literal>&lt;</literal> operator. Similarly, descending order is
1262 determined with the <literal>&gt;</literal> operator.
1263 <footnote>
1264 <para>
1265 Actually, <productname>PostgreSQL</> uses the <firstterm>default B-tree
1266 operator class</> for the expression's data type to determine the sort
1267 ordering for <literal>ASC</> and <literal>DESC</>. Conventionally,
1268 data types will be set up so that the <literal>&lt;</literal> and
1269 <literal>&gt;</literal> operators correspond to this sort ordering,
1270 but a user-defined data type's designer could choose to do something
1271 different.
1272 </para>
1273 </footnote>
1274 </para>
1276 <para>
1277 The <literal>NULLS FIRST</> and <literal>NULLS LAST</> options can be
1278 used to determine whether nulls appear before or after non-null values
1279 in the sort ordering. By default, null values sort as if larger than any
1280 non-null value; that is, <literal>NULLS FIRST</> is the default for
1281 <literal>DESC</> order, and <literal>NULLS LAST</> otherwise.
1282 </para>
1284 <para>
1285 Note that the ordering options are considered independently for each
1286 sort column. For example <literal>ORDER BY x, y DESC</> means
1287 <literal>ORDER BY x ASC, y DESC</>, which is not the same as
1288 <literal>ORDER BY x DESC, y DESC</>.
1289 </para>
1291 <para>
1292 For backwards compatibility with the SQL92 version of the standard,
1293 a <replaceable>sort_expression</> can instead be the name or number
1294 of an output column, as in:
1295 <programlisting>
1296 SELECT a + b AS sum, c FROM table1 ORDER BY sum;
1297 SELECT a, max(b) FROM table1 GROUP BY a ORDER BY 1;
1298 </programlisting>
1299 both of which sort by the first output column. Note that an output
1300 column name has to stand alone, it's not allowed as part of an expression
1301 &mdash; for example, this is <emphasis>not</> correct:
1302 <programlisting>
1303 SELECT a + b AS sum, c FROM table1 ORDER BY sum + c; -- wrong
1304 </programlisting>
1305 This restriction is made to reduce ambiguity. There is still
1306 ambiguity if an <literal>ORDER BY</> item is a simple name that
1307 could match either an output column name or a column from the table
1308 expression. The output column is used in such cases. This would
1309 only cause confusion if you use <literal>AS</> to rename an output
1310 column to match some other table column's name.
1311 </para>
1313 <para>
1314 <literal>ORDER BY</> can be applied to the result of a
1315 <literal>UNION</>, <literal>INTERSECT</>, or <literal>EXCEPT</>
1316 combination, but in this case it is only permitted to sort by
1317 output column names or numbers, not by expressions.
1318 </para>
1319 </sect1>
1322 <sect1 id="queries-limit">
1323 <title><literal>LIMIT</literal> and <literal>OFFSET</literal></title>
1325 <indexterm zone="queries-limit">
1326 <primary>LIMIT</primary>
1327 </indexterm>
1329 <indexterm zone="queries-limit">
1330 <primary>OFFSET</primary>
1331 </indexterm>
1333 <para>
1334 <literal>LIMIT</> and <literal>OFFSET</> allow you to retrieve just
1335 a portion of the rows that are generated by the rest of the query:
1336 <synopsis>
1337 SELECT <replaceable>select_list</replaceable>
1338 FROM <replaceable>table_expression</replaceable>
1339 <optional> ORDER BY ... </optional>
1340 <optional> LIMIT { <replaceable>number</replaceable> | ALL } </optional> <optional> OFFSET <replaceable>number</replaceable> </optional>
1341 </synopsis>
1342 </para>
1344 <para>
1345 If a limit count is given, no more than that many rows will be
1346 returned (but possibly less, if the query itself yields less rows).
1347 <literal>LIMIT ALL</> is the same as omitting the <literal>LIMIT</>
1348 clause.
1349 </para>
1351 <para>
1352 <literal>OFFSET</> says to skip that many rows before beginning to
1353 return rows. <literal>OFFSET 0</> is the same as
1354 omitting the <literal>OFFSET</> clause. If both <literal>OFFSET</>
1355 and <literal>LIMIT</> appear, then <literal>OFFSET</> rows are
1356 skipped before starting to count the <literal>LIMIT</> rows that
1357 are returned.
1358 </para>
1360 <para>
1361 When using <literal>LIMIT</>, it is important to use an
1362 <literal>ORDER BY</> clause that constrains the result rows into a
1363 unique order. Otherwise you will get an unpredictable subset of
1364 the query's rows. You might be asking for the tenth through
1365 twentieth rows, but tenth through twentieth in what ordering? The
1366 ordering is unknown, unless you specified <literal>ORDER BY</>.
1367 </para>
1369 <para>
1370 The query optimizer takes <literal>LIMIT</> into account when
1371 generating a query plan, so you are very likely to get different
1372 plans (yielding different row orders) depending on what you give
1373 for <literal>LIMIT</> and <literal>OFFSET</>. Thus, using
1374 different <literal>LIMIT</>/<literal>OFFSET</> values to select
1375 different subsets of a query result <emphasis>will give
1376 inconsistent results</emphasis> unless you enforce a predictable
1377 result ordering with <literal>ORDER BY</>. This is not a bug; it
1378 is an inherent consequence of the fact that SQL does not promise to
1379 deliver the results of a query in any particular order unless
1380 <literal>ORDER BY</> is used to constrain the order.
1381 </para>
1383 <para>
1384 The rows skipped by an <literal>OFFSET</> clause still have to be
1385 computed inside the server; therefore a large <literal>OFFSET</>
1386 might be inefficient.
1387 </para>
1388 </sect1>
1391 <sect1 id="queries-values">
1392 <title><literal>VALUES</literal> Lists</title>
1394 <indexterm zone="queries-values">
1395 <primary>VALUES</primary>
1396 </indexterm>
1398 <para>
1399 <literal>VALUES</> provides a way to generate a <quote>constant table</>
1400 that can be used in a query without having to actually create and populate
1401 a table on-disk. The syntax is
1402 <synopsis>
1403 VALUES ( <replaceable class="PARAMETER">expression</replaceable> [, ...] ) [, ...]
1404 </synopsis>
1405 Each parenthesized list of expressions generates a row in the table.
1406 The lists must all have the same number of elements (i.e., the number
1407 of columns in the table), and corresponding entries in each list must
1408 have compatible data types. The actual data type assigned to each column
1409 of the result is determined using the same rules as for <literal>UNION</>
1410 (see <xref linkend="typeconv-union-case">).
1411 </para>
1413 <para>
1414 As an example:
1415 <programlisting>
1416 VALUES (1, 'one'), (2, 'two'), (3, 'three');
1417 </programlisting>
1419 will return a table of two columns and three rows. It's effectively
1420 equivalent to:
1421 <programlisting>
1422 SELECT 1 AS column1, 'one' AS column2
1423 UNION ALL
1424 SELECT 2, 'two'
1425 UNION ALL
1426 SELECT 3, 'three';
1427 </programlisting>
1429 By default, <productname>PostgreSQL</productname> assigns the names
1430 <literal>column1</>, <literal>column2</>, etc. to the columns of a
1431 <literal>VALUES</> table. The column names are not specified by the
1432 SQL standard and different database systems do it differently, so
1433 it's usually better to override the default names with a table alias
1434 list.
1435 </para>
1437 <para>
1438 Syntactically, <literal>VALUES</> followed by expression lists is
1439 treated as equivalent to
1440 <synopsis>
1441 SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable>
1442 </synopsis>
1443 and can appear anywhere a <literal>SELECT</> can. For example, you can
1444 use it as an arm of a <literal>UNION</>, or attach a
1445 <replaceable>sort_specification</replaceable> (<literal>ORDER BY</>,
1446 <literal>LIMIT</>, and/or <literal>OFFSET</>) to it. <literal>VALUES</>
1447 is most commonly used as the data source in an <command>INSERT</> command,
1448 and next most commonly as a subquery.
1449 </para>
1451 <para>
1452 For more information see <xref linkend="sql-values"
1453 endterm="sql-values-title">.
1454 </para>
1456 </sect1>
1458 </chapter>