doc, intagg: fix one-to-many mention to many-to-many
[pgsql.git] / doc / src / sgml / intagg.sgml
blob29e74ce146cc9bca4b20e74fd36e97ca60c99c46
1 <!-- doc/src/sgml/intagg.sgml -->
3 <sect1 id="intagg" xreflabel="intagg">
4 <title>intagg &mdash; integer aggregator and enumerator</title>
6 <indexterm zone="intagg">
7 <primary>intagg</primary>
8 </indexterm>
10 <para>
11 The <filename>intagg</filename> module provides an integer aggregator and an
12 enumerator. <filename>intagg</filename> is now obsolete, because there
13 are built-in functions that provide a superset of its capabilities.
14 However, the module is still provided as a compatibility wrapper around
15 the built-in functions.
16 </para>
18 <sect2 id="intagg-functions">
19 <title>Functions</title>
21 <indexterm>
22 <primary>int_array_aggregate</primary>
23 </indexterm>
25 <indexterm>
26 <primary>array_agg</primary>
27 </indexterm>
29 <para>
30 The aggregator is an aggregate function
31 <function>int_array_aggregate(integer)</function>
32 that produces an integer array
33 containing exactly the integers it is fed.
34 This is a wrapper around <function>array_agg</function>,
35 which does the same thing for any array type.
36 </para>
38 <indexterm>
39 <primary>int_array_enum</primary>
40 </indexterm>
42 <para>
43 The enumerator is a function
44 <function>int_array_enum(integer[])</function>
45 that returns <type>setof integer</type>. It is essentially the reverse
46 operation of the aggregator: given an array of integers, expand it
47 into a set of rows. This is a wrapper around <function>unnest</function>,
48 which does the same thing for any array type.
49 </para>
51 </sect2>
53 <sect2 id="intagg-samples">
54 <title>Sample Uses</title>
56 <para>
57 Many database systems have the notion of a many to many table. Such a table
58 usually sits between two indexed tables, for example:
60 <programlisting>
61 CREATE TABLE left_table (id INT PRIMARY KEY, ...);
62 CREATE TABLE right_table (id INT PRIMARY KEY, ...);
63 CREATE TABLE many_to_many(id_left INT REFERENCES left_table,
64 id_right INT REFERENCES right_table);
65 </programlisting>
67 It is typically used like this:
69 <programlisting>
70 SELECT right_table.*
71 FROM right_table JOIN many_to_many ON (right_table.id = many_to_many.id_right)
72 WHERE many_to_many.id_left = <replaceable>item</replaceable>;
73 </programlisting>
75 This will return all the items in the right hand table for an entry
76 in the left hand table. This is a very common construct in SQL.
77 </para>
79 <para>
80 Now, this methodology can be cumbersome with a very large number of
81 entries in the <structname>many_to_many</structname> table. Often,
82 a join like this would result in an index scan
83 and a fetch for each right hand entry in the table for a particular
84 left hand entry. If you have a very dynamic system, there is not much you
85 can do. However, if you have some data which is fairly static, you can
86 create a summary table with the aggregator.
88 <programlisting>
89 CREATE TABLE summary AS
90 SELECT id_left, int_array_aggregate(id_right) AS rights
91 FROM many_to_many
92 GROUP BY id_left;
93 </programlisting>
95 This will create a table with one row per left item, and an array
96 of right items. Now this is pretty useless without some way of using
97 the array; that's why there is an array enumerator. You can do
99 <programlisting>
100 SELECT id_left, int_array_enum(rights) FROM summary WHERE id_left = <replaceable>item</replaceable>;
101 </programlisting>
103 The above query using <function>int_array_enum</function> produces the same results
106 <programlisting>
107 SELECT id_left, id_right FROM many_to_many WHERE id_left = <replaceable>item</replaceable>;
108 </programlisting>
110 The difference is that the query against the summary table has to get
111 only one row from the table, whereas the direct query against
112 <structname>many_to_many</structname> must index scan and fetch a row for each entry.
113 </para>
115 <para>
116 On one system, an <command>EXPLAIN</command> showed a query with a cost of 8488 was
117 reduced to a cost of 329. The original query was a join involving the
118 <structname>many_to_many</structname> table, which was replaced by:
120 <programlisting>
121 SELECT id_right, count(id_right) FROM
122 ( SELECT id_left, int_array_enum(rights) AS id_right
123 FROM summary
124 JOIN (SELECT id FROM left_table
125 WHERE id = <replaceable>item</replaceable>) AS lefts
126 ON (summary.id_left = lefts.id)
127 ) AS list
128 GROUP BY id_right
129 ORDER BY count DESC;
130 </programlisting>
131 </para>
133 </sect2>
135 </sect1>