doc/src/sgml/intagg.sgml

   1 <!-- doc/src/sgml/intagg.sgml -->
   2
   3 <sect1 id="intagg" xreflabel="intagg">
   4  <title>intagg &mdash; integer aggregator and enumerator</title>
   5
   6  <indexterm zone="intagg">
   7   <primary>intagg</primary>
   8  </indexterm>
   9
  10  <para>
  11   The <filename>intagg</filename> module provides an integer aggregator and an
  12   enumerator.  <filename>intagg</filename> is now obsolete, because there
  13   are built-in functions that provide a superset of its capabilities.
  14   However, the module is still provided as a compatibility wrapper around
  15   the built-in functions.
  16  </para>
  17
  18  <sect2 id="intagg-functions">
  19   <title>Functions</title>
  20
  21  <indexterm>
  22   <primary>int_array_aggregate</primary>
  23  </indexterm>
  24
  25  <indexterm>
  26   <primary>array_agg</primary>
  27  </indexterm>
  28
  29  <para>
  30   The aggregator is an aggregate function
  31   <function>int_array_aggregate(integer)</function>
  32   that produces an integer array
  33   containing exactly the integers it is fed.
  34   This is a wrapper around <function>array_agg</function>,
  35   which does the same thing for any array type.
  36  </para>
  37
  38  <indexterm>
  39   <primary>int_array_enum</primary>
  40  </indexterm>
  41
  42  <para>
  43   The enumerator is a function
  44   <function>int_array_enum(integer[])</function>
  45   that returns <type>setof integer</type>.  It is essentially the reverse
  46   operation of the aggregator: given an array of integers, expand it
  47   into a set of rows.  This is a wrapper around <function>unnest</function>,
  48   which does the same thing for any array type.
  49  </para>
  50
  51  </sect2>
  52
  53  <sect2 id="intagg-samples">
  54   <title>Sample Uses</title>
  55
  56   <para>
  57    Many database systems have the notion of a many to many table. Such a table
  58    usually sits between two indexed tables, for example:
  59
  60 <programlisting>
  61 CREATE TABLE left_table  (id INT PRIMARY KEY, ...);
  62 CREATE TABLE right_table (id INT PRIMARY KEY, ...);
  63 CREATE TABLE many_to_many(id_left  INT REFERENCES left_table,
  64                           id_right INT REFERENCES right_table);
  65 </programlisting>
  66
  67   It is typically used like this:
  68
  69 <programlisting>
  70 SELECT right_table.*
  71 FROM right_table JOIN many_to_many ON (right_table.id = many_to_many.id_right)
  72 WHERE many_to_many.id_left = <replaceable>item</replaceable>;
  73 </programlisting>
  74
  75   This will return all the items in the right hand table for an entry
  76   in the left hand table. This is a very common construct in SQL.
  77  </para>
  78
  79  <para>
  80   Now, this methodology can be cumbersome with a very large number of
  81   entries in the <structname>many_to_many</structname> table.  Often,
  82   a join like this would result in an index scan
  83   and a fetch for each right hand entry in the table for a particular
  84   left hand entry. If you have a very dynamic system, there is not much you
  85   can do. However, if you have some data which is fairly static, you can
  86   create a summary table with the aggregator.
  87
  88 <programlisting>
  89 CREATE TABLE summary AS
  90   SELECT id_left, int_array_aggregate(id_right) AS rights
  91   FROM many_to_many
  92   GROUP BY id_left;
  93 </programlisting>
  94
  95   This will create a table with one row per left item, and an array
  96   of right items. Now this is pretty useless without some way of using
  97   the array; that's why there is an array enumerator.  You can do
  98
  99 <programlisting>
 100 SELECT id_left, int_array_enum(rights) FROM summary WHERE id_left = <replaceable>item</replaceable>;
 101 </programlisting>
 102
 103   The above query using <function>int_array_enum</function> produces the same results
 104   as
 105
 106 <programlisting>
 107 SELECT id_left, id_right FROM many_to_many WHERE id_left = <replaceable>item</replaceable>;
 108 </programlisting>
 109
 110   The difference is that the query against the summary table has to get
 111   only one row from the table, whereas the direct query against
 112   <structname>many_to_many</structname> must index scan and fetch a row for each entry.
 113  </para>
 114
 115  <para>
 116   On one system, an <command>EXPLAIN</command> showed a query with a cost of 8488 was
 117   reduced to a cost of 329.  The original query was a join involving the
 118   <structname>many_to_many</structname> table, which was replaced by:
 119
 120 <programlisting>
 121 SELECT id_right, count(id_right) FROM
 122   ( SELECT id_left, int_array_enum(rights) AS id_right
 123     FROM summary
 124     JOIN (SELECT id FROM left_table
 125           WHERE id = <replaceable>item</replaceable>) AS lefts
 126     ON (summary.id_left = lefts.id)
 127   ) AS list
 128   GROUP BY id_right
 129   ORDER BY count DESC;
 130 </programlisting>
 131  </para>
 132
 133  </sect2>
 134
 135 </sect1>