doc/src/sgml/rules.sgml

   1 <!-- $PostgreSQL$ -->
   2
   3 <chapter id="rules">
   4 <title>The Rule System</title>
   5
   6  <indexterm zone="rules">
   7   <primary>rule</primary>
   8  </indexterm>
   9
  10 <para>
  11      This chapter discusses the rule system in
  12      <productname>PostgreSQL</productname>.  Production rule systems
  13      are conceptually simple, but there are many subtle points
  14      involved in actually using them.
  15 </para>
  16
  17 <para>
  18      Some other database systems define active database rules, which
  19      are usually stored procedures and triggers.  In
  20      <productname>PostgreSQL</productname>, these can be implemented
  21      using functions and triggers as well.
  22 </para>
  23
  24 <para>
  25      The rule system (more precisely speaking, the query rewrite rule
  26      system) is totally different from stored procedures and triggers.
  27      It modifies queries to take rules into consideration, and then
  28      passes the modified query to the query planner for planning and
  29      execution.  It is very powerful, and can be used for many things
  30      such as query language procedures, views, and versions.  The
  31      theoretical foundations and the power of this rule system are
  32      also discussed in <xref linkend="STON90b"> and <xref
  33      linkend="ONG90">.
  34 </para>
  35
  36 <sect1 id="querytree">
  37 <title>The Query Tree</title>
  38
  39 <indexterm zone="querytree">
  40  <primary>query tree</primary>
  41 </indexterm>
  42
  43 <para>
  44     To understand how the rule system works it is necessary to know
  45     when it is invoked and what its input and results are.
  46 </para>
  47
  48 <para>
  49     The rule system is located between the parser and the planner.
  50     It takes the output of the parser, one query tree, and the user-defined
  51     rewrite rules, which are also
  52     query trees with some extra information, and creates zero or more
  53     query trees as result. So its input and output are always things
  54     the parser itself could have produced and thus, anything it sees
  55     is basically representable as an <acronym>SQL</acronym> statement.
  56 </para>
  57
  58 <para>
  59     Now what is a query tree? It is an internal representation of an
  60     <acronym>SQL</acronym> statement where the single parts that it is
  61     built from are stored separately. These query trees can be shown
  62     in the server log if you set the configuration parameters
  63     <varname>debug_print_parse</varname>,
  64     <varname>debug_print_rewritten</varname>, or
  65     <varname>debug_print_plan</varname>.  The rule actions are also
  66     stored as query trees, in the system catalog
  67     <structname>pg_rewrite</structname>.  They are not formatted like
  68     the log output, but they contain exactly the same information.
  69 </para>
  70
  71 <para>
  72     Reading a raw query tree requires some experience.  But since
  73     <acronym>SQL</acronym> representations of query trees are
  74     sufficient to understand the rule system, this chapter will not
  75     teach how to read them.
  76 </para>
  77
  78 <para>
  79     When reading the <acronym>SQL</acronym> representations of the
  80     query trees in this chapter it is necessary to be able to identify
  81     the parts the statement is broken into when it is in the query tree
  82     structure. The parts of a query tree are
  83
  84 <variablelist>
  85     <varlistentry>
  86     <term>
  87         the command type
  88     </term>
  89     <listitem>
  90     <para>
  91         This is a simple value telling which command
  92         (<command>SELECT</command>, <command>INSERT</command>,
  93         <command>UPDATE</command>, <command>DELETE</command>) produced
  94         the query tree.
  95     </para>
  96     </listitem>
  97     </varlistentry>
  98
  99     <varlistentry>
 100     <term>
 101         the range table
 102     </term>
 103       <indexterm><primary>range table</></>
 104     <listitem>
 105     <para>
 106         The range table is a list of relations that are used in the query.
 107         In a <command>SELECT</command> statement these are the relations given after
 108         the <literal>FROM</literal> key word.
 109     </para>
 110
 111     <para>
 112         Every range table entry identifies a table or view and tells
 113         by which name it is called in the other parts of the query.
 114         In the query tree, the range table entries are referenced by
 115         number rather than by name, so here it doesn't matter if there
 116         are duplicate names as it would in an <acronym>SQL</acronym>
 117         statement. This can happen after the range tables of rules
 118         have been merged in. The examples in this chapter will not have
 119         this situation.
 120     </para>
 121     </listitem>
 122     </varlistentry>
 123
 124     <varlistentry>
 125     <term>
 126         the result relation
 127     </term>
 128     <listitem>
 129     <para>
 130         This is an index into the range table that identifies the
 131         relation where the results of the query go.
 132     </para>
 133
 134     <para>
 135         <command>SELECT</command> queries normally don't have a result
 136         relation. The special case of a <command>SELECT INTO</command> is
 137         mostly identical to a <command>CREATE TABLE</command> followed by a
 138         <literal>INSERT ... SELECT</literal> and is not discussed
 139         separately here.
 140     </para>
 141
 142     <para>
 143         For <command>INSERT</command>, <command>UPDATE</command>, and
 144         <command>DELETE</command> commands, the result relation is the table
 145         (or view!) where the changes are to take effect.
 146     </para>
 147     </listitem>
 148     </varlistentry>
 149
 150     <varlistentry>
 151     <term>
 152         the target list
 153     </term>
 154     <indexterm><primary>target list</></>
 155     <listitem>
 156     <para>
 157         The target list is a list of expressions that define the
 158         result of the query.  In the case of a
 159         <command>SELECT</command>, these expressions are the ones that
 160         build the final output of the query.  They correspond to the
 161         expressions between the key words <command>SELECT</command>
 162         and <command>FROM</command>.  (<literal>*</literal> is just an
 163         abbreviation for all the column names of a relation.  It is
 164         expanded by the parser into the individual columns, so the
 165         rule system never sees it.)
 166     </para>
 167
 168     <para>
 169         <command>DELETE</command> commands don't need a target list
 170         because they don't produce any result. In fact, the planner will
 171         add a special <acronym>CTID</> entry to the empty target list, but
 172         this is after the rule system and will be discussed later; for the
 173         rule system, the target list is empty.
 174     </para>
 175
 176     <para>
 177         For <command>INSERT</command> commands, the target list describes
 178         the new rows that should go into the result relation. It consists of the
 179         expressions in the <literal>VALUES</> clause or the ones from the
 180         <command>SELECT</command> clause in <literal>INSERT
 181         ... SELECT</literal>.  The first step of the rewrite process adds
 182         target list entries for any columns that were not assigned to by
 183         the original command but have defaults.  Any remaining columns (with
 184         neither a given value nor a default) will be filled in by the
 185         planner with a constant null expression.
 186     </para>
 187
 188     <para>
 189         For <command>UPDATE</command> commands, the target list
 190         describes the new rows that should replace the old ones. In the
 191         rule system, it contains just the expressions from the <literal>SET
 192         column = expression</literal> part of the command.  The planner will handle
 193         missing columns by inserting expressions that copy the values from
 194         the old row into the new one. And it will add the special
 195         <acronym>CTID</> entry just as for <command>DELETE</command>, too.
 196     </para>
 197
 198     <para>
 199         Every entry in the target list contains an expression that can
 200         be a constant value, a variable pointing to a column of one
 201         of the relations in the range table, a parameter, or an expression
 202         tree made of function calls, constants, variables, operators, etc.
 203     </para>
 204     </listitem>
 205     </varlistentry>
 206
 207     <varlistentry>
 208     <term>
 209         the qualification
 210     </term>
 211     <listitem>
 212     <para>
 213         The query's qualification is an expression much like one of
 214         those contained in the target list entries. The result value of
 215         this expression is a Boolean that tells whether the operation
 216         (<command>INSERT</command>, <command>UPDATE</command>,
 217         <command>DELETE</command>, or <command>SELECT</command>) for the
 218         final result row should be executed or not. It corresponds to the <literal>WHERE</> clause
 219         of an <acronym>SQL</acronym> statement.
 220     </para>
 221     </listitem>
 222     </varlistentry>
 223
 224     <varlistentry>
 225     <term>
 226         the join tree
 227     </term>
 228     <listitem>
 229     <para>
 230         The query's join tree shows the structure of the <literal>FROM</> clause.
 231         For a simple query like <literal>SELECT ... FROM a, b, c</literal>, the join tree is just
 232         a list of the <literal>FROM</> items, because we are allowed to join them in
 233         any order.  But when <literal>JOIN</> expressions, particularly outer joins,
 234         are used, we have to join in the order shown by the joins.
 235         In that case, the join tree shows the structure of the <literal>JOIN</> expressions.  The
 236         restrictions associated with particular <literal>JOIN</> clauses (from <literal>ON</> or
 237         <literal>USING</> expressions) are stored as qualification expressions attached
 238         to those join-tree nodes.  It turns out to be convenient to store
 239         the top-level <literal>WHERE</> expression as a qualification attached to the
 240         top-level join-tree item, too.  So really the join tree represents
 241         both the <literal>FROM</> and <literal>WHERE</> clauses of a <command>SELECT</command>.
 242     </para>
 243     </listitem>
 244     </varlistentry>
 245
 246     <varlistentry>
 247     <term>
 248         the others
 249     </term>
 250     <listitem>
 251     <para>
 252         The other parts of the query tree like the <literal>ORDER BY</>
 253         clause aren't of interest here. The rule system
 254         substitutes some entries there while applying rules, but that
 255         doesn't have much to do with the fundamentals of the rule
 256         system.
 257     </para>
 258     </listitem>
 259     </varlistentry>
 260
 261 </variablelist>
 262 </para>
 263 </sect1>
 264
 265 <sect1 id="rules-views">
 266 <title>Views and the Rule System</title>
 267
 268 <indexterm zone="rules-views">
 269  <primary>rule</primary>
 270  <secondary>and views</secondary>
 271 </indexterm>
 272
 273 <indexterm zone="rules-views">
 274  <primary>view</>
 275  <secondary>implementation through rules</>
 276 </indexterm>
 277
 278 <para>
 279     Views in <productname>PostgreSQL</productname> are implemented
 280     using the rule system. In fact, there is essentially no difference
 281     between:
 282
 283 <programlisting>
 284 CREATE VIEW myview AS SELECT * FROM mytab;
 285 </programlisting>
 286
 287     compared against the two commands:
 288
 289 <programlisting>
 290 CREATE TABLE myview (<replaceable>same column list as mytab</replaceable>);
 291 CREATE RULE "_RETURN" AS ON SELECT TO myview DO INSTEAD
 292     SELECT * FROM mytab;
 293 </programlisting>
 294
 295     because this is exactly what the <command>CREATE VIEW</command>
 296     command does internally.  This has some side effects. One of them
 297     is that the information about a view in the
 298     <productname>PostgreSQL</productname> system catalogs is exactly
 299     the same as it is for a table. So for the parser, there is
 300     absolutely no difference between a table and a view. They are the
 301     same thing: relations.
 302 </para>
 303
 304 <sect2 id="rules-select">
 305 <title>How <command>SELECT</command> Rules Work</title>
 306
 307 <indexterm zone="rules-select">
 308  <primary>rule</primary>
 309  <secondary sortas="SELECT">for SELECT</secondary>
 310 </indexterm>
 311
 312 <para>
 313     Rules <literal>ON SELECT</> are applied to all queries as the last step, even
 314     if the command given is an <command>INSERT</command>,
 315     <command>UPDATE</command> or <command>DELETE</command>. And they
 316     have different semantics from rules on the other command types in that they modify the
 317     query tree in place instead of creating a new one.  So
 318     <command>SELECT</command> rules are described first.
 319 </para>
 320
 321 <para>
 322     Currently, there can be only one action in an <literal>ON SELECT</> rule, and it must
 323     be an unconditional <command>SELECT</> action that is <literal>INSTEAD</>. This restriction was
 324     required to make rules safe enough to open them for ordinary users, and
 325     it restricts <literal>ON SELECT</> rules to act like views.
 326 </para>
 327
 328 <para>
 329     The examples for this chapter are two join views that do some
 330     calculations and some more views using them in turn.  One of the
 331     two first views is customized later by adding rules for
 332     <command>INSERT</command>, <command>UPDATE</command>, and
 333     <command>DELETE</command> operations so that the final result will
 334     be a view that behaves like a real table with some magic
 335     functionality.  This is not such a simple example to start from and
 336     this makes things harder to get into. But it's better to have one
 337     example that covers all the points discussed step by step rather
 338     than having many different ones that might mix up in mind.
 339 </para>
 340
 341 <para>
 342 For the example, we need a little <literal>min</literal> function that
 343 returns the lower of 2 integer values. We create that as:
 344
 345 <programlisting>
 346 CREATE FUNCTION min(integer, integer) RETURNS integer AS $$
 347     SELECT CASE WHEN $1 &lt; $2 THEN $1 ELSE $2 END
 348 $$ LANGUAGE SQL STRICT;
 349 </programlisting>
 350 </para>
 351
 352 <para>
 353     The real tables we need in the first two rule system descriptions
 354     are these:
 355
 356 <programlisting>
 357 CREATE TABLE shoe_data (
 358     shoename   text,          -- primary key
 359     sh_avail   integer,       -- available number of pairs
 360     slcolor    text,          -- preferred shoelace color
 361     slminlen   real,          -- minimum shoelace length
 362     slmaxlen   real,          -- maximum shoelace length
 363     slunit     text           -- length unit
 364 );
 365
 366 CREATE TABLE shoelace_data (
 367     sl_name    text,          -- primary key
 368     sl_avail   integer,       -- available number of pairs
 369     sl_color   text,          -- shoelace color
 370     sl_len     real,          -- shoelace length
 371     sl_unit    text           -- length unit
 372 );
 373
 374 CREATE TABLE unit (
 375     un_name    text,          -- primary key
 376     un_fact    real           -- factor to transform to cm
 377 );
 378 </programlisting>
 379
 380     As you can see, they represent shoe-store data.
 381 </para>
 382
 383 <para>
 384     The views are created as:
 385
 386 <programlisting>
 387 CREATE VIEW shoe AS
 388     SELECT sh.shoename,
 389            sh.sh_avail,
 390            sh.slcolor,
 391            sh.slminlen,
 392            sh.slminlen * un.un_fact AS slminlen_cm,
 393            sh.slmaxlen,
 394            sh.slmaxlen * un.un_fact AS slmaxlen_cm,
 395            sh.slunit
 396       FROM shoe_data sh, unit un
 397      WHERE sh.slunit = un.un_name;
 398
 399 CREATE VIEW shoelace AS
 400     SELECT s.sl_name,
 401            s.sl_avail,
 402            s.sl_color,
 403            s.sl_len,
 404            s.sl_unit,
 405            s.sl_len * u.un_fact AS sl_len_cm
 406       FROM shoelace_data s, unit u
 407      WHERE s.sl_unit = u.un_name;
 408
 409 CREATE VIEW shoe_ready AS
 410     SELECT rsh.shoename,
 411            rsh.sh_avail,
 412            rsl.sl_name,
 413            rsl.sl_avail,
 414            min(rsh.sh_avail, rsl.sl_avail) AS total_avail
 415       FROM shoe rsh, shoelace rsl
 416      WHERE rsl.sl_color = rsh.slcolor
 417        AND rsl.sl_len_cm &gt;= rsh.slminlen_cm
 418        AND rsl.sl_len_cm &lt;= rsh.slmaxlen_cm;
 419 </programlisting>
 420
 421     The <command>CREATE VIEW</command> command for the
 422     <literal>shoelace</literal> view (which is the simplest one we
 423     have) will create a relation <literal>shoelace</> and an entry in
 424     <structname>pg_rewrite</structname> that tells that there is a
 425     rewrite rule that must be applied whenever the relation <literal>shoelace</>
 426     is referenced in a query's range table.  The rule has no rule
 427     qualification (discussed later, with the non-<command>SELECT</> rules, since
 428     <command>SELECT</> rules currently cannot have them) and it is <literal>INSTEAD</>. Note
 429     that rule qualifications are not the same as query qualifications.
 430     The action of our rule has a query qualification.
 431     The action of the rule is one query tree that is a copy of the
 432     <command>SELECT</command> statement in the view creation command.
 433 </para>
 434
 435     <note>
 436     <para>
 437     The two extra range
 438     table entries for <literal>NEW</> and <literal>OLD</> (named <literal>*NEW*</> and <literal>*OLD*</> for
 439     historical reasons in the printed query tree) you can see in
 440     the <structname>pg_rewrite</structname> entry aren't of interest
 441     for <command>SELECT</command> rules.
 442     </para>
 443     </note>
 444
 445 <para>
 446     Now we populate <literal>unit</literal>, <literal>shoe_data</literal>
 447     and <literal>shoelace_data</literal> and run a simple query on a view:
 448
 449 <programlisting>
 450 INSERT INTO unit VALUES ('cm', 1.0);
 451 INSERT INTO unit VALUES ('m', 100.0);
 452 INSERT INTO unit VALUES ('inch', 2.54);
 453
 454 INSERT INTO shoe_data VALUES ('sh1', 2, 'black', 70.0, 90.0, 'cm');
 455 INSERT INTO shoe_data VALUES ('sh2', 0, 'black', 30.0, 40.0, 'inch');
 456 INSERT INTO shoe_data VALUES ('sh3', 4, 'brown', 50.0, 65.0, 'cm');
 457 INSERT INTO shoe_data VALUES ('sh4', 3, 'brown', 40.0, 50.0, 'inch');
 458
 459 INSERT INTO shoelace_data VALUES ('sl1', 5, 'black', 80.0, 'cm');
 460 INSERT INTO shoelace_data VALUES ('sl2', 6, 'black', 100.0, 'cm');
 461 INSERT INTO shoelace_data VALUES ('sl3', 0, 'black', 35.0 , 'inch');
 462 INSERT INTO shoelace_data VALUES ('sl4', 8, 'black', 40.0 , 'inch');
 463 INSERT INTO shoelace_data VALUES ('sl5', 4, 'brown', 1.0 , 'm');
 464 INSERT INTO shoelace_data VALUES ('sl6', 0, 'brown', 0.9 , 'm');
 465 INSERT INTO shoelace_data VALUES ('sl7', 7, 'brown', 60 , 'cm');
 466 INSERT INTO shoelace_data VALUES ('sl8', 1, 'brown', 40 , 'inch');
 467
 468 SELECT * FROM shoelace;
 469
 470  sl_name   | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
 471 -----------+----------+----------+--------+---------+-----------
 472  sl1       |        5 | black    |     80 | cm      |        80
 473  sl2       |        6 | black    |    100 | cm      |       100
 474  sl7       |        7 | brown    |     60 | cm      |        60
 475  sl3       |        0 | black    |     35 | inch    |      88.9
 476  sl4       |        8 | black    |     40 | inch    |     101.6
 477  sl8       |        1 | brown    |     40 | inch    |     101.6
 478  sl5       |        4 | brown    |      1 | m       |       100
 479  sl6       |        0 | brown    |    0.9 | m       |        90
 480 (8 rows)
 481 </programlisting>
 482    </para>
 483
 484    <para>
 485     This is the simplest <command>SELECT</command> you can do on our
 486     views, so we take this opportunity to explain the basics of view
 487     rules.  The <literal>SELECT * FROM shoelace</literal> was
 488     interpreted by the parser and produced the query tree:
 489
 490 <programlisting>
 491 SELECT shoelace.sl_name, shoelace.sl_avail,
 492        shoelace.sl_color, shoelace.sl_len,
 493        shoelace.sl_unit, shoelace.sl_len_cm
 494   FROM shoelace shoelace;
 495 </programlisting>
 496
 497     and this is given to the rule system. The rule system walks through the
 498     range table and checks if there are rules
 499     for any relation. When processing the range table entry for
 500     <literal>shoelace</literal> (the only one up to now) it finds the
 501     <literal>_RETURN</literal> rule with the query tree:
 502
 503 <programlisting>
 504 SELECT s.sl_name, s.sl_avail,
 505        s.sl_color, s.sl_len, s.sl_unit,
 506        s.sl_len * u.un_fact AS sl_len_cm
 507   FROM shoelace *OLD*, shoelace *NEW*,
 508        shoelace_data s, unit u
 509  WHERE s.sl_unit = u.un_name;
 510 </programlisting>
 511 </para>
 512
 513 <para>
 514     To expand the view, the rewriter simply creates a subquery range-table
 515     entry containing the rule's action query tree, and substitutes this
 516     range table entry for the original one that referenced the view.  The
 517     resulting rewritten query tree is almost the same as if you had typed:
 518
 519 <programlisting>
 520 SELECT shoelace.sl_name, shoelace.sl_avail,
 521        shoelace.sl_color, shoelace.sl_len,
 522        shoelace.sl_unit, shoelace.sl_len_cm
 523   FROM (SELECT s.sl_name,
 524                s.sl_avail,
 525                s.sl_color,
 526                s.sl_len,
 527                s.sl_unit,
 528                s.sl_len * u.un_fact AS sl_len_cm
 529           FROM shoelace_data s, unit u
 530          WHERE s.sl_unit = u.un_name) shoelace;
 531 </programlisting>
 532
 533      There is one difference however: the subquery's range table has two
 534      extra entries <literal>shoelace *OLD*</> and <literal>shoelace *NEW*</>.  These entries don't
 535      participate directly in the query, since they aren't referenced by
 536      the subquery's join tree or target list.  The rewriter uses them
 537      to store the access privilege check information that was originally present
 538      in the range-table entry that referenced the view.  In this way, the
 539      executor will still check that the user has proper privileges to access
 540      the view, even though there's no direct use of the view in the rewritten
 541      query.
 542 </para>
 543
 544 <para>
 545     That was the first rule applied.  The rule system will continue checking
 546     the remaining range-table entries in the top query (in this example there
 547     are no more), and it will recursively check the range-table entries in
 548     the added subquery to see if any of them reference views.  (But it
 549     won't expand <literal>*OLD*</> or <literal>*NEW*</> &mdash; otherwise we'd have infinite recursion!)
 550     In this example, there are no rewrite rules for <literal>shoelace_data</> or <literal>unit</>,
 551     so rewriting is complete and the above is the final result given to
 552     the planner.
 553 </para>
 554
 555 <para>
 556     Now we want to write a query that finds out for which shoes currently in the store
 557     we have the matching shoelaces (color and length) and where the
 558     total number of exactly matching pairs is greater or equal to two.
 559
 560 <programlisting>
 561 SELECT * FROM shoe_ready WHERE total_avail &gt;= 2;
 562
 563  shoename | sh_avail | sl_name | sl_avail | total_avail
 564 ----------+----------+---------+----------+-------------
 565  sh1      |        2 | sl1     |        5 |           2
 566  sh3      |        4 | sl7     |        7 |           4
 567 (2 rows)
 568 </programlisting>
 569 </para>
 570
 571 <para>
 572     The output of the parser this time is the query tree:
 573
 574 <programlisting>
 575 SELECT shoe_ready.shoename, shoe_ready.sh_avail,
 576        shoe_ready.sl_name, shoe_ready.sl_avail,
 577        shoe_ready.total_avail
 578   FROM shoe_ready shoe_ready
 579  WHERE shoe_ready.total_avail &gt;= 2;
 580 </programlisting>
 581
 582     The first rule applied will be the one for the
 583     <literal>shoe_ready</literal> view and it results in the
 584     query tree:
 585
 586 <programlisting>
 587 SELECT shoe_ready.shoename, shoe_ready.sh_avail,
 588        shoe_ready.sl_name, shoe_ready.sl_avail,
 589        shoe_ready.total_avail
 590   FROM (SELECT rsh.shoename,
 591                rsh.sh_avail,
 592                rsl.sl_name,
 593                rsl.sl_avail,
 594                min(rsh.sh_avail, rsl.sl_avail) AS total_avail
 595           FROM shoe rsh, shoelace rsl
 596          WHERE rsl.sl_color = rsh.slcolor
 597            AND rsl.sl_len_cm &gt;= rsh.slminlen_cm
 598            AND rsl.sl_len_cm &lt;= rsh.slmaxlen_cm) shoe_ready
 599  WHERE shoe_ready.total_avail &gt;= 2;
 600 </programlisting>
 601
 602     Similarly, the rules for <literal>shoe</literal> and
 603     <literal>shoelace</literal> are substituted into the range table of
 604     the subquery, leading to a three-level final query tree:
 605
 606 <programlisting>
 607 SELECT shoe_ready.shoename, shoe_ready.sh_avail,
 608        shoe_ready.sl_name, shoe_ready.sl_avail,
 609        shoe_ready.total_avail
 610   FROM (SELECT rsh.shoename,
 611                rsh.sh_avail,
 612                rsl.sl_name,
 613                rsl.sl_avail,
 614                min(rsh.sh_avail, rsl.sl_avail) AS total_avail
 615           FROM (SELECT sh.shoename,
 616                        sh.sh_avail,
 617                        sh.slcolor,
 618                        sh.slminlen,
 619                        sh.slminlen * un.un_fact AS slminlen_cm,
 620                        sh.slmaxlen,
 621                        sh.slmaxlen * un.un_fact AS slmaxlen_cm,
 622                        sh.slunit
 623                   FROM shoe_data sh, unit un
 624                  WHERE sh.slunit = un.un_name) rsh,
 625                (SELECT s.sl_name,
 626                        s.sl_avail,
 627                        s.sl_color,
 628                        s.sl_len,
 629                        s.sl_unit,
 630                        s.sl_len * u.un_fact AS sl_len_cm
 631                   FROM shoelace_data s, unit u
 632                  WHERE s.sl_unit = u.un_name) rsl
 633          WHERE rsl.sl_color = rsh.slcolor
 634            AND rsl.sl_len_cm &gt;= rsh.slminlen_cm
 635            AND rsl.sl_len_cm &lt;= rsh.slmaxlen_cm) shoe_ready
 636  WHERE shoe_ready.total_avail &gt; 2;
 637 </programlisting>
 638    </para>
 639
 640    <para>
 641     It turns out that the planner will collapse this tree into a
 642     two-level query tree: the bottommost <command>SELECT</command>
 643     commands will be <quote>pulled up</quote> into the middle
 644     <command>SELECT</command> since there's no need to process them
 645     separately.  But the middle <command>SELECT</command> will remain
 646     separate from the top, because it contains aggregate functions.
 647     If we pulled those up it would change the behavior of the topmost
 648     <command>SELECT</command>, which we don't want.  However,
 649     collapsing the query tree is an optimization that the rewrite
 650     system doesn't have to concern itself with.
 651    </para>
 652 </sect2>
 653
 654 <sect2>
 655 <title>View Rules in Non-<command>SELECT</command> Statements</title>
 656
 657 <para>
 658     Two details of the query tree aren't touched in the description of
 659     view rules above. These are the command type and the result relation.
 660     In fact, view rules don't need this information.
 661 </para>
 662
 663 <para>
 664     There are only a few differences between a query tree for a
 665     <command>SELECT</command> and one for any other
 666     command. Obviously, they have a different command type and for a
 667     command other than a <command>SELECT</command>, the result
 668     relation points to the range-table entry where the result should
 669     go.  Everything else is absolutely the same.  So having two tables
 670     <literal>t1</> and <literal>t2</> with columns <literal>a</> and
 671     <literal>b</>, the query trees for the two statements:
 672
 673 <programlisting>
 674 SELECT t2.b FROM t1, t2 WHERE t1.a = t2.a;
 675
 676 UPDATE t1 SET b = t2.b FROM t2 WHERE t1.a = t2.a;
 677 </programlisting>
 678
 679     are nearly identical.  In particular:
 680
 681     <itemizedlist>
 682         <listitem>
 683         <para>
 684             The range tables contain entries for the tables <literal>t1</> and <literal>t2</>.
 685         </para>
 686         </listitem>
 687
 688         <listitem>
 689         <para>
 690             The target lists contain one variable that points to column
 691             <literal>b</> of the range table entry for table <literal>t2</>.
 692         </para>
 693         </listitem>
 694
 695         <listitem>
 696         <para>
 697             The qualification expressions compare the columns <literal>a</> of both
 698             range-table entries for equality.
 699         </para>
 700         </listitem>
 701
 702         <listitem>
 703         <para>
 704             The join trees show a simple join between <literal>t1</> and <literal>t2</>.
 705         </para>
 706         </listitem>
 707     </itemizedlist>
 708    </para>
 709
 710    <para>
 711     The consequence is, that both query trees result in similar
 712     execution plans: They are both joins over the two tables. For the
 713     <command>UPDATE</command> the missing columns from <literal>t1</> are added to
 714     the target list by the planner and the final query tree will read
 715     as:
 716
 717 <programlisting>
 718 UPDATE t1 SET a = t1.a, b = t2.b FROM t2 WHERE t1.a = t2.a;
 719 </programlisting>
 720
 721     and thus the executor run over the join will produce exactly the
 722     same result set as a:
 723
 724 <programlisting>
 725 SELECT t1.a, t2.b FROM t1, t2 WHERE t1.a = t2.a;
 726 </programlisting>
 727
 728     will do. But there is a little problem in
 729     <command>UPDATE</command>: The executor does not care what the
 730     results from the join it is doing are meant for. It just produces
 731     a result set of rows. The difference that one is a
 732     <command>SELECT</command> command and the other is an
 733     <command>UPDATE</command> is handled in the caller of the
 734     executor. The caller still knows (looking at the query tree) that
 735     this is an <command>UPDATE</command>, and it knows that this
 736     result should go into table <literal>t1</>. But which of the rows that are
 737     there has to be replaced by the new row?
 738 </para>
 739
 740 <para>
 741     To resolve this problem, another entry is added to the target list
 742     in <command>UPDATE</command> (and also in
 743     <command>DELETE</command>) statements: the current tuple ID
 744     (<acronym>CTID</>).<indexterm><primary>CTID</></>
 745     This is a system column containing the
 746     file block number and position in the block for the row. Knowing
 747     the table, the <acronym>CTID</> can be used to retrieve the
 748     original row of <literal>t1</> to be updated.  After adding the
 749     <acronym>CTID</> to the target list, the query actually looks like:
 750
 751 <programlisting>
 752 SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a;
 753 </programlisting>
 754
 755     Now another detail of <productname>PostgreSQL</productname> enters
 756     the stage. Old table rows aren't overwritten, and this
 757     is why <command>ROLLBACK</command> is fast. In an <command>UPDATE</command>,
 758     the new result row is inserted into the table (after stripping the
 759     <acronym>CTID</>) and in the row header of the old row, which the
 760     <acronym>CTID</> pointed to, the <literal>cmax</> and
 761     <literal>xmax</> entries are set to the current command counter
 762     and current transaction ID. Thus the old row is hidden, and after
 763     the transaction commits the vacuum cleaner can really remove it.
 764 </para>
 765
 766 <para>
 767     Knowing all that, we can simply apply view rules in absolutely
 768     the same way to any command. There is no difference.
 769 </para>
 770 </sect2>
 771
 772 <sect2>
 773 <title>The Power of Views in <productname>PostgreSQL</productname></title>
 774
 775 <para>
 776     The above demonstrates how the rule system incorporates view
 777     definitions into the original query tree. In the second example, a
 778     simple <command>SELECT</command> from one view created a final
 779     query tree that is a join of 4 tables (<literal>unit</> was used twice with
 780     different names).
 781 </para>
 782
 783 <para>
 784     The benefit of implementing views with the rule system is,
 785     that the planner has all
 786     the information about which tables have to be scanned plus the
 787     relationships between these tables plus the restrictive
 788     qualifications from the views plus the qualifications from
 789     the original query
 790     in one single query tree. And this is still the situation
 791     when the original query is already a join over views.
 792     The planner has to decide which is
 793     the best path to execute the query, and the more information
 794     the planner has, the better this decision can be. And
 795     the rule system as implemented in <productname>PostgreSQL</productname>
 796     ensures, that this is all information available about the query
 797     up to that point.
 798 </para>
 799 </sect2>
 800
 801 <sect2 id="rules-views-update">
 802 <title>Updating a View</title>
 803
 804 <para>
 805     What happens if a view is named as the target relation for an
 806     <command>INSERT</command>, <command>UPDATE</command>, or
 807     <command>DELETE</command>?  After doing the substitutions
 808     described above, we will have a query tree in which the result
 809     relation points at a subquery range-table entry.  This will not
 810     work, so the rewriter throws an error if it sees it has produced
 811     such a thing.
 812 </para>
 813
 814 <para>
 815     To change this, we can define rules that modify the behavior of
 816     these kinds of commands. This is the topic of the next section.
 817 </para>
 818 </sect2>
 819
 820 </sect1>
 821
 822 <sect1 id="rules-update">
 823 <title>Rules on <command>INSERT</>, <command>UPDATE</>, and <command>DELETE</></title>
 824
 825 <indexterm zone="rules-update">
 826  <primary>rule</primary>
 827  <secondary sortas="INSERT">for INSERT</secondary>
 828 </indexterm>
 829
 830 <indexterm zone="rules-update">
 831  <primary>rule</primary>
 832  <secondary sortas="UPDATE">for UPDATE</secondary>
 833 </indexterm>
 834
 835 <indexterm zone="rules-update">
 836  <primary>rule</primary>
 837  <secondary sortas="DELETE">for DELETE</secondary>
 838 </indexterm>
 839
 840 <para>
 841     Rules that are defined on <command>INSERT</>, <command>UPDATE</>,
 842     and <command>DELETE</> are significantly different from the view rules
 843     described in the previous section. First, their <command>CREATE
 844     RULE</command> command allows more:
 845
 846     <itemizedlist>
 847         <listitem>
 848         <para>
 849             They are allowed to have no action.
 850         </para>
 851         </listitem>
 852
 853         <listitem>
 854         <para>
 855             They can have multiple actions.
 856         </para>
 857         </listitem>
 858
 859         <listitem>
 860         <para>
 861             They can be <literal>INSTEAD</> or <literal>ALSO</> (the default).
 862         </para>
 863         </listitem>
 864
 865         <listitem>
 866         <para>
 867             The pseudorelations <literal>NEW</> and <literal>OLD</> become useful.
 868         </para>
 869         </listitem>
 870
 871         <listitem>
 872         <para>
 873             They can have rule qualifications.
 874         </para>
 875         </listitem>
 876     </itemizedlist>
 877
 878     Second, they don't modify the query tree in place. Instead they
 879     create zero or more new query trees and can throw away the
 880     original one.
 881 </para>
 882
 883 <sect2>
 884 <title>How Update Rules Work</title>
 885
 886 <para>
 887     Keep the syntax:
 888
 889 <programlisting>
 890 CREATE [ OR REPLACE ] RULE <replaceable class="parameter">name</replaceable> AS ON <replaceable class="parameter">event</replaceable>
 891     TO <replaceable class="parameter">table</replaceable> [ WHERE <replaceable class="parameter">condition</replaceable> ]
 892     DO [ ALSO | INSTEAD ] { NOTHING | <replaceable class="parameter">command</replaceable> | ( <replaceable class="parameter">command</replaceable> ; <replaceable class="parameter">command</replaceable> ... ) }
 893 </programlisting>
 894
 895     in mind.
 896     In the following, <firstterm>update rules</> means rules that are defined
 897     on <command>INSERT</>, <command>UPDATE</>, or <command>DELETE</>.
 898 </para>
 899
 900 <para>
 901     Update rules get applied by the rule system when the result
 902     relation and the command type of a query tree are equal to the
 903     object and event given in the <command>CREATE RULE</command> command.
 904     For update rules, the rule system creates a list of query trees.
 905     Initially the query-tree list is empty.
 906     There can be zero (<literal>NOTHING</> key word), one, or multiple actions.
 907     To simplify, we will look at a rule with one action. This rule
 908     can have a qualification or not and it can be <literal>INSTEAD</> or
 909     <literal>ALSO</> (the default).
 910 </para>
 911
 912 <para>
 913     What is a rule qualification? It is a restriction that tells
 914     when the actions of the rule should be done and when not. This
 915     qualification can only reference the pseudorelations <literal>NEW</> and/or <literal>OLD</>,
 916     which basically represent the relation that was given as object (but with a
 917     special meaning).
 918 </para>
 919
 920    <para>
 921     So we have three cases that produce the following query trees for
 922     a one-action rule.
 923
 924     <variablelist>
 925      <varlistentry>
 926       <term>No qualification, with either <literal>ALSO</> or
 927       <literal>INSTEAD</></term>
 928       <listitem>
 929        <para>
 930         the query tree from the rule action with the original query
 931         tree's qualification added
 932        </para>
 933       </listitem>
 934      </varlistentry>
 935
 936      <varlistentry>
 937       <term>Qualification given and <literal>ALSO</></term>
 938       <listitem>
 939        <para>
 940         the query tree from the rule action with the rule
 941         qualification and the original query tree's qualification
 942         added
 943        </para>
 944       </listitem>
 945      </varlistentry>
 946
 947      <varlistentry>
 948       <term>Qualification given and <literal>INSTEAD</></term>
 949       <listitem>
 950        <para>
 951         the query tree from the rule action with the rule
 952         qualification and the original query tree's qualification; and
 953         the original query tree with the negated rule qualification
 954         added
 955        </para>
 956       </listitem>
 957      </varlistentry>
 958     </variablelist>
 959
 960     Finally, if the rule is <literal>ALSO</>, the unchanged original query tree is
 961     added to the list. Since only qualified <literal>INSTEAD</> rules already add the
 962     original query tree, we end up with either one or two output query trees
 963     for a rule with one action.
 964 </para>
 965
 966 <para>
 967     For <literal>ON INSERT</> rules, the original query (if not suppressed by <literal>INSTEAD</>)
 968     is done before any actions added by rules.  This allows the actions to
 969     see the inserted row(s).  But for <literal>ON UPDATE</> and <literal>ON
 970     DELETE</> rules, the original query is done after the actions added by rules.
 971     This ensures that the actions can see the to-be-updated or to-be-deleted
 972     rows; otherwise, the actions might do nothing because they find no rows
 973     matching their qualifications.
 974 </para>
 975
 976 <para>
 977     The query trees generated from rule actions are thrown into the
 978     rewrite system again, and maybe more rules get applied resulting
 979     in more or less query trees.
 980     So a rule's actions must have either a different
 981     command type or a different result relation than the rule itself is
 982     on, otherwise this recursive process will end up in an infinite loop.
 983     (Recursive expansion of a rule will be detected and reported as an
 984     error.)
 985 </para>
 986
 987 <para>
 988     The query trees found in the actions of the
 989     <structname>pg_rewrite</structname> system catalog are only
 990     templates. Since they can reference the range-table entries for
 991     <literal>NEW</> and <literal>OLD</>, some substitutions have to be made before they can be
 992     used. For any reference to <literal>NEW</>, the target list of the original
 993     query is searched for a corresponding entry. If found, that
 994     entry's expression replaces the reference. Otherwise, <literal>NEW</> means the
 995     same as <literal>OLD</> (for an <command>UPDATE</command>) or is replaced by
 996     a null value (for an <command>INSERT</command>). Any reference to <literal>OLD</> is
 997     replaced by a reference to the range-table entry that is the
 998     result relation.
 999 </para>
1000
1001 <para>
1002     After the system is done applying update rules, it applies view rules to the
1003     produced query tree(s).  Views cannot insert new update actions so
1004     there is no need to apply update rules to the output of view rewriting.
1005 </para>
1006
1007 <sect3>
1008 <title>A First Rule Step by Step</title>
1009
1010 <para>
1011     Say we want to trace changes to the <literal>sl_avail</> column in the
1012     <literal>shoelace_data</literal> relation. So we set up a log table
1013     and a rule that conditionally writes a log entry when an
1014     <command>UPDATE</command> is performed on
1015     <literal>shoelace_data</literal>.
1016
1017 <programlisting>
1018 CREATE TABLE shoelace_log (
1019     sl_name    text,          -- shoelace changed
1020     sl_avail   integer,       -- new available value
1021     log_who    text,          -- who did it
1022     log_when   timestamp      -- when
1023 );
1024
1025 CREATE RULE log_shoelace AS ON UPDATE TO shoelace_data
1026     WHERE NEW.sl_avail &lt;&gt; OLD.sl_avail
1027     DO INSERT INTO shoelace_log VALUES (
1028                                     NEW.sl_name,
1029                                     NEW.sl_avail,
1030                                     current_user,
1031                                     current_timestamp
1032                                 );
1033 </programlisting>
1034 </para>
1035
1036 <para>
1037     Now someone does:
1038
1039 <programlisting>
1040 UPDATE shoelace_data SET sl_avail = 6 WHERE sl_name = 'sl7';
1041 </programlisting>
1042
1043     and we look at the log table:
1044
1045 <programlisting>
1046 SELECT * FROM shoelace_log;
1047
1048  sl_name | sl_avail | log_who | log_when
1049 ---------+----------+---------+----------------------------------
1050  sl7     |        6 | Al      | Tue Oct 20 16:14:45 1998 MET DST
1051 (1 row)
1052 </programlisting>
1053    </para>
1054
1055    <para>
1056     That's what we expected. What happened in the background is the following.
1057     The parser created the query tree:
1058
1059 <programlisting>
1060 UPDATE shoelace_data SET sl_avail = 6
1061   FROM shoelace_data shoelace_data
1062  WHERE shoelace_data.sl_name = 'sl7';
1063 </programlisting>
1064
1065     There is a rule <literal>log_shoelace</literal> that is <literal>ON UPDATE</> with the rule
1066     qualification expression:
1067
1068 <programlisting>
1069 NEW.sl_avail &lt;&gt; OLD.sl_avail
1070 </programlisting>
1071
1072     and the action:
1073
1074 <programlisting>
1075 INSERT INTO shoelace_log VALUES (
1076        *NEW*.sl_name, *NEW*.sl_avail,
1077        current_user, current_timestamp )
1078   FROM shoelace_data *NEW*, shoelace_data *OLD*;
1079 </programlisting>
1080
1081     (This looks a little strange since you cannot normally write
1082     <literal>INSERT ... VALUES ... FROM</>.  The <literal>FROM</>
1083     clause here is just to indicate that there are range-table entries
1084     in the query tree for <literal>*NEW*</> and <literal>*OLD*</>.
1085     These are needed so that they can be referenced by variables in
1086     the <command>INSERT</command> command's query tree.)
1087 </para>
1088
1089 <para>
1090     The rule is a qualified <literal>ALSO</> rule, so the rule system
1091     has to return two query trees: the modified rule action and the original
1092     query tree. In step 1, the range table of the original query is
1093     incorporated into the rule's action query tree. This results in:
1094
1095 <programlisting>
1096 INSERT INTO shoelace_log VALUES (
1097        *NEW*.sl_name, *NEW*.sl_avail,
1098        current_user, current_timestamp )
1099   FROM shoelace_data *NEW*, shoelace_data *OLD*,
1100        <emphasis>shoelace_data shoelace_data</emphasis>;
1101 </programlisting>
1102
1103     In step 2, the rule qualification is added to it, so the result set
1104     is restricted to rows where <literal>sl_avail</> changes:
1105
1106 <programlisting>
1107 INSERT INTO shoelace_log VALUES (
1108        *NEW*.sl_name, *NEW*.sl_avail,
1109        current_user, current_timestamp )
1110   FROM shoelace_data *NEW*, shoelace_data *OLD*,
1111        shoelace_data shoelace_data
1112  <emphasis>WHERE *NEW*.sl_avail &lt;&gt; *OLD*.sl_avail</emphasis>;
1113 </programlisting>
1114
1115     (This looks even stranger, since <literal>INSERT ... VALUES</> doesn't have
1116     a <literal>WHERE</> clause either, but the planner and executor will have no
1117     difficulty with it.  They need to support this same functionality
1118     anyway for <literal>INSERT ... SELECT</>.)
1119    </para>
1120
1121    <para>
1122     In step 3, the original query tree's qualification is added,
1123     restricting the result set further to only the rows that would have been touched
1124     by the original query:
1125
1126 <programlisting>
1127 INSERT INTO shoelace_log VALUES (
1128        *NEW*.sl_name, *NEW*.sl_avail,
1129        current_user, current_timestamp )
1130   FROM shoelace_data *NEW*, shoelace_data *OLD*,
1131        shoelace_data shoelace_data
1132  WHERE *NEW*.sl_avail &lt;&gt; *OLD*.sl_avail
1133    <emphasis>AND shoelace_data.sl_name = 'sl7'</emphasis>;
1134 </programlisting>
1135    </para>
1136
1137    <para>
1138     Step 4 replaces references to <literal>NEW</> by the target list entries from the
1139     original query tree or by the matching variable references
1140     from the result relation:
1141
1142 <programlisting>
1143 INSERT INTO shoelace_log VALUES (
1144        <emphasis>shoelace_data.sl_name</emphasis>, <emphasis>6</emphasis>,
1145        current_user, current_timestamp )
1146   FROM shoelace_data *NEW*, shoelace_data *OLD*,
1147        shoelace_data shoelace_data
1148  WHERE <emphasis>6</emphasis> &lt;&gt; *OLD*.sl_avail
1149    AND shoelace_data.sl_name = 'sl7';
1150 </programlisting>
1151
1152    </para>
1153
1154    <para>
1155     Step 5 changes <literal>OLD</> references into result relation references:
1156
1157 <programlisting>
1158 INSERT INTO shoelace_log VALUES (
1159        shoelace_data.sl_name, 6,
1160        current_user, current_timestamp )
1161   FROM shoelace_data *NEW*, shoelace_data *OLD*,
1162        shoelace_data shoelace_data
1163  WHERE 6 &lt;&gt; <emphasis>shoelace_data.sl_avail</emphasis>
1164    AND shoelace_data.sl_name = 'sl7';
1165 </programlisting>
1166    </para>
1167
1168    <para>
1169     That's it.  Since the rule is <literal>ALSO</>, we also output the
1170     original query tree.  In short, the output from the rule system
1171     is a list of two query trees that correspond to these statements:
1172
1173 <programlisting>
1174 INSERT INTO shoelace_log VALUES (
1175        shoelace_data.sl_name, 6,
1176        current_user, current_timestamp )
1177   FROM shoelace_data
1178  WHERE 6 &lt;&gt; shoelace_data.sl_avail
1179    AND shoelace_data.sl_name = 'sl7';
1180
1181 UPDATE shoelace_data SET sl_avail = 6
1182  WHERE sl_name = 'sl7';
1183 </programlisting>
1184
1185     These are executed in this order, and that is exactly what
1186     the rule was meant to do.
1187    </para>
1188
1189    <para>
1190     The substitutions and the added qualifications
1191     ensure that, if the original query would be, say:
1192
1193 <programlisting>
1194 UPDATE shoelace_data SET sl_color = 'green'
1195  WHERE sl_name = 'sl7';
1196 </programlisting>
1197
1198     no log entry would get written.  In that case, the original query
1199     tree does not contain a target list entry for
1200     <literal>sl_avail</>, so <literal>NEW.sl_avail</> will get
1201     replaced by <literal>shoelace_data.sl_avail</>.  Thus, the extra
1202     command generated by the rule is:
1203
1204 <programlisting>
1205 INSERT INTO shoelace_log VALUES (
1206        shoelace_data.sl_name, <emphasis>shoelace_data.sl_avail</emphasis>,
1207        current_user, current_timestamp )
1208   FROM shoelace_data
1209  WHERE <emphasis>shoelace_data.sl_avail</emphasis> &lt;&gt; shoelace_data.sl_avail
1210    AND shoelace_data.sl_name = 'sl7';
1211 </programlisting>
1212
1213     and that qualification will never be true.
1214    </para>
1215
1216    <para>
1217     It will also work if the original query modifies multiple rows. So
1218     if someone issued the command:
1219
1220 <programlisting>
1221 UPDATE shoelace_data SET sl_avail = 0
1222  WHERE sl_color = 'black';
1223 </programlisting>
1224
1225     four rows in fact get updated (<literal>sl1</>, <literal>sl2</>, <literal>sl3</>, and <literal>sl4</>).
1226     But <literal>sl3</> already has <literal>sl_avail = 0</>.   In this case, the original
1227     query trees qualification is different and that results
1228     in the extra query tree:
1229
1230 <programlisting>
1231 INSERT INTO shoelace_log
1232 SELECT shoelace_data.sl_name, 0,
1233        current_user, current_timestamp
1234   FROM shoelace_data
1235  WHERE 0 &lt;&gt; shoelace_data.sl_avail
1236    AND <emphasis>shoelace_data.sl_color = 'black'</emphasis>;
1237 </programlisting>
1238
1239     being generated by the rule.  This query tree will surely insert
1240     three new log entries. And that's absolutely correct.
1241 </para>
1242
1243 <para>
1244     Here we can see why it is important that the original query tree
1245     is executed last.  If the <command>UPDATE</command> had been
1246     executed first, all the rows would have already been set to zero, so the
1247     logging <command>INSERT</command> would not find any row where
1248     <literal>0 &lt;&gt; shoelace_data.sl_avail</literal>.
1249 </para>
1250 </sect3>
1251
1252 </sect2>
1253
1254 <sect2 id="rules-update-views">
1255 <title>Cooperation with Views</title>
1256
1257 <indexterm zone="rules-update-views"><primary>view</><secondary>updating</></>
1258
1259 <para>
1260     A simple way to protect view relations from the mentioned
1261     possibility that someone can try to run <command>INSERT</command>,
1262     <command>UPDATE</command>, or <command>DELETE</command> on them is
1263     to let those query trees get thrown away.  So we could create the rules:
1264
1265 <programlisting>
1266 CREATE RULE shoe_ins_protect AS ON INSERT TO shoe
1267     DO INSTEAD NOTHING;
1268 CREATE RULE shoe_upd_protect AS ON UPDATE TO shoe
1269     DO INSTEAD NOTHING;
1270 CREATE RULE shoe_del_protect AS ON DELETE TO shoe
1271     DO INSTEAD NOTHING;
1272 </programlisting>
1273
1274     If someone now tries to do any of these operations on the view
1275     relation <literal>shoe</literal>, the rule system will
1276     apply these rules. Since the rules have
1277     no actions and are <literal>INSTEAD</>, the resulting list of
1278     query trees will be empty and the whole query will become
1279     nothing because there is nothing left to be optimized or
1280     executed after the rule system is done with it.
1281 </para>
1282
1283 <para>
1284     A more sophisticated way to use the rule system is to
1285     create rules that rewrite the query tree into one that
1286     does the right operation on the real tables. To do that
1287     on the <literal>shoelace</literal> view, we create
1288     the following rules:
1289
1290 <programlisting>
1291 CREATE RULE shoelace_ins AS ON INSERT TO shoelace
1292     DO INSTEAD
1293     INSERT INTO shoelace_data VALUES (
1294            NEW.sl_name,
1295            NEW.sl_avail,
1296            NEW.sl_color,
1297            NEW.sl_len,
1298            NEW.sl_unit
1299     );
1300
1301 CREATE RULE shoelace_upd AS ON UPDATE TO shoelace
1302     DO INSTEAD
1303     UPDATE shoelace_data
1304        SET sl_name = NEW.sl_name,
1305            sl_avail = NEW.sl_avail,
1306            sl_color = NEW.sl_color,
1307            sl_len = NEW.sl_len,
1308            sl_unit = NEW.sl_unit
1309      WHERE sl_name = OLD.sl_name;
1310
1311 CREATE RULE shoelace_del AS ON DELETE TO shoelace
1312     DO INSTEAD
1313     DELETE FROM shoelace_data
1314      WHERE sl_name = OLD.sl_name;
1315 </programlisting>
1316    </para>
1317
1318    <para>
1319     If you want to support <literal>RETURNING</> queries on the view,
1320     you need to make the rules include <literal>RETURNING</> clauses that
1321     compute the view rows.  This is usually pretty trivial for views on a
1322     single table, but it's a bit tedious for join views such as
1323     <literal>shoelace</literal>.  An example for the insert case is:
1324
1325 <programlisting>
1326 CREATE RULE shoelace_ins AS ON INSERT TO shoelace
1327     DO INSTEAD
1328     INSERT INTO shoelace_data VALUES (
1329            NEW.sl_name,
1330            NEW.sl_avail,
1331            NEW.sl_color,
1332            NEW.sl_len,
1333            NEW.sl_unit
1334     )
1335     RETURNING
1336            shoelace_data.*,
1337            (SELECT shoelace_data.sl_len * u.un_fact
1338             FROM unit u WHERE shoelace_data.sl_unit = u.un_name);
1339 </programlisting>
1340
1341     Note that this one rule supports both <command>INSERT</> and
1342     <command>INSERT RETURNING</> queries on the view &mdash; the
1343     <literal>RETURNING</> clause is simply ignored for <command>INSERT</>.
1344    </para>
1345
1346    <para>
1347     Now assume that once in a while, a pack of shoelaces arrives at
1348     the shop and a big parts list along with it.  But you don't want
1349     to manually update the <literal>shoelace</literal> view every
1350     time.  Instead we setup two little tables: one where you can
1351     insert the items from the part list, and one with a special
1352     trick. The creation commands for these are:
1353
1354 <programlisting>
1355 CREATE TABLE shoelace_arrive (
1356     arr_name    text,
1357     arr_quant   integer
1358 );
1359
1360 CREATE TABLE shoelace_ok (
1361     ok_name     text,
1362     ok_quant    integer
1363 );
1364
1365 CREATE RULE shoelace_ok_ins AS ON INSERT TO shoelace_ok
1366     DO INSTEAD
1367     UPDATE shoelace
1368        SET sl_avail = sl_avail + NEW.ok_quant
1369      WHERE sl_name = NEW.ok_name;
1370 </programlisting>
1371
1372     Now you can fill the table <literal>shoelace_arrive</literal> with
1373     the data from the parts list:
1374
1375 <programlisting>
1376 SELECT * FROM shoelace_arrive;
1377
1378  arr_name | arr_quant
1379 ----------+-----------
1380  sl3      |        10
1381  sl6      |        20
1382  sl8      |        20
1383 (3 rows)
1384 </programlisting>
1385
1386     Take a quick look at the current data:
1387
1388 <programlisting>
1389 SELECT * FROM shoelace;
1390
1391  sl_name  | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1392 ----------+----------+----------+--------+---------+-----------
1393  sl1      |        5 | black    |     80 | cm      |        80
1394  sl2      |        6 | black    |    100 | cm      |       100
1395  sl7      |        6 | brown    |     60 | cm      |        60
1396  sl3      |        0 | black    |     35 | inch    |      88.9
1397  sl4      |        8 | black    |     40 | inch    |     101.6
1398  sl8      |        1 | brown    |     40 | inch    |     101.6
1399  sl5      |        4 | brown    |      1 | m       |       100
1400  sl6      |        0 | brown    |    0.9 | m       |        90
1401 (8 rows)
1402 </programlisting>
1403
1404     Now move the arrived shoelaces in:
1405
1406 <programlisting>
1407 INSERT INTO shoelace_ok SELECT * FROM shoelace_arrive;
1408 </programlisting>
1409
1410     and check the results:
1411
1412 <programlisting>
1413 SELECT * FROM shoelace ORDER BY sl_name;
1414
1415  sl_name  | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1416 ----------+----------+----------+--------+---------+-----------
1417  sl1      |        5 | black    |     80 | cm      |        80
1418  sl2      |        6 | black    |    100 | cm      |       100
1419  sl7      |        6 | brown    |     60 | cm      |        60
1420  sl4      |        8 | black    |     40 | inch    |     101.6
1421  sl3      |       10 | black    |     35 | inch    |      88.9
1422  sl8      |       21 | brown    |     40 | inch    |     101.6
1423  sl5      |        4 | brown    |      1 | m       |       100
1424  sl6      |       20 | brown    |    0.9 | m       |        90
1425 (8 rows)
1426
1427 SELECT * FROM shoelace_log;
1428
1429  sl_name | sl_avail | log_who| log_when
1430 ---------+----------+--------+----------------------------------
1431  sl7     |        6 | Al     | Tue Oct 20 19:14:45 1998 MET DST
1432  sl3     |       10 | Al     | Tue Oct 20 19:25:16 1998 MET DST
1433  sl6     |       20 | Al     | Tue Oct 20 19:25:16 1998 MET DST
1434  sl8     |       21 | Al     | Tue Oct 20 19:25:16 1998 MET DST
1435 (4 rows)
1436 </programlisting>
1437    </para>
1438
1439    <para>
1440     It's a long way from the one <literal>INSERT ... SELECT</literal>
1441     to these results. And the description of the query-tree
1442     transformation will be the last in this chapter.  First, there is
1443     the parser's output:
1444
1445 <programlisting>
1446 INSERT INTO shoelace_ok
1447 SELECT shoelace_arrive.arr_name, shoelace_arrive.arr_quant
1448   FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok;
1449 </programlisting>
1450
1451     Now the first rule <literal>shoelace_ok_ins</literal> is applied and turns this
1452     into:
1453
1454 <programlisting>
1455 UPDATE shoelace
1456    SET sl_avail = shoelace.sl_avail + shoelace_arrive.arr_quant
1457   FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1458        shoelace_ok *OLD*, shoelace_ok *NEW*,
1459        shoelace shoelace
1460  WHERE shoelace.sl_name = shoelace_arrive.arr_name;
1461 </programlisting>
1462
1463     and throws away the original <command>INSERT</command> on
1464     <literal>shoelace_ok</literal>.  This rewritten query is passed to
1465     the rule system again, and the second applied rule
1466     <literal>shoelace_upd</literal> produces:
1467
1468 <programlisting>
1469 UPDATE shoelace_data
1470    SET sl_name = shoelace.sl_name,
1471        sl_avail = shoelace.sl_avail + shoelace_arrive.arr_quant,
1472        sl_color = shoelace.sl_color,
1473        sl_len = shoelace.sl_len,
1474        sl_unit = shoelace.sl_unit
1475   FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1476        shoelace_ok *OLD*, shoelace_ok *NEW*,
1477        shoelace shoelace, shoelace *OLD*,
1478        shoelace *NEW*, shoelace_data shoelace_data
1479  WHERE shoelace.sl_name = shoelace_arrive.arr_name
1480    AND shoelace_data.sl_name = shoelace.sl_name;
1481 </programlisting>
1482
1483     Again it's an <literal>INSTEAD</> rule and the previous query tree is trashed.
1484     Note that this query still uses the view <literal>shoelace</literal>.
1485     But the rule system isn't finished with this step, so it continues
1486     and applies the <literal>_RETURN</literal> rule on it, and we get:
1487
1488 <programlisting>
1489 UPDATE shoelace_data
1490    SET sl_name = s.sl_name,
1491        sl_avail = s.sl_avail + shoelace_arrive.arr_quant,
1492        sl_color = s.sl_color,
1493        sl_len = s.sl_len,
1494        sl_unit = s.sl_unit
1495   FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1496        shoelace_ok *OLD*, shoelace_ok *NEW*,
1497        shoelace shoelace, shoelace *OLD*,
1498        shoelace *NEW*, shoelace_data shoelace_data,
1499        shoelace *OLD*, shoelace *NEW*,
1500        shoelace_data s, unit u
1501  WHERE s.sl_name = shoelace_arrive.arr_name
1502    AND shoelace_data.sl_name = s.sl_name;
1503 </programlisting>
1504
1505     Finally, the rule <literal>log_shoelace</literal> gets applied,
1506     producing the extra query tree:
1507
1508 <programlisting>
1509 INSERT INTO shoelace_log
1510 SELECT s.sl_name,
1511        s.sl_avail + shoelace_arrive.arr_quant,
1512        current_user,
1513        current_timestamp
1514   FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1515        shoelace_ok *OLD*, shoelace_ok *NEW*,
1516        shoelace shoelace, shoelace *OLD*,
1517        shoelace *NEW*, shoelace_data shoelace_data,
1518        shoelace *OLD*, shoelace *NEW*,
1519        shoelace_data s, unit u,
1520        shoelace_data *OLD*, shoelace_data *NEW*
1521        shoelace_log shoelace_log
1522  WHERE s.sl_name = shoelace_arrive.arr_name
1523    AND shoelace_data.sl_name = s.sl_name
1524    AND (s.sl_avail + shoelace_arrive.arr_quant) &lt;&gt; s.sl_avail;
1525 </programlisting>
1526
1527     After that the rule system runs out of rules and returns the
1528     generated query trees.
1529    </para>
1530
1531    <para>
1532     So we end up with two final query trees that are equivalent to the
1533     <acronym>SQL</acronym> statements:
1534
1535 <programlisting>
1536 INSERT INTO shoelace_log
1537 SELECT s.sl_name,
1538        s.sl_avail + shoelace_arrive.arr_quant,
1539        current_user,
1540        current_timestamp
1541   FROM shoelace_arrive shoelace_arrive, shoelace_data shoelace_data,
1542        shoelace_data s
1543  WHERE s.sl_name = shoelace_arrive.arr_name
1544    AND shoelace_data.sl_name = s.sl_name
1545    AND s.sl_avail + shoelace_arrive.arr_quant &lt;&gt; s.sl_avail;
1546
1547 UPDATE shoelace_data
1548    SET sl_avail = shoelace_data.sl_avail + shoelace_arrive.arr_quant
1549   FROM shoelace_arrive shoelace_arrive,
1550        shoelace_data shoelace_data,
1551        shoelace_data s
1552  WHERE s.sl_name = shoelace_arrive.sl_name
1553    AND shoelace_data.sl_name = s.sl_name;
1554 </programlisting>
1555
1556     The result is that data coming from one relation inserted into another,
1557     changed into updates on a third, changed into updating
1558     a fourth plus logging that final update in a fifth
1559     gets reduced into two queries.
1560 </para>
1561
1562 <para>
1563     There is a little detail that's a bit ugly. Looking at the two
1564     queries, it turns out that the <literal>shoelace_data</literal>
1565     relation appears twice in the range table where it could
1566     definitely be reduced to one. The planner does not handle it and
1567     so the execution plan for the rule systems output of the
1568     <command>INSERT</command> will be
1569
1570 <literallayout class="monospaced">
1571 Nested Loop
1572   -&gt;  Merge Join
1573         -&gt;  Seq Scan
1574               -&gt;  Sort
1575                     -&gt;  Seq Scan on s
1576         -&gt;  Seq Scan
1577               -&gt;  Sort
1578                     -&gt;  Seq Scan on shoelace_arrive
1579   -&gt;  Seq Scan on shoelace_data
1580 </literallayout>
1581
1582     while omitting the extra range table entry would result in a
1583
1584 <literallayout class="monospaced">
1585 Merge Join
1586   -&gt;  Seq Scan
1587         -&gt;  Sort
1588               -&gt;  Seq Scan on s
1589   -&gt;  Seq Scan
1590         -&gt;  Sort
1591               -&gt;  Seq Scan on shoelace_arrive
1592 </literallayout>
1593
1594     which produces exactly the same entries in the log table.  Thus,
1595     the rule system caused one extra scan on the table
1596     <literal>shoelace_data</literal> that is absolutely not
1597     necessary. And the same redundant scan is done once more in the
1598     <command>UPDATE</command>. But it was a really hard job to make
1599     that all possible at all.
1600 </para>
1601
1602 <para>
1603     Now we make a final demonstration of the
1604     <productname>PostgreSQL</productname> rule system and its power.
1605     Say you add some shoelaces with extraordinary colors to your
1606     database:
1607
1608 <programlisting>
1609 INSERT INTO shoelace VALUES ('sl9', 0, 'pink', 35.0, 'inch', 0.0);
1610 INSERT INTO shoelace VALUES ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0);
1611 </programlisting>
1612
1613     We would like to make a view to check which
1614     <literal>shoelace</literal> entries do not fit any shoe in color.
1615     The view for this is:
1616
1617 <programlisting>
1618 CREATE VIEW shoelace_mismatch AS
1619     SELECT * FROM shoelace WHERE NOT EXISTS
1620         (SELECT shoename FROM shoe WHERE slcolor = sl_color);
1621 </programlisting>
1622
1623     Its output is:
1624
1625 <programlisting>
1626 SELECT * FROM shoelace_mismatch;
1627
1628  sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1629 ---------+----------+----------+--------+---------+-----------
1630  sl9     |        0 | pink     |     35 | inch    |      88.9
1631  sl10    |     1000 | magenta  |     40 | inch    |     101.6
1632 </programlisting>
1633    </para>
1634
1635    <para>
1636     Now we want to set it up so that mismatching shoelaces that are
1637     not in stock are deleted from the database.
1638     To make it a little harder for <productname>PostgreSQL</productname>,
1639     we don't delete it directly. Instead we create one more view:
1640
1641 <programlisting>
1642 CREATE VIEW shoelace_can_delete AS
1643     SELECT * FROM shoelace_mismatch WHERE sl_avail = 0;
1644 </programlisting>
1645
1646     and do it this way:
1647
1648 <programlisting>
1649 DELETE FROM shoelace WHERE EXISTS
1650     (SELECT * FROM shoelace_can_delete
1651              WHERE sl_name = shoelace.sl_name);
1652 </programlisting>
1653
1654     <foreignphrase>Voilà</foreignphrase>:
1655
1656 <programlisting>
1657 SELECT * FROM shoelace;
1658
1659  sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1660 ---------+----------+----------+--------+---------+-----------
1661  sl1     |        5 | black    |     80 | cm      |        80
1662  sl2     |        6 | black    |    100 | cm      |       100
1663  sl7     |        6 | brown    |     60 | cm      |        60
1664  sl4     |        8 | black    |     40 | inch    |     101.6
1665  sl3     |       10 | black    |     35 | inch    |      88.9
1666  sl8     |       21 | brown    |     40 | inch    |     101.6
1667  sl10    |     1000 | magenta  |     40 | inch    |     101.6
1668  sl5     |        4 | brown    |      1 | m       |       100
1669  sl6     |       20 | brown    |    0.9 | m       |        90
1670 (9 rows)
1671 </programlisting>
1672    </para>
1673
1674    <para>
1675     A <command>DELETE</command> on a view, with a subquery qualification that
1676     in total uses 4 nesting/joined views, where one of them
1677     itself has a subquery qualification containing a view
1678     and where calculated view columns are used,
1679     gets rewritten into
1680     one single query tree that deletes the requested data
1681     from a real table.
1682 </para>
1683
1684 <para>
1685     There are probably only a few situations out in the real world
1686     where such a construct is necessary. But it makes you feel
1687     comfortable that it works.
1688 </para>
1689 </sect2>
1690
1691 </sect1>
1692
1693 <sect1 id="rules-privileges">
1694 <title>Rules and Privileges</title>
1695
1696 <indexterm zone="rules-privileges">
1697  <primary>privilege</primary>
1698  <secondary sortas="Regeln">with rules</secondary>
1699 </indexterm>
1700
1701 <indexterm zone="rules-privileges">
1702  <primary>privilege</primary>
1703  <secondary sortas="Sichten">with views</secondary>
1704 </indexterm>
1705
1706 <para>
1707     Due to rewriting of queries by the <productname>PostgreSQL</productname>
1708     rule system, other tables/views than those used in the original
1709     query get accessed. When update rules are used, this can include write access
1710     to tables.
1711 </para>
1712
1713 <para>
1714     Rewrite rules don't have a separate owner. The owner of
1715     a relation (table or view) is automatically the owner of the
1716     rewrite rules that are defined for it.
1717     The <productname>PostgreSQL</productname> rule system changes the
1718     behavior of the default access control system. Relations that
1719     are used due to rules get checked against the
1720     privileges of the rule owner, not the user invoking the rule.
1721     This means that a user only needs the required privileges
1722     for the tables/views that he names explicitly in his queries.
1723 </para>
1724
1725 <para>
1726     For example: A user has a list of phone numbers where some of
1727     them are private, the others are of interest for the secretary of the office.
1728     He can construct the following:
1729
1730 <programlisting>
1731 CREATE TABLE phone_data (person text, phone text, private boolean);
1732 CREATE VIEW phone_number AS
1733     SELECT person, phone FROM phone_data WHERE NOT private;
1734 GRANT SELECT ON phone_number TO secretary;
1735 </programlisting>
1736
1737     Nobody except him (and the database superusers) can access the
1738     <literal>phone_data</> table. But because of the <command>GRANT</>,
1739     the secretary can run a <command>SELECT</command> on the
1740     <literal>phone_number</> view. The rule system will rewrite the
1741     <command>SELECT</command> from <literal>phone_number</> into a
1742     <command>SELECT</command> from <literal>phone_data</> and add the
1743     qualification that only entries where <literal>private</> is false
1744     are wanted. Since the user is the owner of
1745     <literal>phone_number</> and therefore the owner of the rule, the
1746     read access to <literal>phone_data</> is now checked against his
1747     privileges and the query is permitted. The check for accessing
1748     <literal>phone_number</> is also performed, but this is done
1749     against the invoking user, so nobody but the user and the
1750     secretary can use it.
1751 </para>
1752
1753 <para>
1754     The privileges are checked rule by rule. So the secretary is for now the
1755     only one who can see the public phone numbers. But the secretary can setup
1756     another view and grant access to that to the public. Then, anyone
1757     can see the <literal>phone_number</> data through the secretary's view.
1758     What the secretary cannot do is to create a view that directly
1759     accesses <literal>phone_data</>.  (Actually he can, but it will not work since
1760     every access will be denied during the permission checks.)
1761     And as soon as the user will notice, that the secretary opened
1762     his <literal>phone_number</> view, he can revoke his access. Immediately, any
1763     access to the secretary's view would fail.
1764 </para>
1765
1766 <para>
1767     One might think that this rule-by-rule checking is a security
1768     hole, but in fact it isn't.   But if it did not work this way, the secretary
1769     could set up a table with the same columns as <literal>phone_number</> and
1770     copy the data to there once per day. Then it's his own data and
1771     he can grant access to everyone he wants. A
1772     <command>GRANT</command> command means, <quote>I trust you</quote>.
1773     If someone you trust does the thing above, it's time to
1774     think it over and then use <command>REVOKE</command>.
1775 </para>
1776
1777 <para>
1778     This mechanism also works for update rules. In the examples of
1779     the previous section, the owner of the tables in the example
1780     database could grant the privileges <literal>SELECT</>,
1781     <literal>INSERT</>, <literal>UPDATE</>, and <literal>DELETE</> on
1782     the <literal>shoelace</> view to someone else, but only
1783     <literal>SELECT</> on <literal>shoelace_log</>. The rule action to
1784     write log entries will still be executed successfully, and that
1785     other user could see the log entries.  But he cannot create fake
1786     entries, nor could he manipulate or remove existing ones.
1787 </para>
1788 </sect1>
1789
1790 <sect1 id="rules-status">
1791 <title>Rules and Command Status</title>
1792
1793 <para>
1794     The <productname>PostgreSQL</productname> server returns a command
1795     status string, such as <literal>INSERT 149592 1</>, for each
1796     command it receives.  This is simple enough when there are no rules
1797     involved, but what happens when the query is rewritten by rules?
1798 </para>
1799
1800 <para>
1801     Rules affect the command status as follows:
1802
1803     <itemizedlist>
1804      <listitem>
1805       <para>
1806        If there is no unconditional <literal>INSTEAD</> rule for the query, then
1807        the originally given query will be executed, and its command
1808        status will be returned as usual.  (But note that if there were
1809        any conditional <literal>INSTEAD</> rules, the negation of their qualifications
1810        will have been added to the original query.  This might reduce the
1811        number of rows it processes, and if so the reported status will
1812        be affected.)
1813       </para>
1814      </listitem>
1815
1816      <listitem>
1817       <para>
1818        If there is any unconditional <literal>INSTEAD</> rule for the query, then
1819        the original query will not be executed at all.  In this case,
1820        the server will return the command status for the last query
1821        that was inserted by an <literal>INSTEAD</> rule (conditional or
1822        unconditional) and is of the same command type
1823        (<command>INSERT</command>, <command>UPDATE</command>, or
1824        <command>DELETE</command>) as the original query.  If no query
1825        meeting those requirements is added by any rule, then the
1826        returned command status shows the original query type and
1827        zeroes for the row-count and OID fields.
1828       </para>
1829      </listitem>
1830     </itemizedlist>
1831
1832     (This system was established in <productname>PostgreSQL</> 7.3.
1833     In versions before that, the command status might show different
1834     results when rules exist.)
1835 </para>
1836
1837 <para>
1838     The programmer can ensure that any desired <literal>INSTEAD</> rule is the one
1839     that sets the command status in the second case, by giving it the
1840     alphabetically last rule name among the active rules, so that it
1841     gets applied last.
1842 </para>
1843 </sect1>
1844
1845 <sect1 id="rules-triggers">
1846 <title>Rules versus Triggers</title>
1847
1848 <indexterm zone="rules-triggers">
1849  <primary>rule</primary>
1850  <secondary sortas="Trigger">compared with triggers</secondary>
1851 </indexterm>
1852
1853 <indexterm zone="rules-triggers">
1854  <primary>trigger</primary>
1855  <secondary sortas="Regeln">compared with rules</secondary>
1856 </indexterm>
1857
1858 <para>
1859     Many things that can be done using triggers can also be
1860     implemented using the <productname>PostgreSQL</productname>
1861     rule system.  One of the things that cannot be implemented by
1862     rules are some kinds of constraints, especially foreign keys. It is possible
1863     to place a qualified rule that rewrites a command to <literal>NOTHING</>
1864     if the value of a column does not appear in another table.
1865     But then the data is silently thrown away and that's
1866     not a good idea. If checks for valid values are required,
1867     and in the case of an invalid value an error message should
1868     be generated, it must be done by a trigger.
1869 </para>
1870
1871 <para>
1872     On the other hand, a trigger cannot be created on views because
1873     there is no real data in a view relation;  however INSERT, UPDATE,
1874     and DELETE rules can be created on views.
1875 </para>
1876
1877 <para>
1878     For the things that can be implemented by both, which is best
1879     depends on the usage of the database.
1880     A trigger is fired for any affected row once. A rule manipulates
1881     the query or generates an additional query. So if many
1882     rows are affected in one statement, a rule issuing one extra
1883     command is likely to be faster than a trigger that is
1884     called for every single row and must execute its operations
1885     many times.  However, the trigger approach is conceptually far
1886     simpler than the rule approach, and is easier for novices to get right.
1887 </para>
1888
1889 <para>
1890     Here we show an example of how the choice of rules versus triggers
1891     plays out in one situation.  There are two tables:
1892
1893 <programlisting>
1894 CREATE TABLE computer (
1895     hostname        text,    -- indexed
1896     manufacturer    text     -- indexed
1897 );
1898
1899 CREATE TABLE software (
1900     software        text,    -- indexed
1901     hostname        text     -- indexed
1902 );
1903 </programlisting>
1904
1905     Both tables have many thousands of rows and the indexes on
1906     <structfield>hostname</> are unique.  The rule or trigger should
1907     implement a constraint that deletes rows from <literal>software</>
1908     that reference a deleted computer.  The trigger would use this command:
1909
1910 <programlisting>
1911 DELETE FROM software WHERE hostname = $1;
1912 </programlisting>
1913
1914     Since the trigger is called for each individual row deleted from
1915     <literal>computer</>, it can prepare and save the plan for this
1916     command and pass the <structfield>hostname</> value in the
1917     parameter.  The rule would be written as:
1918
1919 <programlisting>
1920 CREATE RULE computer_del AS ON DELETE TO computer
1921     DO DELETE FROM software WHERE hostname = OLD.hostname;
1922 </programlisting>
1923    </para>
1924
1925    <para>
1926     Now we look at different types of deletes. In the case of a:
1927
1928 <programlisting>
1929 DELETE FROM computer WHERE hostname = 'mypc.local.net';
1930 </programlisting>
1931
1932     the table <literal>computer</> is scanned by index (fast), and the
1933     command issued by the trigger would also use an index scan (also fast).
1934     The extra command from the rule would be:
1935
1936 <programlisting>
1937 DELETE FROM software WHERE computer.hostname = 'mypc.local.net'
1938                        AND software.hostname = computer.hostname;
1939 </programlisting>
1940
1941     Since there are appropriate indexes setup, the planner
1942     will create a plan of
1943
1944 <literallayout class="monospaced">
1945 Nestloop
1946   -&gt;  Index Scan using comp_hostidx on computer
1947   -&gt;  Index Scan using soft_hostidx on software
1948 </literallayout>
1949
1950     So there would be not that much difference in speed between
1951     the trigger and the rule implementation.
1952    </para>
1953
1954    <para>
1955     With the next delete we want to get rid of all the 2000 computers
1956     where the <structfield>hostname</> starts with
1957     <literal>old</>. There are two possible commands to do that. One
1958     is:
1959
1960 <programlisting>
1961 DELETE FROM computer WHERE hostname &gt;= 'old'
1962                        AND hostname &lt;  'ole'
1963 </programlisting>
1964
1965     The command added by the rule will be:
1966
1967 <programlisting>
1968 DELETE FROM software WHERE computer.hostname &gt;= 'old' AND computer.hostname &lt; 'ole'
1969                        AND software.hostname = computer.hostname;
1970 </programlisting>
1971
1972     with the plan
1973
1974 <literallayout class="monospaced">
1975 Hash Join
1976   -&gt;  Seq Scan on software
1977   -&gt;  Hash
1978     -&gt;  Index Scan using comp_hostidx on computer
1979 </literallayout>
1980
1981     The other possible command is:
1982
1983 <programlisting>
1984 DELETE FROM computer WHERE hostname ~ '^old';
1985 </programlisting>
1986
1987     which results in the following executing plan for the command
1988     added by the rule:
1989
1990 <literallayout class="monospaced">
1991 Nestloop
1992   -&gt;  Index Scan using comp_hostidx on computer
1993   -&gt;  Index Scan using soft_hostidx on software
1994 </literallayout>
1995
1996     This shows, that the planner does not realize that the
1997     qualification for <structfield>hostname</> in
1998     <literal>computer</> could also be used for an index scan on
1999     <literal>software</> when there are multiple qualification
2000     expressions combined with <literal>AND</>, which is what it does
2001     in the regular-expression version of the command. The trigger will
2002     get invoked once for each of the 2000 old computers that have to be
2003     deleted, and that will result in one index scan over
2004     <literal>computer</> and 2000 index scans over
2005     <literal>software</>. The rule implementation will do it with two
2006     commands that use indexes.  And it depends on the overall size of
2007     the table <literal>software</> whether the rule will still be faster in the
2008     sequential scan situation. 2000 command executions from the trigger over the SPI
2009     manager take some time, even if all the index blocks will soon be in the cache.
2010 </para>
2011
2012 <para>
2013     The last command we look at is:
2014
2015 <programlisting>
2016 DELETE FROM computer WHERE manufacturer = 'bim';
2017 </programlisting>
2018
2019     Again this could result in many rows to be deleted from
2020     <literal>computer</>. So the trigger will again run many commands
2021     through the executor.  The command generated by the rule will be:
2022
2023 <programlisting>
2024 DELETE FROM software WHERE computer.manufacturer = 'bim'
2025                        AND software.hostname = computer.hostname;
2026 </programlisting>
2027
2028     The plan for that command will again be the nested loop over two
2029     index scans, only using a different index on <literal>computer</>:
2030
2031 <programlisting>
2032 Nestloop
2033   -&gt;  Index Scan using comp_manufidx on computer
2034   -&gt;  Index Scan using soft_hostidx on software
2035 </programlisting>
2036
2037     In any of these cases, the extra commands from the rule system
2038     will be more or less independent from the number of affected rows
2039     in a command.
2040 </para>
2041
2042 <![IGNORE[
2043 <!-- What's happening with this?  If it doesn't come back, remove this section. -->
2044 <para>
2045     Another situation is cases on <command>UPDATE</command> where it depends on the
2046     change of an attribute if an action should be performed or
2047     not. The only way to
2048     create a rule as in the shoelace_log example is to do it with
2049     a rule qualification. That results in an extra query that is
2050     performed always, even if the attribute of interest cannot
2051     change at all because it does not appear in the target list
2052     of the initial query. When this is enabled again, it will be
2053     one more advantage of rules over triggers. Optimization of
2054     a trigger must fail by definition in this case, because the
2055     fact that its actions will only be done when a specific attribute
2056     is updated is hidden in its functionality. The definition of
2057     a trigger only allows to specify it on row level, so whenever a
2058     row is touched, the trigger must be called to make its
2059     decision. The rule system will know it by looking up the
2060     target list and will suppress the additional query completely
2061     if the attribute isn't touched. So the rule, qualified or not,
2062     will only do its scans if there ever could be something to do.
2063 </para>
2064 ]]>
2065
2066 <para>
2067     The summary is, rules will only be significantly slower than
2068     triggers if their actions result in large and badly qualified
2069     joins, a situation where the planner fails.
2070 </para>
2071 </sect1>
2072
2073 </chapter>