4 <title>The Rule System
</title>
6 <indexterm zone=
"rules">
7 <primary>rule
</primary>
11 This chapter discusses the rule system in
12 <productname>PostgreSQL
</productname>. Production rule systems
13 are conceptually simple, but there are many subtle points
14 involved in actually using them.
18 Some other database systems define active database rules, which
19 are usually stored procedures and triggers. In
20 <productname>PostgreSQL
</productname>, these can be implemented
21 using functions and triggers as well.
25 The rule system (more precisely speaking, the query rewrite rule
26 system) is totally different from stored procedures and triggers.
27 It modifies queries to take rules into consideration, and then
28 passes the modified query to the query planner for planning and
29 execution. It is very powerful, and can be used for many things
30 such as query language procedures, views, and versions. The
31 theoretical foundations and the power of this rule system are
32 also discussed in
<xref linkend=
"STON90b"> and
<xref
36 <sect1 id=
"querytree">
37 <title>The Query Tree
</title>
39 <indexterm zone=
"querytree">
40 <primary>query tree
</primary>
44 To understand how the rule system works it is necessary to know
45 when it is invoked and what its input and results are.
49 The rule system is located between the parser and the planner.
50 It takes the output of the parser, one query tree, and the user-defined
51 rewrite rules, which are also
52 query trees with some extra information, and creates zero or more
53 query trees as result. So its input and output are always things
54 the parser itself could have produced and thus, anything it sees
55 is basically representable as an
<acronym>SQL
</acronym> statement.
59 Now what is a query tree? It is an internal representation of an
60 <acronym>SQL
</acronym> statement where the single parts that it is
61 built from are stored separately. These query trees can be shown
62 in the server log if you set the configuration parameters
63 <varname>debug_print_parse
</varname>,
64 <varname>debug_print_rewritten
</varname>, or
65 <varname>debug_print_plan
</varname>. The rule actions are also
66 stored as query trees, in the system catalog
67 <structname>pg_rewrite
</structname>. They are not formatted like
68 the log output, but they contain exactly the same information.
72 Reading a raw query tree requires some experience. But since
73 <acronym>SQL
</acronym> representations of query trees are
74 sufficient to understand the rule system, this chapter will not
75 teach how to read them.
79 When reading the
<acronym>SQL
</acronym> representations of the
80 query trees in this chapter it is necessary to be able to identify
81 the parts the statement is broken into when it is in the query tree
82 structure. The parts of a query tree are
91 This is a simple value telling which command
92 (
<command>SELECT
</command>,
<command>INSERT
</command>,
93 <command>UPDATE
</command>,
<command>DELETE
</command>) produced
103 <indexterm><primary>range table<
/><
/>
106 The range table is a list of relations that are used in the query.
107 In a
<command>SELECT
</command> statement these are the relations given after
108 the
<literal>FROM
</literal> key word.
112 Every range table entry identifies a table or view and tells
113 by which name it is called in the other parts of the query.
114 In the query tree, the range table entries are referenced by
115 number rather than by name, so here it doesn't matter if there
116 are duplicate names as it would in an
<acronym>SQL
</acronym>
117 statement. This can happen after the range tables of rules
118 have been merged in. The examples in this chapter will not have
130 This is an index into the range table that identifies the
131 relation where the results of the query go.
135 <command>SELECT
</command> queries normally don't have a result
136 relation. The special case of a
<command>SELECT INTO
</command> is
137 mostly identical to a
<command>CREATE TABLE
</command> followed by a
138 <literal>INSERT ... SELECT
</literal> and is not discussed
143 For
<command>INSERT
</command>,
<command>UPDATE
</command>, and
144 <command>DELETE
</command> commands, the result relation is the table
145 (or view!) where the changes are to take effect.
154 <indexterm><primary>target list<
/><
/>
157 The target list is a list of expressions that define the
158 result of the query. In the case of a
159 <command>SELECT
</command>, these expressions are the ones that
160 build the final output of the query. They correspond to the
161 expressions between the key words
<command>SELECT
</command>
162 and
<command>FROM
</command>. (
<literal>*
</literal> is just an
163 abbreviation for all the column names of a relation. It is
164 expanded by the parser into the individual columns, so the
165 rule system never sees it.)
169 <command>DELETE
</command> commands don't need a target list
170 because they don't produce any result. In fact, the planner will
171 add a special
<acronym>CTID<
/> entry to the empty target list, but
172 this is after the rule system and will be discussed later; for the
173 rule system, the target list is empty.
177 For
<command>INSERT
</command> commands, the target list describes
178 the new rows that should go into the result relation. It consists of the
179 expressions in the
<literal>VALUES<
/> clause or the ones from the
180 <command>SELECT
</command> clause in
<literal>INSERT
181 ... SELECT
</literal>. The first step of the rewrite process adds
182 target list entries for any columns that were not assigned to by
183 the original command but have defaults. Any remaining columns (with
184 neither a given value nor a default) will be filled in by the
185 planner with a constant null expression.
189 For
<command>UPDATE
</command> commands, the target list
190 describes the new rows that should replace the old ones. In the
191 rule system, it contains just the expressions from the
<literal>SET
192 column = expression
</literal> part of the command. The planner will handle
193 missing columns by inserting expressions that copy the values from
194 the old row into the new one. And it will add the special
195 <acronym>CTID<
/> entry just as for
<command>DELETE
</command>, too.
199 Every entry in the target list contains an expression that can
200 be a constant value, a variable pointing to a column of one
201 of the relations in the range table, a parameter, or an expression
202 tree made of function calls, constants, variables, operators, etc.
213 The query's qualification is an expression much like one of
214 those contained in the target list entries. The result value of
215 this expression is a Boolean that tells whether the operation
216 (
<command>INSERT
</command>,
<command>UPDATE
</command>,
217 <command>DELETE
</command>, or
<command>SELECT
</command>) for the
218 final result row should be executed or not. It corresponds to the
<literal>WHERE<
/> clause
219 of an
<acronym>SQL
</acronym> statement.
230 The query's join tree shows the structure of the
<literal>FROM<
/> clause.
231 For a simple query like
<literal>SELECT ... FROM a, b, c
</literal>, the join tree is just
232 a list of the
<literal>FROM<
/> items, because we are allowed to join them in
233 any order. But when
<literal>JOIN<
/> expressions, particularly outer joins,
234 are used, we have to join in the order shown by the joins.
235 In that case, the join tree shows the structure of the
<literal>JOIN<
/> expressions. The
236 restrictions associated with particular
<literal>JOIN<
/> clauses (from
<literal>ON<
/> or
237 <literal>USING<
/> expressions) are stored as qualification expressions attached
238 to those join-tree nodes. It turns out to be convenient to store
239 the top-level
<literal>WHERE<
/> expression as a qualification attached to the
240 top-level join-tree item, too. So really the join tree represents
241 both the
<literal>FROM<
/> and
<literal>WHERE<
/> clauses of a
<command>SELECT
</command>.
252 The other parts of the query tree like the
<literal>ORDER BY<
/>
253 clause aren't of interest here. The rule system
254 substitutes some entries there while applying rules, but that
255 doesn't have much to do with the fundamentals of the rule
265 <sect1 id=
"rules-views">
266 <title>Views and the Rule System
</title>
268 <indexterm zone=
"rules-views">
269 <primary>rule
</primary>
270 <secondary>and views
</secondary>
273 <indexterm zone=
"rules-views">
275 <secondary>implementation through rules<
/>
279 Views in
<productname>PostgreSQL
</productname> are implemented
280 using the rule system. In fact, there is essentially no difference
284 CREATE VIEW myview AS SELECT * FROM mytab;
287 compared against the two commands:
290 CREATE TABLE myview (
<replaceable>same column list as mytab
</replaceable>);
291 CREATE RULE
"_RETURN" AS ON SELECT TO myview DO INSTEAD
295 because this is exactly what the
<command>CREATE VIEW
</command>
296 command does internally. This has some side effects. One of them
297 is that the information about a view in the
298 <productname>PostgreSQL
</productname> system catalogs is exactly
299 the same as it is for a table. So for the parser, there is
300 absolutely no difference between a table and a view. They are the
301 same thing: relations.
304 <sect2 id=
"rules-select">
305 <title>How
<command>SELECT
</command> Rules Work
</title>
307 <indexterm zone=
"rules-select">
308 <primary>rule
</primary>
309 <secondary sortas=
"SELECT">for SELECT
</secondary>
313 Rules
<literal>ON SELECT<
/> are applied to all queries as the last step, even
314 if the command given is an
<command>INSERT
</command>,
315 <command>UPDATE
</command> or
<command>DELETE
</command>. And they
316 have different semantics from rules on the other command types in that they modify the
317 query tree in place instead of creating a new one. So
318 <command>SELECT
</command> rules are described first.
322 Currently, there can be only one action in an
<literal>ON SELECT<
/> rule, and it must
323 be an unconditional
<command>SELECT<
/> action that is
<literal>INSTEAD<
/>. This restriction was
324 required to make rules safe enough to open them for ordinary users, and
325 it restricts
<literal>ON SELECT<
/> rules to act like views.
329 The examples for this chapter are two join views that do some
330 calculations and some more views using them in turn. One of the
331 two first views is customized later by adding rules for
332 <command>INSERT
</command>,
<command>UPDATE
</command>, and
333 <command>DELETE
</command> operations so that the final result will
334 be a view that behaves like a real table with some magic
335 functionality. This is not such a simple example to start from and
336 this makes things harder to get into. But it's better to have one
337 example that covers all the points discussed step by step rather
338 than having many different ones that might mix up in mind.
342 For the example, we need a little
<literal>min
</literal> function that
343 returns the lower of
2 integer values. We create that as:
346 CREATE FUNCTION min(integer, integer) RETURNS integer AS $$
347 SELECT CASE WHEN $
1 < $
2 THEN $
1 ELSE $
2 END
348 $$ LANGUAGE SQL STRICT;
353 The real tables we need in the first two rule system descriptions
357 CREATE TABLE shoe_data (
358 shoename text, -- primary key
359 sh_avail integer, -- available number of pairs
360 slcolor text, -- preferred shoelace color
361 slminlen real, -- minimum shoelace length
362 slmaxlen real, -- maximum shoelace length
363 slunit text -- length unit
366 CREATE TABLE shoelace_data (
367 sl_name text, -- primary key
368 sl_avail integer, -- available number of pairs
369 sl_color text, -- shoelace color
370 sl_len real, -- shoelace length
371 sl_unit text -- length unit
375 un_name text, -- primary key
376 un_fact real -- factor to transform to cm
380 As you can see, they represent shoe-store data.
384 The views are created as:
392 sh.slminlen * un.un_fact AS slminlen_cm,
394 sh.slmaxlen * un.un_fact AS slmaxlen_cm,
396 FROM shoe_data sh, unit un
397 WHERE sh.slunit = un.un_name;
399 CREATE VIEW shoelace AS
405 s.sl_len * u.un_fact AS sl_len_cm
406 FROM shoelace_data s, unit u
407 WHERE s.sl_unit = u.un_name;
409 CREATE VIEW shoe_ready AS
414 min(rsh.sh_avail, rsl.sl_avail) AS total_avail
415 FROM shoe rsh, shoelace rsl
416 WHERE rsl.sl_color = rsh.slcolor
417 AND rsl.sl_len_cm
>= rsh.slminlen_cm
418 AND rsl.sl_len_cm
<= rsh.slmaxlen_cm;
421 The
<command>CREATE VIEW
</command> command for the
422 <literal>shoelace
</literal> view (which is the simplest one we
423 have) will create a relation
<literal>shoelace<
/> and an entry in
424 <structname>pg_rewrite
</structname> that tells that there is a
425 rewrite rule that must be applied whenever the relation
<literal>shoelace<
/>
426 is referenced in a query's range table. The rule has no rule
427 qualification (discussed later, with the non-
<command>SELECT<
/> rules, since
428 <command>SELECT<
/> rules currently cannot have them) and it is
<literal>INSTEAD<
/>. Note
429 that rule qualifications are not the same as query qualifications.
430 The action of our rule has a query qualification.
431 The action of the rule is one query tree that is a copy of the
432 <command>SELECT
</command> statement in the view creation command.
438 table entries for
<literal>NEW<
/> and
<literal>OLD<
/> (named
<literal>*NEW*<
/> and
<literal>*OLD*<
/> for
439 historical reasons in the printed query tree) you can see in
440 the
<structname>pg_rewrite
</structname> entry aren't of interest
441 for
<command>SELECT
</command> rules.
446 Now we populate
<literal>unit
</literal>,
<literal>shoe_data
</literal>
447 and
<literal>shoelace_data
</literal> and run a simple query on a view:
450 INSERT INTO unit VALUES ('cm',
1.0);
451 INSERT INTO unit VALUES ('m',
100.0);
452 INSERT INTO unit VALUES ('inch',
2.54);
454 INSERT INTO shoe_data VALUES ('sh1',
2, 'black',
70.0,
90.0, 'cm');
455 INSERT INTO shoe_data VALUES ('sh2',
0, 'black',
30.0,
40.0, 'inch');
456 INSERT INTO shoe_data VALUES ('sh3',
4, 'brown',
50.0,
65.0, 'cm');
457 INSERT INTO shoe_data VALUES ('sh4',
3, 'brown',
40.0,
50.0, 'inch');
459 INSERT INTO shoelace_data VALUES ('sl1',
5, 'black',
80.0, 'cm');
460 INSERT INTO shoelace_data VALUES ('sl2',
6, 'black',
100.0, 'cm');
461 INSERT INTO shoelace_data VALUES ('sl3',
0, 'black',
35.0 , 'inch');
462 INSERT INTO shoelace_data VALUES ('sl4',
8, 'black',
40.0 , 'inch');
463 INSERT INTO shoelace_data VALUES ('sl5',
4, 'brown',
1.0 , 'm');
464 INSERT INTO shoelace_data VALUES ('sl6',
0, 'brown',
0.9 , 'm');
465 INSERT INTO shoelace_data VALUES ('sl7',
7, 'brown',
60 , 'cm');
466 INSERT INTO shoelace_data VALUES ('sl8',
1, 'brown',
40 , 'inch');
468 SELECT * FROM shoelace;
470 sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
471 -----------+----------+----------+--------+---------+-----------
472 sl1 |
5 | black |
80 | cm |
80
473 sl2 |
6 | black |
100 | cm |
100
474 sl7 |
7 | brown |
60 | cm |
60
475 sl3 |
0 | black |
35 | inch |
88.9
476 sl4 |
8 | black |
40 | inch |
101.6
477 sl8 |
1 | brown |
40 | inch |
101.6
478 sl5 |
4 | brown |
1 | m |
100
479 sl6 |
0 | brown |
0.9 | m |
90
485 This is the simplest
<command>SELECT
</command> you can do on our
486 views, so we take this opportunity to explain the basics of view
487 rules. The
<literal>SELECT * FROM shoelace
</literal> was
488 interpreted by the parser and produced the query tree:
491 SELECT shoelace.sl_name, shoelace.sl_avail,
492 shoelace.sl_color, shoelace.sl_len,
493 shoelace.sl_unit, shoelace.sl_len_cm
494 FROM shoelace shoelace;
497 and this is given to the rule system. The rule system walks through the
498 range table and checks if there are rules
499 for any relation. When processing the range table entry for
500 <literal>shoelace
</literal> (the only one up to now) it finds the
501 <literal>_RETURN
</literal> rule with the query tree:
504 SELECT s.sl_name, s.sl_avail,
505 s.sl_color, s.sl_len, s.sl_unit,
506 s.sl_len * u.un_fact AS sl_len_cm
507 FROM shoelace *OLD*, shoelace *NEW*,
508 shoelace_data s, unit u
509 WHERE s.sl_unit = u.un_name;
514 To expand the view, the rewriter simply creates a subquery range-table
515 entry containing the rule's action query tree, and substitutes this
516 range table entry for the original one that referenced the view. The
517 resulting rewritten query tree is almost the same as if you had typed:
520 SELECT shoelace.sl_name, shoelace.sl_avail,
521 shoelace.sl_color, shoelace.sl_len,
522 shoelace.sl_unit, shoelace.sl_len_cm
523 FROM (SELECT s.sl_name,
528 s.sl_len * u.un_fact AS sl_len_cm
529 FROM shoelace_data s, unit u
530 WHERE s.sl_unit = u.un_name) shoelace;
533 There is one difference however: the subquery's range table has two
534 extra entries
<literal>shoelace *OLD*<
/> and
<literal>shoelace *NEW*<
/>. These entries don't
535 participate directly in the query, since they aren't referenced by
536 the subquery's join tree or target list. The rewriter uses them
537 to store the access privilege check information that was originally present
538 in the range-table entry that referenced the view. In this way, the
539 executor will still check that the user has proper privileges to access
540 the view, even though there's no direct use of the view in the rewritten
545 That was the first rule applied. The rule system will continue checking
546 the remaining range-table entries in the top query (in this example there
547 are no more), and it will recursively check the range-table entries in
548 the added subquery to see if any of them reference views. (But it
549 won't expand
<literal>*OLD*<
/> or
<literal>*NEW*<
/> — otherwise we'd have infinite recursion!)
550 In this example, there are no rewrite rules for
<literal>shoelace_data<
/> or
<literal>unit<
/>,
551 so rewriting is complete and the above is the final result given to
556 Now we want to write a query that finds out for which shoes currently in the store
557 we have the matching shoelaces (color and length) and where the
558 total number of exactly matching pairs is greater or equal to two.
561 SELECT * FROM shoe_ready WHERE total_avail
>=
2;
563 shoename | sh_avail | sl_name | sl_avail | total_avail
564 ----------+----------+---------+----------+-------------
565 sh1 |
2 | sl1 |
5 |
2
566 sh3 |
4 | sl7 |
7 |
4
572 The output of the parser this time is the query tree:
575 SELECT shoe_ready.shoename, shoe_ready.sh_avail,
576 shoe_ready.sl_name, shoe_ready.sl_avail,
577 shoe_ready.total_avail
578 FROM shoe_ready shoe_ready
579 WHERE shoe_ready.total_avail
>=
2;
582 The first rule applied will be the one for the
583 <literal>shoe_ready
</literal> view and it results in the
587 SELECT shoe_ready.shoename, shoe_ready.sh_avail,
588 shoe_ready.sl_name, shoe_ready.sl_avail,
589 shoe_ready.total_avail
590 FROM (SELECT rsh.shoename,
594 min(rsh.sh_avail, rsl.sl_avail) AS total_avail
595 FROM shoe rsh, shoelace rsl
596 WHERE rsl.sl_color = rsh.slcolor
597 AND rsl.sl_len_cm
>= rsh.slminlen_cm
598 AND rsl.sl_len_cm
<= rsh.slmaxlen_cm) shoe_ready
599 WHERE shoe_ready.total_avail
>=
2;
602 Similarly, the rules for
<literal>shoe
</literal> and
603 <literal>shoelace
</literal> are substituted into the range table of
604 the subquery, leading to a three-level final query tree:
607 SELECT shoe_ready.shoename, shoe_ready.sh_avail,
608 shoe_ready.sl_name, shoe_ready.sl_avail,
609 shoe_ready.total_avail
610 FROM (SELECT rsh.shoename,
614 min(rsh.sh_avail, rsl.sl_avail) AS total_avail
615 FROM (SELECT sh.shoename,
619 sh.slminlen * un.un_fact AS slminlen_cm,
621 sh.slmaxlen * un.un_fact AS slmaxlen_cm,
623 FROM shoe_data sh, unit un
624 WHERE sh.slunit = un.un_name) rsh,
630 s.sl_len * u.un_fact AS sl_len_cm
631 FROM shoelace_data s, unit u
632 WHERE s.sl_unit = u.un_name) rsl
633 WHERE rsl.sl_color = rsh.slcolor
634 AND rsl.sl_len_cm
>= rsh.slminlen_cm
635 AND rsl.sl_len_cm
<= rsh.slmaxlen_cm) shoe_ready
636 WHERE shoe_ready.total_avail
> 2;
641 It turns out that the planner will collapse this tree into a
642 two-level query tree: the bottommost
<command>SELECT
</command>
643 commands will be
<quote>pulled up
</quote> into the middle
644 <command>SELECT
</command> since there's no need to process them
645 separately. But the middle
<command>SELECT
</command> will remain
646 separate from the top, because it contains aggregate functions.
647 If we pulled those up it would change the behavior of the topmost
648 <command>SELECT
</command>, which we don't want. However,
649 collapsing the query tree is an optimization that the rewrite
650 system doesn't have to concern itself with.
655 <title>View Rules in Non-
<command>SELECT
</command> Statements
</title>
658 Two details of the query tree aren't touched in the description of
659 view rules above. These are the command type and the result relation.
660 In fact, view rules don't need this information.
664 There are only a few differences between a query tree for a
665 <command>SELECT
</command> and one for any other
666 command. Obviously, they have a different command type and for a
667 command other than a
<command>SELECT
</command>, the result
668 relation points to the range-table entry where the result should
669 go. Everything else is absolutely the same. So having two tables
670 <literal>t1<
/> and
<literal>t2<
/> with columns
<literal>a<
/> and
671 <literal>b<
/>, the query trees for the two statements:
674 SELECT t2.b FROM t1, t2 WHERE t1.a = t2.a;
676 UPDATE t1 SET b = t2.b FROM t2 WHERE t1.a = t2.a;
679 are nearly identical. In particular:
684 The range tables contain entries for the tables
<literal>t1<
/> and
<literal>t2<
/>.
690 The target lists contain one variable that points to column
691 <literal>b<
/> of the range table entry for table
<literal>t2<
/>.
697 The qualification expressions compare the columns
<literal>a<
/> of both
698 range-table entries for equality.
704 The join trees show a simple join between
<literal>t1<
/> and
<literal>t2<
/>.
711 The consequence is, that both query trees result in similar
712 execution plans: They are both joins over the two tables. For the
713 <command>UPDATE
</command> the missing columns from
<literal>t1<
/> are added to
714 the target list by the planner and the final query tree will read
718 UPDATE t1 SET a = t1.a, b = t2.b FROM t2 WHERE t1.a = t2.a;
721 and thus the executor run over the join will produce exactly the
722 same result set as a:
725 SELECT t1.a, t2.b FROM t1, t2 WHERE t1.a = t2.a;
728 will do. But there is a little problem in
729 <command>UPDATE
</command>: The executor does not care what the
730 results from the join it is doing are meant for. It just produces
731 a result set of rows. The difference that one is a
732 <command>SELECT
</command> command and the other is an
733 <command>UPDATE
</command> is handled in the caller of the
734 executor. The caller still knows (looking at the query tree) that
735 this is an
<command>UPDATE
</command>, and it knows that this
736 result should go into table
<literal>t1<
/>. But which of the rows that are
737 there has to be replaced by the new row?
741 To resolve this problem, another entry is added to the target list
742 in
<command>UPDATE
</command> (and also in
743 <command>DELETE
</command>) statements: the current tuple ID
744 (
<acronym>CTID<
/>).
<indexterm><primary>CTID<
/><
/>
745 This is a system column containing the
746 file block number and position in the block for the row. Knowing
747 the table, the
<acronym>CTID<
/> can be used to retrieve the
748 original row of
<literal>t1<
/> to be updated. After adding the
749 <acronym>CTID<
/> to the target list, the query actually looks like:
752 SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a;
755 Now another detail of
<productname>PostgreSQL
</productname> enters
756 the stage. Old table rows aren't overwritten, and this
757 is why
<command>ROLLBACK
</command> is fast. In an
<command>UPDATE
</command>,
758 the new result row is inserted into the table (after stripping the
759 <acronym>CTID<
/>) and in the row header of the old row, which the
760 <acronym>CTID<
/> pointed to, the
<literal>cmax<
/> and
761 <literal>xmax<
/> entries are set to the current command counter
762 and current transaction ID. Thus the old row is hidden, and after
763 the transaction commits the vacuum cleaner can really remove it.
767 Knowing all that, we can simply apply view rules in absolutely
768 the same way to any command. There is no difference.
773 <title>The Power of Views in
<productname>PostgreSQL
</productname></title>
776 The above demonstrates how the rule system incorporates view
777 definitions into the original query tree. In the second example, a
778 simple
<command>SELECT
</command> from one view created a final
779 query tree that is a join of
4 tables (
<literal>unit<
/> was used twice with
784 The benefit of implementing views with the rule system is,
785 that the planner has all
786 the information about which tables have to be scanned plus the
787 relationships between these tables plus the restrictive
788 qualifications from the views plus the qualifications from
790 in one single query tree. And this is still the situation
791 when the original query is already a join over views.
792 The planner has to decide which is
793 the best path to execute the query, and the more information
794 the planner has, the better this decision can be. And
795 the rule system as implemented in
<productname>PostgreSQL
</productname>
796 ensures, that this is all information available about the query
801 <sect2 id=
"rules-views-update">
802 <title>Updating a View
</title>
805 What happens if a view is named as the target relation for an
806 <command>INSERT
</command>,
<command>UPDATE
</command>, or
807 <command>DELETE
</command>? After doing the substitutions
808 described above, we will have a query tree in which the result
809 relation points at a subquery range-table entry. This will not
810 work, so the rewriter throws an error if it sees it has produced
815 To change this, we can define rules that modify the behavior of
816 these kinds of commands. This is the topic of the next section.
822 <sect1 id=
"rules-update">
823 <title>Rules on
<command>INSERT<
/>,
<command>UPDATE<
/>, and
<command>DELETE<
/></title>
825 <indexterm zone=
"rules-update">
826 <primary>rule
</primary>
827 <secondary sortas=
"INSERT">for INSERT
</secondary>
830 <indexterm zone=
"rules-update">
831 <primary>rule
</primary>
832 <secondary sortas=
"UPDATE">for UPDATE
</secondary>
835 <indexterm zone=
"rules-update">
836 <primary>rule
</primary>
837 <secondary sortas=
"DELETE">for DELETE
</secondary>
841 Rules that are defined on
<command>INSERT<
/>,
<command>UPDATE<
/>,
842 and
<command>DELETE<
/> are significantly different from the view rules
843 described in the previous section. First, their
<command>CREATE
844 RULE
</command> command allows more:
849 They are allowed to have no action.
855 They can have multiple actions.
861 They can be
<literal>INSTEAD<
/> or
<literal>ALSO<
/> (the default).
867 The pseudorelations
<literal>NEW<
/> and
<literal>OLD<
/> become useful.
873 They can have rule qualifications.
878 Second, they don't modify the query tree in place. Instead they
879 create zero or more new query trees and can throw away the
884 <title>How Update Rules Work
</title>
890 CREATE [ OR REPLACE ] RULE
<replaceable class=
"parameter">name
</replaceable> AS ON
<replaceable class=
"parameter">event
</replaceable>
891 TO
<replaceable class=
"parameter">table
</replaceable> [ WHERE
<replaceable class=
"parameter">condition
</replaceable> ]
892 DO [ ALSO | INSTEAD ] { NOTHING |
<replaceable class=
"parameter">command
</replaceable> | (
<replaceable class=
"parameter">command
</replaceable> ;
<replaceable class=
"parameter">command
</replaceable> ... ) }
896 In the following,
<firstterm>update rules<
/> means rules that are defined
897 on
<command>INSERT<
/>,
<command>UPDATE<
/>, or
<command>DELETE<
/>.
901 Update rules get applied by the rule system when the result
902 relation and the command type of a query tree are equal to the
903 object and event given in the
<command>CREATE RULE
</command> command.
904 For update rules, the rule system creates a list of query trees.
905 Initially the query-tree list is empty.
906 There can be zero (
<literal>NOTHING<
/> key word), one, or multiple actions.
907 To simplify, we will look at a rule with one action. This rule
908 can have a qualification or not and it can be
<literal>INSTEAD<
/> or
909 <literal>ALSO<
/> (the default).
913 What is a rule qualification? It is a restriction that tells
914 when the actions of the rule should be done and when not. This
915 qualification can only reference the pseudorelations
<literal>NEW<
/> and/or
<literal>OLD<
/>,
916 which basically represent the relation that was given as object (but with a
921 So we have three cases that produce the following query trees for
926 <term>No qualification, with either
<literal>ALSO<
/> or
927 <literal>INSTEAD<
/></term>
930 the query tree from the rule action with the original query
931 tree's qualification added
937 <term>Qualification given and
<literal>ALSO<
/></term>
940 the query tree from the rule action with the rule
941 qualification and the original query tree's qualification
948 <term>Qualification given and
<literal>INSTEAD<
/></term>
951 the query tree from the rule action with the rule
952 qualification and the original query tree's qualification; and
953 the original query tree with the negated rule qualification
960 Finally, if the rule is
<literal>ALSO<
/>, the unchanged original query tree is
961 added to the list. Since only qualified
<literal>INSTEAD<
/> rules already add the
962 original query tree, we end up with either one or two output query trees
963 for a rule with one action.
967 For
<literal>ON INSERT<
/> rules, the original query (if not suppressed by
<literal>INSTEAD<
/>)
968 is done before any actions added by rules. This allows the actions to
969 see the inserted row(s). But for
<literal>ON UPDATE<
/> and
<literal>ON
970 DELETE<
/> rules, the original query is done after the actions added by rules.
971 This ensures that the actions can see the to-be-updated or to-be-deleted
972 rows; otherwise, the actions might do nothing because they find no rows
973 matching their qualifications.
977 The query trees generated from rule actions are thrown into the
978 rewrite system again, and maybe more rules get applied resulting
979 in more or less query trees.
980 So a rule's actions must have either a different
981 command type or a different result relation than the rule itself is
982 on, otherwise this recursive process will end up in an infinite loop.
983 (Recursive expansion of a rule will be detected and reported as an
988 The query trees found in the actions of the
989 <structname>pg_rewrite
</structname> system catalog are only
990 templates. Since they can reference the range-table entries for
991 <literal>NEW<
/> and
<literal>OLD<
/>, some substitutions have to be made before they can be
992 used. For any reference to
<literal>NEW<
/>, the target list of the original
993 query is searched for a corresponding entry. If found, that
994 entry's expression replaces the reference. Otherwise,
<literal>NEW<
/> means the
995 same as
<literal>OLD<
/> (for an
<command>UPDATE
</command>) or is replaced by
996 a null value (for an
<command>INSERT
</command>). Any reference to
<literal>OLD<
/> is
997 replaced by a reference to the range-table entry that is the
1002 After the system is done applying update rules, it applies view rules to the
1003 produced query tree(s). Views cannot insert new update actions so
1004 there is no need to apply update rules to the output of view rewriting.
1008 <title>A First Rule Step by Step
</title>
1011 Say we want to trace changes to the
<literal>sl_avail<
/> column in the
1012 <literal>shoelace_data
</literal> relation. So we set up a log table
1013 and a rule that conditionally writes a log entry when an
1014 <command>UPDATE
</command> is performed on
1015 <literal>shoelace_data
</literal>.
1018 CREATE TABLE shoelace_log (
1019 sl_name text, -- shoelace changed
1020 sl_avail integer, -- new available value
1021 log_who text, -- who did it
1022 log_when timestamp -- when
1025 CREATE RULE log_shoelace AS ON UPDATE TO shoelace_data
1026 WHERE NEW.sl_avail
<> OLD.sl_avail
1027 DO INSERT INTO shoelace_log VALUES (
1040 UPDATE shoelace_data SET sl_avail =
6 WHERE sl_name = 'sl7';
1043 and we look at the log table:
1046 SELECT * FROM shoelace_log;
1048 sl_name | sl_avail | log_who | log_when
1049 ---------+----------+---------+----------------------------------
1050 sl7 |
6 | Al | Tue Oct
20 16:
14:
45 1998 MET DST
1056 That's what we expected. What happened in the background is the following.
1057 The parser created the query tree:
1060 UPDATE shoelace_data SET sl_avail =
6
1061 FROM shoelace_data shoelace_data
1062 WHERE shoelace_data.sl_name = 'sl7';
1065 There is a rule
<literal>log_shoelace
</literal> that is
<literal>ON UPDATE<
/> with the rule
1066 qualification expression:
1069 NEW.sl_avail
<> OLD.sl_avail
1075 INSERT INTO shoelace_log VALUES (
1076 *NEW*.sl_name, *NEW*.sl_avail,
1077 current_user, current_timestamp )
1078 FROM shoelace_data *NEW*, shoelace_data *OLD*;
1081 (This looks a little strange since you cannot normally write
1082 <literal>INSERT ... VALUES ... FROM<
/>. The
<literal>FROM<
/>
1083 clause here is just to indicate that there are range-table entries
1084 in the query tree for
<literal>*NEW*<
/> and
<literal>*OLD*<
/>.
1085 These are needed so that they can be referenced by variables in
1086 the
<command>INSERT
</command> command's query tree.)
1090 The rule is a qualified
<literal>ALSO<
/> rule, so the rule system
1091 has to return two query trees: the modified rule action and the original
1092 query tree. In step
1, the range table of the original query is
1093 incorporated into the rule's action query tree. This results in:
1096 INSERT INTO shoelace_log VALUES (
1097 *NEW*.sl_name, *NEW*.sl_avail,
1098 current_user, current_timestamp )
1099 FROM shoelace_data *NEW*, shoelace_data *OLD*,
1100 <emphasis>shoelace_data shoelace_data
</emphasis>;
1103 In step
2, the rule qualification is added to it, so the result set
1104 is restricted to rows where
<literal>sl_avail<
/> changes:
1107 INSERT INTO shoelace_log VALUES (
1108 *NEW*.sl_name, *NEW*.sl_avail,
1109 current_user, current_timestamp )
1110 FROM shoelace_data *NEW*, shoelace_data *OLD*,
1111 shoelace_data shoelace_data
1112 <emphasis>WHERE *NEW*.sl_avail
<> *OLD*.sl_avail
</emphasis>;
1115 (This looks even stranger, since
<literal>INSERT ... VALUES<
/> doesn't have
1116 a
<literal>WHERE<
/> clause either, but the planner and executor will have no
1117 difficulty with it. They need to support this same functionality
1118 anyway for
<literal>INSERT ... SELECT<
/>.)
1122 In step
3, the original query tree's qualification is added,
1123 restricting the result set further to only the rows that would have been touched
1124 by the original query:
1127 INSERT INTO shoelace_log VALUES (
1128 *NEW*.sl_name, *NEW*.sl_avail,
1129 current_user, current_timestamp )
1130 FROM shoelace_data *NEW*, shoelace_data *OLD*,
1131 shoelace_data shoelace_data
1132 WHERE *NEW*.sl_avail
<> *OLD*.sl_avail
1133 <emphasis>AND shoelace_data.sl_name = 'sl7'
</emphasis>;
1138 Step
4 replaces references to
<literal>NEW<
/> by the target list entries from the
1139 original query tree or by the matching variable references
1140 from the result relation:
1143 INSERT INTO shoelace_log VALUES (
1144 <emphasis>shoelace_data.sl_name
</emphasis>,
<emphasis>6</emphasis>,
1145 current_user, current_timestamp )
1146 FROM shoelace_data *NEW*, shoelace_data *OLD*,
1147 shoelace_data shoelace_data
1148 WHERE
<emphasis>6</emphasis> <> *OLD*.sl_avail
1149 AND shoelace_data.sl_name = 'sl7';
1155 Step
5 changes
<literal>OLD<
/> references into result relation references:
1158 INSERT INTO shoelace_log VALUES (
1159 shoelace_data.sl_name,
6,
1160 current_user, current_timestamp )
1161 FROM shoelace_data *NEW*, shoelace_data *OLD*,
1162 shoelace_data shoelace_data
1163 WHERE
6 <> <emphasis>shoelace_data.sl_avail
</emphasis>
1164 AND shoelace_data.sl_name = 'sl7';
1169 That's it. Since the rule is
<literal>ALSO<
/>, we also output the
1170 original query tree. In short, the output from the rule system
1171 is a list of two query trees that correspond to these statements:
1174 INSERT INTO shoelace_log VALUES (
1175 shoelace_data.sl_name,
6,
1176 current_user, current_timestamp )
1178 WHERE
6 <> shoelace_data.sl_avail
1179 AND shoelace_data.sl_name = 'sl7';
1181 UPDATE shoelace_data SET sl_avail =
6
1182 WHERE sl_name = 'sl7';
1185 These are executed in this order, and that is exactly what
1186 the rule was meant to do.
1190 The substitutions and the added qualifications
1191 ensure that, if the original query would be, say:
1194 UPDATE shoelace_data SET sl_color = 'green'
1195 WHERE sl_name = 'sl7';
1198 no log entry would get written. In that case, the original query
1199 tree does not contain a target list entry for
1200 <literal>sl_avail<
/>, so
<literal>NEW.sl_avail<
/> will get
1201 replaced by
<literal>shoelace_data.sl_avail<
/>. Thus, the extra
1202 command generated by the rule is:
1205 INSERT INTO shoelace_log VALUES (
1206 shoelace_data.sl_name,
<emphasis>shoelace_data.sl_avail
</emphasis>,
1207 current_user, current_timestamp )
1209 WHERE
<emphasis>shoelace_data.sl_avail
</emphasis> <> shoelace_data.sl_avail
1210 AND shoelace_data.sl_name = 'sl7';
1213 and that qualification will never be true.
1217 It will also work if the original query modifies multiple rows. So
1218 if someone issued the command:
1221 UPDATE shoelace_data SET sl_avail =
0
1222 WHERE sl_color = 'black';
1225 four rows in fact get updated (
<literal>sl1<
/>,
<literal>sl2<
/>,
<literal>sl3<
/>, and
<literal>sl4<
/>).
1226 But
<literal>sl3<
/> already has
<literal>sl_avail =
0<
/>. In this case, the original
1227 query trees qualification is different and that results
1228 in the extra query tree:
1231 INSERT INTO shoelace_log
1232 SELECT shoelace_data.sl_name,
0,
1233 current_user, current_timestamp
1235 WHERE
0 <> shoelace_data.sl_avail
1236 AND
<emphasis>shoelace_data.sl_color = 'black'
</emphasis>;
1239 being generated by the rule. This query tree will surely insert
1240 three new log entries. And that's absolutely correct.
1244 Here we can see why it is important that the original query tree
1245 is executed last. If the
<command>UPDATE
</command> had been
1246 executed first, all the rows would have already been set to zero, so the
1247 logging
<command>INSERT
</command> would not find any row where
1248 <literal>0 <> shoelace_data.sl_avail
</literal>.
1254 <sect2 id=
"rules-update-views">
1255 <title>Cooperation with Views
</title>
1257 <indexterm zone=
"rules-update-views"><primary>view<
/><secondary>updating<
/><
/>
1260 A simple way to protect view relations from the mentioned
1261 possibility that someone can try to run
<command>INSERT
</command>,
1262 <command>UPDATE
</command>, or
<command>DELETE
</command> on them is
1263 to let those query trees get thrown away. So we could create the rules:
1266 CREATE RULE shoe_ins_protect AS ON INSERT TO shoe
1268 CREATE RULE shoe_upd_protect AS ON UPDATE TO shoe
1270 CREATE RULE shoe_del_protect AS ON DELETE TO shoe
1274 If someone now tries to do any of these operations on the view
1275 relation
<literal>shoe
</literal>, the rule system will
1276 apply these rules. Since the rules have
1277 no actions and are
<literal>INSTEAD<
/>, the resulting list of
1278 query trees will be empty and the whole query will become
1279 nothing because there is nothing left to be optimized or
1280 executed after the rule system is done with it.
1284 A more sophisticated way to use the rule system is to
1285 create rules that rewrite the query tree into one that
1286 does the right operation on the real tables. To do that
1287 on the
<literal>shoelace
</literal> view, we create
1288 the following rules:
1291 CREATE RULE shoelace_ins AS ON INSERT TO shoelace
1293 INSERT INTO shoelace_data VALUES (
1301 CREATE RULE shoelace_upd AS ON UPDATE TO shoelace
1303 UPDATE shoelace_data
1304 SET sl_name = NEW.sl_name,
1305 sl_avail = NEW.sl_avail,
1306 sl_color = NEW.sl_color,
1307 sl_len = NEW.sl_len,
1308 sl_unit = NEW.sl_unit
1309 WHERE sl_name = OLD.sl_name;
1311 CREATE RULE shoelace_del AS ON DELETE TO shoelace
1313 DELETE FROM shoelace_data
1314 WHERE sl_name = OLD.sl_name;
1319 If you want to support
<literal>RETURNING<
/> queries on the view,
1320 you need to make the rules include
<literal>RETURNING<
/> clauses that
1321 compute the view rows. This is usually pretty trivial for views on a
1322 single table, but it's a bit tedious for join views such as
1323 <literal>shoelace
</literal>. An example for the insert case is:
1326 CREATE RULE shoelace_ins AS ON INSERT TO shoelace
1328 INSERT INTO shoelace_data VALUES (
1337 (SELECT shoelace_data.sl_len * u.un_fact
1338 FROM unit u WHERE shoelace_data.sl_unit = u.un_name);
1341 Note that this one rule supports both
<command>INSERT<
/> and
1342 <command>INSERT RETURNING<
/> queries on the view
— the
1343 <literal>RETURNING<
/> clause is simply ignored for
<command>INSERT<
/>.
1347 Now assume that once in a while, a pack of shoelaces arrives at
1348 the shop and a big parts list along with it. But you don't want
1349 to manually update the
<literal>shoelace
</literal> view every
1350 time. Instead we setup two little tables: one where you can
1351 insert the items from the part list, and one with a special
1352 trick. The creation commands for these are:
1355 CREATE TABLE shoelace_arrive (
1360 CREATE TABLE shoelace_ok (
1365 CREATE RULE shoelace_ok_ins AS ON INSERT TO shoelace_ok
1368 SET sl_avail = sl_avail + NEW.ok_quant
1369 WHERE sl_name = NEW.ok_name;
1372 Now you can fill the table
<literal>shoelace_arrive
</literal> with
1373 the data from the parts list:
1376 SELECT * FROM shoelace_arrive;
1378 arr_name | arr_quant
1379 ----------+-----------
1386 Take a quick look at the current data:
1389 SELECT * FROM shoelace;
1391 sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1392 ----------+----------+----------+--------+---------+-----------
1393 sl1 |
5 | black |
80 | cm |
80
1394 sl2 |
6 | black |
100 | cm |
100
1395 sl7 |
6 | brown |
60 | cm |
60
1396 sl3 |
0 | black |
35 | inch |
88.9
1397 sl4 |
8 | black |
40 | inch |
101.6
1398 sl8 |
1 | brown |
40 | inch |
101.6
1399 sl5 |
4 | brown |
1 | m |
100
1400 sl6 |
0 | brown |
0.9 | m |
90
1404 Now move the arrived shoelaces in:
1407 INSERT INTO shoelace_ok SELECT * FROM shoelace_arrive;
1410 and check the results:
1413 SELECT * FROM shoelace ORDER BY sl_name;
1415 sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1416 ----------+----------+----------+--------+---------+-----------
1417 sl1 |
5 | black |
80 | cm |
80
1418 sl2 |
6 | black |
100 | cm |
100
1419 sl7 |
6 | brown |
60 | cm |
60
1420 sl4 |
8 | black |
40 | inch |
101.6
1421 sl3 |
10 | black |
35 | inch |
88.9
1422 sl8 |
21 | brown |
40 | inch |
101.6
1423 sl5 |
4 | brown |
1 | m |
100
1424 sl6 |
20 | brown |
0.9 | m |
90
1427 SELECT * FROM shoelace_log;
1429 sl_name | sl_avail | log_who| log_when
1430 ---------+----------+--------+----------------------------------
1431 sl7 |
6 | Al | Tue Oct
20 19:
14:
45 1998 MET DST
1432 sl3 |
10 | Al | Tue Oct
20 19:
25:
16 1998 MET DST
1433 sl6 |
20 | Al | Tue Oct
20 19:
25:
16 1998 MET DST
1434 sl8 |
21 | Al | Tue Oct
20 19:
25:
16 1998 MET DST
1440 It's a long way from the one
<literal>INSERT ... SELECT
</literal>
1441 to these results. And the description of the query-tree
1442 transformation will be the last in this chapter. First, there is
1443 the parser's output:
1446 INSERT INTO shoelace_ok
1447 SELECT shoelace_arrive.arr_name, shoelace_arrive.arr_quant
1448 FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok;
1451 Now the first rule
<literal>shoelace_ok_ins
</literal> is applied and turns this
1456 SET sl_avail = shoelace.sl_avail + shoelace_arrive.arr_quant
1457 FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1458 shoelace_ok *OLD*, shoelace_ok *NEW*,
1460 WHERE shoelace.sl_name = shoelace_arrive.arr_name;
1463 and throws away the original
<command>INSERT
</command> on
1464 <literal>shoelace_ok
</literal>. This rewritten query is passed to
1465 the rule system again, and the second applied rule
1466 <literal>shoelace_upd
</literal> produces:
1469 UPDATE shoelace_data
1470 SET sl_name = shoelace.sl_name,
1471 sl_avail = shoelace.sl_avail + shoelace_arrive.arr_quant,
1472 sl_color = shoelace.sl_color,
1473 sl_len = shoelace.sl_len,
1474 sl_unit = shoelace.sl_unit
1475 FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1476 shoelace_ok *OLD*, shoelace_ok *NEW*,
1477 shoelace shoelace, shoelace *OLD*,
1478 shoelace *NEW*, shoelace_data shoelace_data
1479 WHERE shoelace.sl_name = shoelace_arrive.arr_name
1480 AND shoelace_data.sl_name = shoelace.sl_name;
1483 Again it's an
<literal>INSTEAD<
/> rule and the previous query tree is trashed.
1484 Note that this query still uses the view
<literal>shoelace
</literal>.
1485 But the rule system isn't finished with this step, so it continues
1486 and applies the
<literal>_RETURN
</literal> rule on it, and we get:
1489 UPDATE shoelace_data
1490 SET sl_name = s.sl_name,
1491 sl_avail = s.sl_avail + shoelace_arrive.arr_quant,
1492 sl_color = s.sl_color,
1495 FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1496 shoelace_ok *OLD*, shoelace_ok *NEW*,
1497 shoelace shoelace, shoelace *OLD*,
1498 shoelace *NEW*, shoelace_data shoelace_data,
1499 shoelace *OLD*, shoelace *NEW*,
1500 shoelace_data s, unit u
1501 WHERE s.sl_name = shoelace_arrive.arr_name
1502 AND shoelace_data.sl_name = s.sl_name;
1505 Finally, the rule
<literal>log_shoelace
</literal> gets applied,
1506 producing the extra query tree:
1509 INSERT INTO shoelace_log
1511 s.sl_avail + shoelace_arrive.arr_quant,
1514 FROM shoelace_arrive shoelace_arrive, shoelace_ok shoelace_ok,
1515 shoelace_ok *OLD*, shoelace_ok *NEW*,
1516 shoelace shoelace, shoelace *OLD*,
1517 shoelace *NEW*, shoelace_data shoelace_data,
1518 shoelace *OLD*, shoelace *NEW*,
1519 shoelace_data s, unit u,
1520 shoelace_data *OLD*, shoelace_data *NEW*
1521 shoelace_log shoelace_log
1522 WHERE s.sl_name = shoelace_arrive.arr_name
1523 AND shoelace_data.sl_name = s.sl_name
1524 AND (s.sl_avail + shoelace_arrive.arr_quant)
<> s.sl_avail;
1527 After that the rule system runs out of rules and returns the
1528 generated query trees.
1532 So we end up with two final query trees that are equivalent to the
1533 <acronym>SQL
</acronym> statements:
1536 INSERT INTO shoelace_log
1538 s.sl_avail + shoelace_arrive.arr_quant,
1541 FROM shoelace_arrive shoelace_arrive, shoelace_data shoelace_data,
1543 WHERE s.sl_name = shoelace_arrive.arr_name
1544 AND shoelace_data.sl_name = s.sl_name
1545 AND s.sl_avail + shoelace_arrive.arr_quant
<> s.sl_avail;
1547 UPDATE shoelace_data
1548 SET sl_avail = shoelace_data.sl_avail + shoelace_arrive.arr_quant
1549 FROM shoelace_arrive shoelace_arrive,
1550 shoelace_data shoelace_data,
1552 WHERE s.sl_name = shoelace_arrive.sl_name
1553 AND shoelace_data.sl_name = s.sl_name;
1556 The result is that data coming from one relation inserted into another,
1557 changed into updates on a third, changed into updating
1558 a fourth plus logging that final update in a fifth
1559 gets reduced into two queries.
1563 There is a little detail that's a bit ugly. Looking at the two
1564 queries, it turns out that the
<literal>shoelace_data
</literal>
1565 relation appears twice in the range table where it could
1566 definitely be reduced to one. The planner does not handle it and
1567 so the execution plan for the rule systems output of the
1568 <command>INSERT
</command> will be
1570 <literallayout class=
"monospaced">
1578 -
> Seq Scan on shoelace_arrive
1579 -
> Seq Scan on shoelace_data
1582 while omitting the extra range table entry would result in a
1584 <literallayout class=
"monospaced">
1591 -
> Seq Scan on shoelace_arrive
1594 which produces exactly the same entries in the log table. Thus,
1595 the rule system caused one extra scan on the table
1596 <literal>shoelace_data
</literal> that is absolutely not
1597 necessary. And the same redundant scan is done once more in the
1598 <command>UPDATE
</command>. But it was a really hard job to make
1599 that all possible at all.
1603 Now we make a final demonstration of the
1604 <productname>PostgreSQL
</productname> rule system and its power.
1605 Say you add some shoelaces with extraordinary colors to your
1609 INSERT INTO shoelace VALUES ('sl9',
0, 'pink',
35.0, 'inch',
0.0);
1610 INSERT INTO shoelace VALUES ('sl10',
1000, 'magenta',
40.0, 'inch',
0.0);
1613 We would like to make a view to check which
1614 <literal>shoelace
</literal> entries do not fit any shoe in color.
1615 The view for this is:
1618 CREATE VIEW shoelace_mismatch AS
1619 SELECT * FROM shoelace WHERE NOT EXISTS
1620 (SELECT shoename FROM shoe WHERE slcolor = sl_color);
1626 SELECT * FROM shoelace_mismatch;
1628 sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1629 ---------+----------+----------+--------+---------+-----------
1630 sl9 |
0 | pink |
35 | inch |
88.9
1631 sl10 |
1000 | magenta |
40 | inch |
101.6
1636 Now we want to set it up so that mismatching shoelaces that are
1637 not in stock are deleted from the database.
1638 To make it a little harder for
<productname>PostgreSQL
</productname>,
1639 we don't delete it directly. Instead we create one more view:
1642 CREATE VIEW shoelace_can_delete AS
1643 SELECT * FROM shoelace_mismatch WHERE sl_avail =
0;
1649 DELETE FROM shoelace WHERE EXISTS
1650 (SELECT * FROM shoelace_can_delete
1651 WHERE sl_name = shoelace.sl_name);
1654 <foreignphrase>VoilÃ
</foreignphrase>:
1657 SELECT * FROM shoelace;
1659 sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
1660 ---------+----------+----------+--------+---------+-----------
1661 sl1 |
5 | black |
80 | cm |
80
1662 sl2 |
6 | black |
100 | cm |
100
1663 sl7 |
6 | brown |
60 | cm |
60
1664 sl4 |
8 | black |
40 | inch |
101.6
1665 sl3 |
10 | black |
35 | inch |
88.9
1666 sl8 |
21 | brown |
40 | inch |
101.6
1667 sl10 |
1000 | magenta |
40 | inch |
101.6
1668 sl5 |
4 | brown |
1 | m |
100
1669 sl6 |
20 | brown |
0.9 | m |
90
1675 A
<command>DELETE
</command> on a view, with a subquery qualification that
1676 in total uses
4 nesting/joined views, where one of them
1677 itself has a subquery qualification containing a view
1678 and where calculated view columns are used,
1680 one single query tree that deletes the requested data
1685 There are probably only a few situations out in the real world
1686 where such a construct is necessary. But it makes you feel
1687 comfortable that it works.
1693 <sect1 id=
"rules-privileges">
1694 <title>Rules and Privileges
</title>
1696 <indexterm zone=
"rules-privileges">
1697 <primary>privilege
</primary>
1698 <secondary sortas=
"Regeln">with rules
</secondary>
1701 <indexterm zone=
"rules-privileges">
1702 <primary>privilege
</primary>
1703 <secondary sortas=
"Sichten">with views
</secondary>
1707 Due to rewriting of queries by the
<productname>PostgreSQL
</productname>
1708 rule system, other tables/views than those used in the original
1709 query get accessed. When update rules are used, this can include write access
1714 Rewrite rules don't have a separate owner. The owner of
1715 a relation (table or view) is automatically the owner of the
1716 rewrite rules that are defined for it.
1717 The
<productname>PostgreSQL
</productname> rule system changes the
1718 behavior of the default access control system. Relations that
1719 are used due to rules get checked against the
1720 privileges of the rule owner, not the user invoking the rule.
1721 This means that a user only needs the required privileges
1722 for the tables/views that he names explicitly in his queries.
1726 For example: A user has a list of phone numbers where some of
1727 them are private, the others are of interest for the secretary of the office.
1728 He can construct the following:
1731 CREATE TABLE phone_data (person text, phone text, private boolean);
1732 CREATE VIEW phone_number AS
1733 SELECT person, phone FROM phone_data WHERE NOT private;
1734 GRANT SELECT ON phone_number TO secretary;
1737 Nobody except him (and the database superusers) can access the
1738 <literal>phone_data<
/> table. But because of the
<command>GRANT<
/>,
1739 the secretary can run a
<command>SELECT
</command> on the
1740 <literal>phone_number<
/> view. The rule system will rewrite the
1741 <command>SELECT
</command> from
<literal>phone_number<
/> into a
1742 <command>SELECT
</command> from
<literal>phone_data<
/> and add the
1743 qualification that only entries where
<literal>private<
/> is false
1744 are wanted. Since the user is the owner of
1745 <literal>phone_number<
/> and therefore the owner of the rule, the
1746 read access to
<literal>phone_data<
/> is now checked against his
1747 privileges and the query is permitted. The check for accessing
1748 <literal>phone_number<
/> is also performed, but this is done
1749 against the invoking user, so nobody but the user and the
1750 secretary can use it.
1754 The privileges are checked rule by rule. So the secretary is for now the
1755 only one who can see the public phone numbers. But the secretary can setup
1756 another view and grant access to that to the public. Then, anyone
1757 can see the
<literal>phone_number<
/> data through the secretary's view.
1758 What the secretary cannot do is to create a view that directly
1759 accesses
<literal>phone_data<
/>. (Actually he can, but it will not work since
1760 every access will be denied during the permission checks.)
1761 And as soon as the user will notice, that the secretary opened
1762 his
<literal>phone_number<
/> view, he can revoke his access. Immediately, any
1763 access to the secretary's view would fail.
1767 One might think that this rule-by-rule checking is a security
1768 hole, but in fact it isn't. But if it did not work this way, the secretary
1769 could set up a table with the same columns as
<literal>phone_number<
/> and
1770 copy the data to there once per day. Then it's his own data and
1771 he can grant access to everyone he wants. A
1772 <command>GRANT
</command> command means,
<quote>I trust you
</quote>.
1773 If someone you trust does the thing above, it's time to
1774 think it over and then use
<command>REVOKE
</command>.
1778 This mechanism also works for update rules. In the examples of
1779 the previous section, the owner of the tables in the example
1780 database could grant the privileges
<literal>SELECT<
/>,
1781 <literal>INSERT<
/>,
<literal>UPDATE<
/>, and
<literal>DELETE<
/> on
1782 the
<literal>shoelace<
/> view to someone else, but only
1783 <literal>SELECT<
/> on
<literal>shoelace_log<
/>. The rule action to
1784 write log entries will still be executed successfully, and that
1785 other user could see the log entries. But he cannot create fake
1786 entries, nor could he manipulate or remove existing ones.
1790 <sect1 id=
"rules-status">
1791 <title>Rules and Command Status
</title>
1794 The
<productname>PostgreSQL
</productname> server returns a command
1795 status string, such as
<literal>INSERT
149592 1<
/>, for each
1796 command it receives. This is simple enough when there are no rules
1797 involved, but what happens when the query is rewritten by rules?
1801 Rules affect the command status as follows:
1806 If there is no unconditional
<literal>INSTEAD<
/> rule for the query, then
1807 the originally given query will be executed, and its command
1808 status will be returned as usual. (But note that if there were
1809 any conditional
<literal>INSTEAD<
/> rules, the negation of their qualifications
1810 will have been added to the original query. This might reduce the
1811 number of rows it processes, and if so the reported status will
1818 If there is any unconditional
<literal>INSTEAD<
/> rule for the query, then
1819 the original query will not be executed at all. In this case,
1820 the server will return the command status for the last query
1821 that was inserted by an
<literal>INSTEAD<
/> rule (conditional or
1822 unconditional) and is of the same command type
1823 (
<command>INSERT
</command>,
<command>UPDATE
</command>, or
1824 <command>DELETE
</command>) as the original query. If no query
1825 meeting those requirements is added by any rule, then the
1826 returned command status shows the original query type and
1827 zeroes for the row-count and OID fields.
1832 (This system was established in
<productname>PostgreSQL<
/> 7.3.
1833 In versions before that, the command status might show different
1834 results when rules exist.)
1838 The programmer can ensure that any desired
<literal>INSTEAD<
/> rule is the one
1839 that sets the command status in the second case, by giving it the
1840 alphabetically last rule name among the active rules, so that it
1845 <sect1 id=
"rules-triggers">
1846 <title>Rules versus Triggers
</title>
1848 <indexterm zone=
"rules-triggers">
1849 <primary>rule
</primary>
1850 <secondary sortas=
"Trigger">compared with triggers
</secondary>
1853 <indexterm zone=
"rules-triggers">
1854 <primary>trigger
</primary>
1855 <secondary sortas=
"Regeln">compared with rules
</secondary>
1859 Many things that can be done using triggers can also be
1860 implemented using the
<productname>PostgreSQL
</productname>
1861 rule system. One of the things that cannot be implemented by
1862 rules are some kinds of constraints, especially foreign keys. It is possible
1863 to place a qualified rule that rewrites a command to
<literal>NOTHING<
/>
1864 if the value of a column does not appear in another table.
1865 But then the data is silently thrown away and that's
1866 not a good idea. If checks for valid values are required,
1867 and in the case of an invalid value an error message should
1868 be generated, it must be done by a trigger.
1872 On the other hand, a trigger cannot be created on views because
1873 there is no real data in a view relation; however INSERT, UPDATE,
1874 and DELETE rules can be created on views.
1878 For the things that can be implemented by both, which is best
1879 depends on the usage of the database.
1880 A trigger is fired for any affected row once. A rule manipulates
1881 the query or generates an additional query. So if many
1882 rows are affected in one statement, a rule issuing one extra
1883 command is likely to be faster than a trigger that is
1884 called for every single row and must execute its operations
1885 many times. However, the trigger approach is conceptually far
1886 simpler than the rule approach, and is easier for novices to get right.
1890 Here we show an example of how the choice of rules versus triggers
1891 plays out in one situation. There are two tables:
1894 CREATE TABLE computer (
1895 hostname text, -- indexed
1896 manufacturer text -- indexed
1899 CREATE TABLE software (
1900 software text, -- indexed
1901 hostname text -- indexed
1905 Both tables have many thousands of rows and the indexes on
1906 <structfield>hostname<
/> are unique. The rule or trigger should
1907 implement a constraint that deletes rows from
<literal>software<
/>
1908 that reference a deleted computer. The trigger would use this command:
1911 DELETE FROM software WHERE hostname = $
1;
1914 Since the trigger is called for each individual row deleted from
1915 <literal>computer<
/>, it can prepare and save the plan for this
1916 command and pass the
<structfield>hostname<
/> value in the
1917 parameter. The rule would be written as:
1920 CREATE RULE computer_del AS ON DELETE TO computer
1921 DO DELETE FROM software WHERE hostname = OLD.hostname;
1926 Now we look at different types of deletes. In the case of a:
1929 DELETE FROM computer WHERE hostname = 'mypc.local.net';
1932 the table
<literal>computer<
/> is scanned by index (fast), and the
1933 command issued by the trigger would also use an index scan (also fast).
1934 The extra command from the rule would be:
1937 DELETE FROM software WHERE computer.hostname = 'mypc.local.net'
1938 AND software.hostname = computer.hostname;
1941 Since there are appropriate indexes setup, the planner
1942 will create a plan of
1944 <literallayout class=
"monospaced">
1946 -
> Index Scan using comp_hostidx on computer
1947 -
> Index Scan using soft_hostidx on software
1950 So there would be not that much difference in speed between
1951 the trigger and the rule implementation.
1955 With the next delete we want to get rid of all the
2000 computers
1956 where the
<structfield>hostname<
/> starts with
1957 <literal>old<
/>. There are two possible commands to do that. One
1961 DELETE FROM computer WHERE hostname
>= 'old'
1962 AND hostname
< 'ole'
1965 The command added by the rule will be:
1968 DELETE FROM software WHERE computer.hostname
>= 'old' AND computer.hostname
< 'ole'
1969 AND software.hostname = computer.hostname;
1974 <literallayout class=
"monospaced">
1976 -
> Seq Scan on software
1978 -
> Index Scan using comp_hostidx on computer
1981 The other possible command is:
1984 DELETE FROM computer WHERE hostname ~ '^old';
1987 which results in the following executing plan for the command
1990 <literallayout class=
"monospaced">
1992 -
> Index Scan using comp_hostidx on computer
1993 -
> Index Scan using soft_hostidx on software
1996 This shows, that the planner does not realize that the
1997 qualification for
<structfield>hostname<
/> in
1998 <literal>computer<
/> could also be used for an index scan on
1999 <literal>software<
/> when there are multiple qualification
2000 expressions combined with
<literal>AND<
/>, which is what it does
2001 in the regular-expression version of the command. The trigger will
2002 get invoked once for each of the
2000 old computers that have to be
2003 deleted, and that will result in one index scan over
2004 <literal>computer<
/> and
2000 index scans over
2005 <literal>software<
/>. The rule implementation will do it with two
2006 commands that use indexes. And it depends on the overall size of
2007 the table
<literal>software<
/> whether the rule will still be faster in the
2008 sequential scan situation.
2000 command executions from the trigger over the SPI
2009 manager take some time, even if all the index blocks will soon be in the cache.
2013 The last command we look at is:
2016 DELETE FROM computer WHERE manufacturer = 'bim';
2019 Again this could result in many rows to be deleted from
2020 <literal>computer<
/>. So the trigger will again run many commands
2021 through the executor. The command generated by the rule will be:
2024 DELETE FROM software WHERE computer.manufacturer = 'bim'
2025 AND software.hostname = computer.hostname;
2028 The plan for that command will again be the nested loop over two
2029 index scans, only using a different index on
<literal>computer<
/>:
2033 -
> Index Scan using comp_manufidx on computer
2034 -
> Index Scan using soft_hostidx on software
2037 In any of these cases, the extra commands from the rule system
2038 will be more or less independent from the number of affected rows
2043 <!-- What's happening with this? If it doesn't come back, remove this section. -->
2045 Another situation is cases on
<command>UPDATE
</command> where it depends on the
2046 change of an attribute if an action should be performed or
2047 not. The only way to
2048 create a rule as in the shoelace_log example is to do it with
2049 a rule qualification. That results in an extra query that is
2050 performed always, even if the attribute of interest cannot
2051 change at all because it does not appear in the target list
2052 of the initial query. When this is enabled again, it will be
2053 one more advantage of rules over triggers. Optimization of
2054 a trigger must fail by definition in this case, because the
2055 fact that its actions will only be done when a specific attribute
2056 is updated is hidden in its functionality. The definition of
2057 a trigger only allows to specify it on row level, so whenever a
2058 row is touched, the trigger must be called to make its
2059 decision. The rule system will know it by looking up the
2060 target list and will suppress the additional query completely
2061 if the attribute isn't touched. So the rule, qualified or not,
2062 will only do its scans if there ever could be something to do.
2067 The summary is, rules will only be significantly slower than
2068 triggers if their actions result in large and badly qualified
2069 joins, a situation where the planner fails.