doc/src/sgml/plperl.sgml

   1 <!-- $PostgreSQL$ -->
   2
   3  <chapter id="plperl">
   4   <title>PL/Perl - Perl Procedural Language</title>
   5
   6   <indexterm zone="plperl">
   7    <primary>PL/Perl</primary>
   8   </indexterm>
   9
  10   <indexterm zone="plperl">
  11    <primary>Perl</primary>
  12   </indexterm>
  13
  14   <para>
  15    PL/Perl is a loadable procedural language that enables you to write
  16    <productname>PostgreSQL</productname> functions in the
  17    <ulink url="http://www.perl.com">Perl programming language</ulink>.
  18   </para>
  19
  20   <para>
  21    The main advantage to using PL/Perl is that this allows use,
  22    within stored functions, of the manyfold <quote>string
  23    munging</quote> operators and functions available for Perl.  Parsing
  24    complex strings might be easier using Perl than it is with the
  25    string functions and control structures provided in PL/pgSQL.
  26   </para>
  27
  28   <para>
  29    To install PL/Perl in a particular database, use
  30    <literal>createlang plperl <replaceable>dbname</></literal>.
  31   </para>
  32
  33   <tip>
  34    <para>
  35     If a language is installed into <literal>template1</>, all subsequently
  36     created databases will have the language installed automatically.
  37    </para>
  38   </tip>
  39
  40   <note>
  41    <para>
  42     Users of source packages must specially enable the build of
  43     PL/Perl during the installation process.  (Refer to <xref
  44     linkend="install-short"> for more information.)  Users of
  45     binary packages might find PL/Perl in a separate subpackage.
  46    </para>
  47   </note>
  48
  49  <sect1 id="plperl-funcs">
  50   <title>PL/Perl Functions and Arguments</title>
  51
  52   <para>
  53    To create a function in the PL/Perl language, use the standard
  54    <xref linkend="sql-createfunction" endterm="sql-createfunction-title">
  55    syntax:
  56
  57 <programlisting>
  58 CREATE FUNCTION <replaceable>funcname</replaceable> (<replaceable>argument-types</replaceable>) RETURNS <replaceable>return-type</replaceable> AS $$
  59     # PL/Perl function body
  60 $$ LANGUAGE plperl;
  61 </programlisting>
  62    The body of the function is ordinary Perl code. In fact, the PL/Perl
  63    glue code wraps it inside a Perl subroutine. A PL/Perl function must
  64    always return a scalar value.  You can return more complex structures
  65    (arrays, records, and sets) by returning a reference, as discussed below.
  66    Never return a list.
  67   </para>
  68
  69   <note>
  70    <para>
  71     The use of named nested subroutines is dangerous in Perl, especially if
  72     they refer to lexical variables in the enclosing scope. Because a PL/Perl
  73     function is wrapped in a subroutine, any named subroutine you create will
  74     be nested. In general, it is far safer to create anonymous subroutines
  75     which you call via a coderef. See the <literal>perldiag</literal>
  76     man page for more details.
  77    </para>
  78   </note>
  79
  80   <para>
  81    The syntax of the <command>CREATE FUNCTION</command> command requires
  82    the function body to be written as a string constant.  It is usually
  83    most convenient to use dollar quoting (see <xref
  84    linkend="sql-syntax-dollar-quoting">) for the string constant.
  85    If you choose to use escape string syntax <literal>E''</>,
  86    you must double the single quote marks (<literal>'</>) and backslashes
  87    (<literal>\</>) used in the body of the function
  88    (see <xref linkend="sql-syntax-strings">).
  89   </para>
  90
  91   <para>
  92    Arguments and results are handled as in any other Perl subroutine:
  93    arguments are passed in <varname>@_</varname>, and a result value
  94    is returned with <literal>return</> or as the last expression
  95    evaluated in the function.
  96   </para>
  97
  98   <para>
  99    For example, a function returning the greater of two integer values
 100    could be defined as:
 101
 102 <programlisting>
 103 CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
 104     if ($_[0] &gt; $_[1]) { return $_[0]; }
 105     return $_[1];
 106 $$ LANGUAGE plperl;
 107 </programlisting>
 108   </para>
 109
 110   <para>
 111    If an SQL null value<indexterm><primary>null value</><secondary
 112    sortas="PL/Perl">in PL/Perl</></indexterm> is passed to a function,
 113    the argument value will appear as <quote>undefined</> in Perl.  The
 114    above function definition will not behave very nicely with null
 115    inputs (in fact, it will act as though they are zeroes).  We could
 116    add <literal>STRICT</> to the function definition to make
 117    <productname>PostgreSQL</productname> do something more reasonable:
 118    if a null value is passed, the function will not be called at all,
 119    but will just return a null result automatically.  Alternatively,
 120    we could check for undefined inputs in the function body.  For
 121    example, suppose that we wanted <function>perl_max</function> with
 122    one null and one nonnull argument to return the nonnull argument,
 123    rather than a null value:
 124
 125 <programlisting>
 126 CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
 127     my ($x,$y) = @_;
 128     if (! defined $x) {
 129         if (! defined $y) { return undef; }
 130         return $y;
 131     }
 132     if (! defined $y) { return $x; }
 133     if ($x &gt; $y) { return $x; }
 134     return $y;
 135 $$ LANGUAGE plperl;
 136 </programlisting>
 137    As shown above, to return an SQL null value from a PL/Perl
 138    function, return an undefined value.  This can be done whether the
 139    function is strict or not.
 140   </para>
 141
 142   <para>
 143    Anything in a function argument that is not a reference is
 144    a string, which is in the standard <productname>PostgreSQL</productname>
 145    external text representation for the relevant data type. In the case of
 146    ordinary numeric or text types, Perl will just do the right thing and
 147    the programmer will normally not have to worry about it. However, in
 148    other cases the argument will need to be converted into a form that is
 149    more usable in Perl. For example, here is how to convert an argument of
 150    type <type>bytea</> into unescaped binary
 151    data:
 152
 153 <programlisting>
 154     my $arg = shift;
 155     $arg =~ s!\\(?:\\|(\d{3}))!$1 ? chr(oct($1)) : "\\"!ge;
 156 </programlisting>
 157
 158   </para>
 159
 160   <para>
 161    Similarly, values passed back to <productname>PostgreSQL</productname>
 162    must be in the external text representation format. For example, here
 163    is how to escape binary data for a return value of type <type>bytea</>:
 164
 165 <programlisting>
 166     $retval =~ s!(\\|[^ -~])!sprintf("\\%03o",ord($1))!ge;
 167     return $retval;
 168 </programlisting>
 169
 170   </para>
 171
 172   <para>
 173    Perl can return <productname>PostgreSQL</productname> arrays as
 174    references to Perl arrays.  Here is an example:
 175
 176 <programlisting>
 177 CREATE OR REPLACE function returns_array()
 178 RETURNS text[][] AS $$
 179     return [['a&quot;b','c,d'],['e\\f','g']];
 180 $$ LANGUAGE plperl;
 181
 182 select returns_array();
 183 </programlisting>
 184   </para>
 185
 186   <para>
 187    Composite-type arguments are passed to the function as references
 188    to hashes.  The keys of the hash are the attribute names of the
 189    composite type.  Here is an example:
 190
 191 <programlisting>
 192 CREATE TABLE employee (
 193     name text,
 194     basesalary integer,
 195     bonus integer
 196 );
 197
 198 CREATE FUNCTION empcomp(employee) RETURNS integer AS $$
 199     my ($emp) = @_;
 200     return $emp-&gt;{basesalary} + $emp-&gt;{bonus};
 201 $$ LANGUAGE plperl;
 202
 203 SELECT name, empcomp(employee.*) FROM employee;
 204 </programlisting>
 205   </para>
 206
 207   <para>
 208    A PL/Perl function can return a composite-type result using the same
 209    approach: return a reference to a hash that has the required attributes.
 210    For example:
 211
 212 <programlisting>
 213 CREATE TYPE testrowperl AS (f1 integer, f2 text, f3 text);
 214
 215 CREATE OR REPLACE FUNCTION perl_row() RETURNS testrowperl AS $$
 216     return {f2 =&gt; 'hello', f1 =&gt; 1, f3 =&gt; 'world'};
 217 $$ LANGUAGE plperl;
 218
 219 SELECT * FROM perl_row();
 220 </programlisting>
 221
 222    Any columns in the declared result data type that are not present in the
 223    hash will be returned as null values.
 224   </para>
 225
 226   <para>
 227     PL/Perl functions can also return sets of either scalar or
 228     composite types.  Usually you'll want to return rows one at a
 229     time, both to speed up startup time and to keep from queueing up
 230     the entire result set in memory.  You can do this with
 231     <function>return_next</function> as illustrated below.  Note that
 232     after the last <function>return_next</function>, you must put
 233     either <literal>return</literal> or (better) <literal>return
 234     undef</literal>.
 235
 236 <programlisting>
 237 CREATE OR REPLACE FUNCTION perl_set_int(int)
 238 RETURNS SETOF INTEGER AS $$
 239     foreach (0..$_[0]) {
 240         return_next($_);
 241     }
 242     return undef;
 243 $$ LANGUAGE plperl;
 244
 245 SELECT * FROM perl_set_int(5);
 246
 247 CREATE OR REPLACE FUNCTION perl_set()
 248 RETURNS SETOF testrowperl AS $$
 249     return_next({ f1 =&gt; 1, f2 =&gt; 'Hello', f3 =&gt; 'World' });
 250     return_next({ f1 =&gt; 2, f2 =&gt; 'Hello', f3 =&gt; 'PostgreSQL' });
 251     return_next({ f1 =&gt; 3, f2 =&gt; 'Hello', f3 =&gt; 'PL/Perl' });
 252     return undef;
 253 $$ LANGUAGE plperl;
 254 </programlisting>
 255
 256     For small result sets, you can return a reference to an array that
 257     contains either scalars, references to arrays, or references to
 258     hashes for simple types, array types, and composite types,
 259     respectively.  Here are some simple examples of returning the entire
 260     result set as an array reference:
 261
 262 <programlisting>
 263 CREATE OR REPLACE FUNCTION perl_set_int(int) RETURNS SETOF INTEGER AS $$
 264     return [0..$_[0]];
 265 $$ LANGUAGE plperl;
 266
 267 SELECT * FROM perl_set_int(5);
 268
 269 CREATE OR REPLACE FUNCTION perl_set() RETURNS SETOF testrowperl AS $$
 270     return [
 271         { f1 =&gt; 1, f2 =&gt; 'Hello', f3 =&gt; 'World' },
 272         { f1 =&gt; 2, f2 =&gt; 'Hello', f3 =&gt; 'PostgreSQL' },
 273         { f1 =&gt; 3, f2 =&gt; 'Hello', f3 =&gt; 'PL/Perl' }
 274     ];
 275 $$ LANGUAGE plperl;
 276
 277 SELECT * FROM perl_set();
 278 </programlisting>
 279   </para>
 280
 281   <para>
 282    If you wish to use the <literal>strict</> pragma with your code,
 283    the easiest way to do so is to <command>SET</>
 284    <literal>plperl.use_strict</literal> to true.  This parameter affects
 285    subsequent compilations of <application>PL/Perl</> functions, but not
 286    functions already compiled in the current session.  To set the
 287    parameter before <application>PL/Perl</> has been loaded, it is
 288    necessary to have added <quote><literal>plperl</></> to the <xref
 289    linkend="guc-custom-variable-classes"> list in
 290    <filename>postgresql.conf</filename>.
 291   </para>
 292
 293   <para>
 294    Another way to use the <literal>strict</> pragma is to put:
 295 <programlisting>
 296 use strict;
 297 </programlisting>
 298    in the function body.  But this only works in <application>PL/PerlU</>
 299    functions, since <literal>use</> is not a trusted operation.  In
 300    <application>PL/Perl</> functions you can instead do:
 301 <programlisting>
 302 BEGIN { strict->import(); }
 303 </programlisting>
 304   </para>
 305  </sect1>
 306
 307  <sect1 id="plperl-database">
 308   <title>Database Access from PL/Perl</title>
 309
 310   <para>
 311    Access to the database itself from your Perl function can be done
 312    via the following functions:
 313
 314    <variablelist>
 315     <varlistentry>
 316      <indexterm>
 317       <primary>spi_exec_query</primary>
 318       <secondary>in PL/Perl</secondary>
 319      </indexterm>
 320
 321      <term><literal><function>spi_exec_query</>(<replaceable>query</replaceable> [, <replaceable>max-rows</replaceable>])</literal></term>
 322      <term><literal><function>spi_query</>(<replaceable>command</replaceable>)</literal></term>
 323      <term><literal><function>spi_fetchrow</>(<replaceable>cursor</replaceable>)</literal></term>
 324      <term><literal><function>spi_prepare</>(<replaceable>command</replaceable>, <replaceable>argument types</replaceable>)</literal></term>
 325      <term><literal><function>spi_exec_prepared</>(<replaceable>plan</replaceable>)</literal></term>
 326      <term><literal><function>spi_query_prepared</>(<replaceable>plan</replaceable> [, <replaceable>attributes</replaceable>], <replaceable>arguments</replaceable>)</literal></term>
 327      <term><literal><function>spi_cursor_close</>(<replaceable>cursor</replaceable>)</literal></term>
 328      <term><literal><function>spi_freeplan</>(<replaceable>plan</replaceable>)</literal></term>
 329
 330      <listitem>
 331       <para>
 332        <literal>spi_exec_query</literal> executes an SQL command and
 333 returns the entire row set as a reference to an array of hash
 334 references.  <emphasis>You should only use this command when you know
 335 that the result set will be relatively small.</emphasis>  Here is an
 336 example of a query (<command>SELECT</command> command) with the
 337 optional maximum number of rows:
 338
 339 <programlisting>
 340 $rv = spi_exec_query('SELECT * FROM my_table', 5);
 341 </programlisting>
 342         This returns up to 5 rows from the table
 343         <literal>my_table</literal>.  If <literal>my_table</literal>
 344         has a column <literal>my_column</literal>, you can get that
 345         value from row <literal>$i</literal> of the result like this:
 346 <programlisting>
 347 $foo = $rv-&gt;{rows}[$i]-&gt;{my_column};
 348 </programlisting>
 349        The total number of rows returned from a <command>SELECT</command>
 350        query can be accessed like this:
 351 <programlisting>
 352 $nrows = $rv-&gt;{processed}
 353 </programlisting>
 354       </para>
 355
 356       <para>
 357        Here is an example using a different command type:
 358 <programlisting>
 359 $query = "INSERT INTO my_table VALUES (1, 'test')";
 360 $rv = spi_exec_query($query);
 361 </programlisting>
 362        You can then access the command status (e.g.,
 363        <literal>SPI_OK_INSERT</literal>) like this:
 364 <programlisting>
 365 $res = $rv-&gt;{status};
 366 </programlisting>
 367        To get the number of rows affected, do:
 368 <programlisting>
 369 $nrows = $rv-&gt;{processed};
 370 </programlisting>
 371       </para>
 372
 373       <para>
 374        Here is a complete example:
 375 <programlisting>
 376 CREATE TABLE test (
 377     i int,
 378     v varchar
 379 );
 380
 381 INSERT INTO test (i, v) VALUES (1, 'first line');
 382 INSERT INTO test (i, v) VALUES (2, 'second line');
 383 INSERT INTO test (i, v) VALUES (3, 'third line');
 384 INSERT INTO test (i, v) VALUES (4, 'immortal');
 385
 386 CREATE OR REPLACE FUNCTION test_munge() RETURNS SETOF test AS $$
 387     my $rv = spi_exec_query('select i, v from test;');
 388     my $status = $rv-&gt;{status};
 389     my $nrows = $rv-&gt;{processed};
 390     foreach my $rn (0 .. $nrows - 1) {
 391         my $row = $rv-&gt;{rows}[$rn];
 392         $row-&gt;{i} += 200 if defined($row-&gt;{i});
 393         $row-&gt;{v} =~ tr/A-Za-z/a-zA-Z/ if (defined($row-&gt;{v}));
 394         return_next($row);
 395     }
 396     return undef;
 397 $$ LANGUAGE plperl;
 398
 399 SELECT * FROM test_munge();
 400 </programlisting>
 401     </para>
 402
 403     <para>
 404     <literal>spi_query</literal> and <literal>spi_fetchrow</literal>
 405     work together as a pair for row sets which might be large, or for cases
 406     where you wish to return rows as they arrive.
 407     <literal>spi_fetchrow</literal> works <emphasis>only</emphasis> with
 408     <literal>spi_query</literal>. The following example illustrates how
 409     you use them together:
 410
 411 <programlisting>
 412 CREATE TYPE foo_type AS (the_num INTEGER, the_text TEXT);
 413
 414 CREATE OR REPLACE FUNCTION lotsa_md5 (INTEGER) RETURNS SETOF foo_type AS $$
 415     use Digest::MD5 qw(md5_hex);
 416     my $file = '/usr/share/dict/words';
 417     my $t = localtime;
 418     elog(NOTICE, "opening file $file at $t" );
 419     open my $fh, '&lt;', $file # ooh, it's a file access!
 420         or elog(ERROR, "cannot open $file for reading: $!");
 421     my @words = &lt;$fh&gt;;
 422     close $fh;
 423     $t = localtime;
 424     elog(NOTICE, "closed file $file at $t");
 425     chomp(@words);
 426     my $row;
 427     my $sth = spi_query("SELECT * FROM generate_series(1,$_[0]) AS b(a)");
 428     while (defined ($row = spi_fetchrow($sth))) {
 429         return_next({
 430             the_num =&gt; $row-&gt;{a},
 431             the_text =&gt; md5_hex($words[rand @words])
 432         });
 433     }
 434     return;
 435 $$ LANGUAGE plperlu;
 436
 437 SELECT * from lotsa_md5(500);
 438 </programlisting>
 439     </para>
 440
 441     <para>
 442     <literal>spi_prepare</literal>, <literal>spi_query_prepared</literal>, <literal>spi_exec_prepared</literal>,
 443     and <literal>spi_freeplan</literal> implement the same functionality but for prepared queries. Once
 444     a query plan is prepared by a call to <literal>spi_prepare</literal>, the plan can be used instead
 445     of the string query, either in <literal>spi_exec_prepared</literal>, where the result is the same as returned
 446     by <literal>spi_exec_query</literal>, or in <literal>spi_query_prepared</literal> which returns a cursor
 447     exactly as <literal>spi_query</literal> does, which can be later passed to <literal>spi_fetchrow</literal>.
 448     </para>
 449
 450     <para>
 451     The advantage of prepared queries is that is it possible to use one prepared plan for more
 452     than one query execution. After the plan is not needed anymore, it can be freed with
 453     <literal>spi_freeplan</literal>:
 454     </para>
 455
 456     <para>
 457     <programlisting>
 458 CREATE OR REPLACE FUNCTION init() RETURNS INTEGER AS $$
 459         $_SHARED{my_plan} = spi_prepare( 'SELECT (now() + $1)::date AS now', 'INTERVAL');
 460 $$ LANGUAGE plperl;
 461
 462 CREATE OR REPLACE FUNCTION add_time( INTERVAL ) RETURNS TEXT AS $$
 463         return spi_exec_prepared(
 464                 $_SHARED{my_plan},
 465                 $_[0],
 466         )->{rows}->[0]->{now};
 467 $$ LANGUAGE plperl;
 468
 469 CREATE OR REPLACE FUNCTION done() RETURNS INTEGER AS $$
 470         spi_freeplan( $_SHARED{my_plan});
 471         undef $_SHARED{my_plan};
 472 $$ LANGUAGE plperl;
 473
 474 SELECT init();
 475 SELECT add_time('1 day'), add_time('2 days'), add_time('3 days');
 476 SELECT done();
 477
 478   add_time  |  add_time  |  add_time
 479 ------------+------------+------------
 480  2005-12-10 | 2005-12-11 | 2005-12-12
 481     </programlisting>
 482     </para>
 483
 484     <para>
 485     Note that the parameter subscript in <literal>spi_prepare</literal> is defined via
 486     $1, $2, $3, etc, so avoid declaring query strings in double quotes that might easily
 487     lead to hard-to-catch bugs.
 488     </para>
 489
 490     <para>
 491      Normally, <function>spi_fetchrow</> should be repeated until it
 492      returns <literal>undef</literal>, indicating that there are no more
 493      rows to read.  The cursor is automatically freed when
 494      <function>spi_fetchrow</> returns <literal>undef</literal>.
 495      If you do not wish to read all the rows, instead call
 496      <function>spi_cursor_close</> to free the cursor.
 497      Failure to do so will result in memory leaks.
 498     </para>
 499      </listitem>
 500     </varlistentry>
 501
 502     <varlistentry>
 503      <indexterm>
 504       <primary>elog</primary>
 505       <secondary>in PL/Perl</secondary>
 506      </indexterm>
 507
 508      <term><literal><function>elog</>(<replaceable>level</replaceable>, <replaceable>msg</replaceable>)</literal></term>
 509      <listitem>
 510       <para>
 511        Emit a log or error message. Possible levels are
 512        <literal>DEBUG</>, <literal>LOG</>, <literal>INFO</>,
 513        <literal>NOTICE</>, <literal>WARNING</>, and <literal>ERROR</>.
 514        <literal>ERROR</>
 515         raises an error condition; if this is not trapped by the surrounding
 516         Perl code, the error propagates out to the calling query, causing
 517         the current transaction or subtransaction to be aborted.  This
 518         is effectively the same as the Perl <literal>die</> command.
 519         The other levels only generate messages of different
 520         priority levels.
 521         Whether messages of a particular priority are reported to the client,
 522         written to the server log, or both is controlled by the
 523         <xref linkend="guc-log-min-messages"> and
 524         <xref linkend="guc-client-min-messages"> configuration
 525         variables. See <xref linkend="runtime-config"> for more
 526         information.
 527       </para>
 528      </listitem>
 529     </varlistentry>
 530    </variablelist>
 531   </para>
 532  </sect1>
 533
 534  <sect1 id="plperl-data">
 535   <title>Data Values in PL/Perl</title>
 536
 537   <para>
 538    The argument values supplied to a PL/Perl function's code are
 539    simply the input arguments converted to text form (just as if they
 540    had been displayed by a <command>SELECT</command> statement).
 541    Conversely, the <literal>return</> command will accept any string
 542    that is acceptable input format for the function's declared return
 543    type.  So, within the PL/Perl function,
 544    all values are just text strings.
 545   </para>
 546  </sect1>
 547
 548  <sect1 id="plperl-global">
 549   <title>Global Values in PL/Perl</title>
 550
 551   <para>
 552     You can use the global hash <varname>%_SHARED</varname> to store
 553     data, including code references, between function calls for the
 554     lifetime of the current session.
 555   </para>
 556
 557   <para>
 558     Here is a simple example for shared data:
 559 <programlisting>
 560 CREATE OR REPLACE FUNCTION set_var(name text, val text) RETURNS text AS $$
 561     if ($_SHARED{$_[0]} = $_[1]) {
 562         return 'ok';
 563     } else {
 564         return "cannot set shared variable $_[0] to $_[1]";
 565     }
 566 $$ LANGUAGE plperl;
 567
 568 CREATE OR REPLACE FUNCTION get_var(name text) RETURNS text AS $$
 569     return $_SHARED{$_[0]};
 570 $$ LANGUAGE plperl;
 571
 572 SELECT set_var('sample', 'Hello, PL/Perl!  How's tricks?');
 573 SELECT get_var('sample');
 574 </programlisting>
 575   </para>
 576
 577   <para>
 578    Here is a slightly more complicated example using a code reference:
 579
 580 <programlisting>
 581 CREATE OR REPLACE FUNCTION myfuncs() RETURNS void AS $$
 582     $_SHARED{myquote} = sub {
 583         my $arg = shift;
 584         $arg =~ s/(['\\])/\\$1/g;
 585         return "'$arg'";
 586     };
 587 $$ LANGUAGE plperl;
 588
 589 SELECT myfuncs(); /* initializes the function */
 590
 591 /* Set up a function that uses the quote function */
 592
 593 CREATE OR REPLACE FUNCTION use_quote(TEXT) RETURNS text AS $$
 594     my $text_to_quote = shift;
 595     my $qfunc = $_SHARED{myquote};
 596     return &amp;$qfunc($text_to_quote);
 597 $$ LANGUAGE plperl;
 598 </programlisting>
 599
 600    (You could have replaced the above with the one-liner
 601    <literal>return $_SHARED{myquote}-&gt;($_[0]);</literal>
 602    at the expense of readability.)
 603   </para>
 604  </sect1>
 605
 606  <sect1 id="plperl-trusted">
 607   <title>Trusted and Untrusted PL/Perl</title>
 608
 609   <indexterm zone="plperl-trusted">
 610    <primary>trusted</primary>
 611    <secondary>PL/Perl</secondary>
 612   </indexterm>
 613
 614   <para>
 615    Normally, PL/Perl is installed as a <quote>trusted</> programming
 616    language named <literal>plperl</>.  In this setup, certain Perl
 617    operations are disabled to preserve security.  In general, the
 618    operations that are restricted are those that interact with the
 619    environment. This includes file handle operations,
 620    <literal>require</literal>, and <literal>use</literal> (for
 621    external modules).  There is no way to access internals of the
 622    database server process or to gain OS-level access with the
 623    permissions of the server process,
 624    as a C function can do.  Thus, any unprivileged database user can
 625    be permitted to use this language.
 626   </para>
 627
 628   <para>
 629    Here is an example of a function that will not work because file
 630    system operations are not allowed for security reasons:
 631 <programlisting>
 632 CREATE FUNCTION badfunc() RETURNS integer AS $$
 633     my $tmpfile = "/tmp/badfile";
 634     open my $fh, '&gt;', $tmpfile
 635         or elog(ERROR, qq{could not open the file "$tmpfile": $!});
 636     print $fh "Testing writing to a file\n";
 637     close $fh or elog(ERROR, qq{could not close the file "$tmpfile": $!});
 638     return 1;
 639 $$ LANGUAGE plperl;
 640 </programlisting>
 641     The creation of this function will fail as its use of a forbidden
 642     operation will be caught by the validator.
 643   </para>
 644
 645   <para>
 646    Sometimes it is desirable to write Perl functions that are not
 647    restricted.  For example, one might want a Perl function that sends
 648    mail.  To handle these cases, PL/Perl can also be installed as an
 649    <quote>untrusted</> language (usually called
 650    <application>PL/PerlU</application><indexterm><primary>PL/PerlU</></indexterm>).
 651    In this case the full Perl language is available.  If the
 652    <command>createlang</command> program is used to install the
 653    language, the language name <literal>plperlu</literal> will select
 654    the untrusted PL/Perl variant.
 655   </para>
 656
 657   <para>
 658    The writer of a <application>PL/PerlU</> function must take care that the function
 659    cannot be used to do anything unwanted, since it will be able to do
 660    anything that could be done by a user logged in as the database
 661    administrator.  Note that the database system allows only database
 662    superusers to create functions in untrusted languages.
 663   </para>
 664
 665   <para>
 666    If the above function was created by a superuser using the language
 667    <literal>plperlu</>, execution would succeed.
 668   </para>
 669
 670   <note>
 671     <para>
 672       For security reasons, to stop a leak of privileged operations from
 673       <application>PL/PerlU</> to <application>PL/Perl</>, these two languages
 674       have to run in separate instances of the Perl interpreter. If your
 675       Perl installation has been appropriately compiled, this is not a problem.
 676       However, not all installations are compiled with the requisite flags.
 677       If <productname>PostgreSQL</> detects that this is the case then it will
 678       not start a second interpreter, but instead create an error. In
 679       consequence, in such an installation, you cannot use both
 680       <application>PL/PerlU</> and <application>PL/Perl</> in the same backend
 681       process. The remedy for this is to obtain a Perl installation created
 682       with the appropriate flags, namely either <literal>usemultiplicity</> or
 683       both <literal>usethreads</> and <literal>useithreads</>.
 684       For more details,see the <literal>perlembed</> manual page.
 685     </para>
 686   </note>
 687
 688  </sect1>
 689
 690  <sect1 id="plperl-triggers">
 691   <title>PL/Perl Triggers</title>
 692
 693   <para>
 694    PL/Perl can be used to write trigger functions.  In a trigger function,
 695    the hash reference <varname>$_TD</varname> contains information about the
 696    current trigger event. <varname>$_TD</> is a global variable,
 697    which gets a separate local value for each invocation of the trigger.
 698    The fields of the <varname>$_TD</varname> hash reference are:
 699
 700    <variablelist>
 701     <varlistentry>
 702      <term><literal>$_TD-&gt;{new}{foo}</literal></term>
 703      <listitem>
 704       <para>
 705        <literal>NEW</literal> value of column <literal>foo</literal>
 706       </para>
 707      </listitem>
 708     </varlistentry>
 709
 710     <varlistentry>
 711      <term><literal>$_TD-&gt;{old}{foo}</literal></term>
 712      <listitem>
 713       <para>
 714        <literal>OLD</literal> value of column <literal>foo</literal>
 715       </para>
 716      </listitem>
 717     </varlistentry>
 718
 719     <varlistentry>
 720      <term><literal>$_TD-&gt;{name}</literal></term>
 721      <listitem>
 722       <para>
 723        Name of the trigger being called
 724       </para>
 725      </listitem>
 726     </varlistentry>
 727
 728     <varlistentry>
 729      <term><literal>$_TD-&gt;{event}</literal></term>
 730      <listitem>
 731       <para>
 732        Trigger event: <literal>INSERT</>, <literal>UPDATE</>,
 733        <literal>DELETE</>, <literal>TRUNCATE</>, or <literal>UNKNOWN</>
 734       </para>
 735      </listitem>
 736     </varlistentry>
 737
 738     <varlistentry>
 739      <term><literal>$_TD-&gt;{when}</literal></term>
 740      <listitem>
 741       <para>
 742        When the trigger was called: <literal>BEFORE</literal>, <literal>AFTER</literal>, or <literal>UNKNOWN</literal>
 743       </para>
 744      </listitem>
 745     </varlistentry>
 746
 747     <varlistentry>
 748      <term><literal>$_TD-&gt;{level}</literal></term>
 749      <listitem>
 750       <para>
 751        The trigger level: <literal>ROW</literal>, <literal>STATEMENT</literal>, or <literal>UNKNOWN</literal>
 752       </para>
 753      </listitem>
 754     </varlistentry>
 755
 756     <varlistentry>
 757      <term><literal>$_TD-&gt;{relid}</literal></term>
 758      <listitem>
 759       <para>
 760        OID of the table on which the trigger fired
 761       </para>
 762      </listitem>
 763     </varlistentry>
 764
 765     <varlistentry>
 766      <term><literal>$_TD-&gt;{table_name}</literal></term>
 767      <listitem>
 768       <para>
 769        Name of the table on which the trigger fired
 770       </para>
 771      </listitem>
 772     </varlistentry>
 773
 774     <varlistentry>
 775      <term><literal>$_TD-&gt;{relname}</literal></term>
 776      <listitem>
 777       <para>
 778        Name of the table on which the trigger fired. This has been deprecated,
 779        and could be removed in a future release.
 780        Please use $_TD-&gt;{table_name} instead.
 781       </para>
 782      </listitem>
 783     </varlistentry>
 784
 785     <varlistentry>
 786      <term><literal>$_TD-&gt;{table_schema}</literal></term>
 787      <listitem>
 788       <para>
 789        Name of the schema in which the table on which the trigger fired, is
 790       </para>
 791      </listitem>
 792     </varlistentry>
 793
 794     <varlistentry>
 795      <term><literal>$_TD-&gt;{argc}</literal></term>
 796      <listitem>
 797       <para>
 798        Number of arguments of the trigger function
 799       </para>
 800      </listitem>
 801     </varlistentry>
 802
 803     <varlistentry>
 804      <term><literal>@{$_TD-&gt;{args}}</literal></term>
 805      <listitem>
 806       <para>
 807        Arguments of the trigger function.  Does not exist if <literal>$_TD-&gt;{argc}</literal> is 0.
 808       </para>
 809      </listitem>
 810     </varlistentry>
 811
 812    </variablelist>
 813   </para>
 814
 815   <para>
 816    Row-level triggers can return one of the following:
 817
 818    <variablelist>
 819     <varlistentry>
 820      <term><literal>return;</literal></term>
 821      <listitem>
 822       <para>
 823        Execute the operation
 824       </para>
 825      </listitem>
 826     </varlistentry>
 827
 828     <varlistentry>
 829      <term><literal>"SKIP"</literal></term>
 830      <listitem>
 831       <para>
 832        Don't execute the operation
 833       </para>
 834      </listitem>
 835     </varlistentry>
 836
 837     <varlistentry>
 838      <term><literal>"MODIFY"</literal></term>
 839      <listitem>
 840       <para>
 841        Indicates that the <literal>NEW</literal> row was modified by
 842        the trigger function
 843       </para>
 844      </listitem>
 845     </varlistentry>
 846    </variablelist>
 847   </para>
 848
 849   <para>
 850    Here is an example of a trigger function, illustrating some of the
 851    above:
 852 <programlisting>
 853 CREATE TABLE test (
 854     i int,
 855     v varchar
 856 );
 857
 858 CREATE OR REPLACE FUNCTION valid_id() RETURNS trigger AS $$
 859     if (($_TD-&gt;{new}{i} &gt;= 100) || ($_TD-&gt;{new}{i} &lt;= 0)) {
 860         return "SKIP";    # skip INSERT/UPDATE command
 861     } elsif ($_TD-&gt;{new}{v} ne "immortal") {
 862         $_TD-&gt;{new}{v} .= "(modified by trigger)";
 863         return "MODIFY";  # modify row and execute INSERT/UPDATE command
 864     } else {
 865         return;           # execute INSERT/UPDATE command
 866     }
 867 $$ LANGUAGE plperl;
 868
 869 CREATE TRIGGER test_valid_id_trig
 870     BEFORE INSERT OR UPDATE ON test
 871     FOR EACH ROW EXECUTE PROCEDURE valid_id();
 872 </programlisting>
 873   </para>
 874  </sect1>
 875
 876  <sect1 id="plperl-missing">
 877   <title>Limitations and Missing Features</title>
 878
 879   <para>
 880    The following features are currently missing from PL/Perl, but they
 881    would make welcome contributions.
 882
 883    <itemizedlist>
 884     <listitem>
 885      <para>
 886       PL/Perl functions cannot call each other directly (because they
 887       are anonymous subroutines inside Perl).
 888      </para>
 889     </listitem>
 890
 891     <listitem>
 892      <para>
 893       SPI is not yet fully implemented.
 894      </para>
 895     </listitem>
 896
 897     <listitem>
 898      <para>
 899       If you are fetching very large data sets using
 900       <literal>spi_exec_query</literal>, you should be aware that
 901       these will all go into memory.  You can avoid this by using
 902       <literal>spi_query</literal>/<literal>spi_fetchrow</literal> as
 903       illustrated earlier.
 904      </para>
 905      <para>
 906         A similar problem occurs if a set-returning function passes a
 907         large set of rows back to PostgreSQL via <literal>return</literal>. You
 908         can avoid this problem too by instead using
 909         <literal>return_next</literal> for each row returned, as shown
 910         previously.
 911      </para>
 912
 913     </listitem>
 914    </itemizedlist>
 915   </para>
 916  </sect1>
 917
 918 </chapter>