docs/pmc/subs.pod

   1 =head1 NAME
   2
   3 Subs - Parrot Subroutines
   4
   5 =head1 VERSION
   6
   7 $Revision$
   8
   9 =head1 ABSTRACT
  10
  11 This document describes how to define, call, and return from Parrot subroutine
  12 objects and other invokables.
  13
  14 =head1 DESCRIPTION
  15
  16 Parrot provides several subroutine and related classes that implement CPS
  17 (Continuation Passing Style) and PCC (Parrot Calling Conventions; see
  18 F<docs/pdds/pdd03_calling_conventions.pod>).
  19
  20 =head2 Class Tree
  21
  22 These are all of the built-in classes that are directly callable, or
  23 "invokable":
  24
  25   Sub
  26     Closure
  27     Coroutine
  28     Eval
  29   Continuation
  30     RetContinuation
  31     Exception_Handler
  32
  33 By "invokable" we mean that they can be supplied as the first argument to the
  34 C<invoke>, C<invokecc>, or C<tailcall> instructions.  Generally speaking,
  35 invokable objects are divided into two subtypes:  C<Sub> and classes that are
  36 built on it create a new context when invoked, and C<Continuation> classes
  37 return control to an existing context that was captured when the
  38 C<Continuation> was created.
  39
  40 There are (of course) two classes that straddle this distinction:
  41
  42 =over 4
  43
  44 =item 1.
  45
  46 Invoking a C<Closure> object creates a new context for the sub it refers to
  47 directly, but it also captures an "outer" context that provides bindings for
  48 the immediately-enclosing lexical scope (and, if that context is itself for
  49 a C<Closure>, the subsequent scopes working outwards).
  50
  51 [add a C<newclosure> example?  -- rgr, 6-Apr-08.]
  52
  53 =item 2.
  54
  55 A C<Coroutine> acts like a normal sub when called initially, and can also
  56 return normally, but acts like a continuation when exited via the C<yield>
  57 instruction and re-entered by re-invoking.
  58
  59 [need a reference to a C<coroutine> example.  -- rgr, 6-Apr-08.]
  60
  61 =back
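
     As a rough illustration of item 1, the following sketch uses C<newclosure>
     to capture a lexical.  It assumes the C<.lex>, C<find_lex>, and C<:outer>
     syntax of PDD 20 (lexical variables); the sub names C<make_counter> and
     C<counter_body> are invented for this example:

         .sub main :main
             .local pmc counter
             counter = make_counter()
             ## Each call increments the same captured "n".
             $P0 = counter()
             print $P0
             print "\n"
             $P0 = counter()
             print $P0
             print "\n"
         .end

         .sub make_counter
             .local pmc n
             n = new 'Integer'
             n = 0
             ## Store "n" in this sub's lexical pad.
             .lex 'n', n
             .const .Sub body = 'counter_body'
             ## Capture the current context as the closure's outer scope.
             $P0 = newclosure body
             .return ($P0)
         .end

         .sub counter_body :outer('make_counter')
             ## Look up "n" in the captured outer context.
             $P0 = find_lex 'n'
             inc $P0
             .return ($P0)
         .end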
  62
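     As a rough illustration of item 2, the following sketch shows a sub that
     yields.  It assumes the short C<.yield> directive is available and that a
     sub containing it is compiled as a C<Coroutine>; the name C<next_value> is
     invented for this example:

         .sub main :main
             ## The first call enters the coroutine like a normal sub.
             $I0 = next_value()
             print $I0
             print "\n"
             ## The second call resumes just after the first yield.
             $I0 = next_value()
             print $I0
             print "\n"
         .end

         .sub next_value
             .yield (1)
             .yield (2)
             .return (3)
         .end
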
  63 =head1 SYNOPSIS
  64
  65 =head2 Creating subs
  66
  67 Subs are created by IMCC (the PIR compiler) via the B<.sub> directive.  Unless
  68 the C<:anon> pragma is included, they are stored in the constant table
  69 associated with the bytecode and can be fetched with the B<get_hll_global> and
  70 B<get_root_global> opcodes.  Within the PIR source, they can also be put in
  71 registers with a C<.const .Sub> declaration:
  72
  73     .const .Sub rsub = 'random_sub'
  74
  75 This uses C<find_sub_not_null> under the hood to look up the sub named
  76 "random_sub".
  77
  78 Here's an example of fetching a sub from another namespace:
  79
  80     .sub main :main
  81         get_hll_global $P0, ['Other'; 'Namespace'], "the_sub"
  82         $P0()
  83         print "back\n"
  84     .end
  85
  86     .namespace ['Other'; 'Namespace']
  87
  88     .sub the_sub
  89         print "in sub\n"
  90     .end
  91
  92 Note that C<the_sub> could be defined in a different bytecode or PIR source
  93 file from C<main>.
  94
  95 =head2 Program entry point
  96
  97 One subroutine in the first executed source or bytecode file may be flagged as
  98 the "main" subroutine, where execution starts.
  99
 100   .sub the_main_event :main
 101
 102 In the absence of a B<:main> entry, Parrot starts execution at the first
 103 statement.  Any C<:main> directives in subsequent PIR or bytecode files that
 104 are loaded under program control are ignored.
 105
 106 Note that if the first executed source or bytecode file contains more than one
 107 sub flagged as C<:main>, Parrot currently picks the I<last> such sub to start
 108 execution.  This is arguably a bug, so users should not depend upon it.
 109
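     A small sketch of a file in which the entry point is not the first sub.
     The name C<helper> is invented for this example; execution should begin in
     C<the_main_event> because of its C<:main> flag:

         .sub helper
             print "in helper\n"
         .end

         .sub the_main_event :main
             print "starting in the_main_event\n"
             helper()
         .end
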
 110 =head2 Load-time initialization
 111
 112 If a subroutine is marked as B<:load>, it is run before the
 113 B<load_bytecode> opcode returns.
 114
 115 For example:
 116
 117   .sub main :main
 118      print "in main\n"
 119      load_bytecode "library_code.pir"
 120      print "back to main\n"
 121   .end
 122
 123   # library_code.pir
 124
 125   .sub _my_lib_init :load
 126      print "initializing library\n"
 127   .end
 128
 129 If a subroutine is marked as B<:init>, it is run before the B<:main> sub or
 130 the first subroutine in the source file runs.  Unlike B<:main> subs, B<:init>
 131 subs are also run when compiling from memory.  B<:load> subs are run only for
 132 source or bytecode files that are loaded subsequently.
 133
 134 These markers are called "pragmas", and are defined fully in
 135 L<docs/pdds/draft/pdd19_pir.pod>.  The following table summarizes the behavior
 136 of the five pragmas that cause Parrot to run a sub implicitly:
 137
 138                 ------ Executed when --------
 139                 compiling to    -- loading --
 140   Sub Pragma    disk  memory    first   after
 141   ==========    ====  ======    =====   =====
 142    :immediate   yes   yes       no      no
 143    :postcomp    yes   no        no      no
 144    :load        no    no        no      yes
 145    :init        no    yes       yes     no
 146    :main        no    no        yes     no
 147
 148 The same load-time behavior applies regardless of whether the loaded file is
 149 PIR source or bytecode.  Note that it is possible to mark a sub with both
 150 B<:load> and B<:init>.
 151
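     A sketch of combining the two pragmas, so that the same setup sub runs
     whether its file is the first one executed or is loaded later.  The name
     C<_setup> is invented for this example:

         .sub _setup :load :init
             print "setting up\n"
         .end
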
 152 =head2 Calling the sub
 153
 154 PIR sub invocation syntax is similar to HLL syntax:
 155
 156     $P0 = do_something($P1, $S3)
 157
 158 This is syntactic sugar for the following four bytecode instructions:
 159
 160     # Establish arguments.
 161     set_args '(0,0)', $P1, $S3
 162     # Find the sub.
 163     $P8 = find_sub_not_null "do_something"
 164     # Establish return values.
 165     get_results '(0)', $P0
 166     # Call the sub in $P8, implicitly creating a return continuation.
 167     invokecc $P8
 168
 169 The sub name could be replaced with a PMC register, in which case the
 170 C<find_sub_not_null> instruction would not be needed.  If the return values
 171 from the sub were ignored (by dropping the C<$P0 => part), the C<get_results>
 172 instruction would be omitted.  However, C<set_args> is emitted even in the
 173 case of a call without arguments.
 174
 175 The first operands to the C<set_args> and C<get_results> instructions are
 176 actually placeholders for an integer array that describes the register types.
 177 For example, the '(0,0)' for C<set_args> is replaced internally with C<[2,
 178 1]>, which means "two arguments, of type PMC and string".  Note that return
 179 values get the same register type coercion as sub parameters.  This is all
 180 described in much more detail in L<docs/pdds/pdd03_calling_conventions.pod>.
 181
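     As a further illustration of the signature arrays, here is a sketch of a
     hypothetical call C<$I0 = do_something($N1, $S2, $P3)>.  It uses the type
     codes from the example above (0 for integer, 1 for string, 2 for PMC) and
     assumes that 3 denotes a float, as given in PDD03:

         set_args '(0,0,0)', $N1, $S2, $P3    # signature becomes [3, 1, 2]
         $P8 = find_sub_not_null "do_something"
         get_results '(0)', $I0               # signature becomes [0]
         invokecc $P8
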
 182 To receive multiple values, put the register names in parentheses:
 183
 184     ($P10, $P11) = do_something($P1, $S3)
 185
 186 To test whether a value was returned, declare it C<:optional>, and follow it
 187 with an integer register declared C<:opt_flag>:
 188
 189     ($P10 :optional, $I10 :opt_flag) = do_something($P1, $S3)
 190
 191 Both of these affect only the signature provided via C<get_results>.
 192
 193 [should also describe :flat, :slurpy, :named, ..., or at least provide a
 194 reference.  -- rgr, 25-May-08.]
 195
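     Pending a fuller description, the following sketch shows C<:flat>,
     C<:slurpy>, and C<:named> as they are described in PDD03.  The sub names
     C<count_args> and C<greet> are invented for this example:

         .sub main :main
             .local pmc items
             items = new 'ResizablePMCArray'
             push items, "b"
             push items, "c"
             ## "a" is passed positionally; the elements of "items" are
             ## flattened into further positional arguments.
             count_args("a", items :flat)
             ## Pass an argument by name rather than by position.
             greet("world" :named("whom"))
         .end

         .sub count_args
             ## Collect all remaining positional arguments into one array.
             .param pmc rest :slurpy
             $I0 = elements rest
             print $I0
             print "\n"
         .end

         .sub greet
             .param string whom :named("whom")
             print "hello, "
             print whom
             print "\n"
         .end
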
 196     # Call the sub in $P8, with continuation (created earlier) in $P9.
 197     invoke $P8, $P9
 198
 199 =head2 Returning from a sub
 200
 201 PIR supports a convenient syntax for returning any number of values from a sub
 202 or closure:
 203
 204     .return ($P0, $I1, $S3)
 205
 206 Integer, float, and string constants are also accepted.  This is translated
 207 to:
 208
 209     set_returns '(0,0,0)', $P0, $I1, $S3
 210     returncc    # return by calling the current continuation
 211
 212 As for C<set_args>, the '(0,0,0)' is actually a placeholder for an integer
 213 array that describes the register types; it is replaced internally with C<[2,
 214 0, 1]>, which means "three return values, of type PMC, integer, and string".
 215
 216 Another way to return from a sub is to use tail-calling, which calls a new sub
 217 with the current continuation, so that the new sub returns directly to the
 218 caller of the old sub (i.e. without first returning to the old sub).  This
 219 passes the three values to C<another_sub> via tail-calling:
 220
 221     .return another_sub($P0, $I1, $S3)
 222
 223 This is translated into a C<set_args> instruction for the call, but with
 224 C<tailcall> instead of C<invokecc>:
 225
 226     set_args '(0,0,0)', $P0, $I1, $S3
 227     $P8 = find_sub_not_null "another_sub"
 228     tailcall $P8
 229
 230 As for calling, the sub name could be replaced with a PMC register, in which
 231 case the C<find_sub_not_null> instruction would not be needed.
 232
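     A complete sketch of a tail call, using the C<.return> form shown above.
     The sub names C<first_sub> and C<second_sub> are invented for this
     example; C<second_sub> returns its result directly to C<main>:

         .sub main :main
             $I0 = first_sub(5)
             print $I0
             print "\n"
         .end

         .sub first_sub
             .param int n
             inc n
             ## Tail call: second_sub gets our return continuation.
             .return second_sub(n)
         .end

         .sub second_sub
             .param int n
             $I0 = n * 2
             .return ($I0)
         .end
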
 233 If needed, the current continuation can be extracted and called explicitly as
 234 follows:
 235
 236     ## This is what defines .INTERPINFO_CURRENT_CONT.
 237     .include 'interpinfo.pasm'
 238     ## Store our return continuation as exit_cont.
 239     .local pmc exit_cont
 240     exit_cont = interpinfo .INTERPINFO_CURRENT_CONT
 241     ## Invoke it explicitly:
 242     invokecc exit_cont
 243     ## ... or equivalently:
 244     tailcall exit_cont
 245
 246 To return values, use C<set_args> as before.
 247
 248 =head2 All together now
 249
 250 The following complete example illustrates the typical call/return pattern:
 251
 252     .sub main :main
 253         print "in main\n"
 254         the_sub()
 255         print "back to main\n"
 256     .end
 257
 258     .sub the_sub
 259         print "in sub\n"
 260     .end
 261
 262 Notice that we are not passing or returning values here.
 263
 264 [example of passing values.  this could get pretty elaborate; look for other
 265 examples first.  -- rgr, 6-Apr-08.]
 266
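     A minimal sketch of the same pattern with values passed in both
     directions.  The sub name C<add_two> is invented for this example:

         .sub main :main
             .local int sum
             sum = add_two(3, 4)
             print sum
             print "\n"
         .end

         .sub add_two
             .param int a
             .param int b
             $I0 = a + b
             .return ($I0)
         .end
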
 267 If a short subroutine is called several times, for instance inside a loop, the
 268 creation of the return continuation can be done outside the loop:
 269
 270     .sub main :main
 271             ## Initialize the sub and the return cont.
 272             .local pmc cont
 273             cont = new 'Continuation'
 274             set_addr cont, ret_label
 275             .const .Sub rsub = 'random_sub'
 276             ## Loop initialization.
 277             .local int loop_max, i
 278             loop_max = 1000000
 279             i = 0
 280
 281             ## Main loop.
 282     again:
 283             set_args '(0)', i
 284             invoke rsub, cont
 285     ret_label:
 286             ## This is where "cont" returns.
 287             inc i
 288             if i < loop_max goto again
 289     .end
 290
 291     .sub random_sub
 292             .param int foo
 293             ## do_something
 294     .end
 295
 296 If the sub returns values, the C<get_results> must be B<after> C<ret_label> in
 297 order to receive them.
 298
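     Following that rule, the main loop of the example might be sketched as
     below if C<random_sub> returned one integer.  The register C<$I0> and the
     C<.return> line are additions made for this illustration:

         again:
                 set_args '(0)', i
                 invoke rsub, cont
         ret_label:
                 ## Results arrive when "cont" is invoked, so fetch them here.
                 get_results '(0)', $I0
                 inc i
                 if i < loop_max goto again
         .end

         .sub random_sub
                 .param int foo
                 .return (foo)
         .end
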
 299 Since this is much more obscure than the PIR calling syntax, it should only be
 300 done if there is a measurable performance advantage.  Even in this trivial
 301 example, calling "rsub(i)" is only about a third slower on x86.
 302
 303 =head1 FILES
 304
 305 F<src/pmc/sub.pmc>, F<src/pmc/closure.pmc>,
 306 F<src/pmc/continuation.pmc>, F<src/pmc/coroutine.pmc>, F<src/sub.c>,
 307 F<t/pmc/sub.t>
 308
 309 =head1 SEE ALSO
 310
 311 F<docs/pdds/pdd03_calling_conventions.pod>
 312 F<docs/pdds/draft/pdd19_pir.pod>
 313
 314 =head1 AUTHOR
 315
 316 Leopold Toetsch <lt@toetsch.at>
 317
 318 =cut
 319
 320 __END__
 321 Local Variables:
 322   fill-column:78
 323 End: