\subsection{Exceptions}
As a first non-trivial example of extension, we'll pick exceptions:
there is a mechanism in Lua, {\tt pcall()}, which essentially provides
the raw functionality to catch errors when some code is run, so
enhancing it to get full exceptions is not very difficult. We will aim
at:
\begin{itemize}
\item having a proper syntax, the kind you get in most
  exception-enabled languages;
\item being able to easily classify exceptions hierarchically;
\item being able to attach additional data to exceptions (e.g. an error
  message);
\item not interfering with the usual error mechanism;
\item supporting the ``finally'' feature, which guarantees that a piece
  of code (most often about releasing resources) will be executed.
\end{itemize}
\subsubsection{Syntax}
There are many variants of syntaxes for exceptions. I'll pick one
inspired by OCaml, which you're welcome to dislike. And in case you
dislike it, writing one which suits your taste is an excellent
exercise. So, the syntax for exceptions will be something like:
\begin{Verbatim}[fontsize=\scriptsize]

try
   <protected block of statements>
with
  <exception_1> -> <block of statements handling exception 1>
| <exception_2> -> <block of statements handling exception 2>
  ...
| <exception_n> -> <block of statements handling exception n>
end
\end{Verbatim}
Notice that OCaml lets you put an optional ``{\tt|}'' in front of the
first exception case, just to make the whole list look more regular,
and we'll accept it as well. Let's write a {\tt gg} grammar parsing
this:

\begin{Verbatim}[fontsize=\scriptsize]

trywith_parser =
   gg.sequence{ "try", mlp.block, "with", gg.optkeyword "|",
                gg.list{ gg.sequence{ mlp.expr, "->", mlp.block },
                         separators = "|", terminators = "end" },
                "end",
                builder = trywith_builder }
mlp.stat:add(trywith_parser)
mlp.lexer:add{ "try", "with", "->" }
mlp.block.terminator:add{ "|", "with" }
\end{Verbatim}
We use {\tt gg.sequence} to chain the various parsers; {\tt
  gg.optkeyword} lets us allow the optional ``{\tt|}''; {\tt gg.list}
lets us read an undetermined series of exception cases, separated by
the keyword ``{\tt|}'', until we find the terminating ``{\tt end}''
keyword. The parser delegates the building of the resulting statement
to {\tt trywith\_builder}, which will be detailed later. Finally, we
have to declare a couple of mundane things:
\begin{itemize}
\item that {\tt try}, {\tt with} and {\tt->} are keywords. If we don't
  do this, the first two will be returned by the lexer as identifiers
  instead of keywords; the latter will be read as two separate keywords
  ``{\tt-}'' and ``{\tt>}''. We don't have to declare ``{\tt|}''
  explicitly, as single-character symbols are automatically considered to
  be keywords.
\item that ``{\tt|}'' and ``{\tt with}'' can terminate a block of
  statements. Indeed, metalua needs to know when it has reached the end of
  a block, and introducing new constructions which embed blocks often
  introduces new block terminators. In our case, ``{\tt with}'' marks
  the end of the block in which exceptions are monitored, and ``{\tt|}''
  marks the beginning of a new exception handling case, and therefore
  the end of the previous case's block.
\end{itemize}
That's it for syntax, at least for now. The next step is to decide
what kind of code we will generate.

The fundamental mechanism is {\tt pcall(func, arg1, arg2, ...,
  argn)}: this call will evaluate\\
{\tt func(arg1, arg2, ..., argn)}, and:
\begin{itemize}
\item if everything goes smoothly, return {\tt true}, followed by any
  value(s) returned by {\tt func()};
\item if an error occurs, return {\tt false}, and the error object,
  most often a string describing the error encountered.
\end{itemize}
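
As a quick illustration of these two outcomes, here is a minimal plain
Lua sketch. The variable names are ours and play no role in the
extension; the second call also shows that an error object may be any
value, not only a string, which is what the rest of this section relies on.

\begin{Verbatim}[fontsize=\scriptsize]

-- Successful call: true, followed by the values returned by the function.
local ok, sum, prod = pcall(function(x, y) return x + y, x * y end, 3, 4)
print(ok, sum, prod)         --> true    7       12

-- Failing call: false, followed by the error object, here a table.
local marker = { }
local ok2, err = pcall(function() error(marker) end)
print(ok2, err == marker)    --> false   true
\end{Verbatim}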
We'll exploit this mechanism, by enclosing the guarded code in a
function, calling it inside a {\tt pcall()}, and using special error
objects to represent exceptions.

\subsubsection{Exception objects}
We want to be able to classify exceptions hierarchically: each
exception will inherit from a more generic exception, the most generic
one being simply called ``{\tt exception}''. We'll therefore design a
system which allows us to specialize an exception into a sub-exception,
and to compare two exceptions, to know whether one is a special case
of another. Comparison will be handled by the usual {\tt< > <= >=}
operators, which we'll overload through metatables. Here is an
implementation of the base exception {\tt exception}, with working
comparisons, and a {\tt new()} method which allows us to specialize an
exception. Three exceptions are derived as an example, so that
{\tt exception > exn\_invalid > exn\_nullarg} and {\tt exception >
  exn\_nomorecoffee}:
\begin{Verbatim}[fontsize=\scriptsize]

exception = { } ; exn_mt = { }
setmetatable (exception, exn_mt)

exn_mt.__le = |a,b| a==b or a<b
-- a<b holds when a is a strict descendant of b in the hierarchy:
exn_mt.__lt = |a,b| getmetatable(a)==exn_mt and
                    getmetatable(b)==exn_mt and
                    a.super and a.super<=b

function exception:new()
   local e = { super = self, new = self.new }
   setmetatable(e, getmetatable(self))
   return e
end

exn_invalid      = exception:new()
exn_nullarg      = exn_invalid:new()
exn_nomorecoffee = exception:new()
\end{Verbatim}
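
The ordering stated above can be checked with a plain Lua sketch such as
the following. It is an illustration written without metalua syntax, not
the library's code: it walks the {\tt super} chain with a loop, and it
declares everything {\tt local} so that it can be pasted into any Lua
interpreter.

\begin{Verbatim}[fontsize=\scriptsize]

local exception, exn_mt = { }, { }
setmetatable(exception, exn_mt)

-- a < b: b appears somewhere in the chain of ancestors of a.
exn_mt.__lt = function(a, b)
   local s = a.super
   while s do
      if s == b then return true end
      s = s.super
   end
   return false
end
exn_mt.__le = function(a, b) return a == b or a < b end

function exception:new()
   local e = { super = self, new = self.new }
   return setmetatable(e, getmetatable(self))
end

local exn_invalid      = exception:new()
local exn_nullarg      = exn_invalid:new()
local exn_nomorecoffee = exception:new()

print(exn_nullarg < exn_invalid)        --> true
print(exn_nullarg <= exception)         --> true
print(exn_invalid <= exn_nullarg)       --> false
print(exn_nomorecoffee <= exn_invalid)  --> false
\end{Verbatim}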
To compile a {\tt try/with} block, after having put the guarded block
into a {\tt pcall()} we need to check whether an exception was raised,
and if it has been raised, compare it with each case until we find one
that fits. If none is found (either it's an uncaught exception, or a
genuine error which is not an exception at all), it must be rethrown.

Notice that throwing an exception simply consists of sending it as
an error:
\begin{Verbatim}[fontsize=\scriptsize]

throw = error
\end{Verbatim}
To fix the picture, here is some simple code using {\tt try/with},
followed by its translation:

\begin{Verbatim}[fontsize=\scriptsize]

-- Original code:
try
   print(1)
   print(2)
   throw(exn_invalid:new("toto"))
   print("You shouldn't see that")
with
| exn_nomorecoffee -> print "you shouldn't see that: uncomparable exn"
| exn_nullarg      -> print "you shouldn't see that: too specialized exn"
| exn_invalid      -> print "exception caught correctly"
| exception        -> print "execution should never reach that far"
end
print("done")
\end{Verbatim}
\begin{Verbatim}[fontsize=\scriptsize]

-- Translated version:
local status, exn = pcall (function ()
   print(1)
   print(2)
   throw(exn_invalid:new("toto"))
   print("You shouldn't see that")
end)

if not status then
   if exn <= exn_nomorecoffee then
      print "you shouldn't see that: uncomparable exn"
   elseif exn <= exn_nullarg then
      print "you shouldn't see that: too specialized exn"
   elseif exn <= exn_invalid then
      print "exception caught correctly"
   elseif exn <= exception then
      print "execution should never reach that far"
   else error(exn) end
end
print("done")
\end{Verbatim}
In this, the only nontrivial part is the sequence of {\tt
  if/then/elseif} tests at the end. If you check the doc about AST
representation of such blocks, you'll come up with some generation
code which looks like:

\pagebreak
\begin{Verbatim}[fontsize=\scriptsize]

function trywith_builder(x)
   ---------------------------------------------------------
   -- Get the parts of the sequence:
   ---------------------------------------------------------
   local block, _, handlers = unpack(x)

   ---------------------------------------------------------
   -- [catchers] is the big [if] statement which handles errors
   -- reported by [pcall].
   ---------------------------------------------------------
   local catchers = `If{ }
   for _, x in ipairs (handlers) do
      -- insert the condition: the raised exception must be the
      -- handled one, or one of its sub-exceptions.
      table.insert (catchers, +{ exn <= -{x[1]} })
      -- insert the corresponding block to execute on success:
      table.insert (catchers, x[2])
   end

   ---------------------------------------------------------
   -- Finally, put an [else] block to rethrow uncaught errors:
   ---------------------------------------------------------
   table.insert (catchers, +{block: error (exn)})

   ---------------------------------------------------------
   -- Splice the pieces together and return the result:
   ---------------------------------------------------------
   return +{ block:
      local status, exn = pcall (function() -{block} end)
      if not status then
         -{ catchers }
      end }
end
\end{Verbatim}
\subsubsection{Not getting lost between metalevels}
This is the first non-trivial example we see, and it might require a
bit of attention in order not to get lost between metalevels. Parts of
this library must go at metalevel (i.e. modify the parser itself at
compile time), other parts must be included as regular code:
\begin{itemize}
\item {\tt trywith\_parser} and {\tt trywith\_builder} are at metalevel:
  they have to be put between \verb|-{...}|, or to be put in a file
  which is loaded through \verb|-{ require ... }|.
\item the definitions of {\tt throw}, the root {\tt exception} and the
  various derived exceptions are regular code, and must be included normally.
\end{itemize}
The whole result in a single file would therefore look like:

\begin{Verbatim}[fontsize=\scriptsize]

-{ block:
   local trywith_builder = ...
   local trywith_parser  = ...
   mlp.stat:add ...
   mlp.lexer:add ...
   mlp.block.terminator:add ... }

throw            = ...
exception        = ...
exn_mt           = ...

exn_invalid      = ...
exn_nullarg      = ...
exn_nomorecoffee = ...

-- Test code
try
   ...
with
| ... -> ...
end
\end{Verbatim}
Better yet, it should be organized into two files:
\begin{itemize}
\item the parser modifier, i.e. the content of ``{\tt-\{block:...\}}''
  above, goes into a file ``ext-syntax/exn.lua'' in Lua's path;
\item the library part, i.e. {\tt throw ... exn\_nomorecoffee ...}
  goes into a file ``ext-lib/exn.lua'' in Lua's path;
\item the sample calls them both with metalua standard lib's {\tt
    extension} function:
\end{itemize}

\begin{Verbatim}[fontsize=\scriptsize]

-{ extension "exn" }

try
   ...
with
| ... -> ...
end
\end{Verbatim}
\subsubsection{Shortcomings}
This first attempt is full of bugs, shortcomings and other
traps. Among others:
\begin{itemize}
\item Variables {\tt exn} and {\tt status} are subject to capture;
\item There is no way to put personalized data in an exception. Or,
  more accurately, there's no practical way to retrieve it in the
  exception handler.
\item What happens if there's a {\tt return} statement in the guarded
  block?
\item There's no {\tt finally} block in the construction.
\item Coroutines can't yield across a {\tt pcall()}. Therefore, a
  yield in the guarded code will cause an error.
\end{itemize}
Refining the example to address these shortcomings is left as an
exercise to the reader; we'll just give a couple of design
hints. However, a more comprehensive implementation of this exception
system is provided in metalua's standard libraries; you can consider
its sources as a solution to this exercise!
\subsubsection{Hints}
Addressing the variable capture issue is straightforward: use {\tt
  mlp.gensym()} to generate unique identifiers (which cannot capture
anything), and put anti-quotes at the appropriate places. Eventually,
metalua will include an hygienization library which will automate this
dull process.
Passing parameters to exceptions can be done by adding arbitrary
parameters to the {\tt new()} method: these parameters will be stored
in the exception, e.g. in its array part. Finally, the
syntax has to be extended so that the caught exception can be given a
name. Code such as the following should be accepted:

\begin{Verbatim}[fontsize=\scriptsize]

try
   throw (exn_invalid:new "I'm sorry Dave, I'm afraid I can't do that.")
with
| exn_invalid e -> printf ("The computer choked: %s", e[1])
end
\end{Verbatim}
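
One possible shape for such a {\tt new()} method is sketched below in
plain Lua. Storing the arguments in the array part is the choice
suggested above; everything else (local declarations, the sample
arguments) is an assumption made for the sake of a self-contained
example.

\begin{Verbatim}[fontsize=\scriptsize]

local exception, exn_mt = { }, { }
setmetatable(exception, exn_mt)

-- Extra arguments end up in the array part of the new exception.
function exception:new(...)
   local e = { super = self, new = self.new, ... }
   return setmetatable(e, getmetatable(self))
end

local exn_invalid = exception:new()
local e = exn_invalid:new("I'm sorry Dave", 42)
print(e[1], e[2], e.super == exn_invalid)   --> I'm sorry Dave  42      true
\end{Verbatim}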
The simplest way to detect user-caused returns is to create a unique
object (typically an empty table), and return it at the end of the
block. When no exception has been thrown, test whether that object was
returned: if anything other than it was returned, then propagate it (by
{\tt return}ing it again). If not, do nothing. Think about the case
when multiple values have been returned.
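
The plain Lua sketch below illustrates this idea by hand, outside of any
code generation. The names {\tt NO\_RETURN}, {\tt pack} and {\tt
  run\_guarded} are hypothetical helpers introduced for this example;
the value count is kept explicitly so that {\tt nil}s among multiple
returned values are not lost.

\begin{Verbatim}[fontsize=\scriptsize]

local unpack = table.unpack or unpack   -- works on Lua 5.1 and later

local NO_RETURN = { }                   -- unique marker object

-- Keep the number of values, so that nils are preserved.
local function pack(...) return { n = select("#", ...), ... } end

-- [guarded] stands for the protected block, wrapped in a function
-- which ends with [return NO_RETURN].
local function run_guarded(guarded)
   local r = pack(pcall(guarded))
   if not r[1] then return "error", r[2] end
   if r.n == 2 and r[2] == NO_RETURN then return "fallthrough" end
   return "return", unpack(r, 2, r.n)   -- values to propagate
end

print(run_guarded(function() return NO_RETURN end))      --> fallthrough
print(run_guarded(function() return 1, nil, 3 end))      --> return  1  nil  3
print(run_guarded(function() error("boom", 0) end))      --> error   boom
\end{Verbatim}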
The {\tt finally} block poses no special problem: just go through it,
whether an exception occurred or not. Think also about going through it
even if there's a {\tt return} to propagate.
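
As a rough sketch of the run-time behaviour, in plain Lua and with a
hypothetical helper named {\tt try\_finally}: the finalizer runs in
every case, then the error is rethrown or the returned values are
propagated. What should happen when the finalizer itself fails is left
open here.

\begin{Verbatim}[fontsize=\scriptsize]

local unpack = table.unpack or unpack   -- works on Lua 5.1 and later
local function pack(...) return { n = select("#", ...), ... } end

local function try_finally(guarded, finalizer)
   local r = pack(pcall(guarded))
   finalizer()                          -- executed on success and on error
   if not r[1] then error(r[2], 0) end  -- rethrow the error object unchanged
   return unpack(r, 2, r.n)             -- propagate the returned values
end

local log = { }
local function cleanup() log[#log + 1] = "cleanup" end

local v = try_finally(function() return 42 end, cleanup)
local ok, err = pcall(try_finally, function() error("boom", 0) end, cleanup)
print(v, ok, err, #log)                 --> 42      false   boom    2
\end{Verbatim}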
As for yielding from within the guarded code, there is a solution,
which you can find by searching Lua's mailing list archives. The idea
is to run the guarded code inside a coroutine, and check what's
returned by the coroutine run:
\begin{itemize}
\item if it's an error, treat it as a {\tt pcall()} returning false;
\item if it's a normal termination, treat it as a {\tt pcall()}
  returning true;
\item if it's a yield, propagate it to the upper level; when resumed,
  propagate the resume to the guarded code which yielded.
\end{itemize}
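
The three cases above can be sketched in plain Lua as follows. The name
{\tt copcall} is ours, and this is only a minimal illustration of the
technique, not the code shipped with metalua. The restriction concerns
Lua 5.1; from Lua 5.2 on, {\tt pcall()} itself can be yielded across.

\begin{Verbatim}[fontsize=\scriptsize]

local function copcall(f, ...)
   local co = coroutine.create(f)
   local function handle(ok, ...)
      -- error in the guarded code: behave like a failed pcall
      if not ok then return false, ... end
      -- normal termination: behave like a successful pcall
      if coroutine.status(co) == "dead" then return true, ... end
      -- yield: pass it to the upper level, then forward the resume values
      return handle(coroutine.resume(co, coroutine.yield(...)))
   end
   return handle(coroutine.resume(co, ...))
end

local step = coroutine.wrap(function()
   return copcall(function()
      local x = coroutine.yield("first")   -- yields through copcall
      return x * 2
   end)
end)

print(step())     --> first
print(step(21))   --> true    42
\end{Verbatim}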