@comment node-name, next, previous, up
@chapter Signal handling

@menu
* Groups of signals::
* The deferral mechanism::
* Implementation warts::
* Programming with signal handling in mind::
@end menu

@node Groups of signals
@section Groups of signals

There are two distinct groups of signals.

@subsection Semi-synchronous signals

The first group, tentatively named ``semi-synchronous'', consists of
signals that are raised on an illegal instruction, on hitting a
protected page, or on a trap. Examples from this group are
@code{SIGBUS}/@code{SIGSEGV}, @code{SIGTRAP}, @code{SIGILL} and
@code{SIGEMT}. The exact meaning and function of these signals vary
by platform and OS. Understandably, because these signals are raised
in a controllable manner, they are never blocked or deferred.
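
As an illustration of a signal raised by hitting a protected page, the
following is a minimal sketch, assuming a Linux-like system. It is not
taken from the SBCL runtime: the names @code{guarded_page},
@code{fault_count} and @code{install_fault_handler} are hypothetical.
The handler runs at the very instruction that touched the page, lifts
the protection, and lets the instruction be retried.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical names, used for illustration only. */
static volatile char *guarded_page;
static size_t guarded_size;
static volatile sig_atomic_t fault_count;

/* Runs synchronously, at the instruction that touched the page. */
static void on_fault(int sig, siginfo_t *info, void *context)
{
    char *addr = (char *)info->si_addr;
    char *base = (char *)guarded_page;

    (void)context;
    if (base != NULL && addr >= base && addr < base + guarded_size) {
        fault_count++;
        /* Lift the protection; the faulting instruction is retried
           when the handler returns. */
        mprotect(base, guarded_size, PROT_READ | PROT_WRITE);
        return;
    }
    /* Not our page: restore the default action and let the fault recur. */
    signal(sig, SIG_DFL);
}

/* Maps one read-only page and installs the handler. Returns 0 on success. */
int install_fault_handler(void)
{
    struct sigaction sa;
    void *page;

    guarded_size = (size_t)sysconf(_SC_PAGESIZE);
    page = mmap(NULL, guarded_size, PROT_READ,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;
    guarded_page = page;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    /* Which of the two signals is raised depends on the platform. */
    if (sigaction(SIGSEGV, &sa, NULL) != 0 ||
        sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;
    return 0;
}
```

After @code{install_fault_handler()} succeeds, a store such as
@code{guarded_page[0] = 42} faults once, is handled, and then completes.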

@subsection Blockable signals

The other group consists of blockable signals. Typically, signal
handlers block them to protect against being interrupted at all.
@code{SIGHUP}, @code{SIGINT} and @code{SIGQUIT}, for example, belong
to this group.

With the exception of @code{SIG_STOP_FOR_GC}, all blockable signals
are deferrable.

@node The deferral mechanism
@section The deferral mechanism

@subsection Pseudo atomic sections

Some operations, such as allocation, consist of several steps and
temporarily break invariants, for instance those of the gc.
Interrupting said operations is therefore dangerous to one's health.
Blocking the signals for each allocation is out of the question, as
the overhead of the two @code{sigsetmask} system calls would be
enormous. Instead, pseudo atomic sections are implemented with a
simple flag.

When a deferrable signal is delivered to a thread within a pseudo
atomic section, the pseudo-atomic-interrupted flag is set, the signal
and its context are stored, and all deferrable signals are blocked.
This is to guarantee that there is at most one pending handler in
SBCL. While the signals are blocked, the responsibility of keeping
track of other pending signals lies with the OS.

On leaving the pseudo atomic section, the pending handler is run and
the signals are unblocked.

@subsection @code{WITHOUT-INTERRUPTS}

Similar to pseudo atomic, @code{WITHOUT-INTERRUPTS} defers deferrable
signals in its thread until the end of its body, provided it is not
nested in another @code{WITHOUT-INTERRUPTS}.

Not as frequently used as pseudo atomic, @code{WITHOUT-INTERRUPTS}
benefits less from the deferral mechanism.

@subsection Stop the world

Something of a special case, a signal that is blockable but not
deferrable by @code{WITHOUT-INTERRUPTS} is @code{SIG_STOP_FOR_GC}. It
is deferred by pseudo atomic and @code{WITHOUT-GCING}.

@node Implementation warts
@section Implementation warts

@subsection RT signals

Sending and receiving the same number of signals is crucial for
@code{INTERRUPT-THREAD} and @code{SIG_STOP_FOR_GC}, hence both rely on
real-time signals, for which the kernel maintains a queue, as opposed
to just setting a flag such as ``sigint pending''.
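
The difference between a queue and a flag can be observed with a small
experiment. The sketch below is a generic illustration assuming Linux
(where @code{SIGRTMIN} is available), not SBCL code; the function name
@code{send_blocked_burst} is hypothetical. While both signals are
blocked, it sends the same number of @code{SIGUSR1} and
@code{SIGRTMIN} instances, then unblocks them and counts deliveries.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t standard_deliveries;
static volatile sig_atomic_t realtime_deliveries;

static void count_delivery(int sig, siginfo_t *info, void *context)
{
    (void)info;
    (void)context;
    if (sig == SIGUSR1)
        standard_deliveries++;
    else
        realtime_deliveries++;
}

/* Sends `count` instances each of SIGUSR1 and SIGRTMIN while both are
   blocked, then restores the previous mask. Returns 0 on success. */
int send_blocked_burst(int count)
{
    struct sigaction sa;
    sigset_t set, old;
    union sigval value;
    int i, status = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = count_delivery;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0 ||
        sigaction(SIGRTMIN, &sa, NULL) != 0)
        return -1;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    sigaddset(&set, SIGRTMIN);
    if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    value.sival_int = 0;
    for (i = 0; i < count && status == 0; i++) {
        /* The standard signal only sets a pending flag; the real-time
           signal is appended to the kernel's queue. */
        if (raise(SIGUSR1) != 0 ||
            sigqueue(getpid(), SIGRTMIN, value) != 0)
            status = -1;
    }
    /* Restoring the mask delivers whatever is pending. */
    if (sigprocmask(SIG_SETMASK, &old, NULL) != 0)
        status = -1;
    return status;
}
```

On Linux, a burst of three leaves @code{standard_deliveries} at one
and @code{realtime_deliveries} at three, which is why a mechanism that
must count its signals cannot use a standard signal.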

Note, however, that the real-time signal queue is finite and, on
current Linux kernels, a system-wide resource. If the queue is full,
SBCL tries to signal until it succeeds. This behaviour can lead to
deadlocks if a thread in a @code{WITHOUT-INTERRUPTS} is interrupted
many times, filling up the queue, and then a gc hits and tries to send
@code{SIG_STOP_FOR_GC}.
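
A retry loop of the kind described above might look like the following
sketch. It is not the SBCL code: the helper
@code{queue_signal_with_retry} and its @code{max_attempts} parameter
are hypothetical, and the cap on attempts is an addition made here so
that the sketch can give up instead of spinning. It relies on
@code{sigqueue} failing with @code{EAGAIN} when the queue is full.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical helper. Queues sig for process pid, retrying while the
   queue is full. A max_attempts of zero or less means retry forever.
   Returns 0 once the signal is queued, -1 otherwise with errno set. */
int queue_signal_with_retry(pid_t pid, int sig, int max_attempts)
{
    union sigval value;
    int attempt;

    value.sival_int = 0;
    for (attempt = 0; max_attempts <= 0 || attempt < max_attempts;
         attempt++) {
        if (sigqueue(pid, sig, value) == 0)
            return 0;
        if (errno != EAGAIN)
            return -1;      /* a hard error, retrying will not help */
        sched_yield();      /* give the receiver a chance to drain */
    }
    errno = EAGAIN;
    return -1;
}
```

With an unbounded retry, the sender makes progress only if the
receiver drains the queue, which is exactly what a thread inside
@code{WITHOUT-INTERRUPTS} does not do.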

@subsection Miscellaneous issues

Signal handlers should automatically restore errno and fp
state. Currently, this is not the case.

@subsection POSIX -- Letter and Spirit

POSIX restricts signal handlers to use only a narrow subset of POSIX
functions, and declares anything else to have undefined semantics.

Apparently the real reason is that a signal handler is potentially
interrupting a POSIX call: so the signal safety requirement is really
a re-entrancy requirement. We can work around the letter of the
standard by arranging to handle the interrupt when the signal handler
returns (see @code{arrange_return_to_lisp_function}). This does,
however, in no way protect us from the real issue of re-entrancy: even
though we would no longer be in a signal handler, we might still be in
the middle of an interrupted POSIX call.

For some signals this appears to be a non-issue: @code{SIGSEGV} and
other semi-synchronous signals are raised by our code for our code,
and so we can be sure that we are not interrupting a POSIX call with
them.

For asynchronous signals like @code{SIGALRM} and @code{SIGINT} this
is a real issue.

The right thing to do in multithreaded builds would probably be to use
POSIX semaphores (which are signal safe) to inform a separate handler
thread about such asynchronous events. In single-threaded builds there
does not seem to be any other option aside from generally blocking
asynchronous signals and listening for them every once in a while at
safe points. Neither of these is implemented as of SBCL 1.0.4.
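
The single-threaded option, blocking asynchronous signals and polling
for them at safe points, could be sketched as follows. This is an
assumption-laden illustration, not something SBCL implements: the
names @code{block_async_signals} and @code{poll_at_safe_point} are
hypothetical, and the choice of @code{sigtimedwait} with a zero
timeout as the polling primitive is made here for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static sigset_t async_signals;

/* Never actually runs while the signals stay blocked; installing it
   ensures an inherited "ignore" disposition does not discard them. */
static void ignore_delivery(int sig)
{
    (void)sig;
}

/* Blocks SIGALRM and SIGINT for the rest of the program. */
int block_async_signals(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = ignore_delivery;
    sigemptyset(&sa.sa_mask);

    sigemptyset(&async_signals);
    sigaddset(&async_signals, SIGALRM);
    sigaddset(&async_signals, SIGINT);
    if (sigaction(SIGALRM, &sa, NULL) != 0 ||
        sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    return sigprocmask(SIG_BLOCK, &async_signals, NULL);
}

/* Called at a safe point. Returns the number of one pending signal
   (consuming it), 0 if none is pending, or -1 on error. */
int poll_at_safe_point(void)
{
    struct timespec no_wait = {0, 0};
    int sig = sigtimedwait(&async_signals, NULL, &no_wait);

    if (sig > 0)
        return sig;
    return errno == EAGAIN ? 0 : -1;
}
```

Because the signal is consumed by ordinary code at a point of the
program's choosing, whatever runs in response is no longer restricted
to async-signal-safe functions.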

Currently all our handlers invoke unsafe functions without hesitation.

@node Programming with signal handling in mind
@section Programming with signal handling in mind

@subsection On reentrancy

Since they might be invoked in the middle of just about anything,
signal handlers must invoke only reentrant functions or, to be more
precise, async-signal-safe functions. Functions passed to
@code{INTERRUPT-THREAD} have the same restrictions and considerations
as signal handlers.

Destructive modification, and holding mutexes to protect destructive
modifications from interfering with each other, are often the cause of
non-reentrancy. Recursive locks are not likely to help, and while
@code{WITHOUT-INTERRUPTS} is, it is considered untrendy to litter the
code with it.

Some basic functionality, such as streams and the debugger, is
intended to be reentrant, but not much effort has been spent on
verifying that.

@subsection More deadlocks

If functions A and B directly or indirectly lock mutexes M and N, they
should do so in the same order to avoid deadlocks.
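
One common way to guarantee a single order, shown here as a sketch and
not as something SBCL does, is to sort the locks by address before
acquiring them. The helper names @code{order_locks},
@code{lock_both} and @code{unlock_both} are hypothetical, and the two
mutexes are assumed to be distinct.

```c
#include <pthread.h>
#include <stdint.h>

/* Rearranges the two pointers so that *first has the lower address. */
void order_locks(pthread_mutex_t **first, pthread_mutex_t **second)
{
    if ((uintptr_t)*first > (uintptr_t)*second) {
        pthread_mutex_t *tmp = *first;
        *first = *second;
        *second = tmp;
    }
}

/* Locks two distinct mutexes in address order, whatever order the
   caller names them in. Returns 0 or a pthread error number. */
int lock_both(pthread_mutex_t *m, pthread_mutex_t *n)
{
    int err;

    order_locks(&m, &n);
    err = pthread_mutex_lock(m);
    if (err != 0)
        return err;
    err = pthread_mutex_lock(n);
    if (err != 0)
        pthread_mutex_unlock(m);
    return err;
}

/* Unlocks in the reverse of the acquisition order. */
int unlock_both(pthread_mutex_t *m, pthread_mutex_t *n)
{
    int err_n, err_m;

    order_locks(&m, &n);
    err_n = pthread_mutex_unlock(n);
    err_m = pthread_mutex_unlock(m);
    return err_n != 0 ? err_n : err_m;
}
```

With such a helper, a function that calls @code{lock_both(&M, &N)} and
another that calls @code{lock_both(&N, &M)} still acquire the mutexes
in the same order.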

A less trivial scenario is where there is only one lock involved, but
it is acquired inside a @code{WITHOUT-GCING} in thread A, and outside
of @code{WITHOUT-GCING} in thread B. If thread A has entered
@code{WITHOUT-GCING} but thread B has the lock when the gc hits, then
A cannot leave @code{WITHOUT-GCING} because it is waiting for the lock
held by the already suspended thread B. From this scenario one can
easily derive the rule: in a @code{WITHOUT-GCING} form (or pseudo
atomic for that matter) never wait for another thread that's not in
@code{WITHOUT-GCING}.

@subsection Calling user code

For the reasons above, calling user code (that is, functions passed
in, or in other words code that one cannot reason about) from
non-reentrant code (holding locks), @code{WITHOUT-INTERRUPTS} or
@code{WITHOUT-GCING} is dangerous and best avoided.