1 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V3.1//EN"[]>
3 <book id="LKLockingGuide">
5 <title>Unreliable Guide To Locking</title>
9 <firstname>Paul</firstname>
10 <othername>Rusty</othername>
11 <surname>Russell</surname>
14 <email>rusty@linuxcare.com</email>
22 <holder>Paul Russell</holder>
27 This documentation is free software; you can redistribute
28 it and/or modify it under the terms of the GNU General Public
29 License as published by the Free Software Foundation; either
version 2 of the License, or (at your option) any later version.
35 This program is distributed in the hope that it will be
36 useful, but WITHOUT ANY WARRANTY; without even the implied
37 warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
38 See the GNU General Public License for more details.
42 You should have received a copy of the GNU General Public
43 License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
49 For more details see the file COPYING in the source
50 distribution of Linux.
57 <title>Introduction</title>
Welcome to Rusty's Remarkably Unreliable Guide to Kernel
60 Locking issues. This document describes the locking systems in
61 the Linux Kernel as we approach 2.4.
64 It looks like <firstterm linkend="gloss-smp"><acronym>SMP</acronym>
65 </firstterm> is here to stay; so everyone hacking on the kernel
these days needs to know the fundamentals of concurrency and locking for SMP.
71 <title>The Problem With Concurrency</title>
73 (Skip this if you know what a Race Condition is).
76 In a normal program, you can increment a counter like so:
79 very_important_count++;
If two instances of this code ran at (nearly) the same time, this is what you would expect to happen:
87 <title>Expected Results</title>
89 <tgroup cols=2 align=left>
93 <entry>Instance 1</entry>
94 <entry>Instance 2</entry>
100 <entry>read very_important_count (5)</entry>
104 <entry>add 1 (6)</entry>
108 <entry>write very_important_count (6)</entry>
113 <entry>read very_important_count (6)</entry>
117 <entry>add 1 (7)</entry>
121 <entry>write very_important_count (7)</entry>
129 This is what might happen:
133 <title>Possible Results</title>
135 <tgroup cols=2 align=left>
138 <entry>Instance 1</entry>
139 <entry>Instance 2</entry>
145 <entry>read very_important_count (5)</entry>
150 <entry>read very_important_count (5)</entry>
153 <entry>add 1 (6)</entry>
158 <entry>add 1 (6)</entry>
161 <entry>write very_important_count (6)</entry>
166 <entry>write very_important_count (6)</entry>
173 This overlap, where what actually happens depends on the
174 relative timing of multiple tasks, is called a race condition.
175 The piece of code containing the concurrency issue is called a
critical region. And especially since Linux started running
on SMP machines, race conditions became one of the major issues
in kernel design and implementation.
181 The solution is to recognize when these simultaneous accesses
182 occur, and use locks to make sure that only one instance can
183 enter the critical region at any time. There are many
184 friendly primitives in the Linux kernel to help you do this.
And then there are the unfriendly primitives, but I'll pretend they don't exist.
192 <title>Two Main Types of Kernel Locks: Spinlocks and Semaphores</title>
There are two main types of kernel locks. The fundamental type is the spinlock
197 (<filename class=headerfile>include/asm/spinlock.h</filename>),
198 which is a very simple single-holder lock: if you can't get the
199 spinlock, you keep trying (spinning) until you can. Spinlocks are
200 very small and fast, and can be used anywhere.
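For example, the counter from the introduction could be protected like
this (a minimal sketch: the lock and the wrapper function are invented
for illustration):

<programlisting>
static spinlock_t counter_lock = SPIN_LOCK_UNLOCKED;

void bump_very_important_count(void)
{
	spin_lock(&counter_lock);
	very_important_count++;		/* only one CPU can be in here at a time */
	spin_unlock(&counter_lock);
}
</programlisting>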
203 The second type is a semaphore
204 (<filename class=headerfile>include/asm/semaphore.h</filename>): it
205 can have more than one holder at any time (the number decided at
206 initialization time), although it is most commonly used as a
207 single-holder lock (a mutex). If you can't get a semaphore,
208 your task will put itself on the queue, and be woken up when the
209 semaphore is released. This means the CPU will do something
210 else while you are waiting, but there are many cases when you
211 simply can't sleep (see <xref linkend="sleeping-things">), and so
212 have to use a spinlock instead.
215 Neither type of lock is recursive: see
216 <xref linkend="techniques-deadlocks">.
219 <sect1 id="uniprocessor">
220 <title>Locks and Uniprocessor Kernels</title>
223 For kernels compiled without <symbol>CONFIG_SMP</symbol>, spinlocks
224 do not exist at all. This is an excellent design decision: when
no-one else can run at the same time, there is no reason to have a lock.
230 You should always test your locking code with <symbol>CONFIG_SMP</symbol>
231 enabled, even if you don't have an SMP test box, because it
232 will still catch some (simple) kinds of deadlock.
236 Semaphores still exist, because they are required for
237 synchronization between <firstterm linkend="gloss-usercontext">user
238 contexts</firstterm>, as we will see below.
243 <title>Read/Write Lock Variants</title>
246 Both spinlocks and semaphores have read/write variants:
247 <type>rwlock_t</type> and <structname>struct rw_semaphore</structname>.
248 These divide users into two classes: the readers and the writers. If
249 you are only reading the data, you can get a read lock, but to write to
250 the data you need the write lock. Many people can hold a read lock,
but a writer must be the sole holder.
255 This means much smoother locking if your code divides up
256 neatly along reader/writer lines. All the discussions below
257 also apply to read/write variants.
261 <sect1 id="usercontextlocking">
262 <title>Locking Only In User Context</title>
265 If you have a data structure which is only ever accessed from
266 user context, then you can use a simple semaphore
(<filename class=headerfile>include/asm/semaphore.h</filename>) to protect it. This
is the most trivial case: you initialize the semaphore to the number
of resources available (usually 1), and call
<function>down_interruptible()</function> to grab the semaphore, and
<function>up()</function> to release it. There is also a plain
<function>down()</function>, which should be avoided: it cannot be
interrupted, so it will keep your task waiting even if a signal is received.
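In code, the pattern looks something like this (a sketch only; the
foo names are invented):

<programlisting>
static DECLARE_MUTEX(foo_sem);		/* a semaphore initialized to 1 */
static int foo_setting;			/* only ever touched from user context */

int set_foo(int val)
{
	/* down_interruptible() returns non-zero if a signal interrupted us */
	if (down_interruptible(&foo_sem))
		return -EINTR;
	foo_setting = val;
	up(&foo_sem);
	return 0;
}
</programlisting>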
277 Example: <filename>linux/net/core/netfilter.c</filename> allows
278 registration of new <function>setsockopt()</function> and
279 <function>getsockopt()</function> calls, with
280 <function>nf_register_sockopt()</function>. Registration and
281 de-registration are only done on module load and unload (and boot
282 time, where there is no concurrency), and the list of registrations
283 is only consulted for an unknown <function>setsockopt()</function>
284 or <function>getsockopt()</function> system call. The
285 <varname>nf_sockopt_mutex</varname> is perfect to protect this,
especially since the setsockopt and getsockopt calls may well sleep.
291 <sect1 id="lock-user-bh">
292 <title>Locking Between User Context and BHs</title>
295 If a <firstterm linkend="gloss-bh">bottom half</firstterm> shares
296 data with user context, you have two problems. Firstly, the current
297 user context can be interrupted by a bottom half, and secondly, the
298 critical region could be entered from another CPU. This is where
299 <function>spin_lock_bh()</function>
300 (<filename class=headerfile>include/linux/spinlock.h</filename>) is
301 used. It disables bottom halves on that CPU, then grabs the lock.
302 <function>spin_unlock_bh()</function> does the reverse.
306 This works perfectly for <firstterm linkend="gloss-up"><acronym>UP
307 </acronym></firstterm> as well: the spin lock vanishes, and this macro
308 simply becomes <function>local_bh_disable()</function>
309 (<filename class=headerfile>include/asm/softirq.h</filename>), which
310 protects you from the bottom half being run.
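A sketch of the pattern (all the foo names are invented, and
<structname>struct foo</structname> is assumed to contain a
<structname>struct list_head</structname>):

<programlisting>
static spinlock_t foo_lock = SPIN_LOCK_UNLOCKED;
static LIST_HEAD(foo_list);

/* user context: keep the bottom half off this CPU as well as other CPUs */
void foo_add(struct foo *new)
{
	spin_lock_bh(&foo_lock);
	list_add(&new->list, &foo_list);
	spin_unlock_bh(&foo_lock);
}

/* the bottom half itself: a plain spin_lock() is enough here */
void foo_bh(void)
{
	spin_lock(&foo_lock);
	/* ... walk foo_list ... */
	spin_unlock(&foo_lock);
}
</programlisting>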
314 <sect1 id="lock-user-tasklet">
315 <title>Locking Between User Context and Tasklets/Soft IRQs</title>
318 This is exactly the same as above, because
319 <function>local_bh_disable()</function> actually disables all
320 softirqs and <firstterm linkend="gloss-tasklet">tasklets</firstterm>
321 on that CPU as well. It should really be
322 called `local_softirq_disable()', but the name has been preserved
323 for historical reasons. Similarly,
324 <function>spin_lock_bh()</function> would now be called
<function>spin_lock_softirq()</function> in a perfect world.
330 <title>Locking Between Bottom Halves</title>
333 Sometimes a bottom half might want to share data with
another bottom half (especially remember that timers are run off a bottom half).
338 <sect2 id="lock-bh-same">
339 <title>The Same BH</title>
342 Since a bottom half is never run on two CPUs at once, you
343 don't need to worry about your bottom half being run twice
344 at once, even on SMP.
348 <sect2 id="lock-bh-different">
349 <title>Different BHs</title>
Since only one bottom half ever runs at a time, you
353 don't need to worry about race conditions with other bottom
354 halves. Beware that things might change under you, however,
355 if someone changes your bottom half to a tasklet. If you
356 want to make your code future-proof, pretend you're already
running from a tasklet (see below), and do the extra
358 locking. Of course, if it's five years before that happens,
359 you're gonna look like a damn fool.
364 <sect1 id="lock-tasklets">
365 <title>Locking Between Tasklets</title>
368 Sometimes a tasklet might want to share data with another
369 tasklet, or a bottom half.
372 <sect2 id="lock-tasklets-same">
373 <title>The Same Tasklet</title>
375 Since a tasklet is never run on two CPUs at once, you don't
376 need to worry about your tasklet being reentrant (running
377 twice at once), even on SMP.
381 <sect2 id="lock-tasklets-different">
382 <title>Different Tasklets</title>
384 If another tasklet (or bottom half, such as a timer) wants
385 to share data with your tasklet, you will both need to use
386 <function>spin_lock()</function> and
387 <function>spin_unlock()</function> calls.
388 <function>spin_lock_bh()</function> is
unnecessary here, as you are already in a tasklet, and
390 none will be run on the same CPU.
395 <sect1 id="lock-softirqs">
396 <title>Locking Between Softirqs</title>
399 Often a <firstterm linkend="gloss-softirq">softirq</firstterm> might
400 want to share data with itself, a tasklet, or a bottom half.
403 <sect2 id="lock-softirqs-same">
404 <title>The Same Softirq</title>
407 The same softirq can run on the other CPUs: you can use a
408 per-CPU array (see <xref linkend="per-cpu">) for better
409 performance. If you're going so far as to use a softirq,
410 you probably care about scalable performance enough
411 to justify the extra complexity.
415 You'll need to use <function>spin_lock()</function> and
416 <function>spin_unlock()</function> for shared data.
420 <sect2 id="lock-softirqs-different">
421 <title>Different Softirqs</title>
424 You'll need to use <function>spin_lock()</function> and
425 <function>spin_unlock()</function> for shared data, whether it
426 be a timer (which can be running on a different CPU), bottom half,
427 tasklet or the same or another softirq.
433 <chapter id="hardirq-context">
434 <title>Hard IRQ Context</title>
437 Hardware interrupts usually communicate with a bottom half,
438 tasklet or softirq. Frequently this involves putting work in a
439 queue, which the BH/softirq will take out.
442 <sect1 id="hardirq-softirq">
443 <title>Locking Between Hard IRQ and Softirqs/Tasklets/BHs</title>
446 If a hardware irq handler shares data with a softirq, you have
447 two concerns. Firstly, the softirq processing can be
448 interrupted by a hardware interrupt, and secondly, the
449 critical region could be entered by a hardware interrupt on
450 another CPU. This is where <function>spin_lock_irq()</function> is
used. It is defined to disable interrupts on that CPU, then grab
452 the lock. <function>spin_unlock_irq()</function> does the reverse.
456 This works perfectly for UP as well: the spin lock vanishes,
457 and this macro simply becomes <function>local_irq_disable()</function>
458 (<filename class=headerfile>include/asm/smp.h</filename>), which
459 protects you from the softirq/tasklet/BH being run.
463 <function>spin_lock_irqsave()</function>
464 (<filename>include/linux/spinlock.h</filename>) is a variant
465 which saves whether interrupts were on or off in a flags word,
which is passed to <function>spin_unlock_irqrestore()</function>. This
means that the same code can be used inside a hard irq handler (where
468 interrupts are already off) and in softirqs (where the irq
469 disabling is required).
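For instance (a sketch; <structname>struct foo_work</structname> and the
queue are invented):

<programlisting>
static spinlock_t queue_lock = SPIN_LOCK_UNLOCKED;
static LIST_HEAD(work_queue);

/* safe from hard irq context (interrupts already off) and from anywhere else */
void queue_foo_work(struct foo_work *w)
{
	unsigned long flags;

	spin_lock_irqsave(&queue_lock, flags);
	list_add_tail(&w->list, &work_queue);
	spin_unlock_irqrestore(&queue_lock, flags);
}

/* the softirq knows interrupts are enabled, so it can use the cheaper form */
void run_foo_work(void)
{
	spin_lock_irq(&queue_lock);
	/* ... pull entries off work_queue and handle them ... */
	spin_unlock_irq(&queue_lock);
}
</programlisting>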
474 <chapter id="common-techniques">
475 <title>Common Techniques</title>
478 This section lists some common dilemmas and the standard
479 solutions used in the Linux kernel code. If you use these,
480 people will find your code simpler to understand.
484 If I could give you one piece of advice: never sleep with anyone
485 crazier than yourself. But if I had to give you advice on
486 locking: <emphasis>keep it simple</emphasis>.
494 Be reluctant to introduce new locks.
498 Strangely enough, this is the exact reverse of my advice when
499 you <emphasis>have</emphasis> slept with someone crazier than yourself.
502 <sect1 id="techniques-no-writers">
503 <title>No Writers in Interrupt Context</title>
506 There is a fairly common case where an interrupt handler needs
507 access to a critical region, but does not need write access.
508 In this case, you do not need to use
509 <function>read_lock_irq()</function>, but only
510 <function>read_lock()</function> everywhere (since if an interrupt
511 occurs, the irq handler will only try to grab a read lock, which
512 won't deadlock). You will still need to use
513 <function>write_lock_irq()</function>.
517 Similar logic applies to locking between softirqs/tasklets/BHs
518 which never need a write lock, and user context:
519 <function>read_lock()</function> and
520 <function>write_lock_bh()</function>.
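A sketch of the interrupt-handler case (names invented):

<programlisting>
static rwlock_t foo_lock = RW_LOCK_UNLOCKED;

/* hardware irq handler: only ever reads, so no need to disable interrupts */
void foo_irq_lookup(int key)
{
	read_lock(&foo_lock);
	/* ... look key up in the table ... */
	read_unlock(&foo_lock);
}

/* user context: a writer must still keep the irq handler off this CPU */
void foo_update(int key, int value)
{
	write_lock_irq(&foo_lock);
	/* ... modify the table ... */
	write_unlock_irq(&foo_lock);
}
</programlisting>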
524 <sect1 id="techniques-deadlocks">
525 <title>Deadlock: Simple and Advanced</title>
528 There is a coding bug where a piece of code tries to grab a
529 spinlock twice: it will spin forever, waiting for the lock to
530 be released (spinlocks, rwlocks and semaphores are not
531 recursive in Linux). This is trivial to diagnose: not a
stay-up-five-nights-talk-to-fluffy-code-bunnies kind of problem.
537 For a slightly more complex case, imagine you have a region
538 shared by a bottom half and user context. If you use a
539 <function>spin_lock()</function> call to protect it, it is
540 possible that the user context will be interrupted by the bottom
541 half while it holds the lock, and the bottom half will then spin
542 forever trying to get the same lock.
546 Both of these are called deadlock, and as shown above, it can
547 occur even with a single CPU (although not on UP compiles,
548 since spinlocks vanish on kernel compiles with
<symbol>CONFIG_SMP</symbol>=n; you'll still get data corruption
in the second example).
554 This complete lockup is easy to diagnose: on SMP boxes the
555 watchdog timer or compiling with <symbol>DEBUG_SPINLOCKS</symbol> set
556 (<filename>include/linux/spinlock.h</filename>) will show this up
557 immediately when it happens.
561 A more complex problem is the so-called `deadly embrace',
562 involving two or more locks. Say you have a hash table: each
563 entry in the table is a spinlock, and a chain of hashed
564 objects. Inside a softirq handler, you sometimes want to
565 alter an object from one place in the hash to another: you
566 grab the spinlock of the old hash chain and the spinlock of
567 the new hash chain, and delete the object from the old one,
568 and insert it in the new one.
572 There are two problems here. First, if your code ever
573 tries to move the object to the same chain, it will deadlock
574 with itself as it tries to lock it twice. Secondly, if the
575 same softirq on another CPU is trying to move another object
576 in the reverse direction, the following could happen:
580 <title>Consequences</title>
582 <tgroup cols=2 align=left>
593 <entry>Grab lock A -> OK</entry>
594 <entry>Grab lock B -> OK</entry>
597 <entry>Grab lock B -> spin</entry>
598 <entry>Grab lock A -> spin</entry>
605 The two CPUs will spin forever, waiting for the other to give up
606 their lock. It will look, smell, and feel like a crash.
609 <sect2 id="techs-deadlock-prevent">
610 <title>Preventing Deadlock</title>
613 Textbooks will tell you that if you always lock in the same
614 order, you will never get this kind of deadlock. Practice
615 will tell you that this approach doesn't scale: when I
616 create a new lock, I don't understand enough of the kernel
617 to figure out where in the 5000 lock hierarchy it will fit.
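For a self-contained case like the hash-table example above, though,
ordering works fine. A sketch (<structname>struct foo_chain</structname>
and the function are invented) which takes the two chain locks in a fixed
order, and only once if they are the same chain:

<programlisting>
void move_foo(struct foo *x, struct foo_chain *old, struct foo_chain *new)
{
	if (old == new) {
		spin_lock(&old->lock);		/* same chain: lock it only once */
	} else if (old < new) {			/* always lock in address order */
		spin_lock(&old->lock);
		spin_lock(&new->lock);
	} else {
		spin_lock(&new->lock);
		spin_lock(&old->lock);
	}

	/* ... unlink x from old, link it into new ... */

	spin_unlock(&old->lock);
	if (old != new)
		spin_unlock(&new->lock);
}
</programlisting>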
621 The best locks are encapsulated: they never get exposed in
622 headers, and are never held around calls to non-trivial
623 functions outside the same file. You can read through this
624 code and see that it will never deadlock, because it never
625 tries to grab another lock while it has that one. People
using your code don't even need to know you are using a lock.
631 A classic problem here is when you provide callbacks or
632 hooks: if you call these with the lock held, you risk simple
633 deadlock, or a deadly embrace (who knows what the callback
634 will do?). Remember, the other programmers are out to get
635 you, so don't do this.
639 <sect2 id="techs-deadlock-overprevent">
640 <title>Overzealous Prevention Of Deadlocks</title>
643 Deadlocks are problematic, but not as bad as data
644 corruption. Code which grabs a read lock, searches a list,
645 fails to find what it wants, drops the read lock, grabs a
646 write lock and inserts the object has a race condition.
650 If you don't see why, please stay the fuck away from my code.
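(For the record, a fixed version repeats the search while holding the
write lock; the names below are invented for illustration.)

<programlisting>
struct foo *add_foo_unless_present(struct foo *new, int id)
{
	struct foo *obj;

	read_lock(&foo_lock);
	obj = find_foo(id);
	read_unlock(&foo_lock);
	if (obj)
		return obj;

	write_lock(&foo_lock);
	obj = find_foo(id);	/* the list may have changed since we dropped the read lock */
	if (!obj) {
		list_add(&new->list, &foo_list);
		obj = new;
	}
	write_unlock(&foo_lock);
	return obj;
}
</programlisting>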
656 <title>Per-CPU Data</title>
659 A great technique for avoiding locking which is used fairly
660 widely is to duplicate information for each CPU. For example,
661 if you wanted to keep a count of a common condition, you could
662 use a spin lock and a single counter. Nice and simple.
666 If that was too slow [it's probably not], you could instead
667 use a counter for each CPU [don't], then none of them need an
668 exclusive lock [you're wasting your time here]. To make sure
669 the CPUs don't have to synchronize caches all the time, align
670 the counters to cache boundaries by appending
671 `__cacheline_aligned' to the declaration
672 (<filename class=headerfile>include/linux/cache.h</filename>).
673 [Can't you think of anything better to do?]
677 They will need a read lock to access their own counters,
678 however. That way you can use a write lock to grant exclusive
679 access to all of them at once, to tally them up.
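A sketch of the counter case (the names are invented, and this leaves
out the read/write-lock refinement just described):

<programlisting>
struct foo_count {
	unsigned long count;
} __cacheline_aligned;			/* one cache line per CPU */

static struct foo_count foo_counts[NR_CPUS];

/* each CPU only touches its own slot, so no exclusive lock is needed */
static inline void bump_foo_count(void)
{
	foo_counts[smp_processor_id()].count++;
}

/* tallying them up means visiting every CPU's slot */
static unsigned long total_foo_count(void)
{
	unsigned long total = 0;
	int i;

	for (i = 0; i < NR_CPUS; i++)
		total += foo_counts[i].count;
	return total;
}
</programlisting>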
684 <title>Big Reader Locks</title>
A classic example of per-CPU information is Ingo's `big reader' locks
689 (<filename class=headerfile>linux/include/brlock.h</filename>). These
690 use the Per-CPU Data techniques described above to create a lock which
is very fast to get a read lock, but agonizingly slow for a write lock.
696 Fortunately, there are a limited number of these locks
available: you have to go through a strict interview process to get one.
702 <sect1 id="lock-avoidance-rw">
703 <title>Avoiding Locks: Read And Write Ordering</title>
706 Sometimes it is possible to avoid locking. Consider the
707 following case from the 2.2 firewall code, which inserted an
708 element into a single linked list in user context:
new->next = i->next;
i->next = new;
717 Here the author (Alan Cox, who knows what he's doing) assumes
718 that the pointer assignments are atomic. This is important,
719 because networking packets would traverse this list on bottom
720 halves without a lock. Depending on their exact timing, they
721 would either see the new element in the list with a valid
<structfield>next</structfield> pointer, or it would not be in the list yet.
727 Of course, the writes <emphasis>must</emphasis> be in this
728 order, otherwise the new element appears in the list with an
729 invalid <structfield>next</structfield> pointer, and any other
730 CPU iterating at the wrong time will jump through it into garbage.
Because modern CPUs reorder writes, Alan's code actually reads as follows:
new->next = i->next;
wmb();
i->next = new;
741 The <function>wmb()</function> is a write memory barrier
742 (<filename class=headerfile>include/asm/system.h</filename>): neither
743 the compiler nor the CPU will allow any writes to memory after the
744 <function>wmb()</function> to be visible to other hardware
745 before any of the writes before the <function>wmb()</function>.
749 As i386 does not do write reordering, this bug would never
show up on that platform. On other SMP platforms, however, it might.
755 There is also <function>rmb()</function> for read ordering: to ensure
756 any previous variable reads occur before following reads. The simple
757 <function>mb()</function> macro combines both
758 <function>rmb()</function> and <function>wmb()</function>.
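A sketch of the classic flag-and-payload pattern (the names are
invented): the writer fills in the data and then sets the flag; the
reader must not let the data reads float above the flag read.

<programlisting>
static struct { int val; } shared_data;
static int shared_ready;

void writer(void)
{
	shared_data.val = 42;	/* fill in the payload first... */
	wmb();
	shared_ready = 1;	/* ...then let the flag become visible */
}

void reader(void)
{
	if (shared_ready) {
		rmb();		/* don't read the payload before seeing the flag */
		printk("%d\n", shared_data.val);
	}
}
</programlisting>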
762 Any atomic operation is defined to act as a memory barrier
763 (ie. as per the <function>mb()</function> macro). Also,
764 spinlock operations act as partial barriers: operations after
765 gaining a spinlock will never be moved to precede the
766 <function>spin_lock()</function> call, and operations before
767 releasing a spinlock will never be moved after the
768 <function>spin_unlock()</function> call.
769 <!-- Manfred Spraul <manfreds@colorfullife.com>
770 24 May 2000 2.3.99-pre9 -->
774 <sect1 id="lock-avoidance-atomic-ops">
775 <title>Avoiding Locks: Atomic Operations</title>
778 There are a number of atomic operations defined in
779 <filename class=headerfile>include/asm/atomic.h</filename>: these
780 are guaranteed to be seen atomically from all CPUs in the system, thus
781 avoiding races. If your shared data consists of a single counter, say,
782 these operations might be simpler than using spinlocks (although for
783 anything non-trivial using spinlocks is clearer).
787 Note that the atomic operations are defined to act as both
788 read and write barriers on all platforms.
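For example (an invented counter):

<programlisting>
static atomic_t foo_pending = ATOMIC_INIT(0);

void foo_start(void)
{
	atomic_inc(&foo_pending);	/* no spinlock needed */
}

void foo_finish(void)
{
	/* atomic_dec_and_test() returns true for whoever takes it to zero */
	if (atomic_dec_and_test(&foo_pending))
		printk("all foo requests have finished\n");
}
</programlisting>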
792 <sect1 id="ref-counts">
793 <title>Protecting A Collection of Objects: Reference Counts</title>
796 Locking a collection of objects is fairly easy: you get a
797 single spinlock, and you make sure you grab it before
798 searching, adding or deleting an object.
802 The purpose of this lock is not to protect the individual
803 objects: you might have a separate lock inside each one for
804 that. It is to protect the <emphasis>data structure
805 containing the objects</emphasis> from race conditions. Often
806 the same lock is used to protect the contents of all the
807 objects as well, for simplicity, but they are inherently
808 orthogonal (and many other big words designed to confuse).
812 Changing this to a read-write lock will often help markedly if
reads are far more common than writes. If not, there is
814 another approach you can use to reduce the time the lock is
815 held: reference counts.
819 In this approach, an object has an owner, who sets the
820 reference count to one. Whenever you get a pointer to the
821 object, you increment the reference count (a `get' operation).
822 Whenever you relinquish a pointer, you decrement the reference
823 count (a `put' operation). When the owner wants to destroy
824 it, they mark it dead, and do a put.
828 Whoever drops the reference count to zero (usually implemented
829 with <function>atomic_dec_and_test()</function>) actually cleans
830 up and frees the object.
834 This means that you are guaranteed that the object won't
835 vanish underneath you, even though you no longer have a lock
840 Here's some skeleton code:
void create_foo(struct foo *x)
{
	atomic_set(&x->use, 1);
	spin_lock_bh(&list_lock);
	... insert in list ...
	spin_unlock_bh(&list_lock);
}

struct foo *get_foo(int desc)
{
	struct foo *ret;

	spin_lock_bh(&list_lock);
	... find in list and set ret ...
	if (ret) atomic_inc(&ret->use);
	spin_unlock_bh(&list_lock);
	return ret;
}

void put_foo(struct foo *x)
{
	if (atomic_dec_and_test(&x->use))
		kfree(x);
}

void destroy_foo(struct foo *x)
{
	spin_lock_bh(&list_lock);
	... remove from list ...
	spin_unlock_bh(&list_lock);
	put_foo(x);		/* drop the owner's reference */
}
880 <sect2 id="helpful-macros">
881 <title>Macros To Help You</title>
883 There are a set of debugging macros tucked inside
884 <filename class=headerfile>include/linux/netfilter_ipv4/lockhelp.h</filename>
885 and <filename class=headerfile>listhelp.h</filename>: these are very
useful for ensuring that locks are held in the right places to protect infrastructure.
892 <sect1 id="sleeping-things">
893 <title>Things Which Sleep</title>
896 You can never call the following routines while holding a
spinlock, as they may sleep. This also means you need to be in user context.
Accesses to <firstterm linkend="gloss-userspace">userspace</firstterm>:
910 <function>copy_from_user()</function>
915 <function>copy_to_user()</function>
920 <function>get_user()</function>
<function>put_user()</function>
933 <function>kmalloc(GFP_KERNEL)</function>
939 <function>down_interruptible()</function> and
940 <function>down()</function>
943 There is a <function>down_trylock()</function> which can be
944 used inside interrupt context, as it will not sleep.
945 <function>up()</function> will also never sleep.
951 <function>printk()</function> can be called in
952 <emphasis>any</emphasis> context, interestingly enough.
957 <title>The Fucked Up Sparc</title>
960 Alan Cox says <quote>the irq disable/enable is in the register
961 window on a sparc</quote>. Andi Kleen says <quote>when you do
962 restore_flags in a different function you mess up all the
963 register windows</quote>.
967 So never pass the flags word set by
968 <function>spin_lock_irqsave()</function> and brethren to another
function (unless it's declared <type>inline</type>). Usually no-one
970 does this, but now you've been warned. Dave Miller can never do
971 anything in a straightforward manner (I can say that, because I have
pictures of him and a certain PowerPC maintainer in a compromising position).
977 <sect1 id="racing-timers">
978 <title>Racing Timers: A Kernel Pastime</title>
981 Timers can produce their own special problems with races.
982 Consider a collection of objects (list, hash, etc) where each
983 object has a timer which is due to destroy it.
987 If you want to destroy the entire collection (say on module
988 removal), you might do the following:
/* THIS CODE BAD BAD BAD BAD: IF IT WAS ANY WORSE IT WOULD USE
   HUNGARIAN NOTATION */
spin_lock_bh(&list_lock);

while (list) {
	struct foo *next = list->next;
	del_timer(&list->timer);
	kfree(list);
	list = next;
}
spin_unlock_bh(&list_lock);
1007 Sooner or later, this will crash on SMP, because a timer can
1008 have just gone off before the <function>spin_lock_bh()</function>,
1009 and it will only get the lock after we
1010 <function>spin_unlock_bh()</function>, and then try to free
1011 the element (which has already been freed!).
1015 This can be avoided by checking the result of
1016 <function>del_timer()</function>: if it returns
1017 <returnvalue>1</returnvalue>, the timer has been deleted.
1018 If <returnvalue>0</returnvalue>, it means (in this
1019 case) that it is currently running, so we can do:
 retry:
	spin_lock_bh(&list_lock);
	while (list) {
		struct foo *next = list->next;
		if (!del_timer(&list->timer)) {
			/* Give timer a chance to delete this */
			spin_unlock_bh(&list_lock);
			goto retry;
		}
		kfree(list);
		list = next;
	}
	spin_unlock_bh(&list_lock);
1041 Another common problem is deleting timers which restart
1042 themselves (by calling <function>add_timer()</function> at the end
1043 of their timer function). Because this is a fairly common case
1044 which is prone to races, you can put a call to
1045 <function>timer_exit()</function> at the very end of your timer function,
and use <function>del_timer_sync()</function>
1047 (<filename class=headerfile>include/linux/timer.h</filename>)
1048 to handle this case. It returns the number of times the timer
1049 had to be deleted before we finally stopped it from adding itself back
1055 <chapter id="references">
1056 <title>Further reading</title>
1061 <filename>Documentation/spinlocks.txt</filename>:
1062 Linus Torvalds' spinlocking tutorial in the kernel sources.
1068 Unix Systems for Modern Architectures: Symmetric
1069 Multiprocessing and Caching for Kernel Programmers:
1073 Curt Schimmel's very good introduction to kernel level
1074 locking (not written for Linux, but nearly everything
1075 applies). The book is expensive, but really worth every
1076 penny to understand SMP locking. [ISBN: 0201633388]
1082 <chapter id="thanks">
1083 <title>Thanks</title>
Thanks to Telsa Gwynne for DocBooking, neatening and adding style.
1091 Thanks to Martin Pool, Philipp Rumpf, Stephen Rothwell, Paul
1092 Mackerras, Ruedi Aschwanden, Alan Cox, Manfred Spraul and Tim
1093 Waugh for proofreading, correcting, flaming, commenting.
1097 Thanks to the cabal for having no influence on this document.
1101 <glossary id="glossary">
1102 <title>Glossary</title>
1104 <glossentry id="gloss-bh">
1105 <glossterm>bh</glossterm>
1108 Bottom Half: for historical reasons, functions with
1109 `_bh' in them often now refer to any software interrupt, e.g.
1110 <function>spin_lock_bh()</function> blocks any software interrupt
1111 on the current CPU. Bottom halves are deprecated, and will
1112 eventually be replaced by tasklets. Only one bottom half will be
1113 running at any time.
1118 <glossentry id="gloss-hwinterrupt">
1119 <glossterm>Hardware Interrupt / Hardware IRQ</glossterm>
1122 Hardware interrupt request. <function>in_irq()</function> returns
1123 <returnvalue>true</returnvalue> in a hardware interrupt handler (it
1124 also returns true when interrupts are blocked).
1129 <glossentry id="gloss-interruptcontext">
1130 <glossterm>Interrupt Context</glossterm>
1133 Not user context: processing a hardware irq or software irq.
1134 Indicated by the <function>in_interrupt()</function> macro
1135 returning <returnvalue>true</returnvalue> (although it also
1136 returns true when interrupts or BHs are blocked).
1141 <glossentry id="gloss-smp">
1142 <glossterm><acronym>SMP</acronym></glossterm>
1145 Symmetric Multi-Processor: kernels compiled for multiple-CPU
1146 machines. (CONFIG_SMP=y).
1151 <glossentry id="gloss-softirq">
1152 <glossterm>softirq</glossterm>
1155 Strictly speaking, one of up to 32 enumerated software
1156 interrupts which can run on multiple CPUs at once.
1157 Sometimes used to refer to tasklets and bottom halves as
1158 well (ie. all software interrupts).
1163 <glossentry id="gloss-swinterrupt">
1164 <glossterm>Software Interrupt / Software IRQ</glossterm>
1167 Software interrupt handler. <function>in_irq()</function> returns
1168 <returnvalue>false</returnvalue>; <function>in_softirq()</function>
1169 returns <returnvalue>true</returnvalue>. Tasklets, softirqs and
1170 bottom halves all fall into the category of `software interrupts'.
1175 <glossentry id="gloss-tasklet">
1176 <glossterm>tasklet</glossterm>
1179 A dynamically-registrable software interrupt,
1180 which is guaranteed to only run on one CPU at a time.
1185 <glossentry id="gloss-up">
1186 <glossterm><acronym>UP</acronym></glossterm>
1189 Uni-Processor: Non-SMP. (CONFIG_SMP=n).
1194 <glossentry id="gloss-usercontext">
1195 <glossterm>User Context</glossterm>
1198 The kernel executing on behalf of a particular
process or kernel thread (given by the <function>current()</function>
macro). Not to be confused with userspace. Can be interrupted by
1201 software or hardware interrupts.
1206 <glossentry id="gloss-userspace">
1207 <glossterm>Userspace</glossterm>
1210 A process executing its own code outside the kernel.