2010-06-09 Steven G. Kargl <kargl@gcc.gnu.org>
<HTML>
<HEAD>
<TITLE>Using the Garbage Collector as Leak Detector</title>
</head>
<BODY>
<H1>Using the Garbage Collector as Leak Detector</h1>
The garbage collector may be used as a leak detector.
In this case, the primary function of the collector is to report
objects that were allocated (typically with <TT>GC_MALLOC</tt>),
not deallocated (normally with <TT>GC_FREE</tt>), but are
no longer accessible. Since the object is no longer accessible,
there is normally no way to deallocate the object at a later time;
thus it can safely be assumed that the object has been "leaked".
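<P>
The reachability criterion above can be illustrated with a toy model.
The following is only a sketch under stated assumptions: the fixed-size
<TT>heap</tt> table and the <TT>toy_</tt> functions are hypothetical names
invented for this illustration and are not part of the collector. An object
counts as leaked when it is still allocated but cannot be reached from any root.

```c
#include <stdio.h>

#define MAX_OBJECTS 8
#define NO_REF (-1)

/* Toy heap: each object may reference at most two others by index.
 * All names here are hypothetical and unrelated to the collector's API. */
struct toy_object {
    int allocated;      /* nonzero until explicitly freed */
    int marked;         /* set while tracing from the roots */
    int refs[2];        /* indices of referenced objects, or NO_REF */
};

static struct toy_object heap[MAX_OBJECTS];

/* Record object id as allocated, referencing ref0 and ref1. */
static void toy_alloc(int id, int ref0, int ref1)
{
    heap[id].allocated = 1;
    heap[id].marked = 0;
    heap[id].refs[0] = ref0;
    heap[id].refs[1] = ref1;
}

/* Mark everything reachable from object id. */
static void toy_mark(int id)
{
    int k;
    if (id == NO_REF || !heap[id].allocated || heap[id].marked)
        return;
    heap[id].marked = 1;
    for (k = 0; k < 2; ++k)
        toy_mark(heap[id].refs[k]);
}

/* Trace from the given roots, then report objects that are still
 * allocated but unreachable. Returns the number of such objects. */
static int toy_find_leaks(const int *roots, int nroots)
{
    int i, leaks = 0;
    for (i = 0; i < MAX_OBJECTS; ++i)
        heap[i].marked = 0;
    for (i = 0; i < nroots; ++i)
        toy_mark(roots[i]);
    for (i = 0; i < MAX_OBJECTS; ++i) {
        if (heap[i].allocated && !heap[i].marked) {
            fprintf(stderr, "Leaked toy object %d\n", i);
            ++leaks;
        }
    }
    return leaks;
}
```

For instance, if object 0 is the only root and references object 1, while
objects 2 and 3 are allocated but referenced only by each other,
<TT>toy_find_leaks</tt> reports objects 2 and 3. The real collector scans
memory conservatively rather than following an explicit reference table.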
<P>
This is substantially different from counting leak detectors,
which simply verify that all allocated objects are eventually
deallocated. A garbage-collector based leak detector can provide
somewhat more precise information about when an object was leaked.
More importantly, it does not report objects that are never
deallocated because they are part of "permanent" data structures.
Thus it does not require all objects to be deallocated at process
exit time, a potentially useless activity that often triggers
large amounts of paging.
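<P>
For contrast, a counting leak detector can be sketched in a few lines.
This is a minimal illustration, not code from the collector; the names
<TT>counted_malloc</tt>, <TT>counted_free</tt> and <TT>counted_report</tt>
are hypothetical. Because it only counts outstanding allocations, it has
to flag "permanent" data structures as leaks unless they are freed before
the final check.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counting leak detector: it tracks only how many
 * allocations are still outstanding, not whether they are reachable. */
static long live_allocations = 0;

/* Allocate n bytes and count the allocation if it succeeded. */
static void *counted_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        ++live_allocations;
    return p;
}

/* Free p and drop it from the outstanding count. */
static void counted_free(void *p)
{
    if (p != NULL) {
        --live_allocations;
        free(p);
    }
}

/* Returns the number of objects never deallocated. A counting detector
 * must treat every one of them as a leak, including permanent data. */
static long counted_report(void)
{
    if (live_allocations != 0)
        fprintf(stderr, "%ld object(s) never deallocated\n",
                live_allocations);
    return live_allocations;
}
```

A program that keeps one long-lived table allocated until exit would see
<TT>counted_report</tt> return 1 even though nothing was actually lost.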
<P>
All non-ancient versions of the garbage collector provide
leak detection support. Version 5.3 adds the following
features:
<OL>
<LI> Leak detection mode can be initiated at run-time by
setting GC_find_leak instead of building the collector with FIND_LEAK
defined. This variable should be set to a nonzero value
at program startup.
<LI> Leaked objects should be reported and then correctly garbage collected.
Prior versions either reported leaks or functioned as a garbage collector.
</ol>
For the rest of this description we will give instructions that work
with any reasonable version of the collector.
<P>
To use the collector as a leak detector, follow these steps:
<OL>
<LI> Build the collector with -DFIND_LEAK. Otherwise use default
build options.
<LI> Change the program so that all allocation and deallocation goes
through the garbage collector.
<LI> Arrange to call <TT>GC_gcollect</tt> at appropriate points to check
for leaks.
(For sufficiently long running programs, this will happen implicitly,
but probably not with sufficient frequency.)
</ol>
The second step can usually be accomplished with the
<TT>-DREDIRECT_MALLOC=GC_malloc</tt> option when the collector is built,
or by defining <TT>malloc</tt>, <TT>calloc</tt>,
<TT>realloc</tt> and <TT>free</tt>
to call the corresponding garbage collector functions.
But this, by itself, will not yield very informative diagnostics,
since the collector does not keep track of information about
how objects were allocated. The error reports will include
only object addresses.
<P>
For more precise error reports, as much of the program as possible
should use the all-uppercase variants of these functions, after
defining <TT>GC_DEBUG</tt> and then including <TT>gc.h</tt>.
In this environment <TT>GC_MALLOC</tt> is a macro which causes
at least the file name and line number at the allocation point to
be saved as part of the object. Leak reports will then also include
this information.
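<P>
The general technique of recording the allocation site through a macro can
be sketched as follows. This is an illustration only and makes no claim
about how the collector lays out its debug information: the
<TT>dbg_header</tt> type, the <TT>dbg_</tt> functions and the
<TT>DBG_MALLOC</tt> macro are hypothetical names. The macro passes
<TT>__FILE__</tt> and <TT>__LINE__</tt> to the allocator, which stores them
in a header placed in front of the object.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical debug header stored in front of each object. The real
 * collector's layout differs; this only illustrates the idea. */
union dbg_header {
    struct {
        const char *file;   /* source file of the allocation site */
        int line;           /* line number of the allocation site */
        size_t size;        /* size requested by the client */
    } info;
    max_align_t align;      /* keeps the client pointer suitably aligned */
};

/* Allocate n bytes plus a header describing the allocation site. */
static void *dbg_malloc(size_t n, const char *file, int line)
{
    union dbg_header *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->info.file = file;
    h->info.line = line;
    h->info.size = n;
    return h + 1;           /* client sees only the memory after the header */
}

/* Recover the header from a pointer returned by dbg_malloc. */
static const union dbg_header *dbg_header_of(const void *p)
{
    return (const union dbg_header *)p - 1;
}

/* Release an object obtained from dbg_malloc. */
static void dbg_free(void *p)
{
    if (p != NULL)
        free((union dbg_header *)p - 1);
}

/* Print the saved allocation site, in the spirit of a leak report. */
static void dbg_describe(const void *p)
{
    const union dbg_header *h = dbg_header_of(p);
    fprintf(stderr, "object at %p (%s:%d, sz=%zu)\n",
            (void *)p, h->info.file, h->info.line, h->info.size);
}

/* The macro captures the caller's file and line, not the allocator's. */
#define DBG_MALLOC(n) dbg_malloc((n), __FILE__, __LINE__)
```

Client code would call <TT>DBG_MALLOC(sizeof(int))</tt> in place of
<TT>malloc</tt>, and <TT>dbg_describe</tt> could later print where that
object was allocated.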
<P>
Many collector features (<I>e.g.</i> stubborn objects, finalization,
and disappearing links) are less useful in this context, and are not
fully supported. Their use will usually generate additional bogus
leak reports, since the collector itself drops some associated objects.
<P>
The same is generally true of thread support. However, as of 6.0alpha4,
correct leak reports should be generated with linuxthreads.
<P>
On a few platforms (currently Solaris/SPARC, Irix, and, with -DSAVE_CALL_CHAIN,
Linux/X86), <TT>GC_MALLOC</tt>
also causes some more information about its call stack to be saved
in the object. Such information is reproduced in the error
reports in very non-symbolic form, but it can be very useful with the
aid of a debugger.
<H2>An Example</h2>
The following header file <TT>leak_detector.h</tt> is included in the
"include" subdirectory of the distribution:
<PRE>
#define GC_DEBUG
#include "gc.h"
#define malloc(n) GC_MALLOC(n)
#define calloc(m,n) GC_MALLOC((m)*(n))
#define free(p) GC_FREE(p)
#define realloc(p,n) GC_REALLOC((p),(n))
#define CHECK_LEAKS() GC_gcollect()
</pre>
<P>
Assume the collector has been built with -DFIND_LEAK. (For very
new versions of the collector, we could instead add the statement
<TT>GC_find_leak = 1</tt> as the first statement in <TT>main</tt>.)
<P>
The program to be tested for leaks can then look like:
<PRE>
#include "leak_detector.h"

int main(void) {
    int *p[10];
    int i;
    /* GC_find_leak = 1; for new collector versions not */
    /* compiled with -DFIND_LEAK.                       */
    for (i = 0; i < 10; ++i) {
        p[i] = malloc(sizeof(int)+i);
    }
    for (i = 1; i < 10; ++i) {          /* p[0] is deliberately not freed */
        free(p[i]);
    }
    for (i = 0; i < 9; ++i) {
        p[i] = malloc(sizeof(int)+i);   /* overwriting p[0] leaks the first object */
    }
    CHECK_LEAKS();
    return 0;
}
</pre>
On an Intel X86 Linux system this produces on the stderr stream:
<PRE>
Leaked composite object at 0x806dff0 (leak_test.c:8, sz=4)
</pre>
(On most unmentioned operating systems, the output is similar to this.
If the collector had been built on Linux/X86 with -DSAVE_CALL_CHAIN,
the output would be closer to the Solaris example. For this to work,
the program should not be compiled with -fomit-frame-pointer.)
On Irix it reports
<PRE>
Leaked composite object at 0x10040fe0 (leak_test.c:8, sz=4)
Caller at allocation:
##PC##= 0x10004910
</pre>
and on Solaris the error report is
<PRE>
Leaked composite object at 0xef621fc8 (leak_test.c:8, sz=4)
Call chain at allocation:
args: 4 (0x4), 200656 (0x30FD0)
##PC##= 0x14ADC
args: 1 (0x1), -268436012 (0xEFFFFDD4)
##PC##= 0x14A64
</pre>
In the latter two cases some additional information is given about
how malloc was called when the leaked object was allocated. For
Solaris, the first line specifies the arguments to <TT>GC_debug_malloc</tt>
(the actual allocation routine), the second the program counter inside
main, the third the arguments to <TT>main</tt>, and the fourth the program
counter inside the caller to main (i.e. in the C startup code).
In the Irix case, only the address inside the caller to main is given.
In many cases, a debugger is needed to interpret the additional information.
On systems supporting the "adb" debugger, the <TT>callprocs</tt> script
can be used to replace program counter values with symbolic names.
As of version 6.1, the collector tries to generate symbolic names for
call stacks if it knows how to do so on the platform. This is true on
Linux/X86, but not on most other platforms.
<H2>Simplified leak detection under Linux</h2>
Since version 6.1, it should be possible to run the collector in leak
detection mode on a program a.out under Linux/X86 as follows:
<OL>
<LI> Ensure that a.out is a single-threaded executable. This doesn't yet work
for multithreaded programs.
<LI> If possible, ensure that the addr2line program is installed in
/usr/bin. (It comes with RedHat Linux.)
<LI> If possible, compile a.out with full debug information.
This will improve the quality of the leak reports. With this approach, it is
no longer necessary to call GC_ routines explicitly, though that can also
improve the quality of the leak reports.
<LI> Build the collector and install it in directory <I>foo</i> as follows:
<UL>
<LI> configure --prefix=<I>foo</i> --enable-full-debug --enable-redirect-malloc
--disable-threads
<LI> make
<LI> make install
</ul>
<LI> Set environment variables as follows:
<UL>
<LI> LD_PRELOAD=<I>foo</i>/lib/libgc.so
<LI> GC_FIND_LEAK
<LI> You may also want to set GC_PRINT_STATS (to confirm that the collector
is running) and/or GC_LOOP_ON_ABORT (to facilitate debugging from another
window if something goes wrong).
</ul>
<LI> Simply run a.out as you normally would. Note that if you run anything
else (<I>e.g.</i> your editor) with those environment variables set,
it will also be leak tested. This may or may not be useful and/or
embarrassing. It can generate
mountains of leak reports if the application wasn't designed to avoid leaks,
<I>e.g.</i> because it's always short-lived.
</ol>
This has not yet been thoroughly tested on large applications, but it's known
to do the right thing on at least some small ones.
</body>
</html>