1 @c Copyright (C) 2002, 2003, 2004, 2007, 2008, 2009
2 @c Free Software Foundation, Inc.
3 @c This is part of the GCC manual.
4 @c For copying conditions, see the file gcc.texi.
7 @chapter Memory Management and Type Information
11 GCC uses some fairly sophisticated memory management techniques, which
12 involve determining information about GCC's data structures from GCC's
13 source code and using this information to perform garbage collection and
14 implement precompiled headers.
16 A full C parser would be too complicated for this task, so a limited
17 subset of C is interpreted and special markers are used to determine
18 what parts of the source to look at. All @code{struct} and
19 @code{union} declarations that define data structures that are
20 allocated under control of the garbage collector must be marked. All
21 global variables that hold pointers to garbage-collected memory must
22 also be marked. Finally, all global variables that need to be saved
23 and restored by a precompiled header must be marked. (The precompiled
24 header mechanism can only save static variables if they're scalar.
25 Complex data structures must be allocated in garbage-collected memory
26 to be saved in a precompiled header.)
28 The full format of a marker is
30 GTY (([@var{option}] [(@var{param})], [@var{option}] [(@var{param})] @dots{}))
33 but in most cases no options are needed. The outer double parentheses
34 are still necessary, though: @code{GTY(())}. Markers can appear:
@itemize @bullet
@item
In a structure definition, before the open brace;
@item
In a global variable declaration, after the keyword @code{static} or
@code{extern}; and
@item
In a structure field definition, before the name of the field.
@end itemize
46 Here are some examples of marking simple data structures and globals.
@smallexample
struct GTY(()) @var{tag}
@{
  @var{fields}@dots{}
@};

typedef struct GTY(()) @var{tag}
@{
  @var{fields}@dots{}
@} *@var{typename};

static GTY(()) struct @var{tag} *@var{list};   /* @r{points to GC memory} */
static GTY(()) int @var{counter};        /* @r{save counter in a PCH} */
@end smallexample
63 The parser understands simple typedefs such as
64 @code{typedef struct @var{tag} *@var{name};} and
65 @code{typedef int @var{name};}.
66 These don't need to be marked.
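
The markers only matter to the type machinery. The following sketch assumes that, for the C compiler, @code{GTY} is a macro that expands to nothing, and shows the three marker positions in code that compiles on its own. The names @code{struct item}, @code{item_list}, @code{item_counter} and @code{push_item} are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: for the compiler, GTY expands to nothing; only the
   type machinery interprets it.  */
#define GTY(x)

/* Marker in a structure definition, before the open brace.  */
struct GTY(()) item
{
  int value;
  struct item *next;
  /* Marker in a field definition, before the name of the field.  */
  void * GTY ((skip)) scratch;
};

/* Markers in global declarations, after the keyword `static'.  */
static GTY(()) struct item *item_list;  /* would point to GC memory */
static GTY(()) int item_counter;        /* would be saved in a PCH */

/* Push NODE on the global list and return the updated count.  */
int
push_item (struct item *node)
{
  assert (node != NULL);
  node->next = item_list;
  item_list = node;
  return ++item_counter;
}
```

Because the macro is empty here, the compiler sees ordinary declarations; the sketch says nothing about what the generated marking code looks like.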
@menu
* GTY Options::         What goes inside a @code{GTY(())}.
* GGC Roots::           Making global variables GGC roots.
* Files::               How the generated files work.
* Invoking the garbage collector::   How to invoke the garbage collector.
@end menu
@node GTY Options
@section The Inside of a @code{GTY(())}
78 Sometimes the C code is not enough to fully describe the type
79 structure. Extra information can be provided with @code{GTY} options
80 and additional markers. Some options take a parameter, which may be
81 either a string or a type name, depending on the parameter. If an
82 option takes no parameter, it is acceptable either to omit the
83 parameter entirely, or to provide an empty string as a parameter. For
example, @code{@w{GTY ((skip))}} and @code{@w{GTY ((skip ("")))}} are
equivalent.
87 When the parameter is a string, often it is a fragment of C code. Four
88 special escapes may be used in these strings, to refer to pieces of
89 the data structure being marked:
91 @cindex % in GTY option
@table @code
@item %h
The current structure.
@item %1
The structure that immediately contains the current structure.
@item %0
The outermost structure that contains the current structure.
@item %a
A partial expression of the form @code{[i1][i2]@dots{}} that indexes
the array item currently being marked.
@end table
For instance, suppose that @code{struct B} has a field @code{foo} that
is an array of @code{struct A}, and @code{b} is a variable of type
@code{struct B}. When marking @samp{b.foo[11]}, @code{%h} would expand
to @samp{b.foo[11]}, @code{%0} and @code{%1} would both expand to
@samp{b}, and @code{%a} would expand to @samp{[11]}.
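
To make the substitution concrete, here is a minimal sketch of how the four escapes could be expanded in an option string, for a variable @code{b} of a type that has an array field @code{foo}. It is not the code used by the type machinery; @code{struct escape_context} and @code{expand_escapes} are hypothetical names introduced for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of the object being marked.  */
struct escape_context
{
  const char *current;    /* substituted for %h */
  const char *parent;     /* substituted for %1 */
  const char *outermost;  /* substituted for %0 */
  const char *index;      /* substituted for %a */
};

/* Copy TMPL into OUT (of SIZE bytes), replacing the four escapes.
   Any other character is copied unchanged.  Return 0 on success and
   -1 if OUT is too small.  */
int
expand_escapes (const char *tmpl, const struct escape_context *ctx,
                char *out, size_t size)
{
  size_t used = 0;

  assert (tmpl != NULL && ctx != NULL && out != NULL && size > 0);
  while (*tmpl != '\0')
    {
      const char *piece = NULL;
      char literal[2] = { *tmpl, '\0' };

      if (tmpl[0] == '%')
        switch (tmpl[1])
          {
          case 'h': piece = ctx->current; break;
          case '1': piece = ctx->parent; break;
          case '0': piece = ctx->outermost; break;
          case 'a': piece = ctx->index; break;
          default: break;
          }

      if (piece != NULL)
        tmpl += 2;          /* consumed an escape */
      else
        {
          piece = literal;  /* ordinary character */
          tmpl += 1;
        }

      size_t len = strlen (piece);
      if (used + len >= size)
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }
  out[used] = '\0';
  return 0;
}
```

With the context @code{%h} = @samp{b.foo[11]}, @code{%1} = @code{%0} = @samp{b} and @code{%a} = @samp{[11]}, the string @samp{%h.num_elem} becomes @samp{b.foo[11].num_elem}.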
119 As in ordinary C, adjacent strings will be concatenated; this is
120 helpful when you have a complicated expression.
@smallexample
GTY ((chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE"
                  " ? TYPE_NEXT_VARIANT (&%h.generic)"
                  " : TREE_CHAIN (&%h.generic)")))
@end smallexample
129 The available options are:
133 @item length ("@var{expression}")
135 There are two places the type machinery will need to be explicitly told
136 the length of an array. The first case is when a structure ends in a
137 variable-length array, like this:
@smallexample
struct GTY(()) rtvec_def @{
  int num_elem;         /* @r{number of elements} */
  rtx GTY ((length ("%h.num_elem"))) elem[1];
@};
@end smallexample
145 In this case, the @code{length} option is used to override the specified
146 array length (which should usually be @code{1}). The parameter of the
147 option is a fragment of C code that calculates the length.
149 The second case is when a structure or a global variable contains a
150 pointer to an array, like this:
@smallexample
struct gimple_omp_for_iter * GTY((length ("%h.collapse"))) iter;
@end smallexample

In this case, @code{iter} has been allocated by writing something like

@smallexample
  x->iter = ggc_alloc_cleared_vec_gimple_omp_for_iter (collapse);
@end smallexample

@noindent
and @code{collapse} provides the length of the field.

This second use of @code{length} also works on global variables, like:

@smallexample
static GTY((length("reg_known_value_size"))) rtx *reg_known_value;
@end smallexample
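
The pointer-to-array case can be pictured with a small stand-alone sketch. It does not use GGC; it only shows why a walker needs the stored length, since the C type of the pointer does not carry it. The types @code{struct loop} and @code{struct loop_iter} and both functions are hypothetical, and @code{calloc} stands in for a GC allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type; `visited' stands in for whatever a
   marker would do to each element.  */
struct loop_iter
{
  int visited;
};

/* ITER points to COLLAPSE elements; the C type alone does not say so.  */
struct loop
{
  size_t collapse;
  struct loop_iter *iter;
};

/* Allocate a zeroed array of COLLAPSE elements for L.
   Return 0 on success, -1 on allocation failure.  */
int
loop_init (struct loop *l, size_t collapse)
{
  assert (l != NULL && collapse > 0);
  l->iter = calloc (collapse, sizeof *l->iter);
  l->collapse = l->iter != NULL ? collapse : 0;
  return l->iter != NULL ? 0 : -1;
}

/* Visit every element, taking the bound from the structure itself,
   which is the information a length ("%h.collapse") option supplies.
   Return the number of elements visited.  */
size_t
loop_walk (struct loop *l)
{
  size_t i;

  assert (l != NULL);
  for (i = 0; i < l->collapse; i++)
    l->iter[i].visited = 1;
  return i;
}
```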
@item skip
If @code{skip} is applied to a field, the type machinery will ignore it.
169 This is somewhat dangerous; the only safe use is in a union when one
170 field really isn't ever used.
@item desc ("@var{expression}")
@itemx tag ("@var{constant}")
@itemx default
179 The type machinery needs to be told which field of a @code{union} is
180 currently active. This is done by giving each field a constant
181 @code{tag} value, and then specifying a discriminator using @code{desc}.
182 The value of the expression given by @code{desc} is compared against
183 each @code{tag} value, each of which should be different. If no
184 @code{tag} is matched, the field marked with @code{default} is used if
185 there is one, otherwise no field in the union will be marked.
187 In the @code{desc} option, the ``current structure'' is the union that
188 it discriminates. Use @code{%1} to mean the structure containing it.
There are no escapes available to the @code{tag} option, since it is a
constant.
@smallexample
struct GTY(()) tree_binding
@{
  struct tree_common common;
  union tree_binding_u @{
    tree GTY ((tag ("0"))) scope;
    struct cp_binding_level * GTY ((tag ("1"))) level;
  @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope;
@};
@end smallexample
In this example, the value of @code{BINDING_HAS_LEVEL_P} when applied to a
@code{struct tree_binding *} is presumed to be 0 or 1. If it is 1, the type
mechanism treats the field @code{level} as being present; if it is 0, it
treats the field @code{scope} as being present.
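
The effect of @code{desc} and @code{tag} on marking can be sketched as a switch on the discriminator. The code below is an illustration with made-up types (@code{struct binding}, @code{struct scope}, @code{struct level}) and is not what the type machinery generates; it only shows that exactly one union member is followed, and none when no tag matches and there is no default field.

```c
#include <assert.h>
#include <stddef.h>

struct scope { int marked; };
struct level { int marked; };

/* Hypothetical analogue of tree_binding: HAS_LEVEL plays the role of
   the expression a desc option would name.  */
struct binding
{
  int has_level;
  union
  {
    struct scope *scope;   /* would carry tag ("0") */
    struct level *level;   /* would carry tag ("1") */
  } xscope;
};

/* Follow only the active union member.  Return the tag that matched,
   or -1 if no tag matched, in which case nothing is marked.  */
int
binding_mark (struct binding *b)
{
  assert (b != NULL);
  switch (b->has_level)
    {
    case 0:
      if (b->xscope.scope != NULL)
        b->xscope.scope->marked = 1;
      return 0;
    case 1:
      if (b->xscope.level != NULL)
        b->xscope.level->marked = 1;
      return 1;
    default:
      return -1;
    }
}
```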
@item param_is (@var{type})
@itemx use_param
215 Sometimes it's convenient to define some data structure to work on
216 generic pointers (that is, @code{PTR}) and then use it with a specific
217 type. @code{param_is} specifies the real type pointed to, and
@code{use_param} says where in the generic data structure that type
should be put.
221 For instance, to have a @code{htab_t} that points to trees, one would
222 write the definition of @code{htab_t} like this:
@smallexample
typedef struct GTY(()) @{
  @dots{}
  void ** GTY ((use_param, @dots{})) entries;
  @dots{}
@} htab_t;
@end smallexample

@noindent
and then declare variables like this:

@smallexample
  static htab_t GTY ((param_is (union tree_node))) ict;
@end smallexample
235 @findex param@var{n}_is
236 @findex use_param@var{n}
237 @item param@var{n}_is (@var{type})
238 @itemx use_param@var{n}
240 In more complicated cases, the data structure might need to work on
241 several different types, which might not necessarily all be pointers.
242 For this, @code{param1_is} through @code{param9_is} may be used to
specify the real type of a field identified by @code{use_param1} through
@code{use_param9}.
@item use_params
When a structure contains another structure that is parameterized,
there's no need to do anything special; the inner structure inherits the
parameters of the outer one. When a structure contains a pointer to a
parameterized structure, the type machinery won't automatically detect
this (it could, it just doesn't yet), so it's necessary to tell it that
the pointed-to structure should use the same parameters as the outer
structure. This is done by marking the pointer with the
@code{use_params} option.
@item deletable
@code{deletable}, when applied to a global variable, indicates that when
garbage collection runs, there's no need to mark anything pointed to
by this variable; it can just be set to @code{NULL} instead. This is used
to keep a list of free structures around for re-use.
267 @item if_marked ("@var{expression}")
269 Suppose you want some kinds of object to be unique, and so you put them
270 in a hash table. If garbage collection marks the hash table, these
271 objects will never be freed, even if the last other reference to them
272 goes away. GGC has special handling to deal with this: if you use the
273 @code{if_marked} option on a global hash table, GGC will call the
274 routine whose name is the parameter to the option on each hash table
275 entry. If the routine returns nonzero, the hash table entry will
be marked as usual. If the routine returns zero, the hash table entry
will be removed.
279 The routine @code{ggc_marked_p} can be used to determine if an element
280 has been marked already; in fact, the usual case is to use
281 @code{if_marked ("ggc_marked_p")}.
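
The following sketch imitates that behaviour on a plain array of slots, without using GGC. @code{object_marked_p} is a stand-in for @code{ggc_marked_p}, and @code{table_sweep} is a hypothetical helper; the point is only that an entry survives when the predicate returns nonzero and is dropped otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a GC-managed object with a mark bit.  */
struct object { int marked; };

/* Stand-in for ggc_marked_p: nonzero if P was already marked through
   some other reference.  */
int
object_marked_p (const void *p)
{
  return ((const struct object *) p)->marked;
}

/* Sweep a table of N slots: keep entries for which IF_MARKED returns
   nonzero, clear the rest.  Return the number of entries removed.  */
int
table_sweep (struct object **slots, size_t n,
             int (*if_marked) (const void *))
{
  int removed = 0;

  assert (slots != NULL && if_marked != NULL);
  for (size_t i = 0; i < n; i++)
    if (slots[i] != NULL && !if_marked (slots[i]))
      {
        slots[i] = NULL;   /* the table no longer keeps it alive */
        removed++;
      }
  return removed;
}
```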
284 @item mark_hook ("@var{hook-routine-name}")
286 If provided for a structure or union type, the given
287 @var{hook-routine-name} (between double-quotes) is the name of a
288 routine called when the garbage collector has just marked the data as
289 reachable. This routine should not change the data, or call any ggc
routine. Its only argument is a pointer to the just marked (const)
structure or union.
@item maybe_undef
When applied to a field, @code{maybe_undef} indicates that it's OK if
the structure that this field points to is never defined, so long as
this field is always @code{NULL}. This is used to avoid requiring
backends to define certain optional structures. It doesn't work with
language frontends.
303 @item nested_ptr (@var{type}, "@var{to expression}", "@var{from expression}")
305 The type machinery expects all pointers to point to the start of an
306 object. Sometimes for abstraction purposes it's convenient to have
307 a pointer which points inside an object. So long as it's possible to
308 convert the original object to and from the pointer, such pointers
309 can still be used. @var{type} is the type of the original object,
310 the @var{to expression} returns the pointer given the original object,
311 and the @var{from expression} returns the original object given
the pointer. The pointer will be available using the @code{%h}
escape.
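
As an illustration of the two conversions that @code{nested_ptr} asks for, the sketch below uses a hypothetical @code{struct outer} whose clients only hold a pointer to its @code{inner} member. The function names are made up; in a real option the two directions would be written as expression strings rather than functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object; clients only ever see &outer->inner.  */
struct inner { int payload; };
struct outer
{
  int header;
  struct inner inner;
};

/* "To" direction: original object -> interior pointer.  */
struct inner *
outer_to_inner (struct outer *o)
{
  return o != NULL ? &o->inner : NULL;
}

/* "From" direction: interior pointer -> original object.  */
struct outer *
inner_to_outer (struct inner *i)
{
  return i != NULL
         ? (struct outer *) ((char *) i - offsetof (struct outer, inner))
         : NULL;
}

/* The scheme only works if the two conversions are inverses.  */
void
check_round_trip (struct outer *o)
{
  assert (inner_to_outer (outer_to_inner (o)) == o);
}
```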
317 @findex chain_circular
318 @item chain_next ("@var{expression}")
319 @itemx chain_prev ("@var{expression}")
320 @itemx chain_circular ("@var{expression}")
322 It's helpful for the type machinery to know if objects are often
323 chained together in long lists; this lets it generate code that uses
324 less stack space by iterating along the list instead of recursing down
325 it. @code{chain_next} is an expression for the next item in the list,
326 @code{chain_prev} is an expression for the previous item. For singly
327 linked lists, use only @code{chain_next}; for doubly linked lists, use
328 both. The machinery requires that taking the next item of the
previous item gives the original item. @code{chain_circular} is similar
to @code{chain_next}, but can be used for circular singly linked lists.
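
The stack-space argument can be seen in a small sketch that marks a list with a loop instead of recursion. This is not generated marking code; @code{struct node}, @code{mark_chain} and @code{make_chain} are hypothetical, and the @code{next} field plays the role that @code{chain_next ("%h.next")} would describe. Stopping at an already marked node also makes the loop terminate on a circular list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node
{
  int marked;
  struct node *next;   /* what chain_next ("%h.next") would name */
};

/* Mark a whole chain with a loop, so stack use does not grow with
   the length of the list.  Return the number of nodes newly marked.  */
long
mark_chain (struct node *n)
{
  long count = 0;

  for (; n != NULL && !n->marked; n = n->next)
    {
      n->marked = 1;
      count++;
    }
  return count;
}

/* Build a list of LEN unmarked nodes and return its head.  Aborts
   if memory runs out.  */
struct node *
make_chain (long len)
{
  struct node *head = NULL;

  assert (len >= 0);
  for (long i = 0; i < len; i++)
    {
      struct node *n = calloc (1, sizeof *n);
      if (n == NULL)
        abort ();
      n->next = head;
      head = n;
    }
  return head;
}
```

A recursive marker would need one stack frame per node for such a list; the loop needs a constant amount.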
333 @item reorder ("@var{function name}")
335 Some data structures depend on the relative ordering of pointers. If
336 the precompiled header machinery needs to change that ordering, it
337 will call the function referenced by the @code{reorder} option, before
338 changing the pointers in the object that's pointed to by the field the
339 option applies to. The function must take four arguments, with the
340 signature @samp{@w{void *, void *, gt_pointer_operator, void *}}.
341 The first parameter is a pointer to the structure that contains the
342 object being updated, or the object itself if there is no containing
343 structure. The second parameter is a cookie that should be ignored.
344 The third parameter is a routine that, given a pointer, will update it
to its correct new value. The fourth parameter is a cookie that must
be passed as the second argument to that routine.
348 PCH cannot handle data structures that depend on the absolute values
349 of pointers. @code{reorder} functions can be expensive. When
350 possible, it is better to depend on properties of the data, like an ID
351 number or the hash of a string instead.
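
A @code{reorder} function might look like the following sketch, which re-sorts a table that is ordered by pointer value. Several things here are assumptions: the @code{gt_pointer_operator} typedef is written out as a routine taking the address of a pointer and a cookie, and @code{struct ptr_table}, @code{resort_ptr_table} and the demonstration routine @code{mirror_relocate} are hypothetical. The function only changes the order of the entries; it leaves updating the pointers themselves to its caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed shape of the callback type: (address of pointer, cookie).  */
typedef void (*gt_pointer_operator) (void *, void *);

/* Hypothetical table whose entries are kept sorted by address.  */
struct ptr_table
{
  size_t len;
  void *elts[4];
};

/* qsort has no user-data argument, so stash the callback here.  */
static gt_pointer_operator resort_op;
static void *resort_cookie;

/* Compare two entries by the addresses they will have afterwards.  */
static int
cmp_by_new_address (const void *a, const void *b)
{
  void *pa = *(void *const *) a;
  void *pb = *(void *const *) b;

  resort_op (&pa, resort_cookie);   /* updates the local copies only */
  resort_op (&pb, resort_cookie);
  return ((uintptr_t) pa > (uintptr_t) pb) - ((uintptr_t) pa < (uintptr_t) pb);
}

/* Four-argument function of the kind named by a reorder option.  The
   second argument is ignored; COOKIE is handed back to OP.  */
void
resort_ptr_table (void *obj, void *unused, gt_pointer_operator op,
                  void *cookie)
{
  struct ptr_table *t = obj;

  (void) unused;
  assert (t != NULL && op != NULL);
  resort_op = op;
  resort_cookie = cookie;
  qsort (t->elts, t->len, sizeof t->elts[0], cmp_by_new_address);
}

/* Demonstration-only relocation: mirrors an address around the value
   stored in COOKIE, so the relative order of objects flips.  */
void
mirror_relocate (void *ptr_to_ptr, void *cookie)
{
  void **p = ptr_to_ptr;
  uintptr_t pivot = *(uintptr_t *) cookie;

  *p = (void *) (pivot - (uintptr_t) *p);
}
```

Sorting through a callback on every comparison is part of why such functions can be expensive.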
@findex variable_size
@item variable_size
The type machinery expects the types to be of constant size. When this
is not true, for example, with structs that have array fields or unions,
the type machinery cannot tell how many bytes need to be allocated at
each allocation. The @code{variable_size} option is used to mark such types.
The type machinery then provides allocators that take a parameter
indicating the exact size of the object being allocated.
@smallexample
struct GTY((variable_size)) sorted_fields_type @{
  int len;
  tree GTY((length ("%h.len"))) elts[1];
@};
@end smallexample

Then the objects of @code{struct sorted_fields_type} are allocated in GC
memory as follows:

@smallexample
  field_vec = ggc_alloc_sorted_fields_type (size);
@end smallexample
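
How the @var{size} argument might be computed is sketched below, outside of GCC. The typedef of @code{tree} is a placeholder, @code{struct sorted_fields} mirrors the layout above, and @code{calloc} stands in for the size-taking GC allocator; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *tree;   /* placeholder for GCC's tree type */

/* Same layout idea as sorted_fields_type: the declared bound of ELTS
   is 1, the real bound is LEN.  */
struct sorted_fields
{
  int len;
  tree elts[1];
};

/* Bytes needed for an object holding LEN elements.  */
size_t
sorted_fields_size (int len)
{
  assert (len >= 1);
  return offsetof (struct sorted_fields, elts) + (size_t) len * sizeof (tree);
}

/* Stand-in for a size-taking GC allocator; uses calloc here.
   Return NULL on allocation failure.  */
struct sorted_fields *
sorted_fields_alloc (int len)
{
  struct sorted_fields *sf = calloc (1, sorted_fields_size (len));

  if (sf != NULL)
    sf->len = len;
  return sf;
}
```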
378 @item special ("@var{name}")
380 The @code{special} option is used to mark types that have to be dealt
381 with by special case machinery. The parameter is the name of the
382 special case. See @file{gengtype.c} for further details. Avoid
383 adding new special cases unless there is no other alternative.
@node GGC Roots
@section Marking Roots for the Garbage Collector
388 @cindex roots, marking
389 @cindex marking roots
391 In addition to keeping track of types, the type machinery also locates
392 the global variables (@dfn{roots}) that the garbage collector starts
393 at. Roots must be declared using one of the following syntaxes:
@itemize @bullet
@item
@code{extern GTY(([@var{options}])) @var{type} @var{name};}
@item
@code{static GTY(([@var{options}])) @var{type} @var{name};}
@end itemize

@noindent
The syntax

@itemize @bullet
@item
@code{GTY(([@var{options}])) @var{type} @var{name};}
@end itemize

@noindent
is @emph{not} accepted. There should be an @code{extern} declaration
of such a variable in a header somewhere; mark that, not the
definition. Or, if the variable is only used in one file, make it
@code{static}.
@node Files
@section Source Files Containing Type Information
415 @cindex generated files
416 @cindex files, generated
418 Whenever you add @code{GTY} markers to a source file that previously
419 had none, or create a new source file containing @code{GTY} markers,
420 there are three things you need to do:
@enumerate
@item
You need to add the file to the list of source files the type
machinery scans. There are four cases:

@enumerate a
@item
For a back-end file, this is usually done
automatically; if not, you should add it to @code{target_gtfiles} in
the appropriate port's entries in @file{config.gcc}.

@item
For files shared by all front ends, add the filename to the
@code{GTFILES} variable in @file{Makefile.in}.

@item
For files that are part of one front end, add the filename to the
@code{gtfiles} variable defined in the appropriate
@file{config-lang.in}. For C, the file is @file{c-config-lang.in}.
Headers should appear before non-headers in this list.

@item
For files that are part of some but not all front ends, add the
filename to the @code{gtfiles} variable of @emph{all} the front ends
that use it.
@end enumerate

@item
If the file was a header file, you'll need to check that it's included
in the right place to be visible to the generated files. For a back-end
header file, this should be done automatically. For a front-end header
file, it needs to be included by the same file that includes
@file{gtype-@var{lang}.h}. For other header files, it needs to be
included in @file{gtype-desc.c}, which is a generated file, so add it to
@code{ifiles} in @code{open_base_file} in @file{gengtype.c}.

@item
For source files that aren't header files, the machinery will generate a
header file that should be included in the source file you just changed.
The file will be called @file{gt-@var{path}.h} where @var{path} is the
pathname relative to the @file{gcc} directory with slashes replaced by
@verb{|-|}, so for example the header file to be included in
@file{cp/parser.c} is called @file{gt-cp-parser.h}. The
generated header file should be included after everything else in the
source file. Don't forget to mention this file as a dependency in the
@file{Makefile.in}.
@end enumerate
470 For language frontends, there is another file that needs to be included
471 somewhere. It will be called @file{gtype-@var{lang}.h}, where
472 @var{lang} is the name of the subdirectory the language is contained in.
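
The naming rule for the generated @file{gt-@var{path}.h} header of a non-header source file can be written down as a short function. This is a sketch of the rule as stated in this section, not the code in @file{gengtype.c}; the function name is hypothetical, and dropping the trailing @samp{.c} is an assumption made for the common case of a C source file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write into OUT the name gt-PATH.h for the source file PATH, which is
   relative to the gcc directory: a trailing ".c" is dropped and
   slashes become dashes.  Return 0 on success, -1 if OUT is too
   small.  */
int
gt_header_name (const char *path, char *out, size_t size)
{
  assert (path != NULL && out != NULL);

  size_t len = strlen (path);
  if (len >= 2 && strcmp (path + len - 2, ".c") == 0)
    len -= 2;

  int n = snprintf (out, size, "gt-%.*s.h", (int) len, path);
  if (n < 0 || (size_t) n >= size)
    return -1;

  for (char *p = out; *p != '\0'; p++)
    if (*p == '/')
      *p = '-';
  return 0;
}
```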
Plugins can add additional root tables. Run the @code{gengtype}
utility in plugin mode as @code{gengtype -P pluginout.h @var{source-dir}
@var{file-list} @var{plugin*.c}}, where @var{plugin*.c} are your plugin
files using @code{GTY}; this generates the @var{pluginout.h} file.
The GCC build tree must be present in that mode.
481 @node Invoking the garbage collector
482 @section How to invoke the garbage collector
483 @cindex garbage collector, invocation
486 The GCC garbage collector GGC is only invoked explicitly. In contrast
487 with many other garbage collectors, it is not implicitly invoked by
allocation routines when a lot of memory has been consumed. So the
only way to have GGC reclaim storage is to call the @code{ggc_collect}
function explicitly. This call is an expensive operation, as it may
have to scan the entire heap. Beware that, unlike many other garbage
collectors, GGC does not follow local variables (on the GCC call
stack) during such an invocation: you should reference all your data
from static or external @code{GTY}-ed variables, and it is advisable
to call @code{ggc_collect} with a shallow call stack. GGC is an exact
mark and sweep garbage collector (so it does not scan the call stack for
pointers). In practice GCC passes don't often call @code{ggc_collect}
498 themselves, because it is called by the pass manager between passes.
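
The consequence for local variables can be demonstrated with a toy collector. The sketch below is not GGC: @code{toy_alloc} and @code{toy_collect} are hypothetical, a single global @code{root} stands in for the @code{GTY}-marked variables, and @code{calloc} and @code{free} manage the memory. It shows that allocation never collects, and that an object referenced only from a local variable is reclaimed by an explicit collection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A toy heap object with a single outgoing pointer.  */
struct obj
{
  int marked;
  struct obj *child;
  struct obj *all_next;   /* links every allocated object */
};

static struct obj *all_objects;   /* everything currently allocated */
static struct obj *root;          /* the only registered root */

/* Allocate an object; never triggers a collection by itself.  */
struct obj *
toy_alloc (struct obj *child)
{
  struct obj *o = calloc (1, sizeof *o);

  assert (o != NULL);
  o->child = child;
  o->all_next = all_objects;
  all_objects = o;
  return o;
}

/* Explicit collection: mark from ROOT only, then free the rest.
   Return the number of objects freed.  */
int
toy_collect (void)
{
  int freed = 0;

  /* Mark phase: the call stack is never scanned.  */
  for (struct obj *o = root; o != NULL && !o->marked; o = o->child)
    o->marked = 1;

  /* Sweep phase: unlink and free whatever was not marked.  */
  struct obj **link = &all_objects;
  while (*link != NULL)
    {
      struct obj *o = *link;
      if (o->marked)
        {
          o->marked = 0;          /* reset for the next collection */
          link = &o->all_next;
        }
      else
        {
          *link = o->all_next;
          free (o);
          freed++;
        }
    }
  return freed;
}
```

After such a collection, a local pointer to a freed object is dangling, which is why data should be reachable from a root before the collector runs.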