   1 This is Info file suif1.info, produced by Makeinfo version 1.68 from
   2 the input file suif1.texi.
   3
   4    This file documents the SUIF library.
   5
   6    Copyright (C) 1994 Stanford University.  All rights reserved.
   7
   8    Permission is given to use, copy, and modify this documentation for
   9 any non-commercial purpose as long as this copyright notice is not
  10 removed.  All other uses, including redistribution in whole or in part,
  11 are forbidden without prior written permission.
  12
  13 \x1f
  14 File: suif1.info,  Node: Load Constant Instructions,  Next: Call Instructions,  Prev: Branch and Jump Instructions,  Up: Instructions
  15
  16 Load Constant Instructions
  17 ==========================
  18
  19    Rather than allowing constant values to be used directly as operands,
  20 SUIF uses separate `ldc' instructions to load constant values.  The
  21 `in_ldc' class holds these instructions.  Instead of the usual source
  22 operands, this class has an immediate value field (*note Immeds::.).
  23 The `value' and `set_value' methods may be used to access this field.
  24
  25    Only certain kinds of immediate values are supported in an `ldc'
  26 instruction:
  27
  28 Symbolic addresses (*Note Symbolic Addresses::)
  29      The result type of the instruction must be a pointer type.
  30
  31 Integers
  32      The result type must be an integer or pointer type.  Pointer types
  33      are allowed so that the null pointer can be loaded as the integer
  34      value zero.
  35
  36 Floating-point values
  37      The result type must be a floating-point type.
  38
  39 Other kinds of immediate values may be stored in the `value' field of
  40 an `ldc' instruction, but most SUIF passes and certain library
  41 functions will not be able to handle them.
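
The rules above can be read as a small table pairing each kind of immediate value with the result types it allows.  The following C++ sketch encodes that table.  The enums and the function name are hypothetical stand-ins invented for illustration; they are not part of the SUIF library.

```cpp
// Hypothetical stand-ins for illustration; not SUIF declarations.
enum class ImmedKind { SymbolicAddress, Integer, FloatingPoint, Other };
enum class ResultKind { Integer, Pointer, FloatingPoint, Other };

// Returns true when an `ldc' instruction holding this kind of immediate
// value, with this kind of result type, follows the rules listed above.
bool ldc_is_supported(ImmedKind value, ResultKind result) {
    switch (value) {
    case ImmedKind::SymbolicAddress:
        return result == ResultKind::Pointer;
    case ImmedKind::Integer:
        // Pointer results are allowed so that integer zero can serve
        // as the null pointer.
        return result == ResultKind::Integer || result == ResultKind::Pointer;
    case ImmedKind::FloatingPoint:
        return result == ResultKind::FloatingPoint;
    default:
        // Other immediates can be stored, but most passes cannot handle them.
        return false;
    }
}
```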
  42
  43 \x1f
  44 File: suif1.info,  Node: Call Instructions,  Next: Array Instructions,  Prev: Load Constant Instructions,  Up: Instructions
  45
  46 Call Instructions
  47 =================
  48
  49    SUIF uses a special `cal' instruction to represent procedure calls.
  50 This high-level representation hides the details of various linkage
  51 conventions.  The `in_cal' class is used to represent these call
  52 instructions.  A call instruction contains a source operand to hold a
  53 pointer to the procedure to be called.  The `addr_op' and `set_addr_op'
  54 methods access this operand field.
  55
  56    The actual parameters for the procedure are stored in an array of
  57 operands.  The `num_args' method returns the number of elements in this
  58 array.  The size of the array can be changed at any time using the
  59 `set_num_args' method.  If necessary, the array will be reallocated.
  60 Elements of the argument array may be accessed using the `argument' and
  61 `set_argument' methods.  You must specify the array index.  The first
  62 argument is at index zero.
  63
  64    Call instructions must obey some conventions on the types of the
  65 operands.  The `addr' operand must hold a pointer to a function type
  66 which is compatible with the type of the procedure being called.  The
  67 result type of the call instruction must match the return type of the
  68 procedure.  The restrictions on instruction result types (*note Result
  69 Types::.) guarantee that the return type will either be void or have
  70 known, non-zero size.  If the function type specifies the number of
  71 arguments, it must match the number of actual parameters (unless the
  72 function takes a variable number of arguments).  Moreover, each operand
  73 in the argument array must be compatible with the type of the
  74 corresponding formal parameter.  Whether or not the function type
  75 specifies the argument types, the restrictions on instruction result
  76 types (*note Result Types::.) and variables (*note Variable Symbols::.)
  77 guarantee that all arguments will have known, non-zero size.
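
As a rough illustration of the argument conventions, the C++ sketch below checks a list of actual parameter types against a simplified function type.  `FuncTypeModel' and `call_args_ok' are hypothetical names, types are reduced to opaque integer ids, and "compatible" is simplified to "equal".  The rule that a variable-argument function still needs at least its declared formals is an assumption of the sketch, not something stated above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a function type; not the SUIF function type class.
struct FuncTypeModel {
    bool args_known;             // does the type specify its arguments?
    bool varargs;                // does it take a variable number of arguments?
    std::vector<int> arg_types;  // formal parameter types, as opaque ids
};

// Checks the actual parameter types of a call against the function type.
bool call_args_ok(const FuncTypeModel &ft, const std::vector<int> &actuals) {
    if (!ft.args_known)
        return true;  // nothing to compare against
    if (!ft.varargs && actuals.size() != ft.arg_types.size())
        return false;  // the counts must match exactly
    if (ft.varargs && actuals.size() < ft.arg_types.size())
        return false;  // assumption: declared formals must all be supplied
    for (std::size_t i = 0; i < ft.arg_types.size(); ++i)
        if (actuals[i] != ft.arg_types[i])
            return false;  // simplified notion of compatibility
    return true;
}
```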
  78
  79 \x1f
  80 File: suif1.info,  Node: Array Instructions,  Next: Multi-way Branch Instructions,  Prev: Call Instructions,  Up: Instructions
  81
  82 Array Instructions
  83 ==================
  84
  85    Because many SUIF passes focus on analyzing and optimizing Fortran
  86 code, a high-level representation of array references is crucial.  SUIF
  87 provides `array' instructions which retain all of the high-level
  88 information in combination with other fields needed to generate code for
  89 the address computations.  The `in_array' class is used to hold these
  90 array instructions.
  91
  92    Array instructions include a number of fields.  First, a pointer to
  93 the base of the array is specified in an operand field that can be
  94 accessed with the `base_op' and `set_base_op' methods.  If the array
  95 elements are structures, a constant offset within the selected element
  96 may be included.  This optional integer offset can be accessed using the
  97 `offset' and `set_offset' methods.  The element size is needed to
  98 generate low-level code for the array address calculation.  The
  99 `elem_size' method returns the element size in bits.  The
 100 `set_elem_size' method may be used to change the element size.
 101
 102    Because Fortran arrays do not always begin with index zero, an
 103 optional operand, which is referenced using the `offset_op' and
 104 `set_offset_op' methods, is provided to specify an offset.  Since there
 105 is a single offset operand, the offsets for all of the dimensions must
 106 be combined into a single value.  The arrays are stored in row-major
 107 form, so the offset for the first dimension is multiplied by the size
 108 of the remaining dimensions, etc.  If the offset operand is provided,
 109 it must have an integer type.
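
The way per-dimension offsets collapse into one value can be sketched in C++ as follows.  This is only an illustration of the row-major arithmetic: the function names are hypothetical, and the sketch assumes that the combined offset is subtracted from the linearized indexes to find the element position, which the text implies but does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Folds one value per dimension into a single row-major position.
// `bound[i]' is the number of elements in dimension i.  The bound of the
// first dimension never affects the result.
long linearize(const std::vector<long> &value, const std::vector<long> &bound) {
    assert(value.size() == bound.size());
    long result = 0;
    for (std::size_t i = 0; i < value.size(); ++i)
        result = result * bound[i] + value[i];
    return result;
}

// Value for the single offset operand, given each dimension's lowest index.
long combined_offset(const std::vector<long> &lower,
                     const std::vector<long> &bound) {
    return linearize(lower, bound);
}

// Zero-based element position: linearized indexes minus the combined offset.
long element_position(const std::vector<long> &index,
                      const std::vector<long> &lower,
                      const std::vector<long> &bound) {
    return linearize(index, bound) - combined_offset(lower, bound);
}
```

For a two-dimensional array with 10 by 20 elements whose indexes both start at one, the combined offset is 1 * 20 + 1 = 21, and the reference at indexes (2, 3) lands on element position 22.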
 110
 111    Array instructions can treat arrays of arrays as multidimensional
 112 arrays, even though the type system does not support that directly.
 113 Each array instruction includes a field to specify the number of
 114 dimensions in the array.  This field may be accessed with the `dims'
 115 and `set_dims' methods.  The indexes for the array reference are stored
 116 in an array of source operands, one for each dimension.  These index
 117 operands can be accessed using the `index' and `set_index' methods.
 118 The dimensions are numbered beginning with zero.  Similarly, the number
  119 of elements in each dimension is stored in another array of source
 120 operands, which can be accessed with the `bound' and `set_bound'
 121 methods.
 122
 123    The result type of an array instruction must be a pointer.  However,
 124 it need not be a pointer to the element type.  If the elements are
 125 structures, the result type may be a pointer to one of the structure
 126 fields.  SUIF does not actually require that the result type match
 127 anything within the array element type, although that is highly
 128 recommended.  The `elem_type' method can be used to determine the
 129 actual type of the element being addressed.
 130
 131    The types of the array instruction operands must follow some
 132 conventions.  The index and bound operands must all have integer types.
 133 The base operand must be a pointer to an array.  If the array
 134 instruction has multiple dimensions, the base must point to a nested
 135 array (an array of arrays of arrays...) with the same depth as the
 136 number of dimensions.  For each dimension, if the bound operand is a
 137 constant, it must match the number of elements specified in the
 138 corresponding array type.  (If the lower and upper bounds in the array
 139 type are not both constant, then the bound operand may have any value.)
 140 The bound operand for the first dimension is optional and may be null.
 141 Finally, the element size must match the size of the elements in the
 142 array type.
 143
 144 \x1f
 145 File: suif1.info,  Node: Multi-way Branch Instructions,  Next: Label Instructions,  Prev: Array Instructions,  Up: Instructions
 146
 147 Multi-way Branch Instructions
 148 =============================
 149
 150    Fortran computed `goto' statements and C `switch' statements are
 151 represented in SUIF by multi-way branch (`mbr') instructions.  These
 152 are easier to analyze than the equivalent series of conditional
 153 branches, and they can easily be used to generate efficient jump table
 154 code.  The `in_mbr' class holds these multi-way branch instructions.
 155
 156    The `in_mbr' class contains a field with a pointer to an array of
 157 label symbols.  The `num_labs' method returns the number of labels in
 158 the array.  The size of the array can be changed at any time using the
  159 `set_num_labs' method; if necessary, the array will be reallocated.  A
 160 particular element within the array can be accessed using the `label'
 161 and `set_label' methods.  You must specify the array index, and, as
 162 usual, the elements are numbered beginning with zero.
 163
 164    A multi-way branch instruction transfers control to one of the target
 165 labels depending on the value in the source operand.  This operand must
 166 have an integer type.  It can be accessed using the `src_op' and
 167 `set_src' methods.  The value of the source operand is combined with an
 168 integer offset to determine the target label.  The offset can be
 169 accessed with the `lower' and `set_lower' methods.  The offset is
 170 subtracted from the value in the source operand and the result is used
 171 to index into the array of target labels.  If the index is within the
 172 range of the array, the instruction branches to the label at that
 173 position in the array; otherwise, it branches to the default target
 174 label.  The `default_lab' and `set_default_lab' methods access this
 175 default label field.  The destination operand of a multi-way branch is
 176 unused and trying to set it will cause an error.  The result type
 177 should always be the SUIF `void' type.
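
The target selection described above can be sketched in a few lines of C++.  `MbrModel' and `mbr_target' are hypothetical names, and labels are plain strings here rather than label symbols; the sketch only mirrors the arithmetic, not the `in_mbr' class itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the fields of a multi-way branch.
struct MbrModel {
    int lower;                        // offset subtracted from the source value
    std::vector<std::string> labels;  // target labels, numbered from zero
    std::string default_label;        // used when the index is out of range
};

// Returns the label that the branch would transfer control to.
std::string mbr_target(const MbrModel &m, long src_value) {
    long index = src_value - m.lower;
    if (index >= 0 && static_cast<std::size_t>(index) < m.labels.size())
        return m.labels[static_cast<std::size_t>(index)];
    return m.default_label;
}
```

With `lower' set to 10 and three labels, source values 10 through 12 select the three labels in order, and every other value selects the default label.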
 178
 179 \x1f
 180 File: suif1.info,  Node: Label Instructions,  Next: Generic Instructions,  Prev: Multi-way Branch Instructions,  Up: Instructions
 181
 182 Label Instructions
 183 ==================
 184
 185    SUIF uses special pseudo-instructions to mark the positions of labels
 186 within the lists of instructions.  These label (`lab') instructions are
 187 represented by the `in_lab' class, which contains a single field
 188 holding the symbol for a label.  The `label' and `set_label' methods
 189 access this field.
 190
 191    No operation is performed by a label instruction.  Its only purpose
 192 is to mark the location of a label symbol in an instruction list.  The
 193 `label' field must be a pointer to the symbol for the label, which must
 194 be defined within the scope where the label instruction occurs.  The
 195 destination operand is unused and trying to set it will cause an error.
 196 The result type should always be the SUIF `void' type.
 197
 198 \x1f
 199 File: suif1.info,  Node: Generic Instructions,  Prev: Label Instructions,  Up: Instructions
 200
 201 Generic Instructions
 202 ====================
 203
 204    To help support special-purpose extensions to SUIF, we have provided
 205 a generic class of instructions.  This is implemented in the `in_gen'
 206 class.  These generic instructions contain arbitrarily large arrays of
 207 source operands and a character string field to hold user-defined names
 208 that function as "sub-opcodes".  Generic instructions are not part of
 209 standard SUIF and most SUIF passes will not handle them.
 210
 211    Because it is difficult to add new opcodes to SUIF at run-time, the
 212 generic instructions all share the same `gen' opcode.  Instead, they
 213 are distinguished by user-defined names.  The `name' and `set_name'
 214 methods may be used to access these character string fields.  The
 215 `set_name' method automatically enters the name in the lexicon (*note
 216 Lexicon::.).
 217
 218    A generic instruction contains a pointer to an array of source
 219 operands.  The base class `num_srcs' method may be used to determine
 220 the size of this array.  The size may be changed at any time using the
 221 `set_num_srcs' method.  If necessary, the array will be reallocated.
 222 The elements of the source operand array can be accessed using the
 223 standard base class `src_op' and `set_src_op' methods.  *Note Source
 224 Operands::.
 225
 226 \x1f
 227 File: suif1.info,  Node: Symbol Tables,  Next: Annotations,  Prev: Types,  Up: Top
 228
 229 Symbol Tables
 230 *************
 231
 232    Symbol tables contain the definitions of the symbols and types used
 233 within a SUIF program.  Each symbol table is associated with an object
 234 corresponding to a particular scope.  For example, a procedure symbol
 235 table is attached to the abstract syntax tree representing the body of
 236 the procedure.  The symbol tables can be reached through the
 237 corresponding objects and vice versa.
 238
 239    This section describes the symbol table hierarchy and the details of
 240 the symbol table operations, such as looking up symbol table entries and
 241 adding new entries.  It also explains how the symbol tables handle the
 242 task of assigning unique ID numbers to the symbols and types.  The
 243 `symtab.h' and `symtab.cc' files contain the code for symbol tables.
 244
 245 * Menu:
 246
 247 * Symbol Table Hierarchy::      Different kinds of symbol tables.
 248 * Basic Symtab Features::       Basic features common to all symbol tables.
 249 * Lookup Methods::              Finding symbol table entries.
 250 * Creating New Entries::        Creating new objects in a symbol table.
 251 * Adding and Removing Entries::  Changing the symbol table contents.
 252 * Numbering Types and Symbols::  Assigning ID numbers to types and symbols.
 253
 254 \x1f
 255 File: suif1.info,  Node: Symbol Table Hierarchy,  Next: Basic Symtab Features,  Up: Symbol Tables
 256
 257 Symbol Table Hierarchy
 258 ======================
 259
 260    The SUIF symbol tables are organized in a hierarchy of nested scopes
 261 and maintained internally within a tree structure.  Every table
 262 contains a list of the symbol tables that are its children, and each
 263 table also has a pointer back to its parent in the tree (except for the
  264 global symbol table, which does not have a parent).  The `children'
 265 method returns a pointer to the list of children and the `parent'
 266 method gets the pointer to the parent symbol table.  Thus, to search
 267 through all of the enclosing scopes, one can follow the parent pointers
 268 back to the global symbol table, visiting all of the symbol tables
 269 along the way.  The `is_ancestor' method provides an easy way to check
 270 if a given symbol table is an ancestor (i.e. an enclosing scope) of the
 271 current table.
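
Walking the parent pointers can be sketched in C++ as follows.  `ScopeModel' keeps only the parent link and is a hypothetical stand-in, not the `base_symtab' class.  Whether the real `is_ancestor' method treats a table as its own ancestor is not stated above; this sketch does not.

```cpp
// Hypothetical model of a symbol table; only the parent link is kept.
struct ScopeModel {
    const ScopeModel *parent;  // null for the global symbol table
};

// True when `candidate' is an enclosing scope of `table'.
bool is_enclosing_scope(const ScopeModel *candidate, const ScopeModel *table) {
    for (const ScopeModel *s = table->parent; s != nullptr; s = s->parent)
        if (s == candidate)
            return true;
    return false;
}

// Number of parent links followed to reach the global symbol table.
int scope_depth(const ScopeModel *table) {
    int depth = 0;
    for (const ScopeModel *s = table->parent; s != nullptr; s = s->parent)
        ++depth;
    return depth;
}
```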
 272
 273    Note that the symbol table hierarchy is not independent.  The primary
 274 objects in a SUIF program are the files and the abstract syntax trees
 275 for the procedures.  The symbol tables are always attached to these
 276 primary objects and are generally treated as if they are parts of those
  277 objects.  For example, when a block of code is deleted, the associated
 278 symbol table is automatically removed from the hierarchy and deleted.
 279
 280    The `base_symtab' class is the base class from which the other
 281 symbol table classes are derived, but it is an abstract class and cannot
  282 be used directly.  There are four different derived symbol table
 283 classes.  They have much in common, but each is used at a different
 284 level in the hierarchy and thus has slightly different features.
 285
 286 * Menu:
 287
 288 * Global Symbol Table::         Global scope (shared across files).
 289 * File Symbol Tables::          File-level global scopes.
 290 * Procedure Symbol Tables::     Top-level procedure scopes.
 291 * Block Symbol Tables::         Nested scopes within procedures.
 292
 293 \x1f
 294 File: suif1.info,  Node: Global Symbol Table,  Next: File Symbol Tables,  Up: Symbol Table Hierarchy
 295
 296 The Global Symbol Table
 297 -----------------------
 298
 299    The global symbol table is at the top of the symbol table hierarchy
 300 and corresponds to the outermost global scope.  It contains objects
 301 that are visible across source files (i.e. shared types and global
 302 symbols with external linkage).  For this reason, it is associated with
 303 the `file_set' object.  *Note File Set::.
 304
 305    The advantage of using a shared global symbol table appears when
 306 performing interprocedural analyses and transformations.  Without a
 307 common symbol table, it can be quite a burden to deal with references to
 308 symbols that are defined in some files but not in others.  Even trying
 309 to determine which symbols from different files correspond to the same
 310 objects is difficult.  In essence, each interprocedural pass would need
 311 to do the work of a linker!  The shared global symbol table avoids all
 312 of these problems and makes interprocedural optimization relatively
 313 easy.
 314
 315    Along with the benefits of the global symbol table come a few
 316 difficulties.  Sharing the global symbol table across files makes it
 317 difficult to support separate compilation.  Each file must contain a
 318 copy of the global symbol table, and if these files are manipulated
 319 individually, their copies of the global symbol table will not be
 320 consistent.  Thus, before a group of files can be combined in a SUIF
 321 file set, their global symbol tables must be "linked" together using
 322 the SUIF linker pass.  Whether this is preferable to just combining all
 323 of the source files into one big SUIF file is debatable.
 324
 325    The `global_symtab' class is used to represent the global symbol
 326 table.  It is also used as the base class for file symbol tables.
 327 Because procedure symbols may only be entered in global and file symbol
 328 tables, this class contains the methods to deal with them.  The
 329 `new_proc' method creates a new procedure symbol and enters it in the
 330 table (*note Creating New Entries::.), and the `lookup_proc' method
 331 searches for an existing procedure symbol (*note Lookup Methods::.).
 332 The `number_globals' method in this class handles the task of assigning
 333 ID numbers to the symbols and types in global and file symbol tables
 334 (*note Numbering Types and Symbols::.).
 335
 336 \x1f
 337 File: suif1.info,  Node: File Symbol Tables,  Next: Procedure Symbol Tables,  Prev: Global Symbol Table,  Up: Symbol Table Hierarchy
 338
 339 File Symbol Tables
 340 ------------------
 341
 342    A file symbol table corresponds to the global scope for a source
 343 file.  It contains procedure symbols and global variable symbols with
 344 static linkage, as well as types that are only used within the file.
 345 Each file symbol table is associated with a particular file set entry.
 346 *Note File Set Entries::.
 347
 348    The `file_symtab' class is derived from the `global_symtab' class to
 349 implement the file symbol tables.  Besides the features that this class
 350 inherits from its base class, it also contains a field to record the
 351 file set entry with which it is associated.  This field is set
 352 automatically when the file symbol table is created by the file set
 353 entry.  The `fse' method retrieves the value of this field.
 354
 355 \x1f
 356 File: suif1.info,  Node: Procedure Symbol Tables,  Next: Block Symbol Tables,  Prev: File Symbol Tables,  Up: Symbol Table Hierarchy
 357
 358 Procedure Symbol Tables
 359 -----------------------
 360
 361    Procedure symbol tables represent the top-level scopes within
 362 procedures and are associated with the `tree_proc' objects at the roots
 363 of the abstract syntax trees for the procedures.  *Note Procedure
 364 Nodes::.  Because the procedure symbol tables provide a superset of the
 365 block symbol table functions, they are implemented by deriving the
 366 `proc_symtab' class from the `block_symtab' class.  Thus, all of the
 367 `block_symtab' methods can also be applied to `proc_symtab' objects.
 368
 369    Besides the inherited methods, the procedure symbol tables have some
 370 added features.  Each procedure symbol table contains a list of the
 371 formal parameters for the procedure.  The `params' method returns a
 372 pointer to this list.  The entries on this list must be pointers to
 373 symbols for variables that are contained within the procedure symbol
 374 table.  (Formal parameters cannot be global variables or local variables
 375 in inner scopes.)  The symbols are listed in order.  If the function
 376 type for the procedure specifies the parameter types, they should match
 377 the types of the variables on the parameter list.
 378
 379    The procedure symbol table also records the next instruction ID
 380 number for the procedure (*note ID Numbers::.).  The `number_locals'
 381 method handles the task of assigning ID numbers to the symbols and
 382 types in symbol tables within the procedure (*note Numbering Types and
 383 Symbols::.).
 384
 385 \x1f
 386 File: suif1.info,  Node: Block Symbol Tables,  Prev: Procedure Symbol Tables,  Up: Symbol Table Hierarchy
 387
 388 Block Symbol Tables
 389 -------------------
 390
 391    The `block_symtab' class is used for nested block symbol tables and
 392 as the base class for procedure symbol tables.  Each one is associated
 393 with a particular `tree_block' (or `tree_proc') node in an abstract
 394 syntax tree.  *Note Block Nodes::.  Each block symbol table contains a
 395 pointer to the corresponding `tree_block' node.  The `block' method
 396 retrieves the value of this pointer.  When a symbol table is connected
 397 to a `tree_block', its `block' pointer is set automatically.
 398
 399    Since label symbols may not be declared in global scopes, the
 400 `block_symtab' class is the natural place to provide methods for
 401 working with labels.  The `new_label' method creates a new label symbol
 402 and enters it in the table (*note Creating New Entries::.).  The
  403 `new_unique_label' method does the same thing, but it first makes sure
  404 that the label will have a unique name.  The `lookup_label' method
  405 searches for an existing label symbol (*note Lookup Methods::.).
 406
 407    Block symbol tables also provide a method to create a new child
 408 symbol table, i.e. an inner scope.  The `new_unique_child' method can be
  409 used to create a new child block symbol table with a unique name (*note
 410 Creating New Entries::.).  This method is not provided for global
 411 symbol tables, because their children must correspond to procedures,
 412 which already have unique names.
 413
 414 \x1f
 415 File: suif1.info,  Node: Basic Symtab Features,  Next: Lookup Methods,  Prev: Symbol Table Hierarchy,  Up: Symbol Tables
 416
 417 Basic Features
 418 ==============
 419
 420    Symbol tables contain three different kinds of objects: types,
 421 symbols, and variable definitions.  The entries within a symbol table
 422 may only be referenced within the corresponding scope.  This includes
 423 references within registered annotations.  Violating this condition may
 424 lead to strange and unexpected errors.
 425
 426    For simplicity, the symbol table entries are stored on lists instead
 427 of using hash tables.  In theory, the actual implementation (lists or
 428 hash tables) should not be visible in the symbol table interface.
 429 Unfortunately that is not completely true for the current implementation
 430 of SUIF--the lists can be accessed directly.  The `types', `symbols',
 431 and `var_defs' methods return pointers to the lists.  However, these
 432 lists should only be accessed to examine the entries and should never
 433 be modified directly.  The symbol table classes provide other methods
 434 to add and remove entries from the lists and those methods should
 435 always be used.  If the list implementation becomes a performance
 436 bottleneck, we may need to switch to hash tables, and code that
 437 modifies the lists directly will be relatively hard to convert.
 438
 439    To distinguish the symbol tables nested within a particular scope,
 440 each table is given a name.  The `name' and `set_name' methods retrieve
 441 and modify this name.  If a scope in the source program has a name
 442 associated with it, that name may be used for the corresponding symbol
 443 table.  For example, the name of a procedure-level symbol table should
 444 generally be the same as the name of the procedure.  On the other hand,
 445 nested scopes within procedures are typically unnamed, and names must
 446 be generated for the corresponding symbol tables.
 447
 448    The symbol table names are used when printing a reference to a
 449 symbol or named type.  Because the symbol or type name alone may not be
 450 sufficient to identify it uniquely, the `chain_name' method is used to
 451 identify the symbol table.  The chain name of a symbol table includes
 452 the names of all of the symbol tables from the procedure-level downward,
 453 separated by slashes (as in a Unix path).  The file-level name is not
 454 included since it should always be clear from the context.  The chain
 455 name for a global or file symbol table is the empty string.
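
One way to picture the chain name is the C++ sketch below, which joins table names from the procedure level downward.  `NamedScope', `TableLevel', and `chain_name_sketch' are hypothetical names, and the exact placement of the slashes (for instance, whether the real method adds a trailing slash) is an assumption, since the text only says the names are separated by slashes.

```cpp
#include <string>

enum class TableLevel { Global, File, Procedure, Block };

// Hypothetical model of a named symbol table.
struct NamedScope {
    TableLevel level;
    std::string name;
    const NamedScope *parent;  // null for the global symbol table
};

// Joins the names of the enclosing tables, outermost procedure first.
// Global and file tables contribute nothing to the result.
std::string chain_name_sketch(const NamedScope *t) {
    if (t == nullptr || t->level == TableLevel::Global ||
        t->level == TableLevel::File)
        return "";
    std::string prefix = chain_name_sketch(t->parent);
    return prefix.empty() ? t->name : prefix + "/" + t->name;
}
```

Under these assumptions, a block table named `block1' inside the procedure table `main' gets the chain name `main/block1'.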
 456
 457    Duplicate names within a symbol table should be avoided whenever
 458 possible.  Each kind of symbol has a separate name space.  A variable,
 459 for example, may have the same name as a label in the same symbol table.
 460 Named types and child symbol table names are also in separate name
 461 spaces.  Duplicate names may be temporarily introduced but to avoid
 462 problems they should be renamed as soon as possible.  The
 463 `rename_duplicates' method is provided to check for and rename any
 464 duplicates in a symbol table.  This method is automatically called
 465 before writing out each symbol table.
 466
 467 \x1f
 468 File: suif1.info,  Node: Lookup Methods,  Next: Creating New Entries,  Prev: Basic Symtab Features,  Up: Symbol Tables
 469
 470 Lookup Methods
 471 ==============
 472
 473    SUIF symbol tables provide a number of methods to search for and
 474 retrieve particular types, symbols, and variable definitions.  Most of
 475 these lookup methods will optionally search all the ancestor symbol
  476 tables, making it easy to determine whether an object is visible in the
  477 current scope.
 478
 479    The `lookup_type' method is available at all levels in the symbol
 480 table hierarchy to search for SUIF types.  Given an existing type, the
 481 method searches for a type that is the same.  It uses the `is_same'
 482 method from the `type_node' class to perform these comparisons.  If a
 483 matching type is not found within the current symbol table,
 484 `lookup_type' will continue searching in the ancestor symbol tables by
 485 default.  However, if the optional `up' parameter is set to `FALSE', it
 486 will give up after searching the first table.
 487
  488    Several methods are provided to look up symbols.  Each different kind
 489 of symbol (variable, procedure, and label) has its own name space, so
 490 the `lookup_sym' method requires that you specify both the name and the
 491 kind of symbol for which to search.  This method may be used with all
 492 symbol tables.  For convenience, other methods are defined as wrappers
 493 around `lookup_sym'.  Each of these wrappers searches for a particular
 494 kind of symbol: `lookup_var' searches for variables, `lookup_proc'
 495 searches for procedures, and `lookup_label' searches for labels.
 496 Because procedure symbols may only be defined in global symbol tables,
 497 the `lookup_proc' method is declared in the `global_symtab' class.
 498 Similarly, the `lookup_label' method is declared in the `block_symtab'
 499 class, because labels may only be defined within procedures.  By
 500 default, all of these methods search the current symbol table and, if
 501 unsuccessful, proceed to search the ancestor symbol tables.  The
 502 optional `up' parameters may be set to `FALSE' to turn off this default
 503 behavior and only search the current symbol table.
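
The effect of the `up' parameter and of the separate name spaces can be sketched with a small C++ model.  `LookupScope' is a hypothetical stand-in: the method names echo the text, but the signatures, the integer symbol ids, and the map-based storage are assumptions made for illustration (the library itself stores entries on lists).

```cpp
#include <map>
#include <string>
#include <utility>

enum class SymKind { Var, Proc, Label };

// Hypothetical model of a symbol table with a parent link.
struct LookupScope {
    LookupScope *parent = nullptr;
    // Keyed by (kind, name), so each kind of symbol has its own name space.
    std::map<std::pair<SymKind, std::string>, int> symbols;  // value: symbol id

    // Returns a pointer to the symbol id, or nullptr when nothing matches.
    // With `up' false, only this table is searched.
    const int *lookup_sym(const std::string &name, SymKind kind,
                          bool up = true) const {
        for (const LookupScope *s = this; s != nullptr; s = s->parent) {
            auto it = s->symbols.find(std::make_pair(kind, name));
            if (it != s->symbols.end())
                return &it->second;
            if (!up)
                break;
        }
        return nullptr;
    }

    // Convenience wrapper in the style of `lookup_var'.
    const int *lookup_var(const std::string &name, bool up = true) const {
        return lookup_sym(name, SymKind::Var, up);
    }
};
```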
 504
 505    A symbol for a global variable is just a declaration of that variable
 506 and does not automatically have any storage allocated.  Variable
 507 definitions are required to allocate storage and to specify alignment
 508 requirements and any initial data for the variable.  Since the variable
 509 definitions are not directly connected to the variable symbols, the
 510 `lookup_var_def' method is provided to search a symbol table for the
 511 definition of a particular variable symbol.  This method does not
 512 search the parent symbol table.  In general the `definition' method in
 513 the `var_sym' class is a better way to locate a variable definition.
 514
 515    Symbols and types are assigned ID numbers (*note Numbering Types and
 516 Symbols::.) that uniquely identify them within a particular context.
 517 The `lookup_type_id' method searches the types defined within a symbol
 518 table and its ancestors for a type with the specified ID number.  The
  519 `lookup_sym_id' method does the same thing for symbols.
 520
 521    Besides searching for one of the entries in a symbol table, you can
 522 also search for one of its children in the symbol table hierarchy.  The
 523 `lookup_child' method searches through the list of children for a
 524 symbol table with a given name.  This may not be very useful, but it is
 525 included for completeness.
 526
 527 \x1f
 528 File: suif1.info,  Node: Creating New Entries,  Next: Adding and Removing Entries,  Prev: Lookup Methods,  Up: Symbol Tables
 529
 530 Creating New Entries
 531 ====================
 532
 533    To make it easier to add new entries, the symbol tables provide
 534 methods that combine the steps of creating new objects and then
 535 entering them in the tables.  Some of these methods automatically make
  536 sure that the new entries have unique names, which is particularly
 537 useful.
 538
 539    New variables can be added to tables anywhere in the symbol table
 540 hierarchy.  The `new_var' method creates a new variable with a given
 541 name and type and then enters the new variable symbol in the table.
 542 The `new_unique_var' method is similar, but it also checks that the
 543 name of the new variable is unique.  If not, it appends a number to the
 544 specified name until it is unique.  With this method, the base name is
 545 optional; the default value is `suif_tmp'.
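
The naming step of `new_unique_var' can be sketched in C++ as follows.  `make_unique_name' is a hypothetical helper, the set of used names stands in for the symbol table, and the starting number and the way it is formatted are assumptions; the text only says that a number is appended until the name is unique.

```cpp
#include <set>
#include <string>

// Returns `base' if it is unused; otherwise returns `base' followed by the
// first number (counting up from zero) that makes the name unique.
std::string make_unique_name(const std::set<std::string> &used,
                             const std::string &base = "suif_tmp") {
    if (used.count(base) == 0)
        return base;
    for (unsigned n = 0;; ++n) {
        std::string candidate = base + std::to_string(n);
        if (used.count(candidate) == 0)
            return candidate;
    }
}
```

With `suif_tmp' and `suif_tmp0' already taken, this sketch yields `suif_tmp1'.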
 546
 547    Procedure symbols can be created in global and file symbol tables
 548 using the `new_proc' method.  The name of the procedure, its type, and
 549 the source language must be specified.  There is currently no method to
 550 automatically create a new procedure symbol with a unique name.
 551
 552    Because label symbols may only be declared within procedures, the
 553 `new_label' and `new_unique_label' methods are provided in the
 554 `block_symtab' class.  The only parameter of these methods is the name
 555 of the label.  The name is optional for `new_unique_label'; its default
 556 value is `L'.  Just as with variables, unique label names are created
 557 by adding a number to the end of the base names.
 558
 559    Within a procedure, new inner scopes may be created to be used with
 560 new `tree_block' nodes.  The `block_symtab' class provides the
 561 `new_unique_child' method to create a new symbol table, give it a
 562 unique name, and add it to the list of children.  The unique name is
 563 created by appending a number to the optional base name.  If the base
 564 name is not given, it defaults to `block'.  This method is not needed
 565 at the global level, because the child symbol tables there correspond
 566 to procedures which should already have unique names.
 567
 568    Finally, new variable definitions can be added to any symbol table
 569 using the `define_var' method.  The parameters are the variable symbol
 570 and the alignment for the storage to be defined.  It returns a pointer
 571 to the new variable definition object, so that you can attach initial
 572 data annotations to it.
 573
 574 \x1f
 575 File: suif1.info,  Node: Adding and Removing Entries,  Next: Numbering Types and Symbols,  Prev: Creating New Entries,  Up: Symbol Tables
 576
 577 Adding and Removing Entries
 578 ===========================
 579
 580    Entries in symbol tables should always be added and removed using the
 581 methods provided by the symbol tables.  Although it is possible to add
 582 and remove entries by directly manipulating the lists, that should never
 583 be done.  The methods for adding and removing entries hide the
 584 underlying representation, and using them will make it much easier to
 585 update your code if that representation changes.  Even more importantly,
 586 most symbol table entries contain back pointers to the tables which hold
 587 them, and the adding and removing methods are responsible for
 588 maintaining those pointers and for performing a few other automatic
 589 checks and updates.
 590
 591    Types, symbols, and child symbol tables may be added using the
 592 `add_type', `add_sym', and `add_child' methods, respectively.  Each of
 593 these entries contains a pointer back to the parent symbol table, and
 594 these methods automatically set those back pointers.  They do not,
 595 however, perform any other sanity checks, such as checking for
 596 duplicate names.  Similarly, the `remove_type', `remove_sym', and
 597 `remove_child' methods remove types, symbols, and child symbol table
 598 entries.  These methods clear the parent pointers but do not delete the
 599 entries that are removed.
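
   The back-pointer bookkeeping described above might look roughly like
the following sketch.  The `table' and `entry' types are hypothetical
stand-ins, not the SUIF classes; only the behavior stated in the text
(set the parent on add, clear it on remove, no duplicate check, no
deletion) is modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct table;                       // hypothetical symbol table

struct entry {                      // hypothetical table entry
    std::string name;
    table *parent = nullptr;        // back pointer to the owning table
};

struct table {
    std::vector<entry *> syms;

    // Add an entry and set its back pointer.  No duplicate-name check
    // is made, mirroring what the text says about `add_sym'.
    void add_sym(entry *e)
    {
        assert(e->parent == nullptr);   // assumption: not in a table yet
        syms.push_back(e);
        e->parent = this;
    }

    // Remove an entry and clear its back pointer.  The entry itself is
    // not deleted; the caller still owns it.
    void remove_sym(entry *e)
    {
        auto it = std::find(syms.begin(), syms.end(), e);
        assert(it != syms.end());
        syms.erase(it);
        e->parent = nullptr;
    }
};
```

   Manipulating `syms' directly in such a design would leave `parent'
stale, which is the kind of inconsistency the adding and removing
methods exist to prevent.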
 600
 601    Variable definitions are treated a bit differently from other kinds
 602 of symbol table entries.  They do not have parent pointers, so the
 603 `add_def' and `remove_def' methods have no back pointers to maintain.
 604 However, adding and removing variable definitions change some
 605 attributes of the corresponding variables, and those attributes must be
 606 automatically updated.  First, each variable has a flag to indicate
 607 whether a variable definition exists for it.  A variable cannot have
 608 more than one definition, so the `add_def' method will fail if this
 609 flag is already set.  Otherwise, it sets the flag when the new
 610 definition is added.  Second, variable symbols also have a flag to
 611 indicate whether they are actual definitions or just declarations of
 612 symbols with external linkage.  This `extern' flag must be set to
 613 `FALSE' when a variable definition is added for a global variable.
 614 When removing a variable definition, these flags must be reversed.
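
   As a rough illustration of these flag updates, consider the sketch
below.  The `var' structure and its field names are hypothetical, and
reporting failure through a `bool' result is an assumption; the text
does not say how the real `add_def' method fails.

```cpp
#include <cassert>

struct var {                    // hypothetical variable symbol
    bool is_global = false;     // lives in a global or file table
    bool has_def = false;       // a definition exists for it
    bool is_extern = false;     // declaration only, no definition
};

// Returns false if the variable already has a definition.
bool add_def(var &v)
{
    if (v.has_def)
        return false;           // at most one definition per variable
    v.has_def = true;
    if (v.is_global)
        v.is_extern = false;    // now defined, no longer just declared
    return true;
}

// Reverses the flag changes made by add_def.
void remove_def(var &v)
{
    assert(v.has_def);
    v.has_def = false;
    if (v.is_global)
        v.is_extern = true;
}
```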
 615
 616    Unlike symbol nodes, which always define separate symbols, multiple
 617 type nodes can represent the same type.  The basic `add_type' method
 618 will add a new type even if an equivalent type was already defined in
 619 the same scope.  In most cases, what is actually needed is a method to
 620 first check if an equivalent type exists and if so to throw away the
 621 duplicate and return the existing type.  The `install_type' method
 622 provides this functionality.  It first checks if a type has already been
 623 entered in the symbol table or one of its ancestors using the
 624 `lookup_type' method.  If so, it deletes the new type and returns the
 625 existing one.  If a type is not found, it is entered into the symbol
 626 table and returned.  All of the components of a type are recursively
 627 installed before the type itself.  This makes it easy to create new
 628 types without worrying about duplicate entries in the symbol tables.
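
   The lookup-then-install logic can be sketched as follows.  This is a
much simplified model, not the SUIF implementation: the `type_node'
and `symtab' structures here are hypothetical, a type has at most one
component, and new types are always entered in the table on which
`install_type' is called.

```cpp
#include <string>
#include <vector>

struct type_node {               // hypothetical, much simplified type
    std::string op;              // e.g. "int" or "ptr"
    type_node *base;             // component type, or nullptr
    type_node(const std::string &o, type_node *b) : op(o), base(b) {}
};

struct symtab {                  // hypothetical symbol table
    symtab *parent = nullptr;
    std::vector<type_node *> types;

    // Search this table, then its ancestors, for an equivalent type.
    // Components are compared by pointer, which is valid once the
    // components have themselves been installed.
    type_node *lookup_type(const type_node *t)
    {
        for (symtab *s = this; s != nullptr; s = s->parent)
            for (type_node *u : s->types)
                if (u->op == t->op && u->base == t->base)
                    return u;
        return nullptr;
    }

    // Install the component first, then the type itself.  A duplicate
    // is deleted and the existing node is returned in its place.
    type_node *install_type(type_node *t)
    {
        if (t->base != nullptr)
            t->base = install_type(t->base);
        if (type_node *found = lookup_type(t)) {
            if (found != t)
                delete t;
            return found;
        }
        types.push_back(t);
        return t;
    }
};
```

   Because the argument may be deleted, a caller in this model must
always continue with the returned pointer rather than the one it
passed in.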
 629
 630 \x1f
 631 File: suif1.info,  Node: Numbering Types and Symbols,  Prev: Adding and Removing Entries,  Up: Symbol Tables
 632
 633 Numbering Types and Symbols
 634 ===========================
 635
 636    Every symbol and type is assigned an ID number that uniquely
 637 identifies it within a particular context.  These ID numbers should be
 638 used to refer to symbols and types in annotations that will be written
 639 to the output files and in other situations where pointers to the
 640 symbol and type nodes cannot be used.  The `sym_id' method retrieves
 641 the ID number for a symbol, and the `type_id' method gets the number
 642 for a type.
 643
 644    For symbols and types within a procedure, the ID numbers are only
 645 unique within that procedure.  Similarly, the ID numbers for symbols
 646 and types in a file symbol table are only unique within that file.
 647 Only in the global symbol table are the ID numbers truly unique.  This
 648 is implemented by dividing the ID numbers into three ranges.  Each
 649 range is reserved for a particular level in the symbol table hierarchy.
 650 To make it easier to read an ID number, the `print_id_number' function
 651 prints it as a character to identify the range (`g' for global, `f' for
 652 file, `p' for procedure) combined with the offset of the number within
 653 that range.
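
   The idea of range-partitioned ID numbers can be sketched as below.
The range boundaries, the treatment of zero as "no ID assigned", and
the exact output format are all assumptions made for illustration;
the real values and format are defined by the library and are not
given here.  `format_id' is a hypothetical helper, not the library's
`print_id_number' function.

```cpp
#include <cassert>
#include <string>

// Hypothetical range boundaries, chosen only for this sketch.
const unsigned proc_base   = 1u;
const unsigned file_base   = 0x10000000u;
const unsigned global_base = 0x20000000u;

// Render an ID as a range letter followed by the offset in the range.
std::string format_id(unsigned id)
{
    assert(id != 0);            // assumption: zero means "unassigned"
    if (id >= global_base)
        return "g" + std::to_string(id - global_base);
    if (id >= file_base)
        return "f" + std::to_string(id - file_base);
    return "p" + std::to_string(id - proc_base);
}
```

   With this layout, the range letter alone tells a reader which level
of the symbol table hierarchy must be searched to resolve the number.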
 654
 655    The symbol and type ID numbers cannot be assigned individually, but
 656 the symbol tables provide methods to set them.  The `number_globals'
 657 method is defined in the `global_symtab' class to number the entries in
 658 global and file symbol tables, and the `number_locals' method is
 659 defined in the `proc_symtab' class to number all of the entries in the
 660 procedure symbol table and its descendants.  These methods only assign
 661 ID numbers to symbols and types that do not already have numbers.
 662 These methods are called automatically before writing things out to
 663 files, but they can also be called whenever you want to assign numbers
 664 to new symbols and types.
 665
 666    The `clear_sym_id' symbol method and the `clear_type_id' type method
 667 are provided to reset the ID numbers to zero manually, but as far as the
 668 library itself is concerned, this is never necessary.  The library
 669 automatically changes ID numbers when necessary, such as when moving
 670 from one symbol table to another.
 671
 672 \x1f
 673 File: suif1.info,  Node: Symbols,  Next: Types,  Prev: Instructions,  Up: Top
 674
 675 Symbols
 676 *******
 677
 678    SUIF symbols are stored in the symbol tables (*note Symbol
 679 Tables::.) to represent variables, labels, and procedures.  The
 680 `sym_node' class is the base class for all SUIF symbols.  This is an
 681 abstract class so it cannot be used directly.  The library also defines
 682 the `sym_node_list' class for lists of pointers to symbols.  Classes
 683 are derived from the `sym_node' class for each kind of symbol.  Given
 684 an arbitrary symbol, the `kind' method identifies the kind of symbol
 685 and thus the derived class to which it belongs.  This method returns a
 686 value from the `sym_kinds' enumerated type.  The following values are
 687 defined in that enumeration:
 688
 689 `SYM_VAR'
 690      Variable symbol.  The `var_sym' class represents variable symbols.
 691
 692 `SYM_LABEL'
 693      Label symbol.  The `label_sym' class represents label symbols.
 694
 695 `SYM_PROC'
 696      Procedure symbol.  The `proc_sym' class represents procedure
 697      symbols.
 698
 699    All symbols share some common fields including the symbol names.
 700 These are described in the first section below.  Each kind of symbol
 701 also uses additional fields that are specific to that kind.  For
 702 example, variable symbols specify the types of the variables.  The
 703 subsequent sections describe the specific features of each kind of
 704 symbol.
 705
 706    The `symbols.h' and `symbols.cc' files contain the source code for
 707 SUIF symbols.
 708
 709 * Menu:
 710
 711 * Symbol Features::             Basic features of all symbols.
 712 * Procedure Symbols::           Procedures.
 713 * Label Symbols::               Labels.
 714 * Variable Symbols::            Variables.
 715
 716 \x1f
 717 File: suif1.info,  Node: Symbol Features,  Next: Procedure Symbols,  Up: Symbols
 718
 719 Basic Features of Symbols
 720 =========================
 721
 722    The `sym_node' class defines several fields that are used by all
 723 kinds of symbols.  The most obvious of these is the symbol name.  Each
 724 symbol has a name that should be unique within the symbol table where it
 725 is defined.  The `name' and `set_name' methods access this field.  The
 726 names are automatically entered in the lexicon (*note Lexicon::.) by
 727 `set_name'.  Because the name of a symbol alone is generally
 728 insufficient to uniquely identify it, the symbols are also given ID
 729 numbers.  *Note Numbering Types and Symbols::.
 730
 731    When a symbol is entered in a symbol table, it automatically records
 732 a pointer to that parent table.  Similarly, when the symbol is removed
 733 from the symbol table, its parent pointer is cleared.  The `parent'
 734 method retrieves this parent pointer.
 735
 736    All symbols contain flags to specify various attributes.  The
 737 `sym_node' class provides methods to access these flags.  The
 738 `is_userdef' method tests a flag that records whether the symbol is
 739 user-defined (from the source code) or introduced by the compiler.  The
 740 `set_userdef' and `reset_userdef' methods change the value of this flag.
 741
 742    Another flag is used to mark symbols that are only declarations of
 743 external symbols, rather than actual definitions.  This flag is set
 744 automatically.  The `is_extern' method retrieves its value.  Label
 745 symbols are never `extern'.  A procedure symbol is `extern' unless the
 746 procedure body is defined in the input file(s).  A global variable
 747 symbol is `extern' unless it has a separate definition (*note Variable
 748 Definitions::.); no other variables are `extern'.
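
   The rules for the `extern' flag can be restated as a small decision
function.  The `sym_kinds' values are those listed earlier in this
chapter; the `sym_info' structure and its fields are hypothetical and
exist only to make the sketch self-contained.

```cpp
#include <cassert>

enum sym_kinds { SYM_VAR, SYM_LABEL, SYM_PROC };

struct sym_info {               // hypothetical summary of a symbol
    sym_kinds kind;
    bool in_global_scope;       // parent is a global or file table
    bool has_definition;        // procedure body or variable definition
};

// Restates the rules in the text; the library sets the real flag itself.
bool is_extern(const sym_info &s)
{
    switch (s.kind) {
    case SYM_LABEL: return false;                 // labels: never extern
    case SYM_PROC:  return !s.has_definition;     // no body available
    case SYM_VAR:   return s.in_global_scope && !s.has_definition;
    }
    assert(!"unknown symbol kind");
    return false;
}
```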
 749
 750    Since symbols may be treated differently depending on their scopes,
 751 the `sym_node' class includes methods to determine which kind of symbol
 752 table contains a symbol.  The `is_global' method checks if the parent
 753 table is a global or file symbol table.  This is really only useful for
 754 variable symbols, because procedures are always global and labels are
 755 never global.  The `is_private' method checks if a symbol is global but
 756 private to one source file by checking if the parent symbol table is a
 757 file symbol table.  This is obviously irrelevant for label symbols.
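
   These two scope tests depend only on the kind of the parent table,
as the following sketch shows.  The `table_kind' enumeration and the
`scoped_sym' structure are hypothetical names chosen for this
example, not identifiers taken from the library.

```cpp
#include <cassert>

// Hypothetical classification of the table that holds a symbol.
enum table_kind { TABLE_GLOBAL, TABLE_FILE, TABLE_PROC, TABLE_BLOCK };

struct scoped_sym {             // hypothetical symbol
    table_kind parent_kind;
};

// Global means the parent is either the global or a file table.
bool is_global(const scoped_sym &s)
{
    return s.parent_kind == TABLE_GLOBAL || s.parent_kind == TABLE_FILE;
}

// Private means global, but visible in only one source file.
bool is_private(const scoped_sym &s)
{
    bool priv = (s.parent_kind == TABLE_FILE);
    assert(!priv || is_global(s));      // private implies global
    return priv;
}
```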
 758
 759    The `add_to_table' and `remove_from_table' methods are provided for
 760 convenience when adding or removing symbols from symbol tables.  In the
 761 case of variable symbols, the entire hierarchy of sub-variables (*note
 762 Sub-Variables::.) is added or removed at one time by these methods.
 763
 764    The `copy' method makes a copy of a symbol.  This is a virtual
 765 function so it copies the fields that are specific to each kind of
 766 symbol.  However, it only copies the symbol itself: copying a procedure
 767 symbol will not copy the procedure body and copying a variable symbol
 768 will not copy the variable definition.  The `copy' method does not copy
 769 annotations on the symbol, either.  Since the copy will have the same
 770 name as the original symbol, it should generally be renamed or used in
 771 a different symbol table.
 772
 773    Two different methods are available for printing symbols.  The
 774 `print' method just prints the name of the symbol.  Label symbols are
 775 prefixed by `L:' and procedure symbols by `P:' to distinguish them from
 776 variable symbols.  The `print_full' method is used by the library when
 777 listing the contents of symbol tables.  It includes all the fields from
 778 the `sym_node'.
 779
 780 \x1f
 781 File: suif1.info,  Node: Procedure Symbols,  Next: Label Symbols,  Prev: Symbol Features,  Up: Symbols
 782
 783 Procedure Symbols
 784 =================
 785
 786    Procedure symbols are represented by objects of the `proc_sym'
 787 class.  SUIF does not support nested procedures, so these symbols may
 788 only be entered in global and file symbol tables.  The fields in a
 789 procedure symbol hold information about the procedure, including a
 790 pointer to the body if it is in memory.  The `proc_sym' class also
 791 provides methods to read procedure bodies from input files, write them
 792 to the output files, and flush them from memory.
 793
 794    Each procedure symbol contains a field to record the source language
 795 for the procedure.  The `src_lang' and `set_src_lang' methods access
 796 this field, which holds a value from the `src_lang_type' enumeration:
 797 `src_unknown', `src_c', `src_fortran', or `src_verilog'.  Other values
 798 may be added in the future.
 799
 800    A procedure symbol also has a field that specifies the type of the
 801 procedure.  The `type' and `set_type' methods retrieve and change this
 802 field.  The type must be a function type.  *Note Function Types::.
 803
 804    The body of a procedure is represented by its abstract syntax tree.
 805 *Note Trees::.  The procedure symbol contains a pointer to the root node
 806 of this tree.  The `block' and `set_block' methods access this pointer.
 807 If the body is not in memory, the `block' pointer will be `NULL'; the
 808 `is_in_memory' method is provided to check this condition.
 809
 810    The `proc_sym' class contains the methods to read procedure bodies
 811 from binary SUIF files and to write them out again.  The details of SUIF
 812 I/O are thus hidden from users; only entire procedures can be read and
 813 written.  If one of the input files contains the body for a procedure, a
 814 pointer to the file set entry (*note File Set Entries::.) is recorded in
 815 the procedure symbol.  The `file' method retrieves this pointer for a
 816 particular `proc_sym'.  The same procedure can be read in and flushed
 817 from memory many times, but once it has been written out it can no
 818 longer be read or written again.  The procedure symbol contains a flag
 819 to indicate if it has been written out yet.  The `is_written' method
 820 returns the value of this flag.  The `is_readable' method checks if the
 821 procedure body exists in one of the input files and if it has not yet
 822 been written out.  If this method returns `TRUE', the `read_proc'
 823 method can be used to read the body of the procedure.  By default,
 824 `read_proc' also converts the procedure to expression tree form (*note
 825 Expression Trees::.) but it does not convert to Fortran form (*note
 826 Fortran::.).  The `exp_trees' and `use_fortran_form' parameters to
 827 `read_proc' can be used to override these defaults.
 828
 829    After a procedure body has been read in and possibly modified, it
 830 can be written to the output file using the procedure symbol's
 831 `write_proc' method.  You must specify the file set entry to which the
 832 procedure should be written.  In most cases, the input and output file
 833 set entry will be the same, and you will just use the `file' method to
 834 determine the output file set entry.  As mentioned above, once a
 835 procedure has been written out it cannot be rewritten or read in again.
 836 Obviously, it should not be changed after that point because the
 837 changes could not be saved.  Besides avoiding changes directly to the
 838 procedure, however, you must also be careful to avoid certain changes to
 839 the global symbol tables.  The symbols and types within the procedure
 840 are written out using their ID numbers.  *Note Numbering Types and
 841 Symbols::.  Thus, you must not do anything to the global symbol tables
 842 that would cause the ID numbers for those symbols and types to change.
 843 For example, moving a symbol from a file symbol table to the global
 844 symbol table would require that its ID number change.  The best solution
 845 is to postpone writing out the procedures until you are certain that
 846 such changes to the symbol tables will not be needed.
 847
 848    When a procedure body is no longer needed, typically after it has
 849 been written out, call the `flush_proc' method for the procedure symbol
 850 to deallocate the storage used by the procedure.  In some cases, you may
 851 want to flush the procedure before it is written.  For example,
 852 interprocedural analysis requires that all procedures be read in and
 853 analyzed together.  To save space, the procedures can be summarized for
 854 the purpose of the particular analysis and then flushed.  After the
 855 analysis is complete, they can be re-read and the results can be
 856 attached to the code.
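
   The read, flush, and write rules described in this node amount to a
small state machine, sketched below.  The `proc_state' structure is a
hypothetical model rather than the `proc_sym' class, and requiring the
body to be in memory before it is written is an assumption of the
sketch.

```cpp
#include <cassert>

struct proc_state {                 // hypothetical model of a proc_sym
    bool has_input_body = false;    // body exists in an input file
    bool in_memory = false;         // body currently loaded
    bool written = false;           // body already written out

    bool is_readable() const { return has_input_body && !written; }

    // May be repeated any number of times before the body is written.
    void read_proc()
    {
        assert(is_readable());
        in_memory = true;
    }

    // Releases the body; allowed before or after writing.
    void flush_proc() { in_memory = false; }

    // Allowed once only; afterwards the body cannot be read again.
    void write_proc()
    {
        assert(in_memory && !written);
        written = true;
    }
};
```

   An interprocedural pass in this model would call `read_proc' and
`flush_proc' on every procedure during analysis, then `read_proc',
`write_proc', and `flush_proc' on each one in a final pass.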
 857
 858 \x1f
 859 File: suif1.info,  Node: Label Symbols,  Next: Variable Symbols,  Prev: Procedure Symbols,  Up: Symbols
 860
 861 Label Symbols
 862 =============
 863
 864    The `label_sym' class represents label symbols.  Labels are used
 865 within procedures to specify targets of branch and jump instructions.
 866 They may not be entered in global or file symbol tables.  The position
 867 of a label is usually indicated by a label instruction (*note Label
 868 Instructions::.), but for labels associated with high-level AST nodes,
 869 the label positions may be implicit.  The `label_sym' class contains no
 870 extra fields beyond those in the base `sym_node' class.
 871
 872 \x1f
 873 File: suif1.info,  Node: Variable Symbols,  Prev: Label Symbols,  Up: Symbols
 874
 875 Variable Symbols
 876 ================
 877
 878    In SUIF, variable symbols represent data objects.  Variable symbols
 879 are represented by objects of the `var_sym' class.  This class adds a
 880 field to specify the type of the variable as well as some additional
 881 flags.  Unlike procedures and labels, variables may be defined in any
 882 scope.
 883
 884    SUIF provides optional "sub-variables" to make it easier to deal
 885 with pieces of aggregate objects that may or may not overlap, in
 886 particular Fortran equivalences and reshaped common blocks.  Instead of
 887 referring to a piece of an aggregate by an offset combined with the
 888 aggregate symbol, a sub-variable can be created to represent the data at
 889 a particular offset within the aggregate, so that it can be referenced
 890 in the same way as if it were not contained within an aggregate
 891 structure.
 892
 893 * Menu:
 894
 895 * Variable Features::           Basic features of variables.
 896 * Sub-Variables::               Variables contained within aggregates.
 897 * Variable Definitions::        Definitions of global and static variables.
 898