------------------------------------------------------------------------------
--                                                                          --
--                         GNAT COMPILER COMPONENTS                         --
--                                                                          --
--          Copyright (C) 1992-2009, Free Software Foundation, Inc.         --
--                                                                          --
-- GNAT is free software;  you can  redistribute it  and/or modify it under --
-- terms of the  GNU General Public License as published  by the Free Soft- --
-- ware  Foundation;  either version 3,  or (at your option) any later ver- --
-- sion.  GNAT is distributed in the hope that it will be useful, but WITH- --
-- OUT ANY WARRANTY;  without even the  implied warranty of MERCHANTABILITY --
-- or FITNESS FOR A PARTICULAR PURPOSE.                                     --
--                                                                          --
-- As a special exception under Section 7 of GPL version 3, you are granted --
-- additional permissions described in the GCC Runtime Library Exception,   --
-- version 3.1, as published by the Free Software Foundation.               --
--                                                                          --
-- You should have received a copy of the GNU General Public License and    --
-- a copy of the GCC Runtime Library Exception along with this program;     --
-- see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see    --
-- <http://www.gnu.org/licenses/>.                                          --
--                                                                          --
-- GNAT was originally developed  by the GNAT team at  New York University. --
-- Extensive contributions were provided by Ada Core Technologies Inc.      --
--                                                                          --
------------------------------------------------------------------------------

with Alloc;
with Table;
with Hostparm; use Hostparm;
with System;   use System;
with Types;    use Types;

package Namet is

--  WARNING: There is a C version of this package. Any changes to this
--  source file must be properly reflected in the C header file namet.h
--  which is created manually from namet.ads and namet.adb.

--  This package contains routines for handling the names table. The table
--  is used to store character strings for identifiers and operator symbols,
--  as well as other string values such as unit names and file names.

--  The forms of the entries are as follows:

--    Identifiers         Stored with upper case letters folded to lower case.
--                        Upper half (16#80# bit set) and wide characters are
--                        stored in an encoded form (Uhh for upper half
--                        characters, Whhhh for wide characters, WWhhhhhhhh
--                        for wide wide characters, as provided by the
--                        routine Store_Encoded_Character, where hh are hex
--                        digits for the character code using lower case
--                        a-f). Normally the use of U or W in other internal
--                        names is avoided, but these letters may be used in
--                        internal names (without this special meaning), if
--                        they appear as the last character of the name, or
--                        they are followed by an upper case letter (other
--                        than the WW sequence), or an underscore.

--    Operator symbols    Stored with an initial letter O, and the remainder
--                        of the name is the lower case characters XXX where
--                        the name is Name_Op_XXX (see the Snames spec for a
--                        full list of the operator names). Normally the use
--                        of O in other internal names is avoided, but it
--                        may be used in internal names (without this
--                        special meaning) if it is the last character of
--                        the name, or if it is followed by an upper case
--                        letter or an underscore.

--    Character literals  Character literals have names that are used only
--                        for debugging and error message purposes. The form
--                        is an upper case Q followed by a single lower case
--                        letter, or by a Uxx/Wxxxx/WWxxxxxxxx encoding as
--                        described for identifiers. The
--                        Set_Character_Literal_Name procedure should be
--                        used to construct these encodings. Normally the
--                        use of Q in other internal names is avoided, but
--                        it may be used in internal names (without this
--                        special meaning) if it is the last character of
--                        the name, or if it is followed by an upper case
--                        letter or an underscore.

--    Unit names          Stored with upper case letters folded to lower
--                        case, using Uhh/Whhhh/WWhhhhhhhh encoding as
--                        described for identifiers, and a %s or %b suffix
--                        for specs/bodies. See package Uname for further
--                        details.

--    File names          Stored in the form provided by Osint. Typically
--                        they may include wide character escape sequences
--                        and upper case characters (in non-encoded form).
--                        Casing is also derived from the external
--                        environment. Note that file names provided by
--                        Osint must generally be consistent with the names
--                        from Fname.Get_File_Name.

--    Other strings       The names table is also used as a convenient
--                        storage location for other variable length strings
--                        such as error messages etc. There are no
--                        restrictions on what characters may appear for
--                        such entries.

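--  As an illustration of the forms above, here is a sketch of how a few
--  entries would be stored. It is derived only from the rules stated in
--  this comment; the source names are hypothetical, and the upper half case
--  assumes a Latin-1 source in which e-acute has the code 16#E9#:

--    Source form                      Stored form
--    Identifier My_Var                my_var
--    Identifier caf<e-acute>          cafUe9
--    Operator "+" (Name_Op_Add)       Oadd
--    Character literal 'a'            Qa
--    Spec of unit Ada.Text_IO         ada.text_io%s
--    Body of unit Ada.Text_IO         ada.text_io%b
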
--  Note: the encodings Uhh (upper half characters), Whhhh (wide characters),
--  WWhhhhhhhh (wide wide characters) and Qx (character literal names) are
--  described in the spec, since they are visible throughout the system (e.g.
--  in debugging output). However, no code should depend on these particular
--  encodings, so it should be possible to change the encodings by making
--  changes only to the Namet specification (to change these comments) and the
--  body (which actually implements the encodings).

--  The names are hashed so that a given name appears only once in the table,
--  except that names entered with Name_Enter as opposed to Name_Find are
--  omitted from the hash table.

--  The first 26 entries in the names table (with Name_Id values in the range
--  First_Name_Id .. First_Name_Id + 25) represent names which are the one
--  character lower case letters in the range a-z, and these names are created
--  and initialized by the Initialize procedure.

--  Two values, one of type Int and one of type Byte, are stored with each
--  names table entry and subprograms are provided for setting and retrieving
--  these associated values. The usage of these values is up to the client. In
--  the compiler, the Int field is used to point to a chain of potentially
--  visible entities (see Sem.Ch8 for details), and the Byte field is used to
--  hold the Token_Type value for reserved words (see Sem for details). In the
--  binder, the Byte field is unused, and the Int field is used in various
--  ways depending on the name involved (see binder documentation).

   Name_Buffer : String (1 .. 4 * Max_Line_Length);
   --  This buffer is used to set the name to be stored in the table for the
   --  Name_Find call, and to retrieve the name for the Get_Name_String call.
   --  The limit here is intended to be effectively unbounded, so that the
   --  buffer never overflows in practice (names this long are too absurd to
   --  worry about!)

   Name_Len : Natural;
   --  Length of name stored in Name_Buffer. Used as an input parameter for
   --  Name_Find, and as an output value by Get_Name_String, or Write_Name.

   -----------------------------
   -- Types for Namet Package --
   -----------------------------

   --  Name_Id values are used to identify entries in the names table. Except
   --  for the special values No_Name, and Error_Name, they are subscript
   --  values for the Names table defined in package Namet.

   --  Note that with only a few exceptions, which are clearly documented, the
   --  type Name_Id should be regarded as a private type. In particular it is
   --  never appropriate to perform arithmetic operations using this type.

   type Name_Id is range Names_Low_Bound .. Names_High_Bound;
   for Name_Id'Size use 32;
   --  Type used to identify entries in the names table

   No_Name : constant Name_Id := Names_Low_Bound;
   --  The special Name_Id value No_Name is used in the parser to indicate
   --  a situation where no name is present (e.g. on a loop or block).

   Error_Name : constant Name_Id := Names_Low_Bound + 1;
   --  The special Name_Id value Error_Name is used in the parser to
   --  indicate that some kind of error was encountered in scanning out
   --  the relevant name, so it does not have a representable label.

   subtype Error_Name_Or_No_Name is Name_Id range No_Name .. Error_Name;
   --  Used to test for either error name or no name

   First_Name_Id : constant Name_Id := Names_Low_Bound + 2;
   --  Subscript of first entry in names table

   procedure Finalize;
   --  Called at the end of a use of the Namet package (before a subsequent
   --  call to Initialize). Currently this routine is only used to generate
   --  termination statistics.

   procedure Get_Name_String (Id : Name_Id);
   --  Get_Name_String is used to retrieve the string associated with an entry
   --  in the names table. The resulting string is stored in Name_Buffer and
   --  Name_Len is set. It is an error to call Get_Name_String with one of the
   --  special name Id values (No_Name or Error_Name).

   function Get_Name_String (Id : Name_Id) return String;
   --  This functional form returns the result as a string without affecting
   --  the contents of either Name_Buffer or Name_Len. The lower bound is 1.

   procedure Get_Unqualified_Name_String (Id : Name_Id);
   --  Similar to the above except that qualification (as defined in unit
   --  Exp_Dbug) is removed (including both preceding __ delimited names, and
   --  also the suffixes used to indicate package body entities and to
   --  distinguish between overloaded entities). Note that names are not
   --  qualified until just before the call to gigi, so this routine is only
   --  needed by processing that occurs after gigi has been called. This
   --  includes all ASIS processing, since ASIS works on the tree written
   --  after gigi has been called.

   procedure Get_Name_String_And_Append (Id : Name_Id);
   --  Like Get_Name_String but the resulting characters are appended to the
   --  current contents of the entry stored in Name_Buffer, and Name_Len is
   --  incremented to include the added characters.

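   --  As a hedged illustration (Prefix_Id and Suffix_Id are hypothetical
   --  Name_Id values obtained elsewhere), two entries can be concatenated
   --  in Name_Buffer with a separating period as follows:

   --    Get_Name_String (Prefix_Id);
   --    Add_Char_To_Name_Buffer ('.');
   --    Get_Name_String_And_Append (Suffix_Id);

   --  After this sequence Name_Buffer (1 .. Name_Len) holds the first name,
   --  the period, and the second name, in that order.
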
   procedure Get_Decoded_Name_String (Id : Name_Id);
   --  Same calling sequence and interface as Get_Name_String, except that the
   --  result is decoded, so that upper half characters and wide characters
   --  appear as originally found in the source program text, operators have
   --  their source forms (special characters and enclosed in quotes), and
   --  character literals appear surrounded by apostrophes.

   procedure Get_Unqualified_Decoded_Name_String (Id : Name_Id);
   --  Similar to the above except that qualification (as defined in unit
   --  Exp_Dbug) is removed (including both preceding __ delimited names, and
   --  also the suffix used to indicate package body entities). Note that
   --  names are not qualified until just before the call to gigi, so this
   --  routine is only needed by processing that occurs after gigi has been
   --  called. This includes all ASIS processing, since ASIS works on the tree
   --  written after gigi has been called.

   procedure Get_Decoded_Name_String_With_Brackets (Id : Name_Id);
   --  This routine is similar to Get_Decoded_Name_String, except that the
   --  brackets notation (Uhh replaced by ["hh"], Whhhh replaced by ["hhhh"],
   --  WWhhhhhhhh replaced by ["hhhhhhhh"]) is used for all non-lower half
   --  characters, regardless of how Opt.Wide_Character_Encoding_Method is
   --  set, and also in that characters in the range 16#80# .. 16#FF# are
   --  converted to brackets notation in all cases. This routine can be used
   --  when there is a requirement for a canonical representation not affected
   --  by the character set options (e.g. in the binder generation of
   --  symbols).

   function Get_Name_Table_Byte (Id : Name_Id) return Byte;
   pragma Inline (Get_Name_Table_Byte);
   --  Fetches the Byte value associated with the given name

   function Get_Name_Table_Info (Id : Name_Id) return Int;
   pragma Inline (Get_Name_Table_Info);
   --  Fetches the Int value associated with the given name

   function Is_Operator_Name (Id : Name_Id) return Boolean;
   --  Returns True if name given is of the form of an operator (that
   --  is, it starts with an upper case O).

   procedure Initialize;
   --  Initializes the names table, including initializing the first 26
   --  entries in the table (for the 1-character lower case names a-z). Note
   --  that Initialize must not be called if Tree_Read is used.

   procedure Lock;
   --  Lock name tables before calling back end. We reserve some extra space
   --  before locking to avoid unnecessary inefficiencies when we unlock.

   procedure Unlock;
   --  Unlocks the name table to allow use of the extra space reserved by the
   --  call to Lock. See gnat1drv for details of the need for this.

   function Length_Of_Name (Id : Name_Id) return Nat;
   pragma Inline (Length_Of_Name);
   --  Returns length of given name in characters. This is the length of the
   --  encoded name, as stored in the names table. The result is equivalent to
   --  calling Get_Name_String and reading Name_Len, except that a call to
   --  Length_Of_Name does not affect the contents of Name_Len and Name_Buffer.

   function Name_Chars_Address return System.Address;
   --  Return starting address of name characters table (used in Back_End call
   --  to Gigi).

   function Name_Find return Name_Id;
   --  Name_Find is called with a string stored in Name_Buffer whose length is
   --  in Name_Len (i.e. the characters of the name are in subscript positions
   --  1 to Name_Len in Name_Buffer). It searches the names table to see if
   --  the string has already been stored. If so the Id of the existing entry
   --  is returned. Otherwise a new entry is created with its Name_Table_Info
   --  field set to zero. The contents of Name_Buffer and Name_Len are not
   --  modified by this call. Note that it is permissible for Name_Len to be
   --  set to zero to look up the null name string.

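   --  A minimal usage sketch, assuming the caller is free to overwrite
   --  Name_Buffer and Name_Len (Id is a hypothetical local Name_Id variable
   --  and "xyz" an arbitrary example name):

   --    Name_Len := 3;
   --    Name_Buffer (1 .. 3) := "xyz";
   --    Id := Name_Find;

   --  Repeating the same three statements later returns the same Id, since
   --  the string is then already present in the hash table.
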
   function Name_Enter return Name_Id;
   --  Name_Enter has the same calling interface as Name_Find. The difference
   --  is that it does not search the table for an existing match, and also
   --  subsequent Name_Find calls using the same name will not locate the
   --  entry created by this call. Thus multiple calls to Name_Enter with the
   --  same name will create multiple entries in the name table with different
   --  Name_Id values. This is useful in the case of created names, which are
   --  never expected to be looked up. Note: Name_Enter should never be used
   --  for one character names, since these are efficiently located without
   --  hashing by Name_Find in any case.

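   --  To illustrate the difference from Name_Find, consider this sketch (Id1,
   --  Id2 and Id3 are hypothetical Name_Id variables, and "tmp" is an
   --  arbitrary name with more than one character):

   --    Name_Len := 3;
   --    Name_Buffer (1 .. 3) := "tmp";
   --    Id1 := Name_Enter;
   --    Id2 := Name_Enter;
   --    Id3 := Name_Find;

   --  Following the description above, Id1 and Id2 denote distinct entries,
   --  and Id3 matches neither of them, because entries made by Name_Enter
   --  are not visible to Name_Find.
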
   function Name_Entries_Address return System.Address;
   --  Return starting address of Names table (used in Back_End call to Gigi)

   function Name_Entries_Count return Nat;
   --  Return current number of entries in the names table

   function Is_OK_Internal_Letter (C : Character) return Boolean;
   pragma Inline (Is_OK_Internal_Letter);
   --  Returns true if C is a suitable character for using as a prefix or a
   --  suffix of an internally generated name, i.e. it is an upper case letter
   --  other than one of the ones used for encoding source names (currently
   --  the set of reserved letters is O, Q, U, W) and also returns False for
   --  the letter X, which is reserved for debug output (see Exp_Dbug).

   function Is_Internal_Name (Id : Name_Id) return Boolean;
   --  Returns True if the name is an internal name (i.e. contains a character
   --  for which Is_OK_Internal_Letter is true, or if the name starts or ends
   --  with an underscore). This call destroys the value of Name_Len and
   --  Name_Buffer (it loads these as for Get_Name_String).

   --  Note: if the name is qualified (has a double underscore), then only the
   --  final entity name is considered, not the qualifying names. Consider for
   --  example that the name:

   --    pkg__B_1__xyz

   --  is not an internal name, because the B comes from the internal name of
   --  a qualifying block, but the xyz means that this was indeed a declared
   --  identifier called "xyz" within this block and there is nothing internal
   --  about that name.

   function Is_Internal_Name return Boolean;
   --  Like the form with an Id argument, except that the name to be tested is
   --  passed in Name_Buffer and Name_Len (which are not affected by the call).

   function Is_Valid_Name (Id : Name_Id) return Boolean;
   --  True if Id is a valid name -- points to a valid entry in the
   --  Name_Entries table.

   procedure Reset_Name_Table;
   --  This procedure is used when there are multiple source files to reset
   --  the name table info entries associated with current entries in the
   --  names table. There is no harm in keeping the names entries themselves
   --  from one compilation to another, but we can't keep the entity info,
   --  since this refers to tree nodes, which are destroyed between each main
   --  source file.

   procedure Add_Char_To_Name_Buffer (C : Character);
   pragma Inline (Add_Char_To_Name_Buffer);
   --  Add given character to the end of the string currently stored in the
   --  Name_Buffer, incrementing Name_Len.

   procedure Add_Nat_To_Name_Buffer (V : Nat);
   --  Add decimal representation of given value to the end of the string
   --  currently stored in Name_Buffer, incrementing Name_Len as required.

   procedure Add_Str_To_Name_Buffer (S : String);
   --  Add characters of string S to the end of the string currently stored
   --  in the Name_Buffer, incrementing Name_Len by the length of the string.

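   --  For illustration, the three Add_xxx_To_Name_Buffer routines can be
   --  combined to build a name piecewise. The sketch below assumes that the
   --  caller resets Name_Len first; the name "temp_42" is purely hypothetical:

   --    Name_Len := 0;
   --    Add_Str_To_Name_Buffer ("temp");
   --    Add_Char_To_Name_Buffer ('_');
   --    Add_Nat_To_Name_Buffer (42);

   --  At this point Name_Buffer (1 .. Name_Len) is "temp_42" and Name_Len
   --  is 7, so the buffer is ready for a call to Name_Find or Name_Enter.
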
   procedure Set_Character_Literal_Name (C : Char_Code);
   --  This procedure sets the proper encoded name for the character literal
   --  for the given character code. On return Name_Buffer and Name_Len are
   --  set to reflect the stored name.

   procedure Set_Name_Table_Info (Id : Name_Id; Val : Int);
   pragma Inline (Set_Name_Table_Info);
   --  Sets the Int value associated with the given name

   procedure Set_Name_Table_Byte (Id : Name_Id; Val : Byte);
   pragma Inline (Set_Name_Table_Byte);
   --  Sets the Byte value associated with the given name

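   --  A short sketch of how the Set_ and Get_ routines pair up (Id is a
   --  hypothetical valid Name_Id, and the values 17 and 3 are arbitrary):

   --    Set_Name_Table_Info (Id, 17);
   --    Set_Name_Table_Byte (Id, 3);

   --  After these calls Get_Name_Table_Info (Id) returns 17 and
   --  Get_Name_Table_Byte (Id) returns 3, until the values are changed again.
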
   procedure Store_Encoded_Character (C : Char_Code);
   --  Stores given character code at the end of Name_Buffer, updating the
   --  value in Name_Len appropriately. Lower case letters and digits are
   --  stored unchanged. Other 8-bit characters are stored using the Uhh
   --  encoding (hh = hex code), other 16-bit wide character values are stored
   --  using the Whhhh (hhhh = hex code) encoding, and other 32-bit wide wide
   --  character values are stored using the WWhhhhhhhh (hhhhhhhh = hex code).
   --  Note that this procedure does not fold upper case letters (they are
   --  stored using the Uhh encoding). If folding is required, it must be done
   --  by the caller prior to the call.

   procedure Tree_Read;
   --  Initializes internal tables from current tree file using the relevant
   --  Table.Tree_Read routines. Note that Initialize should not be called if
   --  Tree_Read is used. Tree_Read includes all necessary initialization.

   procedure Tree_Write;
   --  Writes out internal tables to current tree file using the relevant
   --  Table.Tree_Write routines.

   procedure Get_Last_Two_Chars (N : Name_Id; C1, C2 : out Character);
   --  Obtains last two characters of a name. C1 is last but one character
   --  and C2 is last character. If name is less than two characters long,
   --  then both C1 and C2 are set to ASCII.NUL on return.

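   --  For illustration, a sketch that tests whether a name denotes a unit
   --  spec, relying on the %s suffix described at the start of this spec
   --  (U is a hypothetical Name_Id holding a unit name, C1 and C2 are local
   --  Character variables, and Is_Spec is a hypothetical Boolean variable):

   --    Get_Last_Two_Chars (U, C1, C2);
   --    Is_Spec := C1 = '%' and then C2 = 's';
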
   procedure Write_Name (Id : Name_Id);
   --  Write_Name writes the characters of the specified name using the
   --  standard output procedures in package Output. No end of line is
   --  written, just the characters of the name. On return Name_Buffer and
   --  Name_Len are set as for a call to Get_Name_String. The name is written
   --  in encoded form (i.e. including Uhh, Whhhh, Qx, Oxxx as they appear in
   --  the name table). If Id is Error_Name, or No_Name, no text is output.

   procedure Write_Name_Decoded (Id : Name_Id);
   --  Like Write_Name, except that the name written is the decoded name, as
   --  described for Get_Decoded_Name_String, and the resulting value stored
   --  in Name_Len and Name_Buffer is the decoded name.

   ------------------------------
   -- File and Unit Name Types --
   ------------------------------

   --  These are defined here in Namet rather than Fname and Uname to avoid
   --  problems with dependencies, and to avoid dragging in Fname and Uname
   --  into many more files, but it would be cleaner to move to Fname/Uname.

   type File_Name_Type is new Name_Id;
   --  File names are stored in the names table and this type is used to
   --  indicate that a Name_Id value is being used to hold a simple file name
   --  (which does not include any directory information).

   No_File : constant File_Name_Type := File_Name_Type (No_Name);
   --  Constant used to indicate no file is present (this is used for example
   --  when a search for a file indicates that no file of the name exists).

   Error_File_Name : constant File_Name_Type := File_Name_Type (Error_Name);
   --  The special File_Name_Type value Error_File_Name is used to indicate
   --  a file name where some previous processing has found an error.

   subtype Error_File_Name_Or_No_File is
     File_Name_Type range No_File .. Error_File_Name;
   --  Used to test for either error file name or no file

   type Path_Name_Type is new Name_Id;
   --  Path names are stored in the names table and this type is used to
   --  indicate that a Name_Id value is being used to hold a path name (that
   --  may contain directory information).

   No_Path : constant Path_Name_Type := Path_Name_Type (No_Name);
   --  Constant used to indicate no path name is present

   type Unit_Name_Type is new Name_Id;
   --  Unit names are stored in the names table and this type is used to
   --  indicate that a Name_Id value is being used to hold a unit name, which
   --  terminates in %b for a body or %s for a spec.

   No_Unit_Name : constant Unit_Name_Type := Unit_Name_Type (No_Name);
   --  Constant used to indicate no unit name present

   Error_Unit_Name : constant Unit_Name_Type := Unit_Name_Type (Error_Name);
   --  The special Unit_Name_Type value Error_Unit_Name is used to indicate
   --  a unit name where some previous processing has found an error.

   subtype Error_Unit_Name_Or_No_Unit_Name is
     Unit_Name_Type range No_Unit_Name .. Error_Unit_Name;

   ------------------------
   -- Debugging Routines --
   ------------------------

   procedure wn (Id : Name_Id);
   pragma Export (Ada, wn);
   --  This routine is intended for debugging use only (i.e. it is intended to
   --  be called from the debugger). It writes the characters of the specified
   --  name using the standard output procedures in package Output, followed by
   --  a new line. The name is written in encoded form (i.e. including Uhh,
   --  Whhhh, Qx, Oxxx as they appear in the name table). If Id is Error_Name,
   --  No_Name, or invalid, an appropriate string is written (<Error_Name>,
   --  <No_Name>, <invalid name>). Unlike Write_Name, this call does not affect
   --  the contents of Name_Buffer or Name_Len.

   ---------------------------
   -- Table Data Structures --
   ---------------------------

   --  The following declarations define the data structures used to store
   --  names. The definitions are in the private part of the package spec,
   --  rather than the body, since they are referenced directly by gigi.

private

   --  This table stores the actual string names. Although logically there is
   --  no need for a terminating character (since the length is stored in the
   --  name entry table), we still store a NUL character at the end of every
   --  name (for convenience in interfacing to the C world).

   package Name_Chars is new Table.Table (
     Table_Component_Type => Character,
     Table_Index_Type     => Int,
     Table_Low_Bound      => 0,
     Table_Initial        => Alloc.Name_Chars_Initial,
     Table_Increment      => Alloc.Name_Chars_Increment,
     Table_Name           => "Name_Chars");

   type Name_Entry is record
      Name_Chars_Index : Int;
      --  Starting location of characters in the Name_Chars table minus one
      --  (i.e. pointer to character just before first character). The reason
      --  for the bias of one is that indexes in Name_Buffer are one's origin,
      --  so this avoids unnecessary adds and subtracts of 1.

      Name_Len : Short;
      --  Length of this name in characters

      Byte_Info : Byte;
      --  Byte value associated with this name

      Name_Has_No_Encodings : Boolean;
      --  This flag is set True if the name entry is known not to contain any
      --  special character encodings. This is used to speed up repeated calls
      --  to Get_Decoded_Name_String. A value of False means that it is not
      --  known whether the name contains any such encodings.

      Hash_Link : Name_Id;
      --  Link to next entry in names table for same hash code

      Int_Info : Int;
      --  Int Value associated with this name
   end record;

   for Name_Entry use record
      Name_Chars_Index      at  0 range 0 .. 31;
      Name_Len              at  4 range 0 .. 15;
      Byte_Info             at  6 range 0 .. 7;
      Name_Has_No_Encodings at  7 range 0 .. 7;
      Hash_Link             at  8 range 0 .. 31;
      Int_Info              at 12 range 0 .. 31;
   end record;

   for Name_Entry'Size use 16 * 8;
   --  This ensures that we did not leave out any fields

   --  This is the table that is referenced by Name_Id entries.
   --  It contains one entry for each unique name in the table.

   package Name_Entries is new Table.Table (
     Table_Component_Type => Name_Entry,
     Table_Index_Type     => Name_Id'Base,
     Table_Low_Bound      => First_Name_Id,
     Table_Initial        => Alloc.Names_Initial,
     Table_Increment      => Alloc.Names_Increment,
     Table_Name           => "Name_Entries");

end Namet;