------------------------------------------------------------------------------
-- GNAT COMPILER COMPONENTS --
-- Copyright (C) 1996-2002 Free Software Foundation, Inc. --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
-- ware Foundation; either version 2, or (at your option) any later ver- --
-- sion. GNAT is distributed in the hope that it will be useful, but WITH- --
-- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY --
-- or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License --
-- for more details. You should have received a copy of the GNU General --
-- Public License distributed with GNAT; see file COPYING. If not, write --
-- to the Free Software Foundation, 59 Temple Place - Suite 330, Boston, --
-- MA 02111-1307, USA. --
-- GNAT was originally developed by the GNAT team at New York University. --
-- Extensive contributions were provided by Ada Core Technologies Inc. --
------------------------------------------------------------------------------

-- Expand routines for generation of special declarations used by the
-- debugger. In accordance with the Dwarf 2.2 specification, certain
-- type names are encoded to provide information to the debugger.

with Types;    use Types;
with Uintp;    use Uintp;
with Get_Targ; use Get_Targ;

-----------------------------------------------------
-- Encoding and Qualification of Names of Entities --
-----------------------------------------------------

-- This section describes how the names of entities are encoded in
-- the generated debugging information.

-- An entity in Ada has a name of the form X.Y.Z ... E where X,Y,Z
-- are the enclosing scopes (not including Standard at the start).

-- The encoding of the name follows this basic qualified naming scheme,
-- where the encoding of individual entity names is as described in
-- Namet (i.e. in particular names present in the original source are
-- folded to all lower case, with upper half and wide characters encoded
-- as described in Namet). Upper case letters are used only for entities
-- generated by the compiler.

-- There are two cases, global entities, and local entities. In more
-- formal terms, local entities are those which have a dynamic enclosing
-- scope, and global entities are at the library level, except that we
-- always consider procedures to be global entities, even if they are
-- nested (that's because at the debugger level a procedure name refers
-- to the code, and the code is indeed a global entity, including the
-- case of nested procedures.) In addition, we also consider all types
-- to be global entities, even if they are defined within a procedure.

-- The reason for treating all type names as global entities is that
-- a number of our type encodings work by having related type names,
-- and we need the full qualification to keep this unique.

-- For global entities, the encoded name includes all components of the
-- fully expanded name (but omitting Standard at the start). For example,
-- if a library level child package P.Q has an embedded package R, and
-- there is an entity in this embedded package whose name is S, the encoded
-- name will include the components p.q.r.s.

-- For local entities, the encoded name only includes the components
-- up to the enclosing dynamic scope (other than a block). At run time,
-- such a dynamic scope is a subprogram, and the debugging formats know
-- about local variables of procedures, so it is not necessary to have
-- full qualification for such entities. In particular this means that
-- direct local variables of a procedure are not qualified.

-- As an example of the local name convention, consider a procedure V.W
-- with a local variable X, and a nested block Y containing an entity
-- Z. The fully qualified names of the entities X and Z are:

--    v.w.x
--    v.w.y.z

-- but since V.W is a subprogram, the encoded names will end up as:

--    x
--    y.z

-- The separating dots are translated into double underscores.
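
-- The basic scheme above can be illustrated by a minimal sketch. It is
-- not part of the compiler: Encode_Demo and Encode are hypothetical names
-- used only for illustration, and the sketch ignores the Namet encoding
-- of upper half and wide characters.

```ada
with Ada.Text_IO;             use Ada.Text_IO;
with Ada.Characters.Handling; use Ada.Characters.Handling;
with Ada.Strings.Unbounded;   use Ada.Strings.Unbounded;

procedure Encode_Demo is

   --  Hypothetical helper: fold an expanded name to lower case and
   --  translate each separating dot into a double underscore.
   function Encode (Expanded_Name : String) return String is
      Result : Unbounded_String;
   begin
      for I in Expanded_Name'Range loop
         if Expanded_Name (I) = '.' then
            Append (Result, "__");
         else
            Append (Result, To_Lower (Expanded_Name (I)));
         end if;
      end loop;

      return To_String (Result);
   end Encode;

begin
   --  Global entity S in P.Q.R: all components are kept
   Put_Line (Encode ("P.Q.R.S"));   --  p__q__r__s

   --  Local entity Z in block Y of subprogram V.W: the caller is
   --  assumed to have already dropped the V.W components
   Put_Line (Encode ("Y.Z"));       --  y__z
end Encode_Demo;
```

-- The sketch only performs the textual translation. Deciding which
-- components to keep (global versus local entities) is assumed to have
-- been done by the caller.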

-----------------------------
-- Handling of Overloading --
-----------------------------

-- The above scheme is incomplete with respect to overloaded
-- subprograms, since overloading can legitimately result in a
-- case of two entities with exactly the same fully qualified names.
-- To distinguish between entries in a set of overloaded subprograms,
-- the encoded names are serialized by adding one of the suffixes:

--    $nn    (dollar sign)
--    __nn   (two underscores)

-- where nn is a serial number (2 for the second overloaded function,
-- 3 for the third, etc.). We use $ if this symbol is allowed, and
-- double underscore if it is not. In the remaining examples in this
-- section, we use a $ sign, but the $ is replaced by __ throughout
-- these examples if $ sign is not available. A suffix of $1 is
-- always omitted (i.e. no suffix implies the first instance).

-- These names are prefixed by the normal full qualification. So
-- for example, the third instance of the subprogram qrs in package
-- yz would have the name:

--    yz__qrs$3

-- A more subtle case arises with entities declared within overloaded
-- subprograms. If we have two overloaded subprograms, and both declare
-- an entity xyz, then the fully expanded name of the two xyz's is the
-- same. To distinguish these, we add the same __n suffix at the end of
-- the inner entity names.

-- In more complex cases, we can have multiple levels of overloading,
-- and we must make sure to distinguish which final declarative region
-- we are talking about. For this purpose, we use a more complex suffix
-- which has the form:

--    $nn_nn_nn ...

-- where the nn values are the homonym numbers as needed for any of
-- the qualifying entities, separated by a single underscore. If all
-- the nn values are 1, the suffix is omitted. Otherwise the suffix
-- is present (including any values of 1). The following example
-- shows how this suffixing works.

--    package body Yz is
--       procedure Qrs is                -- Name is yz__qrs
--          procedure Tuv is ... end;    -- Name is yz__qrs__tuv
--       begin ... end Qrs;

--       procedure Qrs (X: Int) is       -- Name is yz__qrs$2
--          procedure Tuv is ... end;    -- Name is yz__qrs__tuv$2_1
--          procedure Tuv (X: Int) is    -- Name is yz__qrs__tuv$2_2
--          begin ... end Tuv;

--          procedure Tuv (X: Float) is  -- Name is yz__qrs__tuv$2_3
--             type m is new float;      -- Name is yz__qrs__tuv__m$2_3
--          begin ... end Tuv;
--       begin ... end Qrs;
--    end Yz;
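
-- The suffix rule can be sketched as follows. This is an illustration
-- only, not the compiler's implementation: Homonym_Demo, Number_List and
-- Homonym_Suffix are hypothetical names, and the homonym numbers of the
-- qualifying entities are assumed to be supplied by the caller.

```ada
with Ada.Text_IO;           use Ada.Text_IO;
with Ada.Strings;           use Ada.Strings;
with Ada.Strings.Fixed;     use Ada.Strings.Fixed;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Homonym_Demo is

   --  Homonym numbers of the qualifying entities, outermost first
   type Number_List is array (Positive range <>) of Positive;

   function Homonym_Suffix
     (Numbers        : Number_List;
      Dollar_Allowed : Boolean := True) return String
   is
      Result  : Unbounded_String;
      All_One : Boolean := True;
   begin
      for I in Numbers'Range loop
         if Numbers (I) /= 1 then
            All_One := False;
         end if;

         --  Values are separated by a single underscore
         if I /= Numbers'First then
            Append (Result, "_");
         end if;

         Append (Result, Trim (Positive'Image (Numbers (I)), Left));
      end loop;

      if All_One then
         return "";                        --  suffix omitted
      elsif Dollar_Allowed then
         return "$" & To_String (Result);
      else
         return "__" & To_String (Result);
      end if;
   end Homonym_Suffix;

begin
   Put_Line ("yz__qrs" & Homonym_Suffix ((1 => 2)));
   Put_Line ("yz__qrs__tuv" & Homonym_Suffix ((2, 1)));
   Put_Line ("yz__qrs__tuv" & Homonym_Suffix ((1, 1)));
end Homonym_Demo;
```

-- With the default Dollar_Allowed => True, the three calls reproduce the
-- names yz__qrs$2, yz__qrs__tuv$2_1 and yz__qrs__tuv from the example.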

-- The above rules applied to operator names would result in names
-- with quotation marks, which are not typically allowed by assemblers
-- and linkers, and even if allowed would be odd and hard to deal with.
-- To avoid this problem, operator names are encoded as follows:

-- These names are prefixed by the normal full qualification, and
-- suffixed by the overloading identification. So for example, the
-- second operator "=" defined in package Extra.Messages would

--    extra__messages__Oeq__2

----------------------------------
-- Resolving Other Name Clashes --
----------------------------------

-- It might be thought that the above scheme is complete, but in Ada 95,
-- full qualification is insufficient to uniquely identify an entity
-- in the program, even if it is not an overloaded subprogram. There
-- are two possible confusions:

--    a.b
--       interpretation 1: entity b in body of package a
--       interpretation 2: child procedure b of package a

--    a.b.c
--       interpretation 1: entity c in child package a.b
--       interpretation 2: entity c in nested package b in body of a

-- It is perfectly legal in both cases for both interpretations to
-- be valid within a single program. This is a bit of a surprise since
-- certainly in Ada 83, full qualification was sufficient, but not in
-- Ada 95. The result is that the above scheme can result in duplicate
-- names. This would not be so bad if the effect were just restricted
-- to debugging information, but in fact in both the above cases, it
-- is possible for both symbols to be external names, and so we have
-- a real problem of name clashes.

-- To deal with this situation, we provide two additional encoding

-- First: all library subprogram names are preceded by the string
-- _ada_ (which causes no duplications, since normal Ada names can
-- never start with an underscore). This not only solves the first
-- case of duplication, but also solves another pragmatic problem
-- which is that otherwise Ada procedures can generate names that
-- clash with existing system function names. Most notably, we can
-- have clashes in the case of procedure Main with the C main that
-- in some systems is always present.

-- Second, for the case where nested packages declared in package
-- bodies can cause trouble, we add a suffix which shows which
-- entities in the list are body-nested packages, i.e. packages
-- whose spec is within a package body. The rules are as follows,
-- given a list of names in a qualified name name1.name2....

-- If none are body-nested package entities, then there is no suffix

-- If at least one is a body-nested package entity, then the suffix
-- is X followed by a string of b's and n's (b = body-nested package
-- entity, n = not a body-nested package).

-- There is one element in this string for each entity in the encoded
-- expanded name except the first (the rules are such that the first
-- entity of the encoded expanded name can never be a body-nested
-- package). Trailing n's are omitted, as is the last b (there must
-- be at least one b, or we would not be generating a suffix at all).
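
-- The construction of the X suffix can be sketched as follows. This is
-- an illustration only: Bnpe_Demo, Flag_List and Bnpe_Suffix are
-- hypothetical names, and the caller is assumed to know which entities
-- of the expanded name are body-nested packages.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Bnpe_Demo is

   --  One flag per entity of the encoded expanded name except the
   --  first; True means the entity is a body-nested package.
   type Flag_List is array (Positive range <>) of Boolean;

   function Bnpe_Suffix (Flags : Flag_List) return String is
      Last_B : Natural := 0;
   begin
      --  Locate the last b; everything after it is a trailing n
      for I in Flags'Range loop
         if Flags (I) then
            Last_B := I;
         end if;
      end loop;

      if Last_B = 0 then
         return "";   --  no body-nested package, so no suffix
      end if;

      declare
         --  Elements before the last b; the last b itself is omitted
         Result : String (1 .. Last_B - Flags'First);
      begin
         for I in Flags'First .. Last_B - 1 loop
            if Flags (I) then
               Result (I - Flags'First + 1) := 'b';
            else
               Result (I - Flags'First + 1) := 'n';
            end if;
         end loop;

         return "X" & Result;
      end;
   end Bnpe_Suffix;

begin
   --  x.y.m2 where y is body-nested
   Put_Line ("x__y__m2" & Bnpe_Suffix ((True, False)));

   --  x.y.z.r where both y and z are body-nested
   Put_Line ("x__y__z__r" & Bnpe_Suffix ((True, True, False)));

   --  x.m1 with no body-nested package
   Put_Line ("x__m1" & Bnpe_Suffix ((1 => False)));
end Bnpe_Demo;
```

-- Under these assumptions the three calls yield x__y__m2X, x__y__z__rXb
-- and x__m1 respectively.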

-- For example, suppose we have

--       pragma Elaborate_Body;
--       m1 : integer;                                 -- #1

--       package y is m2 : integer; end y;             -- #2

--       package z is r : integer; end z;              -- #3

--       m3 : integer;                                 -- #4

--       pragma Elaborate_Body;
--       m2 : integer;                                 -- #5

--    package body x.y is
--       m3 : integer;                                 -- #6
--       procedure j is                                -- #7

--             z : integer;                            -- #8

--    procedure x.m3 is begin null; end;               -- #9

-- Then the encodings would be:

--    #1.  x__m1             (no BNPE's in sight)
--    #2.  x__y__m2X         (y is a BNPE)
--    #3.  x__y__z__rXb      (y is a BNPE, so is z)
--    #4.  x__m3             (no BNPE's in sight)
--    #5.  x__y__m2          (no BNPE's in sight)
--    #6.  x__y__m3          (no BNPE's in sight)
--    #7.  x__y__j           (no BNPE's in sight)
--    #8.  k__z              (no BNPE's, only up to procedure)
--    #9.  _ada_x__m3        (library level subprogram)

-- Note that we have instances here of both kinds of potential name
-- clashes, and the above examples show how the encodings avoid the

-- Lines #4 and #9 both refer to the entity x.m3, but #9 is a library
-- level subprogram, so it is preceded by the string _ada_ which acts
-- to distinguish it from the package body entity.

-- Lines #2 and #5 both refer to the entity x.y.m2, but the first
-- instance is inside the body-nested package y, so there is an X
-- suffix to distinguish it from the child library entity.

-- Note that enumeration literals never need Xb type suffixes, since
-- they are never referenced using global external names.

---------------------
-- Interface Names --
---------------------

-- Note: if an interface name is present, then the external name
-- is taken from the specified interface name. Given the current
-- limitations of the gcc backend, this means that the debugging
-- name is also set to the interface name, but conceptually, it
-- would be possible (and indeed desirable) to have the debugging
-- information still use the Ada name as qualified above, so we
-- still fully qualify the name in the front end.

-------------------------------------
-- Encodings Related to Task Types --
-------------------------------------

-- Each task object defined by a single task declaration is associated
-- with a prefix that is used to qualify procedures defined in that

--    task body TaskObj is
--       procedure F1 is ... end;

-- The name of subprogram TaskObj.F1 is encoded as p__taskobjTK__f1.
-- The body, B, is contained in a subprogram whose name is

------------------------------------------
-- Encodings Related to Protected Types --
------------------------------------------

-- Each protected type has an associated record type, that describes
-- the actual layout of the private data. In addition to the private
-- components of the type, the Corresponding_Record_Type includes one
-- component of type Protection, which is the actual lock structure.
-- The run-time size of the protected type is the size of the corres-

-- For a protected type prot, the Corresponding_Record_Type is encoded

-- The operations of a protected type are encoded as follows: each
-- operation results in two subprograms, a locking one that is called
-- from outside of the object, and a non-locking one that is used for
-- calls from other operations on the same object. The locking operation
-- simply acquires the lock, and then calls the non-locking version.
-- The names of all of these have a prefix constructed from the name of
-- the type, the string "PT", and a suffix which is P or N, depending on
-- whether this is the protected (locking) or the non-locking version
-- of the operation.

-- Given the declaration:

--    protected type lock is
--       function get return integer;
--       procedure set (x: integer);

--       value : integer := 0;

-- the following operations are created:

----------------------------------------------------
-- Conversion between Entities and External Names --
----------------------------------------------------

No_Dollar_In_Label : constant Boolean := Get_No_Dollar_In_Label;
-- True iff the target does not allow dollar signs ("$") in external names

procedure Get_External_Name
   Has_Suffix : Boolean);
-- Set Name_Buffer and Name_Len to the external name of entity E.
-- The external name is the Interface_Name, if specified, unless
-- the entity has an address clause or a suffix.

-- If the Interface is not present, or not used, the external name
-- is the concatenation of:

--   - the string "_ada_", if the entity is a library subprogram,
--   - the names of any enclosing scopes, each followed by "__"
--     (or "X_" if the next entity is a subunit)
--   - the name of the entity
--   - the string "$" (or "__" if target does not allow "$"), followed
--     by homonym suffix, if the entity is an overloaded subprogram
--     or is defined within an overloaded subprogram.

procedure Get_External_Name_With_Suffix
-- Set Name_Buffer and Name_Len to the external name of entity E.
-- If Suffix is the empty string the external name is as above,
-- otherwise the external name is the concatenation of:

--   - the string "_ada_", if the entity is a library subprogram,
--   - the names of any enclosing scopes, each followed by "__"
--     (or "X_" if the next entity is a subunit)
--   - the name of the entity
--   - the string "$" (or "__" if target does not allow "$"), followed
--     by homonym suffix, if the entity is an overloaded subprogram
--     or is defined within an overloaded subprogram.
--   - the string "___" followed by Suffix

----------------------------
-- Debug Name Compression --
----------------------------

-- The full qualification of names can lead to long names, and this
-- section describes the method used to compress these names. Such
-- compression is attempted if one of the following holds:

--    The length exceeds a maximum set in hostparm, currently set
--    to 128, but can be changed as needed.

--    The compiler switch -gnatC is set, setting the Compress_Debug_Names
--    switch in Opt to True.

-- If either of these conditions holds, name compression is attempted
-- by replacing the qualifying section as follows.

-- Given a name of the form

-- where a,b,c,d are arbitrary strings not containing a sequence
-- of exactly two underscores, the name is rewritten as:

-- where ???????? are 8 hex digits representing a 32-bit checksum
-- value that identifies the sequence of compressed names. In
-- addition a dummy type declaration is generated as shown by
-- the following example. Suppose we have three compression

--    XC1234abcd corresponding to a__b__c__ prefix
--    XCabcd1234 corresponding to a__b__ prefix
--    XCab1234cd corresponding to a__ prefix

-- then an enumeration type declaration is generated:

--      (XC1234abcdXnn, aXnn, bXnn, cXnn,
--       XCabcd1234Xnn, aXnn, bXnn,
--       XCab1234cdXnn, aXnn);

-- showing the meaning of each compressed prefix, so the debugger
-- can interpret the exact sequence of names that correspond to the
-- compressed sequence. The Xnn suffixes in the above are simply
-- serial numbers that are guaranteed to be different to ensure
-- that all names are unique, and are otherwise ignored.

--------------------------------------------
-- Subprograms for Handling Qualification --
--------------------------------------------

procedure Qualify_Entity_Names (N : Node_Id);
-- Given a node N, that represents a block, subprogram body, or package
-- body or spec, or protected or task type, sets a fully qualified name
-- for the defining entity of given construct, and also sets fully
-- qualified names for all enclosed entities of the construct (using
-- First_Entity/Next_Entity). Note that the actual modification of the
-- names is postponed till a subsequent call to Qualify_All_Entity_Names.
-- Note: this routine does not deal with prepending _ada_ to library
-- subprogram names. The reason for this is that we only prepend _ada_
-- to the library entity itself, and not to names built from this name.

procedure Qualify_All_Entity_Names;
-- When Qualify_Entity_Names is called, no actual name changes are made,
-- i.e. the actual calls to Qualify_Entity_Name are deferred until a call
-- is made to this procedure. The reason for this deferral is that when
-- names are changed semantic processing may be affected. By deferring
-- the changes till just before gigi is called, we avoid any concerns
-- about such effects. Gigi itself does not use the names except for
-- output of names for debugging purposes (which is why we are doing
-- the name changes in the first place).

-- Note: the routines Get_Unqualified_[Decoded]_Name_String in Namet
-- are useful to remove qualification from a name qualified by the
-- call to Qualify_All_Entity_Names.

procedure Generate_Auxiliary_Types;
-- The process of qualifying names may result in name compression which
-- requires dummy enumeration types to be generated. This subprogram
-- ensures that these types are appropriately included in the tree.

--------------------------------
-- Handling of Numeric Values --
--------------------------------

-- All numeric values here are encoded as strings of decimal digits.
-- Only integer values need to be encoded. A negative value is encoded
-- as the corresponding positive value followed by a lower case m for
-- minus to indicate that the value is negative (e.g. 2m for -2).
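
-- A minimal sketch of this numeric encoding follows. It is an
-- illustration only: Numeric_Demo and Encode_Value are hypothetical
-- names and are not declared anywhere in the compiler.

```ada
with Ada.Text_IO;       use Ada.Text_IO;
with Ada.Strings;       use Ada.Strings;
with Ada.Strings.Fixed; use Ada.Strings.Fixed;

procedure Numeric_Demo is

   function Encode_Value (Value : Long_Long_Integer) return String is
      --  'Image yields a leading space for non-negative values and a
      --  leading minus sign for negative ones; Trim removes the space.
      Image : constant String :=
        Trim (Long_Long_Integer'Image (Value), Left);
   begin
      if Value < 0 then
         --  Drop the minus sign and append a lower case m instead
         return Image (Image'First + 1 .. Image'Last) & "m";
      else
         return Image;
      end if;
   end Encode_Value;

begin
   Put_Line (Encode_Value (2));    --  2
   Put_Line (Encode_Value (-2));   --  2m
end Numeric_Demo;
```

-- Working on the image of the value, rather than negating it, is a
-- design choice of the sketch that avoids overflow for the most
-- negative value of the type.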

-------------------------
-- Type Name Encodings --
-------------------------

-- In the following typ is the name of the type as normally encoded by
-- the debugger rules, i.e. a non-qualified name, all in lower case,
-- with standard encoding of upper half and wide characters.

------------------------
-- Encapsulated Types --
------------------------

-- In some cases, the compiler encapsulates a type by wrapping it in
-- a structure. For example, this is used when a size or alignment
-- specification requires a larger type. Consider:

--    type y is mod 2 ** 64;
--    for y'size use 256;

-- In this case the compiler generates a structure type y___PAD, which
-- has a single field whose name is F. This single field is 64 bits
-- long and contains the actual value.

-- A similar encapsulation is done for some packed array types,
-- in which case the structure type is y___LJM and the field name

-- When the debugger sees an object of a type whose name has a
-- suffix not otherwise mentioned in this specification, the type
-- is a record containing a single field, and the name of that field
-- is all upper-case letters, it should look inside to get the value
-- of the field, and neither the outer structure name, nor the
-- field name should appear when the value is printed.

-----------------------
-- Fixed-Point Types --
-----------------------

-- Fixed-point types are encoded using a suffix that indicates the
-- delta and small values. The actual type itself is a normal

--    typ___XF_nn_dd
--    typ___XF_nn_dd_nn_dd

-- The first form is used when small = delta. The value of delta (and
-- small) is given by the rational nn/dd, where nn and dd are decimal

-- The second form is used if the small value is different from the
-- delta. In this case, the first nn/dd rational value is for delta,
-- and the second value is for small.

------------------------------
-- VAX Floating-Point Types --
------------------------------

-- Vax floating-point types are represented at run time as integer
-- types, which are treated specially by the code generator. Their
-- type names are encoded with the following suffix:

-- representing the Vax F Float, D Float, and G Float types. The
-- debugger must treat these specially. In particular, printing
-- these values can be achieved using the debug procedures that
-- are provided in package System.Vax_Float_Operations:

--    procedure Debug_Output_D (Arg : D);
--    procedure Debug_Output_F (Arg : F);
--    procedure Debug_Output_G (Arg : G);

-- These three procedures take a Vax floating-point argument, and
-- output a corresponding decimal representation to standard output
-- with no terminating line return.

-- Discrete types are coded with a suffix indicating the range in
-- the case where one or both of the bounds are discriminants or

-- Note: at the current time, we also encode static bounds if they
-- do not match the natural machine type bounds, but this may be
-- removed in the future, since it is redundant for most debugging
-- formats. However, we do not ever need XD encoding for enumeration
-- base types, since here it is always clear what the bounds are
-- from the number of enumeration literals, and of course we do
-- not need to encode the dummy XR types generated for renamings.

--    typ___XD
--    typ___XDL_lowerbound
--    typ___XDU_upperbound
--    typ___XDLU_lowerbound__upperbound

-- If a discrete type is a natural machine type (i.e. its bounds
-- correspond in a natural manner to its size), then it is left
-- unencoded. The above encoding forms are used when there is a
-- constrained range that does not correspond to the size or that
-- has discriminant references or other non-static bounds.

-- The first form is used if both bounds are dynamic, in which case
-- two constant objects are present whose names are typ___L and
-- typ___U in the same scope as typ, and the values of these constants
-- indicate the bounds. As far as the debugger is concerned, these
-- are simply variables that can be accessed like any other variables.
-- In the enumeration case, these values correspond to the Enum_Rep
-- values for the lower and upper bounds.

-- The second form is used if the upper bound is dynamic, but the
-- lower bound is either constant or depends on a discriminant of
-- the record with which the type is associated. The upper bound
-- is stored in a constant object of name typ___U as previously
-- described, but the lower bound is encoded directly into the
-- name as either a decimal integer, or as the discriminant name.

-- The third form is similarly used if the lower bound is dynamic,
-- but the upper bound is static or a discriminant reference, in
-- which case the lower bound is stored in a constant object of
-- name typ___L, and the upper bound is encoded directly into the
-- name as either a decimal integer, or as the discriminant name.

-- The fourth form is used if both bounds are discriminant references
-- or static values, with the encoding first for the lower bound,
-- then for the upper bound, as previously described.

-- Is encoded as a subrange of an unsigned base type with lower bound
-- 0 and upper bound N. That is, there is no name encoding. We use
-- the standard encodings provided by the debugging format. Thus
-- we give these types a non-standard interpretation: the standard
-- interpretation of our encoding would not, in general, imply that
-- arithmetic on type x was to be performed modulo N (especially not
-- when N is not a power of 2).

-- Only discrete types can be biased, and the fact that they are
-- biased is indicated by a suffix of the form:

--    typ___XB_lowerbound__upperbound

-- Here lowerbound and upperbound are decimal integers, with the
-- usual (postfix "m") encoding for negative numbers. Biased
-- types are only possible where the bounds are static, and the
-- values are represented as unsigned offsets from the lower
-- bound given. For example:

--    type Q is range 10 .. 15;

-- The size clause will force values of type Q in memory to be
-- stored in biased form (e.g. 11 will be represented by the

----------------------------------------------
-- Record Types with Variable-Length Fields --
----------------------------------------------

-- The debugging formats do not fully support these types, and indeed
-- some formats simply generate no useful information at all for such
-- types. In order to provide information for the debugger, gigi creates
-- a parallel type in the same scope with one of the names

-- The former name is used for a record and the latter for the union
-- that is made for a variant record (see below) if that union has
-- variable size. These encodings suffix any other encodings that
-- might be suffixed to the type name.

-- The idea here is to provide all the needed information to interpret
-- objects of the original type in the form of a "fixed up" type, which
-- is representable using the normal debugging information.

-- There are three cases to be dealt with. First, some fields may have
-- variable positions because they appear after variable-length fields.
-- To deal with this, we encode *all* the field bit positions of the
-- special ___XV type in a non-standard manner.

-- The idea is to encode not the position, but rather information
-- that allows computing the position of a field from the position
-- of the previous field. The algorithm for computing the actual
-- positions of all fields and the length of the record is as
-- follows. In this description, let P represent the current
-- bit position in the record.

--   1. Initialize P to 0.

--   2. For each field in the record,

--      2a. If an alignment is given (see below), then round P
--      up, if needed, to the next multiple of that alignment.

--      2b. If a bit position is given, then increment P by that
--      amount (that is, treat it as an offset from the end of the
--      preceding record).

--      2c. Assign P as the actual position of the field.

--      2d. Compute the length, L, of the represented field (see below)
--      and compute P'=P+L. Unless the field represents a variant part
--      (see below and also Variant Record Encoding), set P to P'.

-- The alignment, if present, is encoded in the field name of the
-- record, which has a suffix:

-- where the nn after the XVA indicates the alignment value in storage
-- units. This encoding is present only if an alignment is present.

-- The size of the record described by an XVE-encoded type (in bits)
-- is generally the maximum value attained by P' in step 2d above,
-- rounded up according to the record's alignment.
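
-- Steps 1 to 2d above can be sketched as follows. This is only an
-- illustration of the algorithm as described, not code used by any
-- debugger. The names Xv_Layout_Demo, Field, Field_List and Show_Layout
-- are hypothetical, field lengths are assumed to be already known, and
-- storage units are assumed to be 8 bits wide.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Xv_Layout_Demo is

   Bits_Per_Unit : constant := 8;   --  assumption: 8-bit storage units

   type Field is record
      Alignment  : Natural;   --  nn of an XVA suffix, 0 if absent
      Bit_Offset : Natural;   --  encoded "position" of the field
      Length     : Natural;   --  length of the field in bits
      Is_Variant : Boolean;   --  True if the field is a variant part
   end record;

   type Field_List is array (Positive range <>) of Field;

   procedure Show_Layout (Fields : Field_List) is
      P     : Natural := 0;   --  step 1
      Max_P : Natural := 0;   --  largest P' reached in step 2d
      Align : Natural;
   begin
      for I in Fields'Range loop

         --  Step 2a: round P up to the next multiple of the alignment
         if Fields (I).Alignment > 0 then
            Align := Fields (I).Alignment * Bits_Per_Unit;
            P := ((P + Align - 1) / Align) * Align;
         end if;

         --  Step 2b: the encoded position is a relative offset
         P := P + Fields (I).Bit_Offset;

         --  Step 2c: P is now the actual position of the field
         Put_Line
           ("field" & Integer'Image (I) & " at bit" & Natural'Image (P));

         --  Step 2d: P' = P + L; P advances unless this is a variant
         Max_P := Natural'Max (Max_P, P + Fields (I).Length);

         if not Fields (I).Is_Variant then
            P := P + Fields (I).Length;
         end if;
      end loop;

      Put_Line ("size before final rounding:" & Natural'Image (Max_P));
   end Show_Layout;

begin
   --  An 8-bit field with encoded position 0, followed by a 32-bit
   --  field with encoded position 24
   Show_Layout
     (((Alignment => 0, Bit_Offset => 0, Length => 8, Is_Variant => False),
       (Alignment => 0, Bit_Offset => 24, Length => 32, Is_Variant => False)));
end Xv_Layout_Demo;
```

-- For these two fields the sketch reports bit positions 0 and 32 and a
-- size of 64 bits before the final rounding to the record's alignment,
-- which the sketch does not perform.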

-- Second, the variable-length fields themselves are represented by
-- replacing the type by a special access type. The designated type
-- of this access type is the original variable-length type, and the
-- fact that this field has been transformed in this way is signalled
-- by encoding the field name as:

--    field___XVL

-- where field is the original field name. If a field is both
-- variable-length and also needs an alignment encoding, then the
-- encodings are combined using:

--    field___XVLnn

-- Note: the reason that we change the type is so that the resulting
-- type has no variable-length fields. At least some of the formats
-- used for debugging information simply cannot tolerate variable-
-- length fields, so the encoded information would get lost.

-- Third, in the case of a variant record, the special union
-- that contains the variants is replaced by a normal C union.
-- In this case, the positions are all zero.

-- Discriminants appear before any variable-length fields that depend
-- on them, with one exception. In some cases, a discriminant
-- governing the choice of a variant clause may appear in the list
-- of fields of an XVE type after the entry for the variant clause
-- itself (this can happen in the presence of a representation clause
-- for the record type in the source program). However, when this
-- happens, the discriminant's position may be determined by first
-- applying the rules described in this section, ignoring the variant
-- clause. As a result, discriminants can always be located
-- independently of the variable-length fields that depend on them.

-- The size of the ___XVE or ___XVU record or union is set to the
-- alignment (in bytes) of the original object so that the debugger
-- can calculate the size of the original type.

-- As an example of this encoding, consider the declarations:

--    type Q is array (1 .. V1) of Float;       -- alignment 4
--    type R is array (1 .. V2) of Long_Float;  -- alignment 8

--       C : String (1 .. V3);

-- The encoded type looks like:

--    type anonymousQ is access Q;
--    type anonymousR is access R;

--    type X___XVE is record
--       A        : Character;               -- position contains 0
--       B        : Float;                   -- position contains 24
--       C___XVL  : access String (1 .. V3); -- position contains 0
--       D___XVA4 : Float;                   -- position contains 0
--       E___XVL4 : anonymousQ;              -- position contains 0
--       F___XVL8 : anonymousR;              -- position contains 0
--       G        : Float;                   -- position contains 0
--    end record;

-- Any bit sizes recorded for fields other than dynamic fields and
-- variants are honored as for ordinary records.

-- 1) The B field could also have been encoded by using a position
-- of zero, and an alignment of 4, but in such a case, the coding by
-- position is preferred (since it takes up less space). We have used
-- the (illegal) notation access xxx as field types in the example

-- 2) The E field does not actually need the alignment indication
-- but this may not be detected in this case by the conversion

-- 3) Our conventions do not cover all XVE-encoded records in which
-- some, but not all, fields have representation clauses. Such
-- records may, therefore, be displayed incorrectly by debuggers.
-- This situation is not common.


-----------------------
-- Base Record Types --
-----------------------

-- Under certain circumstances, debuggers need two descriptions
-- of a record type, one that gives the actual details of the
-- base type's structure (as described elsewhere in these
-- comments) and one that may be used to obtain information
-- about the particular subtype and the size of the objects
-- being typed. In such cases the compiler will substitute a
-- type whose name is typically compiler-generated and
-- irrelevant except as a key for obtaining the actual type.
-- Specifically, if this name is x, then we produce a record
-- type named x___XVS consisting of one field. The name of
-- this field is that of the actual type being encoded, which
-- we'll call y (the type of this single field is arbitrary).
-- Both x and y may have corresponding ___XVE types.

-- The size of the objects typed as x should be obtained from
-- the structure of x (and x___XVE, if applicable) as for
-- ordinary types unless there is a variable named x___XVZ, which,
-- if present, will hold the size (in bits) of x.

-- The type x will either be a subtype of y (see also Subtypes
-- of Variant Records, below) or will contain no fields at
-- all. The layout, types, and positions of these fields will
-- be accurate, if present. (Currently, however, the GDB
-- debugger makes no use of x except to determine its size).

-- Among other uses, XVS types are sometimes used to encode
-- unconstrained types. For example, given

--    subtype Int is INTEGER range 0..10;
--    type T1 (N: Int := 0) is record
--       F1: String (1 .. N);
--    end record;
--    type AT1 is array (INTEGER range <>) of T1;

-- the element type for AT1 might have a type defined as if it had
-- been declared

--    type at1___C_PAD is record null; end record;
--    for at1___C_PAD'Size use 16 * 8;

-- and there would also be

--    type at1___C_PAD___XVS is record t1: Integer; end record;

-- Had the subtype Int been dynamic:

--    subtype Int is INTEGER range 0 .. M; -- M a variable

-- Then the compiler would also generate a declaration whose effect
-- would be

--    at1___C_PAD___XVZ: constant Integer := 32 + M * 8 + padding term;

-- Not all unconstrained types are so encoded; the XVS
-- convention may be unnecessary for unconstrained types of
-- fixed size. However, this encoding is always necessary when
-- a subcomponent type (array element's type or record field's
-- type) is an unconstrained record type some of whose
-- components depend on discriminant values.

-- Since there is no way for the debugger to obtain the index subtypes
-- for an array type, we produce a type that has the name of the
-- array type followed by "___XA" and is a record whose field names
-- are the names of the types for the bounds. The type of each of
-- these fields is an integer type, which is meaningless.

-- To conserve space, we do not produce this type unless one of
-- the index types is either an enumeration type, has a variable
-- upper bound, has a lower bound different from the constant 1,
-- is a biased type, or is wider than "sizetype".

-- Given the full encoding of these types (see above description for
-- the encoding of discrete types), this means that all necessary
-- information for addressing arrays is available. In some
-- debugging formats, some or all of the bounds information may
-- be available redundantly, particularly in the fixed-point case,
-- but this information can in any case be ignored by the debugger.

----------------------------
-- Note on Implicit Types --
----------------------------

-- The compiler creates implicit type names in many situations where
-- a type is present semantically, but no specific name is present.
-- For example:

--    S : Integer range M .. N;

-- Here the subtype of S is not Integer, but rather an anonymous
-- subtype of Integer. Where possible, the compiler generates names
-- for such anonymous types that are related to the type from which
-- the subtype is obtained as follows:

--    T name suffix

-- where name is the name from which the subtype is obtained, using
-- lower case letters and underscores, and suffix starts with an upper
-- case letter. For example, the name for the above declaration of S

-- If the debugger is asked to give the type of an entity and the type
-- has the form T name suffix, it is probably appropriate to just use
-- "name" in the response since this is what is meaningful to the
-- user.
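
-- As a minimal sketch (not part of GNAT), a debugger could recover
-- "name" from a name of the form T name suffix as shown below. The
-- function name Base_Name is hypothetical, and the sketch assumes that
-- encoded entity names are in lower case, so that the first upper case
-- letter after the leading T starts the suffix:

--    function Base_Name (Encoded : String) return String is
--    begin
--       --  Not of the form T name suffix: return the name unchanged
--       if Encoded'Length < 2
--         or else Encoded (Encoded'First) /= 'T'
--       then
--          return Encoded;
--       end if;
--
--       --  The suffix starts at the first upper case letter after T
--       for J in Encoded'First + 1 .. Encoded'Last loop
--          if Encoded (J) in 'A' .. 'Z' then
--             return Encoded (Encoded'First + 1 .. J - 1);
--          end if;
--       end loop;
--
--       return Encoded;  --  no suffix found
--    end Base_Name;

-- Under these assumptions, Base_Name ("TvarS3b") returns "var".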

-------------------------------------------------
-- Subprograms for Handling Encoded Type Names --
-------------------------------------------------

procedure Get_Encoded_Name (E : Entity_Id);
-- If the entity is a typename, store the external name of
-- the entity as in Get_External_Name, followed by three underscores
-- plus the type encoding in Name_Buffer with the length in Name_Len,
-- and an ASCII.NUL character stored following the name.
-- Otherwise set Name_Buffer and Name_Len to hold the entity name.

-- Debugging information is generated for exception, object, package,
-- and subprogram renaming (generic renamings are not significant, since
-- generic templates are not relevant at debugging time).

-- Consider a renaming declaration of the form

-- There is one case in which no special debugging information is required,
-- namely the case of an object renaming where the backend allocates a
-- reference for the renamed variable, and the entity x is this reference.
-- The debugger can handle this case without any special processing or
-- encoding (it won't know it was a renaming, but that does not matter).

-- All other cases of renaming generate a dummy type definition for
-- an entity whose name is:

--    x___XR    for an object renaming
--    x___XRE   for an exception renaming
--    x___XRP   for a package renaming

-- The name is fully qualified in the usual manner, i.e. qualified in
-- the same manner as the entity x would be.

-- Note: subprogram renamings are not encoded at the present time.

-- The type is an enumeration type with a single enumeration literal
-- that is an identifier which describes the renamed variable.

-- For the simple entity case, where y is an entity name,
-- the enumeration is of the form:

--    type x___XR is (y___XE);

-- i.e. the enumeration type has a single field, whose name
-- matches the name y, with the XE suffix. The entity for this
-- enumeration literal is fully qualified in the usual manner.
-- All subprogram, exception, and package renamings fall into
-- this category, as well as simple object renamings.

-- For the object renaming case where y is a selected component or an
-- indexed component, the literal name is suffixed by additional fields
-- that give details of the components. The name starts as above with
-- a y___XE entity indicating the outer level variable. Then a series
-- of selections and indexing operations can be specified as follows:

--    Indexed component

--       A series of subscript values appears in sequence; their number
--       corresponds to the number of dimensions of the array. The
--       subscripts have one of the following two forms:

--          XSnnn

--             Here nnn is a constant value, encoded as a decimal
--             integer (pos value for enumeration type case). Negative
--             values have a trailing 'm' as usual.

--          XSe

--             Here e is the (unqualified) name of a constant entity in
--             the same scope as the renaming which contains the subscript
--             value.

--       For the slice case, we have two entries. The first is for
--       the lower bound of the slice, and has the form

--          XLnnn
--          XLe

--             Specifies the lower bound, using exactly the same encoding
--             as for an XS subscript as described above.

--       Then the upper bound appears in the usual XSnnn/XSe form.

--    Selected component

--       For a selected component, we have a single entry

--          XRf

--             Here f is the field name for the selection.

--       For an explicit dereference (.all), we have a single entry

--          XA

-- As an example, consider the declarations:

--    package p is
--       type q is record
--          m : string (2 .. 5);
--       end record;

--       type r is array (1 .. 10, 1 .. 20) of q;

--       g : r;

--       z : string renames g (1,5).m(2 ..3);
--    end p;

-- The generated type definition would appear as

--    type p__z___XR is
--      (p__g___XEXS1XS5XRmXL2XS3);
--       p__g___XE--------------------outer entity is g
--                XS1-----------------first subscript for g
--                   XS5--------------second subscript for g
--                      XRm-----------select field m
--                         XL2--------lower bound of slice
--                            XS3-----upper bound of slice

function Debug_Renaming_Declaration (N : Node_Id) return Node_Id;
-- The argument N is a renaming declaration. The result is a type
-- declaration as described in the above paragraphs. If no special
-- debug declaration is needed, then Empty is returned.

---------------------------
-- Packed Array Encoding --
---------------------------

-- For every packed array, two types are created, and both appear in
-- the debugging output.

--    The original declared array type is a perfectly normal array type,
--    and its index bounds indicate the original bounds of the array.

--    The corresponding packed array type may be a modular type or an
--    array-of-bytes type (see Exp_Pakd for full details). This is the
--    type that is actually used in the generated code and for
--    debugging information for all objects of the packed type.

-- The name of the corresponding packed array type is:

--    ttt___XPnnn

-- where

--    ttt is the name of the original declared array
--    nnn is the component size in bits (1-31)

-- When the debugger sees that an object is of a type that is encoded
-- in this manner, it can use the original type to determine the bounds,
-- and the component size to determine the packing details.

-- Packed arrays are represented in tightly packed form, with no extra
-- bits between components. This is true even when the component size
-- is not a factor of the storage unit size, so that as a result it is
-- possible for components to cross storage unit boundaries.

-- The layout in storage is identical, regardless of whether the
-- implementation type is a modular type or an array-of-bytes type.
-- See Exp_Pakd for details of how these implementation types are used,
-- but for the purpose of the debugger, only the starting address of
-- the object in memory is significant.

-- The following example should show clearly how the packing works in
-- the little-endian and big-endian cases:

--    type B is range 0 .. 7;
--    for B'Size use 3;

--    type BA is array (0 .. 5) of B;
--    pragma Pack (BA);

--    BV : constant BA := (1,2,3,4,5,6);

-- Little endian case

--     BV'Address + 2    BV'Address + 1    BV'Address + 0
--  +-----------------+-----------------+-----------------+
--  | 0 0 0 0 0 0 1 1 | 0 1 0 1 1 0 0 0 | 1 1 0 1 0 0 0 1 |
--  +-----------------+-----------------+-----------------+
--    <---------> <-----> <---> <---> <-----> <---> <--->
--    unused bits  BV(5)  BV(4) BV(3)  BV(2)  BV(1) BV(0)

-- Big endian case

--     BV'Address + 0    BV'Address + 1    BV'Address + 2
--  +-----------------+-----------------+-----------------+
--  | 0 0 1 0 1 0 0 1 | 1 1 0 0 1 0 1 1 | 1 0 0 0 0 0 0 0 |
--  +-----------------+-----------------+-----------------+
--    <---> <---> <-----> <---> <---> <-----> <--------->
--    BV(0) BV(1)  BV(2)  BV(3) BV(4)  BV(5)  unused bits
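
-- The two layouts shown for BV can be decoded mechanically. The
-- following is a minimal sketch, not the code used by GNAT or by any
-- debugger; the names Packed_Demo, Byte, Byte_Array and Component are
-- hypothetical. It fetches one component of a packed array whose
-- storage is given as an array of bytes starting at the address of
-- the object:

--    procedure Packed_Demo is
--       type Byte is mod 2 ** 8;
--       type Byte_Array is array (Natural range <>) of Byte;
--
--       --  Storage for BV in the little-endian and big-endian cases
--       LE : constant Byte_Array := (16#D1#, 16#58#, 16#03#);
--       BE : constant Byte_Array := (16#29#, 16#CB#, 16#80#);
--
--       function Component
--         (Data          : Byte_Array;
--          Index         : Natural;   --  zero-based component number
--          Csize         : Positive;  --  component size in bits
--          Little_Endian : Boolean) return Natural
--       is
--          Result : Natural := 0;
--          K      : Natural;  --  bit offset from the start of the object
--          Bit    : Natural;
--       begin
--          for J in 0 .. Csize - 1 loop
--             K := Index * Csize + J;
--
--             if Little_Endian then
--                --  Bits are numbered from the low-order end of each
--                --  byte; the low-order bit of the component is first.
--                Bit := Natural (Data (Data'First + K / 8)
--                                  / 2 ** (K mod 8)) mod 2;
--                Result := Result + Bit * 2 ** J;
--             else
--                --  Bits are numbered from the high-order end of each
--                --  byte; the high-order bit of the component is first.
--                Bit := Natural (Data (Data'First + K / 8)
--                                  / 2 ** (7 - K mod 8)) mod 2;
--                Result := Result * 2 + Bit;
--             end if;
--          end loop;
--
--          return Result;
--       end Component;
--
--    begin
--       --  Assertions are only checked when enabled (e.g. -gnata)
--       pragma Assert (Component (LE, 2, 3, True) = 3);   --  BV(2)
--       pragma Assert (Component (BE, 5, 3, False) = 6);  --  BV(5)
--       null;
--    end Packed_Demo;

-- Working bit by bit keeps the sketch independent of whether a
-- component crosses a storage unit boundary, as BV(2) and BV(5) do.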

------------------------------------------------------
-- Subprograms for Handling Packed Array Type Names --
------------------------------------------------------

function Make_Packed_Array_Type_Name
-- This function is used in Exp_Pakd to create the name that is encoded
-- as described above. The entity Typ provides the name ttt, and the
-- value Csize is the component size that provides the nnn value.

--------------------------------------
-- Pointers to Unconstrained Arrays --
--------------------------------------

-- There are two kinds of pointers to arrays. The debugger can tell
-- which format is in use by the form of the type of the pointer.

-- Fat pointers are represented as a struct with two fields. This
-- struct has two distinguished field names:

--    P_ARRAY is a pointer to the array type. The name of this
--    type is the unconstrained type followed by "___XUA". This
--    array will have bounds which are the discriminants, and
--    hence are unparsable, but will give the number of
--    subscripts and the component type.

--    P_BOUNDS is a pointer to a struct, the name of whose type is the
--    unconstrained array name followed by "___XUB" and which has
--    fields of the form

--       LBn (n a decimal integer) lower bound of n'th dimension
--       UBn (n a decimal integer) upper bound of n'th dimension

--    The bounds may be any integral type. In the case of an
--    enumeration type, Enum_Rep values are used.

-- The debugging information will sometimes reference an anonymous
-- fat pointer type. Such types are given the name xxx___XUP, where
-- xxx is the name of the designated type. If the debugger is asked
-- to output such a type name, the appropriate form is "access xxx".

-- Thin pointers are represented as a pointer to the ARRAY field of
-- a structure with two fields. The name of the structure type is
-- that of the unconstrained array followed by "___XUT".

--    The field ARRAY contains the array value. This array field is
--    typically a variable-length array, and consequently the entire
--    record structure will be encoded as previously described,
--    resulting in a type with suffix "___XUT___XVE".

--    The field BOUNDS is a struct containing the bounds as above.

--------------------------------------
-- Tagged Types and Type Extensions --
--------------------------------------

-- A type C derived from a tagged type P has a field named "_parent"
-- of type P that contains its inherited fields. The type of this
-- field is usually P (encoded as usual if it has a dynamic size),
-- but may be a more distant ancestor, if P is a null extension of
-- that ancestor.

-- The type tag of a tagged type is a field named _tag, of type void*.
-- If the type is derived from another tagged type, its _tag field is
-- found in its _parent field.

-----------------------------
-- Variant Record Encoding --
-----------------------------

-- The variant part of a variant record is encoded as a single field
-- in the enclosing record, whose name is:

--    discrim___XVN

-- where discrim is the unqualified name of the variant. This field name
-- is built by gigi (not by code in this unit). In the case of an
-- Unchecked_Union record, this discriminant will not appear in the
-- record, and the debugger must proceed accordingly (basically it
-- can treat this case as it would a C union).

-- The type corresponding to this field has a name that is obtained
-- by concatenating the type name with the above string and is similar
-- to a C union, in which each member of the union corresponds to one
-- variant. However, unlike a C union, the size of the type may be
-- variable even if each of the components is of fixed size, since it
-- includes a computation of which variant is present. In that case,
-- it will be encoded as above and a type with the suffix "___XVN___XVU"
-- will be present.

-- The name of the union member is encoded to indicate the choices, and
-- is a string given by the following grammar:

--    union_name ::= {choice} | others_choice
--    choice ::= simple_choice | range_choice
--    simple_choice ::= S number
--    range_choice ::= R number T number
--    number ::= {decimal_digit} [m]
--    others_choice ::= O (upper case letter O)

-- The m in a number indicates a negative value. As an example of this
-- encoding scheme, the choice 1 .. 4 | 7 | -10 would be represented by

--    R1T4S7S10m

-- In the case of enumeration values, the values used are the
-- actual representation values in the case where an enumeration type
-- has an enumeration representation spec (i.e. they are values that
-- correspond to the use of the Enum_Rep attribute).
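
-- The choice grammar given for union member names lends itself to a
-- simple left-to-right scan. The following is a minimal sketch, not
-- code from GNAT or from any debugger; the names Variant_Matches and
-- Scan_Number are hypothetical. It tests whether a discriminant value
-- (a representation value in the enumeration case) is selected by an
-- encoded union member name. A member named O matches any value, so a
-- caller would try the other members of the union first:

--    function Variant_Matches
--      (Name  : String;
--       Value : Integer) return Boolean
--    is
--       Pos : Natural := Name'First;
--       Lo  : Integer;
--       Hi  : Integer;
--
--       function Scan_Number return Integer is
--          N : Integer := 0;
--       begin
--          while Pos <= Name'Last and then Name (Pos) in '0' .. '9' loop
--             N := N * 10
--                    + Character'Pos (Name (Pos)) - Character'Pos ('0');
--             Pos := Pos + 1;
--          end loop;
--
--          --  A trailing m marks a negative value
--          if Pos <= Name'Last and then Name (Pos) = 'm' then
--             Pos := Pos + 1;
--             N := -N;
--          end if;
--
--          return N;
--       end Scan_Number;
--
--    begin
--       while Pos <= Name'Last loop
--          case Name (Pos) is
--             when 'O' =>
--                return True;
--
--             when 'S' =>
--                Pos := Pos + 1;
--                if Scan_Number = Value then
--                   return True;
--                end if;
--
--             when 'R' =>
--                Pos := Pos + 1;
--                Lo := Scan_Number;
--                Pos := Pos + 1;  --  skip the T
--                Hi := Scan_Number;
--                if Value in Lo .. Hi then
--                   return True;
--                end if;
--
--             when others =>
--                return False;  --  not a well-formed union_name
--          end case;
--       end loop;
--
--       return False;
--    end Variant_Matches;

-- For the name R1T4S7S10m, which the grammar yields for the choice
-- 1 .. 4 | 7 | -10, this sketch returns True for the values 3, 7
-- and -10, and False for 5.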

-- The type of the inner record is given by the name of the union
-- type (as above) concatenated with the above string. Since that
-- type may itself be variable-sized, it may also be encoded as above
-- with a new type with a further suffix of "___XVU".

-- As an example, consider:

--    type Var (Disc : Boolean := True) is record

-- In this case, the type var is represented as a struct with three
-- fields, the first two are "disc" and "m", representing the values
-- of these record components.

-- The third field is a union of two types, with field names S1 and O.
-- S1 is a struct with fields "r" and "s", and O is a struct with

------------------------------------------------
-- Subprograms for Handling Variant Encodings --
------------------------------------------------

procedure Get_Variant_Encoding (V : Node_Id);
-- This procedure is called by Gigi with V being the variant node.
-- The corresponding encoding string is returned in Name_Buffer with
-- the length of the string in Name_Len, and an ASCII.NUL character
-- stored following the name.

---------------------------------
-- Subtypes of Variant Records --
---------------------------------

-- A subtype of a variant record is represented by a type in which the
-- union field from the base type is replaced by one of the possible
-- values. For example, if we have:

--    type Var (Disc : Boolean := True) is record

--    V3 : Var (False);

-- Here V2 for example is represented with a subtype whose name is
-- something like TvarS3b, which is a struct with three fields. The
-- first two fields are "disc" and "m" as for the base type, and
-- the third field is S1, which contains the fields "r" and "s".

-- The debugger should simply ignore structs with names of the form
-- corresponding to variants, and consider the fields inside as
-- belonging to the containing record.

-------------------------------------------
-- Character literals in Character Types --
-------------------------------------------

-- Character types are enumeration types at least one of whose
-- enumeration literals is a character literal. Enumeration literals
-- are usually simply represented using their identifier names. In
-- the case where an enumeration literal is a character literal, the
-- name is encoded as described in the following paragraph.

-- A name QUhh, where each 'h' is a lower-case hexadecimal digit,
-- stands for a character whose Unicode encoding is hh, and
-- QWhhhh likewise stands for a wide character whose encoding
-- is hhhh. The representation values are encoded as for ordinary
-- enumeration literals (and have no necessary relationship to the
-- values encoded in the names).

-- For example, given the type declaration

--    type x is (A, 'C', B);

-- the second enumeration literal would be named QU43 and the
-- value assigned to it would be 1.
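
-- As a minimal sketch (not part of GNAT), the character code can be
-- recovered from a QUhh or QWhhhh name by reading the digits after
-- the two-letter prefix as a base 16 literal. The function name
-- Literal_Code is hypothetical, and the sketch assumes that the caller
-- has already checked for the QU or QW prefix; Constraint_Error is
-- raised if the remaining characters are not hexadecimal digits:

--    function Literal_Code (Name : String) return Natural is
--    begin
--       --  Skip the QU or QW prefix, then convert the hex digits
--       return Natural'Value
--         ("16#" & Name (Name'First + 2 .. Name'Last) & "#");
--    end Literal_Code;

-- With this sketch, Literal_Code ("QU43") returns 67, which is
-- Character'Pos ('C').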