------------------------------------------------------------------------------
-- GNAT COMPILER COMPONENTS --
-- Copyright (C) 1992-2017, Free Software Foundation, Inc. --
-- GNAT is free software; you can redistribute it and/or modify it under --
-- terms of the GNU General Public License as published by the Free Soft- --
-- ware Foundation; either version 3, or (at your option) any later ver- --
-- sion. GNAT is distributed in the hope that it will be useful, but WITH- --
-- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY --
-- or FITNESS FOR A PARTICULAR PURPOSE. --
-- As a special exception under Section 7 of GPL version 3, you are granted --
-- additional permissions described in the GCC Runtime Library Exception, --
-- version 3.1, as published by the Free Software Foundation. --
-- You should have received a copy of the GNU General Public License and --
-- a copy of the GCC Runtime Library Exception along with this program; --
-- see the files COPYING3 and COPYING.RUNTIME respectively. If not, see --
-- <http://www.gnu.org/licenses/>. --
-- GNAT was originally developed by the GNAT team at New York University. --
-- Extensive contributions were provided by Ada Core Technologies Inc. --
------------------------------------------------------------------------------

-- This package contains the input routines used for reading the
-- input source file. The actual I/O routines are in OS_Interface,
-- with this module containing only the system independent processing.

-- General Note: throughout the compiler, we use the term line or source
-- line to refer to a physical line in the source, terminated by the end of
-- physical line sequence.

-- There are two distinct concepts of line terminator in GNAT

-- A logical line terminator is what corresponds to the "end of a line" as
-- described in RM 2.2 (13). Any of the characters FF, LF, CR or VT or any
-- wide character that is a Line or Paragraph Separator acts as an end of
-- logical line in this sense, and it is essentially irrelevant whether one
-- or more appears in sequence (since if a sequence of such characters is
-- regarded as separate ends of line, then the intervening logical lines
-- are null in any case).

-- A physical line terminator is a sequence of format effectors that is
-- treated as ending a physical line. Physical lines have no Ada semantic
-- significance, but they are significant for error reporting purposes,
-- since errors are identified by line and column location.

-- In GNAT, a physical line is ended by any of the sequences LF, CR/LF, or
-- CR. LF is used in typical Unix systems, CR/LF in DOS systems, and CR
-- alone in System 7. In addition, we recognize any of these sequences in
-- any of the operating systems, for better behavior in treating foreign
-- files (e.g. a Unix file with LF terminators transferred to a DOS system).
-- Finally, wide character codes in categories Separator, Line and Separator,
-- Paragraph are considered to be physical line terminators.

with Casing; use Casing;
with Namet; use Namet;
with Types; use Types;

type Type_Of_File is (
-- Indicates type of file being read

-- Normal Ada source file

-- Configuration pragma file

-- Preprocessing definition file

-- Source file with preprocessing commands to be preprocessed

type Instance_Id is new Nat;
No_Instance_Id : constant Instance_Id;

----------------------------
-- Source License Control --
----------------------------

-- The following type indicates the license state of a source if it
-- is known.

-- Licensing status of this source unit is unknown

-- This is a non-GPL'ed unit that is restricted from depending
-- on GPL'ed units (e.g. proprietary code is in this category)

-- This file is licensed under the unmodified GPL. It is not allowed
-- to depend on Non_GPL units, and Non_GPL units may not depend on
-- it.

-- This file is licensed under the GNAT modified GPL (see header of
-- this file for wording of the modification). It may depend on other
-- Modified_GPL units or on unrestricted units.

-- This file is permitted to depend on any other units, or have other
-- units depend on it, without violating the license of this unit
-- (examples are public domain units and units defined in the RM).

-- The above license status is checked when the appropriate check is
-- activated and one source depends on another, and the licensing state
-- of both files is known.

-- The prohibited combinations are:

-- Restricted file may not depend on GPL file

-- GPL file may not depend on Restricted file

-- Modified_GPL file may not depend on Restricted file
-- Modified_GPL file may not depend on GPL file

-- The reason for the last restriction here is that a client depending
-- on a modified GPL file must be sure that the license condition is
-- correct when considered transitively.

-- The licensing status is determined either by the presence of a
-- specific pragma License, or by scanning the header for a predefined
-- statement, or any file if compiling in -gnatg mode.
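
-- Illustrative sketch (not part of GNAT): the prohibited combinations
-- listed above can be expressed as a small predicate. The names
-- License_Demo, License_Kind and Dependency_Allowed are hypothetical and
-- chosen for this example only, and the enumeration literals are assumed
-- from the descriptions above. A dependency is rejected only when both
-- license states are known.
--
--    with Ada.Text_IO; use Ada.Text_IO;
--
--    procedure License_Demo is
--       type License_Kind is
--         (Unknown, Restricted, GPL, Modified_GPL, Unrestricted);
--
--       function Dependency_Allowed
--         (Client, Server : License_Kind) return Boolean is
--       begin
--          --  No check is possible unless both states are known
--          if Client = Unknown or else Server = Unknown then
--             return True;
--          end if;
--
--          case Client is
--             when Restricted   => return Server /= GPL;
--             when GPL          => return Server /= Restricted;
--             when Modified_GPL =>
--                return Server /= Restricted and then Server /= GPL;
--             when others       => return True;
--          end case;
--       end Dependency_Allowed;
--    begin
--       --  A Modified_GPL client depending on a GPL unit is prohibited
--       Put_Line (Boolean'Image (Dependency_Allowed (Modified_GPL, GPL)));
--       --  Depending on an unrestricted unit is always permitted
--       Put_Line (Boolean'Image (Dependency_Allowed (GPL, Unrestricted)));
--    end License_Demo;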

-----------------------
-- Source File Table --
-----------------------

-- The source file table has an entry for each source file read in for
-- this run of the compiler. This table is (default) initialized when
-- the compiler is loaded, and simply accumulates entries as compilation
-- proceeds and various routines in Sinput and its child packages are
-- called to load required source files.

-- Virtual entries are also created for generic templates when they are
-- instantiated, as described in a separate section later on.

-- In the case where there are multiple main units (e.g. in the case of
-- the cross-reference tool), this table is not reset between these units,
-- so that a given source file is only read once if it is used by two
-- separate main units.

-- The entries in the table are accessed using a Source_File_Index that
-- ranges from 1 to Last_Source_File. Each entry has the following fields.

-- Note: fields marked read-only are set by Sinput or one of its child
-- packages when a source file table entry is created, and cannot be
-- subsequently modified, or alternatively are set only by very special
-- circumstances, documented in the comments.

-- File_Name : File_Name_Type (read-only)
--    Name of the source file (simple name with no directory information)

-- Full_File_Name : File_Name_Type (read-only)
--    Full file name (full name with directory info), used for generation
--    of error messages, etc.

-- File_Type : Type_Of_File (read-only)
--    Indicates type of file (source file, configuration pragmas file,
--    preprocessor definition file, preprocessor input file).

-- Reference_Name : File_Name_Type (read-only)
--    Name to be used for source file references in error messages where
--    only the simple name of the file is required. Identical to File_Name
--    unless pragma Source_Reference is used to change it. Only processing
--    for the Source_Reference pragma may set this field.

-- Full_Ref_Name : File_Name_Type (read-only)
--    Name to be used for source file references in error messages where
--    the full name of the file is required. Identical to Full_File_Name
--    unless pragma Source_Reference is used to change it. Only processing
--    for the Source_Reference pragma may set this field.

-- Debug_Source_Name : File_Name_Type (read-only)
--    Name to be used for source file references in debugging information
--    where only the simple name of the file is required. Identical to
--    Reference_Name unless the -gnatD (debug source file) switch is used.
--    Only processing in Sprint that generates this file is permitted to
--    set this field.

-- Full_Debug_Name : File_Name_Type (read-only)
--    Name to be used for source file references in debugging information
--    where the full name of the file is required. This is identical to
--    Full_Ref_Name unless the -gnatD (debug source file) switch is used.
--    Only processing in Sprint that generates this file is permitted to
--    set this field.

-- Instance : Instance_Id (read-only)
--    For entries corresponding to a generic instantiation, unique
--    identifier denoting the full chain of nested instantiations. Set to
--    No_Instance_Id for the case of a normal, non-instantiation entry.
--    See below for details on the handling of generic instantiations.

-- License : License_Type;
--    License status of source file

-- Num_SRef_Pragmas : Nat;
--    Number of source reference pragmas present in source file

-- First_Mapped_Line : Logical_Line_Number;
--    This field stores logical line number of the first line in the
--    file that is not a Source_Reference pragma. If no source reference
--    pragmas are used, then the value is set to No_Line_Number.

-- Source_Text : Source_Buffer_Ptr (read-only)
--    Text of source file. Every source file has a distinct set of
--    nonoverlapping bounds, so it is possible to determine which
--    file is referenced from a given subscript (Source_Ptr) value.

-- Source_First : Source_Ptr; (read-only)
--    This is always equal to Source_Text'First, except during
--    construction of a debug output file (*.dg), when Source_Text = null,
--    and Source_First is the size so far. Likewise for Last.

-- Source_Last : Source_Ptr; (read-only)
--    Same idea as Source_First, but for Last

-- Time_Stamp : Time_Stamp_Type; (read-only)
--    Time stamp of the source file

-- Source_Checksum : Word;
--    Computed checksum for contents of source file. See separate section
--    later on in this spec for a description of the checksum algorithm.

-- Last_Source_Line : Physical_Line_Number;
--    Physical line number of last source line. While a file is being
--    read, this refers to the last line scanned. Once a file has been
--    completely scanned, it is the number of the last line in the file,
--    and hence also gives the number of source lines in the file.

-- Keyword_Casing : Casing_Type;
--    Casing style used in file for keyword casing. This is initialized
--    to Unknown, and then set from the first occurrence of a keyword.
--    This value is used only for formatting of error messages.

-- Identifier_Casing : Casing_Type;
--    Casing style used in file for identifier casing. This is initialized
--    to Unknown, and then set from an identifier in the program as soon as
--    one is found whose casing is sufficiently clear to make a decision.
--    This value is used for formatting of error messages, and also is used
--    in the detection of keywords misused as identifiers.

-- Inlined_Call : Source_Ptr;
--    Source file location of the subprogram call if this source file entry
--    represents an inlined body or an inherited pragma. Set to No_Location
--    otherwise. This field is read-only for clients.

-- Inlined_Body : Boolean;
--    This can only be set True if Instantiation has a value other than
--    No_Location. If true it indicates that the instantiation is actually
--    an instance of an inlined body.

-- Inherited_Pragma : Boolean;
--    This can only be set True if Instantiation has a value other than
--    No_Location. If true it indicates that the instantiation is actually
--    an inherited class-wide pre- or postcondition.

-- Template : Source_File_Index; (read-only)
--    Source file index of the source file containing the template if this
--    is a generic instantiation. Set to No_Source_File for the normal case
--    of a non-instantiation entry. See Sinput-L for details.

-- Unit : Unit_Number_Type;
--    Identifies the unit contained in this source file. Set by
--    Initialize_Scanner, must not be subsequently altered.

-- The source file table is accessed by clients using the following
-- subprogram interface:

subtype SFI is Source_File_Index;

System_Source_File_Index : SFI;
-- The file system.ads is always read by the compiler to determine the
-- settings of the target parameters in the private part of System. This
-- variable records the source file index of system.ads. Typically this
-- will be 1 since system.ads is read first.

function Debug_Source_Name (S : SFI) return File_Name_Type;
function File_Name (S : SFI) return File_Name_Type;
function File_Type (S : SFI) return Type_Of_File;
function First_Mapped_Line (S : SFI) return Logical_Line_Number;
function Full_Debug_Name (S : SFI) return File_Name_Type;
function Full_File_Name (S : SFI) return File_Name_Type;
function Full_Ref_Name (S : SFI) return File_Name_Type;
function Identifier_Casing (S : SFI) return Casing_Type;
function Inlined_Body (S : SFI) return Boolean;
function Inherited_Pragma (S : SFI) return Boolean;
function Inlined_Call (S : SFI) return Source_Ptr;
function Instance (S : SFI) return Instance_Id;
function Keyword_Casing (S : SFI) return Casing_Type;
function Last_Source_Line (S : SFI) return Physical_Line_Number;
function License (S : SFI) return License_Type;
function Num_SRef_Pragmas (S : SFI) return Nat;
function Reference_Name (S : SFI) return File_Name_Type;
function Source_Checksum (S : SFI) return Word;
function Source_First (S : SFI) return Source_Ptr;
function Source_Last (S : SFI) return Source_Ptr;
function Source_Text (S : SFI) return Source_Buffer_Ptr;
function Template (S : SFI) return Source_File_Index;
function Unit (S : SFI) return Unit_Number_Type;
function Time_Stamp (S : SFI) return Time_Stamp_Type;

procedure Set_Keyword_Casing (S : SFI; C : Casing_Type);
procedure Set_Identifier_Casing (S : SFI; C : Casing_Type);
procedure Set_License (S : SFI; L : License_Type);
procedure Set_Unit (S : SFI; U : Unit_Number_Type);

function Last_Source_File return Source_File_Index;
-- Index of last source file table entry

function Num_Source_Files return Nat;
-- Number of source file table entries

procedure Initialize;
-- Initialize internal tables

-- Lock internal tables

-- Unlock internal tables

Main_Source_File : Source_File_Index := No_Source_File;
-- This is set to the source file index of the main unit

-----------------------
-- Checksum Handling --
-----------------------

-- As a source file is scanned, a checksum is computed by taking all the
-- non-blank characters in the file, excluding comment characters, the
-- minus-minus sequence starting a comment, and all control characters

-- The checksum algorithm used is the standard CRC-32 algorithm, as
-- implemented by System.CRC32, except that we do not bother with the
-- final XOR with all 1 bits.

-- This algorithm ensures that the checksum includes all semantically
-- significant aspects of the program represented by the source file,
-- but is insensitive to layout, presence or contents of comments, wide
-- character representation method, or casing conventions outside strings.

-- Scans.Checksum is initialized appropriately at the start of scanning
-- a file, and copied into the Source_Checksum field of the file table
-- entry when the end of file is encountered.
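
-- Illustrative sketch (not the GNAT implementation): a bitwise CRC-32
-- that omits the final XOR and feeds in only graphic ASCII characters,
-- to show why such a checksum is insensitive to layout. Comment
-- stripping, case folding and wide character handling are deliberately
-- left out, the character filter is an assumption made for brevity, and
-- the names Checksum_Demo and Layout_Insensitive_CRC are hypothetical.
--
--    with Ada.Text_IO; use Ada.Text_IO;
--    with Interfaces;  use Interfaces;
--
--    procedure Checksum_Demo is
--       function Layout_Insensitive_CRC
--         (Text : String) return Unsigned_32
--       is
--          Poly : constant Unsigned_32 := 16#EDB8_8320#;
--          CRC  : Unsigned_32 := 16#FFFF_FFFF#;
--       begin
--          for C of Text loop
--             --  Assumption: blanks and control characters are skipped
--             if C in '!' .. '~' then
--                CRC := CRC xor Unsigned_32 (Character'Pos (C));
--                for Bit in 1 .. 8 loop
--                   if (CRC and 1) /= 0 then
--                      CRC := Shift_Right (CRC, 1) xor Poly;
--                   else
--                      CRC := Shift_Right (CRC, 1);
--                   end if;
--                end loop;
--             end if;
--          end loop;
--          return CRC;  --  no final XOR with all 1 bits
--       end Layout_Insensitive_CRC;
--    begin
--       --  Both texts contain the same non-blank characters, so the
--       --  two checksums are equal
--       Put_Line
--         (Boolean'Image
--            (Layout_Insensitive_CRC ("X := 1;")
--               = Layout_Insensitive_CRC ("X:=1;" & ASCII.LF)));
--    end Checksum_Demo;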

-------------------------------------
-- Handling Generic Instantiations --
-------------------------------------

-- As described in Sem_Ch12, a generic instantiation involves making a
-- copy of the tree of the generic template. The source locations in
-- this tree directly reference the source of the template. However, it
-- is also possible to find the location of the instantiation.

-- This is achieved as follows. When an instantiation occurs, a new entry
-- is made in the source file table. The Source_Text of the instantiation
-- points to the same Source_Buffer as the Source_Text of the template, but
-- with different bounds. The separate range of Sloc values avoids
-- confusion, and means that the Sloc values can still be used to uniquely
-- identify the source file table entry. See Set_Dope below for the
-- low-level trickery that allows two different pointers to point at the
-- same array, but with different bounds.

-- The Instance field of this source file table entry, set to
-- No_Instance_Id for normal entries, instead contains a value that
-- uniquely identifies a particular instantiation, and the associated
-- entry in the Instances table. The source location of the instantiation
-- can be retrieved using function Instantiation below. In the case of
-- nested instantiations, the Instances table can be used to trace the
-- complete chain of nested instantiations.

-- Two routines are used to build the special instance entries in the
-- source file table. Create_Instantiation_Source is first called to build
-- the virtual source table entry for the instantiation, and then the
-- Sloc values in the copy are adjusted using Adjust_Instantiation_Sloc.
-- See child unit Sinput.L for details on these two routines.

generic
with procedure Process (Id : Instance_Id; Inst_Sloc : Source_Ptr);
procedure Iterate_On_Instances;
-- Execute Process for each entry in the instance table

function Instantiation (S : SFI) return Source_Ptr;
-- For a source file entry that represents an inlined body, source location
-- of the inlined call. For a source file entry that represents an
-- inherited pragma, source location of the declaration to which the
-- overriding subprogram for the inherited pragma is attached. Otherwise,
-- for a source file entry that represents a generic instantiation, source
-- location of the instantiation. Returns No_Location in all other cases.

Current_Source_File : Source_File_Index := No_Source_File;
-- Source_File table index of source file currently being scanned.
-- Initialized so that some tools (such as gprbuild) can be built with
-- -gnatVa and pragma Initialize_Scalars without problems.

Current_Source_Unit : Unit_Number_Type;
-- Unit number of source file currently being scanned. The special value
-- of No_Unit indicates that the configuration pragma file is currently
-- being scanned (this has no entry in the unit table).

Source_gnat_adc : Source_File_Index := No_Source_File;
-- This is set if a gnat.adc file is present to reference this file

Source : Source_Buffer_Ptr;
-- Current source (copy of the Source_Text field of the source file table
-- entry for Current_Source_File)

-----------------------------------------
-- Handling of Source Line Terminators --
-----------------------------------------

-- In this section we discuss in detail the issue of terminators used to
-- terminate source lines. The RM says that one or more format effectors
-- (other than horizontal tab) end a source line, and defines the set of
-- such format effectors, but does not talk about exactly how they are
-- represented in the source program (since in general the RM is not in
-- the business of specifying source program formats).

-- The type Types.Line_Terminator is defined as a subtype of Character
-- that includes CR/LF/VT/FF. The most common line enders in practice
-- are CR (some Mac systems), LF (Unix systems), and CR/LF (DOS/Windows
-- systems). Any of these sequences is recognized as ending a physical
-- source line, and if multiple such terminators appear (e.g. LF/LF),
-- then we consider we have an extra blank line.

-- VT and FF are recognized as terminating source lines, but they are
-- considered to end a logical line instead of a physical line, so that
-- the line numbering ignores such terminators. The use of VT and FF is
-- mandated by the standard, and correctly handled in a conforming manner
-- by GNAT, but their use is not recommended.

-- In addition to the set of characters defined by the type in Types, in
-- wide character encoding mode the codes for which a call to
-- System.UTF_32.Is_UTF_32_Line_Terminator returns True are also
-- recognized as ending a source line. This includes the standard codes
-- defined above in addition to NEL (NEXT LINE), LINE SEPARATOR and
-- PARAGRAPH SEPARATOR. Again, as in the case of VT and FF, the standard
-- requires we recognize these as line terminators, but we consider them
-- to be logical line terminators. The only physical line terminators
-- recognized are the standard ones (CR, LF, or CR/LF).

-- However, we do not recognize the NEL (16#85#) character as having the
-- significance of an end of line character when operating in normal 8-bit
-- Latin-n input mode for the compiler. Instead the rule in this mode is
-- that all upper half control codes (16#80# .. 16#9F#) are illegal if they
-- occur in program text, and are ignored if they appear in comments.

-- First, note that this behavior is fully conforming with the standard.
-- The standard has nothing whatever to say about source representation
-- and implementations are completely free to make their own rules. In
-- this case, in 8-bit mode, GNAT decides that the 16#0085# character is
-- not a representation of the NEL character, even though it looks like it.
-- If you have NELs in your program, which you expect to be treated as
-- end of line characters, you must use a wide character encoding such as
-- UTF-8 for this code to be recognized.

-- Second, an explanation of why we make this slightly surprising choice.
-- We have never encountered anyone actually using the NEL character to
-- end lines. One user raised the issue as a result of some experiments,
-- but no one has ever submitted a program encoded this way, in any of
-- the possible encodings. It seems that even when using wide character
-- codes extensively, the normal approach is to use standard line enders
-- (LF or CR/LF). So the failure to recognize NEL in this mode seems to
-- have no practical downside.

-- Moreover, what we have seen in a significant number of programs from
-- multiple sources is the practice of writing all program text in lower
-- half (ASCII) form, but using UTF-8 encoded wide characters freely in
-- comments, where the comments are terminated by normal line endings
-- (LF or CR/LF). The comments do not contain NEL codes, but they can and
-- do contain other UTF-8 encoding sequences where one of the bytes is the
-- NEL code. Now such programs can of course be compiled in UTF-8 mode,
-- but in practice they also compile fine in standard 8-bit mode without
-- specifying a character encoding. Since this is common practice, it would
-- be a significant upwards incompatibility to recognize NEL in 8-bit mode.

procedure Backup_Line (P : in out Source_Ptr);
-- Back up the argument pointer to the start of the previous line. On
-- entry, P points to the start of a physical line in the source buffer.
-- On return, P is updated to point to the start of the previous line.
-- The caller has checked that a Line_Terminator character precedes P so
-- that there definitely is a previous line in the source buffer.

procedure Build_Location_String
  (Buf : in out Bounded_String;
   Loc : Source_Ptr);
-- This procedure builds a string literal of the form "name:line", where
-- name is the file name corresponding to Loc, and line is the line number.
-- If instantiations are involved, additional suffixes of the same form are
-- appended after the separating string " instantiated at ". The resulting
-- string is appended to Buf.

function Build_Location_String (Loc : Source_Ptr) return String;
-- Functional form returning a String

procedure Check_For_BOM;
-- Check if the current source starts with a BOM. Scan_Ptr needs to be at
-- the start of the current source. If the current source starts with a
-- recognized BOM, then some flags such as Wide_Character_Encoding_Method
-- are set accordingly, and the Scan_Ptr on return points past this BOM.
-- An error message is output and Unrecoverable_Error raised if an
-- unrecognized BOM is detected. The call has no effect if no BOM is found.

function Get_Column_Number (P : Source_Ptr) return Column_Number;
-- The ones-origin column number of the specified Source_Ptr value is
-- determined and returned. Tab characters if present are assumed to
-- represent the standard 1,9,17.. spacing pattern.
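
-- Illustrative sketch (not the GNAT implementation) of the 1,9,17..
-- tab rule described above. The names Column_Demo and Column_Of are
-- hypothetical; Line is assumed to hold one physical line without its
-- terminator, and Index a position within it.
--
--    with Ada.Text_IO; use Ada.Text_IO;
--
--    procedure Column_Demo is
--       function Column_Of
--         (Line : String; Index : Positive) return Positive
--       is
--          Col : Positive := 1;
--       begin
--          for I in Line'First .. Index - 1 loop
--             if Line (I) = ASCII.HT then
--                --  Advance to the next tab stop (1, 9, 17, ...)
--                Col := ((Col - 1) / 8 + 1) * 8 + 1;
--             else
--                Col := Col + 1;
--             end if;
--          end loop;
--          return Col;
--       end Column_Of;
--
--       Line : constant String := "ab" & ASCII.HT & "c";
--    begin
--       --  'c' follows a tab that starts in column 3, so it is reported
--       --  in column 9
--       Put_Line (Positive'Image (Column_Of (Line, 4)));
--    end Column_Demo;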

function Get_Logical_Line_Number
  (P : Source_Ptr) return Logical_Line_Number;
-- The line number of the specified source position is obtained by
-- doing a binary search on the source positions in the lines table
-- for the unit containing the given source position. The returned
-- value is the logical line number, already adjusted for the effect
-- of source reference pragmas. If P refers to the line of a source
-- reference pragma itself, then No_Line_Number is returned. If no source
-- reference pragmas have been encountered, the value returned is
-- the same as the physical line number.

function Get_Logical_Line_Number_Img
  (P : Source_Ptr) return String;
-- Same as above function, but returns the line number as a string of
-- decimal digits, with no leading space. Destroys Name_Buffer.

function Get_Physical_Line_Number
  (P : Source_Ptr) return Physical_Line_Number;
-- The line number of the specified source position is obtained by
-- doing a binary search on the source positions in the lines table
-- for the unit containing the given source position. The returned
-- value is the physical line number in the source being compiled.
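
-- Illustrative sketch (not the GNAT implementation) of the binary search
-- mentioned above. The lines table is modeled as an ascending array of
-- line start positions; Line_Demo, Line_Starts and Line_Of are
-- hypothetical names. Starts is assumed to be non-empty with
-- Starts (Starts'First) <= P.
--
--    with Ada.Text_IO; use Ada.Text_IO;
--
--    procedure Line_Demo is
--       type Line_Starts is array (Positive range <>) of Natural;
--
--       function Line_Of
--         (Starts : Line_Starts; P : Natural) return Positive
--       is
--          Lo  : Positive := Starts'First;
--          Hi  : Positive := Starts'Last;
--          Mid : Positive;
--       begin
--          --  Find the last entry whose start position is <= P
--          while Lo < Hi loop
--             Mid := Lo + (Hi - Lo + 1) / 2;
--             if Starts (Mid) <= P then
--                Lo := Mid;
--             else
--                Hi := Mid - 1;
--             end if;
--          end loop;
--          return Lo;
--       end Line_Of;
--
--       Starts : constant Line_Starts := (0, 10, 25, 40);
--    begin
--       --  Position 30 lies in the line starting at 25, i.e. line 3
--       Put_Line (Positive'Image (Line_Of (Starts, 30)));
--    end Line_Demo;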

function Get_Source_File_Index (S : Source_Ptr) return Source_File_Index;
pragma Inline (Get_Source_File_Index);
-- Return file table index of file identified by given source pointer
-- value. This call must always succeed, since any valid source pointer
-- value belongs to some previously loaded source file.
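
-- Illustrative sketch (not the GNAT implementation): because every table
-- entry covers a distinct, nonoverlapping range of source positions, the
-- owning entry can be found from the position alone. Index_Demo,
-- File_Range, Ranges and Index_Of are hypothetical names, and a linear
-- scan is used purely for brevity.
--
--    with Ada.Text_IO; use Ada.Text_IO;
--
--    procedure Index_Demo is
--       type File_Range is record
--          First, Last : Natural;
--       end record;
--
--       Ranges : constant array (1 .. 3) of File_Range :=
--         ((0, 99), (100, 249), (250, 300));
--
--       function Index_Of (P : Natural) return Natural is
--       begin
--          for I in Ranges'Range loop
--             if P in Ranges (I).First .. Ranges (I).Last then
--                return I;
--             end if;
--          end loop;
--          return 0;  --  no entry contains P
--       end Index_Of;
--    begin
--       --  Position 120 falls in the second range
--       Put_Line (Natural'Image (Index_Of (120)));
--    end Index_Demo;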

function Instantiation_Depth (S : Source_Ptr) return Nat;
-- Determine instantiation depth for given Sloc value. A value of
-- zero means that the given Sloc is not in an instantiation.

function Line_Start (P : Source_Ptr) return Source_Ptr;
-- Finds the source position of the start of the line containing the
-- given source location.

function Line_Start
  (L : Physical_Line_Number;
   S : Source_File_Index) return Source_Ptr;
-- Finds the source position of the start of the given line in the
-- given source file, using a physical line number to identify the line.

function Num_Source_Lines (S : Source_File_Index) return Nat;
-- Returns the number of source lines (this is equivalent to reading
-- the value of Last_Source_Line, but returns Nat rather than a
-- physical line number).

procedure Register_Source_Ref_Pragma
  (File_Name : File_Name_Type;
   Stripped_File_Name : File_Name_Type;
   Line_After_Pragma : Physical_Line_Number);
-- Register a source reference pragma. The parameter File_Name is the
-- file name from the pragma, and Stripped_File_Name is this name with
-- the directory information stripped. Both these parameters are set
-- to No_Name if no file name parameter was given in the pragma
-- (which can only happen for the second and subsequent pragmas).
-- Mapped_Line is the line number parameter from the pragma, and
-- Line_After_Pragma is the physical line number of the line that
-- follows the line containing the Source_Reference pragma.

function Original_Location (S : Source_Ptr) return Source_Ptr;
-- Given a source pointer S, returns the corresponding source pointer
-- value ignoring instantiation copies. For locations that do not
-- correspond to instantiation copies of templates, the argument is
-- returned unchanged. For locations that do correspond to copies of
-- templates from instantiations, the location within the original
-- template is returned. This is useful in canonicalizing locations.

function Instantiation_Location (S : Source_Ptr) return Source_Ptr;
pragma Inline (Instantiation_Location);
-- Given a source pointer S, returns the corresponding source pointer
-- value of the instantiation if this location is within an instance.
-- If S is not within an instance, then this returns No_Location.

function Comes_From_Inlined_Body (S : Source_Ptr) return Boolean;
pragma Inline (Comes_From_Inlined_Body);
-- Given a source pointer S, returns whether it comes from an inlined body.
-- This allows distinguishing these source pointers from those that come
-- from instantiation of generics, since Instantiation_Location returns a
-- valid location in both cases.

function Comes_From_Inherited_Pragma (S : Source_Ptr) return Boolean;
pragma Inline (Comes_From_Inherited_Pragma);
-- Given a source pointer S, returns whether it comes from an inherited
-- pragma. This allows distinguishing these source pointers from those
-- that come from instantiation of generics, since Instantiation_Location
-- returns a valid location in both cases.

function Top_Level_Location (S : Source_Ptr) return Source_Ptr;
-- Given a source pointer S, returns the argument unchanged if it is
-- not in an instantiation. If S is in an instantiation, then it returns
-- the location of the top level instantiation, i.e. the outer level
-- instantiation in the nested case.

function Physical_To_Logical
  (Line : Physical_Line_Number;
   S : Source_File_Index) return Logical_Line_Number;
-- Given a physical line number in source file whose source index is S,
-- return the corresponding logical line number. If the physical line
-- number is one containing a Source_Reference pragma, the result will
-- be No_Line_Number.

   procedure Skip_Line_Terminators
     (P        : in out Source_Ptr;
      Physical : out Boolean);
   --  On entry, P points to a line terminator that has been encountered,
   --  which is one of FF,LF,VT,CR or a wide character sequence whose value is
   --  in category Separator,Line or Separator,Paragraph. P points just past
   --  the character that was scanned. The purpose of this routine is to
   --  distinguish physical and logical line endings. A physical line ending
   --  is one of:
   --
   --     CR on its own (MAC System 7)
   --     LF on its own (Unix and unix-like systems)
   --     CR/LF (DOS, Windows)
   --     Wide character in Separator,Line or Separator,Paragraph category
   --
   --     Note: we no longer recognize LF/CR (which we did in some earlier
   --     versions of GNAT). The reason for this is that this sequence is not
   --     used and recognizing it generated confusion. For example, given the
   --     sequence LF/CR/LF we were interpreting that as (LF/CR) ending the
   --     first line and a blank line ending with CR following, but it is
   --     clearly better to interpret this as LF, with a blank line terminated
   --     by CR/LF, given that LF and CR/LF are both in common use, but no
   --     system we know of uses LF/CR.
   --
   --  A logical line ending (that is not a physical line ending) is one of:
   --
   --     VT on its own
   --     FF on its own
   --
   --  On return, P is bumped past the line ending sequence (one of the above
   --  seven possibilities). Physical is set to True to indicate that a
   --  physical end of line was encountered, in which case this routine also
   --  makes sure that the lines table for the current source file has an
   --  appropriate entry for the start of the new physical line.
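   --
   --  Illustrative sketch (derived only from the rules stated above; the
   --  input sequences are hypothetical examples and the table is not taken
   --  from the GNAT sources). It shows how a few terminator sequences would
   --  be classified, one row per call sequence:
   --
   --     Input       Endings recognized            Physical
   --     LF          LF                            True
   --     CR LF       CR/LF (a single ending)       True
   --     LF CR LF    LF, then CR/LF (blank line)   True, then True
   --     VT          VT (logical only)             False
   --     FF          FF (logical only)             False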

   procedure Sloc_Range (N : Node_Id; Min, Max : out Source_Ptr);
   --  Given a node, returns the minimum and maximum source locations of any
   --  node in the syntactic subtree for the node. This is not quite the same
   --  as the locations of the first and last token in the node construct
   --  because parentheses at the outer level do not have a recorded Sloc.
   --
   --  Note: At each step of the tree traversal, we make sure to go back to
   --  the Original_Node, since this procedure is concerned with original
   --  (source) locations.
   --
   --  Note: if the tree for the expression contains no "real" Sloc values,
   --  i.e. values > No_Location, then both Min and Max are set to
   --  Sloc (Original_Node (N)).

   function Source_Offset (S : Source_Ptr) return Nat;
   --  Returns the zero-origin offset of the given source location from the
   --  start of its corresponding unit. This is used for creating canonical
   --  names in some situations.

   procedure Write_Location (P : Source_Ptr);
   --  Writes out a string of the form fff:nn:cc, where fff, nn, cc are the
   --  file name, line number and column corresponding to the given source
   --  location. No_Location and Standard_Location appear as the strings
   --  <no location> and <standard location>. If the location is within an
   --  instantiation, then the instance location is appended, enclosed in
   --  square brackets (which can nest if necessary). Note that this routine
   --  is used only for internal compiler debugging output purposes (which
   --  is why the somewhat cryptic use of brackets is acceptable).
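   --
   --  Illustrative sketch of the output shape (the file names and numbers are
   --  hypothetical, and the exact spacing around the brackets is an
   --  assumption, not confirmed by this spec):
   --
   --     foo.adb:12:5                  location in ordinary source
   --     gen.adb:7:3 [foo.adb:30:4]    location inside an instance, with the
   --                                   instance location in brackets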

   procedure wl (P : Source_Ptr);
   pragma Export (Ada, wl);
   --  Equivalent to Write_Location (P); Write_Eol; for calls from GDB

   procedure Write_Time_Stamp (S : Source_File_Index);
   --  Writes time stamp of specified file in YY-MM-DD HH:MM.SS format

   procedure Tree_Read;
   --  Initializes internal tables from current tree file using the relevant
   --  Table.Tree_Read routines.

   procedure Tree_Write;
   --  Writes out internal tables to current tree file using the relevant
   --  Table.Tree_Write routines.

   procedure Clear_Source_File_Table;
   --  This procedure frees memory allocated in the Source_File table (in the
   --  private part). It should only be used when it is guaranteed that all
   --  source files that have been loaded so far will not be accessed before
   --  being reloaded. It is intended for tools that parse sources several
   --  times, to avoid memory leaks.

   pragma Inline (File_Name);
   pragma Inline (Full_File_Name);
   pragma Inline (File_Type);
   pragma Inline (Reference_Name);
   pragma Inline (Full_Ref_Name);
   pragma Inline (Debug_Source_Name);
   pragma Inline (Full_Debug_Name);
   pragma Inline (Instance);
   pragma Inline (License);
   pragma Inline (Num_SRef_Pragmas);
   pragma Inline (First_Mapped_Line);
   pragma Inline (Source_Text);
   pragma Inline (Source_First);
   pragma Inline (Source_Last);
   pragma Inline (Time_Stamp);
   pragma Inline (Source_Checksum);
   pragma Inline (Last_Source_Line);
   pragma Inline (Keyword_Casing);
   pragma Inline (Identifier_Casing);
   pragma Inline (Inlined_Call);
   pragma Inline (Inlined_Body);
   pragma Inline (Inherited_Pragma);
   pragma Inline (Template);
   pragma Inline (Unit);

   pragma Inline (Set_Keyword_Casing);
   pragma Inline (Set_Identifier_Casing);

   pragma Inline (Last_Source_File);
   pragma Inline (Num_Source_Files);
   pragma Inline (Num_Source_Lines);

   pragma Inline (Line_Start);

   No_Instance_Id : constant Instance_Id := 0;

   -------------------------
   -- Source_Lines Tables --
   -------------------------

   type Lines_Table_Type is
     array (Physical_Line_Number) of Source_Ptr;
   --  Type used for lines table. The entries are indexed by physical line
   --  numbers. The values are the Source_Ptr values for the start of the
   --  corresponding physical line. Note that we make this a bogus big
   --  array, sized as required, so that we avoid the use of fat pointers.

   type Lines_Table_Ptr is access all Lines_Table_Type;
   --  Type used for pointers to line tables

   type Logical_Lines_Table_Type is
     array (Physical_Line_Number) of Logical_Line_Number;
   --  Type used for logical lines table. This table is used if a source
   --  reference pragma is present. It is indexed by physical line numbers,
   --  and contains the corresponding logical line numbers. An entry that
   --  corresponds to a source reference pragma is set to No_Line_Number.
   --  Note that we make this a bogus big array, sized as required, so that
   --  we avoid the use of fat pointers.

   type Logical_Lines_Table_Ptr is access all Logical_Lines_Table_Type;
   --  Type used for pointers to logical line tables

   -----------------------
   -- Source_File Table --
   -----------------------

   --  See earlier descriptions for meanings of public fields

   type Source_File_Record is record
      File_Name         : File_Name_Type;
      Reference_Name    : File_Name_Type;
      Debug_Source_Name : File_Name_Type;
      Full_Debug_Name   : File_Name_Type;
      Full_File_Name    : File_Name_Type;
      Full_Ref_Name     : File_Name_Type;
      Instance          : Instance_Id;
      Num_SRef_Pragmas  : Nat;
      First_Mapped_Line : Logical_Line_Number;
      Source_Text       : Source_Buffer_Ptr;
      Source_First      : Source_Ptr;
      Source_Last       : Source_Ptr;
      Source_Checksum   : Word;
      Last_Source_Line  : Physical_Line_Number;
      Template          : Source_File_Index;
      Unit              : Unit_Number_Type;
      Time_Stamp        : Time_Stamp_Type;
      File_Type         : Type_Of_File;
      Inlined_Call      : Source_Ptr;
      Inlined_Body      : Boolean;
      Inherited_Pragma  : Boolean;
      License           : License_Type;
      Keyword_Casing    : Casing_Type;
      Identifier_Casing : Casing_Type;

      --  The following fields are for internal use only (i.e. only in the
      --  body of Sinput or its children, with no direct access by clients).

      Sloc_Adjust : Source_Ptr;
      --  A value to be added to Sloc values for this file to reference the
      --  corresponding lines table. This is zero for the non-instantiation
      --  case, and set so that the addition references the ultimate template
      --  for the instantiation case. See Sinput-L for further details.

      Lines_Table : Lines_Table_Ptr;
      --  Pointer to lines table for this source. Updated as additional
      --  lines are accessed using the Skip_Line_Terminators procedure.
      --  Note: the lines table for an instantiation entry refers to the
      --  original line numbers of the template; see Sinput-L for details.

      Logical_Lines_Table : Logical_Lines_Table_Ptr;
      --  Pointer to logical lines table for this source. Non-null only if
      --  a source reference pragma has been processed. Updated as lines
      --  are accessed using the Skip_Line_Terminators procedure.

      Lines_Table_Max : Physical_Line_Number;
      --  Maximum subscript values for currently allocated Lines_Table
      --  and (if present) the allocated Logical_Lines_Table. The value
      --  Max_Source_Line gives the maximum used value, whereas this gives
      --  the maximum allocated value.

      Index : Source_File_Index := 123456789; -- for debugging
   end record;

   --  The following representation clause ensures that the above record
   --  has no holes. We do this so that when instances of this record are
   --  written by Tree_Gen, we do not write uninitialized values to the file.

   AS : constant Pos := Standard'Address_Size;

   for Source_File_Record use record
      File_Name           at  0 range 0 .. 31;
      Reference_Name      at  4 range 0 .. 31;
      Debug_Source_Name   at  8 range 0 .. 31;
      Full_Debug_Name     at 12 range 0 .. 31;
      Full_File_Name      at 16 range 0 .. 31;
      Full_Ref_Name       at 20 range 0 .. 31;
      Instance            at 48 range 0 .. 31;
      Num_SRef_Pragmas    at 24 range 0 .. 31;
      First_Mapped_Line   at 28 range 0 .. 31;
      Source_First        at 32 range 0 .. 31;
      Source_Last         at 36 range 0 .. 31;
      Source_Checksum     at 40 range 0 .. 31;
      Last_Source_Line    at 44 range 0 .. 31;
      Template            at 52 range 0 .. 31;
      Unit                at 56 range 0 .. 31;
      Time_Stamp          at 60 range 0 .. 8 * Time_Stamp_Length - 1;
      File_Type           at 74 range 0 .. 7;
      Inlined_Call        at 88 range 0 .. 31;
      Inlined_Body        at 75 range 0 .. 0;
      Inherited_Pragma    at 75 range 1 .. 1;
      License             at 76 range 0 .. 7;
      Keyword_Casing      at 77 range 0 .. 7;
      Identifier_Casing   at 78 range 0 .. 15;
      Sloc_Adjust         at 80 range 0 .. 31;
      Lines_Table_Max     at 84 range 0 .. 31;
      Index               at 92 range 0 .. 31;

      --  The following fields are pointers, so we have to specialize their
      --  lengths using pointer size, obtained above as Standard'Address_Size.
      --  Note that Source_Text is a fat pointer, so it has size = AS*2.

      Source_Text         at 96 range 0      .. AS * 2 - 1;
      Lines_Table         at 96 range AS * 2 .. AS * 3 - 1;
      Logical_Lines_Table at 96 range AS * 3 .. AS * 4 - 1;
   end record; -- Source_File_Record

   for Source_File_Record'Size use 96 * 8 + AS * 4;
   --  This ensures that we did not leave out any fields
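   --
   --  Worked check of the size clause (illustration only; the two values of
   --  AS used here are assumed examples, not a list of supported targets):
   --
   --     AS = 32:  96 * 8 + 32 * 4 = 768 + 128 =  896 bits = 112 bytes
   --     AS = 64:  96 * 8 + 64 * 4 = 768 + 256 = 1024 bits = 128 bytes
   --
   --  In both cases the fixed-size components fill bytes 0 .. 95, and the
   --  three pointer components starting at byte 96 occupy AS * 4 bits in
   --  total: AS * 2 for the fat pointer Source_Text, plus AS each for
   --  Lines_Table and Logical_Lines_Table.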

   package Source_File is new Table.Table
     (Table_Component_Type => Source_File_Record,
      Table_Index_Type     => Source_File_Index,
      Table_Low_Bound      => 1,
      Table_Initial        => Alloc.Source_File_Initial,
      Table_Increment      => Alloc.Source_File_Increment,
      Table_Name           => "Source_File");

   --  Auxiliary table containing source location of instantiations. Index 0
   --  is used for code that does not come from an instance.

   package Instances is new Table.Table
     (Table_Component_Type => Source_Ptr,
      Table_Index_Type     => Instance_Id,
      Table_Low_Bound      => 0,
      Table_Initial        => Alloc.Source_File_Initial,
      Table_Increment      => Alloc.Source_File_Increment,
      Table_Name           => "Instances");

   procedure Alloc_Line_Tables
     (S : in out Source_File_Record;
   --  Allocate or reallocate the lines table for the given source file so
   --  that it can accommodate at least New_Max lines. Also allocates or
   --  reallocates logical lines table if source ref pragmas are present.

   procedure Add_Line_Tables_Entry
     (S : in out Source_File_Record;
   --  Increment line table size by one (reallocating the lines table if
   --  needed) and set the new entry to contain the value P. Also bumps
   --  the Source_Line_Count field. If source reference pragmas are
   --  present, also increments logical lines table size by one, and

   procedure Trim_Lines_Table (S : Source_File_Index);
   --  Set lines table size for entry S in the source file table to
   --  correspond to the current value of Num_Source_Lines, releasing
   --  any unused storage. This is used by Sinput.L and Sinput.D.

   procedure Set_Source_File_Index_Table (Xnew : Source_File_Index);
   --  Sets entries in the Source_File_Index_Table for the newly created
   --  Source_File table entry whose index is Xnew. The Source_First and
   --  Source_Last fields of this entry must be set before the call.
   --  See package body for details.

   type Dope_Rec is record
      First, Last : Source_Ptr'Base;
   end record;

   Dope_Rec_Size : constant := 2 * Source_Ptr'Base'Size;
   for Dope_Rec'Size use Dope_Rec_Size;
   for Dope_Rec'Alignment use Dope_Rec_Size / 8;
   type Dope_Ptr is access all Dope_Rec;
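   --  Worked example for the Dope_Rec clauses above (illustration only). It
   --  assumes that Source_Ptr'Base'Size is 32, which is consistent with the
   --  32-bit ranges given to Source_Ptr components in the representation
   --  clause for Source_File_Record, and that a storage unit is 8 bits:
   --
   --     Dope_Rec_Size      = 2 * 32 = 64 bits
   --     Dope_Rec'Alignment = 64 / 8 =  8 storage units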
     (Src : System.Address; New_Dope : Dope_Ptr);
   --  Src is the address of a variable of type Source_Buffer_Ptr, which is a
   --  fat pointer. This sets the dope part of the fat pointer to point to the
   --  specified New_Dope. This low-level processing is used to make the
   --  Source_Text of an instance point to the same text as the template, but
   --  with different bounds.

   procedure Free_Dope (Src : System.Address);
   --  Calls Unchecked_Deallocation on the dope part of the fat pointer Src

   procedure Free_Source_Buffer (Src : in out Source_Buffer_Ptr);
   --  Deallocates the source buffer