------------------------------------------------------------------------------
--                                                                          --
--                         GNAT COMPILER COMPONENTS                         --
--                                                                          --
--                             G N A T . A W K                              --
--                                                                          --
--                                 S p e c                                  --
--                                                                          --
--                     Copyright (C) 2000-2011, AdaCore                     --
--                                                                          --
-- GNAT is free software; you can redistribute it and/or modify it under    --
-- terms of the GNU General Public License as published by the Free Soft-   --
-- ware Foundation; either version 3, or (at your option) any later ver-    --
-- sion. GNAT is distributed in the hope that it will be useful, but WITH-  --
-- OUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY   --
-- or FITNESS FOR A PARTICULAR PURPOSE.                                     --
--                                                                          --
-- As a special exception under Section 7 of GPL version 3, you are granted --
-- additional permissions described in the GCC Runtime Library Exception,   --
-- version 3.1, as published by the Free Software Foundation.               --
--                                                                          --
-- You should have received a copy of the GNU General Public License and    --
-- a copy of the GCC Runtime Library Exception along with this program;     --
-- see the files COPYING3 and COPYING.RUNTIME respectively. If not, see     --
-- <http://www.gnu.org/licenses/>.                                          --
--                                                                          --
-- GNAT was originally developed by the GNAT team at New York University.   --
-- Extensive contributions were provided by Ada Core Technologies Inc.      --
--                                                                          --
------------------------------------------------------------------------------

--  This is an AWK-like unit. It provides an easy interface for parsing one
--  or more files containing formatted data. The file can be viewed as a
--  database where each record is a line and a field is a data element in
--  this line. In this implementation an AWK record is a line. This means
--  that a record cannot span multiple lines. The operating procedure is to
--  read files line by line, with each line being presented to the user of
--  the package. The interface provides services to access specific fields
--  in the line. Thus it is possible to control actions taken on a line based
--  on values of some fields. This can be achieved directly or by registering
--  callbacks triggered on programmed conditions.

--  The state of an AWK run is recorded in an object of type Session_Type.
--  The following is the procedure for using a session to control an
--  AWK run:

--     1) Specify which session is to be used. It is possible to use the
--        default session or to create a new one by declaring an object of
--        type Session_Type. For example:

--           Computers : Session_Type;

--     2) Specify how to cut a line into fields. There are two modes: using
--        character field separators or column widths. This is done by using
--        Set_Field_Separators or Set_Field_Widths. For example:

--           AWK.Set_Field_Separators (";,", Computers);

--        or by using the iterators' Separators parameter.

--     3) Specify which files to parse. This is done with Add_File/Add_Files
--        services, or by using the iterators' Filename parameter. For
--        example:

--           AWK.Add_File ("myfile.db", Computers);

--     4) Run the AWK session using one of the provided iterators.

--           Parse
--              This is the most automated iterator. You can gain control over
--              the session only by registering one or more callbacks (see
--              Register).

--           Get_Line/End_Of_Data
--              This is a manual iterator to be used with a loop. You have
--              complete control over the session. You can use callbacks but
--              this is not required.

--           For_Every_Line
--              This provides a mixture of manual/automated iterator action.

--  Examples of these three approaches appear below.

--  There are many ways to use this package. The following discussion shows
--  three approaches to using this package, using the three iterator forms.
--  All examples will use the following file (computer.db):

--     Pluton;Windows-NT;Pentium III
--     Mars;Linux;Pentium Pro
--     Venus;Solaris;Sparc
--     Saturn;OS/2;i486
--     Jupiter;MacOS;PPC

--  1) Using Parse iterator

--     Here the first step is to register some action associated with a
--     pattern and then to call the Parse iterator (this is the simplest way
--     to use this unit). The default session is used here. For example, to
--     output the second field (the OS) of computer "Saturn":

--        procedure Action is
--        begin
--           Put_Line (AWK.Field (2));
--        end Action;

--     begin
--        AWK.Register (1, "Saturn", Action'Access);
--        AWK.Parse (";", "computer.db");


--  2) Using the Get_Line/End_Of_Data iterator

--     Here you have full control. For example, to do the same as
--     above but using a specific session, you could write:

--        Computer_File : Session_Type;

--     begin
--        AWK.Set_Current (Computer_File);
--        AWK.Open (Separators => ";",
--                  Filename   => "computer.db");

--        --  Display Saturn OS

--        while not AWK.End_Of_Data loop
--           AWK.Get_Line;

--           if AWK.Field (1) = "Saturn" then
--              Put_Line (AWK.Field (2));
--           end if;
--        end loop;

--        AWK.Close (Computer_File);


--  3) Using For_Every_Line iterator

--     In this case you use a provided iterator and you pass the procedure
--     that must be called for each record. The previous example could be
--     coded as follows (using the iterator quick interface but without using
--     the current session):

--        Computer_File : Session_Type;

--        procedure Action (Quit : in out Boolean) is
--        begin
--           if AWK.Field (1, Computer_File) = "Saturn" then
--              Put_Line (AWK.Field (2, Computer_File));
--           end if;
--        end Action;

--        procedure Look_For_Saturn is
--           new AWK.For_Every_Line (Action);

--     begin
--        Look_For_Saturn (Separators => ";",
--                         Filename   => "computer.db",
--                         Session    => Computer_File);

--        Integer_Text_IO.Put
--          (Integer (AWK.NR (Session => Computer_File)));
--        Put_Line (" line(s) have been processed.");

--  You can also use a regular expression for the pattern. Let us output
--  the computer name for all computers for which the OS has a character
--  O in its name.

--     Regexp  : String := ".*O.*";

--     Matcher : Regpat.Pattern_Matcher := Regpat.Compile (Regexp);

--     procedure Action is
--     begin
--        Text_IO.Put_Line (AWK.Field (1));
--     end Action;

--  begin
--     AWK.Register (2, Matcher, Action'Unrestricted_Access);
--     AWK.Parse (";", "computer.db");


with Ada.Finalization;
with GNAT.Regpat;

package GNAT.AWK is

   Session_Error : exception;
   --  Raised when a Session is reused but is not closed

   File_Error : exception;
   --  Raised when there is a file problem (see below)

   End_Error : exception;
   --  Raised when an attempt is made to read beyond the end of the last
   --  file of a session.

   Field_Error : exception;
   --  Raised when accessing a field value which does not exist

   Data_Error : exception;
   --  Raised when it is impossible to convert a field value to a specific type

   type Count is new Natural;

   type Widths_Set is array (Positive range <>) of Positive;
   --  Used to store a set of column widths

   Default_Separators : constant String := " " & ASCII.HT;

   Use_Current : constant String := "";
   --  Value used when no separator or filename is specified in iterators

   type Session_Type is limited private;
   --  This is the main exported type. A session is used to keep the state of
   --  a full AWK run. The state comprises a list of files, the current file,
   --  the number of lines processed, the current line, the number of fields
   --  in the current line... A default session is provided (see Set_Current,
   --  Current_Session and Default_Session below).

   ----------------------------
   -- Package initialization --
   ----------------------------

   --  To be thread safe it is not possible to use the default provided
   --  session. Each task must use a specific session and specify it
   --  explicitly for every service.

   procedure Set_Current (Session : Session_Type);
   --  Set the session to be used by default. This session will be used when
   --  the Session parameter in the following services is not specified.

   function Current_Session return not null access Session_Type;
   --  Returns the session used by default by all services. This is the
   --  latest session specified by the Set_Current service or the session
   --  provided by default with this implementation.

   function Default_Session return not null access Session_Type;
   --  Returns the default session provided by this package. Note that this is
   --  the session returned by Current_Session if Set_Current has not been
   --  used.

   procedure Set_Field_Separators
     (Separators : String := Default_Separators;
      Session    : Session_Type);
   procedure Set_Field_Separators
     (Separators : String := Default_Separators);
   --  Set the field separators. Each character in the string is a field
   --  separator. When a line is read it will be split by field using the
   --  separators set here. Separators can be changed at any point and in this
   --  case the current line is split according to the new separators. In the
   --  special case that Separators is a space and a tabulation
   --  (Default_Separators), fields are separated by runs of spaces and/or
   --  tabs.

   procedure Set_FS
     (Separators : String := Default_Separators;
      Session    : Session_Type)
      renames Set_Field_Separators;
   procedure Set_FS
     (Separators : String := Default_Separators)
      renames Set_Field_Separators;
   --  FS is the AWK abbreviation for above service

   procedure Set_Field_Widths
     (Field_Widths : Widths_Set;
      Session      : Session_Type);
   procedure Set_Field_Widths
     (Field_Widths : Widths_Set);
   --  This is another way to split a line by giving the length (in number of
   --  characters) of each field in a line. Field widths can be changed at any
   --  point and in this case the current line is split according to the new
   --  field lengths. A line split with this method must have a length equal to
   --  or greater than the total of the field widths. All characters remaining
   --  on the line after the last field are added to a new automatically
   --  created field.
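
   --  As an illustration only, a minimal sketch of column-based splitting.
   --  The session name Log and the record layout (a 4-character code, then a
   --  10-character name, then free text) are assumptions made for this
   --  example:

   --     Log : Session_Type;

   --     AWK.Set_Field_Widths ((4, 10), Log);

   --  With the line "0042Saturn    up and running", field 1 would be "0042",
   --  field 2 would be "Saturn    " (the padding is presumably kept, since
   --  fields are cut by length) and the remaining characters would form an
   --  automatically created field 3.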

   procedure Add_File
     (Filename : String;
      Session  : Session_Type);
   procedure Add_File
     (Filename : String);
   --  Add Filename to the list of files to be processed. There is no limit on
   --  the number of files that can be added. Files are processed in the order
   --  they have been added (i.e. the filename list is FIFO). If Filename does
   --  not exist or if it is not readable, File_Error is raised.

   procedure Add_Files
     (Directory             : String;
      Filenames             : String;
      Number_Of_Files_Added : out Natural;
      Session               : Session_Type);
   procedure Add_Files
     (Directory             : String;
      Filenames             : String;
      Number_Of_Files_Added : out Natural);
   --  Add all files matching the regular expression Filenames in the specified
   --  directory to the list of files to be processed. There is no limit on
   --  the number of files that can be added. Files are processed in the order
   --  they have been added (i.e. the filename list is FIFO). The number of
   --  files (possibly 0) added is returned in Number_Of_Files_Added.
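
   --  For illustration, a sketch that queues every ".db" file of a directory.
   --  The directory name "data" and the names Computers and Added are
   --  hypothetical, and the pattern is written as a regular expression, as
   --  stated above:

   --     Computers : Session_Type;
   --     Added     : Natural;

   --     AWK.Add_Files ("data", ".*\.db", Added, Computers);

   --     if Added = 0 then
   --        Put_Line ("no database file found");
   --     end if;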

   -------------------------------------
   -- Information about current state --
   -------------------------------------

   function Number_Of_Fields
     (Session : Session_Type) return Count;
   function Number_Of_Fields
     return Count;
   pragma Inline (Number_Of_Fields);
   --  Returns the number of fields in the current record. It returns 0 when
   --  no file is being processed.

   function NF
     (Session : Session_Type) return Count
      renames Number_Of_Fields;
   function NF
     return Count
      renames Number_Of_Fields;
   --  AWK abbreviation for above service

   function Number_Of_File_Lines
     (Session : Session_Type) return Count;
   function Number_Of_File_Lines
     return Count;
   pragma Inline (Number_Of_File_Lines);
   --  Returns the current line number in the processed file. It returns 0 when
   --  no file is being processed.

   function FNR (Session : Session_Type) return Count
     renames Number_Of_File_Lines;
   function FNR return Count
     renames Number_Of_File_Lines;
   --  AWK abbreviation for above service

   function Number_Of_Lines
     (Session : Session_Type) return Count;
   function Number_Of_Lines
     return Count;
   pragma Inline (Number_Of_Lines);
   --  Returns the number of lines processed until now. This is equal to the
   --  number of lines in each already processed file plus FNR. It returns 0
   --  when no file is being processed.

   function NR (Session : Session_Type) return Count
     renames Number_Of_Lines;
   function NR return Count
     renames Number_Of_Lines;
   --  AWK abbreviation for above service

   function Number_Of_Files
     (Session : Session_Type) return Natural;
   function Number_Of_Files
     return Natural;
   pragma Inline (Number_Of_Files);
   --  Returns the number of files associated with Session. This is the total
   --  number of files added with Add_File and Add_Files services.

   function File (Session : Session_Type) return String;
   function File return String;
   --  Returns the name of the file being processed. It returns the empty
   --  string when no file is being processed.

   ---------------------
   -- Field accessors --
   ---------------------

   function Field
     (Rank    : Count;
      Session : Session_Type) return String;
   function Field
     (Rank : Count) return String;
   --  Returns field number Rank value of the current record. If Rank = 0 it
   --  returns the current record (i.e. the line as read in the file). It
   --  raises Field_Error if Rank > NF or if Session is not open.

   function Field
     (Rank    : Count;
      Session : Session_Type) return Integer;
   function Field
     (Rank : Count) return Integer;
   --  Returns field number Rank value of the current record as an integer. It
   --  raises Field_Error if Rank > NF or if Session is not open. It
   --  raises Data_Error if the field value cannot be converted to an integer.

   function Field
     (Rank    : Count;
      Session : Session_Type) return Float;
   function Field
     (Rank : Count) return Float;
   --  Returns field number Rank value of the current record as a float. It
   --  raises Field_Error if Rank > NF or if Session is not open. It
   --  raises Data_Error if the field value cannot be converted to a float.

   generic
      type Discrete is (<>);
   function Discrete_Field
     (Rank    : Count;
      Session : Session_Type) return Discrete;
   generic
      type Discrete is (<>);
   function Discrete_Field_Current_Session
     (Rank : Count) return Discrete;
   --  Returns field number Rank value of the current record as a type
   --  Discrete. It raises Field_Error if Rank > NF. It raises Data_Error if
   --  the field value cannot be converted to type Discrete.
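
   --  A possible instantiation, sketched under the assumption that the field
   --  text is the image of an enumeration literal. OS_Kind, OS_Field and the
   --  session Computers are hypothetical names:

   --     type OS_Kind is (Linux, Solaris, MacOS);

   --     function OS_Field is new AWK.Discrete_Field (OS_Kind);

   --     OS : constant OS_Kind := OS_Field (2, Computers);

   --  For a record whose second field cannot be converted to OS_Kind (for
   --  example "OS/2"), Data_Error would be raised as described above.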

   --------------------
   -- Pattern/Action --
   --------------------

   --  AWK defines rules like "PATTERN { ACTION }", which means that ACTION
   --  will be executed if PATTERN matches. A pattern in this implementation
   --  can be a simple string (match function is equality), a regular
   --  expression, or a function returning a boolean. An action is associated
   --  with a pattern using the Register services.

   --  Each procedure Register will add a rule to the set of rules for the
   --  session. Rules are examined in the order they have been added.

   type Pattern_Callback is access function return Boolean;
   --  This is a pattern function pointer. When it returns True the associated
   --  action will be called.

   type Action_Callback is access procedure;
   --  A simple action pointer

   type Match_Action_Callback is
     access procedure (Matches : GNAT.Regpat.Match_Array);
   --  An advanced action pointer used with a regular expression pattern. It
   --  receives an array of all the matches. See GNAT.Regpat for further
   --  information.

   procedure Register
     (Field   : Count;
      Pattern : String;
      Action  : Action_Callback;
      Session : Session_Type);
   procedure Register
     (Field   : Count;
      Pattern : String;
      Action  : Action_Callback);
   --  Register an Action associated with a Pattern. The pattern here is a
   --  simple string that must match exactly the field number specified.

   procedure Register
     (Field   : Count;
      Pattern : GNAT.Regpat.Pattern_Matcher;
      Action  : Action_Callback;
      Session : Session_Type);
   procedure Register
     (Field   : Count;
      Pattern : GNAT.Regpat.Pattern_Matcher;
      Action  : Action_Callback);
   --  Register an Action associated with a Pattern. The pattern here is a
   --  simple regular expression which must match the field number specified.

   procedure Register
     (Field   : Count;
      Pattern : GNAT.Regpat.Pattern_Matcher;
      Action  : Match_Action_Callback;
      Session : Session_Type);
   procedure Register
     (Field   : Count;
      Pattern : GNAT.Regpat.Pattern_Matcher;
      Action  : Match_Action_Callback);
   --  Same as above but it passes the set of matches to the action
   --  procedure. This is useful to analyse further why and where a regular
   --  expression matched.

   procedure Register
     (Pattern : Pattern_Callback;
      Action  : Action_Callback;
      Session : Session_Type);
   procedure Register
     (Pattern : Pattern_Callback;
      Action  : Action_Callback);
   --  Register an Action associated with a Pattern. The pattern here is a
   --  function that must return a boolean. Action callback will be called if
   --  the pattern callback returns True and nothing will happen if it is
   --  False. This version is more general; the other Register services
   --  trigger an action based on the value of a single field only.

   procedure Register
     (Action  : Action_Callback;
      Session : Session_Type);
   procedure Register
     (Action : Action_Callback);
   --  Register an Action that will be called for every line. This is
   --  equivalent to a Pattern_Callback function always returning True.
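
   --  A sketch of the Match_Action_Callback form, for illustration only.
   --  Show_Match and Pentium are hypothetical names, and Show_Match is assumed
   --  to be declared at library level so that 'Access is legal:

   --     Pentium : constant GNAT.Regpat.Pattern_Matcher :=
   --                 GNAT.Regpat.Compile ("Pentium");

   --     procedure Show_Match (Matches : GNAT.Regpat.Match_Array) is
   --        Whole : GNAT.Regpat.Match_Location renames Matches (Matches'First);
   --     begin
   --        --  The first entry is taken to locate the whole match
   --        Put_Line (Natural'Image (Whole.First)
   --                  & " .."
   --                  & Natural'Image (Whole.Last));
   --     end Show_Match;

   --     AWK.Register (3, Pentium, Show_Match'Access);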

   --------------------
   -- Parse iterator --
   --------------------

   procedure Parse
     (Separators : String := Use_Current;
      Filename   : String := Use_Current;
      Session    : Session_Type);
   procedure Parse
     (Separators : String := Use_Current;
      Filename   : String := Use_Current);
   --  Launch the iterator; it will read every line in all of the session's
   --  files. Registered callbacks are then called if the associated pattern
   --  matches. It is possible to specify a filename and a set of separators
   --  directly. This offers a quick way to parse a single file. These
   --  parameters will override those specified by Set_FS and Add_File. The
   --  Session will be opened and closed automatically. File_Error is raised
   --  if there is no file associated with Session, or if a file associated
   --  with Session is no longer readable. It raises Session_Error if Session
   --  is already open.

   -----------------------------------
   -- Get_Line/End_Of_Data Iterator --
   -----------------------------------

   type Callback_Mode is (None, Only, Pass_Through);
   --  These modes are used for Get_Line/End_Of_Data and For_Every_Line
   --  iterators. The associated semantics are:

   --    None
   --       Callbacks are not active. This is the default mode for
   --       Get_Line/End_Of_Data and For_Every_Line iterators.

   --    Only
   --       Callbacks are active. If at least one pattern matches, the
   --       associated action is called and this line will not be passed to
   --       the user. In the Get_Line case the next line will be read (if
   --       there is some line remaining); in the For_Every_Line case Action
   --       will not be called for this line.

   --    Pass_Through
   --       Callbacks are active. For patterns which match, the associated
   --       action is called. Then the line is passed to the user. It means
   --       that Action procedure is called in the For_Every_Line case and
   --       that Get_Line returns with the current line active.


   procedure Open
     (Separators : String := Use_Current;
      Filename   : String := Use_Current;
      Session    : Session_Type);
   procedure Open
     (Separators : String := Use_Current;
      Filename   : String := Use_Current);
   --  Open the first file and initialize the unit. This must be called once
   --  before using Get_Line. It is possible to specify a filename and a set of
   --  separators directly. This offers a quick way to parse a single file.
   --  These parameters will override those specified by Set_FS and Add_File.
   --  File_Error is raised if there is no file associated with Session, or if
   --  the first file associated with Session is no longer readable. It raises
   --  Session_Error if Session is already open.

   procedure Get_Line
     (Callbacks : Callback_Mode := None;
      Session   : Session_Type);
   procedure Get_Line
     (Callbacks : Callback_Mode := None);
   --  Read a line from the current input file. If the file index is at the
   --  end of the current input file (i.e. End_Of_File is True) then the
   --  following file is opened. If there are no more files to be processed,
   --  exception End_Error will be raised. File_Error will be raised if Open
   --  has not been called. Next call to Get_Line will return the following
   --  line in the file. By default the registered callbacks are not called by
   --  Get_Line; this can be activated by setting Callbacks (see Callback_Mode
   --  description above). File_Error may be raised if a file associated with
   --  Session is not readable.

   --  When Callbacks is not None, it is possible to exhaust all the lines
   --  of all the files associated with Session. In this case, File_Error
   --  is not raised.

   --  This procedure can be used from a subprogram called by procedure Parse
   --  or by an instantiation of For_Every_Line (see below).
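
   --  A sketch of a manual loop with callbacks enabled, assuming the
   --  computer.db file from the package header and a parameterless procedure
   --  Action (hypothetical) that reports the matching line:

   --     AWK.Register (1, "Saturn", Action'Access);
   --     AWK.Open (Separators => ";", Filename => "computer.db");

   --     while not AWK.End_Of_Data loop
   --        AWK.Get_Line (Callbacks => AWK.Pass_Through);
   --        Put_Line (AWK.Field (1));
   --     end loop;

   --     AWK.Close (AWK.Current_Session.all);

   --  With Pass_Through, Action is expected to run for the Saturn record
   --  while the loop body still sees every record.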

   function End_Of_Data
     (Session : Session_Type) return Boolean;
   function End_Of_Data
     return Boolean;
   pragma Inline (End_Of_Data);
   --  Returns True if there is no more data to be processed in Session. It
   --  means that the last file of the session is being processed and that
   --  there is no more data to be read in this file (End_Of_File is True).

   function End_Of_File
     (Session : Session_Type) return Boolean;
   function End_Of_File
     return Boolean;
   pragma Inline (End_Of_File);
   --  Returns True when there is no more data to be processed on the current
   --  session's file.

   procedure Close (Session : Session_Type);
   --  Release all associated data with Session. All memory allocated will
   --  be freed, the current file will be closed if needed, the callbacks
   --  will be unregistered. Close is convenient in reestablishing a session
   --  for new use. Get_Line is no longer usable (will raise File_Error)
   --  except after a successful call to Open, Parse or an instantiation
   --  of For_Every_Line.

   -----------------------------
   -- For_Every_Line iterator --
   -----------------------------

   generic
      with procedure Action (Quit : in out Boolean);
   procedure For_Every_Line
     (Separators : String        := Use_Current;
      Filename   : String        := Use_Current;
      Callbacks  : Callback_Mode := None;
      Session    : Session_Type);
   generic
      with procedure Action (Quit : in out Boolean);
   procedure For_Every_Line_Current_Session
     (Separators : String        := Use_Current;
      Filename   : String        := Use_Current;
      Callbacks  : Callback_Mode := None);
   --  This is another iterator. Action will be called for each new
   --  record. The iterator's termination can be controlled by setting Quit
   --  to True. It is by default set to False. It is possible to specify a
   --  filename and a set of separators directly. This offers a quick way to
   --  parse a single file. These parameters will override those specified by
   --  Set_FS and Add_File. By default the registered callbacks are not called
   --  by For_Every_Line; this can be activated by setting Callbacks (see
   --  Callback_Mode description above). The Session will be opened and
   --  closed automatically. File_Error is raised if there is no file
   --  associated with Session. It raises Session_Error if Session is already
   --  open.
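
   --  A sketch of early termination through Quit, for illustration. The
   --  names Stop_At_Saturn and Scan are hypothetical; Computer_File is a
   --  session object as in the package header examples:

   --     procedure Stop_At_Saturn (Quit : in out Boolean) is
   --     begin
   --        Quit := AWK.Field (1, Computer_File) = "Saturn";
   --     end Stop_At_Saturn;

   --     procedure Scan is new AWK.For_Every_Line (Stop_At_Saturn);

   --     Scan (Separators => ";",
   --           Filename   => "computer.db",
   --           Session    => Computer_File);

   --  The iteration is expected to stop once the Saturn record has been seen,
   --  leaving later records unread.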

private
   type Session_Data;
   type Session_Data_Access is access Session_Data;

   type Session_Type is new Ada.Finalization.Limited_Controlled with record
      Data : Session_Data_Access;
      Self : not null access Session_Type := Session_Type'Unchecked_Access;
   end record;

   procedure Initialize (Session : in out Session_Type);
   procedure Finalize   (Session : in out Session_Type);

end GNAT.AWK;