1 .. _The_Implementation_of_Standard_I/O:
3 **********************************
4 The Implementation of Standard I/O
5 **********************************
7 GNAT implements all the required input-output facilities described in
8 A.6 through A.14. These sections of the Ada Reference Manual describe the
9 required behavior of these packages from the Ada point of view, and if
10 you are writing a portable Ada program that does not need to know the
11 exact manner in which Ada maps to the outside world when it comes to
12 reading or writing external files, then you do not need to read this
13 chapter. As long as your files are all regular files (not pipes or
14 devices), and as long as you write and read the files only from Ada, the
15 description in the Ada Reference Manual is sufficient.
17 However, if you want to do input-output to pipes or other devices, such
18 as the keyboard or screen, or if the files you are dealing with are
19 either generated by some other language, or to be read by some other
20 language, then you need to know more about the details of how the GNAT
21 implementation of these input-output facilities behaves.
23 In this chapter we give a detailed description of exactly how GNAT
24 interfaces to the file system. As always, the sources of the system are
25 available to you for answering questions at an even more detailed level,
26 but for most purposes the information in this chapter will suffice.
28 Another reason that you may need to know more about how input-output is
29 implemented arises when you have a program written in mixed languages
30 where, for example, files are shared between the C and Ada sections of
31 the same program. GNAT provides some additional facilities, in the form
32 of additional child library packages, that facilitate this sharing, and
33 these additional facilities are also described in this chapter.
35 .. _Standard_I/O_Packages:
40 The Standard I/O packages described in Annex A for
45 Ada.Text_IO.Complex_IO
47 Ada.Text_IO.Text_Streams
51 Ada.Wide_Text_IO.Complex_IO
53 Ada.Wide_Text_IO.Text_Streams
57 Ada.Wide_Wide_Text_IO.Complex_IO
59 Ada.Wide_Wide_Text_IO.Text_Streams
67 are implemented using the C
68 library streams facility; where
71 All files are opened using ``fopen``.
73 All input/output operations use ``fread``/``fwrite``.
75 There is no internal buffering of any kind at the Ada library level. The only
76 buffering is that provided at the system level in the implementation of the
77 library routines that support streams. This facilitates shared use of these
78 streams by mixed language programs. Note though that system level buffering is
79 explicitly enabled at elaboration of the standard I/O packages and that can
80 have an impact on mixed language programs, in particular those using I/O before
calling the Ada elaboration routine (e.g., ``adainit``). It is recommended to call
the Ada elaboration routine before performing any I/O or, when that is impractical,
to flush the common I/O streams, in particular ``Standard_Output``, before
elaborating the Ada code.
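The recommended ordering can be sketched from the C side as follows. This is only an illustration under stated assumptions, not GNAT's own code: ``flush_then_elaborate`` and ``fake_adainit`` are hypothetical names, and ``fake_adainit`` merely stands in for the binder-generated elaboration routine so that the sketch is self-contained.

```c
#include <assert.h>
#include <stdio.h>

/* Set by the stand-in below so a caller can see that elaboration ran. */
int ada_elaborated = 0;

/* Hypothetical stand-in for the binder-generated elaboration routine
   (adainit in a real GNAT-bound program). */
void fake_adainit(void)
{
    ada_elaborated = 1;
}

/* Flush the C-level standard streams, then run the elaboration routine.
   Returns 0 on success, EOF if a flush failed. */
int flush_then_elaborate(void (*elaborate)(void))
{
    assert(elaborate != NULL);

    /* Push out anything the C side has already buffered before the
       Ada run-time changes the buffering of the shared streams. */
    int status = fflush(stdout);
    if (fflush(stderr) != 0) {
        status = EOF;
    }

    elaborate();
    return status;
}
```

In a real mixed-language program, the C ``main`` would pass the actual elaboration routine in place of the stand-in.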
91 The format of a FORM string in GNAT is:
96 "keyword=value,keyword=value,...,keyword=value"
99 where letters may be in upper or lower case, and there are no spaces
100 between values. The order of the entries is not important. Currently
the following keywords are defined.
106 TEXT_TRANSLATION=[YES|NO|TEXT|BINARY|U8TEXT|WTEXT|U16TEXT]
109 ENCODING=[UTF8|8BITS]
112 The use of these parameters is described later in this section. If an
113 unrecognized keyword appears in a form string, it is silently ignored
114 and not considered invalid.
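To make the ``keyword=value`` layout concrete, here is a small C sketch that looks up one keyword in a form string. It is not GNAT's parser: ``form_lookup`` is a hypothetical helper that only mirrors the rules stated above (comma-separated entries, letter case ignored for the keyword, unknown keywords skipped).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find `keyword` in a form string of the shape
   "keyword=value,keyword=value" and copy its value into `out`.
   Keyword matching ignores letter case. Returns 1 if found, else 0. */
int form_lookup(const char *form, const char *keyword,
                char *out, size_t out_size)
{
    assert(form != NULL && keyword != NULL && out != NULL && out_size > 0);
    size_t klen = strlen(keyword);
    const char *p = form;

    while (*p != '\0') {
        /* [p, end) is one "keyword=value" entry. */
        const char *end = strchr(p, ',');
        if (end == NULL) {
            end = p + strlen(p);
        }
        const char *eq = memchr(p, '=', (size_t)(end - p));

        if (eq != NULL && (size_t)(eq - p) == klen) {
            size_t i = 0;
            while (i < klen &&
                   tolower((unsigned char)p[i]) ==
                   tolower((unsigned char)keyword[i])) {
                i++;
            }
            if (i == klen) {
                size_t vlen = (size_t)(end - eq - 1);
                if (vlen >= out_size) {
                    vlen = out_size - 1;   /* truncate to fit */
                }
                memcpy(out, eq + 1, vlen);
                out[vlen] = '\0';
                return 1;
            }
        }
        /* Entries that do not match are simply skipped. */
        p = (*end == ',') ? end + 1 : end;
    }
    return 0;
}
```

For example, looking up ``wcem`` in ``"shared=yes,WCEM=8"`` yields the value ``8``, while looking up a keyword that is absent returns 0.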
121 Direct_IO can only be instantiated for definite types. This is a
122 restriction of the Ada language, which means that the records are fixed
123 length (the length being determined by ``type'Size``, rounded
124 up to the next storage unit boundary if necessary).
126 The records of a Direct_IO file are simply written to the file in index
127 sequence, with the first record starting at offset zero, and subsequent
128 records following. There is no control information of any kind. For
129 example, if 32-bit integers are being written, each record takes
4 bytes, so the record at index ``K`` starts at offset ``(K-1)*4``.
There is no limit on the size of Direct_IO files; they are expanded as
133 necessary to accommodate whatever records are written to the file.
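Because the layout has no control information, a C program can locate a record with plain offset arithmetic. The sketch below reads the 32-bit record at a given 1-based index. ``read_direct_int32`` is a hypothetical name, and the sketch assumes the reader runs on the same platform as the writer, so that the memory image (size and byte order) matches.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Read the 32-bit record at 1-based Direct_IO index k from `path`.
   Returns 1 on success, 0 on any failure. */
int read_direct_int32(const char *path, long k, int32_t *value)
{
    assert(path != NULL && value != NULL);
    if (k < 1) {
        return 0;
    }
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return 0;
    }
    /* Record K starts at byte offset (K-1)*4. */
    int ok = fseek(f, (k - 1) * (long)sizeof(int32_t), SEEK_SET) == 0
          && fread(value, sizeof(int32_t), 1, f) == 1;
    fclose(f);
    return ok;
}
```

Reading an index past the last record fails because ``fread`` cannot obtain a full record.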
140 Sequential_IO may be instantiated with either a definite (constrained)
141 or indefinite (unconstrained) type.
143 For the definite type case, the elements written to the file are simply
144 the memory images of the data values with no control information of any
kind. The resulting file should be read using the same type; no validity
146 checking is performed on input.
148 For the indefinite type case, the elements written consist of two
149 parts. First is the size of the data item, written as the memory image
of an ``Interfaces.C.size_t`` value, followed by the memory image of
151 the data value. The resulting file can only be read using the same
152 (unconstrained) type. Normal assignment checks are performed on these
153 read operations, and if these checks fail, ``Data_Error`` is
154 raised. In particular, in the array case, the lengths must match, and in
155 the variant record case, if the variable for a particular read operation
156 is constrained, the discriminants must match.
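The two-part layout for indefinite types can be read back from C along the following lines. This is a sketch under assumptions: ``read_seq_element`` is a hypothetical helper, the size prefix is taken to be a byte count, and the file is assumed to have been written on the same platform so that ``size_t`` has the same size and byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Read one element: a size_t length prefix followed by that many bytes.
   Returns a malloc'ed buffer (the caller frees it) and stores the
   length in *len, or returns NULL at end of file or on error. */
unsigned char *read_seq_element(FILE *f, size_t *len)
{
    assert(f != NULL && len != NULL);
    size_t n;
    if (fread(&n, sizeof n, 1, f) != 1) {
        return NULL;                      /* no further element */
    }
    unsigned char *buf = malloc(n > 0 ? n : 1);
    if (buf == NULL) {
        return NULL;
    }
    if (n > 0 && fread(buf, 1, n, f) != n) {
        free(buf);                        /* truncated element */
        return NULL;
    }
    *len = n;
    return buf;
}
```

Unlike an Ada ``Read`` into a constrained variable, this reader performs no length or discriminant check; it simply returns whatever size the prefix announces.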
158 Note that it is not possible to use Sequential_IO to write variable
159 length array items, and then read the data back into different length
160 arrays. For example, the following will raise ``Data_Error``:
165 package IO is new Sequential_IO (String);
IO.Write (F, "hello!");
171 IO.Reset (F, Mode=>In_File);
177 On some Ada implementations, this will print ``hell``, but the program is
178 clearly incorrect, since there is only one element in the file, and that
179 element is the string ``hello!``.
181 In Ada 95 and Ada 2005, this kind of behavior can be legitimately achieved
182 using Stream_IO, and this is the preferred mechanism. In particular, the
183 above program fragment rewritten to use Stream_IO will work correctly.
190 Text_IO files consist of a stream of characters containing the following
191 special control characters:
196 LF (line feed, 16#0A#) Line Mark
197 FF (form feed, 16#0C#) Page Mark
A canonical Text_IO file is defined as one in which the following
conditions apply:

* The character ``LF`` is used only as a line mark, i.e., to mark the end
  of the line.

* The character ``FF`` is used only as a page mark, i.e., to mark the
  end of a page and consequently can appear only immediately following a
  ``LF`` (line mark) character.

* The file ends with either ``LF`` (line mark) or ``LF``-``FF``
  (line mark, page mark). In the former case, the page mark is implicitly
  assumed to be present.

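The byte-level part of these conditions can be checked mechanically. The following C sketch is an illustration only: ``looks_canonical`` is a hypothetical helper, it cannot tell whether an ``LF`` was intended as data rather than as a line mark, and its rejection of empty input is an assumption of the sketch rather than a rule stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Check that every FF follows an LF and that the buffer ends with
   LF or LF FF. Returns 1 if both hold, 0 otherwise. */
int looks_canonical(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    const unsigned char LF = 0x0A, FF = 0x0C;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == FF && (i == 0 || buf[i - 1] != LF)) {
            return 0;   /* page mark without a preceding line mark */
        }
    }
    if (len >= 1 && buf[len - 1] == LF) {
        return 1;       /* ends with a line mark */
    }
    if (len >= 2 && buf[len - 2] == LF && buf[len - 1] == FF) {
        return 1;       /* ends with line mark, page mark */
    }
    return 0;           /* empty input or missing final line mark */
}
```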
217 A file written using Text_IO will be in canonical form provided that no
218 explicit ``LF`` or ``FF`` characters are written using ``Put``
219 or ``Put_Line``. There will be no ``FF`` character at the end of
220 the file unless an explicit ``New_Page`` operation was performed
221 before closing the file.
223 A canonical Text_IO file that is a regular file (i.e., not a device or a
224 pipe) can be read using any of the routines in Text_IO. The
225 semantics in this case will be exactly as defined in the Ada Reference
226 Manual, and all the routines in Text_IO are fully implemented.
A text file that does not meet the requirements for a canonical Text_IO
file has one of the following:

* The file contains ``FF`` characters not immediately following a
  ``LF`` character.

* The file contains ``LF`` or ``FF`` characters written by
  ``Put`` or ``Put_Line``, which are not logically considered to be
  line marks or page marks.

* The file ends in a character other than ``LF`` or ``FF``,
  i.e., there is no explicit line mark or page mark at the end of the file.

244 Text_IO can be used to read such non-standard text files but subprograms
245 to do with line or page numbers do not have defined meanings. In
246 particular, a ``FF`` character that does not follow a ``LF``
247 character may or may not be treated as a page mark from the point of
248 view of page and line numbering. Every ``LF`` character is considered
to end a line, and there is an implied ``LF`` character at the end of
the file.
252 .. _Stream_Pointer_Positioning:
254 Stream Pointer Positioning
255 --------------------------
257 ``Ada.Text_IO`` has a definition of current position for a file that
258 is being read. No internal buffering occurs in Text_IO, and usually the
259 physical position in the stream used to implement the file corresponds
260 to this logical position defined by Text_IO. There are two exceptions:

* After a call to ``End_Of_Page`` that returns ``True``, the stream
  is positioned past the ``LF`` (line mark) that precedes the page
  mark. Text_IO maintains an internal flag so that subsequent read
  operations properly handle the logical position, which is unchanged by
  the ``End_Of_Page`` call.

* After a call to ``End_Of_File`` that returns ``True``, if the
  Text_IO file was positioned before the line mark at the end of file
  before the call, then the logical position is unchanged, but the stream
  is physically positioned right at the end of file (past the line mark,
  and past a possible page mark following the line mark). Again, Text_IO
  maintains internal flags so that subsequent read operations properly
  handle the logical position.

278 These discrepancies have no effect on the observable behavior of
279 Text_IO, but if a single Ada stream is shared between a C program and
280 Ada program, or shared (using ``shared=yes`` in the form string)
between two Ada files, then the difference may be observable in some
situations.
284 .. _Reading_and_Writing_Non-Regular_Files:
286 Reading and Writing Non-Regular Files
287 -------------------------------------
289 A non-regular file is a device (such as a keyboard), or a pipe. Text_IO
290 can be used for reading and writing. Writing is not affected and the
291 sequence of characters output is identical to the normal file case, but
292 for reading, the behavior of Text_IO is modified to avoid undesirable
293 look-ahead as follows:
295 An input file that is not a regular file is considered to have no page
296 marks. Any ``Ascii.FF`` characters (the character normally used for a
297 page mark) appearing in the file are considered to be data
298 characters. In particular:

* ``Get_Line`` and ``Skip_Line`` do not test for a page mark
  following a line mark. If a page mark appears, it will be treated as a
  data character. This avoids the need to wait for an extra character to
  be typed or entered from the pipe to complete one of these operations.

* ``End_Of_Page`` always returns ``False``.

* ``End_Of_File`` will return ``False`` if there is a page mark at
  the end of the file.

316 Output to non-regular files is the same as for regular files. Page marks
317 may be written to non-regular files using ``New_Page``, but as noted
318 above they will not be treated as page marks on input if the output is
319 piped to another Ada program.
Another important discrepancy when reading non-regular files is that the end
of file indication is not 'sticky'. If an end of file is entered, e.g., by
pressing the :kbd:`EOT` key, then end of file is signaled once (i.e., the
test ``End_Of_File`` will yield ``True``, or a read will raise
``End_Error``), but reading can then resume and read data past that end of
file indication, until another end of file indication is entered.
336 .. index:: Get_Immediate
338 Get_Immediate returns the next character (including control characters)
339 from the input file. In particular, Get_Immediate will return LF or FF
340 characters used as line marks or page marks. Such operations leave the
341 file positioned past the control character, and it is thus not treated
342 as having its normal function. This means that page, line and column
343 counts after this kind of Get_Immediate call are set as though the mark
344 did not occur. In the case where a Get_Immediate leaves the file
345 positioned between the line mark and page mark (which is not normally
possible), it is undefined whether the FF character will be treated as a
page mark.
349 .. _Treating_Text_IO_Files_as_Streams:
351 Treating Text_IO Files as Streams
352 ---------------------------------
354 .. index:: Stream files
The package ``Text_IO.Text_Streams`` allows a ``Text_IO`` file to be treated
357 as a stream. Data written to a ``Text_IO`` file in this stream mode is
358 binary data. If this binary data contains bytes 16#0A# (``LF``) or
359 16#0C# (``FF``), the resulting file may have non-standard
360 format. Similarly if read operations are used to read from a Text_IO
361 file treated as a stream, then ``LF`` and ``FF`` characters may be
skipped and the effect is similar to that described above for
``Get_Immediate``.
365 .. _Text_IO_Extensions:
370 .. index:: Text_IO extensions
372 A package GNAT.IO_Aux in the GNAT library provides some useful extensions
373 to the standard ``Text_IO`` package:
375 * function File_Exists (Name : String) return Boolean;
376 Determines if a file of the given name exists.
378 * function Get_Line return String;
379 Reads a string from the standard input file. The value returned is exactly
380 the length of the line that was read.
382 * function Get_Line (File : Ada.Text_IO.File_Type) return String;
383 Similar, except that the parameter File specifies the file from which
384 the string is to be read.
387 .. _Text_IO_Facilities_for_Unbounded_Strings:
389 Text_IO Facilities for Unbounded Strings
390 ----------------------------------------
392 .. index:: Text_IO for unbounded strings
394 .. index:: Unbounded_String, Text_IO operations
396 The package ``Ada.Strings.Unbounded.Text_IO``
397 in library files :file:`a-suteio.ads/adb` contains some GNAT-specific
398 subprograms useful for Text_IO operations on unbounded strings:
401 * function Get_Line (File : File_Type) return Unbounded_String;
402 Reads a line from the specified file
403 and returns the result as an unbounded string.
405 * procedure Put (File : File_Type; U : Unbounded_String);
Writes the value of the given unbounded string to the specified file.
407 Similar to the effect of
408 ``Put (To_String (U))`` except that an extra copy is avoided.
410 * procedure Put_Line (File : File_Type; U : Unbounded_String);
411 Writes the value of the given unbounded string to the specified file,
412 followed by a ``New_Line``.
413 Similar to the effect of ``Put_Line (To_String (U))`` except
414 that an extra copy is avoided.
416 In the above procedures, ``File`` is of type ``Ada.Text_IO.File_Type``
417 and is optional. If the parameter is omitted, then the standard input or
418 output file is referenced as appropriate.
420 The package ``Ada.Strings.Wide_Unbounded.Wide_Text_IO`` in library
421 files :file:`a-swuwti.ads` and :file:`a-swuwti.adb` provides similar extended
422 ``Wide_Text_IO`` functionality for unbounded wide strings.
424 The package ``Ada.Strings.Wide_Wide_Unbounded.Wide_Wide_Text_IO`` in library
425 files :file:`a-szuzti.ads` and :file:`a-szuzti.adb` provides similar extended
426 ``Wide_Wide_Text_IO`` functionality for unbounded wide wide strings.
433 ``Wide_Text_IO`` is similar in most respects to Text_IO, except that
434 both input and output files may contain special sequences that represent
435 wide character values. The encoding scheme for a given file may be
specified using a FORM parameter of the form ``WCEM=x``
as part of the FORM string (WCEM = wide character encoding method),
where ``x`` is one of the following characters:

========== ====================
Character  Encoding
========== ====================
*h*        Hex ESC encoding
*u*        Upper half encoding
*s*        Shift-JIS encoding
*e*        EUC encoding
*8*        UTF-8 encoding
*b*        Brackets encoding
========== ====================

458 The encoding methods match those that
459 can be used in a source
460 program, but there is no requirement that the encoding method used for
461 the source program be the same as the encoding method used for files,
462 and different files may use different encoding methods.
464 The default encoding method for the standard files, and for opened files
465 for which no WCEM parameter is given in the FORM string matches the
466 wide character encoding specified for the main program (the default
467 being brackets encoding if no coding method was specified with -gnatW).
In hex (ESC) encoding, a wide character is represented by a five character
sequence, ``ESC a b c d``,
482 where ``a``, ``b``, ``c``, ``d`` are the four hexadecimal
483 characters (using upper case letters) of the wide character code. For
484 example, ESC A345 is used to represent the wide character with code
485 16#A345#. This scheme is compatible with use of the full
486 ``Wide_Character`` set.
In upper half encoding, the wide character with code 16#abcd#, where the upper bit is on
491 (i.e., a is in the range 8-F) is represented as two bytes 16#ab# and
492 16#cd#. The second byte may never be a format control character, but is
493 not required to be in the upper half. This method can be also used for
494 shift-JIS or EUC where the internal coding matches the external coding.
In Shift-JIS encoding, a wide character is represented by a two character sequence 16#ab# and
499 16#cd#, with the restrictions described for upper half encoding as
500 described above. The internal character code is the corresponding JIS
501 character according to the standard algorithm for Shift-JIS
502 conversion. Only characters defined in the JIS code set table can be
503 used with this encoding method.
In EUC encoding, a wide character is represented by a two character sequence 16#ab# and
508 16#cd#, with both characters being in the upper half. The internal
509 character code is the corresponding JIS character according to the EUC
510 encoding algorithm. Only characters defined in the JIS code set table
511 can be used with this encoding method.
In UTF-8 encoding, a wide character is represented using
516 UCS Transformation Format 8 (UTF-8) as defined in Annex R of ISO
517 10646-1/Am.2. Depending on the character value, the representation
518 is a one, two, or three byte sequence:
523 16#0000#-16#007f#: 2#0xxxxxxx#
524 16#0080#-16#07ff#: 2#110xxxxx# 2#10xxxxxx#
525 16#0800#-16#ffff#: 2#1110xxxx# 2#10xxxxxx# 2#10xxxxxx#
529 where the ``xxx`` bits correspond to the left-padded bits of the
530 16-bit character value. Note that all lower half ASCII characters
531 are represented as ASCII bytes and all upper half characters and
other wide characters are represented as sequences of upper-half
characters. (The full UTF-8 scheme allows for encoding 31-bit characters as
6-byte sequences, but in this implementation, all UTF-8 sequences
of four or more bytes will raise ``Constraint_Error``, as
will all invalid UTF-8 sequences.)
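The bit layout for 16-bit values can be written out directly. The following C sketch is an independent illustration of the one, two, and three byte forms, not the GNAT run-time routine; ``utf8_encode16`` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Encode a 16-bit character code as UTF-8 into out (at least 3 bytes).
   Returns the number of bytes written: 1, 2, or 3. */
size_t utf8_encode16(unsigned code, unsigned char out[3])
{
    assert(code <= 0xFFFFu);
    if (code <= 0x7Fu) {
        out[0] = (unsigned char)code;                         /* 0xxxxxxx */
        return 1;
    }
    if (code <= 0x7FFu) {
        out[0] = (unsigned char)(0xC0u | (code >> 6));        /* 110xxxxx */
        out[1] = (unsigned char)(0x80u | (code & 0x3Fu));     /* 10xxxxxx */
        return 2;
    }
    out[0] = (unsigned char)(0xE0u | (code >> 12));           /* 1110xxxx */
    out[1] = (unsigned char)(0x80u | ((code >> 6) & 0x3Fu));  /* 10xxxxxx */
    out[2] = (unsigned char)(0x80u | (code & 0x3Fu));         /* 10xxxxxx */
    return 3;
}
```

For instance, code 16#20AC# falls in the three byte range and encodes as the bytes 16#E2# 16#82# 16#AC#.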
In brackets encoding, a wide character is represented by the following eight
character sequence, ``[ " a b c d " ]``,
550 where ``a``, ``b``, ``c``, ``d`` are the four hexadecimal
551 characters (using uppercase letters) of the wide character code. For
example, ``["A345"]`` is used to represent the wide character with code
``16#A345#``.
554 This scheme is compatible with use of the full Wide_Character set.
555 On input, brackets coding can also be used for upper half characters,
e.g., ``["C1"]`` for upper case A with acute accent. However, on output, brackets notation
557 is only used for wide characters with a code greater than ``16#FF#``.
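The input side of brackets notation can be illustrated with a small decoder that accepts both the two-digit and the four-digit form. This is a sketch, not the run-time's parser: ``decode_brackets`` is a hypothetical name, and accepting lower-case hexadecimal digits is a leniency of the sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Decode one brackets sequence of the form ["XX"] or ["XXXX"] at the
   start of s. Returns the character code, or -1 if s does not start
   with a valid sequence. */
long decode_brackets(const char *s)
{
    assert(s != NULL);
    if (s[0] != '[' || s[1] != '"') {
        return -1;
    }
    long code = 0;
    int digits = 0;
    const char *p = s + 2;

    /* Accumulate up to four hexadecimal digits. */
    while (digits < 4 && isxdigit((unsigned char)*p)) {
        int c = toupper((unsigned char)*p);
        code = code * 16 + (c >= 'A' ? c - 'A' + 10 : c - '0');
        p++;
        digits++;
    }
    /* Require exactly 2 or 4 digits, then the closing quote and bracket. */
    if ((digits != 2 && digits != 4) || p[0] != '"' || p[1] != ']') {
        return -1;
    }
    return code;
}
```

A left bracket that starts ordinary text, such as ``[first run]``, is rejected by this decoder because no quote follows the bracket.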
559 Note that brackets coding is not normally used in the context of
560 Wide_Text_IO or Wide_Wide_Text_IO, since it is really just designed as
561 a portable way of encoding source files. In the context of Wide_Text_IO
562 or Wide_Wide_Text_IO, it can only be used if the file does not contain
563 any instance of the left bracket character other than to encode wide
564 character values using the brackets encoding method. In practice it is
565 expected that some standard wide character encoding method such
566 as UTF-8 will be used for text input output.
568 If brackets notation is used, then any occurrence of a left bracket
569 in the input file which is not the start of a valid wide character
570 sequence will cause Constraint_Error to be raised. It is possible to
571 encode a left bracket as ["5B"] and Wide_Text_IO and Wide_Wide_Text_IO
572 input will interpret this as a left bracket.
574 However, when a left bracket is output, it will be output as a left bracket
575 and not as ["5B"]. We make this decision because for normal use of
576 Wide_Text_IO for outputting messages, it is unpleasant to clobber left
577 brackets. For example, if we write:
582 Put_Line ("Start of output [first run]");
585 we really do not want to have the left bracket in this message clobbered so
586 that the output reads:
591 Start of output ["5B"]first run]
595 In practice brackets encoding is reasonably useful for normal Put_Line use
596 since we won't get confused between left brackets and wide character
597 sequences in the output. But for input, or when files are written out
598 and read back in, it really makes better sense to use one of the standard
599 encoding methods such as UTF-8.
602 For the coding schemes other than UTF-8, Hex, or Brackets encoding,
603 not all wide character
604 values can be represented. An attempt to output a character that cannot
605 be represented using the encoding scheme for the file causes
606 Constraint_Error to be raised. An invalid wide character sequence on
607 input also causes Constraint_Error to be raised.
609 .. _Stream_Pointer_Positioning_1:
611 Stream Pointer Positioning
612 --------------------------
614 ``Ada.Wide_Text_IO`` is similar to ``Ada.Text_IO`` in its handling
of stream pointer positioning (:ref:`Text_IO`). There is one additional
case:
618 If ``Ada.Wide_Text_IO.Look_Ahead`` reads a character outside the
619 normal lower ASCII set, i.e. a character in the range:
624 Wide_Character'Val (16#0080#) .. Wide_Character'Val (16#FFFF#)
627 then although the logical position of the file pointer is unchanged by
628 the ``Look_Ahead`` call, the stream is physically positioned past the
629 wide character sequence. Again this is to avoid the need for buffering
630 or backup, and all ``Wide_Text_IO`` routines check the internal
631 indication that this situation has occurred so that this is not visible
632 to a normal program using ``Wide_Text_IO``. However, this discrepancy
633 can be observed if the wide text file shares a stream with another file.
635 .. _Reading_and_Writing_Non-Regular_Files_1:
637 Reading and Writing Non-Regular Files
638 -------------------------------------
640 As in the case of Text_IO, when a non-regular file is read, it is
641 assumed that the file contains no page marks (any form characters are
642 treated as data characters), and ``End_Of_Page`` always returns
643 ``False``. Similarly, the end of file indication is not sticky, so
644 it is possible to read beyond an end of file.
646 .. _Wide_Wide_Text_IO:
651 ``Wide_Wide_Text_IO`` is similar in most respects to Text_IO, except that
652 both input and output files may contain special sequences that represent
653 wide wide character values. The encoding scheme for a given file may be
specified using a FORM parameter of the form ``WCEM=x``
as part of the FORM string (WCEM = wide character encoding method),
where ``x`` is one of the following characters:

========== ====================
Character  Encoding
========== ====================
*h*        Hex ESC encoding
*u*        Upper half encoding
*s*        Shift-JIS encoding
*e*        EUC encoding
*8*        UTF-8 encoding
*b*        Brackets encoding
========== ====================

677 The encoding methods match those that
678 can be used in a source
679 program, but there is no requirement that the encoding method used for
680 the source program be the same as the encoding method used for files,
681 and different files may use different encoding methods.
683 The default encoding method for the standard files, and for opened files
684 for which no WCEM parameter is given in the FORM string matches the
685 wide character encoding specified for the main program (the default
686 being brackets encoding if no coding method was specified with -gnatW).
In UTF-8 encoding, a wide wide character is represented using
692 UCS Transformation Format 8 (UTF-8) as defined in Annex R of ISO
693 10646-1/Am.2. Depending on the character value, the representation
694 is a one, two, three, or four byte sequence:
699 16#000000#-16#00007f#: 2#0xxxxxxx#
700 16#000080#-16#0007ff#: 2#110xxxxx# 2#10xxxxxx#
701 16#000800#-16#00ffff#: 2#1110xxxx# 2#10xxxxxx# 2#10xxxxxx#
702 16#010000#-16#10ffff#: 2#11110xxx# 2#10xxxxxx# 2#10xxxxxx# 2#10xxxxxx#
706 where the ``xxx`` bits correspond to the left-padded bits of the
707 21-bit character value. Note that all lower half ASCII characters
708 are represented as ASCII bytes and all upper half characters and
other wide characters are represented as sequences of upper-half
characters.
In brackets encoding, a wide wide character is represented by the following eight
character sequence if it is in the wide character range, ``[ " a b c d " ]``,
and by the following ten character sequence if not, ``[ " a b c d e f " ]``,
733 where ``a``, ``b``, ``c``, ``d``, ``e``, and ``f``
734 are the four or six hexadecimal
735 characters (using uppercase letters) of the wide wide character code. For
736 example, ``["01A345"]`` is used to represent the wide wide character
737 with code ``16#01A345#``.
739 This scheme is compatible with use of the full Wide_Wide_Character set.
740 On input, brackets coding can also be used for upper half characters,
e.g., ``["C1"]`` for upper case A with acute accent. However, on output, brackets notation
742 is only used for wide characters with a code greater than ``16#FF#``.
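The output rule (a plain byte up to ``16#FF#``, otherwise an eight or ten character sequence) can be sketched in C as follows. ``encode_brackets`` is a hypothetical helper written for illustration; it is not taken from the GNAT run-time.

```c
#include <assert.h>
#include <stdio.h>

/* Write the output form of `code` into out (at least 11 bytes) and
   return its length: 1 plain byte up to 16#FF#, 8 characters up to
   16#FFFF#, and 10 characters above that. */
int encode_brackets(unsigned long code, char out[11])
{
    assert(code <= 0x10FFFFul);
    if (code <= 0xFFul) {
        out[0] = (char)code;              /* no brackets notation */
        out[1] = '\0';
        return 1;
    }
    if (code <= 0xFFFFul) {
        /* Four upper-case hex digits: ["abcd"] */
        return snprintf(out, 11, "[\"%04lX\"]", code);
    }
    /* Six upper-case hex digits: ["abcdef"] */
    return snprintf(out, 11, "[\"%06lX\"]", code);
}
```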
It is also possible to use the other Wide_Character encoding methods,
746 such as Shift-JIS, but the other schemes cannot support the full range
747 of wide wide characters.
748 An attempt to output a character that cannot
749 be represented using the encoding scheme for the file causes
750 Constraint_Error to be raised. An invalid wide character sequence on
751 input also causes Constraint_Error to be raised.
753 .. _Stream_Pointer_Positioning_2:
755 Stream Pointer Positioning
756 --------------------------
758 ``Ada.Wide_Wide_Text_IO`` is similar to ``Ada.Text_IO`` in its handling
of stream pointer positioning (:ref:`Text_IO`). There is one additional
case:
762 If ``Ada.Wide_Wide_Text_IO.Look_Ahead`` reads a character outside the
763 normal lower ASCII set, i.e. a character in the range:
768 Wide_Wide_Character'Val (16#0080#) .. Wide_Wide_Character'Val (16#10FFFF#)
771 then although the logical position of the file pointer is unchanged by
772 the ``Look_Ahead`` call, the stream is physically positioned past the
773 wide character sequence. Again this is to avoid the need for buffering
774 or backup, and all ``Wide_Wide_Text_IO`` routines check the internal
775 indication that this situation has occurred so that this is not visible
776 to a normal program using ``Wide_Wide_Text_IO``. However, this discrepancy
777 can be observed if the wide text file shares a stream with another file.
779 .. _Reading_and_Writing_Non-Regular_Files_2:
781 Reading and Writing Non-Regular Files
782 -------------------------------------
784 As in the case of Text_IO, when a non-regular file is read, it is
785 assumed that the file contains no page marks (any form characters are
786 treated as data characters), and ``End_Of_Page`` always returns
787 ``False``. Similarly, the end of file indication is not sticky, so
788 it is possible to read beyond an end of file.
795 A stream file is a sequence of bytes, where individual elements are
796 written to the file as described in the Ada Reference Manual. The type
``Stream_Element`` is simply a byte. There are two ways to read or
write a stream file.

* The operations ``Read`` and ``Write`` directly read or write a
  sequence of stream elements with no control information.

* The stream attributes applied to a stream file transfer data in the
  manner described for stream attributes.

808 .. _Text_Translation:
813 ``Text_Translation=xxx`` may be used as the Form parameter
814 passed to Text_IO.Create and Text_IO.Open. ``Text_Translation=xxx``
815 has no effect on Unix systems. Possible values are:

* ``Yes`` or ``Text`` is the default, which means to
  translate LF to/from CR/LF on Windows systems.

* ``No`` disables this translation; i.e., it
  uses binary mode. For output files, ``Text_Translation=No``
  may be used to create Unix-style files on
  Windows.

* ``wtext`` enables translation in Unicode mode
  (corresponds to ``_O_WTEXT``).

* ``u8text`` enables translation in Unicode UTF-8 mode
  (corresponds to ``_O_U8TEXT``).

* ``u16text`` enables translation in Unicode UTF-16
  mode (corresponds to ``_O_U16TEXT``).

845 Section A.14 of the Ada Reference Manual allows implementations to
846 provide a wide variety of behavior if an attempt is made to access the
847 same external file with two or more internal files.
849 To provide a full range of functionality, while at the same time
850 minimizing the problems of portability caused by this implementation
851 dependence, GNAT handles file sharing as follows:
854 In the absence of a ``shared=xxx`` form parameter, an attempt
855 to open two or more files with the same full name is considered an error
856 and is not supported. The exception ``Use_Error`` will be
857 raised. Note that a file that is not explicitly closed by the program
858 remains open until the program terminates.
861 If the form parameter ``shared=no`` appears in the form string, the
862 file can be opened or created with its own separate stream identifier,
863 regardless of whether other files sharing the same external file are
864 opened. The exact effect depends on how the C stream routines handle
865 multiple accesses to the same external files using separate streams.
868 If the form parameter ``shared=yes`` appears in the form string for
869 each of two or more files opened using the same full name, the same
870 stream is shared between these files, and the semantics are as described
871 in Ada Reference Manual, Section A.14.
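At the C level, separate streams on one external file keep separate file positions. The sketch below shows that for two read-only streams. ``two_streams_first_byte`` is a hypothetical helper, and the sketch covers reading only; behavior with writers depends on the C library and the operating system.

```c
#include <assert.h>
#include <stdio.h>

/* Open `path` twice as two independent streams and read one byte
   through each. Returns 1 if both saw the same first byte, 0 if they
   differed, and -1 on error. */
int two_streams_first_byte(const char *path)
{
    assert(path != NULL);
    FILE *a = fopen(path, "r");
    FILE *b = fopen(path, "r");
    int result = -1;

    if (a != NULL && b != NULL) {
        int ca = fgetc(a);   /* advances only a's position */
        int cb = fgetc(b);   /* b still starts at offset zero */
        if (ca != EOF && cb != EOF) {
            result = (ca == cb);
        }
    }
    if (a != NULL) {
        fclose(a);
    }
    if (b != NULL) {
        fclose(b);
    }
    return result;
}
```

With a single shared stream, the second read would instead continue from where the first one stopped.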
873 When a program that opens multiple files with the same name is ported
874 from another Ada compiler to GNAT, the effect will be that
875 ``Use_Error`` is raised.
877 The documentation of the original compiler and the documentation of the
878 program should then be examined to determine if file sharing was
879 expected, and ``shared=xxx`` parameters added to ``Open``
880 and ``Create`` calls as required.
882 When a program is ported from GNAT to some other Ada compiler, no
883 special attention is required unless the ``shared=xxx`` form
884 parameter is used in the program. In this case, you must examine the
885 documentation of the new compiler to see if it supports the required
file sharing semantics, and modify the form strings appropriately. Of
887 course it may be the case that the program cannot be ported if the
888 target compiler does not support the required functionality. The best
889 approach in writing portable code is to avoid file sharing (and hence
the use of the ``shared=xxx`` parameter in the form string)
completely.
893 One common use of file sharing in Ada 83 is the use of instantiations of
894 Sequential_IO on the same file with different types, to achieve
895 heterogeneous input-output. Although this approach will work in GNAT if
896 ``shared=yes`` is specified, it is preferable in Ada to use Stream_IO
for this purpose (using the stream attributes).
899 .. _Filenames_encoding:
904 An encoding form parameter can be used to specify the filename
905 encoding ``encoding=xxx``.
908 If the form parameter ``encoding=utf8`` appears in the form string, the
909 filename must be encoded in UTF-8.
912 If the form parameter ``encoding=8bits`` appears in the form
string, the filename must be a standard 8-bit string.
In the absence of an ``encoding=xxx`` form parameter, the
encoding is controlled by the ``GNAT_CODE_PAGE`` environment
variable. If it is not set, ``utf8`` is assumed.
922 The current system Windows ANSI code page.
927 This encoding form parameter is only supported on the Windows
928 platform. On the other Operating Systems the run-time is supporting

.. _File_content_encoding:

File content encoding
=====================

For text files it is possible to specify the encoding to use. This is
controlled by the ``GNAT_CCS_ENCODING`` environment
variable. If that variable is not set, ``TEXT`` is assumed.

The possible values are those supported on Windows:

Translated Unicode encoding

Unicode 16-bit encoding

Unicode 8-bit encoding

This encoding is only supported on the Windows platform.

``Open`` and ``Create`` calls result in a call to ``fopen``
using the mode shown in the following table:

+----------------------------+---------------+------------------+
| ``Open`` and ``Create`` Call Modes                            |
+----------------------------+---------------+------------------+
|                            | **OPEN**      | **CREATE**       |
+============================+===============+==================+
| Append_File                | "r+"          | "w+"             |
+----------------------------+---------------+------------------+
| In_File                    | "r"           | "w+"             |
+----------------------------+---------------+------------------+
| Out_File (Direct_IO)       | "r+"          | "w"              |
+----------------------------+---------------+------------------+
| Out_File (all other cases) | "w"           | "w"              |
+----------------------------+---------------+------------------+
| Inout_File                 | "r+"          | "w+"             |
+----------------------------+---------------+------------------+

If text file translation is required, then either ``b`` or ``t``
is added to the mode, depending on the setting of Text. Text file
translation refers to the mapping of CR/LF sequences in an external file
to LF characters internally. This mapping only occurs in DOS and
DOS-like systems, and is not relevant to other systems.

A special case occurs with Stream_IO. As shown in the above table, the
file is initially opened in ``r`` or ``w`` mode for the
``In_File`` and ``Out_File`` cases. If a ``Set_Mode`` operation
subsequently requires switching from reading to writing or vice-versa,
then the file is reopened in ``r+`` mode to permit the required operation.
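
The Stream_IO special case can be exercised with a short program. The
following is a sketch under stated assumptions: the procedure name and the
file name ``counter.dat`` are made up for illustration. The file is created
for writing, then switched to reading with ``Set_Mode``, which is the
situation in which the reopen described above takes place.

.. code-block:: ada

   with Ada.Streams.Stream_IO; use Ada.Streams.Stream_IO;
   with Ada.Text_IO;

   procedure Set_Mode_Demo is
      F : File_Type;
      N : Integer;
   begin
      --  Per the table above, Create with Out_File uses mode "w"
      Create (F, Out_File, "counter.dat");  --  hypothetical file name
      Integer'Write (Stream (F), 7);

      --  Switching from writing to reading on the same file object
      Set_Mode (F, In_File);

      --  Set_Mode leaves the position unchanged (unless the new mode is
      --  Append_File), so move back to the first element before reading
      Set_Index (F, 1);
      Integer'Read (Stream (F), N);

      Ada.Text_IO.Put_Line ("Value read back:" & Integer'Image (N));
      Close (F);
   end Set_Mode_Demo;

The explicit ``Set_Index`` call is needed because the mode change alone
does not reposition the file.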

.. _Operations_on_C_Streams:

Operations on C Streams
=======================

The package ``Interfaces.C_Streams`` provides an Ada program with direct
access to the C library functions for operations on C streams:

.. code-block:: ada

   package Interfaces.C_Streams is
      -- Note: the reason we do not use the types that are in
      -- Interfaces.C is that we want to avoid dragging in the
      -- code in this unit if possible.
      subtype chars is System.Address;
      -- Pointer to null-terminated array of characters
      subtype FILEs is System.Address;
      -- Corresponds to the C type FILE*
      subtype voids is System.Address;
      -- Corresponds to the C type void*
      subtype int is Integer;
      subtype long is Long_Integer;
      -- Note: the above types are subtypes deliberately, and it
      -- is part of this spec that the above correspondences are
      -- guaranteed. This means that it is legitimate to, for
      -- example, use Integer instead of int. We provide these
      -- synonyms for clarity, but in some cases it may be
      -- convenient to use the underlying types (for example to
      -- avoid an unnecessary dependency of a spec on the spec
      -- of this unit).
      type size_t is mod 2 ** Standard'Address_Size;
      NULL_Stream : constant FILEs;
      -- Value returned (NULL in C) to indicate an
      -- fdopen/fopen/tmpfile error

      ----------------------------------
      -- Constants Defined in stdio.h --
      ----------------------------------
      EOF : constant int;
      -- Used by a number of routines to indicate error or
      -- end of file
      IOFBF : constant int;
      IOLBF : constant int;
      IONBF : constant int;
      -- Used to indicate buffering mode for setvbuf call
      SEEK_CUR : constant int;
      SEEK_END : constant int;
      SEEK_SET : constant int;
      -- Used to indicate origin for fseek call
      function stdin return FILEs;
      function stdout return FILEs;
      function stderr return FILEs;
      -- Streams associated with standard files

      --------------------------
      -- Standard C functions --
      --------------------------
      -- The functions selected below are ones that are
      -- available in UNIX (but not necessarily in ANSI C).
      -- These are very thin interfaces
      -- which copy exactly the C headers. For more
      -- documentation on these functions, see the Microsoft C
      -- "Run-Time Library Reference" (Microsoft Press, 1990,
      -- ISBN 1-55615-225-6), which includes useful information
      -- on system compatibility.
      procedure clearerr (stream : FILEs);
      function fclose (stream : FILEs) return int;
      function fdopen (handle : int; mode : chars) return FILEs;
      function feof (stream : FILEs) return int;
      function ferror (stream : FILEs) return int;
      function fflush (stream : FILEs) return int;
      function fgetc (stream : FILEs) return int;
      function fgets (strng : chars; n : int; stream : FILEs)
        return chars;
      function fileno (stream : FILEs) return int;
      function fopen (filename : chars; Mode : chars)
        return FILEs;
      -- Note: to maintain target independence, use
      -- text_translation_required, a boolean variable defined in
      -- a-sysdep.c to deal with the target dependent text
      -- translation requirement. If this variable is set,
      -- then b/t should be appended to the standard mode
      -- argument to set the text translation mode off or on
      -- respectively.
      function fputc (C : int; stream : FILEs) return int;
      function fputs (Strng : chars; Stream : FILEs) return int;
      function ftell (stream : FILEs) return long;
      function isatty (handle : int) return int;
      procedure mktemp (template : chars);
      -- The return value (which is just a pointer to template)
      -- is discarded
      procedure rewind (stream : FILEs);
      function rmtmp return int;
      function tmpfile return FILEs;
      function ungetc (c : int; stream : FILEs) return int;
      function unlink (filename : chars) return int;

      ---------------------
      -- Extra functions --
      ---------------------
      -- These functions supply slightly thicker bindings than
      -- those above. They are derived from functions in the
      -- C Run-Time Library, but may do a bit more work than
      -- just directly calling one of the Library functions.
      function is_regular_file (handle : int) return int;
      -- Tests if given handle is for a regular file (result 1)
      -- or for a non-regular file (pipe or device, result 0).

      ---------------------------------
      -- Control of Text/Binary Mode --
      ---------------------------------
      -- If text_translation_required is true, then the following
      -- functions may be used to dynamically switch a file from
      -- binary to text mode or vice versa. These functions have
      -- no effect if text_translation_required is false (i.e., in
      -- normal UNIX mode). Use fileno to get a stream handle.
      procedure set_binary_mode (handle : int);
      procedure set_text_mode (handle : int);

      ----------------------------
      -- Full Path Name support --
      ----------------------------
      procedure full_name (nam : chars; buffer : chars);
      -- Given a NUL terminated string representing a file
      -- name, returns in buffer a NUL terminated string
      -- representing the full path name for the file name.
      -- On systems where it is relevant the drive is also
      -- part of the full path name. It is the responsibility
      -- of the caller to pass an actual parameter for buffer
      -- that is big enough for any full path name. Use
      -- max_path_len given below as the size of buffer.
      max_path_len : Integer;
      -- Maximum length of an allowable full path name on the
      -- system, including a terminating NUL character.
   end Interfaces.C_Streams;
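
A short usage sketch of this package follows. It is illustrative only: the
procedure name and the message text are assumptions, and only subprograms
listed in the specification above are called. Since ``chars`` is a
``System.Address``, the address of a NUL-terminated Ada string is passed
where C expects a ``char *``.

.. code-block:: ada

   with Ada.Text_IO;
   with Interfaces.C_Streams; use Interfaces.C_Streams;

   procedure C_Streams_Demo is
      --  C expects a NUL-terminated string, so the terminator is added
      --  by hand
      Msg : aliased constant String :=
        "written with fputs" & ASCII.LF & ASCII.NUL;
   begin
      --  fputs returns a negative value on error; fflush returns 0 on
      --  success
      if fputs (Msg'Address, stdout) < 0 or else fflush (stdout) /= 0 then
         Ada.Text_IO.Put_Line
           (Ada.Text_IO.Standard_Error, "write failed");
      end if;

      --  fileno yields the handle that is_regular_file expects
      if is_regular_file (fileno (stdout)) = 1 then
         Ada.Text_IO.Put_Line ("stdout is a regular file");
      else
         Ada.Text_IO.Put_Line ("stdout is a pipe or device");
      end if;
   end C_Streams_Demo;

Because these bindings are very thin, the caller is responsible for
details such as NUL termination and checking return codes.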

.. _Interfacing_to_C_Streams:

Interfacing to C Streams
========================

The packages in this section permit interfacing Ada files to C Stream

.. code-block:: ada

   with Interfaces.C_Streams;
   package Ada.Sequential_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Sequential_IO.C_Streams;

   with Interfaces.C_Streams;
   package Ada.Direct_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Direct_IO.C_Streams;

   with Interfaces.C_Streams;
   package Ada.Text_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Text_IO.C_Streams;

   with Interfaces.C_Streams;
   package Ada.Wide_Text_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Wide_Text_IO.C_Streams;

   with Interfaces.C_Streams;
   package Ada.Wide_Wide_Text_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Wide_Wide_Text_IO.C_Streams;

   with Interfaces.C_Streams;
   package Ada.Streams.Stream_IO.C_Streams is
      function C_Stream (F : File_Type)
         return Interfaces.C_Streams.FILEs;
      procedure Open
        (File : in out File_Type;
         Mode : in File_Mode;
         C_Stream : in Interfaces.C_Streams.FILEs;
         Form : in String := "");
   end Ada.Streams.Stream_IO.C_Streams;

In each of these six packages, the ``C_Stream`` function obtains the
``FILE`` pointer from a currently opened Ada file. It is then
possible to use the ``Interfaces.C_Streams`` package to operate on
this stream, or the stream can be passed to a C program which can
operate on it directly. Of course the program is responsible for
ensuring that only appropriate sequences of operations are executed.

One particular use of relevance to an Ada program is that the
``setvbuf`` function can be used to control the buffering of the
stream used by an Ada file. In the absence of such a call the standard
default buffering is used.
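
The ``setvbuf`` use case might look like the sketch below. The procedure
name and the file name are hypothetical. The profile assumed here for
``Interfaces.C_Streams.setvbuf`` (stream, buffer, mode, size, returning
``int``) mirrors the C function; it is an assumption and should be checked
against the package specification shipped with your GNAT version.

.. code-block:: ada

   with Ada.Text_IO;           use Ada.Text_IO;
   with Ada.Text_IO.C_Streams;
   with Interfaces.C_Streams;  use Interfaces.C_Streams;
   with System;

   procedure Setvbuf_Demo is
      F : File_Type;
   begin
      Create (F, Out_File, "unbuffered.log");  --  hypothetical file name

      --  Ask the C library for unbuffered output on the stream behind F.
      --  In C, setvbuf must be called before any other operation on the
      --  stream, so it is done right after Create.
      if setvbuf
           (Ada.Text_IO.C_Streams.C_Stream (F),
            System.Null_Address,
            IONBF,
            0) /= 0
      then
         Put_Line (Standard_Error, "setvbuf failed");
      end if;

      Put_Line (F, "this line is written without C-level buffering");
      Close (F);
   end Setvbuf_Demo;

Passing ``IOLBF`` or ``IOFBF`` together with a buffer and size would
request line or full buffering instead.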

The ``Open`` procedures in these packages open a file giving an
existing C Stream instead of a file name. Typically this stream is
imported from a C program, allowing an Ada file to operate on an