2 <!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
3 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
6 <appendix id="appendix.contrib" xreflabel="Contributing">
7 <?dbhtml filename="appendix_contributing.html"?>
23 <primary>Appendix</primary>
24 <secondary>Contributing</secondary>
The GNU C++ Library follows an open development model. Active
contributors are assigned maintainership responsibility and given
write access to the source repository. First-time contributors
should follow this procedure:
35 <sect1 id="contrib.list" xreflabel="Contributor Checklist">
36 <title>Contributor Checklist</title>
38 <sect2 id="list.reading">
39 <title>Reading</title>
44 Get and read the relevant sections of the C++ language
45 specification. Copies of the full ISO 14882 standard are
46 available on line via the ISO mirror site for committee
47 members. Non-members, or those who have not paid for the
48 privilege of sitting on the committee and sustained their
49 two meeting commitment for voting rights, may get a copy of
50 the standard from their respective national standards
51 organization. In the USA, this national standards
52 organization is ANSI and their web-site is right
53 <ulink url="http://www.ansi.org">here.</ulink>
(And if you've already registered with them, clicking this link will take you directly to the place where you can
55 <ulink url="http://webstore.ansi.org/RecordDetail.aspx?sku=ISO%2FIEC+14882:2003">buy the standard on-line.)</ulink>
The library working group bugs, and known defects, can be found at
<ulink url="http://www.open-std.org/jtc1/sc22/wg21/">http://www.open-std.org/jtc1/sc22/wg21</ulink>
The newsgroup dedicated to standardization issues is
comp.std.c++: the FAQ for this group is quite useful and can be
72 found <ulink url="http://www.comeaucomputing.com/csc/faq.html">
80 the <ulink url="http://www.gnu.org/prep/standards">GNU
81 Coding Standards</ulink>, and chuckle when you hit the part
82 about <quote>Using Languages Other Than C</quote>.
88 Be familiar with the extensions that preceded these
89 general GNU rules. These style issues for libstdc++ can be
90 found <link linkend="contrib.coding_style">here</link>.
96 And last but certainly not least, read the
97 library-specific information
found <link linkend="appendix.porting">here</link>.
104 <sect2 id="list.copyright">
105 <title>Assignment</title>
Small changes can be accepted without a copyright assignment form on
file. New code and additions to the library need a completed copyright
assignment form on file at the FSF. Note: your employer may be required
to fill out appropriate disclaimer forms as well.
114 Historically, the libstdc++ assignment form added the following
120 Which Belgian comic book character is better, Tintin or Asterix, and
126 While not strictly necessary, humoring the maintainers and answering
127 this question would be appreciated.
131 For more information about getting a copyright assignment, please see
132 <ulink url="http://www.gnu.org/prep/maintain/html_node/Legal-Matters.html">Legal
137 Please contact Benjamin Kosnik at
138 <email>bkoz+assign@redhat.com</email> if you are confused
139 about the assignment or have general licensing questions. When
140 requesting an assignment form from
<email>assign@gnu.org</email>, please cc the libstdc++
142 maintainer above so that progress can be monitored.
146 <sect2 id="list.getting">
147 <title>Getting Sources</title>
149 <ulink url="http://gcc.gnu.org/svnwrite.html">Getting write access
150 (look for "Write after approval")</ulink>
154 <sect2 id="list.patches">
155 <title>Submitting Patches</title>
158 Every patch must have several pieces of information before it can be
159 properly evaluated. Ideally (and to ensure the fastest possible
160 response from the maintainers) it would have all of these pieces:
166 A description of the bug and how your patch fixes this
167 bug. For new features a description of the feature and your
A ChangeLog entry as plain text; see the various
ChangeLog files for format and content. If you are
using emacs as your editor, simply position the insertion
point at the beginning of your change and hit C-x 4 a to bring
up the appropriate ChangeLog entry. See--magic! Similar
functionality also exists for vi.
185 A testsuite submission or sample program that will
186 easily and simply show the existing error or test new
193 The patch itself. If you are accessing the SVN
194 repository use <command>svn update; svn diff NEW</command>;
195 else, use <command>diff -cp OLD NEW</command> ... If your
196 version of diff does not support these options, then get the
197 latest version of GNU
198 diff. The <ulink url="http://gcc.gnu.org/wiki/SvnTricks">SVN
199 Tricks</ulink> wiki page has information on customising the
200 output of <code>svn diff</code>.
206 When you have all these pieces, bundle them up in a
207 mail message and send it to libstdc++@gcc.gnu.org. All
208 patches and related discussion should be sent to the
209 libstdc++ mailing list.
218 <sect1 id="contrib.organization" xreflabel="Source Organization">
219 <?dbhtml filename="source_organization.html"?>
220 <title>Directory Layout and Source Conventions</title>
223 The unpacked source directory of libstdc++ contains the files
224 needed to create the GNU C++ Library.
228 It has subdirectories:
231 Files in HTML and text format that document usage, quirks of the
232 implementation, and contributor checklists.
235 All header files for the C++ library are within this directory,
236 modulo specific runtime-related files that are in the libsupc++
240 Files meant to be found by #include <name> directives in
241 standard-conforming user programs.
244 Headers intended to directly include standard C headers.
245 [NB: this can be enabled via --enable-cheaders=c]
248 Headers intended to include standard C headers in
249 the global namespace, and put select names into the std::
250 namespace. [NB: this is the default, and is the same as
251 --enable-cheaders=c_global]
254 Headers intended to include standard C headers
255 already in namespace std, and put select names into the std::
256 namespace. [NB: this is the same as --enable-cheaders=c_std]
259 Files included by standard headers and by other files in
263 Headers provided for backward compatibility, such as <iostream.h>.
264 They are not used in this library.
267 Headers that define extensions to the standard library. No
268 standard header refers to any of them.
271 Scripts that are used during the configure, build, make, or test
275 Files that are used in constructing the library, but are not
278 testsuites/[backward, demangle, ext, performance, thread, 17_* to 27_*]
279 Test programs are here, and may be used to begin to exercise the
280 library. Support for "make check" and "make check-install" is
281 complete, and runs through all the subdirectories here when this
282 command is issued from the build directory. Please note that
283 "make check" requires DejaGNU 1.4 or later to be installed. Please
284 note that "make check-script" calls the script mkcheck, which
285 requires bash, and which may need the paths to bash adjusted to
286 work properly, as /bin/bash is assumed.
288 Other subdirectories contain variant versions of certain files
289 that are meant to be copied or linked by the configure script.
298 In addition, a subdirectory holds the convenience library libsupc++.
301 Contains the runtime library for C++, including exception
302 handling and memory allocation and deallocation, RTTI, terminate
Note that glibc also has a bits/ subdirectory. We will need either
to be careful not to collide with names in its bits/
directory, or to rename bits to (e.g.) cppbits/.
309 In files throughout the system, lines marked with an "XXX" indicate
310 a bug or incompletely-implemented feature. Lines marked "XXX MT"
311 indicate a place that may require attention for multi-thread safety.
316 <sect1 id="contrib.coding_style" xreflabel="Coding Style">
317 <?dbhtml filename="source_code_style.html"?>
318 <title>Coding Style</title>
321 <sect2 id="coding_style.bad_identifiers">
322 <title>Bad Identifiers</title>
324 Identifiers that conflict and should be avoided.
328 This is the list of names <quote>reserved to the
329 implementation</quote> that have been claimed by certain
330 compilers and system headers of interest, and should not be used
331 in the library. It will grow, of course. We generally are
332 interested in names that are not all-caps, except for those like
375 [Note that this list is out of date. It applies to the old
376 name-mangling; in G++ 3.0 and higher a different name-mangling is
377 used. In addition, many of the bugs relating to G++ interpreting
378 these names as operators have been fixed.]
380 The full set of __* identifiers (combined from gcc/cp/lex.c and
381 gcc/cplus-dem.c) that are either old or new, but are definitely
382 recognized by the demangler, is:
510 // long double conversion members mangled as __opr
511 // http://gcc.gnu.org/ml/libstdc++/1999-q4/msg00060.html
516 <sect2 id="coding_style.example">
517 <title>By Example</title>
This library is written to appropriate C++ coding standards. As such,
its conventions are intended to take precedence over the recommendations
of the GNU Coding Standard, which can be referenced in full here:
523 http://www.gnu.org/prep/standards/standards.html#Formatting
525 The rest of this is also interesting reading, but skip the "Design
528 The GCC coding conventions are here, and are also useful:
529 http://gcc.gnu.org/codingconventions.html
531 In addition, because it doesn't seem to be stated explicitly anywhere
532 else, there is an 80 column source limit.
534 ChangeLog entries for member functions should use the
535 classname::member function name syntax as follows:
537 1999-04-15 Dennis Ritchie <dr@att.com>
539 * src/basic_file.cc (__basic_file::open): Fix thinko in
540 _G_HAVE_IO_FILE_OPEN bits.
542 Notable areas of divergence from what may be previous local practice
543 (particularly for GNU C) include:
545 01. Pointers and references
549 char *p = "flop"; // wrong
550 char &c = *p; // wrong
552 Reason: In C++, definitions are mixed with executable code. Here,
553 p is being initialized, not *p. This is near-universal
554 practice among C++ programmers; it is normal for C hackers
555 to switch spontaneously as they gain experience.
557 02. Operator names and parentheses
560 operator == (type) // wrong
562 Reason: The == is part of the function name. Separating
563 it makes the declaration look like an expression.
565 03. Function names and parentheses
568 void mangle () // wrong
570 Reason: no space before parentheses (except after a control-flow
571 keyword) is near-universal practice for C++. It identifies the
572 parentheses as the function-call operator or declarator, as
573 opposed to an expression or other overloaded use of parentheses.
575 04. Template function indentation
576 template<typename T>
578 template_function(args)
581 template<class T>
582 void template_function(args) {};
Reason: In class definitions, without indentation, whitespace is
needed both above and below the declaration to distinguish
it visually from other members. (Also, re: "typename"
rather than "class": T often could be int, which is
not a class. "class", here, is an anachronism.)
590 05. Template class indentation
591 template<typename _CharT, typename _Traits>
592 class basic_ios : public ios_base
598 template<class _CharT, class _Traits>
599 class basic_ios : public ios_base
605 template<class _CharT, class _Traits>
606 class basic_ios : public ios_base
620 enum { space = _ISspace, print = _ISprint, cntrl = _IScntrl };
622 07. Member initialization lists
623 All one line, separate from class name.
: _M_private_data(0), _M_more_stuff(0), _M_helper(0)
gribble::gribble() : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
09. Member function declarations and definitions
Keywords such as extern, static, export, explicit, inline, etc.
go on the line above the function name. Thus
Reason: GNU coding conventions dictate that, for definitions, return
types go on a separate line from the function name and parameter list.
For C++, where we have member functions that can
be either inline definitions or declarations, keeping to this
standard allows all member function names for a given class to be
aligned to the same margin, increasing readability.
665 10. Invocation of member functions with "this->"
666 For non-uglified names, use this->name to call the function.
672 Reason: Koenig lookup.
686 12. Spacing under protected and private in class declarations:
687 space above, none below
13. Spacing WRT return statements.
no extra spacing before returns, no parentheses
716 14. Location of global variables.
717 All global variables of class type, whether in the "user visible"
718 space (e.g., cin) or the implementation namespace, must be defined
719 as a character array with the appropriate alignment and then later
720 re-initialized to the correct value.
722 This is due to startup issues on certain platforms, such as AIX.
723 For more explanation and examples, see src/globals.cc. All such
724 variables should be contained in that file, for simplicity.
726 15. Exception abstractions
727 Use the exception abstractions found in functexcept.h, which allow
728 C++ programmers to use this library with -fno-exceptions. (Even if
729 that is rarely advisable, it's a necessary evil for backwards
732 16. Exception error messages
733 All start with the name of the function where the exception is
734 thrown, and then (optional) descriptive text is added. Example:
736 __throw_logic_error(__N("basic_string::_S_construct NULL not valid"));
738 Reason: The verbose terminate handler prints out exception::what(),
739 as well as the typeinfo for the thrown exception. As this is the
740 default terminate handler, by putting location info into the
741 exception string, a very useful error message is printed out for
742 uncaught exceptions. So useful, in fact, that non-programmers can
743 give useful error messages, and programmers can intelligently
744 speculate what went wrong without even using a debugger.
746 17. The doxygen style guide to comments is a separate document,
749 The library currently has a mixture of GNU-C and modern C++ coding
750 styles. The GNU C usages will be combed out gradually.
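Item 14 above (global variables of class type) can be sketched as follows. This is only an illustration of the general technique, raw aligned storage plus later construction in place; it does not reproduce src/globals.cc. The names example_stream, example_stream_buf, and init_example_stream are hypothetical, and alignas assumes a C++11 compiler.

```cpp
#include <new>

// Hypothetical class standing in for a library object such as cin.
class example_stream
{
public:
  explicit
  example_stream(int fd) : _M_fd(fd) { }

  int
  descriptor() const
  { return _M_fd; }

private:
  int _M_fd;
};

// Raw storage with the alignment of the real type. No constructor runs
// during static initialization, so start-up ordering is not an issue.
alignas(example_stream) char example_stream_buf[sizeof(example_stream)];

// Called explicitly during library start-up: constructs the object in
// place in the buffer and returns a reference to it.
example_stream&
init_example_stream(int fd)
{ return *new (example_stream_buf) example_stream(fd); }
```

The object is reachable through the buffer's address once init_example_stream has run; before that point the storage exists but holds no constructed object.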
754 For nonstandard names appearing in Standard headers, we are constrained
755 to use names that begin with underscores. This is called "uglification".
758 Local and argument names: __[a-z].*
760 Examples: __count __ix __s1
762 Type names and template formal-argument names: _[A-Z][^_].*
764 Examples: _Helper _CharT _N
766 Member data and function names: _M_.*
768 Examples: _M_num_elements _M_initialize ()
770 Static data members, constants, and enumerations: _S_.*
772 Examples: _S_max_elements _S_default_value
Don't use names in the same scope that differ only in the prefix,
e.g. _S_top and _M_top. See BADNAMES for a list of forbidden names.
(The most tempting of these seem to be "_T" and "__sz".)
778 Names must never have "__" internally; it would confuse name
779 unmanglers on some targets. Also, never use "__[0-9]", same reason.
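As a compact illustration of the name patterns above, here is a small sketch of how a library-internal helper might look. The class _Fixed_buffer and all of its members are hypothetical and exist only to show the prefixes. Names of this form are reserved to the implementation, so they belong in library headers and not in user code.

```cpp
#include <cstddef>

// Type and template-parameter names: _[A-Z][^_].*
template<typename _Tp, std::size_t _Nm>
  class _Fixed_buffer
  {
  public:
    _Fixed_buffer() : _M_num_elements(0) { }

    // Member function names: _M_.*   Argument names: __[a-z].*
    bool
    _M_push(const _Tp& __value)
    {
      if (_M_num_elements == _S_max_elements)
        return false;
      _M_data[_M_num_elements++] = __value;
      return true;
    }

    std::size_t
    _M_size() const
    { return _M_num_elements; }

  private:
    // Static data members and constants: _S_.*
    static const std::size_t _S_max_elements = _Nm;

    // Member data: _M_.*
    _Tp         _M_data[_Nm];
    std::size_t _M_num_elements;
  };
```

Note that no name contains "__" internally and none begins with "__" followed by a digit.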
781 --------------------------
795 gribble(const gribble&);
798 gribble(int __howmany);
801 operator=(const gribble&);
806 // Start with a capital letter, end with a period.
808 public_member(const char* __arg) const;
810 // In-class function definitions should be restricted to one-liners.
one_line() { return 0; }
two_lines(const char* __arg)
{ return strchr(__arg, 'a'); }
819 three_lines(); // inline, but defined below.
822 template<typename _Formal_argument>
824 public_template() const throw();
826 template<typename _Iterator>
836 int _M_private_function();
845 _S_initialize_library();
848 // More-or-less-standard language features described by lack, not presence.
849 # ifndef _G_NO_LONGLONG
850 extern long long _G_global_with_a_good_long_name; // avoid globals!
853 // Avoid in-class inline definitions, define separately;
854 // likewise for member class definitions:
856 gribble::public_member() const
857 { int __local = 0; return __local; }
859 class gribble::_Helper
863 friend class gribble;
867 // Names beginning with "__": only for arguments and
868 // local variables; never use "__" in a type name, or
869 // within any name; never use "__[0-9]".
871 #endif /* _HEADER_ */
876 template<typename T> // notice: "typename", not "class", no space
877 long_return_value_type<with_many, args>
878 function_name(char* pointer, // "char *pointer" is wrong.
880 const Reference& ref)
882 // int a_local; /* wrong; see below. */
888 int a_local = 0; // declare variable at first use.
890 // char a, b, *p; /* wrong */
const char* c = "abc"; // each variable goes on its own line, always.
895 // except maybe here...
896 for (unsigned i = 0, mask = 1; mask; ++i, mask <<= 1) {
: _M_private_data(0), _M_more_stuff(0), _M_helper(0)
906 gribble::three_lines()
908 // doesn't fit in one line.
915 <sect1 id="contrib.doc_style" xreflabel="Documentation Style">
916 <?dbhtml filename="documentation_style.html"?>
917 <title>Documentation Style</title>
918 <sect2 id="doc_style.doxygen">
919 <title>Doxygen</title>
920 <sect3 id="doxygen.prereq">
921 <title>Prerequisites</title>
923 Prerequisite tools are Bash 2.0 or later,
924 <ulink url="http://www.doxygen.org/">Doxygen</ulink>, and
925 the <ulink url="http://www.gnu.org/software/coreutils/">GNU
926 coreutils</ulink>. (GNU versions of find, xargs, and possibly
927 sed and grep are used, just because the GNU versions make
932 To generate the pretty pictures and hierarchy
934 <ulink url="http://www.graphviz.org">Graphviz</ulink> package
935 will need to be installed. For PDF
936 output, <ulink url="http://www.tug.org/applications/pdftex/">
937 pdflatex</ulink> is required.
941 <sect3 id="doxygen.rules">
942 <title>Generating the Doxygen Files</title>
944 The following Makefile rules run Doxygen to generate HTML
945 docs, XML docs, PDF docs, and the man pages.
949 <screen><userinput>make doc-html-doxygen</userinput></screen>
953 <screen><userinput>make doc-xml-doxygen</userinput></screen>
957 <screen><userinput>make doc-pdf-doxygen</userinput></screen>
961 <screen><userinput>make doc-man-doxygen</userinput></screen>
965 Careful observers will see that the Makefile rules simply call
966 a script from the source tree, <filename>run_doxygen</filename>, which
967 does the actual work of running Doxygen and then (most
968 importantly) massaging the output files. If for some reason
969 you prefer to not go through the Makefile, you can call this
970 script directly. (Start by passing <literal>--help</literal>.)
974 If you wish to tweak the Doxygen settings, do so by editing
975 <filename>doc/doxygen/user.cfg.in</filename>. Notes to fellow
976 library hackers are written in triple-# comments.
981 <sect3 id="doxygen.markup">
982 <title>Markup</title>
985 In general, libstdc++ files should be formatted according to
986 the rules found in the
987 <link linkend="contrib.coding_style">Coding Standard</link>. Before
988 any doxygen-specific formatting tweaks are made, please try to
989 make sure that the initial formatting is sound.
993 Adding Doxygen markup to a file (informally called
994 <quote>doxygenating</quote>) is very simple. The Doxygen manual can be
996 <ulink url="http://www.stack.nl/~dimitri/doxygen/download.html#latestman">here</ulink>.
We try to use a very recent version of Doxygen.
1002 <classname>deque</classname>/<classname>vector</classname>/<classname>list</classname>
1003 and <classname>std::pair</classname> as examples. For
1004 functions, see their member functions, and the free functions
1005 in <filename>stl_algobase.h</filename>. Member functions of
1006 other container-like types should read similarly to these
1011 Some commentary to accompany
1012 the first list in the <ulink url="http://www.stack.nl/~dimitri/doxygen/docblocks.html">Special
1013 Documentation Blocks</ulink> section of
1019 <para>For longer comments, use the Javadoc style...</para>
1024 ...not the Qt style. The intermediate *'s are preferred.
1030 Use the triple-slash style only for one-line comments (the
1031 <quote>brief</quote> mode).
1037 This is disgusting. Don't do this.
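The two recommended comment forms (a Javadoc-style block with intermediate asterisks for longer comments, and triple-slash for one-line briefs) might look like the sketch below. The functions clamp_to_range and is_even are hypothetical and are not part of the library; only the comment layout is the point.

```cpp
/**
 *  @brief  Clamps a value to a closed range.
 *  @param  value  The value to clamp.
 *  @param  low    Lower bound of the range.
 *  @param  high   Upper bound of the range; must not be less than @a low.
 *  @return  @a low if @a value is below it, @a high if @a value is
 *           above it, otherwise @a value.
 */
inline int
clamp_to_range(int value, int low, int high)
{ return value < low ? low : (high < value ? high : value); }

/// Returns true if @a value is even.
inline bool
is_even(int value)
{ return value % 2 == 0; }
```

The @-style commands and the @a markup for parameter names follow the guidelines given in this section.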
1043 Some specific guidelines:
1047 Use the @-style of commands, not the !-style. Please be
1048 careful about whitespace in your markup comments. Most of the
1049 time it doesn't matter; doxygen absorbs most whitespace, and
1050 both HTML and *roff are agnostic about whitespace. However,
1051 in <pre> blocks and @code/@endcode sections, spacing can
1052 have <quote>interesting</quote> effects.
1056 Use either kind of grouping, as
1057 appropriate. <filename>doxygroups.cc</filename> exists for this
1058 purpose. See <filename>stl_iterator.h</filename> for a good example
1059 of the <quote>other</quote> kind of grouping.
1063 Please use markup tags like @p and @a when referring to things
1064 such as the names of function parameters. Use @e for emphasis
1065 when necessary. Use @c to refer to other standard names.
1066 (Examples of all these abound in the present code.)
1070 Complicated math functions should use the multi-line
1071 format. An example from <filename>random.h</filename>:
1077 * @brief A model of a linear congruential random number generator.
1080 * x_{i+1}\leftarrow(ax_{i} + c) \bmod m
Be careful about using certain special characters when
writing Doxygen comments. Quotes (single and double) and
separators in filenames are two common trouble spots. When in
doubt, consult the following table.
1094 <title>HTML to Doxygen Markup Comparison</title>
1095 <tgroup cols='2' align='left' colsep='1' rowsep='1'>
1096 <colspec colname='c1'></colspec>
1097 <colspec colname='c2'></colspec>
1102 <entry>Doxygen</entry>
1113 <entry>"</entry>
1118 <entry>'</entry>
1123 <entry><i></entry>
1124 <entry>@a word</entry>
1128 <entry><b></entry>
1129 <entry>@b word</entry>
1133 <entry><code></entry>
1134 <entry>@c word</entry>
1138 <entry><em></entry>
<entry>@e word</entry>
1143 <entry><em></entry>
1144 <entry><em>two words or more</em></entry>
1156 <sect2 id="doc_style.docbook">
<title>DocBook</title>
1159 <sect3 id="docbook.prereq">
1160 <title>Prerequisites</title>
1162 Editing the DocBook sources requires an XML editor. Many
1163 exist: some notable options
1164 include <command>emacs</command>, <application>Kate</application>,
1165 or <application>Conglomerate</application>.
1169 Some editors support special <quote>XML Validation</quote>
1170 modes that can validate the file as it is
1171 produced. Recommended is the <command>nXML Mode</command>
1172 for <command>emacs</command>.
1176 Besides an editor, additional DocBook files and XML tools are
1181 Access to the DocBook stylesheets and DTD is required. The
1182 stylesheets are usually packaged by vendor, in something
1183 like <filename>docbook-style-xsl</filename>. To exactly match
1184 generated output, please use a version of the stylesheets
1186 to <filename>docbook-style-xsl-1.74.0-5</filename>. The
1187 installation directory for this package corresponds to
1188 the <literal>XSL_STYLE_DIR</literal>
1189 in <filename>doc/Makefile.am</filename> and defaults
1190 to <filename class="directory">/usr/share/sgml/docbook/xsl-stylesheets</filename>.
For processing XML, an XML processor and some style
sheets are necessary. The default processor is <command>xsltproc</command>,
provided by <filename>libxslt</filename>.
1200 For validating the XML document, you'll need
1201 something like <command>xmllint</command> and access to the
1202 DocBook DTD. These are provided
1203 by a vendor package like <filename>libxml2</filename>.
1207 For PDF output, something that transforms valid XML to PDF is
1208 required. Possible solutions include
1209 <ulink url="http://dblatex.sourceforge.net">dblatex</ulink>,
1210 <command>xmlto</command>, or <command>prince</command>. Other
1211 options are listed on the DocBook
1212 web <ulink url="http://wiki.docbook.org/topic/DocBookPublishingTools">pages</ulink>. Please
1213 consult the <email>libstdc++@gcc.gnu.org</email> list when
1214 preparing printed manuals for current best practice and
Make sure that the XML documentation and markup are valid for
any change. This is easily done with the validation rules
in the <filename>Makefile</filename>, which are equivalent to running:
1226 xmllint --noout --valid <filename>xml/index.xml</filename>
1231 <sect3 id="docbook.rules">
1232 <title>Generating the DocBook Files</title>
1235 The following Makefile rules generate (in order): an HTML
1236 version of all the DocBook documentation, a PDF version of the same, a
1237 single XML document, and the result of validating the entire XML
1242 <screen><userinput>make doc-html-docbook</userinput></screen>
1246 <screen><userinput>make doc-pdf-docbook</userinput></screen>
1250 <screen><userinput>make doc-xml-single-docbook</userinput></screen>
1254 <screen><userinput>make doc-xml-validate-docbook</userinput></screen>
1259 <sect3 id="docbook.examples">
1260 <title>File Organization and Basics</title>
1263 <emphasis>Which files are important</emphasis>
All DocBook files are in the directory
1266 libstdc++-v3/doc/xml
1268 Inside this directory, the files of importance:
1269 spine.xml - index to documentation set
1270 manual/spine.xml - index to manual
1271 manual/*.xml - individual chapters and sections of the manual
1272 faq.xml - index to FAQ
1273 api.xml - index to source level / API
1275 All *.txml files are template xml files, i.e., otherwise empty files with
1276 the correct structure, suitable for filling in with new information.
1278 <emphasis>Canonical Writing Style</emphasis>
1282 member function template
1283 (via C++ Templates, Vandevoorde)
1285 class in namespace std: allocator, not std::allocator
1287 header file: iostream, not <iostream>
1290 <emphasis>General structure</emphasis>
1325 <sect3 id="docbook.markup">
1326 <title>Markup By Example</title>
Complete details on DocBook markup can be found in the DocBook
1331 <ulink url="http://www.docbook.org/tdg/en/html/part2.html">online</ulink>.
An incomplete reference for HTML to DocBook conversion is
1333 detailed in the table below.
<title>HTML to DocBook XML Markup Comparison</title>
1338 <tgroup cols='2' align='left' colsep='1' rowsep='1'>
1339 <colspec colname='c1'></colspec>
1340 <colspec colname='c2'></colspec>
<entry>DocBook</entry>
1351 <entry><p></entry>
1352 <entry><para></entry>
1355 <entry><pre></entry>
1356 <entry><computeroutput>, <programlisting>,
1357 <literallayout></entry>
1360 <entry><ul></entry>
1361 <entry><itemizedlist></entry>
1364 <entry><ol></entry>
1365 <entry><orderedlist></entry>
<entry><li></entry>
1369 <entry><listitem></entry>
1372 <entry><dl></entry>
1373 <entry><variablelist></entry>
1376 <entry><dt></entry>
1377 <entry><term></entry>
1380 <entry><dd></entry>
1381 <entry><listitem></entry>
1385 <entry><a href=""></entry>
1386 <entry><ulink url=""></entry>
1389 <entry><code></entry>
1390 <entry><literal>, <programlisting></entry>
1393 <entry><strong></entry>
1394 <entry><emphasis></entry>
1397 <entry><em></entry>
1398 <entry><emphasis></entry>
1401 <entry>"</entry>
1402 <entry><quote></entry>
1409 And examples of detailed markup for which there are no real HTML
1410 equivalents are listed in the table below.
<title>DocBook XML Element Use</title>
1415 <tgroup cols='2' align='left' colsep='1' rowsep='1'>
1416 <colspec colname='c1'></colspec>
1417 <colspec colname='c2'></colspec>
1421 <entry>Element</entry>
1428 <entry><structname></entry>
1429 <entry><structname>char_traits</structname></entry>
1432 <entry><classname></entry>
1433 <entry><classname>string</classname></entry>
1436 <entry><function></entry>
1438 <para><function>clear()</function></para>
1439 <para><function>fs.clear()</function></para>
1443 <entry><type></entry>
1444 <entry><type>long long</type></entry>
1447 <entry><varname></entry>
1448 <entry><varname>fs</varname></entry>
1451 <entry><literal></entry>
1453 <para><literal>-Weffc++</literal></para>
1454 <para><literal>rel_ops</literal></para>
1458 <entry><constant></entry>
1460 <para><constant>_GNU_SOURCE</constant></para>
1461 <para><constant>3.0</constant></para>
1465 <entry><command></entry>
1466 <entry><command>g++</command></entry>
1469 <entry><errortext></entry>
1470 <entry><errortext>In instantiation of</errortext></entry>
1473 <entry><filename></entry>
1475 <para><filename class="headerfile">ctype.h</filename></para>
1476 <para><filename class="directory">/home/gcc/build</filename></para>
1477 <para><filename class="libraryfile">libstdc++.so</filename></para>
1487 <sect2 id="doc_style.combines">
1488 <title>Combines</title>
1490 <sect3 id="combines.rules">
1491 <title>Generating Combines and Assemblages</title>
1494 The following Makefile rules are defaults, and are usually
1495 aliased to variable rules.
1499 <screen><userinput>make doc-html</userinput></screen>
1503 <screen><userinput>make doc-man</userinput></screen>
1507 <screen><userinput>make doc-pdf</userinput></screen>
1513 <sect1 id="contrib.design_notes" xreflabel="Design Notes">
1514 <?dbhtml filename="source_design_notes.html"?>
1515 <title>Design Notes</title>
This paper covers two major areas:
1526 - Features and policies not mentioned in the standard that
1527 the quality of the library implementation depends on, including
1528 extensions and "implementation-defined" features;
1530 - Plans for required but unimplemented library features and
1531 optimizations to them.
1536 The standard defines a large library, much larger than the standard
1537 C library. A naive implementation would suffer substantial overhead
1538 in compile time, executable size, and speed, rendering it unusable
1539 in many (particularly embedded) applications. The alternative demands
1540 care in construction, and some compiler support, but there is no
1541 need for library subsets.
1543 What are the sources of this overhead? There are four main causes:
1545 - The library is specified almost entirely as templates, which
1546 with current compilers must be included in-line, resulting in
1547 very slow builds as tens or hundreds of thousands of lines
1548 of function definitions are read for each user source file.
1549 Indeed, the entire SGI STL, as well as the dos Reis valarray,
1550 are provided purely as header files, largely for simplicity in
1551 porting. Iostream/locale is (or will be) as large again.
1553 - The library is very flexible, specifying a multitude of hooks
1554 where users can insert their own code in place of defaults.
1555 When these hooks are not used, any time and code expended to
1556 support that flexibility is wasted.
- Templates are often described as causing "code bloat". In
practice, this refers (when it refers to anything real) to several
independent processes. First, when a class template is manually
instantiated in its entirety, current compilers place the definitions
for all members in a single object file, so that a program linking
to one member gets definitions of all. Second, template functions
which do not actually depend on the template argument are, under
current compilers, generated anew for each instantiation, rather
than being shared with other instantiations. Third, some of the
flexibility mentioned above comes from virtual functions (both in
regular classes and template classes) which current linkers add
to the executable file even when they manifestly cannot be called.
1571 - The library is specified to use a language feature, exceptions,
1572 which in the current gcc compiler ABI imposes a run time and
1573 code space cost to handle the possibility of exceptions even when
1574 they are not used. Under the new ABI (accessed with -fnew-abi),
1575 there is a space overhead and a small reduction in code efficiency
1576 resulting from optimization opportunities lost to the non-local
1577 branches that exceptions introduce.
1579 What can be done to eliminate this overhead? A variety of coding
1580 techniques, and compiler, linker and library improvements and
1581 extensions may be used, as covered below. Most are not difficult,
1582 and some are already implemented in varying degrees.
1584 Overhead: Compilation Time
1585 --------------------------
1587 Providing "ready-instantiated" template code in object code archives
1588 allows us to avoid generating and optimizing template instantiations
1589 in each compilation unit which uses them. However, the number of such
1590 instantiations that are useful to provide is limited, and anyway this
1591 is not enough, by itself, to minimize compilation time. In particular,
1592 it does not reduce time spent parsing conforming headers.
1594 Quicker header parsing will depend on library extensions and compiler
1595 improvements. One approach is some variation on the techniques
1596 previously marketed as "pre-compiled headers", now standardized as
1597 support for the "export" keyword. "Exported" template definitions
1598 can be placed (once) in a "repository" -- really just a library, but
1599 of template definitions rather than object code -- to be drawn upon
1600 at link time when an instantiation is needed, rather than placed in
1601 header files to be parsed along with every compilation unit.
1603 Until "export" is implemented we can put some of the lengthy template
1604 definitions in #if guards or alternative headers so that users can skip
1605 over the full definitions when they need only the ready-instantiated
1606 specializations.
1608 To be precise, this means that certain headers which define
1609 templates which users normally use only for certain arguments
1610 can be instrumented to avoid exposing the template definitions
1611 to the compiler unless a macro is defined. For example, in
1612 <string>, we might have:
1614 template <class _CharT, ... > class basic_string {
1615 ... // member declarations
1616 };
1617 ... // operator declarations
1620 # if _G_NO_TEMPLATE_EXPORT
1621 # include <bits/std_locale.h> // headers needed by definitions
1623 # include <bits/string.tcc> // member and global template definitions.
1624 # endif
1627 Users who compile without specifying a strict-ISO-conforming flag
1628 would not see many of the template definitions they now see, and rely
1629 instead on ready-instantiated specializations in the library. This
1630 technique would be useful for the following substantial components:
1631 string, locale/iostreams, valarray. It would *not* be useful or
1632 usable with the following: containers, algorithms, iterators,
1633 allocator. Since these constitute a large (though decreasing)
1634 fraction of the library, the benefit the technique offers is
1635 limited.
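The split between declarations that users see and ready-instantiated
specializations that the library ships can be sketched in a single file.
This sketch is an assumption-laden illustration, not the library's code:
it uses the explicit instantiation declaration ("extern template") that
was standardized later in C++11 instead of the macro guard proposed
above, and Buffer and count_marks are hypothetical names.

```cpp
#include <cstddef>

// What user code would see: the class with member declarations only.
template <class CharT>
class Buffer {
public:
    explicit Buffer(CharT mark) : mark_(mark) {}
    // Declared here, defined out of line further down.
    std::size_t count_marks(const CharT* s, std::size_t n) const;
private:
    CharT mark_;
};

// Explicit instantiation declaration: Buffer<char> is compiled elsewhere,
// so a translation unit that stops reading here needs no definitions.
extern template class Buffer<char>;

// ---- Library side: what a ".tcc" file plus one source file would hold ----
template <class CharT>
std::size_t Buffer<CharT>::count_marks(const CharT* s, std::size_t n) const {
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (s[i] == mark_) ++hits;
    return hits;
}

// Explicit instantiation definition: the one ready-made specialization.
template class Buffer<char>;
```

In a real build the last two parts would sit in the library's own source
files, and only the first two would be parsed by user code.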
1637 The language specifies the semantics of the "export" keyword, but
1638 the gcc compiler does not yet support it. When it does, problems
1639 with large template inclusions can largely disappear, given some
1640 minor library reorganization, along with the need for the apparatus
1641 described above.
1643 Overhead: Flexibility Cost
1644 --------------------------
1646 The library offers many places where users can specify operations
1647 to be performed by the library in place of defaults. Sometimes
1648 this seems to require that the library use a more-roundabout, and
1649 possibly slower, way to accomplish the default requirements than
1650 would be used otherwise.
1652 The primary protection against this overhead is thorough compiler
1653 optimization, to crush out layers of inline function interfaces.
1654 Kuck & Associates has demonstrated the practicality of this kind
1655 of optimization.
1657 The second line of defense against this overhead is explicit
1658 specialization. By defining helper function templates, and writing
1659 specialized code for the default case, overhead can be eliminated
1660 for that case without sacrificing flexibility. This takes full
1661 advantage of any ability of the optimizer to crush out degenerate
1662 code.
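A minimal sketch of the helper-function approach follows. Function
templates cannot be partially specialized, so this sketch uses a more
specialized overload of the helper for the default comparison; the names
smallest and smallest_impl are hypothetical and do not appear in the
library.

```cpp
#include <functional>
#include <iterator>

// General path: calls through whatever comparison object the user supplied.
template <class It, class Compare>
It smallest_impl(It first, It last, Compare comp) {
    if (first == last) return last;
    It best = first;
    for (++first; first != last; ++first)
        if (comp(*first, *best)) best = first;
    return best;
}

// Default-case path: selected by partial ordering when the comparison is
// std::less<T>, so the loop uses operator< directly with no hook in between.
template <class It, class T>
It smallest_impl(It first, It last, std::less<T>) {
    if (first == last) return last;
    It best = first;
    for (++first; first != last; ++first)
        if (*first < *best) best = first;
    return best;
}

// Public interface with a user-supplied comparison.
template <class It, class Compare>
inline It smallest(It first, It last, Compare comp) {
    return smallest_impl(first, last, comp);
}

// Public interface for the default case.
template <class It>
inline It smallest(It first, It last) {
    typedef typename std::iterator_traits<It>::value_type value_type;
    return smallest_impl(first, last, std::less<value_type>());
}
```

Users keep the full flexibility of the three-argument form, while the
two-argument form never touches the comparison object.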
1664 The library specifies many virtual functions which current linkers
1665 load even when they cannot be called. Some minor improvements to the
1666 compiler and to ld would eliminate any such overhead by simply
1667 omitting virtual functions that the complete program does not call.
1668 A prototype of this work has already been done. For targets where
1669 GNU ld is not used, a "pre-linker" could do the same job.
1671 The main areas in the standard interface where user flexibility
1672 can result in overhead are:
1674 - Allocators: Containers are specified to use user-definable
1675 allocator types and objects, making tuning for the container
1676 characteristics tricky.
1678 - Locales: the standard specifies locale objects used to implement
1679 iostream operations, involving many virtual functions which use
1680 streambuf iterators.
1682 - Algorithms and containers: these may be instantiated on any type,
1683 frequently duplicating code for identical operations.
1685 - Iostreams and strings: users are permitted to use these on their
1686 own types, and specify the operations the stream must use on these
1687 types.
1689 Note that these sources of overhead are _avoidable_. The techniques
1690 to avoid them are covered below.
1692 Overhead: Code Size
1693 -------------------
1695 In the SGI STL, and in some other headers, many of the templates
1696 are defined "inline" -- either explicitly or by their placement
1697 in class definitions -- which should not be inline. This is a
1698 source of code bloat. Matt had remarked that he was relying on
1699 the compiler to recognize what was too big to benefit from inlining,
1700 and generate it out-of-line automatically. However, this also can
1701 result in code bloat except where the linker can eliminate the extra
1702 copies.
1704 Fixing these cases will require an audit of all inline functions
1705 defined in the library to determine which merit inlining, and moving
1706 the rest out of line. This is an issue mainly in chapters 23, 25, and
1707 27. Of course it can be done incrementally, and we should generally
1708 accept patches that move large functions out of line and into ".tcc"
1709 files, which can later be pulled into a repository. Compiler/linker
1710 improvements to recognize very large inline functions and move them
1711 out-of-line, but shared among compilation units, could make this
1712 work unnecessary.
1714 Pre-instantiating template specializations currently produces large
1715 amounts of dead code which bloats statically linked programs. The
1716 current state of the static library, libstdc++.a, is intolerable on
1717 this account, and will fuel further confused speculation about a need
1718 for a library "subset". A compiler improvement that treats each
1719 instantiated function as a separate object file, for linking purposes,
1720 would be one solution to this problem. An alternative would be to
1721 split up the manual instantiation files into dozens upon dozens of
1722 little files, each compiled separately, but an abortive attempt at
1723 this was done for <string> and, though it is far from complete, it
1724 is already a nuisance. A better interim solution (just until we have
1725 "export") is badly needed.
1727 When building a shared library, the current compiler/linker cannot
1728 automatically generate the instantiations needed. This creates a
1729 miserable situation; it means any time something is changed in the
1730 library, before a shared library can be built someone must manually
1731 copy the declarations of all templates that are needed by other parts
1732 of the library to an "instantiation" file, and add it to the build
1733 system to be compiled and linked to the library. This process is
1734 readily automated, and should be automated as soon as possible.
1735 Users building their own shared libraries experience identical
1736 problems.
1738 Sharing common aspects of template definitions among instantiations
1739 can radically reduce code bloat. The compiler could help a great
1740 deal here by recognizing when a function depends on nothing about
1741 a template parameter, or only on its size, and giving the resulting
1742 function a link-name "equate" that allows it to be shared with other
1743 instantiations. Implementation code could take advantage of the
1744 capability by factoring out code that does not depend on the template
1745 argument into separate functions to be merged by the compiler.
1747 Until such a compiler optimization is implemented, much can be done
1748 manually (if tediously) in this direction. One such optimization is
1749 to derive class templates from non-template classes, and move as much
1750 implementation as possible into the base class. Another is to partial-
1751 specialize certain common instantiations, such as vector<T*>, to share
1752 code for instantiations on all types T. While these techniques work,
1753 they are far from the complete solution that a compiler improvement
1754 would provide.
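The first manual technique, moving implementation into a non-template
base class, might look like the sketch below. PtrStackBase and PtrStack
are hypothetical names, and the sketch assumes T is not const-qualified.

```cpp
#include <cstddef>
#include <vector>

// Non-template base: one copy of this code serves every PtrStack<T>.
class PtrStackBase {
public:
    std::size_t size() const { return items_.size(); }
    bool empty() const { return items_.empty(); }
protected:
    void push_raw(void* p) { items_.push_back(p); }
    void* pop_raw() {            // precondition: the stack is not empty
        void* p = items_.back();
        items_.pop_back();
        return p;
    }
private:
    std::vector<void*> items_;
};

// Thin typed wrapper: only the casts are instantiated per T.
template <class T>
class PtrStack : private PtrStackBase {
public:
    void push(T* p) { push_raw(p); }
    T* pop() { return static_cast<T*>(pop_raw()); }
    using PtrStackBase::size;
    using PtrStackBase::empty;
};
```

Every instantiation of PtrStack shares the single std::vector<void*>
instantiation inside the base, which is the same effect a partial
specialization such as vector<T*> aims for.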
1756 Overhead: Expensive Language Features
1757 -------------------------------------
1759 The main "expensive" language feature used in the standard library
1760 is exception support, which requires compiling in cleanup code with
1761 static table data to locate it, and linking in library code to use
1762 the table. For small embedded programs the amount of such library
1763 code and table data is assumed by some to be excessive. Under the
1764 "new" ABI this perception is generally exaggerated, although in some
1765 cases it may actually be excessive.
1767 To implement a library which does not use exceptions directly is
1768 not difficult given minor compiler support (to "turn off" exceptions
1769 and ignore exception constructs), and results in no great library
1770 maintenance difficulties. To be precise, given "-fno-exceptions",
1771 the compiler should treat "try" blocks as ordinary blocks, and
1772 "catch" blocks as dead code to ignore or eliminate. Compiler
1773 support is not strictly necessary, except in the case of "function
1774 try blocks"; otherwise the following macros almost suffice:
1777 #define try if (true)
1778 #define catch(X) else if (false)
1780 However, there may be a need to use function try blocks in the
1781 library implementation, and use of macros in this way can make
1782 correct diagnostics impossible. Furthermore, use of this scheme
1783 would require the library to call a function to re-throw exceptions
1784 from a try block. Implementing the above semantics in the compiler
1785 is preferable.
1787 Given the support above (however implemented) it only remains to
1788 replace code that "throws" with a call to a well-documented "handler"
1789 function in a separate compilation unit which may be replaced by
1790 the user. The main source of exceptions that would be difficult
1791 for users to avoid is memory allocation failures, but users can
1792 define their own memory allocation primitives that never throw.
1793 Otherwise, the complete list of such handlers, and which library
1794 functions may call them, would be needed for users to be able to
1795 implement the necessary substitutes. (Fortunately, they have the
1796 source code.)
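A rough sketch of replacing a "throw" with a call to a user-replaceable
handler is given below. To keep the example in one file it swaps in a
run-time function pointer where the text describes link-time replacement
of a separate compilation unit. All names in namespace sketch are
hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

namespace sketch {

// Signature of a failure handler (hypothetical).
typedef void (*range_handler)(const char* what);

// Default behaviour when exceptions are unavailable: report and abort.
inline void default_range_handler(const char* what) {
    std::fprintf(stderr, "range error: %s\n", what);
    std::abort();
}

// Storage for the currently installed handler.
inline range_handler& current_range_handler() {
    static range_handler handler = default_range_handler;
    return handler;
}

// Installs a new handler and returns the previous one.
inline range_handler set_range_handler(range_handler h) {
    range_handler old = current_range_handler();
    current_range_handler() = h;
    return old;
}

// Library code calls the handler where it would otherwise write "throw".
template <class T, std::size_t N>
T* checked_at(T (&a)[N], std::size_t i) {
    if (i < N) return &a[i];
    current_range_handler()("checked_at: index out of range");
    return 0;  // reached only if the installed handler returns
}

}  // namespace sketch
```

A user who cannot tolerate the default abort installs a handler with
set_range_handler before calling into the library.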
1801 The template capabilities of C++ offer enormous opportunities for
1802 optimizing common library operations, well beyond what would be
1803 considered "eliminating overhead". In particular, many operations
1804 done in Glibc with macros that depend on proprietary language
1805 extensions can be implemented in pristine Standard C++. For example,
1806 the chapter 25 algorithms, and even C library functions such as strchr,
1807 can be specialized for the case of static arrays of known (small) size.
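As an illustration of the static-array case, a search can take the array
by reference so that its size is a compile-time constant. This is only a
sketch with a hypothetical name, find_in_array. Unlike strchr, it
examines all N elements and does not stop at an embedded NUL.

```cpp
#include <cstddef>

// N is deduced from the array type, so the loop bound is known at compile
// time and a small loop can be unrolled completely by the optimizer.
template <std::size_t N>
inline const char* find_in_array(const char (&a)[N], char c) {
    for (std::size_t i = 0; i < N; ++i)
        if (a[i] == c) return a + i;
    return 0;  // not found
}
```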
1809 Detailed optimization opportunities are identified below where
1810 the component where they would appear is discussed. Of course new
1811 opportunities will be identified during implementation.
1813 Unimplemented Required Library Features
1814 ---------------------------------------
1816 The standard specifies hundreds of components, grouped broadly by
1817 chapter. These are listed in excruciating detail in the CHECKLIST
1818 file.
1831 Annex D backward compatibility
1833 Anyone participating in implementation of the library should obtain
1834 a copy of the standard, ISO 14882. People in the U.S. can obtain an
1835 electronic copy for US$18 from ANSI's web site. Those from other
1836 countries should visit http://www.iso.org/ to find out the location
1837 of their country's representation in ISO, in order to know who can
1838 sell them a copy.
1840 The emphasis in the following sections is on unimplemented features
1841 and optimization opportunities.
1846 Chapter 17 concerns overall library requirements.
1848 The standard doesn't mention threads. A multi-thread (MT) extension
1849 primarily affects operators new and delete (18), allocator (20),
1850 string (21), locale (22), and iostreams (27). The common underlying
1851 support needed for this is discussed under chapter 20.
1853 The standard requirements on names from the C headers create a
1854 lot of work, mostly done. Names in the C headers must be visible
1855 in the std:: and sometimes the global namespace; the names in the
1856 two scopes must refer to the same object. More stringent is that
1857 Koenig lookup implies that any types specified as defined in std::
1858 really are defined in std::. Names optionally implemented as
1859 macros in C cannot be macros in C++. (An overview may be read at
1860 <http://www.cantrip.org/cheaders.html>). The scripts "inclosure"
1861 and "mkcshadow", and the directories shadow/ and cshadow/, are the
1862 beginning of an effort to conform in this area.
1864 A correct conforming definition of C header names based on underlying
1865 C library headers, and practical linking of conforming namespaced
1866 customer code with third-party C libraries depend ultimately on
1867 an ABI change, allowing namespaced C type names to be mangled into
1868 type names as if they were global, somewhat as C function names in a
1869 namespace, or C++ global variable names, are left unmangled. Perhaps
1870 another "extern" mode, such as 'extern "C-global"' would be an
1871 appropriate place for such type definitions. Such a type would
1872 affect mangling as follows:
1874 namespace A {
1875 struct X {};
1876 extern "C-global" { // or maybe just 'extern "C"'
1877 struct Y {};
1878 }
1879 }
1880 void f(A::X*); // mangles to f__FPQ21A1X
1881 void f(A::Y*); // mangles to f__FP1Y
1883 (It may be that this is really the appropriate semantics for regular
1884 'extern "C"', and 'extern "C-global"', as an extension, would not be
1885 necessary.) This would allow functions declared in non-standard C headers
1886 (and thus fixable by neither us nor users) to link properly with functions
1887 declared using C types defined in properly-namespaced headers. The
1888 problem this solves is that C headers (which C++ programmers do persist
1889 in using) frequently forward-declare C struct tags without including
1890 the header where the type is defined, as in
1892 struct tm;
1893 void munge(tm*);
1895 Without some compiler accommodation, munge cannot be called by correct
1896 C++ code using a pointer to a correctly-scoped tm* value.
1898 The current C headers use the preprocessor extension "#include_next",
1899 which the compiler complains about when run "-pedantic".
1900 (Incidentally, it appears that "-fpedantic" is currently ignored,
1901 probably a bug.) The solution in the C compiler is to use
1902 "-isystem" rather than "-I", but unfortunately in g++ this seems
1903 also to wrap the whole header in an 'extern "C"' block, so it's
1904 unusable for C++ headers. The correct solution appears to be to
1905 allow the various special include-directory options, if not given
1906 an argument, to affect subsequent include-directory options additively,
1907 so that given
1909 -pedantic -iprefix $(prefix) \
1910 -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \
1911 -iwithprefix -I g++-v3/ext
1913 the compiler would search $(prefix)/g++-v3 and not report
1914 pedantic warnings for files found there, but treat files in
1915 $(prefix)/g++-v3/ext pedantically. (The undocumented semantics
1916 of "-isystem" in g++ stink. Can they be rescinded? If not it
1917 must be replaced with something more rationally behaved.)
1919 All the C headers need the treatment above; in the standard these
1920 headers are mentioned in various chapters. Below, I have only
1921 mentioned those that present interesting implementation issues.
1923 The components identified as "mostly complete", below, have not been
1924 audited for conformance. In many cases where the library passes
1925 conformance tests we have non-conforming extensions that must be
1926 wrapped in #if guards for "pedantic" use, and in some cases renamed
1927 in a conforming way for continued use in the implementation regardless
1928 of conformance flags.
1930 The STL portion of the library still depends on a header
1931 stl/bits/stl_config.h full of #ifdef clauses. This apparatus
1932 should be replaced with autoconf/automake machinery.
1934 The SGI STL defines a type_traits<> template, specialized for
1935 many types in their code including the built-in numeric and
1936 pointer types and some library types, to direct optimizations of
1937 standard functions. The SGI compiler has been extended to generate
1938 specializations of this template automatically for user types,
1939 so that use of STL templates on user types can take advantage of
1940 these optimizations. Specializations for other, non-STL, types
1941 would make more optimizations possible, but extending the gcc
1942 compiler in the same way would be much better. The next round
1943 of standardization will probably ratify this, though with
1944 changes, so it should be renamed to place it in the
1945 implementation namespace.
1947 The SGI STL also defines a large number of extensions visible in
1948 standard headers. (Other extensions that appear in separate headers
1949 have been sequestered in subdirectories ext/ and backward/.) All
1950 these extensions should be moved to other headers where possible,
1951 and in any case wrapped in a namespace (not std!), and (where kept
1952 in a standard header) girded about with macro guards. Some cannot be
1953 moved out of standard headers because they are used to implement
1954 standard features. The canonical method for accommodating these
1955 is to use a protected name, aliased in macro guards to a user-space
1956 name. Unfortunately C++ offers no satisfactory template typedef
1957 mechanism, so very ad-hoc and unsatisfactory aliasing must be used
1958 instead.
1960 Implementation of a template typedef mechanism should have the highest
1961 priority among possible extensions, on the same level as implementation
1962 of the template "export" feature.
1964 Chapter 18 Language support
1965 ----------------------------
1967 Headers: <limits> <new> <typeinfo> <exception>
1968 C headers: <cstddef> <climits> <cfloat> <cstdarg> <csetjmp>
1969 <ctime> <csignal> <cstdlib> (also 21, 25, 26)
1971 This defines the built-in exceptions, rtti, numeric_limits<>,
1972 operator new and delete. Much of this is provided by the
1973 compiler in its static runtime library.
1975 Work to do includes defining numeric_limits<> specializations in
1976 separate files for all target architectures. Values for integer types
1977 except for bool and wchar_t are readily obtained from the C header
1978 <limits.h>, but values for the remaining numeric types (bool, wchar_t,
1979 float, double, long double) must be entered manually. This is
1980 largely dog work except for those members whose values are not
1981 easily deduced from available documentation. Also, this involves
1982 some work in target configuration to identify the correct choice of
1983 file to build against and to install.
1985 The definitions of the various operators new and delete must be
1986 made thread-safe, which depends on a portable exclusion mechanism,
1987 discussed under chapter 20. Of course there is always plenty of
1988 room for improvements to the speed of operators new and delete.
1990 <cstdarg>, in Glibc, defines some macros that gcc does not allow to
1991 be wrapped into an inline function. Probably this header will demand
1992 attention whenever a new target is chosen. The functions atexit(),
1993 exit(), and abort() in cstdlib have different semantics in C++, so
1994 must be re-implemented for C++.
1996 Chapter 19 Diagnostics
1997 -----------------------
1999 Headers: <stdexcept>
2000 C headers: <cassert> <cerrno>
2002 This defines the standard exception objects, which are "mostly complete".
2003 Cygnus has a version, and now SGI provides a slightly different one.
2004 It makes little difference which we use.
2006 The C global name "errno", which C allows to be a variable or a macro,
2007 is required in C++ to be a macro. For MT it must typically result in
2008 a function call.
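One common way to make an errno-like macro safe for multiple threads is
to expand it to a call that returns the address of a per-thread slot.
The sketch below assumes C++11 thread_local storage. The names
errno_slot and sketch_errno are hypothetical, and a real C library
supplies its own equivalents.

```cpp
// Returns the address of the calling thread's private error slot.
inline int* errno_slot() {
    thread_local int value = 0;
    return &value;
}

// The macro expands to an lvalue, so it can be read and assigned like a
// plain variable while still referring to per-thread storage.
#define sketch_errno (*errno_slot())
```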
2010 Chapter 20 Utilities
2011 ---------------------
2012 Headers: <utility> <functional> <memory>
2013 C header: <ctime> (also in 18)
2015 SGI STL provides "mostly complete" versions of all the components
2016 defined in this chapter. However, the auto_ptr<> implementation
2017 is known to be wrong. Furthermore, the standard definition of it
2018 is known to be unimplementable as written. A minor change to the
2019 standard would fix it, and auto_ptr<> should be adjusted to match.
2021 Multi-threading affects the allocator implementation, and there must
2022 be configuration/installation choices for different users' MT
2023 requirements. Anyway, users will want to tune allocator options
2024 to support different target conditions, MT or no.
2026 The primitives used for MT implementation should be exposed, as an
2027 extension, for users' own work. We need cross-CPU "mutex" support,
2028 multi-processor shared-memory atomic integer operations, and single-
2029 processor uninterruptible integer operations, and all three configurable
2030 to be stubbed out for non-MT use, or to use an appropriately-loaded
2031 dynamic library for the actual runtime environment, or statically
2032 compiled in for cases where the target architecture is known.
2034 Chapter 21 Strings
2035 -------------------
2036 Headers: <string>
2037 C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
2038 <cstdlib> (also in 18, 25, 26)
2040 We have "mostly-complete" char_traits<> implementations. Many of the
2041 char_traits<char> operations might be optimized further using existing
2042 proprietary language extensions.
2044 We have a "mostly-complete" basic_string<> implementation. The work
2045 to manually instantiate char and wchar_t specializations in object
2046 files to improve link-time behavior is extremely unsatisfactory,
2047 literally tripling library-build time with no commensurate improvement
2048 in static program link sizes. It must be redone. (Similar work is
2049 needed for some components in chapters 22 and 27.)
2051 Other work needed for strings is MT-safety, as discussed under the
2052 chapter 20 heading.
2054 The standard C type mbstate_t from <cwchar> and used in char_traits<>
2055 must be different in C++ than in C, because in C++ the default constructor
2056 value mbstate_t() must be the "base" or "ground" sequence state.
2057 (According to the likely resolution of a recently raised Core issue,
2058 this may become unnecessary. However, there are other reasons to
2059 use a state type not as limited as whatever the C library provides.)
2060 If we might want to provide conversions from (e.g.) internally-
2061 represented EUC-wide to externally-represented Unicode, or vice-
2062 versa, the mbstate_t we choose will need to be more accommodating
2063 than what might be provided by an underlying C library.
2065 There remain some basic_string template-member functions which do
2066 not overload properly with their non-template brethren. The infamous
2067 hack akin to what was done in vector<> is needed, to conform to
2068 23.1.1 para 10. The CHECKLIST items for basic_string marked 'X',
2069 or incomplete, are so marked for this reason.
2071 Replacing the string iterators, which currently are simple character
2072 pointers, with class objects would greatly increase the safety of the
2073 client interface, and also permit a "debug" mode in which range,
2074 ownership, and validity are rigorously checked. The current use of
2075 raw pointers as string iterators is evil. vector<> iterators need the
2076 same treatment. Note that the current implementation freely mixes
2077 pointers and iterators, and that must be fixed before safer iterators
2078 can be introduced.
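A class-type iterator with range and ownership checks could be sketched
as follows. It is a hypothetical illustration, not the design the
library adopted. It checks only with assert, and it does not cover the
validity check (detecting iterators invalidated by reallocation).

```cpp
#include <cassert>

// Wraps a raw pointer together with the bounds of the sequence it
// belongs to, so misuse is caught in a "debug" build.
template <class T>
class checked_iterator {
public:
    checked_iterator(T* pos, T* first, T* last)
        : pos_(pos), first_(first), last_(last) {}

    T& operator*() const {
        assert(first_ <= pos_ && pos_ < last_);  // range check
        return *pos_;
    }
    checked_iterator& operator++() {
        assert(pos_ < last_);                    // cannot step past the end
        ++pos_;
        return *this;
    }
    friend bool operator==(const checked_iterator& a, const checked_iterator& b) {
        assert(a.first_ == b.first_ && a.last_ == b.last_);  // ownership check
        return a.pos_ == b.pos_;
    }
    friend bool operator!=(const checked_iterator& a, const checked_iterator& b) {
        return !(a == b);
    }
private:
    T* pos_;
    T* first_;
    T* last_;
};
```

Because the type is distinct from T*, code that freely mixes pointers and
iterators stops compiling, which is the point made above.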
2080 Some of the functions in <cstring> are different from the C version,
2081 generally overloaded on const and non-const argument pointers. For
2082 example, in <cstring> strchr is overloaded. The functions isupper
2083 etc. in <cctype>, typically implemented as macros in C, are functions
2084 in C++, because they are overloaded with others of the same name
2085 defined in <locale>.
2087 Many of the functions required in <cwctype> and <cwchar> cannot be
2088 implemented using underlying C facilities on intended targets because
2089 such facilities only partly exist.
2091 Chapter 22 Localization
2092 ------------------------
2093 Headers: <locale>
2094 C headers: <clocale>
2096 We have a "mostly complete" class locale, with the exception of
2097 code for constructing, and handling the names of, named locales.
2098 The ways that locales are named (particularly when categories
2099 (e.g. LC_TIME, LC_COLLATE) are different) vary among all target
2100 environments. This code must be written in various versions and
2101 chosen by configuration parameters.
2103 Members of many of the facets defined in <locale> are stubs. Generally,
2104 there are two sets of facets: the base class facets (which are supposed
2105 to implement the "C" locale) and the "byname" facets, which are supposed
2106 to read files to determine their behavior. The base ctype<>, collate<>,
2107 and numpunct<> facets are "mostly complete", except that the table of
2108 bitmask values used for "is" operations, and corresponding mask values,
2109 are still defined in libio and just included/linked. (We will need to
2110 implement these tables independently, soon, but should take advantage
2111 of libio where possible.) The num_put<>::put members for integer types
2112 are "mostly complete".
2114 A complete list of what has and has not been implemented may be
2115 found in CHECKLIST. However, note that the current definition of
2116 codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write
2117 out the raw bytes representing the wide characters, rather than
2118 trying to convert each to a corresponding single "char" value.
2120 Some of the facets are more important than others. Specifically,
2121 the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
2122 are used by other library facilities defined in <string>, <istream>,
2123 and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
2124 in <fstream>, so a conforming iostream implementation depends on
2125 these facets.
2127 The "long long" type eventually must be supported, but code mentioning
2128 it should be wrapped in #if guards to allow pedantic-mode compiling.
2130 Performance of num_put<> and num_get<> depends critically on
2131 caching computed values in ios_base objects, and on extensions
2132 to the interface with streambufs.
2134 Specifically: retrieving a copy of the locale object, extracting
2135 the needed facets, and gathering data from them, for each call to
2136 (e.g.) operator<< would be prohibitively slow. To cache format
2137 data for use by num_put<> and num_get<> we have a _Format_cache<>
2138 object stored in the ios_base::pword() array. This is constructed
2139 and initialized lazily, and is organized purely for utility. It
2140 is discarded when a new locale with different facets is imbued.
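The caching scheme can be sketched with the standard ios_base hooks
xalloc(), pword(), iword() and register_callback(). FormatCache is a
hypothetical stand-in for _Format_cache<> and holds only two values. As
a simplification, this sketch discards the cache on every imbue rather
than only when the facets differ.

```cpp
#include <ios>
#include <locale>

// Hypothetical cache contents: a real cache would hold far more.
struct FormatCache {
    char decimal_point;
    char thousands_sep;
};

// One index into the pword()/iword() arrays, shared by all streams.
inline int format_cache_slot() {
    static const int slot = std::ios_base::xalloc();
    return slot;
}

// Called by the stream on imbue, copyfmt and destruction.
inline void format_cache_event(std::ios_base::event ev,
                               std::ios_base& ios, int slot) {
    void*& p = ios.pword(slot);
    if (ev != std::ios_base::copyfmt_event)  // imbue_event or erase_event
        delete static_cast<FormatCache*>(p);
    // After copyfmt the pointer belongs to the source stream, so it is
    // forgotten here rather than freed.
    p = 0;
}

// Returns the cache for this stream, building it lazily on first use.
inline const FormatCache& format_cache(std::ios_base& ios) {
    const int slot = format_cache_slot();
    if (ios.iword(slot) == 0) {              // callback not yet registered
        ios.register_callback(format_cache_event, slot);
        ios.iword(slot) = 1;
    }
    void*& p = ios.pword(slot);
    if (p == 0) {
        const std::numpunct<char>& np =
            std::use_facet<std::numpunct<char> >(ios.getloc());
        FormatCache* c = new FormatCache;
        c->decimal_point = np.decimal_point();
        c->thousands_sep = np.thousands_sep();
        p = c;
    }
    return *static_cast<FormatCache*>(p);
}
```

Repeated calls on the same stream return the same object until the
locale changes, so the facet lookup is paid once per imbue rather than
once per formatted operation.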
2142 Using only the public interfaces of the iterator arguments to the
2143 facet functions would limit performance by forbidding "vector-style"
2144 character operations. The streambuf iterator optimizations are
2145 described under chapter 24, but facets can also bypass the streambuf
2146 iterators via explicit specializations and operate directly on the
2147 streambufs, and use extended interfaces to get direct access to the
2148 streambuf internal buffer arrays. These extensions are mentioned
2149 under chapter 27. These optimizations are particularly important
2152 Unused virtual members of locale facets can be omitted, as mentioned
2153 above, by a smart linker.
2155 Chapter 23 Containers
2156 ----------------------
2157 Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>
2159 All the components in chapter 23 are implemented in the SGI STL.
2160 They are "mostly complete"; they include a large number of
2161 nonconforming extensions which must be wrapped. Some of these
2162 are used internally and must be renamed or duplicated.
2164 The SGI components are optimized for large-memory environments. For
2165 embedded targets, different criteria might be more appropriate. Users
2166 will want to be able to tune this behavior. We should provide
2167 ways for users to compile the library with different memory usage
2168 characteristics.
2170 A lot more work is needed on factoring out common code from different
2171 specializations to reduce code size here and in chapter 25. The
2172 easiest fix for this would be a compiler/ABI improvement that allows
2173 the compiler to recognize when a specialization depends only on the
2174 size (or other gross quality) of a template argument, and allow the
2175 linker to share the code with similar specializations. In its
2176 absence, many of the algorithms and containers can be partial-
2177 specialized, at least for the case of pointers, but this only solves
2178 a small part of the problem. Use of a type_traits-style template
2179 allows a few more optimization opportunities, more if the compiler
2180 can generate the specializations automatically.
2182 As an optimization, containers can specialize on the default allocator
2183 and bypass it, or take advantage of details of its implementation
2184 after it has been improved upon.
2186 Replacing the vector iterators, which currently are simple element
2187 pointers, with class objects would greatly increase the safety of the
2188 client interface, and also permit a "debug" mode in which range,
2189 ownership, and validity are rigorously checked. The current use of
2190 pointers for iterators is evil.
2192 As mentioned for chapter 24, the deque iterator is a good example of
2193 an opportunity to implement a "staged" iterator that would benefit
2194 from specializations of some algorithms.
2196 Chapter 24 Iterators
2197 ---------------------
2198 Headers: <iterator>
2200 Standard iterators are "mostly complete", with the exception of
2201 the stream iterators, which are not yet templatized on the
2202 stream type. Also, the base class template iterator<> appears
2203 to be wrong, so everything derived from it must also be wrong,
2206 The streambuf iterators (currently located in stl/bits/std_iterator.h,
2207 but should be under bits/) can be rewritten to take advantage of
2208 friendship with the streambuf implementation.
2210 Matt Austern has identified opportunities where certain iterator
2211 types, particularly including streambuf iterators and deque
2212 iterators, have a "two-stage" quality, such that an intermediate
2213 limit can be checked much more quickly than the true limit on
2214 range operations. If identified with a member of iterator_traits,
2215 algorithms may be specialized for this case. Of course the
2216 iterators that have this quality can be identified by specializing
2219 Many of the algorithms must be specialized for the streambuf
2220 iterators, to take advantage of block-mode operations, so that
2221 the performance of iostream/locale operations does not suffer.
2222 It may be that they could be treated as staged iterators and
2223 take advantage of those optimizations.
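The "two-stage" idea can be illustrated with a search over a sequence
stored as separate contiguous blocks, roughly as a deque stores its
elements. Segmented and segmented_find are hypothetical names. The
sketch works on the container directly rather than through an iterator
traits member, and assumes T is not bool.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical segmented sequence: a list of contiguous blocks.
template <class T>
struct Segmented {
    std::vector<std::vector<T> > blocks;
};

// Returns the flat index of the first element equal to value, or the
// total number of elements if there is none.
template <class T>
std::size_t segmented_find(const Segmented<T>& s, const T& value) {
    std::size_t base = 0;
    for (std::size_t b = 0; b < s.blocks.size(); ++b) {  // outer stage
        const std::vector<T>& block = s.blocks[b];
        const T* p = block.data();
        const T* const end = p + block.size();
        // Inner stage: a plain pointer loop whose only limit check is
        // against the end of the current block.
        for (; p != end; ++p)
            if (*p == value)
                return base + static_cast<std::size_t>(p - block.data());
        base += block.size();
    }
    return base;
}
```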
2225 Chapter 25 Algorithms
2226 ----------------------
2227 Headers: <algorithm>
2228 C headers: <cstdlib> (also in 18, 21, 26)
2230 The algorithms are "mostly complete". As mentioned above, they
2231 are optimized for speed at the expense of code and data size.
2233 Specializations of many of the algorithms for non-STL types would
2234 give performance improvements, but we must use great care not to
2235 interfere with fragile template overloading semantics for the
2236 standard interfaces. Conventionally the standard function template
2237 interface is an inline which delegates to a non-standard function
2238 which is then overloaded (this is already done in many places in
2239 the library). Particularly appealing opportunities for the sake of
2240 iostream performance are for copy and find applied to streambuf
2241 iterators or (as noted elsewhere) for staged iterators, of which
2242 the streambuf iterators are a good example.
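The delegation convention described above can be sketched like this.
The names my_copy and copy_aux are hypothetical and do not
correspond to the library's internal helpers; the char overload
stands in for any type that admits a faster block operation.

```cpp
#include <cstddef>
#include <cstring>

// General case: element-by-element copy through the iterator interface.
template<typename In, typename Out>
inline Out copy_aux(In first, In last, Out result)
{
  for (; first != last; ++first, ++result)
    *result = *first;
  return result;
}

// Overload for plain char buffers: one block move instead of a loop.
inline char* copy_aux(const char* first, const char* last, char* result)
{
  const std::size_t n = static_cast<std::size_t>(last - first);
  if (n != 0)
    std::memmove(result, first, n);
  return result + n;
}

// The user-visible interface is a single inline template.  It is never
// overloaded or specialized itself; it only forwards to the helper.
template<typename In, typename Out>
inline Out my_copy(In first, In last, Out result)
{ return copy_aux(first, last, result); }
```

Because all overloading happens on the helper, the overload set seen
by callers of the public name stays exactly as the standard
describes it. Note that in this sketch only an exact
(const char*, const char*, char*) call reaches the block-move
overload; a real implementation would need more care there.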
2244 The bsearch and qsort functions cannot be overloaded properly as
2245 required by the standard because gcc does not yet allow overloading
2246 on the extern-"C"-ness of a function pointer.
2248 Chapter 26 Numerics
2249 --------------------
2250 Headers: <complex> <valarray> <numeric>
2251 C headers: <cmath>, <cstdlib> (also in 18, 21, 25)
2253 Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
2254 and the few algorithms from the STL are "mostly done". Of course
2255 optimization opportunities abound for the numerically literate. It
2256 is not clear whether the valarray implementation really conforms
2257 fully, in the assumptions it makes about aliasing (and lack thereof)
2260 The C div() and ldiv() functions are interesting, because they are the
2261 only case where a C library function returns a class object by value.
2262 Since the C++ type div_t must be different from the underlying C type
2263 (which is in the wrong namespace) the underlying functions div() and
2264 ldiv() cannot be re-used efficiently. Fortunately they are trivial to
2267 Chapter 27 Iostreams
2268 ---------------------
2269 Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
2270 <iomanip> <sstream> <fstream>
2271 C headers: <cstdio> <cwchar> (also in 21)
2273 Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
2274 ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
2275 basic_ostream<> are well along, but basic_istream<> has had little work
2276 done. The standard stream objects, <sstream>, and <fstream> have been
2277 started; basic_filebuf<> "write" functions have been implemented just
2278 enough to do "hello, world".
2280 Most of the istream and ostream operators << and >> (with the exception
2281 of the op<<(integer) ones) have not been changed to use locale primitives,
2282 sentry objects, or char_traits members.
2284 All these templates should be manually instantiated for char and
2285 wchar_t in a way that links only used members into user programs.
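One way this might be done is with an explicit instantiation per
member rather than per class. The template basic_scanner below is a
hypothetical stand-in for a stream template, and the assumption that
each instantiation would live in its own source file is ours.

```cpp
#include <cstddef>

// Hypothetical stand-in for a library template parameterized on the
// character type.
template<typename CharT>
struct basic_scanner
{
  static std::size_t length(const CharT* s);
  static const CharT* find(const CharT* s, CharT c);
};

template<typename CharT>
std::size_t basic_scanner<CharT>::length(const CharT* s)
{
  std::size_t n = 0;
  while (s[n] != CharT())
    ++n;
  return n;
}

template<typename CharT>
const CharT* basic_scanner<CharT>::find(const CharT* s, CharT c)
{
  for (; *s != CharT(); ++s)
    if (*s == c)
      return s;
  return 0;
}

// Explicit instantiation of individual members.  If each of these
// lines were placed in a separate source file of the library, a
// static link would pull in only the members a program references.
template std::size_t basic_scanner<char>::length(const char*);
template std::size_t basic_scanner<wchar_t>::length(const wchar_t*);
template const char* basic_scanner<char>::find(const char*, char);
template const wchar_t* basic_scanner<wchar_t>::find(const wchar_t*, wchar_t);
```

Instantiating the whole class with "template struct
basic_scanner<char>;" is shorter, but it emits every member into one
object file.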
2287 Streambuf is fertile ground for optimization extensions. An extended
2288 interface giving iterator access to its internal buffer would be very
2289 useful for other library components.
2291 Iostream operations (primarily operators << and >>) can take advantage
2292 of the case where user code has not specified a locale, and bypass locale
2293 operations entirely. The current implementation of op<</num_put<>::put,
2294 for the integer types, demonstrates how they can cache encoding details
2295 from the locale on each operation. There is lots more room for
2296 optimization in this area.
2298 The definition of the relationship between the standard streams
2299 cout et al. and stdout et al. requires something like a "stdiobuf".
2300 The SGI solution of using double-indirection to actually use a
2301 stdio FILE object for buffering is unsatisfactory, because it
2302 interferes with peephole loop optimizations.
2304 The <sstream> header work has begun. stringbuf can benefit from
2305 friendship with basic_string<> and basic_string<>::_Rep to use
2306 those objects directly as buffers, and avoid allocating and making
2309 The basic_filebuf<> template is a complex beast. It is specified to
2310 use the locale facet codecvt<> to translate characters between native
2311 files and the locale character encoding. In general this involves
2312 two buffers, one of "char" representing the file and another of
2313 "char_type", for the stream, with codecvt<> translating. The process
2314 is complicated by the variable-length nature of the translation, and
2315 the need to seek to corresponding places in the two representations.
2316 For the case of basic_filebuf<char>, when no translation is needed,
2317 a single buffer suffices. A specialized filebuf can be used to reduce
2318 code space overhead when no locale has been imbued. Matt Austern's
2319 work at SGI will be useful, perhaps directly as a source of code, or
2320 at least as an example to draw on.
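The two-buffer translation step can be sketched with the standard
codecvt<>::in() member. The function widen_block and its parameters
are hypothetical; a real filebuf would keep the unconsumed external
bytes and the conversion state between calls, which is only hinted
at here by the consumed count.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: translate one block of external ("file") bytes
// into internal characters, as an underflow operation might.
inline std::wstring widen_block(const std::string& external,
                                const std::locale& loc,
                                std::size_t& consumed)
{
  typedef std::codecvt<wchar_t, char, std::mbstate_t> cvt_type;
  const cvt_type& cvt = std::use_facet<cvt_type>(loc);

  consumed = 0;
  if (external.empty())
    return std::wstring();

  // Internal buffer: never more than one wchar_t per external byte.
  std::wstring internal(external.size(), L'\0');
  std::mbstate_t state = std::mbstate_t();
  const char* from_next = external.data();
  wchar_t* to_next = &internal[0];

  const cvt_type::result r =
    cvt.in(state,
           external.data(), external.data() + external.size(), from_next,
           &internal[0], &internal[0] + internal.size(), to_next);
  if (r == cvt_type::error)
    throw std::runtime_error("widen_block: invalid external sequence");

  // With a variable-length encoding, a trailing partial character is
  // left unconsumed; the two positions advance by different amounts.
  consumed = static_cast<std::size_t>(from_next - external.data());
  internal.resize(static_cast<std::size_t>(to_next - &internal[0]));
  return internal;
}
```

The difference between consumed and the length of the result is
what makes seeking to corresponding places in the two buffers
awkward.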
2322 Filebuf, almost uniquely (cf. operator new), depends heavily on
2323 underlying environmental facilities. In current releases iostream
2324 depends fairly heavily on libio constant definitions, but it should
2325 be made independent. It also depends on operating system primitives
2326 for file operations. There is immense room for optimizations using
2327 (e.g.) mmap for reading. The shadow/ directory wraps, besides the
2328 standard C headers, the libio.h and unistd.h headers, for use mainly
2329 by filebuf. These wrappings have not been completed, though there
2330 is scaffolding in place.
2332 The encapsulation of certain C header <cstdio> names presents an
2333 interesting problem. It is possible to define an inline std::fprintf()
2334 implemented in terms of the 'extern "C"' vfprintf(), but there is no
2335 standard vfscanf() to use to implement std::fscanf(). It appears that
2336 vfscanf must be re-implemented in C++ for targets where no vfscanf
2337 extension has been defined. This is interesting in that it seems
2338 to be the only significant case in the C library where this kind of
2339 rewriting is necessary. (Of course Glibc provides the vfscanf()
2340 extension.) (The functions related to exit() must be rewritten
2344 Annex D
2345 -------
2346 Headers: <strstream>
2348 Annex D defines many non-library features, many minor
2349 modifications to various headers, and one complete header.
2350 It is "mostly done", except that the libstdc++-2 <strstream>
2351 header has not been adopted into the library, or checked to
2352 verify that it matches the draft in those details that were
2353 clarified by the committee. Certainly it must at least be
2354 moved into the std namespace.
2356 We still need to wrap all the deprecated features in #if guards
2357 so that pedantic compile modes can detect their use.
2359 Nonstandard Extensions
2360 ----------------------
2361 Headers: <iostream.h> <strstream.h> <hash> <rbtree>
2362 <pthread_alloc> <stdiobuf> (etc.)
2364 User code has come to depend on a variety of nonstandard components
2365 that we must not omit. Much of this code can be adopted from
2366 libstdc++-v2 or from the SGI STL. This particularly includes
2367 <iostream.h>, <strstream.h>, and various SGI extensions such
2368 as <hash_map.h>. Many of these are already placed in the
2369 subdirectories ext/ and backward/. (Note that it is better to
2370 include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
2371 to search the subdirectory itself via a "-I" directive.)