FSF GCC merge 02/23/03
[official-gcc.git] / libstdc++-v3 / docs / html / 24_iterators / howto.html
blobc22df9acf4208a6551be5d5c5f9dc2a97fddba2c
1 <?xml version="1.0" encoding="ISO-8859-1"?>
2 <!DOCTYPE html
3 PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
4 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
6 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
7 <head>
8 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
9 <meta name="AUTHOR" content="pme@gcc.gnu.org (Phil Edwards)" />
10 <meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL" />
11 <meta name="DESCRIPTION" content="HOWTO for the libstdc++ chapter 24." />
12 <meta name="GENERATOR" content="vi and eight fingers" />
13 <title>libstdc++-v3 HOWTO: Chapter 24</title>
14 <link rel="StyleSheet" href="../lib3styles.css" />
15 </head>
16 <body>
18 <h1 class="centered"><a name="top">Chapter 24: Iterators</a></h1>
20 <p>Chapter 24 deals with the FORTRAN subroutines for automatically
21 transforming lemmings into gold.
22 </p>
25 <!-- ####################################################### -->
26 <hr />
27 <h1>Contents</h1>
28 <ul>
29 <li><a href="#1">They ain't pointers!</a></li>
30 <li><a href="#2">It ends <em>where?</em></a></li>
31 </ul>
33 <hr />
35 <!-- ####################################################### -->
37 <h2><a name="1">They ain't pointers!</a></h2>
38 <p><a href="../faq/index.html#5_1">FAQ 5.1</a> points out that iterators
39 are not implemented as pointers. They are a generalization of
40 pointers, but they are implemented in libstdc++-v3 as separate classes.
41 </p>
42 <p>Keeping that simple fact in mind as you design your code will
43 prevent a whole lot of difficult-to-understand bugs.
44 </p>
45 <p>You can think of it the other way 'round, even. Since iterators
46 are a generalization, that means that <em>pointers</em> are
47 <em>iterators</em>, and that pointers can be used whenever an
48 iterator would be. All those functions in the Algorithms chapter
49 of the Standard will work just as well on plain arrays and their
50 pointers.
51 </p>
52 <p>That doesn't mean that when you pass in a pointer, it gets wrapped
53 into some special delegating iterator-to-pointer class with a layer
54 of overhead. (If you think that's the case anywhere, you don't
55 understand templates to begin with...) Oh, no; if you pass
56 in a pointer, then the compiler will instantiate that template
57 using T* as a type, and good old high-speed pointer arithmetic as
58 its operations, so the resulting code will be doing exactly the same
59 things as it would be doing if you had hand-coded it yourself (for
60 the 273rd time).
61 </p>
62 <p>How much overhead <em>is</em> there when using an interator class?
63 Very little. Most of the layering classes contain nothing but
64 typedefs, and typedefs are &quot;meta-information&quot; that simply
65 tell the compiler some nicknames; they don't create code. That
66 information gets passed down through inheritance, so while the
67 compiler has to do work looking up all the names, your runtime code
68 does not. (This has been a prime concern from the beginning.)
69 </p>
70 <p>Return <a href="#top">to top of page</a> or
71 <a href="../faq/index.html">to the FAQ</a>.
72 </p>
74 <hr />
75 <h2><a name="2">It ends <em>where?</em></a></h2>
76 <p>This starts off sounding complicated, but is actually very easy,
77 especially towards the end. Trust me.
78 </p>
79 <p>Beginners usually have a little trouble understand the whole
80 'past-the-end' thing, until they remember their early algebra classes
81 (see, they <em>told</em> you that stuff would come in handy!) and
82 the concept of half-open ranges.
83 </p>
84 <p>First, some history, and a reminder of some of the funkier rules in
85 C and C++ for builtin arrays. The following rules have always been
86 true for both languages:
87 </p>
88 <ol>
89 <li>You can point anywhere in the array, <em>or to the first element
90 past the end of the array</em>. A pointer that points to one
91 past the end of the array is guaranteed to be as unique as a
92 pointer to somewhere inside the array, so that you can compare
93 such pointers safely.
94 </li>
95 <li>You can only dereference a pointer that points into an array.
96 If your array pointer points outside the array -- even to just
97 one past the end -- and you dereference it, Bad Things happen.
98 </li>
99 <li>Strictly speaking, simply pointing anywhere else invokes
100 undefined behavior. Most programs won't puke until such a
101 pointer is actually dereferenced, but the standards leave that
102 up to the platform.
103 </li>
104 </ol>
105 <p>The reason this past-the-end addressing was allowed is to make it
106 easy to write a loop to go over an entire array, e.g.,
107 while (*d++ = *s++);.
108 </p>
109 <p>So, when you think of two pointers delimiting an array, don't think
110 of them as indexing 0 through n-1. Think of them as <em>boundary
111 markers</em>:
112 </p>
113 <pre>
115 beginning end
117 | | This is bad. Always having to
118 | | remember to add or subtract one.
119 | | Off-by-one bugs very common here.
121 array of N elements
122 |---|---|--...--|---|---|
123 | 0 | 1 | ... |N-2|N-1|
124 |---|---|--...--|---|---|
128 | | This is good. This is safe. This
129 | | is guaranteed to work. Just don't
130 | | dereference 'end'.
131 beginning end
133 </pre>
134 <p>See? Everything between the boundary markers is part of the array.
135 Simple.
136 </p>
137 <p>Now think back to your junior-high school algebra course, when you
138 were learning how to draw graphs. Remember that a graph terminating
139 with a solid dot meant, &quot;Everything up through this point,&quot;
140 and a graph terminating with an open dot meant, &quot;Everything up
141 to, but not including, this point,&quot; respectively called closed
142 and open ranges? Remember how closed ranges were written with
143 brackets, <em>[a,b]</em>, and open ranges were written with parentheses,
144 <em>(a,b)</em>?
145 </p>
146 <p>The boundary markers for arrays describe a <em>half-open range</em>,
147 starting with (and including) the first element, and ending with (but
148 not including) the last element: <em>[beginning,end)</em>. See, I
149 told you it would be simple in the end.
150 </p>
151 <p>Iterators, and everything working with iterators, follows this same
152 time-honored tradition. A container's <code>begin()</code> method returns
153 an iterator referring to the first element, and its <code>end()</code>
154 method returns a past-the-end iterator, which is guaranteed to be
155 unique and comparable against any other iterator pointing into the
156 middle of the container.
157 </p>
158 <p>Container constructors, container methods, and algorithms, all take
159 pairs of iterators describing a range of values on which to operate.
160 All of these ranges are half-open ranges, so you pass the beginning
161 iterator as the starting parameter, and the one-past-the-end iterator
162 as the finishing parameter.
163 </p>
164 <p>This generalizes very well. You can operate on sub-ranges quite
165 easily this way; functions accepting a <em>[first,last)</em> range
166 don't know or care whether they are the boundaries of an entire {array,
167 sequence, container, whatever}, or whether they only enclose a few
168 elements from the center. This approach also makes zero-length
169 sequences very simple to recognize: if the two endpoints compare
170 equal, then the {array, sequence, container, whatever} is empty.
171 </p>
172 <p>Just don't dereference <code>end()</code>.
173 </p>
174 <p>Return <a href="#top">to top of page</a> or
175 <a href="../faq/index.html">to the FAQ</a>.
176 </p>
181 <!-- ####################################################### -->
183 <hr />
184 <p class="fineprint"><em>
185 See <a href="../17_intro/license.html">license.html</a> for copying conditions.
186 Comments and suggestions are welcome, and may be sent to
187 <a href="mailto:libstdc++@gcc.gnu.org">the libstdc++ mailing list</a>.
188 </em></p>
191 </body>
192 </html>