Merge -r 127928:132243 from trunk
[official-gcc.git] / libstdc++-v3 / doc / html / ext / pb_ds / pq_design.html
blob95956004527771b4f53f415bcefa50867b742a73
1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
2 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
4 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
5 <head>
6 <meta name="generator" content=
7 "HTML Tidy for Linux/x86 (vers 12 April 2005), see www.w3.org" />
9 <title>Priority-Queues</title>
10 <meta http-equiv="Content-Type" content=
11 "text/html; charset=us-ascii" />
12 </head>
14 <body>
15 <div id="page">
16 <h1>Priority-Queue Design</h1>
18 <h2><a name="overview" id="overview">Overview</a></h2>
20 <p>The priority-queue container has the following
21 declaration:</p>
22 <pre>
23 <b>template</b>&lt;
24 <b>typename</b> Value_Type,
25 <b>typename</b> Cmp_Fn = std::less&lt;Value_Type&gt;,
26 <b>typename</b> Tag = <a href="pairing_heap_tag.html">pairing_heap_tag</a>,
27 <b>typename</b> Allocator = std::allocator&lt;<b>char</b>&gt; &gt;
28 <b>class</b> <a href="priority_queue.html">priority_queue</a>;
29 </pre>
31 <p>The parameters have the following meaning:</p>
33 <ol>
34 <li><tt>Value_Type</tt> is the value type.</li>
36 <li><tt>Cmp_Fn</tt> is a value comparison functor</li>
38 <li><tt>Tag</tt> specifies which underlying data structure
39 to use.</li>
41 <li><tt>Allocator</tt> is an allocator
42 type.</li>
43 </ol>
45 <p>The <tt>Tag</tt> parameter specifies which underlying
46 data structure to use. Instantiating it by <a href=
47 "pairing_heap_tag.html"><tt>pairing_heap_tag</tt></a>,
48 <a href=
49 "binary_heap_tag.html"><tt>binary_heap_tag</tt></a>,
50 <a href=
51 "binomial_heap_tag.html"><tt>binomial_heap_tag</tt></a>,
52 <a href=
53 "rc_binomial_heap_tag.html"><tt>rc_binomial_heap_tag</tt></a>,
54 or <a href=
55 "thin_heap_tag.html"><tt>thin_heap_tag</tt></a>,
56 specifies, respectively, an underlying pairing heap [<a href=
57 "references.html#fredman86pairing">fredman86pairing</a>],
58 binary heap [<a href="references.html#clrs2001">clrs2001</a>],
59 binomial heap [<a href=
60 "references.html#clrs2001">clrs2001</a>], a binomial heap with
61 a redundant binary counter [<a href=
62 "references.html#maverik_lowerbounds">maverik_lowerbounds</a>],
63 or a thin heap [<a href=
64 "references.html#kt99fat_heaps">kt99fat_heas</a>]. These are
65 explained further in <a href="#pq_imp">Implementations</a>.</p>
67 <p>As mentioned in <a href=
68 "tutorial.html#pq">Tutorial::Priority Queues</a>,
69 <a href=
70 "priority_queue.html"><tt>__gnu_pbds::priority_queue</tt></a>
71 shares most of the same interface with <tt>std::priority_queue</tt>.
72 <i>E.g.</i> if <tt>q</tt> is a priority queue of type
73 <tt>Q</tt>, then <tt>q.top()</tt> will return the "largest"
74 value in the container (according to <tt><b>typename</b>
75 Q::cmp_fn</tt>). <a href=
76 "priority_queue.html"><tt>__gnu_pbds::priority_queue</tt></a>
77 has a larger (and very slightly different) interface than
78 <tt>std::priority_queue</tt>, however, since typically
79 <tt>push</tt> and <tt>pop</tt> are deemed insufficient for
80 manipulating priority-queues. </p>
82 <p>Different settings require different priority-queue
83 implementations which are described in <a href=
84 "#pq_imp">Implementations</a>; <a href="#pq_traits">Traits</a>
85 discusses ways to differentiate between the different traits of
86 different implementations.</p>
88 <h2><a name="pq_it" id="pq_it">Iterators</a></h2>
90 <p>There are many different underlying-data structures for
91 implementing priority queues. Unfortunately, most such
92 structures are oriented towards making <tt>push</tt> and
93 <tt>top</tt> efficient, and consequently don't allow efficient
94 access of other elements: for instance, they cannot support an efficient
95 <tt>find</tt> method. In the use case where it
96 is important to both access and "do something with" an
97 arbitrary value, one would be out of luck. For example, many graph algorithms require
98 modifying a value (typically increasing it in the sense of the
99 priority queue's comparison functor).</p>
101 <p>In order to access and manipulate an arbitrary value in a
102 priority queue, one needs to reference the internals of the
103 priority queue from some form of an associative container -
104 this is unavoidable. Of course, in order to maintain the
105 encapsulation of the priority queue, this needs to be done in a
106 way that minimizes exposure to implementation internals.</p>
108 <p>In <tt>pb_ds</tt> the priority queue's <tt>insert</tt>
109 method returns an iterator, which if valid can be used for subsequent <tt>modify</tt> and
110 <tt>erase</tt> operations. This both preserves the priority
111 queue's encapsulation, and allows accessing arbitrary values (since the
112 returned iterators from the <tt>push</tt> operation can be
113 stored in some form of associative container).</p>
115 <p>Priority queues' iterators present a problem regarding their
116 invalidation guarantees. One assumes that calling
117 <tt><b>operator</b>++</tt> on an iterator will associate it
118 with the "next" value. Priority-queues are
119 self-organizing: each operation changes what the "next" value
120 means. Consequently, it does not make sense that <tt>push</tt>
121 will return an iterator that can be incremented - this can have
122 no possible use. Also, as in the case of hash-based containers,
123 it is awkward to define if a subsequent <tt>push</tt> operation
124 invalidates a prior returned iterator: it invalidates it in the
125 sense that its "next" value is not related to what it
126 previously considered to be its "next" value. However, it might not
127 invalidate it, in the sense that it can be
128 de-referenced and used for <tt>modify</tt> and <tt>erase</tt>
129 operations.</p>
131 <p>Similarly to the case of the other unordered associative
132 containers, <tt>pb_ds</tt> uses a distinction between
133 point-type and range type iterators. A priority queue's <tt>iterator</tt> can always be
134 converted to a <tt>point_iterator</tt>, and a
135 <tt>const_iterator</tt> can always be converted to a
136 <tt>const_point_iterator</tt>.</p>
138 <p>The following snippet demonstrates manipulating an arbitrary
139 value:</p>
140 <pre>
141 <i>// A priority queue of integers.</i>
142 <a href=
143 "priority_queue.html">priority_queue</a>&lt;<b>int</b>&gt; p;
145 <i>// Insert some values into the priority queue.</i>
146 <a href=
147 "priority_queue.html">priority_queue</a>&lt;<b>int</b>&gt;::point_iterator it = p.push(0);
149 p.push(1);
150 p.push(2);
152 <i>// Now modify a value.</i>
153 p.modify(it, 3);
155 assert(p.top() == 3);
156 </pre>
158 <p>(<a href="pq_examples.html#xref">Priority Queue
159 Examples::Cross-Referencing</a> shows a more detailed
160 example.)</p>
162 <p>It should be noted that an alternative design could embed an
163 associative container in a priority queue. Could, but most probably should not. To begin with, it should be noted that one
164 could always encapsulate a priority queue and an associative
165 container mapping values to priority queue iterators with no
166 performance loss. One cannot, however, "un-encapsulate" a
167 priority queue embedding an associative container, which might
168 lead to performance loss. Assume, that one needs to
169 associate each value with some data unrelated to priority
170 queues. Then using <tt>pb_ds</tt>'s design, one could use an
171 associative container mapping each value to a pair consisting
172 of this data and a priority queue's iterator. Using the
173 embedded method would need to use two associative
174 containers. Similar problems might arise in cases where a value
175 can reside simultaneously in many priority queues.</p>
177 <h2><a name="pq_imp" id="pq_imp">Implementations</a></h2>
179 <p>There are three main implementations of priority queues: the
180 first employs a binary heap, typically one which uses a
181 sequence; the second uses a tree (or forest of trees), which is
182 typically less structured than an associative container's tree;
183 the third simply uses an associative container. These are
184 shown, respectively, in Figures <a href=
185 "#pq_different_underlying_dss">Underlying Priority-Queue
186 Data-Structures</a> A1 and A2, Figure <a href=
187 "#pq_different_underlying_dss">Underlying Priority-Queue
188 Data-Structures</a> B, and Figures <a href=
189 "#pq_different_underlying_dss">Underlying Priority-Queue
190 Data-Structures</a> C.</p>
192 <h6 class="c1"><a name="pq_different_underlying_dss" id=
193 "pq_different_underlying_dss"><img src=
194 "pq_different_underlying_dss.png" alt="no image" /></a></h6>
196 <h6 class="c1">Underlying Priority-Queue Data-Structures.</h6>
198 <p>Roughly speaking, any value that is both pushed and popped
199 from a priority queue must incur a logarithmic expense (in the
200 amortized sense). Any priority queue implementation that would
201 avoid this, would violate known bounds on comparison-based
202 sorting (see, <i>e.g.</i>, [<a href=
203 "references.html#clrs2001">clrs2001</a>] and <a href=
204 "references.html#brodal96priority">brodal96priority</a>]).</p>
206 <p>Most implementations do
207 not differ in the asymptotic amortized complexity of
208 <tt>push</tt> and <tt>pop</tt> operations, but they differ in
209 the constants involved, in the complexity of other operations
210 (<i>e.g.</i>, <tt>modify</tt>), and in the worst-case
211 complexity of single operations. In general, the more
212 "structured" an implementation (<i>i.e.</i>, the more internal
213 invariants it possesses) - the higher its amortized complexity
214 of <tt>push</tt> and <tt>pop</tt> operations.</p>
216 <p><tt>pb_ds</tt> implements different algorithms using a
217 single class: <a href="priority_queue.html">priority_queue</a>.
218 Instantiating the <tt>Tag</tt> template parameter, "selects"
219 the implementation:</p>
221 <ol>
222 <li>Instantiating <tt>Tag = <a href=
223 "binary_heap_tag.html">binary_heap_tag</a></tt> creates
224 a binary heap of the form in Figures <a href=
225 "#pq_different_underlying_dss">Underlying Priority-Queue
226 Data-Structures</a> A1 or A2. The former is internally
227 selected by <a href="priority_queue.html">priority_queue</a>
228 if <tt>Value_Type</tt> is instantiated by a primitive type
229 (<i>e.g.</i>, an <tt><b>int</b></tt>); the latter is
230 internally selected for all other types (<i>e.g.</i>,
231 <tt>std::string</tt>). This implementations is relatively
232 unstructured, and so has good <tt>push</tt> and <tt>pop</tt>
233 performance; it is the "best-in-kind" for primitive
234 types, <i>e.g.</i>, <tt><b>int</b></tt>s. Conversely, it has
235 high worst-case performance, and can support only linear-time
236 <tt>modify</tt> and <tt>erase</tt> operations; this is
237 explained further in <a href="#pq_traits">Traits</a>.</li>
239 <li>Instantiating <tt>Tag = <a href=
240 "pairing_heap_tag.html">pairing_heap_tag</a></tt>
241 creates a pairing heap of the form in Figure <a href=
242 "#pq_different_underlying_dss">Underlying Priority-Queue
243 Data-Structures</a> B. This implementations too is relatively
244 unstructured, and so has good <tt>push</tt> and <tt>pop</tt>
245 performance; it is the "best-in-kind" for non-primitive
246 types, <i>e.g.</i>, <tt>std:string</tt>s. It also has very
247 good worst-case <tt>push</tt> and <tt>join</tt> performance
248 (<i>O(1)</i>), but has high worst-case <tt>pop</tt>
249 complexity.</li>
251 <li>Instantiating <tt>Tag = <a href=
252 "binomial_heap_tag.html">binomial_heap_tag</a></tt>
253 creates a binomial heap of the form in Figure <a href=
254 "#pq_different_underlying_dss">Underlying Priority-Queue
255 Data-Structures</a> B. This implementations is more
256 structured than a pairing heap, and so has worse
257 <tt>push</tt> and <tt>pop</tt> performance. Conversely, it
258 has sub-linear worst-case bounds for <tt>pop</tt>,
259 <i>e.g.</i>, and so it might be preferred in cases where
260 responsiveness is important.</li>
262 <li>Instantiating <tt>Tag = <a href=
263 "rc_binomial_heap_tag.html">rc_binomial_heap_tag</a></tt>
264 creates a binomial heap of the form in Figure <a href=
265 "#pq_different_underlying_dss">Underlying Priority-Queue
266 Data-Structures</a> B, accompanied by a redundant counter
267 which governs the trees. This implementations is therefore
268 more structured than a binomial heap, and so has worse
269 <tt>push</tt> and <tt>pop</tt> performance. Conversely, it
270 guarantees <i>O(1)</i> <tt>push</tt> complexity, and so it
271 might be preferred in cases where the responsiveness of a
272 binomial heap is insufficient.</li>
274 <li>Instantiating <tt>Tag = <a href=
275 "thin_heap_tag.html">thin_heap_tag</a></tt> creates a
276 thin heap of the form in Figure <a href=
277 "#pq_different_underlying_dss">Underlying Priority-Queue
278 Data-Structures</a> B. This implementations too is more
279 structured than a pairing heap, and so has worse
280 <tt>push</tt> and <tt>pop</tt> performance. Conversely, it
281 has better worst-case and identical amortized complexities
282 than a Fibonacci heap, and so might be more appropriate for
283 some graph algorithms.</li>
284 </ol>
286 <p><a href="pq_performance_tests.html">Priority-Queue
287 Performance Tests</a> shows some results for the above, and
288 discusses these points further.</p>
290 <p>Of course, one can use any order-preserving associative
291 container as a priority queue, as in Figure <a href=
292 "#pq_different_underlying_dss">Underlying Priority-Queue
293 Data-Structures</a> C, possibly by creating an adapter class
294 over the associative container (much as
295 <tt>std::priority_queue</tt> can adapt <tt>std::vector</tt>).
296 This has the advantage that no cross-referencing is necessary
297 at all; the priority queue itself is an associative container.
298 Most associative containers are too structured to compete with
299 priority queues in terms of <tt>push</tt> and <tt>pop</tt>
300 performance.</p>
302 <h2><a name="pq_traits" id="pq_traits">Traits</a></h2>
304 <p>It would be nice if all priority queues could
305 share exactly the same behavior regardless of implementation. Sadly, this is not possible. Just one for instance is in join operations: joining
306 two binary heaps might throw an exception (not corrupt
307 any of the heaps on which it operates), but joining two pairing
308 heaps is exception free.</p>
310 <p>Tags and traits are very useful for manipulating generic
311 types. <a href=
312 "priority_queue.html"><tt>__gnu_pbds::priority_queue</tt></a>
313 publicly defines <tt>container_category</tt> as one of the tags
314 discussed in <a href="#pq_imp">Implementations</a>. Given any
315 container <tt>Cntnr</tt>, the tag of the underlying
316 data structure can be found via <tt><b>typename</b>
317 Cntnr::container_category</tt>; this is one of the types shown in
318 Figure <a href="#pq_tag_cd">Data-structure tag class
319 hierarchy</a>.</p>
321 <h6 class="c1"><a name="pq_tag_cd" id=
322 "pq_tag_cd"><img src="priority_queue_tag_cd.png" alt=
323 "no image" /></a></h6>
325 <h6 class="c1">Data-structure tag class hierarchy.</h6>
327 <p>Additionally, a traits mechanism can be used to query a
328 container type for its attributes. Given any container
329 <tt>Cntnr</tt>, then <tt><a href=
330 "assoc_container_traits.html">__gnu_pbds::container_traits</a>&lt;Cntnr&gt;</tt>
331 is a traits class identifying the properties of the
332 container.</p>
334 <p>To find if a container might throw if two of its objects are
335 joined, one can use <a href=
336 "assoc_container_traits.html"><tt>container_traits</tt></a><tt>&lt;Cntnr&gt;::split_join_can_throw</tt>,
337 for example.</p>
339 <p>Different priority-queue implementations have different invalidation guarantees. This is
340 especially important, since as explained in <a href=
341 "#pq_it">Iterators</a>, there is no way to access an arbitrary
342 value of priority queues except for iterators. Similarly to
343 associative containers, one can use
344 <a href=
345 "assoc_container_traits.html"><tt>container_traits</tt></a><tt>&lt;Cntnr&gt;::invalidation_guarantee</tt>
346 to get the invalidation guarantee type of a priority queue.</p>
348 <p>It is easy to understand from Figure <a href=
349 "#pq_different_underlying_dss">Underlying Priority-Queue
350 Data-Structures</a>, what <a href=
351 "assoc_container_traits.html"><tt>container_traits</tt></a><tt>&lt;Cntnr&gt;::invalidation_guarantee</tt>
352 will be for different implementations. All implementations of
353 type <a href="#pq_different_underlying_dss">Underlying
354 Priority-Queue Data-Structures</a> B have <a href=
355 "point_invalidation_guarantee.html"><tt>point_invalidation_guarantee</tt></a>:
356 the container can freely internally reorganize the nodes -
357 range-type iterators are invalidated, but point-type iterators
358 are always valid. Implementations of type <a href=
359 "#pq_different_underlying_dss">Underlying Priority-Queue
360 Data-Structures</a> A1 and A2 have <a href=
361 "basic_invalidation_guarantee.html"><tt>basic_invalidation_guarantee</tt></a>:
362 the container can freely internally reallocate the array - both
363 point-type and range-type iterators might be invalidated.</p>
365 <p>This has major implications, and constitutes a good reason to avoid
366 using binary heaps. A binary heap can perform <tt>modify</tt>
367 or <tt>erase</tt> efficiently <u>given a valid point-type
368 iterator</u>. However, inn order to supply it with a valid point-type
369 iterator, one needs to iterate (linearly) over all
370 values, then supply the relevant iterator (recall that a
371 range-type iterator can always be converted to a point-type
372 iterator). This means that if the number of <tt>modify</tt> or
373 <tt>erase</tt> operations is non-negligible (say
374 super-logarithmic in the total sequence of operations) - binary
375 heaps will perform badly.</p>
376 <pre>
378 </pre>
379 </div>
380 </body>
381 </html>