libstdc++-v3/doc/html/ext/pb_ds/pq_design.html

   1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   2     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   3
   4 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   5 <head>
   6   <meta name="generator" content=
   7   "HTML Tidy for Linux/x86 (vers 12 April 2005), see www.w3.org" />
   8
   9   <title>Priority-Queues</title>
  10   <meta http-equiv="Content-Type" content=
  11   "text/html; charset=us-ascii" />
  12   </head>
  13
  14 <body>
  15   <div id="page">
  16     <h1>Priority-Queue Design</h1>
  17
  18     <h2><a name="overview" id="overview">Overview</a></h2>
  19
  20     <p>The priority-queue container has the following
  21     declaration:</p>
  22     <pre>
  23 <b>template</b>&lt;
  24     <b>typename</b> Value_Type,
  25     <b>typename</b> Cmp_Fn = std::less&lt;Value_Type&gt;,
  26     <b>typename</b> Tag = <a href="pairing_heap_tag.html">pairing_heap_tag</a>,
  27     <b>typename</b> Allocator = std::allocator&lt;<b>char</b>&gt; &gt;
  28 <b>class</b> <a href="priority_queue.html">priority_queue</a>;
  29 </pre>
  30
  31     <p>The parameters have the following meaning:</p>
  32
  33     <ol>
  34       <li><tt>Value_Type</tt> is the value type.</li>
  35
  36       <li><tt>Cmp_Fn</tt> is a value comparison functor</li>
  37
  38       <li><tt>Tag</tt> specifies which underlying data structure
  39       to use.</li>
  40
  41       <li><tt>Allocator</tt> is an allocator
  42       type.</li>
  43     </ol>
  44
  45     <p>The <tt>Tag</tt> parameter specifies which underlying
  46     data structure to use. Instantiating it by <a href=
  47     "pairing_heap_tag.html"><tt>pairing_heap_tag</tt></a>,
  48     <a href=
  49     "binary_heap_tag.html"><tt>binary_heap_tag</tt></a>,
  50     <a href=
  51     "binomial_heap_tag.html"><tt>binomial_heap_tag</tt></a>,
  52     <a href=
  53     "rc_binomial_heap_tag.html"><tt>rc_binomial_heap_tag</tt></a>,
  54     or <a href=
  55     "thin_heap_tag.html"><tt>thin_heap_tag</tt></a>,
  56     specifies, respectively, an underlying pairing heap [<a href=
  57     "references.html#fredman86pairing">fredman86pairing</a>],
  58     binary heap [<a href="references.html#clrs2001">clrs2001</a>],
  59     binomial heap [<a href=
  60     "references.html#clrs2001">clrs2001</a>], a binomial heap with
  61     a redundant binary counter [<a href=
  62     "references.html#maverik_lowerbounds">maverik_lowerbounds</a>],
  63     or a thin heap [<a href=
  64     "references.html#kt99fat_heaps">kt99fat_heas</a>]. These are
  65     explained further in <a href="#pq_imp">Implementations</a>.</p>
  66
  67     <p>As mentioned in <a href=
  68     "tutorial.html#pq">Tutorial::Priority Queues</a>,
  69     <a href=
  70     "priority_queue.html"><tt>__gnu_pbds::priority_queue</tt></a>
  71     shares most of the same interface with <tt>std::priority_queue</tt>.
  72     <i>E.g.</i> if <tt>q</tt> is a priority queue of type
  73     <tt>Q</tt>, then <tt>q.top()</tt> will return the "largest"
  74     value in the container (according to <tt><b>typename</b>
  75     Q::cmp_fn</tt>). <a href=
  76     "priority_queue.html"><tt>__gnu_pbds::priority_queue</tt></a>
  77     has a larger (and very slightly different) interface than
  78     <tt>std::priority_queue</tt>, however, since typically
  79     <tt>push</tt> and <tt>pop</tt> are deemed insufficient for
  80     manipulating priority-queues. </p>
  81
  82     <p>Different settings require different priority-queue
  83     implementations which are described in <a href=
  84     "#pq_imp">Implementations</a>; <a href="#pq_traits">Traits</a>
  85     discusses ways to differentiate between the different traits of
  86     different implementations.</p>
  87
  88     <h2><a name="pq_it" id="pq_it">Iterators</a></h2>
  89
  90     <p>There are many different underlying-data structures for
  91     implementing priority queues. Unfortunately, most such
  92     structures are oriented towards making <tt>push</tt> and
  93     <tt>top</tt> efficient, and consequently don't allow efficient
  94     access of other elements: for instance, they cannot support an efficient
  95     <tt>find</tt> method. In the use case where it
  96     is important to both access and "do something with" an
  97     arbitrary value, one would be out of luck. For example, many graph algorithms require
  98     modifying a value (typically increasing it in the sense of the
  99     priority queue's comparison functor).</p>
 100
 101     <p>In order to access and manipulate an arbitrary value in a
 102     priority queue, one needs to reference the internals of the
 103     priority queue from some form of an associative container -
 104     this is unavoidable. Of course, in order to maintain the
 105     encapsulation of the priority queue, this needs to be done in a
 106     way that minimizes exposure to implementation internals.</p>
 107
 108     <p>In <tt>pb_ds</tt> the priority queue's <tt>insert</tt>
 109     method returns an iterator, which if valid can be used for subsequent <tt>modify</tt> and
 110     <tt>erase</tt> operations. This both preserves the priority
 111     queue's encapsulation, and allows accessing arbitrary values (since the
 112     returned iterators from the <tt>push</tt> operation can be
 113     stored in some form of associative container).</p>
 114
 115     <p>Priority queues' iterators present a problem regarding their
 116     invalidation guarantees. One assumes that calling
 117     <tt><b>operator</b>++</tt> on an iterator will associate it
 118     with the "next" value. Priority-queues are
 119     self-organizing: each operation changes what the "next" value
 120     means. Consequently, it does not make sense that <tt>push</tt>
 121     will return an iterator that can be incremented - this can have
 122     no possible use. Also, as in the case of hash-based containers,
 123     it is awkward to define if a subsequent <tt>push</tt> operation
 124     invalidates a prior returned iterator: it invalidates it in the
 125     sense that its "next" value is not related to what it
 126     previously considered to be its "next" value. However, it might not
 127     invalidate it, in the sense that it can be
 128     de-referenced and used for <tt>modify</tt> and <tt>erase</tt>
 129     operations.</p>
 130
 131     <p>Similarly to the case of the other unordered associative
 132     containers, <tt>pb_ds</tt> uses a distinction between
 133     point-type and range type iterators. A priority queue's <tt>iterator</tt> can always be
 134     converted to a <tt>point_iterator</tt>, and a
 135     <tt>const_iterator</tt> can always be converted to a
 136     <tt>const_point_iterator</tt>.</p>
 137
 138     <p>The following snippet demonstrates manipulating an arbitrary
 139     value:</p>
 140     <pre>
 141 <i>// A priority queue of integers.</i>
 142 <a href=
 143 "priority_queue.html">priority_queue</a>&lt;<b>int</b>&gt; p;
 144
 145 <i>// Insert some values into the priority queue.</i>
 146 <a href=
 147 "priority_queue.html">priority_queue</a>&lt;<b>int</b>&gt;::point_iterator it = p.push(0);
 148
 149 p.push(1);
 150 p.push(2);
 151
 152 <i>// Now modify a value.</i>
 153 p.modify(it, 3);
 154
 155 assert(p.top() == 3);
 156 </pre>
 157
 158     <p>(<a href="pq_examples.html#xref">Priority Queue
 159     Examples::Cross-Referencing</a> shows a more detailed
 160     example.)</p>
 161
 162     <p>It should be noted that an alternative design could embed an
 163     associative container in a priority queue. Could, but most probably should not. To begin with, it should be noted that one
 164     could always encapsulate a priority queue and an associative
 165     container mapping values to priority queue iterators with no
 166     performance loss. One cannot, however, "un-encapsulate" a
 167     priority queue embedding an associative container, which might
 168     lead to performance loss. Assume, that one needs to
 169     associate each value with some data unrelated to priority
 170     queues. Then using <tt>pb_ds</tt>'s design, one could use an
 171     associative container mapping each value to a pair consisting
 172     of this data and a priority queue's iterator. Using the
 173     embedded method would need to use two associative
 174     containers. Similar problems might arise in cases where a value
 175     can reside simultaneously in many priority queues.</p>
 176
 177     <h2><a name="pq_imp" id="pq_imp">Implementations</a></h2>
 178
 179     <p>There are three main implementations of priority queues: the
 180     first employs a binary heap, typically one which uses a
 181     sequence; the second uses a tree (or forest of trees), which is
 182     typically less structured than an associative container's tree;
 183     the third simply uses an associative container. These are
 184     shown, respectively, in Figures <a href=
 185     "#pq_different_underlying_dss">Underlying Priority-Queue
 186     Data-Structures</a> A1 and A2, Figure <a href=
 187     "#pq_different_underlying_dss">Underlying Priority-Queue
 188     Data-Structures</a> B, and Figures <a href=
 189     "#pq_different_underlying_dss">Underlying Priority-Queue
 190     Data-Structures</a> C.</p>
 191
 192     <h6 class="c1"><a name="pq_different_underlying_dss" id=
 193     "pq_different_underlying_dss"><img src=
 194     "pq_different_underlying_dss.png" alt="no image" /></a></h6>
 195
 196     <h6 class="c1">Underlying Priority-Queue Data-Structures.</h6>
 197
 198     <p>Roughly speaking, any value that is both pushed and popped
 199     from a priority queue must incur a logarithmic expense (in the
 200     amortized sense). Any priority queue implementation that would
 201     avoid this, would violate known bounds on comparison-based
 202     sorting (see, <i>e.g.</i>, [<a href=
 203     "references.html#clrs2001">clrs2001</a>] and <a href=
 204     "references.html#brodal96priority">brodal96priority</a>]).</p>
 205
 206     <p>Most implementations do
 207     not differ in the asymptotic amortized complexity of
 208     <tt>push</tt> and <tt>pop</tt> operations, but they differ in
 209     the constants involved, in the complexity of other operations
 210     (<i>e.g.</i>, <tt>modify</tt>), and in the worst-case
 211     complexity of single operations. In general, the more
 212     "structured" an implementation (<i>i.e.</i>, the more internal
 213     invariants it possesses) - the higher its amortized complexity
 214     of <tt>push</tt> and <tt>pop</tt> operations.</p>
 215
 216     <p><tt>pb_ds</tt> implements different algorithms using a
 217     single class: <a href="priority_queue.html">priority_queue</a>.
 218     Instantiating the <tt>Tag</tt> template parameter, "selects"
 219     the implementation:</p>
 220
 221     <ol>
 222       <li>Instantiating <tt>Tag = <a href=
 223       "binary_heap_tag.html">binary_heap_tag</a></tt> creates
 224       a binary heap of the form in Figures <a href=
 225       "#pq_different_underlying_dss">Underlying Priority-Queue
 226       Data-Structures</a> A1 or A2. The former is internally
 227       selected by <a href="priority_queue.html">priority_queue</a>
 228       if <tt>Value_Type</tt> is instantiated by a primitive type
 229       (<i>e.g.</i>, an <tt><b>int</b></tt>); the latter is
 230       internally selected for all other types (<i>e.g.</i>,
 231       <tt>std::string</tt>). This implementations is relatively
 232       unstructured, and so has good <tt>push</tt> and <tt>pop</tt>
 233       performance; it is the "best-in-kind" for primitive
 234       types, <i>e.g.</i>, <tt><b>int</b></tt>s. Conversely, it has
 235       high worst-case performance, and can support only linear-time
 236       <tt>modify</tt> and <tt>erase</tt> operations; this is
 237       explained further in <a href="#pq_traits">Traits</a>.</li>
 238
 239       <li>Instantiating <tt>Tag = <a href=
 240       "pairing_heap_tag.html">pairing_heap_tag</a></tt>
 241       creates a pairing heap of the form in Figure <a href=
 242       "#pq_different_underlying_dss">Underlying Priority-Queue
 243       Data-Structures</a> B. This implementations too is relatively
 244       unstructured, and so has good <tt>push</tt> and <tt>pop</tt>
 245       performance; it is the "best-in-kind" for non-primitive
 246       types, <i>e.g.</i>, <tt>std:string</tt>s. It also has very
 247       good worst-case <tt>push</tt> and <tt>join</tt> performance
 248       (<i>O(1)</i>), but has high worst-case <tt>pop</tt>
 249       complexity.</li>
 250
 251       <li>Instantiating <tt>Tag = <a href=
 252       "binomial_heap_tag.html">binomial_heap_tag</a></tt>
 253       creates a binomial heap of the form in Figure <a href=
 254       "#pq_different_underlying_dss">Underlying Priority-Queue
 255       Data-Structures</a> B. This implementations is more
 256       structured than a pairing heap, and so has worse
 257       <tt>push</tt> and <tt>pop</tt> performance. Conversely, it
 258       has sub-linear worst-case bounds for <tt>pop</tt>,
 259       <i>e.g.</i>, and so it might be preferred in cases where
 260       responsiveness is important.</li>
 261
 262       <li>Instantiating <tt>Tag = <a href=
 263       "rc_binomial_heap_tag.html">rc_binomial_heap_tag</a></tt>
 264       creates a binomial heap of the form in Figure <a href=
 265       "#pq_different_underlying_dss">Underlying Priority-Queue
 266       Data-Structures</a> B, accompanied by a redundant counter
 267       which governs the trees. This implementations is therefore
 268       more structured than a binomial heap, and so has worse
 269       <tt>push</tt> and <tt>pop</tt> performance. Conversely, it
 270       guarantees <i>O(1)</i> <tt>push</tt> complexity, and so it
 271       might be preferred in cases where the responsiveness of a
 272       binomial heap is insufficient.</li>
 273
 274       <li>Instantiating <tt>Tag = <a href=
 275       "thin_heap_tag.html">thin_heap_tag</a></tt> creates a
 276       thin heap of the form in Figure <a href=
 277       "#pq_different_underlying_dss">Underlying Priority-Queue
 278       Data-Structures</a> B. This implementations too is more
 279       structured than a pairing heap, and so has worse
 280       <tt>push</tt> and <tt>pop</tt> performance. Conversely, it
 281       has better worst-case and identical amortized complexities
 282       than a Fibonacci heap, and so might be more appropriate for
 283       some graph algorithms.</li>
 284     </ol>
 285
 286     <p><a href="pq_performance_tests.html">Priority-Queue
 287     Performance Tests</a> shows some results for the above, and
 288     discusses these points further.</p>
 289
 290     <p>Of course, one can use any order-preserving associative
 291     container as a priority queue, as in Figure <a href=
 292     "#pq_different_underlying_dss">Underlying Priority-Queue
 293     Data-Structures</a> C, possibly by creating an adapter class
 294     over the associative container (much as
 295     <tt>std::priority_queue</tt> can adapt <tt>std::vector</tt>).
 296     This has the advantage that no cross-referencing is necessary
 297     at all; the priority queue itself is an associative container.
 298     Most associative containers are too structured to compete with
 299     priority queues in terms of <tt>push</tt> and <tt>pop</tt>
 300     performance.</p>
 301
 302     <h2><a name="pq_traits" id="pq_traits">Traits</a></h2>
 303
 304     <p>It would be nice if all priority queues could
 305     share exactly the same behavior regardless of implementation. Sadly, this is not possible. Just one for instance is in join operations: joining
 306     two binary heaps might throw an exception (not corrupt
 307     any of the heaps on which it operates), but joining two pairing
 308     heaps is exception free.</p>
 309
 310     <p>Tags and traits are very useful for manipulating generic
 311     types. <a href=
 312     "priority_queue.html"><tt>__gnu_pbds::priority_queue</tt></a>
 313     publicly defines <tt>container_category</tt> as one of the tags
 314     discussed in <a href="#pq_imp">Implementations</a>. Given any
 315     container <tt>Cntnr</tt>, the tag of the underlying
 316     data structure can be found via <tt><b>typename</b>
 317     Cntnr::container_category</tt>; this is one of the types shown in
 318     Figure <a href="#pq_tag_cd">Data-structure tag class
 319     hierarchy</a>.</p>
 320
 321     <h6 class="c1"><a name="pq_tag_cd" id=
 322     "pq_tag_cd"><img src="priority_queue_tag_cd.png" alt=
 323     "no image" /></a></h6>
 324
 325     <h6 class="c1">Data-structure tag class hierarchy.</h6>
 326
 327     <p>Additionally, a traits mechanism can be used to query a
 328     container type for its attributes. Given any container
 329     <tt>Cntnr</tt>, then <tt><a href=
 330     "assoc_container_traits.html">__gnu_pbds::container_traits</a>&lt;Cntnr&gt;</tt>
 331     is a traits class identifying the properties of the
 332     container.</p>
 333
 334     <p>To find if a container might throw if two of its objects are
 335     joined, one can use <a href=
 336     "assoc_container_traits.html"><tt>container_traits</tt></a><tt>&lt;Cntnr&gt;::split_join_can_throw</tt>,
 337     for example.</p>
 338
 339     <p>Different priority-queue implementations have different invalidation guarantees. This is
 340     especially important, since as explained in <a href=
 341     "#pq_it">Iterators</a>, there is no way to access an arbitrary
 342     value of priority queues except for iterators. Similarly to
 343     associative containers, one can use
 344     <a href=
 345     "assoc_container_traits.html"><tt>container_traits</tt></a><tt>&lt;Cntnr&gt;::invalidation_guarantee</tt>
 346     to get the invalidation guarantee type of a priority queue.</p>
 347
 348     <p>It is easy to understand from Figure <a href=
 349     "#pq_different_underlying_dss">Underlying Priority-Queue
 350     Data-Structures</a>, what <a href=
 351     "assoc_container_traits.html"><tt>container_traits</tt></a><tt>&lt;Cntnr&gt;::invalidation_guarantee</tt>
 352     will be for different implementations. All implementations of
 353     type <a href="#pq_different_underlying_dss">Underlying
 354     Priority-Queue Data-Structures</a> B have <a href=
 355     "point_invalidation_guarantee.html"><tt>point_invalidation_guarantee</tt></a>:
 356     the container can freely internally reorganize the nodes -
 357     range-type iterators are invalidated, but point-type iterators
 358     are always valid. Implementations of type <a href=
 359     "#pq_different_underlying_dss">Underlying Priority-Queue
 360     Data-Structures</a> A1 and A2 have <a href=
 361     "basic_invalidation_guarantee.html"><tt>basic_invalidation_guarantee</tt></a>:
 362     the container can freely internally reallocate the array - both
 363     point-type and range-type iterators might be invalidated.</p>
 364
 365     <p>This has major implications, and constitutes a good reason to avoid
 366     using binary heaps. A binary heap can perform <tt>modify</tt>
 367     or <tt>erase</tt> efficiently <u>given a valid point-type
 368     iterator</u>. However, inn order to supply it with a valid point-type
 369     iterator, one needs to iterate (linearly) over all
 370     values, then supply the relevant iterator (recall that a
 371     range-type iterator can always be converted to a point-type
 372     iterator). This means that if the number of <tt>modify</tt> or
 373     <tt>erase</tt> operations is non-negligible (say
 374     super-logarithmic in the total sequence of operations) - binary
 375     heaps will perform badly.</p>
 376     <pre>
 377
 378 </pre>
 379   </div>
 380 </body>
 381 </html>