1 @node Container data types
2 @section Container data types
4 @c Copyright (C) 2019--2024 Free Software Foundation, Inc.
6 @c Permission is granted to copy, distribute and/or modify this document
7 @c under the terms of the GNU Free Documentation License, Version 1.3 or
8 @c any later version published by the Free Software Foundation; with no
9 @c Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
10 @c copy of the license is at <https://www.gnu.org/licenses/fdl-1.3.en.html>.
12 @c Written by Bruno Haible.
14 @c This macro expands to \log in TeX mode, but just 'log' in HTML and info
22 @c This macro expands to \mathopsup in TeX mode, to a superscript in HTML
23 @c mode, and to ^ without braces in info mode.
25 @macro mathopsup {EXP}
30 @macro mathopsup {EXP}
35 Gnulib provides several generic container data types. They can be used
36 to organize collections of application-defined objects.
38 @node Ordinary containers
39 @subsection Ordinary container data types
41 @multitable @columnfractions .15 .5 .1 .1 .15
45 @tab Main include file
46 @tab Include file for operations with out-of-memory checking
48 @tab Can contain any number of objects in any given order.
49 Duplicates are allowed, but can optionally be forbidden.
51 @tab @code{"gl_list.h"}
52 @tab @code{"gl_xlist.h"}
54 @tab Can contain any number of objects; the order does not matter.
55 Duplicates (in the sense of the comparator) are forbidden.
57 @tab @code{"gl_set.h"}
58 @tab @code{"gl_xset.h"}
60 @tab Can contain any number of objects in the order of a given comparator
62 Duplicates (in the sense of the comparator) are forbidden.
64 @tab @code{"gl_oset.h"}
65 @tab @code{"gl_xoset.h"}
67 @tab Can contain any number of (key, value) pairs, where keys and values
69 there are no (key, value1) and (key, value2) pairs with the same key
70 (in the sense of a given comparator function).
72 @tab @code{"gl_map.h"}
73 @tab @code{"gl_xmap.h"}
75 @tab Can contain any number of (key, value) pairs, where keys and values
77 the (key, value) pairs are ordered by the key, in the order of a given
79 there are no (key, value1) and (key, value2) pairs with the same key
80 (in the sense of the comparator function).
82 @tab @code{"gl_omap.h"}
83 @tab @code{"gl_xomap.h"}
86 Operations without out-of-memory checking (suitable for use in libraries) are
87 declared in the ``main include file''. Whereas operations with out-of-memory
88 checking (suitable only in programs) are declared in the ``include file for
89 operations with out-of-memory checking''.
91 For each of the data types, several implementations are available, with
92 different performance profiles with respect to the available operations.
93 This enables you to start with the simplest implementation (ARRAY) initially,
94 and switch to a more suitable implementation after profiling your application.
95 The implementation of each container instance is specified in a single place
96 only: in the invocation of the function @code{gl_*_create_empty} that creates
99 The implementations and the guaranteed average performance for the operations
100 for the ``sequential list'' data type are:
102 @multitable @columnfractions 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
108 @tab LINKEDHASH with duplicates
109 @tab LINKEDHASH without duplicates
110 @tab TREEHASH with duplicates
111 @tab TREEHASH without duplicates
112 @item @code{gl_list_size}
121 @item @code{gl_list_node_value}
130 @item @code{gl_list_node_set_value}
137 @tab @math{O((@log n)@mathopsup{2})}
139 @item @code{gl_list_next_node}
143 @tab @math{O(@log n)}
146 @tab @math{O(@log n)}
147 @tab @math{O(@log n)}
148 @item @code{gl_list_previous_node}
152 @tab @math{O(@log n)}
155 @tab @math{O(@log n)}
156 @tab @math{O(@log n)}
157 @item @code{gl_list_first_node}
161 @tab @math{O(@log n)}
164 @tab @math{O(@log n)}
165 @tab @math{O(@log n)}
166 @item @code{gl_list_last_node}
170 @tab @math{O(@log n)}
173 @tab @math{O(@log n)}
174 @tab @math{O(@log n)}
175 @item @code{gl_list_get_at}
179 @tab @math{O(@log n)}
182 @tab @math{O(@log n)}
183 @tab @math{O(@log n)}
184 @item @code{gl_list_get_first}
188 @tab @math{O(@log n)}
191 @tab @math{O(@log n)}
192 @tab @math{O(@log n)}
193 @item @code{gl_list_get_last}
197 @tab @math{O(@log n)}
200 @tab @math{O(@log n)}
201 @tab @math{O(@log n)}
202 @item @code{gl_list_set_at}
206 @tab @math{O(@log n)}
209 @tab @math{O((@log n)@mathopsup{2})}
210 @tab @math{O(@log n)}
211 @item @code{gl_list_set_first}
215 @tab @math{O(@log n)}
218 @tab @math{O((@log n)@mathopsup{2})}
219 @tab @math{O(@log n)}
220 @item @code{gl_list_set_last}
224 @tab @math{O(@log n)}
227 @tab @math{O((@log n)@mathopsup{2})}
228 @tab @math{O(@log n)}
229 @item @code{gl_list_search}
236 @tab @math{O(@log n)}
238 @item @code{gl_list_search_from}
245 @tab @math{O((@log n)@mathopsup{2})}
246 @tab @math{O(@log n)}
247 @item @code{gl_list_search_from_to}
254 @tab @math{O((@log n)@mathopsup{2})}
255 @tab @math{O(@log n)}
256 @item @code{gl_list_indexof}
263 @tab @math{O(@log n)}
264 @tab @math{O(@log n)}
265 @item @code{gl_list_indexof_from}
272 @tab @math{O((@log n)@mathopsup{2})}
273 @tab @math{O(@log n)}
274 @item @code{gl_list_indexof_from_to}
281 @tab @math{O((@log n)@mathopsup{2})}
282 @tab @math{O(@log n)}
283 @item @code{gl_list_add_first}
287 @tab @math{O(@log n)}
290 @tab @math{O((@log n)@mathopsup{2})}
291 @tab @math{O(@log n)}
292 @item @code{gl_list_add_last}
296 @tab @math{O(@log n)}
299 @tab @math{O((@log n)@mathopsup{2})}
300 @tab @math{O(@log n)}
301 @item @code{gl_list_add_before}
305 @tab @math{O(@log n)}
308 @tab @math{O((@log n)@mathopsup{2})}
309 @tab @math{O(@log n)}
310 @item @code{gl_list_add_after}
314 @tab @math{O(@log n)}
317 @tab @math{O((@log n)@mathopsup{2})}
318 @tab @math{O(@log n)}
319 @item @code{gl_list_add_at}
323 @tab @math{O(@log n)}
326 @tab @math{O((@log n)@mathopsup{2})}
327 @tab @math{O(@log n)}
328 @item @code{gl_list_remove_node}
332 @tab @math{O(@log n)}
335 @tab @math{O((@log n)@mathopsup{2})}
336 @tab @math{O(@log n)}
337 @item @code{gl_list_remove_at}
341 @tab @math{O(@log n)}
344 @tab @math{O((@log n)@mathopsup{2})}
345 @tab @math{O(@log n)}
346 @item @code{gl_list_remove_first}
350 @tab @math{O(@log n)}
353 @tab @math{O((@log n)@mathopsup{2})}
354 @tab @math{O(@log n)}
355 @item @code{gl_list_remove_last}
359 @tab @math{O(@log n)}
362 @tab @math{O((@log n)@mathopsup{2})}
363 @tab @math{O(@log n)}
364 @item @code{gl_list_remove}
371 @tab @math{O((@log n)@mathopsup{2})}
372 @tab @math{O(@log n)}
373 @item @code{gl_list_iterator}
377 @tab @math{O(@log n)}
380 @tab @math{O(@log n)}
381 @tab @math{O(@log n)}
382 @item @code{gl_list_iterator_from_to}
386 @tab @math{O(@log n)}
389 @tab @math{O(@log n)}
390 @tab @math{O(@log n)}
391 @item @code{gl_list_iterator_next}
395 @tab @math{O(@log n)}
398 @tab @math{O(@log n)}
399 @tab @math{O(@log n)}
400 @item @code{gl_sortedlist_search}
401 @tab @math{O(@log n)}
402 @tab @math{O(@log n)}
404 @tab @math{O(@log n)}
407 @tab @math{O(@log n)}
408 @tab @math{O(@log n)}
409 @item @code{gl_sortedlist_search_from}
410 @tab @math{O(@log n)}
411 @tab @math{O(@log n)}
413 @tab @math{O(@log n)}
416 @tab @math{O(@log n)}
417 @tab @math{O(@log n)}
418 @item @code{gl_sortedlist_indexof}
419 @tab @math{O(@log n)}
420 @tab @math{O(@log n)}
422 @tab @math{O(@log n)}
425 @tab @math{O(@log n)}
426 @tab @math{O(@log n)}
427 @item @code{gl_sortedlist_indexof_from}
428 @tab @math{O(@log n)}
429 @tab @math{O(@log n)}
431 @tab @math{O(@log n)}
434 @tab @math{O(@log n)}
435 @tab @math{O(@log n)}
436 @item @code{gl_sortedlist_add}
440 @tab @math{O(@log n)}
443 @tab @math{O((@log n)@mathopsup{2})}
444 @tab @math{O(@log n)}
445 @item @code{gl_sortedlist_remove}
449 @tab @math{O(@log n)}
452 @tab @math{O((@log n)@mathopsup{2})}
453 @tab @math{O(@log n)}
456 The implementations and the guaranteed average performance for the operations
457 for the ``set'' data type are:
459 @multitable @columnfractions 0.4 0.2 0.4
462 @tab LINKEDHASH, HASH
463 @item @code{gl_set_size}
466 @item @code{gl_set_add}
469 @item @code{gl_set_remove}
472 @item @code{gl_set_search}
475 @item @code{gl_set_iterator}
478 @item @code{gl_set_iterator_next}
483 The implementations and the guaranteed average performance for the operations
484 for the ``ordered set'' data type are:
486 @multitable @columnfractions 0.5 0.25 0.25
490 @item @code{gl_oset_size}
493 @item @code{gl_oset_add}
495 @tab @math{O(@log n)}
496 @item @code{gl_oset_remove}
498 @tab @math{O(@log n)}
499 @item @code{gl_oset_search}
500 @tab @math{O(@log n)}
501 @tab @math{O(@log n)}
502 @item @code{gl_oset_search_atleast}
503 @tab @math{O(@log n)}
504 @tab @math{O(@log n)}
505 @item @code{gl_oset_iterator}
507 @tab @math{O(@log n)}
508 @item @code{gl_oset_iterator_next}
510 @tab @math{O(@log n)}
513 The implementations and the guaranteed average performance for the operations
514 for the ``map'' data type are:
516 @multitable @columnfractions 0.4 0.2 0.4
519 @tab LINKEDHASH, HASH
520 @item @code{gl_map_size}
523 @item @code{gl_map_get}
526 @item @code{gl_map_put}
529 @item @code{gl_map_remove}
532 @item @code{gl_map_search}
535 @item @code{gl_map_iterator}
538 @item @code{gl_map_iterator_next}
543 The implementations and the guaranteed average performance for the operations
544 for the ``ordered map'' data type are:
546 @multitable @columnfractions 0.5 0.25 0.25
550 @item @code{gl_omap_size}
553 @item @code{gl_omap_get}
554 @tab @math{O(@log n)}
555 @tab @math{O(@log n)}
556 @item @code{gl_omap_put}
558 @tab @math{O(@log n)}
559 @item @code{gl_omap_remove}
561 @tab @math{O(@log n)}
562 @item @code{gl_omap_search}
563 @tab @math{O(@log n)}
564 @tab @math{O(@log n)}
565 @item @code{gl_omap_search_atleast}
566 @tab @math{O(@log n)}
567 @tab @math{O(@log n)}
568 @item @code{gl_omap_iterator}
570 @tab @math{O(@log n)}
571 @item @code{gl_omap_iterator_next}
573 @tab @math{O(@log n)}
576 For C++, Gnulib provides a C++ template class for each of these container data types.
578 @multitable @columnfractions .30 .20 .25 .25
583 @item Sequential list
586 @tab @code{"gl_list.hh"}
590 @tab @code{"gl_set.hh"}
594 @tab @code{"gl_oset.hh"}
598 @tab @code{"gl_map.hh"}
602 @tab @code{"gl_omap.hh"}
605 @node Specialized containers
606 @subsection Specialized container data types
608 The @code{hamt} module implements the hash array mapped trie (HAMT) data
609 structure. This is a data structure that contains (key, value) pairs.
610 Lookup of a (key, value) pair given the key is on average an @math{O(1)}
611 operation, assuming a good hash function for the keys is employed.
613 The HAMT data structure is useful when you want modifications (additions
614 of pairs, removal, value changes) to be visible only to some part of
615 your program, whereas other parts of the program continue to use the
616 unmodified HAMT. The HAMT makes this possible in a space-efficient
617 manner: the modified and the unmodified HAMT share most of their
618 allocated memory. It is also time-efficient: Every such modification
619 is @math{O(1)} on average, again assuming a good hash function for the keys.
621 A HAMT can be used whenever an ordinary hash table would be used. It
622 does however, provide non-destructive updating operations without the
623 need to copy the whole container. On the other hand, a hash table is
624 simpler so that its performance may be better when non-destructive
625 update operations are not needed.
627 For example, a HAMT can be used to model the dynamic environment in a
628 LISP interpreter. Updating a value in the dynamic environment of one
629 continuation frame would not modify values in earlier frames.
631 To use the module, include @code{hamt.h} in your code. The public
632 interface is documented in that header file. You have to provide a hash
633 function and an equivalence relation, which defines key equality. The
634 module includes a test file @code{test-hamt.c}, which demonstrates how
637 In the current implementation, each inner node of the HAMT can store up
638 to @math{32 = 2^5} entries and subtries. Whenever a collision between
639 the initial bits of the hash values of two entries would happen, the
640 next @math{5} bits of the hash values are examined and the two entries
641 pushed down one level in the trie.
643 HAMTs have the same average access times as hash tables but grow and
644 shrink dynamically, so they use memory more economically and do not have
645 to be periodically resized.
647 They were described and analyzed in @cite{Phil Bagwell (2000). Ideal
648 Hash Trees (Report). Infoscience Department, École Polytechnique
649 Fédérale de Lausanne.}
651 The persistence aspect of the HAMT data structure, which means that each
652 updating operation (like inserting, replacing, or removing an entry)
653 returns a new HAMT while leaving the original one intact, is achieved
654 through structure sharing, which is even safe in the presence of
655 multiple threads when the used C compiler supports atomics.