2 /* Dictionary object implementation using a hash table */
4 /* The distribution includes a separate file, Objects/dictnotes.txt,
5 describing explorations into dictionary design and optimization.
6 It covers typical dictionary use patterns, the parameters for
7 tuning dictionaries, and several ideas for possible optimizations.
13 /* Set a key error with the specified argument, wrapping it in a
14 * tuple automatically so that tuple keys are not unpacked as the
15 * exception arguments. */
17 set_key_error(PyObject
*arg
)
20 tup
= PyTuple_Pack(1, arg
);
22 return; /* caller will expect error to be set anyway */
23 PyErr_SetObject(PyExc_KeyError
, tup
);
27 /* Define this out if you don't want conversion statistics on exit. */
28 #undef SHOW_CONVERSION_COUNTS
30 /* See large comment block below. This must be >= 1. */
31 #define PERTURB_SHIFT 5
34 Major subtleties ahead: Most hash schemes depend on having a "good" hash
35 function, in the sense of simulating randomness. Python doesn't: its most
36 important hash functions (for strings and ints) are very regular in common
39 >>> map(hash, (0, 1, 2, 3))
41 >>> map(hash, ("namea", "nameb", "namec", "named"))
42 [-1658398457, -1658398460, -1658398459, -1658398462]
45 This isn't necessarily bad! To the contrary, in a table of size 2**i, taking
46 the low-order i bits as the initial table index is extremely fast, and there
47 are no collisions at all for dicts indexed by a contiguous range of ints.
48 The same is approximately true when keys are "consecutive" strings. So this
49 gives better-than-random behavior in common cases, and that's very desirable.
51 OTOH, when collisions occur, the tendency to fill contiguous slices of the
52 hash table makes a good collision resolution strategy crucial. Taking only
53 the last i bits of the hash code is also vulnerable: for example, consider
54 [i << 16 for i in range(20000)] as a set of keys. Since ints are their own
55 hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
56 hash code are all 0: they *all* map to the same table index.
58 But catering to unusual cases should not slow the usual ones, so we just take
59 the last i bits anyway. It's up to collision resolution to do the rest. If
60 we *usually* find the key we're looking for on the first try (and, it turns
61 out, we usually do -- the table load factor is kept under 2/3, so the odds
62 are solidly in our favor), then it makes best sense to keep the initial index
63 computation dirt cheap.
65 The first half of collision resolution is to visit table indices via this
68 j = ((5*j) + 1) mod 2**i
70 For any initial j in range(2**i), repeating that 2**i times generates each
71 int in range(2**i) exactly once (see any text on random-number generation for
72 proof). By itself, this doesn't help much: like linear probing (setting
73 j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
74 order. This would be bad, except that's not the only thing we do, and it's
75 actually *good* in the common cases where hash keys are consecutive. In an
76 example that's really too small to make this entirely clear, for a table of
77 size 2**3 the order of indices is:
79 0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]
81 If two things come in at index 5, the first place we look after is index 2,
82 not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
83 Linear probing is deadly in this case because there the fixed probe order
84 is the *same* as the order consecutive keys are likely to arrive. But it's
85 extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
86 and certain that consecutive hash codes do not.
88 The other half of the strategy is to get the other bits of the hash code
89 into play. This is done by initializing a (unsigned) vrbl "perturb" to the
90 full hash code, and changing the recurrence to:
92 j = (5*j) + 1 + perturb;
93 perturb >>= PERTURB_SHIFT;
94 use j % 2**i as the next table index;
96 Now the probe sequence depends (eventually) on every bit in the hash code,
97 and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
98 because it quickly magnifies small differences in the bits that didn't affect
99 the initial index. Note that because perturb is unsigned, if the recurrence
100 is executed often enough perturb eventually becomes and remains 0. At that
101 point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
102 that's certain to find an empty slot eventually (since it generates every int
103 in range(2**i), and we make sure there's always at least one empty slot).
105 Selecting a good value for PERTURB_SHIFT is a balancing act. You want it
106 small so that the high bits of the hash code continue to affect the probe
107 sequence across iterations; but you want it large so that in really bad cases
108 the high-order hash bits have an effect on early iterations. 5 was "the
109 best" in minimizing total collisions across experiments Tim Peters ran (on
110 both normal and pathological cases), but 4 and 6 weren't significantly worse.
112 Historical: Reimer Behrends contributed the idea of using a polynomial-based
113 approach, using repeated multiplication by x in GF(2**n) where an irreducible
114 polynomial for each table size was chosen such that x was a primitive root.
115 Christian Tismer later extended that to use division by x instead, as an
116 efficient way to get the high bits of the hash code into play. This scheme
117 also gave excellent collision statistics, but was more expensive: two
118 if-tests were required inside the loop; computing "the next" index took about
119 the same number of operations but without as much potential parallelism
120 (e.g., computing 5*j can go on at the same time as computing 1+perturb in the
121 above, and then shifting perturb can be done while the table index is being
122 masked); and the PyDictObject struct required a member to hold the table's
123 polynomial. In Tim's experiments the current scheme ran faster, produced
124 equally good collision statistics, needed less code & used less memory.
126 Theoretical Python 2.5 headache: hash codes are only C "long", but
127 sizeof(Py_ssize_t) > sizeof(long) may be possible. In that case, and if a
128 dict is genuinely huge, then only the slots directly reachable via indexing
129 by a C long can be the first slot in a probe sequence. The probe sequence
130 will still eventually reach every slot in the table, but the collision rate
131 on initial probes may be much higher than this scheme was designed for.
132 Getting a hash code as fat as Py_ssize_t is the only real cure. But in
133 practice, this probably won't make a lick of difference for many years (at
134 which point everyone will have terabytes of RAM on 64-bit boxes).
137 /* Object used as dummy key to fill deleted entries */
138 static PyObject
*dummy
= NULL
; /* Initialized by first call to newPyDictObject() */
148 /* forward declarations */
150 lookdict_string(PyDictObject
*mp
, PyObject
*key
, long hash
);
152 #ifdef SHOW_CONVERSION_COUNTS
153 static long created
= 0L;
154 static long converted
= 0L;
159 fprintf(stderr
, "created %ld string dicts\n", created
);
160 fprintf(stderr
, "converted %ld to normal dicts\n", converted
);
161 fprintf(stderr
, "%.2f%% conversion rate\n", (100.0*converted
)/created
);
165 /* Debug statistic to compare allocations with reuse through the free list */
166 #undef SHOW_ALLOC_COUNT
167 #ifdef SHOW_ALLOC_COUNT
168 static size_t count_alloc
= 0;
169 static size_t count_reuse
= 0;
174 fprintf(stderr
, "Dict allocations: %" PY_FORMAT_SIZE_T
"d\n",
176 fprintf(stderr
, "Dict reuse through freelist: %" PY_FORMAT_SIZE_T
178 fprintf(stderr
, "%.2f%% reuse rate\n\n",
179 (100.0*count_reuse
/(count_alloc
+count_reuse
)));
183 /* Initialization macros.
184 There are two ways to create a dict: PyDict_New() is the main C API
185 function, and the tp_new slot maps to dict_new(). In the latter case we
186 can save a little time over what PyDict_New does because it's guaranteed
187 that the PyDictObject struct is already zeroed out.
188 Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
189 an excellent reason not to).
192 #define INIT_NONZERO_DICT_SLOTS(mp) do { \
193 (mp)->ma_table = (mp)->ma_smalltable; \
194 (mp)->ma_mask = PyDict_MINSIZE - 1; \
197 #define EMPTY_TO_MINSIZE(mp) do { \
198 memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable)); \
199 (mp)->ma_used = (mp)->ma_fill = 0; \
200 INIT_NONZERO_DICT_SLOTS(mp); \
203 /* Dictionary reuse scheme to save calls to malloc, free, and memset */
204 #ifndef PyDict_MAXFREELIST
205 #define PyDict_MAXFREELIST 80
207 static PyDictObject
*free_list
[PyDict_MAXFREELIST
];
208 static int numfree
= 0;
216 op
= free_list
[--numfree
];
217 assert(PyDict_CheckExact(op
));
225 register PyDictObject
*mp
;
226 if (dummy
== NULL
) { /* Auto-initialize dummy */
227 dummy
= PyString_FromString("<dummy key>");
230 #ifdef SHOW_CONVERSION_COUNTS
231 Py_AtExit(show_counts
);
233 #ifdef SHOW_ALLOC_COUNT
234 Py_AtExit(show_alloc
);
238 mp
= free_list
[--numfree
];
240 assert (Py_TYPE(mp
) == &PyDict_Type
);
241 _Py_NewReference((PyObject
*)mp
);
243 EMPTY_TO_MINSIZE(mp
);
245 assert (mp
->ma_used
== 0);
246 assert (mp
->ma_table
== mp
->ma_smalltable
);
247 assert (mp
->ma_mask
== PyDict_MINSIZE
- 1);
248 #ifdef SHOW_ALLOC_COUNT
252 mp
= PyObject_GC_New(PyDictObject
, &PyDict_Type
);
255 EMPTY_TO_MINSIZE(mp
);
256 #ifdef SHOW_ALLOC_COUNT
260 mp
->ma_lookup
= lookdict_string
;
261 #ifdef SHOW_CONVERSION_COUNTS
264 _PyObject_GC_TRACK(mp
);
265 return (PyObject
*)mp
;
269 The basic lookup function used by all operations.
270 This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
271 Open addressing is preferred over chaining since the link overhead for
272 chaining would be substantial (100% with typical malloc overhead).
274 The initial probe index is computed as hash mod the table size. Subsequent
275 probe indices are computed as explained earlier.
277 All arithmetic on hash should ignore overflow.
279 (The details in this version are due to Tim Peters, building on many past
280 contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
283 lookdict() is general-purpose, and may return NULL if (and only if) a
284 comparison raises an exception (this was new in Python 2.5).
285 lookdict_string() below is specialized to string keys, comparison of which can
286 never raise an exception; that function can never return NULL. For both, when
287 the key isn't found a PyDictEntry* is returned for which the me_value field is
288 NULL; this is the slot in the dict at which the key would have been found, and
289 the caller can (if it wishes) add the <key, value> pair to the returned
293 lookdict(PyDictObject
*mp
, PyObject
*key
, register long hash
)
296 register size_t perturb
;
297 register PyDictEntry
*freeslot
;
298 register size_t mask
= (size_t)mp
->ma_mask
;
299 PyDictEntry
*ep0
= mp
->ma_table
;
300 register PyDictEntry
*ep
;
304 i
= (size_t)hash
& mask
;
306 if (ep
->me_key
== NULL
|| ep
->me_key
== key
)
309 if (ep
->me_key
== dummy
)
312 if (ep
->me_hash
== hash
) {
313 startkey
= ep
->me_key
;
315 cmp
= PyObject_RichCompareBool(startkey
, key
, Py_EQ
);
319 if (ep0
== mp
->ma_table
&& ep
->me_key
== startkey
) {
324 /* The compare did major nasty stuff to the
326 * XXX A clever adversary could prevent this
327 * XXX from terminating.
329 return lookdict(mp
, key
, hash
);
335 /* In the loop, me_key == dummy is by far (factor of 100s) the
336 least likely outcome, so test for that last. */
337 for (perturb
= hash
; ; perturb
>>= PERTURB_SHIFT
) {
338 i
= (i
<< 2) + i
+ perturb
+ 1;
340 if (ep
->me_key
== NULL
)
341 return freeslot
== NULL
? ep
: freeslot
;
342 if (ep
->me_key
== key
)
344 if (ep
->me_hash
== hash
&& ep
->me_key
!= dummy
) {
345 startkey
= ep
->me_key
;
347 cmp
= PyObject_RichCompareBool(startkey
, key
, Py_EQ
);
351 if (ep0
== mp
->ma_table
&& ep
->me_key
== startkey
) {
356 /* The compare did major nasty stuff to the
358 * XXX A clever adversary could prevent this
359 * XXX from terminating.
361 return lookdict(mp
, key
, hash
);
364 else if (ep
->me_key
== dummy
&& freeslot
== NULL
)
367 assert(0); /* NOT REACHED */
372 * Hacked up version of lookdict which can assume keys are always strings;
373 * this assumption allows testing for errors during PyObject_RichCompareBool()
374 * to be dropped; string-string comparisons never raise exceptions. This also
375 * means we don't need to go through PyObject_RichCompareBool(); we can always
376 * use _PyString_Eq() directly.
378 * This is valuable because dicts with only string keys are very common.
381 lookdict_string(PyDictObject
*mp
, PyObject
*key
, register long hash
)
384 register size_t perturb
;
385 register PyDictEntry
*freeslot
;
386 register size_t mask
= (size_t)mp
->ma_mask
;
387 PyDictEntry
*ep0
= mp
->ma_table
;
388 register PyDictEntry
*ep
;
390 /* Make sure this function doesn't have to handle non-string keys,
391 including subclasses of str; e.g., one reason to subclass
392 strings is to override __eq__, and for speed we don't cater to
394 if (!PyString_CheckExact(key
)) {
395 #ifdef SHOW_CONVERSION_COUNTS
398 mp
->ma_lookup
= lookdict
;
399 return lookdict(mp
, key
, hash
);
403 if (ep
->me_key
== NULL
|| ep
->me_key
== key
)
405 if (ep
->me_key
== dummy
)
408 if (ep
->me_hash
== hash
&& _PyString_Eq(ep
->me_key
, key
))
413 /* In the loop, me_key == dummy is by far (factor of 100s) the
414 least likely outcome, so test for that last. */
415 for (perturb
= hash
; ; perturb
>>= PERTURB_SHIFT
) {
416 i
= (i
<< 2) + i
+ perturb
+ 1;
418 if (ep
->me_key
== NULL
)
419 return freeslot
== NULL
? ep
: freeslot
;
420 if (ep
->me_key
== key
421 || (ep
->me_hash
== hash
422 && ep
->me_key
!= dummy
423 && _PyString_Eq(ep
->me_key
, key
)))
425 if (ep
->me_key
== dummy
&& freeslot
== NULL
)
428 assert(0); /* NOT REACHED */
433 Internal routine to insert a new item into the table.
434 Used both by the internal resize routine and by the public insert routine.
435 Eats a reference to key and one to value.
436 Returns -1 if an error occurred, or 0 on success.
439 insertdict(register PyDictObject
*mp
, PyObject
*key
, long hash
, PyObject
*value
)
442 register PyDictEntry
*ep
;
443 typedef PyDictEntry
*(*lookupfunc
)(PyDictObject
*, PyObject
*, long);
445 assert(mp
->ma_lookup
!= NULL
);
446 ep
= mp
->ma_lookup(mp
, key
, hash
);
452 if (ep
->me_value
!= NULL
) {
453 old_value
= ep
->me_value
;
454 ep
->me_value
= value
;
455 Py_DECREF(old_value
); /* which **CAN** re-enter */
459 if (ep
->me_key
== NULL
)
462 assert(ep
->me_key
== dummy
);
466 ep
->me_hash
= (Py_ssize_t
)hash
;
467 ep
->me_value
= value
;
474 Internal routine used by dictresize() to insert an item which is
475 known to be absent from the dict. This routine also assumes that
476 the dict contains no deleted entries. Besides the performance benefit,
477 using insertdict() in dictresize() is dangerous (SF bug #1456209).
478 Note that no refcounts are changed by this routine; if needed, the caller
479 is responsible for incref'ing `key` and `value`.
482 insertdict_clean(register PyDictObject
*mp
, PyObject
*key
, long hash
,
486 register size_t perturb
;
487 register size_t mask
= (size_t)mp
->ma_mask
;
488 PyDictEntry
*ep0
= mp
->ma_table
;
489 register PyDictEntry
*ep
;
493 for (perturb
= hash
; ep
->me_key
!= NULL
; perturb
>>= PERTURB_SHIFT
) {
494 i
= (i
<< 2) + i
+ perturb
+ 1;
497 assert(ep
->me_value
== NULL
);
500 ep
->me_hash
= (Py_ssize_t
)hash
;
501 ep
->me_value
= value
;
506 Restructure the table by allocating a new table and reinserting all
507 items again. When entries have been deleted, the new table may
508 actually be smaller than the old one.
511 dictresize(PyDictObject
*mp
, Py_ssize_t minused
)
514 PyDictEntry
*oldtable
, *newtable
, *ep
;
516 int is_oldtable_malloced
;
517 PyDictEntry small_copy
[PyDict_MINSIZE
];
519 assert(minused
>= 0);
521 /* Find the smallest table size > minused. */
522 for (newsize
= PyDict_MINSIZE
;
523 newsize
<= minused
&& newsize
> 0;
531 /* Get space for a new table. */
532 oldtable
= mp
->ma_table
;
533 assert(oldtable
!= NULL
);
534 is_oldtable_malloced
= oldtable
!= mp
->ma_smalltable
;
536 if (newsize
== PyDict_MINSIZE
) {
537 /* A large table is shrinking, or we can't get any smaller. */
538 newtable
= mp
->ma_smalltable
;
539 if (newtable
== oldtable
) {
540 if (mp
->ma_fill
== mp
->ma_used
) {
541 /* No dummies, so no point doing anything. */
544 /* We're not going to resize it, but rebuild the
545 table anyway to purge old dummy entries.
546 Subtle: This is *necessary* if fill==size,
547 as lookdict needs at least one virgin slot to
548 terminate failing searches. If fill < size, it's
549 merely desirable, as dummies slow searches. */
550 assert(mp
->ma_fill
> mp
->ma_used
);
551 memcpy(small_copy
, oldtable
, sizeof(small_copy
));
552 oldtable
= small_copy
;
556 newtable
= PyMem_NEW(PyDictEntry
, newsize
);
557 if (newtable
== NULL
) {
563 /* Make the dict empty, using the new table. */
564 assert(newtable
!= oldtable
);
565 mp
->ma_table
= newtable
;
566 mp
->ma_mask
= newsize
- 1;
567 memset(newtable
, 0, sizeof(PyDictEntry
) * newsize
);
572 /* Copy the data over; this is refcount-neutral for active entries;
573 dummy entries aren't copied over, of course */
574 for (ep
= oldtable
; i
> 0; ep
++) {
575 if (ep
->me_value
!= NULL
) { /* active entry */
577 insertdict_clean(mp
, ep
->me_key
, (long)ep
->me_hash
,
580 else if (ep
->me_key
!= NULL
) { /* dummy entry */
582 assert(ep
->me_key
== dummy
);
583 Py_DECREF(ep
->me_key
);
585 /* else key == value == NULL: nothing to do */
588 if (is_oldtable_malloced
)
593 /* Create a new dictionary pre-sized to hold an estimated number of elements.
594 Underestimates are okay because the dictionary will resize as necessary.
595 Overestimates just mean the dictionary will be more sparse than usual.
599 _PyDict_NewPresized(Py_ssize_t minused
)
601 PyObject
*op
= PyDict_New();
603 if (minused
>5 && op
!= NULL
&& dictresize((PyDictObject
*)op
, minused
) == -1) {
610 /* Note that, for historical reasons, PyDict_GetItem() suppresses all errors
611 * that may occur (originally dicts supported only string keys, and exceptions
612 * weren't possible). So, while the original intent was that a NULL return
613 * meant the key wasn't present, in reality it can mean that, or that an error
614 * (suppressed) occurred while computing the key's hash, or that some error
615 * (suppressed) occurred when comparing keys in the dict's internal probe
616 * sequence. A nasty example of the latter is when a Python-coded comparison
617 * function hits a stack-depth error, which can cause this to return NULL
618 * even if the key is present.
621 PyDict_GetItem(PyObject
*op
, PyObject
*key
)
624 PyDictObject
*mp
= (PyDictObject
*)op
;
626 PyThreadState
*tstate
;
627 if (!PyDict_Check(op
))
629 if (!PyString_CheckExact(key
) ||
630 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1)
632 hash
= PyObject_Hash(key
);
639 /* We can arrive here with a NULL tstate during initialization:
640 try running "python -Wi" for an example related to string
641 interning. Let's just hope that no exception occurs then... */
642 tstate
= _PyThreadState_Current
;
643 if (tstate
!= NULL
&& tstate
->curexc_type
!= NULL
) {
644 /* preserve the existing exception */
645 PyObject
*err_type
, *err_value
, *err_tb
;
646 PyErr_Fetch(&err_type
, &err_value
, &err_tb
);
647 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
649 PyErr_Restore(err_type
, err_value
, err_tb
);
654 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
663 /* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
664 * dictionary if it's merely replacing the value for an existing key.
665 * This means that it's safe to loop over a dictionary with PyDict_Next()
666 * and occasionally replace a value -- but you can't insert new keys or
670 PyDict_SetItem(register PyObject
*op
, PyObject
*key
, PyObject
*value
)
672 register PyDictObject
*mp
;
674 register Py_ssize_t n_used
;
676 if (!PyDict_Check(op
)) {
677 PyErr_BadInternalCall();
682 mp
= (PyDictObject
*)op
;
683 if (PyString_CheckExact(key
)) {
684 hash
= ((PyStringObject
*)key
)->ob_shash
;
686 hash
= PyObject_Hash(key
);
689 hash
= PyObject_Hash(key
);
693 assert(mp
->ma_fill
<= mp
->ma_mask
); /* at least one empty slot */
694 n_used
= mp
->ma_used
;
697 if (insertdict(mp
, key
, hash
, value
) != 0)
699 /* If we added a key, we can safely resize. Otherwise just return!
700 * If fill >= 2/3 size, adjust size. Normally, this doubles or
701 * quaduples the size, but it's also possible for the dict to shrink
702 * (if ma_fill is much larger than ma_used, meaning a lot of dict
703 * keys have been * deleted).
705 * Quadrupling the size improves average dictionary sparseness
706 * (reducing collisions) at the cost of some memory and iteration
707 * speed (which loops over every possible entry). It also halves
708 * the number of expensive resize operations in a growing dictionary.
710 * Very large dictionaries (over 50K items) use doubling instead.
711 * This may help applications with severe memory constraints.
713 if (!(mp
->ma_used
> n_used
&& mp
->ma_fill
*3 >= (mp
->ma_mask
+1)*2))
715 return dictresize(mp
, (mp
->ma_used
> 50000 ? 2 : 4) * mp
->ma_used
);
719 PyDict_DelItem(PyObject
*op
, PyObject
*key
)
721 register PyDictObject
*mp
;
723 register PyDictEntry
*ep
;
724 PyObject
*old_value
, *old_key
;
726 if (!PyDict_Check(op
)) {
727 PyErr_BadInternalCall();
731 if (!PyString_CheckExact(key
) ||
732 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
733 hash
= PyObject_Hash(key
);
737 mp
= (PyDictObject
*)op
;
738 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
741 if (ep
->me_value
== NULL
) {
745 old_key
= ep
->me_key
;
748 old_value
= ep
->me_value
;
751 Py_DECREF(old_value
);
757 PyDict_Clear(PyObject
*op
)
760 PyDictEntry
*ep
, *table
;
761 int table_is_malloced
;
763 PyDictEntry small_copy
[PyDict_MINSIZE
];
768 if (!PyDict_Check(op
))
770 mp
= (PyDictObject
*)op
;
776 table
= mp
->ma_table
;
777 assert(table
!= NULL
);
778 table_is_malloced
= table
!= mp
->ma_smalltable
;
780 /* This is delicate. During the process of clearing the dict,
781 * decrefs can cause the dict to mutate. To avoid fatal confusion
782 * (voice of experience), we have to make the dict empty before
783 * clearing the slots, and never refer to anything via mp->xxx while
787 if (table_is_malloced
)
788 EMPTY_TO_MINSIZE(mp
);
791 /* It's a small table with something that needs to be cleared.
792 * Afraid the only safe way is to copy the dict entries into
793 * another small table first.
795 memcpy(small_copy
, table
, sizeof(small_copy
));
797 EMPTY_TO_MINSIZE(mp
);
799 /* else it's a small table that's already empty */
801 /* Now we can finally clear things. If C had refcounts, we could
802 * assert that the refcount on table is 1 now, i.e. that this function
803 * has unique access to it, so decref side-effects can't alter it.
805 for (ep
= table
; fill
> 0; ++ep
) {
812 Py_DECREF(ep
->me_key
);
813 Py_XDECREF(ep
->me_value
);
817 assert(ep
->me_value
== NULL
);
821 if (table_is_malloced
)
826 * Iterate over a dict. Use like so:
829 * PyObject *key, *value;
830 * i = 0; # important! i should not otherwise be changed by you
831 * while (PyDict_Next(yourdict, &i, &key, &value)) {
832 * Refer to borrowed references in key and value.
835 * CAUTION: In general, it isn't safe to use PyDict_Next in a loop that
836 * mutates the dict. One exception: it is safe if the loop merely changes
837 * the values associated with the keys (but doesn't insert new keys or
838 * delete keys), via PyDict_SetItem().
841 PyDict_Next(PyObject
*op
, Py_ssize_t
*ppos
, PyObject
**pkey
, PyObject
**pvalue
)
843 register Py_ssize_t i
;
844 register Py_ssize_t mask
;
845 register PyDictEntry
*ep
;
847 if (!PyDict_Check(op
))
852 ep
= ((PyDictObject
*)op
)->ma_table
;
853 mask
= ((PyDictObject
*)op
)->ma_mask
;
854 while (i
<= mask
&& ep
[i
].me_value
== NULL
)
860 *pkey
= ep
[i
].me_key
;
862 *pvalue
= ep
[i
].me_value
;
866 /* Internal version of PyDict_Next that returns a hash value in addition to the key and value.*/
868 _PyDict_Next(PyObject
*op
, Py_ssize_t
*ppos
, PyObject
**pkey
, PyObject
**pvalue
, long *phash
)
870 register Py_ssize_t i
;
871 register Py_ssize_t mask
;
872 register PyDictEntry
*ep
;
874 if (!PyDict_Check(op
))
879 ep
= ((PyDictObject
*)op
)->ma_table
;
880 mask
= ((PyDictObject
*)op
)->ma_mask
;
881 while (i
<= mask
&& ep
[i
].me_value
== NULL
)
886 *phash
= (long)(ep
[i
].me_hash
);
888 *pkey
= ep
[i
].me_key
;
890 *pvalue
= ep
[i
].me_value
;
897 dict_dealloc(register PyDictObject
*mp
)
899 register PyDictEntry
*ep
;
900 Py_ssize_t fill
= mp
->ma_fill
;
901 PyObject_GC_UnTrack(mp
);
902 Py_TRASHCAN_SAFE_BEGIN(mp
)
903 for (ep
= mp
->ma_table
; fill
> 0; ep
++) {
906 Py_DECREF(ep
->me_key
);
907 Py_XDECREF(ep
->me_value
);
910 if (mp
->ma_table
!= mp
->ma_smalltable
)
911 PyMem_DEL(mp
->ma_table
);
912 if (numfree
< PyDict_MAXFREELIST
&& Py_TYPE(mp
) == &PyDict_Type
)
913 free_list
[numfree
++] = mp
;
915 Py_TYPE(mp
)->tp_free((PyObject
*)mp
);
916 Py_TRASHCAN_SAFE_END(mp
)
920 dict_print(register PyDictObject
*mp
, register FILE *fp
, register int flags
)
922 register Py_ssize_t i
;
923 register Py_ssize_t any
;
926 status
= Py_ReprEnter((PyObject
*)mp
);
930 Py_BEGIN_ALLOW_THREADS
931 fprintf(fp
, "{...}");
936 Py_BEGIN_ALLOW_THREADS
940 for (i
= 0; i
<= mp
->ma_mask
; i
++) {
941 PyDictEntry
*ep
= mp
->ma_table
+ i
;
942 PyObject
*pvalue
= ep
->me_value
;
943 if (pvalue
!= NULL
) {
944 /* Prevent PyObject_Repr from deleting value during
948 Py_BEGIN_ALLOW_THREADS
952 if (PyObject_Print((PyObject
*)ep
->me_key
, fp
, 0)!=0) {
954 Py_ReprLeave((PyObject
*)mp
);
957 Py_BEGIN_ALLOW_THREADS
960 if (PyObject_Print(pvalue
, fp
, 0) != 0) {
962 Py_ReprLeave((PyObject
*)mp
);
968 Py_BEGIN_ALLOW_THREADS
971 Py_ReprLeave((PyObject
*)mp
);
976 dict_repr(PyDictObject
*mp
)
979 PyObject
*s
, *temp
, *colon
= NULL
;
980 PyObject
*pieces
= NULL
, *result
= NULL
;
981 PyObject
*key
, *value
;
983 i
= Py_ReprEnter((PyObject
*)mp
);
985 return i
> 0 ? PyString_FromString("{...}") : NULL
;
988 if (mp
->ma_used
== 0) {
989 result
= PyString_FromString("{}");
993 pieces
= PyList_New(0);
997 colon
= PyString_FromString(": ");
1001 /* Do repr() on each key+value pair, and insert ": " between them.
1002 Note that repr may mutate the dict. */
1004 while (PyDict_Next((PyObject
*)mp
, &i
, &key
, &value
)) {
1006 /* Prevent repr from deleting value during key format. */
1008 s
= PyObject_Repr(key
);
1009 PyString_Concat(&s
, colon
);
1010 PyString_ConcatAndDel(&s
, PyObject_Repr(value
));
1014 status
= PyList_Append(pieces
, s
);
1015 Py_DECREF(s
); /* append created a new ref */
1020 /* Add "{}" decorations to the first and last items. */
1021 assert(PyList_GET_SIZE(pieces
) > 0);
1022 s
= PyString_FromString("{");
1025 temp
= PyList_GET_ITEM(pieces
, 0);
1026 PyString_ConcatAndDel(&s
, temp
);
1027 PyList_SET_ITEM(pieces
, 0, s
);
1031 s
= PyString_FromString("}");
1034 temp
= PyList_GET_ITEM(pieces
, PyList_GET_SIZE(pieces
) - 1);
1035 PyString_ConcatAndDel(&temp
, s
);
1036 PyList_SET_ITEM(pieces
, PyList_GET_SIZE(pieces
) - 1, temp
);
1040 /* Paste them all together with ", " between. */
1041 s
= PyString_FromString(", ");
1044 result
= _PyString_Join(s
, pieces
);
1050 Py_ReprLeave((PyObject
*)mp
);
1055 dict_length(PyDictObject
*mp
)
1061 dict_subscript(PyDictObject
*mp
, register PyObject
*key
)
1066 assert(mp
->ma_table
!= NULL
);
1067 if (!PyString_CheckExact(key
) ||
1068 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1069 hash
= PyObject_Hash(key
);
1073 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1078 if (!PyDict_CheckExact(mp
)) {
1079 /* Look up __missing__ method if we're a subclass. */
1081 static PyObject
*missing_str
= NULL
;
1082 if (missing_str
== NULL
)
1084 PyString_InternFromString("__missing__");
1085 missing
= _PyType_Lookup(Py_TYPE(mp
), missing_str
);
1086 if (missing
!= NULL
)
1087 return PyObject_CallFunctionObjArgs(missing
,
1088 (PyObject
*)mp
, key
, NULL
);
1099 dict_ass_sub(PyDictObject
*mp
, PyObject
*v
, PyObject
*w
)
1102 return PyDict_DelItem((PyObject
*)mp
, v
);
1104 return PyDict_SetItem((PyObject
*)mp
, v
, w
);
1107 static PyMappingMethods dict_as_mapping
= {
1108 (lenfunc
)dict_length
, /*mp_length*/
1109 (binaryfunc
)dict_subscript
, /*mp_subscript*/
1110 (objobjargproc
)dict_ass_sub
, /*mp_ass_subscript*/
1114 dict_keys(register PyDictObject
*mp
)
1116 register PyObject
*v
;
1117 register Py_ssize_t i
, j
;
1126 if (n
!= mp
->ma_used
) {
1127 /* Durnit. The allocations caused the dict to resize.
1128 * Just start over, this shouldn't normally happen.
1135 for (i
= 0, j
= 0; i
<= mask
; i
++) {
1136 if (ep
[i
].me_value
!= NULL
) {
1137 PyObject
*key
= ep
[i
].me_key
;
1139 PyList_SET_ITEM(v
, j
, key
);
1148 dict_values(register PyDictObject
*mp
)
1150 register PyObject
*v
;
1151 register Py_ssize_t i
, j
;
1160 if (n
!= mp
->ma_used
) {
1161 /* Durnit. The allocations caused the dict to resize.
1162 * Just start over, this shouldn't normally happen.
1169 for (i
= 0, j
= 0; i
<= mask
; i
++) {
1170 if (ep
[i
].me_value
!= NULL
) {
1171 PyObject
*value
= ep
[i
].me_value
;
1173 PyList_SET_ITEM(v
, j
, value
);
1182 dict_items(register PyDictObject
*mp
)
1184 register PyObject
*v
;
1185 register Py_ssize_t i
, j
, n
;
1187 PyObject
*item
, *key
, *value
;
1190 /* Preallocate the list of tuples, to avoid allocations during
1191 * the loop over the items, which could trigger GC, which
1192 * could resize the dict. :-(
1199 for (i
= 0; i
< n
; i
++) {
1200 item
= PyTuple_New(2);
1205 PyList_SET_ITEM(v
, i
, item
);
1207 if (n
!= mp
->ma_used
) {
1208 /* Durnit. The allocations caused the dict to resize.
1209 * Just start over, this shouldn't normally happen.
1214 /* Nothing we do below makes any function calls. */
1217 for (i
= 0, j
= 0; i
<= mask
; i
++) {
1218 if ((value
=ep
[i
].me_value
) != NULL
) {
1220 item
= PyList_GET_ITEM(v
, j
);
1222 PyTuple_SET_ITEM(item
, 0, key
);
1224 PyTuple_SET_ITEM(item
, 1, value
);
1233 dict_fromkeys(PyObject
*cls
, PyObject
*args
)
1236 PyObject
*value
= Py_None
;
1237 PyObject
*it
; /* iter(seq) */
1242 if (!PyArg_UnpackTuple(args
, "fromkeys", 1, 2, &seq
, &value
))
1245 d
= PyObject_CallObject(cls
, NULL
);
1249 if (PyDict_CheckExact(d
) && PyDict_CheckExact(seq
)) {
1250 PyDictObject
*mp
= (PyDictObject
*)d
;
1256 if (dictresize(mp
, PySet_GET_SIZE(seq
)))
1259 while (_PyDict_Next(seq
, &pos
, &key
, &oldvalue
, &hash
)) {
1262 if (insertdict(mp
, key
, hash
, value
))
1268 if (PyDict_CheckExact(d
) && PyAnySet_CheckExact(seq
)) {
1269 PyDictObject
*mp
= (PyDictObject
*)d
;
1274 if (dictresize(mp
, PySet_GET_SIZE(seq
)))
1277 while (_PySet_NextEntry(seq
, &pos
, &key
, &hash
)) {
1280 if (insertdict(mp
, key
, hash
, value
))
1286 it
= PyObject_GetIter(seq
);
1292 if (PyDict_CheckExact(d
)) {
1293 while ((key
= PyIter_Next(it
)) != NULL
) {
1294 status
= PyDict_SetItem(d
, key
, value
);
1300 while ((key
= PyIter_Next(it
)) != NULL
) {
1301 status
= PyObject_SetItem(d
, key
, value
);
1308 if (PyErr_Occurred())
1320 dict_update_common(PyObject
*self
, PyObject
*args
, PyObject
*kwds
, char *methname
)
1322 PyObject
*arg
= NULL
;
1325 if (!PyArg_UnpackTuple(args
, methname
, 0, 1, &arg
))
1328 else if (arg
!= NULL
) {
1329 if (PyObject_HasAttrString(arg
, "keys"))
1330 result
= PyDict_Merge(self
, arg
, 1);
1332 result
= PyDict_MergeFromSeq2(self
, arg
, 1);
1334 if (result
== 0 && kwds
!= NULL
)
1335 result
= PyDict_Merge(self
, kwds
, 1);
1340 dict_update(PyObject
*self
, PyObject
*args
, PyObject
*kwds
)
1342 if (dict_update_common(self
, args
, kwds
, "update") != -1)
1347 /* Update unconditionally replaces existing items.
1348 Merge has a 3rd argument 'override'; if set, it acts like Update,
1349 otherwise it leaves existing items unchanged.
1351 PyDict_{Update,Merge} update/merge from a mapping object.
1353 PyDict_MergeFromSeq2 updates/merges from any iterable object
1354 producing iterable objects of length 2.
1358 PyDict_MergeFromSeq2(PyObject
*d
, PyObject
*seq2
, int override
)
1360 PyObject
*it
; /* iter(seq2) */
1361 Py_ssize_t i
; /* index into seq2 of current element */
1362 PyObject
*item
; /* seq2[i] */
1363 PyObject
*fast
; /* item as a 2-tuple or 2-list */
1366 assert(PyDict_Check(d
));
1367 assert(seq2
!= NULL
);
1369 it
= PyObject_GetIter(seq2
);
1373 for (i
= 0; ; ++i
) {
1374 PyObject
*key
, *value
;
1378 item
= PyIter_Next(it
);
1380 if (PyErr_Occurred())
1385 /* Convert item to sequence, and verify length 2. */
1386 fast
= PySequence_Fast(item
, "");
1388 if (PyErr_ExceptionMatches(PyExc_TypeError
))
1389 PyErr_Format(PyExc_TypeError
,
1390 "cannot convert dictionary update "
1391 "sequence element #%zd to a sequence",
1395 n
= PySequence_Fast_GET_SIZE(fast
);
1397 PyErr_Format(PyExc_ValueError
,
1398 "dictionary update sequence element #%zd "
1399 "has length %zd; 2 is required",
1404 /* Update/merge with this (key, value) pair. */
1405 key
= PySequence_Fast_GET_ITEM(fast
, 0);
1406 value
= PySequence_Fast_GET_ITEM(fast
, 1);
1407 if (override
|| PyDict_GetItem(d
, key
) == NULL
) {
1408 int status
= PyDict_SetItem(d
, key
, value
);
1424 return Py_SAFE_DOWNCAST(i
, Py_ssize_t
, int);
1428 PyDict_Update(PyObject
*a
, PyObject
*b
)
1430 return PyDict_Merge(a
, b
, 1);
1434 PyDict_Merge(PyObject
*a
, PyObject
*b
, int override
)
1436 register PyDictObject
*mp
, *other
;
1437 register Py_ssize_t i
;
1440 /* We accept for the argument either a concrete dictionary object,
1441 * or an abstract "mapping" object. For the former, we can do
1442 * things quite efficiently. For the latter, we only require that
1443 * PyMapping_Keys() and PyObject_GetItem() be supported.
1445 if (a
== NULL
|| !PyDict_Check(a
) || b
== NULL
) {
1446 PyErr_BadInternalCall();
1449 mp
= (PyDictObject
*)a
;
1450 if (PyDict_Check(b
)) {
1451 other
= (PyDictObject
*)b
;
1452 if (other
== mp
|| other
->ma_used
== 0)
1453 /* a.update(a) or a.update({}); nothing to do */
1455 if (mp
->ma_used
== 0)
1456 /* Since the target dict is empty, PyDict_GetItem()
1457 * always returns NULL. Setting override to 1
1458 * skips the unnecessary test.
1461 /* Do one big resize at the start, rather than
1462 * incrementally resizing as we insert new items. Expect
1463 * that there will be no (or few) overlapping keys.
1465 if ((mp
->ma_fill
+ other
->ma_used
)*3 >= (mp
->ma_mask
+1)*2) {
1466 if (dictresize(mp
, (mp
->ma_used
+ other
->ma_used
)*2) != 0)
1469 for (i
= 0; i
<= other
->ma_mask
; i
++) {
1470 entry
= &other
->ma_table
[i
];
1471 if (entry
->me_value
!= NULL
&&
1473 PyDict_GetItem(a
, entry
->me_key
) == NULL
)) {
1474 Py_INCREF(entry
->me_key
);
1475 Py_INCREF(entry
->me_value
);
1476 if (insertdict(mp
, entry
->me_key
,
1477 (long)entry
->me_hash
,
1478 entry
->me_value
) != 0)
1484 /* Do it the generic, slower way */
1485 PyObject
*keys
= PyMapping_Keys(b
);
1487 PyObject
*key
, *value
;
1491 /* Docstring says this is equivalent to E.keys() so
1492 * if E doesn't have a .keys() method we want
1493 * AttributeError to percolate up. Might as well
1494 * do the same for any other error.
1498 iter
= PyObject_GetIter(keys
);
1503 for (key
= PyIter_Next(iter
); key
; key
= PyIter_Next(iter
)) {
1504 if (!override
&& PyDict_GetItem(a
, key
) != NULL
) {
1508 value
= PyObject_GetItem(b
, key
);
1509 if (value
== NULL
) {
1514 status
= PyDict_SetItem(a
, key
, value
);
1523 if (PyErr_Occurred())
1524 /* Iterator completed, via error */
1531 dict_copy(register PyDictObject
*mp
)
1533 return PyDict_Copy((PyObject
*)mp
);
1537 PyDict_Copy(PyObject
*o
)
1541 if (o
== NULL
|| !PyDict_Check(o
)) {
1542 PyErr_BadInternalCall();
1545 copy
= PyDict_New();
1548 if (PyDict_Merge(copy
, o
, 1) == 0)
1555 PyDict_Size(PyObject
*mp
)
1557 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1558 PyErr_BadInternalCall();
1561 return ((PyDictObject
*)mp
)->ma_used
;
1565 PyDict_Keys(PyObject
*mp
)
1567 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1568 PyErr_BadInternalCall();
1571 return dict_keys((PyDictObject
*)mp
);
1575 PyDict_Values(PyObject
*mp
)
1577 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1578 PyErr_BadInternalCall();
1581 return dict_values((PyDictObject
*)mp
);
1585 PyDict_Items(PyObject
*mp
)
1587 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1588 PyErr_BadInternalCall();
1591 return dict_items((PyDictObject
*)mp
);
1594 /* Subroutine which returns the smallest key in a for which b's value
1595 is different or absent. The value is returned too, through the
1596 pval argument. Both are NULL if no key in a is found for which b's status
1597 differs. The refcounts on (and only on) non-NULL *pval and function return
1598 values must be decremented by the caller (characterize() increments them
1599 to ensure that mutating comparison and PyDict_GetItem calls can't delete
1600 them before the caller is done looking at them). */
1603 characterize(PyDictObject
*a
, PyDictObject
*b
, PyObject
**pval
)
1605 PyObject
*akey
= NULL
; /* smallest key in a s.t. a[akey] != b[akey] */
1606 PyObject
*aval
= NULL
; /* a[akey] */
1610 for (i
= 0; i
<= a
->ma_mask
; i
++) {
1611 PyObject
*thiskey
, *thisaval
, *thisbval
;
1612 if (a
->ma_table
[i
].me_value
== NULL
)
1614 thiskey
= a
->ma_table
[i
].me_key
;
1615 Py_INCREF(thiskey
); /* keep alive across compares */
1617 cmp
= PyObject_RichCompareBool(akey
, thiskey
, Py_LT
);
1624 a
->ma_table
[i
].me_value
== NULL
)
1626 /* Not the *smallest* a key; or maybe it is
1627 * but the compare shrunk the dict so we can't
1628 * find its associated value anymore; or
1629 * maybe it is but the compare deleted the
1637 /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
1638 thisaval
= a
->ma_table
[i
].me_value
;
1640 Py_INCREF(thisaval
); /* keep alive */
1641 thisbval
= PyDict_GetItem((PyObject
*)b
, thiskey
);
1642 if (thisbval
== NULL
)
1645 /* both dicts have thiskey: same values? */
1646 cmp
= PyObject_RichCompareBool(
1647 thisaval
, thisbval
, Py_EQ
);
1650 Py_DECREF(thisaval
);
1663 Py_DECREF(thisaval
);
1677 dict_compare(PyDictObject
*a
, PyDictObject
*b
)
1679 PyObject
*adiff
, *bdiff
, *aval
, *bval
;
1682 /* Compare lengths first */
1683 if (a
->ma_used
< b
->ma_used
)
1684 return -1; /* a is shorter */
1685 else if (a
->ma_used
> b
->ma_used
)
1686 return 1; /* b is shorter */
1688 /* Same length -- check all keys */
1689 bdiff
= bval
= NULL
;
1690 adiff
= characterize(a
, b
, &aval
);
1691 if (adiff
== NULL
) {
1693 /* Either an error, or a is a subset with the same length so
1696 res
= PyErr_Occurred() ? -1 : 0;
1699 bdiff
= characterize(b
, a
, &bval
);
1700 if (bdiff
== NULL
&& PyErr_Occurred()) {
1707 /* bdiff == NULL "should be" impossible now, but perhaps
1708 * the last comparison done by the characterize() on a had
1709 * the side effect of making the dicts equal!
1711 res
= PyObject_Compare(adiff
, bdiff
);
1713 if (res
== 0 && bval
!= NULL
)
1714 res
= PyObject_Compare(aval
, bval
);
1724 /* Return 1 if dicts equal, 0 if not, -1 if error.
1725 * Gets out as soon as any difference is detected.
1726 * Uses only Py_EQ comparison.
1729 dict_equal(PyDictObject
*a
, PyDictObject
*b
)
1733 if (a
->ma_used
!= b
->ma_used
)
1734 /* can't be equal if # of entries differ */
1737 /* Same # of entries -- check all of 'em. Exit early on any diff. */
1738 for (i
= 0; i
<= a
->ma_mask
; i
++) {
1739 PyObject
*aval
= a
->ma_table
[i
].me_value
;
1743 PyObject
*key
= a
->ma_table
[i
].me_key
;
1744 /* temporarily bump aval's refcount to ensure it stays
1745 alive until we're done with it */
1749 bval
= PyDict_GetItem((PyObject
*)b
, key
);
1755 cmp
= PyObject_RichCompareBool(aval
, bval
, Py_EQ
);
1757 if (cmp
<= 0) /* error or not equal */
1765 dict_richcompare(PyObject
*v
, PyObject
*w
, int op
)
1770 if (!PyDict_Check(v
) || !PyDict_Check(w
)) {
1771 res
= Py_NotImplemented
;
1773 else if (op
== Py_EQ
|| op
== Py_NE
) {
1774 cmp
= dict_equal((PyDictObject
*)v
, (PyDictObject
*)w
);
1777 res
= (cmp
== (op
== Py_EQ
)) ? Py_True
: Py_False
;
1780 /* Py3K warning if comparison isn't == or != */
1781 if (PyErr_WarnPy3k("dict inequality comparisons not supported "
1785 res
= Py_NotImplemented
;
1792 dict_contains(register PyDictObject
*mp
, PyObject
*key
)
1797 if (!PyString_CheckExact(key
) ||
1798 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1799 hash
= PyObject_Hash(key
);
1803 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1806 return PyBool_FromLong(ep
->me_value
!= NULL
);
1810 dict_has_key(register PyDictObject
*mp
, PyObject
*key
)
1812 if (PyErr_WarnPy3k("dict.has_key() not supported in 3.x; "
1813 "use the in operator", 1) < 0)
1815 return dict_contains(mp
, key
);
1819 dict_get(register PyDictObject
*mp
, PyObject
*args
)
1822 PyObject
*failobj
= Py_None
;
1823 PyObject
*val
= NULL
;
1827 if (!PyArg_UnpackTuple(args
, "get", 1, 2, &key
, &failobj
))
1830 if (!PyString_CheckExact(key
) ||
1831 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1832 hash
= PyObject_Hash(key
);
1836 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1848 dict_setdefault(register PyDictObject
*mp
, PyObject
*args
)
1851 PyObject
*failobj
= Py_None
;
1852 PyObject
*val
= NULL
;
1856 if (!PyArg_UnpackTuple(args
, "setdefault", 1, 2, &key
, &failobj
))
1859 if (!PyString_CheckExact(key
) ||
1860 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1861 hash
= PyObject_Hash(key
);
1865 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1871 if (PyDict_SetItem((PyObject
*)mp
, key
, failobj
))
1880 dict_clear(register PyDictObject
*mp
)
1882 PyDict_Clear((PyObject
*)mp
);
1887 dict_pop(PyDictObject
*mp
, PyObject
*args
)
1891 PyObject
*old_value
, *old_key
;
1892 PyObject
*key
, *deflt
= NULL
;
1894 if(!PyArg_UnpackTuple(args
, "pop", 1, 2, &key
, &deflt
))
1896 if (mp
->ma_used
== 0) {
1901 PyErr_SetString(PyExc_KeyError
,
1902 "pop(): dictionary is empty");
1905 if (!PyString_CheckExact(key
) ||
1906 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1907 hash
= PyObject_Hash(key
);
1911 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1914 if (ep
->me_value
== NULL
) {
1922 old_key
= ep
->me_key
;
1925 old_value
= ep
->me_value
;
1926 ep
->me_value
= NULL
;
1933 dict_popitem(PyDictObject
*mp
)
1939 /* Allocate the result tuple before checking the size. Believe it
1940 * or not, this allocation could trigger a garbage collection which
1941 * could empty the dict, so if we checked the size first and that
1942 * happened, the result would be an infinite loop (searching for an
1943 * entry that no longer exists). Note that the usual popitem()
1944 * idiom is "while d: k, v = d.popitem()". so needing to throw the
1945 * tuple away if the dict *is* empty isn't a significant
1946 * inefficiency -- possible, but unlikely in practice.
1948 res
= PyTuple_New(2);
1951 if (mp
->ma_used
== 0) {
1953 PyErr_SetString(PyExc_KeyError
,
1954 "popitem(): dictionary is empty");
1957 /* Set ep to "the first" dict entry with a value. We abuse the hash
1958 * field of slot 0 to hold a search finger:
1959 * If slot 0 has a value, use slot 0.
1960 * Else slot 0 is being used to hold a search finger,
1961 * and we use its hash value as the first index to look.
1963 ep
= &mp
->ma_table
[0];
1964 if (ep
->me_value
== NULL
) {
1966 /* The hash field may be a real hash value, or it may be a
1967 * legit search finger, or it may be a once-legit search
1968 * finger that's out of bounds now because it wrapped around
1969 * or the table shrunk -- simply make sure it's in bounds now.
1971 if (i
> mp
->ma_mask
|| i
< 1)
1972 i
= 1; /* skip slot 0 */
1973 while ((ep
= &mp
->ma_table
[i
])->me_value
== NULL
) {
1975 if (i
> mp
->ma_mask
)
1979 PyTuple_SET_ITEM(res
, 0, ep
->me_key
);
1980 PyTuple_SET_ITEM(res
, 1, ep
->me_value
);
1983 ep
->me_value
= NULL
;
1985 assert(mp
->ma_table
[0].me_value
== NULL
);
1986 mp
->ma_table
[0].me_hash
= i
+ 1; /* next place to start */
1991 dict_traverse(PyObject
*op
, visitproc visit
, void *arg
)
1997 while (PyDict_Next(op
, &i
, &pk
, &pv
)) {
2005 dict_tp_clear(PyObject
*op
)
2012 extern PyTypeObject PyDictIterKey_Type
; /* Forward */
2013 extern PyTypeObject PyDictIterValue_Type
; /* Forward */
2014 extern PyTypeObject PyDictIterItem_Type
; /* Forward */
2015 static PyObject
*dictiter_new(PyDictObject
*, PyTypeObject
*);
2018 dict_iterkeys(PyDictObject
*dict
)
2020 return dictiter_new(dict
, &PyDictIterKey_Type
);
2024 dict_itervalues(PyDictObject
*dict
)
2026 return dictiter_new(dict
, &PyDictIterValue_Type
);
2030 dict_iteritems(PyDictObject
*dict
)
2032 return dictiter_new(dict
, &PyDictIterItem_Type
);
2036 PyDoc_STRVAR(has_key__doc__
,
2037 "D.has_key(k) -> True if D has a key k, else False");
2039 PyDoc_STRVAR(contains__doc__
,
2040 "D.__contains__(k) -> True if D has a key k, else False");
2042 PyDoc_STRVAR(getitem__doc__
, "x.__getitem__(y) <==> x[y]");
2044 PyDoc_STRVAR(get__doc__
,
2045 "D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.");
2047 PyDoc_STRVAR(setdefault_doc__
,
2048 "D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D");
2050 PyDoc_STRVAR(pop__doc__
,
2051 "D.pop(k[,d]) -> v, remove specified key and return the corresponding value\n\
2052 If key is not found, d is returned if given, otherwise KeyError is raised");
2054 PyDoc_STRVAR(popitem__doc__
,
2055 "D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
2056 2-tuple; but raise KeyError if D is empty");
2058 PyDoc_STRVAR(keys__doc__
,
2059 "D.keys() -> list of D's keys");
2061 PyDoc_STRVAR(items__doc__
,
2062 "D.items() -> list of D's (key, value) pairs, as 2-tuples");
2064 PyDoc_STRVAR(values__doc__
,
2065 "D.values() -> list of D's values");
2067 PyDoc_STRVAR(update__doc__
,
2068 "D.update(E, **F) -> None. Update D from E and F: for k in E: D[k] = E[k]\n\
2069 (if E has keys else: for (k, v) in E: D[k] = v) then: for k in F: D[k] = F[k]");
2071 PyDoc_STRVAR(fromkeys__doc__
,
2072 "dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v.\n\
2073 v defaults to None.");
2075 PyDoc_STRVAR(clear__doc__
,
2076 "D.clear() -> None. Remove all items from D.");
2078 PyDoc_STRVAR(copy__doc__
,
2079 "D.copy() -> a shallow copy of D");
2081 PyDoc_STRVAR(iterkeys__doc__
,
2082 "D.iterkeys() -> an iterator over the keys of D");
2084 PyDoc_STRVAR(itervalues__doc__
,
2085 "D.itervalues() -> an iterator over the values of D");
2087 PyDoc_STRVAR(iteritems__doc__
,
2088 "D.iteritems() -> an iterator over the (key, value) items of D");
2090 static PyMethodDef mapp_methods
[] = {
2091 {"__contains__",(PyCFunction
)dict_contains
, METH_O
| METH_COEXIST
,
2093 {"__getitem__", (PyCFunction
)dict_subscript
, METH_O
| METH_COEXIST
,
2095 {"has_key", (PyCFunction
)dict_has_key
, METH_O
,
2097 {"get", (PyCFunction
)dict_get
, METH_VARARGS
,
2099 {"setdefault", (PyCFunction
)dict_setdefault
, METH_VARARGS
,
2101 {"pop", (PyCFunction
)dict_pop
, METH_VARARGS
,
2103 {"popitem", (PyCFunction
)dict_popitem
, METH_NOARGS
,
2105 {"keys", (PyCFunction
)dict_keys
, METH_NOARGS
,
2107 {"items", (PyCFunction
)dict_items
, METH_NOARGS
,
2109 {"values", (PyCFunction
)dict_values
, METH_NOARGS
,
2111 {"update", (PyCFunction
)dict_update
, METH_VARARGS
| METH_KEYWORDS
,
2113 {"fromkeys", (PyCFunction
)dict_fromkeys
, METH_VARARGS
| METH_CLASS
,
2115 {"clear", (PyCFunction
)dict_clear
, METH_NOARGS
,
2117 {"copy", (PyCFunction
)dict_copy
, METH_NOARGS
,
2119 {"iterkeys", (PyCFunction
)dict_iterkeys
, METH_NOARGS
,
2121 {"itervalues", (PyCFunction
)dict_itervalues
, METH_NOARGS
,
2123 {"iteritems", (PyCFunction
)dict_iteritems
, METH_NOARGS
,
2125 {NULL
, NULL
} /* sentinel */
2128 /* Return 1 if `key` is in dict `op`, 0 if not, and -1 on error. */
2130 PyDict_Contains(PyObject
*op
, PyObject
*key
)
2133 PyDictObject
*mp
= (PyDictObject
*)op
;
2136 if (!PyString_CheckExact(key
) ||
2137 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
2138 hash
= PyObject_Hash(key
);
2142 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
2143 return ep
== NULL
? -1 : (ep
->me_value
!= NULL
);
2146 /* Internal version of PyDict_Contains used when the hash value is already known */
2148 _PyDict_Contains(PyObject
*op
, PyObject
*key
, long hash
)
2150 PyDictObject
*mp
= (PyDictObject
*)op
;
2153 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
2154 return ep
== NULL
? -1 : (ep
->me_value
!= NULL
);
2157 /* Hack to implement "key in dict" */
2158 static PySequenceMethods dict_as_sequence
= {
2164 0, /* sq_ass_item */
2165 0, /* sq_ass_slice */
2166 PyDict_Contains
, /* sq_contains */
2167 0, /* sq_inplace_concat */
2168 0, /* sq_inplace_repeat */
2172 dict_new(PyTypeObject
*type
, PyObject
*args
, PyObject
*kwds
)
2176 assert(type
!= NULL
&& type
->tp_alloc
!= NULL
);
2177 self
= type
->tp_alloc(type
, 0);
2179 PyDictObject
*d
= (PyDictObject
*)self
;
2180 /* It's guaranteed that tp->alloc zeroed out the struct. */
2181 assert(d
->ma_table
== NULL
&& d
->ma_fill
== 0 && d
->ma_used
== 0);
2182 INIT_NONZERO_DICT_SLOTS(d
);
2183 d
->ma_lookup
= lookdict_string
;
2184 #ifdef SHOW_CONVERSION_COUNTS
2192 dict_init(PyObject
*self
, PyObject
*args
, PyObject
*kwds
)
2194 return dict_update_common(self
, args
, kwds
, "dict");
2198 dict_iter(PyDictObject
*dict
)
2200 return dictiter_new(dict
, &PyDictIterKey_Type
);
2203 PyDoc_STRVAR(dictionary_doc
,
2204 "dict() -> new empty dictionary.\n"
2205 "dict(mapping) -> new dictionary initialized from a mapping object's\n"
2206 " (key, value) pairs.\n"
2207 "dict(seq) -> new dictionary initialized as if via:\n"
2209 " for k, v in seq:\n"
2211 "dict(**kwargs) -> new dictionary initialized with the name=value pairs\n"
2212 " in the keyword argument list. For example: dict(one=1, two=2)");
2214 PyTypeObject PyDict_Type
= {
2215 PyVarObject_HEAD_INIT(&PyType_Type
, 0)
2217 sizeof(PyDictObject
),
2219 (destructor
)dict_dealloc
, /* tp_dealloc */
2220 (printfunc
)dict_print
, /* tp_print */
2223 (cmpfunc
)dict_compare
, /* tp_compare */
2224 (reprfunc
)dict_repr
, /* tp_repr */
2225 0, /* tp_as_number */
2226 &dict_as_sequence
, /* tp_as_sequence */
2227 &dict_as_mapping
, /* tp_as_mapping */
2231 PyObject_GenericGetAttr
, /* tp_getattro */
2232 0, /* tp_setattro */
2233 0, /* tp_as_buffer */
2234 Py_TPFLAGS_DEFAULT
| Py_TPFLAGS_HAVE_GC
|
2235 Py_TPFLAGS_BASETYPE
| Py_TPFLAGS_DICT_SUBCLASS
, /* tp_flags */
2236 dictionary_doc
, /* tp_doc */
2237 dict_traverse
, /* tp_traverse */
2238 dict_tp_clear
, /* tp_clear */
2239 dict_richcompare
, /* tp_richcompare */
2240 0, /* tp_weaklistoffset */
2241 (getiterfunc
)dict_iter
, /* tp_iter */
2242 0, /* tp_iternext */
2243 mapp_methods
, /* tp_methods */
2248 0, /* tp_descr_get */
2249 0, /* tp_descr_set */
2250 0, /* tp_dictoffset */
2251 dict_init
, /* tp_init */
2252 PyType_GenericAlloc
, /* tp_alloc */
2253 dict_new
, /* tp_new */
2254 PyObject_GC_Del
, /* tp_free */
2257 /* For backward compatibility with old dictionary interface */
2260 PyDict_GetItemString(PyObject
*v
, const char *key
)
2263 kv
= PyString_FromString(key
);
2266 rv
= PyDict_GetItem(v
, kv
);
2272 PyDict_SetItemString(PyObject
*v
, const char *key
, PyObject
*item
)
2276 kv
= PyString_FromString(key
);
2279 PyString_InternInPlace(&kv
); /* XXX Should we really? */
2280 err
= PyDict_SetItem(v
, kv
, item
);
2286 PyDict_DelItemString(PyObject
*v
, const char *key
)
2290 kv
= PyString_FromString(key
);
2293 err
= PyDict_DelItem(v
, kv
);
2298 /* Dictionary iterator types */
2302 PyDictObject
*di_dict
; /* Set to NULL when iterator is exhausted */
2305 PyObject
* di_result
; /* reusable result tuple for iteritems */
2310 dictiter_new(PyDictObject
*dict
, PyTypeObject
*itertype
)
2313 di
= PyObject_New(dictiterobject
, itertype
);
2318 di
->di_used
= dict
->ma_used
;
2320 di
->len
= dict
->ma_used
;
2321 if (itertype
== &PyDictIterItem_Type
) {
2322 di
->di_result
= PyTuple_Pack(2, Py_None
, Py_None
);
2323 if (di
->di_result
== NULL
) {
2329 di
->di_result
= NULL
;
2330 return (PyObject
*)di
;
2334 dictiter_dealloc(dictiterobject
*di
)
2336 Py_XDECREF(di
->di_dict
);
2337 Py_XDECREF(di
->di_result
);
2342 dictiter_len(dictiterobject
*di
)
2345 if (di
->di_dict
!= NULL
&& di
->di_used
== di
->di_dict
->ma_used
)
2347 return PyInt_FromSize_t(len
);
2350 PyDoc_STRVAR(length_hint_doc
, "Private method returning an estimate of len(list(it)).");
2352 static PyMethodDef dictiter_methods
[] = {
2353 {"__length_hint__", (PyCFunction
)dictiter_len
, METH_NOARGS
, length_hint_doc
},
2354 {NULL
, NULL
} /* sentinel */
2357 static PyObject
*dictiter_iternextkey(dictiterobject
*di
)
2360 register Py_ssize_t i
, mask
;
2361 register PyDictEntry
*ep
;
2362 PyDictObject
*d
= di
->di_dict
;
2366 assert (PyDict_Check(d
));
2368 if (di
->di_used
!= d
->ma_used
) {
2369 PyErr_SetString(PyExc_RuntimeError
,
2370 "dictionary changed size during iteration");
2371 di
->di_used
= -1; /* Make this state sticky */
2380 while (i
<= mask
&& ep
[i
].me_value
== NULL
)
2396 PyTypeObject PyDictIterKey_Type
= {
2397 PyVarObject_HEAD_INIT(&PyType_Type
, 0)
2398 "dictionary-keyiterator", /* tp_name */
2399 sizeof(dictiterobject
), /* tp_basicsize */
2400 0, /* tp_itemsize */
2402 (destructor
)dictiter_dealloc
, /* tp_dealloc */
2408 0, /* tp_as_number */
2409 0, /* tp_as_sequence */
2410 0, /* tp_as_mapping */
2414 PyObject_GenericGetAttr
, /* tp_getattro */
2415 0, /* tp_setattro */
2416 0, /* tp_as_buffer */
2417 Py_TPFLAGS_DEFAULT
, /* tp_flags */
2419 0, /* tp_traverse */
2421 0, /* tp_richcompare */
2422 0, /* tp_weaklistoffset */
2423 PyObject_SelfIter
, /* tp_iter */
2424 (iternextfunc
)dictiter_iternextkey
, /* tp_iternext */
2425 dictiter_methods
, /* tp_methods */
2429 static PyObject
*dictiter_iternextvalue(dictiterobject
*di
)
2432 register Py_ssize_t i
, mask
;
2433 register PyDictEntry
*ep
;
2434 PyDictObject
*d
= di
->di_dict
;
2438 assert (PyDict_Check(d
));
2440 if (di
->di_used
!= d
->ma_used
) {
2441 PyErr_SetString(PyExc_RuntimeError
,
2442 "dictionary changed size during iteration");
2443 di
->di_used
= -1; /* Make this state sticky */
2449 if (i
< 0 || i
> mask
)
2452 while ((value
=ep
[i
].me_value
) == NULL
) {
2468 PyTypeObject PyDictIterValue_Type
= {
2469 PyVarObject_HEAD_INIT(&PyType_Type
, 0)
2470 "dictionary-valueiterator", /* tp_name */
2471 sizeof(dictiterobject
), /* tp_basicsize */
2472 0, /* tp_itemsize */
2474 (destructor
)dictiter_dealloc
, /* tp_dealloc */
2480 0, /* tp_as_number */
2481 0, /* tp_as_sequence */
2482 0, /* tp_as_mapping */
2486 PyObject_GenericGetAttr
, /* tp_getattro */
2487 0, /* tp_setattro */
2488 0, /* tp_as_buffer */
2489 Py_TPFLAGS_DEFAULT
, /* tp_flags */
2491 0, /* tp_traverse */
2493 0, /* tp_richcompare */
2494 0, /* tp_weaklistoffset */
2495 PyObject_SelfIter
, /* tp_iter */
2496 (iternextfunc
)dictiter_iternextvalue
, /* tp_iternext */
2497 dictiter_methods
, /* tp_methods */
2501 static PyObject
*dictiter_iternextitem(dictiterobject
*di
)
2503 PyObject
*key
, *value
, *result
= di
->di_result
;
2504 register Py_ssize_t i
, mask
;
2505 register PyDictEntry
*ep
;
2506 PyDictObject
*d
= di
->di_dict
;
2510 assert (PyDict_Check(d
));
2512 if (di
->di_used
!= d
->ma_used
) {
2513 PyErr_SetString(PyExc_RuntimeError
,
2514 "dictionary changed size during iteration");
2515 di
->di_used
= -1; /* Make this state sticky */
2524 while (i
<= mask
&& ep
[i
].me_value
== NULL
)
2530 if (result
->ob_refcnt
== 1) {
2532 Py_DECREF(PyTuple_GET_ITEM(result
, 0));
2533 Py_DECREF(PyTuple_GET_ITEM(result
, 1));
2535 result
= PyTuple_New(2);
2541 value
= ep
[i
].me_value
;
2544 PyTuple_SET_ITEM(result
, 0, key
);
2545 PyTuple_SET_ITEM(result
, 1, value
);
2554 PyTypeObject PyDictIterItem_Type
= {
2555 PyVarObject_HEAD_INIT(&PyType_Type
, 0)
2556 "dictionary-itemiterator", /* tp_name */
2557 sizeof(dictiterobject
), /* tp_basicsize */
2558 0, /* tp_itemsize */
2560 (destructor
)dictiter_dealloc
, /* tp_dealloc */
2566 0, /* tp_as_number */
2567 0, /* tp_as_sequence */
2568 0, /* tp_as_mapping */
2572 PyObject_GenericGetAttr
, /* tp_getattro */
2573 0, /* tp_setattro */
2574 0, /* tp_as_buffer */
2575 Py_TPFLAGS_DEFAULT
, /* tp_flags */
2577 0, /* tp_traverse */
2579 0, /* tp_richcompare */
2580 0, /* tp_weaklistoffset */
2581 PyObject_SelfIter
, /* tp_iter */
2582 (iternextfunc
)dictiter_iternextitem
, /* tp_iternext */
2583 dictiter_methods
, /* tp_methods */