2 /* Dictionary object implementation using a hash table */
4 /* The distribution includes a separate file, Objects/dictnotes.txt,
5 describing explorations into dictionary design and optimization.
6 It covers typical dictionary use patterns, the parameters for
7 tuning dictionaries, and several ideas for possible optimizations.
13 /* Set a key error with the specified argument, wrapping it in a
14 * tuple automatically so that tuple keys are not unpacked as the
15 * exception arguments. */
17 set_key_error(PyObject
*arg
)
20 tup
= PyTuple_Pack(1, arg
);
22 return; /* caller will expect error to be set anyway */
23 PyErr_SetObject(PyExc_KeyError
, tup
);
27 /* Define this out if you don't want conversion statistics on exit. */
28 #undef SHOW_CONVERSION_COUNTS
30 /* See large comment block below. This must be >= 1. */
31 #define PERTURB_SHIFT 5
34 Major subtleties ahead: Most hash schemes depend on having a "good" hash
35 function, in the sense of simulating randomness. Python doesn't: its most
36 important hash functions (for strings and ints) are very regular in common
39 >>> map(hash, (0, 1, 2, 3))
41 >>> map(hash, ("namea", "nameb", "namec", "named"))
42 [-1658398457, -1658398460, -1658398459, -1658398462]
45 This isn't necessarily bad! To the contrary, in a table of size 2**i, taking
46 the low-order i bits as the initial table index is extremely fast, and there
47 are no collisions at all for dicts indexed by a contiguous range of ints.
48 The same is approximately true when keys are "consecutive" strings. So this
49 gives better-than-random behavior in common cases, and that's very desirable.
51 OTOH, when collisions occur, the tendency to fill contiguous slices of the
52 hash table makes a good collision resolution strategy crucial. Taking only
53 the last i bits of the hash code is also vulnerable: for example, consider
54 [i << 16 for i in range(20000)] as a set of keys. Since ints are their own
55 hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
56 hash code are all 0: they *all* map to the same table index.
58 But catering to unusual cases should not slow the usual ones, so we just take
59 the last i bits anyway. It's up to collision resolution to do the rest. If
60 we *usually* find the key we're looking for on the first try (and, it turns
61 out, we usually do -- the table load factor is kept under 2/3, so the odds
62 are solidly in our favor), then it makes best sense to keep the initial index
63 computation dirt cheap.
65 The first half of collision resolution is to visit table indices via this
68 j = ((5*j) + 1) mod 2**i
70 For any initial j in range(2**i), repeating that 2**i times generates each
71 int in range(2**i) exactly once (see any text on random-number generation for
72 proof). By itself, this doesn't help much: like linear probing (setting
73 j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
74 order. This would be bad, except that's not the only thing we do, and it's
75 actually *good* in the common cases where hash keys are consecutive. In an
76 example that's really too small to make this entirely clear, for a table of
77 size 2**3 the order of indices is:
79 0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]
81 If two things come in at index 5, the first place we look after is index 2,
82 not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
83 Linear probing is deadly in this case because there the fixed probe order
84 is the *same* as the order consecutive keys are likely to arrive. But it's
85 extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
86 and certain that consecutive hash codes do not.
88 The other half of the strategy is to get the other bits of the hash code
89 into play. This is done by initializing a (unsigned) vrbl "perturb" to the
90 full hash code, and changing the recurrence to:
92 j = (5*j) + 1 + perturb;
93 perturb >>= PERTURB_SHIFT;
94 use j % 2**i as the next table index;
96 Now the probe sequence depends (eventually) on every bit in the hash code,
97 and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
98 because it quickly magnifies small differences in the bits that didn't affect
99 the initial index. Note that because perturb is unsigned, if the recurrence
100 is executed often enough perturb eventually becomes and remains 0. At that
101 point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
102 that's certain to find an empty slot eventually (since it generates every int
103 in range(2**i), and we make sure there's always at least one empty slot).
105 Selecting a good value for PERTURB_SHIFT is a balancing act. You want it
106 small so that the high bits of the hash code continue to affect the probe
107 sequence across iterations; but you want it large so that in really bad cases
108 the high-order hash bits have an effect on early iterations. 5 was "the
109 best" in minimizing total collisions across experiments Tim Peters ran (on
110 both normal and pathological cases), but 4 and 6 weren't significantly worse.
112 Historical: Reimer Behrends contributed the idea of using a polynomial-based
113 approach, using repeated multiplication by x in GF(2**n) where an irreducible
114 polynomial for each table size was chosen such that x was a primitive root.
115 Christian Tismer later extended that to use division by x instead, as an
116 efficient way to get the high bits of the hash code into play. This scheme
117 also gave excellent collision statistics, but was more expensive: two
118 if-tests were required inside the loop; computing "the next" index took about
119 the same number of operations but without as much potential parallelism
120 (e.g., computing 5*j can go on at the same time as computing 1+perturb in the
121 above, and then shifting perturb can be done while the table index is being
122 masked); and the PyDictObject struct required a member to hold the table's
123 polynomial. In Tim's experiments the current scheme ran faster, produced
124 equally good collision statistics, needed less code & used less memory.
126 Theoretical Python 2.5 headache: hash codes are only C "long", but
127 sizeof(Py_ssize_t) > sizeof(long) may be possible. In that case, and if a
128 dict is genuinely huge, then only the slots directly reachable via indexing
129 by a C long can be the first slot in a probe sequence. The probe sequence
130 will still eventually reach every slot in the table, but the collision rate
131 on initial probes may be much higher than this scheme was designed for.
132 Getting a hash code as fat as Py_ssize_t is the only real cure. But in
133 practice, this probably won't make a lick of difference for many years (at
134 which point everyone will have terabytes of RAM on 64-bit boxes).
137 /* Object used as dummy key to fill deleted entries */
138 static PyObject
*dummy
= NULL
; /* Initialized by first call to newPyDictObject() */
148 /* forward declarations */
150 lookdict_string(PyDictObject
*mp
, PyObject
*key
, long hash
);
152 #ifdef SHOW_CONVERSION_COUNTS
153 static long created
= 0L;
154 static long converted
= 0L;
159 fprintf(stderr
, "created %ld string dicts\n", created
);
160 fprintf(stderr
, "converted %ld to normal dicts\n", converted
);
161 fprintf(stderr
, "%.2f%% conversion rate\n", (100.0*converted
)/created
);
165 /* Debug statistic to compare allocations with reuse through the free list */
166 #undef SHOW_ALLOC_COUNT
167 #ifdef SHOW_ALLOC_COUNT
168 static size_t count_alloc
= 0;
169 static size_t count_reuse
= 0;
174 fprintf(stderr
, "Dict allocations: %" PY_FORMAT_SIZE_T
"d\n",
176 fprintf(stderr
, "Dict reuse through freelist: %" PY_FORMAT_SIZE_T
178 fprintf(stderr
, "%.2f%% reuse rate\n\n",
179 (100.0*count_reuse
/(count_alloc
+count_reuse
)));
183 /* Initialization macros.
184 There are two ways to create a dict: PyDict_New() is the main C API
185 function, and the tp_new slot maps to dict_new(). In the latter case we
186 can save a little time over what PyDict_New does because it's guaranteed
187 that the PyDictObject struct is already zeroed out.
188 Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
189 an excellent reason not to).
192 #define INIT_NONZERO_DICT_SLOTS(mp) do { \
193 (mp)->ma_table = (mp)->ma_smalltable; \
194 (mp)->ma_mask = PyDict_MINSIZE - 1; \
197 #define EMPTY_TO_MINSIZE(mp) do { \
198 memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable)); \
199 (mp)->ma_used = (mp)->ma_fill = 0; \
200 INIT_NONZERO_DICT_SLOTS(mp); \
203 /* Dictionary reuse scheme to save calls to malloc, free, and memset */
204 #ifndef PyDict_MAXFREELIST
205 #define PyDict_MAXFREELIST 80
207 static PyDictObject
*free_list
[PyDict_MAXFREELIST
];
208 static int numfree
= 0;
216 op
= free_list
[--numfree
];
217 assert(PyDict_CheckExact(op
));
225 register PyDictObject
*mp
;
226 if (dummy
== NULL
) { /* Auto-initialize dummy */
227 dummy
= PyString_FromString("<dummy key>");
230 #ifdef SHOW_CONVERSION_COUNTS
231 Py_AtExit(show_counts
);
233 #ifdef SHOW_ALLOC_COUNT
234 Py_AtExit(show_alloc
);
238 mp
= free_list
[--numfree
];
240 assert (Py_TYPE(mp
) == &PyDict_Type
);
241 _Py_NewReference((PyObject
*)mp
);
243 EMPTY_TO_MINSIZE(mp
);
245 /* At least set ma_table and ma_mask; these are wrong
246 if an empty but presized dict is added to freelist */
247 INIT_NONZERO_DICT_SLOTS(mp
);
249 assert (mp
->ma_used
== 0);
250 assert (mp
->ma_table
== mp
->ma_smalltable
);
251 assert (mp
->ma_mask
== PyDict_MINSIZE
- 1);
252 #ifdef SHOW_ALLOC_COUNT
256 mp
= PyObject_GC_New(PyDictObject
, &PyDict_Type
);
259 EMPTY_TO_MINSIZE(mp
);
260 #ifdef SHOW_ALLOC_COUNT
264 mp
->ma_lookup
= lookdict_string
;
265 #ifdef SHOW_CONVERSION_COUNTS
268 _PyObject_GC_TRACK(mp
);
269 return (PyObject
*)mp
;
273 The basic lookup function used by all operations.
274 This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
275 Open addressing is preferred over chaining since the link overhead for
276 chaining would be substantial (100% with typical malloc overhead).
278 The initial probe index is computed as hash mod the table size. Subsequent
279 probe indices are computed as explained earlier.
281 All arithmetic on hash should ignore overflow.
283 (The details in this version are due to Tim Peters, building on many past
284 contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
287 lookdict() is general-purpose, and may return NULL if (and only if) a
288 comparison raises an exception (this was new in Python 2.5).
289 lookdict_string() below is specialized to string keys, comparison of which can
290 never raise an exception; that function can never return NULL. For both, when
291 the key isn't found a PyDictEntry* is returned for which the me_value field is
292 NULL; this is the slot in the dict at which the key would have been found, and
293 the caller can (if it wishes) add the <key, value> pair to the returned
297 lookdict(PyDictObject
*mp
, PyObject
*key
, register long hash
)
300 register size_t perturb
;
301 register PyDictEntry
*freeslot
;
302 register size_t mask
= (size_t)mp
->ma_mask
;
303 PyDictEntry
*ep0
= mp
->ma_table
;
304 register PyDictEntry
*ep
;
308 i
= (size_t)hash
& mask
;
310 if (ep
->me_key
== NULL
|| ep
->me_key
== key
)
313 if (ep
->me_key
== dummy
)
316 if (ep
->me_hash
== hash
) {
317 startkey
= ep
->me_key
;
319 cmp
= PyObject_RichCompareBool(startkey
, key
, Py_EQ
);
323 if (ep0
== mp
->ma_table
&& ep
->me_key
== startkey
) {
328 /* The compare did major nasty stuff to the
330 * XXX A clever adversary could prevent this
331 * XXX from terminating.
333 return lookdict(mp
, key
, hash
);
339 /* In the loop, me_key == dummy is by far (factor of 100s) the
340 least likely outcome, so test for that last. */
341 for (perturb
= hash
; ; perturb
>>= PERTURB_SHIFT
) {
342 i
= (i
<< 2) + i
+ perturb
+ 1;
344 if (ep
->me_key
== NULL
)
345 return freeslot
== NULL
? ep
: freeslot
;
346 if (ep
->me_key
== key
)
348 if (ep
->me_hash
== hash
&& ep
->me_key
!= dummy
) {
349 startkey
= ep
->me_key
;
351 cmp
= PyObject_RichCompareBool(startkey
, key
, Py_EQ
);
355 if (ep0
== mp
->ma_table
&& ep
->me_key
== startkey
) {
360 /* The compare did major nasty stuff to the
362 * XXX A clever adversary could prevent this
363 * XXX from terminating.
365 return lookdict(mp
, key
, hash
);
368 else if (ep
->me_key
== dummy
&& freeslot
== NULL
)
371 assert(0); /* NOT REACHED */
376 * Hacked up version of lookdict which can assume keys are always strings;
377 * this assumption allows testing for errors during PyObject_RichCompareBool()
378 * to be dropped; string-string comparisons never raise exceptions. This also
379 * means we don't need to go through PyObject_RichCompareBool(); we can always
380 * use _PyString_Eq() directly.
382 * This is valuable because dicts with only string keys are very common.
385 lookdict_string(PyDictObject
*mp
, PyObject
*key
, register long hash
)
388 register size_t perturb
;
389 register PyDictEntry
*freeslot
;
390 register size_t mask
= (size_t)mp
->ma_mask
;
391 PyDictEntry
*ep0
= mp
->ma_table
;
392 register PyDictEntry
*ep
;
394 /* Make sure this function doesn't have to handle non-string keys,
395 including subclasses of str; e.g., one reason to subclass
396 strings is to override __eq__, and for speed we don't cater to
398 if (!PyString_CheckExact(key
)) {
399 #ifdef SHOW_CONVERSION_COUNTS
402 mp
->ma_lookup
= lookdict
;
403 return lookdict(mp
, key
, hash
);
407 if (ep
->me_key
== NULL
|| ep
->me_key
== key
)
409 if (ep
->me_key
== dummy
)
412 if (ep
->me_hash
== hash
&& _PyString_Eq(ep
->me_key
, key
))
417 /* In the loop, me_key == dummy is by far (factor of 100s) the
418 least likely outcome, so test for that last. */
419 for (perturb
= hash
; ; perturb
>>= PERTURB_SHIFT
) {
420 i
= (i
<< 2) + i
+ perturb
+ 1;
422 if (ep
->me_key
== NULL
)
423 return freeslot
== NULL
? ep
: freeslot
;
424 if (ep
->me_key
== key
425 || (ep
->me_hash
== hash
426 && ep
->me_key
!= dummy
427 && _PyString_Eq(ep
->me_key
, key
)))
429 if (ep
->me_key
== dummy
&& freeslot
== NULL
)
432 assert(0); /* NOT REACHED */
437 Internal routine to insert a new item into the table.
438 Used both by the internal resize routine and by the public insert routine.
439 Eats a reference to key and one to value.
440 Returns -1 if an error occurred, or 0 on success.
443 insertdict(register PyDictObject
*mp
, PyObject
*key
, long hash
, PyObject
*value
)
446 register PyDictEntry
*ep
;
447 typedef PyDictEntry
*(*lookupfunc
)(PyDictObject
*, PyObject
*, long);
449 assert(mp
->ma_lookup
!= NULL
);
450 ep
= mp
->ma_lookup(mp
, key
, hash
);
456 if (ep
->me_value
!= NULL
) {
457 old_value
= ep
->me_value
;
458 ep
->me_value
= value
;
459 Py_DECREF(old_value
); /* which **CAN** re-enter */
463 if (ep
->me_key
== NULL
)
466 assert(ep
->me_key
== dummy
);
470 ep
->me_hash
= (Py_ssize_t
)hash
;
471 ep
->me_value
= value
;
478 Internal routine used by dictresize() to insert an item which is
479 known to be absent from the dict. This routine also assumes that
480 the dict contains no deleted entries. Besides the performance benefit,
481 using insertdict() in dictresize() is dangerous (SF bug #1456209).
482 Note that no refcounts are changed by this routine; if needed, the caller
483 is responsible for incref'ing `key` and `value`.
486 insertdict_clean(register PyDictObject
*mp
, PyObject
*key
, long hash
,
490 register size_t perturb
;
491 register size_t mask
= (size_t)mp
->ma_mask
;
492 PyDictEntry
*ep0
= mp
->ma_table
;
493 register PyDictEntry
*ep
;
497 for (perturb
= hash
; ep
->me_key
!= NULL
; perturb
>>= PERTURB_SHIFT
) {
498 i
= (i
<< 2) + i
+ perturb
+ 1;
501 assert(ep
->me_value
== NULL
);
504 ep
->me_hash
= (Py_ssize_t
)hash
;
505 ep
->me_value
= value
;
510 Restructure the table by allocating a new table and reinserting all
511 items again. When entries have been deleted, the new table may
512 actually be smaller than the old one.
515 dictresize(PyDictObject
*mp
, Py_ssize_t minused
)
518 PyDictEntry
*oldtable
, *newtable
, *ep
;
520 int is_oldtable_malloced
;
521 PyDictEntry small_copy
[PyDict_MINSIZE
];
523 assert(minused
>= 0);
525 /* Find the smallest table size > minused. */
526 for (newsize
= PyDict_MINSIZE
;
527 newsize
<= minused
&& newsize
> 0;
535 /* Get space for a new table. */
536 oldtable
= mp
->ma_table
;
537 assert(oldtable
!= NULL
);
538 is_oldtable_malloced
= oldtable
!= mp
->ma_smalltable
;
540 if (newsize
== PyDict_MINSIZE
) {
541 /* A large table is shrinking, or we can't get any smaller. */
542 newtable
= mp
->ma_smalltable
;
543 if (newtable
== oldtable
) {
544 if (mp
->ma_fill
== mp
->ma_used
) {
545 /* No dummies, so no point doing anything. */
548 /* We're not going to resize it, but rebuild the
549 table anyway to purge old dummy entries.
550 Subtle: This is *necessary* if fill==size,
551 as lookdict needs at least one virgin slot to
552 terminate failing searches. If fill < size, it's
553 merely desirable, as dummies slow searches. */
554 assert(mp
->ma_fill
> mp
->ma_used
);
555 memcpy(small_copy
, oldtable
, sizeof(small_copy
));
556 oldtable
= small_copy
;
560 newtable
= PyMem_NEW(PyDictEntry
, newsize
);
561 if (newtable
== NULL
) {
567 /* Make the dict empty, using the new table. */
568 assert(newtable
!= oldtable
);
569 mp
->ma_table
= newtable
;
570 mp
->ma_mask
= newsize
- 1;
571 memset(newtable
, 0, sizeof(PyDictEntry
) * newsize
);
576 /* Copy the data over; this is refcount-neutral for active entries;
577 dummy entries aren't copied over, of course */
578 for (ep
= oldtable
; i
> 0; ep
++) {
579 if (ep
->me_value
!= NULL
) { /* active entry */
581 insertdict_clean(mp
, ep
->me_key
, (long)ep
->me_hash
,
584 else if (ep
->me_key
!= NULL
) { /* dummy entry */
586 assert(ep
->me_key
== dummy
);
587 Py_DECREF(ep
->me_key
);
589 /* else key == value == NULL: nothing to do */
592 if (is_oldtable_malloced
)
597 /* Create a new dictionary pre-sized to hold an estimated number of elements.
598 Underestimates are okay because the dictionary will resize as necessary.
599 Overestimates just mean the dictionary will be more sparse than usual.
603 _PyDict_NewPresized(Py_ssize_t minused
)
605 PyObject
*op
= PyDict_New();
607 if (minused
>5 && op
!= NULL
&& dictresize((PyDictObject
*)op
, minused
) == -1) {
614 /* Note that, for historical reasons, PyDict_GetItem() suppresses all errors
615 * that may occur (originally dicts supported only string keys, and exceptions
616 * weren't possible). So, while the original intent was that a NULL return
617 * meant the key wasn't present, in reality it can mean that, or that an error
618 * (suppressed) occurred while computing the key's hash, or that some error
619 * (suppressed) occurred when comparing keys in the dict's internal probe
620 * sequence. A nasty example of the latter is when a Python-coded comparison
621 * function hits a stack-depth error, which can cause this to return NULL
622 * even if the key is present.
625 PyDict_GetItem(PyObject
*op
, PyObject
*key
)
628 PyDictObject
*mp
= (PyDictObject
*)op
;
630 PyThreadState
*tstate
;
631 if (!PyDict_Check(op
))
633 if (!PyString_CheckExact(key
) ||
634 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1)
636 hash
= PyObject_Hash(key
);
643 /* We can arrive here with a NULL tstate during initialization:
644 try running "python -Wi" for an example related to string
645 interning. Let's just hope that no exception occurs then... */
646 tstate
= _PyThreadState_Current
;
647 if (tstate
!= NULL
&& tstate
->curexc_type
!= NULL
) {
648 /* preserve the existing exception */
649 PyObject
*err_type
, *err_value
, *err_tb
;
650 PyErr_Fetch(&err_type
, &err_value
, &err_tb
);
651 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
653 PyErr_Restore(err_type
, err_value
, err_tb
);
658 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
667 /* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
668 * dictionary if it's merely replacing the value for an existing key.
669 * This means that it's safe to loop over a dictionary with PyDict_Next()
670 * and occasionally replace a value -- but you can't insert new keys or
674 PyDict_SetItem(register PyObject
*op
, PyObject
*key
, PyObject
*value
)
676 register PyDictObject
*mp
;
678 register Py_ssize_t n_used
;
680 if (!PyDict_Check(op
)) {
681 PyErr_BadInternalCall();
686 mp
= (PyDictObject
*)op
;
687 if (PyString_CheckExact(key
)) {
688 hash
= ((PyStringObject
*)key
)->ob_shash
;
690 hash
= PyObject_Hash(key
);
693 hash
= PyObject_Hash(key
);
697 assert(mp
->ma_fill
<= mp
->ma_mask
); /* at least one empty slot */
698 n_used
= mp
->ma_used
;
701 if (insertdict(mp
, key
, hash
, value
) != 0)
703 /* If we added a key, we can safely resize. Otherwise just return!
704 * If fill >= 2/3 size, adjust size. Normally, this doubles or
705 * quaduples the size, but it's also possible for the dict to shrink
706 * (if ma_fill is much larger than ma_used, meaning a lot of dict
707 * keys have been * deleted).
709 * Quadrupling the size improves average dictionary sparseness
710 * (reducing collisions) at the cost of some memory and iteration
711 * speed (which loops over every possible entry). It also halves
712 * the number of expensive resize operations in a growing dictionary.
714 * Very large dictionaries (over 50K items) use doubling instead.
715 * This may help applications with severe memory constraints.
717 if (!(mp
->ma_used
> n_used
&& mp
->ma_fill
*3 >= (mp
->ma_mask
+1)*2))
719 return dictresize(mp
, (mp
->ma_used
> 50000 ? 2 : 4) * mp
->ma_used
);
723 PyDict_DelItem(PyObject
*op
, PyObject
*key
)
725 register PyDictObject
*mp
;
727 register PyDictEntry
*ep
;
728 PyObject
*old_value
, *old_key
;
730 if (!PyDict_Check(op
)) {
731 PyErr_BadInternalCall();
735 if (!PyString_CheckExact(key
) ||
736 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
737 hash
= PyObject_Hash(key
);
741 mp
= (PyDictObject
*)op
;
742 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
745 if (ep
->me_value
== NULL
) {
749 old_key
= ep
->me_key
;
752 old_value
= ep
->me_value
;
755 Py_DECREF(old_value
);
761 PyDict_Clear(PyObject
*op
)
764 PyDictEntry
*ep
, *table
;
765 int table_is_malloced
;
767 PyDictEntry small_copy
[PyDict_MINSIZE
];
772 if (!PyDict_Check(op
))
774 mp
= (PyDictObject
*)op
;
780 table
= mp
->ma_table
;
781 assert(table
!= NULL
);
782 table_is_malloced
= table
!= mp
->ma_smalltable
;
784 /* This is delicate. During the process of clearing the dict,
785 * decrefs can cause the dict to mutate. To avoid fatal confusion
786 * (voice of experience), we have to make the dict empty before
787 * clearing the slots, and never refer to anything via mp->xxx while
791 if (table_is_malloced
)
792 EMPTY_TO_MINSIZE(mp
);
795 /* It's a small table with something that needs to be cleared.
796 * Afraid the only safe way is to copy the dict entries into
797 * another small table first.
799 memcpy(small_copy
, table
, sizeof(small_copy
));
801 EMPTY_TO_MINSIZE(mp
);
803 /* else it's a small table that's already empty */
805 /* Now we can finally clear things. If C had refcounts, we could
806 * assert that the refcount on table is 1 now, i.e. that this function
807 * has unique access to it, so decref side-effects can't alter it.
809 for (ep
= table
; fill
> 0; ++ep
) {
816 Py_DECREF(ep
->me_key
);
817 Py_XDECREF(ep
->me_value
);
821 assert(ep
->me_value
== NULL
);
825 if (table_is_malloced
)
830 * Iterate over a dict. Use like so:
833 * PyObject *key, *value;
834 * i = 0; # important! i should not otherwise be changed by you
835 * while (PyDict_Next(yourdict, &i, &key, &value)) {
836 * Refer to borrowed references in key and value.
839 * CAUTION: In general, it isn't safe to use PyDict_Next in a loop that
840 * mutates the dict. One exception: it is safe if the loop merely changes
841 * the values associated with the keys (but doesn't insert new keys or
842 * delete keys), via PyDict_SetItem().
845 PyDict_Next(PyObject
*op
, Py_ssize_t
*ppos
, PyObject
**pkey
, PyObject
**pvalue
)
847 register Py_ssize_t i
;
848 register Py_ssize_t mask
;
849 register PyDictEntry
*ep
;
851 if (!PyDict_Check(op
))
856 ep
= ((PyDictObject
*)op
)->ma_table
;
857 mask
= ((PyDictObject
*)op
)->ma_mask
;
858 while (i
<= mask
&& ep
[i
].me_value
== NULL
)
864 *pkey
= ep
[i
].me_key
;
866 *pvalue
= ep
[i
].me_value
;
870 /* Internal version of PyDict_Next that returns a hash value in addition to the key and value.*/
872 _PyDict_Next(PyObject
*op
, Py_ssize_t
*ppos
, PyObject
**pkey
, PyObject
**pvalue
, long *phash
)
874 register Py_ssize_t i
;
875 register Py_ssize_t mask
;
876 register PyDictEntry
*ep
;
878 if (!PyDict_Check(op
))
883 ep
= ((PyDictObject
*)op
)->ma_table
;
884 mask
= ((PyDictObject
*)op
)->ma_mask
;
885 while (i
<= mask
&& ep
[i
].me_value
== NULL
)
890 *phash
= (long)(ep
[i
].me_hash
);
892 *pkey
= ep
[i
].me_key
;
894 *pvalue
= ep
[i
].me_value
;
901 dict_dealloc(register PyDictObject
*mp
)
903 register PyDictEntry
*ep
;
904 Py_ssize_t fill
= mp
->ma_fill
;
905 PyObject_GC_UnTrack(mp
);
906 Py_TRASHCAN_SAFE_BEGIN(mp
)
907 for (ep
= mp
->ma_table
; fill
> 0; ep
++) {
910 Py_DECREF(ep
->me_key
);
911 Py_XDECREF(ep
->me_value
);
914 if (mp
->ma_table
!= mp
->ma_smalltable
)
915 PyMem_DEL(mp
->ma_table
);
916 if (numfree
< PyDict_MAXFREELIST
&& Py_TYPE(mp
) == &PyDict_Type
)
917 free_list
[numfree
++] = mp
;
919 Py_TYPE(mp
)->tp_free((PyObject
*)mp
);
920 Py_TRASHCAN_SAFE_END(mp
)
924 dict_print(register PyDictObject
*mp
, register FILE *fp
, register int flags
)
926 register Py_ssize_t i
;
927 register Py_ssize_t any
;
930 status
= Py_ReprEnter((PyObject
*)mp
);
934 Py_BEGIN_ALLOW_THREADS
935 fprintf(fp
, "{...}");
940 Py_BEGIN_ALLOW_THREADS
944 for (i
= 0; i
<= mp
->ma_mask
; i
++) {
945 PyDictEntry
*ep
= mp
->ma_table
+ i
;
946 PyObject
*pvalue
= ep
->me_value
;
947 if (pvalue
!= NULL
) {
948 /* Prevent PyObject_Repr from deleting value during
952 Py_BEGIN_ALLOW_THREADS
956 if (PyObject_Print((PyObject
*)ep
->me_key
, fp
, 0)!=0) {
958 Py_ReprLeave((PyObject
*)mp
);
961 Py_BEGIN_ALLOW_THREADS
964 if (PyObject_Print(pvalue
, fp
, 0) != 0) {
966 Py_ReprLeave((PyObject
*)mp
);
972 Py_BEGIN_ALLOW_THREADS
975 Py_ReprLeave((PyObject
*)mp
);
980 dict_repr(PyDictObject
*mp
)
983 PyObject
*s
, *temp
, *colon
= NULL
;
984 PyObject
*pieces
= NULL
, *result
= NULL
;
985 PyObject
*key
, *value
;
987 i
= Py_ReprEnter((PyObject
*)mp
);
989 return i
> 0 ? PyString_FromString("{...}") : NULL
;
992 if (mp
->ma_used
== 0) {
993 result
= PyString_FromString("{}");
997 pieces
= PyList_New(0);
1001 colon
= PyString_FromString(": ");
1005 /* Do repr() on each key+value pair, and insert ": " between them.
1006 Note that repr may mutate the dict. */
1008 while (PyDict_Next((PyObject
*)mp
, &i
, &key
, &value
)) {
1010 /* Prevent repr from deleting value during key format. */
1012 s
= PyObject_Repr(key
);
1013 PyString_Concat(&s
, colon
);
1014 PyString_ConcatAndDel(&s
, PyObject_Repr(value
));
1018 status
= PyList_Append(pieces
, s
);
1019 Py_DECREF(s
); /* append created a new ref */
1024 /* Add "{}" decorations to the first and last items. */
1025 assert(PyList_GET_SIZE(pieces
) > 0);
1026 s
= PyString_FromString("{");
1029 temp
= PyList_GET_ITEM(pieces
, 0);
1030 PyString_ConcatAndDel(&s
, temp
);
1031 PyList_SET_ITEM(pieces
, 0, s
);
1035 s
= PyString_FromString("}");
1038 temp
= PyList_GET_ITEM(pieces
, PyList_GET_SIZE(pieces
) - 1);
1039 PyString_ConcatAndDel(&temp
, s
);
1040 PyList_SET_ITEM(pieces
, PyList_GET_SIZE(pieces
) - 1, temp
);
1044 /* Paste them all together with ", " between. */
1045 s
= PyString_FromString(", ");
1048 result
= _PyString_Join(s
, pieces
);
1054 Py_ReprLeave((PyObject
*)mp
);
1059 dict_length(PyDictObject
*mp
)
1065 dict_subscript(PyDictObject
*mp
, register PyObject
*key
)
1070 assert(mp
->ma_table
!= NULL
);
1071 if (!PyString_CheckExact(key
) ||
1072 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1073 hash
= PyObject_Hash(key
);
1077 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1082 if (!PyDict_CheckExact(mp
)) {
1083 /* Look up __missing__ method if we're a subclass. */
1085 static PyObject
*missing_str
= NULL
;
1086 if (missing_str
== NULL
)
1088 PyString_InternFromString("__missing__");
1089 missing
= _PyType_Lookup(Py_TYPE(mp
), missing_str
);
1090 if (missing
!= NULL
)
1091 return PyObject_CallFunctionObjArgs(missing
,
1092 (PyObject
*)mp
, key
, NULL
);
1103 dict_ass_sub(PyDictObject
*mp
, PyObject
*v
, PyObject
*w
)
1106 return PyDict_DelItem((PyObject
*)mp
, v
);
1108 return PyDict_SetItem((PyObject
*)mp
, v
, w
);
1111 static PyMappingMethods dict_as_mapping
= {
1112 (lenfunc
)dict_length
, /*mp_length*/
1113 (binaryfunc
)dict_subscript
, /*mp_subscript*/
1114 (objobjargproc
)dict_ass_sub
, /*mp_ass_subscript*/
1118 dict_keys(register PyDictObject
*mp
)
1120 register PyObject
*v
;
1121 register Py_ssize_t i
, j
;
1130 if (n
!= mp
->ma_used
) {
1131 /* Durnit. The allocations caused the dict to resize.
1132 * Just start over, this shouldn't normally happen.
1139 for (i
= 0, j
= 0; i
<= mask
; i
++) {
1140 if (ep
[i
].me_value
!= NULL
) {
1141 PyObject
*key
= ep
[i
].me_key
;
1143 PyList_SET_ITEM(v
, j
, key
);
1152 dict_values(register PyDictObject
*mp
)
1154 register PyObject
*v
;
1155 register Py_ssize_t i
, j
;
1164 if (n
!= mp
->ma_used
) {
1165 /* Durnit. The allocations caused the dict to resize.
1166 * Just start over, this shouldn't normally happen.
1173 for (i
= 0, j
= 0; i
<= mask
; i
++) {
1174 if (ep
[i
].me_value
!= NULL
) {
1175 PyObject
*value
= ep
[i
].me_value
;
1177 PyList_SET_ITEM(v
, j
, value
);
1186 dict_items(register PyDictObject
*mp
)
1188 register PyObject
*v
;
1189 register Py_ssize_t i
, j
, n
;
1191 PyObject
*item
, *key
, *value
;
1194 /* Preallocate the list of tuples, to avoid allocations during
1195 * the loop over the items, which could trigger GC, which
1196 * could resize the dict. :-(
1203 for (i
= 0; i
< n
; i
++) {
1204 item
= PyTuple_New(2);
1209 PyList_SET_ITEM(v
, i
, item
);
1211 if (n
!= mp
->ma_used
) {
1212 /* Durnit. The allocations caused the dict to resize.
1213 * Just start over, this shouldn't normally happen.
1218 /* Nothing we do below makes any function calls. */
1221 for (i
= 0, j
= 0; i
<= mask
; i
++) {
1222 if ((value
=ep
[i
].me_value
) != NULL
) {
1224 item
= PyList_GET_ITEM(v
, j
);
1226 PyTuple_SET_ITEM(item
, 0, key
);
1228 PyTuple_SET_ITEM(item
, 1, value
);
1237 dict_fromkeys(PyObject
*cls
, PyObject
*args
)
1240 PyObject
*value
= Py_None
;
1241 PyObject
*it
; /* iter(seq) */
1246 if (!PyArg_UnpackTuple(args
, "fromkeys", 1, 2, &seq
, &value
))
1249 d
= PyObject_CallObject(cls
, NULL
);
1253 if (PyDict_CheckExact(d
) && PyDict_CheckExact(seq
)) {
1254 PyDictObject
*mp
= (PyDictObject
*)d
;
1260 if (dictresize(mp
, Py_SIZE(seq
)))
1263 while (_PyDict_Next(seq
, &pos
, &key
, &oldvalue
, &hash
)) {
1266 if (insertdict(mp
, key
, hash
, value
))
1272 if (PyDict_CheckExact(d
) && PyAnySet_CheckExact(seq
)) {
1273 PyDictObject
*mp
= (PyDictObject
*)d
;
1278 if (dictresize(mp
, PySet_GET_SIZE(seq
)))
1281 while (_PySet_NextEntry(seq
, &pos
, &key
, &hash
)) {
1284 if (insertdict(mp
, key
, hash
, value
))
1290 it
= PyObject_GetIter(seq
);
1296 if (PyDict_CheckExact(d
)) {
1297 while ((key
= PyIter_Next(it
)) != NULL
) {
1298 status
= PyDict_SetItem(d
, key
, value
);
1304 while ((key
= PyIter_Next(it
)) != NULL
) {
1305 status
= PyObject_SetItem(d
, key
, value
);
1312 if (PyErr_Occurred())
1324 dict_update_common(PyObject
*self
, PyObject
*args
, PyObject
*kwds
, char *methname
)
1326 PyObject
*arg
= NULL
;
1329 if (!PyArg_UnpackTuple(args
, methname
, 0, 1, &arg
))
1332 else if (arg
!= NULL
) {
1333 if (PyObject_HasAttrString(arg
, "keys"))
1334 result
= PyDict_Merge(self
, arg
, 1);
1336 result
= PyDict_MergeFromSeq2(self
, arg
, 1);
1338 if (result
== 0 && kwds
!= NULL
)
1339 result
= PyDict_Merge(self
, kwds
, 1);
1344 dict_update(PyObject
*self
, PyObject
*args
, PyObject
*kwds
)
1346 if (dict_update_common(self
, args
, kwds
, "update") != -1)
1351 /* Update unconditionally replaces existing items.
1352 Merge has a 3rd argument 'override'; if set, it acts like Update,
1353 otherwise it leaves existing items unchanged.
1355 PyDict_{Update,Merge} update/merge from a mapping object.
1357 PyDict_MergeFromSeq2 updates/merges from any iterable object
1358 producing iterable objects of length 2.
1362 PyDict_MergeFromSeq2(PyObject
*d
, PyObject
*seq2
, int override
)
1364 PyObject
*it
; /* iter(seq2) */
1365 Py_ssize_t i
; /* index into seq2 of current element */
1366 PyObject
*item
; /* seq2[i] */
1367 PyObject
*fast
; /* item as a 2-tuple or 2-list */
1370 assert(PyDict_Check(d
));
1371 assert(seq2
!= NULL
);
1373 it
= PyObject_GetIter(seq2
);
1377 for (i
= 0; ; ++i
) {
1378 PyObject
*key
, *value
;
1382 item
= PyIter_Next(it
);
1384 if (PyErr_Occurred())
1389 /* Convert item to sequence, and verify length 2. */
1390 fast
= PySequence_Fast(item
, "");
1392 if (PyErr_ExceptionMatches(PyExc_TypeError
))
1393 PyErr_Format(PyExc_TypeError
,
1394 "cannot convert dictionary update "
1395 "sequence element #%zd to a sequence",
1399 n
= PySequence_Fast_GET_SIZE(fast
);
1401 PyErr_Format(PyExc_ValueError
,
1402 "dictionary update sequence element #%zd "
1403 "has length %zd; 2 is required",
1408 /* Update/merge with this (key, value) pair. */
1409 key
= PySequence_Fast_GET_ITEM(fast
, 0);
1410 value
= PySequence_Fast_GET_ITEM(fast
, 1);
1411 if (override
|| PyDict_GetItem(d
, key
) == NULL
) {
1412 int status
= PyDict_SetItem(d
, key
, value
);
1428 return Py_SAFE_DOWNCAST(i
, Py_ssize_t
, int);
1432 PyDict_Update(PyObject
*a
, PyObject
*b
)
1434 return PyDict_Merge(a
, b
, 1);
1438 PyDict_Merge(PyObject
*a
, PyObject
*b
, int override
)
1440 register PyDictObject
*mp
, *other
;
1441 register Py_ssize_t i
;
1444 /* We accept for the argument either a concrete dictionary object,
1445 * or an abstract "mapping" object. For the former, we can do
1446 * things quite efficiently. For the latter, we only require that
1447 * PyMapping_Keys() and PyObject_GetItem() be supported.
1449 if (a
== NULL
|| !PyDict_Check(a
) || b
== NULL
) {
1450 PyErr_BadInternalCall();
1453 mp
= (PyDictObject
*)a
;
1454 if (PyDict_Check(b
)) {
1455 other
= (PyDictObject
*)b
;
1456 if (other
== mp
|| other
->ma_used
== 0)
1457 /* a.update(a) or a.update({}); nothing to do */
1459 if (mp
->ma_used
== 0)
1460 /* Since the target dict is empty, PyDict_GetItem()
1461 * always returns NULL. Setting override to 1
1462 * skips the unnecessary test.
1465 /* Do one big resize at the start, rather than
1466 * incrementally resizing as we insert new items. Expect
1467 * that there will be no (or few) overlapping keys.
1469 if ((mp
->ma_fill
+ other
->ma_used
)*3 >= (mp
->ma_mask
+1)*2) {
1470 if (dictresize(mp
, (mp
->ma_used
+ other
->ma_used
)*2) != 0)
1473 for (i
= 0; i
<= other
->ma_mask
; i
++) {
1474 entry
= &other
->ma_table
[i
];
1475 if (entry
->me_value
!= NULL
&&
1477 PyDict_GetItem(a
, entry
->me_key
) == NULL
)) {
1478 Py_INCREF(entry
->me_key
);
1479 Py_INCREF(entry
->me_value
);
1480 if (insertdict(mp
, entry
->me_key
,
1481 (long)entry
->me_hash
,
1482 entry
->me_value
) != 0)
1488 /* Do it the generic, slower way */
1489 PyObject
*keys
= PyMapping_Keys(b
);
1491 PyObject
*key
, *value
;
1495 /* Docstring says this is equivalent to E.keys() so
1496 * if E doesn't have a .keys() method we want
1497 * AttributeError to percolate up. Might as well
1498 * do the same for any other error.
1502 iter
= PyObject_GetIter(keys
);
1507 for (key
= PyIter_Next(iter
); key
; key
= PyIter_Next(iter
)) {
1508 if (!override
&& PyDict_GetItem(a
, key
) != NULL
) {
1512 value
= PyObject_GetItem(b
, key
);
1513 if (value
== NULL
) {
1518 status
= PyDict_SetItem(a
, key
, value
);
1527 if (PyErr_Occurred())
1528 /* Iterator completed, via error */
1535 dict_copy(register PyDictObject
*mp
)
1537 return PyDict_Copy((PyObject
*)mp
);
1541 PyDict_Copy(PyObject
*o
)
1545 if (o
== NULL
|| !PyDict_Check(o
)) {
1546 PyErr_BadInternalCall();
1549 copy
= PyDict_New();
1552 if (PyDict_Merge(copy
, o
, 1) == 0)
1559 PyDict_Size(PyObject
*mp
)
1561 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1562 PyErr_BadInternalCall();
1565 return ((PyDictObject
*)mp
)->ma_used
;
1569 PyDict_Keys(PyObject
*mp
)
1571 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1572 PyErr_BadInternalCall();
1575 return dict_keys((PyDictObject
*)mp
);
1579 PyDict_Values(PyObject
*mp
)
1581 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1582 PyErr_BadInternalCall();
1585 return dict_values((PyDictObject
*)mp
);
1589 PyDict_Items(PyObject
*mp
)
1591 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1592 PyErr_BadInternalCall();
1595 return dict_items((PyDictObject
*)mp
);
1598 /* Subroutine which returns the smallest key in a for which b's value
1599 is different or absent. The value is returned too, through the
1600 pval argument. Both are NULL if no key in a is found for which b's status
1601 differs. The refcounts on (and only on) non-NULL *pval and function return
1602 values must be decremented by the caller (characterize() increments them
1603 to ensure that mutating comparison and PyDict_GetItem calls can't delete
1604 them before the caller is done looking at them). */
1607 characterize(PyDictObject
*a
, PyDictObject
*b
, PyObject
**pval
)
1609 PyObject
*akey
= NULL
; /* smallest key in a s.t. a[akey] != b[akey] */
1610 PyObject
*aval
= NULL
; /* a[akey] */
1614 for (i
= 0; i
<= a
->ma_mask
; i
++) {
1615 PyObject
*thiskey
, *thisaval
, *thisbval
;
1616 if (a
->ma_table
[i
].me_value
== NULL
)
1618 thiskey
= a
->ma_table
[i
].me_key
;
1619 Py_INCREF(thiskey
); /* keep alive across compares */
1621 cmp
= PyObject_RichCompareBool(akey
, thiskey
, Py_LT
);
1628 a
->ma_table
[i
].me_value
== NULL
)
1630 /* Not the *smallest* a key; or maybe it is
1631 * but the compare shrunk the dict so we can't
1632 * find its associated value anymore; or
1633 * maybe it is but the compare deleted the
1641 /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
1642 thisaval
= a
->ma_table
[i
].me_value
;
1644 Py_INCREF(thisaval
); /* keep alive */
1645 thisbval
= PyDict_GetItem((PyObject
*)b
, thiskey
);
1646 if (thisbval
== NULL
)
1649 /* both dicts have thiskey: same values? */
1650 cmp
= PyObject_RichCompareBool(
1651 thisaval
, thisbval
, Py_EQ
);
1654 Py_DECREF(thisaval
);
1667 Py_DECREF(thisaval
);
1681 dict_compare(PyDictObject
*a
, PyDictObject
*b
)
1683 PyObject
*adiff
, *bdiff
, *aval
, *bval
;
1686 /* Compare lengths first */
1687 if (a
->ma_used
< b
->ma_used
)
1688 return -1; /* a is shorter */
1689 else if (a
->ma_used
> b
->ma_used
)
1690 return 1; /* b is shorter */
1692 /* Same length -- check all keys */
1693 bdiff
= bval
= NULL
;
1694 adiff
= characterize(a
, b
, &aval
);
1695 if (adiff
== NULL
) {
1697 /* Either an error, or a is a subset with the same length so
1700 res
= PyErr_Occurred() ? -1 : 0;
1703 bdiff
= characterize(b
, a
, &bval
);
1704 if (bdiff
== NULL
&& PyErr_Occurred()) {
1711 /* bdiff == NULL "should be" impossible now, but perhaps
1712 * the last comparison done by the characterize() on a had
1713 * the side effect of making the dicts equal!
1715 res
= PyObject_Compare(adiff
, bdiff
);
1717 if (res
== 0 && bval
!= NULL
)
1718 res
= PyObject_Compare(aval
, bval
);
1728 /* Return 1 if dicts equal, 0 if not, -1 if error.
1729 * Gets out as soon as any difference is detected.
1730 * Uses only Py_EQ comparison.
1733 dict_equal(PyDictObject
*a
, PyDictObject
*b
)
1737 if (a
->ma_used
!= b
->ma_used
)
1738 /* can't be equal if # of entries differ */
1741 /* Same # of entries -- check all of 'em. Exit early on any diff. */
1742 for (i
= 0; i
<= a
->ma_mask
; i
++) {
1743 PyObject
*aval
= a
->ma_table
[i
].me_value
;
1747 PyObject
*key
= a
->ma_table
[i
].me_key
;
1748 /* temporarily bump aval's refcount to ensure it stays
1749 alive until we're done with it */
1753 bval
= PyDict_GetItem((PyObject
*)b
, key
);
1759 cmp
= PyObject_RichCompareBool(aval
, bval
, Py_EQ
);
1761 if (cmp
<= 0) /* error or not equal */
1769 dict_richcompare(PyObject
*v
, PyObject
*w
, int op
)
1774 if (!PyDict_Check(v
) || !PyDict_Check(w
)) {
1775 res
= Py_NotImplemented
;
1777 else if (op
== Py_EQ
|| op
== Py_NE
) {
1778 cmp
= dict_equal((PyDictObject
*)v
, (PyDictObject
*)w
);
1781 res
= (cmp
== (op
== Py_EQ
)) ? Py_True
: Py_False
;
1784 /* Py3K warning if comparison isn't == or != */
1785 if (PyErr_WarnPy3k("dict inequality comparisons not supported "
1789 res
= Py_NotImplemented
;
1796 dict_contains(register PyDictObject
*mp
, PyObject
*key
)
1801 if (!PyString_CheckExact(key
) ||
1802 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1803 hash
= PyObject_Hash(key
);
1807 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1810 return PyBool_FromLong(ep
->me_value
!= NULL
);
1814 dict_has_key(register PyDictObject
*mp
, PyObject
*key
)
1816 if (PyErr_WarnPy3k("dict.has_key() not supported in 3.x; "
1817 "use the in operator", 1) < 0)
1819 return dict_contains(mp
, key
);
1823 dict_get(register PyDictObject
*mp
, PyObject
*args
)
1826 PyObject
*failobj
= Py_None
;
1827 PyObject
*val
= NULL
;
1831 if (!PyArg_UnpackTuple(args
, "get", 1, 2, &key
, &failobj
))
1834 if (!PyString_CheckExact(key
) ||
1835 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1836 hash
= PyObject_Hash(key
);
1840 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1852 dict_setdefault(register PyDictObject
*mp
, PyObject
*args
)
1855 PyObject
*failobj
= Py_None
;
1856 PyObject
*val
= NULL
;
1860 if (!PyArg_UnpackTuple(args
, "setdefault", 1, 2, &key
, &failobj
))
1863 if (!PyString_CheckExact(key
) ||
1864 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1865 hash
= PyObject_Hash(key
);
1869 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1875 if (PyDict_SetItem((PyObject
*)mp
, key
, failobj
))
1884 dict_clear(register PyDictObject
*mp
)
1886 PyDict_Clear((PyObject
*)mp
);
1891 dict_pop(PyDictObject
*mp
, PyObject
*args
)
1895 PyObject
*old_value
, *old_key
;
1896 PyObject
*key
, *deflt
= NULL
;
1898 if(!PyArg_UnpackTuple(args
, "pop", 1, 2, &key
, &deflt
))
1900 if (mp
->ma_used
== 0) {
1905 PyErr_SetString(PyExc_KeyError
,
1906 "pop(): dictionary is empty");
1909 if (!PyString_CheckExact(key
) ||
1910 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1911 hash
= PyObject_Hash(key
);
1915 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1918 if (ep
->me_value
== NULL
) {
1926 old_key
= ep
->me_key
;
1929 old_value
= ep
->me_value
;
1930 ep
->me_value
= NULL
;
1937 dict_popitem(PyDictObject
*mp
)
1943 /* Allocate the result tuple before checking the size. Believe it
1944 * or not, this allocation could trigger a garbage collection which
1945 * could empty the dict, so if we checked the size first and that
1946 * happened, the result would be an infinite loop (searching for an
1947 * entry that no longer exists). Note that the usual popitem()
1948 * idiom is "while d: k, v = d.popitem()". so needing to throw the
1949 * tuple away if the dict *is* empty isn't a significant
1950 * inefficiency -- possible, but unlikely in practice.
1952 res
= PyTuple_New(2);
1955 if (mp
->ma_used
== 0) {
1957 PyErr_SetString(PyExc_KeyError
,
1958 "popitem(): dictionary is empty");
1961 /* Set ep to "the first" dict entry with a value. We abuse the hash
1962 * field of slot 0 to hold a search finger:
1963 * If slot 0 has a value, use slot 0.
1964 * Else slot 0 is being used to hold a search finger,
1965 * and we use its hash value as the first index to look.
1967 ep
= &mp
->ma_table
[0];
1968 if (ep
->me_value
== NULL
) {
1970 /* The hash field may be a real hash value, or it may be a
1971 * legit search finger, or it may be a once-legit search
1972 * finger that's out of bounds now because it wrapped around
1973 * or the table shrunk -- simply make sure it's in bounds now.
1975 if (i
> mp
->ma_mask
|| i
< 1)
1976 i
= 1; /* skip slot 0 */
1977 while ((ep
= &mp
->ma_table
[i
])->me_value
== NULL
) {
1979 if (i
> mp
->ma_mask
)
1983 PyTuple_SET_ITEM(res
, 0, ep
->me_key
);
1984 PyTuple_SET_ITEM(res
, 1, ep
->me_value
);
1987 ep
->me_value
= NULL
;
1989 assert(mp
->ma_table
[0].me_value
== NULL
);
1990 mp
->ma_table
[0].me_hash
= i
+ 1; /* next place to start */
1995 dict_traverse(PyObject
*op
, visitproc visit
, void *arg
)
2001 while (PyDict_Next(op
, &i
, &pk
, &pv
)) {
2009 dict_tp_clear(PyObject
*op
)
2016 extern PyTypeObject PyDictIterKey_Type
; /* Forward */
2017 extern PyTypeObject PyDictIterValue_Type
; /* Forward */
2018 extern PyTypeObject PyDictIterItem_Type
; /* Forward */
2019 static PyObject
*dictiter_new(PyDictObject
*, PyTypeObject
*);
2022 dict_iterkeys(PyDictObject
*dict
)
2024 return dictiter_new(dict
, &PyDictIterKey_Type
);
2028 dict_itervalues(PyDictObject
*dict
)
2030 return dictiter_new(dict
, &PyDictIterValue_Type
);
2034 dict_iteritems(PyDictObject
*dict
)
2036 return dictiter_new(dict
, &PyDictIterItem_Type
);
2040 dict_sizeof(PyDictObject
*mp
)
2044 res
= sizeof(PyDictObject
);
2045 if (mp
->ma_table
!= mp
->ma_smalltable
)
2046 res
= res
+ (mp
->ma_mask
+ 1) * sizeof(PyDictEntry
);
2047 return PyInt_FromSsize_t(res
);
2050 PyDoc_STRVAR(has_key__doc__
,
2051 "D.has_key(k) -> True if D has a key k, else False");
2053 PyDoc_STRVAR(contains__doc__
,
2054 "D.__contains__(k) -> True if D has a key k, else False");
2056 PyDoc_STRVAR(getitem__doc__
, "x.__getitem__(y) <==> x[y]");
2058 PyDoc_STRVAR(sizeof__doc__
,
2059 "D.__sizeof__() -> size of D in memory, in bytes");
2061 PyDoc_STRVAR(get__doc__
,
2062 "D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.");
2064 PyDoc_STRVAR(setdefault_doc__
,
2065 "D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D");
2067 PyDoc_STRVAR(pop__doc__
,
2068 "D.pop(k[,d]) -> v, remove specified key and return the corresponding value.\n\
2069 If key is not found, d is returned if given, otherwise KeyError is raised");
2071 PyDoc_STRVAR(popitem__doc__
,
2072 "D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
2073 2-tuple; but raise KeyError if D is empty.");
2075 PyDoc_STRVAR(keys__doc__
,
2076 "D.keys() -> list of D's keys");
2078 PyDoc_STRVAR(items__doc__
,
2079 "D.items() -> list of D's (key, value) pairs, as 2-tuples");
2081 PyDoc_STRVAR(values__doc__
,
2082 "D.values() -> list of D's values");
2084 PyDoc_STRVAR(update__doc__
,
2085 "D.update(E, **F) -> None. Update D from dict/iterable E and F.\n"
2086 "If E has a .keys() method, does: for k in E: D[k] = E[k]\n\
2087 If E lacks .keys() method, does: for (k, v) in E: D[k] = v\n\
2088 In either case, this is followed by: for k in F: D[k] = F[k]");
2090 PyDoc_STRVAR(fromkeys__doc__
,
2091 "dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v.\n\
2092 v defaults to None.");
2094 PyDoc_STRVAR(clear__doc__
,
2095 "D.clear() -> None. Remove all items from D.");
2097 PyDoc_STRVAR(copy__doc__
,
2098 "D.copy() -> a shallow copy of D");
2100 PyDoc_STRVAR(iterkeys__doc__
,
2101 "D.iterkeys() -> an iterator over the keys of D");
2103 PyDoc_STRVAR(itervalues__doc__
,
2104 "D.itervalues() -> an iterator over the values of D");
2106 PyDoc_STRVAR(iteritems__doc__
,
2107 "D.iteritems() -> an iterator over the (key, value) items of D");
2109 static PyMethodDef mapp_methods
[] = {
2110 {"__contains__",(PyCFunction
)dict_contains
, METH_O
| METH_COEXIST
,
2112 {"__getitem__", (PyCFunction
)dict_subscript
, METH_O
| METH_COEXIST
,
2114 {"__sizeof__", (PyCFunction
)dict_sizeof
, METH_NOARGS
,
2116 {"has_key", (PyCFunction
)dict_has_key
, METH_O
,
2118 {"get", (PyCFunction
)dict_get
, METH_VARARGS
,
2120 {"setdefault", (PyCFunction
)dict_setdefault
, METH_VARARGS
,
2122 {"pop", (PyCFunction
)dict_pop
, METH_VARARGS
,
2124 {"popitem", (PyCFunction
)dict_popitem
, METH_NOARGS
,
2126 {"keys", (PyCFunction
)dict_keys
, METH_NOARGS
,
2128 {"items", (PyCFunction
)dict_items
, METH_NOARGS
,
2130 {"values", (PyCFunction
)dict_values
, METH_NOARGS
,
2132 {"update", (PyCFunction
)dict_update
, METH_VARARGS
| METH_KEYWORDS
,
2134 {"fromkeys", (PyCFunction
)dict_fromkeys
, METH_VARARGS
| METH_CLASS
,
2136 {"clear", (PyCFunction
)dict_clear
, METH_NOARGS
,
2138 {"copy", (PyCFunction
)dict_copy
, METH_NOARGS
,
2140 {"iterkeys", (PyCFunction
)dict_iterkeys
, METH_NOARGS
,
2142 {"itervalues", (PyCFunction
)dict_itervalues
, METH_NOARGS
,
2144 {"iteritems", (PyCFunction
)dict_iteritems
, METH_NOARGS
,
2146 {NULL
, NULL
} /* sentinel */
2149 /* Return 1 if `key` is in dict `op`, 0 if not, and -1 on error. */
2151 PyDict_Contains(PyObject
*op
, PyObject
*key
)
2154 PyDictObject
*mp
= (PyDictObject
*)op
;
2157 if (!PyString_CheckExact(key
) ||
2158 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
2159 hash
= PyObject_Hash(key
);
2163 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
2164 return ep
== NULL
? -1 : (ep
->me_value
!= NULL
);
2167 /* Internal version of PyDict_Contains used when the hash value is already known */
2169 _PyDict_Contains(PyObject
*op
, PyObject
*key
, long hash
)
2171 PyDictObject
*mp
= (PyDictObject
*)op
;
2174 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
2175 return ep
== NULL
? -1 : (ep
->me_value
!= NULL
);
2178 /* Hack to implement "key in dict" */
2179 static PySequenceMethods dict_as_sequence
= {
2185 0, /* sq_ass_item */
2186 0, /* sq_ass_slice */
2187 PyDict_Contains
, /* sq_contains */
2188 0, /* sq_inplace_concat */
2189 0, /* sq_inplace_repeat */
2193 dict_new(PyTypeObject
*type
, PyObject
*args
, PyObject
*kwds
)
2197 assert(type
!= NULL
&& type
->tp_alloc
!= NULL
);
2198 self
= type
->tp_alloc(type
, 0);
2200 PyDictObject
*d
= (PyDictObject
*)self
;
2201 /* It's guaranteed that tp->alloc zeroed out the struct. */
2202 assert(d
->ma_table
== NULL
&& d
->ma_fill
== 0 && d
->ma_used
== 0);
2203 INIT_NONZERO_DICT_SLOTS(d
);
2204 d
->ma_lookup
= lookdict_string
;
2205 #ifdef SHOW_CONVERSION_COUNTS
2213 dict_init(PyObject
*self
, PyObject
*args
, PyObject
*kwds
)
2215 return dict_update_common(self
, args
, kwds
, "dict");
2219 dict_iter(PyDictObject
*dict
)
2221 return dictiter_new(dict
, &PyDictIterKey_Type
);
2224 PyDoc_STRVAR(dictionary_doc
,
2225 "dict() -> new empty dictionary.\n"
2226 "dict(mapping) -> new dictionary initialized from a mapping object's\n"
2227 " (key, value) pairs.\n"
2228 "dict(seq) -> new dictionary initialized as if via:\n"
2230 " for k, v in seq:\n"
2232 "dict(**kwargs) -> new dictionary initialized with the name=value pairs\n"
2233 " in the keyword argument list. For example: dict(one=1, two=2)");
2235 PyTypeObject PyDict_Type
= {
2236 PyVarObject_HEAD_INIT(&PyType_Type
, 0)
2238 sizeof(PyDictObject
),
2240 (destructor
)dict_dealloc
, /* tp_dealloc */
2241 (printfunc
)dict_print
, /* tp_print */
2244 (cmpfunc
)dict_compare
, /* tp_compare */
2245 (reprfunc
)dict_repr
, /* tp_repr */
2246 0, /* tp_as_number */
2247 &dict_as_sequence
, /* tp_as_sequence */
2248 &dict_as_mapping
, /* tp_as_mapping */
2249 (hashfunc
)PyObject_HashNotImplemented
, /* tp_hash */
2252 PyObject_GenericGetAttr
, /* tp_getattro */
2253 0, /* tp_setattro */
2254 0, /* tp_as_buffer */
2255 Py_TPFLAGS_DEFAULT
| Py_TPFLAGS_HAVE_GC
|
2256 Py_TPFLAGS_BASETYPE
| Py_TPFLAGS_DICT_SUBCLASS
, /* tp_flags */
2257 dictionary_doc
, /* tp_doc */
2258 dict_traverse
, /* tp_traverse */
2259 dict_tp_clear
, /* tp_clear */
2260 dict_richcompare
, /* tp_richcompare */
2261 0, /* tp_weaklistoffset */
2262 (getiterfunc
)dict_iter
, /* tp_iter */
2263 0, /* tp_iternext */
2264 mapp_methods
, /* tp_methods */
2269 0, /* tp_descr_get */
2270 0, /* tp_descr_set */
2271 0, /* tp_dictoffset */
2272 dict_init
, /* tp_init */
2273 PyType_GenericAlloc
, /* tp_alloc */
2274 dict_new
, /* tp_new */
2275 PyObject_GC_Del
, /* tp_free */
2278 /* For backward compatibility with old dictionary interface */
2281 PyDict_GetItemString(PyObject
*v
, const char *key
)
2284 kv
= PyString_FromString(key
);
2287 rv
= PyDict_GetItem(v
, kv
);
2293 PyDict_SetItemString(PyObject
*v
, const char *key
, PyObject
*item
)
2297 kv
= PyString_FromString(key
);
2300 PyString_InternInPlace(&kv
); /* XXX Should we really? */
2301 err
= PyDict_SetItem(v
, kv
, item
);
2307 PyDict_DelItemString(PyObject
*v
, const char *key
)
2311 kv
= PyString_FromString(key
);
2314 err
= PyDict_DelItem(v
, kv
);
2319 /* Dictionary iterator types */
2323 PyDictObject
*di_dict
; /* Set to NULL when iterator is exhausted */
2326 PyObject
* di_result
; /* reusable result tuple for iteritems */
2331 dictiter_new(PyDictObject
*dict
, PyTypeObject
*itertype
)
2334 di
= PyObject_New(dictiterobject
, itertype
);
2339 di
->di_used
= dict
->ma_used
;
2341 di
->len
= dict
->ma_used
;
2342 if (itertype
== &PyDictIterItem_Type
) {
2343 di
->di_result
= PyTuple_Pack(2, Py_None
, Py_None
);
2344 if (di
->di_result
== NULL
) {
2350 di
->di_result
= NULL
;
2351 return (PyObject
*)di
;
2355 dictiter_dealloc(dictiterobject
*di
)
2357 Py_XDECREF(di
->di_dict
);
2358 Py_XDECREF(di
->di_result
);
2363 dictiter_len(dictiterobject
*di
)
2366 if (di
->di_dict
!= NULL
&& di
->di_used
== di
->di_dict
->ma_used
)
2368 return PyInt_FromSize_t(len
);
2371 PyDoc_STRVAR(length_hint_doc
, "Private method returning an estimate of len(list(it)).");
2373 static PyMethodDef dictiter_methods
[] = {
2374 {"__length_hint__", (PyCFunction
)dictiter_len
, METH_NOARGS
, length_hint_doc
},
2375 {NULL
, NULL
} /* sentinel */
2378 static PyObject
*dictiter_iternextkey(dictiterobject
*di
)
2381 register Py_ssize_t i
, mask
;
2382 register PyDictEntry
*ep
;
2383 PyDictObject
*d
= di
->di_dict
;
2387 assert (PyDict_Check(d
));
2389 if (di
->di_used
!= d
->ma_used
) {
2390 PyErr_SetString(PyExc_RuntimeError
,
2391 "dictionary changed size during iteration");
2392 di
->di_used
= -1; /* Make this state sticky */
2401 while (i
<= mask
&& ep
[i
].me_value
== NULL
)
2417 PyTypeObject PyDictIterKey_Type
= {
2418 PyVarObject_HEAD_INIT(&PyType_Type
, 0)
2419 "dictionary-keyiterator", /* tp_name */
2420 sizeof(dictiterobject
), /* tp_basicsize */
2421 0, /* tp_itemsize */
2423 (destructor
)dictiter_dealloc
, /* tp_dealloc */
2429 0, /* tp_as_number */
2430 0, /* tp_as_sequence */
2431 0, /* tp_as_mapping */
2435 PyObject_GenericGetAttr
, /* tp_getattro */
2436 0, /* tp_setattro */
2437 0, /* tp_as_buffer */
2438 Py_TPFLAGS_DEFAULT
, /* tp_flags */
2440 0, /* tp_traverse */
2442 0, /* tp_richcompare */
2443 0, /* tp_weaklistoffset */
2444 PyObject_SelfIter
, /* tp_iter */
2445 (iternextfunc
)dictiter_iternextkey
, /* tp_iternext */
2446 dictiter_methods
, /* tp_methods */
2450 static PyObject
*dictiter_iternextvalue(dictiterobject
*di
)
2453 register Py_ssize_t i
, mask
;
2454 register PyDictEntry
*ep
;
2455 PyDictObject
*d
= di
->di_dict
;
2459 assert (PyDict_Check(d
));
2461 if (di
->di_used
!= d
->ma_used
) {
2462 PyErr_SetString(PyExc_RuntimeError
,
2463 "dictionary changed size during iteration");
2464 di
->di_used
= -1; /* Make this state sticky */
2470 if (i
< 0 || i
> mask
)
2473 while ((value
=ep
[i
].me_value
) == NULL
) {
2489 PyTypeObject PyDictIterValue_Type
= {
2490 PyVarObject_HEAD_INIT(&PyType_Type
, 0)
2491 "dictionary-valueiterator", /* tp_name */
2492 sizeof(dictiterobject
), /* tp_basicsize */
2493 0, /* tp_itemsize */
2495 (destructor
)dictiter_dealloc
, /* tp_dealloc */
2501 0, /* tp_as_number */
2502 0, /* tp_as_sequence */
2503 0, /* tp_as_mapping */
2507 PyObject_GenericGetAttr
, /* tp_getattro */
2508 0, /* tp_setattro */
2509 0, /* tp_as_buffer */
2510 Py_TPFLAGS_DEFAULT
, /* tp_flags */
2512 0, /* tp_traverse */
2514 0, /* tp_richcompare */
2515 0, /* tp_weaklistoffset */
2516 PyObject_SelfIter
, /* tp_iter */
2517 (iternextfunc
)dictiter_iternextvalue
, /* tp_iternext */
2518 dictiter_methods
, /* tp_methods */
2522 static PyObject
*dictiter_iternextitem(dictiterobject
*di
)
2524 PyObject
*key
, *value
, *result
= di
->di_result
;
2525 register Py_ssize_t i
, mask
;
2526 register PyDictEntry
*ep
;
2527 PyDictObject
*d
= di
->di_dict
;
2531 assert (PyDict_Check(d
));
2533 if (di
->di_used
!= d
->ma_used
) {
2534 PyErr_SetString(PyExc_RuntimeError
,
2535 "dictionary changed size during iteration");
2536 di
->di_used
= -1; /* Make this state sticky */
2545 while (i
<= mask
&& ep
[i
].me_value
== NULL
)
2551 if (result
->ob_refcnt
== 1) {
2553 Py_DECREF(PyTuple_GET_ITEM(result
, 0));
2554 Py_DECREF(PyTuple_GET_ITEM(result
, 1));
2556 result
= PyTuple_New(2);
2562 value
= ep
[i
].me_value
;
2565 PyTuple_SET_ITEM(result
, 0, key
);
2566 PyTuple_SET_ITEM(result
, 1, value
);
2575 PyTypeObject PyDictIterItem_Type
= {
2576 PyVarObject_HEAD_INIT(&PyType_Type
, 0)
2577 "dictionary-itemiterator", /* tp_name */
2578 sizeof(dictiterobject
), /* tp_basicsize */
2579 0, /* tp_itemsize */
2581 (destructor
)dictiter_dealloc
, /* tp_dealloc */
2587 0, /* tp_as_number */
2588 0, /* tp_as_sequence */
2589 0, /* tp_as_mapping */
2593 PyObject_GenericGetAttr
, /* tp_getattro */
2594 0, /* tp_setattro */
2595 0, /* tp_as_buffer */
2596 Py_TPFLAGS_DEFAULT
, /* tp_flags */
2598 0, /* tp_traverse */
2600 0, /* tp_richcompare */
2601 0, /* tp_weaklistoffset */
2602 PyObject_SelfIter
, /* tp_iter */
2603 (iternextfunc
)dictiter_iternextitem
, /* tp_iternext */
2604 dictiter_methods
, /* tp_methods */