Applying patches backported from 3.1, by Gregor Lingl.
[python.git] / Objects / dictobject.c
blobaf37a93b4a0b5c7e6e218287e485fef19628b381
2 /* Dictionary object implementation using a hash table */
4 /* The distribution includes a separate file, Objects/dictnotes.txt,
5 describing explorations into dictionary design and optimization.
6 It covers typical dictionary use patterns, the parameters for
7 tuning dictionaries, and several ideas for possible optimizations.
8 */
10 #include "Python.h"
13 /* Set a key error with the specified argument, wrapping it in a
14 * tuple automatically so that tuple keys are not unpacked as the
15 * exception arguments. */
16 static void
17 set_key_error(PyObject *arg)
19 PyObject *tup;
20 tup = PyTuple_Pack(1, arg);
21 if (!tup)
22 return; /* caller will expect error to be set anyway */
23 PyErr_SetObject(PyExc_KeyError, tup);
24 Py_DECREF(tup);
27 /* Define this out if you don't want conversion statistics on exit. */
28 #undef SHOW_CONVERSION_COUNTS
30 /* See large comment block below. This must be >= 1. */
31 #define PERTURB_SHIFT 5
34 Major subtleties ahead: Most hash schemes depend on having a "good" hash
35 function, in the sense of simulating randomness. Python doesn't: its most
36 important hash functions (for strings and ints) are very regular in common
37 cases:
39 >>> map(hash, (0, 1, 2, 3))
40 [0, 1, 2, 3]
41 >>> map(hash, ("namea", "nameb", "namec", "named"))
42 [-1658398457, -1658398460, -1658398459, -1658398462]
43 >>>
45 This isn't necessarily bad! To the contrary, in a table of size 2**i, taking
46 the low-order i bits as the initial table index is extremely fast, and there
47 are no collisions at all for dicts indexed by a contiguous range of ints.
48 The same is approximately true when keys are "consecutive" strings. So this
49 gives better-than-random behavior in common cases, and that's very desirable.
51 OTOH, when collisions occur, the tendency to fill contiguous slices of the
52 hash table makes a good collision resolution strategy crucial. Taking only
53 the last i bits of the hash code is also vulnerable: for example, consider
54 [i << 16 for i in range(20000)] as a set of keys. Since ints are their own
55 hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
56 hash code are all 0: they *all* map to the same table index.
58 But catering to unusual cases should not slow the usual ones, so we just take
59 the last i bits anyway. It's up to collision resolution to do the rest. If
60 we *usually* find the key we're looking for on the first try (and, it turns
61 out, we usually do -- the table load factor is kept under 2/3, so the odds
62 are solidly in our favor), then it makes best sense to keep the initial index
63 computation dirt cheap.
65 The first half of collision resolution is to visit table indices via this
66 recurrence:
68 j = ((5*j) + 1) mod 2**i
70 For any initial j in range(2**i), repeating that 2**i times generates each
71 int in range(2**i) exactly once (see any text on random-number generation for
72 proof). By itself, this doesn't help much: like linear probing (setting
73 j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
74 order. This would be bad, except that's not the only thing we do, and it's
75 actually *good* in the common cases where hash keys are consecutive. In an
76 example that's really too small to make this entirely clear, for a table of
77 size 2**3 the order of indices is:
79 0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]
81 If two things come in at index 5, the first place we look after is index 2,
82 not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
83 Linear probing is deadly in this case because there the fixed probe order
84 is the *same* as the order consecutive keys are likely to arrive. But it's
85 extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
86 and certain that consecutive hash codes do not.
88 The other half of the strategy is to get the other bits of the hash code
89 into play. This is done by initializing a (unsigned) vrbl "perturb" to the
90 full hash code, and changing the recurrence to:
92 j = (5*j) + 1 + perturb;
93 perturb >>= PERTURB_SHIFT;
94 use j % 2**i as the next table index;
96 Now the probe sequence depends (eventually) on every bit in the hash code,
97 and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
98 because it quickly magnifies small differences in the bits that didn't affect
99 the initial index. Note that because perturb is unsigned, if the recurrence
100 is executed often enough perturb eventually becomes and remains 0. At that
101 point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
102 that's certain to find an empty slot eventually (since it generates every int
103 in range(2**i), and we make sure there's always at least one empty slot).
105 Selecting a good value for PERTURB_SHIFT is a balancing act. You want it
106 small so that the high bits of the hash code continue to affect the probe
107 sequence across iterations; but you want it large so that in really bad cases
108 the high-order hash bits have an effect on early iterations. 5 was "the
109 best" in minimizing total collisions across experiments Tim Peters ran (on
110 both normal and pathological cases), but 4 and 6 weren't significantly worse.
112 Historical: Reimer Behrends contributed the idea of using a polynomial-based
113 approach, using repeated multiplication by x in GF(2**n) where an irreducible
114 polynomial for each table size was chosen such that x was a primitive root.
115 Christian Tismer later extended that to use division by x instead, as an
116 efficient way to get the high bits of the hash code into play. This scheme
117 also gave excellent collision statistics, but was more expensive: two
118 if-tests were required inside the loop; computing "the next" index took about
119 the same number of operations but without as much potential parallelism
120 (e.g., computing 5*j can go on at the same time as computing 1+perturb in the
121 above, and then shifting perturb can be done while the table index is being
122 masked); and the PyDictObject struct required a member to hold the table's
123 polynomial. In Tim's experiments the current scheme ran faster, produced
124 equally good collision statistics, needed less code & used less memory.
126 Theoretical Python 2.5 headache: hash codes are only C "long", but
127 sizeof(Py_ssize_t) > sizeof(long) may be possible. In that case, and if a
128 dict is genuinely huge, then only the slots directly reachable via indexing
129 by a C long can be the first slot in a probe sequence. The probe sequence
130 will still eventually reach every slot in the table, but the collision rate
131 on initial probes may be much higher than this scheme was designed for.
132 Getting a hash code as fat as Py_ssize_t is the only real cure. But in
133 practice, this probably won't make a lick of difference for many years (at
134 which point everyone will have terabytes of RAM on 64-bit boxes).
137 /* Object used as dummy key to fill deleted entries */
138 static PyObject *dummy = NULL; /* Initialized by first call to newPyDictObject() */
140 #ifdef Py_REF_DEBUG
141 PyObject *
142 _PyDict_Dummy(void)
144 return dummy;
146 #endif
148 /* forward declarations */
149 static PyDictEntry *
150 lookdict_string(PyDictObject *mp, PyObject *key, long hash);
152 #ifdef SHOW_CONVERSION_COUNTS
153 static long created = 0L;
154 static long converted = 0L;
156 static void
157 show_counts(void)
159 fprintf(stderr, "created %ld string dicts\n", created);
160 fprintf(stderr, "converted %ld to normal dicts\n", converted);
161 fprintf(stderr, "%.2f%% conversion rate\n", (100.0*converted)/created);
163 #endif
165 /* Debug statistic to compare allocations with reuse through the free list */
166 #undef SHOW_ALLOC_COUNT
167 #ifdef SHOW_ALLOC_COUNT
168 static size_t count_alloc = 0;
169 static size_t count_reuse = 0;
171 static void
172 show_alloc(void)
174 fprintf(stderr, "Dict allocations: %" PY_FORMAT_SIZE_T "d\n",
175 count_alloc);
176 fprintf(stderr, "Dict reuse through freelist: %" PY_FORMAT_SIZE_T
177 "d\n", count_reuse);
178 fprintf(stderr, "%.2f%% reuse rate\n\n",
179 (100.0*count_reuse/(count_alloc+count_reuse)));
181 #endif
183 /* Debug statistic to count GC tracking of dicts */
184 #ifdef SHOW_TRACK_COUNT
185 static Py_ssize_t count_untracked = 0;
186 static Py_ssize_t count_tracked = 0;
188 static void
189 show_track(void)
191 fprintf(stderr, "Dicts created: %" PY_FORMAT_SIZE_T "d\n",
192 count_tracked + count_untracked);
193 fprintf(stderr, "Dicts tracked by the GC: %" PY_FORMAT_SIZE_T
194 "d\n", count_tracked);
195 fprintf(stderr, "%.2f%% dict tracking rate\n\n",
196 (100.0*count_tracked/(count_untracked+count_tracked)));
198 #endif
201 /* Initialization macros.
202 There are two ways to create a dict: PyDict_New() is the main C API
203 function, and the tp_new slot maps to dict_new(). In the latter case we
204 can save a little time over what PyDict_New does because it's guaranteed
205 that the PyDictObject struct is already zeroed out.
206 Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
207 an excellent reason not to).
210 #define INIT_NONZERO_DICT_SLOTS(mp) do { \
211 (mp)->ma_table = (mp)->ma_smalltable; \
212 (mp)->ma_mask = PyDict_MINSIZE - 1; \
213 } while(0)
215 #define EMPTY_TO_MINSIZE(mp) do { \
216 memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable)); \
217 (mp)->ma_used = (mp)->ma_fill = 0; \
218 INIT_NONZERO_DICT_SLOTS(mp); \
219 } while(0)
221 /* Dictionary reuse scheme to save calls to malloc, free, and memset */
222 #ifndef PyDict_MAXFREELIST
223 #define PyDict_MAXFREELIST 80
224 #endif
225 static PyDictObject *free_list[PyDict_MAXFREELIST];
226 static int numfree = 0;
228 void
229 PyDict_Fini(void)
231 PyDictObject *op;
233 while (numfree) {
234 op = free_list[--numfree];
235 assert(PyDict_CheckExact(op));
236 PyObject_GC_Del(op);
240 PyObject *
241 PyDict_New(void)
243 register PyDictObject *mp;
244 if (dummy == NULL) { /* Auto-initialize dummy */
245 dummy = PyString_FromString("<dummy key>");
246 if (dummy == NULL)
247 return NULL;
248 #ifdef SHOW_CONVERSION_COUNTS
249 Py_AtExit(show_counts);
250 #endif
251 #ifdef SHOW_ALLOC_COUNT
252 Py_AtExit(show_alloc);
253 #endif
254 #ifdef SHOW_TRACK_COUNT
255 Py_AtExit(show_track);
256 #endif
258 if (numfree) {
259 mp = free_list[--numfree];
260 assert (mp != NULL);
261 assert (Py_TYPE(mp) == &PyDict_Type);
262 _Py_NewReference((PyObject *)mp);
263 if (mp->ma_fill) {
264 EMPTY_TO_MINSIZE(mp);
265 } else {
266 /* At least set ma_table and ma_mask; these are wrong
267 if an empty but presized dict is added to freelist */
268 INIT_NONZERO_DICT_SLOTS(mp);
270 assert (mp->ma_used == 0);
271 assert (mp->ma_table == mp->ma_smalltable);
272 assert (mp->ma_mask == PyDict_MINSIZE - 1);
273 #ifdef SHOW_ALLOC_COUNT
274 count_reuse++;
275 #endif
276 } else {
277 mp = PyObject_GC_New(PyDictObject, &PyDict_Type);
278 if (mp == NULL)
279 return NULL;
280 EMPTY_TO_MINSIZE(mp);
281 #ifdef SHOW_ALLOC_COUNT
282 count_alloc++;
283 #endif
285 mp->ma_lookup = lookdict_string;
286 #ifdef SHOW_TRACK_COUNT
287 count_untracked++;
288 #endif
289 #ifdef SHOW_CONVERSION_COUNTS
290 ++created;
291 #endif
292 return (PyObject *)mp;
296 The basic lookup function used by all operations.
297 This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
298 Open addressing is preferred over chaining since the link overhead for
299 chaining would be substantial (100% with typical malloc overhead).
301 The initial probe index is computed as hash mod the table size. Subsequent
302 probe indices are computed as explained earlier.
304 All arithmetic on hash should ignore overflow.
306 (The details in this version are due to Tim Peters, building on many past
307 contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
308 Christian Tismer).
310 lookdict() is general-purpose, and may return NULL if (and only if) a
311 comparison raises an exception (this was new in Python 2.5).
312 lookdict_string() below is specialized to string keys, comparison of which can
313 never raise an exception; that function can never return NULL. For both, when
314 the key isn't found a PyDictEntry* is returned for which the me_value field is
315 NULL; this is the slot in the dict at which the key would have been found, and
316 the caller can (if it wishes) add the <key, value> pair to the returned
317 PyDictEntry*.
319 static PyDictEntry *
320 lookdict(PyDictObject *mp, PyObject *key, register long hash)
322 register size_t i;
323 register size_t perturb;
324 register PyDictEntry *freeslot;
325 register size_t mask = (size_t)mp->ma_mask;
326 PyDictEntry *ep0 = mp->ma_table;
327 register PyDictEntry *ep;
328 register int cmp;
329 PyObject *startkey;
331 i = (size_t)hash & mask;
332 ep = &ep0[i];
333 if (ep->me_key == NULL || ep->me_key == key)
334 return ep;
336 if (ep->me_key == dummy)
337 freeslot = ep;
338 else {
339 if (ep->me_hash == hash) {
340 startkey = ep->me_key;
341 Py_INCREF(startkey);
342 cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
343 Py_DECREF(startkey);
344 if (cmp < 0)
345 return NULL;
346 if (ep0 == mp->ma_table && ep->me_key == startkey) {
347 if (cmp > 0)
348 return ep;
350 else {
351 /* The compare did major nasty stuff to the
352 * dict: start over.
353 * XXX A clever adversary could prevent this
354 * XXX from terminating.
356 return lookdict(mp, key, hash);
359 freeslot = NULL;
362 /* In the loop, me_key == dummy is by far (factor of 100s) the
363 least likely outcome, so test for that last. */
364 for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
365 i = (i << 2) + i + perturb + 1;
366 ep = &ep0[i & mask];
367 if (ep->me_key == NULL)
368 return freeslot == NULL ? ep : freeslot;
369 if (ep->me_key == key)
370 return ep;
371 if (ep->me_hash == hash && ep->me_key != dummy) {
372 startkey = ep->me_key;
373 Py_INCREF(startkey);
374 cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
375 Py_DECREF(startkey);
376 if (cmp < 0)
377 return NULL;
378 if (ep0 == mp->ma_table && ep->me_key == startkey) {
379 if (cmp > 0)
380 return ep;
382 else {
383 /* The compare did major nasty stuff to the
384 * dict: start over.
385 * XXX A clever adversary could prevent this
386 * XXX from terminating.
388 return lookdict(mp, key, hash);
391 else if (ep->me_key == dummy && freeslot == NULL)
392 freeslot = ep;
394 assert(0); /* NOT REACHED */
395 return 0;
399 * Hacked up version of lookdict which can assume keys are always strings;
400 * this assumption allows testing for errors during PyObject_RichCompareBool()
401 * to be dropped; string-string comparisons never raise exceptions. This also
402 * means we don't need to go through PyObject_RichCompareBool(); we can always
403 * use _PyString_Eq() directly.
405 * This is valuable because dicts with only string keys are very common.
407 static PyDictEntry *
408 lookdict_string(PyDictObject *mp, PyObject *key, register long hash)
410 register size_t i;
411 register size_t perturb;
412 register PyDictEntry *freeslot;
413 register size_t mask = (size_t)mp->ma_mask;
414 PyDictEntry *ep0 = mp->ma_table;
415 register PyDictEntry *ep;
417 /* Make sure this function doesn't have to handle non-string keys,
418 including subclasses of str; e.g., one reason to subclass
419 strings is to override __eq__, and for speed we don't cater to
420 that here. */
421 if (!PyString_CheckExact(key)) {
422 #ifdef SHOW_CONVERSION_COUNTS
423 ++converted;
424 #endif
425 mp->ma_lookup = lookdict;
426 return lookdict(mp, key, hash);
428 i = hash & mask;
429 ep = &ep0[i];
430 if (ep->me_key == NULL || ep->me_key == key)
431 return ep;
432 if (ep->me_key == dummy)
433 freeslot = ep;
434 else {
435 if (ep->me_hash == hash && _PyString_Eq(ep->me_key, key))
436 return ep;
437 freeslot = NULL;
440 /* In the loop, me_key == dummy is by far (factor of 100s) the
441 least likely outcome, so test for that last. */
442 for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
443 i = (i << 2) + i + perturb + 1;
444 ep = &ep0[i & mask];
445 if (ep->me_key == NULL)
446 return freeslot == NULL ? ep : freeslot;
447 if (ep->me_key == key
448 || (ep->me_hash == hash
449 && ep->me_key != dummy
450 && _PyString_Eq(ep->me_key, key)))
451 return ep;
452 if (ep->me_key == dummy && freeslot == NULL)
453 freeslot = ep;
455 assert(0); /* NOT REACHED */
456 return 0;
459 #ifdef SHOW_TRACK_COUNT
460 #define INCREASE_TRACK_COUNT \
461 (count_tracked++, count_untracked--);
462 #define DECREASE_TRACK_COUNT \
463 (count_tracked--, count_untracked++);
464 #else
465 #define INCREASE_TRACK_COUNT
466 #define DECREASE_TRACK_COUNT
467 #endif
469 #define MAINTAIN_TRACKING(mp, key, value) \
470 do { \
471 if (!_PyObject_GC_IS_TRACKED(mp)) { \
472 if (_PyObject_GC_MAY_BE_TRACKED(key) || \
473 _PyObject_GC_MAY_BE_TRACKED(value)) { \
474 _PyObject_GC_TRACK(mp); \
475 INCREASE_TRACK_COUNT \
478 } while(0)
480 void
481 _PyDict_MaybeUntrack(PyObject *op)
483 PyDictObject *mp;
484 PyObject *value;
485 Py_ssize_t mask, i;
486 PyDictEntry *ep;
488 if (!PyDict_CheckExact(op) || !_PyObject_GC_IS_TRACKED(op))
489 return;
491 mp = (PyDictObject *) op;
492 ep = mp->ma_table;
493 mask = mp->ma_mask;
494 for (i = 0; i <= mask; i++) {
495 if ((value = ep[i].me_value) == NULL)
496 continue;
497 if (_PyObject_GC_MAY_BE_TRACKED(value) ||
498 _PyObject_GC_MAY_BE_TRACKED(ep[i].me_key))
499 return;
501 DECREASE_TRACK_COUNT
502 _PyObject_GC_UNTRACK(op);
507 Internal routine to insert a new item into the table.
508 Used both by the internal resize routine and by the public insert routine.
509 Eats a reference to key and one to value.
510 Returns -1 if an error occurred, or 0 on success.
512 static int
513 insertdict(register PyDictObject *mp, PyObject *key, long hash, PyObject *value)
515 PyObject *old_value;
516 register PyDictEntry *ep;
517 typedef PyDictEntry *(*lookupfunc)(PyDictObject *, PyObject *, long);
519 assert(mp->ma_lookup != NULL);
520 ep = mp->ma_lookup(mp, key, hash);
521 if (ep == NULL) {
522 Py_DECREF(key);
523 Py_DECREF(value);
524 return -1;
526 MAINTAIN_TRACKING(mp, key, value);
527 if (ep->me_value != NULL) {
528 old_value = ep->me_value;
529 ep->me_value = value;
530 Py_DECREF(old_value); /* which **CAN** re-enter */
531 Py_DECREF(key);
533 else {
534 if (ep->me_key == NULL)
535 mp->ma_fill++;
536 else {
537 assert(ep->me_key == dummy);
538 Py_DECREF(dummy);
540 ep->me_key = key;
541 ep->me_hash = (Py_ssize_t)hash;
542 ep->me_value = value;
543 mp->ma_used++;
545 return 0;
549 Internal routine used by dictresize() to insert an item which is
550 known to be absent from the dict. This routine also assumes that
551 the dict contains no deleted entries. Besides the performance benefit,
552 using insertdict() in dictresize() is dangerous (SF bug #1456209).
553 Note that no refcounts are changed by this routine; if needed, the caller
554 is responsible for incref'ing `key` and `value`.
556 static void
557 insertdict_clean(register PyDictObject *mp, PyObject *key, long hash,
558 PyObject *value)
560 register size_t i;
561 register size_t perturb;
562 register size_t mask = (size_t)mp->ma_mask;
563 PyDictEntry *ep0 = mp->ma_table;
564 register PyDictEntry *ep;
566 MAINTAIN_TRACKING(mp, key, value);
567 i = hash & mask;
568 ep = &ep0[i];
569 for (perturb = hash; ep->me_key != NULL; perturb >>= PERTURB_SHIFT) {
570 i = (i << 2) + i + perturb + 1;
571 ep = &ep0[i & mask];
573 assert(ep->me_value == NULL);
574 mp->ma_fill++;
575 ep->me_key = key;
576 ep->me_hash = (Py_ssize_t)hash;
577 ep->me_value = value;
578 mp->ma_used++;
582 Restructure the table by allocating a new table and reinserting all
583 items again. When entries have been deleted, the new table may
584 actually be smaller than the old one.
586 static int
587 dictresize(PyDictObject *mp, Py_ssize_t minused)
589 Py_ssize_t newsize;
590 PyDictEntry *oldtable, *newtable, *ep;
591 Py_ssize_t i;
592 int is_oldtable_malloced;
593 PyDictEntry small_copy[PyDict_MINSIZE];
595 assert(minused >= 0);
597 /* Find the smallest table size > minused. */
598 for (newsize = PyDict_MINSIZE;
599 newsize <= minused && newsize > 0;
600 newsize <<= 1)
602 if (newsize <= 0) {
603 PyErr_NoMemory();
604 return -1;
607 /* Get space for a new table. */
608 oldtable = mp->ma_table;
609 assert(oldtable != NULL);
610 is_oldtable_malloced = oldtable != mp->ma_smalltable;
612 if (newsize == PyDict_MINSIZE) {
613 /* A large table is shrinking, or we can't get any smaller. */
614 newtable = mp->ma_smalltable;
615 if (newtable == oldtable) {
616 if (mp->ma_fill == mp->ma_used) {
617 /* No dummies, so no point doing anything. */
618 return 0;
620 /* We're not going to resize it, but rebuild the
621 table anyway to purge old dummy entries.
622 Subtle: This is *necessary* if fill==size,
623 as lookdict needs at least one virgin slot to
624 terminate failing searches. If fill < size, it's
625 merely desirable, as dummies slow searches. */
626 assert(mp->ma_fill > mp->ma_used);
627 memcpy(small_copy, oldtable, sizeof(small_copy));
628 oldtable = small_copy;
631 else {
632 newtable = PyMem_NEW(PyDictEntry, newsize);
633 if (newtable == NULL) {
634 PyErr_NoMemory();
635 return -1;
639 /* Make the dict empty, using the new table. */
640 assert(newtable != oldtable);
641 mp->ma_table = newtable;
642 mp->ma_mask = newsize - 1;
643 memset(newtable, 0, sizeof(PyDictEntry) * newsize);
644 mp->ma_used = 0;
645 i = mp->ma_fill;
646 mp->ma_fill = 0;
648 /* Copy the data over; this is refcount-neutral for active entries;
649 dummy entries aren't copied over, of course */
650 for (ep = oldtable; i > 0; ep++) {
651 if (ep->me_value != NULL) { /* active entry */
652 --i;
653 insertdict_clean(mp, ep->me_key, (long)ep->me_hash,
654 ep->me_value);
656 else if (ep->me_key != NULL) { /* dummy entry */
657 --i;
658 assert(ep->me_key == dummy);
659 Py_DECREF(ep->me_key);
661 /* else key == value == NULL: nothing to do */
664 if (is_oldtable_malloced)
665 PyMem_DEL(oldtable);
666 return 0;
669 /* Create a new dictionary pre-sized to hold an estimated number of elements.
670 Underestimates are okay because the dictionary will resize as necessary.
671 Overestimates just mean the dictionary will be more sparse than usual.
674 PyObject *
675 _PyDict_NewPresized(Py_ssize_t minused)
677 PyObject *op = PyDict_New();
679 if (minused>5 && op != NULL && dictresize((PyDictObject *)op, minused) == -1) {
680 Py_DECREF(op);
681 return NULL;
683 return op;
686 /* Note that, for historical reasons, PyDict_GetItem() suppresses all errors
687 * that may occur (originally dicts supported only string keys, and exceptions
688 * weren't possible). So, while the original intent was that a NULL return
689 * meant the key wasn't present, in reality it can mean that, or that an error
690 * (suppressed) occurred while computing the key's hash, or that some error
691 * (suppressed) occurred when comparing keys in the dict's internal probe
692 * sequence. A nasty example of the latter is when a Python-coded comparison
693 * function hits a stack-depth error, which can cause this to return NULL
694 * even if the key is present.
696 PyObject *
697 PyDict_GetItem(PyObject *op, PyObject *key)
699 long hash;
700 PyDictObject *mp = (PyDictObject *)op;
701 PyDictEntry *ep;
702 PyThreadState *tstate;
703 if (!PyDict_Check(op))
704 return NULL;
705 if (!PyString_CheckExact(key) ||
706 (hash = ((PyStringObject *) key)->ob_shash) == -1)
708 hash = PyObject_Hash(key);
709 if (hash == -1) {
710 PyErr_Clear();
711 return NULL;
715 /* We can arrive here with a NULL tstate during initialization: try
716 running "python -Wi" for an example related to string interning.
717 Let's just hope that no exception occurs then... This must be
718 _PyThreadState_Current and not PyThreadState_GET() because in debug
719 mode, the latter complains if tstate is NULL. */
720 tstate = _PyThreadState_Current;
721 if (tstate != NULL && tstate->curexc_type != NULL) {
722 /* preserve the existing exception */
723 PyObject *err_type, *err_value, *err_tb;
724 PyErr_Fetch(&err_type, &err_value, &err_tb);
725 ep = (mp->ma_lookup)(mp, key, hash);
726 /* ignore errors */
727 PyErr_Restore(err_type, err_value, err_tb);
728 if (ep == NULL)
729 return NULL;
731 else {
732 ep = (mp->ma_lookup)(mp, key, hash);
733 if (ep == NULL) {
734 PyErr_Clear();
735 return NULL;
738 return ep->me_value;
741 /* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
742 * dictionary if it's merely replacing the value for an existing key.
743 * This means that it's safe to loop over a dictionary with PyDict_Next()
744 * and occasionally replace a value -- but you can't insert new keys or
745 * remove them.
748 PyDict_SetItem(register PyObject *op, PyObject *key, PyObject *value)
750 register PyDictObject *mp;
751 register long hash;
752 register Py_ssize_t n_used;
754 if (!PyDict_Check(op)) {
755 PyErr_BadInternalCall();
756 return -1;
758 assert(key);
759 assert(value);
760 mp = (PyDictObject *)op;
761 if (PyString_CheckExact(key)) {
762 hash = ((PyStringObject *)key)->ob_shash;
763 if (hash == -1)
764 hash = PyObject_Hash(key);
766 else {
767 hash = PyObject_Hash(key);
768 if (hash == -1)
769 return -1;
771 assert(mp->ma_fill <= mp->ma_mask); /* at least one empty slot */
772 n_used = mp->ma_used;
773 Py_INCREF(value);
774 Py_INCREF(key);
775 if (insertdict(mp, key, hash, value) != 0)
776 return -1;
777 /* If we added a key, we can safely resize. Otherwise just return!
778 * If fill >= 2/3 size, adjust size. Normally, this doubles or
779 * quaduples the size, but it's also possible for the dict to shrink
780 * (if ma_fill is much larger than ma_used, meaning a lot of dict
781 * keys have been * deleted).
783 * Quadrupling the size improves average dictionary sparseness
784 * (reducing collisions) at the cost of some memory and iteration
785 * speed (which loops over every possible entry). It also halves
786 * the number of expensive resize operations in a growing dictionary.
788 * Very large dictionaries (over 50K items) use doubling instead.
789 * This may help applications with severe memory constraints.
791 if (!(mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2))
792 return 0;
793 return dictresize(mp, (mp->ma_used > 50000 ? 2 : 4) * mp->ma_used);
797 PyDict_DelItem(PyObject *op, PyObject *key)
799 register PyDictObject *mp;
800 register long hash;
801 register PyDictEntry *ep;
802 PyObject *old_value, *old_key;
804 if (!PyDict_Check(op)) {
805 PyErr_BadInternalCall();
806 return -1;
808 assert(key);
809 if (!PyString_CheckExact(key) ||
810 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
811 hash = PyObject_Hash(key);
812 if (hash == -1)
813 return -1;
815 mp = (PyDictObject *)op;
816 ep = (mp->ma_lookup)(mp, key, hash);
817 if (ep == NULL)
818 return -1;
819 if (ep->me_value == NULL) {
820 set_key_error(key);
821 return -1;
823 old_key = ep->me_key;
824 Py_INCREF(dummy);
825 ep->me_key = dummy;
826 old_value = ep->me_value;
827 ep->me_value = NULL;
828 mp->ma_used--;
829 Py_DECREF(old_value);
830 Py_DECREF(old_key);
831 return 0;
834 void
835 PyDict_Clear(PyObject *op)
837 PyDictObject *mp;
838 PyDictEntry *ep, *table;
839 int table_is_malloced;
840 Py_ssize_t fill;
841 PyDictEntry small_copy[PyDict_MINSIZE];
842 #ifdef Py_DEBUG
843 Py_ssize_t i, n;
844 #endif
846 if (!PyDict_Check(op))
847 return;
848 mp = (PyDictObject *)op;
849 #ifdef Py_DEBUG
850 n = mp->ma_mask + 1;
851 i = 0;
852 #endif
854 table = mp->ma_table;
855 assert(table != NULL);
856 table_is_malloced = table != mp->ma_smalltable;
858 /* This is delicate. During the process of clearing the dict,
859 * decrefs can cause the dict to mutate. To avoid fatal confusion
860 * (voice of experience), we have to make the dict empty before
861 * clearing the slots, and never refer to anything via mp->xxx while
862 * clearing.
864 fill = mp->ma_fill;
865 if (table_is_malloced)
866 EMPTY_TO_MINSIZE(mp);
868 else if (fill > 0) {
869 /* It's a small table with something that needs to be cleared.
870 * Afraid the only safe way is to copy the dict entries into
871 * another small table first.
873 memcpy(small_copy, table, sizeof(small_copy));
874 table = small_copy;
875 EMPTY_TO_MINSIZE(mp);
877 /* else it's a small table that's already empty */
879 /* Now we can finally clear things. If C had refcounts, we could
880 * assert that the refcount on table is 1 now, i.e. that this function
881 * has unique access to it, so decref side-effects can't alter it.
883 for (ep = table; fill > 0; ++ep) {
884 #ifdef Py_DEBUG
885 assert(i < n);
886 ++i;
887 #endif
888 if (ep->me_key) {
889 --fill;
890 Py_DECREF(ep->me_key);
891 Py_XDECREF(ep->me_value);
893 #ifdef Py_DEBUG
894 else
895 assert(ep->me_value == NULL);
896 #endif
899 if (table_is_malloced)
900 PyMem_DEL(table);
904 * Iterate over a dict. Use like so:
906 * Py_ssize_t i;
907 * PyObject *key, *value;
908 * i = 0; # important! i should not otherwise be changed by you
909 * while (PyDict_Next(yourdict, &i, &key, &value)) {
910 * Refer to borrowed references in key and value.
913 * CAUTION: In general, it isn't safe to use PyDict_Next in a loop that
914 * mutates the dict. One exception: it is safe if the loop merely changes
915 * the values associated with the keys (but doesn't insert new keys or
916 * delete keys), via PyDict_SetItem().
919 PyDict_Next(PyObject *op, Py_ssize_t *ppos, PyObject **pkey, PyObject **pvalue)
921 register Py_ssize_t i;
922 register Py_ssize_t mask;
923 register PyDictEntry *ep;
925 if (!PyDict_Check(op))
926 return 0;
927 i = *ppos;
928 if (i < 0)
929 return 0;
930 ep = ((PyDictObject *)op)->ma_table;
931 mask = ((PyDictObject *)op)->ma_mask;
932 while (i <= mask && ep[i].me_value == NULL)
933 i++;
934 *ppos = i+1;
935 if (i > mask)
936 return 0;
937 if (pkey)
938 *pkey = ep[i].me_key;
939 if (pvalue)
940 *pvalue = ep[i].me_value;
941 return 1;
944 /* Internal version of PyDict_Next that returns a hash value in addition to the key and value.*/
946 _PyDict_Next(PyObject *op, Py_ssize_t *ppos, PyObject **pkey, PyObject **pvalue, long *phash)
948 register Py_ssize_t i;
949 register Py_ssize_t mask;
950 register PyDictEntry *ep;
952 if (!PyDict_Check(op))
953 return 0;
954 i = *ppos;
955 if (i < 0)
956 return 0;
957 ep = ((PyDictObject *)op)->ma_table;
958 mask = ((PyDictObject *)op)->ma_mask;
959 while (i <= mask && ep[i].me_value == NULL)
960 i++;
961 *ppos = i+1;
962 if (i > mask)
963 return 0;
964 *phash = (long)(ep[i].me_hash);
965 if (pkey)
966 *pkey = ep[i].me_key;
967 if (pvalue)
968 *pvalue = ep[i].me_value;
969 return 1;
972 /* Methods */
974 static void
975 dict_dealloc(register PyDictObject *mp)
977 register PyDictEntry *ep;
978 Py_ssize_t fill = mp->ma_fill;
979 PyObject_GC_UnTrack(mp);
980 Py_TRASHCAN_SAFE_BEGIN(mp)
981 for (ep = mp->ma_table; fill > 0; ep++) {
982 if (ep->me_key) {
983 --fill;
984 Py_DECREF(ep->me_key);
985 Py_XDECREF(ep->me_value);
988 if (mp->ma_table != mp->ma_smalltable)
989 PyMem_DEL(mp->ma_table);
990 if (numfree < PyDict_MAXFREELIST && Py_TYPE(mp) == &PyDict_Type)
991 free_list[numfree++] = mp;
992 else
993 Py_TYPE(mp)->tp_free((PyObject *)mp);
994 Py_TRASHCAN_SAFE_END(mp)
997 static int
998 dict_print(register PyDictObject *mp, register FILE *fp, register int flags)
1000 register Py_ssize_t i;
1001 register Py_ssize_t any;
1002 int status;
1004 status = Py_ReprEnter((PyObject*)mp);
1005 if (status != 0) {
1006 if (status < 0)
1007 return status;
1008 Py_BEGIN_ALLOW_THREADS
1009 fprintf(fp, "{...}");
1010 Py_END_ALLOW_THREADS
1011 return 0;
1014 Py_BEGIN_ALLOW_THREADS
1015 fprintf(fp, "{");
1016 Py_END_ALLOW_THREADS
1017 any = 0;
1018 for (i = 0; i <= mp->ma_mask; i++) {
1019 PyDictEntry *ep = mp->ma_table + i;
1020 PyObject *pvalue = ep->me_value;
1021 if (pvalue != NULL) {
1022 /* Prevent PyObject_Repr from deleting value during
1023 key format */
1024 Py_INCREF(pvalue);
1025 if (any++ > 0) {
1026 Py_BEGIN_ALLOW_THREADS
1027 fprintf(fp, ", ");
1028 Py_END_ALLOW_THREADS
1030 if (PyObject_Print((PyObject *)ep->me_key, fp, 0)!=0) {
1031 Py_DECREF(pvalue);
1032 Py_ReprLeave((PyObject*)mp);
1033 return -1;
1035 Py_BEGIN_ALLOW_THREADS
1036 fprintf(fp, ": ");
1037 Py_END_ALLOW_THREADS
1038 if (PyObject_Print(pvalue, fp, 0) != 0) {
1039 Py_DECREF(pvalue);
1040 Py_ReprLeave((PyObject*)mp);
1041 return -1;
1043 Py_DECREF(pvalue);
1046 Py_BEGIN_ALLOW_THREADS
1047 fprintf(fp, "}");
1048 Py_END_ALLOW_THREADS
1049 Py_ReprLeave((PyObject*)mp);
1050 return 0;
1053 static PyObject *
1054 dict_repr(PyDictObject *mp)
1056 Py_ssize_t i;
1057 PyObject *s, *temp, *colon = NULL;
1058 PyObject *pieces = NULL, *result = NULL;
1059 PyObject *key, *value;
1061 i = Py_ReprEnter((PyObject *)mp);
1062 if (i != 0) {
1063 return i > 0 ? PyString_FromString("{...}") : NULL;
1066 if (mp->ma_used == 0) {
1067 result = PyString_FromString("{}");
1068 goto Done;
1071 pieces = PyList_New(0);
1072 if (pieces == NULL)
1073 goto Done;
1075 colon = PyString_FromString(": ");
1076 if (colon == NULL)
1077 goto Done;
1079 /* Do repr() on each key+value pair, and insert ": " between them.
1080 Note that repr may mutate the dict. */
1081 i = 0;
1082 while (PyDict_Next((PyObject *)mp, &i, &key, &value)) {
1083 int status;
1084 /* Prevent repr from deleting value during key format. */
1085 Py_INCREF(value);
1086 s = PyObject_Repr(key);
1087 PyString_Concat(&s, colon);
1088 PyString_ConcatAndDel(&s, PyObject_Repr(value));
1089 Py_DECREF(value);
1090 if (s == NULL)
1091 goto Done;
1092 status = PyList_Append(pieces, s);
1093 Py_DECREF(s); /* append created a new ref */
1094 if (status < 0)
1095 goto Done;
1098 /* Add "{}" decorations to the first and last items. */
1099 assert(PyList_GET_SIZE(pieces) > 0);
1100 s = PyString_FromString("{");
1101 if (s == NULL)
1102 goto Done;
1103 temp = PyList_GET_ITEM(pieces, 0);
1104 PyString_ConcatAndDel(&s, temp);
1105 PyList_SET_ITEM(pieces, 0, s);
1106 if (s == NULL)
1107 goto Done;
1109 s = PyString_FromString("}");
1110 if (s == NULL)
1111 goto Done;
1112 temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1);
1113 PyString_ConcatAndDel(&temp, s);
1114 PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp);
1115 if (temp == NULL)
1116 goto Done;
1118 /* Paste them all together with ", " between. */
1119 s = PyString_FromString(", ");
1120 if (s == NULL)
1121 goto Done;
1122 result = _PyString_Join(s, pieces);
1123 Py_DECREF(s);
1125 Done:
1126 Py_XDECREF(pieces);
1127 Py_XDECREF(colon);
1128 Py_ReprLeave((PyObject *)mp);
1129 return result;
1132 static Py_ssize_t
1133 dict_length(PyDictObject *mp)
1135 return mp->ma_used;
1138 static PyObject *
1139 dict_subscript(PyDictObject *mp, register PyObject *key)
1141 PyObject *v;
1142 long hash;
1143 PyDictEntry *ep;
1144 assert(mp->ma_table != NULL);
1145 if (!PyString_CheckExact(key) ||
1146 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1147 hash = PyObject_Hash(key);
1148 if (hash == -1)
1149 return NULL;
1151 ep = (mp->ma_lookup)(mp, key, hash);
1152 if (ep == NULL)
1153 return NULL;
1154 v = ep->me_value;
1155 if (v == NULL) {
1156 if (!PyDict_CheckExact(mp)) {
1157 /* Look up __missing__ method if we're a subclass. */
1158 PyObject *missing, *res;
1159 static PyObject *missing_str = NULL;
1160 missing = _PyObject_LookupSpecial((PyObject *)mp,
1161 "__missing__",
1162 &missing_str);
1163 if (missing != NULL) {
1164 res = PyObject_CallFunctionObjArgs(missing,
1165 key, NULL);
1166 Py_DECREF(missing);
1167 return res;
1169 else if (PyErr_Occurred())
1170 return NULL;
1172 set_key_error(key);
1173 return NULL;
1175 else
1176 Py_INCREF(v);
1177 return v;
1180 static int
1181 dict_ass_sub(PyDictObject *mp, PyObject *v, PyObject *w)
1183 if (w == NULL)
1184 return PyDict_DelItem((PyObject *)mp, v);
1185 else
1186 return PyDict_SetItem((PyObject *)mp, v, w);
1189 static PyMappingMethods dict_as_mapping = {
1190 (lenfunc)dict_length, /*mp_length*/
1191 (binaryfunc)dict_subscript, /*mp_subscript*/
1192 (objobjargproc)dict_ass_sub, /*mp_ass_subscript*/
1195 static PyObject *
1196 dict_keys(register PyDictObject *mp)
1198 register PyObject *v;
1199 register Py_ssize_t i, j;
1200 PyDictEntry *ep;
1201 Py_ssize_t mask, n;
1203 again:
1204 n = mp->ma_used;
1205 v = PyList_New(n);
1206 if (v == NULL)
1207 return NULL;
1208 if (n != mp->ma_used) {
1209 /* Durnit. The allocations caused the dict to resize.
1210 * Just start over, this shouldn't normally happen.
1212 Py_DECREF(v);
1213 goto again;
1215 ep = mp->ma_table;
1216 mask = mp->ma_mask;
1217 for (i = 0, j = 0; i <= mask; i++) {
1218 if (ep[i].me_value != NULL) {
1219 PyObject *key = ep[i].me_key;
1220 Py_INCREF(key);
1221 PyList_SET_ITEM(v, j, key);
1222 j++;
1225 assert(j == n);
1226 return v;
1229 static PyObject *
1230 dict_values(register PyDictObject *mp)
1232 register PyObject *v;
1233 register Py_ssize_t i, j;
1234 PyDictEntry *ep;
1235 Py_ssize_t mask, n;
1237 again:
1238 n = mp->ma_used;
1239 v = PyList_New(n);
1240 if (v == NULL)
1241 return NULL;
1242 if (n != mp->ma_used) {
1243 /* Durnit. The allocations caused the dict to resize.
1244 * Just start over, this shouldn't normally happen.
1246 Py_DECREF(v);
1247 goto again;
1249 ep = mp->ma_table;
1250 mask = mp->ma_mask;
1251 for (i = 0, j = 0; i <= mask; i++) {
1252 if (ep[i].me_value != NULL) {
1253 PyObject *value = ep[i].me_value;
1254 Py_INCREF(value);
1255 PyList_SET_ITEM(v, j, value);
1256 j++;
1259 assert(j == n);
1260 return v;
1263 static PyObject *
1264 dict_items(register PyDictObject *mp)
1266 register PyObject *v;
1267 register Py_ssize_t i, j, n;
1268 Py_ssize_t mask;
1269 PyObject *item, *key, *value;
1270 PyDictEntry *ep;
1272 /* Preallocate the list of tuples, to avoid allocations during
1273 * the loop over the items, which could trigger GC, which
1274 * could resize the dict. :-(
1276 again:
1277 n = mp->ma_used;
1278 v = PyList_New(n);
1279 if (v == NULL)
1280 return NULL;
1281 for (i = 0; i < n; i++) {
1282 item = PyTuple_New(2);
1283 if (item == NULL) {
1284 Py_DECREF(v);
1285 return NULL;
1287 PyList_SET_ITEM(v, i, item);
1289 if (n != mp->ma_used) {
1290 /* Durnit. The allocations caused the dict to resize.
1291 * Just start over, this shouldn't normally happen.
1293 Py_DECREF(v);
1294 goto again;
1296 /* Nothing we do below makes any function calls. */
1297 ep = mp->ma_table;
1298 mask = mp->ma_mask;
1299 for (i = 0, j = 0; i <= mask; i++) {
1300 if ((value=ep[i].me_value) != NULL) {
1301 key = ep[i].me_key;
1302 item = PyList_GET_ITEM(v, j);
1303 Py_INCREF(key);
1304 PyTuple_SET_ITEM(item, 0, key);
1305 Py_INCREF(value);
1306 PyTuple_SET_ITEM(item, 1, value);
1307 j++;
1310 assert(j == n);
1311 return v;
1314 static PyObject *
1315 dict_fromkeys(PyObject *cls, PyObject *args)
1317 PyObject *seq;
1318 PyObject *value = Py_None;
1319 PyObject *it; /* iter(seq) */
1320 PyObject *key;
1321 PyObject *d;
1322 int status;
1324 if (!PyArg_UnpackTuple(args, "fromkeys", 1, 2, &seq, &value))
1325 return NULL;
1327 d = PyObject_CallObject(cls, NULL);
1328 if (d == NULL)
1329 return NULL;
1331 if (PyDict_CheckExact(d) && PyDict_CheckExact(seq)) {
1332 PyDictObject *mp = (PyDictObject *)d;
1333 PyObject *oldvalue;
1334 Py_ssize_t pos = 0;
1335 PyObject *key;
1336 long hash;
1338 if (dictresize(mp, Py_SIZE(seq)))
1339 return NULL;
1341 while (_PyDict_Next(seq, &pos, &key, &oldvalue, &hash)) {
1342 Py_INCREF(key);
1343 Py_INCREF(value);
1344 if (insertdict(mp, key, hash, value))
1345 return NULL;
1347 return d;
1350 if (PyDict_CheckExact(d) && PyAnySet_CheckExact(seq)) {
1351 PyDictObject *mp = (PyDictObject *)d;
1352 Py_ssize_t pos = 0;
1353 PyObject *key;
1354 long hash;
1356 if (dictresize(mp, PySet_GET_SIZE(seq)))
1357 return NULL;
1359 while (_PySet_NextEntry(seq, &pos, &key, &hash)) {
1360 Py_INCREF(key);
1361 Py_INCREF(value);
1362 if (insertdict(mp, key, hash, value))
1363 return NULL;
1365 return d;
1368 it = PyObject_GetIter(seq);
1369 if (it == NULL){
1370 Py_DECREF(d);
1371 return NULL;
1374 if (PyDict_CheckExact(d)) {
1375 while ((key = PyIter_Next(it)) != NULL) {
1376 status = PyDict_SetItem(d, key, value);
1377 Py_DECREF(key);
1378 if (status < 0)
1379 goto Fail;
1381 } else {
1382 while ((key = PyIter_Next(it)) != NULL) {
1383 status = PyObject_SetItem(d, key, value);
1384 Py_DECREF(key);
1385 if (status < 0)
1386 goto Fail;
1390 if (PyErr_Occurred())
1391 goto Fail;
1392 Py_DECREF(it);
1393 return d;
1395 Fail:
1396 Py_DECREF(it);
1397 Py_DECREF(d);
1398 return NULL;
1401 static int
1402 dict_update_common(PyObject *self, PyObject *args, PyObject *kwds, char *methname)
1404 PyObject *arg = NULL;
1405 int result = 0;
1407 if (!PyArg_UnpackTuple(args, methname, 0, 1, &arg))
1408 result = -1;
1410 else if (arg != NULL) {
1411 if (PyObject_HasAttrString(arg, "keys"))
1412 result = PyDict_Merge(self, arg, 1);
1413 else
1414 result = PyDict_MergeFromSeq2(self, arg, 1);
1416 if (result == 0 && kwds != NULL)
1417 result = PyDict_Merge(self, kwds, 1);
1418 return result;
1421 static PyObject *
1422 dict_update(PyObject *self, PyObject *args, PyObject *kwds)
1424 if (dict_update_common(self, args, kwds, "update") != -1)
1425 Py_RETURN_NONE;
1426 return NULL;
1429 /* Update unconditionally replaces existing items.
1430 Merge has a 3rd argument 'override'; if set, it acts like Update,
1431 otherwise it leaves existing items unchanged.
1433 PyDict_{Update,Merge} update/merge from a mapping object.
1435 PyDict_MergeFromSeq2 updates/merges from any iterable object
1436 producing iterable objects of length 2.
1440 PyDict_MergeFromSeq2(PyObject *d, PyObject *seq2, int override)
1442 PyObject *it; /* iter(seq2) */
1443 Py_ssize_t i; /* index into seq2 of current element */
1444 PyObject *item; /* seq2[i] */
1445 PyObject *fast; /* item as a 2-tuple or 2-list */
1447 assert(d != NULL);
1448 assert(PyDict_Check(d));
1449 assert(seq2 != NULL);
1451 it = PyObject_GetIter(seq2);
1452 if (it == NULL)
1453 return -1;
1455 for (i = 0; ; ++i) {
1456 PyObject *key, *value;
1457 Py_ssize_t n;
1459 fast = NULL;
1460 item = PyIter_Next(it);
1461 if (item == NULL) {
1462 if (PyErr_Occurred())
1463 goto Fail;
1464 break;
1467 /* Convert item to sequence, and verify length 2. */
1468 fast = PySequence_Fast(item, "");
1469 if (fast == NULL) {
1470 if (PyErr_ExceptionMatches(PyExc_TypeError))
1471 PyErr_Format(PyExc_TypeError,
1472 "cannot convert dictionary update "
1473 "sequence element #%zd to a sequence",
1475 goto Fail;
1477 n = PySequence_Fast_GET_SIZE(fast);
1478 if (n != 2) {
1479 PyErr_Format(PyExc_ValueError,
1480 "dictionary update sequence element #%zd "
1481 "has length %zd; 2 is required",
1482 i, n);
1483 goto Fail;
1486 /* Update/merge with this (key, value) pair. */
1487 key = PySequence_Fast_GET_ITEM(fast, 0);
1488 value = PySequence_Fast_GET_ITEM(fast, 1);
1489 if (override || PyDict_GetItem(d, key) == NULL) {
1490 int status = PyDict_SetItem(d, key, value);
1491 if (status < 0)
1492 goto Fail;
1494 Py_DECREF(fast);
1495 Py_DECREF(item);
1498 i = 0;
1499 goto Return;
1500 Fail:
1501 Py_XDECREF(item);
1502 Py_XDECREF(fast);
1503 i = -1;
1504 Return:
1505 Py_DECREF(it);
1506 return Py_SAFE_DOWNCAST(i, Py_ssize_t, int);
1510 PyDict_Update(PyObject *a, PyObject *b)
1512 return PyDict_Merge(a, b, 1);
1516 PyDict_Merge(PyObject *a, PyObject *b, int override)
1518 register PyDictObject *mp, *other;
1519 register Py_ssize_t i;
1520 PyDictEntry *entry;
1522 /* We accept for the argument either a concrete dictionary object,
1523 * or an abstract "mapping" object. For the former, we can do
1524 * things quite efficiently. For the latter, we only require that
1525 * PyMapping_Keys() and PyObject_GetItem() be supported.
1527 if (a == NULL || !PyDict_Check(a) || b == NULL) {
1528 PyErr_BadInternalCall();
1529 return -1;
1531 mp = (PyDictObject*)a;
1532 if (PyDict_Check(b)) {
1533 other = (PyDictObject*)b;
1534 if (other == mp || other->ma_used == 0)
1535 /* a.update(a) or a.update({}); nothing to do */
1536 return 0;
1537 if (mp->ma_used == 0)
1538 /* Since the target dict is empty, PyDict_GetItem()
1539 * always returns NULL. Setting override to 1
1540 * skips the unnecessary test.
1542 override = 1;
1543 /* Do one big resize at the start, rather than
1544 * incrementally resizing as we insert new items. Expect
1545 * that there will be no (or few) overlapping keys.
1547 if ((mp->ma_fill + other->ma_used)*3 >= (mp->ma_mask+1)*2) {
1548 if (dictresize(mp, (mp->ma_used + other->ma_used)*2) != 0)
1549 return -1;
1551 for (i = 0; i <= other->ma_mask; i++) {
1552 entry = &other->ma_table[i];
1553 if (entry->me_value != NULL &&
1554 (override ||
1555 PyDict_GetItem(a, entry->me_key) == NULL)) {
1556 Py_INCREF(entry->me_key);
1557 Py_INCREF(entry->me_value);
1558 if (insertdict(mp, entry->me_key,
1559 (long)entry->me_hash,
1560 entry->me_value) != 0)
1561 return -1;
1565 else {
1566 /* Do it the generic, slower way */
1567 PyObject *keys = PyMapping_Keys(b);
1568 PyObject *iter;
1569 PyObject *key, *value;
1570 int status;
1572 if (keys == NULL)
1573 /* Docstring says this is equivalent to E.keys() so
1574 * if E doesn't have a .keys() method we want
1575 * AttributeError to percolate up. Might as well
1576 * do the same for any other error.
1578 return -1;
1580 iter = PyObject_GetIter(keys);
1581 Py_DECREF(keys);
1582 if (iter == NULL)
1583 return -1;
1585 for (key = PyIter_Next(iter); key; key = PyIter_Next(iter)) {
1586 if (!override && PyDict_GetItem(a, key) != NULL) {
1587 Py_DECREF(key);
1588 continue;
1590 value = PyObject_GetItem(b, key);
1591 if (value == NULL) {
1592 Py_DECREF(iter);
1593 Py_DECREF(key);
1594 return -1;
1596 status = PyDict_SetItem(a, key, value);
1597 Py_DECREF(key);
1598 Py_DECREF(value);
1599 if (status < 0) {
1600 Py_DECREF(iter);
1601 return -1;
1604 Py_DECREF(iter);
1605 if (PyErr_Occurred())
1606 /* Iterator completed, via error */
1607 return -1;
1609 return 0;
1612 static PyObject *
1613 dict_copy(register PyDictObject *mp)
1615 return PyDict_Copy((PyObject*)mp);
1618 PyObject *
1619 PyDict_Copy(PyObject *o)
1621 PyObject *copy;
1623 if (o == NULL || !PyDict_Check(o)) {
1624 PyErr_BadInternalCall();
1625 return NULL;
1627 copy = PyDict_New();
1628 if (copy == NULL)
1629 return NULL;
1630 if (PyDict_Merge(copy, o, 1) == 0)
1631 return copy;
1632 Py_DECREF(copy);
1633 return NULL;
1636 Py_ssize_t
1637 PyDict_Size(PyObject *mp)
1639 if (mp == NULL || !PyDict_Check(mp)) {
1640 PyErr_BadInternalCall();
1641 return -1;
1643 return ((PyDictObject *)mp)->ma_used;
1646 PyObject *
1647 PyDict_Keys(PyObject *mp)
1649 if (mp == NULL || !PyDict_Check(mp)) {
1650 PyErr_BadInternalCall();
1651 return NULL;
1653 return dict_keys((PyDictObject *)mp);
1656 PyObject *
1657 PyDict_Values(PyObject *mp)
1659 if (mp == NULL || !PyDict_Check(mp)) {
1660 PyErr_BadInternalCall();
1661 return NULL;
1663 return dict_values((PyDictObject *)mp);
1666 PyObject *
1667 PyDict_Items(PyObject *mp)
1669 if (mp == NULL || !PyDict_Check(mp)) {
1670 PyErr_BadInternalCall();
1671 return NULL;
1673 return dict_items((PyDictObject *)mp);
1676 /* Subroutine which returns the smallest key in a for which b's value
1677 is different or absent. The value is returned too, through the
1678 pval argument. Both are NULL if no key in a is found for which b's status
1679 differs. The refcounts on (and only on) non-NULL *pval and function return
1680 values must be decremented by the caller (characterize() increments them
1681 to ensure that mutating comparison and PyDict_GetItem calls can't delete
1682 them before the caller is done looking at them). */
1684 static PyObject *
1685 characterize(PyDictObject *a, PyDictObject *b, PyObject **pval)
1687 PyObject *akey = NULL; /* smallest key in a s.t. a[akey] != b[akey] */
1688 PyObject *aval = NULL; /* a[akey] */
1689 Py_ssize_t i;
1690 int cmp;
1692 for (i = 0; i <= a->ma_mask; i++) {
1693 PyObject *thiskey, *thisaval, *thisbval;
1694 if (a->ma_table[i].me_value == NULL)
1695 continue;
1696 thiskey = a->ma_table[i].me_key;
1697 Py_INCREF(thiskey); /* keep alive across compares */
1698 if (akey != NULL) {
1699 cmp = PyObject_RichCompareBool(akey, thiskey, Py_LT);
1700 if (cmp < 0) {
1701 Py_DECREF(thiskey);
1702 goto Fail;
1704 if (cmp > 0 ||
1705 i > a->ma_mask ||
1706 a->ma_table[i].me_value == NULL)
1708 /* Not the *smallest* a key; or maybe it is
1709 * but the compare shrunk the dict so we can't
1710 * find its associated value anymore; or
1711 * maybe it is but the compare deleted the
1712 * a[thiskey] entry.
1714 Py_DECREF(thiskey);
1715 continue;
1719 /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
1720 thisaval = a->ma_table[i].me_value;
1721 assert(thisaval);
1722 Py_INCREF(thisaval); /* keep alive */
1723 thisbval = PyDict_GetItem((PyObject *)b, thiskey);
1724 if (thisbval == NULL)
1725 cmp = 0;
1726 else {
1727 /* both dicts have thiskey: same values? */
1728 cmp = PyObject_RichCompareBool(
1729 thisaval, thisbval, Py_EQ);
1730 if (cmp < 0) {
1731 Py_DECREF(thiskey);
1732 Py_DECREF(thisaval);
1733 goto Fail;
1736 if (cmp == 0) {
1737 /* New winner. */
1738 Py_XDECREF(akey);
1739 Py_XDECREF(aval);
1740 akey = thiskey;
1741 aval = thisaval;
1743 else {
1744 Py_DECREF(thiskey);
1745 Py_DECREF(thisaval);
1748 *pval = aval;
1749 return akey;
1751 Fail:
1752 Py_XDECREF(akey);
1753 Py_XDECREF(aval);
1754 *pval = NULL;
1755 return NULL;
1758 static int
1759 dict_compare(PyDictObject *a, PyDictObject *b)
1761 PyObject *adiff, *bdiff, *aval, *bval;
1762 int res;
1764 /* Compare lengths first */
1765 if (a->ma_used < b->ma_used)
1766 return -1; /* a is shorter */
1767 else if (a->ma_used > b->ma_used)
1768 return 1; /* b is shorter */
1770 /* Same length -- check all keys */
1771 bdiff = bval = NULL;
1772 adiff = characterize(a, b, &aval);
1773 if (adiff == NULL) {
1774 assert(!aval);
1775 /* Either an error, or a is a subset with the same length so
1776 * must be equal.
1778 res = PyErr_Occurred() ? -1 : 0;
1779 goto Finished;
1781 bdiff = characterize(b, a, &bval);
1782 if (bdiff == NULL && PyErr_Occurred()) {
1783 assert(!bval);
1784 res = -1;
1785 goto Finished;
1787 res = 0;
1788 if (bdiff) {
1789 /* bdiff == NULL "should be" impossible now, but perhaps
1790 * the last comparison done by the characterize() on a had
1791 * the side effect of making the dicts equal!
1793 res = PyObject_Compare(adiff, bdiff);
1795 if (res == 0 && bval != NULL)
1796 res = PyObject_Compare(aval, bval);
1798 Finished:
1799 Py_XDECREF(adiff);
1800 Py_XDECREF(bdiff);
1801 Py_XDECREF(aval);
1802 Py_XDECREF(bval);
1803 return res;
1806 /* Return 1 if dicts equal, 0 if not, -1 if error.
1807 * Gets out as soon as any difference is detected.
1808 * Uses only Py_EQ comparison.
1810 static int
1811 dict_equal(PyDictObject *a, PyDictObject *b)
1813 Py_ssize_t i;
1815 if (a->ma_used != b->ma_used)
1816 /* can't be equal if # of entries differ */
1817 return 0;
1819 /* Same # of entries -- check all of 'em. Exit early on any diff. */
1820 for (i = 0; i <= a->ma_mask; i++) {
1821 PyObject *aval = a->ma_table[i].me_value;
1822 if (aval != NULL) {
1823 int cmp;
1824 PyObject *bval;
1825 PyObject *key = a->ma_table[i].me_key;
1826 /* temporarily bump aval's refcount to ensure it stays
1827 alive until we're done with it */
1828 Py_INCREF(aval);
1829 /* ditto for key */
1830 Py_INCREF(key);
1831 bval = PyDict_GetItem((PyObject *)b, key);
1832 Py_DECREF(key);
1833 if (bval == NULL) {
1834 Py_DECREF(aval);
1835 return 0;
1837 cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);
1838 Py_DECREF(aval);
1839 if (cmp <= 0) /* error or not equal */
1840 return cmp;
1843 return 1;
1846 static PyObject *
1847 dict_richcompare(PyObject *v, PyObject *w, int op)
1849 int cmp;
1850 PyObject *res;
1852 if (!PyDict_Check(v) || !PyDict_Check(w)) {
1853 res = Py_NotImplemented;
1855 else if (op == Py_EQ || op == Py_NE) {
1856 cmp = dict_equal((PyDictObject *)v, (PyDictObject *)w);
1857 if (cmp < 0)
1858 return NULL;
1859 res = (cmp == (op == Py_EQ)) ? Py_True : Py_False;
1861 else {
1862 /* Py3K warning if comparison isn't == or != */
1863 if (PyErr_WarnPy3k("dict inequality comparisons not supported "
1864 "in 3.x", 1) < 0) {
1865 return NULL;
1867 res = Py_NotImplemented;
1869 Py_INCREF(res);
1870 return res;
1873 static PyObject *
1874 dict_contains(register PyDictObject *mp, PyObject *key)
1876 long hash;
1877 PyDictEntry *ep;
1879 if (!PyString_CheckExact(key) ||
1880 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1881 hash = PyObject_Hash(key);
1882 if (hash == -1)
1883 return NULL;
1885 ep = (mp->ma_lookup)(mp, key, hash);
1886 if (ep == NULL)
1887 return NULL;
1888 return PyBool_FromLong(ep->me_value != NULL);
1891 static PyObject *
1892 dict_has_key(register PyDictObject *mp, PyObject *key)
1894 if (PyErr_WarnPy3k("dict.has_key() not supported in 3.x; "
1895 "use the in operator", 1) < 0)
1896 return NULL;
1897 return dict_contains(mp, key);
1900 static PyObject *
1901 dict_get(register PyDictObject *mp, PyObject *args)
1903 PyObject *key;
1904 PyObject *failobj = Py_None;
1905 PyObject *val = NULL;
1906 long hash;
1907 PyDictEntry *ep;
1909 if (!PyArg_UnpackTuple(args, "get", 1, 2, &key, &failobj))
1910 return NULL;
1912 if (!PyString_CheckExact(key) ||
1913 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1914 hash = PyObject_Hash(key);
1915 if (hash == -1)
1916 return NULL;
1918 ep = (mp->ma_lookup)(mp, key, hash);
1919 if (ep == NULL)
1920 return NULL;
1921 val = ep->me_value;
1922 if (val == NULL)
1923 val = failobj;
1924 Py_INCREF(val);
1925 return val;
1929 static PyObject *
1930 dict_setdefault(register PyDictObject *mp, PyObject *args)
1932 PyObject *key;
1933 PyObject *failobj = Py_None;
1934 PyObject *val = NULL;
1935 long hash;
1936 PyDictEntry *ep;
1938 if (!PyArg_UnpackTuple(args, "setdefault", 1, 2, &key, &failobj))
1939 return NULL;
1941 if (!PyString_CheckExact(key) ||
1942 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1943 hash = PyObject_Hash(key);
1944 if (hash == -1)
1945 return NULL;
1947 ep = (mp->ma_lookup)(mp, key, hash);
1948 if (ep == NULL)
1949 return NULL;
1950 val = ep->me_value;
1951 if (val == NULL) {
1952 val = failobj;
1953 if (PyDict_SetItem((PyObject*)mp, key, failobj))
1954 val = NULL;
1956 Py_XINCREF(val);
1957 return val;
1961 static PyObject *
1962 dict_clear(register PyDictObject *mp)
1964 PyDict_Clear((PyObject *)mp);
1965 Py_RETURN_NONE;
1968 static PyObject *
1969 dict_pop(PyDictObject *mp, PyObject *args)
1971 long hash;
1972 PyDictEntry *ep;
1973 PyObject *old_value, *old_key;
1974 PyObject *key, *deflt = NULL;
1976 if(!PyArg_UnpackTuple(args, "pop", 1, 2, &key, &deflt))
1977 return NULL;
1978 if (mp->ma_used == 0) {
1979 if (deflt) {
1980 Py_INCREF(deflt);
1981 return deflt;
1983 PyErr_SetString(PyExc_KeyError,
1984 "pop(): dictionary is empty");
1985 return NULL;
1987 if (!PyString_CheckExact(key) ||
1988 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1989 hash = PyObject_Hash(key);
1990 if (hash == -1)
1991 return NULL;
1993 ep = (mp->ma_lookup)(mp, key, hash);
1994 if (ep == NULL)
1995 return NULL;
1996 if (ep->me_value == NULL) {
1997 if (deflt) {
1998 Py_INCREF(deflt);
1999 return deflt;
2001 set_key_error(key);
2002 return NULL;
2004 old_key = ep->me_key;
2005 Py_INCREF(dummy);
2006 ep->me_key = dummy;
2007 old_value = ep->me_value;
2008 ep->me_value = NULL;
2009 mp->ma_used--;
2010 Py_DECREF(old_key);
2011 return old_value;
2014 static PyObject *
2015 dict_popitem(PyDictObject *mp)
2017 Py_ssize_t i = 0;
2018 PyDictEntry *ep;
2019 PyObject *res;
2021 /* Allocate the result tuple before checking the size. Believe it
2022 * or not, this allocation could trigger a garbage collection which
2023 * could empty the dict, so if we checked the size first and that
2024 * happened, the result would be an infinite loop (searching for an
2025 * entry that no longer exists). Note that the usual popitem()
2026 * idiom is "while d: k, v = d.popitem()". so needing to throw the
2027 * tuple away if the dict *is* empty isn't a significant
2028 * inefficiency -- possible, but unlikely in practice.
2030 res = PyTuple_New(2);
2031 if (res == NULL)
2032 return NULL;
2033 if (mp->ma_used == 0) {
2034 Py_DECREF(res);
2035 PyErr_SetString(PyExc_KeyError,
2036 "popitem(): dictionary is empty");
2037 return NULL;
2039 /* Set ep to "the first" dict entry with a value. We abuse the hash
2040 * field of slot 0 to hold a search finger:
2041 * If slot 0 has a value, use slot 0.
2042 * Else slot 0 is being used to hold a search finger,
2043 * and we use its hash value as the first index to look.
2045 ep = &mp->ma_table[0];
2046 if (ep->me_value == NULL) {
2047 i = ep->me_hash;
2048 /* The hash field may be a real hash value, or it may be a
2049 * legit search finger, or it may be a once-legit search
2050 * finger that's out of bounds now because it wrapped around
2051 * or the table shrunk -- simply make sure it's in bounds now.
2053 if (i > mp->ma_mask || i < 1)
2054 i = 1; /* skip slot 0 */
2055 while ((ep = &mp->ma_table[i])->me_value == NULL) {
2056 i++;
2057 if (i > mp->ma_mask)
2058 i = 1;
2061 PyTuple_SET_ITEM(res, 0, ep->me_key);
2062 PyTuple_SET_ITEM(res, 1, ep->me_value);
2063 Py_INCREF(dummy);
2064 ep->me_key = dummy;
2065 ep->me_value = NULL;
2066 mp->ma_used--;
2067 assert(mp->ma_table[0].me_value == NULL);
2068 mp->ma_table[0].me_hash = i + 1; /* next place to start */
2069 return res;
2072 static int
2073 dict_traverse(PyObject *op, visitproc visit, void *arg)
2075 Py_ssize_t i = 0;
2076 PyObject *pk;
2077 PyObject *pv;
2079 while (PyDict_Next(op, &i, &pk, &pv)) {
2080 Py_VISIT(pk);
2081 Py_VISIT(pv);
2083 return 0;
2086 static int
2087 dict_tp_clear(PyObject *op)
2089 PyDict_Clear(op);
2090 return 0;
2094 extern PyTypeObject PyDictIterKey_Type; /* Forward */
2095 extern PyTypeObject PyDictIterValue_Type; /* Forward */
2096 extern PyTypeObject PyDictIterItem_Type; /* Forward */
2097 static PyObject *dictiter_new(PyDictObject *, PyTypeObject *);
2099 static PyObject *
2100 dict_iterkeys(PyDictObject *dict)
2102 return dictiter_new(dict, &PyDictIterKey_Type);
2105 static PyObject *
2106 dict_itervalues(PyDictObject *dict)
2108 return dictiter_new(dict, &PyDictIterValue_Type);
2111 static PyObject *
2112 dict_iteritems(PyDictObject *dict)
2114 return dictiter_new(dict, &PyDictIterItem_Type);
2117 static PyObject *
2118 dict_sizeof(PyDictObject *mp)
2120 Py_ssize_t res;
2122 res = sizeof(PyDictObject);
2123 if (mp->ma_table != mp->ma_smalltable)
2124 res = res + (mp->ma_mask + 1) * sizeof(PyDictEntry);
2125 return PyInt_FromSsize_t(res);
2128 PyDoc_STRVAR(has_key__doc__,
2129 "D.has_key(k) -> True if D has a key k, else False");
2131 PyDoc_STRVAR(contains__doc__,
2132 "D.__contains__(k) -> True if D has a key k, else False");
2134 PyDoc_STRVAR(getitem__doc__, "x.__getitem__(y) <==> x[y]");
2136 PyDoc_STRVAR(sizeof__doc__,
2137 "D.__sizeof__() -> size of D in memory, in bytes");
2139 PyDoc_STRVAR(get__doc__,
2140 "D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.");
2142 PyDoc_STRVAR(setdefault_doc__,
2143 "D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D");
2145 PyDoc_STRVAR(pop__doc__,
2146 "D.pop(k[,d]) -> v, remove specified key and return the corresponding value.\n\
2147 If key is not found, d is returned if given, otherwise KeyError is raised");
2149 PyDoc_STRVAR(popitem__doc__,
2150 "D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
2151 2-tuple; but raise KeyError if D is empty.");
2153 PyDoc_STRVAR(keys__doc__,
2154 "D.keys() -> list of D's keys");
2156 PyDoc_STRVAR(items__doc__,
2157 "D.items() -> list of D's (key, value) pairs, as 2-tuples");
2159 PyDoc_STRVAR(values__doc__,
2160 "D.values() -> list of D's values");
2162 PyDoc_STRVAR(update__doc__,
2163 "D.update(E, **F) -> None. Update D from dict/iterable E and F.\n"
2164 "If E has a .keys() method, does: for k in E: D[k] = E[k]\n\
2165 If E lacks .keys() method, does: for (k, v) in E: D[k] = v\n\
2166 In either case, this is followed by: for k in F: D[k] = F[k]");
2168 PyDoc_STRVAR(fromkeys__doc__,
2169 "dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v.\n\
2170 v defaults to None.");
2172 PyDoc_STRVAR(clear__doc__,
2173 "D.clear() -> None. Remove all items from D.");
2175 PyDoc_STRVAR(copy__doc__,
2176 "D.copy() -> a shallow copy of D");
2178 PyDoc_STRVAR(iterkeys__doc__,
2179 "D.iterkeys() -> an iterator over the keys of D");
2181 PyDoc_STRVAR(itervalues__doc__,
2182 "D.itervalues() -> an iterator over the values of D");
2184 PyDoc_STRVAR(iteritems__doc__,
2185 "D.iteritems() -> an iterator over the (key, value) items of D");
2187 static PyMethodDef mapp_methods[] = {
2188 {"__contains__",(PyCFunction)dict_contains, METH_O | METH_COEXIST,
2189 contains__doc__},
2190 {"__getitem__", (PyCFunction)dict_subscript, METH_O | METH_COEXIST,
2191 getitem__doc__},
2192 {"__sizeof__", (PyCFunction)dict_sizeof, METH_NOARGS,
2193 sizeof__doc__},
2194 {"has_key", (PyCFunction)dict_has_key, METH_O,
2195 has_key__doc__},
2196 {"get", (PyCFunction)dict_get, METH_VARARGS,
2197 get__doc__},
2198 {"setdefault", (PyCFunction)dict_setdefault, METH_VARARGS,
2199 setdefault_doc__},
2200 {"pop", (PyCFunction)dict_pop, METH_VARARGS,
2201 pop__doc__},
2202 {"popitem", (PyCFunction)dict_popitem, METH_NOARGS,
2203 popitem__doc__},
2204 {"keys", (PyCFunction)dict_keys, METH_NOARGS,
2205 keys__doc__},
2206 {"items", (PyCFunction)dict_items, METH_NOARGS,
2207 items__doc__},
2208 {"values", (PyCFunction)dict_values, METH_NOARGS,
2209 values__doc__},
2210 {"update", (PyCFunction)dict_update, METH_VARARGS | METH_KEYWORDS,
2211 update__doc__},
2212 {"fromkeys", (PyCFunction)dict_fromkeys, METH_VARARGS | METH_CLASS,
2213 fromkeys__doc__},
2214 {"clear", (PyCFunction)dict_clear, METH_NOARGS,
2215 clear__doc__},
2216 {"copy", (PyCFunction)dict_copy, METH_NOARGS,
2217 copy__doc__},
2218 {"iterkeys", (PyCFunction)dict_iterkeys, METH_NOARGS,
2219 iterkeys__doc__},
2220 {"itervalues", (PyCFunction)dict_itervalues, METH_NOARGS,
2221 itervalues__doc__},
2222 {"iteritems", (PyCFunction)dict_iteritems, METH_NOARGS,
2223 iteritems__doc__},
2224 {NULL, NULL} /* sentinel */
2227 /* Return 1 if `key` is in dict `op`, 0 if not, and -1 on error. */
2229 PyDict_Contains(PyObject *op, PyObject *key)
2231 long hash;
2232 PyDictObject *mp = (PyDictObject *)op;
2233 PyDictEntry *ep;
2235 if (!PyString_CheckExact(key) ||
2236 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
2237 hash = PyObject_Hash(key);
2238 if (hash == -1)
2239 return -1;
2241 ep = (mp->ma_lookup)(mp, key, hash);
2242 return ep == NULL ? -1 : (ep->me_value != NULL);
2245 /* Internal version of PyDict_Contains used when the hash value is already known */
2247 _PyDict_Contains(PyObject *op, PyObject *key, long hash)
2249 PyDictObject *mp = (PyDictObject *)op;
2250 PyDictEntry *ep;
2252 ep = (mp->ma_lookup)(mp, key, hash);
2253 return ep == NULL ? -1 : (ep->me_value != NULL);
2256 /* Hack to implement "key in dict" */
2257 static PySequenceMethods dict_as_sequence = {
2258 0, /* sq_length */
2259 0, /* sq_concat */
2260 0, /* sq_repeat */
2261 0, /* sq_item */
2262 0, /* sq_slice */
2263 0, /* sq_ass_item */
2264 0, /* sq_ass_slice */
2265 PyDict_Contains, /* sq_contains */
2266 0, /* sq_inplace_concat */
2267 0, /* sq_inplace_repeat */
2270 static PyObject *
2271 dict_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
2273 PyObject *self;
2275 assert(type != NULL && type->tp_alloc != NULL);
2276 self = type->tp_alloc(type, 0);
2277 if (self != NULL) {
2278 PyDictObject *d = (PyDictObject *)self;
2279 /* It's guaranteed that tp->alloc zeroed out the struct. */
2280 assert(d->ma_table == NULL && d->ma_fill == 0 && d->ma_used == 0);
2281 INIT_NONZERO_DICT_SLOTS(d);
2282 d->ma_lookup = lookdict_string;
2283 /* The object has been implicitely tracked by tp_alloc */
2284 if (type == &PyDict_Type)
2285 _PyObject_GC_UNTRACK(d);
2286 #ifdef SHOW_CONVERSION_COUNTS
2287 ++created;
2288 #endif
2289 #ifdef SHOW_TRACK_COUNT
2290 if (_PyObject_GC_IS_TRACKED(d))
2291 count_tracked++;
2292 else
2293 count_untracked++;
2294 #endif
2296 return self;
2299 static int
2300 dict_init(PyObject *self, PyObject *args, PyObject *kwds)
2302 return dict_update_common(self, args, kwds, "dict");
2305 static PyObject *
2306 dict_iter(PyDictObject *dict)
2308 return dictiter_new(dict, &PyDictIterKey_Type);
2311 PyDoc_STRVAR(dictionary_doc,
2312 "dict() -> new empty dictionary.\n"
2313 "dict(mapping) -> new dictionary initialized from a mapping object's\n"
2314 " (key, value) pairs.\n"
2315 "dict(seq) -> new dictionary initialized as if via:\n"
2316 " d = {}\n"
2317 " for k, v in seq:\n"
2318 " d[k] = v\n"
2319 "dict(**kwargs) -> new dictionary initialized with the name=value pairs\n"
2320 " in the keyword argument list. For example: dict(one=1, two=2)");
2322 PyTypeObject PyDict_Type = {
2323 PyVarObject_HEAD_INIT(&PyType_Type, 0)
2324 "dict",
2325 sizeof(PyDictObject),
2327 (destructor)dict_dealloc, /* tp_dealloc */
2328 (printfunc)dict_print, /* tp_print */
2329 0, /* tp_getattr */
2330 0, /* tp_setattr */
2331 (cmpfunc)dict_compare, /* tp_compare */
2332 (reprfunc)dict_repr, /* tp_repr */
2333 0, /* tp_as_number */
2334 &dict_as_sequence, /* tp_as_sequence */
2335 &dict_as_mapping, /* tp_as_mapping */
2336 (hashfunc)PyObject_HashNotImplemented, /* tp_hash */
2337 0, /* tp_call */
2338 0, /* tp_str */
2339 PyObject_GenericGetAttr, /* tp_getattro */
2340 0, /* tp_setattro */
2341 0, /* tp_as_buffer */
2342 Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
2343 Py_TPFLAGS_BASETYPE | Py_TPFLAGS_DICT_SUBCLASS, /* tp_flags */
2344 dictionary_doc, /* tp_doc */
2345 dict_traverse, /* tp_traverse */
2346 dict_tp_clear, /* tp_clear */
2347 dict_richcompare, /* tp_richcompare */
2348 0, /* tp_weaklistoffset */
2349 (getiterfunc)dict_iter, /* tp_iter */
2350 0, /* tp_iternext */
2351 mapp_methods, /* tp_methods */
2352 0, /* tp_members */
2353 0, /* tp_getset */
2354 0, /* tp_base */
2355 0, /* tp_dict */
2356 0, /* tp_descr_get */
2357 0, /* tp_descr_set */
2358 0, /* tp_dictoffset */
2359 dict_init, /* tp_init */
2360 PyType_GenericAlloc, /* tp_alloc */
2361 dict_new, /* tp_new */
2362 PyObject_GC_Del, /* tp_free */
2365 /* For backward compatibility with old dictionary interface */
2367 PyObject *
2368 PyDict_GetItemString(PyObject *v, const char *key)
2370 PyObject *kv, *rv;
2371 kv = PyString_FromString(key);
2372 if (kv == NULL)
2373 return NULL;
2374 rv = PyDict_GetItem(v, kv);
2375 Py_DECREF(kv);
2376 return rv;
2380 PyDict_SetItemString(PyObject *v, const char *key, PyObject *item)
2382 PyObject *kv;
2383 int err;
2384 kv = PyString_FromString(key);
2385 if (kv == NULL)
2386 return -1;
2387 PyString_InternInPlace(&kv); /* XXX Should we really? */
2388 err = PyDict_SetItem(v, kv, item);
2389 Py_DECREF(kv);
2390 return err;
2394 PyDict_DelItemString(PyObject *v, const char *key)
2396 PyObject *kv;
2397 int err;
2398 kv = PyString_FromString(key);
2399 if (kv == NULL)
2400 return -1;
2401 err = PyDict_DelItem(v, kv);
2402 Py_DECREF(kv);
2403 return err;
2406 /* Dictionary iterator types */
2408 typedef struct {
2409 PyObject_HEAD
2410 PyDictObject *di_dict; /* Set to NULL when iterator is exhausted */
2411 Py_ssize_t di_used;
2412 Py_ssize_t di_pos;
2413 PyObject* di_result; /* reusable result tuple for iteritems */
2414 Py_ssize_t len;
2415 } dictiterobject;
2417 static PyObject *
2418 dictiter_new(PyDictObject *dict, PyTypeObject *itertype)
2420 dictiterobject *di;
2421 di = PyObject_GC_New(dictiterobject, itertype);
2422 if (di == NULL)
2423 return NULL;
2424 Py_INCREF(dict);
2425 di->di_dict = dict;
2426 di->di_used = dict->ma_used;
2427 di->di_pos = 0;
2428 di->len = dict->ma_used;
2429 if (itertype == &PyDictIterItem_Type) {
2430 di->di_result = PyTuple_Pack(2, Py_None, Py_None);
2431 if (di->di_result == NULL) {
2432 Py_DECREF(di);
2433 return NULL;
2436 else
2437 di->di_result = NULL;
2438 _PyObject_GC_TRACK(di);
2439 return (PyObject *)di;
2442 static void
2443 dictiter_dealloc(dictiterobject *di)
2445 Py_XDECREF(di->di_dict);
2446 Py_XDECREF(di->di_result);
2447 PyObject_GC_Del(di);
2450 static int
2451 dictiter_traverse(dictiterobject *di, visitproc visit, void *arg)
2453 Py_VISIT(di->di_dict);
2454 Py_VISIT(di->di_result);
2455 return 0;
2458 static PyObject *
2459 dictiter_len(dictiterobject *di)
2461 Py_ssize_t len = 0;
2462 if (di->di_dict != NULL && di->di_used == di->di_dict->ma_used)
2463 len = di->len;
2464 return PyInt_FromSize_t(len);
2467 PyDoc_STRVAR(length_hint_doc, "Private method returning an estimate of len(list(it)).");
2469 static PyMethodDef dictiter_methods[] = {
2470 {"__length_hint__", (PyCFunction)dictiter_len, METH_NOARGS, length_hint_doc},
2471 {NULL, NULL} /* sentinel */
2474 static PyObject *dictiter_iternextkey(dictiterobject *di)
2476 PyObject *key;
2477 register Py_ssize_t i, mask;
2478 register PyDictEntry *ep;
2479 PyDictObject *d = di->di_dict;
2481 if (d == NULL)
2482 return NULL;
2483 assert (PyDict_Check(d));
2485 if (di->di_used != d->ma_used) {
2486 PyErr_SetString(PyExc_RuntimeError,
2487 "dictionary changed size during iteration");
2488 di->di_used = -1; /* Make this state sticky */
2489 return NULL;
2492 i = di->di_pos;
2493 if (i < 0)
2494 goto fail;
2495 ep = d->ma_table;
2496 mask = d->ma_mask;
2497 while (i <= mask && ep[i].me_value == NULL)
2498 i++;
2499 di->di_pos = i+1;
2500 if (i > mask)
2501 goto fail;
2502 di->len--;
2503 key = ep[i].me_key;
2504 Py_INCREF(key);
2505 return key;
2507 fail:
2508 Py_DECREF(d);
2509 di->di_dict = NULL;
2510 return NULL;
2513 PyTypeObject PyDictIterKey_Type = {
2514 PyVarObject_HEAD_INIT(&PyType_Type, 0)
2515 "dictionary-keyiterator", /* tp_name */
2516 sizeof(dictiterobject), /* tp_basicsize */
2517 0, /* tp_itemsize */
2518 /* methods */
2519 (destructor)dictiter_dealloc, /* tp_dealloc */
2520 0, /* tp_print */
2521 0, /* tp_getattr */
2522 0, /* tp_setattr */
2523 0, /* tp_compare */
2524 0, /* tp_repr */
2525 0, /* tp_as_number */
2526 0, /* tp_as_sequence */
2527 0, /* tp_as_mapping */
2528 0, /* tp_hash */
2529 0, /* tp_call */
2530 0, /* tp_str */
2531 PyObject_GenericGetAttr, /* tp_getattro */
2532 0, /* tp_setattro */
2533 0, /* tp_as_buffer */
2534 Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2535 0, /* tp_doc */
2536 (traverseproc)dictiter_traverse, /* tp_traverse */
2537 0, /* tp_clear */
2538 0, /* tp_richcompare */
2539 0, /* tp_weaklistoffset */
2540 PyObject_SelfIter, /* tp_iter */
2541 (iternextfunc)dictiter_iternextkey, /* tp_iternext */
2542 dictiter_methods, /* tp_methods */
2546 static PyObject *dictiter_iternextvalue(dictiterobject *di)
2548 PyObject *value;
2549 register Py_ssize_t i, mask;
2550 register PyDictEntry *ep;
2551 PyDictObject *d = di->di_dict;
2553 if (d == NULL)
2554 return NULL;
2555 assert (PyDict_Check(d));
2557 if (di->di_used != d->ma_used) {
2558 PyErr_SetString(PyExc_RuntimeError,
2559 "dictionary changed size during iteration");
2560 di->di_used = -1; /* Make this state sticky */
2561 return NULL;
2564 i = di->di_pos;
2565 mask = d->ma_mask;
2566 if (i < 0 || i > mask)
2567 goto fail;
2568 ep = d->ma_table;
2569 while ((value=ep[i].me_value) == NULL) {
2570 i++;
2571 if (i > mask)
2572 goto fail;
2574 di->di_pos = i+1;
2575 di->len--;
2576 Py_INCREF(value);
2577 return value;
2579 fail:
2580 Py_DECREF(d);
2581 di->di_dict = NULL;
2582 return NULL;
2585 PyTypeObject PyDictIterValue_Type = {
2586 PyVarObject_HEAD_INIT(&PyType_Type, 0)
2587 "dictionary-valueiterator", /* tp_name */
2588 sizeof(dictiterobject), /* tp_basicsize */
2589 0, /* tp_itemsize */
2590 /* methods */
2591 (destructor)dictiter_dealloc, /* tp_dealloc */
2592 0, /* tp_print */
2593 0, /* tp_getattr */
2594 0, /* tp_setattr */
2595 0, /* tp_compare */
2596 0, /* tp_repr */
2597 0, /* tp_as_number */
2598 0, /* tp_as_sequence */
2599 0, /* tp_as_mapping */
2600 0, /* tp_hash */
2601 0, /* tp_call */
2602 0, /* tp_str */
2603 PyObject_GenericGetAttr, /* tp_getattro */
2604 0, /* tp_setattro */
2605 0, /* tp_as_buffer */
2606 Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2607 0, /* tp_doc */
2608 (traverseproc)dictiter_traverse, /* tp_traverse */
2609 0, /* tp_clear */
2610 0, /* tp_richcompare */
2611 0, /* tp_weaklistoffset */
2612 PyObject_SelfIter, /* tp_iter */
2613 (iternextfunc)dictiter_iternextvalue, /* tp_iternext */
2614 dictiter_methods, /* tp_methods */
2618 static PyObject *dictiter_iternextitem(dictiterobject *di)
2620 PyObject *key, *value, *result = di->di_result;
2621 register Py_ssize_t i, mask;
2622 register PyDictEntry *ep;
2623 PyDictObject *d = di->di_dict;
2625 if (d == NULL)
2626 return NULL;
2627 assert (PyDict_Check(d));
2629 if (di->di_used != d->ma_used) {
2630 PyErr_SetString(PyExc_RuntimeError,
2631 "dictionary changed size during iteration");
2632 di->di_used = -1; /* Make this state sticky */
2633 return NULL;
2636 i = di->di_pos;
2637 if (i < 0)
2638 goto fail;
2639 ep = d->ma_table;
2640 mask = d->ma_mask;
2641 while (i <= mask && ep[i].me_value == NULL)
2642 i++;
2643 di->di_pos = i+1;
2644 if (i > mask)
2645 goto fail;
2647 if (result->ob_refcnt == 1) {
2648 Py_INCREF(result);
2649 Py_DECREF(PyTuple_GET_ITEM(result, 0));
2650 Py_DECREF(PyTuple_GET_ITEM(result, 1));
2651 } else {
2652 result = PyTuple_New(2);
2653 if (result == NULL)
2654 return NULL;
2656 di->len--;
2657 key = ep[i].me_key;
2658 value = ep[i].me_value;
2659 Py_INCREF(key);
2660 Py_INCREF(value);
2661 PyTuple_SET_ITEM(result, 0, key);
2662 PyTuple_SET_ITEM(result, 1, value);
2663 return result;
2665 fail:
2666 Py_DECREF(d);
2667 di->di_dict = NULL;
2668 return NULL;
2671 PyTypeObject PyDictIterItem_Type = {
2672 PyVarObject_HEAD_INIT(&PyType_Type, 0)
2673 "dictionary-itemiterator", /* tp_name */
2674 sizeof(dictiterobject), /* tp_basicsize */
2675 0, /* tp_itemsize */
2676 /* methods */
2677 (destructor)dictiter_dealloc, /* tp_dealloc */
2678 0, /* tp_print */
2679 0, /* tp_getattr */
2680 0, /* tp_setattr */
2681 0, /* tp_compare */
2682 0, /* tp_repr */
2683 0, /* tp_as_number */
2684 0, /* tp_as_sequence */
2685 0, /* tp_as_mapping */
2686 0, /* tp_hash */
2687 0, /* tp_call */
2688 0, /* tp_str */
2689 PyObject_GenericGetAttr, /* tp_getattro */
2690 0, /* tp_setattro */
2691 0, /* tp_as_buffer */
2692 Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2693 0, /* tp_doc */
2694 (traverseproc)dictiter_traverse, /* tp_traverse */
2695 0, /* tp_clear */
2696 0, /* tp_richcompare */
2697 0, /* tp_weaklistoffset */
2698 PyObject_SelfIter, /* tp_iter */
2699 (iternextfunc)dictiter_iternextitem, /* tp_iternext */
2700 dictiter_methods, /* tp_methods */