stmt and setup can contain multiple statements, see #5896
[python.git] / Objects / dictobject.c
blobd797173863f2a0ea84a5d619ecb8cad6032cf81b
2 /* Dictionary object implementation using a hash table */
4 /* The distribution includes a separate file, Objects/dictnotes.txt,
5 describing explorations into dictionary design and optimization.
6 It covers typical dictionary use patterns, the parameters for
7 tuning dictionaries, and several ideas for possible optimizations.
8 */
10 #include "Python.h"
13 /* Set a key error with the specified argument, wrapping it in a
14 * tuple automatically so that tuple keys are not unpacked as the
15 * exception arguments. */
16 static void
17 set_key_error(PyObject *arg)
19 PyObject *tup;
20 tup = PyTuple_Pack(1, arg);
21 if (!tup)
22 return; /* caller will expect error to be set anyway */
23 PyErr_SetObject(PyExc_KeyError, tup);
24 Py_DECREF(tup);
27 /* Define this out if you don't want conversion statistics on exit. */
28 #undef SHOW_CONVERSION_COUNTS
30 /* See large comment block below. This must be >= 1. */
31 #define PERTURB_SHIFT 5
34 Major subtleties ahead: Most hash schemes depend on having a "good" hash
35 function, in the sense of simulating randomness. Python doesn't: its most
36 important hash functions (for strings and ints) are very regular in common
37 cases:
39 >>> map(hash, (0, 1, 2, 3))
40 [0, 1, 2, 3]
41 >>> map(hash, ("namea", "nameb", "namec", "named"))
42 [-1658398457, -1658398460, -1658398459, -1658398462]
43 >>>
45 This isn't necessarily bad! To the contrary, in a table of size 2**i, taking
46 the low-order i bits as the initial table index is extremely fast, and there
47 are no collisions at all for dicts indexed by a contiguous range of ints.
48 The same is approximately true when keys are "consecutive" strings. So this
49 gives better-than-random behavior in common cases, and that's very desirable.
51 OTOH, when collisions occur, the tendency to fill contiguous slices of the
52 hash table makes a good collision resolution strategy crucial. Taking only
53 the last i bits of the hash code is also vulnerable: for example, consider
54 [i << 16 for i in range(20000)] as a set of keys. Since ints are their own
55 hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
56 hash code are all 0: they *all* map to the same table index.
58 But catering to unusual cases should not slow the usual ones, so we just take
59 the last i bits anyway. It's up to collision resolution to do the rest. If
60 we *usually* find the key we're looking for on the first try (and, it turns
61 out, we usually do -- the table load factor is kept under 2/3, so the odds
62 are solidly in our favor), then it makes best sense to keep the initial index
63 computation dirt cheap.
65 The first half of collision resolution is to visit table indices via this
66 recurrence:
68 j = ((5*j) + 1) mod 2**i
70 For any initial j in range(2**i), repeating that 2**i times generates each
71 int in range(2**i) exactly once (see any text on random-number generation for
72 proof). By itself, this doesn't help much: like linear probing (setting
73 j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
74 order. This would be bad, except that's not the only thing we do, and it's
75 actually *good* in the common cases where hash keys are consecutive. In an
76 example that's really too small to make this entirely clear, for a table of
77 size 2**3 the order of indices is:
79 0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]
81 If two things come in at index 5, the first place we look after is index 2,
82 not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
83 Linear probing is deadly in this case because there the fixed probe order
84 is the *same* as the order consecutive keys are likely to arrive. But it's
85 extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
86 and certain that consecutive hash codes do not.
88 The other half of the strategy is to get the other bits of the hash code
89 into play. This is done by initializing a (unsigned) vrbl "perturb" to the
90 full hash code, and changing the recurrence to:
92 j = (5*j) + 1 + perturb;
93 perturb >>= PERTURB_SHIFT;
94 use j % 2**i as the next table index;
96 Now the probe sequence depends (eventually) on every bit in the hash code,
97 and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
98 because it quickly magnifies small differences in the bits that didn't affect
99 the initial index. Note that because perturb is unsigned, if the recurrence
100 is executed often enough perturb eventually becomes and remains 0. At that
101 point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
102 that's certain to find an empty slot eventually (since it generates every int
103 in range(2**i), and we make sure there's always at least one empty slot).
105 Selecting a good value for PERTURB_SHIFT is a balancing act. You want it
106 small so that the high bits of the hash code continue to affect the probe
107 sequence across iterations; but you want it large so that in really bad cases
108 the high-order hash bits have an effect on early iterations. 5 was "the
109 best" in minimizing total collisions across experiments Tim Peters ran (on
110 both normal and pathological cases), but 4 and 6 weren't significantly worse.
112 Historical: Reimer Behrends contributed the idea of using a polynomial-based
113 approach, using repeated multiplication by x in GF(2**n) where an irreducible
114 polynomial for each table size was chosen such that x was a primitive root.
115 Christian Tismer later extended that to use division by x instead, as an
116 efficient way to get the high bits of the hash code into play. This scheme
117 also gave excellent collision statistics, but was more expensive: two
118 if-tests were required inside the loop; computing "the next" index took about
119 the same number of operations but without as much potential parallelism
120 (e.g., computing 5*j can go on at the same time as computing 1+perturb in the
121 above, and then shifting perturb can be done while the table index is being
122 masked); and the PyDictObject struct required a member to hold the table's
123 polynomial. In Tim's experiments the current scheme ran faster, produced
124 equally good collision statistics, needed less code & used less memory.
126 Theoretical Python 2.5 headache: hash codes are only C "long", but
127 sizeof(Py_ssize_t) > sizeof(long) may be possible. In that case, and if a
128 dict is genuinely huge, then only the slots directly reachable via indexing
129 by a C long can be the first slot in a probe sequence. The probe sequence
130 will still eventually reach every slot in the table, but the collision rate
131 on initial probes may be much higher than this scheme was designed for.
132 Getting a hash code as fat as Py_ssize_t is the only real cure. But in
133 practice, this probably won't make a lick of difference for many years (at
134 which point everyone will have terabytes of RAM on 64-bit boxes).
137 /* Object used as dummy key to fill deleted entries */
138 static PyObject *dummy = NULL; /* Initialized by first call to newPyDictObject() */
140 #ifdef Py_REF_DEBUG
141 PyObject *
142 _PyDict_Dummy(void)
144 return dummy;
146 #endif
148 /* forward declarations */
149 static PyDictEntry *
150 lookdict_string(PyDictObject *mp, PyObject *key, long hash);
152 #ifdef SHOW_CONVERSION_COUNTS
153 static long created = 0L;
154 static long converted = 0L;
156 static void
157 show_counts(void)
159 fprintf(stderr, "created %ld string dicts\n", created);
160 fprintf(stderr, "converted %ld to normal dicts\n", converted);
161 fprintf(stderr, "%.2f%% conversion rate\n", (100.0*converted)/created);
163 #endif
165 /* Debug statistic to compare allocations with reuse through the free list */
166 #undef SHOW_ALLOC_COUNT
167 #ifdef SHOW_ALLOC_COUNT
168 static size_t count_alloc = 0;
169 static size_t count_reuse = 0;
171 static void
172 show_alloc(void)
174 fprintf(stderr, "Dict allocations: %" PY_FORMAT_SIZE_T "d\n",
175 count_alloc);
176 fprintf(stderr, "Dict reuse through freelist: %" PY_FORMAT_SIZE_T
177 "d\n", count_reuse);
178 fprintf(stderr, "%.2f%% reuse rate\n\n",
179 (100.0*count_reuse/(count_alloc+count_reuse)));
181 #endif
183 /* Debug statistic to count GC tracking of dicts */
184 #ifdef SHOW_TRACK_COUNT
185 static Py_ssize_t count_untracked = 0;
186 static Py_ssize_t count_tracked = 0;
188 static void
189 show_track(void)
191 fprintf(stderr, "Dicts created: %" PY_FORMAT_SIZE_T "d\n",
192 count_tracked + count_untracked);
193 fprintf(stderr, "Dicts tracked by the GC: %" PY_FORMAT_SIZE_T
194 "d\n", count_tracked);
195 fprintf(stderr, "%.2f%% dict tracking rate\n\n",
196 (100.0*count_tracked/(count_untracked+count_tracked)));
198 #endif
201 /* Initialization macros.
202 There are two ways to create a dict: PyDict_New() is the main C API
203 function, and the tp_new slot maps to dict_new(). In the latter case we
204 can save a little time over what PyDict_New does because it's guaranteed
205 that the PyDictObject struct is already zeroed out.
206 Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
207 an excellent reason not to).
210 #define INIT_NONZERO_DICT_SLOTS(mp) do { \
211 (mp)->ma_table = (mp)->ma_smalltable; \
212 (mp)->ma_mask = PyDict_MINSIZE - 1; \
213 } while(0)
215 #define EMPTY_TO_MINSIZE(mp) do { \
216 memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable)); \
217 (mp)->ma_used = (mp)->ma_fill = 0; \
218 INIT_NONZERO_DICT_SLOTS(mp); \
219 } while(0)
221 /* Dictionary reuse scheme to save calls to malloc, free, and memset */
222 #ifndef PyDict_MAXFREELIST
223 #define PyDict_MAXFREELIST 80
224 #endif
225 static PyDictObject *free_list[PyDict_MAXFREELIST];
226 static int numfree = 0;
228 void
229 PyDict_Fini(void)
231 PyDictObject *op;
233 while (numfree) {
234 op = free_list[--numfree];
235 assert(PyDict_CheckExact(op));
236 PyObject_GC_Del(op);
240 PyObject *
241 PyDict_New(void)
243 register PyDictObject *mp;
244 if (dummy == NULL) { /* Auto-initialize dummy */
245 dummy = PyString_FromString("<dummy key>");
246 if (dummy == NULL)
247 return NULL;
248 #ifdef SHOW_CONVERSION_COUNTS
249 Py_AtExit(show_counts);
250 #endif
251 #ifdef SHOW_ALLOC_COUNT
252 Py_AtExit(show_alloc);
253 #endif
254 #ifdef SHOW_TRACK_COUNT
255 Py_AtExit(show_track);
256 #endif
258 if (numfree) {
259 mp = free_list[--numfree];
260 assert (mp != NULL);
261 assert (Py_TYPE(mp) == &PyDict_Type);
262 _Py_NewReference((PyObject *)mp);
263 if (mp->ma_fill) {
264 EMPTY_TO_MINSIZE(mp);
265 } else {
266 /* At least set ma_table and ma_mask; these are wrong
267 if an empty but presized dict is added to freelist */
268 INIT_NONZERO_DICT_SLOTS(mp);
270 assert (mp->ma_used == 0);
271 assert (mp->ma_table == mp->ma_smalltable);
272 assert (mp->ma_mask == PyDict_MINSIZE - 1);
273 #ifdef SHOW_ALLOC_COUNT
274 count_reuse++;
275 #endif
276 } else {
277 mp = PyObject_GC_New(PyDictObject, &PyDict_Type);
278 if (mp == NULL)
279 return NULL;
280 EMPTY_TO_MINSIZE(mp);
281 #ifdef SHOW_ALLOC_COUNT
282 count_alloc++;
283 #endif
285 mp->ma_lookup = lookdict_string;
286 #ifdef SHOW_TRACK_COUNT
287 count_untracked++;
288 #endif
289 #ifdef SHOW_CONVERSION_COUNTS
290 ++created;
291 #endif
292 return (PyObject *)mp;
296 The basic lookup function used by all operations.
297 This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
298 Open addressing is preferred over chaining since the link overhead for
299 chaining would be substantial (100% with typical malloc overhead).
301 The initial probe index is computed as hash mod the table size. Subsequent
302 probe indices are computed as explained earlier.
304 All arithmetic on hash should ignore overflow.
306 (The details in this version are due to Tim Peters, building on many past
307 contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
308 Christian Tismer).
310 lookdict() is general-purpose, and may return NULL if (and only if) a
311 comparison raises an exception (this was new in Python 2.5).
312 lookdict_string() below is specialized to string keys, comparison of which can
313 never raise an exception; that function can never return NULL. For both, when
314 the key isn't found a PyDictEntry* is returned for which the me_value field is
315 NULL; this is the slot in the dict at which the key would have been found, and
316 the caller can (if it wishes) add the <key, value> pair to the returned
317 PyDictEntry*.
319 static PyDictEntry *
320 lookdict(PyDictObject *mp, PyObject *key, register long hash)
322 register size_t i;
323 register size_t perturb;
324 register PyDictEntry *freeslot;
325 register size_t mask = (size_t)mp->ma_mask;
326 PyDictEntry *ep0 = mp->ma_table;
327 register PyDictEntry *ep;
328 register int cmp;
329 PyObject *startkey;
331 i = (size_t)hash & mask;
332 ep = &ep0[i];
333 if (ep->me_key == NULL || ep->me_key == key)
334 return ep;
336 if (ep->me_key == dummy)
337 freeslot = ep;
338 else {
339 if (ep->me_hash == hash) {
340 startkey = ep->me_key;
341 Py_INCREF(startkey);
342 cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
343 Py_DECREF(startkey);
344 if (cmp < 0)
345 return NULL;
346 if (ep0 == mp->ma_table && ep->me_key == startkey) {
347 if (cmp > 0)
348 return ep;
350 else {
351 /* The compare did major nasty stuff to the
352 * dict: start over.
353 * XXX A clever adversary could prevent this
354 * XXX from terminating.
356 return lookdict(mp, key, hash);
359 freeslot = NULL;
362 /* In the loop, me_key == dummy is by far (factor of 100s) the
363 least likely outcome, so test for that last. */
364 for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
365 i = (i << 2) + i + perturb + 1;
366 ep = &ep0[i & mask];
367 if (ep->me_key == NULL)
368 return freeslot == NULL ? ep : freeslot;
369 if (ep->me_key == key)
370 return ep;
371 if (ep->me_hash == hash && ep->me_key != dummy) {
372 startkey = ep->me_key;
373 Py_INCREF(startkey);
374 cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
375 Py_DECREF(startkey);
376 if (cmp < 0)
377 return NULL;
378 if (ep0 == mp->ma_table && ep->me_key == startkey) {
379 if (cmp > 0)
380 return ep;
382 else {
383 /* The compare did major nasty stuff to the
384 * dict: start over.
385 * XXX A clever adversary could prevent this
386 * XXX from terminating.
388 return lookdict(mp, key, hash);
391 else if (ep->me_key == dummy && freeslot == NULL)
392 freeslot = ep;
394 assert(0); /* NOT REACHED */
395 return 0;
399 * Hacked up version of lookdict which can assume keys are always strings;
400 * this assumption allows testing for errors during PyObject_RichCompareBool()
401 * to be dropped; string-string comparisons never raise exceptions. This also
402 * means we don't need to go through PyObject_RichCompareBool(); we can always
403 * use _PyString_Eq() directly.
405 * This is valuable because dicts with only string keys are very common.
407 static PyDictEntry *
408 lookdict_string(PyDictObject *mp, PyObject *key, register long hash)
410 register size_t i;
411 register size_t perturb;
412 register PyDictEntry *freeslot;
413 register size_t mask = (size_t)mp->ma_mask;
414 PyDictEntry *ep0 = mp->ma_table;
415 register PyDictEntry *ep;
417 /* Make sure this function doesn't have to handle non-string keys,
418 including subclasses of str; e.g., one reason to subclass
419 strings is to override __eq__, and for speed we don't cater to
420 that here. */
421 if (!PyString_CheckExact(key)) {
422 #ifdef SHOW_CONVERSION_COUNTS
423 ++converted;
424 #endif
425 mp->ma_lookup = lookdict;
426 return lookdict(mp, key, hash);
428 i = hash & mask;
429 ep = &ep0[i];
430 if (ep->me_key == NULL || ep->me_key == key)
431 return ep;
432 if (ep->me_key == dummy)
433 freeslot = ep;
434 else {
435 if (ep->me_hash == hash && _PyString_Eq(ep->me_key, key))
436 return ep;
437 freeslot = NULL;
440 /* In the loop, me_key == dummy is by far (factor of 100s) the
441 least likely outcome, so test for that last. */
442 for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
443 i = (i << 2) + i + perturb + 1;
444 ep = &ep0[i & mask];
445 if (ep->me_key == NULL)
446 return freeslot == NULL ? ep : freeslot;
447 if (ep->me_key == key
448 || (ep->me_hash == hash
449 && ep->me_key != dummy
450 && _PyString_Eq(ep->me_key, key)))
451 return ep;
452 if (ep->me_key == dummy && freeslot == NULL)
453 freeslot = ep;
455 assert(0); /* NOT REACHED */
456 return 0;
459 #ifdef SHOW_TRACK_COUNT
460 #define INCREASE_TRACK_COUNT \
461 (count_tracked++, count_untracked--);
462 #define DECREASE_TRACK_COUNT \
463 (count_tracked--, count_untracked++);
464 #else
465 #define INCREASE_TRACK_COUNT
466 #define DECREASE_TRACK_COUNT
467 #endif
469 #define MAINTAIN_TRACKING(mp, key, value) \
470 do { \
471 if (!_PyObject_GC_IS_TRACKED(mp)) { \
472 if (_PyObject_GC_MAY_BE_TRACKED(key) || \
473 _PyObject_GC_MAY_BE_TRACKED(value)) { \
474 _PyObject_GC_TRACK(mp); \
475 INCREASE_TRACK_COUNT \
478 } while(0)
480 void
481 _PyDict_MaybeUntrack(PyObject *op)
483 PyDictObject *mp;
484 PyObject *value;
485 Py_ssize_t mask, i;
486 PyDictEntry *ep;
488 if (!PyDict_CheckExact(op) || !_PyObject_GC_IS_TRACKED(op))
489 return;
491 mp = (PyDictObject *) op;
492 ep = mp->ma_table;
493 mask = mp->ma_mask;
494 for (i = 0; i <= mask; i++) {
495 if ((value = ep[i].me_value) == NULL)
496 continue;
497 if (_PyObject_GC_MAY_BE_TRACKED(value) ||
498 _PyObject_GC_MAY_BE_TRACKED(ep[i].me_key))
499 return;
501 DECREASE_TRACK_COUNT
502 _PyObject_GC_UNTRACK(op);
507 Internal routine to insert a new item into the table.
508 Used both by the internal resize routine and by the public insert routine.
509 Eats a reference to key and one to value.
510 Returns -1 if an error occurred, or 0 on success.
512 static int
513 insertdict(register PyDictObject *mp, PyObject *key, long hash, PyObject *value)
515 PyObject *old_value;
516 register PyDictEntry *ep;
517 typedef PyDictEntry *(*lookupfunc)(PyDictObject *, PyObject *, long);
519 assert(mp->ma_lookup != NULL);
520 ep = mp->ma_lookup(mp, key, hash);
521 if (ep == NULL) {
522 Py_DECREF(key);
523 Py_DECREF(value);
524 return -1;
526 MAINTAIN_TRACKING(mp, key, value);
527 if (ep->me_value != NULL) {
528 old_value = ep->me_value;
529 ep->me_value = value;
530 Py_DECREF(old_value); /* which **CAN** re-enter */
531 Py_DECREF(key);
533 else {
534 if (ep->me_key == NULL)
535 mp->ma_fill++;
536 else {
537 assert(ep->me_key == dummy);
538 Py_DECREF(dummy);
540 ep->me_key = key;
541 ep->me_hash = (Py_ssize_t)hash;
542 ep->me_value = value;
543 mp->ma_used++;
545 return 0;
549 Internal routine used by dictresize() to insert an item which is
550 known to be absent from the dict. This routine also assumes that
551 the dict contains no deleted entries. Besides the performance benefit,
552 using insertdict() in dictresize() is dangerous (SF bug #1456209).
553 Note that no refcounts are changed by this routine; if needed, the caller
554 is responsible for incref'ing `key` and `value`.
556 static void
557 insertdict_clean(register PyDictObject *mp, PyObject *key, long hash,
558 PyObject *value)
560 register size_t i;
561 register size_t perturb;
562 register size_t mask = (size_t)mp->ma_mask;
563 PyDictEntry *ep0 = mp->ma_table;
564 register PyDictEntry *ep;
566 MAINTAIN_TRACKING(mp, key, value);
567 i = hash & mask;
568 ep = &ep0[i];
569 for (perturb = hash; ep->me_key != NULL; perturb >>= PERTURB_SHIFT) {
570 i = (i << 2) + i + perturb + 1;
571 ep = &ep0[i & mask];
573 assert(ep->me_value == NULL);
574 mp->ma_fill++;
575 ep->me_key = key;
576 ep->me_hash = (Py_ssize_t)hash;
577 ep->me_value = value;
578 mp->ma_used++;
582 Restructure the table by allocating a new table and reinserting all
583 items again. When entries have been deleted, the new table may
584 actually be smaller than the old one.
586 static int
587 dictresize(PyDictObject *mp, Py_ssize_t minused)
589 Py_ssize_t newsize;
590 PyDictEntry *oldtable, *newtable, *ep;
591 Py_ssize_t i;
592 int is_oldtable_malloced;
593 PyDictEntry small_copy[PyDict_MINSIZE];
595 assert(minused >= 0);
597 /* Find the smallest table size > minused. */
598 for (newsize = PyDict_MINSIZE;
599 newsize <= minused && newsize > 0;
600 newsize <<= 1)
602 if (newsize <= 0) {
603 PyErr_NoMemory();
604 return -1;
607 /* Get space for a new table. */
608 oldtable = mp->ma_table;
609 assert(oldtable != NULL);
610 is_oldtable_malloced = oldtable != mp->ma_smalltable;
612 if (newsize == PyDict_MINSIZE) {
613 /* A large table is shrinking, or we can't get any smaller. */
614 newtable = mp->ma_smalltable;
615 if (newtable == oldtable) {
616 if (mp->ma_fill == mp->ma_used) {
617 /* No dummies, so no point doing anything. */
618 return 0;
620 /* We're not going to resize it, but rebuild the
621 table anyway to purge old dummy entries.
622 Subtle: This is *necessary* if fill==size,
623 as lookdict needs at least one virgin slot to
624 terminate failing searches. If fill < size, it's
625 merely desirable, as dummies slow searches. */
626 assert(mp->ma_fill > mp->ma_used);
627 memcpy(small_copy, oldtable, sizeof(small_copy));
628 oldtable = small_copy;
631 else {
632 newtable = PyMem_NEW(PyDictEntry, newsize);
633 if (newtable == NULL) {
634 PyErr_NoMemory();
635 return -1;
639 /* Make the dict empty, using the new table. */
640 assert(newtable != oldtable);
641 mp->ma_table = newtable;
642 mp->ma_mask = newsize - 1;
643 memset(newtable, 0, sizeof(PyDictEntry) * newsize);
644 mp->ma_used = 0;
645 i = mp->ma_fill;
646 mp->ma_fill = 0;
648 /* Copy the data over; this is refcount-neutral for active entries;
649 dummy entries aren't copied over, of course */
650 for (ep = oldtable; i > 0; ep++) {
651 if (ep->me_value != NULL) { /* active entry */
652 --i;
653 insertdict_clean(mp, ep->me_key, (long)ep->me_hash,
654 ep->me_value);
656 else if (ep->me_key != NULL) { /* dummy entry */
657 --i;
658 assert(ep->me_key == dummy);
659 Py_DECREF(ep->me_key);
661 /* else key == value == NULL: nothing to do */
664 if (is_oldtable_malloced)
665 PyMem_DEL(oldtable);
666 return 0;
669 /* Create a new dictionary pre-sized to hold an estimated number of elements.
670 Underestimates are okay because the dictionary will resize as necessary.
671 Overestimates just mean the dictionary will be more sparse than usual.
674 PyObject *
675 _PyDict_NewPresized(Py_ssize_t minused)
677 PyObject *op = PyDict_New();
679 if (minused>5 && op != NULL && dictresize((PyDictObject *)op, minused) == -1) {
680 Py_DECREF(op);
681 return NULL;
683 return op;
686 /* Note that, for historical reasons, PyDict_GetItem() suppresses all errors
687 * that may occur (originally dicts supported only string keys, and exceptions
688 * weren't possible). So, while the original intent was that a NULL return
689 * meant the key wasn't present, in reality it can mean that, or that an error
690 * (suppressed) occurred while computing the key's hash, or that some error
691 * (suppressed) occurred when comparing keys in the dict's internal probe
692 * sequence. A nasty example of the latter is when a Python-coded comparison
693 * function hits a stack-depth error, which can cause this to return NULL
694 * even if the key is present.
696 PyObject *
697 PyDict_GetItem(PyObject *op, PyObject *key)
699 long hash;
700 PyDictObject *mp = (PyDictObject *)op;
701 PyDictEntry *ep;
702 PyThreadState *tstate;
703 if (!PyDict_Check(op))
704 return NULL;
705 if (!PyString_CheckExact(key) ||
706 (hash = ((PyStringObject *) key)->ob_shash) == -1)
708 hash = PyObject_Hash(key);
709 if (hash == -1) {
710 PyErr_Clear();
711 return NULL;
715 /* We can arrive here with a NULL tstate during initialization:
716 try running "python -Wi" for an example related to string
717 interning. Let's just hope that no exception occurs then... */
718 tstate = PyThreadState_GET();
719 if (tstate != NULL && tstate->curexc_type != NULL) {
720 /* preserve the existing exception */
721 PyObject *err_type, *err_value, *err_tb;
722 PyErr_Fetch(&err_type, &err_value, &err_tb);
723 ep = (mp->ma_lookup)(mp, key, hash);
724 /* ignore errors */
725 PyErr_Restore(err_type, err_value, err_tb);
726 if (ep == NULL)
727 return NULL;
729 else {
730 ep = (mp->ma_lookup)(mp, key, hash);
731 if (ep == NULL) {
732 PyErr_Clear();
733 return NULL;
736 return ep->me_value;
739 /* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
740 * dictionary if it's merely replacing the value for an existing key.
741 * This means that it's safe to loop over a dictionary with PyDict_Next()
742 * and occasionally replace a value -- but you can't insert new keys or
743 * remove them.
746 PyDict_SetItem(register PyObject *op, PyObject *key, PyObject *value)
748 register PyDictObject *mp;
749 register long hash;
750 register Py_ssize_t n_used;
752 if (!PyDict_Check(op)) {
753 PyErr_BadInternalCall();
754 return -1;
756 assert(key);
757 assert(value);
758 mp = (PyDictObject *)op;
759 if (PyString_CheckExact(key)) {
760 hash = ((PyStringObject *)key)->ob_shash;
761 if (hash == -1)
762 hash = PyObject_Hash(key);
764 else {
765 hash = PyObject_Hash(key);
766 if (hash == -1)
767 return -1;
769 assert(mp->ma_fill <= mp->ma_mask); /* at least one empty slot */
770 n_used = mp->ma_used;
771 Py_INCREF(value);
772 Py_INCREF(key);
773 if (insertdict(mp, key, hash, value) != 0)
774 return -1;
775 /* If we added a key, we can safely resize. Otherwise just return!
776 * If fill >= 2/3 size, adjust size. Normally, this doubles or
777 * quaduples the size, but it's also possible for the dict to shrink
778 * (if ma_fill is much larger than ma_used, meaning a lot of dict
779 * keys have been * deleted).
781 * Quadrupling the size improves average dictionary sparseness
782 * (reducing collisions) at the cost of some memory and iteration
783 * speed (which loops over every possible entry). It also halves
784 * the number of expensive resize operations in a growing dictionary.
786 * Very large dictionaries (over 50K items) use doubling instead.
787 * This may help applications with severe memory constraints.
789 if (!(mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2))
790 return 0;
791 return dictresize(mp, (mp->ma_used > 50000 ? 2 : 4) * mp->ma_used);
795 PyDict_DelItem(PyObject *op, PyObject *key)
797 register PyDictObject *mp;
798 register long hash;
799 register PyDictEntry *ep;
800 PyObject *old_value, *old_key;
802 if (!PyDict_Check(op)) {
803 PyErr_BadInternalCall();
804 return -1;
806 assert(key);
807 if (!PyString_CheckExact(key) ||
808 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
809 hash = PyObject_Hash(key);
810 if (hash == -1)
811 return -1;
813 mp = (PyDictObject *)op;
814 ep = (mp->ma_lookup)(mp, key, hash);
815 if (ep == NULL)
816 return -1;
817 if (ep->me_value == NULL) {
818 set_key_error(key);
819 return -1;
821 old_key = ep->me_key;
822 Py_INCREF(dummy);
823 ep->me_key = dummy;
824 old_value = ep->me_value;
825 ep->me_value = NULL;
826 mp->ma_used--;
827 Py_DECREF(old_value);
828 Py_DECREF(old_key);
829 return 0;
832 void
833 PyDict_Clear(PyObject *op)
835 PyDictObject *mp;
836 PyDictEntry *ep, *table;
837 int table_is_malloced;
838 Py_ssize_t fill;
839 PyDictEntry small_copy[PyDict_MINSIZE];
840 #ifdef Py_DEBUG
841 Py_ssize_t i, n;
842 #endif
844 if (!PyDict_Check(op))
845 return;
846 mp = (PyDictObject *)op;
847 #ifdef Py_DEBUG
848 n = mp->ma_mask + 1;
849 i = 0;
850 #endif
852 table = mp->ma_table;
853 assert(table != NULL);
854 table_is_malloced = table != mp->ma_smalltable;
856 /* This is delicate. During the process of clearing the dict,
857 * decrefs can cause the dict to mutate. To avoid fatal confusion
858 * (voice of experience), we have to make the dict empty before
859 * clearing the slots, and never refer to anything via mp->xxx while
860 * clearing.
862 fill = mp->ma_fill;
863 if (table_is_malloced)
864 EMPTY_TO_MINSIZE(mp);
866 else if (fill > 0) {
867 /* It's a small table with something that needs to be cleared.
868 * Afraid the only safe way is to copy the dict entries into
869 * another small table first.
871 memcpy(small_copy, table, sizeof(small_copy));
872 table = small_copy;
873 EMPTY_TO_MINSIZE(mp);
875 /* else it's a small table that's already empty */
877 /* Now we can finally clear things. If C had refcounts, we could
878 * assert that the refcount on table is 1 now, i.e. that this function
879 * has unique access to it, so decref side-effects can't alter it.
881 for (ep = table; fill > 0; ++ep) {
882 #ifdef Py_DEBUG
883 assert(i < n);
884 ++i;
885 #endif
886 if (ep->me_key) {
887 --fill;
888 Py_DECREF(ep->me_key);
889 Py_XDECREF(ep->me_value);
891 #ifdef Py_DEBUG
892 else
893 assert(ep->me_value == NULL);
894 #endif
897 if (table_is_malloced)
898 PyMem_DEL(table);
902 * Iterate over a dict. Use like so:
904 * Py_ssize_t i;
905 * PyObject *key, *value;
906 * i = 0; # important! i should not otherwise be changed by you
907 * while (PyDict_Next(yourdict, &i, &key, &value)) {
908 * Refer to borrowed references in key and value.
911 * CAUTION: In general, it isn't safe to use PyDict_Next in a loop that
912 * mutates the dict. One exception: it is safe if the loop merely changes
913 * the values associated with the keys (but doesn't insert new keys or
914 * delete keys), via PyDict_SetItem().
917 PyDict_Next(PyObject *op, Py_ssize_t *ppos, PyObject **pkey, PyObject **pvalue)
919 register Py_ssize_t i;
920 register Py_ssize_t mask;
921 register PyDictEntry *ep;
923 if (!PyDict_Check(op))
924 return 0;
925 i = *ppos;
926 if (i < 0)
927 return 0;
928 ep = ((PyDictObject *)op)->ma_table;
929 mask = ((PyDictObject *)op)->ma_mask;
930 while (i <= mask && ep[i].me_value == NULL)
931 i++;
932 *ppos = i+1;
933 if (i > mask)
934 return 0;
935 if (pkey)
936 *pkey = ep[i].me_key;
937 if (pvalue)
938 *pvalue = ep[i].me_value;
939 return 1;
942 /* Internal version of PyDict_Next that returns a hash value in addition to the key and value.*/
944 _PyDict_Next(PyObject *op, Py_ssize_t *ppos, PyObject **pkey, PyObject **pvalue, long *phash)
946 register Py_ssize_t i;
947 register Py_ssize_t mask;
948 register PyDictEntry *ep;
950 if (!PyDict_Check(op))
951 return 0;
952 i = *ppos;
953 if (i < 0)
954 return 0;
955 ep = ((PyDictObject *)op)->ma_table;
956 mask = ((PyDictObject *)op)->ma_mask;
957 while (i <= mask && ep[i].me_value == NULL)
958 i++;
959 *ppos = i+1;
960 if (i > mask)
961 return 0;
962 *phash = (long)(ep[i].me_hash);
963 if (pkey)
964 *pkey = ep[i].me_key;
965 if (pvalue)
966 *pvalue = ep[i].me_value;
967 return 1;
970 /* Methods */
972 static void
973 dict_dealloc(register PyDictObject *mp)
975 register PyDictEntry *ep;
976 Py_ssize_t fill = mp->ma_fill;
977 PyObject_GC_UnTrack(mp);
978 Py_TRASHCAN_SAFE_BEGIN(mp)
979 for (ep = mp->ma_table; fill > 0; ep++) {
980 if (ep->me_key) {
981 --fill;
982 Py_DECREF(ep->me_key);
983 Py_XDECREF(ep->me_value);
986 if (mp->ma_table != mp->ma_smalltable)
987 PyMem_DEL(mp->ma_table);
988 if (numfree < PyDict_MAXFREELIST && Py_TYPE(mp) == &PyDict_Type)
989 free_list[numfree++] = mp;
990 else
991 Py_TYPE(mp)->tp_free((PyObject *)mp);
992 Py_TRASHCAN_SAFE_END(mp)
995 static int
996 dict_print(register PyDictObject *mp, register FILE *fp, register int flags)
998 register Py_ssize_t i;
999 register Py_ssize_t any;
1000 int status;
1002 status = Py_ReprEnter((PyObject*)mp);
1003 if (status != 0) {
1004 if (status < 0)
1005 return status;
1006 Py_BEGIN_ALLOW_THREADS
1007 fprintf(fp, "{...}");
1008 Py_END_ALLOW_THREADS
1009 return 0;
1012 Py_BEGIN_ALLOW_THREADS
1013 fprintf(fp, "{");
1014 Py_END_ALLOW_THREADS
1015 any = 0;
1016 for (i = 0; i <= mp->ma_mask; i++) {
1017 PyDictEntry *ep = mp->ma_table + i;
1018 PyObject *pvalue = ep->me_value;
1019 if (pvalue != NULL) {
1020 /* Prevent PyObject_Repr from deleting value during
1021 key format */
1022 Py_INCREF(pvalue);
1023 if (any++ > 0) {
1024 Py_BEGIN_ALLOW_THREADS
1025 fprintf(fp, ", ");
1026 Py_END_ALLOW_THREADS
1028 if (PyObject_Print((PyObject *)ep->me_key, fp, 0)!=0) {
1029 Py_DECREF(pvalue);
1030 Py_ReprLeave((PyObject*)mp);
1031 return -1;
1033 Py_BEGIN_ALLOW_THREADS
1034 fprintf(fp, ": ");
1035 Py_END_ALLOW_THREADS
1036 if (PyObject_Print(pvalue, fp, 0) != 0) {
1037 Py_DECREF(pvalue);
1038 Py_ReprLeave((PyObject*)mp);
1039 return -1;
1041 Py_DECREF(pvalue);
1044 Py_BEGIN_ALLOW_THREADS
1045 fprintf(fp, "}");
1046 Py_END_ALLOW_THREADS
1047 Py_ReprLeave((PyObject*)mp);
1048 return 0;
1051 static PyObject *
1052 dict_repr(PyDictObject *mp)
1054 Py_ssize_t i;
1055 PyObject *s, *temp, *colon = NULL;
1056 PyObject *pieces = NULL, *result = NULL;
1057 PyObject *key, *value;
1059 i = Py_ReprEnter((PyObject *)mp);
1060 if (i != 0) {
1061 return i > 0 ? PyString_FromString("{...}") : NULL;
1064 if (mp->ma_used == 0) {
1065 result = PyString_FromString("{}");
1066 goto Done;
1069 pieces = PyList_New(0);
1070 if (pieces == NULL)
1071 goto Done;
1073 colon = PyString_FromString(": ");
1074 if (colon == NULL)
1075 goto Done;
1077 /* Do repr() on each key+value pair, and insert ": " between them.
1078 Note that repr may mutate the dict. */
1079 i = 0;
1080 while (PyDict_Next((PyObject *)mp, &i, &key, &value)) {
1081 int status;
1082 /* Prevent repr from deleting value during key format. */
1083 Py_INCREF(value);
1084 s = PyObject_Repr(key);
1085 PyString_Concat(&s, colon);
1086 PyString_ConcatAndDel(&s, PyObject_Repr(value));
1087 Py_DECREF(value);
1088 if (s == NULL)
1089 goto Done;
1090 status = PyList_Append(pieces, s);
1091 Py_DECREF(s); /* append created a new ref */
1092 if (status < 0)
1093 goto Done;
1096 /* Add "{}" decorations to the first and last items. */
1097 assert(PyList_GET_SIZE(pieces) > 0);
1098 s = PyString_FromString("{");
1099 if (s == NULL)
1100 goto Done;
1101 temp = PyList_GET_ITEM(pieces, 0);
1102 PyString_ConcatAndDel(&s, temp);
1103 PyList_SET_ITEM(pieces, 0, s);
1104 if (s == NULL)
1105 goto Done;
1107 s = PyString_FromString("}");
1108 if (s == NULL)
1109 goto Done;
1110 temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1);
1111 PyString_ConcatAndDel(&temp, s);
1112 PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp);
1113 if (temp == NULL)
1114 goto Done;
1116 /* Paste them all together with ", " between. */
1117 s = PyString_FromString(", ");
1118 if (s == NULL)
1119 goto Done;
1120 result = _PyString_Join(s, pieces);
1121 Py_DECREF(s);
1123 Done:
1124 Py_XDECREF(pieces);
1125 Py_XDECREF(colon);
1126 Py_ReprLeave((PyObject *)mp);
1127 return result;
1130 static Py_ssize_t
1131 dict_length(PyDictObject *mp)
1133 return mp->ma_used;
1136 static PyObject *
1137 dict_subscript(PyDictObject *mp, register PyObject *key)
1139 PyObject *v;
1140 long hash;
1141 PyDictEntry *ep;
1142 assert(mp->ma_table != NULL);
1143 if (!PyString_CheckExact(key) ||
1144 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1145 hash = PyObject_Hash(key);
1146 if (hash == -1)
1147 return NULL;
1149 ep = (mp->ma_lookup)(mp, key, hash);
1150 if (ep == NULL)
1151 return NULL;
1152 v = ep->me_value;
1153 if (v == NULL) {
1154 if (!PyDict_CheckExact(mp)) {
1155 /* Look up __missing__ method if we're a subclass. */
1156 PyObject *missing, *res;
1157 static PyObject *missing_str = NULL;
1158 missing = _PyObject_LookupSpecial((PyObject *)mp,
1159 "__missing__",
1160 &missing_str);
1161 if (missing != NULL) {
1162 res = PyObject_CallFunctionObjArgs(missing,
1163 key, NULL);
1164 Py_DECREF(missing);
1165 return res;
1167 else if (PyErr_Occurred())
1168 return NULL;
1170 set_key_error(key);
1171 return NULL;
1173 else
1174 Py_INCREF(v);
1175 return v;
1178 static int
1179 dict_ass_sub(PyDictObject *mp, PyObject *v, PyObject *w)
1181 if (w == NULL)
1182 return PyDict_DelItem((PyObject *)mp, v);
1183 else
1184 return PyDict_SetItem((PyObject *)mp, v, w);
1187 static PyMappingMethods dict_as_mapping = {
1188 (lenfunc)dict_length, /*mp_length*/
1189 (binaryfunc)dict_subscript, /*mp_subscript*/
1190 (objobjargproc)dict_ass_sub, /*mp_ass_subscript*/
1193 static PyObject *
1194 dict_keys(register PyDictObject *mp)
1196 register PyObject *v;
1197 register Py_ssize_t i, j;
1198 PyDictEntry *ep;
1199 Py_ssize_t mask, n;
1201 again:
1202 n = mp->ma_used;
1203 v = PyList_New(n);
1204 if (v == NULL)
1205 return NULL;
1206 if (n != mp->ma_used) {
1207 /* Durnit. The allocations caused the dict to resize.
1208 * Just start over, this shouldn't normally happen.
1210 Py_DECREF(v);
1211 goto again;
1213 ep = mp->ma_table;
1214 mask = mp->ma_mask;
1215 for (i = 0, j = 0; i <= mask; i++) {
1216 if (ep[i].me_value != NULL) {
1217 PyObject *key = ep[i].me_key;
1218 Py_INCREF(key);
1219 PyList_SET_ITEM(v, j, key);
1220 j++;
1223 assert(j == n);
1224 return v;
1227 static PyObject *
1228 dict_values(register PyDictObject *mp)
1230 register PyObject *v;
1231 register Py_ssize_t i, j;
1232 PyDictEntry *ep;
1233 Py_ssize_t mask, n;
1235 again:
1236 n = mp->ma_used;
1237 v = PyList_New(n);
1238 if (v == NULL)
1239 return NULL;
1240 if (n != mp->ma_used) {
1241 /* Durnit. The allocations caused the dict to resize.
1242 * Just start over, this shouldn't normally happen.
1244 Py_DECREF(v);
1245 goto again;
1247 ep = mp->ma_table;
1248 mask = mp->ma_mask;
1249 for (i = 0, j = 0; i <= mask; i++) {
1250 if (ep[i].me_value != NULL) {
1251 PyObject *value = ep[i].me_value;
1252 Py_INCREF(value);
1253 PyList_SET_ITEM(v, j, value);
1254 j++;
1257 assert(j == n);
1258 return v;
1261 static PyObject *
1262 dict_items(register PyDictObject *mp)
1264 register PyObject *v;
1265 register Py_ssize_t i, j, n;
1266 Py_ssize_t mask;
1267 PyObject *item, *key, *value;
1268 PyDictEntry *ep;
1270 /* Preallocate the list of tuples, to avoid allocations during
1271 * the loop over the items, which could trigger GC, which
1272 * could resize the dict. :-(
1274 again:
1275 n = mp->ma_used;
1276 v = PyList_New(n);
1277 if (v == NULL)
1278 return NULL;
1279 for (i = 0; i < n; i++) {
1280 item = PyTuple_New(2);
1281 if (item == NULL) {
1282 Py_DECREF(v);
1283 return NULL;
1285 PyList_SET_ITEM(v, i, item);
1287 if (n != mp->ma_used) {
1288 /* Durnit. The allocations caused the dict to resize.
1289 * Just start over, this shouldn't normally happen.
1291 Py_DECREF(v);
1292 goto again;
1294 /* Nothing we do below makes any function calls. */
1295 ep = mp->ma_table;
1296 mask = mp->ma_mask;
1297 for (i = 0, j = 0; i <= mask; i++) {
1298 if ((value=ep[i].me_value) != NULL) {
1299 key = ep[i].me_key;
1300 item = PyList_GET_ITEM(v, j);
1301 Py_INCREF(key);
1302 PyTuple_SET_ITEM(item, 0, key);
1303 Py_INCREF(value);
1304 PyTuple_SET_ITEM(item, 1, value);
1305 j++;
1308 assert(j == n);
1309 return v;
1312 static PyObject *
1313 dict_fromkeys(PyObject *cls, PyObject *args)
1315 PyObject *seq;
1316 PyObject *value = Py_None;
1317 PyObject *it; /* iter(seq) */
1318 PyObject *key;
1319 PyObject *d;
1320 int status;
1322 if (!PyArg_UnpackTuple(args, "fromkeys", 1, 2, &seq, &value))
1323 return NULL;
1325 d = PyObject_CallObject(cls, NULL);
1326 if (d == NULL)
1327 return NULL;
1329 if (PyDict_CheckExact(d) && PyDict_CheckExact(seq)) {
1330 PyDictObject *mp = (PyDictObject *)d;
1331 PyObject *oldvalue;
1332 Py_ssize_t pos = 0;
1333 PyObject *key;
1334 long hash;
1336 if (dictresize(mp, Py_SIZE(seq)))
1337 return NULL;
1339 while (_PyDict_Next(seq, &pos, &key, &oldvalue, &hash)) {
1340 Py_INCREF(key);
1341 Py_INCREF(value);
1342 if (insertdict(mp, key, hash, value))
1343 return NULL;
1345 return d;
1348 if (PyDict_CheckExact(d) && PyAnySet_CheckExact(seq)) {
1349 PyDictObject *mp = (PyDictObject *)d;
1350 Py_ssize_t pos = 0;
1351 PyObject *key;
1352 long hash;
1354 if (dictresize(mp, PySet_GET_SIZE(seq)))
1355 return NULL;
1357 while (_PySet_NextEntry(seq, &pos, &key, &hash)) {
1358 Py_INCREF(key);
1359 Py_INCREF(value);
1360 if (insertdict(mp, key, hash, value))
1361 return NULL;
1363 return d;
1366 it = PyObject_GetIter(seq);
1367 if (it == NULL){
1368 Py_DECREF(d);
1369 return NULL;
1372 if (PyDict_CheckExact(d)) {
1373 while ((key = PyIter_Next(it)) != NULL) {
1374 status = PyDict_SetItem(d, key, value);
1375 Py_DECREF(key);
1376 if (status < 0)
1377 goto Fail;
1379 } else {
1380 while ((key = PyIter_Next(it)) != NULL) {
1381 status = PyObject_SetItem(d, key, value);
1382 Py_DECREF(key);
1383 if (status < 0)
1384 goto Fail;
1388 if (PyErr_Occurred())
1389 goto Fail;
1390 Py_DECREF(it);
1391 return d;
1393 Fail:
1394 Py_DECREF(it);
1395 Py_DECREF(d);
1396 return NULL;
1399 static int
1400 dict_update_common(PyObject *self, PyObject *args, PyObject *kwds, char *methname)
1402 PyObject *arg = NULL;
1403 int result = 0;
1405 if (!PyArg_UnpackTuple(args, methname, 0, 1, &arg))
1406 result = -1;
1408 else if (arg != NULL) {
1409 if (PyObject_HasAttrString(arg, "keys"))
1410 result = PyDict_Merge(self, arg, 1);
1411 else
1412 result = PyDict_MergeFromSeq2(self, arg, 1);
1414 if (result == 0 && kwds != NULL)
1415 result = PyDict_Merge(self, kwds, 1);
1416 return result;
1419 static PyObject *
1420 dict_update(PyObject *self, PyObject *args, PyObject *kwds)
1422 if (dict_update_common(self, args, kwds, "update") != -1)
1423 Py_RETURN_NONE;
1424 return NULL;
1427 /* Update unconditionally replaces existing items.
1428 Merge has a 3rd argument 'override'; if set, it acts like Update,
1429 otherwise it leaves existing items unchanged.
1431 PyDict_{Update,Merge} update/merge from a mapping object.
1433 PyDict_MergeFromSeq2 updates/merges from any iterable object
1434 producing iterable objects of length 2.
1438 PyDict_MergeFromSeq2(PyObject *d, PyObject *seq2, int override)
1440 PyObject *it; /* iter(seq2) */
1441 Py_ssize_t i; /* index into seq2 of current element */
1442 PyObject *item; /* seq2[i] */
1443 PyObject *fast; /* item as a 2-tuple or 2-list */
1445 assert(d != NULL);
1446 assert(PyDict_Check(d));
1447 assert(seq2 != NULL);
1449 it = PyObject_GetIter(seq2);
1450 if (it == NULL)
1451 return -1;
1453 for (i = 0; ; ++i) {
1454 PyObject *key, *value;
1455 Py_ssize_t n;
1457 fast = NULL;
1458 item = PyIter_Next(it);
1459 if (item == NULL) {
1460 if (PyErr_Occurred())
1461 goto Fail;
1462 break;
1465 /* Convert item to sequence, and verify length 2. */
1466 fast = PySequence_Fast(item, "");
1467 if (fast == NULL) {
1468 if (PyErr_ExceptionMatches(PyExc_TypeError))
1469 PyErr_Format(PyExc_TypeError,
1470 "cannot convert dictionary update "
1471 "sequence element #%zd to a sequence",
1473 goto Fail;
1475 n = PySequence_Fast_GET_SIZE(fast);
1476 if (n != 2) {
1477 PyErr_Format(PyExc_ValueError,
1478 "dictionary update sequence element #%zd "
1479 "has length %zd; 2 is required",
1480 i, n);
1481 goto Fail;
1484 /* Update/merge with this (key, value) pair. */
1485 key = PySequence_Fast_GET_ITEM(fast, 0);
1486 value = PySequence_Fast_GET_ITEM(fast, 1);
1487 if (override || PyDict_GetItem(d, key) == NULL) {
1488 int status = PyDict_SetItem(d, key, value);
1489 if (status < 0)
1490 goto Fail;
1492 Py_DECREF(fast);
1493 Py_DECREF(item);
1496 i = 0;
1497 goto Return;
1498 Fail:
1499 Py_XDECREF(item);
1500 Py_XDECREF(fast);
1501 i = -1;
1502 Return:
1503 Py_DECREF(it);
1504 return Py_SAFE_DOWNCAST(i, Py_ssize_t, int);
1508 PyDict_Update(PyObject *a, PyObject *b)
1510 return PyDict_Merge(a, b, 1);
1514 PyDict_Merge(PyObject *a, PyObject *b, int override)
1516 register PyDictObject *mp, *other;
1517 register Py_ssize_t i;
1518 PyDictEntry *entry;
1520 /* We accept for the argument either a concrete dictionary object,
1521 * or an abstract "mapping" object. For the former, we can do
1522 * things quite efficiently. For the latter, we only require that
1523 * PyMapping_Keys() and PyObject_GetItem() be supported.
1525 if (a == NULL || !PyDict_Check(a) || b == NULL) {
1526 PyErr_BadInternalCall();
1527 return -1;
1529 mp = (PyDictObject*)a;
1530 if (PyDict_Check(b)) {
1531 other = (PyDictObject*)b;
1532 if (other == mp || other->ma_used == 0)
1533 /* a.update(a) or a.update({}); nothing to do */
1534 return 0;
1535 if (mp->ma_used == 0)
1536 /* Since the target dict is empty, PyDict_GetItem()
1537 * always returns NULL. Setting override to 1
1538 * skips the unnecessary test.
1540 override = 1;
1541 /* Do one big resize at the start, rather than
1542 * incrementally resizing as we insert new items. Expect
1543 * that there will be no (or few) overlapping keys.
1545 if ((mp->ma_fill + other->ma_used)*3 >= (mp->ma_mask+1)*2) {
1546 if (dictresize(mp, (mp->ma_used + other->ma_used)*2) != 0)
1547 return -1;
1549 for (i = 0; i <= other->ma_mask; i++) {
1550 entry = &other->ma_table[i];
1551 if (entry->me_value != NULL &&
1552 (override ||
1553 PyDict_GetItem(a, entry->me_key) == NULL)) {
1554 Py_INCREF(entry->me_key);
1555 Py_INCREF(entry->me_value);
1556 if (insertdict(mp, entry->me_key,
1557 (long)entry->me_hash,
1558 entry->me_value) != 0)
1559 return -1;
1563 else {
1564 /* Do it the generic, slower way */
1565 PyObject *keys = PyMapping_Keys(b);
1566 PyObject *iter;
1567 PyObject *key, *value;
1568 int status;
1570 if (keys == NULL)
1571 /* Docstring says this is equivalent to E.keys() so
1572 * if E doesn't have a .keys() method we want
1573 * AttributeError to percolate up. Might as well
1574 * do the same for any other error.
1576 return -1;
1578 iter = PyObject_GetIter(keys);
1579 Py_DECREF(keys);
1580 if (iter == NULL)
1581 return -1;
1583 for (key = PyIter_Next(iter); key; key = PyIter_Next(iter)) {
1584 if (!override && PyDict_GetItem(a, key) != NULL) {
1585 Py_DECREF(key);
1586 continue;
1588 value = PyObject_GetItem(b, key);
1589 if (value == NULL) {
1590 Py_DECREF(iter);
1591 Py_DECREF(key);
1592 return -1;
1594 status = PyDict_SetItem(a, key, value);
1595 Py_DECREF(key);
1596 Py_DECREF(value);
1597 if (status < 0) {
1598 Py_DECREF(iter);
1599 return -1;
1602 Py_DECREF(iter);
1603 if (PyErr_Occurred())
1604 /* Iterator completed, via error */
1605 return -1;
1607 return 0;
1610 static PyObject *
1611 dict_copy(register PyDictObject *mp)
1613 return PyDict_Copy((PyObject*)mp);
1616 PyObject *
1617 PyDict_Copy(PyObject *o)
1619 PyObject *copy;
1621 if (o == NULL || !PyDict_Check(o)) {
1622 PyErr_BadInternalCall();
1623 return NULL;
1625 copy = PyDict_New();
1626 if (copy == NULL)
1627 return NULL;
1628 if (PyDict_Merge(copy, o, 1) == 0)
1629 return copy;
1630 Py_DECREF(copy);
1631 return NULL;
1634 Py_ssize_t
1635 PyDict_Size(PyObject *mp)
1637 if (mp == NULL || !PyDict_Check(mp)) {
1638 PyErr_BadInternalCall();
1639 return -1;
1641 return ((PyDictObject *)mp)->ma_used;
1644 PyObject *
1645 PyDict_Keys(PyObject *mp)
1647 if (mp == NULL || !PyDict_Check(mp)) {
1648 PyErr_BadInternalCall();
1649 return NULL;
1651 return dict_keys((PyDictObject *)mp);
1654 PyObject *
1655 PyDict_Values(PyObject *mp)
1657 if (mp == NULL || !PyDict_Check(mp)) {
1658 PyErr_BadInternalCall();
1659 return NULL;
1661 return dict_values((PyDictObject *)mp);
1664 PyObject *
1665 PyDict_Items(PyObject *mp)
1667 if (mp == NULL || !PyDict_Check(mp)) {
1668 PyErr_BadInternalCall();
1669 return NULL;
1671 return dict_items((PyDictObject *)mp);
1674 /* Subroutine which returns the smallest key in a for which b's value
1675 is different or absent. The value is returned too, through the
1676 pval argument. Both are NULL if no key in a is found for which b's status
1677 differs. The refcounts on (and only on) non-NULL *pval and function return
1678 values must be decremented by the caller (characterize() increments them
1679 to ensure that mutating comparison and PyDict_GetItem calls can't delete
1680 them before the caller is done looking at them). */
1682 static PyObject *
1683 characterize(PyDictObject *a, PyDictObject *b, PyObject **pval)
1685 PyObject *akey = NULL; /* smallest key in a s.t. a[akey] != b[akey] */
1686 PyObject *aval = NULL; /* a[akey] */
1687 Py_ssize_t i;
1688 int cmp;
1690 for (i = 0; i <= a->ma_mask; i++) {
1691 PyObject *thiskey, *thisaval, *thisbval;
1692 if (a->ma_table[i].me_value == NULL)
1693 continue;
1694 thiskey = a->ma_table[i].me_key;
1695 Py_INCREF(thiskey); /* keep alive across compares */
1696 if (akey != NULL) {
1697 cmp = PyObject_RichCompareBool(akey, thiskey, Py_LT);
1698 if (cmp < 0) {
1699 Py_DECREF(thiskey);
1700 goto Fail;
1702 if (cmp > 0 ||
1703 i > a->ma_mask ||
1704 a->ma_table[i].me_value == NULL)
1706 /* Not the *smallest* a key; or maybe it is
1707 * but the compare shrunk the dict so we can't
1708 * find its associated value anymore; or
1709 * maybe it is but the compare deleted the
1710 * a[thiskey] entry.
1712 Py_DECREF(thiskey);
1713 continue;
1717 /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
1718 thisaval = a->ma_table[i].me_value;
1719 assert(thisaval);
1720 Py_INCREF(thisaval); /* keep alive */
1721 thisbval = PyDict_GetItem((PyObject *)b, thiskey);
1722 if (thisbval == NULL)
1723 cmp = 0;
1724 else {
1725 /* both dicts have thiskey: same values? */
1726 cmp = PyObject_RichCompareBool(
1727 thisaval, thisbval, Py_EQ);
1728 if (cmp < 0) {
1729 Py_DECREF(thiskey);
1730 Py_DECREF(thisaval);
1731 goto Fail;
1734 if (cmp == 0) {
1735 /* New winner. */
1736 Py_XDECREF(akey);
1737 Py_XDECREF(aval);
1738 akey = thiskey;
1739 aval = thisaval;
1741 else {
1742 Py_DECREF(thiskey);
1743 Py_DECREF(thisaval);
1746 *pval = aval;
1747 return akey;
1749 Fail:
1750 Py_XDECREF(akey);
1751 Py_XDECREF(aval);
1752 *pval = NULL;
1753 return NULL;
1756 static int
1757 dict_compare(PyDictObject *a, PyDictObject *b)
1759 PyObject *adiff, *bdiff, *aval, *bval;
1760 int res;
1762 /* Compare lengths first */
1763 if (a->ma_used < b->ma_used)
1764 return -1; /* a is shorter */
1765 else if (a->ma_used > b->ma_used)
1766 return 1; /* b is shorter */
1768 /* Same length -- check all keys */
1769 bdiff = bval = NULL;
1770 adiff = characterize(a, b, &aval);
1771 if (adiff == NULL) {
1772 assert(!aval);
1773 /* Either an error, or a is a subset with the same length so
1774 * must be equal.
1776 res = PyErr_Occurred() ? -1 : 0;
1777 goto Finished;
1779 bdiff = characterize(b, a, &bval);
1780 if (bdiff == NULL && PyErr_Occurred()) {
1781 assert(!bval);
1782 res = -1;
1783 goto Finished;
1785 res = 0;
1786 if (bdiff) {
1787 /* bdiff == NULL "should be" impossible now, but perhaps
1788 * the last comparison done by the characterize() on a had
1789 * the side effect of making the dicts equal!
1791 res = PyObject_Compare(adiff, bdiff);
1793 if (res == 0 && bval != NULL)
1794 res = PyObject_Compare(aval, bval);
1796 Finished:
1797 Py_XDECREF(adiff);
1798 Py_XDECREF(bdiff);
1799 Py_XDECREF(aval);
1800 Py_XDECREF(bval);
1801 return res;
1804 /* Return 1 if dicts equal, 0 if not, -1 if error.
1805 * Gets out as soon as any difference is detected.
1806 * Uses only Py_EQ comparison.
1808 static int
1809 dict_equal(PyDictObject *a, PyDictObject *b)
1811 Py_ssize_t i;
1813 if (a->ma_used != b->ma_used)
1814 /* can't be equal if # of entries differ */
1815 return 0;
1817 /* Same # of entries -- check all of 'em. Exit early on any diff. */
1818 for (i = 0; i <= a->ma_mask; i++) {
1819 PyObject *aval = a->ma_table[i].me_value;
1820 if (aval != NULL) {
1821 int cmp;
1822 PyObject *bval;
1823 PyObject *key = a->ma_table[i].me_key;
1824 /* temporarily bump aval's refcount to ensure it stays
1825 alive until we're done with it */
1826 Py_INCREF(aval);
1827 /* ditto for key */
1828 Py_INCREF(key);
1829 bval = PyDict_GetItem((PyObject *)b, key);
1830 Py_DECREF(key);
1831 if (bval == NULL) {
1832 Py_DECREF(aval);
1833 return 0;
1835 cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);
1836 Py_DECREF(aval);
1837 if (cmp <= 0) /* error or not equal */
1838 return cmp;
1841 return 1;
1844 static PyObject *
1845 dict_richcompare(PyObject *v, PyObject *w, int op)
1847 int cmp;
1848 PyObject *res;
1850 if (!PyDict_Check(v) || !PyDict_Check(w)) {
1851 res = Py_NotImplemented;
1853 else if (op == Py_EQ || op == Py_NE) {
1854 cmp = dict_equal((PyDictObject *)v, (PyDictObject *)w);
1855 if (cmp < 0)
1856 return NULL;
1857 res = (cmp == (op == Py_EQ)) ? Py_True : Py_False;
1859 else {
1860 /* Py3K warning if comparison isn't == or != */
1861 if (PyErr_WarnPy3k("dict inequality comparisons not supported "
1862 "in 3.x", 1) < 0) {
1863 return NULL;
1865 res = Py_NotImplemented;
1867 Py_INCREF(res);
1868 return res;
1871 static PyObject *
1872 dict_contains(register PyDictObject *mp, PyObject *key)
1874 long hash;
1875 PyDictEntry *ep;
1877 if (!PyString_CheckExact(key) ||
1878 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1879 hash = PyObject_Hash(key);
1880 if (hash == -1)
1881 return NULL;
1883 ep = (mp->ma_lookup)(mp, key, hash);
1884 if (ep == NULL)
1885 return NULL;
1886 return PyBool_FromLong(ep->me_value != NULL);
1889 static PyObject *
1890 dict_has_key(register PyDictObject *mp, PyObject *key)
1892 if (PyErr_WarnPy3k("dict.has_key() not supported in 3.x; "
1893 "use the in operator", 1) < 0)
1894 return NULL;
1895 return dict_contains(mp, key);
1898 static PyObject *
1899 dict_get(register PyDictObject *mp, PyObject *args)
1901 PyObject *key;
1902 PyObject *failobj = Py_None;
1903 PyObject *val = NULL;
1904 long hash;
1905 PyDictEntry *ep;
1907 if (!PyArg_UnpackTuple(args, "get", 1, 2, &key, &failobj))
1908 return NULL;
1910 if (!PyString_CheckExact(key) ||
1911 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1912 hash = PyObject_Hash(key);
1913 if (hash == -1)
1914 return NULL;
1916 ep = (mp->ma_lookup)(mp, key, hash);
1917 if (ep == NULL)
1918 return NULL;
1919 val = ep->me_value;
1920 if (val == NULL)
1921 val = failobj;
1922 Py_INCREF(val);
1923 return val;
1927 static PyObject *
1928 dict_setdefault(register PyDictObject *mp, PyObject *args)
1930 PyObject *key;
1931 PyObject *failobj = Py_None;
1932 PyObject *val = NULL;
1933 long hash;
1934 PyDictEntry *ep;
1936 if (!PyArg_UnpackTuple(args, "setdefault", 1, 2, &key, &failobj))
1937 return NULL;
1939 if (!PyString_CheckExact(key) ||
1940 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1941 hash = PyObject_Hash(key);
1942 if (hash == -1)
1943 return NULL;
1945 ep = (mp->ma_lookup)(mp, key, hash);
1946 if (ep == NULL)
1947 return NULL;
1948 val = ep->me_value;
1949 if (val == NULL) {
1950 val = failobj;
1951 if (PyDict_SetItem((PyObject*)mp, key, failobj))
1952 val = NULL;
1954 Py_XINCREF(val);
1955 return val;
1959 static PyObject *
1960 dict_clear(register PyDictObject *mp)
1962 PyDict_Clear((PyObject *)mp);
1963 Py_RETURN_NONE;
1966 static PyObject *
1967 dict_pop(PyDictObject *mp, PyObject *args)
1969 long hash;
1970 PyDictEntry *ep;
1971 PyObject *old_value, *old_key;
1972 PyObject *key, *deflt = NULL;
1974 if(!PyArg_UnpackTuple(args, "pop", 1, 2, &key, &deflt))
1975 return NULL;
1976 if (mp->ma_used == 0) {
1977 if (deflt) {
1978 Py_INCREF(deflt);
1979 return deflt;
1981 PyErr_SetString(PyExc_KeyError,
1982 "pop(): dictionary is empty");
1983 return NULL;
1985 if (!PyString_CheckExact(key) ||
1986 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1987 hash = PyObject_Hash(key);
1988 if (hash == -1)
1989 return NULL;
1991 ep = (mp->ma_lookup)(mp, key, hash);
1992 if (ep == NULL)
1993 return NULL;
1994 if (ep->me_value == NULL) {
1995 if (deflt) {
1996 Py_INCREF(deflt);
1997 return deflt;
1999 set_key_error(key);
2000 return NULL;
2002 old_key = ep->me_key;
2003 Py_INCREF(dummy);
2004 ep->me_key = dummy;
2005 old_value = ep->me_value;
2006 ep->me_value = NULL;
2007 mp->ma_used--;
2008 Py_DECREF(old_key);
2009 return old_value;
2012 static PyObject *
2013 dict_popitem(PyDictObject *mp)
2015 Py_ssize_t i = 0;
2016 PyDictEntry *ep;
2017 PyObject *res;
2019 /* Allocate the result tuple before checking the size. Believe it
2020 * or not, this allocation could trigger a garbage collection which
2021 * could empty the dict, so if we checked the size first and that
2022 * happened, the result would be an infinite loop (searching for an
2023 * entry that no longer exists). Note that the usual popitem()
2024 * idiom is "while d: k, v = d.popitem()". so needing to throw the
2025 * tuple away if the dict *is* empty isn't a significant
2026 * inefficiency -- possible, but unlikely in practice.
2028 res = PyTuple_New(2);
2029 if (res == NULL)
2030 return NULL;
2031 if (mp->ma_used == 0) {
2032 Py_DECREF(res);
2033 PyErr_SetString(PyExc_KeyError,
2034 "popitem(): dictionary is empty");
2035 return NULL;
2037 /* Set ep to "the first" dict entry with a value. We abuse the hash
2038 * field of slot 0 to hold a search finger:
2039 * If slot 0 has a value, use slot 0.
2040 * Else slot 0 is being used to hold a search finger,
2041 * and we use its hash value as the first index to look.
2043 ep = &mp->ma_table[0];
2044 if (ep->me_value == NULL) {
2045 i = ep->me_hash;
2046 /* The hash field may be a real hash value, or it may be a
2047 * legit search finger, or it may be a once-legit search
2048 * finger that's out of bounds now because it wrapped around
2049 * or the table shrunk -- simply make sure it's in bounds now.
2051 if (i > mp->ma_mask || i < 1)
2052 i = 1; /* skip slot 0 */
2053 while ((ep = &mp->ma_table[i])->me_value == NULL) {
2054 i++;
2055 if (i > mp->ma_mask)
2056 i = 1;
2059 PyTuple_SET_ITEM(res, 0, ep->me_key);
2060 PyTuple_SET_ITEM(res, 1, ep->me_value);
2061 Py_INCREF(dummy);
2062 ep->me_key = dummy;
2063 ep->me_value = NULL;
2064 mp->ma_used--;
2065 assert(mp->ma_table[0].me_value == NULL);
2066 mp->ma_table[0].me_hash = i + 1; /* next place to start */
2067 return res;
2070 static int
2071 dict_traverse(PyObject *op, visitproc visit, void *arg)
2073 Py_ssize_t i = 0;
2074 PyObject *pk;
2075 PyObject *pv;
2077 while (PyDict_Next(op, &i, &pk, &pv)) {
2078 Py_VISIT(pk);
2079 Py_VISIT(pv);
2081 return 0;
2084 static int
2085 dict_tp_clear(PyObject *op)
2087 PyDict_Clear(op);
2088 return 0;
2092 extern PyTypeObject PyDictIterKey_Type; /* Forward */
2093 extern PyTypeObject PyDictIterValue_Type; /* Forward */
2094 extern PyTypeObject PyDictIterItem_Type; /* Forward */
2095 static PyObject *dictiter_new(PyDictObject *, PyTypeObject *);
2097 static PyObject *
2098 dict_iterkeys(PyDictObject *dict)
2100 return dictiter_new(dict, &PyDictIterKey_Type);
2103 static PyObject *
2104 dict_itervalues(PyDictObject *dict)
2106 return dictiter_new(dict, &PyDictIterValue_Type);
2109 static PyObject *
2110 dict_iteritems(PyDictObject *dict)
2112 return dictiter_new(dict, &PyDictIterItem_Type);
2115 static PyObject *
2116 dict_sizeof(PyDictObject *mp)
2118 Py_ssize_t res;
2120 res = sizeof(PyDictObject);
2121 if (mp->ma_table != mp->ma_smalltable)
2122 res = res + (mp->ma_mask + 1) * sizeof(PyDictEntry);
2123 return PyInt_FromSsize_t(res);
2126 PyDoc_STRVAR(has_key__doc__,
2127 "D.has_key(k) -> True if D has a key k, else False");
2129 PyDoc_STRVAR(contains__doc__,
2130 "D.__contains__(k) -> True if D has a key k, else False");
2132 PyDoc_STRVAR(getitem__doc__, "x.__getitem__(y) <==> x[y]");
2134 PyDoc_STRVAR(sizeof__doc__,
2135 "D.__sizeof__() -> size of D in memory, in bytes");
2137 PyDoc_STRVAR(get__doc__,
2138 "D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.");
2140 PyDoc_STRVAR(setdefault_doc__,
2141 "D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D");
2143 PyDoc_STRVAR(pop__doc__,
2144 "D.pop(k[,d]) -> v, remove specified key and return the corresponding value.\n\
2145 If key is not found, d is returned if given, otherwise KeyError is raised");
2147 PyDoc_STRVAR(popitem__doc__,
2148 "D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
2149 2-tuple; but raise KeyError if D is empty.");
2151 PyDoc_STRVAR(keys__doc__,
2152 "D.keys() -> list of D's keys");
2154 PyDoc_STRVAR(items__doc__,
2155 "D.items() -> list of D's (key, value) pairs, as 2-tuples");
2157 PyDoc_STRVAR(values__doc__,
2158 "D.values() -> list of D's values");
2160 PyDoc_STRVAR(update__doc__,
2161 "D.update(E, **F) -> None. Update D from dict/iterable E and F.\n"
2162 "If E has a .keys() method, does: for k in E: D[k] = E[k]\n\
2163 If E lacks .keys() method, does: for (k, v) in E: D[k] = v\n\
2164 In either case, this is followed by: for k in F: D[k] = F[k]");
2166 PyDoc_STRVAR(fromkeys__doc__,
2167 "dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v.\n\
2168 v defaults to None.");
2170 PyDoc_STRVAR(clear__doc__,
2171 "D.clear() -> None. Remove all items from D.");
2173 PyDoc_STRVAR(copy__doc__,
2174 "D.copy() -> a shallow copy of D");
2176 PyDoc_STRVAR(iterkeys__doc__,
2177 "D.iterkeys() -> an iterator over the keys of D");
2179 PyDoc_STRVAR(itervalues__doc__,
2180 "D.itervalues() -> an iterator over the values of D");
2182 PyDoc_STRVAR(iteritems__doc__,
2183 "D.iteritems() -> an iterator over the (key, value) items of D");
2185 static PyMethodDef mapp_methods[] = {
2186 {"__contains__",(PyCFunction)dict_contains, METH_O | METH_COEXIST,
2187 contains__doc__},
2188 {"__getitem__", (PyCFunction)dict_subscript, METH_O | METH_COEXIST,
2189 getitem__doc__},
2190 {"__sizeof__", (PyCFunction)dict_sizeof, METH_NOARGS,
2191 sizeof__doc__},
2192 {"has_key", (PyCFunction)dict_has_key, METH_O,
2193 has_key__doc__},
2194 {"get", (PyCFunction)dict_get, METH_VARARGS,
2195 get__doc__},
2196 {"setdefault", (PyCFunction)dict_setdefault, METH_VARARGS,
2197 setdefault_doc__},
2198 {"pop", (PyCFunction)dict_pop, METH_VARARGS,
2199 pop__doc__},
2200 {"popitem", (PyCFunction)dict_popitem, METH_NOARGS,
2201 popitem__doc__},
2202 {"keys", (PyCFunction)dict_keys, METH_NOARGS,
2203 keys__doc__},
2204 {"items", (PyCFunction)dict_items, METH_NOARGS,
2205 items__doc__},
2206 {"values", (PyCFunction)dict_values, METH_NOARGS,
2207 values__doc__},
2208 {"update", (PyCFunction)dict_update, METH_VARARGS | METH_KEYWORDS,
2209 update__doc__},
2210 {"fromkeys", (PyCFunction)dict_fromkeys, METH_VARARGS | METH_CLASS,
2211 fromkeys__doc__},
2212 {"clear", (PyCFunction)dict_clear, METH_NOARGS,
2213 clear__doc__},
2214 {"copy", (PyCFunction)dict_copy, METH_NOARGS,
2215 copy__doc__},
2216 {"iterkeys", (PyCFunction)dict_iterkeys, METH_NOARGS,
2217 iterkeys__doc__},
2218 {"itervalues", (PyCFunction)dict_itervalues, METH_NOARGS,
2219 itervalues__doc__},
2220 {"iteritems", (PyCFunction)dict_iteritems, METH_NOARGS,
2221 iteritems__doc__},
2222 {NULL, NULL} /* sentinel */
2225 /* Return 1 if `key` is in dict `op`, 0 if not, and -1 on error. */
2227 PyDict_Contains(PyObject *op, PyObject *key)
2229 long hash;
2230 PyDictObject *mp = (PyDictObject *)op;
2231 PyDictEntry *ep;
2233 if (!PyString_CheckExact(key) ||
2234 (hash = ((PyStringObject *) key)->ob_shash) == -1) {
2235 hash = PyObject_Hash(key);
2236 if (hash == -1)
2237 return -1;
2239 ep = (mp->ma_lookup)(mp, key, hash);
2240 return ep == NULL ? -1 : (ep->me_value != NULL);
2243 /* Internal version of PyDict_Contains used when the hash value is already known */
2245 _PyDict_Contains(PyObject *op, PyObject *key, long hash)
2247 PyDictObject *mp = (PyDictObject *)op;
2248 PyDictEntry *ep;
2250 ep = (mp->ma_lookup)(mp, key, hash);
2251 return ep == NULL ? -1 : (ep->me_value != NULL);
2254 /* Hack to implement "key in dict" */
2255 static PySequenceMethods dict_as_sequence = {
2256 0, /* sq_length */
2257 0, /* sq_concat */
2258 0, /* sq_repeat */
2259 0, /* sq_item */
2260 0, /* sq_slice */
2261 0, /* sq_ass_item */
2262 0, /* sq_ass_slice */
2263 PyDict_Contains, /* sq_contains */
2264 0, /* sq_inplace_concat */
2265 0, /* sq_inplace_repeat */
2268 static PyObject *
2269 dict_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
2271 PyObject *self;
2273 assert(type != NULL && type->tp_alloc != NULL);
2274 self = type->tp_alloc(type, 0);
2275 if (self != NULL) {
2276 PyDictObject *d = (PyDictObject *)self;
2277 /* It's guaranteed that tp->alloc zeroed out the struct. */
2278 assert(d->ma_table == NULL && d->ma_fill == 0 && d->ma_used == 0);
2279 INIT_NONZERO_DICT_SLOTS(d);
2280 d->ma_lookup = lookdict_string;
2281 /* The object has been implicitely tracked by tp_alloc */
2282 if (type == &PyDict_Type)
2283 _PyObject_GC_UNTRACK(d);
2284 #ifdef SHOW_CONVERSION_COUNTS
2285 ++created;
2286 #endif
2287 #ifdef SHOW_TRACK_COUNT
2288 if (_PyObject_GC_IS_TRACKED(d))
2289 count_tracked++;
2290 else
2291 count_untracked++;
2292 #endif
2294 return self;
2297 static int
2298 dict_init(PyObject *self, PyObject *args, PyObject *kwds)
2300 return dict_update_common(self, args, kwds, "dict");
2303 static PyObject *
2304 dict_iter(PyDictObject *dict)
2306 return dictiter_new(dict, &PyDictIterKey_Type);
2309 PyDoc_STRVAR(dictionary_doc,
2310 "dict() -> new empty dictionary.\n"
2311 "dict(mapping) -> new dictionary initialized from a mapping object's\n"
2312 " (key, value) pairs.\n"
2313 "dict(seq) -> new dictionary initialized as if via:\n"
2314 " d = {}\n"
2315 " for k, v in seq:\n"
2316 " d[k] = v\n"
2317 "dict(**kwargs) -> new dictionary initialized with the name=value pairs\n"
2318 " in the keyword argument list. For example: dict(one=1, two=2)");
2320 PyTypeObject PyDict_Type = {
2321 PyVarObject_HEAD_INIT(&PyType_Type, 0)
2322 "dict",
2323 sizeof(PyDictObject),
2325 (destructor)dict_dealloc, /* tp_dealloc */
2326 (printfunc)dict_print, /* tp_print */
2327 0, /* tp_getattr */
2328 0, /* tp_setattr */
2329 (cmpfunc)dict_compare, /* tp_compare */
2330 (reprfunc)dict_repr, /* tp_repr */
2331 0, /* tp_as_number */
2332 &dict_as_sequence, /* tp_as_sequence */
2333 &dict_as_mapping, /* tp_as_mapping */
2334 (hashfunc)PyObject_HashNotImplemented, /* tp_hash */
2335 0, /* tp_call */
2336 0, /* tp_str */
2337 PyObject_GenericGetAttr, /* tp_getattro */
2338 0, /* tp_setattro */
2339 0, /* tp_as_buffer */
2340 Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
2341 Py_TPFLAGS_BASETYPE | Py_TPFLAGS_DICT_SUBCLASS, /* tp_flags */
2342 dictionary_doc, /* tp_doc */
2343 dict_traverse, /* tp_traverse */
2344 dict_tp_clear, /* tp_clear */
2345 dict_richcompare, /* tp_richcompare */
2346 0, /* tp_weaklistoffset */
2347 (getiterfunc)dict_iter, /* tp_iter */
2348 0, /* tp_iternext */
2349 mapp_methods, /* tp_methods */
2350 0, /* tp_members */
2351 0, /* tp_getset */
2352 0, /* tp_base */
2353 0, /* tp_dict */
2354 0, /* tp_descr_get */
2355 0, /* tp_descr_set */
2356 0, /* tp_dictoffset */
2357 dict_init, /* tp_init */
2358 PyType_GenericAlloc, /* tp_alloc */
2359 dict_new, /* tp_new */
2360 PyObject_GC_Del, /* tp_free */
2363 /* For backward compatibility with old dictionary interface */
2365 PyObject *
2366 PyDict_GetItemString(PyObject *v, const char *key)
2368 PyObject *kv, *rv;
2369 kv = PyString_FromString(key);
2370 if (kv == NULL)
2371 return NULL;
2372 rv = PyDict_GetItem(v, kv);
2373 Py_DECREF(kv);
2374 return rv;
2378 PyDict_SetItemString(PyObject *v, const char *key, PyObject *item)
2380 PyObject *kv;
2381 int err;
2382 kv = PyString_FromString(key);
2383 if (kv == NULL)
2384 return -1;
2385 PyString_InternInPlace(&kv); /* XXX Should we really? */
2386 err = PyDict_SetItem(v, kv, item);
2387 Py_DECREF(kv);
2388 return err;
2392 PyDict_DelItemString(PyObject *v, const char *key)
2394 PyObject *kv;
2395 int err;
2396 kv = PyString_FromString(key);
2397 if (kv == NULL)
2398 return -1;
2399 err = PyDict_DelItem(v, kv);
2400 Py_DECREF(kv);
2401 return err;
2404 /* Dictionary iterator types */
2406 typedef struct {
2407 PyObject_HEAD
2408 PyDictObject *di_dict; /* Set to NULL when iterator is exhausted */
2409 Py_ssize_t di_used;
2410 Py_ssize_t di_pos;
2411 PyObject* di_result; /* reusable result tuple for iteritems */
2412 Py_ssize_t len;
2413 } dictiterobject;
2415 static PyObject *
2416 dictiter_new(PyDictObject *dict, PyTypeObject *itertype)
2418 dictiterobject *di;
2419 di = PyObject_GC_New(dictiterobject, itertype);
2420 if (di == NULL)
2421 return NULL;
2422 Py_INCREF(dict);
2423 di->di_dict = dict;
2424 di->di_used = dict->ma_used;
2425 di->di_pos = 0;
2426 di->len = dict->ma_used;
2427 if (itertype == &PyDictIterItem_Type) {
2428 di->di_result = PyTuple_Pack(2, Py_None, Py_None);
2429 if (di->di_result == NULL) {
2430 Py_DECREF(di);
2431 return NULL;
2434 else
2435 di->di_result = NULL;
2436 _PyObject_GC_TRACK(di);
2437 return (PyObject *)di;
2440 static void
2441 dictiter_dealloc(dictiterobject *di)
2443 Py_XDECREF(di->di_dict);
2444 Py_XDECREF(di->di_result);
2445 PyObject_GC_Del(di);
2448 static int
2449 dictiter_traverse(dictiterobject *di, visitproc visit, void *arg)
2451 Py_VISIT(di->di_dict);
2452 Py_VISIT(di->di_result);
2453 return 0;
2456 static PyObject *
2457 dictiter_len(dictiterobject *di)
2459 Py_ssize_t len = 0;
2460 if (di->di_dict != NULL && di->di_used == di->di_dict->ma_used)
2461 len = di->len;
2462 return PyInt_FromSize_t(len);
2465 PyDoc_STRVAR(length_hint_doc, "Private method returning an estimate of len(list(it)).");
2467 static PyMethodDef dictiter_methods[] = {
2468 {"__length_hint__", (PyCFunction)dictiter_len, METH_NOARGS, length_hint_doc},
2469 {NULL, NULL} /* sentinel */
2472 static PyObject *dictiter_iternextkey(dictiterobject *di)
2474 PyObject *key;
2475 register Py_ssize_t i, mask;
2476 register PyDictEntry *ep;
2477 PyDictObject *d = di->di_dict;
2479 if (d == NULL)
2480 return NULL;
2481 assert (PyDict_Check(d));
2483 if (di->di_used != d->ma_used) {
2484 PyErr_SetString(PyExc_RuntimeError,
2485 "dictionary changed size during iteration");
2486 di->di_used = -1; /* Make this state sticky */
2487 return NULL;
2490 i = di->di_pos;
2491 if (i < 0)
2492 goto fail;
2493 ep = d->ma_table;
2494 mask = d->ma_mask;
2495 while (i <= mask && ep[i].me_value == NULL)
2496 i++;
2497 di->di_pos = i+1;
2498 if (i > mask)
2499 goto fail;
2500 di->len--;
2501 key = ep[i].me_key;
2502 Py_INCREF(key);
2503 return key;
2505 fail:
2506 Py_DECREF(d);
2507 di->di_dict = NULL;
2508 return NULL;
2511 PyTypeObject PyDictIterKey_Type = {
2512 PyVarObject_HEAD_INIT(&PyType_Type, 0)
2513 "dictionary-keyiterator", /* tp_name */
2514 sizeof(dictiterobject), /* tp_basicsize */
2515 0, /* tp_itemsize */
2516 /* methods */
2517 (destructor)dictiter_dealloc, /* tp_dealloc */
2518 0, /* tp_print */
2519 0, /* tp_getattr */
2520 0, /* tp_setattr */
2521 0, /* tp_compare */
2522 0, /* tp_repr */
2523 0, /* tp_as_number */
2524 0, /* tp_as_sequence */
2525 0, /* tp_as_mapping */
2526 0, /* tp_hash */
2527 0, /* tp_call */
2528 0, /* tp_str */
2529 PyObject_GenericGetAttr, /* tp_getattro */
2530 0, /* tp_setattro */
2531 0, /* tp_as_buffer */
2532 Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2533 0, /* tp_doc */
2534 (traverseproc)dictiter_traverse, /* tp_traverse */
2535 0, /* tp_clear */
2536 0, /* tp_richcompare */
2537 0, /* tp_weaklistoffset */
2538 PyObject_SelfIter, /* tp_iter */
2539 (iternextfunc)dictiter_iternextkey, /* tp_iternext */
2540 dictiter_methods, /* tp_methods */
2544 static PyObject *dictiter_iternextvalue(dictiterobject *di)
2546 PyObject *value;
2547 register Py_ssize_t i, mask;
2548 register PyDictEntry *ep;
2549 PyDictObject *d = di->di_dict;
2551 if (d == NULL)
2552 return NULL;
2553 assert (PyDict_Check(d));
2555 if (di->di_used != d->ma_used) {
2556 PyErr_SetString(PyExc_RuntimeError,
2557 "dictionary changed size during iteration");
2558 di->di_used = -1; /* Make this state sticky */
2559 return NULL;
2562 i = di->di_pos;
2563 mask = d->ma_mask;
2564 if (i < 0 || i > mask)
2565 goto fail;
2566 ep = d->ma_table;
2567 while ((value=ep[i].me_value) == NULL) {
2568 i++;
2569 if (i > mask)
2570 goto fail;
2572 di->di_pos = i+1;
2573 di->len--;
2574 Py_INCREF(value);
2575 return value;
2577 fail:
2578 Py_DECREF(d);
2579 di->di_dict = NULL;
2580 return NULL;
2583 PyTypeObject PyDictIterValue_Type = {
2584 PyVarObject_HEAD_INIT(&PyType_Type, 0)
2585 "dictionary-valueiterator", /* tp_name */
2586 sizeof(dictiterobject), /* tp_basicsize */
2587 0, /* tp_itemsize */
2588 /* methods */
2589 (destructor)dictiter_dealloc, /* tp_dealloc */
2590 0, /* tp_print */
2591 0, /* tp_getattr */
2592 0, /* tp_setattr */
2593 0, /* tp_compare */
2594 0, /* tp_repr */
2595 0, /* tp_as_number */
2596 0, /* tp_as_sequence */
2597 0, /* tp_as_mapping */
2598 0, /* tp_hash */
2599 0, /* tp_call */
2600 0, /* tp_str */
2601 PyObject_GenericGetAttr, /* tp_getattro */
2602 0, /* tp_setattro */
2603 0, /* tp_as_buffer */
2604 Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2605 0, /* tp_doc */
2606 (traverseproc)dictiter_traverse, /* tp_traverse */
2607 0, /* tp_clear */
2608 0, /* tp_richcompare */
2609 0, /* tp_weaklistoffset */
2610 PyObject_SelfIter, /* tp_iter */
2611 (iternextfunc)dictiter_iternextvalue, /* tp_iternext */
2612 dictiter_methods, /* tp_methods */
2616 static PyObject *dictiter_iternextitem(dictiterobject *di)
2618 PyObject *key, *value, *result = di->di_result;
2619 register Py_ssize_t i, mask;
2620 register PyDictEntry *ep;
2621 PyDictObject *d = di->di_dict;
2623 if (d == NULL)
2624 return NULL;
2625 assert (PyDict_Check(d));
2627 if (di->di_used != d->ma_used) {
2628 PyErr_SetString(PyExc_RuntimeError,
2629 "dictionary changed size during iteration");
2630 di->di_used = -1; /* Make this state sticky */
2631 return NULL;
2634 i = di->di_pos;
2635 if (i < 0)
2636 goto fail;
2637 ep = d->ma_table;
2638 mask = d->ma_mask;
2639 while (i <= mask && ep[i].me_value == NULL)
2640 i++;
2641 di->di_pos = i+1;
2642 if (i > mask)
2643 goto fail;
2645 if (result->ob_refcnt == 1) {
2646 Py_INCREF(result);
2647 Py_DECREF(PyTuple_GET_ITEM(result, 0));
2648 Py_DECREF(PyTuple_GET_ITEM(result, 1));
2649 } else {
2650 result = PyTuple_New(2);
2651 if (result == NULL)
2652 return NULL;
2654 di->len--;
2655 key = ep[i].me_key;
2656 value = ep[i].me_value;
2657 Py_INCREF(key);
2658 Py_INCREF(value);
2659 PyTuple_SET_ITEM(result, 0, key);
2660 PyTuple_SET_ITEM(result, 1, value);
2661 return result;
2663 fail:
2664 Py_DECREF(d);
2665 di->di_dict = NULL;
2666 return NULL;
2669 PyTypeObject PyDictIterItem_Type = {
2670 PyVarObject_HEAD_INIT(&PyType_Type, 0)
2671 "dictionary-itemiterator", /* tp_name */
2672 sizeof(dictiterobject), /* tp_basicsize */
2673 0, /* tp_itemsize */
2674 /* methods */
2675 (destructor)dictiter_dealloc, /* tp_dealloc */
2676 0, /* tp_print */
2677 0, /* tp_getattr */
2678 0, /* tp_setattr */
2679 0, /* tp_compare */
2680 0, /* tp_repr */
2681 0, /* tp_as_number */
2682 0, /* tp_as_sequence */
2683 0, /* tp_as_mapping */
2684 0, /* tp_hash */
2685 0, /* tp_call */
2686 0, /* tp_str */
2687 PyObject_GenericGetAttr, /* tp_getattro */
2688 0, /* tp_setattro */
2689 0, /* tp_as_buffer */
2690 Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2691 0, /* tp_doc */
2692 (traverseproc)dictiter_traverse, /* tp_traverse */
2693 0, /* tp_clear */
2694 0, /* tp_richcompare */
2695 0, /* tp_weaklistoffset */
2696 PyObject_SelfIter, /* tp_iter */
2697 (iternextfunc)dictiter_iternextitem, /* tp_iternext */
2698 dictiter_methods, /* tp_methods */