2 /* Dictionary object implementation using a hash table */
4 /* The distribution includes a separate file, Objects/dictnotes.txt,
5 describing explorations into dictionary design and optimization.
6 It covers typical dictionary use patterns, the parameters for
7 tuning dictionaries, and several ideas for possible optimizations.
12 typedef PyDictEntry dictentry
;
13 typedef PyDictObject dictobject
;
15 /* Define this out if you don't want conversion statistics on exit. */
16 #undef SHOW_CONVERSION_COUNTS
18 /* See large comment block below. This must be >= 1. */
19 #define PERTURB_SHIFT 5
22 Major subtleties ahead: Most hash schemes depend on having a "good" hash
23 function, in the sense of simulating randomness. Python doesn't: its most
24 important hash functions (for strings and ints) are very regular in common
27 >>> map(hash, (0, 1, 2, 3))
29 >>> map(hash, ("namea", "nameb", "namec", "named"))
30 [-1658398457, -1658398460, -1658398459, -1658398462]
33 This isn't necessarily bad! To the contrary, in a table of size 2**i, taking
34 the low-order i bits as the initial table index is extremely fast, and there
35 are no collisions at all for dicts indexed by a contiguous range of ints.
36 The same is approximately true when keys are "consecutive" strings. So this
37 gives better-than-random behavior in common cases, and that's very desirable.
39 OTOH, when collisions occur, the tendency to fill contiguous slices of the
40 hash table makes a good collision resolution strategy crucial. Taking only
41 the last i bits of the hash code is also vulnerable: for example, consider
42 [i << 16 for i in range(20000)] as a set of keys. Since ints are their own
43 hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
44 hash code are all 0: they *all* map to the same table index.
46 But catering to unusual cases should not slow the usual ones, so we just take
47 the last i bits anyway. It's up to collision resolution to do the rest. If
48 we *usually* find the key we're looking for on the first try (and, it turns
49 out, we usually do -- the table load factor is kept under 2/3, so the odds
50 are solidly in our favor), then it makes best sense to keep the initial index
51 computation dirt cheap.
53 The first half of collision resolution is to visit table indices via this
56 j = ((5*j) + 1) mod 2**i
58 For any initial j in range(2**i), repeating that 2**i times generates each
59 int in range(2**i) exactly once (see any text on random-number generation for
60 proof). By itself, this doesn't help much: like linear probing (setting
61 j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
62 order. This would be bad, except that's not the only thing we do, and it's
63 actually *good* in the common cases where hash keys are consecutive. In an
64 example that's really too small to make this entirely clear, for a table of
65 size 2**3 the order of indices is:
67 0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]
69 If two things come in at index 5, the first place we look after is index 2,
70 not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
71 Linear probing is deadly in this case because there the fixed probe order
72 is the *same* as the order consecutive keys are likely to arrive. But it's
73 extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
74 and certain that consecutive hash codes do not.
76 The other half of the strategy is to get the other bits of the hash code
77 into play. This is done by initializing a (unsigned) vrbl "perturb" to the
78 full hash code, and changing the recurrence to:
80 j = (5*j) + 1 + perturb;
81 perturb >>= PERTURB_SHIFT;
82 use j % 2**i as the next table index;
84 Now the probe sequence depends (eventually) on every bit in the hash code,
85 and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
86 because it quickly magnifies small differences in the bits that didn't affect
87 the initial index. Note that because perturb is unsigned, if the recurrence
88 is executed often enough perturb eventually becomes and remains 0. At that
89 point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
90 that's certain to find an empty slot eventually (since it generates every int
91 in range(2**i), and we make sure there's always at least one empty slot).
93 Selecting a good value for PERTURB_SHIFT is a balancing act. You want it
94 small so that the high bits of the hash code continue to affect the probe
95 sequence across iterations; but you want it large so that in really bad cases
96 the high-order hash bits have an effect on early iterations. 5 was "the
97 best" in minimizing total collisions across experiments Tim Peters ran (on
98 both normal and pathological cases), but 4 and 6 weren't significantly worse.
100 Historical: Reimer Behrends contributed the idea of using a polynomial-based
101 approach, using repeated multiplication by x in GF(2**n) where an irreducible
102 polynomial for each table size was chosen such that x was a primitive root.
103 Christian Tismer later extended that to use division by x instead, as an
104 efficient way to get the high bits of the hash code into play. This scheme
105 also gave excellent collision statistics, but was more expensive: two
106 if-tests were required inside the loop; computing "the next" index took about
107 the same number of operations but without as much potential parallelism
108 (e.g., computing 5*j can go on at the same time as computing 1+perturb in the
109 above, and then shifting perturb can be done while the table index is being
110 masked); and the dictobject struct required a member to hold the table's
111 polynomial. In Tim's experiments the current scheme ran faster, produced
112 equally good collision statistics, needed less code & used less memory.
114 Theoretical Python 2.5 headache: hash codes are only C "long", but
115 sizeof(Py_ssize_t) > sizeof(long) may be possible. In that case, and if a
116 dict is genuinely huge, then only the slots directly reachable via indexing
117 by a C long can be the first slot in a probe sequence. The probe sequence
118 will still eventually reach every slot in the table, but the collision rate
119 on initial probes may be much higher than this scheme was designed for.
120 Getting a hash code as fat as Py_ssize_t is the only real cure. But in
121 practice, this probably won't make a lick of difference for many years (at
122 which point everyone will have terabytes of RAM on 64-bit boxes).
125 /* Object used as dummy key to fill deleted entries */
126 static PyObject
*dummy
= NULL
; /* Initialized by first call to newdictobject() */
136 /* forward declarations */
138 lookdict_string(dictobject
*mp
, PyObject
*key
, long hash
);
140 #ifdef SHOW_CONVERSION_COUNTS
141 static long created
= 0L;
142 static long converted
= 0L;
147 fprintf(stderr
, "created %ld string dicts\n", created
);
148 fprintf(stderr
, "converted %ld to normal dicts\n", converted
);
149 fprintf(stderr
, "%.2f%% conversion rate\n", (100.0*converted
)/created
);
153 /* Initialization macros.
154 There are two ways to create a dict: PyDict_New() is the main C API
155 function, and the tp_new slot maps to dict_new(). In the latter case we
156 can save a little time over what PyDict_New does because it's guaranteed
157 that the PyDictObject struct is already zeroed out.
158 Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
159 an excellent reason not to).
162 #define INIT_NONZERO_DICT_SLOTS(mp) do { \
163 (mp)->ma_table = (mp)->ma_smalltable; \
164 (mp)->ma_mask = PyDict_MINSIZE - 1; \
167 #define EMPTY_TO_MINSIZE(mp) do { \
168 memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable)); \
169 (mp)->ma_used = (mp)->ma_fill = 0; \
170 INIT_NONZERO_DICT_SLOTS(mp); \
173 /* Dictionary reuse scheme to save calls to malloc, free, and memset */
174 #define MAXFREEDICTS 80
175 static PyDictObject
*free_dicts
[MAXFREEDICTS
];
176 static int num_free_dicts
= 0;
181 register dictobject
*mp
;
182 if (dummy
== NULL
) { /* Auto-initialize dummy */
183 dummy
= PyString_FromString("<dummy key>");
186 #ifdef SHOW_CONVERSION_COUNTS
187 Py_AtExit(show_counts
);
190 if (num_free_dicts
) {
191 mp
= free_dicts
[--num_free_dicts
];
193 assert (mp
->ob_type
== &PyDict_Type
);
194 _Py_NewReference((PyObject
*)mp
);
196 EMPTY_TO_MINSIZE(mp
);
198 assert (mp
->ma_used
== 0);
199 assert (mp
->ma_table
== mp
->ma_smalltable
);
200 assert (mp
->ma_mask
== PyDict_MINSIZE
- 1);
202 mp
= PyObject_GC_New(dictobject
, &PyDict_Type
);
205 EMPTY_TO_MINSIZE(mp
);
207 mp
->ma_lookup
= lookdict_string
;
208 #ifdef SHOW_CONVERSION_COUNTS
211 _PyObject_GC_TRACK(mp
);
212 return (PyObject
*)mp
;
216 The basic lookup function used by all operations.
217 This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
218 Open addressing is preferred over chaining since the link overhead for
219 chaining would be substantial (100% with typical malloc overhead).
221 The initial probe index is computed as hash mod the table size. Subsequent
222 probe indices are computed as explained earlier.
224 All arithmetic on hash should ignore overflow.
226 (The details in this version are due to Tim Peters, building on many past
227 contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
230 lookdict() is general-purpose, and may return NULL if (and only if) a
231 comparison raises an exception (this was new in Python 2.5).
232 lookdict_string() below is specialized to string keys, comparison of which can
233 never raise an exception; that function can never return NULL. For both, when
234 the key isn't found a dictentry* is returned for which the me_value field is
235 NULL; this is the slot in the dict at which the key would have been found, and
236 the caller can (if it wishes) add the <key, value> pair to the returned
240 lookdict(dictobject
*mp
, PyObject
*key
, register long hash
)
243 register size_t perturb
;
244 register dictentry
*freeslot
;
245 register size_t mask
= (size_t)mp
->ma_mask
;
246 dictentry
*ep0
= mp
->ma_table
;
247 register dictentry
*ep
;
251 i
= (size_t)hash
& mask
;
253 if (ep
->me_key
== NULL
|| ep
->me_key
== key
)
256 if (ep
->me_key
== dummy
)
259 if (ep
->me_hash
== hash
) {
260 startkey
= ep
->me_key
;
261 cmp
= PyObject_RichCompareBool(startkey
, key
, Py_EQ
);
264 if (ep0
== mp
->ma_table
&& ep
->me_key
== startkey
) {
269 /* The compare did major nasty stuff to the
271 * XXX A clever adversary could prevent this
272 * XXX from terminating.
274 return lookdict(mp
, key
, hash
);
280 /* In the loop, me_key == dummy is by far (factor of 100s) the
281 least likely outcome, so test for that last. */
282 for (perturb
= hash
; ; perturb
>>= PERTURB_SHIFT
) {
283 i
= (i
<< 2) + i
+ perturb
+ 1;
285 if (ep
->me_key
== NULL
)
286 return freeslot
== NULL
? ep
: freeslot
;
287 if (ep
->me_key
== key
)
289 if (ep
->me_hash
== hash
&& ep
->me_key
!= dummy
) {
290 startkey
= ep
->me_key
;
291 cmp
= PyObject_RichCompareBool(startkey
, key
, Py_EQ
);
294 if (ep0
== mp
->ma_table
&& ep
->me_key
== startkey
) {
299 /* The compare did major nasty stuff to the
301 * XXX A clever adversary could prevent this
302 * XXX from terminating.
304 return lookdict(mp
, key
, hash
);
307 else if (ep
->me_key
== dummy
&& freeslot
== NULL
)
313 * Hacked up version of lookdict which can assume keys are always strings;
314 * this assumption allows testing for errors during PyObject_RichCompareBool()
315 * to be dropped; string-string comparisons never raise exceptions. This also
316 * means we don't need to go through PyObject_RichCompareBool(); we can always
317 * use _PyString_Eq() directly.
319 * This is valuable because dicts with only string keys are very common.
322 lookdict_string(dictobject
*mp
, PyObject
*key
, register long hash
)
325 register size_t perturb
;
326 register dictentry
*freeslot
;
327 register size_t mask
= (size_t)mp
->ma_mask
;
328 dictentry
*ep0
= mp
->ma_table
;
329 register dictentry
*ep
;
331 /* Make sure this function doesn't have to handle non-string keys,
332 including subclasses of str; e.g., one reason to subclass
333 strings is to override __eq__, and for speed we don't cater to
335 if (!PyString_CheckExact(key
)) {
336 #ifdef SHOW_CONVERSION_COUNTS
339 mp
->ma_lookup
= lookdict
;
340 return lookdict(mp
, key
, hash
);
344 if (ep
->me_key
== NULL
|| ep
->me_key
== key
)
346 if (ep
->me_key
== dummy
)
349 if (ep
->me_hash
== hash
&& _PyString_Eq(ep
->me_key
, key
))
354 /* In the loop, me_key == dummy is by far (factor of 100s) the
355 least likely outcome, so test for that last. */
356 for (perturb
= hash
; ; perturb
>>= PERTURB_SHIFT
) {
357 i
= (i
<< 2) + i
+ perturb
+ 1;
359 if (ep
->me_key
== NULL
)
360 return freeslot
== NULL
? ep
: freeslot
;
361 if (ep
->me_key
== key
362 || (ep
->me_hash
== hash
363 && ep
->me_key
!= dummy
364 && _PyString_Eq(ep
->me_key
, key
)))
366 if (ep
->me_key
== dummy
&& freeslot
== NULL
)
372 Internal routine to insert a new item into the table.
373 Used both by the internal resize routine and by the public insert routine.
374 Eats a reference to key and one to value.
375 Returns -1 if an error occurred, or 0 on success.
378 insertdict(register dictobject
*mp
, PyObject
*key
, long hash
, PyObject
*value
)
381 register dictentry
*ep
;
382 typedef PyDictEntry
*(*lookupfunc
)(PyDictObject
*, PyObject
*, long);
384 assert(mp
->ma_lookup
!= NULL
);
385 ep
= mp
->ma_lookup(mp
, key
, hash
);
391 if (ep
->me_value
!= NULL
) {
392 old_value
= ep
->me_value
;
393 ep
->me_value
= value
;
394 Py_DECREF(old_value
); /* which **CAN** re-enter */
398 if (ep
->me_key
== NULL
)
401 assert(ep
->me_key
== dummy
);
405 ep
->me_hash
= (Py_ssize_t
)hash
;
406 ep
->me_value
= value
;
413 Internal routine used by dictresize() to insert an item which is
414 known to be absent from the dict. This routine also assumes that
415 the dict contains no deleted entries. Besides the performance benefit,
416 using insertdict() in dictresize() is dangerous (SF bug #1456209).
417 Note that no refcounts are changed by this routine; if needed, the caller
418 is responsible for incref'ing `key` and `value`.
421 insertdict_clean(register dictobject
*mp
, PyObject
*key
, long hash
,
425 register size_t perturb
;
426 register size_t mask
= (size_t)mp
->ma_mask
;
427 dictentry
*ep0
= mp
->ma_table
;
428 register dictentry
*ep
;
432 for (perturb
= hash
; ep
->me_key
!= NULL
; perturb
>>= PERTURB_SHIFT
) {
433 i
= (i
<< 2) + i
+ perturb
+ 1;
436 assert(ep
->me_value
== NULL
);
439 ep
->me_hash
= (Py_ssize_t
)hash
;
440 ep
->me_value
= value
;
445 Restructure the table by allocating a new table and reinserting all
446 items again. When entries have been deleted, the new table may
447 actually be smaller than the old one.
450 dictresize(dictobject
*mp
, Py_ssize_t minused
)
453 dictentry
*oldtable
, *newtable
, *ep
;
455 int is_oldtable_malloced
;
456 dictentry small_copy
[PyDict_MINSIZE
];
458 assert(minused
>= 0);
460 /* Find the smallest table size > minused. */
461 for (newsize
= PyDict_MINSIZE
;
462 newsize
<= minused
&& newsize
> 0;
470 /* Get space for a new table. */
471 oldtable
= mp
->ma_table
;
472 assert(oldtable
!= NULL
);
473 is_oldtable_malloced
= oldtable
!= mp
->ma_smalltable
;
475 if (newsize
== PyDict_MINSIZE
) {
476 /* A large table is shrinking, or we can't get any smaller. */
477 newtable
= mp
->ma_smalltable
;
478 if (newtable
== oldtable
) {
479 if (mp
->ma_fill
== mp
->ma_used
) {
480 /* No dummies, so no point doing anything. */
483 /* We're not going to resize it, but rebuild the
484 table anyway to purge old dummy entries.
485 Subtle: This is *necessary* if fill==size,
486 as lookdict needs at least one virgin slot to
487 terminate failing searches. If fill < size, it's
488 merely desirable, as dummies slow searches. */
489 assert(mp
->ma_fill
> mp
->ma_used
);
490 memcpy(small_copy
, oldtable
, sizeof(small_copy
));
491 oldtable
= small_copy
;
495 newtable
= PyMem_NEW(dictentry
, newsize
);
496 if (newtable
== NULL
) {
502 /* Make the dict empty, using the new table. */
503 assert(newtable
!= oldtable
);
504 mp
->ma_table
= newtable
;
505 mp
->ma_mask
= newsize
- 1;
506 memset(newtable
, 0, sizeof(dictentry
) * newsize
);
511 /* Copy the data over; this is refcount-neutral for active entries;
512 dummy entries aren't copied over, of course */
513 for (ep
= oldtable
; i
> 0; ep
++) {
514 if (ep
->me_value
!= NULL
) { /* active entry */
516 insertdict_clean(mp
, ep
->me_key
, (long)ep
->me_hash
,
519 else if (ep
->me_key
!= NULL
) { /* dummy entry */
521 assert(ep
->me_key
== dummy
);
522 Py_DECREF(ep
->me_key
);
524 /* else key == value == NULL: nothing to do */
527 if (is_oldtable_malloced
)
532 /* Note that, for historical reasons, PyDict_GetItem() suppresses all errors
533 * that may occur (originally dicts supported only string keys, and exceptions
534 * weren't possible). So, while the original intent was that a NULL return
535 * meant the key wasn't present, it reality it can mean that, or that an error
536 * (suppressed) occurred while computing the key's hash, or that some error
537 * (suppressed) occurred when comparing keys in the dict's internal probe
538 * sequence. A nasty example of the latter is when a Python-coded comparison
539 * function hits a stack-depth error, which can cause this to return NULL
540 * even if the key is present.
543 PyDict_GetItem(PyObject
*op
, PyObject
*key
)
546 dictobject
*mp
= (dictobject
*)op
;
548 PyThreadState
*tstate
;
549 if (!PyDict_Check(op
))
551 if (!PyString_CheckExact(key
) ||
552 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1)
554 hash
= PyObject_Hash(key
);
561 /* We can arrive here with a NULL tstate during initialization:
562 try running "python -Wi" for an example related to string
563 interning. Let's just hope that no exception occurs then... */
564 tstate
= _PyThreadState_Current
;
565 if (tstate
!= NULL
&& tstate
->curexc_type
!= NULL
) {
566 /* preserve the existing exception */
567 PyObject
*err_type
, *err_value
, *err_tb
;
568 PyErr_Fetch(&err_type
, &err_value
, &err_tb
);
569 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
571 PyErr_Restore(err_type
, err_value
, err_tb
);
576 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
585 /* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
586 * dictionary if it's merely replacing the value for an existing key.
587 * This means that it's safe to loop over a dictionary with PyDict_Next()
588 * and occasionally replace a value -- but you can't insert new keys or
592 PyDict_SetItem(register PyObject
*op
, PyObject
*key
, PyObject
*value
)
594 register dictobject
*mp
;
596 register Py_ssize_t n_used
;
598 if (!PyDict_Check(op
)) {
599 PyErr_BadInternalCall();
602 mp
= (dictobject
*)op
;
603 if (PyString_CheckExact(key
)) {
604 hash
= ((PyStringObject
*)key
)->ob_shash
;
606 hash
= PyObject_Hash(key
);
609 hash
= PyObject_Hash(key
);
613 assert(mp
->ma_fill
<= mp
->ma_mask
); /* at least one empty slot */
614 n_used
= mp
->ma_used
;
617 if (insertdict(mp
, key
, hash
, value
) != 0)
619 /* If we added a key, we can safely resize. Otherwise just return!
620 * If fill >= 2/3 size, adjust size. Normally, this doubles or
621 * quaduples the size, but it's also possible for the dict to shrink
622 * (if ma_fill is much larger than ma_used, meaning a lot of dict
623 * keys have been * deleted).
625 * Quadrupling the size improves average dictionary sparseness
626 * (reducing collisions) at the cost of some memory and iteration
627 * speed (which loops over every possible entry). It also halves
628 * the number of expensive resize operations in a growing dictionary.
630 * Very large dictionaries (over 50K items) use doubling instead.
631 * This may help applications with severe memory constraints.
633 if (!(mp
->ma_used
> n_used
&& mp
->ma_fill
*3 >= (mp
->ma_mask
+1)*2))
635 return dictresize(mp
, (mp
->ma_used
> 50000 ? 2 : 4) * mp
->ma_used
);
639 PyDict_DelItem(PyObject
*op
, PyObject
*key
)
641 register dictobject
*mp
;
643 register dictentry
*ep
;
644 PyObject
*old_value
, *old_key
;
646 if (!PyDict_Check(op
)) {
647 PyErr_BadInternalCall();
650 if (!PyString_CheckExact(key
) ||
651 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
652 hash
= PyObject_Hash(key
);
656 mp
= (dictobject
*)op
;
657 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
660 if (ep
->me_value
== NULL
) {
661 PyErr_SetObject(PyExc_KeyError
, key
);
664 old_key
= ep
->me_key
;
667 old_value
= ep
->me_value
;
670 Py_DECREF(old_value
);
676 PyDict_Clear(PyObject
*op
)
679 dictentry
*ep
, *table
;
680 int table_is_malloced
;
682 dictentry small_copy
[PyDict_MINSIZE
];
687 if (!PyDict_Check(op
))
689 mp
= (dictobject
*)op
;
695 table
= mp
->ma_table
;
696 assert(table
!= NULL
);
697 table_is_malloced
= table
!= mp
->ma_smalltable
;
699 /* This is delicate. During the process of clearing the dict,
700 * decrefs can cause the dict to mutate. To avoid fatal confusion
701 * (voice of experience), we have to make the dict empty before
702 * clearing the slots, and never refer to anything via mp->xxx while
706 if (table_is_malloced
)
707 EMPTY_TO_MINSIZE(mp
);
710 /* It's a small table with something that needs to be cleared.
711 * Afraid the only safe way is to copy the dict entries into
712 * another small table first.
714 memcpy(small_copy
, table
, sizeof(small_copy
));
716 EMPTY_TO_MINSIZE(mp
);
718 /* else it's a small table that's already empty */
720 /* Now we can finally clear things. If C had refcounts, we could
721 * assert that the refcount on table is 1 now, i.e. that this function
722 * has unique access to it, so decref side-effects can't alter it.
724 for (ep
= table
; fill
> 0; ++ep
) {
731 Py_DECREF(ep
->me_key
);
732 Py_XDECREF(ep
->me_value
);
736 assert(ep
->me_value
== NULL
);
740 if (table_is_malloced
)
745 * Iterate over a dict. Use like so:
748 * PyObject *key, *value;
749 * i = 0; # important! i should not otherwise be changed by you
750 * while (PyDict_Next(yourdict, &i, &key, &value)) {
751 * Refer to borrowed references in key and value.
754 * CAUTION: In general, it isn't safe to use PyDict_Next in a loop that
755 * mutates the dict. One exception: it is safe if the loop merely changes
756 * the values associated with the keys (but doesn't insert new keys or
757 * delete keys), via PyDict_SetItem().
760 PyDict_Next(PyObject
*op
, Py_ssize_t
*ppos
, PyObject
**pkey
, PyObject
**pvalue
)
762 register Py_ssize_t i
;
763 register Py_ssize_t mask
;
764 register dictentry
*ep
;
766 if (!PyDict_Check(op
))
771 ep
= ((dictobject
*)op
)->ma_table
;
772 mask
= ((dictobject
*)op
)->ma_mask
;
773 while (i
<= mask
&& ep
[i
].me_value
== NULL
)
779 *pkey
= ep
[i
].me_key
;
781 *pvalue
= ep
[i
].me_value
;
788 dict_dealloc(register dictobject
*mp
)
790 register dictentry
*ep
;
791 Py_ssize_t fill
= mp
->ma_fill
;
792 PyObject_GC_UnTrack(mp
);
793 Py_TRASHCAN_SAFE_BEGIN(mp
)
794 for (ep
= mp
->ma_table
; fill
> 0; ep
++) {
797 Py_DECREF(ep
->me_key
);
798 Py_XDECREF(ep
->me_value
);
801 if (mp
->ma_table
!= mp
->ma_smalltable
)
802 PyMem_DEL(mp
->ma_table
);
803 if (num_free_dicts
< MAXFREEDICTS
&& mp
->ob_type
== &PyDict_Type
)
804 free_dicts
[num_free_dicts
++] = mp
;
806 mp
->ob_type
->tp_free((PyObject
*)mp
);
807 Py_TRASHCAN_SAFE_END(mp
)
811 dict_print(register dictobject
*mp
, register FILE *fp
, register int flags
)
813 register Py_ssize_t i
;
814 register Py_ssize_t any
;
817 status
= Py_ReprEnter((PyObject
*)mp
);
821 fprintf(fp
, "{...}");
827 for (i
= 0; i
<= mp
->ma_mask
; i
++) {
828 dictentry
*ep
= mp
->ma_table
+ i
;
829 PyObject
*pvalue
= ep
->me_value
;
830 if (pvalue
!= NULL
) {
831 /* Prevent PyObject_Repr from deleting value during
836 if (PyObject_Print((PyObject
*)ep
->me_key
, fp
, 0)!=0) {
838 Py_ReprLeave((PyObject
*)mp
);
842 if (PyObject_Print(pvalue
, fp
, 0) != 0) {
844 Py_ReprLeave((PyObject
*)mp
);
851 Py_ReprLeave((PyObject
*)mp
);
856 dict_repr(dictobject
*mp
)
859 PyObject
*s
, *temp
, *colon
= NULL
;
860 PyObject
*pieces
= NULL
, *result
= NULL
;
861 PyObject
*key
, *value
;
863 i
= Py_ReprEnter((PyObject
*)mp
);
865 return i
> 0 ? PyString_FromString("{...}") : NULL
;
868 if (mp
->ma_used
== 0) {
869 result
= PyString_FromString("{}");
873 pieces
= PyList_New(0);
877 colon
= PyString_FromString(": ");
881 /* Do repr() on each key+value pair, and insert ": " between them.
882 Note that repr may mutate the dict. */
884 while (PyDict_Next((PyObject
*)mp
, &i
, &key
, &value
)) {
886 /* Prevent repr from deleting value during key format. */
888 s
= PyObject_Repr(key
);
889 PyString_Concat(&s
, colon
);
890 PyString_ConcatAndDel(&s
, PyObject_Repr(value
));
894 status
= PyList_Append(pieces
, s
);
895 Py_DECREF(s
); /* append created a new ref */
900 /* Add "{}" decorations to the first and last items. */
901 assert(PyList_GET_SIZE(pieces
) > 0);
902 s
= PyString_FromString("{");
905 temp
= PyList_GET_ITEM(pieces
, 0);
906 PyString_ConcatAndDel(&s
, temp
);
907 PyList_SET_ITEM(pieces
, 0, s
);
911 s
= PyString_FromString("}");
914 temp
= PyList_GET_ITEM(pieces
, PyList_GET_SIZE(pieces
) - 1);
915 PyString_ConcatAndDel(&temp
, s
);
916 PyList_SET_ITEM(pieces
, PyList_GET_SIZE(pieces
) - 1, temp
);
920 /* Paste them all together with ", " between. */
921 s
= PyString_FromString(", ");
924 result
= _PyString_Join(s
, pieces
);
930 Py_ReprLeave((PyObject
*)mp
);
935 dict_length(dictobject
*mp
)
941 dict_subscript(dictobject
*mp
, register PyObject
*key
)
946 assert(mp
->ma_table
!= NULL
);
947 if (!PyString_CheckExact(key
) ||
948 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
949 hash
= PyObject_Hash(key
);
953 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
958 if (!PyDict_CheckExact(mp
)) {
959 /* Look up __missing__ method if we're a subclass. */
961 static PyObject
*missing_str
= NULL
;
962 if (missing_str
== NULL
)
964 PyString_InternFromString("__missing__");
965 missing
= _PyType_Lookup(mp
->ob_type
, missing_str
);
967 return PyObject_CallFunctionObjArgs(missing
,
968 (PyObject
*)mp
, key
, NULL
);
970 PyErr_SetObject(PyExc_KeyError
, key
);
979 dict_ass_sub(dictobject
*mp
, PyObject
*v
, PyObject
*w
)
982 return PyDict_DelItem((PyObject
*)mp
, v
);
984 return PyDict_SetItem((PyObject
*)mp
, v
, w
);
987 static PyMappingMethods dict_as_mapping
= {
988 (lenfunc
)dict_length
, /*mp_length*/
989 (binaryfunc
)dict_subscript
, /*mp_subscript*/
990 (objobjargproc
)dict_ass_sub
, /*mp_ass_subscript*/
994 dict_keys(register dictobject
*mp
)
996 register PyObject
*v
;
997 register Py_ssize_t i
, j
;
1006 if (n
!= mp
->ma_used
) {
1007 /* Durnit. The allocations caused the dict to resize.
1008 * Just start over, this shouldn't normally happen.
1015 for (i
= 0, j
= 0; i
<= mask
; i
++) {
1016 if (ep
[i
].me_value
!= NULL
) {
1017 PyObject
*key
= ep
[i
].me_key
;
1019 PyList_SET_ITEM(v
, j
, key
);
1028 dict_values(register dictobject
*mp
)
1030 register PyObject
*v
;
1031 register Py_ssize_t i
, j
;
1040 if (n
!= mp
->ma_used
) {
1041 /* Durnit. The allocations caused the dict to resize.
1042 * Just start over, this shouldn't normally happen.
1049 for (i
= 0, j
= 0; i
<= mask
; i
++) {
1050 if (ep
[i
].me_value
!= NULL
) {
1051 PyObject
*value
= ep
[i
].me_value
;
1053 PyList_SET_ITEM(v
, j
, value
);
1062 dict_items(register dictobject
*mp
)
1064 register PyObject
*v
;
1065 register Py_ssize_t i
, j
, n
;
1067 PyObject
*item
, *key
, *value
;
1070 /* Preallocate the list of tuples, to avoid allocations during
1071 * the loop over the items, which could trigger GC, which
1072 * could resize the dict. :-(
1079 for (i
= 0; i
< n
; i
++) {
1080 item
= PyTuple_New(2);
1085 PyList_SET_ITEM(v
, i
, item
);
1087 if (n
!= mp
->ma_used
) {
1088 /* Durnit. The allocations caused the dict to resize.
1089 * Just start over, this shouldn't normally happen.
1094 /* Nothing we do below makes any function calls. */
1097 for (i
= 0, j
= 0; i
<= mask
; i
++) {
1098 if ((value
=ep
[i
].me_value
) != NULL
) {
1100 item
= PyList_GET_ITEM(v
, j
);
1102 PyTuple_SET_ITEM(item
, 0, key
);
1104 PyTuple_SET_ITEM(item
, 1, value
);
1113 dict_fromkeys(PyObject
*cls
, PyObject
*args
)
1116 PyObject
*value
= Py_None
;
1117 PyObject
*it
; /* iter(seq) */
1122 if (!PyArg_UnpackTuple(args
, "fromkeys", 1, 2, &seq
, &value
))
1125 d
= PyObject_CallObject(cls
, NULL
);
1129 it
= PyObject_GetIter(seq
);
1136 key
= PyIter_Next(it
);
1138 if (PyErr_Occurred())
1142 status
= PyObject_SetItem(d
, key
, value
);
1158 dict_update_common(PyObject
*self
, PyObject
*args
, PyObject
*kwds
, char *methname
)
1160 PyObject
*arg
= NULL
;
1163 if (!PyArg_UnpackTuple(args
, methname
, 0, 1, &arg
))
1166 else if (arg
!= NULL
) {
1167 if (PyObject_HasAttrString(arg
, "keys"))
1168 result
= PyDict_Merge(self
, arg
, 1);
1170 result
= PyDict_MergeFromSeq2(self
, arg
, 1);
1172 if (result
== 0 && kwds
!= NULL
)
1173 result
= PyDict_Merge(self
, kwds
, 1);
1178 dict_update(PyObject
*self
, PyObject
*args
, PyObject
*kwds
)
1180 if (dict_update_common(self
, args
, kwds
, "update") != -1)
1185 /* Update unconditionally replaces existing items.
1186 Merge has a 3rd argument 'override'; if set, it acts like Update,
1187 otherwise it leaves existing items unchanged.
1189 PyDict_{Update,Merge} update/merge from a mapping object.
1191 PyDict_MergeFromSeq2 updates/merges from any iterable object
1192 producing iterable objects of length 2.
1196 PyDict_MergeFromSeq2(PyObject
*d
, PyObject
*seq2
, int override
)
1198 PyObject
*it
; /* iter(seq2) */
1199 Py_ssize_t i
; /* index into seq2 of current element */
1200 PyObject
*item
; /* seq2[i] */
1201 PyObject
*fast
; /* item as a 2-tuple or 2-list */
1204 assert(PyDict_Check(d
));
1205 assert(seq2
!= NULL
);
1207 it
= PyObject_GetIter(seq2
);
1211 for (i
= 0; ; ++i
) {
1212 PyObject
*key
, *value
;
1216 item
= PyIter_Next(it
);
1218 if (PyErr_Occurred())
1223 /* Convert item to sequence, and verify length 2. */
1224 fast
= PySequence_Fast(item
, "");
1226 if (PyErr_ExceptionMatches(PyExc_TypeError
))
1227 PyErr_Format(PyExc_TypeError
,
1228 "cannot convert dictionary update "
1229 "sequence element #%zd to a sequence",
1233 n
= PySequence_Fast_GET_SIZE(fast
);
1235 PyErr_Format(PyExc_ValueError
,
1236 "dictionary update sequence element #%zd "
1237 "has length %zd; 2 is required",
1242 /* Update/merge with this (key, value) pair. */
1243 key
= PySequence_Fast_GET_ITEM(fast
, 0);
1244 value
= PySequence_Fast_GET_ITEM(fast
, 1);
1245 if (override
|| PyDict_GetItem(d
, key
) == NULL
) {
1246 int status
= PyDict_SetItem(d
, key
, value
);
1262 return Py_SAFE_DOWNCAST(i
, Py_ssize_t
, int);
1266 PyDict_Update(PyObject
*a
, PyObject
*b
)
1268 return PyDict_Merge(a
, b
, 1);
1272 PyDict_Merge(PyObject
*a
, PyObject
*b
, int override
)
1274 register PyDictObject
*mp
, *other
;
1275 register Py_ssize_t i
;
1278 /* We accept for the argument either a concrete dictionary object,
1279 * or an abstract "mapping" object. For the former, we can do
1280 * things quite efficiently. For the latter, we only require that
1281 * PyMapping_Keys() and PyObject_GetItem() be supported.
1283 if (a
== NULL
|| !PyDict_Check(a
) || b
== NULL
) {
1284 PyErr_BadInternalCall();
1287 mp
= (dictobject
*)a
;
1288 if (PyDict_Check(b
)) {
1289 other
= (dictobject
*)b
;
1290 if (other
== mp
|| other
->ma_used
== 0)
1291 /* a.update(a) or a.update({}); nothing to do */
1293 if (mp
->ma_used
== 0)
1294 /* Since the target dict is empty, PyDict_GetItem()
1295 * always returns NULL. Setting override to 1
1296 * skips the unnecessary test.
1299 /* Do one big resize at the start, rather than
1300 * incrementally resizing as we insert new items. Expect
1301 * that there will be no (or few) overlapping keys.
1303 if ((mp
->ma_fill
+ other
->ma_used
)*3 >= (mp
->ma_mask
+1)*2) {
1304 if (dictresize(mp
, (mp
->ma_used
+ other
->ma_used
)*2) != 0)
1307 for (i
= 0; i
<= other
->ma_mask
; i
++) {
1308 entry
= &other
->ma_table
[i
];
1309 if (entry
->me_value
!= NULL
&&
1311 PyDict_GetItem(a
, entry
->me_key
) == NULL
)) {
1312 Py_INCREF(entry
->me_key
);
1313 Py_INCREF(entry
->me_value
);
1314 if (insertdict(mp
, entry
->me_key
,
1315 (long)entry
->me_hash
,
1316 entry
->me_value
) != 0)
1322 /* Do it the generic, slower way */
1323 PyObject
*keys
= PyMapping_Keys(b
);
1325 PyObject
*key
, *value
;
1329 /* Docstring says this is equivalent to E.keys() so
1330 * if E doesn't have a .keys() method we want
1331 * AttributeError to percolate up. Might as well
1332 * do the same for any other error.
1336 iter
= PyObject_GetIter(keys
);
1341 for (key
= PyIter_Next(iter
); key
; key
= PyIter_Next(iter
)) {
1342 if (!override
&& PyDict_GetItem(a
, key
) != NULL
) {
1346 value
= PyObject_GetItem(b
, key
);
1347 if (value
== NULL
) {
1352 status
= PyDict_SetItem(a
, key
, value
);
1361 if (PyErr_Occurred())
1362 /* Iterator completed, via error */
1369 dict_copy(register dictobject
*mp
)
1371 return PyDict_Copy((PyObject
*)mp
);
1375 PyDict_Copy(PyObject
*o
)
1379 if (o
== NULL
|| !PyDict_Check(o
)) {
1380 PyErr_BadInternalCall();
1383 copy
= PyDict_New();
1386 if (PyDict_Merge(copy
, o
, 1) == 0)
1393 PyDict_Size(PyObject
*mp
)
1395 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1396 PyErr_BadInternalCall();
1399 return ((dictobject
*)mp
)->ma_used
;
1403 PyDict_Keys(PyObject
*mp
)
1405 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1406 PyErr_BadInternalCall();
1409 return dict_keys((dictobject
*)mp
);
1413 PyDict_Values(PyObject
*mp
)
1415 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1416 PyErr_BadInternalCall();
1419 return dict_values((dictobject
*)mp
);
1423 PyDict_Items(PyObject
*mp
)
1425 if (mp
== NULL
|| !PyDict_Check(mp
)) {
1426 PyErr_BadInternalCall();
1429 return dict_items((dictobject
*)mp
);
1432 /* Subroutine which returns the smallest key in a for which b's value
1433 is different or absent. The value is returned too, through the
1434 pval argument. Both are NULL if no key in a is found for which b's status
1435 differs. The refcounts on (and only on) non-NULL *pval and function return
1436 values must be decremented by the caller (characterize() increments them
1437 to ensure that mutating comparison and PyDict_GetItem calls can't delete
1438 them before the caller is done looking at them). */
1441 characterize(dictobject
*a
, dictobject
*b
, PyObject
**pval
)
1443 PyObject
*akey
= NULL
; /* smallest key in a s.t. a[akey] != b[akey] */
1444 PyObject
*aval
= NULL
; /* a[akey] */
1448 for (i
= 0; i
<= a
->ma_mask
; i
++) {
1449 PyObject
*thiskey
, *thisaval
, *thisbval
;
1450 if (a
->ma_table
[i
].me_value
== NULL
)
1452 thiskey
= a
->ma_table
[i
].me_key
;
1453 Py_INCREF(thiskey
); /* keep alive across compares */
1455 cmp
= PyObject_RichCompareBool(akey
, thiskey
, Py_LT
);
1462 a
->ma_table
[i
].me_value
== NULL
)
1464 /* Not the *smallest* a key; or maybe it is
1465 * but the compare shrunk the dict so we can't
1466 * find its associated value anymore; or
1467 * maybe it is but the compare deleted the
1475 /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
1476 thisaval
= a
->ma_table
[i
].me_value
;
1478 Py_INCREF(thisaval
); /* keep alive */
1479 thisbval
= PyDict_GetItem((PyObject
*)b
, thiskey
);
1480 if (thisbval
== NULL
)
1483 /* both dicts have thiskey: same values? */
1484 cmp
= PyObject_RichCompareBool(
1485 thisaval
, thisbval
, Py_EQ
);
1488 Py_DECREF(thisaval
);
1501 Py_DECREF(thisaval
);
1515 dict_compare(dictobject
*a
, dictobject
*b
)
1517 PyObject
*adiff
, *bdiff
, *aval
, *bval
;
1520 /* Compare lengths first */
1521 if (a
->ma_used
< b
->ma_used
)
1522 return -1; /* a is shorter */
1523 else if (a
->ma_used
> b
->ma_used
)
1524 return 1; /* b is shorter */
1526 /* Same length -- check all keys */
1527 bdiff
= bval
= NULL
;
1528 adiff
= characterize(a
, b
, &aval
);
1529 if (adiff
== NULL
) {
1531 /* Either an error, or a is a subset with the same length so
1534 res
= PyErr_Occurred() ? -1 : 0;
1537 bdiff
= characterize(b
, a
, &bval
);
1538 if (bdiff
== NULL
&& PyErr_Occurred()) {
1545 /* bdiff == NULL "should be" impossible now, but perhaps
1546 * the last comparison done by the characterize() on a had
1547 * the side effect of making the dicts equal!
1549 res
= PyObject_Compare(adiff
, bdiff
);
1551 if (res
== 0 && bval
!= NULL
)
1552 res
= PyObject_Compare(aval
, bval
);
1562 /* Return 1 if dicts equal, 0 if not, -1 if error.
1563 * Gets out as soon as any difference is detected.
1564 * Uses only Py_EQ comparison.
1567 dict_equal(dictobject
*a
, dictobject
*b
)
1571 if (a
->ma_used
!= b
->ma_used
)
1572 /* can't be equal if # of entries differ */
1575 /* Same # of entries -- check all of 'em. Exit early on any diff. */
1576 for (i
= 0; i
<= a
->ma_mask
; i
++) {
1577 PyObject
*aval
= a
->ma_table
[i
].me_value
;
1581 PyObject
*key
= a
->ma_table
[i
].me_key
;
1582 /* temporarily bump aval's refcount to ensure it stays
1583 alive until we're done with it */
1585 bval
= PyDict_GetItem((PyObject
*)b
, key
);
1590 cmp
= PyObject_RichCompareBool(aval
, bval
, Py_EQ
);
1592 if (cmp
<= 0) /* error or not equal */
1600 dict_richcompare(PyObject
*v
, PyObject
*w
, int op
)
1605 if (!PyDict_Check(v
) || !PyDict_Check(w
)) {
1606 res
= Py_NotImplemented
;
1608 else if (op
== Py_EQ
|| op
== Py_NE
) {
1609 cmp
= dict_equal((dictobject
*)v
, (dictobject
*)w
);
1612 res
= (cmp
== (op
== Py_EQ
)) ? Py_True
: Py_False
;
1615 res
= Py_NotImplemented
;
1621 dict_has_key(register dictobject
*mp
, PyObject
*key
)
1626 if (!PyString_CheckExact(key
) ||
1627 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1628 hash
= PyObject_Hash(key
);
1632 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1635 return PyBool_FromLong(ep
->me_value
!= NULL
);
1639 dict_get(register dictobject
*mp
, PyObject
*args
)
1642 PyObject
*failobj
= Py_None
;
1643 PyObject
*val
= NULL
;
1647 if (!PyArg_UnpackTuple(args
, "get", 1, 2, &key
, &failobj
))
1650 if (!PyString_CheckExact(key
) ||
1651 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1652 hash
= PyObject_Hash(key
);
1656 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1668 dict_setdefault(register dictobject
*mp
, PyObject
*args
)
1671 PyObject
*failobj
= Py_None
;
1672 PyObject
*val
= NULL
;
1676 if (!PyArg_UnpackTuple(args
, "setdefault", 1, 2, &key
, &failobj
))
1679 if (!PyString_CheckExact(key
) ||
1680 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1681 hash
= PyObject_Hash(key
);
1685 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1691 if (PyDict_SetItem((PyObject
*)mp
, key
, failobj
))
1700 dict_clear(register dictobject
*mp
)
1702 PyDict_Clear((PyObject
*)mp
);
1707 dict_pop(dictobject
*mp
, PyObject
*args
)
1711 PyObject
*old_value
, *old_key
;
1712 PyObject
*key
, *deflt
= NULL
;
1714 if(!PyArg_UnpackTuple(args
, "pop", 1, 2, &key
, &deflt
))
1716 if (mp
->ma_used
== 0) {
1721 PyErr_SetString(PyExc_KeyError
,
1722 "pop(): dictionary is empty");
1725 if (!PyString_CheckExact(key
) ||
1726 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1727 hash
= PyObject_Hash(key
);
1731 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1734 if (ep
->me_value
== NULL
) {
1739 PyErr_SetObject(PyExc_KeyError
, key
);
1742 old_key
= ep
->me_key
;
1745 old_value
= ep
->me_value
;
1746 ep
->me_value
= NULL
;
1753 dict_popitem(dictobject
*mp
)
1759 /* Allocate the result tuple before checking the size. Believe it
1760 * or not, this allocation could trigger a garbage collection which
1761 * could empty the dict, so if we checked the size first and that
1762 * happened, the result would be an infinite loop (searching for an
1763 * entry that no longer exists). Note that the usual popitem()
1764 * idiom is "while d: k, v = d.popitem()". so needing to throw the
1765 * tuple away if the dict *is* empty isn't a significant
1766 * inefficiency -- possible, but unlikely in practice.
1768 res
= PyTuple_New(2);
1771 if (mp
->ma_used
== 0) {
1773 PyErr_SetString(PyExc_KeyError
,
1774 "popitem(): dictionary is empty");
1777 /* Set ep to "the first" dict entry with a value. We abuse the hash
1778 * field of slot 0 to hold a search finger:
1779 * If slot 0 has a value, use slot 0.
1780 * Else slot 0 is being used to hold a search finger,
1781 * and we use its hash value as the first index to look.
1783 ep
= &mp
->ma_table
[0];
1784 if (ep
->me_value
== NULL
) {
1786 /* The hash field may be a real hash value, or it may be a
1787 * legit search finger, or it may be a once-legit search
1788 * finger that's out of bounds now because it wrapped around
1789 * or the table shrunk -- simply make sure it's in bounds now.
1791 if (i
> mp
->ma_mask
|| i
< 1)
1792 i
= 1; /* skip slot 0 */
1793 while ((ep
= &mp
->ma_table
[i
])->me_value
== NULL
) {
1795 if (i
> mp
->ma_mask
)
1799 PyTuple_SET_ITEM(res
, 0, ep
->me_key
);
1800 PyTuple_SET_ITEM(res
, 1, ep
->me_value
);
1803 ep
->me_value
= NULL
;
1805 assert(mp
->ma_table
[0].me_value
== NULL
);
1806 mp
->ma_table
[0].me_hash
= i
+ 1; /* next place to start */
1811 dict_traverse(PyObject
*op
, visitproc visit
, void *arg
)
1817 while (PyDict_Next(op
, &i
, &pk
, &pv
)) {
1825 dict_tp_clear(PyObject
*op
)
1832 extern PyTypeObject PyDictIterKey_Type
; /* Forward */
1833 extern PyTypeObject PyDictIterValue_Type
; /* Forward */
1834 extern PyTypeObject PyDictIterItem_Type
; /* Forward */
1835 static PyObject
*dictiter_new(dictobject
*, PyTypeObject
*);
1838 dict_iterkeys(dictobject
*dict
)
1840 return dictiter_new(dict
, &PyDictIterKey_Type
);
1844 dict_itervalues(dictobject
*dict
)
1846 return dictiter_new(dict
, &PyDictIterValue_Type
);
1850 dict_iteritems(dictobject
*dict
)
1852 return dictiter_new(dict
, &PyDictIterItem_Type
);
1856 PyDoc_STRVAR(has_key__doc__
,
1857 "D.has_key(k) -> True if D has a key k, else False");
1859 PyDoc_STRVAR(contains__doc__
,
1860 "D.__contains__(k) -> True if D has a key k, else False");
1862 PyDoc_STRVAR(getitem__doc__
, "x.__getitem__(y) <==> x[y]");
1864 PyDoc_STRVAR(get__doc__
,
1865 "D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.");
1867 PyDoc_STRVAR(setdefault_doc__
,
1868 "D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D");
1870 PyDoc_STRVAR(pop__doc__
,
1871 "D.pop(k[,d]) -> v, remove specified key and return the corresponding value\n\
1872 If key is not found, d is returned if given, otherwise KeyError is raised");
1874 PyDoc_STRVAR(popitem__doc__
,
1875 "D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
1876 2-tuple; but raise KeyError if D is empty");
1878 PyDoc_STRVAR(keys__doc__
,
1879 "D.keys() -> list of D's keys");
1881 PyDoc_STRVAR(items__doc__
,
1882 "D.items() -> list of D's (key, value) pairs, as 2-tuples");
1884 PyDoc_STRVAR(values__doc__
,
1885 "D.values() -> list of D's values");
1887 PyDoc_STRVAR(update__doc__
,
1888 "D.update(E, **F) -> None. Update D from E and F: for k in E: D[k] = E[k]\n\
1889 (if E has keys else: for (k, v) in E: D[k] = v) then: for k in F: D[k] = F[k]");
1891 PyDoc_STRVAR(fromkeys__doc__
,
1892 "dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v.\n\
1893 v defaults to None.");
1895 PyDoc_STRVAR(clear__doc__
,
1896 "D.clear() -> None. Remove all items from D.");
1898 PyDoc_STRVAR(copy__doc__
,
1899 "D.copy() -> a shallow copy of D");
1901 PyDoc_STRVAR(iterkeys__doc__
,
1902 "D.iterkeys() -> an iterator over the keys of D");
1904 PyDoc_STRVAR(itervalues__doc__
,
1905 "D.itervalues() -> an iterator over the values of D");
1907 PyDoc_STRVAR(iteritems__doc__
,
1908 "D.iteritems() -> an iterator over the (key, value) items of D");
1910 static PyMethodDef mapp_methods
[] = {
1911 {"__contains__",(PyCFunction
)dict_has_key
, METH_O
| METH_COEXIST
,
1913 {"__getitem__", (PyCFunction
)dict_subscript
, METH_O
| METH_COEXIST
,
1915 {"has_key", (PyCFunction
)dict_has_key
, METH_O
,
1917 {"get", (PyCFunction
)dict_get
, METH_VARARGS
,
1919 {"setdefault", (PyCFunction
)dict_setdefault
, METH_VARARGS
,
1921 {"pop", (PyCFunction
)dict_pop
, METH_VARARGS
,
1923 {"popitem", (PyCFunction
)dict_popitem
, METH_NOARGS
,
1925 {"keys", (PyCFunction
)dict_keys
, METH_NOARGS
,
1927 {"items", (PyCFunction
)dict_items
, METH_NOARGS
,
1929 {"values", (PyCFunction
)dict_values
, METH_NOARGS
,
1931 {"update", (PyCFunction
)dict_update
, METH_VARARGS
| METH_KEYWORDS
,
1933 {"fromkeys", (PyCFunction
)dict_fromkeys
, METH_VARARGS
| METH_CLASS
,
1935 {"clear", (PyCFunction
)dict_clear
, METH_NOARGS
,
1937 {"copy", (PyCFunction
)dict_copy
, METH_NOARGS
,
1939 {"iterkeys", (PyCFunction
)dict_iterkeys
, METH_NOARGS
,
1941 {"itervalues", (PyCFunction
)dict_itervalues
, METH_NOARGS
,
1943 {"iteritems", (PyCFunction
)dict_iteritems
, METH_NOARGS
,
1945 {NULL
, NULL
} /* sentinel */
1948 /* Return 1 if `key` is in dict `op`, 0 if not, and -1 on error. */
1950 PyDict_Contains(PyObject
*op
, PyObject
*key
)
1953 dictobject
*mp
= (dictobject
*)op
;
1956 if (!PyString_CheckExact(key
) ||
1957 (hash
= ((PyStringObject
*) key
)->ob_shash
) == -1) {
1958 hash
= PyObject_Hash(key
);
1962 ep
= (mp
->ma_lookup
)(mp
, key
, hash
);
1963 return ep
== NULL
? -1 : (ep
->me_value
!= NULL
);
1966 /* Hack to implement "key in dict" */
1967 static PySequenceMethods dict_as_sequence
= {
1973 0, /* sq_ass_item */
1974 0, /* sq_ass_slice */
1975 PyDict_Contains
, /* sq_contains */
1976 0, /* sq_inplace_concat */
1977 0, /* sq_inplace_repeat */
1981 dict_new(PyTypeObject
*type
, PyObject
*args
, PyObject
*kwds
)
1985 assert(type
!= NULL
&& type
->tp_alloc
!= NULL
);
1986 self
= type
->tp_alloc(type
, 0);
1988 PyDictObject
*d
= (PyDictObject
*)self
;
1989 /* It's guaranteed that tp->alloc zeroed out the struct. */
1990 assert(d
->ma_table
== NULL
&& d
->ma_fill
== 0 && d
->ma_used
== 0);
1991 INIT_NONZERO_DICT_SLOTS(d
);
1992 d
->ma_lookup
= lookdict_string
;
1993 #ifdef SHOW_CONVERSION_COUNTS
2001 dict_init(PyObject
*self
, PyObject
*args
, PyObject
*kwds
)
2003 return dict_update_common(self
, args
, kwds
, "dict");
2007 dict_nohash(PyObject
*self
)
2009 PyErr_SetString(PyExc_TypeError
, "dict objects are unhashable");
2014 dict_iter(dictobject
*dict
)
2016 return dictiter_new(dict
, &PyDictIterKey_Type
);
2019 PyDoc_STRVAR(dictionary_doc
,
2020 "dict() -> new empty dictionary.\n"
2021 "dict(mapping) -> new dictionary initialized from a mapping object's\n"
2022 " (key, value) pairs.\n"
2023 "dict(seq) -> new dictionary initialized as if via:\n"
2025 " for k, v in seq:\n"
2027 "dict(**kwargs) -> new dictionary initialized with the name=value pairs\n"
2028 " in the keyword argument list. For example: dict(one=1, two=2)");
2030 PyTypeObject PyDict_Type
= {
2031 PyObject_HEAD_INIT(&PyType_Type
)
2036 (destructor
)dict_dealloc
, /* tp_dealloc */
2037 (printfunc
)dict_print
, /* tp_print */
2040 (cmpfunc
)dict_compare
, /* tp_compare */
2041 (reprfunc
)dict_repr
, /* tp_repr */
2042 0, /* tp_as_number */
2043 &dict_as_sequence
, /* tp_as_sequence */
2044 &dict_as_mapping
, /* tp_as_mapping */
2045 dict_nohash
, /* tp_hash */
2048 PyObject_GenericGetAttr
, /* tp_getattro */
2049 0, /* tp_setattro */
2050 0, /* tp_as_buffer */
2051 Py_TPFLAGS_DEFAULT
| Py_TPFLAGS_HAVE_GC
|
2052 Py_TPFLAGS_BASETYPE
, /* tp_flags */
2053 dictionary_doc
, /* tp_doc */
2054 dict_traverse
, /* tp_traverse */
2055 dict_tp_clear
, /* tp_clear */
2056 dict_richcompare
, /* tp_richcompare */
2057 0, /* tp_weaklistoffset */
2058 (getiterfunc
)dict_iter
, /* tp_iter */
2059 0, /* tp_iternext */
2060 mapp_methods
, /* tp_methods */
2065 0, /* tp_descr_get */
2066 0, /* tp_descr_set */
2067 0, /* tp_dictoffset */
2068 dict_init
, /* tp_init */
2069 PyType_GenericAlloc
, /* tp_alloc */
2070 dict_new
, /* tp_new */
2071 PyObject_GC_Del
, /* tp_free */
2074 /* For backward compatibility with old dictionary interface */
2077 PyDict_GetItemString(PyObject
*v
, const char *key
)
2080 kv
= PyString_FromString(key
);
2083 rv
= PyDict_GetItem(v
, kv
);
2089 PyDict_SetItemString(PyObject
*v
, const char *key
, PyObject
*item
)
2093 kv
= PyString_FromString(key
);
2096 PyString_InternInPlace(&kv
); /* XXX Should we really? */
2097 err
= PyDict_SetItem(v
, kv
, item
);
2103 PyDict_DelItemString(PyObject
*v
, const char *key
)
2107 kv
= PyString_FromString(key
);
2110 err
= PyDict_DelItem(v
, kv
);
2115 /* Dictionary iterator types */
2119 dictobject
*di_dict
; /* Set to NULL when iterator is exhausted */
2122 PyObject
* di_result
; /* reusable result tuple for iteritems */
2127 dictiter_new(dictobject
*dict
, PyTypeObject
*itertype
)
2130 di
= PyObject_New(dictiterobject
, itertype
);
2135 di
->di_used
= dict
->ma_used
;
2137 di
->len
= dict
->ma_used
;
2138 if (itertype
== &PyDictIterItem_Type
) {
2139 di
->di_result
= PyTuple_Pack(2, Py_None
, Py_None
);
2140 if (di
->di_result
== NULL
) {
2146 di
->di_result
= NULL
;
2147 return (PyObject
*)di
;
2151 dictiter_dealloc(dictiterobject
*di
)
2153 Py_XDECREF(di
->di_dict
);
2154 Py_XDECREF(di
->di_result
);
2159 dictiter_len(dictiterobject
*di
)
2162 if (di
->di_dict
!= NULL
&& di
->di_used
== di
->di_dict
->ma_used
)
2164 return PyInt_FromSize_t(len
);
2167 PyDoc_STRVAR(length_hint_doc
, "Private method returning an estimate of len(list(it)).");
2169 static PyMethodDef dictiter_methods
[] = {
2170 {"__length_hint__", (PyCFunction
)dictiter_len
, METH_NOARGS
, length_hint_doc
},
2171 {NULL
, NULL
} /* sentinel */
2174 static PyObject
*dictiter_iternextkey(dictiterobject
*di
)
2177 register Py_ssize_t i
, mask
;
2178 register dictentry
*ep
;
2179 dictobject
*d
= di
->di_dict
;
2183 assert (PyDict_Check(d
));
2185 if (di
->di_used
!= d
->ma_used
) {
2186 PyErr_SetString(PyExc_RuntimeError
,
2187 "dictionary changed size during iteration");
2188 di
->di_used
= -1; /* Make this state sticky */
2197 while (i
<= mask
&& ep
[i
].me_value
== NULL
)
2213 PyTypeObject PyDictIterKey_Type
= {
2214 PyObject_HEAD_INIT(&PyType_Type
)
2216 "dictionary-keyiterator", /* tp_name */
2217 sizeof(dictiterobject
), /* tp_basicsize */
2218 0, /* tp_itemsize */
2220 (destructor
)dictiter_dealloc
, /* tp_dealloc */
2226 0, /* tp_as_number */
2227 0, /* tp_as_sequence */
2228 0, /* tp_as_mapping */
2232 PyObject_GenericGetAttr
, /* tp_getattro */
2233 0, /* tp_setattro */
2234 0, /* tp_as_buffer */
2235 Py_TPFLAGS_DEFAULT
, /* tp_flags */
2237 0, /* tp_traverse */
2239 0, /* tp_richcompare */
2240 0, /* tp_weaklistoffset */
2241 PyObject_SelfIter
, /* tp_iter */
2242 (iternextfunc
)dictiter_iternextkey
, /* tp_iternext */
2243 dictiter_methods
, /* tp_methods */
2247 static PyObject
*dictiter_iternextvalue(dictiterobject
*di
)
2250 register Py_ssize_t i
, mask
;
2251 register dictentry
*ep
;
2252 dictobject
*d
= di
->di_dict
;
2256 assert (PyDict_Check(d
));
2258 if (di
->di_used
!= d
->ma_used
) {
2259 PyErr_SetString(PyExc_RuntimeError
,
2260 "dictionary changed size during iteration");
2261 di
->di_used
= -1; /* Make this state sticky */
2267 if (i
< 0 || i
> mask
)
2270 while ((value
=ep
[i
].me_value
) == NULL
) {
2286 PyTypeObject PyDictIterValue_Type
= {
2287 PyObject_HEAD_INIT(&PyType_Type
)
2289 "dictionary-valueiterator", /* tp_name */
2290 sizeof(dictiterobject
), /* tp_basicsize */
2291 0, /* tp_itemsize */
2293 (destructor
)dictiter_dealloc
, /* tp_dealloc */
2299 0, /* tp_as_number */
2300 0, /* tp_as_sequence */
2301 0, /* tp_as_mapping */
2305 PyObject_GenericGetAttr
, /* tp_getattro */
2306 0, /* tp_setattro */
2307 0, /* tp_as_buffer */
2308 Py_TPFLAGS_DEFAULT
, /* tp_flags */
2310 0, /* tp_traverse */
2312 0, /* tp_richcompare */
2313 0, /* tp_weaklistoffset */
2314 PyObject_SelfIter
, /* tp_iter */
2315 (iternextfunc
)dictiter_iternextvalue
, /* tp_iternext */
2316 dictiter_methods
, /* tp_methods */
2320 static PyObject
*dictiter_iternextitem(dictiterobject
*di
)
2322 PyObject
*key
, *value
, *result
= di
->di_result
;
2323 register Py_ssize_t i
, mask
;
2324 register dictentry
*ep
;
2325 dictobject
*d
= di
->di_dict
;
2329 assert (PyDict_Check(d
));
2331 if (di
->di_used
!= d
->ma_used
) {
2332 PyErr_SetString(PyExc_RuntimeError
,
2333 "dictionary changed size during iteration");
2334 di
->di_used
= -1; /* Make this state sticky */
2343 while (i
<= mask
&& ep
[i
].me_value
== NULL
)
2349 if (result
->ob_refcnt
== 1) {
2351 Py_DECREF(PyTuple_GET_ITEM(result
, 0));
2352 Py_DECREF(PyTuple_GET_ITEM(result
, 1));
2354 result
= PyTuple_New(2);
2360 value
= ep
[i
].me_value
;
2363 PyTuple_SET_ITEM(result
, 0, key
);
2364 PyTuple_SET_ITEM(result
, 1, value
);
2373 PyTypeObject PyDictIterItem_Type
= {
2374 PyObject_HEAD_INIT(&PyType_Type
)
2376 "dictionary-itemiterator", /* tp_name */
2377 sizeof(dictiterobject
), /* tp_basicsize */
2378 0, /* tp_itemsize */
2380 (destructor
)dictiter_dealloc
, /* tp_dealloc */
2386 0, /* tp_as_number */
2387 0, /* tp_as_sequence */
2388 0, /* tp_as_mapping */
2392 PyObject_GenericGetAttr
, /* tp_getattro */
2393 0, /* tp_setattro */
2394 0, /* tp_as_buffer */
2395 Py_TPFLAGS_DEFAULT
, /* tp_flags */
2397 0, /* tp_traverse */
2399 0, /* tp_richcompare */
2400 0, /* tp_weaklistoffset */
2401 PyObject_SelfIter
, /* tp_iter */
2402 (iternextfunc
)dictiter_iternextitem
, /* tp_iternext */
2403 dictiter_methods
, /* tp_methods */