Objects/dictobject.c

   1
   2 /* Dictionary object implementation using a hash table */
   3
   4 /* The distribution includes a separate file, Objects/dictnotes.txt,
   5    describing explorations into dictionary design and optimization.
   6    It covers typical dictionary use patterns, the parameters for
   7    tuning dictionaries, and several ideas for possible optimizations.
   8 */
   9
  10 #include "Python.h"
  11
  12 typedef PyDictEntry dictentry;
  13 typedef PyDictObject dictobject;
  14
  15 /* Define this out if you don't want conversion statistics on exit. */
  16 #undef SHOW_CONVERSION_COUNTS
  17
  18 /* See large comment block below.  This must be >= 1. */
  19 #define PERTURB_SHIFT 5
  20
  21 /*
  22 Major subtleties ahead:  Most hash schemes depend on having a "good" hash
  23 function, in the sense of simulating randomness.  Python doesn't:  its most
  24 important hash functions (for strings and ints) are very regular in common
  25 cases:
  26
  27 >>> map(hash, (0, 1, 2, 3))
  28 [0, 1, 2, 3]
  29 >>> map(hash, ("namea", "nameb", "namec", "named"))
  30 [-1658398457, -1658398460, -1658398459, -1658398462]
  31 >>>
  32
  33 This isn't necessarily bad!  To the contrary, in a table of size 2**i, taking
  34 the low-order i bits as the initial table index is extremely fast, and there
  35 are no collisions at all for dicts indexed by a contiguous range of ints.
  36 The same is approximately true when keys are "consecutive" strings.  So this
  37 gives better-than-random behavior in common cases, and that's very desirable.
  38
  39 OTOH, when collisions occur, the tendency to fill contiguous slices of the
  40 hash table makes a good collision resolution strategy crucial.  Taking only
  41 the last i bits of the hash code is also vulnerable:  for example, consider
  42 [i << 16 for i in range(20000)] as a set of keys.  Since ints are their own
  43 hash codes, and this fits in a dict of size 2**15, the last 15 bits of every
  44 hash code are all 0:  they *all* map to the same table index.
  45
  46 But catering to unusual cases should not slow the usual ones, so we just take
  47 the last i bits anyway.  It's up to collision resolution to do the rest.  If
  48 we *usually* find the key we're looking for on the first try (and, it turns
  49 out, we usually do -- the table load factor is kept under 2/3, so the odds
  50 are solidly in our favor), then it makes best sense to keep the initial index
  51 computation dirt cheap.
  52
  53 The first half of collision resolution is to visit table indices via this
  54 recurrence:
  55
  56     j = ((5*j) + 1) mod 2**i
  57
  58 For any initial j in range(2**i), repeating that 2**i times generates each
  59 int in range(2**i) exactly once (see any text on random-number generation for
  60 proof).  By itself, this doesn't help much:  like linear probing (setting
  61 j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
  62 order.  This would be bad, except that's not the only thing we do, and it's
  63 actually *good* in the common cases where hash keys are consecutive.  In an
  64 example that's really too small to make this entirely clear, for a table of
  65 size 2**3 the order of indices is:
  66
  67     0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]
  68
  69 If two things come in at index 5, the first place we look after is index 2,
  70 not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
  71 Linear probing is deadly in this case because there the fixed probe order
  72 is the *same* as the order consecutive keys are likely to arrive.  But it's
  73 extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
  74 and certain that consecutive hash codes do not.
  75
  76 The other half of the strategy is to get the other bits of the hash code
  77 into play.  This is done by initializing a (unsigned) vrbl "perturb" to the
  78 full hash code, and changing the recurrence to:
  79
  80     j = (5*j) + 1 + perturb;
  81     perturb >>= PERTURB_SHIFT;
  82     use j % 2**i as the next table index;
  83
  84 Now the probe sequence depends (eventually) on every bit in the hash code,
  85 and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
  86 because it quickly magnifies small differences in the bits that didn't affect
  87 the initial index.  Note that because perturb is unsigned, if the recurrence
  88 is executed often enough perturb eventually becomes and remains 0.  At that
  89 point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
  90 that's certain to find an empty slot eventually (since it generates every int
  91 in range(2**i), and we make sure there's always at least one empty slot).
  92
  93 Selecting a good value for PERTURB_SHIFT is a balancing act.  You want it
  94 small so that the high bits of the hash code continue to affect the probe
  95 sequence across iterations; but you want it large so that in really bad cases
  96 the high-order hash bits have an effect on early iterations.  5 was "the
  97 best" in minimizing total collisions across experiments Tim Peters ran (on
  98 both normal and pathological cases), but 4 and 6 weren't significantly worse.
  99
 100 Historical:  Reimer Behrends contributed the idea of using a polynomial-based
 101 approach, using repeated multiplication by x in GF(2**n) where an irreducible
 102 polynomial for each table size was chosen such that x was a primitive root.
 103 Christian Tismer later extended that to use division by x instead, as an
 104 efficient way to get the high bits of the hash code into play.  This scheme
 105 also gave excellent collision statistics, but was more expensive:  two
 106 if-tests were required inside the loop; computing "the next" index took about
 107 the same number of operations but without as much potential parallelism
 108 (e.g., computing 5*j can go on at the same time as computing 1+perturb in the
 109 above, and then shifting perturb can be done while the table index is being
 110 masked); and the dictobject struct required a member to hold the table's
 111 polynomial.  In Tim's experiments the current scheme ran faster, produced
 112 equally good collision statistics, needed less code & used less memory.
 113 */
 114
 115 /* Object used as dummy key to fill deleted entries */
 116 static PyObject *dummy = NULL; /* Initialized by first call to newdictobject() */
 117
 118 /* forward declarations */
 119 static dictentry *
 120 lookdict_string(dictobject *mp, PyObject *key, long hash);
 121
 122 #ifdef SHOW_CONVERSION_COUNTS
 123 static long created = 0L;
 124 static long converted = 0L;
 125
 126 static void
 127 show_counts(void)
 128 {
 129         fprintf(stderr, "created %ld string dicts\n", created);
 130         fprintf(stderr, "converted %ld to normal dicts\n", converted);
 131         fprintf(stderr, "%.2f%% conversion rate\n", (100.0*converted)/created);
 132 }
 133 #endif
 134
 135 /* Initialization macros.
 136    There are two ways to create a dict:  PyDict_New() is the main C API
 137    function, and the tp_new slot maps to dict_new().  In the latter case we
 138    can save a little time over what PyDict_New does because it's guaranteed
 139    that the PyDictObject struct is already zeroed out.
 140    Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
 141    an excellent reason not to).
 142 */
 143
 144 #define INIT_NONZERO_DICT_SLOTS(mp) do {                                \
 145         (mp)->ma_table = (mp)->ma_smalltable;                           \
 146         (mp)->ma_mask = PyDict_MINSIZE - 1;                             \
 147     } while(0)
 148
 149 #define EMPTY_TO_MINSIZE(mp) do {                                       \
 150         memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable));    \
 151         (mp)->ma_used = (mp)->ma_fill = 0;                              \
 152         INIT_NONZERO_DICT_SLOTS(mp);                                    \
 153     } while(0)
 154
 155 /* Dictionary reuse scheme to save calls to malloc, free, and memset */
 156 #define MAXFREEDICTS 80
 157 static PyDictObject *free_dicts[MAXFREEDICTS];
 158 static int num_free_dicts = 0;
 159
 160 PyObject *
 161 PyDict_New(void)
 162 {
 163         register dictobject *mp;
 164         if (dummy == NULL) { /* Auto-initialize dummy */
 165                 dummy = PyString_FromString("<dummy key>");
 166                 if (dummy == NULL)
 167                         return NULL;
 168 #ifdef SHOW_CONVERSION_COUNTS
 169                 Py_AtExit(show_counts);
 170 #endif
 171         }
 172         if (num_free_dicts) {
 173                 mp = free_dicts[--num_free_dicts];
 174                 assert (mp != NULL);
 175                 assert (mp->ob_type == &PyDict_Type);
 176                 _Py_NewReference((PyObject *)mp);
 177                 if (mp->ma_fill) {
 178                         EMPTY_TO_MINSIZE(mp);
 179                 }
 180                 assert (mp->ma_used == 0);
 181                 assert (mp->ma_table == mp->ma_smalltable);
 182                 assert (mp->ma_mask == PyDict_MINSIZE - 1);
 183         } else {
 184                 mp = PyObject_GC_New(dictobject, &PyDict_Type);
 185                 if (mp == NULL)
 186                         return NULL;
 187                 EMPTY_TO_MINSIZE(mp);
 188         }
 189         mp->ma_lookup = lookdict_string;
 190 #ifdef SHOW_CONVERSION_COUNTS
 191         ++created;
 192 #endif
 193         _PyObject_GC_TRACK(mp);
 194         return (PyObject *)mp;
 195 }
 196
 197 /*
 198 The basic lookup function used by all operations.
 199 This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
 200 Open addressing is preferred over chaining since the link overhead for
 201 chaining would be substantial (100% with typical malloc overhead).
 202
 203 The initial probe index is computed as hash mod the table size. Subsequent
 204 probe indices are computed as explained earlier.
 205
 206 All arithmetic on hash should ignore overflow.
 207
 208 (The details in this version are due to Tim Peters, building on many past
 209 contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
 210 Christian Tismer).
 211
 212 This function must never return NULL; failures are indicated by returning
 213 a dictentry* for which the me_value field is NULL.  Exceptions are never
 214 reported by this function, and outstanding exceptions are maintained.
 215 */
 216
 217 static dictentry *
 218 lookdict(dictobject *mp, PyObject *key, register long hash)
 219 {
 220         register int i;
 221         register unsigned int perturb;
 222         register dictentry *freeslot;
 223         register unsigned int mask = mp->ma_mask;
 224         dictentry *ep0 = mp->ma_table;
 225         register dictentry *ep;
 226         register int restore_error;
 227         register int checked_error;
 228         register int cmp;
 229         PyObject *err_type, *err_value, *err_tb;
 230         PyObject *startkey;
 231
 232         i = hash & mask;
 233         ep = &ep0[i];
 234         if (ep->me_key == NULL || ep->me_key == key)
 235                 return ep;
 236
 237         restore_error = checked_error = 0;
 238         if (ep->me_key == dummy)
 239                 freeslot = ep;
 240         else {
 241                 if (ep->me_hash == hash) {
 242                         /* error can't have been checked yet */
 243                         checked_error = 1;
 244                         if (PyErr_Occurred()) {
 245                                 restore_error = 1;
 246                                 PyErr_Fetch(&err_type, &err_value, &err_tb);
 247                         }
 248                         startkey = ep->me_key;
 249                         cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
 250                         if (cmp < 0)
 251                                 PyErr_Clear();
 252                         if (ep0 == mp->ma_table && ep->me_key == startkey) {
 253                                 if (cmp > 0)
 254                                         goto Done;
 255                         }
 256                         else {
 257                                 /* The compare did major nasty stuff to the
 258                                  * dict:  start over.
 259                                  * XXX A clever adversary could prevent this
 260                                  * XXX from terminating.
 261                                  */
 262                                 ep = lookdict(mp, key, hash);
 263                                 goto Done;
 264                         }
 265                 }
 266                 freeslot = NULL;
 267         }
 268
 269         /* In the loop, me_key == dummy is by far (factor of 100s) the
 270            least likely outcome, so test for that last. */
 271         for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
 272                 i = (i << 2) + i + perturb + 1;
 273                 ep = &ep0[i & mask];
 274                 if (ep->me_key == NULL) {
 275                         if (freeslot != NULL)
 276                                 ep = freeslot;
 277                         break;
 278                 }
 279                 if (ep->me_key == key)
 280                         break;
 281                 if (ep->me_hash == hash && ep->me_key != dummy) {
 282                         if (!checked_error) {
 283                                 checked_error = 1;
 284                                 if (PyErr_Occurred()) {
 285                                         restore_error = 1;
 286                                         PyErr_Fetch(&err_type, &err_value,
 287                                                     &err_tb);
 288                                 }
 289                         }
 290                         startkey = ep->me_key;
 291                         cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
 292                         if (cmp < 0)
 293                                 PyErr_Clear();
 294                         if (ep0 == mp->ma_table && ep->me_key == startkey) {
 295                                 if (cmp > 0)
 296                                         break;
 297                         }
 298                         else {
 299                                 /* The compare did major nasty stuff to the
 300                                  * dict:  start over.
 301                                  * XXX A clever adversary could prevent this
 302                                  * XXX from terminating.
 303                                  */
 304                                 ep = lookdict(mp, key, hash);
 305                                 break;
 306                         }
 307                 }
 308                 else if (ep->me_key == dummy && freeslot == NULL)
 309                         freeslot = ep;
 310         }
 311
 312 Done:
 313         if (restore_error)
 314                 PyErr_Restore(err_type, err_value, err_tb);
 315         return ep;
 316 }
 317
 318 /*
 319  * Hacked up version of lookdict which can assume keys are always strings;
 320  * this assumption allows testing for errors during PyObject_Compare() to
 321  * be dropped; string-string comparisons never raise exceptions.  This also
 322  * means we don't need to go through PyObject_Compare(); we can always use
 323  * _PyString_Eq directly.
 324  *
 325  * This is valuable because the general-case error handling in lookdict() is
 326  * expensive, and dicts with pure-string keys are very common.
 327  */
 328 static dictentry *
 329 lookdict_string(dictobject *mp, PyObject *key, register long hash)
 330 {
 331         register int i;
 332         register unsigned int perturb;
 333         register dictentry *freeslot;
 334         register unsigned int mask = mp->ma_mask;
 335         dictentry *ep0 = mp->ma_table;
 336         register dictentry *ep;
 337
 338         /* Make sure this function doesn't have to handle non-string keys,
 339            including subclasses of str; e.g., one reason to subclass
 340            strings is to override __eq__, and for speed we don't cater to
 341            that here. */
 342         if (!PyString_CheckExact(key)) {
 343 #ifdef SHOW_CONVERSION_COUNTS
 344                 ++converted;
 345 #endif
 346                 mp->ma_lookup = lookdict;
 347                 return lookdict(mp, key, hash);
 348         }
 349         i = hash & mask;
 350         ep = &ep0[i];
 351         if (ep->me_key == NULL || ep->me_key == key)
 352                 return ep;
 353         if (ep->me_key == dummy)
 354                 freeslot = ep;
 355         else {
 356                 if (ep->me_hash == hash
 357                     && _PyString_Eq(ep->me_key, key)) {
 358                         return ep;
 359                 }
 360                 freeslot = NULL;
 361         }
 362
 363         /* In the loop, me_key == dummy is by far (factor of 100s) the
 364            least likely outcome, so test for that last. */
 365         for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
 366                 i = (i << 2) + i + perturb + 1;
 367                 ep = &ep0[i & mask];
 368                 if (ep->me_key == NULL)
 369                         return freeslot == NULL ? ep : freeslot;
 370                 if (ep->me_key == key
 371                     || (ep->me_hash == hash
 372                         && ep->me_key != dummy
 373                         && _PyString_Eq(ep->me_key, key)))
 374                         return ep;
 375                 if (ep->me_key == dummy && freeslot == NULL)
 376                         freeslot = ep;
 377         }
 378 }
 379
 380 /*
 381 Internal routine to insert a new item into the table.
 382 Used both by the internal resize routine and by the public insert routine.
 383 Eats a reference to key and one to value.
 384 */
 385 static void
 386 insertdict(register dictobject *mp, PyObject *key, long hash, PyObject *value)
 387 {
 388         PyObject *old_value;
 389         register dictentry *ep;
 390         typedef PyDictEntry *(*lookupfunc)(PyDictObject *, PyObject *, long);
 391
 392         assert(mp->ma_lookup != NULL);
 393         ep = mp->ma_lookup(mp, key, hash);
 394         if (ep->me_value != NULL) {
 395                 old_value = ep->me_value;
 396                 ep->me_value = value;
 397                 Py_DECREF(old_value); /* which **CAN** re-enter */
 398                 Py_DECREF(key);
 399         }
 400         else {
 401                 if (ep->me_key == NULL)
 402                         mp->ma_fill++;
 403                 else {
 404                         assert(ep->me_key == dummy);
 405                         Py_DECREF(dummy);
 406                 }
 407                 ep->me_key = key;
 408                 ep->me_hash = hash;
 409                 ep->me_value = value;
 410                 mp->ma_used++;
 411         }
 412 }
 413
 414 /*
 415 Restructure the table by allocating a new table and reinserting all
 416 items again.  When entries have been deleted, the new table may
 417 actually be smaller than the old one.
 418 */
 419 static int
 420 dictresize(dictobject *mp, int minused)
 421 {
 422         int newsize;
 423         dictentry *oldtable, *newtable, *ep;
 424         int i;
 425         int is_oldtable_malloced;
 426         dictentry small_copy[PyDict_MINSIZE];
 427
 428         assert(minused >= 0);
 429
 430         /* Find the smallest table size > minused. */
 431         for (newsize = PyDict_MINSIZE;
 432              newsize <= minused && newsize > 0;
 433              newsize <<= 1)
 434                 ;
 435         if (newsize <= 0) {
 436                 PyErr_NoMemory();
 437                 return -1;
 438         }
 439
 440         /* Get space for a new table. */
 441         oldtable = mp->ma_table;
 442         assert(oldtable != NULL);
 443         is_oldtable_malloced = oldtable != mp->ma_smalltable;
 444
 445         if (newsize == PyDict_MINSIZE) {
 446                 /* A large table is shrinking, or we can't get any smaller. */
 447                 newtable = mp->ma_smalltable;
 448                 if (newtable == oldtable) {
 449                         if (mp->ma_fill == mp->ma_used) {
 450                                 /* No dummies, so no point doing anything. */
 451                                 return 0;
 452                         }
 453                         /* We're not going to resize it, but rebuild the
 454                            table anyway to purge old dummy entries.
 455                            Subtle:  This is *necessary* if fill==size,
 456                            as lookdict needs at least one virgin slot to
 457                            terminate failing searches.  If fill < size, it's
 458                            merely desirable, as dummies slow searches. */
 459                         assert(mp->ma_fill > mp->ma_used);
 460                         memcpy(small_copy, oldtable, sizeof(small_copy));
 461                         oldtable = small_copy;
 462                 }
 463         }
 464         else {
 465                 newtable = PyMem_NEW(dictentry, newsize);
 466                 if (newtable == NULL) {
 467                         PyErr_NoMemory();
 468                         return -1;
 469                 }
 470         }
 471
 472         /* Make the dict empty, using the new table. */
 473         assert(newtable != oldtable);
 474         mp->ma_table = newtable;
 475         mp->ma_mask = newsize - 1;
 476         memset(newtable, 0, sizeof(dictentry) * newsize);
 477         mp->ma_used = 0;
 478         i = mp->ma_fill;
 479         mp->ma_fill = 0;
 480
 481         /* Copy the data over; this is refcount-neutral for active entries;
 482            dummy entries aren't copied over, of course */
 483         for (ep = oldtable; i > 0; ep++) {
 484                 if (ep->me_value != NULL) {     /* active entry */
 485                         --i;
 486                         insertdict(mp, ep->me_key, ep->me_hash, ep->me_value);
 487                 }
 488                 else if (ep->me_key != NULL) {  /* dummy entry */
 489                         --i;
 490                         assert(ep->me_key == dummy);
 491                         Py_DECREF(ep->me_key);
 492                 }
 493                 /* else key == value == NULL:  nothing to do */
 494         }
 495
 496         if (is_oldtable_malloced)
 497                 PyMem_DEL(oldtable);
 498         return 0;
 499 }
 500
 501 PyObject *
 502 PyDict_GetItem(PyObject *op, PyObject *key)
 503 {
 504         long hash;
 505         dictobject *mp = (dictobject *)op;
 506         if (!PyDict_Check(op)) {
 507                 return NULL;
 508         }
 509         if (!PyString_CheckExact(key) ||
 510             (hash = ((PyStringObject *) key)->ob_shash) == -1)
 511         {
 512                 hash = PyObject_Hash(key);
 513                 if (hash == -1) {
 514                         PyErr_Clear();
 515                         return NULL;
 516                 }
 517         }
 518         return (mp->ma_lookup)(mp, key, hash)->me_value;
 519 }
 520
 521 /* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
 522  * dictionary if it's merely replacing the value for an existing key.
 523  * This means that it's safe to loop over a dictionary with PyDict_Next()
 524  * and occasionally replace a value -- but you can't insert new keys or
 525  * remove them.
 526  */
 527 int
 528 PyDict_SetItem(register PyObject *op, PyObject *key, PyObject *value)
 529 {
 530         register dictobject *mp;
 531         register long hash;
 532         register int n_used;
 533
 534         if (!PyDict_Check(op)) {
 535                 PyErr_BadInternalCall();
 536                 return -1;
 537         }
 538         mp = (dictobject *)op;
 539         if (PyString_CheckExact(key)) {
 540                 hash = ((PyStringObject *)key)->ob_shash;
 541                 if (hash == -1)
 542                         hash = PyObject_Hash(key);
 543         }
 544         else {
 545                 hash = PyObject_Hash(key);
 546                 if (hash == -1)
 547                         return -1;
 548         }
 549         assert(mp->ma_fill <= mp->ma_mask);  /* at least one empty slot */
 550         n_used = mp->ma_used;
 551         Py_INCREF(value);
 552         Py_INCREF(key);
 553         insertdict(mp, key, hash, value);
 554         /* If we added a key, we can safely resize.  Otherwise just return!
 555          * If fill >= 2/3 size, adjust size.  Normally, this doubles or
 556          * quaduples the size, but it's also possible for the dict to shrink
 557          * (if ma_fill is much larger than ma_used, meaning a lot of dict
 558          * keys have been * deleted).
 559          *
 560          * Quadrupling the size improves average dictionary sparseness
 561          * (reducing collisions) at the cost of some memory and iteration
 562          * speed (which loops over every possible entry).  It also halves
 563          * the number of expensive resize operations in a growing dictionary.
 564          *
 565          * Very large dictionaries (over 50K items) use doubling instead.
 566          * This may help applications with severe memory constraints.
 567          */
 568         if (!(mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2))
 569                 return 0;
 570         return dictresize(mp, (mp->ma_used>50000 ? mp->ma_used*2 : mp->ma_used*4));
 571 }
 572
 573 int
 574 PyDict_DelItem(PyObject *op, PyObject *key)
 575 {
 576         register dictobject *mp;
 577         register long hash;
 578         register dictentry *ep;
 579         PyObject *old_value, *old_key;
 580
 581         if (!PyDict_Check(op)) {
 582                 PyErr_BadInternalCall();
 583                 return -1;
 584         }
 585         if (!PyString_CheckExact(key) ||
 586             (hash = ((PyStringObject *) key)->ob_shash) == -1) {
 587                 hash = PyObject_Hash(key);
 588                 if (hash == -1)
 589                         return -1;
 590         }
 591         mp = (dictobject *)op;
 592         ep = (mp->ma_lookup)(mp, key, hash);
 593         if (ep->me_value == NULL) {
 594                 PyErr_SetObject(PyExc_KeyError, key);
 595                 return -1;
 596         }
 597         old_key = ep->me_key;
 598         Py_INCREF(dummy);
 599         ep->me_key = dummy;
 600         old_value = ep->me_value;
 601         ep->me_value = NULL;
 602         mp->ma_used--;
 603         Py_DECREF(old_value);
 604         Py_DECREF(old_key);
 605         return 0;
 606 }
 607
 608 void
 609 PyDict_Clear(PyObject *op)
 610 {
 611         dictobject *mp;
 612         dictentry *ep, *table;
 613         int table_is_malloced;
 614         int fill;
 615         dictentry small_copy[PyDict_MINSIZE];
 616 #ifdef Py_DEBUG
 617         int i, n;
 618 #endif
 619
 620         if (!PyDict_Check(op))
 621                 return;
 622         mp = (dictobject *)op;
 623 #ifdef Py_DEBUG
 624         n = mp->ma_mask + 1;
 625         i = 0;
 626 #endif
 627
 628         table = mp->ma_table;
 629         assert(table != NULL);
 630         table_is_malloced = table != mp->ma_smalltable;
 631
 632         /* This is delicate.  During the process of clearing the dict,
 633          * decrefs can cause the dict to mutate.  To avoid fatal confusion
 634          * (voice of experience), we have to make the dict empty before
 635          * clearing the slots, and never refer to anything via mp->xxx while
 636          * clearing.
 637          */
 638         fill = mp->ma_fill;
 639         if (table_is_malloced)
 640                 EMPTY_TO_MINSIZE(mp);
 641
 642         else if (fill > 0) {
 643                 /* It's a small table with something that needs to be cleared.
 644                  * Afraid the only safe way is to copy the dict entries into
 645                  * another small table first.
 646                  */
 647                 memcpy(small_copy, table, sizeof(small_copy));
 648                 table = small_copy;
 649                 EMPTY_TO_MINSIZE(mp);
 650         }
 651         /* else it's a small table that's already empty */
 652
 653         /* Now we can finally clear things.  If C had refcounts, we could
 654          * assert that the refcount on table is 1 now, i.e. that this function
 655          * has unique access to it, so decref side-effects can't alter it.
 656          */
 657         for (ep = table; fill > 0; ++ep) {
 658 #ifdef Py_DEBUG
 659                 assert(i < n);
 660                 ++i;
 661 #endif
 662                 if (ep->me_key) {
 663                         --fill;
 664                         Py_DECREF(ep->me_key);
 665                         Py_XDECREF(ep->me_value);
 666                 }
 667 #ifdef Py_DEBUG
 668                 else
 669                         assert(ep->me_value == NULL);
 670 #endif
 671         }
 672
 673         if (table_is_malloced)
 674                 PyMem_DEL(table);
 675 }
 676
 677 /*
 678  * Iterate over a dict.  Use like so:
 679  *
 680  *     int i;
 681  *     PyObject *key, *value;
 682  *     i = 0;   # important!  i should not otherwise be changed by you
 683  *     while (PyDict_Next(yourdict, &i, &key, &value)) {
 684  *              Refer to borrowed references in key and value.
 685  *     }
 686  *
 687  * CAUTION:  In general, it isn't safe to use PyDict_Next in a loop that
 688  * mutates the dict.  One exception:  it is safe if the loop merely changes
 689  * the values associated with the keys (but doesn't insert new keys or
 690  * delete keys), via PyDict_SetItem().
 691  */
 692 int
 693 PyDict_Next(PyObject *op, int *ppos, PyObject **pkey, PyObject **pvalue)
 694 {
 695         register int i, mask;
 696         register dictentry *ep;
 697
 698         if (!PyDict_Check(op))
 699                 return 0;
 700         i = *ppos;
 701         if (i < 0)
 702                 return 0;
 703         ep = ((dictobject *)op)->ma_table;
 704         mask = ((dictobject *)op)->ma_mask;
 705         while (i <= mask && ep[i].me_value == NULL)
 706                 i++;
 707         *ppos = i+1;
 708         if (i > mask)
 709                 return 0;
 710         if (pkey)
 711                 *pkey = ep[i].me_key;
 712         if (pvalue)
 713                 *pvalue = ep[i].me_value;
 714         return 1;
 715 }
 716
 717 /* Methods */
 718
 719 static void
 720 dict_dealloc(register dictobject *mp)
 721 {
 722         register dictentry *ep;
 723         int fill = mp->ma_fill;
 724         PyObject_GC_UnTrack(mp);
 725         Py_TRASHCAN_SAFE_BEGIN(mp)
 726         for (ep = mp->ma_table; fill > 0; ep++) {
 727                 if (ep->me_key) {
 728                         --fill;
 729                         Py_DECREF(ep->me_key);
 730                         Py_XDECREF(ep->me_value);
 731                 }
 732         }
 733         if (mp->ma_table != mp->ma_smalltable)
 734                 PyMem_DEL(mp->ma_table);
 735         if (num_free_dicts < MAXFREEDICTS && mp->ob_type == &PyDict_Type)
 736                 free_dicts[num_free_dicts++] = mp;
 737         else
 738                 mp->ob_type->tp_free((PyObject *)mp);
 739         Py_TRASHCAN_SAFE_END(mp)
 740 }
 741
 742 static int
 743 dict_print(register dictobject *mp, register FILE *fp, register int flags)
 744 {
 745         register int i;
 746         register int any;
 747
 748         i = Py_ReprEnter((PyObject*)mp);
 749         if (i != 0) {
 750                 if (i < 0)
 751                         return i;
 752                 fprintf(fp, "{...}");
 753                 return 0;
 754         }
 755
 756         fprintf(fp, "{");
 757         any = 0;
 758         for (i = 0; i <= mp->ma_mask; i++) {
 759                 dictentry *ep = mp->ma_table + i;
 760                 PyObject *pvalue = ep->me_value;
 761                 if (pvalue != NULL) {
 762                         /* Prevent PyObject_Repr from deleting value during
 763                            key format */
 764                         Py_INCREF(pvalue);
 765                         if (any++ > 0)
 766                                 fprintf(fp, ", ");
 767                         if (PyObject_Print((PyObject *)ep->me_key, fp, 0)!=0) {
 768                                 Py_DECREF(pvalue);
 769                                 Py_ReprLeave((PyObject*)mp);
 770                                 return -1;
 771                         }
 772                         fprintf(fp, ": ");
 773                         if (PyObject_Print(pvalue, fp, 0) != 0) {
 774                                 Py_DECREF(pvalue);
 775                                 Py_ReprLeave((PyObject*)mp);
 776                                 return -1;
 777                         }
 778                         Py_DECREF(pvalue);
 779                 }
 780         }
 781         fprintf(fp, "}");
 782         Py_ReprLeave((PyObject*)mp);
 783         return 0;
 784 }
 785
 786 static PyObject *
 787 dict_repr(dictobject *mp)
 788 {
 789         int i;
 790         PyObject *s, *temp, *colon = NULL;
 791         PyObject *pieces = NULL, *result = NULL;
 792         PyObject *key, *value;
 793
 794         i = Py_ReprEnter((PyObject *)mp);
 795         if (i != 0) {
 796                 return i > 0 ? PyString_FromString("{...}") : NULL;
 797         }
 798
 799         if (mp->ma_used == 0) {
 800                 result = PyString_FromString("{}");
 801                 goto Done;
 802         }
 803
 804         pieces = PyList_New(0);
 805         if (pieces == NULL)
 806                 goto Done;
 807
 808         colon = PyString_FromString(": ");
 809         if (colon == NULL)
 810                 goto Done;
 811
 812         /* Do repr() on each key+value pair, and insert ": " between them.
 813            Note that repr may mutate the dict. */
 814         i = 0;
 815         while (PyDict_Next((PyObject *)mp, &i, &key, &value)) {
 816                 int status;
 817                 /* Prevent repr from deleting value during key format. */
 818                 Py_INCREF(value);
 819                 s = PyObject_Repr(key);
 820                 PyString_Concat(&s, colon);
 821                 PyString_ConcatAndDel(&s, PyObject_Repr(value));
 822                 Py_DECREF(value);
 823                 if (s == NULL)
 824                         goto Done;
 825                 status = PyList_Append(pieces, s);
 826                 Py_DECREF(s);  /* append created a new ref */
 827                 if (status < 0)
 828                         goto Done;
 829         }
 830
 831         /* Add "{}" decorations to the first and last items. */
 832         assert(PyList_GET_SIZE(pieces) > 0);
 833         s = PyString_FromString("{");
 834         if (s == NULL)
 835                 goto Done;
 836         temp = PyList_GET_ITEM(pieces, 0);
 837         PyString_ConcatAndDel(&s, temp);
 838         PyList_SET_ITEM(pieces, 0, s);
 839         if (s == NULL)
 840                 goto Done;
 841
 842         s = PyString_FromString("}");
 843         if (s == NULL)
 844                 goto Done;
 845         temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1);
 846         PyString_ConcatAndDel(&temp, s);
 847         PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp);
 848         if (temp == NULL)
 849                 goto Done;
 850
 851         /* Paste them all together with ", " between. */
 852         s = PyString_FromString(", ");
 853         if (s == NULL)
 854                 goto Done;
 855         result = _PyString_Join(s, pieces);
 856         Py_DECREF(s);
 857
 858 Done:
 859         Py_XDECREF(pieces);
 860         Py_XDECREF(colon);
 861         Py_ReprLeave((PyObject *)mp);
 862         return result;
 863 }
 864
 865 static int
 866 dict_length(dictobject *mp)
 867 {
 868         return mp->ma_used;
 869 }
 870
 871 static PyObject *
 872 dict_subscript(dictobject *mp, register PyObject *key)
 873 {
 874         PyObject *v;
 875         long hash;
 876         assert(mp->ma_table != NULL);
 877         if (!PyString_CheckExact(key) ||
 878             (hash = ((PyStringObject *) key)->ob_shash) == -1) {
 879                 hash = PyObject_Hash(key);
 880                 if (hash == -1)
 881                         return NULL;
 882         }
 883         v = (mp->ma_lookup)(mp, key, hash) -> me_value;
 884         if (v == NULL)
 885                 PyErr_SetObject(PyExc_KeyError, key);
 886         else
 887                 Py_INCREF(v);
 888         return v;
 889 }
 890
 891 static int
 892 dict_ass_sub(dictobject *mp, PyObject *v, PyObject *w)
 893 {
 894         if (w == NULL)
 895                 return PyDict_DelItem((PyObject *)mp, v);
 896         else
 897                 return PyDict_SetItem((PyObject *)mp, v, w);
 898 }
 899
 900 static PyMappingMethods dict_as_mapping = {
 901         (inquiry)dict_length, /*mp_length*/
 902         (binaryfunc)dict_subscript, /*mp_subscript*/
 903         (objobjargproc)dict_ass_sub, /*mp_ass_subscript*/
 904 };
 905
 906 static PyObject *
 907 dict_keys(register dictobject *mp)
 908 {
 909         register PyObject *v;
 910         register int i, j;
 911         dictentry *ep;
 912         int mask, n;
 913
 914   again:
 915         n = mp->ma_used;
 916         v = PyList_New(n);
 917         if (v == NULL)
 918                 return NULL;
 919         if (n != mp->ma_used) {
 920                 /* Durnit.  The allocations caused the dict to resize.
 921                  * Just start over, this shouldn't normally happen.
 922                  */
 923                 Py_DECREF(v);
 924                 goto again;
 925         }
 926         ep = mp->ma_table;
 927         mask = mp->ma_mask;
 928         for (i = 0, j = 0; i <= mask; i++) {
 929                 if (ep[i].me_value != NULL) {
 930                         PyObject *key = ep[i].me_key;
 931                         Py_INCREF(key);
 932                         PyList_SET_ITEM(v, j, key);
 933                         j++;
 934                 }
 935         }
 936         assert(j == n);
 937         return v;
 938 }
 939
 940 static PyObject *
 941 dict_values(register dictobject *mp)
 942 {
 943         register PyObject *v;
 944         register int i, j;
 945         dictentry *ep;
 946         int mask, n;
 947
 948   again:
 949         n = mp->ma_used;
 950         v = PyList_New(n);
 951         if (v == NULL)
 952                 return NULL;
 953         if (n != mp->ma_used) {
 954                 /* Durnit.  The allocations caused the dict to resize.
 955                  * Just start over, this shouldn't normally happen.
 956                  */
 957                 Py_DECREF(v);
 958                 goto again;
 959         }
 960         ep = mp->ma_table;
 961         mask = mp->ma_mask;
 962         for (i = 0, j = 0; i <= mask; i++) {
 963                 if (ep[i].me_value != NULL) {
 964                         PyObject *value = ep[i].me_value;
 965                         Py_INCREF(value);
 966                         PyList_SET_ITEM(v, j, value);
 967                         j++;
 968                 }
 969         }
 970         assert(j == n);
 971         return v;
 972 }
 973
 974 static PyObject *
 975 dict_items(register dictobject *mp)
 976 {
 977         register PyObject *v;
 978         register int i, j, n;
 979         int mask;
 980         PyObject *item, *key, *value;
 981         dictentry *ep;
 982
 983         /* Preallocate the list of tuples, to avoid allocations during
 984          * the loop over the items, which could trigger GC, which
 985          * could resize the dict. :-(
 986          */
 987   again:
 988         n = mp->ma_used;
 989         v = PyList_New(n);
 990         if (v == NULL)
 991                 return NULL;
 992         for (i = 0; i < n; i++) {
 993                 item = PyTuple_New(2);
 994                 if (item == NULL) {
 995                         Py_DECREF(v);
 996                         return NULL;
 997                 }
 998                 PyList_SET_ITEM(v, i, item);
 999         }
1000         if (n != mp->ma_used) {
1001                 /* Durnit.  The allocations caused the dict to resize.
1002                  * Just start over, this shouldn't normally happen.
1003                  */
1004                 Py_DECREF(v);
1005                 goto again;
1006         }
1007         /* Nothing we do below makes any function calls. */
1008         ep = mp->ma_table;
1009         mask = mp->ma_mask;
1010         for (i = 0, j = 0; i <= mask; i++) {
1011                 if ((value=ep[i].me_value) != NULL) {
1012                         key = ep[i].me_key;
1013                         item = PyList_GET_ITEM(v, j);
1014                         Py_INCREF(key);
1015                         PyTuple_SET_ITEM(item, 0, key);
1016                         Py_INCREF(value);
1017                         PyTuple_SET_ITEM(item, 1, value);
1018                         j++;
1019                 }
1020         }
1021         assert(j == n);
1022         return v;
1023 }
1024
1025 static PyObject *
1026 dict_fromkeys(PyObject *cls, PyObject *args)
1027 {
1028         PyObject *seq;
1029         PyObject *value = Py_None;
1030         PyObject *it;   /* iter(seq) */
1031         PyObject *key;
1032         PyObject *d;
1033         int status;
1034
1035         if (!PyArg_UnpackTuple(args, "fromkeys", 1, 2, &seq, &value))
1036                 return NULL;
1037
1038         d = PyObject_CallObject(cls, NULL);
1039         if (d == NULL)
1040                 return NULL;
1041
1042         it = PyObject_GetIter(seq);
1043         if (it == NULL){
1044                 Py_DECREF(d);
1045                 return NULL;
1046         }
1047
1048         for (;;) {
1049                 key = PyIter_Next(it);
1050                 if (key == NULL) {
1051                         if (PyErr_Occurred())
1052                                 goto Fail;
1053                         break;
1054                 }
1055                 status = PyObject_SetItem(d, key, value);
1056                 Py_DECREF(key);
1057                 if (status < 0)
1058                         goto Fail;
1059         }
1060
1061         Py_DECREF(it);
1062         return d;
1063
1064 Fail:
1065         Py_DECREF(it);
1066         Py_DECREF(d);
1067         return NULL;
1068 }
1069
1070 static int
1071 dict_update_common(PyObject *self, PyObject *args, PyObject *kwds, char *methname)
1072 {
1073         PyObject *arg = NULL;
1074         int result = 0;
1075
1076         if (!PyArg_UnpackTuple(args, methname, 0, 1, &arg))
1077                 result = -1;
1078
1079         else if (arg != NULL) {
1080                 if (PyObject_HasAttrString(arg, "keys"))
1081                         result = PyDict_Merge(self, arg, 1);
1082                 else
1083                         result = PyDict_MergeFromSeq2(self, arg, 1);
1084         }
1085         if (result == 0 && kwds != NULL)
1086                 result = PyDict_Merge(self, kwds, 1);
1087         return result;
1088 }
1089
1090 static PyObject *
1091 dict_update(PyObject *self, PyObject *args, PyObject *kwds)
1092 {
1093         if (dict_update_common(self, args, kwds, "update") != -1)
1094                 Py_RETURN_NONE;
1095         return NULL;
1096 }
1097
1098 /* Update unconditionally replaces existing items.
1099    Merge has a 3rd argument 'override'; if set, it acts like Update,
1100    otherwise it leaves existing items unchanged.
1101
1102    PyDict_{Update,Merge} update/merge from a mapping object.
1103
1104    PyDict_MergeFromSeq2 updates/merges from any iterable object
1105    producing iterable objects of length 2.
1106 */
1107
1108 int
1109 PyDict_MergeFromSeq2(PyObject *d, PyObject *seq2, int override)
1110 {
1111         PyObject *it;   /* iter(seq2) */
1112         int i;          /* index into seq2 of current element */
1113         PyObject *item; /* seq2[i] */
1114         PyObject *fast; /* item as a 2-tuple or 2-list */
1115
1116         assert(d != NULL);
1117         assert(PyDict_Check(d));
1118         assert(seq2 != NULL);
1119
1120         it = PyObject_GetIter(seq2);
1121         if (it == NULL)
1122                 return -1;
1123
1124         for (i = 0; ; ++i) {
1125                 PyObject *key, *value;
1126                 int n;
1127
1128                 fast = NULL;
1129                 item = PyIter_Next(it);
1130                 if (item == NULL) {
1131                         if (PyErr_Occurred())
1132                                 goto Fail;
1133                         break;
1134                 }
1135
1136                 /* Convert item to sequence, and verify length 2. */
1137                 fast = PySequence_Fast(item, "");
1138                 if (fast == NULL) {
1139                         if (PyErr_ExceptionMatches(PyExc_TypeError))
1140                                 PyErr_Format(PyExc_TypeError,
1141                                         "cannot convert dictionary update "
1142                                         "sequence element #%d to a sequence",
1143                                         i);
1144                         goto Fail;
1145                 }
1146                 n = PySequence_Fast_GET_SIZE(fast);
1147                 if (n != 2) {
1148                         PyErr_Format(PyExc_ValueError,
1149                                      "dictionary update sequence element #%d "
1150                                      "has length %d; 2 is required",
1151                                      i, n);
1152                         goto Fail;
1153                 }
1154
1155                 /* Update/merge with this (key, value) pair. */
1156                 key = PySequence_Fast_GET_ITEM(fast, 0);
1157                 value = PySequence_Fast_GET_ITEM(fast, 1);
1158                 if (override || PyDict_GetItem(d, key) == NULL) {
1159                         int status = PyDict_SetItem(d, key, value);
1160                         if (status < 0)
1161                                 goto Fail;
1162                 }
1163                 Py_DECREF(fast);
1164                 Py_DECREF(item);
1165         }
1166
1167         i = 0;
1168         goto Return;
1169 Fail:
1170         Py_XDECREF(item);
1171         Py_XDECREF(fast);
1172         i = -1;
1173 Return:
1174         Py_DECREF(it);
1175         return i;
1176 }
1177
1178 int
1179 PyDict_Update(PyObject *a, PyObject *b)
1180 {
1181         return PyDict_Merge(a, b, 1);
1182 }
1183
1184 int
1185 PyDict_Merge(PyObject *a, PyObject *b, int override)
1186 {
1187         register PyDictObject *mp, *other;
1188         register int i;
1189         dictentry *entry;
1190
1191         /* We accept for the argument either a concrete dictionary object,
1192          * or an abstract "mapping" object.  For the former, we can do
1193          * things quite efficiently.  For the latter, we only require that
1194          * PyMapping_Keys() and PyObject_GetItem() be supported.
1195          */
1196         if (a == NULL || !PyDict_Check(a) || b == NULL) {
1197                 PyErr_BadInternalCall();
1198                 return -1;
1199         }
1200         mp = (dictobject*)a;
1201         if (PyDict_Check(b)) {
1202                 other = (dictobject*)b;
1203                 if (other == mp || other->ma_used == 0)
1204                         /* a.update(a) or a.update({}); nothing to do */
1205                         return 0;
1206                 if (mp->ma_used == 0)
1207                         /* Since the target dict is empty, PyDict_GetItem()
1208                          * always returns NULL.  Setting override to 1
1209                          * skips the unnecessary test.
1210                          */
1211                         override = 1;
1212                 /* Do one big resize at the start, rather than
1213                  * incrementally resizing as we insert new items.  Expect
1214                  * that there will be no (or few) overlapping keys.
1215                  */
1216                 if ((mp->ma_fill + other->ma_used)*3 >= (mp->ma_mask+1)*2) {
1217                    if (dictresize(mp, (mp->ma_used + other->ma_used)*2) != 0)
1218                            return -1;
1219                 }
1220                 for (i = 0; i <= other->ma_mask; i++) {
1221                         entry = &other->ma_table[i];
1222                         if (entry->me_value != NULL &&
1223                             (override ||
1224                              PyDict_GetItem(a, entry->me_key) == NULL)) {
1225                                 Py_INCREF(entry->me_key);
1226                                 Py_INCREF(entry->me_value);
1227                                 insertdict(mp, entry->me_key, entry->me_hash,
1228                                            entry->me_value);
1229                         }
1230                 }
1231         }
1232         else {
1233                 /* Do it the generic, slower way */
1234                 PyObject *keys = PyMapping_Keys(b);
1235                 PyObject *iter;
1236                 PyObject *key, *value;
1237                 int status;
1238
1239                 if (keys == NULL)
1240                         /* Docstring says this is equivalent to E.keys() so
1241                          * if E doesn't have a .keys() method we want
1242                          * AttributeError to percolate up.  Might as well
1243                          * do the same for any other error.
1244                          */
1245                         return -1;
1246
1247                 iter = PyObject_GetIter(keys);
1248                 Py_DECREF(keys);
1249                 if (iter == NULL)
1250                         return -1;
1251
1252                 for (key = PyIter_Next(iter); key; key = PyIter_Next(iter)) {
1253                         if (!override && PyDict_GetItem(a, key) != NULL) {
1254                                 Py_DECREF(key);
1255                                 continue;
1256                         }
1257                         value = PyObject_GetItem(b, key);
1258                         if (value == NULL) {
1259                                 Py_DECREF(iter);
1260                                 Py_DECREF(key);
1261                                 return -1;
1262                         }
1263                         status = PyDict_SetItem(a, key, value);
1264                         Py_DECREF(key);
1265                         Py_DECREF(value);
1266                         if (status < 0) {
1267                                 Py_DECREF(iter);
1268                                 return -1;
1269                         }
1270                 }
1271                 Py_DECREF(iter);
1272                 if (PyErr_Occurred())
1273                         /* Iterator completed, via error */
1274                         return -1;
1275         }
1276         return 0;
1277 }
1278
1279 static PyObject *
1280 dict_copy(register dictobject *mp)
1281 {
1282         return PyDict_Copy((PyObject*)mp);
1283 }
1284
1285 PyObject *
1286 PyDict_Copy(PyObject *o)
1287 {
1288         PyObject *copy;
1289
1290         if (o == NULL || !PyDict_Check(o)) {
1291                 PyErr_BadInternalCall();
1292                 return NULL;
1293         }
1294         copy = PyDict_New();
1295         if (copy == NULL)
1296                 return NULL;
1297         if (PyDict_Merge(copy, o, 1) == 0)
1298                 return copy;
1299         Py_DECREF(copy);
1300         return NULL;
1301 }
1302
1303 int
1304 PyDict_Size(PyObject *mp)
1305 {
1306         if (mp == NULL || !PyDict_Check(mp)) {
1307                 PyErr_BadInternalCall();
1308                 return -1;
1309         }
1310         return ((dictobject *)mp)->ma_used;
1311 }
1312
1313 PyObject *
1314 PyDict_Keys(PyObject *mp)
1315 {
1316         if (mp == NULL || !PyDict_Check(mp)) {
1317                 PyErr_BadInternalCall();
1318                 return NULL;
1319         }
1320         return dict_keys((dictobject *)mp);
1321 }
1322
1323 PyObject *
1324 PyDict_Values(PyObject *mp)
1325 {
1326         if (mp == NULL || !PyDict_Check(mp)) {
1327                 PyErr_BadInternalCall();
1328                 return NULL;
1329         }
1330         return dict_values((dictobject *)mp);
1331 }
1332
1333 PyObject *
1334 PyDict_Items(PyObject *mp)
1335 {
1336         if (mp == NULL || !PyDict_Check(mp)) {
1337                 PyErr_BadInternalCall();
1338                 return NULL;
1339         }
1340         return dict_items((dictobject *)mp);
1341 }
1342
1343 /* Subroutine which returns the smallest key in a for which b's value
1344    is different or absent.  The value is returned too, through the
1345    pval argument.  Both are NULL if no key in a is found for which b's status
1346    differs.  The refcounts on (and only on) non-NULL *pval and function return
1347    values must be decremented by the caller (characterize() increments them
1348    to ensure that mutating comparison and PyDict_GetItem calls can't delete
1349    them before the caller is done looking at them). */
1350
1351 static PyObject *
1352 characterize(dictobject *a, dictobject *b, PyObject **pval)
1353 {
1354         PyObject *akey = NULL; /* smallest key in a s.t. a[akey] != b[akey] */
1355         PyObject *aval = NULL; /* a[akey] */
1356         int i, cmp;
1357
1358         for (i = 0; i <= a->ma_mask; i++) {
1359                 PyObject *thiskey, *thisaval, *thisbval;
1360                 if (a->ma_table[i].me_value == NULL)
1361                         continue;
1362                 thiskey = a->ma_table[i].me_key;
1363                 Py_INCREF(thiskey);  /* keep alive across compares */
1364                 if (akey != NULL) {
1365                         cmp = PyObject_RichCompareBool(akey, thiskey, Py_LT);
1366                         if (cmp < 0) {
1367                                 Py_DECREF(thiskey);
1368                                 goto Fail;
1369                         }
1370                         if (cmp > 0 ||
1371                             i > a->ma_mask ||
1372                             a->ma_table[i].me_value == NULL)
1373                         {
1374                                 /* Not the *smallest* a key; or maybe it is
1375                                  * but the compare shrunk the dict so we can't
1376                                  * find its associated value anymore; or
1377                                  * maybe it is but the compare deleted the
1378                                  * a[thiskey] entry.
1379                                  */
1380                                 Py_DECREF(thiskey);
1381                                 continue;
1382                         }
1383                 }
1384
1385                 /* Compare a[thiskey] to b[thiskey]; cmp <- true iff equal. */
1386                 thisaval = a->ma_table[i].me_value;
1387                 assert(thisaval);
1388                 Py_INCREF(thisaval);   /* keep alive */
1389                 thisbval = PyDict_GetItem((PyObject *)b, thiskey);
1390                 if (thisbval == NULL)
1391                         cmp = 0;
1392                 else {
1393                         /* both dicts have thiskey:  same values? */
1394                         cmp = PyObject_RichCompareBool(
1395                                                 thisaval, thisbval, Py_EQ);
1396                         if (cmp < 0) {
1397                                 Py_DECREF(thiskey);
1398                                 Py_DECREF(thisaval);
1399                                 goto Fail;
1400                         }
1401                 }
1402                 if (cmp == 0) {
1403                         /* New winner. */
1404                         Py_XDECREF(akey);
1405                         Py_XDECREF(aval);
1406                         akey = thiskey;
1407                         aval = thisaval;
1408                 }
1409                 else {
1410                         Py_DECREF(thiskey);
1411                         Py_DECREF(thisaval);
1412                 }
1413         }
1414         *pval = aval;
1415         return akey;
1416
1417 Fail:
1418         Py_XDECREF(akey);
1419         Py_XDECREF(aval);
1420         *pval = NULL;
1421         return NULL;
1422 }
1423
1424 static int
1425 dict_compare(dictobject *a, dictobject *b)
1426 {
1427         PyObject *adiff, *bdiff, *aval, *bval;
1428         int res;
1429
1430         /* Compare lengths first */
1431         if (a->ma_used < b->ma_used)
1432                 return -1;      /* a is shorter */
1433         else if (a->ma_used > b->ma_used)
1434                 return 1;       /* b is shorter */
1435
1436         /* Same length -- check all keys */
1437         bdiff = bval = NULL;
1438         adiff = characterize(a, b, &aval);
1439         if (adiff == NULL) {
1440                 assert(!aval);
1441                 /* Either an error, or a is a subset with the same length so
1442                  * must be equal.
1443                  */
1444                 res = PyErr_Occurred() ? -1 : 0;
1445                 goto Finished;
1446         }
1447         bdiff = characterize(b, a, &bval);
1448         if (bdiff == NULL && PyErr_Occurred()) {
1449                 assert(!bval);
1450                 res = -1;
1451                 goto Finished;
1452         }
1453         res = 0;
1454         if (bdiff) {
1455                 /* bdiff == NULL "should be" impossible now, but perhaps
1456                  * the last comparison done by the characterize() on a had
1457                  * the side effect of making the dicts equal!
1458                  */
1459                 res = PyObject_Compare(adiff, bdiff);
1460         }
1461         if (res == 0 && bval != NULL)
1462                 res = PyObject_Compare(aval, bval);
1463
1464 Finished:
1465         Py_XDECREF(adiff);
1466         Py_XDECREF(bdiff);
1467         Py_XDECREF(aval);
1468         Py_XDECREF(bval);
1469         return res;
1470 }
1471
1472 /* Return 1 if dicts equal, 0 if not, -1 if error.
1473  * Gets out as soon as any difference is detected.
1474  * Uses only Py_EQ comparison.
1475  */
1476 static int
1477 dict_equal(dictobject *a, dictobject *b)
1478 {
1479         int i;
1480
1481         if (a->ma_used != b->ma_used)
1482                 /* can't be equal if # of entries differ */
1483                 return 0;
1484
1485         /* Same # of entries -- check all of 'em.  Exit early on any diff. */
1486         for (i = 0; i <= a->ma_mask; i++) {
1487                 PyObject *aval = a->ma_table[i].me_value;
1488                 if (aval != NULL) {
1489                         int cmp;
1490                         PyObject *bval;
1491                         PyObject *key = a->ma_table[i].me_key;
1492                         /* temporarily bump aval's refcount to ensure it stays
1493                            alive until we're done with it */
1494                         Py_INCREF(aval);
1495                         bval = PyDict_GetItem((PyObject *)b, key);
1496                         if (bval == NULL) {
1497                                 Py_DECREF(aval);
1498                                 return 0;
1499                         }
1500                         cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);
1501                         Py_DECREF(aval);
1502                         if (cmp <= 0)  /* error or not equal */
1503                                 return cmp;
1504                 }
1505         }
1506         return 1;
1507  }
1508
1509 static PyObject *
1510 dict_richcompare(PyObject *v, PyObject *w, int op)
1511 {
1512         int cmp;
1513         PyObject *res;
1514
1515         if (!PyDict_Check(v) || !PyDict_Check(w)) {
1516                 res = Py_NotImplemented;
1517         }
1518         else if (op == Py_EQ || op == Py_NE) {
1519                 cmp = dict_equal((dictobject *)v, (dictobject *)w);
1520                 if (cmp < 0)
1521                         return NULL;
1522                 res = (cmp == (op == Py_EQ)) ? Py_True : Py_False;
1523         }
1524         else
1525                 res = Py_NotImplemented;
1526         Py_INCREF(res);
1527         return res;
1528  }
1529
1530 static PyObject *
1531 dict_has_key(register dictobject *mp, PyObject *key)
1532 {
1533         long hash;
1534         register long ok;
1535         if (!PyString_CheckExact(key) ||
1536             (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1537                 hash = PyObject_Hash(key);
1538                 if (hash == -1)
1539                         return NULL;
1540         }
1541         ok = (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
1542         return PyBool_FromLong(ok);
1543 }
1544
1545 static PyObject *
1546 dict_get(register dictobject *mp, PyObject *args)
1547 {
1548         PyObject *key;
1549         PyObject *failobj = Py_None;
1550         PyObject *val = NULL;
1551         long hash;
1552
1553         if (!PyArg_UnpackTuple(args, "get", 1, 2, &key, &failobj))
1554                 return NULL;
1555
1556         if (!PyString_CheckExact(key) ||
1557             (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1558                 hash = PyObject_Hash(key);
1559                 if (hash == -1)
1560                         return NULL;
1561         }
1562         val = (mp->ma_lookup)(mp, key, hash)->me_value;
1563
1564         if (val == NULL)
1565                 val = failobj;
1566         Py_INCREF(val);
1567         return val;
1568 }
1569
1570
1571 static PyObject *
1572 dict_setdefault(register dictobject *mp, PyObject *args)
1573 {
1574         PyObject *key;
1575         PyObject *failobj = Py_None;
1576         PyObject *val = NULL;
1577         long hash;
1578
1579         if (!PyArg_UnpackTuple(args, "setdefault", 1, 2, &key, &failobj))
1580                 return NULL;
1581
1582         if (!PyString_CheckExact(key) ||
1583             (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1584                 hash = PyObject_Hash(key);
1585                 if (hash == -1)
1586                         return NULL;
1587         }
1588         val = (mp->ma_lookup)(mp, key, hash)->me_value;
1589         if (val == NULL) {
1590                 val = failobj;
1591                 if (PyDict_SetItem((PyObject*)mp, key, failobj))
1592                         val = NULL;
1593         }
1594         Py_XINCREF(val);
1595         return val;
1596 }
1597
1598
1599 static PyObject *
1600 dict_clear(register dictobject *mp)
1601 {
1602         PyDict_Clear((PyObject *)mp);
1603         Py_RETURN_NONE;
1604 }
1605
1606 static PyObject *
1607 dict_pop(dictobject *mp, PyObject *args)
1608 {
1609         long hash;
1610         dictentry *ep;
1611         PyObject *old_value, *old_key;
1612         PyObject *key, *deflt = NULL;
1613
1614         if(!PyArg_UnpackTuple(args, "pop", 1, 2, &key, &deflt))
1615                 return NULL;
1616         if (mp->ma_used == 0) {
1617                 if (deflt) {
1618                         Py_INCREF(deflt);
1619                         return deflt;
1620                 }
1621                 PyErr_SetString(PyExc_KeyError,
1622                                 "pop(): dictionary is empty");
1623                 return NULL;
1624         }
1625         if (!PyString_CheckExact(key) ||
1626             (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1627                 hash = PyObject_Hash(key);
1628                 if (hash == -1)
1629                         return NULL;
1630         }
1631         ep = (mp->ma_lookup)(mp, key, hash);
1632         if (ep->me_value == NULL) {
1633                 if (deflt) {
1634                         Py_INCREF(deflt);
1635                         return deflt;
1636                 }
1637                 PyErr_SetObject(PyExc_KeyError, key);
1638                 return NULL;
1639         }
1640         old_key = ep->me_key;
1641         Py_INCREF(dummy);
1642         ep->me_key = dummy;
1643         old_value = ep->me_value;
1644         ep->me_value = NULL;
1645         mp->ma_used--;
1646         Py_DECREF(old_key);
1647         return old_value;
1648 }
1649
1650 static PyObject *
1651 dict_popitem(dictobject *mp)
1652 {
1653         int i = 0;
1654         dictentry *ep;
1655         PyObject *res;
1656
1657         /* Allocate the result tuple before checking the size.  Believe it
1658          * or not, this allocation could trigger a garbage collection which
1659          * could empty the dict, so if we checked the size first and that
1660          * happened, the result would be an infinite loop (searching for an
1661          * entry that no longer exists).  Note that the usual popitem()
1662          * idiom is "while d: k, v = d.popitem()". so needing to throw the
1663          * tuple away  if the dict *is* empty isn't a significant
1664          * inefficiency -- possible, but unlikely in practice.
1665          */
1666         res = PyTuple_New(2);
1667         if (res == NULL)
1668                 return NULL;
1669         if (mp->ma_used == 0) {
1670                 Py_DECREF(res);
1671                 PyErr_SetString(PyExc_KeyError,
1672                                 "popitem(): dictionary is empty");
1673                 return NULL;
1674         }
1675         /* Set ep to "the first" dict entry with a value.  We abuse the hash
1676          * field of slot 0 to hold a search finger:
1677          * If slot 0 has a value, use slot 0.
1678          * Else slot 0 is being used to hold a search finger,
1679          * and we use its hash value as the first index to look.
1680          */
1681         ep = &mp->ma_table[0];
1682         if (ep->me_value == NULL) {
1683                 i = (int)ep->me_hash;
1684                 /* The hash field may be a real hash value, or it may be a
1685                  * legit search finger, or it may be a once-legit search
1686                  * finger that's out of bounds now because it wrapped around
1687                  * or the table shrunk -- simply make sure it's in bounds now.
1688                  */
1689                 if (i > mp->ma_mask || i < 1)
1690                         i = 1;  /* skip slot 0 */
1691                 while ((ep = &mp->ma_table[i])->me_value == NULL) {
1692                         i++;
1693                         if (i > mp->ma_mask)
1694                                 i = 1;
1695                 }
1696         }
1697         PyTuple_SET_ITEM(res, 0, ep->me_key);
1698         PyTuple_SET_ITEM(res, 1, ep->me_value);
1699         Py_INCREF(dummy);
1700         ep->me_key = dummy;
1701         ep->me_value = NULL;
1702         mp->ma_used--;
1703         assert(mp->ma_table[0].me_value == NULL);
1704         mp->ma_table[0].me_hash = i + 1;  /* next place to start */
1705         return res;
1706 }
1707
1708 static int
1709 dict_traverse(PyObject *op, visitproc visit, void *arg)
1710 {
1711         int i = 0, err;
1712         PyObject *pk;
1713         PyObject *pv;
1714
1715         while (PyDict_Next(op, &i, &pk, &pv)) {
1716                 err = visit(pk, arg);
1717                 if (err)
1718                         return err;
1719                 err = visit(pv, arg);
1720                 if (err)
1721                         return err;
1722         }
1723         return 0;
1724 }
1725
1726 static int
1727 dict_tp_clear(PyObject *op)
1728 {
1729         PyDict_Clear(op);
1730         return 0;
1731 }
1732
1733
1734 extern PyTypeObject PyDictIterKey_Type; /* Forward */
1735 extern PyTypeObject PyDictIterValue_Type; /* Forward */
1736 extern PyTypeObject PyDictIterItem_Type; /* Forward */
1737 static PyObject *dictiter_new(dictobject *, PyTypeObject *);
1738
1739 static PyObject *
1740 dict_iterkeys(dictobject *dict)
1741 {
1742         return dictiter_new(dict, &PyDictIterKey_Type);
1743 }
1744
1745 static PyObject *
1746 dict_itervalues(dictobject *dict)
1747 {
1748         return dictiter_new(dict, &PyDictIterValue_Type);
1749 }
1750
1751 static PyObject *
1752 dict_iteritems(dictobject *dict)
1753 {
1754         return dictiter_new(dict, &PyDictIterItem_Type);
1755 }
1756
1757
1758 PyDoc_STRVAR(has_key__doc__,
1759 "D.has_key(k) -> True if D has a key k, else False");
1760
1761 PyDoc_STRVAR(contains__doc__,
1762 "D.__contains__(k) -> True if D has a key k, else False");
1763
1764 PyDoc_STRVAR(getitem__doc__, "x.__getitem__(y) <==> x[y]");
1765
1766 PyDoc_STRVAR(get__doc__,
1767 "D.get(k[,d]) -> D[k] if k in D, else d.  d defaults to None.");
1768
1769 PyDoc_STRVAR(setdefault_doc__,
1770 "D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D");
1771
1772 PyDoc_STRVAR(pop__doc__,
1773 "D.pop(k[,d]) -> v, remove specified key and return the corresponding value\n\
1774 If key is not found, d is returned if given, otherwise KeyError is raised");
1775
1776 PyDoc_STRVAR(popitem__doc__,
1777 "D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
1778 2-tuple; but raise KeyError if D is empty");
1779
1780 PyDoc_STRVAR(keys__doc__,
1781 "D.keys() -> list of D's keys");
1782
1783 PyDoc_STRVAR(items__doc__,
1784 "D.items() -> list of D's (key, value) pairs, as 2-tuples");
1785
1786 PyDoc_STRVAR(values__doc__,
1787 "D.values() -> list of D's values");
1788
1789 PyDoc_STRVAR(update__doc__,
1790 "D.update(E, **F) -> None.  Update D from E and F: for k in E: D[k] = E[k]\n\
1791 (if E has keys else: for (k, v) in E: D[k] = v) then: for k in F: D[k] = F[k]");
1792
1793 PyDoc_STRVAR(fromkeys__doc__,
1794 "dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v.\n\
1795 v defaults to None.");
1796
1797 PyDoc_STRVAR(clear__doc__,
1798 "D.clear() -> None.  Remove all items from D.");
1799
1800 PyDoc_STRVAR(copy__doc__,
1801 "D.copy() -> a shallow copy of D");
1802
1803 PyDoc_STRVAR(iterkeys__doc__,
1804 "D.iterkeys() -> an iterator over the keys of D");
1805
1806 PyDoc_STRVAR(itervalues__doc__,
1807 "D.itervalues() -> an iterator over the values of D");
1808
1809 PyDoc_STRVAR(iteritems__doc__,
1810 "D.iteritems() -> an iterator over the (key, value) items of D");
1811
1812 static PyMethodDef mapp_methods[] = {
1813         {"__contains__",(PyCFunction)dict_has_key,      METH_O | METH_COEXIST,
1814          contains__doc__},
1815         {"__getitem__", (PyCFunction)dict_subscript,    METH_O | METH_COEXIST,
1816          getitem__doc__},
1817         {"has_key",     (PyCFunction)dict_has_key,      METH_O,
1818          has_key__doc__},
1819         {"get",         (PyCFunction)dict_get,          METH_VARARGS,
1820          get__doc__},
1821         {"setdefault",  (PyCFunction)dict_setdefault,   METH_VARARGS,
1822          setdefault_doc__},
1823         {"pop",         (PyCFunction)dict_pop,          METH_VARARGS,
1824          pop__doc__},
1825         {"popitem",     (PyCFunction)dict_popitem,      METH_NOARGS,
1826          popitem__doc__},
1827         {"keys",        (PyCFunction)dict_keys,         METH_NOARGS,
1828         keys__doc__},
1829         {"items",       (PyCFunction)dict_items,        METH_NOARGS,
1830          items__doc__},
1831         {"values",      (PyCFunction)dict_values,       METH_NOARGS,
1832          values__doc__},
1833         {"update",      (PyCFunction)dict_update,       METH_VARARGS | METH_KEYWORDS,
1834          update__doc__},
1835         {"fromkeys",    (PyCFunction)dict_fromkeys,     METH_VARARGS | METH_CLASS,
1836          fromkeys__doc__},
1837         {"clear",       (PyCFunction)dict_clear,        METH_NOARGS,
1838          clear__doc__},
1839         {"copy",        (PyCFunction)dict_copy,         METH_NOARGS,
1840          copy__doc__},
1841         {"iterkeys",    (PyCFunction)dict_iterkeys,     METH_NOARGS,
1842          iterkeys__doc__},
1843         {"itervalues",  (PyCFunction)dict_itervalues,   METH_NOARGS,
1844          itervalues__doc__},
1845         {"iteritems",   (PyCFunction)dict_iteritems,    METH_NOARGS,
1846          iteritems__doc__},
1847         {NULL,          NULL}   /* sentinel */
1848 };
1849
1850 int
1851 PyDict_Contains(PyObject *op, PyObject *key)
1852 {
1853         long hash;
1854         dictobject *mp = (dictobject *)op;
1855
1856         if (!PyString_CheckExact(key) ||
1857             (hash = ((PyStringObject *) key)->ob_shash) == -1) {
1858                 hash = PyObject_Hash(key);
1859                 if (hash == -1)
1860                         return -1;
1861         }
1862         return (mp->ma_lookup)(mp, key, hash)->me_value != NULL;
1863 }
1864
1865 /* Hack to implement "key in dict" */
1866 static PySequenceMethods dict_as_sequence = {
1867         0,                                      /* sq_length */
1868         0,                                      /* sq_concat */
1869         0,                                      /* sq_repeat */
1870         0,                                      /* sq_item */
1871         0,                                      /* sq_slice */
1872         0,                                      /* sq_ass_item */
1873         0,                                      /* sq_ass_slice */
1874         (objobjproc)PyDict_Contains,            /* sq_contains */
1875         0,                                      /* sq_inplace_concat */
1876         0,                                      /* sq_inplace_repeat */
1877 };
1878
1879 static PyObject *
1880 dict_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
1881 {
1882         PyObject *self;
1883
1884         assert(type != NULL && type->tp_alloc != NULL);
1885         self = type->tp_alloc(type, 0);
1886         if (self != NULL) {
1887                 PyDictObject *d = (PyDictObject *)self;
1888                 /* It's guaranteed that tp->alloc zeroed out the struct. */
1889                 assert(d->ma_table == NULL && d->ma_fill == 0 && d->ma_used == 0);
1890                 INIT_NONZERO_DICT_SLOTS(d);
1891                 d->ma_lookup = lookdict_string;
1892 #ifdef SHOW_CONVERSION_COUNTS
1893                 ++created;
1894 #endif
1895         }
1896         return self;
1897 }
1898
1899 static int
1900 dict_init(PyObject *self, PyObject *args, PyObject *kwds)
1901 {
1902         return dict_update_common(self, args, kwds, "dict");
1903 }
1904
1905 static long
1906 dict_nohash(PyObject *self)
1907 {
1908         PyErr_SetString(PyExc_TypeError, "dict objects are unhashable");
1909         return -1;
1910 }
1911
1912 static PyObject *
1913 dict_iter(dictobject *dict)
1914 {
1915         return dictiter_new(dict, &PyDictIterKey_Type);
1916 }
1917
1918 PyDoc_STRVAR(dictionary_doc,
1919 "dict() -> new empty dictionary.\n"
1920 "dict(mapping) -> new dictionary initialized from a mapping object's\n"
1921 "    (key, value) pairs.\n"
1922 "dict(seq) -> new dictionary initialized as if via:\n"
1923 "    d = {}\n"
1924 "    for k, v in seq:\n"
1925 "        d[k] = v\n"
1926 "dict(**kwargs) -> new dictionary initialized with the name=value pairs\n"
1927 "    in the keyword argument list.  For example:  dict(one=1, two=2)");
1928
1929 PyTypeObject PyDict_Type = {
1930         PyObject_HEAD_INIT(&PyType_Type)
1931         0,
1932         "dict",
1933         sizeof(dictobject),
1934         0,
1935         (destructor)dict_dealloc,               /* tp_dealloc */
1936         (printfunc)dict_print,                  /* tp_print */
1937         0,                                      /* tp_getattr */
1938         0,                                      /* tp_setattr */
1939         (cmpfunc)dict_compare,                  /* tp_compare */
1940         (reprfunc)dict_repr,                    /* tp_repr */
1941         0,                                      /* tp_as_number */
1942         &dict_as_sequence,                      /* tp_as_sequence */
1943         &dict_as_mapping,                       /* tp_as_mapping */
1944         dict_nohash,                            /* tp_hash */
1945         0,                                      /* tp_call */
1946         0,                                      /* tp_str */
1947         PyObject_GenericGetAttr,                /* tp_getattro */
1948         0,                                      /* tp_setattro */
1949         0,                                      /* tp_as_buffer */
1950         Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
1951                 Py_TPFLAGS_BASETYPE,            /* tp_flags */
1952         dictionary_doc,                         /* tp_doc */
1953         (traverseproc)dict_traverse,            /* tp_traverse */
1954         (inquiry)dict_tp_clear,                 /* tp_clear */
1955         dict_richcompare,                       /* tp_richcompare */
1956         0,                                      /* tp_weaklistoffset */
1957         (getiterfunc)dict_iter,                 /* tp_iter */
1958         0,                                      /* tp_iternext */
1959         mapp_methods,                           /* tp_methods */
1960         0,                                      /* tp_members */
1961         0,                                      /* tp_getset */
1962         0,                                      /* tp_base */
1963         0,                                      /* tp_dict */
1964         0,                                      /* tp_descr_get */
1965         0,                                      /* tp_descr_set */
1966         0,                                      /* tp_dictoffset */
1967         (initproc)dict_init,                    /* tp_init */
1968         PyType_GenericAlloc,                    /* tp_alloc */
1969         dict_new,                               /* tp_new */
1970         PyObject_GC_Del,                        /* tp_free */
1971 };
1972
1973 /* For backward compatibility with old dictionary interface */
1974
1975 PyObject *
1976 PyDict_GetItemString(PyObject *v, const char *key)
1977 {
1978         PyObject *kv, *rv;
1979         kv = PyString_FromString(key);
1980         if (kv == NULL)
1981                 return NULL;
1982         rv = PyDict_GetItem(v, kv);
1983         Py_DECREF(kv);
1984         return rv;
1985 }
1986
1987 int
1988 PyDict_SetItemString(PyObject *v, const char *key, PyObject *item)
1989 {
1990         PyObject *kv;
1991         int err;
1992         kv = PyString_FromString(key);
1993         if (kv == NULL)
1994                 return -1;
1995         PyString_InternInPlace(&kv); /* XXX Should we really? */
1996         err = PyDict_SetItem(v, kv, item);
1997         Py_DECREF(kv);
1998         return err;
1999 }
2000
2001 int
2002 PyDict_DelItemString(PyObject *v, const char *key)
2003 {
2004         PyObject *kv;
2005         int err;
2006         kv = PyString_FromString(key);
2007         if (kv == NULL)
2008                 return -1;
2009         err = PyDict_DelItem(v, kv);
2010         Py_DECREF(kv);
2011         return err;
2012 }
2013
2014 /* Dictionary iterator types */
2015
2016 typedef struct {
2017         PyObject_HEAD
2018         dictobject *di_dict; /* Set to NULL when iterator is exhausted */
2019         int di_used;
2020         int di_pos;
2021         PyObject* di_result; /* reusable result tuple for iteritems */
2022         long len;
2023 } dictiterobject;
2024
2025 static PyObject *
2026 dictiter_new(dictobject *dict, PyTypeObject *itertype)
2027 {
2028         dictiterobject *di;
2029         di = PyObject_New(dictiterobject, itertype);
2030         if (di == NULL)
2031                 return NULL;
2032         Py_INCREF(dict);
2033         di->di_dict = dict;
2034         di->di_used = dict->ma_used;
2035         di->di_pos = 0;
2036         di->len = dict->ma_used;
2037         if (itertype == &PyDictIterItem_Type) {
2038                 di->di_result = PyTuple_Pack(2, Py_None, Py_None);
2039                 if (di->di_result == NULL) {
2040                         Py_DECREF(di);
2041                         return NULL;
2042                 }
2043         }
2044         else
2045                 di->di_result = NULL;
2046         return (PyObject *)di;
2047 }
2048
2049 static void
2050 dictiter_dealloc(dictiterobject *di)
2051 {
2052         Py_XDECREF(di->di_dict);
2053         Py_XDECREF(di->di_result);
2054         PyObject_Del(di);
2055 }
2056
2057 static PyObject *
2058 dictiter_len(dictiterobject *di)
2059 {
2060         int len = 0;
2061         if (di->di_dict != NULL && di->di_used == di->di_dict->ma_used)
2062                 len = di->len;
2063         return PyInt_FromLong(len);
2064 }
2065
2066 PyDoc_STRVAR(length_cue_doc, "Private method returning an estimate of len(list(it)).");
2067
2068 static PyMethodDef dictiter_methods[] = {
2069         {"_length_cue", (PyCFunction)dictiter_len, METH_NOARGS, length_cue_doc},
2070         {NULL,          NULL}           /* sentinel */
2071 };
2072
2073 static PyObject *dictiter_iternextkey(dictiterobject *di)
2074 {
2075         PyObject *key;
2076         register int i, mask;
2077         register dictentry *ep;
2078         dictobject *d = di->di_dict;
2079
2080         if (d == NULL)
2081                 return NULL;
2082         assert (PyDict_Check(d));
2083
2084         if (di->di_used != d->ma_used) {
2085                 PyErr_SetString(PyExc_RuntimeError,
2086                                 "dictionary changed size during iteration");
2087                 di->di_used = -1; /* Make this state sticky */
2088                 return NULL;
2089         }
2090
2091         i = di->di_pos;
2092         if (i < 0)
2093                 goto fail;
2094         ep = d->ma_table;
2095         mask = d->ma_mask;
2096         while (i <= mask && ep[i].me_value == NULL)
2097                 i++;
2098         di->di_pos = i+1;
2099         if (i > mask)
2100                 goto fail;
2101         di->len--;
2102         key = ep[i].me_key;
2103         Py_INCREF(key);
2104         return key;
2105
2106 fail:
2107         Py_DECREF(d);
2108         di->di_dict = NULL;
2109         return NULL;
2110 }
2111
2112 PyTypeObject PyDictIterKey_Type = {
2113         PyObject_HEAD_INIT(&PyType_Type)
2114         0,                                      /* ob_size */
2115         "dictionary-keyiterator",               /* tp_name */
2116         sizeof(dictiterobject),                 /* tp_basicsize */
2117         0,                                      /* tp_itemsize */
2118         /* methods */
2119         (destructor)dictiter_dealloc,           /* tp_dealloc */
2120         0,                                      /* tp_print */
2121         0,                                      /* tp_getattr */
2122         0,                                      /* tp_setattr */
2123         0,                                      /* tp_compare */
2124         0,                                      /* tp_repr */
2125         0,                                      /* tp_as_number */
2126         0,                                      /* tp_as_sequence */
2127         0,                                      /* tp_as_mapping */
2128         0,                                      /* tp_hash */
2129         0,                                      /* tp_call */
2130         0,                                      /* tp_str */
2131         PyObject_GenericGetAttr,                /* tp_getattro */
2132         0,                                      /* tp_setattro */
2133         0,                                      /* tp_as_buffer */
2134         Py_TPFLAGS_DEFAULT,                     /* tp_flags */
2135         0,                                      /* tp_doc */
2136         0,                                      /* tp_traverse */
2137         0,                                      /* tp_clear */
2138         0,                                      /* tp_richcompare */
2139         0,                                      /* tp_weaklistoffset */
2140         PyObject_SelfIter,                      /* tp_iter */
2141         (iternextfunc)dictiter_iternextkey,     /* tp_iternext */
2142         dictiter_methods,                       /* tp_methods */
2143         0,
2144 };
2145
2146 static PyObject *dictiter_iternextvalue(dictiterobject *di)
2147 {
2148         PyObject *value;
2149         register int i, mask;
2150         register dictentry *ep;
2151         dictobject *d = di->di_dict;
2152
2153         if (d == NULL)
2154                 return NULL;
2155         assert (PyDict_Check(d));
2156
2157         if (di->di_used != d->ma_used) {
2158                 PyErr_SetString(PyExc_RuntimeError,
2159                                 "dictionary changed size during iteration");
2160                 di->di_used = -1; /* Make this state sticky */
2161                 return NULL;
2162         }
2163
2164         i = di->di_pos;
2165         mask = d->ma_mask;
2166         if (i < 0 || i > mask)
2167                 goto fail;
2168         ep = d->ma_table;
2169         while ((value=ep[i].me_value) == NULL) {
2170                 i++;
2171                 if (i > mask)
2172                         goto fail;
2173         }
2174         di->di_pos = i+1;
2175         di->len--;
2176         Py_INCREF(value);
2177         return value;
2178
2179 fail:
2180         Py_DECREF(d);
2181         di->di_dict = NULL;
2182         return NULL;
2183 }
2184
2185 PyTypeObject PyDictIterValue_Type = {
2186         PyObject_HEAD_INIT(&PyType_Type)
2187         0,                                      /* ob_size */
2188         "dictionary-valueiterator",             /* tp_name */
2189         sizeof(dictiterobject),                 /* tp_basicsize */
2190         0,                                      /* tp_itemsize */
2191         /* methods */
2192         (destructor)dictiter_dealloc,           /* tp_dealloc */
2193         0,                                      /* tp_print */
2194         0,                                      /* tp_getattr */
2195         0,                                      /* tp_setattr */
2196         0,                                      /* tp_compare */
2197         0,                                      /* tp_repr */
2198         0,                                      /* tp_as_number */
2199         0,                                      /* tp_as_sequence */
2200         0,                                      /* tp_as_mapping */
2201         0,                                      /* tp_hash */
2202         0,                                      /* tp_call */
2203         0,                                      /* tp_str */
2204         PyObject_GenericGetAttr,                /* tp_getattro */
2205         0,                                      /* tp_setattro */
2206         0,                                      /* tp_as_buffer */
2207         Py_TPFLAGS_DEFAULT,                     /* tp_flags */
2208         0,                                      /* tp_doc */
2209         0,                                      /* tp_traverse */
2210         0,                                      /* tp_clear */
2211         0,                                      /* tp_richcompare */
2212         0,                                      /* tp_weaklistoffset */
2213         PyObject_SelfIter,                      /* tp_iter */
2214         (iternextfunc)dictiter_iternextvalue,   /* tp_iternext */
2215         dictiter_methods,                       /* tp_methods */
2216         0,
2217 };
2218
2219 static PyObject *dictiter_iternextitem(dictiterobject *di)
2220 {
2221         PyObject *key, *value, *result = di->di_result;
2222         register int i, mask;
2223         register dictentry *ep;
2224         dictobject *d = di->di_dict;
2225
2226         if (d == NULL)
2227                 return NULL;
2228         assert (PyDict_Check(d));
2229
2230         if (di->di_used != d->ma_used) {
2231                 PyErr_SetString(PyExc_RuntimeError,
2232                                 "dictionary changed size during iteration");
2233                 di->di_used = -1; /* Make this state sticky */
2234                 return NULL;
2235         }
2236
2237         i = di->di_pos;
2238         if (i < 0)
2239                 goto fail;
2240         ep = d->ma_table;
2241         mask = d->ma_mask;
2242         while (i <= mask && ep[i].me_value == NULL)
2243                 i++;
2244         di->di_pos = i+1;
2245         if (i > mask)
2246                 goto fail;
2247
2248         if (result->ob_refcnt == 1) {
2249                 Py_INCREF(result);
2250                 Py_DECREF(PyTuple_GET_ITEM(result, 0));
2251                 Py_DECREF(PyTuple_GET_ITEM(result, 1));
2252         } else {
2253                 result = PyTuple_New(2);
2254                 if (result == NULL)
2255                         return NULL;
2256         }
2257         di->len--;
2258         key = ep[i].me_key;
2259         value = ep[i].me_value;
2260         Py_INCREF(key);
2261         Py_INCREF(value);
2262         PyTuple_SET_ITEM(result, 0, key);
2263         PyTuple_SET_ITEM(result, 1, value);
2264         return result;
2265
2266 fail:
2267         Py_DECREF(d);
2268         di->di_dict = NULL;
2269         return NULL;
2270 }
2271
2272 PyTypeObject PyDictIterItem_Type = {
2273         PyObject_HEAD_INIT(&PyType_Type)
2274         0,                                      /* ob_size */
2275         "dictionary-itemiterator",              /* tp_name */
2276         sizeof(dictiterobject),                 /* tp_basicsize */
2277         0,                                      /* tp_itemsize */
2278         /* methods */
2279         (destructor)dictiter_dealloc,           /* tp_dealloc */
2280         0,                                      /* tp_print */
2281         0,                                      /* tp_getattr */
2282         0,                                      /* tp_setattr */
2283         0,                                      /* tp_compare */
2284         0,                                      /* tp_repr */
2285         0,                                      /* tp_as_number */
2286         0,                                      /* tp_as_sequence */
2287         0,                                      /* tp_as_mapping */
2288         0,                                      /* tp_hash */
2289         0,                                      /* tp_call */
2290         0,                                      /* tp_str */
2291         PyObject_GenericGetAttr,                /* tp_getattro */
2292         0,                                      /* tp_setattro */
2293         0,                                      /* tp_as_buffer */
2294         Py_TPFLAGS_DEFAULT,                     /* tp_flags */
2295         0,                                      /* tp_doc */
2296         0,                                      /* tp_traverse */
2297         0,                                      /* tp_clear */
2298         0,                                      /* tp_richcompare */
2299         0,                                      /* tp_weaklistoffset */
2300         PyObject_SelfIter,                      /* tp_iter */
2301         (iternextfunc)dictiter_iternextitem,    /* tp_iternext */
2302         dictiter_methods,                       /* tp_methods */
2303         0,
2304 };