
Author rhettinger
Recipients rhettinger
Date 2014-03-30.08:58:28
Message-id <1396169909.67.0.904843670128.issue21101@psf.upfronthosting.co.za>
In-reply-to
Content
Propose adding two functions, PyDict_GetItem_KnownHash() and PyDict_SetItem_KnownHash().

It is reasonably common to make two successive dictionary accesses with the same key.  This results in calling the hash function twice to compute the same result.

For example, the technique can be used to speed up collections.Counter (see the attached patch to show how).  In that patch, the hash is computed once, then used twice (to retrieve the prior count and to store the new count).
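
To make the cost concrete, here is a minimal sketch in pure Python (not part of the attached patch) that counts how often __hash__ runs during the usual get-then-store counting pattern. CountingKey and tally() are hypothetical names invented for this illustration, and the count shown assumes CPython's dict behaviour.

```python
class CountingKey:
    """Hypothetical key type that records how often __hash__ is called."""

    hash_calls = 0  # class-wide counter, reset before each measurement

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        CountingKey.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.value == other.value


def tally(items):
    """Count items with the common two-access pattern: one get, one store."""
    counts = {}
    for item in items:
        # counts.get() hashes item once; the assignment hashes it again.
        counts[item] = counts.get(item, 0) + 1
    return counts


keys = [CountingKey("a"), CountingKey("b"), CountingKey("a")]
CountingKey.hash_calls = 0
counts = tally(keys)
hash_calls_during_tally = CountingKey.hash_calls  # two per item on CPython
```

With three items, __hash__ runs six times. A known-hash variant of the lookup and the store would need only one hash computation per item.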

There are many other places in the standard library that could benefit:

    Modules/posixmodule.c 1254
    Modules/pyexpat.c 343 and 1788 and 1798
    Modules/_json.c 628 and 1446 and 1515 and 1697
    Modules/selectmodule.c 465
    Modules/zipmodule.c 137
    Objects/typeobject.c 6678 and 6685
    Objects/unicodeobject.c 14997
    Python/_warnings.c 195
    Python/compile.c 1134
    Python/import.c 1046 and 1066
    Python/symtable 671 and 687 and 1068

A similar technique has been used for years in the Objects/setobject.c internals as a way to eliminate unnecessary calls to PyObject_Hash() during set-to-set and set-to-dict operations.

The benefit is biggest for objects such as tuples or user-defined classes that have to recompute the hash on every call to PyObject_Hash().
History
Date                 User        Action  Args
2014-03-30 08:58:29  rhettinger  set     recipients: + rhettinger
2014-03-30 08:58:29  rhettinger  set     messageid: <1396169909.67.0.904843670128.issue21101@psf.upfronthosting.co.za>
2014-03-30 08:58:29  rhettinger  link    issue21101 messages
2014-03-30 08:58:28  rhettinger  create