classification
Title: Use hash if available to optimize a==b and a!=b for bytes and str
Type: performance Stage: patch review
Components: Versions: Python 3.4
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, djc, ezio.melotti, gregory.p.smith, meador.inge, pitrou, python-dev, rhettinger, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2012-10-19 12:36 by vstinner, last changed 2014-04-17 04:58 by rhettinger. This issue is now closed.

Files
File name Uploaded Description Edit
compare_hash.patch vstinner, 2012-10-19 12:36 review
compare_hash-2.patch vstinner, 2013-11-01 01:57 review
compare_hash-3.patch vstinner, 2013-11-04 10:30 review
Messages (23)
msg173332 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-10-19 12:36
Attached patch optimize a==b and a!=b operators for bytes and str types of Python 3.4. For str, memcmp() is now always used, instead of a loop using PyUnicode_READ() (which is slow) for kind different than 1. For bytes, compare the first but also the last byte before calling memcmp(), instead of just comparing the first byte. Similar optimization was implemented in Py_UNICODE_MATCH():

changeset:   38242:0de9a789de39
branch:      legacy-trunk
user:        Fredrik Lundh <fredrik@pythonware.com>
date:        Tue May 23 10:10:57 2006 +0000
files:       Include/unicodeobject.h
description:
needforspeed: check first *and* last character before doing a full memcmp

Initially I only wrote the patch to check the hash values before comparing content of the strings.

--

I done some statistics tests. For a fresh Python interpreter, the hash values are only known in 7% cases (but when hashes are compared, they are quite always different, so the optimization is useful). When running "./python -m test test_os", hashes are known and different in 41.4%. After running 70 tests, hashes are known and different in 80%.
msg173338 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-10-19 13:13
Good. I would like to see similar statistics tests for any real application.
msg173371 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2012-10-20 06:05
something to include in your statistics is the lengths of the already hashed data being compared.

i expect there to be a minimum length before this optimization is useful.
msg173576 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012-10-23 01:56
Rather than see statistics, I'm curious about what circumstances where the optimization would kick in.   Interned strings are pre-hashed but they already benefit from an identity-implies-equality check.  Dicts and sets already incorporate a check-hash-before-equality check.

That raises the question of what strings ever have had their hash already computed if the string hasn't been interned or has been used in a dict or set?

P.S.  I rather like the optimization and don't want to discourage it.  I'm just curious about what the current optimizations are missing.
msg173588 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-10-23 10:11
Oh, I forgot this issue when I did the following commit:
--
changeset:   79902:b68be1025c42
user:        Victor Stinner <victor.stinner@gmail.com>
date:        Tue Oct 23 02:48:49 2012 +0200
files:       Objects/unicodeobject.c
description:
Optimize PyUnicode_RichCompare() for Py_EQ and Py_NE: always use memcmp()
--
I will benchmark the overhead of memcmp() on short strings. We may
check the first and last characters before calling memcmp() to limit
the overhead of calling a function.

I also read that GCC uses its builtin memcmp() which is slower than
the memcmp() of the GNU libc.
msg178903 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-01-03 02:24
> P.S.  I rather like the optimization and don't want to discourage it.  I'm just curious about what the current optimizations are missing.

I'm too lazy to produce more statistics or run other benchmarks. I just saw an interesting optimization oportunity. I don't understand why it was not done before.

If you consider that compare_hash.patch might slow down Python, so ok, I will just close the issue as rejected.
msg186460 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-04-09 22:17
"I will benchmark the overhead of memcmp() on short strings. We may
check the first and last characters before calling memcmp() to limit
the overhead of calling a function."

I created the issue #17628 for this point.
msg201570 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-10-28 19:06
> That raises the question of what strings ever have had their hash
> already computed if the string hasn't been interned or has been used
> in a dict or set?

Currently, none, I think.
msg201571 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-28 19:17
>> That raises the question of what strings ever have had their hash
>> already computed if the string hasn't been interned or has been used
>> in a dict or set?
>
> Currently, none, I think.

Strings are used (and compared) outside dict and set.
msg201573 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-28 20:18
Let's try to identify some use cases in the Python test suite using gdb:

(gdb) b unicode_compare_eq
(gdb) condition 1 ((PyASCIIObject*)str1)->hash != -1 &&  ((PyASCIIObject*)str2)->hash != -1 && ((PyASCIIObject*)str1)->hash != ((PyASCIIObject*)str2)->hash 
(gdb) run

I didn't dig to understand why hash of these strings are computed. Tell me if you need more examples.


Random examples:

(1) compare "constant" strings (strings from co_consts of code objects)

importlib._bootstrap: _setup():

    os_details = ('posix', ['/']), ('nt', ['\\', '/'])
    for builtin_os, path_separators in os_details:
        ...
    ...
    if builtin_os == 'nt': <== HERE
        ...


(2) importlib._bootstrap: _LoaderBasics.is_package()

    def is_package(self, fullname):
        filename = _path_split(self.get_filename(fullname))[1]
        filename_base = filename.rsplit('.', 1)[0]
        tail_name = fullname.rpartition('.')[2]
        return filename_base == '__init__' and ... <== HERE

It's surprising that filename_base has its hash computed. I suppose that all these functions (_path_split, .rsplit, .rpartition) return the string unmodified.


(3) importlib._bootstrap: PathFinder._path_importer_cache():

    @classmethod
    def _path_importer_cache(cls, path):
        ...
        if path == '': <== HERE

path is an entry of sys.path.


(4) str in __all__ (list of str):

os.py:

    if "putenv" not in __all__:
        __all__.append("putenv")

__all__ is a list of strings.


(5) site.py:

    if __name__ == '__main__': <== HERE

__name__ is 'site'.


(6) Python/ceval.py: PyEval_EvalCodeEx() called with arbitrary keyword

    for (i = 0; i < kwcount; i++) {
        PyObject **co_varnames;
        PyObject *keyword = kws[2*i];
        PyObject *value = kws[2*i + 1];
        int j;
        ...
        /* Speed hack: do raw pointer compares. As names are
           normally interned this should almost always hit. */
        co_varnames = ((PyTupleObject *)(co->co_varnames))->ob_item;
        for (j = 0; j < total_args; j++) {
            PyObject *nm = co_varnames[j];
            if (nm == keyword)
                goto kw_found;
        }
        /* Slow fallback, just in case */
        for (j = 0; j < total_args; j++) {
            PyObject *nm = co_varnames[j];
            int cmp = PyObject_RichCompareBool(  <== HERE
                keyword, nm, Py_EQ);
            if (cmp > 0)
                goto kw_found;
            else if (cmp < 0)
                goto fail;
        }

It looks like the "just in case" path is taken.

(gdb) where
#0  unicode_compare_eq (str1='isTest', str2='func') at Objects/unicodeobject.c:10532
#1  0x000000000052dd41 in PyUnicode_RichCompare (left='isTest', right='func', op=2) at Objects/unicodeobject.c:10609
#2  0x00000000004be4db in do_richcompare (v='isTest', w='func', op=2) at Objects/object.c:647
#3  0x00000000004be790 in PyObject_RichCompare (v='isTest', w='func', op=2) at Objects/object.c:696
#4  0x00000000004be832 in PyObject_RichCompareBool (v='isTest', w='func', op=2) at Objects/object.c:718
#5  0x00000000005a0f68 in PyEval_EvalCodeEx (...) at Python/ceval.c:3450
...

Traceback (most recent call first):
  File "/home/haypo/prog/python/default/Lib/test/test_xml_etree.py", line 1669, in test_get_keyword_args
    e1 = ET.Element('foo' , x=1, y=2, z=3)

ElementTree.Element() accepts arbitary keywords.


(7) letter==letter singletons:


xml.etree.ElementPath: iterfind()

def iterfind(elem, path, namespaces=None):
    ...
    if path[-1:] == "/": <== HERE

Traceback (most recent call first):
  File "/home/haypo/prog/python/default/Lib/xml/etree/ElementPath.py", line 254, in iterfind
    if path[-1:] == "/":

path is ".//grandchild", path[-1] is 'd' which is a singleton, Python already computed the hash of 'd'.


Similar example in the same file:

def xpath_tokenizer(pattern, namespaces=None):
    for token in xpath_tokenizer_re.findall(pattern):
        tag = token[1]
        if tag and tag[0] != "{" and ":" in tag: <== HERE
            ...

tag[0] != "{" <= tag is 'grandchild', tag[0] is a singleton.


Another example:

Traceback (most recent call first):
  File "/home/haypo/prog/python/default/Lib/sre_parse.py", line 194, in __next
    if char == "\\":


(8) str not in (list of str), test_descr.py: test_dir():

  File "/home/haypo/prog/python/default/Lib/test/test_descr.py", line 2255, in <listcomp>
    names = [x for x in dir(minstance) if x not in default_attributes]

        minstance = M("m")
        minstance.b = 2
        minstance.a = 1
        default_attributes = ['__name__', '__doc__', '__package__',
                              '__loader__']
        names = [x for x in dir(minstance) if x not in default_attributes]
msg201578 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-10-28 20:47
"""
(4) str in __all__ (list of str):

os.py:

    if "putenv" not in __all__:
        __all__.append("putenv")
"""

For this example: "putenv" is probably interned by "def putenv(...)". "putenv" string and the name of the function are the same constant. When a function is defined, it is stored in the locals dictionary, so the key ("putenv") is interned by PyDict_SetItem().

So dict is not used explicitly, but implicitly by the namespace.
msg201871 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-01 01:57
Updated patch.
msg202109 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-11-04 10:11
New changeset 536a7c09c7fd by Victor Stinner in branch 'default':
Issue #16286: write a new subfunction bytes_compare_eq()
http://hg.python.org/cpython/rev/536a7c09c7fd
msg202111 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-11-04 10:24
New changeset 5fa291435740 by Victor Stinner in branch 'default':
Issue #16286: optimize PyUnicode_RichCompare() for identical strings (same
http://hg.python.org/cpython/rev/5fa291435740
msg202114 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-11-04 10:29
New changeset da9c6e4ef301 by Victor Stinner in branch 'default':
Issue #16286: remove duplicated identity check from unicode_compare()
http://hg.python.org/cpython/rev/da9c6e4ef301
msg202115 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-04 10:30
I applied changes unrelated to the hash.
msg202116 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-04 10:56
Results of benchmarks using compare_hash-3.patch:

$ time ../benchmarks/perf.py -r -b default ./pythonorig ./pythonhash 
INFO:root:Skipping benchmark slowspitfire; not compatible with Python 3.4
INFO:root:Skipping benchmark slowpickle; not compatible with Python 3.4
INFO:root:Skipping benchmark slowunpickle; not compatible with Python 3.4
INFO:root:Skipping benchmark spambayes; not compatible with Python 3.4
Running 2to3...
Running django_v2...

Report on Linux smithers 3.9.4-200.fc18.x86_64 #1 SMP Fri May 24 20:10:49 UTC 2013 x86_64 x86_64
Total CPU cores: 8

### 2to3 ###
Min: 6.358000 -> 6.055000: 1.05x faster
Avg: 6.407600 -> 6.179800: 1.04x faster
Significant (t=3.53)
Stddev: 0.04311 -> 0.13785: 3.1979x larger

### nbody ###
Min: 0.219029 -> 0.212477: 1.03x faster
Avg: 0.224940 -> 0.219248: 1.03x faster
Significant (t=10.13)
Stddev: 0.00373 -> 0.00420: 1.1288x larger

The following not significant results are hidden, use -v to show them:
django_v2.


At least, Python is not slower with the patch :-) I'm surprised that the benchmark shows a difference. nbody benchmark is focused on float numbers. I checked with gdb, nbody benchmark does not call any Unicode comparison function.
msg202322 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-07 10:17
I added recently a new _PyUnicode_CompareWithId() function: changeset 77bebcf5c4cf (issue #19512).

This function can be used instead of PyUnicode_CompareWithASCIIString() when the right parameter is a common string. It is interesting when the right string is probably present in a dictionary. For example, "path" is always present as "sys.path". So interning the string doesn't eat more memory.

_PyUnicode_CompareWithId() would be more efficient with compare_hash-3.patch. The function is not used yet in critical path. It is now used in type_new() for example.
msg202323 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-07 10:30
Serhiy, Gregory, Raymond, Antoine: so what is your feeling on this issue? Is it worth it?
msg202702 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-12 16:10
I ran pybench with the patch. I don't understand this result (10% slower with the patch):

DictWithStringKeys:    28ms    25ms  +10.7%    28ms    26ms  +10.5%

This test doesn't use unicode_compare_eq() from Objects/unicodeobject.c but unicode_eq() from Objects/stringlib/eq.h which is not modified by the patch.
msg202703 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-12 16:13
(oops, I didn't want to close the issue, it's a mistake)
msg206387 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-12-16 23:10
If it's hard to see a real speedup, it's probably not interesting to use the hash in string comparison.
msg216654 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014-04-17 04:58
> Serhiy, Gregory, Raymond, Antoine: so what is your feeling on 
> this issue? Is it worth it?

I don't think it is worth it.  There may be some cases that benefit, but it adds extra branching code to the common cases (sets and dicts) that already have the identity check.
History
Date User Action Args
2014-04-17 04:58:58rhettingersetmessages: + msg216654
2013-12-16 23:10:58vstinnersetstatus: open -> closed
resolution: not a bug
messages: + msg206387
2013-11-12 16:13:05vstinnersetstatus: closed -> open
resolution: not a bug -> (no value)
messages: + msg202703
2013-11-12 16:10:48vstinnersetstatus: open -> closed
resolution: not a bug
messages: + msg202702
2013-11-07 10:30:08vstinnersetmessages: + msg202323
2013-11-07 10:17:18vstinnersetmessages: + msg202322
2013-11-04 10:56:50vstinnersetmessages: + msg202116
2013-11-04 10:30:24vstinnersetfiles: + compare_hash-3.patch

messages: + msg202115
2013-11-04 10:29:10python-devsetmessages: + msg202114
2013-11-04 10:24:52python-devsetmessages: + msg202111
2013-11-04 10:11:11python-devsetnosy: + python-dev
messages: + msg202109
2013-11-01 01:57:36vstinnersetfiles: + compare_hash-2.patch

messages: + msg201871
2013-10-28 20:47:20vstinnersetmessages: + msg201578
2013-10-28 20:18:37vstinnersetmessages: + msg201573
2013-10-28 19:17:49vstinnersetmessages: + msg201571
2013-10-28 19:06:52pitrousetnosy: + pitrou
messages: + msg201570
2013-04-09 22:17:40vstinnersetmessages: + msg186460
2013-04-07 23:12:20vstinnersettitle: Optimize a==b and a!=b for bytes and str -> Use hash if available to optimize a==b and a!=b for bytes and str
2013-01-03 02:24:49vstinnersetmessages: + msg178903
2012-12-29 04:51:41meador.ingesetnosy: + meador.inge
2012-10-25 17:29:43ezio.melottisetnosy: + ezio.melotti

stage: patch review
2012-10-23 10:11:15vstinnersetmessages: + msg173588
2012-10-23 01:56:04rhettingersetnosy: + rhettinger
messages: + msg173576
2012-10-20 06:05:20gregory.p.smithsetnosy: + gregory.p.smith
messages: + msg173371
2012-10-19 17:02:54christian.heimessetnosy: + christian.heimes
2012-10-19 13:43:56djcsetnosy: + djc
2012-10-19 13:13:08serhiy.storchakasetmessages: + msg173338
2012-10-19 12:36:06vstinnercreate