msg173332 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2012-10-19 12:36 |
Attached patch optimize a==b and a!=b operators for bytes and str types of Python 3.4. For str, memcmp() is now always used, instead of a loop using PyUnicode_READ() (which is slow) for kind different than 1. For bytes, compare the first but also the last byte before calling memcmp(), instead of just comparing the first byte. Similar optimization was implemented in Py_UNICODE_MATCH():
changeset: 38242:0de9a789de39
branch: legacy-trunk
user: Fredrik Lundh <fredrik@pythonware.com>
date: Tue May 23 10:10:57 2006 +0000
files: Include/unicodeobject.h
description:
needforspeed: check first *and* last character before doing a full memcmp
Initially I only wrote the patch to check the hash values before comparing content of the strings.
--
I done some statistics tests. For a fresh Python interpreter, the hash values are only known in 7% cases (but when hashes are compared, they are quite always different, so the optimization is useful). When running "./python -m test test_os", hashes are known and different in 41.4%. After running 70 tests, hashes are known and different in 80%.
|
msg173338 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2012-10-19 13:13 |
Good. I would like to see similar statistics tests for any real application.
|
msg173371 - (view) |
Author: Gregory P. Smith (gregory.p.smith) * |
Date: 2012-10-20 06:05 |
something to include in your statistics is the lengths of the already hashed data being compared.
i expect there to be a minimum length before this optimization is useful.
|
msg173576 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2012-10-23 01:56 |
Rather than see statistics, I'm curious about what circumstances where the optimization would kick in. Interned strings are pre-hashed but they already benefit from an identity-implies-equality check. Dicts and sets already incorporate a check-hash-before-equality check.
That raises the question of what strings ever have had their hash already computed if the string hasn't been interned or has been used in a dict or set?
P.S. I rather like the optimization and don't want to discourage it. I'm just curious about what the current optimizations are missing.
|
msg173588 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2012-10-23 10:11 |
Oh, I forgot this issue when I did the following commit:
--
changeset: 79902:b68be1025c42
user: Victor Stinner <victor.stinner@gmail.com>
date: Tue Oct 23 02:48:49 2012 +0200
files: Objects/unicodeobject.c
description:
Optimize PyUnicode_RichCompare() for Py_EQ and Py_NE: always use memcmp()
--
I will benchmark the overhead of memcmp() on short strings. We may
check the first and last characters before calling memcmp() to limit
the overhead of calling a function.
I also read that GCC uses its builtin memcmp() which is slower than
the memcmp() of the GNU libc.
|
msg178903 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-01-03 02:24 |
> P.S. I rather like the optimization and don't want to discourage it. I'm just curious about what the current optimizations are missing.
I'm too lazy to produce more statistics or run other benchmarks. I just saw an interesting optimization oportunity. I don't understand why it was not done before.
If you consider that compare_hash.patch might slow down Python, so ok, I will just close the issue as rejected.
|
msg186460 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-04-09 22:17 |
"I will benchmark the overhead of memcmp() on short strings. We may
check the first and last characters before calling memcmp() to limit
the overhead of calling a function."
I created the issue #17628 for this point.
|
msg201570 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2013-10-28 19:06 |
> That raises the question of what strings ever have had their hash
> already computed if the string hasn't been interned or has been used
> in a dict or set?
Currently, none, I think.
|
msg201571 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-10-28 19:17 |
>> That raises the question of what strings ever have had their hash
>> already computed if the string hasn't been interned or has been used
>> in a dict or set?
>
> Currently, none, I think.
Strings are used (and compared) outside dict and set.
|
msg201573 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-10-28 20:18 |
Let's try to identify some use cases in the Python test suite using gdb:
(gdb) b unicode_compare_eq
(gdb) condition 1 ((PyASCIIObject*)str1)->hash != -1 && ((PyASCIIObject*)str2)->hash != -1 && ((PyASCIIObject*)str1)->hash != ((PyASCIIObject*)str2)->hash
(gdb) run
I didn't dig to understand why hash of these strings are computed. Tell me if you need more examples.
Random examples:
(1) compare "constant" strings (strings from co_consts of code objects)
importlib._bootstrap: _setup():
os_details = ('posix', ['/']), ('nt', ['\\', '/'])
for builtin_os, path_separators in os_details:
...
...
if builtin_os == 'nt': <== HERE
...
(2) importlib._bootstrap: _LoaderBasics.is_package()
def is_package(self, fullname):
filename = _path_split(self.get_filename(fullname))[1]
filename_base = filename.rsplit('.', 1)[0]
tail_name = fullname.rpartition('.')[2]
return filename_base == '__init__' and ... <== HERE
It's surprising that filename_base has its hash computed. I suppose that all these functions (_path_split, .rsplit, .rpartition) return the string unmodified.
(3) importlib._bootstrap: PathFinder._path_importer_cache():
@classmethod
def _path_importer_cache(cls, path):
...
if path == '': <== HERE
path is an entry of sys.path.
(4) str in __all__ (list of str):
os.py:
if "putenv" not in __all__:
__all__.append("putenv")
__all__ is a list of strings.
(5) site.py:
if __name__ == '__main__': <== HERE
__name__ is 'site'.
(6) Python/ceval.py: PyEval_EvalCodeEx() called with arbitrary keyword
for (i = 0; i < kwcount; i++) {
PyObject **co_varnames;
PyObject *keyword = kws[2*i];
PyObject *value = kws[2*i + 1];
int j;
...
/* Speed hack: do raw pointer compares. As names are
normally interned this should almost always hit. */
co_varnames = ((PyTupleObject *)(co->co_varnames))->ob_item;
for (j = 0; j < total_args; j++) {
PyObject *nm = co_varnames[j];
if (nm == keyword)
goto kw_found;
}
/* Slow fallback, just in case */
for (j = 0; j < total_args; j++) {
PyObject *nm = co_varnames[j];
int cmp = PyObject_RichCompareBool( <== HERE
keyword, nm, Py_EQ);
if (cmp > 0)
goto kw_found;
else if (cmp < 0)
goto fail;
}
It looks like the "just in case" path is taken.
(gdb) where
#0 unicode_compare_eq (str1='isTest', str2='func') at Objects/unicodeobject.c:10532
#1 0x000000000052dd41 in PyUnicode_RichCompare (left='isTest', right='func', op=2) at Objects/unicodeobject.c:10609
#2 0x00000000004be4db in do_richcompare (v='isTest', w='func', op=2) at Objects/object.c:647
#3 0x00000000004be790 in PyObject_RichCompare (v='isTest', w='func', op=2) at Objects/object.c:696
#4 0x00000000004be832 in PyObject_RichCompareBool (v='isTest', w='func', op=2) at Objects/object.c:718
#5 0x00000000005a0f68 in PyEval_EvalCodeEx (...) at Python/ceval.c:3450
...
Traceback (most recent call first):
File "/home/haypo/prog/python/default/Lib/test/test_xml_etree.py", line 1669, in test_get_keyword_args
e1 = ET.Element('foo' , x=1, y=2, z=3)
ElementTree.Element() accepts arbitary keywords.
(7) letter==letter singletons:
xml.etree.ElementPath: iterfind()
def iterfind(elem, path, namespaces=None):
...
if path[-1:] == "/": <== HERE
Traceback (most recent call first):
File "/home/haypo/prog/python/default/Lib/xml/etree/ElementPath.py", line 254, in iterfind
if path[-1:] == "/":
path is ".//grandchild", path[-1] is 'd' which is a singleton, Python already computed the hash of 'd'.
Similar example in the same file:
def xpath_tokenizer(pattern, namespaces=None):
for token in xpath_tokenizer_re.findall(pattern):
tag = token[1]
if tag and tag[0] != "{" and ":" in tag: <== HERE
...
tag[0] != "{" <= tag is 'grandchild', tag[0] is a singleton.
Another example:
Traceback (most recent call first):
File "/home/haypo/prog/python/default/Lib/sre_parse.py", line 194, in __next
if char == "\\":
(8) str not in (list of str), test_descr.py: test_dir():
File "/home/haypo/prog/python/default/Lib/test/test_descr.py", line 2255, in <listcomp>
names = [x for x in dir(minstance) if x not in default_attributes]
minstance = M("m")
minstance.b = 2
minstance.a = 1
default_attributes = ['__name__', '__doc__', '__package__',
'__loader__']
names = [x for x in dir(minstance) if x not in default_attributes]
|
msg201578 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-10-28 20:47 |
"""
(4) str in __all__ (list of str):
os.py:
if "putenv" not in __all__:
__all__.append("putenv")
"""
For this example: "putenv" is probably interned by "def putenv(...)". "putenv" string and the name of the function are the same constant. When a function is defined, it is stored in the locals dictionary, so the key ("putenv") is interned by PyDict_SetItem().
So dict is not used explicitly, but implicitly by the namespace.
|
msg201871 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-11-01 01:57 |
Updated patch.
|
msg202109 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2013-11-04 10:11 |
New changeset 536a7c09c7fd by Victor Stinner in branch 'default':
Issue #16286: write a new subfunction bytes_compare_eq()
http://hg.python.org/cpython/rev/536a7c09c7fd
|
msg202111 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2013-11-04 10:24 |
New changeset 5fa291435740 by Victor Stinner in branch 'default':
Issue #16286: optimize PyUnicode_RichCompare() for identical strings (same
http://hg.python.org/cpython/rev/5fa291435740
|
msg202114 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2013-11-04 10:29 |
New changeset da9c6e4ef301 by Victor Stinner in branch 'default':
Issue #16286: remove duplicated identity check from unicode_compare()
http://hg.python.org/cpython/rev/da9c6e4ef301
|
msg202115 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-11-04 10:30 |
I applied changes unrelated to the hash.
|
msg202116 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-11-04 10:56 |
Results of benchmarks using compare_hash-3.patch:
$ time ../benchmarks/perf.py -r -b default ./pythonorig ./pythonhash
INFO:root:Skipping benchmark slowspitfire; not compatible with Python 3.4
INFO:root:Skipping benchmark slowpickle; not compatible with Python 3.4
INFO:root:Skipping benchmark slowunpickle; not compatible with Python 3.4
INFO:root:Skipping benchmark spambayes; not compatible with Python 3.4
Running 2to3...
Running django_v2...
Report on Linux smithers 3.9.4-200.fc18.x86_64 #1 SMP Fri May 24 20:10:49 UTC 2013 x86_64 x86_64
Total CPU cores: 8
### 2to3 ###
Min: 6.358000 -> 6.055000: 1.05x faster
Avg: 6.407600 -> 6.179800: 1.04x faster
Significant (t=3.53)
Stddev: 0.04311 -> 0.13785: 3.1979x larger
### nbody ###
Min: 0.219029 -> 0.212477: 1.03x faster
Avg: 0.224940 -> 0.219248: 1.03x faster
Significant (t=10.13)
Stddev: 0.00373 -> 0.00420: 1.1288x larger
The following not significant results are hidden, use -v to show them:
django_v2.
At least, Python is not slower with the patch :-) I'm surprised that the benchmark shows a difference. nbody benchmark is focused on float numbers. I checked with gdb, nbody benchmark does not call any Unicode comparison function.
|
msg202322 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-11-07 10:17 |
I added recently a new _PyUnicode_CompareWithId() function: changeset 77bebcf5c4cf (issue #19512).
This function can be used instead of PyUnicode_CompareWithASCIIString() when the right parameter is a common string. It is interesting when the right string is probably present in a dictionary. For example, "path" is always present as "sys.path". So interning the string doesn't eat more memory.
_PyUnicode_CompareWithId() would be more efficient with compare_hash-3.patch. The function is not used yet in critical path. It is now used in type_new() for example.
|
msg202323 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-11-07 10:30 |
Serhiy, Gregory, Raymond, Antoine: so what is your feeling on this issue? Is it worth it?
|
msg202702 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-11-12 16:10 |
I ran pybench with the patch. I don't understand this result (10% slower with the patch):
DictWithStringKeys: 28ms 25ms +10.7% 28ms 26ms +10.5%
This test doesn't use unicode_compare_eq() from Objects/unicodeobject.c but unicode_eq() from Objects/stringlib/eq.h which is not modified by the patch.
|
msg202703 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-11-12 16:13 |
(oops, I didn't want to close the issue, it's a mistake)
|
msg206387 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2013-12-16 23:10 |
If it's hard to see a real speedup, it's probably not interesting to use the hash in string comparison.
|
msg216654 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2014-04-17 04:58 |
> Serhiy, Gregory, Raymond, Antoine: so what is your feeling on
> this issue? Is it worth it?
I don't think it is worth it. There may be some cases that benefit, but it adds extra branching code to the common cases (sets and dicts) that already have the identity check.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:37 | admin | set | github: 60490 |
2014-04-17 04:58:58 | rhettinger | set | messages:
+ msg216654 |
2013-12-16 23:10:58 | vstinner | set | status: open -> closed resolution: not a bug messages:
+ msg206387
|
2013-11-12 16:13:05 | vstinner | set | status: closed -> open resolution: not a bug -> (no value) messages:
+ msg202703
|
2013-11-12 16:10:48 | vstinner | set | status: open -> closed resolution: not a bug messages:
+ msg202702
|
2013-11-07 10:30:08 | vstinner | set | messages:
+ msg202323 |
2013-11-07 10:17:18 | vstinner | set | messages:
+ msg202322 |
2013-11-04 10:56:50 | vstinner | set | messages:
+ msg202116 |
2013-11-04 10:30:24 | vstinner | set | files:
+ compare_hash-3.patch
messages:
+ msg202115 |
2013-11-04 10:29:10 | python-dev | set | messages:
+ msg202114 |
2013-11-04 10:24:52 | python-dev | set | messages:
+ msg202111 |
2013-11-04 10:11:11 | python-dev | set | nosy:
+ python-dev messages:
+ msg202109
|
2013-11-01 01:57:36 | vstinner | set | files:
+ compare_hash-2.patch
messages:
+ msg201871 |
2013-10-28 20:47:20 | vstinner | set | messages:
+ msg201578 |
2013-10-28 20:18:37 | vstinner | set | messages:
+ msg201573 |
2013-10-28 19:17:49 | vstinner | set | messages:
+ msg201571 |
2013-10-28 19:06:52 | pitrou | set | nosy:
+ pitrou messages:
+ msg201570
|
2013-04-09 22:17:40 | vstinner | set | messages:
+ msg186460 |
2013-04-07 23:12:20 | vstinner | set | title: Optimize a==b and a!=b for bytes and str -> Use hash if available to optimize a==b and a!=b for bytes and str |
2013-01-03 02:24:49 | vstinner | set | messages:
+ msg178903 |
2012-12-29 04:51:41 | meador.inge | set | nosy:
+ meador.inge
|
2012-10-25 17:29:43 | ezio.melotti | set | nosy:
+ ezio.melotti
stage: patch review |
2012-10-23 10:11:15 | vstinner | set | messages:
+ msg173588 |
2012-10-23 01:56:04 | rhettinger | set | nosy:
+ rhettinger messages:
+ msg173576
|
2012-10-20 06:05:20 | gregory.p.smith | set | nosy:
+ gregory.p.smith messages:
+ msg173371
|
2012-10-19 17:02:54 | christian.heimes | set | nosy:
+ christian.heimes
|
2012-10-19 13:43:56 | djc | set | nosy:
+ djc
|
2012-10-19 13:13:08 | serhiy.storchaka | set | messages:
+ msg173338 |
2012-10-19 12:36:06 | vstinner | create | |