This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Remove false optimization for equality comparison of hashed strings
Type: Stage:
Components: Interpreter Core Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: alex, python-dev, rhettinger, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2013-08-12 20:33 by rhettinger, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
eq_optimization.diff rhettinger, 2013-08-12 20:38 Remove the test-first-character test review
27.diff rhettinger, 2013-08-13 05:26 Remove the test-first-character test in 2.7
Messages (7)
msg195011 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-08-12 20:38
This code is only run when the kinds, lengths, and hashes match.  So, the probability of the strings being equal is VERY high.  Accordingly, there is no benefit to an earlier out test to see if the first characters differ.

There is a modest benefit to comparing one-character strings, but that comes at the expense of of testing all other string lengths.

This code is on the critical path for dicts and sets, so we don't want to slow it down with false optimizations or optimizations of special cases that come at the expense of the common, general case.
msg195014 - (view) Author: Alex Gaynor (alex) * (Python committer) Date: 2013-08-12 20:43
does this show demonstrable results (in either direction) on stringbench or the benchmarks repo?
msg195015 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-08-12 20:57
See also issues #16286 and #17628.
msg195040 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-08-13 01:28
Profiling the test suite shows that the short-cut branch NEVER gets taken.
There are no cases where the string lengths, kinds, and 64-bit hashes match, but the stings themselves are a mismatch.  The whole theory behind this optimization is invalid.  The first characters always match, so you don't need to test for them.

Alex, these things are very difficult to measure because the cost of a 100% predictable branch is also zero in a tight benchmark.  The negative effects of useless tests are subtle and indirect (i.e. blowing other useful things out of the branch prediction table).

Victor, thanks for link to 17628.  I will look at that further, but at first glance it looks like you're introducing a big wall of code right in the middle of a critical loop in code the is supposed to be in-lined.  It too smells of a false optimization and places far too much hope that memcmp, case-statements, and whatnot will behave awesomely will all compilers.

Unless you object, I'm going to go ahead and remove the bogus test.
msg195042 - (view) Author: Alex Gaynor (alex) * (Python committer) Date: 2013-08-13 01:31
The statistic that htis is *never* hit across a large python program is great evidence that this isn't useful. +1 on removing from me.
msg195116 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-08-14 01:21
New changeset 8f9bc9283400 by Raymond Hettinger in branch '3.3':
Issue 18719: Remove a false optimization
http://hg.python.org/cpython/rev/8f9bc9283400
msg195117 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-08-14 01:35
New changeset ac2f59a6637f by Raymond Hettinger in branch '2.7':
Issue 18719: Remove a false optimization
http://hg.python.org/cpython/rev/ac2f59a6637f
History
Date User Action Args
2022-04-11 14:57:49adminsetgithub: 62919
2013-08-14 01:35:51rhettingersetstatus: open -> closed
resolution: fixed
title: Remove false optimizaton for equality comparison of hashed strings -> Remove false optimization for equality comparison of hashed strings
2013-08-14 01:35:12python-devsetmessages: + msg195117
2013-08-14 01:21:14python-devsetnosy: + python-dev
messages: + msg195116
2013-08-13 05:26:06rhettingersetfiles: + 27.diff
2013-08-13 05:24:30rhettingersetversions: + Python 2.7, Python 3.3
2013-08-13 01:31:18alexsetmessages: + msg195042
2013-08-13 01:28:50rhettingersetmessages: + msg195040
2013-08-12 20:57:31vstinnersetmessages: + msg195015
2013-08-12 20:43:16alexsetnosy: + alex
messages: + msg195014
2013-08-12 20:41:20pitrousetnosy: + vstinner, serhiy.storchaka
2013-08-12 20:38:02rhettingersetfiles: + eq_optimization.diff
keywords: + patch
messages: + msg195011
2013-08-12 20:33:24rhettingercreate