classification
Title: Add CHAINED_COMPARE_OP opcode
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.6
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: Demur Rumed, Mark.Shannon, josh.r, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2016-06-05 19:43 by serhiy.storchaka, last changed 2016-06-06 20:49 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
chained_compare_op.patch serhiy.storchaka, 2016-06-05 19:43 review
Messages (5)
msg267466 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-06-05 19:43
For now complex code is generated for chained comparing.

$ echo "x = a < b > c < d" | ./python -m dis
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 DUP_TOP
              6 ROT_THREE
              8 COMPARE_OP               0 (<)
             10 JUMP_IF_FALSE_OR_POP    28
             12 LOAD_NAME                2 (c)
             14 DUP_TOP
             16 ROT_THREE
             18 COMPARE_OP               4 (>)
             20 JUMP_IF_FALSE_OR_POP    28
             22 LOAD_NAME                3 (d)
             24 COMPARE_OP               0 (<)
             26 JUMP_FORWARD             4 (to 32)
        >>   28 ROT_TWO
             30 POP_TOP
        >>   32 STORE_NAME               4 (x)
             34 LOAD_CONST               0 (None)
             36 RETURN_VALUE

Proposed patch adds CHAINED_COMPARE_OP opcode that does all necessary stack manipulatios. Using it the generated code is simpler:

$ echo "x = a < b > c < d" | ./python -m dis
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 CHAINED_COMPARE_OP       0 (<)
              6 JUMP_IF_FALSE_OR_POP    18
              8 LOAD_NAME                2 (c)
             10 CHAINED_COMPARE_OP       4 (>)
             12 JUMP_IF_FALSE_OR_POP    18
             14 LOAD_NAME                3 (d)
             16 COMPARE_OP               0 (<)
        >>   18 STORE_NAME               4 (x)
             20 LOAD_CONST               0 (None)
             22 RETURN_VALUE
msg267484 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-06-05 23:50
Please stop adding new opcodes for rare use cases.  That represents a sharp departure from our entire history of adding opcodes.

Code like "x = a < b > c < d" almost never comes up.
msg267491 - (view) Author: Demur Rumed (Demur Rumed) * Date: 2016-06-06 00:16
@rhettinger can you clarify your opinion in relation to #27140 with #27095 & #27213 in mind?

I agree that CHAINED_COMPARE_OP is unnecessary
msg267541 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2016-06-06 16:31
It is a little funny though. I expect a more common test like:

if a < b < c: pass

to be unconditionally faster than the logically equivalent:

if a < b and b < c: pass

The only difference between the two should be that b is loaded only once, which should make the chained comparison either equivalent or faster. But thanks to the additional stack manipulations needed, in microbenchmarks, this isn't the case. I was just testing a couple simple cases in ipython (3.5, Linux, x64), and the chained comparison is actually slower when short-circuiting is involved:

%%timeit -r5 a, b, c = range(3, 0, -1)
if a < b < c: pass
10000000 loops, best of 5: 33.7 ns per loop

%%timeit -r5 a, b, c = range(3, 0, -1)
if a < b and b < c: pass
10000000 loops, best of 5: 25 ns per loop

or even without short-circuiting, but with the final test failing, chained loses (by a smaller margin):

%%timeit -r5 a, b, c = 0, 1, 0
if a < b < c: pass
10000000 loops, best of 5: 42.5 ns per loop

%%timeit -r5 a, b, c = 0, 1, 0
if a < b and b < c: pass
10000000 loops, best of 5: 39.3 ns per loop

The same pattern holds if a and c are replaced with constants, e.g. if 0 < b < 2: (so only b is loaded by name). Yes, it's a small difference; when a, b, c are initialized from range(3), so all tests succeed, chained comparisons win, 44.9 ns to 54.5 ns. But it seems kind of odd that the simplest case of a chained comparison is actually a pessimization in two of the three cases.

Not saying it's worth changing (I doubt chained comparisons occur in code enough to justify the tiny savings a dedicated op code could provide), but it's not something I'd reject out of hand.
msg267553 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-06-06 20:49
Thank you for your tests Josh.

I have collected additional statistics. According to it CHAINED_COMPARE_OP would be used 100 times less than COMPARE_OP. This is not worth adding new opcode.
History
Date User Action Args
2016-06-06 20:49:39serhiy.storchakasetstatus: open -> closed
resolution: rejected
messages: + msg267553

stage: patch review -> resolved
2016-06-06 16:31:15josh.rsetnosy: + josh.r
messages: + msg267541
2016-06-06 00:16:56Demur Rumedsetmessages: + msg267491
2016-06-05 23:50:59rhettingersetnosy: + rhettinger
messages: + msg267484
2016-06-05 19:43:25serhiy.storchakacreate