Message416735
For reference, chaining is about 1.18x slower in this microbenchmark on GCC:
./python -m pyperf timeit -s "x = 100" "if 10 < x < 30: print('no')" --duplicate=10
.....................
Mean +- std dev: 21.3 ns +- 0.2 ns
./python -m pyperf timeit -s "x = 100" "if 10 < x and x < 30: print('no')" --duplicate=10
.....................
Mean +- std dev: 18.0 ns +- 0.5 ns
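As a reminder of why the two spellings aren't interchangeable in general: the chained form evaluates the middle operand exactly once, while the `and` form evaluates it once per comparison. A minimal sketch (the counter name is just for illustration):

```python
calls = 0

def middle():
    """Return 20 and count how many times we were evaluated."""
    global calls
    calls += 1
    return 20

# The chained form evaluates the middle operand exactly once...
assert 10 < middle() < 30
assert calls == 1

# ...while the unchained spelling evaluates it once per comparison.
calls = 0
assert 10 < middle() and middle() < 30
assert calls == 2
```

In the benchmark above `x` is a plain local, so the two forms happen to be equivalent, which is what makes the proposed optimization plausible at all.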
For a related case, in GH-30970, the bytecode generated by "a, b = a0, b0" was changed.
Before: [load_a0, load_b0, swap, store_a, store_b]
After: [load_a0, load_b0, store_b, store_a]
However, this was only changed when the stores were STORE_FASTs. STORE_GLOBAL/STORE_NAME/STORE_DEREF cases still have the SWAP.
In the STORE_GLOBAL cases, you can construct scenarios with custom __del__ methods where storing b and then a behaves differently from storing a and then b. No such cases can be constructed for STORE_FAST without resorting to frame hacking, since a __del__ method cannot inspect another frame's locals.
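A sketch of the kind of scenario meant here, for module-level (global) names. This assumes CPython's immediate refcount-based finalization, so each store drops the old object and fires its __del__ mid-assignment; the class and list names are just for illustration:

```python
seen = []

class Snooper:
    def __del__(self):
        # Record which names already hold their new int values at the
        # moment this object is dropped by the tuple assignment below.
        seen.append((isinstance(globals().get('a'), int),
                     isinstance(globals().get('b'), int)))

a = Snooper()
b = Snooper()
a, b = 1, 2   # each store drops a Snooper, running __del__ mid-assignment

# The first __del__ observes a half-completed assignment, so the store
# order (a-then-b vs. b-then-a) is visible to user code.
print(seen)
```

The first tuple recorded shows exactly one name rebound, so reordering the stores would change what `__del__` observes. A local-variable version of the same trick fails because `globals()` can't see another frame's locals.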
I wonder if the same argument applies here: maybe @akuvfx's PR could be altered to use LOAD_FAST twice for each variable *only* if everything in sight is the result of a LOAD_FAST or a LOAD_CONST. My example above uses a LOAD_DEREF, so its behavior could remain unchanged.
The argument that this would be within the language spec is maybe a little more dubious than in the "a, b = a0, b0" case though, since custom `__lt__` methods are rather better-specified than custom `__del__` methods.
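To illustrate the LOAD_DEREF hazard mentioned above (a hypothetical reconstruction, not necessarily the exact example from earlier in the thread): a comparison method can rebind a closed-over variable as a side effect, at which point "load once" and "load twice" diverge observably:

```python
def demo():
    x = 100

    class Sneaky:
        def __lt__(self, other):
            nonlocal x
            x = 5          # side effect: rebind the closed-over variable
            return True

    lo = Sneaky()

    chained = lo < x < 10          # x (100) is loaded once and reused
    x = 100
    unchained = lo < x and x < 10  # x is reloaded (now 5) for the second test
    return chained, unchained

print(demo())  # -> (False, True): the two spellings diverge
```

The chained form compares the original 100 against 10 because the middle operand is evaluated only once, while the unchained form reloads `x` after `__lt__` has rebound it. Restricting the doubled-load optimization to LOAD_FAST/LOAD_CONST operands would sidestep exactly this case.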
Thoughts?
Date | User | Action | Args
2022-04-05 03:14:58 | Dennis Sweeney | set | recipients: + Dennis Sweeney, tim.peters, steven.daprano, serhiy.storchaka, akuvfx
2022-04-05 03:14:58 | Dennis Sweeney | set | messageid: <1649128498.38.0.488387437838.issue45542@roundup.psfhosted.org>
2022-04-05 03:14:58 | Dennis Sweeney | link | issue45542 messages
2022-04-05 03:14:58 | Dennis Sweeney | create |