classification
Title: Harmonize STORE_DEREF with STORE_FAST and LOAD_DEREF
Type: performance Stage: commit review
Components: Interpreter Core Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: python-dev, rhettinger, serhiy.storchaka
Priority: low Keywords: patch

Created on 2016-11-11 12:10 by rhettinger, last changed 2017-03-31 16:36 by dstufft. This issue is now closed.

Files
File name Uploaded Description Edit
fastcell.diff rhettinger, 2016-11-11 12:10
delete_deref.diff rhettinger, 2016-11-12 04:33 Fix-up DELETE_DEREF
concat_deref.diff rhettinger, 2016-11-12 04:34 Fix-up unicode_concat
issue28665.py serhiy.storchaka, 2016-11-12 08:56
Pull Requests
URL Status Linked Edit
PR 552 closed dstufft, 2017-03-31 16:36
Messages (11)
msg280574 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-11-11 12:10
The STORE_FAST, LOAD_FAST, and LOAD_DEREF opcodes all use fast macros for variable access.  This patch harmonizes STORE_DEREF to follow the same pattern.

Both the C code and the generated assembly look nicer.  Gives an approx 40% speed-up (using both Clang and GCC-6) on the "store_nonlocal" portion of the variable access benchmark at http://code.activestate.com/recipes/577834
The eliminates the nonlocal speed penalty, making cell variable updates run nearly as fast as updates to locals.
msg280576 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-11 12:23
LGTM. This saves function call and INCREF/DECREF pair.

What about DELETE_DEREF? PyCell_Set() also is used in unicode_concatenate().
msg280577 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-11-11 12:32
New changeset d78d45436753 by Raymond Hettinger in branch '3.6':
Issue #28665: Harmonize STORE_DEREF with STORE_FAST and LOAD_DEREF giving a 40% speedup.
https://hg.python.org/cpython/rev/d78d45436753
msg280578 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-11-11 12:33
Thanks for the quick review.  I'll look at the other two cases when I get a chance.
msg280584 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-11-11 14:50
New changeset 7ec45e7d2194 by Serhiy Storchaka in branch 'default':
Merge from 3.6 (issue #28665).
https://hg.python.org/cpython/rev/7ec45e7d2194
msg280585 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-11 14:51
You forgot to merge and created new head.
msg280635 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-12 07:03
Could you please measure the performance effect of these changes?
msg280642 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-11-12 08:34
I think the speed benefit for the last two patches is likely too modest to care about. The main reason I did the work was because you suggested it and because it seemed like a reasonable idea (the patched code looks nice, it does only work that is necessary, and it is more consistent with the other opcodes).

Let me know if you want to go forward with it or leave it in the current state (the current juxtaposition of PyCell_GET with PyCell_Set looks a little weird but it does get the job done).
msg280643 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-12 08:56
The argument about "harmonizing" doesn't look strong to me. Opcodes for locals use the SETLOCAL() macro which decrefs 
old value, while opcodes for nonlocals with your patches use the PyCell_SET() macro which doesn't.

But performance arguments look more weighty. I made benchmarks. fastcell.diff speeds up STORE_FAST by 40%, delete_deref.diff speeds up DELETE_DEREF by 50%. and concat_deref.diff speeds up string concatenating up to 15%. All these operations are rare in comparison with operations with locals or LOAD_DEREF, but the cognitive cost of the optimization is pretty low. All patches LGTM.

I only have doubts that such changes could be pushed in 3.6 at this stage. This is not bug fix and isn't tweaking new 3.6 feature.
msg280645 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-11-12 09:10
New changeset 5f3b7ceb394c by Raymond Hettinger in branch 'default':
Issue #28665:  Use macro form of PyCell_GET/SET
https://hg.python.org/cpython/rev/5f3b7ceb394c
msg280646 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-11-12 09:12
I think small changes are fine while there is still another beta ahead but would rather just push it to 3.7 than spend more time talking about it.
History
Date User Action Args
2017-03-31 16:36:26dstufftsetpull_requests: + pull_request996
2016-11-12 09:12:16rhettingersetstatus: open -> closed
resolution: fixed
messages: + msg280646
2016-11-12 09:10:43python-devsetmessages: + msg280645
2016-11-12 08:56:04serhiy.storchakasetfiles: + issue28665.py
versions: - Python 3.6
messages: + msg280643

assignee: serhiy.storchaka -> rhettinger
stage: patch review -> commit review
2016-11-12 08:34:42rhettingersetmessages: + msg280642
2016-11-12 08:34:33rhettingersetmessages: - msg280641
2016-11-12 08:34:17rhettingersetmessages: + msg280641
2016-11-12 08:32:10rhettingersetmessages: - msg280637
2016-11-12 07:27:46rhettingersetmessages: + msg280637
2016-11-12 07:03:24serhiy.storchakasetmessages: + msg280635
2016-11-12 04:34:37rhettingersetfiles: + concat_deref.diff
assignee: rhettinger -> serhiy.storchaka
2016-11-12 04:33:58rhettingersetfiles: + delete_deref.diff
2016-11-11 14:51:32serhiy.storchakasetmessages: + msg280585
2016-11-11 14:50:40python-devsetmessages: + msg280584
2016-11-11 12:33:20rhettingersetpriority: normal -> low
assignee: serhiy.storchaka -> rhettinger
messages: + msg280578
2016-11-11 12:32:24python-devsetnosy: + python-dev
messages: + msg280577
2016-11-11 12:23:31serhiy.storchakasetmessages: + msg280576
2016-11-11 12:10:41rhettingercreate