Title: redundant assignments to ob_size of new ints that _PyLong_New returned
Type: performance
Stage: resolved
Components: Interpreter Core
Versions: Python 3.7
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: mark.dickinson
Nosy List: Oren Milman, haypo, mark.dickinson, python-dev, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2016-07-02 14:38 by Oren Milman, last changed 2017-03-31 16:36 by dstufft. This issue is now closed.

File name | Uploaded | Description
CPythonTestOutput.txt | Oren Milman, 2016-07-02 14:38 | test output of CPython without my patches (tested on my PC)
issue27441_ver1.diff | Oren Milman, 2016-07-02 14:40 | proposed patches diff file - ver1
patchedCPythonTestOutput_ver1.txt | Oren Milman, 2016-07-02 14:40 | test output of CPython with my patches (tested on my PC) - ver1
issue27441_ver2.diff | Oren Milman, 2016-07-08 14:43 | proposed patches diff file - ver2
patchedCPythonTestOutput_ver2.txt | Oren Milman, 2016-07-08 14:43 | test output of CPython with my patches (tested on my PC) - ver2
Pull Requests
URL | Status | Linked
PR 552 | closed | dstufft, 2017-03-31 16:36
Messages (6)
msg269715 - Author: Oren Milman (Oren Milman) * Date: 2016-07-02 14:38
------------ current state ------------
In six different functions, the following happens:
    1. Function x calls _PyLong_New, with var y as the size argument.
        * Among other things, _PyLong_New sets the ob_size of the new int to y (the size argument it received).
    2. Function x sets the ob_size of the new int to y, even though y is already the value of ob_size.
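
The pattern above can be sketched in isolation. This is a minimal illustration, not CPython's code: `mock_long`, `mock_long_new`, `from_unsigned`, and the `MOCK_*` constants are hypothetical stand-ins for the int object, `_PyLong_New`, and a caller such as PyLong_FromUnsignedLong, and 15-bit digits are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MOCK_SHIFT 15                         /* assumed bits per digit */
#define MOCK_MASK ((1UL << MOCK_SHIFT) - 1)
#define MOCK_MAX_DIGITS 8                     /* enough for a 64-bit value */

/* Hypothetical stand-in for the int object; not the real layout. */
typedef struct {
    ptrdiff_t ob_size;                        /* plays the role of Py_SIZE(v) */
    unsigned int ob_digit[MOCK_MAX_DIGITS];
} mock_long;

/* Stand-in for _PyLong_New: allocates and already stores the size. */
static mock_long *mock_long_new(ptrdiff_t size)
{
    mock_long *v;
    assert(size >= 0 && size <= MOCK_MAX_DIGITS);
    v = calloc(1, sizeof(*v));
    if (v != NULL)
        v->ob_size = size;
    return v;
}

/* Stand-in for "function x": counts digits, allocates, fills digits. */
static mock_long *from_unsigned(unsigned long ival)
{
    ptrdiff_t ndigits = 0;
    unsigned long t = ival;
    mock_long *v;
    unsigned int *p;

    while (t) {
        ++ndigits;
        t >>= MOCK_SHIFT;
    }
    v = mock_long_new(ndigits);
    if (v != NULL) {
        p = v->ob_digit;
        v->ob_size = ndigits;   /* redundant: mock_long_new already stored it */
        while (ival) {
            *p++ = (unsigned int)(ival & MOCK_MASK);
            ival >>= MOCK_SHIFT;
        }
    }
    return v;
}
```

Deleting the line marked redundant leaves the result unchanged, which is the whole of the proposed change for the unsigned cases.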

The functions in which this happens are:
    1. in Objects/longobject.c:
        - PyLong_FromUnsignedLong
        - PyLong_FromLongLong
        - PyLong_FromUnsignedLongLong
        - PyLong_FromSsize_t
        - PyLong_FromSize_t
    2. in Python/marshal.c:
        - r_PyLong

As for relevant past changes, it seems that in each of these six functions the redundant assignment was introduced either in the last major rewrite of the function or when the function was first added, and has remained there ever since.

The revisions in which the redundant assignments were added:
    1. changeset 18114:
        - PyLong_FromUnsignedLong
    2. changeset 38307:
        - PyLong_FromLongLong
        - PyLong_FromUnsignedLongLong
    3. changeset 46460:
        - PyLong_FromSize_t
        - PyLong_FromSsize_t
    4. changeset 52215:
        - r_PyLong

------------ proposed changes ------------
Remove these six redundant assignments.

------------ diff ------------
The proposed patches diff file is attached.

------------ tests ------------
I built the patched CPython for x86 and tried it out briefly. Everything seemed to work as usual.

In addition, I ran 'python_d.exe -m test -j3' (on my 64-bit Windows 10) with and without the patches, and got essentially the same output.
The outputs of both runs are attached.
msg269871 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-07-06 08:39
Changes to PyLong_FromUnsignedLong() and PyLong_FromUnsignedLongLong() LGTM. I don't know whether other changes have a positive effect. Are there any microbenchmarks? There are other places in which Py_SIZE() is set to the same value.
msg269987 - Author: Oren Milman (Oren Milman) * Date: 2016-07-08 14:43
I am sorry, but I can't see why micro-benchmarking is needed here, as my patches only remove code that does nothing, while they don't add any new code.

The assembly the compiler generates (on my PC) for 'Py_SIZE(v) = negative ? -ndigits : ndigits;' in PyLong_FromLongLong is shown below.
(Here '[edx+8]' is 'Py_SIZE(v)' and '[esp+10h+var_4]' is 'negative'. The 'lea ecx, [edx+0Ch]' and 'mov eax, edi' instructions set ecx and eax for later use; I left them in to be as precise as possible.)
    cmp     [esp+10h+var_4], 0
    lea     ecx, [edx+0Ch]
    jz      short loc_1E0D48EC
        neg     ebx
    mov     eax, edi
    mov     [edx+8], ebx
In contrast, the assembly the compiler generates for 'if (negative) Py_SIZE(v) = -ndigits;' is:
    cmp     [esp+10h+var_4], 0
    lea     ecx, [edx+0Ch]
    jz      short loc_1E0D482F
        neg     ebx
        mov     [edx+8], ebx
    mov     eax, edi
The assembly generated for the other original '?:' expressions differs from that of my corresponding patches in much the same way. Each patch moves the assignment from code that is executed in both flows to code that is executed in only one of them.
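
The two source forms being compared can be sketched side by side. This is an illustration under stated assumptions: `set_size_ternary` and `set_size_conditional` are hypothetical helpers, and a plain `ptrdiff_t` stands in for `Py_SIZE(v)`. The second form is only correct if the size field already holds `ndigits`, as it does right after `_PyLong_New`.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: a store to the size field happens on both paths. */
static void set_size_ternary(ptrdiff_t *ob_size, ptrdiff_t ndigits, int negative)
{
    *ob_size = negative ? -ndigits : ndigits;
}

/* Patched form: relies on *ob_size already being ndigits, so a store
   is needed only for negative values. */
static void set_size_conditional(ptrdiff_t *ob_size, ptrdiff_t ndigits, int negative)
{
    assert(*ob_size == ndigits);   /* precondition established by the allocator */
    if (negative)
        *ob_size = -ndigits;
}
```

Both helpers leave the same value in the size field; they differ only in whether the non-negative path performs a store.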

Am I missing anything that might cause my patches to introduce a performance penalty?

I searched the whole cpython repo for other places in which Py_SIZE() is set to the same value, and indeed found one in Objects/longobject.c, in _PyLong_Init:
The loop that initializes the small_ints array goes over every element in the array and checks whether it was already initialized. For some reason, even when it finds that the current element was already initialized, it still sets 'Py_SIZE(v)' and 'v->ob_digit[0]' (to the values they already hold).
These redundant assignments were first added in changeset 45072, and have remained there ever since.

So I added a patch to move these assignments so they would be executed only in case the current element of small_ints wasn't already initialized.
The updated patches diff file is attached. I also ran the tests again, and got essentially the same output (the output is attached).
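
The effect of that extra patch can be sketched with a toy model. Everything here is hypothetical: `small_int`, its `initialized` flag, `init_small_ints`, and `store_count` are illustrative stand-ins, not the real small_ints array or the check _PyLong_Init performs, and the range -5..256 is assumed for the cached values.

```c
#include <assert.h>
#include <stdlib.h>

#define N_NEG 5      /* assumed number of cached negative ints */
#define N_POS 257    /* assumed number of cached non-negative ints */

/* Toy stand-in for one cached int. */
typedef struct {
    int initialized;
    int size;              /* -1, 0 or 1, like Py_SIZE(v) for one digit */
    unsigned int digit0;   /* like v->ob_digit[0] */
} small_int;

static small_int small_ints[N_NEG + N_POS];
static int store_count;    /* counts elements whose fields were written */

static void init_small_ints(void)
{
    small_int *v = small_ints;
    int ival;

    for (ival = -N_NEG; ival < N_POS; ival++, v++) {
        if (v->initialized)
            continue;      /* patched behaviour: skip the repeated stores */
        /* A fresh slot has static storage, so it starts zeroed. */
        assert(v->size == 0 && v->digit0 == 0);
        v->initialized = 1;
        v->size = (ival < 0) ? -1 : ((ival == 0) ? 0 : 1);
        v->digit0 = (unsigned int)abs(ival);
        store_count++;
    }
}
```

Calling `init_small_ints` a second time performs no stores in this model, whereas the unpatched loop would rewrite every element with the values it already holds.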

Have you spotted any other places in which Py_SIZE() is set to the same value?
msg276813 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-09-17 18:42
> Am I missing anything that might cause my patches to introduce a performance penalty?

It's at least conceivable that code like

   Py_SIZE(v) = negative ? -ndigits : ndigits;

might be compiled to something branchless on some platforms (with some sets of compiler flags). The assembly you show demonstrates that that doesn't happen on your machine, but that doesn't say anything about other current or future machines.
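
To make "branchless" concrete, here is one way a conditional negation can be written without a jump. This is a hand-written sketch, not compiler output and not code from the patch: `negate_if` is a hypothetical helper, and it assumes two's complement integers.

```c
#include <assert.h>
#include <stddef.h>

/* Returns -ndigits if negative is nonzero, else ndigits, without branching.
   Assumes two's complement representation. */
static ptrdiff_t negate_if(ptrdiff_t ndigits, int negative)
{
    ptrdiff_t mask;

    assert(ndigits >= 0);                      /* digit counts are non-negative */
    mask = -(ptrdiff_t)(negative != 0);        /* 0, or all bits set (-1) */
    return (ndigits ^ mask) - mask;            /* x, or ~x + 1 == -x */
}
```

A compiler that lowers the '?:' form to something like this (or to a conditional-move instruction) always performs the store, so replacing it with an 'if' could add a branch on such a platform.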

I also prefer the original form for readability; so I agree with Serhiy that we shouldn't change it without evidence that the change improves performance.

I'll remove the two obviously redundant `Py_SIZE(v) = ...` operations in PyLong_FromUnsignedLong and PyLong_FromUnsignedLongLong.
msg276814 - Author: Roundup Robot (python-dev) Date: 2016-09-17 18:44
New changeset 27a6ecf84f72 by Mark Dickinson in branch 'default':
Issue #27441: Remove some redundant assignments to ob_size in longobject.c. Thanks Oren Milman.
msg276815 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-09-17 18:48
Changes to PyLong_FromUnsignedLong and PyLong_FromUnsignedLongLong applied. I've left the others; for the small int initialisation, that code isn't performance critical anyway, and I'm not entirely comfortable with assuming that PyObject_INIT_VAR will always handle negative sizes correctly. (The (ab)use of the sign bit of the ob_size field is something that's particular to the int type.)
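
The sign-bit convention mentioned above can be illustrated as follows. This is a sketch of the encoding only: `encode_size`, `sign_of`, and `digit_count` are hypothetical helpers, not CPython functions.

```c
#include <assert.h>
#include <stddef.h>

/* Pack a digit count and a sign into one signed size field. */
static ptrdiff_t encode_size(ptrdiff_t ndigits, int negative)
{
    assert(ndigits >= 0);
    return negative ? -ndigits : ndigits;
}

/* Sign of the int: -1, 0 or 1, read from the sign of the size field. */
static int sign_of(ptrdiff_t ob_size)
{
    return (ob_size > 0) - (ob_size < 0);
}

/* Number of digits: the absolute value of the size field. */
static ptrdiff_t digit_count(ptrdiff_t ob_size)
{
    return ob_size < 0 ? -ob_size : ob_size;
}
```

Generic variable-size object code would normally treat the size field as a plain element count, which is why a negative value passed through a generic initializer is a reasonable thing to be cautious about.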
Date | User | Action | Args
2017-03-31 16:36:28 | dstufft | set | pull_requests: + pull_request1020
2016-09-17 18:50:08 | mark.dickinson | set | status: open -> closed; resolution: fixed; stage: commit review -> resolved
2016-09-17 18:48:58 | mark.dickinson | set | messages: + msg276815
2016-09-17 18:44:27 | python-dev | set | nosy: + python-dev; messages: + msg276814
2016-09-17 18:42:26 | mark.dickinson | set | assignee: mark.dickinson; messages: + msg276813
2016-09-14 09:59:52 | mark.dickinson | set | nosy: + mark.dickinson
2016-09-13 14:02:21 | Oren Milman | set | versions: + Python 3.7, - Python 3.6
2016-07-08 14:43:36 | Oren Milman | set | files: + patchedCPythonTestOutput_ver2.txt
2016-07-08 14:43:16 | Oren Milman | set | files: + issue27441_ver2.diff; messages: + msg269987
2016-07-06 08:39:15 | serhiy.storchaka | set | messages: + msg269871
2016-07-05 13:30:40 | SilentGhost | set | nosy: + haypo, serhiy.storchaka; stage: commit review; versions: + Python 3.6
2016-07-02 14:40:55 | Oren Milman | set | files: + patchedCPythonTestOutput_ver1.txt
2016-07-02 14:40:17 | Oren Milman | set | files: + issue27441_ver1.diff; keywords: + patch
2016-07-02 14:38:57 | Oren Milman | create |