
long_add and long_sub might return a new int where &small_ints[x] could be returned #71332

Closed
orenmn mannequin opened this issue May 28, 2016 · 5 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments


orenmn mannequin commented May 28, 2016

BPO 27145
Nosy @rhettinger, @vstinner, @methane, @1st1, @orenmn
PRs
  • bpo-27145: small_ints[x] could be returned in long_add and long_sub #15716
Files
  • proposedPatches.diff: proposed patches diff file
  • CPythonTestOutput.txt: test output of CPython without my patches (tested on my PC)
  • patchedCPythonTestOutput.txt: test output of CPython with my patches (tested on my PC)
  • proposedPatches.diff: proposed patches diff file update 1

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2019-11-26.07:55:28.666>
    created_at = <Date 2016-05-28.15:58:37.336>
    labels = ['interpreter-core', 'type-bug']
    title = 'long_add and long_sub might return a new int where &small_ints[x] could be returned'
    updated_at = <Date 2019-11-26.08:20:48.593>
    user = 'https://github.com/orenmn'

    bugs.python.org fields:

    activity = <Date 2019-11-26.08:20:48.593>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-11-26.07:55:28.666>
    closer = 'methane'
    components = ['Interpreter Core']
    creation = <Date 2016-05-28.15:58:37.336>
    creator = 'Oren Milman'
    dependencies = []
    files = ['43041', '43042', '43043', '43148']
    hgrepos = []
    issue_num = 27145
    keywords = ['patch']
    message_count = 5.0
    messages = ['266557', '267102', '267103', '357486', '357487']
    nosy_count = 5.0
    nosy_names = ['rhettinger', 'vstinner', 'methane', 'yselivanov', 'Oren Milman']
    pr_nums = ['15716']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue27145'
    versions = ['Python 3.6']


    orenmn mannequin commented May 28, 2016

    ------------ the current state ------------

    >>> if is32BitCPython:
    ...   PyLong_SHIFT = 15
    ... elif is64BitCPython:
    ...   PyLong_SHIFT = 30
    ...
    >>> ##### case A #####
    >>> a = 2 ** PyLong_SHIFT - 1
    >>> b = 2 ** PyLong_SHIFT - 2
    >>> a - b
    1
    >>> a - b is 1
    True
    >>> a + (-b) is 1
    True
    >>>
    >>> ##### case B #####
    >>> a = 2 ** PyLong_SHIFT
    >>> b = 2 ** PyLong_SHIFT - 1
    >>> a - b
    1
    >>> a - b is 1
    False
    >>> a + (-b) is 1
    False
    >>>
    >>> ##### case C #####
    >>> a = 2 ** PyLong_SHIFT + 1
    >>> b = 2 ** PyLong_SHIFT
    >>> a - b
    1
    >>> a - b is 1
    False
    >>> a + (-b) is 1
    False
    >>>
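    To see why cases B and C take the slow path, it helps to count how many PyLong_SHIFT-bit digits each operand occupies. The following is an illustrative Python sketch (assuming the 64-bit value PyLong_SHIFT = 30), not CPython's actual digit-counting code:

```python
# Illustrative sketch: count how many PyLong_SHIFT-bit digits CPython
# needs to store an int (assumes PyLong_SHIFT = 30, i.e. a 64-bit build).
PYLONG_SHIFT = 30

def num_digits(n):
    """Number of base-2**PYLONG_SHIFT digits needed for abs(n)."""
    n = abs(n)
    digits = 1
    while n >> PYLONG_SHIFT:
        n >>= PYLONG_SHIFT
        digits += 1
    return digits

# Case A: both operands fit in a single digit -> fast path, small int reused.
assert num_digits(2**PYLONG_SHIFT - 1) == 1
# Cases B and C: at least one operand needs two digits -> x_sub is called.
assert num_digits(2**PYLONG_SHIFT) == 2
assert num_digits(2**PYLONG_SHIFT + 1) == 2
```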

    This behavior is caused by the implementation of long_add and long_sub:
    Both long_add and long_sub check whether a and b are both single-digit ints, and if so do (respectively):
        return PyLong_FromLong(MEDIUM_VALUE(a) + MEDIUM_VALUE(b));
    or
        return PyLong_FromLong(MEDIUM_VALUE(a) - MEDIUM_VALUE(b));
    Otherwise, long_add and long_sub call x_add or x_sub to do the calculation on the absolute values of a and b.
    Finally, long_add and long_sub negate the result of the calculation, if needed, and return the final result.

    When both a and b are single-digit ints (e.g. case A), the result of the calculation is passed to PyLong_FromLong, which uses CHECK_SMALL_INT, and so a reference to an element of small_ints is returned where appropriate.
    When a and/or b is not a single-digit int (e.g. cases B and C), x_add or x_sub is called. Neither x_add nor x_sub checks whether the result is a small int (except for a case in x_sub where the result is zero), so long_add and long_sub might return a new int even where an element of small_ints could be reused.

    Due to the way CPython uses them, the issue is relevant to x_sub but not to x_add: the calculation the former performs might result in a small int, while that of the latter always results in a multiple-digit int.
    (Besides being called by long_add and long_sub, x_add might be called by k_mul, but in that case, too, the calculation would certainly result in a multiple-digit int.)
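    The dispatch described above can be modeled in plain Python. The cache bounds (-5 to 256) match CPython's default NSMALLNEGINTS/NSMALLPOSINTS, but the object representation here (one-element lists standing in for boxed ints) is purely illustrative:

```python
SMALL_INT_CACHE = {}  # stand-in for CPython's small_ints array

def from_long(value):
    """Model of PyLong_FromLong: reuse a cached object for small values."""
    if -5 <= value <= 256:          # CHECK_SMALL_INT range (default build)
        return SMALL_INT_CACHE.setdefault(value, [value])
    return [value]                  # a freshly "allocated" object

def long_sub(a, b, size_a, size_b):
    """Model of long_sub's dispatch; size_* is the operand's digit count."""
    if size_a <= 1 and size_b <= 1:
        return from_long(a - b)     # fast path: cached object possible
    return [a - b]                  # x_sub path: always a new object

# Single-digit operands can share a cached result; multi-digit ones cannot.
r1 = long_sub(7, 6, 1, 1)
r2 = long_sub(3, 2, 1, 1)
assert r1 is r2                                    # both are the cached 1
assert long_sub(2**30 + 1, 2**30, 2, 2) is not r1  # new object, same value
```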

    ------------ Note ------------
    I am not sure whether this is actually an issue that we want to fix.
    It seems to me that my proposed changes introduce a slight performance gain (in terms of memory, and probably also speed), and more consistent behavior of CPython.
    The performance gain would probably be much more relevant if and when greater default values are chosen for NSMALLNEGINTS and NSMALLPOSINTS.

    Anyway, I guess documenting the issue here, along with a proposal for a fix, is better than nothing.
    (As far as I know, since the unification of int and long in revision 40626, this issue never came up.)

    ------------ the proposed changes ------------
    All of the proposed changes are in Objects/longobject.c:

    1. in x_sub:
      To make sure x_sub returns a small int where appropriate, I simply wrapped the return value of x_sub with the function maybe_small_long.

    2. in long_sub:
      The previous patch alone would create a nasty bug.
      In case both a and b are negative, long_sub calls x_sub, and then negates the result in-place by doing 'Py_SIZE(z) = -(Py_SIZE(z));'.
      If x_sub returned a reference to a statically allocated small int (which is not zero), long_sub would actually change that statically allocated small int.

      To prevent that, I replaced that in-place negating with a call to _PyLong_Negate.
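    The hazard described in change 2 can be reproduced in miniature with any shared mutable object standing in for a cached small int (a hypothetical model, not CPython code):

```python
cache = {1: [1]}   # a boxed "1", shared by every user of the cache

def get_cached(v):
    return cache.get(v, [v])

def buggy_negate_in_place(boxed):
    # Analogue of 'Py_SIZE(z) = -(Py_SIZE(z));' applied to a shared object:
    boxed[0] = -boxed[0]
    return boxed

one = get_cached(1)
buggy_negate_in_place(one)     # meant to produce -1 ...
assert cache[1] == [-1]        # ... but the shared cached "1" is now corrupted

def safe_negate(boxed):
    # Analogue of _PyLong_Negate: build a fresh object instead of mutating.
    return [-boxed[0]]

cache[1] = [1]                 # repair the cache
assert safe_negate(get_cached(1)) == [-1] and cache[1] == [1]
```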

    3. in _PyLong_Negate:
      The previous patches, along with http://bugs.python.org/issue27073 (another issue I have opened recently), would cause long_sub to call _PyLong_Negate for a zero int, in case a and b are the same multiple-digit negative int.
      Moreover, in the default CPython branch, in case long_mul receives a multiple-digit negative int and zero, long_mul would call _PyLong_Negate for a zero int.

      To prevent doing 'PyLong_FromLong(-MEDIUM_VALUE(x))' where x is a zero int, I have added a check before that (along with a little addition to the function comment), so that _PyLong_Negate would do nothing if x is a zero int.
      It should be noted that MEDIUM_VALUE also checks whether x is a zero int (for its own reasons), so thanks to modern compilers, the check I propose to add shouldn't introduce a performance penalty.
      (In fact, when comparing the assembly of _PyLong_Negate (for win32) of the default CPython branch against that of the patched one, the latter looks simpler.)

      With regard to similar changes made in the past: _PyLong_Negate hasn't been changed since it replaced the macro NEGATE in revision 84698.

    4. in x_sub:
      The previous patches made it safe for x_sub to return a reference to a statically allocated small int, and thus made it possible to implement the following optimization.
      In case a and b have the same number of digits, x_sub finds the most significant digit where a and b differ. Then, if there is no such digit, it means a and b are equal, and so x_sub does 'return (PyLongObject *)PyLong_FromLong(0);'.
      I propose to add another check after that, to determine whether the least significant digit is the only digit where a and b differ. In that case, we can do something very similar to what long_add and long_sub do when they realize they are dealing with single-digit ints (as mentioned in 'the current state' section):
      return (PyLongObject *)PyLong_FromLong((sdigit)a->ob_digit[0] -
                                             (sdigit)b->ob_digit[0]);
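    A Python model of x_sub's equal-size path, including the proposed fast-path check from the fourth change, may make it concrete (digits are little-endian in base 2**PyLong_SHIFT; the function name and structure are illustrative, not the C code):

```python
SHIFT = 30  # PyLong_SHIFT on a 64-bit build

def x_sub_same_size(a_digits, b_digits):
    """Model of x_sub for |a| >= |b| with equal digit counts.

    Returns the digits of a - b, with the proposed single-digit fast path.
    """
    # Find the most significant digit where a and b differ.
    i = len(a_digits)
    while i > 0 and a_digits[i - 1] == b_digits[i - 1]:
        i -= 1
    if i == 0:
        return [0]                          # a == b -> PyLong_FromLong(0)
    if i == 1:                              # proposed check: only the least
        return [a_digits[0] - b_digits[0]]  # significant digit differs
    # General case: full digit-by-digit subtraction with borrow
    # (the result is not normalized; a trailing zero digit may remain).
    result, borrow = [], 0
    for da, db in zip(a_digits, b_digits):
        d = da - db - borrow
        borrow = d < 0
        result.append(d + (borrow << SHIFT))
    return result

# 2**30 + 1 minus 2**30: the operands differ only in the least significant
# digit, so the fast path returns a single digit directly.
assert x_sub_same_size([1, 1], [0, 1]) == [1]
```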

    ------------ alternative changes ------------
    As an alternative to these 4 changes, I also thought of simply wrapping the return value of long_add and long_sub with the function maybe_small_long (i.e. change the last line of each of them to 'return (PyObject *)maybe_small_long(z);').

    This change might be simpler, but I believe it would introduce a performance penalty, as it also adds checks to flows of long_add and long_sub that can never result in a small int.
    Furthermore, this change wouldn't save any allocations of small ints. For example, in case C (mentioned in 'the current state' section), both in (a - b) and in (a + (-b)):
    1. A new int of value 1 would be allocated by x_sub.
    2. At the end of long_add or long_sub, maybe_small_long would realize an element of small_ints could be used.
    3. maybe_small_long would call Py_DECREF on the new int of value 1, causing its deallocation.

    In contrast, in case C, the 4th change (in the proposed changes section) would cause x_sub to realize in advance that the result is going to be a single-digit int. Consequently, no new int would be needlessly allocated in the process.

    However, in case B (mentioned in 'the current state' section), neither the alternative changes nor the proposed changes (nor the default CPython branch) would prevent x_sub from allocating a new int.
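    For comparison, the alternative amounts to the following model of maybe_small_long (the real function is in Objects/longobject.c; the Python here is a sketch with one-element lists standing in for int objects):

```python
SMALL_CACHE = {v: [v] for v in range(-5, 257)}  # models the small_ints array

def maybe_small_long(obj):
    """Model of maybe_small_long: swap a freshly built int for a cached one.

    In the C version the fresh object is Py_DECREF'ed at this point, i.e.
    its allocation was wasted work.
    """
    value = obj[0]
    if -5 <= value <= 256:
        return SMALL_CACHE[value]   # the fresh 'obj' is discarded
    return obj

fresh = [1]                                    # a new int built by x_sub
assert maybe_small_long(fresh) is SMALL_CACHE[1]
assert maybe_small_long(fresh) is not fresh    # the allocation still happened
assert maybe_small_long([10**9]) == [10**9]    # large values pass through
```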

    ------------ micro-benchmarking ------------
    I did some micro-benchmarking:
    - subtraction of multiple-digit ints with the same number of digits, while the result fits in the small_ints array:
    python -m timeit "[i - (i + 1) for i in range(2 ** 67, 2 ** 67 + 1000)]" -> The patched CPython is approximately 8% faster.
    - subtraction of multiple-digit ints with the same number of digits, which differ only in the least significant digit (while the result doesn't fit in the small_ints array):
    python -m timeit "[i - (i + 6) for i in range(2 ** 67, 2 ** 67 + 1000)]" -> The patched CPython is approximately 3% faster.
    - subtraction of multiple-digit ints with the same number of digits, which differ (among others) in the most significant digit:
    python -m timeit "[i * 2 - i for i in range(2 ** 67, 2 ** 67 + 1000)]" -> The patched CPython is approximately 1% slower.
    - subtraction of multiple-digit ints with different numbers of digits:
    python -m timeit "[i ** 2 - i * 3 for i in range(2 ** 67, 2 ** 67 + 500)]" -> The patched CPython is approximately 2% faster.
    I expected the patched CPython to be somewhat slower here. Either I have missed something, or some compiler optimization magic was used here.

    ------------ diff ------------
    The patches diff is attached.

    ------------ tests ------------
    I built the patched CPython for x86, and played with it a little. Everything seemed to work as usual.
    Also, where cases B and C (that were mentioned in 'the current state' section) returned 'False', the patched CPython returned 'True', as expected.

    In addition, I ran 'python_d.exe -m test' (on my 64-bit Windows 10) with and without the patches, and got essentially the same output.
    The outputs of both runs are attached.

    @orenmn orenmn mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels May 28, 2016

    orenmn mannequin commented Jun 3, 2016

    I just realized I had forgotten to check for a failure after using _PyLong_Negate. The updated diff file is attached.


    orenmn mannequin commented Jun 3, 2016

    After giving it some more thought, I feel somewhat uncertain about that check for a failure after using _PyLong_Negate.

    At first I noticed that after every call to _PyLong_Negate there is such a check. But then I realized that in my patch, and also in long_mul (in the default branch of CPython), the check is unnecessary, as z is just returned after the call to _PyLong_Negate.

    Is adding such a check anyway (e.g. in long_mul) a convention? Or should such checks be removed?


    methane commented Nov 26, 2019

    New changeset 036fe85 by Inada Naoki (HongWeipeng) in branch 'master':
    bpo-27145: small_ints[x] could be returned in long_add and long_sub (GH-15716)

    @methane methane closed this as completed Nov 26, 2019
    @vstinner

    It seems like there are a few corner cases where long integers are not normalized:
    #15716 (review)

    But the initial issue described here has been fixed, so it's better to keep this issue closed. If someone wants to cover more cases (to normalize), please open a separate issue.

    Thanks HongWeipeng for the fix and Oren Milman for the original bug report and original patches!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022