msg289650 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-15 08:55 |
Currently the value of the right operand of the right shift operator is limited by the C Py_ssize_t type.
>>> 1 >> 10**100
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
>>> (-1) >> 10**100
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
>>> 1 >> -10**100
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
>>> (-1) >> -10**100
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
But this is an artificial limitation. Right shift can be extended to support arbitrary integers. `x >> very_large_value` should be 0 for non-negative x and -1 for negative x. `x >> negative_value` should raise ValueError, as it already does for small shift counts:
>>> 1 >> 10
0
>>> (-1) >> 10
-1
>>> 1 >> -10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: negative shift count
>>> (-1) >> -10
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: negative shift count
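The proposed semantics can be modelled in pure Python. This is only a sketch of the intended behaviour, not the C implementation; the function name `rshift_any` is hypothetical. It relies on the fact that once the shift count reaches the bit length of `x`, every significant bit has been shifted out.

```python
def rshift_any(x: int, count: int) -> int:
    """Hypothetical model of `x >> count` for an arbitrary integer count."""
    if count < 0:
        # Same error that small negative counts already produce.
        raise ValueError("negative shift count")
    if count >= x.bit_length():
        # All significant bits are gone: the floor of x / 2**count
        # is 0 for non-negative x and -1 for negative x.
        return -1 if x < 0 else 0
    # Here count is smaller than the bit length of x, so the
    # built-in operator can handle it.
    return x >> count


print(rshift_any(1, 10**100))
print(rshift_any(-1, 10**100))
```

The early return never passes the huge count to the built-in operator, so the model does not depend on how large a shift count the interpreter itself accepts.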
|
msg289651 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-03-15 08:57 |
If we change something, I suggest to be consistent with lshift. I expect a memory error on "1 << (1 << 1024)" (no unlimited loop before a global system collapse please ;-))
|
msg289652 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-03-15 09:00 |
FYI I saw recently that the C limitation of len() was reported in the "owasp-pysec" project:
https://github.com/ebranca/owasp-pysec/wiki/Overflow-in-len-function
I don't understand why such a "deliberate" limitation was reported in a hardened CPython project.
|
msg289654 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-15 09:19 |
> If we change something, I suggest to be consistent with lshift. I expect a memory error on "1 << (1 << 1024)" (no unlimited loop before a global system collapse please ;-))
I agree that left shift should raise a ValueError rather than an OverflowError for large negative shifts. But it is hard to handle large positive shifts. `1 << count` consumes `count*2/15` bytes of memory. There is a gap between the maximal number of bits representable as Py_ssize_t (PY_SSIZE_T_MAX) and the number of bits of the maximal Python int (PY_SSIZE_T_MAX*15/2). _PyLong_NumBits() suffers from the same issue. I think an OverflowError is appropriate here for denoting the platform and implementation limitation.
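The `count*2/15` estimate can be checked against `sys.int_info`. The sketch below is illustrative only; the helper name `digit_payload_bytes` is hypothetical, and it counts just the digit array, not the object header. The ratio 2/15 follows from storing 15 bits per 2-byte digit, or equivalently 30 bits per 4-byte digit.

```python
import sys


def digit_payload_bytes(count: int) -> int:
    """Bytes occupied by the digits of `1 << count` (header excluded)."""
    info = sys.int_info
    # Bit number `count` lives in digit number count // bits_per_digit.
    ndigits = count // info.bits_per_digit + 1
    return ndigits * info.sizeof_digit


count = 15_000
print(digit_payload_bytes(count), count * 2 // 15)

# The gap mentioned above: a bit count must fit in Py_ssize_t, but an int
# whose digit array fills PY_SSIZE_T_MAX bytes has more bits than that.
info = sys.int_info
max_bits_by_memory = sys.maxsize * info.bits_per_digit // info.sizeof_digit
print(max_bits_by_memory > sys.maxsize)
```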
|
msg289658 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-15 09:52 |
This may be a part of this issue or a separate issue: bytes(-1) raises a ValueError, but bytes(-10**100) raises an OverflowError.
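A small script to observe the inconsistency on a given interpreter. The helper name `exception_name_for` is hypothetical, and the result for the large negative value is deliberately not asserted here because it may differ between Python versions.

```python
def exception_name_for(count: int):
    """Return the name of the exception raised by bytes(count), if any."""
    try:
        bytes(count)
    except Exception as exc:
        return type(exc).__name__
    return None


for count in (-1, -10**100):
    print(count, exception_name_for(count))
```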
|
msg289660 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2017-03-15 10:03 |
> I think an OverflowError is appropriate here for denoting the platform and implementation limitation.
It's common that integer overflow on memory allocation in C code raises a MemoryError, not an OverflowError.
>>> "x" * (2**60)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
I suggest raising a MemoryError.
|
msg289662 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-15 10:17 |
This is not a MemoryError. On a 32-bit platform `1 << (sys.maxsize + 1)` raises an OverflowError, but `1 << sys.maxsize << 1` can be calculated.
|
msg289692 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-15 20:30 |
Unfortunately it is hard to totally avoid OverflowError in right shift. Right shift of a huge positive value can give a non-zero result even if the shift count is larger than PY_SSIZE_T_MAX. PR 680 just decreases the opportunity of getting an OverflowError.
|
msg289697 - (view) |
Author: Oren Milman (Oren Milman) * |
Date: 2017-03-15 21:00 |
I played a little with a patch earlier today, but stopped because I
am short on time.
Anyway, just in case my code is not totally rubbish, I attach my
patch draft, which should avoid OverflowError also for big positive
ints.
(Of course, I don't suggest using my code instead of PR 680. I just
put it here in case it might be useful for someone.)
(On my Windows 10, it passed some manual tests by me, and the test
module (except for test_venv, which fails also without the patch).)
|
msg289751 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-17 09:36 |
Thank you Oren, but your code doesn't work when PY_SSIZE_T_MAX < b < PY_SSIZE_T_MAX * PyLong_SHIFT and a > 2 ** b. When you drop wordshift and leave only loshift_d, you should also drop the lower wordshift digits of a.
The code for left shift would be even more complex.
|
msg289767 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-17 16:06 |
Updated PR. Now OverflowError is never raised if the result is representable.
Mark, could you please make a review?
|
msg289878 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2017-03-20 08:53 |
> Mark, could you please make a review?
I'll try to find time this week. At least in principle, the change sounds good to me.
|
msg289898 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-20 19:01 |
Here are two patches. The first uses C long long arithmetic (it corresponds to the current PR 680), the second uses PyLong arithmetic. Which is easier to read and verify?
|
msg289984 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2017-03-22 14:04 |
I much prefer the `divrem1`-based version: it makes fewer assumptions about relative sizes of long / long long / size_t and about the number of bits per digit. I'd rather not have another place that would have to be carefully examined in the future if the number of bits per digit changed again. Overall, Objects/longobject.c is highly portable, and I'd like to keep it that way as much as possible.
|
msg290011 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-22 19:35 |
Updated the PR to the divrem1-based version. The drawback is that divrem1 can fail with MemoryError, while C long long arithmetic always works for integers smaller than 1 exbibyte.
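The idea of dividing the shift count by the digit width can be sketched in pure Python. This is a model under stated assumptions, not the code from the PR: the digit width `SHIFT = 30` and the helper names `to_digits`, `from_digits` and `rshift_digits` are all hypothetical. The point is that `count // SHIFT` (the number of whole digits to drop) can fit in a machine word even when `count` itself does not.

```python
SHIFT = 30                # assumed number of bits per digit
MASK = (1 << SHIFT) - 1


def to_digits(n: int) -> list:
    """Split a non-negative int into base-2**SHIFT digits, lowest first."""
    digits = []
    while n:
        digits.append(n & MASK)
        n >>= SHIFT
    return digits


def from_digits(digits: list) -> int:
    """Inverse of to_digits."""
    n = 0
    for d in reversed(digits):
        n = (n << SHIFT) | d
    return n


def rshift_digits(digits: list, count: int) -> list:
    """Right-shift a digit list by an arbitrary non-negative count."""
    # Whole digits to drop, and the remaining shift within a digit.
    wordshift, remshift = divmod(count, SHIFT)
    if wordshift >= len(digits):
        return []         # everything is shifted out: the result is 0
    kept = digits[wordshift:]
    out = []
    for i, d in enumerate(kept):
        lo = d >> remshift
        # Bits carried down from the next higher digit, if there is one.
        hi = kept[i + 1] << (SHIFT - remshift) if i + 1 < len(kept) else 0
        out.append((lo | hi) & MASK)
    while out and out[-1] == 0:
        out.pop()         # normalize: strip leading zero digits
    return out


n = 2**200 + 12345
print(from_digits(rshift_digits(to_digits(n), 61)) == n >> 61)
```

Only non-negative values are modelled here; negative operands need the usual floor adjustment on top of this.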
|
msg290012 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-22 19:52 |
The special case would not be needed if Python ints on 32-bit platforms were limited to approximately 2**2**28. int.bit_length() could be simpler too.
|
msg290824 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-30 06:47 |
New changeset 918403cfc3304d27e80fb792357f40bb3ba69c4e by Serhiy Storchaka in branch 'master':
bpo-29816: Shift operation now has less opportunity to raise OverflowError. (#680)
https://github.com/python/cpython/commit/918403cfc3304d27e80fb792357f40bb3ba69c4e
|
msg290826 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-30 07:00 |
Thank you for your review Mark.
|
msg292133 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-04-22 18:50 |
New changeset 997a4adea606069e01beac6269920709db3994d1 by Serhiy Storchaka in branch 'master':
Remove outdated note about constraining of the bit shift right operand. (#1258)
https://github.com/python/cpython/commit/997a4adea606069e01beac6269920709db3994d1
|
|
Date | User | Action | Args |
2022-04-11 14:58:44 | admin | set | github: 74002 |
2017-04-22 18:50:11 | serhiy.storchaka | set | messages: + msg292133 |
2017-04-22 17:50:19 | serhiy.storchaka | set | pull_requests: + pull_request1371 |
2017-03-30 07:00:24 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; messages: + msg290826; stage: patch review -> resolved |
2017-03-30 06:47:09 | serhiy.storchaka | set | messages: + msg290824 |
2017-03-22 19:52:55 | serhiy.storchaka | set | messages: + msg290012 |
2017-03-22 19:35:29 | serhiy.storchaka | set | messages: + msg290011 |
2017-03-22 14:04:56 | mark.dickinson | set | messages: + msg289984 |
2017-03-20 19:02:33 | serhiy.storchaka | set | files: + long-shift-overflow-divrem1.diff |
2017-03-20 19:02:18 | serhiy.storchaka | set | files: + long-shift-overflow-long-long.diff |
2017-03-20 19:01:21 | serhiy.storchaka | set | messages: + msg289898 |
2017-03-20 08:53:27 | mark.dickinson | set | messages: + msg289878 |
2017-03-17 16:06:38 | serhiy.storchaka | set | messages: + msg289767 |
2017-03-17 09:36:10 | serhiy.storchaka | set | messages: + msg289751 |
2017-03-17 08:36:16 | serhiy.storchaka | link | issue29833 dependencies |
2017-03-15 21:00:49 | Oren Milman | set | files: + patchDraft1.diff; keywords: + patch; messages: + msg289697 |
2017-03-15 20:30:52 | serhiy.storchaka | set | messages: + msg289692; stage: needs patch -> patch review |
2017-03-15 20:25:42 | serhiy.storchaka | set | pull_requests: + pull_request557 |
2017-03-15 10:17:07 | serhiy.storchaka | set | messages: + msg289662 |
2017-03-15 10:03:27 | vstinner | set | messages: + msg289660 |
2017-03-15 09:52:04 | serhiy.storchaka | set | messages: + msg289658 |
2017-03-15 09:19:13 | serhiy.storchaka | set | messages: + msg289654 |
2017-03-15 09:06:14 | serhiy.storchaka | link | issue15988 dependencies |
2017-03-15 09:00:29 | vstinner | set | messages: + msg289652 |
2017-03-15 08:57:40 | vstinner | set | nosy: + vstinner; messages: + msg289651 |
2017-03-15 08:55:17 | serhiy.storchaka | create | |