This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: segfault in PyErr_NormalizeException() after memory exhaustion
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Anthony Sottile, brett.cannon, ned.deily, pitrou, serhiy.storchaka, vstinner, xdegaye
Priority: Keywords: patch

Created on 2017-06-18 13:29 by xdegaye, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
memerr.py xdegaye, 2017-06-18 13:29
except_raises_except.py xdegaye, 2017-10-26 13:01
Pull Requests
URL Status Linked Edit
PR 2327 merged xdegaye, 2017-06-22 15:38
PR 4135 merged xdegaye, 2017-10-26 15:22
PR 4941 closed xdegaye, 2017-12-20 14:38
Messages (16)
msg296272 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2017-06-18 13:29
Nosying reviewers of PR 1981 of issue 22898.

The memerr.py script segfaults with the following gdb backtrace:

#0  0x0000000000550268 in PyErr_NormalizeException (exc=exc@entry=0x7fffffffdee8, 
    val=val@entry=0x7fffffffdef0, tb=tb@entry=0x7fffffffdef8) at Python/errors.c:315
#1  0x000000000055045f in PyErr_NormalizeException (exc=exc@entry=0x7fffffffdee8, 
    val=val@entry=0x7fffffffdef0, tb=tb@entry=0x7fffffffdef8) at Python/errors.c:319
#2  0x000000000055045f in PyErr_NormalizeException (exc=exc@entry=0x7fffffffdee8, 
    val=val@entry=0x7fffffffdef0, tb=tb@entry=0x7fffffffdef8) at Python/errors.c:319
#3  0x000000000055045f in PyErr_NormalizeException (exc=exc@entry=0x7fffffffdee8, 
    val=val@entry=0x7fffffffdef0, tb=tb@entry=0x7fffffffdef8) at Python/errors.c:319
...

To be able to run this patch, one needs to apply the nomemory_allocator.patch from issue 30695 and the infinite_loop.patch from issue 30696.

This raises two different problems:

a) The segfault itself that occurs upon setting the PyExc_RecursionErrorInst singleton. Oh! That had already been pointed out in msg231933 in issue 22898 at case 4).

b) When the size of the Python frames stack is greater than the size of the list of preallocated MemoryError instances, this list becomes exhausted and PyErr_NormalizeException() enters an infinite recursion which is stopped:
    * by the PyExc_RecursionErrorInst singleton when a) is fixed
    * by a Fatal Error (abort) when applying PR 1981 in its current state (there is no stack overflow as expected even in the absence of any guard before the recursive call to PyErr_NormalizeException())
The user is presented in both cases with an error hinting at a recursion problem instead of a problem with memory exhaustion. This is bug b).
msg296274 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2017-06-18 14:49
Problem b) is IMO a clear demonstration that using tstate->recursion_depth and the PyExc_RecursionErrorInst singleton is not the correct way to control the recursive calls to PyErr_NormalizeException() since the problem here is memory exhaustion, not recursion. One could instead abort with a Fatal Error message printing the type of the last exception, when the depth of the recursivity of PyErr_NormalizeException() exceeds let's say 128 (knowing that anyway the stack is protected in the functions that attempt to instantiate those exceptions). The normalization of an exception that fails with an exception whose normalization fails with an ... and this, 128 times in a row, surely this can be considered as a fatal error, no ?

PR 2035 eliminates the tail recursive call in PyErr_NormalizeException() and transforms it into a loop. This loop obviously does not involve the stack anymore. This is another argument that shows  that tstate->recursion_depth and the PyExc_RecursionErrorInst singleton which are related to the stack should not be used to control the recursivity of PyErr_NormalizeException() or the iterations of this loop.
msg296358 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2017-06-19 17:13
So is PR 2035 a fix for this? This discussion on this exact problem seems to have ended up spanning a couple issues and a PR so I'm losing track of where things sit at the moment.
msg296385 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2017-06-19 22:39
The two issues you are refering to are the instruments that are needed to reproduce the problem. The reference to PR 2035 is only made here to argue about the question of the correct way to control the successive calls to PyErr_NormalizeException(). This question is relevant here since one of the problems raised by this issue is that in the case of memory exhaustion the user is given a RecursionError as the cause of the problem.

FWIW PR 2035 transforms the tail recursion in PyErr_NormalizeException() into a loop (as compilers would do during optimization). An infinite recursion becomes then an infinite loop instead. The advantage is that there is no stack overflow. The drawback is that it is an infinite loop.
msg296462 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2017-06-20 15:45
And hence why you proposed having a counter of 128 (or some number) to prevent the infinite recursion.

I think this has gotten sufficiently complicated and into the bowels of CPython itself it might make sense to ask for a reviewer from python-committers (I don't feel like I'm in a good position to dive into this myself).
msg296639 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2017-06-22 15:41
PR 2327 lacks the test cases mentionned below for the moment.

1) With PR 2327, the memerr.py script runs correctly:

$ ./python /path/to/memerr.py
Fatal Python error: Cannot recover from MemoryErrors while normalizing exceptions.

Current thread 0x00007f37eab54fc0 (most recent call first):
  File "/path/to/memerr.py", line 8 in foo
  File "/path/to/memerr.py", line 13 in <module>
Aborted (core dumped)

2) With PR 2327, exceeding the recursion limit in PyErr_NormalizeException() raises a RecursionError:

$ ./python -q
>>> import _testcapi
>>> raise _testcapi.RecursingInfinitelyError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded while normalizing an exception
>>>

Note that when the infinite recursion is started by instantiating an exception written in Python code instead, the RecursionError is set by Py_EnterRecursiveCall() instead of by PyErr_NormalizeException().

3) With PR 2327, the test case in PR 1981 runs correctly (so PR 2327 fixes also issue 22898):

$ ./python /path/to/crasher.py    # crasher.py is the code run by test_recursion_normalizing_exception() in PR 1981
Done.
Traceback (most recent call last):
  File "/path/to/crasher.py", line 36, in <module>
    recurse(setrecursionlimit(depth + 2) - depth - 1)
  File "/path/to/crasher.py", line 19, in recurse
    recurse(cnt)
  File "/path/to/crasher.py", line 19, in recurse
    recurse(cnt)
  File "/path/to/crasher.py", line 19, in recurse
    recurse(cnt)
  [Previous line repeated 1 more times]
  File "/path/to/crasher.py", line 21, in recurse
    generator.throw(MyException)
  File "/path/to/crasher.py", line 25, in gen
    yield
RecursionError: maximum recursion depth exceeded while calling a Python object
sys:1: ResourceWarning: unclosed file <_io.FileIO name='/path/to/crasher.py' mode='rb' closefd=True>
msg305048 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2017-10-26 13:01
Checking the test_exceptions test cases that are added by PR 2327 on the current master branch, before the merge of PR 2327:
* test_recursion_normalizing_exception still fails (SIGABRT on a debug build and SIGSEGV otherwise)
* test_recursion_normalizing_infinite_exception is ok as expected
* test_recursion_normalizing_with_no_memory still fails with a SIGSEGV

The attached script except_raises_except.py exercises case 3) described in msg231933. The output is as follows when PR 2327 is applied and confirms that the ResourceWarning is now printed:

Traceback (most recent call last):
  File "except_raises_except.py", line 11, in <module>
    generator.throw(MyException)
  File "except_raises_except.py", line 7, in gen
    yield
  File "except_raises_except.py", line 3, in __init__
    raise MyException
  File "except_raises_except.py", line 3, in __init__
    raise MyException
  File "except_raises_except.py", line 3, in __init__
    raise MyException
  [Previous line repeated 495 more times]
RecursionError: maximum recursion depth exceeded while calling a Python object
sys:1: ResourceWarning: unclosed file <_io.FileIO name='except_raises_except.py' mode='rb' closefd=True>
msg305049 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2017-10-26 13:09
New changeset 56d1f5ca32892c7643eb8cee49c40c1644f1abfe by xdegaye in branch 'master':
bpo-30697: Fix PyErr_NormalizeException() when no memory (GH-2327)
https://github.com/python/cpython/commit/56d1f5ca32892c7643eb8cee49c40c1644f1abfe
msg305066 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2017-10-26 15:48
New changeset 4b27d51222be751125e6800453a39360a2dec11d by xdegaye in branch '3.6':
[3.6] bpo-30697: Fix PyErr_NormalizeException() when no memory (GH-2327). (#4135)
https://github.com/python/cpython/commit/4b27d51222be751125e6800453a39360a2dec11d
msg308672 - (view) Author: Anthony Sottile (Anthony Sottile) * Date: 2017-12-19 19:34
Should this have landed in python3.6? It removes a public symbol `PyExc_RecursionErrorInst` (abi break?)
msg308686 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2017-12-19 21:32
Re-opened as a release blocker to make sure we're okay with the potential ABI breakage.
msg308736 - (view) Author: Xavier de Gaye (xdegaye) * (Python triager) Date: 2017-12-20 14:52
As a reference the discussion about the removal of PyExc_RecursionErrorInst took place at PR 1981.

The new PR 4941 restores PyExc_RecursionErrorInst on 3.6, not sure if this should be merged. The same operation could be done on 3.7. Python is not using PyExc_RecursionErrorInst anymore and it is doubtful that anyone does.
msg311023 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-01-28 22:40
We should make a decision on this for both 3.6.5 and for 3.7.0. Victor, Antoine, Brett, Serhiy, any opinions?
msg311193 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2018-01-29 22:39
No opinion from me on how critical this is.
msg311303 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-01-30 22:39
I don't know what the right answer here is.  But since there don't seem to be strong opinions one way or the other with regard to 3.7, I am not going to hold 3.7.0b1 for a resolution.
msg313808 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-03-14 06:07
It's been three months since Anthony raised the question about whether this was an acceptable ABI change in 3.6.4.  Since then, I am not aware of any reports of problems this has caused and there has been no agreement that it is a critical problem.  Since 3.6.5rc1 has now been released without this being resolved, I'm going to close this issue and hope for the best.  Thanks, everyone!  Feel free to reopen if a real life problem can be demonstrated.
History
Date User Action Args
2022-04-11 14:58:47adminsetgithub: 74882
2018-03-14 06:07:50ned.deilysetpriority: deferred blocker ->
status: open -> closed
messages: + msg313808

stage: patch review -> resolved
2018-01-30 22:39:38ned.deilysetpriority: release blocker -> deferred blocker

messages: + msg311303
2018-01-29 22:39:51brett.cannonsetmessages: + msg311193
2018-01-28 22:40:11ned.deilysetmessages: + msg311023
2017-12-20 14:52:46xdegayesetmessages: + msg308736
2017-12-20 14:38:28xdegayesetstage: resolved -> patch review
pull_requests: + pull_request4833
2017-12-19 21:32:52brett.cannonsetstatus: closed -> open
priority: normal -> release blocker

nosy: + ned.deily
messages: + msg308686
2017-12-19 19:34:52Anthony Sottilesetnosy: + Anthony Sottile
messages: + msg308672
2017-10-26 15:51:17xdegayesetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2017-10-26 15:48:50xdegayesetmessages: + msg305066
2017-10-26 15:22:37xdegayesetkeywords: + patch
stage: patch review
pull_requests: + pull_request4098
2017-10-26 13:09:08xdegayesetmessages: + msg305049
2017-10-26 13:01:35xdegayesetfiles: + except_raises_except.py

messages: + msg305048
versions: - Python 3.5
2017-06-22 15:41:54xdegayesetmessages: + msg296639
2017-06-22 15:38:31xdegayesetpull_requests: + pull_request2376
2017-06-20 15:45:06brett.cannonsetmessages: + msg296462
2017-06-19 22:39:55xdegayesetmessages: + msg296385
2017-06-19 17:13:03brett.cannonsetmessages: + msg296358
2017-06-18 14:49:42xdegayesetmessages: + msg296274
2017-06-18 13:54:08xdegayesetversions: + Python 3.5, Python 3.6
2017-06-18 13:29:41xdegayecreate