classification
Title: segfault in PyErr_NormalizeException() after memory exhaustion
Type: crash Stage:
Components: Interpreter Core Versions: Python 3.7, Python 3.6, Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: brett.cannon, haypo, pitrou, serhiy.storchaka, xdegaye
Priority: normal Keywords:

Created on 2017-06-18 13:29 by xdegaye, last changed 2017-06-22 15:41 by xdegaye.

Files
File name Uploaded Description
memerr.py xdegaye, 2017-06-18 13:29
Pull Requests
URL Status Linked
PR 2327 open xdegaye, 2017-06-22 15:38
Messages (6)
msg296272 - Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2017-06-18 13:29
Nosying reviewers of PR 1981 of issue 22898.

The memerr.py script segfaults with the following gdb backtrace:

#0  0x0000000000550268 in PyErr_NormalizeException (exc=exc@entry=0x7fffffffdee8, 
    val=val@entry=0x7fffffffdef0, tb=tb@entry=0x7fffffffdef8) at Python/errors.c:315
#1  0x000000000055045f in PyErr_NormalizeException (exc=exc@entry=0x7fffffffdee8, 
    val=val@entry=0x7fffffffdef0, tb=tb@entry=0x7fffffffdef8) at Python/errors.c:319
#2  0x000000000055045f in PyErr_NormalizeException (exc=exc@entry=0x7fffffffdee8, 
    val=val@entry=0x7fffffffdef0, tb=tb@entry=0x7fffffffdef8) at Python/errors.c:319
#3  0x000000000055045f in PyErr_NormalizeException (exc=exc@entry=0x7fffffffdee8, 
    val=val@entry=0x7fffffffdef0, tb=tb@entry=0x7fffffffdef8) at Python/errors.c:319
...

To run this script, one needs to apply the nomemory_allocator.patch from issue 30695 and the infinite_loop.patch from issue 30696.

This raises two different problems:

a) The segfault itself that occurs upon setting the PyExc_RecursionErrorInst singleton. Oh! That had already been pointed out in msg231933 in issue 22898 at case 4).

b) When the size of the Python frames stack is greater than the size of the list of preallocated MemoryError instances, this list becomes exhausted and PyErr_NormalizeException() enters an infinite recursion which is stopped:
    * by the PyExc_RecursionErrorInst singleton when a) is fixed
    * by a Fatal Error (abort) when applying PR 1981 in its current state (there is no stack overflow as expected even in the absence of any guard before the recursive call to PyErr_NormalizeException())
The user is presented in both cases with an error hinting at a recursion problem instead of a problem with memory exhaustion. This is bug b).
msg296274 - Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2017-06-18 14:49
Problem b) is IMO a clear demonstration that using tstate->recursion_depth and the PyExc_RecursionErrorInst singleton is not the correct way to control the recursive calls to PyErr_NormalizeException(), since the problem here is memory exhaustion, not recursion. One could instead abort with a Fatal Error message printing the type of the last exception when the recursion depth of PyErr_NormalizeException() exceeds, let's say, 128 (knowing that the stack is protected anyway in the functions that attempt to instantiate those exceptions). The normalization of an exception that fails with an exception whose normalization fails with an ... and this 128 times in a row: surely this can be considered a fatal error, no?

PR 2035 eliminates the tail recursive call in PyErr_NormalizeException() and transforms it into a loop. This loop obviously does not involve the stack anymore. This is another argument showing that tstate->recursion_depth and the PyExc_RecursionErrorInst singleton, which are related to the stack, should not be used to control the recursion of PyErr_NormalizeException() or the iterations of this loop.
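
The idea of bounding the retries with a counter instead of relying on the stack can be modelled in a few lines of Python. This is only a toy sketch, not the code of PR 2035 or of any other PR: `normalize_with_limit`, `NormalizationAborted`, `AlwaysFails` and `MAX_NORMALIZE_DEPTH` are hypothetical names, and raising an exception stands in for the Fatal Error abort.

```python
MAX_NORMALIZE_DEPTH = 128  # hypothetical cap, the "let's say 128" suggested above


class NormalizationAborted(Exception):
    """Stand-in for the Fatal Error abort (a real abort would not return)."""


def normalize_with_limit(exc_type, limit=MAX_NORMALIZE_DEPTH):
    """Instantiate exc_type; if that raises, retry with the new exception's type.

    The retry is a loop rather than a tail-recursive call, so the stack does
    not grow. Only the iteration counter bounds the loop.
    """
    for _ in range(limit):
        try:
            return exc_type()
        except Exception as err:
            # The failure replaces the exception being normalized.
            exc_type = type(err)
    raise NormalizationAborted(
        f"gave up after {limit} failed normalizations; "
        f"last exception type: {exc_type.__name__}"
    )


class AlwaysFails(Exception):
    """Toy exception whose instantiation always fails with its own type."""

    def __init__(self):
        # Build the instance without calling __init__ again, then raise it.
        raise AlwaysFails.__new__(AlwaysFails)
```

In this model, `normalize_with_limit(ValueError)` returns a `ValueError` instance on the first iteration, while `normalize_with_limit(AlwaysFails)` ends with `NormalizationAborted` naming `AlwaysFails`, so the report points at the failing exception type rather than at a recursion limit.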
msg296358 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2017-06-19 17:13
So is PR 2035 a fix for this? The discussion of this exact problem seems to have ended up spanning a couple of issues and a PR, so I'm losing track of where things sit at the moment.
msg296385 - Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2017-06-19 22:39
The two issues you are referring to are the instruments that are needed to reproduce the problem. The reference to PR 2035 is only made here to argue about the question of the correct way to control the successive calls to PyErr_NormalizeException(). This question is relevant here since one of the problems raised by this issue is that in the case of memory exhaustion the user is given a RecursionError as the cause of the problem.

FWIW PR 2035 transforms the tail recursion in PyErr_NormalizeException() into a loop (as compilers would do during optimization). An infinite recursion then becomes an infinite loop instead. The advantage is that there is no stack overflow. The drawback is that it is an infinite loop.
msg296462 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2017-06-20 15:45
And hence why you proposed having a counter of 128 (or some number) to prevent the infinite recursion.

I think this has gotten sufficiently complicated and far enough into the bowels of CPython itself that it might make sense to ask for a reviewer from python-committers (I don't feel like I'm in a good position to dive into this myself).
msg296639 - Author: Xavier de Gaye (xdegaye) * (Python committer) Date: 2017-06-22 15:41
PR 2327 lacks the test cases mentioned below for the moment.

1) With PR 2327, the memerr.py script runs correctly:

$ ./python /path/to/memerr.py
Fatal Python error: Cannot recover from MemoryErrors while normalizing exceptions.

Current thread 0x00007f37eab54fc0 (most recent call first):
  File "/path/to/memerr.py", line 8 in foo
  File "/path/to/memerr.py", line 13 in <module>
Aborted (core dumped)

2) With PR 2327, exceeding the recursion limit in PyErr_NormalizeException() raises a RecursionError:

$ ./python -q
>>> import _testcapi
>>> raise _testcapi.RecursingInfinitelyError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded while normalizing an exception
>>>

Note that when the infinite recursion is started by instantiating an exception written in Python code instead, the RecursionError is set by Py_EnterRecursiveCall() instead of by PyErr_NormalizeException().
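
The pure-Python case described in the note can be illustrated with a small sketch. `RecursingError` and `outcome` are hypothetical names chosen for this example and are not part of `_testcapi` or of PR 2327. The assumption here is that each re-instantiation goes through an ordinary Python call to `__init__`, so the interpreter's regular recursion check is what ends the recursion.

```python
class RecursingError(Exception):
    """Hypothetical pure-Python exception whose __init__ raises its own class."""

    def __init__(self):
        # 'raise RecursingError' instantiates the class again, which calls
        # __init__ again, and so on through ordinary Python calls.
        raise RecursingError


def outcome():
    """Return the name of the exception type that finally propagates."""
    try:
        raise RecursingError
    except RecursionError as err:
        return type(err).__name__
```

Calling `outcome()` shows which exception type reaches the caller once the recursion limit is hit.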

3) With PR 2327, the test case in PR 1981 runs correctly (so PR 2327 also fixes issue 22898):

$ ./python /path/to/crasher.py    # crasher.py is the code run by test_recursion_normalizing_exception() in PR 1981
Done.
Traceback (most recent call last):
  File "/path/to/crasher.py", line 36, in <module>
    recurse(setrecursionlimit(depth + 2) - depth - 1)
  File "/path/to/crasher.py", line 19, in recurse
    recurse(cnt)
  File "/path/to/crasher.py", line 19, in recurse
    recurse(cnt)
  File "/path/to/crasher.py", line 19, in recurse
    recurse(cnt)
  [Previous line repeated 1 more times]
  File "/path/to/crasher.py", line 21, in recurse
    generator.throw(MyException)
  File "/path/to/crasher.py", line 25, in gen
    yield
RecursionError: maximum recursion depth exceeded while calling a Python object
sys:1: ResourceWarning: unclosed file <_io.FileIO name='/path/to/crasher.py' mode='rb' closefd=True>
History
Date User Action Args
2017-06-22 15:41:54 xdegaye set messages: + msg296639
2017-06-22 15:38:31 xdegaye set pull_requests: + pull_request2376
2017-06-20 15:45:06 brett.cannon set messages: + msg296462
2017-06-19 22:39:55 xdegaye set messages: + msg296385
2017-06-19 17:13:03 brett.cannon set messages: + msg296358
2017-06-18 14:49:42 xdegaye set messages: + msg296274
2017-06-18 13:54:08 xdegaye set versions: + Python 3.5, Python 3.6
2017-06-18 13:29:41 xdegaye create