classification
Title: Segfault on __getattr__ infinite recursion on certain attribute accesses
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: The Compiler, bluetech, gvanrossum, loewis, nascheme, pitrou, serhiy.storchaka, sparkingdark
Priority: normal Keywords: patch

Created on 2020-10-02 10:49 by bluetech, last changed 2020-10-04 22:44 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 22536 merged serhiy.storchaka, 2020-10-04 08:05
PR 22550 merged serhiy.storchaka, 2020-10-04 21:57
PR 22551 merged serhiy.storchaka, 2020-10-04 22:03
Messages (8)
msg377807 - (view) Author: Ran Benita (bluetech) Date: 2020-10-02 10:49
The following program crashes with a segfault:

class Segfault:
    def __getattr__(self, name):
        self.unknown_attribute

instance = Segfault()
issubclass(instance, int)  # int doesn't matter

Tested with Python 3.7, 3.8, 3.9rc2, and master in debug mode (commit 497126f7ea955ee005e78f2cdd61f175d4fcdbb2).

The program is odd, but this affects pytest (see https://github.com/pytest-dev/pytest/issues/2330).
The class is buggy user code, pytest does the issubclass check.
In pytest there are other code paths which trigger it besides issubclass, but that's the one I am able to quickly reproduce standalone.

Here is the backtrace with python master:

ran@ran:~/src/cpython$ gdb --args ./python segfault.py
Reading symbols from ./python...
(gdb) r
Starting program: /home/ran/src/cpython/python segfault.py

Program received signal SIGSEGV, Segmentation fault.
0x00005555556bc55e in PyGILState_Check () at Python/pystate.c:1353
1353	    if (!PyThread_tss_is_created(&gilstate->autoTSSkey)) {


(gdb) bt 29
#0  0x00005555556bc55e in PyGILState_Check () at Python/pystate.c:1353
#1  0x00005555555fc117 in _PyMem_DebugCheckGIL (func=0x55555581b330 <__func__.10> "_PyMem_DebugMalloc") at Objects/obmalloc.c:2329
#2  _PyMem_DebugMalloc (ctx=0x55555593b140 <_PyMem_Debug+96>, nbytes=185) at Objects/obmalloc.c:2329
#3  0x00005555555fcee8 in PyObject_Malloc (size=<optimized out>) at Objects/obmalloc.c:685
#4  0x000055555562d6f3 in PyUnicode_New (size=136, maxchar=<optimized out>) at Objects/unicodeobject.c:1455
#5  0x00005555556556e7 in _PyUnicodeWriter_PrepareInternal (writer=writer@entry=0x7fffff7ff0c0, length=length@entry=1, maxchar=<optimized out>, maxchar@entry=127)
    at Objects/unicodeobject.c:14004
#6  0x0000555555658439 in _PyUnicodeWriter_WriteASCIIString (writer=writer@entry=0x7fffff7ff0c0, ascii=ascii@entry=0x5555558196c8 "'%.50s' object has no attribute '%U'", 
    len=1) at Objects/unicodeobject.c:14174
#7  0x000055555565a18d in PyUnicode_FromFormatV (format=format@entry=0x5555558196c8 "'%.50s' object has no attribute '%U'", vargs=vargs@entry=0x7fffff7ff170)
    at Objects/unicodeobject.c:3114
#8  0x000055555569646b in _PyErr_FormatV (tstate=0x5555559aa300, exception=<type at remote 0x555555932040>, 
    format=format@entry=0x5555558196c8 "'%.50s' object has no attribute '%U'", vargs=vargs@entry=0x7fffff7ff170) at Python/errors.c:1029
#9  0x0000555555696ff3 in PyErr_Format (exception=<optimized out>, format=format@entry=0x5555558196c8 "'%.50s' object has no attribute '%U'") at Python/errors.c:1071
#10 0x00005555555fa4a5 in _PyObject_GenericGetAttrWithDict (obj=obj@entry=<Segfault at remote 0x7ffff772dd70>, name=name@entry='unknown_attribute', dict=<optimized out>, 
    dict@entry=0x0, suppress=suppress@entry=0) at Objects/object.c:1268
#11 0x00005555555fa96f in PyObject_GenericGetAttr (obj=obj@entry=<Segfault at remote 0x7ffff772dd70>, name=name@entry='unknown_attribute') at Objects/object.c:1281
#12 0x000055555561c3f8 in slot_tp_getattr_hook (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute') at Objects/typeobject.c:6805
#13 0x00005555555f68c2 in PyObject_GetAttr (v=v@entry=<Segfault at remote 0x7ffff772dd70>, name=<optimized out>) at Objects/object.c:891
#14 0x000055555567c254 in _PyEval_EvalFrameDefault (tstate=0x5555559aa300, 
    f=Frame 0x7ffff71cc3b0, for file /home/ran/src/cpython/segfault.py, line 3, in __getattr__ (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute'), 
    throwflag=<optimized out>) at Python/ceval.c:3036
#15 0x00005555555b9d1d in _PyEval_EvalFrame (throwflag=0, 
    f=Frame 0x7ffff71cc3b0, for file /home/ran/src/cpython/segfault.py, line 3, in __getattr__ (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute'), 
    tstate=0x5555559aa300) at ./Include/internal/pycore_ceval.h:40
#16 function_code_fastcall (tstate=0x5555559aa300, co=<optimized out>, args=0x7fffff7ff590, nargs=2, globals=<optimized out>) at Objects/call.c:329
#17 0x00005555555ba67b in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:366
#18 0x000055555576c2d4 in _PyObject_VectorcallTstate (tstate=0x5555559aa300, callable=<function at remote 0x7ffff773c4b0>, args=0x7fffff7ff580, nargsf=2, kwnames=0x0)
    at ./Include/cpython/abstract.h:114
#19 0x000055555576cb87 in method_vectorcall (method=<optimized out>, args=0x7fffff7ff588, nargsf=<optimized out>, kwnames=0x0) at Objects/classobject.c:53
#20 0x00005555556109ac in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=9223372036854775809, args=0x7fffff7ff588, callable=<method at remote 0x7ffff71c8c50>, 
    tstate=0x5555559aa300) at ./Include/cpython/abstract.h:114
#21 PyObject_CallOneArg (arg='unknown_attribute', func=<method at remote 0x7ffff71c8c50>) at ./Include/cpython/abstract.h:184
#22 call_attribute (self=self@entry=<Segfault at remote 0x7ffff772dd70>, attr=<method at remote 0x7ffff71c8c50>, attr@entry=<function at remote 0x7ffff773c4b0>, 
    name=name@entry='unknown_attribute') at Objects/typeobject.c:6771
#23 0x000055555561c44a in slot_tp_getattr_hook (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute') at Objects/typeobject.c:6813
#24 0x00005555555f68c2 in PyObject_GetAttr (v=v@entry=<Segfault at remote 0x7ffff772dd70>, name=<optimized out>) at Objects/object.c:891
#25 0x000055555567c254 in _PyEval_EvalFrameDefault (tstate=0x5555559aa300, 
    f=Frame 0x7ffff71cc200, for file /home/ran/src/cpython/segfault.py, line 3, in __getattr__ (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute'), 
    throwflag=<optimized out>) at Python/ceval.c:3036
#26 0x00005555555b9d1d in _PyEval_EvalFrame (throwflag=0, 
    f=Frame 0x7ffff71cc200, for file /home/ran/src/cpython/segfault.py, line 3, in __getattr__ (self=<Segfault at remote 0x7ffff772dd70>, name='unknown_attribute'), 
    tstate=0x5555559aa300) at ./Include/internal/pycore_ceval.h:40
#27 function_code_fastcall (tstate=0x5555559aa300, co=<optimized out>, args=0x7fffff7ff8b0, nargs=2, globals=<optimized out>) at Objects/call.c:329
#28 0x00005555555ba67b in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:366
(More stack frames follow...)



(gdb) py-bt
Traceback (most recent call first):
  File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
    self.unknown_attribute
  File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
    self.unknown_attribute
  File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
    self.unknown_attribute
  File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
    self.unknown_attribute
  File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
    self.unknown_attribute
  File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
    self.unknown_attribute
  File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
    self.unknown_attribute
  File "/home/ran/src/cpython/segfault.py", line 3, in __getattr__
    self.unknown_attribute
--Type <RET> for more, q to quit, c to continue without paging--q
msg377860 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-03 08:08
Interesting, when lookup any attribute you will get a RecursionError, but this is one of only two places where the recursion check is disabled (the other one is in interning strings).

You get a crash also when call isinstance(1, instance) or issubclass(int, instance). This is the same bug.

The implementation of issubclass() calls PyObject_IsSubclass() which calls object_issubclass() which calls recursive_issubclass() which calls check_class() which calls abstract_get_bases() which looks up attribute "__bases__" which leads to infinite recursion in __getattr__().

#125278 0x000055555576116d in abstract_get_bases (cls=cls@entry=0x7fffeabcee10) at Objects/abstract.c:2340
#125279 0x00005555557611f1 in check_class (cls=cls@entry=0x7fffeabcee10, error=error@entry=0x55555585c828 "issubclass() arg 1 must be a class") at Objects/abstract.c:2396
#125280 0x00005555557619bf in recursive_issubclass (derived=derived@entry=0x7fffeabcee10, cls=cls@entry=0x555555946220 <PyLong_Type>) at Objects/abstract.c:2524
#125281 0x0000555555761daf in object_issubclass (tstate=<optimized out>, derived=0x7fffeabcee10, cls=0x555555946220 <PyLong_Type>) at Objects/abstract.c:2550
#125282 0x0000555555765ea1 in PyObject_IsSubclass (derived=<optimized out>, cls=<optimized out>) at Objects/abstract.c:2600
#125283 0x00005555557b78f3 in builtin_issubclass_impl (module=module@entry=0x7fffeae21d70, cls=<optimized out>, class_or_tuple=<optimized out>) at Python/bltinmodule.c:2511
#125284 0x00005555557b794d in builtin_issubclass (module=0x7fffeae21d70, args=0x7fffead0e388, nargs=<optimized out>) at Python/clinic/bltinmodule.c.h:828

The problem is that in abstract_get_bases() the recursion check is disabled by using macros Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION (added in 5b222135f8d2492713994f2cb003980e87ce6a72). I do not know why it was necessary. Currently tests are passed if enable recursion check, and this fixes this issue.

It is worth to mention that attribute __bases__ is looked up to support non-types in issubclass() and isinstance(). Originally it was added in issue464992 to support Zope extension ExtensionClass. I tested that the current code of ExtensionClass does not need it. So we could simplify the code and avoid recursion by just using tp_bases. But this needs wider discussion.
msg377905 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-10-03 22:30
I don't think I remember what's so special about abstract_get_bases() to disable the recursion check. Maybe this macro is always just a micro-optimization? Then again it's hard to be sure, the behavior may depend on the platform.

Pulling in Antoine, who showed up in the  email https://mail.python.org/pipermail/python-dev/2008-August/082106.html is mentioned in a long comment in ceval.h about the recursion protection.

Possibly the Py_ALLOW_RECURSION macro is intended to ensure that some handler code for first-order recursion errors doesn't get interrupted by second-order recursion errors? There's a limit of 50 stack frames which feels a bit dodgy.
msg377920 - (view) Author: Debojyoti Chakraborty (sparkingdark) Date: 2020-10-04 05:34
Is it open Then give the process to  reproduce the issue.
msg377922 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-04 07:42
Antoine give me a hint. The problem seems related to using PyDict_GetItem() when look up an attribute in the instance dict. It silences all exceptions (including RecursionError) and this results in silent "no such key in a dict".

But since 3.8 it was replaced with PyDict_GetItemWithError() which preserves RecursionError (see issue35459). I think we can remove Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION now.

As for other case in interning string, Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION was used to guard PyDict_GetItem(). But the current code uses PyDict_SetDefault() (added in 3.4) which does not silence errors. So Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION can be removed in that place too.
msg377977 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-04 21:56
New changeset 9ece9cd65cdeb0a1f6e60475bbd0219161c348ac by Serhiy Storchaka in branch 'master':
bpo-41909: Enable previously disabled recursion checks. (GH-22536)
https://github.com/python/cpython/commit/9ece9cd65cdeb0a1f6e60475bbd0219161c348ac
msg377979 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-04 22:27
New changeset 7aa22ba923509af1dbf115c090964f503c84ca8d by Serhiy Storchaka in branch '3.9':
[3.9] bpo-41909: Enable previously disabled recursion checks. (GH-22536) (GH-22550)
https://github.com/python/cpython/commit/7aa22ba923509af1dbf115c090964f503c84ca8d
msg377980 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-04 22:28
New changeset 09a7b3b618cd02694a0bc8abfa24c75f0e659407 by Serhiy Storchaka in branch '3.8':
[3.8] bpo-41909: Enable previously disabled recursion checks. (GH-22536) (GH-22551)
https://github.com/python/cpython/commit/09a7b3b618cd02694a0bc8abfa24c75f0e659407
History
Date User Action Args
2020-10-04 22:44:32serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2020-10-04 22:28:03serhiy.storchakasetmessages: + msg377980
2020-10-04 22:27:43serhiy.storchakasetmessages: + msg377979
2020-10-04 22:03:39serhiy.storchakasetpull_requests: + pull_request21549
2020-10-04 21:57:34serhiy.storchakasetpull_requests: + pull_request21548
2020-10-04 21:56:01serhiy.storchakasetmessages: + msg377977
2020-10-04 08:05:52serhiy.storchakasetkeywords: + patch
stage: patch review
pull_requests: + pull_request21539
2020-10-04 07:42:44serhiy.storchakasetmessages: + msg377922
2020-10-04 05:34:56sparkingdarksetnosy: + sparkingdark
messages: + msg377920
2020-10-03 22:30:01gvanrossumsetnosy: + pitrou
messages: + msg377905
2020-10-03 08:08:54serhiy.storchakasetnosy: + serhiy.storchaka, gvanrossum, loewis, nascheme

messages: + msg377860
versions: + Python 3.8, Python 3.9, Python 3.10
2020-10-02 12:49:27The Compilersetnosy: + The Compiler
2020-10-02 10:58:24bluetechsettype: crash
2020-10-02 10:49:51bluetechcreate