msg400728 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-08-31 13:17 |
Writing C extensions using directly the C API is error prone. It's easy to add or forget a Py_INCREF or Py_DECREF. Adding Py_DECREF(Py_True) or Py_DECREF(Py_False) by mistake causes a surprising crash at Python exit:
---
Debug memory block at address p=0x8a6e80: API ''
0 bytes originally requested
The 7 pad bytes at p-7 are not all FORBIDDENBYTE (0xfd):
at p-7: 0x00 *** OUCH
at p-6: 0x00 *** OUCH
at p-5: 0x00 *** OUCH
at p-4: 0x00 *** OUCH
at p-3: 0x00 *** OUCH
at p-2: 0x00 *** OUCH
at p-1: 0x00 *** OUCH
Because memory is corrupted at the start, the count of bytes requested
may be bogus, and checking the trailing pad bytes may segfault.
The 8 pad bytes at tail=0x8a6e80 are not all FORBIDDENBYTE (0xfd):
at tail+0: 0x00 *** OUCH
at tail+1: 0x00 *** OUCH
at tail+2: 0x00 *** OUCH
at tail+3: 0x00 *** OUCH
at tail+4: 0x00 *** OUCH
at tail+5: 0x00 *** OUCH
at tail+6: 0x00 *** OUCH
at tail+7: 0x00 *** OUCH
Enable tracemalloc to get the memory block allocation traceback
Fatal Python error: _PyMem_DebugRawFree: bad ID: Allocated using API '', verified using API 'o'
Python runtime state: finalizing (tstate=0x0000000001f43c50)
Current thread 0x00007f3f562fa740 (most recent call first):
Garbage-collecting
<no Python frame>
Abandon (core dumped)
---
In my case, the bug occurs at Python exit, in code_dealloc(): "Py_XDECREF(co->co_consts);" destroys a tuple which contains True. It calls object_dealloc() which calls PyBool_Type.tp_dealloc(Py_True), but this object is allocated statically, and so its memory must not deallocated by the Python dynamic memory allocator. In debug mode, PyObject_Free() triggers a fatal error.
Concrete example of such bug in PySide with Python 3.10 which is now stricter on reference counting (thanks to the work made in bpo-1635741 and for Python subinterpreters):
https://bugreports.qt.io/browse/PYSIDE-1436
I propose to add a specific deallocator functions on bool to detect such bug, to ease debugging. There is already a similar deallocator for the None singleton.
|
msg400729 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-08-31 13:19 |
To reproduce the bug, apply attached os_uname_refcount_bug.patch and call os.uname():
---
$ ./python -c 'import os; os.uname()'
(...)
Fatal Python error: _PyMem_DebugRawFree: bad ID: Allocated using API '', verified using API 'o'
(...)
---
|
msg400730 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-08-31 13:34 |
There was such bug in CPython stdlib, in the _testcapi module, see commit 310e2d25170a88ef03f6fd31efcc899fe062da2c of bpo-36854:
commit 310e2d25170a88ef03f6fd31efcc899fe062da2c
Author: Victor Stinner <vstinner@python.org>
Date: Fri Nov 22 10:58:00 2019 +0100
bpo-36854: Fix refleak in subinterpreter (GH-17331)
finalize_interp_clear() now explicitly clears the codec registry and
then trigger a GC collection to clear all references.
diff --git a/Modules/_testcapimodule.c b/Modules/_testcapimodule.c
index baa6907b7e..0908f3457f 100644
--- a/Modules/_testcapimodule.c
+++ b/Modules/_testcapimodule.c
@@ -6721,11 +6721,14 @@ PyInit__testcapi(void)
PyModule_AddObject(m, "instancemethod", (PyObject *)&PyInstanceMethod_Type);
PyModule_AddIntConstant(m, "the_number_three", 3);
+ PyObject *v;
#ifdef WITH_PYMALLOC
- PyModule_AddObject(m, "WITH_PYMALLOC", Py_True);
+ v = Py_True;
#else
- PyModule_AddObject(m, "WITH_PYMALLOC", Py_False);
+ v = Py_False;
#endif
+ Py_INCREF(v);
+ PyModule_AddObject(m, "WITH_PYMALLOC", v);
TestError = PyErr_NewException("_testcapi.error", NULL, NULL);
Py_INCREF(TestError);
|
msg400746 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-08-31 15:34 |
New changeset 888d4cc06b887e77f281ba4d640e281cb4c61b7b by Victor Stinner in branch 'main':
bpo-45061: Enhance faulthandler traceback wit no Python frame (GH-28090)
https://github.com/python/cpython/commit/888d4cc06b887e77f281ba4d640e281cb4c61b7b
|
msg400748 - (view) |
Author: miss-islington (miss-islington) |
Date: 2021-08-31 15:53 |
New changeset c4c57e5c0eb79795d4fd1d9d8292455567c60070 by Miss Islington (bot) in branch '3.10':
bpo-45061: Enhance faulthandler traceback wit no Python frame (GH-28090)
https://github.com/python/cpython/commit/c4c57e5c0eb79795d4fd1d9d8292455567c60070
|
msg400753 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-08-31 16:05 |
New changeset 4300352000beed22fb525ec45fd331918d206528 by Victor Stinner in branch 'main':
bpo-45061: Detect Py_DECREF(Py_True) bug (GH-28089)
https://github.com/python/cpython/commit/4300352000beed22fb525ec45fd331918d206528
|
msg400828 - (view) |
Author: Christian Tismer (Christian.Tismer) * |
Date: 2021-09-01 12:29 |
What about an even more flexible solution?
A debug option could memorize always the last object deallocated
and give full info (the object's repr) before the crash would happen.
|
msg400829 - (view) |
Author: Christian Tismer (Christian.Tismer) * |
Date: 2021-09-01 12:32 |
Ah, that would be really much to store: every object with a refcount
going to zero would have to be memorized :/
|
msg400839 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-09-01 14:27 |
Christian Tismer: "What about an even more flexible solution? A debug option could memorize always the last object deallocated and give full info (the object's repr) before the crash would happen."
Bugs in object_dealloc() are rare. Apart None, True and False, do you know other objects which must not be deleted?
Calling repr(obj) can crash during late Python finalization for many reasons.
It is slow and can consume significant memory. Not all object types detect loops (a container storing indirectly itself) and so it can fail with a MemoryError: repr() is not "safe".
Using gdb, it is easy to dump the object currently being deleted: go the object_dealloc() frame and use "print self" command.
Example using attached os_uname_refcount_bug.patch:
------------
$ gdb ./python
(...)
(gdb) run
>>> import os; os.uname()
>>> exit()
Debug memory block at address p=0x886e60: API ''
0 bytes originally requested
The 7 pad bytes at p-7 are not all FORBIDDENBYTE (0xfd):
at p-7: 0x00 *** OUCH
at p-6: 0x00 *** OUCH
at p-5: 0x00 *** OUCH
at p-4: 0x00 *** OUCH
at p-3: 0x00 *** OUCH
at p-2: 0x00 *** OUCH
at p-1: 0x00 *** OUCH
Because memory is corrupted at the start, the count of bytes requested
may be bogus, and checking the trailing pad bytes may segfault.
The 8 pad bytes at tail=0x886e60 are not all FORBIDDENBYTE (0xfd):
at tail+0: 0x00 *** OUCH
at tail+1: 0x00 *** OUCH
at tail+2: 0x00 *** OUCH
at tail+3: 0x00 *** OUCH
at tail+4: 0x00 *** OUCH
at tail+5: 0x00 *** OUCH
at tail+6: 0x00 *** OUCH
at tail+7: 0x00 *** OUCH
Enable tracemalloc to get the memory block allocation traceback
Fatal Python error: _PyMem_DebugRawFree: bad ID: Allocated using API '', verified using API 'o'
Python runtime state: finalizing (tstate=0x00000000008e1a80)
Current thread 0x00007ffff7c24740 (most recent call first):
Garbage-collecting
<no Python frame>
Program received signal SIGABRT, Aborted.
0x00007ffff7c662a2 in raise () from /lib64/libc.so.6
(...)
(gdb) frame 9
#9 0x0000000000499dd7 in object_dealloc (self=True)
at Objects/typeobject.c:4497
4497 Py_TYPE(self)->tp_free(self);
(gdb) print self
$2 = True
------------
|
msg400904 - (view) |
Author: Christian Tismer (Christian.Tismer) * |
Date: 2021-09-02 08:52 |
> Apart None, True and False, do you know other objects which must not be deleted?
Yes, what I can think of is the immutable empty tuple singleton
which is such a candidate to be forgotten.
|
msg402365 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-09-21 21:04 |
New changeset 79a31480992c3fa5890fc7a6c5d9af6d337d5844 by Victor Stinner in branch 'main':
bpo-45061: Detect refcount bug on empty tuple singleton (GH-28503)
https://github.com/python/cpython/commit/79a31480992c3fa5890fc7a6c5d9af6d337d5844
|
msg402370 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-09-21 21:43 |
New changeset 86f28372b17c8c56539e9543bea9f125ab11b8aa by Victor Stinner in branch 'main':
bpo-45061: Detect refcount bug on empty string singleton (GH-28504)
https://github.com/python/cpython/commit/86f28372b17c8c56539e9543bea9f125ab11b8aa
|
msg402371 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-09-21 21:45 |
I don't want to add a deallocator to bytes and int types to detect when their singleton is destroyed, since it would slow down C extensions which have no refcount bug. Maybe it could be added only if Py_DEBUG macro is defined. But I don't think that Py_DECREF() bugs on these singletons are common enough to justify these. I don't want to make Python debug build slower.
IMO detecting refcount bugs on the most common singletons is enough: None, True, False, (), "". I implemented these checks because it was simple and had no major on performance, even on the debug build.
I now consider that the issue is fully fixed ;-)
|
msg402417 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-09-22 10:16 |
New changeset 8620be99da930230b18ec05f4d7446ee403531af by Victor Stinner in branch 'main':
bpo-45061: Revert unicode_is_singleton() change (GH-28516)
https://github.com/python/cpython/commit/8620be99da930230b18ec05f4d7446ee403531af
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:49 | admin | set | github: 89224 |
2021-09-22 10:16:57 | vstinner | set | messages:
+ msg402417 |
2021-09-22 09:23:49 | vstinner | set | pull_requests:
+ pull_request26908 |
2021-09-21 21:45:48 | vstinner | set | messages:
+ msg402371 |
2021-09-21 21:43:16 | vstinner | set | messages:
+ msg402370 |
2021-09-21 21:19:40 | vstinner | set | pull_requests:
+ pull_request26899 |
2021-09-21 21:04:41 | vstinner | set | messages:
+ msg402365 |
2021-09-21 20:38:24 | vstinner | set | pull_requests:
+ pull_request26898 |
2021-09-02 08:52:39 | Christian.Tismer | set | messages:
+ msg400904 |
2021-09-01 14:27:00 | vstinner | set | messages:
+ msg400839 |
2021-09-01 12:32:21 | Christian.Tismer | set | messages:
+ msg400829 |
2021-09-01 12:29:19 | Christian.Tismer | set | nosy:
+ Christian.Tismer messages:
+ msg400828
|
2021-08-31 16:05:53 | vstinner | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
2021-08-31 16:05:22 | vstinner | set | messages:
+ msg400753 |
2021-08-31 15:53:28 | miss-islington | set | messages:
+ msg400748 |
2021-08-31 15:34:56 | miss-islington | set | nosy:
+ miss-islington pull_requests:
+ pull_request26539
|
2021-08-31 15:34:39 | vstinner | set | messages:
+ msg400746 |
2021-08-31 13:38:14 | vstinner | set | pull_requests:
+ pull_request26533 |
2021-08-31 13:34:42 | vstinner | set | messages:
+ msg400730 |
2021-08-31 13:24:06 | vstinner | set | stage: patch review pull_requests:
+ pull_request26532 |
2021-08-31 13:19:06 | vstinner | set | files:
+ os_uname_refcount_bug.patch keywords:
+ patch messages:
+ msg400729
|
2021-08-31 13:17:39 | vstinner | create | |