Issue36880
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2019-05-10 22:39 by dgelessus, last changed 2022-04-11 14:59 by admin.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
python3.8_2019-05-11-000251.crash | dgelessus, 2019-05-10 22:38 | OS X crash report (Python 3.8) | ||
test_gc_ctypes.py | anthonypjshaw, 2019-05-11 06:10 |
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 13364 | open | dgelessus, 2019-05-16 17:07 |
Messages (10) | |||
---|---|---|---|
msg342141 - (view) | Author: (dgelessus) * | Date: 2019-05-10 22:38 | |
This occurs when writing a ctypes callback in Python whose restype is ctypes.py_object. If the callback returns None, the refcount of None is decremented once too often. This happens every time the callback is called, and if done often enough, Python attempts to deallocate None and crashes. To reproduce: $ bin/python3 Python 3.8.0a4+ (heads/master:09532feeec, May 10 2019, 23:53:49) [Clang 8.0.0 (clang-800.0.42.1)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> import ctypes >>> FUNTYPE = ctypes.CFUNCTYPE(ctypes.py_object) >>> @FUNTYPE ... def fun(): ... return None ... >>> print(fun()) None >>> sys.getrefcount(None) 4329 >>> print(fun()) None >>> sys.getrefcount(None) 4327 >>> print(fun()) None >>> sys.getrefcount(None) 4326 >>> while True: ... fun() ... Fatal Python error: deallocating None Current thread 0x00007fff7bf80000 (most recent call first): File "<stdin>", line 2 in <module> Abort trap: 6 # exits with code 134 (SIGABRT) I've attached the crash report generated by OS X. (It's a plain text file, despite the extension.) Interestingly, this only happens with None. Other returned objects are refcounted correctly: >>> thing = object() >>> @FUNTYPE ... def otherfun(): ... return thing ... >>> print(otherfun()) <object object at 0x10d920220> >>> sys.getrefcount(thing) 2 >>> print(otherfun()) <object object at 0x10d920220> >>> sys.getrefcount(thing) 2 >>> print(otherfun()) <object object at 0x10d920220> >>> sys.getrefcount(thing) 2 I could reproduce this on the following configurations: * Python 3.8.0a4 (self-compiled from master branch: 09532feeece39d5ba68a0d47115ce1967bfbd58e) on OS X 10.11 (x86_64) * Python 3.7.3 (python.org installer) on OS X 10.11 (x86_64) * Python 3.6.8 (from package manager) on Kubuntu 18.10 (x86_64) * Python 3.5.3 (from package manager) on Raspbian Stretch (armv6) |
|||
msg342163 - (view) | Author: anthony shaw (anthonypjshaw) * | Date: 2019-05-11 05:58 | |
Thanks, I'll check this out |
|||
msg342164 - (view) | Author: anthony shaw (anthonypjshaw) * | Date: 2019-05-11 06:03 | |
The documentation says: >> Note Make sure you keep references to CFUNCTYPE() objects as long as they are used from C code. ctypes doesn’t, and if you don’t, they may be garbage collected, crashing your program when a callback is made. Also, note that if the callback function is called in a thread created outside of Python’s control (e.g. by the foreign code that calls the callback), ctypes creates a new dummy Python thread on every invocation. This behavior is correct for most purposes, but it means that values stored with threading.local will not survive across different callbacks, even when those calls are made from the same C thread. But that doesn't describe the situation you've shared. I'll continue to look into the ctypes module |
|||
msg342165 - (view) | Author: anthony shaw (anthonypjshaw) * | Date: 2019-05-11 06:10 | |
From lldb (lldb) run ~/cpython/test_gc_ctypes.py Process 20059 launched: '/Users/anthonyshaw/CLionProjects/cpython/python.exe' (x86_64) Fatal Python error: deallocating None Current thread 0x00000001005c85c0 (most recent call first): File "/Users/anthonyshaw/cpython/test_gc_ctypes.py", line 7 in <module> Objects/typeobject.c:3187: _Py_NegativeRefcount: Assertion failed: object has negative ref count Enable tracemalloc to get the memory block allocation traceback object : <refcnt -1 at 0x10038a2c0> type : NoneType refcount: -1 address : 0x10038a2c0 Process 20059 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT frame #0: 0x00007fff5984f2c6 libsystem_kernel.dylib`__pthread_kill + 10 libsystem_kernel.dylib`__pthread_kill: -> 0x7fff5984f2c6 <+10>: jae 0x7fff5984f2d0 ; <+20> 0x7fff5984f2c8 <+12>: movq %rax, %rdi 0x7fff5984f2cb <+15>: jmp 0x7fff59849457 ; cerror_nocancel 0x7fff5984f2d0 <+20>: retq Target 0: (python.exe) stopped. |
|||
msg342166 - (view) | Author: anthony shaw (anthonypjshaw) * | Date: 2019-05-11 06:27 | |
Full trace for reference: (lldb) bt all * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT * frame #0: 0x00007fff5984f2c6 libsystem_kernel.dylib`__pthread_kill + 10 frame #1: 0x00007fff59904bf1 libsystem_pthread.dylib`pthread_kill + 284 frame #2: 0x00007fff597b96a6 libsystem_c.dylib`abort + 127 frame #3: 0x0000000100236c14 python.exe`fatal_error(prefix=0x0000000000000000, msg="_PyObject_AssertFailed", status=-1) at pylifecycle.c:2128:9 frame #4: 0x0000000100236a5e python.exe`Py_FatalError(msg="_PyObject_AssertFailed") at pylifecycle.c:2138:5 frame #5: 0x00000001000b49f4 python.exe`_PyObject_AssertFailed(obj=0x000000010038a2c0, expr=0x0000000000000000, msg="object has negative ref count", file="Objects/typeobject.c", line=3187, function="_Py_NegativeRefcount") at object.c:2163:5 frame #6: 0x00000001000b4a44 python.exe`_Py_NegativeRefcount(filename="Objects/typeobject.c", lineno=3187, op=0x000000010038a2c0) at object.c:208:5 frame #7: 0x00000001000d2bad python.exe`_Py_DECREF(filename="Objects/typeobject.c", lineno=3187, op=0x000000010038a2c0) at object.h:468:13 frame #8: 0x00000001000d5889 python.exe`_PyType_Lookup(type=0x00000001003e4ad0, name=0x000000010064bba0) at typeobject.c:3187:9 frame #9: 0x00000001000b731f python.exe`_PyObject_GenericGetAttrWithDict(obj=0x0000000100645dd0, name=0x000000010064bba0, dict=0x0000000000000000, suppress=0) at object.c:1221:13 frame #10: 0x00000001000b7243 python.exe`PyObject_GenericGetAttr(obj=0x0000000100645dd0, name=0x000000010064bba0) at object.c:1309:12 frame #11: 0x00000001000b69a8 python.exe`PyObject_GetAttr(v=0x0000000100645dd0, name=0x000000010064bba0) at object.c:916:16 frame #12: 0x0000000100301599 python.exe`textiowrapper_closed_get(self=0x00000001006aba50, context=0x0000000000000000) at textio.c:3015:12 frame #13: 0x000000010004eedf python.exe`getset_get(descr=0x000000010122f210, obj=0x00000001006aba50, type=0x00000001003e7280) at descrobject.c:158:16 frame #14: 0x00000001000b738c python.exe`_PyObject_GenericGetAttrWithDict(obj=0x00000001006aba50, name=0x00000001016f00a8, dict=0x0000000000000000, suppress=0) at object.c:1228:19 frame #15: 0x00000001000b7243 python.exe`PyObject_GenericGetAttr(obj=0x00000001006aba50, name=0x00000001016f00a8) at object.c:1309:12 frame #16: 0x00000001000b69a8 python.exe`PyObject_GetAttr(v=0x00000001006aba50, name=0x00000001016f00a8) at object.c:916:16 frame #17: 0x00000001000b68e4 python.exe`PyObject_GetAttrString(v=0x00000001006aba50, name="closed") at object.c:821:11 frame #18: 0x00000001002399cc python.exe`file_is_closed(fobj=0x00000001006aba50) at pylifecycle.c:1074:21 frame #19: 0x0000000100235f64 python.exe`flush_std_files at pylifecycle.c:1094:45 frame #20: 0x0000000100236bfd python.exe`fatal_error(prefix=0x0000000000000000, msg="deallocating None", status=-1) at pylifecycle.c:2116:9 frame #21: 0x0000000100236a5e python.exe`Py_FatalError(msg="deallocating None") at pylifecycle.c:2138:5 frame #22: 0x00000001000b8318 python.exe`none_dealloc(ignore=0x000000010038a2c0) at object.c:1551:5 frame #23: 0x00000001000ba6ba python.exe`_Py_Dealloc(op=0x000000010038a2c0) at object.c:2178:5 frame #24: 0x00000001001dec2b python.exe`_Py_DECREF(filename="Python/ceval.c", lineno=1200, op=0x000000010038a2c0) at object.h:473:9 frame #25: 0x00000001001cd444 python.exe`_PyEval_EvalFrameDefault(f=0x000000010130b860, throwflag=0) at ceval.c:1200:13 frame #26: 0x00000001001cc277 python.exe`PyEval_EvalFrameEx(f=0x000000010130b860, throwflag=0) at ceval.c:625:12 frame #27: 0x00000001001e3347 python.exe`_PyEval_EvalCodeWithName(_co=0x000000010146b880, globals=0x0000000101246e50, locals=0x0000000101246e50, args=0x0000000000000000, argcount=0, kwnames=0x0000000000000000, kwargs=0x0000000000000000, kwcount=0, kwstep=2, defs=0x0000000000000000, defcount=0, kwdefs=0x0000000000000000, closure=0x0000000000000000, name=0x0000000000000000, qualname=0x0000000000000000) at ceval.c:4036:14 frame #28: 0x00000001001cc1f0 python.exe`PyEval_EvalCodeEx(_co=0x000000010146b880, globals=0x0000000101246e50, locals=0x0000000101246e50, args=0x0000000000000000, argcount=0, kws=0x0000000000000000, kwcount=0, defs=0x0000000000000000, defcount=0, kwdefs=0x0000000000000000, closure=0x0000000000000000) at ceval.c:4065:12 frame #29: 0x00000001001cc04e python.exe`PyEval_EvalCode(co=0x000000010146b880, globals=0x0000000101246e50, locals=0x0000000101246e50) at ceval.c:602:12 frame #30: 0x0000000100249f55 python.exe`run_eval_code_obj(co=0x000000010146b880, globals=0x0000000101246e50, locals=0x0000000101246e50) at pythonrun.c:1047:9 frame #31: 0x00000001002483f7 python.exe`run_mod(mod=0x0000000102017438, filename=0x0000000101495d30, globals=0x0000000101246e50, locals=0x0000000101246e50, flags=0x00007ffeefbff690, arena=0x00000001012e2520) at pythonrun.c:1063:9 frame #32: 0x00000001002478b5 python.exe`PyRun_FileExFlags(fp=0x00007fff8c715030, filename_str="/Users/anthonyshaw/cpython/test_gc_ctypes.py", start=257, globals=0x0000000101246e50, locals=0x0000000101246e50, closeit=1, flags=0x00007ffeefbff690) at pythonrun.c:994:11 frame #33: 0x0000000100246c91 python.exe`PyRun_SimpleFileExFlags(fp=0x00007fff8c715030, filename="/Users/anthonyshaw/cpython/test_gc_ctypes.py", closeit=1, flags=0x00007ffeefbff690) at pythonrun.c:429:13 frame #34: 0x00000001002467bc python.exe`PyRun_AnyFileExFlags(fp=0x00007fff8c715030, filename="/Users/anthonyshaw/cpython/test_gc_ctypes.py", closeit=1, flags=0x00007ffeefbff690) at pythonrun.c:85:16 frame #35: 0x0000000100277095 python.exe`pymain_run_file(config=0x0000000101800ca8, cf=0x00007ffeefbff690) at main.c:346:15 frame #36: 0x000000010027641d python.exe`pymain_run_python(exitcode=0x00007ffeefbff75c) at main.c:511:21 frame #37: 0x00000001002760cc python.exe`_Py_RunMain at main.c:583:24 frame #38: 0x0000000100276612 python.exe`pymain_main(args=0x00007ffeefbff7c8) at main.c:612:12 frame #39: 0x0000000100276655 python.exe`_Py_UnixMain(argc=2, argv=0x00007ffeefbff830) at main.c:636:12 frame #40: 0x0000000100000d82 python.exe`main(argc=2, argv=0x00007ffeefbff830) at python.c:16:12 frame #41: 0x00007fff597143d5 libdyld.dylib`start + 1 |
|||
msg342174 - (view) | Author: anthony shaw (anthonypjshaw) * | Date: 2019-05-11 07:41 | |
In the stack it's calling none_dealloc() which should never happen. Assume this is being triggered by ctypes PyCFuncPtr_call. The stacktrace I'm getting comes after the double decref so it's not showing the root cause. Someone who knows ctypes better might be able to help |
|||
msg342186 - (view) | Author: Eryk Sun (eryksun) * | Date: 2019-05-11 13:31 | |
This is due to an oversight in _CallPythonObject in Modules/_ctypes/callbacks.c. The setfunc protocol is to return None if there's no object to keep alive. This isn't applicable to py_object (i.e. O_set in Modules/_ctypes/cfield.c). So the problem is the unconditional decref of None in the following snippet from _CallPythonObject: if (keep == NULL) /* Could not convert callback result. */ PyErr_WriteUnraisable(callable); else if (keep == Py_None) /* Nothing to keep */ Py_DECREF(keep); else if (setfunc != _ctypes_get_fielddesc("O")->setfunc) { if (-1 == PyErr_WarnEx(PyExc_RuntimeWarning, "memory leak in callback function.", 1)) PyErr_WriteUnraisable(callable); } I'd rewrite it as follows: if (keep == NULL) { /* Could not convert callback result. */ PyErr_WriteUnraisable(callable); } else if (setfunc != _ctypes_get_fielddesc("O")->setfunc) { if (keep == Py_None) { /* Nothing to keep */ Py_DECREF(keep); } else if (PyErr_WarnEx(PyExc_RuntimeWarning, "memory leak " "in callback function.", 1) == -1) { PyErr_WriteUnraisable(callable); } } |
|||
msg342221 - (view) | Author: anthony shaw (anthonypjshaw) * | Date: 2019-05-11 21:02 | |
Thanks Eryk that saved a lot of debugging. dgelessus - if you want to write a patch for CPython am happy to take you through this and get it over the line. Else: am Happy to write a test against the gc counter and a patch for this |
|||
msg342304 - (view) | Author: (dgelessus) * | Date: 2019-05-13 10:17 | |
Thank you for looking into this! I can confirm that Eryk Sun's change fixes the issue for me locally. I'm up for making the patch for this. Regarding tests, I see there are already some refcount-related ctypes tests in Lib/ctypes/test/test_refcounts.py - should I add another test case there that reproduces this situation and ensures that None's refcount is unaffected? (I imagine testing against None's refcount will be a bit fragile, so it might be best to call the previously-buggy function in a loop and check afterwards that None's refcount hasn't changed significantly.) |
|||
msg342387 - (view) | Author: anthony shaw (anthonypjshaw) * | Date: 2019-05-13 21:23 | |
If you can write a test similar to the AnotherLeak.test_callback test case, and commit that first. It will show in the CI/CD log as failed and verify the issue (incase it comes up in PR review) Then add another commit with the patch itself and we should see the CI/CD pass. Please follow the PR guide if this is your first PR to CPython https://devguide.python.org/pullrequest/ |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:15 | admin | set | github: 81061 |
2021-03-13 06:00:14 | eryksun | set | versions: + Python 3.9, Python 3.10, - Python 3.5, Python 3.6, Python 3.7 |
2019-05-16 17:07:58 | dgelessus | set | keywords:
+ patch stage: patch review pull_requests: + pull_request13273 |
2019-05-13 21:23:43 | anthonypjshaw | set | messages: + msg342387 |
2019-05-13 10:17:21 | dgelessus | set | messages: + msg342304 |
2019-05-11 21:02:28 | anthonypjshaw | set | messages: + msg342221 |
2019-05-11 15:32:22 | freakboy3742 | set | nosy:
+ freakboy3742 |
2019-05-11 13:31:50 | eryksun | set | keywords:
+ easy (C) nosy: + eryksun messages: + msg342186 |
2019-05-11 07:41:34 | anthonypjshaw | set | messages: + msg342174 |
2019-05-11 06:27:01 | anthonypjshaw | set | messages: + msg342166 |
2019-05-11 06:10:16 | anthonypjshaw | set | files:
+ test_gc_ctypes.py messages: + msg342165 |
2019-05-11 06:03:48 | anthonypjshaw | set | messages: + msg342164 |
2019-05-11 05:58:39 | anthonypjshaw | set | assignee: anthonypjshaw messages: + msg342163 nosy: + anthonypjshaw |
2019-05-10 22:39:01 | dgelessus | create |