This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: BadInternalCall running test_multiprocessing
Type: crash Stage:
Components: Interpreter Core Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: jnoller Nosy List: amaury.forgeotdarc, asksol, jnoller, rthalley, vstinner
Priority: normal Keywords:

Created on 2010-10-20 20:54 by rthalley, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (4)
msg119233 - (view) Author: Bob Halley (rthalley) Date: 2010-10-20 20:54
For reasons not germane to this bug report, I was running a modified Python 2.7 where PyTrash_UNWIND_LEVEL in Include/object.h had been defined to 10 instead of 50.

With such a python, running test_multiprocessing causes a BadInternalCall to be triggered at line 903 of weakrefobject.c.

test_finalize (__main__.WithProcessesTestFinalize) ... Process Process-18:
Traceback (most recent call last):
  File "/Users/halley/src/release27-maint/Lib/multiprocessing/process.py", line 229, in _bootstrap
    util._run_after_forkers()
  File "/Users/halley/src/release27-maint/Lib/multiprocessing/util.py", line 125, in _run_after_forkers
    items = list(_afterfork_registry.items())
  File "/Users/halley/src/release27-maint/Lib/weakref.py", line 116, in items
    for key, wr in self.data.items():
SystemError: Objects/weakrefobject.c:903: bad argument to internal function

I'm running the head of the release27-maint branch, but the problem occurs with the released 2.7 too.

The trigger for the error is that object->ob_refcount is 1.

I stuck an abort() into the code just before the BadInternalCall, and looked at the core.  The ob_type is a Semaphore, and the last bit of the call chain is:

(gdb) bt
#0  0x00007fff8791c3d6 in __kill ()
#1  0x00007fff879bc972 in abort ()
#2  0x000000010007eb05 in PyObject_ClearWeakRefs (object=<value temporarily unavailable, due to optimizations>) at Objects/weakrefobject.c:903
#3  0x000000010006c949 in subtype_dealloc (self=0x10070a150) at Objects/typeobject.c:952
#4  0x000000010005341b in _PyTrash_destroy_chain () at Objects/object.c:2435
#5  0x000000010003bfaf in listiter_next (it=0x1006db8d0) at Objects/listobject.c:2917
#6  0x00000001000b9f16 in PyEval_EvalFrameEx (f=0x1013b91a0, throwflag=<value temporarily unavailable, due to optimizations>) at Python/ceval.c:2496
#7  0x00000001000c0966 in PyEval_EvalFrameEx (f=0x1013bcc40, throwflag=<value temporarily unavailable, due to optimizations>) at Python/ceval.c:4098
#8  0x00000001000c0966 in PyEval_EvalFrameEx (f=0x10137a500, throwflag=<value temporarily unavailable, due to optimizations>) at Python/ceval.c:4098
#9  0x00000001000c0966 in PyEval_EvalFrameEx (f=0x101374aa0, throwflag=<value temporarily unavailable, due to optimizations>) at Python/ceval.c:4098
#10 0x00000001000c1826 in PyEval_EvalCodeEx (co=0x10065d1b0, globals=<value temporarily unavailable, due to optimizations>, locals=<value temporarily unavailable, due to optimizations>, args=0x100708458, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3252
#11 0x0000000100037580 in function_call (func=0x100678488, arg=0x100708440, kw=0x0) at Objects/funcobject.c:526
#12 0x00000001000063e2 in PyObject_Call (func=0x100678488, arg=0x100708440, kw=0x0) at Objects/abstract.c:2529
#13 0x00000001000177bd in instancemethod_call (func=0x100678488, arg=0x100708440, kw=0x0) at Objects/classobject.c:2578
#14 0x00000001000063e2 in PyObject_Call (func=0x100704780, arg=0x1006f17d0, kw=0x0) at Objects/abstract.c:2529
#15 0x0000000100071a38 in slot_tp_init (self=<value temporarily unavailable, due to optimizations>, args=0x1006f17d0, kwds=0x0) at Objects/typeobject.c:5651
#16 0x000000010006ee75 in type_call (type=0x10130bd10, args=0x1006f17d0, kwds=0x0) at Objects/typeobject.c:726
#17 0x00000001000063e2 in PyObject_Call (func=0x10130bd10, arg=0x1006f17d0, kw=0x0) at Objects/abstract.c:2529

Note that the trashcan function _PyTrash_destroy_chain() is involved.

I don't understand this code well enough to have figured out what's going wrong yet.  I worry that there's some subtle interaction with weakrefs, subtyped objects, and the trashcan, but maybe it's something simpler!

Regards,

/Bob
msg119246 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-10-20 23:51
Does it happen when compiled in debug mode? There may be asserts that will give a better (or earlier) error message.

Some thoughts: the '_PyTrash_delete_later' chain can only contain a limited set of objects: lists, frames... (which code contain the Py_TRASHCAN_SAFE_BEGIN macro)... and the stack dumps suggests (subtype_dealloc) that it is also an instance of a user-defined class.  The only candidate I see is the class 'multiprocessing.managers.ProcessLocalSet', which inherits from 'set'.
msg119272 - (view) Author: Bob Halley (rthalley) Date: 2010-10-21 10:19
I rebuilt using --with-pydebug and -O0 -g.  It now triggers an assertion a bit earlier.

test_finalize (__main__.WithProcessesTestFinalize) ... Assertion failed: (op->ob_refcnt == 0), function _PyTrash_destroy_chain, file Objects/object.c, line 2433

(gdb) bt
#0  0x00007fff8791c3d6 in __kill ()
#1  0x00007fff879bc972 in abort ()
#2  0x00007fff879a99b4 in __assert_rtn ()
#3  0x00000001000796e0 in _PyTrash_destroy_chain () at Objects/object.c:2433
#4  0x0000000100053efe in list_dealloc (op=0x10114cc18) at Objects/listobject.c:317
#5  0x000000010007905b in _Py_Dealloc (op=0x10114cc18) at Objects/object.c:2228
#6  0x000000010005aee7 in listiter_next (it=0x10160d610) at Objects/listobject.c:2917
#7  0x00000001001158c8 in PyEval_EvalFrameEx (f=0x1014c72c0, throwflag=0) at Python/ceval.c:2496
[ ... snip ... ]

(gdb) f 3
#3  0x00000001000796e0 in _PyTrash_destroy_chain () at Objects/object.c:2433
2433	        assert(op->ob_refcnt == 0);
(gdb) list
2428	         * fool Py_DECREF into calling it indirectly, but
2429	         * Py_DECREF was already called on this object, and in
2430	         * assorted non-release builds calling Py_DECREF again ends
2431	         * up distorting allocation statistics.
2432	         */
2433	        assert(op->ob_refcnt == 0);
2434	        ++_PyTrash_delete_nesting;
2435	        (*dealloc)(op);
2436	        --_PyTrash_delete_nesting;
2437	    }

(gdb) p *op
$2 = {
  _ob_next = 0x0, 
  _ob_prev = 0x0, 
  ob_refcnt = 1, 
  ob_type = 0x101441d70
}

(gdb) p *op->ob_type
$4 = {
  _ob_next = 0x10076d798, 
  _ob_prev = 0x101442930, 
  ob_refcnt = 9, 
  ob_type = 0x1001f84e0, 
  ob_size = 0, 
  tp_name = 0x1013e88b4 "Semaphore", 
  tp_basicsize = 48, 
  tp_itemsize = 0, 
  tp_dealloc = 0x10009dd9d <subtype_dealloc>, 
  tp_print = 0, 
  tp_getattr = 0, 
  tp_setattr = 0, 
  tp_compare = 0, 
  tp_repr = 0x1000af4a0 <slot_tp_repr>, 
  tp_as_number = 0x101441f08, 
  tp_as_sequence = 0x101442058, 
  tp_as_mapping = 0x101442040, 
  tp_hash = 0x1000761bc <_Py_HashPointer>, 
  tp_call = 0, 
  tp_str = 0x1000a44e4 <object_str>, 
  tp_getattro = 0x10007728a <PyObject_GenericGetAttr>, 
  tp_setattro = 0x100077676 <PyObject_GenericSetAttr>, 
  tp_as_buffer = 0x1014420a8, 
  tp_flags = 940027, 
  tp_doc = 0x0, 
  tp_traverse = 0x10009da40 <subtype_traverse>, 
  tp_clear = 0x10009dce5 <subtype_clear>, 
  tp_richcompare = 0, 
  tp_weaklistoffset = 40, 
  tp_iter = 0, 
  tp_iternext = 0x100076c13 <_PyObject_NextNotImplemented>, 
  tp_methods = 0x0, 
  tp_members = 0x1014420e8, 
  tp_getset = 0x0, 
  tp_base = 0x1014499f0, 
  tp_dict = 0x101442930, 
  tp_descr_get = 0, 
  tp_descr_set = 0, 
  tp_dictoffset = 32, 
  tp_init = 0x1000b06dd <slot_tp_init>, 
  tp_alloc = 0x10009d7a7 <PyType_GenericAlloc>, 
  tp_new = 0x1000a3f03 <object_new>, 
  tp_free = 0x10016bda4 <PyObject_GC_Del>, 
  tp_is_gc = 0, 
  tp_bases = 0x1010bc840, 
  tp_mro = 0x100744060, 
  tp_cache = 0x0, 
  tp_subclasses = 0x10074d240, 
  tp_weaklist = 0x10075f528, 
  tp_del = 0, 
  tp_version_tag = 194
}
msg203255 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-11-18 08:30
Even with PyTrash_UNWIND_LEVEL in Include/object.h defined to 10 instead of 50, I cannot reproduce the issue on the 2.7 branch of Mercurial. I guess that the bug was already fixed in Python 2.7 and this issue has no activity since 3 years, so I'm closing it.
History
Date User Action Args
2022-04-11 14:57:07adminsetgithub: 54367
2013-11-18 08:30:50vstinnersetstatus: open -> closed

nosy: + vstinner
messages: + msg203255

resolution: fixed
2010-10-21 10:19:56rthalleysetmessages: + msg119272
2010-10-21 02:33:58r.david.murraysetnosy: + asksol
2010-10-20 23:51:09amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg119246
2010-10-20 21:35:25benjamin.petersonsetassignee: jnoller

nosy: + jnoller
2010-10-20 20:54:20rthalleycreate