classification
Title: ERROR: SystemError: ./../Objects/tupleobject.c:118: bad argument to internal function
Type: behavior Stage:
Components: Extension Modules Versions: Python 2.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, arigo, ghaering, jcea, pitrou, pxd
Priority: normal Keywords:

Created on 2012-06-19 21:52 by pxd, last changed 2012-06-28 17:40 by pitrou.

Messages (8)
msg163224 - (view) Author: Pankaj D (pxd) Date: 2012-06-19 21:52
Hi,

Sporadically, while running an sqlite3 query, the above error is seen.
In order to debug, I modified Objects/tupleobject.c, PyTuple_SetItem() as follows:

	if (!PyTuple_Check(op) || op->ob_refcnt != 1) {
		Py_XDECREF(newitem);
		/*
		 * Temp: Bug XYZ Generate core so that we can debug
		 *  
		 * PyErr_BadInternalCall();
		 * return -1;
		*/
		char errmsg[200];
		sprintf(errmsg, "Bug XYZ: PyTuple_Check(op) = %d "
			"op->ob_refcnt = %d; see core\n", PyTuple_Check(op),
			op->ob_refcnt);
		Py_FatalError(errmsg); 
	}

This generates a core with the following bt.  Showing the top few frames only:
(gdb) bt
#0  0x0000000800acd3fc in thr_kill () at thr_kill.S:3
#1  0x0000000800b5e283 in abort () at /build/mnt/src/lib/libc/stdlib/abort.c:65
#2  0x0000000000494acf in Py_FatalError (msg=Variable "msg" is not available.
) at ./../Python/pythonrun.c:1646
#3  0x000000000044e740 in PyTuple_SetItem (op=0x80c6e6308, i=16, newitem=0x80a80d780) at ./../Objects/tupleobject.c:128
#4  0x0000000807298866 in _pysqlite_fetch_one_row (self=0x80b846e48) at _sqlite/cursor.c:402
#5  0x0000000807298bf5 in pysqlite_cursor_iternext (self=0x80b846e48) at _sqlite/cursor.c:898
#6  0x0000000000476943 in PyEval_EvalFrameEx (f=0x80a94d420, throwflag=Variable "throwflag" is not available.
) at ./../Python/ceval.c:2237
#7  0x000000000047acbf in PyEval_EvalFrameEx (f=0x80a94b820, throwflag=Variable "throwflag" is not available.
) at ./../Python/ceval.c:3765
#8  0x000000000047bf09 in PyEval_EvalCodeEx (co=0x808d2ac60, globals=Variable "globals" is not available.
) at ./../Python/ceval.c:2942
...
(gdb) fr 4
#4  0x0000000807298866 in _pysqlite_fetch_one_row (self=0x80b846e48) at _sqlite/cursor.c:402
402	            PyTuple_SetItem(row, i, converted);
Current language:  auto; currently c
(gdb) l
397	                converted = buffer;
398	            }

(gdb) p *(PyTupleObject *)row
$11 = {ob_refcnt = 2, ob_type = 0x60fee0, ob_size = 22, ob_item = {0x80a534030}}

'row' was allocated via PyTuple_New()
but, somehow its  refcount has become 2 while setting the 16th item!!!

Is this a known issue?  If not, what can I do to debug.

Thanks,
Pankaj
msg163225 - (view) Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2012-06-19 21:57
Could you possibly reproduce this in 2.7, 3.2 and/or "default" (future 3.3)?.

Python 2.6 support is over.
msg163226 - (view) Author: Pankaj D (pxd) Date: 2012-06-19 22:11
sorry, 2.7, 3.2 is not an option currently but I am hoping someone can
provide enough info to help probe this more efficiently.  There seem to be references to this issue on the web but no root-cause.
msg164257 - (view) Author: Pankaj D (pxd) Date: 2012-06-28 14:20
I believe I have found the root-cause for this issue.   

It  is occurring due to the use of the garbage collector in another “memMonitor” thread (we run it periodically to get stats on objects, track mem leaks, etc).  Since _pysqlite_fetch_one_row() releases the GIL before calling PyTuple_SetItem(),  if the  memMonitor is scheduled to run and, say, calls gc.get_objects(), it increments the refcount on all tracked objects (via append_objects()->PyList_Append()->app1()->PY_INCREF()).    I have stack traces to confirm.   This seems to rule out the use of gc methods (such as get_objects(), get_referrers/referents()) in multi-threaded programs or have them handle SystemError arising from such usage.  Agree?
msg164263 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-06-28 15:13
Thanks for the analysis!
This is quite similar to issue793822: gc.get_referrers() can access unfinished tuples.  The difference here is that what breaks is not the monitoring tool, but the "main" program!

Here is a simple script inspired from the original bug; PySequence_Tuple() uses PyTuple_SET_ITEM() which is a macro without the ob_refcnt check, but we can make it call _PyTuple_Resize() and fail there.  All versions of CPython are affected:


import gc
TAG = object()

def monitor():
    lst = [x for x in gc.get_referrers(TAG)
           if isinstance(x, tuple)]
    t = lst[0]   # this *is* the result tuple
    print(t)     # full of nulls !
    return t     # Keep it alive for some time

def my_iter():
    yield TAG    # 'tag' gets stored in the result tuple
    t = monitor()
    for x in range(10):
        yield x  # SystemError when the tuple needs to be resized

tuple(my_iter())
msg164273 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-06-28 16:43
I wonder why _pysqlite_fetch_one_row() releases the GIL around sqlite3_column_type(). By its name, it doesn't sound like an expensive function.

Another workaround would be to call PyTuple_SET_ITEM instead of PyTuple_SetItem.
msg164282 - (view) Author: Pankaj D (pxd) Date: 2012-06-28 17:36
Wondering the same thing myself, and yes sqlite3_column_type() by itself doesn't seem expensive.   I assumed in general it was to allow more responsiveness for apps with huge number of columns (i.e. large tuple size).  But we have about 20-25 columns and so I was going to try removing it and seeing the
results. In any case, it seems, fewer GIL acquire/releases  will help with throughput.  
Are there any guidelines on when GIL should be released?

Re PyTuple_SET_ITEM...yes that's also a possibility but it would then hide genuine bugs.
msg164285 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-06-28 17:40
> Are there any guidelines on when GIL should be released?

The GIL should be released:
- for CPU-heavy external functions (e.g. compression, cryptography)
- for external functions which wait for I/O

> Re PyTuple_SET_ITEM...yes that's also a possibility but it would then
> hide genuine bugs.

Well, as long as your monitor only increfs the tuple and doesn't mutate
it, there shouldn't be any problem. We use PyTuple_SET_ITEM in many
other places.
History
Date User Action Args
2012-06-28 17:40:06pitrousetmessages: + msg164285
2012-06-28 17:36:09pxdsetmessages: + msg164282
versions: + Python 2.6, - Python 2.7, Python 3.2, Python 3.3
2012-06-28 16:43:49pitrousetversions: + Python 2.7, Python 3.2, Python 3.3, - Python 2.6
nosy: + pitrou, ghaering

messages: + msg164273

components: + Extension Modules
2012-06-28 15:13:30amaury.forgeotdarcsetnosy: + amaury.forgeotdarc, arigo
messages: + msg164263
2012-06-28 14:20:05pxdsetmessages: + msg164257
2012-06-19 22:11:02pxdsetmessages: + msg163226
2012-06-19 21:57:55jceasetcomponents: - Cross-Build
2012-06-19 21:57:24jceasetnosy: + jcea
messages: + msg163225
components: + Cross-Build
2012-06-19 21:52:13pxdcreate