classification
Title: SEGFAULT in visit_decref
Type: crash Stage:
Components: Versions: Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Kai.Sterker, amaury.forgeotdarc, jcea
Priority: normal Keywords:

Created on 2012-07-01 09:46 by Kai.Sterker, last changed 2012-07-01 21:26 by amaury.forgeotdarc. This issue is now closed.

Files
File name Uploaded Description
stacktrace.txt Kai.Sterker, 2012-07-01 09:46 Stacktrace
Messages (4)
msg164468 - Author: Kai Sterker (Kai.Sterker) Date: 2012-07-01 09:46
Since updating to Python 2.7.3 (as distributed by Ubuntu 12.04 64bit), I experience occasional crashes in the application I am developing (which uses Python scripting). The crash either happens at the first key press or it does not happen at all. Smells like a race condition to me.

I installed the debug version of Python 2.7.3 and compiled my project against that, which gave the attached stack trace. The crash also appears to be easier to reproduce with the debug version, but it still does not occur every time.

The application that exhibits the crash can be found here:
https://github.com/ksterker/adonthell

The Python method executed when the crash happens is this one:

    def estimate_speed (self, terrain):
        try:
            return self.Dic[terrain]
        except: return 0


I don't think it will be possible to construct a minimal example to demonstrate the issue, but if any other information would help shed more light on it, I'm happy to provide it.

Regards,

Kai
msg164485 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-07-01 16:03
This program embeds a Python interpreter and uses the C API extensively.
I tried to compile it, but could not make it use Python 2.7.

Your stack trace suggests a buffer overflow, or reuse of a freed object:
"ob_refcnt = 8462385097079783424, ob_type = 0x72746e6f633a3a74" contains the ASCII of "input::contr".  Probably an "input::control_event*" which is the raised event.
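
The decoding can be checked with a short Python sketch. It assumes a little-endian 64-bit layout in which ob_refcnt sits immediately before ob_type; the function name decode_object_header is only illustrative.

```python
import struct

def decode_object_header(ob_refcnt, ob_type):
    """Reinterpret two adjacent 64-bit header fields as the bytes they occupy."""
    # Assumption: little-endian machine, ob_refcnt stored directly before ob_type.
    raw = struct.pack("<QQ", ob_refcnt, ob_type)
    # The low four bytes of ob_refcnt are zero; drop them before decoding.
    return raw.lstrip(b"\x00").decode("ascii")

print(decode_object_header(8462385097079783424, 0x72746e6f633a3a74))
# prints: input::contr
```

Both header fields hold printable text rather than a plausible reference count and type pointer, which is what points to the object's memory having been overwritten or reused.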

I suspect that the memory corruption has always occurred, but with 2.7.3 a garbage collection happens in the middle of an event callback.  Could you add some "gc.collect()" here and there, and see if other versions of Python crash as well?
msg164494 - Author: Kai Sterker (Kai.Sterker) Date: 2012-07-01 20:18
To compile against a Python version that is not the system default, configure with

  PYTHON=/usr/bin/python2.7 ../adonthell/configure --with-py-cflags=-I/usr/include/python2.7 --with-py-libs=-lpython2.7


Regardless of that, your hints are proving useful. I compiled with 2.6.8 and was not able to reproduce the issue. However, with a gc.collect() call added to estimate_speed, it will again happen sometimes.
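
For reference, the instrumented method looks roughly like the sketch below. Only estimate_speed and self.Dic come from the report; the enclosing class SpeedTable and its constructor are hypothetical stand-ins so the snippet runs on its own.

```python
import gc

class SpeedTable(object):  # hypothetical stand-in for the real script class
    def __init__(self, dic):
        self.Dic = dic

    def estimate_speed(self, terrain):
        # Force a full collection right here, so the collector traverses
        # every tracked container during the callback instead of at some
        # unpredictable later moment.
        gc.collect()
        try:
            return self.Dic[terrain]
        except:  # bare except kept as in the reported method
            return 0
```

In pure Python this call is harmless. It only makes the crash more likely when a tracked container already holds a dangling pointer created on the C side.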

So it does not seem to be specific to Python 2.7.3 (and is probably not even a problem with Python at all). Could this be triggered by a missing Py_INCREF somewhere? Or would you suspect something totally unrelated? (But then I would expect other random crashes, too.)
msg164500 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2012-07-01 21:26
Yes, some INCREF may be missing.  The issue may be with the callback mechanism; these are usually difficult to get right.

Actually by pure luck I found suspect code that may be the cause of this crash:
in src/event/listener_python.cc, the "Args" tuple is first allocated, but item #1 is not set.  It's a bit wrong (try to print it!) but if it does not leak outside, it probably won't crash here; the gc traverse function luckily skips NULL pointers.
BUT in raise_event(), this Args[1] is set to an event object, which is DECREF'd afterwards.  The pointer now points to invalid memory, and next gc.collect() will crash...

I also found other issues with reference counting here and there (e.g. in src/python/python.cc, PyTuple_SET_ITEM (new_tuple, i, Py_None) steals one reference to Py_None each time!)
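
The ownership rule behind this can be observed from Python itself. The sketch below assumes CPython (sys.getrefcount is CPython-specific), and the helper name tuple_slot_references is made up for illustration. It shows that every tuple slot owns exactly one reference to its item. PyTuple_SET_ITEM does not create that reference; it takes over the one the caller holds, so storing Py_None without a prior Py_INCREF costs None one reference per slot.

```python
import sys

def tuple_slot_references(n):
    """Return (references added by an n-slot tuple, whether they were released)."""
    obj = object()
    before = sys.getrefcount(obj)
    holder = (obj,) * n            # n slots, each owning one reference to obj
    added = sys.getrefcount(obj) - before
    del holder                     # destroying the tuple releases one reference per slot
    released = sys.getrefcount(obj) == before
    return added, released

print(tuple_slot_references(3))
```

When the tuple is destroyed it releases one reference per slot, so a slot filled with a reference the tuple never really owned ends up decrementing the item once too often.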

There are many bugs in this application to fix before we can blame CPython.
History
Date User Action Args
2012-07-01 21:26:19 amaury.forgeotdarc set status: open -> closed, resolution: not a bug, messages: + msg164500
2012-07-01 20:18:15 Kai.Sterker set messages: + msg164494
2012-07-01 16:03:57 amaury.forgeotdarc set nosy: + amaury.forgeotdarc, messages: + msg164485
2012-07-01 13:42:47 jcea set nosy: + jcea
2012-07-01 09:46:23 Kai.Sterker create