classification
Title: Bug in generator if the generator in created in a temporary C thread
Type: crash Stage: patch review
Components: Interpreter Core Versions: Python 3.4, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Mark.Shannon, adamtj, asuffield, asvetlov, haypo, ncoghlan, neologix, pitrou, python-dev, rosslagerwall, scoder, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2012-03-28 15:34 by haypo, last changed 2013-12-14 05:56 by ncoghlan. This issue is now closed.

Files
File name Uploaded Description Edit
generator_frame_bug.tar.gz haypo, 2012-03-28 15:34
generator.patch rosslagerwall, 2012-03-28 15:58 patch review
remove_f_tstate.patch Mark.Shannon, 2012-04-11 11:57 Patch to revision 8a47d2322df0 review
issue14432.patch haypo, 2013-11-13 22:47 review
Messages (25)
msg156982 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2012-03-28 15:34
We have a crash in our product when tracing is enabled by
sys.settrace() and threading.settrace(). If a Python generator is
created in a C thread, calling the generator later in another thread
may crash if Python tracing is enabled.

 - the C thread calls PyGILState_Ensure() which creates a temporary
Python thread state
 - a generator is created, the generator has a reference to a Python
frame which keeps a reference to the temporary Python thread state
 - the C thread calls PyGILState_Releases() which destroys the
temporary Python thread state
 - when the generator is called later in another thread, call_trace()
reads the Python thread state from the generator frame, which is the
destroyed frame => it does crash on a pointer dereference if the
memory was reallocated (by malloc()) and the data were erased

To reproduce the crash, unpack the attached
generator_frame_bug.tar.gz, compile the C module using:

   python setup.py build

and then run:

   PYTHONPATH=$(ls -d build/lib*/) python test.py

(or just "python test.py if you installed the _test module).
You may need to use Valgrind to see the error, or call memset(tstate,
0xFF, sizeof(*tstate)) before free(tstate) in tstate_delete_common().

Calling the generator should update its reference to the Python state
thread in its frame. The generator may also clears frame->f_tstate (to
detect bugs earlier), as it does for frame->f_back (to avoid a
reference cycle). Attached patch implements this fix for Python 3.3.
msg156984 - (view) Author: Ross Lagerwall (rosslagerwall) (Python committer) Date: 2012-03-28 15:58
Here's the patch ;-)
msg156986 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-03-28 16:17
The proposed fix sounds reasonable to me. Would it be possible to work something into test_capi to actually test it?
msg156987 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-03-28 16:23
It may not even have to specifically test the crash - any operation that accessed the tstate on the frame and could be shown to be accessing the wrong thread state when called from another thread could demonstrate the problem.
msg157379 - (view) Author: Adam Tomjack (adamtj) Date: 2012-04-02 19:51
For what it's worth, I think I've seen this bug in 2.6 and 2.5, using generators created in python threads, while profiling not tracing.

I'm creating generators in one python thread and storing them in a variable.  In a different python thread I overwrite the variable, which garbage collects the generator.  Sometimes it causes an interpreter crash.  Other times, it seems to trigger a return event in the profiler, but it's associated with an incorrect thread.  The thread in question is often (always?) in the profiler code, so it looks like the profiler is profiling itself.
msg158025 - (view) Author: Mark Shannon (Mark.Shannon) * Date: 2012-04-11 11:57
Rather than ensuring that f_tstate always points to the current frame,
just remove it altogether.

Patch attached
msg158455 - (view) Author: Andrew Suffield (asuffield) Date: 2012-04-16 14:13
I think I've tripped over a variation on this theme using pyqt and 2.7:

When working in a QThread, the PyGILState_Ensure call when transitioning control from Qt to python will frequently allocate a new thread state - because every time control returns to the Qt event loop, gilstate_counter is likely to become zero.

If a generator is kept around longer than this, it becomes quite likely to have an older thread state in f_tstate. This happens all the time.

The access that blows up on me is PyEval_GetRestricted, since PyFrame_IsRestricted checks f_tstate. Annoyingly this debris is still called on the possibly-restricted operations even when restricted execution is nowhere in sight.

Hence, any use of open() in a long-lived generator in a QThread is probably going to crash.

Patch against 2.7?
msg197520 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-09-12 11:34
ping? (for myself :-))
msg202796 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-13 22:47
Updated patch for Python 3.4:

- remove PyFrameObject.f_tstate attribute: the thread state can be easily retrieved, it is known where it is needed (see the patch). There is one function which doesn't know the thread state: _PyEval_CallTracing(), but this function was already calling PyEval_GetFrame() which calls PyThreadState_GET() internally, so...

- add an unit test for this issue (generator created in a temporary C thread)

It's really hard to reproduce the crash. I tried with my old tarball and I failed. I also tried with my unit test and I failed. I'm pretty sure that the crash can only be reproduced when Python is compiled is release mode.

I reproduced the crash once with the unit test on an unpatched Python 3.4.


For Python 2.7 and 3.3, what do you think of applying generator.patch? It looks simple and obvious. I don't know the impact on performances, but it should be very low.
msg202860 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-11-14 16:51
> versions: +Python 3.4 -Python 2.6, Python 3.2

It would be interesting to fix the issue in Python 2.7 and 3.3:
generator.patch should fix it and the patch is simple (update
frame->f_tstate before each execution of a generator).
msg206003 - (view) Author: Roundup Robot (python-dev) Date: 2013-12-13 01:08
New changeset d2560fd8a008 by Victor Stinner in branch 'default':
Issue #14432: Remove the thread state field from the frame structure. Fix a
http://hg.python.org/cpython/rev/d2560fd8a008
msg206004 - (view) Author: Roundup Robot (python-dev) Date: 2013-12-13 01:19
New changeset 0875e5bbe5f0 by Victor Stinner in branch '3.3':
Issue #14432: Generator now clears the borrowed reference to the thread state
http://hg.python.org/cpython/rev/0875e5bbe5f0

New changeset 55dd094a2b01 by Victor Stinner in branch 'default':
Issue #14432: Null merge 3.3, Python 3.4 has a different fix
http://hg.python.org/cpython/rev/55dd094a2b01
msg206005 - (view) Author: Roundup Robot (python-dev) Date: 2013-12-13 01:33
New changeset 2f975036cf39 by Victor Stinner in branch '3.3':
Issue #14432: Fix compilation when thread support is disabled
http://hg.python.org/cpython/rev/2f975036cf39

New changeset 9852637f05c3 by Victor Stinner in branch 'default':
(Merge 3.3) Issue #14432: Fix compilation when thread support is disabled
http://hg.python.org/cpython/rev/9852637f05c3
msg206006 - (view) Author: Roundup Robot (python-dev) Date: 2013-12-13 01:39
New changeset aa324af42c0e by Victor Stinner in branch '2.7':
Issue #14432: Generator now clears the borrowed reference to the thread state
http://hg.python.org/cpython/rev/aa324af42c0e
msg206007 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-13 01:41
Thanks Mark Shannon for your patch. Sorry, I forgot to mention your name if the changesets :-/ I didn't remember the whole story of this issue.

It should now be fixed.
msg206070 - (view) Author: Stefan Behnel (scoder) * Date: 2013-12-13 12:47
I agree that this is an improvement, but isn't it a bit late for removing a public field from a public header file in 3.4, without any preceding deprecation?
msg206072 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-12-13 12:57
Unfortunately Stefan has a point - what's the migration path for C API
users that were relying on that struct field?
msg206075 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-13 13:22
> I agree that this is an improvement, but isn't it a bit late for removing a public field from a public header file in 3.4, without any preceding deprecation?

The header is not public, it is private. The structure in not
documented (in Doc/c-api/*.rst). There is only one public documented
method: PyFrame_GetLineNumber().

If you rely on such implement detail (PyFrameObject field), you have
to be prepared for breakage between Python releases (3.x).

Do you know applications relying on the field?

How don't see how you would like to deprecate the field since it's
private and not documented.
msg206076 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-13 13:28
2013/12/13 Victor Stinner <victor.stinner@gmail.com>:
> The header is not public, it is private.

Hum, I'm not clear. frameobject.h is not included in Python.h, so the
classic #include "Python.h" doesn't give you access to PyFrameObject
structure. You have to add a second #include "frameobject.h". All
definitions in this header are also surrounded by #ifndef
Py_LIMITED_API ... #endif, so the fields are not part of the stable
API.
msg206077 - (view) Author: Roundup Robot (python-dev) Date: 2013-12-13 13:33
New changeset 278dd7eb2f2b by Victor Stinner in branch 'default':
Issue #14432: Document the removal of the PyFrameObject.f_tstate field
http://hg.python.org/cpython/rev/278dd7eb2f2b
msg206078 - (view) Author: Stefan Behnel (scoder) * Date: 2013-12-13 13:34
> what's the migration path for C API users that were relying on that struct field?

Depends on how you use it, I guess. In many cases (at least for Cython and likely some other low-level tools), it could be as simple as

#if PY_VERSION_HEX < 0x030400B2
    // set "f_tstate" here
#endif

My guess is that some (most?) code that sets this field only does so because CPython previously depended on it being set. (Stackless comes to mind, not sure if that's supposed to work with 3.x...)

Read access is an entirely different thing, though. No idea what other tools might be using it for. I'm sure there are debugging/tracing tools out there that make use of the frame internals. Would it be correct for them to just use the current thread state instead? E.g. from a signal handler that analysis the current threads?

On a quick look through Ohloh's code search I only found a couple of other occurrences outside of CPython so far.

http://code.ohloh.net/search?s=%22f_tstate%22&fe=c&filterChecked=true

Disclaimer: it's easy to fix for me with the above conditional, so I won't argue much about keeping up compatibility here. I'm merely raising the issue because there is evidently code that this change breaks.
msg206079 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-12-13 13:38
> All
> definitions in this header are also surrounded by #ifndef
> Py_LIMITED_API ... #endif, so the fields are not part of the stable
> API.

"Stable ABI", not API.

That said, I agree the field should have been private anyway.
msg206080 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-12-13 13:46
> Depends on how you use it, I guess. In many cases (at least for Cython and likely some other low-level tools), it could be as simple as
>
> #if PY_VERSION_HEX < 0x030400B2
>     // set "f_tstate" here
> #endif

Why would you set f_tstate field? Frame constructor (PyFrame_New) and
Generators (gen_send_ex) do it for you.

Cython does touch this field? Stackless Python supports Python 3 (3.4)?

If something relies on f_tstate, it's easy to avoid this field.

> Would it be correct for them to just use the current thread state instead? E.g. from a signal handler that analysis the current threads?

The faulthandler has different low-level C signal handlers and it
doesn't use f_tstate, but PyGILState_GetThisThreadState() (or
PyThreadState_Get() if thread support is disabled).

I was surprised that it was so easy to modify ceval.c: in most cases,
you already have the thread state. So I just passed the thread state
to the function.
msg206086 - (view) Author: Stefan Behnel (scoder) * Date: 2013-12-13 14:02
> frameobject.h is not included in Python.h, so the
> classic #include "Python.h" doesn't give you access to PyFrameObject
> structure. You have to add a second #include "frameobject.h".

Ah, right. I keep misremembering that, because in order to do anything non-trivial with frames you'll eventually end up importing that header file, so it feels "almost public".

I think it's reasonable to expect low-level tools to adapt manually to this kind of changes (or at least recompile for a specific CPython version), and it's unlikely that there's much other code that would make use of this field specifically.

So, no objection to keeping the change in 3.4 as it is (and I see that you already documented it).
msg206167 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-12-14 05:56
Ah, I always forget that frameobject.h isn't included from python.h
because none of the names have underscore prefixes. In that case,
together with the new note in the porting section, I agree this is
fine.
History
Date User Action Args
2013-12-14 05:56:54ncoghlansetmessages: + msg206167
2013-12-13 14:02:27scodersetmessages: + msg206086
2013-12-13 13:46:08hayposetmessages: + msg206080
2013-12-13 13:38:11pitrousetmessages: + msg206079
2013-12-13 13:34:53scodersetmessages: + msg206078
2013-12-13 13:33:16python-devsetmessages: + msg206077
2013-12-13 13:28:59hayposetmessages: + msg206076
2013-12-13 13:22:38hayposetmessages: + msg206075
2013-12-13 12:57:59ncoghlansetmessages: + msg206072
2013-12-13 12:47:55scodersetnosy: + scoder
messages: + msg206070
2013-12-13 01:41:54hayposetstatus: open -> closed
resolution: fixed
messages: + msg206007
2013-12-13 01:39:39python-devsetmessages: + msg206006
2013-12-13 01:33:14python-devsetmessages: + msg206005
2013-12-13 01:19:34python-devsetmessages: + msg206004
2013-12-13 01:08:42python-devsetnosy: + python-dev
messages: + msg206003
2013-11-14 16:51:33hayposetmessages: + msg202860
2013-11-14 15:38:24serhiy.storchakasetversions: + Python 3.4, - Python 2.6, Python 3.2
2013-11-14 08:28:40hayposettitle: Bug in generator if the generator in created in a C thread -> Bug in generator if the generator in created in a temporary C thread
2013-11-13 22:47:29hayposetfiles: + issue14432.patch
nosy: + serhiy.storchaka
messages: + msg202796

2013-09-12 11:34:14hayposetmessages: + msg197520
2012-04-16 14:13:57asuffieldsetnosy: + asuffield
messages: + msg158455
2012-04-12 16:06:05pitrousetstage: patch review
2012-04-11 11:57:03Mark.Shannonsetfiles: + remove_f_tstate.patch
nosy: + Mark.Shannon
messages: + msg158025

2012-04-02 19:51:19adamtjsetnosy: + adamtj

messages: + msg157379
versions: + Python 2.6
2012-03-28 18:59:13asvetlovsetnosy: + asvetlov
2012-03-28 16:23:01ncoghlansetmessages: + msg156987
2012-03-28 16:17:40ncoghlansetnosy: + ncoghlan
messages: + msg156986
2012-03-28 15:58:50rosslagerwallsetfiles: + generator.patch

nosy: + rosslagerwall
messages: + msg156984

keywords: + patch
2012-03-28 15:34:27haypocreate