Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in generator if the generator in created in a temporary C thread #58637

Closed
vstinner opened this issue Mar 28, 2012 · 25 comments
Closed

Bug in generator if the generator in created in a temporary C thread #58637

vstinner opened this issue Mar 28, 2012 · 25 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@vstinner
Copy link
Member

BPO 14432
Nosy @ncoghlan, @pitrou, @scoder, @vstinner, @asvetlov, @markshannon, @serhiy-storchaka
Files
  • generator_frame_bug.tar.gz
  • generator.patch: patch
  • remove_f_tstate.patch: Patch to revision 8a47d2322df0
  • issue14432.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2013-12-13.01:41:54.224>
    created_at = <Date 2012-03-28.15:34:27.659>
    labels = ['interpreter-core', 'type-crash']
    title = 'Bug in generator if the generator in created in a temporary C thread'
    updated_at = <Date 2013-12-14.05:56:54.317>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2013-12-14.05:56:54.317>
    actor = 'ncoghlan'
    assignee = 'none'
    closed = True
    closed_date = <Date 2013-12-13.01:41:54.224>
    closer = 'vstinner'
    components = ['Interpreter Core']
    creation = <Date 2012-03-28.15:34:27.659>
    creator = 'vstinner'
    dependencies = []
    files = ['25054', '25055', '25176', '32605']
    hgrepos = []
    issue_num = 14432
    keywords = ['patch']
    message_count = 25.0
    messages = ['156982', '156984', '156986', '156987', '157379', '158025', '158455', '197520', '202796', '202860', '206003', '206004', '206005', '206006', '206007', '206070', '206072', '206075', '206076', '206077', '206078', '206079', '206080', '206086', '206167']
    nosy_count = 12.0
    nosy_names = ['ncoghlan', 'pitrou', 'scoder', 'vstinner', 'asvetlov', 'adamtj', 'neologix', 'Mark.Shannon', 'rosslagerwall', 'python-dev', 'serhiy.storchaka', 'asuffield']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'patch review'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue14432'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']

    @vstinner
    Copy link
    Member Author

    We have a crash in our product when tracing is enabled by
    sys.settrace() and threading.settrace(). If a Python generator is
    created in a C thread, calling the generator later in another thread
    may crash if Python tracing is enabled.

    • the C thread calls PyGILState_Ensure() which creates a temporary
      Python thread state
    • a generator is created, the generator has a reference to a Python
      frame which keeps a reference to the temporary Python thread state
    • the C thread calls PyGILState_Releases() which destroys the
      temporary Python thread state
    • when the generator is called later in another thread, call_trace()
      reads the Python thread state from the generator frame, which is the
      destroyed frame => it does crash on a pointer dereference if the
      memory was reallocated (by malloc()) and the data were erased

    To reproduce the crash, unpack the attached
    generator_frame_bug.tar.gz, compile the C module using:

    python setup.py build

    and then run:

    PYTHONPATH=$(ls -d build/lib*/) python test.py

    (or just "python test.py if you installed the _test module).
    You may need to use Valgrind to see the error, or call memset(tstate,
    0xFF, sizeof(*tstate)) before free(tstate) in tstate_delete_common().

    Calling the generator should update its reference to the Python state
    thread in its frame. The generator may also clears frame->f_tstate (to
    detect bugs earlier), as it does for frame->f_back (to avoid a
    reference cycle). Attached patch implements this fix for Python 3.3.

    @vstinner vstinner added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Mar 28, 2012
    @rosslagerwall
    Copy link
    Mannequin

    rosslagerwall mannequin commented Mar 28, 2012

    Here's the patch ;-)

    @ncoghlan
    Copy link
    Contributor

    The proposed fix sounds reasonable to me. Would it be possible to work something into test_capi to actually test it?

    @ncoghlan
    Copy link
    Contributor

    It may not even have to specifically test the crash - any operation that accessed the tstate on the frame and could be shown to be accessing the wrong thread state when called from another thread could demonstrate the problem.

    @adamtj
    Copy link
    Mannequin

    adamtj mannequin commented Apr 2, 2012

    For what it's worth, I think I've seen this bug in 2.6 and 2.5, using generators created in python threads, while profiling not tracing.

    I'm creating generators in one python thread and storing them in a variable. In a different python thread I overwrite the variable, which garbage collects the generator. Sometimes it causes an interpreter crash. Other times, it seems to trigger a return event in the profiler, but it's associated with an incorrect thread. The thread in question is often (always?) in the profiler code, so it looks like the profiler is profiling itself.

    @markshannon
    Copy link
    Member

    Rather than ensuring that f_tstate always points to the current frame,
    just remove it altogether.

    Patch attached

    @asuffield
    Copy link
    Mannequin

    asuffield mannequin commented Apr 16, 2012

    I think I've tripped over a variation on this theme using pyqt and 2.7:

    When working in a QThread, the PyGILState_Ensure call when transitioning control from Qt to python will frequently allocate a new thread state - because every time control returns to the Qt event loop, gilstate_counter is likely to become zero.

    If a generator is kept around longer than this, it becomes quite likely to have an older thread state in f_tstate. This happens all the time.

    The access that blows up on me is PyEval_GetRestricted, since PyFrame_IsRestricted checks f_tstate. Annoyingly this debris is still called on the possibly-restricted operations even when restricted execution is nowhere in sight.

    Hence, any use of open() in a long-lived generator in a QThread is probably going to crash.

    Patch against 2.7?

    @vstinner
    Copy link
    Member Author

    ping? (for myself :-))

    @vstinner
    Copy link
    Member Author

    Updated patch for Python 3.4:

    • remove PyFrameObject.f_tstate attribute: the thread state can be easily retrieved, it is known where it is needed (see the patch). There is one function which doesn't know the thread state: _PyEval_CallTracing(), but this function was already calling PyEval_GetFrame() which calls PyThreadState_GET() internally, so...

    • add an unit test for this issue (generator created in a temporary C thread)

    It's really hard to reproduce the crash. I tried with my old tarball and I failed. I also tried with my unit test and I failed. I'm pretty sure that the crash can only be reproduced when Python is compiled is release mode.

    I reproduced the crash once with the unit test on an unpatched Python 3.4.

    For Python 2.7 and 3.3, what do you think of applying generator.patch? It looks simple and obvious. I don't know the impact on performances, but it should be very low.

    @vstinner vstinner changed the title Bug in generator if the generator in created in a C thread Bug in generator if the generator in created in a temporary C thread Nov 14, 2013
    @vstinner
    Copy link
    Member Author

    versions: +Python 3.4 -Python 2.6, Python 3.2

    It would be interesting to fix the issue in Python 2.7 and 3.3:
    generator.patch should fix it and the patch is simple (update
    frame->f_tstate before each execution of a generator).

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 13, 2013

    New changeset d2560fd8a008 by Victor Stinner in branch 'default':
    Issue bpo-14432: Remove the thread state field from the frame structure. Fix a
    http://hg.python.org/cpython/rev/d2560fd8a008

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 13, 2013

    New changeset 0875e5bbe5f0 by Victor Stinner in branch '3.3':
    Issue bpo-14432: Generator now clears the borrowed reference to the thread state
    http://hg.python.org/cpython/rev/0875e5bbe5f0

    New changeset 55dd094a2b01 by Victor Stinner in branch 'default':
    Issue bpo-14432: Null merge 3.3, Python 3.4 has a different fix
    http://hg.python.org/cpython/rev/55dd094a2b01

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 13, 2013

    New changeset 2f975036cf39 by Victor Stinner in branch '3.3':
    Issue bpo-14432: Fix compilation when thread support is disabled
    http://hg.python.org/cpython/rev/2f975036cf39

    New changeset 9852637f05c3 by Victor Stinner in branch 'default':
    (Merge 3.3) Issue bpo-14432: Fix compilation when thread support is disabled
    http://hg.python.org/cpython/rev/9852637f05c3

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 13, 2013

    New changeset aa324af42c0e by Victor Stinner in branch '2.7':
    Issue bpo-14432: Generator now clears the borrowed reference to the thread state
    http://hg.python.org/cpython/rev/aa324af42c0e

    @vstinner
    Copy link
    Member Author

    Thanks Mark Shannon for your patch. Sorry, I forgot to mention your name if the changesets :-/ I didn't remember the whole story of this issue.

    It should now be fixed.

    @scoder
    Copy link
    Contributor

    scoder commented Dec 13, 2013

    I agree that this is an improvement, but isn't it a bit late for removing a public field from a public header file in 3.4, without any preceding deprecation?

    @ncoghlan
    Copy link
    Contributor

    Unfortunately Stefan has a point - what's the migration path for C API
    users that were relying on that struct field?

    @vstinner
    Copy link
    Member Author

    I agree that this is an improvement, but isn't it a bit late for removing a public field from a public header file in 3.4, without any preceding deprecation?

    The header is not public, it is private. The structure in not
    documented (in Doc/c-api/*.rst). There is only one public documented
    method: PyFrame_GetLineNumber().

    If you rely on such implement detail (PyFrameObject field), you have
    to be prepared for breakage between Python releases (3.x).

    Do you know applications relying on the field?

    How don't see how you would like to deprecate the field since it's
    private and not documented.

    @vstinner
    Copy link
    Member Author

    2013/12/13 Victor Stinner <victor.stinner@gmail.com>:

    The header is not public, it is private.

    Hum, I'm not clear. frameobject.h is not included in Python.h, so the
    classic #include "Python.h" doesn't give you access to PyFrameObject
    structure. You have to add a second #include "frameobject.h". All
    definitions in this header are also surrounded by #ifndef
    Py_LIMITED_API ... #endif, so the fields are not part of the stable
    API.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Dec 13, 2013

    New changeset 278dd7eb2f2b by Victor Stinner in branch 'default':
    Issue bpo-14432: Document the removal of the PyFrameObject.f_tstate field
    http://hg.python.org/cpython/rev/278dd7eb2f2b

    @scoder
    Copy link
    Contributor

    scoder commented Dec 13, 2013

    what's the migration path for C API users that were relying on that struct field?

    Depends on how you use it, I guess. In many cases (at least for Cython and likely some other low-level tools), it could be as simple as

    #if PY_VERSION_HEX < 0x030400B2
        // set "f_tstate" here
    #endif

    My guess is that some (most?) code that sets this field only does so because CPython previously depended on it being set. (Stackless comes to mind, not sure if that's supposed to work with 3.x...)

    Read access is an entirely different thing, though. No idea what other tools might be using it for. I'm sure there are debugging/tracing tools out there that make use of the frame internals. Would it be correct for them to just use the current thread state instead? E.g. from a signal handler that analysis the current threads?

    On a quick look through Ohloh's code search I only found a couple of other occurrences outside of CPython so far.

    http://code.ohloh.net/search?s=%22f_tstate%22&fe=c&filterChecked=true

    Disclaimer: it's easy to fix for me with the above conditional, so I won't argue much about keeping up compatibility here. I'm merely raising the issue because there is evidently code that this change breaks.

    @pitrou
    Copy link
    Member

    pitrou commented Dec 13, 2013

    All
    definitions in this header are also surrounded by #ifndef
    Py_LIMITED_API ... #endif, so the fields are not part of the stable
    API.

    "Stable ABI", not API.

    That said, I agree the field should have been private anyway.

    @vstinner
    Copy link
    Member Author

    Depends on how you use it, I guess. In many cases (at least for Cython and likely some other low-level tools), it could be as simple as

    #if PY_VERSION_HEX < 0x030400B2
    // set "f_tstate" here
    #endif

    Why would you set f_tstate field? Frame constructor (PyFrame_New) and
    Generators (gen_send_ex) do it for you.

    Cython does touch this field? Stackless Python supports Python 3 (3.4)?

    If something relies on f_tstate, it's easy to avoid this field.

    Would it be correct for them to just use the current thread state instead? E.g. from a signal handler that analysis the current threads?

    The faulthandler has different low-level C signal handlers and it
    doesn't use f_tstate, but PyGILState_GetThisThreadState() (or
    PyThreadState_Get() if thread support is disabled).

    I was surprised that it was so easy to modify ceval.c: in most cases,
    you already have the thread state. So I just passed the thread state
    to the function.

    @scoder
    Copy link
    Contributor

    scoder commented Dec 13, 2013

    frameobject.h is not included in Python.h, so the
    classic #include "Python.h" doesn't give you access to PyFrameObject
    structure. You have to add a second #include "frameobject.h".

    Ah, right. I keep misremembering that, because in order to do anything non-trivial with frames you'll eventually end up importing that header file, so it feels "almost public".

    I think it's reasonable to expect low-level tools to adapt manually to this kind of changes (or at least recompile for a specific CPython version), and it's unlikely that there's much other code that would make use of this field specifically.

    So, no objection to keeping the change in 3.4 as it is (and I see that you already documented it).

    @ncoghlan
    Copy link
    Contributor

    Ah, I always forget that frameobject.h isn't included from python.h
    because none of the names have underscore prefixes. In that case,
    together with the new note in the porting section, I agree this is
    fine.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants