Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys._current_frames() reports too many/wrong stack frames #61296

Closed
Ringding mannequin opened this issue Jan 31, 2013 · 15 comments
Closed

sys._current_frames() reports too many/wrong stack frames #61296

Ringding mannequin opened this issue Jan 31, 2013 · 15 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@Ringding
Copy link
Mannequin

Ringding mannequin commented Jan 31, 2013

BPO 17094
Nosy @pitrou, @vstinner, @ssbr
Files
  • current-frames-cleanup.patch: Minimal patch
  • test-fork-frames.py: Test program
  • tstate_after_fork.diff
  • tstates-afterfork.patch
  • tstates-afterfork2.patch
  • _current_frames_27_setdefault.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2013-05-05.21:48:15.780>
    created_at = <Date 2013-01-31.21:39:23.549>
    labels = ['interpreter-core', 'type-bug']
    title = 'sys._current_frames() reports too many/wrong stack frames'
    updated_at = <Date 2015-05-29.23:09:14.080>
    user = 'https://bugs.python.org/Ringding'

    bugs.python.org fields:

    activity = <Date 2015-05-29.23:09:14.080>
    actor = 'Devin Jeanpierre'
    assignee = 'none'
    closed = True
    closed_date = <Date 2013-05-05.21:48:15.780>
    closer = 'pitrou'
    components = ['Interpreter Core']
    creation = <Date 2013-01-31.21:39:23.549>
    creator = 'Ringding'
    dependencies = []
    files = ['28924', '28925', '29295', '29427', '30137', '39564']
    hgrepos = []
    issue_num = 17094
    keywords = ['patch']
    message_count = 15.0
    messages = ['181045', '182897', '183352', '183377', '183440', '183445', '184343', '184344', '186011', '186210', '188452', '188469', '188470', '244353', '244429']
    nosy_count = 6.0
    nosy_names = ['pitrou', 'vstinner', 'Devin Jeanpierre', 'Ringding', 'neologix', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue17094'
    versions = ['Python 3.4']

    @Ringding
    Copy link
    Mannequin Author

    Ringding mannequin commented Jan 31, 2013

    After a fork, the list of thread states contains all the threads from before. It is especially unfortunate that, at least for me, it usually happens that threads created in the forked process reuse the same thread ids, and sys._current_frames will then return wrong stack frames for existing threads because the old entries occur towards the tail of the linked list and overwrite the earlier, correct ones, in _PyThread_CurrentFrames.

    The attached patch is certainly not the most complete solution, but it solves my problem.

    With Python 2.7.3 I get:

    Exception in thread Thread-6:
    Traceback (most recent call last):
      File "/usr/lib64/python2.7/threading.py", line 551, in __bootstrap_inner
        self.run()
      File "/usr/lib64/python2.7/threading.py", line 504, in run
        self.__target(*self.__args, **self.__kwargs)
      File "test-fork-frames.py", line 24, in do_verify
        assert frame_from_sys is real_frame
    AssertionError

    With my patch applied, it passes silently. I can reproduce this on CentOS 6.3 x86_64 as well as Fedora 18 x86_64.

    @Ringding Ringding mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Jan 31, 2013
    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Feb 24, 2013

    Thanks, it's surprising this was never noticed before.
    Your patch has two issues:

    • it doesn't clear the thread state before deleting it, which would leak the frame, thread-specific dict, etc
    • it only clears the thread states after the current thread state: if the current thread is not at the head of the linked list - if it's not the most recently created thread - some thread states won't get cleared

    I'm attaching a patch doing the cleanup in PyEval_ReInitThreads(), with test.

    @pitrou
    Copy link
    Member

    pitrou commented Mar 2, 2013

    hi Charles-François,

    I'm attaching a patch doing the cleanup in PyEval_ReInitThreads(), with
    test.

    Did you forget to attach the patch?

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Mar 3, 2013

    Did you forget to attach the patch?

    Oops...

    @Ringding
    Copy link
    Mannequin Author

    Ringding mannequin commented Mar 4, 2013

    When I originally worked on this, I noticed that _PyThread_CurrentFrames also iterates over all interpreters. Because I have no experience with or use for multiple interpreters, I intentionally left it out of my patch, but shouldn't it be taken into account for a real patch?

    @Ringding
    Copy link
    Mannequin Author

    Ringding mannequin commented Mar 4, 2013

    (Regarding your test)
    I have also noticed in the past that joining threads after a fork has caused hangs occasionally, although that might have resulted from the messed up _current_frames.

    @pitrou
    Copy link
    Member

    pitrou commented Mar 16, 2013

    Isn't the thread state clearing subject to a race condition? PyThreadState_Clear() will release a bunch of frames, deallocating arbitrary objects and therefore potentially releasing the GIL. This lets the main thread run and potentially spawn other threads, which may be wrongly deallocated in the following loop iteration.

    A solution would be to detach the thread states from the linked list and clear them afterwards.

    @pitrou
    Copy link
    Member

    pitrou commented Mar 16, 2013

    Here is an updated patch.

    Note that I think this patch could break some programs. For example, if you have a thread in your main process which has a database connection open, deleting the thread state in a child process might shutdown the database connection (depending on the exact protocol). Therefore, I think it would be better to only apply the patch in 3.4.

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Apr 4, 2013

    Here is an updated patch.

    _PyThreadState_DeleteExcept uses HEAD_LOCK: ISTM that
    PyThreadState_Clear() can trigger arbitrary code execution: if a
    thread ends up being created/destroyed, I think we can get a deadlock
    when trying to acquire the head lock. I think it should be turned into
    an open call if possible.

    Also, as noted by Stefan, shouldn't we also iterate over other interpreters?

    Note that I think this patch could break some programs. For example, if you have a thread in your main process which has a database connection open, deleting the thread state in a child process might shutdown the database connection (depending on the exact protocol). Therefore, I think it would be better to only apply the patch in 3.4.

    Indeed. For the database example, there's this other issue where the
    database connection is stored in a thread-local storage... Some people
    will definitely get bitten by this...

    @pitrou
    Copy link
    Member

    pitrou commented Apr 7, 2013

    if a
    thread ends up being created/destroyed, I think we can get a deadlock
    when trying to acquire the head lock. I think it should be turned into
    an open call if possible.

    How would you do that in a simple way?

    Also, as noted by Stefan, shouldn't we also iterate over other interpreters?

    The problem is that, AFAIK, we don't know which thread states of the other interpreters should be kept alive.

    @pitrou
    Copy link
    Member

    pitrou commented May 5, 2013

    Here is an updated patch which doesn't hold the lock while calling PyThreadState_Clear(). It looks like it should be ok. Also, I've added some comments.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 5, 2013

    New changeset 847692b9902a by Antoine Pitrou in branch 'default':
    Issue bpo-17094: Clear stale thread states after fork().
    http://hg.python.org/cpython/rev/847692b9902a

    @pitrou
    Copy link
    Member

    pitrou commented May 5, 2013

    I've committed after the small change suggested by Charles-François. Hopefully there aren't any applications relying on the previous behaviour.

    @pitrou pitrou closed this as completed May 5, 2013
    @ssbr
    Copy link
    Mannequin

    ssbr mannequin commented May 28, 2015

    This bug also affects 2.7. The main problem I'm dealing with is "sys._current_frames will then return wrong stack frames for existing threads". One fix to just this would be to change how the dict is created, to keep newer threads rather than tossing them.

    Alternatively, we could backport the 3.4 fix.

    Thoughts?

    @ssbr
    Copy link
    Mannequin

    ssbr mannequin commented May 29, 2015

    The patch I'm providing with this comment has a ... really hokey test case, and a two line + whitespace diff for pystate.c . The objective of the patch is only to have _current_frames report the correct frame for any live thread. It continues to report dead threads' frames, up until they would conflict with a live thread. IMO it's the minimal possible fix for this aspect of the bug, and suitable for 2.7.x. Let me know what you think.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant