Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reference leak in thread._local #47960

Closed
tamino mannequin opened this issue Aug 28, 2008 · 9 comments
Closed

Reference leak in thread._local #47960

tamino mannequin opened this issue Aug 28, 2008 · 9 comments
Labels
extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error

Comments

@tamino
Copy link
Mannequin

tamino mannequin commented Aug 28, 2008

BPO 3710
Nosy @gpshead, @pitrou
Superseder
  • bpo-1868: threading.local doesn't free attrs when assigning thread exits
  • Files
  • threadmodule.c.diff: Possible patch?
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2010-08-21.22:12:46.789>
    created_at = <Date 2008-08-28.00:09:02.385>
    labels = ['extension-modules', 'type-bug']
    title = 'Reference leak in thread._local'
    updated_at = <Date 2011-02-14.06:58:55.731>
    user = 'https://bugs.python.org/tamino'

    bugs.python.org fields:

    activity = <Date 2011-02-14.06:58:55.731>
    actor = 'ysj.ray'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-08-21.22:12:46.789>
    closer = 'pitrou'
    components = ['Extension Modules']
    creation = <Date 2008-08-28.00:09:02.385>
    creator = 'tamino'
    dependencies = []
    files = ['11278']
    hgrepos = []
    issue_num = 3710
    keywords = ['patch']
    message_count = 9.0
    messages = ['72052', '72053', '72054', '72055', '72056', '72061', '114563', '114564', '114575']
    nosy_count = 4.0
    nosy_names = ['gregory.p.smith', 'pitrou', 'tamino', 'ysj.ray']
    pr_nums = []
    priority = 'high'
    resolution = 'duplicate'
    stage = None
    status = 'closed'
    superseder = '1868'
    type = 'behavior'
    url = 'https://bugs.python.org/issue3710'
    versions = ['Python 2.6', 'Python 3.1', 'Python 2.7', 'Python 3.2']

    @tamino
    Copy link
    Mannequin Author

    tamino mannequin commented Aug 28, 2008

    This is a copy of a message I sent to the python-dev mailing list; it
    was suggested in a reply that I file a bug for this issue. I'm
    filing it against Python 2.5 because that's where I noticed it,
    but it doesn't look like this code has changed much in trunk.

    I noticed that thread._local can leak references if objects are
    being stored inside the thread._local object whose destructors
    might release the GIL.

    The way this happens is that in Modules/threadmodule.c, in the
    _ldict() function, it does things like this:

                    Py_CLEAR(self->dict);
                    Py_INCREF(ldict);
                    self->dict = ldict;

    If the Py_CLEAR ends up taking away the last reference to an object
    contained in the dict, and a thread context switch occurs during that
    object's deallocation, then self->dict might not be NULL on return
    from Py_CLEAR; another thread might have run, accessed something in
    the same thread._local object, and caused self->dict to be set to
    something else (and Py_INCREF'ed). So when we blindly do the
    assignment into self->dict, we may be overwriting a valid reference,
    and not properly Py_DECREFing it.

    The recent change (revision 64601 to threadmodule.c) did not address
    context switches during the Py_CLEAR call; only context switches
    during tp_init.

    The attached patch (against trunk) is my first attempt at fixing this.
    It detects if self->dict has been set to something else after the
    Py_CLEAR, and retries the Py_CLEAR (because _ldict really only cares
    about installing the proper value of self->dict for the currently
    running thread).

    However, I am still uncomfortable about the fact that local_getattro
    and local_setattro discard the value returned from _ldict, and instead
    hand off control to the PyObject_Generic layer and trust that by the
    time self->dict is actually used, it still has the correct value for
    the current thread. Would it be better to, say, inline a copy of the
    PyObject_Generic* functions inside local_getattro/local_setattro,
    and force the operations to be done on the actual dict returned by
    _ldict()?

    @tamino tamino mannequin added extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error labels Aug 28, 2008
    @pitrou
    Copy link
    Member

    pitrou commented Aug 28, 2008

    Hmm, rather than the while loop in your proposal, the proper idiom would be:

                    PyObject *olddict = self->dict;
                    Py_INCREF(ldict);
                    self->dict = ldict;
                    Py_XDECREF(olddict);

    @tamino
    Copy link
    Mannequin Author

    tamino mannequin commented Aug 28, 2008

    But then if there is a context switch during the last Py_XDECREF, then
    it could be the case that self->dict is not set properly on return from
    _ldict().

    Functions like local_setattro() use _ldict() more for its side effect
    (setting self->dict) than for its return value. It's possible that
    this should be changed; see the last paragraph in my original report.

    @pitrou
    Copy link
    Member

    pitrou commented Aug 28, 2008

    But then if there is a context switch during the last Py_XDECREF, then
    it could be the case that self->dict is not set properly on return from
    _ldict().

    Well, C code is effectively locked in a single thread until the GIL is
    released. It means that a piece of C code which doesn't release the GIL
    behaves as a critical section. So the only thing to consider, IMO, is
    whether the GIL can still be released between "if (ldict == NULL)" and
    "self->dict = ldict".

    @tamino
    Copy link
    Mannequin Author

    tamino mannequin commented Aug 28, 2008

    The specific thing that was happening for me is that an
    _sqlite3.Connection object was in the dictionary. In
    Modules/_sqlite/connection.c, in pysqlite_connection_dealloc(),
    it uses Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS.

    So it's the call to Py_DECREF that's interesting from my point
    of view. I believe that if _ldict() sets self->dict to what it
    should be for the current thread, and then calls Py_DECREF on
    the old value, and then returns, then _ldict() is no longer
    able to guarantee that self->dict will be set to the right
    thing for the current thread after it returns (because if the
    Py_DECREF ended up deallocating something like an sqlite3
    connection, then it'd have released and reacquired the GIL).

    Hence, in the patch I attached, the assignment into self->dict
    is kept as the last thing that happens before _ldict() returns,
    and I believe this means _ldict() can still make that guarantee.

    Of course, I'd be all for changing local_getattro/local_setattro
    to not need _ldict to make that guarantee! _ldict always *returns*
    the correct pointer; it would be nice to make use of that somehow.

    @gpshead
    Copy link
    Member

    gpshead commented Aug 28, 2008

    fyi - This bug and bpo-1868 (http://bugs.python.org/issue1868) seem related.

    @pitrou
    Copy link
    Member

    pitrou commented Aug 21, 2010

    Of course, I'd be all for changing local_getattro/local_setattro
    to not need _ldict to make that guarantee! _ldict always *returns*
    the correct pointer; it would be nice to make use of that somehow.

    Indeed. Therefore, perhaps we could break the problem in two:

    • fix the reference leak by using the correct assignment idiom outlined earlier
    • fix local_getattro / local_setattro to always use the proper dict, regardless of thread switches

    @tamino
    Copy link
    Mannequin Author

    tamino mannequin commented Aug 21, 2010

    The latest patch over in bpo-1868 is working fine for my company in production, and solves bpo-3710 as well. I think the only thing left to do on that patch is to make it special case "__dict__".

    @pitrou
    Copy link
    Member

    pitrou commented Aug 21, 2010

    Ok, then let's continue in bpo-1868.

    @pitrou pitrou closed this as completed Aug 21, 2010
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants