This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Reference leak in thread._local
Type: behavior Stage:
Components: Extension Modules Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed Resolution: duplicate
Dependencies: Superseder: threading.local doesn't free attrs when assigning thread exits
View: 1868
Assigned To: Nosy List: gregory.p.smith, pitrou, tamino, ysj.ray
Priority: high Keywords: patch

Created on 2008-08-28 00:09 by tamino, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
threadmodule.c.diff tamino, 2008-08-28 00:08 Possible patch?
Messages (9)
msg72052 - (view) Author: Ben Cottrell (tamino) Date: 2008-08-28 00:08
This is a copy of a message I sent to the python-dev mailing list; it
was suggested in a reply that I file a bug for this issue. I'm
filing it against Python 2.5 because that's where I noticed it,
but it doesn't look like this code has changed much in trunk.

I noticed that thread._local can leak references if objects are
being stored inside the thread._local object whose destructors
might release the GIL.

The way this happens is that in Modules/threadmodule.c, in the
_ldict() function, it does things like this:

                Py_CLEAR(self->dict);
                Py_INCREF(ldict);
                self->dict = ldict;

If the Py_CLEAR ends up taking away the last reference to an object
contained in the dict, and a thread context switch occurs during that
object's deallocation, then self->dict might not be NULL on return
from Py_CLEAR; another thread might have run, accessed something in
the same thread._local object, and caused self->dict to be set to
something else (and Py_INCREF'ed). So when we blindly do the
assignment into self->dict, we may be overwriting a valid reference,
and not properly Py_DECREFing it.

The recent change (revision 64601 to threadmodule.c) did not address
context switches during the Py_CLEAR call; only context switches
during tp_init.

The attached patch (against trunk) is my first attempt at fixing this.
It detects if self->dict has been set to something else after the
Py_CLEAR, and retries the Py_CLEAR (because _ldict really only cares
about installing the proper value of self->dict for the currently
running thread).

However, I am still uncomfortable about the fact that local_getattro
and local_setattro discard the value returned from _ldict, and instead
hand off control to the PyObject_Generic layer and trust that by the
time self->dict is actually used, it still has the correct value for
the current thread. Would it be better to, say, inline a copy of the
PyObject_Generic* functions inside local_getattro/local_setattro,
and force the operations to be done on the actual dict returned by
_ldict()?
msg72053 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-28 00:14
Hmm, rather than the while loop in your proposal, the proper idiom would be:

                PyObject *olddict = self->dict;
                Py_INCREF(ldict);
                self->dict = ldict;
                Py_XDECREF(olddict);
msg72054 - (view) Author: Ben Cottrell (tamino) Date: 2008-08-28 00:18
But then if there is a context switch during the last Py_XDECREF, then
it could be the case that self->dict is not set properly on return from
_ldict().

Functions like local_setattro() use _ldict() more for its side effect
(setting self->dict) than for its return value. It's possible that
this should be changed; see the last paragraph in my original report.
msg72055 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-28 00:37
> But then if there is a context switch during the last Py_XDECREF, then
> it could be the case that self->dict is not set properly on return from
> _ldict().

Well, C code is effectively locked in a single thread until the GIL is
released. It means that a piece of C code which doesn't release the GIL
behaves as a critical section. So the only thing to consider, IMO, is
whether the GIL can still be released between "if (ldict == NULL)" and
"self->dict = ldict".
msg72056 - (view) Author: Ben Cottrell (tamino) Date: 2008-08-28 00:51
The specific thing that was happening for me is that an
_sqlite3.Connection object was in the dictionary. In
Modules/_sqlite/connection.c, in pysqlite_connection_dealloc(),
it uses Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS.

So it's the call to Py_DECREF that's interesting from my point
of view. I believe that if _ldict() sets self->dict to what it
should be for the current thread, and then calls Py_DECREF on
the old value, and then returns, then _ldict() is no longer
able to guarantee that self->dict will be set to the right
thing for the current thread after it returns (because if the
Py_DECREF ended up deallocating something like an sqlite3
connection, then it'd have released and reacquired the GIL).

Hence, in the patch I attached, the assignment into self->dict
is kept as the last thing that happens before _ldict() returns,
and I believe this means _ldict() can still make that guarantee.

Of course, I'd be all for changing local_getattro/local_setattro
to not need _ldict to make that guarantee! _ldict always *returns*
the correct pointer; it would be nice to make use of that somehow.
msg72061 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2008-08-28 04:44
fyi - This bug and #1868 (http://bugs.python.org/issue1868) seem related.
msg114563 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-08-21 20:49
> Of course, I'd be all for changing local_getattro/local_setattro
> to not need _ldict to make that guarantee! _ldict always *returns*
> the correct pointer; it would be nice to make use of that somehow.

Indeed. Therefore, perhaps we could break the problem in two:
- fix the reference leak by using the correct assignment idiom outlined earlier
- fix local_getattro / local_setattro to always use the proper dict, regardless of thread switches
msg114564 - (view) Author: Ben Cottrell (tamino) Date: 2010-08-21 21:06
The latest patch over in #1868 is working fine for my company in production, and solves #3710 as well. I think the only thing left to do on that patch is to make it special case "__dict__".
msg114575 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-08-21 22:12
Ok, then let's continue in #1868.
History
Date User Action Args
2022-04-11 14:56:38adminsetgithub: 47960
2011-02-14 06:58:55ysj.raysetnosy: + ysj.ray
2010-08-21 22:12:46pitrousetstatus: open -> closed
resolution: duplicate
superseder: threading.local doesn't free attrs when assigning thread exits
messages: + msg114575
2010-08-21 21:06:26taminosetmessages: + msg114564
2010-08-21 20:49:23pitrousetmessages: + msg114563
2010-06-09 21:29:09terry.reedysetversions: - Python 2.5
2010-05-11 20:39:48terry.reedysetversions: + Python 3.1, Python 2.7, Python 3.2, - Python 3.0
2008-08-28 04:44:08gregory.p.smithsetmessages: + msg72061
2008-08-28 04:36:06gregory.p.smithsetnosy: + gregory.p.smith
2008-08-28 00:51:20taminosetmessages: + msg72056
2008-08-28 00:37:59pitrousetmessages: + msg72055
2008-08-28 00:19:00taminosetmessages: + msg72054
2008-08-28 00:14:14pitrousetpriority: high
nosy: + pitrou
messages: + msg72053
versions: + Python 2.6, Python 3.0
2008-08-28 00:09:03taminocreate