classification
Title: threading.local() must be run at module level (doc improvement)
Type: behavior
Stage: resolved
Components:
Versions: Python 3.4, Python 3.5, Python 2.7

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: docs@python, eric.snow, eryksun, ethan.furman, paul.moore, rhettinger
Priority: normal
Keywords:

Created on 2015-04-21 13:42 by ethan.furman, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (14)
msg241713 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015-04-21 13:42
In order to work correctly, threading.local() must be run in global scope, yet that tidbit is missing from both the docs and the _threading_local.py file.

Something like:

.. note::
   threading.local() must be run at global scope to function properly.

That would have saved me hours of time.  Thank goodness for SO!  ;)
msg241714 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2015-04-21 14:00
Could you clarify what the problem is? I have no apparent problem using threading.local in a function scope:

    import threading

    def f():
        tlocal = threading.local()
        tlocal.x = 0
        def g():
            tlocal.x = 1
            print('tlocal.x in g:', tlocal.x)
        t = threading.Thread(target=g)
        t.start()
        t.join()
        print('tlocal.x in f:', tlocal.x)

    >>> f()
    tlocal.x in g: 1
    tlocal.x in f: 0
msg241715 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-04-21 14:27
Also, don't use a ".. note::", regular sentences work fine, especially in documentation that is already very short.
msg241718 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015-04-21 15:44
Raymond, okay, thanks.

Eryksun, I've written a FUSE file system (for $DAYJOB) and when I switched over to using threads I would occasionally experience errors such as 'thread.local object does not have attribute ...'; as soon as I found the SO answer and moved the call to 'threading.local()' to the global scope, the problem vanished.

To reliably detect the problem I started approximately 10 threads, each calling os.listdir() 1,000 times on an area of the FUSE file system.
msg241719 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2015-04-21 15:51
Link to the SO answer? Does it explain *why* this is a requirement?
msg241724 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015-04-21 16:19
http://stackoverflow.com/q/1408171/208880

No, it just says (towards the top):
----------------------------------
> One important thing that everybody seems to neglect to mention is that writing
> threadLocal = threading.local() at the global level is required. Calling
> threading.local() within the worker function will not work.

It is now my experience that "will not work" (reliably) is accurate.
msg241732 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2015-04-21 18:48
That seems to merely be saying that each threading.local() object is distinct, so if you want to share threadlocal data between workers, creating local objects won't work.

I think I see what the confusion is (although I can't quite explain it yet; I'll need to think some more about it), but "threading.local() needs to be run at global scope" is not accurate. For example, if I understand correctly, a class attribute which is a threading.local value would work fine, and it's not "global scope".

Basically, each time you call threading.local() you get a brand new object. It looks like a dictionary, but in fact it's a *different* dictionary for each thread. Within one thread, though, you can have multiple threading.local() objects, and they are independent.
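
A minimal sketch of the point above (the names a, b, seen and worker are made up for illustration, not taken from anyone's code): two calls give two independent objects, and a second thread starts with an empty view of an object the main thread has already written to.

```python
import threading

a = threading.local()
b = threading.local()   # a second, unrelated namespace

a.x = "main"
b.x = "other"

seen = {}   # ordinary dict, used only to report what the worker observed

def worker():
    # This thread has not set anything on `a` yet, so `a.x` does not exist here.
    seen["has_x_before"] = hasattr(a, "x")
    a.x = "worker"
    seen["x_in_worker"] = a.x

t = threading.Thread(target=worker)
t.start()
t.join()

print(a is b)     # False
print(seen)       # {'has_x_before': False, 'x_in_worker': 'worker'}
print(a.x, b.x)   # main other
```

The main thread's a.x is untouched by the worker's assignment, and b never sees anything stored on a.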

The "wrong" code in the SO discussion created a new threading.local() object as a local variable in a function, and tried to use it to remember state from one function call to the next (like a C static variable). That would be just as wrong in a single-threaded program where you used dict() instead of threading.local(), and for the same reasons.

I don't know what your code was doing, so it may well be that the problem you were encountering was more subtle than the one in the wont_work() function. But "threading.local() must be run in global scope" is *not* the answer (even if doing that resulted in your problem going away).
msg241733 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2015-04-21 18:48
I should also say, I'll try to work up a doc patch for this, once I've got my head round how to explain it :-)
msg241734 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-04-21 19:11
FYI, I've used thread-local namespaces with success in several different ways and none of them involved binding the thread-local namespace to global scope.  I don't think anything needs to be fixed here.

The SO answer is misleading and perhaps even wrong.  The problem it describes is about sharing the thread-local NS *between function calls*.  Persisting state between function calls is not a new or mysterious problem, nor unique to thread-local namespaces.  In the example they give, rather than a global they could have put it into a default arg or into a class:

    import threading

    def hi(threadlocal=threading.local()):
        ...

    class Hi:
        threadlocal = threading.local()
        def __call__(self):
            ...  # change threadlocal to self.threadlocal

    hi = Hi()

This is simply a consequence of Python's normal scoping rules (should be unsurprising) and the fact that threading.local is a class (new instance per call) rather than a function (with the assumption of a singleton namespace per thread).

At most the docs could be a little more clear that threading.local() produces a new namespace each time.  However, I don't think even that is necessary and suggest closing this as won't fix.
msg241737 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2015-04-21 19:52
You're right, the SO answer is simply wrong. I've posted a (hopefully clearer) answer. If anyone wants to check it for accuracy, that'd be great.

Agreed this can probably be closed as "not a bug". On first reading, I thought the docs could do with clarification, but now I think that was just because I had been confused by the SO posting :-)
msg241749 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015-04-21 22:11
Here's a basic outline of what I was trying:
-------------------------------------------

CONTEXT = None

class MyFUSE(Fuse):

   def __call__(self, op, path, *args):
      global CONTEXT
      ...
      CONTEXT = threading.local()
      # set several CONTEXT vars
      ...
      # dispatch to correct function to handle 'op'

Under stress, I would eventually get threading.local objects that were missing attributes.

Points to consider:
- I have no control over the threads; they just arrive wanting their
  'op's fulfilled
- the same thread can be a repeat customer, but with the above scenario they would/should
  get a new threading.local each time

Hmmm... could my problem be that even though function locals are thread-safe, the globals are not, so trying to create a threading.local via a global statement was clobbering other threading.local instances?  While that would make sense, I'm still completely clueless why having a single global statement, which (apparently) creates a single threading.local object, could be distinct for all the threads... unless, of course, it can detect which thread is accessing it and react appropriately.  Okay, that's really cool.

So I was doing two things wrong:
- calling threading.local() inside a function (although this would
  probably work if I then passed that object around, as I do not
  need to persist state across function calls -- wait, that would
  be the same as using an ordinary, function-local dict, wouldn't
  it?)
- attempting to assign the threading.local object to a global
  variable from inside a function (this won't work, period)

Many thanks for helping me figure that out.

Paul, in your SO answer you state:
---------------------------------
Just like an ordinary object, you can create multiple threading.local instances in your code. They can be local variables, class or instance members, or global variables.

- Local variables are already thread-safe, aren't they?  So there
  would be no point in using threading.local() there.
- Instance members (set from __init__ or some other method): wouldn't
  that be the same problem I was having trying to update a
  non-threadsafe global with a new threading.local() each time?

It seems to me the take-away here is that you only want to create a threading.local() object /once/ -- if you are creating the same threading.local() object more than once, you're doing it wrong.
msg241750 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-04-21 22:25
@Ethan, it may help you to read through the module docstring in Lib/_threading_local.py.
msg241751 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2015-04-21 22:29
On 21 April 2015 at 23:11, Ethan Furman <report@bugs.python.org> wrote:
> Hmmm... could my problem be that even though function locals are thread-safe, the globals are not, so trying to create a threading.local via a global statement was clobbering other threading.local instances?  While that would make sense, I'm still completely clueless why having a single global statement, which (apparently) creates a single threading.local object, could be distinct for all the threads... unless, of course, it can detect which thread is accessing it and react appropriately.  Okay, that's really cool.

You're not creating a single threading.local object. You're creating one
on each __call__() and overwriting the old one.

> So I was doing two things wrong:
> - calling threading.local() inside a function (although this would
>   probably work if I then passed that object around, as I do not
>   need to persist state across function calls -- wait, that would
>   be the same as using an ordinary, function-local dict, wouldn't
>   it?)

Yes, a dict should be fine if you're only using it within the one function call.

> - attempting to assign the threading.local object to a global
>   variable from inside a function (this won't work, period)

It does work, it's just there isn't *the* object, there's lots and you
keep overwriting.

The thread safety issue is that if you write over the global in one
thread, before another thread has finished, you lose the second
thread's values (because they were on the old, lost namespace). So
basically you'd see unpredictable, occasional losses of all your
CONTEXT vars in a thread.
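
To make that concrete, here is a rough sketch that forces the bad interleaving with events instead of waiting for it to happen under load. The function names, the path attribute and the two Event objects are invented for the illustration; only the CONTEXT global mirrors the outline posted earlier.

```python
import threading

CONTEXT = None
first_ready = threading.Event()
replaced = threading.Event()
result = {}

def first_call():
    global CONTEXT
    CONTEXT = threading.local()
    CONTEXT.path = "/a"
    first_ready.set()
    replaced.wait()   # stand-in for being preempted in the middle of a request
    try:
        result["first"] = CONTEXT.path
    except AttributeError as exc:
        # CONTEXT now names the object made by second_call, and this
        # thread never stored anything on that one.
        result["first"] = "lost: " + type(exc).__name__

def second_call():
    global CONTEXT
    first_ready.wait()
    CONTEXT = threading.local()   # rebinds the global that every thread reads
    CONTEXT.path = "/b"
    replaced.set()

threads = [threading.Thread(target=first_call),
           threading.Thread(target=second_call)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(result)   # {'first': 'lost: AttributeError'}
```

Without the events the same thing happens only when the scheduler switches threads at the wrong moment, which would explain why it showed up occasionally and only under stress.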

> Many thanks for helping me figure that out.

(If you did :-) - hope the clarifications above helped).

> Paul, in your SO answer you state:
> ---------------------------------
> Just like an ordinary object, you can create multiple threading.local instances in your code. They can be local variables, class or instance members, or global variables.
>
> - Local variables are already thread-safe, aren't they?  So there
>   would be no point in using threading.local() there.

Not unless you're going to return them from your function, or
something like that. But yes, it's unlikely they will be needed there.
I only mentioned it to avoid giving any impression that "only set at
global scope" was important.

> - Instance members (set from __init__ or some other method): wouldn't
>   that be the same problem I was having trying to update a
>   non-threadsafe global with a new threading.local() each time?

You'd set vars on the namespace held in the instance variable. It's
much like the local variable case, except you are more likely to pass
an instance around between threads.

> It seems to me the take-away here is that you only want to create a threading.local() object /once/ -- if you are creating the same threading.local() object more than once, you're doing it wrong.

Well, sort of. You can only create *any* object once :-) There seems
to be a confusion (in the SO thread and with you, maybe) that
threading.local objects are somehow singletons in that you "create"
them repeatedly and get the same object. That's just wrong - they are
entirely normal objects, that you can set arbitrary attributes on. The
only difference is that each thread sees an independent set of
attributes on the object.
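
As a sketch of what that means for the outline earlier in the thread (this is an assumption about how it could be restructured, not the actual code; handle() and describe() are hypothetical stand-ins for the dispatcher and an op handler, and the Fuse base class is left out): create the object once, and only set attributes on it in each call.

```python
import threading

CONTEXT = threading.local()   # created once; every thread shares this object

def describe():
    # Reads whatever the *current* thread stored on CONTEXT.
    return f"{CONTEXT.op} {CONTEXT.path}"

def handle(op, path):
    # Hypothetical stand-in for the __call__ dispatcher in the outline.
    CONTEXT.op = op
    CONTEXT.path = path
    return describe()

mismatches = []

def worker(n):
    expected = f"listdir /dir{n}"
    for _ in range(1000):
        if handle("listdir", f"/dir{n}") != expected:
            mismatches.append(n)

threads = [threading.Thread(target=worker, args=(n,)) for n in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(mismatches)   # []
```

Nothing is ever rebound here, so no thread can lose its attributes, and a repeat-customer thread simply overwrites its own values on the next call.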
msg241752 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-04-21 22:34
Think of threading.local this way: instances of threading.local are shared between all the threads, but the effective "__dict__" of each instance is per-thread.  Basically, the object stores a dict for each thread.  In __getattribute__, __setattr__, and __delattr__ it swaps the dict for the current thread into place and then proceeds normally.
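
A toy model of that description, for intuition only. SketchLocal is a made-up class, not the real implementation: it looks values up in a table keyed by threading.get_ident() rather than swapping __dict__ into place, and it never cleans up after threads that have exited.

```python
import threading

class SketchLocal:
    """Toy model only; not how threading.local is actually implemented."""

    def __init__(self):
        # Use object.__setattr__ so these two bookkeeping attributes land in
        # the real instance __dict__ instead of the per-thread table.
        object.__setattr__(self, "_dicts", {})
        object.__setattr__(self, "_lock", threading.Lock())

    def _current(self):
        # One dict per thread, created on first use.
        with self._lock:
            return self._dicts.setdefault(threading.get_ident(), {})

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self._current()[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self._current()[name] = value

    def __delattr__(self, name):
        try:
            del self._current()[name]
        except KeyError:
            raise AttributeError(name) from None

ns = SketchLocal()
ns.x = 1
observed = {}

def worker():
    observed["has_x"] = hasattr(ns, "x")   # this thread's dict is still empty
    ns.x = 2
    observed["x"] = ns.x

t = threading.Thread(target=worker)
t.start()
t.join()

print(observed, ns.x)   # {'has_x': False, 'x': 2} 1
```

One instance, shared by both threads, with a separate attribute dict behind it for each: that is the whole trick.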
History
Date                 User          Action  Args
2022-04-11 14:58:15  admin         set     github: 68208
2015-04-21 22:34:42  eric.snow     set     messages: + msg241752
2015-04-21 22:29:57  paul.moore    set     messages: + msg241751
2015-04-21 22:25:30  eric.snow     set     messages: + msg241750
2015-04-21 22:11:31  ethan.furman  set     messages: + msg241749
2015-04-21 21:53:42  eric.snow     set     status: open -> closed; type: behavior; resolution: not a bug; stage: resolved
2015-04-21 19:52:27  paul.moore    set     messages: + msg241737
2015-04-21 19:11:41  eric.snow     set     nosy: + eric.snow; messages: + msg241734
2015-04-21 18:48:59  paul.moore    set     messages: + msg241733
2015-04-21 18:48:08  paul.moore    set     messages: + msg241732
2015-04-21 16:19:21  ethan.furman  set     messages: + msg241724
2015-04-21 15:51:17  paul.moore    set     nosy: + paul.moore; messages: + msg241719
2015-04-21 15:44:53  ethan.furman  set     messages: + msg241718
2015-04-21 14:27:50  rhettinger    set     nosy: + rhettinger; messages: + msg241715
2015-04-21 14:00:21  eryksun       set     nosy: + eryksun; messages: + msg241714
2015-04-21 13:42:25  ethan.furman  create