Thread hangs on str.encode() when locale is not set #71574
This bug manifests itself in at least one very specific situation:
[Environment with no locale set]:
$ echo $LANG
[test1.py]:
import test2

[test2.py]:
from threading import Thread

class TestThread(Thread):
    def run(self):
        msg = 'Error from server: code=000a'
        print msg
        msg = msg.encode('utf-8')

t = TestThread()
t.start()
t.join()
[Expected behavior]:
$ python test1.py
Error from server: code=000a
done

[Actual behavior]:
$ python test1.py
Error from server: code=000a
[script hangs here indefinitely]

Much thanks to Alan Boudreault, a developer of the cassandra-driver Python package, for helping me locate this bug and further narrow it down to the threading module. The above code snippet was copied from his comment on my issue over there (https://datastax-oss.atlassian.net/browse/PYTHON-592).

Another curious behavior is that if you modify test1.py to decode any string prior to the import, it implicitly fixes the issue:

[test1.py']:
I realize that one should probably always have a locale set; however, this proved to be very difficult to isolate, especially given that it works if no import occurs or if a string is decoded prior to the import.
It is a deadlock on the import lock. You should avoid creating and waiting
This situation is warned about explicitly in the threading docs (https://docs.python.org/2/library/threading.html#importing-in-threaded-code). The import deadlock is fixed in Python 3, but it is still a really bad idea to launch threads on module import. What isn't obvious, of course, is that calling encode for the first time for a given encoding does an implicit import of the relevant encoding module. I don't think encodings is the only stdlib module that does implicit imports, but it is probably the most commonly used one. Maybe it is worth adding a warning to that section of the 2.7 docs about implicit imports in general and encode/decode in particular?
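The implicit import described above can be observed directly by checking sys.modules around a first call to encode. The following is only a minimal sketch: the helper name encode_triggers_import is hypothetical, and cp500 is chosen on the assumption that nothing has used that codec earlier in the process.

```python
import sys


def encode_triggers_import(encoding="cp500"):
    """Report whether the codec module is loaded before and after an encode.

    Hypothetical helper for illustration only. Returns a tuple
    (loaded_before, loaded_after) for the module "encodings.<encoding>".
    """
    module_name = "encodings." + encoding
    loaded_before = module_name in sys.modules
    # The codec lookup behind encode() imports the codec module on first use.
    u"abc".encode(encoding)
    loaded_after = module_name in sys.modules
    return loaded_before, loaded_after
```

On a fresh interpreter the first element would typically be False, since the codec module has not been needed yet. In Python 2, that import is the step that needs the import lock, which the main thread still holds while it waits in t.join().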
Adding a note to the docs sounds reasonable.
Ok to add a note to str.encode and str.decode methods to explain that I'm not ok for a warning, we should not discourage developers to use
No, I'm talking about the threading docs, not the encoding docs. I think that's the only place it matters. Specifically, in the section that I linked to, in the bullet point that warns against launching threads on import, it can note that even if you try to make your own code avoid the import lock, implicit imports such as the one done by encode/decode can trip you up.
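For reference, one way to follow the threading docs' advice is to keep thread start-up out of module import entirely. This is a rough sketch rather than a tested fix for the original report: it reuses the TestThread name from test2.py, while main() and the encoded attribute are hypothetical additions used only to show the structure.

```python
from threading import Thread


class TestThread(Thread):
    def __init__(self):
        Thread.__init__(self)
        self.encoded = None  # hypothetical attribute holding the result

    def run(self):
        msg = 'Error from server: code=000a'
        # The first use of a codec may import its module implicitly.
        self.encoded = msg.encode('utf-8')


def main():
    # The thread is started from a function call, not as a side effect
    # of importing this module, so no import is in progress while the
    # caller waits in join().
    t = TestThread()
    t.start()
    t.join()
    return t.encoded


if __name__ == '__main__':
    main()
```

With this layout, importing the module only defines the class and the function. The thread runs only when the module is executed as a script or when a caller invokes main() after the import has finished.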
Python 2 issue.