concurrent.futures.thread deadlock due to Queue in weakref callback #76757
Comments
As a follow-up to https://bugs.python.org/issue14976, which just introduced queue.SimpleQueue: concurrent.futures.thread currently uses a queue.Queue() from weakref callbacks. This could in theory lead to a deadlock if periodic GC triggers a cleanup that invalidates some weakly referenced objects at an inopportune time, while Python code holds the queue's lock. I don't have a test case proving this (deadlocks are hard). Switching it to use a SimpleQueue should avoid the problem? ... This and the C extension module based SimpleQueue would be good to backport to https://pypi.python.org/pypi/futures as well.
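For illustration, here is a minimal sketch of the pattern under discussion: a weakref callback that posts a wakeup message to a queue. It is not the concurrent.futures.thread code itself; the `Worker` class, the `wakeups` queue, the `on_collect` callback and the `"collected"` message are all hypothetical names. It assumes Python 3.7+, where queue.SimpleQueue is available and its put() is documented as reentrant.

```python
import gc
import queue
import weakref


class Worker:
    """Hypothetical stand-in for an object tracked through a weak reference."""


# SimpleQueue.put() is documented as reentrant, so it is safe to call from a
# weakref callback that may fire while other code is using the same queue.
wakeups = queue.SimpleQueue()


def on_collect(ref):
    # Called when the referent is finalized, possibly in the middle of
    # unrelated Python code (for example during a cyclic GC pass).
    wakeups.put("collected")


def demo():
    worker = Worker()
    ref = weakref.ref(worker, on_collect)
    del worker          # drop the last strong reference
    gc.collect()        # not needed for this simple case, shown for emphasis
    return ref() is None, wakeups.get_nowait()
```

With a queue.Queue in place of the SimpleQueue, the same callback would have to acquire the queue's internal lock, which is the step that could block if the interrupted code in the same thread already holds it.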
We could also switch multiprocessing.Pool. Unfortunately some code in multiprocessing.Pool relies on internal details of queue.Queue (!).
See also https://bugs.python.org/issue21009 (#65208)
Thanks for the heads up Mark. Unfortunately the reproducer script in https://bugs.python.org/issue21009 needs to hack into Queue.get(), which isn't possible for the C-implemented SimpleQueue.
Could you get this fixed in earlier versions of CPython? Given that 3.7 is not yet released, having this broken in 3.5 and 3.6 is highly undesirable. This apparently affects our tooling, and telling our users to use the 3.7 beta is not an option.
Michał, sorry, I doubt it. The fix is highly non-trivial as it first requires backporting a new feature (see bpo-14976). I'm cc'ing the 3.6 branch release manager just in case. If you're apparently witnessing this in controlled situations (the "gemato" utility, IIUC?), one workaround would be to disable the cyclic GC (call gc.disable()). If your program doesn't create any cyclic references, or if those references don't keep too much memory alive, that would probably work.
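A minimal sketch of that workaround, assuming the suspect code can be wrapped in a single call. The helper name `run_without_cyclic_gc` is hypothetical; only gc.isenabled(), gc.disable() and gc.enable() are real standard-library calls.

```python
import gc


def run_without_cyclic_gc(func, *args, **kwargs):
    """Run func with the cyclic garbage collector switched off.

    Reference counting still frees non-cyclic objects; only the periodic
    cycle detection is suspended for the duration of the call.
    """
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        return func(*args, **kwargs)
    finally:
        # Restore the previous state rather than enabling unconditionally.
        if was_enabled:
            gc.enable()
```

Calling gc.disable() once at program start has the same effect for the whole process; the wrapper only limits the scope. Any reference cycles created while the collector is off stay in memory until it is enabled again or gc.collect() is called explicitly.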
Yes, backporting all of the required code to earlier releases would be out of scope for a maintenance release, particularly at this late stage in the 3.6 life cycle. Let's see whether disabling the GC is a sufficient workaround until 3.7 is available.
Well, according to the reporters disabling GC doesn't help at all. Maybe it's another issue.
Perhaps the gemato issue has nothing to do with multiprocessing indeed. I would suggest adding some progress logging to your program to see whether/where it actually hangs.
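One possible shape for such progress logging, as a sketch only: the `process_all` helper, the `handle` callable and the `"progress"` logger name are hypothetical and not taken from gemato.

```python
import logging

# Hypothetical logger name; the thread name in the format helps when the
# work is spread across a pool.
log = logging.getLogger("progress")


def configure_logging():
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(threadName)s %(message)s",
    )


def process_all(items, handle):
    """Call handle(item) for each item, logging before and after each one."""
    done = 0
    for item in items:
        log.info("starting %r", item)
        handle(item)
        done += 1
        log.info("finished %r (%d done)", item, done)
    return done
```

If the program hangs, the last "starting" line without a matching "finished" line points at the item being processed at the time. The standard library's faulthandler.dump_traceback_later() can additionally dump the stack of every thread after a timeout.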