classification
Title: Potential deadlock in concurrent futures when garbage collection occurs during Queue.get
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.3, Python 3.4
process
Status: closed Resolution: duplicate
Dependencies: Superseder: concurrent.futures.thread deadlock due to Queue in weakref callback
View: 32576
Assigned To: Nosy List: Chris.Farrow, Patrik Dufresne, bquinlan, davin, devurandom, gregory.p.smith, mark.dickinson, pitrou, simon.jagoe
Priority: normal Keywords:

Created on 2014-03-21 15:47 by simon.jagoe, last changed 2018-04-10 05:47 by devurandom. This issue is now closed.

Files
File name Uploaded Description Edit
executor-hang.py simon.jagoe, 2014-03-21 15:47 Script exercising hang
executor-hang-faulthandler.txt simon.jagoe, 2017-09-10 15:47
Messages (10)
msg214380 - (view) Author: Simon Jagoe (simon.jagoe) Date: 2014-03-21 15:47
At Enthought we have been tracking a deadlock in some code that turned out to be due to the following scenario:

  0) There is some cyclic garbage that requires collection; an object in the garbage is referred to by a weakref with a callback
  1) You have a lock that is acquired
  2) While the lock is held, the garbage collector runs (on the same thread)
  3) The weakref callback in (0) is called via the garbage collecter and the callback tries to acquire the same lock that was acquired in (1)

Attached is a simple script that exercises the issue in Python 3.3. The script monkey-patches Queue.get to force garbage collection after acquiring the Queue.not_empty lock.
msg215972 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2014-04-12 13:27
Adding Brian Quinlan to the nosy list.

I don't see an easy way around this other than documenting it and recommending that people do explicit executor shutdown where necessary.

Weakref callbacks are dangerous things.  Weakref callbacks that acquire a lock at any point doubly so.
msg274591 - (view) Author: Patrik Dufresne (Patrik Dufresne) Date: 2016-09-06 18:33
I've encounter this issue. To easily avoid this issue, I've change `queue.put(None)` to `queue.put(None, block=False)` to work around this.
msg275413 - (view) Author: Davin Potts (davin) * (Python committer) Date: 2016-09-09 19:59
Sounds related to issue14976.
msg275414 - (view) Author: Davin Potts (davin) * (Python committer) Date: 2016-09-09 20:02
Per the decision of issue14976.
msg275424 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-09-09 20:22
Davin: I don't see the link with #14976. There are no external signals involved in this issue. There's just a piece of simple pure Python code, which can deadlock if gc happens to kick in at the wrong moment.

I don't think this issue should be closed.
msg301168 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-09-02 18:16
I don't really understand this issue.  How is concurrent.futures supposed to be the culprit here?  Can you elaborate which weakref callback you are talking about?
msg301811 - (view) Author: Simon Jagoe (simon.jagoe) Date: 2017-09-10 15:47
In the script attached to the original issue, the weakref callback that causes the hang is the callback defined in ThreadPoolExecutor._adjust_thread_count

Attached is a faulthandler stack captured from Python 3.6.1. The script submitted here uses a patched queue.get() method to force garbage collection to reliably trigger the issue, but this deadlock was observed in a real application using threadpool executors.
msg310207 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2018-01-17 22:03
there's a PR on the other issue i opened for this without noticing this one so i'm marking this as the dup.
msg310212 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2018-01-17 22:11
There's a proposed fix in https://github.com/python/cpython/pull/5216
It would be nice if someone able to reproduce this issue could test it (the reproducer script here wouldn't work with the new SimpleQueue implementation).
History
Date User Action Args
2018-04-10 05:47:26devurandomsetnosy: + devurandom
2018-01-17 22:11:42pitrousetmessages: + msg310212
2018-01-17 22:03:30gregory.p.smithsetstatus: open -> closed

superseder: queue.Queue() is not reentrant, so signals and GC can cause deadlocks -> concurrent.futures.thread deadlock due to Queue in weakref callback

nosy: + gregory.p.smith
messages: + msg310207
resolution: duplicate
stage: patch review
2017-09-10 15:47:10simon.jagoesetfiles: + executor-hang-faulthandler.txt

messages: + msg301811
2017-09-02 18:16:53pitrousetnosy: + pitrou
messages: + msg301168
2016-09-09 20:27:37davinsetstatus: closed -> open
resolution: wont fix -> (no value)
2016-09-09 20:22:58mark.dickinsonsetmessages: + msg275424
2016-09-09 20:02:30davinsetstatus: open -> closed
resolution: wont fix
messages: + msg275414
2016-09-09 20:01:22davinsetsuperseder: queue.Queue() is not reentrant, so signals and GC can cause deadlocks
2016-09-09 19:59:07davinsetnosy: + davin
messages: + msg275413
2016-09-06 18:33:57Patrik Dufresnesetnosy: + Patrik Dufresne
messages: + msg274591
2014-04-12 13:27:14mark.dickinsonsetnosy: + bquinlan
messages: + msg215972
2014-03-21 16:05:59Chris.Farrowsetnosy: + Chris.Farrow
2014-03-21 15:47:58mark.dickinsonsetnosy: + mark.dickinson
2014-03-21 15:47:26simon.jagoecreate