classification
Title: Use deque instead of list the threading.Condition waiter queue
Type: performance Stage:
Components: Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: asvetlov, gvanrossum, jcea, neologix, pitrou, python-dev, rhettinger, tim.peters
Priority: normal Keywords: patch

Created on 2013-03-08 06:16 by rhettinger, last changed 2013-03-11 01:46 by gvanrossum. This issue is now closed.

Files
File name Uploaded Description Edit
condition.diff rhettinger, 2013-03-08 06:16 Proof of concept review
condition2.diff rhettinger, 2013-03-10 21:03 Minimal patch keeping existing fairness logic review
Messages (8)
msg183727 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-03-08 06:16
Condition variables implement a FIFO queue for waiting threads.  The current implementation uses a regular Python list but could use a deque instead.

A wait() call appends a new waiter.   A notify() call removes the oldest waiter; this is an O(n) operation on list but only an O(1) operation on deques.  A notify_all() call is O(n**2) for a list but only O(n) for a deque.

If there is interest in this patch, I can add slicing support to collections.deque so that this patch won't need itertools.islice()
msg183810 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-03-09 11:30
I don't think you need slicing if you rewrite the patch in another way, e.g.:

for i in range(n):
    try:
        waiter = __waiters.popleft()
    except IndexError:
        break
    waiter.release()

I think this is safe, since notify() must be called with the lock held: another thread shouldn't be able to mutate the waiters list in the meantime.

As for notify_all(), it could be optimized to swap the internal list with an empty one: there's no need to pop the waiters one by one.
msg183869 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-03-10 06:35
Tim, do you remember why Condition.notify() went to great lengths to act as if the lock could be released after the check for self._is_owned()?  

It loops over its own a copy of __waiters, and the __waiters.remove(waiter) code is wrapped in a try/except to detect a situation where __waiters mutated during the release-loop.  I'm presuming that defensive programming was put there for a reason.
msg183875 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-03-10 13:47
Actually, wait() calls self._waiters.remove() without holding the lock. But I think it could easily do so after taking the lock (since it takes it anyway before returning).

Also, _waiters should better be a set, since wait() needs the associative behaviour when unregistering a waiter.

notify() would then look like:

for i in range(n):
    try:
        waiter = self._waiters.pop()
    except KeyError:
        break
    waiter.release()

and wait() would look like:

waiter = _allocate_lock()
waiter.acquire()
self._waiters.add(waiter)
self._release_save()
try:
    return waiter.acquire(timeout)
finally:
    self._acquire_restore()
    self._waiters.discard(waiter)
msg183876 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-03-10 14:11
That said, I seem to remember a discussion of Condition's fairness.
Right now, waiters are notified in the order of wait() calls. This wouldn't be the case anymore if using a set instead of a list or deque.

Also, I can't remember a situation where I made an intensive use of a Condition (say, hundreds of calls per second), as opposed to Lock and RLock which can be heavily invoked to protect the integrity of critical data structures.
msg183899 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-03-10 21:03
Thanks Antoine.  Since the calls are made without a lock, I'll go for a minimal patch and keep the existing fairness logic.

Adding Guido to the nosy list since this is his code.

FWIW, the heaviest load for condition variables likely arises with use of the Queue module which implements substantially all of its logic around three condition variables and a single lock.
msg183905 - (view) Author: Roundup Robot (python-dev) Date: 2013-03-11 00:58
New changeset 0f86b51f8f8b by Raymond Hettinger in branch 'default':
Issue #17385: Fix quadratic behavior in threading.Condition
http://hg.python.org/cpython/rev/0f86b51f8f8b
msg183908 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-03-11 01:46
Looks fine. I'd say that it would be great to add slicing to deque!

There's one oddity in the code, but it was there before the patch -- the local variable name __waiters is pretty silly. It appears to be a micro-optimization to avoid using self._waiters more than once; even if that is worth it (I kind of doubt it), the __private name is wrong and misguided.
History
Date User Action Args
2013-03-11 01:46:52gvanrossumsetmessages: + msg183908
2013-03-11 00:58:36rhettingersetstatus: open -> closed
resolution: fixed
2013-03-11 00:58:08python-devsetnosy: + python-dev
messages: + msg183905
2013-03-10 21:08:11asvetlovsetnosy: + asvetlov
2013-03-10 21:03:05rhettingersetfiles: + condition2.diff

nosy: + gvanrossum
messages: + msg183899

assignee: rhettinger
2013-03-10 14:11:44pitrousetmessages: + msg183876
2013-03-10 13:47:27pitrousetmessages: + msg183875
2013-03-10 06:35:45rhettingersetnosy: + tim.peters
messages: + msg183869
2013-03-10 02:23:03jceasetnosy: + jcea
2013-03-09 11:30:16pitrousetnosy: + neologix
messages: + msg183810
2013-03-08 06:16:26rhettingercreate