Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use deque instead of list the threading.Condition waiter queue #61587

Closed
rhettinger opened this issue Mar 8, 2013 · 8 comments
Closed

Use deque instead of list the threading.Condition waiter queue #61587

rhettinger opened this issue Mar 8, 2013 · 8 comments
Assignees
Labels
performance Performance or resource usage

Comments

@rhettinger
Copy link
Contributor

BPO 17385
Nosy @gvanrossum, @tim-one, @rhettinger, @jcea, @pitrou, @asvetlov
Files
  • condition.diff: Proof of concept
  • condition2.diff: Minimal patch keeping existing fairness logic
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2013-03-11.00:58:36.842>
    created_at = <Date 2013-03-08.06:16:26.876>
    labels = ['performance']
    title = 'Use deque instead of list the threading.Condition waiter queue'
    updated_at = <Date 2013-03-11.01:46:52.437>
    user = 'https://github.com/rhettinger'

    bugs.python.org fields:

    activity = <Date 2013-03-11.01:46:52.437>
    actor = 'gvanrossum'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2013-03-11.00:58:36.842>
    closer = 'rhettinger'
    components = []
    creation = <Date 2013-03-08.06:16:26.876>
    creator = 'rhettinger'
    dependencies = []
    files = ['29348', '29369']
    hgrepos = []
    issue_num = 17385
    keywords = ['patch']
    message_count = 8.0
    messages = ['183727', '183810', '183869', '183875', '183876', '183899', '183905', '183908']
    nosy_count = 8.0
    nosy_names = ['gvanrossum', 'tim.peters', 'rhettinger', 'jcea', 'pitrou', 'asvetlov', 'neologix', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue17385'
    versions = ['Python 3.4']

    @rhettinger
    Copy link
    Contributor Author

    Condition variables implement a FIFO queue for waiting threads. The current implementation uses a regular Python list but could use a deque instead.

    A wait() call appends a new waiter. A notify() call removes the oldest waiter; this is an O(n) operation on list but only an O(1) operation on deques. A notify_all() call is O(n**2) for a list but only O(n) for a deque.

    If there is interest in this patch, I can add slicing support to collections.deque so that this patch won't need itertools.islice()

    @rhettinger rhettinger added the performance Performance or resource usage label Mar 8, 2013
    @pitrou
    Copy link
    Member

    pitrou commented Mar 9, 2013

    I don't think you need slicing if you rewrite the patch in another way, e.g.:

    for i in range(n):
        try:
            waiter = __waiters.popleft()
        except IndexError:
            break
        waiter.release()

    I think this is safe, since notify() must be called with the lock held: another thread shouldn't be able to mutate the waiters list in the meantime.

    As for notify_all(), it could be optimized to swap the internal list with an empty one: there's no need to pop the waiters one by one.

    @rhettinger
    Copy link
    Contributor Author

    Tim, do you remember why Condition.notify() went to great lengths to act as if the lock could be released after the check for self._is_owned()?

    It loops over its own a copy of __waiters, and the __waiters.remove(waiter) code is wrapped in a try/except to detect a situation where __waiters mutated during the release-loop. I'm presuming that defensive programming was put there for a reason.

    @pitrou
    Copy link
    Member

    pitrou commented Mar 10, 2013

    Actually, wait() calls self._waiters.remove() without holding the lock. But I think it could easily do so after taking the lock (since it takes it anyway before returning).

    Also, _waiters should better be a set, since wait() needs the associative behaviour when unregistering a waiter.

    notify() would then look like:

    for i in range(n):
        try:
            waiter = self._waiters.pop()
        except KeyError:
            break
        waiter.release()

    and wait() would look like:

    waiter = _allocate_lock()
    waiter.acquire()
    self._waiters.add(waiter)
    self._release_save()
    try:
        return waiter.acquire(timeout)
    finally:
        self._acquire_restore()
        self._waiters.discard(waiter)

    @pitrou
    Copy link
    Member

    pitrou commented Mar 10, 2013

    That said, I seem to remember a discussion of Condition's fairness.
    Right now, waiters are notified in the order of wait() calls. This wouldn't be the case anymore if using a set instead of a list or deque.

    Also, I can't remember a situation where I made an intensive use of a Condition (say, hundreds of calls per second), as opposed to Lock and RLock which can be heavily invoked to protect the integrity of critical data structures.

    @rhettinger
    Copy link
    Contributor Author

    Thanks Antoine. Since the calls are made without a lock, I'll go for a minimal patch and keep the existing fairness logic.

    Adding Guido to the nosy list since this is his code.

    FWIW, the heaviest load for condition variables likely arises with use of the Queue module which implements substantially all of its logic around three condition variables and a single lock.

    @rhettinger rhettinger self-assigned this Mar 10, 2013
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 11, 2013

    New changeset 0f86b51f8f8b by Raymond Hettinger in branch 'default':
    Issue bpo-17385: Fix quadratic behavior in threading.Condition
    http://hg.python.org/cpython/rev/0f86b51f8f8b

    @gvanrossum
    Copy link
    Member

    Looks fine. I'd say that it would be great to add slicing to deque!

    There's one oddity in the code, but it was there before the patch -- the local variable name __waiters is pretty silly. It appears to be a micro-optimization to avoid using self._waiters more than once; even if that is worth it (I kind of doubt it), the __private name is wrong and misguided.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants