classification
Title: Add a cancel method to asyncio Queues
Type: enhancement Stage: patch review
Components: asyncio Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Martin.Teichmann, asvetlov, cjrh, eamanu, yselivanov
Priority: normal Keywords: patch

Created on 2019-06-19 07:41 by Martin.Teichmann, last changed 2019-11-23 15:57 by Martin.Teichmann.

Pull Requests
URL Status Linked Edit
PR 14227 open Martin.Teichmann, 2019-06-19 07:43
Messages (15)
msg346023 - (view) Author: Martin Teichmann (Martin.Teichmann) * Date: 2019-06-19 07:41
When working with queues, it is not uncommon that at some point the producer stops producing data for good, or the consumer stops consuming, for example because a network connection broke down or some user simply closed the session.

In this situation it is very useful to simply cancel all the waiting getters and putters. A simple method, Queue.cancel, can do that.
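A minimal sketch of what such a method could look like, written as a subclass rather than the actual patch from the PR. It relies on the private `_getters`/`_putters` deques of waiter futures, which are CPython implementation details, and the `CancellableQueue` name is made up for this illustration:

```python
import asyncio

class CancellableQueue(asyncio.Queue):
    """Sketch of the proposal: cancel() wakes every task currently
    blocked in get() or put() with a CancelledError.  Uses the
    private _getters/_putters deques (a CPython implementation
    detail, not public API)."""

    def cancel(self):
        for waiter in self._getters:
            if not waiter.done():
                waiter.cancel()
        for waiter in self._putters:
            if not waiter.done():
                waiter.cancel()

async def main():
    q = CancellableQueue()
    getter = asyncio.create_task(q.get())
    await asyncio.sleep(0)   # let the getter block on the empty queue
    q.cancel()               # wake it with CancelledError
    try:
        await getter
    except asyncio.CancelledError:
        return 'getter cancelled'

print(asyncio.run(main()))
```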
msg346026 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2019-06-19 08:13
Sounds like `.close()` is a better name for the described behavior.
msg346113 - (view) Author: Martin Teichmann (Martin.Teichmann) * Date: 2019-06-20 11:09
I also thought about `.close()` but then found `.cancel()` more intuitive. But intuition is not universal, so I am open to any wording.
msg346268 - (view) Author: Caleb Hattingh (cjrh) * Date: 2019-06-22 03:00
I'm interested in how this change would affect the pattern of shutting down a queue-processing task.

How would one decide whether to cancel the queue or the task? (I'm asking for real, this is not an objection to the PR). For example, looking at the two tests in the PR:

    def test_cancel_get(self):
        queue = asyncio.Queue(loop=self.loop)

        getter = self.loop.create_task(queue.get())
        test_utils.run_briefly(self.loop)
        queue.cancel()                       # <---- HERE
        test_utils.run_briefly(self.loop)
        with self.assertRaises(asyncio.CancelledError):
            self.loop.run_until_complete(getter)

This test would work exactly the same if the `getter` task was cancelled instead, right? Like this:

    def test_cancel_get(self):
        queue = asyncio.Queue(loop=self.loop)

        getter = self.loop.create_task(queue.get())
        test_utils.run_briefly(self.loop)
        getter.cancel()                       # <---- HERE
        test_utils.run_briefly(self.loop)
        with self.assertRaises(asyncio.CancelledError):
            self.loop.run_until_complete(getter)

So my initial reaction is that I'm not sure under what conditions it would be more useful to cancel the queue instead of the task. I am very used to applying cancellation to tasks rather than the queues they contain, so I might lack imagination in this area. The idiom I've been using so far for consuming queues looks roughly something like this:

async def consumer(q: asyncio.Queue):
    while True:
        try:
            data = await q.get()
        except asyncio.CancelledError:
            q.put_nowait(None)  # ignore QueueFull for this discussion
            continue

        try:
            if not data:
                logging.info('Queue shut down cleanly')
                return  # <------ The only way to leave the coro
            ...  # process data
        except Exception:
            logging.exception('Unexpected exception:')
            continue
        finally:
            q.task_done()

^^ With this pattern, I can shut down the `consumer` task either by cancelling the task (internally it'll put a `None` on the queue) or by placing a `None` on the queue outright from anywhere else. The key point is that in either case, existing items on the queue will still get processed before the `None` is consumed, terminating the task from the inside.

(A) If the queue itself is cancelled (as in the proposed PR), would it still be possible to catch the `CancelledError` and continue processing whatever items have already been placed onto the queue? (and in this case, I think I'd still need to place a sentinel onto the queue to mark the "end"...is that correct?)

(B) The `task_done()` is important for app shutdown so that the application shutdown process waits for all currently-pending queue items to be processed before proceeding with the next shutdown step. So, if the queue itself is cancelled (as in the proposed PR), what happens to the application-level call to `await queue.join()` during the shutdown sequence, if a queue was cancelled while there were still unprocessed items on the queue for which `task_done()` had not been called?

It would be great to have an example of how the proposed `queue.cancel()` would be used idiomatically, w.r.t. the two questions above.  It might be intended that the idiomatic usage of `queue.cancel()` is for situations where one doesn't care about dropping items previously placed on the queue. Is that the case?
msg346276 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2019-06-22 12:32
Agree with Caleb.

The more I think about it, the more I doubt the proposal.
Cancellation is for tasks, not for queues or locks.

When would I need to cancel a queue but be unable to cancel a task?
msg346391 - (view) Author: Martin Teichmann (Martin.Teichmann) * Date: 2019-06-24 13:13
Given the reactions I gather "close" is a better name for the method, so I changed it accordingly.

In the current implementation, items that had been put on the queue but not processed yet still get processed after the close, and I think this is the desired behavior. I added a test such that this won't unexpectedly change in the future.

To be precise, the current implementation of the queue does not even put new items on the queue if there is already a waiting consumer. The item will directly be handed over to said consumer, which may hang around on the event loop for a bit longer, but during this time the item is not in the queue. This also answers the questions about catching the CancelledError: if there are waiting consumers, there is nothing on the queue, so the problem of processing leftover items does not exist. The same holds for the task_done.

As for "why don't I just cancel the task?": well, that works if you know which task to cancel. There may be many consumer or producer tasks waiting for their turn. Sure, you can keep a list of all those tasks. But that's exactly the point of the proposed change: the Queue already knows all the waiting tasks, so there is no need to keep another list up-to-date!
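For contrast, a sketch of the status quo this argument describes: without a method on the queue, the caller has to keep its own list of waiting tasks just to be able to cancel them later (the names and counts here are illustrative):

```python
import asyncio

async def main():
    q = asyncio.Queue()
    # The bookkeeping the proposed Queue method would make unnecessary:
    # the caller tracks every waiting task in its own list.
    waiters = [asyncio.create_task(q.get()) for _ in range(3)]
    await asyncio.sleep(0)          # let all three block on get()
    for t in waiters:
        t.cancel()                  # cancel each task individually
    results = await asyncio.gather(*waiters, return_exceptions=True)
    return [type(r).__name__ for r in results]

print(asyncio.run(main()))
# ['CancelledError', 'CancelledError', 'CancelledError']
```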
msg357204 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2019-11-21 19:43
This seems like a useful idea. I recommend writing a test implementation and playing with it.


Andrew:
> I think the proposal makes the queues API more error-prone: concurrent put() and close() produces unpredictable result on get() side.

How? Can you elaborate?


Caleb:
> I'm interested in how this change would affect the pattern of shutting down a queue-processing task.

Agree, this can be useful for that.


Martin:
> Given the reactions I gather "close" is a better name for the method, so I changed it accordingly.

Not sure I like "close" since it will *cancel* all getters and putters & discard all items in the queue AND allow further operation on the queue.  The latter part is really questionable -- what's the point of losing the data in the queue and resuming it?  Seems like a mechanism for writing unreliable code, but perhaps you can give us an example where this is necessary.
msg357211 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2019-11-21 20:02
1. Suppose we have 2 concurrent producers, a single queue and a consumer.
2. The first producer puts several items into the queue and then calls q.close()
3. The second producer also puts several items. It doesn't matter whether the second producer closes the queue at the end or not.
4. The consumer gets items from the queue and prints them.

What items are printed and which are cancelled/skipped? With the proposed PR it depends on the timing of the puts and gets and on the queue size.
The output can vary from zero items to all the pushed data.
That's why I think the idea is not reliable.
msg357270 - (view) Author: Martin Teichmann (Martin.Teichmann) * Date: 2019-11-22 11:49
Hi Andrew,

I still don't get your point. First, this is an extension to the asyncio library, so concurrency is not an issue. And sure, if you call random methods of an object without any reason, the outcome won't be anything useful; why, in your example, should one producer call close at all?

There is, however, a strong use case: If you have many producers but just one consumer and the consumer stops consuming, it should cancel all producers waiting on the queue. In the same way, if you have one producer but many consumers, once the single producer stops producing, it should cancel all waiting consumers. In these situations, the outcome is very clear.

Whether it should be possible to "resurrect" a closed/cancelled queue I don't care much, as I neither see a use case in doing so nor a compelling argument why it should be artificially prohibited. So I simply went for the most simple code.

The proposed code does something very simple, something a user can grasp. That is IMHO a better protection for users than some code trying to artificially stop users from harming themselves.
msg357283 - (view) Author: Caleb Hattingh (cjrh) * Date: 2019-11-22 14:06
Ok, I see now. The improvement with only a single producer/consumer might be marginal, but the proposition that `queue.cancel()` might simplify the situation with multiple producers and/or consumers is more compelling.

Usually, assuming M producers and N consumer threads (and 1 queue), one would typically stop the M producers outright, then place N `None` values on the queue; each consumer then shuts itself down when it receives a `None`. It's clunky, but it works.
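The sentinel pattern just described can be sketched as follows (a minimal, self-contained illustration; the `consumer`/`main` names and the item counts are made up for the example):

```python
import asyncio

async def consumer(name, q):
    while True:
        item = await q.get()
        try:
            if item is None:       # sentinel: this consumer shuts down
                return name
            # ... process item ...
        finally:
            q.task_done()          # runs even on the sentinel/return path

async def main():
    N = 3
    q = asyncio.Queue()
    consumers = [asyncio.create_task(consumer(f'c{i}', q)) for i in range(N)]
    for i in range(5):             # real work items
        q.put_nowait(i)
    for _ in range(N):             # one sentinel per consumer
        q.put_nowait(None)
    await q.join()                 # waits until every item is task_done()
    return await asyncio.gather(*consumers)

print(asyncio.run(main()))
# ['c0', 'c1', 'c2']
```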

This comes from experience with threaded code, but the difference with asyncio is that we have cancellation available to us, whereas we could not cancel threads. I think this is why I've carried over my queue-idioms from threads to asyncio. So this is an interesting idea: if we introduce cancellation to queue handling, do things get better for the MxN producers and consumers design?

Rehashing, what I might expect to happen when I call `queue.cancel()` is that:

1. A CancelledError (or maybe `QueueCancelled`?) exception is raised in all producers and consumers - this gives a producer a chance to handle the error and do something with the waiting item that could not be `put()`
2. Items currently on the queue still get processed by the consumers before the consumers exit.

I think (1) is easy, but I expected (2) to be more tricky. You said it already works that way in your PR. I didn't believe it, so I wrote a silly program to check, and it does! All pending items on the queue are still consumed by consumers even after the queue._getters futures are cancelled.

So yes, both (1) and (2) appear to work.

> As for the "why don't I just cancel the task?", well, if you know it. There may be many consumer or producer tasks waiting for their turn. Sure, you can keep a list of all those tasks. But that's exactly the point of the proposed change: the Queue already knows all the waiting tasks, no need to keep another list up-to-date!

Finally - while I think the MxN producers/consumers case might be simplified by this, it's worth noting that usually in shutdown, *all* the currently-pending tasks are cancelled anyway. And as I said before, in an MxN queue scenario, one might place N `None` values on the queue, and then just send `CancelledError` to everything anyway (consumers will ignore the cancellation and just wait for the `None`s to exit). This works well.  

Thus, if `queue.cancel()` were available to me right now, the primary difference as I see it would be that during shutdown, instead of placing N `None` values on the queue, I would instead call `queue.cancel()`. I agree that's a bit neater. (It will, however, still be necessary to absorb CancelledError in the consumers, e.g. what is raised by `asyncio.run()` during shutdown, so that's unchanged).

I agree with Yury that I don't like `queue.close`. "Cancel" seems better after all.

I disagree with Yury that items are discarded - I checked that already-present items on the queue will still be consumed by consumers, before the `queue.close` cancellation is actually raised.
msg357303 - (view) Author: Martin Teichmann (Martin.Teichmann) * Date: 2019-11-22 17:31
Yes, in the one-producer-many-consumers situation one can indeed do the trick with the None. But this is just a clumsy hack; cancelling the tasks is IMHO more in line with asyncio.

In the many-producers-one-consumer scenario this does not work. The one dead consumer cannot feed back to the producers. Sure, there are still many hacks imaginable, but closing or cancelling the queue is a very clear way of doing things.

As for the naming: I personally don't care, close() or cancel() are both fine with me.
msg357305 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2019-11-22 17:47
My point is that closing (or cancellation) should be a one-way ticket.

Is there a case where continuing after stopping makes sense?
msg357307 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2019-11-22 17:54
> 1. A CancelledError (or maybe`QueueCancelled`?) exception is raised in all producers and consumers ) - this gives a producer a chance to handle the error and do something with the waiting item that could not be `put()`

> 2. Items currently on the queue still get processed in the consumers before the consumers exit.

This (especially 2) sounds quite tricky.

Maybe we should just add methods to Queue to get the lists of pending consumers and pending producers? This way users can implement whatever semantics they want (probably :wink:).
msg357376 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2019-11-23 12:06
Lists of producers/consumers are just lists of future objects.

I don't think that we need to expose futures in high-level public API.
msg357381 - (view) Author: Martin Teichmann (Martin.Teichmann) * Date: 2019-11-23 15:57
I do not think that exposing the lists of futures does any good. I cannot come up with a semantics that could be implemented with it other than the one proposed.
Also, I think that close() or cancel() is something a reader intuitively understands, while looping over a list of futures is not immediately obvious.
History
Date User Action Args
2019-11-23 15:57:13 Martin.Teichmann set messages: + msg357381
2019-11-23 12:06:43 asvetlov set messages: + msg357376
2019-11-22 22:00:29 eamanu set nosy: + eamanu
2019-11-22 17:54:20 yselivanov set messages: + msg357307
2019-11-22 17:47:31 asvetlov set messages: + msg357305
2019-11-22 17:31:14 Martin.Teichmann set messages: + msg357303
2019-11-22 14:06:05 cjrh set messages: + msg357283
2019-11-22 11:49:20 Martin.Teichmann set messages: + msg357270
2019-11-21 20:02:32 asvetlov set messages: + msg357211
2019-11-21 19:43:02 yselivanov set messages: + msg357204
2019-06-24 13:13:58 Martin.Teichmann set messages: + msg346391
2019-06-22 12:32:07 asvetlov set messages: + msg346276
2019-06-22 03:00:41 cjrh set nosy: + cjrh; messages: + msg346268
2019-06-20 11:09:37 Martin.Teichmann set messages: + msg346113
2019-06-19 08:13:09 asvetlov set messages: + msg346026
2019-06-19 07:43:15 Martin.Teichmann set keywords: + patch; stage: patch review; pull_requests: + pull_request14061
2019-06-19 07:41:48 Martin.Teichmann create