Issue46771
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2022-02-16 17:59 by gvanrossum, last changed 2022-04-11 14:59 by admin.
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 31270 | merged | gvanrossum, 2022-02-16 18:00 | |
PR 31394 | merged | asvetlov, 2022-02-17 21:51 | |
PR 31398 | merged | gvanrossum, 2022-02-17 23:58 | |
PR 31415 | closed | tinchester, 2022-02-19 04:03 | |
PR 31434 | merged | tinchester, 2022-02-20 22:54 | |
PR 31483 | merged | tinchester, 2022-02-22 01:13 | |
PR 31508 | closed | tinchester, 2022-02-22 19:11 | |
PR 31513 | merged | tinchester, 2022-02-22 23:29 | |
PR 31623 | merged | gvanrossum, 2022-02-28 20:21 |
Messages (76) | |||
---|---|---|---|
msg413345 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-16 17:59 | |
Now that TaskGroup is merged (see bpo-46752) we might consider adding some form of cancel scopes (another Trio idea). There's a sensible implementation we could use as a starting point in @asvetlov's async-timeout package (https://github.com/aio-libs/async-timeout). |
|||
msg413347 - (view) | Author: Tin Tvrtković (tinchester) * | Date: 2022-02-16 18:12 | |
I've essentially forked `async-timeout` (a very good library) into Quattro cancel scopes: https://github.com/Tinche/quattro/blob/main/src/quattro/cancelscope.py. The differences are: * the API follows Trio, so instead of `timeout` you'd use `fail_after` or `move_on_after` * instead of `async with timeout`, you use a normal context manager `with fail_after`. The Trio folks think this is important (less suspension points, less race conditions) and I agree * it's somewhat composable (as much as possible under asyncio), each scope knows if the CancelError is meant for it or should be propagated further. This is implemented by using the CancelError message to carry a nonce. This only works on 3.9+, but here that's not a problem * small deadline adjustment differences, I use a setter on the deadline instead of `update` and `shift` * it's fully type annotated, but so is Andrew's Let me know if this sounds interesting. |
|||
msg413349 - (view) | Author: Tin Tvrtković (tinchester) * | Date: 2022-02-16 18:15 | |
Oh, and Trio's `current_effective_deadline` is also in. |
|||
msg413350 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-16 18:19 | |
Sure, we should create the best possible solution. We have no CI in the stdlib that checks type annotations, so those should probably be moved to a stub file in typeshed. (Ditto for asyncio taskgroups.py.) Using the new .cancelling()/.uncancel() API added to Task you might be able to avoid hacks using the cancel msg (check how it's used in the new TaskGroup). |
|||
msg413354 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-16 18:56 | |
A brief explanation of cancel scopes for the uninitiated: A cancel scope can enclose just a part of a coroutine function, or an entire group of tasks. They can be nested within each other (by using them as context managers), and marked as shielded from cancellation, which means cancellation won't be propagated (i.e. raised in the coroutine function) from a cancelled outer scope until either the inner scope's shielding is disabled or the inner scope is exited or cancelled directly. The fundamental problem in implementing these on top of asyncio is that native task cancellation can throw a wrench in these gears. Since a cancel scope works by catching a cancellation error and then (potentially) allowing the coroutine to proceed, it would have to know, when catching a cancellation error, if the cancellation error was targeted at a cancel scope or the task itself. A workaround for this, made possible in Python 3.9, is to (ab)use cancellation messages to include the ID of the target cancel scope. This only solves half of the problem, however. If the task is already pending a cancellation targeted at a cancel scope, the task itself cannot be cancelled anymore since calling cancel() again on the task is a no-op. This would be solved by updating the cancel message on the second call. The docs don't say anything about the behavior on the second call, so it's not strictly speaking a change in documented behavior. Then, on the subject of level cancellation: level cancellation builds upon cancel scopes and changes cancellation behavior so that whenever a task yields while a cancelled cancel scope is in effect, it gets hit with a CancelledError every time, as opposed to just once in asyncio's "edge" style cancellation. Another very important difference is that with level cancellation, even a task that starts within a cancelled scope gets to run up until the first yield point. This gives it an opportunity to clean up any resources it was given ownership of (a connected socket in a socket server is a common, practical example of this). This is what the asyncio documentation states about Task.cancel(): "This arranges for a CancelledError exception to be thrown into the wrapped coroutine on the next cycle of the event loop. The coroutine then has a chance to clean up or even deny the request by suppressing the exception with a try … … except CancelledError … finally block. Therefore, unlike Future.cancel(), Task.cancel() does not guarantee that the Task will be cancelled, although suppressing cancellation completely is not common and is actively discouraged." This is, however, only true for a task that has started running. A Task that gets cancelled before even entering the coroutine is silently dropped. As asyncio does not allow for custom task instances without overriding the entire task factory, it leaves libraries like AnyIO some less desirable options for implementing level cancellation: 1. Implementing a parallel task system using low level synchronous callbacks (con: such tasks won't show up in asyncio.all_tasks() or work with third party debugging tools) 2. Adding callbacks to continuously cancel tasks that yield inside a cancelled scope (con: ugly; potentially extra overhead?) 3. Adding a wrapper for the task that acts as a "controller" (con: adds an extra visible stack frame, messes with the default task name) Having low level machinery for injecting a custom Task instance to the event loop would probably solve this problem. |
|||
msg413356 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-16 19:36 | |
Alex, the goal here is not to replicate every Trio feature or behavior. For example I don't think that asyncio is likely to get level cancellation in 3.11, but we can certainly see if we can do something about some of the edge cases you mention, like the case of a task that is cancelled before it has started running, where you say that the task should be allowed to run until its first await. It would be nice to have a native asyncio example that demonstrates this, so we have a concrete goal. I am thinking it is something like this: async def send_from_open_file(f, s): data = f.read() f.close() await s.send(data) async def send_filename(filename, s): f = open(filename) t = asyncio.create_task(send_from_open_file(f, s)) t.cancel() await asyncio.sleep(1) This is an interesting edge case and I can see why you'd rather see this run until the `await s.send(data)` line. The question is, can we do that without breaking other promises implicit or explicit? (Just because the docs don't mention some behavior that doesn't mean we can change it. We have to consider what happens to actual real world code.) I don't even know if this would be easy to change if we decided it was a good change. Thoughts? (Possibly this discussion belongs in a new issue, since it's not directly related to adding cancel scopes.) |
|||
msg413358 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-16 21:10 | |
I'm not trying to argue that asyncio should be changed to have level cancellation or even cancel scopes as built-in (at this point), but expanding the low level API to make implementing these features possible in third party libraries without the awkward hacks we have now. As for async-timeout, it suffers from the same problem as AnyIO and Quattro: that cancellations of the entire task can be inadvertently swallowed by the async context manager in edge cases. I hadn't even thought of the possibility of this happening until one of AnyIO's users reported just such a problem: https://github.com/agronholm/anyio/issues/374 I just couldn't think of any way to correctly support such things without at least _some_ changes to the task cancellation behavior, and allowing .cancel() to update the cancel message seemed like the least invasive option. I'm all ears if someone has a better solution. |
|||
msg413359 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-16 21:14 | |
OK. 1. Please have a look at how .cancelling()/.uncancel() works (there's an example of it in TaskGroup) to solve that particular problem without using the cancel message. 2. Suppose you could make (backwards compatible) changes to asyncio. What would you do? 3.11 feature freeze (aka beta 1) is still a few months away (late May) so now's the time to get your wishes in. |
|||
msg413360 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-16 21:20 | |
Thanks, I will take a look at .uncancel() and .cancelling(). I saw that work happening in my email feed but couldn't figure out right away how it helped, but I will definitely look into the new TaskGroup code to see how it's used there and will get back to you after that. |
|||
msg413361 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-16 21:27 | |
Oh, wait. The new Task.cancelling() API helps tell the difference between the *parent* task being cancelled "from outside" vs. the task group itself cancelling the parent task so as to break out of an await like the following: async with TaskGroup() as g: g.create_task(...) await <something> when the await is cancelled, __aexit__() is called with a CancelledError, and we need to tell whether it was cancelled from the outside or by the completion callback on one of the tasks managed by the task group. The EdgeDB TaskGroup monkey-patched the parent task's cancel() method, and the new asyncio.TaskGroup instead checks parent.cancelled(). However, AFAICT when *any* of the task managed by the TaskGroup exits with CancelledError, this is *ignored* (in both the EdgeDB version and in asyncio.TaskGroup). The assumption here seems to be that the only reason a managed task raises CancelledError is because it was cancelled by the TaskGroup. A fix for that would be to separately keep track (maybe in a separate weak dict, or some other data structure -- maybe a flag on the task itself?) of which tasks are successfully cancelled by the TaskGroup. We can then treat a CancelledError bubbling out of a managed task that we *didn't* cancel as any other exception, causing it to abort the task group (i.e., cancel all other tasks). Is that what you are looking for? (But I think this could be solved even in 3.9 without resorting to cancel messages.) |
|||
msg413365 - (view) | Author: Tin Tvrtković (tinchester) * | Date: 2022-02-16 23:05 | |
The use of the cancel message is a hack, yeah. But it's what I had to work with. We could introduce a new, proper cancellation context to Tasks instead, which could be attached to the CancelError when it's raised. On the topic of multiple cancellations being applied to a task before it gets to run: could we just treat the cancellation context as a stack, and when the task gets to run, we throw them all in, one by one? Other options would involve somehow figuring out what the highest priority cancellation context is, or combining all the cancellation contexts into the CancelError somehow. On the topic of a task getting to run at least once before being cancelled: sure, I guess. I've personally never needed this but I can see how it'd be useful. Either have a flag on the Task instance or build that logic into the cancellation context handling? |
|||
msg413366 - (view) | Author: Tin Tvrtković (tinchester) * | Date: 2022-02-16 23:23 | |
On the topic of TaskGroups needing to know whether to swallow a child CancelledError or no: just use the same nonce/cancellation context trick? I remember asking Nathaniel about why in Trio nurseries and cancel scopes were linked (nurseries contain a cancel scope there), whereas in Quattro they are completely separate, and not really understanding the answer. I think I'm getting it now. |
|||
msg413367 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-16 23:27 | |
I hesitate to add yet another stack at this fundamental level. The Task.cancel() method returns a bool which indicates whether any state changed. When multiple cancellations happen concurrently, all but the first will return False, and anything that would like to cancel a task but finds that t.cancel() returns False can know that the task was either already cancelled or has already terminated. (To tell the difference, check t.done() first.) What would be the use case of wanting to cancel multiple times and having each cancellation be delivered separately? I know of one use case, where a task somehow decides to catch and *ignore* CancelledError (this should not be done lightly but it is supported -- like shielding in Trio). An impatient user or task manager might want to cancel such a thread a second time. This is what .uncancel() is for -- the thread must call .uncancel() to signal that it has truly ignored the cancellation (as opposed to being busy with cleanup that it deems uncancellable). But in this case the second cancellation (if it is to have any effect) should be delivered *after* .uncancel() is called. Your proposal of a cancel context or stack seems to be suggesting that there's a use case for mutliple *concurrent* cancellations. But I find it difficult to imagine such a use case, so I need your help. Even if we ignore the stack idea, could you provide some code showing how the cancel context would be used? I just learn better from code examples. |
|||
msg413368 - (view) | Author: Tin Tvrtković (tinchester) * | Date: 2022-02-16 23:46 | |
Ok, first the happy path. We have a task running this: async def a(): with move_on_after(5): await a_long_op() await something_else() `move_on_after` has scheduled a callback to run that calls `parent_task.cancel(msg=1)` 5 seconds after it was executed. So now 5 seconds pass, the callback cancels the task, and `move_on_after` catches the CancelledError, sees the msg == 1, and swallows it. `something_else()` now runs. All good. Sad path. Same scenario, except the event loop is kinda busy since we're running in production. Turns out this task was spawned by a web server, and there's a 5 second timeout (or the client disconnected, or something else). So now we have 2 callbacks that want to cancel this task: the one from `move_on_after` and the one from the web server. The one from the web server is more important, since it's a higher level cancellation. But the callback from `move_on_after` runs first, and marks the task for cancellation, and sets the message to 1. Then, before the task gets to run, the webserver also cancels the task. But that does nothing: https://github.com/python/cpython/blob/6f1efd19a70839d480e4b1fcd9fecd3a8725824b/Lib/asyncio/tasks.py#L206. So now the task actually gets to run, `move_on_after` swallows the CancelledError, and something_else() gets to run. But ideally, it shouldn't. |
|||
msg413369 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-17 00:18 | |
I just tried to write a snippet to demonstrate the issue in TaskGroup, but that doesn't seem possible since TaskGroup never swallows a CancelledError. It always raises/propagates _some_ exception out of __aexit__() unless of course all the child tasks run to completion successfully. I was also surprised to notice that TaskGroup doesn't have a .cancel() method, so in cases where one would launch multiple tasks and cancel the rest when one succeeds, one would have to store all the child tasks separately and then iterate over them and cancel one by one. The Happy Eyeballs algorithm is one such use case (also used in AnyIO this way). |
|||
msg413370 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-17 00:30 | |
@Tin The sad path is just a race, right? If the web server had disconnected just a tad later, __aexit__() would already have returned and await something_else() would already be running. So you can't make any promises if you write your code that way anyway. @Alex For "happy eyeballs" you could also raise an exception or cancel the parent task once you've saved the winning result somewhere. Maybe you could show example code written using different paradigms so we can compare. |
|||
msg413371 - (view) | Author: Tin Tvrtković (tinchester) * | Date: 2022-02-17 00:41 | |
@Guido Imagine something_else() takes a long time and a lot of server resources (like a heavy query). If the web server disconnected a tad later and avoided the race condition, the task would have gotten cancelled very soon after the start of something_else() and stopped running it. But since the race did happen, the query avoids cancellation (potentially ~forever). |
|||
msg413372 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-17 01:05 | |
Hm, I see. So the problem is that in the interval between move_on's calls to t.cancel() and t.uncancel(), if the web server calls t.cancel() that will just return False. So the web server would have to implement some other mechanism for cancelling operations. That's indeed unfortunate. Maybe we should just roll back that aspect of the TaskGroup PR -- in particular, remove these two lines: if self._cancel_requested: return False from Task.cancel(). These lines don't matter for TaskGroup (it works without them), and they weren't there before yesterday, so the fallout would be very localized. @asvetlov What do you think? |
|||
msg413376 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-17 01:44 | |
FWIW it looks like this part of taskgroups is vulnerable to a similar race: https://github.com/python/cpython/blob/6f1efd19a70839d480e4b1fcd9fecd3a8725824b/Lib/asyncio/taskgroups.py#L212-L232 Deleting the two lines I mentioned won't fix it here; a hack using the cancel message might be more appropriate. (I note that there is no documented way to retrieve the cancel message; you're supposed to access the protected `_cancel_message` attribute, apparently. Looks like we forgot something there.) |
|||
msg413377 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-17 01:49 | |
> (I note that there is no documented way to retrieve the cancel message; you're supposed to access the protected `_cancel_message` attribute, apparently. Looks like we forgot something there.) Oh, it's passed to the CancelledError() constructor. But that's not documented either (I had to find the original issue that introduced this to figure it out -- bpo-31033). |
|||
msg413387 - (view) | Author: Chris Jerdonek (chris.jerdonek) * | Date: 2022-02-17 05:38 | |
> I note that there is no documented way to retrieve the cancel message Does retrieving it from the CancelledError that is bubbling up suffice? Or do you need to be able to obtain it from the future object? |
|||
msg413388 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-17 05:54 | |
> > I note that there is no documented way to retrieve the cancel message > Does retrieving it from the CancelledError that is bubbling up suffice? Or do you need to be able to obtain it from the future object? I'm not sure yet (if anything I'd need it for a task, not a future). But it's also not documented that it gets passed to the exception (at least not in the Task docs -- I didn't check the Future docs). |
|||
msg413389 - (view) | Author: Chris Jerdonek (chris.jerdonek) * | Date: 2022-02-17 06:17 | |
> I'm not sure yet (if anything I'd need it for a task, not a future). (By future, I also meant task, as task inherits from future.) For now, I think it would be safer to get the message from the CancelledError, if possible, since how it gets there truly is an implementation detail. It would be okay to document that the msg argument gets passed to the CancelledError via the constructor, as that was always the intent. See also issue 45390 and the message I wrote there on how to make that API work better (given that the msg is only available from the leaf exception in the exception chain, and the current implementation creates intermediate exceptions, I believe unnecessarily): https://bugs.python.org/issue45390#msg403570 |
|||
msg413402 - (view) | Author: Tin Tvrtković (tinchester) * | Date: 2022-02-17 12:30 | |
@Guido @Chris Instead of using the message for the nonce we could have a dedicated field for it. I have a proposal though. It's about putting a little logic into the cancellation nonce handling. Let's define the nonce as a float. If you just call Task.cancel(), the nonce defaults to 0.0. We add an argument to Task.cancel, so you can give it a nonce: task.cancel(nonce=nonce). The cancel() method has logic to keep track of the nonce with the smallest value. When it's time for the CancelledError to be thrown into the task, this nonce (default 0.0 if not set by anthing) is attached to the error. Then we change `move_on_after` (and all siblings) to do the following: * in `__enter__`, use time.monotonic() to generate it's own nonce and remember it * in `__exit__`, if the CancelledError has a nonce that is less than its own nonce, it propagates it, otherwise it handles it. How this fixes the sad path in my example: Both the web server and `move_on_after` cancel the task. The web server just calls `task.cancel()`, `move_on_after` calls `task.cancel(nonce=self.nonce)`. No matter the cancellation ordering, the nonce will end up set to 0.0. `move_on_after` will see the 0.0 nonce and propagate the error correctly to kill the task completely. This also handles nested cancel scopes. I'm not sure how it works with the task catching the cancel to do a little cleanup. |
|||
msg413409 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-17 15:11 | |
The discussion is hot, I see several interleaved threads. Let me put my answers on all of them in order of appearance. 1. quattro cancellation scopes are implemented after async-timeout. As the author of async-timeout I am happy to know it. The module is pretty small (as async-timeout itself). I'd like to concentrate on the differences between async-timeout and quattro. a) quattro has `fail_after()` / `fail_at()` context managers, that are similar to async-timeout's `timeout()` / `timeout_at()` The first function schedules timeout as a *relative* delay, the second uses *absolut monotonic* time. So far so good. `timeout()` better points on 'what happens' I believe (TimeoutError is raised). `fail_after` is not that bad at all but `timeout()` is easier I think. `timeout_at()` and `fail_at()` are minic to `loop.call_at()`, the prefix is perfect. Regarding `loop.call_later()` I found that `timeout_later()` is too long name (plus `timeout()` context manager appeared in 2015 as a part of aiohttp). My opinion is biased, sure. b) quattro has `move_on_after()` and `move_on_at()` context managers that doesn't raise TimeoutError but set a flag after certain period. asyncio has lower level primitives than trio (`loop.call_later()` and `loop.call_at()`). Not sure that `move_on_*` should be added, they are low-level building blocks. On the other hand, `move_on_after()` requires less code lines than `call_later()`. If we decide 'yes' I suggest changing naming: I have no idea what is *moved* without reading the documentation. c) `with fail_after()` vs `async with timeout()`. The first async-timeout version used `with timeout()` notation. I've changed it to async counterpart after many questions from newbies: "Why is sync context manages used in async code?". Questions arose when asyncio was not wide-spread as today. People made their first baby steps those days, I provided a dozen of asyncio offline courses and many conference talks. Thus, please consider `async with ...` design as a user feedback reaction. As a side effect, it prohibits `timeout()` usage in non-async functions (which is awkward at least). Regarding async context manager performance, I think it is good enough for timeout-related things. I didn't experience any problem with it. Moreover, async fast-path (async function is called and it returns a value without suspension on awaiting) can be optimized on Python level to make it as fast as a regular python function call, sure. It is not trivial and might require adding a new opcode (combine CALL + GET_AWAITABLE) but this optimization is out of the issue scope. d) `cm.deadline += 0.5` vs `cm.shift(0.5)` is a question of taste. asyncio-timeout design motivation was "don't do complex things in property setter" but I can live with mutable `cm.deadline` attribute, sure e) cancellation stack and `current_effective_deadline` -- I'm with Guido, let's not add yet another stack. It can be an interesting debug feature but I doubt if it is useful in production code. Also, the performance cost is not zero. Merging and slicing stack tuple on any timeout context enter/exit is not free. The implementation can be switched to a linked list but still, do we really need it? 2. Alex Grönholm, asyncio supports custom task instances without overriding the entire task factory. You should provide a custom method for custom task creatuon, that's it. `asyncio.all_tasks()` / `asyncio.current_task()` support is provided by '_register_task()', '_unregister_task()', '_enter_task()', and '_leave_task()' calls. These methods are part of non-user-faced public API, they are intentionally enumerated by `asyncio.__all__`. These methods are mentioned by changelog only, sorry. A pull request for documenting them in asyncio low-level section is welcome! 3. The race condition between two `.cancel()` calls performed by the same loop iteration. Sure, the race exists. Before TaskGroup landing, the last `.cancel()` wins. After the change, the first `.cancel()` wins and all subsequent `.cancel()` calls made on the same event loop iteration are rejected with returning `False`. I believe, the changed behavior is more consistent (and close to `Future.cancel()` design). Assume, we have two scheduled close but different timeouts for the same tasks. Both are reached at the next event loop iteration (see the timeline below): prev-iteration timeout-a timeout-b current-iteration | | | | >---+--------------+-----------+--------------+--------------> future I prepared https://github.com/aio-libs/async-timeout/pull/295 to handle it (not merged yet because the next Python alpha release it required; I've tested it against the latest CPython main branch manually). Sorry for polluting source code by `sys.version_info` checks, the library is supposed to work with Python 3.7+. async-timeout handles the race as follows: https://github.com/aio-libs/async-timeout/blob/master/async_timeout/__init__.py#L209-L239 a) timeout-a and timeout-b TimeHandle's are scheduled for execution on the current iteration and executed. b) Both context managers set their own state to 'TIMEOUT'. c) timeout-a cancels the current task, timeout-b calls the cancel but it is skipped and rejected because timeout-a called `task.cancel()` earlier. d) The task is cancelled, async stack unwinds. e) The inner context manager converts CancelledError to TimeoutError and raises it. f) The outer context manager does nothing but bubbles-up TimeoutError from the inner context manager. Doesn't matter what context manager (timeout-a or timeout-b) is inner and what is outer; it the timeout occurs the TimeoutError is propagated from the inner code up along the stack *unless* some code swallows it intentionally. I don't think that we should prohibit swallowing. Another edge case is the task explicit cancellation that is scheduled on the same event loop iteration as timeout occurrence: prev-iteration (explicit .cancel() called) timeout current-iteration | | | >-----------------+----------------------------+-------------+----------> future a) `loop.call_later()` registered callback (`_on_timeout`) is called first. Its `task.cancel()` is rejected (`False` returned) because the first cancellation was explicitly requested on the previous iteration. b) the task wakes up to handle `.cancel()` call from the prev iteration c) stack unwinds with CancelledError, it is not converted to TimeoutError because CancelledError has no required message. That's fine, because `.cancel()` call from the previous iteration is processed, not timeout itself. d) `.expired` property is `True` though because timeout is reached technically; the wall-clock time is greater than the deadline. 4. Tin Tvrtković, I don't think that a separate field for 'nonce' is needed for the proper cancellation. It adds more complexity; I have no idea what to do with the nonce field if the CancelledError comes not from timeout context manager but is rased by other logic. As I demonstrated above, using cancellation message as nonce work perfectly fine. P.S. It is a long letter, sorry. Please don't hesitate to discuss it, feel free to ask a question if some of my words are not clear. English is not my mother language :( |
|||
msg413410 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-17 15:21 | |
I support Alex Grönholm: TaskGroup is not affected by cancellation races because it doesn't convert the exception or swallows it. The code is safe from my understanding. |
|||
msg413411 - (view) | Author: Tin Tvrtković (tinchester) * | Date: 2022-02-17 15:51 | |
Hello Andrew, here's some followup. About the names: the move_on_after and fail_after names are from Trio. I don't have strong feeling about them at all, whatever most people like. About move_on_after vs loop.call_at and loop.call_later: move_on_after is much more usable in practice, since it doesn't require splitting your coroutine in several. And it's very useful in production code; when dealing with 3rd party APIs you want to wait a reasonable time and continue on your way if the 3rd party API stalls. About `async with` vs `with`: I have strong feelings for `with`. It's not a performance issue; I don't care about that. `async with` to me says there are suspension points involved; this coroutine might or might not be suspended either entering or existing. With a bare `with` I *know* there is no suspension. This is important to keep my concurrent code more correct, which is hard enough. About overriding the task factory: the default Task implementation is implemented in C, correct? I would be way too scared of putting my (Python) implementation in there because of performance. Spending years shaving microseconds off in my other libraries to risk losing it all because I want better cancellation in asyncio would feel really bad. Ideally we can get a good-enough solution in the stdlib so users don't have to do this. About your point 3, first graph: you are right if both context managers propagate exceptions. If the inner one is set to swallow (`move_on_after`) OR the user plans on swallowing the inner one, the problem is *not* solved (or at least I don't understand the solution). This is the code sample from https://bugs.python.org/issue46771#msg413368. And I think swallowing is an important use case, as I've already mentioned. About the special field for nonce: I'm OK with smuggling the nonce in the message. But I stand by my proposal for making the nonce a monotonic number, and that would require a special field to be clean. |
|||
msg413415 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-17 16:53 | |
@Guido you asked for the AnyIO implementation of Happy Eyeballs; here it is: https://github.com/agronholm/anyio/blob/ac3e7c619913bd0ddf9c36b6e633b278d07405b7/src/anyio/_core/_sockets.py#L85 (I didn't paste the actual code here because it's way too long) |
|||
msg413445 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-17 19:27 | |
Lots of food for thought! There seem to be mostly two discussions: API design for the new asyncio cancel scopes (do we make it more like Trio or more like async-timeout?); and cancel semantics in edge cases. I'll pass on the API design for now: I recommend that Tin and Andrew agree on some middle ground first. (Personally I could do without move_on(), I'd just add a try/except TimeoutError.) On the cancel edge case, I am beginning to warm up to (ab)using the existing cancel message hack, rather than a separate nonce. I think the message argument could be the cancel scope or its id(). I think Andrew missed one case: in his second diagram, what if the explicit cancel() happened *after* the timeout (but still in the same iteration)? That's the case that makes me want to delete those two lines from Task.cancel() (see my earlier message). (Sorry, I've gotta go play now.) |
|||
msg413460 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-17 22:12 | |
https://github.com/python/cpython/pull/31394 is created for collaboration , Tin Tvrtković is invited. Core devs should have the write access already. Non-core devs, please ask for github invite if you want to collaborate. |
|||
msg413461 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-17 22:20 | |
The PR is pretty empty, it has a scaffolding for `asyncio.timeouts` module and its test only. I'll try to add something real tomorrow. My plan is: - solve 'easy' questions with Tin during PR's discussion/reviews - make something that we are both agree on if it is possible at this stage. I'm optimistic, seems like we both are able to compromise (and have the experience to work together on pytest-asyncio project). - raise a hard question loudly if discussion on GitHub will need more people (participation in early stages are welcome, sure). > I think Andrew missed one case: in his second diagram, what if the explicit cancel() happened *after* the timeout (but still in the same iteration)? That's the case that makes me want to delete those two lines from Task.cancel() (see my earlier message). Please let me write a comprehensive answer (with a third diagram, I've found these simple pictures useful) tomorrow. |
|||
msg413465 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-18 00:22 | |
+1 on both aspects of the plan. |
|||
msg413466 - (view) | Author: Yury Selivanov (yselivanov) * | Date: 2022-02-18 04:54 | |
Couple thoughts: I'm +1 for adding TaskGroup.cancel() method. I'd be -1 on abusing `Task.cancel()` to signal something with a nonce. Whatever problem we are trying to solve here, it should be solvable without resorting to hacks like this. It should be trivial to implement simple tracking of whether a child task was cancelled by the group or not to decide on how to handle a rogue CancelledError. |
|||
msg413475 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-18 10:57 | |
I am also uncomfortable using the cancel message to deliver the token/nonce/whatever. If TaskGroup.cancel() is implemented, would it also deliver a cancellation in the parent task like trio and AnyIO do? It should IMHO, because otherwise if the task group is cancelled, it could still get stuck waiting on whatever the parent task happens to be waiting on, if it's not at TaskGroup.__aexit__() yet. |
|||
msg413478 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-18 11:52 | |
Guido, the third case: The third edge case is: explicit cancel() happened *after* the timeout (but still in the same iteration)? timeout current-iteration (calls .cancel() after timeout) next-iteration | | | >---+----------------------------+---------------------------------+----------> future a) timeout occurs, `call_later()`' callback is called, the task cancellation is scheduled on the next loop iteration by `task.cancel()` call b) other activity (e.g. socket-ready event that processed after timers in asyncio) explicitly calls `.cancel()`. The second request is ignored, `.cancel()` returns `False`. c) On the next iteration, the task wakes up with CancelledError with a message that points on the timeout context manager. It means that the timeout is processed, explicit `.cancel()` call that happens *after* timeout is ignored. The first event wins, as usual. |
|||
msg413480 - (view) | Author: Tin Tvrtković (tinchester) * | Date: 2022-02-18 13:06 | |
Sure, I'll be glad to work with Andrew on getting something presentable. Going through the discussion in the issue: * it seems like folks don't think move_on is useful enough to be in the stdlib, I understand that point. Users can always catch the timeout error from `timeout`, and I can just keep `move_on` in Quattro. We can always add it later. So as far as I'm concerned we can drop it. * `async with` vs `with`: if Andrew thinks `async with` is easier to teach and less error-prone, I'm ok with having the `async with` civilian version in the stdlib and I can keep the `with` expert versions in Quattro, no problem there. So I'm most interested in the cancellation semantics, because those will be very hard to fix in a 3rd party package. @Andrew, in your schema for the third case the behavior is wrong, the `.cancel()` should win over the timeout. Otherwise using the context manager becomes too risky in real-world situations. I also think your first graph has an issue if the user has a `try/except TimeoutError` between `timeout-a` and `timeout-b`, which is now more probable since we're dropping `move_on`. We can take the discussion to the forked repo; I can put together some tests if that would make it easier. |
|||
msg413481 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-18 13:12 | |
I propose the following, backwards compatible solution: Add a new keyword argument to Task.cancel(): "scope: object = None". The behavior would be as follows: the scope is saved, and included in the raised CancelledError. If Task.cancel() is called again, but with scope=None (the default), it clears out the saved scope, if any. Any other scope will be ignored. This simple change would allow for proper implementation of any context manager that needs to swallow or transform a CancelledError raised in the task. |
|||
msg413489 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-18 15:28 | |
> the third case the behavior is wrong, the `.cancel()` should win over the timeout. Otherwise using the context manager becomes too risky in real-world situations. Please elaborate. The first code that calls `.cancel()` wins, doesn't matter what is the source. asyncio has no priorities. > I also think your first graph has an issue if the user has a `try/except TimeoutError` between `timeout-a` and `timeout-b`, which is now more probable since we're dropping `move_on`. We can take the discussion to the forked repo; I can put together some tests if that would make it easier. Ok, let's discuss on GitHub. I only would mention that no code could be executed between timeout-a and timeout-b, because both events are scheduled between the previous event loop iteration and the current one. Sure, if we can start talking with code (and failed tests) -- it can raise the understanding level very much. |
|||
msg413491 - (view) | Author: Yury Selivanov (yselivanov) * | Date: 2022-02-18 15:44 | |
> * `async with` vs `with`: if Andrew thinks `async with` is easier to teach and less error-prone, It has to be 'async with' like most asyncio apis to avoid user confusion. |
|||
msg413591 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-20 15:22 | |
Suppose we have a case when two nested timeouts are reached at the same event loop iteration: async def asyncio.timeout(1) as cm1: async with third_party_cm() as cm2: async def asyncio.timeout(1) as cm3: async with third_party_cm() as cm4: await asyncio.sleep(10) What exception should be bubbled between outer and inner context manager 'exit' executions? `sleep()` is interrupted with CancelledError, it is clear (and the only possible solution in asyncio world). `cm4.__aexit__` receives the CancelledError, does the cleanup if required, and re-raises the cancellation. `cm3.__aexit__` receives the bubbled CancelledError and updates its own state and raises an exception. The question is: what exception should be raised, CancelledError or TimeoutError? What exception should see `cm2.__aexit__` code? After careful thinking, I believe that CancelledError should be re-raised by *inner affected* timeout context managers, the only top-level *affected* context should convert CancelledError and raise TimeoutError. My reasons for this behavior are: A generic asyncio code is usually *ready* for cancellation. If it wants to react to the cancellation event, it caught `asyncio.CancelledError` and reraised it. Also, the asyncio code is cancellation-ready by default because usually `BaseException` is now handled (asyncio.CancelledError is derived from BaseException). TimeoutError is caught by `except Exception` instead, it adds extra difficulty. Handling both CancelledError and TimeoutError by *any* asyncio code on async stack unwinding is tedious and error-prone. If we should choose one I bet on CancelledError. The inner code ignores timeouts usually (and executes resource cleanup only). That's what CancelledError handling exists for already. If the cleanup differs depending on timeout expiration, `cm3.expired` (name it) can be used as a flag. You can disagree with me here, my opinion is based on my experience of writing asyncio code only. The top-level affected timeout context manager should raise TimeoutError because it exists and is used for such things. Long story short: all *internal affected* timeout context managers should not raise TimeoutError (or it should be configurable and 'off' by default) because `third_party_cm()` should have the same simple implementation whether is it used as `cm2` or `cm4`. Happy to see your opinions regarding the question, folks! |
|||
msg413600 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-20 18:05 | |
The 3rd party context managers are somewhat of a red herring. These are just try/except or try/finally blocks. The inner cm (cm4) is irrelevant, it will see CancelledError and presumably that bubbles out. If it does any resource cleanup we can replace it with try/finally for purposes of simplifying the example. But here's an even simpler example that poses the same question: async with asyncio.timeout(1) as cm1: try: async with asyncio.timeout(1) as cm3: await asyncio.sleep(10) # Raises CancelledError except TimeoutError: print("timed out") Does this see CancelledError or catch TimeoutError? I had naively expected that it would catch TimeoutError, but then there's no place for the outer cancel scope to have any effect, so I agree that it should indeed see CancelledError, and "timed out" is never printed. The outer cancel scope sees CancelledError and turns it into TimeoutError. Note that if the outer cancel scope has a longer timeout (which isn't expired yet), the try/except will catch TimeoutError. If it then enters another `await asyncio.sleep(10)` it will be cancelled and the outer cancel scope will raise TimeoutError. How to implement this behavior? It can be done with the "cancel counter" that I proposed and Tin implemented in https://github.com/python/cpython/pull/31434. Can it be done with the simpler version (just a cancel-requested bit), without using a nonce? I don't think so -- we don't know in which order the cancel call from the inner and outer cancel scope happen, and if the inner goes first, it cannot be aware of the outer. So I think the cancel counter is the minimal change needed. I have one final question, to which I don't have a firm answer yet. In Task.cancel(), if the cancel counter is already nonzero, should it still go ahead and set the must-cancel flag (or pass the cancellation on to `self._fut_waiter` -- I am still not sure what that's for :-( ). I think it only makes a difference if the task being cancelled has already caught a CancelledError (from the first cancel()) and is handling it. If we set must-cancel, then if it uses `await` it will be cancelled again. If we don't set must-cancel, its cleanup is "shielded". **Opinions?** (PS. There's a typo in Andrew's example -- it should be "async with", not "async def".) |
|||
msg413601 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-20 18:38 | |
Can I also get comments on my proposal for the "scope" parameter? Is there a use case it would not solve? |
|||
msg413602 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-20 18:45 | |
In docs we can explain the behavior as "the outer expired timeout cancels the inner waiter, waits for CancelError bubbling up, and raising TimeoutError instead". I agree that a counter is required for this behavior. An alternative implementation can use the global `dict[Task, int]` for keeping counters. It can be either WeakKeyDictionary or a regular dict that deletes entries on `task.add_done_callback()` call. We have a similar structure for `asyncio.all_tasks()` support already. The global dict has a benefit: it doesn't overlap with the user's `.cancel()` calls but counts timeouts only. A few words regarding task internals: _must_cancel boolean flag is set when a task doesn't wait for something, it was just created or `await sleep(0)` context switch was executed on the previous step. Otherwise, a task always waits for a future completion, the future is stored as _fut_waiter. If we use the global counting dict, timeout could call `.cancel()` only if the cancellation was not initiated previously. The current behavior works fine with this as the second `.cancel()` call is ignored. Technically the ignorance could be reverted, `task.cancelling()` check is enough. > If we don't set must-cancel, its cleanup is "shielded" If I understand it correctly, I want this feature. Cleanup can perform async operations for a graceful resources shutdown, cancelling these cleaups look dangerous. With the current asyncio state, you can do it by calling `task.uncancel(); task.cancel()` in a line though. |
|||
msg413603 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-20 19:19 | |
Updated https://github.com/python/cpython/pull/31394 demonstrated the approach with global dict for counters. |
|||
msg413604 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-20 19:25 | |
Alex, the 'scope' argument can be added if it is really required. I'm not sure if the nonce is unavoidable still. |
|||
msg413607 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-20 20:09 | |
> Alex, the 'scope' argument can be added if it is really required. > I'm not sure if the nonce is unavoidable still. What other generic solution is there? I've read your last few messages but didn't find an answer. There needs to be some way to avoid a whole-task cancellation being ignored when it happens after a cancel scope triggers a cancellation for itself. My proposal solves that problem, and I think it eliminates the need for un-cancellation or other backwards incompatible changes. |
|||
msg413608 - (view) | Author: Sam Bull (dreamsorcerer) * | Date: 2022-02-20 20:33 | |
> If the task is already pending a cancellation targeted at a cancel scope, the task itself cannot be cancelled anymore since calling cancel() again on the task is a no-op. This would be solved by updating the cancel message on the second call. > I think Andrew missed one case: in his second diagram, what if the explicit cancel() happened *after* the timeout (but still in the same iteration)? That's the case that makes me want to delete those two lines from Task.cancel() (see my earlier message). To expand on this point, I've been looking at solving the race conditions in async-timeout. To see how such a race condition can end up with a task never exiting, take a look at this example: https://github.com/aio-libs/async-timeout/issues/229#issuecomment-908502523 In the condition Guido describes, the user's cancellation is suppressed and the code runs forever. I also wrote tests that seem to reliably reproduce the race condition (the 2nd one still seems unfixable with the current solutions, the 1st was fixed with the nonce/sentinel trick): https://github.com/aio-libs/async-timeout/commit/ab04eb53dcf49388b6e6eacf0a50bafe19c5c74b#diff-60a009a48129ae41018d588c32a6d94c54d1d2948cbc3b831fc27a9c8fdbac68L364-L421 You can see the flow of execution from the call_order assert at the end. I think most of the solutions proposed here will still not solve this race condition. I initially proposed a solution at: https://bugs.python.org/issue45098 In short, I think that every time we call .cancel(), we need to raise another CancelledError. So, in this race condition you would get 2 CancelledErrors (via an ExceptionGroup). Then our code can catch the error with our nonce/sentinel and handle that, but also reraise any other errors which are unrelated. |
|||
msg413610 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-20 21:26 | |
> Can I also get comments on my proposal for the "scope" parameter? Is there a use case it would not solve? It looks more complicated -- the extra parameter needs to be passed around via the task and then into the CancelledError exception. What use case do you have that cannot be solved by some variation of the "cancel count" proposal? |
|||
msg413611 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-20 21:31 | |
(Sam Bull) > To expand on this point, I've been looking at solving the race conditions in async-timeout. To see how such a race condition can end up with a task never exiting, take a look at this example: https://github.com/aio-libs/async-timeout/issues/229#issuecomment-908502523 This should be solved when using the cancel count -- the explicit cancel bumps the cancel count so the cancel scope (i.e. timeout()) will not raise TimeoutError. |
|||
msg413612 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-20 21:46 | |
> It looks more complicated -- the extra parameter needs to be passed around via the task and then into the CancelledError exception. It reduces overall complexity by making uncancellation unnecessary and restoring backwards compatibility. > What use case do you have that cannot be solved by some variation of the "cancel count" proposal? I'm not sure I'm keeping proper track of the variations, but it seems it still relies on task uncancellation. But worse than that, (correct me if I'm wrong) it makes the innermost context manager handle the cancellation, even if it was requested by an outer one. If you have 3 nested "cancel scopes" and the task is cancelled once, how do you know which one of those context managers should handle the cancellation? I'm not sure my proposal is a fix-all either, in its current form. Sure, it fixes the case where a full task cancellation would go unnoticed, but if two unrelated context managers trigger cancellation at the same time, only the first one would actually receive it. Perhaps then we need to raise a CancelledError separately for each scope? I'm not sure yet. |
|||
msg413613 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-20 21:58 | |
> This should be solved when using the cancel count -- the explicit cancel bumps the cancel count so the cancel scope (i.e. timeout()) will not raise TimeoutError. Hmmm. Interesting! Timeouts are not the single primitive that should care about the cancellation source. Suppose, async code reconnects on network problem but should be terminated on explicit task cancelling. Could cancel count be used here? Is the approach generic enough? My first answer is "why not?" |
|||
msg413614 - (view) | Author: Tin Tvrtković (tinchester) * | Date: 2022-02-20 22:00 | |
@Alex you can follow along here: https://github.com/python/cpython/pull/31394 With the cancel_counter approach, a context manager knows whether to handle or propagate the exception by examining its own local state and the remaining counter on the task. If after uncancelling the counter is still non-zero, it propagates. |
|||
msg413618 - (view) | Author: Sam Bull (dreamsorcerer) * | Date: 2022-02-21 00:00 | |
> This should be solved when using the cancel count -- the explicit cancel bumps the cancel count so the cancel scope (i.e. timeout()) will not raise TimeoutError. It will probably work in this case. But, what about more complex cases? If there are 2 different timeouts and the possibility of a user cancellation, and we have a count of 2 cancellations, then what happened? 2 timeouts, or 1 timeout and user cancellation? Without being able to check the nonces of each cancellation, I don't see how this would work. Or if the user calls .cancel() twice explicitly, then you cancel both timeouts, even though it had nothing to do with the timeout. Propagating an ExceptionGroup where every exception can be inspected to see if it was caused by this code or not still seems like the safe option to me (and obviously still has the cancel count implicitly). |
|||
msg413619 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-21 00:05 | |
> Propagating an ExceptionGroup where every exception can be inspected to see if it was caused by this code or not still seems like the safe option to me (and obviously still has the cancel count implicitly). Note that this, too, causes backwards incompatible changes in cancellation behavior. Previously, when the task was cancelled twice, only one CancelledError was raised. Now it would raise a BaseExceptionGroup instead. The current backward incompatible changes in cancellation behavior are already causing 10 tests to fail in the AnyIO test suite. I'm trying to find an alternate solution that does not break anything. |
|||
msg413621 - (view) | Author: Sam Bull (dreamsorcerer) * | Date: 2022-02-21 00:19 | |
Actually, in your counter proposal, you say: > Such context managers should still keep track of whether they cancelled the task themselves or not, and if they did, they should always call t.uncancel(). But, if we are using nonces on the CancelledError to keep track, then only 1 context manager will know if it was themselves or not. This is exactly why I'm proposing to use multiple CancelledErrors, so that every nonce is passed to the handling code. |
|||
msg413622 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-21 00:20 | |
> But, if we are using nonces on the CancelledError to keep track, then only 1 context manager will know if it was themselves or not. This is exactly why I'm proposing to use multiple CancelledErrors, so that every nonce is passed to the handling code. Raising multiple CancelledErrors is not the only way to accomplish this. We could store the nonces in a single CancelledError instead. |
|||
msg413623 - (view) | Author: Sam Bull (dreamsorcerer) * | Date: 2022-02-21 00:22 | |
> Previously, when the task was cancelled twice, only one CancelledError was raised. Now it would raise a BaseExceptionGroup instead. I was under the impression that ExceptionGroup was somewhat backwards compatible, in that you could still use `except CancelledError:` and it would catch all the errors in the group. But, maybe I'm wrong, I've not seen the documentation for the final feature yet, but that's the impression I got from the PEP. |
|||
msg413624 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-02-21 00:24 | |
> I was under the impression that ExceptionGroup was somewhat backwards compatible, in that you could still use `except CancelledError:` and it would catch all the errors in the group. But, maybe I'm wrong, I've not seen the documentation for the final feature yet, but that's the impression I got from the PEP. No, you need the new except* syntax for that. |
|||
msg413625 - (view) | Author: Sam Bull (dreamsorcerer) * | Date: 2022-02-21 00:24 | |
> We could store the nonces in a single CancelledError instead. Indeed, my initial proposal was exactly that, but after learning about ExceptionGroup, I thought that was a cleaner approach. |
|||
msg413631 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-21 03:35 | |
[Alex] > The current backward incompatible changes in cancellation behavior are already causing 10 tests to fail in the AnyIO test suite. I'm trying to find an alternate solution that does not break anything. Are you sure that the tests aren't over-specified? Maybe you could link to the failing test run? (Though I worry that understanding your test infrastructure might be a bit much.) That said, when I landed this change I wasn't at all sure that it wouldn't break things (there just weren't any asyncio tests that it broke), and I am totally willing to roll that part back or change it if it breaks a valid use case. So basically I am saying please chill, feature freeze isn't until late May. |
|||
msg413632 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-21 03:46 | |
[Sam] > It will probably work in this case. But, what about more complex cases? If there are 2 different timeouts and the possibility of a user cancellation, and we have a count of 2 cancellations, then what happened? 2 timeouts, or 1 timeout and user cancellation? Without being able to check the nonces of each cancellation, I don't see how this would work. Or if the user calls .cancel() twice explicitly, then you cancel both timeouts, even though it had nothing to do with the timeout. The cancel scope must record whether it called cancel() or not. (This is what TaskGroup does.) Suppose the inner timeout cancels and sees two cancellations. It knows one of those is its own, so it calls uncancel() and raises CancelledError. Now the outer cancel scope sees one cancellation. If it did not call cancel(), it knows it was a user cancellation (or another, even more outer, cancel scope -- it really doesn't matter), and it raises CancelledError. If the outer cancel scope also called cancel(), it knows that this is so, and it in turn calls uncancel() -- and it knows (by the count returned from uncancel()) that there are no more cancellations, so it raises TimeoutError. QED |
|||
msg413633 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-21 03:53 | |
PS. The cancel count can work whether or not cancel() returns without setting must-cancel (or cancelling the underlying future/task) when there's already a cancellation in progress. Other things may be different though. We have to look into this separately. |
|||
msg413760 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-23 00:32 | |
If some code is used together with timeout() and this code calls `.cancel()` but forgot about `.uncancel()` in try/except/finally -- timeout() never raises TimeoutError. Should we care? The missing `.uncancel()` call is hard to detect by the runtime and static checkers. |
|||
msg413767 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-23 02:52 | |
> If some code is used together with timeout() and this code calls > `.cancel()` but forgot about `.uncancel()` in try/except/finally -- > timeout() never raises TimeoutError. Could you show an example? I'm not sure from this description who cancels whom and where the try/except/finally is in relation to the rest. If you have something that catches CancelledError and then ignores it, e.g. while True: try: await <something> except CancelledError: pass then that's an immortal task and it shouldn't be run inside a timeout. If you have something that catches CancelledError once, e.g. try: await <big action> finally: await <cleanup> there should be no need to call .uncancel() *unless* the <cleanup> may hang -- in that case you could write try: await <big action> finally: async with timeout(5): await <cleanup> I'm not sure that we should recommend using .uncancel() except in very special cases (e.g. when writing a timeout() context manager :-) and those cases should just be tested. |
|||
msg413812 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-23 15:37 | |
I have no good simple real-case scenario, sorry. There is a demonstration of my thoughts. Suppose we have a custom context manager that behaves similar to timeout() but is controlled not by timer but external event source (it could be an invalidation message sent by a distributed broker or something else). class EventRaised(Exception): pass class CancelOnEvent: async def __init__(self, event): self.event = event async def __aenter__(self): self.waiter = asyncio.task(self._cancel_on_event, asyncio.current_task()) async def __aexit__(self, exc_typ, ecx_val, exc_tb): if exc_typ is asyncio.CancelledError: if CASE1: # <<< cleanup strategy selector if asyncio.current_task().uncancel() == 0: raise EventRaised else: if self.event.is_set(): raise EventRaised async def _cancel_on_event(self, task): await self.event.wait() task.cancel() ########### event = asyncio.Event() async with asyncio.timeout(1): # what exception should bubble-up here? async with CancelOnEvent(event): await asyncio.sleep(10) # event.set() is called here after 1 sec timeout If this CancelOnEvent context manager is used together with timeout() CM, is the behavior clear? Should `.uncancel()` be used by CancelOnEvent? Why? How should it interact with timeout()? I have no clear and obvious answer on these questions, this worries me. |
|||
msg413820 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-23 17:03 | |
To make this cleanly interact with timeout, TaskGroup etc., the CancelOnEvent class should have a "did-I-cancel" flag which is set in the _cancel_on_event() callback. Then if that flag is set it should call .uncancel(), and if that returns a value > 0, it should bubble the CancelledError out; otherwise it can raise EventRaised (if the condition is set). |
|||
msg413840 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-02-23 18:21 | |
Clear, thanks! |
|||
msg413841 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-23 18:26 | |
I will now merge GH-31513 (cancel counts). Once that's in you can merge main into your timeout PR (GH-31394) and then that can land soon (I'd like to review it once more). |
|||
msg413874 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-24 02:17 | |
New changeset 7fce1063b6e5a366f8504e039a8ccdd6944625cd by Tin Tvrtković in branch 'main': bpo-46771: Implement task cancel requests counter (GH-31513) https://github.com/python/cpython/commit/7fce1063b6e5a366f8504e039a8ccdd6944625cd |
|||
msg414221 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-28 20:23 | |
Everyone, I've sent a PR (which I expect will make it into alpha 6) that restores the old cancel() semantics. This should make Tin happy, but I think we'll still have to have a longer discussion about the downsides. https://github.com/python/cpython/pull/31623 |
|||
msg414224 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-02-28 23:16 | |
New changeset 7d611b4cabaf7925f5f94daddf711d54aeae2cf9 by Guido van Rossum in branch 'main': bpo-46771: Remove two controversial lines from Task.cancel() (GH-31623) https://github.com/python/cpython/commit/7d611b4cabaf7925f5f94daddf711d54aeae2cf9 |
|||
msg414853 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-03-10 16:05 | |
New changeset f537b2a4fb86445ee3bd6ca7f10bc9d3a9f37da5 by Andrew Svetlov in branch 'main': bpo-46771: Implement asyncio context managers for handling timeouts (GH-31394) https://github.com/python/cpython/commit/f537b2a4fb86445ee3bd6ca7f10bc9d3a9f37da5 |
|||
msg414854 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-03-10 16:08 | |
I'm closing this, the asyncio.timeout() context manager has been merged. Thanks Andrew! @agronholm you said you were interested in tweaking the cancellation behavior some more. If you're still interested, let's discuss that in a separate bpo (please +nosy me if you create one). |
|||
msg414856 - (view) | Author: Andrew Svetlov (asvetlov) * | Date: 2022-03-10 16:22 | |
The implementation has landed, docs are still required. |
|||
msg414857 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2022-03-10 16:24 | |
Good think I forgot to close the issue. ;-) |
|||
msg414858 - (view) | Author: Alex Grönholm (alex.gronholm) * | Date: 2022-03-10 16:25 | |
Yeah, I'm still interested. I'll create a new BPO when I have something. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:56 | admin | set | github: 90927 |
2022-03-23 21:50:42 | asvetlov | set | title: Add some form of cancel scopes -> Implement asyncio.timeout() context manager |
2022-03-10 16:25:19 | alex.gronholm | set | messages: + msg414858 |
2022-03-10 16:24:06 | gvanrossum | set | messages: + msg414857 |
2022-03-10 16:22:11 | asvetlov | set | messages: + msg414856 |
2022-03-10 16:08:13 | gvanrossum | set | messages: + msg414854 |
2022-03-10 16:05:37 | gvanrossum | set | messages: + msg414853 |
2022-02-28 23:16:09 | gvanrossum | set | messages: + msg414224 |
2022-02-28 20:23:46 | gvanrossum | set | messages: + msg414221 |
2022-02-28 20:21:12 | gvanrossum | set | pull_requests: + pull_request29748 |
2022-02-24 02:17:13 | gvanrossum | set | messages: + msg413874 |
2022-02-23 18:26:20 | gvanrossum | set | messages: + msg413841 |
2022-02-23 18:21:11 | asvetlov | set | messages: + msg413840 |
2022-02-23 17:53:26 | yduprat | set | nosy:
+ yduprat |
2022-02-23 17:03:19 | gvanrossum | set | messages: + msg413820 |
2022-02-23 15:37:54 | asvetlov | set | messages: + msg413812 |
2022-02-23 02:52:33 | gvanrossum | set | messages: + msg413767 |
2022-02-23 00:32:01 | asvetlov | set | messages: + msg413760 |
2022-02-22 23:29:24 | tinchester | set | pull_requests: + pull_request29640 |
2022-02-22 19:11:13 | tinchester | set | pull_requests: + pull_request29635 |
2022-02-22 01:13:13 | tinchester | set | pull_requests: + pull_request29613 |
2022-02-21 03:53:04 | gvanrossum | set | messages: + msg413633 |
2022-02-21 03:46:12 | gvanrossum | set | messages: + msg413632 |
2022-02-21 03:35:26 | gvanrossum | set | messages: + msg413631 |
2022-02-21 00:24:15 | dreamsorcerer | set | messages: + msg413625 |
2022-02-21 00:24:07 | alex.gronholm | set | messages: + msg413624 |
2022-02-21 00:22:51 | dreamsorcerer | set | messages: + msg413623 |
2022-02-21 00:20:49 | alex.gronholm | set | messages: + msg413622 |
2022-02-21 00:19:10 | dreamsorcerer | set | messages: + msg413621 |
2022-02-21 00:05:23 | alex.gronholm | set | messages: + msg413619 |
2022-02-21 00:00:29 | dreamsorcerer | set | messages: + msg413618 |
2022-02-20 22:54:13 | tinchester | set | pull_requests: + pull_request29584 |
2022-02-20 22:00:37 | tinchester | set | messages: + msg413614 |
2022-02-20 21:58:45 | asvetlov | set | messages: + msg413613 |
2022-02-20 21:46:35 | alex.gronholm | set | messages: + msg413612 |
2022-02-20 21:31:35 | gvanrossum | set | messages: + msg413611 |
2022-02-20 21:26:40 | gvanrossum | set | messages: + msg413610 |
2022-02-20 20:33:35 | dreamsorcerer | set | nosy:
+ dreamsorcerer messages: + msg413608 |
2022-02-20 20:09:30 | alex.gronholm | set | messages: + msg413607 |
2022-02-20 19:25:47 | asvetlov | set | messages: + msg413604 |
2022-02-20 19:19:45 | asvetlov | set | messages: + msg413603 |
2022-02-20 18:45:12 | asvetlov | set | messages: + msg413602 |
2022-02-20 18:38:30 | alex.gronholm | set | messages: + msg413601 |
2022-02-20 18:05:57 | gvanrossum | set | messages: + msg413600 |
2022-02-20 15:22:30 | asvetlov | set | messages: + msg413591 |
2022-02-19 04:03:27 | tinchester | set | pull_requests: + pull_request29561 |
2022-02-18 15:44:39 | yselivanov | set | messages: + msg413491 |
2022-02-18 15:28:38 | asvetlov | set | messages: + msg413489 |
2022-02-18 13:12:51 | alex.gronholm | set | messages: + msg413481 |
2022-02-18 13:06:09 | tinchester | set | messages: + msg413480 |
2022-02-18 11:52:09 | asvetlov | set | messages: + msg413478 |
2022-02-18 10:57:37 | alex.gronholm | set | messages: + msg413475 |
2022-02-18 04:54:22 | yselivanov | set | messages: + msg413466 |
2022-02-18 00:22:52 | gvanrossum | set | messages: + msg413465 |
2022-02-17 23:58:33 | gvanrossum | set | pull_requests: + pull_request29542 |
2022-02-17 22:20:19 | asvetlov | set | messages: + msg413461 |
2022-02-17 22:12:11 | asvetlov | set | messages: + msg413460 |
2022-02-17 21:51:26 | asvetlov | set | pull_requests: + pull_request29539 |
2022-02-17 19:27:39 | gvanrossum | set | messages: + msg413445 |
2022-02-17 16:53:22 | alex.gronholm | set | messages: + msg413415 |
2022-02-17 15:51:27 | tinchester | set | messages: + msg413411 |
2022-02-17 15:21:17 | asvetlov | set | messages: + msg413410 |
2022-02-17 15:11:11 | asvetlov | set | messages: + msg413409 |
2022-02-17 12:30:07 | tinchester | set | messages: + msg413402 |
2022-02-17 06:17:27 | chris.jerdonek | set | messages: + msg413389 |
2022-02-17 05:54:43 | gvanrossum | set | messages: + msg413388 |
2022-02-17 05:38:24 | chris.jerdonek | set | nosy:
+ chris.jerdonek messages: + msg413387 |
2022-02-17 01:49:37 | gvanrossum | set | messages: + msg413377 |
2022-02-17 01:44:16 | gvanrossum | set | messages: + msg413376 |
2022-02-17 01:05:51 | gvanrossum | set | messages: + msg413372 |
2022-02-17 00:41:38 | tinchester | set | messages: + msg413371 |
2022-02-17 00:30:49 | gvanrossum | set | messages: + msg413370 |
2022-02-17 00:18:03 | alex.gronholm | set | messages: + msg413369 |
2022-02-16 23:46:36 | tinchester | set | messages: + msg413368 |
2022-02-16 23:27:41 | gvanrossum | set | messages: + msg413367 |
2022-02-16 23:23:00 | tinchester | set | messages: + msg413366 |
2022-02-16 23:05:33 | tinchester | set | messages: + msg413365 |
2022-02-16 21:27:54 | gvanrossum | set | messages: + msg413361 |
2022-02-16 21:22:01 | iritkatriel | set | nosy:
+ ajoino |
2022-02-16 21:20:24 | alex.gronholm | set | messages: + msg413360 |
2022-02-16 21:14:11 | gvanrossum | set | messages: + msg413359 |
2022-02-16 21:10:57 | alex.gronholm | set | nosy:
- ajoino messages: + msg413358 |
2022-02-16 21:07:29 | ajoino | set | nosy:
+ ajoino |
2022-02-16 20:05:47 | jab | set | nosy:
+ jab |
2022-02-16 19:36:37 | gvanrossum | set | messages: + msg413356 |
2022-02-16 19:17:25 | gvanrossum | set | nosy:
+ tinchester |
2022-02-16 18:56:12 | alex.gronholm | set | nosy:
+ alex.gronholm, - tinchester messages: + msg413354 |
2022-02-16 18:19:57 | gvanrossum | set | messages: + msg413350 |
2022-02-16 18:15:59 | tinchester | set | messages: + msg413349 |
2022-02-16 18:12:48 | tinchester | set | nosy:
+ tinchester messages: + msg413347 |
2022-02-16 18:00:11 | gvanrossum | set | keywords:
+ patch stage: needs patch -> patch review pull_requests: + pull_request29526 |
2022-02-16 17:59:12 | gvanrossum | create |