Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asyncio.gather drops cancellation #71110

Closed
JohannesEbke mannequin opened this issue May 3, 2016 · 13 comments
Closed

asyncio.gather drops cancellation #71110

JohannesEbke mannequin opened this issue May 3, 2016 · 13 comments
Labels
topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@JohannesEbke
Copy link
Mannequin

JohannesEbke mannequin commented May 3, 2016

BPO 26923
Nosy @gvanrossum, @vstinner, @MartinAltmayer, @1st1, @JohannesEbke, @tatellos
Files
  • cancellation_test.py: Test case demonstrating ineffective cancellation.
  • asyncio_task_prints.patch
  • fix_and_test_26923.patch: Correct return value of _GatheringFuture.cancel with test
  • fix_and_test_26923_reviewed.patch: Correct return value of _GatheringFuture.cancel with test
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-10-21.21:26:49.876>
    created_at = <Date 2016-05-03.12:50:06.006>
    labels = ['type-bug', 'expert-asyncio']
    title = 'asyncio.gather drops cancellation'
    updated_at = <Date 2016-10-21.21:26:49.875>
    user = 'https://github.com/JohannesEbke'

    bugs.python.org fields:

    activity = <Date 2016-10-21.21:26:49.875>
    actor = 'yselivanov'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-10-21.21:26:49.876>
    closer = 'yselivanov'
    components = ['asyncio']
    creation = <Date 2016-05-03.12:50:06.006>
    creator = 'JohannesEbke'
    dependencies = []
    files = ['42694', '43397', '43485', '43508']
    hgrepos = []
    issue_num = 26923
    keywords = ['patch']
    message_count = 13.0
    messages = ['264719', '264722', '264727', '268537', '268596', '268609', '268743', '268893', '268928', '268930', '269063', '279157', '279158']
    nosy_count = 7.0
    nosy_names = ['gvanrossum', 'vstinner', 'MartinAltmayer', 'python-dev', 'yselivanov', 'JohannesEbke', 'tatellos']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue26923'
    versions = ['Python 3.5']

    @JohannesEbke
    Copy link
    Mannequin Author

    JohannesEbke mannequin commented May 3, 2016

    In a very specific case, asyncio.gather causes a cancel() call on the Task waiting on the _GatheringFuture to return a false positive. An example program demonstrating this is attached.

    The context: asyncio.gather creates a _GatheringFuture with a list of child Futures. Each child Future carries a done callback that updates the _GatheringFuture, and once the last child done callback is executed, the _GatheringFuture sets its result.

    The specific situation: When the last child future changes state to done it schedules its done callbacks, but does not immediately execute them. If the Task waiting on the gather is then cancelled before the last child done callback has run, the cancel method in asyncio/tasks.py:578 determines that the _GatheringFuture inspects itself and sees that it is not done() yet, and proceeds to cancel all child futures - which has no effect, since all of them are already done(). It still returns True, so the Tasks thinks all is well, and proceeds with its execution.

    The behaviour I would expect is that if cancel() is called on the _GatheringFuture, and all children return False on cancel(), then _GatheringFuture.cancel() should also return False, i.e.:

    def cancel(self):
        if self.done():
            return False
        at_least_one_child_cancelled = False
        for child in self._children:
            if child.cancel():
                at_least_one_child_cancelled = True
        return at_least_one_child_cancelled

    If I replace _GatheringFuture.cancel with the above variant, the bug disappears for me.

    More context: We hit this bug sporadically in an integration test of aioredis, where some timings conspired to make it appear with a chance of about 1 in 10. The minimal example calls the cancellation in a done_callback, so that it always hits the window. This was not the way the bug was discovered.

    @JohannesEbke JohannesEbke mannequin added topic-asyncio type-bug An unexpected behavior, bug, or error labels May 3, 2016
    @vstinner
    Copy link
    Member

    vstinner commented May 3, 2016

    The doc:
    https://docs.python.org/dev/library/asyncio-task.html#asyncio.gather
    "Cancellation: if the outer Future is cancelled, all children (that have not completed yet) are also cancelled. If any child is cancelled, this is treated as if it raised CancelledError – the outer Future is not cancelled in this case. (This is to prevent the cancellation of one child to cause other children to be cancelled.)"

    (Sorry, I didn't read your issue yet.)

    @JohannesEbke
    Copy link
    Mannequin Author

    JohannesEbke mannequin commented May 3, 2016

    Thank you for providing the relevant documentation link.

    I just noticed that it should probably be clarified that in case the outer Future is cancelled, and all children that are not already done ignore the cancellation (a.k.a. catch the CancelledError), the cancellation of the outer Future does not continue.

    This is different to the behaviour of asyncio.wait_for, which always raises a CancelledError.

    @MartinAltmayer
    Copy link
    Mannequin

    MartinAltmayer mannequin commented Jun 14, 2016

    I don't think this is a mere documentation problem: If a future cannot be cancelled because it is already done, cancel must return False.

    As Johannes' example demonstrates, a wrong return value from cancel might lead to a cancelled task being continued as if nothing happened: If Task.cancel receives a false positive from its _fut_waiter, it will not throw a CancelledError into the task (_must_cancel=True), but simply continue the task.

    @gvanrossum
    Copy link
    Member

    I think I agree with Johannes. If all children refuse to be cancelled because they are already done, the outer _GatheringFuture might as well refuse to be cancelled as well.

    However I'm not sure I actually understand the mechanism whereby the calling Task ends up surviving, and Johannes' description appears garbled.

    Can anyone add some print statements to various parts and explain it here?

    @JohannesEbke
    Copy link
    Mannequin Author

    JohannesEbke mannequin commented Jun 15, 2016

    On rereading my original description, it really is not clearly described why the calling Task ends up surviving.

    Attached is a patch to the 3.5.1 asyncio/tasks.py which adds some print statements in Task.cancel().

    If I execute the cancellation_test.py with the patch applied, the output looks like this:

    ------------------------
    Cancelling the task: <Task pending coro=<gather_test() running at cancellation_test.py:20> wait_for=<_GatheringFuture pending cb=[Task._wakeup()]> cb=[Task._wakeup()]>
    Entered Task.cancel() for task <Task pending coro=<gather_test() running at cancellation_test.py:20> wait_for=<_GatheringFuture pending cb=[Task._wakeup()]> cb=[Task._wakeup()]>
    Task is not done() and we have a _fut_waiter: Cancelling fut_waiter <_GatheringFuture pending cb=[Task._wakeup()]>
    Entered Task.cancel() for task <Task finished coro=<sleep() done, defined at /home/ebkej/.virtualenvs/kialo/lib/python3.5/asyncio/tasks.py:510> result=None>
    Task is done(), refusing to cancel
    Great, _fut_waiter has agreed to be cancelled. We can now also return True
    Cancellation returned: True
    All children finished OK, setting _GatheringFuture results!
    Finished gathering.
    Proof that this is running even though it was cancelled
    ------------------------

    The Task keeps on running because Task.cancel() trusts its _fut_waiter task to handle the cancellation correctly if its cancel() method returns True. If it returns False, it handles the Cancellation itself.

    In this case, that _fut_waiter continues on, and proceeds to set results etc. so that the calling tasks cannot distinguish this from a CancellationError which has been deliberately caught.

    I hope this explanation is a bit clearer than the first one.

    @gvanrossum
    Copy link
    Member

    Thanks! I had eventually pieced together the same explanation. So yes, I
    think your fix is right, though I would write it like this:

            ret = False
            for child in self._children:
                ret |= child.cancel()
            return ret  # True if at least one child.cancel() call returned True

    It would also be nice if there was a test for this behavior.

    I keep worrying a bit -- a similar bug could exist in other pieces of the
    code, or in other libraries. But I guess we can't do much about that.

    @JohannesEbke
    Copy link
    Mannequin Author

    JohannesEbke mannequin commented Jun 20, 2016

    Right, that's neater. Attached is a patch with your version and a test. I checked that it fails with the old version of cancel() and passes with the new one.

    Concerning possible other bugs, I've had a look in the standard library, but could not find another instance where Future.cancel() is overwritten and has special handling. I also had a look at the try/except Exception blocks in lib/asyncio, but possible Cancellations are handled correctly there.

    I believe the main source of bugs in this context will probably be other asyncio-based libraries. Either by inadvertently catching CancellationErrors in a try/except Exception block and treating them like other errors, or by not protecting resources with try/finally across yield points which might throw a CancellationError.

    Not all libraries use cancel() internally, so the authors might not be aware that they have to write "cancellation-safe" code.

    @gvanrossum
    Copy link
    Member

    @yury, any comments? The fix looks fine to me.

    @1st1
    Copy link
    Member

    1st1 commented Jun 20, 2016

    @yury, any comments? The fix looks fine to me.

    I've left a comment in code review. Other than that, the patch/approach looks good.

    @JohannesEbke
    Copy link
    Mannequin Author

    JohannesEbke mannequin commented Jun 22, 2016

    Attached is a new version of the patch incorporating the review results.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 21, 2016

    New changeset b1aa485fad1b by Yury Selivanov in branch '3.5':
    Issue bpo-26923: Fix asyncio.Gather to refuse being cancelled once all children are done.
    https://hg.python.org/cpython/rev/b1aa485fad1b

    New changeset b0af901b1e2a by Yury Selivanov in branch '3.6':
    Merge 3.5 (issue bpo-26923)
    https://hg.python.org/cpython/rev/b0af901b1e2a

    New changeset 2a228b03ea16 by Yury Selivanov in branch 'default':
    Merge 3.6 (issue bpo-26923)
    https://hg.python.org/cpython/rev/2a228b03ea16

    @1st1
    Copy link
    Member

    1st1 commented Oct 21, 2016

    Thank you, Johannes!

    @1st1 1st1 closed this as completed Oct 21, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    topic-asyncio type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants