Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exception handling with "await" can hang in Python3.9.0b1 #84873

Closed
felixxm mannequin opened this issue May 20, 2020 · 26 comments
Closed

Exception handling with "await" can hang in Python3.9.0b1 #84873

felixxm mannequin opened this issue May 20, 2020 · 26 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@felixxm
Copy link
Mannequin

felixxm mannequin commented May 20, 2020

BPO 40696
Nosy @asvetlov, @cjerdonek, @serhiy-storchaka, @1st1, @eamanu, @miss-islington, @isidentical, @sweeneyde, @aeros, @carltongibson, @felixxm, @iritkatriel
PRs
  • bpo-40696: Fix a hang that can arise after gen.throw() #20287
  • [3.9] bpo-40696: Fix a hang that can arise after gen.throw() (GH-20287) #20321
  • [3.9] bpo-40696: Fix a hang that can arise after gen.throw() (GH-20287) #20322
  • bpo-25782: Do not hang when exception contexts form a cycle. #20539
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-08-08.17:15:33.526>
    created_at = <Date 2020-05-20.08:08:03.800>
    labels = ['interpreter-core', 'type-bug', '3.8', '3.9', '3.10', '3.7']
    title = 'Exception handling with "await" can hang in Python3.9.0b1'
    updated_at = <Date 2021-08-08.17:15:33.525>
    user = 'https://github.com/felixxm'

    bugs.python.org fields:

    activity = <Date 2021-08-08.17:15:33.525>
    actor = 'chris.jerdonek'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-08-08.17:15:33.526>
    closer = 'chris.jerdonek'
    components = ['Interpreter Core']
    creation = <Date 2020-05-20.08:08:03.800>
    creator = 'felixxm'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40696
    keywords = ['patch']
    message_count = 26.0
    messages = ['369429', '369441', '369449', '369450', '369451', '369461', '369482', '369483', '369484', '369485', '369486', '369510', '369514', '369557', '369642', '369651', '369654', '369655', '369662', '369947', '370383', '370385', '370398', '370399', '399082', '399221']
    nosy_count = 12.0
    nosy_names = ['asvetlov', 'chris.jerdonek', 'serhiy.storchaka', 'yselivanov', 'eamanu', 'miss-islington', 'BTaskaya', 'Dennis Sweeney', 'aeros', 'carltongibson', 'felixxm', 'iritkatriel']
    pr_nums = ['20287', '20321', '20322', '20539']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue40696'
    versions = ['Python 3.7', 'Python 3.8', 'Python 3.9', 'Python 3.10']

    @felixxm
    Copy link
    Mannequin Author

    felixxm mannequin commented May 20, 2020

    We noticed a behavior change in Python3.9.0b1 (it works properly in Python3.9.0a6). One of our tests handlers.tests.AsyncHandlerRequestTests.test_suspiciousop_in_view_returns_400[1] hangs on await. /suspicious/ is a view that raises a custom exception SuspiciousOperation.

    [1] https://github.com/django/django/blob/8328811f048fed0dd22573224def8c65410c9f2e/tests/handlers/tests.py#L258-L260

    @felixxm felixxm mannequin added topic-asyncio 3.9 only security fixes labels May 20, 2020
    @cjerdonek
    Copy link
    Member

    I'm getting close to tracking this down. There is a certain point in the code path of the Django test where exc is exc.__context__ becomes True. I'm guessing this is what's causing the hang. I'll try to get a simple reproducer (there is a lot of Django code to sort through).

    @cjerdonek
    Copy link
    Member

    FWIW, I found that the following hangs, but it also hangs on earlier versions of Python:

    import traceback

    try:
    raise RuntimeError
    except Exception as exc:
    print(f'handling: {exc!r}')
    exc.__context__ = exc
    print('printing traceback')
    print(traceback.format_exc()) # hangs

    Is this a separate bug? So maybe the issue is that the new code is letting things get into this state. Some of my changes added new chaining in various places, so that would fit (but still investigating).

    @cjerdonek
    Copy link
    Member

    To start out sharing what I found in the Django code:

    Here inside BaseHandler._get_response_async():
    https://github.com/django/django/blob/3460ea49e839fd6bb924c48eaa1cd3d6dc888035/django/core/handlers/base.py#L226-L232

    try:
    response = await wrapped_callback(request, *callback_args,
    **callback_kwargs)
    except Exception as e:
    response = await sync_to_async( # This line hangs.
    self.process_exception_by_middleware,
    thread_sensitive=True,
    )(e, request)

    you can see an exception being handled, which is then passed to process_exception_by_middleware(). Process_exception_by_middleware() can wind up re-raising that same exception, which causes __context__ to be set circularly inside the except block:
    https://github.com/django/django/blob/3460ea49e839fd6bb924c48eaa1cd3d6dc888035/django/core/handlers/base.py#L323-L332

    If you boil this down, you get the following as a simple reproducer. This doesn't hang, but you can tell the difference by comparing exc2 to exc2.__context as indicated below:

    import asyncio
    
    async def process_exc(exc):
        raise exc
    
    async def run():
        try:
            raise RuntimeError
        except Exception as exc:
            task = asyncio.create_task(process_exc(exc))
            try:
                await task
            except BaseException as exc2:
                # Prints True in 3.9.0b1 and False in 3.9.0a6.
                print(exc2 is exc2.__context__)
    
    loop = asyncio.new_event_loop()
    try:
        loop.run_until_complete(run())
    finally:
        loop.close()
        
    The cause is probably the following PR, which enabled exception chaining for gen.throw() in the yield from case:
    https://github.com/python/cpython/pull/19858
    So the answer might be to do some cycle detection when chaining the exception, which apparently _PyErr_ChainExceptions() doesn't do.

    @cjerdonek
    Copy link
    Member

    The Django details might not matter so much at this point, but to add to something I said above: It might not only be process_exception_by_middleware() as I mentioned, but also asgiref's sync_to_async() function. In that function, you can see an already active exception being re-raised (here the exc_info comes from sys.exc_info()):

    # If we have an exception, run the function inside the except block
    # after raising it so exc_info is correctly populated.
    if exc_info[1]:
        try:
            raise exc_info[1]
        except:
            return func(*args, **kwargs)
    else:
        return func(*args, **kwargs)

    https://github.com/django/asgiref/blob/edd0570a4f6e46f0948afa5ef197a192bb95b7b7/asgiref/sync.py#L306-L314

    @isidentical
    Copy link
    Sponsor Member

    Is this a separate bug? So maybe the issue is that the new code is letting things get into this state. Some of my changes added new chaining in various places, so that would fit (but still investigating).

    Looks like there isn't a recursion guard on

    cpython/Python/errors.c

    Lines 143 to 154 in e572c7f

    if (exc_value != value) {
    PyObject *o = exc_value, *context;
    while ((context = PyException_GetContext(o))) {
    Py_DECREF(context);
    if (context == value) {
    PyException_SetContext(o, NULL);
    break;
    }
    o = context;
    }
    PyException_SetContext(value, exc_value);
    }

    I'm not sure if this would be the real solution but, for this case, it works 
    diff --git a/Python/errors.c b/Python/errors.c
    index 3b42c1120b..ba3df483e2 100644
    --- a/Python/errors.c
    +++ b/Python/errors.c
    @@ -141,8 +141,8 @@ _PyErr_SetObject(PyThreadState *tstate, PyObject *exception, PyObject *value)
                usually very short. Sensitive readers may try
                to inline the call to PyException_GetContext. */
             if (exc_value != value) {
    -            PyObject *o = exc_value, *context;
    -            while ((context = PyException_GetContext(o))) {
    +            PyObject *o = exc_value, *context = NULL;
    +            while (o != context && (context = PyException_GetContext(o))) {
                     Py_DECREF(context);
                     if (context == value) {
                         PyException_SetContext(o, NULL);
    (END)

    @cjerdonek
    Copy link
    Member

    I don't think that would be a real solution because it looks like that would cause the while loop always to loop at most once (which would defeat its purpose) -- because the loop ends with "o = context":

    while ((context = PyException_GetContext(o))) {
        Py_DECREF(context);
        if (context == value) {
            PyException_SetContext(o, NULL);
            break;
        }
        o = context;
    }

    @1st1
    Copy link
    Member

    1st1 commented May 21, 2020

    Just a note, context cycles can theoretically be longer than 2 nodes. I've encountered cycles like exc.__context__.__context__.__context__ is exc a few times in my life, typically resulting from some weird third-party libraries.

    The only solution is to use a set() collection to track already visited exceptions.

    To make it fast I propose to modify the code to:

    1. Do a fast traverse with a regular while loop without tracking (no set())

    2. If the number of iterations is longer than 100 and there's still no top context found -- go to (3)

    3. Do a slower implementation with set()

    @cjerdonek
    Copy link
    Member

    From a process perspective, I think we should probably pursue two PR's for this: one for the general issue that affects all Python versions (what Yury is talking about), and something narrower that addresses the 3.9.0b1 case that came up here. I'd like to focus on the latter first. Someone else is welcome to work on the more general issue while I'm doing that.

    I'm not 100% sure if the more general issue should be a new bpo issue or not. I'm leaning towards separate because it is bigger and there are different decisions to be made around backporting, etc, but we should decide that now. If someone else agrees, I can create a new issue.

    @1st1
    Copy link
    Member

    1st1 commented May 21, 2020

    If someone else agrees, I can create a new issue.

    I'd keep this one issue, but really up to you. I don't think I have time in the next few days to work on what I proposed but would be happy to brainstorm / review PRs.

    @cjerdonek
    Copy link
    Member

    Okay, I'll keep it one issue then. Someone else is still welcome to work on the more general issue.

    Note that there is some chance the narrower fix should happen independent of the more general fix. This is because _PyErr_ChainExceptions() (which is the call I added for the gen.throw() case) doesn't call the code path that we're discussing. Thus, cycles could still wind up being introduced at that call site.

    @cjerdonek
    Copy link
    Member

    I just posted a draft PR that implements the narrower fix:
    #20287
    I confirmed that the Django test passes with it. I also included two regression tests: one using only generators, and one more like the Django test that awaits a task.

    My solution was to update the exception context in gen_send_ex() using _PyErr_SetObject() instead of _PyErr_ChainExceptions() -- because _PyErr_SetObject() does the cycle detection we've been discussing, and _PyErr_ChainExceptions() doesn't.

    While _PyErr_SetObject()'s cycle detection isn't complete in that it can't detect cycles that begin further down the chain, it's good enough for this case.

    @cjerdonek
    Copy link
    Member

    Also, I just want to point out one thing about _PyErr_SetObject(). I believe it can detect cycles of arbitrary length (which is what the while loop is for). It's just that it can only detect cycles that involve the first node. So for things to fail with _PyErr_SetObject(), someone would need to mess with exceptions further down the chain. So I *think* hangs should be pretty unlikely with the narrower fix.

    @cjerdonek cjerdonek added 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed topic-asyncio labels May 21, 2020
    @cjerdonek cjerdonek changed the title "await" hangs in Python3.9.0b1. Exception handling with "await" can hang in Python3.9.0b1 May 21, 2020
    @cjerdonek cjerdonek added type-bug An unexpected behavior, bug, or error 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed topic-asyncio labels May 21, 2020
    @cjerdonek cjerdonek changed the title "await" hangs in Python3.9.0b1. Exception handling with "await" can hang in Python3.9.0b1 May 21, 2020
    @cjerdonek cjerdonek added the type-bug An unexpected behavior, bug, or error label May 21, 2020
    @cjerdonek
    Copy link
    Member

    Mariusz, someone may also want to review Django's code there. Raising an exception that would otherwise create a cycle in the chain could be obscuring another issue, or there could be more straightforward alternatives. (The Python issue will still be fixed of course.)

    @cjerdonek
    Copy link
    Member

    New changeset 7c30d12 by Chris Jerdonek in branch 'master':
    bpo-40696: Fix a hang that can arise after gen.throw() (GH-20287)
    7c30d12

    @felixxm
    Copy link
    Mannequin Author

    felixxm mannequin commented May 22, 2020

    Chris, many thanks for detailed explanation, extensive investigation, and a fix!

    We'll also review Django's code in the next few days.

    @cjerdonek
    Copy link
    Member

    New changeset 7f77ac4 by Miss Islington (bot) in branch '3.9':
    bpo-40696: Fix a hang that can arise after gen.throw() (GH-20287)
    7f77ac4

    @cjerdonek
    Copy link
    Member

    Good to hear, Mariusz! And thanks for the report!

    Also, as discussed above, I'm leaving this issue open (and retitling) until the following more general issue is fixed:

    try:
    raise RuntimeError
    except Exception as exc:
    print(f'handling: {exc!r}')
    exc.__context__ = exc
    raise ValueError # hangs

    As I mentioned above, I believe this is because _PyErr_SetObject() only checks for cycles that involve the exception being raised. In this case, the cycle involves the exception one further down:
    ValueError -> RuntimeError -> RuntimeError -> RuntimeError -> ...

    @cjerdonek cjerdonek changed the title Exception handling with "await" can hang in Python3.9.0b1 exception chain cycles cause hangs (was "Exception handling with "await" can hang in Python3.9.0b1") May 22, 2020
    @cjerdonek cjerdonek changed the title Exception handling with "await" can hang in Python3.9.0b1 exception chain cycles cause hangs (was "Exception handling with "await" can hang in Python3.9.0b1") May 22, 2020
    @cjerdonek cjerdonek added 3.7 (EOL) end of life 3.8 only security fixes labels May 22, 2020
    @sweeneyde
    Copy link
    Member

    Wouldn't Floyd's or Brent's cycle detection algorithms be better here than the allocation of a new set? I believe they might also eliminate the need to fast-path the first 100 or however many.

    As in: https://en.wikipedia.org/wiki/Cycle_detection

    @felixxm
    Copy link
    Mannequin Author

    felixxm mannequin commented May 26, 2020

    FYI, I created fix django/django#12978 for view's exception handling in Django.

    @sweeneyde
    Copy link
    Member

    I believe PR 20539 solves the more general problem (using Floyd's Tortoise and Hare Algorithm), and I would appreciate review / some more ideas for test cases.

    @serhiy-storchaka
    Copy link
    Member

    I it related to bpo-25782?

    @sweeneyde
    Copy link
    Member

    I it related to bpo-25782?

    Yes -- I didn't see that issue. I'm a little confused about the resolution of that issue though.

    For clarification, the existing behavior on master:
    When trying to raise the exception H,
    F -> G -> H -> I -> NULL
    becomes
    H -> F -> G -> NULL

    But when trying to set the exception A on top of
        B -> C -> D -> E -> C -> ...,
        it gets stuck in an infinite loop from the existing cycle.
    

    My PR keeps the first behavior and resolves the infinite loop by making it
    A -> B -> C -> D -> E -> NULL,
    which seems consistent with the existing behavior.

    So it should be strictly a bugfix.

    @serhiy-storchaka
    Copy link
    Member

    In bpo-25782 two different solutions for general problem was proposed.

    When trying to raise the exception H,
        F -> G -> H -> I -> NULL
    with Yury's patch you would get
        H -> NULL
    and with my path you would get
        H -> F -> G -> I -> NULL
    

    The issue was closed because we did not have realistic example when raising exception from the __context__ list would be reasonable and the more concrete original issue was fixed in other way. Now we have other example of creating cycles with __context__, so we can reopen yhat issue and decide what solution is better.

    @iritkatriel
    Copy link
    Member

    I think this can be closed now because the specific problem was fixed in PR20287 and the more general problem is the subject of bpo-25782.

    @cjerdonek
    Copy link
    Member

    Okay, I'll close the issue and update it to reflect that it was restricted to the narrower, originally reported issue.

    @cjerdonek cjerdonek changed the title exception chain cycles cause hangs (was "Exception handling with "await" can hang in Python3.9.0b1") Exception handling with "await" can hang in Python3.9.0b1 Aug 8, 2021
    @cjerdonek cjerdonek changed the title exception chain cycles cause hangs (was "Exception handling with "await" can hang in Python3.9.0b1") Exception handling with "await" can hang in Python3.9.0b1 Aug 8, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants