classification
Title: exception chain cycles cause hangs (was "Exception handling with "await" can hang in Python3.9.0b1")
Type: behavior Stage: patch review
Components: Interpreter Core Versions: Python 3.10, Python 3.9, Python 3.8, Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: BTaskaya, Dennis Sweeney, aeros, asvetlov, carltongibson, chris.jerdonek, eamanu, felixxm, miss-islington, serhiy.storchaka, yselivanov
Priority: normal Keywords: patch

Created on 2020-05-20 08:08 by felixxm, last changed 2020-05-30 20:05 by serhiy.storchaka.

Pull Requests
URL Status Linked Edit
PR 20287 merged chris.jerdonek, 2020-05-21 09:00
PR 20321 closed miss-islington, 2020-05-22 20:33
PR 20322 merged miss-islington, 2020-05-22 21:14
PR 20539 open Dennis Sweeney, 2020-05-30 12:10
Messages (24)
msg369429 - (view) Author: Mariusz Felisiak (felixxm) * Date: 2020-05-20 08:08
We noticed a behavior change in Python3.9.0b1 (it works properly in Python3.9.0a6). One of our tests `handlers.tests.AsyncHandlerRequestTests.test_suspiciousop_in_view_returns_400`[1] hangs on `await`. `/suspicious/` is a view that raises a custom exception `SuspiciousOperation`.

[1] https://github.com/django/django/blob/8328811f048fed0dd22573224def8c65410c9f2e/tests/handlers/tests.py#L258-L260
msg369441 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-20 11:29
I'm getting close to tracking this down. There is a certain point in the code path of the Django test where `exc is exc.__context__` becomes True. I'm guessing this is what's causing the hang. I'll try to get a simple reproducer (there is a lot of Django code to sort through).
msg369449 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-20 12:06
FWIW, I found that the following hangs, but it also hangs on earlier versions of Python:

import traceback

try:
    raise RuntimeError
except Exception as exc:
    print(f'handling: {exc!r}')
    exc.__context__ = exc
    print('printing traceback')
    print(traceback.format_exc())  # hangs

Is this a separate bug? So maybe the issue is that the new code is letting things get into this state. Some of my changes added new chaining in various places, so that would fit (but still investigating).
msg369450 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-20 12:54
To start out sharing what I found in the Django code:

Here inside BaseHandler._get_response_async():
https://github.com/django/django/blob/3460ea49e839fd6bb924c48eaa1cd3d6dc888035/django/core/handlers/base.py#L226-L232

try:
    response = await wrapped_callback(request, *callback_args,
                                      **callback_kwargs)
except Exception as e:
    response = await sync_to_async(  # This line hangs.
        self.process_exception_by_middleware,
        thread_sensitive=True,
    )(e, request)

you can see an exception being handled, which is then passed to process_exception_by_middleware(). Process_exception_by_middleware() can wind up re-raising that same exception, which causes __context__ to be set circularly inside the except block:
https://github.com/django/django/blob/3460ea49e839fd6bb924c48eaa1cd3d6dc888035/django/core/handlers/base.py#L323-L332

If you boil this down, you get the following as a simple reproducer. This doesn't hang, but you can tell the difference by comparing exc2 to exc2.__context as indicated below:

import asyncio

async def process_exc(exc):
    raise exc

async def run():
    try:
        raise RuntimeError
    except Exception as exc:
        task = asyncio.create_task(process_exc(exc))
        try:
            await task
        except BaseException as exc2:
            # Prints True in 3.9.0b1 and False in 3.9.0a6.
            print(exc2 is exc2.__context__)

loop = asyncio.new_event_loop()
try:
    loop.run_until_complete(run())
finally:
    loop.close()
    
The cause is probably the following PR, which enabled exception chaining for gen.throw() in the yield from case:
https://github.com/python/cpython/pull/19858
So the answer might be to do some cycle detection when chaining the exception, which apparently _PyErr_ChainExceptions() doesn't do.
msg369451 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-20 13:08
The Django details might not matter so much at this point, but to add to something I said above: It might not only be process_exception_by_middleware() as I mentioned, but also asgiref's sync_to_async() function. In that function, you can see an already active exception being re-raised (here the exc_info comes from sys.exc_info()):

# If we have an exception, run the function inside the except block
# after raising it so exc_info is correctly populated.
if exc_info[1]:
    try:
        raise exc_info[1]
    except:
        return func(*args, **kwargs)
else:
    return func(*args, **kwargs)

https://github.com/django/asgiref/blob/edd0570a4f6e46f0948afa5ef197a192bb95b7b7/asgiref/sync.py#L306-L314
msg369461 - (view) Author: Batuhan Taskaya (BTaskaya) * (Python triager) Date: 2020-05-20 15:06
> Is this a separate bug? So maybe the issue is that the new code is letting things get into this state. Some of my changes added new chaining in various places, so that would fit (but still investigating).

Looks like there isn't a recursion guard on https://github.com/python/cpython/blob/e572c7f6dbe5397153803eab256e4a4ca3384f80/Python/errors.c#L143-L154

I'm not sure if this would be the real solution but, for this case, it works 
diff --git a/Python/errors.c b/Python/errors.c
index 3b42c1120b..ba3df483e2 100644
--- a/Python/errors.c
+++ b/Python/errors.c
@@ -141,8 +141,8 @@ _PyErr_SetObject(PyThreadState *tstate, PyObject *exception, PyObject *value)
            usually very short. Sensitive readers may try
            to inline the call to PyException_GetContext. */
         if (exc_value != value) {
-            PyObject *o = exc_value, *context;
-            while ((context = PyException_GetContext(o))) {
+            PyObject *o = exc_value, *context = NULL;
+            while (o != context && (context = PyException_GetContext(o))) {
                 Py_DECREF(context);
                 if (context == value) {
                     PyException_SetContext(o, NULL);
(END)
msg369482 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-21 00:06
I don't think that would be a real solution because it looks like that would cause the while loop always to loop at most once (which would defeat its purpose) -- because the loop ends with "o = context":

while ((context = PyException_GetContext(o))) {
    Py_DECREF(context);
    if (context == value) {
        PyException_SetContext(o, NULL);
        break;
    }
    o = context;
}
msg369483 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2020-05-21 00:11
Just a note, __context__ cycles can theoretically be longer than 2 nodes. I've encountered cycles like `exc.__context__.__context__.__context__ is exc` a few times in my life, typically resulting from some weird third-party libraries.

The only solution is to use a `set()` collection to track already visited exceptions.

To make it fast I propose to modify the code to:

1. Do a fast traverse with a regular while loop without tracking (no set())

2. If the number of iterations is longer than 100 and there's still no top context found -- go to (3)

3. Do a slower implementation with set()
msg369484 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-21 00:35
From a process perspective, I think we should probably pursue two PR's for this: one for the general issue that affects all Python versions (what Yury is talking about), and something narrower that addresses the 3.9.0b1 case that came up here. I'd like to focus on the latter first. Someone else is welcome to work on the more general issue while I'm doing that.

I'm not 100% sure if the more general issue should be a new bpo issue or not. I'm leaning towards separate because it is bigger and there are different decisions to be made around backporting, etc, but we should decide that now. If someone else agrees, I can create a new issue.
msg369485 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2020-05-21 00:37
> If someone else agrees, I can create a new issue.

I'd keep this one issue, but really up to you. I don't think I have time in the next few days to work on what I proposed but would be happy to brainstorm / review PRs.
msg369486 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-21 00:50
Okay, I'll keep it one issue then. Someone else is still welcome to work on the more general issue.

Note that there is some chance the narrower fix should happen independent of the more general fix. This is because _PyErr_ChainExceptions() (which is the call I added for the gen.throw() case) doesn't call the code path that we're discussing. Thus, cycles could still wind up being introduced at that call site.
msg369510 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-21 09:28
I just posted a draft PR that implements the narrower fix:
https://github.com/python/cpython/pull/20287
I confirmed that the Django test passes with it. I also included two regression tests: one using only generators, and one more like the Django test that awaits a task.

My solution was to update the exception context in gen_send_ex() using _PyErr_SetObject() instead of _PyErr_ChainExceptions() -- because _PyErr_SetObject() does the cycle detection we've been discussing, and _PyErr_ChainExceptions() doesn't.

While _PyErr_SetObject()'s cycle detection isn't complete in that it can't detect cycles that begin further down the chain, it's good enough for this case.
msg369514 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-21 09:49
Also, I just want to point out one thing about _PyErr_SetObject(). I believe it can detect cycles of arbitrary length (which is what the while loop is for). It's just that it can only detect cycles that involve the first node. So for things to fail with _PyErr_SetObject(), someone would need to mess with exceptions further down the chain. So I *think* hangs should be pretty unlikely with the narrower fix.
msg369557 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-22 04:14
Mariusz, someone may also want to review Django's code there. Raising an exception that would otherwise create a cycle in the chain could be obscuring another issue, or there could be more straightforward alternatives. (The Python issue will still be fixed of course.)
msg369642 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-22 20:33
New changeset 7c30d12bd5359b0f66c4fbc98aa055398bcc8a7e by Chris Jerdonek in branch 'master':
bpo-40696: Fix a hang that can arise after gen.throw() (GH-20287)
https://github.com/python/cpython/commit/7c30d12bd5359b0f66c4fbc98aa055398bcc8a7e
msg369651 - (view) Author: Mariusz Felisiak (felixxm) * Date: 2020-05-22 21:11
Chris, many thanks for detailed explanation, extensive investigation, and a fix!

We'll also review Django's code in the next few days.
msg369654 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-22 21:35
New changeset 7f77ac463cff219e0c8afef2611cad5080cc9df1 by Miss Islington (bot) in branch '3.9':
bpo-40696: Fix a hang that can arise after gen.throw() (GH-20287)
https://github.com/python/cpython/commit/7f77ac463cff219e0c8afef2611cad5080cc9df1
msg369655 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2020-05-22 21:45
Good to hear, Mariusz! And thanks for the report!

Also, as discussed above, I'm leaving this issue open (and retitling) until the following more general issue is fixed:

try:
    raise RuntimeError
except Exception as exc:
    print(f'handling: {exc!r}')
    exc.__context__ = exc
    raise ValueError  # hangs

As I mentioned above, I believe this is because _PyErr_SetObject() only checks for cycles that involve the exception being raised. In this case, the cycle involves the exception one further down:
ValueError -> RuntimeError -> RuntimeError -> RuntimeError -> ...
msg369662 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-05-22 22:27
Wouldn't Floyd's or Brent's cycle detection algorithms be better here than the allocation of a new set? I believe they might also eliminate the need to fast-path the first 100 or however many.

As in: https://en.wikipedia.org/wiki/Cycle_detection
msg369947 - (view) Author: Mariusz Felisiak (felixxm) * Date: 2020-05-26 06:31
FYI, I created fix https://github.com/django/django/pull/12978 for view's exception handling in Django.
msg370383 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-05-30 12:35
I believe PR 20539 solves the more general problem (using Floyd's Tortoise and Hare Algorithm), and I would appreciate review / some more ideas for test cases.
msg370385 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-05-30 14:01
I it related to issue25782?
msg370398 - (view) Author: Dennis Sweeney (Dennis Sweeney) * Date: 2020-05-30 19:38
> I it related to issue25782?

Yes -- I didn't see that issue. I'm a little confused about the resolution of that issue though.

For clarification, the existing behavior on master:
    When trying to raise the exception H,
        F -> G -> H -> I -> NULL
    becomes
        H -> F -> G -> NULL

    But when trying to set the exception A on top of
        B -> C -> D -> E -> C -> ...,
        it gets stuck in an infinite loop from the existing cycle.

My PR keeps the first behavior and resolves the infinite loop by making it
    A -> B -> C -> D -> E -> NULL,
    which seems consistent with the existing behavior.

So it should be strictly a bugfix.
msg370399 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-05-30 20:05
In issue25782 two different solutions for general problem was proposed.

    When trying to raise the exception H,
        F -> G -> H -> I -> NULL
    with Yury's patch you would get
        H -> NULL
    and with my path you would get
        H -> F -> G -> I -> NULL

The issue was closed because we did not have realistic example when raising exception from the __context__ list would be reasonable and the more concrete original issue was fixed in other way. Now we have other example of creating cycles with __context__, so we can reopen yhat issue and decide what solution is better.
History
Date User Action Args
2020-05-30 20:05:30serhiy.storchakasetmessages: + msg370399
2020-05-30 19:38:45Dennis Sweeneysetmessages: + msg370398
2020-05-30 14:01:41serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg370385
2020-05-30 12:35:17Dennis Sweeneysetmessages: + msg370383
2020-05-30 12:10:31Dennis Sweeneysetstage: needs patch -> patch review
pull_requests: + pull_request19782
2020-05-26 06:31:51felixxmsetmessages: + msg369947
2020-05-22 22:27:10Dennis Sweeneysetnosy: + Dennis Sweeney
messages: + msg369662
2020-05-22 21:47:00chris.jerdoneksetstage: patch review -> needs patch
versions: + Python 3.7, Python 3.8
2020-05-22 21:45:19chris.jerdoneksetmessages: + msg369655
title: Exception handling with "await" can hang in Python3.9.0b1 -> exception chain cycles cause hangs (was "Exception handling with "await" can hang in Python3.9.0b1")
2020-05-22 21:35:26chris.jerdoneksetmessages: + msg369654
2020-05-22 21:14:36miss-islingtonsetpull_requests: + pull_request19591
2020-05-22 21:11:16felixxmsetmessages: + msg369651
2020-05-22 20:33:41miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request19590
2020-05-22 20:33:41chris.jerdoneksetmessages: + msg369642
2020-05-22 04:14:19chris.jerdoneksetmessages: + msg369557
2020-05-21 12:18:46chris.jerdoneksettitle: "await" hangs in Python3.9.0b1. -> Exception handling with "await" can hang in Python3.9.0b1
type: behavior
components: + Interpreter Core, - asyncio
versions: + Python 3.10
2020-05-21 09:49:08chris.jerdoneksetmessages: + msg369514
2020-05-21 09:28:45chris.jerdoneksetmessages: + msg369510
2020-05-21 09:00:01chris.jerdoneksetkeywords: + patch
stage: patch review
pull_requests: + pull_request19561
2020-05-21 00:50:42chris.jerdoneksetmessages: + msg369486
2020-05-21 00:37:32yselivanovsetmessages: + msg369485
2020-05-21 00:35:38chris.jerdoneksetmessages: + msg369484
2020-05-21 00:11:30yselivanovsetmessages: + msg369483
2020-05-21 00:06:14chris.jerdoneksetmessages: + msg369482
2020-05-20 15:06:56BTaskayasetnosy: + BTaskaya
messages: + msg369461
2020-05-20 13:08:01chris.jerdoneksetmessages: + msg369451
2020-05-20 12:54:49chris.jerdoneksetmessages: + msg369450
2020-05-20 12:06:32chris.jerdoneksetmessages: + msg369449
2020-05-20 11:55:49eamanusetnosy: + eamanu
2020-05-20 11:29:47chris.jerdoneksetmessages: + msg369441
2020-05-20 11:15:56pablogsalsetnosy: + aeros
2020-05-20 09:12:45chris.jerdoneksetnosy: + chris.jerdonek
2020-05-20 08:08:03felixxmcreate