classification
Title: Deprecate sys.set_coroutine_wrapper and replace it with more focused API(s)
Type: enhancement Stage: resolved
Components: asyncio, Interpreter Core Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: asvetlov, giampaolo.rodola, gvanrossum, njs, twouters, yselivanov
Priority: normal Keywords: patch

Created on 2018-01-18 07:09 by njs, last changed 2018-05-31 10:20 by twouters. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 5250 merged njs, 2018-01-20 08:57
PR 5263 merged yselivanov, 2018-01-21 21:08
PR 5337 merged njs, 2018-01-26 05:37
PR 5412 merged njs, 2018-01-29 06:59
PR 6536 merged twouters, 2018-04-19 10:43
Messages (20)
msg310226 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-01-18 07:09
sys.set_coroutine_wrapper is a problematic API. It was added to solve a specific problem: asyncio debug mode wanting to track where coroutine objects are created, so that when unawaited coroutines are GC'ed, the warning they print can include a traceback. But to do this, it adds a *very* generic hook that lets you hook into the coroutine creation code to replace all instances of a built-in type with an arbitrary new type, which is highly unusual. It's hard to use correctly (see bpo-30578, and bpo-24342 before it). It's hard to use without creating performance problems, since anything you do here tends to add overhead to every async function call, and in many cases also every async function frame suspend/resume cycle.

Fortunately, it's explicitly documented as provisional. (I think Yury intentionally excluded it from the stabilization of the rest of asyncio and async/await, because of the reasons above.) And the things we *actually* want to do here are very simple. The only known use cases are the one mentioned above (see asyncio.coroutines.CoroWrapper), and the one in bpo-30491. So after discussions with Yury, I'm proposing to migrate those both into the interpreter as directly usable, fast, built-in features that can be used individually or together, and deprecate sys.set_coroutine_wrapper.

See bpo-30491 for details on that use case; here I'll discuss my plan for replacing the "coroutine origin tracking" that asyncio debug mode does.

We add a new function sys.set_coroutine_origin_tracking(depth), and I guess sys.get_coroutine_origin_tracking() because why not. The depth is thread-local, and defaults to 0, which means that newly created coroutines don't capture any origin info.

When a coroutine is created, it will check the current origin_tracking depth, and capture that many frames of traceback information. Like the current asyncio debug code and unlike real tracebacks, we won't capture actual frame objects -- we'll just capture (filename, lineno, function) information, to avoid pinning objects in memory. This is a debug helper, so it should avoid changing semantics whenever possible.

Later, we can retrieve the captured information by accessing the new attribute coro.origin.

In addition, the code in CoroutineType.__del__ that currently emits a warning for unawaited coroutines, will be modified to check for whether 'self' has origin information, and print it if so, similar to the current asyncio debug wrapper.

Some particular points where I'd appreciate feedback:

Should we add an envvar to configure coroutine source tracking? What about an -X switch?

Should it be coro.origin or coro.cr_origin? On the one hand the cr_ prefixing is super annoying and I think we should probably get rid of it; on the other maybe we should keep it here so things stay consistent and then have a separate the debate about removing it in general.

Most important: what format should we use for storing the origin information? I can see three plausible approaches:

1. Call traceback.StackSummary.extract. This means re-entering the Python interpreter from inside the coroutine setup code, which makes me a little nervous. I guess there can't be any real re-entrancy issues here because if there were, set_coroutine_wrapper would already be hitting them. Can it cause problems during shutdown when the traceback module might have disappeared? (Not that anyone's likely to be instantiating coroutines at that point.) It also might be a bit slower than the other options below, but this isn't clear. OTOH, it has the benefit that coro.origin could be a full-fledged StackSummary with all its features, like traceback printing. Which honestly feels a little weird to me, because I know coroutine objects are built-in and StackSummary objects aren't, but it would certainly be convenient.

2. Capture (filename, lineno, function) directly in a packed struct, similar to how tracemalloc works. Then coro.origin can unpack this into a list of tuples or whatever. This is definitely very fast, and avoids re-entering Python. In practice we'd probably still end up needing to re-enter Python and using the traceback module when it came time to print a warning, though, because traceback printing is complicated and we don't want to reimplement it. Also, if the origin comes from a custom importer like zipimport, then we may not be able to look up the line information later, because that requires access to frame.f_globals["__loader__"].

3. Like (2), but insert a call to linecache.lazycache for each origin frame we capture. This would fix option (2)'s issue with printing source lines, but it means re-entering Python as well, so at this point maybe it'd just be simpler to go with option (1).

Given this analysis I guess I'm leaning towards just calling StackSummary.extract, but I don't feel like I fully understand the consequences of calling into a Python module like that so want to hear what others think.
msg310250 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-18 16:28
+1 on on the API and for adding it in 3.7 and for deprecating set_coroutine_wrapepr.  This will simplify asyncio code and other event loops (like uvloop) and will lead to less overhead on asyncio debug mode.

I always disliked the set_coroutine_wrapper API.  Even at the time when I added it to PEP 492 I understood that we'll have to replace it eventually with something more sensible.

I think we should start working on a patch.  Nathaniel, do you have time to champion the work?

> Most important: what format should we use for storing the origin information? I can see three plausible approaches:

I'd be strongly against calling `traceback.StackSummary.extract` when coroutine object is created.  Let's go with the option 2.  You can create a list of tuple objects without the need of inventing a new struct.  coroutine.cr_origin (let's have the "cr_" prefix; dropping the prefixes should be considered in a separate issue) should return that list object directly.

I'm not sure how exactly the warning print logic will work with a list of tuples in cr_origin, I propose to start working on the implementation and figure that out during the review process.
msg310268 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-01-19 07:55
> I'm not sure how exactly the warning print logic will work with a list of tuples in cr_origin, I propose to start working on the implementation and figure that out during the review process.

I think the simplest thing is probably to write the warning code in Python and stash it in Lib/_corohelper.py or something. Warnings already go through warnings.py (in particular, the warning printing code is still warnings.showwarning, it's never been rewritten in C), so we're already transitioning to Python at this stage anyway.
msg310299 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-19 23:05
> I think the simplest thing is probably to write the warning code in Python and stash it in Lib/_corohelper.py or something.

I'd put the helper that accepts a list of traceback-like tuples and emits a warning in warnings.py and a C function to call it in warnings.c.
msg310355 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-20 16:11
Guido,

I'd like to go forward with this and merge Nathaniel's PR.

Quick summary:

* This issue adds new APIs: a sys.set_coroutine_origin_tracking_depth(depth) function and a 'cr_origin' property to native coroutine objects.

* By using this APIs, users can opt-in to save traceback (serialized as tuples of `(funcname, filename, line)`) of where native coroutines are created.

* This allows us to deprecate the sys.set_coroutine_wrapper function, which I never liked.  It's too powerful as it allows anyone to intercept/override native coroutine construction logic.

* In turn, we are removing a lot of complexity from asyncio code: starting with 3.7 we'll use CoroWrapper only for generator-based corotuines; at some point (Python 3.9?) when we remove @asyncio.coroutine we'll be able to remove CoroWrapper.  This will also make asyncio debug mode quite a bit faster, allowing people to use it in production.

* Finally, sys.set_coroutine_wrapper was always documented as a debug-only API that might disappear in future CPython releases.  We'll deprecate it in 3.7 and remove it in 3.8.

Are you OK with this?
msg310364 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2018-01-21 00:07
Yes, I'm OK with this plan. Note that I've only read Nathaniel's original
message and Yury's last message, and I've not reviewed the PR, but getting
rid of that eyesore sounds good, and it seems you two have reached
agreement.
msg310375 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-21 14:44
Merged!  Thank you, Nathaniel!
msg310391 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-22 01:47
New changeset 3510334361d6d39aad89e4e0d9bccd0695668583 by Yury Selivanov in branch 'master':
bpo-32591: Fix PyExc_WarnFormat call (follow-up commit) (#5263)
https://github.com/python/cpython/commit/3510334361d6d39aad89e4e0d9bccd0695668583
msg310726 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-26 02:07
The following snippet crashes with SIGABRT:

    import asyncio
    async def f(): pass
    asyncio.gather(f())

llvm tells that the crash is in _PyGen_Finalize/_PyErr_WarnUnawaitedCoroutine
msg310736 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-01-26 05:26
Right, I *knew* I should be nervous about calling into Python from a C-level destructor... what's happening is:

- Somehow, that coroutine object is managing to survive until the very, very end of the shutdown sequence (PyImport_Cleanup). I'm not sure how exactly; Yury suspects it somehow involves _asynciomodule.c holding a reference.

- At the very end of PyImport_Cleanup, there's a final call to the cycle collector, which destroys the coroutine object

- _PyGen_Finalize notices it's unawaited, and calls _PyErr_WarnUnawaitedCoroutine

- _PyErr_WarnUnawaitedCoroutine attempts to call warnings._warn_unawaited_coroutine, in a very careful and defensive manner

- But it doesn't matter how careful and defensive you are, because at this stage in the shutdown, we have sys.module = None, which freaks out the import system so badly that when we try to look up the warnings module, it doesn't even raise an error, it just abort()s the whole interpreter.

We can get a similar crash by doing:

import sys
async def f(): pass
sys.corocycle = [f]
sys.corocycle.append(sys.corocycle)

If you run the same code on 3.6, then it gets collected at the same time, and it issues a warning using the regular PyErr_WarnEx. It turns out that code is even *more* careful and defensive and notices that the interpreter is being shutdown, so it just skips printing the warning entirely.

I guess what we have to do is add a similar check to _PyErr_WarnUnawaitedCoroutine.

You can imagine how excited I am that I started working on this patch so I could make sure people get more reliable notice of unawaited coroutines (bpo-30491), and not only has that been rejected, but now I'm writing code to explicitly hide unawaited coroutine warnings. Just saying'...
msg310739 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2018-01-26 06:29
Maybe it's better to roll this back and take a deep breath. The feature freeze should not be treated as an excuse to cram in things at the last minute. Before this week I had never heard of this proposal. And the sarcasm is a bad start of your stint as a core dev, Nathaniel.
msg310781 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-26 19:28
New changeset dba976b8a28d6e5daa66ef31a6a7c694a9193f6a by Yury Selivanov (Nathaniel J. Smith) in branch 'master':
bpo-32591: fix abort in _PyErr_WarnUnawaitedCoroutine during shutdown (#5337)
https://github.com/python/cpython/commit/dba976b8a28d6e5daa66ef31a6a7c694a9193f6a
msg310782 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-26 19:32
Nathaniel's latest PR looks good and fixes the crash.

FWIW losing some warnings during interpreter shutdown isn't exactly a new problem, therefore I think there's nothing wrong with the new API here.  For instance, I've seen asyncio's CoroWrapper losing warnings in the exact same fashion.

So I've merged Nathaniel's PR.  I think we can rollback the new API if we discover more problems during the beta-1 period.
msg310783 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2018-01-26 19:56
OK. I hope both of you keep an eye on this and actively try to test and
break it!
msg310784 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-26 19:57
> OK. I hope both of you keep an eye on this and actively try to test and
break it!

That's the plan!
msg311067 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-29 06:22
Nathaniel,

test_coroutines raises a bunch of Runtime and Deprecation warnings after this PR.  Could you please make a PR to silence them?


/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:1032: RuntimeWarning: coroutine 'CoroutineTest.test_await_12.<locals>.coro' was never awaited
  return await Awaitable()
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:37: RuntimeWarning: coroutine 'CoroutineTest.test_await_9.<locals>.bar' was never awaited
  buffer.append(coro.send(None))
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:681: RuntimeWarning: coroutine 'CoroutineTest.test_func_13.<locals>.g' was never awaited
  g().send('spam')
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:534: RuntimeWarning: coroutine 'CoroutineTest.test_func_4.<locals>.foo' was never awaited
  list(foo())
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:537: RuntimeWarning: coroutine 'CoroutineTest.test_func_4.<locals>.foo' was never awaited
  tuple(foo())
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:540: RuntimeWarning: coroutine 'CoroutineTest.test_func_4.<locals>.foo' was never awaited
  sum(foo())
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:543: RuntimeWarning: coroutine 'CoroutineTest.test_func_4.<locals>.foo' was never awaited
  iter(foo())
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:564: RuntimeWarning: coroutine 'CoroutineTest.test_func_5.<locals>.foo' was never awaited
  for el in foo(): pass
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:1984: DeprecationWarning: get_coroutine_wrapper is deprecated
  self.assertIs(sys.get_coroutine_wrapper(), wrap)
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:1991: DeprecationWarning: set_coroutine_wrapper is deprecated
  sys.set_coroutine_wrapper(None)
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:1993: DeprecationWarning: get_coroutine_wrapper is deprecated
  self.assertIsNone(sys.get_coroutine_wrapper())
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:2001: DeprecationWarning: get_coroutine_wrapper is deprecated
  self.assertIsNone(sys.get_coroutine_wrapper())
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:2003: DeprecationWarning: set_coroutine_wrapper is deprecated
  sys.set_coroutine_wrapper(1)
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:2004: DeprecationWarning: get_coroutine_wrapper is deprecated
  self.assertIsNone(sys.get_coroutine_wrapper())
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:2015: DeprecationWarning: set_coroutine_wrapper is deprecated
  sys.set_coroutine_wrapper(wrapper)
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:2024: DeprecationWarning: set_coroutine_wrapper is deprecated
  sys.set_coroutine_wrapper(None)
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:2037: DeprecationWarning: set_coroutine_wrapper is deprecated
  sys.set_coroutine_wrapper(wrap)
/Users/yury/dev/pydev/cpython/Lib/test/test_coroutines.py:2045: DeprecationWarning: set_coroutine_wrapper is deprecated
  sys.set_coroutine_wrapper(None)
msg311069 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-01-29 06:25
Specifically "DeprecationWarning: get_coroutine_wrapper is deprecated" needs to be silenced as part of this issue.
msg311070 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-01-29 06:28
Ah, gotcha. I'll take a look. (For posterity: the RuntimeWarnings Yury mentioned are also real, but they're from the branch Yury is testing for bpo-32605 / bpo-32703, nothing to do with this issue.)
msg311077 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-01-29 07:34
New changeset 95e214713a7da4e5cbfb73f35b2119f76074fc3f by Nathaniel J. Smith in branch 'master':
bpo-32591: silence deprecation warnings in test_coroutine (GH-5412)
https://github.com/python/cpython/commit/95e214713a7da4e5cbfb73f35b2119f76074fc3f
msg318281 - (view) Author: Thomas Wouters (twouters) * (Python committer) Date: 2018-05-31 10:20
New changeset 500a419a7a68c32650717b213f0f5ab0461bb16b by T. Wouters in branch '3.6':
[3.6] bpo-32591: fix abort in _PyErr_WarnUnawaitedCoroutine during shutdown (GH-5337) (#6536)
https://github.com/python/cpython/commit/500a419a7a68c32650717b213f0f5ab0461bb16b
History
Date User Action Args
2018-05-31 10:20:49twouterssetnosy: + twouters
messages: + msg318281
2018-05-28 16:48:02yselivanovsetstatus: open -> closed
stage: patch review -> resolved
2018-04-19 10:43:25twouterssetpull_requests: + pull_request6231
2018-01-29 07:34:28njssetmessages: + msg311077
2018-01-29 06:59:43njssetstage: resolved -> patch review
pull_requests: + pull_request5246
2018-01-29 06:28:14njssetmessages: + msg311070
2018-01-29 06:25:37yselivanovsetmessages: + msg311069
2018-01-29 06:22:04yselivanovsetstatus: closed -> open

messages: + msg311067
2018-01-26 19:57:21yselivanovsetmessages: + msg310784
2018-01-26 19:56:47gvanrossumsetmessages: + msg310783
2018-01-26 19:32:21yselivanovsetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2018-01-26 19:32:12yselivanovsetmessages: + msg310782
2018-01-26 19:28:39yselivanovsetmessages: + msg310781
2018-01-26 06:29:16gvanrossumsetmessages: + msg310739
2018-01-26 05:37:50njssetstage: resolved -> patch review
pull_requests: + pull_request5184
2018-01-26 05:27:00njssetmessages: + msg310736
2018-01-26 02:07:56yselivanovsetstatus: closed -> open
resolution: fixed -> (no value)
2018-01-26 02:07:48yselivanovsetmessages: + msg310726
2018-01-22 01:47:10yselivanovsetmessages: + msg310391
2018-01-21 21:08:20yselivanovsetpull_requests: + pull_request5109
2018-01-21 14:44:42yselivanovsetstatus: open -> closed
versions: - Python 3.8
type: enhancement
messages: + msg310375

resolution: fixed
stage: patch review -> resolved
2018-01-21 00:07:20gvanrossumsetmessages: + msg310364
2018-01-20 16:11:59yselivanovsetnosy: + gvanrossum
messages: + msg310355
2018-01-20 08:57:26njssetkeywords: + patch
stage: patch review
pull_requests: + pull_request5097
2018-01-19 23:05:59yselivanovsetmessages: + msg310299
2018-01-19 07:55:42njssetmessages: + msg310268
2018-01-18 16:28:04yselivanovsetmessages: + msg310250
2018-01-18 07:09:04njscreate