classification
Title: Asynchronous generator memory leak
Type: resource usage
Stage: resolved
Components: asyncio, Interpreter Core
Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: Joongi Kim, achimnol, asvetlov, miss-islington, njs, terry.reedy, yselivanov, zkonge
Priority: normal
Keywords:

Created on 2020-07-07 12:08 by zkonge, last changed 2020-11-10 07:57 by Joongi Kim. This issue is now closed.

Files
File name Uploaded Description
leak.py zkonge, 2020-07-07 12:08
Pull Requests
URL Status Linked
PR 21545 merged Joongi Kim, 2020-07-19 10:47
PR 23217 merged Joongi Kim, 2020-11-10 07:57
Messages (8)
msg373221 - (view) Author: JIanqiu Tao (zkonge) * Date: 2020-07-07 12:08
The resources used by an asynchronous generator are not released properly when it is used with the "asend" method.

In addition, in Python 3.7 and earlier, a RuntimeError is raised when asyncio.run completes, but the message is puzzling:
  RuntimeError: can't send non-None value to a just-started coroutine

In Python 3.8+, no exception is shown.

Python 3.5 does not support yield in async functions, so it seems to be unaffected.
msg373498 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2020-07-11 01:56
Only 3.8+ for bug fixes.
msg373942 - (view) Author: Joongi Kim (achimnol) * Date: 2020-07-19 09:05
In the given example, if I add "await q.aclose()" after "await q.asend(123456)", the memory does not leak.

This is a good example showing that we should always wrap async generators with an explicit "aclosing" context manager (which does not exist yet in the stdlib).
I'm already doing so by writing a custom library:
https://github.com/achimnol/aiotools/blob/ef7bf0ce/src/aiotools/context.py#L152

We may need to update the documentation to recommend explicit aclosing of async generators.
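To make the suggestion above concrete, here is a minimal sketch of an "aclosing"-style helper. It is not the aiotools implementation linked above; the helper body, the `echo` generator, and `demo_explicit_close` are hypothetical names written for illustration, and the helper is built on `contextlib.asynccontextmanager`.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def aclosing(agen):
    """Hypothetical helper: always await agen.aclose() on exit."""
    try:
        yield agen
    finally:
        await agen.aclose()


async def echo(log):
    """Illustrative async generator that echoes back whatever is sent in."""
    received = None
    try:
        while True:
            received = yield received
    finally:
        # Runs when the generator is closed.
        log.append("closed")


async def demo_explicit_close():
    log = []
    async with aclosing(echo(log)) as q:
        await q.asend(None)            # start the generator
        reply = await q.asend(123456)  # send a value, get it echoed back
    # The finally block in echo() has already run here, without waiting
    # for the event loop's shutdown handler.
    return reply, log


if __name__ == "__main__":
    print(asyncio.run(demo_explicit_close()))
```

With this pattern the cleanup happens at the end of the `async with` block, so it does not depend on when the generator object is garbage collected.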
msg373947 - (view) Author: Joongi Kim (achimnol) * Date: 2020-07-19 09:33
I've searched the Python documentation, and I think the docs must be updated to explicitly state when aclose() is necessary.

Refs:
https://docs.python.org/3/reference/expressions.html#asynchronous-generator-functions
https://www.python.org/dev/peps/pep-0525/

I'm not sure what the original authors' intention was, but to me it looks like calling aclose() is optional and the responsibility for calling aclose() on async generators is left to the asyncgen-shutdown handler of the event loop.

The example in this issue shows that we need to aclose asyncgens whenever we are done with them, even long before shutting down the event loop.
msg373951 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2020-07-19 10:04
Huh, this is very weird. I can confirm that the async generator objects aren't cleaned up until loop shutdown on asyncio.

On the trio main branch, we don't yet use the `set_asyncgen_hooks` mechanism, and the async generator objects are cleaned up immediately.

However, if I check out this PR that will add it: https://github.com/python-trio/trio/pull/1564

...then we see the same bug happening with Trio: all the async generators are kept around until loop shutdown.

Also, it doesn't seem to be a circular references issue – if I explicitly call `gc.collect()`, then the asyncgen destructors are still *not* called; only shutting down the loop does it.

This doesn't make any sense, because asyncio/trio only keep weak references to the async generator objects, so they should still be freed.

So maybe the `set_asyncgen_hooks` code introduces a reference leak on async generator objects, or something?
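For readers unfamiliar with the mechanism being discussed, the following is a small sketch of how the `sys.set_asyncgen_hooks` hooks fire. It assumes CPython's reference counting (so the generator is finalized as soon as the last reference is dropped) and drives the generator by hand instead of using an event loop. The `numbers` generator and `demo_hooks` function are hypothetical names, and the lambdas stand in for what an event loop would register.

```python
import sys


async def numbers():
    """Illustrative async generator."""
    yield 1
    yield 2


def demo_hooks():
    events = []
    old_hooks = sys.get_asyncgen_hooks()
    sys.set_asyncgen_hooks(
        firstiter=lambda agen: events.append("firstiter"),
        finalizer=lambda agen: events.append("finalizer"),
    )
    try:
        g = numbers()
        step = g.asend(None)  # first iteration: the firstiter hook fires
        try:
            step.send(None)   # drive the awaitable by hand, no event loop
        except StopIteration as stop:
            events.append(("yielded", stop.value))
        # Drop the last references while the generator is suspended at a
        # yield; the interpreter calls the finalizer hook instead of
        # closing the generator itself.
        del step, g
    finally:
        # Restore whatever hooks were installed before the demo.
        sys.set_asyncgen_hooks(
            firstiter=old_hooks.firstiter,
            finalizer=old_hooks.finalizer,
        )
    return events


if __name__ == "__main__":
    print(demo_hooks())
```

The finalizer hook is only a notification: what happens to the generator afterwards is up to the code that installed the hook.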
msg373953 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2020-07-19 10:09
...On closer examination, it looks like that Trio PR has at least one test that checks that async generators are collected promptly after they stop being referenced, and that test passes:

https://github.com/python-trio/trio/pull/1564/files#diff-c79a78487c2f350ba99059813ea0c9f9R38

So, I have no idea what's going on here.
msg373954 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2020-07-19 10:17
Oh! I see it. This is actually working as intended.

What's happening is that the event loop will clean up async generators when they're garbage collected... but, this requires that the event loop get a chance to run. In the demonstration program, the main task creates lots of async generator objects, but never returns to the main loop. So they're all queued up to be collected, but it can't actually happen until you perform a real async operation. For example, try adding `await asyncio.sleep(1)` before the input() call so that the event loop has a chance to run, and you'll see that the objects are collected immediately.

So this is a bit tricky, but this is actually expected behavior, and falls under the general category of "don't block the event loop, it will break stuff".
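The behavior described above can be reproduced with a small sketch. This is not the original leak.py; `ticker` and `demo_loop_turn` are hypothetical names, and the sketch assumes CPython, where dropping the last reference finalizes the generator right away. The cleanup is recorded only after the task actually suspends and lets the loop run.

```python
import asyncio


async def ticker(name, log):
    """Illustrative async generator that records when it is closed."""
    try:
        while True:
            yield name
    finally:
        log.append(name)


async def demo_loop_turn():
    log = []
    for i in range(3):
        g = ticker(f"gen{i}", log)
        await g.asend(None)  # completes without suspending this task
        del g                # finalizer hook schedules aclose() on the loop
    # Nothing is closed yet: this task has not returned to the loop.
    before = list(log)
    # A real suspension gives the loop time to run the scheduled closes.
    await asyncio.sleep(0.01)
    after = list(log)
    return before, after


if __name__ == "__main__":
    before, after = asyncio.run(demo_loop_turn())
    print("closed before sleeping:", before)
    print("closed after sleeping:", after)
```

Note that `await g.asend(None)` is not a "real async operation" in the sense used above: the generator yields immediately, so the awaiting task never hands control back to the event loop.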
msg380192 - (view) Author: miss-islington (miss-islington) Date: 2020-11-02 08:02
New changeset 6e8dcdaaa49d4313bf9fab9f9923ca5828fbb10e by Joongi Kim in branch 'master':
bpo-41229: Update docs for explicit aclose()-required cases and add contextlib.aclosing() method (GH-21545)
https://github.com/python/cpython/commit/6e8dcdaaa49d4313bf9fab9f9923ca5828fbb10e
History
Date User Action Args
2020-11-10 07:57:19 Joongi Kim set pull_requests: + pull_request22115
2020-11-02 08:02:56 miss-islington set nosy: + miss-islington; messages: + msg380192
2020-07-19 10:47:02 Joongi Kim set nosy: + Joongi Kim; pull_requests: + pull_request20687
2020-07-19 10:17:19 njs set status: open -> closed; resolution: not a bug; messages: + msg373954; stage: resolved
2020-07-19 10:09:14 njs set messages: + msg373953
2020-07-19 10:04:50 njs set messages: + msg373951
2020-07-19 09:39:49 achimnol set nosy: + njs
2020-07-19 09:33:58 achimnol set messages: + msg373947
2020-07-19 09:05:28 achimnol set nosy: + achimnol; messages: + msg373942
2020-07-11 01:56:29 terry.reedy set nosy: + terry.reedy; messages: + msg373498
2020-07-11 01:55:31 terry.reedy set versions: - Python 3.6, Python 3.7
2020-07-07 12:08:08 zkonge create